Re: [Xen-devel] ARM: SMMUv3 support

2017-07-31 Thread Manish Jaggi



On 6/13/2017 10:19 AM, Manish Jaggi wrote:



On 3/29/2017 5:30 AM, Goel, Sameer wrote:

Sure, I will try to post something soon.

Hi Sameer,
Are you still working on SMMU v3, can you please post patches.


Hi Sameer,
Could you please post RFC patches for SMMUv3, can provide feedback by 
testing on thunderX platform.


Thanks
manish

Thanks
Manish

Thanks,
Sameer

On 3/27/2017 11:03 PM, Vijay Kilari wrote:
On Mon, Mar 27, 2017 at 10:00 PM, Goel, Sameer 
 wrote:

Hi,
  I am working on adding this support. The work is in initial 
stages and will target ACPI systems to start with. Do you have a 
specific requirement? Or even better: want to help with DT testing 
? :)

Thanks Sameer. I don't have any specific requirement. I am also
looking with ACPI support.
Please share your RFC patches so that I can test on our platform.


Thanks,
Sameer

On 3/20/2017 11:58 PM, Vijay Kilari wrote:

Hi,

  Is there any effort put by anyone to get SMMUv3 support in 
Xen for ARM64?.

Would be glad to know.

Regards
Vijay

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
Forum,

a Linux Foundation Collaborative Project.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [linux-4.9 test] 112397: trouble: blocked/broken/fail/pass

2017-07-31 Thread osstest service owner
flight 112397 linux-4.9 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112397/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 112193
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112193
 build-arm64   2 hosts-allocate broken REGR. vs. 112193

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat fail in 112388 pass 
in 112397
 test-armhf-armhf-xl-multivcpu 19 leak-check/check fail in 112388 pass in 112397
 test-amd64-amd64-qemuu-nested-intel 19 leak-check/check/l1 fail in 112388 pass 
in 112397
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail 
pass in 112388

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64-pvops 3 capture-logs  broken blocked in 112193
 build-arm64   3 capture-logs  broken blocked in 112193
 build-arm64-xsm   3 capture-logs broken never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail blocked in 
112193
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop   fail in 112388 like 112086
 test-amd64-amd64-xl-qemut-win7-amd64 18 guest-start/win.repeat fail like 112193
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail  like 112193
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 112193
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 112193
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 112193
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail 

[Xen-devel] [xtf test] 112399: all pass - PUSHED

2017-07-31 Thread osstest service owner
flight 112399 xtf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112399/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 xtf  c5be5f2f71d5bebb89c79bbcc5469445ee327b38
baseline version:
 xtf  2b5adea4636ae5b6c2b5f3eb391cd4aeb7a997a4

Last test of basis   112287  2017-07-25 12:47:57 Z6 days
Testing same since   112399  2017-07-31 17:50:54 Z0 days1 attempts


People who touched revisions under test:
  Wei Liu 

jobs:
 build-amd64-xtf  pass
 build-amd64  pass
 build-amd64-pvopspass
 test-xtf-amd64-amd64-1   pass
 test-xtf-amd64-amd64-2   pass
 test-xtf-amd64-amd64-3   pass
 test-xtf-amd64-amd64-4   pass
 test-xtf-amd64-amd64-5   pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xtf
+ revision=c5be5f2f71d5bebb89c79bbcc5469445ee327b38
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xtf 
c5be5f2f71d5bebb89c79bbcc5469445ee327b38
+ branch=xtf
+ revision=c5be5f2f71d5bebb89c79bbcc5469445ee327b38
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xtf
+ xenbranch=xen-unstable
+ '[' xxtf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable
+ prevxenbranch=xen-4.9-testing
+ '[' xc5be5f2f71d5bebb89c79bbcc5469445ee327b38 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git
++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
++ : osst...@xenbits.xen.org:/home/xen/git/linux-pvops.git
++ : git://xenbits.xen.org/linux-pvops.git
++ : tested/linux-4.9
++ : tested/linux-arm-xen
++ '[' 

Re: [Xen-devel] Xen on Intel Atom E3815: crash, no output

2017-07-31 Thread Tian, Kevin
> From: Stefano Stabellini [mailto:sstabell...@kernel.org]
> Sent: Tuesday, August 1, 2017 7:40 AM
> 
> Hi all,
> 
> I noticed that Xen does not boot on Intel Atom E3815. The system is a
> Dell Edge Gateway 3003:
> 
> http://i.dell.com/sites/doccontent/shared-content/data-
> sheets/en/Documents/Dell_Edge_Gateway_3000_Series_spec_sheet.pdf?ne
> wtab=true
> 
> Grub2 loads Xen and Dom0, but no output comes out of Xen. After the
> "Loading" messages from Grub2, Xen doesn't manage to print even a single
> character and the system obviously crashes, but I don't know why because
> there is no output. Before you ask, no I don't have a serial on the
> system.
> 
> I tried to pass console=vga vga=text-80x25 and console=vga vga=ask, but
> I still got nothing.
> 
> Do you have any ideas how to get some output on the screen? Do you know
> how to get Xen to boot successfully?
> 

No much idea except finding a serial for such early boot issue...

Thanks
Kevin

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 00/13] "Non-shared" IOMMU support on ARM

2017-07-31 Thread Tian, Kevin
> From: Oleksandr Tyshchenko [mailto:olekst...@gmail.com]
> Sent: Monday, July 31, 2017 7:58 PM
> 
> Hi, Kevin
> 
> On Mon, Jul 31, 2017 at 8:57 AM, Tian, Kevin  wrote:
> >> From: Oleksandr Tyshchenko
> >> Sent: Wednesday, July 26, 2017 1:27 AM
> >>
> >> From: Oleksandr Tyshchenko 
> >>
> >> Hi, all.
> >>
> >> The purpose of this patch series is to create a base for porting
> >> any "Non-shared" IOMMUs to Xen on ARM. Saying "Non-shared" IOMMU
> I
> >> mean
> >> the IOMMU that can't share the page table with the CPU.
> >
> > Is "non-shared" IOMMU a standard terminology in ARM side? I quickly
> > searched to find it mostly used in this thread...
> I don't think that it is a standard terminology.
> 
> >
> > On the other hand, all IOMMUs support a basic DMA remapping
> > mechanism with page table not shared with CPU. Then some IOMMUs
> > may optional support Shared Virtual Memory (SVM) through page
> > sharing with CPU. Then I'm not sure why need to highlight the
> > "non-shared" manner in this thread, instead of just saying
> > IPMMU-VMSA support...
> I wouldn't use "IPMMU-VMSA support" in this thread since it may be any
> other IOMMUs which can't share page table
> with CPU because of format incompatibilities.

As I commented you can assume all IOMMUs cannot share page
table with CPU as the starting point. It's not good to name an IOMMU
driver based on such fact.

> I needed something short to describe such IOMMUs, but, If title
> "non-shared" IOMMU sounds confusing
> I won't use it anymore. Do you have something in mind?

IOMMU driver needs to be vendor specific. Is your driver working
for all IPMMU-VMSA compatible IOMMUs or only for Renesas?
If the latter, you may make the name explicit for such purpose.

btw since you're porting Linux driver to Xen. What 's the name
used in Linux side? that should be a good reference to you.

> 
> >
> >> Primarily, we are interested in IPMMU-VMSA and I hope that it will be the
> >> first candidate.
> >> It is VMSA-compatible IOMMU that integrated in the newest Renesas R-
> Car
> >> Gen3 SoCs (ARM).
> >> I am about to push IPMMU-VMSA support in a while.
> >>
> >> With regard to the patch series, it was rebased on Xen 4.9.0 release and
> >> tested on Renesas R-Car Gen3
> >> H3/M3 based boards with applied IPMMU-VMSA support:
> >> - Patches 1 and 3 have Julien's Rb.
> >> - Patch 2 has Jan's Rb but only for x86 and generic parts.
> >> - Patch 4 has Julien's Ab.
> >> - Patches 5,6,9,10 were slightly reworked.
> >> - Patch 7 was significantly reworked. The previous patch -> iommu: Split
> >> iommu_hwdom_init() into arch specific parts
> >> - Patches 8,11,12,13 are new.
> >>
> >> Not really sure about x86-related changes since I had no possibility to
> check.
> >> So, compile-tested on x86.
> >>
> >> You can find current patch series here:
> >> repo: https://github.com/otyshchenko1/xen.git branch:
> >> non_shared_iommu_v2
> >>
> >> Previous patch series here:
> >> [PATCH v1 00/10] "Non-shared" IOMMU support on ARM
> >> https://www.mail-archive.com/xen-devel@lists.xen.org/msg107532.html
> >>
> >> [RFC PATCH 0/9] "Non-shared" IOMMU support on ARM
> >> https://www.mail-archive.com/xen-devel@lists.xen.org/msg100468.html
> >>
> >> Thank you.
> >>
> >> Oleksandr Tyshchenko (13):
> >>   xen/device-tree: Add dt_count_phandle_with_args helper
> >>   iommu: Add extra order argument to the IOMMU APIs and platform
> >> callbacks
> >>   xen/arm: p2m: Add helper to convert p2m type to IOMMU flags
> >>   xen/arm: p2m: Update IOMMU mapping whenever possible if page table
> is
> >> not shared
> >>   iommu/arm: Re-define iommu_use_hap_pt(d) as iommu_hap_pt_share
> >>   iommu: Add extra use_iommu argument to iommu_domain_init()
> >>   iommu: Make decision about needing IOMMU for hardware domains in
> >> advance
> >>   iommu/arm: Misc fixes for arch specific part
> >>   xen/arm: Add use_iommu flag to xen_arch_domainconfig
> >>   xen/arm: domain_build: Don't expose IOMMU specific properties to the
> >> guest
> >>   iommu/arm: smmu: Squash map_pages/unmap_pages with
> >> map_page/unmap_page
> >>   [RFC] iommu: VT-d: Squash map_pages/unmap_pages with
> >> map_page/unmap_page
> >>   [RFC] iommu: AMD-Vi: Squash map_pages/unmap_pages with
> >> map_page/unmap_page
> >>
> >>  tools/libxl/libxl_arm.c   |   8 +
> >>  xen/arch/arm/domain.c |   2 +-
> >>  xen/arch/arm/domain_build.c   |  10 ++
> >>  xen/arch/arm/p2m.c|  10 +-
> >>  xen/arch/x86/domain.c |   2 +-
> >>  xen/arch/x86/mm.c |  11 +-
> >>  xen/arch/x86/mm/p2m-ept.c |  21 +--
> >>  xen/arch/x86/mm/p2m-pt.c  |  26 +---
> >>  xen/arch/x86/mm/p2m.c |  38 +
> >>  xen/arch/x86/x86_64/mm.c  |   5 +-
> >>  xen/common/device_tree.c  |   7 

Re: [Xen-devel] [PATCH v2] VT-d: don't panic/warn on iommu=no-igfx

2017-07-31 Thread Tian, Kevin
> From: Rusty Bird [mailto:rustyb...@openmailbox.org]
> Sent: Monday, July 31, 2017 5:04 PM
> 
> When operating on an Intel graphics device, iommu_enable_translation()
> panicked (force_iommu==1) or warned (force_iommu==0) about the BIOS if
> is_igd_vt_enabled_quirk() returned 0. That's good if the actual BIOS
> problem has been detected. But since commit 1463411, returning 0 could
> also happen if the user simply passed "iommu=no-igfx", in which case
> bailing out with an info message (instead of a panic/warning) would be
> more appropriate.
> 
> The panic broke the combination "iommu=force,no-igfx", and also the case
> where "iommu=no-igfx" is passed but force_iommu=1 is set automatically
> by x2apic_bsp_setup().
> 
> Move the iommu_igfx check from is_igd_vt_enabled_quirk() into its only
> caller iommu_enable_translation(), and tweak the logic.
> 
> Signed-off-by: Rusty Bird 

Acked-by: Kevin Tian 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [linux-next test] 112395: regressions - trouble: blocked/broken/fail/pass

2017-07-31 Thread osstest service owner
flight 112395 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112395/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-boot  fail REGR. vs. 112382
 test-amd64-i386-xl-qemuu-win10-i386  7 xen-boot  fail REGR. vs. 112382
 test-amd64-i386-libvirt   7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-xl-xsm7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-pair 10 xen-boot/src_hostfail REGR. vs. 112382
 test-amd64-i386-pair 11 xen-boot/dst_hostfail REGR. vs. 112382
 test-amd64-i386-xl-qemut-win10-i386  7 xen-boot  fail REGR. vs. 112382
 test-amd64-i386-xl7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-freebsd10-i386  7 xen-boot   fail REGR. vs. 112382
 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail REGR. vs. 112382
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail REGR. vs. 112382
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-boot   fail REGR. vs. 112382
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 
112382
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-boot  fail REGR. vs. 112382
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-boot  fail REGR. vs. 112382
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-freebsd10-amd64  7 xen-boot  fail REGR. vs. 112382
 test-amd64-amd64-pair10 xen-boot/src_hostfail REGR. vs. 112382
 test-amd64-amd64-pair11 xen-boot/dst_hostfail REGR. vs. 112382
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-boot  fail REGR. vs. 112382
 test-amd64-i386-xl-raw7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-rumprun-i386  7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-libvirt-xsm   7 xen-boot fail REGR. vs. 112382
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-boot   fail REGR. vs. 112382
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 
112382
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-boot  fail REGR. vs. 112382
 test-amd64-i386-examine   7 reboot   fail REGR. vs. 112382

Tests which did not succeed, but are not blocking:
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64-pvops 2 hosts-allocate  broken like 112382
 build-arm64-xsm   2 hosts-allocate  broken like 112382
 build-arm64   2 hosts-allocate  broken like 112382
 build-arm64   3 capture-logsbroken like 112382
 build-arm64-xsm   3 capture-logsbroken like 112382
 build-arm64-pvops 3 capture-logsbroken like 112382
 test-amd64-amd64-xl-qemut-win7-amd64 18 guest-start/win.repeat fail blocked in 
112382
 test-amd64-amd64-examine  7 reboot   fail  like 112382
 test-amd64-amd64-amd64-pvgrub  7 xen-boot fail like 112382
 test-amd64-amd64-rumprun-amd64  7 xen-bootfail like 112382
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 112382
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 112382
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 112382
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 112382
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 112382
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 

Re: [Xen-devel] [RFC v2 10/12] tools: implemet new get value interface suitable for all psr allocation features.

2017-07-31 Thread Yi Sun
On 17-07-31 15:30:16, Wei Liu wrote:
> On Thu, Jul 20, 2017 at 04:49:11PM +0800, Yi Sun wrote:
> > This patch implements a new get value interface in tools suitable for all 
> > psr
> > allocation features and the whole flow. It also enables MBA support in tools
> > to get MBA value.
> 
> This suggests this patch can be at least broken into two?
> 
> > 
> > Signed-off-by: Yi Sun 
> > ---
> >  tools/libxc/include/xenctrl.h |  13 +-
> >  tools/libxc/xc_psr.c  |  11 +-
> >  tools/libxl/libxl_psr.c   |  61 ++
> >  tools/xl/xl.h |   3 +
> >  tools/xl/xl_cmdtable.c|   9 +-
> >  tools/xl/xl_psr.c | 275 
> > ++
> >  6 files changed, 236 insertions(+), 136 deletions(-)
> > 
> > diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> > index 0b0ec31..def18f5 100644
> > --- a/tools/libxc/include/xenctrl.h
> > +++ b/tools/libxc/include/xenctrl.h
> > @@ -2452,13 +2452,14 @@ enum xc_psr_cmt_type {
> >  };
> >  typedef enum xc_psr_cmt_type xc_psr_cmt_type;
> >  
> > -enum xc_psr_cat_type {
> > +enum xc_psr_val_type {
> >  XC_PSR_CAT_L3_CBM  = 1,
> >  XC_PSR_CAT_L3_CBM_CODE = 2,
> >  XC_PSR_CAT_L3_CBM_DATA = 3,
> >  XC_PSR_CAT_L2_CBM  = 4,
> > +XC_PSR_MBA_THRTL   = 5,
> >  };
> > -typedef enum xc_psr_cat_type xc_psr_cat_type;
> > +typedef enum xc_psr_val_type xc_psr_val_type;
> 
> Changing the name of the type should be done in a separate patch.
> 
> The rest of this patch mixes renaming and functional change which is
> rather difficult to review I'm afraid.
> 
Thanks! I will split the patch to two, one for type renaming, the other
for functionality.

BRs,
Sun Yi

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 08/12] tools: create general interfaces to support psr allocation features.

2017-07-31 Thread Yi Sun
On 17-07-31 15:30:08, Wei Liu wrote:
> On Thu, Jul 20, 2017 at 04:49:09PM +0800, Yi Sun wrote:
> [...]
> > +
> > +#ifdef LIBXL_HAVE_PSR_MBA
> > +/*
> > + * Function to set a domain's value. It operates on a single or multiple
> > + * target(s) defined in 'target_map'. 'target_map' specifies all the 
> > sockets
> > + * to be operated on.
> > + */
> > +int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid,
> > +  libxl_psr_cbm_type type, libxl_bitmap *target_map,
> > +  uint64_t val);
> > +/*
> > + * Function to get a domain's cbm. It operates on a single 'target'.
> > + * 'target' specifies which socket to be operated on.
> > + */
> > +int libxl_psr_get_val(libxl_ctx *ctx, uint32_t domid,
> > +  libxl_psr_cbm_type type, uint32_t target,
> 
> There is no need for target to be uint32_t right? Unsigned int should
> work too?
> 
> > +  uint64_t *val);
> > +/*
> > + * On success, the function returns an array of elements in 'info',
> > + * and the length in 'nr'.
> > + */
> > +int libxl_psr_get_hw_info(libxl_ctx *ctx, libxl_psr_hw_info **info,
> > +  int *nr, libxl_psr_feat_type type, int lvl);
> > +void libxl_psr_hw_info_list_free(libxl_psr_hw_info *list, int nr);
> 
> nr should be unsigned int.
> 
> > +#endif /* LIBXL_HAVE_PSR_MBA */
> > +#endif /* LIBXL_HAVE_PSR_CAT */
> >  
> >  /* misc */
> >  
> > diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
> > index f55ba1e..8319301 100644
> > --- a/tools/libxl/libxl_psr.c
> > +++ b/tools/libxl/libxl_psr.c
> > @@ -425,6 +425,30 @@ void libxl_psr_cat_info_list_free(libxl_psr_cat_info 
> > *list, int nr)
> >  free(list);
> >  }
> >  
> > +int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid,
> > +  libxl_psr_cbm_type type, libxl_bitmap *target_map,
> > +  uint64_t val)
> > +{
> > +return EXIT_FAILURE;
> 
> ERROR_FAIL here.
> 
Thanks! Will fix above points.

> > +
> > +libxl_psr_hw_info = Struct("psr_hw_info", [
> > +("id", uint32),
> > +("u", KeyedUnion(None, libxl_psr_feat_type, "type",
> > +  [("cat_info", Struct(None, [
> > + ("cos_max", uint32),
> > + ("cbm_len", uint32),
> > + ("cdp_enabled", bool),
> > +   ])),
> > +   ("mba_info", Struct(None, [
> > + ("cos_max", uint32),
> > + ("thrtl_max",   uint32),
> > + ("linear",  bool),
> > +   ])),
> > +  ]))
> 
> If this is output only please mark it as dir=DIR_OUT.
> 
Sorry, I do not understand this clearly. DYM if these values are all output
values for xl application?

BRs,
Sun Yi

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 04/12] x86: implement data structure and CPU init flow for MBA.

2017-07-31 Thread Yi Sun
On 17-07-31 15:30:11, Wei Liu wrote:
> On Thu, Jul 20, 2017 at 04:49:05PM +0800, Yi Sun wrote:
> >  #define PSR_CMT(1<<0)
> >  #define PSR_CAT(1<<1)
> >  #define PSR_CDP(1<<2)
> > +#define PSR_MBA(1<<3)
> 
> These should really be (1u << X) -- please use unsigned value and add
> spaces around "<<".
> 
> Can you please submit a patch to fix them first?

Thanks for the suggestion! I will fix it in next version because a patch
need be split to two per your suggestion.

BRs,
Sun Yi

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/5] xen: RCU: avoid busy waiting until the end of grace period.

2017-07-31 Thread Dario Faggioli
On Mon, 2017-07-31 at 16:58 -0700, Stefano Stabellini wrote:
> On Tue, 1 Aug 2017, Dario Faggioli wrote:
> > On Mon, 2017-07-31 at 14:20 -0700, Stefano Stabellini wrote:
> > > On Thu, 27 Jul 2017, Dario Faggioli wrote:
> > > > 
> > > > diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
> > > > index f0fdc87..4586f2a 100644
> > > > --- a/xen/common/rcupdate.c
> > > > +++ b/xen/common/rcupdate.c
> > > > @@ -84,8 +84,14 @@ struct rcu_data {
> > > >  int cpu;
> > > >  struct rcu_head barrier;
> > > >  longlast_rs_qlen; /* qlen during the last
> > > > resched */
> > > > +
> > > > +/* 3) idle CPUs handling */
> > > > +struct timer idle_timer;
> > > > +bool idle_timer_active;
> > > >  };
> > > >  
> > > > +#define RCU_IDLE_TIMER_PERIOD MILLISECS(10)
> > > 
> > > Isn't this a bit too short? How is it chosen?
> > > 
> > 
> > What makes you think it's short?
> 
> In terms of power saving and CPU sleep states, 10ms is not much to
> sleep
> for. I wonder if there are any power saving benefits in sleeping for
> only 10ms (especially on x86 where entering and exiting CPU sleep
> states
> takes longer, to be confirmed).
>
I *think* we should be fine with, say, 100ms. But that's again,
guess/rule of thumb, nothing more than that. And, TBH, I'm not even
sure what a good experiment/benchmark would be, to assess whether a
particular value is good or bad. :-/

>   We might as well do the thing we need
> to do immediately? I guess we cannot do that?
>
You're guessing correct, we can't. It's indeed a bit tricky, and it
took it a little bit to me as well to figure all of it out properly.

Basically, let's say that, at time t1, on CPU1, someone calls
call_rcu(). The situation on the other CPUs is: CPU0 busy; CPU2 idle
(no timer pending); CPU3 busy.

So, a new grace period starts, and its exact end will be when CPU0,
CPU1 and CPU3 have quiesced once (in Xen, quiescence means: "going
through do_softirq()").

But RCU it's a passive mechanism, i.e., we rely on each CPU coming to
the RCU core logic, and tell <>.
Let's say that CPU0 quiesces at time t2 > t1. CPU1 quiesces at time
t3 > t2, and goes fully idle (no timer pending). CPU3 quiesces at time
t4 > t3. Now, at time t4, CPU1 can actually invoke the callbeck, queued
at time t1, from within call_rcu().

This patch series solves two problems, of our current RCU
implementation:

1) right now, we don't only wait for CPU0, CPU1 and CPU3, we also wait 
   for CPU2. But since CPU2 is fully idle, it just won't bother 
   telling RCU that it has quiesced (well, on x86, that would actually 
   happen at some point, while on ARM, it is really possible that this 
   never happens!). This is solved in patch 3, by introducing the 
   cpumask;

2) now (after patch 3) we know that we just can avoid waiting for 
   CPU2. Good. But at time t4, when CPU3 quiesces, marking the end of
   the grace period, how would CPU1 --which is fully idle-- know that
   it can now safely invoke the callback? Again, RCU is passive, so it
   relies on CPU1 to figure that out on its own, next time it wakes
   up, e.g., because of the periodic tick timer. But we don't have a
   periodic tick timer! Well, this again means that, on x86, CPU1 will
   actually figure that out at some (unpredictable) point in future.
   On ARM, not so much. The purpose of the timer in this patch is to
   make sure it always will.
   In fact, with patches 4 and 5 applied, at time t3, we let CPU1 go 
   idle, but we program the timer. It will fire at t3+T (with T=10ms, 
   right now). When this happens, if t3+T > t4, CPU1 invokes the
   callback, and we're done. If not (and CPU1 is still idle) we retry
   in another T.

So, when you say "immediately", *when* do you actually mean?

We can't invoke the callback at t3 (i.e., immediately after CPU1
quiesces), because we need to wait for CPU3 to do the same.

We can't invoke the callback when CPU3 quiesces, at t4 (i.e.,
immediately after the grace period ends) either, because the callback
it's on CPU1, not on CPU3.

Linux's solution is not to stop the tick for CPU1 at time t3. It will
be stopped only after the first firing of the tick itself at time
t > t4 (if CPU1 is still idle, of course). This timer thing is, IMO,
pretty similar. The only difference is that we don't have a tick to not
stop... So I had to make up a very special one. :-D

TL;DR, yes, I also would have loved the world (or even just this
problem) to be simple. It's not! ;-P

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK)

signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v14 13/23] x86: refactor psr: CDP: implement CPU init flow.

2017-07-31 Thread Yi Sun
On 17-07-31 08:20:44, Jan Beulich wrote:
> >>> Yi Sun  07/15/17 2:48 AM >>>
> >@@ -272,7 +312,8 @@ static int cat_init_feature(const struct cpuid_leaf 
> >*regs,
> >if ( !opt_cpu_info )
> >return 0;
>  >
> >-printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, 
> >cbm_len:%u\n",
> >+printk(XENLOG_INFO "%s: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
> >+   ((type == FEAT_TYPE_L3_CDP) ? "CDP" : "L3 CAT"),
> 
> Why is this not "L3 CDP" when the enumerator includes L3?
> 
Sure, I will change it. Thanks!

> >@@ -1283,10 +1344,22 @@ static void psr_cpu_init(void)
> >feat = feat_l3;
> >feat_l3 = NULL;
>  >
> >-if ( !cat_init_feature(®s, feat, info, FEAT_TYPE_L3_CAT) )
> >-feat_props[FEAT_TYPE_L3_CAT] = _cat_props;
> >+if ( (regs.c & PSR_CAT_CDP_CAPABILITY) && (opt_psr & PSR_CDP) )
> >+{
> >+if ( !cat_init_feature(®s, feat, info, FEAT_TYPE_L3_CDP) )
> >+feat_props[FEAT_TYPE_L3_CDP] = _cdp_props;
> >+else
> >+/* If CDP init fails, try to work as L3 CAT. */
> >+goto l3_cat_init;
> >+}
> >else
> >-feat_l3 = feat;
> >+{
> >+ l3_cat_init:
> 
> I'd really like to ask to re-structure this slightly so that you won't
> need goto and a label here. As said before, goto-s are sort of okay
> for making complicated error paths readable, but they should be
> avoided in almost all other cases.
>
Got it, I will try to remove the goto. Thanks for the review!

BRs,
Sun Yi
 
> Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 0/1] xen/arm: zynqmp: Disable PCIe

2017-07-31 Thread Edgar E. Iglesias
On Mon, Jul 31, 2017 at 11:11:39PM +0100, Julien Grall wrote:
> 
> 
> On 31/07/2017 20:37, Edgar E. Iglesias wrote:
> >From: "Edgar E. Iglesias" 
> >
> >Hi,
> 
> Hi Edgar,
> 
> 
> >We're seeing panics in dom0 with PCIe enabled due to what seems
> >to be wrongly created mappings by Xen. With older kernels we
> >didn't see the panics but PCIe wasn't functional in dom0.
> >
> >This disables the PCIe nodes on the ZynqMP until Xen/ARM gets
> >more PCIe support.
> 
> I feel a bit sad to ack a patch disabling PCIe in the ZynqMP.
> 
> Before doing that. Can you describe what is the exact problem with Xen? It
> might be possible that we don't parse correctly the device-tree.

Yes, it was related to the DTB since we had a workaround with a modified
DTB.

Anyway, I was to quick with sending out the patch.

We've been carrying this for a while and your questions prompted me
to have a look again and dig out the details. Turns out I can't reproduce it
with our latest kernel and the latest Xen. I don't know what fixed
it yet but PCIe works ATM both on QEMU models and on real HW.

I don't think it ever has before without modified DTBs.

So please ignore this patch for now :-)

Cheers,
Edgar


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/5] xen: RCU: avoid busy waiting until the end of grace period.

2017-07-31 Thread Stefano Stabellini
On Tue, 1 Aug 2017, Dario Faggioli wrote:
> On Mon, 2017-07-31 at 14:20 -0700, Stefano Stabellini wrote:
> > On Thu, 27 Jul 2017, Dario Faggioli wrote:
> > > 
> > > diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
> > > index f0fdc87..4586f2a 100644
> > > --- a/xen/common/rcupdate.c
> > > +++ b/xen/common/rcupdate.c
> > > @@ -84,8 +84,14 @@ struct rcu_data {
> > >  int cpu;
> > >  struct rcu_head barrier;
> > >  longlast_rs_qlen; /* qlen during the last
> > > resched */
> > > +
> > > +/* 3) idle CPUs handling */
> > > +struct timer idle_timer;
> > > +bool idle_timer_active;
> > >  };
> > >  
> > > +#define RCU_IDLE_TIMER_PERIOD MILLISECS(10)
> > 
> > Isn't this a bit too short? How is it chosen?
> > 
> It's totally arbitrary (and that would be the case for whatever value
> we choose).
> 
> Basically, it's how long, at worst, after the actual end of a grace
> period, a (batch of) callback(s) will be invoked. Currently, on Credit1
> on ARM (without my patch, from this series, that suspends the tick)
> that's (by chance) 30 ms (or whatever value is chosen for Credit1
> timeslice). On Credit2 (on both ARM and x86), it's never, but on x86 it
> (apparently) is 'however frequent time sync rendezvouses happs' (which
> I don't recall, but it's longer), while on ARM is (potentially) never.
> 
> I accept suggestions about alternatives values, and I'm certainly fine
> with adding a comment, containing something along the lines of the
> explanation above, but I fear it's going to be hard to figure out what
> value is actually the "absolute best".
> 
> In Linux (which is where the same 'callback book-keeping' happens for
> them), a tick with a frequency of 1000Hz (== 1ms) is considered 'low-
> latency/Deskop/real-time'. For us, as said above, tick --when it's
> there-- would be 30ms by default.
> 
> I just went with something in the middle.
> 
> Also, it's not that we'll have a 10ms periodic timer going on for
> significant amount of time. In fact we expect it to actually fire just
> once (for each grace period). It's not 100% guaranteed that it won't be
> reprogrammed and fire a couple of times, but it should not, in the vast
> majority of cases.
> 
> What makes you think it's short?

In terms of power saving and CPU sleep states, 10ms is not much to sleep
for. I wonder if there are any power saving benefits in sleeping for
only 10ms (especially on x86 where entering and exiting CPU sleep states
takes longer, to be confirmed).  We might as well do the thing we need
to do immediately? I guess we cannot do that?___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-31 Thread Stefano Stabellini
On Mon, 31 Jul 2017, Edgar E. Iglesias wrote:
> > > >   @role Can only be 'master' or 'slave', it defaults to 
> > > > 'slave'.
> > > > 
> > > >   @prot When @role = master, this means the largest set 
> > > > of
> > > > stage-2 permission flags that can be granted to 
> > > > the
> > > > slave domains.
> > > > When @role = slave, this means the stage-2 permission
> > > > flags of the shared memory area.
> > > > Currently only 'rw' is supported. If not set. it
> > > > defaults to 'rw'.
> > > > 
> > > >   @cache_policy The stage-2 cacheability/shareability 
> > > > attributes of the
> > > > shared memory area. Currently, only two 
> > > > policies are
> > > > supported:
> > > >   * ARM_normal: Only applicable to ARM guests. 
> > > > This
> > > > would mean Inner and Outer 
> > > > Write-Back
> > > > Cacheable, and Inner Shareable.
> > > 
> > > 
> > > Is there a reason not to set this to Outer Shareable?
> > > Again, mainly useful when these pages get shared with devs as well.
> > > 
> > > The guest can always lower it to Inner Shareable via S1 tables if needed.
> > 
> > I don't think we can support memory sharing with devices in this version
> > of the document (see above about GSoC timelines). Normal memory is inner
> > shareable in Xen today, it makes sense to default to that.
> 
> I thought we mapped RAM as Outer shareable to guests but you seem to be right.
> I think we should be mapping all RAM as Outer Shareable and then let the
> guest decide what is Inner and what is Outer via it's S1 tables.
> Right now it would be impossible to be Coherent with a DMA device outside
> of the Inner domain...
> 
> Perhaps we should fix that and then ARM_normal would by itself become Outer.
> If there's agreement I can test it and send a patch.

Today, only device memory is mapped Outer Shareable, while normal memory
is mapped Inner Shareable. I am OK with changing the default in
mfn_to_p2m_entry to Outer Shareable for normal RAM if the change would
make it possible to do coherent DMA with more devices on the platform.
Julien?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Xen on Intel Atom E3815: crash, no output

2017-07-31 Thread Stefano Stabellini
Hi all,

I noticed that Xen does not boot on Intel Atom E3815. The system is a
Dell Edge Gateway 3003:

http://i.dell.com/sites/doccontent/shared-content/data-sheets/en/Documents/Dell_Edge_Gateway_3000_Series_spec_sheet.pdf?newtab=true

Grub2 loads Xen and Dom0, but no output comes out of Xen. After the
"Loading" messages from Grub2, Xen doesn't manage to print even a single
character and the system obviously crashes, but I don't know why because
there is no output. Before you ask, no I don't have a serial on the
system.

I tried to pass console=vga vga=text-80x25 and console=vga vga=ask, but
I still got nothing.

Do you have any ideas how to get some output on the screen? Do you know
how to get Xen to boot successfully?

Thanks,

Stefano

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable-smoke test] 112402: tolerable trouble: broken/pass - PUSHED

2017-07-31 Thread osstest service owner
flight 112402 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112402/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  b8029db62eb2a06a204a8e2b69437d0927bd1ac4
baseline version:
 xen  aa4eb460bcf77ea87b9209bb136efc8142a1a512

Last test of basis   112365  2017-07-28 18:01:42 Z3 days
Testing same since   112402  2017-07-31 21:02:08 Z0 days1 attempts


People who touched revisions under test:
  Andrii Anisov 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  broken  
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=b8029db62eb2a06a204a8e2b69437d0927bd1ac4
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
b8029db62eb2a06a204a8e2b69437d0927bd1ac4
+ branch=xen-unstable-smoke
+ revision=b8029db62eb2a06a204a8e2b69437d0927bd1ac4
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.9-testing
+ '[' xb8029db62eb2a06a204a8e2b69437d0927bd1ac4 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : 

[Xen-devel] [PATCH v3 11/13] xen/pvcalls: implement poll command

2017-07-31 Thread Stefano Stabellini
For active sockets, check the indexes and use the inflight_conn_req
waitqueue to wait.

For passive sockets if an accept is outstanding
(PVCALLS_FLAG_ACCEPT_INFLIGHT), check if it has been answered by looking
at bedata->rsp[req_id]. If so, return POLLIN.  Otherwise use the
inflight_accept_req waitqueue.

If no accepts are inflight, send PVCALLS_POLL to the backend. If we have
outstanding POLL requests awaiting for a response use the inflight_req
waitqueue: inflight_req is awaken when a new response is received; on
wakeup we check whether the POLL response is arrived by looking at the
PVCALLS_FLAG_POLL_RET flag. We set the flag from
pvcalls_front_event_handler, if the response was for a POLL command.

In pvcalls_front_event_handler, get the struct sock_mapping from the
poll id (we previously converted struct sock_mapping* to uint64_t and
used it as id).

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 135 +---
 drivers/xen/pvcalls-front.h |   3 +
 2 files changed, 129 insertions(+), 9 deletions(-)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 635a83a..1c975d6 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -72,6 +72,8 @@ struct sock_mapping {
 * Only one poll operation can be inflight for a given socket.
 */
 #define PVCALLS_FLAG_ACCEPT_INFLIGHT 0
+#define PVCALLS_FLAG_POLL_INFLIGHT   1
+#define PVCALLS_FLAG_POLL_RET2
uint8_t flags;
uint32_t inflight_req_id;
struct sock_mapping *accept_map;
@@ -139,15 +141,32 @@ static irqreturn_t pvcalls_front_event_handler(int irq, 
void *dev_id)
rsp = RING_GET_RESPONSE(>ring, bedata->ring.rsp_cons);
 
req_id = rsp->req_id;
-   dst = (uint8_t *)>rsp[req_id] + sizeof(rsp->req_id);
-   src = (uint8_t *)rsp + sizeof(rsp->req_id);
-   memcpy(dst, src, sizeof(*rsp) - sizeof(rsp->req_id));
-   /*
-* First copy the rest of the data, then req_id. It is
-* paired with the barrier when accessing bedata->rsp.
-*/
-   smp_wmb();
-   WRITE_ONCE(bedata->rsp[req_id].req_id, rsp->req_id);
+   if (rsp->cmd == PVCALLS_POLL) {
+   struct sock_mapping *map = (struct sock_mapping *)
+  rsp->u.poll.id;
+
+   set_bit(PVCALLS_FLAG_POLL_RET,
+   (void *)>passive.flags);
+   /*
+* Set RET, then clear INFLIGHT. It pairs with
+* the checks at the beginning of
+* pvcalls_front_poll_passive.
+*/
+   smp_wmb();
+   clear_bit(PVCALLS_FLAG_POLL_INFLIGHT,
+ (void *)>passive.flags);
+   } else {
+   dst = (uint8_t *)>rsp[req_id] +
+ sizeof(rsp->req_id);
+   src = (uint8_t *)rsp + sizeof(rsp->req_id);
+   memcpy(dst, src, sizeof(*rsp) - sizeof(rsp->req_id));
+   /*
+* First copy the rest of the data, then req_id. It is
+* paired with the barrier when accessing bedata->rsp.
+*/
+   smp_wmb();
+   WRITE_ONCE(bedata->rsp[req_id].req_id, rsp->req_id);
+   }
 
done = 1;
bedata->ring.rsp_cons++;
@@ -736,6 +755,104 @@ int pvcalls_front_accept(struct socket *sock, struct 
socket *newsock, int flags)
return ret;
 }
 
+static unsigned int pvcalls_front_poll_passive(struct file *file,
+  struct pvcalls_bedata *bedata,
+  struct sock_mapping *map,
+  poll_table *wait)
+{
+   int notify, req_id, ret;
+   struct xen_pvcalls_request *req;
+
+   if (test_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
+(void *)>passive.flags)) {
+   uint32_t req_id = READ_ONCE(map->passive.inflight_req_id);
+   if (req_id != PVCALLS_INVALID_ID &&
+   READ_ONCE(bedata->rsp[req_id].req_id) == req_id)
+   return POLLIN;
+
+   poll_wait(file, >passive.inflight_accept_req, wait);
+   return 0;
+   }
+
+   if (test_and_clear_bit(PVCALLS_FLAG_POLL_RET,
+  (void *)>passive.flags))
+   return POLLIN;
+
+   /*
+* First check RET, then INFLIGHT. No barriers necessary to
+* ensure execution ordering because of the 

[Xen-devel] [PATCH v3 02/13] xen/pvcalls: implement frontend disconnect

2017-07-31 Thread Stefano Stabellini
Introduce a data structure named pvcalls_bedata. It contains pointers to
the command ring, the event channel, a list of active sockets and a list
of passive sockets. Lists accesses are protected by a spin_lock.

Introduce a waitqueue to allow waiting for a response on commands sent
to the backend.

Introduce an array of struct xen_pvcalls_response to store commands
responses.

Implement pvcalls frontend removal function. Go through the list of
active and passive sockets and free them all, one at a time.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 51 +
 1 file changed, 51 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index a8d38c2..a126195 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -20,6 +20,29 @@
 #include 
 #include 
 
+#define PVCALLS_INVALID_ID UINT_MAX
+#define PVCALLS_RING_ORDER XENBUS_MAX_RING_GRANT_ORDER
+#define PVCALLS_NR_REQ_PER_RING __CONST_RING_SIZE(xen_pvcalls, XEN_PAGE_SIZE)
+
+struct pvcalls_bedata {
+   struct xen_pvcalls_front_ring ring;
+   grant_ref_t ref;
+   int irq;
+
+   struct list_head socket_mappings;
+   struct list_head socketpass_mappings;
+   spinlock_t pvcallss_lock;
+
+   wait_queue_head_t inflight_req;
+   struct xen_pvcalls_response rsp[PVCALLS_NR_REQ_PER_RING];
+};
+static struct xenbus_device *pvcalls_front_dev;
+
+static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
+{
+   return IRQ_HANDLED;
+}
+
 static const struct xenbus_device_id pvcalls_front_ids[] = {
{ "pvcalls" },
{ "" }
@@ -27,6 +50,34 @@
 
 static int pvcalls_front_remove(struct xenbus_device *dev)
 {
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map = NULL, *n;
+
+   bedata = dev_get_drvdata(_front_dev->dev);
+
+   list_for_each_entry_safe(map, n, >socket_mappings, list) {
+   mutex_lock(>active.in_mutex);
+   mutex_lock(>active.out_mutex);
+   pvcalls_front_free_map(bedata, map);
+   mutex_unlock(>active.out_mutex);
+   mutex_unlock(>active.in_mutex);
+   kfree(map);
+   }
+   list_for_each_entry_safe(map, n, >socketpass_mappings, list) {
+   spin_lock(>pvcallss_lock);
+   list_del_init(>list);
+   spin_unlock(>pvcallss_lock);
+   kfree(map);
+   }
+   if (bedata->irq > 0)
+   unbind_from_irqhandler(bedata->irq, dev);
+   if (bedata->ref >= 0)
+   gnttab_end_foreign_access(bedata->ref, 0, 0);
+   kfree(bedata->ring.sring);
+   kfree(bedata);
+   dev_set_drvdata(>dev, NULL);
+   xenbus_switch_state(dev, XenbusStateClosed);
+   pvcalls_front_dev = NULL;
return 0;
 }
 
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 12/13] xen/pvcalls: implement release command

2017-07-31 Thread Stefano Stabellini
Send PVCALLS_RELEASE to the backend and wait for a reply. Take both
in_mutex and out_mutex to avoid concurrent accesses. Then, free the
socket.

For passive sockets, check whether we have already pre-allocated an
active socket for the purpose of being accepted. If so, free that as
well.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 88 +
 drivers/xen/pvcalls-front.h |  1 +
 2 files changed, 89 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 1c975d6..775a6d2 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -192,6 +192,23 @@ static irqreturn_t pvcalls_front_conn_handler(int irq, 
void *sock_map)
return IRQ_HANDLED;
 }
 
+static void pvcalls_front_free_map(struct pvcalls_bedata *bedata,
+  struct sock_mapping *map)
+{
+   int i;
+
+   spin_lock(>pvcallss_lock);
+   if (!list_empty(>list))
+   list_del_init(>list);
+   spin_unlock(>pvcallss_lock);
+
+   for (i = 0; i < (1 << map->active.ring->ring_order); i++)
+   gnttab_end_foreign_access(map->active.ring->ref[i], 0, 0);
+   gnttab_end_foreign_access(map->active.ref, 0, 0);
+   free_page((unsigned long)map->active.ring);
+   unbind_from_irqhandler(map->active.irq, map);
+}
+
 int pvcalls_front_socket(struct socket *sock)
 {
struct pvcalls_bedata *bedata;
@@ -853,6 +870,77 @@ unsigned int pvcalls_front_poll(struct file *file, struct 
socket *sock,
return pvcalls_front_poll_passive(file, bedata, map, wait);
 }
 
+int pvcalls_front_release(struct socket *sock)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map;
+   int req_id, notify, ret;
+   struct xen_pvcalls_request *req;
+
+   if (!pvcalls_front_dev)
+   return -EIO;
+   bedata = dev_get_drvdata(_front_dev->dev);
+
+   if (sock->sk == NULL)
+   return 0;
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (map == NULL)
+   return 0;
+
+   spin_lock(>pvcallss_lock);
+   ret = get_request(bedata, _id);
+   if (ret < 0) {
+   spin_unlock(>pvcallss_lock);
+   return ret;
+   }
+   WRITE_ONCE(sock->sk->sk_send_head, NULL);
+
+   req = RING_GET_REQUEST(>ring, req_id);
+   req->req_id = req_id;
+   req->cmd = PVCALLS_RELEASE;
+   req->u.release.id = (uint64_t)map;
+
+   bedata->ring.req_prod_pvt++;
+   RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(>ring, notify);
+   spin_unlock(>pvcallss_lock);
+   if (notify)
+   notify_remote_via_irq(bedata->irq);
+
+   wait_event(bedata->inflight_req,
+  READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
+
+   if (map->active_socket) {
+   /* 
+* Set in_error and wake up inflight_conn_req to force
+* recvmsg waiters to exit.
+*/
+   map->active.ring->in_error = -EBADF;
+   wake_up_interruptible(>active.inflight_conn_req);
+
+   mutex_lock(>active.in_mutex);
+   mutex_lock(>active.out_mutex);
+   pvcalls_front_free_map(bedata, map);
+   mutex_unlock(>active.out_mutex);
+   mutex_unlock(>active.in_mutex);
+   kfree(map);
+   } else {
+   spin_lock(>pvcallss_lock);
+   if (READ_ONCE(map->passive.inflight_req_id) !=
+   PVCALLS_INVALID_ID) {
+   pvcalls_front_free_map(bedata,
+  map->passive.accept_map);
+   kfree(map->passive.accept_map);
+   }
+   list_del_init(>list);
+   kfree(map);
+   spin_unlock(>pvcallss_lock);
+   }
+   WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);
+
+   return 0;
+}
+
 static const struct xenbus_device_id pvcalls_front_ids[] = {
{ "pvcalls" },
{ "" }
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index 25e05b8..3332978 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ -23,5 +23,6 @@ int pvcalls_front_recvmsg(struct socket *sock,
 unsigned int pvcalls_front_poll(struct file *file,
struct socket *sock,
poll_table *wait);
+int pvcalls_front_release(struct socket *sock);
 
 #endif
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 10/13] xen/pvcalls: implement recvmsg

2017-07-31 Thread Stefano Stabellini
Implement recvmsg by copying data from the "in" ring. If not enough data
is available and the recvmsg call is blocking, then wait on the
inflight_conn_req waitqueue. Take the active socket in_mutex so that
only one function can access the ring at any given time.

If no data is available on the ring, rather than returning immediately
or sleep-waiting, spin for up to 5000 cycles. This small optimization
turns out to improve performance and latency significantly.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 102 
 drivers/xen/pvcalls-front.h |   4 ++
 2 files changed, 106 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 369acde..635a83a 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -105,6 +105,20 @@ static int pvcalls_front_write_todo(struct sock_mapping 
*map)
return size - pvcalls_queued(prod, cons, size);
 }
 
+static bool pvcalls_front_read_todo(struct sock_mapping *map)
+{
+   struct pvcalls_data_intf *intf = map->active.ring;
+   RING_IDX cons, prod;
+   int32_t error;
+
+   cons = intf->in_cons;
+   prod = intf->in_prod;
+   error = intf->in_error;
+   return (error != 0 ||
+   pvcalls_queued(prod, cons,
+  XEN_FLEX_RING_SIZE(intf->ring_order)) != 0);
+}
+
 static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
 {
struct xenbus_device *dev = dev_id;
@@ -434,6 +448,94 @@ int pvcalls_front_sendmsg(struct socket *sock, struct 
msghdr *msg,
return tot_sent;
 }
 
+static int __read_ring(struct pvcalls_data_intf *intf,
+  struct pvcalls_data *data,
+  struct iov_iter *msg_iter,
+  size_t len, int flags)
+{
+   RING_IDX cons, prod, size, masked_prod, masked_cons;
+   RING_IDX array_size = XEN_FLEX_RING_SIZE(intf->ring_order);
+   int32_t error;
+
+   cons = intf->in_cons;
+   prod = intf->in_prod;
+   error = intf->in_error;
+   /* get pointers before reading from the ring */
+   virt_rmb();
+   if (error < 0)
+   return error;
+
+   size = pvcalls_queued(prod, cons, array_size);
+   masked_prod = pvcalls_mask(prod, array_size);
+   masked_cons = pvcalls_mask(cons, array_size);
+
+   if (size == 0)
+   return 0;
+
+   if (len > size)
+   len = size;
+
+   if (masked_prod > masked_cons) {
+   copy_to_iter(data->in + masked_cons, len, msg_iter);
+   } else {
+   if (len > (array_size - masked_cons)) {
+   copy_to_iter(data->in + masked_cons,
+array_size - masked_cons, msg_iter);
+   copy_to_iter(data->in,
+len - (array_size - masked_cons),
+msg_iter);
+   } else {
+   copy_to_iter(data->in + masked_cons, len, msg_iter);
+   }
+   }
+   /* read data from the ring before increasing the index */
+   virt_mb();
+   if (!(flags & MSG_PEEK))
+   intf->in_cons += len;
+
+   return len;
+}
+
+int pvcalls_front_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+int flags)
+{
+   struct pvcalls_bedata *bedata;
+   int ret = -EAGAIN;
+   struct sock_mapping *map;
+
+   if (!pvcalls_front_dev)
+   return -ENOTCONN;
+   bedata = dev_get_drvdata(_front_dev->dev);
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (!map)
+   return -ENOTSOCK;
+
+   if (flags & (MSG_CMSG_CLOEXEC|MSG_ERRQUEUE|MSG_OOB|MSG_TRUNC))
+   return -EOPNOTSUPP;
+
+   mutex_lock(>active.in_mutex);
+   if (len > XEN_FLEX_RING_SIZE(map->active.ring->ring_order))
+   len = XEN_FLEX_RING_SIZE(map->active.ring->ring_order);
+
+   while (!(flags & MSG_DONTWAIT) && !pvcalls_front_read_todo(map)) {
+   wait_event_interruptible(map->active.inflight_conn_req,
+pvcalls_front_read_todo(map));
+   }
+   ret = __read_ring(map->active.ring, >active.data,
+ >msg_iter, len, flags);
+
+   if (ret > 0)
+   notify_remote_via_irq(map->active.irq);
+   if (ret == 0)
+   ret = -EAGAIN;
+   if (ret == -ENOTCONN)
+   ret = 0;
+
+   mutex_unlock(>active.in_mutex);
+   return ret;
+}
+
 int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr, int 
addr_len)
 {
struct pvcalls_bedata *bedata;
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index d937c24..de24041 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ 

[Xen-devel] [PATCH v3 06/13] xen/pvcalls: implement bind command

2017-07-31 Thread Stefano Stabellini
Send PVCALLS_BIND to the backend. Introduce a new structure, part of
struct sock_mapping, to store information specific to passive sockets.

Introduce a status field to keep track of the status of the passive
socket.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 59 +
 drivers/xen/pvcalls-front.h |  3 +++
 2 files changed, 62 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 379b8fb..5ccef34 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -59,6 +59,13 @@ struct sock_mapping {
 
wait_queue_head_t inflight_conn_req;
} active;
+   struct {
+   /* Socket status */
+#define PVCALLS_STATUS_UNINITALIZED  0
+#define PVCALLS_STATUS_BIND  1
+#define PVCALLS_STATUS_LISTEN2
+   uint8_t status;
+   } passive;
};
 };
 
@@ -308,6 +315,58 @@ int pvcalls_front_connect(struct socket *sock, struct 
sockaddr *addr,
return ret;
 }
 
+int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr, int 
addr_len)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map = NULL;
+   struct xen_pvcalls_request *req;
+   int notify, req_id, ret;
+
+   if (!pvcalls_front_dev)
+   return -ENOTCONN;
+   if (addr->sa_family != AF_INET || sock->type != SOCK_STREAM)
+   return -ENOTSUPP;
+   bedata = dev_get_drvdata(_front_dev->dev);
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (map == NULL)
+   return -EINVAL;
+
+   spin_lock(>pvcallss_lock);
+   ret = get_request(bedata, _id);
+   if (ret < 0) {
+   spin_unlock(>pvcallss_lock);
+   return ret;
+   }
+   req = RING_GET_REQUEST(>ring, req_id);
+   req->req_id = req_id;
+   map->sock = sock;
+   req->cmd = PVCALLS_BIND;
+   req->u.bind.id = (uint64_t) map;
+   memcpy(req->u.bind.addr, addr, sizeof(*addr));
+   req->u.bind.len = addr_len;
+
+   init_waitqueue_head(>passive.inflight_accept_req);
+
+   map->active_socket = false;
+
+   bedata->ring.req_prod_pvt++;
+   RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(>ring, notify);
+   spin_unlock(>pvcallss_lock);
+   if (notify)
+   notify_remote_via_irq(bedata->irq);
+
+   wait_event(bedata->inflight_req,
+  READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
+
+   map->passive.status = PVCALLS_STATUS_BIND;
+   ret = bedata->rsp[req_id].ret;
+   /* read ret, then set this rsp slot to be reused */
+   smp_mb();
+   WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);
+   return 0;
+}
+
 static const struct xenbus_device_id pvcalls_front_ids[] = {
{ "pvcalls" },
{ "" }
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index 63b0417..8b0a274 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ -6,5 +6,8 @@
 int pvcalls_front_socket(struct socket *sock);
 int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr,
  int addr_len, int flags);
+int pvcalls_front_bind(struct socket *sock,
+  struct sockaddr *addr,
+  int addr_len);
 
 #endif
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 08/13] xen/pvcalls: implement accept command

2017-07-31 Thread Stefano Stabellini
Introduce a waitqueue to allow only one outstanding accept command at
any given time and to implement polling on the passive socket. Introduce
a flags field to keep track of in-flight accept and poll commands.

Send PVCALLS_ACCEPT to the backend. Allocate a new active socket. Make
sure that only one accept command is executed at any given time by
setting PVCALLS_FLAG_ACCEPT_INFLIGHT and waiting on the
inflight_accept_req waitqueue.

Convert the new struct sock_mapping pointer into an uint64_t and use it
as id for the new socket to pass to the backend.

Check if the accept call is non-blocking: in that case after sending the
ACCEPT command to the backend store the sock_mapping pointer of the new
struct and the inflight req_id then return -EAGAIN (which will respond
only when there is something to accept). Next time accept is called,
we'll check if the ACCEPT command has been answered, if so we'll pick up
where we left off, otherwise we return -EAGAIN again.

Note that, differently from the other commands, we can use
wait_event_interruptible (instead of wait_event) in the case of accept
as we are able to track the req_id of the ACCEPT response that we are
waiting.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 111 
 drivers/xen/pvcalls-front.h |   3 ++
 2 files changed, 114 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index b2757f5..f83b910 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -65,6 +65,16 @@ struct sock_mapping {
 #define PVCALLS_STATUS_BIND  1
 #define PVCALLS_STATUS_LISTEN2
uint8_t status;
+   /*
+* Internal state-machine flags.
+* Only one accept operation can be inflight for a socket.
+* Only one poll operation can be inflight for a given socket.
+*/
+#define PVCALLS_FLAG_ACCEPT_INFLIGHT 0
+   uint8_t flags;
+   uint32_t inflight_req_id;
+   struct sock_mapping *accept_map;
+   wait_queue_head_t inflight_accept_req;
} passive;
};
 };
@@ -414,6 +424,107 @@ int pvcalls_front_listen(struct socket *sock, int backlog)
return ret;
 }
 
+int pvcalls_front_accept(struct socket *sock, struct socket *newsock, int 
flags)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map;
+   struct sock_mapping *map2 = NULL;
+   struct xen_pvcalls_request *req;
+   int notify, req_id, ret, evtchn, nonblock;
+
+   if (!pvcalls_front_dev)
+   return -ENOTCONN;
+   bedata = dev_get_drvdata(_front_dev->dev);
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (!map)
+   return -ENOTSOCK;
+
+   if (map->passive.status != PVCALLS_STATUS_LISTEN)
+   return -EINVAL;
+
+   nonblock = flags & SOCK_NONBLOCK;
+   /*
+* Backend only supports 1 inflight accept request, will return
+* errors for the others
+*/
+   if (test_and_set_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
+(void *)>passive.flags)) {
+   req_id = READ_ONCE(map->passive.inflight_req_id);
+   if (req_id != PVCALLS_INVALID_ID &&
+   READ_ONCE(bedata->rsp[req_id].req_id) == req_id)
+   goto received;
+   if (nonblock)
+   return -EAGAIN;
+   if (wait_event_interruptible(map->passive.inflight_accept_req,
+   !test_and_set_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
+ (void *)>passive.flags)))
+   return -EINTR;
+   }
+
+   spin_lock(>pvcallss_lock);
+   ret = get_request(bedata, _id);
+   if (ret < 0) {
+   spin_unlock(>pvcallss_lock);
+   return ret;
+   }   
+   map2 = kzalloc(sizeof(*map2), GFP_KERNEL);
+   if (map2 == NULL)
+   return -ENOMEM;
+   ret =  create_active(map2, );
+   if (ret < 0) {
+   kfree(map2);
+   return -ENOMEM;
+   }
+   list_add_tail(>list, >socket_mappings);
+
+   req = RING_GET_REQUEST(>ring, req_id);
+   req->req_id = req_id;
+   req->cmd = PVCALLS_ACCEPT;
+   req->u.accept.id = (uint64_t) map;
+   req->u.accept.ref = map2->active.ref;
+   req->u.accept.id_new = (uint64_t) map2;
+   req->u.accept.evtchn = evtchn;
+   map->passive.accept_map = map2;
+
+   bedata->ring.req_prod_pvt++;
+   RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(>ring, notify);
+   spin_unlock(>pvcallss_lock);
+   if (notify)
+   notify_remote_via_irq(bedata->irq);
+   if (nonblock) {
+   WRITE_ONCE(map->passive.inflight_req_id, req_id);
+

[Xen-devel] [PATCH v3 00/13] introduce the Xen PV Calls frontend

2017-07-31 Thread Stefano Stabellini
Hi all,

this series introduces the frontend for the newly introduced PV Calls
procotol.

PV Calls is a paravirtualized protocol that allows the implementation of
a set of POSIX functions in a different domain. The PV Calls frontend
sends POSIX function calls to the backend, which implements them and
returns a value to the frontend and acts on the function call.

For more information about PV Calls, please read:

https://xenbits.xen.org/docs/unstable/misc/pvcalls.html

This patch series only implements the frontend driver. It doesn't
attempt to redirect POSIX calls to it. The functions exported in
pvcalls-front.h are meant to be used for that. A separate patch series
will be sent to use them and hook them into the system.



Changes in v3:

In addition to addressing all comments, in this version of the series I
also made one other significant change: I implemented non-blocking
accept, which I didn't realize was missing in the last version of the
series (non-blocking accept is usually called only after a successful
poll, when there are already outstanding connections to accept). To do
that, I also changed the id for sockets sent to the backend from the
value of the struct socket pointer to the value of the struct
sock_mapping pointer. It's all documented in the patch descriptions.
This is the full list of changes:

- use struct sock_mapping* instead of struct socket* as socket id to
  share with the backend
- allocate struct sock_mapping for a socket in pvcalls_front_socket
- in pvcalls_front_accept check if the request is non-blocking and
  return -EAGAIN in that case
- store req_id and struct sock_mapping * of an inflight accept request,
  so that we can resume when it's ready, in case the accept is
  non-blocking
- remove unnecessary parenthesis
- rename RING_ORDER to PVCALLS_RING_ORDER
- make pvcalls_front_dev static
- remove ref local variable from pvcalls_front_probe
- move patch #12 before patch #2
- move dev_set_drvdata before any got error in pvcalls_front_probe
- combine src and dst calculation lines in pvcalls_front_event_handler
- remove unnecessary != 0 in the wait_event and wait_event_interruptible
  tests
- add an in-code comment about sk_send_head
- In v2 I applied some changes to the wrong patch. Move changes to the
  right patches.
- refactor the code to get a req_id into a function
- use wait_event (instead of wait_event_interruptible) to wait for a
  response from the backend because we cannot cope with missing
  responses
- remove the usage of PVCALLS_FRONT_MAX_SPIN from recvmsg
- remove unnecessary !bedata check in pvcalls_front_release 



Changes in v2:
- use xenbus_read_unsigned when possible
- call dev_set_drvdata earlier in pvcalls_front_probe not to dereference
  a NULL pointer in the error path
- set ret appropriately in pvcalls_front_probe
- include pvcalls-front.h in pvcalls-front.c
- call wake_up only once after the consuming loop in pvcalls_front_event_handler
- don't leak "bytes" in case of errors in create_active
- call spin_unlock appropriately in case of errors in create_active
- remove all BUG_ONs
- don't leak newsock->sk in pvcalls_front_accept in case of errors
- rename PVCALLS_FRON_MAX_SPIN to PVCALLS_FRONT_MAX_SPIN
- return bool from pvcalls_front_read_todo
- add a barrier after setting PVCALLS_FLAG_POLL_RET in
  pvcalls_front_event_handler
- remove outdated comment in pvcalls_front_free_map
- clear sock->sk->sk_send_head later in pvcalls_front_release
- make XEN_PVCALLS_FRONTEND tristate
- don't add an empty resume function



Stefano Stabellini (13):
  xen/pvcalls: introduce the pvcalls xenbus frontend
  xen/pvcalls: implement frontend disconnect
  xen/pvcalls: connect to the backend
  xen/pvcalls: implement socket command and handle events
  xen/pvcalls: implement connect command
  xen/pvcalls: implement bind command
  xen/pvcalls: implement listen command
  xen/pvcalls: implement accept command
  xen/pvcalls: implement sendmsg
  xen/pvcalls: implement recvmsg
  xen/pvcalls: implement poll command
  xen/pvcalls: implement release command
  xen: introduce a Kconfig option to enable the pvcalls frontend

 drivers/xen/Kconfig |9 +
 drivers/xen/Makefile|1 +
 drivers/xen/pvcalls-front.c | 1140 +++
 drivers/xen/pvcalls-front.h |   28 ++
 4 files changed, 1178 insertions(+)
 create mode 100644 drivers/xen/pvcalls-front.c
 create mode 100644 drivers/xen/pvcalls-front.h

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 05/13] xen/pvcalls: implement connect command

2017-07-31 Thread Stefano Stabellini
Send PVCALLS_CONNECT to the backend. Allocate a new ring and evtchn for
the active socket.

Introduce fields in struct sock_mapping to keep track of active sockets.
Introduce a waitqueue to allow the frontend to wait on data coming from
the backend on the active socket (recvmsg command).

Two mutexes (one of reads and one for writes) will be used to protect
the active socket in and out rings from concurrent accesses.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 146 
 drivers/xen/pvcalls-front.h |   2 +
 2 files changed, 148 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 7c4a7cb..379b8fb 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -13,6 +13,10 @@
  */
 
 #include 
+#include 
+#include 
+
+#include 
 
 #include 
 #include 
@@ -44,6 +48,18 @@ struct sock_mapping {
bool active_socket;
struct list_head list;
struct socket *sock;
+   union {
+   struct {
+   int irq;
+   grant_ref_t ref;
+   struct pvcalls_data_intf *ring;
+   struct pvcalls_data data;
+   struct mutex in_mutex;
+   struct mutex out_mutex;
+
+   wait_queue_head_t inflight_conn_req;
+   } active;
+   };
 };
 
 static inline int get_request(struct pvcalls_bedata *bedata, int *req_id)
@@ -97,6 +113,18 @@ static irqreturn_t pvcalls_front_event_handler(int irq, 
void *dev_id)
return IRQ_HANDLED;
 }
 
+static irqreturn_t pvcalls_front_conn_handler(int irq, void *sock_map)
+{
+   struct sock_mapping *map = sock_map;
+
+   if (map == NULL)
+   return IRQ_HANDLED;
+
+   wake_up_interruptible(>active.inflight_conn_req);
+
+   return IRQ_HANDLED;
+}
+
 int pvcalls_front_socket(struct socket *sock)
 {
struct pvcalls_bedata *bedata;
@@ -162,6 +190,124 @@ int pvcalls_front_socket(struct socket *sock)
return ret;
 }
 
+static int create_active(struct sock_mapping *map, int *evtchn)
+{
+   void *bytes;
+   int ret = -ENOMEM, irq = -1, i;
+
+   init_waitqueue_head(>active.inflight_conn_req);
+
+   map->active.ring = (struct pvcalls_data_intf *)
+   __get_free_page(GFP_KERNEL | __GFP_ZERO);
+   if (map->active.ring == NULL)
+   goto out_error;
+   memset(map->active.ring, 0, XEN_PAGE_SIZE);
+   map->active.ring->ring_order = PVCALLS_RING_ORDER;
+   bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+   map->active.ring->ring_order);
+   if (bytes == NULL)
+   goto out_error;
+   for (i = 0; i < (1 << map->active.ring->ring_order); i++)
+   map->active.ring->ref[i] = gnttab_grant_foreign_access(
+   pvcalls_front_dev->otherend_id,
+   pfn_to_gfn(virt_to_pfn(bytes) + i), 0);
+
+   map->active.ref = gnttab_grant_foreign_access(
+   pvcalls_front_dev->otherend_id,
+   pfn_to_gfn(virt_to_pfn((void *)map->active.ring)), 0);
+
+   map->active.data.in = bytes;
+   map->active.data.out = bytes +
+   XEN_FLEX_RING_SIZE(map->active.ring->ring_order);
+
+   ret = xenbus_alloc_evtchn(pvcalls_front_dev, evtchn);
+   if (ret)
+   goto out_error;
+   irq = bind_evtchn_to_irqhandler(*evtchn, pvcalls_front_conn_handler,
+   0, "pvcalls-frontend", map);
+   if (irq < 0) {
+   ret = irq;
+   goto out_error;
+   }
+
+   map->active.irq = irq;
+   map->active_socket = true;
+   mutex_init(>active.in_mutex);
+   mutex_init(>active.out_mutex);
+
+   return 0;
+
+out_error:
+   if (irq >= 0)
+   unbind_from_irqhandler(irq, map);
+   else if (*evtchn >= 0)
+   xenbus_free_evtchn(pvcalls_front_dev, *evtchn);
+   kfree(map->active.data.in);
+   kfree(map->active.ring);
+   kfree(map);
+   return ret;
+}
+
+int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr,
+   int addr_len, int flags)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map = NULL;
+   struct xen_pvcalls_request *req;
+   int notify, req_id, ret, evtchn;
+
+   if (!pvcalls_front_dev)
+   return -ENETUNREACH;
+   if (addr->sa_family != AF_INET || sock->type != SOCK_STREAM)
+   return -ENOTSUPP;
+
+   bedata = dev_get_drvdata(_front_dev->dev);
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (!map)
+   return -EINVAL;
+
+   spin_lock(>pvcallss_lock);
+   ret = get_request(bedata, _id);
+   if (ret < 0) {
+   

[Xen-devel] [PATCH v3 13/13] xen: introduce a Kconfig option to enable the pvcalls frontend

2017-07-31 Thread Stefano Stabellini
Also add pvcalls-front to the Makefile.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/Kconfig  | 9 +
 drivers/xen/Makefile | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 4545561..0b2c828 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -196,6 +196,15 @@ config XEN_PCIDEV_BACKEND
 
  If in doubt, say m.
 
+config XEN_PVCALLS_FRONTEND
+   tristate "XEN PV Calls frontend driver"
+   depends on INET && XEN
+   help
+ Experimental frontend for the Xen PV Calls protocol
+ (https://xenbits.xen.org/docs/unstable/misc/pvcalls.html). It
+ sends a small set of POSIX calls to the backend, which
+ implements them.
+
 config XEN_PVCALLS_BACKEND
bool "XEN PV Calls backend driver"
depends on INET && XEN && XEN_BACKEND
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 480b928..afb9e03 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -39,6 +39,7 @@ obj-$(CONFIG_XEN_EFI) += efi.o
 obj-$(CONFIG_XEN_SCSI_BACKEND) += xen-scsiback.o
 obj-$(CONFIG_XEN_AUTO_XLATE)   += xlate_mmu.o
 obj-$(CONFIG_XEN_PVCALLS_BACKEND)  += pvcalls-back.o
+obj-$(CONFIG_XEN_PVCALLS_FRONTEND) += pvcalls-front.o
 xen-evtchn-y   := evtchn.o
 xen-gntdev-y   := gntdev.o
 xen-gntalloc-y := gntalloc.o
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 01/13] xen/pvcalls: introduce the pvcalls xenbus frontend

2017-07-31 Thread Stefano Stabellini
Introduce a xenbus frontend for the pvcalls protocol, as defined by
https://xenbits.xen.org/docs/unstable/misc/pvcalls.html.

This patch only adds the stubs, the code will be added by the following
patches.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 61 +
 1 file changed, 61 insertions(+)
 create mode 100644 drivers/xen/pvcalls-front.c

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
new file mode 100644
index 000..a8d38c2
--- /dev/null
+++ b/drivers/xen/pvcalls-front.c
@@ -0,0 +1,61 @@
+/*
+ * (c) 2017 Stefano Stabellini 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static const struct xenbus_device_id pvcalls_front_ids[] = {
+   { "pvcalls" },
+   { "" }
+};
+
+static int pvcalls_front_remove(struct xenbus_device *dev)
+{
+   return 0;
+}
+
+static int pvcalls_front_probe(struct xenbus_device *dev,
+ const struct xenbus_device_id *id)
+{
+   return 0;
+}
+
+static void pvcalls_front_changed(struct xenbus_device *dev,
+   enum xenbus_state backend_state)
+{
+}
+
+static struct xenbus_driver pvcalls_front_driver = {
+   .ids = pvcalls_front_ids,
+   .probe = pvcalls_front_probe,
+   .remove = pvcalls_front_remove,
+   .otherend_changed = pvcalls_front_changed,
+};
+
+static int __init pvcalls_frontend_init(void)
+{
+   if (!xen_domain())
+   return -ENODEV;
+
+   pr_info("Initialising Xen pvcalls frontend driver\n");
+
+   return xenbus_register_frontend(_front_driver);
+}
+
+module_init(pvcalls_frontend_init);
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 04/13] xen/pvcalls: implement socket command and handle events

2017-07-31 Thread Stefano Stabellini
Send a PVCALLS_SOCKET command to the backend, use the masked
req_prod_pvt as req_id. This way, req_id is guaranteed to be between 0
and PVCALLS_NR_REQ_PER_RING. We already have a slot in the rsp array
ready for the response, and there cannot be two outstanding responses
with the same req_id.

Wait for the response by waiting on the inflight_req waitqueue and
check for the req_id field in rsp[req_id]. Use atomic accesses and
barriers to read the field. Note that the barriers are simple smp
barriers (as opposed to virt barriers) because they are for internal
frontend synchronization, not frontend<->backend communication.

Once a response is received, clear the corresponding rsp slot by setting
req_id to PVCALLS_INVALID_ID. Note that PVCALLS_INVALID_ID is invalid
only from the frontend point of view. It is not part of the PVCalls
protocol.

pvcalls_front_event_handler is in charge of copying responses from the
ring to the appropriate rsp slot. It is done by copying the body of the
response first, then by copying req_id atomically. After the copies,
wake up anybody waiting on waitqueue.

pvcallss_lock protects accesses to the ring.

Create a new struct sock_mapping and convert the pointer into an
uint64_t and use it as id for the new socket to pass to the backend. The
struct will be fully initialized later on connect or bind. In this patch
the struct sock_mapping is empty, the fields will be added by the next
patch.

sock->sk->sk_send_head is not used for ip sockets: reuse the field to
store a pointer to the struct sock_mapping corresponding to the socket.
This way, we can easily get the struct sock_mapping from the struct
socket.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 119 
 drivers/xen/pvcalls-front.h |   8 +++
 2 files changed, 127 insertions(+)
 create mode 100644 drivers/xen/pvcalls-front.h

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 2afe36d..7c4a7cb 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 
+#include "pvcalls-front.h"
+
 #define PVCALLS_INVALID_ID UINT_MAX
 #define PVCALLS_RING_ORDER XENBUS_MAX_RING_GRANT_ORDER
 #define PVCALLS_NR_REQ_PER_RING __CONST_RING_SIZE(xen_pvcalls, XEN_PAGE_SIZE)
@@ -38,11 +40,128 @@ struct pvcalls_bedata {
 };
 static struct xenbus_device *pvcalls_front_dev;
 
+struct sock_mapping {
+   bool active_socket;
+   struct list_head list;
+   struct socket *sock;
+};
+
+static inline int get_request(struct pvcalls_bedata *bedata, int *req_id)
+{
+   *req_id = bedata->ring.req_prod_pvt & (RING_SIZE(>ring) - 1);
+   if (RING_FULL(>ring) ||
+   READ_ONCE(bedata->rsp[*req_id].req_id) != PVCALLS_INVALID_ID)
+   return -EAGAIN;
+   return 0;
+}
+
 static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
 {
+   struct xenbus_device *dev = dev_id;
+   struct pvcalls_bedata *bedata;
+   struct xen_pvcalls_response *rsp;
+   uint8_t *src, *dst;
+   int req_id = 0, more = 0, done = 0;
+
+   if (dev == NULL)
+   return IRQ_HANDLED;
+
+   bedata = dev_get_drvdata(>dev);
+   if (bedata == NULL)
+   return IRQ_HANDLED;
+
+again:
+   while (RING_HAS_UNCONSUMED_RESPONSES(>ring)) {
+   rsp = RING_GET_RESPONSE(>ring, bedata->ring.rsp_cons);
+
+   req_id = rsp->req_id;
+   dst = (uint8_t *)>rsp[req_id] + sizeof(rsp->req_id);
+   src = (uint8_t *)rsp + sizeof(rsp->req_id);
+   memcpy(dst, src, sizeof(*rsp) - sizeof(rsp->req_id));
+   /*
+* First copy the rest of the data, then req_id. It is
+* paired with the barrier when accessing bedata->rsp.
+*/
+   smp_wmb();
+   WRITE_ONCE(bedata->rsp[req_id].req_id, rsp->req_id);
+
+   done = 1;
+   bedata->ring.rsp_cons++;
+   }
+
+   RING_FINAL_CHECK_FOR_RESPONSES(>ring, more);
+   if (more)
+   goto again;
+   if (done)
+   wake_up(>inflight_req);
return IRQ_HANDLED;
 }
 
+int pvcalls_front_socket(struct socket *sock)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map = NULL;
+   struct xen_pvcalls_request *req;
+   int notify, req_id, ret;
+
+   if (!pvcalls_front_dev)
+   return -EACCES;
+   /*
+* PVCalls only supports domain AF_INET,
+* type SOCK_STREAM and protocol 0 sockets for now.
+*
+* Check socket type here, AF_INET and protocol checks are done
+* by the caller.
+*/
+   if (sock->type != SOCK_STREAM)
+   return -ENOTSUPP;
+
+   bedata = dev_get_drvdata(_front_dev->dev);
+
+   map = kzalloc(sizeof(*map), GFP_KERNEL);
+   if (map == NULL)
+   

[Xen-devel] [PATCH v3 07/13] xen/pvcalls: implement listen command

2017-07-31 Thread Stefano Stabellini
Send PVCALLS_LISTEN to the backend.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 47 +
 drivers/xen/pvcalls-front.h |  1 +
 2 files changed, 48 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 5ccef34..b2757f5 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -367,6 +367,53 @@ int pvcalls_front_bind(struct socket *sock, struct 
sockaddr *addr, int addr_len)
return 0;
 }
 
+int pvcalls_front_listen(struct socket *sock, int backlog)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map;
+   struct xen_pvcalls_request *req;
+   int notify, req_id, ret;
+
+   if (!pvcalls_front_dev)
+   return -ENOTCONN;
+   bedata = dev_get_drvdata(_front_dev->dev);
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (!map)
+   return -ENOTSOCK;
+
+   if (map->passive.status != PVCALLS_STATUS_BIND)
+   return -EOPNOTSUPP;
+
+   spin_lock(>pvcallss_lock);
+   ret = get_request(bedata, _id);
+   if (ret < 0) {
+   spin_unlock(>pvcallss_lock);
+   return ret;
+   }
+   req = RING_GET_REQUEST(>ring, req_id);
+   req->req_id = req_id;
+   req->cmd = PVCALLS_LISTEN;
+   req->u.listen.id = (uint64_t) map;
+   req->u.listen.backlog = backlog;
+
+   bedata->ring.req_prod_pvt++;
+   RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(>ring, notify);
+   spin_unlock(>pvcallss_lock);
+   if (notify)
+   notify_remote_via_irq(bedata->irq);
+
+   wait_event(bedata->inflight_req,
+  READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
+
+   map->passive.status = PVCALLS_STATUS_LISTEN;
+   ret = bedata->rsp[req_id].ret;
+   /* read ret, then set this rsp slot to be reused */
+   smp_mb();
+   WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);
+   return ret;
+}
+
 static const struct xenbus_device_id pvcalls_front_ids[] = {
{ "pvcalls" },
{ "" }
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index 8b0a274..aa8fe10 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ -9,5 +9,6 @@ int pvcalls_front_connect(struct socket *sock, struct sockaddr 
*addr,
 int pvcalls_front_bind(struct socket *sock,
   struct sockaddr *addr,
   int addr_len);
+int pvcalls_front_listen(struct socket *sock, int backlog);
 
 #endif
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 09/13] xen/pvcalls: implement sendmsg

2017-07-31 Thread Stefano Stabellini
Send data to an active socket by copying data to the "out" ring. Take
the active socket out_mutex so that only one function can access the
ring at any given time.

If not enough room is available on the ring, rather than returning
immediately or sleep-waiting, spin for up to 5000 cycles. This small
optimization turns out to improve performance significantly.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 109 
 drivers/xen/pvcalls-front.h |   3 ++
 2 files changed, 112 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index f83b910..369acde 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -29,6 +29,7 @@
 #define PVCALLS_INVALID_ID UINT_MAX
 #define PVCALLS_RING_ORDER XENBUS_MAX_RING_GRANT_ORDER
 #define PVCALLS_NR_REQ_PER_RING __CONST_RING_SIZE(xen_pvcalls, XEN_PAGE_SIZE)
+#define PVCALLS_FRONT_MAX_SPIN 5000
 
 struct pvcalls_bedata {
struct xen_pvcalls_front_ring ring;
@@ -88,6 +89,22 @@ static inline int get_request(struct pvcalls_bedata *bedata, 
int *req_id)
return 0;
 }
 
+static int pvcalls_front_write_todo(struct sock_mapping *map)
+{
+   struct pvcalls_data_intf *intf = map->active.ring;
+   RING_IDX cons, prod, size = XEN_FLEX_RING_SIZE(intf->ring_order);
+   int32_t error;
+
+   cons = intf->out_cons;
+   prod = intf->out_prod;
+   error = intf->out_error;
+   if (error == -ENOTCONN)
+   return 0;
+   if (error != 0)
+   return error;
+   return size - pvcalls_queued(prod, cons, size);
+}
+
 static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
 {
struct xenbus_device *dev = dev_id;
@@ -325,6 +342,98 @@ int pvcalls_front_connect(struct socket *sock, struct 
sockaddr *addr,
return ret;
 }
 
+static int __write_ring(struct pvcalls_data_intf *intf,
+   struct pvcalls_data *data,
+   struct iov_iter *msg_iter,
+   size_t len)
+{
+   RING_IDX cons, prod, size, masked_prod, masked_cons;
+   RING_IDX array_size = XEN_FLEX_RING_SIZE(intf->ring_order);
+   int32_t error;
+
+   cons = intf->out_cons;
+   prod = intf->out_prod;
+   error = intf->out_error;
+   /* read indexes before continuing */
+   virt_mb();
+
+   if (error < 0)
+   return error;
+
+   size = pvcalls_queued(prod, cons, array_size);
+   if (size >= array_size)
+   return 0;
+   if (len > array_size - size)
+   len = array_size - size;
+
+   masked_prod = pvcalls_mask(prod, array_size);
+   masked_cons = pvcalls_mask(cons, array_size);
+
+   if (masked_prod < masked_cons) {
+   copy_from_iter(data->out + masked_prod, len, msg_iter);
+   } else {
+   if (len > array_size - masked_prod) {
+   copy_from_iter(data->out + masked_prod,
+  array_size - masked_prod, msg_iter);
+   copy_from_iter(data->out,
+  len - (array_size - masked_prod),
+  msg_iter);
+   } else {
+   copy_from_iter(data->out + masked_prod, len, msg_iter);
+   }
+   }
+   /* write to ring before updating pointer */
+   virt_wmb();
+   intf->out_prod += len;
+
+   return len;
+}
+
+int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg,
+ size_t len)
+{
+   struct pvcalls_bedata *bedata;
+   struct sock_mapping *map;
+   int sent = 0, tot_sent = 0;
+   int count = 0, flags;
+
+   if (!pvcalls_front_dev)
+   return -ENOTCONN;
+   bedata = dev_get_drvdata(_front_dev->dev);
+
+   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
+   if (!map)
+   return -ENOTSOCK;
+
+   flags = msg->msg_flags;
+   if (flags & (MSG_CONFIRM|MSG_DONTROUTE|MSG_EOR|MSG_OOB))
+   return -EOPNOTSUPP;
+
+   mutex_lock(>active.out_mutex);
+   if ((flags & MSG_DONTWAIT) && !pvcalls_front_write_todo(map)) {
+   mutex_unlock(>active.out_mutex);
+   return -EAGAIN;
+   }
+
+again:
+   count++;
+   sent = __write_ring(map->active.ring,
+   >active.data, >msg_iter,
+   len);
+   if (sent > 0) {
+   len -= sent;
+   tot_sent += sent;
+   notify_remote_via_irq(map->active.irq);
+   }
+   if (sent >= 0 && len > 0 && count < PVCALLS_FRONT_MAX_SPIN)
+   goto again;
+   if (sent < 0)
+   tot_sent = sent;
+
+   mutex_unlock(>active.out_mutex);
+   return tot_sent;
+}
+
 int pvcalls_front_bind(struct socket *sock, struct 

[Xen-devel] [PATCH v3 03/13] xen/pvcalls: connect to the backend

2017-07-31 Thread Stefano Stabellini
Implement the probe function for the pvcalls frontend. Read the
supported versions, max-page-order and function-calls nodes from
xenstore.

Only one frontend<->backend connection is supported at any given time
for a guest. Store the active frontend device to a static pointer.

Introduce a stub functions for the event handler.

Signed-off-by: Stefano Stabellini 
CC: boris.ostrov...@oracle.com
CC: jgr...@suse.com
---
 drivers/xen/pvcalls-front.c | 130 
 1 file changed, 130 insertions(+)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index a126195..2afe36d 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -84,12 +84,142 @@ static int pvcalls_front_remove(struct xenbus_device *dev)
 static int pvcalls_front_probe(struct xenbus_device *dev,
  const struct xenbus_device_id *id)
 {
+   int ret = -ENOMEM, evtchn, i;
+   unsigned int max_page_order, function_calls, len;
+   char *versions;
+   grant_ref_t gref_head = 0;
+   struct xenbus_transaction xbt;
+   struct pvcalls_bedata *bedata = NULL;
+   struct xen_pvcalls_sring *sring;
+
+   if (pvcalls_front_dev != NULL) {
+   dev_err(>dev, "only one PV Calls connection supported\n");
+   return -EINVAL;
+   }
+
+   versions = xenbus_read(XBT_NIL, dev->otherend, "versions", );
+   if (!len)
+   return -EINVAL;
+   if (strcmp(versions, "1")) {
+   kfree(versions);
+   return -EINVAL;
+   }
+   kfree(versions);
+   max_page_order = xenbus_read_unsigned(dev->otherend,
+ "max-page-order", 0);
+   if (max_page_order < PVCALLS_RING_ORDER)
+   return -ENODEV;
+   function_calls = xenbus_read_unsigned(dev->otherend,
+ "function-calls", 0);
+   if (function_calls != 1)
+   return -ENODEV;
+   pr_info("%s max-page-order is %u\n", __func__, max_page_order);
+
+   bedata = kzalloc(sizeof(struct pvcalls_bedata), GFP_KERNEL);
+   if (!bedata)
+   return -ENOMEM;
+
+   dev_set_drvdata(>dev, bedata);
+   pvcalls_front_dev = dev;
+   init_waitqueue_head(>inflight_req);
+   for (i = 0; i < PVCALLS_NR_REQ_PER_RING; i++)
+   bedata->rsp[i].req_id = PVCALLS_INVALID_ID;
+
+   sring = (struct xen_pvcalls_sring *) __get_free_page(GFP_KERNEL |
+__GFP_ZERO);
+   if (!sring)
+   goto error;
+   SHARED_RING_INIT(sring);
+   FRONT_RING_INIT(>ring, sring, XEN_PAGE_SIZE);
+
+   ret = xenbus_alloc_evtchn(dev, );
+   if (ret)
+   goto error;
+
+   bedata->irq = bind_evtchn_to_irqhandler(evtchn,
+   pvcalls_front_event_handler,
+   0, "pvcalls-frontend", dev);
+   if (bedata->irq < 0) {
+   ret = bedata->irq;
+   goto error;
+   }
+
+   ret = gnttab_alloc_grant_references(1, _head);
+   if (ret < 0)
+   goto error;
+   bedata->ref = gnttab_claim_grant_reference(_head);
+   if (bedata->ref < 0) {
+   ret = bedata->ref;
+   goto error;
+   }
+   gnttab_grant_foreign_access_ref(bedata->ref, dev->otherend_id,
+   virt_to_gfn((void *)sring), 0);
+
+ again:
+   ret = xenbus_transaction_start();
+   if (ret) {
+   xenbus_dev_fatal(dev, ret, "starting transaction");
+   goto error;
+   }
+   ret = xenbus_printf(xbt, dev->nodename, "version", "%u", 1);
+   if (ret)
+   goto error_xenbus;
+   ret = xenbus_printf(xbt, dev->nodename, "ring-ref", "%d", bedata->ref);
+   if (ret)
+   goto error_xenbus;
+   ret = xenbus_printf(xbt, dev->nodename, "port", "%u",
+   evtchn);
+   if (ret)
+   goto error_xenbus;
+   ret = xenbus_transaction_end(xbt, 0);
+   if (ret) {
+   if (ret == -EAGAIN)
+   goto again;
+   xenbus_dev_fatal(dev, ret, "completing transaction");
+   goto error;
+   }
+
+   INIT_LIST_HEAD(>socket_mappings);
+   INIT_LIST_HEAD(>socketpass_mappings);
+   spin_lock_init(>pvcallss_lock);
+   xenbus_switch_state(dev, XenbusStateInitialised);
+
return 0;
+
+ error_xenbus:
+   xenbus_transaction_end(xbt, 1);
+   xenbus_dev_fatal(dev, ret, "writing xenstore");
+ error:
+   pvcalls_front_remove(dev);
+   return ret;
 }
 
 static void pvcalls_front_changed(struct xenbus_device *dev,
enum xenbus_state backend_state)
 {
+   switch (backend_state) {
+   case XenbusStateReconfiguring:
+   case 

[Xen-devel] [linux-3.18 test] 112394: trouble: blocked/broken/fail/pass

2017-07-31 Thread osstest service owner
flight 112394 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112394/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 112102
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112102
 build-arm64   2 hosts-allocate broken REGR. vs. 112102
 build-arm64-pvops 3 capture-logs   broken REGR. vs. 112102

Tests which are failing intermittently (not blocking):
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot fail in 112387 pass in 112394
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail 
pass in 112387

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64-xsm   3 capture-logs  broken blocked in 112102
 build-arm64   3 capture-logs  broken blocked in 112102
 test-amd64-amd64-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail blocked in 
112102
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail in 112387 blocked in 
112102
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail in 112387 blocked in 
112102
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail in 112387 
like 112102
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 112085
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 112085
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 112102
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 112102
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 112102
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 112102
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 112102
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  

Re: [Xen-devel] [PATCH v2 11/13] xen/pvcalls: implement release command

2017-07-31 Thread Stefano Stabellini
On Thu, 27 Jul 2017, Boris Ostrovsky wrote:
> > +int pvcalls_front_release(struct socket *sock)
> > +{
> > +   struct pvcalls_bedata *bedata;
> > +   struct sock_mapping *map;
> > +   int req_id, notify;
> > +   struct xen_pvcalls_request *req;
> > +
> > +   if (!pvcalls_front_dev)
> > +   return -EIO;
> > +   bedata = dev_get_drvdata(_front_dev->dev);
> > +   if (!bedata)
> > +   return -EIO;
> 
> Some (all?) other ops don't check bedata validity. Should they all do?

No, I don't think they should: dev_set_drvdata is called in the probe
function (pvcalls_front_probe). I'll remove it.


> > +
> > +   if (sock->sk == NULL)
> > +   return 0;
> > +
> > +   map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
> > +   if (map == NULL)
> > +   return 0;
> > +
> > +   spin_lock(>pvcallss_lock);
> > +   req_id = bedata->ring.req_prod_pvt & (RING_SIZE(>ring) - 1);
> > +   if (RING_FULL(>ring) ||
> > +   READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) {
> > +   spin_unlock(>pvcallss_lock);
> > +   return -EAGAIN;
> > +   }
> > +   WRITE_ONCE(sock->sk->sk_send_head, NULL);
> > +
> > +   req = RING_GET_REQUEST(>ring, req_id);
> > +   req->req_id = req_id;
> > +   req->cmd = PVCALLS_RELEASE;
> > +   req->u.release.id = (uint64_t)sock;
> > +
> > +   bedata->ring.req_prod_pvt++;
> > +   RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(>ring, notify);
> > +   spin_unlock(>pvcallss_lock);
> > +   if (notify)
> > +   notify_remote_via_irq(bedata->irq);
> > +
> > +   wait_event(bedata->inflight_req,
> > +   READ_ONCE(bedata->rsp[req_id].req_id) == req_id);
> > +
> > +   if (map->active_socket) {
> > +   /* 
> > +* Set in_error and wake up inflight_conn_req to force
> > +* recvmsg waiters to exit.
> > +*/
> > +   map->active.ring->in_error = -EBADF;
> > +   wake_up_interruptible(>active.inflight_conn_req);
> > +
> > +   mutex_lock(>active.in_mutex);
> > +   mutex_lock(>active.out_mutex);
> > +   pvcalls_front_free_map(bedata, map);
> > +   mutex_unlock(>active.out_mutex);
> > +   mutex_unlock(>active.in_mutex);
> > +   kfree(map);
> 
> Since you are locking here I assume you expect that someone else might
> also be trying to lock the map. But you are freeing it immediately after
> unlocking. Wouldn't that mean that whoever is trying to grab the lock
> might then dereference freed memory?

The lock is to make sure there are no recvmsg or sendmsg in progress. We
are sure that no newer sendmsg or recvmsg are waiting for
pvcalls_front_release to release the lock because before send a message
to the backend we set sk_send_head to NULL.


> > +   } else {
> > +   spin_lock(>pvcallss_lock);
> > +   list_del_init(>list);
> > +   kfree(map);
> > +   spin_unlock(>pvcallss_lock);
> > +   }
> > +   WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);
> > +
> > +   return 0;
> > +}
> > +
> >  static const struct xenbus_device_id pvcalls_front_ids[] = {
> > { "pvcalls" },
> > { "" }
> > diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
> > index 25e05b8..3332978 100644
> > --- a/drivers/xen/pvcalls-front.h
> > +++ b/drivers/xen/pvcalls-front.h
> > @@ -23,5 +23,6 @@ int pvcalls_front_recvmsg(struct socket *sock,
> >  unsigned int pvcalls_front_poll(struct file *file,
> > struct socket *sock,
> > poll_table *wait);
> > +int pvcalls_front_release(struct socket *sock);
> >  
> >  #endif

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 09/13] xen/pvcalls: implement recvmsg

2017-07-31 Thread Stefano Stabellini
On Thu, 27 Jul 2017, Boris Ostrovsky wrote:
> On 07/26/2017 08:08 PM, Stefano Stabellini wrote:
> > On Wed, 26 Jul 2017, Boris Ostrovsky wrote:
>  +count++;
>  +else
>  +
>  wait_event_interruptible(map->active.inflight_conn_req,
>  + 
>  pvcalls_front_read_todo(map));
>  +}
> >>> Should we be using PVCALLS_FRONT_MAX_SPIN here? In sendmsg it is
> >>> counting non-sleeping iterations but here we are sleeping so
> >>> PVCALLS_FRONT_MAX_SPIN (5000) may take a while.
> >>>
> >>> In fact, what shouldn't this waiting be a function of MSG_DONTWAIT
> >> err, which it already is. But the question still stands (except for
> >> MSG_DONTWAIT).
> > The code (admittedly unintuitive) is busy-looping (non-sleeping) for
> > 5000 iterations *before* attempting to sleep. So in that regard, recvmsg
> > and sendmsg use PVCALLS_FRONT_MAX_SPIN in the same way: only for
> > non-sleeping iterations.
> >
> 
> OK.
> 
> Why not go directly into wait_event_interruptible()? I see you write in
> the commit message
> 
> If not enough data is available on the ring, rather than returning
> immediately or sleep-waiting, spin for up to 5000 cycles. This small
> optimization turns out to improve performance and latency significantly.
> 
> 
> Is this because of scheduling latency? I think this should be mentioned not 
> just in the commit message but also as a comment in the code.

It tries to mitigate scheduling latencies on both ends (dom0 and domU)
when the ring buffer is the bottleneck (high bandwidth connections). But
to be honest with you, it's mostly beneficial in the sendmsg case,
because for recvmsg we also introduce a busy-wait in regular
circumstances, when no data is actually available. I confirmed this
statement with a quick iperf test. I'll remove the spin from recvmsg and
keep it in sendmsg.


> 
> (I also think it's not "not enough data" but rather "no data"?)

you are right 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 3/6] xen/arm: Allow platform_hvc to handle guest SMC calls

2017-07-31 Thread Edgar E. Iglesias
On Thu, Feb 09, 2017 at 12:32:09PM -0700, Tamas K Lengyel wrote:
> On Thu, Feb 9, 2017 at 11:43 AM, Stefano Stabellini
>  wrote:
> > On Thu, 9 Feb 2017, Tamas K Lengyel wrote:
> >> On Thu, Feb 9, 2017 at 11:22 AM, Stefano Stabellini
> >>  wrote:
> >> > On Thu, 9 Feb 2017, Edgar E. Iglesias wrote:
> >> >> On Thu, Feb 09, 2017 at 10:12:41AM +0100, Edgar E. Iglesias wrote:
> >> >> > On Wed, Feb 08, 2017 at 05:20:44PM -0800, Stefano Stabellini wrote:
> >> >> > > On Thu, 9 Feb 2017, Julien Grall wrote:
> >> >> > > > On 08/02/2017 23:28, Tamas K Lengyel wrote:
> >> >> > > > > On Wed, Feb 8, 2017 at 3:04 PM, Julien Grall 
> >> >> > > > >  wrote:
> >> >> > > > > > Hi Tamas,

..

> >> In principle I have nothing against a command line option, but I don't
> >> really follow how that would help. The monitor system is disabled by
> >> default for all domains, so there is no problem with dom0 booting or
> >> any other domain needing to access the firmware. You specifically have
> >> to enable the monitoring for domains. Why is it a problem to have it
> >> be exclusive for just those domains where it is enabled?
> >
> > I am suggesting this solution because I expect many use-cases for memory
> > introspection that don't actually require any platform_hvc events to be
> > monitored at all. On the other end, I expect that on platforms where
> > platform_hvc is implemented, such as the ZynqMP, those calls are
> > important and should be handled in Xen in most cases.
> >
> > Looking at the code, does monitor.privileged_call_enabled only cover
> > SMC? Is monitor.privileged_call_enabled disabled by default?
> > If so, monitor.privileged_call_enabled could be the tunable I was
> > talking about. As long as enabling memory introspection doesn't
> > automatically forward platform_hvc events to the monitor, I am fine with
> > it.
> 
> Yes, monitor.privileged_call_enabled only covers SMCs right now and it
> is disabled by default. It has to be enabled specifically for a
> domain.  Memory introspection is separate from this, that is handled
> by the mem_access system and it can be enabled separately from SMC
> monitoring.
> 
> As for hypercalls that get handled by Xen, I don't really need to
> monitor those. If Xen would on the other hand go and call some
> firmware as a result of the hypercall, I would need to be able to deny
> that. So as long as XSM can be used to control HVC calls, that works
> for me just fine too.

Hi again!

This was quite a while ago but I think we kind of ended up with
monitor.privileged_call_enabled being a possible flag to conditionalize
the forwarding of firmware calls or not.

There are at least 3 cases to consider at the moment:
1. Firmware calls over SMC (PSCI or other platform calls like EEMI)
2. Firmware calls over HVC Handled by Xen (PSCI and XEN Hypercalls)
3. Firmware calls over HVC Handled by platform specific code (e.g EEMI)


For #1 Firmware calls over SMC:
I've conditionalized all of it on monitor.privileged_call_enabled.
It's either the monitor or the firmware call handling, they
are mutually exclusive. Guests can still do PSCI over HVC.

For #2, things work like today. This is PSCI and the Xen Hypercallsi over HVC.

For #3, only platform code knows if the specific call will be handled
in Xen completely or if it will result in some kind of SMC to lower layers.
If monitor.privileged_call_enabled is on, I've made the ZynqMP
implementation gracefully NACK any call that would result in an SMC
issued by Xen.

Are there any concerns around this?

I'll also send out code for review, it may be easier to follow :-)

Best regards,
Edgar




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 05/13] xen/pvcalls: implement bind command

2017-07-31 Thread Stefano Stabellini
On Thu, 27 Jul 2017, Boris Ostrovsky wrote:
> >> This all looks very similar to previous patches. Can it be factored out?
> > You are right that the pattern is the same for all commands:
> > - get a request
> > - fill the request
> > - possibly do something else
> > - wait
> > however each request is different, the struct and fields are different.
> > There are spin_lock and spin_unlock calls intermingled. I am not sure I
> > can factor out much of this. Maybe I could create a static inline or
> > macro as a syntactic sugar to replace the wait call, but that's pretty
> > much it I think.
> 
> Maybe you could factor out common fragments, not necessarily the whole
> thing at once?
> 
> For example,
> 
> static inline int get_request(*bedata, int *req_id)
> {
> 
>   *req_id = bedata->ring.req_prod_pvt & (RING_SIZE(>ring) - 1);
>   if (RING_FULL(>ring) ||
>   READ_ONCE(bedata->rsp[*req_id].req_id) != PVCALLS_INVALID_ID) {
>   return -EAGAIN;
>   return 0;
> }
> 
> (or some such)

You are right, the code looks better this way. I'll add it.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 0/1] xen/arm: zynqmp: Disable PCIe

2017-07-31 Thread Julien Grall



On 31/07/2017 20:37, Edgar E. Iglesias wrote:

From: "Edgar E. Iglesias" 

Hi,


Hi Edgar,



We're seeing panics in dom0 with PCIe enabled due to what seems
to be wrongly created mappings by Xen. With older kernels we
didn't see the panics but PCIe wasn't functional in dom0.

This disables the PCIe nodes on the ZynqMP until Xen/ARM gets
more PCIe support.


I feel a bit sad to ack a patch disabling PCIe in the ZynqMP.

Before doing that. Can you describe what is the exact problem with Xen? 
It might be possible that we don't parse correctly the device-tree.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/5] xen: RCU: avoid busy waiting until the end of grace period.

2017-07-31 Thread Dario Faggioli
On Mon, 2017-07-31 at 14:20 -0700, Stefano Stabellini wrote:
> On Thu, 27 Jul 2017, Dario Faggioli wrote:
> > 
> > diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
> > index f0fdc87..4586f2a 100644
> > --- a/xen/common/rcupdate.c
> > +++ b/xen/common/rcupdate.c
> > @@ -84,8 +84,14 @@ struct rcu_data {
> >  int cpu;
> >  struct rcu_head barrier;
> >  longlast_rs_qlen; /* qlen during the last
> > resched */
> > +
> > +/* 3) idle CPUs handling */
> > +struct timer idle_timer;
> > +bool idle_timer_active;
> >  };
> >  
> > +#define RCU_IDLE_TIMER_PERIOD MILLISECS(10)
> 
> Isn't this a bit too short? How is it chosen?
> 
It's totally arbitrary (and that would be the case for whatever value
we choose).

Basically, it's how long, at worst, after the actual end of a grace
period, a (batch of) callback(s) will be invoked. Currently, on Credit1
on ARM (without my patch, from this series, that suspends the tick)
that's (by chance) 30 ms (or whatever value is chosen for Credit1
timeslice). On Credit2 (on both ARM and x86), it's never, but on x86 it
(apparently) is 'however frequent time sync rendezvouses happs' (which
I don't recall, but it's longer), while on ARM is (potentially) never.

I accept suggestions about alternatives values, and I'm certainly fine
with adding a comment, containing something along the lines of the
explanation above, but I fear it's going to be hard to figure out what
value is actually the "absolute best".

In Linux (which is where the same 'callback book-keeping' happens for
them), a tick with a frequency of 1000Hz (== 1ms) is considered 'low-
latency/Deskop/real-time'. For us, as said above, tick --when it's
there-- would be 30ms by default.

I just went with something in the middle.

Also, it's not that we'll have a 10ms periodic timer going on for
significant amount of time. In fact we expect it to actually fire just
once (for each grace period). It's not 100% guaranteed that it won't be
reprogrammed and fire a couple of times, but it should not, in the vast
majority of cases.

What makes you think it's short?

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK)

signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 02/13] xen/pvcalls: connect to the backend

2017-07-31 Thread Stefano Stabellini
On Thu, 27 Jul 2017, Boris Ostrovsky wrote:
> >>>   static int pvcalls_front_probe(struct xenbus_device *dev,
> >>> const struct xenbus_device_id *id)
> >>>   {
> >>> + int ret = -EFAULT, evtchn, ref = -1, i;
> >>> + unsigned int max_page_order, function_calls, len;
> >>> + char *versions;
> >>> + grant_ref_t gref_head = 0;
> >>> + struct xenbus_transaction xbt;
> >>> + struct pvcalls_bedata *bedata = NULL;
> >>> + struct xen_pvcalls_sring *sring;
> >>> +
> >>> + if (pvcalls_front_dev != NULL) {
> >>> + dev_err(>dev, "only one PV Calls connection
> >>> supported\n");
> >>> + return -EINVAL;
> >>> + }
> >>> +
> >>> + versions = xenbus_read(XBT_NIL, dev->otherend, "versions", );
> >>> + if (!len)
> >>> + return -EINVAL;
> >>> + if (strcmp(versions, "1")) {
> >>> + kfree(versions);
> >>> + return -EINVAL;
> >>> + }
> >>> + kfree(versions);
> >>> + ret = xenbus_scanf(XBT_NIL, dev->otherend,
> >>> +"max-page-order", "%u", _page_order);
> >>> + if (ret <= 0)
> >>> + return -ENODEV;
> >>> + if (max_page_order < RING_ORDER)
> >>> + return -ENODEV;
> >>> + ret = xenbus_scanf(XBT_NIL, dev->otherend,
> >>> +"function-calls", "%u", _calls);
> >>> + if (ret <= 0 || function_calls != 1)
> >>> + return -ENODEV;
> >>> + pr_info("%s max-page-order is %u\n", __func__, max_page_order);
> >>> +
> >>> + bedata = kzalloc(sizeof(struct pvcalls_bedata), GFP_KERNEL);
> >>> + if (!bedata)
> >>> + return -ENOMEM;
> >>> +
> >>> + init_waitqueue_head(>inflight_req);
> >>> + for (i = 0; i < PVCALLS_NR_REQ_PER_RING; i++)
> >>> + bedata->rsp[i].req_id = PVCALLS_INVALID_ID;
> >>> +
> >>> + sring = (struct xen_pvcalls_sring *) __get_free_page(GFP_KERNEL |
> >>> +  __GFP_ZERO);
> >>> + if (!sring)
> >>> + goto error;
> >>> + SHARED_RING_INIT(sring);
> >>> + FRONT_RING_INIT(>ring, sring, XEN_PAGE_SIZE);
> >>> +
> >>> + ret = xenbus_alloc_evtchn(dev, );
> >>> + if (ret)
> >>> + goto error;
> >>> +
> >>> + bedata->irq = bind_evtchn_to_irqhandler(evtchn,
> >>> + pvcalls_front_event_handler,
> >>> + 0, "pvcalls-frontend", dev);
> >>> + if (bedata->irq < 0) {
> >>> + ret = bedata->irq;
> >>> + goto error;
> >>> + }
> >>> +
> >>> + ret = gnttab_alloc_grant_references(1, _head);
> >>> + if (ret < 0)
> >>> + goto error;
> >>> + bedata->ref = ref = gnttab_claim_grant_reference(_head);
> >> Is ref really needed?
> > No, I'll remove it
> >
> >
> >>> + if (ref < 0)
> >>> + goto error;
> >>> + gnttab_grant_foreign_access_ref(ref, dev->otherend_id,
> >>> + virt_to_gfn((void *)sring), 0);
> >>> +
> >>> + again:
> >>> + ret = xenbus_transaction_start();
> >>> + if (ret) {
> >>> + xenbus_dev_fatal(dev, ret, "starting transaction");
> >>> + goto error;
> >>> + }
> >>> + ret = xenbus_printf(xbt, dev->nodename, "version", "%u", 1);
> >>> + if (ret)
> >>> + goto error_xenbus;
> >>> + ret = xenbus_printf(xbt, dev->nodename, "ring-ref", "%d", ref);
> >>> + if (ret)
> >>> + goto error_xenbus;
> >>> + ret = xenbus_printf(xbt, dev->nodename, "port", "%u",
> >>> + evtchn);
> >>> + if (ret)
> >>> + goto error_xenbus;
> >>> + ret = xenbus_transaction_end(xbt, 0);
> >>> + if (ret) {
> >>> + if (ret == -EAGAIN)
> >>> + goto again;
> >>> + xenbus_dev_fatal(dev, ret, "completing transaction");
> >>> + goto error;
> >>> + }
> >>> +
> >>> + INIT_LIST_HEAD(>socket_mappings);
> >>> + INIT_LIST_HEAD(>socketpass_mappings);
> >>> + spin_lock_init(>pvcallss_lock);
> >>> + dev_set_drvdata(>dev, bedata);
> >>> + pvcalls_front_dev = dev;
> >>> + xenbus_switch_state(dev, XenbusStateInitialised);
> >>> +
> >>>   return 0;
> >>> +
> >>> + error_xenbus:
> >>> + xenbus_transaction_end(xbt, 1);
> >>> + xenbus_dev_fatal(dev, ret, "writing xenstore");
> >>> + error:
> >>> + pvcalls_front_remove(dev);
> >> I think patch 12 (where you implement cleanup) could be moved before this 
> >> one.
> > I'll move the patch
> >
> >
> >> I also think you are leaking bedata on error paths.
> > bedata is freed by pvcalls_front_remove (kfree(bedata)), why do you say
> > so?
> 
> bedata there is read from dev_get_drvdata() and here you assign drvdata
> at the very end.
> 
> Come think of it, pvcalls_front_remove() should probably first check
> whether bedata is valid. Or drvdata should be assigned right away in
> this routine, before any 'got error/error_xenbus'.

Yes, I'll do that

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-31 Thread Edgar E. Iglesias
On Mon, Jul 31, 2017 at 02:30:47PM -0700, Stefano Stabellini wrote:
> On Mon, 31 Jul 2017, Edgar E. Iglesias wrote:
> > On Fri, Jul 28, 2017 at 09:03:15PM +0800, Zhongze Liu wrote:
> > > 
> > > 1. Motivation and Description
> > 
> > Hi,
> > 
> > I think this looks quite useful. I have a few comments inline.
> 
> Hi Edgar, thanks for giving it a look!
> 
> 
> > > 
> > > Virtual machines use grant table hypercalls to setup a share page for
> > > inter-VMs communications. These hypercalls are used by all PV
> > > protocols today. However, very simple guests, such as baremetal
> > > applications, might not have the infrastructure to handle the grant table.
> > > This project is about setting up several shared memory areas for inter-VMs
> > > communications directly from the VM config file.
> > > So that the guest kernel doesn't have to have grant table support (in the
> > > embedded space, this is not unusual) to be able to communicate with
> > > other guests.
> > > 
> > > 
> > > 2. Implementation Plan:
> > > 
> > > 
> > > ==
> > > 2.1 Introduce a new VM config option in xl:
> > > ==
> > > 
> > > 2.1.1 Design Goals
> > > ~~~
> > > 
> > > The shared areas should be shareable among several (>=2) VMs, so every 
> > > shared
> > > physical memory area is assigned to a set of VMs. Therefore, a “token” or
> > > “identifier” should be used here to uniquely identify a backing memory 
> > > area.
> > > A string no longer than 128 bytes is used here to serve the purpose.
> > > 
> > > The backing area would be taken from one domain, which we will regard
> > > as the "master domain", and this domain should be created prior to any
> > > other "slave domain"s. Again, we have to use some kind of tag to tell who
> > > is the "master domain".
> > > 
> > > And the ability to specify the permissions and cacheability (and 
> > > shareability
> > > for ARM guest's) of the pages to be shared should also be given to the 
> > > user.
> > > 
> > > 2.2.2 Syntax and Behavior
> > > ~
> > > The following example illustrates the syntax of the proposed config entry
> > > (suppose that we're on x86):
> > > 
> > > In xl config file of vm1:
> > >   static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master,
> > >   cache_policy=x86_normal, prot=r2',
> > > 
> > >  'id=ID2, begin=0x30, end=0x40, role=master' ]
> > > 
> > > In xl config file of vm2:
> > >   static_shm = [ 'id=ID1, offset = 0, begin=0x50, end=0x60,
> > > role=slave, prot=ro' ]
> > > 
> > > In xl config file of vm3:
> > >   static_shm = [ 'id=ID2, offset = 1, begin=0x69,
> > > end=0x80, role=slave, prot=ro' ]
> > > 
> > > where:
> > >   @id   The identifier of the backing memory area.
> > > Can be any string that matches the regexp "[^ 
> > > \t\n,]+"
> > > and no longer than 128 characters
> > > 
> > >   @offset   Can only appear when @role = slave. The sharing 
> > > will
> > > start from the beginning of backing memory area 
> > > plus
> > > this offset. If not set, it defaults to zero.
> > > Can be decimals or hexadecimals of the form 
> > > "0x2",
> > > and should be the multiple of the hypervisor page
> > > granularity (currently 4K on both ARM and x86).
> > > 
> > >   @begin/endThe boundaries of the shared memory area. The 
> > > format
> > > requirements are the same with @offset.
> > 
> > I'm assuming this is all specified in GFN and also not MFN contigous?
> > Would it be possible to allow the specification of MFN mappings
> > that are contigous?
> 
> That could be done with the iomem= parameter?

The missing part from the iomem paramater is the attributes, cacheability, 
shared etc.
But we could perhaps add that to iomem somehow.


> 
> 
> > This would be useful to map specific kinds of memory (e.g On Chip RAMs).
> > 
> > Other use-cases are when there are not only guests sharing
> > the pages but also devices. In some cases these devs may be locked in
> > with low-latency access to specific memory regions.
> > 
> > Perhaps something like the following?
> > addr=gfn@
> > size=0x1000
> > 
> > with mfn being optional?
> 
> I can see that it might be useful, but we are trying to keep the scope
> small to be able to complete the project within the limited timeframe
> allowed by GSoC. We only have one month left! We risk not getting the
> feature completed.  Once this set of features is done and committed, we
> can expand on it.
> 
> For the sake of this document, we should make clear that addresses 

Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-31 Thread Stefano Stabellini
On Mon, 31 Jul 2017, Edgar E. Iglesias wrote:
> On Fri, Jul 28, 2017 at 09:03:15PM +0800, Zhongze Liu wrote:
> > 
> > 1. Motivation and Description
> 
> Hi,
> 
> I think this looks quite useful. I have a few comments inline.

Hi Edgar, thanks for giving it a look!


> > 
> > Virtual machines use grant table hypercalls to setup a share page for
> > inter-VMs communications. These hypercalls are used by all PV
> > protocols today. However, very simple guests, such as baremetal
> > applications, might not have the infrastructure to handle the grant table.
> > This project is about setting up several shared memory areas for inter-VMs
> > communications directly from the VM config file.
> > So that the guest kernel doesn't have to have grant table support (in the
> > embedded space, this is not unusual) to be able to communicate with
> > other guests.
> > 
> > 
> > 2. Implementation Plan:
> > 
> > 
> > ==
> > 2.1 Introduce a new VM config option in xl:
> > ==
> > 
> > 2.1.1 Design Goals
> > ~~~
> > 
> > The shared areas should be shareable among several (>=2) VMs, so every 
> > shared
> > physical memory area is assigned to a set of VMs. Therefore, a “token” or
> > “identifier” should be used here to uniquely identify a backing memory area.
> > A string no longer than 128 bytes is used here to serve the purpose.
> > 
> > The backing area would be taken from one domain, which we will regard
> > as the "master domain", and this domain should be created prior to any
> > other "slave domain"s. Again, we have to use some kind of tag to tell who
> > is the "master domain".
> > 
> > And the ability to specify the permissions and cacheability (and 
> > shareability
> > for ARM guest's) of the pages to be shared should also be given to the user.
> > 
> > 2.2.2 Syntax and Behavior
> > ~
> > The following example illustrates the syntax of the proposed config entry
> > (suppose that we're on x86):
> > 
> > In xl config file of vm1:
> >   static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master,
> >   cache_policy=x86_normal, prot=r2',
> > 
> >  'id=ID2, begin=0x30, end=0x40, role=master' ]
> > 
> > In xl config file of vm2:
> >   static_shm = [ 'id=ID1, offset = 0, begin=0x50, end=0x60,
> > role=slave, prot=ro' ]
> > 
> > In xl config file of vm3:
> >   static_shm = [ 'id=ID2, offset = 1, begin=0x69,
> > end=0x80, role=slave, prot=ro' ]
> > 
> > where:
> >   @id   The identifier of the backing memory area.
> > Can be any string that matches the regexp "[^ 
> > \t\n,]+"
> > and no longer than 128 characters
> > 
> >   @offset   Can only appear when @role = slave. The sharing will
> > start from the beginning of backing memory area plus
> > this offset. If not set, it defaults to zero.
> > Can be decimals or hexadecimals of the form 
> > "0x2",
> > and should be the multiple of the hypervisor page
> > granularity (currently 4K on both ARM and x86).
> > 
> >   @begin/endThe boundaries of the shared memory area. The format
> > requirements are the same with @offset.
> 
> I'm assuming this is all specified in GFN and also not MFN contigous?
> Would it be possible to allow the specification of MFN mappings
> that are contigous?

That could be done with the iomem= parameter?


> This would be useful to map specific kinds of memory (e.g On Chip RAMs).
> 
> Other use-cases are when there are not only guests sharing
> the pages but also devices. In some cases these devs may be locked in
> with low-latency access to specific memory regions.
> 
> Perhaps something like the following?
> addr=gfn@
> size=0x1000
> 
> with mfn being optional?

I can see that it might be useful, but we are trying to keep the scope
small to be able to complete the project within the limited timeframe
allowed by GSoC. We only have one month left! We risk not getting the
feature completed.  Once this set of features is done and committed, we
can expand on it.

For the sake of this document, we should make clear that addresses are in
the gfn space and memory is allocated to the master domain.


> >   @role Can only be 'master' or 'slave', it defaults to 
> > 'slave'.
> > 
> >   @prot When @role = master, this means the largest set of
> > stage-2 permission flags that can be granted to the
> > slave domains.
> > When @role = slave, this means the stage-2 permission
> > flags of 

Re: [Xen-devel] [RFC v2 2/6] xen/arm: Introduce platform_hvc

2017-07-31 Thread Edgar E. Iglesias
On Mon, Feb 13, 2017 at 02:08:43PM -0800, Stefano Stabellini wrote:
> On Tue, 7 Feb 2017, Edgar E. Iglesias wrote:
> > From: "Edgar E. Iglesias" 
> > 
> > Introduce platform_hvc as a way to handle hypercalls that
> > Xen does not know about in a platform specific way. This
> > is particularly useful for implementing the SiP (SoC
> > implementation specific) service calls.
> > 
> > Signed-off-by: Edgar E. Iglesias 
> > ---
> >  xen/arch/arm/platform.c| 8 
> >  xen/arch/arm/traps.c   | 3 +++
> >  xen/include/asm-arm/platform.h | 5 +
> >  3 files changed, 16 insertions(+)
> > 
> > diff --git a/xen/arch/arm/platform.c b/xen/arch/arm/platform.c
> > index 0af6d57..90ea6b8 100644
> > --- a/xen/arch/arm/platform.c
> > +++ b/xen/arch/arm/platform.c
> > @@ -127,6 +127,14 @@ void platform_poweroff(void)
> >  platform->poweroff();
> >  }
> >  
> > +bool platform_hvc(struct cpu_user_regs *regs)
> 
> This is fine, but we need a different name for it, as it can be used to
> handle both HVC and SMC calls. Maybe "firmware_call"?

Hi,

Sorry for the super long delay.. I'm looking at this again now.

Yes, you're right. I went with
platform_firmware_call() and platform->firmware_call().

I kept the platform_ prefix in the wrapper function to be
consistent with the other hooks.

Thanks,
Edgar


> 
> 
> > +{
> > +if ( platform && platform->hvc )
> > +return platform->hvc(regs);
> > +
> > +return false;
> > +}
> > +
> >  bool_t platform_has_quirk(uint32_t quirk)
> >  {
> >  uint32_t quirks = 0;
> > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> > index c5a4d41..33950d9 100644
> > --- a/xen/arch/arm/traps.c
> > +++ b/xen/arch/arm/traps.c
> > @@ -44,6 +44,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include "decode.h"
> >  #include "vtimer.h"
> > @@ -1430,6 +1431,8 @@ static void do_trap_psci(struct cpu_user_regs *regs)
> >  }
> >  break;
> >  default:
> > +if ( platform_hvc(regs) )
> > +return;
> >  domain_crash_synchronous();
> >  return;
> >  }
> > diff --git a/xen/include/asm-arm/platform.h b/xen/include/asm-arm/platform.h
> > index 08010ba..4d51f0a 100644
> > --- a/xen/include/asm-arm/platform.h
> > +++ b/xen/include/asm-arm/platform.h
> > @@ -26,6 +26,10 @@ struct platform_desc {
> >  void (*reset)(void);
> >  /* Platform power-off */
> >  void (*poweroff)(void);
> > +/* Platform specific HVC handler.
> > + * Returns true if the call was handled and false if not.
> > + */
> > +bool (*hvc)(struct cpu_user_regs *regs);
> >  /*
> >   * Platform quirks
> >   * Defined has a function because a platform can support multiple
> > @@ -55,6 +59,7 @@ int platform_cpu_up(int cpu);
> >  #endif
> >  void platform_reset(void);
> >  void platform_poweroff(void);
> > +bool platform_hvc(struct cpu_user_regs *regs);
> >  bool_t platform_has_quirk(uint32_t quirk);
> >  bool_t platform_device_is_blacklisted(const struct dt_device_node *node);

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/5] xen: RCU: avoid busy waiting until the end of grace period.

2017-07-31 Thread Stefano Stabellini
On Thu, 27 Jul 2017, Dario Faggioli wrote:
> Instead of having the CPU where a callback is queued, busy
> looping on rcu_pending(), use a timer.
> 
> In fact, we let the CPU go idla,e but we program a timer
  ^ idle,


> that will periodically wake it up, for checking whether the
> grace period has actually ended.
> 
> It is kind of similar to introducing a periodic tick, but
> with a much more limited scope, and a lot less overhead. In
> fact, this timer is:
> - only active for the CPU(s) that have callbacks queued,
>   waiting for the end of a grace period;
> - only active when those CPU(s) are idle (and stopped as
>   soon as they resume execution).
> 
> Signed-off-by: Dario Faggioli 
> ---
> Cc: Stefano Stabellini 
> Cc: Julien Grall 
> Cc: Jan Beulich 
> Cc: Andrew Cooper 
> ---
>  xen/arch/arm/domain.c |4 ++-
>  xen/arch/x86/acpi/cpu_idle.c  |6 +++--
>  xen/arch/x86/cpu/mwait-idle.c |6 +++--
>  xen/common/rcupdate.c |   52 
> -
>  xen/include/xen/rcupdate.h|3 ++
>  5 files changed, 65 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 666b7ef..01da96e 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -43,8 +43,9 @@ static void do_idle(void)
>  {
>  unsigned int cpu = smp_processor_id();
>  
> +rcu_idle_timer_start();
>  sched_tick_suspend();
> -/* sched_tick_suspend() can raise TIMER_SOFTIRQ. Process it now. */
> +/* Timer related operations can raise TIMER_SOFTIRQ. Process it now. */
>  process_pending_softirqs();
>  
>  local_irq_disable();
> @@ -58,6 +59,7 @@ static void do_idle(void)
>  local_irq_enable();
>  
>  sched_tick_resume();
> +rcu_idle_timer_stop();
>  }
>  
>  void idle_loop(void)
> diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c
> index 04c52e8..b97986f 100644
> --- a/xen/arch/x86/acpi/cpu_idle.c
> +++ b/xen/arch/x86/acpi/cpu_idle.c
> @@ -576,10 +576,10 @@ static void acpi_processor_idle(void)
>  return;
>  }
>  
> +rcu_idle_timer_start();
>  cpufreq_dbs_timer_suspend();
> -
>  sched_tick_suspend();
> -/* sched_tick_suspend() can raise TIMER_SOFTIRQ. Process it now. */
> +/* Timer related operations can raise TIMER_SOFTIRQ. Process it now. */
>  process_pending_softirqs();
>  
>  /*
> @@ -593,6 +593,7 @@ static void acpi_processor_idle(void)
>  local_irq_enable();
>  sched_tick_resume();
>  cpufreq_dbs_timer_resume();
> +rcu_idle_timer_stop();
>  return;
>  }
>  
> @@ -726,6 +727,7 @@ static void acpi_processor_idle(void)
>  
>  sched_tick_resume();
>  cpufreq_dbs_timer_resume();
> +rcu_idle_timer_stop();
>  
>  if ( cpuidle_current_governor->reflect )
>  cpuidle_current_governor->reflect(power);
> diff --git a/xen/arch/x86/cpu/mwait-idle.c b/xen/arch/x86/cpu/mwait-idle.c
> index ae9e92b..c426e41 100644
> --- a/xen/arch/x86/cpu/mwait-idle.c
> +++ b/xen/arch/x86/cpu/mwait-idle.c
> @@ -743,10 +743,10 @@ static void mwait_idle(void)
>   return;
>   }
>  
> + rcu_idle_timer_start();
>   cpufreq_dbs_timer_suspend();
> -
>   sched_tick_suspend();
> - /* sched_tick_suspend() can raise TIMER_SOFTIRQ. Process it now. */
> + /* Timer related operations can raise TIMER_SOFTIRQ. Process it now. */
>   process_pending_softirqs();
>  
>   /* Interrupts must be disabled for C2 and higher transitions. */
> @@ -756,6 +756,7 @@ static void mwait_idle(void)
>   local_irq_enable();
>   sched_tick_resume();
>   cpufreq_dbs_timer_resume();
> +rcu_idle_timer_stop();
>   return;
>   }
>  
> @@ -802,6 +803,7 @@ static void mwait_idle(void)
>  
>   sched_tick_resume();
>   cpufreq_dbs_timer_resume();
> + rcu_idle_timer_stop();
>  
>   if ( cpuidle_current_governor->reflect )
>   cpuidle_current_governor->reflect(power);
> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
> index f0fdc87..4586f2a 100644
> --- a/xen/common/rcupdate.c
> +++ b/xen/common/rcupdate.c
> @@ -84,8 +84,14 @@ struct rcu_data {
>  int cpu;
>  struct rcu_head barrier;
>  longlast_rs_qlen; /* qlen during the last resched */
> +
> +/* 3) idle CPUs handling */
> +struct timer idle_timer;
> +bool idle_timer_active;
>  };
>  
> +#define RCU_IDLE_TIMER_PERIOD MILLISECS(10)

Isn't this a bit too short? How is it chosen?


>  static DEFINE_PER_CPU(struct rcu_data, rcu_data);
>  
>  static int blimit = 10;
> @@ -402,7 +408,48 @@ int rcu_needs_cpu(int cpu)
>  {
>  struct rcu_data *rdp = _cpu(rcu_data, cpu);
>  
> -return (!!rdp->curlist || rcu_pending(cpu));
> +return (!!rdp->curlist || 

Re: [Xen-devel] [PATCH 3/5] xen: RCU/x86/ARM: discount CPUs that were idle when grace period started.

2017-07-31 Thread Stefano Stabellini
On Thu, 27 Jul 2017, Dario Faggioli wrote:
> Xen is a tickless (micro-)kernel. This means that, when a CPU
> becomes idle, we stop all the activity on it, including any
> periodic tick or timer.
> 
> When we imported RCU from Linux, Linux (x86) was a ticking
> kernel, i.e., there was a periodic timer tick always running,
> even on totally idle CPUs. This was bad from a power efficiency
> perspective, but it's what maked it possible to monitor the
> quiescent states of all the CPUs, and hence tell when an RCU
> grace period ends.
> 
> In Xen, that is impossible, and that's particularly problematic
> when system is idle (or lightly loaded) systems, as CPUs that
> are idle may never have the chance to tell RCU about their
> quiescence, and grace periods could extend indefinitely!
> 
> This has led, on x86, to long (an unpredictable) delays between
> RCU callbacks queueing and invokation. On ARM, we actually see
> infinite grace periods (e.g., complate_domain_destroy() may
> never be actually invoked on an idle system). See here:
> 
>  https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg02454.html
> 
> The first step for fixing this situation is for RCU to record,
> at the beginning of a grace period, which CPUs are already idle.
> In fact, being idle, they can't be in the middle of any read-side
> critical section, and we don't have to wait for them to declare
> a grace period finished.
> 
> This is tracked in a cpumask, in a way that is very similar to
> how Linux also was achieving the same on s390 --which indeed was
> tickless already, even back then-- and to what it started to do
> for x86, from 2.6.21 on (see commit 79bf2bb3 "tick-management:
> dyntick / highres functionality").
> 
> While there, also adopt the memory barrier introduced by Linux
> commit commit c3f59023 ("Fix RCU race in access of nohz_cpu_mask").
> 
> Signed-off-by: Dario Faggioli 
> ---
> Cc: Stefano Stabellini 
> Cc: Julien Grall 
> Cc: Jan Beulich 
> Cc: Andrew Cooper 
> ---
>  xen/arch/arm/domain.c |2 ++
>  xen/arch/x86/acpi/cpu_idle.c  |   25 +
>  xen/arch/x86/cpu/mwait-idle.c |9 -
>  xen/arch/x86/domain.c |8 +++-
>  xen/common/rcupdate.c |   28 ++--
>  xen/include/xen/rcupdate.h|3 +++
>  6 files changed, 63 insertions(+), 12 deletions(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index fce29cb..666b7ef 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -50,8 +50,10 @@ static void do_idle(void)
>  local_irq_disable();
>  if ( cpu_is_haltable(cpu) )
>  {
> +rcu_idle_enter(cpu);
>  dsb(sy);
>  wfi();
> +rcu_idle_exit(cpu);
>  }
>  local_irq_enable();
>  
> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
> index 8cc5a82..f0fdc87 100644
> --- a/xen/common/rcupdate.c
> +++ b/xen/common/rcupdate.c
> @@ -52,7 +52,8 @@ static struct rcu_ctrlblk {
>  int  next_pending;  /* Is the next batch already waiting? */
>  
>  spinlock_t  lock __cacheline_aligned;
> -cpumask_t   cpumask; /* CPUs that need to switch in order*/
> +cpumask_t   cpumask; /* CPUs that need to switch in order ... */
> +cpumask_t   idle_cpumask; /* ... unless they are already idle */
>  /* for current batch to proceed.*/
>  } __cacheline_aligned rcu_ctrlblk = {
>  .cur = -300,
> @@ -248,7 +249,14 @@ static void rcu_start_batch(struct rcu_ctrlblk *rcp)
>  smp_wmb();
>  rcp->cur++;
>  
> -cpumask_copy(>cpumask, _online_map);
> +   /*
> +* Accessing idle_cpumask before incrementing rcp->cur needs a
> +* Barrier  Otherwise it can cause tickless idle CPUs to be
^ otherwise


> +* included in rcp->cpumask, which will extend graceperiods
> +* unnecessarily.
> +*/

It doesn't look like this comment applies to this code: we are accessing
idle_cpumask after rcp->cur here. Unless you meant "Accessing
idle_cpumask *after* incrementing rcp->cur."

Also could you please add a pointer to the other barrier in the pair
(barriers always go in pair, for example I think the smp_wmb() above in
rcu_start_batch is matched by the smp_rmb() in __rcu_process_callbacks.)


> +smp_mb();
> +cpumask_andnot(>cpumask, _online_map, >idle_cpumask);
>  }
>  }
>  
> @@ -474,7 +482,23 @@ static struct notifier_block cpu_nfb = {
>  void __init rcu_init(void)
>  {
>  void *cpu = (void *)(long)smp_processor_id();
> +
> +cpumask_setall(_ctrlblk.idle_cpumask);
>  cpu_callback(_nfb, CPU_UP_PREPARE, cpu);
>  register_cpu_notifier(_nfb);
>  open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
>  }
> +
> +/*
> + * The CPU is becoming idle, so no more read side critical
> + * sections, and one more step toward 

Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-31 Thread Edgar E. Iglesias
On Fri, Jul 28, 2017 at 09:03:15PM +0800, Zhongze Liu wrote:
> 
> 1. Motivation and Description

Hi,

I think this looks quite useful. I have a few comments inline.


> 
> Virtual machines use grant table hypercalls to setup a share page for
> inter-VMs communications. These hypercalls are used by all PV
> protocols today. However, very simple guests, such as baremetal
> applications, might not have the infrastructure to handle the grant table.
> This project is about setting up several shared memory areas for inter-VMs
> communications directly from the VM config file.
> So that the guest kernel doesn't have to have grant table support (in the
> embedded space, this is not unusual) to be able to communicate with
> other guests.
> 
> 
> 2. Implementation Plan:
> 
> 
> ==
> 2.1 Introduce a new VM config option in xl:
> ==
> 
> 2.1.1 Design Goals
> ~~~
> 
> The shared areas should be shareable among several (>=2) VMs, so every shared
> physical memory area is assigned to a set of VMs. Therefore, a “token” or
> “identifier” should be used here to uniquely identify a backing memory area.
> A string no longer than 128 bytes is used here to serve the purpose.
> 
> The backing area would be taken from one domain, which we will regard
> as the "master domain", and this domain should be created prior to any
> other "slave domain"s. Again, we have to use some kind of tag to tell who
> is the "master domain".
> 
> And the ability to specify the permissions and cacheability (and shareability
> for ARM guest's) of the pages to be shared should also be given to the user.
> 
> 2.2.2 Syntax and Behavior
> ~
> The following example illustrates the syntax of the proposed config entry
> (suppose that we're on x86):
> 
> In xl config file of vm1:
>   static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master,
>   cache_policy=x86_normal, prot=r2',
> 
>  'id=ID2, begin=0x30, end=0x40, role=master' ]
> 
> In xl config file of vm2:
>   static_shm = [ 'id=ID1, offset = 0, begin=0x50, end=0x60,
> role=slave, prot=ro' ]
> 
> In xl config file of vm3:
>   static_shm = [ 'id=ID2, offset = 1, begin=0x69,
> end=0x80, role=slave, prot=ro' ]
> 
> where:
>   @id   The identifier of the backing memory area.
> Can be any string that matches the regexp "[^ \t\n,]+"
> and no longer than 128 characters
> 
>   @offset   Can only appear when @role = slave. The sharing will
> start from the beginning of backing memory area plus
> this offset. If not set, it defaults to zero.
> Can be decimals or hexadecimals of the form "0x2",
> and should be the multiple of the hypervisor page
> granularity (currently 4K on both ARM and x86).
> 
>   @begin/endThe boundaries of the shared memory area. The format
> requirements are the same with @offset.

I'm assuming this is all specified in GFN and also not MFN contigous?
Would it be possible to allow the specification of MFN mappings
that are contigous?

This would be useful to map specific kinds of memory (e.g On Chip RAMs).

Other use-cases are when there are not only guests sharing
the pages but also devices. In some cases these devs may be locked in
with low-latency access to specific memory regions.

Perhaps something like the following?
addr=gfn@
size=0x1000

with mfn being optional?


>   @role Can only be 'master' or 'slave', it defaults to 
> 'slave'.
> 
>   @prot When @role = master, this means the largest set of
> stage-2 permission flags that can be granted to the
> slave domains.
> When @role = slave, this means the stage-2 permission
> flags of the shared memory area.
> Currently only 'rw' is supported. If not set. it
> defaults to 'rw'.
> 
>   @cache_policy The stage-2 cacheability/shareability attributes of 
> the
> shared memory area. Currently, only two policies are
> supported:
>   * ARM_normal: Only applicable to ARM guests. This
> would mean Inner and Outer Write-Back
> Cacheable, and Inner Shareable.


Is there a reason not to set this to Outer Shareable?
Again, mainly useful when these pages get shared with devs as well.

The guest can always lower it to Inner Shareable via S1 tables if needed.

Best 

Re: [Xen-devel] [PATCH 2/5] xen: ARM: suspend the tick (if in use) when going idle.

2017-07-31 Thread Stefano Stabellini
On Thu, 27 Jul 2017, Dario Faggioli wrote:
> Since commit 964fae8ac ("cpuidle: suspend/resume scheduler
> tick timer during cpu idle state entry/exit"), if a scheduler
> has a periodic tick timer, we stop it when going idle.
> 
> This, however, is only true for x86. Make it true for ARM as
> well.
> 
> Signed-off-by: Dario Faggioli 

Reviewed-by: Stefano Stabellini 


> ---
> Cc: Stefano Stabellini 
> Cc: Julien Grall 
> ---
>  xen/arch/arm/domain.c |   29 -
>  1 file changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 2dc8b0a..fce29cb 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -39,6 +39,25 @@
>  
>  DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>  
> +static void do_idle(void)
> +{
> +unsigned int cpu = smp_processor_id();
> +
> +sched_tick_suspend();
> +/* sched_tick_suspend() can raise TIMER_SOFTIRQ. Process it now. */
> +process_pending_softirqs();
> +
> +local_irq_disable();
> +if ( cpu_is_haltable(cpu) )
> +{
> +dsb(sy);
> +wfi();
> +}
> +local_irq_enable();
> +
> +sched_tick_resume();
> +}
> +
>  void idle_loop(void)
>  {
>  unsigned int cpu = smp_processor_id();
> @@ -52,15 +71,7 @@ void idle_loop(void)
>  if ( unlikely(tasklet_work_to_do(cpu)) )
>  do_tasklet();
>  else
> -{
> -local_irq_disable();
> -if ( cpu_is_haltable(cpu) )
> -{
> -dsb(sy);
> -wfi();
> -}
> -local_irq_enable();
> -}
> +do_idle();
>  
>  do_softirq();
>  /*
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-31 Thread Stefano Stabellini
On Mon, 31 Jul 2017, Zhongze Liu wrote:
> I'm extremely sorry that I mistakenly copied and pasted an immediate
> version of the proposal here. As you might have already noticed, some
> of the content obviously conflicts with itself. Please see the new one below.
> And some typo's and indentation issues are also fixed. Sorry again for my
> mistake.
> 
> The revisited proposal:
> 
> 
> 1. Motivation and Description
> 
> Virtual machines use grant table hypercalls to setup a share page for
> inter-VMs communications. These hypercalls are used by all PV
> protocols today. However, very simple guests, such as baremetal
> applications, might not have the infrastructure to handle the grant table.
> This project is about setting up several shared memory areas for inter-VMs
> communications directly from the VM config file.
> So that the guest kernel doesn't have to have grant table support (in the
> embedded space, this is not unusual) to be able to communicate with
> other guests.
> 
> 
> 2. Implementation Plan:
> 
> 
> ==
> 2.1 Introduce a new VM config option in xl:
> ==
> 
> 2.1.1 Design Goals
> ~~~
> 
> The shared areas should be shareable among several (>=2) VMs, so every shared
> physical memory area is assigned to a set of VMs. Therefore, a “token” or
> “identifier” should be used here to uniquely identify a backing memory area.
> A string no longer than 128 bytes is used here to serve the purpose.
> 
> The backing area would be taken from one domain, which we will regard
> as the "master domain", and this domain should be created prior to any
> other "slave domain"s. Again, we have to use some kind of tag to tell who
> is the "master domain".
> 
> And the ability to specify the permissions and cacheability (and shareability
> for ARM guest's) of the pages to be shared should also be given to the user.
> 
> 2.2.2 Syntax and Behavior
> ~
> The following example illustrates the syntax of the proposed config entry
> (suppose that we're on x86):
> 
> In xl config file of vm1:
>   static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master,\
>   cache_policy=x86_normal, prot=rw',\
> \
>  'id=ID2, begin=0x30, end=0x40, role=master' ]
> 
> In xl config file of vm2:
>   static_shm = [ 'id=ID1, offset = 0, begin=0x50, end=0x60, \
>   role=slave, prot=rw' ]
> 
> In xl config file of vm3:
>   static_shm = [ 'id=ID2, offset = 0x1, begin=0x69, \
>   end=0x80, role=slave' ]
> 
> where:
>   @id  The identifier of the backing memory area.
>Can be any string that matches the regexp "[_a-zA-Z0-9]+"
>and no longer than 128 characters
> 
>   @offset  Can only appear when @role = slave. The sharing will
>start from the beginning of backing memory area plus
>this offset. If not set, it defaults to zero.
>Can be decimals or hexadecimals of the form "0x2",
>and should be the multiple of the hypervisor page
>granularity (currently 4K on both ARM and x86).
> 
>   @begin/end   The boundaries of the shared memory area. The format
>requirements are the same with @offset.
> 
>   @roleCan only be 'master' or 'slave', it defaults to 'slave'.
> 
>   @protWhen @role = master, this means the largest set of
>stage-2 permission flags that can be granted to the
>slave domains.
>When @role = slave, this means the stage-2 permission
>flags of the shared memory area.
>Currently only 'rw' is supported. If not set. it
>defaults to 'rw'.
> 
>   @cache_policyCan only appear when @role = master.
>The stage-2 cacheability/shareability attributes of the
>shared memory area. Currently, only two policies are
>supported:
>  * ARM_normal: Only applicable to ARM guests. This
>would mean Inner and Outer Write-Back
>Cacheable, and Inner Shareable.
>  * x86_normal: Only applicable to x86 HVM guests. This
>would mean Write-Back Cacheable.
>If not set, it defaults to the *_normal policy for the
>corresponding platform.
> 
> Note:
>   The sizes of the areas 

Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-31 Thread Stefano Stabellini
On Mon, 31 Jul 2017, Wei Liu wrote:
> On Mon, Jul 31, 2017 at 01:09:04AM +0800, Zhongze Liu wrote:
> > 
> >   @cache_policyCan only appear when @role = master.
> >The stage-2 cacheability/shareability attributes of the
> >shared memory area. Currently, only two policies are
> >supported:
> >  * ARM_normal: Only applicable to ARM guests. This
> >would mean Inner and Outer Write-Back
> >Cacheable, and Inner Shareable.
> >  * x86_normal: Only applicable to x86 HVM guests. This
> >would mean Write-Back Cacheable.
> 
> This question might have been asked before, I'm sorry for not being
> able to follow closely previous discussions...
> 
> Why are these opaque names chosen instead of just writing "wb" "wc" or
> whatever?

Because "wb" and "wc" mean different things on ARM and x86. Also the
meaning of "wb" is not clear enough on ARM unless you also specify
shareability. It's simpler this way, and it gives us plenty of room to
improve the syntax in the future if we want to add more cacheability and
shareability modes while remaining backward compatible.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [XenSummit 2017] Shared coprocessor framework followup

2017-07-31 Thread Edgar E. Iglesias
On Tue, Jul 18, 2017 at 08:10:15PM +0300, Andrii Anisov wrote:
> **Dear All,
> 
> During the developers summit a Shared Coprocessor Framework (SCF) concept
> was presented. Noticeable interest from community was discovered during
> discussions. So this is a call for all interested parties to collect a
> feedback and setup a collaboration.

Hi Andrii!


> 
> There are several topics I would like to collect responses from the
> community:
> - Who are interested in SCF design, discussions, development, usage,
> etc? Personalities or organizations.

Yes I'm interested in this. I'm not sure how much time I'll be able to
contribute but at least I can review proposals and hopefully look at
implementing a driver/backend that may be useful for our FPGA platforms.

Cheers,
Edgar


> - What devices (type of devices) are intended to be shared using SCF?
> - What are expected coprocessor sharing use-cases (i.e. DSP running
> different FW for different domains, etc).
> - If someone is willing to take a part in SCF design and development
> (core, API)?
> - If someone is willing to implement their coprocessor support (driver)
> for SCF?
> 
> I look forward to hearing from you.
> 
> -- 
> 
> *Andrii Anisov*
> 
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/arm: Fix comments coding style in assembler files

2017-07-31 Thread Stefano Stabellini
On Thu, 27 Jul 2017, Andrii Anisov wrote:
> From: Andrii Anisov 
> 
> Signed-off-by: Andrii Anisov 

Reviewed-by: Stefano Stabellini 


> ---
>  xen/arch/arm/arm32/debug-8250.inc   | 12 +++--
>  xen/arch/arm/arm32/debug-exynos4210.inc | 12 +++--
>  xen/arch/arm/arm32/debug-pl011.inc  | 18 +---
>  xen/arch/arm/arm32/debug-scif.inc   |  6 ++-
>  xen/arch/arm/arm32/debug.S  |  6 ++-
>  xen/arch/arm/arm32/head.S   | 81 
> ++---
>  xen/arch/arm/arm64/debug-8250.inc   | 12 +++--
>  xen/arch/arm/arm64/debug-cadence.inc| 12 +++--
>  xen/arch/arm/arm64/debug-pl011.inc  | 18 +---
>  xen/arch/arm/arm64/debug.S  |  6 ++-
>  xen/arch/arm/arm64/entry.S  | 50 ++--
>  11 files changed, 147 insertions(+), 86 deletions(-)
> 
> diff --git a/xen/arch/arm/arm32/debug-8250.inc 
> b/xen/arch/arm/arm32/debug-8250.inc
> index 757ffd8..0759a27 100644
> --- a/xen/arch/arm/arm32/debug-8250.inc
> +++ b/xen/arch/arm/arm32/debug-8250.inc
> @@ -16,9 +16,11 @@
>  
>  #include 
>  
> -/* 8250 UART wait UART to be ready to transmit
> +/*
> + * 8250 UART wait UART to be ready to transmit
>   * rb: register which contains the UART base address
> - * rc: scratch register */
> + * rc: scratch register
> + */
>  .macro early_uart_ready rb rc
>  1:
>  ldr \rc, [\rb, #(UART_LSR << EARLY_UART_REG_SHIFT)] /* Read LSR 
> */
> @@ -26,9 +28,11 @@
>  beq 1b /* Wait for the UART to be ready 
> */
>  .endm
>  
> -/* 8250 UART transmit character
> +/*
> + * 8250 UART transmit character
>   * rb: register which contains the UART base address
> - * rt: register which contains the character to transmit */
> + * rt: register which contains the character to transmit
> + */
>  .macro early_uart_transmit rb rt
>  str   \rt, [\rb, #UART_THR]  /* Write Transmit buffer */
>  .endm
> diff --git a/xen/arch/arm/arm32/debug-exynos4210.inc 
> b/xen/arch/arm/arm32/debug-exynos4210.inc
> index 752942d..4e80a21 100644
> --- a/xen/arch/arm/arm32/debug-exynos4210.inc
> +++ b/xen/arch/arm/arm32/debug-exynos4210.inc
> @@ -18,9 +18,11 @@
>  
>  #include 
>  
> -/* Exynos 5 UART wait UART to be ready to transmit
> +/*
> + * Exynos 5 UART wait UART to be ready to transmit
>   * rb: register which contains the UART base address
> - * rc: scratch register */
> + * rc: scratch register
> + */
>  .macro early_uart_ready rb rc
>  1:
>  ldr   \rc, [\rb, #UTRSTAT]   /* <- UTRSTAT (Flag register) */
> @@ -28,9 +30,11 @@
>  beq   1b /* Wait for the UART to be ready */
>  .endm
>  
> -/* Exynos 5 UART transmit character
> +/*
> + * Exynos 5 UART transmit character
>   * rb: register which contains the UART base address
> - * rt: register which contains the character to transmit */
> + * rt: register which contains the character to transmit
> + */
>  .macro early_uart_transmit rb rt
>  str   \rt, [\rb, #UTXH]  /* -> UTXH (Data Register) */
>  .endm
> diff --git a/xen/arch/arm/arm32/debug-pl011.inc 
> b/xen/arch/arm/arm32/debug-pl011.inc
> index 6a64dbf..ec462ea 100644
> --- a/xen/arch/arm/arm32/debug-pl011.inc
> +++ b/xen/arch/arm/arm32/debug-pl011.inc
> @@ -18,10 +18,12 @@
>  
>  #include 
>  
> -/* PL011 UART initialization
> +/*
> + * PL011 UART initialization
>   * rb: register which contains the UART base address
>   * rc: scratch register 1
> - * rd: scratch register 2 (unused here) */
> + * rd: scratch register 2 (unused here)
> + */
>  .macro early_uart_init rb, rc, rd
>  mov   \rc, #(7372800 / EARLY_PRINTK_BAUD % 16)
>  str   \rc, [\rb, #FBRD] /* -> UARTFBRD (Baud divisor fraction) */
> @@ -33,9 +35,11 @@
>  str   \rc, [\rb, #CR] /* -> UARTCR (Control Register) */
>  .endm
>  
> -/* PL011 UART wait UART to be ready to transmit
> +/*
> + * PL011 UART wait UART to be ready to transmit
>   * rb: register which contains the UART base address
> - * rc: scratch register */
> + * rc: scratch register
> + */
>  .macro early_uart_ready rb, rc
>  1:
>  ldr   \rc, [\rb, #FR]   /* <- UARTFR (Flag register) */
> @@ -43,9 +47,11 @@
>  bne   1b/* Wait for the UART to be ready */
>  .endm
>  
> -/* PL011 UART transmit character
> +/*
> + * PL011 UART transmit character
>   * rb: register which contains the UART base address
> - * rt: register which contains the character to transmit */
> + * rt: register which contains the character to transmit
> + */
>  .macro early_uart_transmit rb, rt
>  str   \rt, [\rb, #DR]/* -> UARTDR (Data Register) */
>  .endm
> diff --git a/xen/arch/arm/arm32/debug-scif.inc 
> b/xen/arch/arm/arm32/debug-scif.inc
> index ce85752..143f05d 100644
> --- a/xen/arch/arm/arm32/debug-scif.inc
> +++ b/xen/arch/arm/arm32/debug-scif.inc
> @@ -19,7 +19,8 @@
>  
>  #include 
>  
> -/* SCIF UART 

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Stefano Stabellini
On Mon, 31 Jul 2017, Oleksandr Andrushchenko wrote:
> 3 Sharing with page exchange (XENMEM_exchange)
> ==
> 
> This API was pointed to me by Stefano Stabellini as one of the possible ways
> to
> achieve zero copying and share physically contiguous buffers. It is used by
> x86
> SWIOTLB code (xen_create_contiguous_region, [5]), but as per my understanding
> this API cannot be used on ARM as of now [6].  Conclusion: not an option for
> ARM
> at the moment

Let me elaborate on this. The purpose of XENMEM_exchange is to exchange
a number of memory pages with an equal number of contiguous memory
pages, possibly even under 4G. The original purpose of the hypercall was
to get DMA-able memory.

So far, it has only been used by Dom0 on x86. Dom0 on ARM doesn't need
it because it is mapped 1:1 by default and device assignment is not
allowed without an IOMMU. However it should work on ARM too, as the
implementation is all common code in Xen. Also, looking at the
implementation (xen/common/memory.c:memory_exchange) it would seem that
it can be called from a DomU too (but I have never tried).

Thus, if you have a platform without IOMMU and you disabled the IOMMU
checks in Xen to assign a device to a DomU anyway, then you could use
this hypercall from DomU to get memory under 4G to be used for DMA with
this device.

As far as I can tell XENMEM_exchange could help in the design of
zero-copy PV protocols only to address this specific use case:

- you have a frontend in DomU and a backend in Dom0
- pages shared by DomU get mapped in Dom0 and potentially used for DMA
- the device has under 4G DMA restrictions

Normally Dom0 maps a DomU page, then at the time of using the mapped
page for DMA it checks whether it is suitable for DMA (under 4G if the
device requires so). If it is not, Dom0 uses a bounce buffer borrowed
from the swiotlb. Obviously this introduces one or two memcpys.

Instead, if DomU calls XENMEM_exchange to get memory under 4G, and
shares one of the pages with Dom0 via PV frontends, then Dom0 wouldn't
have to use a bounce buffer to do DMA to this page.

Does it make sense?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 112391: regressions - trouble: blocked/broken/fail/pass

2017-07-31 Thread osstest service owner
flight 112391 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112391/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 112286
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112286
 build-arm64   2 hosts-allocate broken REGR. vs. 112286
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 
112286

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-rumprun-amd64 17 rumprun-demo-xenstorels/xenstorels.repeat 
fail pass in 112384
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail pass in 
112384
 test-armhf-armhf-xl-rtds 12 guest-startfail pass in 112384

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64-xsm   3 capture-logs  broken blocked in 112286
 build-arm64-pvops 3 capture-logs  broken blocked in 112286
 build-arm64   3 capture-logs  broken blocked in 112286
 test-amd64-amd64-xl-qemut-win7-amd64 18 guest-start/win.repeat fail in 112384 
blocked in 112286
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail in 112384 blocked in 
112286
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail in 112384 like 
112274
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop   fail in 112384 like 112286
 test-armhf-armhf-xl-rtds13 migrate-support-check fail in 112384 never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail in 112384 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 112286
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 112286
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 112286
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 112286
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 112286
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 112286
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never 

[Xen-devel] [PATCH] x86/apic/x2apic: Share IRQ vector between cluster members only when no cpumask is specified

2017-07-31 Thread Boris Ostrovsky
We have limited number (slightly under NR_DYNAMIC_VECTORS=192) of IRQ
vectors that are available to each processor. Currently, when x2apic
cluster mode is used (which is default), each vector is shared among
all processors in the cluster. With many IRQs (as is the case on systems
with multiple SR-IOV cards) and few clusters (e.g. single socket)
there is a good chance that we will run out of vectors.

This patch tries to decrease vector sharing between processors by
assigning vector to a single processor if the assignment request (via
__assign_irq_vector()) comes without explicitly specifying which
processors are expected to share the interrupt. This typically happens
during boot time (or possibly PCI hotplug) when create_irq(NUMA_NO_NODE)
is called. When __assign_irq_vector() is called from
set_desc_affinity() which provides sharing mask, vector sharing will
continue to be performed, as before.

This patch to some extent mirrors Linux commit d872818dbbee
("x86/apic/x2apic: Use multiple cluster members for the irq destination
only with the explicit affinity").

Note that this change still does not guarantee that we never run out of
vectors. For example, on a single core system we will be effectively
back to the single cluster/socket case of original code.

Signed-off-by: Boris Ostrovsky 
---
 xen/arch/x86/acpi/boot.c | 2 +-
 xen/arch/x86/apic.c  | 2 +-
 xen/arch/x86/genapic/delivery.c  | 4 ++--
 xen/arch/x86/genapic/x2apic.c| 9 +++--
 xen/arch/x86/io_apic.c   | 8 
 xen/arch/x86/irq.c   | 4 ++--
 xen/arch/x86/mpparse.c   | 2 +-
 xen/arch/x86/msi.c   | 2 +-
 xen/include/asm-x86/genapic.h| 9 ++---
 xen/include/asm-x86/mach-generic/mach_apic.h | 9 +
 10 files changed, 30 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c
index 8e6c96d..c16b14a 100644
--- a/xen/arch/x86/acpi/boot.c
+++ b/xen/arch/x86/acpi/boot.c
@@ -645,7 +645,7 @@ static void __init acpi_process_madt(void)
acpi_ioapic = true;
 
smp_found_config = true;
-   clustered_apic_check();
+   CLUSTERED_APIC_CHECK();
}
}
if (error == -EINVAL) {
diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index 851a6cc..a0e1798 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -544,7 +544,7 @@ void setup_local_APIC(void)
  * an APIC.  See e.g. "AP-388 82489DX User's Manual" (Intel
  * document number 292116).  So here it goes...
  */
-init_apic_ldr();
+INIT_APIC_LDR();
 
 /*
  * Set Task Priority to reject any interrupts below FIRST_DYNAMIC_VECTOR.
diff --git a/xen/arch/x86/genapic/delivery.c b/xen/arch/x86/genapic/delivery.c
index ced92a1..d71c01c 100644
--- a/xen/arch/x86/genapic/delivery.c
+++ b/xen/arch/x86/genapic/delivery.c
@@ -30,7 +30,7 @@ void __init clustered_apic_check_flat(void)
printk("Enabling APIC mode:  Flat.  Using %d I/O APICs\n", nr_ioapics);
 }
 
-const cpumask_t *vector_allocation_cpumask_flat(int cpu)
+const cpumask_t *vector_allocation_cpumask_flat(int cpu, const cpumask_t 
*cpumask)
 {
return _online_map;
 } 
@@ -58,7 +58,7 @@ void __init clustered_apic_check_phys(void)
printk("Enabling APIC mode:  Phys.  Using %d I/O APICs\n", nr_ioapics);
 }
 
-const cpumask_t *vector_allocation_cpumask_phys(int cpu)
+const cpumask_t *vector_allocation_cpumask_phys(int cpu, const cpumask_t 
*cpumask)
 {
return cpumask_of(cpu);
 }
diff --git a/xen/arch/x86/genapic/x2apic.c b/xen/arch/x86/genapic/x2apic.c
index 5fffb31..c0b97c9 100644
--- a/xen/arch/x86/genapic/x2apic.c
+++ b/xen/arch/x86/genapic/x2apic.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static DEFINE_PER_CPU_READ_MOSTLY(u32, cpu_2_logical_apicid);
 static DEFINE_PER_CPU_READ_MOSTLY(cpumask_t *, cluster_cpus);
@@ -72,9 +73,13 @@ static void __init clustered_apic_check_x2apic(void)
 {
 }
 
-static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu)
+static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu,
+const cpumask_t *cpumask)
 {
-return per_cpu(cluster_cpus, cpu);
+if ( cpumask != TARGET_CPUS )
+return per_cpu(cluster_cpus, cpu);
+else
+return cpumask_of(cpu);
 }
 
 static unsigned int cpu_mask_to_apicid_x2apic_cluster(const cpumask_t *cpumask)
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index 2838f6b..3eefcfc 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1038,7 +1038,7 @@ static void __init setup_IO_APIC_irqs(void)
 disable_8259A_irq(irq_to_desc(irq));
 
 desc = irq_to_desc(irq);
-SET_DEST(entry, logical, 

Re: [Xen-devel] DESIGN v2: CPUID part 3

2017-07-31 Thread Konrad Rzeszutek Wilk
On Wed, Jul 05, 2017 at 02:22:00PM +0100, Joao Martins wrote:
> On 07/05/2017 12:16 PM, Andrew Cooper wrote:
> > On 05/07/17 10:46, Joao Martins wrote:
> >> Hey Andrew,
> >>
> >> On 07/04/2017 03:55 PM, Andrew Cooper wrote:
> >>> Presented herewith is the a plan for the final part of CPUID work, which
> >>> primarily covers better Xen/Toolstack interaction for configuring the 
> >>> guests
> >>> CPUID policy.
> >>>
> >> Really nice write up, a few comments below.
> >>
> >>> A PDF version of this document is available from:
> >>>
> >>> http://xenbits.xen.org/people/andrewcoop/cpuid-part-3-rev2.pdf
> >>>
> >>> Changes from v1:
> >>>  * Clarification of the interaction of emulated features
> >>>  * More information about the difference between max and default 
> >>> featuresets.
> >>>
> >>> ~Andrew
> >>>
> >>> -8<-
> >>> % CPUID Handling (part 3)
> >>> % Revision 2
> >>>
> 
> [snip]
> 
> >>> # Proposal
> >>>
> >>> First and foremost, split the current **max\_policy** notion into separate
> >>> **max** and **default** policies.  This allows for the provision of 
> >>> features
> >>> which are unused by default, but may be opted in to, both at the 
> >>> hypervisor
> >>> level and the toolstack level.
> >>>
> >>> At the hypervisor level, **max** constitutes all the features Xen can use 
> >>> on
> >>> the current hardware, while **default** is the subset thereof which are
> >>> supported features, the features which the user has explicitly opted in 
> >>> to,
> >>> and excluding any features the user has explicitly opted out of.
> >>>
> >>> A new `cpuid=` command line option shall be introduced, whose internals 
> >>> are
> >>> generated automatically from the featureset ABI.  This means that all 
> >>> features
> >>> added to `include/public/arch-x86/cpufeatureset.h` automatically gain 
> >>> command
> >>> line control.  (RFC: The same top level option can probably be used for
> >>> non-feature CPUID data control, although I can't currently think of any 
> >>> cases
> >>> where this would be used Also find a sensible way to express 'available 
> >>> but
> >>> not to be used by Xen', as per the current `smep` and `smap` options.)
> >>>
> >>>
> >>> At the guest level, the **max** policy is conceptually unchanged.  It
> >>> constitutes all the features Xen is willing to offer to each type of 
> >>> guest on
> >>> the current hardware (including emulated features).  However, it shall 
> >>> instead
> >>> be derived from Xen's **default** host policy.  This is to ensure that
> >>> experimental hypervisor features must be opted in to at the Xen level 
> >>> before
> >>> they can be opted in to at the toolstack level.
> >>>
> >>> The guests **default** policy is then derived from its **max**.  This is
> >>> because there are some features which should always be explicitly opted 
> >>> in to
> >>> by the toolstack, such as emulated features which come with a security
> >>> trade-off, or for non-architectural features which may differ in
> >>> implementation in heterogeneous environments.
> >>>
> >>> All global policies (Xen and guest, max and default) shall be made 
> >>> available
> >>> to the toolstack, in a manner similar to the existing
> >>> _XEN\_SYSCTL\_get\_cpu\_featureset_ mechanism.  This allows decisions to 
> >>> be
> >>> taken which include all CPUID data, not just the feature bitmaps.
> >>>
> >>> New _XEN\_DOMCTL\_{get,set}\_cpuid\_policy_ hypercalls will be introduced,
> >>> which allows the toolstack to query and set the cpuid policy for a 
> >>> specific
> >>> domain.  It shall supersede _XEN\_DOMCTL\_set\_cpuid_, and shall fail if 
> >>> Xen
> >>> is unhappy with any aspect of the policy during auditing.  This provides
> >>> feedback to the user that a chosen combination will not work, rather than 
> >>> the
> >>> guest booting in an unexpected state.
> >>>
> >>> When a domain is initially created, the appropriate guests **default** 
> >>> policy
> >>> is duplicated for use.  When auditing, Xen shall audit the toolstacks
> >>> requested policy against the guests **max** policy.  This allows 
> >>> experimental
> >>> features or non-migration-safe features to be opted in to, without those
> >>> features being imposed upon all guests automatically.
> >>>
> >>> A guests CPUID policy shall be immutable after construction.  This better
> >>> matches real hardware, and simplifies the logic in Xen to translate policy
> >>> alterations into configuration changes.
> >>>
> >> This appears to be a suitable abstraction even for higher level toolstacks
> >> (libxl). At least I can imagine libvirt fetching the PV/HVM max policy, and
> >> compare them between different servers when user computes the guest cpu 
> >> config
> >> (the normalized one) and use the common denominator as the guest policy.
> >> Probably higher level toolstack could even use these said policies 
> >> constructs
> >> and built the idea of models such that the user could easily choose one 
> >> for a
> >> pool of hosts with different 

[Xen-devel] [PATCH v1 0/1] xen/arm: zynqmp: Disable PCIe

2017-07-31 Thread Edgar E. Iglesias
From: "Edgar E. Iglesias" 

Hi,

We're seeing panics in dom0 with PCIe enabled due to what seems
to be wrongly created mappings by Xen. With older kernels we
didn't see the panics but PCIe wasn't functional in dom0.

This disables the PCIe nodes on the ZynqMP until Xen/ARM gets
more PCIe support.

Can we get this into stable-4.8 and 4.9?

Cheers,
Edgar

Edgar E. Iglesias (1):
  xen/arm: Disable PCIe on the ZynqMP

 xen/arch/arm/platforms/xilinx-zynqmp.c | 1 +
 1 file changed, 1 insertion(+)

-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v1 1/1] xen/arm: Disable PCIe on the ZynqMP

2017-07-31 Thread Edgar E. Iglesias
From: "Edgar E. Iglesias" 

Disable PCIe on the ZynqMP. Xen does not yet know how to map the
controller and dom0 fails to boot with the node enabled.

Signed-off-by: Edgar E. Iglesias 
---
 xen/arch/arm/platforms/xilinx-zynqmp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/arm/platforms/xilinx-zynqmp.c 
b/xen/arch/arm/platforms/xilinx-zynqmp.c
index 2adee91..822287b 100644
--- a/xen/arch/arm/platforms/xilinx-zynqmp.c
+++ b/xen/arch/arm/platforms/xilinx-zynqmp.c
@@ -29,6 +29,7 @@ static const struct dt_device_match zynqmp_blacklist_dev[] 
__initconst =
 {
 /* Power management is not yet supported.  */
 DT_MATCH_COMPATIBLE("xlnx,zynqmp-pm"),
+DT_MATCH_COMPATIBLE("xlnx,nwl-pcie-2.11"),
 { /* sentinel */ },
 };
 
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [linux-linus test] 112390: regressions - trouble: blocked/broken/fail/pass

2017-07-31 Thread osstest service owner
flight 112390 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112390/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64   2 hosts-allocate broken REGR. vs. 110515
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 110515
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 110515
 test-amd64-amd64-examine  7 reboot   fail REGR. vs. 110515
 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host   fail REGR. vs. 110515
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 
110515
 test-amd64-amd64-rumprun-amd64  7 xen-boot   fail REGR. vs. 110515
 test-amd64-amd64-libvirt-pair 11 xen-boot/dst_host   fail REGR. vs. 110515
 test-amd64-amd64-qemuu-nested-intel  7 xen-boot  fail REGR. vs. 110515
 test-amd64-amd64-xl-qemut-debianhvm-amd64  7 xen-bootfail REGR. vs. 110515
 test-amd64-amd64-xl-pvh-intel  7 xen-bootfail REGR. vs. 110515
 test-amd64-amd64-pygrub   7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. 
vs. 110515
 test-amd64-amd64-libvirt  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-qcow2 7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-i386-pvgrub  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 
110515
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 
110515
 test-amd64-amd64-amd64-pvgrub  7 xen-bootfail REGR. vs. 110515
 test-amd64-amd64-libvirt-vhd  7 xen-boot fail REGR. vs. 110515

Tests which did not succeed, but are not blocking:
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64-xsm   3 capture-logs  broken blocked in 110515
 build-arm64-pvops 3 capture-logs  broken blocked in 110515
 build-arm64   3 capture-logs  broken blocked in 110515
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 110515
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail  like 110515
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 110515
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 110515
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 110515
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 110515
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 110515
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 

Re: [Xen-devel] [XTF PATCH] build: append -fno-pic to CFLAGS

2017-07-31 Thread Andrew Cooper
On 31/07/17 18:20, Wei Liu wrote:
> It appears that Stretch's gcc has this on by default, which causes the
> generating of several get_pc_thunk's, which breaks xsa-192 test.
>
> Signed-off-by: Wei Liu 

Reviewed and committed.  Thanks.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [XTF PATCH] build: append -fno-pic to CFLAGS

2017-07-31 Thread Wei Liu
It appears that Stretch's gcc has this on by default, which causes the
generating of several get_pc_thunk's, which breaks xsa-192 test.

Signed-off-by: Wei Liu 
---
 build/common.mk | 1 +
 1 file changed, 1 insertion(+)

diff --git a/build/common.mk b/build/common.mk
index f1de800..4ce6abf 100644
--- a/build/common.mk
+++ b/build/common.mk
@@ -28,6 +28,7 @@ COMMON_CFLAGS += -fno-common -fno-asynchronous-unwind-tables 
-fno-strict-aliasin
 COMMON_CFLAGS += -fno-stack-protector -ffreestanding
 COMMON_CFLAGS += -mno-red-zone -mno-sse
 COMMON_CFLAGS += -Wno-unused-parameter -Winline
+COMMON_CFLAGS += -fno-pic
 
 COMMON_AFLAGS-x86_32 := -m32
 COMMON_AFLAGS-x86_64 := -m64
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen:arm earlyprintk configuration for Hikey 960 boards

2017-07-31 Thread Julien Grall



On 31/07/17 17:11, Konrad Rzeszutek Wilk wrote:

On Thu, Jul 27, 2017 at 10:52:40AM +0100, Julien Grall wrote:

Hi Konrad,

On 27/07/17 02:18, Konrad Rzeszutek Wilk wrote:

On Wed, Jul 26, 2017 at 05:59:15PM +0100, Julien Grall wrote:

Hi Konrad,

On 26/07/17 17:54, Konrad Rzeszutek Wilk wrote:

Introduce an earlyprintk configuration of Hikey 960 boards.


..snip..


Would it be possible to update the wiki page on the hikey [1] with your
latest finding?


I added a whole new web-page:
https://wiki.xenproject.org/wiki/HiKey960

As the HiKey != HiKey960


Oh :). Fine then. Would you be up to do testing on this platform during RC?


Yes.


Thank you!



If you can you add your name in [1]. This would help user to know the board
has been tested with latest release.



[Also updated the landing page 
https://wiki.xenproject.org/wiki/Xen_ARM_with_Virtualization_Extensions
to point to this new one]



Cheers,

[1] https://wiki.xenproject.org/wiki/Xen_ARM_Manual_Smoke_Test/Results

--
Julien Grall


--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/2] libxl: do not start dom0 qemu for stubdomain when not needed

2017-07-31 Thread Ian Jackson
Wei Liu writes ("Re: [PATCH v2 2/2] libxl: do not start dom0 qemu for 
stubdomain when not needed"):
> Hmm... I don't think there is requirement in CODING_STYLE for
> multiple-line comment, so there are quite a few styles in use. But
> looking at libxl code the prevailing style seems to be:
> 
>/*
> *  
> *  
> */
> 
> I will make the adjustment while committing.

Personally I think this is well into the kind of territory that is not
worth arguing over.  Even a mix of styles here is fine.  (I think we
should mandate the initial `*' on continuation lines.)

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [libvirt test] 112392: trouble: blocked/broken/pass

2017-07-31 Thread osstest service owner
flight 112392 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112392/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112276
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 112276
 build-arm64   2 hosts-allocate broken REGR. vs. 112276

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 build-arm64   3 capture-logs  broken blocked in 112276
 build-arm64-xsm   3 capture-logs  broken blocked in 112276
 build-arm64-pvops 3 capture-logs  broken blocked in 112276
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 112276
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 112276
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 112276
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  772a6e36a5cbed4992315c4a8325947020f7ec70
baseline version:
 libvirt  f7237d63e8f02f3689f9b63b413fae7d4221faa9

Last test of basis   112276  2017-07-25 04:21:09 Z6 days
Failing since112310  2017-07-26 04:21:38 Z5 days6 attempts
Testing same since   112370  2017-07-29 04:23:27 Z2 days3 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Daniel P. Berrange 
  Erik Skultety 
  John Ferlan 
  Ján Tomko 
  Martin Kletzander 
  Michal Privoznik 
  Nitesh Konkar 
  Nitesh Konkar 
  Pavel Hrdina 
  Peter Krempa 
  Scott Garfinkle 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  broken  
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  broken  
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  blocked 
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopsbroken  
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm blocked 
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt blocked 
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pair

Re: [Xen-devel] [PATCH v2 2/2] libxl: do not start dom0 qemu for stubdomain when not needed

2017-07-31 Thread Wei Liu
On Fri, Jul 28, 2017 at 06:42:14PM +0200, Marek Marczykowski-Górecki wrote:
> Do not setup vfb+vkb when no access method was configured. Then check if
> qemu is really needed.
> 
> The only not configurable thing forcing qemu running in dom0 after this
> change are consoles used to save/restore. But even in that case, there
> is much smaller part of qemu exposed.
> 
> Signed-off-by: Marek Marczykowski-Górecki 

Acked-by: Wei Liu 

> +/* will be changed back to LIBXL__CONSOLE_BACKEND_IOEMU if 
> qemu
> + * will be in use */

Hmm... I don't think there is requirement in CODING_STYLE for
multiple-line comment, so there are quite a few styles in use. But
looking at libxl code the prevailing style seems to be:

   /*
*  
*  
*/

I will make the adjustment while committing.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] PVH VCPU hotplug support v7?

2017-07-31 Thread Boris Ostrovsky
On 07/31/2017 11:36 AM, Ross Lagerwall wrote:
> On 07/31/2017 03:29 PM, Boris Ostrovsky wrote:
>> On 07/31/2017 10:12 AM, Andrew Cooper wrote:
>>> On 31/07/17 14:55, Boris Ostrovsky wrote:
 On 07/31/2017 09:20 AM, Ross Lagerwall wrote:
> Hi Boris,
>
> I've modified your PVH VCPU hotplug support v6 patch series [1] to
> support HVM guests running _with_ a device model for XenServer's
> purposes. This is useful because it moves the vCPU hotplug handling
> out of QEMU and allows it to mostly be shared with PVH. It will also
> allow unplugging vCPUs (libxl currently only does cpu-add for
> upstream
> qemu).
>
> Are you still planning on continuing with that patch series since
> your
> commit to Linux [2]?
 This series has been put on hold until we figure out what to do with
 hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI
 CPU namespace --- on hotplug event dom0 has to somehow figure out
 whether the event was due to (dis)appearance of a physical or
 virtual CPU).

 I don't think this has been dealt with yet (copying Roger).
>>>  From the point of view of unblocking several pieces of work, it
>>> would be
>>> fine for this logic to be behind an emulation flag, just like
>>> LAPIC/etc.
>>
>> The (I think) last message discussing this series was
>>
>> https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00143.html
>>
>>
>> Are you suggesting extracting pieces that would move hotplug support for
>> HVM guests from qemu to hypervisor/toolstack but leave all PVH-specific
>> code out? (The feature flag is part of this series ---
>> https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00059.html)
>>
>>
>
> I think(?) Andrew was suggesting to have an emulation flag such that
> hotplug support is moved into the hypervisor for HVM guests _and_ PVH
> guests except for PVH dom0.

That (different handling for PVH dom0 vs. domU) was exactly what Jan was
objecting to. (I'll add him too).

-boris

>
> I don't know what work this unblocks that he was referring to.
>


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 4/8] mm: Scrub memory from idle loop

2017-07-31 Thread Boris Ostrovsky
On 07/31/2017 11:20 AM, Jan Beulich wrote:
 Boris Ostrovsky  07/23/17 4:14 AM >>>
 @@ -1050,17 +1120,42 @@ static void scrub_free_pages(unsigned int node)
 -if ( node_need_scrub[node] == 0 )
 -return;
 +if ( preempt || (node_need_scrub[node] == 0) )
 +goto out;
   }
   } while ( order-- != 0 );
   }
 +
 + out:
 +spin_unlock(_lock);
 +node_clear(node, node_scrubbing);
 +return softirq_pending(cpu) || (node_to_scrub(false) != NUMA_NO_NODE);
>>> While I can see why you use it here, the softirq_pending() looks sort of
>>> misplaced: While invoking it twice in the caller will look a little odd too,
>>> I still think that's where the check belongs.
>>
>> scrub_free_pages is called from idle loop as
>>
>> else if ( !softirq_pending(cpu) && !scrub_free_pages() )
>> pm_idle();
>>
>> so softirq_pending() is unnecessary here.
>>
>> (Not sure why you are saying it would be invoked twice)
> That was sort of implicit - the caller would want to become
>
>
> else if ( !softirq_pending(cpu) && !scrub_free_pages() && 
> !softirq_pending(cpu) )
> pm_idle();
>
> to account for the fact that a softirq may become pending while scrubbing.

That would look really odd IMO.

Would

else if ( !softirq_pending(cpu) )
if ( !scrub_free_pages() && !softirq_pending(cpu) )
   pm_idle();

or 

else if ( !softirq_pending(cpu) && !scrub_free_pages() )
if ( !softirq_pending(cpu) )
pm_idle();

 
be better? (I'd prefer the first)

-boris



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen:arm earlyprintk configuration for Hikey 960 boards

2017-07-31 Thread Konrad Rzeszutek Wilk
On Thu, Jul 27, 2017 at 10:52:40AM +0100, Julien Grall wrote:
> Hi Konrad,
> 
> On 27/07/17 02:18, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jul 26, 2017 at 05:59:15PM +0100, Julien Grall wrote:
> > > Hi Konrad,
> > > 
> > > On 26/07/17 17:54, Konrad Rzeszutek Wilk wrote:
> > > > Introduce an earlyprintk configuration of Hikey 960 boards.
> > 
> > ..snip..
> > > 
> > > Would it be possible to update the wiki page on the hikey [1] with your
> > > latest finding?
> > 
> > I added a whole new web-page:
> > https://wiki.xenproject.org/wiki/HiKey960
> > 
> > As the HiKey != HiKey960
> 
> Oh :). Fine then. Would you be up to do testing on this platform during RC?

Yes.
> 
> If you can you add your name in [1]. This would help user to know the board
> has been tested with latest release.
> 
> > 
> > [Also updated the landing page 
> > https://wiki.xenproject.org/wiki/Xen_ARM_with_Virtualization_Extensions
> > to point to this new one]
> > 
> 
> Cheers,
> 
> [1] https://wiki.xenproject.org/wiki/Xen_ARM_Manual_Smoke_Test/Results
> 
> -- 
> Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 3/8] mm: Scrub pages in alloc_heap_pages() if needed

2017-07-31 Thread Boris Ostrovsky
On 07/31/2017 11:16 AM, Jan Beulich wrote:
 Boris Ostrovsky  07/23/17 4:07 AM >>>
>> On 06/27/2017 02:00 PM, Jan Beulich wrote:
>> Boris Ostrovsky  06/22/17 8:55 PM >>>
 @@ -862,10 +879,19 @@ static struct page_info *alloc_heap_pages(
   if ( d != NULL )
   d->last_alloc_node = node;
   
 +need_scrub = !!first_dirty_pg && !(memflags & MEMF_no_scrub);
>>> No need for !! here. But I wonder whether that part of the check is really
>>> useful anyway, considering the sole use ...
>>>
   for ( i = 0; i < (1 << order); i++ )
   {
   /* Reference count must continuously be zero for free pages. */
 -BUG_ON(pg[i].count_info != PGC_state_free);
 +BUG_ON((pg[i].count_info & ~PGC_need_scrub) != PGC_state_free);
 +
 +if ( test_bit(_PGC_need_scrub, [i].count_info) )
 +{
 +if ( need_scrub )
 +scrub_one_page([i]);
>>> ... here. If it isn't, I think the local variable isn't warranted either.
>>> If you agree, the thus adjusted patch can have
>>> Reviewed-by: Jan Beulich 
>>> (otherwise I'll wait with it to understand the reason first).
>> first_dirty_pg is indeed unnecessary but I think local variable is 
>> useful to avoid ANDing memflags inside the loop on each iteration 
>> (unless you think compiler is smart enough to realize that memflags is 
>> not changing).
> I don't understand: At least on x86 I'd expect the compiler to use a
> single TEST if you used memflags inside the loop, whereas the local
> variable would likely be a single CMP inside the loop plus setup code
> outside of it.

OK, I haven't considered that you don't actually need to AND and then
CMP. Then yes, local variable is unnecessary.

-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 3/5] xen/livepatch/ARM32: Don't load and crash on livepatches loaded with wrong alignment.

2017-07-31 Thread Konrad Rzeszutek Wilk
On Mon, Jul 31, 2017 at 07:55:34AM -0600, Jan Beulich wrote:
> >>> Konrad Rzeszutek Wilk  07/26/17 9:50 PM >>>
> >--- a/docs/misc/livepatch.markdown
> >+++ b/docs/misc/livepatch.markdown
> >@@ -279,6 +279,10 @@ It may also have some architecture-specific sections. 
> >For example:
> >* Exception tables.
> >* Relocations for each of these sections.
>  >
> >+Note that on ARM 32 the sections SHOULD be four byte aligned. Otherwise
> >+we risk hitting Data Abort exception as un-aligned manipulation of data is
> >+prohibited on ARM 32.
> 
> This (and hence the rest of the patch) is not in line with the outcome of the
> earlier discussion we had. Nothing is wrong with a section having smaller
> alignment, as long as there are no 32-bit (or wider, but I don't think there
> are any such) relocations against such a section. And even if there were, I
> think it should rather be the code doing the relocations needing to cope, as
> I don't think the ARM ELF ABI imposes any such restriction.

The idea behind this patch is to give advance warnings. Akin to what
2ff229643b739e2fd0cd0536ee9fca506cfa92f8
"xen/livepatch: Don't crash on encountering STN_UNDEF relocations" did.

The other patches in this series fix the alignment issues.

The ARM ELF ABI 
(http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044f/IHI0044F_aaelf.pdf)

says:

4.3.5 Section Alignment
There is no minimum alignment required for a section. However, sections 
containing thumb code must be at least
16-bit aligned and sections containing ARM code must be at least 32-bit aligned.
Platform standards may set a limit on the maximum alignment that they can 
guarantee (normally the page size).



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 1/8] mm: Place unscrubbed pages at the end of pagelist

2017-07-31 Thread Boris Ostrovsky
On 07/31/2017 10:45 AM, Jan Beulich wrote:
 Boris Ostrovsky  07/23/17 4:01 AM >>>
>> On 06/27/2017 01:06 PM, Jan Beulich wrote:
>> Boris Ostrovsky  06/22/17 8:55 PM >>>
 +{
 +if ( pg < first_dirty_pg )
 +first_dirty = (first_dirty_pg - pg) / sizeof(*pg);
>>> Pointer subtraction already includes the involved division. 
>>
>> Yes, this was a mistake.
>>
>>> Otoh I wonder
>>> if you couldn't get away without pointer comparison/subtraction here
>>> altogether.
>>
>> Without comparison I can only assume that first_dirty is zero (i.e. the 
>> whole buddy is potentially dirty). Is there something else I could do?
> I was thinking of tracking indexes instead of pointers. But maybe that
> would more hamper readability of the overall result than help it.

I'll try to see how it looks.

>  
 +else
 +i = 0;
 +
 +for ( ; i < (1 << cur_order); i++ )
 +if ( test_bit(_PGC_need_scrub,
 +  _head[i].count_info) )
 +{
 +first_dirty = i;
 +break;
 +}
>>> Perhaps worth having ASSERT(first_dirty != INVALID_DIRTY_IDX) here? Or are
>>> there cases where ->u.free.first_dirty of a page may be wrong?
>>
>> When we merge in free_heap_pages we don't clear first_dirty of the 
>> successor buddy (at some point I did have this done but you questioned 
>> whether it was needed and I dropped it).
> Hmm, this indeed answers my question, but doesn't help (me) understanding
> whether the suggested ASSERT() could be wrong.

Oh, I see what you were asking --- ASSERT() *after* the loop, to make
sure we indeed found the first dirty page. Yes, I will add it.

>
 --- a/xen/include/asm-x86/mm.h
 +++ b/xen/include/asm-x86/mm.h
 @@ -88,7 +88,15 @@ struct page_info
   /* Page is on a free list: ((count_info & PGC_count_mask) == 0). 
 */
   struct {
   /* Do TLBs need flushing for safety before next page use? */
 -bool_t need_tlbflush;
 +unsigned long need_tlbflush:1;
 +
 +/*
 + * Index of the first *possibly* unscrubbed page in the buddy.
 + * One more than maximum possible order (MAX_ORDER+1) to
>>> Why +1 here and hence ...
>> Don't we have MAX_ORDER+1 orders?
> So here there might be a simple misunderstanding: I understand the
> parenthesized MAX_ORDER+1 to represent "maximum possible
> order", i.e. excluding the "one more than", not the least because of
> the ...
>
>>> + * accommodate INVALID_DIRTY_IDX.
>>> + */
>>> +#define INVALID_DIRTY_IDX (-1UL & (((1UL<>> +unsigned long first_dirty:MAX_ORDER + 2;
> +2 here.
>
>>> ... why +2 instead of +1? And isn't the expression INVALID_DIRTY_IDX wrongly
>>> parenthesized (apart from lacking blanks around the shift operator)? I'd
>>> expect you want a value with MAX_ORDER+1 set bits, i.e.
>>> (1UL << (MAX_ORDER + 1)) - 1. ANDing with -1UL seems quite pointless too.
>> Yes to parentheses and AND. Should be (1UL << (MAX_ORDER + 2)) - 1
> I.e. I would still expect it to be (1UL << (MAX_ORDER + 1)) - 1
> here.


Sorry, I still don't get it.

Say, MAX_ORDER is 1. Since this implies that indexes 0, 1, 2 and 3 are
all valid (because we can have up to 2^(MAX_ORDER+1) pages), don't we
need 3 bits to indicate an invalid index?

-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] libxl: use xen-blkback for 'vbd' disk types by default

2017-07-31 Thread Wei Liu
On Mon, Jul 31, 2017 at 04:56:04PM +0100, Wei Liu wrote:
> On Fri, Jul 28, 2017 at 06:42:13PM +0200, Marek Marczykowski-Górecki wrote:
> > This will allow later to make HVM domain without qemu in dom0 (in
> > addition to the one in stubdomain).
> > 
> > Signed-off-by: Marek Marczykowski-Górecki 
> > 
> > ---
> > This is extracted from v1 of "libxl: do not start dom0 qemu for
> > stubdomain when not needed".
> > 
> > Signed-off-by: Marek Marczykowski-Górecki 
> > ---
> >  tools/libxl/libxl_disk.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/libxl/libxl_disk.c b/tools/libxl/libxl_disk.c
> > index 63de75c..7842d9b 100644
> > --- a/tools/libxl/libxl_disk.c
> > +++ b/tools/libxl/libxl_disk.c
> > @@ -56,10 +56,12 @@ static void disk_eject_xswatch_callback(libxl__egc 
> > *egc, libxl__ev_xswatch *w,
> >  "/local/domain/%d/backend/%" TOSTRING(BACKEND_STRING_SIZE)
> > "[a-z]/%*d/%*d",
> > >backend_domid, backend_type);
> > -if (!strcmp(backend_type, "tap") || !strcmp(backend_type, "vbd")) {
> > +if (!strcmp(backend_type, "tap")) {
> >  disk->backend = LIBXL_DISK_BACKEND_TAP;
> >  } else if (!strcmp(backend_type, "qdisk")) {
> >  disk->backend = LIBXL_DISK_BACKEND_QDISK;
> > +} else if (!strcmp(backend_type, "vbd")) {
> > +disk->backend = LIBXL_DISK_BACKEND_PHY;
> 
> Wait, it only occurred to me until now this patch is changing
> disk_eject_xswatch_callback.
> 
> Is this a bug fix? How is it possible for the backend_type to be "vbd"
> when there isn't such thing in libxl_types.idl?

Oh, I'm an idiot. That's read from xenstore path. I think this patch is
correct. But I still tend to think this is a bug fix. How do you
discover this problem? Has disk eject ever worked for you?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] xen/link: Move .data.rel.ro sections into .rodata for final link

2017-07-31 Thread David Woodhouse
On Mon, 2017-07-31 at 07:15 -0600, Jan Beulich wrote:
> > > > David Woodhouse  07/31/17 1:02 PM >>>
> > On Sun, 2017-07-30 at 00:16 -0600, Jan Beulich wrote:
> > > > > > David Woodhouse  07/20/17 5:22 PM >>>
> > > > This includes stuff lke the hypercall tables which we really want
> > > > to be read-only. And they were going into .data.read-mostly.
> > >
> > > Yes, we'd like them to be read-only, but what if EFI properly assigned r/o
> > > permissions to the .rodata section when loading xen.efi? We'd then be
> > > unable to apply relocations when switching from 1:1 to virtual mappings
> > > (see efi_arch_relocate_image()).
> > 
> > FWIW it does look like TianoCore has gained the ability to mark
> > sections as read-only, in January of this year:
> > https://github.com/tianocore/edk2/commit/d0e92aad46
> > 
> > It doesn't actually seem to be complete — even with subsequent fixes
> > since that commit, it doesn't look like it catches the case of data
> > sections without EFI_IMAGE_SCN_MEM_WRITE, such as .rodata. 
> > 
> > And even if/when that gets fixed you'll note that the protection is
> > deliberately torn down in ExitBootServices(), specifically for the case
> > you're concerned about below — because you'll need to do the
> > relocations.
>
> As said in an earlier reply, a first pass over relocations is being done
> long before the call to ExitBootServices(). A minimal adjustment to
> efi_arch_relocate_image() will be needed anyway, afaict.

Ah, right. I think more "implied" than "said" but I understand now. :)

At least, I understand what you're saying... I have no bloody clue
what's actually going on though.

There is a first call to efi_arch_relocate_image(0) before the
ExitBootServices() call. However I'm missing something because I can't
see how that call achieves anything *other* than to trigger the fault
we're concerned about.

There are three types of relocations — PE_BASE_RELOC_ABS,
PE_BASE_RELOC_HIGHLOW, and PE_BASE_RELOC_DIR64.

The first (ABS) doesn't seem to do anything, ever. And is never emitted
by mkrelocs.c. 

The second (HIGHLOW) does nothing if (!delta).

The third (DIR64) simply adds 'delta' to the target address. We could
potentially stop it faulting on that pointless '*addr += 0' by doing
this...

--- a/xen/arch/x86/efi/efi-boot.h
+++ b/xen/arch/x86/efi/efi-boot.h
@@ -87,7 +87,8 @@ static void __init efi_arch_relocate_image(unsigned long 
delta)
 case PE_BASE_RELOC_DIR64:
 if ( in_page_tables(addr) )
 blexit(L"Unexpected relocation type");
-*(u64 *)addr += delta;
+if ( delta )
+*(u64 *)addr += delta;
 break;
 default:
 blexit(L"Unsupported relocation type");


... but then again, if the whole function is really doing nothing at
all when invoked with delta==0 then perhaps it would just be easier to
remove the first pass altogether. I feel sure I'm missing something,
but what? Is it still supposed to be adding xen_phys_start in the
PE_BASE_RELOC_HIGHLOW case even when delta==0? Because it isn't...

Either way, this is still broken before my patch though, right?
Especially if UEFI learns to do it for non-executable sections, but
AFAICT even before that.

These are the sections I see the PE section headers of a local build:

 Name     Characteristics   Relocations
.text    0xe0d00020 (WRX)    ✓
.rodata  0x40600040 ( R )    ✓
.buildid 0x40300040 ( R )
.init.te 0x60500020 ( RX)    ✓
.init.da 0xc0d00040 (WR )    ✓
.data.re 0xc0800040 (WR )    ✓
.data    0xc0d00040 (WR )    ✓
.bss     0xc180 (WR )
.reloc   0x42300040 ( R )
.pad     0xc0300080 (WR )

So there are (again, before my patch) relocations in .init.da(ta) and
.rodata sections which UEFI *might* start marking read-only, and also
in .init.te(xt) which is R+X and could be marked read-only today.

And the .init.te(xt) relocations include PE_BASE_RELOC_DIR64
relocations, which *would* cause a fault in the !delta case.

(All the relocations in .rodata both before and after my patch are also
PE_BASE_RELOC_DIR64, FWIW)




smime.p7s
Description: S/MIME cryptographic signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] libxl: use xen-blkback for 'vbd' disk types by default

2017-07-31 Thread Wei Liu
On Fri, Jul 28, 2017 at 06:42:13PM +0200, Marek Marczykowski-Górecki wrote:
> This will allow later to make HVM domain without qemu in dom0 (in
> addition to the one in stubdomain).
> 
> Signed-off-by: Marek Marczykowski-Górecki 
> 
> ---
> This is extracted from v1 of "libxl: do not start dom0 qemu for
> stubdomain when not needed".
> 
> Signed-off-by: Marek Marczykowski-Górecki 
> ---
>  tools/libxl/libxl_disk.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libxl/libxl_disk.c b/tools/libxl/libxl_disk.c
> index 63de75c..7842d9b 100644
> --- a/tools/libxl/libxl_disk.c
> +++ b/tools/libxl/libxl_disk.c
> @@ -56,10 +56,12 @@ static void disk_eject_xswatch_callback(libxl__egc *egc, 
> libxl__ev_xswatch *w,
>  "/local/domain/%d/backend/%" TOSTRING(BACKEND_STRING_SIZE)
> "[a-z]/%*d/%*d",
> >backend_domid, backend_type);
> -if (!strcmp(backend_type, "tap") || !strcmp(backend_type, "vbd")) {
> +if (!strcmp(backend_type, "tap")) {
>  disk->backend = LIBXL_DISK_BACKEND_TAP;
>  } else if (!strcmp(backend_type, "qdisk")) {
>  disk->backend = LIBXL_DISK_BACKEND_QDISK;
> +} else if (!strcmp(backend_type, "vbd")) {
> +disk->backend = LIBXL_DISK_BACKEND_PHY;

Wait, it only occurred to me until now this patch is changing
disk_eject_xswatch_callback.

Is this a bug fix? How is it possible for the backend_type to be "vbd"
when there isn't such thing in libxl_types.idl?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] PVH VCPU hotplug support v7?

2017-07-31 Thread Ross Lagerwall

On 07/31/2017 03:29 PM, Boris Ostrovsky wrote:

On 07/31/2017 10:12 AM, Andrew Cooper wrote:

On 31/07/17 14:55, Boris Ostrovsky wrote:

On 07/31/2017 09:20 AM, Ross Lagerwall wrote:

Hi Boris,

I've modified your PVH VCPU hotplug support v6 patch series [1] to
support HVM guests running _with_ a device model for XenServer's
purposes. This is useful because it moves the vCPU hotplug handling
out of QEMU and allows it to mostly be shared with PVH. It will also
allow unplugging vCPUs (libxl currently only does cpu-add for upstream
qemu).

Are you still planning on continuing with that patch series since your
commit to Linux [2]?

This series has been put on hold until we figure out what to do with
hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI
CPU namespace --- on hotplug event dom0 has to somehow figure out
whether the event was due to (dis)appearance of a physical or virtual CPU).

I don't think this has been dealt with yet (copying Roger).

 From the point of view of unblocking several pieces of work, it would be
fine for this logic to be behind an emulation flag, just like LAPIC/etc.


The (I think) last message discussing this series was

https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00143.html

Are you suggesting extracting pieces that would move hotplug support for
HVM guests from qemu to hypervisor/toolstack but leave all PVH-specific
code out? (The feature flag is part of this series ---
https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00059.html)



I think(?) Andrew was suggesting to have an emulation flag such that 
hotplug support is moved into the hypervisor for HVM guests _and_ PVH 
guests except for PVH dom0.


I don't know what work this unblocks that he was referring to.

--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/3] Docs: consolidate release related documents

2017-07-31 Thread Wei Liu
On Mon, Jul 31, 2017 at 02:51:21PM +0100, Ian Jackson wrote:
> Wei Liu writes ("[PATCH 0/3] Docs: consolidate release related documents"):
> > Wei Liu (3):
> >   docs: consolidate release related documents
> >   docs: add xen-release-management.pandoc
> >   docs: hook up process/ to build system
> 
> FWIW,
> 
> Acked-by: Ian Jackson 
> 
> However, AFAICT the mean reason this doesn't process
> release-checklist.txt and branching-checklist.txt into files on the
> website is that you only have it process pandocs.
> 

Correct, that's deliberate.

> But I don't think those files are ever going to want to be made into
> web pages.  Whereas it is possible that we will grow other process
> documents in .txt form which do, and no provision is made for them.
> 
> I wonder if it would be better to rename those to files to a different
> directory.  The reason I invented misc/ was precisely to avoid people
> (users and developers) tripping over them thinking they might be
> useful...
> 

Lars (?) said there should be a "process" directory, hence I put
everything that's relevant here.

I'm open to suggestions about directory structure. Maybe having a
technician-checklist directory under process?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 4/8] mm: Scrub memory from idle loop

2017-07-31 Thread Jan Beulich
>>> Boris Ostrovsky  07/23/17 4:14 AM >>>
>>> @@ -1050,17 +1120,42 @@ static void scrub_free_pages(unsigned int node)
>>> -if ( node_need_scrub[node] == 0 )
>>> -return;
>>> +if ( preempt || (node_need_scrub[node] == 0) )
>>> +goto out;
>>>   }
>>>   } while ( order-- != 0 );
>>>   }
>>> +
>>> + out:
>>> +spin_unlock(_lock);
>>> +node_clear(node, node_scrubbing);
>>> +return softirq_pending(cpu) || (node_to_scrub(false) != NUMA_NO_NODE);
>> 
>> While I can see why you use it here, the softirq_pending() looks sort of
>> misplaced: While invoking it twice in the caller will look a little odd too,
>> I still think that's where the check belongs.
>
>
>scrub_free_pages is called from idle loop as
>
>else if ( !softirq_pending(cpu) && !scrub_free_pages() )
>pm_idle();
>
>so softirq_pending() is unnecessary here.
>
>(Not sure why you are saying it would be invoked twice)

That was sort of implicit - the caller would want to become


else if ( !softirq_pending(cpu) && !scrub_free_pages() && 
!softirq_pending(cpu) )
pm_idle();

to account for the fact that a softirq may become pending while scrubbing.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-31 Thread Wei Liu
On Mon, Jul 31, 2017 at 01:09:04AM +0800, Zhongze Liu wrote:
> 
>   @cache_policyCan only appear when @role = master.
>The stage-2 cacheability/shareability attributes of the
>shared memory area. Currently, only two policies are
>supported:
>  * ARM_normal: Only applicable to ARM guests. This
>would mean Inner and Outer Write-Back
>Cacheable, and Inner Shareable.
>  * x86_normal: Only applicable to x86 HVM guests. This
>would mean Write-Back Cacheable.

This question might have been asked before, I'm sorry for not being
able to follow closely previous discussions...

Why are these opaque names chosen instead of just writing "wb" "wc" or
whatever?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 3/8] mm: Scrub pages in alloc_heap_pages() if needed

2017-07-31 Thread Jan Beulich
>>> Boris Ostrovsky  07/23/17 4:07 AM >>>
>On 06/27/2017 02:00 PM, Jan Beulich wrote:
> Boris Ostrovsky  06/22/17 8:55 PM >>>
>>> @@ -862,10 +879,19 @@ static struct page_info *alloc_heap_pages(
>>>   if ( d != NULL )
>>>   d->last_alloc_node = node;
>>>   
>>> +need_scrub = !!first_dirty_pg && !(memflags & MEMF_no_scrub);
>> 
>> No need for !! here. But I wonder whether that part of the check is really
>> useful anyway, considering the sole use ...
>> 
>>>   for ( i = 0; i < (1 << order); i++ )
>>>   {
>>>   /* Reference count must continuously be zero for free pages. */
>>> -BUG_ON(pg[i].count_info != PGC_state_free);
>>> +BUG_ON((pg[i].count_info & ~PGC_need_scrub) != PGC_state_free);
>>> +
>>> +if ( test_bit(_PGC_need_scrub, [i].count_info) )
>>> +{
>>> +if ( need_scrub )
>>> +scrub_one_page([i]);
>> 
>> ... here. If it isn't, I think the local variable isn't warranted either.
>> If you agree, the thus adjusted patch can have
>> Reviewed-by: Jan Beulich 
>> (otherwise I'll wait with it to understand the reason first).
>
>first_dirty_pg is indeed unnecessary but I think local variable is 
>useful to avoid ANDing memflags inside the loop on each iteration 
>(unless you think compiler is smart enough to realize that memflags is 
>not changing).

I don't understand: At least on x86 I'd expect the compiler to use a
single TEST if you used memflags inside the loop, whereas the local
variable would likely be a single CMP inside the loop plus setup code
outside of it.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] libxl: Implement the handler to handle unrecoverable AER errors.

2017-07-31 Thread Wei Liu
On Fri, Jul 28, 2017 at 04:56:56PM -0700, Venu Busireddy wrote:
> On 2017-07-28 17:39:52 +0100, Ian Jackson wrote:
> > Venu Busireddy writes ("[PATCH v2 1/2] libxl: Implement the handler to 
> > handle unrecoverable AER errors."):
> > > Implement the callback function to handle unrecoverable AER errors, and
> > > also the public APIs that can be used to register/unregister the handler.
> > > When an AER error occurs, the handler will forcibly remove the erring
> > > PCIe device from the guest.
> > 
> > Why is this only sometimes the right thing to do ?  On what basis
> > might a user choose ?
> 
> This is not an "only sometimes" thing. User doesn't choose it. We always
> want to watch for AER errors.
> 
> > If this is always the right thing to do then maybe we need to recast
> > this as a general "please run monitoring for this domain" call ?
> 
> What does "recast" imply? Which "call" are you referring to?
> 
> > > +int libxl_reg_aer_events_handler(libxl_ctx *, uint32_t) 
> > > LIBXL_EXTERNAL_CALLERS_ONLY;
> > > +void libxl_unreg_aer_events_handler(libxl_ctx *, uint32_t) 
> > > LIBXL_EXTERNAL_CALLERS_ONLY;
> > 
> > I think these names are very unintuitive.  They describe the
> > implementation, not the effect.
> 
> These names can be changed to anything you want. Please suggest any names
> of your choice, and I will change them. That ensures that we don't spend
> more review revisions in fine tuning those names.
> 
> > The API seems awkward.  Inside libxl, events are only processed while
> > the application is inside libxl.  So for these functions to be
> > effective, the calling application must arrange to be running the
> > libxl event loop.  This should be documented, at least.
> 
> I am afraid I don't follow you. This scheme is tested and it works. So, I
> do not follow you when you say, "...for these functions to be effective,..."

Libxl has an internal event loop. The code as-is registers a watch which
runs on the internal event loop. The event loop's event is only
processed when the process enters libxl (calls libxl functions).

Imagine a toolstack which doesn't use libxl's internal event loop -- for
example libvirt has its own event loop, or a toolstack which only calls
the register function then nothing else. Your API would not work for
those cases.

Ian, please correct me if I'm wrong.

> 
> > What happens if more than one process calls this at once ?
> 
> Each call is handled within a separate 'xl' process's context. I don't
> see a problem there. Do you have anything specific in mind?
> 

It is possible that both xl processes see its watch fires and try to
write to the same node at the same time.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6 05/12] x86/domctl: Handle ACPI access from domctl

2017-07-31 Thread Boris Ostrovsky
On 07/31/2017 10:14 AM, Ross Lagerwall wrote:
> On 01/03/2017 02:04 PM, Boris Ostrovsky wrote:
>> Signed-off-by: Boris Ostrovsky 
>> ---
>> Changes in v6:
>> * Adjustments to to patch 4 changes.
>> * Added a spinlock for VCPU map access
>> * Return an error on guest trying to write VCPU map
>>
> snip
>> -static int acpi_cpumap_access_common(struct domain *d, bool is_write,
>> - unsigned int port,
>> +static int acpi_cpumap_access_common(struct domain *d, bool
>> is_guest_access,
>> + bool is_write, unsigned int port,
>>unsigned int bytes, uint32_t
>> *val)
>>   {
>>   unsigned int first_byte = port - XEN_ACPI_CPU_MAP;
>> +int rc = X86EMUL_OKAY;
>>
>>   BUILD_BUG_ON(XEN_ACPI_CPU_MAP + XEN_ACPI_CPU_MAP_LEN
>>> ACPI_GPE0_BLK_ADDRESS_V1);
>>
>> +spin_lock(>arch.hvm_domain.acpi_lock);
>> +
>>   if ( !is_write )
>>   {
>>   uint32_t mask = (bytes < 4) ? ~0U << (bytes * 8) : 0;
>> @@ -32,23 +37,61 @@ static int acpi_cpumap_access_common(struct
>> domain *d, bool is_write,
>>   memcpy(val, (uint8_t *)d->avail_vcpus + first_byte,
>>  min(bytes, ((d->max_vcpus + 7) / 8) - first_byte));
>>   }
>> +else if ( !is_guest_access )
>> +memcpy((uint8_t *)d->avail_vcpus + first_byte, val,
>> +   min(bytes, ((d->max_vcpus + 7) / 8) - first_byte));
>>   else
>>   /* Guests do not write CPU map */
>> -return X86EMUL_UNHANDLEABLE;
>> +rc = X86EMUL_UNHANDLEABLE;
>>
>> -return X86EMUL_OKAY;
>> +spin_unlock(>arch.hvm_domain.acpi_lock);
>> +
>> +return rc;
>>   }
>>
>>   int hvm_acpi_domctl_access(struct domain *d,
>>  const struct xen_domctl_acpi_access
>> *access)
>>   {
>> -return -ENOSYS;
>> +unsigned int bytes, i;
>> +uint32_t val = 0;
>> +uint8_t *ptr = (uint8_t *)
>> +int rc;
>> +bool is_write = (access->rw == XEN_DOMCTL_ACPI_WRITE) ? true :
>> false;
>> +
>> +if ( has_acpi_dm_ff(d) )
>> +return -EOPNOTSUPP;
>> +
>> +if ( access->space_id != XEN_ACPI_SYSTEM_IO )
>> +return -EINVAL;
>> +
>> +if ( !((access->address >= XEN_ACPI_CPU_MAP) &&
>> +   (access->address < XEN_ACPI_CPU_MAP +
>> XEN_ACPI_CPU_MAP_LEN)) )
>> +return -ENODEV;
>> +
>> +for ( i = 0; i < access->width; i += sizeof(val) )
>> +{
>> +bytes = (access->width - i > sizeof(val)) ?
>> +sizeof(val) : access->width - i;
>> +
>> +if ( is_write && copy_from_guest_offset(ptr, access->val, i,
>> bytes) )
>> +return -EFAULT;
>> +
>> +rc = acpi_cpumap_access_common(d, false, is_write,
>> +   access->address, bytes, );
>
> While I'm looking at this code...
> This doesn't work if access->width > sizeof(val) (4 bytes). The same
> value (access->address) is always passed into
> acpi_cpumap_access_common for 'port' and this is used as an offset
> into the avail_cpus array. So the offset is unchanged and only the
> first 4 bytes of avail_cpus ever gets changed.

I'd have to go back to the series (haven't looked at it since it was
posted back in January) but I think I enforce somewhere size of the
access to fit into 4 bytes. And if not then you are right.


-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 1/8] mm: Place unscrubbed pages at the end of pagelist

2017-07-31 Thread Jan Beulich
>>> Boris Ostrovsky  07/23/17 4:01 AM >>>
>On 06/27/2017 01:06 PM, Jan Beulich wrote:
> Boris Ostrovsky  06/22/17 8:55 PM >>>
>>> +{
>>> +if ( pg < first_dirty_pg )
>>> +first_dirty = (first_dirty_pg - pg) / sizeof(*pg);
>> 
>> Pointer subtraction already includes the involved division. 
>
>
>Yes, this was a mistake.
>
>> Otoh I wonder
>> if you couldn't get away without pointer comparison/subtraction here
>> altogether.
>
>
>Without comparison I can only assume that first_dirty is zero (i.e. the 
>whole buddy is potentially dirty). Is there something else I could do?

I was thinking of tracking indexes instead of pointers. But maybe that
would more hamper readability of the overall result than help it.
 
>>> @@ -892,8 +934,25 @@ static int reserve_offlined_page(struct page_info 
>>> *head)
>>>   {
>>>   merge:
>>>   /* We don't consider merging outside the head_order. */
>>> -page_list_add_tail(cur_head, (node, zone, cur_order));
>>> -PFN_ORDER(cur_head) = cur_order;
>>> +
>>> +/* See if any of the pages indeed need scrubbing. */
>>> +if ( first_dirty_pg && (cur_head + (1 << cur_order) > 
>>> first_dirty_pg) )
>>> +{
>>> +if ( cur_head < first_dirty_pg )
>>> +i = (first_dirty_pg - cur_head) / 
>>> sizeof(*cur_head);
>
>I assume the same comment as above applies here.

Of course. I usually avoid repeating the same comment, except maybe
when reviewing patches of first time contributors.

>>> +else
>>> +i = 0;
>>> +
>>> +for ( ; i < (1 << cur_order); i++ )
>>> +if ( test_bit(_PGC_need_scrub,
>>> +  _head[i].count_info) )
>>> +{
>>> +first_dirty = i;
>>> +break;
>>> +}
>> 
>> Perhaps worth having ASSERT(first_dirty != INVALID_DIRTY_IDX) here? Or are
>> there cases where ->u.free.first_dirty of a page may be wrong?
>
>
>When we merge in free_heap_pages we don't clear first_dirty of the 
>successor buddy (at some point I did have this done but you questioned 
>whether it was needed and I dropped it).

Hmm, this indeed answers my question, but doesn't help (me) understanding
whether the suggested ASSERT() could be wrong.

>>> --- a/xen/include/asm-x86/mm.h
>>> +++ b/xen/include/asm-x86/mm.h
>>> @@ -88,7 +88,15 @@ struct page_info
>>>   /* Page is on a free list: ((count_info & PGC_count_mask) == 0). 
>>> */
>>>   struct {
>>>   /* Do TLBs need flushing for safety before next page use? */
>>> -bool_t need_tlbflush;
>>> +unsigned long need_tlbflush:1;
>>> +
>>> +/*
>>> + * Index of the first *possibly* unscrubbed page in the buddy.
>>> + * One more than maximum possible order (MAX_ORDER+1) to
>> 
>> Why +1 here and hence ...
>
>Don't we have MAX_ORDER+1 orders?

So here there might be a simple misunderstanding: I understand the
parenthesized MAX_ORDER+1 to represent "maximum possible
order", i.e. excluding the "one more than", not the least because of
the ...

>> + * accommodate INVALID_DIRTY_IDX.
>> + */
>> +#define INVALID_DIRTY_IDX (-1UL & (((1UL<> +unsigned long first_dirty:MAX_ORDER + 2;

+2 here.

>> ... why +2 instead of +1? And isn't the expression INVALID_DIRTY_IDX wrongly
>> parenthesized (apart from lacking blanks around the shift operator)? I'd
>> expect you want a value with MAX_ORDER+1 set bits, i.e.
>> (1UL << (MAX_ORDER + 1)) - 1. ANDing with -1UL seems quite pointless too.
>
>Yes to parentheses and AND. Should be (1UL << (MAX_ORDER + 2)) - 1

I.e. I would still expect it to be (1UL << (MAX_ORDER + 1)) - 1
here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 00/14] arm/mem_access: Walk guest page tables in SW if mem_access is active

2017-07-31 Thread Julien Grall


On 18/07/17 13:24, Sergej Proskurin wrote:
> Hi all,

Hi,

> 
> The function p2m_mem_access_check_and_get_page is called from the function
> get_page_from_gva if mem_access is active and the hardware-aided translation 
> of
> the given guest virtual address (gva) into machine address fails. That is, if
> the stage-2 translation tables constrain access to the guests's page tables,
> hardware-assisted translation will fail. The idea of the function
> p2m_mem_access_check_and_get_page is thus to translate the given gva and check
> the requested access rights in software. However, as the current 
> implementation
> of p2m_mem_access_check_and_get_page makes use of the hardware-aided gva to 
> ipa
> translation, the translation might also fail because of reasons stated above
> and will become equally relevant for the altp2m implementation on ARM.  As
> such, we provide a software guest translation table walk to address the above
> mentioned issue.
> 
> The current version of the implementation supports translation of both the
> short-descriptor as well as the long-descriptor translation table format on
> ARMv7 and ARMv8 (AArch32/AArch64).
> 
> This revised version incorporates the comments of the previous patch series. 
> In
> this patch version we refine the definition of PAGE_SIZE_GRAN and
> PAGE_MASK_GRAN. In particular, we use PAGE_SIZE_GRAN to define PAGE_MASK_GRAN
> and thus avoid these defines to have a differing type. We also changed the
> previously introduced macro BITS_PER_LONG_LONG to BITS_PER_LLONG. Further
> changes comprise minor adjustments in comments and renaming of macros and
> function parameters. Some additional changes comprising code readability and
> correct type usage have been made and stated in the individual commits.
> 
> The following patch series can be found on Github[0].

I tried this series today with the change [1] in Xen to check the translation
is valid. However, I got a failure when booting non-LPAE arm32 Dom0:

(XEN) Loading kernel from boot module @ 80008000
(XEN) Allocating 1:1 mappings totalling 512MB for dom0:
(XEN) BANK[0] 0x00a000-0x00c000 (512MB)
(XEN) Grant table range: 0x00ffe0-0x00ffe6a000
(XEN) Loading zImage from 80008000 to a780-a7f50e28
(XEN) Allocating PPI 16 for event channel interrupt
(XEN) Loading dom0 DTB to 0xa800-0xa8001f8e
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) guest_walk_tables: gva 0xffeff018 pipa 0x1c090018
(XEN) access_guest_memory_by_ipa: gpa 0xa0207ff8
(XEN) access_guest_memory_by_ipa: gpa 0xa13aebfc
(XEN) d0: guestcopy: failed to get table entry.
(XEN) Xen BUG at traps.c:2737
(XEN) [ Xen-4.10-unstable  arm32  debug=y   Not tainted ]
(XEN) CPU:0
(XEN) PC: 00264dc0 do_trap_guest_sync+0x161c/0x1804
(XEN) CPSR:   a05a MODE:Hypervisor
(XEN)  R0: ffea R1:  R2:  R3: 004a
(XEN)  R4: 93830007 R5: 47fcff58 R6: 93830007 R7: 0007
(XEN)  R8: 1c09 R9:  R10: R11:47fcff54 R12:ffea
(XEN) HYP: SP: 47fcfee4 LR: 00258dec
(XEN) 
(XEN)   VTCR_EL2: 80003558
(XEN)  VTTBR_EL2: 00010008f3ffc000
(XEN) 
(XEN)  SCTLR_EL2: 30cd187f
(XEN)HCR_EL2: 0038663f
(XEN)  TTBR0_EL2: fff02000
(XEN) 
(XEN)ESR_EL2: 
(XEN)  HPFAR_EL2: 001c0900
(XEN)  HDFAR: ffeff018
(XEN)  HIFAR: 
(XEN) 
(XEN) Xen stack trace from sp=47fcfee4:
(XEN) 47fcff34 00256008 47fcfefc 47fcfefc 20da 0004 47fd48f4
(XEN)002d5ef0 0004 002d1f00 0004  002d1f00 c163f740 93830007
(XEN)ffeff018 1c090018  47fcff44 c15e70ac 005b c15e70ac c074400c
(XEN)0031  c0743ff8 47fcff58 00268ce0 c15e70ac 005b 0031
(XEN)ffeff000 c15e70ac 005b c15e70ac c074400c 0031  c0743ff8
(XEN) 001f   c074401c 21d3 93830007 
(XEN)c161cac0 c161cac0 c1501de0 c0735640 c161cacc c161cacc c161cad8 c161cad8
(XEN)     c161cae4 c161cae4 41d3
(XEN)     dfdfdfcf cfdfdfdf
(XEN) Xen call trace:
(XEN)[<00264dc0>] do_trap_guest_sync+0x161c/0x1804 (PC)
(XEN)[<00258dec>] access_guest_memory_by_ipa+0x25c/0x284 (LR)
(XEN)[<00268ce0>] entry.o#return_from_trap+0/0x4
(XEN) 
(XEN) 
(XEN) 
(XEN) Panic on CPU 0:
(XEN) Xen BUG at traps.c:2737
(XEN) 
(XEN) 
(XEN) Reboot in five seconds...

The IPA 0xa13aebfc is not valid for the domain.

Cheers,

[1]
diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index 4ee07fcea3..89c5ebf3cf 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -139,6 +139,8 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t 
gpa, void *buf,
 return -EINVAL;
 }
 
+printk("%s: gpa 0x%llx\n", __FUNCTION__, gpa);
+
 

Re: [Xen-devel] [PATCH v2 1/2] libxl: Implement the handler to handle unrecoverable AER errors.

2017-07-31 Thread Wei Liu
On Fri, Jul 28, 2017 at 10:15:40AM -0700, Venu Busireddy wrote:
> On 2017-07-28 16:58:13 +0100, Wei Liu wrote:
> > On Wed, Jul 26, 2017 at 07:16:38PM -0500, Venu Busireddy wrote:
> > > Implement the callback function to handle unrecoverable AER errors, and
> > > also the public APIs that can be used to register/unregister the handler.
> > > When an AER error occurs, the handler will forcibly remove the erring
> > > PCIe device from the guest.
> > > 
> > > Signed-off-by: Venu Busireddy 
> > > ---
> > >  tools/libxl/libxl_event.h |  2 ++
> > >  tools/libxl/libxl_pci.c   | 85 
> > > +++
> > >  2 files changed, 87 insertions(+)
> > > 
> > 
> > Please also add a LIBXL_HAVE macro to libxl.h. There are plenty of
> > examples there.
> 
> I assume you meant, for example, something like:
> 
> /* LIBXL_HAVE_UNREG_AER_EVENTS_HANDLER
>  *
>  * If it is defined, libxl has a library function called
>  * libxl_unreg_aer_events_handler.
>  */
> #define LIBXL_HAVE_UNREG_AER_EVENTS_HANDLER 1
> 
> If so, I will add them in the next revision.
> 
> > > +
> > > +int libxl_reg_aer_events_handler(libxl_ctx *ctx, uint32_t domid)
> > > +{
> > > +int rc = 0;
> > > +char *be_path;
> > > +GC_INIT(ctx);
> > > +
> > > +aer_watch.domid = domid;
> > > +be_path = 
> > > GCSPRINTF("/local/domain/0/backend/pci/%u/0/aerFailedSBDF", domid);
> > 
> > I think the best thing to do is you get the domid using
> > libxl__get_domid. Try not to hard-code 0.
> > 
> > Same for your callback function. And there are quite a few 0's that I'm
> > not sure what they stand for.
> 
> All those 0's are saying dom0's domid. Isn't dom0's domid always 0?
> If that is not the case, I can use libxl__get_domid(). Please let me
> know.

Don't assume the code will always run in dom0, so use get_domid is best.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 06/13] libxl: change p9 to use generec add function

2017-07-31 Thread Wei Liu
On Sun, Jul 30, 2017 at 09:42:09PM +0300, Oleksandr Grytsov wrote:
> On Fri, Jul 28, 2017 at 7:23 PM, Wei Liu  wrote:
> > On Fri, Jul 28, 2017 at 03:11:34PM +0100, Wei Liu wrote:
> >> On Tue, Jul 18, 2017 at 05:25:23PM +0300, Oleksandr Grytsov wrote:
> >> [...]
> >> >  /* Waits for the passed device to reach state XenbusStateInitWait.
> >> >   * This is not really useful by itself, but is important when executing
> >> >   * hotplug scripts, since we need to be sure the device is in the 
> >> > correct
> >> > @@ -3565,6 +3559,7 @@ extern const struct libxl_device_type 
> >> > libxl__usbctrl_devtype;
> >> >  extern const struct libxl_device_type libxl__usbdev_devtype;
> >> >  extern const struct libxl_device_type libxl__pcidev_devtype;
> >> >  extern const struct libxl_device_type libxl__vdispl_devtype;
> >> > +extern const struct libxl_device_type libxl__p9_devtype;
> >> >
> >> >  extern const struct libxl_device_type *device_type_tbl[];
> >> >
> >> > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >> > index 25563cf..96dbaed 100644
> >> > --- a/tools/libxl/libxl_types.idl
> >> > +++ b/tools/libxl/libxl_types.idl
> >> > @@ -804,7 +804,7 @@ libxl_domain_config = Struct("domain_config", [
> >> >  ("vfbs", Array(libxl_device_vfb, "num_vfbs")),
> >> >  ("vkbs", Array(libxl_device_vkb, "num_vkbs")),
> >> >  ("vtpms", Array(libxl_device_vtpm, "num_vtpms")),
> >> > -("p9", Array(libxl_device_p9, "num_p9s")),
> >> > +("p9s", Array(libxl_device_p9, "num_p9s")),
> >>
> >> Oh, no, please don't do this. We can't change the name of the fields.
> >>
> >> There is already on irregular device type -- the PCI device. I suppose
> >> you probably need another hook somewhere. And please convert PCI devices
> >> if you can.
> >
> > OK, going through the code I think we need to come to a conclusion if we
> > want an extra callback to handle the irregular device names first
> > because that's likely to affect the code of the framework in previous
> > patch.
> 
> Actually creating new callback to handle irregular device name looks
> not so good.
> There is the pattern which all namings should follow. May be it has to
> be documented

The normal pattern is DEVTYPEs.

> somewhere. p9 was added recently we can ask the author to review this rename.

Once it is released we can't change it, of course unless we deem it
unstable. I'm two minded here. P9 was released in 4.9, which was only a
few months old.

But we definitely can't change the PCI type. It has been around since
forever. And there is provision in code to deal with that.

> From other side this rename touches only internals changes: no changes
> in config file
> or CLI interface.
> 

As said, the framework need to be ready to deal with PCI anyway.

What sort of issues do you foresee here?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 10/12] tools: implemet new get value interface suitable for all psr allocation features.

2017-07-31 Thread Wei Liu
On Thu, Jul 20, 2017 at 04:49:11PM +0800, Yi Sun wrote:
> This patch implements a new get value interface in tools suitable for all psr
> allocation features and the whole flow. It also enables MBA support in tools
> to get MBA value.

This suggests this patch can be at least broken into two?

> 
> Signed-off-by: Yi Sun 
> ---
>  tools/libxc/include/xenctrl.h |  13 +-
>  tools/libxc/xc_psr.c  |  11 +-
>  tools/libxl/libxl_psr.c   |  61 ++
>  tools/xl/xl.h |   3 +
>  tools/xl/xl_cmdtable.c|   9 +-
>  tools/xl/xl_psr.c | 275 
> ++
>  6 files changed, 236 insertions(+), 136 deletions(-)
> 
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 0b0ec31..def18f5 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2452,13 +2452,14 @@ enum xc_psr_cmt_type {
>  };
>  typedef enum xc_psr_cmt_type xc_psr_cmt_type;
>  
> -enum xc_psr_cat_type {
> +enum xc_psr_val_type {
>  XC_PSR_CAT_L3_CBM  = 1,
>  XC_PSR_CAT_L3_CBM_CODE = 2,
>  XC_PSR_CAT_L3_CBM_DATA = 3,
>  XC_PSR_CAT_L2_CBM  = 4,
> +XC_PSR_MBA_THRTL   = 5,
>  };
> -typedef enum xc_psr_cat_type xc_psr_cat_type;
> +typedef enum xc_psr_val_type xc_psr_val_type;

Changing the name of the type should be done in a separate patch.

The rest of this patch mixes renaming and functional change which is
rather difficult to review I'm afraid.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 03/12] x86: rename 'cbm_type' to 'psr_val_type' to make it general.

2017-07-31 Thread Wei Liu
On Thu, Jul 20, 2017 at 04:49:04PM +0800, Yi Sun wrote:
> This patch renames 'cbm_type' to 'psr_val_type' to make it be general.
> Then, we can reuse this for all psr allocation features.
> 
> Signed-off-by: Yi Sun 

The code LGTM.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 08/12] tools: create general interfaces to support psr allocation features.

2017-07-31 Thread Wei Liu
On Thu, Jul 20, 2017 at 04:49:09PM +0800, Yi Sun wrote:
[...]
> +
> +#ifdef LIBXL_HAVE_PSR_MBA
> +/*
> + * Function to set a domain's value. It operates on a single or multiple
> + * target(s) defined in 'target_map'. 'target_map' specifies all the sockets
> + * to be operated on.
> + */
> +int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid,
> +  libxl_psr_cbm_type type, libxl_bitmap *target_map,
> +  uint64_t val);
> +/*
> + * Function to get a domain's cbm. It operates on a single 'target'.
> + * 'target' specifies which socket to be operated on.
> + */
> +int libxl_psr_get_val(libxl_ctx *ctx, uint32_t domid,
> +  libxl_psr_cbm_type type, uint32_t target,

There is no need for target to be uint32_t right? Unsigned int should
work too?

> +  uint64_t *val);
> +/*
> + * On success, the function returns an array of elements in 'info',
> + * and the length in 'nr'.
> + */
> +int libxl_psr_get_hw_info(libxl_ctx *ctx, libxl_psr_hw_info **info,
> +  int *nr, libxl_psr_feat_type type, int lvl);
> +void libxl_psr_hw_info_list_free(libxl_psr_hw_info *list, int nr);

nr should be unsigned int.

> +#endif /* LIBXL_HAVE_PSR_MBA */
> +#endif /* LIBXL_HAVE_PSR_CAT */
>  
>  /* misc */
>  
> diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
> index f55ba1e..8319301 100644
> --- a/tools/libxl/libxl_psr.c
> +++ b/tools/libxl/libxl_psr.c
> @@ -425,6 +425,30 @@ void libxl_psr_cat_info_list_free(libxl_psr_cat_info 
> *list, int nr)
>  free(list);
>  }
>  
> +int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid,
> +  libxl_psr_cbm_type type, libxl_bitmap *target_map,
> +  uint64_t val)
> +{
> +return EXIT_FAILURE;

ERROR_FAIL here.

> +
> +libxl_psr_hw_info = Struct("psr_hw_info", [
> +("id", uint32),
> +("u", KeyedUnion(None, libxl_psr_feat_type, "type",
> +  [("cat_info", Struct(None, [
> + ("cos_max", uint32),
> + ("cbm_len", uint32),
> + ("cdp_enabled", bool),
> +   ])),
> +   ("mba_info", Struct(None, [
> + ("cos_max", uint32),
> + ("thrtl_max",   uint32),
> + ("linear",  bool),
> +   ])),
> +  ]))

If this is output only please mark it as dir=DIR_OUT.

> +])
> -- 
> 1.9.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 09/12] tools: implement the new get hw info interface suitable to all psr allocation features.

2017-07-31 Thread Wei Liu
On Thu, Jul 20, 2017 at 04:49:10PM +0800, Yi Sun wrote:
> This patch implements a new get hw info interface suitable for all psr 
> allocation
> features and the whole flow. It also enables MBA support in tools to get MBA
> HW info.
> 
> Signed-off-by: Yi Sun 
> ---
>  tools/libxc/include/xenctrl.h |  30 +++-
>  tools/libxc/xc_psr.c  |  46 +
>  tools/libxl/libxl_psr.c   | 155 
> +++---
>  tools/xl/xl_cmdtable.c|   3 +
>  tools/xl/xl_psr.c |  45 +++-
>  5 files changed, 238 insertions(+), 41 deletions(-)
> 
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 2248900..0b0ec31 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2460,6 +2460,31 @@ enum xc_psr_cat_type {
>  };
>  typedef enum xc_psr_cat_type xc_psr_cat_type;
>  
> +enum xc_psr_feat_type {
> +XC_PSR_FEAT_UNKNOWN= 0,
> +XC_PSR_FEAT_CAT_L3 = 1,
> +XC_PSR_FEAT_CAT_L2 = 2,
> +XC_PSR_FEAT_MBA= 3,

Pointless initialisers.

> +};
> +typedef enum xc_psr_feat_type xc_psr_feat_type;
> +
> +struct xc_psr_hw_info {
> +union {
> +struct {
> +uint32_t cos_max;
> +uint32_t cbm_len;
> +bool cdp_enabled;
> +} xc_cat_info;
> +
> +struct {
> +uint32_t cos_max;
> +uint32_t thrtl_max;
> +bool linear;
> +} xc_mba_info;
> +} u;
> +};
> +typedef struct xc_psr_hw_info xc_psr_hw_info;
[...]
> index 8319301..43b84b6 100644
> --- a/tools/libxl/libxl_psr.c
> +++ b/tools/libxl/libxl_psr.c
> @@ -361,47 +361,49 @@ int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t 
> domid,
>  return rc;
>  }
>  
> +static inline int libxl_psr_hw_info_to_libxl_psr_cat_info(

libxl__ -> two underscores for internal functions.

> +  libxl_psr_feat_type type, libxl_psr_hw_info *hw_info,
> +  libxl_psr_cat_info *cat_info)
> +{
> +if (type != LIBXL_PSR_FEAT_TYPE_CAT_INFO)
> +return -1;

ERROR_INVAL;

> +
> +static inline int libxc__psr_hw_info_to_libxl_psr_hw_info(
> +  libxl_psr_feat_type type, xc_psr_hw_info *xc_hw_info,
> +  libxl_psr_hw_info *xl_hw_info)
> +{
> +switch (type) {
> +case LIBXL_PSR_FEAT_TYPE_CAT_INFO:
> +xl_hw_info->u.cat_info.cos_max = xc_hw_info->u.xc_cat_info.cos_max;
> +xl_hw_info->u.cat_info.cbm_len = xc_hw_info->u.xc_cat_info.cbm_len;
> +xl_hw_info->u.cat_info.cdp_enabled =
> +
> xc_hw_info->u.xc_cat_info.cdp_enabled;
> +break;
> +case LIBXL_PSR_FEAT_TYPE_MBA_INFO:
> +xl_hw_info->u.mba_info.cos_max = xc_hw_info->u.xc_mba_info.cos_max;
> +xl_hw_info->u.mba_info.thrtl_max = 
> xc_hw_info->u.xc_mba_info.thrtl_max;
> +xl_hw_info->u.mba_info.linear = xc_hw_info->u.xc_mba_info.linear;
> +break;
> +default:
> +return -1;

ERROR_INVAL

> +}
> +
> +return 0;
> +}
> +
>  int libxl_psr_get_hw_info(libxl_ctx *ctx, libxl_psr_hw_info **info,
>int *nr, libxl_psr_feat_type type, int lvl)
>  {
> -return EXIT_FAILURE;
> +GC_INIT(ctx);
> +int rc;
> +int i = 0, socketid, nr_sockets;
> +libxl_bitmap socketmap;
> +libxl_psr_hw_info *ptr;
> +xc_psr_feat_type xc_type;
> +xc_psr_hw_info hw_info;
> +
> +libxl_bitmap_init();
> +
> +if ( type == LIBXL_PSR_FEAT_TYPE_CAT_INFO && lvl != 3 && lvl != 2) {

Extraneous space.

>  
>  void libxl_psr_hw_info_list_free(libxl_psr_hw_info *list, int nr)
>  {
> +int i;

unsigned int

> +
> +for (i = 0; i < nr; i++)
> +libxl_psr_hw_info_dispose([i]);
> +free(list);
>  }
>  
>  /*
> diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
> index 2c71a9f..14a02d4 100644
> --- a/tools/xl/xl_cmdtable.c
> +++ b/tools/xl/xl_cmdtable.c
> @@ -524,6 +524,9 @@ struct cmd_spec cmd_table[] = {
>"[options]",
>"-m, --cmt   Show Cache Monitoring Technology (CMT) hardware 
> info\n"
>"-a, --cat   Show Cache Allocation Technology (CAT) hardware 
> info\n"
> +#ifdef LIBXL_HAVE_PSR_MBA
> +  "-b, --mba   Show Memory Bandwidth Allocation (MBA) hardware 
> info\n"
> +#endif

You don't need to test the macro. xl always has all features available.

Same comment applies to the rest of this patch.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 04/12] x86: implement data structure and CPU init flow for MBA.

2017-07-31 Thread Wei Liu
On Thu, Jul 20, 2017 at 04:49:05PM +0800, Yi Sun wrote:
> This patch implements main data structures of MBA.
> 
> Like CAT features, MBA HW info has cos_max which means the max cos
> registers number, and thrtl_max which means the max throttle value
> (delay value). It also has a flag to represent if the throttle
> value is linear or not.
> 
> One COS register of MBA stores a throttle value for one or more
> domains. The throttle value means the transaction time between L2
> cache and next level memory to be delayed.
> 
> This patch also implements init flow for MBA and register stub
> callback functions.
> 
> Signed-off-by: Yi Sun 
> ---
>  xen/arch/x86/psr.c  | 130 
> 
>  xen/include/asm-x86/msr-index.h |   1 +
>  xen/include/asm-x86/psr.h   |   2 +
>  3 files changed, 109 insertions(+), 24 deletions(-)
> 
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index d1d854f..d1ea5a4 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -27,13 +27,16 @@
>   * - CMT Cache Monitoring Technology
>   * - COS/CLOSClass of Service. Also mean COS registers.
>   * - COS_MAX Max number of COS for the feature (minus 1)
> + * - MBA Memory Bandwidth Allocation
>   * - MSRsMachine Specific Registers
>   * - PSR Intel Platform Shared Resource
> + * - THRTL_MAX   Max throttle value (delay value) of MBA
>   */
>  
>  #define PSR_CMT(1<<0)
>  #define PSR_CAT(1<<1)
>  #define PSR_CDP(1<<2)
> +#define PSR_MBA(1<<3)

These should really be (1u << X) -- please use unsigned value and add
spaces around "<<".

Can you please submit a patch to fix them first?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2 02/12] Rename PSR sysctl/domctl interfaces and xsm policy to make them be general.

2017-07-31 Thread Wei Liu
Normally there is no need to have period at the end of the subject line.

On Thu, Jul 20, 2017 at 04:49:03PM +0800, Yi Sun wrote:
> This patch renames PSR sysctl/domctl interfaces and related xsm policy to
> make them be general for all resource allocation features but not only
> for CAT. Then, we can resuse the interfaces for all allocation features.
> 

It would be useful to list what is changed to what. Afaict "cat" is
changed to "alloc".

This patch mostly looks fine cod-wise. The only thing I want to point
out is that you should bump the version number for both domctl and
sysctl.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] PVH VCPU hotplug support v7?

2017-07-31 Thread Boris Ostrovsky
On 07/31/2017 10:12 AM, Andrew Cooper wrote:
> On 31/07/17 14:55, Boris Ostrovsky wrote:
>> On 07/31/2017 09:20 AM, Ross Lagerwall wrote:
>>> Hi Boris,
>>>
>>> I've modified your PVH VCPU hotplug support v6 patch series [1] to
>>> support HVM guests running _with_ a device model for XenServer's
>>> purposes. This is useful because it moves the vCPU hotplug handling
>>> out of QEMU and allows it to mostly be shared with PVH. It will also
>>> allow unplugging vCPUs (libxl currently only does cpu-add for upstream
>>> qemu).
>>>
>>> Are you still planning on continuing with that patch series since your
>>> commit to Linux [2]?
>> This series has been put on hold until we figure out what to do with
>> hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI
>> CPU namespace --- on hotplug event dom0 has to somehow figure out
>> whether the event was due to (dis)appearance of a physical or virtual CPU).
>>
>> I don't think this has been dealt with yet (copying Roger).
> From the point of view of unblocking several pieces of work, it would be
> fine for this logic to be behind an emulation flag, just like LAPIC/etc.

The (I think) last message discussing this series was

https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00143.html

Are you suggesting extracting pieces that would move hotplug support for
HVM guests from qemu to hypervisor/toolstack but leave all PVH-specific
code out? (The feature flag is part of this series --- 
https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00059.html)

-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v14 16/23] x86: L2 CAT: implement CPU init flow.

2017-07-31 Thread Jan Beulich
>>> Yi Sun  07/15/17 2:49 AM >>>
>@@ -273,6 +275,12 @@ static int cat_init_feature(const struct cpuid_leaf *regs,
>struct psr_socket_info *info,
>enum psr_feat_type type)
>{
>+const char * const cat_feat_name[FEAT_TYPE_NUM] = {

Strictly speaking the blank after the star is wrong here.

>+"L3 CAT",
>+"CDP",
>+"L2 CAT",
>+};

Please use designated initializers here.

With at least this second aspect taken care of
Reviewed-by: Jan Beulich 

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v14 13/23] x86: refactor psr: CDP: implement CPU init flow.

2017-07-31 Thread Jan Beulich
>>> Yi Sun  07/15/17 2:48 AM >>>
>@@ -272,7 +312,8 @@ static int cat_init_feature(const struct cpuid_leaf *regs,
>if ( !opt_cpu_info )
>return 0;
 >
>-printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
>+printk(XENLOG_INFO "%s: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
>+   ((type == FEAT_TYPE_L3_CDP) ? "CDP" : "L3 CAT"),

Why is this not "L3 CDP" when the enumerator includes L3?

>@@ -1283,10 +1344,22 @@ static void psr_cpu_init(void)
>feat = feat_l3;
>feat_l3 = NULL;
 >
>-if ( !cat_init_feature(®s, feat, info, FEAT_TYPE_L3_CAT) )
>-feat_props[FEAT_TYPE_L3_CAT] = _cat_props;
>+if ( (regs.c & PSR_CAT_CDP_CAPABILITY) && (opt_psr & PSR_CDP) )
>+{
>+if ( !cat_init_feature(®s, feat, info, FEAT_TYPE_L3_CDP) )
>+feat_props[FEAT_TYPE_L3_CDP] = _cdp_props;
>+else
>+/* If CDP init fails, try to work as L3 CAT. */
>+goto l3_cat_init;
>+}
>else
>-feat_l3 = feat;
>+{
>+ l3_cat_init:

I'd really like to ask to re-structure this slightly so that you won't
need goto and a label here. As said before, goto-s are sort of okay
for making complicated error paths readable, but they should be
avoided in almost all other cases.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [qemu-mainline test] 112389: regressions - trouble: blocked/broken/fail/pass

2017-07-31 Thread osstest service owner
flight 112389 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112389/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 111765
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 111765
 build-arm64   2 hosts-allocate broken REGR. vs. 111765
 test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install   fail REGR. vs. 111765
 test-amd64-i386-xl-qemuu-win7-amd64 10 windows-install   fail REGR. vs. 111765
 test-amd64-amd64-xl-qemuu-win7-amd64 10 windows-install  fail REGR. vs. 111765

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl  10 debian-install fail pass in 112380
 test-armhf-armhf-xl-rtds 12 guest-startfail pass in 112380

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64   3 capture-logs  broken blocked in 111765
 build-arm64-pvops 3 capture-logs  broken blocked in 111765
 build-arm64-xsm   3 capture-logs  broken blocked in 111765
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail in 112380 like 
111765
 test-armhf-armhf-xl 13 migrate-support-check fail in 112380 never pass
 test-armhf-armhf-xl 14 saverestore-support-check fail in 112380 never pass
 test-armhf-armhf-xl-rtds13 migrate-support-check fail in 112380 never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail in 112380 never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 111765
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 111765
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 111765
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 111765
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 qemuua588c4985eff363154d65aee8607d0a4601655f7
baseline version:
 qemuu31fe1c414501047cbb91b695bdccc0068496dcf6

Last test of basis   111765  2017-07-13 10:20:16 Z   18 days
Failing since111790  2017-07-14 04:20:46 Z   17 days   25 attempts
Testing same since   112366  2017-07-28 18:48:13 Z2 days4 attempts


People who touched revisions under test:
  Alex Bennée 
  Alex Williamson 
  Alexander Graf 

Re: [Xen-devel] PVH VCPU hotplug support v7?

2017-07-31 Thread Andrew Cooper
On 31/07/17 14:55, Boris Ostrovsky wrote:
> On 07/31/2017 09:20 AM, Ross Lagerwall wrote:
>> Hi Boris,
>>
>> I've modified your PVH VCPU hotplug support v6 patch series [1] to
>> support HVM guests running _with_ a device model for XenServer's
>> purposes. This is useful because it moves the vCPU hotplug handling
>> out of QEMU and allows it to mostly be shared with PVH. It will also
>> allow unplugging vCPUs (libxl currently only does cpu-add for upstream
>> qemu).
>>
>> Are you still planning on continuing with that patch series since your
>> commit to Linux [2]?
> This series has been put on hold until we figure out what to do with
> hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI
> CPU namespace --- on hotplug event dom0 has to somehow figure out
> whether the event was due to (dis)appearance of a physical or virtual CPU).
>
> I don't think this has been dealt with yet (copying Roger).

From the point of view of unblocking several pieces of work, it would be
fine for this logic to be behind an emulation flag, just like LAPIC/etc.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6 05/12] x86/domctl: Handle ACPI access from domctl

2017-07-31 Thread Ross Lagerwall

On 01/03/2017 02:04 PM, Boris Ostrovsky wrote:

Signed-off-by: Boris Ostrovsky 
---
Changes in v6:
* Adjustments to to patch 4 changes.
* Added a spinlock for VCPU map access
* Return an error on guest trying to write VCPU map


snip

-static int acpi_cpumap_access_common(struct domain *d, bool is_write,
- unsigned int port,
+static int acpi_cpumap_access_common(struct domain *d, bool is_guest_access,
+ bool is_write, unsigned int port,
   unsigned int bytes, uint32_t *val)
  {
  unsigned int first_byte = port - XEN_ACPI_CPU_MAP;
+int rc = X86EMUL_OKAY;

  BUILD_BUG_ON(XEN_ACPI_CPU_MAP + XEN_ACPI_CPU_MAP_LEN
   > ACPI_GPE0_BLK_ADDRESS_V1);

+spin_lock(>arch.hvm_domain.acpi_lock);
+
  if ( !is_write )
  {
  uint32_t mask = (bytes < 4) ? ~0U << (bytes * 8) : 0;
@@ -32,23 +37,61 @@ static int acpi_cpumap_access_common(struct domain *d, bool 
is_write,
  memcpy(val, (uint8_t *)d->avail_vcpus + first_byte,
 min(bytes, ((d->max_vcpus + 7) / 8) - first_byte));
  }
+else if ( !is_guest_access )
+memcpy((uint8_t *)d->avail_vcpus + first_byte, val,
+   min(bytes, ((d->max_vcpus + 7) / 8) - first_byte));
  else
  /* Guests do not write CPU map */
-return X86EMUL_UNHANDLEABLE;
+rc = X86EMUL_UNHANDLEABLE;

-return X86EMUL_OKAY;
+spin_unlock(>arch.hvm_domain.acpi_lock);
+
+return rc;
  }

  int hvm_acpi_domctl_access(struct domain *d,
 const struct xen_domctl_acpi_access *access)
  {
-return -ENOSYS;
+unsigned int bytes, i;
+uint32_t val = 0;
+uint8_t *ptr = (uint8_t *)
+int rc;
+bool is_write = (access->rw == XEN_DOMCTL_ACPI_WRITE) ? true : false;
+
+if ( has_acpi_dm_ff(d) )
+return -EOPNOTSUPP;
+
+if ( access->space_id != XEN_ACPI_SYSTEM_IO )
+return -EINVAL;
+
+if ( !((access->address >= XEN_ACPI_CPU_MAP) &&
+   (access->address < XEN_ACPI_CPU_MAP + XEN_ACPI_CPU_MAP_LEN)) )
+return -ENODEV;
+
+for ( i = 0; i < access->width; i += sizeof(val) )
+{
+bytes = (access->width - i > sizeof(val)) ?
+sizeof(val) : access->width - i;
+
+if ( is_write && copy_from_guest_offset(ptr, access->val, i, bytes) )
+return -EFAULT;
+
+rc = acpi_cpumap_access_common(d, false, is_write,
+   access->address, bytes, );


While I'm looking at this code...
This doesn't work if access->width > sizeof(val) (4 bytes). The same 
value (access->address) is always passed into acpi_cpumap_access_common 
for 'port' and this is used as an offset into the avail_cpus array. So 
the offset is unchanged and only the first 4 bytes of avail_cpus ever 
gets changed.


--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v14 12/23] x86: refactor psr: L3 CAT: set value: implement write msr flow.

2017-07-31 Thread Jan Beulich
>>> Yi Sun  07/15/17 2:48 AM >>>
>static int write_psr_msrs(unsigned int socket, unsigned int cos,
>const uint32_t val[], unsigned int array_len,
>enum psr_feat_type feat_type)
>{
>-return -ENOENT;
>+int ret;
>+struct psr_socket_info *info = get_socket_info(socket);
>+struct cos_write_info data =
>+{
>+.cos = cos,
>+.feature = info->features[feat_type],
>+.props = feat_props[feat_type],
>+};
>+
>+if ( cos > info->features[feat_type]->cos_max )
>+return -EINVAL;
>+
>+/* Skip to the feature's value head. */
>+ret = skip_prior_features(_len, feat_type);
>+if ( ret < 0 )
>+return ret;
>+else
>+val += ret;

With this (again pointless) else removed,
Reviewed-by: Jan Beulich 

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v14 09/23] x86: refactor psr: L3 CAT: set value: assemble features value array.

2017-07-31 Thread Jan Beulich
>>> Yi Sun  07/15/17 2:47 AM >>>
>@@ -619,6 +710,46 @@ static int insert_val_into_array(uint32_t val[],
>enum cbm_type type,
>uint32_t new_val)
>{
>+const struct feat_node *feat;
>+const struct feat_props *props;
>+unsigned int i;
>+int ret;
>+
>+ASSERT(feat_type < FEAT_TYPE_NUM);
>+
>+ret = skip_prior_features(_len, feat_type);
>+if ( ret < 0 )
>+return ret;
>+else
>+val += ret;

Please avoid such pointless "else".

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 5/5] livepatch: Declare live patching as a supported feature

2017-07-31 Thread Jan Beulich
>>> Konrad Rzeszutek Wilk  07/26/17 9:48 PM >>>
>From: Ross Lagerwall 
>
>See docs/features/livepatch.pandoc for the details.
>
>Signed-off-by: Ross Lagerwall 
>Signed-off-by: Konrad Rzeszutek Wilk 

Reviewed-by: Jan Beulich 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 4/5] alternative/x86/arm32: Align altinstructions (and altinstr_replacement) sections.

2017-07-31 Thread Jan Beulich
>>> Konrad Rzeszutek Wilk  07/26/17 9:50 PM >>>
>On x86 the bloat-o-meter detects that with this change the file shrinks:
>add/remove: 1/0 grow/shrink: 0/2 up/down: 156/-367 (-211)
>function old new   delta
>get_page_from_gfn  - 156+156
>do_mmu_update   45784569  -9
>do_mmuext_op56045246-358
>Total: Before=3170439, After=3170228, chg -0.01%

This looks unexpected, and hence I'd like to ask for an explanation. If
anything I'd expect the image size to grow (slightly).

>--- a/xen/include/asm-x86/alternative.h
>+++ b/xen/include/asm-x86/alternative.h
>@@ -56,10 +56,12 @@ extern void alternative_instructions(void);
 >
>#define ALTERNATIVE_N(newinstr, feature, number)   \
>".pushsection .altinstructions,\"a\"\n"\
>+  ".p2align 2\n"  \

Can't this then be accompanied by dropping the (over-)alignment
done in xen.lds.S?

>ALTINSTR_ENTRY(feature, number)\
>".section .discard,\"a\",@progbits\n"  \
>DISCARD_ENTRY(number)  \
>".section .altinstr_replacement, \"ax\"\n" \
>+  ".p2align 2\n"  \

This surely isn't needed on x86.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/events: Fix interrupt lost during irq_disable and irq_enable

2017-07-31 Thread Boris Ostrovsky
On 07/29/2017 12:59 PM, Liu Shuo wrote:
> Here is a device has xen-pirq-MSI interrupt. Dom0 might lost interrupt
> during driver irq_disable/irq_enable. Here is the scenario,
>  1. irq_disable -> disable_dynirq -> mask_evtchn(irq channel)
>  2. dev interrupt raised by HW and Xen mark its evtchn as pending
>  3. irq_enable -> startup_pirq -> eoi_pirq ->
> clear_evtchn(channel of irq) -> clear pending status
>  4. consume_one_event process the irq event without pending bit assert
> which result in interrupt lost once
>  5. No HW interrupt raising anymore.
>
> Now use enable_dynirq for enable_pirq of xen_pirq_chip to remove
> eoi_pirq when irq_enable.
>
> Signed-off-by: Liu Shuo 

Reviewed-by: Boris Ostrovsky 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


  1   2   >