Re: [Xen-devel] WTH is going on with memory hotplug sysf interface

2017-03-10 Thread Yasuaki Ishimatsu



On 03/10/2017 08:58 AM, Michal Hocko wrote:

Let's CC people touching this logic. A short summary is that onlining
memory via udev is currently unusable for online_movable because blocks
are added from lower addresses while movable blocks are allowed from
last blocks. More below.

On Thu 09-03-17 13:54:00, Michal Hocko wrote:

On Tue 07-03-17 13:40:04, Igor Mammedov wrote:

On Mon, 6 Mar 2017 15:54:17 +0100
Michal Hocko  wrote:


On Fri 03-03-17 18:34:22, Igor Mammedov wrote:

[...]

in current mainline kernel it triggers following code path:

online_pages()
  ...
   if (online_type == MMOP_ONLINE_KERNEL) {
if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, _shift))
return -EINVAL;


Are you sure? I would expect MMOP_ONLINE_MOVABLE here

pretty much, reproducer is above so try and see for yourself


I will play with this...


OK so I did with -m 2G,slots=4,maxmem=4G -numa node,mem=1G -numa node,mem=1G 
which generated
[...]
[0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x-0x0009]
[0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x0010-0x3fff]
[0.00] ACPI: SRAT: Node 1 PXM 1 [mem 0x4000-0x7fff]
[0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x1-0x27fff] hotplug
[0.00] NUMA: Node 0 [mem 0x-0x0009] + [mem 
0x0010-0x3fff] -> [mem 0x-0x3fff]
[0.00] NODE_DATA(0) allocated [mem 0x3fffc000-0x3fff]
[0.00] NODE_DATA(1) allocated [mem 0x7ffdc000-0x7ffd]
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x1000-0x00ff]
[0.00]   DMA32[mem 0x0100-0x7ffd]
[0.00]   Normal   empty
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x1000-0x0009efff]
[0.00]   node   0: [mem 0x0010-0x3fff]
[0.00]   node   1: [mem 0x4000-0x7ffd]

so there is neither any normal zone nor movable one at the boot time.
Then I hotplugged 1G slot
(qemu) object_add memory-backend-ram,id=mem1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1

unfortunatelly the memory didn't show up automatically and I got
[  116.375781] acpi PNP0C80:00: Enumeration failure

so I had to probe it manually (prbably the BIOS my qemu uses doesn't
support auto probing - I haven't really dug further). Anyway the SRAT
table printed during the boot told that we should start at 0x1

# echo 0x1 > /sys/devices/system/memory/probe
# grep . /sys/devices/system/memory/memory32/valid_zones
Normal Movable

which looks reasonably right? Both Normal and Movable zones are allowed

# echo $((0x1+(128<<20))) > /sys/devices/system/memory/probe
# grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable

Huh, so our valid_zones have changed under our feet...

# echo $((0x1+2*(128<<20))) > /sys/devices/system/memory/probe
# grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal
/sys/devices/system/memory/memory34/valid_zones:Normal Movable

and again. So only the last memblock is considered movable. Let's try to
online them now.

# echo online_movable > /sys/devices/system/memory/memory34/state
# grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable Normal



I think there is no strong reason which kernel has the restriction.
By setting the restrictions, it seems to have made management of
these zone structs simple.

Thanks,
Yasuaki Ishimatsu


This would explain why onlining from the last block actually works but
to me this sounds like a completely crappy behavior. All we need to
guarantee AFAICS is that Normal and Movable zones do not overlap. I
believe there is even no real requirement about ordering of the physical
memory in Normal vs. Movable zones as long as they do not overlap. But
let's keep it simple for the start and always enforce the current status
quo that Normal zone is physically preceeding Movable zone.
Can somebody explain why we cannot have a simple rule for Normal vs.
Movable which would be:
- block [pfn, pfn+block_size] can be Normal if
  !zone_populated(MOVABLE) || pfn+block_size < 
ZONE_MOVABLE->zone_start_pfn
- block [pfn, pfn+block_size] can be Movable if
  !zone_populated(NORMAL) || ZONE_NORMAL->zone_end_pfn < pfn

I haven't fully grokked all the restrictions on the movable zone size
based on the kernel parameters (find_zone_movable_pfns_for_nodes) but
this shouldn't really make the situation really much more 

[Xen-devel] [qemu-mainline baseline-only test] 68648: regressions - trouble: blocked/broken/fail/pass

2017-03-10 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 68648 qemu-mainline real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/68648/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-libvirt-raw  3 host-install(3) broken REGR. vs. 68645
 test-armhf-armhf-xl-xsm   3 host-install(3) broken REGR. vs. 68645
 test-armhf-armhf-xl  11 guest-start   fail REGR. vs. 68645
 test-armhf-armhf-libvirt 11 guest-start   fail REGR. vs. 68645

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-raw  4 capture-logs(4)broken blocked in 68645
 test-armhf-armhf-xl-xsm   4 capture-logs(4)broken blocked in 68645
 test-amd64-i386-xl   11 guest-start  fail blocked in 68645
 test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail blocked in 68645
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 9 windows-install fail blocked in 
68645
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail   like 68645
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail like 68645

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64   2 hosts-allocate   broken never pass
 build-arm64-xsm   2 hosts-allocate   broken never pass
 build-arm64-pvops 2 hosts-allocate   broken never pass
 build-arm64   3 capture-logs broken never pass
 build-arm64-pvops 3 capture-logs broken never pass
 build-arm64-xsm   3 capture-logs broken never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-midway   12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-midway   13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass

version targeted for testing:
 qemuub64842dee42d6b24d51283e4722140b73be1e222
baseline version:
 qemuu43c227f9dd7945bb4a895f841ecdb957bd8a12da

Last test of basis68645  2017-03-08 22:18:18 Z2 days
Testing same since68648  2017-03-09 11:47:15 Z1 days1 attempts


People who touched revisions under test:
  Daniel P. Berrange 
  Fam Zheng 
  Kevin Wolf 
  Markus Armbruster 
  Peter Maydell 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm

Re: [Xen-devel] Mapping active GDT

2017-03-10 Thread Boris Ostrovsky
On 03/10/2017 09:39 PM, Boris Ostrovsky wrote:
> I am looking into GDT remap series [0] which crashes PV guests and it
> seems that the problem lies in the fact that we cannot establish new
> mapping to an already existing GDT.
>
> The mapping is created by
>
> +static inline void setup_fixmap_gdt(int cpu)
> +{
> +   __set_fixmap(get_cpu_gdt_ro_index(cpu),
> +__pa(get_cpu_gdt_rw(cpu)), PAGE_KERNEL);
> +}
>
> with get_cpu_gdt_rw(cpu) being the current GDT pointer. This results in
>
> (XEN) mm.c:2570:d94v0 Bad type (saw 5401 != exp
> 7000) for mfn 1538fb (pfn 3e809)
> (XEN) mm.c:1022:d94v0 Could not get page type PGT_writable_page
> (XEN) mm.c:1074:d94v0 Error getting mfn 1538fb (pfn 3e809) from L1 entry
> 8001538fb063 for l1e_owner=94, pg_owner=94
>
> (after a small change to xen_set_fixmap(), which I think was missing)
>
> Before I try to come up with a fix I wanted to check here to see if this
> (not being able to map active GDT) is indeed the case.

Uhm.. Nevermind. The change in xen_set_fixmap() is probably sufficient.

I was working with wrong branch ;-( Sorry for the noise.

-boris

>
> -boris
>
>
> [0]
> https://lists.xenproject.org/archives/html/xen-devel/2017-03/msg00869.html
>
>
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Mapping active GDT

2017-03-10 Thread Boris Ostrovsky
I am looking into GDT remap series [0] which crashes PV guests and it
seems that the problem lies in the fact that we cannot establish new
mapping to an already existing GDT.

The mapping is created by

+static inline void setup_fixmap_gdt(int cpu)
+{
+   __set_fixmap(get_cpu_gdt_ro_index(cpu),
+__pa(get_cpu_gdt_rw(cpu)), PAGE_KERNEL);
+}

with get_cpu_gdt_rw(cpu) being the current GDT pointer. This results in

(XEN) mm.c:2570:d94v0 Bad type (saw 5401 != exp
7000) for mfn 1538fb (pfn 3e809)
(XEN) mm.c:1022:d94v0 Could not get page type PGT_writable_page
(XEN) mm.c:1074:d94v0 Error getting mfn 1538fb (pfn 3e809) from L1 entry
8001538fb063 for l1e_owner=94, pg_owner=94

(after a small change to xen_set_fixmap(), which I think was missing)

Before I try to come up with a fix I wanted to check here to see if this
(not being able to map active GDT) is indeed the case.

-boris


[0]
https://lists.xenproject.org/archives/html/xen-devel/2017-03/msg00869.html


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/2] build/clang: fix XSM dummy policy when using clang 4.0

2017-03-10 Thread Daniel De Graaf

On 03/06/2017 07:31 AM, Roger Pau Monne wrote:

There seems to be some weird bug in clang 4.0 that prevents xsm_pmu_op from
working as expected, and vpmu.o ends up with a reference to
__xsm_action_mismatch_detected which makes the build fail:

[...]
ld-melf_x86_64_fbsd  -T xen.lds -N prelink.o  \
xen/common/symbols-dummy.o -o xen/.xen-syms.0
prelink.o: In function `xsm_default_action':
xen/include/xsm/dummy.h:80: undefined reference to 
`__xsm_action_mismatch_detected'
xen/xen/include/xsm/dummy.h:80: relocation truncated to fit: R_X86_64_PC32 
against undefined symbol `__xsm_action_mismatch_detected'
ld: xen/xen/.xen-syms.0: hidden symbol `__xsm_action_mismatch_detected' isn't 
defined

Then doing a search in the objects files:

# find xen/ -type f -name '*.o' -print0 | xargs -0 bash -c \
  'for filename; do nm "$filename" | \
  grep -q __xsm_action_mismatch_detected && echo "$filename"; done' bash
xen/arch/x86/prelink.o
xen/arch/x86/cpu/vpmu.o
xen/arch/x86/cpu/built_in.o
xen/arch/x86/built_in.o

The current patch is the only way I've found to fix this so far, by simply
moving the XSM_PRIV check into the default case in xsm_pmu_op. This also fixes
the behavior of do_xenpmu_op, which will now return -EINVAL for unknown
XENPMU_* operations, instead of -EPERM when called by a privileged domain.

Signed-off-by: Roger Pau Monné 


Acked-by: Daniel De Graaf 

This also looks like a good patch for backporting.  The best alternative I
can think of is to disable the mismatch detection in non-DEBUG builds, but
I'd rather not do that unless this problem expands more than it has.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 106590: regressions - FAIL

2017-03-10 Thread osstest service owner
flight 106590 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106590/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 105963
 test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 105963

version targeted for testing:
 ovmf bbac7a1b9bddd77b0e825c919e990b7a2f7a74c8
baseline version:
 ovmf e0307a7dad02aa8c0cd8b3b0b9edce8ddb3fef2e

Last test of basis   105963  2017-02-21 21:43:31 Z   17 days
Failing since105980  2017-02-22 10:03:53 Z   16 days   45 attempts
Testing same since   106590  2017-03-10 15:46:46 Z0 days1 attempts


People who touched revisions under test:
  Anthony PERARD 
  Ard Biesheuvel 
  Bi, Dandan 
  Brijesh Singh 
  Chao Zhang 
  Chen A Chen 
  Dandan Bi 
  edk2-devel On Behalf Of rthomaiy <[mailto:edk2-devel-boun...@lists.01.org]>
  Fu Siyuan 
  Hao Wu 
  Hegde Nagaraj P 
  Hess Chen 
  Jeff Fan 
  Jiaxin Wu 
  Jiewen Yao 
  Laszlo Ersek 
  Leo Duran 
  Paolo Bonzini 
  Qin Long 
  Richard Thomaiyar 
  Ruiyu Ni 
  Star Zeng 
  Wu Jiaxin 
  Yonghong Zhu 
  Zhang Lubo 
  Zhang, Chao B 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 3736 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [linux-linus test] 106589: regressions - FAIL

2017-03-10 Thread osstest service owner
flight 106589 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106589/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail REGR. vs. 59254
 test-armhf-armhf-xl  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-xsm  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-credit2  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-libvirt-xsm 11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-cubietruck 11 guest-start fail REGR. vs. 59254
 test-armhf-armhf-libvirt 11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-arndale  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-multivcpu 11 guest-start  fail REGR. vs. 59254

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-rtds 3 host-install(3) broken in 106581 pass in 106589
 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail in 106581 
pass in 106589
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 15 guest-localmigrate/x10 
fail pass in 106581

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 11 guest-start   fail REGR. vs. 59254
 test-amd64-amd64-xl-rtds  9 debian-installfail REGR. vs. 59254
 test-armhf-armhf-xl-vhd   9 debian-di-install   fail baseline untested
 test-armhf-armhf-libvirt-raw  9 debian-di-install   fail baseline untested
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 59254
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 59254
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 59254
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 59254

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 build-arm64-xsm   5 xen-buildfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 build-arm64   5 xen-buildfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass

version targeted for testing:
 linuxc1aa905a304e4b5e6a3fe112ec62d9c1c7b0c155
baseline version:
 linux45820c294fe1b1a9df495d57f40585ef2d069a39

Last test of basis59254  2015-07-09 04:20:48 Z  610 days
Failing since 59348  2015-07-10 04:24:05 Z  609 days  325 attempts
Testing same since   106581  2017-03-10 03:19:28 Z0 days2 attempts


8050 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  fail
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  blocked 
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops 

Re: [Xen-devel] [RFC PATCH] mm, hotplug: get rid of auto_online_blocks

2017-03-10 Thread Daniel Kiper
Hey,

On Mon, Mar 06, 2017 at 03:54:17PM +0100, Michal Hocko wrote:

[...]

> So let's discuss the current memory hotplug shortcomings and get rid of
> the crud which developed on top. I will start by splitting up the patch
> into 3 parts. Do the auto online thing from the HyperV and xen balloning
> drivers and dropping the config option and finally drop the sysfs knob.
> The last patch might be NAKed and I can live with that as long as the
> reasoning is proper and there is a general consensus on that.

If it is not too late please CC me on the patches and relevant threads.

Daniel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH net v4] xen-netback: fix race condition on XenBus disconnect

2017-03-10 Thread Igor Druzhinin
In some cases during XenBus disconnect event handling and subsequent
queue resource release there may be some TX handlers active on
other processors. Use RCU in order to synchronize with them.

Signed-off-by: Igor Druzhinin 
---
v4:
 * Use READ_ONCE instead of rcu_dereference to stop sparse complaining

v3:
 * Fix unintended semantic change in xenvif_get_ethtool_stats
 * Dropped extra code

v2:
 * Add protection for xenvif_get_ethtool_stats
 * Additional comments and fixes
---
 drivers/net/xen-netback/interface.c | 26 +-
 drivers/net/xen-netback/netback.c   |  2 +-
 drivers/net/xen-netback/xenbus.c| 20 ++--
 3 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c 
b/drivers/net/xen-netback/interface.c
index 829b26c..8397f6c 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -165,13 +165,17 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct 
net_device *dev)
 {
struct xenvif *vif = netdev_priv(dev);
struct xenvif_queue *queue = NULL;
-   unsigned int num_queues = vif->num_queues;
+   unsigned int num_queues;
u16 index;
struct xenvif_rx_cb *cb;
 
BUG_ON(skb->dev != dev);
 
-   /* Drop the packet if queues are not set up */
+   /* Drop the packet if queues are not set up.
+* This handler should be called inside an RCU read section
+* so we don't need to enter it here explicitly.
+*/
+   num_queues = READ_ONCE(vif->num_queues);
if (num_queues < 1)
goto drop;
 
@@ -222,18 +226,18 @@ static struct net_device_stats *xenvif_get_stats(struct 
net_device *dev)
 {
struct xenvif *vif = netdev_priv(dev);
struct xenvif_queue *queue = NULL;
+   unsigned int num_queues;
u64 rx_bytes = 0;
u64 rx_packets = 0;
u64 tx_bytes = 0;
u64 tx_packets = 0;
unsigned int index;
 
-   spin_lock(>lock);
-   if (vif->queues == NULL)
-   goto out;
+   rcu_read_lock();
+   num_queues = READ_ONCE(vif->num_queues);
 
/* Aggregate tx and rx stats from each queue */
-   for (index = 0; index < vif->num_queues; ++index) {
+   for (index = 0; index < num_queues; ++index) {
queue = >queues[index];
rx_bytes += queue->stats.rx_bytes;
rx_packets += queue->stats.rx_packets;
@@ -241,8 +245,7 @@ static struct net_device_stats *xenvif_get_stats(struct 
net_device *dev)
tx_packets += queue->stats.tx_packets;
}
 
-out:
-   spin_unlock(>lock);
+   rcu_read_unlock();
 
vif->dev->stats.rx_bytes = rx_bytes;
vif->dev->stats.rx_packets = rx_packets;
@@ -378,10 +381,13 @@ static void xenvif_get_ethtool_stats(struct net_device 
*dev,
 struct ethtool_stats *stats, u64 * data)
 {
struct xenvif *vif = netdev_priv(dev);
-   unsigned int num_queues = vif->num_queues;
+   unsigned int num_queues;
int i;
unsigned int queue_index;
 
+   rcu_read_lock();
+   num_queues = READ_ONCE(vif->num_queues);
+
for (i = 0; i < ARRAY_SIZE(xenvif_stats); i++) {
unsigned long accum = 0;
for (queue_index = 0; queue_index < num_queues; ++queue_index) {
@@ -390,6 +396,8 @@ static void xenvif_get_ethtool_stats(struct net_device *dev,
}
data[i] = accum;
}
+
+   rcu_read_unlock();
 }
 
 static void xenvif_get_strings(struct net_device *dev, u32 stringset, u8 * 
data)
diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index f9bcf4a..602d408 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -214,7 +214,7 @@ static void xenvif_fatal_tx_err(struct xenvif *vif)
netdev_err(vif->dev, "fatal error; disabling device\n");
vif->disabled = true;
/* Disable the vif from queue 0's kthread */
-   if (vif->queues)
+   if (vif->num_queues)
xenvif_kick_thread(>queues[0]);
 }
 
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index d2d7cd9..a56d3ea 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -495,26 +495,26 @@ static void backend_disconnect(struct backend_info *be)
struct xenvif *vif = be->vif;
 
if (vif) {
+   unsigned int num_queues = vif->num_queues;
unsigned int queue_index;
-   struct xenvif_queue *queues;
 
xen_unregister_watchers(vif);
 #ifdef CONFIG_DEBUG_FS
xenvif_debugfs_delif(vif);
 #endif /* CONFIG_DEBUG_FS */
xenvif_disconnect_data(vif);
-   for (queue_index = 0;
-queue_index < vif->num_queues;
-++queue_index)
- 

[Xen-devel] [linux-next test] 106587: regressions - FAIL

2017-03-10 Thread osstest service owner
flight 106587 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106587/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-i386 16 guest-localmigrate/x10 fail REGR. vs. 106570
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail 
REGR. vs. 106570
 test-amd64-i386-xl-qemuu-ovmf-amd64 15 guest-localmigrate/x10 fail REGR. vs. 
106570
 test-amd64-amd64-qemuu-nested-amd 17 capture-logs/l1(17) fail REGR. vs. 106570
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail REGR. vs. 
106570
 test-amd64-i386-xl-qemuu-winxpsp3  9 windows-install fail REGR. vs. 106570
 test-amd64-amd64-xl-qemuu-winxpsp3  9 windows-installfail REGR. vs. 106570
 test-amd64-amd64-xl-qemut-winxpsp3  9 windows-installfail REGR. vs. 106570
 test-amd64-i386-xl-qemut-win7-amd64  9 windows-install   fail REGR. vs. 106570
 test-amd64-i386-xl-qemut-winxpsp3  9 windows-install fail REGR. vs. 106570
 test-amd64-amd64-xl-qemut-win7-amd64  9 windows-install  fail REGR. vs. 106570
 test-amd64-i386-xl-qemuu-win7-amd64  9 windows-install   fail REGR. vs. 106570
 test-amd64-amd64-xl-qemuu-win7-amd64  9 windows-install  fail REGR. vs. 106570

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-qemuu-nested-intel 17 capture-logs/l1(17) fail blocked in 
106570
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail like 106570
 test-armhf-armhf-xl-rtds 11 guest-start  fail  like 106570
 test-armhf-armhf-xl-xsm  11 guest-start  fail  like 106570
 test-armhf-armhf-xl-credit2  11 guest-start  fail  like 106570
 test-armhf-armhf-libvirt 11 guest-start  fail  like 106570
 test-armhf-armhf-xl  11 guest-start  fail  like 106570
 test-armhf-armhf-xl-multivcpu 11 guest-start  fail like 106570
 test-armhf-armhf-libvirt-xsm 11 guest-start  fail  like 106570
 test-armhf-armhf-xl-cubietruck 11 guest-start fail like 106570
 test-armhf-armhf-xl-arndale  11 guest-start  fail  like 106570
 test-armhf-armhf-xl-vhd   9 debian-di-installfail  like 106570
 test-armhf-armhf-libvirt-raw  9 debian-di-installfail  like 106570

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64   5 xen-buildfail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 build-arm64-xsm   5 xen-buildfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass

version targeted for testing:
 linux5be4921c9958ec02a67506bd6f7a52fce663c201
baseline version:
 linuxea6200e84182989a3cce9687cf79a23ac44ec4db

Last test of basis  (not found) 
Failing since 0  1970-01-01 00:00:00 Z 17235 days
Testing same since   106587  2017-03-10 09:20:19 Z0 days1 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  fail
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  

[Xen-devel] [PATCH v2] xen: don't save/restore the physmap on VM save/restore

2017-03-10 Thread Igor Druzhinin
Saving/restoring the physmap to/from xenstore was introduced to
QEMU majorly in order to cover up the VRAM region restore issue.
The sequence of restore operations implies that we should know
the effective guest VRAM address *before* we have the VRAM region
restored (which happens later). Unfortunately, in Xen environment
VRAM memory does actually belong to a guest - not QEMU itself -
which means the position of this region is unknown beforehand and
can't be mapped into QEMU address space immediately.

Previously, recreating xenstore keys, holding the physmap, by the
toolstack helped to get this information in place at the right
moment ready to be consumed by QEMU to map the region properly.

The extraneous complexity of having those keys transferred by the
toolstack and unnecessary redundancy prompted us to propose a
solution which doesn't require any extra data in xenstore. The idea
is to defer the VRAM region mapping till the point we actually know
the effective address and able to map it. To that end, we initially
only register the pointer to the framebuffer without actual mapping.
Then, during the memory region restore phase, we perform the mapping
of the known address and update the VRAM region metadata (including
previously registered pointer) accordingly.

Signed-off-by: Igor Druzhinin 
---
v2:
* Fix some building and coding style issues
---
 exec.c   |   3 ++
 hw/display/vga.c |   2 +-
 include/hw/xen/xen.h |   2 +-
 xen-hvm-stub.c   |   2 +-
 xen-hvm.c| 114 ---
 5 files changed, 33 insertions(+), 90 deletions(-)

diff --git a/exec.c b/exec.c
index aabb035..5f2809e 100644
--- a/exec.c
+++ b/exec.c
@@ -2008,6 +2008,9 @@ void *qemu_map_ram_ptr(RAMBlock *ram_block, ram_addr_t 
addr)
 }
 
 block->host = xen_map_cache(block->offset, block->max_length, 1);
+if (block->host == NULL) {
+return NULL;
+}
 }
 return ramblock_ptr(block, addr);
 }
diff --git a/hw/display/vga.c b/hw/display/vga.c
index 69c3e1d..be554c2 100644
--- a/hw/display/vga.c
+++ b/hw/display/vga.c
@@ -2163,7 +2163,7 @@ void vga_common_init(VGACommonState *s, Object *obj, bool 
global_vmstate)
 memory_region_init_ram(>vram, obj, "vga.vram", s->vram_size,
_fatal);
 vmstate_register_ram(>vram, global_vmstate ? NULL : DEVICE(obj));
-xen_register_framebuffer(>vram);
+xen_register_framebuffer(>vram, >vram_ptr);
 s->vram_ptr = memory_region_get_ram_ptr(>vram);
 s->get_bpp = vga_get_bpp;
 s->get_offsets = vga_get_offsets;
diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index 09c2ce5..3831843 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -45,6 +45,6 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
struct MemoryRegion *mr, Error **errp);
 void xen_modified_memory(ram_addr_t start, ram_addr_t length);
 
-void xen_register_framebuffer(struct MemoryRegion *mr);
+void xen_register_framebuffer(struct MemoryRegion *mr, uint8_t **ptr);
 
 #endif /* QEMU_HW_XEN_H */
diff --git a/xen-hvm-stub.c b/xen-hvm-stub.c
index c500325..c89065e 100644
--- a/xen-hvm-stub.c
+++ b/xen-hvm-stub.c
@@ -46,7 +46,7 @@ qemu_irq *xen_interrupt_controller_init(void)
 return NULL;
 }
 
-void xen_register_framebuffer(MemoryRegion *mr)
+void xen_register_framebuffer(MemoryRegion *mr, uint8_t **ptr)
 {
 }
 
diff --git a/xen-hvm.c b/xen-hvm.c
index 5043beb..270cd99 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -41,6 +41,7 @@
 
 static MemoryRegion ram_memory, ram_640k, ram_lo, ram_hi;
 static MemoryRegion *framebuffer;
+static uint8_t **framebuffer_ptr;
 static bool xen_in_migration;
 
 /* Compatibility with older version */
@@ -302,7 +303,6 @@ static hwaddr xen_phys_offset_to_gaddr(hwaddr start_addr,
 return physmap->start_addr;
 }
 }
-
 return start_addr;
 }
 
@@ -317,7 +317,6 @@ static int xen_add_to_physmap(XenIOState *state,
 XenPhysmap *physmap = NULL;
 hwaddr pfn, start_gpfn;
 hwaddr phys_offset = memory_region_get_ram_addr(mr);
-char path[80], value[17];
 const char *mr_name;
 
 if (get_physmapping(state, start_addr, size)) {
@@ -340,6 +339,27 @@ go_physmap:
 DPRINTF("mapping vram to %"HWADDR_PRIx" - %"HWADDR_PRIx"\n",
 start_addr, start_addr + size);
 
+mr_name = memory_region_name(mr);
+
+physmap = g_malloc(sizeof(XenPhysmap));
+
+physmap->start_addr = start_addr;
+physmap->size = size;
+physmap->name = mr_name;
+physmap->phys_offset = phys_offset;
+
+QLIST_INSERT_HEAD(>physmap, physmap, list);
+
+if (runstate_check(RUN_STATE_INMIGRATE)) {
+/* At this point we have a physmap entry for the framebuffer region
+ * established during the restore phase so we can safely update the
+ * registered framebuffer address here. */
+if (mr == framebuffer) {
+*framebuffer_ptr = 

Re: [Xen-devel] [PATCH] x86/monitor: add support for descriptor access events

2017-03-10 Thread Tamas K Lengyel
On Fri, Mar 10, 2017 at 8:50 AM, Vlad Ioan Topan  wrote:
> Adds monitor support for descriptor access events (reads & writes of
> IDTR/GDTR/LDTR/TR) for the x86 architecture (VMX and SVM).
>
> Signed-off-by: Vlad Ioan Topan 
> ---
>  tools/libxc/include/xenctrl.h   |  2 ++
>  tools/libxc/xc_monitor.c| 14 +++
>  tools/tests/xen-access/xen-access.c | 29 -
>  xen/arch/x86/hvm/hvm.c  | 35 ++
>  xen/arch/x86/hvm/monitor.c  | 16 
>  xen/arch/x86/hvm/svm/svm.c  | 50 
> +
>  xen/arch/x86/hvm/vmx/vmx.c  | 45 +
>  xen/arch/x86/monitor.c  | 18 +
>  xen/arch/x86/vm_event.c |  6 +
>  xen/include/asm-x86/domain.h|  1 +
>  xen/include/asm-x86/hvm/hvm.h   |  2 ++
>  xen/include/asm-x86/hvm/monitor.h   |  3 +++
>  xen/include/asm-x86/hvm/support.h   |  2 ++
>  xen/include/asm-x86/monitor.h   |  1 +
>  xen/include/public/domctl.h |  1 +
>  xen/include/public/vm_event.h   | 21 
>  16 files changed, 245 insertions(+), 1 deletion(-)
>
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index a48981a..2de6c61 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -1984,6 +1984,8 @@ int xc_monitor_mov_to_msr(xc_interface *xch, domid_t 
> domain_id, uint32_t msr,
>  int xc_monitor_singlestep(xc_interface *xch, domid_t domain_id, bool enable);
>  int xc_monitor_software_breakpoint(xc_interface *xch, domid_t domain_id,
> bool enable);
> +int xc_monitor_descriptor_access(xc_interface *xch, domid_t domain_id,
> + bool enable);
>  int xc_monitor_guest_request(xc_interface *xch, domid_t domain_id,
>   bool enable, bool sync);
>  int xc_monitor_debug_exceptions(xc_interface *xch, domid_t domain_id,
> diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c
> index 15a7c32..f99b6e3 100644
> --- a/tools/libxc/xc_monitor.c
> +++ b/tools/libxc/xc_monitor.c
> @@ -129,6 +129,20 @@ int xc_monitor_singlestep(xc_interface *xch, domid_t 
> domain_id,
>  return do_domctl(xch, );
>  }
>
> +int xc_monitor_descriptor_access(xc_interface *xch, domid_t domain_id,
> + bool enable)
> +{
> +DECLARE_DOMCTL;
> +
> +domctl.cmd = XEN_DOMCTL_monitor_op;
> +domctl.domain = domain_id;
> +domctl.u.monitor_op.op = enable ? XEN_DOMCTL_MONITOR_OP_ENABLE
> +: XEN_DOMCTL_MONITOR_OP_DISABLE;
> +domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_DESC_ACCESS;
> +
> +return do_domctl(xch, );
> +}
> +
>  int xc_monitor_guest_request(xc_interface *xch, domid_t domain_id, bool 
> enable,
>   bool sync)
>  {
> diff --git a/tools/tests/xen-access/xen-access.c 
> b/tools/tests/xen-access/xen-access.c
> index 9d4f957..c567cc0 100644
> --- a/tools/tests/xen-access/xen-access.c
> +++ b/tools/tests/xen-access/xen-access.c
> @@ -337,7 +337,7 @@ void usage(char* progname)
>  {
>  fprintf(stderr, "Usage: %s [-m]  write|exec", progname);
>  #if defined(__i386__) || defined(__x86_64__)
> -fprintf(stderr, 
> "|breakpoint|altp2m_write|altp2m_exec|debug|cpuid");
> +fprintf(stderr, 
> "|breakpoint|altp2m_write|altp2m_exec|debug|cpuid|desc_access");
>  #elif defined(__arm__) || defined(__aarch64__)
>  fprintf(stderr, "|privcall");
>  #endif
> @@ -368,6 +368,7 @@ int main(int argc, char *argv[])
>  int altp2m = 0;
>  int debug = 0;
>  int cpuid = 0;
> +int desc_access = 0;
>  uint16_t altp2m_view_id = 0;
>
>  char* progname = argv[0];
> @@ -434,6 +435,10 @@ int main(int argc, char *argv[])
>  {
>  cpuid = 1;
>  }
> +else if ( !strcmp(argv[0], "desc_access") )
> +{
> +desc_access = 1;
> +}
>  #elif defined(__arm__) || defined(__aarch64__)
>  else if ( !strcmp(argv[0], "privcall") )
>  {
> @@ -570,6 +575,16 @@ int main(int argc, char *argv[])
>  goto exit;
>  }
>  }
> +
> +if ( desc_access )
> +{
> +rc = xc_monitor_descriptor_access(xch, domain_id, 1);
> +if ( rc < 0 )
> +{
> +ERROR("Error %d setting descriptor access listener with 
> vm_event\n", rc);
> +goto exit;
> +}
> +}
>
>  if ( privcall )
>  {
> @@ -595,6 +610,8 @@ int main(int argc, char *argv[])
>  rc = xc_monitor_debug_exceptions(xch, domain_id, 0, 0);
>  if ( cpuid )
>  rc = xc_monitor_cpuid(xch, domain_id, 0);
> +if ( desc_access )
> +rc = xc_monitor_descriptor_access(xch, domain_id, 0);
>
>  if ( privcall )
>  rc = 

Re: [Xen-devel] [PATCH V2] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-03-10 Thread Tamas K Lengyel
On Fri, Mar 10, 2017 at 4:21 AM, Andrew Cooper
 wrote:
> On 10/03/17 07:34, Jan Beulich wrote:
> On 09.03.17 at 18:29,  wrote:
>>> On Thu, Mar 9, 2017 at 9:56 AM, Jan Beulich  wrote:
 However - is this interface supposed to be usable by a guest on itself?
 Arguably the same question would apply to some of the other sub-ops
 too, but anyway.
>>> AFAIK the only op the guest would use on itself is
>>> HVMOP_altp2m_vcpu_enable_notify.
>> Which then means we should move all of them out of here, into a
>> suitable domctl. That will in turn reduce the scope of the bogus
>> interface versioning, which Andrew did point out, quite a bit.
>
> The original usecase for altp2m was for an entirely in-guest agent,
> which is why they got in as hvmops to start with.  I don't think it is
> wise to break that.
>
> I think there needs to be slightly finer grain control, identifying
> whether a domain may use altp2m, and whether it may configure altp2m
> permissions itself.
>
> The nature of altp2m means that using EPTP switching/etc necessarily can
> only happen from inside guest context, but whether you trust the domain
> to make adjustments to the permissions itself depends on your usecase
> and threat model.
>

So I'm actively using EPT switching and gfn remapping from a
privileged monitor domain (not with VMFUNC). My entire usecase for
altp2m is purely external without any in-guest agents. In fact, I have
to deploy a custom XSM policy to blacklist altp2mhvm_op being issued
from the guest.

The reason I mentioned HVMOP_altp2m_vcpu_enable_notify as being the
only one I believe that is only accessible from within the guest is
this distinction in arch/x86/hvm/hvm.c:

d = ( a.cmd != HVMOP_altp2m_vcpu_enable_notify ) ?
rcu_lock_domain_by_any_id(a.domain) : rcu_lock_current_domain();

For the other ops I'm not sure if they were really required to be
accessible from within the guest or not. I'm not even sure using them
would work from the guest with the above check in place. However, if
they do work from the guest then I have no idea how it was supposed to
work for security purposes as any application in that guest could just
issue a hypercall to manipulate it or even turn it off.

Tamas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] WTH is going on with memory hotplug sysf interface (was: Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks)

2017-03-10 Thread Reza Arbab

On Fri, Mar 10, 2017 at 04:53:33PM +0100, Michal Hocko wrote:

OK, so while I was playing with this setup some more I probably got why
this is done this way. All new memblocks are added to the zone Normal
where they are accounted as spanned but not present.


It's not always zone Normal. See zone_for_memory(). This leads to a 
workaround for having to do online_movable in descending block order. 
Instead of this:


1. probe block 34, probe block 33, probe block 32, ...
2. online_movable 34, online_movable 33, online_movable 32, ...

you can online_movable the first block before adding the rest:

1. probe block 32, online_movable 32
2. probe block 33, probe block 34, ...
- zone_for_memory() will cause these to start Movable
3. online 33, online 34, ...
- they're already in Movable, so online_movable is equivalentr

I agree with your general sentiment that this stuff is very 
nonintuitive.


--
Reza Arbab


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH RFC 2/2] xen: credit2: flexible configuration of runqueues

2017-03-10 Thread Praveen Kumar
The idea is to give user more flexibility to configure runqueue further.
For most workloads and in most systems, using per-core means have too many
small runqueues. Using per-socket is almost always better, but it may result
in too few big runqueues.

OPTION 1 :

The user can create runqueue per-cpu using Xen boot parameter like below:

 credit2_runqueue=cpu

which would mean the following:
 - pCPU 0 belong to runqueue 0
 - pCPU 1 belong to runqueue 1
 - pCPU 2 belong to runqueue 2
 and so on.

OPTION 2 :

Further user can be allowed to say something shown below :

 credit2_runqueue=0,1,4,5;2,3,6,7;8,9,12,13;10,11,14,15

or (with exactly the same meaning, but a perhaps more clear syntax):

 credit2_runqueue=[[0,1,4,5][2,3,6,7][8,9,12,13][10,11,14,15]]

which would mean the following:
 - pCPUs 0, 1, 4 and 5 belong to runqueue 0
 - pCPUs 2, 3, 6 and 7 belong to runqueue 1
 - pCPUs 8, 9, 12 and 13 belong to runqueue 2
 - pCPUs 10, 11, 14 and 15 belong to runqueue 3

---
PATCH 2/2 enables to create runqueue based upon [OPTION 2].
---
RFC :
Current implementation is parsing the data for below syntax.
"credit2_runqueue=[[0,1,4,5][2,3,6,7][8,9,12,13][10,11,14,15]]"

I am also working for simpler one as mentioned in above desciption.
Sharing the patch to get further inputs over the changes.
---

Signed-off-by: Praveen Kumar 

---
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index af457c1..505aa6b 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -318,6 +318,16 @@ integer_param("credit2_balance_over", 
opt_overload_balance_tolerance);
  *   (logical) processors of the host belong. This will happen if
  *   the opt_runqueue parameter is set to 'all'.
  *
+ * - custom : meaning that there will be one runqueue per subset being passed 
as
+ *  parameter to credit2_runqueue as shown in below example.
+ *  Example :
+ *  credit2_runqueue=[[cpu0,cpu1][cpu3][cpu4,cpu5]]
+ *
+ *  The example mentioned states :
+ *  cpu0 and cpu1 belongs to runqueue 0
+ *  cpu3 belongs to runqueue 1
+ *  cpu4 and cpu5 belongs to runqueue 2
+ *
  * Depending on the value of opt_runqueue, therefore, cpus that are part of
  * either the same physical core, the same physical socket, the same NUMA
  * node, or just all of them, will be put together to form runqueues.
@@ -326,17 +336,62 @@ integer_param("credit2_balance_over", 
opt_overload_balance_tolerance);
 #define OPT_RUNQUEUE_SOCKET 1
 #define OPT_RUNQUEUE_NODE   2
 #define OPT_RUNQUEUE_ALL3
+#define OPT_RUNQUEUE_CUSTOM 4
 static const char *const opt_runqueue_str[] = {
 [OPT_RUNQUEUE_CORE] = "core",
 [OPT_RUNQUEUE_SOCKET] = "socket",
 [OPT_RUNQUEUE_NODE] = "node",
-[OPT_RUNQUEUE_ALL] = "all"
+[OPT_RUNQUEUE_ALL] = "all",
+[OPT_RUNQUEUE_CUSTOM] = "custom"
 };
 static int __read_mostly opt_runqueue = OPT_RUNQUEUE_SOCKET;
 
+static int __read_mostly custom_cpu_runqueue[NR_CPUS];
+
+
+#define GETTOKEN( token, len, start, end )   \
+{\
+char _tmp[len+1];\
+int _i;  \
+ \
+safe_strcpy(_tmp, start);\
+_tmp[len] = '\0';\
+ \
+for ( _i = 0; _tmp[_i] != '\0'; _i++ )   \
+token = ( ( token * 10 ) + ( _tmp[_i] - '0' ) ); \
+}
+
+static inline int trim(const char *c, char *t, char elem)
+{
+int l = strlen(c);
+const char *x = c ;
+int i = 0;
+if ( !c || !t )
+return -1;
+while ( *x != '\0' && i < l )
+{
+if ( *x != elem )
+t[i++] = *x;
+x++;
+}
+t[i] = '\0';
+return 0;
+}
+
+static inline int getlen(char *start, char *end)
+{
+if ( ( start ) && ( end ) && ( end > start ) )
+return end-start;
+else
+return -1;
+}
+
 static void parse_credit2_runqueue(const char *s)
 {
 unsigned int i;
+const char *s_end = NULL;
+char m[strlen(s)];
+char *_s = NULL;
 
 for ( i = 0; i < ARRAY_SIZE(opt_runqueue_str); i++ )
 {
@@ -346,8 +401,118 @@ static void parse_credit2_runqueue(const char *s)
 return;
 }
 }
+/*
+ * At this stage we are either unknown value of credit2_runqueue or we can
+ * consider it to be custom cpu. Lets try parsing the same.
+ * Resetting the custom_cpu_runqueue for future use. Only the non-negative
+ * entries will be valid. The index 'i' in custom_cpu_runqueue will store
+ * the specific runqueue it belongs to.
+ * Example:
+ * If custom_cpu_runqueue[3] == 2
+ * Then, it means that cpu 3 belong to runqueue 2.
+ * If 

[Xen-devel] [PATCH RFC 1/2] xen: credit2: flexible configuration of runqueues

2017-03-10 Thread Praveen Kumar
The idea is to give user more flexibility to configure runqueue further.
For most workloads and in most systems, using per-core means have too many
small runqueues. Using per-socket is almost always better, but it may result
in too few big runqueues.

OPTION 1 :

The user can create runqueue per-cpu using Xen boot parameter like below:

 credit2_runqueue=cpu

which would mean the following:
 - pCPU 0 belong to runqueue 0
 - pCPU 1 belong to runqueue 1
 - pCPU 2 belong to runqueue 2
 and so on.

OPTION 2 :

Further user can be allowed to say something shown below :

 credit2_runqueue=0,1,4,5;2,3,6,7;8,9,12,13;10,11,14,15

or (with exactly the same meaning, but a perhaps more clear syntax):

 credit2_runqueue=[[0,1,4,5][2,3,6,7][8,9,12,13][10,11,14,15]]

which would mean the following:
 - pCPUs 0, 1, 4 and 5 belong to runqueue 0
 - pCPUs 2, 3, 6 and 7 belong to runqueue 1
 - pCPUs 8, 9, 12 and 13 belong to runqueue 2
 - pCPUs 10, 11, 14 and 15 belong to runqueue 3

---
PATCH 1/2 enables to create runqueue per-cpu [OPTION 1].
---

Signed-off-by: Praveen Kumar 

---
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index af457c1..2bc0013 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -301,6 +301,9 @@ integer_param("credit2_balance_over", 
opt_overload_balance_tolerance);
  * want that to happen basing on topology. At the moment, it is possible
  * to choose to arrange runqueues to be:
  *
+ * - per-cpu: meaning that there will be one runqueue per logical cpu. This
+ *will happen when if the opt_runqueue parameter is set to 'cpu'.
+ *
  * - per-core: meaning that there will be one runqueue per each physical
  * core of the host. This will happen if the opt_runqueue
  * parameter is set to 'core';
@@ -322,11 +325,13 @@ integer_param("credit2_balance_over", 
opt_overload_balance_tolerance);
  * either the same physical core, the same physical socket, the same NUMA
  * node, or just all of them, will be put together to form runqueues.
  */
-#define OPT_RUNQUEUE_CORE   0
-#define OPT_RUNQUEUE_SOCKET 1
-#define OPT_RUNQUEUE_NODE   2
-#define OPT_RUNQUEUE_ALL3
+#define OPT_RUNQUEUE_CPU0
+#define OPT_RUNQUEUE_CORE   1
+#define OPT_RUNQUEUE_SOCKET 2
+#define OPT_RUNQUEUE_NODE   3
+#define OPT_RUNQUEUE_ALL4
 static const char *const opt_runqueue_str[] = {
+[OPT_RUNQUEUE_CPU] = "cpu",
 [OPT_RUNQUEUE_CORE] = "core",
 [OPT_RUNQUEUE_SOCKET] = "socket",
 [OPT_RUNQUEUE_NODE] = "node",
@@ -682,6 +687,8 @@ cpu_to_runqueue(struct csched2_private *prv, unsigned int 
cpu)
 BUG_ON(cpu_to_socket(cpu) == XEN_INVALID_SOCKET_ID ||
cpu_to_socket(peer_cpu) == XEN_INVALID_SOCKET_ID);
 
+if (opt_runqueue == OPT_RUNQUEUE_CPU)
+continue;
 if ( opt_runqueue == OPT_RUNQUEUE_ALL ||
  (opt_runqueue == OPT_RUNQUEUE_CORE && same_core(peer_cpu, cpu)) ||
  (opt_runqueue == OPT_RUNQUEUE_SOCKET && same_socket(peer_cpu, 
cpu)) ||

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4] xen: Allow a default compiled-in command line using Kconfig

2017-03-10 Thread Zhongze Liu
2017-03-10 23:03 GMT+08:00 Jan Beulich :
 On 09.03.17 at 04:13,  wrote:
>> If CMDLINE is set, the cmdline_parse() routine will append the bootloader
>> command line to this string, forming the complete command line
>> before parsing.
>
> I disagree to making it behave like this: There's no need to
> concatenate both, simply parse them one after the other. The
> built-in one is well known (and hence also has no need to be in
> saved_cmdline). That'll avoid reducing the space available for
> user specified options.
>

I did so because I do think the embedded one should be in
saved_cmdline, too, but your suggestion
sounds quite reasonable.
Then I really have to introduce a wrapper to the original
cmdline_parse(). (Recursion seems not suitable for
this, since a new variable indicating the depth is inevitable, making
the parameter list even longer).
My solution will be: the parser calls gather_dom0_options() and the
original cmdline_parse first on
CONFIG_CMDLINE and then on the bootloader cmdline. The interface would
still be the same as
I did in this patch and leaving the original cmdline_parse() unchanged.
BTW, in the Xen tradition, will we rename the original cmdline_parse()
to _cmdline_parse() or
do_cmdline_parse() or something else?

>
>> @@ -1566,14 +1557,14 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>
>>  /* Grab the DOM0 command line. */
>>  cmdline = (char *)(mod[0].string ? __va(mod[0].string) : NULL);
>> -if ( (cmdline != NULL) || (kextra != NULL) )
>> +if ( (cmdline != NULL) || strlen(kextra) )
>
> Is there any reason why kextra can't come out as NULL if unset,
> avoiding the need to touch the code here? That would also avoid
> making kextra a static variable.
>

In the original code, kextra is a pointer to a suffix of the original
cmdline. It's
orphaned from cmdline by turning the first blank in " -- " into a '\0'.
But now since the dom0 options can appear in both CONFIG_CMDLINE and
the bootloader
cmdline, there must be some place (an array) to hold the concatenated
dom0 option string.
So if we want to avoid modifying this piece of code, I can only come
up with two solutions:
1.  Define a new array in this function and let kextra point to it.
Set kextra to NULL if the array is empty.
 But I think this is too awkward.
 2. Define a new array,  say, opt_dom0_options[],  in kernel.c, and
return the pointer to this array back to
 the caller when cmdline_parse() is invoked, or return NULL if the
array is empty.
What do you say?

>
>> --- a/xen/common/Kconfig
>> +++ b/xen/common/Kconfig
>> @@ -237,4 +237,28 @@ config FAST_SYMBOL_LOOKUP
>> The only user of this is Live patching.
>>
>> If unsure, say Y.
>> +
>> +config CMDLINE
>> + string "Built-in hypervisor command string"
>> + default ""
>> + ---help---
>> +   Enter arguments here that should be compiled into the hypervisor
>> +   image and used at boot time.  If the bootloader provides a
>> +   command line at boot time, it is appended to this string to
>> +   form the full hypervisor command line, when the system boots.
>> +   So if the same command line option is not cumulative,
>> +   and was set both in this string and in the bootloader command line,
>> +   only the latter (i.e. the one in the bootloader command line),see
>> +   will take effect.
>> +
>> +config CMDLINE_OVERRIDE
>> + bool "Built-in command line overrides bootloader arguments"
>> + default n
>> + depends on EXPERT = "y"
>
> Personally I think the other option should also be dependent on
> EXPERT. And the one here should probably depend on CMDLINE != ""
> - if someone really wants to force an empty one, (s)he could make
> it be just a single blank.
>

Well, sounds reasonable. But I still want to hear what others say.

>
> Jan
>

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.6.5 released

2017-03-10 Thread Stefan Bader
On 08.03.2017 13:54, Jan Beulich wrote:
> All,
> 
> I am pleased to announce the release of Xen 4.6.5. This is
> available immediately from its git repository
> http://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog;h=refs/heads/stable-4.6 
> (tag RELEASE-4.6.5) or from the XenProject download page
> http://www.xenproject.org/downloads/xen-archives/xen-46-series/xen-465.html 
> (where a list of changes can also be found).
> 
> We recommend all users of the 4.6 stable series to update to this
> latest point release.

This does not seem to compile for me (x86_64) without the attached (admittedly
brutish) change.

-Stefan

> 
> Regards, Jan
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
> 

Index: xen-4.6.5/xen/arch/x86/x86_emulate/x86_emulate.c
===
--- xen-4.6.5.orig/xen/arch/x86/x86_emulate/x86_emulate.c
+++ xen-4.6.5/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -20,6 +20,8 @@
  * along with this program; If not, see .
  */
 
+#define MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))
+
 /* Operand sizes: 8-bit operands or specified/overridden size. */
 #define ByteOp  (1<<0) /* 8-bit operands. */
 /* Destination operand type. */


signature.asc
Description: OpenPGP digital signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain

2017-03-10 Thread Andrew Cooper
On 28/02/17 09:31, Jan Beulich wrote:
 On 27.02.17 at 16:10,  wrote:
>> On 22/02/17 10:10, Jan Beulich wrote:
>> On 22.02.17 at 11:00,  wrote:
 On 22/02/17 09:23, Jan Beulich wrote:
 On 20.02.17 at 12:00,  wrote:
>> The domain builder in libxc no longer depends on leaked CPUID 
>> information to
>> properly construct HVM domains.  Remove the control domain exclusion.
> Am I missing some intermediate step? As long as there's a raw
> CPUID invocation in xc_cpuid_x86.c (which is still there in staging
> and I don't recall this series removing it) it at least _feels_ unsafe.
 Strictly speaking, the domain builder part of this was completed after
 my xsave adjustments.  All the guest-type-dependent information now
 comes from non-cpuid sources in libxc, or Xen ignores the toolstack
 values and recalculates information itself.

 However, until the Intel leaves were complete, dom0 had a hard time
 booting with this change as there were no toolstack-provided policy and
 no leakage from hardware.
>>> So what are the CPUID uses in libxc then needed for at this point?
>>> Could they be removed in a prereq patch to make clear all needed
>>> information is now being obtained via hypercalls?
>> I'd prefer to defer that work.  The next chunk of CPUID work is going to
>> be redesigning and reimplementing the hypervisor/libxc interface, and
>> all cpuid() calls in libxc will fall out there, but its not a trivial
>> set of changes to make.
> With that, could you live with deferring the patch here until then?

We currently have a lot of dom0 implicit dependencies on leaked CPUID
state into PV dom0.

With this series, I believe I have identified all leaked dependencies,
and I really want to prevent is introducing any new implicit dependences
accidentally.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 5/5] x86/ioreq server: Synchronously reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.

2017-03-10 Thread Andrew Cooper
On 08/03/17 15:33, Yu Zhang wrote:
> After an ioreq server has unmapped, the remaining p2m_ioreq_server
> entries need to be reset back to p2m_ram_rw. This patch does this
> synchronously by iterating the p2m table.
>
> The synchronous resetting is necessary because we need to guarantee
> the p2m table is clean before another ioreq server is mapped. And
> since the sweeping of p2m table could be time consuming, it is done
> with hypercall continuation.
>
> Signed-off-by: Yu Zhang 
> ---
> Cc: Paul Durrant 
> Cc: Jan Beulich 
> Cc: Andrew Cooper 
> Cc: George Dunlap 
>
> changes in v1: 
>   - This patch is splitted from patch 4 of last version.
>   - According to comments from Jan: update the gfn_start for
> when use hypercall continuation to reset the p2m type.
>   - According to comments from Jan: use min() to compare gfn_end
> and max mapped pfn in p2m_finish_type_change()
> ---
>  xen/arch/x86/hvm/dm.c  | 43 
> +-
>  xen/arch/x86/mm/p2m.c  | 29 
>  xen/include/asm-x86/p2m.h  |  5 +
>  xen/include/public/hvm/dm_op.h |  3 +--
>  4 files changed, 73 insertions(+), 7 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
> index f97478b..a92d5d7 100644
> --- a/xen/arch/x86/hvm/dm.c
> +++ b/xen/arch/x86/hvm/dm.c
> @@ -288,6 +288,7 @@ static int inject_event(struct domain *d,
>  return 0;
>  }
>  
> +#define DMOP_op_mask 0xff
>  static int dm_op(domid_t domid,
>   unsigned int nr_bufs,
>   xen_dm_op_buf_t bufs[])
> @@ -315,10 +316,8 @@ static int dm_op(domid_t domid,
>  }
>  
>  rc = -EINVAL;
> -if ( op.pad )
> -goto out;
>  
> -switch ( op.op )
> +switch ( op.op & DMOP_op_mask )

Nack to changes like this.  HVMop continuations only existed in this
form because we had to retrofit it to longterm-stable ABIs in the past,
and there were literally no free bits anywhere else.

Currently, the DMOP ABI isn't set in stone, so you have until code
freeze in 4.9 to make changes.  (i.e. soon now.)  We should also
consider which other DMops might potentially need to become continuable,
and take preventative action before the freeze.


If we need to make similar changes once the ABI truely is frozen, there
are plenty of free bits in the end of the union which can be used
without breaking the ABI.

~Andrew.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 08/10] x86/cpuid: Handle leaf 0xb in guest_cpuid()

2017-03-10 Thread Andrew Cooper
Leaf 0xb is reserved by AMD, and uniformly hidden from guests by the toolstack
logic and hypervisor PV logic.  The previous dynamic logic filled in the
x2APIC ID for all HVM guests.

In practice, leaf 0xb is tightly linked with x2APIC, and x2APIC is offered to
guests on AMD hardware, as Xen's APIC emulation is x2APIC capable even if
hardware isn't.

Sensibly exposing the rest of the leaf requires further topology
infrastructure.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v2:
 * Switch logic to be in terms of the visibility of x2APIC.
---
 xen/arch/x86/cpuid.c | 34 +++---
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 17769fb..dd8b583 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -171,6 +171,7 @@ static void recalculate_misc(struct cpuid_policy *p)
 p->basic.raw[0x6] = EMPTY_LEAF; /* Therm/Power not exposed to guests. */
 
 p->basic.raw[0x8] = EMPTY_LEAF;
+p->basic.raw[0xb] = EMPTY_LEAF; /* TODO: Rework topology logic. */
 p->basic.raw[0xc] = EMPTY_LEAF;
 
 p->extd.e1d &= ~CPUID_COMMON_1D_FEATURES;
@@ -629,12 +630,7 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, 
struct cpuid_leaf *res)
 
 switch ( leaf )
 {
-case 0x000b: /* Extended Topology Enumeration */
-*res = EMPTY_LEAF;
-break;
-
-case 0x0 ... 0xa:
-case 0xc ... XSTATE_CPUID:
+case 0x0 ... XSTATE_CPUID:
 case 0x8000 ... 0x:
 ASSERT_UNREACHABLE();
 /* Now handled in guest_cpuid(). */
@@ -650,13 +646,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, 
struct cpuid_leaf *res)
 
 switch ( leaf )
 {
-case 0xb:
-/* Fix the x2APIC identifier. */
-res->d = v->vcpu_id * 2;
-break;
-
-case 0x0 ... 0xa:
-case 0xc ... XSTATE_CPUID:
+case 0x0 ... XSTATE_CPUID:
 case 0x8000 ... 0x:
 ASSERT_UNREACHABLE();
 /* Now handled in guest_cpuid(). */
@@ -716,8 +706,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
 
 case 0x0 ... 0x3:
 case 0x5 ... 0x6:
-case 0x8 ... 0xa:
-case 0xc:
+case 0x8 ... 0xc:
 *res = p->basic.raw[leaf];
 break;
 }
@@ -938,6 +927,21 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
 }
 break;
 
+case 0xb:
+/*
+ * In principle, this leaf is Intel-only.  In practice, it is tightly
+ * coupled with x2apic, and we offer an x2apic-capable APIC emulation
+ * to guests on AMD hardware as well.
+ *
+ * TODO: Rework topology logic.
+ */
+if ( p->basic.x2apic )
+{
+/* Fix the x2APIC identifier. */
+res->d = v->vcpu_id * 2;
+}
+break;
+
 case XSTATE_CPUID:
 switch ( subleaf )
 {
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 06/10] x86/cpuid: Handle leaf 0x6 in guest_cpuid()

2017-03-10 Thread Andrew Cooper
The thermal/performance leaf was previously hidden from HVM guests, but fully
visible to PV guests.  Most of the leaf refers to MSR availability, and there
is nothing an unprivileged PV guest can do with the information, so hide the
leaf entirely.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v2:
 * Don't leak any state into the hardware domain.
---
 xen/arch/x86/cpuid.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 11e50e9..8585d94 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -168,6 +168,7 @@ static void recalculate_misc(struct cpuid_policy *p)
 p->basic.apic_id = 0; /* Dynamic. */
 
 p->basic.raw[0x5] = EMPTY_LEAF; /* MONITOR not exposed to guests. */
+p->basic.raw[0x6] = EMPTY_LEAF; /* Therm/Power not exposed to guests. */
 
 p->basic.raw[0x8] = EMPTY_LEAF;
 p->basic.raw[0xc] = EMPTY_LEAF;
@@ -643,8 +644,7 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, 
struct cpuid_leaf *res)
 *res = EMPTY_LEAF;
 break;
 
-case 0x0 ... 0x5:
-case 0x7 ... 0x9:
+case 0x0 ... 0x9:
 case 0xc ... XSTATE_CPUID:
 case 0x8000 ... 0x:
 ASSERT_UNREACHABLE();
@@ -678,8 +678,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, 
struct cpuid_leaf *res)
 res->a = (res->a & ~0xff) | 3;
 break;
 
-case 0x0 ... 0x5:
-case 0x7 ... 0x9:
+case 0x0 ... 0x9:
 case 0xc ... XSTATE_CPUID:
 case 0x8000 ... 0x:
 ASSERT_UNREACHABLE();
@@ -739,7 +738,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
 goto legacy;
 
 case 0x0 ... 0x3:
-case 0x5:
+case 0x5 ... 0x6:
 case 0x8 ... 0x9:
 case 0xc:
 *res = p->basic.raw[leaf];
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 04/10] x86/cpuid: Handle leaf 0x4 in guest_cpuid()

2017-03-10 Thread Andrew Cooper
Leaf 0x4 is reserved by AMD.  For Intel, it is a multi-invocation leaf with
ecx enumerating different cache details.

Add a new union for it in struct cpuid_policy, collect it from hardware in
calculate_raw_policy(), audit it in recalculate_cpuid_policy() and update
guest_cpuid() and update_domain_cpuid_info() to properly insert/extract data.

A lot of the data here will need further auditing/refinement when better
topology support is introduced, but for now, this matches the existing
toolstack behaviour.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v2:
 * The cache type field is 5 bits wide, rather than 4.
 * Don't bother clobbering p->basic.raw[0x{4,7,d}]
---
 xen/arch/x86/cpuid.c| 48 +++--
 xen/arch/x86/domctl.c   |  8 +++-
 xen/include/asm-x86/cpuid.h | 10 ++
 3 files changed, 63 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index d6f6b88..e85415f 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -199,6 +199,7 @@ static void recalculate_misc(struct cpuid_policy *p)
 
 case X86_VENDOR_AMD:
 zero_leaves(p->basic.raw, 0x2, 0x3);
+memset(p->cache.raw, 0, sizeof(p->cache.raw));
 p->basic.raw[0x9] = EMPTY_LEAF;
 
 p->extd.vendor_ebx = p->basic.vendor_ebx;
@@ -242,6 +243,25 @@ static void __init calculate_raw_policy(void)
 cpuid_leaf(i, >basic.raw[i]);
 }
 
+if ( p->basic.max_leaf >= 4 )
+{
+for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
+{
+cpuid_count_leaf(4, i, >cache.raw[i]);
+
+if ( p->cache.subleaf[i].type == 0 )
+break;
+}
+
+/*
+ * The choice of CPUID_GUEST_NR_CACHE is arbitrary.  It is expected
+ * that it will eventually need increasing for future hardware.
+ */
+if ( i == ARRAY_SIZE(p->cache.raw) )
+printk(XENLOG_WARNING
+   "CPUID: Insufficient Leaf 4 space for this hardware\n");
+}
+
 if ( p->basic.max_leaf >= 7 )
 {
 cpuid_count_leaf(7, 0, >feat.raw[0]);
@@ -520,6 +540,23 @@ void recalculate_cpuid_policy(struct domain *d)
 recalculate_xstate(p);
 recalculate_misc(p);
 
+for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
+{
+if ( p->cache.subleaf[i].type >= 1 &&
+ p->cache.subleaf[i].type <= 3 )
+{
+/* Subleaf has a valid cache type. Zero reserved fields. */
+p->cache.raw[i].a &= 0xc3ffu;
+p->cache.raw[i].d &= 0x0007u;
+}
+else
+{
+/* Subleaf is not valid.  Zero the rest of the union. */
+zero_leaves(p->cache.raw, i, ARRAY_SIZE(p->cache.raw) - 1);
+break;
+}
+}
+
 if ( !p->extd.svm )
 p->extd.raw[0xa] = EMPTY_LEAF;
 
@@ -605,7 +642,7 @@ static void pv_cpuid(uint32_t leaf, uint32_t subleaf, 
struct cpuid_leaf *res)
 *res = EMPTY_LEAF;
 break;
 
-case 0x0 ... 0x3:
+case 0x0 ... 0x4:
 case 0x7 ... 0x9:
 case 0xc ... XSTATE_CPUID:
 case 0x8000 ... 0x:
@@ -640,7 +677,7 @@ static void hvm_cpuid(uint32_t leaf, uint32_t subleaf, 
struct cpuid_leaf *res)
 res->a = (res->a & ~0xff) | 3;
 break;
 
-case 0x0 ... 0x3:
+case 0x0 ... 0x4:
 case 0x7 ... 0x9:
 case 0xc ... XSTATE_CPUID:
 case 0x8000 ... 0x:
@@ -674,6 +711,13 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
 
 switch ( leaf )
 {
+case 0x4:
+if ( subleaf >= ARRAY_SIZE(p->cache.raw) )
+return;
+
+*res = p->cache.raw[subleaf];
+break;
+
 case 0x7:
 ASSERT(p->feat.max_subleaf < ARRAY_SIZE(p->feat.raw));
 if ( subleaf > min_t(uint32_t, p->feat.max_subleaf,
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 02b48e8..d87efa2 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -101,6 +101,10 @@ static int update_domain_cpuid_info(struct domain *d,
 switch ( ctl->input[0] )
 {
 case 0x ... ARRAY_SIZE(p->basic.raw) - 1:
+if ( ctl->input[0] == 4 &&
+ ctl->input[1] >= ARRAY_SIZE(p->cache.raw) )
+return 0;
+
 if ( ctl->input[0] == 7 &&
  ctl->input[1] >= ARRAY_SIZE(p->feat.raw) )
 return 0;
@@ -129,7 +133,9 @@ static int update_domain_cpuid_info(struct domain *d,
 switch ( ctl->input[0] )
 {
 case 0x ... ARRAY_SIZE(p->basic.raw) - 1:
-if ( ctl->input[0] == 7 )
+if ( ctl->input[0] == 4 )
+p->cache.raw[ctl->input[1]] = leaf;
+else if ( ctl->input[0] == 7 )
 p->feat.raw[ctl->input[1]] = leaf;
 else if ( ctl->input[0] == XSTATE_CPUID )
 p->xstate.raw[ctl->input[1]] = leaf;
diff --git 

[Xen-devel] Ping: [PATCH] AMD-Vi: allocate root table on demand

2017-03-10 Thread Jan Beulich
>>> On 03.03.17 at 15:29,  wrote:
> This was my originally intended fix for the AMD side of XSA-207:
> There's no need to unconditionally allocate the root table, and with
> that there's then also no way to leak it when a guest has no devices
> assigned.
> 
> Signed-off-by: Jan Beulich 
> 
> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -636,11 +636,10 @@ int amd_iommu_map_page(struct domain *d,
>  {
>  bool_t need_flush = 0;
>  struct domain_iommu *hd = dom_iommu(d);
> +int rc;
>  unsigned long pt_mfn[7];
>  unsigned int merge_level;
>  
> -BUG_ON( !hd->arch.root_table );
> -
>  if ( iommu_use_hap_pt(d) )
>  return 0;
>  
> @@ -648,6 +647,13 @@ int amd_iommu_map_page(struct domain *d,
>  
>  spin_lock(>arch.mapping_lock);
>  
> +rc = amd_iommu_alloc_root(hd);
> +if ( rc )
> +{
> +spin_unlock(>arch.mapping_lock);
> +return rc;
> +}
> +
>  /* Since HVM domain is initialized with 2 level IO page table,
>   * we might need a deeper page table for lager gfn now */
>  if ( is_hvm_domain(d) )
> @@ -717,8 +723,6 @@ int amd_iommu_unmap_page(struct domain *
>  unsigned long pt_mfn[7];
>  struct domain_iommu *hd = dom_iommu(d);
>  
> -BUG_ON( !hd->arch.root_table );
> -
>  if ( iommu_use_hap_pt(d) )
>  return 0;
>  
> @@ -726,6 +730,12 @@ int amd_iommu_unmap_page(struct domain *
>  
>  spin_lock(>arch.mapping_lock);
>  
> +if ( !hd->arch.root_table )
> +{
> +spin_unlock(>arch.mapping_lock);
> +return 0;
> +}
> +
>  /* Since HVM domain is initialized with 2 level IO page table,
>   * we might need a deeper page table for lager gfn now */
>  if ( is_hvm_domain(d) )
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -222,23 +222,29 @@ int __init amd_iov_detect(void)
>  return scan_pci_devices();
>  }
>  
> -static int allocate_domain_resources(struct domain_iommu *hd)
> +int amd_iommu_alloc_root(struct domain_iommu *hd)
>  {
> -/* allocate root table */
> -spin_lock(>arch.mapping_lock);
> -if ( !hd->arch.root_table )
> +if ( unlikely(!hd->arch.root_table) )
>  {
>  hd->arch.root_table = alloc_amd_iommu_pgtable();
>  if ( !hd->arch.root_table )
> -{
> -spin_unlock(>arch.mapping_lock);
>  return -ENOMEM;
> -}
>  }
> -spin_unlock(>arch.mapping_lock);
> +
>  return 0;
>  }
>  
> +static int __must_check allocate_domain_resources(struct domain_iommu *hd)
> +{
> +int rc;
> +
> +spin_lock(>arch.mapping_lock);
> +rc = amd_iommu_alloc_root(hd);
> +spin_unlock(>arch.mapping_lock);
> +
> +return rc;
> +}
> +
>  static int get_paging_mode(unsigned long entries)
>  {
>  int level = 1;
> @@ -259,14 +265,6 @@ static int amd_iommu_domain_init(struct
>  {
>  struct domain_iommu *hd = dom_iommu(d);
>  
> -/* allocate page directroy */
> -if ( allocate_domain_resources(hd) != 0 )
> -{
> -if ( hd->arch.root_table )
> -free_domheap_page(hd->arch.root_table);
> -return -ENOMEM;
> -}
> -
>  /* For pv and dom0, stick with get_paging_mode(max_page)
>   * For HVM dom0, use 2 level page table at first */
>  hd->arch.paging_mode = is_hvm_domain(d) ?
> @@ -280,6 +278,9 @@ static void __hwdom_init amd_iommu_hwdom
>  unsigned long i; 
>  const struct amd_iommu *iommu;
>  
> +if ( allocate_domain_resources(dom_iommu(d)) )
> +BUG();
> +
>  if ( !iommu_passthrough && !need_iommu(d) )
>  {
>  int rc = 0;
> @@ -363,7 +364,7 @@ static int reassign_device(struct domain
> u8 devfn, struct pci_dev *pdev)
>  {
>  struct amd_iommu *iommu;
> -int bdf;
> +int bdf, rc;
>  struct domain_iommu *t = dom_iommu(target);
>  
>  bdf = PCI_BDF2(pdev->bus, pdev->devfn);
> @@ -385,10 +386,9 @@ static int reassign_device(struct domain
>  pdev->domain = target;
>  }
>  
> -/* IO page tables might be destroyed after pci-detach the last device
> - * In this case, we have to re-allocate root table for next pci-attach.*/
> -if ( t->arch.root_table == NULL )
> -allocate_domain_resources(t);
> +rc = allocate_domain_resources(t);
> +if ( rc )
> +return rc;
>  
>  amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
>  AMD_IOMMU_DEBUG("Re-assign %04x:%02x:%02x.%u from dom%d to dom%d\n",
> --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
> +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
> @@ -56,6 +56,7 @@ int __must_check amd_iommu_map_page(stru
>  unsigned long mfn, unsigned int flags);
>  int __must_check amd_iommu_unmap_page(struct domain *d, unsigned long gfn);
>  u64 amd_iommu_get_next_table_from_pte(u32 *entry);
> +int 

Re: [Xen-devel] [PATCH v7 5/5] x86/ioreq server: Synchronously reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.

2017-03-10 Thread Jan Beulich
>>> On 08.03.17 at 16:33,  wrote:
> --- a/xen/arch/x86/hvm/dm.c
> +++ b/xen/arch/x86/hvm/dm.c
> @@ -288,6 +288,7 @@ static int inject_event(struct domain *d,
>  return 0;
>  }
>  
> +#define DMOP_op_mask 0xff
>  static int dm_op(domid_t domid,

Please follow the existing continuation model used here, instead of
re-introducing the hvmop one.

> --- a/xen/include/asm-x86/p2m.h
> +++ b/xen/include/asm-x86/p2m.h
> @@ -611,6 +611,11 @@ void p2m_change_type_range(struct domain *d,
>  int p2m_change_type_one(struct domain *d, unsigned long gfn,
>  p2m_type_t ot, p2m_type_t nt);
>  
> +/* Synchronously change types across a range of p2m entries (start ... end) 
> */
> +void p2m_finish_type_change(struct domain *d,
> +unsigned long start, unsigned long end,
> +p2m_type_t ot, p2m_type_t nt);

The comment should not give the impression that this is an open
range. I.e. [start, end], but perhaps even better would be to not
use "end" here, as that frequently (e.g. p2m_change_type_range())
this is used to indicate an exclusive range end.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 4/5] ix86/ioreq server: Asynchronously reset outstanding p2m_ioreq_server entries.

2017-03-10 Thread Jan Beulich
>>> On 08.03.17 at 16:33,  wrote:
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -954,6 +954,26 @@ int p2m_change_type_one(struct domain *d, unsigned long 
> gfn,
>   p2m->default_access)
>   : -EBUSY;
>  
> +if ( !rc )
> +{
> +switch ( nt )
> +{
> +case p2m_ram_rw:
> +if ( ot == p2m_ioreq_server )
> +{
> +p2m->ioreq.entry_count--;
> +ASSERT(p2m->ioreq.entry_count >= 0);
> +}
> +break;
> +case p2m_ioreq_server:
> +if ( ot == p2m_ram_rw )
> +p2m->ioreq.entry_count++;

I don't think the conditional is needed here. If anything it should be
an ASSERT() imo, as other transitions to p2m_ioreq_server should
not be allowed.

But there's a wider understanding issue I'm having here: What is
an "entry" here? Commonly I would assume this to refer to an
individual (4k) page, but it looks like you really mean table entry,
i.e. possibly representing a 2M or 1G page.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 106584: tolerable FAIL - PUSHED

2017-03-10 Thread osstest service owner
flight 106584 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106584/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stopfail like 106534
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 106534
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 106534
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 106534
 test-armhf-armhf-libvirt 13 saverestore-support-checkfail  like 106534
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail  like 106534
 test-armhf-armhf-libvirt-raw 12 saverestore-support-checkfail  like 106534
 test-amd64-amd64-xl-rtds  9 debian-install   fail  like 106534

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 build-arm64   5 xen-buildfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 build-arm64-xsm   5 xen-buildfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  9dc1e0cd81ee469d638d1962a92d9b4bd2972bfa
baseline version:
 xen  4036e7c592905c2292cdeba8269e969959427237

Last test of basis   106534  2017-03-07 19:14:51 Z2 days
Failing since106547  2017-03-08 08:59:30 Z2 days5 attempts
Testing same since   106580  2017-03-09 23:15:02 Z0 days2 attempts


People who touched revisions under test:
  Andrew Cooper 
  Christoph Egger 
  George Dunlap 
  Haozhong Zhang 
  Jan Beulich 
  Julien Grall 
  Paul Durrant 

jobs:
 build-amd64-xsm  pass
 

Re: [Xen-devel] WTH is going on with memory hotplug sysf interface (was: Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks)

2017-03-10 Thread Michal Hocko
On Fri 10-03-17 14:58:07, Michal Hocko wrote:
[...]
> This would explain why onlining from the last block actually works but
> to me this sounds like a completely crappy behavior. All we need to
> guarantee AFAICS is that Normal and Movable zones do not overlap. I
> believe there is even no real requirement about ordering of the physical
> memory in Normal vs. Movable zones as long as they do not overlap. But
> let's keep it simple for the start and always enforce the current status
> quo that Normal zone is physically preceeding Movable zone.
> Can somebody explain why we cannot have a simple rule for Normal vs.
> Movable which would be:
>   - block [pfn, pfn+block_size] can be Normal if
> !zone_populated(MOVABLE) || pfn+block_size < 
> ZONE_MOVABLE->zone_start_pfn
>   - block [pfn, pfn+block_size] can be Movable if
> !zone_populated(NORMAL) || ZONE_NORMAL->zone_end_pfn < pfn

OK, so while I was playing with this setup some more I probably got why
this is done this way. All new memblocks are added to the zone Normal
where they are accounted as spanned but not present. When we do
online_movable we just cut from the end of the Normal zone and move it
to Movable zone. This sounds really awkward. What was the reason to go
this way? Why cannot we simply add those pages to the zone at the online
time?
-- 
Michal Hocko
SUSE Labs

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] x86/monitor: add support for descriptor access events

2017-03-10 Thread Vlad Ioan Topan
Adds monitor support for descriptor access events (reads & writes of
IDTR/GDTR/LDTR/TR) for the x86 architecture (VMX and SVM).

Signed-off-by: Vlad Ioan Topan 
---
 tools/libxc/include/xenctrl.h   |  2 ++
 tools/libxc/xc_monitor.c| 14 +++
 tools/tests/xen-access/xen-access.c | 29 -
 xen/arch/x86/hvm/hvm.c  | 35 ++
 xen/arch/x86/hvm/monitor.c  | 16 
 xen/arch/x86/hvm/svm/svm.c  | 50 +
 xen/arch/x86/hvm/vmx/vmx.c  | 45 +
 xen/arch/x86/monitor.c  | 18 +
 xen/arch/x86/vm_event.c |  6 +
 xen/include/asm-x86/domain.h|  1 +
 xen/include/asm-x86/hvm/hvm.h   |  2 ++
 xen/include/asm-x86/hvm/monitor.h   |  3 +++
 xen/include/asm-x86/hvm/support.h   |  2 ++
 xen/include/asm-x86/monitor.h   |  1 +
 xen/include/public/domctl.h |  1 +
 xen/include/public/vm_event.h   | 21 
 16 files changed, 245 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index a48981a..2de6c61 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1984,6 +1984,8 @@ int xc_monitor_mov_to_msr(xc_interface *xch, domid_t 
domain_id, uint32_t msr,
 int xc_monitor_singlestep(xc_interface *xch, domid_t domain_id, bool enable);
 int xc_monitor_software_breakpoint(xc_interface *xch, domid_t domain_id,
bool enable);
+int xc_monitor_descriptor_access(xc_interface *xch, domid_t domain_id,
+ bool enable);
 int xc_monitor_guest_request(xc_interface *xch, domid_t domain_id,
  bool enable, bool sync);
 int xc_monitor_debug_exceptions(xc_interface *xch, domid_t domain_id,
diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c
index 15a7c32..f99b6e3 100644
--- a/tools/libxc/xc_monitor.c
+++ b/tools/libxc/xc_monitor.c
@@ -129,6 +129,20 @@ int xc_monitor_singlestep(xc_interface *xch, domid_t 
domain_id,
 return do_domctl(xch, );
 }
 
+int xc_monitor_descriptor_access(xc_interface *xch, domid_t domain_id,
+ bool enable)
+{
+DECLARE_DOMCTL;
+
+domctl.cmd = XEN_DOMCTL_monitor_op;
+domctl.domain = domain_id;
+domctl.u.monitor_op.op = enable ? XEN_DOMCTL_MONITOR_OP_ENABLE
+: XEN_DOMCTL_MONITOR_OP_DISABLE;
+domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_DESC_ACCESS;
+
+return do_domctl(xch, );
+}
+
 int xc_monitor_guest_request(xc_interface *xch, domid_t domain_id, bool enable,
  bool sync)
 {
diff --git a/tools/tests/xen-access/xen-access.c 
b/tools/tests/xen-access/xen-access.c
index 9d4f957..c567cc0 100644
--- a/tools/tests/xen-access/xen-access.c
+++ b/tools/tests/xen-access/xen-access.c
@@ -337,7 +337,7 @@ void usage(char* progname)
 {
 fprintf(stderr, "Usage: %s [-m]  write|exec", progname);
 #if defined(__i386__) || defined(__x86_64__)
-fprintf(stderr, 
"|breakpoint|altp2m_write|altp2m_exec|debug|cpuid");
+fprintf(stderr, 
"|breakpoint|altp2m_write|altp2m_exec|debug|cpuid|desc_access");
 #elif defined(__arm__) || defined(__aarch64__)
 fprintf(stderr, "|privcall");
 #endif
@@ -368,6 +368,7 @@ int main(int argc, char *argv[])
 int altp2m = 0;
 int debug = 0;
 int cpuid = 0;
+int desc_access = 0;
 uint16_t altp2m_view_id = 0;
 
 char* progname = argv[0];
@@ -434,6 +435,10 @@ int main(int argc, char *argv[])
 {
 cpuid = 1;
 }
+else if ( !strcmp(argv[0], "desc_access") )
+{
+desc_access = 1;
+}
 #elif defined(__arm__) || defined(__aarch64__)
 else if ( !strcmp(argv[0], "privcall") )
 {
@@ -570,6 +575,16 @@ int main(int argc, char *argv[])
 goto exit;
 }
 }
+
+if ( desc_access )
+{
+rc = xc_monitor_descriptor_access(xch, domain_id, 1);
+if ( rc < 0 )
+{
+ERROR("Error %d setting descriptor access listener with 
vm_event\n", rc);
+goto exit;
+}
+}
 
 if ( privcall )
 {
@@ -595,6 +610,8 @@ int main(int argc, char *argv[])
 rc = xc_monitor_debug_exceptions(xch, domain_id, 0, 0);
 if ( cpuid )
 rc = xc_monitor_cpuid(xch, domain_id, 0);
+if ( desc_access )
+rc = xc_monitor_descriptor_access(xch, domain_id, 0);
 
 if ( privcall )
 rc = xc_monitor_privileged_call(xch, domain_id, 0);
@@ -779,6 +796,16 @@ int main(int argc, char *argv[])
 rsp.data = req.data;
 rsp.data.regs.x86.rip += req.u.cpuid.insn_length;
 break;
+case VM_EVENT_REASON_DESCRIPTOR_ACCESS:
+printf("Descriptor access: 

Re: [Xen-devel] [PATCH v7 3/5] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages.

2017-03-10 Thread Jan Beulich
>>> On 08.03.17 at 16:33,  wrote:
> @@ -197,6 +217,10 @@ static int hvmemul_do_io(
>   *   - If the IOREQ_MEM_ACCESS_WRITE flag is not set, treat it
>   *   like a normal PIO or MMIO that doesn't have an ioreq
>   *   server (i.e., by ignoring it).
> + *
> + *   - If the accesss is a read, this could be part of a
> + *   read-modify-write instruction, emulate the read so that we
> + *   have it.

"it" being what here? Grammatically the insn, but we don't care
about "having" the insn.

> @@ -226,6 +250,17 @@ static int hvmemul_do_io(
>  }
>  
>  /*
> + * This is part of a read-modify-write instruction.

"is" or "may be"?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.

2017-03-10 Thread Jan Beulich
>>> On 08.03.17 at 16:33,  wrote:
> changes in v7: 
>   - Use new ioreq server interface - XEN_DMOP_map_mem_type_to_ioreq_server.
>   - According to comments from George: removed domain_pause/unpause() in
> hvm_map_mem_type_to_ioreq_server(), because it's too expensive,
> and we can avoid the: 
> a> deadlock between p2m lock and ioreq server lock by using these locks
>in the same order - solved in patch 4;

That is, until patch 4 there is deadlock potential? I think you want to
re-order the patches if so. Or was it that the type can't really be used
until the last patch of the series? (I'm sorry, it's been quite a while
since the previous version.)

> @@ -365,6 +383,24 @@ static int dm_op(domid_t domid,
>  break;
>  }
>  
> +case XEN_DMOP_map_mem_type_to_ioreq_server:
> +{
> +const struct xen_dm_op_map_mem_type_to_ioreq_server *data =
> +_mem_type_to_ioreq_server;
> +
> +rc = -EINVAL;
> +if ( data->pad )
> +break;
> +
> +/* Only support for HAP enabled hvm. */
> +if ( !hap_enabled(d) )
> +break;

Perhaps better to give an error other than -EINVAL in this case?
If so, then the same error should likely also be used in your
set_mem_type() addition.

> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -99,6 +99,7 @@ static int hvmemul_do_io(
>  uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
>  {
>  struct vcpu *curr = current;
> +struct domain *currd = curr->domain;
>  struct hvm_vcpu_io *vio = >arch.hvm_vcpu.hvm_io;
>  ioreq_t p = {
>  .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
> @@ -140,7 +141,7 @@ static int hvmemul_do_io(
>   (p.dir != dir) ||
>   (p.df != df) ||
>   (p.data_is_ptr != data_is_addr) )
> -domain_crash(curr->domain);
> +domain_crash(currd);

If you mean to do this transformation here, then please do so
consistently for the entire function.

> @@ -177,8 +178,65 @@ static int hvmemul_do_io(
>  break;
>  case X86EMUL_UNHANDLEABLE:
>  {
> -struct hvm_ioreq_server *s =
> -hvm_select_ioreq_server(curr->domain, );
> +/*
> + * Xen isn't emulating the instruction internally, so see if
> + * there's an ioreq server that can handle it. Rules:
> + *
> + * - PIO and "normal" mmio run through hvm_select_ioreq_server()
> + * to choose the ioreq server by range. If no server is found,
> + * the access is ignored.
> + *
> + * - p2m_ioreq_server accesses are handled by the current
> + * ioreq_server for the domain, but there are some corner
> + * cases:

Who or what is "the current ioreq_server for the domain"?

> + *   - If the domain ioreq_server is NULL, assume this is a
> + *   race between the unbinding of ioreq server and guest fault
> + *   so re-try the instruction.
> + *
> + *   - If the IOREQ_MEM_ACCESS_WRITE flag is not set, treat it
> + *   like a normal PIO or MMIO that doesn't have an ioreq
> + *   server (i.e., by ignoring it).
> + */
> +struct hvm_ioreq_server *s = NULL;
> +p2m_type_t p2mt = p2m_invalid;
> +
> +if ( is_mmio )
> +{
> +unsigned long gmfn = paddr_to_pfn(addr);
> +
> +(void) get_gfn_query_unlocked(currd, gmfn, );

Stray cast.

> +if ( p2mt == p2m_ioreq_server )
> +{
> +unsigned int flags;
> +
> +s = p2m_get_ioreq_server(currd, );
> +
> +/*
> + * If p2mt is ioreq_server but ioreq_server is NULL,
> + * we probably lost a race with unbinding of ioreq
> + * server, just retry the access.
> + */
> +if ( s == NULL )
> +{
> +rc = X86EMUL_RETRY;
> +vio->io_req.state = STATE_IOREQ_NONE;
> +break;
> +}
> +
> +/*
> + * If the IOREQ_MEM_ACCESS_WRITE flag is not set,
> + * we should set s to NULL, and just ignore such
> + * access.
> + */
> +if ( !(flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE) )
> +s = NULL;

What is this about? You only allow WRITE registrations, so this looks
to be dead code. Yet if it is meant to guard against future enabling
of READ, then this clearly should not be done for reads.

> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain 
> *p2m, ept_entry_t *entry,
>  entry->r = entry->w = entry->x = 1;
>  entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>  break;
> +case p2m_ioreq_server:

Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document

2017-03-10 Thread Konrad Rzeszutek Wilk
On Fri, Mar 10, 2017 at 12:23:18PM +0900, Roger Pau Monné wrote:
> On Thu, Mar 09, 2017 at 07:29:34PM -0500, Konrad Rzeszutek Wilk wrote:
> > On Thu, Mar 09, 2017 at 01:26:45PM +, Julien Grall wrote:
> > > Hi Konrad,
> > > 
> > > On 09/03/17 11:17, Konrad Rzeszutek Wilk wrote:
> > > > On Thu, Mar 09, 2017 at 11:59:51AM +0900, Roger Pau Monné wrote:
> > > > > On Wed, Mar 08, 2017 at 02:12:09PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > > On Wed, Mar 08, 2017 at 07:06:23PM +, Julien Grall wrote:
> > > > > > .. this as for SR-IOV devices you need the drivers to kick the 
> > > > > > hardware
> > > > > > to generate the new bus addresses. And those (along with the BAR 
> > > > > > regions) are
> > > > > > not visible in ACPI (they are constructued dynamically).
> > > > > 
> > > > > There's already code in Xen [0] to find out the size of the BARs of 
> > > > > SR-IOV
> > > > > devices, but I'm not sure what's the intended usage of that, does it 
> > > > > need to
> > > > > happen _after_ the driver in Dom0 has done whatever magic for this to 
> > > > > work?
> > > > 
> > > > Yes. This is called via the PHYSDEVOP_pci_device_add hypercall when
> > > > the device driver in dom0 has finished "creating" the VF. See 
> > > > drivers/xen/pci.c
> > > 
> > > We are thinking to not use PHYSDEVOP_pci_device_add hypercall for ARM and 
> > > do
> > > the PCI scanning in Xen.
> > > 
> > > If I understand correctly what you said, only the PCI driver will be able 
> > > to
> > > kick SR-IOV device and Xen would not be able to detect the device until it
> > > has been fully configured. So it would mean that we have to keep
> > > PHYSDEVOP_pci_device_add around to know when Xen can use the device.
> > > 
> > > Am I correct?
> > 
> > Yes. Unless the PCI drivers come up with some other way to tell the
> > OS that oh, hey, there is this new PCI device with this BDF.
> > 
> > Or the underlaying bus on ARM can send some 'new device' information?
> 
> Hm, is this something standard between all the SR-IOV implementations, or each
> vendors have their own sauce?

Gosh, all of them have their own sauce. The only thing that is the same
is that suddenly behind the PF device there are PCI devies that are responding
to 0xcfc requests. MAgic!
> 
> Would it be possible to do this SR-IOV initialization inside of Xen, or that

Not possible inside of the Xen, unless you slurp up DPDK inside of Xen
or the Linux network drivers inside of it.

> requires ACPI information?

No ACPI information. On x86 there is no hardware notification system. It is
only by the goodwill of the PF driver kicking the Linux OS in 'scanning' the
new PCI addresses and 'discovering' the new devices.


> 
> Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 106586: regressions - FAIL

2017-03-10 Thread osstest service owner
flight 106586 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106586/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 105963
 test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 105963

version targeted for testing:
 ovmf 90ebd67808df57e435d7113a28ad26add429a766
baseline version:
 ovmf e0307a7dad02aa8c0cd8b3b0b9edce8ddb3fef2e

Last test of basis   105963  2017-02-21 21:43:31 Z   16 days
Failing since105980  2017-02-22 10:03:53 Z   16 days   44 attempts
Testing same since   106586  2017-03-10 09:15:47 Z0 days1 attempts


People who touched revisions under test:
  Anthony PERARD 
  Ard Biesheuvel 
  Bi, Dandan 
  Brijesh Singh 
  Chao Zhang 
  Chen A Chen 
  Dandan Bi 
  edk2-devel On Behalf Of rthomaiy <[mailto:edk2-devel-boun...@lists.01.org]>
  Fu Siyuan 
  Hao Wu 
  Hegde Nagaraj P 
  Hess Chen 
  Jeff Fan 
  Jiaxin Wu 
  Jiewen Yao 
  Laszlo Ersek 
  Leo Duran 
  Paolo Bonzini 
  Qin Long 
  Richard Thomaiyar 
  Ruiyu Ni 
  Star Zeng 
  Wu Jiaxin 
  Yonghong Zhu 
  Zhang Lubo 
  Zhang, Chao B 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 3727 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4] xen: Allow a default compiled-in command line using Kconfig

2017-03-10 Thread Jan Beulich
>>> On 09.03.17 at 04:13,  wrote:
> If CMDLINE is set, the cmdline_parse() routine will append the bootloader
> command line to this string, forming the complete command line
> before parsing.

I disagree to making it behave like this: There's no need to
concatenate both, simply parse them one after the other. The
built-in one is well known (and hence also has no need to be in
saved_cmdline). That'll avoid reducing the space available for
user specified options.

> @@ -1566,14 +1557,14 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  
>  /* Grab the DOM0 command line. */
>  cmdline = (char *)(mod[0].string ? __va(mod[0].string) : NULL);
> -if ( (cmdline != NULL) || (kextra != NULL) )
> +if ( (cmdline != NULL) || strlen(kextra) )

Is there any reason why kextra can't come out as NULL if unset,
avoiding the need to touch the code here? That would also avoid
making kextra a static variable.

> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -237,4 +237,28 @@ config FAST_SYMBOL_LOOKUP
> The only user of this is Live patching.
>  
> If unsure, say Y.
> +
> +config CMDLINE
> + string "Built-in hypervisor command string"
> + default ""
> + ---help---
> +   Enter arguments here that should be compiled into the hypervisor
> +   image and used at boot time.  If the bootloader provides a
> +   command line at boot time, it is appended to this string to
> +   form the full hypervisor command line, when the system boots.
> +   So if the same command line option is not cumulative,
> +   and was set both in this string and in the bootloader command line,
> +   only the latter (i.e. the one in the bootloader command line),
> +   will take effect.
> +
> +config CMDLINE_OVERRIDE
> + bool "Built-in command line overrides bootloader arguments"
> + default n
> + depends on EXPERT = "y"

Personally I think the other option should also be dependent on
EXPERT. And the one here should probably depend on CMDLINE != ""
- if someone really wants to force an empty one, (s)he could make
it be just a single blank.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] WTH is going on with memory hotplug sysf interface (was: Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks)

2017-03-10 Thread Michal Hocko
Let's CC people touching this logic. A short summary is that onlining
memory via udev is currently unusable for online_movable because blocks
are added from lower addresses while movable blocks are allowed from
last blocks. More below.

On Thu 09-03-17 13:54:00, Michal Hocko wrote:
> On Tue 07-03-17 13:40:04, Igor Mammedov wrote:
> > On Mon, 6 Mar 2017 15:54:17 +0100
> > Michal Hocko  wrote:
> > 
> > > On Fri 03-03-17 18:34:22, Igor Mammedov wrote:
> [...]
> > > > in current mainline kernel it triggers following code path:
> > > > 
> > > > online_pages()
> > > >   ...
> > > >if (online_type == MMOP_ONLINE_KERNEL) { 
> > > > 
> > > > if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, 
> > > > _shift))
> > > > return -EINVAL;  
> > > 
> > > Are you sure? I would expect MMOP_ONLINE_MOVABLE here
> > pretty much, reproducer is above so try and see for yourself
> 
> I will play with this...

OK so I did with -m 2G,slots=4,maxmem=4G -numa node,mem=1G -numa node,mem=1G 
which generated
[...]
[0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x-0x0009]
[0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x0010-0x3fff]
[0.00] ACPI: SRAT: Node 1 PXM 1 [mem 0x4000-0x7fff]
[0.00] ACPI: SRAT: Node 0 PXM 0 [mem 0x1-0x27fff] hotplug
[0.00] NUMA: Node 0 [mem 0x-0x0009] + [mem 
0x0010-0x3fff] -> [mem 0x-0x3fff]
[0.00] NODE_DATA(0) allocated [mem 0x3fffc000-0x3fff]
[0.00] NODE_DATA(1) allocated [mem 0x7ffdc000-0x7ffd]
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x1000-0x00ff]
[0.00]   DMA32[mem 0x0100-0x7ffd]
[0.00]   Normal   empty
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x1000-0x0009efff]
[0.00]   node   0: [mem 0x0010-0x3fff]
[0.00]   node   1: [mem 0x4000-0x7ffd]

so there is neither any normal zone nor movable one at the boot time.
Then I hotplugged 1G slot
(qemu) object_add memory-backend-ram,id=mem1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1

unfortunatelly the memory didn't show up automatically and I got
[  116.375781] acpi PNP0C80:00: Enumeration failure

so I had to probe it manually (prbably the BIOS my qemu uses doesn't
support auto probing - I haven't really dug further). Anyway the SRAT
table printed during the boot told that we should start at 0x1

# echo 0x1 > /sys/devices/system/memory/probe
# grep . /sys/devices/system/memory/memory32/valid_zones
Normal Movable

which looks reasonably right? Both Normal and Movable zones are allowed

# echo $((0x1+(128<<20))) > /sys/devices/system/memory/probe
# grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable

Huh, so our valid_zones have changed under our feet...

# echo $((0x1+2*(128<<20))) > /sys/devices/system/memory/probe
# grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal
/sys/devices/system/memory/memory34/valid_zones:Normal Movable

and again. So only the last memblock is considered movable. Let's try to
online them now.

# echo online_movable > /sys/devices/system/memory/memory34/state
# grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable Normal

This would explain why onlining from the last block actually works but
to me this sounds like a completely crappy behavior. All we need to
guarantee AFAICS is that Normal and Movable zones do not overlap. I
believe there is even no real requirement about ordering of the physical
memory in Normal vs. Movable zones as long as they do not overlap. But
let's keep it simple for the start and always enforce the current status
quo that Normal zone is physically preceeding Movable zone.
Can somebody explain why we cannot have a simple rule for Normal vs.
Movable which would be:
- block [pfn, pfn+block_size] can be Normal if
  !zone_populated(MOVABLE) || pfn+block_size < 
ZONE_MOVABLE->zone_start_pfn
- block [pfn, pfn+block_size] can be Movable if
  !zone_populated(NORMAL) || ZONE_NORMAL->zone_end_pfn < pfn

I haven't fully grokked all the restrictions on the movable zone size
based on the kernel parameters (find_zone_movable_pfns_for_nodes) but
this shouldn't really make the situation really much more complicated I
believe because those parameters should be mostly about static
initialization rather 

[Xen-devel] [linux-linus test] 106581: regressions - trouble: blocked/broken/fail/pass

2017-03-10 Thread osstest service owner
flight 106581 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106581/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail REGR. vs. 59254
 test-armhf-armhf-xl  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-xsm  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-credit2  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-libvirt-xsm 11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-cubietruck 11 guest-start fail REGR. vs. 59254
 test-armhf-armhf-libvirt 11 guest-start   fail REGR. vs. 59254
 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail REGR. vs. 
59254
 test-armhf-armhf-xl-arndale  11 guest-start   fail REGR. vs. 59254
 test-armhf-armhf-xl-multivcpu 11 guest-start  fail REGR. vs. 59254

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds  3 host-install(3) broken REGR. vs. 59254
 test-amd64-amd64-xl-rtds  9 debian-installfail REGR. vs. 59254
 test-armhf-armhf-xl-vhd   9 debian-di-install   fail baseline untested
 test-armhf-armhf-libvirt-raw  9 debian-di-install   fail baseline untested
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 59254
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 59254
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 59254

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 build-arm64-xsm   5 xen-buildfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 build-arm64   5 xen-buildfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass

version targeted for testing:
 linuxc1aa905a304e4b5e6a3fe112ec62d9c1c7b0c155
baseline version:
 linux45820c294fe1b1a9df495d57f40585ef2d069a39

Last test of basis59254  2015-07-09 04:20:48 Z  610 days
Failing since 59348  2015-07-10 04:24:05 Z  609 days  324 attempts
Testing same since   106581  2017-03-10 03:19:28 Z0 days1 attempts


8050 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  fail
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  blocked 
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumprun  pass
 

Re: [Xen-devel] Enabling #VE for a domain from dom0

2017-03-10 Thread Andrew Cooper
On 10/03/17 11:57, Vlad-Ioan TOPAN wrote:
>>> Is there any reason for the other check I've mentioned, performed when
>>> setting the "suppres #VE" bit in PTEs? Unsuppressing #VEs for a page
>>> will only do anything if the guest has already enabled #VE, so the
>>> previous issue doesn't apply in this case.
>> suppress #VE has a negative meaning.  Once #VE is enabled, all frames
>> need SVE for Xen to continue getting EPT_VIOLATION vmexits per usual. 
>> It is only whitelisted frames which should have SVE cleared on them.
>>
>> Having said that, by the time you have cooperation between several
>> domains, I don't see a reason for excluding a remote domain updating SVE.
> Sorry for following up so late on this; what would be an acceptable
> approach to allow a remote domain to update the "Suppress VE" bit for a
> given domain?
>
> The current page permission code flow via do_altp2m_op() / 
> HVMOP_altp2m_set_mem_access only allows setting actual page permissions,
> and all the other APIs it calls along the way (p2m_set_mem_access(),
> set_mem_access() and p2m_set_altp2m_mem_access() in mem_access.c) don't
> have a "suppress VE" parameter (up until the ept_set_entry() call,
> which gets sve as "current->domain != d" from
> p2m_set_altp2m_mem_access()). At that level a separate parameter for
> sVE does not make sense, since the code is shared with the ARM arch.
>
> Would it be acceptable to add an HVMOP_altp2m_set_suppres_ve op?

Given the reasoning in
https://lists.xenproject.org/archives/html/xen-devel/2017-03/msg01309.html
I think it is reasonable, after the domain has initialised altp2m
support, to allow a remote domain to alter all aspects of altp2m
permissions.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Enabling #VE for a domain from dom0

2017-03-10 Thread Vlad-Ioan TOPAN
> > Is there any reason for the other check I've mentioned, performed when
> > setting the "suppres #VE" bit in PTEs? Unsuppressing #VEs for a page
> > will only do anything if the guest has already enabled #VE, so the
> > previous issue doesn't apply in this case.
> 
> suppress #VE has a negative meaning.  Once #VE is enabled, all frames
> need SVE for Xen to continue getting EPT_VIOLATION vmexits per usual. 
> It is only whitelisted frames which should have SVE cleared on them.
> 
> Having said that, by the time you have cooperation between several
> domains, I don't see a reason for excluding a remote domain updating SVE.

Sorry for following up so late on this; what would be an acceptable
approach to allow a remote domain to update the "Suppress VE" bit for a
given domain?

The current page permission code flow via do_altp2m_op() / 
HVMOP_altp2m_set_mem_access only allows setting actual page permissions,
and all the other APIs it calls along the way (p2m_set_mem_access(),
set_mem_access() and p2m_set_altp2m_mem_access() in mem_access.c) don't
have a "suppress VE" parameter (up until the ept_set_entry() call,
which gets sve as "current->domain != d" from
p2m_set_altp2m_mem_access()). At that level a separate parameter for
sVE does not make sense, since the code is shared with the ARM arch.

Would it be acceptable to add an HVMOP_altp2m_set_suppres_ve op?

Thank you,
--
Vlad-Ioan TOPAN
Linux Kernel Development Lead
Bitdefender

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Xen Project Developer and Design Summit location and CfP pre-announcement - also looking for PMC members

2017-03-10 Thread Lars Kurth
Dear community members,

just a quick note to let you know about the upcoming next steps for the Xen 
Project Developer and Design Summit.

The event location and date will be 

Corinthia Hotel, Budapest (http://www.corinthia.com/en/hotels/budapest)
July 11- 13

Note that this will be a 3 day event with talks in the morning design sessions 
(using the Hackathon format) in the afternoon. Also, unlike in the past, lunch 
will be included. This means that there will be *no* Hackathon in 2017, as the 
Hackathon is essentially folded into the summit.

Note that I am looking for volunteers to help choosing the content as part of 
the Program Management Committee.
I will also be looking for companies to sponsor our event. If you think your 
employer may be willing to sponsor the event, please get in touch with me.

The schedule for the website, CfP, etc. will be
* March 14 (formal CfP announcement and event website launch)
* CfP Close: April 14 (open 4 weeks)
* CfP Reviews: April 17 - 28 (2 weeks to review)
* Schedule Announce: May 10 (8-9 weeks to promote)
* Event: July 11 –13

As usual, there will be a social event on either the 11th or 12th: details are 
still to be organised and will be announced later.

Best Regards
Lars
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-03-10 Thread Andrew Cooper
On 10/03/17 07:34, Jan Beulich wrote:
 On 09.03.17 at 18:29,  wrote:
>> On Thu, Mar 9, 2017 at 9:56 AM, Jan Beulich  wrote:
>>> However - is this interface supposed to be usable by a guest on itself?
>>> Arguably the same question would apply to some of the other sub-ops
>>> too, but anyway.
>> AFAIK the only op the guest would use on itself is
>> HVMOP_altp2m_vcpu_enable_notify.
> Which then means we should move all of them out of here, into a
> suitable domctl. That will in turn reduce the scope of the bogus
> interface versioning, which Andrew did point out, quite a bit.

The original usecase for altp2m was for an entirely in-guest agent,
which is why they got in as hvmops to start with.  I don't think it is
wise to break that.

I think there needs to be slightly finer grain control, identifying
whether a domain may use altp2m, and whether it may configure altp2m
permissions itself.

The nature of altp2m means that using EPTP switching/etc necessarily can
only happen from inside guest context, but whether you trust the domain
to make adjustments to the permissions itself depends on your usecase
and threat model.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-unstable test] 106580: regressions - trouble: blocked/broken/fail/pass

2017-03-10 Thread Andrew Cooper
On 10/03/17 08:37, Jan Beulich wrote:
 On 10.03.17 at 08:20,  wrote:
>> flight 106580 xen-unstable real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/106580/ 
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>  test-armhf-armhf-xl-arndale   3 host-install(3)broken REGR. vs. 
>> 106534
>>  test-amd64-amd64-migrupgrade 10 xen-boot/dst_hostfail REGR. vs. 
>> 106534
> The NMI watchdog has hit the EOI timer waiting to be able to send
> an IPI on CPU1:
>
> Mar 10 00:09:32.745677 (XEN) Xen call trace:
> Mar 10 00:09:32.745727 (XEN)[] _spin_lock+0x2c/0x4f
> Mar 10 00:09:32.745779 (XEN)[] 
> on_selected_cpus+0x2c/0xc6
> Mar 10 00:09:32.753699 (XEN)[] 
> irq.c#irq_guest_eoi_timer_fn+0x142/0x165
> Mar 10 00:09:32.761711 (XEN)[] 
> timer.c#execute_timer+0x47/0x62
> Mar 10 00:09:32.769683 (XEN)[] 
> timer.c#timer_softirq_action+0xdb/0x22c
> Mar 10 00:09:32.769744 (XEN)[] 
> softirq.c#__do_softirq+0x7f/0x8a
> Mar 10 00:09:32.777697 (XEN)[] do_softirq+0x13/0x15
> Mar 10 00:09:32.785792 (XEN)[] 
> entry.o#process_softirqs+0x21/0x30
>
> That lock is being held by CPU2:
>
> Mar 10 00:15:25.133639 (XEN) Xen call trace:
> Mar 10 00:15:25.133655 (XEN)[] __bitmap_empty+0x54/0x96
> Mar 10 00:15:25.141636 (XEN)[] 
> on_selected_cpus+0xad/0xc6
> Mar 10 00:15:25.149635 (XEN)[] 
> powernow.c#powernow_cpufreq_cpu_init+0x20d/0x372
> Mar 10 00:15:25.157633 (XEN)[] 
> cpufreq_add_cpu+0x1d6/0x5d3
> Mar 10 00:15:25.157654 (XEN)[] 
> cpufreq_cpu_init+0x17/0x1a
> Mar 10 00:15:25.165658 (XEN)[] set_px_pminfo+0x2b6/0x2f7
> Mar 10 00:15:25.165679 (XEN)[] 
> do_platform_op+0xe69/0x1959
> Mar 10 00:15:25.173667 (XEN)[] pv_hypercall+0x1ef/0x42d
> Mar 10 00:15:25.181678 (XEN)[] 
> entry.o#test_all_events+0/0x30
>
> Register state tells us that it's CPU5 not responding. The only piece
> of information we have about CPU5 is
>
> Mar 10 00:09:32.809709 (XEN) CPU5 @ e008:82d080134083 ()
>
> which is the also in _spin_lock(), but which I'm afraid is too little to
> diagnose the issue. I'm therefore wondering whether we wouldn't
> better default "async-show-all" to true in debug builds.
>
> What I'm also puzzled by is that the system is still partly alive after
> the panic: There's Dom0 output, and it is also reacting to debug
> key input. I would have expected a panic to bring down the system
> right away...

Not very surprising.  We crashed because the IPI lock was unavailable,
then disable the watchdog in machine_halt() and try to IPI again.  CPU1
is almost certainly waiting trying to broadcast __machine_halt().

This is the second odd corner case we have seen around machine_halt(). 
The last one was because of being unsafe to use if you panic() from the
middle of context_switch(), as interrupts are re-enabled, and a guest
irq hits an assertion.  The solution in both cases to make it more
reliable is to an NMI broadcast and leave interrupts disabled.

IMO, noreboot isn't a clever thing to be using at all.  OSSTest should
be installing a crash kernel and collecting crash logs, which will be
far more useful to aid diagnosis.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table

2017-03-10 Thread Vijay Kilari
On Fri, Mar 3, 2017 at 8:52 PM, Jan Beulich  wrote:
 On 03.03.17 at 16:16,  wrote:
>> On Fri, Mar 3, 2017 at 8:22 PM, Julien Grall  wrote:
>>> int __init acpi_numa_init(void)
>>> {
>>> if (!acpi_parse_table()) {
>>> acpi_table_parse_srat(TYPE_CPU_AFFINITY);
>>
>> This is not defined for ARM. We have to make this also arch specific.
>> So all arch specific code from xen/drivers/acpi/numa.c should be moved
>> to arch specific to xen/arch/x86/srat.c
>
> There surely is a way to specify processor affinity on ARM?

In ARM, we use ACPI_SRAT_TYPE_GICC_AFFINITY type entry in SRAT
to extract cpu to proximity mapping

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1] boot allocator: Use arch helper for virt_to_mfn on DIRECTMAP

2017-03-10 Thread Vijay Kilari
On Fri, Mar 10, 2017 at 3:09 PM, Julien Grall  wrote:
> Hello,
>
> It is not the first time I am saying this. Please CC *all* the maintainers
> of the code you modify. Give a look at scripts/get_maintainers.pl.

I got below maintainers when I ran the script on complete patch as below.

ubuntu@ubuntu:~/xen$ ./scripts/get_maintainer.pl
outgoing/0001-boot-allocator-Use-arch-helper-for-virt_to_mfn-on-DI.patch
Stefano Stabellini 
Julien Grall 
Jan Beulich 
Andrew Cooper 
xen-devel@lists.xen.org
ubuntu@ubuntu:~/xen$

But I think you are seeing different/full maintainer list with
./scripts/get_maintainer.pl -f xen/common/page_alloc.c

>
> On 03/10/2017 07:32 AM, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> On ARM, virt_to_mfn uses the hardware for address
>> translation. So if the virtual address is not mapped translation
>> fault is raised.
>> On ARM with NUMA, While initializing second memory node,
>> panic is triggered from init_node_heap() when virt_to_mfn()
>> is called for DIRECTMAP_VIRT region address.
>>
>> The init_node_heap() makes a check on MFN passed to ensure that
>> MFN less than max MFN. For this, check is made against virt_to_mfn of
>> DIRECTMAP_VIRT_END region. Since DIRECMAP_VIRT region is not mapped
>> to any physical memory on ARM, it fails.
>>
>> In this patch, instead of calling virt_to_mfn(), arch helper
>> arch_directmap_virt_to_mfn() is introduced. For ARM this arch helper
>> will return 0 and for x86 this helper does virt_to_mfn.
>
>
> I don't understand why you return 0 for ARM. It will prevent the code to
> optimize the case where all the node memory is in the direct mapped region.
> Instead it will allocate extra page in xenheap.
>
> On the previous discussion [1], it has been said that on ARM64 all the
> memory is currently direct mapped. So this check should *always* be true and
> not false. It was suggested to move the whole check in arch specific code.
>
> If this suggestion does not fit, please explain why. Similarly you need to
> justify why you return 0 for ARM because so far it looks a random value.

Thanks for pointing out.
I was biased by your statement "On ARM64, all the memory is direct
mapped so far, so this check will
always be false.", Sorry, I missed your later reply.

>
> Regards,
>
> [1] https://lists.xen.org/archives/html/xen-devel/2017-01/msg00575.html
>
> --
> Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] tools/kdd: don't use a pointer to an unaligned field.

2017-03-10 Thread Tim Deegan
The 'val' field in the packet is byte-aligned (because it is part of a
packed struct), but the pointer argument to kdd_rdmsr() has the normal
alignment constraints for a uint64_t *.  Use a local variable to make sure
the passed pointer has the correct alignment.

Reported-by: Roger Pau Monné 
Signed-off-by: Tim Deegan 
---
 tools/debugger/kdd/kdd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/debugger/kdd/kdd.c b/tools/debugger/kdd/kdd.c
index 70f007e..1bd5dd5 100644
--- a/tools/debugger/kdd/kdd.c
+++ b/tools/debugger/kdd/kdd.c
@@ -710,11 +710,13 @@ static void kdd_handle_read_ctrl(kdd_state *s)
 static void kdd_handle_read_msr(kdd_state *s)
 {
 uint32_t msr = s->rxp.cmd.msr.msr;
+uint64_t val;
 int ok;
 KDD_LOG(s, "Read MSR 0x%"PRIx32"\n", msr);
 
-ok = (kdd_rdmsr(s->guest, s->cpuid, msr, >txp.cmd.msr.val) == 0);
+ok = (kdd_rdmsr(s->guest, s->cpuid, msr, ) == 0);
 s->txp.cmd.msr.msr = msr;
+s->txp.cmd.msr.val = val;
 s->txp.cmd.msr.status = (ok ? KDD_STATUS_SUCCESS : KDD_STATUS_FAILURE);
 kdd_send_cmd(s, KDD_CMD_READ_MSR, 0);
 }
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/kdd: remove unneeded packed attributes

2017-03-10 Thread Tim Deegan
Hi,

At 12:02 +0900 on 10 Mar (1489147353), Roger Pau Monne wrote:
> The layout of the structures make them already aligned, there's no need for 
> the
> packed attributes.

That's not right -- kdd_pkt gets 8 bytes longer with this patch.

In fact this code has the bug that clang's new warning is supposed to
catch: we use these packed fields to parse byte-aligned packets out
of a stream, and then pass the address of a 64-bit field to an
external function as a uint64_t *.  I'll send a patch.

Cheers,

Tim.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/8] xen/9pfs: introduce Xen 9pfs backend

2017-03-10 Thread Greg Kurz
On Thu, 9 Mar 2017 18:54:26 +0100
Greg Kurz  wrote:

> On Mon,  6 Mar 2017 18:12:43 -0800
> Stefano Stabellini  wrote:
> 
> > Introduce the Xen 9pfs backend: add struct XenDevOps to register as a
> > Xen backend and add struct V9fsTransport to register as v9fs transport.
> > 
> > All functions are empty stubs for now.
> > 
> > Signed-off-by: Stefano Stabellini 
> > CC: anthony.per...@citrix.com
> > CC: jgr...@suse.com
> > CC: Aneesh Kumar K.V 
> > CC: Greg Kurz 
> > ---
> >  hw/9pfs/xen-9p-backend.c | 96 
> > 
> >  1 file changed, 96 insertions(+)
> >  create mode 100644 hw/9pfs/xen-9p-backend.c
> > 
> > diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
> > new file mode 100644
> > index 000..924fb64
> > --- /dev/null
> > +++ b/hw/9pfs/xen-9p-backend.c
> > @@ -0,0 +1,96 @@
> > +/*
> > + * Xen 9p backend
> > + *
> > + * Copyright Aporeto 2017
> > + *
> > + * Authors:
> > + *  Stefano Stabellini 
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +
> > +#include "hw/hw.h"
> > +#include "hw/9pfs/9p.h"
> > +#include "hw/xen/xen_backend.h"
> > +#include "xen_9pfs.h"
> > +#include "qemu/config-file.h"
> > +#include "fsdev/qemu-fsdev.h"
> > +
> > +struct Xen9pfsDev {
> > +struct XenDevice xendev;  /* must be first */
> > +};  
> 
> According to HACKING, this should be:
> 
> typedef struct Xen9pfsDev {
> struct XenDevice xendev;  /* must be first */
> } Xen9pfsDev;
> 
> and...
> 
> > +
> > +static ssize_t xen_9pfs_pdu_vmarshal(V9fsPDU *pdu,
> > + size_t offset,
> > + const char *fmt,
> > + va_list ap)
> > +{
> > +return 0;
> > +}
> > +
> > +static ssize_t xen_9pfs_pdu_vunmarshal(V9fsPDU *pdu,
> > +   size_t offset,
> > +   const char *fmt,
> > +   va_list ap)
> > +{
> > +return 0;
> > +}
> > +
> > +static void xen_9pfs_init_out_iov_from_pdu(V9fsPDU *pdu,
> > +   struct iovec **piov,
> > +   unsigned int *pniov)
> > +{
> > +}
> > +
> > +static void xen_9pfs_init_in_iov_from_pdu(V9fsPDU *pdu,
> > +  struct iovec **piov,
> > +  unsigned int *pniov,
> > +  size_t size)
> > +{
> > +}
> > +
> > +static void xen_9pfs_push_and_notify(V9fsPDU *pdu)
> > +{
> > +}
> > +
> > +static const struct V9fsTransport xen_9p_transport = {
> > +.pdu_vmarshal = xen_9pfs_pdu_vmarshal,
> > +.pdu_vunmarshal = xen_9pfs_pdu_vunmarshal,
> > +.init_in_iov_from_pdu = xen_9pfs_init_in_iov_from_pdu,
> > +.init_out_iov_from_pdu = xen_9pfs_init_out_iov_from_pdu,
> > +.push_and_notify = xen_9pfs_push_and_notify,
> > +};
> > +
> > +static int xen_9pfs_init(struct XenDevice *xendev)
> > +{
> > +return 0;
> > +}
> > +
> > +static int xen_9pfs_free(struct XenDevice *xendev)
> > +{
> > +return -1;
> > +}
> > +
> > +static int xen_9pfs_connect(struct XenDevice *xendev)
> > +{
> > +return 0;
> > +}
> > +
> > +static void xen_9pfs_alloc(struct XenDevice *xendev)
> > +{
> > +}
> > +
> > +static void xen_9pfs_disconnect(struct XenDevice *xendev)
> > +{
> > +}
> > +
> > +struct XenDevOps xen_9pfs_ops = {
> > +.size   = sizeof(struct Xen9pfsDev),  
> 
> ... s/struct //
> 
> > +.flags  = DEVOPS_FLAG_NEED_GNTDEV,
> > +.alloc  = xen_9pfs_alloc,
> > +.init   = xen_9pfs_init,
> > +.initialise= xen_9pfs_connect,
> > +.disconnect = xen_9pfs_disconnect,
> > +.free   = xen_9pfs_free,
> > +};  
> 
> With the above comments addressed:
> 
> Reviewed-by: Greg Kurz 

Hmm... there's still a problem actually. This patch breaks build for some
targets with '--enable-xen --enable-virtfs':

  LINKcris-softmmu/qemu-system-cris
../hw/xen/xen_backend.o: In function `xen_be_register_common':
/home/greg/Work/qemu/qemu-9p/hw/xen/xen_backend.c:588: undefined reference to 
`xen_9pfs_ops'

This happens because only targets that support virtio and PCI pull the
9pfs/fsdev object files. This is an effort to have smaller QEMU binaries,
based on the historical dependency of 9pfs on virtio. This patch breaks
this assumption if CONFIG_XEN_BACKEND is defined but the target doesn't
define CONFIG_VIRTIO and CONFIG_PCI.

Since 9pfs can now be used with another transport, I guess that the
dependency on virtio and PCI isn't right anymore. The new condition
for targets to support 9pfs should rather be something like:

CONFIG_VIRTFS == y && CONFIG_SOFTMMU == y && (CONFIG_VIRTIO == y || 
CONFIG_XEN_BACKEND == y)

This would cause all targets to pull the 9pfs/fsdev object files 

Re: [Xen-devel] [PATCH v1] boot allocator: Use arch helper for virt_to_mfn on DIRECTMAP

2017-03-10 Thread Julien Grall

Hello,

It is not the first time I am saying this. Please CC *all* the 
maintainers of the code you modify. Give a look at 
scripts/get_maintainers.pl.


On 03/10/2017 07:32 AM, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K 

On ARM, virt_to_mfn uses the hardware for address
translation. So if the virtual address is not mapped translation
fault is raised.
On ARM with NUMA, While initializing second memory node,
panic is triggered from init_node_heap() when virt_to_mfn()
is called for DIRECTMAP_VIRT region address.

The init_node_heap() makes a check on MFN passed to ensure that
MFN less than max MFN. For this, check is made against virt_to_mfn of
DIRECTMAP_VIRT_END region. Since DIRECMAP_VIRT region is not mapped
to any physical memory on ARM, it fails.

In this patch, instead of calling virt_to_mfn(), arch helper
arch_directmap_virt_to_mfn() is introduced. For ARM this arch helper
will return 0 and for x86 this helper does virt_to_mfn.


I don't understand why you return 0 for ARM. It will prevent the code to 
optimize the case where all the node memory is in the direct mapped 
region. Instead it will allocate extra page in xenheap.


On the previous discussion [1], it has been said that on ARM64 all the 
memory is currently direct mapped. So this check should *always* be true 
and not false. It was suggested to move the whole check in arch specific 
code.


If this suggestion does not fit, please explain why. Similarly you need 
to justify why you return 0 for ARM because so far it looks a random value.


Regards,

[1] https://lists.xen.org/archives/html/xen-devel/2017-01/msg00575.html

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 18/24] x86: L2 CAT: implement get hw info flow.

2017-03-10 Thread Jan Beulich
>>> On 10.03.17 at 06:57,  wrote:
> On 17-03-09 08:13:59, Jan Beulich wrote:
>> >>> On 15.02.17 at 09:49,  wrote:
>> > +static bool l2_cat_get_feat_info(const struct feat_node *feat,
>> > + uint32_t data[], uint32_t array_len)
>> > +{
>> > +if ( !data || 2 > array_len )
>> > +return false;
>> > +
>> > +data[CBM_LEN] = feat->info.l2_cat_info.cbm_len;
>> > +data[COS_MAX] = feat->info.l2_cat_info.cos_max;
>> 
>> What about PSR_FLAG? Which puts under question the
>> array_len check at the start of this and its sibling functions: If
>> more array entries are provided, they'd all be left uninitialized
>> without the caller having a way to know. Coming back to the
>> data structures being the same for all features, there should
>> presumably be a structure instead of an array be used for
>> communication here, and ...
>> 
>> > @@ -754,6 +755,11 @@ struct xen_sysctl_psr_cat_op {
>> >  #define XEN_SYSCTL_PSR_CAT_L3_CDP   (1u << 0)
>> >  uint32_t flags; /* OUT: CAT flags */
>> >  } l3_info;
>> > +
>> > +struct {
>> > +uint32_t cbm_len;   /* OUT: CBM length */
>> > +uint32_t cos_max;   /* OUT: Maximum COS */
>> > +} l2_info;
>> 
>> ... this should use the same structure as l3_info (flags being
>> set to zero until a use arises). This would then also allow for
>> more code to be shared instead of duplicated.
>> 
> Ok, will reuse l3_info for L2 CAT. May modify its name to cat_info.

Of course.

> But for MBA or future feature, its info is different with CAT. So, it may
> not be approriate to use a structure as parameter for 'psr_get_info'.
> So I would prefer to keep data[]. Of course, I will correct array_len check
> per Roger's suggestion, like "array_len != PSR_INFO_SIZE". Is that good to
> you?

Yes, if there is a need for it to be that way down the road, then I'm
fine as long as both the risk of array overrun and that of using
uninitialized data are minimized as much as possible.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 11/24] x86: refactor psr: set value: implement cos id picking flow.

2017-03-10 Thread Jan Beulich
>>> On 10.03.17 at 06:40,  wrote:
> On 17-03-09 07:10:33, Jan Beulich wrote:
>> >>> On 15.02.17 at 09:49,  wrote:
>> > @@ -397,6 +408,25 @@ static int l3_cat_compare_val(const uint64_t val[],
>> >  return 0;
>> >  }
>> >  
>> > +static bool l3_cat_fits_cos_max(const uint64_t val[],
>> > +const struct feat_node *feat,
>> > +unsigned int cos)
>> > +{
>> > +uint64_t l3_def_cbm;
>> > +
>> > +l3_def_cbm = (1ull << feat->info.l3_cat_info.cbm_len) - 1;
>> 
>> Seeing this pattern recur I wonder whether this also wouldn't be
>> something that could be stored once in a generic manner, avoiding
>> the need for this per-feature callback (cos_max should be common
>> to all features too - not the values, but the abstract notion - so
>> perhaps get_cos_max() isn't needed as a callback either).
>> 
> DYM to create a member in 'struct feat_node'? E.g:
> uint64_t def_val;

Yes. All data items applicable to all features should be outside
the feature dependent data structures. And which pieces of
data to store you should determine by the overall aim of
preferably as little feature-specific code as possible, i.e.
whatever can be made common with reasonable effort should
be made common.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 10/24] x86: refactor psr: set value: implement cos finding flow.

2017-03-10 Thread Jan Beulich
>>> On 10.03.17 at 06:35,  wrote:
> On 17-03-08 10:03:10, Jan Beulich wrote:
>> >>> On 15.02.17 at 09:49,  wrote:
>> > @@ -356,6 +369,34 @@ static int l3_cat_set_new_val(uint64_t val[],
>> >  return 0;
>> >  }
>> >  
>> > +static int l3_cat_compare_val(const uint64_t val[],
>> > +  const struct feat_node *feat,
>> > +  unsigned int cos, bool *found)
>> > +{
>> > +uint64_t l3_def_cbm;
>> > +
>> > +l3_def_cbm = (1ull << feat->info.l3_cat_info.cbm_len) - 1;
>> 
>> Please only calculate the value on the path you need it. Also this
>> will again degenerate of cbm_len == 64.
>> 
> Sorry, what do you mean? I need get the default value of L3 CAT here
> to check if input val equals the default one if cos exceeds cos_max.

No, you don't need it at function scope. You only need it ...

>> > +/*
>> > + * Different features' cos_max are different. If cos id of the feature
>> > + * being set exceeds other feature's cos_max, the val of other feature
>> > + * must be default value. HW supports such case.
>> > + */
>> > +if ( cos > feat->info.l3_cat_info.cos_max )
>> > +{
>> > +if ( val[0] != l3_def_cbm )

... in this more narrow one. Specifically no code outside the outer
if() needs the variable.

>> > +{
>> > +*found = false;
>> > +return -ENOENT;
>> 
>> What is the difference between this "not found" and ...
>> 
> This is an error case. If input cos exceeds the cos_max and the input val is
> not default value, that means the input parameters exceed HW ability. We 
> should
> report error back.
> 
>> > +}
>> > +*found = true;
>> > +}
>> > +else
>> > +*found = (val[0] == feat->cos_reg_val[cos]);
>> > +
>> > +return 0;
>> 
>> ... the possible one here? I.e. why once -ENOENT and once 0 as
>> return value?
>> 
> 0 means success. '*found' means if the val is found or not.

I still don't understand the difference, but perhaps this will all sort
itself out once the cos_max checks get done in common code and
the default values (suitably stored, as suggested elsewhere) can
also be checked by common code. In fact with that I'm not even
sure you will continue to require this feature dependent actor
anymore.

> Per Roger's
> suggestion, I will refine this function to use return value to check
> if it is found or not.

Well, let's see how this ends up being.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 09/24] x86: refactor psr: set value: assemble features value array.

2017-03-10 Thread Jan Beulich
>>> On 10.03.17 at 04:21,  wrote:
> On 17-03-08 09:54:04, Jan Beulich wrote:
>> >>> On 15.02.17 at 09:49,  wrote:
>> > @@ -207,6 +233,29 @@ static enum psr_feat_type 
>> > psr_cbm_type_to_feat_type(enum cbm_type type)
>> >  return feat_type;
>> >  }
>> >  
>> > +static bool psr_check_cbm(unsigned int cbm_len, uint64_t cbm)
>> > +{
>> > +unsigned int first_bit, zero_bit;
>> > +
>> > +/* Set bits should only in the range of [0, cbm_len). */
>> > +if ( cbm & (~0ull << cbm_len) )
>> 
>> This will not do what you intend for cbm_len == 64.
>> 
> cbm_len is not 64. cbm_len means the CBM value length, how many bits. For L3
> CAT, it may be 11 bits. For L2 CAT, it may be 8 bits.

And there is an _architectural_ guarantee for them to never
reach 64 bits?

>> > +static int l3_cat_get_old_val(uint64_t val[],
>> > +  const struct feat_node *feat,
>> > +  unsigned int old_cos)
>> > +{
>> > +if ( old_cos > feat->info.l3_cat_info.cos_max )
>> 
>> Afaics this condition is the only L3 CAT specific thing in this function.
>> Should more of it be moved into common code? Same below for
>> set_new_val.
>> 
> Sorry, I may not understand your intention. For different features, they have
> different cos_max. Do you mean I should abstract a callback function for all
> features to handle this cos_max check? Thanks!

Yes. All features have a cos_max (even if the precise values may
differ), so common code can do the check if the values is stored
suitably in a field outside of any feature dependent data structures.

>> > -static int assemble_val_array(uint64_t *val,
>> > +static int combine_val_array(uint64_t *val,
>> 
>> Same comment as earlier on - please decide for a final name right
>> when introducing a function. In fact I'd prefer it to remain
>> "assemble".
>> 
> This is corrected in next version. I will change name back to assemble if you
> like it.

I'm fine with it. Personally I'd like "gather" even better, but the
naming choice in the end is up to you as long as it reflects the
purpose of the function (which imo "combine" doesn't).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 08/24] x86: refactor psr: set value: implement framework.

2017-03-10 Thread Jan Beulich
>>> On 10.03.17 at 08:46,  wrote:
> On 17-03-08 09:07:10, Jan Beulich wrote:
> [...]
>> >  /* Called with domain lock held, no extra lock needed for 'psr_cos_ids' */
>> >  static void psr_free_cos(struct domain *d)
>> >  {
>> > -if( !d->arch.psr_cos_ids )
>> > +unsigned int socket, cos;
>> > +
>> > +if ( !d->arch.psr_cos_ids )
>> >  return;
>> 
>> As in an earlier patch I've asked for this check to be removed, I
>> think you will need to add a check on socket_info to be non-
>> NULL somewhere in this function.
>> 
> d->arch.psr_cos_ids is used in the loop below. Shall I not check it?
>  
>> > +for ( socket = 0; socket < nr_sockets; socket++ )
>> > +{
>> > +struct psr_socket_info *info;
>> > +
>> > +/* cos 0 is default one which does not need be handled. */
>> > +if ( (cos = d->arch.psr_cos_ids[socket]) == 0 )
> Here.

Oh, yes, you should. But the check should be added here instead
of in the earlier patch.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 08/24] x86: refactor psr: set value: implement framework.

2017-03-10 Thread Jan Beulich
>>> On 10.03.17 at 03:54,  wrote:
> On 17-03-08 09:07:10, Jan Beulich wrote:
>> >>> On 15.02.17 at 09:49,  wrote:
>> > +static int write_psr_msr(unsigned int socket, unsigned int cos,
>> > + const uint64_t *val)
>> > +{
>> > +return -ENOENT;
>> > +}
>> 
>> Is this function intended you write just one MSR, or potentially many?
>> In the latter case the name would perhaps better be "write_psr_msrs()".
>> 
> For one feature, it does set one MSR.

In which case - why is the value being passed by pointer?

>> > +ref[old_cos]--;
>> > +spin_unlock(>ref_lock);
>> > +
>> > +/*
>> > + * Step 6:
>> > + * Save the COS ID into current domain's psr_cos_ids[] so that we can 
>> > know
>> > + * which COS the domain is using on the socket. One domain can only 
>> > use
>> > + * one COS ID at same time on each socket.
>> > + */
>> > +d->arch.psr_cos_ids[socket] = cos;
>> 
>> So the domain has not been paused, i.e. some of its vCPU-s may
>> be running on other pCPU-s (including ones on the socket in
>> question). How come it is safe to update this value here?
>> 
> This is a domctl operation. It is protected by domctl_lock which is locked in
> do_domctl().

But that lock doesn't keep the subject domain's vCPU-s from
running on other pCPU-s at the same time.

>> >  /* Called with domain lock held, no extra lock needed for 'psr_cos_ids' */
>> >  static void psr_free_cos(struct domain *d)
>> >  {
>> > -if( !d->arch.psr_cos_ids )
>> > +unsigned int socket, cos;
>> > +
>> > +if ( !d->arch.psr_cos_ids )
>> >  return;
>> 
>> As in an earlier patch I've asked for this check to be removed, I
>> think you will need to add a check on socket_info to be non-
>> NULL somewhere in this function.
>> 
> Ok, will do it in the loop.

Please don't - loop invariants should be put outside of loops.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 07/24] x86: refactor psr: implement get value flow.

2017-03-10 Thread Jan Beulich
>>> On 10.03.17 at 02:50,  wrote:
> On 17-03-08 08:35:53, Jan Beulich wrote:
>> >>> On 15.02.17 at 09:49,  wrote:
>> > @@ -498,6 +516,15 @@ int psr_get_info(unsigned int socket, enum cbm_type 
>> > type,
>> >  if ( feat->feature != feat_type )
>> >  continue;
>> >  
>> > +if ( d )
>> > +{
>> > +cos = d->arch.psr_cos_ids[socket];
>> > +if ( feat->ops.get_val(feat, cos, type, val) )
>> > +return 0;
>> > +else
>> > +break;
>> > +}
>> > +
>> >  if ( feat->ops.get_feat_info(feat, data, array_len) )
>> >  return 0;
>> >  else
>> 
>> Looking at the context here - is it really a good idea to overload the
>> function in this way, rather than creating a second one? Your
>> only complicating the live of the callers, as can be seen e.g. ...
>> 
> These codes were separated into two functions before, 'psr_get_info' and
> 'psr_get_val'. But there are some common codes. So, Konrad suggested me
> to create a helper function to reduce redundant codes.

I think you went too far - breaking out common code is nice, but if
the two callers now pass NULL/0 each as two of the last four
arguments, this is a clear indication that you've folded more code
than was actually common.

>> > @@ -507,10 +534,16 @@ int psr_get_info(unsigned int socket, enum cbm_type 
> type,
>> >  return -ENOENT;
>> >  }
>> >  
>> > -int psr_get_l3_cbm(struct domain *d, unsigned int socket,
>> > -   uint64_t *cbm, enum cbm_type type)
>> > +int psr_get_info(unsigned int socket, enum cbm_type type,
>> > + uint32_t data[], unsigned int array_len)
>> > +{
>> > +return psr_get(socket, type, data, array_len, NULL, NULL);
>> 
>> ... here and ...
>> 
>> > +}
>> > +
>> > +int psr_get_val(struct domain *d, unsigned int socket,
>> > +uint64_t *val, enum cbm_type type)
>> >  {
>> > -return 0;
>> > +return psr_get(socket, type, NULL, 0, d, val);
>> >  }
>> 
>> ... here (it is a bad sign that both pass NULL on either side).
>> 
> Yes, these things look not good. But to keep a common helper I have to pass
> all necessary parameters into it. What is your suggestion?

If there's really that much code to be shared (which I continue not
to be convinced of), use of a function pointer may make this look a
little better. But I think when having tried to carry out the earlier
suggestion, you should have responded back right away indicating
this would result in other ugliness.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 06/24] x86: refactor psr: implement get hw info flow.

2017-03-10 Thread Yi Sun
On 17-03-10 01:57:57, Jan Beulich wrote:
> >>> On 10.03.17 at 02:43,  wrote:
> > On 17-03-08 08:15:33, Jan Beulich wrote:
> >> >>> On 15.02.17 at 09:49,  wrote:
> >> > --- a/xen/arch/x86/psr.c
> >> > +++ b/xen/arch/x86/psr.c
> >> > @@ -84,6 +84,7 @@ enum psr_feat_type {
> >> >  PSR_SOCKET_L3_CAT = 0,
> >> >  PSR_SOCKET_L3_CDP,
> >> >  PSR_SOCKET_L2_CAT,
> >> > +PSR_SOCKET_UNKNOWN = 0x,
> >> 
> >> Any reason to use this value, instead of just the next sequential one?
> >> 
> > This is an error value used below, in 'psr_cbm_type_to_feat_type'. To make 
> > it
> > explicitly different, I assign a big value to it.
> 
> How does that "big value" help?
> 
It is just my habit. I will remove this assignment if you don't like it.

> Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 06/24] x86: refactor psr: implement get hw info flow.

2017-03-10 Thread Jan Beulich
>>> On 10.03.17 at 02:43,  wrote:
> On 17-03-08 08:15:33, Jan Beulich wrote:
>> >>> On 15.02.17 at 09:49,  wrote:
>> > --- a/xen/arch/x86/psr.c
>> > +++ b/xen/arch/x86/psr.c
>> > @@ -84,6 +84,7 @@ enum psr_feat_type {
>> >  PSR_SOCKET_L3_CAT = 0,
>> >  PSR_SOCKET_L3_CDP,
>> >  PSR_SOCKET_L2_CAT,
>> > +PSR_SOCKET_UNKNOWN = 0x,
>> 
>> Any reason to use this value, instead of just the next sequential one?
>> 
> This is an error value used below, in 'psr_cbm_type_to_feat_type'. To make it
> explicitly different, I assign a big value to it.

How does that "big value" help?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v8 04/24] x86: refactor psr: implement CPU init and free flow.

2017-03-10 Thread Jan Beulich
>>> On 10.03.17 at 02:32,  wrote:
> On 17-03-08 07:56:54, Jan Beulich wrote:
>> >>> On 15.02.17 at 09:49,  wrote:
>> > -static int psr_cpu_prepare(unsigned int cpu)
>> > +static void cpu_init_work(void)
>> > +{
>> > +struct psr_socket_info *info;
>> > +unsigned int socket;
>> > +unsigned int cpu = smp_processor_id();
>> > +struct feat_node *feat;
>> > +struct cpuid_leaf regs = { .a = 0, .b = 0, .c = 0, .d = 0 };
>> 
>> I don't see you needing an initializer here at all, but if you really
>> want one for some reason, the same effect can be had with just
>> {}.
>> 
> Konrad suggested me to initialize it like this in v5 patch review comments.
> I think he has experienced some strange issues when he forgot to set _all_
> the entries a structure allocated on the stack.

I can see there being cases where this is desirable; I don't think
this is one of them. (See also a related comment I had made on
a later patch.)

>> > +if ( !cpu_has(_cpu_data, X86_FEATURE_PQE) )
>> 
>> Do you really mean to not universally check the global (boot CPU)
>> flag? I.e. possibly differing behavior on different CPUs?
>> 
> Yes, different sockets may have different configurations. E.g. sockt 0 has
> PQE support but socket 1 does not.

For all other CPU features we assume symmetry. Why would we
not do so here? And how would things even work in that case,
when a vCPU gets moved (by the scheduler) from a more capable
to a less capable pCPU?

>> >  static void psr_cpu_init(void)
>> >  {
>> > +if ( socket_info )
>> > +cpu_init_work();
>> > +
>> >  psr_assoc_init();
>> >  }
>> >  
>> >  static void psr_cpu_fini(unsigned int cpu)
>> >  {
>> > +if ( socket_info )
>> > +cpu_fini_work(cpu);
>> >  return;
>> >  }
>> 
>> Is it really useful to use another layer of helper functions here?
>> 
> The reason we define 'cpu_fini_work' is to match 'cpu_init_work'.

And the question was for both of them.

> If we move
> codes of 'cpu_init_work' into 'psr_cpu_init', the codes look messy.

I don't think that's the case; I could see this as a valid argument if
the calling functions above were already quite complex.

> That is
> the reason we define 'cpu_init_work'. Do you think if it is acceptable to 
> you?

Well, I won't NAK the patch if you keep them, but I'd prefer the
functions to be folded into their callers.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 0/6] Remove dependency on __LINE__

2017-03-10 Thread Jan Beulich
>>> On 10.03.17 at 09:29,  wrote:
> On 03/09/2017 10:34 AM, Jan Beulich wrote:
> On 08.03.17 at 18:46,  wrote:
>>> Sorry for the long delay since the first version of this series
>>> (previously called "Make building xSplice patches easier").  Here is a
>>> set of changes that remove the use of __LINE__ when building with NDEBUG
>>> and LivePatch enabled.  Tested to boot on x86.  Compile-tested on arm.
>>
>> What I'm missing here is an evaluation of possible alternatives, as
>> converting from __LINE__ to code addresses implies an extra step
>> when wanting to look up the place where a problem occurred,
>> which I generally consider undesirable. For example, I had briefly
>> discussed with Andrew the addition of #line directives, but iirc he
>> had indicated some issue with that. Furthermore there is the
>> option (at least for some of the more simple patches) of squeezing
>> more than one statement on a single line, perhaps just for patch
>> creation. And I guess one can think of more.
>>
> 
> Both of the options you suggested require extra work when building a 
> live patch. This series is intended to remove that extra work as the 
> process is already more difficult than I'd like.
> 
> I don't think using addr2line is much harder than looking up the line 
> number directly, especially as it is something that needs to be done for 
> many other crashes.

I agree with the second half, but I don't think I do for the first part:
I don't know how you would manage this, but for me this is several
steps:
- Locate and download the correct package (it's not reasonable to
  keep them all, there simply are too many)
- extract xen-syms or xen.efi (and of course I will need to know
  which one was actually used)
- run addr2line on it
Quite a bit more than just going to the source file and finding the
line right away - in most cases line numbers, as opposed to code
addresses, don't change much despite a delta of a few dozen
backports or other patches, so normally one doesn't even need
to look at the _precise_ tree.

> On your suggestion, use of __LINE__ is disabled only 
> when CONFIG_LIVEPATCH=y and NDEBUG which means that in a typical 
> development scenario, you'd still get line numbers.

That's understood, but I'm concerned about extra overhead when
looking at customer reports.

> They would be 
> removed only for "release" builds in which it is likely that the source 
> code & debuginfo is archived somewhere such that looking up a line 
> number requires several steps anyway. I could suggest making it a 
> separate config option but IIRC you prefer to limit the number of config 
> options.

I prefer to limit ones with non-EXPERT-visible prompts, yes. But
I also don't see how a separate config option would improve the
situation.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 106582: regressions - FAIL

2017-03-10 Thread osstest service owner
flight 106582 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106582/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 105963
 test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 105963

version targeted for testing:
 ovmf 55ce319c48e97d8051ae7e43abdd608afab6584d
baseline version:
 ovmf e0307a7dad02aa8c0cd8b3b0b9edce8ddb3fef2e

Last test of basis   105963  2017-02-21 21:43:31 Z   16 days
Failing since105980  2017-02-22 10:03:53 Z   15 days   43 attempts
Testing same since   106582  2017-03-10 03:19:28 Z0 days1 attempts


People who touched revisions under test:
  Anthony PERARD 
  Ard Biesheuvel 
  Bi, Dandan 
  Brijesh Singh 
  Chao Zhang 
  Chen A Chen 
  Dandan Bi 
  edk2-devel On Behalf Of rthomaiy <[mailto:edk2-devel-boun...@lists.01.org]>
  Fu Siyuan 
  Hao Wu 
  Hegde Nagaraj P 
  Hess Chen 
  Jeff Fan 
  Jiaxin Wu 
  Jiewen Yao 
  Laszlo Ersek 
  Leo Duran 
  Paolo Bonzini 
  Qin Long 
  Richard Thomaiyar 
  Ruiyu Ni 
  Star Zeng 
  Wu Jiaxin 
  Yonghong Zhu 
  Zhang Lubo 
  Zhang, Chao B 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 3705 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-unstable test] 106580: regressions - trouble: blocked/broken/fail/pass

2017-03-10 Thread Jan Beulich
>>> On 10.03.17 at 08:20,  wrote:
> flight 106580 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/106580/ 
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-armhf-armhf-xl-arndale   3 host-install(3)broken REGR. vs. 
> 106534
>  test-amd64-amd64-migrupgrade 10 xen-boot/dst_hostfail REGR. vs. 
> 106534

The NMI watchdog has hit the EOI timer waiting to be able to send
an IPI on CPU1:

Mar 10 00:09:32.745677 (XEN) Xen call trace:
Mar 10 00:09:32.745727 (XEN)[] _spin_lock+0x2c/0x4f
Mar 10 00:09:32.745779 (XEN)[] on_selected_cpus+0x2c/0xc6
Mar 10 00:09:32.753699 (XEN)[] 
irq.c#irq_guest_eoi_timer_fn+0x142/0x165
Mar 10 00:09:32.761711 (XEN)[] 
timer.c#execute_timer+0x47/0x62
Mar 10 00:09:32.769683 (XEN)[] 
timer.c#timer_softirq_action+0xdb/0x22c
Mar 10 00:09:32.769744 (XEN)[] 
softirq.c#__do_softirq+0x7f/0x8a
Mar 10 00:09:32.777697 (XEN)[] do_softirq+0x13/0x15
Mar 10 00:09:32.785792 (XEN)[] 
entry.o#process_softirqs+0x21/0x30

That lock is being held by CPU2:

Mar 10 00:15:25.133639 (XEN) Xen call trace:
Mar 10 00:15:25.133655 (XEN)[] __bitmap_empty+0x54/0x96
Mar 10 00:15:25.141636 (XEN)[] on_selected_cpus+0xad/0xc6
Mar 10 00:15:25.149635 (XEN)[] 
powernow.c#powernow_cpufreq_cpu_init+0x20d/0x372
Mar 10 00:15:25.157633 (XEN)[] cpufreq_add_cpu+0x1d6/0x5d3
Mar 10 00:15:25.157654 (XEN)[] cpufreq_cpu_init+0x17/0x1a
Mar 10 00:15:25.165658 (XEN)[] set_px_pminfo+0x2b6/0x2f7
Mar 10 00:15:25.165679 (XEN)[] do_platform_op+0xe69/0x1959
Mar 10 00:15:25.173667 (XEN)[] pv_hypercall+0x1ef/0x42d
Mar 10 00:15:25.181678 (XEN)[] 
entry.o#test_all_events+0/0x30

Register state tells us that it's CPU5 not responding. The only piece
of information we have about CPU5 is

Mar 10 00:09:32.809709 (XEN) CPU5 @ e008:82d080134083 ()

which is the also in _spin_lock(), but which I'm afraid is too little to
diagnose the issue. I'm therefore wondering whether we wouldn't
better default "async-show-all" to true in debug builds.

What I'm also puzzled by is that the system is still partly alive after
the panic: There's Dom0 output, and it is also reacting to debug
key input. I would have expected a panic to bring down the system
right away...

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 0/6] Remove dependency on __LINE__

2017-03-10 Thread Ross Lagerwall

On 03/09/2017 10:34 AM, Jan Beulich wrote:

On 08.03.17 at 18:46,  wrote:

Sorry for the long delay since the first version of this series
(previously called "Make building xSplice patches easier").  Here is a
set of changes that remove the use of __LINE__ when building with NDEBUG
and LivePatch enabled.  Tested to boot on x86.  Compile-tested on arm.


What I'm missing here is an evaluation of possible alternatives, as
converting from __LINE__ to code addresses implies an extra step
when wanting to look up the place where a problem occurred,
which I generally consider undesirable. For example, I had briefly
discussed with Andrew the addition of #line directives, but iirc he
had indicated some issue with that. Furthermore there is the
option (at least for some of the more simple patches) of squeezing
more than one statement on a single line, perhaps just for patch
creation. And I guess one can think of more.



Both of the options you suggested require extra work when building a 
live patch. This series is intended to remove that extra work as the 
process is already more difficult than I'd like.


I don't think using addr2line is much harder than looking up the line 
number directly, especially as it is something that needs to be done for 
many other crashes. On your suggestion, use of __LINE__ is disabled only 
when CONFIG_LIVEPATCH=y and NDEBUG which means that in a typical 
development scenario, you'd still get line numbers. They would be 
removed only for "release" builds in which it is likely that the source 
code & debuginfo is archived somewhere such that looking up a line 
number requires several steps anyway. I could suggest making it a 
separate config option but IIRC you prefer to limit the number of config 
options.


--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V2] x86/altp2m: Added xc_altp2m_set_mem_access_multi()

2017-03-10 Thread Razvan Cojocaru
On 03/10/2017 09:31 AM, Jan Beulich wrote:
 On 09.03.17 at 18:15,  wrote:
>> On 03/09/2017 06:56 PM, Jan Beulich wrote:
>> On 09.03.17 at 10:38,  wrote:
 @@ -4535,6 +4536,30 @@ static int do_altp2m_op(
  a.u.set_mem_access.view);
  break;
  
 +case HVMOP_altp2m_set_mem_access_multi:
 +if ( a.u.set_mem_access_multi.pad ||
 + a.u.set_mem_access_multi.opaque >= 
 a.u.set_mem_access_multi.nr 
>> )
 +{
 +rc = -EINVAL;
 +break;
 +}
 +rc = p2m_set_mem_access_multi(d, 
 a.u.set_mem_access_multi.pfn_list,
 +  
 a.u.set_mem_access_multi.access_list,
 +  a.u.set_mem_access_multi.nr,
 +  a.u.set_mem_access_multi.opaque,
 +  MEMOP_CMD_MASK,
 +  a.u.set_mem_access_multi.view);
 +if ( rc > 0 )
 +{
 +a.u.set_mem_access_multi.opaque = rc;
 +if ( __copy_to_guest(arg, , 1) )
 +rc = -EFAULT;
 +else
 +rc = hypercall_create_continuation(__HYPERVISOR_hvm_op, 
>> "lh",
 +   HVMOP_altp2m, arg);
 +}
 +break;
>>>
>>> Okay, so this is a hvmop, in which case I'm fine with the continuation
>>> model used.
>>>
>>> However - is this interface supposed to be usable by a guest on itself?
>>> Arguably the same question would apply to some of the other sub-op
>>> too, but anyway.
>>
>> Not for any of our use cases. The whole point is for dom0 (or another
>> suitably privileged domain) to monitor another guest that consequently
>> can't, by design, evade detection of bad behaviour by acting at a higher
>> privilege level than the protection software. It wouldn't make sense for
>> a domain to be doing this on itself.
> 
> In which case this should be a domctl.

Fair enough, if nobody objects I'll then just modify
XENMEM_access_op_set_access_multi to take a view_id as well an just
piggyback on that. It already does the right thing underneath.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel