Re: [Xen-devel] [RFC 3/7] arm64:armds: ARM Compiler 6.6 does not accept `rx` registers naming for AArch64

2019-11-12 Thread Julien Grall
Hi,

Aside from what Stefano and Jan already said, please reword the commit
title. It should reflect what the commit does, not describe the error (that
should be part of the commit message).

On Wed, 6 Nov 2019, 18:19 Andrii Anisov,  wrote:

> From: Andrii Anisov 
>
> So get the code duplication with `x`-es.
>

Please provide a link to the documentation so this can be cross-checked.
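
For what it is worth, one way the duplication could be reduced is to select
only the register-name prefix per architecture and build the names with
string-literal concatenation. This is a sketch of that idea, not something
proposed in the series: EXAMPLE_REG_PREFIX and EXAMPLE_REG() are
hypothetical names, and whether ARM Compiler 6.6 accepts the resulting
register specifiers would need to be checked against its documentation.

```c
/* Pick the register-name prefix once, based on the target architecture. */
#if defined(__aarch64__)
#define EXAMPLE_REG_PREFIX "x"
#else
#define EXAMPLE_REG_PREFIX "r"
#endif

/* Adjacent string literals are concatenated: EXAMPLE_REG(0) is "x0" or "r0". */
#define EXAMPLE_REG(n) EXAMPLE_REG_PREFIX #n

/*
 * Intended use in a single set of __declare_arg_N() macros
 * (explicit register variables are a GNU C extension):
 *     register unsigned long r0 asm(EXAMPLE_REG(0)) = (uint32_t)a0;
 */

/* Returns the name that would be used for argument register 3. */
const char *example_reg3_name(void)
{
    return EXAMPLE_REG(3);
}
```

With something like this, only the prefix would differ between the ARM_32
and ARM_64 builds, and the argument-declaration macros would exist once.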

Cheers,


> Signed-off-by: Andrii Anisov 
> ---
>  xen/include/asm-arm/smccc.h | 60
> +
>  1 file changed, 60 insertions(+)
>
> diff --git a/xen/include/asm-arm/smccc.h b/xen/include/asm-arm/smccc.h
> index 126399d..3fa1144 100644
> --- a/xen/include/asm-arm/smccc.h
> +++ b/xen/include/asm-arm/smccc.h
> @@ -120,6 +120,8 @@ struct arm_smccc_res {
>  #define __constraint_read_6 __constraint_read_5, "r" (r6)
>  #define __constraint_read_7 __constraint_read_6, "r" (r7)
>
> +#ifdef CONFIG_ARM_32
> +
>  #define __declare_arg_0(a0, res)\
>  struct arm_smccc_res*___res = res;  \
>  register unsigned long  r0 asm("r0") = (uint32_t)a0;\
> @@ -174,6 +176,64 @@ struct arm_smccc_res {
>  __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res);   \
>  register typeof(a7) r7 asm("r7") = __a7
>
> +#else /* ARM_64 */
> +
> +#define __declare_arg_0(a0, res)\
> +struct arm_smccc_res*___res = res;  \
> +register unsigned long  r0 asm("x0") = (uint32_t)a0;\
> +register unsigned long  r1 asm("x1");   \
> +register unsigned long  r2 asm("x2");   \
> +register unsigned long  r3 asm("x3")
> +
> +#define __declare_arg_1(a0, a1, res)\
> +typeof(a1) __a1 = a1;   \
> +struct arm_smccc_res*___res = res;  \
> +register unsigned long  r0 asm("x0") = (uint32_t)a0;\
> +register unsigned long  r1 asm("x1") = __a1;\
> +register unsigned long  r2 asm("x2");   \
> +register unsigned long  r3 asm("x3")
> +
> +#define __declare_arg_2(a0, a1, a2, res)\
> +typeof(a1) __a1 = a1;   \
> +typeof(a2) __a2 = a2;   \
> +struct arm_smccc_res*___res = res;  \
> +register unsigned long  r0 asm("x0") = (uint32_t)a0;\
> +register unsigned long  r1 asm("x1") = __a1;\
> +register unsigned long  r2 asm("x2") = __a2;\
> +register unsigned long  r3 asm("x3")
> +
> +#define __declare_arg_3(a0, a1, a2, a3, res)\
> +typeof(a1) __a1 = a1;   \
> +typeof(a2) __a2 = a2;   \
> +typeof(a3) __a3 = a3;   \
> +struct arm_smccc_res*___res = res;  \
> +register unsigned long  r0 asm("x0") = (uint32_t)a0;\
> +register unsigned long  r1 asm("x1") = __a1;\
> +register unsigned long  r2 asm("x2") = __a2;\
> +register unsigned long  r3 asm("x3") = __a3
> +
> +#define __declare_arg_4(a0, a1, a2, a3, a4, res)\
> +typeof(a4) __a4 = a4;   \
> +__declare_arg_3(a0, a1, a2, a3, res);   \
> +register unsigned long r4 asm("x4") = __a4
> +
> +#define __declare_arg_5(a0, a1, a2, a3, a4, a5, res)\
> +typeof(a5) __a5 = a5;   \
> +__declare_arg_4(a0, a1, a2, a3, a4, res);   \
> +register typeof(a5) r5 asm("x5") = __a5
> +
> +#define __declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res)\
> +typeof(a6) __a6 = a6;   \
> +__declare_arg_5(a0, a1, a2, a3, a4, a5, res);   \
> +register typeof(a6) r6 asm("x6") = __a6
> +
> +#define __declare_arg_7(a0, a1, a2, a3, a4, a5, a6, a7, res)\
> +typeof(a7) __a7 = a7;   \
> +__declare_arg_6(a0, a1, a2, a3, a4, a5, a6, res);   \
> +register typeof(a7) r7 asm("x7") = __a7
> +
> +#endif
> +
>  #define ___declare_args(count, ...) __declare_arg_ ## count(__VA_ARGS__)
>  #define __declare_args(count, ...)  ___declare_args(count, __VA_ARGS__)
>
> --
> 2.7.4
>
>
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC 6/7] arm: Introduce dummy empty functions for data only C files

2019-11-12 Thread Julien Grall
Hi,

On Wed, 6 Nov 2019, 18:20 Andrii Anisov,  wrote:

> From: Andrii Anisov 
>
> ARM Compiler 6 has a proven bug: it compiles data only C files with
> SoftVFP attributes. This leads to a failed linkage afterwards with
> an error:
>

And is there no way to force-disable the SoftVFP attributes?


> Error: L6242E: Cannot link object built_in.o as its attributes are
> incompatible with the image attributes.
> ... A64 clashes with SoftVFP.
>
> The known workaround is introducing some code into the affected file,
> e.g. an empty (non-static) function is enough.
>

Was this reported to Arm? If so, what was their answer?

> Signed-off-by: Andrii Anisov 
> ---
>  xen/arch/arm/platforms/brcm-raspberry-pi.c | 2 ++
>  xen/arch/arm/platforms/thunderx.c  | 2 ++
>  xen/xsm/flask/gen-policy.py| 4 
>  3 files changed, 8 insertions(+)
>
> diff --git a/xen/arch/arm/platforms/brcm-raspberry-pi.c
> b/xen/arch/arm/platforms/brcm-raspberry-pi.c
> index b697fa2..7ab1810 100644
> --- a/xen/arch/arm/platforms/brcm-raspberry-pi.c
> +++ b/xen/arch/arm/platforms/brcm-raspberry-pi.c
> @@ -40,6 +40,8 @@ static const struct dt_device_match rpi4_blacklist_dev[]
> __initconst =
>  { /* sentinel */ },
>  };
>
> +void brcm_raspberry_pi_dummy_func(void) {}
> +
>  PLATFORM_START(rpi4, "Raspberry Pi 4")
>  .compatible = rpi4_dt_compat,
>  .blacklist_dev  = rpi4_blacklist_dev,
> diff --git a/xen/arch/arm/platforms/thunderx.c
> b/xen/arch/arm/platforms/thunderx.c
> index 9b32a29..8015323 100644
> --- a/xen/arch/arm/platforms/thunderx.c
> +++ b/xen/arch/arm/platforms/thunderx.c
> @@ -33,6 +33,8 @@ static const struct dt_device_match
> thunderx_blacklist_dev[] __initconst =
>  { /* sentinel */ },
>  };
>
> +void thunderx_dummy_func(void) {}
> +
>  PLATFORM_START(thunderx, "THUNDERX")
>  .compatible = thunderx_dt_compat,
>  .blacklist_dev = thunderx_blacklist_dev,
> diff --git a/xen/xsm/flask/gen-policy.py b/xen/xsm/flask/gen-policy.py
> index c7501e4..73bf7d2 100644
> --- a/xen/xsm/flask/gen-policy.py
> +++ b/xen/xsm/flask/gen-policy.py
> @@ -21,3 +21,7 @@ sys.stdout.write("""
>  };
>  const unsigned int __initconst xsm_flask_init_policy_size = %d;
>  """ % policy_size)
> +
> +sys.stdout.write("""
> +void policy_dummy_func(void) {}
> +""")
> --
> 2.7.4
>
>

Re: [Xen-devel] [RFC 5/7] WIP:arm64:armds: Build XEN with ARM Compiler 6.6

2019-11-12 Thread Julien Grall
On Tue, 12 Nov 2019, 06:27 Stefano Stabellini, 
wrote:

> On Wed, 6 Nov 2019, Andrii Anisov wrote:
> > From: Andrii Anisov 
> >
> > Here several ARM Compiler 6.6 issues are solved or provided a
> work-around:
> >
> >  - Scatter file is pretty primitive, it has no feature to define symbols
> >  - ARM linker defined symbols are not counted as referred if only
> mentioned
> >in a steering file for rename or resolve, so a header file is used to
> >redefine GNU linker script symbols into armlink defined symbols.
> >
> >  - _srodata type clashes by type with __start_bug_frames so can not be
> both
> >redefined as Load$$_rodata_bug_frames_0$$Base. Use resolve feature of
> armlink
> >steering file.
>
> Why _srodata and __start_bug_frames cannot both be defined as
> Load$$_rodata_bug_frames_0$$Base when actually in the case of:
>
> > +#define __per_cpu_data_end  Load$$_bss_percpu$$Limit
> > +#define __bss_end   Load$$_bss_percpu$$Limit
> > +#define _endLoad$$_bss_percpu$$Limit
>
> They are all defined as "Load$$_bss_percpu$$Limit"? And:
>
> > +#define __init_end  Load$$_bss$$Base
> > +#define __bss_start Load$$_bss$$Base
>
> They are both defined as "Load$$_bss$$Base"? What's special about
> "Load$$_rodata_bug_frames_0$$Base"?
>
>
> >  - C style shift operators are missed among supported scatter file
> expressions,
> >so some needed values are hardcoded in scatter file.
> >
> >  - Rename correspondent ARM Linker defined symbols to those needed by
> `symbols` tool
> >using steering file feature.
> >
> >  - ARM Compiler 6.6 tools are not able to rename sections, so we still
> need
> >GNU toolchain's objcopy for this.
> >
> > Signed-off-by: Andrii Anisov 
> > ---
> >  xen/Rules.mk|   6 +
> >  xen/arch/arm/Makefile   |  24 
> >  xen/arch/arm/xen.scat.S | 266
> 
>
> I would strongly suggest to rename this file with something not
> potentially related to scat
>

To be honest, I don't think this file should even exist. This looks like a
copy of xen.lds.S with a different syntax. Furthermore, Stefano's comments
show that it is going to be hard to maintain and to check that everything
has been written correctly.

So how about trying to abstract xen.lds.S?


>
> > +/*
> > + * armlink does not understand shifts in scat file expressions
> > + * so hardcode needed values
> > + */
>

Please give a pointer to the armlink documentation in the commit message,
so we can easily cross-check what's happening.

In this case, I don't particularly like the re-definition of the defines
outside of their header. This is going to make it more difficult if we
have to update them in the future.

I can see a few ways to do it:

 - Avoid using shifts in the definitions
 - Find a way to evaluate the value (maybe similar to asm-offsets) before
using them.

My preference would be the latter but I could be convinced of the former.
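
To illustrate the second option: the idea is to let the C compiler evaluate
the shift-based constants once and hand plain numbers to the scatter file.
A much simplified, host-side sketch of that idea is below. The names
EXAMPLE_PAGE_SHIFT, EXAMPLE_PAGE_SIZE and emit_define() are hypothetical,
and the real asm-offsets mechanism extracts values from the cross-compiler's
output rather than running a program.

```c
#include <stdio.h>

/* Hypothetical constants defined with a shift, standing in for the real ones. */
#define EXAMPLE_PAGE_SHIFT 12
#define EXAMPLE_PAGE_SIZE  (1UL << EXAMPLE_PAGE_SHIFT)

/*
 * Format one "#define NAME 0xVALUE" line into buf. The value is already
 * evaluated by the C compiler, so the consumer never sees a shift operator.
 * Returns the number of characters written (as snprintf() does).
 */
int emit_define(char *buf, size_t len, const char *name, unsigned long value)
{
    return snprintf(buf, len, "#define %s 0x%lx\n", name, value);
}
```

The generated lines could then go into a header that the scatter file
includes, so the shifted definition stays in one place.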

Cheers,

[Xen-devel] [qemu-mainline test] 144050: tolerable trouble: fail/pass/starved - PUSHED

2019-11-12 Thread osstest service owner
flight 144050 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144050/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10   fail  like 142915
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 142915
 test-armhf-armhf-libvirt 14 saverestore-support-check fail  like 142915
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 142915
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 142915
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail  like 142915
 test-amd64-i386-xl-pvshim 12 guest-start  fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-arm64-arm64-xl  13 migrate-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl  14 saverestore-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-check fail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-qemuu-rhel6hvm-intel  2 hosts-allocate starved n/a

version targeted for testing:
 qemuu                039e285e095c20a88e623b927654b161aaf9d914
baseline version:
 qemuu                e9d42461920f6f40f4d847a5ba18e90d095ed0b9

Last test of basis   142915  2019-10-19 14:49:41 Z   24 days
Failing since        143030  2019-10-22 11:08:39 Z   21 days   20 attempts
Testing same since   144050  2019-11-12 14:10:13 Z    0 days    1 attempts


People who touched revisions under test:
  Aleksandar Markovic 
  Alex Bennée 
  Alex Williamson 
  Alexander Shopov 
  Alexey Kardashevskiy 
  Alistair Francis 
  Andreas Schwab 
  Andrew Jones 
  Andrey Smirnov 
  Artyom Tarasenko 
  Basil Salman 
  Bin Meng 
  Bishara AbuHattoum 
  Bruce Rogers 
  Christophe Lyon 
  Cleber Rosa 
  Clement Deschamps 
  Cornelia Huck 
  Cédric Le Goater 
  Daniel P. Berrangé 
  

[Xen-devel] [xen-unstable test] 144042: tolerable FAIL

2019-11-12 Thread osstest service owner
flight 144042 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144042/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 144020
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 144020
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 144020
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 144020
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail  like 144020
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 144020
 test-armhf-armhf-libvirt 14 saverestore-support-check fail  like 144020
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail like 144020
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 144020
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-check fail   never pass
 test-amd64-i386-xl-pvshim 12 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-check fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 xen  a458d3bd0d2585275c128556ec0cbd818c6a7b0d
baseline version:
 xen  a458d3bd0d2585275c128556ec0cbd818c6a7b0d

Last test of basis   144042  2019-11-12 09:07:51 Z    0 days
Testing same since  (not found) 0 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-arm64

Re: [Xen-devel] [PATCH] arch: arm: vgic-v3: fix GICD_ISACTIVER range

2019-11-12 Thread Julien Grall
On Wed, 13 Nov 2019, 10:55 André Przywara,  wrote:

> On 13/11/2019 01:08, Julien Grall wrote:
>
> Hi,
>
> > On Tue, 12 Nov 2019, 04:01 Stefano Stabellini,  > > wrote:
> >
> > On Sat, 9 Nov 2019, Julien Grall wrote:
> > > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini,
> > mailto:sstabell...@kernel.org>> wrote:
> > >   On Thu, 7 Nov 2019, Peng Fan wrote:
> > >   > The end should be GICD_ISACTIVERN not GICD_ISACTIVER.
> > >   >
> > >   > Signed-off-by: Peng Fan  > >
> > >
> > >   Reviewed-by: Stefano Stabellini  > >
> > >
> > >
> > > To be honest, I am not sure the code is correct. A read to those
> > registers should tell you the list of interrupts active. As we always
> > > return 0, this will not return the correct state of the GIC.
> > >
> > > I know that returning the list of actives interrupts is
> > complicated with the old vGIC, but I don't think silently ignoring
> > it is a good
> > > idea.
> > > The question here is why the guest accessed those registers? What
> > is it trying to figure out?
> >
> > We are not going to solve the general problem at this stage. At the
> > moment the code:
> >
> > - ignore the first register only
> > - print an error and return an IO_ABORT error for the other regs
> >
> > For the inconsistency alone the second option is undesirable. Also it
> > doesn't match the write implementation, which does the same thing for
> > all the GICD_ISACTIVER* regs instead of having a special treatment
> for
> > the first one only. It looks like a typo in the original patch to me.
> >
> > The proposed patch switches the behavior to:
> >
> > - silently ignore all the GICD_ISACTIVER* regs (as proposed)
> >
> >
> > is an improvement.
> >
> >
> > Peng mentioned that Linux is accessing it, so the worst thing we can do
> > is lying to the guest (as you suggest here). I would definitely not call
> > that an improvement.
>
> The ISACTIVER range is wrong in the description, it covers only one
> register, not multiple. This is obviously a typo, since it's correct in
> both GICv2 and in the high level switch/case in GICv3. Reading from
> outside of any range will inject an abort into the guest, which runs in
> kernel space. This will probably result in a guest crash. I would
> consider not crashing an improvement.
>

It is not. Neither is the current approach of doing it silently.


> About "lying" to the guest: Typically an IRQ is just active for a very
> short time, so 0 is a very good answer, actually.


So why is Linux checking it? What will happen if there is actually an
active interrupt but we don't report it?

> The old VGIC in KVM
> did exactly the same:
> vgic_reg_access(mmio, NULL, offset,
> ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);
>
> The proper solution would be:
> 1) Track the state of the active bit when we can observe it, so when the
> guest exits with an active IRQ. The new VGIC does that.
> 2) Kick out all VCPUs that have IRQs in that given rank, and sample the
> active bit from the LRs. Sounds pretty horrible, and chances are very
> high you will get all 0s there.
>
> So if I compare "fix those two typos and preserve the state that the Xen
> VGIC has been in for years" to "create a lot of racy code for a rare
> corner case", the first one surely wins.
> That doesn't mean we should never try it, but surely this fix needs to
> go in meanwhile.
>

I don't believe merging this patch is the right solution; not from a
technical PoV, but as a way to get things properly fixed.


> > In the current state, it is a Nack. If there were a warning, then I
> > would be more inclined to see this patch going through.
>
> Do you mean a warning that we are about to lie to the guest? That sounds
> pretty useless, since nobody can do anything about it. Plus we have
> already those warnings on writing to these registers, and this always
> confuses people and triggered pointless bug reports.
>

Well, the warning has the benefit of annoying people. If we do it silently,
then we don't encourage anyone to fix it.


> I think the old VGIC has bigger problems ;-)
>

I agree, but nobody seems to be willing to fix it... My only leverage here
is pushing for a warning to annoy the user.

So I maintain my request for a warning.
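
As a rough illustration of what is being asked for, a read handler could
keep returning zero while still making the gap visible. The sketch below is
not the Xen vGIC code: example_read_isactiver(), the counter and the message
text are hypothetical, and it warns only once to limit log noise, which is
an assumption rather than something agreed in this thread.

```c
#include <stdint.h>
#include <stdio.h>

/* Number of warnings printed so far (hypothetical bookkeeping). */
static unsigned int example_warnings_emitted;

/*
 * Hypothetical read handler for the GICD_ISACTIVER* range.
 * The active state is not tracked, so the read returns 0 (RAZ), but the
 * first access is reported so the missing emulation does not go unnoticed.
 */
uint32_t example_read_isactiver(unsigned int reg_offset)
{
    if ( example_warnings_emitted == 0 )
    {
        fprintf(stderr,
                "vGIC: unhandled read of GICD_ISACTIVER at offset %#x, "
                "returning 0\n", reg_offset);
        example_warnings_emitted++;
    }

    return 0;
}

/* Accessor so callers can check how many warnings were printed. */
unsigned int example_warning_count(void)
{
    return example_warnings_emitted;
}
```

The guest keeps running because every register in the range reads as zero,
and the log still records that the answer may not reflect the real state.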

Cheers,


> Cheers,
> Andre
>

Re: [Xen-devel] [PATCH] arch: arm: vgic-v3: fix GICD_ISACTIVER range

2019-11-12 Thread André Przywara
On 13/11/2019 01:08, Julien Grall wrote:

Hi,

> On Tue, 12 Nov 2019, 04:01 Stefano Stabellini,  > wrote:
> 
> On Sat, 9 Nov 2019, Julien Grall wrote:
> > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini,
> mailto:sstabell...@kernel.org>> wrote:
> >       On Thu, 7 Nov 2019, Peng Fan wrote:
> >       > The end should be GICD_ISACTIVERN not GICD_ISACTIVER.
> >       >
> >       > Signed-off-by: Peng Fan  >
> >
> >       Reviewed-by: Stefano Stabellini  >
> >
> >
> > To be honest, I am not sure the code is correct. A read to those
> registers should tell you the list of interrupts active. As we always
> > return 0, this will not return the correct state of the GIC.
> >
> > I know that returning the list of actives interrupts is
> complicated with the old vGIC, but I don't think silently ignoring
> it is a good
> > idea.
> > The question here is why the guest accessed those registers? What
> is it trying to figure out?
> 
> We are not going to solve the general problem at this stage. At the
> moment the code:
> 
> - ignore the first register only
> - print an error and return an IO_ABORT error for the other regs
> 
> For the inconsistency alone the second option is undesirable. Also it
> doesn't match the write implementation, which does the same thing for
> all the GICD_ISACTIVER* regs instead of having a special treatment for
> the first one only. It looks like a typo in the original patch to me.
> 
> The proposed patch switches the behavior to:
> 
> - silently ignore all the GICD_ISACTIVER* regs (as proposed)
> 
> 
> is an improvement.
> 
> 
> Peng mentioned that Linux is accessing it, so the worst thing we can do
> is lying to the guest (as you suggest here). I would definitely not call
> that an improvement.

The ISACTIVER range is wrong in the description, it covers only one
register, not multiple. This is obviously a typo, since it's correct in
both GICv2 and in the high level switch/case in GICv3. Reading from
outside of any range will inject an abort into the guest, which runs in
kernel space. This will probably result in a guest crash. I would
consider not crashing an improvement.

About "lying" to the guest: Typically an IRQ is just active for a very
short time, so 0 is a very good answer, actually. The old VGIC in KVM
did exactly the same:
vgic_reg_access(mmio, NULL, offset,
ACCESS_READ_RAZ | ACCESS_WRITE_IGNORED);

The proper solution would be:
1) Track the state of the active bit when we can observe it, so when the
guest exits with an active IRQ. The new VGIC does that.
2) Kick out all VCPUs that have IRQs in that given rank, and sample the
active bit from the LRs. Sounds pretty horrible, and chances are very
high you will get all 0s there.

So if I compare "fix those two typos and preserve the state that the Xen
VGIC has been in for years" to "create a lot of racy code for a rare
corner case", the first one surely wins.
That doesn't mean we should never try it, but surely this fix needs to
go in meanwhile.

> In the current state, it is a Nack. If there were a warning, then I
> would be more inclined to see this patch going through.

Do you mean a warning that we are about to lie to the guest? That sounds
pretty useless, since nobody can do anything about it. Plus we have
already those warnings on writing to these registers, and this always
confuses people and triggered pointless bug reports.

I think the old VGIC has bigger problems ;-)

Cheers,
Andre


Re: [Xen-devel] [RFC 7/7] arm/gic-v3: add GIC version suffix to iomem range variables

2019-11-12 Thread Julien Grall
On Tue, 12 Nov 2019, 05:59 Stefano Stabellini, 
wrote:

> On Wed, 6 Nov 2019, Andrii Anisov wrote:
> > From: Andrii Anisov 
> >
> > ARM Compiler 6.6 has a proven bug: static data symbols, moved to an init
> > section, becomes global. Thus these symbols clash with ones defined in
> > gic-v2.c. The straight forward way to resolve the issue is to add the GIC
> > version suffix, at least for one of the conflicting side.
> >
> > Signed-off-by: Andrii Anisov 
>
> The patch is acceptable but this seems a very serious compiler bug.
>

I am a bit worried this is not going to prevent similar bugs from being
introduced in the future. I think we have a way to enforce unique symbols
(see CONFIG_UNIQUE_SYMBOLS). Would it work for you here?


> This, together with the other bug described in the previous patch, makes
> me think the ARMCC is not quite ready for showtime. Do you know if there
> are any later version of the compiler that don't have these problems?
>

Related to this, has this been reported to Arm?

Cheers,

-- 
Julien Grall

Re: [Xen-devel] [RFC 4/7] arm/gic: Drop pointless assertions

2019-11-12 Thread Julien Grall
On Tue, 12 Nov 2019, 05:52 Stefano Stabellini, 
wrote:

> On Wed, 6 Nov 2019, Andrii Anisov wrote:
> > From: Andrii Anisov 
> >
> > Also armclang complains about the condition always true,
> > because `sgi` is of type enum with all its values under 16.
> >
> > Signed-off-by: Andrii Anisov 
>
> Although I am not completely opposed to this, given the choice I would
> prefer to keep the ASSERTs.
>

Why? What would that prevent? It is an enum, so unless you do a horrible
hack on the other side, this should always be valid.

But then, why would this be an issue here and not in the tens of other
places where an enum is used?



> Given that I would imagine that the ARM C Compiler will also complain
> about many other ASSERTs, I wonder if it wouldn't be better to just
> disable *all* ASSERTs when building with armcc by changing the
> implementation of the ASSERT MACRO.


The ARM C compiler's complaint is valid here, and I would not be surprised
if this came up in Clang and GCC in the future.

If you are worried that the enum is going to grow beyond 16 items, then
you should use a BUILD_BUG_ON.
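
For illustration, a compile-time check along those lines might look like the
sketch below. It uses C11 static_assert as a stand-in for Xen's
BUILD_BUG_ON, and the enum members and the EXAMPLE_NR_SGIS sentinel are
hypothetical, not the real enum gic_sgi definition.

```c
#include <assert.h>   /* provides the C11 static_assert macro */

/* Hypothetical enum standing in for enum gic_sgi; the members are made up. */
enum example_sgi {
    EXAMPLE_SGI_EVENT_CHECK,
    EXAMPLE_SGI_DUMP_STATE,
    EXAMPLE_SGI_CALL_FUNCTION,
    EXAMPLE_NR_SGIS            /* sentinel: number of SGIs defined above */
};

/* There are only 16 SGIs. */
#define EXAMPLE_MAX_SGIS 16

/*
 * Compile-time check: the build breaks if the enum grows past 16 entries,
 * so no run-time ASSERT is needed in each send_SGI_*() helper.
 */
static_assert(EXAMPLE_NR_SGIS <= EXAMPLE_MAX_SGIS,
              "more SGIs defined than the GIC provides");

/* Returns 1 when sgi is a valid SGI number; always true for enum members. */
int example_sgi_is_valid(enum example_sgi sgi)
{
    return (unsigned int)sgi < EXAMPLE_MAX_SGIS;
}
```

The check costs nothing at run time and fires exactly when the assumption
behind the removed ASSERTs would stop holding.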




>
> > ---
> >  xen/arch/arm/gic.c | 6 --
> >  1 file changed, 6 deletions(-)
> >
> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > index 113655a..58c6141 100644
> > --- a/xen/arch/arm/gic.c
> > +++ b/xen/arch/arm/gic.c
> > @@ -294,8 +294,6 @@ void __init gic_init(void)
> >
> >  void send_SGI_mask(const cpumask_t *cpumask, enum gic_sgi sgi)
> >  {
> > -ASSERT(sgi < 16); /* There are only 16 SGIs */
> > -
> >  gic_hw_ops->send_SGI(sgi, SGI_TARGET_LIST, cpumask);
> >  }
> >
> > @@ -306,15 +304,11 @@ void send_SGI_one(unsigned int cpu, enum gic_sgi
> sgi)
> >
> >  void send_SGI_self(enum gic_sgi sgi)
> >  {
> > -ASSERT(sgi < 16); /* There are only 16 SGIs */
> > -
> >  gic_hw_ops->send_SGI(sgi, SGI_TARGET_SELF, NULL);
> >  }
> >
> >  void send_SGI_allbutself(enum gic_sgi sgi)
> >  {
> > -   ASSERT(sgi < 16); /* There are only 16 SGIs */
> > -
> > gic_hw_ops->send_SGI(sgi, SGI_TARGET_OTHERS, NULL);
> >  }
> >
> > --
> > 2.7.4
> >
>

Re: [Xen-devel] [PATCH] arch: arm: vgic-v3: fix GICD_ISACTIVER range

2019-11-12 Thread Julien Grall
On Tue, 12 Nov 2019, 04:01 Stefano Stabellini, 
wrote:

> On Sat, 9 Nov 2019, Julien Grall wrote:
> > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini, 
> wrote:
> >   On Thu, 7 Nov 2019, Peng Fan wrote:
> >   > The end should be GICD_ISACTIVERN not GICD_ISACTIVER.
> >   >
> >   > Signed-off-by: Peng Fan 
> >
> >   Reviewed-by: Stefano Stabellini 
> >
> >
> > To be honest, I am not sure the code is correct. A read to those
> registers should tell you the list of interrupts active. As we always
> > return 0, this will not return the correct state of the GIC.
> >
> > I know that returning the list of actives interrupts is complicated with
> the old vGIC, but I don't think silently ignoring it is a good
> > idea.
> > The question here is why the guest accessed those registers? What is it
> trying to figure out?
>
> We are not going to solve the general problem at this stage. At the
> moment the code:
>
> - ignore the first register only
> - print an error and return an IO_ABORT error for the other regs
>
> For the inconsistency alone the second option is undesirable. Also it
> doesn't match the write implementation, which does the same thing for
> all the GICD_ISACTIVER* regs instead of having a special treatment for
> the first one only. It looks like a typo in the original patch to me.
>
> The proposed patch switches the behavior to:
>
> - silently ignore all the GICD_ISACTIVER* regs (as proposed)


> is an improvement.
>

Peng mentioned that Linux is accessing it, so the worst thing we can do is
lying to the guest (as you suggest here). I would definitely not call that
an improvement.

In the current state, it is a Nack. If there were a warning, then I would
be more inclined to see this patch going through.

Cheers,

Re: [Xen-devel] [PATCH 3/3] xen/mcelog: also allow building for 32-bit kernels

2019-11-12 Thread Boris Ostrovsky
On 11/11/19 9:46 AM, Jan Beulich wrote:
> There's no apparent reason why it can be used on 64-bit only.
>
> Signed-off-by: Jan Beulich 
>
> --- a/drivers/xen/Kconfig
> +++ b/drivers/xen/Kconfig
> @@ -285,7 +285,7 @@ config XEN_ACPI_PROCESSOR
>  
>  config XEN_MCE_LOG
>   bool "Xen platform mcelog"
> - depends on XEN_DOM0 && X86_64 && X86_MCE
> + depends on XEN_DOM0 && X86 && X86_MCE

Can we have X86_MCE without X86?

-boris

>   help
> Allow kernel fetching MCE error from Xen platform and
> converting it into Linux mcelog format for mcelog tools
>



Re: [Xen-devel] [PATCH 2/3] xen/mcelog: add PPIN to record when available

2019-11-12 Thread Boris Ostrovsky
On 11/11/19 9:46 AM, Jan Beulich wrote:
> This is to augment commit 3f5a7896a5 ("x86/mce: Include the PPIN in MCE
> records when available").
>
> I'm also adding "synd" and "ipid" fields to struct xen_mce, in an
> attempt to keep field offsets in sync with struct mce. These two fields
> won't get populated for now, though.
>
> Signed-off-by: Jan Beulich 
>
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -393,6 +393,8 @@
>  #define MSR_AMD_PSTATE_DEF_BASE  0xc0010064
>  #define MSR_AMD64_OSVW_ID_LENGTH 0xc0010140
>  #define MSR_AMD64_OSVW_STATUS0xc0010141
> +#define MSR_AMD_PPIN_CTL 0xc00102f0
> +#define MSR_AMD_PPIN 0xc00102f1

Which processors are these defined for? I looked at a couple (fam 15h
and 17h) and didn't see those. And I don't see them in Linux.

>  #define MSR_AMD64_LS_CFG 0xc0011020
>  #define MSR_AMD64_DC_CFG 0xc0011022
>  #define MSR_AMD64_BU_CFG20xc001102a
> --- a/drivers/xen/mcelog.c
> +++ b/drivers/xen/mcelog.c
> @@ -253,6 +253,11 @@ static int convert_log(struct mc_info *m
>   case MSR_IA32_MCG_CAP:
>   m.mcgcap = g_physinfo[i].mc_msrvalues[j].value;
>   break;
> +
> + case MSR_PPIN:
> + case MSR_AMD_PPIN:
> + m.ppin = g_physinfo[i].mc_msrvalues[j].value;
> + break;
>   }
>  
>   mic = NULL;
> --- a/include/xen/interface/xen-mca.h
> +++ b/include/xen/interface/xen-mca.h
> @@ -332,7 +332,11 @@ struct xen_mc {
>  };
>  DEFINE_GUEST_HANDLE_STRUCT(xen_mc);
>  
> -/* Fields are zero when not available */
> +/*
> + * Fields are zero when not available. Also, this struct is shared with
> + * userspace mcelog and thus must keep existing fields at current offsets.
> + * Only add new fields to the end of the structure
> + */
>  struct xen_mce {


Why is this structure part of the interface?


-boris

>   __u64 status;
>   __u64 misc;
> @@ -353,6 +357,9 @@ struct xen_mce {
>   __u32 socketid; /* CPU socket ID */
>   __u32 apicid;   /* CPU initial apic ID */
>   __u64 mcgcap;   /* MCGCAP MSR: machine check capabilities of CPU */
> + __u64 synd; /* MCA_SYND MSR: only valid on SMCA systems */
> + __u64 ipid; /* MCA_IPID MSR: only valid on SMCA systems */
> + __u64 ppin; /* Protected Processor Inventory Number */
>  };
>  
>  /*
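
As an aside on "keep field offsets in sync": one way to have the compiler
enforce such a rule is a compile-time offset comparison. The sketch below
uses two shortened, hypothetical stand-in structs (not the real struct mce
or struct xen_mce) and a hypothetical helper macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, shortened stand-ins for the two record layouts. */
struct sketch_mce {
    uint64_t status;
    uint64_t misc;
    uint64_t mcgcap;
    uint64_t synd;
    uint64_t ipid;
    uint64_t ppin;
};

struct sketch_xen_mce {
    uint64_t status;
    uint64_t misc;
    uint64_t mcgcap;
    uint64_t synd;   /* placeholder, not populated */
    uint64_t ipid;   /* placeholder, not populated */
    uint64_t ppin;
};

/* Fails the build if a field sits at different offsets in the two structs. */
#define SKETCH_SAME_OFFSET(field)                              \
    _Static_assert(offsetof(struct sketch_mce, field) ==       \
                   offsetof(struct sketch_xen_mce, field),     \
                   #field " offset differs")

SKETCH_SAME_OFFSET(mcgcap);
SKETCH_SAME_OFFSET(synd);
SKETCH_SAME_OFFSET(ipid);
SKETCH_SAME_OFFSET(ppin);
```

With a check of this kind, inserting a field in the middle of one struct
breaks the build instead of silently shifting what userspace reads.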



Re: [Xen-devel] [PATCH 1/3] xen/mcelog: drop __MC_MSR_MCGCAP

2019-11-12 Thread Boris Ostrovsky
On 11/11/19 9:45 AM, Jan Beulich wrote:
> It has never been part of Xen's public interface, and there's therefore
> no guarantee for MCG_CAP's value to always be present in array entry 0.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Boris Ostrovsky 
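
For readers following along, the point about array entry 0 can be sketched
as a lookup by MSR index instead of by position. The struct and function
names below are hypothetical and are not the mcelog driver's types; 0x179
is the architectural index of IA32_MCG_CAP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_MCG_CAP 0x179u  /* architectural MSR index */

/* Hypothetical stand-in for one reported (MSR index, value) pair. */
struct sketch_msr_entry {
    uint64_t reg;
    uint64_t value;
};

/* Scan for the requested MSR rather than trusting a fixed array slot. */
static bool sketch_find_msr(const struct sketch_msr_entry *entries, size_t n,
                            uint64_t reg, uint64_t *out)
{
    assert(out != NULL);

    for (size_t i = 0; i < n; i++) {
        if (entries[i].reg == reg) {
            *out = entries[i].value;
            return true;
        }
    }

    return false; /* the MSR was not reported at all */
}
```

A caller that gets false back can leave the field at zero, which is in line
with the "fields are zero when not available" convention of the record.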


[Xen-devel] [xen-unstable-smoke test] 144057: tolerable all pass - PUSHED

2019-11-12 Thread osstest service owner
flight 144057 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144057/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  13 migrate-support-check      fail   never pass
 test-arm64-arm64-xl-xsm   13 migrate-support-check      fail   never pass
 test-arm64-arm64-xl-xsm   14 saverestore-support-check  fail   never pass
 test-armhf-armhf-xl       13 migrate-support-check      fail   never pass
 test-armhf-armhf-xl       14 saverestore-support-check  fail   never pass

version targeted for testing:
 xen  8c4330818f6ee70cbf7428a40a28a73df1272d10
baseline version:
 xen  aaef3d904bbbde1fcf9c07943878bd2aa64cc2bc

Last test of basis   144053  2019-11-12 15:01:17 Z    0 days
Testing same since   144057  2019-11-12 18:00:44 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Anthony PERARD 
  Dario Faggioli 
  George Dunlap 
  Marek Marczykowski-Górecki 
  Stefano Stabellini 
  Wei Liu 

jobs:
 build-arm64-xsm                             pass
 build-amd64                                 pass
 build-armhf                                 pass
 build-amd64-libvirt                         pass
 test-armhf-armhf-xl                         pass
 test-arm64-arm64-xl-xsm                     pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64   pass
 test-amd64-amd64-libvirt                    pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   aaef3d904b..8c4330818f  8c4330818f6ee70cbf7428a40a28a73df1272d10 -> smoke


[Xen-devel] [ovmf test] 144046: all pass - PUSHED

2019-11-12 Thread osstest service owner
flight 144046 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144046/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf e92b155740cdbf10a85ed8f37f69da0991fc8275
baseline version:
 ovmf f8dd7c7018adf78992da572eeaf53c0ce31a411f

Last test of basis   144034  2019-11-11 23:13:12 Z    0 days
Testing same since   144046  2019-11-12 12:17:23 Z    0 days    1 attempts


People who touched revisions under test:
  Michael D Kinney 

jobs:
 build-amd64-xsm                             pass
 build-i386-xsm                              pass
 build-amd64                                 pass
 build-i386                                  pass
 build-amd64-libvirt                         pass
 build-i386-libvirt                          pass
 build-amd64-pvops                           pass
 build-i386-pvops                            pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64        pass
 test-amd64-i386-xl-qemuu-ovmf-amd64         pass




Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   f8dd7c7018..e92b155740  e92b155740cdbf10a85ed8f37f69da0991fc8275 -> xen-tested-master


[Xen-devel] [PATCH v3 14/14] xen/gntdev: use mmu_interval_notifier_insert

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

gntdev simply wants to monitor a specific VMA for any notifier events;
this can be done straightforwardly using mmu_interval_notifier_insert()
over the VMA's VA range.

The notifier should be attached until the original VMA is destroyed.

It is unclear if any of this is even sane, but at least a lot of duplicate
code is removed.
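
As a small illustration of the range test involved (the removed in_range()
helper and the vm_start/vm_end check in the new callback), here is a sketch
of a half-open interval overlap check. The struct and function names are
hypothetical; only the comparison itself mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical half-open VA range [start, end), the convention VMAs use. */
struct sketch_va_range {
    unsigned long start;
    unsigned long end;
};

/* Two half-open ranges overlap iff each one starts before the other ends. */
static bool sketch_overlaps(struct sketch_va_range a, struct sketch_va_range b)
{
    assert(a.start <= a.end && b.start <= b.end);
    return a.start < b.end && b.start < a.end;
}
```

Because the end address is exclusive, a range that begins exactly where the
VMA ends does not count as overlapping, which is why the checks in the diff
use >= and <= when rejecting a range.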

Reviewed-by: Boris Ostrovsky 
Signed-off-by: Jason Gunthorpe 
---
 drivers/xen/gntdev-common.h |   8 +-
 drivers/xen/gntdev.c| 179 ++--
 2 files changed, 49 insertions(+), 138 deletions(-)

diff --git a/drivers/xen/gntdev-common.h b/drivers/xen/gntdev-common.h
index 2f8b949c3eeb14..91e44c04f7876c 100644
--- a/drivers/xen/gntdev-common.h
+++ b/drivers/xen/gntdev-common.h
@@ -21,15 +21,8 @@ struct gntdev_dmabuf_priv;
 struct gntdev_priv {
/* Maps with visible offsets in the file descriptor. */
struct list_head maps;
-   /*
-* Maps that are not visible; will be freed on munmap.
-* Only populated if populate_freeable_maps == 1
-*/
-   struct list_head freeable_maps;
/* lock protects maps and freeable_maps. */
struct mutex lock;
-   struct mm_struct *mm;
-   struct mmu_notifier mn;
 
 #ifdef CONFIG_XEN_GRANT_DMA_ALLOC
/* Device for which DMA memory is allocated. */
@@ -49,6 +42,7 @@ struct gntdev_unmap_notify {
 };
 
 struct gntdev_grant_map {
+   struct mmu_interval_notifier notifier;
struct list_head next;
struct vm_area_struct *vma;
int index;
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 81401f386c9ce0..a04ddf2a68afa5 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -63,7 +63,6 @@ MODULE_PARM_DESC(limit, "Maximum number of grants that may be mapped by "
 static atomic_t pages_mapped = ATOMIC_INIT(0);
 
 static int use_ptemod;
-#define populate_freeable_maps use_ptemod
 
 static int unmap_grant_pages(struct gntdev_grant_map *map,
 int offset, int pages);
@@ -249,12 +248,6 @@ void gntdev_put_map(struct gntdev_priv *priv, struct gntdev_grant_map *map)
evtchn_put(map->notify.event);
}
 
-   if (populate_freeable_maps && priv) {
-   mutex_lock(&priv->lock);
-   list_del(&map->next);
-   mutex_unlock(&priv->lock);
-   }
-
if (map->pages && !use_ptemod)
unmap_grant_pages(map, 0, map->count);
gntdev_free_map(map);
@@ -444,16 +437,9 @@ static void gntdev_vma_close(struct vm_area_struct *vma)
 
pr_debug("gntdev_vma_close %p\n", vma);
if (use_ptemod) {
-   /* It is possible that an mmu notifier could be running
-* concurrently, so take priv->lock to ensure that the vma won't
-* vanishing during the unmap_grant_pages call, since we will
-* spin here until that completes. Such a concurrent call will
-* not do any unmapping, since that has been done prior to
-* closing the vma, but it may still iterate the unmap_ops list.
-*/
-   mutex_lock(&priv->lock);
+   WARN_ON(map->vma != vma);
+   mmu_interval_notifier_remove(&map->notifier);
map->vma = NULL;
-   mutex_unlock(&priv->lock);
}
vma->vm_private_data = NULL;
gntdev_put_map(priv, map);
@@ -475,109 +461,44 @@ static const struct vm_operations_struct gntdev_vmops = {
 
 /* -- */
 
-static bool in_range(struct gntdev_grant_map *map,
- unsigned long start, unsigned long end)
-{
-   if (!map->vma)
-   return false;
-   if (map->vma->vm_start >= end)
-   return false;
-   if (map->vma->vm_end <= start)
-   return false;
-
-   return true;
-}
-
-static int unmap_if_in_range(struct gntdev_grant_map *map,
- unsigned long start, unsigned long end,
- bool blockable)
+static bool gntdev_invalidate(struct mmu_interval_notifier *mn,
+ const struct mmu_notifier_range *range,
+ unsigned long cur_seq)
 {
+   struct gntdev_grant_map *map =
+   container_of(mn, struct gntdev_grant_map, notifier);
unsigned long mstart, mend;
int err;
 
-   if (!in_range(map, start, end))
-   return 0;
+   if (!mmu_notifier_range_blockable(range))
+   return false;
 
-   if (!blockable)
-   return -EAGAIN;
+   /*
+* If the VMA is split or otherwise changed the notifier is not
+* updated, but we don't want to process VA's outside the modified
+* VMA. FIXME: It would be much more understandable to just prevent
+* modifying the VMA in the first place.
+*/
+   if (map->vma->vm_start >= range->end ||
+  

[Xen-devel] [PATCH v3 05/14] RDMA/odp: Use mmu_interval_notifier_insert()

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

Replace the internal interval tree based mmu notifier with the new common
mmu_interval_notifier_insert() API. This removes a lot of code and fixes a
deadlock that can be triggered in ODP:

 zap_page_range()
  mmu_notifier_invalidate_range_start()
   [..]
    ib_umem_notifier_invalidate_range_start()
       down_read(&per_mm->umem_rwsem)
  unmap_single_vma()
    [..]
      __split_huge_page_pmd()
        mmu_notifier_invalidate_range_start()
        [..]
           ib_umem_notifier_invalidate_range_start()
              down_read(&per_mm->umem_rwsem)   // DEADLOCK

        mmu_notifier_invalidate_range_end()
           up_read(&per_mm->umem_rwsem)
  mmu_notifier_invalidate_range_end()
     up_read(&per_mm->umem_rwsem)

The umem_rwsem is held across the range_start/end as the ODP algorithm for
invalidate_range_end cannot tolerate changes to the interval
tree. However, due to the nested invalidation regions the second
down_read() can deadlock if there are competing writers. The new core code
provides an alternative scheme to solve this problem.

Fixes: ca748c39ea3f ("RDMA/umem: Get rid of per_mm->notifier_count")
Tested-by: Artemy Kovalyov 
Signed-off-by: Jason Gunthorpe 
---
 drivers/infiniband/core/device.c |   1 -
 drivers/infiniband/core/umem_odp.c   | 303 ---
 drivers/infiniband/hw/mlx5/mlx5_ib.h |   7 +-
 drivers/infiniband/hw/mlx5/mr.c  |   3 +-
 drivers/infiniband/hw/mlx5/odp.c |  50 ++---
 include/rdma/ib_umem_odp.h   |  68 ++
 include/rdma/ib_verbs.h  |   2 -
 7 files changed, 82 insertions(+), 352 deletions(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 2dd2cfe9b56136..ac7924b3c73abe 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2617,7 +2617,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
SET_DEVICE_OP(dev_ops, get_vf_config);
SET_DEVICE_OP(dev_ops, get_vf_stats);
SET_DEVICE_OP(dev_ops, init_port);
-   SET_DEVICE_OP(dev_ops, invalidate_range);
SET_DEVICE_OP(dev_ops, iw_accept);
SET_DEVICE_OP(dev_ops, iw_add_ref);
SET_DEVICE_OP(dev_ops, iw_connect);
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index d7d5fadf0899ad..e42d44e501fd54 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -48,197 +48,33 @@
 
 #include "uverbs.h"
 
-static void ib_umem_notifier_start_account(struct ib_umem_odp *umem_odp)
+static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
+  const struct mmu_interval_notifier_ops *ops)
 {
-   mutex_lock(&umem_odp->umem_mutex);
-   if (umem_odp->notifiers_count++ == 0)
-   /*
-* Initialize the completion object for waiting on
-* notifiers. Since notifier_count is zero, no one should be
-* waiting right now.
-*/
-   reinit_completion(&umem_odp->notifier_completion);
-   mutex_unlock(&umem_odp->umem_mutex);
-}
-
-static void ib_umem_notifier_end_account(struct ib_umem_odp *umem_odp)
-{
-   mutex_lock(&umem_odp->umem_mutex);
-   /*
-* This sequence increase will notify the QP page fault that the page
-* that is going to be mapped in the spte could have been freed.
-*/
-   ++umem_odp->notifiers_seq;
-   if (--umem_odp->notifiers_count == 0)
-   complete_all(&umem_odp->notifier_completion);
-   mutex_unlock(&umem_odp->umem_mutex);
-}
-
-static void ib_umem_notifier_release(struct mmu_notifier *mn,
-struct mm_struct *mm)
-{
-   struct ib_ucontext_per_mm *per_mm =
-   container_of(mn, struct ib_ucontext_per_mm, mn);
-   struct rb_node *node;
-
-   down_read(&per_mm->umem_rwsem);
-   if (!per_mm->mn.users)
-   goto out;
-
-   for (node = rb_first_cached(&per_mm->umem_tree); node;
-node = rb_next(node)) {
-   struct ib_umem_odp *umem_odp =
-   rb_entry(node, struct ib_umem_odp, interval_tree.rb);
-
-   /*
-* Increase the number of notifiers running, to prevent any
-* further fault handling on this MR.
-*/
-   ib_umem_notifier_start_account(umem_odp);
-   complete_all(&umem_odp->notifier_completion);
-   umem_odp->umem.ibdev->ops.invalidate_range(
-   umem_odp, ib_umem_start(umem_odp),
-   ib_umem_end(umem_odp));
-   }
-
-out:
-   up_read(&per_mm->umem_rwsem);
-}
-
-static int invalidate_range_start_trampoline(struct ib_umem_odp *item,
-u64 start, u64 end, void *cookie)
-{
-   ib_umem_notifier_start_account(item);
-   item->umem.ibdev->ops.invalidate_range(item, start, end);
-   return 0;
-}
-
-static int 

[Xen-devel] [PATCH v3 11/14] drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

Remove the interval tree in the driver and rely on the tree maintained by
the mmu_notifier for delivering mmu_notifier invalidation callbacks.

For some reason amdgpu has a very complicated arrangement where it tries
to prevent duplicate entries in the interval_tree. This is not necessary:
each amdgpu_bo can be its own standalone entry. interval_tree already
allows duplicates and overlaps in the tree.

Also, there is no need to remove entries upon a release callback: the
mmu_interval API safely allows objects to remain registered beyond the
lifetime of the mm. The driver only has to stop touching the pages during
release.

Reviewed-by: Philip Yang 
Tested-by: Philip Yang 
Signed-off-by: Jason Gunthorpe 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   2 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c| 333 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h|   4 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h|  13 +-
 6 files changed, 77 insertions(+), 281 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index bd37df5dd6d048..60591a5d420021 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1006,6 +1006,8 @@ struct amdgpu_device {
struct mutex  lock_reset;
struct amdgpu_doorbell_index doorbell_index;
 
+   struct mutexnotifier_lock;
+
int asic_reset_res;
struct work_struct  xgmi_reset_work;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 6d021ecc8d598f..47700302a08b7f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -481,8 +481,7 @@ static void remove_kgd_mem_from_kfd_bo_list(struct kgd_mem *mem,
  *
  * Returns 0 for success, negative errno for errors.
  */
-static int init_user_pages(struct kgd_mem *mem, struct mm_struct *mm,
-  uint64_t user_addr)
+static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr)
 {
struct amdkfd_process_info *process_info = mem->process_info;
struct amdgpu_bo *bo = mem->bo;
@@ -1195,7 +1194,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
add_kgd_mem_to_kfd_bo_list(*mem, avm->process_info, user_addr);
 
if (user_addr) {
-   ret = init_user_pages(*mem, current->mm, user_addr);
+   ret = init_user_pages(*mem, user_addr);
if (ret)
goto allocate_init_user_pages_failed;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5a1939dbd4e3e6..38f97998aaddb2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2633,6 +2633,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
mutex_init(&adev->virt.vf_errors.lock);
hash_init(adev->mn_hash);
mutex_init(&adev->lock_reset);
+   mutex_init(&adev->notifier_lock);
mutex_init(&adev->virt.dpm_mutex);
mutex_init(&adev->psp.mutex);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 31d4deb5d29484..9fe1c31ce17a30 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -50,66 +50,6 @@
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
 
-/**
- * struct amdgpu_mn_node
- *
- * @it: interval node defining start-last of the affected address range
- * @bos: list of all BOs in the affected address range
- *
- * Manages all BOs which are affected of a certain range of address space.
- */
-struct amdgpu_mn_node {
-   struct interval_tree_node   it;
-   struct list_headbos;
-};
-
-/**
- * amdgpu_mn_destroy - destroy the HMM mirror
- *
- * @work: previously sheduled work item
- *
- * Lazy destroys the notifier from a work item
- */
-static void amdgpu_mn_destroy(struct work_struct *work)
-{
-   struct amdgpu_mn *amn = container_of(work, struct amdgpu_mn, work);
-   struct amdgpu_device *adev = amn->adev;
-   struct amdgpu_mn_node *node, *next_node;
-   struct amdgpu_bo *bo, *next_bo;
-
-   mutex_lock(&adev->mn_lock);
-   down_write(&amn->lock);
-   hash_del(&amn->node);
-   rbtree_postorder_for_each_entry_safe(node, next_node,
-&amn->objects.rb_root, it.rb) {
-   list_for_each_entry_safe(bo, next_bo, &node->bos, mn_list) {
-   bo->mn = NULL;
-   list_del_init(&bo->mn_list);
-   }
-   kfree(node);
-   }
-   up_write(&amn->lock);
-   mutex_unlock(&adev->mn_lock);
-
-   hmm_mirror_unregister(&amn->mirror);
-   kfree(amn);
-}
-
-/**
- * amdgpu_hmm_mirror_release - callback to notify about mm 

[Xen-devel] [PATCH v3 04/14] mm/hmm: define the pre-processor related parts of hmm.h even if disabled

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

Only the function calls are stubbed out with static inlines that always
fail. This is the standard way to write a header for an optional component
and makes it easier for drivers that only optionally need HMM_MIRROR.
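
For readers less familiar with the pattern, here is a minimal sketch of an
optional-component header. All names (SKETCH_CONFIG_MIRROR, sketch_mirror,
sketch_driver_probe) are hypothetical and stand in for the HMM ones; the
sketch assumes the stubs are static inline so the header can be included
from several files without duplicate symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical feature switch; left undefined to model a disabled option. */
/* #define SKETCH_CONFIG_MIRROR 1 */

struct sketch_mirror {
    int registered;
};

/* Constants stay visible either way, so callers can always name them. */
#define SKETCH_FAULT_SNAPSHOT (1 << 1)

#ifdef SKETCH_CONFIG_MIRROR
int sketch_mirror_register(struct sketch_mirror *m);
void sketch_mirror_unregister(struct sketch_mirror *m);
#else
/* Stubs that always fail (or do nothing) when the option is off. */
static inline int sketch_mirror_register(struct sketch_mirror *m)
{
    (void)m;
    return -EOPNOTSUPP;
}

static inline void sketch_mirror_unregister(struct sketch_mirror *m)
{
    (void)m;
}
#endif

/* A caller needs no #ifdef of its own. */
static int sketch_driver_probe(struct sketch_mirror *m)
{
    int ret;

    assert(m != NULL);
    ret = sketch_mirror_register(m);
    if (ret == -EOPNOTSUPP)
        return 0; /* optional feature missing: carry on without it */
    return ret;
}
```

The driver code compiles the same way whether or not the option is enabled,
and only sees a runtime error from the stub when the feature is absent.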

Reviewed-by: Jérôme Glisse 
Tested-by: Ralph Campbell 
Signed-off-by: Jason Gunthorpe 
---
 include/linux/hmm.h | 59 -
 kernel/fork.c   |  1 -
 2 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index fbb35c78637e57..cb69bf10dc788c 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -62,8 +62,6 @@
 #include 
 #include 
 
-#ifdef CONFIG_HMM_MIRROR
-
 #include 
 #include 
 #include 
@@ -374,6 +372,15 @@ struct hmm_mirror {
struct list_headlist;
 };
 
+/*
+ * Retry fault if non-blocking, drop mmap_sem and return -EAGAIN in that case.
+ */
+#define HMM_FAULT_ALLOW_RETRY  (1 << 0)
+
+/* Don't fault in missing PTEs, just snapshot the current state. */
+#define HMM_FAULT_SNAPSHOT (1 << 1)
+
+#ifdef CONFIG_HMM_MIRROR
 int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm);
 void hmm_mirror_unregister(struct hmm_mirror *mirror);
 
@@ -383,14 +390,6 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror);
 int hmm_range_register(struct hmm_range *range, struct hmm_mirror *mirror);
 void hmm_range_unregister(struct hmm_range *range);
 
-/*
- * Retry fault if non-blocking, drop mmap_sem and return -EAGAIN in that case.
- */
-#define HMM_FAULT_ALLOW_RETRY  (1 << 0)
-
-/* Don't fault in missing PTEs, just snapshot the current state. */
-#define HMM_FAULT_SNAPSHOT (1 << 1)
-
 long hmm_range_fault(struct hmm_range *range, unsigned int flags);
 
 long hmm_range_dma_map(struct hmm_range *range,
@@ -401,6 +400,44 @@ long hmm_range_dma_unmap(struct hmm_range *range,
 struct device *device,
 dma_addr_t *daddrs,
 bool dirty);
+#else
+int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm)
+{
+   return -EOPNOTSUPP;
+}
+
+void hmm_mirror_unregister(struct hmm_mirror *mirror)
+{
+}
+
+int hmm_range_register(struct hmm_range *range, struct hmm_mirror *mirror)
+{
+   return -EOPNOTSUPP;
+}
+
+void hmm_range_unregister(struct hmm_range *range)
+{
+}
+
+static inline long hmm_range_fault(struct hmm_range *range, unsigned int flags)
+{
+   return -EOPNOTSUPP;
+}
+
+static inline long hmm_range_dma_map(struct hmm_range *range,
+struct device *device, dma_addr_t *daddrs,
+unsigned int flags)
+{
+   return -EOPNOTSUPP;
+}
+
+static inline long hmm_range_dma_unmap(struct hmm_range *range,
+  struct device *device,
+  dma_addr_t *daddrs, bool dirty)
+{
+   return -EOPNOTSUPP;
+}
+#endif
 
 /*
  * HMM_RANGE_DEFAULT_TIMEOUT - default timeout (ms) when waiting for a range
@@ -411,6 +448,4 @@ long hmm_range_dma_unmap(struct hmm_range *range,
  */
 #define HMM_RANGE_DEFAULT_TIMEOUT 1000
 
-#endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
-
 #endif /* LINUX_HMM_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index bcdf5312521036..ca39cfc404e3db 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -40,7 +40,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
-- 
2.24.0



[Xen-devel] [PATCH v3 02/14] mm/mmu_notifier: add an interval tree notifier

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

Of the 13 users of mmu_notifiers, 8 of them use only
invalidate_range_start/end() and immediately intersect the
mmu_notifier_range with some kind of internal list of VAs. 4 use an
interval tree (i915_gem, radeon_mn, umem_odp, hfi1). 4 use a linked list
of some kind (scif_dma, vhost, gntdev, hmm).

The remaining 5 either don't use invalidate_range_start() or do some
special thing with it.

It turns out that building a correct scheme with an interval tree is
pretty complicated, particularly if the use case is synchronizing against
another thread doing get_user_pages().  Many of these implementations have
various subtle and difficult to fix races.

This approach puts the interval tree as common code at the top of the mmu
notifier call tree and implements a shareable locking scheme.

It includes:
 - An interval tree tracking VA ranges, with per-range callbacks
 - A read/write locking scheme for the interval tree that avoids
   sleeping in the notifier path (for OOM killer)
 - A sequence counter based collision-retry locking scheme to tell
   device page fault that a VA range is being concurrently invalidated.

This is based on various ideas:
- hmm accumulates invalidated VA ranges and releases them when all
  invalidates are done, via active_invalidate_ranges count.
  This approach avoids having to intersect the interval tree twice (as
  umem_odp does) at the potential cost of a longer device page fault.

- kvm/umem_odp use a sequence counter to drive the collision retry,
  via invalidate_seq

- a deferred work todo list on unlock scheme like RTNL, via deferred_list.
  This makes adding/removing interval tree members more deterministic

- seqlock, except this version makes the seqlock idea multi-holder on the
  write side by protecting it with active_invalidate_ranges and a spinlock

To minimize MM overhead when only the interval tree is being used, the
entire SRCU and hlist overheads are dropped using some simple
branches. Similarly the interval tree overhead is dropped when in hlist
mode.

The overhead from the mandatory spinlock is broadly the same as for most of
the existing users, which already had a lock (or two) of some sort on the
invalidation path.
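
To illustrate the collision-retry idea in isolation, here is a deliberately
simplified, single-threaded model. It is not the kernel implementation: the
sketch_* names are hypothetical, and the real scheme additionally relies on
a driver lock and proper memory ordering, both of which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a per-range sequence number. */
struct sketch_range {
    unsigned long invalidate_seq;
};

/* Fault side: sample the sequence before starting the slow work. */
static unsigned long sketch_read_begin(const struct sketch_range *r)
{
    return r->invalidate_seq;
}

/* Fault side: true means an invalidation collided with us, so start over. */
static bool sketch_read_retry(const struct sketch_range *r, unsigned long seq)
{
    return r->invalidate_seq != seq;
}

/* Invalidation side: bump the sequence so in-flight faults notice. */
static void sketch_invalidate(struct sketch_range *r)
{
    r->invalidate_seq++;
}

/*
 * One simulated device page fault. 'races' injects that many colliding
 * invalidations. Returns the number of attempts the fault needed.
 */
static int sketch_fault(struct sketch_range *r, int races)
{
    int tries = 0;

    assert(r != NULL);
    for (;;) {
        unsigned long seq = sketch_read_begin(r);

        tries++;
        /* ... the slow page lookup would happen here ... */
        if (races > 0) {
            sketch_invalidate(r); /* simulated concurrent invalidation */
            races--;
        }
        if (!sketch_read_retry(r, seq))
            return tries; /* nothing changed underneath us */
    }
}
```

A fault that sees no invalidation completes on the first pass; each
collision costs one extra pass, which is the "potential cost of a longer
device page fault" trade-off mentioned above.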

Acked-by: Christian König 
Tested-by: Philip Yang 
Tested-by: Ralph Campbell 
Reviewed-by: John Hubbard 
Signed-off-by: Jason Gunthorpe 
---
 include/linux/mmu_notifier.h | 101 +++
 mm/Kconfig   |   1 +
 mm/mmu_notifier.c| 552 +--
 3 files changed, 628 insertions(+), 26 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 12bd603d318ce7..9e6caa8ecd1938 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -6,10 +6,12 @@
 #include 
 #include 
 #include 
+#include 
 
 struct mmu_notifier_mm;
 struct mmu_notifier;
 struct mmu_notifier_range;
+struct mmu_interval_notifier;
 
 /**
  * enum mmu_notifier_event - reason for the mmu notifier callback
@@ -32,6 +34,9 @@ struct mmu_notifier_range;
  * access flags). User should soft dirty the page in the end callback to make
  * sure that anyone relying on soft dirtyness catch pages that might be written
  * through non CPU mappings.
+ *
+ * @MMU_NOTIFY_RELEASE: used during mmu_interval_notifier invalidate to signal
+ * that the mm refcount is zero and the range is no longer accessible.
  */
 enum mmu_notifier_event {
MMU_NOTIFY_UNMAP = 0,
@@ -39,6 +44,7 @@ enum mmu_notifier_event {
MMU_NOTIFY_PROTECTION_VMA,
MMU_NOTIFY_PROTECTION_PAGE,
MMU_NOTIFY_SOFT_DIRTY,
+   MMU_NOTIFY_RELEASE,
 };
 
 #define MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0)
@@ -222,6 +228,26 @@ struct mmu_notifier {
unsigned int users;
 };
 
+/**
+ * struct mmu_interval_notifier_ops
+ * @invalidate: Upon return the caller must stop using any SPTEs within this
+ *  range. This function can sleep. Return false only if sleeping
+ *  was required but mmu_notifier_range_blockable(range) is false.
+ */
+struct mmu_interval_notifier_ops {
+   bool (*invalidate)(struct mmu_interval_notifier *mni,
+  const struct mmu_notifier_range *range,
+  unsigned long cur_seq);
+};
+
+struct mmu_interval_notifier {
+   struct interval_tree_node interval_tree;
+   const struct mmu_interval_notifier_ops *ops;
+   struct mm_struct *mm;
+   struct hlist_node deferred_item;
+   unsigned long invalidate_seq;
+};
+
 #ifdef CONFIG_MMU_NOTIFIER
 
 #ifdef CONFIG_LOCKDEP
@@ -263,6 +289,81 @@ extern int __mmu_notifier_register(struct mmu_notifier *mn,
   struct mm_struct *mm);
 extern void mmu_notifier_unregister(struct mmu_notifier *mn,
struct mm_struct *mm);
+
+unsigned long mmu_interval_read_begin(struct mmu_interval_notifier *mni);
+int mmu_interval_notifier_insert(struct mmu_interval_notifier *mni,
+

[Xen-devel] [PATCH v3 12/14] drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

Convert the collision-retry lock around hmm_range_fault to use the one now
provided by the mmu_interval notifier.

Although this driver does not seem to use the collision retry lock that
hmm provides correctly, it can still be converted over to use the
mmu_interval_notifier api instead of hmm_mirror without too much trouble.

This also deletes another place where a driver is associating additional
data (struct amdgpu_mn) with an mm_struct.

Signed-off-by: Philip Yang 
Reviewed-by: Philip Yang 
Tested-by: Philip Yang 
Signed-off-by: Jason Gunthorpe 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   4 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|  14 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c| 148 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h|  49 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 116 --
 5 files changed, 94 insertions(+), 237 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 47700302a08b7f..1bcedb9b477dce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1738,6 +1738,10 @@ static int update_invalid_user_pages(struct amdkfd_process_info *process_info,
return ret;
}
 
+   /*
+* FIXME: Cannot ignore the return code, must hold
+* notifier_lock
+*/
amdgpu_ttm_tt_get_user_pages_done(bo->tbo.ttm);
 
/* Mark the BO as valid unless it was invalidated
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 82823d9a8ba887..22c989bca7514c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -603,8 +603,6 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
e->tv.num_shared = 2;
 
amdgpu_bo_list_get_list(p->bo_list, &p->validated);
-   if (p->bo_list->first_userptr != p->bo_list->num_entries)
-   p->mn = amdgpu_mn_get(p->adev, AMDGPU_MN_TYPE_GFX);
 
INIT_LIST_HEAD(&duplicates);
amdgpu_vm_get_pd_bo(&fpriv->vm, &p->validated, &p->vm_pd);
@@ -1287,11 +1285,11 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
if (r)
goto error_unlock;
 
-   /* No memory allocation is allowed while holding the mn lock.
-* p->mn is hold until amdgpu_cs_submit is finished and fence is added
-* to BOs.
+   /* No memory allocation is allowed while holding the notifier lock.
+* The lock is held until amdgpu_cs_submit is finished and fence is
+* added to BOs.
 */
-   amdgpu_mn_lock(p->mn);
+   mutex_lock(&p->adev->notifier_lock);
 
/* If userptr are invalidated after amdgpu_cs_parser_bos(), return
 * -EAGAIN, drmIoctl in libdrm will restart the amdgpu_cs_ioctl.
@@ -1334,13 +1332,13 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
 
ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
-   amdgpu_mn_unlock(p->mn);
+   mutex_unlock(&p->adev->notifier_lock);
 
return 0;
 
 error_abort:
drm_sched_job_cleanup(&job->base);
-   amdgpu_mn_unlock(p->mn);
+   mutex_unlock(&p->adev->notifier_lock);
 
 error_unlock:
amdgpu_job_free(job);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 9fe1c31ce17a30..828b5167ff128f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -50,28 +50,6 @@
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
 
-/**
- * amdgpu_mn_lock - take the write side lock for this notifier
- *
- * @mn: our notifier
- */
-void amdgpu_mn_lock(struct amdgpu_mn *mn)
-{
-   if (mn)
-   down_write(>lock);
-}
-
-/**
- * amdgpu_mn_unlock - drop the write side lock for this notifier
- *
- * @mn: our notifier
- */
-void amdgpu_mn_unlock(struct amdgpu_mn *mn)
-{
-   if (mn)
-   up_write(>lock);
-}
-
 /**
  * amdgpu_mn_invalidate_gfx - callback to notify about mm change
  *
@@ -94,6 +72,9 @@ static bool amdgpu_mn_invalidate_gfx(struct mmu_interval_notifier *mni,
return false;
 
mutex_lock(>notifier_lock);
+
+   mmu_interval_set_seq(mni, cur_seq);
+
r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv, true, false,
  MAX_SCHEDULE_TIMEOUT);
mutex_unlock(>notifier_lock);
@@ -127,6 +108,9 @@ static bool amdgpu_mn_invalidate_hsa(struct mmu_interval_notifier *mni,
return false;
 
mutex_lock(>notifier_lock);
+
+   mmu_interval_set_seq(mni, cur_seq);
+
amdgpu_amdkfd_evict_userptr(bo->kfd_bo, bo->notifier.mm);
mutex_unlock(>notifier_lock);
 
@@ -137,92 +121,6 @@ static const struct mmu_interval_notifier_ops 

[Xen-devel] [PATCH v3 09/14] nouveau: use mmu_interval_notifier instead of hmm_mirror

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

Remove the hmm_mirror object and use the mmu_interval_notifier API instead
for the range, and use the normal mmu_notifier API for the general
invalidation callback.

While here re-organize the pagefault path so the locking pattern is clear.

nouveau is the only driver that uses a temporary range object and instead
forwards nearly every invalidation range directly to the HW. While this is
not how the mmu_interval_notifier was intended to be used, the overheads on
the pagefaulting path are similar to the existing hmm_mirror version,
particularly since the interval tree will be small.

Tested-by: Ralph Campbell 
Signed-off-by: Jason Gunthorpe 
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 179 ++
 1 file changed, 99 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 577f8811925a59..df9bf1fd1bc0be 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -96,8 +96,6 @@ struct nouveau_svmm {
} unmanaged;
 
struct mutex mutex;
-
-   struct hmm_mirror mirror;
 };
 
 #define SVMM_DBG(s,f,a...) 
\
@@ -293,23 +291,11 @@ static const struct mmu_notifier_ops nouveau_mn_ops = {
.free_notifier = nouveau_svmm_free_notifier,
 };
 
-static int
-nouveau_svmm_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
-   const struct mmu_notifier_range *update)
-{
-   return 0;
-}
-
-static const struct hmm_mirror_ops nouveau_svmm = {
-   .sync_cpu_device_pagetables = nouveau_svmm_sync_cpu_device_pagetables,
-};
-
 void
 nouveau_svmm_fini(struct nouveau_svmm **psvmm)
 {
struct nouveau_svmm *svmm = *psvmm;
if (svmm) {
-   hmm_mirror_unregister(>mirror);
mutex_lock(>mutex);
svmm->vmm = NULL;
mutex_unlock(>mutex);
@@ -357,15 +343,10 @@ nouveau_svmm_init(struct drm_device *dev, void *data,
goto out_free;
 
down_write(>mm->mmap_sem);
-   svmm->mirror.ops = _svmm;
-   ret = hmm_mirror_register(>mirror, current->mm);
-   if (ret)
-   goto out_mm_unlock;
-
svmm->notifier.ops = _mn_ops;
ret = __mmu_notifier_register(>notifier, current->mm);
if (ret)
-   goto out_hmm_unregister;
+   goto out_mm_unlock;
/* Note, ownership of svmm transfers to mmu_notifier */
 
cli->svm.svmm = svmm;
@@ -374,8 +355,6 @@ nouveau_svmm_init(struct drm_device *dev, void *data,
mutex_unlock(>mutex);
return 0;
 
-out_hmm_unregister:
-   hmm_mirror_unregister(>mirror);
 out_mm_unlock:
up_write(>mm->mmap_sem);
 out_free:
@@ -503,43 +482,90 @@ nouveau_svm_fault_cache(struct nouveau_svm *svm,
fault->inst, fault->addr, fault->access);
 }
 
-static inline bool
-nouveau_range_done(struct hmm_range *range)
+struct svm_notifier {
+   struct mmu_interval_notifier notifier;
+   struct nouveau_svmm *svmm;
+};
+
+static bool nouveau_svm_range_invalidate(struct mmu_interval_notifier *mni,
+const struct mmu_notifier_range *range,
+unsigned long cur_seq)
 {
-   bool ret = hmm_range_valid(range);
+   struct svm_notifier *sn =
+   container_of(mni, struct svm_notifier, notifier);
 
-   hmm_range_unregister(range);
-   return ret;
+   /*
+* serializes the update to mni->invalidate_seq done by caller and
+* prevents invalidation of the PTE from progressing while HW is being
+* programmed. This is very hacky and only works because the normal
+* notifier that does invalidation is always called after the range
+* notifier.
+*/
+   if (mmu_notifier_range_blockable(range))
+   mutex_lock(>svmm->mutex);
+   else if (!mutex_trylock(>svmm->mutex))
+   return false;
+   mmu_interval_set_seq(mni, cur_seq);
+   mutex_unlock(>svmm->mutex);
+   return true;
 }
 
-static int
-nouveau_range_fault(struct nouveau_svmm *svmm, struct hmm_range *range)
+static const struct mmu_interval_notifier_ops nouveau_svm_mni_ops = {
+   .invalidate = nouveau_svm_range_invalidate,
+};
+
+static int nouveau_range_fault(struct nouveau_svmm *svmm,
+  struct nouveau_drm *drm, void *data, u32 size,
+  u64 *pfns, struct svm_notifier *notifier)
 {
+   unsigned long timeout =
+   jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
+   /* Have HMM fault pages within the fault window to the GPU. */
+   struct hmm_range range = {
+   .notifier = >notifier,
+   .start = notifier->notifier.interval_tree.start,
+   .end = notifier->notifier.interval_tree.last + 1,
+   .pfns = 

[Xen-devel] [PATCH v3 13/14] mm/hmm: remove hmm_mirror and related

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

The only two users of this are now converted to use mmu_interval_notifier,
so delete all the code and update hmm.rst.

Reviewed-by: Jérôme Glisse 
Tested-by: Ralph Campbell 
Signed-off-by: Jason Gunthorpe 
---
 Documentation/vm/hmm.rst | 105 ---
 include/linux/hmm.h  | 183 +
 mm/Kconfig   |   1 -
 mm/hmm.c | 285 ++-
 4 files changed, 34 insertions(+), 540 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index 0a5960beccf76d..893a8ba0e9fefb 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -147,49 +147,16 @@ Address space mirroring implementation and API
 Address space mirroring's main objective is to allow duplication of a range of
 CPU page table into a device page table; HMM helps keep both synchronized. A
 device driver that wants to mirror a process address space must start with the
-registration of an hmm_mirror struct::
-
- int hmm_mirror_register(struct hmm_mirror *mirror,
- struct mm_struct *mm);
-
-The mirror struct has a set of callbacks that are used
-to propagate CPU page tables::
-
- struct hmm_mirror_ops {
- /* release() - release hmm_mirror
-  *
-  * @mirror: pointer to struct hmm_mirror
-  *
-  * This is called when the mm_struct is being released.  The callback
-  * must ensure that all access to any pages obtained from this mirror
-  * is halted before the callback returns. All future access should
-  * fault.
-  */
- void (*release)(struct hmm_mirror *mirror);
-
- /* sync_cpu_device_pagetables() - synchronize page tables
-  *
-  * @mirror: pointer to struct hmm_mirror
-  * @update: update information (see struct mmu_notifier_range)
-  * Return: -EAGAIN if update.blockable false and callback need to
-  * block, 0 otherwise.
-  *
-  * This callback ultimately originates from mmu_notifiers when the CPU
-  * page table is updated. The device driver must update its page table
-  * in response to this callback. The update argument tells what action
-  * to perform.
-  *
-  * The device driver must not return from this callback until the device
-  * page tables are completely updated (TLBs flushed, etc); this is a
-  * synchronous call.
-  */
- int (*sync_cpu_device_pagetables)(struct hmm_mirror *mirror,
-   const struct hmm_update *update);
- };
-
-The device driver must perform the update action to the range (mark range
-read only, or fully unmap, etc.). The device must complete the update before
-the driver callback returns.
+registration of a mmu_interval_notifier::
+
+ mni->ops = _ops;
+ int mmu_interval_notifier_insert(struct mmu_interval_notifier *mni,
+ unsigned long start, unsigned long length,
+ struct mm_struct *mm);
+
+During the driver_ops->invalidate() callback the device driver must perform
+the update action to the range (mark range read only, or fully unmap,
+etc.). The device must complete the update before the driver callback returns.
 
 When the device driver wants to populate a range of virtual addresses, it can
 use::
@@ -216,70 +183,46 @@ The usage pattern is::
   struct hmm_range range;
   ...
 
+  range.notifier = 
   range.start = ...;
   range.end = ...;
   range.pfns = ...;
   range.flags = ...;
   range.values = ...;
   range.pfn_shift = ...;
-  hmm_range_register(, mirror);
 
-  /*
-   * Just wait for range to be valid, safe to ignore return value as we
-   * will use the return value of hmm_range_fault() below under the
-   * mmap_sem to ascertain the validity of the range.
-   */
-  hmm_range_wait_until_valid(, TIMEOUT_IN_MSEC);
+  if (!mmget_not_zero(mni->notifier.mm))
+  return -EFAULT;
 
  again:
+  range.notifier_seq = mmu_interval_read_begin();
   down_read(>mmap_sem);
   ret = hmm_range_fault(, HMM_RANGE_SNAPSHOT);
   if (ret) {
   up_read(>mmap_sem);
-  if (ret == -EBUSY) {
-/*
- * No need to check hmm_range_wait_until_valid() return value
- * on retry we will get proper error with hmm_range_fault()
- */
-hmm_range_wait_until_valid(, TIMEOUT_IN_MSEC);
-goto again;
-  }
-  hmm_range_unregister();
+  if (ret == -EBUSY)
+ goto again;
   return ret;
   }
+  up_read(>mmap_sem);
+
   take_lock(driver->update);
-  if (!hmm_range_valid()) {
+  if (mmu_interval_read_retry(, range.notifier_seq) {
   release_lock(driver->update);
-  up_read(>mmap_sem);
   goto again;
   }
 
-  // Use pfns array content to update device page table
+  /* Use pfns array content to update device 

[Xen-devel] [PATCH v3 10/14] drm/amdgpu: Call find_vma under mmap_sem

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

find_vma() must be called under the mmap_sem, so reorganize this code to
do the vma check after entering the lock.

Further, fix the unlocked use of struct task_struct's mm; instead use
the mm from hmm_mirror, which has an active mm_grab. Also, the mm_grab
must be converted to a mm_get before acquiring mmap_sem or calling
find_vma().

Fixes: 66c45500bfdc ("drm/amdgpu: use new HMM APIs and helpers")
Fixes: 0919195f2b0d ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads")
Acked-by: Christian König 
Reviewed-by: Felix Kuehling 
Reviewed-by: Philip Yang 
Tested-by: Philip Yang 
Signed-off-by: Jason Gunthorpe 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 37 ++---
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index dff41d0a85fe96..c0e41f1f0c2365 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -788,7 +789,7 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
struct hmm_mirror *mirror = bo->mn ? >mn->mirror : NULL;
struct ttm_tt *ttm = bo->tbo.ttm;
struct amdgpu_ttm_tt *gtt = (void *)ttm;
-   struct mm_struct *mm = gtt->usertask->mm;
+   struct mm_struct *mm;
unsigned long start = gtt->userptr;
struct vm_area_struct *vma;
struct hmm_range *range;
@@ -796,25 +797,14 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
uint64_t *pfns;
int r = 0;
 
-   if (!mm) /* Happens during process shutdown */
-   return -ESRCH;
-
if (unlikely(!mirror)) {
DRM_DEBUG_DRIVER("Failed to get hmm_mirror\n");
-   r = -EFAULT;
-   goto out;
+   return -EFAULT;
}
 
-   vma = find_vma(mm, start);
-   if (unlikely(!vma || start < vma->vm_start)) {
-   r = -EFAULT;
-   goto out;
-   }
-   if (unlikely((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) &&
-   vma->vm_file)) {
-   r = -EPERM;
-   goto out;
-   }
+   mm = mirror->hmm->mmu_notifier.mm;
+   if (!mmget_not_zero(mm)) /* Happens during process shutdown */
+   return -ESRCH;
 
range = kzalloc(sizeof(*range), GFP_KERNEL);
if (unlikely(!range)) {
@@ -847,6 +837,17 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT);
 
down_read(>mmap_sem);
+   vma = find_vma(mm, start);
+   if (unlikely(!vma || start < vma->vm_start)) {
+   r = -EFAULT;
+   goto out_unlock;
+   }
+   if (unlikely((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) &&
+   vma->vm_file)) {
+   r = -EPERM;
+   goto out_unlock;
+   }
+
r = hmm_range_fault(range, 0);
up_read(>mmap_sem);
 
@@ -865,15 +866,19 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
}
 
gtt->range = range;
+   mmput(mm);
 
return 0;
 
+out_unlock:
+   up_read(>mmap_sem);
 out_free_pfns:
hmm_range_unregister(range);
kvfree(pfns);
 out_free_ranges:
kfree(range);
 out:
+   mmput(mm);
return r;
 }
 
-- 
2.24.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 06/14] RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

This converts one of the two users of mmu_notifiers to use the new API.
The conversion is fairly straightforward; however, the existing use of
notifiers here seems to be racy.

Tested-by: Dennis Dalessandro 
Signed-off-by: Jason Gunthorpe 
---
 drivers/infiniband/hw/hfi1/file_ops.c |   2 +-
 drivers/infiniband/hw/hfi1/hfi.h  |   2 +-
 drivers/infiniband/hw/hfi1/user_exp_rcv.c | 146 +-
 drivers/infiniband/hw/hfi1/user_exp_rcv.h |   3 +-
 4 files changed, 60 insertions(+), 93 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index f9a7e9d29c8ba2..7c5e3fb224139a 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -1138,7 +1138,7 @@ static int get_ctxt_info(struct hfi1_filedata *fd, unsigned long arg, u32 len)
HFI1_CAP_UGET_MASK(uctxt->flags, MASK) |
HFI1_CAP_KGET_MASK(uctxt->flags, K2U);
/* adjust flag if this fd is not able to cache */
-   if (!fd->handler)
+   if (!fd->use_mn)
cinfo.runtime_flags |= HFI1_CAP_TID_UNMAP; /* no caching */
 
cinfo.num_active = hfi1_count_active_units();
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index fa45350a9a1d32..fc10d65fc3e13c 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1444,7 +1444,7 @@ struct hfi1_filedata {
/* for cpu affinity; -1 if none */
int rec_cpu_num;
u32 tid_n_pinned;
-   struct mmu_rb_handler *handler;
+   bool use_mn;
struct tid_rb_node **entry_to_rb;
spinlock_t tid_lock; /* protect tid_[limit,used] counters */
u32 tid_limit;
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index 3592a9ec155e85..75a378162162d3 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -59,11 +59,11 @@ static int set_rcvarray_entry(struct hfi1_filedata *fd,
  struct tid_user_buf *tbuf,
  u32 rcventry, struct tid_group *grp,
  u16 pageidx, unsigned int npages);
-static int tid_rb_insert(void *arg, struct mmu_rb_node *node);
 static void cacheless_tid_rb_remove(struct hfi1_filedata *fdata,
struct tid_rb_node *tnode);
-static void tid_rb_remove(void *arg, struct mmu_rb_node *node);
-static int tid_rb_invalidate(void *arg, struct mmu_rb_node *mnode);
+static bool tid_rb_invalidate(struct mmu_interval_notifier *mni,
+ const struct mmu_notifier_range *range,
+ unsigned long cur_seq);
 static int program_rcvarray(struct hfi1_filedata *fd, struct tid_user_buf *,
struct tid_group *grp,
unsigned int start, u16 count,
@@ -73,10 +73,8 @@ static int unprogram_rcvarray(struct hfi1_filedata *fd, u32 tidinfo,
  struct tid_group **grp);
 static void clear_tid_node(struct hfi1_filedata *fd, struct tid_rb_node *node);
 
-static struct mmu_rb_ops tid_rb_ops = {
-   .insert = tid_rb_insert,
-   .remove = tid_rb_remove,
-   .invalidate = tid_rb_invalidate
+static const struct mmu_interval_notifier_ops tid_mn_ops = {
+   .invalidate = tid_rb_invalidate,
 };
 
 /*
@@ -87,7 +85,6 @@ static struct mmu_rb_ops tid_rb_ops = {
 int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
   struct hfi1_ctxtdata *uctxt)
 {
-   struct hfi1_devdata *dd = uctxt->dd;
int ret = 0;
 
spin_lock_init(>tid_lock);
@@ -109,20 +106,7 @@ int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
fd->entry_to_rb = NULL;
return -ENOMEM;
}
-
-   /*
-* Register MMU notifier callbacks. If the registration
-* fails, continue without TID caching for this context.
-*/
-   ret = hfi1_mmu_rb_register(fd, fd->mm, _rb_ops,
-  dd->pport->hfi1_wq,
-  >handler);
-   if (ret) {
-   dd_dev_info(dd,
-   "Failed MMU notifier registration %d\n",
-   ret);
-   ret = 0;
-   }
+   fd->use_mn = true;
}
 
/*
@@ -139,7 +123,7 @@ int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
 * init.
 */
spin_lock(>tid_lock);
-   if (uctxt->subctxt_cnt && fd->handler) {
+   if (uctxt->subctxt_cnt && fd->use_mn) {
u16 remainder;
 
fd->tid_limit = uctxt->expected_count / uctxt->subctxt_cnt;
@@ -158,18 +142,10 @@ void hfi1_user_exp_rcv_free(struct 

[Xen-devel] [PATCH v3 08/14] nouveau: use mmu_notifier directly for invalidate_range_start

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

There is no reason to get the invalidate_range_start() callback via an
indirection through hmm_mirror, just register a normal notifier directly.

Tested-by: Ralph Campbell 
Signed-off-by: Jason Gunthorpe 
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 95 ++-
 1 file changed, 63 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 668d4bd0c118f1..577f8811925a59 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -88,6 +88,7 @@ nouveau_ivmm_find(struct nouveau_svm *svm, u64 inst)
 }
 
 struct nouveau_svmm {
+   struct mmu_notifier notifier;
struct nouveau_vmm *vmm;
struct {
unsigned long start;
@@ -96,7 +97,6 @@ struct nouveau_svmm {
 
struct mutex mutex;
 
-   struct mm_struct *mm;
struct hmm_mirror mirror;
 };
 
@@ -251,10 +251,11 @@ nouveau_svmm_invalidate(struct nouveau_svmm *svmm, u64 start, u64 limit)
 }
 
 static int
-nouveau_svmm_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
-   const struct mmu_notifier_range *update)
+nouveau_svmm_invalidate_range_start(struct mmu_notifier *mn,
+   const struct mmu_notifier_range *update)
 {
-   struct nouveau_svmm *svmm = container_of(mirror, typeof(*svmm), mirror);
+   struct nouveau_svmm *svmm =
+   container_of(mn, struct nouveau_svmm, notifier);
unsigned long start = update->start;
unsigned long limit = update->end;
 
@@ -264,6 +265,9 @@ nouveau_svmm_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
SVMM_DBG(svmm, "invalidate %016lx-%016lx", start, limit);
 
mutex_lock(>mutex);
+   if (unlikely(!svmm->vmm))
+   goto out;
+
if (limit > svmm->unmanaged.start && start < svmm->unmanaged.limit) {
if (start < svmm->unmanaged.start) {
nouveau_svmm_invalidate(svmm, start,
@@ -273,19 +277,31 @@ nouveau_svmm_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
}
 
nouveau_svmm_invalidate(svmm, start, limit);
+
+out:
mutex_unlock(>mutex);
return 0;
 }
 
-static void
-nouveau_svmm_release(struct hmm_mirror *mirror)
+static void nouveau_svmm_free_notifier(struct mmu_notifier *mn)
+{
+   kfree(container_of(mn, struct nouveau_svmm, notifier));
+}
+
+static const struct mmu_notifier_ops nouveau_mn_ops = {
+   .invalidate_range_start = nouveau_svmm_invalidate_range_start,
+   .free_notifier = nouveau_svmm_free_notifier,
+};
+
+static int
+nouveau_svmm_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
+   const struct mmu_notifier_range *update)
 {
+   return 0;
 }
 
-static const struct hmm_mirror_ops
-nouveau_svmm = {
+static const struct hmm_mirror_ops nouveau_svmm = {
.sync_cpu_device_pagetables = nouveau_svmm_sync_cpu_device_pagetables,
-   .release = nouveau_svmm_release,
 };
 
 void
@@ -294,7 +310,10 @@ nouveau_svmm_fini(struct nouveau_svmm **psvmm)
struct nouveau_svmm *svmm = *psvmm;
if (svmm) {
hmm_mirror_unregister(>mirror);
-   kfree(*psvmm);
+   mutex_lock(>mutex);
+   svmm->vmm = NULL;
+   mutex_unlock(>mutex);
+   mmu_notifier_put(>notifier);
*psvmm = NULL;
}
 }
@@ -320,7 +339,7 @@ nouveau_svmm_init(struct drm_device *dev, void *data,
mutex_lock(>mutex);
if (cli->svm.cli) {
ret = -EBUSY;
-   goto done;
+   goto out_free;
}
 
/* Allocate a new GPU VMM that can support SVM (managed by the
@@ -335,24 +354,33 @@ nouveau_svmm_init(struct drm_device *dev, void *data,
.fault_replay = true,
}, sizeof(struct gp100_vmm_v0), >svm.vmm);
if (ret)
-   goto done;
+   goto out_free;
 
-   /* Enable HMM mirroring of CPU address-space to VMM. */
-   svmm->mm = get_task_mm(current);
-   down_write(>mm->mmap_sem);
+   down_write(>mm->mmap_sem);
svmm->mirror.ops = _svmm;
-   ret = hmm_mirror_register(>mirror, svmm->mm);
-   if (ret == 0) {
-   cli->svm.svmm = svmm;
-   cli->svm.cli = cli;
-   }
-   up_write(>mm->mmap_sem);
-   mmput(svmm->mm);
+   ret = hmm_mirror_register(>mirror, current->mm);
+   if (ret)
+   goto out_mm_unlock;
 
-done:
+   svmm->notifier.ops = _mn_ops;
+   ret = __mmu_notifier_register(>notifier, current->mm);
if (ret)
-   nouveau_svmm_fini();
+   goto out_hmm_unregister;
+   /* Note, ownership of svmm transfers to mmu_notifier */
+
+   cli->svm.svmm = svmm;
+   cli->svm.cli = cli;
+   up_write(>mm->mmap_sem);

[Xen-devel] [PATCH v3 07/14] drm/radeon: use mmu_interval_notifier_insert

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

The new API is an exact match for the needs of radeon.

For some reason radeon tries to remove overlapping ranges from the
interval tree, but interval trees (and mmu_interval_notifier_insert())
support overlapping ranges directly. Simply delete all this code.

Since this driver is missing an invalidate_range_end callback, but
still calls get_user_pages(), it cannot be correct against all races.

Reviewed-by: Christian König 
Signed-off-by: Jason Gunthorpe 
---
 drivers/gpu/drm/radeon/radeon.h|   9 +-
 drivers/gpu/drm/radeon/radeon_mn.c | 218 ++---
 2 files changed, 51 insertions(+), 176 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index d59b004f669583..30e32adc1fc666 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -68,6 +68,10 @@
 #include 
 #include 
 
+#ifdef CONFIG_MMU_NOTIFIER
+#include 
+#endif
+
 #include 
 #include 
 #include 
@@ -509,8 +513,9 @@ struct radeon_bo {
struct ttm_bo_kmap_obj  dma_buf_vmap;
pid_t   pid;
 
-   struct radeon_mn*mn;
-   struct list_headmn_list;
+#ifdef CONFIG_MMU_NOTIFIER
+   struct mmu_interval_notifiernotifier;
+#endif
 };
 #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, tbo.base)
 
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index dbab9a3a969b9e..f93829f08a4dc1 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -36,131 +36,51 @@
 
 #include "radeon.h"
 
-struct radeon_mn {
-   struct mmu_notifier mn;
-
-   /* objects protected by lock */
-   struct mutexlock;
-   struct rb_root_cached   objects;
-};
-
-struct radeon_mn_node {
-   struct interval_tree_node   it;
-   struct list_headbos;
-};
-
 /**
- * radeon_mn_invalidate_range_start - callback to notify about mm change
+ * radeon_mn_invalidate - callback to notify about mm change
  *
  * @mn: our notifier
- * @mn: the mm this callback is about
- * @start: start of updated range
- * @end: end of updated range
+ * @range: the VMA under invalidation
  *
  * We block for all BOs between start and end to be idle and
  * unmap them by move them into system domain again.
  */
-static int radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
-   const struct mmu_notifier_range *range)
+static bool radeon_mn_invalidate(struct mmu_interval_notifier *mn,
+const struct mmu_notifier_range *range,
+unsigned long cur_seq)
 {
-   struct radeon_mn *rmn = container_of(mn, struct radeon_mn, mn);
+   struct radeon_bo *bo = container_of(mn, struct radeon_bo, notifier);
struct ttm_operation_ctx ctx = { false, false };
-   struct interval_tree_node *it;
-   unsigned long end;
-   int ret = 0;
-
-   /* notification is exclusive, but interval is inclusive */
-   end = range->end - 1;
-
-   /* TODO we should be able to split locking for interval tree and
-* the tear down.
-*/
-   if (mmu_notifier_range_blockable(range))
-   mutex_lock(>lock);
-   else if (!mutex_trylock(>lock))
-   return -EAGAIN;
-
-   it = interval_tree_iter_first(>objects, range->start, end);
-   while (it) {
-   struct radeon_mn_node *node;
-   struct radeon_bo *bo;
-   long r;
-
-   if (!mmu_notifier_range_blockable(range)) {
-   ret = -EAGAIN;
-   goto out_unlock;
-   }
-
-   node = container_of(it, struct radeon_mn_node, it);
-   it = interval_tree_iter_next(it, range->start, end);
+   long r;
 
-   list_for_each_entry(bo, >bos, mn_list) {
+   if (!bo->tbo.ttm || bo->tbo.ttm->state != tt_bound)
+   return true;
 
-   if (!bo->tbo.ttm || bo->tbo.ttm->state != tt_bound)
-   continue;
+   if (!mmu_notifier_range_blockable(range))
+   return false;
 
-   r = radeon_bo_reserve(bo, true);
-   if (r) {
-   DRM_ERROR("(%ld) failed to reserve user bo\n", 
r);
-   continue;
-   }
-
-   r = dma_resv_wait_timeout_rcu(bo->tbo.base.resv,
-   true, false, MAX_SCHEDULE_TIMEOUT);
-   if (r <= 0)
-   DRM_ERROR("(%ld) failed to wait for user bo\n", 
r);
-
-   radeon_ttm_placement_from_domain(bo, 
RADEON_GEM_DOMAIN_CPU);
-   r = ttm_bo_validate(>tbo, >placement, );
-   if (r)
-   DRM_ERROR("(%ld) failed 

[Xen-devel] [PATCH v3 01/14] mm/mmu_notifier: define the header pre-processor parts even if disabled

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

Now that we have KERNEL_HEADER_TEST all headers are generally compile
tested, so relying on makefile tricks to avoid compiling code that depends
on CONFIG_MMU_NOTIFIER is more annoying.

Instead follow the usual pattern and provide most of the header with only
the functions stubbed out when CONFIG_MMU_NOTIFIER is disabled. This
ensures code compiles no matter what the config setting is.
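
The stub pattern described above can be sketched outside the kernel. This
is only an illustration under stated assumptions: CONFIG_TOY_NOTIFIER,
struct toy_mm, toy_mm_has_notifiers() and toy_needs_flush() are
hypothetical names, not kernel symbols. The point is that the types stay
visible in both configurations and only the function bodies change, so
callers need no #ifdef.

```c
#include <stddef.h>

/* Hypothetical feature switch standing in for a Kconfig symbol. */
/* #define CONFIG_TOY_NOTIFIER 1 */

/* Types are always visible, whatever the configuration. */
struct toy_mm {
	void *notifier_state;	/* non-NULL once a notifier is attached */
};

#ifdef CONFIG_TOY_NOTIFIER

/* Real implementation: report whether anything is attached. */
static inline int toy_mm_has_notifiers(const struct toy_mm *mm)
{
	return mm->notifier_state != NULL;
}

#else /* !CONFIG_TOY_NOTIFIER */

/* Stub: same signature, constant result. */
static inline int toy_mm_has_notifiers(const struct toy_mm *mm)
{
	(void)mm;
	return 0;
}

#endif

/* A caller that compiles identically under either configuration. */
static inline int toy_needs_flush(const struct toy_mm *mm)
{
	return toy_mm_has_notifiers(mm) ? 1 : 0;
}
```

With the switch left undefined, toy_needs_flush() always returns 0 and
the compiler can discard the call entirely; defining it enables the real
check without touching any caller.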

While here, struct mmu_notifier_mm is private to mmu_notifier.c, move it.

Reviewed-by: Jérôme Glisse 
Tested-by: Ralph Campbell 
Reviewed-by: John Hubbard 
Signed-off-by: Jason Gunthorpe 
---
 include/linux/mmu_notifier.h | 46 +---
 mm/mmu_notifier.c| 13 ++
 2 files changed, 30 insertions(+), 29 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 1bd8e6a09a3c27..12bd603d318ce7 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -7,8 +7,9 @@
 #include 
 #include 
 
+struct mmu_notifier_mm;
 struct mmu_notifier;
-struct mmu_notifier_ops;
+struct mmu_notifier_range;
 
 /**
  * enum mmu_notifier_event - reason for the mmu notifier callback
@@ -40,36 +41,8 @@ enum mmu_notifier_event {
MMU_NOTIFY_SOFT_DIRTY,
 };
 
-#ifdef CONFIG_MMU_NOTIFIER
-
-#ifdef CONFIG_LOCKDEP
-extern struct lockdep_map __mmu_notifier_invalidate_range_start_map;
-#endif
-
-/*
- * The mmu notifier_mm structure is allocated and installed in
- * mm->mmu_notifier_mm inside the mm_take_all_locks() protected
- * critical section and it's released only when mm_count reaches zero
- * in mmdrop().
- */
-struct mmu_notifier_mm {
-   /* all mmu notifiers registerd in this mm are queued in this list */
-   struct hlist_head list;
-   /* to serialize the list modifications and hlist_unhashed */
-   spinlock_t lock;
-};
-
 #define MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0)
 
-struct mmu_notifier_range {
-   struct vm_area_struct *vma;
-   struct mm_struct *mm;
-   unsigned long start;
-   unsigned long end;
-   unsigned flags;
-   enum mmu_notifier_event event;
-};
-
 struct mmu_notifier_ops {
/*
 * Called either by mmu_notifier_unregister or when the mm is
@@ -249,6 +222,21 @@ struct mmu_notifier {
unsigned int users;
 };
 
+#ifdef CONFIG_MMU_NOTIFIER
+
+#ifdef CONFIG_LOCKDEP
+extern struct lockdep_map __mmu_notifier_invalidate_range_start_map;
+#endif
+
+struct mmu_notifier_range {
+   struct vm_area_struct *vma;
+   struct mm_struct *mm;
+   unsigned long start;
+   unsigned long end;
+   unsigned flags;
+   enum mmu_notifier_event event;
+};
+
 static inline int mm_has_notifiers(struct mm_struct *mm)
 {
return unlikely(mm->mmu_notifier_mm);
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 7fde88695f35d6..367670cfd02b7b 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -27,6 +27,19 @@ struct lockdep_map __mmu_notifier_invalidate_range_start_map = {
 };
 #endif
 
+/*
+ * The mmu notifier_mm structure is allocated and installed in
+ * mm->mmu_notifier_mm inside the mm_take_all_locks() protected
+ * critical section and it's released only when mm_count reaches zero
+ * in mmdrop().
+ */
+struct mmu_notifier_mm {
+   /* all mmu notifiers registered in this mm are queued in this list */
+   struct hlist_head list;
+   /* to serialize the list modifications and hlist_unhashed */
+   spinlock_t lock;
+};
+
 /*
  * This function can't run concurrently against mmu_notifier_register
  * because mm->mm_users > 0 during mmu_notifier_register and exit_mmap
-- 
2.24.0



[Xen-devel] [PATCH v3 03/14] mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

hmm_mirror's handling of ranges does not use a sequence count which
results in this bug:

 CPU0   CPU1
 hmm_range_wait_until_valid(range)
 valid == true
 hmm_range_fault(range)
hmm_invalidate_range_start()
   range->valid = false
hmm_invalidate_range_end()
   range->valid = true
 hmm_range_valid(range)
  valid == true

Where the hmm_range_valid() should not have succeeded.

Adding the required sequence count would make it nearly identical to the
new mmu_interval_notifier. Instead replace the hmm_mirror stuff with
mmu_interval_notifier.
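
The interleaving above can be modelled in a few lines of single-threaded
C. This is a sketch, not the kernel implementation: struct toy_range and
the toy_*() helpers are hypothetical names, and all locking is left out.
It only shows why a flag that is set back to true hides a completed
invalidation, while a counter that only moves forward does not.

```c
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct toy_range {
	bool valid;		/* hmm_mirror-style flag */
	unsigned long seq;	/* sequence-count-style counter */
};

/* An invalidation starts: the flag drops and the counter advances. */
static void toy_invalidate_start(struct toy_range *r)
{
	r->valid = false;
	r->seq++;
}

/* The invalidation ends: the flag is restored, the counter is not. */
static void toy_invalidate_end(struct toy_range *r)
{
	r->valid = true;
}

/* Reader side: remember the counter before collecting pages. */
static unsigned long toy_read_begin(const struct toy_range *r)
{
	return r->seq;
}

/* Reader side: true if any invalidation ran since toy_read_begin(). */
static bool toy_read_retry(const struct toy_range *r, unsigned long seq)
{
	return r->seq != seq;
}
```

If a reader calls toy_read_begin(), then a full start/end invalidation
pair runs, the flag reads true again but toy_read_retry() reports that
the reader must start over.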

Co-existence of the two APIs is the first step.

Reviewed-by: Jérôme Glisse 
Tested-by: Philip Yang 
Tested-by: Ralph Campbell 
Signed-off-by: Jason Gunthorpe 
---
 include/linux/hmm.h |  5 +
 mm/hmm.c| 25 +++--
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 3fec513b9c00f1..fbb35c78637e57 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -145,6 +145,9 @@ enum hmm_pfn_value_e {
 /*
  * struct hmm_range - track invalidation lock on virtual address range
  *
+ * @notifier: an optional mmu_interval_notifier
+ * @notifier_seq: when notifier is used this is the result of
+ *mmu_interval_read_begin()
  * @hmm: the core HMM structure this range is active against
  * @vma: the vm area struct for the range
  * @list: all range lock are on a list
@@ -159,6 +162,8 @@ enum hmm_pfn_value_e {
  * @valid: pfns array did not change since it has been fill by an HMM function
  */
 struct hmm_range {
+   struct mmu_interval_notifier *notifier;
+   unsigned long   notifier_seq;
struct hmm  *hmm;
struct list_headlist;
unsigned long   start;
diff --git a/mm/hmm.c b/mm/hmm.c
index 6b0136665407a3..8d060c5dabe37b 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -858,6 +858,14 @@ void hmm_range_unregister(struct hmm_range *range)
 }
 EXPORT_SYMBOL(hmm_range_unregister);
 
+static bool needs_retry(struct hmm_range *range)
+{
+   if (range->notifier)
+   return mmu_interval_check_retry(range->notifier,
+   range->notifier_seq);
+   return !range->valid;
+}
+
 static const struct mm_walk_ops hmm_walk_ops = {
.pud_entry  = hmm_vma_walk_pud,
.pmd_entry  = hmm_vma_walk_pmd,
@@ -898,18 +906,23 @@ long hmm_range_fault(struct hmm_range *range, unsigned int flags)
const unsigned long device_vma = VM_IO | VM_PFNMAP | VM_MIXEDMAP;
unsigned long start = range->start, end;
struct hmm_vma_walk hmm_vma_walk;
-   struct hmm *hmm = range->hmm;
+   struct mm_struct *mm;
struct vm_area_struct *vma;
int ret;
 
-   lockdep_assert_held(&hmm->mmu_notifier.mm->mmap_sem);
+   if (range->notifier)
+   mm = range->notifier->mm;
+   else
+   mm = range->hmm->mmu_notifier.mm;
+
+   lockdep_assert_held(&mm->mmap_sem);
 
do {
/* If range is no longer valid force retry. */
-   if (!range->valid)
+   if (needs_retry(range))
return -EBUSY;
 
-   vma = find_vma(hmm->mmu_notifier.mm, start);
+   vma = find_vma(mm, start);
if (vma == NULL || (vma->vm_flags & device_vma))
return -EFAULT;
 
@@ -939,7 +952,7 @@ long hmm_range_fault(struct hmm_range *range, unsigned int flags)
start = hmm_vma_walk.last;
 
/* Keep trying while the range is valid. */
-   } while (ret == -EBUSY && range->valid);
+   } while (ret == -EBUSY && !needs_retry(range));
 
if (ret) {
unsigned long i;
@@ -997,7 +1010,7 @@ long hmm_range_dma_map(struct hmm_range *range, struct device *device,
continue;
 
/* Check if range is being invalidated */
-   if (!range->valid) {
+   if (needs_retry(range)) {
ret = -EBUSY;
goto unmap;
}
-- 
2.24.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH hmm v3 00/14] Consolidate the mmu notifier interval_tree and locking

2019-11-12 Thread Jason Gunthorpe
From: Jason Gunthorpe 

8 of the drivers using mmu_notifier (i915_gem, radeon_mn, umem_odp, hfi1,
scif_dma, vhost, gntdev, hmm) share a common pattern where
they only use invalidate_range_start/end and immediately check the
invalidating range against some driver data structure to tell if the
driver is interested. Half of them use an interval_tree, the others use
simple linear search lists.
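
As a rough illustration of the pattern described above, the linear-search
variant can be sketched in plain C. This is a userspace sketch only:
struct tracked_range and both helpers are hypothetical names and are not
taken from any of the listed drivers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical driver-side record of one tracked user address range. */
struct tracked_range {
    unsigned long start;   /* inclusive */
    unsigned long end;     /* exclusive */
};

/* True if the half-open range [start, end) intersects the tracked range. */
static bool range_overlaps(const struct tracked_range *r,
                           unsigned long start, unsigned long end)
{
    return r->start < end && start < r->end;
}

/*
 * Linear-search variant: return the first tracked range hit by the
 * invalidation, or NULL if the driver is not interested in it.
 */
static const struct tracked_range *
find_first_overlap(const struct tracked_range *ranges, size_t n,
                   unsigned long start, unsigned long end)
{
    assert(start < end);
    for (size_t i = 0; i < n; i++)
        if (range_overlaps(&ranges[i], start, end))
            return &ranges[i];
    return NULL;
}
```

An interval tree answers the same question without visiting every entry,
which is presumably why half of the drivers use one.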

Most of the ones I checked seem to have various kinds of races,
bugs and poor implementations. This is a result of the complexity in how
the notifier interacts with get_user_pages(). It is extremely difficult to
use it correctly.

Consolidate all of this code together into the core mmu_notifier and
provide a locking scheme similar to hmm_mirror that allows the user to
safely use get_user_pages() and reliably know if the page list still
matches the mm.
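
The general idea behind such a scheme is a sequence count that is sampled
before the pages are collected and compared afterwards. The following is
a toy userspace model under that assumption, not the kernel
implementation; every toy_* name is hypothetical, and a real driver would
additionally perform the final check under its own lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a notifier: only the sequence counter. */
struct toy_notifier {
    unsigned long invalidate_seq;
};

/* Sample the sequence before collecting pages. */
static unsigned long toy_read_begin(const struct toy_notifier *n)
{
    return n->invalidate_seq;
}

/* True if an invalidation happened since 'seq' was sampled. */
static bool toy_read_retry(const struct toy_notifier *n, unsigned long seq)
{
    return n->invalidate_seq != seq;
}

/* What the invalidation side does in this model: bump the counter. */
static void toy_invalidate(struct toy_notifier *n)
{
    n->invalidate_seq++;
}

/*
 * Collect pages, then confirm nothing was invalidated in the meantime.
 * 'racing_invalidations' simulates how many times an invalidation lands
 * between the sample and the final check. Returns the attempt count.
 */
static int toy_fault_range(struct toy_notifier *n, int racing_invalidations)
{
    int attempts = 0;
    unsigned long seq;

    assert(racing_invalidations >= 0);
    do {
        seq = toy_read_begin(n);
        attempts++;
        /* A real user would fault in and record the pages here. */
        if (racing_invalidations > 0) {
            toy_invalidate(n);
            racing_invalidations--;
        }
    } while (toy_read_retry(n, seq));

    return attempts;
}
```

In this model the page list is only trusted once a full pass completes
with an unchanged sequence number.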

This new arrangement plays nicely with the !blockable mode for
OOM. Scanning the interval tree is done such that the intersection test
will always succeed, and since there is no invalidate_range_end exposed to
drivers the scheme safely allows multiple drivers to be subscribed.

Four places are converted as an example of how the new API is used.
Four are left for future patches:
 - i915_gem has complex locking around destruction of a registration,
   needs more study
 - hfi1 (2nd user) needs access to the rbtree
 - scif_dma has a complicated logic flow
 - vhost's mmu notifiers are already being rewritten

This is already in linux-next, a git tree is available here:

 https://github.com/jgunthorpe/linux/commits/mmu_notifier

v3:
- Rename mmu_range_notifier to mmu_interval_notifier for clarity
  Avoids confusion with struct mmu_notifier_range
- Fix bugs in odp, amdgpu and xen gntdev from testing
- Make ops an argument to mmu_interval_notifier_insert() to make it
  harder to misuse
- Update many comments
- Add testing of mm_count during insertion

v2: https://lore.kernel.org/r/20191028201032.6352-1-...@ziepe.ca
v1: https://lore.kernel.org/r/20191015181242.8343-1-...@ziepe.ca

Absent any new discussion I think this will go to Linus at the next merge
window.

Thanks to everyone who helped!

Jason Gunthorpe (14):
  mm/mmu_notifier: define the header pre-processor parts even if
disabled
  mm/mmu_notifier: add an interval tree notifier
  mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or
hmm_mirror
  mm/hmm: define the pre-processor related parts of hmm.h even if
disabled
  RDMA/odp: Use mmu_interval_notifier_insert()
  RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv
  drm/radeon: use mmu_interval_notifier_insert
  nouveau: use mmu_notifier directly for invalidate_range_start
  nouveau: use mmu_interval_notifier instead of hmm_mirror
  drm/amdgpu: Call find_vma under mmap_sem
  drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
  drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror
  mm/hmm: remove hmm_mirror and related
  xen/gntdev: use mmu_interval_notifier_insert

 Documentation/vm/hmm.rst  | 105 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |   2 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   9 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|  14 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c| 443 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h|  53 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h|  13 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 145 +++--
 drivers/gpu/drm/nouveau/nouveau_svm.c | 230 ---
 drivers/gpu/drm/radeon/radeon.h   |   9 +-
 drivers/gpu/drm/radeon/radeon_mn.c| 218 ++-
 drivers/infiniband/core/device.c  |   1 -
 drivers/infiniband/core/umem_odp.c| 303 ++
 drivers/infiniband/hw/hfi1/file_ops.c |   2 +-
 drivers/infiniband/hw/hfi1/hfi.h  |   2 +-
 drivers/infiniband/hw/hfi1/user_exp_rcv.c | 146 ++---
 drivers/infiniband/hw/hfi1/user_exp_rcv.h |   3 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h  |   7 +-
 drivers/infiniband/hw/mlx5/mr.c   |   3 +-
 drivers/infiniband/hw/mlx5/odp.c  |  50 +-
 drivers/xen/gntdev-common.h   |   8 +-
 drivers/xen/gntdev.c  | 179 ++
 include/linux/hmm.h   | 195 +-
 include/linux/mmu_notifier.h  | 147 -
 include/rdma/ib_umem_odp.h|  68 +--
 include/rdma/ib_verbs.h   |   2 -
 kernel/fork.c |   1 -
 mm/Kconfig|   2 +-
 mm/hmm.c  | 276 +
 mm/mmu_notifier.c | 565 +-
 31 files changed, 1271 insertions(+), 1931 deletions(-)

-- 
2.24.0



Re: [Xen-devel] [PATCH] Introduce a description of a new optional tag for Backports

2019-11-12 Thread Lars Kurth


On 12/11/2019, 11:10, "Stefano Stabellini"  wrote:

On Tue, 12 Nov 2019, Ian Jackson wrote:
> Anthony PERARD writes ("Re: [Xen-devel] [PATCH] Introduce a description 
of a new optional tag for Backports"):
> > Should we describe the Fixes: tag as well? These would have a similar
> > purpose to the backport tag, I mean it could help figure out which
> > commit to backport to which tree.
> 
> Good point.

Yes, good idea.


Lars, I think we are already in agreement.

You can find the description of "Fixes" here in Linux
Documentation/process/submitting-patches.rst.

It would be good to get Jan's ACK at least
Lars



Re: [Xen-devel] Xen-unstable: AMD-Vi: update_paging_mode Try to access pdev_list without aquiring pcidevs_lock.

2019-11-12 Thread Sander Eikelenboom
On 12/11/2019 12:05, Jan Beulich wrote:
> On 11.11.2019 22:38, Sander Eikelenboom wrote:
>> When supplying "pci=nomsi" to the guest kernel, the device works fine,
>> and I don't get the "INVALID_DEV_REQUEST".
>>
>> After reverting 1b00c16bdf, the device works fine 
>> and I don't get the INVALID_DEV_REQUEST, 
> 
> Could you give the patch below a try? That commit took care of only
> securing ourselves, but not of relaxing things again when a device
> gets handed to a guest for actual use.
> 
> Jan

Hi Jan,

CC'ed Juergen, as he seems to have been dropped from the CC list at some point.

Just tested this patch: 
the device works fine and I don't get the INVALID_DEV_REQUEST.

This was the last remaining issue around pci passthrough I encountered, 
with all patches applied (yours and Anthony's) pci passthrough for me 
seems to work again as I was used to.

Thanks again for fixing the issues and providing the right educated guesses!

--
Sander


> AMD/IOMMU: restore DTE fields in amd_iommu_setup_domain_device()
> 
> Commit 1b00c16bdf ("AMD/IOMMU: pre-fill all DTEs right after table
> allocation") moved ourselves into a more secure default state, but
> didn't take sufficient care to also undo the effects when handing a
> previously disabled device back to a(nother) domain. Put the fields
> that may have been changed elsewhere back to their intended values
> (some fields amd_iommu_disable_domain_device() touches don't
> currently get written anywhere else, and hence don't need modifying
> here).
> 
> Reported-by: Sander Eikelenboom 
> Signed-off-by: Jan Beulich 
> 
> --- unstable.orig/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ unstable/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -114,11 +114,21 @@ static void amd_iommu_setup_domain_devic
>  
>  if ( !dte->v || !dte->tv )
>  {
> +const struct ivrs_mappings *ivrs_dev;
> +
>  /* bind DTE to domain page-tables */
>  amd_iommu_set_root_page_table(
>  dte, page_to_maddr(hd->arch.root_table), domain->domain_id,
>  hd->arch.paging_mode, valid);
>  
> +/* Undo what amd_iommu_disable_domain_device() may have done. */
> +ivrs_dev = &get_ivrs_mappings(iommu->seg)[req_id];
> +if ( dte->it_root )
> +dte->int_ctl = IOMMU_DEV_TABLE_INT_CONTROL_TRANSLATED;
> +dte->iv = iommu_intremap;
> +dte->ex = ivrs_dev->dte_allow_exclusion;
> +dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
> +
>  if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
>   iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
>  dte->i = ats_enabled;
> 



[Xen-devel] [xen-4.12-testing test] 144035: tolerable trouble: fail/pass/starved - PUSHED

2019-11-12 Thread osstest service owner
flight 144035 xen-4.12-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144035/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat fail in 144007 pass in 144035
 test-amd64-i386-xl-raw 19 guest-start/debian.repeat fail in 144007 pass in 144035
 test-armhf-armhf-libvirt-raw 15 guest-start/debian.repeat fail in 144007 pass in 144035
 test-armhf-armhf-xl-vhd 15 guest-start/debian.repeat fail in 144007 pass in 144035
 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host fail pass in 144007
 test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install fail pass in 144007

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qcow2 17 guest-localmigrate/x10 fail in 144007 like 143190
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail in 144007 never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail in 144007 never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail in 144007 never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail in 144007 never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail in 144007 never pass
 test-amd64-amd64-xl-multivcpu  7 xen-boot fail like 143155
 test-amd64-i386-xl-pvshim    12 guest-start                  fail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-check    fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt      13 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-arm64-arm64-xl          13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl          14 saverestore-support-check    fail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-check    fail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-check    fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check        fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check    fail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check    fail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check    fail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check        fail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-vhd      12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-vhd      13 saverestore-support-check    fail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check    fail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt     13 migrate-support-check        fail   never pass
 test-armhf-armhf-libvirt     14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-rtds     13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-rtds     14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check        fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check    fail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check        fail never pass
 test-armhf-armhf-xl-cubietruck 14 

Re: [Xen-devel] [PATCH] xen/sched: remove wrong assertions in csched2_free_pdata()

2019-11-12 Thread Dario Faggioli
On Fri, 2019-11-08 at 08:38 +0100, Juergen Gross wrote:
> The assertions in csched2_free_pdata() are wrong as in case it is
> called by schedule_cpu_add() after a failure of sched_alloc_udata()
> the init pdata function won't have been called.
> 
Sorry, maybe too much time has passed since when I wrote this code, and
I'm rusty, but the comment says:

 "we want to be sure that either init_pdata has never been called, or 
  deinit_pdata has been called already"

So, the case of init_pdata not having been called is considered.

And yet, you are saying it is wrong because:

 "in case it is called [..] after a failure of sched_alloc_udata() the 
  init pdata function won't have been called"

But, as just said, init_pdata not having been called was one of the
possibilities... wasn't it?

Or am I misunderstanding the meaning of the sentence above?

Don't get me wrong, I never particularly loved these ASSERT()s and I'd
be more than fine seeing them go... :-)

Have you seen them triggering inappropriately, either before or after
the core-scheduling series (and either with core-scheduling on or off)?

Regards

(leaving the patch in context on purpose, in case it's useful)

> ---
>  xen/common/sched_credit2.c | 16 
>  1 file changed, 16 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index af58ee161d..a995ff838f 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -3914,10 +3914,6 @@ csched2_deinit_pdata(const struct scheduler
> *ops, void *pcpu, int cpu)
>  
>  write_lock_irqsave(&prv->lock, flags);
>  
> -/*
> - * alloc_pdata is not implemented, so pcpu must be NULL. On the
> other
> - * hand, init_pdata must have been called for this pCPU.
> - */
>  /*
>   * Scheduler specific data for this pCPU must still be there and
> and be
>   * valid. In fact, if we are here:
> @@ -3969,18 +3965,6 @@ csched2_deinit_pdata(const struct scheduler
> *ops, void *pcpu, int cpu)
>  static void
>  csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>  {
> -struct csched2_pcpu *spc = pcpu;
> -
> -/*
> - * pcpu either points to a valid struct csched2_pcpu, or is NULL
> (if
> - * CPU bringup failed, and we're beeing called from
> CPU_UP_CANCELLED).
> - * xfree() does not really mind, but we want to be sure that
> either
> - * init_pdata has never been called, or deinit_pdata has been
> called
> - * already.
> - */
> -ASSERT(!pcpu || spc->runq_id == -1);
> -ASSERT(!cpumask_test_cpu(cpu, &csched2_priv(ops)->initialized));
> -
>  xfree(pcpu);
>  }
>  
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
---
<> (Raistlin Majere)




[Xen-devel] Xen Security Advisory 304 v1 (CVE-2018-12207) - x86: Machine Check Error on Page Size Change DoS

2019-11-12 Thread Xen . org security team
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Xen Security Advisory CVE-2018-12207 / XSA-304

x86: Machine Check Error on Page Size Change DoS

ISSUE DESCRIPTION
=================

An erratum exists across some CPUs whereby an instruction fetch may
cause a machine check error if the pagetables have been updated in a
specific manner without invalidating the TLB.

The x86 architecture explicitly permits modification of the pagetables
without TLB invalidation, but in this corner case, the impacted core
ceases operating and an unexpected machine check or system reset occurs.

This corner case can be triggered by guest kernels.

For more details, see:
  
https://software.intel.com/security-software-guidance/insights/deep-dive-machine-check-error-avoidance-page-size-change

IMPACT
======

A malicious guest kernel can crash the host, resulting in a Denial of
Service (DoS).  (This CPU bug may also be triggered accidentally.)

VULNERABLE SYSTEMS
==================

Systems running all versions of Xen are affected.

Only x86 processors are vulnerable.  ARM processors are not believed to
be vulnerable.

Only Intel Core based processors (from Nehalem onwards) are affected.
Other processor designs (Intel Atom/Knights range), and other
manufacturers (AMD) are not known to be affected.

Only x86 HVM/PVH guests can exploit the vulnerability.  x86 PV guests
cannot exploit the vulnerability.

Please consult the Intel Security Advisory for details on the affected
processors.

MITIGATION
==========

Running only PV guests avoids the vulnerability.

Booting Xen with `hap_2mb=0 hap_1gb=0` on the command line, to disable
the use of HAP superpages, works around the vulnerability.

Booting Xen with `hap=0` to disable HAP entirely, or configuring HVM/PVH
guests to use shadow paging (hap=0 in xl.cfg) works around the
vulnerability, but the performance impact of shadow paging in
combination with in-guest Meltdown mitigations (KPTI, KVAS, etc) will
most likely make this option prohibitive to use.

RESOLUTION
==========

Applying the appropriate attached patches resolves this issue.

By default, Xen will disable executable superpages on
believed-vulnerable hardware, and report so at boot:

  (XEN) VMX: Disabling executable EPT superpages due to CVE-2018-12207

See the performance and safety consideration section below.

xsa304/xsa304-*.patch   xen-unstable
xsa304/xsa304-4.12-*.patch  Xen 4.12.x
xsa304/xsa304-4.11-*.patch  Xen 4.11.x
xsa304/xsa304-4.10-*.patch  Xen 4.10.x
xsa304/xsa304-4.9-*.patch   Xen 4.9.x
xsa304/xsa304-4.8-*.patch   Xen 4.8.x

The patches are comprised of:
 *-1.patch: Fix on SandyBridge hardware discovered during testing
 *-2.patch: Main security fix
 *-3.patch: (4.10 and later) Runtime control of fast vs secure

$ sha256sum xsa304*/*
3365e0351b3ccb39e3be53bcbfd8219d8282f6f3d97d6c4519a3e860b27f6844  xsa304/xsa304-1.patch
1a85753717312f2b20f291c9e79271c63be2a9542fbec651d0a8fc4d8aca0408  xsa304/xsa304-2.patch
0c770aa15f2aef2bb3253194243968181a4bb1710d09d6f785ed7f5dae03b93b  xsa304/xsa304-3.patch
2d2eb25b842578bd45480c8ff6f2266617dd0db5e6e552d5ae481eb764c8aea0  xsa304/xsa304-4.8-1.patch
72d91f67af06f89d01f7dc1e6ff87f50cad28bbb0475eb5cfbb986ee51775bc2  xsa304/xsa304-4.8-2.patch
d8d18e7dd9b59f01454352a46d38699b21c5f1f7ff6bd2aa8e63fbd7a98cfca4  xsa304/xsa304-4.9-1.patch
244df964d70eab300c77210456439dfb1c46f2ddd9f1b851e1110be7573948ba  xsa304/xsa304-4.9-2.patch
2d80f2603412abb4e644b8e868f4218e90db3f59b25f833ff7342d347af6c5a8  xsa304/xsa304-4.10-1.patch
94a87371ddeccf5705ed71a961135393fa9046e4235cc90402f9292dcfffa43c  xsa304/xsa304-4.10-2.patch
9862e46c2bcbbeaba32d06d7af33b8b97fd8be5a4a35bcd70264e9913031f512  xsa304/xsa304-4.10-3.patch
b927c5b7a5dbf6260fd37ec2a594d5a0ff40b2fa78c9ca59fa184c87d8d1  xsa304/xsa304-4.11-1.patch
478d7b7b27bb0a4ed874a4d6fe73282d785feed8c35f3278a07a1228d5dfad77  xsa304/xsa304-4.11-2.patch
d0e079a0af7045711a21ac52674e5821e69c370f7ef64c9ebdfc0990950f7a54  xsa304/xsa304-4.11-3.patch
4025732fd83a94c09b023f079e9b3c8399649f31e406f5f0c736a522f75fdd53  xsa304/xsa304-4.12-1.patch
2653c57fc79b98ca5cc30ceb2299d11c2ba96f4becdfb93a1cc14ca943e18420  xsa304/xsa304-4.12-2.patch
ec670ca4e3782043824e1f475ba187d89a53836d4e2ad8399daf0a91fcc747dc  xsa304/xsa304-4.12-3.patch
$

PERFORMANCE AND SAFETY CONSIDERATIONS
=====================================

Disabling executable EPT superpages does come with a performance impact,
caused by increased iTLB pressure.  The overhead will be workload and
CPU dependent.

In configurations where guest kernels are trusted not to mount a DoS
attempt, the mitigation can be turned off by booting with `ept=exec-sp`.

In configurations where the guest kernels are not trusted, users are
recommended to measure the impact to their workloads as part of deciding
between fast and secure.

On Xen 4.10 and later, a runtime decision can be made between fast and
secure by using `xl set-parameters ept=[no-]exec-sp`.

NOTE 

[Xen-devel] [tip: x86/boot] x86/boot: Introduce kernel_info.setup_type_max

2019-11-12 Thread tip-bot2 for Daniel Kiper
The following commit has been merged into the x86/boot branch of tip:

Commit-ID:     00cd1c154d565c62ad5e065bf3530f68bdf59490
Gitweb:        https://git.kernel.org/tip/00cd1c154d565c62ad5e065bf3530f68bdf59490
Author:        Daniel Kiper
AuthorDate:    Tue, 12 Nov 2019 14:46:39 +01:00
Committer:     Borislav Petkov
CommitterDate: Tue, 12 Nov 2019 16:16:54 +01:00

x86/boot: Introduce kernel_info.setup_type_max

This field contains the maximal allowed type for setup_data.

Do not bump setup_header version in arch/x86/boot/header.S because it
will be followed by additional changes coming into the Linux/x86 boot
protocol.
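
One plausible use of this field on the boot loader side is to validate a
setup_data type before handing it to the kernel. The sketch below assumes
the loader already holds a raw copy of kernel_info; the 0x0c offset comes
from the boot.rst hunk in this patch, while read_le32() and
setup_type_allowed() are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of setup_type_max inside kernel_info, per the boot.rst hunk. */
#define KERNEL_INFO_SETUP_TYPE_MAX_OFF 0x0c

/* Read a little-endian 32-bit value without assuming host alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return true if 'type' does not exceed kernel_info.setup_type_max.
 * A kernel_info blob too short to hold the field is treated as
 * "not allowed" (an assumption of this sketch).
 */
static bool setup_type_allowed(const uint8_t *kernel_info, size_t len,
                               uint32_t type)
{
    assert(kernel_info != NULL);
    if (len < KERNEL_INFO_SETUP_TYPE_MAX_OFF + 4)
        return false;
    return type <= read_le32(kernel_info + KERNEL_INFO_SETUP_TYPE_MAX_OFF);
}
```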

Suggested-by: H. Peter Anvin (Intel) 
Signed-off-by: Daniel Kiper 
Signed-off-by: Borislav Petkov 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Ross Philipson 
Reviewed-by: H. Peter Anvin (Intel) 
Cc: Andy Lutomirski 
Cc: ard.biesheu...@linaro.org
Cc: Boris Ostrovsky 
Cc: dave.han...@linux.intel.com
Cc: eric.snowb...@oracle.com
Cc: Ingo Molnar 
Cc: Jonathan Corbet 
Cc: Juergen Gross 
Cc: kanth.ghatr...@oracle.com
Cc: linux-...@vger.kernel.org
Cc: linux-efi 
Cc: Peter Zijlstra 
Cc: rdun...@infradead.org
Cc: ross.philip...@oracle.com
Cc: Thomas Gleixner 
Cc: x86-ml 
Cc: xen-devel@lists.xenproject.org
Link: https://lkml.kernel.org/r/20191112134640.16035-3-daniel.ki...@oracle.com
---
 Documentation/x86/boot.rst |  9 -
 arch/x86/boot/compressed/kernel_info.S |  5 +
 arch/x86/include/uapi/asm/bootparam.h  |  3 +++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index c60fafd..6cdd767 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -73,7 +73,7 @@ Protocol 2.14:BURNT BY INCORRECT COMMIT 
ae7e1238e68f2a472a125673ab506d49158c188
(x86/boot: Add ACPI RSDP address to setup_header)
DO NOT USE!!! ASSUME SAME AS 2.13.
 
-Protocol 2.15: (Kernel 5.5) Added the kernel_info.
+Protocol 2.15: (Kernel 5.5) Added the kernel_info and kernel_info.setup_type_max.
 =  
 
 .. note::
@@ -981,6 +981,13 @@ Offset/size:   0x0008/4
   This field contains the size of the kernel_info including kernel_info.header
   and kernel_info.kernel_info_var_len_data.
 
+   ==
+Field name:setup_type_max
+Offset/size:   0x000c/4
+   ==
+
+  This field contains maximal allowed type for setup_data.
+
 
 The Image Checksum
 ==
diff --git a/arch/x86/boot/compressed/kernel_info.S b/arch/x86/boot/compressed/kernel_info.S
index 8ea6f6e..018dacb 100644
--- a/arch/x86/boot/compressed/kernel_info.S
+++ b/arch/x86/boot/compressed/kernel_info.S
@@ -1,5 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
+#include <asm/bootparam.h>
+
.section ".rodata.kernel_info", "a"
 
.global kernel_info
@@ -12,6 +14,9 @@ kernel_info:
/* Size total. */
.long   kernel_info_end - kernel_info
 
+   /* Maximal allowed type for setup_data. */
+   .long   SETUP_TYPE_MAX
+
 kernel_info_var_len_data:
/* Empty for time being... */
 kernel_info_end:
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index a1ebcd7..dbb4112 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -11,6 +11,9 @@
 #define SETUP_APPLE_PROPERTIES 5
 #define SETUP_JAILHOUSE 6
 
+/* max(SETUP_*) */
+#define SETUP_TYPE_MAX SETUP_JAILHOUSE
+
 /* ram_size flags */
 #define RAMDISK_IMAGE_START_MASK   0x07FF
 #define RAMDISK_PROMPT_FLAG 0x8000


[Xen-devel] [tip: x86/boot] x86/boot: Introduce kernel_info

2019-11-12 Thread tip-bot2 for Daniel Kiper
The following commit has been merged into the x86/boot branch of tip:

Commit-ID:     2c33c27fd6033ced942c9a591b8ac15c07c57d70
Gitweb:        https://git.kernel.org/tip/2c33c27fd6033ced942c9a591b8ac15c07c57d70
Author:        Daniel Kiper
AuthorDate:    Tue, 12 Nov 2019 14:46:38 +01:00
Committer:     Borislav Petkov
CommitterDate: Tue, 12 Nov 2019 16:10:34 +01:00

x86/boot: Introduce kernel_info

The relationships between the headers are analogous to the various data
sections:

  setup_header = .data
  boot_params/setup_data = .bss

What is missing from the above list? That's right:

  kernel_info = .rodata

We have been (ab)using .data for things that could go into .rodata or .bss for
a long time, for lack of alternatives and -- especially early on -- inertia.
Also, the BIOS stub is responsible for creating boot_params, so it isn't
available to a BIOS-based loader (setup_data is, though).

setup_header is permanently limited to 144 bytes due to the reach of the
2-byte jump field, which doubles as a length field for the structure, combined
with the size of the "hole" in struct boot_params that a protected-mode loader
or the BIOS stub has to copy it into. It is currently 119 bytes long, which
leaves us with 25 very precious bytes. This isn't something that can be fixed
without revising the boot protocol entirely, breaking backwards compatibility.
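
The 144-byte figure can be reproduced from two facts in the x86 boot
protocol documentation: the setup header starts at offset 0x1f1, and its
end is 0x202 plus the 8-bit displacement of the short jump at 0x200. The
arithmetic is sketched below; the enum and function names are
hypothetical, and the 0x66 displacement used in the check is derived from
the 119-byte length quoted above rather than read from a real image.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SETUP_HEADER_START = 0x1f1, /* first byte of the setup header */
    JUMP_TARGET_BASE   = 0x202, /* address the jump displacement is added to */
    JUMP_OFFSET_MAX    = 0x7f,  /* largest forward 8-bit displacement */
};

/* End offset of the setup header for a given jump displacement. */
static unsigned setup_header_end(uint8_t jump_offset)
{
    assert(jump_offset <= JUMP_OFFSET_MAX);
    return JUMP_TARGET_BASE + jump_offset;
}

/* Length in bytes of the setup header for a given jump displacement. */
static unsigned setup_header_len(uint8_t jump_offset)
{
    return setup_header_end(jump_offset) - SETUP_HEADER_START;
}
```

With the maximum displacement this gives 0x281 - 0x1f1 = 144 bytes, and a
119-byte header leaves 144 - 119 = 25 bytes to spare.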

boot_params proper is limited to 4096 bytes, but can be arbitrarily extended
by adding setup_data entries. It cannot be used to communicate properties of
the kernel image, because it is .bss and has no image-provided content.

kernel_info solves this by providing an extensible place for information about
the kernel image. It is readonly, because the kernel cannot rely on a
bootloader copying its contents anywhere, but that is OK; if it becomes
necessary it can still contain data items that an enabled bootloader would be
expected to copy into a setup_data chunk.

Do not bump setup_header version in arch/x86/boot/header.S because it
will be followed by additional changes coming into the Linux/x86 boot
protocol.

Suggested-by: H. Peter Anvin (Intel) 
Signed-off-by: Daniel Kiper 
Signed-off-by: Borislav Petkov 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Ross Philipson 
Reviewed-by: H. Peter Anvin (Intel) 
Cc: Andy Lutomirski 
Cc: ard.biesheu...@linaro.org
Cc: Boris Ostrovsky 
Cc: dave.han...@linux.intel.com
Cc: eric.snowb...@oracle.com
Cc: Ingo Molnar 
Cc: Jonathan Corbet 
Cc: Juergen Gross 
Cc: kanth.ghatr...@oracle.com
Cc: linux-...@vger.kernel.org
Cc: linux-efi 
Cc: Peter Zijlstra 
Cc: rdun...@infradead.org
Cc: ross.philip...@oracle.com
Cc: Thomas Gleixner 
Cc: x86-ml 
Cc: xen-devel@lists.xenproject.org
Link: https://lkml.kernel.org/r/20191112134640.16035-2-daniel.ki...@oracle.com
---
 Documentation/x86/boot.rst | 126 -
 arch/x86/boot/Makefile |   2 +-
 arch/x86/boot/compressed/Makefile  |   4 +-
 arch/x86/boot/compressed/kernel_info.S |  17 +++-
 arch/x86/boot/header.S |   1 +-
 arch/x86/boot/tools/build.c|   5 +-
 arch/x86/include/uapi/asm/bootparam.h  |   1 +-
 7 files changed, 153 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/boot/compressed/kernel_info.S

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index 08a2f10..c60fafd 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -68,8 +68,25 @@ Protocol 2.12  (Kernel 3.8) Added the xloadflags field and extension fields
 Protocol 2.13  (Kernel 3.14) Support 32- and 64-bit flags being set in
xloadflags to support booting a 64-bit kernel from 32-bit
EFI
+
+Protocol 2.14: BURNT BY INCORRECT COMMIT ae7e1238e68f2a472a125673ab506d49158c1889
+   (x86/boot: Add ACPI RSDP address to setup_header)
+   DO NOT USE!!! ASSUME SAME AS 2.13.
+
+Protocol 2.15: (Kernel 5.5) Added the kernel_info.
 =  
 
+.. note::
+ The protocol version number should be changed only if the setup header
+ is changed. There is no need to update the version number if boot_params
+ or kernel_info are changed. Additionally, it is recommended to use
+ xloadflags (in this case the protocol version number should not be
+ updated either) or kernel_info to communicate supported Linux kernel
+ features to the boot loader. Due to very limited space available in
+ the original setup header every update to it should be considered
+ with great care. Starting from the protocol 2.15 the primary way to
+ communicate things to the boot loader is the kernel_info.
+
 
 Memory Layout
 =
@@ -207,6 +224,7 @@ Offset/Size Proto   NameMeaning
 0258/8 2.10+   pref_addressPreferred loading 
address
 0260/4 2.10+   init_size   Linear memory required 
during 

[Xen-devel] [tip: x86/boot] x86/boot: Introduce setup_indirect

2019-11-12 Thread tip-bot2 for Daniel Kiper
The following commit has been merged into the x86/boot branch of tip:

Commit-ID:     b3c72fc9a78e74161f9d05ef7191706060628f8c
Gitweb:        https://git.kernel.org/tip/b3c72fc9a78e74161f9d05ef7191706060628f8c
Author:        Daniel Kiper
AuthorDate:    Tue, 12 Nov 2019 14:46:40 +01:00
Committer:     Borislav Petkov
CommitterDate: Tue, 12 Nov 2019 16:21:15 +01:00

x86/boot: Introduce setup_indirect

The setup_data is a bit awkward to use for extremely large data objects,
both because the setup_data header has to be adjacent to the data object
and because it has a 32-bit length field. However, it is important that
intermediate stages of the boot process have a way to identify which
chunks of memory are occupied by kernel data. Thus introduce a uniform
way to specify such indirect data: the setup_indirect struct and the
SETUP_INDIRECT type.

And finally bump setup_header version in arch/x86/boot/header.S.
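
To make the intent concrete, here is a userspace sketch of how an
intermediate boot stage might account for the memory described by a
setup_data list that contains SETUP_INDIRECT entries. The struct layouts
follow the patch; occupied_bytes() is a hypothetical helper, and pointers
stand in for the physical addresses a real boot stage would see.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SETUP_E820_EXT 1
#define SETUP_INDIRECT (1u << 31)

struct setup_data {
    uint64_t next;      /* address of the next node, 0 terminates the list */
    uint32_t type;
    uint32_t len;       /* length of data[] in bytes */
    uint8_t  data[];
};

struct setup_indirect {
    uint32_t type;      /* SETUP_INDIRECT | SETUP_*, never SETUP_INDIRECT alone */
    uint32_t reserved;  /* must be zero */
    uint64_t len;
    uint64_t addr;
};

/*
 * Sum the bytes occupied by every node in the list plus, for
 * SETUP_INDIRECT nodes, the separate region the node points at.
 * A nested bare SETUP_INDIRECT is ignored, since the list must not
 * turn into a tree.
 */
static uint64_t occupied_bytes(const struct setup_data *sd)
{
    uint64_t total = 0;

    while (sd != NULL) {
        total += sizeof(*sd) + sd->len;
        if (sd->type == SETUP_INDIRECT &&
            sd->len >= sizeof(struct setup_indirect)) {
            struct setup_indirect ind;

            memcpy(&ind, sd->data, sizeof(ind));
            if (ind.type != SETUP_INDIRECT)
                total += ind.len;
        }
        sd = (const struct setup_data *)(uintptr_t)sd->next;
    }
    return total;
}
```

A real consumer would reserve or map each region rather than just sum the
sizes; the traversal and the SETUP_INDIRECT check are the parts this
sketch is meant to show.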

Suggested-by: H. Peter Anvin (Intel) 
Signed-off-by: Daniel Kiper 
Signed-off-by: Borislav Petkov 
Reviewed-by: Ross Philipson 
Reviewed-by: H. Peter Anvin (Intel) 
Acked-by: Konrad Rzeszutek Wilk 
Cc: Andy Lutomirski 
Cc: ard.biesheu...@linaro.org
Cc: Boris Ostrovsky 
Cc: dave.han...@linux.intel.com
Cc: eric.snowb...@oracle.com
Cc: Ingo Molnar 
Cc: Jonathan Corbet 
Cc: Juergen Gross 
Cc: kanth.ghatr...@oracle.com
Cc: linux-...@vger.kernel.org
Cc: linux-efi 
Cc: Peter Zijlstra 
Cc: rdun...@infradead.org
Cc: ross.philip...@oracle.com
Cc: Thomas Gleixner 
Cc: x86-ml 
Cc: xen-devel@lists.xenproject.org
Link: https://lkml.kernel.org/r/20191112134640.16035-4-daniel.ki...@oracle.com
---
 Documentation/x86/boot.rst | 43 -
 arch/x86/boot/compressed/kaslr.c   | 12 +++-
 arch/x86/boot/compressed/kernel_info.S |  2 +-
 arch/x86/boot/header.S |  2 +-
 arch/x86/include/uapi/asm/bootparam.h  | 16 +++--
 arch/x86/kernel/e820.c | 11 ++-
 arch/x86/kernel/kdebugfs.c | 21 +---
 arch/x86/kernel/ksysfs.c   | 31 +-
 arch/x86/kernel/setup.c|  6 +++-
 arch/x86/mm/ioremap.c  | 11 ++-
 10 files changed, 138 insertions(+), 17 deletions(-)

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index 6cdd767..90bb8f5 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -827,6 +827,47 @@ Protocol:  2.09+
   sure to consider the case where the linked list already contains
   entries.
 
+  The setup_data is a bit awkward to use for extremely large data objects,
+  both because the setup_data header has to be adjacent to the data object
+  and because it has a 32-bit length field. However, it is important that
+  intermediate stages of the boot process have a way to identify which
+  chunks of memory are occupied by kernel data.
+
+  Thus setup_indirect struct and SETUP_INDIRECT type were introduced in
+  protocol 2.15.
+
+  struct setup_indirect {
+__u32 type;
+__u32 reserved;  /* Reserved, must be set to zero. */
+__u64 len;
+__u64 addr;
+  };
+
+  The type member is a SETUP_INDIRECT | SETUP_* type. However, it cannot be
+  SETUP_INDIRECT itself since making the setup_indirect a tree structure
+  could require a lot of stack space in something that needs to parse it
+  and stack space can be limited in boot contexts.
+
+  Let's give an example how to point to SETUP_E820_EXT data using setup_indirect.
+  In this case setup_data and setup_indirect will look like this:
+
+  struct setup_data {
+__u64 next = 0 or ;
+__u32 type = SETUP_INDIRECT;
+__u32 len = sizeof(setup_data);
+__u8 data[sizeof(setup_indirect)] = struct setup_indirect {
+  __u32 type = SETUP_INDIRECT | SETUP_E820_EXT;
+  __u32 reserved = 0;
+  __u64 len = ;
+  __u64 addr = ;
+}
+  }
+
+.. note::
+ SETUP_INDIRECT | SETUP_NONE objects cannot be properly distinguished
+ from SETUP_INDIRECT itself. So, this kind of objects cannot be provided
+ by the bootloaders.
+
    
 Field name:pref_address
 Type:  read (reloc)
@@ -986,7 +1027,7 @@ Field name:setup_type_max
 Offset/size:   0x000c/4
    ==
 
-  This field contains maximal allowed type for setup_data.
+  This field contains maximal allowed type for setup_data and setup_indirect structs.
 
 
 The Image Checksum
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 2e53c05..bb9bfef 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -459,6 +459,18 @@ static bool mem_avoid_overlap(struct mem_vector *img,
is_overlapping = true;
}
 
+   if (ptr->type == SETUP_INDIRECT &&
+   ((struct setup_indirect *)ptr->data)->type != SETUP_INDIRECT) {
+   avoid.start = ((struct setup_indirect *)ptr->data)->addr;
+   avoid.size = ((struct 

Re: [Xen-devel] [PATCH] arch: arm: vgic-v3: fix GICD_ISACTIVER range

2019-11-12 Thread Stefano Stabellini
On Tue, 12 Nov 2019, Peng Fan wrote:
> Hi Julien,
> 
> Inline marked with [Peng Fan]

Please use plain text emails on xen-devel (and other open source
development mailing lists.)


> From: Julien Grall  
> Sent: 9 November 2019 6:44
> To: Stefano Stabellini ; Andre Przywara 
> 
> Cc: Peng Fan ; Jürgen Groß ; 
> julien.gr...@arm.com; xen-de...@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH] arch: arm: vgic-v3: fix GICD_ISACTIVER range
> 
> Hi,
> 
> Sorry for the formatting.
> On Sat, 9 Nov 2019, 04:27 Stefano Stabellini,  
> wrote:
> On Thu, 7 Nov 2019, Peng Fan wrote:
> > The end should be GICD_ISACTIVERN not GICD_ISACTIVER.
> > 
> > Signed-off-by: Peng Fan 
> 
> Reviewed-by: Stefano Stabellini 
> 
> To be honest, I am not sure the code is correct. A read to those registers 
> should tell you the list of interrupts active. As we always return 0, this 
> will not return the correct state of the GIC.
> 
> I know that returning the list of actives interrupts is complicated with the 
> old vGIC, but I don't think silently ignoring it is a good idea.
> 
> The question here is why the guest accessed those registers? What is it 
> trying to figure out?
> 
> [Peng Fan] I am running Linux 5.4 kernel dom0, gic_peek_irq triggers abort.
> 
> 
> 
> Juergen, I think this fix should be in the release (and also
> backported to stable trees.)
> 
> Without an understanding of the problem, I disagree with this request (see 
> above).
> 
> As an aside, the range ISPENDR  has the same issue.
> 
> [Peng Fan] Should I include this change in v2? Or develop new method to fix 
> the issue?
> But at least dom0 abort when boot.

Also considering Andre's reply, yes, please send another patch to fix
for ISPENDR too. It doesn't have to be the same patch.

Thank you!
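
For readers following the range fix, a minimal sketch of why ending the range at GICD_ISACTIVER leaves most of the register bank uncovered. It assumes that VRANGE32(start, end) expands to the inclusive span start ... end + 3, and it uses the GICv3 distributor offsets 0x300/0x37C for ISACTIVER and 0x200/0x27C for ISPENDR. in_vrange32() is a hypothetical stand-in written for this example, not a Xen function.

```c
#include <assert.h>
#include <stdint.h>

/* Distributor register offsets (assumed from the GICv3 register map). */
#define GICD_ISPENDR    0x200u
#define GICD_ISPENDRN   0x27Cu
#define GICD_ISACTIVER  0x300u
#define GICD_ISACTIVERN 0x37Cu

/*
 * Hypothetical equivalent of a VRANGE32() case label: true when offset
 * lies in [start, end + 3], i.e. up to the last byte of the last
 * 32-bit register in the range.
 */
int in_vrange32(uint32_t offset, uint32_t start, uint32_t end)
{
    assert(start <= end);
    return offset >= start && offset <= end + 3;
}
```

With the old bounds, in_vrange32(0x304, GICD_ISACTIVER, GICD_ISACTIVER) is false, so a read of the second ISACTIVER register falls through to whatever handles unknown offsets. With GICD_ISACTIVERN as the end it is covered. The same check applies to the ISPENDR range mentioned above.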
 
 
> 
> 
> 
> > ---
> >  xen/arch/arm/vgic-v3.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> > index 422b94f902..e802f2055a 100644
> > --- a/xen/arch/arm/vgic-v3.c
> > +++ b/xen/arch/arm/vgic-v3.c
> > @@ -706,7 +706,7 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v,
> >          goto read_as_zero;
> >  
> >      /* Read the active status of an IRQ via GICD/GICR is not supported */
> > -    case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVER):
> > +    case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN):
> >      case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN):
> >          goto read_as_zero;
> >  
> > -- 
> > 2.16.4
> > 
> 
> ___
> Xen-devel mailing list
> mailto:Xen-devel@lists.xenproject.org
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.xenproject.org%2Fmailman%2Flistinfo%2Fxen-devel=02%7C01%7Cpeng.fan%40nxp.com%7C33f2e907cdc84ed0a48608d7649d359e%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637088498678782239=G3FA2vefr56FeUX5QVZQwSzG22nfv1m%2F0fKIDOnfuFQ%3D=0
> ___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke test] 144053: tolerable all pass - PUSHED

2019-11-12 Thread osstest service owner
flight 144053 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144053/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  aaef3d904bbbde1fcf9c07943878bd2aa64cc2bc
baseline version:
 xen  3683290fc0b0d6500392db733811cc78bcb35eab

Last test of basis   144044  2019-11-12 11:00:54 Z    0 days
Testing same since   144053  2019-11-12 15:01:17 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   3683290fc0..aaef3d904b  aaef3d904bbbde1fcf9c07943878bd2aa64cc2bc -> smoke

Re: [Xen-devel] [PATCH 1/3] x86/boot: Remove cached CPUID data from the trampoline

2019-11-12 Thread Jan Beulich
On 12.11.2019 17:09, Andrew Cooper wrote:
> On 04/11/2019 15:31, Jan Beulich wrote:
>> On 04.11.2019 16:22, Andrew Cooper wrote:
>>> On 04/11/2019 15:03, Jan Beulich wrote:
>>>> On 04.11.2019 15:59, Andrew Cooper wrote:
>>>>> On 04/11/2019 13:25, Jan Beulich wrote:
>>>>>> On 01.11.2019 21:25, Andrew Cooper wrote:
>>>>>>> --- a/xen/arch/x86/cpu/intel.c
>>>>>>> +++ b/xen/arch/x86/cpu/intel.c
>>>>>>> @@ -270,6 +270,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>>>>>>> if (disable) {
>>>>>>> wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable & ~disable);
>>>>>>> bootsym(trampoline_misc_enable_off) |= disable;
>>>>>>> +   bootsym(trampoline_efer) |= EFER_NX;
>>>>>>> }
>>>>>> I'm fine with all other changes here, just this one concerns me:
>>>>>> Before your change we latch a value into trampoline_misc_enable_off
>>>>>> just to be used for subsequent IA32_MISC_ENABLE writes we do. This
>>>>>> means that, on a hypervisor (like Xen itself) simply discarding
>>>>>> guest writes to the MSR (which is "fine" with the described usage
>>>>>> of ours up to now), with your change we'd now end up trying to set
>>>>>> EFER.NX when the feature may not actually be enabled in
>>>>>> IA32_MISC_ENABLE. Wouldn't such an EFER write be liable to #GP(0)?
>>>>>> I.e. don't we need to read back the MSR value here, and verify
>>>>>> we actually managed to clear the bit before actually OR-ing in
>>>>>> EFER_NX?
>>>>> If this is a problem in practice, execution won't continue beyond the
>>>>> next if() condition just out of context, which set EFER.NX on the BSP
>>>>> with an unguarded WRMSR.
>>>> And how is this any good? This kind of crash is exactly what I'm
>>>> asking to avoid.
>>> What is the point of working around a theoretical edge case of broken
>>> nesting under Xen which demonstrably doesn't exist in practice?
>> Well, so far nothing was said about this not being an actual problem.
> 
> It's not an actual problem.  If it were, we would have had crash reports.
> 
>> I simply don't know whether hardware would refuse such an EFER write.
> 
> I've just experimented - writing EFER.NX takes a #GP fault when
> MISC_ENABLE.XD is set.
> 
>> If it does, it would be appropriate for hypervisors to also refuse
>> it. I.e. if we don't do so right now, correcting the behavior would
>> trip the code here.
> 
> MISC_ENABLES.XD is architectural on any Intel system which enumerates
> NX, and if the bit is set, it can be cleared.  (Although the semantics
> described in the SDM are broken.  It is only available if NX is
> enumerated, which is obfuscated by setting XD).
> 
> However, no hypervisor is going to bother virtualising this
> functionality.  Either configure the VM with NX or without.  (KVM for
> example doesn't virtualise MISC_ENABLES at all.)

I'm sorry, but I still don't follow: You say "if the bit is set, it
can be cleared", which is clearly not in line with our current guest
MSR write handling. It just so happens that we have no command line
option allowing to suppress the clearing of XD. If we had, according
to your findings above we'd run into a #GP upon trying to set NX.
How can you easily exclude another hypervisor actually doing so (and
nobody having run into the issue simply because the option is rarely
used)?

Btw - all would be fine if the code in question was guarded by an
NX feature check, but as you say that's not possible because XD set
forces NX clear. However, our setting of EFER.NX could be guarded
this way, as we _expect_ XD to be clear at that point.
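
The read-back idea discussed in this thread can be sketched roughly as follows. This is a toy model, not Xen code: struct fake_cpu, fake_wrmsr_misc_enable() and efer_bits_for() are hypothetical, and the only architectural facts assumed are the bit positions (XD Bit Disable is bit 34 of IA32_MISC_ENABLE, NXE is bit 11 of EFER). It models only the XD aspect and ignores the CPUID side.

```c
#include <assert.h>
#include <stdint.h>

#define MISC_ENABLE_XD_DISABLE (1ULL << 34)  /* IA32_MISC_ENABLE.XD */
#define EFER_NX                (1ULL << 11)  /* EFER.NXE */

/*
 * Toy model of the relevant MSR state. discard_writes mimics a
 * hypervisor that silently drops guest writes to IA32_MISC_ENABLE.
 */
struct fake_cpu {
    uint64_t misc_enable;
    int discard_writes;
};

void fake_wrmsr_misc_enable(struct fake_cpu *c, uint64_t val)
{
    assert(c);
    if (!c->discard_writes)
        c->misc_enable = val;
}

/*
 * Try to clear XD, then read the value back. EFER_NX is only returned
 * when XD is really clear afterwards, so a later EFER write would not
 * be attempted while XD is still set.
 */
uint64_t efer_bits_for(uint64_t misc_enable, int discard_writes)
{
    struct fake_cpu c = { misc_enable, discard_writes };

    if (c.misc_enable & MISC_ENABLE_XD_DISABLE) {
        fake_wrmsr_misc_enable(&c, c.misc_enable & ~MISC_ENABLE_XD_DISABLE);
        if (c.misc_enable & MISC_ENABLE_XD_DISABLE)
            return 0;   /* write was dropped: leave NX alone */
    }
    return EFER_NX;
}
```

In this model efer_bits_for(MISC_ENABLE_XD_DISABLE, 1) yields 0, which is the case where an unguarded EFER.NX write would fault according to the experiment quoted above.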

Jan

Re: [Xen-devel] [PATCH] Introduce a description of a new optional tag for Backports

2019-11-12 Thread Stefano Stabellini
On Tue, 12 Nov 2019, Ian Jackson wrote:
> Anthony PERARD writes ("Re: [Xen-devel] [PATCH] Introduce a description of a 
> new optional tag for Backports"):
> > Should we describe the Fixes: tag as well? These would have a similar
> > purpose to the backport tag, I mean it could help figure out which
> > commit to backport to which tree.
> 
> Good point.

Yes, good idea.


Lars, I think we are already in agreement.

You can find the description of "Fixes" here in Linux
Documentation/process/submitting-patches.rst.

Re: [Xen-devel] [BUGFIX PATCH for-4.13] sched: fix dom0less boot with the null scheduler

2019-11-12 Thread George Dunlap


> On Nov 6, 2019, at 3:58 PM, Dario Faggioli  wrote:
> 
> In a dom0less configuration, if the null scheduler is used, the system
> may fail to boot, because the loop in null_unit_wake() never exits.
> 
> Bisection showed that this behavior occurs since commit d545f1d6 ("xen:
> sched: deal with vCPUs being or becoming online or offline") but the
> real problem is that, in this case, pick_res() always return the same
> CPU.
> 
> Fix this by only deal with the simple case, i.e., the vCPU that is
> coming online can be assigned to a sched. resource right away, in
> null_unit_wake().
> 
> If it can't, just add it to the waitqueue, and we will deal with it in
> null_schedule(), being careful about not racing with vcpu_wake().
> 
> Reported-by: Stefano Stabellini 
> Signed-off-by: Dario Faggioli 
> Tested-by: Stefano Stabellini 

Reviewed-by: George Dunlap 

With one minor nit…

> + * and it's previous resource is free (and affinities match), we can just

its (no ‘).  I’ll change this on check-in.

 -George

Re: [Xen-devel] [PATCH v2 6/6] xen: add runtime parameter reading support to hypfs

2019-11-12 Thread Jürgen Groß

On 12.11.19 15:28, Jan Beulich wrote:

On 02.10.2019 13:20, Juergen Gross wrote:

Add support to read values of hypervisor runtime parameters via the
hypervisor file system for all unsigned integer type runtime parameters.


What about string ones (which you seem to handle in the code,
but see also there)?


Oh, right, this was a late addition.




@@ -320,6 +321,44 @@ int cmdline_strcmp(const char *frag, const char *name)
  }
  }
  
+static struct hypfs_dir hypfs_params = {

+.list = LIST_HEAD_INIT(hypfs_params.list),
+};
+
+static int __init runtime_param_hypfs_add(void)
+{
+const struct kernel_param *param;
+int ret;
+
+ret = hypfs_new_dir(&hypfs_root, "params", &hypfs_params);
+BUG_ON(ret);
+
+for ( param = __param_start; param < __param_end; param++ )
+{
+switch ( param->type )
+{
+case OPT_UINT:
+if ( param->len == sizeof(unsigned int) )
+ret = hypfs_new_entry_uint(&hypfs_params, param->name,
+   (unsigned int *)(param->par.var));


Stray pair of parentheses. I also don't see the need for the cast,
with the "var" union member being "void *".


Right, will drop the cast.




+break;
+
+case OPT_STR:
+ret = hypfs_new_entry_uint(&hypfs_params, param->name,
+   param->par.var);


hypfs_new_entry_string()?


Yes.


Juergen

Re: [Xen-devel] [PATCH v2 5/6] xen: add /buildinfo/config entry to hypervisor filesystem

2019-11-12 Thread Jürgen Groß

On 12.11.19 15:22, Jan Beulich wrote:

On 02.10.2019 13:20, Juergen Gross wrote:

Add the /buildinfo/config entry to the hypervisor filesystem. This
entry contains the .config file used to build the hypervisor.


I think this is the 2nd step ahead of the 1st: Much of the stuff
exposed as XENVER_* sub-ops should manifest itself here ahead of
exposing xen/.config.


Yes and no. This is meant as a replacement for my previous patch series
adding .config read support.

It is no problem to add other data as well, but the need for being able
to read .config contents was already agreed on.




@@ -79,3 +80,11 @@ subdir-$(CONFIG_UBSAN) += ubsan
  
  subdir-$(CONFIG_NEEDS_LIBELF) += libelf

  subdir-$(CONFIG_HAS_DEVICE_TREE) += libfdt
+
+config_data.c: ../.config
+   ( echo "char xen_config_data[] ="; \
+ ../tools/bin2c <$<; \
+ echo ";" ) > $@


This is the typical kind of construct that may break (a subsequent
build attempt) when interrupted in the middle. This pretty clearly
is a move-if-changed candidate, at the same time also avoiding a
(cheap, but anyway) pointless re-build in case .config was touched
without actually changing.


Okay.



Furthermore is there a reason to expose this as plain text, when
Linux exposes a gzip-ed version in /proc? The file isn't very
large now, but this was also the case for Linux many years ago.


gzip data may contain bytes with 0x00. Supporting that would require a
different interface at all levels.




--- a/xen/common/hypfs.c
+++ b/xen/common/hypfs.c
@@ -25,6 +25,10 @@ static struct hypfs_entry hypfs_root_entry = {
  .dir = &hypfs_root,
  };
  
+static struct hypfs_dir hypfs_buildinfo = {

+.list = LIST_HEAD_INIT(hypfs_buildinfo.list),
+};
+
  static int hypfs_add_entry(struct hypfs_dir *parent, struct hypfs_entry *new)
  {
  int ret = -ENOENT;
@@ -316,3 +320,16 @@ long do_hypfs_op(unsigned int cmd,
  
  return ret;

  }
+
+static int __init hypfs_init(void)
+{
+int ret;
+
+ret = hypfs_new_dir(&hypfs_root, "buildinfo", &hypfs_buildinfo);
+BUG_ON(ret);
+ret = hypfs_new_entry_string(&hypfs_buildinfo, "config", xen_config_data);
+BUG_ON(ret);
+
+return 0;
+}
+__initcall(hypfs_init);


Hmm, do you really want to centralize population of the file system
here, rather than having the individual components take care of it?


I can add a new source, e.g. common/buildinfo.c if you like that better.




--- a/xen/tools/Makefile
+++ b/xen/tools/Makefile
@@ -1,13 +1,18 @@
  
  include $(XEN_ROOT)/Config.mk
  
+PROGS = symbols bin2c

+
  .PHONY: default
  default:
-   $(MAKE) symbols
+   $(MAKE) $(PROGS)


Could I ask you to take the opportunity and do away with the
unnecessary (as it seems to me) make recursion? $(PROGS) could
easily become a dependency of "default" afaict.


Fine with me.


Juergen


Re: [Xen-devel] [OSSTEST PATCH 2/2] ts-libvirt-build: Do an out-of-tree build

2019-11-12 Thread Jim Fehlig
On 11/12/19 9:10 AM, Ian Jackson wrote:
> Hi.  Thanks for the information.
> 
> Jim Fehlig writes ("Re: [OSSTEST PATCH 2/2] ts-libvirt-build: Do an 
> out-of-tree build"):
>> I assumed libvirt's gradual move from autotools to meson would
>> affect OSSTEST, but later rather than sooner. Sorry for not
>> mentioning it earlier, but now you have been warned that libvirt is
>> moving to meson :-). Meson has a strict separation between source
>> and build directories and some preparatory patches were pushed that
>> force srcdir != builddir
>>
>> https://www.redhat.com/archives/libvir-list/2019-October/msg01681.html
> 
> I read this and some of it is a bit concerning.  Does all of this
>src: [stuff] generate source files into build directory
> mean that previously only in-tree builds were supported and that
> therefore there is no one set of build runes that will work both
> before and after these changes ?

VPATH builds were previously supported, as well as in-tree builds. But questions
around this work are probably best answered by the author. Adding Pavel to cc.

Pavel, for context, see Ian's OSSTEST patches to accommodate recent changes to 
libvirt's build system

https://lists.xenproject.org/archives/html/xen-devel/2019-11/msg00514.html

Regards,
Jim

[Xen-devel] [libvirt test] 144038: regressions - FAIL

2019-11-12 Thread osstest service owner
flight 144038 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144038/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-libvirt   6 libvirt-build    fail REGR. vs. 143023
 build-amd64-libvirt   6 libvirt-build    fail REGR. vs. 143023
 build-i386-libvirt    6 libvirt-build    fail REGR. vs. 143023
 build-armhf-libvirt   6 libvirt-build    fail REGR. vs. 143023

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  e39d3424e329308d9e02b6df774f706a007ffd30
baseline version:
 libvirt  2cff65e4c60ed7b3c0c6a97d526d1f8d52c0e919

Last test of basis   143023  2019-10-22 04:19:26 Z   21 days
Failing since        143051  2019-10-23 04:18:57 Z   20 days   17 attempts
Testing same since   144038  2019-11-12 04:18:47 Z    0 days    1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Andrew Jones 
  Daniel P. Berrangé 
  Daniel Veillard 
  Eric Blake 
  Jim Fehlig 
  John Ferlan 
  Ján Tomko 
  Laine Stump 
  Laine Stump 
  Maya Rashish 
  Michal Privoznik 
  Pavel Hrdina 
  Peter Krempa 
  Wang Yechao 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-arm64-libvirt  fail
 build-armhf-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvops    pass
 build-arm64-pvops    pass
 build-armhf-pvops    pass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   blocked 
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm    blocked 
 test-amd64-amd64-libvirt-xsm blocked 
 test-arm64-arm64-libvirt-xsm blocked 
 test-amd64-i386-libvirt-xsm  blocked 
 test-amd64-amd64-libvirt blocked 
 test-arm64-arm64-libvirt blocked 
 test-armhf-armhf-libvirt blocked 
 test-amd64-i386-libvirt  blocked 
 test-amd64-amd64-libvirt-pair    blocked 
 test-amd64-i386-libvirt-pair blocked 
 test-arm64-arm64-libvirt-qcow2   blocked 
 test-armhf-armhf-libvirt-raw blocked 
 test-amd64-amd64-libvirt-vhd blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at

Re: [Xen-devel] [PATCH] xen/sched: remove wrong assertions in csched2_free_pdata()

2019-11-12 Thread Jürgen Groß

On 12.11.19 16:52, George Dunlap wrote:




On Nov 8, 2019, at 7:38 AM, Juergen Gross  wrote:

The assertions in csched2_free_pdata() are wrong as in case it is
called by schedule_cpu_add() after a failure of sched_alloc_udata()
the init pdata function won't have been called.


I’m a bit confused by this, as the comment says that the ASSERT()s should be OK 
with that case; i.e., that they should check *either* that init_pdata hasn’t been 
called, or that deinit_pdata() has been called:


- * xfree() does not really mind, but we want to be sure that either
- * init_pdata has never been called, or deinit_pdata has been called
- * already.


So which of the following conditions will fail if sched_alloc_udata() fails?  
It looks to me like they should both be fine.


You are right, this patch is not needed.

Sorry for the noise,


Juergen

Re: [Xen-devel] [OSSTEST PATCH 2/2] ts-libvirt-build: Do an out-of-tree build

2019-11-12 Thread Ian Jackson
Hi.  Thanks for the information.

Jim Fehlig writes ("Re: [OSSTEST PATCH 2/2] ts-libvirt-build: Do an out-of-tree 
build"):
> I assumed libvirt's gradual move from autotools to meson would
> affect OSSTEST, but later rather than sooner. Sorry for not
> mentioning it earlier, but now you have been warned that libvirt is
> moving to meson :-). Meson has a strict separation between source
> and build directories and some preparatory patches were pushed that
> force srcdir != builddir
> 
> https://www.redhat.com/archives/libvir-list/2019-October/msg01681.html

I read this and some of it is a bit concerning.  Does all of this
  src: [stuff] generate source files into build directory
mean that previously only in-tree builds were supported and that
therefore there is no one set of build runes that will work both
before and after these changes ?

Ian.

Re: [Xen-devel] [PATCH 1/3] x86/boot: Remove cached CPUID data from the trampoline

2019-11-12 Thread Andrew Cooper
On 04/11/2019 15:31, Jan Beulich wrote:
> On 04.11.2019 16:22, Andrew Cooper wrote:
>> On 04/11/2019 15:03, Jan Beulich wrote:
>>> On 04.11.2019 15:59, Andrew Cooper wrote:
>>>> On 04/11/2019 13:25, Jan Beulich wrote:
>>>>> On 01.11.2019 21:25, Andrew Cooper wrote:
>>>>>> --- a/xen/arch/x86/cpu/intel.c
>>>>>> +++ b/xen/arch/x86/cpu/intel.c
>>>>>> @@ -270,6 +270,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>>>>>>  if (disable) {
>>>>>>  wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable & ~disable);
>>>>>>  bootsym(trampoline_misc_enable_off) |= disable;
>>>>>> +bootsym(trampoline_efer) |= EFER_NX;
>>>>>>  }
>>>>> I'm fine with all other changes here, just this one concerns me:
>>>>> Before your change we latch a value into trampoline_misc_enable_off
>>>>> just to be used for subsequent IA32_MISC_ENABLE writes we do. This
>>>>> means that, on a hypervisor (like Xen itself) simply discarding
>>>>> guest writes to the MSR (which is "fine" with the described usage
>>>>> of ours up to now), with your change we'd now end up trying to set
>>>>> EFER.NX when the feature may not actually be enabled in
>>>>> IA32_MISC_ENABLE. Wouldn't such an EFER write be liable to #GP(0)?
>>>>> I.e. don't we need to read back the MSR value here, and verify
>>>>> we actually managed to clear the bit before actually OR-ing in
>>>>> EFER_NX?
>>>> If this is a problem in practice, execution won't continue beyond the
>>>> next if() condition just out of context, which set EFER.NX on the BSP
>>>> with an unguarded WRMSR.
>>> And how is this any good? This kind of crash is exactly what I'm
>>> asking to avoid.
>> What is the point of working around a theoretical edge case of broken
>> nesting under Xen which demonstrably doesn't exist in practice?
> Well, so far nothing was said about this not being an actual problem.

It's not an actual problem.  If it were, we would have had crash reports.

> I simply don't know whether hardware would refuse such an EFER write.

I've just experimented - writing EFER.NX takes a #GP fault when
MISC_ENABLE.XD is set.

> If it does, it would be appropriate for hypervisors to also refuse
> it. I.e. if we don't do so right now, correcting the behavior would
> trip the code here.

MISC_ENABLES.XD is architectural on any Intel system which enumerates
NX, and if the bit is set, it can be cleared.  (Although the semantics
described in the SDM are broken.  It is only available if NX is
enumerated, which is obfuscated by setting XD).

However, no hypervisor is going to bother virtualising this
functionality.  Either configure the VM with NX or without.  (KVM for
example doesn't virtualise MISC_ENABLES at all.)

There is one corner case on out-of-support versions of Xen (which don't
clear XD themselves) where XD would leak through and be ignored, after
which Xen will take a #GP fault trying to set EFER.NX, but I am still
firmly of the opinion that it is not worth putting in a workaround for
an obsolete issue which doesn't exist in practice.

~Andrew

Re: [Xen-devel] [PATCH v2 2/6] xen: add basic hypervisor filesystem support

2019-11-12 Thread Jürgen Groß

On 12.11.19 14:48, Jan Beulich wrote:

On 02.10.2019 13:20, Juergen Gross wrote:

--- /dev/null
+++ b/xen/common/hypfs.c
@@ -0,0 +1,318 @@
+/**
+ *
+ * hypfs.c
+ *
+ * Simple sysfs-like file system for the hypervisor.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static DEFINE_SPINLOCK(hypfs_lock);
+
+struct hypfs_dir hypfs_root = {
+.list = LIST_HEAD_INIT(hypfs_root.list),
+};
+
+static struct hypfs_entry hypfs_root_entry = {
+.type = hypfs_type_dir,
+.name = "",
+.list = LIST_HEAD_INIT(hypfs_root_entry.list),
+.parent = &hypfs_root,
+.dir = &hypfs_root,
+};


This looks to be used only in hypfs_get_entry(). Unless there are
plans to have further uses, it should be moved there.


Okay.



I'm also somewhat puzzled by "name" being an empty string; this
too would look less suspicious if this wasn't a file scope variable.


+static int hypfs_add_entry(struct hypfs_dir *parent, struct hypfs_entry *new)
+{
+int ret = -ENOENT;
+struct list_head *l;
+
+if ( !new->content )
+return -EINVAL;
+
+spin_lock(&hypfs_lock);
+
+list_for_each ( l, &parent->list )
+{
+struct hypfs_entry *e = list_entry(l, struct hypfs_entry, list);


const?


Hmm, is this true when I add a new entry to it? l is part of *e
after all.




+int cmp = strcmp(e->name, new->name);
+
+if ( cmp > 0 )
+{
+ret = 0;
+list_add_tail(&new->list, l);
+break;
+}
+if ( cmp == 0 )
+{
+ret = -EEXIST;
+break;
+}
+}
+
+if ( ret == -ENOENT )
+{
+ret = 0;
+list_add_tail(&new->list, &parent->list);
+}
+
+if ( !ret )
+{
+unsigned int sz = strlen(new->name) + 1;
+
+parent->content_size += sizeof(struct xen_hypfs_direntry) +
+ROUNDUP(sz, 4);


What is this literal 4 coming from? DYM alignof(struct xen_hypfs_direntry)?


Yes.




+new->parent = parent;
+}
+
+spin_unlock(&hypfs_lock);
+
+return ret;
+}
+
+int hypfs_new_entry_any(struct hypfs_dir *parent, const char *name,
+enum hypfs_entry_type type, void *content)


Perhaps drop the _any suffix?


Okay.




+{
+int ret;
+struct hypfs_entry *new;
+
+if ( strchr(name, '/') || !strcmp(name, ".") || !strcmp(name, "..") )
+return -EINVAL;
+
+new = xzalloc(struct hypfs_entry);
+if ( !new )
+return -ENOMEM;
+
+new->name = name;
+new->type = type;
+new->content = content;
+
+ret = hypfs_add_entry(parent, new);
+
+if ( ret )
+xfree(new);
+
+return ret;
+}
+
+int hypfs_new_entry_string(struct hypfs_dir *parent, const char *name,
+   char *val)


The last parameter here and below being non-const is because of the
intended write support?


Yes.




+{
+return hypfs_new_entry_any(parent, name, hypfs_type_string, val);
+}
+
+int hypfs_new_entry_uint(struct hypfs_dir *parent, const char *name,
+ unsigned int *val)
+{
+return hypfs_new_entry_any(parent, name, hypfs_type_uint, val);
+}
+
+int hypfs_new_dir(struct hypfs_dir *parent, const char *name,
+  struct hypfs_dir *dir)
+{
+if ( !dir )
+dir = xzalloc(struct hypfs_dir);
+
+return hypfs_new_entry_any(parent, name, hypfs_type_dir, dir);
+}
+
+static int hypfs_get_path_user(char *buf, XEN_GUEST_HANDLE_PARAM(void) uaddr,
+   unsigned long len)
+{
+if ( len > XEN_HYPFS_MAX_PATHLEN )
+return -EINVAL;
+
+if ( copy_from_guest(buf, uaddr, len) )
+return -EFAULT;
+
+buf[len - 1] = 0;


In the public interface description you have "including trailing zero
byte". I think instead of putting one there you should check there's
one.


Okay.
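
A minimal sketch of the suggested behaviour, checking for the trailing zero byte instead of writing one. It is not the actual follow-up patch: memcpy() stands in for copy_from_guest(), MAX_PATHLEN is a placeholder for XEN_HYPFS_MAX_PATHLEN with an assumed value, and get_path_checked() and check_path() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_PATHLEN 1024   /* placeholder; the real limit is not shown here */

/*
 * Copy len bytes of path into buf and insist that the caller supplied
 * the terminating zero byte itself. Returns 0 or -EINVAL.
 */
int get_path_checked(char *buf, const char *src, size_t len)
{
    assert(buf && src);
    if (len == 0 || len > MAX_PATHLEN)
        return -EINVAL;

    memcpy(buf, src, len);   /* stands in for copy_from_guest() */

    if (buf[len - 1] != '\0')
        return -EINVAL;

    return 0;
}

/* Convenience wrapper with its own scratch buffer, for experiments. */
int check_path(const char *src, size_t len)
{
    static char buf[MAX_PATHLEN];

    return get_path_checked(buf, src, len);
}
```

The len == 0 test is an addition of this sketch; it avoids indexing buf[len - 1] with a zero length.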




+return 0;
+}
+
+static struct hypfs_entry *hypfs_get_entry_rel(struct hypfs_entry *dir,
+   char *path)


const?


Yes.




+{
+char *slash;


const?


Okay.




+struct hypfs_entry *entry;
+struct list_head *l;
+unsigned int name_len;
+
+if ( *path == 0 )


Please either use !*path or be consistent with code a few lines
down and use '\0'.


Okay.




+return dir;
+
+if ( dir->type != hypfs_type_dir )
+return NULL;
+
+slash = strchr(path, '/');
+if ( !slash )
+slash = strchr(path, '\0');


With this better name the variable "end" or some such?


Fine with me.




+name_len = slash - path;
+
+list_for_each ( l, >dir->list )
+{
+int cmp;
+
+entry = list_entry(l, struct hypfs_entry, list);


Why not list_for_each_entry(), eliminating the need for the "l"
helper variable?


Ah, of course!
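
A self-contained sketch of the list_for_each_entry() form, using a cut-down list implementation so it builds outside the hypervisor. The demo_* names and the exact-match test are assumptions for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Iterate over the entries directly; no separate list_head cursor needed. */
#define list_for_each_entry(pos, head, member)                              \
    for ( (pos) = container_of((head)->next, __typeof__(*(pos)), member);   \
          &(pos)->member != (head);                                         \
          (pos) = container_of((pos)->member.next, __typeof__(*(pos)), member) )

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *item, struct list_head *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Hypothetical entry type; only the fields needed for the lookup. */
struct demo_entry { const char *name; struct list_head list; };

/* Look up a name of name_len bytes in a list sorted by name. */
static struct demo_entry *demo_find(struct list_head *head,
                                    const char *name, size_t name_len)
{
    struct demo_entry *entry;

    assert(head != NULL && name != NULL);

    list_for_each_entry ( entry, head, list )
    {
        int cmp = strncmp(name, entry->name, name_len);

        if ( cmp < 0 )
            return NULL;               /* sorted: already past the slot */
        if ( cmp > 0 )
            continue;
        if ( entry->name[name_len] == '\0' )
            return entry;              /* same length, so an exact match */
    }

    return NULL;
}
```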




+cmp = strncmp(path, entry->name, name_len);
+if ( cmp < 0 )
+return NULL;
+if ( cmp > 0 )
+continue;
+if ( 

Re: [Xen-devel] [OSSTEST PATCH 2/2] ts-libvirt-build: Do an out-of-tree build

2019-11-12 Thread Jim Fehlig
On 11/12/19 5:09 AM, Ian Jackson wrote:
> Recent versions of libvirt do not support in-tree builds (!)

I assumed libvirt's gradual move from autotools to meson would affect OSSTEST, 
but later rather than sooner. Sorry for not mentioning it earlier, but now you 
have been warned that libvirt is moving to meson :-). Meson has a strict 
separation between source and build directories and some preparatory patches 
were pushed that force srcdir != builddir

https://www.redhat.com/archives/libvir-list/2019-October/msg01681.html

Daniel posted a note about this change yesterday

https://www.redhat.com/archives/libvir-list/2019-November/msg00299.html

I didn't read libvirt mail yesterday otherwise I would have forwarded that to 
xen-devel. I need to be more proactive with libvirt changes that might affect 
OSSTEST...

Regards,
Jim

> 
> Cope with this by always building in a subdirectory `build' (a
> subdirectory of the source tree); this is the arrangement which the
> libvirt upstream messages and documentation now seem to recommend (at
> least where things have been updated).
> 
> I compared the differences in build output between the results of this
> branch and a previous passing xen-unstable flight.  The libvirt
> library version increased and a file
>usr/local/share/libvirt/cpu_map/arm_features.xml
> appeared.  I think this is just due to changes in the libvirt version,
> 2cff65e4c60e..70218e10bcde, in particular 0de541bfc575
>cpu_map: Ship arm_features.xml
> 
> I also tested that a test job, built with current libvirt and these
> osstest changes, passes as expected.
> 
> CC: Jim Fehlig 
> Signed-off-by: Ian Jackson 
> Tested-by: Ian Jackson 
> ---
>   ts-libvirt-build | 12 +++-
>   1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/ts-libvirt-build b/ts-libvirt-build
> index 2a363f43..e799f003 100755
> --- a/ts-libvirt-build
> +++ b/ts-libvirt-build
> @@ -58,11 +58,13 @@ sub config() {
>   my $gnulib = submodule_find($submodules, "gnulib");
>   target_cmd_build($ho, 3600, $builddir, <   cd libvirt
> + mkdir build
> + cd build
>   CFLAGS="-g -I$xenprefix/include/" \\
>   LDFLAGS="-g -L$xenprefix/lib/ -Wl,-rpath-link=$xenprefix/lib/" \\
>   PKG_CONFIG_PATH="$xenprefix/lib/pkgconfig/" \\
>   GNULIB_SRCDIR=$builddir/libvirt/$gnulib->{Path} \\
> -./autogen.sh --no-git \\
> +../autogen.sh --no-git \\
>--with-libxl --without-xen --without-xenapi 
> --without-selinux \\
>--without-lxc --without-vbox --without-uml \\
>--without-qemu --without-openvz --without-vmware \\
> @@ -72,9 +74,9 @@ END
>   
>   sub build() {
>   target_cmd_build($ho, 3600, $builddir, < -cd libvirt
> -(make $makeflags 2>&1 && touch ../build-ok-stamp) |tee ../log
> -test -f ../build-ok-stamp #/
> +cd libvirt/build
> +(make $makeflags 2>&1 && touch ../../build-ok-stamp) |tee ../log
> +test -f ../../build-ok-stamp #/
>   echo ok.
>   END
>   }
> @@ -82,7 +84,7 @@ END
>   sub install() {
>   target_cmd_build($ho, 300, $builddir, <   mkdir -p dist
> -cd libvirt
> +cd libvirt/build
>   make $makeflags install DESTDIR=$builddir/dist
>   mkdir -p $builddir/dist/etc/init.d
>   END
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] xen/sched: remove wrong assertions in csched2_free_pdata()

2019-11-12 Thread George Dunlap


> On Nov 8, 2019, at 7:38 AM, Juergen Gross  wrote:
> 
> The assertions in csched2_free_pdata() are wrong as in case it is
> called by schedule_cpu_add() after a failure of sched_alloc_udata()
> the init pdata function won't have been called.

I’m a bit confused by this, as the comment says that the ASSERT()s should be OK 
with that case; i.e., that they should check *either* that init_pdata hasn’t been 
called, or that deinit_pdata has been called:

> - * xfree() does not really mind, but we want to be sure that either
> - * init_pdata has never been called, or deinit_pdata has been called
> - * already.

So which of the following conditions will fail if sched_alloc_udata() fails?  
It looks to me like they should both be fine.

> -ASSERT(!pcpu || spc->runq_id == -1);
> -ASSERT(!cpumask_test_cpu(cpu, _priv(ops)->initialized));

 -George


Re: [Xen-devel] [OSSTEST PATCH 1/2] ts-libvirt-build: Provide PKG_CONFIG_PATH

2019-11-12 Thread Ian Jackson
Jim Fehlig writes ("Re: [OSSTEST PATCH 1/2] ts-libvirt-build: Provide 
PKG_CONFIG_PATH"):
> On 11/12/19 5:09 AM, Ian Jackson wrote:
> > +PKG_CONFIG_PATH="$xenprefix/lib/pkgconfig/" \\
> >   GNULIB_SRCDIR=$builddir/libvirt/$gnulib->{Path} \\
> >   ./autogen.sh --no-git \\
> >--with-libxl --without-xen --without-xenapi 
> > --without-selinux \\
> 
> Unrelated, but the legacy xen and xenapi drivers have been removed so the 
> --without-{xen,xenapi} options could be dropped.

Thanks for the review.  I think we should consider that for post Xen
freeze.  I did notice the warning messages but I thought leaving
--without-unknown-thing would be OK, and it did indeed build.

Thanks,
Ian.


Re: [Xen-devel] [RFC v5 000/126] error: auto propagated local_err

2019-11-12 Thread Vladimir Sementsov-Ogievskiy
12.11.2019 16:46, Cornelia Huck wrote:
> On Fri, 8 Nov 2019 22:57:25 +0400
> Marc-André Lureau  wrote:
> 
>> Hi
>>
>> On Fri, Nov 8, 2019 at 7:31 PM Vladimir Sementsov-Ogievskiy
>>  wrote:
>>>
>>> Finally, what is the plan?
>>>
>>> Markus what do you think?
>>>
>>> Now a lot of patches are reviewed, but a lot are not.
>>>
>>> Is there any hope that all patches will be reviewed? Should I resend the
>>> whole series, or maybe reduce it to reviewed subsystems only?
>>
>> I don't think we have well established rules for whole-tree cleanups
>> like this. In the past, several cleanup series got lost.
> 
> Yes, it is always problematic if a series touches a lot of different
> subsystems.
> 
>>
>> It will take ages to get every subsystem maintainer to review the
>> patches. Most likely, since they are quite systematic, there isn't
>> much to say and it is easy to miss something that has some hidden
>> ramifications. Perhaps whole-tree cleanups should require at least 2
>> reviewers to bypass the subsystem maintainer review? But my past
>> experience with this kind of exercise doesn't encourage me, and
>> probably I am not the only one.
> 
> It's not just the reviews; it's easy to miss compile problems on less
> mainstream architectures (and even easier to miss functional problems
> there, although they are probably less likely with automated rework.)
> CI can probably help, but that's something for the future.
> 
> Anyway, I've now gotten around to that series; spotted one problem in
> s390x code, I think.
> 
> One thing that's helpful for such a large series is a git branch that
> makes it easy to give the series a quick go. (You can use patchew, but
> it takes time before it gets all mails, so just pushing it somewhere
> and letting people know is a good idea anyway.)
> 

Thanks for review!

The series is posted here:

https://src.openvz.org/users/vsementsov/repos/qemu/browse

https://src.openvz.org/scm/~vsementsov/qemu.git #tag up-auto-local-err-v5


-- 
Best regards,
Vladimir

Re: [Xen-devel] [OSSTEST PATCH 1/2] ts-libvirt-build: Provide PKG_CONFIG_PATH

2019-11-12 Thread Jim Fehlig
On 11/12/19 5:09 AM, Ian Jackson wrote:
> In osstest we do not install the xen tree in /usr/local because the
> build environment is shared with many different build jobs which might
> be using different versions of Xen.  We put it in a job-specific
> directory in ~osstest on the build host, and set environment variables
> to ensure that it all gets picked up.
> 
> Recent versions of libvirt insist on finding xenlight.pc; otherwise
> they disable libxl support.  So we must add a PKG_CONFIG_PATH setting.

Sorry. There was a hack to work around a Fedora 28 bug, but now that Fedora 28 
is EOL, the hack has been removed:

https://libvirt.org/git/?p=libvirt.git;a=commit;h=18981877d2e20390a79d068861a24e716f8ee422

> (In all cases, contrary to the usual protocol for path-like variables,
> we do not append but instead simply set the variable.  This is OK
> because this is an osstest build script run via ssh to the build host,
> so the variables won't have been set already.)
> 
> CC: Jim Fehlig 
> Signed-off-by: Ian Jackson 
> ---
>   ts-libvirt-build | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/ts-libvirt-build b/ts-libvirt-build
> index bc08190a..2a363f43 100755
> --- a/ts-libvirt-build
> +++ b/ts-libvirt-build
> @@ -60,6 +60,7 @@ sub config() {
>   cd libvirt
>   CFLAGS="-g -I$xenprefix/include/" \\
>   LDFLAGS="-g -L$xenprefix/lib/ -Wl,-rpath-link=$xenprefix/lib/" \\
> +PKG_CONFIG_PATH="$xenprefix/lib/pkgconfig/" \\
>   GNULIB_SRCDIR=$builddir/libvirt/$gnulib->{Path} \\
>   ./autogen.sh --no-git \\
>--with-libxl --without-xen --without-xenapi 
> --without-selinux \\

Unrelated, but the legacy xen and xenapi drivers have been removed so the 
--without-{xen,xenapi} options could be dropped.

Regards,
Jim

Re: [Xen-devel] [XEN PATCH for-4.13] libxl: Fix libxl_retrieve_domain_configuration error path

2019-11-12 Thread Jürgen Groß

On 12.11.19 15:19, Anthony PERARD wrote:

From: Anthony PERARD 

If an error were to happen before the last step, for example the
domain_configuration is missing, the error wouldn't be check by the
_end callback.

Fix that, also initialise `lock' to NULL because the exit path checks
it.

The issue shows up when there's a stubdom, and running `xl list -l`
aborts. Instead, with this patch, `xl list -l` will not list stubdom,
probably like before.

Reported-by: Marek Marczykowski-Górecki 
Fixes: 61563419257ed40278938db2cce7d697aed44f5d
Signed-off-by: Anthony PERARD 


Release-acked-by: Juergen Gross 


Juergen


Re: [Xen-devel] [XEN PATCH for-4.13] libxl: Fix libxl_retrieve_domain_configuration error path

2019-11-12 Thread Wei Liu
On Tue, Nov 12, 2019 at 02:19:43PM +, Anthony PERARD wrote:
> From: Anthony PERARD 
> 
> If an error were to happen before the last step, for example the
> domain_configuration is missing, the error wouldn't be check by the

check -> checked

> _end callback.
> 
> Fix that, also initialise `lock' to NULL because the exit path checks
> it.
> 
> The issue shows up when there's a stubdom, and running `xl list -l`
> aborts. Instead, with this patch, `xl list -l` will not list stubdom,
> probably like before.
> 
> Reported-by: Marek Marczykowski-Górecki 
> Fixes: 61563419257ed40278938db2cce7d697aed44f5d
> Signed-off-by: Anthony PERARD 

Acked-by: Wei Liu 

I also had a look at the other callbacks. The _end one is the only one that
missed this early exit path.

Juergen, this should definitely be in 4.13 since it fixes a bug
introduced in this cycle.

Wei.


Re: [Xen-devel] [XEN PATCH for-4.13] libxl: Fix libxl_retrieve_domain_configuration error path

2019-11-12 Thread Marek Marczykowski-Górecki
On Tue, Nov 12, 2019 at 02:19:43PM +, Anthony PERARD wrote:
> From: Anthony PERARD 
> 
> If an error were to happen before the last step, for example the
> domain_configuration is missing, the error wouldn't be check by the
> _end callback.
> 
> Fix that, also initialise `lock' to NULL because the exit path checks
> it.
> 
> The issue shows up when there's a stubdom, and running `xl list -l`
> aborts. Instead, with this patch, `xl list -l` will not list stubdom,
> probably like before.
> 
> Reported-by: Marek Marczykowski-Górecki 
> Fixes: 61563419257ed40278938db2cce7d697aed44f5d
> Signed-off-by: Anthony PERARD 

With this patch applied, `xl list -l` no longer crashes and only prints
this error for a stubdomain:
libxl: error: libxl_domain.c:1937:retrieve_domain_configuration_lock_acquired: 
Domain 11:Fail to get domain configuration

The actual HVM is listed correctly. This was the previous behavior on
Xen 4.8 too.

Tested-by: Marek Marczykowski-Górecki 

Thanks!

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?



Re: [Xen-devel] [PATCH V2 1/2] x86/altp2m: Add hypercall to set a range of sve bits

2019-11-12 Thread Jan Beulich
On 12.11.2019 15:05, Tamas K Lengyel wrote:
> On Tue, Nov 12, 2019 at 4:54 AM Jan Beulich  wrote:
>> On 06.11.2019 16:35, Alexandru Stefan ISAILA wrote:
>>> +else
>>> +{
>>> +rc = p2m_set_suppress_ve_multi(d, _ve);
>>> +
>>> +if ( rc == -ERESTART )
>>> +if ( __copy_field_to_guest(guest_handle_cast(arg,
>>> +   xen_hvm_altp2m_op_t),
>>> +   , u.suppress_ve.opaque) )
>>> +rc = -EFAULT;
>>
>> If the operation is best effort, _some_ indication of failure should
>> still be handed back to the caller. Whether that's through the opaque
>> field or by some other means is secondary. If not via that field
>> (which would make the outer of the two if()-s disappear), please fold
>> the if()-s.
> 
> At least for mem_sharing_range_op we also do a best-effort and don't
> return an error for pages where it wasn't possible to share. So I
> don't think it's absolutely necessary to do that, especially if the
> caller can't do anything about those errors anyway.

mem-sharing is a little different in nature, isn't it? If you
can't share a page, both involved guests will continue to run
with their own instances. If you want to suppress #VE delivery
and it fails, behavior won't be transparently correct, as
there'll potentially be #VE when there should be none. Whether
that's benign to the guest very much depends on its handler.

Jan


Re: [Xen-devel] [PATCH v2 6/6] xen: add runtime parameter reading support to hypfs

2019-11-12 Thread Jan Beulich
On 02.10.2019 13:20, Juergen Gross wrote:
> Add support to read values of hypervisor runtime parameters via the
> hypervisor file system for all unsigned integer type runtime parameters.

What about string ones (which you seem to handle in the code,
but see also there)?

> @@ -320,6 +321,44 @@ int cmdline_strcmp(const char *frag, const char *name)
>  }
>  }
>  
> +static struct hypfs_dir hypfs_params = {
> +.list = LIST_HEAD_INIT(hypfs_params.list),
> +};
> +
> +static int __init runtime_param_hypfs_add(void)
> +{
> +const struct kernel_param *param;
> +int ret;
> +
> +ret = hypfs_new_dir(_root, "params", _params);
> +BUG_ON(ret);
> +
> +for ( param = __param_start; param < __param_end; param++ )
> +{
> +switch ( param->type )
> +{
> +case OPT_UINT:
> +if ( param->len == sizeof(unsigned int) )
> +ret = hypfs_new_entry_uint(_params, param->name,
> +   (unsigned int *)(param->par.var));

Stray pair of parentheses. I also don't see the need for the cast,
with the "var" union member being "void *".
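
To illustrate the point about the cast: in C a void * converts implicitly to any object pointer type, so the member can be passed as is. A minimal sketch with hypothetical stand-ins (demo_param, demo_new_entry_uint) for the real types and helpers:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the parameter descriptor; only "var" matters. */
struct demo_param {
    const char *name;
    union {
        void *var;
        int (*func)(const char *);
    } par;
};

/* Records the last registered pointer so the effect can be observed. */
static unsigned int *demo_registered;

static int demo_new_entry_uint(const char *name, unsigned int *val)
{
    assert(name != NULL);
    demo_registered = val;
    return 0;
}

static int demo_register(const struct demo_param *param)
{
    /* void * -> unsigned int * needs neither a cast nor extra parentheses. */
    return demo_new_entry_uint(param->name, param->par.var);
}
```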

> +break;
> +
> +case OPT_STR:
> +ret = hypfs_new_entry_uint(_params, param->name,
> +   param->par.var);

hypfs_new_entry_string()?

Jan


Re: [Xen-devel] [PATCH v2 5/6] xen: add /buildinfo/config entry to hypervisor filesystem

2019-11-12 Thread Jan Beulich
On 02.10.2019 13:20, Juergen Gross wrote:
> Add the /buildinfo/config entry to the hypervisor filesystem. This
> entry contains the .config file used to build the hypervisor.

I think this is the 2nd step ahead of the 1st: Much of the stuff
exposed as XENVER_* sub-ops should manifest itself here ahead of
exposing xen/.config.

> @@ -79,3 +80,11 @@ subdir-$(CONFIG_UBSAN) += ubsan
>  
>  subdir-$(CONFIG_NEEDS_LIBELF) += libelf
>  subdir-$(CONFIG_HAS_DEVICE_TREE) += libfdt
> +
> +config_data.c: ../.config
> + ( echo "char xen_config_data[] ="; \
> +   ../tools/bin2c <$<; \
> +   echo ";" ) > $@

This is the typical kind of construct that may break (a subsequent
build attempt) when interrupted in the middle. This pretty clearly
is a move-if-changed candidate, at the same time also avoiding a
(cheap, but anyway) pointless re-build in case .config was touched
without actually changing.

Furthermore is there a reason to expose this as plain text, when
Linux exposes a gzip-ed version in /proc? The file isn't very
large now, but this was also the case for Linux many years ago.

> --- a/xen/common/hypfs.c
> +++ b/xen/common/hypfs.c
> @@ -25,6 +25,10 @@ static struct hypfs_entry hypfs_root_entry = {
>  .dir = _root,
>  };
>  
> +static struct hypfs_dir hypfs_buildinfo = {
> +.list = LIST_HEAD_INIT(hypfs_buildinfo.list),
> +};
> +
>  static int hypfs_add_entry(struct hypfs_dir *parent, struct hypfs_entry *new)
>  {
>  int ret = -ENOENT;
> @@ -316,3 +320,16 @@ long do_hypfs_op(unsigned int cmd,
>  
>  return ret;
>  }
> +
> +static int __init hypfs_init(void)
> +{
> +int ret;
> +
> +ret = hypfs_new_dir(_root, "buildinfo", _buildinfo);
> +BUG_ON(ret);
> +ret = hypfs_new_entry_string(_buildinfo, "config", 
> xen_config_data);
> +BUG_ON(ret);
> +
> +return 0;
> +}
> +__initcall(hypfs_init);

Hmm, do you really want to centralize population of the file system
here, rather than having the individual components take care of it?

> --- a/xen/tools/Makefile
> +++ b/xen/tools/Makefile
> @@ -1,13 +1,18 @@
>  
>  include $(XEN_ROOT)/Config.mk
>  
> +PROGS = symbols bin2c
> +
>  .PHONY: default
>  default:
> - $(MAKE) symbols
> + $(MAKE) $(PROGS)

Could I ask you to take the opportunity and do away with the
unnecessary (as it seems to me) make recursion? $(PROGS) could
easily become a dependency of "default" afaict.

Jan


[Xen-devel] [XEN PATCH for-4.13] libxl: Fix libxl_retrieve_domain_configuration error path

2019-11-12 Thread Anthony PERARD
From: Anthony PERARD 

If an error were to happen before the last step, for example the
domain_configuration is missing, the error wouldn't be check by the
_end callback.

Fix that, also initialise `lock' to NULL because the exit path checks
it.

The issue shows up when there's a stubdom, and running `xl list -l`
aborts. Instead, with this patch, `xl list -l` will not list stubdom,
probably like before.

Reported-by: Marek Marczykowski-Górecki 
Fixes: 61563419257ed40278938db2cce7d697aed44f5d
Signed-off-by: Anthony PERARD 
---
 tools/libxl/libxl_domain.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index 9d0eb5aed11d..33f9d9eaa481 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1998,12 +1998,14 @@ static void 
retrieve_domain_configuration_end(libxl__egc *egc,
 retrieve_domain_configuration_state *rdcs, int rc)
 {
 STATE_AO_GC(rdcs->qmp.ao);
-libxl__domain_userdata_lock *lock;
+libxl__domain_userdata_lock *lock = NULL;
 
 /* Convenience aliases */
 libxl_domain_config *const d_config = rdcs->d_config;
 libxl_domid domid = rdcs->qmp.domid;
 
+if (rc) goto out;
+
 lock = libxl__lock_domain_userdata(gc, domid);
 if (!lock) {
 rc = ERROR_LOCK_FAIL;
-- 
Anthony PERARD



[Xen-devel] [xen-unstable-smoke test] 144044: tolerable all pass - PUSHED

2019-11-12 Thread osstest service owner
flight 144044 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144044/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt     13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      13 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      14 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl          13 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          14 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  3683290fc0b0d6500392db733811cc78bcb35eab
baseline version:
 xen  a458d3bd0d2585275c128556ec0cbd818c6a7b0d

Last test of basis   143542  2019-11-01 18:00:50 Z   10 days
Testing same since   144044  2019-11-12 11:00:54 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Anthony PERARD 
  Jan Beulich 
  Juergen Gross 
  Paul Durrant 
  Roger Pau Monné 
  Stewart Hildebrand 
  Wei Liu 

jobs:
 build-arm64-xsm                                              pass
 build-amd64                                                  pass
 build-armhf                                                  pass
 build-amd64-libvirt                                          pass
 test-armhf-armhf-xl                                          pass
 test-arm64-arm64-xl-xsm                                      pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                    pass
 test-amd64-amd64-libvirt                                     pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   a458d3bd0d..3683290fc0  3683290fc0b0d6500392db733811cc78bcb35eab -> smoke


Re: [Xen-devel] [PATCH V2 1/2] x86/altp2m: Add hypercall to set a range of sve bits

2019-11-12 Thread Tamas K Lengyel
On Tue, Nov 12, 2019 at 4:54 AM Jan Beulich  wrote:
>
> On 06.11.2019 16:35, Alexandru Stefan ISAILA wrote:
> > @@ -4681,7 +4682,7 @@ static int do_altp2m_op(
> >  break;
> >
> >  case HVMOP_altp2m_set_suppress_ve:
> > -if ( a.u.suppress_ve.pad1 || a.u.suppress_ve.pad2 )
> > +if ( a.u.suppress_ve.pad1 )
>
> Just because the field changes its name doesn't mean you can
> drop the check. You even add a new field not used (yet) by
> this sub-function, which then also would need checking here.
>
> > @@ -4693,8 +4694,23 @@ static int do_altp2m_op(
> >  }
> >  break;
> >
> > +case HVMOP_altp2m_set_suppress_ve_multi:
> > +if ( a.u.suppress_ve.pad1 || !a.u.suppress_ve.nr )
>
> A count of zero typically is taken as a no-op, not an error.
>
> > +rc = -EINVAL;
> > +else
> > +{
> > +rc = p2m_set_suppress_ve_multi(d, _ve);
> > +
> > +if ( rc == -ERESTART )
> > +if ( __copy_field_to_guest(guest_handle_cast(arg,
> > +   xen_hvm_altp2m_op_t),
> > +   , u.suppress_ve.opaque) )
> > +rc = -EFAULT;
>
> If the operation is best effort, _some_ indication of failure should
> still be handed back to the caller. Whether that's through the opaque
> field or by some other means is secondary. If not via that field
> (which would make the outer of the two if()-s disappear), please fold
> the if()-s.

At least for mem_sharing_range_op we also do a best-effort and don't
return an error for pages where it wasn't possible to share. So I
don't think it's absolutely necessary to do that, especially if the
caller can't do anything about those errors anyway.

Tamas


Re: [Xen-devel] [RFC v5 026/126] python: add commit-per-subsystem.py

2019-11-12 Thread Cornelia Huck
On Fri, 11 Oct 2019 19:04:12 +0300
Vladimir Sementsov-Ogievskiy  wrote:

> Add script to automatically commit tree-wide changes per-subsystem.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---

I think this still needs some notes as to the supposed usage.



Re: [Xen-devel] [RFC v5 000/126] error: auto propagated local_err

2019-11-12 Thread Cornelia Huck
On Fri, 8 Nov 2019 22:57:25 +0400
Marc-André Lureau  wrote:

> Hi
> 
> On Fri, Nov 8, 2019 at 7:31 PM Vladimir Sementsov-Ogievskiy
>  wrote:
> >
> > Finally, what is the plan?
> >
> > Markus what do you think?
> >
> > Now a lot of patches are reviewed, but a lot are not.
> >
> > Is there any hope that all patches will be reviewed? Should I resend the
> > whole series, or maybe reduce it to reviewed subsystems only?
> 
> I don't think we have well established rules for whole-tree cleanups
> like this. In the past, several cleanup series got lost.

Yes, it is always problematic if a series touches a lot of different
subsystems.

> 
> It will take ages to get every subsystem maintainer to review the
> patches. Most likely, since they are quite systematic, there isn't
> much to say and it is easy to miss something that has some hidden
> ramifications. Perhaps whole-tree cleanups should require at least 2
> reviewers to bypass the subsystem maintainer review? But my past
> experience with this kind of exercise doesn't encourage me, and
> probably I am not the only one.

It's not just the reviews; it's easy to miss compile problems on less
mainstream architectures (and even easier to miss functional problems
there, although they are probably less likely with automated rework.)
CI can probably help, but that's something for the future.

Anyway, I've now gotten around to that series; spotted one problem in
s390x code, I think.

One thing that's helpful for such a large series is a git branch that
makes it easy to give the series a quick go. (You can use patchew, but
it takes time before it gets all mails, so just pushing it somewhere
and letting people know is a good idea anyway.)



Re: [Xen-devel] [PATCH for-4.13] xen: Drop bogus BOOLEAN definitions, TRUE and FALSE

2019-11-12 Thread Jan Beulich
On 12.11.2019 14:39, Andrew Cooper wrote:
> On 12/11/2019 08:35, Jan Beulich wrote:
>> On 11.11.2019 21:24, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/x86_64/mm.c
>>> +++ b/xen/arch/x86/x86_64/mm.c
>>> @@ -1077,7 +1077,7 @@ long do_set_segment_base(unsigned int which, unsigned 
>>> long base)
>>>  }
>>>  
>>>  
>>> -/* Returns TRUE if given descriptor is valid for GDT or LDT. */
>>> +/* Returns true if given descriptor is valid for GDT or LDT. */
>>>  int check_descriptor(const struct domain *dom, seg_desc_t *d)
>> Wouldn't changes like this one better be accompanied by also adjusting
>> the return type of the function (there are more examples further down
>> in common/timer.c)?
> 
> No.  That is an unrelated change.
> 
> If I were flush with free time then I might consider doing this and
> substantially increase the test burden.
> 
> As it stands, this request is scope creep.

The other alternative would have been to ask for scope reduction,
i.e. leave alone such comments (to avoid the resulting visual
disconnect between comment and actual data type). Anyway - it was
just a question I wanted to raise, not a request for further work
on your part.

>>> --- a/xen/include/asm-arm/arm64/efibind.h
>>> +++ b/xen/include/asm-arm/arm64/efibind.h
>>> @@ -107,7 +107,7 @@ typedef uint64_t   UINTN;
>>>  #define POST_CODE(_Data)
>>>  
>>>  
>>> -#define BREAKPOINT()while (TRUE);// Make it hang on Bios[Dbg]32
>>> +#define BREAKPOINT()while (true);// Make it hang on Bios[Dbg]32
>> You do realize that this and other EFI headers (and perhaps also
>> ACPI ones) are largely verbatim imports from other projects,
>> updating of which will become less straightforward by such
>> replacements? When pulling in the EFI ones I intentionally did not
>> fiddle with them more than absolutely necessary.
> 
> Yes, and?
> 
> It is unacceptable for the acpi headers to forcibly redefine anything in
> their scope, and its definition of va_args is downright dangerous.
> 
> All junk like this in header files does nothing but waste space and
> compiler effort during compilation, and leave people with a slim chance
> of shooting themselves in the foot.

Well, on one hand I'm with you. But then I dare to guess that the
people having written the headers the way they are also aren't
completely un-knowledgeable, i.e. did so for a reason. This seems
(I'm sorry to say it this bluntly) once again a case where you
appear to not be willing to accept other thinking than your own.
It is therefore one thing to get rid of TRUE/FALSE _outside_ of
such headers (where it would better never have been introduced),
and another to modify these more or less verbatim imported headers
themselves.

> How many times do these get touched?  (Rhetorical question.  The answer
> is once (me, clang build fix) since their introduction, 8, 9 and 10
> years ago).
> 
> For the 30s of effort required to tweak once-in-a-blue-moon patches
> which touch these headers, trimming the junk is a no-brainer.

Well, I agree that for just _this_ change it's not a big deal.
But any such approach doesn't scale: What we allow ourselves to do
once we may then easily allow ourselves to do another time, and
then dozens more times. Once that has happened, the effort needed
to do a re-sync may become non-negligible.

Bottom line - I'm half convinced and willing to give my ack, but
I'm not convinced you truly thought through the longer term
consequences. I'd therefore be far happier to see this patch
split into a non-controversial part (anything that's not tied to
the ACPI and EFI header imports), an ACPI, and an EFI part.

Jan


[Xen-devel] [PATCH v6 1/3] x86/boot: Introduce the kernel_info

2019-11-12 Thread Daniel Kiper
The relationships between the headers are analogous to the various data
sections:

  setup_header = .data
  boot_params/setup_data = .bss

What is missing from the above list? That's right:

  kernel_info = .rodata

We have been (ab)using .data for things that could go into .rodata or .bss for
a long time, for lack of alternatives and -- especially early on -- inertia.
Also, the BIOS stub is responsible for creating boot_params, so it isn't
available to a BIOS-based loader (setup_data is, though).

setup_header is permanently limited to 144 bytes due to the reach of the
2-byte jump field, which doubles as a length field for the structure, combined
with the size of the "hole" in struct boot_params that a protected-mode loader
or the BIOS stub has to copy it into. It is currently 119 bytes long, which
leaves us with 25 very precious bytes. This isn't something that can be fixed
without revising the boot protocol entirely, breaking backwards compatibility.

boot_params proper is limited to 4096 bytes, but can be arbitrarily extended
by adding setup_data entries. It cannot be used to communicate properties of
the kernel image, because it is .bss and has no image-provided content.

kernel_info solves this by providing an extensible place for information about
the kernel image. It is readonly, because the kernel cannot rely on a
bootloader copying its contents anywhere, but that is OK; if it becomes
necessary it can still contain data items that an enabled bootloader would be
expected to copy into a setup_data chunk.
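
For illustration only, here is a minimal sketch of how a bootloader might
validate the fixed part of kernel_info. The magic value "LToP" (0x506f544c)
and the offsets of size_total (0x0008) and setup_type_max (0x000c) come from
this series; placing the size field at 0x0004, and the names
parse_kernel_info(), le32() and struct kernel_info_fixed, are assumptions
made for the example and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KERNEL_INFO_MAGIC 0x506f544cu /* "LToP" read as a little-endian u32 */

/* Fixed-size part of kernel_info; layout assumed from the series. */
struct kernel_info_fixed {
    uint32_t header;         /* 0x0000: magic */
    uint32_t size;           /* 0x0004: size of the fixed part (assumed) */
    uint32_t size_total;     /* 0x0008: fixed part plus variable data */
    uint32_t setup_type_max; /* 0x000c: only present if size >= 16 */
};

/* Read a little-endian 32-bit value regardless of host endianness. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns 0 and fills *out on success, -1 otherwise. */
static int parse_kernel_info(const uint8_t *buf, size_t avail,
                             struct kernel_info_fixed *out)
{
    assert(buf != NULL && out != NULL);

    if (avail < 12)
        return -1;

    out->header = le32(buf);
    out->size = le32(buf + 4);
    out->size_total = le32(buf + 8);
    out->setup_type_max = 0;

    if (out->header != KERNEL_INFO_MAGIC)
        return -1;
    if (out->size < 12 || out->size > out->size_total ||
        out->size_total > avail)
        return -1;

    /* An image carrying only the first patch has no setup_type_max. */
    if (out->size >= 16)
        out->setup_type_max = le32(buf + 12);

    return 0;
}
```

The sketch only reads from the image and never writes to it, in line with
kernel_info being read-only data.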

Do not bump setup_header version in arch/x86/boot/header.S because it
will be followed by additional changes coming into the Linux/x86 boot
protocol.

Suggested-by: H. Peter Anvin (Intel) 
Signed-off-by: Daniel Kiper 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Ross Philipson 
Reviewed-by: H. Peter Anvin (Intel) 
---
v6 - suggestions/fixes:
   - drop "This patch" from the commit message
 (suggested by Borislav Petkov).

v4 - suggestions/fixes:
   - improve the documentation
 (suggested by Randy Dunlap and Konrad Rzeszutek Wilk).

v3 - suggestions/fixes:
   - split kernel_info data into fixed and variable sized regions,
 (suggested by H. Peter Anvin),
   - change kernel_info.header value to "LToP" (0x506f544c),
 (suggested by H. Peter Anvin),
   - improve the comments,
   - improve the documentation.

v2 - suggestions/fixes:
   - rename setup_header2 to kernel_info,
 (suggested by H. Peter Anvin),
   - change kernel_info.header value to "InfO" (0x4f666e49),
   - new kernel_info description in Documentation/x86/boot.rst,
 (suggested by H. Peter Anvin),
   - drop kernel_info_offset_update() as an overkill and
 update kernel_info offset directly from main(),
 (suggested by Eric Snowberg),
   - new commit message
 (suggested by H. Peter Anvin),
   - fix some commit message misspellings
 (suggested by Eric Snowberg).
---
 Documentation/x86/boot.rst | 126 +
 arch/x86/boot/Makefile |   2 +-
 arch/x86/boot/compressed/Makefile  |   4 +-
 arch/x86/boot/compressed/kernel_info.S |  17 +
 arch/x86/boot/header.S |   1 +
 arch/x86/boot/tools/build.c|   5 ++
 arch/x86/include/uapi/asm/bootparam.h  |   1 +
 7 files changed, 153 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/boot/compressed/kernel_info.S

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index 08a2f100c0e6..c60fafda9427 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -68,8 +68,25 @@ Protocol 2.12 (Kernel 3.8) Added the xloadflags field and extension fields
 Protocol 2.13  (Kernel 3.14) Support 32- and 64-bit flags being set in
xloadflags to support booting a 64-bit kernel from 32-bit
EFI
+
+Protocol 2.14: BURNT BY INCORRECT COMMIT ae7e1238e68f2a472a125673ab506d49158c1889
+   (x86/boot: Add ACPI RSDP address to setup_header)
+   DO NOT USE!!! ASSUME SAME AS 2.13.
+
+Protocol 2.15: (Kernel 5.5) Added the kernel_info.
 =  
 
+.. note::
+ The protocol version number should be changed only if the setup header
+ is changed. There is no need to update the version number if boot_params
+ or kernel_info are changed. Additionally, it is recommended to use
+ xloadflags (in this case the protocol version number should not be
+ updated either) or kernel_info to communicate supported Linux kernel
+ features to the boot loader. Due to very limited space available in
+ the original setup header every update to it should be considered
+ with great care. Starting from the protocol 2.15 the primary way to
+ communicate things to the boot loader is the kernel_info.
+
 
 Memory Layout
 =
@@ -207,6 +224,7 @@ Offset/Size Proto   NameMeaning
 0258/8 2.10+  

[Xen-devel] [PATCH v6 2/3] x86/boot: Introduce the kernel_info.setup_type_max

2019-11-12 Thread Daniel Kiper
This field contains the maximal allowed type for setup_data.

Do not bump setup_header version in arch/x86/boot/header.S because it
will be followed by additional changes coming into the Linux/x86 boot
protocol.

Suggested-by: H. Peter Anvin (Intel) 
Signed-off-by: Daniel Kiper 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Ross Philipson 
Reviewed-by: H. Peter Anvin (Intel) 
---
v6 - suggestions/fixes:
   - fix setup_type_max offset in Documentation/x86/boot.rst
 (suggested by Borislav Petkov),
   - drop "This patch" from the commit message
 (suggested by Borislav Petkov).

v5 - suggestions/fixes:
   - move incorrect references to the setup_indirect to the
 patch introducing it,
   - do not bump setup_header version in arch/x86/boot/header.S
 (suggested by H. Peter Anvin).
---
 Documentation/x86/boot.rst | 9 -
 arch/x86/boot/compressed/kernel_info.S | 5 +
 arch/x86/include/uapi/asm/bootparam.h  | 3 +++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index c60fafda9427..6cdd767c3835 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -73,7 +73,7 @@ Protocol 2.14:BURNT BY INCORRECT COMMIT ae7e1238e68f2a472a125673ab506d49158c188
(x86/boot: Add ACPI RSDP address to setup_header)
DO NOT USE!!! ASSUME SAME AS 2.13.
 
-Protocol 2.15: (Kernel 5.5) Added the kernel_info.
+Protocol 2.15: (Kernel 5.5) Added the kernel_info and kernel_info.setup_type_max.
 =  
 
 .. note::
@@ -981,6 +981,13 @@ Offset/size:   0x0008/4
   This field contains the size of the kernel_info including kernel_info.header
   and kernel_info.kernel_info_var_len_data.
 
+   ==
+Field name:setup_type_max
+Offset/size:   0x000c/4
+   ==
+
+  This field contains maximal allowed type for setup_data.
+
 
 The Image Checksum
 ==
diff --git a/arch/x86/boot/compressed/kernel_info.S b/arch/x86/boot/compressed/kernel_info.S
index 8ea6f6e3feef..018dacbd753e 100644
--- a/arch/x86/boot/compressed/kernel_info.S
+++ b/arch/x86/boot/compressed/kernel_info.S
@@ -1,5 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
+#include 
+
.section ".rodata.kernel_info", "a"
 
.global kernel_info
@@ -12,6 +14,9 @@ kernel_info:
/* Size total. */
.long   kernel_info_end - kernel_info
 
+   /* Maximal allowed type for setup_data. */
+   .long   SETUP_TYPE_MAX
+
 kernel_info_var_len_data:
/* Empty for time being... */
 kernel_info_end:
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index a1ebcd7a991c..dbb41128e5a0 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -11,6 +11,9 @@
 #define SETUP_APPLE_PROPERTIES 5
 #define SETUP_JAILHOUSE6
 
+/* max(SETUP_*) */
+#define SETUP_TYPE_MAX SETUP_JAILHOUSE
+
 /* ram_size flags */
 #define RAMDISK_IMAGE_START_MASK   0x07FF
 #define RAMDISK_PROMPT_FLAG0x8000
-- 
2.11.0



Re: [Xen-devel] [PATCH v2 2/6] xen: add basic hypervisor filesystem support

2019-11-12 Thread Jan Beulich
On 02.10.2019 13:20, Juergen Gross wrote:
> --- /dev/null
> +++ b/xen/common/hypfs.c
> @@ -0,0 +1,318 @@
> +/**
> + *
> + * hypfs.c
> + *
> + * Simple sysfs-like file system for the hypervisor.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +static DEFINE_SPINLOCK(hypfs_lock);
> +
> +struct hypfs_dir hypfs_root = {
> +.list = LIST_HEAD_INIT(hypfs_root.list),
> +};
> +
> +static struct hypfs_entry hypfs_root_entry = {
> +.type = hypfs_type_dir,
> +.name = "",
> +.list = LIST_HEAD_INIT(hypfs_root_entry.list),
> +.parent = _root,
> +.dir = _root,
> +};

This looks to be used only in hypfs_get_entry(). Unless there are
plans to have further uses, it should be moved there.

I'm also somewhat puzzled by "name" being an empty string; this
too would look less suspicious if this wasn't a file scope variable.

> +static int hypfs_add_entry(struct hypfs_dir *parent, struct hypfs_entry *new)
> +{
> +int ret = -ENOENT;
> +struct list_head *l;
> +
> +if ( !new->content )
> +return -EINVAL;
> +
> +spin_lock(&hypfs_lock);
> +
> +list_for_each ( l, >list )
> +{
> +struct hypfs_entry *e = list_entry(l, struct hypfs_entry, list);

const?

> +int cmp = strcmp(e->name, new->name);
> +
> +if ( cmp > 0 )
> +{
> +ret = 0;
> +list_add_tail(>list, l);
> +break;
> +}
> +if ( cmp == 0 )
> +{
> +ret = -EEXIST;
> +break;
> +}
> +}
> +
> +if ( ret == -ENOENT )
> +{
> +ret = 0;
> +list_add_tail(>list, >list);
> +}
> +
> +if ( !ret )
> +{
> +unsigned int sz = strlen(new->name) + 1;
> +
> +parent->content_size += sizeof(struct xen_hypfs_direntry) +
> +ROUNDUP(sz, 4);

What is this literal 4 coming from? DYM alignof(struct xen_hypfs_direntry)?
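
A minimal sketch of the alternative hinted at above: derive the rounding
granule from the entry type instead of hard-coding 4. The struct below is a
hypothetical stand-in, since the real struct xen_hypfs_direntry is not shown
in this mail, and roundup_pow2() and direntry_size() are names made up for
the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct xen_hypfs_direntry. */
struct direntry_example {
    uint16_t flags;
    uint16_t off_next;
    uint32_t content_len;
};

/* Round x up to a multiple of a, which must be a power of two. */
static size_t roundup_pow2(size_t x, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Space taken by one entry: header plus the name padded to the alignment. */
static size_t direntry_size(size_t name_len_with_nul)
{
    return sizeof(struct direntry_example) +
           roundup_pow2(name_len_with_nul, alignof(struct direntry_example));
}
```

With this form the padding follows the struct automatically if a wider
member is added to it later.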

> +new->parent = parent;
> +}
> +
> +spin_unlock(&hypfs_lock);
> +
> +return ret;
> +}
> +
> +int hypfs_new_entry_any(struct hypfs_dir *parent, const char *name,
> +enum hypfs_entry_type type, void *content)

Perhaps drop the _any suffix?

> +{
> +int ret;
> +struct hypfs_entry *new;
> +
> +if ( strchr(name, '/') || !strcmp(name, ".") || !strcmp(name, "..") )
> +return -EINVAL;
> +
> +new = xzalloc(struct hypfs_entry);
> +if ( !new )
> +return -ENOMEM;
> +
> +new->name = name;
> +new->type = type;
> +new->content = content;
> +
> +ret = hypfs_add_entry(parent, new);
> +
> +if ( ret )
> +xfree(new);
> +
> +return ret;
> +}
> +
> +int hypfs_new_entry_string(struct hypfs_dir *parent, const char *name,
> +   char *val)

The last parameter here and below being non-const is because of the
intended write support?

> +{
> +return hypfs_new_entry_any(parent, name, hypfs_type_string, val);
> +}
> +
> +int hypfs_new_entry_uint(struct hypfs_dir *parent, const char *name,
> + unsigned int *val)
> +{
> +return hypfs_new_entry_any(parent, name, hypfs_type_uint, val);
> +}
> +
> +int hypfs_new_dir(struct hypfs_dir *parent, const char *name,
> +  struct hypfs_dir *dir)
> +{
> +if ( !dir )
> +dir = xzalloc(struct hypfs_dir);
> +
> +return hypfs_new_entry_any(parent, name, hypfs_type_dir, dir);
> +}
> +
> +static int hypfs_get_path_user(char *buf, XEN_GUEST_HANDLE_PARAM(void) uaddr,
> +   unsigned long len)
> +{
> +if ( len > XEN_HYPFS_MAX_PATHLEN )
> +return -EINVAL;
> +
> +if ( copy_from_guest(buf, uaddr, len) )
> +return -EFAULT;
> +
> +buf[len - 1] = 0;

In the public interface description you have "including trailing zero
byte". I think instead of putting one there you should check there's
one.
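
A minimal sketch of checking for the terminator rather than writing one,
assuming the path has already been copied into buf. MAX_PATHLEN and its
value are placeholders for XEN_HYPFS_MAX_PATHLEN, and
check_path_terminated() is a hypothetical name. The sketch also rejects
len == 0, which the quoted code appears to let through to buf[len - 1].

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_PATHLEN 1024 /* placeholder for XEN_HYPFS_MAX_PATHLEN; value assumed */

/*
 * Validate a path of len bytes held in buf. Per the interface description,
 * len includes the trailing zero byte.
 */
static int check_path_terminated(const char *buf, size_t len)
{
    assert(buf != NULL);

    if (len == 0 || len > MAX_PATHLEN)
        return -EINVAL;

    /* Reject instead of silently truncating: the last byte must be NUL. */
    if (buf[len - 1] != '\0')
        return -EINVAL;

    return 0;
}
```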

> +return 0;
> +}
> +
> +static struct hypfs_entry *hypfs_get_entry_rel(struct hypfs_entry *dir,
> +   char *path)

const?

> +{
> +char *slash;

const?

> +struct hypfs_entry *entry;
> +struct list_head *l;
> +unsigned int name_len;
> +
> +if ( *path == 0 )

Please either use !*path or be consistent with code a few lines
down and use '\0'.

> +return dir;
> +
> +if ( dir->type != hypfs_type_dir )
> +return NULL;
> +
> +slash = strchr(path, '/');
> +if ( !slash )
> +slash = strchr(path, '\0');

With this better name the variable "end" or some such?

> +name_len = slash - path;
> +
> +list_for_each ( l, >dir->list )
> +{
> +int cmp;
> +
> +entry = list_entry(l, struct hypfs_entry, list);

Why not list_for_each_entry(), eliminating the need for the "l"
helper variable?

> +cmp = strncmp(path, entry->name, name_len);
> +if ( cmp < 0 )
> +return NULL;
> +

[Xen-devel] [PATCH v6 0/3] x86/boot: Introduce the kernel_info et consortes

2019-11-12 Thread Daniel Kiper
Hi,

Due to very limited space in the setup_header this patch series introduces a new
kernel_info struct which will be used to convey information from the kernel to
the bootloader. This way the boot protocol can be extended regardless of the
setup_header limitations. Additionally, the patch series introduces some
convenience features like the setup_indirect struct and the
kernel_info.setup_type_max field.

Daniel

 Documentation/x86/boot.rst | 174 
++
 arch/x86/boot/Makefile |   2 +-
 arch/x86/boot/compressed/Makefile  |   4 +-
 arch/x86/boot/compressed/kaslr.c   |  12 ++
 arch/x86/boot/compressed/kernel_info.S |  22 ++
 arch/x86/boot/header.S |   3 +-
 arch/x86/boot/tools/build.c|   5 +++
 arch/x86/include/uapi/asm/bootparam.h  |  16 +++-
 arch/x86/kernel/e820.c |  11 +
 arch/x86/kernel/kdebugfs.c |  21 --
 arch/x86/kernel/ksysfs.c   |  31 ++
 arch/x86/kernel/setup.c|   6 +++
 arch/x86/mm/ioremap.c  |  11 +
 13 files changed, 302 insertions(+), 16 deletions(-)

Daniel Kiper (3):
  x86/boot: Introduce the kernel_info
  x86/boot: Introduce the kernel_info.setup_type_max
  x86/boot: Introduce the setup_indirect



[Xen-devel] [PATCH v6 3/3] x86/boot: Introduce the setup_indirect

2019-11-12 Thread Daniel Kiper
The setup_data is a bit awkward to use for extremely large data objects,
both because the setup_data header has to be adjacent to the data object
and because it has a 32-bit length field. However, it is important that
intermediate stages of the boot process have a way to identify which
chunks of memory are occupied by kernel data. Thus introduce a uniform
way to specify such indirect data as the setup_indirect struct and the
SETUP_INDIRECT type.
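
As an illustration, a consumer that needs to know which memory an indirect
entry occupies could look roughly like the sketch below. The struct layouts
follow the documentation added by this patch, and the check mirrors the
condition used in the kaslr.c hunk. The value of SETUP_INDIRECT is written
here as the most significant bit, as described in the v3 notes; struct
region and indirect_region() are hypothetical names used only for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SETUP_E820_EXT 1u
#define SETUP_INDIRECT (1u << 31) /* most significant bit, per the v3 notes */

struct setup_data {
    uint64_t next;
    uint32_t type;
    uint32_t len;
    uint8_t  data[];
};

struct setup_indirect {
    uint32_t type;
    uint32_t reserved; /* Reserved, must be set to zero. */
    uint64_t len;
    uint64_t addr;
};

/* Hypothetical result type: the memory range the indirect entry points at. */
struct region {
    uint64_t start;
    uint64_t size;
};

/*
 * Returns 1 and fills *r if sd is an indirect entry whose payload is not
 * itself SETUP_INDIRECT (nesting is not allowed), 0 otherwise.
 */
static int indirect_region(const struct setup_data *sd, struct region *r)
{
    struct setup_indirect ind;

    assert(sd != NULL && r != NULL);

    if (sd->type != SETUP_INDIRECT)
        return 0;

    /* Copy out of the byte array to avoid alignment assumptions. */
    memcpy(&ind, sd->data, sizeof(ind));
    if (ind.type == SETUP_INDIRECT)
        return 0;

    r->start = ind.addr;
    r->size = ind.len;
    return 1;
}
```

Because nesting is rejected, the walk stays flat and needs no recursion,
which matches the stack-space concern raised in the documentation text.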

And finally bump setup_header version in arch/x86/boot/header.S.

Suggested-by: H. Peter Anvin (Intel) 
Signed-off-by: Daniel Kiper 
Acked-by: Konrad Rzeszutek Wilk 
Reviewed-by: Ross Philipson 
Reviewed-by: H. Peter Anvin (Intel) 
---
v6 - suggestions/fixes:
   - add a comment to arch/x86/kernel/kdebugfs.c
 (suggested by Borislav Petkov),
   - do some formatting tricks to increase code readability
 (suggested by Borislav Petkov),
   - drop "we" from the commit message
 (suggested by Borislav Petkov).

v5 - suggestions/fixes:
   - bump setup_header version in arch/x86/boot/header.S
 (suggested by H. Peter Anvin).

v4 - suggestions/fixes:
   - change "Note:" to ".. note::".

v3 - suggestions/fixes:
   - add setup_indirect mapping/KASLR avoidance/etc. code
 (suggested by H. Peter Anvin),
   - the SETUP_INDIRECT sets most significant bit right now;
 this way it is possible to differentiate regular setup_data
 and setup_indirect objects in the debugfs filesystem.

v2 - suggestions/fixes:
   - add setup_indirect usage example
 (suggested by Eric Snowberg and Ross Philipson).
---
 Documentation/x86/boot.rst | 43 +-
 arch/x86/boot/compressed/kaslr.c   | 12 ++
 arch/x86/boot/compressed/kernel_info.S |  2 +-
 arch/x86/boot/header.S |  2 +-
 arch/x86/include/uapi/asm/bootparam.h  | 16 ++---
 arch/x86/kernel/e820.c | 11 +
 arch/x86/kernel/kdebugfs.c | 21 +
 arch/x86/kernel/ksysfs.c   | 31 ++--
 arch/x86/kernel/setup.c|  6 +
 arch/x86/mm/ioremap.c  | 11 +
 10 files changed, 138 insertions(+), 17 deletions(-)

diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst
index 6cdd767c3835..90bb8f5ab384 100644
--- a/Documentation/x86/boot.rst
+++ b/Documentation/x86/boot.rst
@@ -827,6 +827,47 @@ Protocol:  2.09+
   sure to consider the case where the linked list already contains
   entries.
 
+  The setup_data is a bit awkward to use for extremely large data objects,
+  both because the setup_data header has to be adjacent to the data object
+  and because it has a 32-bit length field. However, it is important that
+  intermediate stages of the boot process have a way to identify which
+  chunks of memory are occupied by kernel data.
+
+  Thus setup_indirect struct and SETUP_INDIRECT type were introduced in
+  protocol 2.15.
+
+  struct setup_indirect {
+__u32 type;
+__u32 reserved;  /* Reserved, must be set to zero. */
+__u64 len;
+__u64 addr;
+  };
+
+  The type member is a SETUP_INDIRECT | SETUP_* type. However, it cannot be
+  SETUP_INDIRECT itself since making the setup_indirect a tree structure
+  could require a lot of stack space in something that needs to parse it
+  and stack space can be limited in boot contexts.
+
+  Let's give an example how to point to SETUP_E820_EXT data using setup_indirect.
+  In this case setup_data and setup_indirect will look like this:
+
+  struct setup_data {
+__u64 next = 0 or ;
+__u32 type = SETUP_INDIRECT;
+__u32 len = sizeof(setup_data);
+__u8 data[sizeof(setup_indirect)] = struct setup_indirect {
+  __u32 type = SETUP_INDIRECT | SETUP_E820_EXT;
+  __u32 reserved = 0;
+  __u64 len = ;
+  __u64 addr = ;
+}
+  }
+
+.. note::
+ SETUP_INDIRECT | SETUP_NONE objects cannot be properly distinguished
+ from SETUP_INDIRECT itself. So, this kind of objects cannot be provided
+ by the bootloaders.
+
    
 Field name:pref_address
 Type:  read (reloc)
@@ -986,7 +1027,7 @@ Field name:setup_type_max
 Offset/size:   0x000c/4
    ==
 
-  This field contains maximal allowed type for setup_data.
+  This field contains maximal allowed type for setup_data and setup_indirect structs.
 
 
 The Image Checksum
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 2e53c056ba20..bb9bfef174ae 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -459,6 +459,18 @@ static bool mem_avoid_overlap(struct mem_vector *img,
is_overlapping = true;
}
 
+   if (ptr->type == SETUP_INDIRECT &&
+   ((struct setup_indirect *)ptr->data)->type != SETUP_INDIRECT) {
+   avoid.start = ((struct setup_indirect *)ptr->data)->addr;
+   

Re: [Xen-devel] [PATCH for-4.13] xen: Drop bogus BOOLEAN definitions, TRUE and FALSE

2019-11-12 Thread Andrew Cooper
On 12/11/2019 08:35, Jan Beulich wrote:
> On 11.11.2019 21:24, Andrew Cooper wrote:
>> --- a/xen/arch/x86/x86_64/mm.c
>> +++ b/xen/arch/x86/x86_64/mm.c
>> @@ -1077,7 +1077,7 @@ long do_set_segment_base(unsigned int which, unsigned long base)
>>  }
>>  
>>  
>> -/* Returns TRUE if given descriptor is valid for GDT or LDT. */
>> +/* Returns true if given descriptor is valid for GDT or LDT. */
>>  int check_descriptor(const struct domain *dom, seg_desc_t *d)
> Wouldn't changes like this one better be accompanied by also adjusting
> the return type of the function (there are more examples further down
> in common/timer.c)?

No.  That is an unrelated change.

If I were flush with free time then I might consider doing this and
substantially increase the test burden.

As it stands, this request is scope creep.

>
>> --- a/xen/include/asm-arm/arm64/efibind.h
>> +++ b/xen/include/asm-arm/arm64/efibind.h
>> @@ -107,7 +107,7 @@ typedef uint64_t   UINTN;
>>  #define POST_CODE(_Data)
>>  
>>  
>> -#define BREAKPOINT()while (TRUE);// Make it hang on Bios[Dbg]32
>> +#define BREAKPOINT()while (true);// Make it hang on Bios[Dbg]32
> You do realize that this and other EFI headers (and perhaps also
> ACPI ones) are largely verbatim imports from other projects,
> updating of which will become less straightforward by such
> replacements? When pulling in the EFI ones I intentionally did not
> fiddle with them more than absolutely necessary.

Yes, and?

It is unacceptable for the acpi headers to forcibly redefine anything in
their scope, and its definition of va_args is downright dangerous.

All junk like this in header files does nothing but waste space and
compiler effort during compilation, and leave people with a slim chance
of shooting themselves in the foot.

How many times do these get touched?  (Rhetorical question.  The answer
is once (me, clang build fix) since their introduction, 8, 9 and 10
years ago).

For the 30s of effort required to tweak once-in-a-blue-moon patches
which touch these headers, trimming the junk is a no-brainer.

>
> If it wasn't for this, I'd have ack-ed the patch despite the other
> remark above.
>
>> --- a/xen/include/xen/mm.h
>> +++ b/xen/include/xen/mm.h
>> @@ -607,7 +607,7 @@ int __must_check donate_page(struct domain *d, struct page_info *page,
>>  #define RAM_TYPE_UNUSABLE 0x0004
>>  #define RAM_TYPE_ACPI 0x0008
>>  #define RAM_TYPE_UNKNOWN  0x0010
>> -/* TRUE if the whole page at @mfn is of the requested RAM type(s) above. */
>> +/* true if the whole page at @mfn is of the requested RAM type(s) above. */
>>  int page_is_ram_type(unsigned long mfn, unsigned long mem_type);
> In other comments I already wasn't sure about such replacements, but
> let them be. Here, however, you violate coding style by using "true"
> instead of "True" (the function returning "int" for now doesn't even
> allow the excuse of meaning the identifier rather than the word).

Fixed.

~Andrew


Re: [Xen-devel] [PATCH] arch: arm: vgic-v3: fix GICD_ISACTIVER range

2019-11-12 Thread Andre Przywara
On Mon, 11 Nov 2019 11:01:07 -0800 (PST)
Stefano Stabellini  wrote:

Hi,

> On Sat, 9 Nov 2019, Julien Grall wrote:
> > On Sat, 9 Nov 2019, 04:27 Stefano Stabellini,  
> > wrote:
> >   On Thu, 7 Nov 2019, Peng Fan wrote:  
> >   > The end should be GICD_ISACTIVERN not GICD_ISACTIVER.
> >   >
> >   > Signed-off-by: Peng Fan   
> > 
> >   Reviewed-by: Stefano Stabellini 
> > 
> > 
> > To be honest, I am not sure the code is correct. A read to those registers 
> > should tell you the list of interrupts active. As we always
> > return 0, this will not return the correct state of the GIC.
> > 
> > I know that returning the list of active interrupts is complicated with 
> > the old vGIC, but I don't think silently ignoring it is a good
> > idea.
> > The question here is why the guest accessed those registers? What is it 
> > trying to figure out?

I see Linux querying the active state (IRQCHIP_STATE_ACTIVE) at two relevant 
points for ARM:
- In kernel/irq/manage.c, in __synchronize_hardirq().
- In KVM's arch timer emulation code.

I think the latter is of no concern (yet), but the first might actually 
trigger. At the moment it's beyond me what it actually does, but maybe some IRQ 
changes (RT, threaded IRQs?) trigger this now?
 
> We are not going to solve the general problem at this stage. At the
> moment the code:
> 
> - ignores the first register only
> - prints an error and returns an IO_ABORT error for the other regs
> 
> For the inconsistency alone the second option is undesirable. Also it
> doesn't match the write implementation, which does the same thing for
> all the GICD_ISACTIVER* regs instead of having a special treatment for
> the first one only. It looks like a typo in the original patch to me.
> 
> The proposed patch switches the behavior to:
> 
> - silently ignore all the GICD_ISACTIVER* regs (as proposed)
> 
> which is an improvement.

Yeah, I agree. Getting the actual active state of a virtual IRQ is at least 
expensive, as you have to bring all VCPUs back, to sync the LR content back 
into something Xen can access.
This is an architectural property of the GIC virtualisation, as normally the 
acknowledge happens without exiting, also the EOI, so Xen cannot know which 
state an IRQ is in while the VCPU is running. Think: Schrödinger ;-)

Regarding this patch: The original code looks indeed like a typo to me: On the 
read side both ISACTIVERx and ICACTIVERx behave the same, so they should be 
handled the same.
And returning 0 is probably the best approximation we can do at the moment. The 
other solution is to add GICv3 support to the new VGIC ;-), as this at least 
solves the case when we deliberately inject an active IRQ. We could extend this 
logic to find out which IRQs in this block *could* possibly be active, then 
bring those VCPUs back to Xen.
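
To make the effect of the one-character fix concrete, here is a small
sketch of the range matching. It assumes that VRANGE32(start, end) covers
every byte from start up to end + 3, and it uses the GICv3 distributor
offsets 0x300-0x37c for GICD_ISACTIVERn and 0x380-0x3fc for
GICD_ICACTIVERn; in_range32() and the two reads_as_zero_*() helpers are
made-up names for the example, not Xen functions.

```c
#include <assert.h>
#include <stdint.h>

/* Distributor register offsets, taken from the GICv3 register map. */
#define GICD_ISACTIVER   0x300u
#define GICD_ISACTIVERN  0x37cu
#define GICD_ICACTIVER   0x380u
#define GICD_ICACTIVERN  0x3fcu

/* True if reg hits any byte of the 32-bit registers first..last. */
static int in_range32(uint32_t reg, uint32_t first, uint32_t last)
{
    assert(first <= last);
    return reg >= first && reg <= last + 3;
}

/* Before the patch: only the first ISACTIVER register reads as zero. */
static int reads_as_zero_before(uint32_t reg)
{
    return in_range32(reg, GICD_ISACTIVER, GICD_ISACTIVER) ||
           in_range32(reg, GICD_ICACTIVER, GICD_ICACTIVERN);
}

/* After the patch: the whole ISACTIVER block reads as zero. */
static int reads_as_zero_after(uint32_t reg)
{
    return in_range32(reg, GICD_ISACTIVER, GICD_ISACTIVERN) ||
           in_range32(reg, GICD_ICACTIVER, GICD_ICACTIVERN);
}
```

With the old range, a read of offset 0x304 falls through to the error path,
while 0x300 is silently read as zero; with the new range both are handled
the same way as the ICACTIVER block.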

Cheers,
Andre.

> >   Juergen, I think this fix should be in the release (and also
> >   backported to stable trees.)
> > 
> > 
> > Without an understanding of the problem, I disagree with this request (see 
> > above).
> > 
> > As an aside, the range ISPENDR  has the same issue.  
> 
> You meant GICD_ICPENDR, right? Yep, that one is suffering from the same
> typo mistake too.
> 
>  
> >   > ---
> >   >  xen/arch/arm/vgic-v3.c | 2 +-
> >   >  1 file changed, 1 insertion(+), 1 deletion(-)
> >   >
> >   > diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> >   > index 422b94f902..e802f2055a 100644
> >   > --- a/xen/arch/arm/vgic-v3.c
> >   > +++ b/xen/arch/arm/vgic-v3.c
> >   > @@ -706,7 +706,7 @@ static int __vgic_v3_distr_common_mmio_read(const char *name, struct vcpu *v,
> >   >          goto read_as_zero;
> >   > 
> >   >      /* Read the active status of an IRQ via GICD/GICR is not supported */
> >   > -    case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVER):
> >   > +    case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN):
> >   >      case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN):
> >   >          goto read_as_zero;
> >   > 
> >   > --
> >   > 2.16.4
> >   >  
> > 
> >   ___
> >   Xen-devel mailing list
> >   Xen-devel@lists.xenproject.org
> >   https://lists.xenproject.org/mailman/listinfo/xen-devel
> > 
> > 
> >   



Re: [Xen-devel] [xen-4.13.0-rc] kexec/kdump failure with cpu Intel(R) Xeon(R) Gold 6242 CPU

2019-11-12 Thread Andrew Cooper
On 12/11/2019 11:38, Dietmar Hahn wrote:
> Hi,
>
> on a new machine with cpu Intel(R) Xeon(R) Gold 6242 CPU the kexec/kdump
> doesn't work with current xen-4.13.0-rc.
> The last output of the xen console is:
>
> (XEN) Hardware Dom0 crashed: Executing kexec image on cpu5
> (XEN) Shot down all CPUs
>
> After short delay the system reboots.
>
> It seems the fixes mentioned in the thread
> https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01948.html
> aren't enough.
>
> I built xen-4.11 with the patches but no success.
> On an older system with xen-4.4 the kdump works.
>
> Any help is appreciated.

Do you have purgatory serial enabled?

By any chance does Xen revert back to xAPIC mode and Linux configure
x2apic mode?  There are some interrupt routing issues on those CPUs with
mismatched x(2)apic settings.

~Andrew


[Xen-devel] [ovmf test] 144034: all pass - PUSHED

2019-11-12 Thread osstest service owner
flight 144034 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144034/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf f8dd7c7018adf78992da572eeaf53c0ce31a411f
baseline version:
 ovmf 995d8b8568fe67afffdaac3012d7b990e7314d0b

Last test of basis   144011  2019-11-11 11:09:14 Z    1 days
Testing same since   144034  2019-11-11 23:13:12 Z    0 days    1 attempts


People who touched revisions under test:
  Kinney 
  Laszlo Ersek 
  Leif Lindholm 
  Michael D Kinney 
  Ray Ni 
  Sean Brogan 
  Zhichao Gao 

jobs:
 build-amd64-xsm                          pass
 build-i386-xsm                           pass
 build-amd64                              pass
 build-i386                               pass
 build-amd64-libvirt                      pass
 build-i386-libvirt                       pass
 build-amd64-pvops                        pass
 build-i386-pvops                         pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64     pass
 test-amd64-i386-xl-qemuu-ovmf-amd64      pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   995d8b8568..f8dd7c7018  f8dd7c7018adf78992da572eeaf53c0ce31a411f -> xen-tested-master


[Xen-devel] [OSSTEST PATCH 1/2] ts-libvirt-build: Provide PKG_CONFIG_PATH

2019-11-12 Thread Ian Jackson
In osstest we do not install the xen tree in /usr/local because the
build environment is shared with many different build jobs which might
be using different versions of Xen.  We put it in a job-specific
directory in ~osstest on the build host, and set environment variables
to ensure that it all gets picked up.

Recent versions of libvirt insist on finding xenlight.pc; otherwise
they disable libxl support.  So we must add a PKG_CONFIG_PATH setting.

(In all cases, contrary to the usual protocol for path-like variables,
we do not append but instead simply set the variable.  This is OK
because this is an osstest build script run via ssh to the build host,
so the variables won't have been set already.)

CC: Jim Fehlig 
Signed-off-by: Ian Jackson 
---
 ts-libvirt-build | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ts-libvirt-build b/ts-libvirt-build
index bc08190a..2a363f43 100755
--- a/ts-libvirt-build
+++ b/ts-libvirt-build
@@ -60,6 +60,7 @@ sub config() {
 cd libvirt
 CFLAGS="-g -I$xenprefix/include/" \\
 LDFLAGS="-g -L$xenprefix/lib/ -Wl,-rpath-link=$xenprefix/lib/" \\
+PKG_CONFIG_PATH="$xenprefix/lib/pkgconfig/" \\
 GNULIB_SRCDIR=$builddir/libvirt/$gnulib->{Path} \\
 ./autogen.sh --no-git \\
  --with-libxl --without-xen --without-xenapi --without-selinux \\
-- 
2.11.0



[Xen-devel] [OSSTEST PATCH 2/2] ts-libvirt-build: Do an out-of-tree build

2019-11-12 Thread Ian Jackson
Recent versions of libvirt do not support in-tree builds (!)

Cope with this by always building in a subdirectory `build' (a
subdirectory of the source tree); this is the arrangement which the
libvirt upstream messages and documentation now seem to recommend (at
least where things have been updated).

I compared the differences in build output between the results of this
branch and a previous passing xen-unstable flight.  The libvirt
library version increased and a file
  usr/local/share/libvirt/cpu_map/arm_features.xml
appeared.  I think this is just due to changes in the libvirt version,
2cff65e4c60e..70218e10bcde, in particular 0de541bfc575
  cpu_map: Ship arm_features.xml

I also tested that a test job, built with current libvirt and these
osstest changes, passes as expected.

CC: Jim Fehlig 
Signed-off-by: Ian Jackson 
Tested-by: Ian Jackson 
---
 ts-libvirt-build | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/ts-libvirt-build b/ts-libvirt-build
index 2a363f43..e799f003 100755
--- a/ts-libvirt-build
+++ b/ts-libvirt-build
@@ -58,11 +58,13 @@ sub config() {
 my $gnulib = submodule_find($submodules, "gnulib");
 target_cmd_build($ho, 3600, $builddir, <{Path} \\
-./autogen.sh --no-git \\
+../autogen.sh --no-git \\
  --with-libxl --without-xen --without-xenapi --without-selinux \\
  --without-lxc --without-vbox --without-uml \\
  --without-qemu --without-openvz --without-vmware \\
@@ -72,9 +74,9 @@ END
 
 sub build() {
 target_cmd_build($ho, 3600, $builddir, <&1 && touch ../build-ok-stamp) |tee ../log
-test -f ../build-ok-stamp #/
+cd libvirt/build
+(make $makeflags 2>&1 && touch ../../build-ok-stamp) |tee ../log
+test -f ../../build-ok-stamp #/
 echo ok.
 END
 }
@@ -82,7 +84,7 @@ END
 sub install() {
 target_cmd_build($ho, 300, $builddir, 

[Xen-devel] Xen 4.13 RC2

2019-11-12 Thread Juergen Gross

Hi all,

Xen 4.13 rc2 is tagged. You can check that out from xen.git:

git://xenbits.xen.org/xen.git 4.13.0-rc2

For your convenience there is also a tarball at:
https://downloads.xenproject.org/release/xen/4.13.0-rc2/xen-4.13.0-rc2.tar.gz

And the signature is at:
https://downloads.xenproject.org/release/xen/4.13.0-rc2/xen-4.13.0-rc2.tar.gz.sig

Please send bug reports and test reports to xen-devel@lists.xenproject.org.
When sending bug reports, please CC relevant maintainers and me
(jgr...@suse.com).

There will be a Xen Test Day on Nov 14th.

See instructions on:

https://wiki.xenproject.org/wiki/Xen_4.13_RC_test_instructions
https://wiki.xenproject.org/wiki/Xen_Project_Test_Days


Juergen


Re: [Xen-devel] [PATCH V2 2/2] x86/mm: Make use of the default access param from xc_altp2m_create_view

2019-11-12 Thread Jan Beulich
On 06.11.2019 16:35, Alexandru Stefan ISAILA wrote:
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -1345,13 +1345,14 @@ void setup_ept_dump(void)
>  register_keyhandler('D', ept_dump_p2m_table, "dump VT-x EPT tables", 0);
>  }
>  
> -void p2m_init_altp2m_ept(struct domain *d, unsigned int i)
> +void p2m_init_altp2m_ept(struct domain *d, unsigned int i,
> + p2m_access_t default_access)
>  {
>  struct p2m_domain *p2m = d->arch.altp2m_p2m[i];
>  struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
>  struct ept_data *ept;
>  
> -p2m->default_access = hostp2m->default_access;
> +p2m->default_access = default_access;
>  p2m->domain = hostp2m->domain;
>  
>  p2m->global_logdirty = hostp2m->global_logdirty;

All of this is not EPT-specific. Before adding more infrastructure to
cover for this (here: another function parameter), how about moving
these parts into vendor-independent code?

> @@ -2572,17 +2574,36 @@ int p2m_init_altp2m_by_id(struct domain *d, unsigned int idx)
>  altp2m_list_lock(d);
>  
>  if ( d->arch.altp2m_eptp[idx] == mfn_x(INVALID_MFN) )
> -rc = p2m_activate_altp2m(d, idx);
> +rc = p2m_activate_altp2m(d, idx, hostp2m->default_access);
>  
>  altp2m_list_unlock(d);
>  return rc;
>  }
>  
> -int p2m_init_next_altp2m(struct domain *d, uint16_t *idx)
> +int p2m_init_next_altp2m(struct domain *d, uint16_t *idx,
> + uint16_t hvmmem_default_access)
>  {
>  int rc = -EINVAL;
>  unsigned int i;
>  
> +static const p2m_access_t memaccess[] = {
> +#define ACCESS(ac) [XENMEM_access_##ac] = p2m_access_##ac
> +ACCESS(n),
> +ACCESS(r),
> +ACCESS(w),
> +ACCESS(rw),
> +ACCESS(x),
> +ACCESS(rx),
> +ACCESS(wx),
> +ACCESS(rwx),
> +ACCESS(rx2rw),
> +ACCESS(n2rwx),
> +#undef ACCESS
> +};
> +
> +if ( hvmmem_default_access > XENMEM_access_default )
> +return rc;
> +
>  altp2m_list_lock(d);
>  
>  for ( i = 0; i < MAX_ALTP2M; i++ )
> @@ -2590,7 +2611,7 @@ int p2m_init_next_altp2m(struct domain *d, uint16_t *idx)
>  if ( d->arch.altp2m_eptp[i] != mfn_x(INVALID_MFN) )
>  continue;
>  
> -rc = p2m_activate_altp2m(d, i);
> +rc = p2m_activate_altp2m(d, i, memaccess[hvmmem_default_access]);

Aren't you open-coding xenmem_access_to_p2m_access() here? In
no event should there be two instances of the same static array.
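
A minimal sketch of the "one table, one helper" point, assuming simplified stand-in enums (the names pub_access, int_access and pub_to_int_access are hypothetical and are not Xen's definitions). Every caller goes through the single conversion function instead of carrying its own copy of the array:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a public and an internal access enum. */
enum pub_access { PUB_ACCESS_N, PUB_ACCESS_R, PUB_ACCESS_W, PUB_ACCESS_RW,
                  PUB_ACCESS_COUNT };
enum int_access { INT_ACCESS_N = 10, INT_ACCESS_R, INT_ACCESS_W,
                  INT_ACCESS_RW };

/*
 * Single conversion helper: the lookup table lives in exactly one place.
 * Returns false for out-of-range input or a NULL output pointer.
 */
bool pub_to_int_access(unsigned int pub, enum int_access *out)
{
    static const enum int_access table[PUB_ACCESS_COUNT] = {
#define ACCESS(ac) [PUB_ACCESS_##ac] = INT_ACCESS_##ac
        ACCESS(N),
        ACCESS(R),
        ACCESS(W),
        ACCESS(RW),
#undef ACCESS
    };

    if ( pub >= PUB_ACCESS_COUNT || out == NULL )
        return false;

    *out = table[pub];
    return true;
}
```

With such a helper, a second caller only needs to pass its value through the function; the range check and the mapping cannot drift apart between the two sites.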

Jan


Re: [Xen-devel] [XEN PATCH for-4.13] tools/configure: Honour XEN_COMPILE_ARCH and _TARGET_ for shim

2019-11-12 Thread Ian Jackson
Ian Jackson writes ("Re: [Xen-devel] [XEN PATCH for-4.13] tools/configure: Honour XEN_COMPILE_ARCH and _TARGET_ for shim"):
> Andrew Cooper writes ("Re: [Xen-devel] [XEN PATCH for-4.13] tools/configure: Honour XEN_COMPILE_ARCH and _TARGET_ for shim"):
> > On 29/10/2019 17:57, Ian Jackson wrote:
> > > The pvshim can only be built 64-bit because the hypervisor is only
> > > 64-bit nowadays.  The hypervisor build supports XEN_COMPILE_ARCH and
> > > XEN_TARGET_ARCH which override the information from uname.  The pvshim
> > > build runs out of the tools/ directory but calls the hypervisor build
> > > system.
> > >
> > > If one runs in a Linux 32-bit userland with a 64-bit kernel, one used
> > > to be able to set XEN_COMPILE_ARCH.  But nowadays this does not work.
> > 
> > This looks to be a bugfix to 8845155c831c59e867ee3dd31ee63e0cc6c7dcf2 ?
> > 
> > In particular, this deleted the logic to only build the shim for
> > XEN_TARGET_ARCH != x86_32.
> 
> Yes.  I have added a note about that to the commit message (stealing
> your text, thanks) and now I am CCing the author and requester of that
> commit, for form's sake.

Andrew, did you want to ack this ?  Or do you have further comments ?
I have a release-ack...

Thanks,
Ian.

From 1a8de36699b9042c30797e05f7a5f4313d7f7ad1 Mon Sep 17 00:00:00 2001
From: Ian Jackson 
Date: Tue, 29 Oct 2019 17:45:30 +
Subject: [PATCH] tools/configure: Honour XEN_COMPILE_ARCH and _TARGET_ for
 shim
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The pvshim can only be built 64-bit because the hypervisor is only
64-bit nowadays.  The hypervisor build supports XEN_COMPILE_ARCH and
XEN_TARGET_ARCH which override the information from uname.  The pvshim
build runs out of the tools/ directory but calls the hypervisor build
system.

If one runs in a Linux 32-bit userland with a 64-bit kernel, one used
to be able to set XEN_COMPILE_ARCH.  But nowadays this does not work.
configure sees the target cpu as 64-bit and tries to build pvshim.
The build prints
  echo "*** Xen x86/32 target no longer supported!"
and doesn't build anything.  Then the subsequent Makefiles try to
install the non-built pieces.

Fix this anomaly by causing configure to honour the Xen hypervisor way
of setting the target architecture.

In principle this user behaviour is not handled quite right, because
configure will still see 64-bit and so all the autoconf-based
architecture testing will see 64-bit rather than 32-bit x86.  But the
tools are in fact generally quite portable: this particular location
in configure{.ac,} is the only place in tools/ where 64-bit x86 is
treated differently from 32-bit x86, so the fix is sufficient and
correct for this use case.

It remains the case that setting XEN_COMPILE_ARCH or XEN_TARGET_ARCH to
a non-x86 architecture, when configure thinks things are x86, or vice
versa, will not work right.

(This is a bugfix to 8845155c831c
  pvshim: make PV shim build selectable from configure
which inadvertently deleted the logic to only build the shim for
XEN_TARGET_ARCH != x86_32.)

I have rerun autogen.sh, so this patch contains the fix to configure
as well as the source fix to configure.ac.

Fixes: 8845155c831c59e867ee3dd31ee63e0cc6c7dcf2
Signed-off-by: Ian Jackson 
CC: Olaf Hering 
CC: Roger Pau Monné 
Release-acked-by: Jürgen Groß 
---
 tools/configure    | 2 +-
 tools/configure.ac | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/configure b/tools/configure
index 82947ad308..d9ccce6d2b 100755
--- a/tools/configure
+++ b/tools/configure
@@ -9711,7 +9711,7 @@ fi
 else
 
 cpu=`test -z "$target_cpu" && echo "$host_cpu" || echo "$target_cpu"`
-case "$cpu" in
+case "${XEN_COMPILE_ARCH-${XEN_TARGET_ARCH-$cpu}}" in
 x86_64)
pvshim="y";;
 *) pvshim="n";;
diff --git a/tools/configure.ac b/tools/configure.ac
index 674bd5809d..a8d8ce5ffe 100644
--- a/tools/configure.ac
+++ b/tools/configure.ac
@@ -479,7 +479,7 @@ AC_ARG_ENABLE([pvshim],
[Disable pvshim build (enabled by default on 64bit x86)]),
 [AS_IF([test "x$enable_pvshim" = "xno"], [pvshim=n], [pvshim=y])], [
 cpu=`test -z "$target_cpu" && echo "$host_cpu" || echo "$target_cpu"`
-case "$cpu" in
+case "${XEN_COMPILE_ARCH-${XEN_TARGET_ARCH-$cpu}}" in
 x86_64)
pvshim="y";;
 *) pvshim="n";;
-- 
2.11.0
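
The nested expansion `${XEN_COMPILE_ARCH-${XEN_TARGET_ARCH-$cpu}}` picks the first of the three values that is set. Because the `-` form is used without a colon, a variable that is set but empty still wins over the fallback. A rough C model of that selection, with hypothetical helper names (pick_arch and pvshim_enabled are illustrative only), where a NULL pointer stands for "unset" as getenv() would return it:

```c
#include <stddef.h>
#include <string.h>

/*
 * Model of ${A-${B-$cpu}}: use A if it is set (even to an empty string),
 * else B if it is set, else cpu. NULL models an unset variable.
 */
const char *pick_arch(const char *compile_arch, const char *target_arch,
                      const char *cpu)
{
    if ( compile_arch != NULL )
        return compile_arch;
    if ( target_arch != NULL )
        return target_arch;
    return cpu;
}

/* The shim is only selected when the chosen architecture is x86_64. */
int pvshim_enabled(const char *arch)
{
    return arch != NULL && strcmp(arch, "x86_64") == 0;
}
```

For example, pick_arch(getenv("XEN_COMPILE_ARCH"), getenv("XEN_TARGET_ARCH"), cpu) yields the detected cpu only when neither override is present in the environment.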



Re: [Xen-devel] [PATCH V2 1/2] x86/altp2m: Add hypercall to set a range of sve bits

2019-11-12 Thread Jan Beulich
On 06.11.2019 16:35, Alexandru Stefan ISAILA wrote:
> @@ -4681,7 +4682,7 @@ static int do_altp2m_op(
>  break;
>  
>  case HVMOP_altp2m_set_suppress_ve:
> -if ( a.u.suppress_ve.pad1 || a.u.suppress_ve.pad2 )
> +if ( a.u.suppress_ve.pad1 )

Just because the field changes its name doesn't mean you can
drop the check. You even add a new field not used (yet) by
this sub-function, which then also would need checking here.

> @@ -4693,8 +4694,23 @@ static int do_altp2m_op(
>  }
>  break;
>  
> +case HVMOP_altp2m_set_suppress_ve_multi:
> +if ( a.u.suppress_ve.pad1 || !a.u.suppress_ve.nr )

A count of zero typically is taken as a no-op, not an error.
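
A small sketch of both conventions mentioned so far, assuming a made-up request layout (struct multi_req and handle_multi are hypothetical and only illustrate the pattern): fields an operation does not use are rejected when non-zero, so they can be given a meaning later, while a zero count succeeds without doing anything.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical request layout, for illustration only. */
struct multi_req {
    uint8_t  pad1;     /* must be zero */
    uint32_t nr;       /* number of entries to process */
    uint64_t reserved; /* unused by this operation, must be zero */
};

/*
 * Validate and "process" a request. Non-zero padding or unused fields
 * are an error; nr == 0 is a successful no-op. *processed receives the
 * number of entries handled.
 */
int handle_multi(const struct multi_req *req, uint32_t *processed)
{
    *processed = 0;

    if ( req->pad1 || req->reserved )
        return -EINVAL;

    if ( req->nr == 0 )
        return 0; /* nothing to do, not an error */

    *processed = req->nr; /* stand-in for the real per-entry work */
    return 0;
}
```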

> +rc = -EINVAL;
> +else
> +{
> +rc = p2m_set_suppress_ve_multi(d, _ve);
> +
> +if ( rc == -ERESTART )
> +if ( __copy_field_to_guest(guest_handle_cast(arg,
> +   xen_hvm_altp2m_op_t),
> +   &a, u.suppress_ve.opaque) )
> +rc = -EFAULT;

If the operation is best effort, _some_ indication of failure should
still be handed back to the caller. Whether that's through the opaque
field or by some other means is secondary. If not via that field
(which would make the outer of the two if()-s disappear), please fold
the if()-s.

> +}
> +break;
> +
>  case HVMOP_altp2m_get_suppress_ve:
> -if ( a.u.suppress_ve.pad1 || a.u.suppress_ve.pad2 )
> +if ( a.u.suppress_ve.pad1 )

See above.

> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -3054,6 +3054,64 @@ out:
>  return rc;
>  }
>  
> +/*
> + * Set/clear the #VE suppress bit for multiple pages.  Only available on VMX.
> + */
> +int p2m_set_suppress_ve_multi(struct domain *d,
> +  struct xen_hvm_altp2m_suppress_ve* sve)

Misplaced *.

> +{
> +struct p2m_domain *host_p2m = p2m_get_hostp2m(d);
> +struct p2m_domain *ap2m = NULL;
> +struct p2m_domain *p2m;
> +uint64_t start = sve->opaque ?: sve->gfn;

According to this start (and hence ->opaque) are GFNs.

> +int rc = 0;
> +
> +if ( sve->view > 0 )
> +{
> +if ( sve->view >= MAX_ALTP2M ||
> + d->arch.altp2m_eptp[sve->view] == mfn_x(INVALID_MFN) )
> +return -EINVAL;
> +
> +p2m = ap2m = d->arch.altp2m_p2m[sve->view];
> +}
> +else
> +p2m = host_p2m;
> +
> +p2m_lock(host_p2m);
> +
> +if ( ap2m )
> +p2m_lock(ap2m);
> +
> +
> +while ( start < sve->nr )

According to this, start is an index. Which of the two do you
mean?
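
One way to keep the two meanings apart, sketched under the assumption that the continuation value is an index into the range and the frame number is always derived from it. The structure, the OP_DONE/OP_RESTART return codes and the budget parameter are hypothetical and are not taken from the patch:

```c
#include <stdint.h>

enum { OP_DONE = 0, OP_RESTART = 1 }; /* hypothetical return codes */

struct range_op {
    uint64_t first_gfn; /* first frame number of the range */
    uint32_t nr;        /* number of frames in the range */
    uint64_t opaque;    /* continuation state: entries already handled */
};

/*
 * Handle at most 'budget' frames per call. The loop variable and the
 * saved continuation are both indices; the frame number is computed as
 * first_gfn + index. Returns OP_RESTART while work remains.
 */
int process_range(struct range_op *op, unsigned int budget,
                  uint64_t *last_gfn)
{
    uint64_t i = op->opaque;

    while ( i < op->nr )
    {
        if ( budget == 0 )
        {
            op->opaque = i; /* resume here on the next call */
            return OP_RESTART;
        }
        *last_gfn = op->first_gfn + i; /* stand-in for per-frame work */
        i++;
        budget--;
    }

    op->opaque = 0;
    return OP_DONE;
}
```

Storing a frame number in the continuation would work just as well, provided the loop bound is then first_gfn + nr rather than nr; the point is only that both sides of the comparison use the same unit.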

> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -42,8 +42,9 @@ struct xen_hvm_altp2m_suppress_ve {
>  uint16_t view;
>  uint8_t suppress_ve; /* Boolean type. */
>  uint8_t pad1;
> -uint32_t pad2;
> +uint32_t nr;
>  uint64_t gfn;
> +uint64_t opaque;
>  };

How is this addition of a field going to work compatibly with old
and new callers on old and new hypervisors? Recall in particular
that these operations are (almost?) all potentially usable by the
guest itself.
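
To make the layout question concrete, the two versions of the structure from the quoted hunk can be compared with compile-time checks. This is only a sketch: the type names sve_old and sve_new and the helper are invented here, and it does not show how the structure is embedded in the surrounding hypercall argument.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch (fields as in the quoted hunk). */
struct sve_old {
    uint16_t view;
    uint8_t  suppress_ve;
    uint8_t  pad1;
    uint32_t pad2;
    uint64_t gfn;
};

/* Layout after the patch. */
struct sve_new {
    uint16_t view;
    uint8_t  suppress_ve;
    uint8_t  pad1;
    uint32_t nr;
    uint64_t gfn;
    uint64_t opaque;
};

/* 'nr' reuses the bytes of 'pad2'; existing fields do not move. */
_Static_assert(offsetof(struct sve_old, pad2) ==
               offsetof(struct sve_new, nr), "nr must overlay pad2");
_Static_assert(offsetof(struct sve_old, gfn) ==
               offsetof(struct sve_new, gfn), "gfn must not move");

/* Bytes a new consumer reads beyond what an old producer filled in. */
size_t sve_extra_bytes(void)
{
    return sizeof(struct sve_new) - sizeof(struct sve_old);
}
```

The checks show that old fields keep their offsets, but the trailing field makes the new layout larger, which is where old callers on new hypervisors (and the reverse) need a defined behaviour.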

Jan


[Xen-devel] [xen-4.13.0-rc] kexec/kdump failure with cpu Intel(R) Xeon(R) Gold 6242 CPU

2019-11-12 Thread Dietmar Hahn
Hi,

on a new machine with cpu Intel(R) Xeon(R) Gold 6242 CPU the kexec/kdump
doesn't work with current xen-4.13.0-rc.
The last output of the xen console is:

(XEN) Hardware Dom0 crashed: Executing kexec image on cpu5
(XEN) Shot down all CPUs

After a short delay the system reboots.

It seems the fixes mentioned in the thread
https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg01948.html
aren't enough.

I built xen-4.11 with the patches, but without success.
On an older system with xen-4.4 the kdump works.

Any help is appreciated.
Thanks.

Dietmar.




Re: [Xen-devel] [PATCH] Introduce a description of a new optional tag for Backports

2019-11-12 Thread Ian Jackson
Anthony PERARD writes ("Re: [Xen-devel] [PATCH] Introduce a description of a new optional tag for Backports"):
> Should we describe the Fixes: tag as well? These would have a similar
> purpose to the backport tag, I mean it could help figure out which
> commit to backport to which tree.

Good point.

Ian.


Re: [Xen-devel] [PATCH] Introduce a description of a new optional tag for Backports

2019-11-12 Thread Anthony PERARD
On Fri, Nov 08, 2019 at 11:09:52AM -0800, Stefano Stabellini wrote:
> +Backport Tag
> +
> +
> +A backport tag is an optional tag in the commit message to request a
> +given commit to be backported to the stable trees:
> +
> +Backport: all
[...]

Should we describe the Fixes: tag as well? These would have a similar
purpose to the backport tag, I mean it could help figure out which
commit to backport to which tree.

-- 
Anthony PERARD


Re: [Xen-devel] Xen-unstable: AMD-Vi: update_paging_mode Try to access pdev_list without aquiring pcidevs_lock.

2019-11-12 Thread Jan Beulich
On 11.11.2019 22:38, Sander Eikelenboom wrote:
> When supplying "pci=nomsi" to the guest kernel, the device works fine,
> and I don't get the "INVALID_DEV_REQUEST".
> 
> After reverting 1b00c16bdf, the device works fine 
> and I don't get the INVALID_DEV_REQUEST, 

Could you give the patch below a try? That commit took care of only
securing ourselves, but not of relaxing things again when a device
gets handed to a guest for actual use.

Jan

AMD/IOMMU: restore DTE fields in amd_iommu_setup_domain_device()

Commit 1b00c16bdf ("AMD/IOMMU: pre-fill all DTEs right after table
allocation") moved ourselves into a more secure default state, but
didn't take sufficient care to also undo the effects when handing a
previously disabled device back to a(nother) domain. Put the fields
that may have been changed elsewhere back to their intended values
(some fields amd_iommu_disable_domain_device() touches don't
currently get written anywhere else, and hence don't need modifying
here).

Reported-by: Sander Eikelenboom 
Signed-off-by: Jan Beulich 

--- unstable.orig/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ unstable/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -114,11 +114,21 @@ static void amd_iommu_setup_domain_devic
 
 if ( !dte->v || !dte->tv )
 {
+const struct ivrs_mappings *ivrs_dev;
+
 /* bind DTE to domain page-tables */
 amd_iommu_set_root_page_table(
 dte, page_to_maddr(hd->arch.root_table), domain->domain_id,
 hd->arch.paging_mode, valid);
 
+/* Undo what amd_iommu_disable_domain_device() may have done. */
+ivrs_dev = &get_ivrs_mappings(iommu->seg)[req_id];
+if ( dte->it_root )
+dte->int_ctl = IOMMU_DEV_TABLE_INT_CONTROL_TRANSLATED;
+dte->iv = iommu_intremap;
+dte->ex = ivrs_dev->dte_allow_exclusion;
+dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
+
 if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
  iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
 dte->i = ats_enabled;



[Xen-devel] [xen-4.11-testing test] 144025: tolerable FAIL - PUSHED

2019-11-12 Thread osstest service owner
flight 144025 xen-4.11-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144025/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict  10 debian-hvm-install  fail  never pass
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict  10 debian-hvm-install  fail  never pass
 test-amd64-i386-xl-pvshim  12 guest-start  fail  never pass
 test-amd64-amd64-libvirt  13 migrate-support-check  fail  never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-check  fail  never pass
 test-amd64-i386-libvirt  13 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-vhd  12 migrate-support-check  fail  never pass
 test-arm64-arm64-xl  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-thunderx  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-thunderx  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check  fail  never pass
 test-amd64-amd64-qemuu-nested-amd  17 debian-hvm-install/l1/l2  fail  never pass
 test-amd64-i386-xl-qemuu-win7-amd64  17 guest-stop  fail  never pass
 test-amd64-amd64-xl-qemuu-win7-amd64  17 guest-stop  fail  never pass
 test-amd64-i386-xl-qemut-win7-amd64  17 guest-stop  fail  never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64  17 guest-stop  fail  never pass
 test-armhf-armhf-xl-rtds  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-rtds  14 saverestore-support-check  fail  never pass
 test-amd64-i386-xl-qemuu-ws16-amd64  17 guest-stop  fail  never pass
 test-armhf-armhf-xl-cubietruck  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-libvirt  13 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu  14 saverestore-support-check  fail  never pass
 test-amd64-amd64-xl-qemut-win7-amd64  17 guest-stop  fail  never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-check  fail  never pass
 test-armhf-armhf-libvirt-raw  12 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt-raw  13 saverestore-support-check  fail  never pass
 test-amd64-i386-xl-qemut-ws16-amd64  17 guest-stop  fail  never pass
 test-amd64-amd64-xl-qemut-ws16-amd64  17 guest-stop  fail  never pass

version targeted for testing:
 xen  006b2041242129896fbd30135b3dc6f575894a07
baseline version:
 xen  8bfcd2e5fd1c6a8a64cd29aab6114826cd5e5be5

Last test of basis   143158  2019-10-25 10:41:34 Z   18 days
Failing since        143304  2019-10-28 22:06:05 Z   14 days   11 attempts
Testing same since   143479  2019-10-31 16:30:09 Z   11 days    9 attempts


People who touched revisions under test:
  Andrew Cooper 
  Brian 

Re: [Xen-devel] [PATCH for-4.13] xen: Drop bogus BOOLEAN definitions, TRUE and FALSE

2019-11-12 Thread Wei Liu
On Mon, Nov 11, 2019 at 08:24:43PM +, Andrew Cooper wrote:
> actypes.h and efidef.h both define BOOLEAN as unsigned char, which is buggy in
> combination with logic such as "BOOLEAN b = (a & 0x100);"  Redefine BOOLEAN as
> bool instead, which doesn't truncate.
> 
> Both also define TRUE and FALSE, with actypes.h being extra rude and replacing
> whatever exists thus far.  Drop all uses of TRUE and FALSE, replacing them
> with true/false respectively, and drop the declarations.
> 
> Also drop the pointless conditional declaration of NULL while cleaning this
> up.
> 
> Finally, correct all the comments which were found by sed.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Wei Liu 
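
The truncation described in the commit message can be reproduced in isolation. The sketch below assumes 8-bit chars, and the typedef name BOOLEAN_UC is made up to avoid clashing with the real headers:

```c
#include <stdbool.h>

typedef unsigned char BOOLEAN_UC; /* the old-style definition */

/* Conversion to unsigned char keeps only the low 8 bits: 0x100 becomes 0. */
int flag_as_uchar(unsigned int a)
{
    BOOLEAN_UC b = (a & 0x100);
    return b != 0;
}

/* Conversion to bool yields true for any non-zero value. */
int flag_as_bool(unsigned int a)
{
    bool b = (a & 0x100);
    return b;
}
```

With bit 8 set in the input, the unsigned char variant reports the flag as clear while the bool variant reports it as set, which is the difference the patch relies on.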


Re: [Xen-devel] [PATCH v3] tools/hotpug: only attempt to call 'ip route' if there is valid command

2019-11-12 Thread Wei Liu
On Tue, Nov 12, 2019 at 11:01:26AM +0100, Jürgen Groß wrote:
> On 08.11.19 11:31, Wei Liu wrote:
> > On Fri, Nov 08, 2019 at 09:42:33AM +, p...@xen.org wrote:
> > > From: Paul Durrant 
> > > 
> > > The vif-route script should only call 'ip route' when 'ipcmd' has been
> > > set, otherwise it will fail due to an incorrect command string.
> > > 
> > > This patch also adds routes for 'tap' (i.e. emulated) devices as well as
> > > 'vif' (i.e. PV) devices. Empirically offline/online commands relate to
> > > 'vif' devices, and add/remove commands relate to 'tap' devices. However,
> > > this patch treats them equally and uses ${type_if} to distinguish. By
> > > adding cases for add/remove the command list becomes exhaustive and hence
> > > 'ipcmd' is guaranteed to be set.
> > > 
> > > Routes for 'tap' and 'vif' devices are distinguished by a route metric.
> > > Emulated devices are used by HVM guests until they are unplugged, at which
> > > point the PV device becomes active. Thus 'tap' devices should get a higher
> > > priority (i.e. lower numbered) metric than 'vif' devices.
> > > 
> > > There is also one small whitespace fix.
> > > 
> > > Signed-off-by: Paul Durrant 
> > 
> > Acked-by: Wei Liu 
> 
> Release-acked-by: Juergen Gross 

Thanks. Queued.

Also change hotpug to hotplug in the subject line.

Wei.

> 
> 
> Juergen


Re: [Xen-devel] [PATCH v3] tools/hotpug: only attempt to call 'ip route' if there is valid command

2019-11-12 Thread Jürgen Groß

On 08.11.19 11:31, Wei Liu wrote:

On Fri, Nov 08, 2019 at 09:42:33AM +, p...@xen.org wrote:

From: Paul Durrant 

The vif-route script should only call 'ip route' when 'ipcmd' has been
set, otherwise it will fail due to an incorrect command string.

This patch also adds routes for 'tap' (i.e. emulated) devices as well as
'vif' (i.e. PV) devices. Empirically offline/online commands relate to
'vif' devices, and add/remove commands relate to 'tap' devices. However,
this patch treats them equally and uses ${type_if} to distinguish. By
adding cases for add/remove the command list becomes exhaustive and hence
'ipcmd' is guaranteed to be set.

Routes for 'tap' and 'vif' devices are distinguished by a route metric.
Emulated devices are used by HVM guests until they are unplugged, at which
point the PV device becomes active. Thus 'tap' devices should get a higher
priority (i.e. lower numbered) metric than 'vif' devices.

There is also one small whitespace fix.

Signed-off-by: Paul Durrant 


Acked-by: Wei Liu 


Release-acked-by: Juergen Gross 


Juergen


Re: [Xen-devel] [PATCH 1/3] AMD/IOMMU: don't needlessly trigger errors/crashes when unmapping a page

2019-11-12 Thread Jürgen Groß

On 06.11.19 16:18, Jan Beulich wrote:

Unmapping a page which has never been mapped should be a no-op (note how
it already is in case there was no root page table allocated). There's
in particular no need to grow the number of page table levels in use,
and there's also no need to allocate intermediate page tables except
when needing to split a large page.

Signed-off-by: Jan Beulich 
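
The idea that unmapping a never-mapped page is a no-op can be shown on a toy two-level table. This is a simplified sketch with hypothetical types, not the AMD IOMMU code, and it leaves out the large-page split case the description mentions:

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 4

struct leaf_table { uint64_t pte[ENTRIES]; };
struct root_table { struct leaf_table *leaf[ENTRIES]; };

/*
 * Unmap one frame. The walk never allocates: a missing root or leaf
 * table means the frame was never mapped, so there is nothing to do.
 * Returns 1 if a mapping was cleared, 0 otherwise.
 */
int unmap_frame(struct root_table *root, unsigned int frame)
{
    struct leaf_table *leaf;

    if ( root == NULL || frame >= ENTRIES * ENTRIES )
        return 0;

    leaf = root->leaf[frame / ENTRIES];
    if ( leaf == NULL || leaf->pte[frame % ENTRIES] == 0 )
        return 0;

    leaf->pte[frame % ENTRIES] = 0;
    return 1;
}
```

Unmapping a frame whose leaf table is absent leaves the root untouched, so no intermediate table is created just to record that nothing is mapped there.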


Release-acked-by: Juergen Gross 


Juergen


Re: [Xen-devel] [PATCH for-4.13 v2 2/2] x86/ioapic: fix clear_IO_APIC_pin write of raw entries

2019-11-12 Thread Roger Pau Monné
On Mon, Nov 11, 2019 at 10:56:21AM +0100, Jan Beulich wrote:
> On 10.11.2019 10:25, Roger Pau Monne wrote:
> > clear_IO_APIC_pin can be called after the iommu has been enabled, and
> > using raw reads and writes to modify IO-APIC entries that have been
> > setup to use interrupt remapping can lead to issues as some of the
> > fields have different meaning when the IO-APIC entry is setup to point
> > to an interrupt remapping table entry.
> > 
> > The following ASSERT in AMD IOMMU code triggers afterwards as a result
> > of the raw changes to IO-APIC entries performed by clear_IO_APIC_pin.
> > 
> > (XEN) [   10.082154] ENABLING IO-APIC IRQs
> > (XEN) [   10.087789]  -> Using new ACK method
> > (XEN) [   10.093738] Assertion 'get_rte_index(rte) == offset' failed at iommu_intr.c:328
> > 
> > Fix this by making sure that modifications to entries are performed in
> > non raw mode.
> 
> ... when fields are affected which may either have changed meaning
> with interrupt remapping, or which may need mirroring into IRTEs.
> 
> > Reported-by: Sergey Dyasli 
> > Signed-off-by: Roger Pau Monné 
> 
> With the above addition (or something substantially similar)
> Reviewed-by: Jan Beulich 
> Of course the adjustment is easy enough to do while committing.

The adjustment LGTM, please do it at commit time unless there's
something else that requires a resend of the series.

Thanks, Roger.


Re: [Xen-devel] [PATCH] xen/sched: fix a potential issue with core scheduling

2019-11-12 Thread Jürgen Groß

On 12.11.19 10:14, Dario Faggioli wrote:

On Fri, 2019-11-08 at 07:57 +0100, Juergen Gross wrote:

cpupool_online_cpumask() is used by credit and rt scheduler. It returns
all the cpus of a cpupool or all online cpus in case no cpupool is
specified.

The "no cpupool" case can be dropped, as no scheduler other than the
init scheduler will ever work on cpus not associated with any cpupool.


Yes, this is a cool thing about having the init cpupool/idle scheduler
in place. It's even cooler in Credit2, where it will allow us to drop
some of the cpumask_and() cpumask_or() operations.

It's the reason why, even before core scheduling, I was considering
doing an idle scheduler myself.

I'll get to write that patch (the one for Credit2, I mean) at some
point. :-)


As the individual schedulers should only ever work on scheduling
resources instead of individual cpus, their cpupool_online_cpumask()
use should be replaced by cpupool->res_valid.

Note that only with core scheduling active this might result in
potential problems, as with cpu scheduling both masks are identical.

Signed-off-by: Juergen Gross 


Reviewed-by: Dario Faggioli 


And with my release manager hat on:

Release-acked-by: Juergen Gross 


Juergen


[Xen-devel] Lifting commit moratorium

2019-11-12 Thread Jürgen Groß

Committers,

The commit moratorium has been lifted, as we've got the desired
push. You can now commit the release-acked patches again.


Juergen


Re: [Xen-devel] [PATCH] xen/sched: fix a potential issue with core scheduling

2019-11-12 Thread Dario Faggioli
On Fri, 2019-11-08 at 07:57 +0100, Juergen Gross wrote:
> cpupool_online_cpumask() is used by credit and rt scheduler. It returns
> all the cpus of a cpupool or all online cpus in case no cpupool is
> specified.
> 
> The "no cpupool" case can be dropped, as no scheduler other than the
> init scheduler will ever work on cpus not associated with any cpupool.
> 
Yes, this is a cool thing about having the init cpupool/idle scheduler
in place. It's even cooler in Credit2, where it will allow us to drop
some of the cpumask_and() cpumask_or() operations.

It's the reason why, even before core scheduling, I was considering
doing an idle scheduler myself.

I'll get to write that patch (the one for Credit2, I mean) at some
point. :-)

> As the individual schedulers should only ever work on scheduling
> resources instead of individual cpus, their cpupool_online_cpumask()
> use should be replaced by cpupool->res_valid.
> 
> Note that only with core scheduling active this might result in
> potential problems, as with cpu scheduling both masks are identical.
> 
> Signed-off-by: Juergen Gross 
>
Reviewed-by: Dario Faggioli 

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/

[Xen-devel] [xen-unstable test] 144020: tolerable FAIL - PUSHED

2019-11-12 Thread osstest service owner
flight 144020 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/144020/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-migrupgrade  11 xen-boot/dst_host  fail in 144001  pass in 144020
 test-amd64-amd64-xl-qcow2  19 guest-start/debian.repeat  fail in 144001  pass in 144020
 test-amd64-amd64-libvirt-vhd  17 guest-start/debian.repeat  fail in 144001  pass in 144020
 test-amd64-i386-xl-raw  19 guest-start/debian.repeat  fail in 144001  pass in 144020
 test-armhf-armhf-xl-vhd  15 guest-start/debian.repeat  fail in 144001  pass in 144020
 test-armhf-armhf-libvirt-raw  15 guest-start/debian.repeat  fail in 144001  pass in 144020
 test-amd64-amd64-xl-credit1  18 guest-localmigrate/x10  fail  pass in 144001

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win10-i386  10 windows-install  fail in 144001  never pass
 test-amd64-i386-xl-qemuu-win10-i386  10 windows-install  fail in 144001  never pass
 test-amd64-amd64-xl-qemut-win10-i386  10 windows-install  fail in 144001  never pass
 test-amd64-i386-xl-qemut-win10-i386  10 windows-install  fail in 144001  never pass
 test-amd64-amd64-xl-qemut-win7-amd64  17 guest-stop  fail  like 142750
 test-amd64-i386-xl-qemuu-win7-amd64  17 guest-stop  fail  like 142750
 test-amd64-amd64-xl-qemuu-win7-amd64  17 guest-stop  fail  like 142750
 test-amd64-i386-xl-qemut-win7-amd64  17 guest-stop  fail  like 142750
 test-armhf-armhf-xl-rtds  16 guest-start/debian.repeat  fail  like 142750
 test-armhf-armhf-libvirt-raw  13 saverestore-support-check  fail  like 142750
 test-amd64-amd64-xl-qemuu-ws16-amd64  17 guest-stop  fail  like 142750
 test-armhf-armhf-libvirt  14 saverestore-support-check  fail  like 142750
 test-amd64-amd64-xl-qemut-ws16-amd64  17 guest-stop  fail  like 142750
 test-amd64-i386-xl-qemuu-ws16-amd64  17 guest-stop  fail  like 142750
 test-amd64-amd64-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt  13 migrate-support-check  fail  never pass
 test-amd64-i386-xl-pvshim  12 guest-start  fail  never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-check  fail  never pass
 test-amd64-i386-libvirt  13 migrate-support-check  fail  never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check  fail  never pass
 test-amd64-amd64-qemuu-nested-amd  17 debian-hvm-install/l1/l2  fail  never pass
 test-amd64-amd64-libvirt-vhd  12 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-xsm  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl-thunderx  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl-thunderx  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check  fail  never pass
 test-arm64-arm64-xl  13 migrate-support-check  fail  never pass
 test-arm64-arm64-xl  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-rtds  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-rtds  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl  13 migrate-support-check  fail  never pass
 test-armhf-armhf-xl  14 saverestore-support-check  fail  never pass
 test-armhf-armhf-libvirt-raw 12 

Re: [Xen-devel] [PATCH] AMD/IOMMU: Fix passthrough following c/s d7cfeb7c13e

2019-11-12 Thread Jürgen Groß

On 11.11.19 21:55, Andrew Cooper wrote:
> "AMD/IOMMU: don't blindly allocate interrupt remapping tables" introduces a
> call at runtime from amd_iommu_add_device() to amd_iommu_set_intremap_table()
> which is still marked as __init.
>
> On one AMD Rome machine we have, this results in a crash the moment we try to
> use an SR-IOV VF in a VM.
>
> Reported-by: Jennifer Herbert
> Signed-off-by: Andrew Cooper

Release-acked-by: Juergen Gross 


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] AMD/IOMMU: Fix passthrough following c/s d7cfeb7c13e

2019-11-12 Thread Jan Beulich
On 11.11.2019 21:55, Andrew Cooper wrote:
> "AMD/IOMMU: don't blindly allocate interrupt remapping tables" introduces a
> call at runtime from amd_iommu_add_device() to amd_iommu_set_intremap_table()
> which is still marked as __init.
> 
> On one AMD Rome machine we have, this results in a crash the moment we try to
> use an SR-IOV VF in a VM.
> 
> Reported-by: Jennifer Herbert 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

I'm sorry for the breakage - I recall having made the change, so I must
have lost it at some point.

Jan
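
For readers unfamiliar with the failure mode described in the patch: code
marked __init is only valid until the init sections are discarded at the end
of boot, so a later call lands in memory that is no longer there. Below is a
minimal userspace model of that lifetime rule. Every name in it is invented
for the illustration and none of it is Xen code; in particular, the model
returns an error where the hypervisor would crash.

```c
#include <stddef.h>

/* Hypothetical model: a boot-only routine is reachable through this hook. */
typedef int (*setup_fn)(int);

/* Stands in for a routine that lives in init-only memory. */
int setup_table_model(int dev)
{
    return dev + 1;
}

setup_fn init_only_hook = setup_table_model;

/* Models the end of boot, when init-only code is discarded. */
void discard_init_model(void)
{
    init_only_hook = NULL;
}

/*
 * Models a runtime caller. The model reports -1 once the routine is gone;
 * the real hypervisor has no such check and crashes instead.
 */
int add_device_model(int dev)
{
    if (init_only_hook == NULL)
        return -1;
    return init_only_hook(dev);
}
```

Calling add_device_model() before discard_init_model() succeeds; the same
call afterwards fails, which is the situation a runtime caller of an __init
function ends up in.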


Re: [Xen-devel] [PATCH for-4.13] xen: Drop bogus BOOLEAN definitions, TRUE and FALSE

2019-11-12 Thread Jan Beulich
On 11.11.2019 21:24, Andrew Cooper wrote:
> --- a/xen/arch/x86/x86_64/mm.c
> +++ b/xen/arch/x86/x86_64/mm.c
> @@ -1077,7 +1077,7 @@ long do_set_segment_base(unsigned int which, unsigned long base)
>  }
>  
>  
> -/* Returns TRUE if given descriptor is valid for GDT or LDT. */
> +/* Returns true if given descriptor is valid for GDT or LDT. */
>  int check_descriptor(const struct domain *dom, seg_desc_t *d)

Wouldn't changes like this one better be accompanied by also adjusting
the return type of the function (there are more examples further down
in common/timer.c)?
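
To illustrate the point about the return type, here is a minimal sketch. It
is not the actual check_descriptor() logic: struct desc_model, DESC_PRESENT
and both helper names are made up for this example. An int-returning
predicate can hand back any non-zero value, whereas a bool-returning one
normalises the result to true or false.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a descriptor; not Xen's seg_desc_t. */
struct desc_model {
    uint32_t a, b;
};

/* Assumed flag for the example: bit 15 of the high word. */
#define DESC_PRESENT (1u << 15)

/* "int" predicate: returns the raw masked value, i.e. 0 or 0x8000. */
int desc_present_int(const struct desc_model *d)
{
    return d->b & DESC_PRESENT;
}

/* "bool" predicate: the conversion to bool yields exactly false or true. */
bool desc_present_bool(const struct desc_model *d)
{
    return d->b & DESC_PRESENT;
}
```

With the flag set, desc_present_int() returns 0x8000, so a caller comparing
the result against 1 would misfire, while desc_present_bool() returns true.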

> --- a/xen/include/asm-arm/arm64/efibind.h
> +++ b/xen/include/asm-arm/arm64/efibind.h
> @@ -107,7 +107,7 @@ typedef uint64_t   UINTN;
>  #define POST_CODE(_Data)
>  
>  
> -#define BREAKPOINT()        while (TRUE);    // Make it hang on Bios[Dbg]32
> +#define BREAKPOINT()        while (true);    // Make it hang on Bios[Dbg]32

You do realize that this and other EFI headers (and perhaps also
ACPI ones) are largely verbatim imports from other projects,
updating of which will become less straightforward by such
replacements? When pulling in the EFI ones I intentionally did not
fiddle with them more than absolutely necessary.

If it wasn't for this, I'd have ack-ed the patch despite the other
remark above.

> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -607,7 +607,7 @@ int __must_check donate_page(struct domain *d, struct page_info *page,
>  #define RAM_TYPE_UNUSABLE 0x0004
>  #define RAM_TYPE_ACPI 0x0008
>  #define RAM_TYPE_UNKNOWN  0x0010
> -/* TRUE if the whole page at @mfn is of the requested RAM type(s) above. */
> +/* true if the whole page at @mfn is of the requested RAM type(s) above. */
>  int page_is_ram_type(unsigned long mfn, unsigned long mem_type);

In other comments I already wasn't sure about such replacements, but
let them be. Here, however, you violate coding style by using "true"
instead of "True" (the function returning "int" for now doesn't even
allow the excuse of meaning the identifier rather than the word).

Jan

