Re: [Xen-devel] [Qemu-devel] [PATCH v3 01/46] Replace all occurrences of __FUNCTION__ with __func__

2017-11-07 Thread Markus Armbruster
Eric Blake  writes:

> On 11/07/2017 04:12 AM, Markus Armbruster wrote:
>> Juan Quintela  writes:
>> 
>>> Alistair Francis  wrote:
 Replace all occurrences of __FUNCTION__, except for the check in checkpatch,
 with the non-GCC-specific __func__.

>
 +++ b/audio/audio_int.h
 @@ -253,7 +253,7 @@ static inline int audio_ring_dist (int dst, int src, int len)
  #define AUDIO_STRINGIFY(n) AUDIO_STRINGIFY_(n)
  
  #if defined _MSC_VER || defined __GNUC__
 -#define AUDIO_FUNC __FUNCTION__
 +#define AUDIO_FUNC __func__
  #else
  #define AUDIO_FUNC __FILE__ ":" AUDIO_STRINGIFY (__LINE__)
  #endif
>>>
>>> Unrelated to this patch ...
>>> Do we really support other compilers than msc and gcc?
>> 
>> Let me rephrase the question: do we really support compilers that don't
>> understand __func__?  The presence of numerous unconditional uses of
>> __func__ in the tree means the answer is no.  Let's replace AUDIO_FUNC
>> by plain __func__.
>
> Answered elsewhere in patch 3/46 (where we DO replace AUDIO_FUNC by
> __func__).

I see.

Put 03/46 first, so we don't have to mess with AUDIO_FUNC twice?
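
For reference: __func__ is standard C99, while __FUNCTION__ is the older
GCC/MSVC spelling. A minimal standalone sketch of the forms involved,
including the file:line fallback AUDIO_FUNC uses for other compilers
(illustrative only, not part of the patch):

    #include <stdio.h>

    #define STRINGIFY_(n) #n
    #define STRINGIFY(n)  STRINGIFY_(n)
    /* fallback for compilers providing neither __func__ nor __FUNCTION__ */
    #define FUNC_FALLBACK __FILE__ ":" STRINGIFY(__LINE__)

    int main(void)
    {
        printf("%s\n", __func__);       /* prints "main" (C99) */
        printf("%s\n", FUNC_FALLBACK);  /* prints e.g. "demo.c:12" */
        return 0;
    }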

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/hvm: do not register hpet mmio during s3 cycle

2017-11-07 Thread Jan Beulich
>>> On 07.11.17 at 21:39,  wrote:
> Do it once at domain creation (hpet_init).
> 
> Sleep -> Resume cycles will end up crashing an HVM guest with hpet as
> the sequence during resume takes the path:
> -> hvm_s3_suspend
>   -> hpet_reset
> -> hpet_deinit
> -> hpet_init
>   -> register_mmio_handler
> -> hvm_next_io_handler
> 
> register_mmio_handler will use a new io handler each time, until
> eventually it reaches NR_IO_HANDLERS, then hvm_next_io_handler calls
> domain_crash.
> 
> Signed-off-by: Eric Chanudet 

This is certainly worthwhile to consider for 4.10 - please Cc
Julien on v2.

> +void hpet_reinit(struct domain *d)

static

> +{
> +HPETState *h = domain_vhpet(d);
> +
> +if ( !has_vhpet(d) )
> +return;
> +
> +hpet_set(h);

The local variable, being used only once, isn't needed.

> @@ -698,7 +713,7 @@ void hpet_deinit(struct domain *d)
>  void hpet_reset(struct domain *d)
>  {
>  hpet_deinit(d);
> -hpet_init(d);
> +hpet_reinit(d);
>  }

This being the only caller, it then becomes questionable whether
hpet_reinit() needs to be a separate function, or wouldn't better
be inlined here.
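
Inlined, that would look roughly like this (a sketch against the code in
this patch, not a tested change; it also drops the single-use local
variable mentioned above):

    void hpet_reset(struct domain *d)
    {
        hpet_deinit(d);

        if ( !has_vhpet(d) )
            return;

        hpet_set(domain_vhpet(d));
    }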

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/2] xen: Add support for initializing 16550 UART using ACPI

2017-11-07 Thread Bhupinder Thakur
Hi Julien,

On 2 November 2017 at 17:45, Julien Grall  wrote:
> Hi Bhupinder,
>
> Please write a cover letter, even if it is small, when you send a series
> with multiple patches.
>
>
> On 02/11/17 10:13, Bhupinder Thakur wrote:
>>
>> Currently, Xen supports only DT based initialization of 16550 UART.
>> This patch adds support for initializing 16550 UART using ACPI SPCR table.
>>
>> Signed-off-by: Bhupinder Thakur 
>> ---
>> CC: Andrew Cooper 
>> CC: George Dunlap 
>> CC: Ian Jackson 
>> CC: Jan Beulich 
>> CC: Konrad Rzeszutek Wilk 
>> CC: Stefano Stabellini 
>> CC: Tim Deegan 
>> CC: Wei Liu 
>> CC: Julien Grall 
>>
>>   xen/drivers/char/ns16550.c  | 57
>> +
>>   xen/include/xen/8250-uart.h |  1 +
>>   2 files changed, 58 insertions(+)
>>
>> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
>> index e0f8199..b3f6d85 100644
>> --- a/xen/drivers/char/ns16550.c
>> +++ b/xen/drivers/char/ns16550.c
>> @@ -1538,6 +1538,63 @@ DT_DEVICE_START(ns16550, "NS16550 UART", DEVICE_SERIAL)
>>   DT_DEVICE_END
>> #endif /* HAS_DEVICE_TREE */
>> +
>> +#ifdef CONFIG_ACPI
>
>
> The code below is going to break x86 build. You need to do #if
> defined(CONFIG_ACPI) && defined(CONFIG_ARM)
>
>
>> +#include 
>> +
>> +static int __init ns16550_acpi_uart_init(const void *data)
>> +{
>> +struct ns16550 *uart;
>> +acpi_status status;
>> +struct acpi_table_spcr *spcr = NULL;
>> +
>> +status = acpi_get_table(ACPI_SIG_SPCR, 0,
>> +(struct acpi_table_header **)&spcr);
>> +
>> +if ( ACPI_FAILURE(status) )
>> +{
>> +printk("ns16550: Failed to get SPCR table\n");
>> +return -EINVAL;
>> +}
>> +
>> +uart = &ns16550_com[0];
>> +
>> +ns16550_init_common(uart);
>> +
>> +uart->baud  = BAUD_AUTO;
>> +uart->data_bits = 8;
>> +uart->parity= spcr->parity;
>> +uart->stop_bits = spcr->stop_bits;
>> +uart->io_base = spcr->serial_port.address;
>> +uart->irq = spcr->interrupt;
>> +uart->reg_width = spcr->serial_port.bit_width/8;
>
>
> width / 8;
>
>> +uart->reg_shift = 0;
>> +uart->io_size = UART_MAX_REG<<uart->reg_shift;
>
> space before and after <<.
>
> Also, io_size seems to be computed differently in pci_uart_config. I am not
> sure why the difference here?

In pci_uart_config:

uart->io_size = max(8U << param->reg_shift,
 param->uart_offset);

I was not sure which param to consider to get the uart_offset. Since
the max register that ns16550 uses is UART_USR, I
calculated the io_size based on that.
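
For comparison, the two sizing rules with stand-in values (a standalone
sketch, not a patch; uart_offset = 8 is an assumed example value):

    #include <stdio.h>

    #define UART_USR     0x1f
    #define UART_MAX_REG (UART_USR + 1)

    int main(void)
    {
        unsigned int reg_shift = 0, uart_offset = 8;
        unsigned int pci = (8U << reg_shift) > uart_offset
                           ? (8U << reg_shift) : uart_offset;
        unsigned int acpi = UART_MAX_REG << reg_shift;

        /* PCI path: 8 registers scaled by reg_shift, or the device's
         * uart_offset, whichever is larger. ACPI path as posted: the
         * highest register the driver touches, UART_USR. */
        printf("pci=%u acpi=%u\n", pci, acpi);
        return 0;
    }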
>
>> +
>> +irq_set_type(spcr->interrupt, spcr->interrupt_type);
>> +
>> +uart->vuart.base_addr = uart->io_base;
>> +uart->vuart.size = uart->io_size;
>> +uart->vuart.data_off = UART_THR<<uart->reg_shift;
>
> Ditto for the space.
>
>> +uart->vuart.status_off = UART_LSR<<uart->reg_shift;
>
> Ditto.
>
>> +uart->vuart.status = UART_LSR_THRE|UART_LSR_TEMT;
>
>
> Ditto.
>
> Also, the code looks very similar to the DT version. Is there any way to
> share it?
>
>
>> +
>> +/* Register with generic serial driver. */
>> +serial_register_uart(uart - ns16550_com, &ns16550_driver, uart);
>> +
>> +return 0;
>> +}
>> +
>> +ACPI_DEVICE_START(ns16550c, "16550 COMPAT UART", DEVICE_SERIAL)
>> +.class_type = ACPI_DBG2_16550_COMPATIBLE,
>> +.init = ns16550_acpi_uart_init,
>> +ACPI_DEVICE_END
>> +ACPI_DEVICE_START(ns16550s, "16550 SUBSET UART", DEVICE_SERIAL)
>> +.class_type = ACPI_DBG2_16550_SUBSET,
>> +.init = ns16550_acpi_uart_init,
>> +ACPI_DEVICE_END
>> +
>> +#endif
>>   /*
>>* Local variables:
>>* mode: C
>> diff --git a/xen/include/xen/8250-uart.h b/xen/include/xen/8250-uart.h
>> index 5c3bac3..1b3e137 100644
>> --- a/xen/include/xen/8250-uart.h
>> +++ b/xen/include/xen/8250-uart.h
>> @@ -35,6 +35,7 @@
>>   #define UART_USR  0x1f/* Status register (DW) */
>>   #define UART_DLL  0x00/* divisor latch (ls) (DLAB=1) */
>>   #define UART_DLM  0x01/* divisor latch (ms) (DLAB=1) */
>> +#define UART_MAX_REG  (UART_USR+1)
>> /* Interrupt Enable Register */
>>   #define UART_IER_ERDAI0x01/* rx data recv'd   */
>>
>
> Cheers,
>
> --
> Julien Grall

Regards,
Bhupinder

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Unable to create guest PV domain on OMAP5432

2017-11-07 Thread Jayadev Kumaran
Hello all,

I'm trying to implement Xen hypervisor support on OMAP5432. I have followed
the steps as in
https://wiki.xenproject.org/wiki/Xen_ARM_with_Virtualization_Extensions/OMAP5432_uEVM
for the initial setup. I'm able to see domain 0 come up successfully.

root@omap5-evm:~# /etc/init.d/xencommons start
Starting /usr/local/sbin/xenstored...
Setting domain 0 name, domid and JSON config...
Done setting up Dom0
Starting xenconsoled...
Starting QEMU as disk backend for dom0

root@omap5-evm:~# xl list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0   512     2     r-----      11.5
I have used the below configuration file for creating a guest domain.
"
name = "ubuntu"

kernel = "/home/root/ubuntu/vmlinuz"
ramdisk = "/home/root/ubuntu/initrd.gz"
#bootloader = "/usr/lib/xen-4.4/bin/pygrub"

memory = 1024
vcpus = 1

device_model_version = 'qemu-xen-traditional'

disk = [
 '/dev/vg0/domu0,raw,xvda,rw'
   ]

"

But when I try to create the guest domain using the xl create command, I
get the error below.

root@omap5-evm:/# xl create -d -c /etc/xen/config.d/ubuntu.cfg
Parsing config from /etc/xen/config.d/ubuntu.cfg
{"c_info": {"type": "pv", "name": "ubuntu",
    "uuid": "d7dd7835-61e3-46ce-b76f-140c0f9673fe",
    "run_hotplug_scripts": "True"},
 "b_info": {"max_vcpus": 1, "avail_vcpus": [0], "max_memkb": 1048576,
    "target_memkb": 1048576, "shadow_memkb": 9216,
    "device_model_version": "qemu_xen_traditional", "sched_params": {},
    "claim_mode": "True", "kernel": "/home/root/ubuntu/vmlinuz",
    "ramdisk": "/home/root/ubuntu/initrd.gz", "type.pv": {}, "arch_arm": {}},
 "disks": [{"pdev_path": "/dev/vg0/domu0", "vdev": "xvda",
    "format": "raw", "readwrite": 1}],
 "on_reboot": "restart", "on_soft_reset": "soft_reset"}
(XEN) grant_table.c:1688:d0v0 Expanding d4 grant table from 0 to 1 frames
(XEN) memory.c:238:d0v0 Could not allocate order=18 extent: id=4 memflags=0xc0 (0 of 1)
libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: /etc/xen/scripts/block add [1612] exited with error status 1
libxl: error: libxl_create.c:1278:domcreate_launch_dm: Domain 4:unable to add disk devices
libxl: error: libxl_domain.c:1000:libxl__destroy_domid: Domain 4:Non-existant domain
libxl: error: libxl_domain.c:959:domain_destroy_callback: Domain 4:Unable to destroy guest
libxl: error: libxl_domain.c:886:domain_destroy_cb: Domain 4:Destruction of domain failed
Any input would be highly appreciated.

Thanks and Regards,
Jay
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [linux-linus test] 115643: regressions - FAIL

2017-11-07 Thread osstest service owner
flight 115643 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/115643/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop   fail REGR. vs. 114682

Tests which are failing intermittently (not blocking):
 test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat fail pass in 115628
 test-amd64-amd64-libvirt-vhd 18 guest-start.2  fail pass in 115628
 test-armhf-armhf-xl-credit2  16 guest-start/debian.repeat  fail pass in 115628
 test-armhf-armhf-xl-vhd  15 guest-start/debian.repeat  fail pass in 115628

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check   fail like 114682
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop          fail like 114682
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop           fail like 114682
 test-armhf-armhf-libvirt 14 saverestore-support-check       fail like 114682
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop           fail like 114682
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop          fail like 114682
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop          fail like 114682
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check   fail like 114682
 test-amd64-amd64-libvirt 13 migrate-support-check           fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check       fail never pass
 test-amd64-i386-libvirt 13 migrate-support-check            fail never pass
 test-amd64-i386-libvirt-xsm 13 migrate-support-check        fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-arndale 13 migrate-support-check        fail never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check    fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check       fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check       fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check      fail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-credit2 13 migrate-support-check        fail never pass
 test-armhf-armhf-xl-credit2 14 saverestore-support-check    fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check     fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl 13 migrate-support-check                fail never pass
 test-armhf-armhf-xl 14 saverestore-support-check            fail never pass
 test-armhf-armhf-xl-xsm 13 migrate-support-check            fail never pass
 test-armhf-armhf-xl-xsm 14 saverestore-support-check        fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check           fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check           fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check       fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop           fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop           fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check       fail never pass
 test-armhf-armhf-xl-vhd 12 migrate-support-check            fail never pass
 test-armhf-armhf-xl-vhd 13 saverestore-support-check        fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install     fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install      fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install     fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install      fail never pass

version targeted for testing:
 linux                e4880bc5dfb1f02b152e62a894b5c6f3e995b3cf
baseline version:
 linux                ebe6e90ccc6679cb01d2b280e4b61e6092d4bedb

Last test of basis   114682  2017-10-18 09:54:11 Z   20 days
Failing since        114781  2017-10-20 01:00:47 Z   19 days   32 attempts
Testing same since   115628  2017-11-07 01:20:30 Z    1 days    2 attempts


522 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386 

[Xen-devel] [qemu-mainline test] 115649: regressions - FAIL

2017-11-07 Thread osstest service owner
flight 115649 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/115649/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl 18 guest-localmigrate/x10               fail REGR. vs. 114507
 test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat  fail REGR. vs. 114507
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat   fail REGR. vs. 114507
 test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat    fail REGR. vs. 114507
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 114507
 test-armhf-armhf-xl-vhd 15 guest-start/debian.repeat        fail REGR. vs. 114507

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop          fail blocked in 114507
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check   fail like 114507
 test-armhf-armhf-libvirt 14 saverestore-support-check       fail like 114507
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop           fail like 114507
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check   fail like 114507
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start              fail never pass
 test-amd64-i386-libvirt-xsm 13 migrate-support-check        fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-check           fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start                fail never pass
 test-amd64-i386-libvirt 13 migrate-support-check            fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check       fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-arndale 13 migrate-support-check        fail never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check    fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check       fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-xl-xsm 13 migrate-support-check            fail never pass
 test-armhf-armhf-xl-xsm 14 saverestore-support-check        fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check      fail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check  fail never pass
 test-armhf-armhf-xl 13 migrate-support-check                fail never pass
 test-armhf-armhf-xl 14 saverestore-support-check            fail never pass
 test-armhf-armhf-xl-credit2 13 migrate-support-check        fail never pass
 test-armhf-armhf-xl-credit2 14 saverestore-support-check    fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check     fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check       fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check           fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check       fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check           fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check       fail never pass
 test-armhf-armhf-xl-vhd 12 migrate-support-check            fail never pass
 test-armhf-armhf-xl-vhd 13 saverestore-support-check        fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop           fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install      fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install     fail never pass

version targeted for testing:
 qemuu                5ca7a3cba468736cfe555887af1f6ba754f6eac9
baseline version:
 qemuu                f90ea7ba7c5ae7010ee0ce062207ae42530f57d6

Last test of basis   114507  2017-10-15 01:03:38 Z   23 days
Failing since        114546  2017-10-16 12:16:28 Z   22 days   59 attempts
Testing same since   115649  2017-11-07 16:48:27 Z    0 days    1 attempts


People who touched revisions under test:
  Aaron Lindsay 
  Alberto Garcia 
  Aleksandr Bezzubikov 
  Alex Bennée 
  Alexey Kardashevskiy 
  Alexey Perevalov 
  Alistair Francis 
  Amarnath Valluri 
  Andreas Färber 
  Andrew Baumann 
  Anthoine Bourgeois 
  Anthony PERARD 
  Artyom Tarasenko 
  Bishara AbuHattoum 
  Carlo Marcelo Arenas Belón 
  Chen Hanxiao 

Re: [Xen-devel] [libvirt] Libvirt config converter can't handle file not ending with new line

2017-11-07 Thread Jim Fehlig

On 11/07/2017 08:54 AM, Wim ten Have wrote:

On Tue, 7 Nov 2017 12:20:05 +
Wei Liu  wrote:


On Mon, Nov 06, 2017 at 09:41:01PM -0700, Jim Fehlig wrote:

On 10/30/2017 06:17 AM, Wei Liu wrote:

Hi Jim

I discover a problem when using xen_xl converter. When the file in
question doesn't end with a new line, I get the following error:

error: configuration file syntax error: memory conf:53: expecting a value


I'm not able to reproduce this issue. The libvirt.git tree I tried was a bit
dated, but even after updating to latest master I can't reproduce.
   

After digging a bit (but haven't read libvirt code), it appears that the
file didn't end with a new line.


I tried several files without ending new lines, going both directions
(domxml-to-native and domxml-from-native), but didn't see the mentioned
error. Perhaps your config is revealing another bug which is being
improperly reported. Can you provide an example of the problematic config?
   


I tried to get the exact file that caused the problem but it is already
destroyed by osstest.

A similar file:

http://logs.test-lab.xenproject.org/osstest/logs/115436/test-amd64-amd64-libvirt-pair/debian.guest.osstest.cfg

If you hexdump -C it, you can see the last character is 0a. Remove it and
feed the file into the converter.
Wei.


   The phenomenon you point out is indeed weird.  And my first response
   is that this is a bug parsing the cfg input.  I did a little exploring
   and think that src/util/virconf.c (virConfParseLong(), virConfParseValue())
   should be reworked as pointed out in the context diffs below.

 git diff
diff --git a/src/util/virconf.c b/src/util/virconf.c
index 39c2bd917..bc8e57ec3 100644
--- a/src/util/virconf.c
+++ b/src/util/virconf.c
@@ -352,7 +352,7 @@ virConfParseLong(virConfParserCtxtPtr ctxt, long long *val)
 } else if (CUR == '+') {
 NEXT;
 }
-if ((ctxt->cur >= ctxt->end) || (!c_isdigit(CUR))) {
+if ((ctxt->cur > ctxt->end) || (!c_isdigit(CUR))) {
 virConfError(ctxt, VIR_ERR_CONF_SYNTAX, _("unterminated 
number"));
 return -1;
 }
@@ -456,7 +456,7 @@ virConfParseValue(virConfParserCtxtPtr ctxt)
 long long l = 0;

 SKIP_BLANKS;
-if (ctxt->cur >= ctxt->end) {
+if (ctxt->cur > ctxt->end) {
 virConfError(ctxt, VIR_ERR_CONF_SYNTAX, _("expecting a 
value"));
 return NULL;
 }

   I did not go beyond this yet.


Thanks Wim. I noticed Cole fixed a similar issue when parsing content from a 
file with commit 3cc2a9e0d4. But I think instead of replicating that fix in 
virConfReadString(), we should just set the end of content correctly in 
virConfParse(). I've sent a patch along those lines that fixes Wei's test case 
and doesn't regress Cole's test case


https://www.redhat.com/archives/libvir-list/2017-November/msg00286.html

Regards,
Jim
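
For anyone wanting to reproduce the boundary condition without a full
config file, a simplified standalone sketch (not libvirt's actual parser
code):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char *content = "memory = 512";        /* no trailing '\n' */
        const char *end = content + strlen(content);
        const char *cur = end;  /* the parser has consumed the whole value */

        if (cur >= end)         /* the test the diffs above relax */
            printf("would report: expecting a value\n");
        return 0;
    }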


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 115640: regressions - FAIL

2017-11-07 Thread osstest service owner
flight 115640 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/115640/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-vhd 15 guest-start/debian.repeat        fail REGR. vs. 115526

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail in 115624 pass in 115640
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail pass in 115624

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop          fail in 115624 like 115526
 test-amd64-amd64-xl-qcow2 19 guest-start/debian.repeat      fail in 115624 like 115526
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat   fail like 115496
 test-armhf-armhf-libvirt 14 saverestore-support-check       fail like 115526
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check   fail like 115526
 test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat  fail like 115526
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop           fail like 115526
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop           fail like 115526
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop           fail like 115526
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop          fail like 115526
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check   fail like 115526
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop          fail like 115526
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop          fail like 115526
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start              fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start                fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check       fail never pass
 test-amd64-i386-libvirt-xsm 13 migrate-support-check        fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-check           fail never pass
 test-amd64-i386-libvirt 13 migrate-support-check            fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-check      fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 13 migrate-support-check        fail never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check    fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check       fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check           fail never pass
 test-armhf-armhf-xl-xsm 13 migrate-support-check            fail never pass
 test-armhf-armhf-xl-xsm 14 saverestore-support-check        fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check     fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl 13 migrate-support-check                fail never pass
 test-armhf-armhf-xl 14 saverestore-support-check            fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check       fail never pass
 test-armhf-armhf-xl-vhd 12 migrate-support-check            fail never pass
 test-armhf-armhf-xl-vhd 13 saverestore-support-check        fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check           fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check       fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check      fail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-credit2 13 migrate-support-check        fail never pass
 test-armhf-armhf-xl-credit2 14 saverestore-support-check    fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop           fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check       fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install      fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install     fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install      fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install     fail never pass

version targeted for testing:
 xen  1f61c07d79abda1e747d70d83edffe4efca48e17
baseline version:
 xen  ff93dc55431517ed29c70dbff6721c6b0803acf9

Last test of basis   115526  2017-11-03 13:51:00 Z    4 days
Failing since            11  2017-11-04 09:34:51 Z    3 days    6 attempts
Testing same since   115624  2017-11-06 21:58:54 Z    1 days    2 attempts


People who touched revisions under test:
  Andrew Cooper 
  Julien Grall 

[Xen-devel] [PATCH] x86/hvm: do not register hpet mmio during s3 cycle

2017-11-07 Thread Eric Chanudet
Do it once at domain creation (hpet_init).

Sleep -> Resume cycles will end up crashing an HVM guest with hpet as
the sequence during resume takes the path:
-> hvm_s3_suspend
  -> hpet_reset
-> hpet_deinit
-> hpet_init
  -> register_mmio_handler
-> hvm_next_io_handler

register_mmio_handler will use a new io handler each time, until
eventually it reaches NR_IO_HANDLERS, then hvm_next_io_handler calls
domain_crash.
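
For context, the exhaustion happens because the per-domain handler array
is append-only; a simplified sketch of hvm_next_io_handler() (condensed
from xen/arch/x86/hvm/intercept.c):

    struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
    {
        unsigned int i = d->arch.hvm_domain.io_handler_count++;

        if ( i == NR_IO_HANDLERS )
        {
            domain_crash(d);
            return NULL;
        }

        return &d->arch.hvm_domain.io_handler[i];
    }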

Signed-off-by: Eric Chanudet 
---
 xen/arch/x86/hvm/hpet.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index 3ea895a0fb..7635f1a644 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -635,14 +635,10 @@ static int hpet_load(struct domain *d, hvm_domain_context_t *h)
 
 HVM_REGISTER_SAVE_RESTORE(HPET, hpet_save, hpet_load, 1, HVMSR_PER_DOM);
 
-void hpet_init(struct domain *d)
+static void hpet_set(HPETState *h)
 {
-HPETState *h = domain_vhpet(d);
 int i;
 
-if ( !has_vhpet(d) )
-return;
-
 memset(h, 0, sizeof(HPETState));
 
 rwlock_init(&h->lock);
@@ -668,11 +664,30 @@ void hpet_init(struct domain *d)
 h->hpet.comparator64[i] = ~0ULL;
 h->pt[i].source = PTSRC_isa;
 }
+}
 
+void hpet_init(struct domain *d)
+{
+HPETState *h = domain_vhpet(d);
+
+if ( !has_vhpet(d) )
+return;
+
+hpet_set(h);
 register_mmio_handler(d, &hpet_mmio_ops);
 d->arch.hvm_domain.params[HVM_PARAM_HPET_ENABLED] = 1;
 }
 
+void hpet_reinit(struct domain *d)
+{
+HPETState *h = domain_vhpet(d);
+
+if ( !has_vhpet(d) )
+return;
+
+hpet_set(h);
+}
+
 void hpet_deinit(struct domain *d)
 {
 int i;
@@ -698,7 +713,7 @@ void hpet_deinit(struct domain *d)
 void hpet_reset(struct domain *d)
 {
 hpet_deinit(d);
-hpet_init(d);
+hpet_reinit(d);
 }
 
 /*
-- 
2.14.2

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [Qemu-devel] [PATCH v3 01/46] Replace all occurrences of __FUNCTION__ with __func__

2017-11-07 Thread Eric Blake
On 11/07/2017 04:12 AM, Markus Armbruster wrote:
> Juan Quintela  writes:
> 
>> Alistair Francis  wrote:
>>> Replace all occurrences of __FUNCTION__, except for the check in checkpatch,
>>> with the non-GCC-specific __func__.
>>>

>>> +++ b/audio/audio_int.h
>>> @@ -253,7 +253,7 @@ static inline int audio_ring_dist (int dst, int src, int len)
>>>  #define AUDIO_STRINGIFY(n) AUDIO_STRINGIFY_(n)
>>>  
>>>  #if defined _MSC_VER || defined __GNUC__
>>> -#define AUDIO_FUNC __FUNCTION__
>>> +#define AUDIO_FUNC __func__
>>>  #else
>>>  #define AUDIO_FUNC __FILE__ ":" AUDIO_STRINGIFY (__LINE__)
>>>  #endif
>>
>> Unrelated to this patch ...
>> Do we really support other compilers than msc and gcc?
> 
> Let me rephrase the question: do we really support compilers that don't
> understand __func__?  The presence of numerous unconditional uses of
> __func__ in the tree means the answer is no.  Let's replace AUDIO_FUNC
> by plain __func__.

Answered elsewhere in patch 3/46 (where we DO replace AUDIO_FUNC by
__func__).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 2/5] x86/pvclock: add setter for pvclock_pvti_cpu0_va

2017-11-07 Thread Joao Martins
On 11/06/2017 04:09 PM, Paolo Bonzini wrote:
> On 19/10/2017 15:39, Joao Martins wrote:
>> Right now there is only a pvclock_pvti_cpu0_va() which is defined
>> on kvmclock since:
>>
>> commit dac16fba6fc5
>> ("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap")
>>
>> The only user of this interface so far is kvm. This commit adds a
>> setter function for the pvti page and moves pvclock_pvti_cpu0_va
>> to pvclock, which is a more generic place to have it; and would
>> allow other PV clocksources to use it, such as Xen.
>>
>> Signed-off-by: Joao Martins 
>> Acked-by: Andy Lutomirski 
> 
> Acked-by: Paolo Bonzini 
> 
> IOW, the Xen folks are free to pick up the whole series. :)
> 
Thank you!

I guess only the x86 maintainers' Ack is left - any comments?

Joao

> Paolo
> 
>> ---
>> Changes since v1:
>>  * Rebased: the only conflict was that I had move the export
>>  pvclock_pvti_cpu0_va() symbol as it is used by kvm PTP driver.
>>  * Do not initialize pvti_cpu0_va to NULL (checkpatch error)
>>  ( Comments from Andy Lutomirski )
>>  * Removed asm/pvclock.h 'pvclock_set_pvti_cpu0_va' definition
>>  for non !PARAVIRT_CLOCK to better track screwed Kconfig stuff.
>>  * Add his Acked-by (provided the previous adjustment was made)
>>
>> Changes since RFC:
>>  (Comments from Andy Lutomirski)
>>  * Add __init to pvclock_set_pvti_cpu0_va
>>  * Add WARN_ON(vclock_was_used(VCLOCK_PVCLOCK)) to
>>  pvclock_set_pvti_cpu0_va
>> ---
>>  arch/x86/include/asm/pvclock.h | 19 ++-
>>  arch/x86/kernel/kvmclock.c |  7 +--
>>  arch/x86/kernel/pvclock.c  | 14 ++
>>  3 files changed, 25 insertions(+), 15 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
>> index 448cfe1b48cf..6f228f90cdd7 100644
>> --- a/arch/x86/include/asm/pvclock.h
>> +++ b/arch/x86/include/asm/pvclock.h
>> @@ -4,15 +4,6 @@
>>  #include 
>>  #include 
>>  
>> -#ifdef CONFIG_KVM_GUEST
>> -extern struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void);
>> -#else
>> -static inline struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
>> -{
>> -return NULL;
>> -}
>> -#endif
>> -
>>  /* some helper functions for xen and kvm pv clock sources */
>>  u64 pvclock_clocksource_read(struct pvclock_vcpu_time_info *src);
>>  u8 pvclock_read_flags(struct pvclock_vcpu_time_info *src);
>> @@ -101,4 +92,14 @@ struct pvclock_vsyscall_time_info {
>>  
>>  #define PVTI_SIZE sizeof(struct pvclock_vsyscall_time_info)
>>  
>> +#ifdef CONFIG_PARAVIRT_CLOCK
>> +void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti);
>> +struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void);
>> +#else
>> +static inline struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
>> +{
>> +return NULL;
>> +}
>> +#endif
>> +
>>  #endif /* _ASM_X86_PVCLOCK_H */
>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
>> index d88967659098..538738047ff5 100644
>> --- a/arch/x86/kernel/kvmclock.c
>> +++ b/arch/x86/kernel/kvmclock.c
>> @@ -47,12 +47,6 @@ early_param("no-kvmclock", parse_no_kvmclock);
>>  static struct pvclock_vsyscall_time_info *hv_clock;
>>  static struct pvclock_wall_clock wall_clock;
>>  
>> -struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
>> -{
>> -return hv_clock;
>> -}
>> -EXPORT_SYMBOL_GPL(pvclock_pvti_cpu0_va);
>> -
>>  /*
>>   * The wallclock is the time of day when we booted. Since then, some time may
>>   * have elapsed since the hypervisor wrote the data. So we try to account for
>> @@ -334,6 +328,7 @@ int __init kvm_setup_vsyscall_timeinfo(void)
>>  return 1;
>>  }
>>  
>> +pvclock_set_pvti_cpu0_va(hv_clock);
>>  put_cpu();
>>  
>>  kvm_clock.archdata.vclock_mode = VCLOCK_PVCLOCK;
>> diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c
>> index 5c3f6d6a5078..cb7d6d9c9c2d 100644
>> --- a/arch/x86/kernel/pvclock.c
>> +++ b/arch/x86/kernel/pvclock.c
>> @@ -25,8 +25,10 @@
>>  
>>  #include 
>>  #include 
>> +#include 
>>  
>>  static u8 valid_flags __read_mostly = 0;
>> +static struct pvclock_vsyscall_time_info *pvti_cpu0_va __read_mostly;
>>  
>>  void pvclock_set_flags(u8 flags)
>>  {
>> @@ -144,3 +146,15 @@ void pvclock_read_wallclock(struct pvclock_wall_clock *wall_clock,
>>  
>>  set_normalized_timespec(ts, now.tv_sec, now.tv_nsec);
>>  }
>> +
>> +void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti)
>> +{
>> +WARN_ON(vclock_was_used(VCLOCK_PVCLOCK));
>> +pvti_cpu0_va = pvti;
>> +}
>> +
>> +struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
>> +{
>> +return pvti_cpu0_va;
>> +}
>> +EXPORT_SYMBOL_GPL(pvclock_pvti_cpu0_va);
>>
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] aarch64: advertise the GIC system register interface

2017-11-07 Thread Peter Maydell
On 7 November 2017 at 17:57, Stefano Stabellini  wrote:
> On Tue, 7 Nov 2017, Peter Maydell wrote:
>> I thought about this on the cycle into work this morning, and I
>> think that rather than require every board that uses gicv3
>> to set a property on the CPU, we should change the definition
>> of the id_aa64pfr0 register so that rather than being ARM_CP_CONST
>> it has a readfn, and then at runtime we can get that readfn to
>> add in the right bit if env->gicv3state is non-null.
>>
>> I'll put together a patch this afternoon.
>
> Great, please CC me when you do, I'll help you test the patch.

http://patchwork.ozlabs.org/patch/835300/ -- should already be
in your inbox somewhere...

thanks
-- PMM

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] aarch64: advertise the GIC system register interface

2017-11-07 Thread Stefano Stabellini
On Tue, 7 Nov 2017, Peter Maydell wrote:
> On 6 November 2017 at 22:16, Stefano Stabellini  
> wrote:
> > When QEMU emulates a GICv3, it needs to advertise the presence of the
> > system register interface, which is done via id_aa64pfr0.
> >
> > To do that, and at the same time to avoid advertising the presence of
> > the system register interface when it is actually not available, set a
> > boolean property in machvirt_init. Check on the boolean property from
> > register_cp_regs_for_features and set id_aa64pfr0 accordingly.
> >
> > Signed-off-by: Stefano Stabellini 
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 9e18b41..369d36b 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -1401,6 +1401,9 @@ static void machvirt_init(MachineState *machine)
> >  object_property_set_link(cpuobj, OBJECT(secure_sysmem),
> >   "secure-memory", _abort);
> >  }
> > +if (vms->gic_version == 3) {
> > +object_property_set_bool(cpuobj, true, "gicv3-sysregs", NULL);
> > +}
> >
> >  object_property_set_bool(cpuobj, true, "realized", NULL);
> >  object_unref(cpuobj);
> 
> I thought about this on the cycle into work this morning, and I
> think that rather than require every board that uses gicv3
> to set a property on the CPU, we should change the definition
> of the id_aa64pfr0 register so that rather than being ARM_CP_CONST
> it has a readfn, and then at runtime we can get that readfn to
> add in the right bit if env->gicv3state is non-null.
> 
> I'll put together a patch this afternoon.

Great, please CC me when you do, I'll help you test the patch.
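
For reference, the approach Peter describes would look roughly like this
(a sketch against QEMU's cpregs machinery; the helper name and the exact
field encoding are assumptions, to be checked against the actual patch):

    static uint64_t id_aa64pfr0_read(CPUARMState *env, const ARMCPRegInfo *ri)
    {
        ARMCPU *cpu = arm_env_get_cpu(env);
        uint64_t pfr0 = cpu->id_aa64pfr0;

        /* advertise the GIC sysreg interface only when a GICv3 is wired up */
        if (env->gicv3state) {
            pfr0 |= 1ull << 24;  /* ID_AA64PFR0_EL1.GIC = 1 */
        }
        return pfr0;
    }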

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-unstable bisection] complete test-amd64-amd64-i386-pvgrub

2017-11-07 Thread Julien Grall
Hi Wei,

On 07/11/17 15:13, Wei Liu wrote:
> On Tue, Nov 07, 2017 at 03:09:07PM +, Julien Grall wrote:
>> Hi Wei,
>>
>> On 06/11/17 14:55, Wei Liu wrote:
>>> On Mon, Nov 06, 2017 at 01:47:56PM +, osstest service owner wrote:
 branch xen-unstable
 xenbranch xen-unstable
 job test-amd64-amd64-i386-pvgrub
 testid guest-start

 Tree: linux git://xenbits.xen.org/linux-pvops.git
 Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
 Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
 Tree: qemuu git://xenbits.xen.org/qemu-xen.git
 Tree: xen git://xenbits.xen.org/xen.git

 *** Found and reproduced problem changeset ***

 Bug is in tree:  xen git://xenbits.xen.org/xen.git
 Bug introduced:  f48b5449dabc770acdde6d25cfbd265cfb71034d
 Bug not present: 86cf189a957129ea1ad6468fe9a0887b9e2819f3
 Last fail repro: 
 http://logs.test-lab.xenproject.org/osstest/logs/115612/


 commit f48b5449dabc770acdde6d25cfbd265cfb71034d
 Author: Wei Liu 
 Date:   Thu Oct 12 20:19:07 2017 +0100
 tools/dombuilder: Switch to using gfn terminology for console and 
 xenstore rings
 The sole use of xc_dom_translated() and xc_dom_p2m() outside of 
 the domain
 builder is for libxl_dom() to translate the console and xenstore 
 pfns back
 into useful values.  PV guest pfns are only interesting to the 
 domain builder,
 and gfns are the address space used by all other hypercalls.
 Renaming the fields in xc_dom_image is deliberate, as it will cause
 out-of-tree users of the dombuilder to notice the different 
 semantics.
 Correct the terminology throughout xc_dom_gnttab{_hvm,}_seed(), 
 which are all
 using gfns despite the existing variable names.
 Signed-off-by: Andrew Cooper 
 Reviewed-by: Roger Pau Monné 
 Acked-by: Wei Liu 
 Tested-by: Julien Grall 
 Release-acked-by: Julien Grall 
 [ wei: fix stubdom build ]
 Signed-off-by: Wei Liu 
>>>
>>> This has broken pvgrub. The problem is more than just the name of the
>>> variables. I have reverted this and its successor patch.
>>
>> It looks like osstest is still broken after the patches you reverted (see
>> [1] and [2]).
>>
>> AFAICT, the only series between the two flights is the dombuilder, there are
>> 2 patches not reverted.
>>
>> Do you have an idea of what's going on?
>>
>> Cheers,
>>
>> [1] http://logs.test-lab.xenproject.org/osstest/logs/115624/
>> [2]
>> https://lists.xenproject.org/archives/html/xen-devel/2017-11/msg00391.html
>>
> 
> test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 115526
> test-armhf-armhf-xl-vhd 15 guest-start/debian.repeat fail REGR. vs. 115526

The log for the xl-vhd contains ([1])

libxl: error: libxl_bootloader.c:283:bootloader_local_detached_cb: Domain 
11:unable to detach locally attached disk
libxl: error: libxl_create.c:1246:domcreate_rebuild_done: Domain 11:cannot 
(re-)build domain: -3
libxl: debug: libxl_domain.c:1138:devices_destroy_cb: Domain 11:Forked pid 5103 
for destroy of domain
libxl: debug: libxl_create.c:1683:do_domain_create: Domain 0:ao 0x5d6e8: 
inprogress: poller=0x56ad8, flags=i
libxl: debug: libxl_event.c:1869:libxl__ao_complete: ao 0x5d6e8: complete, rc=-3
libxl: debug: libxl_event.c:1838:libxl__ao__destroy: ao 0x5d6e8: destroy
libxl: debug: libxl_domain.c:868:libxl_domain_destroy: Domain 11:ao 0x5a170: 
create: how=(nil) callback=(nil) poller=0x56ad8
libxl: error: libxl_domain.c:1000:libxl__destroy_domid: Domain 11:Non-existant 
domain
libxl: error: libxl_domain.c:959:domain_destroy_callback: Domain 11:Unable to 
destroy guest
libxl: error: libxl_domain.c:886:domain_destroy_cb: Domain 11:Destruction of 
domain failed
libxl: debug: libxl_event.c:1869:libxl__ao_complete: ao 0x5a170: complete, 
rc=-21
libxl: debug: libxl_domain.c:877:libxl_domain_destroy: Domain 11:ao 0x5a170: 
inprogress: poller=0x56ad8, flags=ic
libxl: debug: libxl_event.c:1838:libxl__ao__destroy: ao 0x5a170: destroy

It is in the guest repeat test and has succeeded a few times before.

Looking at the success/failure ([2]), the same configuration passed on the 
Arndale
(see 115580) but fails reliably on the cubietruck.

My guess would be the disk is not detached by the previous guest in time.
Now the question is why? I am not familiar with this area, any ideas? 

> 
> These aren't related to dombuilder at first glance.
> 

Cheers,

[1] 
http://logs.test-lab.xenproject.org/osstest/logs/115624/test-armhf-armhf-xl-vhd/15.ts-repeat-test.log
[2] 

Re: [Xen-devel] [PATCH for-next 8/9] xsm: add bodge when compiling with llvm coverage support

2017-11-07 Thread Jan Beulich
>>> On 26.10.17 at 11:19,  wrote:
> --- a/xen/include/xsm/dummy.h
> +++ b/xen/include/xsm/dummy.h
> @@ -24,8 +24,22 @@
>   * if references remain at link time.
>   */
>  #define LINKER_BUG_ON(x) do { if (x) __xsm_action_mismatch_detected(); } while (0)
> +
> +#ifdef CONFIG_LLVM_COVERAGE
> +/*
> + * LLVM coverage support seems to disable some of the optimizations needed in
> + * order for XSM to compile. Since coverage should not be used in production
> + * provide an implementation of __xsm_action_mismatch_detected to satisfy the
> + * linker.
> + */
> +static void __xsm_action_mismatch_detected(void)
> +{
> +ASSERT_UNREACHABLE();
> +}

I'm pretty sure this wants to be "inline".
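
I.e. something like (sketch):

    static inline void __xsm_action_mismatch_detected(void)
    {
        ASSERT_UNREACHABLE();
    }

so that each translation unit including dummy.h doesn't emit its own
out-of-line copy.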

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH for-next 3/9] gcov: rename sysctl and functions

2017-11-07 Thread Jan Beulich
>>> On 26.10.17 at 11:19,  wrote:
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -646,11 +646,11 @@ struct xen_sysctl_scheduler_op {
>  
>  #define XEN_GCOV_FORMAT_MAGIC0x58434f56 /* XCOV */
>  
> -#define XEN_SYSCTL_GCOV_get_size 0 /* Get total size of output data */
> -#define XEN_SYSCTL_GCOV_read 1 /* Read output data */
> -#define XEN_SYSCTL_GCOV_reset    2 /* Reset all counters */
> +#define XEN_SYSCTL_COV_get_size 0 /* Get total size of output data */
> +#define XEN_SYSCTL_COV_read 1 /* Read output data */
> +#define XEN_SYSCTL_COV_reset    2 /* Reset all counters */
>  
> -struct xen_sysctl_gcov_op {
> +struct xen_sysctl_cov_op {
>  uint32_t cmd;
>  uint32_t size; /* IN/OUT: size of the buffer  */
>  XEN_GUEST_HANDLE_64(char) buffer; /* OUT */
> @@ -1065,7 +1065,7 @@ struct xen_sysctl {
>  #define XEN_SYSCTL_numainfo  17
>  #define XEN_SYSCTL_cpupool_op18
>  #define XEN_SYSCTL_scheduler_op  19
> -#define XEN_SYSCTL_gcov_op   20
> +#define XEN_SYSCTL_cov_op    20
>  #define XEN_SYSCTL_psr_cmt_op21
>  #define XEN_SYSCTL_pcitopoinfo   22
>  #define XEN_SYSCTL_psr_cat_op23
> @@ -1095,7 +1095,7 @@ struct xen_sysctl {
>  struct xen_sysctl_lockprof_op   lockprof_op;
>  struct xen_sysctl_cpupool_opcpupool_op;
>  struct xen_sysctl_scheduler_op  scheduler_op;
> -struct xen_sysctl_gcov_op   gcov_op;
> +struct xen_sysctl_cov_opcov_op;
>  struct xen_sysctl_psr_cmt_oppsr_cmt_op;
>  struct xen_sysctl_psr_cat_oppsr_cat_op;
>  struct xen_sysctl_tmem_op   tmem_op;

While for internal things "cov" is probably fine for now, for the
public interface I think I would prefer it to be a little longer,
perhaps even fully spelled out "coverage".
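
Something along these lines for the public header (a sketch of the naming
only; the struct body stays as posted):

    #define XEN_SYSCTL_coverage_op   20

    struct xen_sysctl_coverage_op {
        uint32_t cmd;
        uint32_t size; /* IN/OUT: size of the buffer */
        XEN_GUEST_HANDLE_64(char) buffer; /* OUT */
    };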

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths

2017-11-07 Thread Jan Beulich
>>> On 07.11.17 at 16:52,  wrote:
> On 07/11/17 14:55, Jan Beulich wrote:
> On 07.11.17 at 15:24,  wrote:
>>> On 07/11/17 08:07, Jan Beulich wrote:
 --- unstable.orig/xen/arch/x86/domain.c
 +++ unstable/xen/arch/x86/domain.c
 @@ -379,6 +379,14 @@ int vcpu_initialise(struct vcpu *v)
  
  void vcpu_destroy(struct vcpu *v)
  {
 +/*
 + * Flush all state for this vCPU before fully tearing it down. This is
 + * particularly important for HVM ones on VMX, so that this flushing of
 + * state won't happen from the TLB flush IPI handler behind the back of
 + * a vmx_vmcs_enter() / vmx_vmcs_exit() section.
 + */
 +sync_vcpu_execstate(v);
 +
  xfree(v->arch.vm_event);
  v->arch.vm_event = NULL;
>>>
>>> I don't think this is going to fix the problem since vCPU we are
>>> currently destroying has nothing to do with the vCPUx that actually
>>> caused the problem by its migration. We still are going to call
>>> vmx_vcpu_disable_pml() which loads and cleans VMCS on the current pCPU.
>> 
>> Oh, right, wrong vCPU. This should be better:
>> 
>> --- unstable.orig/xen/arch/x86/domain.c
>> +++ unstable/xen/arch/x86/domain.c
>> @@ -379,6 +379,14 @@ int vcpu_initialise(struct vcpu *v)
>>  
>>  void vcpu_destroy(struct vcpu *v)
>>  {
>> +/*
>> + * Flush all state for the vCPU previously having run on the current CPU.
>> + * This is in particular relevant for HVM ones on VMX, so that this
>> + * flushing of state won't happen from the TLB flush IPI handler behind
>> + * the back of a vmx_vmcs_enter() / vmx_vmcs_exit() section.
>> + */
>> +sync_local_execstate();
>> +
>>  xfree(v->arch.vm_event);
>>  v->arch.vm_event = NULL;
>>  
>> In that case the question then is whether (rather than generalizing
>> it, as mentioned for the earlier version) this wouldn't better go into
>> vmx_vcpu_destroy(), assuming anything called earlier from
>> hvm_vcpu_destroy() isn't susceptible to the problem (i.e. doesn't
>> play with VMCSes).
> 
> Ah, ok. Does this also apply to the previous issue? May I revert that
> change to test it?

Feel free to try it, but I had checked that previous patch earlier
today, and right now I don't think the two issues are related.

> There is one things that I'm worrying about with this approach:
> 
> At this place we just sync the idle context because we know that we are
> going to deal with VMCS later. But what about other potential cases
> (perhaps some softirqs) in which we are accessing a vCPU data structure
> that is currently shared between different pCPUs. Maybe we'd better sync
> the context as soon as possible after we switched to idle from a
> migrated vCPU.

Well, yes, I had pointed out in the earlier reply that this is just to
deal with the specific case here. Whether to sync earlier after a
migration I'm not really sure about - the way it's written right now
is meant to deal with migration across CPUs. If so, this would
perhaps belong into scheduler code (and hence cover ARM as
well), and till now I wasn't able to figure a good place where to
put this.

George, Dario, do you have any thoughts both on the general
idea as well as where to put the necessary code?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [qemu-mainline test] 115635: regressions - FAIL

2017-11-07 Thread osstest service owner
flight 115635 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/115635/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt 6 libvirt-build                         fail REGR. vs. 114507

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-xsm 12 guest-start                      fail in 115620 pass in 115635
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeat   fail in 115620 pass in 115635
 test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat    fail in 115620 pass in 115635
 test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install      fail in 115620 pass in 115635
 test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat  fail pass in 115620
 test-amd64-amd64-xl-qcow2 19 guest-start/debian.repeat      fail pass in 115620
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail pass in 115620
 test-armhf-armhf-xl-vhd 10 debian-di-install                fail pass in 115620

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 1 build-check(1)                   blocked n/a
 test-armhf-armhf-libvirt-raw 1 build-check(1)               blocked n/a
 test-armhf-armhf-libvirt-xsm 1 build-check(1)               blocked n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop          fail blocked in 114507
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-check   fail in 115620 like 114507
 test-armhf-armhf-libvirt 14 saverestore-support-check       fail in 115620 like 114507
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop          fail in 115620 like 114507
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check   fail in 115620 like 114507
 test-armhf-armhf-libvirt-xsm 13 migrate-support-check       fail in 115620 never pass
 test-armhf-armhf-libvirt 13 migrate-support-check           fail in 115620 never pass
 test-armhf-armhf-xl-vhd 12 migrate-support-check            fail in 115620 never pass
 test-armhf-armhf-xl-vhd 13 saverestore-support-check        fail in 115620 never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check       fail in 115620 never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop           fail like 114507
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start              fail never pass
 test-amd64-i386-libvirt-xsm 13 migrate-support-check        fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-check           fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start                fail never pass
 test-amd64-i386-libvirt 13 migrate-support-check            fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check       fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-arndale 13 migrate-support-check        fail never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check    fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check       fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
 test-armhf-armhf-xl-xsm 13 migrate-support-check            fail never pass
 test-armhf-armhf-xl-xsm 14 saverestore-support-check        fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check      fail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check  fail never pass
 test-armhf-armhf-xl 13 migrate-support-check                fail never pass
 test-armhf-armhf-xl 14 saverestore-support-check            fail never pass
 test-armhf-armhf-xl-credit2 13 migrate-support-check        fail never pass
 test-armhf-armhf-xl-credit2 14 saverestore-support-check    fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check     fail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check           fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check       fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop           fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install      fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install     fail never pass

version targeted for testing:
 qemuu                299d1ea9bb56bd9f45f905125489bdd7d543a1aa
baseline version:
 qemuu                f90ea7ba7c5ae7010ee0ce062207ae42530f57d6

Last test of basis   114507  2017-10-15 01:03:38 Z   23 days
Failing since        114546  2017-10-16 12:16:28 Z   22 days   58 attempts
Testing same since   115620  2017-11-06 17:20:55 Z    0 days    2 attempts


People who touched revisions under test:
  Aaron Lindsay 
  Alberto 

Re: [Xen-devel] Xen PVH support in grub2

2017-11-07 Thread Boris Ostrovsky
On 11/07/2017 02:42 AM, Juergen Gross wrote:
> On 06/11/17 17:42, Boris Ostrovsky wrote:
>> On 11/06/2017 10:05 AM, Juergen Gross wrote:
>>> On 06/11/17 15:51, Boris Ostrovsky wrote:
 On 11/06/2017 02:16 AM, Juergen Gross wrote:
> On 03/11/17 20:00, Boris Ostrovsky wrote:
>> On 11/03/2017 02:40 PM, Juergen Gross wrote:
>>> On 03/11/17 19:35, Boris Ostrovsky wrote:
 On 11/03/2017 02:23 PM, Juergen Gross wrote:
> On 03/11/17 19:19, Boris Ostrovsky wrote:
>> On 11/03/2017 02:05 PM, Juergen Gross wrote:
>>> So again the question: how to tell whether we are PVH or HVM in
>>> init_hypervisor_platform()? ACPi tables are scanned way later...
>> Can we make grub/OVMF append a boot option?
>>
>> Or set setup_header.hardware_subarch to something? We already have
>> X86_SUBARCH_XEN but it is only used by PV.  Or we might be able to 
>> use
>> hardware_subarch_data (will need to get a buy-in from x86 
>> maintainers, I
>> think).
> But wouldn't this break the idea to reuse the native boot paths in
> grub/OVMF without further modifications?
 WDYM? We will have to have some sort of a plugin in either one to build
 the zeropage anyway. So we'd set hardware_subarch there, in addition to
 other things like setting memory and such.
>>> But isn't the zeropage already being built? I admit that setting subarch
>>> isn't a big deal, but using another entry with a passed-through pvh
>>> start struct isn't either...
>> I don't follow, sorry. My understanding is that zeropage will be built
>> by PVH-enlightened grub so part of this process would be setting the
>> subarch bit.
> My reasoning was based on Roger's remark:
>
> "OTOH if Linux is capable of booting from the native entry point inside
> of a PVH container, we would only have to port OVMF and grub in order
> to work inside of a PVH container, leaving the rest of the logic
> untouched."
 Right, and in my mind porting OVMF/grub includes creating proper zeropage.
>>> Aah, okay. I reasoned on the assumption to just enable OVMF/grub to run
>>> in PVH environment without touching the parts setting up anything for
>>> the new kernel.
>> Someone needs to do what xen_prepare_pvh() does.
> As the loader is filling in the memory map information 

That's the thing that I thought may need to be done by us (setting
commandline too) . But I haven't looked at Xen support in grub so maybe
it's already there.

> the only thing
> remaining would be setting xen_pvh. And this could be delayed as my test
> have shown, so we only need to detect the PVH case from inside the
> kernel. One possibility would be the flags in the ACPI FADT table as
> Roger mentioned, another idea would be a flag in zeropage set by the
> loader.
>
>> And, for 64-bit, we also may need to build early pagetables since
>> startup_64() (unlike startup_32()) expects paging to be on. (I don't
>> know whether this is already part of standard FW codepath)
> This would be done the same way as for a native kernel.

This is done by Linux trampoline code. AFAIK grub loads kernel in realmode.
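
For readers following along, the zeropage-tagging idea under discussion
amounts to something like this in the loader (a sketch using field names
from Linux's asm/bootparam.h; whether to reuse X86_SUBARCH_XEN or define
a new value is exactly the open question):

    #include <string.h>
    #include <asm/bootparam.h>

    static void pvh_fill_zeropage(struct boot_params *bp)
    {
        memset(bp, 0, sizeof(*bp));
        bp->hdr.hardware_subarch = X86_SUBARCH_XEN; /* today set for PV only */
        /* the loader would also fill in the e820 map and command line here */
    }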


-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC 0/4] TEE mediator framework + OP-TEE mediator

2017-11-07 Thread Julien Grall

Hi Volodymyr,

On 02/11/17 20:07, Volodymyr Babchuk wrote:

On Thu, Nov 02, 2017 at 05:49:12PM +, Julien Grall wrote:

On 02/11/17 16:53, Volodymyr Babchuk wrote:

On Thu, Nov 02, 2017 at 01:17:26PM +, Julien Grall wrote:

On 24/10/17 20:02, Volodymyr Babchuk wrote:

But parameters are mapped every call and only needed ones.
Example: I have shared buffers A, B, C, D.

1) I call OpenSession(TA_UUID, A, B).
TA sees only buffers A, B (okay, actually it sees whole page, because
buffer is mapped from userspace).

2) I call InvokeCommand(Session, CMD_ID, B, C).
TA sees only buffers B & C.

3) I call InvokeCommand(Session, CMD_ID, A, D).
TA sees only buffers A & D.

Note, that such buffers are not mapped at OP-TEE address space at all.
They will be mapped only to TA address space.


To confirm, what you are saying is as soon as any call is returned by TA,
the region will be unmapped from the TA address space?

Yes.
Also, just to clarify: a TA executes only on request from a client. It
can't receive external events. So, the TA address space is a somewhat
ephemeral entity. It exists only during the time between TA entry and TA
exit. At all other times, the TA has no address space, no thread context,
nothing. Just code and data somewhere in memory.


That's quite a good news :). Thank you for the explanation.





[...]

To be clear, this series don't look controversial at least for OP-TEE. What
I am more concerned is about DomU supports.

Your concern is that rogue DomU can compromise whole system, right?


Yes. You seem to assume that DomU using TEE will always be trusted, I think
this is the wrong approach if the use is able to interact directly with
those guests. See above.

No, I am not assuming that DomU that calls TEE should be trusted. Why do you
think so? It should be able to use TEE services, but this does not mean that
XEN should trust it.


In a previous answer you said: "So, if you don't trust your guest - don't
let it". For me, this clearly means you consider that DomU using TEE are
trusted.

So can you clarify by what you mean by trust then?

Well... In the real world "trust" isn't a binary option. You don't want
to allow all domains to access the TEE. A breached TEE user domain doesn't
automatically mean that your whole system is compromised, but it certainly
increases the attack surface. So it is safer to give TEE access only to
those domains which really require it. You can call them slightly more
trusted than others.


Do you have an example of guest you would slightly trust more?

I have an example of a guest I would trust less: if I'm running a server
and selling virtual machines on it, I don't want them to access the TEE.


Make sense.



I will trust my own guest slightly more.


I kind of agree if there are either no interaction with the user or the user
is not able to gain privilege permissions.

Okay, suppose a user can execute arbitrary code at EL1... Even then nothing bad
will happen. They would have to hack the mediator/hypervisor/OP-TEE to really
gain privileges in the system.


My worry here is that you base the trust on OP-TEE and not only on the
hypervisor. At the moment we have to trust the hardware to do the right thing,
and the software is owned by Xen.

How about firmware? E.g. ARM TF?


My point here was that, at the moment, everything involved in virtualization
is the hardware and Xen. The ARM TF/firmware cannot be accessed directly or
indirectly by any guest, so there is no concern to me.





Now you are telling me we have this TEE running in EL3 and have to trust
it to do the isolation between guests. Until the last two e-mails, it was not
clear to me how OP-TEE could ensure this isolation.

Actually, OP-TEE is running at S-EL1 :-) Only ARM TF (or whatever
firmware is used) has ultimate control over the system. If we are
talking about modern ARMv8 platforms.


I would advise explaining the design of OP-TEE a bit more in the cover
letter of your next version. This would help people to see how this can
work with the hypervisor and also to understand the consequences...

I see. I'll do this, certainly. I just didn't expect that someone would
be interested in OP-TEE internals at such a level.


I like to understand what I sign for :).



But, I think, the cover letter for the next OP-TEE series will be done much
later. Now I'm busy with the OP-TEE part; then there will be changes to
support multi-domain boot, and only then the OP-TEE specific patches...

BTW, if anyone is interested in current state of OP-TEE mediator, you
can find it at [1]. I was able to pass OP-TEE tests from DomU in the
last version. I use it for OP-TEE development, so it is not
production-ready.

Julien, I want to ask about the VM monitor feature in Xen: the monitor_smc()
function and the whole of xen/arch/arm/monitor.c... Looks like it was
introduced for some sort of debugging. Do you know any users of this


It was originally introduced to allow an external application to trap
SMCs and execute an action. This is part of the VM introspection
framework that could 

Re: [Xen-devel] [libvirt] Libvirt config converter can't handle file not ending with new line

2017-11-07 Thread Wim ten Have
On Tue, 7 Nov 2017 12:20:05 +
Wei Liu  wrote:

> On Mon, Nov 06, 2017 at 09:41:01PM -0700, Jim Fehlig wrote:
> > On 10/30/2017 06:17 AM, Wei Liu wrote:  
> > > Hi Jim
> > > 
> > > I discover a problem when using xen_xl converter. When the file in
> > > question doesn't end with a new line, I get the following error:
> > > 
> > >error: configuration file syntax error: memory conf:53: expecting a 
> > > value  
> > 
> > I'm not able to reproduce this issue. The libvirt.git tree I tried was a bit
> > dated, but even after updating to latest master I can't reproduce.
> >   
> > > After digging a bit (but haven't read libvirt code), it appears that the
> > > file didn't end with a new line.  
> > 
> > I tried several files without ending new lines, going both directions
> > (domxml-to-native and domxml-from-native), but didn't see the mentioned
> > error. Perhaps your config is revealing another bug which is being
> > improperly reported. Can you provide an example of the problematic config?
> >   
> 
> I tried to get the exact file that caused the problem but it is already
> destroyed by osstest.
> 
> A similar file:
> 
> http://logs.test-lab.xenproject.org/osstest/logs/115436/test-amd64-amd64-libvirt-pair/debian.guest.osstest.cfg
> 
> If you hexdump -C it, you can see the last character is 0a. Remove it and
> feed the file into the converter.
> Wei.

  The phenomenon you point out is indeed weird.  And my first response
  is that this is a bug parsing the cfg input.  I explored a little and
  think that src/util/virconf.c (virConfParseLong(), virConfParseValue())
  should be reworked as pointed out in the context diffs below.

 git diff
diff --git a/src/util/virconf.c b/src/util/virconf.c
index 39c2bd917..bc8e57ec3 100644
--- a/src/util/virconf.c
+++ b/src/util/virconf.c
@@ -352,7 +352,7 @@ virConfParseLong(virConfParserCtxtPtr ctxt, long long *val)
 } else if (CUR == '+') {
 NEXT;
 }
-if ((ctxt->cur >= ctxt->end) || (!c_isdigit(CUR))) {
+if ((ctxt->cur > ctxt->end) || (!c_isdigit(CUR))) {
 virConfError(ctxt, VIR_ERR_CONF_SYNTAX, _("unterminated 
number"));
 return -1;
 }
@@ -456,7 +456,7 @@ virConfParseValue(virConfParserCtxtPtr ctxt)
 long long l = 0;
 
 SKIP_BLANKS;
-if (ctxt->cur >= ctxt->end) {
+if (ctxt->cur > ctxt->end) {
 virConfError(ctxt, VIR_ERR_CONF_SYNTAX, _("expecting a 
value"));
 return NULL;
 }

  I did not go beyond this yet.

Rgds,
- Wim.
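
(For reference, a hedged reproduction sketch of the trigger Wei describes
above; the file name is just an example, and truncate -s -1 assumes the file
currently ends with exactly one 0x0a byte:)

# confirm the file ends with a newline (last byte 0a)
hexdump -C debian.guest.osstest.cfg | tail -1

# strip the final byte, then feed the file to the converter
truncate -s -1 debian.guest.osstest.cfg
virsh domxml-from-native xen-xl debian.guest.osstest.cfg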



Re: [Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths

2017-11-07 Thread Igor Druzhinin
On 07/11/17 14:55, Jan Beulich wrote:
>>>> On 07.11.17 at 15:24,  wrote:
>> On 07/11/17 08:07, Jan Beulich wrote:
>>> --- unstable.orig/xen/arch/x86/domain.c
>>> +++ unstable/xen/arch/x86/domain.c
>>> @@ -379,6 +379,14 @@ int vcpu_initialise(struct vcpu *v)
>>>  
>>>  void vcpu_destroy(struct vcpu *v)
>>>  {
>>> +/*
>>> + * Flush all state for this vCPU before fully tearing it down. This is
>>> + * particularly important for HVM ones on VMX, so that this flushing of
>>> + * state won't happen from the TLB flush IPI handler behind the back of
>>> + * a vmx_vmcs_enter() / vmx_vmcs_exit() section.
>>> + */
>>> +sync_vcpu_execstate(v);
>>> +
>>>  xfree(v->arch.vm_event);
>>>  v->arch.vm_event = NULL;
>>
>> I don't think this is going to fix the problem since vCPU we are
>> currently destroying has nothing to do with the vCPUx that actually
>> caused the problem by its migration. We still are going to call
>> vmx_vcpu_disable_pml() which loads and cleans VMCS on the current pCPU.
> 
> Oh, right, wrong vCPU. This should be better:
> 
> --- unstable.orig/xen/arch/x86/domain.c
> +++ unstable/xen/arch/x86/domain.c
> @@ -379,6 +379,14 @@ int vcpu_initialise(struct vcpu *v)
>  
>  void vcpu_destroy(struct vcpu *v)
>  {
> +/*
> + * Flush all state for the vCPU previously having run on the current CPU.
> + * This is in particular relevant for HVM ones on VMX, so that this
> + * flushing of state won't happen from the TLB flush IPI handler behind
> + * the back of a vmx_vmcs_enter() / vmx_vmcs_exit() section.
> + */
> +sync_local_execstate();
> +
>  xfree(v->arch.vm_event);
>  v->arch.vm_event = NULL;
>  
> In that case the question then is whether (rather than generalizing
> it, as mentioned for the earlier version) this wouldn't better go into
> vmx_vcpu_destroy(), assuming anything called earlier from
> hvm_vcpu_destroy() isn't susceptible to the problem (i.e. doesn't
> play with VMCSes).

Ah, ok. Does this also apply to the previous issue? May I revert that
change to test it?

There is one thing that worries me about this approach:

At this place we just sync the idle context because we know that we are
going to deal with the VMCS later. But what about other potential cases
(perhaps some softirqs) in which we are accessing a vCPU data structure
that is currently shared between different pCPUs? Maybe we'd better sync
the context as soon as possible after we switch to idle from a
migrated vCPU.

Igor

> 
> Jan
> 



Re: [Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths

2017-11-07 Thread Jan Beulich
>>> On 07.11.17 at 09:07,  wrote:
>>>> On 02.11.17 at 20:46,  wrote:
>>> Any ideas about the root cause of the fault and suggestions how to
>>> reproduce it would be welcome. Does this crash really have something
>>> to do with PML? I doubt it, because the original environment may
>>> hardly be called PML-heavy.
> 
> Well, PML-heaviness doesn't matter. It's the mere fact that PML
> is enabled on the vCPU being destroyed.
> 
>> So we finally have complete understanding of what's going on:
>> 
>> Some vCPU has just migrated to another pCPU and we switched to idle but
>> per_cpu(curr_vcpu) on the current pCPU is still pointing to it - this is
>> how the current logic works. While we're in idle we're issuing
>> vcpu_destroy() for some other domain which eventually calls
>> vmx_vcpu_disable_pml() and trashes VMCS pointer on the current pCPU. At
>> this moment we get a TLB flush IPI from that same vCPU which is now
>> context switching on another pCPU - it appears to clean TLB after
>> itself. This vCPU is already marked is_running=1 by the scheduler. In
>> the IPI handler we enter __sync_local_execstate() and trying to call
>> vmx_ctxt_switch_from() for the migrated vCPU which is supposed to call
>> vmcs_reload() but doesn't do it because is_running==1. The next VMWRITE
>> crashes the hypervisor.
>> 
>> So the state transition diagram might look like:
>> pCPU1: vCPUx -> migrate to pCPU2 -> idle -> RCU callbacks ->
> 
> I'm not really clear about who/what is "idle" here: pCPU1,
> pCPU2, or yet something else? If vCPUx migrated to pCPU2,
> wouldn't it be put back into runnable state right away, and
> hence pCPU2 can't be idle at this point? Yet for pCPU1 I don't
> think its idleness would matter much, i.e. the situation could
> also arise without it becoming idle afaics. pCPU1 making it
> anywhere softirqs are being processed would suffice.
> 
>> vcpu_destroy() -> vmx_vcpu_disable_pml() -> vmcs_clear()
>> pCPU2: context switch into vCPUx -> is_running = 1 -> TLB flush
>> pCPU1: IPI handler -> context switch out of vCPUx -> VMWRITE -> CRASH!
>> 
>> We can basically just fix the condition around vmcs_reload() call but
>> I'm not completely sure that it's the right way to do - I don't think
>> leaving per_cpu(curr_vcpu) pointing to a migrated vCPU is a good idea
>> (maybe we need to clean it). What are your thoughts?
> 
> per_cpu(curr_vcpu) can only validly be written inside
> __context_switch(), hence the only way to achieve this would
> be to force __context_switch() to be called earlier than out of
> the TLB flush IPI handler, perhaps like in the (untested!) patch
> below. Two questions then remain:
> - Should we perhaps rather do this in an arch-independent way
>   (i.e. ahead of the call to vcpu_destroy() in common code)?
> - This deals with only a special case of the more general "TLB
>   flush behind the back of a vmx_vmcs_enter() /
>   vmx_vmcs_exit() section" - does this need dealing with in a
>   more general way? Here I'm thinking of introducing a
>   FLUSH_STATE flag to be passed to flush_mask() instead of
>   the current flush_tlb_mask() in context_switch() and
>   sync_vcpu_execstate(). This could at the same time be used
>   for a small performance optimization: At least for HAP vCPU-s
>   I don't think we really need the TLB part of the flushes here.

Btw., for this second aspect below is what I have in mind.

Jan

x86: make CPU state flush requests explicit

Having this be an implied side effect of a TLB flush is not very nice:
It could (at least in theory) lead to unintended state flushes (see e.g.
https://lists.xenproject.org/archives/html/xen-devel/2017-11/msg00187.html 
for context). Introduce a flag to be used in the two places actually
wanting the state flushed, and conditionalize the
__sync_local_execstate() invocation in the IPI handler accordingly.

At the same time also conditionalize the flush_area_local() invocations,
to short-circuit the function ending up as a no-op anyway.

Signed-off-by: Jan Beulich 
---
I first thought we could also suppress the TLB flush part in the context
switch cases for HAP vCPU-s, but the per-domain mappings require that to
happen.

--- unstable.orig/xen/arch/x86/domain.c
+++ unstable/xen/arch/x86/domain.c
@@ -1699,7 +1699,7 @@ void context_switch(struct vcpu *prev, s
                  !cpumask_empty(&dirty_mask)) )
 {
 /* Other cpus call __sync_local_execstate from flush ipi handler. */
-flush_tlb_mask(&dirty_mask);
+flush_mask(&dirty_mask, FLUSH_TLB | FLUSH_STATE);
 }
 
 if ( prev != next )
@@ -1808,7 +1808,7 @@ void sync_vcpu_execstate(struct vcpu *v)
 sync_local_execstate();
 
 /* Other cpus call __sync_local_execstate from flush ipi handler. */
-flush_tlb_mask(v->vcpu_dirty_cpumask);
+flush_mask(v->vcpu_dirty_cpumask, FLUSH_TLB | FLUSH_STATE);
 }
 
 static int relinquish_memory(
--- unstable.orig/xen/arch/x86/smp.c
+++ unstable/xen/arch/x86/smp.c
@@ 
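
(The smp.c hunk is cut off in the archive. Purely as an illustration of the
idea, a sketch of what the conditionalised IPI handler could look like; the
FLUSH_STATE plumbing shown here is an assumption, not the actual hunk:)

void invalidate_interrupt(struct cpu_user_regs *regs)
{
    unsigned int flags = flush_flags;

    ack_APIC_irq();
    perfc_incr(ipis);

    /* Only flush CPU state when the sender asked for it, and drop the
     * TLB part if the state flush already implied a TLB flush. */
    if ( (flags & FLUSH_STATE) && __sync_local_execstate() )
        flags &= ~(FLUSH_TLB | FLUSH_TLB_GLOBAL);

    /* Short-circuit flush_area_local() when it would be a no-op. */
    if ( flags & ~(FLUSH_STATE | FLUSH_ORDER_MASK) )
        flush_area_local(flush_va, flags);

    cpumask_clear_cpu(smp_processor_id(), &flush_cpumask);
}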

Re: [Xen-devel] [xen-unstable bisection] complete test-amd64-amd64-i386-pvgrub

2017-11-07 Thread Wei Liu
On Tue, Nov 07, 2017 at 03:09:07PM +, Julien Grall wrote:
> Hi Wei,
> 
> On 06/11/17 14:55, Wei Liu wrote:
> > On Mon, Nov 06, 2017 at 01:47:56PM +, osstest service owner wrote:
> > > branch xen-unstable
> > > xenbranch xen-unstable
> > > job test-amd64-amd64-i386-pvgrub
> > > testid guest-start
> > > 
> > > Tree: linux git://xenbits.xen.org/linux-pvops.git
> > > Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> > > Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
> > > Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> > > Tree: xen git://xenbits.xen.org/xen.git
> > > 
> > > *** Found and reproduced problem changeset ***
> > > 
> > >Bug is in tree:  xen git://xenbits.xen.org/xen.git
> > >Bug introduced:  f48b5449dabc770acdde6d25cfbd265cfb71034d
> > >Bug not present: 86cf189a957129ea1ad6468fe9a0887b9e2819f3
> > >Last fail repro: 
> > > http://logs.test-lab.xenproject.org/osstest/logs/115612/
> > > 
> > > 
> > >commit f48b5449dabc770acdde6d25cfbd265cfb71034d
> > >Author: Wei Liu 
> > >Date:   Thu Oct 12 20:19:07 2017 +0100
> > >tools/dombuilder: Switch to using gfn terminology for console and 
> > > xenstore rings
> > >The sole use of xc_dom_translated() and xc_dom_p2m() outside of 
> > > the domain
> > >builder is for libxl_dom() to translate the console and xenstore 
> > > pfns back
> > >into useful values.  PV guest pfns are only interesting to the 
> > > domain builder,
> > >and gfns are the address space used by all other hypercalls.
> > >Renaming the fields in xc_dom_image is deliberate, as it will cause
> > >out-of-tree users of the dombuilder to notice the different 
> > > semantics.
> > >Correct the terminology throughout xc_dom_gnttab{_hvm,}_seed(), 
> > > which are all
> > >using gfns despite the existing variable names.
> > >Signed-off-by: Andrew Cooper 
> > >Reviewed-by: Roger Pau Monné 
> > >Acked-by: Wei Liu 
> > >Tested-by: Julien Grall 
> > >Release-acked-by: Julien Grall 
> > >[ wei: fix stubdom build ]
> > >Signed-off-by: Wei Liu 
> > 
> > This has broken pvgrub. The problem is more than just the name of the
> > variables. I have reverted this and its successor patch.
> 
> It looks like osstest is still broken after the patches you reverted (see
> [1] and [2]).
> 
> AFAICT, the only series between the two flights is the dombuilder, there are
> 2 patches not reverted.
> 
> Do you have an idea of what's going on?
> 
> Cheers,
> 
> [1] http://logs.test-lab.xenproject.org/osstest/logs/115624/
> [2]
> https://lists.xenproject.org/archives/html/xen-devel/2017-11/msg00391.html
> 

test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs.  
115526
test-armhf-armhf-xl-vhd 15 guest-start/debian.repeat fail REGR. vs.  115526 

These aren't related to dombuilder at first glance.



Re: [Xen-devel] [xen-unstable bisection] complete test-amd64-amd64-i386-pvgrub

2017-11-07 Thread Julien Grall

Hi Wei,

On 06/11/17 14:55, Wei Liu wrote:

On Mon, Nov 06, 2017 at 01:47:56PM +, osstest service owner wrote:

branch xen-unstable
xenbranch xen-unstable
job test-amd64-amd64-i386-pvgrub
testid guest-start

Tree: linux git://xenbits.xen.org/linux-pvops.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

   Bug is in tree:  xen git://xenbits.xen.org/xen.git
   Bug introduced:  f48b5449dabc770acdde6d25cfbd265cfb71034d
   Bug not present: 86cf189a957129ea1ad6468fe9a0887b9e2819f3
   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/115612/


   commit f48b5449dabc770acdde6d25cfbd265cfb71034d
   Author: Wei Liu 
   Date:   Thu Oct 12 20:19:07 2017 +0100
   
   tools/dombuilder: Switch to using gfn terminology for console and xenstore rings
   
   The sole use of xc_dom_translated() and xc_dom_p2m() outside of the domain

   builder is for libxl_dom() to translate the console and xenstore pfns 
back
   into useful values.  PV guest pfns are only interesting to the domain 
builder,
   and gfns are the address space used by all other hypercalls.
   
   Renaming the fields in xc_dom_image is deliberate, as it will cause

   out-of-tree users of the dombuilder to notice the different semantics.
   
   Correct the terminology throughout xc_dom_gnttab{_hvm,}_seed(), which are all

   using gfns despite the existing variable names.
   
   Signed-off-by: Andrew Cooper 

   Reviewed-by: Roger Pau Monné 
   Acked-by: Wei Liu 
   Tested-by: Julien Grall 
   Release-acked-by: Julien Grall 
   [ wei: fix stubdom build ]
   Signed-off-by: Wei Liu 


This has broken pvgrub. The problem is more than just the name of the
variables. I have reverted this and its successor patch.


It looks like osstest is still broken after the patches you reverted 
(see [1] and [2]).


AFAICT, the only series between the two flights is the dombuilder, there 
are 2 patches not reverted.


Do you have an idea of what's going on?

Cheers,

[1] http://logs.test-lab.xenproject.org/osstest/logs/115624/
[2] 
https://lists.xenproject.org/archives/html/xen-devel/2017-11/msg00391.html


Cheers,

--
Julien Grall



Re: [Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths

2017-11-07 Thread Jan Beulich
>>> On 07.11.17 at 15:24,  wrote:
> On 07/11/17 08:07, Jan Beulich wrote:
>> --- unstable.orig/xen/arch/x86/domain.c
>> +++ unstable/xen/arch/x86/domain.c
>> @@ -379,6 +379,14 @@ int vcpu_initialise(struct vcpu *v)
>>  
>>  void vcpu_destroy(struct vcpu *v)
>>  {
>> +/*
>> + * Flush all state for this vCPU before fully tearing it down. This is
>> + * particularly important for HVM ones on VMX, so that this flushing of
>> + * state won't happen from the TLB flush IPI handler behind the back of
>> + * a vmx_vmcs_enter() / vmx_vmcs_exit() section.
>> + */
>> +sync_vcpu_execstate(v);
>> +
>>  xfree(v->arch.vm_event);
>>  v->arch.vm_event = NULL;
> 
> I don't think this is going to fix the problem since vCPU we are
> currently destroying has nothing to do with the vCPUx that actually
> caused the problem by its migration. We still are going to call
> vmx_vcpu_disable_pml() which loads and cleans VMCS on the current pCPU.

Oh, right, wrong vCPU. This should be better:

--- unstable.orig/xen/arch/x86/domain.c
+++ unstable/xen/arch/x86/domain.c
@@ -379,6 +379,14 @@ int vcpu_initialise(struct vcpu *v)
 
 void vcpu_destroy(struct vcpu *v)
 {
+/*
+ * Flush all state for the vCPU previously having run on the current CPU.
+ * This is in particular relevant for HVM ones on VMX, so that this
+ * flushing of state won't happen from the TLB flush IPI handler behind
+ * the back of a vmx_vmcs_enter() / vmx_vmcs_exit() section.
+ */
+sync_local_execstate();
+
 xfree(v->arch.vm_event);
 v->arch.vm_event = NULL;
 
In that case the question then is whether (rather than generalizing
it, as mentioned for the earlier version) this wouldn't better go into
vmx_vcpu_destroy(), assuming anything called earlier from
hvm_vcpu_destroy() isn't susceptible to the problem (i.e. doesn't
play with VMCSes).

Jan




Re: [Xen-devel] Xen PVH support in grub2

2017-11-07 Thread Juergen Gross
On 06/11/17 12:36, Juergen Gross wrote:
> On 03/11/17 13:17, Roger Pau Monné wrote:
>> On Fri, Nov 03, 2017 at 01:00:46PM +0100, Juergen Gross wrote:
>>> On 29/09/17 17:51, Roger Pau Monné wrote:
>>>> On Fri, Sep 29, 2017 at 03:33:58PM +, Juergen Gross wrote:
> On 29/09/17 17:24, Roger Pau Monné wrote:
>> On Fri, Sep 29, 2017 at 02:46:53PM +, Juergen Gross wrote:
>> Then, I also wonder whether it would make sense for this grub to load
>> the kernel using the PVH entry point or the native entry point. Would
>> it be possible to boot a Linux kernel up to the point where cpuid can
>> be used inside of a PVH container?
>
> I don't think today's Linux allows that. This has been discussed
> very thoroughly at the time Boris added PVH V2 support to the kernel.

 OK, I'm not going to insist on that, but my plans for FreeBSD is to
 make the native entry point capable of booting inside of a PVH
 container up to the point where cpuid (or whatever method) can be used
 to detect the environment.
>>>
>>> Looking more thoroughly into the Linux boot code I think this could
>>> work for Linux, too. But only if we can tell PVH from HVM in the guest.
>>> How would you do that in FreeBSD? Via flags in the boot params? This
>>> would the have to be done in the boot loader (e.g. grub or OVMF).
>>
>> My plan was not to differentiate between HVM and PVH, but rather to
>> make use of the ACPI information in order to decide which devices are
>> available and which are not inside of a PVH guest.
>>
>> For example in the FADT "IA-PC Boot Architecture Flags" field for PVH
>> we already set "VGA Not Present" and "CMOS RTC Not Present". There
>> might be other flags/fields that must be set, but I would like to
>> avoid having a CPUID bit or similar saying "PVH", because then Xen
>> will be tied to always providing the same set of devices in PVH
>> containers.
> 
> I looked through the xen_pvh_domain() use cases in the Linux kernel
> again.
> 
> Maybe we can really manage to not need differentiating PVH from HVM
> until ACPI table scan. We'd need another hook for Xen, but this should
> be easy as KVM already has a hook where we'd need one. So this can be
> made more general and we are fine.
> 
> I even think we can drop some of the PVH tests, as the PVH-specific
> handling (e.g. for grant table initialization) should work for HVM, too.

So I did a little test now: with some small patches added I've managed
to boot a PVH Linux kernel without any special handling in the early
PVH paths of the kernel (only setting up initial page tables and faking
a memory map similar to the one grub would deliver). PVH detection is
done via the ACPI tables ("VGA Not Present" and "CMOS RTC Not Present" in an
HVM domain will result in PVH being assumed).
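
(A minimal sketch of that detection, assuming the FADT has already been
obtained, e.g. via acpi_get_table(); whether these two boot_flags bits are a
reliable PVH indicator is exactly the open question in this thread:)

#include <linux/acpi.h>

static bool __init xen_is_pvh(const struct acpi_table_fadt *fadt)
{
    /* "VGA Not Present" and "CMOS RTC Not Present" in the IA-PC boot
     * architecture flags; both set in an HVM-started domain is taken
     * to mean PVH. */
    return (fadt->boot_flags & ACPI_FADT_NO_VGA) &&
           (fadt->boot_flags & ACPI_FADT_NO_CMOS_RTC);
}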

If nobody objects to this handling I'll send patches soon.


Juergen



[Xen-devel] [xen-unstable-smoke test] 115645: tolerable all pass - PUSHED

2017-11-07 Thread osstest service owner
flight 115645 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/115645/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  92f0d4392e73727819c5a83fcce447515efaf2f5
baseline version:
 xen  1f61c07d79abda1e747d70d83edffe4efca48e17

Last test of basis   115616  2017-11-06 15:01:19 Z0 days
Testing same since   115645  2017-11-07 13:03:54 Z0 days1 attempts


People who touched revisions under test:
  Dario Faggioli 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=92f0d4392e73727819c5a83fcce447515efaf2f5
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
++++ export PERLLIB=.:.
++++ PERLLIB=.:.
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
92f0d4392e73727819c5a83fcce447515efaf2f5
+ branch=xen-unstable-smoke
+ revision=92f0d4392e73727819c5a83fcce447515efaf2f5
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
++++ export PERLLIB=.:.:.
++++ PERLLIB=.:.:.
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
+++ export PERLLIB=.:.:.:.
+++ PERLLIB=.:.:.:.
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.9-testing
+ '[' x92f0d4392e73727819c5a83fcce447515efaf2f5 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git
++ : git://xenbits.xen.org/osstest/ovmf.git
++ : 

Re: [Xen-devel] [libvirt test] 115636: regressions - FAIL

2017-11-07 Thread Anthony PERARD
On Tue, Nov 07, 2017 at 12:59:31PM +, Roger Pau Monné wrote:
> On Tue, Nov 07, 2017 at 12:42:59PM +, osstest service owner wrote:
> > flight 115636 libvirt real [real]
> > http://logs.test-lab.xenproject.org/osstest/logs/115636/
> > 
> > Regressions :-(
> > 
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat fail REGR. vs. 
> > 115476
> 
> One of the local attachments done in order to run the bootloader
> failed:
> 
> 2017-11-07 12:17:16.601+: libxl: 
> libxl_disk.c:990:libxl__device_disk_local_initiate_attach: Trying to find 
> local path
> 2017-11-07 12:17:16.601+: libxl: 
> libxl_disk.c:998:libxl__device_disk_local_initiate_attach: Local path not 
> found, initiating attach.
> 2017-11-07 12:17:16.601+: libxl: 
> libxl_device.c:365:libxl__device_disk_set_backend: Disk vdev=(null) 
> spec.backend=qdisk
> 2017-11-07 12:17:16.601+: libxl: 
> libxl_device.c:365:libxl__device_disk_set_backend: Disk vdev=xvda 
> spec.backend=qdisk
> 2017-11-07 12:17:16.606+: libxl: 
> libxl_event.c:686:libxl__ev_xswatch_deregister: watch w=0xb3f06544: 
> deregister unregistered
> 2017-11-07 12:17:26.693+: libxl: 
> libxl_device.c:1366:libxl__wait_for_backend: Backend 
> /local/domain/0/backend/qdisk/0/51712 not ready
> 2017-11-07 12:17:26.693+: libxl: 
> libxl_bootloader.c:417:bootloader_disk_attached_cb: Domain 7:failed to attach 
> local disk for bootloader execution
> 2017-11-07 12:17:26.693+: libxl: 
> libxl_event.c:686:libxl__ev_xswatch_deregister: watch w=0xb3f06658: 
> deregister unregistered
> 2017-11-07 12:17:26.693+: libxl: 
> libxl_bootloader.c:283:bootloader_local_detached_cb: Domain 7:unable to 
> detach locally attached disk
> 2017-11-07 12:17:26.693+: libxl: 
> libxl_create.c:1246:domcreate_rebuild_done: Domain 7:cannot (re-)build 
> domain: -3
> 2017-11-07 12:17:26.708+: libxl: libxl_domain.c:1138:devices_destroy_cb: 
> Domain 7:Forked pid 4256 for destroy of domain
> 2017-11-07 12:17:26.711+: libxl: libxl_event.c:1869:libxl__ao_complete: 
> ao 0xb3f01578: complete, rc=-3
> 2017-11-07 12:17:26.712+: libxl: libxl_event.c:1838:libxl__ao__destroy: 
> ao 0xb3f01578: destroy
> 
> Sadly AFAICT there's no log for the QEMU running this backend.
> Previous domain creating attempts worked.

That's happening with QEMU mainline as well; I don't know why.

On this test, qemu-system-i386 may have crashed, but I don't know in
which log file this information would be present. There is just no
qemu process in the `ps` output.

-- 
Anthony PERARD



Re: [Xen-devel] [PATCH] Xen/pciback: Implement PCI slot or bus reset with 'do_flr' SysFS attribute

2017-11-07 Thread Jan Beulich
>>> On 06.11.17 at 18:48,  wrote:
> --- a/Documentation/ABI/testing/sysfs-driver-pciback
> +++ b/Documentation/ABI/testing/sysfs-driver-pciback
> @@ -11,3 +11,15 @@ Description:
>  #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
>  will allow the guest to read and write to the configuration
>  register 0x0E.
> +
> +What:   /sys/bus/pci/drivers/pciback/do_flr
> +Date:   Nov 2017
> +KernelVersion:  4.15
> +Contact:xen-de...@lists.xenproject.org 
> +Description:
> +An option to perform a slot or bus reset when a PCI device
> + is owned by Xen PCI backend. Writing a string of :BB:DD.F
> + will cause the pciback driver to perform a slot or bus reset
> + if the device supports it. It also checks to make sure that
> + all of the devices under the bridge are owned by Xen PCI
> + backend.

Why do you name this "do_flr" when you don't even try FLR, but
go straight to a slot or bus reset?

> +static int pcistub_reset_dev(struct pci_dev *dev)
> +{
> + struct xen_pcibk_dev_data *dev_data;
> + bool slot = false, bus = false;
> +
> + if (!dev)
> + return -EINVAL;
> +
> + dev_dbg(&dev->dev, "[%s]\n", __func__);
> +
> + if (!pci_probe_reset_slot(dev->slot)) {
> + slot = true;
> + } else if (!pci_probe_reset_bus(dev->bus)) {
> + /* We won't attempt to reset a root bridge. */
> + if (!pci_is_root_bus(dev->bus))
> + bus = true;
> + }
> +
> + if (!bus && !slot)
> + return -EOPNOTSUPP;
> +
> + if (!slot) {
> + struct pcistub_args arg = { .dev = NULL, .dcount = 0 };

Neither of the two initializers is really needed - just {} will do.

> + /*
> +  * Make sure all devices on this bus are owned by the
> +  * PCI backend so that we can safely reset the whole bus.
> +  */
> + pci_walk_bus(dev->bus, pcistub_search_dev, &arg);
> +
> + /* All devices under the bus should be part of pcistub! */
> + if (arg.dev) {
> + dev_err(&dev->dev, "%s device on the bus is not owned by pcistub\n",
> + pci_name(arg.dev));

I think "device" is superfluous here, while "the bus" could do with
replacing by something actually identifying the bus.

> + return -EBUSY;
> + }
> +
> + dev_dbg(>dev, "pcistub owns %d devices on the bus\n",
> + arg.dcount);

Same here for "the bus", provided this log message is useful in the
first place.

> + }

Aren't you missing an "else" here? Aiui in the "slot" case it may
still be multiple devices/functions which are affected.

Jan




Re: [Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths

2017-11-07 Thread Igor Druzhinin
On 07/11/17 08:07, Jan Beulich wrote:
>>>> On 02.11.17 at 20:46,  wrote:
>>> Any ideas about the root cause of the fault and suggestions how to
>>> reproduce it would be welcome. Does this crash really have something
>>> to do with PML? I doubt it, because the original environment may
>>> hardly be called PML-heavy.
> 
> Well, PML-heaviness doesn't matter. It's the mere fact that PML
> is enabled on the vCPU being destroyed.
> 
>> So we finally have complete understanding of what's going on:
>>
>> Some vCPU has just migrated to another pCPU and we switched to idle but
>> per_cpu(curr_vcpu) on the current pCPU is still pointing to it - this is
>> how the current logic works. While we're in idle we're issuing
>> vcpu_destroy() for some other domain which eventually calls
>> vmx_vcpu_disable_pml() and trashes VMCS pointer on the current pCPU. At
>> this moment we get a TLB flush IPI from that same vCPU which is now
>> context switching on another pCPU - it appears to clean TLB after
>> itself. This vCPU is already marked is_running=1 by the scheduler. In
>> the IPI handler we enter __sync_local_execstate() and trying to call
>> vmx_ctxt_switch_from() for the migrated vCPU which is supposed to call
>> vmcs_reload() but doesn't do it because is_running==1. The next VMWRITE
>> crashes the hypervisor.
>>
>> So the state transition diagram might look like:
>> pCPU1: vCPUx -> migrate to pCPU2 -> idle -> RCU callbacks ->
> 
> I'm not really clear about who/what is "idle" here: pCPU1,
> pCPU2, or yet something else?

It's switching to the "current" idle context on pCPU1.

> If vCPUx migrated to pCPU2,
> wouldn't it be put back into runnable state right away, and
> hence pCPU2 can't be idle at this point? Yet for pCPU1 I don't
> think its idleness would matter much, i.e. the situation could
> also arise without it becoming idle afaics. pCPU1 making it
> anywhere softirqs are being processed would suffice.
> 

Idleness matters in that case because we are not switching
per_cpu(curr_vcpu) which I think is the main problem when vCPU migration
comes into play.

>> vcpu_destroy() -> vmx_vcpu_disable_pml() -> vmcs_clear()
>> pCPU2: context switch into vCPUx -> is_running = 1 -> TLB flush
>> pCPU1: IPI handler -> context switch out of vCPUx -> VMWRITE -> CRASH!
>>
>> We can basically just fix the condition around vmcs_reload() call but
>> I'm not completely sure that it's the right way to do - I don't think
>> leaving per_cpu(curr_vcpu) pointing to a migrated vCPU is a good idea
>> (maybe we need to clean it). What are your thoughts?
> 
> per_cpu(curr_vcpu) can only validly be written inside
> __context_switch(), hence the only way to achieve this would
> be to force __context_switch() to be called earlier than out of
> the TLB flush IPI handler, perhaps like in the (untested!) patch
> below. Two questions then remain:
> - Should we perhaps rather do this in an arch-independent way
>   (i.e. ahead of the call to vcpu_destroy() in common code)?
> - This deals with only a special case of the more general "TLB
>   flush behind the back of a vmx_vmcs_enter() /
>   vmx_vmcs_exit() section" - does this need dealing with in a
>   more general way? Here I'm thinking of introducing a
>   FLUSH_STATE flag to be passed to flush_mask() instead of
>   the current flush_tlb_mask() in context_switch() and
>   sync_vcpu_execstate(). This could at the same time be used
>   for a small performance optimization: At least for HAP vCPU-s
>   I don't think we really need the TLB part of the flushes here.
> 
> Jan
> 
> --- unstable.orig/xen/arch/x86/domain.c
> +++ unstable/xen/arch/x86/domain.c
> @@ -379,6 +379,14 @@ int vcpu_initialise(struct vcpu *v)
>  
>  void vcpu_destroy(struct vcpu *v)
>  {
> +/*
> + * Flush all state for this vCPU before fully tearing it down. This is
> + * particularly important for HVM ones on VMX, so that this flushing of
> + * state won't happen from the TLB flush IPI handler behind the back of
> + * a vmx_vmcs_enter() / vmx_vmcs_exit() section.
> + */
> +sync_vcpu_execstate(v);
> +
>  xfree(v->arch.vm_event);
>  v->arch.vm_event = NULL;
>  

I don't think this is going to fix the problem since vCPU we are
currently destroying has nothing to do with the vCPUx that actually
caused the problem by its migration. We still are going to call
vmx_vcpu_disable_pml() which loads and cleans VMCS on the current pCPU.
Perhaps I should improve my diagram:

pCPU1: vCPUx of domain X -> migrate to pCPU2 -> switch to idle context
-> RCU callbacks -> vcpu_destroy(vCPUy of domain Y) ->
vmx_vcpu_disable_pml() -> vmx_vmcs_clear() (VMCS is trashed at this
point on pCPU1)

pCPU2: context switch into vCPUx -> vCPUx.is_running = 1 -> TLB flush
from context switch to clean TLB on pCPU1

(pCPU1 is still somewhere in vcpu_destroy() loop and with VMCS cleared
by vmx_vcpu_disable_pml())

pCPU1: IPI handler for TLB flush -> context switch out of vCPUx (this is
here because we haven't 

Re: [Xen-devel] [PATCH for-4.10] gcov: return EOPNOTSUPP for unimplemented gcov domctl

2017-11-07 Thread Jan Beulich
>>> On 07.11.17 at 13:31,  wrote:
> ENOSYS should only be used by unimplemented top-level syscalls. Use
> EOPNOTSUPP instead.
> 
> Signed-off-by: Roger Pau Monné 
> Reported-by: Jan Beulich 

Reviewed-by: Jan Beulich 




[Xen-devel] [distros-debian-snapshot test] 72430: tolerable FAIL

2017-11-07 Thread Platform Team regression test user
flight 72430 distros-debian-snapshot real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/72430/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-i386-amd64-weekly-netinst-pygrub 10 debian-di-install fail like 
72400
 test-amd64-amd64-amd64-current-netinst-pygrub 10 debian-di-install fail like 
72400
 test-amd64-amd64-i386-weekly-netinst-pygrub 10 debian-di-install fail like 
72400
 test-amd64-amd64-amd64-weekly-netinst-pygrub 10 debian-di-install fail like 
72400
 test-amd64-amd64-i386-daily-netboot-pygrub 10 debian-di-install fail like 72400
 test-amd64-i386-i386-daily-netboot-pvgrub 10 debian-di-install fail like 72400
 test-amd64-amd64-amd64-daily-netboot-pvgrub 10 debian-di-install fail like 
72400
 test-amd64-i386-amd64-daily-netboot-pygrub 10 debian-di-install fail like 72400
 test-armhf-armhf-armhf-daily-netboot-pygrub 10 debian-di-install fail like 
72400
 test-amd64-i386-i386-weekly-netinst-pygrub 10 debian-di-install fail like 72400
 test-amd64-i386-i386-current-netinst-pygrub 10 debian-di-install fail like 
72400
 test-amd64-amd64-i386-current-netinst-pygrub 10 debian-di-install fail like 
72400
 test-amd64-i386-amd64-current-netinst-pygrub 10 debian-di-install fail like 
72400

baseline version:
 flight   72400

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-daily-netboot-pvgrub  fail
 test-amd64-i386-i386-daily-netboot-pvgrubfail
 test-amd64-i386-amd64-daily-netboot-pygrub   fail
 test-armhf-armhf-armhf-daily-netboot-pygrub  fail
 test-amd64-amd64-i386-daily-netboot-pygrub   fail
 test-amd64-amd64-amd64-current-netinst-pygrubfail
 test-amd64-i386-amd64-current-netinst-pygrub fail
 test-amd64-amd64-i386-current-netinst-pygrub fail
 test-amd64-i386-i386-current-netinst-pygrub  fail
 test-amd64-amd64-amd64-weekly-netinst-pygrub fail
 test-amd64-i386-amd64-weekly-netinst-pygrub  fail
 test-amd64-amd64-i386-weekly-netinst-pygrub  fail
 test-amd64-i386-i386-weekly-netinst-pygrub   fail



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.




Re: [Xen-devel] [libvirt test] 115636: regressions - FAIL

2017-11-07 Thread Roger Pau Monné
On Tue, Nov 07, 2017 at 12:42:59PM +, osstest service owner wrote:
> flight 115636 libvirt real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/115636/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat fail REGR. vs. 
> 115476

One of the local attachments done in order to run the bootloader
failed:

2017-11-07 12:17:16.601+: libxl: 
libxl_disk.c:990:libxl__device_disk_local_initiate_attach: Trying to find local 
path
2017-11-07 12:17:16.601+: libxl: 
libxl_disk.c:998:libxl__device_disk_local_initiate_attach: Local path not 
found, initiating attach.
2017-11-07 12:17:16.601+: libxl: 
libxl_device.c:365:libxl__device_disk_set_backend: Disk vdev=(null) 
spec.backend=qdisk
2017-11-07 12:17:16.601+: libxl: 
libxl_device.c:365:libxl__device_disk_set_backend: Disk vdev=xvda 
spec.backend=qdisk
2017-11-07 12:17:16.606+: libxl: 
libxl_event.c:686:libxl__ev_xswatch_deregister: watch w=0xb3f06544: deregister 
unregistered
2017-11-07 12:17:26.693+: libxl: 
libxl_device.c:1366:libxl__wait_for_backend: Backend 
/local/domain/0/backend/qdisk/0/51712 not ready
2017-11-07 12:17:26.693+: libxl: 
libxl_bootloader.c:417:bootloader_disk_attached_cb: Domain 7:failed to attach 
local disk for bootloader execution
2017-11-07 12:17:26.693+: libxl: 
libxl_event.c:686:libxl__ev_xswatch_deregister: watch w=0xb3f06658: deregister 
unregistered
2017-11-07 12:17:26.693+: libxl: 
libxl_bootloader.c:283:bootloader_local_detached_cb: Domain 7:unable to detach 
locally attached disk
2017-11-07 12:17:26.693+: libxl: 
libxl_create.c:1246:domcreate_rebuild_done: Domain 7:cannot (re-)build domain: 
-3
2017-11-07 12:17:26.708+: libxl: libxl_domain.c:1138:devices_destroy_cb: 
Domain 7:Forked pid 4256 for destroy of domain
2017-11-07 12:17:26.711+: libxl: libxl_event.c:1869:libxl__ao_complete: ao 
0xb3f01578: complete, rc=-3
2017-11-07 12:17:26.712+: libxl: libxl_event.c:1838:libxl__ao__destroy: ao 
0xb3f01578: destroy

Sadly AFAICT there's no log for the QEMU running this backend.
Previous domain creating attempts worked.

Roger.



Re: [Xen-devel] [PATCH v2] aarch64: advertise the GIC system register interface

2017-11-07 Thread Peter Maydell
On 6 November 2017 at 22:16, Stefano Stabellini  wrote:
> When QEMU emulates a GICv3, it needs to advertise the presence of the
> system register interface, which is done via id_aa64pfr0.
>
> To do that, and at the same time to avoid advertising the presence of
> the system register interface when it is actually not available, set a
> boolean property in machvirt_init. Check on the boolean property from
> register_cp_regs_for_features and set id_aa64pfr0 accordingly.
>
> Signed-off-by: Stefano Stabellini 
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 9e18b41..369d36b 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1401,6 +1401,9 @@ static void machvirt_init(MachineState *machine)
>  object_property_set_link(cpuobj, OBJECT(secure_sysmem),
>   "secure-memory", _abort);
>  }
> +if (vms->gic_version == 3) {
> +object_property_set_bool(cpuobj, true, "gicv3-sysregs", NULL);
> +}
>
>  object_property_set_bool(cpuobj, true, "realized", NULL);
>  object_unref(cpuobj);

I thought about this on the cycle into work this morning, and I
think that rather than require every board that uses gicv3
to set a property on the CPU, we should change the definition
of the id_aa64pfr0 register so that rather than being ARM_CP_CONST
it has a readfn, and then at runtime we can get that readfn to
add in the right bit if env->gicv3state is non-null.
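
(A rough sketch of such a readfn, assuming the 2017-era helper names below;
not the actual patch:)

static uint64_t id_aa64pfr0_read(CPUARMState *env, const ARMCPRegInfo *ri)
{
    ARMCPU *cpu = arm_env_get_cpu(env);
    uint64_t pfr0 = cpu->id_aa64pfr0;

    /* ID_AA64PFR0_EL1.GIC (bits [27:24]) = 1 advertises the GIC
     * system register interface; only set it when a GICv3 is
     * actually wired up. */
    if (env->gicv3state) {
        pfr0 |= 1ull << 24;
    }
    return pfr0;
}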

I'll put together a patch this afternoon.

thanks
-- PMM



[Xen-devel] [libvirt test] 115636: regressions - FAIL

2017-11-07 Thread osstest service owner
flight 115636 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/115636/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-libvirt-qcow2 17 guest-start/debian.repeat fail REGR. vs. 
115476

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 115476
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 115476
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 115476
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  47eb77fb33b3b2c1485cafb606402826169b42fa
baseline version:
 libvirt  1bf893406637e852daeaafec6617d3ee3716de25

Last test of basis   115476  2017-11-02 04:22:37 Z5 days
Failing since115509  2017-11-03 04:20:26 Z4 days5 attempts
Testing same since   115636  2017-11-07 04:30:52 Z0 days1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Daniel Veillard 
  Dawid Zamirski 
  Jiri Denemark 
  John Ferlan 
  Michal Privoznik 
  Nikolay Shirokovskiy 
  Peter Krempa 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-amd64-i386-libvirt-qcow2fail
 test-armhf-armhf-libvirt-raw pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 538 lines long.)



Re: [Xen-devel] [BUG] blkback reporting incorrect number of sectors, unable to boot

2017-11-07 Thread Roger Pau Monné
On Tue, Nov 07, 2017 at 04:31:06AM -0700, Jan Beulich wrote:
> >>> On 07.11.17 at 11:30,  wrote:
> > On Mon, Nov 06, 2017 at 05:33:37AM -0700, Jan Beulich wrote:
> >> >>> On 04.11.17 at 05:48,  wrote:
> >> > I added some additional storage to my server with some native 4k sector
> >> > size disks.  The LVM volumes on that array seem to work fine when mounted
> >> > by the host, and when passed through to any of the Linux guests, but
> >> > Windows guests aren't able to use them when using PV drivers.  The work
> >> > fine to install when I first install Windows (Windows 10, latest build) 
> >> > but
> >> > once I install the PV drivers it will no longer boot and give an
> >> > inaccessible boot device error.  If I assign the storage to a different
> >> > Windows guest that already has the drivers installed (as secondary 
> >> > storage,
> >> > not as the boot device) I see the disk listed in disk management, but the
> >> > size of the disk is 8x larger than it should be.  After looking into it a
> >> > bit, the disk is reporting 8x the number of sectors it should have when I
> >> > run xenstore-ls.  Here is the info from xenstore-ls for the relevant 
> >> > volume:
> >> > 
> >> >   51712 = ""
> >> >frontend = "/local/domain/8/device/vbd/51712"
> >> >params = "/dev/tv_storage/main-storage"
> >> >script = "/etc/xen/scripts/block"
> >> >frontend-id = "8"
> >> >online = "1"
> >> >removable = "0"
> >> >bootable = "1"
> >> >state = "2"
> >> >dev = "xvda"
> >> >type = "phy"
> >> >mode = "w"
> >> >device-type = "disk"
> >> >discard-enable = "1"
> >> >feature-max-indirect-segments = "256"
> >> >multi-queue-max-queues = "12"
> >> >max-ring-page-order = "4"
> >> >physical-device = "fe:0"
> >> >physical-device-path = "/dev/dm-0"
> >> >hotplug-status = "connected"
> >> >feature-flush-cache = "1"
> >> >feature-discard = "0"
> >> >feature-barrier = "1"
> >> >feature-persistent = "1"
> >> >sectors = "34359738368"
> >> >info = "0"
> >> >sector-size = "4096"
> >> >physical-sector-size = "4096"
> >> > 
> >> > 
> >> > Here are the numbers for the volume as reported by fdisk:
> >> > 
> >> > Disk /dev/tv_storage/main-storage: 16 TiB, 17592186044416 bytes, 
> >> > 4294967296
> >> > sectors
> >> > Units: sectors of 1 * 4096 = 4096 bytes
> >> > Sector size (logical/physical): 4096 bytes / 4096 bytes
> >> > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> >> > Disklabel type: dos
> >> > Disk identifier: 0x
> >> > 
> >> > DeviceBoot StartEndSectors Size Id 
> >> > Type
> >> > /dev/tv_storage/main-storage1  1 4294967295 4294967295  16T ee 
> >> > GPT
> >> > 
> >> > 
> >> > As with the size reported in Windows disk management, the number of 
> >> > sectors
> >> > from xenstore seems is 8x higher than what it should be.  The disks 
> >> > aren't
> >> > using 512b sector emulation, they are natively 4k, so I have no idea 
> >> > where
> >> > the 8x increase is coming from.
> >> 
> >> Hmm, looks like a backend problem indeed: struct hd_struct's
> >> nr_sects (which get_capacity() returns) looks to be in 512-byte
> >> units, regardless of actual sector size. Hence the plain
> >> get_capacity() use as well the (wrongly open coded) use of
> >> part_nr_sects_read() looks insufficient in vbd_sz(). Roger,
> >> Konrad?
> > 
> > Hm, AFAICT sector-size should always be set to 512.
> 
> Which would mean that bdev_logical_block_size() can't be used by
> blkback to set this value. Yet then - what's the point of the xenstore
> setting if it's always the same value anyway?

Some frontends (at least FreeBSD) will choke if sector-size is not
set. So we have the following scenario:

 - Windows: acknowledges sector-size * sectors in order to set disk
   capacity.
 - Linux: sets disk capacity to sectors * 512.
 - FreeBSD: sets disk capacity to sector-size * sectors, will choke if
   sector-size is not set.

In order to keep compatibility with all of them AFAICT the only option
is to hardcode sector-size to 512 in xenstore.
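
(In code terms, a hedged sketch of the rule that follows from this;
vbd_xenstore_sectors() is a made-up name, while get_capacity() is the Linux
block-layer helper discussed earlier in the thread:)

/* get_capacity() already counts 512-byte sectors regardless of
 * bdev_logical_block_size(), so with sector-size pinned to 512 the
 * xenstore "sectors" node can simply be the raw capacity, and every
 * frontend's capacity computation (sectors * 512) comes out right. */
static unsigned long long vbd_xenstore_sectors(struct block_device *bdev)
{
    return get_capacity(bdev->bd_disk);
}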

Roger.



[Xen-devel] [PATCH for-4.10] gcov: return EOPNOTSUPP for unimplemented gcov domctl

2017-11-07 Thread Roger Pau Monne
ENOSYS should only be used by unimplemented top-level syscalls. Use
EOPNOTSUPP instead.

Signed-off-by: Roger Pau Monné 
Reported-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
Cc: Julien Grall 
---
 xen/common/gcov/gcov.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/common/gcov/gcov.c b/xen/common/gcov/gcov.c
index 35653fd8d8..283d2eec86 100644
--- a/xen/common/gcov/gcov.c
+++ b/xen/common/gcov/gcov.c
@@ -239,7 +239,7 @@ int sysctl_gcov_op(struct xen_sysctl_gcov_op *op)
 break;
 
 default:
-ret = -ENOSYS;
+ret = -EOPNOTSUPP;
 break;
 }
 
-- 
2.13.6 (Apple Git-96)




Re: [Xen-devel] Libvirt config converter can't handle file not ending with new line

2017-11-07 Thread Wei Liu
On Mon, Nov 06, 2017 at 09:41:01PM -0700, Jim Fehlig wrote:
> On 10/30/2017 06:17 AM, Wei Liu wrote:
> > Hi Jim
> > 
> > I discover a problem when using xen_xl converter. When the file in
> > question doesn't end with a new line, I get the following error:
> > 
> >error: configuration file syntax error: memory conf:53: expecting a value
> 
> I'm not able to reproduce this issue. The libvirt.git tree I tried was a bit
> dated, but even after updating to latest master I can't reproduce.
> 
> > After digging a bit (but haven't read libvirt code), it appears that the
> > file didn't end with a new line.
> 
> I tried several files without ending new lines, going both directions
> (domxml-to-native and domxml-from-native), but didn't see the mentioned
> error. Perhaps your config is revealing another bug which is being
> improperly reported. Can you provide an example of the problematic config?
> 

I tried to get the exact file that caused the problem but it is already
destroyed by osstest.

A similar file:

http://logs.test-lab.xenproject.org/osstest/logs/115436/test-amd64-amd64-libvirt-pair/debian.guest.osstest.cfg

If you hexdump -C it, you can see the last character is 0a. Remove it and
feed the file into the converter.

Wei.



Re: [Xen-devel] [linux-4.9 test] 115504: regressions - FAIL

2017-11-07 Thread Roger Pau Monné
On Fri, Nov 03, 2017 at 08:21:31PM +, osstest service owner wrote:
> flight 115504 linux-4.9 real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/115504/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop   fail REGR. vs. 
> 114814

AFAICT this tree should also be force-pushed; the ws16 issue is
the same as the one seen on xen-unstable.

Thanks, Roger.



[Xen-devel] [linux-linus test] 115628: regressions - FAIL

2017-11-07 Thread osstest service owner
flight 115628 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/115628/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop   fail REGR. vs. 114682

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 114682
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 114682
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 114682
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 114682
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 114682
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 114682
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 114682
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 114682
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass

version targeted for testing:
 linux e4880bc5dfb1f02b152e62a894b5c6f3e995b3cf
baseline version:
 linux ebe6e90ccc6679cb01d2b280e4b61e6092d4bedb

Last test of basis   114682  2017-10-18 09:54:11 Z   20 days
Failing since114781  2017-10-20 01:00:47 Z   18 days   31 attempts
Testing same since   115628  2017-11-07 01:20:30 Z0 days1 attempts


522 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvops   

Re: [Xen-devel] [PATCH for-next 1/9] gcov: return ENOSYS for unimplemented gcov domctl

2017-11-07 Thread Wei Liu
On Tue, Nov 07, 2017 at 09:41:58AM +, Roger Pau Monné wrote:
> > Okay, so EOPNOTSUPP is it then, which is also my preference
> > (due to there being so many uses of EINVAL elsewhere). I've
> > merely mentioned that EINVAL would be suitable since,
> > technically speaking, the value in a "sub-operation" field being
> > invalid is no different from this being the case for the value in
> > any other field.
> 
> If I don't get any more comments I will re-send this patch separately
> using EOPNOTSUPP instead of ENOSYS. I will also keep the Acks gathered
> so far unless anyone objects.
> 

Please send a new patch. I believe this one is already applied.



Re: [Xen-devel] [PATCH RFC 3/8] libxl: add backend_features to libxl_device_disk

2017-11-07 Thread Joao Martins


On 11/07/2017 11:28 AM, Oleksandr Grytsov wrote:
> On Thu, Nov 2, 2017 at 8:06 PM, Joao Martins  wrote:
> 
> The function libxl__device_generic_add will have an additional
> argument whereby it adds a second set of entries visible to the
> backend only. These entries will then be used for devices
> thus overriding the backend's maximum feature set with these user-defined ones.
> 
> libxl_device_disk.backend_features is a key-value store storing:
>   <feature> = <value>
> 
> xl|libxl are stateless with respect to feature names, therefore it is up to
> the admin to carefully select those. If the backend doesn't support this,
> the features won't be overwritten.
> 
> Signed-off-by: Joao Martins 
> ---
>  tools/libxl/libxl.h          |  8 
>  tools/libxl/libxl_console.c  |  5 +++--
>  tools/libxl/libxl_device.c   | 37 +
>  tools/libxl/libxl_disk.c     | 17 +++--
>  tools/libxl/libxl_internal.h |  4 +++-
>  tools/libxl/libxl_pci.c      |  2 +-
>  tools/libxl/libxl_types.idl  |  1 +
>  tools/libxl/libxl_usb.c      |  2 +-
>  8 files changed, 65 insertions(+), 11 deletions(-)
> 
> 
> No need to extend libxl__device_generic_add with an additional parameter
> (brents).
> You can add a nested entry in libxl__set_xenstore_ as follows:
> 
> flexarray_append(back, "require/feature-persistent", "0");

Right, although entries on the "back" array will have read-only permission for
the frontend. And this newly added "require" directory in the RFC was meant to
be visible to the backend only, hence having only XS_PERM_NONE permissions set.
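
In other words, something along these lines (a sketch only, not the RFC code;
the path and the backend domid are made up for illustration, and xsh is an
already-open xs_handle):

    /* Make the "require" subtree visible to the backend domain only: the
     * node is owned by the backend (dom0 here) and grants no access to
     * anybody else, so the frontend cannot read the overridden features.
     */
    struct xs_permissions perms = {
        .id    = 0,            /* owner: the backend domain */
        .perms = XS_PERM_NONE, /* everyone else: no access  */
    };
    xs_set_permissions(xsh, XBT_NULL,
                       "/local/domain/0/backend/vbd/8/51712/require",
                       &perms, 1);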

Joao



Re: [Xen-devel] [BUG] blkback reporting incorrect number of sectors, unable to boot

2017-11-07 Thread Jan Beulich
>>> On 07.11.17 at 11:30,  wrote:
> On Mon, Nov 06, 2017 at 05:33:37AM -0700, Jan Beulich wrote:
>> >>> On 04.11.17 at 05:48,  wrote:
>> > I added some additional storage to my server with some native 4k sector
>> > size disks.  The LVM volumes on that array seem to work fine when mounted
>> > by the host, and when passed through to any of the Linux guests, but
>> > Windows guests aren't able to use them when using PV drivers.  They work
>> > fine to install when I first install Windows (Windows 10, latest build) but
>> > once I install the PV drivers it will no longer boot and give an
>> > inaccessible boot device error.  If I assign the storage to a different
>> > Windows guest that already has the drivers installed (as secondary storage,
>> > not as the boot device) I see the disk listed in disk management, but the
>> > size of the disk is 8x larger than it should be.  After looking into it a
>> > bit, the disk is reporting 8x the number of sectors it should have when I
>> > run xenstore-ls.  Here is the info from xenstore-ls for the relevant 
>> > volume:
>> > 
>> >   51712 = ""
>> >frontend = "/local/domain/8/device/vbd/51712"
>> >params = "/dev/tv_storage/main-storage"
>> >script = "/etc/xen/scripts/block"
>> >frontend-id = "8"
>> >online = "1"
>> >removable = "0"
>> >bootable = "1"
>> >state = "2"
>> >dev = "xvda"
>> >type = "phy"
>> >mode = "w"
>> >device-type = "disk"
>> >discard-enable = "1"
>> >feature-max-indirect-segments = "256"
>> >multi-queue-max-queues = "12"
>> >max-ring-page-order = "4"
>> >physical-device = "fe:0"
>> >physical-device-path = "/dev/dm-0"
>> >hotplug-status = "connected"
>> >feature-flush-cache = "1"
>> >feature-discard = "0"
>> >feature-barrier = "1"
>> >feature-persistent = "1"
>> >sectors = "34359738368"
>> >info = "0"
>> >sector-size = "4096"
>> >physical-sector-size = "4096"
>> > 
>> > 
>> > Here are the numbers for the volume as reported by fdisk:
>> > 
>> > Disk /dev/tv_storage/main-storage: 16 TiB, 17592186044416 bytes, 4294967296
>> > sectors
>> > Units: sectors of 1 * 4096 = 4096 bytes
>> > Sector size (logical/physical): 4096 bytes / 4096 bytes
>> > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
>> > Disklabel type: dos
>> > Disk identifier: 0x
>> > 
>> > DeviceBoot StartEndSectors Size Id Type
>> > /dev/tv_storage/main-storage1  1 4294967295 4294967295  16T ee GPT
>> > 
>> > 
>> > As with the size reported in Windows disk management, the number of sectors
>> > from xenstore seems is 8x higher than what it should be.  The disks aren't
>> > using 512b sector emulation, they are natively 4k, so I have no idea where
>> > the 8x increase is coming from.
>> 
>> Hmm, looks like a backend problem indeed: struct hd_struct's
>> nr_sects (which get_capacity() returns) looks to be in 512-byte
>> units, regardless of actual sector size. Hence the plain
>> get_capacity() use as well as the (wrongly open coded) use of
>> part_nr_sects_read() looks insufficient in vbd_sz(). Roger,
>> Konrad?
> 
> Hm, AFAICT sector-size should always be set to 512.

Which would mean that bdev_logical_block_size() can't be used by
blkback to set this value. Yet then - what's the point of the xenstore
setting if it's always the same value anyway?
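
To make the mismatch concrete (a standalone sketch using the numbers from the
report; this is not the in-tree code):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t sectors = 34359738368ULL; /* xenstore "sectors"     */
        uint64_t ssz     = 4096;           /* xenstore "sector-size" */

        /* get_capacity()/nr_sects count 512-byte units regardless of the
         * logical block size, so the pair published above is inconsistent:
         */
        printf("frontend view: %llu bytes\n",
               (unsigned long long)(sectors * ssz)); /* 128 TiB - 8x too big */
        printf("actual size:   %llu bytes\n",
               (unsigned long long)(sectors * 512)); /* 16 TiB               */
        return 0;
    }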

Jan




Re: [Xen-devel] [PATCH RFC 3/8] libxl: add backend_features to libxl_device_disk

2017-11-07 Thread Oleksandr Grytsov
On Thu, Nov 2, 2017 at 8:06 PM, Joao Martins 
wrote:

> The function libxl__device_generic_add will have an additional
> argument whereby it adds a second set of entries visible to the
> backend only. These entries will then be used for devices
> thus overriding the backend's maximum feature set with these user-defined ones.
>
> libxl_device_disk.backend_features is a key-value store storing:
>   <feature> = <value>
>
> xl|libxl are stateless with respect to feature names, therefore it is up to the
> admin to carefully select those. If the backend doesn't support this, the
> features won't be overwritten.
>
> Signed-off-by: Joao Martins 
> ---
>  tools/libxl/libxl.h  |  8 
>  tools/libxl/libxl_console.c  |  5 +++--
>  tools/libxl/libxl_device.c   | 37 +
>  tools/libxl/libxl_disk.c | 17 +++--
>  tools/libxl/libxl_internal.h |  4 +++-
>  tools/libxl/libxl_pci.c  |  2 +-
>  tools/libxl/libxl_types.idl  |  1 +
>  tools/libxl/libxl_usb.c  |  2 +-
>  8 files changed, 65 insertions(+), 11 deletions(-)
>
>
No need to extend libxl__device_generic_add with an additional parameter
(brents).
You can add a nested entry in libxl__set_xenstore_ as follows:

flexarray_append(back, "require/feature-persistent", "0");

-- 
Best Regards,
Oleksandr Grytsov.


Re: [Xen-devel] [PATCH] Xen/pciback: Implement PCI slot or bus reset with 'do_flr' SysFS attribute

2017-11-07 Thread Roger Pau Monné
On Mon, Nov 06, 2017 at 12:48:42PM -0500, Govinda Tatti wrote:
> The life-cycle of a PCI device in Xen pciback is complex and is constrained
> by the generic PCI locking mechanism.
> 
> - It starts with the device being bound to us, for which we do a function
>   reset (done via SysFS so the PCI lock is held).
> - If the device is unbound from us, we also do a function reset
>   (done via SysFS so the PCI lock is held).
> - If the device is un-assigned from a guest - we do a function reset
>   (no PCI lock is held).
> 
> All reset operations are done on the individual PCI function level
> (so bus:device:function).
> 
> Resetting an individual PCI function requires the device to support FLR
> (PCIe or AF), a PM reset via D3hot->D0, a device-specific reset, or a
> secondary bus reset for a singleton device on a bus. But FLR does not have
> widespread support, or is not reliable in some cases, so we need to provide
> an alternate mechanism for users to perform a slot- or bus-level reset.
> 
> Currently, a slot or bus reset is not exposed in SysFS as there is no good
> way of exposing a bus topology there. This is due to the complexity -
> we MUST know that the different functions of a PCIe device are not in use
> by other drivers, or if they are in use (say one of them is assigned to a
> guest and the other is  idle) - it is still OK to reset the slot (assuming
> both of them are owned by Xen pciback).
> 
> This patch does that by doing a slot or bus reset (if slot not supported)
> if all of the functions of a PCIe device belong to Xen PCIback.
> 
> Due to the complexity with the PCI lock we cannot do the reset when a
> device is bound ('echo $BDF > bind') or when unbound ('echo $BDF > unbind')
> as the pci_[slot|bus]_reset also takes the same lock resulting in a
> dead-lock.
> 
> Putting the reset function in a work-queue or thread won't work either -
> as we have to do the reset function outside the 'unbind' context (it holds
> the PCI lock). But once you 'unbind' a device the device is no longer under
> the ownership of Xen pciback and the pci_set_drvdata has been reset, so
> we cannot use a thread for this.
> 
> Instead of doing all this complex dance, we depend on the tool-stack doing
> the right thing. As such, we implement the 'do_flr' SysFS attribute which
> 'xl' uses when a device is detached or attached from/to a guest. It
> bypasses the need to worry about the PCI lock.
> 
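For illustration, usage would mirror the existing quirks example documented
above (the BDF here is hypothetical):

   #echo 0000:0b:00.0 > /sys/bus/pci/drivers/pciback/do_flr
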
> To not inadvertently do a bus reset that would affect devices that are in
> use by other drivers (other than Xen pciback) prior to the reset, we check
> that all of the devices under the bridge are owned by Xen pciback. If they
> are not, we refrain from executing the bus (or slot) reset.
> 
> Signed-off-by: Govinda Tatti 
> Signed-off-by: Konrad Rzeszutek Wilk 
> Reviewed-by: Boris Ostrovsky 
> ---
>  Documentation/ABI/testing/sysfs-driver-pciback |  12 +++
>  drivers/xen/xen-pciback/pci_stub.c | 125 
> +
>  2 files changed, 137 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-driver-pciback 
> b/Documentation/ABI/testing/sysfs-driver-pciback
> index 6a733bf..ccf7dc0 100644
> --- a/Documentation/ABI/testing/sysfs-driver-pciback
> +++ b/Documentation/ABI/testing/sysfs-driver-pciback
> @@ -11,3 +11,15 @@ Description:
>  #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
>  will allow the guest to read and write to the configuration
>  register 0x0E.
> +
> +What:   /sys/bus/pci/drivers/pciback/do_flr
> +Date:   Nov 2017
> +KernelVersion:  4.15
> +Contact:xen-de...@lists.xenproject.org
> +Description:
> +An option to perform a slot or bus reset when a PCI device
> + is owned by Xen PCI backend. Writing a string of :BB:DD.F
> + will cause the pciback driver to perform a slot or bus reset
> + if the device supports it. It also checks to make sure that
> + all of the devices under the bridge are owned by Xen PCI
> + backend.
> diff --git a/drivers/xen/xen-pciback/pci_stub.c 
> b/drivers/xen/xen-pciback/pci_stub.c
> index 6331a95..2b2c269 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -244,6 +244,96 @@ struct pci_dev *pcistub_get_pci_dev(struct 
> xen_pcibk_device *pdev,
>   return found_dev;
>  }
>  
> +struct pcistub_args {
> + struct pci_dev *dev;

const?

> + int dcount;

unsigned int.

> +};
> +
> +static int pcistub_search_dev(struct pci_dev *dev, void *data)

Seems like this function would better return a boolean rather than an
int.

> +{
> + struct pcistub_device *psdev;
> + struct pcistub_args *arg = data;
> + bool found_dev = false;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&pcistub_devices_lock, flags);
> +
> + list_for_each_entry(psdev, &pcistub_devices, dev_list) {

Re: [Xen-devel] [BUG] blkback reporting incorrect number of sectors, unable to boot

2017-11-07 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Roger Pau Monné
> Sent: 07 November 2017 10:30
> To: Jan Beulich 
> Cc: Mike Reardon ; xen-devel@lists.xen.org; Konrad
> Rzeszutek Wilk 
> Subject: Re: [Xen-devel] [BUG] blkback reporting incorrect number of
> sectors, unable to boot
> 
> On Mon, Nov 06, 2017 at 05:33:37AM -0700, Jan Beulich wrote:
> > >>> On 04.11.17 at 05:48,  wrote:
> > > I added some additional storage to my server with some native 4k sector
> > > size disks.  The LVM volumes on that array seem to work fine when
> mounted
> > > by the host, and when passed through to any of the Linux guests, but
> > > Windows guests aren't able to use them when using PV drivers.  The
> work
> > > fine to install when I first install Windows (Windows 10, latest build) 
> > > but
> > > once I install the PV drivers it will no longer boot and give an
> > > inaccessible boot device error.  If I assign the storage to a different
> > > Windows guest that already has the drivers installed (as secondary
> storage,
> > > not as the boot device) I see the disk listed in disk management, but the
> > > size of the disk is 8x larger than it should be.  After looking into it a
> > > bit, the disk is reporting 8x the number of sectors it should have when I
> > > run xenstore-ls.  Here is the info from xenstore-ls for the relevant
> volume:
> > >
> > >   51712 = ""
> > >frontend = "/local/domain/8/device/vbd/51712"
> > >params = "/dev/tv_storage/main-storage"
> > >script = "/etc/xen/scripts/block"
> > >frontend-id = "8"
> > >online = "1"
> > >removable = "0"
> > >bootable = "1"
> > >state = "2"
> > >dev = "xvda"
> > >type = "phy"
> > >mode = "w"
> > >device-type = "disk"
> > >discard-enable = "1"
> > >feature-max-indirect-segments = "256"
> > >multi-queue-max-queues = "12"
> > >max-ring-page-order = "4"
> > >physical-device = "fe:0"
> > >physical-device-path = "/dev/dm-0"
> > >hotplug-status = "connected"
> > >feature-flush-cache = "1"
> > >feature-discard = "0"
> > >feature-barrier = "1"
> > >feature-persistent = "1"
> > >sectors = "34359738368"
> > >info = "0"
> > >sector-size = "4096"
> > >physical-sector-size = "4096"
> > >
> > >
> > > Here are the numbers for the volume as reported by fdisk:
> > >
> > > Disk /dev/tv_storage/main-storage: 16 TiB, 17592186044416 bytes,
> 4294967296
> > > sectors
> > > Units: sectors of 1 * 4096 = 4096 bytes
> > > Sector size (logical/physical): 4096 bytes / 4096 bytes
> > > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > > Disklabel type: dos
> > > Disk identifier: 0x
> > >
> > > DeviceBoot StartEndSectors Size Id 
> > > Type
> > > /dev/tv_storage/main-storage1  1 4294967295 4294967295  16T ee
> GPT
> > >
> > >
> > > As with the size reported in Windows disk management, the number of
> sectors
> > > from xenstore seems is 8x higher than what it should be.  The disks aren't
> > > using 512b sector emulation, they are natively 4k, so I have no idea where
> > > the 8x increase is coming from.
> >
> > Hmm, looks like a backend problem indeed: struct hd_struct's
> > nr_sects (which get_capacity() returns) looks to be in 512-byte
> > units, regardless of actual sector size. Hence the plain
> > get_capacity() use as well as the (wrongly open coded) use of
> > part_nr_sects_read() looks insufficient in vbd_sz(). Roger,
> > Konrad?
> 
> Hm, AFAICT sector-size should always be set to 512.
> 
> > Question of course is whether the Linux frontend then
> > also needs adjustment, and hence whether the backend can
> > be corrected in a compatible way in the first place.
> 
> blkfront uses set_capacity, which also seems to expect the sectors to
> be hardcoded to 512.
> 

Oh dear. No wonder it's all quite broken.

  Paul

> Roger.
> 


[Xen-devel] [PATCH v3] xen-disk: use an IOThread per instance

2017-11-07 Thread Paul Durrant
This patch allocates an IOThread object for each xen_disk instance and
sets the AIO context appropriately on connect. This allows processing
of I/O to proceed in parallel.

The patch also adds tracepoints into xen_disk to make it possible to
follow the state transitions of an instance in the log.

Signed-off-by: Paul Durrant 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Kevin Wolf 
Cc: Max Reitz 

v3:
 - Use new iothread_create/destroy() functions

v2:
 - explicitly acquire and release AIO context in qemu_aio_complete() and
   blk_bh()
---
 hw/block/trace-events |  7 +++
 hw/block/xen_disk.c   | 53 ---
 2 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/hw/block/trace-events b/hw/block/trace-events
index cb6767b3ee..962a3bfa24 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *vdev, void *mrb, int start, 
int num_reqs, uint6
 # hw/block/hd-geometry.c
 hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS 
%d %d %d"
 hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int 
trans) "blk %p CHS %u %u %u trans %d"
+
+# hw/block/xen_disk.c
+xen_disk_alloc(char *name) "%s"
+xen_disk_init(char *name) "%s"
+xen_disk_connect(char *name) "%s"
+xen_disk_disconnect(char *name) "%s"
+xen_disk_free(char *name) "%s"
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index e431bd89e8..f74fcd42d1 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -27,10 +27,12 @@
 #include "hw/xen/xen_backend.h"
 #include "xen_blkif.h"
 #include "sysemu/blockdev.h"
+#include "sysemu/iothread.h"
 #include "sysemu/block-backend.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qstring.h"
+#include "trace.h"
 
 /* - */
 
@@ -125,6 +127,9 @@ struct XenBlkDev {
 DriveInfo   *dinfo;
 BlockBackend*blk;
 QEMUBH  *bh;
+
+IOThread*iothread;
+AioContext  *ctx;
 };
 
 /* - */
@@ -596,9 +601,12 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq);
 static void qemu_aio_complete(void *opaque, int ret)
 {
 struct ioreq *ioreq = opaque;
+struct XenBlkDev *blkdev = ioreq->blkdev;
+
+aio_context_acquire(blkdev->ctx);
 
 if (ret != 0) {
-xen_pv_printf(&ioreq->blkdev->xendev, 0, "%s I/O error\n",
+xen_pv_printf(&blkdev->xendev, 0, "%s I/O error\n",
   ioreq->req.operation == BLKIF_OP_READ ? "read" : 
"write");
 ioreq->aio_errors++;
 }
@@ -607,10 +615,10 @@ static void qemu_aio_complete(void *opaque, int ret)
 if (ioreq->presync) {
 ioreq->presync = 0;
 ioreq_runio_qemu_aio(ioreq);
-return;
+goto done;
 }
 if (ioreq->aio_inflight > 0) {
-return;
+goto done;
 }
 
 if (xen_feature_grant_copy) {
@@ -647,16 +655,19 @@ static void qemu_aio_complete(void *opaque, int ret)
 }
 case BLKIF_OP_READ:
 if (ioreq->status == BLKIF_RSP_OKAY) {
-block_acct_done(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct);
+block_acct_done(blk_get_stats(blkdev->blk), &ioreq->acct);
 } else {
-block_acct_failed(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct);
+block_acct_failed(blk_get_stats(blkdev->blk), &ioreq->acct);
 }
 break;
 case BLKIF_OP_DISCARD:
 default:
 break;
 }
-qemu_bh_schedule(ioreq->blkdev->bh);
+qemu_bh_schedule(blkdev->bh);
+
+done:
+aio_context_release(blkdev->ctx);
 }
 
 static bool blk_split_discard(struct ioreq *ioreq, blkif_sector_t 
sector_number,
@@ -913,17 +924,29 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 static void blk_bh(void *opaque)
 {
 struct XenBlkDev *blkdev = opaque;
+
+aio_context_acquire(blkdev->ctx);
 blk_handle_requests(blkdev);
+aio_context_release(blkdev->ctx);
 }
 
 static void blk_alloc(struct XenDevice *xendev)
 {
 struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
+Error *err = NULL;
+
+trace_xen_disk_alloc(xendev->name);
 
 QLIST_INIT(&blkdev->inflight);
 QLIST_INIT(&blkdev->finished);
 QLIST_INIT(&blkdev->freelist);
-blkdev->bh = qemu_bh_new(blk_bh, blkdev);
+
+blkdev->iothread = iothread_create(xendev->name, &err);
+assert(!err);
+
+blkdev->ctx = iothread_get_aio_context(blkdev->iothread);
+blkdev->bh = aio_bh_new(blkdev->ctx, blk_bh, blkdev);
+
 if (xen_mode != XEN_EMULATE) {
 batch_maps = 1;
 }
@@ -950,6 +973,8 @@ static int blk_init(struct XenDevice *xendev)
 int info = 0;
 char *directiosafe = NULL;
 
+trace_xen_disk_init(xendev->name);
+
 /* read xenstore entries */
 if (blkdev->params == NULL) {
   

Re: [Xen-devel] [BUG] blkback reporting incorrect number of sectors, unable to boot

2017-11-07 Thread Roger Pau Monné
On Mon, Nov 06, 2017 at 05:33:37AM -0700, Jan Beulich wrote:
> >>> On 04.11.17 at 05:48,  wrote:
> > I added some additional storage to my server with some native 4k sector
> > size disks.  The LVM volumes on that array seem to work fine when mounted
> > by the host, and when passed through to any of the Linux guests, but
> > Windows guests aren't able to use them when using PV drivers.  They work
> > fine to install when I first install Windows (Windows 10, latest build) but
> > once I install the PV drivers it will no longer boot and give an
> > inaccessible boot device error.  If I assign the storage to a different
> > Windows guest that already has the drivers installed (as secondary storage,
> > not as the boot device) I see the disk listed in disk management, but the
> > size of the disk is 8x larger than it should be.  After looking into it a
> > bit, the disk is reporting 8x the number of sectors it should have when I
> > run xenstore-ls.  Here is the info from xenstore-ls for the relevant volume:
> > 
> >   51712 = ""
> >frontend = "/local/domain/8/device/vbd/51712"
> >params = "/dev/tv_storage/main-storage"
> >script = "/etc/xen/scripts/block"
> >frontend-id = "8"
> >online = "1"
> >removable = "0"
> >bootable = "1"
> >state = "2"
> >dev = "xvda"
> >type = "phy"
> >mode = "w"
> >device-type = "disk"
> >discard-enable = "1"
> >feature-max-indirect-segments = "256"
> >multi-queue-max-queues = "12"
> >max-ring-page-order = "4"
> >physical-device = "fe:0"
> >physical-device-path = "/dev/dm-0"
> >hotplug-status = "connected"
> >feature-flush-cache = "1"
> >feature-discard = "0"
> >feature-barrier = "1"
> >feature-persistent = "1"
> >sectors = "34359738368"
> >info = "0"
> >sector-size = "4096"
> >physical-sector-size = "4096"
> > 
> > 
> > Here are the numbers for the volume as reported by fdisk:
> > 
> > Disk /dev/tv_storage/main-storage: 16 TiB, 17592186044416 bytes, 4294967296
> > sectors
> > Units: sectors of 1 * 4096 = 4096 bytes
> > Sector size (logical/physical): 4096 bytes / 4096 bytes
> > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > Disklabel type: dos
> > Disk identifier: 0x
> > 
> > DeviceBoot StartEndSectors Size Id Type
> > /dev/tv_storage/main-storage1  1 4294967295 4294967295  16T ee GPT
> > 
> > 
> > As with the size reported in Windows disk management, the number of sectors
> > from xenstore seems is 8x higher than what it should be.  The disks aren't
> > using 512b sector emulation, they are natively 4k, so I have no idea where
> > the 8x increase is coming from.
> 
> Hmm, looks like a backend problem indeed: struct hd_struct's
> nr_sects (which get_capacity() returns) looks to be in 512-byte
> units, regardless of actual sector size. Hence the plain
> get_capacity() use as well as the (wrongly open coded) use of
> part_nr_sects_read() looks insufficient in vbd_sz(). Roger,
> Konrad?

Hm, AFAICT sector-size should always be set to 512.

> Question of course is whether the Linux frontend then
> also needs adjustment, and hence whether the backend can
> be corrected in a compatible way in the first place.

blkfront uses set_capacity, which also seems to expect the sectors to
be hardcoded to 512.

Roger.



Re: [Xen-devel] [Qemu-devel] [PATCH v3 01/46] Replace all occurances of __FUNCTION__ with __func__

2017-11-07 Thread Markus Armbruster
Juan Quintela  writes:

> Alistair Francis  wrote:
>> Replace all occurrences of __FUNCTION__ except for the check in checkpatch
>> with the non GCC specific __func__.
>>
>> One line in hcd-musb.c was manually tweaked to pass checkpatch.
>>
>> Signed-off-by: Alistair Francis 
>> Cc: Gerd Hoffmann 
>> Cc: Andrzej Zaborowski 
>> Cc: Stefano Stabellini 
>> Cc: Anthony Perard 
>> Cc: John Snow 
>> Cc: Aurelien Jarno 
>> Cc: Yongbok Kim 
>> Cc: Peter Crosthwaite 
>> Cc: Stefan Hajnoczi 
>> Cc: Fam Zheng 
>> Cc: Juan Quintela 
>> Cc: "Dr. David Alan Gilbert" 
>> Cc: qemu-...@nongnu.org
>> Cc: qemu-bl...@nongnu.org
>> Cc: xen-de...@lists.xenproject.org
>> Reviewed-by: Eric Blake 
>> Reviewed-by: Stefan Hajnoczi 
>
> Reviewed-by: Juan Quintela 
>
>
>> diff --git a/audio/audio_int.h b/audio/audio_int.h
>> index 5bcb1c60e1..543b1bd8d5 100644
>> --- a/audio/audio_int.h
>> +++ b/audio/audio_int.h
>> @@ -253,7 +253,7 @@ static inline int audio_ring_dist (int dst, int src, int 
>> len)
>>  #define AUDIO_STRINGIFY(n) AUDIO_STRINGIFY_(n)
>>  
>>  #if defined _MSC_VER || defined __GNUC__
>> -#define AUDIO_FUNC __FUNCTION__
>> +#define AUDIO_FUNC __func__
>>  #else
>>  #define AUDIO_FUNC __FILE__ ":" AUDIO_STRINGIFY (__LINE__)
>>  #endif
>
> Unrelated to this patch 
> Do we really support other compilers than msc and gcc?

Let me rephrase the question: do we really support compilers that don't
understand __func__?  The presence of numerous unconditional uses of
__func__ in the tree means the answer is no.  Let's replace AUDIO_FUNC
by plain __func__.
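
Concretely, such a cleanup would presumably just drop the macro and its
fallback (a sketch; the actual patch 3/46 may differ):

    -#if defined _MSC_VER || defined __GNUC__
    -#define AUDIO_FUNC __func__
    -#else
    -#define AUDIO_FUNC __FILE__ ":" AUDIO_STRINGIFY (__LINE__)
    -#endif

with the remaining AUDIO_FUNC users switched to plain __func__.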



Re: [Xen-devel] [PATCH] x86/cpuid: Enable new SSE/AVX/AVX512 cpu features

2017-11-07 Thread Jan Beulich
>>> On 07.11.17 at 10:34,  wrote:
>   My understanding is I need to implement x86 emulator support for legacy and VEX
>   CPU features (GFNI, VAES and VPCLMULQDQ), right?

Yes.

>   As for this patch, is it suitable for merging into Xen upstream
>   this time?

At this time no in any event - the tree is frozen for 4.10. And once
the tree re-opens, the requested emulator additions are then a
prereq for the one here to go in.

Jan




Re: [Xen-devel] [PATCH for-next 1/9] gcov: return ENOSYS for unimplemented gcov domctl

2017-11-07 Thread Roger Pau Monné
On Mon, Nov 06, 2017 at 05:06:18AM -0700, Jan Beulich wrote:
> >>> On 06.11.17 at 12:16,  wrote:
> > Jan Beulich writes ("Re: [PATCH for-next 1/9] gcov: return ENOSYS for 
> > unimplemented gcov domctl"):
> >> On 26.10.17 at 11:19,  wrote:
> >> > --- a/xen/common/gcov/gcov.c
> >> > +++ b/xen/common/gcov/gcov.c
> >> > @@ -239,7 +239,7 @@ int sysctl_gcov_op(struct xen_sysctl_gcov_op *op)
> >> >  break;
> >> >  
> >> >  default:
> >> > -ret = -EINVAL;
> >> > +ret = -ENOSYS;
> >> >  break;
> >> >  }
> >> 
> >> Very certainly ENOSYS is not in any way better. Despite the many
> >> misuses of it, we've started enforcing that this wouldn't be spread.
> >> -EOPNOTSUPP may be fine here, but -EINVAL is suitable as well.
> >> -ENOSYS exclusively means that a _top level_ hypercall is
> >> unimplemented (i.e. with very few exceptions there should be
> >> exactly one place where it gets returned, which is in the main
> >> hypercall dispatch code).
> > 
> > The distinction between unimplemented status of a top-level hypercall
> > and unimplemented status of a sub-op is rarely useful to the caller.
> > 
> > Conversely, the distinction between an unimplemented facility, and a
> > facility which is exists but is being used improperly, is vitally
> > important to anyone who is trying to write compatibility code.
> > 
> > I don't mind if you want to insist on the former distinction,
> > reserving ENOSYS for top-level hypercalls and EOPNOTSUPP for other
> > functions.
> > 
> > But I absolutely do mind the use of EINVAL for "unsupported function".
> > I appreciate that much of the hypervisor has historically used EINVAL
> > this way, but this is (a) a pain for callers (b) evil, bad, and wrong
> > (c) unnecessary since EOPNOTSUPP is available.  We should at least not
> > perpetrate any more of this.  In an unreleased API we should change it
> > before release.

This API has actually been released since ~2013 IIRC, when it was
added to Xen.

> Okay, so EOPNOTSUPP is it then, which is also my preference
> (due to there being so many uses of EINVAL elsewhere). I've
> merely mentioned that EINVAL would be suitable since,
> technically speaking, the value in a "sub-operation" field being
> invalid is no different from this being the case for the value in
> any other field.

If I don't get any more comments I will re-send this patch separately
using EOPNOTSUPP instead of ENOSYS. I will also keep the Acks gathered
so far unless anyone objects.

Thanks, Roger.



Re: [Xen-devel] [PATCH] x86/cpuid: Enable new SSE/AVX/AVX512 cpu features

2017-11-07 Thread Zhong Yang
On Tue, Nov 07, 2017 at 01:11:02AM -0700, Jan Beulich wrote:
> >>> On 07.11.17 at 07:28,  wrote:
> >   For those new instructions, you mean I also need to support those
> >   three instructions (GFNI, VAES and VPCLMULQDQ) in x86_emulate() in PV?
> 
> Why three instructions? And why PV? I'm afraid I'm confused, and
> hence I'm afraid simply saying "yes" to your question might not be
> enough.
> 
> Jan
  
  Hello Jan,

  Sorry for my unclear answer!

  My understanding is I need to implement x86 emulator support for legacy and VEX
  CPU features (GFNI, VAES and VPCLMULQDQ), right?

  As for this patch, is it suitable for merging into Xen upstream
  this time?

  x86 emulator patches for some of Intel's CPU features are in our plan, and we
  will send the related patches in the future.

  Regards,

  Yang  
  




Re: [Xen-devel] [BUG] win2008 guest cannot get ip through sriov

2017-11-07 Thread Roger Pau Monné
On Mon, Nov 06, 2017 at 01:04:56AM +, Hao, Xudong wrote:
> > -Original Message-
> > From: Roger Pau Monné [mailto:roger@citrix.com]
> > Sent: Friday, November 3, 2017 7:23 PM
> > To: Hao, Xudong 
> > Cc: Julien Grall ; Stefano Stabellini
> > ; Lars Kurth ; Quan Xu
> > ; Kang, Luwei ; Zhang,
> > PengtaoX ; Julien Grall ;
> > Jan Beulich ; Xen-devel ;
> > Anthony PERARD ; Wei Liu 
> > Subject: Re: [Xen-devel] [BUG] win2008 guest cannot get ip through sriov
> > 
> > On Fri, Nov 03, 2017 at 01:10:26AM +, Hao, Xudong wrote:
> > >
> > > > -Original Message-
> > > > From: Julien Grall [mailto:julien.gr...@linaro.org]
> > > > Sent: Thursday, November 2, 2017 9:50 PM
> > > > To: Stefano Stabellini 
> > > > Cc: Hao, Xudong ; Jan Beulich
> > > > ; Quan Xu ; Lars Kurth
> > > > ; Wei Liu ; Zhang,
> > > > PengtaoX ; Kang, Luwei
> > > > ; Julien Grall ; Anthony
> > > > PERARD ; Xen-devel  > > > de...@lists.xenproject.org>
> > > > Subject: Re: [Xen-devel] [BUG] win2008 guest cannot get ip through
> > > > sriov
> > > >
> > > > Hi,
> > > >
> > > > On 27/10/17 21:16, Stefano Stabellini wrote:
> > > > > On Fri, 27 Oct 2017, Julien Grall wrote:
> > > > >> On 27/10/17 08:27, Hao, Xudong wrote:
> > > > >>> This bug exist much long time, there are many discussion last
> > > > >>> year but not a solution then. I call out it now because there is
> > > > >>> a fix in qemu
> > > > upstream:
> > > > >>> commit a8036336609d2e184fc3543a4c439c0ba7d7f3a2
> > > > >>> Author: Roger Pau Monne 
> > > > >>> Date:   Thu Aug 24 16:07:03 2017 +0100
> > > > >>>
> > > > >>>   xen/pt: allow QEMU to request MSI unmasking at bind time
> > > > >>>
> > > > >>> The fix is not in qemu-xen tree yet, when will qemu-xen sync
> > > > >>> this fix? Is it possible to catch Xen 4.10's qemu-xen?
> > > > >>
> > > > >> I will let Stefano and Anthony providing feedback before giving a
> > > > >> release-ack here.
> > > > >
> > > > > Yes, I think we should backport the commit as it fixes a genuine bug.
> > > > > The backport is not risk-free but it only affects PCI Passthrough.
> > > > > Also the commit has been in QEMU for 2 months now.
> > > >
> > > > Does anyone actually tested it with QEMU Xen tree?
> > > >
> > >
> > > The Qemu Xen tree is the default one in the Xen source code configuration
> > > file Config.mk; I tested with it.
> > > QEMU_UPSTREAM_URL ?= http://xenbits.xen.org/git-http/qemu-xen.git
> > 
> > Can you please make sure you have QEMU commit a80363: xen/pt: allow QEMU
> > to request MSI unmasking at bind time. AFAICT this is not yet in the 
> > qemu-xen
> > tree.
> > 
> 
> Roger, 
> Maybe I misunderstood your question and my last mail confused you. 
> Qemu-xen doesn't have commit a80363, so I reported this issue to ask for a
> sync-up with qemu upstream. In my last mail I meant that I usually use the
> Qemu Xen tree for testing, and that is where I found this issue.

Before requesting the backport, have you tested whether it
fixes your issues?

Roger.



Re: [Xen-devel] [BUG] xen-mceinj tool testing cause dom0 crash

2017-11-07 Thread Jan Beulich
>>> On 07.11.17 at 09:23,  wrote:
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Tuesday, November 7, 2017 4:09 PM
>> >>> On 07.11.17 at 02:37,  wrote:
>> >> From: Jan Beulich [mailto:jbeul...@suse.com]
>> >> Sent: Monday, November 6, 2017 5:17 PM
>> >> >>> On 03.11.17 at 09:29,  wrote:
>> >> > We figured out the problem, some corner scripts triggered the error
>> >> > injection at the same page (pfn 0x180020) twice, i.e. "./xen-mceinj
>> >> > -t 0" run over one time, which resulted in Dom0 crash.
>> >>
>> >> But isn't this a valid scenario, which shouldn't result in a kernel crash?
>> > What if
>> >> two successive #MCs occurred for the same page?
>> >> I.e. ...
>> >>
>> >
> Yes, it's another valid scenario, the expected result is a kernel crash.
>> 
>> Kernel _crash_ or rather kernel _panic_? Of course without any kernel 
>> messages
>> we can't tell one from the other, but to me this makes a difference 
>> nevertheless.
>> 
> Exactly, Dom0 crash.

I don't believe a crash is the expected outcome here.

> And I didn't see any "kernel panic" message in the log -- I attach the
> original log again.

Well, as said - there is _no_ kernel log message at all, and hence we
can't tell whether it's a crash or a plain panic. Iirc Xen's "Hardware
Dom0 crashed" can't distinguish the two cases.

Jan




[Xen-devel] [xen-unstable test] 115624: regressions - FAIL

2017-11-07 Thread osstest service owner
flight 115624 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/115624/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 
115526
 test-armhf-armhf-xl-vhd 15 guest-start/debian.repeat fail REGR. vs. 115526

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-vhd 17 guest-start/debian.repeatfail  like 115496
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 115526
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 115526
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 115526
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 115526
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 115526
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 115526
 test-amd64-amd64-xl-qcow219 guest-start/debian.repeatfail  like 115526
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 115526
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 115526
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 115526
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qcow2 12 migrate-support-checkfail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass

version targeted for testing:
 xen  1f61c07d79abda1e747d70d83edffe4efca48e17
baseline version:
 xen  ff93dc55431517ed29c70dbff6721c6b0803acf9

Last test of basis   115526  2017-11-03 13:51:00 Z3 days
Failing since11  2017-11-04 09:34:51 Z2 days5 attempts
Testing same since   115624  2017-11-06 21:58:54 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Julien Grall 
  Wei Liu 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf 

Re: [Xen-devel] [PATCH] x86/cpuid: Enable new SSE/AVX/AVX512 cpu features

2017-11-07 Thread Jan Beulich
>>> On 07.11.17 at 07:28,  wrote:
>   For those new instructions, you mean I also need to support those
>   three instructions (GFNI, VAES and VPCLMULQDQ) in x86_emulate() in PV?

Why three instructions? And why PV? I'm afraid I'm confused, and
hence I'm afraid simply saying "yes" to your question might not be
enough.

Jan




Re: [Xen-devel] [BUG] xen-mceinj tool testing cause dom0 crash

2017-11-07 Thread Jan Beulich
>>> On 07.11.17 at 02:37,  wrote:
>>  -Original Message-
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Monday, November 6, 2017 5:17 PM
>> To: Hao, Xudong 
>> Cc: Julien Grall ; George Dunlap
>> ; Lars Kurth ; Zhang,
>> Haozhong ; xen-devel@lists.xen.org 
>> Subject: RE: [Xen-devel] [BUG] xen-mceinj tool testing cause dom0 crash
>> 
>> >>> On 03.11.17 at 09:29,  wrote:
>> > We figured out the problem, some corner scripts triggered the error
>> > injection at the same page (pfn 0x180020) twice, i.e. "./xen-mceinj -t
>> > 0" run over one time, which resulted in Dom0 crash.
>> 
>> But isn't this a valid scenario, which shouldn't result in a kernel crash? 
> What if
>> two successive #MCs occurred for the same page?
>> I.e. ...
>> 
> 
> Yes, it's another valid scenario, the expected result is a kernel crash.

Kernel _crash_ or rather kernel _panic_? Of course without any
kernel messages we can't tell one from the other, but to me this
makes a difference nevertheless.

Jan




Re: [Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths

2017-11-07 Thread Jan Beulich
>>> On 02.11.17 at 20:46,  wrote:
>> Any ideas about the root cause of the fault and suggestions how to reproduce
>> it would be welcome. Does this crash really have something to do with PML? I
>> doubt it, because the original environment can hardly be called PML-heavy.

Well, PML-heaviness doesn't matter. It's the mere fact that PML
is enabled on the vCPU being destroyed.

> So we finally have a complete understanding of what's going on:
> 
> Some vCPU has just migrated to another pCPU and we switched to idle but
> per_cpu(curr_vcpu) on the current pCPU is still pointing to it - this is
> how the current logic works. While we're in idle we're issuing
> vcpu_destroy() for some other domain which eventually calls
> vmx_vcpu_disable_pml() and trashes VMCS pointer on the current pCPU. At
> this moment we get a TLB flush IPI from that same vCPU which is now
> context switching on another pCPU - it appears to clean TLB after
> itself. This vCPU is already marked is_running=1 by the scheduler. In
> the IPI handler we enter __sync_local_execstate() and trying to call
> vmx_ctxt_switch_from() for the migrated vCPU which is supposed to call
> vmcs_reload() but doesn't do it because is_running==1. The next VMWRITE
> crashes the hypervisor.
> 
> So the state transition diagram might look like:
> pCPU1: vCPUx -> migrate to pCPU2 -> idle -> RCU callbacks ->

I'm not really clear about who/what is "idle" here: pCPU1,
pCPU2, or yet something else? If vCPUx migrated to pCPU2,
wouldn't it be put back into runnable state right away, and
hence pCPU2 can't be idle at this point? Yet for pCPU1 I don't
think its idleness would matter much, i.e. the situation could
also arise without it becoming idle afaics. pCPU1 making it
anywhere softirqs are being processed would suffice.

> vcpu_destroy() -> vmx_vcpu_disable_pml() -> vmcs_clear()
> pCPU2: context switch into vCPUx -> is_running = 1 -> TLB flush
> pCPU1: IPI handler -> context switch out of vCPUx -> VMWRITE -> CRASH!
> 
> We can basically just fix the condition around vmcs_reload() call but
> I'm not completely sure that it's the right way to do - I don't think
> leaving per_cpu(curr_vcpu) pointing to a migrated vCPU is a good idea
> (maybe we need to clean it). What are your thoughts?

per_cpu(curr_vcpu) can only validly be written inside
__context_switch(), hence the only way to achieve this would
be to force __context_switch() to be called earlier than out of
the TLB flush IPI handler, perhaps like in the (untested!) patch
below. Two questions then remain:
- Should we perhaps rather do this in an arch-independent way
  (i.e. ahead of the call to vcpu_destroy() in common code)?
- This deals with only a special case of the more general "TLB
  flush behind the back of a vmx_vmcs_enter() /
  vmx_vmcs_exit() section" - does this need dealing with in a
  more general way? Here I'm thinking of introducing a
  FLUSH_STATE flag to be passed to flush_mask() instead of
  the current flush_tlb_mask() in context_switch() and
  sync_vcpu_execstate(). This could at the same time be used
  for a small performance optimization: At least for HAP vCPU-s
  I don't think we really need the TLB part of the flushes here.
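
For concreteness, the latter might look something like this (a sketch only;
FLUSH_STATE would be a new, yet-to-be-defined flag):

    /* context_switch() / sync_vcpu_execstate(), sketch: */
    unsigned int flags = FLUSH_STATE;

    if ( !paging_mode_hap(v->domain) )
        flags |= FLUSH_TLB;   /* HAP vCPU-s likely don't need the TLB part */
    flush_mask(&dirty_mask, flags);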

Jan

--- unstable.orig/xen/arch/x86/domain.c
+++ unstable/xen/arch/x86/domain.c
@@ -379,6 +379,14 @@ int vcpu_initialise(struct vcpu *v)
 
 void vcpu_destroy(struct vcpu *v)
 {
+/*
+ * Flush all state for this vCPU before fully tearing it down. This is
+ * particularly important for HVM ones on VMX, so that this flushing of
+ * state won't happen from the TLB flush IPI handler behind the back of
+ * a vmx_vmcs_enter() / vmx_vmcs_exit() section.
+ */
+sync_vcpu_execstate(v);
+
 xfree(v->arch.vm_event);
 v->arch.vm_event = NULL;
 


