Re: [PATCH] xen/efi: refactor deprecated strncpy

2023-09-18 Thread Juergen Gross

On 11.09.23 20:59, Justin Stitt wrote:

`strncpy` is deprecated for use on NUL-terminated destination strings [1].

`efi_loader_signature` has space for 4 bytes. We are copying "Xen" (3 bytes)
plus a NUL-byte which makes 4 total bytes. With that being said, there is
currently not a bug with the current `strncpy()` implementation in terms of
buffer overreads but we should favor a more robust string interface
either way.

A suitable replacement is `strscpy` [2] due to the fact that it guarantees
NUL-termination on the destination buffer while being functionally the
same in this case.

Link: 
www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings[1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-harden...@vger.kernel.org
Cc: Kees Cook 
Signed-off-by: Justin Stitt 


Pushed to xen/tip.git for-linus-6.6a


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [PATCH 4/7] block/dirty-bitmap: Clean up local variable shadowing

2023-09-18 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 31.08.2023 um 15:25 hat Markus Armbruster geschrieben:
>> Local variables shadowing other local variables or parameters make the
>> code needlessly hard to understand.  Tracked down with -Wshadow=local.
>> Clean up: delete inner declarations when they are actually redundant,
>> else rename variables.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  block/monitor/bitmap-qmp-cmds.c | 2 +-
>>  block/qcow2-bitmap.c| 3 +--
>>  2 files changed, 2 insertions(+), 3 deletions(-)
>> 
>> diff --git a/block/monitor/bitmap-qmp-cmds.c 
>> b/block/monitor/bitmap-qmp-cmds.c
>> index 55f778f5af..4d018423d8 100644
>> --- a/block/monitor/bitmap-qmp-cmds.c
>> +++ b/block/monitor/bitmap-qmp-cmds.c
>> @@ -276,7 +276,7 @@ BdrvDirtyBitmap *block_dirty_bitmap_merge(const char 
>> *node, const char *target,
>>  
>>  for (lst = bms; lst; lst = lst->next) {
>>  switch (lst->value->type) {
>> -const char *name, *node;
>> +const char *name;
>>  case QTYPE_QSTRING:
>>  name = lst->value->u.local;
>>  src = bdrv_find_dirty_bitmap(bs, name);
>
> The names in this function are all over the place... A more ambitious
> patch could rename the parameters to dst_node/dst_bitmap and these
> variables to src_node/src_bitmap to get some more consistency (both with
> each other and with the existing src/dst variables).

What exactly would you like me to consider?  Perhaps:

* Rename parameter @node to @dst_node

* Rename which parameter to @dst_bitmap?

* Rename nested local @node to @src_node

* Rename which local variable to @src_bitmap?

* Move nested locals to function scope

> Preexisting, so I'm not insisting that you should do this.
>
>> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
>> index 037fa2d435..ffd5cd3b23 100644
>> --- a/block/qcow2-bitmap.c
>> +++ b/block/qcow2-bitmap.c
>> @@ -1555,7 +1555,6 @@ bool 
>> qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs,
>>  FOR_EACH_DIRTY_BITMAP(bs, bitmap) {
>>  const char *name = bdrv_dirty_bitmap_name(bitmap);
>>  uint32_t granularity = bdrv_dirty_bitmap_granularity(bitmap);
>> -Qcow2Bitmap *bm;
>>  
>>  if (!bdrv_dirty_bitmap_get_persistence(bitmap) ||
>>  bdrv_dirty_bitmap_inconsistent(bitmap)) {
>> @@ -1625,7 +1624,7 @@ bool 
>> qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs,
>>  
>>  /* allocate clusters and store bitmaps */
>>  QSIMPLEQ_FOREACH(bm, bm_list, entry) {
>> -BdrvDirtyBitmap *bitmap = bm->dirty_bitmap;
>> +bitmap = bm->dirty_bitmap;
>>  
>>  if (bitmap == NULL || bdrv_dirty_bitmap_readonly(bitmap)) {
>>  continue;
>
> Reviewed-by: Kevin Wolf 

Thanks!




Re: [PATCH 1/7] migration/rdma: Fix save_page method to fail on polling error

2023-09-18 Thread Markus Armbruster
Eric Blake  writes:

> On Thu, Aug 31, 2023 at 03:25:40PM +0200, Markus Armbruster wrote:
>> qemu_rdma_save_page() reports polling error with error_report(), then
>> succeeds anyway.  This is because the variable holding the polling
>> status *shadows* the variable the function returns.  The latter
>> remains zero.
>> 
>> Broken since day one, and duplicated more recently.
>> 
>> Fixes: 2da776db4846 (rdma: core logic)
>> Fixes: b390afd8c50b (migration/rdma: Fix out of order wrid)
>
> Alas, the curse of immutable git history preserving typos in commit
> subjects ;)

"wrid" is short for "work request ID", actually.

> The alternative of rewriting history and breaking SHA
> references is worse.

Rewriting master is a big no-no.

git-note can be used to correct the record, but it has its usability
issues.

>> Signed-off-by: Markus Armbruster 
>> ---
>>  migration/rdma.c | 6 --
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> Reviewed-by: Eric Blake 

Thanks!




[PATCH] MAINTAINERS: Remove myself as RISC-V maintainer

2023-09-18 Thread Alistair Francis
I unfortunately don't have time to be a Xen maintainer, so remove myself
as the maintainer.

Signed-off-by: Alistair Francis 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0f227a2f5d..22034bf6e3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -500,7 +500,7 @@ F:  tools/hotplug/Linux/block-drbd-probe
 
 RISCV
 M: Bob Eshleman 
-M: Alistair Francis 
+R: Alistair Francis 
 R: Connor Davis 
 S: Supported
 F: config/riscv64.mk
-- 
2.41.0




[xen-unstable test] 183032: tolerable FAIL - PUSHED

2023-09-18 Thread osstest service owner
flight 183032 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183032/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 183026
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 183026
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 183026
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 183026
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 183026
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 183026
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 183026
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 183026
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 183026
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 183026
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 183026
 test-amd64-amd64-xl-qemuu-win7-amd64 12 windows-install   fail like 183026
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass

version targeted for testing:
 xen  2ea38251eb67639be7aa9d7b64084b1be0230273
baseline version:
 xen  290f82375d828ef93f831a5ef028f1283aa1ea47

Last test of basis   183026  2023-09-18 01:52:02 Z1 days
Testing same since   183032  2023-09-18 17:07:05 Z0 days1 attempts


People who touched revisions under test:
  Federico Serafini 
  George Dunlap 
  Jan Beulich 
  

[xen-unstable-smoke test] 183039: regressions - FAIL

2023-09-18 Thread osstest service owner
flight 183039 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183039/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt  8 xen-boot fail REGR. vs. 183030
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  8 xen-bootfail REGR. vs. 183030

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  b5926c6ecf05c28ee99c6248c42d691ccbf0c315
baseline version:
 xen  2ea38251eb67639be7aa9d7b64084b1be0230273

Last test of basis   183030  2023-09-18 14:02:00 Z0 days
Testing same since   183031  2023-09-18 17:01:55 Z0 days3 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64fail
 test-amd64-amd64-libvirt fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit b5926c6ecf05c28ee99c6248c42d691ccbf0c315
Author: Andrew Cooper 
Date:   Wed Aug 30 20:24:25 2023 +0100

x86/spec-ctrl: Mitigate the Zen1 DIV leakage

In the Zen1 microarchitecure, there is one divider in the pipeline which
services uops from both threads.  In the case of #DE, the latched result 
from
the previous DIV to execute will be forwarded speculatively.

This is an interesting covert channel that allows two threads to communicate
without any system calls.  In also allows userspace to obtain the result of
the most recent DIV instruction executed (even speculatively) in the core,
which can be from a higher privilege context.

Scrub the result from the divider by executing a non-faulting divide.  This
needs performing on the exit-to-guest paths, and ist_exit-to-Xen.

Alternatives in IST context is believed safe now that it's done in NMI
context.

This is XSA-439 / CVE-2023-20588.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 

commit de1d265001397f308c5c3c5d3ffc30e7ef8c0705
Author: Andrew Cooper 
Date:   Fri Sep 15 12:13:51 2023 +0100

x86/amd: Introduce is_zen{1,2}_uarch() predicates

We already have 3 cases using STIBP as a Zen1/2 heuristic, and are about to
introduce a 4th.  Wrap the heuristic into a pair of predicates rather than
opencoding it, and the explanation of the heuristic, at each usage site.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 

commit 3ee6066bcd737756b0990d417d94eddc0b0d2585
Author: Andrew Cooper 
Date:   Wed Sep 13 13:53:33 2023 +0100

x86/spec-ctrl: Issue VERW during IST exit to Xen

There is a corner case where e.g. an NMI hitting an exit-to-guest path after
SPEC_CTRL_EXIT_TO_* would have run the entire NMI handler *after* the VERW
flush to scrub potentially sensitive data from uarch buffers.

In order to compensate, issue VERW when exiting to Xen from an IST entry.

SPEC_CTRL_EXIT_TO_XEN already has two reads of spec_ctrl_flags off the 
stack,
and we're about to add a third.  Load the field into %ebx, and list the
register as clobbered.

%r12 has been arranged to be the ist_exit signal, so add this as an input
dependency and use it to identify when to issue a VERW.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 

commit 21bdc25b05a0f8ab6bc73520a9ca01327360732c
Author: Andrew Cooper 
Date:   Wed Sep 13 12:20:12 2023 +0100

x86/entry: Track the IST-ness of an entry for the exit paths

Use %r12 to 

[ovmf test] 183041: all pass - PUSHED

2023-09-18 Thread osstest service owner
flight 183041 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183041/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf cbcf0428e83bbe8314de47207072b3b4f1557dc6
baseline version:
 ovmf 408e4631359d3e67633b36f6adf9a13a7cc573d3

Last test of basis   183037  2023-09-18 21:41:56 Z0 days
Testing same since   183041  2023-09-19 01:40:54 Z0 days1 attempts


People who touched revisions under test:
  Ard Biesheuvel 
  Michael Kubacki 
  Sami Mujawar 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   408e463135..cbcf0428e8  cbcf0428e83bbe8314de47207072b3b4f1557dc6 -> 
xen-tested-master



[ovmf test] 183037: all pass - PUSHED

2023-09-18 Thread osstest service owner
flight 183037 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183037/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 408e4631359d3e67633b36f6adf9a13a7cc573d3
baseline version:
 ovmf db38c7de64d4dda2bf3cc6e5d764b027b00afa59

Last test of basis   183027  2023-09-18 02:43:21 Z0 days
Testing same since   183037  2023-09-18 21:41:56 Z0 days1 attempts


People who touched revisions under test:
  Taylor Beebe 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   db38c7de64..408e463135  408e4631359d3e67633b36f6adf9a13a7cc573d3 -> 
xen-tested-master



[xen-unstable-smoke test] 183034: regressions - FAIL

2023-09-18 Thread osstest service owner
flight 183034 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183034/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt  8 xen-boot fail REGR. vs. 183030
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  8 xen-bootfail REGR. vs. 183030

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  b5926c6ecf05c28ee99c6248c42d691ccbf0c315
baseline version:
 xen  2ea38251eb67639be7aa9d7b64084b1be0230273

Last test of basis   183030  2023-09-18 14:02:00 Z0 days
Testing same since   183031  2023-09-18 17:01:55 Z0 days2 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64fail
 test-amd64-amd64-libvirt fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit b5926c6ecf05c28ee99c6248c42d691ccbf0c315
Author: Andrew Cooper 
Date:   Wed Aug 30 20:24:25 2023 +0100

x86/spec-ctrl: Mitigate the Zen1 DIV leakage

In the Zen1 microarchitecure, there is one divider in the pipeline which
services uops from both threads.  In the case of #DE, the latched result 
from
the previous DIV to execute will be forwarded speculatively.

This is an interesting covert channel that allows two threads to communicate
without any system calls.  In also allows userspace to obtain the result of
the most recent DIV instruction executed (even speculatively) in the core,
which can be from a higher privilege context.

Scrub the result from the divider by executing a non-faulting divide.  This
needs performing on the exit-to-guest paths, and ist_exit-to-Xen.

Alternatives in IST context is believed safe now that it's done in NMI
context.

This is XSA-439 / CVE-2023-20588.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 

commit de1d265001397f308c5c3c5d3ffc30e7ef8c0705
Author: Andrew Cooper 
Date:   Fri Sep 15 12:13:51 2023 +0100

x86/amd: Introduce is_zen{1,2}_uarch() predicates

We already have 3 cases using STIBP as a Zen1/2 heuristic, and are about to
introduce a 4th.  Wrap the heuristic into a pair of predicates rather than
opencoding it, and the explanation of the heuristic, at each usage site.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 

commit 3ee6066bcd737756b0990d417d94eddc0b0d2585
Author: Andrew Cooper 
Date:   Wed Sep 13 13:53:33 2023 +0100

x86/spec-ctrl: Issue VERW during IST exit to Xen

There is a corner case where e.g. an NMI hitting an exit-to-guest path after
SPEC_CTRL_EXIT_TO_* would have run the entire NMI handler *after* the VERW
flush to scrub potentially sensitive data from uarch buffers.

In order to compensate, issue VERW when exiting to Xen from an IST entry.

SPEC_CTRL_EXIT_TO_XEN already has two reads of spec_ctrl_flags off the 
stack,
and we're about to add a third.  Load the field into %ebx, and list the
register as clobbered.

%r12 has been arranged to be the ist_exit signal, so add this as an input
dependency and use it to identify when to issue a VERW.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 

commit 21bdc25b05a0f8ab6bc73520a9ca01327360732c
Author: Andrew Cooper 
Date:   Wed Sep 13 12:20:12 2023 +0100

x86/entry: Track the IST-ness of an entry for the exit paths

Use %r12 to 

Re: [XEN PATCH v2 3/5] tools: don't use distutils in configure nor Makefile

2023-09-18 Thread Javi Merino
On Tue, Sep 12, 2023 at 01:17:21PM +0100, Andrew Cooper wrote:
> On 12/09/2023 12:50 pm, Marek Marczykowski-Górecki wrote:
> > On Tue, Sep 12, 2023 at 11:38:04AM +0100, Andrew Cooper wrote:
> >> On 11/09/2023 5:51 pm, Javi Merino wrote:
> >>> From: Marek Marczykowski-Górecki 
> >>>
> >>> Python distutils is deprecated and is going to be removed in Python
> >>> 3.12. The distutils.sysconfig is available as sysconfig module in
> >>> stdlib since Python 3.2, so use that directly.
> >>>
> >>> Signed-off-by: Marek Marczykowski-Górecki 
> >>> 
> >> This breaks Py2, doesn't it?
> > I was thinking that too, but "sysconfig" module seems to be in Python
> > 2.7 too.

Yes, I forgot to say it.  I tested this with python2 as well as python3.  I did:

  PYTHON=$(which python) ./configure
  make -C tools/pygrub clean && make -C tools/pygrub
  make -C tools/python clean && make -C tools/python

with python being:

  $ python --version
  Python 2.7.18.6

I did the same test with just `./configure` and python3 being:

  $ python3 --version
  Python 3.12.0rc2

> Oh, so it is.  Lovely that the documentation says this...
> 
> It seems to have appeared in Py2.7 and 3.2 together. 
> https://docs.python.org/2.7/library/sysconfig.html
> 
> I notice that README currently says Py2.6.  We can definitely bump that
> to 2.7, and take this patch as-is.

Marek had a patch that bumped the minimum version to 3.2[0] but I
dropped it from the series as we were going to keep the support for python2.

[0] 
https://lore.kernel.org/xen-devel/20230316171634.320626-4-marma...@invisiblethingslab.com/

I will bump the minimum version of python to 2.7 in the README.

Cheers,
Javi



Re: [XEN PATCH v2 2/5] tools: convert setup.py to use setuptools

2023-09-18 Thread Javi Merino
On Tue, Sep 12, 2023 at 11:22:56AM +0100, Andrew Cooper wrote:
> On 11/09/2023 5:51 pm, Javi Merino wrote:
> > From: Marek Marczykowski-Górecki 
> >
> > Python distutils is deprecated and is going to be removed in Python
> > 3.12. Migrate to setuptools.
> > Setuptools in Python 3.11 complains:
> > SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and 
> > pip and other standards-based tools.
> > Keep using setup.py anyway to build C extension.
> >
> > Signed-off-by: Marek Marczykowski-Górecki 
> 
> Throughout the commit message, s/use/support/ seeing as we're not
> removing distutils.

Ok.

> Next, this needs a SoB from you because you've changed the patch from
> what Marek originally wrote.

Done

> > diff --git a/tools/pygrub/setup.py b/tools/pygrub/setup.py
> > index 502aa4df2d..f9e8feb2e6 100644
> > --- a/tools/pygrub/setup.py
> > +++ b/tools/pygrub/setup.py
> > @@ -1,5 +1,9 @@
> > -from distutils.core import setup, Extension
> > -from distutils.ccompiler import new_compiler
> > +try:
> > +from setuptools import setup, Extension
> > +except ImportError:
> > +# distutils was removed in Python 3.12.  If this import fails,
> > +# install setuptools.
> > +from distutils.core import setup, Extension
> 
> Finally, this feels a little unnecessary.  How about just:
> 
> # Prefer setuptools, fall back to distutils
> try:
>     from setuptools import setup, Extension
> except ImportError:
>     from distutils.core import setup, Extension

Done



Re: [XEN PATCH v2 1/5] automation: add python3's setuptools to containers

2023-09-18 Thread Javi Merino
On Tue, Sep 12, 2023 at 11:18:42AM +0100, Andrew Cooper wrote:
> On 11/09/2023 5:51 pm, Javi Merino wrote:
> > In preparation of dropping python distutils and moving to setuptools,
> > add the python3 setuptools module to the containers that need it.
> >
> > The centos7 container was building using python2.  Change it to build
> > python scripts using python3.
> >
> > Debian Stretch is no longer debian oldstable, so move to the archive
> > repositories.
> >
> > Signed-off-by: Javi Merino 
> 
> We are not dropping distutils.  We're moving to support both distutils
> and setuptools, because setuptools doesn't support the minimum version
> of python that Xen supports.

Indeed.  I wrote this when the series was about dropping distutils and
forgot to update the commit message when I change it to support both.

> Therefore, it's important to keep some of the containers on distutils
> rather than switching all to setuptools.
> 
> CenOS can stay as is, as can Stretch and probably Bionic/Focal.

I agree for CentOS and Debian. For Ubuntu we have the following containers:
  * 14.04 (trusty)
  * 16.04 (xenial)
  * 18.04 (bionic)
  * 20.04 (focal)

Following the logic of only moving the new ones, I will leave as is
the old ones (trusty, xenial and bionic) and only move focal to
setuptools.

> Any containers with Py3.10 or later definitely need to move, seeing as
> distuils is formally deprecated there
> 
> It's sadly a little too early to make a Py3.12 container, which will
> lack distutils, but we can come back to that in 4.19.

Yes, even python's own 3.12 release candidate still includes it.

  $ podman run --rm -it docker.io/python:3.12-rc-bookworm
  Python 3.12.0rc2 (main, Sep  8 2023, 03:00:59) [GCC 12.2.0] on linux
  Type "help", "copyright", "credits" or "license" for more information.
  >>> import distutils
  >>>

> As Stefano points out, you should refresh at least some of the arm64
> containers too.  RISC-V and PPC aren't set up for tools builds set, so
> they're fine to leave.

I have refreshed the arm64 containers.

Cheers,
Javi



linux-next: duplicate patch in the xen-tip tree

2023-09-18 Thread Stephen Rothwell
Hi all,

The following commit is also in Linus Torvalds' tree as a different commit
(but the same patch):

  603392995417 ("x86/paravirt: Fix tlb_remove_table function callback prototype 
warning")

This is commit

  fcce1c6cb156 ("x86/paravirt: Fix tlb_remove_table function callback prototype 
warning")

in Linus' tree.

-- 
Cheers,
Stephen Rothwell


pgpc2c_W2f3Rw.pgp
Description: OpenPGP digital signature


Re: [XEN PATCH v2 1/5] automation: add python3's setuptools to containers

2023-09-18 Thread Javi Merino
On Mon, Sep 11, 2023 at 06:15:03PM -0700, Stefano Stabellini wrote:
> On Mon, 11 Sep 2023, Javi Merino wrote:
> > In preparation of dropping python distutils and moving to setuptools,
> > add the python3 setuptools module to the containers that need it.
> > 
> > The centos7 container was building using python2.  Change it to build
> > python scripts using python3.
> > 
> > Debian Stretch is no longer debian oldstable, so move to the archive
> > repositories.
> > 
> > Signed-off-by: Javi Merino 
> > ---
> >  automation/build/alpine/3.18.dockerfile|  1 +
> >  automation/build/archlinux/current.dockerfile  |  1 +
> >  automation/build/centos/7.dockerfile   |  3 ++-
> >  automation/build/debian/bookworm.dockerfile|  1 +
> >  automation/build/debian/stretch.dockerfile | 11 ++-
> >  automation/build/suse/opensuse-leap.dockerfile |  1 +
> >  automation/build/ubuntu/bionic.dockerfile  |  1 +
> >  automation/build/ubuntu/focal.dockerfile   |  1 +
> >  automation/build/ubuntu/trusty.dockerfile  |  1 +
> >  automation/build/ubuntu/xenial.dockerfile  |  1 +
> 
> We are missing:
> automation/build/alpine/3.18-arm64v8.dockerfile
> automation/build/suse/opensuse-tumbleweed.dockerfile
> automation/build/debian/bookworm-i386.dockerfile
> automation/build/debian/bookworm-arm64v8.dockerfile
> automation/build/debian/stretch-i386.dockerfile

Of course! I wasn't able to test it using CI and I missed a bunch of
failed tests.  I have now added it to all of these.

> automation/build/suse/opensuse-leap.dockerfile

Leap is already in the list.

> automation/build/fedora/29.dockerfile

Why? It already has setuptools.

> automation/build/debian/jessie-i386.dockerfile
> automation/build/debian/jessie.dockerfile

Why? The jessie container is not run in automation.  Arguably, the
docker files should be deleted instead.

Cheers,
Javi



[linux-linus test] 183028: regressions - FAIL

2023-09-18 Thread osstest service owner
flight 183028 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183028/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 8 xen-boot fail REGR. 
vs. 182531
 test-amd64-coresched-amd64-xl  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-libvirt-pair 12 xen-boot/src_host   fail REGR. vs. 182531
 test-amd64-amd64-libvirt-pair 13 xen-boot/dst_host   fail REGR. vs. 182531
 test-amd64-amd64-xl-xsm   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-libvirt-raw  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-libvirt-qcow2  8 xen-boot   fail REGR. vs. 182531
 test-amd64-amd64-xl-vhd   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-pvshim8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-pvhv2-intel  8 xen-boot  fail REGR. vs. 182531
 test-amd64-amd64-freebsd12-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemuu-ws16-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemut-debianhvm-amd64  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-xl   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemut-win7-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-libvirt-xsm  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemuu-win7-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-dom0pvh-xl-amd 14 guest-start   fail REGR. vs. 182531
 test-amd64-amd64-xl-shadow8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-examine-uefi  8 reboot  fail REGR. vs. 182531
 test-amd64-amd64-xl-credit1   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-pair12 xen-boot/src_hostfail REGR. vs. 182531
 test-amd64-amd64-pair13 xen-boot/dst_hostfail REGR. vs. 182531
 test-amd64-amd64-xl-multivcpu  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-qemuu-nested-intel  8 xen-boot  fail REGR. vs. 182531
 test-amd64-amd64-xl-qemut-ws16-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 8 xen-boot fail REGR. vs. 
182531
 test-amd64-amd64-xl-qemuu-ovmf-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-pvhv2-amd  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-qemuu-nested-amd  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-libvirt  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-dom0pvh-xl-intel 14 guest-start fail REGR. vs. 182531
 test-amd64-amd64-freebsd11-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-pygrub   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 8 xen-boot fail REGR. vs. 
182531
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 8 xen-boot fail REGR. 
vs. 182531
 test-amd64-amd64-xl-credit2   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-examine-bios  8 reboot  fail REGR. vs. 182531
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-examine  8 reboot   fail REGR. vs. 182531

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  8 xen-boot fail REGR. vs. 182531

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 182531
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 182531
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 182531
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 

Re: [XEN PATCH v2 5/5] README: update to remove old note about the build system's python expectation

2023-09-18 Thread Javi Merino
On Tue, Sep 12, 2023 at 01:25:06PM +0100, Andrew Cooper wrote:
> On 11/09/2023 5:51 pm, Javi Merino wrote:
> > The configure script tests for the availability of python3, python and
> > python2 in that order and sets PYTHON to the first one found in path.
> > You don't need to have a symlink to python.
> 
> I think this was fixed by 5852ca48526316918cd82fba1033a6a5379fbc4c
> "build: fix tools/configure in case only python3 exists"

Indeed! I should have put it in the commit message.

> > Signed-off-by: Javi Merino 
> 
> Reviewed-by: Andrew Cooper 
> 
> I'm happy to fix the commit message on commit.

Thanks!



Re: xl dmesg buffer too small for Xen 4.18?

2023-09-18 Thread Chuck Zmudzinski
On 9/18/2023 2:49 PM, Julien Grall wrote:
> (+Roger and moving to xen-devel)
> 
> Hi,
> 
> On 18/09/2023 19:17, Chuck Zmudzinski wrote:
>> On 9/18/2023 9:00 AM, Chuck Zmudzinski wrote:
>>> Hello,
>>>
>>> I tested Xen 4.18~rc0 on Alma Linux 9 and my first tests indicate it works 
>>> fine for starting the guests I manage but I notice that immediately after 
>>> boot and with only dom0 running on the system, I get:
>>>
>>> [user@Malmalinux ~]$ sudo xl dmesg
>>> 00bee72000-0bee72fff type=7 attr=000f
>>> (XEN)  0bee73000-0bef49fff type=4 attr=000f
>>> (XEN)  0bef4a000-0bef4bfff type=7 attr=000f
>>> (XEN)  0bef4c000-0befbafff type=4 attr=000f
>>> (XEN)  0befbb000-0befbbfff type=7 attr=000f
>>> ...
>>>
>>> I have noticed the buffer fills up quickly on earlier Xen versions, but 
>>> never have I seen it fill up during boot and with only dom0 running.
>>>
>>> Can increasing the buffer fix this? How would one do that?
>>>
>>> Thanks
>>>
>> 
>> I see the setting is the command line option conring_size:
>> 
>> https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#conring_size
>> 
>> The default is 16k, I tried 48k and that was big enough to capture all the 
>> messages at boot for 4.18 rc0. This is probably not an issue if the release 
>> candidate is being more verbose than the actual release will be. But if the 
>> release is still this verbose, maybe the default of 16k should be increased.
> 
> Thanks for the report. This remind me the series [1] from Roger which 
> tries to increase the default size to 32K. @Roger, I am wondering if we 
> should revive it?
> 
> Cheers,
> 
> [1] 
> https://lore.kernel.org/xen-devel/20220630082330.82706-1-roger@citrix.com/
> 

I just tested with 24k, and that is also big enough. So 32k would also be good. 
But the default of 16k appears to be too small for Xen 4.18 rc0 on my machine.



[xen-unstable-smoke test] 183031: regressions - FAIL

2023-09-18 Thread osstest service owner
flight 183031 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183031/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt  8 xen-boot fail REGR. vs. 183030
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  8 xen-bootfail REGR. vs. 183030

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  b5926c6ecf05c28ee99c6248c42d691ccbf0c315
baseline version:
 xen  2ea38251eb67639be7aa9d7b64084b1be0230273

Last test of basis   183030  2023-09-18 14:02:00 Z0 days
Testing same since   183031  2023-09-18 17:01:55 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64fail
 test-amd64-amd64-libvirt fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit b5926c6ecf05c28ee99c6248c42d691ccbf0c315
Author: Andrew Cooper 
Date:   Wed Aug 30 20:24:25 2023 +0100

x86/spec-ctrl: Mitigate the Zen1 DIV leakage

In the Zen1 microarchitecure, there is one divider in the pipeline which
services uops from both threads.  In the case of #DE, the latched result 
from
the previous DIV to execute will be forwarded speculatively.

This is an interesting covert channel that allows two threads to communicate
without any system calls.  In also allows userspace to obtain the result of
the most recent DIV instruction executed (even speculatively) in the core,
which can be from a higher privilege context.

Scrub the result from the divider by executing a non-faulting divide.  This
needs performing on the exit-to-guest paths, and ist_exit-to-Xen.

Alternatives in IST context is believed safe now that it's done in NMI
context.

This is XSA-439 / CVE-2023-20588.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 

commit de1d265001397f308c5c3c5d3ffc30e7ef8c0705
Author: Andrew Cooper 
Date:   Fri Sep 15 12:13:51 2023 +0100

x86/amd: Introduce is_zen{1,2}_uarch() predicates

We already have 3 cases using STIBP as a Zen1/2 heuristic, and are about to
introduce a 4th.  Wrap the heuristic into a pair of predicates rather than
opencoding it, and the explanation of the heuristic, at each usage site.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 

commit 3ee6066bcd737756b0990d417d94eddc0b0d2585
Author: Andrew Cooper 
Date:   Wed Sep 13 13:53:33 2023 +0100

x86/spec-ctrl: Issue VERW during IST exit to Xen

There is a corner case where e.g. an NMI hitting an exit-to-guest path after
SPEC_CTRL_EXIT_TO_* would have run the entire NMI handler *after* the VERW
flush to scrub potentially sensitive data from uarch buffers.

In order to compensate, issue VERW when exiting to Xen from an IST entry.

SPEC_CTRL_EXIT_TO_XEN already has two reads of spec_ctrl_flags off the 
stack,
and we're about to add a third.  Load the field into %ebx, and list the
register as clobbered.

%r12 has been arranged to be the ist_exit signal, so add this as an input
dependency and use it to identify when to issue a VERW.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 

commit 21bdc25b05a0f8ab6bc73520a9ca01327360732c
Author: Andrew Cooper 
Date:   Wed Sep 13 12:20:12 2023 +0100

x86/entry: Track the IST-ness of an entry for the exit paths

Use %r12 to 

Re: RFC: arm64: Handling reserved memory nodes

2023-09-18 Thread Julien Grall




On 16/09/2023 09:16, Leo Yan wrote:

Hi Julien,


Hi Leo,


On Thu, Sep 14, 2023 at 10:37:05AM +0100, Julien Grall wrote:
My understanding is the local variable 'xatp' is located in the stack
and the stack is located in the intermediate physical address
0x1_a017_1000. Thus, when copies the structure 'xatp' from Dom0 to the
Xen hypervisor, the hypervisor detects the IPA is not managed by its
frame table.


Thanks for the log and the analysis (see below).



This is because the memory range [0001a000 - 0001bfff]
is the reserved region so Xen hypervisor doesn't populate this region
and the frame table doesn't initialize for it.


I agree with this statement however...


On the other hand,
this memory range is passed to Dom0 Kernel as _normal_ memory region
and the kernel allocates pages from this memory region, same with
other memory regions.


... from my understanding reserved-memory are just normal memory that 
are set aside for a specific purpose. So Xen has to create a 'memory' 
node *and* a 'reserved-memory' region.


With that the kernel is supposed to exclude all the 'reserved-memory' 
from normal usage unless they have the node contains the property 
'reusable'. This was more clear before the binding was converted to YAML 
in [1].




[...]


## Fixes

I think it's wrong to add the reserved memory regions into the DT
binding as normal memory nodes for Dom0 kernel.  On the other hand, we
cannot simply remove these reserved memory regions and don't pass to
Dom0 kernel - we might reserve memory for specific purpose, for example,
ramoops [1] for kernel debugging.

The right thing to do is to keep these reserved memory nodes to Dom0
kernel.  So one task is to record properties for these reserved memory
node name and properties and pass to Dom0 kernel.

The difficulty is how we can avoid allocate these reserved memory
regions in Xen hypervisor.  We cannot register the reserved memory
into the boot pages, otherwise, the reserved memory might be allocated
in the early phase.  But we need to register these pages into the
frame management framework and reserve them in the first place, so
that we can allow Dom0 kernel to use them.  (I checked a bit the static
memory mechanism, seems to me we cannot use it to resolve this issue).


 From my understanding reserved region are normal RAM which have been carved
out for specific purpose. They may expect different caching policy (e.g.
non-cachable).


Yes, I agree, but we cannot assume the mapping attribution is always
non-cachable. It can be mapped as normal type, or device type (with and
different variants, like strong ordered, write-combined, etc).


For clarification, I didn't suggest it would always be cachable. It can 
be anything and I only provided an example (hence the 'e.g.').





AFAIK, Xen doesn't have the capability to know the memory
attribute (the DT binding only tell whether the region should not mapped.
See the property "no-map"), hence why they were excluded from the memory
management.


I think it's right to exclude the reserved memory regions from the
normal memory management.

Here the problem is these reserved memory regions are passed as normal
memory nodes to Dom0 kernel, then Dom0 kernel allocates pages from
these reserved memory regions.  Apparently, this might lead to conflict,
e.g. the reserved memory is used by Dom0 kernel, at the meantime the
memory is used by another purpose (e.g. by MCU in the system).


See above. I think this is correct to pass both 'memory' and 
'reserved-memory'. Now, it is possible that Xen may not create the 
device-tree correctly.


I would suggest to look how Linux is populating the memory and whether 
it actually skipped the regions.




Here I am a bit confused for "Xen doesn't have the capability to know
the memory attribute".  I looked into the file arch/arm/guest_walk.c,
IIUC, it walks through the stage 1's page tables for the virtual
machine and get the permission for the mapping, we also can get to
know the mapping attribute, right?


Most of the time, Xen will use the HW to translate the guest virtual 
address to an intermediation physical address. Looking at the 
specification, it looks like that PAR_EL1 will contain the memory 
attribute which I didn't know.


We would then need to read MAIR_EL1 to find the attribute and also the 
memory attribute in the stage-2 to figure out the final memory 
attribute. This is feasible but the Xen ABI mandates that region passed 
to Xen have a specific memory attributes (see the comment at the top of 
xen/include/public/arch-arm.h).


Anyway, in your case, Linux is using the buffer is on the stack. So the 
region must have been mapped with the proper attribute.




Another question for the attribute for MMIO regions. For mapping MMIO
regions, prepare_dtb_hwdom() sets the attribute 'p2m_mmio_direct_c'
for the stage 2, but in the Linux kernel the MMIO's attribute can
be one of below variants:

- ioremap(): device type with nGnRE;
- ioremap_np(): 

Re: xl dmesg buffer too small for Xen 4.18?

2023-09-18 Thread Julien Grall

(+Roger and moving to xen-devel)

Hi,

On 18/09/2023 19:17, Chuck Zmudzinski wrote:

On 9/18/2023 9:00 AM, Chuck Zmudzinski wrote:

Hello,

I tested Xen 4.18~rc0 on Alma Linux 9 and my first tests indicate it works fine 
for starting the guests I manage but I notice that immediately after boot and 
with only dom0 running on the system, I get:

[user@Malmalinux ~]$ sudo xl dmesg
00bee72000-0bee72fff type=7 attr=000f
(XEN)  0bee73000-0bef49fff type=4 attr=000f
(XEN)  0bef4a000-0bef4bfff type=7 attr=000f
(XEN)  0bef4c000-0befbafff type=4 attr=000f
(XEN)  0befbb000-0befbbfff type=7 attr=000f
...

I have noticed the buffer fills up quickly on earlier Xen versions, but never 
have I seen it fill up during boot and with only dom0 running.

Can increasing the buffer fix this? How would one do that?

Thanks



I see the setting is the command line option conring_size:

https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#conring_size

The default is 16k, I tried 48k and that was big enough to capture all the 
messages at boot for 4.18 rc0. This is probably not an issue if the release 
candidate is being more verbose than the actual release will be. But if the 
release is still this verbose, maybe the default of 16k should be increased.


Thanks for the report. This remind me the series [1] from Roger which 
tries to increase the default size to 32K. @Roger, I am wondering if we 
should revive it?


Cheers,

[1] 
https://lore.kernel.org/xen-devel/20220630082330.82706-1-roger@citrix.com/


--
Julien Grall



Re: [PATCH v6 0/4] ppc: Enable full Xen build

2023-09-18 Thread Shawn Anastasio
On 9/18/23 8:19 AM, Jan Beulich wrote:
> On 14.09.2023 21:03, Shawn Anastasio wrote:
>> Shawn Anastasio (4):
>>   xen/ppc: Implement bitops.h
>>   xen/ppc: Define minimal stub headers required for full build
> 
> Compilation fails after applying this.
> 
>>   xen/ppc: Add stub function and symbol definitions
> 
> Continuing nevertheless, linking fails after this.
> 
>>   xen/ppc: Enable full Xen build
> 
> Things build okay for me when the full series is applied. Generally we
> wouldn't deliberately break the build between any two patches; doing so
> may be okay here (except I guest CI's build-each-commit would be upset),
> but I'll do so only upon explicit request (and with no-one else objecting).
>

Sorry about that. Going forward I'll take more care to ensure that
partially-applied series still build correctly. For this series though,
if you could make an exception it would be appreciated.

> Jan

Thanks,
Shawn



Re: [XEN PATCH v2 0/5] python: Use setuptools instead of the deprecated distutils

2023-09-18 Thread Javi Merino
On Tue, Sep 12, 2023 at 08:49:04AM +0100, Andrew Cooper wrote:
> On 11/09/2023 5:50 pm, Javi Merino wrote:
> > This series picks up Marek's v1 from
> > https://lore.kernel.org/xen-devel/20230316171634.320626-1-marma...@invisiblethingslab.com/
> > Changes since v1:
> >   - Update all containers to have setuptools, as python 3.12 depecrates 
> > distutils in favour of setuptools
> >   - Keep python2's support by falling back to distutils if setuptools is 
> > not installed
> >   - Drop the commit about raising the baseline requirement for python, as 
> > we keep supporting python2
> >
> > Javi Merino (2):
> >   automation: add python3's setuptools to containers
> >   README: update to remove old note about the build system's python
> > expectation
> >
> > Marek Marczykowski-Górecki (3):
> >   tools: convert setup.py to use setuptools
> >   tools: don't use distutils in configure nor Makefile
> >   tools: regenerate configure
> 
> Two general notes.
> 
> First, because you've (re)arranged and posted this series, you need to
> add your own Signed-off-by line to all patches, even unmodified ones
> Marek's that you've included.  SoB needs to be from everyone involved in
> handling the patch.

Sure.

> Second, patch 4 should be dropped, and a note put in patch 2 to the
> committer, who will use autoconf 2.69 and have a far far smaller delta
> to include.

I have regenerated it with 2.69 and the diff for tools/configure is
reduced to:

 tools/configure | 52 
 1 file changed, 20 insertions(+), 32 deletions(-)

Cheers,
Javi



[xen-unstable-smoke test] 183030: tolerable all pass - PUSHED

2023-09-18 Thread osstest service owner
flight 183030 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183030/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  2ea38251eb67639be7aa9d7b64084b1be0230273
baseline version:
 xen  290f82375d828ef93f831a5ef028f1283aa1ea47

Last test of basis   183011  2023-09-15 22:01:58 Z2 days
Testing same since   183030  2023-09-18 14:02:00 Z0 days1 attempts


People who touched revisions under test:
  Federico Serafini 
  George Dunlap 
  Jan Beulich 
  Roger Pau Monné 
  Shawn Anastasio 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   290f82375d..2ea38251eb  2ea38251eb67639be7aa9d7b64084b1be0230273 -> smoke



Re: [PATCH v2] x86/shutdown: change default reboot method preference

2023-09-18 Thread Roger Pau Monné
On Mon, Sep 18, 2023 at 05:44:47PM +0200, Jan Beulich wrote:
> On 18.09.2023 17:09, Roger Pau Monné wrote:
> > On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
> >> On 15.09.2023 09:43, Roger Pau Monne wrote:
> >>> The current logic to chose the preferred reboot method is based on the 
> >>> mode Xen
> >>> has been booted into, so if the box is booted from UEFI, the preferred 
> >>> reboot
> >>> method will be to use the ResetSystem() run time service call.
> >>>
> >>> However, that method seems to be widely untested, and quite often leads 
> >>> to a
> >>> result similar to:
> >>>
> >>> Hardware Dom0 shutdown: rebooting machine
> >>> [ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C]
> >>> CPU:0
> >>> RIP:e008:[<0017>] 0017
> >>> RFLAGS: 00010202   CONTEXT: hypervisor
> >>> [...]
> >>> Xen call trace:
> >>>[<0017>] R 0017
> >>>[] S 83207eff7b50
> >>>[] F machine_restart+0x1da/0x261
> >>>[] F apic_wait_icr_idle+0/0x37
> >>>[] F smp_call_function_interrupt+0xc7/0xcb
> >>>[] F call_function_interrupt+0x20/0x34
> >>>[] F do_IRQ+0x150/0x6f3
> >>>[] F common_interrupt+0x132/0x140
> >>>[] F 
> >>> arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> >>>[] F 
> >>> arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> >>>[] F arch/x86/domain.c#idle_loop+0xec/0xee
> >>>
> >>> 
> >>> Panic on CPU 0:
> >>> FATAL TRAP: vector = 6 (invalid opcode)
> >>> 
> >>>
> >>> Which in most cases does lead to a reboot, however that's unreliable.
> >>>
> >>> Change the default reboot preference to prefer ACPI over UEFI if 
> >>> available and
> >>> not in reduced hardware mode.
> >>>
> >>> This is in line to what Linux does, so it's unlikely to cause issues on 
> >>> current
> >>> and future hardware, since there's a much higher chance of vendors testing
> >>> hardware with Linux rather than Xen.
> >>
> >> I certainly appreciate this as a goal. However, ...
> >>
> >>> Add a special case for one Acer model that does require being rebooted 
> >>> using
> >>> ResetSystem().  See Linux commit 0082517fa4bce for rationale.
> >>
> >> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
> >> conforming systems.
> > 
> > I wouldn't call that platform spec-conforming when ACPI reboot doesn't
> > work reliably on it either.  I haven't been able to find a wording on
> > the UEFI specification that mandates using ResetSystem() in order to
> > reset the platform.  I've only found this wording:
> > 
> > "... then the UEFI OS Loader has taken control of the platform, and
> > EFI will not regain control of the system until the platform is reset.
> > One method of resetting the platform is through the EFI Runtime
> > Service ResetSystem()."
> > 
> > And this reads to me as a mere indication that one option is to use
> > ResetSystem(), but that there are likely other platform specific reset
> > methods that are suitable to be used for OSes and still be compliant
> > with the UEFI spec.
> 
> See my reference to ia64.

Right, I understand that on ia64 things might have been different, due
to the platform lacking any other reboot method, but I don't see how
this applies to x86 where there are other reboot methods.

> With ACPI_FADT_RESET_REGISTER not set, I don't
> think there would have been any other non-custom reboot method there. So
> while perhaps not mandated, it's still the designated abstraction layer.

Again the spec doesn't mention that ResetSystem() must be used, so
while it would make sense if it was reliable, it clearly isn't.  In
which case resorting to the more reliable method should always be
preferred, specially if the spec is so lax as to call ResetSystem()
"One method of resetting the platform".

We should also take into account that vendors are much more likely to
test new hardware with Linux rather than Xen, and hence it's low
probability that the default Linux reboot method doesn't work on a
platform, because that would hurt the vendor.

> >>> --- a/xen/arch/x86/shutdown.c
> >>> +++ b/xen/arch/x86/shutdown.c
> >>> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
> >>>  
> >>>  if ( xen_guest )
> >>>  reboot_type = BOOT_XEN;
> >>> +else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> >>> +reboot_type = BOOT_ACPI;
> >>>  else if ( efi_enabled(EFI_RS) )
> >>>  reboot_type = BOOT_EFI;
> >>> -else if ( acpi_disabled )
> >>> -reboot_type = BOOT_KBD;
> >>>  else
> >>> -reboot_type = BOOT_ACPI;
> >>> +reboot_type = BOOT_KBD;
> >>>  }
> >>>  
> >>>  static int __init cf_check override_reboot(const struct dmi_system_id *d)
> >>>  {
> >>>  enum reboot_type type = (long)d->driver_data;
> >>>  
> >>> -if ( type == BOOT_ACPI && acpi_disabled )
> >>> +if ( (type == BOOT_ACPI && acpi_disabled) ||
> >>> +  

Re: [PATCH v2] x86/shutdown: change default reboot method preference

2023-09-18 Thread Jan Beulich
On 18.09.2023 17:09, Roger Pau Monné wrote:
> On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
>> On 15.09.2023 09:43, Roger Pau Monne wrote:
>>> The current logic to chose the preferred reboot method is based on the mode 
>>> Xen
>>> has been booted into, so if the box is booted from UEFI, the preferred 
>>> reboot
>>> method will be to use the ResetSystem() run time service call.
>>>
>>> However, that method seems to be widely untested, and quite often leads to a
>>> result similar to:
>>>
>>> Hardware Dom0 shutdown: rebooting machine
>>> [ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C]
>>> CPU:0
>>> RIP:e008:[<0017>] 0017
>>> RFLAGS: 00010202   CONTEXT: hypervisor
>>> [...]
>>> Xen call trace:
>>>[<0017>] R 0017
>>>[] S 83207eff7b50
>>>[] F machine_restart+0x1da/0x261
>>>[] F apic_wait_icr_idle+0/0x37
>>>[] F smp_call_function_interrupt+0xc7/0xcb
>>>[] F call_function_interrupt+0x20/0x34
>>>[] F do_IRQ+0x150/0x6f3
>>>[] F common_interrupt+0x132/0x140
>>>[] F 
>>> arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
>>>[] F 
>>> arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
>>>[] F arch/x86/domain.c#idle_loop+0xec/0xee
>>>
>>> 
>>> Panic on CPU 0:
>>> FATAL TRAP: vector = 6 (invalid opcode)
>>> 
>>>
>>> Which in most cases does lead to a reboot, however that's unreliable.
>>>
>>> Change the default reboot preference to prefer ACPI over UEFI if available 
>>> and
>>> not in reduced hardware mode.
>>>
>>> This is in line to what Linux does, so it's unlikely to cause issues on 
>>> current
>>> and future hardware, since there's a much higher chance of vendors testing
>>> hardware with Linux rather than Xen.
>>
>> I certainly appreciate this as a goal. However, ...
>>
>>> Add a special case for one Acer model that does require being rebooted using
>>> ResetSystem().  See Linux commit 0082517fa4bce for rationale.
>>
>> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
>> conforming systems.
> 
> I wouldn't call that platform spec-conforming when ACPI reboot doesn't
> work reliably on it either.  I haven't been able to find a wording on
> the UEFI specification that mandates using ResetSystem() in order to
> reset the platform.  I've only found this wording:
> 
> "... then the UEFI OS Loader has taken control of the platform, and
> EFI will not regain control of the system until the platform is reset.
> One method of resetting the platform is through the EFI Runtime
> Service ResetSystem()."
> 
> And this reads to me as a mere indication that one option is to use
> ResetSystem(), but that there are likely other platform specific reset
> methods that are suitable to be used for OSes and still be compliant
> with the UEFI spec.

See my reference to ia64. With ACPI_FADT_RESET_REGISTER not set, I don't
think there would have been any other non-custom reboot method there. So
while perhaps not mandated, it's still the designated abstraction layer.

>>> --- a/xen/arch/x86/shutdown.c
>>> +++ b/xen/arch/x86/shutdown.c
>>> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
>>>  
>>>  if ( xen_guest )
>>>  reboot_type = BOOT_XEN;
>>> +else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
>>> +reboot_type = BOOT_ACPI;
>>>  else if ( efi_enabled(EFI_RS) )
>>>  reboot_type = BOOT_EFI;
>>> -else if ( acpi_disabled )
>>> -reboot_type = BOOT_KBD;
>>>  else
>>> -reboot_type = BOOT_ACPI;
>>> +reboot_type = BOOT_KBD;
>>>  }
>>>  
>>>  static int __init cf_check override_reboot(const struct dmi_system_id *d)
>>>  {
>>>  enum reboot_type type = (long)d->driver_data;
>>>  
>>> -if ( type == BOOT_ACPI && acpi_disabled )
>>> +if ( (type == BOOT_ACPI && acpi_disabled) ||
>>> + (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
>>>  type = BOOT_KBD;
>>
>> I guess I don't follow this adjustment: Why would we fall back to KBD
>> first thing? Wouldn't it make sense to try ACPI first if EFI cannot
>> be used?
> 
> This is IMO a weird corner case, we have a explicit request to use one
> reboot method, but we cannot do so because the component is disabled.
> I've assumed that falling back to KBD was the safest option.
> 
> For example if we have to explicitly reboot using UEFI it's likely
> because ACPI (the proposed default method) is not suitable, and hence
> falling back to ACPI here won't help.

Perhaps, but falling back to KBD isn't necessarily going to work either.
And it might well be that on said Acer no reboot method would actually
yield consistent behavior, except for ResetSystem(). The fallback logic
here as well as that in machine_restart() is all based on guesswork
anyway.

Jan



Re: [PATCH v2] x86/shutdown: change default reboot method preference

2023-09-18 Thread Roger Pau Monné
On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
> On 15.09.2023 09:43, Roger Pau Monne wrote:
> > The current logic to chose the preferred reboot method is based on the mode 
> > Xen
> > has been booted into, so if the box is booted from UEFI, the preferred 
> > reboot
> > method will be to use the ResetSystem() run time service call.
> > 
> > However, that method seems to be widely untested, and quite often leads to a
> > result similar to:
> > 
> > Hardware Dom0 shutdown: rebooting machine
> > [ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C]
> > CPU:0
> > RIP:e008:[<0017>] 0017
> > RFLAGS: 00010202   CONTEXT: hypervisor
> > [...]
> > Xen call trace:
> >[<0017>] R 0017
> >[] S 83207eff7b50
> >[] F machine_restart+0x1da/0x261
> >[] F apic_wait_icr_idle+0/0x37
> >[] F smp_call_function_interrupt+0xc7/0xcb
> >[] F call_function_interrupt+0x20/0x34
> >[] F do_IRQ+0x150/0x6f3
> >[] F common_interrupt+0x132/0x140
> >[] F 
> > arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> >[] F 
> > arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> >[] F arch/x86/domain.c#idle_loop+0xec/0xee
> > 
> > 
> > Panic on CPU 0:
> > FATAL TRAP: vector = 6 (invalid opcode)
> > 
> > 
> > Which in most cases does lead to a reboot, however that's unreliable.
> > 
> > Change the default reboot preference to prefer ACPI over UEFI if available 
> > and
> > not in reduced hardware mode.
> > 
> > This is in line to what Linux does, so it's unlikely to cause issues on 
> > current
> > and future hardware, since there's a much higher chance of vendors testing
> > hardware with Linux rather than Xen.
> 
> I certainly appreciate this as a goal. However, ...
> 
> > Add a special case for one Acer model that does require being rebooted using
> > ResetSystem().  See Linux commit 0082517fa4bce for rationale.
> 
> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
> conforming systems.

I wouldn't call that platform spec-conforming when ACPI reboot doesn't
work reliably on it either.  I haven't been able to find a wording on
the UEFI specification that mandates using ResetSystem() in order to
reset the platform.  I've only found this wording:

"... then the UEFI OS Loader has taken control of the platform, and
EFI will not regain control of the system until the platform is reset.
One method of resetting the platform is through the EFI Runtime
Service ResetSystem()."

And this reads to me as a mere indication that one option is to use
ResetSystem(), but that there are likely other platform specific reset
methods that are suitable to be used for OSes and still be compliant
with the UEFI spec.

> 
> > I'm not aware of using ACPI reboot causing issues on boxes that do have
> > properly implemented ResetSystem() methods.
> 
> I'm also puzzled by this statement: That Acer aspect is a clear indication
> of there being an issue.

Hm yes, I had that sentence from v1, before realizing the Acer quirk.
So there's one know issue with using ACPI as the default reboot
method vs many issues when using the UEFI one.

> Plus it's quite easy to see that hooks may be put
> in place by various firmware components that would then be used to make
> certain adjustments to the platform, ahead of an orderly reboot / shutdown.

Well, I very much doubt any vendor would rely on this, seeing as both
Linux and Windows both default to ACPI reboot, and the UEFI spec not
mandating the use of ResetSystem() anyway.

> > --- a/xen/arch/x86/shutdown.c
> > +++ b/xen/arch/x86/shutdown.c
> > @@ -150,19 +150,20 @@ static void default_reboot_type(void)
> >  
> >  if ( xen_guest )
> >  reboot_type = BOOT_XEN;
> > +else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> > +reboot_type = BOOT_ACPI;
> >  else if ( efi_enabled(EFI_RS) )
> >  reboot_type = BOOT_EFI;
> > -else if ( acpi_disabled )
> > -reboot_type = BOOT_KBD;
> >  else
> > -reboot_type = BOOT_ACPI;
> > +reboot_type = BOOT_KBD;
> >  }
> >  
> >  static int __init cf_check override_reboot(const struct dmi_system_id *d)
> >  {
> >  enum reboot_type type = (long)d->driver_data;
> >  
> > -if ( type == BOOT_ACPI && acpi_disabled )
> > +if ( (type == BOOT_ACPI && acpi_disabled) ||
> > + (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
> >  type = BOOT_KBD;
> 
> I guess I don't follow this adjustment: Why would we fall back to KBD
> first thing? Wouldn't it make sense to try ACPI first if EFI cannot
> be used?

This is IMO a weird corner case, we have a explicit request to use one
reboot method, but we cannot do so because the component is disabled.
I've assumed that falling back to KBD was the safest option.

For example if we have to explicitly reboot using UEFI it's likely
because ACPI 

Re: [PATCH 6/7] block: Clean up local variable shadowing

2023-09-18 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 31.08.2023 um 15:25 hat Markus Armbruster geschrieben:
>> Local variables shadowing other local variables or parameters make the
>> code needlessly hard to understand.  Tracked down with -Wshadow=local.
>> Clean up: delete inner declarations when they are actually redundant,
>> else rename variables.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  block.c  |  7 ---
>>  block/rbd.c  |  2 +-
>>  block/stream.c   |  1 -
>>  block/vvfat.c| 34 +-
>>  hw/block/xen-block.c |  6 +++---
>>  5 files changed, 25 insertions(+), 25 deletions(-)
>
> I wonder why you made vdi a separate patch, but not vvfat, even though
> that has more changes. (Of course, my selfish motivation for asking this
> is that I could have given a R-b for it and wouldn't have to look at it
> again in a v2 :-))

I split by maintainer.  The files changed by this patch are only covered
by "Block layer core".

>> diff --git a/block.c b/block.c
>> index a307c151a8..7f0003d8ac 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -3001,7 +3001,8 @@ static BdrvChild 
>> *bdrv_attach_child_common(BlockDriverState *child_bs,
>> BdrvChildRole child_role,
>> uint64_t perm, uint64_t 
>> shared_perm,
>> void *opaque,
>> -   Transaction *tran, Error **errp)
>> +   Transaction *transaction,
>> +   Error **errp)
>>  {
>>  BdrvChild *new_child;
>>  AioContext *parent_ctx, *new_child_ctx;
>> @@ -3088,7 +3089,7 @@ static BdrvChild 
>> *bdrv_attach_child_common(BlockDriverState *child_bs,
>>  .old_parent_ctx = parent_ctx,
>>  .old_child_ctx = child_ctx,
>>  };
>> -tran_add(tran, _attach_child_common_drv, s);
>> +tran_add(transaction, _attach_child_common_drv, s);
>>  
>>  if (new_child_ctx != child_ctx) {
>>  aio_context_release(new_child_ctx);
>
> I think I would resolve this one the other way around. 'tran' is the
> typical name for the parameter and it is the transaction that this
> function should add things to.
>
> The other one that shadows it is a local transaction that is completed
> within the function. I think it's better if that one has a different
> name.
>
> As usual, being more specific than just 'tran' vs. 'transaction' would
> be nice. Maybe 'aio_ctx_tran' for the nested one?

Can do.

> The rest looks okay.

Thanks!




Re: [PATCH 5/7] block/vdi: Clean up local variable shadowing

2023-09-18 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 31.08.2023 um 15:25 hat Markus Armbruster geschrieben:
>> Local variables shadowing other local variables or parameters make the
>> code needlessly hard to understand.  Tracked down with -Wshadow=local.
>> Clean up: delete inner declarations when they are actually redundant,
>> else rename variables.
>> 
>> Signed-off-by: Markus Armbruster 
>
>> @@ -700,7 +699,7 @@ nonallocating_write:
>>  /* One or more new blocks were allocated. */
>>  VdiHeader *header;
>>  uint8_t *base;
>> -uint64_t offset;
>> +uint64_t offs;
>>  uint32_t n_sectors;
>>  
>>  g_free(block);
>> @@ -723,11 +722,11 @@ nonallocating_write:
>>  bmap_first /= (SECTOR_SIZE / sizeof(uint32_t));
>>  bmap_last /= (SECTOR_SIZE / sizeof(uint32_t));
>>  n_sectors = bmap_last - bmap_first + 1;
>> -offset = s->bmap_sector + bmap_first;
>> +offs = s->bmap_sector + bmap_first;
>>  base = ((uint8_t *)>bmap[0]) + bmap_first * SECTOR_SIZE;
>>  logout("will write %u block map sectors starting from entry %u\n",
>> n_sectors, bmap_first);
>> -ret = bdrv_co_pwrite(bs->file, offset * SECTOR_SIZE,
>> +ret = bdrv_co_pwrite(bs->file, offs * SECTOR_SIZE,
>>   n_sectors * SECTOR_SIZE, base, 0);
>>  }
>
> Having two variables 'offset' and 'offs' doesn't really help with
> clarity either. Can we be more specific and use something like
> 'bmap_offset' here?

Sure!




Re: [PATCH 9/9] x86/spec-ctrl: Mitigate the Zen1 DIV leakge

2023-09-18 Thread Andrew Cooper
On 18/09/2023 12:15 pm, Jan Beulich wrote:
> On 15.09.2023 17:00, Andrew Cooper wrote:
>> In the Zen1 microarchitecure, there is one divider in the pipeline which
>> services uops from both threads.  In the case of #DE, the latched result from
>> the previous DIV to execute will be forwarded speculatively.
>>
>> This is an interesting covert channel that allows two threads to communicate
>> without any system calls.  In also allows userspace to obtain the result of
>> the most recent DIV instruction executed (even speculatively) in the core,
>> which can be from a higher privilege context.
>>
>> Scrub the result from the divider by executing a non-faulting divide.  This
>> needs performing on the exit-to-guest paths, and ist_exit-to-Xen.
>>
>> This is XSA-439 / CVE-2023-20588.
>>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Jan Beulich 
>
> Nevertheless I would have hoped you add at least a sentence on the 
> alternatives
> patching of the IST path. Hitting #MC while patching is possible, after all
> (yes, you will tell me that #MC is almost certainly fatal to the system 
> anyway,
> but still).

I'll see what I can do.

>
>> @@ -955,6 +960,46 @@ static void __init srso_calculations(bool 
>> hw_smt_enabled)
>>  setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
>>  }
>>  
>> +/*
>> + * The Div leakage issue is specific to the AMD Zen1 microarchitecure.
>> + *
>> + * However, there's no $FOO_NO bit defined, so if we're virtualised we have 
>> no
>> + * hope of spotting the case where we might move to vulnerable hardware.  We
>> + * also can't make any useful conclusion about SMT-ness.
>> + *
>> + * Don't check the hypervisor bit, so at least we do the safe thing when
>> + * booting on something that looks like a Zen1 CPU.
>> + */
>> +static bool __init has_div_vuln(void)
>> +{
>> +if ( !(boot_cpu_data.x86_vendor &
>> +   (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
>> +return false;
>> +
>> +if ( (boot_cpu_data.x86 != 0x17 && boot_cpu_data.x86 != 0x18) ||
>> + !is_zen1_uarch() )
>> +return false;
>> +
>> +return true;
>> +}
> Just to mention it - personally I consider
>
> ...
> if ( ... )
> return true;
>
> return false;
> }
>
> a minor anti-pattern, as a sole return imo makes more clear what's going on.

Well yes, here is an area where we disagree.

It's the same as trailing commas on lists, or "| 0)" for bitmaps for
making a smaller delta for future changes.

> In a case like this, where you intentionally split return paths anyway, I'd
> then go with
>
> static bool __init has_div_vuln(void)
> {
> if ( !(boot_cpu_data.x86_vendor &
>(X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> return false;
>
> if ( boot_cpu_data.x86 != 0x17 && boot_cpu_data.x86 != 0x18 )
> return false;
>
> return is_zen1_uarch();
> }

I'll swap to this because there is no realistic chance of the logic
chain needing to expand.

~Andrew



Re: [PATCH 8/9] x86/amd: Introduce is_zen{1,2}_uarch() predicates

2023-09-18 Thread Jan Beulich
On 18.09.2023 16:02, Andrew Cooper wrote:
> On 18/09/2023 12:07 pm, Jan Beulich wrote:
>> On 15.09.2023 17:00, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/include/asm/amd.h
>>> +++ b/xen/arch/x86/include/asm/amd.h
>>> @@ -140,6 +140,17 @@
>>> AMD_MODEL_RANGE(0x11, 0x0, 0x0, 0xff, 0xf), \
>>> AMD_MODEL_RANGE(0x12, 0x0, 0x0, 0xff, 0xf))
>>>  
>>> +/*
>>> + * The Zen1 and Zen2 microarchitectures are implemented by AMD (Fam17h) and
>>> + * Hygon (Fam18h) but without simple model number rules.  Instead, use 
>>> STIBP
>>> + * as a heuristic that distinguishes the two.
>>> + *
>>> + * The caller is required to perform the appropriate vendor/family checks
>>> + * first.
>>> + */
>>> +#define is_zen1_uarch() (!boot_cpu_has(X86_FEATURE_AMD_STIBP))
>>> +#define is_zen2_uarch()   boot_cpu_has(X86_FEATURE_AMD_STIBP)
>>> +
>>>  struct cpuinfo_x86;
>>>  int cpu_has_amd_erratum(const struct cpuinfo_x86 *, int, ...);
>> With one simply the opposite of the other, and with the requirement of a
>> family check up front, do we really need both? Personally I'd prefer if
>> we had just the latter. Yet in any event
>> Reviewed-by: Jan Beulich 
> 
> We specifically do want both, because they're use is not symmetric at
> callsites.
> 
> In particular, having only one would make the following patch illogical
> to read.

I don't think it would, but that's perhaps one more of the many areas where
we take different perspectives.

Jan



Re: [PATCH 8/9] x86/amd: Introduce is_zen{1,2}_uarch() predicates

2023-09-18 Thread Andrew Cooper
On 18/09/2023 12:07 pm, Jan Beulich wrote:
> On 15.09.2023 17:00, Andrew Cooper wrote:
>> --- a/xen/arch/x86/include/asm/amd.h
>> +++ b/xen/arch/x86/include/asm/amd.h
>> @@ -140,6 +140,17 @@
>> AMD_MODEL_RANGE(0x11, 0x0, 0x0, 0xff, 0xf),  \
>> AMD_MODEL_RANGE(0x12, 0x0, 0x0, 0xff, 0xf))
>>  
>> +/*
>> + * The Zen1 and Zen2 microarchitectures are implemented by AMD (Fam17h) and
>> + * Hygon (Fam18h) but without simple model number rules.  Instead, use STIBP
>> + * as a heuristic that distinguishes the two.
>> + *
>> + * The caller is required to perform the appropriate vendor/family checks
>> + * first.
>> + */
>> +#define is_zen1_uarch() (!boot_cpu_has(X86_FEATURE_AMD_STIBP))
>> +#define is_zen2_uarch()   boot_cpu_has(X86_FEATURE_AMD_STIBP)
>> +
>>  struct cpuinfo_x86;
>>  int cpu_has_amd_erratum(const struct cpuinfo_x86 *, int, ...);
> With one simply the opposite of the other, and with the requirement of a
> family check up front, do we really need both? Personally I'd prefer if
> we had just the latter. Yet in any event
> Reviewed-by: Jan Beulich 

We specifically do want both, because they're use is not symmetric at
callsites.

In particular, having only one would make the following patch illogical
to read.

~Andrew



Re: [PATCH 6/9] x86/entry: Track the IST-ness of an entry for the exit paths

2023-09-18 Thread Andrew Cooper
On 18/09/2023 12:02 pm, Jan Beulich wrote:
> On 15.09.2023 17:00, Andrew Cooper wrote:
>> Use %r12 to hold an ist_exit boolean.  This register is zero elsewhere in the
>> entry/exit asm, so it only needs setting in the IST path.
>>
>> As this is subtle and fragile, add check_ist_exit() to be used in debugging
>> builds to cross-check that the ist_exit boolean matches the entry vector.
>>
>> Write check_ist_exit() it in C, because it's debug only and the logic more
>> complicated than I care to maintain in asm.
>>
>> For now, we only need to use this signal in the exit-to-Xen path, but some
>> exit-to-guest paths happen in IST context too.  Check the correctness in all
>> exit paths to avoid the logic bitrotting.
>>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Jan Beulich 

Thanks.

>
> I understand you then didn't like the idea of macro-izing ...
>
>> --- a/xen/arch/x86/x86_64/compat/entry.S
>> +++ b/xen/arch/x86/x86_64/compat/entry.S
>> @@ -117,8 +117,15 @@ compat_process_trap:
>>  call  compat_create_bounce_frame
>>  jmp   compat_test_all_events
>>  
>> -/* %rbx: struct vcpu, interrupts disabled */
>> +/* %rbx: struct vcpu, %r12: ist_exit, interrupts disabled */
>>  ENTRY(compat_restore_all_guest)
>> +
>> +#ifdef CONFIG_DEBUG
>> +mov   %rsp, %rdi
>> +mov   %r12, %rsi
>> +call  check_ist_exit
>> +#endif
>> +
>>  ASSERT_INTERRUPTS_DISABLED
>>  mov   $~(X86_EFLAGS_IOPL | X86_EFLAGS_VM), %r11d
>>  and   UREGS_eflags(%rsp),%r11d
>> --- a/xen/arch/x86/x86_64/entry.S
>> +++ b/xen/arch/x86/x86_64/entry.S
>> @@ -142,8 +142,15 @@ process_trap:
>>  
>>  .section .text.entry, "ax", @progbits
>>  
>> -/* %rbx: struct vcpu, interrupts disabled */
>> +/* %rbx: struct vcpu, %r12: ist_exit, interrupts disabled */
>>  restore_all_guest:
>> +
>> +#ifdef CONFIG_DEBUG
>> +mov   %rsp, %rdi
>> +mov   %r12, %rsi
>> +call  check_ist_exit
>> +#endif
>> +
>>  ASSERT_INTERRUPTS_DISABLED
>>  
>>  /* Stash guest SPEC_CTRL value while we can read struct vcpu. */
>> @@ -659,8 +666,15 @@ ENTRY(early_page_fault)
>>  .section .text.entry, "ax", @progbits
>>  
>>  ALIGN
>> -/* No special register assumptions. */
>> +/* %r12=ist_exit */
>>  restore_all_xen:
>> +
>> +#ifdef CONFIG_DEBUG
>> +mov   %rsp, %rdi
>> +mov   %r12, %rsi
>> +call  check_ist_exit
>> +#endif
> ... these three instances of identical code you add, along the lines of
> ASSERT_INTERRUPTS_DISABLED?

There's no header that's unique to just the entry.S's, and it's only 3
instructions that need a very specific stack and state layout.

The SPEC_CTRL_* macros are already a giant source of pain, and for 3
instructions I don't think the complexity of the abstraction is worth it.


Furthermore, I've got some fixes to the other ASSERT_* macros which are
going to make them a bit more like this.

~Andrew



Re: [PATCH v6 0/4] ppc: Enable full Xen build

2023-09-18 Thread Jan Beulich
On 14.09.2023 21:03, Shawn Anastasio wrote:
> Shawn Anastasio (4):
>   xen/ppc: Implement bitops.h
>   xen/ppc: Define minimal stub headers required for full build

Compilation fails after applying this.

>   xen/ppc: Add stub function and symbol definitions

Continuing nevertheless, linking fails after this.

>   xen/ppc: Enable full Xen build

Things build okay for me when the full series is applied. Generally we
wouldn't deliberately break the build between any two patches; doing so
may be okay here (except I guest CI's build-each-commit would be upset),
but I'll do so only upon explicit request (and with no-one else objecting).

Jan



Re: [PATCH v6 1/4] xen/ppc: Implement bitops.h

2023-09-18 Thread Jan Beulich
On 14.09.2023 21:03, Shawn Anastasio wrote:
> Implement bitops.h, based on Linux's implementation as of commit
> 5321d1b1afb9a17302c6cec79f0cbf823eb0d3fc. Though it is based off of
> Linux's implementation, this code diverges significantly in a number of
> ways:
>   - Bitmap entries changed to 32-bit words to match X86 and Arm on Xen
>   - PPC32-specific code paths dropped
>   - Formatting completely re-done to more closely line up with Xen.
> Including 4 space indentation.
>   - Use GCC's __builtin_popcount* for hweight* implementation
> 
> Signed-off-by: Shawn Anastasio 

Acked-by: Jan Beulich 





Re: [PATCH 2/7] x86/emul: Fix and extend #DB trap handling

2023-09-18 Thread Jan Beulich
On 18.09.2023 14:19, Andrew Cooper wrote:
> On 18/09/2023 12:30 pm, Jan Beulich wrote:
>> On 18.09.2023 11:24, Andrew Cooper wrote:
>>> On 15/09/2023 9:36 pm, Andrew Cooper wrote:
 --- a/xen/arch/x86/x86_emulate/x86_emulate.c
 +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
 @@ -8379,13 +8379,6 @@ x86_emulate(
  if ( !mode_64bit() )
  _regs.r(ip) = (uint32_t)_regs.r(ip);
  
 -/* Should a singlestep #DB be raised? */
 -if ( rc == X86EMUL_OKAY && singlestep && !ctxt->retire.mov_ss )
 -{
 -ctxt->retire.singlestep = true;
 -ctxt->retire.sti = false;
 -}
 -
  if ( rc != X86EMUL_DONE )
  *ctxt->regs = _regs;
  else
 @@ -8394,6 +8387,19 @@ x86_emulate(
  rc = X86EMUL_OKAY;
  }
  
 +/* Should a singlestep #DB be raised? */
 +if ( rc == X86EMUL_OKAY && singlestep )
 +{
 +ctxt->retire.pending_dbg |= X86_DR6_BS;
 +
 +/* BROKEN - TODO, merge into pending_dbg. */
 +if ( !ctxt->retire.mov_ss )
 +{
 +ctxt->retire.singlestep = true;
 +ctxt->retire.sti = false;
 +}
>>> I occurs to me that setting X86_DR6_BS outside of the !mov_ss case will
>>> break HVM (when HVM swaps from singlestep to pending_dbg) until one of
>>> the further TODOs is addressed.
>>>
>>> This will need bringing back within the conditional to avoid regressions
>>> in the short term.
>> I'm afraid I don't understand this: Isn't the purpose to latch state no
>> matter whether it'll be consumed right away, or only on the next insn?
> 
> Yes, that is the intention in the longterm.
> 
> But in the short term, where I'm doing just enough to fix the %dr6 bits,
> putting this unconditionally in PENDING_DBG will break the emulation of
> mov-to-ss until the bigger todo of "wire INTERRUPTIBILITY/ACTIVITY state
> into the emulation context" is complete.

Since I assume we're talking about the tail of _hvm_emulate_one(), my
problem is that I cannot see how setting X86_DR6_BS would lead to a
problem there. Plus you don't touch x86/hvm/ at all in the series, and
the pending_dbg field gets newly introduced in the patch here. Hence it
looks to me as if for HVM the field could take any value, without
breaking the code. But then, from you explicitly pointing out a problem,
I can only infer that I'm overlooking something.

> The latter is definitely too big to fit into 4.18, and I can't
> intentionally regress mov-to-ss in a series we intend to backport.

Of course.

Jan



Re: [PATCH v1 00/29] Introduce stub headers necessary for full Xen build

2023-09-18 Thread Jan Beulich
On 18.09.2023 14:05, Oleksii wrote:
> On Mon, 2023-09-18 at 11:29 +0200, Jan Beulich wrote:
>> On 18.09.2023 10:51, Oleksii wrote:
>>> On Thu, 2023-09-14 at 17:08 +0200, Jan Beulich wrote:
 On 14.09.2023 16:56, Oleksii Kurochko wrote:
> Based on two patch series [1] and [2], the idea of which is to
> provide minimal
> amount of things for a complete Xen build, a large amount of
> headers are the same
> or almost the same, so it makes sense to move them to asm-
> generic.
>
> Also, providing such stub headers should help future
> architectures
> to add
> a full Xen build.
>
> [1]
> https://lore.kernel.org/xen-devel/cover.1694543103.git.sanasta...@raptorengineering.com/
> [2]
> https://lore.kernel.org/xen-devel/cover.1692181079.git.oleksii.kuroc...@gmail.com/
>
> Oleksii Kurochko (29):
>   xen/asm-generic: introduce stub header spinlock.h

 At the example of this, personally I think this goes too far.
 Headers
 in
 asm-generic should be for the case where an arch elects to not
 implement
 certain functionality. Clearly spinlocks are required uniformly.
>>> It makes sense. Then I will back to the option [2] where I
>>> introduced
>>> all this headers as part of RISC-V architecture. 
>>
>> You did see though that in a reply to my own mail I said I take back
>> the
>> comment, at least as far as this header (and perhaps several others)
>> are
>> concerned.
>>
> I missed that comment on the patch about spinlock.
> 
> Well, then, I don't fully understand the criteria.
> 
> What about empty headers or temporary empty headers?
> 
> For example, asm/xenoprof.h is empty for all arches except x86, so it
> is a good candidate for asm-generic.

That's an example where I think it is wrong (or at least unnecessary) for
the xen/ header to include the asm/ one irrespective of the controlling
CONFIG_* setting. From what I can tell common code would build fine with
the #include moved; x86 code may require an adjustment or two. IOW this
is a case where I think preferably presence of an arch header was
required only when XENOPROF can actually be yet to y in Kconfig.

> But asm/grant_table.h is empty for PPC and RISC-V for now but won't be
> empty in the future. Does it make sense to put them to asm-generic? The
> only benefit I see is that in future architecture if they follow the
> same way of adding support for the arch to Xen, they will face the same
> issue: building full Xen requires this empty header.

Here I can see different ways of looking at it. Personally I'd prefer
stub headers to be used only if, for the foreseeable future, they are
intended to remain in use. grant_table.h pretty clearly doesn't fall in
this category. (You may want to peek at what's being done on the PPC
side. Nevertheless some of what's done there could likely benefit from
what you're doing here.)

> So, should I wait for some time on other patches of the patch series?

Well, afaic I'd prefer if I got a chance to look over at least some more
of the patches in this series. But you're of course free to submit a v2
at any time.

Jan



Re: [PATCH 4/9] x86/spec-ctrl: Improve all SPEC_CTRL_{ENTER,EXIT}_* comments

2023-09-18 Thread Andrew Cooper
On 18/09/2023 11:59 am, Jan Beulich wrote:
> On 15.09.2023 17:00, Andrew Cooper wrote:
>> ... to better explain how they're used.
>>
>> Doing so highlights that SPEC_CTRL_EXIT_TO_XEN is missing a VERW flush for 
>> the
>> corner case when e.g. an NMI hits late in an exit-to-guest path.
>>
>> Leave a TODO, which will be addressed in subsequent patches which arrange for
>> DO_COND_VERW to be safe within SPEC_CTRL_EXIT_TO_XEN.
>>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Jan Beulich 
>
> Two nits though:
>
>> @@ -233,7 +236,11 @@
>>  X86_FEATURE_SC_MSR_PV
>>  .endm
>>  
>> -/* Use in interrupt/exception context.  May interrupt Xen or PV context. */
>> +/*
>> + * Used after an exception or maskable interrupt, hitting Xen or PV context.
>> + * There will either be a guest speculation context, or (baring fatal
> Isn't this "barring"?
>
>> @@ -260,7 +270,13 @@
>>  .endm
>>  
>>  /*
>> - * Use in IST interrupt/exception context.  May interrupt Xen or PV context.
>> + * Used after an IST entry hitting Xen or PV context.  Special care is 
>> needed,
>> + * because when hitting Xen context, there may not a well-formed speculation
> Missing "be"?

Both fixed, thanks.

~Andrew



[PATCH] docs: Document a policy for when to deviate from specifications

2023-09-18 Thread George Dunlap
There is an ongoing disagreement among maintainers for how Xen should
handle deviations to specifications such as ACPI or EFI.

Write up an explicit policy, and include two worked-out examples from
recent discussions.

Signed-off-by: George Dunlap 
---
NB that the technical descriptions of the costs of the accommodations
or lack thereof I've just gathered from reading the discussions; I'm
not familiar enough with the details to assert things about them.  So
please correct any technical issues.
---
 docs/policy/FollowingSpecifications.md | 219 +
 1 file changed, 219 insertions(+)
 create mode 100644 docs/policy/FollowingSpecifications.md

diff --git a/docs/policy/FollowingSpecifications.md 
b/docs/policy/FollowingSpecifications.md
new file mode 100644
index 00..a197f01f65
--- /dev/null
+++ b/docs/policy/FollowingSpecifications.md
@@ -0,0 +1,219 @@
+# Guidelines for following specifications
+
+## In general, follow specifications
+
+In general, specifications such as ACPI and EFI should be followed.
+
+## Accommodate non-compliant systems if it doesn't affect compliant systems
+
+Sometimes, however, there occur situations where real systems "in the
+wild" violate these specifications, or at least our interpretation of
+them (henceforth called "non-compliant").  If we can accommodate
+non-compliant systems without affecting any compliant systems, then we
+should do so.
+
+## If accommodation would affect theoretical compliant systems that are
+   not known to exist, and Linux and/or Windows takes the
+   accommodation, take the accommodation unless there's a
+   reason not to.
+
+Sometimes, however, there occur situations where real, non-compliant
+systems "in the wild" cannot be accommodated without affecting
+theoretical compliant systems; but there are no known theoretical
+compliant systems which exist.  If Linux and/or Windows take the
+accommodation, then from a cost/benefits perspective it's probably best
+for us to take the accommodation as well.
+
+This is really a generalization of the next principle; the "reason not
+to" would be in the form of a cost-benefits analysis as described in
+the next section showing why the "special case" doesn't apply to the
+accommodation in question.
+
+## If things aren't clear, do a cost-benefits analysis
+
+Sometimes, however, things are more complicated or less clear.  In
+that case, we should do a cost-benefits analysis for a particular
+accommodation.  Things which should be factored into the analysis:
+
+N-1: The number of non-compliant systems that require the accommodation
+ N-1a: The number of known current systems
+ N-1b: The probable number of unknown current systems
+ N-1c: The probable number of unknown future systems
+
+N-2 The severity of the effect of non-accommodation on these systems
+
+C-1: The number of compliant systems that would be affected by the 
accommodation
+ C-1a: The number of known current systems
+ C-1b: The probable number of unknown current systems
+ C-1c: The probable number of unknown future systems
+
+C-2 The severity of the effect of accommodation on these systems
+
+Intuitively, N-1 * N-2 gives us N, the cost of not making the
+accommodation, and C-1 * C-2 gives us C, the cost of taking the
+accommodation.  If N > C, then we should take the accommodation; if C >
+N, then we shouldn't.
+
+The idea isn't to come up with actual numbers to plug in here
+(although that's certainly an option if someone wants to), but to
+explain the general idea we're trying to get at.
+
+A couple of other principles to factor in:
+
+Vendors tend to copy themselves and other vendors.  If one or two
+major vendors are known to create compliant or non-compliant systems
+in a particular way, then there are likely to be more unknown and
+future systems which will be affected by / need a similar accommodation
+respectively; that is, we should raise our estimates of N-1{b,c} and
+C-1{b,c}.
+
+Some downstreams already implement accommodations, and test on a
+variety of hardware.  If downstreams such as QubesOS or XenServer /
+XCP-ng implement the accommodations, then N-1 * N-2 is likely to be
+non-negligible, and C-1 * C-2 is likely to be negligible.
+
+Windows and Linux are widely tested.  If Windows and/or Linux make a
+particular accommodation, and that accommodation has remained stable
+without being reverted, then it's likely that the number of unknown
+current systems that are affected by the accommodation is negligible;
+that is, we should lower the C-1b estimate.
+
+Vendors tend to test server hardware on Windows and Linux.  If Windows
+and/or Linux make a particular accommodation, then it's unlikely that
+future systems will be affected by the accommodation; that is, we
+should lower the C-1c estimate.
+
+# Example applications
+
+Here are some examples of how these principles can be applied.
+
+## ACPI MADT tables containing ~0
+
+Xen disables certain kinds of features on CPU hotplug systems; for
+example, it will 

Re: [PATCH v2] x86/shutdown: change default reboot method preference

2023-09-18 Thread Jan Beulich
On 15.09.2023 09:43, Roger Pau Monne wrote:
> The current logic to chose the preferred reboot method is based on the mode 
> Xen
> has been booted into, so if the box is booted from UEFI, the preferred reboot
> method will be to use the ResetSystem() run time service call.
> 
> However, that method seems to be widely untested, and quite often leads to a
> result similar to:
> 
> Hardware Dom0 shutdown: rebooting machine
> [ Xen-4.18-unstable  x86_64  debug=y  Tainted:   C]
> CPU:0
> RIP:e008:[<0017>] 0017
> RFLAGS: 00010202   CONTEXT: hypervisor
> [...]
> Xen call trace:
>[<0017>] R 0017
>[] S 83207eff7b50
>[] F machine_restart+0x1da/0x261
>[] F apic_wait_icr_idle+0/0x37
>[] F smp_call_function_interrupt+0xc7/0xcb
>[] F call_function_interrupt+0x20/0x34
>[] F do_IRQ+0x150/0x6f3
>[] F common_interrupt+0x132/0x140
>[] F 
> arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
>[] F 
> arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
>[] F arch/x86/domain.c#idle_loop+0xec/0xee
> 
> 
> Panic on CPU 0:
> FATAL TRAP: vector = 6 (invalid opcode)
> 
> 
> Which in most cases does lead to a reboot, however that's unreliable.
> 
> Change the default reboot preference to prefer ACPI over UEFI if available and
> not in reduced hardware mode.
> 
> This is in line to what Linux does, so it's unlikely to cause issues on 
> current
> and future hardware, since there's a much higher chance of vendors testing
> hardware with Linux rather than Xen.

I certainly appreciate this as a goal. However, ...

> Add a special case for one Acer model that does require being rebooted using
> ResetSystem().  See Linux commit 0082517fa4bce for rationale.

... this is precisely what I'd like to avoid: Needing workarounds on spec-
conforming systems.

> I'm not aware of using ACPI reboot causing issues on boxes that do have
> properly implemented ResetSystem() methods.

I'm also puzzled by this statement: That Acer aspect is a clear indication
of there being an issue. Plus it's quite easy to see that hooks may be put
in place by various firmware components that would then be used to make
certain adjustments to the platform, ahead of an orderly reboot / shutdown.

> --- a/xen/arch/x86/shutdown.c
> +++ b/xen/arch/x86/shutdown.c
> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
>  
>  if ( xen_guest )
>  reboot_type = BOOT_XEN;
> +else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> +reboot_type = BOOT_ACPI;
>  else if ( efi_enabled(EFI_RS) )
>  reboot_type = BOOT_EFI;
> -else if ( acpi_disabled )
> -reboot_type = BOOT_KBD;
>  else
> -reboot_type = BOOT_ACPI;
> +reboot_type = BOOT_KBD;
>  }
>  
>  static int __init cf_check override_reboot(const struct dmi_system_id *d)
>  {
>  enum reboot_type type = (long)d->driver_data;
>  
> -if ( type == BOOT_ACPI && acpi_disabled )
> +if ( (type == BOOT_ACPI && acpi_disabled) ||
> + (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
>  type = BOOT_KBD;

I guess I don't follow this adjustment: Why would we fall back to KBD
first thing? Wouldn't it make sense to try ACPI first if EFI cannot
be used? And go further to KBD only if ACPI then also turns out
disabled (a mode that Xen quite likely won't correctly operate in
anymore anyway, due to bitrot)?

As an aside, KBD likely is unusable on hw-reduced systems, for there
simply not being a legacy keyboard controller. Instead we may need to
fall back to CF9 in such a case.

Jan



Re: [PATCH 2/7] x86/emul: Fix and extend #DB trap handling

2023-09-18 Thread Andrew Cooper
On 18/09/2023 12:30 pm, Jan Beulich wrote:
> On 18.09.2023 11:24, Andrew Cooper wrote:
>> On 15/09/2023 9:36 pm, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -8379,13 +8379,6 @@ x86_emulate(
>>>  if ( !mode_64bit() )
>>>  _regs.r(ip) = (uint32_t)_regs.r(ip);
>>>  
>>> -/* Should a singlestep #DB be raised? */
>>> -if ( rc == X86EMUL_OKAY && singlestep && !ctxt->retire.mov_ss )
>>> -{
>>> -ctxt->retire.singlestep = true;
>>> -ctxt->retire.sti = false;
>>> -}
>>> -
>>>  if ( rc != X86EMUL_DONE )
>>>  *ctxt->regs = _regs;
>>>  else
>>> @@ -8394,6 +8387,19 @@ x86_emulate(
>>>  rc = X86EMUL_OKAY;
>>>  }
>>>  
>>> +/* Should a singlestep #DB be raised? */
>>> +if ( rc == X86EMUL_OKAY && singlestep )
>>> +{
>>> +ctxt->retire.pending_dbg |= X86_DR6_BS;
>>> +
>>> +/* BROKEN - TODO, merge into pending_dbg. */
>>> +if ( !ctxt->retire.mov_ss )
>>> +{
>>> +ctxt->retire.singlestep = true;
>>> +ctxt->retire.sti = false;
>>> +}
>> I occurs to me that setting X86_DR6_BS outside of the !mov_ss case will
>> break HVM (when HVM swaps from singlestep to pending_dbg) until one of
>> the further TODOs is addressed.
>>
>> This will need bringing back within the conditional to avoid regressions
>> in the short term.
> I'm afraid I don't understand this: Isn't the purpose to latch state no
> matter whether it'll be consumed right away, or only on the next insn?

Yes, that is the intention in the longterm.

But in the short term, where I'm doing just enough to fix the %dr6 bits,
putting this unconditionally in PENDING_DBG will break the emulation of
mov-to-ss until the bigger todo of "wire INTERRUPTIBILITY/ACTIVITY state
into the emulation context" is complete.

The latter is definitely too big to fit into 4.18, and I can't
intentionally regress mov-to-ss in a series we intend to backport.

~Andrew



Re: [PATCH v1 00/29] Introduce stub headers necessary for full Xen build

2023-09-18 Thread Oleksii
On Mon, 2023-09-18 at 11:29 +0200, Jan Beulich wrote:
> On 18.09.2023 10:51, Oleksii wrote:
> > On Thu, 2023-09-14 at 17:08 +0200, Jan Beulich wrote:
> > > On 14.09.2023 16:56, Oleksii Kurochko wrote:
> > > > Based on two patch series [1] and [2], the idea of which is to
> > > > provide minimal
> > > > amount of things for a complete Xen build, a large amount of
> > > > headers are the same
> > > > or almost the same, so it makes sense to move them to asm-
> > > > generic.
> > > > 
> > > > Also, providing such stub headers should help future
> > > > architectures
> > > > to add
> > > > a full Xen build.
> > > > 
> > > > [1]
> > > > https://lore.kernel.org/xen-devel/cover.1694543103.git.sanasta...@raptorengineering.com/
> > > > [2]
> > > > https://lore.kernel.org/xen-devel/cover.1692181079.git.oleksii.kuroc...@gmail.com/
> > > > 
> > > > Oleksii Kurochko (29):
> > > >   xen/asm-generic: introduce stub header spinlock.h
> > > 
> > > At the example of this, personally I think this goes too far.
> > > Headers
> > > in
> > > asm-generic should be for the case where an arch elects to not
> > > implement
> > > certain functionality. Clearly spinlocks are required uniformly.
> > It makes sense. Then I will back to the option [2] where I
> > introduced
> > all this headers as part of RISC-V architecture. 
> 
> You did see though that in a reply to my own mail I said I take back
> the
> comment, at least as far as this header (and perhaps several others)
> are
> concerned.
> 
I missed that comment on the patch about spinlock.

Well, then, I don't fully understand the criteria.

What about empty headers or temporary empty headers?

For example, asm/xenoprof.h is empty for all arches except x86, so it
is a good candidate for asm-generic.
But asm/grant_table.h is empty for PPC and RISC-V for now but won't be
empty in the future. Does it make sense to put them to asm-generic? The
only benefit I see is that in future architecture if they follow the
same way of adding support for the arch to Xen, they will face the same
issue: building full Xen requires this empty header.

So, should I wait for some time on other patches of the patch series?

~ Oleksii




Re: [PATCH] x86/shutdown: change default reboot method preference

2023-09-18 Thread Jan Beulich
On 14.09.2023 19:42, Andrew Cooper wrote:
> On 14/09/2023 4:21 pm, Roger Pau Monne wrote:
>> I've found a lot of boxes that don't reboot properly using ResetSystem(), 
>> and I
>> think our default should attempt to make sure Xen reliably reboots on as much
>> hardware as possible.
> 
> You're supposed to use ResetSystem() so all the value-add can be added
> behind your back, but it's a chocolate teapot when it's so broken that
> none of the OSes use it...

That's only one aspect. Recall that EFI originated from ia64 bringup, where
it wasn't even specified how to reboot a system without using the runtime
services function. Fundamentally under EFI shutdown/reboot shouldn't be an
arch-specific operation in the first place.

Jan



Re: [PATCH 7/7] x86/pv: Rewrite %dr6 handling

2023-09-18 Thread Jan Beulich
On 15.09.2023 22:36, Andrew Cooper wrote:
> All #DB exceptions result in an update of %dr6, but this isn't handled
> properly by Xen for any guest type.
> 
> Remove all ad-hoc dr6 handling, leaving it to pv_inject_event() in most cases
> and using the new x86_merge_dr6() helper.
> 
> In do_debug(), swap the dr6 to pending_dbg in order to operate entirely with
> positive polarity.  Among other things, this helps spot RTM/BLD in the
> diagnostic message.
> 
> Drop the unconditional v->arch.dr6 adjustment.  pv_inject_event() performs the
> adjustment in the common case, but retain the prior behaviour if a debugger is
> attached.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

One minor aspect to consider:

> @@ -1990,24 +1990,23 @@ void do_debug(struct cpu_user_regs *regs)
>   * so ensure the message is ratelimited.
>   */
>  gprintk(XENLOG_WARNING,
> -"Hit #DB in Xen context: %04x:%p [%ps], stk %04x:%p, dr6 
> %lx\n",
> +"Hit #DB in Xen context: %04x:%p [%ps], stk %04x:%p, 
> pending_dbg %lx\n",

Would you mind shorting to just "pending"?

Jan



Re: [PATCH 6/7] x86: Extend x86_event with a pending_dbg field

2023-09-18 Thread Jan Beulich
On 15.09.2023 22:36, Andrew Cooper wrote:
> ... using the Intel VMCS PENDING_DBG semantics, and sharing storage with cr2.
> 
> This requires working around anonymous union bugs in obsolete versions of GCC,
> which in turn needs to drop unnecessary const qualifiers.
> 
> Also introduce a pv_inject_DB() wrapper use this field nicely.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 





Re: [PATCH 5/7] x86: Introduce x86_merge_dr6()

2023-09-18 Thread Jan Beulich
On 15.09.2023 22:36, Andrew Cooper wrote:
> The current logic used to update %dr6 when injecting #DB is buggy.
> 
> The SDM/APM documention on %dr6 updates is far from ideal, but does at least
> make clear that it's non-trivial.  The actual behaviour is to overwrite
> B{0..3} and accumulate all other bits.

As mentioned before, I'm okay to ack this patch provided it is explicitly said
where the information is coming from. Just saying that SDM/PM are incomplete
isn't enough, sorry. With that added (can't really be R-b, I'm afraid):
Acked-by: Jan Beulich 

Jan



Re: [PATCH 2/7] x86/emul: Fix and extend #DB trap handling

2023-09-18 Thread Jan Beulich
On 18.09.2023 11:24, Andrew Cooper wrote:
> On 15/09/2023 9:36 pm, Andrew Cooper wrote:
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -8379,13 +8379,6 @@ x86_emulate(
>>  if ( !mode_64bit() )
>>  _regs.r(ip) = (uint32_t)_regs.r(ip);
>>  
>> -/* Should a singlestep #DB be raised? */
>> -if ( rc == X86EMUL_OKAY && singlestep && !ctxt->retire.mov_ss )
>> -{
>> -ctxt->retire.singlestep = true;
>> -ctxt->retire.sti = false;
>> -}
>> -
>>  if ( rc != X86EMUL_DONE )
>>  *ctxt->regs = _regs;
>>  else
>> @@ -8394,6 +8387,19 @@ x86_emulate(
>>  rc = X86EMUL_OKAY;
>>  }
>>  
>> +/* Should a singlestep #DB be raised? */
>> +if ( rc == X86EMUL_OKAY && singlestep )
>> +{
>> +ctxt->retire.pending_dbg |= X86_DR6_BS;
>> +
>> +/* BROKEN - TODO, merge into pending_dbg. */
>> +if ( !ctxt->retire.mov_ss )
>> +{
>> +ctxt->retire.singlestep = true;
>> +ctxt->retire.sti = false;
>> +}
> 
> I occurs to me that setting X86_DR6_BS outside of the !mov_ss case will
> break HVM (when HVM swaps from singlestep to pending_dbg) until one of
> the further TODOs is addressed.
> 
> This will need bringing back within the conditional to avoid regressions
> in the short term.

I'm afraid I don't understand this: Isn't the purpose to latch state no
matter whether it'll be consumed right away, or only on the next insn?

Jan



Re: [PATCH 2/7] x86/emul: Fix and extend #DB trap handling

2023-09-18 Thread Jan Beulich
On 15.09.2023 22:36, Andrew Cooper wrote:
> Lots of this is very very broken, but we need to start somewhere.
> 
> First, the bugfix.  Hooks which use X86EMUL_DONE to skip the general emulation
> still need to evaluate singlestep as part of completing the instruction.
> Defer the logic until X86EMUL_DONE has been converted to X86EMUL_OKAY.

Doesn't this warrant a Fixes: tag against 4999bf3e8b4c?

Jan



Re: [PATCH 1/7] x86/emul: ASSERT that X86EMUL_DONE doesn't escape to callers

2023-09-18 Thread Jan Beulich
On 15.09.2023 22:36, Andrew Cooper wrote:
> This property is far from clear.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 





Re: [PATCH 9/9] x86/spec-ctrl: Mitigate the Zen1 DIV leakge

2023-09-18 Thread Jan Beulich
On 15.09.2023 17:00, Andrew Cooper wrote:
> In the Zen1 microarchitecure, there is one divider in the pipeline which
> services uops from both threads.  In the case of #DE, the latched result from
> the previous DIV to execute will be forwarded speculatively.
> 
> This is an interesting covert channel that allows two threads to communicate
> without any system calls.  In also allows userspace to obtain the result of
> the most recent DIV instruction executed (even speculatively) in the core,
> which can be from a higher privilege context.
> 
> Scrub the result from the divider by executing a non-faulting divide.  This
> needs performing on the exit-to-guest paths, and ist_exit-to-Xen.
> 
> This is XSA-439 / CVE-2023-20588.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

Nevertheless I would have hoped you add at least a sentence on the alternatives
patching of the IST path. Hitting #MC while patching is possible, after all
(yes, you will tell me that #MC is almost certainly fatal to the system anyway,
but still).

> @@ -955,6 +960,46 @@ static void __init srso_calculations(bool hw_smt_enabled)
>  setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
>  }
>  
> +/*
> + * The Div leakage issue is specific to the AMD Zen1 microarchitecure.
> + *
> + * However, there's no $FOO_NO bit defined, so if we're virtualised we have 
> no
> + * hope of spotting the case where we might move to vulnerable hardware.  We
> + * also can't make any useful conclusion about SMT-ness.
> + *
> + * Don't check the hypervisor bit, so at least we do the safe thing when
> + * booting on something that looks like a Zen1 CPU.
> + */
> +static bool __init has_div_vuln(void)
> +{
> +if ( !(boot_cpu_data.x86_vendor &
> +   (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> +return false;
> +
> +if ( (boot_cpu_data.x86 != 0x17 && boot_cpu_data.x86 != 0x18) ||
> + !is_zen1_uarch() )
> +return false;
> +
> +return true;
> +}

Just to mention it - personally I consider

...
if ( ... )
return true;

return false;
}

a minor anti-pattern, as a sole return imo makes more clear what's going on.
In a case like this, where you intentionally split return paths anyway, I'd
then go with

static bool __init has_div_vuln(void)
{
if ( !(boot_cpu_data.x86_vendor &
   (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
return false;

if ( boot_cpu_data.x86 != 0x17 && boot_cpu_data.x86 != 0x18 )
return false;

return is_zen1_uarch();
}

Jan



Re: [PATCH 8/9] x86/amd: Introduce is_zen{1,2}_uarch() predicates

2023-09-18 Thread Jan Beulich
On 15.09.2023 17:00, Andrew Cooper wrote:
> --- a/xen/arch/x86/include/asm/amd.h
> +++ b/xen/arch/x86/include/asm/amd.h
> @@ -140,6 +140,17 @@
> AMD_MODEL_RANGE(0x11, 0x0, 0x0, 0xff, 0xf),   \
> AMD_MODEL_RANGE(0x12, 0x0, 0x0, 0xff, 0xf))
>  
> +/*
> + * The Zen1 and Zen2 microarchitectures are implemented by AMD (Fam17h) and
> + * Hygon (Fam18h) but without simple model number rules.  Instead, use STIBP
> + * as a heuristic that distinguishes the two.
> + *
> + * The caller is required to perform the appropriate vendor/family checks
> + * first.
> + */
> +#define is_zen1_uarch() (!boot_cpu_has(X86_FEATURE_AMD_STIBP))
> +#define is_zen2_uarch()   boot_cpu_has(X86_FEATURE_AMD_STIBP)
> +
>  struct cpuinfo_x86;
>  int cpu_has_amd_erratum(const struct cpuinfo_x86 *, int, ...);

With one simply the opposite of the other, and with the requirement of a
family check up front, do we really need both? Personally I'd prefer if
we had just the latter. Yet in any event
Reviewed-by: Jan Beulich 

Jan



Re: [PATCH 6/9] x86/entry: Track the IST-ness of an entry for the exit paths

2023-09-18 Thread Jan Beulich
On 15.09.2023 17:00, Andrew Cooper wrote:
> Use %r12 to hold an ist_exit boolean.  This register is zero elsewhere in the
> entry/exit asm, so it only needs setting in the IST path.
> 
> As this is subtle and fragile, add check_ist_exit() to be used in debugging
> builds to cross-check that the ist_exit boolean matches the entry vector.
> 
> Write check_ist_exit() it in C, because it's debug only and the logic more
> complicated than I care to maintain in asm.
> 
> For now, we only need to use this signal in the exit-to-Xen path, but some
> exit-to-guest paths happen in IST context too.  Check the correctness in all
> exit paths to avoid the logic bitrotting.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

I understand you then didn't like the idea of macro-izing ...

> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -117,8 +117,15 @@ compat_process_trap:
>  call  compat_create_bounce_frame
>  jmp   compat_test_all_events
>  
> -/* %rbx: struct vcpu, interrupts disabled */
> +/* %rbx: struct vcpu, %r12: ist_exit, interrupts disabled */
>  ENTRY(compat_restore_all_guest)
> +
> +#ifdef CONFIG_DEBUG
> +mov   %rsp, %rdi
> +mov   %r12, %rsi
> +call  check_ist_exit
> +#endif
> +
>  ASSERT_INTERRUPTS_DISABLED
>  mov   $~(X86_EFLAGS_IOPL | X86_EFLAGS_VM), %r11d
>  and   UREGS_eflags(%rsp),%r11d
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -142,8 +142,15 @@ process_trap:
>  
>  .section .text.entry, "ax", @progbits
>  
> -/* %rbx: struct vcpu, interrupts disabled */
> +/* %rbx: struct vcpu, %r12: ist_exit, interrupts disabled */
>  restore_all_guest:
> +
> +#ifdef CONFIG_DEBUG
> +mov   %rsp, %rdi
> +mov   %r12, %rsi
> +call  check_ist_exit
> +#endif
> +
>  ASSERT_INTERRUPTS_DISABLED
>  
>  /* Stash guest SPEC_CTRL value while we can read struct vcpu. */
> @@ -659,8 +666,15 @@ ENTRY(early_page_fault)
>  .section .text.entry, "ax", @progbits
>  
>  ALIGN
> -/* No special register assumptions. */
> +/* %r12=ist_exit */
>  restore_all_xen:
> +
> +#ifdef CONFIG_DEBUG
> +mov   %rsp, %rdi
> +mov   %r12, %rsi
> +call  check_ist_exit
> +#endif

... these three instances of identical code you add, along the lines of
ASSERT_INTERRUPTS_DISABLED?

Jan



[xen-unstable test] 183026: tolerable FAIL

2023-09-18 Thread osstest service owner
flight 183026 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183026/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 183023
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 183023
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 183023
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 183023
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 183023
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 183023
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 183023
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 183023
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 183023
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 183023
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 183023
 test-amd64-amd64-xl-qemuu-win7-amd64 12 windows-install   fail like 183023
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass

version targeted for testing:
 xen  290f82375d828ef93f831a5ef028f1283aa1ea47
baseline version:
 xen  290f82375d828ef93f831a5ef028f1283aa1ea47

Last test of basis   183026  2023-09-18 01:52:02 Z0 days
Testing same since  (not found) 0 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 

Re: [PATCH 4/9] x86/spec-ctrl: Improve all SPEC_CTRL_{ENTER,EXIT}_* comments

2023-09-18 Thread Jan Beulich
On 15.09.2023 17:00, Andrew Cooper wrote:
> ... to better explain how they're used.
> 
> Doing so highlights that SPEC_CTRL_EXIT_TO_XEN is missing a VERW flush for the
> corner case when e.g. an NMI hits late in an exit-to-guest path.
> 
> Leave a TODO, which will be addressed in subsequent patches which arrange for
> DO_COND_VERW to be safe within SPEC_CTRL_EXIT_TO_XEN.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

Two nits though:

> @@ -233,7 +236,11 @@
>  X86_FEATURE_SC_MSR_PV
>  .endm
>  
> -/* Use in interrupt/exception context.  May interrupt Xen or PV context. */
> +/*
> + * Used after an exception or maskable interrupt, hitting Xen or PV context.
> + * There will either be a guest speculation context, or (baring fatal

Isn't this "barring"?

> @@ -260,7 +270,13 @@
>  .endm
>  
>  /*
> - * Use in IST interrupt/exception context.  May interrupt Xen or PV context.
> + * Used after an IST entry hitting Xen or PV context.  Special care is 
> needed,
> + * because when hitting Xen context, there may not a well-formed speculation

Missing "be"?

Jan



Re: [PATCH v4 1/8] common: assembly entry point type/size annotations

2023-09-18 Thread Jan Beulich
On 18.09.2023 12:34, Julien Grall wrote:
> Hi,
> 
> On 18/09/2023 11:24, Jan Beulich wrote:
>> On 14.09.2023 23:06, Julien Grall wrote:
>>> On 04/08/2023 07:26, Jan Beulich wrote:
 TBD: What to set CODE_ALIGN to by default? Or should we requires arch-es
to define that in all cases?
>>>
>>> The code alignment is very specific to an architecture. So I think it
>>> would be better if there are no default.
>>>
>>> Otherwise, it will be more difficult for a developper to figure out that
>>> CODE_ALIGN may need an update.
>>
>> Okay, I've dropped the fallback then.
>>
 --- /dev/null
 +++ b/xen/include/xen/linkage.h
 @@ -0,0 +1,56 @@
 +#ifndef __LINKAGE_H__
 +#define __LINKAGE_H__
 +
 +#ifdef __ASSEMBLY__
 +
 +#include 
 +
 +#ifndef CODE_ALIGN
 +# define CODE_ALIGN ??
 +#endif
 +#ifndef CODE_FILL
 +# define CODE_FILL ~0
 +#endif
>>>
>>> What's the value to allow the architecture to override CODE_FILL and ...
>>
>> What is put between functions may be relevant to control. Without fall-
>> through to a subsequent label, I think the intention is to use "int3" (0xcc)
>> filler bytes, for example. (With fall-through to the subsequent label, NOPs
>> would need using in any event.)
> 
> I guess for x86 it makes sense. For Arm, the filler is unlikely to be 
> used as the instruction size is always fixed.
> 
>>
 +
 +#ifndef DATA_ALIGN
 +# define DATA_ALIGN 0
 +#endif
 +#ifndef DATA_FILL
 +# define DATA_FILL ~0
 +#endif
>>>
>>> ... DATA_FILL?
>>
>> For data the need is probably less strict; still I could see one arch to
>> prefer zero filling while another might better like all-ones-filling.
> 
> It is unclear to me why an architecture would prefer one over the other. 
> Can you provide a bit more details?
> 
>>
 +
 +#define SYM_ALIGN(algn...) .balign algn
>>>
>>> I find the name 'algn' confusing (not even referring to the missing
>>> 'i'). Why not naming it 'args'?
>>
>> I can name it "args", sure. It's just that "algn" is in line with the
>> naming further down (where "args" isn't reasonable to use as substitution).
> 
> If you want to be consistent then, I think it would be best to use 
> 'align'. I think it should be fine as we don't seem to use '.align'.

I think I had a conflict from this somewhere, but that may have been very
early when I hadn't switched to .balign yet. I'll see if renaming works
out.

 +#define SYM(name, typ, linkage, algn...)  \
 +.type name, SYM_T_ ## typ;\
 +SYM_L_ ## linkage(name);  \
 +SYM_ALIGN(algn);  \
 +name:
 +
 +#define END(name) .size name, . - name
 +
 +#define FUNC(name, algn...) \
 +SYM(name, FUNC, GLOBAL, LASTARG(CODE_ALIGN, ## algn), CODE_FILL)
 +#define LABEL(name, algn...) \
 +SYM(name, NONE, GLOBAL, LASTARG(CODE_ALIGN, ## algn), CODE_FILL)
 +#define DATA(name, algn...) \
 +SYM(name, DATA, GLOBAL, LASTARG(DATA_ALIGN, ## algn), DATA_FILL)
>>>
>>> I think the alignment should be explicit for DATA. Otherwise, at least
>>> on Arm, we would default to 0 which could lead to unaligned access if
>>> not careful.
>>
>> I disagree. Even for byte-granular data (like strings) it may be desirable
>> to have some default alignment, without every use site needing to repeat
>> that specific value. 
> 
> I understand that some cases may want to use a default alignment. But my 
> concern is the developer may not realize that alignment is necessary. So 
> by making it mandatory, it would at least prompt the developper to think 
> whether this is needed.

Forcing people to use a specific value every time, even when none would
be needed. Anyway, if others think your way, then I can certainly change.
But then I need to know whether others perhaps think alignment on functions
(and maybe even labels) should also be explicit in all cases.

> For the string case, we could introduce a different macro.

Hmm, yet one more special thing then (for people to remember to use under
certain circumstances).

Jan



Re: [PATCH 2/2] xen/x86: ioapic: Bail out from timer_irq_works() as soon as possible

2023-09-18 Thread Jan Beulich
On 15.09.2023 16:00, Julien Grall wrote:
> Hi Jan,
> 
> On 07/09/2023 15:28, Jan Beulich wrote:
>> On 18.08.2023 15:44, Julien Grall wrote:
>>> From: Julien Grall 
>>>
>>> Currently timer_irq_works() will wait the full 100ms before checking
>>> that pit0_ticks has been incremented at least 4 times.
>>>
>>> However, the bulk of the BIOS/platform should not have a buggy timer.
>>> So waiting for the full 100ms is a bit harsh.
>>>
>>> Rework the logic to only wait until 100ms passed or we saw more than
>>> 4 ticks. So now, in the good case, this will reduce the wait time
>>> to ~50ms.
>>>
>>> Signed-off-by: Julien Grall 
>>
>> In principle this is all fine. There's a secondary aspect though which
>> may call for a slight rework of the patch.
>>
>>> --- a/xen/arch/x86/io_apic.c
>>> +++ b/xen/arch/x86/io_apic.c
>>> @@ -1509,6 +1509,8 @@ static void __init setup_ioapic_ids_from_mpc(void)
>>>   static int __init timer_irq_works(void)
>>>   {
>>>   unsigned long t1, flags;
>>> +/* Wait for maximum 10 ticks */
>>> +unsigned long msec = (10 * 1000) / HZ;
>>
>> (Minor remark: I don't think this needs to be unsigned long; unsigned
>> in will suffice.)
> 
> You are right. I can switch to unsigned int.
> 
>>
>>> @@ -1517,19 +1519,25 @@ static int __init timer_irq_works(void)
>>>   
>>>   local_save_flags(flags);
>>>   local_irq_enable();
>>> -/* Let ten ticks pass... */
>>> -mdelay((10 * 1000) / HZ);
>>> -local_irq_restore(flags);
>>>   
>>> -/*
>>> - * Expect a few ticks at least, to be sure some possible
>>> - * glue logic does not lock up after one or two first
>>> - * ticks in a non-ExtINT mode.  Also the local APIC
>>> - * might have cached one ExtINT interrupt.  Finally, at
>>> - * least one tick may be lost due to delays.
>>> - */
>>> -if ( (ACCESS_ONCE(pit0_ticks) - t1) > 4 )
>>> +while ( msec-- )
>>> +{
>>> +mdelay(1);
>>> +/*
>>> + * Expect a few ticks at least, to be sure some possible
>>> + * glue logic does not lock up after one or two first
>>> + * ticks in a non-ExtINT mode.  Also the local APIC
>>> + * might have cached one ExtINT interrupt.  Finally, at
>>> + * least one tick may be lost due to delays.
>>> + */
>>> +if ( (ACCESS_ONCE(pit0_ticks) - t1) <= 4 )
>>> +continue;
>>> +
>>> +local_irq_restore(flags);
>>>   return 1;
>>> +}
>>> +
>>> +local_irq_restore(flags);
>>>   
>>>   return 0;
>>>   }
>>
>> While Andrew has a patch pending (not sure why it didn't go in yet)
>> to simplify local_irq_restore(), and while further it shouldn't really
>> need using here (local_irq_disable() ought to be fine)
> 
> Skimming through the code, the last call of timer_irq_works() in 
> check_timer() happens after the interrupts masking state have been restored:
> 
> local_irq_restore(flags);
> 
> if ( timer_irq_works() )
>...
> 
> 
> So I think timer_irq_works() can be called with interrupts enabled and 
> therefore we can't use local_irq_disable().

Hmm, yes, you're right. That's inconsistent, but dealing with that is a
separate task.

Jan



Re: [PATCH v4 1/8] common: assembly entry point type/size annotations

2023-09-18 Thread Julien Grall

Hi,

On 18/09/2023 11:24, Jan Beulich wrote:

On 14.09.2023 23:06, Julien Grall wrote:

On 04/08/2023 07:26, Jan Beulich wrote:

TBD: What to set CODE_ALIGN to by default? Or should we requires arch-es
   to define that in all cases?


The code alignment is very specific to an architecture. So I think it
would be better if there are no default.

Otherwise, it will be more difficult for a developper to figure out that
CODE_ALIGN may need an update.


Okay, I've dropped the fallback then.


--- /dev/null
+++ b/xen/include/xen/linkage.h
@@ -0,0 +1,56 @@
+#ifndef __LINKAGE_H__
+#define __LINKAGE_H__
+
+#ifdef __ASSEMBLY__
+
+#include 
+
+#ifndef CODE_ALIGN
+# define CODE_ALIGN ??
+#endif
+#ifndef CODE_FILL
+# define CODE_FILL ~0
+#endif


What's the value to allow the architecture to override CODE_FILL and ...


What is put between functions may be relevant to control. Without fall-
through to a subsequent label, I think the intention is to use "int3" (0xcc)
filler bytes, for example. (With fall-through to the subsequent label, NOPs
would need using in any event.)


I guess for x86 it makes sense. For Arm, the filler is unlikely to be 
used as the instruction size is always fixed.





+
+#ifndef DATA_ALIGN
+# define DATA_ALIGN 0
+#endif
+#ifndef DATA_FILL
+# define DATA_FILL ~0
+#endif


... DATA_FILL?


For data the need is probably less strict; still I could see one arch to
prefer zero filling while another might better like all-ones-filling.


It is unclear to me why an architecture would prefer one over the other. 
Can you provide a bit more details?





+
+#define SYM_ALIGN(algn...) .balign algn


I find the name 'algn' confusing (not even referring to the missing
'i'). Why not naming it 'args'?


I can name it "args", sure. It's just that "algn" is in line with the
naming further down (where "args" isn't reasonable to use as substitution).


If you want to be consistent then, I think it would be best to use 
'align'. I think it should be fine as we don't seem to use '.align'.





+#define SYM_L_GLOBAL(name) .globl name
+#define SYM_L_WEAK(name)   .weak name
+#define SYM_L_LOCAL(name)  /* nothing */
+
+#define SYM_T_FUNC STT_FUNC
+#define SYM_T_DATA STT_OBJECT
+#define SYM_T_NONE STT_NOTYPE


SYM_* will be used only in SYM() below. So why not using STT_* directly?


For one this is how the Linux original has it. And then to me DATA and
NONE are neater to have at the use sites than the ELF-specific terms
OBJECT and NOTYPE. But I'm willing to reconsider provided arguments
towards the two given reasons not being overly relevant for us.


+
+#define SYM(name, typ, linkage, algn...)  \
+.type name, SYM_T_ ## typ;\
+SYM_L_ ## linkage(name);  \
+SYM_ALIGN(algn);  \
+name:
+
+#define END(name) .size name, . - name
+
+#define FUNC(name, algn...) \
+SYM(name, FUNC, GLOBAL, LASTARG(CODE_ALIGN, ## algn), CODE_FILL)
+#define LABEL(name, algn...) \
+SYM(name, NONE, GLOBAL, LASTARG(CODE_ALIGN, ## algn), CODE_FILL)
+#define DATA(name, algn...) \
+SYM(name, DATA, GLOBAL, LASTARG(DATA_ALIGN, ## algn), DATA_FILL)


I think the alignment should be explicit for DATA. Otherwise, at least
on Arm, we would default to 0 which could lead to unaligned access if
not careful.


I disagree. Even for byte-granular data (like strings) it may be desirable
to have some default alignment, without every use site needing to repeat
that specific value. 


I understand that some cases may want to use a default alignment. But my 
concern is the developer may not realize that alignment is necessary. So 
by making it mandatory, it would at least prompt the developper to think 
whether this is needed.


For the string case, we could introduce a different macro.

Cheers,

--
Julien Grall



Re: [PATCH 1/2] xen/x86: io_apic: Introduce a command line option to skip timer check

2023-09-18 Thread Jan Beulich
On 15.09.2023 15:18, Julien Grall wrote:
> On 07/09/2023 15:09, Jan Beulich wrote:
>> On 18.08.2023 15:44, Julien Grall wrote:
>>> --- a/docs/misc/xen-command-line.pandoc
>>> +++ b/docs/misc/xen-command-line.pandoc
>>> @@ -1896,6 +1896,13 @@ This option is ignored in **pv-shim** mode.
>>>   ### nr_irqs (x86)
>>>   > `= `
>>>   
>>> +### no_timer_works (x86)
>>> +> `=`
>>> +
>>> +> Default: `true`
>>> +
>>> +Disables the code which tests for broken timer IRQ sources.
>>
>> In description and code it's "check", but here it's "works". Likely
>> just a typo. But I'd prefer if we didn't introduce any new "no*"
>> options which then can be negated to "no-no*". Make it "timer-check"
>> (also avoiding the underscore, no matter that Linux uses it), or
>> alternatively make it a truly positive option, e.g. "timer-irq-works".
> 
> I don't mind too much about using - over _ but it is never clear why you 
> strongly push for it (and whether the others agrees).

Informal feedback suggested that various people agree and no-one strongly
disagrees to the argument of underscore really only being an auxiliary
separator character, when no better one can be used, and it also being
two keypresses to type on most keyboards, when dash is just one.

> Is this documented 
> somewhere? If not, can you do it so everyone can apply it consistently? 
> (At least I would not remember to ask for it because I am happy with the _).

As to documenting - it's not clear to me where such documentation ought
to go. In a way this is coding style, so it could be ./CODING_STYLE, but
then my experience with proposing changes there has been at best mixed.

Jan



Re: [PATCH for-4.18 v2] tools/light: Revoke permissions when a PCI detach for HVM domain

2023-09-18 Thread Anthony PERARD
On Fri, Sep 15, 2023 at 01:52:04PM +0100, Julien Grall wrote:
> From: Julien Grall 
> 
> Currently, libxl will grant IOMEM, I/O port and IRQ permissions when
> a PCI is attached (see pci_add_dm_done()) for all domain types. However,
> the permissions are only revoked for non-HVM domain (see do_pci_remove()).
> 
> This means that HVM domains will be left with extra permissions. While
> this look bad on the paper, the IRQ permissions should be revoked
> when the Device Model call xc_physdev_unmap_pirq() and such domain
> cannot directly mapped I/O port and IOMEM regions. Instead, this has to
> be done by a Device Model.
> 
> The Device Model can only run in dom0 or PV stubdomain (upstream libxl
> doesn't have support for HVM/PVH stubdomain).
> 
> For PV/PVH stubdomain, the permission are properly revoked, so there is
> no security concern.
> 
> This leaves dom0. There are two cases:
>   1) Privileged: Anyone gaining access to the Device Model would already
>  have large control on the host.
>   2) Deprivileged: PCI passthrough require PHYSDEV operations which
>  are not accessible when the Device Model is restricted.
> 
> So overall, it is believed that the extra permissions cannot be exploited.
> 
> Rework the code so the permissions are all removed for HVM domains.
> This needs to happen after the QEMU has detached the device. So
> the revocation is now moved to pci_remove_detached().
> 
> Also add a comment on top of the error message when the PIRQ cannot
> be unbind to explain this could be a spurious error as QEMU may have
> already done it.
> 
> Signed-off-by: Julien Grall 
> 
> ---
> 
> Changes since v1:
> * Move the code to revoke in pci_remove_detached()
> * Add a comment on top of the PIRQ unbind error path
> * Use goto to deal with errors.

Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [PATCH v4 1/8] common: assembly entry point type/size annotations

2023-09-18 Thread Jan Beulich
On 14.09.2023 23:06, Julien Grall wrote:
> On 04/08/2023 07:26, Jan Beulich wrote:
>> TBD: What to set CODE_ALIGN to by default? Or should we requires arch-es
>>   to define that in all cases?
> 
> The code alignment is very specific to an architecture. So I think it 
> would be better if there are no default.
> 
> Otherwise, it will be more difficult for a developper to figure out that 
> CODE_ALIGN may need an update.

Okay, I've dropped the fallback then.

>> --- /dev/null
>> +++ b/xen/include/xen/linkage.h
>> @@ -0,0 +1,56 @@
>> +#ifndef __LINKAGE_H__
>> +#define __LINKAGE_H__
>> +
>> +#ifdef __ASSEMBLY__
>> +
>> +#include 
>> +
>> +#ifndef CODE_ALIGN
>> +# define CODE_ALIGN ??
>> +#endif
>> +#ifndef CODE_FILL
>> +# define CODE_FILL ~0
>> +#endif
> 
> What's the value to allow the architecture to override CODE_FILL and ...

What is put between functions may be relevant to control. Without fall-
through to a subsequent label, I think the intention is to use "int3" (0xcc)
filler bytes, for example. (With fall-through to the subsequent label, NOPs
would need using in any event.)

>> +
>> +#ifndef DATA_ALIGN
>> +# define DATA_ALIGN 0
>> +#endif
>> +#ifndef DATA_FILL
>> +# define DATA_FILL ~0
>> +#endif
> 
> ... DATA_FILL?

For data the need is probably less strict; still I could see one arch to
prefer zero filling while another might better like all-ones-filling.

>> +
>> +#define SYM_ALIGN(algn...) .balign algn
> 
> I find the name 'algn' confusing (not even referring to the missing 
> 'i'). Why not naming it 'args'?

I can name it "args", sure. It's just that "algn" is in line with the
naming further down (where "args" isn't reasonable to use as substitution).

>> +#define SYM_L_GLOBAL(name) .globl name
>> +#define SYM_L_WEAK(name)   .weak name
>> +#define SYM_L_LOCAL(name)  /* nothing */
>> +
>> +#define SYM_T_FUNC STT_FUNC
>> +#define SYM_T_DATA STT_OBJECT
>> +#define SYM_T_NONE STT_NOTYPE
> 
> SYM_* will be used only in SYM() below. So why not using STT_* directly?

For one this is how the Linux original has it. And then to me DATA and
NONE are neater to have at the use sites than the ELF-specific terms
OBJECT and NOTYPE. But I'm willing to reconsider provided arguments
towards the two given reasons not being overly relevant for us.

>> +
>> +#define SYM(name, typ, linkage, algn...)  \
>> +.type name, SYM_T_ ## typ;\
>> +SYM_L_ ## linkage(name);  \
>> +SYM_ALIGN(algn);  \
>> +name:
>> +
>> +#define END(name) .size name, . - name
>> +
>> +#define FUNC(name, algn...) \
>> +SYM(name, FUNC, GLOBAL, LASTARG(CODE_ALIGN, ## algn), CODE_FILL)
>> +#define LABEL(name, algn...) \
>> +SYM(name, NONE, GLOBAL, LASTARG(CODE_ALIGN, ## algn), CODE_FILL)
>> +#define DATA(name, algn...) \
>> +SYM(name, DATA, GLOBAL, LASTARG(DATA_ALIGN, ## algn), DATA_FILL)
> 
> I think the alignment should be explicit for DATA. Otherwise, at least 
> on Arm, we would default to 0 which could lead to unaligned access if 
> not careful.

I disagree. Even for byte-granular data (like strings) it may be desirable
to have some default alignment, without every use site needing to repeat
that specific value.

Jan



Re: [PATCH 0/5] x86/pv: #DB vs %dr6 fixes, part 2

2023-09-18 Thread Jan Beulich
On 15.09.2023 21:56, Andrew Cooper wrote:
> A third which I'm on the fence about is about PV guests and General
> Detect.  We firmly prohibit PV guests from setting DR7.GD, but we them
> play with the DR6.GD bit as if it had been asserted.
> 
> It would be easy to put GD back into the set of reserved bits for DR6,
> but it also wouldn't be hard to handle GD via dr7_emul and let the PV
> guest have a more-normal looking set of debug functionality.

Anything "more-normal looking" is to be preferred, I would say. As long
as, like you say here, it isn't overly hard.

Jan




Re: [PATCH v1 00/29] Introduce stub headers necessary for full Xen build

2023-09-18 Thread Jan Beulich
On 18.09.2023 11:32, Julien Grall wrote:
> Hi Jan,
> 
> On 18/09/2023 10:29, Jan Beulich wrote:
>> On 18.09.2023 10:51, Oleksii wrote:
>>> On Thu, 2023-09-14 at 17:08 +0200, Jan Beulich wrote:
 On 14.09.2023 16:56, Oleksii Kurochko wrote:
> Based on two patch series [1] and [2], the idea of which is to
> provide minimal
> amount of things for a complete Xen build, a large amount of
> headers are the same
> or almost the same, so it makes sense to move them to asm-generic.
>
> Also, providing such stub headers should help future architectures
> to add
> a full Xen build.
>
> [1]
> https://lore.kernel.org/xen-devel/cover.1694543103.git.sanasta...@raptorengineering.com/
> [2]
> https://lore.kernel.org/xen-devel/cover.1692181079.git.oleksii.kuroc...@gmail.com/
>
> Oleksii Kurochko (29):
>    xen/asm-generic: introduce stub header spinlock.h

 At the example of this, personally I think this goes too far. Headers
 in
 asm-generic should be for the case where an arch elects to not
 implement
 certain functionality. Clearly spinlocks are required uniformly.
>>> It makes sense. Then I will back to the option [2] where I introduced
>>> all this headers as part of RISC-V architecture.
>>
>> You did see though that in a reply to my own mail I said I take back the
>> comment,
> 
> I can't find a reply to our own mail in my inbox. Do you have a message-id?

Oh, sorry, I said so in reply to 01/29.

> ? at least as far as this header (and perhaps several others) are
>> concerned.
> 
> Do you have a list where you think they should be kept? Or are you 
> planning to answer to all you disagree/agree one by one?

I think this can only be one-by-one.

Jan



Re: [PATCH v1 00/29] Introduce stub headers necessary for full Xen build

2023-09-18 Thread Julien Grall

Hi Jan,

On 18/09/2023 10:29, Jan Beulich wrote:

On 18.09.2023 10:51, Oleksii wrote:

On Thu, 2023-09-14 at 17:08 +0200, Jan Beulich wrote:

On 14.09.2023 16:56, Oleksii Kurochko wrote:

Based on two patch series [1] and [2], the idea of which is to
provide minimal
amount of things for a complete Xen build, a large amount of
headers are the same
or almost the same, so it makes sense to move them to asm-generic.

Also, providing such stub headers should help future architectures
to add
a full Xen build.

[1]
https://lore.kernel.org/xen-devel/cover.1694543103.git.sanasta...@raptorengineering.com/
[2]
https://lore.kernel.org/xen-devel/cover.1692181079.git.oleksii.kuroc...@gmail.com/

Oleksii Kurochko (29):
   xen/asm-generic: introduce stub header spinlock.h


At the example of this, personally I think this goes too far. Headers
in
asm-generic should be for the case where an arch elects to not
implement
certain functionality. Clearly spinlocks are required uniformly.

It makes sense. Then I will back to the option [2] where I introduced
all this headers as part of RISC-V architecture.


You did see though that in a reply to my own mail I said I take back the
comment,


I can't find a reply to our own mail in my inbox. Do you have a message-id?

? at least as far as this header (and perhaps several others) are

concerned.


Do you have a list where you think they should be kept? Or are you 
planning to answer to all you disagree/agree one by one?


Cheers,

--
Julien Grall



Re: [PATCH v1 00/29] Introduce stub headers necessary for full Xen build

2023-09-18 Thread Jan Beulich
On 18.09.2023 10:51, Oleksii wrote:
> On Thu, 2023-09-14 at 17:08 +0200, Jan Beulich wrote:
>> On 14.09.2023 16:56, Oleksii Kurochko wrote:
>>> Based on two patch series [1] and [2], the idea of which is to
>>> provide minimal
>>> amount of things for a complete Xen build, a large amount of
>>> headers are the same
>>> or almost the same, so it makes sense to move them to asm-generic.
>>>
>>> Also, providing such stub headers should help future architectures
>>> to add
>>> a full Xen build.
>>>
>>> [1]
>>> https://lore.kernel.org/xen-devel/cover.1694543103.git.sanasta...@raptorengineering.com/
>>> [2]
>>> https://lore.kernel.org/xen-devel/cover.1692181079.git.oleksii.kuroc...@gmail.com/
>>>
>>> Oleksii Kurochko (29):
>>>   xen/asm-generic: introduce stub header spinlock.h
>>
>> At the example of this, personally I think this goes too far. Headers
>> in
>> asm-generic should be for the case where an arch elects to not
>> implement
>> certain functionality. Clearly spinlocks are required uniformly.
> It makes sense. Then I will back to the option [2] where I introduced
> all this headers as part of RISC-V architecture. 

You did see though that in a reply to my own mail I said I take back the
comment, at least as far as this header (and perhaps several others) are
concerned.

Jan



Re: [PATCH 2/7] x86/emul: Fix and extend #DB trap handling

2023-09-18 Thread Andrew Cooper
On 15/09/2023 9:36 pm, Andrew Cooper wrote:
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c 
> b/xen/arch/x86/x86_emulate/x86_emulate.c
> index 94caec1d142c..de7f99500e3f 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -8379,13 +8379,6 @@ x86_emulate(
>  if ( !mode_64bit() )
>  _regs.r(ip) = (uint32_t)_regs.r(ip);
>  
> -/* Should a singlestep #DB be raised? */
> -if ( rc == X86EMUL_OKAY && singlestep && !ctxt->retire.mov_ss )
> -{
> -ctxt->retire.singlestep = true;
> -ctxt->retire.sti = false;
> -}
> -
>  if ( rc != X86EMUL_DONE )
>  *ctxt->regs = _regs;
>  else
> @@ -8394,6 +8387,19 @@ x86_emulate(
>  rc = X86EMUL_OKAY;
>  }
>  
> +/* Should a singlestep #DB be raised? */
> +if ( rc == X86EMUL_OKAY && singlestep )
> +{
> +ctxt->retire.pending_dbg |= X86_DR6_BS;
> +
> +/* BROKEN - TODO, merge into pending_dbg. */
> +if ( !ctxt->retire.mov_ss )
> +{
> +ctxt->retire.singlestep = true;
> +ctxt->retire.sti = false;
> +}

I occurs to me that setting X86_DR6_BS outside of the !mov_ss case will
break HVM (when HVM swaps from singlestep to pending_dbg) until one of
the further TODOs is addressed.

This will need bringing back within the conditional to avoid regressions
in the short term.

~Andrew



Re: [PATCH v1 00/29] Introduce stub headers necessary for full Xen build

2023-09-18 Thread Oleksii
On Mon, 2023-09-18 at 11:51 +0300, Oleksii wrote:
> On Thu, 2023-09-14 at 17:08 +0200, Jan Beulich wrote:
> > On 14.09.2023 16:56, Oleksii Kurochko wrote:
> > > Based on two patch series [1] and [2], the idea of which is to
> > > provide minimal
> > > amount of things for a complete Xen build, a large amount of
> > > headers are the same
> > > or almost the same, so it makes sense to move them to asm-
> > > generic.
> > > 
> > > Also, providing such stub headers should help future
> > > architectures
> > > to add
> > > a full Xen build.
> > > 
> > > [1]
> > > https://lore.kernel.org/xen-devel/cover.1694543103.git.sanasta...@raptorengineering.com/
> > > [2]
> > > https://lore.kernel.org/xen-devel/cover.1692181079.git.oleksii.kuroc...@gmail.com/
> > > 
> > > Oleksii Kurochko (29):
> > >   xen/asm-generic: introduce stub header spinlock.h
> > 
> > At the example of this, personally I think this goes too far.
> > Headers
> > in
> > asm-generic should be for the case where an arch elects to not
> > implement
> > certain functionality. Clearly spinlocks are required uniformly.
> It makes sense. Then I will back to the option [2] where I introduced
> all this headers as part of RISC-V architecture.
And I will review the current patch series probably it is still can be
something moved to asm-generic.

> 
> ~ Oleksii




Re: [PATCH v1 00/29] Introduce stub headers necessary for full Xen build

2023-09-18 Thread Oleksii
On Thu, 2023-09-14 at 17:08 +0200, Jan Beulich wrote:
> On 14.09.2023 16:56, Oleksii Kurochko wrote:
> > Based on two patch series [1] and [2], the idea of which is to
> > provide minimal
> > amount of things for a complete Xen build, a large amount of
> > headers are the same
> > or almost the same, so it makes sense to move them to asm-generic.
> > 
> > Also, providing such stub headers should help future architectures
> > to add
> > a full Xen build.
> > 
> > [1]
> > https://lore.kernel.org/xen-devel/cover.1694543103.git.sanasta...@raptorengineering.com/
> > [2]
> > https://lore.kernel.org/xen-devel/cover.1692181079.git.oleksii.kuroc...@gmail.com/
> > 
> > Oleksii Kurochko (29):
> >   xen/asm-generic: introduce stub header spinlock.h
> 
> At the example of this, personally I think this goes too far. Headers
> in
> asm-generic should be for the case where an arch elects to not
> implement
> certain functionality. Clearly spinlocks are required uniformly.
It makes sense. Then I will back to the option [2] where I introduced
all this headers as part of RISC-V architecture. 

~ Oleksii



Re: [PATCH v1 16/29] xen/asm-generic: introduce stub header flushtlb.h

2023-09-18 Thread Oleksii
Hello Jiamei,

On Fri, 2023-09-15 at 13:15 +0800, Jiamei Xie wrote:
> Hi Oleksii
...
> 
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > +
> > +
> It's duplicated.
Thanks. I'll remove duplication.
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: BSD
> > + * c-basic-offset: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */

~ Oleksii




Re: [PATCH v1 01/29] xen/asm-generic: introduce stub header spinlock.h

2023-09-18 Thread Oleksii
On Thu, 2023-09-14 at 17:35 +0200, Jan Beulich wrote:
> On 14.09.2023 16:56, Oleksii Kurochko wrote:
> > The patch introduces stub header needed for full Xen build.
> > 
> > Signed-off-by: Oleksii Kurochko 
> 
> Hmm, looking here I think I need to take back what I said in reply
> to the cover letter, taking this as an example.
> 
> > --- /dev/null
> > +++ b/xen/include/asm-generic/spinlock.h
> > @@ -0,0 +1,23 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +#ifndef __ASM_GENERIC_SPINLOCK_H__
> > +#define __ASM_GENERIC_SPINLOCK_H__
> > +
> > +#define arch_lock_acquire_barrier() smp_mb()
> > +#define arch_lock_release_barrier() smp_mb()
> > +
> > +#define arch_lock_relax() cpu_relax()
> > +#define arch_lock_signal() do { \
> > +} while(0)
> 
> Slightly easier (and without style violation) as ((void)0)?
Thanks. It is much easier.

> 
> > +#define arch_lock_signal_wmb() arch_lock_signal()
> 
> How's the WMB aspect represented in here? I think you need the x86
> variant as the generic fallback.

Agree. I'll take x86 version in the next patch series.

~ Oleksii



Re: [QEMU PATCH v5 09/13] virtio-gpu: Handle resource blob commands

2023-09-18 Thread Akihiko Odaki

On 2023/09/18 17:36, Huang Rui wrote:

On Sat, Sep 16, 2023 at 12:04:17AM +0800, Akihiko Odaki wrote:

On 2023/09/15 20:11, Huang Rui wrote:

From: Antonio Caggiano 

Support BLOB resources creation, mapping and unmapping by calling the
new stable virglrenderer 0.10 interface. Only enabled when available and
via the blob config. E.g. -device virtio-vga-gl,blob=true

Signed-off-by: Antonio Caggiano 
Signed-off-by: Dmitry Osipenko 
Signed-off-by: Xenia Ragiadakou 
Signed-off-by: Huang Rui 
---

V4 -> V5:
  - Use memory_region_init_ram_ptr() instead of
memory_region_init_ram_device_ptr() (Akihiko)

   hw/display/virtio-gpu-virgl.c  | 213 +
   hw/display/virtio-gpu.c|   4 +-
   include/hw/virtio/virtio-gpu.h |   5 +
   meson.build|   4 +
   4 files changed, 225 insertions(+), 1 deletion(-)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 312953ec16..563a6f2f58 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -17,6 +17,7 @@
   #include "trace.h"
   #include "hw/virtio/virtio.h"
   #include "hw/virtio/virtio-gpu.h"
+#include "hw/virtio/virtio-gpu-bswap.h"
   
   #include "ui/egl-helpers.h"
   
@@ -78,9 +79,24 @@ static void virgl_cmd_create_resource_3d(VirtIOGPU *g,

   virgl_renderer_resource_create(, NULL, 0);
   }
   
+static void virgl_resource_destroy(VirtIOGPU *g,

+   struct virtio_gpu_simple_resource *res)
+{
+if (!res)
+return;
+
+QTAILQ_REMOVE(>reslist, res, next);
+
+virtio_gpu_cleanup_mapping_iov(g, res->iov, res->iov_cnt);
+g_free(res->addrs);
+
+g_free(res);
+}
+
   static void virgl_cmd_resource_unref(VirtIOGPU *g,
struct virtio_gpu_ctrl_command *cmd)
   {
+struct virtio_gpu_simple_resource *res;
   struct virtio_gpu_resource_unref unref;
   struct iovec *res_iovs = NULL;
   int num_iovs = 0;
@@ -88,13 +104,22 @@ static void virgl_cmd_resource_unref(VirtIOGPU *g,
   VIRTIO_GPU_FILL_CMD(unref);
   trace_virtio_gpu_cmd_res_unref(unref.resource_id);
   
+res = virtio_gpu_find_resource(g, unref.resource_id);

+
   virgl_renderer_resource_detach_iov(unref.resource_id,
  _iovs,
  _iovs);
   if (res_iovs != NULL && num_iovs != 0) {
   virtio_gpu_cleanup_mapping_iov(g, res_iovs, num_iovs);
+if (res) {
+res->iov = NULL;
+res->iov_cnt = 0;
+}
   }
+
   virgl_renderer_resource_unref(unref.resource_id);
+
+virgl_resource_destroy(g, res);
   }
   
   static void virgl_cmd_context_create(VirtIOGPU *g,

@@ -426,6 +451,183 @@ static void virgl_cmd_get_capset(VirtIOGPU *g,
   g_free(resp);
   }
   
+#ifdef HAVE_VIRGL_RESOURCE_BLOB

+
+static void virgl_cmd_resource_create_blob(VirtIOGPU *g,
+   struct virtio_gpu_ctrl_command *cmd)
+{
+struct virtio_gpu_simple_resource *res;
+struct virtio_gpu_resource_create_blob cblob;
+struct virgl_renderer_resource_create_blob_args virgl_args = { 0 };
+int ret;
+
+VIRTIO_GPU_FILL_CMD(cblob);
+virtio_gpu_create_blob_bswap();
+trace_virtio_gpu_cmd_res_create_blob(cblob.resource_id, cblob.size);
+
+if (cblob.resource_id == 0) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: resource id 0 is not allowed\n",
+  __func__);
+cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
+return;
+}
+
+res = virtio_gpu_find_resource(g, cblob.resource_id);
+if (res) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: resource already exists %d\n",
+  __func__, cblob.resource_id);
+cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
+return;
+}
+
+res = g_new0(struct virtio_gpu_simple_resource, 1);
+if (!res) {
+cmd->error = VIRTIO_GPU_RESP_ERR_OUT_OF_MEMORY;
+return;
+}
+
+res->resource_id = cblob.resource_id;
+res->blob_size = cblob.size;
+
+if (cblob.blob_mem != VIRTIO_GPU_BLOB_MEM_HOST3D) {
+ret = virtio_gpu_create_mapping_iov(g, cblob.nr_entries, sizeof(cblob),
+cmd, >addrs, >iov,
+>iov_cnt);
+if (!ret) {
+g_free(res);
+cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
+return;
+}
+}
+
+QTAILQ_INSERT_HEAD(>reslist, res, next);
+
+virgl_args.res_handle = cblob.resource_id;
+virgl_args.ctx_id = cblob.hdr.ctx_id;
+virgl_args.blob_mem = cblob.blob_mem;
+virgl_args.blob_id = cblob.blob_id;
+virgl_args.blob_flags = cblob.blob_flags;
+virgl_args.size = cblob.size;
+virgl_args.iovecs = res->iov;
+virgl_args.num_iovs = res->iov_cnt;
+
+ret = virgl_renderer_resource_create_blob(_args);
+if (ret) {
+

Re: [QEMU PATCH v5 09/13] virtio-gpu: Handle resource blob commands

2023-09-18 Thread Huang Rui
On Sat, Sep 16, 2023 at 12:04:17AM +0800, Akihiko Odaki wrote:
> On 2023/09/15 20:11, Huang Rui wrote:
> > From: Antonio Caggiano 
> > 
> > Support BLOB resources creation, mapping and unmapping by calling the
> > new stable virglrenderer 0.10 interface. Only enabled when available and
> > via the blob config. E.g. -device virtio-vga-gl,blob=true
> > 
> > Signed-off-by: Antonio Caggiano 
> > Signed-off-by: Dmitry Osipenko 
> > Signed-off-by: Xenia Ragiadakou 
> > Signed-off-by: Huang Rui 
> > ---
> > 
> > V4 -> V5:
> >  - Use memory_region_init_ram_ptr() instead of
> >memory_region_init_ram_device_ptr() (Akihiko)
> > 
> >   hw/display/virtio-gpu-virgl.c  | 213 +
> >   hw/display/virtio-gpu.c|   4 +-
> >   include/hw/virtio/virtio-gpu.h |   5 +
> >   meson.build|   4 +
> >   4 files changed, 225 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> > index 312953ec16..563a6f2f58 100644
> > --- a/hw/display/virtio-gpu-virgl.c
> > +++ b/hw/display/virtio-gpu-virgl.c
> > @@ -17,6 +17,7 @@
> >   #include "trace.h"
> >   #include "hw/virtio/virtio.h"
> >   #include "hw/virtio/virtio-gpu.h"
> > +#include "hw/virtio/virtio-gpu-bswap.h"
> >   
> >   #include "ui/egl-helpers.h"
> >   
> > @@ -78,9 +79,24 @@ static void virgl_cmd_create_resource_3d(VirtIOGPU *g,
> >   virgl_renderer_resource_create(, NULL, 0);
> >   }
> >   
> > +static void virgl_resource_destroy(VirtIOGPU *g,
> > +   struct virtio_gpu_simple_resource *res)
> > +{
> > +if (!res)
> > +return;
> > +
> > +QTAILQ_REMOVE(>reslist, res, next);
> > +
> > +virtio_gpu_cleanup_mapping_iov(g, res->iov, res->iov_cnt);
> > +g_free(res->addrs);
> > +
> > +g_free(res);
> > +}
> > +
> >   static void virgl_cmd_resource_unref(VirtIOGPU *g,
> >struct virtio_gpu_ctrl_command *cmd)
> >   {
> > +struct virtio_gpu_simple_resource *res;
> >   struct virtio_gpu_resource_unref unref;
> >   struct iovec *res_iovs = NULL;
> >   int num_iovs = 0;
> > @@ -88,13 +104,22 @@ static void virgl_cmd_resource_unref(VirtIOGPU *g,
> >   VIRTIO_GPU_FILL_CMD(unref);
> >   trace_virtio_gpu_cmd_res_unref(unref.resource_id);
> >   
> > +res = virtio_gpu_find_resource(g, unref.resource_id);
> > +
> >   virgl_renderer_resource_detach_iov(unref.resource_id,
> >  _iovs,
> >  _iovs);
> >   if (res_iovs != NULL && num_iovs != 0) {
> >   virtio_gpu_cleanup_mapping_iov(g, res_iovs, num_iovs);
> > +if (res) {
> > +res->iov = NULL;
> > +res->iov_cnt = 0;
> > +}
> >   }
> > +
> >   virgl_renderer_resource_unref(unref.resource_id);
> > +
> > +virgl_resource_destroy(g, res);
> >   }
> >   
> >   static void virgl_cmd_context_create(VirtIOGPU *g,
> > @@ -426,6 +451,183 @@ static void virgl_cmd_get_capset(VirtIOGPU *g,
> >   g_free(resp);
> >   }
> >   
> > +#ifdef HAVE_VIRGL_RESOURCE_BLOB
> > +
> > +static void virgl_cmd_resource_create_blob(VirtIOGPU *g,
> > +   struct virtio_gpu_ctrl_command 
> > *cmd)
> > +{
> > +struct virtio_gpu_simple_resource *res;
> > +struct virtio_gpu_resource_create_blob cblob;
> > +struct virgl_renderer_resource_create_blob_args virgl_args = { 0 };
> > +int ret;
> > +
> > +VIRTIO_GPU_FILL_CMD(cblob);
> > +virtio_gpu_create_blob_bswap();
> > +trace_virtio_gpu_cmd_res_create_blob(cblob.resource_id, cblob.size);
> > +
> > +if (cblob.resource_id == 0) {
> > +qemu_log_mask(LOG_GUEST_ERROR, "%s: resource id 0 is not 
> > allowed\n",
> > +  __func__);
> > +cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
> > +return;
> > +}
> > +
> > +res = virtio_gpu_find_resource(g, cblob.resource_id);
> > +if (res) {
> > +qemu_log_mask(LOG_GUEST_ERROR, "%s: resource already exists %d\n",
> > +  __func__, cblob.resource_id);
> > +cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
> > +return;
> > +}
> > +
> > +res = g_new0(struct virtio_gpu_simple_resource, 1);
> > +if (!res) {
> > +cmd->error = VIRTIO_GPU_RESP_ERR_OUT_OF_MEMORY;
> > +return;
> > +}
> > +
> > +res->resource_id = cblob.resource_id;
> > +res->blob_size = cblob.size;
> > +
> > +if (cblob.blob_mem != VIRTIO_GPU_BLOB_MEM_HOST3D) {
> > +ret = virtio_gpu_create_mapping_iov(g, cblob.nr_entries, 
> > sizeof(cblob),
> > +cmd, >addrs, >iov,
> > +>iov_cnt);
> > +if (!ret) {
> > +g_free(res);
> > +cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
> > +return;
> > +}
> > +}
> 

Re: [PATCH 1/2] xen/x86: io_apic: Introduce a command line option to skip timer check

2023-09-18 Thread Roger Pau Monné
On Fri, Sep 15, 2023 at 03:27:40PM +0100, Julien Grall wrote:
> Hi Roger,
> 
> On 15/09/2023 14:54, Roger Pau Monné wrote:
> > On Fri, Aug 18, 2023 at 02:44:40PM +0100, Julien Grall wrote:
> > > From: Julien Grall 
> > > 
> > > Currently, Xen will spend ~100ms to check if the timer works. If the
> > > Admin knows their platform have a working timer, then it would be
> > > handy to be able to bypass the check.
> > 
> > I'm of the opinion that the current code is at least dubious.
> > 
> > Specifically attempts to test the PIT timer, even when the hypervisor
> > might be using a different timer.  At which point it mostly attempts
> > to test that the interrupt routing from the PIT (or it's replacement)
> > is correct, and that Xen can receive such interrupts.
> > 
> > We go to great lengths in order to attempt to please this piece of
> > code, even when no PIT is available, at which point we switch the HPET
> > to legacy replacement mode in order to satisfy the checks.
> > 
> > I think the current code is not useful, and I would be fine if it was
> > just removed.
> 
> I am afraid I don't know enough the code to be able to say if it can be
> removed.
> 
> I also have no idea how common are such platforms nowadays (I suspect it is
> rather small). Which I why I went with a command line option.

It was more of a grumble rather than a request for you to remove the
code.  I've been meaning to look into this myself for some time, as we
just keep accumulating bodges in order to keep the test happy.

We don't seem to perform such tests for any other interrupt sources
that Xen uses (or timer), so I find it kind of odd.  I guess at one
point the PIT was always used, and hence it was relevant to test for
it unconditionally, but that's not the case anymore.

> 
> > 
> > > 
> > > Introduce a command line option 'no_timer_check' (the name is
> > > matching the Linux parameter) for this purpose.
> > > 
> > > Signed-off-by: Julien Grall 
> > > ---
> > >   docs/misc/xen-command-line.pandoc |  7 +++
> > >   xen/arch/x86/io_apic.c| 11 +++
> > >   2 files changed, 18 insertions(+)
> > > 
> > > diff --git a/docs/misc/xen-command-line.pandoc 
> > > b/docs/misc/xen-command-line.pandoc
> > > index 4872b9098e83..1f9d3106383f 100644
> > > --- a/docs/misc/xen-command-line.pandoc
> > > +++ b/docs/misc/xen-command-line.pandoc
> > > @@ -1896,6 +1896,13 @@ This option is ignored in **pv-shim** mode.
> > >   ### nr_irqs (x86)
> > >   > `= `
> > > +### no_timer_works (x86)
> > 
> > I find the naming confusing, and I would rather avoid negative named
> > booleans.
> > 
> > Maybe it should be `check_pit_intr` and default to enabled for the
> > time being?
> Jan suggested to use timer-irq-works. Would you be happy with the name?

Hm, pit_irq_works might be better IMO, but I could live with
timer_irq_works (with either _ or -).

> > Note that if you don't want to run the test in timer_irq_works() you
> > can possibly avoid the code in preinit_pit(), as there no need to
> > setup channel 0 in periodic mode then.
> 
> The channel also seems to be used as a fallback method for calibrating the
> APIC (see calibrate_APIC_clock()). AFAICT, the fallback method should only
> be used when the PIT is enabled.

Yes, I see.  I think most systems from the last decade will have TSC
deadline timer support, and hence don't engage in the calibration.  We
should see about switching the calibration to use the selected timer,
instead of forcing the usage of the PIT.

> I think it would still be feasible to avoid running the IRQ tests even when
> PIT is used. So it means, we cannot skip preinit_pit().

Yeah, so we seem to have a couple of usages of the Channel 0 counter
that don't rely on the interrupt at all.

> As a side note, I noticed that preinit_pit() is called during resume. But I
> can't find any use of channel 0 after boot. So maybe the call could be
> removed?

See _disable_pit_irq(), there's a relation between the PIT and
cpuidle.

Thanks, Roger.



[linux-linus test] 183025: regressions - FAIL

2023-09-18 Thread osstest service owner
flight 183025 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/183025/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 8 xen-boot fail REGR. 
vs. 182531
 test-amd64-coresched-amd64-xl  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-libvirt-pair 12 xen-boot/src_host   fail REGR. vs. 182531
 test-amd64-amd64-libvirt-pair 13 xen-boot/dst_host   fail REGR. vs. 182531
 test-amd64-amd64-xl-xsm   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-libvirt-raw  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-libvirt-qcow2  8 xen-boot   fail REGR. vs. 182531
 test-amd64-amd64-xl-vhd   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-pvshim8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-pvhv2-intel  8 xen-boot  fail REGR. vs. 182531
 test-amd64-amd64-freebsd12-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemuu-ws16-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemut-debianhvm-amd64  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-xl   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemut-win7-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-libvirt-xsm  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemuu-win7-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-dom0pvh-xl-amd 14 guest-start   fail REGR. vs. 182531
 test-amd64-amd64-xl-shadow8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-examine-uefi  8 reboot  fail REGR. vs. 182531
 test-amd64-amd64-xl-credit1   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-pair12 xen-boot/src_hostfail REGR. vs. 182531
 test-amd64-amd64-pair13 xen-boot/dst_hostfail REGR. vs. 182531
 test-amd64-amd64-xl-multivcpu  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-qemuu-nested-intel  8 xen-boot  fail REGR. vs. 182531
 test-amd64-amd64-xl-qemut-ws16-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemuu-ovmf-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-pvhv2-amd  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-qemuu-nested-amd  8 xen-bootfail REGR. vs. 182531
 test-amd64-amd64-dom0pvh-xl-intel 14 guest-start fail REGR. vs. 182531
 test-amd64-amd64-libvirt  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-freebsd11-amd64  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-pygrub   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 8 xen-boot fail REGR. vs. 
182531
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 8 xen-boot fail REGR. 
vs. 182531
 test-amd64-amd64-xl-credit2   8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-examine-bios  8 reboot  fail REGR. vs. 182531
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 182531
 test-amd64-amd64-examine  8 reboot   fail REGR. vs. 182531
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 8 xen-boot fail REGR. vs. 
182531
 test-armhf-armhf-xl-credit1  10 host-ping-check-xen  fail REGR. vs. 182531

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  8 xen-boot fail REGR. vs. 182531

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 182531
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 182531
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 182531
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 

Re: [QEMU PATCH v5 06/13] virtio-gpu: Support context init feature with virglrenderer

2023-09-18 Thread Akihiko Odaki

On 2023/09/18 15:20, Huang Rui wrote:

On Mon, Sep 18, 2023 at 02:07:25PM +0800, Akihiko Odaki wrote:

On 2023/09/18 14:43, Huang Rui wrote:

On Sun, Sep 17, 2023 at 01:49:19PM +0800, Akihiko Odaki wrote:

On 2023/09/17 14:45, Huang Rui wrote:

On Sat, Sep 16, 2023 at 06:42:04PM +0800, Akihiko Odaki wrote:

On 2023/09/16 19:32, Huang Rui wrote:

On Fri, Sep 15, 2023 at 11:20:46PM +0800, Akihiko Odaki wrote:

On 2023/09/15 20:11, Huang Rui wrote:

Patch "virtio-gpu: CONTEXT_INIT feature" has added the context_init
feature flags.
We would like to enable the feature with virglrenderer, so add to create
virgl renderer context with flags using context_id when valid.

Originally-by: Antonio Caggiano 
Signed-off-by: Huang Rui 
---

V4 -> V5:
 - Inverted patch 5 and 6 because we should configure
   HAVE_VIRGL_CONTEXT_INIT firstly. (Philippe)

  hw/display/virtio-gpu-virgl.c | 13 +++--
  hw/display/virtio-gpu.c   |  2 ++
  2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 8bb7a2c21f..312953ec16 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -106,8 +106,17 @@ static void virgl_cmd_context_create(VirtIOGPU *g,
  trace_virtio_gpu_cmd_ctx_create(cc.hdr.ctx_id,
  cc.debug_name);
  
-virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen,

-  cc.debug_name);
+if (cc.context_init) {
+#ifdef HAVE_VIRGL_CONTEXT_INIT
+virgl_renderer_context_create_with_flags(cc.hdr.ctx_id,
+ cc.context_init,
+ cc.nlen,
+ cc.debug_name);
+return;
+#endif


This should deal with the case when context_init is set while
HAVE_VIRGL_CONTEXT_INIT is not defined.


Actually, I received the comment below before:

https://lore.kernel.org/qemu-devel/32588d0e-a1f2-30c4-5e9f-e6e7c4190...@linaro.org/

At original patch set, I have the case while HAVE_VIRGL_CONTEXT_INIT is set
but HAVE_VIRGL_CONTEXT_INIT is not defined. But I think we may encounter
the case that virgl_renderer_context_create_with_flags is not defined in
virglrenderer early version. Should I bring the error message back?

Thanks,
Ray


I suggest checking VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED instead of
reporting an error here. Perhaps it may be easier to add #ifdef around:
> +DEFINE_PROP_BIT("context_init", VirtIOGPU, parent_obj.conf.flags,
> +VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED, false),


How about below changes: >
---
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 8bb7a2c21f..54a3cfe136 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -106,8 +106,15 @@ static void virgl_cmd_context_create(VirtIOGPU *g,
trace_virtio_gpu_cmd_ctx_create(cc.hdr.ctx_id,
cc.debug_name);

-virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen,
-  cc.debug_name);
+if (cc.context_init && virtio_gpu_context_init_enabled(g->conf)) {
+virgl_renderer_context_create_with_flags(cc.hdr.ctx_id,
+ cc.context_init,
+ cc.nlen,
+ cc.debug_name);
+return;
+}
+
+virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen, cc.debug_name);
}

static void virgl_cmd_context_destroy(VirtIOGPU *g,
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index be16efbd38..6ff2c8e92d 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1508,6 +1508,10 @@ static Property virtio_gpu_properties[] = {
DEFINE_PROP_BIT("blob", VirtIOGPU, parent_obj.conf.flags,
VIRTIO_GPU_FLAG_BLOB_ENABLED, false),
DEFINE_PROP_SIZE("hostmem", VirtIOGPU, parent_obj.conf.hostmem, 0),
+#ifdef HAVE_VIRGL_CONTEXT_INIT
+DEFINE_PROP_BIT("context_init", VirtIOGPU, parent_obj.conf.flags,
+VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED, true),
+#endif
DEFINE_PROP_END_OF_LIST(),
};



It looks better, but not having #ifdef around
virgl_renderer_context_create_with_flags() will result in a link error
with old virglrenderer.


Hmm, right, it seems that we have to have a "#ifdef" around here. Or do you
have any better idea?


Having #ifdef is the right direction.


OK, so we can use cc.context_init and make sure context_init function
enabled. Please check below:

---
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 8bb7a2c21f..8363674ebc 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -106,8 +106,17 @@ static void virgl_cmd_context_create(VirtIOGPU *g,
  

Re: [QEMU PATCH v5 06/13] virtio-gpu: Support context init feature with virglrenderer

2023-09-18 Thread Huang Rui
On Mon, Sep 18, 2023 at 02:07:25PM +0800, Akihiko Odaki wrote:
> On 2023/09/18 14:43, Huang Rui wrote:
> > On Sun, Sep 17, 2023 at 01:49:19PM +0800, Akihiko Odaki wrote:
> >> On 2023/09/17 14:45, Huang Rui wrote:
> >>> On Sat, Sep 16, 2023 at 06:42:04PM +0800, Akihiko Odaki wrote:
>  On 2023/09/16 19:32, Huang Rui wrote:
> > On Fri, Sep 15, 2023 at 11:20:46PM +0800, Akihiko Odaki wrote:
> >> On 2023/09/15 20:11, Huang Rui wrote:
> >>> Patch "virtio-gpu: CONTEXT_INIT feature" has added the context_init
> >>> feature flags.
> >>> We would like to enable the feature with virglrenderer, so add to 
> >>> create
> >>> virgl renderer context with flags using context_id when valid.
> >>>
> >>> Originally-by: Antonio Caggiano 
> >>> Signed-off-by: Huang Rui 
> >>> ---
> >>>
> >>> V4 -> V5:
> >>> - Inverted patch 5 and 6 because we should configure
> >>>   HAVE_VIRGL_CONTEXT_INIT firstly. (Philippe)
> >>>
> >>>  hw/display/virtio-gpu-virgl.c | 13 +++--
> >>>  hw/display/virtio-gpu.c   |  2 ++
> >>>  2 files changed, 13 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/hw/display/virtio-gpu-virgl.c 
> >>> b/hw/display/virtio-gpu-virgl.c
> >>> index 8bb7a2c21f..312953ec16 100644
> >>> --- a/hw/display/virtio-gpu-virgl.c
> >>> +++ b/hw/display/virtio-gpu-virgl.c
> >>> @@ -106,8 +106,17 @@ static void virgl_cmd_context_create(VirtIOGPU 
> >>> *g,
> >>>  trace_virtio_gpu_cmd_ctx_create(cc.hdr.ctx_id,
> >>>  cc.debug_name);
> >>>  
> >>> -virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen,
> >>> -  cc.debug_name);
> >>> +if (cc.context_init) {
> >>> +#ifdef HAVE_VIRGL_CONTEXT_INIT
> >>> +virgl_renderer_context_create_with_flags(cc.hdr.ctx_id,
> >>> + cc.context_init,
> >>> + cc.nlen,
> >>> + cc.debug_name);
> >>> +return;
> >>> +#endif
> >>
> >> This should deal with the case when context_init is set while
> >> HAVE_VIRGL_CONTEXT_INIT is not defined.
> >
> > Actually, I received the comment below before:
> >
> > https://lore.kernel.org/qemu-devel/32588d0e-a1f2-30c4-5e9f-e6e7c4190...@linaro.org/
> >
> > At original patch set, I have the case while HAVE_VIRGL_CONTEXT_INIT is 
> > set
> > but HAVE_VIRGL_CONTEXT_INIT is not defined. But I think we may encounter
> > the case that virgl_renderer_context_create_with_flags is not defined in
> > virglrenderer early version. Should I bring the error message back?
> >
> > Thanks,
> > Ray
> 
>  I suggest checking VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED instead of
>  reporting an error here. Perhaps it may be easier to add #ifdef around:
> > +DEFINE_PROP_BIT("context_init", VirtIOGPU, 
>  parent_obj.conf.flags,
> > +VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED, false),
> >>>
> >>> How about below changes: >
> >>> ---
> >>> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> >>> index 8bb7a2c21f..54a3cfe136 100644
> >>> --- a/hw/display/virtio-gpu-virgl.c
> >>> +++ b/hw/display/virtio-gpu-virgl.c
> >>> @@ -106,8 +106,15 @@ static void virgl_cmd_context_create(VirtIOGPU *g,
> >>>trace_virtio_gpu_cmd_ctx_create(cc.hdr.ctx_id,
> >>>cc.debug_name);
> >>>
> >>> -virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen,
> >>> -  cc.debug_name);
> >>> +if (cc.context_init && virtio_gpu_context_init_enabled(g->conf)) {
> >>> +virgl_renderer_context_create_with_flags(cc.hdr.ctx_id,
> >>> + cc.context_init,
> >>> + cc.nlen,
> >>> + cc.debug_name);
> >>> +return;
> >>> +}
> >>> +
> >>> +virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen, cc.debug_name);
> >>>}
> >>>
> >>>static void virgl_cmd_context_destroy(VirtIOGPU *g,
> >>> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
> >>> index be16efbd38..6ff2c8e92d 100644
> >>> --- a/hw/display/virtio-gpu.c
> >>> +++ b/hw/display/virtio-gpu.c
> >>> @@ -1508,6 +1508,10 @@ static Property virtio_gpu_properties[] = {
> >>>DEFINE_PROP_BIT("blob", VirtIOGPU, parent_obj.conf.flags,
> >>>VIRTIO_GPU_FLAG_BLOB_ENABLED, false),
> >>>DEFINE_PROP_SIZE("hostmem", VirtIOGPU, parent_obj.conf.hostmem, 0),
> >>> +#ifdef HAVE_VIRGL_CONTEXT_INIT
> >>> +DEFINE_PROP_BIT("context_init", VirtIOGPU, parent_obj.conf.flags,
> >>> +

Re: [QEMU PATCH v5 06/13] virtio-gpu: Support context init feature with virglrenderer

2023-09-18 Thread Akihiko Odaki

On 2023/09/18 14:43, Huang Rui wrote:

On Sun, Sep 17, 2023 at 01:49:19PM +0800, Akihiko Odaki wrote:

On 2023/09/17 14:45, Huang Rui wrote:

On Sat, Sep 16, 2023 at 06:42:04PM +0800, Akihiko Odaki wrote:

On 2023/09/16 19:32, Huang Rui wrote:

On Fri, Sep 15, 2023 at 11:20:46PM +0800, Akihiko Odaki wrote:

On 2023/09/15 20:11, Huang Rui wrote:

Patch "virtio-gpu: CONTEXT_INIT feature" has added the context_init
feature flags.
We would like to enable the feature with virglrenderer, so add to create
virgl renderer context with flags using context_id when valid.

Originally-by: Antonio Caggiano 
Signed-off-by: Huang Rui 
---

V4 -> V5:
- Inverted patch 5 and 6 because we should configure
  HAVE_VIRGL_CONTEXT_INIT firstly. (Philippe)

 hw/display/virtio-gpu-virgl.c | 13 +++--
 hw/display/virtio-gpu.c   |  2 ++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 8bb7a2c21f..312953ec16 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -106,8 +106,17 @@ static void virgl_cmd_context_create(VirtIOGPU *g,
 trace_virtio_gpu_cmd_ctx_create(cc.hdr.ctx_id,
 cc.debug_name);
 
-virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen,

-  cc.debug_name);
+if (cc.context_init) {
+#ifdef HAVE_VIRGL_CONTEXT_INIT
+virgl_renderer_context_create_with_flags(cc.hdr.ctx_id,
+ cc.context_init,
+ cc.nlen,
+ cc.debug_name);
+return;
+#endif


This should deal with the case when context_init is set while
HAVE_VIRGL_CONTEXT_INIT is not defined.


Actually, I received the comment below before:

https://lore.kernel.org/qemu-devel/32588d0e-a1f2-30c4-5e9f-e6e7c4190...@linaro.org/

At original patch set, I have the case while HAVE_VIRGL_CONTEXT_INIT is set
but HAVE_VIRGL_CONTEXT_INIT is not defined. But I think we may encounter
the case that virgl_renderer_context_create_with_flags is not defined in
virglrenderer early version. Should I bring the error message back?

Thanks,
Ray


I suggest checking VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED instead of
reporting an error here. Perhaps it may be easier to add #ifdef around:
   > +DEFINE_PROP_BIT("context_init", VirtIOGPU, parent_obj.conf.flags,
   > +VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED, false),


How about below changes: >
---
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 8bb7a2c21f..54a3cfe136 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -106,8 +106,15 @@ static void virgl_cmd_context_create(VirtIOGPU *g,
   trace_virtio_gpu_cmd_ctx_create(cc.hdr.ctx_id,
   cc.debug_name);

-virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen,
-  cc.debug_name);
+if (cc.context_init && virtio_gpu_context_init_enabled(g->conf)) {
+virgl_renderer_context_create_with_flags(cc.hdr.ctx_id,
+ cc.context_init,
+ cc.nlen,
+ cc.debug_name);
+return;
+}
+
+virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen, cc.debug_name);
   }

   static void virgl_cmd_context_destroy(VirtIOGPU *g,
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index be16efbd38..6ff2c8e92d 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1508,6 +1508,10 @@ static Property virtio_gpu_properties[] = {
   DEFINE_PROP_BIT("blob", VirtIOGPU, parent_obj.conf.flags,
   VIRTIO_GPU_FLAG_BLOB_ENABLED, false),
   DEFINE_PROP_SIZE("hostmem", VirtIOGPU, parent_obj.conf.hostmem, 0),
+#ifdef HAVE_VIRGL_CONTEXT_INIT
+DEFINE_PROP_BIT("context_init", VirtIOGPU, parent_obj.conf.flags,
+VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED, true),
+#endif
   DEFINE_PROP_END_OF_LIST(),
   };



It looks better, but not having #ifdef around
virgl_renderer_context_create_with_flags() will result in a link error
with old virglrenderer.


Hmm, right, it seems that we have to have a "#ifdef" around here. Or do you
have any better idea?


Having #ifdef is the right direction.