Re: [PATCH v3 01/11] xen/manage: keep track of the on-going suspend mode

2021-05-20 Thread Anchal Agarwal
On Thu, Oct 01, 2020 at 08:43:58AM -0400, boris.ostrov...@oracle.com wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you can confirm the sender and know the 
> content is safe.
> 
> 
> 
> >>> Also, wrt KASLR stuff, that issue is still seen sometimes but I 
> >>> haven't had
> >>> bandwidth to dive deep into the issue and fix it.
>  So what's the plan there? You first mentioned this issue early this year 
>  and judged by your response it is not clear whether you will ever spend 
>  time looking at it.
> 
> >>> I do want to fix it and did do some debugging earlier this year just 
> >>> haven't
> >>> gotten back to it. Also, wanted to understand if the issue is a blocker 
> >>> to this
> >>> series?
> >>
> >> Integrating code with known bugs is less than ideal.
> >>
> > So for this series to be accepted, KASLR needs to be fixed along with other
> > comments of course?
> 
> 
> Yes, please.
> 
> 
> 
> >>> I had some theories when debugging around this like if the random base 
> >>> address picked by kaslr for the
> >>> resuming kernel mismatches the suspended kernel and just jogging my 
> >>> memory, I didn't find that as the case.
> >>> Another hunch was if physical address of registered vcpu info at boot is 
> >>> different from what suspended kernel
> >>> has and that can cause CPU's to get stuck when coming online.
> >>
> >> I'd think if this were the case you'd have 100% failure rate. And we are 
> >> also re-registering vcpu info on xen restore and I am not aware of any 
> >> failures due to KASLR.
> >>
> > What I meant there wrt VCPU info was that VCPU info is not unregistered 
> > during hibernation,
> > so Xen still remembers the old physical addresses for the VCPU information, 
> > created by the
> > booting kernel. But since the hibernation kernel may have different physical
> > addresses for VCPU info and if mismatch happens, it may cause issues with 
> > resume.
> > During hibernation, the VCPU info register hypercall is not invoked again.
> 
> 
> I still don't think that's the cause but it's certainly worth having a look.
> 
Hi Boris,
Apologies for picking this up after last year. 
I did some dive deep on the above statement and that is indeed the case that's 
happening. 
I did some debugging around KASLR and hibernation using reboot mode.
I observed in my debug prints that whenever vcpu_info* address for secondary 
vcpu assigned 
in xen_vcpu_setup at boot is different than what is in the image, resume gets 
stuck for that vcpu
in bringup_cpu(). That means we have different addresses for 
_cpu(xen_vcpu_info, cpu) at boot and after
control jumps into the image. 

I failed to get any prints after it got stuck in bringup_cpu() and
I do not have an option to send a sysrq signal to the guest or rather get a 
kdump.
This change is not observed in every hibernate-resume cycle. I am not sure if 
this is a bug or an 
expected behavior. 
Also, I am contemplating the idea that it may be a bug in xen code getting 
triggered only when
KASLR is enabled but I do not have substantial data to prove that.
Is this a coincidence that this always happens for 1st vcpu?
Moreover, since hypervisor is not aware that guest is hibernated and it looks 
like a regular shutdown to dom0 during reboot mode,
will re-registering vcpu_info for secondary vcpu's even plausible? I could 
definitely use some advice to debug this further.

 
Some printk's from my debugging:

At Boot:

xen_vcpu_setup: xen_have_vcpu_info_placement=1 cpu=1, vcpup=0x9e548fa560e0, 
info.mfn=3996246 info.offset=224,

Image Loads:
It ends up in the condition:
 xen_vcpu_setup()
 {
 ...
 if (xen_hvm_domain()) {
if (per_cpu(xen_vcpu, cpu) == _cpu(xen_vcpu_info, cpu))
return 0; 
 }
 ...
 }

xen_vcpu_setup: checking mfn on resume cpu=1, info.mfn=3934806 info.offset=224, 
_cpu(xen_vcpu_info, cpu)=0x9d7240a560e0

This is tested on c4.2xlarge [8vcpu 15GB mem] instance with 5.10 kernel running
in the guest.

Thanks,
Anchal.
> 
> -boris
> 
> 



Re: QEMU backport necessary for building with "recent" toolchain (on openSUSE Tumbleweed)

2021-05-20 Thread Dario Faggioli
On Thu, 2021-05-20 at 14:55 +0100, Anthony PERARD wrote:
> On Tue, May 18, 2021 at 05:24:30PM +0200, Dario Faggioli wrote:
> > 
> > I think we need the following commit in our QEMU: bbd2d5a812077
> > ("build: -no-pie is no functional linker flag").
> 
> Hi Dario,
> 
Hi,

> I'm hoping to update qemu-xen to a newer version of QEMU (6.0) which
> would have the fix, but that's need a fix of libxl,
>     "Fix libxl with QEMU 6.0 + remove some more deprecated usages."
> So I would prefer to avoid adding more to the current branch.
> 
Sure, makes sense.

I wanted to bring it up, in case it hadn't been noticed yet. If it has
been noticed and there's a plan then we're good, I guess.

> The branch stable-4.15 already have the fix if you need, in the mean
> time.
> 
Ok, thanks!

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
---
<> (Raistlin Majere)


signature.asc
Description: This is a digitally signed message part


Re: [PATCH v2] libelf: improve PVH elfnote parsing

2021-05-20 Thread Juergen Gross

On 20.05.21 11:28, Jan Beulich wrote:

On 20.05.2021 11:27, Roger Pau Monné wrote:

On Wed, May 19, 2021 at 12:34:19PM +0200, Jan Beulich wrote:

On 18.05.2021 16:47, Roger Pau Monne wrote:

@@ -425,8 +425,11 @@ static elf_errorstatus elf_xen_addr_calc_check(struct 
elf_binary *elf,
  return -1;
  }
  
-/* Initial guess for virt_base is 0 if it is not explicitly defined. */

-if ( parms->virt_base == UNSET_ADDR )
+/*
+ * Initial guess for virt_base is 0 if it is not explicitly defined in the
+ * PV case. For PVH virt_base is forced to 0 because paging is disabled.
+ */
+if ( parms->virt_base == UNSET_ADDR || hvm )
  {
  parms->virt_base = 0;
  elf_msg(elf, "ELF: VIRT_BASE unset, using %#" PRIx64 "\n",


This message is wrong now if hvm is true but parms->virt_base != UNSET_ADDR.
Best perhaps is to avoid emitting the message altogether when hvm is true.
(Since you'll be touching it anyway, perhaps a good opportunity to do 

away

with passing parms->virt_base to elf_msg(), and instead simply use a literal
zero.)


@@ -441,8 +444,10 @@ static elf_errorstatus elf_xen_addr_calc_check(struct 
elf_binary *elf,
   *
   * If we are using the modern ELF notes interface then the default
   * is 0.
+ *
+ * For PVH this is forced to 0, as it's already a legacy option 

for PV.

   */
-if ( parms->elf_paddr_offset == UNSET_ADDR )
+if ( parms->elf_paddr_offset == UNSET_ADDR || hvm )
  {
  if ( parms->elf_note_start )


Don't you want "|| hvm" here as well, or alternatively suppress the
fallback to the __xen_guest section in the PVH case (near the end of
elf_xen_parse())?


The legacy __xen_guest section doesn't support PHYS32_ENTRY, so yes,
that part could be completely skipped when called from an HVM
container.

I think I will fix that in another patch though if you are OK, as
it's not strictly related to the calculation fixes done here.


That's fine; it wants to be a prereq to the one here then, though,
I think.


Would it be possible to add some comment to xen/include/public/elfnote.h
Indicating which elfnotes are evaluated for which guest types, including
a hint which elfnotes _have_ been evaluated before this series? This
will help cleaning up guests regarding advertisement of elfnotes
(something I've been planning to do for the Linux kernel).


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[linux-linus test] 162106: regressions - FAIL

2021-05-20 Thread osstest service owner
flight 162106 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/162106/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-xsm7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. 
vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-libvirt-xsm   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-coresched-i386-xl  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-freebsd10-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-pvshim 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-raw7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-shadow 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-freebsd10-i386  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. 
vs. 152332
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-examine   6 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-amd64-libvirt-vhd 13 guest-start  fail REGR. vs. 152332
 test-amd64-amd64-amd64-pvgrub 19 guest-localmigrate/x10  fail REGR. vs. 152332
 test-arm64-arm64-xl-thunderx 13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl  13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl-xsm  13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl-credit1  13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-libvirt-xsm 13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl-credit2  13 debian-fixup fail REGR. vs. 152332
 test-amd64-amd64-i386-pvgrub 20 guest-stop   fail REGR. vs. 152332
 test-amd64-amd64-xl-qcow213 guest-start  fail REGR. vs. 152332
 test-armhf-armhf-xl-vhd  13 guest-start  fail REGR. vs. 152332

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 152332
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 152332
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152332
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152332
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 152332
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 

[qemu-mainline test] 162104: regressions - FAIL

2021-05-20 Thread osstest service owner
flight 162104 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/162104/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-freebsd11-amd64 16 guest-saverestore fail REGR. vs. 
152631
 test-amd64-i386-freebsd10-i386 16 guest-saverestore  fail REGR. vs. 152631
 test-amd64-i386-freebsd10-amd64 16 guest-saverestore fail REGR. vs. 152631
 test-amd64-amd64-qemuu-freebsd12-amd64 16 guest-saverestore fail REGR. vs. 
152631
 test-amd64-i386-xl-qemuu-debianhvm-amd64 15 guest-saverestore fail REGR. vs. 
152631
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 15 guest-saverestore fail 
REGR. vs. 152631
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 15 guest-saverestore fail REGR. 
vs. 152631
 test-amd64-i386-xl-qemuu-ovmf-amd64 15 guest-saverestore fail REGR. vs. 152631
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 15 guest-saverestore fail REGR. vs. 
152631
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 15 guest-saverestore fail 
REGR. vs. 152631
 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-saverestore fail REGR. vs. 152631
 test-amd64-amd64-xl-qemuu-ovmf-amd64 15 guest-saverestore fail REGR. vs. 152631
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 15 guest-saverestore fail REGR. 
vs. 152631
 test-amd64-amd64-qemuu-nested-intel 20 debian-hvm-install/l1/l2 fail REGR. vs. 
152631
 test-amd64-i386-xl-qemuu-win7-amd64 15 guest-saverestore fail REGR. vs. 152631
 test-amd64-i386-xl-qemuu-ws16-amd64 15 guest-saverestore fail REGR. vs. 152631
 test-amd64-amd64-xl-qemuu-ws16-amd64 15 guest-saverestore fail REGR. vs. 152631

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 20 guest-localmigrate/x10   fail REGR. vs. 152631

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152631
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 152631
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 152631
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 

[linux-linus test] 162103: regressions - FAIL

2021-05-20 Thread osstest service owner
flight 162103 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/162103/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-xsm7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. 
vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-examine   6 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-libvirt-xsm   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-coresched-i386-xl  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-freebsd10-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-pvshim 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-raw7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-shadow 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-freebsd10-i386  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. 
vs. 152332
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-amd64-libvirt-vhd 13 guest-start  fail REGR. vs. 152332
 test-arm64-arm64-xl-thunderx 13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl  13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-libvirt-xsm 13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl-credit2  13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl-credit1  14 guest-start  fail REGR. vs. 152332
 test-amd64-amd64-amd64-pvgrub 20 guest-stop  fail REGR. vs. 152332
 test-amd64-amd64-i386-pvgrub 20 guest-stop   fail REGR. vs. 152332
 test-amd64-amd64-xl-qcow213 guest-start  fail REGR. vs. 152332
 test-armhf-armhf-xl-vhd  13 guest-start  fail REGR. vs. 152332
 test-arm64-arm64-xl-xsm  14 guest-startfail in 162097 REGR. vs. 152332

Tests which are failing intermittently (not blocking):
 test-arm64-arm64-xl-credit1  13 debian-fixup fail in 162097 pass in 162103
 test-arm64-arm64-xl-xsm  13 debian-fixup   fail pass in 162097
 test-amd64-amd64-xl-credit1  22 guest-start/debian.repeat  fail pass in 162097
 test-amd64-amd64-xl-shadow   22 guest-start/debian.repeat  fail pass in 162097
 test-amd64-amd64-examine  4 memdisk-try-append fail pass in 162097

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 152332
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 152332
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152332
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152332
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 152332
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass

[xen-unstable test] 162102: tolerable FAIL - PUSHED

2021-05-20 Thread osstest service owner
flight 162102 xen-unstable real [real]
flight 162105 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/162102/
http://logs.test-lab.xenproject.org/osstest/logs/162105/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 14 
guest-start/debianhvm.repeat fail pass in 162105-retest

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds18 guest-start/debian.repeat fail REGR. vs. 162095

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 162095
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 162095
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 162095
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 162095
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 162095
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 162095
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 162095
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 162095
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 162095
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 162095
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 162095
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass

version targeted for testing:
 xen  aa77acc28098d04945af998f3fc0dbd3759b5b41
baseline version:
 xen  935abe1cc463917c697c1451ec8d313a5d75f7de

Last test of basis   162095  2021-05-19 16:38:54 Z1 days
Testing same since   162102  2021-05-20 07:39:10 Z0 

Re: [PATCH v2 1/2] xen-pciback: redo VF placement in the virtual topology

2021-05-20 Thread Boris Ostrovsky


On 5/20/21 10:46 AM, Jan Beulich wrote:
> On 20.05.2021 16:44, Jan Beulich wrote:
>> On 20.05.2021 16:38, Boris Ostrovsky wrote:
>>> On 5/20/21 3:43 AM, Jan Beulich wrote:
 On 20.05.2021 02:36, Boris Ostrovsky wrote:
> On 5/18/21 12:13 PM, Jan Beulich wrote:
>>  
>> @@ -95,22 +95,25 @@ static int __xen_pcibk_add_pci_dev(struc
>>  
>>  /*
>>   * Keep multi-function devices together on the virtual PCI bus, 
>> except
>> - * virtual functions.
>> + * that we want to keep virtual functions at func 0 on their 
>> own. They
>> + * aren't multi-function devices and hence their presence at 
>> func 0
>> + * may cause guests to not scan the other functions.
> So your reading of the original commit is that whatever the issue it was, 
> only function zero was causing the problem? In other words, you are not 
> concerned that pci_scan_slot() may now look at function 1 and skip all 
> higher-numbered function (assuming the problem is still there)?
 I'm not sure I understand the question: Whether to look at higher numbered
 slots is a function of slot 0's multi-function bit alone, aiui. IOW if
 slot 1 is being looked at in the first place, slots 2-7 should also be
 looked at.
>>>
>>> Wasn't the original patch describing a problem strictly as one for 
>>> single-function devices, so the multi-function bit is not set? I.e. if all 
>>> VFs (which are single-function devices) are placed in the same slot then 
>>> pci_scan_slot() would only look at function 0 and ignore anything 
>>> higher-numbered.
>>>
>>>
>>> My question is whether it would "only look at function 0 and ignore 
>>> anything higher-numbered" or "only look at the lowest-numbered function and 
>>> ignore anything higher-numbered".
>> The common scanning logic is to look at slot 0 first. If that's populated,
>> other slots get looked at only if slot 0 has the multi-function bit set.
>> If slot 0 is not populated, nothing is known about the other slots, and
>> hence they need to be scanned.
> In particular Linux'es next_fn() ends with
>
>   /* dev may be NULL for non-contiguous multifunction devices */
>   if (!dev || dev->multifunction)
>   return (fn + 1) % 8;
>
>   return 0;



Ah yes.


Reviewed-by: Boris Ostrovsky 




Re: Uses of /hypervisor memory range (was: FreeBSD/Xen/ARM issues)

2021-05-20 Thread Stefano Stabellini
On Thu, 20 May 2021, Julien Grall wrote:
> On 20/05/2021 00:25, Stefano Stabellini wrote:
> > On Sat, 15 May 2021, Julien Grall wrote:
> > > > My feeling is one of two things should happen with the /hypervisor
> > > > address range:
> > > > 
> > > > 1>  OSes could be encouraged to use it for all foreign mappings.  The
> > > > range should be dynamic in some fashion.  There could be a handy way to
> > > > allow configuring the amount of address space thus reserved.
> > > 
> > > In the context of XSA-300 and virtio on Xen on Arm, we discussed about
> > > providing a revion for foreign mapping. The main trouble here is figuring
> > > out
> > > of the size, if you mess it up then you may break all the PV drivers.
> > > 
> > > If the problem is finding space, then I would like to suggest a different
> > > approach (I think I may have discussed it with Andrew). Xen is in
> > > maintaining
> > > the P2M for the guest and therefore now where are the unallocated space.
> > > How
> > > about introducing a new hypercall to ask for "unallocated space"?
> > > 
> > > This would not work for older hypervisor but you could use the RAM instead
> > > (as
> > > Linux does). This is has drawbacks (e.g. shattering superpage, reducing
> > > the
> > > amount of memory usuable...), but for 1> you would also need a hack for
> > > older
> > > Xen.
> > 
> > I am starting to wonder if it would make sense to add a new device tree
> > binding to describe larger free region for foreign mapping rather then a
> > hypercall. It could be several GBs or even TBs in size. I like the idea
> > of having it device tree because after all this is static information.
> > I can see that a hypercall would also work and I am open to it but if
> > possible I think it would be better not to extend the hypercall
> > interface when there is a good alternative.
> 
> There are two issues with the Device-Tree approach:
>   1) This doesn't work on ACPI
>   2) It is not clear how to define the size of the region. An admin doesn't
> have the right information in hand to know how many mappings will be done (a
> backend may map multiple time the same page).
> 
> The advantage of the hypercall solution is the size is "virtually" unlimited
> and not based on a specific firmware.

Fair enough



Re: [PATCH 00/13] Intel Hardware P-States (HWP) support

2021-05-20 Thread Jan Beulich
On 03.05.2021 21:27, Jason Andryuk wrote:
> Open questions:
> 
> HWP defaults to enabled and running in balanced mode.  This is useful
> for testing, but should the old ACPI cpufreq driver remain the default?

As long as this new code is experimental, yes. But once it's deemed
stable and fully supported, I think on HWP-capable hardware the new
driver should be used by default.

> This series unilaterally enables Hardware Duty Cycling (HDC) which is
> another feature to autonomously powerdown things.  That is enabled if
> HWP is enabled.  Maybe that want to be configurable?

Probably; not even sure what the default would want to be.

Jan



Re: [PATCH v2 1/2] xen-pciback: redo VF placement in the virtual topology

2021-05-20 Thread Jan Beulich
On 20.05.2021 16:44, Jan Beulich wrote:
> On 20.05.2021 16:38, Boris Ostrovsky wrote:
>>
>> On 5/20/21 3:43 AM, Jan Beulich wrote:
>>> On 20.05.2021 02:36, Boris Ostrovsky wrote:
 On 5/18/21 12:13 PM, Jan Beulich wrote:
>  
> @@ -95,22 +95,25 @@ static int __xen_pcibk_add_pci_dev(struc
>  
>   /*
>* Keep multi-function devices together on the virtual PCI bus, except
> -  * virtual functions.
> +  * that we want to keep virtual functions at func 0 on their own. They
> +  * aren't multi-function devices and hence their presence at func 0
> +  * may cause guests to not scan the other functions.

 So your reading of the original commit is that whatever the issue it was, 
 only function zero was causing the problem? In other words, you are not 
 concerned that pci_scan_slot() may now look at function 1 and skip all 
 higher-numbered function (assuming the problem is still there)?
>>> I'm not sure I understand the question: Whether to look at higher numbered
>>> slots is a function of slot 0's multi-function bit alone, aiui. IOW if
>>> slot 1 is being looked at in the first place, slots 2-7 should also be
>>> looked at.
>>
>>
>> Wasn't the original patch describing a problem strictly as one for 
>> single-function devices, so the multi-function bit is not set? I.e. if all 
>> VFs (which are single-function devices) are placed in the same slot then 
>> pci_scan_slot() would only look at function 0 and ignore anything 
>> higher-numbered.
>>
>>
>> My question is whether it would "only look at function 0 and ignore anything 
>> higher-numbered" or "only look at the lowest-numbered function and ignore 
>> anything higher-numbered".
> 
> The common scanning logic is to look at slot 0 first. If that's populated,
> other slots get looked at only if slot 0 has the multi-function bit set.
> If slot 0 is not populated, nothing is known about the other slots, and
> hence they need to be scanned.

In particular Linux'es next_fn() ends with

/* dev may be NULL for non-contiguous multifunction devices */
if (!dev || dev->multifunction)
return (fn + 1) % 8;

return 0;

Jan



Re: [PATCH v2 1/2] xen-pciback: redo VF placement in the virtual topology

2021-05-20 Thread Jan Beulich
On 20.05.2021 16:38, Boris Ostrovsky wrote:
> 
> On 5/20/21 3:43 AM, Jan Beulich wrote:
>> On 20.05.2021 02:36, Boris Ostrovsky wrote:
>>> On 5/18/21 12:13 PM, Jan Beulich wrote:
  
 @@ -95,22 +95,25 @@ static int __xen_pcibk_add_pci_dev(struc
  
/*
 * Keep multi-function devices together on the virtual PCI bus, except
 -   * virtual functions.
 +   * that we want to keep virtual functions at func 0 on their own. They
 +   * aren't multi-function devices and hence their presence at func 0
 +   * may cause guests to not scan the other functions.
>>>
>>> So your reading of the original commit is that whatever the issue it was, 
>>> only function zero was causing the problem? In other words, you are not 
>>> concerned that pci_scan_slot() may now look at function 1 and skip all 
>>> higher-numbered function (assuming the problem is still there)?
>> I'm not sure I understand the question: Whether to look at higher numbered
>> slots is a function of slot 0's multi-function bit alone, aiui. IOW if
>> slot 1 is being looked at in the first place, slots 2-7 should also be
>> looked at.
> 
> 
> Wasn't the original patch describing a problem strictly as one for 
> single-function devices, so the multi-function bit is not set? I.e. if all 
> VFs (which are single-function devices) are placed in the same slot then 
> pci_scan_slot() would only look at function 0 and ignore anything 
> higher-numbered.
> 
> 
> My question is whether it would "only look at function 0 and ignore anything 
> higher-numbered" or "only look at the lowest-numbered function and ignore 
> anything higher-numbered".

The common scanning logic is to look at slot 0 first. If that's populated,
other slots get looked at only if slot 0 has the multi-function bit set.
If slot 0 is not populated, nothing is known about the other slots, and
hence they need to be scanned.

Jan



Re: [PATCH v2 1/2] xen-pciback: redo VF placement in the virtual topology

2021-05-20 Thread Boris Ostrovsky


On 5/20/21 3:43 AM, Jan Beulich wrote:
> On 20.05.2021 02:36, Boris Ostrovsky wrote:
>> On 5/18/21 12:13 PM, Jan Beulich wrote:
>>>  
>>> @@ -95,22 +95,25 @@ static int __xen_pcibk_add_pci_dev(struc
>>>  
>>> /*
>>>  * Keep multi-function devices together on the virtual PCI bus, except
>>> -* virtual functions.
>>> +* that we want to keep virtual functions at func 0 on their own. They
>>> +* aren't multi-function devices and hence their presence at func 0
>>> +* may cause guests to not scan the other functions.
>>
>> So your reading of the original commit is that whatever the issue it was, 
>> only function zero was causing the problem? In other words, you are not 
>> concerned that pci_scan_slot() may now look at function 1 and skip all 
>> higher-numbered function (assuming the problem is still there)?
> I'm not sure I understand the question: Whether to look at higher numbered
> slots is a function of slot 0's multi-function bit alone, aiui. IOW if
> slot 1 is being looked at in the first place, slots 2-7 should also be
> looked at.


Wasn't the original patch describing a problem strictly as one for 
single-function devices, so the multi-function bit is not set? I.e. if all VFs 
(which are single-function devices) are placed in the same slot then 
pci_scan_slot() would only look at function 0 and ignore anything 
higher-numbered.


My question is whether it would "only look at function 0 and ignore anything 
higher-numbered" or "only look at the lowest-numbered function and ignore 
anything higher-numbered".


-boris




Re: QEMU backport necessary for building with "recent" toolchain (on openSUSE Tumbleweed)

2021-05-20 Thread Anthony PERARD
On Tue, May 18, 2021 at 05:24:30PM +0200, Dario Faggioli wrote:
> Hello,
> 
> I think we need the following commit in our QEMU: bbd2d5a812077
> ("build: -no-pie is no functional linker flag").

Hi Dario,

I'm hoping to update qemu-xen to a newer version of QEMU (6.0) which
would have the fix, but that's need a fix of libxl,
"Fix libxl with QEMU 6.0 + remove some more deprecated usages."
So I would prefer to avoid adding more to the current branch.

The branch stable-4.15 already have the fix if you need, in the mean
time.

Thanks,

-- 
Anthony PERARD



[PATCH v2] x86emul: de-duplicate scatters to the same linear address

2021-05-20 Thread Jan Beulich
The SDM specifically allows for earlier writes to fully overlapping
ranges to be dropped. If a guest did so, hvmemul_phys_mmio_access()
would crash it if varying data was written to the same address. Detect
overlaps early, as doing so in hvmemul_{linear,phys}_mmio_access() would
be quite a bit more difficult. To maintain proper faulting behavior,
instead of dropping earlier write instances of fully overlapping slots
altogether, write the data of the final of these slots multiple times.
(We also can't pull ahead the [single] write of the data of the last of
the slots, clearing all involved slots' op_mask bits together, as this
would yield incorrect results if there were intervening partially
overlapping ones.)

Note that due to cache slot use being linear address based, there's no
similar issue with multiple writes to the same physical address (mapped
through different linear addresses).

Signed-off-by: Jan Beulich 
---
v2: Maintain correct faulting behavior.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -10073,15 +10073,36 @@ x86_emulate(
 
 for ( i = 0; op_mask; ++i )
 {
-long idx = b & 1 ? index.qw[i] : index.dw[i];
+long idx = (b & 1 ? index.qw[i]
+  : index.dw[i]) * (1 << state->sib_scale);
+unsigned long offs = truncate_ea(ea.mem.off + idx);
+unsigned int j, slot;
 
 if ( !(op_mask & (1 << i)) )
 continue;
 
-rc = ops->write(ea.mem.seg,
-truncate_ea(ea.mem.off +
-idx * (1 << state->sib_scale)),
-(void *)mmvalp + i * op_bytes, op_bytes, ctxt);
+/*
+ * hvmemul_linear_mmio_access() will find a cache slot based on
+ * linear address.  hvmemul_phys_mmio_access() will crash the
+ * domain if observing varying data getting written to the same
+ * cache slot.  Utilize that squashing earlier writes to fully
+ * overlapping addresses is permitted by the spec.  We can't,
+ * however, drop the writes altogether, to maintain correct
+ * faulting behavior.  Instead write the data from the last of
+ * the fully overlapping slots multiple times.
+ */
+for ( j = (slot = i) + 1; j < n; ++j )
+{
+long idx2 = (b & 1 ? index.qw[j]
+   : index.dw[j]) * (1 << state->sib_scale);
+
+if ( (op_mask & (1 << j)) &&
+ truncate_ea(ea.mem.off + idx2) == offs )
+slot = j;
+}
+
+rc = ops->write(ea.mem.seg, offs,
+(void *)mmvalp + slot * op_bytes, op_bytes, ctxt);
 if ( rc != X86EMUL_OKAY )
 {
 /* See comment in gather emulation. */



[qemu-mainline test] 162099: regressions - FAIL

2021-05-20 Thread osstest service owner
flight 162099 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/162099/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-freebsd11-amd64 16 guest-saverestore fail REGR. vs. 
152631
 test-amd64-i386-freebsd10-i386 16 guest-saverestore  fail REGR. vs. 152631
 test-amd64-i386-freebsd10-amd64 16 guest-saverestore fail REGR. vs. 152631
 test-amd64-amd64-qemuu-freebsd12-amd64 16 guest-saverestore fail REGR. vs. 
152631
 test-amd64-i386-xl-qemuu-debianhvm-amd64 15 guest-saverestore fail REGR. vs. 
152631
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 15 guest-saverestore fail 
REGR. vs. 152631
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 15 guest-saverestore fail REGR. 
vs. 152631
 test-amd64-i386-xl-qemuu-ovmf-amd64 15 guest-saverestore fail REGR. vs. 152631
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 15 guest-saverestore fail REGR. vs. 
152631
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 15 guest-saverestore fail 
REGR. vs. 152631
 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-saverestore fail REGR. vs. 152631
 test-amd64-amd64-xl-qemuu-ovmf-amd64 15 guest-saverestore fail REGR. vs. 152631
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 15 guest-saverestore fail REGR. 
vs. 152631
 test-amd64-amd64-qemuu-nested-intel 20 debian-hvm-install/l1/l2 fail REGR. vs. 
152631
 test-amd64-i386-xl-qemuu-win7-amd64 15 guest-saverestore fail REGR. vs. 152631
 test-amd64-i386-xl-qemuu-ws16-amd64 15 guest-saverestore fail REGR. vs. 152631
 test-amd64-amd64-xl-qemuu-ws16-amd64 15 guest-saverestore fail REGR. vs. 152631

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152631
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 152631
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 152631
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 

Re: Hand over of the Xen shared info page

2021-05-20 Thread Oleksandr Andrushchenko
Hi,

On 5/20/21 12:46 PM, Julien Grall wrote:
>
>
> On 20/05/2021 06:21, Oleksandr Andrushchenko wrote:
>> Hi, all!
>
> Hi,
>
>
>> On 5/20/21 2:11 AM, Stefano Stabellini wrote:
>>> On Wed, 19 May 2021, Julien Grall wrote:
 On 14/05/2021 10:50, Anastasiia Lukianenko wrote:
> Hi Julien!
 Hello,

> On Thu, 2021-05-13 at 09:37 +0100, Julien Grall wrote:
>> On 13/05/2021 09:03, Anastasiia Lukianenko wrote:
>> The alternative is for U-boot to go through the DT and infer which
>> regions are free (IOW any region not described).
> Thank you for interest in the problem and advice on how to solve it.
> Could you please clarify how we could find free regions using DT in U-
> boot?
 I don't know U-boot code, so I can't tell whether what I suggest would 
 work.

 In theory, the device-tree should described every region allocated in 
 address
 space. So if you parse the device-tree and create a list (or any
 datastructure) with the regions, then any range not present in the list 
 would
 be free region you could use.
>>> Yes, any "empty" memory region which is neither memory nor MMIO should
>>> work.
>>
>> You need to account on many things while creating the list of regions:
>>
>> device register mappings, reserved nodes, memory nodes, device tree
>>
>> overlays parsing etc. It all seem to be a not-that-trivial task and after
>>
>> all if implemented it only lives in U-boot and you have to maintain that
>>
>> code. So, if some other entity needs the same you need to implement
>>
>> that as well.
>
> Yes, there are some complexity. I have suggested other approach in a separate 
> thread. Did you look at them?

Yes, so probably we can re-use the solution from that thread when it comes to 
some conclusion

and implementation.

>
>> And also you can imagine a system without device tree at all...
> Xen doesn't provide a stable guest layout. Such system would have to be 
> tailored to a given guest configuration, Xen version (can be custom)...
>
> Therefore, hardcoding the value in the system (not in Xen headers!) is not 
> going to make it much worse.
Agree to some extent
>
>> So, I am not saying it is not possible to implement, but I just question if
>>
>> such an implementation is really a good way forward.
>>
>>>
>>>
 However, I realized a few days ago that the magic pages are not described 
 in
 the DT. We probably want to fix it by marking the page as "reserved" or 
 create
 a specific bindings.

 So you will need a specific quirk for them.
>>> It should also be possible to keep the shared info page allocated and
>>> simply pass the address to the kernel by adding the right node to device
>>> tree.
>> And then you need to modify your OS and this is not only Linux...
>>> To do that, we would have to add a description of the magic pages
>>> to device tree, which I think would be good to have in any case. In that
>>> case it would be best to add a proper binding for it under the "xen,xen"
>>> node.
>>
>> I would say that querying Xen for such a memory page seems much
>>
>> more cleaner and allows the guest OS either to continue using the existing
>>
>> method with memory allocation or using the page provided by Xen.
>
> IIUC your proposal, you are asking to add an hypercall to query which guest 
> physical region can be used for the shared info page.
>
> This may solve the problem you have at hand, but this is not scalable. There 
> are a few other regions which regions unallocated memory (e.g. grant table, 
> grant mapping, foreign memory map,).
>
> So if we were going to involve Xen, then I think it is better to have a 
> generic way to ask Xen for unallocated space.
Agree
>
> Cheers,
>
Thank you,

Oleksandr


Re: [PATCH v3 2/2] libelf: improve PVH elfnote parsing

2021-05-20 Thread Jan Beulich
On 20.05.2021 14:30, Roger Pau Monne wrote:
> Pass an hvm boolean parameter to the elf note parsing and checking
> routines, so that better checking can be done in case libelf is
> dealing with an hvm container.
> 
> elf_xen_note_check shouldn't return early unless PHYS32_ENTRY is set
> and the container is of type HVM, or else the loader and version
> checks would be avoided for kernels intended to be booted as PV but
> that also have PHYS32_ENTRY set.
> 
> Adjust elf_xen_addr_calc_check so that the virtual addresses are
> actually physical ones (by setting virt_base and elf_paddr_offset to
> zero) when the container is of type HVM, as that container is always
> started with paging disabled.
> 
> Signed-off-by: Roger Pau Monné 

With the one hunk moved to patch 1
Reviewed-by: Jan Beulich 

Jan



Re: [PATCH v3 1/2] libelf: don't attempt to parse __xen_guest for PVH

2021-05-20 Thread Jan Beulich
On 20.05.2021 14:30, Roger Pau Monne wrote:
> The legacy __xen_guest section doesn't support the PHYS32_ENTRY
> elfnote, so it's pointless to attempt to parse the elfnotes from that
> section when called from an hvm container.
> 
> Suggested-by: Jan Beulich 
> Signed-off-by: Roger Pau Monné 
> ---
> Changes since v2:
>  - New in this version.
> ---
>  xen/common/libelf/libelf-dominfo.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/common/libelf/libelf-dominfo.c 
> b/xen/common/libelf/libelf-dominfo.c
> index 69c94b6f3bb..abea1011c18 100644
> --- a/xen/common/libelf/libelf-dominfo.c
> +++ b/xen/common/libelf/libelf-dominfo.c
> @@ -577,10 +577,8 @@ elf_errorstatus elf_xen_parse(struct elf_binary *elf,
>  
>  }
>  
> -/*
> - * Finally fall back to the __xen_guest section.
> - */
> -if ( xen_elfnotes == 0 )
> +/* Finally fall back to the __xen_guest section for PV guests only. */
> +if ( xen_elfnotes == 0 && !hvm )

Isn't this depending on the 2nd patch adding the function parameter?
IOW doesn't this break the build, even if just transiently? With the
respective hunk from patch 2 moved here
Reviewed-by: Jan Beulich 

Jan



[PATCH v3 1/2] libelf: don't attempt to parse __xen_guest for PVH

2021-05-20 Thread Roger Pau Monne
The legacy __xen_guest section doesn't support the PHYS32_ENTRY
elfnote, so it's pointless to attempt to parse the elfnotes from that
section when called from an hvm container.

Suggested-by: Jan Beulich 
Signed-off-by: Roger Pau Monné 
---
Changes since v2:
 - New in this version.
---
 xen/common/libelf/libelf-dominfo.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/xen/common/libelf/libelf-dominfo.c 
b/xen/common/libelf/libelf-dominfo.c
index 69c94b6f3bb..abea1011c18 100644
--- a/xen/common/libelf/libelf-dominfo.c
+++ b/xen/common/libelf/libelf-dominfo.c
@@ -577,10 +577,8 @@ elf_errorstatus elf_xen_parse(struct elf_binary *elf,
 
 }
 
-/*
- * Finally fall back to the __xen_guest section.
- */
-if ( xen_elfnotes == 0 )
+/* Finally fall back to the __xen_guest section for PV guests only. */
+if ( xen_elfnotes == 0 && !hvm )
 {
 shdr = elf_shdr_by_name(elf, "__xen_guest");
 if ( ELF_HANDLE_VALID(shdr) )
-- 
2.31.1




[PATCH v3 2/2] libelf: improve PVH elfnote parsing

2021-05-20 Thread Roger Pau Monne
Pass an hvm boolean parameter to the elf note parsing and checking
routines, so that better checking can be done in case libelf is
dealing with an hvm container.

elf_xen_note_check shouldn't return early unless PHYS32_ENTRY is set
and the container is of type HVM, or else the loader and version
checks would be avoided for kernels intended to be booted as PV but
that also have PHYS32_ENTRY set.

Adjust elf_xen_addr_calc_check so that the virtual addresses are
actually physical ones (by setting virt_base and elf_paddr_offset to
zero) when the container is of type HVM, as that container is always
started with paging disabled.

Signed-off-by: Roger Pau Monné 
---
Changes since v2:
 - Do not print messages about VIRT_BASE/ELF_PADDR_OFFSET unset on
   HVM.
 - Make it explicit that elf_paddr_offset will be set to 0 regardless
   of elf_note_start value for HVM.

Changes since v1:
 - Expand comments.
 - Do not set virt_entry to phys_entry unless it's an HVM container.
---
 tools/fuzz/libelf/libelf-fuzzer.c   |  3 +-
 tools/libs/guest/xg_dom_elfloader.c |  6 ++--
 tools/libs/guest/xg_dom_hvmloader.c |  2 +-
 xen/arch/x86/hvm/dom0_build.c   |  2 +-
 xen/arch/x86/pv/dom0_build.c|  2 +-
 xen/common/libelf/libelf-dominfo.c  | 43 ++---
 xen/include/xen/libelf.h|  2 +-
 7 files changed, 37 insertions(+), 23 deletions(-)

diff --git a/tools/fuzz/libelf/libelf-fuzzer.c 
b/tools/fuzz/libelf/libelf-fuzzer.c
index 1ba85717114..84fb84720fa 100644
--- a/tools/fuzz/libelf/libelf-fuzzer.c
+++ b/tools/fuzz/libelf/libelf-fuzzer.c
@@ -17,7 +17,8 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
 return -1;
 
 elf_parse_binary(elf);
-elf_xen_parse(elf, );
+elf_xen_parse(elf, , false);
+elf_xen_parse(elf, , true);
 
 return 0;
 }
diff --git a/tools/libs/guest/xg_dom_elfloader.c 
b/tools/libs/guest/xg_dom_elfloader.c
index 06e713fe111..ad71163dd92 100644
--- a/tools/libs/guest/xg_dom_elfloader.c
+++ b/tools/libs/guest/xg_dom_elfloader.c
@@ -135,7 +135,8 @@ static elf_negerrnoval xc_dom_probe_elf_kernel(struct 
xc_dom_image *dom)
  * or else we might be trying to load a plain ELF.
  */
 elf_parse_binary();
-rc = elf_xen_parse(, dom->parms);
+rc = elf_xen_parse(, dom->parms,
+   dom->container_type == XC_DOM_HVM_CONTAINER);
 if ( rc != 0 )
 return rc;
 
@@ -166,7 +167,8 @@ static elf_negerrnoval xc_dom_parse_elf_kernel(struct 
xc_dom_image *dom)
 
 /* parse binary and get xen meta info */
 elf_parse_binary(elf);
-if ( elf_xen_parse(elf, dom->parms) != 0 )
+if ( elf_xen_parse(elf, dom->parms,
+   dom->container_type == XC_DOM_HVM_CONTAINER) != 0 )
 {
 rc = -EINVAL;
 goto out;
diff --git a/tools/libs/guest/xg_dom_hvmloader.c 
b/tools/libs/guest/xg_dom_hvmloader.c
index ec6ebad7fd5..3a63b23ba39 100644
--- a/tools/libs/guest/xg_dom_hvmloader.c
+++ b/tools/libs/guest/xg_dom_hvmloader.c
@@ -73,7 +73,7 @@ static elf_negerrnoval xc_dom_probe_hvm_kernel(struct 
xc_dom_image *dom)
  * else we might be trying to load a PV kernel.
  */
 elf_parse_binary();
-rc = elf_xen_parse(, dom->parms);
+rc = elf_xen_parse(, dom->parms, true);
 if ( rc == 0 )
 return -EINVAL;
 
diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c
index 878dc1d808e..c24b9efdb0a 100644
--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -561,7 +561,7 @@ static int __init pvh_load_kernel(struct domain *d, const 
module_t *image,
 elf_set_verbose();
 #endif
 elf_parse_binary();
-if ( (rc = elf_xen_parse(, )) != 0 )
+if ( (rc = elf_xen_parse(, , true)) != 0 )
 {
 printk("Unable to parse kernel for ELFNOTES\n");
 return rc;
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index e0801a9e6d1..af47615b226 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -353,7 +353,7 @@ int __init dom0_construct_pv(struct domain *d,
 elf_set_verbose();
 
 elf_parse_binary();
-if ( (rc = elf_xen_parse(, )) != 0 )
+if ( (rc = elf_xen_parse(, , false)) != 0 )
 goto out;
 
 /* compatibility check */
diff --git a/xen/common/libelf/libelf-dominfo.c 
b/xen/common/libelf/libelf-dominfo.c
index abea1011c18..24d1371dd7a 100644
--- a/xen/common/libelf/libelf-dominfo.c
+++ b/xen/common/libelf/libelf-dominfo.c
@@ -360,7 +360,7 @@ elf_errorstatus elf_xen_parse_guest_info(struct elf_binary 
*elf,
 /* sanity checks*/
 
 static elf_errorstatus elf_xen_note_check(struct elf_binary *elf,
-  struct elf_dom_parms *parms)
+  struct elf_dom_parms *parms, bool hvm)
 {
 if ( (ELF_PTRVAL_INVALID(parms->elf_note_start)) &&
  (ELF_PTRVAL_INVALID(parms->guest_info)) )
@@ -382,7 +382,7 @@ static elf_errorstatus 
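Judging from the diffstat and the updated call sites above, the xen/include/xen/libelf.h change presumably amounts to adding the new flag to the elf_xen_parse() prototype, roughly as follows (an assumption based on the call sites, not a quote from the patch):

    elf_errorstatus elf_xen_parse(struct elf_binary *elf,
                                  struct elf_dom_parms *parms, bool hvm);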

[PATCH v3 0/2] libelf: small fixes for PVH

2021-05-20 Thread Roger Pau Monne
Hello,

A couple of small fixes for PVH loading. The first one is likely not
very relevant, since PVH couldn't be booted anyway with the data in the
__xen_guest section, so it's mostly a cleanup.

Second patch fixes the checks for PVH loading, as in that case physical
addresses must always be used to perform the bound calculations.

Thanks, Roger.

Roger Pau Monne (2):
  libelf: don't attempt to parse __xen_guest for PVH
  libelf: improve PVH elfnote parsing

 tools/fuzz/libelf/libelf-fuzzer.c   |  3 +-
 tools/libs/guest/xg_dom_elfloader.c |  6 ++--
 tools/libs/guest/xg_dom_hvmloader.c |  2 +-
 xen/arch/x86/hvm/dom0_build.c   |  2 +-
 xen/arch/x86/pv/dom0_build.c|  2 +-
 xen/common/libelf/libelf-dominfo.c  | 49 +
 xen/include/xen/libelf.h|  2 +-
 7 files changed, 39 insertions(+), 27 deletions(-)

-- 
2.31.1




Re: [PATCH] x86/Xen: swap NX determination and GDT setup on BSP

2021-05-20 Thread Juergen Gross

On 20.05.21 14:08, Jan Beulich wrote:

On 20.05.2021 13:57, Juergen Gross wrote:

On 20.05.21 13:42, Jan Beulich wrote:

xen_setup_gdt(), via xen_load_gdt_boot(), wants to adjust page tables.
For this to work when NX is not available, x86_configure_nx() needs to
be called first.

Reported-by: Olaf Hering 
Signed-off-by: Jan Beulich 


Reviewed-by: Juergen Gross 


Thanks. I guess I forgot

Cc: sta...@vger.kernel.org

If you agree, can you please add this before pushing to Linus?


Yes, will do that.


Juergen





Re: [PATCH] x86/Xen: swap NX determination and GDT setup on BSP

2021-05-20 Thread Jan Beulich
On 20.05.2021 13:57, Juergen Gross wrote:
> On 20.05.21 13:42, Jan Beulich wrote:
>> xen_setup_gdt(), via xen_load_gdt_boot(), wants to adjust page tables.
>> For this to work when NX is not available, x86_configure_nx() needs to
>> be called first.
>>
>> Reported-by: Olaf Hering 
>> Signed-off-by: Jan Beulich 
> 
> Reviewed-by: Juergen Gross 

Thanks. I guess I forgot

Cc: sta...@vger.kernel.org

If you agree, can you please add this before pushing to Linus?

Jan



Re: [PATCH] x86/Xen: swap NX determination and GDT setup on BSP

2021-05-20 Thread Juergen Gross

On 20.05.21 13:42, Jan Beulich wrote:

xen_setup_gdt(), via xen_load_gdt_boot(), wants to adjust page tables.
For this to work when NX is not available, x86_configure_nx() needs to
be called first.

Reported-by: Olaf Hering 
Signed-off-by: Jan Beulich 


Reviewed-by: Juergen Gross 


Juergen




Re: [PATCH] xen-netback: correct success/error reporting for the SKB-with-fraglist case

2021-05-20 Thread Jan Beulich
On 25.02.2021 17:23, Paul Durrant wrote:
> On 25/02/2021 14:00, Jan Beulich wrote:
>> On 25.02.2021 13:11, Paul Durrant wrote:
>>> On 25/02/2021 07:33, Jan Beulich wrote:
 On 24.02.2021 17:39, Paul Durrant wrote:
> On 23/02/2021 16:29, Jan Beulich wrote:
>> When re-entering the main loop of xenvif_tx_check_gop() a 2nd time, the
>> special considerations for the head of the SKB no longer apply. Don't
>> mistakenly report ERROR to the frontend for the first entry in the list,
>> even if - from all I can tell - this shouldn't matter much as the overall
>> transmit will need to be considered failed anyway.
>>
>> Signed-off-by: Jan Beulich 
>>
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -499,7 +499,7 @@ check_frags:
>>   * the header's copy failed, and they 
>> are
>>   * sharing a slot, send an error
>>   */
>> -if (i == 0 && sharedslot)
>> +if (i == 0 && !first_shinfo && 
>> sharedslot)
>>  xenvif_idx_release(queue, 
>> pending_idx,
>> 
>> XEN_NETIF_RSP_ERROR);
>>  else
>>
>
> I think this will DTRT, but to my mind it would make more sense to clear
> 'sharedslot' before the 'goto check_frags' at the bottom of the function.

 That was my initial idea as well, but
 - I think it is for a reason that the variable is "const".
 - There is another use of it which would then instead need further
 amending (and which I believe is at least part of the reason for
 the variable to be "const").

>>>
>>> Oh, yes. But now that I look again, don't you want:
>>>
>>> if (i == 0 && first_shinfo && sharedslot)
>>>
>>> ? (i.e no '!')
>>>
>>> The comment states that the error should be indicated when the first
>>> frag contains the header in the case that the map succeeded but the
>>> prior copy from the same ref failed. This can only possibly be the case
>>> if this is the 'first_shinfo'
>>
>> I don't think so, no - there's a difference between "first frag"
>> (at which point first_shinfo is NULL) and first frag list entry
>> (at which point first_shinfo is non-NULL).
> 
> Yes, I realise I got it backwards. Confusing name but the comment above 
> its declaration does explain.
> 
>>
>>> (which is why I still think it is safe to unconst 'sharedslot' and
>>> clear it).
>>
>> And "no" here as well - this piece of code
>>
>>  /* First error: if the header haven't shared a slot with the
>>   * first frag, release it as well.
>>   */
>>  if (!sharedslot)
>>  xenvif_idx_release(queue,
>> XENVIF_TX_CB(skb)->pending_idx,
>> XEN_NETIF_RSP_OKAY);
>>
>> specifically requires sharedslot to have the value that was
>> assigned to it at the start of the function (this property
>> doesn't go away when switching from fragments to frag list).
>> Note also how it uses XENVIF_TX_CB(skb)->pending_idx, i.e. the
>> value the local variable pending_idx was set from at the start
>> of the function.
>>
> 
> True, we do have to deal with freeing up the header if the first map 
> error comes on the frag list.
> 
> Reviewed-by: Paul Durrant 

Since I've not seen this go into 5.13-rc, may I ask what the disposition
of this is?

Jan



[PATCH] x86/Xen: swap NX determination and GDT setup on BSP

2021-05-20 Thread Jan Beulich
xen_setup_gdt(), via xen_load_gdt_boot(), wants to adjust page tables.
For this to work when NX is not available, x86_configure_nx() needs to
be called first.

Reported-by: Olaf Hering 
Signed-off-by: Jan Beulich 

--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1273,16 +1273,16 @@ asmlinkage __visible void __init xen_sta
/* Get mfn list */
xen_build_dynamic_phys_to_machine();
 
+   /* Work out if we support NX */
+   get_cpu_cap(&boot_cpu_data);
+   x86_configure_nx();
+
/*
 * Set up kernel GDT and segment registers, mainly so that
 * -fstack-protector code can be executed.
 */
xen_setup_gdt(0);
 
-   /* Work out if we support NX */
-   get_cpu_cap(&boot_cpu_data);
-   x86_configure_nx();
-
/* Determine virtual and physical address sizes */
	get_cpu_address_sizes(&boot_cpu_data);
 



Re: Uses of /hypervisor memory range (was: FreeBSD/Xen/ARM issues)

2021-05-20 Thread Julien Grall

Hi Stefano,

On 20/05/2021 00:25, Stefano Stabellini wrote:

On Sat, 15 May 2021, Julien Grall wrote:

My feeling is one of two things should happen with the /hypervisor
address range:

1>  OSes could be encouraged to use it for all foreign mappings.  The
range should be dynamic in some fashion.  There could be a handy way to
allow configuring the amount of address space thus reserved.


In the context of XSA-300 and virtio on Xen on Arm, we discussed
providing a region for foreign mappings. The main trouble there is figuring out
the size: if you get it wrong, you may break all the PV drivers.

If the problem is finding space, then I would like to suggest a different
approach (I think I may have discussed it with Andrew). Xen is in charge of
maintaining the P2M for the guest and therefore knows where the unallocated
space is. How about introducing a new hypercall to ask for "unallocated space"?

This would not work for older hypervisors, but you could use RAM instead (as
Linux does). This has drawbacks (e.g. shattering superpages, reducing the
amount of memory usable...), but for 1> you would also need a hack for older
Xen.


I am starting to wonder if it would make sense to add a new device tree
binding to describe a larger free region for foreign mappings rather than a
hypercall. It could be several GBs or even TBs in size. I like the idea
of having it in device tree because after all this is static information.
I can see that a hypercall would also work and I am open to it, but if
possible I think it would be better not to extend the hypercall
interface when there is a good alternative.


There are two issues with the Device-Tree approach:
  1) This doesn't work on ACPI
  2) It is not clear how to define the size of the region. An admin 
doesn't have the right information in hand to know how many mappings 
will be done (a backend may map the same page multiple times).


The advantage of the hypercall solution is that the size is "virtually" 
unlimited and not tied to a specific firmware.


Cheers,

--
Julien Grall
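
To illustrate the kind of interface being discussed (everything below is
hypothetical -- no such hypercall or structure exists in Xen today), the
guest-facing side could be as simple as a query returning a guest-physical
range that Xen guarantees to be unpopulated:

    /* Hypothetical sketch only -- not an existing Xen interface. */
    struct xen_get_unallocated_space {
        uint64_t nr_frames;   /* IN: amount of space needed, in frames */
        uint64_t gfn_start;   /* OUT: start of an unpopulated GFN range */
    };

The appeal over a fixed device tree property is that the admin does not have
to guess an upper bound up front: the hypervisor hands out space from what it
already knows to be unallocated in the guest's P2M.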



Re: Hand over of the Xen shared info page

2021-05-20 Thread Julien Grall




On 20/05/2021 06:21, Oleksandr Andrushchenko wrote:

Hi, all!


Hi,



On 5/20/21 2:11 AM, Stefano Stabellini wrote:

On Wed, 19 May 2021, Julien Grall wrote:

On 14/05/2021 10:50, Anastasiia Lukianenko wrote:

Hi Julien!

Hello,


On Thu, 2021-05-13 at 09:37 +0100, Julien Grall wrote:

On 13/05/2021 09:03, Anastasiia Lukianenko wrote:
The alternative is for U-boot to go through the DT and infer which
regions are free (IOW any region not described).

Thank you for interest in the problem and advice on how to solve it.
Could you please clarify how we could find free regions using DT in U-
boot?

I don't know U-boot code, so I can't tell whether what I suggest would work.

In theory, the device-tree should describe every region allocated in the address
space. So if you parse the device-tree and create a list (or any
data structure) of the regions, then any range not present in the list would
be a free region you could use.

Yes, any "empty" memory region which is neither memory nor MMIO should
work.


You need to account for many things while creating the list of regions:
device register mappings, reserved nodes, memory nodes, device tree
overlay parsing, etc. It all seems to be a not-that-trivial task, and after
all, if implemented, it only lives in U-boot and you have to maintain that
code. So, if some other entity needs the same, you need to implement
that as well.


Yes, there is some complexity. I have suggested other approaches in a 
separate thread. Did you look at them?



And also you can imagine a system without device tree at all...
Xen doesn't provide a stable guest layout. Such a system would have to be 
tailored to a given guest configuration and Xen version (which can be custom)...


Therefore, hardcoding the value in the system (not in Xen headers!) is 
not going to make it much worse.



So, I am not saying it is not possible to implement, but I just question if

such an implementation is really a good way forward.





However, I realized a few days ago that the magic pages are not described in
the DT. We probably want to fix that by marking the pages as "reserved" or by
creating a specific binding.

So you will need a specific quirk for them.

It should also be possible to keep the shared info page allocated and
simply pass the address to the kernel by adding the right node to device
tree.

And then you need to modify your OS and this is not only Linux...

To do that, we would have to add a description of the magic pages
to device tree, which I think would be good to have in any case. In that
case it would be best to add a proper binding for it under the "xen,xen"
node.


I would say that querying Xen for such a memory page seems much
cleaner, and allows the guest OS either to continue using the existing
method of memory allocation or to use the page provided by Xen.


IIUC your proposal, you are asking to add a hypercall to query which 
guest physical region can be used for the shared info page.


This may solve the problem you have at hand, but this is not scalable. 
There are a few other regions which require unallocated memory (e.g. 
the grant table, grant mappings, foreign memory mappings, ...).


So if we were going to involve Xen, then I think it is better to have a 
generic way to ask Xen for unallocated space.


Cheers,

--
Julien Grall



Re: [PATCH 03/10] xen/arm: introduce PGC_reserved

2021-05-20 Thread Julien Grall

Hi Jan,

On 20/05/2021 10:27, Jan Beulich wrote:

On 20.05.2021 10:59, Julien Grall wrote:

On 20/05/2021 09:40, Penny Zheng wrote:

Oh, Second thought on this.
And I think you are referring to balloon in/out here, hmm, also, like


Yes I am referring to balloon in/out.


I replied there:
"For issues on ballooning out or in, it is not supported here.


So long you are not using the solution in prod then you are fine (see
below)... But then we should make clear this feature is a tech preview.


Domain on Static Allocation and 1:1 direct-map are all based on
dom0-less right now, so no PV, grant table, event channel, etc, considered.

Right now, it only supports device got passthrough into the guest."


So we are not creating the hypervisor node in the DT for dom0less domU.
However, the hypercalls are still accessible by a domU if it really
wants to use them.

Therefore, a guest can easily mess up with your static configuration and
predictability.

IMHO, this is a must to solve before "static memory" can be used in
production.


I'm having trouble seeing why it can't be addressed right away: 


It can be solved right away. Dom0less domUs don't officially know they 
are running on Xen (they could bruteforce it though), so I think it 
would be fine to merge without it for a tech preview.



Such
guests can balloon in only after they've ballooned out some pages,
and such balloon-in requests would be satisfied from the same static
memory that is associated by the guest anyway.


This would require some bookkeeping to know the page was used previously, 
but nothing very challenging.


Cheers,

--
Julien Grall



Re: regression in recent pvops kernels, dom0 crashes early

2021-05-20 Thread Olaf Hering
On Thu, 20 May 2021 09:45:03 +0200,
Olaf Hering wrote:

> Am Thu, 20 May 2021 09:03:34 +0200
> schrieb Jan Beulich :
> 
> > Just to be sure - you did not need the other patch that I said I suspect
> > is needed as a prereq?  
> Yes, I needed just this single patch which moves x86_configure_nx up.

I tried the very same approach with the SLE12-SP4-LTSS branch, which also fixed 
dom0 boot.


Olaf




Re: [PATCH 04/10] xen/arm: static memory initialization

2021-05-20 Thread Jan Beulich
On 20.05.2021 11:04, Penny Zheng wrote:
> Hi Jan
> 
>> -Original Message-
>> From: Jan Beulich 
>> Sent: Tuesday, May 18, 2021 6:43 PM
>> To: Penny Zheng 
>> Cc: Bertrand Marquis ; Wei Chen
>> ; nd ; xen-devel@lists.xenproject.org;
>> sstabell...@kernel.org; jul...@xen.org
>> Subject: Re: [PATCH 04/10] xen/arm: static memory initialization
>>
>> On 18.05.2021 11:51, Penny Zheng wrote:
 From: Jan Beulich 
 Sent: Tuesday, May 18, 2021 3:16 PM

 On 18.05.2021 07:21, Penny Zheng wrote:
> This patch introduces static memory initialization, during system
> RAM boot
 up.
>
> New func init_staticmem_pages is the equivalent of init_heap_pages,
> responsible for static memory initialization.
>
> Helper func free_staticmem_pages is the equivalent of
> free_heap_pages, to free nr_pfns pages of static memory.
> For each page, it includes the following steps:
> 1. change page state from in-use(also initialization state) to free
> state and grant PGC_reserved.
> 2. set its owner NULL and make sure this page is not a guest frame
> any more

 But isn't the goal (as per the previous patch) to associate such
 pages with a _specific_ domain?

>>>
>>> Free_staticmem_pages are alike free_heap_pages, are not used only for
>> initialization.
>>> Freeing used pages to unused are also included.
>>> Here, setting its owner NULL is to set owner in used field NULL.
>>
>> I'm afraid I still don't understand.
>>
> 
> When initializing heap, xen is using free_heap_pages to do the initialization.
> And also when normal domain get destroyed/rebooted, xen is using 
> free_domheap_pages,
> which calls free_heap_pages to free the pages.
> 
> So here, since free_staticmem_pages is the equivalent of the free_heap_pages 
> for static
> memory, I'm also considering both two scenarios here. And if it is domain get 
> destroyed/rebooted,
> Page state is in inuse state(PGC_inuse), and the page_info.v.inuse.domain is 
> holding the
> domain owner.
> When freeing then, we need to switch the page state to free state(PGC_free) 
> and set its inuse owner to NULL.

Perhaps my confusion comes from your earlier outline missing

3. re-associate the page with the domain (as represented in free
   pages)

The property of "designated for Dom" should never go away, if I
understand the overall proposal correctly.

Jan
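
For illustration only, the outline under discussion (with the third step Jan
mentions above) might look roughly like the sketch below. This is not the code
from the series; PGC_reserved is the flag introduced by patch 03/10, and the
bookkeeping that keeps a page "designated for" its domain is only hinted at:

    /* Illustrative sketch only -- not the code from the series. */
    static void free_staticmem_pages(struct page_info *pg, unsigned long nr_pfns,
                                     bool need_scrub)
    {
        unsigned long i;

        for ( i = 0; i < nr_pfns; i++ )
        {
            if ( need_scrub )
                scrub_one_page(&pg[i]);

            /* 1. Switch from the in-use state to free, keeping PGC_reserved. */
            pg[i].count_info &= ~PGC_state;
            pg[i].count_info |= PGC_state_free | PGC_reserved;

            /*
             * 2. The page is no longer a guest frame: clear the in-use owner.
             * 3. The "designated for this domain" property would still need to
             *    be recorded elsewhere, e.g. on the domain's reserved page list.
             */
            page_set_owner(&pg[i], NULL);
        }
    }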



Re: [PATCH v2] libelf: improve PVH elfnote parsing

2021-05-20 Thread Jan Beulich
On 20.05.2021 11:27, Roger Pau Monné wrote:
> On Wed, May 19, 2021 at 12:34:19PM +0200, Jan Beulich wrote:
>> On 18.05.2021 16:47, Roger Pau Monne wrote:
>>> @@ -425,8 +425,11 @@ static elf_errorstatus elf_xen_addr_calc_check(struct 
>>> elf_binary *elf,
>>>  return -1;
>>>  }
>>>  
>>> -/* Initial guess for virt_base is 0 if it is not explicitly defined. */
>>> -if ( parms->virt_base == UNSET_ADDR )
>>> +/*
>>> + * Initial guess for virt_base is 0 if it is not explicitly defined in 
>>> the
>>> + * PV case. For PVH virt_base is forced to 0 because paging is 
>>> disabled.
>>> + */
>>> +if ( parms->virt_base == UNSET_ADDR || hvm )
>>>  {
>>>  parms->virt_base = 0;
>>>  elf_msg(elf, "ELF: VIRT_BASE unset, using %#" PRIx64 "\n",
>>
>> This message is wrong now if hvm is true but parms->virt_base != UNSET_ADDR.
>> Best perhaps is to avoid emitting the message altogether when hvm is true.
>> (Since you'll be touching it anyway, perhaps a good opportunity to do away
>> with passing parms->virt_base to elf_msg(), and instead simply use a literal
>> zero.)
>>
>>> @@ -441,8 +444,10 @@ static elf_errorstatus elf_xen_addr_calc_check(struct 
>>> elf_binary *elf,
>>>   *
>>>   * If we are using the modern ELF notes interface then the default
>>>   * is 0.
>>> + *
>>> + * For PVH this is forced to 0, as it's already a legacy option for PV.
>>>   */
>>> -if ( parms->elf_paddr_offset == UNSET_ADDR )
>>> +if ( parms->elf_paddr_offset == UNSET_ADDR || hvm )
>>>  {
>>>  if ( parms->elf_note_start )
>>
>> Don't you want "|| hvm" here as well, or alternatively suppress the
>> fallback to the __xen_guest section in the PVH case (near the end of
>> elf_xen_parse())?
> 
> The legacy __xen_guest section doesn't support PHYS32_ENTRY, so yes,
> that part could be completely skipped when called from an HVM
> container.
> 
> I think I will fix that in another patch though if you are OK, as
> it's not strictly related to the calculation fixes done here.

That's fine; it wants to be a prereq to the one here then, though,
I think.

Jan
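
Putting the two remarks together, the adjusted virt_base hunk might end up
looking roughly like this (a sketch of the suggestion, not the committed
change):

    /*
     * Initial guess for virt_base is 0 if it is not explicitly defined in the
     * PV case. For PVH virt_base is forced to 0 because paging is disabled.
     */
    if ( parms->virt_base == UNSET_ADDR || hvm )
    {
        parms->virt_base = 0;
        if ( !hvm )
            elf_msg(elf, "ELF: VIRT_BASE unset, using 0\n");
    }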



Re: [PATCH v2] libelf: improve PVH elfnote parsing

2021-05-20 Thread Roger Pau Monné
On Wed, May 19, 2021 at 12:34:19PM +0200, Jan Beulich wrote:
> On 18.05.2021 16:47, Roger Pau Monne wrote:
> > @@ -425,8 +425,11 @@ static elf_errorstatus elf_xen_addr_calc_check(struct 
> > elf_binary *elf,
> >  return -1;
> >  }
> >  
> > -/* Initial guess for virt_base is 0 if it is not explicitly defined. */
> > -if ( parms->virt_base == UNSET_ADDR )
> > +/*
> > + * Initial guess for virt_base is 0 if it is not explicitly defined in 
> > the
> > + * PV case. For PVH virt_base is forced to 0 because paging is 
> > disabled.
> > + */
> > +if ( parms->virt_base == UNSET_ADDR || hvm )
> >  {
> >  parms->virt_base = 0;
> >  elf_msg(elf, "ELF: VIRT_BASE unset, using %#" PRIx64 "\n",
> 
> This message is wrong now if hvm is true but parms->virt_base != UNSET_ADDR.
> Best perhaps is to avoid emitting the message altogether when hvm is true.
> (Since you'll be touching it anyway, perhaps a good opportunity to do away
> with passing parms->virt_base to elf_msg(), and instead simply use a literal
> zero.)
> 
> > @@ -441,8 +444,10 @@ static elf_errorstatus elf_xen_addr_calc_check(struct 
> > elf_binary *elf,
> >   *
> >   * If we are using the modern ELF notes interface then the default
> >   * is 0.
> > + *
> > + * For PVH this is forced to 0, as it's already a legacy option for PV.
> >   */
> > -if ( parms->elf_paddr_offset == UNSET_ADDR )
> > +if ( parms->elf_paddr_offset == UNSET_ADDR || hvm )
> >  {
> >  if ( parms->elf_note_start )
> 
> Don't you want "|| hvm" here as well, or alternatively suppress the
> fallback to the __xen_guest section in the PVH case (near the end of
> elf_xen_parse())?

The legacy __xen_guest section doesn't support PHYS32_ENTRY, so yes,
that part could be completely skipped when called from an HVM
container.

I think I will fix that in another patch though if you are OK, as
it's not strictly related to the calculation fixes done here.

Thanks, Roger.



Re: [PATCH 03/10] xen/arm: introduce PGC_reserved

2021-05-20 Thread Jan Beulich
On 20.05.2021 10:59, Julien Grall wrote:
> On 20/05/2021 09:40, Penny Zheng wrote:
>> Oh, Second thought on this.
>> And I think you are referring to balloon in/out here, hmm, also, like
> 
> Yes I am referring to balloon in/out.
> 
>> I replied there:
>> "For issues on ballooning out or in, it is not supported here.
> 
> So long you are not using the solution in prod then you are fine (see 
> below)... But then we should make clear this feature is a tech preview.
> 
>> Domain on Static Allocation and 1:1 direct-map are all based on
>> dom0-less right now, so no PV, grant table, event channel, etc, considered.
>>
>> Right now, it only supports device got passthrough into the guest."
> 
> So we are not creating the hypervisor node in the DT for dom0less domU. 
> However, the hypercalls are still accessible by a domU if it really
> wants to use them.
> 
> Therefore, a guest can easily mess up with your static configuration and 
> predictability.
> 
> IMHO, this is a must to solve before "static memory" can be used in 
> production.

I'm having trouble seeing why it can't be addressed right away: Such
guests can balloon in only after they've ballooned out some pages,
and such balloon-in requests would be satisfied from the same static
memory that is associated by the guest anyway.

Jan



Re: [PATCH] Xen: Design doc for 1:1 direct-map and static allocation

2021-05-20 Thread Julien Grall




On 20/05/2021 06:36, Penny Zheng wrote:

Hi Julien


Hi Penny,


+
+Later, when domain get destroyed and memory relinquished, only pages
+in `page_list` go back to heap, and pages in `reserved_page_list` shall not.


While going through the series, I could not find any code implementing this.
However, this is not enough to prevent a page from going to the heap allocator
because a domain can release memory at runtime using hypercalls like
XENMEM_remove_from_physmap.

One of the use case is when the guest decides to balloon out some memory.
This will call free_domheap_pages().

Effectively, you are treating static memory as domheap pages. So I think it
would be better if you hook in free_domheap_pages() to decide which
allocator is used.

Now, if a guest can balloon out memory, it can also balloon in memory.
There are two cases:
 1) The region used to be RAM region statically allocated
 2) The region used to be unallocated.

I think for 1), we need to be able to re-use the page previously. For 2), it is
not clear to me whether a guest with memory statically allocated should be
allowed to allocate "dynamic" pages.



Yeah, I agree with you on hooking into free_domheap_pages(). I'm thinking
that for pages with PGC_reserved, we may create a new function free_staticmem_pages to
free them.

For issues on ballooning out or in, it is not supported here.


It is fine that the implementation doesn't yet implement it. However, I 
think the design document should take into account ballooning. This is 
because even if...



Domain on Static Allocation and 1:1 direct-map are all based on dom0-less right
now, so no PV, grant table, event channel, etc, considered.


... there is no PV support & co, a guest is still able to issue 
hypercalls (they are not hidden). Therefore your guest will be able to 
disturb your static allocation.




Right now, it only supports device got passthrough into the guest.


+### Memory Allocation for Domains on Static Allocation
+
+RAM regions pre-defined as static memory for one specifc domain shall
+be parsed and reserved from the beginning. And they shall never go to
+any memory allocator for any use.


Technically, you are introducing a new allocator. So do you mean they should
be given to neither the buddy allocator nor the boot allocator?



Yes. These pre-defined RAM regions will not be given to any current
memory allocator. If they were given there, there would be no guarantee that
they would not be allocated for other uses.

And right now, in my current design, these pre-defined RAM regions are either 
for
one specific domain as guest RAM or as XEN heap.
   

+
+Later when allocating static memory for this specific domain, after
+acquiring those reserved regions, users need to do a set of
+verifications before assigning.
+For each page there, it at least includes the following steps:
+1. Check if it is in free state and has zero reference count.
+2. Check if the page is reserved(`PGC_reserved`).
+
+Then, assigning these pages to this specific domain, and all pages go
+to one new linked page list `reserved_page_list`.
+
+At last, set up guest P2M mapping. By default, it shall be mapped to
+the fixed guest RAM address `GUEST_RAM0_BASE`, `GUEST_RAM1_BASE`,
+just like normal domains. But later in 1:1 direct-map design, if
+`direct-map` is set, the guest physical address will equal to physical

address.

+
+### Static Allocation for Xen itself
+
+### New Deivce Tree Node: `xen,reserved_heap`


s/Deivce/Device/



Thx.


+
+Static memory for Xen heap refers to parts of RAM reserved in the
+beginning for Xen heap only. The memory is pre-defined through XEN
+configuration using physical address ranges.
+
+The reserved memory for Xen heap is an optional feature and can be
+enabled by adding a device tree property in the `chosen` node.
+Currently, this feature is only supported on AArch64.
+
+Here is one example:
+
+
+chosen {
+xen,reserved-heap = <0x0 0x3000 0x0 0x4000>;
+...
+};
+
+RAM at 0x3000 of 1G size will be reserved as heap memory. Later,
+heap allocator will allocate memory only from this specific region.


This section is quite confusing. I think we need to clearly differentiate heap 
vs
allocator.

In Xen we have two heaps:
 1) Xen heap: It is always mapped in Xen virtual address space. This is
mainly used for xen internal allocation.
 2) Domain heap: It may not always be mapped in Xen virtual address space.
This is mainly used for domain memory and mapped on-demand.

For Arm64 (and x86), two heaps are allocated from the same region. But on
Arm32, they are different.

We also have two allocator:
 1) Boot allocator: This is used during boot only. There is no concept of
heap at this time.
 2) Buddy allocator: This is the current runtime allocator. This can
allocate from either heap.

AFAICT, this design is introducing a 3rd allocator that will return domain heap
pages.

Now, back to this section, are you saying you 

Re: [PATCH v2 2/2] automation: fix dependencies on openSUSE Tumbleweed containers

2021-05-20 Thread Dario Faggioli
On Wed, 2021-05-19 at 20:25 +0100, Andrew Cooper wrote:
> 
> I built the container locally, which reused some of the layers you
> pushed, and it all pushed successfully for me.
> 
> I've committed this series so xen.git matches reality.  Lets see how
> the
> updated container fairs...
> 
Well, something still looks off, although I don't know what.

In fact, we still see: 
https://gitlab.com/xen-project/xen/-/jobs/1277448985

checking for deflateCopy in -lz... no
configure: error: Could not find zlib
configure: error: ./configure failed for tools

Which means we're missing libz.

In fact, if I pull the container that's currently in the registry, I
can see that:

dario@b17aaa86cacf:~> rpm -qa|grep zlib   
zlib-devel-1.2.11-18.3.x86_64 

However, if I build it locally, with:

dario@Solace:~/src/xen/xen.git/automation/build> make suse/opensuse-tumbleweed

And then enter and use it for building, the package is there and
configure works.

dario@51e463d1d47e:~> rpm -qa|grep libz
libzstd1-1.4.9-1.2.x86_64
libz1-1.2.11-18.3.x86_64
libzck1-1.1.11-1.1.x86_64
libzypp-17.25.10-1.1.x86_64
libzio1-1.06-4.26.x86_64
libzzip-0-13-0.13.72-1.2.x86_64
libzstd-devel-1.4.9-1.2.x86_64

dario@51e463d1d47e:~> ./configure
checking build system type... x86_64-pc-linux-gnu 
checking host system type... x86_64-pc-linux-gnu
... ... ...
checking for pandoc... /usr/bin/pandoc
checking for perl... /usr/bin/perl
configure: creating ./config.status

So, what am I missing or doing wrong?

Thanks and Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
---
<> (Raistlin Majere)




[linux-linus test] 162097: regressions - FAIL

2021-05-20 Thread osstest service owner
flight 162097 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/162097/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-xsm7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. 
vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-examine   6 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-libvirt-xsm   7 xen-install  fail REGR. vs. 152332
 test-amd64-coresched-i386-xl  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-freebsd10-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-pvshim 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-raw7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-shadow 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-freebsd10-i386  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. 
vs. 152332
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-amd64-libvirt-vhd 13 guest-start  fail REGR. vs. 152332
 test-arm64-arm64-xl-thunderx 13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl  13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl-xsm  14 guest-start  fail REGR. vs. 152332
 test-arm64-arm64-xl-credit1  13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-libvirt-xsm 13 debian-fixup fail REGR. vs. 152332
 test-arm64-arm64-xl-credit2  13 debian-fixup fail REGR. vs. 152332
 test-amd64-amd64-amd64-pvgrub 20 guest-stop  fail REGR. vs. 152332
 test-amd64-amd64-i386-pvgrub 20 guest-stop   fail REGR. vs. 152332
 test-amd64-amd64-xl-qcow213 guest-start  fail REGR. vs. 152332
 test-armhf-armhf-xl-vhd  13 guest-start  fail REGR. vs. 152332

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 152332
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 152332
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152332
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152332
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 152332
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 

RE: [PATCH 04/10] xen/arm: static memory initialization

2021-05-20 Thread Penny Zheng
Hi Jan

> -Original Message-
> From: Jan Beulich 
> Sent: Tuesday, May 18, 2021 6:43 PM
> To: Penny Zheng 
> Cc: Bertrand Marquis ; Wei Chen
> ; nd ; xen-devel@lists.xenproject.org;
> sstabell...@kernel.org; jul...@xen.org
> Subject: Re: [PATCH 04/10] xen/arm: static memory initialization
> 
> On 18.05.2021 11:51, Penny Zheng wrote:
> >> From: Jan Beulich 
> >> Sent: Tuesday, May 18, 2021 3:16 PM
> >>
> >> On 18.05.2021 07:21, Penny Zheng wrote:
> >>> This patch introduces static memory initialization, during system
> >>> RAM boot
> >> up.
> >>>
> >>> New func init_staticmem_pages is the equivalent of init_heap_pages,
> >>> responsible for static memory initialization.
> >>>
> >>> Helper func free_staticmem_pages is the equivalent of
> >>> free_heap_pages, to free nr_pfns pages of static memory.
> >>> For each page, it includes the following steps:
> >>> 1. change page state from in-use(also initialization state) to free
> >>> state and grant PGC_reserved.
> >>> 2. set its owner NULL and make sure this page is not a guest frame
> >>> any more
> >>
> >> But isn't the goal (as per the previous patch) to associate such
> >> pages with a _specific_ domain?
> >>
> >
> > Free_staticmem_pages are alike free_heap_pages, are not used only for
> initialization.
> > Freeing used pages to unused are also included.
> > Here, setting its owner NULL is to set owner in used field NULL.
> 
> I'm afraid I still don't understand.
> 

When initializing the heap, Xen uses free_heap_pages to do the initialization.
Also, when a normal domain gets destroyed/rebooted, Xen uses free_domheap_pages,
which calls free_heap_pages to free the pages.

So here, since free_staticmem_pages is the equivalent of free_heap_pages for
static memory, I'm considering both scenarios. If a domain gets
destroyed/rebooted, the page state is the in-use state (PGC_inuse), and
page_info.v.inuse.domain is holding the domain owner.
When freeing, we then need to switch the page state to the free state (PGC_free)
and set its in-use owner to NULL.

I'll explain it more clearly in the commit message.

> > Still, I need to add more explanation here.
> 
> Yes please.
> 
> >>> @@ -1512,6 +1515,49 @@ static void free_heap_pages(
> >>>  spin_unlock(_lock);
> >>>  }
> >>>
> >>> +/* Equivalent of free_heap_pages to free nr_pfns pages of static
> >>> +memory. */ static void free_staticmem_pages(struct page_info *pg,
> >> unsigned long nr_pfns,
> >>> + bool need_scrub)
> >>
> >> Right now this function gets called only from an __init one. Unless
> >> it is intended to gain further callers, it should be marked __init itself 
> >> then.
> >> Otherwise it should be made sure that other architectures don't
> >> include this (dead there) code.
> >>
> >
> > Sure, I'll add __init. Thx.
> 
> Didn't I see a 2nd call to the function later in the series? That one doesn't
> permit the function to be __init, iirc.
> 
> Jan

Cheers

Penny


Re: [PATCH 03/10] xen/arm: introduce PGC_reserved

2021-05-20 Thread Julien Grall




On 20/05/2021 09:40, Penny Zheng wrote:

Hi julien


Hi Penny,




-Original Message-
From: Penny Zheng
Sent: Thursday, May 20, 2021 2:20 PM
To: Julien Grall ; xen-devel@lists.xenproject.org;
sstabell...@kernel.org
Cc: Bertrand Marquis ; Wei Chen
; nd 
Subject: RE: [PATCH 03/10] xen/arm: introduce PGC_reserved

Hi Julien


-Original Message-
From: Julien Grall 
Sent: Thursday, May 20, 2021 3:46 AM
To: Penny Zheng ; xen-devel@lists.xenproject.org;
sstabell...@kernel.org
Cc: Bertrand Marquis ; Wei Chen
; nd 
Subject: Re: [PATCH 03/10] xen/arm: introduce PGC_reserved



On 19/05/2021 04:16, Penny Zheng wrote:

Hi Julien


Hi Penny,




-Original Message-
From: Julien Grall 
Sent: Tuesday, May 18, 2021 5:46 PM
To: Penny Zheng ;
xen-devel@lists.xenproject.org; sstabell...@kernel.org
Cc: Bertrand Marquis ; Wei Chen
; nd 
Subject: Re: [PATCH 03/10] xen/arm: introduce PGC_reserved



On 18/05/2021 06:21, Penny Zheng wrote:

In order to differentiate pages of static memory, from those
allocated from heap, this patch introduces a new page flag
PGC_reserved

to tell.


New struct reserved in struct page_info is to describe reserved
page info, that is, which specific domain this page is reserved
to. > Helper page_get_reserved_owner and page_set_reserved_owner
are designated to get/set reserved page's owner.

Struct domain is enlarged to more than PAGE_SIZE, due to
newly-imported struct reserved in struct page_info.


struct domain may embed a pointer to a struct page_info but never
directly embed the structure. So can you clarify what you mean?



Signed-off-by: Penny Zheng 
---
xen/include/asm-arm/mm.h | 16 +++-
1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h

index

0b7de3102e..d8922fd5db 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -88,7 +88,15 @@ struct page_info
 */
u32 tlbflush_timestamp;
};
-u64 pad;
+
+/* Page is reserved. */
+struct {
+/*
+ * Reserved Owner of this page,
+ * if this page is reserved to a specific domain.
+ */
+struct domain *domain;
+} reserved;


The space in page_info is quite tight, so I would like to avoid
introducing new fields unless we can't get away from it.

In this case, it is not clear why we need to differentiate the "Owner"
vs the "Reserved Owner". It might be clearer if this change is
folded in the first user of the field.

As an aside, for 32-bit Arm, you need to add a 4-byte padding.



Yeah, I may delete this change. I imported this change as
considering the functionality of rebooting domain on static allocation.

A little more discussion on rebooting domain on static allocation.
Considering the major user cases for domain on static allocation are
system has a total pre-defined, static behavior all the time. No
domain allocation on runtime, while there still exists domain rebooting.


Hmmm... With this series it is still possible to allocate memory at
runtime outside of the static allocation (see my comment on the design

document [1]).

So is it meant to be complete?



I'm guessing that the "allocate memory at runtime outside of the static
allocation" is referring to XEN heap on static allocation, that is, users pre-
defining heap in device tree configuration to let the whole system more static
and predictable.

And I've replied you in the design there, sorry for the late reply. Save your 
time,
and I’ll paste here:

"Right now, on AArch64, all RAM, except reserved memory, will be finally
given to buddy allocator as heap,  like you said, guest RAM for normal domain
will be allocated from there, xmalloc eventually is get memory from there, etc.
So we want to refine the heap here, not iterating through `bootinfo.mem` to
set up XEN heap, but like iterating `bootinfo. reserved_heap` to set up XEN
heap.

On ARM32, xen heap and domain heap are separately mapped, which is more
complicated here. That's why I only talking about implementing these features
on AArch64 as first step."

  Above implementation will be delivered through another patch Serie. This
patch Serie Is only focusing on Domain on Static Allocation.



Oh, Second thought on this.
And I think you are referring to balloon in/out here, hmm, also, like


Yes I am referring to balloon in/out.


I replied there:
"For issues on ballooning out or in, it is not supported here.


So long you are not using the solution in prod then you are fine (see 
below)... But then we should make clear this feature is a tech preview.



Domain on Static Allocation and 1:1 direct-map are all based on
dom0-less right now, so no PV, grant table, event channel, etc, considered.

Right now, it only supports device got passthrough into the guest."


So we are not creating the hypervisor node in the DT for dom0less domU. 
However, the hypercalls are still accessible by a domU if it really

wants to use them.

Therefore, 

Re: [PATCH 01/10] xen/arm: introduce domain on Static Allocation

2021-05-20 Thread Julien Grall

Hi,

On 20/05/2021 07:07, Penny Zheng wrote:

It will be consistent with the ones defined in the parent node, domUx.

Hmmm... To take the example you provided, the parent would be chosen.
However, from the example, I would expect the property #{address, size}-cells
in domU1 to be used. What did I miss?



Yeah, the property #{address, size}-cells in domU1 will be used. And the parent
node will be domU1.


You may have misunderstood what I meant. "domU1" is the node that 
contains the property "xen,static-mem". The parent node would be the one 
above (in our case "chosen").




The dtb property should look like as follows:

 chosen {
 domU1 {
 compatible = "xen,domain";
 #address-cells = <0x2>;
 #size-cells = <0x2>;
 cpus = <2>;
 xen,static-mem = <0x0 0x3000 0x0 0x2000>;

 ...
 };
 };


+DOMU1 on Static Allocation has reserved RAM bank at 0x3000 of 512MB size



+Static Allocation is only supported on AArch64 for now.


The code doesn't seem to be AArch64 specific. So why can't this be
used for 32-bit Arm?



True, we have plans to make it also workable in AArch32 in the future.
Because we considered XEN on cortex-R52.


All the code seems to be implemented in arm generic code. So isn't it already
working?


static int __init early_scan_node(const void *fdt,
  int node, const char *name, int depth,
  u32 address_cells, u32
size_cells, @@ -345,6 +394,9 @@ static int __init early_scan_node(const

void *fdt,

process_multiboot_node(fdt, node, name, address_cells, size_cells);
else if ( depth == 1 && device_tree_node_matches(fdt, node,

"chosen") )

process_chosen_node(fdt, node, name, address_cells,
size_cells);
+else if ( depth == 2 && fdt_get_property(fdt, node,
+ "xen,static-mem",

NULL) )

+process_static_memory(fdt, node, "xen,static-mem", address_cells,
+  size_cells, _mem);


I am a bit concerned to add yet another method to parse the DT and
all the extra code it will add like in patch #2.

   From the host PoV, they are memory reserved for a specific purpose.
Would it be possible to consider the reserve-memory binding for that
purpose? This will happen outside of chosen, but we could use a
phandle to refer the region.



Correct me if I understand wrongly, do you mean what this device tree

snippet looks like:

Yes, this is what I had in mind. Although I have one small remark below.



reserved-memory {
 #address-cells = <2>;
 #size-cells = <2>;
 ranges;

 static-mem-domU1: static-mem@0x3000{


I think the node would need to contain a compatible (name to be defined).



Ok, maybe, hmmm, how about "xen,static-memory"?


I would possibly add "domain" in the name to make clear this is domain 
memory. Stefano, what do you think?


Cheers,

--
Julien Grall
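
Purely as an illustration of the reserved-memory plus phandle idea (the node
name, label, compatible string and addresses below are placeholders; no such
binding has been defined yet):

reserved-memory {
    #address-cells = <0x2>;
    #size-cells = <0x2>;
    ranges;

    /* Hypothetical node; name and compatible are not a defined binding. */
    static_mem_domU1: static-memory@30000000 {
        compatible = "xen,static-memory-domain";
        reg = <0x0 0x30000000 0x0 0x20000000>;
    };
};

chosen {
    domU1 {
        compatible = "xen,domain";
        #address-cells = <0x2>;
        #size-cells = <0x2>;
        cpus = <2>;
        /* Refer to the reserved region by phandle, as suggested above. */
        xen,static-mem = <&static_mem_domU1>;
    };
};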



RE: [PATCH 03/10] xen/arm: introduce PGC_reserved

2021-05-20 Thread Penny Zheng
Hi julien 

> -Original Message-
> From: Penny Zheng
> Sent: Thursday, May 20, 2021 2:20 PM
> To: Julien Grall ; xen-devel@lists.xenproject.org;
> sstabell...@kernel.org
> Cc: Bertrand Marquis ; Wei Chen
> ; nd 
> Subject: RE: [PATCH 03/10] xen/arm: introduce PGC_reserved
> 
> Hi Julien
> 
> > -Original Message-
> > From: Julien Grall 
> > Sent: Thursday, May 20, 2021 3:46 AM
> > To: Penny Zheng ; xen-devel@lists.xenproject.org;
> > sstabell...@kernel.org
> > Cc: Bertrand Marquis ; Wei Chen
> > ; nd 
> > Subject: Re: [PATCH 03/10] xen/arm: introduce PGC_reserved
> >
> >
> >
> > On 19/05/2021 04:16, Penny Zheng wrote:
> > > Hi Julien
> >
> > Hi Penny,
> >
> > >
> > >> -Original Message-
> > >> From: Julien Grall 
> > >> Sent: Tuesday, May 18, 2021 5:46 PM
> > >> To: Penny Zheng ;
> > >> xen-devel@lists.xenproject.org; sstabell...@kernel.org
> > >> Cc: Bertrand Marquis ; Wei Chen
> > >> ; nd 
> > >> Subject: Re: [PATCH 03/10] xen/arm: introduce PGC_reserved
> > >>
> > >>
> > >>
> > >> On 18/05/2021 06:21, Penny Zheng wrote:
> > >>> In order to differentiate pages of static memory, from those
> > >>> allocated from heap, this patch introduces a new page flag
> > >>> PGC_reserved
> > to tell.
> > >>>
> > >>> New struct reserved in struct page_info is to describe reserved
> > >>> page info, that is, which specific domain this page is reserved
> > >>> to. > Helper page_get_reserved_owner and page_set_reserved_owner
> > >>> are designated to get/set reserved page's owner.
> > >>>
> > >>> Struct domain is enlarged to more than PAGE_SIZE, due to
> > >>> newly-imported struct reserved in struct page_info.
> > >>
> > >> struct domain may embed a pointer to a struct page_info but never
> > >> directly embed the structure. So can you clarify what you mean?
> > >>
> > >>>
> > >>> Signed-off-by: Penny Zheng 
> > >>> ---
> > >>>xen/include/asm-arm/mm.h | 16 +++-
> > >>>1 file changed, 15 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> > >> index
> > >>> 0b7de3102e..d8922fd5db 100644
> > >>> --- a/xen/include/asm-arm/mm.h
> > >>> +++ b/xen/include/asm-arm/mm.h
> > >>> @@ -88,7 +88,15 @@ struct page_info
> > >>> */
> > >>>u32 tlbflush_timestamp;
> > >>>};
> > >>> -u64 pad;
> > >>> +
> > >>> +/* Page is reserved. */
> > >>> +struct {
> > >>> +/*
> > >>> + * Reserved Owner of this page,
> > >>> + * if this page is reserved to a specific domain.
> > >>> + */
> > >>> +struct domain *domain;
> > >>> +} reserved;
> > >>
> > >> The space in page_info is quite tight, so I would like to avoid
> > >> introducing new fields unless we can't get away from it.
> > >>
> > >> In this case, it is not clear why we need to differentiate the "Owner"
> > >> vs the "Reserved Owner". It might be clearer if this change is
> > >> folded in the first user of the field.
> > >>
> > >> As an aside, for 32-bit Arm, you need to add a 4-byte padding.
> > >>
> > >
> > > Yeah, I may delete this change. I imported this change as
> > > considering the functionality of rebooting domain on static allocation.
> > >
> > > A little more discussion on rebooting domain on static allocation.
> > > Considering the major user cases for domain on static allocation are
> > > system has a total pre-defined, static behavior all the time. No
> > > domain allocation on runtime, while there still exists domain rebooting.
> >
> > Hmmm... With this series it is still possible to allocate memory at
> > runtime outside of the static allocation (see my comment on the design
> document [1]).
> > So is it meant to be complete?
> >
> 
> I'm guessing that the "allocate memory at runtime outside of the static
> allocation" is referring to XEN heap on static allocation, that is, users pre-
> defining heap in device tree configuration to let the whole system more static
> and predictable.
> 
> And I've replied you in the design there, sorry for the late reply. Save your 
> time,
> and I’ll paste here:
> 
> "Right now, on AArch64, all RAM, except reserved memory, will be finally
> given to buddy allocator as heap,  like you said, guest RAM for normal domain
> will be allocated from there, xmalloc eventually is get memory from there, 
> etc.
> So we want to refine the heap here, not iterating through `bootinfo.mem` to
> set up XEN heap, but like iterating `bootinfo. reserved_heap` to set up XEN
> heap.
> 
> On ARM32, xen heap and domain heap are separately mapped, which is more
> complicated here. That's why I only talking about implementing these features
> on AArch64 as first step."
> 
>  Above implementation will be delivered through another patch Serie. This
> patch Serie Is only focusing on Domain on Static Allocation.
> 

Oh, Second thought on this. 
And I think you are referring to balloon in/out here, hmm, also, like
I replied there:
"For issues on ballooning out or in, it is not 

[libvirt test] 162100: regressions - FAIL

2021-05-20 Thread osstest service owner
flight 162100 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/162100/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  0ad0204ce7f7b512ee349dfbf5cdd751ab0adc1c
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  314 days
Failing since151818  2020-07-11 04:18:52 Z  313 days  306 attempts
Testing same since   162100  2021-05-20 04:20:01 Z0 days1 attempts


People who touched revisions under test:
Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Aleksei Zakharov 
  Andika Triwidada 
  Andrea Bolognani 
  Balázs Meskó 
  Barrett Schonefeld 
  Bastian Germann 
  Bastien Orivel 
  BiaoXiang Ye 
  Bihong Yu 
  Binfeng Wu 
  Boris Fiuczynski 
  Brian Turek 
  Bruno Haible 
  Chris Mayo 
  Christian Ehrhardt 
  Christian Schoenebeck 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Cédric Bosdonnat 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  Dmytro Linkin 
  Eiichi Tsukata 
  Eric Farman 
  Erik Skultety 
  Fabian Affolter 
  Fabian Freyer 
  Fangge Jin 
  Farhan Ali 
  Fedora Weblate Translation 
  gongwei 
  Guoyi Tu
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Hela Basa 
  Helmut Grohne 
  Ian Wienand 
  Jakob Meng 
  Jamie Strandboge 
  Jamie Strandboge 
  Jan Kuparinen 
  Jean-Baptiste Holcroft 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jiri Denemark 
  John Ferlan 
  Jonathan Watt 
  Jonathon Jongsma 
  Julio Faracco 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Kristina Hanicova 
  Laine Stump 
  Laszlo Ersek 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Luke Yue 
  Luyao Zhong 
  Marc Hartmayer 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Markus Schade 
  Martin Kletzander 
  Masayoshi Mizuma 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Meina Li 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Moshe Levi 
  Muha Aliss 
  Neal Gompa 
  Nick Shyrokovskiy 
  Nickys Music Group 
  Nico Pache 
  Nikolay Shirokovskiy 
  Olaf Hering 
  Olesya Gerasimenko 
  Orion Poplawski 
  Pany 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel Hrdina 
  Peng Liang 
  Peter Krempa 
  Pino Toscano 
  Pino Toscano 
  Piotr Drąg 
  Prathamesh Chavan 
  Ricky Tigg 
  Roman Bogorodskiy 
  Roman Bolshakov 
  Ryan Gahagan 
  Ryan Schmidt 
  Sam Hartman 
  Scott Shambarger 
  Sebastian Mitterle 
  SeongHyun Jo 
  Shalini Chellathurai Saroja 
  Shaojun Yang 
  Shi Lei 
  simmon 
  Simon Gaiser 
  Stefan Bader 
  Stefan Berger 
  Stefan Berger 
  Szymon Scholz 
  Thomas Huth 
  Tim Wiederhake 
  Tomáš Golembiovský 
  Tomáš Janoušek 
  Tuguoyi 
  Ville Skyttä 
  Wang Xin 
  WangJian 
  Weblate 
  Yalei Li <274268...@qq.com>
  Yalei Li 
  Yang Hang 
  Yanqiu Zhang 
  Yaroslav Kargin 
  Yi Li 
  Yi Wang 
  Yuri Chornoivan 
  Zheng Chuan 
  zhenwei pi 
  Zhenyu Zheng 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386 

Re: regression in recent pvops kernels, dom0 crashes early

2021-05-20 Thread Olaf Hering
On Thu, 20 May 2021 09:03:34 +0200,
Jan Beulich wrote:

> Just to be sure - you did not need the other patch that I said I suspect
> is needed as a prereq?

Yes, I needed just this single patch which moves x86_configure_nx up.


Olaf




Re: [PATCH v2 1/2] xen-pciback: redo VF placement in the virtual topology

2021-05-20 Thread Jan Beulich
On 20.05.2021 02:36, Boris Ostrovsky wrote:
> 
> On 5/18/21 12:13 PM, Jan Beulich wrote:
>>  
>> @@ -95,22 +95,25 @@ static int __xen_pcibk_add_pci_dev(struc
>>  
>>  /*
>>   * Keep multi-function devices together on the virtual PCI bus, except
>> - * virtual functions.
>> + * that we want to keep virtual functions at func 0 on their own. They
>> + * aren't multi-function devices and hence their presence at func 0
>> + * may cause guests to not scan the other functions.
> 
> 
> So your reading of the original commit is that whatever the issue it was, 
> only function zero was causing the problem? In other words, you are not 
> concerned that pci_scan_slot() may now look at function 1 and skip all 
> higher-numbered function (assuming the problem is still there)?

I'm not sure I understand the question: Whether to look at higher numbered
slots is a function of slot 0's multi-function bit alone, aiui. IOW if
slot 1 is being looked at in the first place, slots 2-7 should also be
looked at.

Jan



[xen-unstable test] 162095: tolerable FAIL - PUSHED

2021-05-20 Thread osstest service owner
flight 162095 xen-unstable real [real]
flight 162101 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/162095/
http://logs.test-lab.xenproject.org/osstest/logs/162101/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl   8 xen-bootfail pass in 162101-retest

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl 15 migrate-support-check fail in 162101 never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail in 162101 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 162078
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 162078
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 162078
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 162078
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 162078
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 162078
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 162078
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 162078
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 162078
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 162078
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 162078
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass

version targeted for testing:
 xen  935abe1cc463917c697c1451ec8d313a5d75f7de
baseline version:
 xen  caa9c4471d1d74b2d236467aaf7e63a806ac11a4

Last test of basis   162078  2021-05-19 05:04:34 Z   1 days
Testing same since   162095  2021-05-19 16:38:54 Z   0 days   1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Julien Grall 
  Wei Liu 

jobs:
 

Re: [PATCH 03/10] xen/arm: introduce PGC_reserved

2021-05-20 Thread Jan Beulich
On 19.05.2021 21:49, Julien Grall wrote:
> On 19/05/2021 10:49, Jan Beulich wrote:
>> On 19.05.2021 05:16, Penny Zheng wrote:
 From: Julien Grall 
 Sent: Tuesday, May 18, 2021 5:46 PM

 On 18/05/2021 06:21, Penny Zheng wrote:
> --- a/xen/include/asm-arm/mm.h
> +++ b/xen/include/asm-arm/mm.h
> @@ -88,7 +88,15 @@ struct page_info
> */
>u32 tlbflush_timestamp;
>};
> -u64 pad;
> +
> +/* Page is reserved. */
> +struct {
> +/*
> + * Reserved Owner of this page,
> + * if this page is reserved to a specific domain.
> + */
> +struct domain *domain;
> +} reserved;

 The space in page_info is quite tight, so I would like to avoid introducing
 new fields unless we can't get away from it.

 In this case, it is not clear why we need to differentiate the "Owner"
 vs the "Reserved Owner". It might be clearer if this change is folded in
 the first user of the field.

 As an aside, for 32-bit Arm, you need to add a 4-byte padding.

>>>
>>> Yeah, I may delete this change. I imported this change as considering the 
>>> functionality
>>> of rebooting domain on static allocation.
>>>
>>> A little more discussion on rebooting domain on static allocation.
>>> Considering the major user cases for domain on static allocation
>>> are system has a total pre-defined, static behavior all the time. No domain 
>>> allocation
>>> on runtime, while there still exists domain rebooting.
>>>
>>> And when rebooting domain on static allocation, all these reserved pages 
>>> could
>>> not go back to heap when freeing them.  So I am considering to use one 
>>> global
>>> `struct page_info*[DOMID]` value to store.
>>
>> Except such a separate array will consume quite a bit of space for
>> no real gain: v.free has 32 bits of padding space right now on
>> Arm64, so there's room for a domid_t there already. Even on Arm32
>> this could be arranged for, as I doubt "order" needs to be 32 bits
>> wide.
> 
> I agree we shouldn't need 32-bit to cover the "order". Although, I would 
> like to see any user reading the field before introducing it.

Of course, but I thought the plan was to mark static pages with their
designated domid, which would happen prior to domain creation. The
reader of the field would then naturally appear during domain creation.
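
To make the idea concrete, a rough sketch (field names invented, not the actual Xen struct page_info layout; domid_t is assumed here to be a 16-bit type):

#include <stdint.h>

typedef uint16_t domid_t;            /* assumption: a domid fits in 16 bits */

/* Sketch of the "free" state only: narrow "order" and reuse the freed-up
 * padding for the designated owner of a statically reserved page, instead
 * of adding a new top-level field to struct page_info. */
struct free_state_sketch {
    uint16_t order;                  /* allocation order; 32 bits not needed */
    domid_t  static_owner;           /* domid the page is pre-assigned to,
                                        written before domain creation and
                                        read while creating the domain */
};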

Jan



Re: regression in recent pvops kernels, dom0 crashes early

2021-05-20 Thread Jan Beulich
On 19.05.2021 20:42, Olaf Hering wrote:
> On Mon, 17 May 2021 12:54:02 +0200, Jan Beulich wrote:
> 
>> x86/Xen: swap NX determination and GDT setup on BSP
>>
>> xen_setup_gdt(), via xen_load_gdt_boot(), wants to adjust page tables.
>> For this to work when NX is not available, x86_configure_nx() needs to
>> be called first.
> 
> 
> Thanks. I tried this patch on-top of the SLE15-SP3 kernel branch.
> Without the patch booting fails as reported.
> With the patch the dom0 starts as expected.

Just to be sure - you did not need the other patch that I said I suspect
is needed as a prereq?

Jan



Re: [PATCH v7 01/15] swiotlb: Refactor swiotlb init functions

2021-05-20 Thread Claire Chang
On Thu, May 20, 2021 at 2:50 AM Florian Fainelli  wrote:
>
>
>
> On 5/17/2021 11:42 PM, Claire Chang wrote:
> > Add a new function, swiotlb_init_io_tlb_mem, for the io_tlb_mem struct
> > initialization to make the code reusable.
> >
> > Note that we now also call set_memory_decrypted in swiotlb_init_with_tbl.
> >
> > Signed-off-by: Claire Chang 
> > ---
> >  kernel/dma/swiotlb.c | 51 ++--
> >  1 file changed, 25 insertions(+), 26 deletions(-)
> >
> > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > index 8ca7d505d61c..d3232fc19385 100644
> > --- a/kernel/dma/swiotlb.c
> > +++ b/kernel/dma/swiotlb.c
> > @@ -168,9 +168,30 @@ void __init swiotlb_update_mem_attributes(void)
> >   memset(vaddr, 0, bytes);
> >  }
> >
> > -int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
> > +static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
> > + unsigned long nslabs, bool late_alloc)
> >  {
> > + void *vaddr = phys_to_virt(start);
> >   unsigned long bytes = nslabs << IO_TLB_SHIFT, i;
> > +
> > + mem->nslabs = nslabs;
> > + mem->start = start;
> > + mem->end = mem->start + bytes;
> > + mem->index = 0;
> > + mem->late_alloc = late_alloc;
> > + spin_lock_init(&mem->lock);
> > + for (i = 0; i < mem->nslabs; i++) {
> > + mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
> > + mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
> > + mem->slots[i].alloc_size = 0;
> > + }
> > +
> > + set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
> > + memset(vaddr, 0, bytes);
>
> You are doing an unconditional set_memory_decrypted() followed by a
> memset here, and then:
>
> > +}
> > +
> > +int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
> > +{
> >   struct io_tlb_mem *mem;
> >   size_t alloc_size;
> >
> > @@ -186,16 +207,8 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
> >   if (!mem)
> >   panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
> > __func__, alloc_size, PAGE_SIZE);
> > - mem->nslabs = nslabs;
> > - mem->start = __pa(tlb);
> > - mem->end = mem->start + bytes;
> > - mem->index = 0;
> > - spin_lock_init(&mem->lock);
> > - for (i = 0; i < mem->nslabs; i++) {
> > - mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
> > - mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
> > - mem->slots[i].alloc_size = 0;
> > - }
> > +
> > + swiotlb_init_io_tlb_mem(mem, __pa(tlb), nslabs, false);
>
> You convert this call site with swiotlb_init_io_tlb_mem() which did not
> do the set_memory_decrypted()+memset(). Is this okay or should
> swiotlb_init_io_tlb_mem() add an additional argument to do this
> conditionally?

I'm actually not sure if this is okay. If not, I will add an additional
argument for it.
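
One possible shape for such an argument, sketched from the hunks above (not a final patch; whether the default swiotlb_init_with_tbl() path should pass true is exactly the open question):

static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
                                    unsigned long nslabs, bool late_alloc,
                                    bool decrypt)
{
    void *vaddr = phys_to_virt(start);
    unsigned long bytes = nslabs << IO_TLB_SHIFT, i;

    mem->nslabs = nslabs;
    mem->start = start;
    mem->end = mem->start + bytes;
    mem->index = 0;
    mem->late_alloc = late_alloc;
    spin_lock_init(&mem->lock);
    for (i = 0; i < mem->nslabs; i++) {
        mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
        mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
        mem->slots[i].alloc_size = 0;
    }

    /* Only the call sites that really need it pay for the decryption. */
    if (decrypt) {
        set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
        memset(vaddr, 0, bytes);
    }
}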

> --
> Florian



Re: Preserving transactions accross Xenstored Live-Update

2021-05-20 Thread Juergen Gross

On 19.05.21 19:10, Julien Grall wrote:

Hi Juergen,

On 19/05/2021 13:50, Juergen Gross wrote:

On 19.05.21 14:33, Julien Grall wrote:



On 19/05/2021 13:32, Julien Grall wrote:

Hi Juergen,

On 19/05/2021 10:09, Juergen Gross wrote:

On 18.05.21 20:11, Julien Grall wrote:


I have started to look at preserving transactions across Live-update in
C Xenstored. So far, I managed to transfer transactions that read/write
existing nodes.

Now, I am running into trouble transferring new/deleted nodes within a
transaction with the existing migration format.

C Xenstored will keep track of nodes accessed during the transaction
but not the children (AFAICT for performance reasons).


Not performance reasons, but because there isn't any need for that:

The children are either unchanged (so the non-transaction node records
apply), or they will be among the tracked nodes (transaction node
records apply). So in both cases all children should be known.
In theory, opening a new transaction means you will not see any
modification in the global database until the transaction has been
committed. What you describe would break that because a client would
be able to see new nodes added outside of the transaction.

However, C Xenstored implements neither of the two. Currently, when
a node is accessed within the transaction, we will also store the
names of the current children.


To give an example with access to the global DB (prefixed with TID0)
and within a transaction (TID1):

 1) TID0: MKDIR "data/bar"
 2) Start transaction TID1
 3) TID1: DIRECTORY "data"
 -> This will cache the node data
 4) TID0: MKDIR "data/foo"
 -> This will create "foo" in the global database
 5) TID1: MKDIR "data/fish"
 -> This will create "fish" in the transaction
 6) TID1: DIRECTORY "data"
 -> This will only return "bar" and "fish"

If we Live-Update between 4) and 5), then we should make sure that
"bar" cannot be seen in the listing by TID1.


I meant "foo" here. Sorry for the confusion.



Therefore, I don't think we can restore the children using the
global node here. Instead we need to find a way to transfer the list
of known children within the transaction.

As a fun fact, C Xenstored implements the transaction weirdly, so TID1
will be able to access "bar" if it knows the name but not list it.


And this is the basic problem, I think.

C Xenstored should be repaired by adding all (remaining) children of a
node into the TID's database when the list of children is modified
either globally or in a transaction. A child having been added globally
needs to be added as "deleted" into the TID's database.


IIUC, for every modification in the global database, we would need to
walk every single transaction and check whether a parent was accessed.
Am I correct?


Not really. When a node is being read during a transaction and it is
found in the global data base only, its gen-count can be tested for
being older or newer than the transaction start. If it is newer we can
traverse the path up to "/" and treat each parent the same way (so if
a parent is found in the transaction data base, the presence of the
child can be verified, and if it is global only, the gen-count can be
tested against the transaction again).
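
As pseudo-code, the core of that test might look roughly like this (invented structure and field names, not the actual C Xenstored code; the walk towards "/" described above is left out of the sketch):

#include <stdbool.h>
#include <stdint.h>

struct node {
    const char *name;
    uint64_t generation;        /* gen-count, bumped on every change      */
    struct node *parent;        /* NULL for "/"                           */
};

struct transaction {
    uint64_t start_generation;  /* gen-count when the transaction started */
};

struct node *find_in_transaction(struct transaction *t, const char *name);

static bool visible_in_transaction(struct transaction *t, struct node *n)
{
    /* A transaction-local record (read, written or deleted) decides. */
    if (find_in_transaction(t, n->name))
        return true;

    /* Global-only node: visible unchanged if it predates the transaction;
     * if it is newer, it appeared behind the transaction's back and must
     * not suddenly show up in reads or listings. */
    return n->generation < t->start_generation;
}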

If so, I don't think this is a workable solution because of the cost to
execute a single command.


My variant will affect the transaction internal reads only, and the
additional cost will be limited by the distance of the read node from
the root node.

Is it something you plan to address differently with your rework of the
DB?

Yes. I want to have the transaction specific variants of nodes linked to
the global ones, which solves this problem in an easy way.
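
Purely to sketch that direction (invented names; the actual rework may end up looking quite different):

/* Each transaction-private variant keeps a link to its committed, global
 * counterpart, so reads and listings can fall back to the global node
 * whenever the transaction has not touched it. */
struct xs_node;                      /* the committed (global) node        */

struct xs_ta_node {
    struct xs_node *global;          /* NULL if created in this transaction */
    struct xs_node *local;           /* NULL if only read or deleted here   */
    bool deleted;                    /* removed within this transaction     */
};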


Juergen




Re: [PATCH v7 04/15] swiotlb: Add restricted DMA pool initialization

2021-05-20 Thread Claire Chang
On Thu, May 20, 2021 at 2:54 AM Florian Fainelli  wrote:
>
>
>
> On 5/17/2021 11:42 PM, Claire Chang wrote:
> > Add the initialization function to create restricted DMA pools from
> > matching reserved-memory nodes.
> >
> > Signed-off-by: Claire Chang 
> > ---
> >  include/linux/device.h  |  4 +++
> >  include/linux/swiotlb.h |  3 +-
> >  kernel/dma/swiotlb.c| 76 +
> >  3 files changed, 82 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/device.h b/include/linux/device.h
> > index 38a2071cf776..4987608ea4ff 100644
> > --- a/include/linux/device.h
> > +++ b/include/linux/device.h
> > @@ -416,6 +416,7 @@ struct dev_links_info {
> >   * @dma_pools:   Dma pools (if dma'ble device).
> >   * @dma_mem: Internal for coherent mem override.
> >   * @cma_area:Contiguous memory area for dma allocations
> > + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
> >   * @archdata:For arch-specific additions.
> >   * @of_node: Associated device tree node.
> >   * @fwnode:  Associated device node supplied by platform firmware.
> > @@ -521,6 +522,9 @@ struct device {
> >  #ifdef CONFIG_DMA_CMA
> >   struct cma *cma_area;   /* contiguous memory area for dma
> >  allocations */
> > +#endif
> > +#ifdef CONFIG_DMA_RESTRICTED_POOL
> > + struct io_tlb_mem *dma_io_tlb_mem;
> >  #endif
> >   /* arch specific additions */
> >   struct dev_archdata archdata;
> > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> > index 216854a5e513..03ad6e3b4056 100644
> > --- a/include/linux/swiotlb.h
> > +++ b/include/linux/swiotlb.h
> > @@ -72,7 +72,8 @@ extern enum swiotlb_force swiotlb_force;
> >   *   range check to see if the memory was in fact allocated by this
> >   *   API.
> >   * @nslabs:  The number of IO TLB blocks (in groups of 64) between @start and
> > - *   @end. This is command line adjustable via setup_io_tlb_npages.
> > + *   @end. For default swiotlb, this is command line adjustable via
> > + *   setup_io_tlb_npages.
> >   * @used:The number of used IO TLB block.
> >   * @list:The free list describing the number of free entries available
> >   *   from each index.
> > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > index b849b01a446f..1d8eb4de0d01 100644
> > --- a/kernel/dma/swiotlb.c
> > +++ b/kernel/dma/swiotlb.c
> > @@ -39,6 +39,13 @@
> >  #ifdef CONFIG_DEBUG_FS
> >  #include 
> >  #endif
> > +#ifdef CONFIG_DMA_RESTRICTED_POOL
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#endif
> >
> >  #include 
> >  #include 
> > @@ -690,3 +697,72 @@ static int __init swiotlb_create_default_debugfs(void)
> >  late_initcall(swiotlb_create_default_debugfs);
> >
> >  #endif
> > +
> > +#ifdef CONFIG_DMA_RESTRICTED_POOL
> > +static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
> > + struct device *dev)
> > +{
> > + struct io_tlb_mem *mem = rmem->priv;
> > + unsigned long nslabs = rmem->size >> IO_TLB_SHIFT;
> > +
> > + if (dev->dma_io_tlb_mem)
> > + return 0;
> > +
> > + /*
> > +  * Since multiple devices can share the same pool, the private data,
> > +  * io_tlb_mem struct, will be initialized by the first device attached
> > +  * to it.
> > +  */
> > + if (!mem) {
> > + mem = kzalloc(struct_size(mem, slots, nslabs), GFP_KERNEL);
> > + if (!mem)
> > + return -ENOMEM;
> > +
> > + if (PageHighMem(pfn_to_page(PHYS_PFN(rmem->base)))) {
> > + kfree(mem);
> > + return -EINVAL;
>
> This could probably deserve a warning here to indicate that the reserved
> area must be accessible within the linear mapping as I would expect a
> lot of people to trip over that.

Ok. Will add it.
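
A possible shape for that warning, sketched against the hunk above (dev_err() is just one option; the exact wording and log level are up to the final patch):

if (PageHighMem(pfn_to_page(PHYS_PFN(rmem->base)))) {
    /* Reserved area must be covered by the linear mapping to be usable. */
    dev_err(dev, "Restricted DMA pool must be accessible within the linear mapping.");
    kfree(mem);
    return -EINVAL;
}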

>
> Reviewed-by: Florian Fainelli 
> --
> Florian



RE: [PATCH 10/10] xen/arm: introduce allocate_static_memory

2021-05-20 Thread Penny Zheng
Hi Julien 

> -Original Message-
> From: Julien Grall 
> Sent: Thursday, May 20, 2021 4:10 AM
> To: Penny Zheng ; xen-devel@lists.xenproject.org;
> sstabell...@kernel.org
> Cc: Bertrand Marquis ; Wei Chen
> ; nd 
> Subject: Re: [PATCH 10/10] xen/arm: introduce allocate_static_memory
> 
> 
> 
> On 19/05/2021 08:27, Penny Zheng wrote:
> > Hi Julien
> >
> >> -Original Message-
> >> From: Julien Grall 
> >> Sent: Tuesday, May 18, 2021 8:06 PM
> >> To: Penny Zheng ;
> >> xen-devel@lists.xenproject.org; sstabell...@kernel.org
> >> Cc: Bertrand Marquis ; Wei Chen
> >> ; nd 
> >> Subject: Re: [PATCH 10/10] xen/arm: introduce allocate_static_memory
> >>
> >> Hi Penny,
> >>
> >> On 18/05/2021 06:21, Penny Zheng wrote:
> >>> This commit introduces allocate_static_memory to allocate static
> >>> memory as guest RAM for domain on Static Allocation.
> >>>
> >>> It uses alloc_domstatic_pages to allocate pre-defined static memory
> >>> banks for this domain, and uses guest_physmap_add_page to set up P2M
> >>> table, guest starting at fixed GUEST_RAM0_BASE, GUEST_RAM1_BASE.
> >>>
> >>> Signed-off-by: Penny Zheng 
> >>> ---
> >>>xen/arch/arm/domain_build.c | 157
> >> +++-
> >>>1 file changed, 155 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/xen/arch/arm/domain_build.c
> >>> b/xen/arch/arm/domain_build.c index 30b55588b7..9f662313ad 100644
> >>> --- a/xen/arch/arm/domain_build.c
> >>> +++ b/xen/arch/arm/domain_build.c
> >>> @@ -437,6 +437,50 @@ static bool __init allocate_bank_memory(struct domain *d,
> >>>return true;
> >>>}
> >>>
> >>> +/*
> >>> + * #ram_index and #ram_addr refer to the index and starting address of
> >>> + * guest memory bank stored in kinfo->mem.
> >>> + * Static memory at #smfn of #tot_size shall be mapped to #sgfn, and
> >>> + * #sgfn will be next guest address to map when returning.
> >>> + */
> >>> +static bool __init allocate_static_bank_memory(struct domain *d,
> >>> +   struct kernel_info *kinfo,
> >>> +   int ram_index,
> >>
> >> Please use unsigned.
> >>
> >>> +   paddr_t ram_addr,
> >>> +   gfn_t* sgfn,
> >>
> >> I am confused, what is the difference between ram_addr and sgfn?
> >>
> >
> > We need to construct kinfo->mem (guest RAM banks) here, and we are
> > indexing in static_mem (physical RAM banks). Multiple physical RAM
> > banks make up one guest RAM bank (like GUEST_RAM0).
> >
> > ram_addr here will either be GUEST_RAM0_BASE or GUEST_RAM1_BASE for
> > now. I kind of struggled with how to name it. And maybe it shall not be a
> > parameter here.
> >
> > Maybe I should switch..case.. on the ram_index: if it's 0, it's
> > GUEST_RAM0_BASE, and if it's 1, it's GUEST_RAM1_BASE.
> 
> You only need to set kinfo->mem.bank[ram_index].start once. This is when
> you know the bank is first used.
> 
> AFAICT, this function will map the memory for a range start at ``sgfn``.
> It doesn't feel this belongs to the function.
> 
> The same remark is valid for kinfo->mem.nr_banks.
> 

Ok. I finally fully understand what you suggest here.
I'll try to move the actions of setting kinfo->mem.bank[ram_index].start,
kinfo->mem.bank[ram_index].size and kinfo->mem.nr_banks out of this function,
and only keep the simple functionality of mapping the memory for a range
starting at ``sgfn``.
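
A sketch of what the narrowed helper could then look like (name and exact signature are illustrative only, based on the hunk quoted above):

/* Map tot_size bytes of static memory starting at smfn to the guest,
 * beginning at *sgfn; bank bookkeeping (start/size/nr_banks) is left
 * entirely to the caller. */
static bool __init map_static_region(struct domain *d, gfn_t *sgfn,
                                     mfn_t smfn, paddr_t tot_size)
{
    while ( tot_size > 0 )
    {
        unsigned int order = get_allocation_size(tot_size);
        int res = guest_physmap_add_page(d, *sgfn, smfn, order);

        if ( res )
        {
            dprintk(XENLOG_ERR, "Failed to map pages to DOMU: %d", res);
            return false;
        }

        *sgfn = gfn_add(*sgfn, 1UL << order);
        smfn = mfn_add(smfn, 1UL << order);
        tot_size -= 1ULL << (PAGE_SHIFT + order);
    }

    return true;
}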

> >>> +   mfn_t smfn,
> >>> +   paddr_t tot_size) {
> >>> +int res;
> >>> +struct membank *bank;
> >>> +paddr_t _size = tot_size;
> >>> +
> >>> +bank = &kinfo->mem.bank[ram_index];
> >>> +bank->start = ram_addr;
> >>> +bank->size = bank->size + tot_size;
> >>> +
> >>> +while ( tot_size > 0 )
> >>> +{
> >>> +unsigned int order = get_allocation_size(tot_size);
> >>> +
> >>> +res = guest_physmap_add_page(d, *sgfn, smfn, order);
> >>> +if ( res )
> >>> +{
> >>> +dprintk(XENLOG_ERR, "Failed map pages to DOMU: %d", res);
> >>> +return false;
> >>> +}
> >>> +
> >>> +*sgfn = gfn_add(*sgfn, 1UL << order);
> >>> +smfn = mfn_add(smfn, 1UL << order);
> >>> +tot_size -= (1ULL << (PAGE_SHIFT + order));
> >>> +}
> >>> +
> >>> +kinfo->mem.nr_banks = ram_index + 1;
> >>> +kinfo->unassigned_mem -= _size;
> >>> +
> >>> +return true;
> >>> +}
> >>> +
> >>>static void __init allocate_memory(struct domain *d, struct
> >>> kernel_info
> >> *kinfo)
> >>>{
> >>>unsigned int i;
> >>> @@ -480,6 +524,116 @@ fail:
> >>>  (unsigned long)kinfo->unassigned_mem >> 10);
> >>>}
> >>>
> >>> +/* Allocate memory from static memory as RAM for one specific domain d. */
> >>> +static void __init allocate_static_memory(struct domain *d,
> >>> +  

RE: [PATCH 03/10] xen/arm: introduce PGC_reserved

2021-05-20 Thread Penny Zheng
Hi Julien

> -Original Message-
> From: Julien Grall 
> Sent: Thursday, May 20, 2021 3:46 AM
> To: Penny Zheng ; xen-devel@lists.xenproject.org;
> sstabell...@kernel.org
> Cc: Bertrand Marquis ; Wei Chen
> ; nd 
> Subject: Re: [PATCH 03/10] xen/arm: introduce PGC_reserved
> 
> 
> 
> On 19/05/2021 04:16, Penny Zheng wrote:
> > Hi Julien
> 
> Hi Penny,
> 
> >
> >> -Original Message-
> >> From: Julien Grall 
> >> Sent: Tuesday, May 18, 2021 5:46 PM
> >> To: Penny Zheng ;
> >> xen-devel@lists.xenproject.org; sstabell...@kernel.org
> >> Cc: Bertrand Marquis ; Wei Chen
> >> ; nd 
> >> Subject: Re: [PATCH 03/10] xen/arm: introduce PGC_reserved
> >>
> >>
> >>
> >> On 18/05/2021 06:21, Penny Zheng wrote:
> >>> In order to differentiate pages of static memory, from those
> >>> allocated from heap, this patch introduces a new page flag PGC_reserved
> >>> to tell.
> >>>
> >>> New struct reserved in struct page_info is to describe reserved page
> >>> info, that is, which specific domain this page is reserved to.
> >>> Helper page_get_reserved_owner and page_set_reserved_owner are
> >>> designated to get/set reserved page's owner.
> >>>
> >>> Struct domain is enlarged to more than PAGE_SIZE, due to
> >>> newly-imported struct reserved in struct page_info.
> >>
> >> struct domain may embed a pointer to a struct page_info but never
> >> directly embed the structure. So can you clarify what you mean?
> >>
> >>>
> >>> Signed-off-by: Penny Zheng 
> >>> ---
> >>>xen/include/asm-arm/mm.h | 16 +++-
> >>>1 file changed, 15 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
> >> index
> >>> 0b7de3102e..d8922fd5db 100644
> >>> --- a/xen/include/asm-arm/mm.h
> >>> +++ b/xen/include/asm-arm/mm.h
> >>> @@ -88,7 +88,15 @@ struct page_info
> >>> */
> >>>u32 tlbflush_timestamp;
> >>>};
> >>> -u64 pad;
> >>> +
> >>> +/* Page is reserved. */
> >>> +struct {
> >>> +/*
> >>> + * Reserved Owner of this page,
> >>> + * if this page is reserved to a specific domain.
> >>> + */
> >>> +struct domain *domain;
> >>> +} reserved;
> >>
> >> The space in page_info is quite tight, so I would like to avoid
> >> introducing new fields unless we can't get away from it.
> >>
> >> In this case, it is not clear why we need to differentiate the "Owner"
> >> vs the "Reserved Owner". It might be clearer if this change is folded
> >> in the first user of the field.
> >>
> >> As an aside, for 32-bit Arm, you need to add a 4-byte padding.
> >>
> >
> > Yeah, I may delete this change. I imported this change as considering
> > the functionality of rebooting domain on static allocation.
> >
> > A little more discussion on rebooting domain on static allocation.
> > Considering the major user cases for domain on static allocation are
> > system has a total pre-defined, static behavior all the time. No
> > domain allocation on runtime, while there still exists domain rebooting.
> 
> Hmmm... With this series it is still possible to allocate memory at runtime
> outside of the static allocation (see my comment on the design document [1]).
> So is it meant to be complete?
> 

I'm guessing that the "allocate memory at runtime outside of the static
allocation" is referring to XEN heap on static allocation, that is, users
pre-defining the heap in the device tree configuration to make the whole
system more static and predictable.

And I've already replied to you in the design thread, sorry for the late reply.
To save your time, I'll paste it here:

"Right now, on AArch64, all RAM, except reserved memory, will finally be given
to the buddy allocator as heap. Like you said, guest RAM for a normal domain
will be allocated from there, xmalloc eventually gets memory from there, etc.
So we want to refine the heap here: not iterating through `bootinfo.mem` to
set up the XEN heap, but rather iterating `bootinfo.reserved_heap` to set up
the XEN heap.

On ARM32, the xen heap and domain heap are mapped separately, which is more
complicated here. That's why I am only talking about implementing these
features on AArch64 as a first step."

The above implementation will be delivered through another patch series. This
patch series is only focusing on Domain on Static Allocation.

> >
> > And when rebooting domain on static allocation, all these reserved
> > pages could not go back to heap when freeing them.  So I am
> > considering to use one global `struct page_info*[DOMID]` value to store.
> >
> > As Jan suggested, when domain get rebooted, struct domain will not exist
> anymore.
> > But I think DOMID info could last.
> You would need to make sure the domid is reserved. But I am not entirely
> convinced this is necessary here.
> 
> When recreating the domain, you need a way to know its configuration.
> Most likely this will come from the Device-Tree. At which point, you can
> also find the static region from there.
> 

True, true. I'll dig more 

RE: [PATCH 01/10] xen/arm: introduce domain on Static Allocation

2021-05-20 Thread Penny Zheng
Hi Julien

> -Original Message-
> From: Julien Grall 
> Sent: Thursday, May 20, 2021 2:27 AM
> To: Penny Zheng ; xen-devel@lists.xenproject.org;
> sstabell...@kernel.org
> Cc: Bertrand Marquis ; Wei Chen
> ; nd 
> Subject: Re: [PATCH 01/10] xen/arm: introduce domain on Static Allocation
> 
> On 19/05/2021 03:22, Penny Zheng wrote:
> > Hi Julien
> 
> Hi Penny,
> 
> >> -Original Message-
> >> From: Julien Grall 
> >> Sent: Tuesday, May 18, 2021 4:58 PM
> >> To: Penny Zheng ;
> >> xen-devel@lists.xenproject.org; sstabell...@kernel.org
> >> Cc: Bertrand Marquis ; Wei Chen
> >> ; nd 
> >> Subject: Re: [PATCH 01/10] xen/arm: introduce domain on Static
> >> Allocation
> >>> +beginning, shall never go to heap allocator or boot allocator for any use.
> >>> +
> >>> +Domains on Static Allocation is supported through device tree property
> >>> +`xen,static-mem` specifying reserved RAM banks as this domain's guest RAM.
> >>
> >> I would suggest to use "physical RAM" when you refer to the host memory.
> >>
> >>> +By default, they shall be mapped to the fixed guest RAM address
> >>> +`GUEST_RAM0_BASE`, `GUEST_RAM1_BASE`.
> >>
> >> There are a few bits that need to be clarified or be part of the description:
> >> 1) "By default" suggests there is an alternative possibility.
> >> However, I don't see any.
> >> 2) Will the first region of xen,static-mem be mapped to
> >> GUEST_RAM0_BASE and the second to GUEST_RAM1_BASE? What if a third
> >> region is specified?
> >> 3) We don't guarantee the base address and the size of the banks.
> >> Wouldn't it be better to let the admin select the region he/she wants?
> >> 4) How do you determine the number of cells for the address and the 
> >> size?
> >>
> >
> > The specific implementation on this part could be traced to the last commit:
> > https://patchew.org/Xen/20210518052113.725808-1-penny.zh...@arm.com/20210518052113.725808-11-penny.zh...@arm.com/
> 
> Right. My point is an admin should not have to read the code in order to
> figure out how the allocation works.
> 
> >
> > It will exhaust the GUEST_RAM0_SIZE, then seek to the GUEST_RAM1_BASE.
> > GUEST_RAM0 may take up several regions.
> 
> Can this be clarified in the commit message.
> 

Ok, I will expand the commit message to clarify this.

> > Yes, I may add the 1:1 direct-map scenario here to explain the
> > alternative possibility
> 
> Ok. So I would suggest to remove "by default" until the implementation for
> direct-map is added.
> 

Ok,  will do. Thx.

> > For the third point, are you suggesting that we could provide an
> > option that let user also define guest memory base address/size?
> 
> When reading the document, I originally thought that the first region would be
> mapped to bank1, and then the second region mapped to bank2...
> 
> But from your reply, this is not the expected behavior. Therefore, please 
> ignore
> my suggestion here.
> 
> > I'm confused on the fourth point, you mean the address cell and size cell
> > for xen,static-mem = <...>?
> 
> Yes. This should be clarified in the document. See more below why?
> 
> > It will be consistent with the ones defined in the parent node, domUx.
> Hmmm... To take the example you provided, the parent would be chosen.
> However, from the example, I would expect the property #{address, size}-cells
> in domU1 to be used. What did I miss?
> 

Yeah, the property #{address, size}-cells in domU1 will be used. And the parent
node will be domU1.

The dtb property should look as follows:

chosen {
    domU1 {
        compatible = "xen,domain";
        #address-cells = <0x2>;
        #size-cells = <0x2>;
        cpus = <2>;
        xen,static-mem = <0x0 0x3000 0x0 0x2000>;

        ...
    };
};

> > +DOMU1 on Static Allocation has reserved RAM bank at 0x3000 of 512MB size

> >>> +Static Allocation is only supported on AArch64 for now.
> >>
> >> The code doesn't seem to be AArch64 specific. So why can't this be
> >> used for 32-bit Arm?
> >>
> >
> > True, we have plans to make it also workable in AArch32 in the future.
> > Because we considered XEN on cortex-R52.
> 
> All the code seems to be implemented in arm generic code. So isn't it already
> working?
> 
> >>>    static int __init early_scan_node(const void *fdt,
> >>>                                      int node, const char *name, int depth,
> >>>                                      u32 address_cells, u32 size_cells,
> >>> @@ -345,6 +394,9 @@ static int __init early_scan_node(const void *fdt,
> >>>        process_multiboot_node(fdt, node, name, address_cells, size_cells);
> >>>    else if ( depth == 1 && device_tree_node_matches(fdt, node, "chosen") )
> >>>        process_chosen_node(fdt, node, name, address_cells, size_cells);
> >>> +    else if ( depth == 2 && fdt_get_property(fdt, node, "xen,static-mem",