Re: hvmloader - allow_memory_relocate overlaps

2024-03-03 Thread Jan Beulich
On 01.03.2024 21:24, Neowutran wrote:
> On 2024-02-07 16:02, Jan Beulich wrote:
>> Could you give the patch below a try? I don't have a device with large
>> enough a BAR that I could sensibly pass through to a guest, so I was
>> only able to build-test the change.
> 
> Hi and thanks,
> I just tested it, it indeed works well when the GPU has a BAR > 1GB.
> 
> 
> Setup: 
> I removed my patch 
> (
> 
> 
> --- a/tools/libs/light/libxl_dm.c
> +++ b/tools/libs/light/libxl_dm.c
> @@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, 
> libxl__stub_dm_spawn_state *sdss)
> libxl__xs_get_dompath(gc, 
> guest_domid)),
>  "%s",
>  
> libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
> +libxl__xs_printf(gc, XBT_NULL,
> + libxl__sprintf(gc, 
> "%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, guest_domid)),
> + "%d",
> + 0);
>  }
>  ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
>  if (ret<0) {
> 
> 
> )
> and applied your suggested "skip huge BARs" patch. 
> My GPU: nvidia 4080
> -
> 
> When the option "Resizable BAR support" is activated in my BIOS,
> the BAR1 size of my GPU is reported to be 16GB. With this patch, GPU
> passthrough works.

Good. I guess I should properly post that patch then, no matter that ...

> When the option "Resizable BAR support" is deactivated in my BIOS,
> the BAR1 size of my GPU is reported to be 256MB. With this patch,
> GPU passthrough doesn't work (same crash as before).

... this clearly requires further work then. 256Mb isn't all that large;
could you post a log file (fragment) for this very case?

> (Note: the option "Resizable BAR support" may or may not exist
> depending on the motherboard model; on some it is hardcoded to
> activated, on others hardcoded to deactivated.)

Resizable BAR support is yet another thing on my (and maybe also Roger's) todo
list. I'm a little uncomfortable doing any work there without having a
system where I could actually test it ...

Jan



Re: hvmloader - allow_memory_relocate overlaps

2024-03-01 Thread Neowutran
On 2024-02-07 16:02, Jan Beulich wrote:
> On 04.01.2024 14:16, Jan Beulich wrote:
> > On 22.12.2023 16:49, Neowutran wrote:
> >> Full logs without my patch to set allow-memory-relocate 
> >> (https://github.com/neowutran/qubes-vmm-xen/blob/allowmemoryrelocate/ALLOWMEMORYRELOCATE.patch)
> >> https://pastebin.com/gQGg55WZ
> >> (GPU passthrough doesn't work, hvmloader overlaps with guest memory)
> > 
> > So there are oddities, but I can't spot any overlaps. What's odd is that
> > the two blocks already above 4Gb are accounted for (and later relocated)
> > when calculating total MMIO size. BARs of size 2Gb and more shouldn't be
> > accounted for at all when deciding whether low RAM needs relocating, as
> > those can't live below 4Gb anyway. I vaguely recall pointing this out
> years ago, but it was thought we'd get away with it for a fair while. What's
> > further odd is where the two blocks are moved to: F80 moves (down)
> > to C0, while the smaller FC0 moves further up to FC8.
> > 
> > I'll try to get to addressing at least the first oddity; if I can figure
> > out why the second one occurs, I may try to address that as well.
> 
> Could you give the patch below a try? I don't have a device with large
> enough a BAR that I could sensibly pass through to a guest, so I was
> only able to build-test the change.

Hi and thanks,
I just tested it, it indeed works well when the GPU has a BAR > 1GB.


Setup: 
I removed my patch 
(


--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, 
libxl__stub_dm_spawn_state *sdss)
libxl__xs_get_dompath(gc, guest_domid)),
 "%s",
 
libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
+libxl__xs_printf(gc, XBT_NULL,
+ libxl__sprintf(gc, 
"%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, guest_domid)),
+ "%d",
+ 0);
 }
 ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
 if (ret<0) {


)
and applied your suggested "skip huge BARs" patch. 
My GPU: nvidia 4080
-

When the option "Resizable BAR support" is activated in my BIOS,
the BAR1 size of my GPU is reported to be 16GB. With this patch, GPU
passthrough works.

When the option "Resizable BAR support" is deactivated in my BIOS,
the BAR1 size of my GPU is reported to be 256MB. With this patch,
GPU passthrough doesn't work (same crash as before).


(Note: the option "Resizable BAR support" may or may not exist
depending on the motherboard model; on some it is hardcoded to
activated, on others hardcoded to deactivated.)

> Jan
> 
> hvmloader/PCI: skip huge BARs in certain calculations
> 
> BARs of size 2Gb and up can't possibly fit below 4Gb: Both the bottom of
> the lower 2Gb range and the top of the higher 2Gb range have special
> purpose. Don't even have them influence whether to (perhaps) relocate
> low RAM.
> 
> Reported-by: Neowutran 
> Signed-off-by: Jan Beulich 
> ---
> If we wanted to fit e.g. multiple 1Gb BARs, it would likely be prudent
> to similarly avoid low RAM relocation in the first place. Yet accounting
> for things differently depending on how many large BARs there are would
> require more intrusive code changes.
> 
> That said, I'm open to further lowering of the threshold. That'll
> require different justification then, though.
> 
> --- a/tools/firmware/hvmloader/pci.c
> +++ b/tools/firmware/hvmloader/pci.c
> @@ -33,6 +33,13 @@ uint32_t pci_mem_start = HVM_BELOW_4G_MM
>  const uint32_t pci_mem_end = RESERVED_MEMBASE;
>  uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
>  
> +/*
> + * BARs larger than this value are put in 64-bit space unconditionally.  That
> + * is, such BARs also don't play into the determination of how big the lowmem
> + * MMIO hole needs to be.
> + */
> +#define HUGE_BAR_THRESH GB(1)
> +
>  enum virtual_vga virtual_vga = VGA_none;
>  unsigned long igd_opregion_pgbase = 0;
>  
> @@ -286,9 +293,11 @@ void pci_setup(void)
>  bars[i].bar_reg = bar_reg;
>  bars[i].bar_sz  = bar_sz;
>  
> -if ( ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
> -  PCI_BASE_ADDRESS_SPACE_MEMORY) ||
> - (bar_reg == PCI_ROM_ADDRESS) )
> +if ( is_64bar && bar_sz > HUGE_BAR_THRESH )
> +bar64_relocate = 1;
> +else if ( ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
> +   PCI_BASE_ADDRESS_SPACE_MEMORY) ||
> +  (bar_reg == PCI_ROM_ADDRESS) )
>  mmio_total += bar_sz;
>  
>  nr_bars++;
> @@ -367,7 +376,7 @@ void pci_setup(void)
>  pci_mem_start = hvm_info->low_mem_pgend << PAGE_SHIFT;
>  }
>  
> -if ( mmio_total > (pci_mem_end - pci_mem_start) )
> +if ( mmio_total > (pci_mem_end - pci_mem_start) || 

Re: hvmloader - allow_memory_relocate overlaps

2024-02-07 Thread Jan Beulich
On 04.01.2024 14:16, Jan Beulich wrote:
> On 22.12.2023 16:49, Neowutran wrote:
>> Full logs without my patch to set allow-memory-relocate 
>> (https://github.com/neowutran/qubes-vmm-xen/blob/allowmemoryrelocate/ALLOWMEMORYRELOCATE.patch)
> >> https://pastebin.com/gQGg55WZ
>> (GPU passthrough doesn't work, hvmloader overlaps with guest memory)
> 
> So there are oddities, but I can't spot any overlaps. What's odd is that
> the two blocks already above 4Gb are accounted for (and later relocated)
> when calculating total MMIO size. BARs of size 2Gb and more shouldn't be
> accounted for at all when deciding whether low RAM needs relocating, as
> those can't live below 4Gb anyway. I vaguely recall pointing this out
> years ago, but it was thought we'd get away with it for a fair while. What's
> further odd is where the two blocks are moved to: F80 moves (down)
> to C0, while the smaller FC0 moves further up to FC8.
> 
> I'll try to get to addressing at least the first oddity; if I can figure
> out why the second one occurs, I may try to address that as well.

Could you give the patch below a try? I don't have a device with large
enough a BAR that I could sensibly pass through to a guest, so I was
only able to build-test the change.

Jan

hvmloader/PCI: skip huge BARs in certain calculations

BARs of size 2Gb and up can't possibly fit below 4Gb: Both the bottom of
the lower 2Gb range and the top of the higher 2Gb range have special
purpose. Don't even have them influence whether to (perhaps) relocate
low RAM.

Reported-by: Neowutran 
Signed-off-by: Jan Beulich 
---
If we wanted to fit e.g. multiple 1Gb BARs, it would likely be prudent
to similarly avoid low RAM relocation in the first place. Yet accounting
for things differently depending on how many large BARs there are would
require more intrusive code changes.

That said, I'm open to further lowering of the threshold. That'll
require different justification then, though.

--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -33,6 +33,13 @@ uint32_t pci_mem_start = HVM_BELOW_4G_MM
 const uint32_t pci_mem_end = RESERVED_MEMBASE;
 uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
 
+/*
+ * BARs larger than this value are put in 64-bit space unconditionally.  That
+ * is, such BARs also don't play into the determination of how big the lowmem
+ * MMIO hole needs to be.
+ */
+#define HUGE_BAR_THRESH GB(1)
+
 enum virtual_vga virtual_vga = VGA_none;
 unsigned long igd_opregion_pgbase = 0;
 
@@ -286,9 +293,11 @@ void pci_setup(void)
 bars[i].bar_reg = bar_reg;
 bars[i].bar_sz  = bar_sz;
 
-if ( ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
-  PCI_BASE_ADDRESS_SPACE_MEMORY) ||
- (bar_reg == PCI_ROM_ADDRESS) )
+if ( is_64bar && bar_sz > HUGE_BAR_THRESH )
+bar64_relocate = 1;
+else if ( ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
+   PCI_BASE_ADDRESS_SPACE_MEMORY) ||
+  (bar_reg == PCI_ROM_ADDRESS) )
 mmio_total += bar_sz;
 
 nr_bars++;
@@ -367,7 +376,7 @@ void pci_setup(void)
 pci_mem_start = hvm_info->low_mem_pgend << PAGE_SHIFT;
 }
 
-if ( mmio_total > (pci_mem_end - pci_mem_start) )
+if ( mmio_total > (pci_mem_end - pci_mem_start) || bar64_relocate )
 {
 printf("Low MMIO hole not large enough for all devices,"
" relocating some BARs to 64-bit\n");
@@ -446,8 +455,9 @@ void pci_setup(void)
  *   the code here assumes it to be.)
  * Should either of those two conditions change, this code will break.
  */
-using_64bar = bars[i].is_64bar && bar64_relocate
-&& (mmio_total > (mem_resource.max - mem_resource.base));
+using_64bar = bars[i].is_64bar && bar64_relocate &&
+(mmio_total > (mem_resource.max - mem_resource.base) ||
+ bar_sz > HUGE_BAR_THRESH);
 bar_data = pci_readl(devfn, bar_reg);
 
 if ( (bar_data & PCI_BASE_ADDRESS_SPACE) ==
@@ -467,7 +477,8 @@ void pci_setup(void)
 resource = &mem_resource;
 bar_data &= ~PCI_BASE_ADDRESS_MEM_MASK;
 }
-mmio_total -= bar_sz;
+if ( bar_sz <= HUGE_BAR_THRESH )
+mmio_total -= bar_sz;
 }
 else
 {
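
As a back-of-the-envelope sketch of the intended effect: once BARs above
HUGE_BAR_THRESH stop counting towards mmio_total, a device dominated by a
single huge BAR no longer forces the low hole to grow. The BAR sizes below
are assumptions for a GPU of the kind reported, not values taken from the
logs; treat it as an illustration, not as hvmloader's exact accounting.

#include <stdint.h>
#include <stdio.h>

#define GB(n) ((uint64_t)(n) << 30)
#define MB(n) ((uint64_t)(n) << 20)
#define HUGE_BAR_THRESH GB(1)

int main(void)
{
    /* Hypothetical memory BARs: 16MiB, 16GiB (resizable BAR1), 32MiB. */
    const uint64_t bar_sz[] = { MB(16), GB(16), MB(32) };
    const uint32_t pci_mem_start = 0xF0000000u;  /* HVM_BELOW_4G_MMIO_START */
    const uint32_t pci_mem_end   = 0xFC000000u;  /* RESERVED_MEMBASE */
    uint64_t mmio_total = 0;
    int bar64_relocate = 0;

    for ( unsigned int i = 0; i < 3; i++ )
    {
        if ( bar_sz[i] > HUGE_BAR_THRESH )
            bar64_relocate = 1;       /* placed in 64-bit space outright */
        else
            mmio_total += bar_sz[i];  /* competes for the low hole */
    }

    printf("mmio_total = %llu MiB, low hole = %u MiB, bar64_relocate = %d\n",
           (unsigned long long)(mmio_total >> 20),
           (pci_mem_end - pci_mem_start) >> 20, bar64_relocate);
    /* 48 MiB fits the default 192 MiB hole, so neither hole growth nor
     * low-RAM relocation is needed; only the 16GiB BAR moves above 4GiB. */
    return 0;
}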





Re: hvmloader - allow_memory_relocate overlaps

2024-01-04 Thread Jan Beulich
On 22.12.2023 16:49, Neowutran wrote:
> Full logs without my patch to set allow-memory-relocate 
> (https://github.com/neowutran/qubes-vmm-xen/blob/allowmemoryrelocate/ALLOWMEMORYRELOCATE.patch)
> https://pastebin.com/gQGg55WZ
> (GPU passthrough doesn't work, hvmloader overlaps with guest memory)

So there are oddities, but I can't spot any overlaps. What's odd is that
the two blocks already above 4Gb are accounted for (and later relocated)
when calculating total MMIO size. BARs of size 2Gb and more shouldn't be
accounted for at all when deciding whether low RAM needs relocating, as
those can't live below 4Gb anyway. I vaguely recall pointing this out
years ago, but it was thought we'd get away with it for a fair while. What's
further odd is where the two blocks are moved to: F80 moves (down)
to C0, while the smaller FC0 moves further up to FC8.

I'll try to get to addressing at least the first oddity; if I can figure
out why the second one occurs, I may try to address that as well.
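
A minimal sketch of the size argument, assuming hvmloader's usual constants
(the low hole ends at RESERVED_MEMBASE = 0xFC000000 and the hole-growing loop
never lets pci_mem_start drop below 0x80000000):

#include <stdio.h>

int main(void)
{
    const unsigned int reserved_membase     = 0xFC000000u; /* top of the low hole */
    const unsigned int lowest_pci_mem_start = 0x80000000u; /* the loop's floor */

    /* 0x7C000000 bytes == 1984 MiB, i.e. strictly less than 2Gb, which is
     * why a BAR of 2Gb or more can never be placed below 4Gb. */
    printf("largest possible low MMIO hole: %u MiB\n",
           (reserved_membase - lowest_pci_mem_start) >> 20);
    return 0;
}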

Jan




Re: hvmloader - allow_memory_relocate overlaps

2023-12-22 Thread Marek Marczykowski-Górecki
On Fri, Dec 22, 2023 at 12:35:57PM +0100, Jan Beulich wrote:
> On 21.12.2023 19:08, Neowutran wrote:
> > On 2023-12-19 17:12, Jan Beulich wrote:
> >> On 16.12.2023 08:01, Neowutran wrote:
> >>> I am wondering if the variable "allow_memory_relocate" is still
> >>> relevant today and if its default value is still relevant. 
> >>> Should it be defined to 0 by default instead of 1 (it seems to be a
> >>> workaround for qemu-traditional, so maybe an outdated default value ? ) ? 
> >>
> >> So are you saying you use qemu-trad?
> > No, I am using "qemu_xen" (from xenstore-read -> 'device-model = "qemu_xen"').
> > 
> >> Otherwise isn't libxl suppressing this behavior anyway?
> > If by "isn't libxl suppressing this behavior" you means if libxl is setting
> > the value of "allow_memory_relocate", then the answer is no. 
> > 
> > Following this lead, I checked in what code path
> > "allow_memory_relocate" could be defined. 
> > 
> > It is only defined in one code path, 
> > 
> > In the file "tools/libs/light/libxl_dm.c",
> > in the function "void libxl__spawn_local_dm(libxl__egc *egc, 
> > libxl__dm_spawn_state *dmss)": 
> > 
> > '''
> >  // ...
> > if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
> >  // ...
> > 
> > libxl__xs_printf(gc, XBT_NULL,
> >  GCSPRINTF("%s/hvmloader/allow-memory-relocate", 
> > path),
> >  "%d",
> >  
> > b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL
> >  &&
> >  !libxl__vnuma_configured(b_info));
> > // ...
> > '''
> > 
> > However, on QubesOS (my system), "local_dm" is never used, "stub_dm"
> > is always used. 
> > 
> > In the function "void lib 
> > xl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)", 
> > the value of "allow_memory_relocate" is never defined. 
> > 
> > I tried to patch the code to define "allow_memory_relocate" in
> > "libxl__spawn_stub_dm": 
> > 
> > '''
> > --- a/tools/libs/light/libxl_dm.c
> > +++ b/tools/libs/light/libxl_dm.c
> > @@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, 
> > libxl__stub_dm_spawn_state *sdss)
> > libxl__xs_get_dompath(gc, 
> > guest_domid)),
> >  "%s",
> >  
> > libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
> > +libxl__xs_printf(gc, XBT_NULL,
> > + libxl__sprintf(gc, 
> > "%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, 
> > guest_domid)),
> > + "%d",
> > + 0);
> >  }
> >  ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
> >  if (ret<0) {
> > '''
> > 
> > And it is indeed another way to solve my issue.
> > However I do not understand the Xen hypervisor enough to know if
> > "allow-memory-relocate" should have been defined in
> > "libxl__spawn_stub_dm" or if the real issue is somewhere else.
> 
> I think it should be done the same no matter where qemu runs. Back at the
> time iirc only qemu-trad could run in a stubdom, which may explain the
> omission. Cc-ing the two guys who are likely in the best position to tell
> for sure.

This indeed looks like a missing piece of qemu-xen support in
stubdomain.
But also, as Jan pointed out, this would be better fixed on the qemu
side so that memory relocation can be re-enabled.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab




Re: hvmloader - allow_memory_relocate overlaps

2023-12-22 Thread Neowutran
> > '''
> > 
> > And the associated result is: 
> > 
> > (d22) NEOWUTRAN pci.c: pci_mem_end: -67108864
> > (d22) NEOWUTRAN pci.c: pci_mem_start: -268435456
> > (d22) NEOWUTRAN pci.c: allow_memory_relocate: 0
> > (d22) NEOWUTRAN pci.c: hvm_info->low_mem_pgend: 983040
> > (d22) NEOWUTRAN pci.c: HVM_LOADER would have tried to relocate guest memory
> > (d22) NEOWUTRAN pci.c: pci_mem_start: -268435456
> > 
> > So if "allow_memory_relocate" was not defined to 0, the hvmloader
> > would have tried to overlap guest memory, and it seems that is
> > something that qemu_xen cannot handle.
> 
> Well, memory addresses printed in decimal are pretty hard to work with.
> But I'd like to ask anyway that you supply all of the log messages for
> such a guest starting, to be able to correlate what you've added with
> other items also logged.
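
As a reading aid for the decimal values above: they look like hvmloader's
usual constants printed through a signed format (the mapping to the named
constants is an assumption, not something the log states):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* The log prints 32-bit values via a signed conversion, hence the
     * negative decimals.  Converting them back: */
    const int32_t pci_mem_start = -268435456; /* 0xF0000000, HVM_BELOW_4G_MMIO_START */
    const int32_t pci_mem_end   =  -67108864; /* 0xFC000000, RESERVED_MEMBASE */

    printf("pci_mem_start = 0x%08X\n", (uint32_t)pci_mem_start);
    printf("pci_mem_end   = 0x%08X\n", (uint32_t)pci_mem_end);
    return 0;
}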

Full logs with my patch to set allow-memory-relocate 
(https://github.com/neowutran/qubes-vmm-xen/commit/819705bc346cad14836fd523195ad2b0445330ac)
https://pastebin.com/9kQgvraK
(GPU passthrough works, hvmloader doesn't overlap with guest memory)


Full logs without my patch to set allow-memory-relocate 
(https://github.com/neowutran/qubes-vmm-xen/blob/allowmemoryrelocate/ALLOWMEMORYRELOCATE.patch)
https://pastebin.com/gQGg55WZ
(GPU passthrough doesn't work, hvmloader overlaps with guest memory)

> 
> Jan

Thanks,
Neowutran



Re: hvmloader - allow_memory_relocate overlaps

2023-12-22 Thread Jan Beulich
On 21.12.2023 19:08, Neowutran wrote:
> On 2023-12-19 17:12, Jan Beulich wrote:
>> On 16.12.2023 08:01, Neowutran wrote:
>>> I am wondering if the variable "allow_memory_relocate" is still
>>> relevant today and if its default value is still relevant. 
>>> Should it be defined to 0 by default instead of 1 (it seems to be a
>>> workaround for qemu-traditional, so maybe an outdated default value ? ) ? 
>>
>> So are you saying you use qemu-trad?
> No, I am using "qemu_xen" (from xenstore-read -> 'device-model = "qemu_xen"').
> 
>> Otherwise isn't libxl suppressing this behavior anyway?
> If by "isn't libxl suppressing this behavior" you means if libxl is setting
> the value of "allow_memory_relocate", then the answer is no. 
> 
> Following this lead, I checked in what code path
> "allow_memory_relocate" could be defined. 
> 
> It is only defined in one code path, 
> 
> In the file "tools/libs/light/libxl_dm.c",
> in the function "void libxl__spawn_local_dm(libxl__egc *egc, 
> libxl__dm_spawn_state *dmss)": 
> 
> '''
>  // ...
> if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
>  // ...
> 
> libxl__xs_printf(gc, XBT_NULL,
>  GCSPRINTF("%s/hvmloader/allow-memory-relocate", 
> path),
>  "%d",
>  
> b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL 
> &&
>  !libxl__vnuma_configured(b_info));
> // ...
> '''
> 
> However, on QubesOS (my system), "local_dm" is never used, "stub_dm"
> is always used. 
> 
> In the function "void lib 
> xl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)", 
> the value of "allow_memory_relocate" is never defined. 
> 
> I tried to patch the code to define "allow_memory_relocate" in
> "libxl__spawn_stub_dm": 
> 
> '''
> --- a/tools/libs/light/libxl_dm.c
> +++ b/tools/libs/light/libxl_dm.c
> @@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, 
> libxl__stub_dm_spawn_state *sdss)
> libxl__xs_get_dompath(gc, 
> guest_domid)),
>  "%s",
>  
> libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
> +libxl__xs_printf(gc, XBT_NULL,
> + libxl__sprintf(gc, 
> "%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, guest_domid)),
> + "%d",
> + 0);
>  }
>  ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
>  if (ret<0) {
> '''
> 
> And it is indeed another way to solve my issue.
> However I do not understand the Xen hypervisor enough to know if
> "allow-memory-relocate" should have been defined in
> "libxl__spawn_stub_dm" or if the real issue is somewhere else.

I think it should be done the same no matter where qemu runs. Back at the
time iirc only qemu-trad could run in a stubdom, which may explain the
omission. Cc-ing the two guys who are likely in the best position to tell
for sure.
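
A minimal sketch of what "done the same" might look like in
libxl__spawn_stub_dm, simply mirroring the libxl__spawn_local_dm snippet
quoted above instead of hard-coding 0 (untested; in particular the
guest_config->b_info spelling is an assumption about what is in scope there):

    libxl__xs_printf(gc, XBT_NULL,
                     GCSPRINTF("%s/hvmloader/allow-memory-relocate",
                               libxl__xs_get_dompath(gc, guest_domid)),
                     "%d",
                     guest_config->b_info.device_model_version ==
                         LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
                     !libxl__vnuma_configured(&guest_config->b_info));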

>>> Code extract: 
>>>
>>> tools/firmware/hvmloader/pci.c 
>>> "
>>>/*
>>>  * Do we allow hvmloader to relocate guest memory in order to
>>>  * increase the size of the lowmem MMIO hole?  Defaulting to 1
> >>>  * here will mean that non-libxl toolstacks (including xend and
> >>>  * home-grown ones) means that those using qemu-xen will still
> >>>  * experience the memory relocation bug described below; but it
> >>>  * also means that those using qemu-traditional will *not*
>>>  * experience any change; and it also means that there is a
>>>  * work-around for those using qemu-xen, namely switching to
>>>  * qemu-traditional.
>>>  *
>>>  * If we defaulted to 0, and failing to resize the hole caused any
>>>  * problems with qemu-traditional, then there is no work-around.
>>>  *
> >>>  * Since xend can only use qemu-traditional, I think this is the
>>>  * option that will have the least impact.
>>>  */
>>> bool allow_memory_relocate = 1;
>>> "
>>>
>>> "
>>> /*
>>>  * At the moment qemu-xen can't deal with relocated memory regions.
>>>  * It's too close to the release to make a proper fix; for now,
> >>>  * only allow the MMIO hole to grow large enough to move guest memory
>>>  * if we're running qemu-traditional.  Items that don't fit will be
>>>  * relocated into the 64-bit address space.
>>>  *
>>>  * This loop now does the following:
>>>  * - If allow_memory_relocate, increase the MMIO hole until it's
> >>>  *   big enough, or until it's 2GiB
>>>  * - If !allow_memory_relocate, increase the MMIO hole until it's
>>>  *   big enough, or until it's 2GiB, or until it overlaps guest
>>>  *   memory
>>>  */
> >>> while ( (mmio_total > (pci_mem_end - pci_mem_start))
> >>> && ((pci_mem_start << 1) != 0)
> >>> && 

Re: hvmloader - allow_memory_relocate overlaps

2023-12-21 Thread Neowutran
On 2023-12-19 17:12, Jan Beulich wrote:
> On 16.12.2023 08:01, Neowutran wrote:
> > I am wondering if the variable "allow_memory_relocate" is still
> > relevant today and if its default value is still relevant. 
> > Should it be defined to 0 by default instead of 1 (it seems to be a
> > workaround for qemu-traditional, so maybe an outdated default value ? ) ? 
> 
> So are you saying you use qemu-trad?
No, I am using "qemu_xen" (from xenstore-read -> 'device-model = "qemu_xen"').

> Otherwise isn't libxl suppressing this behavior anyway?
If by "isn't libxl suppressing this behavior" you means if libxl is setting
the value of "allow_memory_relocate", then the answer is no. 

Following this lead, I checked in what code path
"allow_memory_relocate" could be defined. 

It is only defined in one code path, 

In the file "tools/libs/light/libxl_dm.c",
in the function "void libxl__spawn_local_dm(libxl__egc *egc, 
libxl__dm_spawn_state *dmss)": 

'''
 // ...
if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
 // ...

libxl__xs_printf(gc, XBT_NULL,
 GCSPRINTF("%s/hvmloader/allow-memory-relocate", path),
 "%d",
 
b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
 !libxl__vnuma_configured(b_info));
// ...
'''

However, on QubesOS (my system), "local_dm" is never used, "stub_dm"
is always used. 

In the function "void lib��
xl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)", 
the value of "allow_memory_relocate" is never defined. 

I tried to patch the code to define "allow_memory_relocate" in
"libxl__spawn_stub_dm": 

'''
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, 
libxl__stub_dm_spawn_state *sdss)
libxl__xs_get_dompath(gc, guest_domid)),
 "%s",
 
libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
+libxl__xs_printf(gc, XBT_NULL,
+ libxl__sprintf(gc, 
"%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, guest_domid)),
+ "%d",
+ 0);
 }
 ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
 if (ret<0) {
'''

And it is indeed another way to solve my issue.
However I do not understand the Xen hypervisor enough to know if
"allow-memory-relocate" should have been defined in
"libxl__spawn_stub_dm" or if the real issue is somewhere else.


> Or did that code go stale?
> 
> > Code extract: 
> > 
> > tools/firmware/hvmloader/pci.c 
> > "
> >/*
> >  * Do we allow hvmloader to relocate guest memory in order to
> >  * increase the size of the lowmem MMIO hole?  Defaulting to 1
> >  * here will mean that non-libxl toolstacks (including xend and
> >  * home-grown ones) means that those using qemu-xen will still
> >  * experience the memory relocation bug described below; but it
> >  * also means that those using qemu-traditional will *not*
> >  * experience any change; and it also means that there is a
> >  * work-around for those using qemu-xen, namely switching to
> >  * qemu-traditional.
> >  *
> >  * If we defaulted to 0, and failing to resize the hole caused any
> >  * problems with qemu-traditional, then there is no work-around.
> >  *
> >  * Since xend can only use qemu-traditional, I think this is the
> >  * option that will have the least impact.
> >  */
> > bool allow_memory_relocate = 1;
> > "
> > 
> > "
> > /*
> >  * At the moment qemu-xen can't deal with relocated memory regions.
> >  * It's too close to the release to make a proper fix; for now,
> >  * only allow the MMIO hole to grow large enough to move guest memory
> >  * if we're running qemu-traditional.  Items that don't fit will be
> >  * relocated into the 64-bit address space.
> >  *
> >  * This loop now does the following:
> >  * - If allow_memory_relocate, increase the MMIO hole until it's
> >  *   big enough, or until it's 2GiB
> >  * - If !allow_memory_relocate, increase the MMIO hole until it's
> >  *   big enough, or until it's 2GiB, or until it overlaps guest
> >  *   memory
> >  */
> > while ( (mmio_total > (pci_mem_end - pci_mem_start))
> > && ((pci_mem_start << 1) != 0)
> > && (allow_memory_relocate
> > || (((pci_mem_start << 1) >> PAGE_SHIFT)
> > >= hvm_info->low_mem_pgend)) )
> > pci_mem_start <<= 1;
> > "
> > 
> > The issue it causes is documented in the source code: guest memory can
> > be trashed by the hvmloader. 
> > 
> > Due to this issue, it is impossible to pass through a large PCI device 

Re: hvmloader - allow_memory_relocate overlaps

2023-12-19 Thread Jan Beulich
On 16.12.2023 08:01, Neowutran wrote:
> I am wondering if the variable "allow_memory_relocate" is still
> relevant today and if its default value is still relevant. 
> Should it be defined to 0 by default instead of 1 (it seems to be a
> workaround for qemu-traditional, so maybe an outdated default value ? ) ? 

So are you saying you use qemu-trad? Otherwise isn't libxl suppressing
this behavior anyway? Or did that code go stale?

> Code extract: 
> 
> tools/firmware/hvmloader/pci.c 
> "
>/*
>  * Do we allow hvmloader to relocate guest memory in order to
>  * increase the size of the lowmem MMIO hole?  Defaulting to 1
>  * here will mean that non-libxl toolstacks (including xend and
>  * home-grown ones) means that those using qemu-xen will still
>  * experience the memory relocation bug described below; but it
>  * also means that those using qemu-traditional will *not*
>  * experience any change; and it also means that there is a
>  * work-around for those using qemu-xen, namely switching to
>  * qemu-traditional.
>  *
>  * If we defaulted to 0, and failing to resize the hole caused any
>  * problems with qemu-traditional, then there is no work-around.
>  *
>  * Since xend can only use qemu-traditional, I think this is the
>  * option that will have the least impact.
>  */
> bool allow_memory_relocate = 1;
> "
> 
> "
> /*
>  * At the moment qemu-xen can't deal with relocated memory regions.
>  * It's too close to the release to make a proper fix; for now,
>  * only allow the MMIO hole to grow large enough to move guest memory
>  * if we're running qemu-traditional.  Items that don't fit will be
>  * relocated into the 64-bit address space.
>  *
>  * This loop now does the following:
>  * - If allow_memory_relocate, increase the MMIO hole until it's
>  *   big enough, or until it's 2GiB
>  * - If !allow_memory_relocate, increase the MMIO hole until it's
>  *   big enough, or until it's 2GiB, or until it overlaps guest
>  *   memory
>  */
> while ( (mmio_total > (pci_mem_end - pci_mem_start))
> && ((pci_mem_start << 1) != 0)
> && (allow_memory_relocate
> || (((pci_mem_start << 1) >> PAGE_SHIFT)
> >= hvm_info->low_mem_pgend)) )
> pci_mem_start <<= 1;
> "
> 
> The issue it causes is documented in the source code: guest memory can
> be trashed by the hvmloader. 
> 
> Due to this issue, it is impossible to pass through a large PCI device to an
> HVM with a lot of RAM.
> (https://github.com/QubesOS/qubes-issues/issues/4321). 
> 
> (Forcing "allow_memory_relocate" to be 0 seems to solve the issue
> linked)

What I don't understand here (and what I also can't find helpful logs for)
is: The code in hvmloader is in principle intended to work in both cases.
If there's suspected guest memory corruption, can we get to see the actual
results of the MMIO hole creation and its using for device ranges? If there
indeed is an overlap with guest RAM, that's a bug which wants fixing. I
don't see why we would want to drop allow_memory_relocate in the way you
suggest; quite the other way around, if upstream qemu still can't tolerate
hvmloader doing this, then it wants fixing there such that RAM relocation
can be enabled unconditionally. Then the variable will indeed not be needed
anymore.

Jan



hvmloader - allow_memory_relocate overlaps

2023-12-15 Thread Neowutran
I am wondering whether the variable "allow_memory_relocate" is still
relevant today and whether its default value is still appropriate.
Should it be defined to 0 by default instead of 1 (it seems to be a
workaround for qemu-traditional, so maybe an outdated default value)?

Code extract: 

tools/firmware/hvmloader/pci.c 
"
   /*
 * Do we allow hvmloader to relocate guest memory in order to
 * increase the size of the lowmem MMIO hole?  Defaulting to 1
 * here will mean that non-libxl toolstacks (including xend and
 * home-grown ones) means that those using qemu-xen will still
 * experience the memory relocation bug described below; but it
 * also means that those using qemu-traditional will *not*
 * experience any change; and it also means that there is a
 * work-around for those using qemu-xen, namely switching to
 * qemu-traditional.
 *
 * If we defaulted to 0, and failing to resize the hole caused any
 * problems with qemu-traditional, then there is no work-around.
 *
 * Since xend can only use qemu-traditional, I think this is the
 * option that will have the least impact.
 */
bool allow_memory_relocate = 1;
"

"
/*
 * At the moment qemu-xen can't deal with relocated memory regions.
 * It's too close to the release to make a proper fix; for now,
 * only allow the MMIO hole to grow large enough to move guest memory
 * if we're running qemu-traditional.  Items that don't fit will be
 * relocated into the 64-bit address space.
 *
 * This loop now does the following:
 * - If allow_memory_relocate, increase the MMIO hole until it's
 *   big enough, or until it's 2GiB
 * - If !allow_memory_relocate, increase the MMIO hole until it's
 *   big enough, or until it's 2GiB, or until it overlaps guest
 *   memory
 */
while ( (mmio_total > (pci_mem_end - pci_mem_start))
&& ((pci_mem_start << 1) != 0)
&& (allow_memory_relocate
|| (((pci_mem_start << 1) >> PAGE_SHIFT)
>= hvm_info->low_mem_pgend)) )
pci_mem_start <<= 1;
"

The issue it causes is documented in the source code: guest memory can
be trashed by the hvmloader. 

Due to this issue, it is impossible to pass through a large PCI device to an HVM
with a lot of RAM.
(https://github.com/QubesOS/qubes-issues/issues/4321). 

(Forcing "allow_memory_relocate" to be 0 seems to solve the issue
linked)



Thanks, 
Neowutran