Re: [PATCH v5] hw/i386/pc: improve physical address space bound check for 32-bit x86 systems

2023-09-22 Thread Philippe Mathieu-Daudé

On 22/9/23 17:01, Ani Sinha wrote:

32-bit x86 systems do not have a reserved memory for hole64. On those 32-bit
systems without PSE36 or PAE CPU features, hotplugging memory devices are not
supported by QEMU as QEMU always places hotplugged memory above 4 GiB boundary
which is beyond the physical address space of the processor. Linux guests also
does not support memory hotplug on those systems. Please see Linux
kernel commit b59d02ed08690 ("mm/memory_hotplug: disable the functionality
for 32b") for more details.

Therefore, the maximum limit of the guest physical address in the absence of
additional memory devices effectively coincides with the end of
"above 4G memory space" region for 32-bit x86 without PAE/PSE36. When users
configure additional memory devices, after properly accounting for the
additional device memory region to find the maximum value of the guest
physical address, the address will be outside the range of the processor's
physical address space.

This change adds improvements to take above into consideration.

For example, previously this was allowed:

$ ./qemu-system-x86_64 -cpu pentium -m size=10G

With this change now it is no longer allowed:

$ ./qemu-system-x86_64 -cpu pentium -m size=10G
qemu-system-x86_64: Address space limit 0x < 0x2bfff phys-bits too 
low (32)

However, the following are allowed since on both cases physical address
space of the processor is 36 bits:

$ ./qemu-system-x86_64 -cpu pentium2 -m size=10G
$ ./qemu-system-x86_64 -cpu pentium,pse36=on -m size=10G

For 32-bit, without PAE/PSE36, hotplugging additional memory is no longer 
allowed.

$ ./qemu-system-i386 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0x < 0x1 phys-bits too 
low (32)
$ ./qemu-system-i386 -machine q35 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0x < 0x1 phys-bits too 
low (32)

A new compatibility flag is introduced to make sure pc_max_used_gpa() keeps
returning the old value for machines 8.1 and older.
Therefore, the above is still allowed for older machine types in order to 
support
compatibility. Hence, the following still works:

$ ./qemu-system-i386 -machine pc-i440fx-8.1 -m size=1G,maxmem=3G,slots=2
$ ./qemu-system-i386 -machine pc-q35-8.1 -m size=1G,maxmem=3G,slots=2

Further, following is also allowed as with PSE36, the processor has 36-bit
address space:

$ ./qemu-system-i386 -cpu 486,pse36=on -m size=1G,maxmem=3G,slots=2

After calling CPUID with EAX=0x8001, all AMD64 compliant processors
have the longmode-capable-bit turned on in the extended feature flags (bit 29)
in EDX. The absence of CPUID longmode can be used to differentiate between
32-bit and 64-bit processors and is the recommended approach. QEMU takes this
approach elsewhere (for example, please see x86_cpu_realizefn()), With
this change, pc_max_used_gpa() also uses the same method to detect 32-bit
processors.

Unit tests are modified to not run 32-bit x86 tests that use memory hotplug.

Suggested-by: David Hildenbrand 
Signed-off-by: Ani Sinha 
Reviewed-by: David Hildenbrand 
---
  hw/i386/pc.c   | 30 +++---
  hw/i386/pc_piix.c  |  4 
  hw/i386/pc_q35.c   |  2 ++
  include/hw/i386/pc.h   |  6 ++
  tests/qtest/bios-tables-test.c | 26 ++
  tests/qtest/numa-test.c|  7 ++-
  6 files changed, 63 insertions(+), 12 deletions(-)

changelog:
v5: addressed phil's suggestions in code reorg to make it cleaner.
v4: address comments from v3. Fix a bug where compat knob was absent
from q35 machines. Commit message adjustment.
v3: still accounting for additional memory device region above 4G.
unit tests fixed (not running for 32-bit where mem hotplug is used).
v2: removed memory hotplug region from max_gpa. added compat knobs.

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 3db0743f31..f8d1cd1362 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -907,13 +907,37 @@ static uint64_t pc_get_cxl_range_end(PCMachineState *pcms)
  static hwaddr pc_max_used_gpa(PCMachineState *pcms, uint64_t pci_hole64_size)
  {
  X86CPU *cpu = X86_CPU(first_cpu);
+PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+MachineState *ms = MACHINE(pcms);
+uint64_t devmem_start = 0;
+ram_addr_t devmem_size = 0;
  
-/* 32-bit systems don't have hole64 thus return max CPU address */

-if (cpu->phys_bits <= 32) {
+if (cpu->env.features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
+/* 64-bit systems */
+return pc_pci_hole64_start() + pci_hole64_size - 1;
+}
+
+/*
+ * 32-bit systems don't have hole64 but they might have a region for
+ * memory devices. Even if additional hotplugged memory devices might
+ * not be usable by most guest OSes, we need to still consider them for
+ * calculating the highest possible GPA so that we can properly report
+ * if someone configures them on a CPU that cannot 

[PATCH v5] hw/i386/pc: improve physical address space bound check for 32-bit x86 systems

2023-09-22 Thread Ani Sinha
32-bit x86 systems do not have a reserved memory for hole64. On those 32-bit
systems without PSE36 or PAE CPU features, hotplugging memory devices are not
supported by QEMU as QEMU always places hotplugged memory above 4 GiB boundary
which is beyond the physical address space of the processor. Linux guests also
does not support memory hotplug on those systems. Please see Linux
kernel commit b59d02ed08690 ("mm/memory_hotplug: disable the functionality
for 32b") for more details.

Therefore, the maximum limit of the guest physical address in the absence of
additional memory devices effectively coincides with the end of
"above 4G memory space" region for 32-bit x86 without PAE/PSE36. When users
configure additional memory devices, after properly accounting for the
additional device memory region to find the maximum value of the guest
physical address, the address will be outside the range of the processor's
physical address space.

This change adds improvements to take above into consideration.

For example, previously this was allowed:

$ ./qemu-system-x86_64 -cpu pentium -m size=10G

With this change now it is no longer allowed:

$ ./qemu-system-x86_64 -cpu pentium -m size=10G
qemu-system-x86_64: Address space limit 0x < 0x2bfff phys-bits too 
low (32)

However, the following are allowed since on both cases physical address
space of the processor is 36 bits:

$ ./qemu-system-x86_64 -cpu pentium2 -m size=10G
$ ./qemu-system-x86_64 -cpu pentium,pse36=on -m size=10G

For 32-bit, without PAE/PSE36, hotplugging additional memory is no longer 
allowed.

$ ./qemu-system-i386 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0x < 0x1 phys-bits too 
low (32)
$ ./qemu-system-i386 -machine q35 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0x < 0x1 phys-bits too 
low (32)

A new compatibility flag is introduced to make sure pc_max_used_gpa() keeps
returning the old value for machines 8.1 and older.
Therefore, the above is still allowed for older machine types in order to 
support
compatibility. Hence, the following still works:

$ ./qemu-system-i386 -machine pc-i440fx-8.1 -m size=1G,maxmem=3G,slots=2
$ ./qemu-system-i386 -machine pc-q35-8.1 -m size=1G,maxmem=3G,slots=2

Further, following is also allowed as with PSE36, the processor has 36-bit
address space:

$ ./qemu-system-i386 -cpu 486,pse36=on -m size=1G,maxmem=3G,slots=2

After calling CPUID with EAX=0x8001, all AMD64 compliant processors
have the longmode-capable-bit turned on in the extended feature flags (bit 29)
in EDX. The absence of CPUID longmode can be used to differentiate between
32-bit and 64-bit processors and is the recommended approach. QEMU takes this
approach elsewhere (for example, please see x86_cpu_realizefn()), With
this change, pc_max_used_gpa() also uses the same method to detect 32-bit
processors.

Unit tests are modified to not run 32-bit x86 tests that use memory hotplug.

Suggested-by: David Hildenbrand 
Signed-off-by: Ani Sinha 
Reviewed-by: David Hildenbrand 
---
 hw/i386/pc.c   | 30 +++---
 hw/i386/pc_piix.c  |  4 
 hw/i386/pc_q35.c   |  2 ++
 include/hw/i386/pc.h   |  6 ++
 tests/qtest/bios-tables-test.c | 26 ++
 tests/qtest/numa-test.c|  7 ++-
 6 files changed, 63 insertions(+), 12 deletions(-)

changelog:
v5: addressed phil's suggestions in code reorg to make it cleaner.
v4: address comments from v3. Fix a bug where compat knob was absent
from q35 machines. Commit message adjustment.
v3: still accounting for additional memory device region above 4G.
unit tests fixed (not running for 32-bit where mem hotplug is used).
v2: removed memory hotplug region from max_gpa. added compat knobs.

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 3db0743f31..f8d1cd1362 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -907,13 +907,37 @@ static uint64_t pc_get_cxl_range_end(PCMachineState *pcms)
 static hwaddr pc_max_used_gpa(PCMachineState *pcms, uint64_t pci_hole64_size)
 {
 X86CPU *cpu = X86_CPU(first_cpu);
+PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+MachineState *ms = MACHINE(pcms);
+uint64_t devmem_start = 0;
+ram_addr_t devmem_size = 0;
 
-/* 32-bit systems don't have hole64 thus return max CPU address */
-if (cpu->phys_bits <= 32) {
+if (cpu->env.features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
+/* 64-bit systems */
+return pc_pci_hole64_start() + pci_hole64_size - 1;
+}
+
+/*
+ * 32-bit systems don't have hole64 but they might have a region for
+ * memory devices. Even if additional hotplugged memory devices might
+ * not be usable by most guest OSes, we need to still consider them for
+ * calculating the highest possible GPA so that we can properly report
+ * if someone configures them on a CPU that cannot possibly address them.
+ */
+if