Re: [patch] x86/smpboot: Fix the parallel bringup decision

2023-05-31 Thread Tom Lendacky

On 5/31/23 02:44, Thomas Gleixner wrote:

The decision to allow parallel bringup of secondary CPUs checks
CC_ATTR_GUEST_STATE_ENCRYPT to detect encrypted guests. Those cannot use
parallel bootup because accessing the local APIC is intercepted and raises
a #VC or #VE, which cannot be handled at that point.

The check works correctly, but only for AMD encrypted guests. TDX does not
set that flag.

As there is no real connection between CC attributes and the inability to
support parallel bringup, replace this with a generic control flag in
x86_cpuinit and let SEV-ES and TDX init code disable it.

Fixes: 0c7ffa32dbd6 ("x86/smpboot/64: Implement arch_cpuhp_init_parallel_bringup() and enable it")
Reported-by: Kirill A. Shutemov 
Signed-off-by: Thomas Gleixner 


Still works for SEV-ES/SEV-SNP with parallel boot properly disabled.

Tested-by: Tom Lendacky 


---
  arch/x86/coco/tdx/tdx.c |   11 +++
  arch/x86/include/asm/x86_init.h |3 +++
  arch/x86/kernel/smpboot.c   |   19 ++-
  arch/x86/kernel/x86_init.c  |1 +
  arch/x86/mm/mem_encrypt_amd.c   |   15 +++
  5 files changed, 32 insertions(+), 17 deletions(-)

--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -871,5 +871,16 @@ void __init tdx_early_init(void)
x86_platform.guest.enc_tlb_flush_required   = tdx_tlb_flush_required;
x86_platform.guest.enc_status_change_finish = tdx_enc_status_changed;
  
+	/*
+	 * TDX intercepts the RDMSR to read the X2APIC ID in the parallel
+	 * bringup low level code. That raises #VE which cannot be handled
+	 * there.
+	 *
+	 * Intel-TDX has a secure RDMSR hypercall, but that needs to be
+	 * implemented separately in the low level startup ASM code.
+	 * Until that is in place, disable parallel bringup for TDX.
+	 */
+	x86_cpuinit.parallel_bringup = false;
+
pr_info("Guest detected\n");
  }
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -177,11 +177,14 @@ struct x86_init_ops {
   * struct x86_cpuinit_ops - platform specific cpu hotplug setups
   * @setup_percpu_clockev: set up the per cpu clock event device
   * @early_percpu_clock_init:  early init of the per cpu clock event device
+ * @fixup_cpu_id:  fixup function for cpuinfo_x86::phys_proc_id
+ * @parallel_bringup:  Parallel bringup control
   */
  struct x86_cpuinit_ops {
void (*setup_percpu_clockev)(void);
void (*early_percpu_clock_init)(void);
void (*fixup_cpu_id)(struct cpuinfo_x86 *c, int node);
+   bool parallel_bringup;
  };
  
  struct timespec64;

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1267,23 +1267,8 @@ void __init smp_prepare_cpus_common(void
  /* Establish whether parallel bringup can be supported. */
  bool __init arch_cpuhp_init_parallel_bringup(void)
  {
-	/*
-	 * Encrypted guests require special handling. They enforce X2APIC
-	 * mode but the RDMSR to read the APIC ID is intercepted and raises
-	 * #VC or #VE which cannot be handled in the early startup code.
-	 *
-	 * AMD-SEV does not provide a RDMSR GHCB protocol so the early
-	 * startup code cannot directly communicate with the secure
-	 * firmware. The alternative solution to retrieve the APIC ID via
-	 * CPUID(0xb), which is covered by the GHCB protocol, is not viable
-	 * either because there is no enforcement of the CPUID(0xb)
-	 * provided "initial" APIC ID to be the same as the real APIC ID.
-	 *
-	 * Intel-TDX has a secure RDMSR hypercall, but that needs to be
-	 * implemented separately in the low level startup ASM code.
-	 */
-	if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) {
-		pr_info("Parallel CPU startup disabled due to guest state encryption\n");
+	if (!x86_cpuinit.parallel_bringup) {
+		pr_info("Parallel CPU startup disabled by the platform\n");
return false;
}
  
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -126,6 +126,7 @@ struct x86_init_ops x86_init __initdata
  struct x86_cpuinit_ops x86_cpuinit = {
.early_percpu_clock_init= x86_init_noop,
.setup_percpu_clockev   = setup_secondary_APIC_clock,
+   .parallel_bringup   = true,
  };
  
  static void default_nmi_init(void) { };

--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -501,6 +501,21 @@ void __init sme_early_init(void)
	x86_platform.guest.enc_status_change_finish  = amd_enc_status_change_finish;
	x86_platform.guest.enc_tlb_flush_required    = amd_enc_tlb_flush_required;
	x86_platform.guest.enc_cache_flush_required  = amd_enc_cache_flush_required;
+
+   /*
+* AMD-SEV-ES intercepts the RDMSR to read the X2APIC ID in the
+* para

Re: [patch] x86/smpboot: Disable parallel bootup if cc_vendor != NONE

2023-05-30 Thread Tom Lendacky

On 5/30/23 15:39, Thomas Gleixner wrote:

On Tue, May 30 2023 at 15:03, Tom Lendacky wrote:

On 5/30/23 14:51, Thomas Gleixner wrote:

That aside. From a semantical POV making this decision about parallel
bootup based on some magic CC encryption attribute is questionable.

I'm tending to just do the below and make this CC agnostic (except that
I couldn't find the right spot for SEV-ES to clear that flag.)


Maybe in sme_sev_setup_real_mode() in arch/x86/realmode/init.c? You could
clear the flag within the CC_ATTR_GUEST_STATE_ENCRYPT check.


Eeew.

Can we please have a AMD SEV-ES init specific place and not hijack some
random code which has to check CC_ATTR_GUEST_STATE_ENCRYPT?


As long as it's not too early, you could try sme_early_init() in 
arch/x86/mm/mem_encrypt_amd.c. Add a check for sev_status & 
MSR_AMD64_SEV_ES_ENABLED and clear the flag.


Thanks,
Tom
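
For reference, a rough sketch of what that check could look like at the end of
sme_early_init() (a sketch based on the suggestion above, not the exact merged
hunk, which is truncated earlier in this archive):

	if (sev_status & MSR_AMD64_SEV_ES_ENABLED) {
		/*
		 * SEV-ES intercepts the RDMSR of the X2APIC ID in the
		 * parallel bringup low level code; the resulting #VC
		 * cannot be handled there, so keep parallel bringup off.
		 */
		x86_cpuinit.parallel_bringup = false;
	}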



Thanks,

 tglx




Re: [patch] x86/smpboot: Disable parallel bootup if cc_vendor != NONE

2023-05-30 Thread Tom Lendacky

On 5/30/23 14:51, Thomas Gleixner wrote:

On Tue, May 30 2023 at 09:56, Sean Christopherson wrote:

On Tue, May 30, 2023, Thomas Gleixner wrote:

On Tue, May 30 2023 at 15:29, Kirill A. Shutemov wrote:

On Tue, May 30, 2023 at 02:09:17PM +0200, Thomas Gleixner wrote:

The decision to allow parallel bringup of secondary CPUs checks
CC_ATTR_GUEST_STATE_ENCRYPT to detect encrypted guests. Those cannot use
parallel bootup because accessing the local APIC is intercepted and raises
a #VC or #VE, which cannot be handled at that point.

The check works correctly, but only for AMD encrypted guests. TDX does not
set that flag.

Check for cc_vendor != CC_VENDOR_NONE instead. That might be overbroad, but
definitely works for both AMD and Intel.


It boots fine with TDX, but I think it is wrong. cc_get_vendor() will
report CC_VENDOR_AMD even on bare metal if SME is enabled. I don't think
we want it.


Right. Did not think about that.

But CC_ATTR_GUEST_MEM_ENCRYPT is overbroad for AMD in the same way. Only
SEV-ES traps RDMSR if I'm understanding that maze correctly.


Ya, regular SEV doesn't encrypt register state.


That aside. From a semantical POV making this decision about parallel
bootup based on some magic CC encryption attribute is questionable.

I'm tending to just do the below and make this CC agnostic (except that
I couldn't find the right spot for SEV-ES to clear that flag.)


Maybe in sme_sev_setup_real_mode() in arch/x86/realmode/init.c? You could 
clear the flag within the CC_ATTR_GUEST_STATE_ENCRYPT check.


Thanks,
Tom



Thanks,

 tglx
---
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -871,5 +871,7 @@ void __init tdx_early_init(void)
x86_platform.guest.enc_tlb_flush_required   = tdx_tlb_flush_required;
x86_platform.guest.enc_status_change_finish = tdx_enc_status_changed;
  
+	x86_cpuinit.parallel_bringup = false;
+
pr_info("Guest detected\n");
  }
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -2,6 +2,7 @@
  #ifndef _ASM_X86_PLATFORM_H
  #define _ASM_X86_PLATFORM_H
  
+#include 

  #include 
  
  struct ghcb;

@@ -177,11 +178,14 @@ struct x86_init_ops {
   * struct x86_cpuinit_ops - platform specific cpu hotplug setups
   * @setup_percpu_clockev: set up the per cpu clock event device
   * @early_percpu_clock_init:  early init of the per cpu clock event device
+ * @fixup_cpu_id:  fixup function for cpuinfo_x86::phys_proc_id
+ * @parallel_bringup:  Parallel bringup control
   */
  struct x86_cpuinit_ops {
void (*setup_percpu_clockev)(void);
void (*early_percpu_clock_init)(void);
void (*fixup_cpu_id)(struct cpuinfo_x86 *c, int node);
+   bool parallel_bringup;
  };
  
  struct timespec64;

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1287,6 +1287,11 @@ bool __init arch_cpuhp_init_parallel_bri
return false;
}
  
+	if (!x86_cpuinit.parallel_bringup) {
+		pr_info("Parallel CPU startup disabled by the platform\n");
+		return false;
+	}
+
smpboot_control = STARTUP_READ_APICID;
pr_debug("Parallel CPU startup enabled: 0x%08x\n", smpboot_control);
return true;
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -126,6 +126,7 @@ struct x86_init_ops x86_init __initdata
  struct x86_cpuinit_ops x86_cpuinit = {
.early_percpu_clock_init= x86_init_noop,
.setup_percpu_clockev   = setup_secondary_APIC_clock,
+   .parallel_bringup   = true,
  };
  
  static void default_nmi_init(void) { };




Re: [PATCH V3 1/5] Swiotlb: Add Swiotlb bounce buffer remap function for HV IVM

2021-12-03 Thread Tom Lendacky

On 12/3/21 1:11 PM, Tom Lendacky wrote:

On 12/3/21 5:20 AM, Tianyu Lan wrote:

On 12/2/2021 10:42 PM, Tom Lendacky wrote:

On 12/1/21 10:02 AM, Tianyu Lan wrote:

From: Tianyu Lan 

In Isolation VM with AMD SEV, bounce buffer needs to be accessed via
extra address space which is above shared_gpa_boundary (E.G 39 bit
address line) reported by Hyper-V CPUID ISOLATION_CONFIG. The access
physical address will be original physical address + shared_gpa_boundary.
The shared_gpa_boundary in the AMD SEV SNP spec is called virtual top of
memory(vTOM). Memory addresses below vTOM are automatically treated as
private while memory above vTOM is treated as shared.

Expose swiotlb_unencrypted_base for platforms to set unencrypted
memory base offset and platform calls swiotlb_update_mem_attributes()
to remap swiotlb mem to unencrypted address space. memremap() can
not be called in the early stage and so put remapping code into
swiotlb_update_mem_attributes(). Store remap address and use it to copy
data from/to swiotlb bounce buffer.

Signed-off-by: Tianyu Lan 


This patch results in the following stack trace during a bare-metal boot
on my EPYC system with SME active (e.g. mem_encrypt=on):

[    0.123932] BUG: Bad page state in process swapper  pfn:108001
[    0.123942] page:(ptrval) refcount:0 mapcount:-128 mapping: index:0x0 pfn:0x108001
[    0.123946] flags: 0x17c000(node=0|zone=2|lastcpupid=0x1f)
[    0.123952] raw: 0017c000 88904f2d5e80 88904f2d5e80 
[    0.123954] raw:   ff7f 
[    0.123955] page dumped because: nonzero mapcount
[    0.123957] Modules linked in:
[    0.123961] CPU: 0 PID: 0 Comm: swapper Not tainted 5.16.0-rc3-sos-custom #2

[    0.123964] Hardware name: AMD Corporation
[    0.123967] Call Trace:
[    0.123971]  
[    0.123975]  dump_stack_lvl+0x48/0x5e
[    0.123985]  bad_page.cold+0x65/0x96
[    0.123990]  __free_pages_ok+0x3a8/0x410
[    0.123996]  memblock_free_all+0x171/0x1dc
[    0.124005]  mem_init+0x1f/0x14b
[    0.124011]  start_kernel+0x3b5/0x6a1
[    0.124016]  secondary_startup_64_no_verify+0xb0/0xbb
[    0.124022]  

I see ~40 of these traces, each for different pfns.

Thanks,
Tom


Hi Tom:
   Thanks for your test. Could you help to test the following patch 
and check whether it can fix the issue.


The patch is mangled. Is the only difference where set_memory_decrypted() 
is called?


I de-mangled the patch. No more stack traces with SME active.

Thanks,
Tom



Thanks,
Tom




diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 569272871375..f6c3638255d5 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -73,6 +73,9 @@ extern enum swiotlb_force swiotlb_force;
  * @end:	The end address of the swiotlb memory pool. Used to do a quick
  *		range check to see if the memory was in fact allocated by this
  *		API.
+ * @vaddr:	The vaddr of the swiotlb memory pool. The swiotlb memory pool
+ *		may be remapped in the memory encrypted case and store virtual
+ *		address for bounce buffer operation.
  * @nslabs:	The number of IO TLB blocks (in groups of 64) between @start and
  *		@end. For default swiotlb, this is command line adjustable via
  *		setup_io_tlb_npages.
@@ -92,6 +95,7 @@ extern enum swiotlb_force swiotlb_force;
  struct io_tlb_mem {
 phys_addr_t start;
 phys_addr_t end;
+   void *vaddr;
 unsigned long nslabs;
 unsigned long used;
 unsigned int index;
@@ -186,4 +190,6 @@ static inline bool is_swiotlb_for_alloc(struct device *dev)
  }
  #endif /* CONFIG_DMA_RESTRICTED_POOL */

+extern phys_addr_t swiotlb_unencrypted_base;
+
  #endif /* __LINUX_SWIOTLB_H */
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 8e840fbbed7c..34e6ade4f73c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -50,6 +50,7 @@
  #include 
  #include 

+#include 
  #include 
  #include 
  #include 
@@ -72,6 +73,8 @@ enum swiotlb_force swiotlb_force;

  struct io_tlb_mem io_tlb_default_mem;

+phys_addr_t swiotlb_unencrypted_base;
+
  /*
   * Max segment that we can provide which (if pages are contingous) will
   * not be bounced (unless SWIOTLB_FORCE is set).
@@ -155,6 +158,27 @@ static inline unsigned long nr_slots(u64 val)
 return DIV_ROUND_UP(val, IO_TLB_SIZE);
  }

+/*
+ * Remap swiotlb memory in the unencrypted physical address space
+ * when swiotlb_unencrypted_base is set (e.g. for Hyper-V AMD SEV-SNP
+ * Isolation VMs).
+ */
+void *swiotlb_mem_remap(struct io_tlb_mem *mem, unsigned long bytes)
+{
+   void *vaddr = NULL;
+
+   if (swiotlb_unencrypted_base) {
+   phys_addr_t paddr = mem->start + swiotlb_unencrypted_base;
+
+   vaddr = memremap(paddr, bytes, MEMREMAP_WB);
+   if (!vaddr)
+   pr_

Re: [PATCH V3 1/5] Swiotlb: Add Swiotlb bounce buffer remap function for HV IVM

2021-12-03 Thread Tom Lendacky

On 12/3/21 5:20 AM, Tianyu Lan wrote:

On 12/2/2021 10:42 PM, Tom Lendacky wrote:

On 12/1/21 10:02 AM, Tianyu Lan wrote:

From: Tianyu Lan 

In Isolation VM with AMD SEV, bounce buffer needs to be accessed via
extra address space which is above shared_gpa_boundary (E.G 39 bit
address line) reported by Hyper-V CPUID ISOLATION_CONFIG. The access
physical address will be original physical address + shared_gpa_boundary.
The shared_gpa_boundary in the AMD SEV SNP spec is called virtual top of
memory(vTOM). Memory addresses below vTOM are automatically treated as
private while memory above vTOM is treated as shared.

Expose swiotlb_unencrypted_base for platforms to set unencrypted
memory base offset and platform calls swiotlb_update_mem_attributes()
to remap swiotlb mem to unencrypted address space. memremap() can
not be called in the early stage and so put remapping code into
swiotlb_update_mem_attributes(). Store remap address and use it to copy
data from/to swiotlb bounce buffer.

Signed-off-by: Tianyu Lan 


This patch results in the following stack trace during a bare-metal boot
on my EPYC system with SME active (e.g. mem_encrypt=on):

[    0.123932] BUG: Bad page state in process swapper  pfn:108001
[    0.123942] page:(ptrval) refcount:0 mapcount:-128 mapping: index:0x0 pfn:0x108001
[    0.123946] flags: 0x17c000(node=0|zone=2|lastcpupid=0x1f)
[    0.123952] raw: 0017c000 88904f2d5e80 88904f2d5e80 
[    0.123954] raw:   ff7f 
[    0.123955] page dumped because: nonzero mapcount
[    0.123957] Modules linked in:
[    0.123961] CPU: 0 PID: 0 Comm: swapper Not tainted 5.16.0-rc3-sos-custom #2

[    0.123964] Hardware name: AMD Corporation
[    0.123967] Call Trace:
[    0.123971]  
[    0.123975]  dump_stack_lvl+0x48/0x5e
[    0.123985]  bad_page.cold+0x65/0x96
[    0.123990]  __free_pages_ok+0x3a8/0x410
[    0.123996]  memblock_free_all+0x171/0x1dc
[    0.124005]  mem_init+0x1f/0x14b
[    0.124011]  start_kernel+0x3b5/0x6a1
[    0.124016]  secondary_startup_64_no_verify+0xb0/0xbb
[    0.124022]  

I see ~40 of these traces, each for different pfns.

Thanks,
Tom


Hi Tom:
   Thanks for your test. Could you help to test the following patch 
and check whether it can fix the issue.


The patch is mangled. Is the only difference where set_memory_decrypted() 
is called?


Thanks,
Tom




diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 569272871375..f6c3638255d5 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -73,6 +73,9 @@ extern enum swiotlb_force swiotlb_force;
  * @end:	The end address of the swiotlb memory pool. Used to do a quick
  *		range check to see if the memory was in fact allocated by this
  *		API.
+ * @vaddr:	The vaddr of the swiotlb memory pool. The swiotlb memory pool
+ *		may be remapped in the memory encrypted case and store virtual
+ *		address for bounce buffer operation.
  * @nslabs:	The number of IO TLB blocks (in groups of 64) between @start and
  *		@end. For default swiotlb, this is command line adjustable via
  *		setup_io_tlb_npages.
@@ -92,6 +95,7 @@ extern enum swiotlb_force swiotlb_force;
  struct io_tlb_mem {
     phys_addr_t start;
     phys_addr_t end;
+   void *vaddr;
     unsigned long nslabs;
     unsigned long used;
     unsigned int index;
@@ -186,4 +190,6 @@ static inline bool is_swiotlb_for_alloc(struct device *dev)
  }
  #endif /* CONFIG_DMA_RESTRICTED_POOL */

+extern phys_addr_t swiotlb_unencrypted_base;
+
  #endif /* __LINUX_SWIOTLB_H */
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 8e840fbbed7c..34e6ade4f73c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -50,6 +50,7 @@
  #include 
  #include 

+#include 
  #include 
  #include 
  #include 
@@ -72,6 +73,8 @@ enum swiotlb_force swiotlb_force;

  struct io_tlb_mem io_tlb_default_mem;

+phys_addr_t swiotlb_unencrypted_base;
+
  /*
   * Max segment that we can provide which (if pages are contingous) will
   * not be bounced (unless SWIOTLB_FORCE is set).
@@ -155,6 +158,27 @@ static inline unsigned long nr_slots(u64 val)
     return DIV_ROUND_UP(val, IO_TLB_SIZE);
  }

+/*
+ * Remap swiotlb memory in the unencrypted physical address space
+ * when swiotlb_unencrypted_base is set (e.g. for Hyper-V AMD SEV-SNP
+ * Isolation VMs).
+ */
+void *swiotlb_mem_remap(struct io_tlb_mem *mem, unsigned long bytes)
+{
+   void *vaddr = NULL;
+
+   if (swiotlb_unencrypted_base) {
+   phys_addr_t paddr = mem->start + swiotlb_unencrypted_base;
+
+   vaddr = memremap(paddr, bytes, MEMREMAP_WB);
+   if (!vaddr)
+			pr_err("Failed to map the unencrypted memory %llx size %lx.\n",

+ 

Re: [PATCH V3 1/5] Swiotlb: Add Swiotlb bounce buffer remap function for HV IVM

2021-12-02 Thread Tom Lendacky

On 12/1/21 10:02 AM, Tianyu Lan wrote:

From: Tianyu Lan 

In Isolation VM with AMD SEV, bounce buffer needs to be accessed via
extra address space which is above shared_gpa_boundary (E.G 39 bit
address line) reported by Hyper-V CPUID ISOLATION_CONFIG. The access
physical address will be original physical address + shared_gpa_boundary.
The shared_gpa_boundary in the AMD SEV SNP spec is called virtual top of
memory(vTOM). Memory addresses below vTOM are automatically treated as
private while memory above vTOM is treated as shared.

Expose swiotlb_unencrypted_base for platforms to set unencrypted
memory base offset and platform calls swiotlb_update_mem_attributes()
to remap swiotlb mem to unencrypted address space. memremap() can
not be called in the early stage and so put remapping code into
swiotlb_update_mem_attributes(). Store remap address and use it to copy
data from/to swiotlb bounce buffer.

Signed-off-by: Tianyu Lan 


This patch results in the following stack trace during a bare-metal boot
on my EPYC system with SME active (e.g. mem_encrypt=on):

[0.123932] BUG: Bad page state in process swapper  pfn:108001
[0.123942] page:(ptrval) refcount:0 mapcount:-128 mapping: index:0x0 pfn:0x108001
[0.123946] flags: 0x17c000(node=0|zone=2|lastcpupid=0x1f)
[0.123952] raw: 0017c000 88904f2d5e80 88904f2d5e80 
[0.123954] raw:   ff7f 

[0.123955] page dumped because: nonzero mapcount
[0.123957] Modules linked in:
[0.123961] CPU: 0 PID: 0 Comm: swapper Not tainted 5.16.0-rc3-sos-custom #2
[0.123964] Hardware name: AMD Corporation
[0.123967] Call Trace:
[0.123971]  
[0.123975]  dump_stack_lvl+0x48/0x5e
[0.123985]  bad_page.cold+0x65/0x96
[0.123990]  __free_pages_ok+0x3a8/0x410
[0.123996]  memblock_free_all+0x171/0x1dc
[0.124005]  mem_init+0x1f/0x14b
[0.124011]  start_kernel+0x3b5/0x6a1
[0.124016]  secondary_startup_64_no_verify+0xb0/0xbb
[0.124022]  

I see ~40 of these traces, each for different pfns.

Thanks,
Tom


---
Change since v2:
* Leave mem->vaddr with phys_to_virt(mem->start) when fail
  to remap swiotlb memory.

Change since v1:
* Rework comment in the swiotlb_init_io_tlb_mem()
* Make swiotlb_init_io_tlb_mem() back to return void.
---
  include/linux/swiotlb.h |  6 ++
  kernel/dma/swiotlb.c| 47 -
  2 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 569272871375..f6c3638255d5 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -73,6 +73,9 @@ extern enum swiotlb_force swiotlb_force;
   * @end:  The end address of the swiotlb memory pool. Used to do a quick
   *range check to see if the memory was in fact allocated by this
   *API.
+ * @vaddr: The vaddr of the swiotlb memory pool. The swiotlb memory pool
+ * may be remapped in the memory encrypted case and store virtual
+ * address for bounce buffer operation.
   * @nslabs:   The number of IO TLB blocks (in groups of 64) between @start and
   *@end. For default swiotlb, this is command line adjustable via
   *setup_io_tlb_npages.
@@ -92,6 +95,7 @@ extern enum swiotlb_force swiotlb_force;
  struct io_tlb_mem {
phys_addr_t start;
phys_addr_t end;
+   void *vaddr;
unsigned long nslabs;
unsigned long used;
unsigned int index;
@@ -186,4 +190,6 @@ static inline bool is_swiotlb_for_alloc(struct device *dev)
  }
  #endif /* CONFIG_DMA_RESTRICTED_POOL */
  
+extern phys_addr_t swiotlb_unencrypted_base;
+
  #endif /* __LINUX_SWIOTLB_H */
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 8e840fbbed7c..adb9d06af5c8 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -50,6 +50,7 @@
  #include 
  #include 
  
+#include 

  #include 
  #include 
  #include 
@@ -72,6 +73,8 @@ enum swiotlb_force swiotlb_force;
  
  struct io_tlb_mem io_tlb_default_mem;
  
+phys_addr_t swiotlb_unencrypted_base;
+
  /*
   * Max segment that we can provide which (if pages are contingous) will
   * not be bounced (unless SWIOTLB_FORCE is set).
@@ -155,6 +158,27 @@ static inline unsigned long nr_slots(u64 val)
return DIV_ROUND_UP(val, IO_TLB_SIZE);
  }
  
+/*
+ * Remap swiotlb memory in the unencrypted physical address space
+ * when swiotlb_unencrypted_base is set (e.g. for Hyper-V AMD SEV-SNP
+ * Isolation VMs).
+ */
+void *swiotlb_mem_remap(struct io_tlb_mem *mem, unsigned long bytes)
+{
+   void *vaddr = NULL;
+
+   if (swiotlb_unencrypted_base) {
+   phys_addr_t paddr = mem->start + swiotlb_unencrypted_base;
+
+   vaddr = memremap(paddr, bytes, MEMREMAP_WB);
+   if (!vaddr)
+   pr_err("Failed to map the 

Re: [PATCH V3 12/13] HV/Netvsc: Add Isolation VM support for netvsc driver

2021-08-20 Thread Tom Lendacky

On 8/19/21 11:21 PM, h...@lst.de wrote:

On Thu, Aug 19, 2021 at 06:14:51PM +, Michael Kelley wrote:

+   if (!pfns)
+   return NULL;
+
+   for (i = 0; i < size / HV_HYP_PAGE_SIZE; i++)
+   pfns[i] = virt_to_hvpfn(buf + i * HV_HYP_PAGE_SIZE)
+   + (ms_hyperv.shared_gpa_boundary >> HV_HYP_PAGE_SHIFT);
+
+   vaddr = vmap_pfn(pfns, size / HV_HYP_PAGE_SIZE, PAGE_KERNEL_IO);
+   kfree(pfns);
+
+   return vaddr;
+}


This function appears to be a duplicate of hv_map_memory() in Patch 11 of this
series.  Is it possible to structure things so there is only one 
implementation?  In


So right now it it identical, but there is an important difference:
the swiotlb memory is physically contiguous to start with, so we can
do the simple remap using vmap_range as suggested in the last mail.
The cases here are pretty weird in that netvsc_remap_buf is called right
after vzalloc.  That is we create _two_ mappings in vmalloc space right
after another, where the original one is just used for establishing the
"GPADL handle" and freeing the memory.  In other words, the obvious thing
to do here would be to use a vmalloc variant that allows to take the
shared_gpa_boundary into account when setting up the PTEs.

And here is something I need help from the x86 experts: does the CPU
actually care about this shared_gpa_boundary?  Or does it just matter
for the generated DMA address?  Does someone have a good pointer to
how this mechanism works?


The CPU does care. Here's some info:

APM Volume 2, Section 15.36.8:
https://www.amd.com/system/files/TechDocs/24593.pdf

AMD SEV-SNP Whitepaper, Virtual Machine Privilege Levels (~page 14):
https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf

Thanks,
Tom







Re: [PATCH v13 01/12] swiotlb: Refactor swiotlb init functions

2021-06-18 Thread Tom Lendacky
On 6/18/21 1:25 AM, Claire Chang wrote:
> On Fri, Jun 18, 2021 at 7:30 AM Stefano Stabellini
>  wrote:
>>
>> On Thu, 17 Jun 2021, Claire Chang wrote:
>>> Add a new function, swiotlb_init_io_tlb_mem, for the io_tlb_mem struct
>>> initialization to make the code reusable.
>>>
>>> Signed-off-by: Claire Chang 
>>> Reviewed-by: Christoph Hellwig 
>>> Tested-by: Stefano Stabellini 
>>> Tested-by: Will Deacon 
>>> ---
>>>  kernel/dma/swiotlb.c | 50 ++--
>>>  1 file changed, 25 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
>>> index 52e2ac526757..47bb2a766798 100644
>>> --- a/kernel/dma/swiotlb.c
>>> +++ b/kernel/dma/swiotlb.c
>>> @@ -168,9 +168,28 @@ void __init swiotlb_update_mem_attributes(void)
>>>   memset(vaddr, 0, bytes);
>>>  }
>>>
>>> -int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int 
>>> verbose)
>>> +static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t 
>>> start,
>>> + unsigned long nslabs, bool late_alloc)
>>>  {
>>> + void *vaddr = phys_to_virt(start);
>>>   unsigned long bytes = nslabs << IO_TLB_SHIFT, i;
>>> +
>>> + mem->nslabs = nslabs;
>>> + mem->start = start;
>>> + mem->end = mem->start + bytes;
>>> + mem->index = 0;
>>> + mem->late_alloc = late_alloc;
>>> + spin_lock_init(&mem->lock);
>>> + for (i = 0; i < mem->nslabs; i++) {
>>> + mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
>>> + mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
>>> + mem->slots[i].alloc_size = 0;
>>> + }
>>> + memset(vaddr, 0, bytes);
>>> +}
>>> +
>>> +int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int 
>>> verbose)
>>> +{
>>>   struct io_tlb_mem *mem;
>>>   size_t alloc_size;
>>>
>>> @@ -186,16 +205,8 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned 
>>> long nslabs, int verbose)
>>>   if (!mem)
>>>   panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
>>> __func__, alloc_size, PAGE_SIZE);
>>> - mem->nslabs = nslabs;
>>> - mem->start = __pa(tlb);
>>> - mem->end = mem->start + bytes;
>>> - mem->index = 0;
>>> - spin_lock_init(&mem->lock);
>>> - for (i = 0; i < mem->nslabs; i++) {
>>> - mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
>>> - mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
>>> - mem->slots[i].alloc_size = 0;
>>> - }
>>> +
>>> + swiotlb_init_io_tlb_mem(mem, __pa(tlb), nslabs, false);
>>>
>>>   io_tlb_default_mem = mem;
>>>   if (verbose)
>>> @@ -282,8 +293,8 @@ swiotlb_late_init_with_default_size(size_t default_size)
>>>  int
>>>  swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
>>>  {
>>> - unsigned long bytes = nslabs << IO_TLB_SHIFT, i;
>>>   struct io_tlb_mem *mem;
>>> + unsigned long bytes = nslabs << IO_TLB_SHIFT;
>>>
>>>   if (swiotlb_force == SWIOTLB_NO_FORCE)
>>>   return 0;
>>> @@ -297,20 +308,9 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long 
>>> nslabs)
>>>   if (!mem)
>>>   return -ENOMEM;
>>>
>>> - mem->nslabs = nslabs;
>>> - mem->start = virt_to_phys(tlb);
>>> - mem->end = mem->start + bytes;
>>> - mem->index = 0;
>>> - mem->late_alloc = 1;
>>> - spin_lock_init(&mem->lock);
>>> - for (i = 0; i < mem->nslabs; i++) {
>>> - mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
>>> - mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
>>> - mem->slots[i].alloc_size = 0;
>>> - }
>>> -
>>> + memset(mem, 0, sizeof(*mem));
>>> + swiotlb_init_io_tlb_mem(mem, virt_to_phys(tlb), nslabs, true);
>>>   set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
>>> - memset(tlb, 0, bytes);
>>
>> This is good for swiotlb_late_init_with_tbl. However I have just noticed
>> that mem could also be allocated from swiotlb_init_with_tbl, in which
>> case the zeroing is missing. I think we need another memset in
>> swiotlb_init_with_tbl as well. Or maybe it could be better to have a
>> single memset at the beginning of swiotlb_init_io_tlb_mem instead. Up to
>> you.
> 
> swiotlb_init_with_tbl uses memblock_alloc to allocate the io_tlb_mem
> and memblock_alloc[1] will do memset in memblock_alloc_try_nid[2], so
> swiotlb_init_with_tbl is also good.
> I'm happy to add the memset in swiotlb_init_io_tlb_mem if you think
> it's clearer and safer.

On x86, if the memset is done before set_memory_decrypted() and memory
encryption is active, then the memory will look like ciphertext afterwards
and not be zeroes. If zeroed memory is required, then a memset must be
done after the set_memory_decrypted() calls.

Thanks,
Tom
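
A minimal sketch of the ordering constraint described above (illustrative only,
not a quote of the swiotlb code):

	/* With SME/SEV active the allocation comes back mapped encrypted. */
	void *tlb = memblock_alloc(bytes, PAGE_SIZE);

	/* Mark the buffer shared/decrypted first... */
	set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);

	/* ...then zero it, so it reads back as zeroes rather than ciphertext. */
	memset(tlb, 0, bytes);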

> 
> [1] 
> 

Re: [RFC PATCH V3 08/11] swiotlb: Add bounce buffer remap address setting function

2021-06-14 Thread Tom Lendacky
On 6/14/21 2:12 AM, Christoph Hellwig wrote:
> On Mon, Jun 07, 2021 at 10:56:47PM +0800, Tianyu Lan wrote:
>> These addresses in extra address space work as a system memory mirror. The
>> shared memory with host in Isolation VM needs to be accessed via extra
>> address space which is above shared gpa boundary.
> 
> Why?
> 

IIUC, this is using the vTOM feature of SEV-SNP. When this feature is
enabled for a VMPL level, any physical memory addresses below vTOM are
considered private/encrypted and any physical memory addresses above vTOM
are considered shared/unencrypted. With this option, you don't need a
fully enlightened guest that sets and clears page table encryption bits.
You just need the DMA buffers to be allocated in the proper range above vTOM.

See the section on "Virtual Machine Privilege Levels" in
https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf.

Thanks,
Tom
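
A minimal sketch of the addressing model described above, using the names from
the Hyper-V isolation-VM patches in this thread (assuming shared_gpa_boundary
equals vTOM and the buffer is physically contiguous):

	/* Private (encrypted) view: the buffer's normal physical address, below vTOM. */
	phys_addr_t private_pa = mem->start;

	/* Shared (unencrypted) alias of the same memory, above vTOM. */
	phys_addr_t shared_pa = private_pa + ms_hyperv.shared_gpa_boundary;

	/* Map the shared alias so the kernel accesses the host-visible view. */
	void *shared_va = memremap(shared_pa, bytes, MEMREMAP_WB);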



Re: [PATCH v7 01/15] swiotlb: Refactor swiotlb init functions

2021-05-27 Thread Tom Lendacky
On 5/27/21 9:41 AM, Tom Lendacky wrote:
> On 5/27/21 8:02 AM, Christoph Hellwig wrote:
>> On Wed, May 19, 2021 at 11:50:07AM -0700, Florian Fainelli wrote:
>>> You convert this call site with swiotlb_init_io_tlb_mem() which did not
>>> do the set_memory_decrypted()+memset(). Is this okay or should
>>> swiotlb_init_io_tlb_mem() add an additional argument to do this
>>> conditionally?
>>
>> The zeroing is useful and was missing before.  I think having a clean
>> state here is the right thing.
>>
>> Not sure about the set_memory_decrypted, swiotlb_update_mem_attributes
>> kinda suggests it is too early to set the memory decrypted.
>>
>> Adding Tom who should know about all this.
> 
> The reason for adding swiotlb_update_mem_attributes() was because having
> the call to set_memory_decrypted() in swiotlb_init_with_tbl() triggered a
> BUG_ON() related to interrupts not being enabled yet during boot. So that
> call had to be delayed until interrupts were enabled.

I pulled down and tested the patch set and booted with SME enabled. The
following was seen during the boot:

[0.134184] BUG: Bad page state in process swapper  pfn:108002
[0.134196] page:(ptrval) refcount:0 mapcount:-128 mapping: index:0x0 pfn:0x108002
[0.134201] flags: 0x17c000(node=0|zone=2|lastcpupid=0x1f)
[0.134208] raw: 0017c000 88847f355e28 88847f355e28 
[0.134210] raw:  0001 ff7f 
[0.134212] page dumped because: nonzero mapcount
[0.134213] Modules linked in:
[0.134218] CPU: 0 PID: 0 Comm: swapper Not tainted 5.13.0-rc2-sos-custom #3
[0.134221] Hardware name: ...
[0.134224] Call Trace:
[0.134233]  dump_stack+0x76/0x94
[0.134244]  bad_page+0xa6/0xf0
[0.134252]  __free_pages_ok+0x331/0x360
[0.134256]  memblock_free_all+0x158/0x1c1
[0.134267]  mem_init+0x1f/0x14c
[0.134273]  start_kernel+0x290/0x574
[0.134279]  secondary_startup_64_no_verify+0xb0/0xbb

I see this about 40 times during the boot, each with a different PFN. The
system boots (which seemed odd), but I don't know if there will be side
effects to this (I didn't stress the system).

I modified the code to add a flag to not do the set_memory_decrypted(), as
suggested by Florian, when invoked from swiotlb_init_with_tbl(), and that
eliminated the bad page state BUG.

Thanks,
Tom
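
A rough sketch of the kind of change described above (the extra argument is an
assumption for illustration, not the final upstream interface):

	static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
					    unsigned long nslabs, bool set_decrypted)
	{
		unsigned long bytes = nslabs << IO_TLB_SHIFT;
		void *vaddr = phys_to_virt(start);

		/*
		 * The early memblock path passes set_decrypted = false and
		 * relies on swiotlb_update_mem_attributes() once interrupts
		 * are enabled.
		 */
		if (set_decrypted)
			set_memory_decrypted((unsigned long)vaddr, bytes >> PAGE_SHIFT);
		memset(vaddr, 0, bytes);

		/* ... remaining field initialization as in the patch quoted above ... */
	}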

> 
> Thanks,
> Tom
> 
>>



Re: [PATCH v7 01/15] swiotlb: Refactor swiotlb init functions

2021-05-27 Thread Tom Lendacky
On 5/27/21 8:02 AM, Christoph Hellwig wrote:
> On Wed, May 19, 2021 at 11:50:07AM -0700, Florian Fainelli wrote:
>> You convert this call site with swiotlb_init_io_tlb_mem() which did not
>> do the set_memory_decrypted()+memset(). Is this okay or should
>> swiotlb_init_io_tlb_mem() add an additional argument to do this
>> conditionally?
> 
> The zeroing is useful and was missing before.  I think having a clean
> state here is the right thing.
> 
> Not sure about the set_memory_decrypted, swiotlb_update_mem_attributes
> kinda suggests it is too early to set the memory decrypted.
> 
> Adding Tom who should know about all this.

The reason for adding swiotlb_update_mem_attributes() was because having
the call to set_memory_decrypted() in swiotlb_init_with_tbl() triggered a
BUG_ON() related to interrupts not being enabled yet during boot. So that
call had to be delayed until interrupts were enabled.

Thanks,
Tom

> 



Re: [PATCH 7/7] swiotlb: don't override the command line in swiotlb_adjust_size

2021-04-22 Thread Tom Lendacky
On 4/22/21 2:19 AM, Christoph Hellwig wrote:
> When the user specified an explicit swiotlb size on the command line,
> the architecture code should not override it.
> 
> Fixes: 2cbc2776efe4 ("swiotlb: remove swiotlb_nr_tbl")
> Reported-by: Tom Lendacky 
> Signed-off-by: Christoph Hellwig 

I tested this patchset and I'm not able to get the override via the
command line or via the SEV adjustment function. Looking at the code,
swiotlb_default_size is initialized, so the call to swiotlb_adjust_size()
always returns without setting the new size.

Thanks,
Tom

> ---
>  kernel/dma/swiotlb.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 87d06ddf4753f3..aef02a3825b494 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -106,7 +106,9 @@ void swiotlb_set_max_segment(unsigned int val)
>  
>  unsigned long swiotlb_size_or_default(void)
>  {
> - return swiotlb_default_size;
> + if (swiotlb_default_size)
> + return swiotlb_default_size;
> + return IO_TLB_DEFAULT_SIZE;
>  }
>  
>  void __init swiotlb_adjust_size(unsigned long size)
> @@ -116,6 +118,8 @@ void __init swiotlb_adjust_size(unsigned long size)
>* architectures such as those supporting memory encryption to
>* adjust/expand SWIOTLB size for their use.
>*/
> + if (swiotlb_default_size)
> + return;
>   swiotlb_default_size = ALIGN(size, IO_TLB_SEGSIZE << IO_TLB_SHIFT);
>   pr_info("SWIOTLB bounce buffer size adjusted to %luMB",
>   swiotlb_default_size >> 20);
> 



Re: swiotlb cleanups v3

2021-04-20 Thread Tom Lendacky
On 4/20/21 4:23 AM, Christoph Hellwig wrote:
> On Sat, Apr 17, 2021 at 11:39:22AM -0500, Tom Lendacky wrote:
>> Somewhere between the 1st and 2nd patch, specifying a specific swiotlb
>> for an SEV guest is no longer honored. For example, if I start an SEV
>> guest with 16GB of memory and specify swiotlb=131072 I used to get a
>> 256MB SWIOTLB. However, after the 2nd patch, the swiotlb=131072 is no
>> longer honored and I get a 982MB SWIOTLB (as set via sev_setup_arch() in
>> arch/x86/mm/mem_encrypt.c).
>>
>> I can't be sure which patch caused the issue since an SEV guest fails to
>> boot with the 1st patch but can boot with the 2nd patch, at which point
>> the SWIOTLB comes in at 982MB (I haven't had a chance to debug it and so
>> I'm hoping you might be able to quickly spot what's going on).
> 
> Can you try this patch?

Thanks, Christoph. This works for honoring the command line value with SEV
guests.

There was still a reference to default_nslabs in setup_io_tlb_npages()
that I'm not sure how you want to handle. I just commented it out for now
to let the code compile to test the intent of the patch.

Thanks,
Tom

> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 0a5b6f7e75bce6..ac81ef97df32f5 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -71,15 +71,17 @@ struct io_tlb_mem *io_tlb_default_mem;
>   */
>  static unsigned int max_segment;
>  
> -static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
> +static unsigned long swiotlb_cmdline_size;
>  
>  static int __init
>  setup_io_tlb_npages(char *str)
>  {
>   if (isdigit(*str)) {
>   /* avoid tail segment of size < IO_TLB_SEGSIZE */
> - default_nslabs =
> - ALIGN(simple_strtoul(str, &str, 0), IO_TLB_SEGSIZE);
> + unsigned long nslabs = simple_strtoul(str, &str, 0);
> +
> + swiotlb_cmdline_size =
> + ALIGN(nslabs, IO_TLB_SEGSIZE) << IO_TLB_SHIFT;
>   }
>   if (*str == ',')
>   ++str;
> @@ -108,7 +110,9 @@ void swiotlb_set_max_segment(unsigned int val)
>  
>  unsigned long swiotlb_size_or_default(void)
>  {
> - return default_nslabs << IO_TLB_SHIFT;
> + if (swiotlb_cmdline_size)
> + return swiotlb_cmdline_size;
> + return IO_TLB_DEFAULT_SIZE;
>  }
>  
>  void __init swiotlb_adjust_size(unsigned long size)
> @@ -118,9 +122,10 @@ void __init swiotlb_adjust_size(unsigned long size)
>* architectures such as those supporting memory encryption to
>* adjust/expand SWIOTLB size for their use.
>*/
> - size = ALIGN(size, IO_TLB_SIZE);
> - default_nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
> - pr_info("SWIOTLB bounce buffer size adjusted to %luMB", size >> 20);
> + if (!swiotlb_cmdline_size)
> + swiotlb_cmdline_size = ALIGN(size, IO_TLB_SIZE);
> + pr_info("SWIOTLB bounce buffer size adjusted to %luMB",
> + swiotlb_cmdline_size >> 20);
>  }
>  
>  void swiotlb_print_info(void)
> @@ -209,7 +214,7 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long 
> nslabs, int verbose)
>  void  __init
>  swiotlb_init(int verbose)
>  {
> - size_t bytes = PAGE_ALIGN(default_nslabs << IO_TLB_SHIFT);
> + size_t bytes = PAGE_ALIGN(swiotlb_size_or_default());
>   void *tlb;
>  
>   if (swiotlb_force == SWIOTLB_NO_FORCE)
> @@ -219,7 +224,7 @@ swiotlb_init(int verbose)
>   tlb = memblock_alloc_low(bytes, PAGE_SIZE);
>   if (!tlb)
>   goto fail;
> - if (swiotlb_init_with_tbl(tlb, default_nslabs, verbose))
> + if (swiotlb_init_with_tbl(tlb, bytes >> IO_TLB_SHIFT, verbose))
>   goto fail_free_mem;
>   return;
>  
> 



Re: swiotlb cleanups v3

2021-04-17 Thread Tom Lendacky
On 4/17/21 11:39 AM, Tom Lendacky wrote:
>> Hi Konrad,
>>
>> this series contains a bunch of swiotlb cleanups, mostly to reduce the
>> amount of internals exposed to code outside of swiotlb.c, which should
>> help to prepare for supporting multiple different bounce buffer pools.
> 
> Somewhere between the 1st and 2nd patch, specifying a specific swiotlb
> for an SEV guest is no longer honored. For example, if I start an SEV
> guest with 16GB of memory and specify swiotlb=131072 I used to get a
> 256MB SWIOTLB. However, after the 2nd patch, the swiotlb=131072 is no
> longer honored and I get a 982MB SWIOTLB (as set via sev_setup_arch() in
> arch/x86/mm/mem_encrypt.c).
> 
> I can't be sure which patch caused the issue since an SEV guest fails to
> boot with the 1st patch but can boot with the 2nd patch, at which point
> the SWIOTLB comes in at 982MB (I haven't had a chance to debug it and so
> I'm hoping you might be able to quickly spot what's going on).

Ok, I figured out the 1st patch boot issue (which is gone when the
second patch is applied). Here's the issue if anyone is interested:

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index d9c097f0f78c..dbe369674afe 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -226,7 +226,7 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long 
nslabs, int verbose)
 
alloc_size = PAGE_ALIGN(mem->nslabs * sizeof(size_t));
mem->alloc_size = memblock_alloc(alloc_size, PAGE_SIZE);
-   if (mem->alloc_size)
+   if (!mem->alloc_size)
panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
  __func__, alloc_size, PAGE_SIZE);
 

The 1st patch still allowed the command line specified size of 256MB
SWIOTLB. So that means the 2nd patch causes the command line specified
256MB SWIOTLB size to be ignored and results in a 982MB SWIOTLB size
for the 16GB guest.

Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>
>> Changes since v2:
>>  - fix a bisection hazard that did not allocate the alloc_size array
>>  - dropped all patches already merged
>>
>> Changes since v1:
>>  - rebased to v5.12-rc1
>>  - a few more cleanups
>>  - merge and forward port the patch from Claire to move all the global
>>variables into a struct to prepare for multiple instances
> 



Re: swiotlb cleanups v3

2021-04-17 Thread Tom Lendacky
> Hi Konrad,
>
> this series contains a bunch of swiotlb cleanups, mostly to reduce the
> amount of internals exposed to code outside of swiotlb.c, which should
> help to prepare for supporting multiple different bounce buffer pools.

Somewhere between the 1st and 2nd patch, specifying a specific swiotlb
for an SEV guest is no longer honored. For example, if I start an SEV
guest with 16GB of memory and specify swiotlb=131072 I used to get a
256MB SWIOTLB. However, after the 2nd patch, the swiotlb=131072 is no
longer honored and I get a 982MB SWIOTLB (as set via sev_setup_arch() in
arch/x86/mm/mem_encrypt.c).

I can't be sure which patch caused the issue since an SEV guest fails to
boot with the 1st patch but can boot with the 2nd patch, at which point
the SWIOTLB comes in at 982MB (I haven't had a chance to debug it and so
I'm hoping you might be able to quickly spot what's going on).

Thanks,
Tom

>
> Changes since v2:
>  - fix a bisection hazard that did not allocate the alloc_size array
>  - dropped all patches already merged
>
> Changes since v1:
>  - rebased to v5.12-rc1
>  - a few more cleanups
>  - merge and forward port the patch from Claire to move all the global
>variables into a struct to prepare for multiple instances




Re: [PATCH v2 4/7] OvmfPkg/IndustryStandard: Introduce PageTable.h

2021-03-26 Thread Tom Lendacky
On 3/25/21 10:47 AM, Anthony PERARD wrote:
> We are going to use the page table structure in yet another place, so
> collect the types and macros that can be used from another module
> rather than making yet another copy.
> 
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=2490
> Signed-off-by: Anthony PERARD 

This begs the question of whether there should be only one version of this
header file, now. There are still copies in other places, but maybe that
can be a future cleanup? I'll leave that decision to Laszlo.

With one minor comment below, otherwise:

Acked-by: Tom Lendacky 

> ---
> CC: Tom Lendacky 
> CC: Brijesh Singh 
> ---
> 
> Notes:
> v2:
> - new patch
> 
>  .../IndustryStandard/PageTable.h} | 117 +-
>  .../BaseMemEncryptSevLib/X64/VirtualMemory.h  | 143 +-
>  2 files changed, 5 insertions(+), 255 deletions(-)
>  copy OvmfPkg/{Library/BaseMemEncryptSevLib/X64/VirtualMemory.h => 
> Include/IndustryStandard/PageTable.h} (60%)
> 

...

> diff --git a/OvmfPkg/Library/BaseMemEncryptSevLib/X64/VirtualMemory.h 
> b/OvmfPkg/Library/BaseMemEncryptSevLib/X64/VirtualMemory.h
> index 996f94f07ebb..b621d811ca6f 100644
> --- a/OvmfPkg/Library/BaseMemEncryptSevLib/X64/VirtualMemory.h
> +++ b/OvmfPkg/Library/BaseMemEncryptSevLib/X64/VirtualMemory.h
> @@ -20,151 +20,10 @@
>  #include 
>  #include 
>  #include 
> +#include 

Typically, these are preferred to be in sorted order.

Thanks,
Tom

>  
>  #define SYS_CODE64_SEL 0x38
>