On Mon, Apr 28, 2025 at 09:41:51PM +0000, Ashish Kalra wrote:
> From: Ashish Kalra <[email protected]>
> 
> When kdump is running makedumpfile to generate vmcore and dumping SNP
> guest memory it touches the VMSA page of the vCPU executing kdump which
> then results in unrecoverable #NPF/RMP faults as the VMSA page is
> marked busy/in-use when the vCPU is running and subsequently causes
> guest softlockup/hang.
> 
> Additionally other APs may be halted in guest mode and their VMSA pages
> are marked busy and touching these VMSA pages during guest memory dump
> will also cause #NPF.
> 
> Issue AP_DESTROY GHCB calls on other APs to ensure they are kicked out
> of guest mode and then clear the VMSA bit on their VMSA pages.
> 
> If the vCPU running kdump is an AP, mark its VMSA page as offline to
> ensure that makedumpfile excludes that page while dumping guest memory.
> 
> Cc: [email protected]
> Fixes: 3074152e56c9 ("x86/sev: Convert shared memory back to private on kexec")
> Signed-off-by: Ashish Kalra <[email protected]>
> ---
>  arch/x86/coco/sev/core.c | 241 +++++++++++++++++++++++++--------------
>  1 file changed, 155 insertions(+), 86 deletions(-)

Some minor cleanups on top:

diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index b031cabb2ccf..9ac902d022bf 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -961,6 +961,7 @@ void snp_accept_memory(phys_addr_t start, phys_addr_t end)
 
 static int vmgexit_ap_control(u64 event, struct sev_es_save_area *vmsa, u32 apic_id)
 {
+       bool create = event == SVM_VMGEXIT_AP_CREATE;
        struct ghcb_state state;
        unsigned long flags;
        struct ghcb *ghcb;
@@ -971,8 +972,10 @@ static int vmgexit_ap_control(u64 event, struct sev_es_save_area *vmsa, u32 apic
        ghcb = __sev_get_ghcb(&state);
 
        vc_ghcb_invalidate(ghcb);
-       if (event == SVM_VMGEXIT_AP_CREATE)
+
+       if (create)
                ghcb_set_rax(ghcb, vmsa->sev_features);
+
        ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION);
        ghcb_set_sw_exit_info_1(ghcb,
                                ((u64)apic_id << 32)    |
@@ -985,7 +988,7 @@ static int vmgexit_ap_control(u64 event, struct sev_es_save_area *vmsa, u32 apic
 
        if (!ghcb_sw_exit_info_1_is_valid(ghcb) ||
            lower_32_bits(ghcb->save.sw_exit_info_1)) {
-               pr_err("SNP AP %s error\n", (event == SVM_VMGEXIT_AP_CREATE ? "CREATE" : "DESTROY"));
+               pr_err("SNP AP %s error\n", (create ? "CREATE" : "DESTROY"));
                ret = -EINVAL;
        }
 
@@ -1168,8 +1171,8 @@ static void shutdown_all_aps(void)
                vmsa = per_cpu(sev_vmsa, cpu);
 
                /*
-                * BSP does not have guest allocated VMSA and there is no need
-                * to clear the VMSA tag for this page.
+                * The BSP or offlined APs do not have guest allocated VMSA
+                * and there is no need to clear the VMSA tag for this page.
                 */
                if (!vmsa)
                        continue;
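
For readers following the vmgexit_ap_control() hunks above, here is a rough
userspace sketch of the two pieces the diff touches: packing the APIC ID into
the upper half of sw_exit_info_1 and treating a non-zero lower half on return
as failure. The helper names and the AP_CREATE/AP_DESTROY values are
assumptions made for illustration, the fields elided from the diff context are
left out, and placing the event in the low bits is likewise assumed rather
than shown above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for SVM_VMGEXIT_AP_CREATE/DESTROY (assumed values). */
#define AP_CREATE	1u
#define AP_DESTROY	2u

/*
 * Hypothetical helper: APIC ID goes in bits 63:32, as in the diff context.
 * Putting the event in the low bits is an assumption; any other fields the
 * real code ORs in are omitted here.
 */
static uint64_t ap_exit_info_1(uint32_t apic_id, uint32_t event)
{
	return ((uint64_t)apic_id << 32) | event;
}

/* Mirrors the lower_32_bits() check: non-zero low half means the call failed. */
static bool ap_call_failed(uint64_t returned_exit_info_1)
{
	return (uint32_t)returned_exit_info_1 != 0;
}

/* Same shape as the cleanup: compute the comparison once, reuse it. */
static const char *ap_event_name(uint32_t event)
{
	bool create = event == AP_CREATE;

	return create ? "CREATE" : "DESTROY";
}
```

Caching the comparison in a local bool, as the cleanup does, keeps the
GHCB setup and the error message agreeing on which operation was requested.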

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
