On Fri, Feb 06, 2026, Jim Mattson wrote:
> On Fri, Feb 6, 2026 at 10:23 AM Yosry Ahmed <[email protected]> wrote:
> >
> > February 6, 2026 at 10:19 AM, "Sean Christopherson" <[email protected]> wrote:
> >
> > > On Thu, Feb 05, 2026, Jim Mattson wrote:
> > >
> > > >
> > > > Cache g_pat from vmcb12 in svm->nested.gpat to avoid TOCTTOU issues,
> > > > and add a validity check so that when nested paging is enabled for
> > > > vmcb12, an invalid g_pat causes an immediate VMEXIT with exit code
> > > > VMEXIT_INVALID, as specified in the APM, volume 2: "Nested Paging and
> > > > VMRUN/VMEXIT."
> > > >
> > > >  Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
> > > >  Signed-off-by: Jim Mattson <[email protected]>
> > > >  ---
> > > >  arch/x86/kvm/svm/nested.c | 4 +++-
> > > >  arch/x86/kvm/svm/svm.h | 3 +++
> > > >  2 files changed, 6 insertions(+), 1 deletion(-)
> > > >
> > > >  diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > > >  index f72dbd10dcad..1d4ff6408b34 100644
> > > >  --- a/arch/x86/kvm/svm/nested.c
> > > >  +++ b/arch/x86/kvm/svm/nested.c
> > > > @@ -1027,9 +1027,11 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
> > > > 
> > > >  	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
> > > >  	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
> > > > +	svm->nested.gpat = vmcb12->save.g_pat;
> > > > 
> > > >  	if (!nested_vmcb_check_save(vcpu) ||
> > > > -	    !nested_vmcb_check_controls(vcpu)) {
> > > > +	    !nested_vmcb_check_controls(vcpu) ||
> > > > +	    (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
> > > >  		vmcb12->control.exit_code = SVM_EXIT_ERR;
> > > >  		vmcb12->control.exit_info_1 = 0;
> > > >  		vmcb12->control.exit_info_2 = 0;
> > > > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> > > > index 986d90f2d4ca..42a4bf83b3aa 100644
> > > > --- a/arch/x86/kvm/svm/svm.h
> > > > +++ b/arch/x86/kvm/svm/svm.h
> > > > @@ -208,6 +208,9 @@ struct svm_nested_state {
> > > >  	 */
> > > >  	struct vmcb_save_area_cached save;
> > > > 
> > > > +	/* Cached guest PAT from vmcb12.save.g_pat */
> > > > +	u64 gpat;
> > > > 
> > > Shouldn't this go in vmcb_save_area_cached?
> >
> > I believe Jim changed it after this discussion on v2: https://lore.kernel.org/kvm/[email protected]/.

LOL, oh the irony:

  I'm going to cache it on its own to avoid confusion.

> Right. The two issues with putting it in vmcb_save_area_cached were:
> 
> 1. Checking all of vmcb_save_area_cached requires access to the
> corresponding control area (or at least the boolean, "NTP enabled.")

Checking the control area seems like the right answer (I went down that path
before reading this).

> 2. In the nested state serialization payload, everything else in the
> vmcb_save_area_cached comes from L1 (host state to be restored at
> emulated #VMEXIT.)

Hmm, right, but *because* it's ignored, that gives us carte blanche to clobber
it.  More below.

> The first issue was a little messy, but not that distasteful.

I actually find it the opposite of distasteful.  KVM definitely _should_ be
checking the controls, not the vCPU state.  If it weren't for needing to get at
MAXPHYADDR in CPUID, I'd push to drop @vcpu entirely.

> The second issue was really a mess.

I'd rather have the mess contained and documented, though.  Caching g_pat
outside of vmcb_save_area_cached bleeds the mess into all of the relevant nSVM
code, and doesn't leave any breadcrumbs in the code/comments to explain that it
"needs" to be kept separate.

AFAICT, the only "problem" is that g_pat in the serialization payload will be
garbage when restoring state from an older KVM.  But that's totally fine,
precisely because L1's PAT isn't restored from vmcb01 on nested #VMEXIT, it's
always resident in vcpu->arch.pat.  So can't we just do this to avoid a
spurious -EINVAL?

        /*
         * Validate host state saved from before VMRUN (see
         * nested_svm_check_permissions).
         */
        __nested_copy_vmcb_save_to_cache(&save_cached, save);

        /*
         * Stuff gPAT in L1's save state, as older KVM may not have saved L1's
         * gPAT.  L1's PAT, i.e. hPAT for the vCPU, is *always* tracked in
         * vcpu->arch.pat, i.e. hPAT is a reflection of vcpu->arch.pat, not the
         * other way around.
         */
        save_cached.g_pat = vcpu->arch.pat;

        if (!(save->cr0 & X86_CR0_PG) ||
            !(save->cr0 & X86_CR0_PE) ||
            (save->rflags & X86_EFLAGS_VM) ||
            !nested_vmcb_check_save(vcpu, &ctl_cached, &save_cached))
                goto out_free;

Oh, and if we do plumb in @ctrl to __nested_vmcb_check_save(), I vote to
opportunistically drop the useless single-use wrappers (probably in a standalone
patch to plumb in @ctrl).  E.g. (completely untested)

---
 arch/x86/kvm/svm/nested.c | 71 ++++++++++++++++++---------------------
 arch/x86/kvm/svm/svm.c    |  2 +-
 arch/x86/kvm/svm/svm.h    |  6 ++--
 3 files changed, 35 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index a7d6fc1382a7..a429947c8966 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -339,8 +339,8 @@ static bool nested_svm_check_bitmap_pa(struct kvm_vcpu *vcpu, u64 pa, u32 size)
            kvm_vcpu_is_legal_gpa(vcpu, addr + size - 1);
 }
 
-static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
-                                        struct vmcb_ctrl_area_cached *control)
+static bool nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
+                                      struct vmcb_ctrl_area_cached *control)
 {
        if (CC(!vmcb12_is_intercept(control, INTERCEPT_VMRUN)))
                return false;
@@ -367,8 +367,9 @@ static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu,
 }
 
 /* Common checks that apply to both L1 and L2 state.  */
-static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
-                                    struct vmcb_save_area_cached *save)
+static bool nested_vmcb_check_save(struct kvm_vcpu *vcpu,
+                                  struct vmcb_ctrl_area_cached *ctrl,
+                                  struct vmcb_save_area_cached *save)
 {
        if (CC(!(save->efer & EFER_SVME)))
                return false;
@@ -399,25 +400,13 @@ static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
        if (CC(!kvm_valid_efer(vcpu, save->efer)))
                return false;
 
+       if (CC((ctrl->nested_ctl & SVM_NESTED_CTL_NP_ENABLE) &&
+              !kvm_pat_valid(save->g_pat)))
+               return false;
+
        return true;
 }
 
-static bool nested_vmcb_check_save(struct kvm_vcpu *vcpu)
-{
-       struct vcpu_svm *svm = to_svm(vcpu);
-       struct vmcb_save_area_cached *save = &svm->nested.save;
-
-       return __nested_vmcb_check_save(vcpu, save);
-}
-
-static bool nested_vmcb_check_controls(struct kvm_vcpu *vcpu)
-{
-       struct vcpu_svm *svm = to_svm(vcpu);
-       struct vmcb_ctrl_area_cached *ctl = &svm->nested.ctl;
-
-       return __nested_vmcb_check_controls(vcpu, ctl);
-}
-
 /*
  * If a feature is not advertised to L1, clear the corresponding vmcb12
  * intercept.
@@ -504,6 +493,9 @@ static void __nested_copy_vmcb_save_to_cache(struct vmcb_save_area_cached *to,
 
        to->dr6 = from->dr6;
        to->dr7 = from->dr7;
+
+       to->g_pat = from->g_pat;
+
 }
 
 void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
@@ -644,17 +636,14 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
                svm->nested.force_msr_bitmap_recalc = true;
        }
 
-       if (npt_enabled) {
-               if (nested_npt_enabled(svm)) {
-                       if (unlikely(new_vmcb12 ||
-                                    vmcb_is_dirty(vmcb12, VMCB_NPT))) {
-                               vmcb02->save.g_pat = svm->nested.gpat;
-                               vmcb_mark_dirty(vmcb02, VMCB_NPT);
-                       }
-               } else {
-                       vmcb02->save.g_pat = vcpu->arch.pat;
+       if (nested_npt_enabled(svm)) {
+               if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_NPT))) {
+                       vmcb02->save.g_pat = svm->nested.save.g_pat;
                        vmcb_mark_dirty(vmcb02, VMCB_NPT);
                }
+       } else if (npt_enabled) {
+               vmcb02->save.g_pat = vcpu->arch.pat;
+               vmcb_mark_dirty(vmcb02, VMCB_NPT);
        }
 
        if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_SEG))) {
@@ -1028,11 +1017,9 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 
        nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
        nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
-       svm->nested.gpat = vmcb12->save.g_pat;
 
-       if (!nested_vmcb_check_save(vcpu) ||
-           !nested_vmcb_check_controls(vcpu) ||
-           (nested_npt_enabled(svm) && !kvm_pat_valid(svm->nested.gpat))) {
+       if (!nested_vmcb_check_save(vcpu, &svm->nested.ctl, &svm->nested.save) ||
+           !nested_vmcb_check_controls(vcpu, &svm->nested.ctl)) {
                vmcb12->control.exit_code    = SVM_EXIT_ERR;
                vmcb12->control.exit_info_1  = 0;
                vmcb12->control.exit_info_2  = 0;
@@ -1766,7 +1753,7 @@ static int svm_get_nested_state(struct kvm_vcpu *vcpu,
                kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb12_gpa;
                if (nested_npt_enabled(svm)) {
                        kvm_state.hdr.svm.flags |= KVM_STATE_SVM_VALID_GPAT;
-                       kvm_state.hdr.svm.gpat = svm->nested.gpat;
+                       kvm_state.hdr.svm.gpat = svm->nested.save.g_pat;
                }
                kvm_state.size += KVM_STATE_NESTED_SVM_VMCB_SIZE;
                kvm_state.flags |= KVM_STATE_NESTED_GUEST_MODE;
@@ -1871,7 +1858,7 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 
        ret = -EINVAL;
        __nested_copy_vmcb_control_to_cache(vcpu, &ctl_cached, ctl);
-       if (!__nested_vmcb_check_controls(vcpu, &ctl_cached))
+       if (!nested_vmcb_check_controls(vcpu, &ctl_cached))
                goto out_free;
 
        /*
@@ -1887,15 +1874,21 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
         * nested_svm_check_permissions).
         */
        __nested_copy_vmcb_save_to_cache(&save_cached, save);
+
+       /*
+        * Stuff gPAT in L1's save state, as older KVM may not have saved L1's
+        * gPAT.  L1's PAT, i.e. hPAT for the vCPU, is *always* tracked in
+        * vcpu->arch.pat, i.e. hPAT is a reflection of vcpu->arch.pat, not the
+        * other way around.
+        */
+       save_cached.g_pat = vcpu->arch.pat;
+
        if (!(save->cr0 & X86_CR0_PG) ||
            !(save->cr0 & X86_CR0_PE) ||
            (save->rflags & X86_EFLAGS_VM) ||
-           !__nested_vmcb_check_save(vcpu, &save_cached))
+           !nested_vmcb_check_save(vcpu, &ctl_cached, &save_cached))
                goto out_free;
 
-       /*
-        * Validate gPAT, if provided.
-        */
        if ((kvm_state->hdr.svm.flags & KVM_STATE_SVM_VALID_GPAT) &&
            !kvm_pat_valid(kvm_state->hdr.svm.gpat))
                goto out_free;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a6a44deec82b..bf8562a5f655 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2862,7 +2862,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
                WARN_ON_ONCE(msr_info->host_initiated && vcpu->wants_to_run);
                if (!msr_info->host_initiated && is_guest_mode(vcpu) &&
                    nested_npt_enabled(svm))
-                       msr_info->data = svm->nested.gpat;
+                       msr_info->data = svm->nested.save.g_pat;
                else
                        msr_info->data = vcpu->arch.pat;
                break;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index a559cd45c8a9..6f07d8e3f06e 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -146,6 +146,7 @@ struct vmcb_save_area_cached {
        u64 cr0;
        u64 dr7;
        u64 dr6;
+       u64 g_pat;
 };
 
 struct vmcb_ctrl_area_cached {
@@ -208,9 +209,6 @@ struct svm_nested_state {
         */
        struct vmcb_save_area_cached save;
 
-       /* Cached guest PAT from vmcb12.save.g_pat */
-       u64 gpat;
-
        bool initialized;
 
        /*
@@ -599,7 +597,7 @@ static inline bool nested_npt_enabled(struct vcpu_svm *svm)
 
 static inline void svm_set_gpat(struct vcpu_svm *svm, u64 data)
 {
-       svm->nested.gpat = data;
+       svm->nested.save.g_pat = data;
        svm_set_vmcb_gpat(svm->nested.vmcb02.ptr, data);
 }
 

base-commit: 6461c50e232d6f81d5b9604236f7ee3df870e932
-- 
