Re: [Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support
On 2018-06-30 08:05, Paolo Bonzini wrote: > On 30/06/2018 07:25, Jan Kiszka wrote: >> On 2018-06-27 14:14, Paolo Bonzini wrote: >>> On 03/04/2018 17:36, Jan Kiszka wrote: +static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType access_type, +int *prot) +{ +CPUX86State *env = _CPU(cs)->env; +uint64_t rsvd_mask = PG_HI_RSVD_MASK; +uint64_t ptep, pte; +uint64_t exit_info_1 = 0; +target_ulong pde_addr, pte_addr; +uint32_t page_offset; +int page_size; + +if (likely(!(env->hflags & HF_NPT_MASK))) { +return gphys; +} >>> >>> hflags are a somewhat limited resource. Can this go in hflags2? >>> >> >> Will have a look - I don't seen why not. Or is there any special >> semantical difference between both fields? > > Yes, hflags become flags of the translation block, while hflags2 are > just random processor state. If translate.c uses it you must use > hflags, but here hflags2 should be safe. Indeed. We only change it at vmentry/exit, and that is already a translation block barrier. v2 is on the way. Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support
On 30/06/2018 07:25, Jan Kiszka wrote: > On 2018-06-27 14:14, Paolo Bonzini wrote: >> On 03/04/2018 17:36, Jan Kiszka wrote: >>> >>> +static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType >>> access_type, >>> +int *prot) >>> +{ >>> +CPUX86State *env = _CPU(cs)->env; >>> +uint64_t rsvd_mask = PG_HI_RSVD_MASK; >>> +uint64_t ptep, pte; >>> +uint64_t exit_info_1 = 0; >>> +target_ulong pde_addr, pte_addr; >>> +uint32_t page_offset; >>> +int page_size; >>> + >>> +if (likely(!(env->hflags & HF_NPT_MASK))) { >>> +return gphys; >>> +} >> >> hflags are a somewhat limited resource. Can this go in hflags2? >> > > Will have a look - I don't seen why not. Or is there any special > semantical difference between both fields? Yes, hflags become flags of the translation block, while hflags2 are just random processor state. If translate.c uses it you must use hflags, but here hflags2 should be safe. Thanks, Paolo >>> >>> + >>> +env->nested_pg_mode = 0; >>> +if (env->cr[4] & CR4_PAE_MASK) { >>> +env->nested_pg_mode |= SVM_NPT_PAE; >>> +} >>> +if (env->hflags & HF_LMA_MASK) { >>> +env->nested_pg_mode |= SVM_NPT_LMA; >>> +} >>> +if (env->efer & MSR_EFER_NXE) { >>> +env->nested_pg_mode |= SVM_NPT_NXE; >>> +} >>> +} >>> + >> >> This needs to be migrated. You can put it in a subsection, conditional >> on hflags & HF_SVMI_MASK. > > OK. > >> >> Also, do you need to flush the TLB unconditionally, even if CR0.PG is zero? > Cannot follow you here yet. What flush are you referring to? > > Also, CR0.PG would not reflect if NPT is on, which now also contributes > to our TLB. > >> >> Otherwise looks good. I have queued patches 1-3, but hopefully this one >> can go in the next release too. Sorry for the delayed review. > > No problem. > > Thanks, > Jan >
Re: [Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support
On 2018-06-27 14:14, Paolo Bonzini wrote: > On 03/04/2018 17:36, Jan Kiszka wrote: >> >> +static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType >> access_type, >> +int *prot) >> +{ >> +CPUX86State *env = _CPU(cs)->env; >> +uint64_t rsvd_mask = PG_HI_RSVD_MASK; >> +uint64_t ptep, pte; >> +uint64_t exit_info_1 = 0; >> +target_ulong pde_addr, pte_addr; >> +uint32_t page_offset; >> +int page_size; >> + >> +if (likely(!(env->hflags & HF_NPT_MASK))) { >> +return gphys; >> +} > > hflags are a somewhat limited resource. Can this go in hflags2? > Will have a look - I don't seen why not. Or is there any special semantical difference between both fields? >> >> + >> +env->nested_pg_mode = 0; >> +if (env->cr[4] & CR4_PAE_MASK) { >> +env->nested_pg_mode |= SVM_NPT_PAE; >> +} >> +if (env->hflags & HF_LMA_MASK) { >> +env->nested_pg_mode |= SVM_NPT_LMA; >> +} >> +if (env->efer & MSR_EFER_NXE) { >> +env->nested_pg_mode |= SVM_NPT_NXE; >> +} >> +} >> + > > This needs to be migrated. You can put it in a subsection, conditional > on hflags & HF_SVMI_MASK. OK. > > Also, do you need to flush the TLB unconditionally, even if CR0.PG is zero? Cannot follow you here yet. What flush are you referring to? Also, CR0.PG would not reflect if NPT is on, which now also contributes to our TLB. > > Otherwise looks good. I have queued patches 1-3, but hopefully this one > can go in the next release too. Sorry for the delayed review. No problem. Thanks, Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support
On 03/04/2018 17:36, Jan Kiszka wrote: > > +static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType > access_type, > +int *prot) > +{ > +CPUX86State *env = _CPU(cs)->env; > +uint64_t rsvd_mask = PG_HI_RSVD_MASK; > +uint64_t ptep, pte; > +uint64_t exit_info_1 = 0; > +target_ulong pde_addr, pte_addr; > +uint32_t page_offset; > +int page_size; > + > +if (likely(!(env->hflags & HF_NPT_MASK))) { > +return gphys; > +} hflags are a somewhat limited resource. Can this go in hflags2? > > + > +env->nested_pg_mode = 0; > +if (env->cr[4] & CR4_PAE_MASK) { > +env->nested_pg_mode |= SVM_NPT_PAE; > +} > +if (env->hflags & HF_LMA_MASK) { > +env->nested_pg_mode |= SVM_NPT_LMA; > +} > +if (env->efer & MSR_EFER_NXE) { > +env->nested_pg_mode |= SVM_NPT_NXE; > +} > +} > + This needs to be migrated. You can put it in a subsection, conditional on hflags & HF_SVMI_MASK. Also, do you need to flush the TLB unconditionally, even if CR0.PG is zero? Otherwise looks good. I have queued patches 1-3, but hopefully this one can go in the next release too. Sorry for the delayed review. Paolo
[Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support
From: Jan KiszkaThis implements NPT suport for SVM by hooking into x86_cpu_handle_mmu_fault where it reads the stage-1 page table. Whether we need to perform this 2nd stage translation, and how, is decided during vmrun and stored in hflags as well as nested_cr3 and nested_pg_mode. As get_hphys performs a direct cpu_vmexit in case of NPT faults, we need retaddr in that function. To avoid changing the signature of cpu_handle_mmu_fault, this passes the value from tlb_fill to get_hphys via the CPU state. This was tested successfully via the Jailhouse hypervisor. Signed-off-by: Jan Kiszka --- target/i386/cpu.c | 2 +- target/i386/cpu.h | 6 ++ target/i386/excp_helper.c | 216 +- target/i386/mem_helper.c | 6 +- target/i386/svm.h | 14 +++ target/i386/svm_helper.c | 22 + 6 files changed, 260 insertions(+), 6 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 555ae79d29..0d14254ca1 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -261,7 +261,7 @@ static void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1, #define TCG_EXT3_FEATURES (CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM | \ CPUID_EXT3_CR8LEG | CPUID_EXT3_ABM | CPUID_EXT3_SSE4A) #define TCG_EXT4_FEATURES 0 -#define TCG_SVM_FEATURES 0 +#define TCG_SVM_FEATURES CPUID_SVM_NPT #define TCG_KVM_FEATURES 0 #define TCG_7_0_EBX_FEATURES (CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_SMAP | \ CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ADX | \ diff --git a/target/i386/cpu.h b/target/i386/cpu.h index d711634c2f..106a70e1cf 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -177,6 +177,7 @@ typedef enum X86Seg { #define HF_IOBPT_SHIFT 24 /* an io breakpoint enabled */ #define HF_MPX_EN_SHIFT 25 /* MPX Enabled (CR4+XCR0+BNDCFGx) */ #define HF_MPX_IU_SHIFT 26 /* BND registers in-use */ +#define HF_NPT_SHIFT27 /* Nested Paging enabled */ #define HF_CPL_MASK (3 << HF_CPL_SHIFT) #define HF_INHIBIT_IRQ_MASK (1 << HF_INHIBIT_IRQ_SHIFT) @@ -202,6 +203,7 @@ typedef enum X86Seg { #define HF_IOBPT_MASK(1 << HF_IOBPT_SHIFT) #define HF_MPX_EN_MASK (1 << HF_MPX_EN_SHIFT) #define HF_MPX_IU_MASK (1 << HF_MPX_IU_SHIFT) +#define HF_NPT_MASK (1 << HF_NPT_SHIFT) /* hflags2 */ @@ -1201,12 +1203,16 @@ typedef struct CPUX86State { uint16_t intercept_dr_read; uint16_t intercept_dr_write; uint32_t intercept_exceptions; +uint64_t nested_cr3; +uint32_t nested_pg_mode; uint8_t v_tpr; /* KVM states, automatically cleared on reset */ uint8_t nmi_injected; uint8_t nmi_pending; +uintptr_t retaddr; + /* Fields up to this point are cleared by a CPU reset */ struct {} end_reset_fields; diff --git a/target/i386/excp_helper.c b/target/i386/excp_helper.c index cb4d1b7d33..e3bb59284f 100644 --- a/target/i386/excp_helper.c +++ b/target/i386/excp_helper.c @@ -157,6 +157,209 @@ int x86_cpu_handle_mmu_fault(CPUState *cs, vaddr addr, int size, #else +static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType access_type, +int *prot) +{ +CPUX86State *env = _CPU(cs)->env; +uint64_t rsvd_mask = PG_HI_RSVD_MASK; +uint64_t ptep, pte; +uint64_t exit_info_1 = 0; +target_ulong pde_addr, pte_addr; +uint32_t page_offset; +int page_size; + +if (likely(!(env->hflags & HF_NPT_MASK))) { +return gphys; +} + +if (!(env->nested_pg_mode & SVM_NPT_NXE)) { +rsvd_mask |= PG_NX_MASK; +} + +if (env->nested_pg_mode & SVM_NPT_PAE) { +uint64_t pde, pdpe; +target_ulong pdpe_addr; + +#ifdef TARGET_X86_64 +if (env->nested_pg_mode & SVM_NPT_LMA) { +uint64_t pml5e; +uint64_t pml4e_addr, pml4e; + +pml5e = env->nested_cr3; +ptep = PG_NX_MASK | PG_USER_MASK | PG_RW_MASK; + +pml4e_addr = (pml5e & PG_ADDRESS_MASK) + +(((gphys >> 39) & 0x1ff) << 3); +pml4e = x86_ldq_phys(cs, pml4e_addr); +if (!(pml4e & PG_PRESENT_MASK)) { +goto do_fault; +} +if (pml4e & (rsvd_mask | PG_PSE_MASK)) { +goto do_fault_rsvd; +} +if (!(pml4e & PG_ACCESSED_MASK)) { +pml4e |= PG_ACCESSED_MASK; +x86_stl_phys_notdirty(cs, pml4e_addr, pml4e); +} +ptep &= pml4e ^ PG_NX_MASK; +pdpe_addr = (pml4e & PG_ADDRESS_MASK) + +(((gphys >> 30) & 0x1ff) << 3); +pdpe = x86_ldq_phys(cs, pdpe_addr); +if (!(pdpe & PG_PRESENT_MASK)) { +goto do_fault; +} +if (pdpe & rsvd_mask) { +goto do_fault_rsvd; +} +ptep &= pdpe ^ PG_NX_MASK; +if (!(pdpe & PG_ACCESSED_MASK))