Re: [Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support

2018-06-30 Thread Jan Kiszka
On 2018-06-30 08:05, Paolo Bonzini wrote:
> On 30/06/2018 07:25, Jan Kiszka wrote:
>> On 2018-06-27 14:14, Paolo Bonzini wrote:
>>> On 03/04/2018 17:36, Jan Kiszka wrote:
  
 +static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType 
 access_type,
 +int *prot)
 +{
 +CPUX86State *env = _CPU(cs)->env;
 +uint64_t rsvd_mask = PG_HI_RSVD_MASK;
 +uint64_t ptep, pte;
 +uint64_t exit_info_1 = 0;
 +target_ulong pde_addr, pte_addr;
 +uint32_t page_offset;
 +int page_size;
 +
 +if (likely(!(env->hflags & HF_NPT_MASK))) {
 +return gphys;
 +}
>>>
>>> hflags are a somewhat limited resource.  Can this go in hflags2?
>>>
>>
>> Will have a look - I don't seen why not. Or is there any special
>> semantical difference between both fields?
> 
> Yes, hflags become flags of the translation block, while hflags2 are
> just random processor state.  If translate.c uses it you must use
> hflags, but here hflags2 should be safe.

Indeed. We only change it at vmentry/exit, and that is already a
translation block barrier.

v2 is on the way.

Jan



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support

2018-06-30 Thread Paolo Bonzini
On 30/06/2018 07:25, Jan Kiszka wrote:
> On 2018-06-27 14:14, Paolo Bonzini wrote:
>> On 03/04/2018 17:36, Jan Kiszka wrote:
>>>  
>>> +static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType 
>>> access_type,
>>> +int *prot)
>>> +{
>>> +CPUX86State *env = _CPU(cs)->env;
>>> +uint64_t rsvd_mask = PG_HI_RSVD_MASK;
>>> +uint64_t ptep, pte;
>>> +uint64_t exit_info_1 = 0;
>>> +target_ulong pde_addr, pte_addr;
>>> +uint32_t page_offset;
>>> +int page_size;
>>> +
>>> +if (likely(!(env->hflags & HF_NPT_MASK))) {
>>> +return gphys;
>>> +}
>>
>> hflags are a somewhat limited resource.  Can this go in hflags2?
>>
> 
> Will have a look - I don't seen why not. Or is there any special
> semantical difference between both fields?

Yes, hflags become flags of the translation block, while hflags2 are
just random processor state.  If translate.c uses it you must use
hflags, but here hflags2 should be safe.

Thanks,

Paolo

>>>
>>> +
>>> +env->nested_pg_mode = 0;
>>> +if (env->cr[4] & CR4_PAE_MASK) {
>>> +env->nested_pg_mode |= SVM_NPT_PAE;
>>> +}
>>> +if (env->hflags & HF_LMA_MASK) {
>>> +env->nested_pg_mode |= SVM_NPT_LMA;
>>> +}
>>> +if (env->efer & MSR_EFER_NXE) {
>>> +env->nested_pg_mode |= SVM_NPT_NXE;
>>> +}
>>> +}
>>> +
>>
>> This needs to be migrated.  You can put it in a subsection, conditional
>> on hflags & HF_SVMI_MASK.
> 
> OK.
> 
>>
>> Also, do you need to flush the TLB unconditionally, even if CR0.PG is zero?
> Cannot follow you here yet. What flush are you referring to?
> 
> Also, CR0.PG would not reflect if NPT is on, which now also contributes
> to our TLB.
> 
>>
>> Otherwise looks good.  I have queued patches 1-3, but hopefully this one
>> can go in the next release too.  Sorry for the delayed review.
> 
> No problem.
> 
> Thanks,
> Jan
> 




Re: [Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support

2018-06-29 Thread Jan Kiszka
On 2018-06-27 14:14, Paolo Bonzini wrote:
> On 03/04/2018 17:36, Jan Kiszka wrote:
>>  
>> +static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType 
>> access_type,
>> +int *prot)
>> +{
>> +CPUX86State *env = _CPU(cs)->env;
>> +uint64_t rsvd_mask = PG_HI_RSVD_MASK;
>> +uint64_t ptep, pte;
>> +uint64_t exit_info_1 = 0;
>> +target_ulong pde_addr, pte_addr;
>> +uint32_t page_offset;
>> +int page_size;
>> +
>> +if (likely(!(env->hflags & HF_NPT_MASK))) {
>> +return gphys;
>> +}
> 
> hflags are a somewhat limited resource.  Can this go in hflags2?
> 

Will have a look - I don't seen why not. Or is there any special
semantical difference between both fields?

>>
>> +
>> +env->nested_pg_mode = 0;
>> +if (env->cr[4] & CR4_PAE_MASK) {
>> +env->nested_pg_mode |= SVM_NPT_PAE;
>> +}
>> +if (env->hflags & HF_LMA_MASK) {
>> +env->nested_pg_mode |= SVM_NPT_LMA;
>> +}
>> +if (env->efer & MSR_EFER_NXE) {
>> +env->nested_pg_mode |= SVM_NPT_NXE;
>> +}
>> +}
>> +
> 
> This needs to be migrated.  You can put it in a subsection, conditional
> on hflags & HF_SVMI_MASK.

OK.

> 
> Also, do you need to flush the TLB unconditionally, even if CR0.PG is zero?
Cannot follow you here yet. What flush are you referring to?

Also, CR0.PG would not reflect if NPT is on, which now also contributes
to our TLB.

> 
> Otherwise looks good.  I have queued patches 1-3, but hopefully this one
> can go in the next release too.  Sorry for the delayed review.

No problem.

Thanks,
Jan



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support

2018-06-27 Thread Paolo Bonzini
On 03/04/2018 17:36, Jan Kiszka wrote:
>  
> +static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType 
> access_type,
> +int *prot)
> +{
> +CPUX86State *env = _CPU(cs)->env;
> +uint64_t rsvd_mask = PG_HI_RSVD_MASK;
> +uint64_t ptep, pte;
> +uint64_t exit_info_1 = 0;
> +target_ulong pde_addr, pte_addr;
> +uint32_t page_offset;
> +int page_size;
> +
> +if (likely(!(env->hflags & HF_NPT_MASK))) {
> +return gphys;
> +}

hflags are a somewhat limited resource.  Can this go in hflags2?

> 
> +
> +env->nested_pg_mode = 0;
> +if (env->cr[4] & CR4_PAE_MASK) {
> +env->nested_pg_mode |= SVM_NPT_PAE;
> +}
> +if (env->hflags & HF_LMA_MASK) {
> +env->nested_pg_mode |= SVM_NPT_LMA;
> +}
> +if (env->efer & MSR_EFER_NXE) {
> +env->nested_pg_mode |= SVM_NPT_NXE;
> +}
> +}
> +

This needs to be migrated.  You can put it in a subsection, conditional
on hflags & HF_SVMI_MASK.

Also, do you need to flush the TLB unconditionally, even if CR0.PG is zero?

Otherwise looks good.  I have queued patches 1-3, but hopefully this one
can go in the next release too.  Sorry for the delayed review.

Paolo



[Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support

2018-04-03 Thread Jan Kiszka
From: Jan Kiszka 

This implements NPT suport for SVM by hooking into
x86_cpu_handle_mmu_fault where it reads the stage-1 page table. Whether
we need to perform this 2nd stage translation, and how, is decided
during vmrun and stored in hflags as well as nested_cr3 and
nested_pg_mode.

As get_hphys performs a direct cpu_vmexit in case of NPT faults, we need
retaddr in that function. To avoid changing the signature of
cpu_handle_mmu_fault, this passes the value from tlb_fill to get_hphys
via the CPU state.

This was tested successfully via the Jailhouse hypervisor.

Signed-off-by: Jan Kiszka 
---
 target/i386/cpu.c |   2 +-
 target/i386/cpu.h |   6 ++
 target/i386/excp_helper.c | 216 +-
 target/i386/mem_helper.c  |   6 +-
 target/i386/svm.h |  14 +++
 target/i386/svm_helper.c  |  22 +
 6 files changed, 260 insertions(+), 6 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 555ae79d29..0d14254ca1 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -261,7 +261,7 @@ static void x86_cpu_vendor_words2str(char *dst, uint32_t 
vendor1,
 #define TCG_EXT3_FEATURES (CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM | \
   CPUID_EXT3_CR8LEG | CPUID_EXT3_ABM | CPUID_EXT3_SSE4A)
 #define TCG_EXT4_FEATURES 0
-#define TCG_SVM_FEATURES 0
+#define TCG_SVM_FEATURES CPUID_SVM_NPT
 #define TCG_KVM_FEATURES 0
 #define TCG_7_0_EBX_FEATURES (CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_SMAP | \
   CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ADX | \
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d711634c2f..106a70e1cf 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -177,6 +177,7 @@ typedef enum X86Seg {
 #define HF_IOBPT_SHIFT  24 /* an io breakpoint enabled */
 #define HF_MPX_EN_SHIFT 25 /* MPX Enabled (CR4+XCR0+BNDCFGx) */
 #define HF_MPX_IU_SHIFT 26 /* BND registers in-use */
+#define HF_NPT_SHIFT27 /* Nested Paging enabled */
 
 #define HF_CPL_MASK  (3 << HF_CPL_SHIFT)
 #define HF_INHIBIT_IRQ_MASK  (1 << HF_INHIBIT_IRQ_SHIFT)
@@ -202,6 +203,7 @@ typedef enum X86Seg {
 #define HF_IOBPT_MASK(1 << HF_IOBPT_SHIFT)
 #define HF_MPX_EN_MASK   (1 << HF_MPX_EN_SHIFT)
 #define HF_MPX_IU_MASK   (1 << HF_MPX_IU_SHIFT)
+#define HF_NPT_MASK  (1 << HF_NPT_SHIFT)
 
 /* hflags2 */
 
@@ -1201,12 +1203,16 @@ typedef struct CPUX86State {
 uint16_t intercept_dr_read;
 uint16_t intercept_dr_write;
 uint32_t intercept_exceptions;
+uint64_t nested_cr3;
+uint32_t nested_pg_mode;
 uint8_t v_tpr;
 
 /* KVM states, automatically cleared on reset */
 uint8_t nmi_injected;
 uint8_t nmi_pending;
 
+uintptr_t retaddr;
+
 /* Fields up to this point are cleared by a CPU reset */
 struct {} end_reset_fields;
 
diff --git a/target/i386/excp_helper.c b/target/i386/excp_helper.c
index cb4d1b7d33..e3bb59284f 100644
--- a/target/i386/excp_helper.c
+++ b/target/i386/excp_helper.c
@@ -157,6 +157,209 @@ int x86_cpu_handle_mmu_fault(CPUState *cs, vaddr addr, 
int size,
 
 #else
 
+static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType access_type,
+int *prot)
+{
+CPUX86State *env = _CPU(cs)->env;
+uint64_t rsvd_mask = PG_HI_RSVD_MASK;
+uint64_t ptep, pte;
+uint64_t exit_info_1 = 0;
+target_ulong pde_addr, pte_addr;
+uint32_t page_offset;
+int page_size;
+
+if (likely(!(env->hflags & HF_NPT_MASK))) {
+return gphys;
+}
+
+if (!(env->nested_pg_mode & SVM_NPT_NXE)) {
+rsvd_mask |= PG_NX_MASK;
+}
+
+if (env->nested_pg_mode & SVM_NPT_PAE) {
+uint64_t pde, pdpe;
+target_ulong pdpe_addr;
+
+#ifdef TARGET_X86_64
+if (env->nested_pg_mode & SVM_NPT_LMA) {
+uint64_t pml5e;
+uint64_t pml4e_addr, pml4e;
+
+pml5e = env->nested_cr3;
+ptep = PG_NX_MASK | PG_USER_MASK | PG_RW_MASK;
+
+pml4e_addr = (pml5e & PG_ADDRESS_MASK) +
+(((gphys >> 39) & 0x1ff) << 3);
+pml4e = x86_ldq_phys(cs, pml4e_addr);
+if (!(pml4e & PG_PRESENT_MASK)) {
+goto do_fault;
+}
+if (pml4e & (rsvd_mask | PG_PSE_MASK)) {
+goto do_fault_rsvd;
+}
+if (!(pml4e & PG_ACCESSED_MASK)) {
+pml4e |= PG_ACCESSED_MASK;
+x86_stl_phys_notdirty(cs, pml4e_addr, pml4e);
+}
+ptep &= pml4e ^ PG_NX_MASK;
+pdpe_addr = (pml4e & PG_ADDRESS_MASK) +
+(((gphys >> 30) & 0x1ff) << 3);
+pdpe = x86_ldq_phys(cs, pdpe_addr);
+if (!(pdpe & PG_PRESENT_MASK)) {
+goto do_fault;
+}
+if (pdpe & rsvd_mask) {
+goto do_fault_rsvd;
+}
+ptep &= pdpe ^ PG_NX_MASK;
+if (!(pdpe & PG_ACCESSED_MASK))