Thanks for your review, Jim.

On 2017-10-13 at 09:57:45 -0700, Jim Mattson wrote:
> I'll ask before Paolo does: Can you please add kvm-unit-tests to
> exercise all of this new code?

This should be exercised by an API/ioctl tool rather than a kvm-unit-test.
Actually, I have prepared a draft version of such a tool, embedded in qemu,
which means that we can set/get the sub-page protection via qemu commands.
The qemu patch is attached. BTW, it is a pre-design version; I will send a
formal qemu patch to the qemu list after the API/ioctl is settled on the
kvm side.

>
> BTW, what generation of hardware do we need to exercise this code ourselves?

As far as I know, this feature will be enabled on Intel's next-generation
Ice Lake chips.

>
> On Fri, Oct 13, 2017 at 4:11 PM, Zhang Yi <[email protected]> wrote:
> > From: Zhang Yi Z <[email protected]>
> >
> > Hi All,
> >
> > Here is a patch series which adds EPT-Based Sub-page Write Protection
> > support. You can get its software developer manual from:
> >
> > https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> >
> > See Chapter 4, EPT-BASED SUB-PAGE PERMISSIONS.
> >
> > Introduction:
> >
> > EPT-Based Sub-page Write Protection, referred to as SPP, is a capability
> > which allows Virtual Machine Monitors (VMMs) to specify write permission
> > for guest physical memory at a sub-page (128 byte) granularity. When this
> > capability is utilized, the CPU enforces write-access permissions for
> > sub-page regions of 4K pages as specified by the VMM. EPT-based sub-page
> > permissions are intended to enable fine-grained memory write enforcement
> > by a VMM for security (guest OS monitoring) and usages such as device
> > virtualization and memory check-pointing.
> >
> > How SPP Works:
> >
> > SPP is active when the "sub-page write protection" VM-execution control is
> > 1. A new 4-level paging structure named the SPP page table (SPPT) is
> > introduced. The SPPT is looked up with the guest physical address to
> > derive a 64-bit "sub-page permission" value containing the sub-page write
> > permissions. The lookup from guest-physical addresses to the sub-page
> > region permissions is determined by a set of these SPPT paging structures.
> >
> > The SPPT is used to look up write permission bits for the 128 byte sub-page
> > regions contained in the 4KB guest physical page.
> > EPT specifies the 4KB page level privileges that software is allowed when
> > accessing the guest physical address, whereas SPPT defines the write
> > permissions for software at the 128 byte granularity regions within a 4KB
> > page. Write accesses prevented due to sub-page permissions looked up via
> > SPPT are reported as EPT violation VM exits. Similar to EPT, a logical
> > processor uses SPPT to look up sub-page region write permissions for
> > guest-physical addresses only when those addresses are used to access
> > memory.
> >
> > Guest write access --> GPA --> Walk EPT --> EPT leaf entry -┐
> > ┌-----------------------------------------------------------┘
> > └-> if VMexec_control.spp && ept_leaf_entry.spp_bit (bit 61)
> >      |
> >      └-> <false> --> EPT legacy behavior
> >      |
> >      |
> >      └-> <true>  --> if ept_leaf_entry.writable
> >                       |
> >                       └-> <true>  --> Ignore SPP
> >                       |
> >                       └-> <false> --> GPA --> Walk SPP 4-level table--┐
> >                                                                       |
> > ┌------------<--------get-the-SPPT-pointer-from-VMCS-field-----<------┘
> > |
> > Walk SPP L4E table
> > |
> > └┐--> entry misconfiguration ------------>----------┐<----------------┐
> >  |                                                  |                 |
> > else                                                |                 |
> >  |                                                  |                 |
> >  |   ┌------------------SPP VMexit<-----------------┘                 |
> >  |   |                                                                |
> >  |   └-> exit_qualification & sppt_misconfig --> sppt misconfig       |
> >  |   |                                                                |
> >  |   └-> exit_qualification & sppt_miss --> sppt miss                 |
> >  └--┐                                                                 |
> >     |                                                                 |
> > walk SPPT L3E--┐--> if-entry-misconfiguration------------>------------┘
> >                |                                                      |
> >               else                                                    |
> >                |                                                      |
> >                |                                                      |
> >         walk SPPT L2E --┐--> if-entry-misconfiguration-------->-------┘
> >                         |                                             |
> >                        else                                           |
> >                         |                                             |
> >                         |                                             |
> >                  walk SPPT L1E --┐-> if-entry-misconfiguration--->----┘
> >                                  |
> >                                 else
> >                                  |
> >                                  └-> if sub-page writable
> >                                       └-> <true>  allow, write access
> >                                       └-> <false> disallow, EPT violation
> >
> > Patch-sets Description:
> >
> > Patch 1: Documentation.
> >
> > Patch 2: This patch adds reporting of the SPP capability from the VMX
> > Procbased MSR. According to the hardware spec, bit 23 is the control of
> > the SPP capability.
> >
> > Patch 3: Add a new secondary processor-based VM-execution control bit,
> > defined as "sub-page write permission". As in the VMX Procbased MSR, bit 23
> > is the enable bit of SPP.
> > We also introduce a kernel parameter "enable_ept_spp"; SPP is now active
> > when "Sub-page Write Protection" in the Secondary VM-Execution Controls is
> > set and the kernel parameter is enabled with "enable_ept_spp=1".
> >
> > Patch 4: Introduce the SPPTP and the SPP page table.
> > The sub-page permission table is referenced via a 64-bit control field
> > called the Sub-Page Permission Table Pointer (SPPTP), which contains a
> > 4K-aligned physical address. The index and encoding for this VMCS field is
> > defined as 0x2030 at this time. The format of the SPPTP is shown in
> > figure 2.
> > This patch introduces the SPP paging structures, whose root page is
> > created at kvm mmu page initialization.
> > We also add an mmu page role type "spp" to distinguish an SPP page from an
> > EPT page.
> >
> > Patch 5: Introduce the SPP-induced VM exit and its handler.
> > Accesses using guest-physical addresses may cause SPP-induced VM exits due
> > to an SPPT misconfiguration or an SPPT miss. The basic VM exit reason code
> > reported for SPP-induced VM exits is 66.
> >
> > Also introduce the new exit qualification for SPPT-induced VM exits.
> >
> > | Bit   | Contents                                                          |
> > | :---- | :---------------------------------------------------------------- |
> > | 10:0  | Reserved (0).                                                     |
> > | 11    | SPPT VM exit type. Set for SPPT Miss, cleared for SPPT Misconfig. |
> > | 12    | NMI unblocking due to IRET                                        |
> > | 63:13 | Reserved (0)                                                      |
> >
> > Patch 6: Add a handler for the EPT sub-page write protection fault.
> > A control bit in EPT leaf paging-structure entries is defined as “Sub-Page
> > Permission” (SPP bit).
> > The bit position is 61; it is chosen from among the bits that are
> > currently ignored by the processor and available to software.
> > While the hardware walks the SPP page table, if the sub-page region write
> > permission bit is set, the write is allowed; otherwise the write is
> > disallowed and results in an EPT violation.
> > We need to catch this case in the EPT violation handler and trigger a
> > user-space exit, returning the write-protected address (GVA) to user
> > space (qemu).
> >
> > Patch 7: Introduce ioctls to set/get Sub-Page Write Protection.
> > We introduce 2 ioctls to let a user application set/get the sub-page write
> > protection bitmap per gfn; each gfn corresponds to one bitmap.
> > The user application (qemu, or some other security control daemon) will
> > set the protection bitmap via this ioctl.
> > The API is defined as:
> >     struct kvm_subpage {
> >         __u64 base_gfn;
> >         __u64 npages;
> >         /* sub-page write-access bitmap array */
> >         __u32 access_map[SUBPAGE_MAX_BITMAP];
> >     }sp;
> >     kvm_vm_ioctl(s, KVM_SUBPAGES_SET_ACCESS, &sp)
> >     kvm_vm_ioctl(s, KVM_SUBPAGES_GET_ACCESS, &sp)
> >
> > Patch 8 ~ Patch 9: Set up the SPP page table and update the EPT leaf entry
> > indicated with the SPP enable bit.
> > If the sub-page write permission VM-execution control is set, treatment of
> > write accesses to guest-physical addresses depends on the state of the
> > accumulated write-access bit (position 1) and sub-page permission bit
> > (position 61) in the EPT leaf paging-structure.
> > Software will update the EPT leaf entry sub-page permission bit in
> > kvm_set_subpage (patch 7). If the EPT write-access bit is set to 0 and the
> > SPP bit is set to 1 in the leaf EPT paging-structure entry that maps a 4KB
> > page, then the hardware will look up a VMM-managed Sub-Page Permission
> > Table (SPPT), which will be prepared by the setup in kvm_set_subpage
> > (patch 8).
> > The hardware uses the guest-physical address and bits 11:7 of the address
> > accessed to look up the SPPT and fetch a write permission bit for the 128
> > byte wide sub-page region being accessed within the 4K guest-physical page.
> > If the sub-page region write permission bit is set, the write is allowed;
> > otherwise the write is disallowed and results in an EPT violation.
> > Guest-physical pages mapped via leaf EPT paging-structures for which the
> > accumulated write-access bit and the SPP bit are both clear (0) generate
> > EPT violations on memory write accesses. Guest-physical pages mapped via
> > EPT paging-structures for which the accumulated write-access bit is set (1)
> > allow writes, effectively ignoring the SPP bit in the leaf EPT paging
> > structure.
> > Software will set up SPP page table levels 4, 3 and 2 as well as the EPT
> > page structure, and fill the level 1 entry from the 32-bit bitmap for each
> > 4K page, so that a 4K page is divided into 32 sub-pages of 128 bytes each.
> >
> > The SPP L4E, L3E and L2E format is defined in the table below.
> >
> > | Bit    | Contents                                                               |
> > | :----- | :--------------------------------------------------------------------- |
> > | 0      | Valid entry when set; indicates whether the entry is present           |
> > | 11:1   | Reserved (0)                                                           |
> > | N-1:12 | Physical address of 4K aligned SPPT LX-1 Table referenced by the entry |
> > | 51:N   | Reserved (0)                                                           |
> > | 63:52  | Reserved (0)                                                           |
> > Note: N is the physical address width supported by the processor, X is the
> > page level
> >
> > The SPP L1E format is defined in the table below.
> > | Bit   | Contents                                            |
> > | :---- | :-------------------------------------------------- |
> > | 0+2i  | Write permission for i-th 128 byte sub-page region. |
> > | 1+2i  | Reserved (0).                                       |
> > Note: `0<=i<=31`
> >
> >
> > Zhang Yi Z (10):
> >   KVM: VMX: Added EPT Subpage Protection Documentation.
> >   x86/cpufeature: Add intel Sub-Page Protection to CPU features
> >   KVM: VMX: Added VMX SPP feature flags and VM-Execution Controls.
> >   KVM: VMX: Introduce the SPPTP and SPP page table.
> >   KVM: VMX: Introduce SPP-Induced vm exit and it's handle.
> >   KVM: VMX: Added handle of SPP write protection fault.
> >   KVM: VMX: Introduce ioctls to set/get Sub-Page Write Protection.
> >   KVM: VMX: Update the EPT leaf entry indicated with the SPP enable bit.
> >   KVM: VMX: Added setup spp page structure.
> >   KVM: VMX: implement setup SPP page structure in spp miss.
> >
> >  Documentation/virtual/kvm/spp_design_kvm.txt | 272 +++++++++++++++++++++
> >  arch/x86/include/asm/cpufeatures.h           |   1 +
> >  arch/x86/include/asm/kvm_host.h              |  18 +-
> >  arch/x86/include/asm/vmx.h                   |  10 +
> >  arch/x86/include/uapi/asm/vmx.h              |   2 +
> >  arch/x86/kernel/cpu/intel.c                  |   4 +
> >  arch/x86/kvm/mmu.c                           | 340 ++++++++++++++++++++++++++-
> >  arch/x86/kvm/mmu.h                           |   1 +
> >  arch/x86/kvm/vmx.c                           | 104 ++++++++
> >  arch/x86/kvm/x86.c                           |  99 +++++++-
> >  include/linux/kvm_host.h                     |   5 +
> >  include/uapi/linux/kvm.h                     |  16 ++
> >  virt/kvm/kvm_main.c                          |  26 ++
> >  13 files changed, 893 insertions(+), 5 deletions(-)
> >  create mode 100644 Documentation/virtual/kvm/spp_design_kvm.txt
> >
> > --
> > 2.7.4
> >
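
To make the L1E layout and the "bits 11:7" lookup described above a bit
more concrete, here is a small illustrative sketch. It is not taken from
the series. The helper names (spp_subpage_index, spp_l1e_from_access_map,
spp_l1e_write_allowed) are made up for this mail, and I am assuming that
bit i of the 32-bit access_map corresponds to the i-th 128 byte sub-page.

```c
#include <stdbool.h>
#include <stdint.h>

#define SPP_SUBPAGE_SHIFT      7   /* 128 byte sub-pages */
#define SPP_SUBPAGES_PER_PAGE  32  /* 4K / 128 */

/* Index (0..31) of the sub-page hit by a guest-physical address: bits 11:7. */
static inline unsigned int spp_subpage_index(uint64_t gpa)
{
    return (unsigned int)((gpa >> SPP_SUBPAGE_SHIFT) &
                          (SPP_SUBPAGES_PER_PAGE - 1));
}

/*
 * Spread a 32-bit write-access bitmap into the 64-bit L1E layout:
 * bit 2i carries the write permission of sub-page i, bit 2i+1 stays 0.
 * Assumption: access_map bit i describes sub-page i.
 */
static inline uint64_t spp_l1e_from_access_map(uint32_t access_map)
{
    uint64_t l1e = 0;
    unsigned int i;

    for (i = 0; i < SPP_SUBPAGES_PER_PAGE; i++) {
        if (access_map & (UINT32_C(1) << i)) {
            l1e |= UINT64_C(1) << (2 * i);
        }
    }
    return l1e;
}

/* Would a write to gpa be allowed by this L1E value? */
static inline bool spp_l1e_write_allowed(uint64_t l1e, uint64_t gpa)
{
    return (l1e >> (2 * spp_subpage_index(gpa))) & 1;
}
```

With these helpers, an access_map of 0x2 leaves only the second 128 byte
region of the page (offsets 0x80..0xff) writable.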
From a369bed5d986dccb3ca36dc5a27c6220ca2d1405 Mon Sep 17 00:00:00 2001
From: Zhang Yi Z <[email protected]>
Date: Tue, 14 Mar 2017 15:11:38 +0800
Subject: [PATCH] x86: Intel Sub-Page Protection support

Signed-off-by: He Chen <[email protected]>
Signed-off-by: Zhang Yi Z <[email protected]>
---
 hmp-commands.hx           | 26 ++++++++++++++++++++++++++
 hmp.c                     | 26 ++++++++++++++++++++++++++
 hmp.h                     |  2 ++
 include/sysemu/kvm.h      |  2 ++
 kvm-all.c                 | 40 ++++++++++++++++++++++++++++++++++++++++
 linux-headers/linux/kvm.h | 15 +++++++++++++++
 qapi-schema.json          | 41 +++++++++++++++++++++++++++++++++++++++++
 qmp.c                     | 43 +++++++++++++++++++++++++++++++++++++++++++
 target/i386/kvm.c         | 22 ++++++++++++++++++++++
 9 files changed, 217 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 8819281..7a57411 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1766,6 +1766,32 @@ Set QOM property @var{property} of object at location @var{path} to value @var{v
 ETEXI
 
     {
+        .name       = "get-subpage",
+        .args_type  = "base_gfn:l,npages:l,filename:str",
+        .params     = "base_gfn npages filename",
+        .help       = "get the write-protect bitmap setting of sub-page protection",
+        .cmd        = hmp_get_subpage,
+    },
+
+STEXI
+@item get-subpage @var{base_gfn} @var{npages} @var{filename}
+Get the write-protect bitmap setting of sub-page protection in the range of @var{base_gfn} to @var{base_gfn} + @var{npages}
+ETEXI
+
+    {
+        .name       = "set-subpage",
+        .args_type  = "base_gfn:l,npages:l,wp_map:i",
+        .params     = "base_gfn npages wp_map",
+        .help       = "set the write-protect bitmap setting of sub-page protection",
+        .cmd        = hmp_set_subpage,
+    },
+
+STEXI
+@item set-subpage @var{base_gfn} @var{npages} @var{wp_map}
+Set the write-protect bitmap setting of sub-page protection in the range of @var{base_gfn} to @var{base_gfn} + @var{npages}
+ETEXI
+
+    {
         .name       = "info",
         .args_type  = "item:s?",
         .params     = "[subcommand]",
diff --git a/hmp.c b/hmp.c
index 261843f..7d217e9 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2614,3 +2614,29 @@ void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict)
     }
     qapi_free_GuidInfo(info);
 }
+
+void hmp_get_subpage(Monitor *mon, const QDict *qdict)
+{
+    uint64_t base_gfn = qdict_get_int(qdict, "base_gfn");
+    uint64_t npages = qdict_get_int(qdict, "npages");
+    const char *filename = qdict_get_str(qdict, "filename");
+    Error *err = NULL;
+
+    monitor_printf(mon, "base_gfn: %" PRIu64 ", npages: %" PRIu64 ", file: %s\n", base_gfn, npages, filename);
+
+    qmp_get_subpage(base_gfn, npages, filename, &err);
+    hmp_handle_error(mon, &err);
+}
+
+void hmp_set_subpage(Monitor *mon, const QDict *qdict)
+{
+    uint64_t base_gfn = qdict_get_int(qdict, "base_gfn");
+    uint64_t npages = qdict_get_int(qdict, "npages");
+    uint32_t wp_map = qdict_get_int(qdict, "wp_map");
+    Error *err = NULL;
+
+    monitor_printf(mon, "base_gfn: %" PRIu64 ", npages: %" PRIu64 ", wp_map: 0x%x\n", base_gfn, npages, wp_map);
+
+    qmp_set_subpage(base_gfn, npages, wp_map, &err);
+    hmp_handle_error(mon, &err);
+}
diff --git a/hmp.h b/hmp.h
index 799fd37..b72143f 100644
--- a/hmp.h
+++ b/hmp.h
@@ -138,5 +138,7 @@ void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict);
 void hmp_info_dump(Monitor *mon, const QDict *qdict);
 void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict);
 void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict);
+void hmp_get_subpage(Monitor *mon, const QDict *qdict);
+void hmp_set_subpage(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 24281fc..f7c1340 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -528,4 +528,6 @@ int kvm_set_one_reg(CPUState *cs, uint64_t id, void *source);
  */
 int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target);
 int kvm_get_max_memslots(void);
+int kvm_get_subpage_wp_map(uint64_t base_gfn, uint32_t *buf, uint64_t len);
+int kvm_set_subpage_wp_map(uint64_t base_gfn, uint64_t npages, uint32_t wp_map);
 #endif
diff --git a/kvm-all.c b/kvm-all.c
index 9040bd5..58cc0a4 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -2593,6 +2593,46 @@ int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target)
     return r;
 }
 
+int kvm_get_subpage_wp_map(uint64_t base_gfn, uint32_t *buf,
+                           uint64_t len)
+{
+    KVMState *s = kvm_state;
+    struct kvm_subpage sp = {};
+    int n = len; /* was uninitialized; callers pass at most SUBPAGE_MAX_BITMAP */
+
+    sp.base_gfn = base_gfn;
+    sp.npages = len;
+
+
+    if (kvm_vm_ioctl(s, KVM_SUBPAGES_GET_ACCESS, &sp) < 0) {
+        DPRINTF("ioctl failed %d\n", errno);
+        return -1;
+    }
+
+    memcpy(buf, sp.access_map, n * sizeof(uint32_t));
+
+    return n;
+}
+
+int kvm_set_subpage_wp_map(uint64_t base_gfn, uint64_t npages,
+                           uint32_t wp_map)
+{
+    KVMState *s = kvm_state;
+    struct kvm_subpage sp = {};
+
+    sp.base_gfn = base_gfn;
+    sp.npages = npages;
+    sp.access_map[0] = wp_map;
+
+
+    if (kvm_vm_ioctl(s, KVM_SUBPAGES_SET_ACCESS, &sp) < 0) {
+        DPRINTF("ioctl failed %d\n", errno);
+        return -1;
+    }
+
+    return 0;
+}
+
 static void kvm_accel_class_init(ObjectClass *oc, void *data)
 {
     AccelClass *ac = ACCEL_CLASS(oc);
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 4e082a8..69de005 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -205,6 +205,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI        25
 #define KVM_EXIT_IOAPIC_EOI       26
 #define KVM_EXIT_HYPERV           27
+#define KVM_EXIT_SPP              28
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -360,6 +361,10 @@ struct kvm_run {
 		struct {
 			__u8 vector;
 		} eoi;
+		/* KVM_EXIT_SPP */
+		struct {
+			__u64 addr;
+		} spp;
 		/* KVM_EXIT_HYPERV */
 		struct kvm_hyperv_exit hyperv;
 		/* Fix the size of the union. */
@@ -1126,6 +1131,8 @@ enum kvm_device_type {
 					struct kvm_userspace_memory_region)
 #define KVM_SET_TSS_ADDR          _IO(KVMIO,   0x47)
 #define KVM_SET_IDENTITY_MAP_ADDR _IOW(KVMIO,  0x48, __u64)
+#define KVM_SUBPAGES_GET_ACCESS   _IOR(KVMIO,  0x49, __u64)
+#define KVM_SUBPAGES_SET_ACCESS   _IOW(KVMIO,  0x4a, __u64)
 
 /* enable ucontrol for s390 */
 struct kvm_s390_ucas_mapping {
@@ -1354,4 +1361,12 @@ struct kvm_assigned_msix_entry {
 #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
 #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
 
+/* for KVM_SUBPAGES_GET_ACCESS and KVM_SUBPAGES_SET_ACCESS */
+#define SUBPAGE_MAX_BITMAP   256
+struct kvm_subpage {
+	__u64 base_gfn;
+	__u64 npages;
+	__u32 access_map[SUBPAGE_MAX_BITMAP]; /* sub-page write-access bitmap array */
+};
+
 #endif /* __LINUX_KVM_H */
diff --git a/qapi-schema.json b/qapi-schema.json
index 32b4a4b..d6b46bb 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -6267,3 +6267,44 @@
 # Since 2.9
 ##
 { 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
+
+##
+# @get-subpage:
+#
+# This command will get setting information of sub-page
+# protection.
+#
+# Since: 2.10
+#
+# Example:
+#
+# -> { "execute": "get-subpage",
+#      "arguments": { "base_gfn": 0x1000,
+#                     "npages": 10,
+#                     "filename": "/tmp/spp_info" } }
+# <- { "return": {} }
+#
+##
+{ 'command': 'get-subpage',
+  'data': {'base_gfn': 'uint64', 'npages': 'uint64', 'filename': 'str'} }
+
+
+##
+# @set-subpage:
+#
+# This command will set sub-page protection for given GFNs.
+#
+# Since: 2.10
+#
+# Example:
+#
+# -> { "execute": "set-subpage",
+#      "arguments": { "base_gfn": 0x1000,
+#                     "npages": 10,
+#                     "wp_map": 0xffff0000 } }
+# <- { "return": {} }
+#
+##
+{ 'command': 'set-subpage',
+  'data': {'base_gfn': 'uint64', 'npages': 'uint64', 'wp_map': 'uint32'} }
+
diff --git a/qmp.c b/qmp.c
index fa82b59..274efdb 100644
--- a/qmp.c
+++ b/qmp.c
@@ -717,3 +717,46 @@ ACPIOSTInfoList *qmp_query_acpi_ospm_status(Error **errp)
 
     return head;
 }
+
+#define SUBPAGE_BUF_LEN 256
+void qmp_get_subpage(uint64_t base_gfn, uint64_t npages,
+                     const char *filename, Error **errp)
+{
+    FILE *f;
+    uint64_t n;
+    uint32_t buf[SUBPAGE_BUF_LEN];
+
+    f = fopen(filename, "wb");
+    if (!f) {
+        error_setg_file_open(errp, errno, filename);
+        return;
+    }
+
+    while (npages != 0) {
+        n = npages;
+        if (n > SUBPAGE_BUF_LEN)
+            n = SUBPAGE_BUF_LEN;
+        if (kvm_get_subpage_wp_map(base_gfn, buf, n) < 0) {
+            error_setg(errp, QERR_IO_ERROR);
+            goto exit;
+        }
+        if (fwrite(buf, 4, n, f) != n) {
+            error_setg(errp, QERR_IO_ERROR);
+            goto exit;
+        }
+        base_gfn += n;
+        npages -= n;
+    }
+
+exit:
+    fclose(f);
+}
+
+void qmp_set_subpage(uint64_t base_gfn, uint64_t npages,
+                     uint32_t wp_map, Error **errp)
+{
+    if (kvm_set_subpage_wp_map(base_gfn, npages, wp_map) < 0)
+        error_setg(errp, QERR_IO_ERROR);
+
+}
+
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 472399f..18a43d7 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -3147,6 +3147,23 @@ static int kvm_handle_debug(X86CPU *cpu,
     return ret;
 }
 
+static int kvm_handle_spp(uint64_t addr)
+{
+    /*
+    uint64_t base_gfn = addr >> 12;
+    uint64_t offset = addr & ((1 << 12) - 1);
+    int subpage_index = offset >> 7;
+    uint32_t mask;
+
+    kvm_get_subpage_wp_map(base_gfn, &mask, 1);
+    mask |= 1UL << subpage_index;
+    return kvm_set_subpage_wp_map(base_gfn, 1, mask);
+    */
+
+    fprintf(stderr, "QEMU-SPP: we are in kvm_handle_spp now, addr=0x%" PRIx64 "!\n", addr);
+    return 0;
+}
+
 void kvm_arch_update_guest_debug(CPUState *cpu, struct kvm_guest_debug *dbg)
 {
     const uint8_t type_code[] = {
@@ -3240,6 +3257,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         ioapic_eoi_broadcast(run->eoi.vector);
         ret = 0;
         break;
+    case KVM_EXIT_SPP:
+        DPRINTF("handle_spp\n");
+        kvm_handle_spp(run->spp.addr);
+        ret = 0;
+        break;
     default:
         fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
         ret = -1;
--
2.7.4
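
For reference, the logic that is commented out in kvm_handle_spp() boils
down to the following stand-alone sketch. It is only an illustration, not
part of the patch: struct spp_fault, spp_decode_fault and
spp_unprotect_subpage are hypothetical names, and I am assuming 4K pages,
an address that can be split like a guest-physical address, and that a set
bit in the bitmap means "writable".

```c
#include <stdint.h>

#define SPP_PAGE_SHIFT     12  /* 4K pages */
#define SPP_SUBPAGE_SHIFT  7   /* 128 byte sub-pages */

/* Where a faulting address lands: which page, and which sub-page in it. */
struct spp_fault {
    uint64_t gfn;          /* guest frame number (addr >> 12) */
    unsigned int subpage;  /* 0..31, from the offset within the page */
};

static inline struct spp_fault spp_decode_fault(uint64_t addr)
{
    struct spp_fault f;
    uint64_t offset = addr & ((UINT64_C(1) << SPP_PAGE_SHIFT) - 1);

    f.gfn = addr >> SPP_PAGE_SHIFT;
    f.subpage = (unsigned int)(offset >> SPP_SUBPAGE_SHIFT);
    return f;
}

/*
 * Return the page's bitmap with the faulting sub-page made writable.
 * Assumption: bit i set means sub-page i may be written.
 */
static inline uint32_t spp_unprotect_subpage(uint32_t mask, uint64_t addr)
{
    return mask | (UINT32_C(1) << spp_decode_fault(addr).subpage);
}
```

A handler built this way would read the current bitmap for f.gfn with
kvm_get_subpage_wp_map(), update it with spp_unprotect_subpage() and write
it back with kvm_set_subpage_wp_map(), which is what the commented-out
block in the patch does.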

