Re: [Xen-devel] [RFC 1/4] HVM x86 deprivileged mode: Page allocation helper
On 07/08/15 10:57, Ben Catterall wrote:
> On 06/08/15 20:22, Andrew Cooper wrote:
>> On 06/08/15 17:45, Ben Catterall wrote:
>>> This allocation function is used by the deprivileged mode
>>> initialisation code to allocate pages for the new page table mappings
>>> and page frames on the HAP page heap.
>>>
>>> Signed-off-by: Ben Catterall ben.catter...@citrix.com
>>
>> This is fine for your test box, but isn't fine for systems out there
>> without hardware EPT/NPT support. For older systems like that (or in
>> certain specific workloads), shadow paging is used instead.
>>
>> This feature is applicable to any HVM domain, which means that it
>> shouldn't depend on HAP or shadow paging.
>>
>> How much memory is allocated for the depriv area, and what exactly is
>> allocated in total?
>
> So, per-vcpu:
> - a user mode stack which, from your comments in [RFC 2/4], can be
>   2 pages
> - local data (may or may not be needed, depends on the device) which
>   will be around a page or two.
>
> Text segment: as per your comments in RFC 2/4, this will be changed to
> be an alias so no extra memory.
>
> Plus pagetables for all of these.

The stack definitely doesn't need to be per-vcpu. per-pcpu is fine, and
will reduce the overhead.

I still don't see why local data would be needed between calls into
depriv code. Small enough data can live on the stack, and allowing data
to persist across calls risks getting state out-of-sync with the device
model under emulation.

(That is not to say that local data isn't needed. I just can't see a
viable use for it at this stage.)

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC 2/4] HVM x86 deprivileged mode: Create deprivileged page tables
On 06/08/15 20:52, Andrew Cooper wrote:
> On 06/08/15 17:45, Ben Catterall wrote:
>> The paging structure mappings for the deprivileged mode are added to
>> the monitor page table for HVM guests. The entries are generated by
>> walking the page tables and mapping in new pages. If a higher-level
>> page table mapping exists, we attempt to traverse all the way down to
>> a leaf page and add the page there. Access bits are flipped as needed.
>>
>> The page entries are generated for deprivileged .text, .data and a
>> stack. The data is copied from sections allocated by the linker. The
>> mappings are set up in an unused portion of the Xen virtual address
>> space. The pages are mapped in as user mode accessible, with NX bits
>> set for the data and stack regions and the code region is set to be
>> executable and read-only.
>>
>> The needed pages are allocated on the HAP page heap and are
>> deallocated when those heap pages are deallocated (on domain
>> destruction).
>>
>> Signed-off-by: Ben Catterall ben.catter...@citrix.com
>> ---
>>  xen/arch/x86/hvm/Makefile          |   1 +
>>  xen/arch/x86/hvm/deprivileged.c    | 441 +
>>  xen/arch/x86/mm/hap/hap.c          |  10 +-
>>  xen/arch/x86/xen.lds.S             |  19 ++
>>  xen/include/asm-x86/config.h       |  10 +-
>>  xen/include/xen/hvm/deprivileged.h |  94
>>  6 files changed, 573 insertions(+), 2 deletions(-)
>>  create mode 100644 xen/arch/x86/hvm/deprivileged.c
>>  create mode 100644 xen/include/xen/hvm/deprivileged.h
>>
>> diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
>> index 794e793..bd83ba3 100644
>> --- a/xen/arch/x86/hvm/Makefile
>> +++ b/xen/arch/x86/hvm/Makefile
>> @@ -16,6 +16,7 @@ obj-y += pmtimer.o
>>  obj-y += quirks.o
>>  obj-y += rtc.o
>>  obj-y += save.o
>> +obj-y += deprivileged.o
>
> We attempt to keep objects linked in alphabetical order, where
> possible.

apologies

>>  obj-y += stdvga.o
>>  obj-y += vioapic.o
>>  obj-y += viridian.o
>> diff --git a/xen/arch/x86/hvm/deprivileged.c b/xen/arch/x86/hvm/deprivileged.c
>> new file mode 100644
>> index 0000000..071d900
>> --- /dev/null
>> +++ b/xen/arch/x86/hvm/deprivileged.c
>> @@ -0,0 +1,441 @@
>> +/*
>> + * HVM deprivileged mode to provide support for running operations in
>> + * user mode from Xen
>> + */
>> +#include <xen/lib.h>
>> +#include <xen/mm.h>
>> +#include <xen/domain_page.h>
>> +#include <xen/config.h>
>> +#include <xen/types.h>
>> +#include <xen/sched.h>
>> +#include <asm/paging.h>
>> +#include <xen/compiler.h>
>> +#include <asm/hap.h>
>> +#include <asm/paging.h>
>> +#include <asm-x86/page.h>
>> +#include <public/domctl.h>
>> +#include <xen/domain_page.h>
>> +#include <asm/hvm/vmx/vmx.h>
>> +#include <xen/hvm/deprivileged.h>
>> +
>> +void hvm_deprivileged_init(struct domain *d, l4_pgentry_t *l4e)
>> +{
>> +    int ret;
>> +    void *p;
>> +    unsigned long int size;
>> +
>> +    /* Copy the .text segment for deprivileged code */
>
> Why do you need to copy the deprivileged text section at all? It is
> read only and completely common. All you should need to do is make
> userspace mappings to it where it resides in the Xen linked area.

tbh: At the time, I was unsure if it could open a hole if, when Xen
switched over to superpages, and the deprivileged code was not aligned
on a 4KB page boundary, then mapping it in might prove problematic.
Mainly as we'd map in some of Xen's ring 0 code by accident, which could
prove useful to an attacker as they could then call this from userspace.

Though, now I've thought about it more, I can ensure in the linker that
it's aligned properly and I think map_pages_to_xen() will do what I
need.

> i.e. this is an operation which should be performed once at startup,
> not once per domain. You will also reduce your memory overhead by
> doing this.

got it.

>> +    size = (unsigned long int)__hvm_deprivileged_text_end -
>
> Don't bother qualifying unsigned long with a further int. A plain
> unsigned long is the expected declaration.

understood.

>> +           (unsigned long int)__hvm_deprivileged_text_start;
>> +
>> +    ret = hvm_deprivileged_copy_pages(d, l4e,
>> +                              (unsigned long int)__hvm_deprivileged_text_start,
>> +                              (unsigned long int)HVM_DEPRIVILEGED_TEXT_ADDR,
>> +                              size, 0 /* No write */);
>> +
>> +    if( ret )
>> +    {
>> +        printk("HVM: Error when initialising depriv .text. Code: %d", ret);
>> +        domain_crash(d);
>> +        return;
>> +    }
>> +
>> +    /* Copy the .data segment for ring3 code */
>> +    size = (unsigned long int)__hvm_deprivileged_data_end -
>> +           (unsigned long int)__hvm_deprivileged_data_start;
>> +
>> +    ret = hvm_deprivileged_copy_pages(d, l4e,
>> +                              (unsigned long int)__hvm_deprivileged_data_start,
>> +                              (unsigned long int)HVM_DEPRIVILEGED_DATA_ADDR,
>> +                              size, _PAGE_NX | _PAGE_RW);
>
> What do you expect to end up in a data section like this?

It's for passing in configuration data at the start when we run the
device (reduce the number of system calls needed) and for local data
which the emulated devices may need. It can also be used to put the
Re: [Xen-devel] [RFC 4/4] HVM x86 deprivileged mode: Trap handlers for deprivileged mode
On 07/08/15 13:32, Ben Catterall wrote:
> On 06/08/15 22:24, Andrew Cooper wrote:
>> On 06/08/2015 17:45, Ben Catterall wrote:
>>> Added trap handlers to catch exceptions such as a page fault, general
>>> protection fault, etc. These handlers will crash the domain as such
>>> exceptions would indicate that either there is a bug in deprivileged
>>> mode or it has been compromised by an attacker.
>>>
>>> Signed-off-by: Ben Catterall ben.catter...@citrix.com
>>> ---
>>>  xen/arch/x86/mm/hap/hap.c |  9 +
>>>  xen/arch/x86/traps.c      | 41 -
>>>  2 files changed, 49 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
>>> index abc5113..43bde89 100644
>>> --- a/xen/arch/x86/mm/hap/hap.c
>>> +++ b/xen/arch/x86/mm/hap/hap.c
>>> @@ -685,8 +685,17 @@ static int hap_page_fault(struct vcpu *v, unsigned long va,
>>>  {
>>>      struct domain *d = v->domain;
>>>
>>> +    /* If we get a page fault whilst in HVM security user mode */
>>> +    if( v->user_mode == 1 )
>>> +    {
>>> +        printk("HVM: #PF (%u:%u) whilst in user mode\n",
>>> +               d->domain_id, v->vcpu_id);
>>
>> %pv is your friend. Like Linux, we have custom printk formats. In
>> this case, passing 'v' as a parameter to %pv will cause d$Xv$Y to be
>> printed. (The example below predates %pv being introduced).
>
> ok, will do. thanks!
>
>>> +        domain_crash_synchronous();
>>
>> No need for _synchronous() here. _synchronous() should only be used
>> when you can't safely recover. It ends up spinning in a tight loop
>> waiting for the next timer interrupt, which is anything up to 30ms
>> away.
>
> I'm not sure if we can safely recover from this. This will only be
> triggered if there is a bug in depriv mode or if the mode has been
> compromised and an attacker has tried to access unavailable memory.
> From my understanding (am I missing something?): domain_crash
> effectively sets flags to tell the scheduler that it should be killed
> the next time the scheduler runs and then returns. In which case, if
> we don't do a synchronous crash, this return path would return back
> into the deprivileged mode, we would not have mapped in the page (as
> we shouldn't), and then we get another fault.
>
> What do you think is the best way forward? Thanks!

Given that there is a use of domain_crash(d) in context below, it is
clearly safe to use from here. (Although my general point about hap vs
shadow code still applies, meaning that hap_page_fault() is not the
correct function to hook like this.)

domain_crash() sets a flag, but exiting out from a fault handler heading
back towards ring3 code should check for pending softirqs. However,
because of the way you have hooked return-to-depriv, you might have
broken this.

~Andrew
[Xen-devel] [PATCH for-4.6] tools/xenstore: Correct use of va_end() after va_copy()
C requires that every use of va_copy() is matched with a va_end() call.

This is especially important for x86_64 as va_{start,copy}() may need to
allocate memory to generate a va_list containing parameters which were
previously in registers.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
---
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
 tools/xenstore/talloc.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/xenstore/talloc.c b/tools/xenstore/talloc.c
index 54dbd02..d7edcf3 100644
--- a/tools/xenstore/talloc.c
+++ b/tools/xenstore/talloc.c
@@ -1101,13 +1101,16 @@ char *talloc_vasprintf(const void *t, const char *fmt, va_list ap)
 
 	/* this call looks strange, but it makes it work on older solaris boxes */
 	if ((len = vsnprintf(&c, 1, fmt, ap2)) < 0) {
+		va_end(ap2);
 		return NULL;
 	}
+	va_end(ap2);
 
 	ret = _talloc(t, len+1);
 	if (ret) {
 		VA_COPY(ap2, ap);
 		vsnprintf(ret, len+1, fmt, ap2);
+		va_end(ap2);
 		talloc_set_name_const(ret, ret);
 	}
 
@@ -1161,8 +1164,10 @@ static char *talloc_vasprintf_append(char *s, const char *fmt, va_list ap)
 		 * the original string. Most current callers of this
 		 * function expect it to never return NULL.
 		 */
+		va_end(ap2);
 		return s;
 	}
+	va_end(ap2);
 
 	s = talloc_realloc(NULL, s, char, s_len + len+1);
 	if (!s) return NULL;
@@ -1170,6 +1175,7 @@ static char *talloc_vasprintf_append(char *s, const char *fmt, va_list ap)
 	VA_COPY(ap2, ap);
 
 	vsnprintf(s+s_len, len+1, fmt, ap2);
+	va_end(ap2);
 	talloc_set_name_const(s, s);
 
 	return s;
-- 
1.7.10.4
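As a minimal illustration of the rule stated in the commit message, the sketch below sizes and then formats a string in two passes, pairing each va_copy() with a va_end() on every exit path. The names format_alloc() and format_alloc_v() are hypothetical, and the sketch uses plain malloc() rather than talloc; it is not the talloc code itself.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc()ed formatted string, or NULL. */
static char *format_alloc_v(const char *fmt, va_list ap)
{
    va_list ap2;
    char c;
    char *ret;
    int len;

    /* First pass: measure. The copy is released before any return. */
    va_copy(ap2, ap);
    len = vsnprintf(&c, 1, fmt, ap2);
    va_end(ap2);
    if (len < 0)
        return NULL;

    ret = malloc((size_t)len + 1);
    if (ret == NULL)
        return NULL;

    /* Second pass: format with a fresh copy, again paired with va_end(). */
    va_copy(ap2, ap);
    vsnprintf(ret, (size_t)len + 1, fmt, ap2);
    va_end(ap2);

    return ret;
}

/* Hypothetical variadic front end; va_start() is likewise paired with va_end(). */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap;
    char *ret;

    va_start(ap, fmt);
    ret = format_alloc_v(fmt, ap);
    va_end(ap);
    return ret;
}
```

Calling va_end() once before the error check, rather than once per branch as the patch does, is only a stylistic choice in this sketch; either way no path leaves the function with a live copy.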
[Xen-devel] [PATCH 1/2] tools/libxl: Assert success of memory allocation in testidl
The chances of an allocation failing are slim but nonzero. Assert success
of each allocation to quieten Coverity, which re-notices defects each time
the IDL changes.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
---
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
 tools/libxl/gentest.py |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/libxl/gentest.py b/tools/libxl/gentest.py
index 7621a1e..85311e7 100644
--- a/tools/libxl/gentest.py
+++ b/tools/libxl/gentest.py
@@ -33,6 +33,7 @@ def gen_rand_init(ty, v, indent = "    ", parent = None):
         s += "%s = rand()%%8;\n" % (parent + ty.lenvar.name)
         s += "%s = calloc(%s, sizeof(*%s));\n" % \
             (v, parent + ty.lenvar.name, v)
+        s += "assert(%s);\n" % (v, )
         s += "{\n"
         s += "    int i;\n"
         s += "    for (i=0; i<%s; i++)\n" % (parent + ty.lenvar.name)
@@ -98,6 +99,7 @@ if __name__ == '__main__':
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <assert.h>
 
 #include "libxl.h"
 #include "libxl_utils.h"
@@ -106,6 +108,7 @@ static char *rand_str(void)
 {
     int i, sz = rand() % 32;
     char *s = malloc(sz+1);
+    assert(s);
     for (i=0; i<sz; i++)
         s[i] = 'a' + (rand() % 26);
     s[i] = '\\0';
@@ -124,6 +127,7 @@ static void libxl_bitmap_rand_init(libxl_bitmap *bitmap)
     int i;
     bitmap->size = rand() % 16;
     bitmap->map = calloc(bitmap->size, sizeof(*bitmap->map));
+    assert(bitmap->map);
     libxl_for_each_bit(i, *bitmap) {
         if (rand() % 2)
             libxl_bitmap_set(bitmap, i);
@@ -136,6 +140,7 @@ static void libxl_key_value_list_rand_init(libxl_key_value_list *pkvl)
 {
     int i, nr_kvp = rand() % 16;
     libxl_key_value_list kvl = calloc(nr_kvp+1, 2*sizeof(char *));
+    assert(kvl);
 
     for (i = 0; i<2*nr_kvp; i += 2) {
         kvl[i] = rand_str();
@@ -196,6 +201,7 @@ static void libxl_string_list_rand_init(libxl_string_list *p)
 {
     int i, nr = rand() % 16;
     libxl_string_list l = calloc(nr+1, sizeof(char *));
+    assert(l);
 
     for (i = 0; i<nr; i++) {
         l[i] = rand_str();
-- 
1.7.10.4
[Xen-devel] [PATCH for-4.6 0/2] Reduce the quantity of Coverity defects in testidl
No functional change, but removes 48 defects which get intermittently
reflagged every time the libxl IDL is altered.

Sent for 4.6 at Ian Campbell's suggestion.

Andrew Cooper (2):
  tools/libxl: Assert success of memory allocation in testidl
  tools/libxl: Alter the use of rand() in testidl

 tools/libxl/gentest.py | 42 +++---
 1 file changed, 27 insertions(+), 15 deletions(-)

-- 
1.7.10.4
[Xen-devel] [PATCH 2/2] tools/libxl: Alter the use of rand() in testidl
Coverity warns for every occurrence of rand(), which is made worse because
each time the IDL changes, some of the calls get re-flagged.

Collect all calls to rand() in a single function, test_rand(), which takes
a modulo parameter for convenience. This turns 40 defects currently into 1,
which won't get re-flagged when the IDL changes.

In addition, fix the erroneous random choice for libxl_defbool_set().
!!rand() % 1 is unconditionally 0, and even without the % 1 would still be
very heavily skewed in one direction.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
---
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
 tools/libxl/gentest.py | 36 +---
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/tools/libxl/gentest.py b/tools/libxl/gentest.py
index 85311e7..989959f 100644
--- a/tools/libxl/gentest.py
+++ b/tools/libxl/gentest.py
@@ -30,7 +30,7 @@ def gen_rand_init(ty, v, indent = "    ", parent = None):
     elif isinstance(ty, idl.Array):
         if parent is None:
             raise Exception("Array type must have a parent")
-        s += "%s = rand()%%8;\n" % (parent + ty.lenvar.name)
+        s += "%s = test_rand(8);\n" % (parent + ty.lenvar.name)
         s += "%s = calloc(%s, sizeof(*%s));\n" % \
             (v, parent + ty.lenvar.name, v)
         s += "assert(%s);\n" % (v, )
@@ -64,13 +64,13 @@ def gen_rand_init(ty, v, indent = "    ", parent = None):
     elif ty.typename in ["libxl_uuid", "libxl_mac", "libxl_hwcap", "libxl_ms_vm_genid"]:
         s += "rand_bytes((uint8_t *)%s, sizeof(*%s));\n" % (v,v)
     elif ty.typename in ["libxl_domid", "libxl_devid"] or isinstance(ty, idl.Number):
-        s += "%s = rand() %% (sizeof(%s)*8);\n" % \
+        s += "%s = test_rand(sizeof(%s) * 8);\n" % \
             (ty.pass_arg(v, parent is None),
              ty.pass_arg(v, parent is None))
     elif ty.typename in ["bool"]:
-        s += "%s = rand() %% 2;\n" % v
+        s += "%s = test_rand(2);\n" % v
     elif ty.typename in ["libxl_defbool"]:
-        s += "libxl_defbool_set(%s, !!rand() %% 1);\n" % v
+        s += "libxl_defbool_set(%s, test_rand(2));\n" % v
     elif ty.typename in ["char *"]:
         s += "%s = rand_str();\n" % v
     elif ty.private:
@@ -104,13 +104,19 @@ if __name__ == '__main__':
 #include "libxl.h"
 #include "libxl_utils.h"
 
+static int test_rand(unsigned max)
+{
+    /* We are not using rand() for its cryptographic properies. */
+    return rand() % max;
+}
+
 static char *rand_str(void)
 {
-    int i, sz = rand() % 32;
+    int i, sz = test_rand(32);
     char *s = malloc(sz+1);
     assert(s);
     for (i=0; i<sz; i++)
-        s[i] = 'a' + (rand() % 26);
+        s[i] = 'a' + test_rand(26);
     s[i] = '\\0';
     return s;
 }
@@ -119,17 +125,17 @@ static void rand_bytes(uint8_t *p, size_t sz)
 {
     int i;
     for (i=0; i<sz; i++)
-        p[i] = rand() % 256;
+        p[i] = test_rand(256);
 }
 
 static void libxl_bitmap_rand_init(libxl_bitmap *bitmap)
 {
     int i;
-    bitmap->size = rand() % 16;
+    bitmap->size = test_rand(16);
     bitmap->map = calloc(bitmap->size, sizeof(*bitmap->map));
     assert(bitmap->map);
     libxl_for_each_bit(i, *bitmap) {
-        if (rand() % 2)
+        if (test_rand(2))
             libxl_bitmap_set(bitmap, i);
         else
             libxl_bitmap_reset(bitmap, i);
@@ -138,13 +144,13 @@ static void libxl_bitmap_rand_init(libxl_bitmap *bitmap)
 
 static void libxl_key_value_list_rand_init(libxl_key_value_list *pkvl)
 {
-    int i, nr_kvp = rand() % 16;
+    int i, nr_kvp = test_rand(16);
     libxl_key_value_list kvl = calloc(nr_kvp+1, 2*sizeof(char *));
     assert(kvl);
 
     for (i = 0; i<2*nr_kvp; i += 2) {
         kvl[i] = rand_str();
-        if (rand() % 8)
+        if (test_rand(8))
             kvl[i+1] = rand_str();
         else
             kvl[i+1] = NULL;
@@ -156,7 +162,7 @@ static void libxl_key_value_list_rand_init(libxl_key_value_list *pkvl)
 
 static void libxl_cpuid_policy_list_rand_init(libxl_cpuid_policy_list *pp)
 {
-    int i, nr_policies = rand() % 16;
+    int i, nr_policies = test_rand(16);
     struct {
         const char *n;
         int w;
@@ -189,8 +195,8 @@ static void libxl_cpuid_policy_list_rand_init(libxl_cpuid_policy_list *pp)
     libxl_cpuid_policy_list p = NULL;
 
     for (i = 0; i < nr_policies; i++) {
-        int opt = rand() % nr_options;
-        int val = rand() % (1<<options[opt].w);
+        int opt = test_rand(nr_options);
+        int val = test_rand(1<<options[opt].w);
         snprintf(buf, 64, \"%s=%#x\", options[opt].n, val);
         libxl_cpuid_parse_config(&p, buf);
     }
@@ -199,7 +205,7 @@ static void libxl_cpuid_policy_list_rand_init(libxl_cpuid_policy_list *pp)
 
 static void libxl_string_list_rand_init(libxl_string_list *p)
 {
-    int i, nr = rand() % 16;
+    int i, nr = test_rand(16);
     libxl_string_list l = calloc(nr+1, sizeof(char *));
     assert(l);
 
-- 
1.7.10.4
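The precedence problem described in the commit message can be checked in isolation. In the sketch below, test_rand() has the same shape as the helper added by the patch, while old_defbool_choice(), new_defbool_choice() and count_true() are hypothetical names introduced only for this illustration. Because unary ! binds tighter than %, the expression !!rand() % 1 parses as (!!rand()) % 1, and any integer modulo 1 is 0.

```c
#include <stdlib.h>

/* Same shape as the helper in the patch: a value in [0, max). */
static int test_rand(unsigned max)
{
    return rand() % max;
}

/* Hypothetical wrapper around the expression the patch removes. */
static int old_defbool_choice(void)
{
    return !!rand() % 1;   /* (!!rand()) % 1 is 0 for every rand() value */
}

/* Hypothetical wrapper around the replacement expression. */
static int new_defbool_choice(void)
{
    return test_rand(2);   /* 0 or 1 */
}

/* Counts how many of n draws from pick() are non-zero. */
static int count_true(int (*pick)(void), int n)
{
    int i, hits = 0;

    for (i = 0; i < n; i++)
        if (pick())
            hits++;
    return hits;
}
```

For any n, count_true(old_defbool_choice, n) is 0, so the old generator never exercised the "true" case. Dropping the % 1 alone would not help much either, since !!rand() is 1 unless rand() happens to return exactly 0.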
Re: [Xen-devel] [RFC 4/4] HVM x86 deprivileged mode: Trap handlers for deprivileged mode
On 07/08/15 14:19, Andrew Cooper wrote:
> On 07/08/15 13:32, Ben Catterall wrote:
>> On 06/08/15 22:24, Andrew Cooper wrote:
>>> On 06/08/2015 17:45, Ben Catterall wrote:
>>>> Added trap handlers to catch exceptions such as a page fault,
>>>> general protection fault, etc. These handlers will crash the domain
>>>> as such exceptions would indicate that either there is a bug in
>>>> deprivileged mode or it has been compromised by an attacker.
>>>>
>>>> Signed-off-by: Ben Catterall ben.catter...@citrix.com
>>>> ---
>>>>  xen/arch/x86/mm/hap/hap.c |  9 +
>>>>  xen/arch/x86/traps.c      | 41 -
>>>>  2 files changed, 49 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
>>>> index abc5113..43bde89 100644
>>>> --- a/xen/arch/x86/mm/hap/hap.c
>>>> +++ b/xen/arch/x86/mm/hap/hap.c
>>>> @@ -685,8 +685,17 @@ static int hap_page_fault(struct vcpu *v, unsigned long va,
>>>>  {
>>>>      struct domain *d = v->domain;
>>>>
>>>> +    /* If we get a page fault whilst in HVM security user mode */
>>>> +    if( v->user_mode == 1 )
>>>> +    {
>>>> +        printk("HVM: #PF (%u:%u) whilst in user mode\n",
>>>> +               d->domain_id, v->vcpu_id);
>>>
>>> %pv is your friend. Like Linux, we have custom printk formats. In
>>> this case, passing 'v' as a parameter to %pv will cause d$Xv$Y to be
>>> printed. (The example below predates %pv being introduced).
>>
>> ok, will do. thanks!
>>
>>>> +        domain_crash_synchronous();
>>>
>>> No need for _synchronous() here. _synchronous() should only be used
>>> when you can't safely recover. It ends up spinning in a tight loop
>>> waiting for the next timer interrupt, which is anything up to 30ms
>>> away.
>>
>> I'm not sure if we can safely recover from this. This will only be
>> triggered if there is a bug in depriv mode or if the mode has been
>> compromised and an attacker has tried to access unavailable memory.
>> From my understanding (am I missing something?): domain_crash
>> effectively sets flags to tell the scheduler that it should be killed
>> the next time the scheduler runs and then returns. In which case, if
>> we don't do a synchronous crash, this return path would return back
>> into the deprivileged mode, we would not have mapped in the page (as
>> we shouldn't), and then we get another fault.
>>
>> What do you think is the best way forward? Thanks!
>
> Given that there is a use of domain_crash(d) in context below, it is
> clearly safe to use from here. (Although my general point about hap vs
> shadow code still applies, meaning that hap_page_fault() is not the
> correct function to hook like this.)
>
> domain_crash() sets a flag, but exiting out from a fault handler
> heading back towards ring3 code should check for pending softirqs.
> However, because of the way you have hooked return-to-depriv, you
> might have broken this.

Understood, I'll move the handler, make this change and examine the
return path. Thanks!

> ~Andrew
[Xen-devel] [qemu-upstream-4.2-testing test] 60611: tolerable FAIL - PUSHED
flight 60611 qemu-upstream-4.2-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/60611/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail blocked in 60207 Tests which did not succeed, but are not blocking: test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail never pass test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail never pass test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-amd64-i386-xend-qemuu-winxpsp3 21 leak-check/checkfail never pass version targeted for testing: qemuu138906105dd47b9dc6b1e5010e81fc606983dd75 baseline version: qemuuda2e633ec99da897f51f388217f070c53aea6674 Last test of basis60207 2015-07-31 22:20:43 Z7 days Testing same since60553 2015-08-03 19:06:47 Z4 days2 attempts People who touched revisions under test: Stefan Hajnoczi stefa...@redhat.com Stefano Stabellini stefano.stabell...@eu.citrix.com jobs: build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-amd64-xl-qemuu-ovmf-amd64 fail test-amd64-i386-xl-qemuu-ovmf-amd64 fail test-amd64-amd64-xl-qemuu-win7-amd64 fail test-amd64-i386-xl-qemuu-win7-amd64 fail test-amd64-i386-qemuu-rhel6hvm-intel pass test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 pass test-amd64-i386-xend-qemuu-winxpsp3 fail test-amd64-amd64-xl-qemuu-winxpsp3 pass test-i386-i386-xl-qemuu-winxpsp3 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. 
are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=qemu-upstream-4.2-testing + revision=138906105dd47b9dc6b1e5010e81fc606983dd75 + . cri-lock-repos ++ . cri-common +++ . cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{Repos} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push qemu-upstream-4.2-testing 138906105dd47b9dc6b1e5010e81fc606983dd75 + branch=qemu-upstream-4.2-testing + revision=138906105dd47b9dc6b1e5010e81fc606983dd75 + . cri-lock-repos ++ . cri-common +++ . cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{Repos} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . cri-common ++ . cri-getconfig ++ umask 002 + select_xenbranch + case $branch in + tree=qemuu + xenbranch=xen-4.2-testing + '[' xqemuu = xlinux ']' + linuxbranch= + '[' x = x ']' + qemuubranch=qemu-upstream-4.2-testing + : tested/2.6.39.x + . 
ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{OsstestUpstream} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ :
Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2
On Fri, 2015-08-07 at 13:25 -0700, Luis R. Rodriguez wrote: On Thu, Aug 6, 2015 at 3:58 PM, Toshi Kani toshi.k...@hp.com wrote: On Thu, 2015-08-06 at 12:53 -0700, Luis R. Rodriguez wrote: On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani toshi.k...@hp.com wrote: On Fri, 2015-06-12 at 08:59 +0100, Jan Beulich wrote: On 12.06.15 at 01:23, toshi.k...@hp.com wrote: There are two usages on MTRRs: 1) MTRR entries set by firmware 2) MTRR entries set by OS drivers We can obsolete 2), but we have no control over 1). As UEFI firmwares also set this up, this usage will continue to stay. So, we should not get rid of the MTRR code that looks up the MTRR entries, while we have no need to modify them. Such MTRR entries provide safe guard to /dev/mem, which allows privileged user to access a range that may require UC mapping while the /dev/mem driver blindly maps it with WB. MTRRs converts WB to UC in such a case. But it wouldn't be impossible to simply read the MTRRs upon boot, store the information, disable MTRRs, and correctly use PAT to achieve the same effect (i.e. the blindly maps part of course would need fixing). It could be done, but I do not see much benefit of doing it. One of the reasons platform vendors set MTRRs is so that a system won't hit a machine check when an OS bug leads an access with a wrong cache type. A machine check is hard to analyze and can be seen as a hardware issue by customers. Emulating MTRRs with PAT won't protect from such a bug. That's seems like a fair and valid concern. This could only happen if the OS would have code that would use MTRR, in the case of Linux we'll soon be able to vet that this cannot happen. No, there is no OS support necessary to use MTRR. After firmware sets it up, CPUs continue to use it without any OS support. I think the Linux change you are referring is to obsolete legacy interfaces that modify the MTRR setup. I agree that Linux should not modify MTRR. Its a bit more than that though. 
Since you agree that the OS can live without MTRR code I was hoping to then see if we can fold out PAT Linux code from under the MTRR dependency on Linux and make PAT a first class citizen, maybe at least for x86-64. Right now you can only get PAT support on Linux if you have MTRR code, but I'd like to see if instead we can rip MTRR code out completely under its own Kconfig and let it start rotting away. Code-wise the only issue I saw was that PAT code also relies on mtrr_type_lookup(), see pat_x_mtrr_type(), but other than this I found no other obvious issues. We can rip of the MTTR code that modifies the MTRR setup, but not mtrr_type_lookup(). This function provides necessary checks per documented in commit 7f0431e3dc89 as follows. 1) reserve_memtype() tracks an effective memory type in case a request type is WB (ex. /dev/mem blindly uses WB). Missing to track with its effective type causes a subsequent request to map the same range with the effective type to fail. 2) pud_set_huge() and pmd_set_huge() check if a requested range has any overlap with MTRRs. Missing to detect an overlap may cause a performance penalty or undefined behavior. mtrr_type_lookup() is still admittedly awkward, but I do not think we have an immediate issue in PAT code calling it. I do not think it makes PAT code a second class citizen. Platform firmware and SMIs seems to be the only other possible issue. More on this below. For those type of OSes... could it be possible to negotiate or hint to the platform through an attribute somehow that the OS has such capability to not use MTRR? The OS can disable MTRR. However, this can also cause a problem in firmware, which may rely on MTRR. Can you describe what type of issues we could expect ? I tend to care more about this for 64-bit systems so if 32-bit platforms would be more of the ones which could cause an issue would restricting disabling MTRR only for 64-bit help? 
The SMI handler runs in real-mode and relies on MTRR being effective to provide right cache types. It does not matter if it is 64-bit or not. Then, only if this bit is set, the platform could then avoid such MTRR settings, and if we have issues you can throw rocks at us. And if that's not possible how about a new platform setting that would need to be set at the platform level to enable disabling this junk? Then only folks who know what they are doing would enable it, and if the customer set it, the issue would not be on the platform. Could this also be used to prevent SMIs with MTRRs? ACPI _OSI could be used for firmware to implement some OS-specific features, but it may be too late for firmware to make major changes and is generally useless unless OS requirements
Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2
On Thu, Aug 6, 2015 at 3:58 PM, Toshi Kani toshi.k...@hp.com wrote: On Thu, 2015-08-06 at 12:53 -0700, Luis R. Rodriguez wrote: On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani toshi.k...@hp.com wrote: On Fri, 2015-06-12 at 08:59 +0100, Jan Beulich wrote: On 12.06.15 at 01:23, toshi.k...@hp.com wrote: There are two usages on MTRRs: 1) MTRR entries set by firmware 2) MTRR entries set by OS drivers We can obsolete 2), but we have no control over 1). As UEFI firmwares also set this up, this usage will continue to stay. So, we should not get rid of the MTRR code that looks up the MTRR entries, while we have no need to modify them. Such MTRR entries provide safe guard to /dev/mem, which allows privileged user to access a range that may require UC mapping while the /dev/mem driver blindly maps it with WB. MTRRs converts WB to UC in such a case. But it wouldn't be impossible to simply read the MTRRs upon boot, store the information, disable MTRRs, and correctly use PAT to achieve the same effect (i.e. the blindly maps part of course would need fixing). It could be done, but I do not see much benefit of doing it. One of the reasons platform vendors set MTRRs is so that a system won't hit a machine check when an OS bug leads an access with a wrong cache type. A machine check is hard to analyze and can be seen as a hardware issue by customers. Emulating MTRRs with PAT won't protect from such a bug. That's seems like a fair and valid concern. This could only happen if the OS would have code that would use MTRR, in the case of Linux we'll soon be able to vet that this cannot happen. No, there is no OS support necessary to use MTRR. After firmware sets it up, CPUs continue to use it without any OS support. I think the Linux change you are referring is to obsolete legacy interfaces that modify the MTRR setup. I agree that Linux should not modify MTRR. Its a bit more than that though. 
Since you agree that the OS can live without MTRR code I was hoping to then see if we can fold out PAT Linux code from under the MTRR dependency on Linux and make PAT a first class citizen, maybe at least for x86-64. Right now you can only get PAT support on Linux if you have MTRR code, but I'd like to see if instead we can rip MTRR code out completely under its own Kconfig and let it start rotting away. Code-wise the only issue I saw was that PAT code also relies on mtrr_type_lookup(), see pat_x_mtrr_type(), but other than this I found no other obvious issues. Platform firmware and SMIs seems to be the only other possible issue. More on this below. For those type of OSes... could it be possible to negotiate or hint to the platform through an attribute somehow that the OS has such capability to not use MTRR? The OS can disable MTRR. However, this can also cause a problem in firmware, which may rely on MTRR. Can you describe what type of issues we could expect ? I tend to care more about this for 64-bit systems so if 32-bit platforms would be more of the ones which could cause an issue would restricting disabling MTRR only for 64-bit help? Then, only if this bit is set, the platform could then avoid such MTRR settings, and if we have issues you can throw rocks at us. And if that's not possible how about a new platform setting that would need to be set at the platform level to enable disabling this junk? Then only folks who know what they are doing would enable it, and if the customer set it, the issue would not be on the platform. Could this also be used to prevent SMIs with MTRRs? ACPI _OSI could be used for firmware to implement some OS-specific features, but it may be too late for firmware to make major changes and is generally useless unless OS requirements are described in a spec backed by logo certification. I see.. So there are no guarantees that platform firmware would not expect OS MTRR support. 
SMIs are also used for platform management, such as fan speed control.

And it's conceivable that some devices, or the platform itself, may trigger SMIs to have the platform firmware poke with MTRRs?

Is there any issue for Linux to use MTRR set by firmware?

Even though we don't have the Kconfig option right now to disable MTRR code explicitly, I'll note that there are a few other cases that could flip Linux to not use MTRR:

a) Some BIOSes could let MTRR get disabled
b) As of Xen 4.4, the hypervisor disables X86_FEATURE_MTRR, which disables MTRR on Linux

If these environments can exist it'd be good to understand possible issues that could creep up as a result of the OS not having MTRR enabled. If this is a reasonable thing for x86-64 I was hoping we could just let users opt in to a similar build configuration through the OS by letting PAT not depend on MTRR.

Luis

___ Xen-devel mailing list Xen-devel@lists.xen.org
[Xen-devel] [PATCH v2] xen: arm: Support <32MB frametables
setup_frametable_mappings() rounds frametable_size up to a multiple of 32MB. This is wasteful on systems with less than 4GB of RAM, although it does allow the contig bit to be set in the PTEs.

Where the frametable is less than 32MB in size, instead round up to a multiple of 2MB, not setting the contig bit in the PTEs.

Signed-off-by: Chris Brand chris.br...@broadcom.com
---
Changed in v2: merged create_32mb_mappings() and create_2mb_mappings() into create_mappings(). A side-effect is to fix the bug in v1 for ARM64 systems with 4GB RAM.

 xen/arch/arm/mm.c | 37 ++---
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index ae0f34c3c480..fd64fbfdfb93 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -628,25 +628,31 @@ void __cpuinit mmu_init_secondary_cpu(void)
 }
 
 /* Create Xen's mappings of memory.
- * Base and virt must be 32MB aligned and size a multiple of 32MB.
+ * Mapping_size must be either 2MB or 32MB.
+ * Base and virt must be mapping_size aligned.
+ * Size must be a multiple of mapping_size.
  * second must be a contiguous set of second level page tables
  * covering the region starting at virt_offset. */
-static void __init create_32mb_mappings(lpae_t *second,
-                                        unsigned long virt_offset,
-                                        unsigned long base_mfn,
-                                        unsigned long nr_mfns)
+static void __init create_mappings(lpae_t *second,
+                                   unsigned long virt_offset,
+                                   unsigned long base_mfn,
+                                   unsigned long nr_mfns,
+                                   unsigned int mapping_size)
 {
     unsigned long i, count;
+    const unsigned long granularity = mapping_size >> PAGE_SHIFT;
     lpae_t pte, *p;
 
-    ASSERT(!((virt_offset >> PAGE_SHIFT) % (16 * LPAE_ENTRIES)));
-    ASSERT(!(base_mfn % (16 * LPAE_ENTRIES)));
-    ASSERT(!(nr_mfns % (16 * LPAE_ENTRIES)));
+    ASSERT((mapping_size == MB(2)) || (mapping_size == MB(32)));
+    ASSERT(!((virt_offset >> PAGE_SHIFT) % granularity));
+    ASSERT(!(base_mfn % granularity));
+    ASSERT(!(nr_mfns % granularity));
 
     count = nr_mfns / LPAE_ENTRIES;
     p = second + second_linear_offset(virt_offset);
     pte = mfn_to_xen_entry(base_mfn, WRITEALLOC);
-    pte.pt.contig = 1;  /* These maps are in 16-entry contiguous chunks. */
+    if ( mapping_size == MB(32) )
+        pte.pt.contig = 1;  /* These maps are in 16-entry contiguous chunks. */
     for ( i = 0; i < count; i++ )
     {
         write_pte(p + i, pte);
@@ -660,7 +666,7 @@ static void __init create_32mb_mappings(lpae_t *second,
 void __init setup_xenheap_mappings(unsigned long base_mfn,
                                    unsigned long nr_mfns)
 {
-    create_32mb_mappings(xen_second, XENHEAP_VIRT_START, base_mfn, nr_mfns);
+    create_mappings(xen_second, XENHEAP_VIRT_START, base_mfn, nr_mfns, MB(32));
 
     /* Record where the xenheap is, for translation routines. */
     xenheap_virt_end = XENHEAP_VIRT_START + nr_mfns * PAGE_SIZE;
@@ -749,6 +755,7 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
     unsigned long nr_pdxs = pfn_to_pdx(nr_pages);
     unsigned long frametable_size = nr_pdxs * sizeof(struct page_info);
     unsigned long base_mfn;
+    const unsigned long mapping_size = frametable_size < MB(32) ? MB(2) : MB(32);
 #ifdef CONFIG_ARM_64
     lpae_t *second, pte;
     unsigned long nr_second, second_base;
@@ -756,9 +763,8 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
 #endif
 
     frametable_base_pdx = pfn_to_pdx(ps >> PAGE_SHIFT);
-
-    /* Round up to 32M boundary */
-    frametable_size = (frametable_size + 0x1ff) & ~0x1ff;
+    /* Round up to 2M or 32M boundary, as appropriate. */
+    frametable_size = ROUNDUP(frametable_size, mapping_size);
     base_mfn = alloc_boot_pages(frametable_size >> PAGE_SHIFT, 32<<(20-12));
 
 #ifdef CONFIG_ARM_64
@@ -771,9 +777,10 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
         pte.pt.table = 1;
         write_pte(&xen_first[first_table_offset(FRAMETABLE_VIRT_START)+i], pte);
     }
-    create_32mb_mappings(second, 0, base_mfn, frametable_size >> PAGE_SHIFT);
+    create_mappings(second, 0, base_mfn, frametable_size >> PAGE_SHIFT, mapping_size);
 #else
-    create_32mb_mappings(xen_second, FRAMETABLE_VIRT_START, base_mfn, frametable_size >> PAGE_SHIFT);
+    create_mappings(xen_second, FRAMETABLE_VIRT_START,
+                    base_mfn, frametable_size >> PAGE_SHIFT, mapping_size);
 #endif
 
     memset(&frame_table[0], 0, nr_pdxs * sizeof(struct page_info));
-- 
1.9.1

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
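For readers who want to check the arithmetic, the size selection and rounding described in the patch above can be sketched in isolation. This is only an illustration under stated assumptions: pick_mapping_size(), round_up_pow2() and rounded_frametable_size() are hypothetical stand-alone helpers, not Xen functions, and the MB() macro here is a local definition.

```c
#include <assert.h>
#include <stdint.h>

/* Local helper macro for this sketch only. */
#define MB(n) ((uint64_t)(n) << 20)

/* Round x up to the next multiple of a; a must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t a)
{
    return (x + a - 1) & ~(a - 1);
}

/* Hypothetical helper: frametables below 32MB use 2MB mappings (no contig
 * bit), anything else keeps the 32MB granularity. */
static uint64_t pick_mapping_size(uint64_t frametable_size)
{
    return frametable_size < MB(32) ? MB(2) : MB(32);
}

/* Size actually allocated after rounding to the chosen granularity. */
static uint64_t rounded_frametable_size(uint64_t frametable_size)
{
    return round_up_pow2(frametable_size, pick_mapping_size(frametable_size));
}
```

With these definitions a frametable of just over 5MB is rounded to 6MB rather than 32MB, which is the saving the commit message is after.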
Re: [Xen-devel] [PATCH v3 9/9] xen/xenbus: Rename the variable xen_store_mfn to xen_store_gfn
On 08/07/2015 12:34 PM, Julien Grall wrote:

The variable xen_store_mfn is effectively storing a GFN and not an MFN.

Signed-off-by: Julien Grall julien.gr...@citrix.com
---
Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Cc: Boris Ostrovsky boris.ostrov...@oracle.com
Cc: David Vrabel david.vra...@citrix.com

I think that the assignment of xen_start_info in xenstored_local_init is pointless. Although I haven't dropped it just in case.

I think so too (but that would be a separate patch if you decide to do it).

Reviewed-by: Boris Ostrovsky boris.ostrov...@oracle.com

Changes in v3:
- Patch added.
---
 drivers/xen/xenbus/xenbus_probe.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index b3870f4..3cbe055 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -75,7 +75,7 @@ EXPORT_SYMBOL_GPL(xen_store_interface);
 enum xenstore_init xen_store_domain_type;
 EXPORT_SYMBOL_GPL(xen_store_domain_type);
 
-static unsigned long xen_store_mfn;
+static unsigned long xen_store_gfn;
 
 static BLOCKING_NOTIFIER_HEAD(xenstore_chain);
 
@@ -711,7 +711,7 @@ static int __init xenstored_local_init(void)
 	if (!page)
 		goto out_err;
 
-	xen_store_mfn = xen_start_info->store_mfn = virt_to_gfn((void *)page);
+	xen_store_gfn = xen_start_info->store_mfn = virt_to_gfn((void *)page);
 
 	/* Next allocate a local port which xenstored can bind to */
 	alloc_unbound.dom        = DOMID_SELF;
@@ -785,12 +785,12 @@ static int __init xenbus_init(void)
 		err = xenstored_local_init();
 		if (err)
 			goto out_error;
-		xen_store_interface = gfn_to_virt(xen_store_mfn);
+		xen_store_interface = gfn_to_virt(xen_store_gfn);
 		break;
 	case XS_PV:
 		xen_store_evtchn = xen_start_info->store_evtchn;
-		xen_store_mfn = xen_start_info->store_mfn;
-		xen_store_interface = gfn_to_virt(xen_store_mfn);
+		xen_store_gfn = xen_start_info->store_mfn;
+		xen_store_interface = gfn_to_virt(xen_store_gfn);
 		break;
 	case XS_HVM:
 		err = hvm_get_parameter(HVM_PARAM_STORE_EVTCHN, &v);
@@ -800,9 +800,9 @@ static int __init xenbus_init(void)
 		err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
 		if (err)
 			goto out_error;
-		xen_store_mfn = (unsigned long)v;
+		xen_store_gfn = (unsigned long)v;
 		xen_store_interface =
-			xen_remap(xen_store_mfn << PAGE_SHIFT, PAGE_SIZE);
+			xen_remap(xen_store_gfn << PAGE_SHIFT, PAGE_SIZE);
 		break;
 	default:
 		pr_warn("Xenstore state unknown\n");

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Reminder: Urgent - Action Required - Xen Dev Summit Bof's Developer Meetings and WG Meetings on Aug 18 19t (print and food order deadline Friday 7th)
Reducing the list

Is the Developer Meeting on Aug 19th open to public? Is it possible to listen to the meeting if there is still space left?

Yes, if there is space (which there appears to be). But then if you are asking because you want to attend, there wouldn't be an issue even if we were running out of space.

Lars

On Fri, Aug 7, 2015 at 8:08 AM, Meng Xu xumengpa...@gmail.com wrote:

Hi Lars,

2015-08-05 2:54 GMT-07:00 Lars Kurth lars.kurth@gmail.com:

Hi folks, this email is for people planning to attend the Xen Dev Summit in Seattle (Aug 17 & 18) and the Developer Meeting on the 19th.

Is the Developer Meeting on Aug 19th open to public? Is it possible to listen to the meeting if there is still space left?

Thanks, Meng
---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V3 1/6] x86/xsaves: enable xsaves/xrstors for pv guest
On Wed, Aug 05, 2015 at 06:51:29PM +0100, Andrew Cooper wrote: On 05/08/15 02:57, Shuai Ruan wrote: This patch emualtes xsaves/xrstors instructions and XSS msr access. As xsaves/xrstors instructions and XSS msr access required be executed only in ring0. So emulation are needed when pv guest uses these instructions. Signed-off-by: Shuai Ruan shuai.r...@linux.intel.com --- xen/arch/x86/domain.c | 3 + xen/arch/x86/traps.c| 138 xen/arch/x86/x86_64/mm.c| 52 +++ xen/arch/x86/xstate.c | 39 xen/include/asm-x86/domain.h| 1 + xen/include/asm-x86/mm.h| 1 + xen/include/asm-x86/msr-index.h | 2 + xen/include/asm-x86/xstate.h| 3 + 8 files changed, 239 insertions(+) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 045f6ff..e8b8d67 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -426,6 +426,7 @@ int vcpu_initialise(struct vcpu *v) /* By default, do not emulate */ v-arch.vm_event.emulate_flags = 0; +v-arch.msr_ia32_xss = 0; The backing memory for struct vcpu is zeroed when allocated. There is no need to explicitly zero this field here. Ok. rc = mapcache_vcpu_init(v); if ( rc ) @@ -1529,6 +1530,8 @@ static void __context_switch(void) if ( xcr0 != get_xcr0() !set_xcr0(xcr0) ) BUG(); } +if ( cpu_has_xsaves ) +wrmsr_safe(MSR_IA32_XSS, n-arch.msr_ia32_xss); This musn't throw away potential errors. It should not be possible for n-arch.msr_ia32_xss to be invalid by this point, so a straight wrmsr() would be correct. However, you will want to implement lazy context switching, exactly like get/set_xcr0(). Ok. vcpu_restore_fpu_eager(n); n-arch.ctxt_switch_to(n); } diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index 6a03582..c1fea77 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -2353,6 +2353,131 @@ static int emulate_privileged_op(struct cpu_user_regs *regs) } break; +case 0xc7: This case should be sorted numerically, so should be between CPUID and default. Ok. +{ +void *xsave_addr; +int not_page_aligned = 0; bool_t Sorry. 
+u32 guest_xsaves_size = xstate_ctxt_size_compact(v-arch.xcr0); + +switch ( insn_fetch(u8, code_base, eip, code_limit) ) +{ +case 0x2f:/* XSAVES */ +{ +if ( !cpu_has_xsaves || !(v-arch.pv_vcpu.ctrlreg[4] + X86_CR4_OSXSAVE)) I would format this as if ( !cpu_has_xsaves || !(v-arch.pv_vcpu.ctrlreg[4] X86_CR4_OSXSAVE) ) to associate the bit test more clearly. Ok. +{ +do_guest_trap(TRAP_invalid_op, regs, 0); +goto skip; +} + +if ( v-arch.pv_vcpu.ctrlreg[0] X86_CR0_TS ) +{ +do_guest_trap(TRAP_nmi, regs, 0); TRAP_no_device surely? Yes. According to SDM Volum1 Section 13.11. For xsaves, If CR0.TS[bit 3] is 1, a device-not-available exception (#NM) occurs. +goto skip; +} + +if ( !guest_kernel_mode(v, regs) || (regs-edi 0x3f) ) What does edi have to do with xsaves? only edx:eax are special according to the manual. regs-edi is the guest_linear_address +goto fail; + +if ( (regs-edi ~PAGE_MASK) + guest_xsaves_size PAGE_SIZE ) +{ +mfn_t mfn_list[2]; +void *va; This fails to account for the xsaves size growing to PAGE_SIZE in future processors. /* TODO - expand for future processors .*/ BUG_ON(guest_xsaves_size = PAGE_SIZE); Yes, I agree. might be acceptable. However, it is better fixed by... + +not_page_aligned = 1; +mfn_list[0] = _mfn(do_page_walk_mfn(v, regs-edi)); +mfn_list[1] = _mfn(do_page_walk_mfn(v, + PAGE_ALIGN(regs-edi))); + +va = __vmap(mfn_list, 1, 2, PAGE_SIZE, PAGE_HYPERVISOR); +xsave_addr = (void *)((unsigned long)va + + (regs-edi ~PAGE_MASK)); ... introducing a new helper such as void *vmap_guest_linear(void *va, size_t bytes,, uint32_t PFEC); which takes care of the internal details of making a linear area of guest virtual address space appear linear in Xen virtual address space as well. You also need to take care to respect non-writable pages, and have an access_ok() check or a guest could (ab)use this emulation to write to Xen code/data
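As an aside on the review comments above about alignment and the two-entry mfn_list: the checks involved can be sketched independently of Xen. The following is a minimal illustration, assuming 4K pages and the 64-byte alignment that the (regs->edi & 0x3f) test implies; the EX_ constants and all three helper names are hypothetical and do not exist in the Xen tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_PAGE_SIZE   4096u  /* assumed page size for this sketch */
#define EX_XSAVE_ALIGN 64u    /* alignment implied by the 0x3f mask */

/* True if the guest linear address is suitably aligned for an XSAVE area. */
static bool xsave_area_aligned(uint64_t linear)
{
    return (linear & (EX_XSAVE_ALIGN - 1)) == 0;
}

/* True if [linear, linear + size) does not fit inside a single page. */
static bool area_crosses_page(uint64_t linear, uint32_t size)
{
    uint64_t offset = linear & (EX_PAGE_SIZE - 1);
    return offset + size > EX_PAGE_SIZE;
}

/* Number of pages touched by [linear, linear + size); size must be > 0. */
static unsigned int pages_spanned(uint64_t linear, uint32_t size)
{
    uint64_t first = linear / EX_PAGE_SIZE;
    uint64_t last  = (linear + size - 1) / EX_PAGE_SIZE;
    return (unsigned int)(last - first + 1);
}
```

pages_spanned() makes the concern concrete: once the save area can reach PAGE_SIZE or more, an unaligned start can touch three pages, so a fixed two-entry MFN array is no longer enough.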
Re: [Xen-devel] [PATCH V5 3/7] libxl: add pvusb API
On 8/7/2015 at 01:21 AM, in message 55c39796.8000...@citrix.com, George Dunlap george.dun...@citrix.com wrote: On 08/06/2015 04:11 AM, Chun Yan Liu wrote: As 4.6 goes to bug fixing stage, maybe we can pick up this thread? :-) Beside to call for your precious review comments and suggestions so that we can make progress, I also want to confirm about the previous discussed two TODO things: 1) use UDEV name rule to specify usb device uniquely even across reboot. That got consensus. Next thing is exposing that name to some sysfs entry, right? So just to be clear, my understanding of the plan was that we try to fix up the current patch series and check it in once the 4.7 window opens; and that having the utility library to convert other names (including this udev-style naming) into something libxl can use would be a separate series. I wasn't necessarily expecting you to work on it (since it wasn't your idea), but if you're keen, I'm sure we'd be grateful. :-) 2) use libusb instead of reading sysfs by ourselves. As George mentioned, using libusb is not simpler than reading sysfs; and if UDEV name is stored to some sysfs entry for us to retrieve, then we still need reading sysfs things. Could we get to a final decision? If these are settled down, I can update related code. I don't think that libusb would be particularly useful for the current pvusb code, since 1. It's already Linux-specific, 2. We need to mess around with sysfs anyway. The same thing can't be said of the HVM path: I think it fairly likely that the emulated pass-through will Just Work (or nearly so) on BSDs (assuming that qemu itself works on the BSDs). I think it we write our utility library for converting vendorid:productid[:serialno], bus-port, c, then it might make sense to use libusb if it makes it more portable. Regarding the code: Things are looking pretty close. 
A couple of comments in-line: diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c index 93bb41e..9869a21 100644 --- a/tools/libxl/libxl_device.c +++ b/tools/libxl/libxl_device.c @@ -676,6 +676,10 @@ void libxl__devices_destroy(libxl__egc *egc, libxl__devices_remove_state *drs) aodev-action = LIBXL__DEVICE_ACTION_REMOVE; aodev-dev = dev; aodev-force = drs-force; +if (dev-backend_kind == LIBXL__DEVICE_KIND_VUSB) { +libxl__initiate_device_usbctrl_remove(egc, aodev); +continue; +} I take it the reason we need to special-case this is that we need to go through and un-assign all of the devices inside the controller first? At some point we probably want to generalize this a bit, so that usb controllers and vscsi controllers look the same (rather than both being special-cased). But since this is internal, maybe we can wait for that design until we actually have both types available. +static int +libxl__device_usb_assigned_list(libxl__gc *gc, +libxl_device_usb **list, int *num) +{ +char **domlist; +unsigned int nd = 0, i, j; +char *be_path; +libxl_device_usb *usb; + +*list = NULL; +*num = 0; + +domlist = libxl__xs_directory(gc, XBT_NULL, /local/domain, nd); +be_path = GCSPRINTF(/local/domain/%d/backend/vusb, LIBXL_TOOLSTACK_DOMID); Hmm, so this had made me realize that I don't think we're doing quite the right thing for non-dom0 backends. For non-dom0 backends, here 'be_path' should be replaced with the right backend. First of all, things are a bit interesting for driver domain backends, because the namespace of hostbus.hostaddr depends on the backend of the virtual controller. Which wouldn't be particularly interesting, *except* that because the USB space is so dynamic, you normally have to query the devices to find the hostbus.hostaddr; and you can't do any queries from dom0 at the moment (except, for example, by ssh'ing into the other vm). 
So to even assign a hostbus.hostaddr device you have to somehow learn, out-of-band, what those numbers are within the domain. I think you are right. As for the network driver domain, when specifying vif, we also need to know the bridge name in that driver domain, then we can use: vif = [ 'bridge=xenbr0, mac=00:16:3E:0d:13:00, model=e1000, backend=domnet' ] Secondly, if the backend is in another domain, then all the stuff re assigning a usb device to usbback can't be done by libxl in the toolstack domain either. Yes. I think so. Should be done in driver domain then. Which means I'm pretty sure this stuff will fail at the moment for USB driver domains trying to assign a non-existent
Re: [Xen-devel] [PATCH V3 3/6] x86/xsaves: enable xsaves/xrstors for hvm guest
On Wed, Aug 05, 2015 at 07:17:44PM +0100, Andrew Cooper wrote: On 05/08/15 02:57, Shuai Ruan wrote: This patch enables xsaves for hvm guest, includes: 1.handle xsaves vmcs init and vmexit. 2.add logic to write/read the XSS msr. Signed-off-by: Shuai Ruan shuai.r...@linux.intel.com --- xen/arch/x86/hvm/hvm.c | 44 ++ xen/arch/x86/hvm/vmx/vmcs.c| 7 +- xen/arch/x86/hvm/vmx/vmx.c | 18 xen/arch/x86/xstate.c | 4 ++-- xen/include/asm-x86/hvm/vmx/vmcs.h | 5 + xen/include/asm-x86/hvm/vmx/vmx.h | 2 ++ xen/include/asm-x86/xstate.h | 2 +- 7 files changed, 78 insertions(+), 4 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index c07e3ef..e5cf761 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -4370,6 +4370,10 @@ void hvm_hypervisor_cpuid_leaf(uint32_t sub_idx, } } +#define XSAVEOPT (1 0) +#define XSAVEC (1 1) +#define XGETBV1(1 2) +#define XSAVES (1 3) These should be in cpufeature.h, not here. Ok. void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx, unsigned int *ecx, unsigned int *edx) { @@ -4456,6 +4460,34 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx, *ebx = _eax + _ebx; } } +if ( count == 1 ) +{ +if ( cpu_has_xsaves ) +{ +*ebx = XSTATE_AREA_MIN_SIZE; +if ( v-arch.xcr0 | v-arch.msr_ia32_xss ) +for ( sub_leaf = 2; sub_leaf 63; sub_leaf++ ) +{ +if ( !((v-arch.xcr0 | v-arch.msr_ia32_xss) + (1ULL sub_leaf)) ) +continue; +domain_cpuid(d, input, sub_leaf, _eax, _ebx, _ecx, + _edx); +*ebx = *ebx + _eax; +} +} +else +{ +*eax = ~XSAVES; +*ebx = *ecx = *edx = 0; +} +if ( !cpu_has_xgetbv1 ) +*eax = ~XGETBV1; +if ( !cpu_has_xsavec ) +*eax = ~XSAVEC; +if ( !cpu_has_xsaveopt ) +*eax = ~XSAVEOPT; +} Urgh - I really need to get domain cpuid fixed in Xen. This is currently making a very bad situation a little worse. In patch 4, I expose the xsaves/xsavec/xsaveopt and need to check whether the hardware supoort it. What's your suggestion about this? 
break; case 0x8001: @@ -4555,6 +4587,12 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) *msr_content = v-arch.hvm_vcpu.guest_efer; break; +case MSR_IA32_XSS: +if ( !cpu_has_vmx_xsaves ) vmx_xsaves has nothing to do with this here. I presume you mean cpu_has_xsave? +goto gp_fault; +*msr_content = v-arch.msr_ia32_xss; +break; + case MSR_IA32_TSC: *msr_content = _hvm_rdtsc_intercept(); break; @@ -4687,6 +4725,12 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content, return X86EMUL_EXCEPTION; break; +case MSR_IA32_XSS: +if ( !cpu_has_vmx_xsaves ) +goto gp_fault; +v-arch.msr_ia32_xss = msr_content; You must validate msr_content here and possibly hand a gp fault back to the guest. Ok, I will fix it. +break; + case MSR_IA32_TSC: hvm_set_guest_tsc(v, msr_content); break; diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c index 4c5ceb5..8e61e3f 100644 --- a/xen/arch/x86/hvm/vmx/vmcs.c +++ b/xen/arch/x86/hvm/vmx/vmcs.c @@ -230,7 +230,8 @@ static int vmx_init_vmcs_config(void) SECONDARY_EXEC_ENABLE_EPT | SECONDARY_EXEC_ENABLE_RDTSCP | SECONDARY_EXEC_PAUSE_LOOP_EXITING | - SECONDARY_EXEC_ENABLE_INVPCID); + SECONDARY_EXEC_ENABLE_INVPCID | + SECONDARY_EXEC_XSAVES); rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap); if ( _vmx_misc_cap VMX_MISC_VMWRITE_ALL ) opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING; @@ -921,6 +922,7 @@ void virtual_vmcs_vmwrite(void *vvmcs, u32 vmcs_encoding, u64 val) virtual_vmcs_exit(vvmcs); } +#define VMX_XSS_EXIT_BITMAP 0 This define definitely doesn't live here. Ok. static int construct_vmcs(struct vcpu *v) { struct domain *d = v-domain; @@ -1204,6 +1206,9 @@ static int construct_vmcs(struct vcpu *v) __vmwrite(GUEST_PAT, guest_pat);
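To illustrate the request above that msr_content be validated before it is stored: a minimal sketch of such a check, assuming the caller knows a mask of supervisor state components it is prepared to expose. The supported_mask parameter, both function names and the -1 return convention are assumptions made for this example, not the interface the series ends up with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the value only sets bits that the (hypothetical) mask allows. */
static bool xss_value_valid(uint64_t value, uint64_t supported_mask)
{
    return (value & ~supported_mask) == 0;
}

/*
 * Sketch of a write handler: returns 0 and updates *stored on success,
 * or -1 when the caller should hand a #GP fault back to the guest.
 * The stored value is left untouched on failure.
 */
static int guest_write_xss(uint64_t *stored, uint64_t value,
                           uint64_t supported_mask)
{
    if (!xss_value_valid(value, supported_mask))
        return -1;

    *stored = value;
    return 0;
}
```

The point of the shape is simply that nothing reaches v->arch.msr_ia32_xss (or the hardware MSR on context switch) unless it has passed the mask check first.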
Re: [Xen-devel] [PATCH V3 4/6] libxc: expose xsaves/xgetbv1/xsavec to hvm guest
On Wed, Aug 05, 2015 at 09:37:22AM +0100, Ian Campbell wrote: On Wed, 2015-08-05 at 09:57 +0800, Shuai Ruan wrote: This patch exposes xsaves/xgetbv1/xsavec to hvm guest. The reserved bits of eax/ebx/ecx/edx must be cleaned up when call cpuid(0dh) with leaf 1 or 2..63. According to the spec the following bits must be reseved: reserved Sorry,:) For leaf 1, bits 03-04/08-31 of ecx is reserved. Edx is reserved. For leaf 2...63, bits 01-31 of ecx is reserved. Edx is reserved. Signed-off-by: Shuai Ruan shuai.r...@linux.intel.com Although this is toolstack code I think this really ought to be acked by the hypervisor x86 maintainers, if they are happy with it then I am too, and in that case you may add: Acked-by: Ian Campbell ian.campb...@citrix.com Ok.Thanks for your review, Ian. --- tools/libxc/xc_cpuid_x86.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c index c97f91a..b69676a 100644 --- a/tools/libxc/xc_cpuid_x86.c +++ b/tools/libxc/xc_cpuid_x86.c @@ -211,6 +211,9 @@ static void intel_xc_cpuid_policy( } #define XSAVEOPT(1 0) +#define XSAVEC (1 1) +#define XGETBV1 (1 2) +#define XSAVES (1 3) /* Configure extended state enumeration leaves (0x000D for xsave) */ static void xc_cpuid_config_xsave( xc_interface *xch, domid_t domid, uint64_t xfeature_mask, @@ -247,8 +250,9 @@ static void xc_cpuid_config_xsave( regs[1] = 512 + 64; /* FP/SSE + XSAVE.HEADER */ break; case 1: /* leaf 1 */ -regs[0] = XSAVEOPT; -regs[1] = regs[2] = regs[3] = 0; +regs[0] = (XSAVEOPT | XSAVEC | XGETBV1 | XSAVES); +regs[2] = 0xe7; +regs[3] = 0; break; case 2 ... 63: /* sub-leaves */ if ( !(xfeature_mask (1ULL input[1])) ) @@ -256,8 +260,9 @@ static void xc_cpuid_config_xsave( regs[0] = regs[1] = regs[2] = regs[3] = 0; break; } -/* Don't touch EAX, EBX. Also cleanup ECX and EDX */ -regs[2] = regs[3] = 0; +/* Don't touch EAX, EBX. Also cleanup EDX. 
Cleanup bits 01-32 of ECX*/ +regs[2] = 0x1; +regs[3] = 0; break; } } ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
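The feature bits discussed in this patch (sub-leaf 1 of CPUID leaf 0xD, EAX bits 0 to 3) lend themselves to a small worked example. The sketch below filters a raw EAX value against what the host supports; struct host_caps and filter_leaf_d1_eax() are hypothetical names invented for illustration, and only the four bit positions are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in the patch. */
#define XSAVEOPT (1u << 0)
#define XSAVEC   (1u << 1)
#define XGETBV1  (1u << 2)
#define XSAVES   (1u << 3)

/* Hypothetical description of what the host hardware supports. */
struct host_caps {
    bool xsaveopt;
    bool xsavec;
    bool xgetbv1;
    bool xsaves;
};

/* Keep only the feature bits the host supports; reserved bits are cleared. */
static uint32_t filter_leaf_d1_eax(uint32_t eax, const struct host_caps *caps)
{
    uint32_t allowed = 0;

    if (caps->xsaveopt) allowed |= XSAVEOPT;
    if (caps->xsavec)   allowed |= XSAVEC;
    if (caps->xgetbv1)  allowed |= XGETBV1;
    if (caps->xsaves)   allowed |= XSAVES;

    return eax & allowed;
}
```

Building an allow-mask and ANDing once also clears the reserved upper bits as a side effect, which is the cleanup the commit message asks for.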
Re: [Xen-devel] [PATCH V3 0/6] add xsaves/xrstors support
On Wed, Aug 05, 2015 at 05:38:21PM +0100, Andrew Cooper wrote:

On 05/08/15 02:57, Shuai Ruan wrote:

The detailed hardware spec can be found in chapter 13 (sections 13.11 and 13.12) of the Intel SDM [1].

patch1: add xsaves/xrstors support for pv guest
patch2: add xsaves/xrstors support for xen
patch3-5: add xsaves/xrstors support for hvm guest
patch6: switch on detection of xsaves/xrstors/xgetbv in xen

This order of operations seems backwards. Can I suggest starting with a patch which adds various xsaves/etc defines/helper functions/etc (rather than having them spread through the series), then a patch which allows Xen to start using the features, then adding support to PV and HVM guests.

~Andrew

OK, I will do this in the next version.

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [linux-next test] 60602: trouble: blocked/broken/fail/pass
flight 60602 linux-next real [real] http://logs.test-lab.xenproject.org/osstest/logs/60602/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf 3 host-install(3) broken REGR. vs. 60389 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail blocked in 60389 test-amd64-i386-xl 13 guest-saverestorefail like 60389 test-amd64-i386-xl-xsm 13 guest-saverestorefail like 60389 test-amd64-i386-pair21 guest-migrate/src_host/dst_host fail like 60389 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail like 60389 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 60389 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 60389 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-qcow2 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-armhf-armhf-xl-vhd 1 build-check(1) blocked n/a test-armhf-armhf-xl-qcow2 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-vhd 1 build-check(1) blocked n/a test-armhf-armhf-xl-raw 1 build-check(1) blocked n/a build-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-xsm 1 build-check(1) blocked n/a test-armhf-armhf-xl 1 build-check(1) blocked n/a test-armhf-armhf-xl-rtds 1 build-check(1) blocked n/a test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-xl-pvh-intel 13 guest-saverestorefail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 
migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qcow2 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-qcow2 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-raw 11 migrate-support-checkfail never pass test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass version targeted for testing: linuxa5b4661b8bc23f3a10eca2206daad84014b768ec baseline version: linux01183609ab61d11f1c310d42552a97be3051cc0f Last test of basis (not found) Failing since 0 1970-01-01 00:00:00 Z 16654 days Testing same since60602 2015-08-05 09:20:19 Z1 days1 attempts jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf broken build-i386 pass build-amd64-libvirt pass build-armhf-libvirt blocked build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl pass test-armhf-armhf-xl blocked test-amd64-i386-xl fail test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
Re: [Xen-devel] Reminder: Urgent - Action Required - Xen Dev Summit Bof's Developer Meetings and WG Meetings on Aug 18 19t (print and food order deadline Friday 7th)
Hi Lars,

2015-08-05 2:54 GMT-07:00 Lars Kurth lars.kurth@gmail.com:

Hi folks, this email is for people planning to attend the Xen Dev Summit in Seattle (Aug 17 & 18) and the Developer Meeting on the 19th.

Is the Developer Meeting on Aug 19th open to public? Is it possible to listen to the meeting if there is still space left?

Thanks, Meng
---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [distros-debian-sid test] 37816: regressions - FAIL
flight 37816 distros-debian-sid real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/37816/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-amd64-sid-netboot-pygrub 13 guest-saverestore fail REGR. vs. 37766 Tests which did not succeed, but are not blocking: test-amd64-amd64-amd64-sid-netboot-pvgrub 10 guest-start fail never pass test-armhf-armhf-armhf-sid-netboot-pygrub 9 debian-di-install fail never pass baseline version: flight 37766 jobs: build-amd64 pass build-armhf pass build-i386 pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-amd64-sid-netboot-pvgrubfail test-amd64-i386-i386-sid-netboot-pvgrub pass test-amd64-i386-amd64-sid-netboot-pygrub fail test-armhf-armhf-armhf-sid-netboot-pygrubfail test-amd64-amd64-i386-sid-netboot-pygrub pass sg-report-flight on osstest.xs.citrite.net logs: /home/osstest/logs images: /home/osstest/images Logs, config files, etc. are available at http://osstest.xs.citrite.net/~osstest/testlogs/logs Test harness code can be found at http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary Push not applicable. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen - version 2
Hi Shannon,

Just some clarification questions.

On 07/08/15 03:11, Shannon Zhao wrote:

3. Dom0 gets grant table and event channel irq information
---
As said above, we set the hypervisor_id to XenVMM to tell Dom0 that it runs on the Xen hypervisor. For the grant table, add two new HVM_PARAMs: HVM_PARAM_GNTTAB_START_ADDRESS and HVM_PARAM_GNTTAB_SIZE. For the event channel irq, reuse HVM_PARAM_CALLBACK_IRQ and add a new delivery type:
val[63:56] == 3: val[15:8] is flag: val[7:0] is a PPI (ARM and ARM64 only)

Can you describe the content of flag?

When constructing Dom0 in Xen, save these values. Then Dom0 could get them through hypercall HVMOP_get_param.

4. Map MMIO regions
---
Register a bus_notifier for platform and amba bus in Linux. Add a new XENMAPSPACE XENMAPSPACE_dev_mmio. Within the register, check if the device is newly added, then call hypercall XENMEM_add_to_physmap to map the mmio regions.

5. Route device interrupts to Dom0
--
Route all the SPI interrupts to Dom0 before Dom0 booting.

Not all the SPIs will be routed to DOM0. Some are used by Xen and should never be used by any guest. I have in mind the UART and SMMU interrupts. You will have to find a way to skip them nicely. Note that not all the IRQs used by Xen are properly registered when we build DOM0 (see the SMMU).

Regards,
-- Julien Grall

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
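To make the proposed HVM_PARAM_CALLBACK_IRQ layout concrete (val[63:56] == 3, val[15:8] flag, val[7:0] PPI), here is a small encode/decode sketch. It only follows the bit layout quoted above; the function names and the EX_CALLBACK_TYPE_PPI constant are hypothetical, and the meaning of the flag byte is deliberately left open since that is exactly what is being asked in this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Delivery type from the proposal: val[63:56] == 3. Name is hypothetical. */
#define EX_CALLBACK_TYPE_PPI 3u

/* Pack the flag byte and PPI number into a 64-bit parameter value. */
static uint64_t encode_callback_ppi(uint8_t flag, uint8_t ppi)
{
    return ((uint64_t)EX_CALLBACK_TYPE_PPI << 56) |
           ((uint64_t)flag << 8) |
           (uint64_t)ppi;
}

/* Extract val[63:56], the delivery type. */
static unsigned int callback_type(uint64_t val)
{
    return (unsigned int)(val >> 56);
}

/* Extract val[15:8], the flag byte (semantics not defined here). */
static uint8_t callback_flag(uint64_t val)
{
    return (uint8_t)((val >> 8) & 0xff);
}

/* Extract val[7:0], the PPI number. */
static uint8_t callback_ppi(uint64_t val)
{
    return (uint8_t)(val & 0xff);
}
```

For instance, a flag byte of 0x02 with PPI 31 would be carried as 0x030000000000021f under this layout.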
Re: [Xen-devel] [xen 4.6 retrospective] [urgent] rename freeze window and make release branch as soon as possible after RC1
On 05/08/15 at 11:22, Lars Kurth wrote:

This is one item of feedback, which I believe is a quick win for us. This is one piece of feedback from a list of items that have during the last few weeks been raised with me personally, either during face-2-face conversations or in a private e-mail thread. See http://lists.xenproject.org/archives/html/xen-devel/2015-08/msg00173.html for information on the retrospective.

= Issue / Observation =

The name freeze window/period (aka the time period from when we feature freeze until we branch master and/or make the release) leads some contributors to mistakenly assume that development for the next release stops. I saw a few mails on xen-devel@ recently, pointing out to contributors that development does not stop during freeze. Chatting to Ian Campbell, he mentioned that he replied to several different people who said they were waiting for the tree to reopen. Maybe choosing a better name will help.

In addition, we used to branch master a lot earlier, I believe up to Xen 4.1 (around RC2 or RC3). At some point we started branching master when we release. I do not know why we changed, but it seems we did not have any issues branching master around RC2 or RC3. Branching earlier would mean that contributors do not have to carry patches for as long as they do now, and the risk of having to rebase patches several times is lower.

= Possible Solution / Improvement =

Change Terminology:

* Keep Feature Freeze as is

+1

* Rename Freeze Window/Period to Stabilisation Window/Period or something similar

-1. IMHO all projects I work with use the freeze terminology, changing it to something else is just going to confuse people.

* Make clear that Stabilisation Window/Period != no development in the Development Update x.y mail template

+1

Roger.

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 01/31] libxl: fix libxl__build_hvm error handling
On 07/08/15 at 13:03, Wei Liu wrote:
> On Fri, Aug 07, 2015 at 12:55:21PM +0200, Roger Pau Monné wrote:
>> On 07/08/15 at 12:49, Wei Liu wrote:
>>> On Fri, Aug 07, 2015 at 12:17:38PM +0200, Roger Pau Monne wrote:
>>>> With the current code in libxl__build_hvm it is possible for the function to fail and still return 0.
>>>
>>> I care about this bit, which states clearly there is a bug that needs fixing.
>>
>> There are a bunch of error paths in this function that need fixing: every error path below the call to libxl__domain_device_construct_rdm will simply return with rc = 0, because the return code of the functions is stored in ret, but the return code for libxl__build_hvm is fetched from rc.
>>
>>> It's hard to see where the bug is when this patch also does a bunch of refactoring.
>>
>> It refactors the error paths only, mainly replacing:
>>
>>     if (libxl_call_foo(bar))
>>         error
>>
>> to
>>
>>     rc = libxl_call_foo(bar)
>>     if (rc != 0)
>>         error
>
> But this suggests there is no bug?

The bug is that we return with rc = 0; that's why it's important to change this to follow the coding style. I agree that the patch could be simpler by setting rc to some sane value in each error path (or just setting rc to ERROR_FAIL in the out label), but it doesn't make much sense to me to do this kind of fixing. If we fix it, we fix it for good. So we can keep the error codes returned by auxiliary functions.

>>> It would be good if you can separate the bug fix from other name changing bits, so that we can apply that bug fix for 4.6 and possibly queue it up for backporting.
>>
>> There are no name changing bits AFAICT.
>
> Changing ret for rc is name changing to me. It's a good thing to do to comply with coding style, but mixing this with the bug fix makes it hard to backport the fix itself.

I agree that the patch could be simplified, and I can send a simpler fix if needed, but as said above I would prefer to fix it in a proper way.

Roger.
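For readers following the thread, the failure mode under discussion can be reduced to a few lines of C. This is only an illustrative sketch, not the libxl code: step_fails, build_buggy, build_fixed and SKETCH_ERROR_FAIL are hypothetical names, and the error value is a placeholder rather than libxl's real ERROR_FAIL.

```c
#include <assert.h>

/* Placeholder error value for this sketch only; not taken from libxl. */
#define SKETCH_ERROR_FAIL (-3)

/* Hypothetical helper that always fails, standing in for an auxiliary call. */
static int step_fails(void)
{
    return SKETCH_ERROR_FAIL;
}

/* Buggy shape: the helper's result lands in 'ret', but 'rc' is returned. */
static int build_buggy(void)
{
    int rc = 0, ret;

    ret = step_fails();
    if ( ret != 0 )
        goto out;   /* the error is detected, but rc is still 0 */

 out:
    return rc;
}

/* Fixed shape: the result is stored in 'rc', so the error code propagates. */
static int build_fixed(void)
{
    int rc;

    rc = step_fails();
    if ( rc != 0 )
        goto out;

 out:
    return rc;
}
```

With these definitions build_buggy() reports success even though the helper failed, while build_fixed() hands the helper's error code back to the caller, which is the behaviour argued for above.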
Re: [Xen-devel] virtio on pv/pvh xen
On Wed, Aug 5, 2015 at 11:33 PM, Wei Liu wei.l...@citrix.com wrote:
> On Wed, Aug 05, 2015 at 11:09:41PM +0800, Lai Jiangshan wrote:
>> Hi, Liu
>>
>> Does pv or pvh guest support virtio devices?
>
> No.
>
>> If yes, how can I configure the guest? If not, how can I make it support?
>
> A new transport which makes use of xenbus and grant table needs to be developed.

I don't think pvh need these.

Hi, Mukesh Rathor

Does pvh guest support virtio devices?

Thanks
Lai

> There were talks on standardising virtio and adding Xen PV transport support long time ago but no visible progress was made.
>
> Wei.
>
>> Thanks
>> Lai
[Xen-devel] [PATCH v4 05/31] libxc: introduce a domain loader for HVM guest firmware
Introduce a very simple (and dummy) domain loader to be used to load the
firmware (hvmloader) into HVM guests. Since hvmloader is just a 32-bit ELF
executable, the loader is fairly simple.

Signed-off-by: Roger Pau Monné roger@citrix.com
Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Wei Liu wei.l...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
Changes since v3:
 - s/__FUNCTION__/__func__/g
 - Fix style errors in xc_dom_hvmloader.c.
 - Add Andrew Cooper Reviewed-by.
 - Add Wei Acked-by.
---
 tools/libxc/Makefile           |   1 +
 tools/libxc/include/xc_dom.h   |   8 ++
 tools/libxc/xc_dom_hvmloader.c | 313 +
 xen/include/xen/libelf.h       |   1 +
 4 files changed, 323 insertions(+)
 create mode 100644 tools/libxc/xc_dom_hvmloader.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 8ae0ea0..b45380c 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -84,6 +84,7 @@ GUEST_SRCS-y += xc_dom_core.c xc_dom_boot.c
 GUEST_SRCS-y += xc_dom_elfloader.c
 GUEST_SRCS-$(CONFIG_X86) += xc_dom_bzimageloader.c
 GUEST_SRCS-$(CONFIG_X86) += xc_dom_decompress_lz4.c
+GUEST_SRCS-$(CONFIG_X86) += xc_dom_hvmloader.c
 GUEST_SRCS-$(CONFIG_ARM) += xc_dom_armzimageloader.c
 GUEST_SRCS-y += xc_dom_binloader.c
 GUEST_SRCS-y += xc_dom_compat_linux.c
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index bc55ec9..02d9d5c 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -14,6 +14,7 @@
  */
 
 #include <xen/libelf/libelf.h>
+#include <xenguest.h>
 
 #define INVALID_P2M_ENTRY ((xen_pfn_t)-1)
 
@@ -185,6 +186,13 @@ struct xc_dom_image {
         XC_DOM_PV_CONTAINER,
         XC_DOM_HVM_CONTAINER,
     } container_type;
+
+    /* HVM specific fields. */
+    /* Extra ACPI tables passed to HVMLOADER */
+    struct xc_hvm_firmware_module acpi_module;
+
+    /* Extra SMBIOS structures passed to HVMLOADER */
+    struct xc_hvm_firmware_module smbios_module;
 };
 
 /* --- pluggable kernel loader - */
diff --git a/tools/libxc/xc_dom_hvmloader.c b/tools/libxc/xc_dom_hvmloader.c
new file mode 100644
index 000..79a3b99
--- /dev/null
+++ b/tools/libxc/xc_dom_hvmloader.c
@@ -0,0 +1,313 @@
+/*
+ * Xen domain builder -- HVM specific bits.
+ *
+ * Parse and load ELF firmware images for HVM domains.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <inttypes.h>
+#include <assert.h>
+
+#include "xg_private.h"
+#include "xc_dom.h"
+#include "xc_bitops.h"
+
+/* */
+/* parse elf binary */
+
+static elf_negerrnoval check_elf_kernel(struct xc_dom_image *dom, bool verbose)
+{
+    if ( dom->kernel_blob == NULL )
+    {
+        if ( verbose )
+            xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                         "%s: no kernel image loaded", __func__);
+        return -EINVAL;
+    }
+
+    if ( !elf_is_elfbinary(dom->kernel_blob, dom->kernel_size) )
+    {
+        if ( verbose )
+            xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
+                         "%s: kernel is not an ELF image", __func__);
+        return -EINVAL;
+    }
+    return 0;
+}
+
+static elf_negerrnoval xc_dom_probe_hvm_kernel(struct xc_dom_image *dom)
+{
+    struct elf_binary elf;
+    int rc;
+
+    /* This loader is designed for HVM guest firmware. */
+    if ( dom->container_type != XC_DOM_HVM_CONTAINER )
+        return -EINVAL;
+
+    rc = check_elf_kernel(dom, 0);
+    if ( rc != 0 )
+        return rc;
+
+    rc = elf_init(&elf, dom->kernel_blob, dom->kernel_size);
+    if ( rc != 0 )
+        return rc;
+
+    /*
+     * We need to check that there are no Xen ELFNOTES, or
+     * else we might be trying to load a PV kernel.
+     */
+    elf_parse_binary(&elf);
+    rc = elf_xen_parse(&elf, &dom->parms);
+    if ( rc == 0 )
+        return -EINVAL;
+
+    return 0;
+}
+
+static elf_errorstatus xc_dom_parse_hvm_kernel(struct xc_dom_image *dom)
+/*
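The probe logic in the patch above accepts an image only if the container is HVM, the blob is an ELF binary, and no Xen ELF notes are found. A minimal stand-alone sketch of that decision follows. It does not use libelf: sketch_is_elf only compares the four ELF magic bytes, and the is_hvm_container and has_xen_notes flags are hypothetical stand-ins for the container_type check and the elf_xen_parse() result.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Return non-zero if the blob starts with the ELF magic bytes. */
static int sketch_is_elf(const void *blob, size_t size)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    return blob != NULL && size >= sizeof(magic) &&
           memcmp(blob, magic, sizeof(magic)) == 0;
}

/*
 * Mirror the order of checks in the probe function: wrong container,
 * non-ELF blob, or an image carrying Xen notes are all rejected.
 */
static int sketch_probe(const void *blob, size_t size,
                        int is_hvm_container, int has_xen_notes)
{
    if ( !is_hvm_container )
        return -EINVAL;

    if ( !sketch_is_elf(blob, size) )
        return -EINVAL;

    /* Xen notes suggest a PV kernel, which this loader must not claim. */
    if ( has_xen_notes )
        return -EINVAL;

    return 0;
}
```

Returning 0 only when every check passes lets a loader list try each probe in turn and pick the first one that claims the image.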
[Xen-devel] [PATCH v4 02/31] libxc: split x86 HVM setup_guest into smaller logical functions
This is just a preparatory change to clean up the code in setup_guest.
Should not introduce any functional changes.

Signed-off-by: Roger Pau Monné roger@citrix.com
Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Wei Liu wei.l...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
Changes since v3:
 - Add Andrew Cooper Reviewed-by.
 - Add Wei Acked-by.
---
 tools/libxc/xc_hvm_build_x86.c | 198 -
 1 file changed, 117 insertions(+), 81 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index ec11f15..6f79686 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -231,28 +231,20 @@ static int check_mmio_hole(uint64_t start, uint64_t memsize,
     return 1;
 }
 
-static int setup_guest(xc_interface *xch,
-                       uint32_t dom, struct xc_hvm_build_args *args,
-                       char *image, unsigned long image_size)
+static int xc_hvm_populate_memory(xc_interface *xch, uint32_t dom,
+                                  struct xc_hvm_build_args *args,
+                                  xen_pfn_t *page_array)
 {
-    xen_pfn_t *page_array = NULL;
     unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
     unsigned long p2m_size;
     unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
-    unsigned long entry_eip, cur_pages, cur_pfn;
-    void *hvm_info_page;
-    uint32_t *ident_pt;
-    struct elf_binary elf;
-    uint64_t v_start, v_end;
-    uint64_t m_start = 0, m_end = 0;
+    unsigned long cur_pages, cur_pfn;
     int rc;
     xen_capabilities_info_t caps;
     unsigned long stat_normal_pages = 0, stat_2mb_pages = 0,
         stat_1gb_pages = 0;
     unsigned int memflags = 0;
     int claim_enabled = args->claim_enabled;
-    xen_pfn_t special_array[NR_SPECIAL_PAGES];
-    xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
     uint64_t total_pages;
     xen_vmemrange_t dummy_vmemrange[2];
     unsigned int dummy_vnode_to_pnode[1];
@@ -260,19 +252,6 @@ static int setup_guest(xc_interface *xch,
     unsigned int *vnode_to_pnode;
     unsigned int nr_vmemranges, nr_vnodes;
 
-    memset(&elf, 0, sizeof(elf));
-    if ( elf_init(&elf, image, image_size) != 0 )
-    {
-        PERROR("Could not initialise ELF image");
-        goto error_out;
-    }
-
-    xc_elf_set_logfile(xch, &elf, 1);
-
-    elf_parse_binary(&elf);
-    v_start = 0;
-    v_end = args->mem_size;
-
     if ( nr_pages > target_pages )
         memflags |= XENMEMF_populate_on_demand;
 
@@ -345,24 +324,6 @@ static int setup_guest(xc_interface *xch,
         goto error_out;
     }
 
-    if ( modules_init(args, v_end, &elf, &m_start, &m_end) != 0 )
-    {
-        ERROR("Insufficient space to load modules.");
-        goto error_out;
-    }
-
-    DPRINTF("VIRTUAL MEMORY ARRANGEMENT:\n");
-    DPRINTF(" Loader: %016"PRIx64"->%016"PRIx64"\n", elf.pstart, elf.pend);
-    DPRINTF(" Modules: %016"PRIx64"->%016"PRIx64"\n", m_start, m_end);
-    DPRINTF(" TOTAL:%016"PRIx64"->%016"PRIx64"\n", v_start, v_end);
-    DPRINTF(" ENTRY:%016"PRIx64"\n", elf_uval(&elf, elf.ehdr, e_entry));
-
-    if ( (page_array = malloc(p2m_size * sizeof(xen_pfn_t))) == NULL )
-    {
-        PERROR("Could not allocate memory.");
-        goto error_out;
-    }
-
     for ( i = 0; i < p2m_size; i++ )
         page_array[i] = ((xen_pfn_t)-1);
     for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
@@ -562,7 +523,54 @@ static int setup_guest(xc_interface *xch,
     DPRINTF(" 4KB PAGES: 0x%016lx\n", stat_normal_pages);
     DPRINTF(" 2MB PAGES: 0x%016lx\n", stat_2mb_pages);
     DPRINTF(" 1GB PAGES: 0x%016lx\n", stat_1gb_pages);
-
+
+    rc = 0;
+    goto out;
+ error_out:
+    rc = -1;
+ out:
+
+    /* ensure no unclaimed pages are left unused */
+    xc_domain_claim_pages(xch, dom, 0 /* cancels the claim */);
+
+    return rc;
+}
+
+static int xc_hvm_load_image(xc_interface *xch,
+                             uint32_t dom, struct xc_hvm_build_args *args,
+                             xen_pfn_t *page_array)
+{
+    unsigned long entry_eip, image_size;
+    struct elf_binary elf;
+    uint64_t v_start, v_end;
+    uint64_t m_start = 0, m_end = 0;
+    char *image;
+    int rc;
+
+    image = xc_read_image(xch, args->image_file_name, &image_size);
+    if ( image == NULL )
+        return -1;
+
+    memset(&elf, 0, sizeof(elf));
+    if ( elf_init(&elf, image, image_size) != 0 )
+        goto error_out;
+
+    xc_elf_set_logfile(xch, &elf, 1);
+
+    elf_parse_binary(&elf);
+    v_start = 0;
+    v_end = args->mem_size;
+
+    if ( modules_init(args, v_end, &elf, &m_start, &m_end) != 0 )
+    {
+        ERROR("Insufficient space to load modules.");
+        goto error_out;
+    }
+
+    DPRINTF("VIRTUAL MEMORY ARRANGEMENT:\n");
+    DPRINTF(" Loader: %016"PRIx64"->%016"PRIx64"\n", elf.pstart, elf.pend);
+    DPRINTF(" Modules:
[Xen-devel] [PATCH v4 09/31] libxc: introduce a xc_dom_arch for hvm-3.0-x86_32 guests
This xc_dom_arch will be used in order to build HVM domains. The code is
based on the existing xc_hvm_populate_memory and xc_hvm_populate_params
functions.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
Changes since v3:
 - Make sure c/s b9dbe33 is not reverted on this patch.
 - Set the initial BSP state using {get/set}hvmcontext.
---
 tools/libxc/include/xc_dom.h |   6 +
 tools/libxc/xc_dom_x86.c     | 619 ++-
 2 files changed, 614 insertions(+), 11 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 0245d24..cda40d9 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -188,6 +188,12 @@ struct xc_dom_image {
     } container_type;
 
     /* HVM specific fields. */
+    xen_pfn_t target_pages;
+    xen_pfn_t mmio_start;
+    xen_pfn_t mmio_size;
+    xen_pfn_t lowmem_end;
+    xen_pfn_t highmem_end;
+
     /* Extra ACPI tables passed to HVMLOADER */
     struct xc_hvm_firmware_module acpi_module;
 
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index ae8187f..18e3340 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -39,10 +39,32 @@
 
 /* */
 
-#define SUPERPAGE_PFN_SHIFT 9
-#define SUPERPAGE_NR_PFNS (1UL << SUPERPAGE_PFN_SHIFT)
 #define SUPERPAGE_BATCH_SIZE 512
 
+#define SUPERPAGE_2MB_SHIFT 9
+#define SUPERPAGE_2MB_NR_PFNS (1UL << SUPERPAGE_2MB_SHIFT)
+#define SUPERPAGE_1GB_SHIFT 18
+#define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
+
+#define X86_CR0_PE 0x01
+#define X86_CR0_ET 0x10
+
+#define VGA_HOLE_SIZE (0x20)
+
+#define SPECIALPAGE_PAGING   0
+#define SPECIALPAGE_ACCESS   1
+#define SPECIALPAGE_SHARING  2
+#define SPECIALPAGE_BUFIOREQ 3
+#define SPECIALPAGE_XENSTORE 4
+#define SPECIALPAGE_IOREQ    5
+#define SPECIALPAGE_IDENT_PT 6
+#define SPECIALPAGE_CONSOLE  7
+#define NR_SPECIAL_PAGES     8
+#define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
+
+#define NR_IOREQ_SERVER_PAGES 8
+#define ioreq_server_pfn(x) (special_pfn(0) - NR_IOREQ_SERVER_PAGES + (x))
+
 #define bits_to_mask(bits) (((xen_vaddr_t)1 << (bits))-1)
 #define round_down(addr, mask) ((addr) & ~(mask))
 #define round_up(addr, mask) ((addr) | (mask))
@@ -461,6 +483,135 @@ static int alloc_magic_pages(struct xc_dom_image *dom)
     return 0;
 }
 
+static void build_hvm_info(void *hvm_info_page, struct xc_dom_image *dom)
+{
+    struct hvm_info_table *hvm_info = (struct hvm_info_table *)
+        (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
+    uint8_t sum;
+    int i;
+
+    memset(hvm_info_page, 0, PAGE_SIZE);
+
+    /* Fill in the header. */
+    memcpy(hvm_info->signature, "HVM INFO", sizeof(hvm_info->signature));
+    hvm_info->length = sizeof(struct hvm_info_table);
+
+    /* Sensible defaults: these can be overridden by the caller. */
+    hvm_info->apic_mode = 1;
+    hvm_info->nr_vcpus = 1;
+    memset(hvm_info->vcpu_online, 0xff, sizeof(hvm_info->vcpu_online));
+
+    /* Memory parameters. */
+    hvm_info->low_mem_pgend = dom->lowmem_end >> PAGE_SHIFT;
+    hvm_info->high_mem_pgend = dom->highmem_end >> PAGE_SHIFT;
+    hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
+
+    /* Finish with the checksum. */
+    for ( i = 0, sum = 0; i < hvm_info->length; i++ )
+        sum += ((uint8_t *)hvm_info)[i];
+    hvm_info->checksum = -sum;
+}
+
+static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
+{
+    unsigned long i;
+    void *hvm_info_page;
+    uint32_t *ident_pt, domid = dom->guest_domid;
+    int rc;
+    xen_pfn_t special_array[NR_SPECIAL_PAGES];
+    xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
+    xc_interface *xch = dom->xch;
+
+    if ( (hvm_info_page = xc_map_foreign_range(
+              xch, domid, PAGE_SIZE, PROT_READ | PROT_WRITE,
+              HVM_INFO_PFN)) == NULL )
+        goto error_out;
+    build_hvm_info(hvm_info_page, dom);
+    munmap(hvm_info_page, PAGE_SIZE);
+
+    /* Allocate and clear special pages. */
+    for ( i = 0; i < NR_SPECIAL_PAGES; i++ )
+        special_array[i] = special_pfn(i);
+
+    rc = xc_domain_populate_physmap_exact(xch, domid, NR_SPECIAL_PAGES, 0, 0,
+                                          special_array);
+    if ( rc != 0 )
+    {
+        DOMPRINTF("Could not allocate special pages.");
+        goto error_out;
+    }
+
+    if ( xc_clear_domain_pages(xch, domid, special_pfn(0), NR_SPECIAL_PAGES) )
+        goto error_out;
+
+    xc_hvm_param_set(xch, domid, HVM_PARAM_STORE_PFN,
+                     special_pfn(SPECIALPAGE_XENSTORE));
+    xc_hvm_param_set(xch, domid, HVM_PARAM_BUFIOREQ_PFN,
+                     special_pfn(SPECIALPAGE_BUFIOREQ));
+    xc_hvm_param_set(xch, domid, HVM_PARAM_IOREQ_PFN,
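Two details of the patch above lend themselves to a small worked example: the special-page PFN arithmetic and the "sum of all bytes is zero" checksum written by build_hvm_info. The sketch below copies the two PFN macros from the patch; everything else (struct sketch_table, byte_sum, sketch_seal) is a hypothetical reduction and not the real struct hvm_info_table layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Macros as they appear in the patch. */
#define NR_SPECIAL_PAGES 8
#define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
#define NR_IOREQ_SERVER_PAGES 8
#define ioreq_server_pfn(x) (special_pfn(0) - NR_IOREQ_SERVER_PAGES + (x))

/* Sum every byte of a buffer modulo 256. */
static uint8_t byte_sum(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint8_t sum = 0;
    size_t i;

    for ( i = 0; i < len; i++ )
        sum += p[i];
    return sum;
}

/* Hypothetical stand-in table: only the checksum idea is modelled here. */
struct sketch_table {
    char signature[8];
    uint32_t length;
    uint8_t checksum;
    uint8_t payload[3];
};

/* Store the negated byte sum so that the whole table sums to zero. */
static void sketch_seal(struct sketch_table *t)
{
    t->checksum = 0;
    t->checksum = (uint8_t)-byte_sum(t, sizeof(*t));
}
```

With the values from the patch, the eight special pages occupy PFNs 0xfeff8 to 0xfefff and the ioreq server pages sit directly below them, starting at 0xfeff0. A consumer of a sealed table can validate it by checking that byte_sum() over the table returns 0.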
[Xen-devel] [PATCH v4 08/31] libxc: rework BSP initialization
Place the calls to xc_vcpu_setcontext and the allocation of the hypercall
buffer into the arch-specific vcpu hooks. This is needed for the next patch,
so x86 HVM guests can initialize the BSP using XEN_DOMCTL_sethvmcontext
instead of XEN_DOMCTL_setvcpucontext.

This patch should not introduce any functional change.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
 tools/libxc/include/xc_dom.h |  2 +-
 tools/libxc/xc_dom_arm.c     | 22 +-
 tools/libxc/xc_dom_boot.c    | 23 +--
 tools/libxc/xc_dom_x86.c     | 26 --
 4 files changed, 39 insertions(+), 34 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 5c1bb0f..0245d24 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -221,7 +221,7 @@ struct xc_dom_arch {
     /* arch-specific data structs setup */
     int (*start_info) (struct xc_dom_image * dom);
     int (*shared_info) (struct xc_dom_image * dom, void *shared_info);
-    int (*vcpu) (struct xc_dom_image * dom, void *vcpu_ctxt);
+    int (*vcpu) (struct xc_dom_image * dom);
     int (*bootearly) (struct xc_dom_image * dom);
     int (*bootlate) (struct xc_dom_image * dom);
 
diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index 7548dae..8865097 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -119,9 +119,10 @@ static int shared_info_arm(struct xc_dom_image *dom, void *ptr)
 
 /* */
 
-static int vcpu_arm32(struct xc_dom_image *dom, void *ptr)
+static int vcpu_arm32(struct xc_dom_image *dom)
 {
-    vcpu_guest_context_t *ctxt = ptr;
+    vcpu_guest_context_any_t any_ctx;
+    vcpu_guest_context_t *ctxt = &any_ctx.c;
 
     DOMPRINTF_CALLED(dom->xch);
 
@@ -154,12 +155,18 @@ static int vcpu_arm32(struct xc_dom_image *dom, void *ptr)
     DOMPRINTF("Initial state CPSR %#"PRIx32" PC %#"PRIx32,
               ctxt->user_regs.cpsr, ctxt->user_regs.pc32);
 
-    return 0;
+    rc = xc_vcpu_setcontext(dom->xch, dom->guest_domid, 0, &any_ctx);
+    if ( rc != 0 )
+        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                     "%s: SETVCPUCONTEXT failed (rc=%d)", __func__, rc);
+
+    return rc;
 }
 
-static int vcpu_arm64(struct xc_dom_image *dom, void *ptr)
+static int vcpu_arm64(struct xc_dom_image *dom)
 {
-    vcpu_guest_context_t *ctxt = ptr;
+    vcpu_guest_context_any_t any_ctx;
+    vcpu_guest_context_t *ctxt = &any_ctx.c;
 
     DOMPRINTF_CALLED(dom->xch);
     /* clear everything */
@@ -189,6 +196,11 @@ static int vcpu_arm64(struct xc_dom_image *dom, void *ptr)
     DOMPRINTF("Initial state CPSR %#"PRIx32" PC %#"PRIx64,
               ctxt->user_regs.cpsr, ctxt->user_regs.pc64);
 
+    rc = xc_vcpu_setcontext(dom->xch, dom->guest_domid, 0, &any_ctx);
+    if ( rc != 0 )
+        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                     "%s: SETVCPUCONTEXT failed (rc=%d)", __func__, rc);
+
     return 0;
 }
 
diff --git a/tools/libxc/xc_dom_boot.c b/tools/libxc/xc_dom_boot.c
index e6f7794..791041b 100644
--- a/tools/libxc/xc_dom_boot.c
+++ b/tools/libxc/xc_dom_boot.c
@@ -62,19 +62,6 @@ static int setup_hypercall_page(struct xc_dom_image *dom)
     return rc;
 }
 
-static int launch_vm(xc_interface *xch, domid_t domid,
-                     vcpu_guest_context_any_t *ctxt)
-{
-    int rc;
-
-    xc_dom_printf(xch, "%s: called, ctxt=%p", __FUNCTION__, ctxt);
-    rc = xc_vcpu_setcontext(xch, domid, 0, ctxt);
-    if ( rc != 0 )
-        xc_dom_panic(xch, XC_INTERNAL_ERROR,
-                     "%s: SETVCPUCONTEXT failed (rc=%d)", __FUNCTION__, rc);
-    return rc;
-}
-
 static int clear_page(struct xc_dom_image *dom, xen_pfn_t pfn)
 {
     xen_pfn_t dst;
@@ -197,14 +184,9 @@ void *xc_dom_boot_domU_map(struct xc_dom_image *dom, xen_pfn_t pfn,
 
 int xc_dom_boot_image(struct xc_dom_image *dom)
 {
-    DECLARE_HYPERCALL_BUFFER(vcpu_guest_context_any_t, ctxt);
     xc_dominfo_t info;
     int rc;
 
-    ctxt = xc_hypercall_buffer_alloc(dom->xch, ctxt, sizeof(*ctxt));
-    if ( ctxt == NULL )
-        return -1;
-
     DOMPRINTF_CALLED(dom->xch);
 
     /* misc stuff*/
@@ -259,13 +241,10 @@ int xc_dom_boot_image(struct xc_dom_image *dom)
         return rc;
 
     /* let the vm run */
-    memset(ctxt, 0, sizeof(*ctxt));
-    if ( (rc = dom->arch_hooks->vcpu(dom, ctxt)) != 0 )
+    if ( (rc = dom->arch_hooks->vcpu(dom)) != 0 )
         return rc;
     xc_dom_unmap_all(dom);
-    rc = launch_vm(dom->xch, dom->guest_domid, ctxt);
-    xc_hypercall_buffer_free(dom->xch, ctxt);
 
     return rc;
 }
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 0f49e27..ae8187f 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -583,10 +583,12 @@ static int shared_info_x86_64(struct
[Xen-devel] [PATCH v4 00/31] Introduce HVM without dm and new boot ABI
This series is split in the following order:

 - Patch 1 is a fixup of the error paths in libxl__build_hvm and related
   functions (and can go in independently of the rest of the series).
 - Patches from 2 to 11 switch HVM domain construction to use the xc_dom_*
   family of functions, like they are used to build PV domains. This batch
   of patches can go in regardless of the status of the rest of the series
   IMHO, and in fact would help me quite a lot with the rebasing.
 - Patches from 12 to 22 allow disabling the devices emulated inside of Xen.
 - Patches from 23 to 31 introduce the creation of HVM guests without a
   device model and without the devices emulated inside of Xen.

This series has been successfully tested on the following hardware:

 - Intel Xeon W3550.
 - AMD Opteron 4184.

With both hap=0 and hap=1 in the configuration file. I've been able to boot
a SMP guest in this mode with a virtual hard drive and a virtual network
card, all working fine AFAICT.

For this round only maintainers of the specific code being modified have
been Cced on the patches.

The series can also be found in the following git repo:

git://xenbits.xen.org/people/royger/xen.git branch hvm_without_dm_v4

And for the FreeBSD part:

git://xenbits.xen.org/people/royger/freebsd.git branch new_entry_point_v3

In case someone wants to give it a try, I've uploaded a FreeBSD kernel that
should work when booted into this mode:

https://people.freebsd.org/~royger/kernel_no_dm

This FreeBSD kernel starts the APs in long mode. There are examples for
starting the APs in other modes in the sys/x86/xen/pv.c file.

The config file that I've used is:

<config>
kernel="/path/to/kernel_no_dm"
builder="hvm"
device_model_version="none"
memory=128
vcpus=2
name = "freebsd"
</config>

Of course if you have a FreeBSD disk already set up it can also be added to
the configuration file, and the following line can be used to point FreeBSD
to the disk:

extra="vfs.root.mountfrom=ufs:/dev/ufsid/<disk_id>"

N    01/31 libxl: fix libxl__build_hvm error handling
AW   02/31 libxc: split x86 HVM setup_guest into smaller
AW   03/31 libxc: unify xc_dom_p2m_{host/guest}
AW   04/31 libxc: introduce the notion of a container type
AW   05/31 libxc: introduce a domain loader for HVM guest
AW   06/31 libxc: make arch_setup_meminit a xc_dom_arch hook
AW   07/31 libxc: make arch_setup_boot{init/late} xc_dom_arch
M    08/31 libxc: rework BSP initialization
M    09/31 libxc: introduce a xc_dom_arch for hvm-3.0-x86_32
M    10/31 libxl: switch HVM domain building to use xc_dom_*
     11/31 libxc: remove dead HVM building code
M    12/31 xen/x86: add bitmap of enabled emulated devices
     13/31 xen/x86: allow disabling the emulated local apic
     14/31 xen/x86: allow disabling the emulated HPET
     15/31 xen/x86: allow disabling the pmtimer
     16/31 xen/x86: allow disabling the emulated RTC
     17/31 xen/x86: allow disabling the emulated IO APIC
     18/31 xen/x86: allow disabling the emulated PIC
     19/31 xen/x86: allow disabling the emulated pmu
     20/31 xen/x86: allow disabling the emulated VGA
     21/31 xen/x86: allow disabling the emulated IOMMU
     22/31 xen/x86: allow disabling all emulated devices inside
     23/31 elfnotes: introduce a new PHYS_ENTRY elfnote
M    24/31 libxc: allow creating domains without emulated
     25/31 xen: allow HVM guests to use XENMEM_memory_map
M    26/31 xen/x86: allow HVM guests to use hypercalls to bring
M    27/31 xenconsole: try to attach to PV console if HVM fails
     28/31 libxc/xen: introduce HVM_PARAM_CMDLINE_PFN
     29/31 libxc/xen: introduce HVM_PARAM_MODLIST_PFN
     30/31 libxc: switch xc_dom_elfloader to be used with
M    31/31 libxl: allow the creation of HVM domains without a

A = Acked/Reviewed by Andrew Cooper.
W = Acked/Reviewed by Wei Liu.
N = New in this version.
M = Modified in this version.

Thanks, Roger.
[Xen-devel] [PATCH v4 06/31] libxc: make arch_setup_meminit a xc_dom_arch hook
This allows having different arch_setup_meminit implementations based on the
guest type. It should not introduce any functional changes.

Signed-off-by: Roger Pau Monné roger@citrix.com
Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Wei Liu wei.l...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
Changes since v3:
 - Add Andrew Cooper Reviewed-by.
 - Move xc_dom_arch definitions to the end of the xc_dom_arch.c files in
   order to reduce the spurious diffs in the coming patches.
 - Add Wei Acked-by.
---
 tools/libxc/include/xc_dom.h |  4 ++-
 tools/libxc/xc_dom_arm.c     | 70 +++
 tools/libxc/xc_dom_boot.c    |  2 +-
 tools/libxc/xc_dom_x86.c     | 71 +++-
 4 files changed, 78 insertions(+), 69 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 02d9d5c..c4b994f 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -223,6 +223,9 @@ struct xc_dom_arch {
     int (*shared_info) (struct xc_dom_image * dom, void *shared_info);
     int (*vcpu) (struct xc_dom_image * dom, void *vcpu_ctxt);
 
+    /* arch-specific memory initialization. */
+    int (*meminit) (struct xc_dom_image * dom);
+
     char *guest_type;
     char *native_protocol;
     int page_shift;
@@ -400,7 +403,6 @@ static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn)
 
 /* --- arch bits --- */
 
-int arch_setup_meminit(struct xc_dom_image *dom);
 int arch_setup_bootearly(struct xc_dom_image *dom);
 int arch_setup_bootlate(struct xc_dom_image *dom);
 
diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index b00d667..24776ba 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -194,38 +194,6 @@ static int vcpu_arm64(struct xc_dom_image *dom, void *ptr)
 
 /* */
 
-static struct xc_dom_arch xc_dom_32 = {
-    .guest_type = "xen-3.0-armv7l",
-    .native_protocol = XEN_IO_PROTO_ABI_ARM,
-    .page_shift = PAGE_SHIFT_ARM,
-    .sizeof_pfn = 8,
-    .alloc_magic_pages = alloc_magic_pages,
-    .count_pgtables = count_pgtables_arm,
-    .setup_pgtables = setup_pgtables_arm,
-    .start_info = start_info_arm,
-    .shared_info = shared_info_arm,
-    .vcpu = vcpu_arm32,
-};
-
-static struct xc_dom_arch xc_dom_64 = {
-    .guest_type = "xen-3.0-aarch64",
-    .native_protocol = XEN_IO_PROTO_ABI_ARM,
-    .page_shift = PAGE_SHIFT_ARM,
-    .sizeof_pfn = 8,
-    .alloc_magic_pages = alloc_magic_pages,
-    .count_pgtables = count_pgtables_arm,
-    .setup_pgtables = setup_pgtables_arm,
-    .start_info = start_info_arm,
-    .shared_info = shared_info_arm,
-    .vcpu = vcpu_arm64,
-};
-
-static void __init register_arch_hooks(void)
-{
-    xc_dom_register_arch_hooks(&xc_dom_32);
-    xc_dom_register_arch_hooks(&xc_dom_64);
-}
-
 static int set_mode(xc_interface *xch, domid_t domid, char *guest_type)
 {
     static const struct {
@@ -384,7 +352,7 @@ out:
     return rc < 0 ? rc : 0;
 }
 
-int arch_setup_meminit(struct xc_dom_image *dom)
+static int meminit(struct xc_dom_image *dom)
 {
     int i, rc;
     xen_pfn_t pfn;
@@ -542,6 +510,42 @@ int xc_dom_feature_translated(struct xc_dom_image *dom)
     return 1;
 }
 
+/* */
+
+static struct xc_dom_arch xc_dom_32 = {
+    .guest_type = "xen-3.0-armv7l",
+    .native_protocol = XEN_IO_PROTO_ABI_ARM,
+    .page_shift = PAGE_SHIFT_ARM,
+    .sizeof_pfn = 8,
+    .alloc_magic_pages = alloc_magic_pages,
+    .count_pgtables = count_pgtables_arm,
+    .setup_pgtables = setup_pgtables_arm,
+    .start_info = start_info_arm,
+    .shared_info = shared_info_arm,
+    .vcpu = vcpu_arm32,
+    .meminit = meminit,
+};
+
+static struct xc_dom_arch xc_dom_64 = {
+    .guest_type = "xen-3.0-aarch64",
+    .native_protocol = XEN_IO_PROTO_ABI_ARM,
+    .page_shift = PAGE_SHIFT_ARM,
+    .sizeof_pfn = 8,
+    .alloc_magic_pages = alloc_magic_pages,
+    .count_pgtables = count_pgtables_arm,
+    .setup_pgtables = setup_pgtables_arm,
+    .start_info = start_info_arm,
+    .shared_info = shared_info_arm,
+    .vcpu = vcpu_arm64,
+    .meminit = meminit,
+};
+
+static void __init register_arch_hooks(void)
+{
+    xc_dom_register_arch_hooks(&xc_dom_32);
+    xc_dom_register_arch_hooks(&xc_dom_64);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxc/xc_dom_boot.c b/tools/libxc/xc_dom_boot.c
index 7c30f96..bf2cd7b 100644
--- a/tools/libxc/xc_dom_boot.c
+++ b/tools/libxc/xc_dom_boot.c
@@ -146,7 +146,7 @@ int xc_dom_boot_mem_init(struct xc_dom_image *dom)
 
     DOMPRINTF_CALLED(dom->xch);
 
-    rc = arch_setup_meminit(dom);
+    rc = dom->arch_hooks->meminit(dom);
     if ( rc != 0 )
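The change above replaces a single global arch_setup_meminit() with a per-guest-type function pointer. The dispatch pattern can be sketched in isolation as follows. All names here (sketch_arch, sketch_image, the "sketch-pv" and "sketch-hvm" guest types) are hypothetical; the real struct xc_dom_arch carries many more hooks and is registered through xc_dom_register_arch_hooks().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_image;

/* Reduced hook table: one guest type string and one meminit hook. */
struct sketch_arch {
    const char *guest_type;
    int (*meminit)(struct sketch_image *img);
};

struct sketch_image {
    const struct sketch_arch *arch_hooks;
    int meminit_calls;   /* lets a caller see which hook ran */
};

static int meminit_pv(struct sketch_image *img)
{
    img->meminit_calls += 1;
    return 0;
}

static int meminit_hvm(struct sketch_image *img)
{
    img->meminit_calls += 10;
    return 0;
}

static const struct sketch_arch sketch_archs[] = {
    { "sketch-pv",  meminit_pv  },
    { "sketch-hvm", meminit_hvm },
};

/* Look up the hook table for a guest type, or NULL if none matches. */
static const struct sketch_arch *sketch_find_arch(const char *guest_type)
{
    size_t i;

    for ( i = 0; i < sizeof(sketch_archs) / sizeof(sketch_archs[0]); i++ )
        if ( strcmp(sketch_archs[i].guest_type, guest_type) == 0 )
            return &sketch_archs[i];
    return NULL;
}

/* Generic code no longer names an implementation; it calls the hook. */
static int sketch_boot_mem_init(struct sketch_image *img)
{
    return img->arch_hooks->meminit(img);
}
```

The generic boot path only ever calls through arch_hooks, so adding an HVM-specific memory initialisation later means adding one table entry rather than touching the caller.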
[Xen-devel] [PATCH v4 26/31] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs
Allow the usage of the VCPUOP_initialise, VCPUOP_up, VCPUOP_down and
VCPUOP_is_up hypercalls from HVM guests.

This patch introduces a new structure (vcpu_hvm_context) that should be used
in conjunction with the VCPUOP_initialise hypercall in order to initialize
vCPUs for HVM guests.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Stefano Stabellini stefano.stabell...@citrix.com
---
 xen/arch/arm/domain.c             |  24 ++
 xen/arch/x86/domain.c             | 156 +++
 xen/arch/x86/hvm/hvm.c            |   8 ++
 xen/common/domain.c               |  16 +---
 xen/include/public/hvm/hvm_vcpu.h | 168 ++
 xen/include/xen/domain.h          |   2 +
 6 files changed, 359 insertions(+), 15 deletions(-)
 create mode 100644 xen/include/public/hvm/hvm_vcpu.h

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index b2bfc7d..b20035d 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -752,6 +752,30 @@ int arch_set_info_guest(
     return 0;
 }
 
+int arch_initialize_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    struct vcpu_guest_context *ctxt;
+    struct domain *d = current->domain;
+    int rc;
+
+    if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
+        return -ENOMEM;
+
+    if ( copy_from_guest(ctxt, arg, 1) )
+    {
+        free_vcpu_guest_context(ctxt);
+        return -EFAULT;
+    }
+
+    domain_lock(d);
+    rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, ctxt);
+    domain_unlock(d);
+
+    free_vcpu_guest_context(ctxt);
+
+    return rc;
+}
+
 int arch_vcpu_reset(struct vcpu *v)
 {
     vcpu_end_shutdown_deferral(v);
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 432fe43..4a7f8d9 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -37,6 +37,7 @@
 #include <xen/wait.h>
 #include <xen/guest_access.h>
 #include <public/sysctl.h>
+#include <public/hvm/hvm_vcpu.h>
 #include <asm/regs.h>
 #include <asm/mc146818rtc.h>
 #include <asm/system.h>
@@ -1135,6 +1136,161 @@ int arch_set_info_guest(
 #undef c
 }
 
+/* Called by VCPUOP_initialise for HVM guests. */
+static int arch_set_info_hvm_guest(struct vcpu *v, vcpu_hvm_context_t *ctx)
+{
+    struct segment_register seg;
+
+#define get_context_seg(ctx, seg, f)                                        \
+    (ctx)->mode == VCPU_HVM_MODE_16B ? (ctx)->cpu_regs.x86_16.seg##_##f :   \
+    (ctx)->mode == VCPU_HVM_MODE_32B ? (ctx)->cpu_regs.x86_32.seg##_##f :   \
+    (ctx)->cpu_regs.x86_64.seg##_##f
+
+#define get_context_gpr(ctx, gpr)                                           \
+    (ctx)->mode == VCPU_HVM_MODE_16B ? (ctx)->cpu_regs.x86_16.gpr :         \
+    (ctx)->mode == VCPU_HVM_MODE_32B ? (ctx)->cpu_regs.x86_32.e##gpr :      \
+    (ctx)->cpu_regs.x86_64.r##gpr
+
+#define get_context_field(ctx, field)                                       \
+    (ctx)->mode == VCPU_HVM_MODE_16B ? (ctx)->cpu_regs.x86_16.field :       \
+    (ctx)->mode == VCPU_HVM_MODE_32B ? (ctx)->cpu_regs.x86_32.field :       \
+    (ctx)->cpu_regs.x86_64.field
+
+    memset(&seg, 0, sizeof(seg));
+
+    if ( !paging_mode_hap(v->domain) )
+        v->arch.guest_table = pagetable_null();
+
+    v->arch.user_regs.rax = get_context_gpr(ctx, ax);
+    v->arch.user_regs.rcx = get_context_gpr(ctx, cx);
+    v->arch.user_regs.rdx = get_context_gpr(ctx, dx);
+    v->arch.user_regs.rbx = get_context_gpr(ctx, bx);
+    v->arch.user_regs.rsp = get_context_gpr(ctx, sp);
+    v->arch.user_regs.rbp = get_context_gpr(ctx, bp);
+    v->arch.user_regs.rsi = get_context_gpr(ctx, si);
+    v->arch.user_regs.rdi = get_context_gpr(ctx, di);
+    v->arch.user_regs.rip = get_context_gpr(ctx, ip);
+    v->arch.user_regs.rflags = get_context_gpr(ctx, flags);
+
+    v->arch.hvm_vcpu.guest_cr[0] = get_context_field(ctx, cr0) | X86_CR0_ET;
+    hvm_update_guest_cr(v, 0);
+    v->arch.hvm_vcpu.guest_cr[4] = get_context_field(ctx, cr4);
+    hvm_update_guest_cr(v, 4);
+
+    switch ( ctx->mode )
+    {
+    case VCPU_HVM_MODE_32B:
+        v->arch.hvm_vcpu.guest_efer = ctx->cpu_regs.x86_32.efer;
+        hvm_update_guest_efer(v);
+        v->arch.hvm_vcpu.guest_cr[3] = ctx->cpu_regs.x86_32.cr3;
+        hvm_update_guest_cr(v, 3);
+        break;
+    case VCPU_HVM_MODE_64B:
+        v->arch.user_regs.r8 = ctx->cpu_regs.x86_64.r8;
+        v->arch.user_regs.r9 = ctx->cpu_regs.x86_64.r9;
+        v->arch.user_regs.r10 = ctx->cpu_regs.x86_64.r10;
+        v->arch.user_regs.r11 = ctx->cpu_regs.x86_64.r11;
+        v->arch.user_regs.r12 = ctx->cpu_regs.x86_64.r12;
+        v->arch.user_regs.r13 = ctx->cpu_regs.x86_64.r13;
+        v->arch.user_regs.r14 = ctx->cpu_regs.x86_64.r14;
+        v->arch.user_regs.r15 = ctx->cpu_regs.x86_64.r15;
+        v->arch.hvm_vcpu.guest_efer = ctx->cpu_regs.x86_64.efer;
+        hvm_update_guest_efer(v);
+        v->arch.hvm_vcpu.guest_cr[3] = ctx->cpu_regs.x86_64.cr3;
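The get_context_* macros above pick a register from one of three per-mode layouts using token pasting. A self-contained sketch of the same idea is shown below. The types are hypothetical reductions (one register per mode, not the real vcpu_hvm_context layout), and unlike the macros in the patch this version wraps the whole conditional in parentheses and casts each branch to a common type.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_mode { SKETCH_MODE_16B, SKETCH_MODE_32B, SKETCH_MODE_64B };

/* Hypothetical context: one accumulator register per execution mode. */
struct sketch_ctx {
    enum sketch_mode mode;
    union {
        struct { uint16_t ax; } x86_16;
        struct { uint32_t eax; } x86_32;
        struct { uint64_t rax; } x86_64;
    } cpu_regs;
};

/*
 * Select the register matching the context's mode. 'gpr' is the 16-bit
 * name; the 32-bit and 64-bit names are formed by pasting an 'e' or 'r'
 * prefix. The outer parentheses keep the conditional intact when the
 * macro is combined with other operators.
 */
#define sketch_get_gpr(ctx, gpr)                                             \
    ((ctx)->mode == SKETCH_MODE_16B ? (uint64_t)(ctx)->cpu_regs.x86_16.gpr : \
     (ctx)->mode == SKETCH_MODE_32B ? (uint64_t)(ctx)->cpu_regs.x86_32.e##gpr : \
     (ctx)->cpu_regs.x86_64.r##gpr)
```

One point that may be worth checking in review: the macros as posted have no outer parentheses, so by C precedence rules an expression such as get_context_field(ctx, cr0) | X86_CR0_ET applies the OR only to the final (64-bit) branch of the conditional. Parenthesising the macro body, as in this sketch, makes the OR apply in every mode.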
[Xen-devel] [PATCH v4 10/31] libxl: switch HVM domain building to use xc_dom_* helpers
Now that we have all the code in place HVM domain building in libxl can be switched to use the xc_dom_* family of functions, just like they are used in order to build PV guests. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com --- tools/libxl/libxl_arch.h | 2 +- tools/libxl/libxl_dm.c | 18 ++-- tools/libxl/libxl_dom.c | 227 +-- tools/libxl/libxl_internal.h | 4 +- tools/libxl/libxl_vnuma.c| 12 ++- tools/libxl/libxl_x86.c | 8 +- 6 files changed, 155 insertions(+), 116 deletions(-) diff --git a/tools/libxl/libxl_arch.h b/tools/libxl/libxl_arch.h index bd030b6..34a853c 100644 --- a/tools/libxl/libxl_arch.h +++ b/tools/libxl/libxl_arch.h @@ -60,6 +60,6 @@ _hidden int libxl__arch_domain_construct_memmap(libxl__gc *gc, libxl_domain_config *d_config, uint32_t domid, -struct xc_hvm_build_args *args); +struct xc_dom_image *dom); #endif diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index 02c0162..f6b4c89 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -18,6 +18,8 @@ #include libxl_osdeps.h /* must come before any other headers */ #include libxl_internal.h + +#include xc_dom.h #include xen/hvm/e820.h static const char *libxl_tapif_script(libxl__gc *gc) @@ -181,7 +183,7 @@ add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config, int libxl__domain_device_construct_rdm(libxl__gc *gc, libxl_domain_config *d_config, uint64_t rdm_mem_boundary, - struct xc_hvm_build_args *args) + struct xc_dom_image *dom) { int i, j, conflict, rc; struct xen_reserved_device_memory *xrdm = NULL; @@ -189,7 +191,7 @@ int libxl__domain_device_construct_rdm(libxl__gc *gc, uint16_t seg; uint8_t bus, devfn; uint64_t rdm_start, rdm_size; -uint64_t highmem_end = args-highmem_end ? args-highmem_end : (1ull32); +uint64_t highmem_end = dom-highmem_end ? 
dom-highmem_end : (1ull32); /* * We just want to construct RDM once since RDM is specific to the @@ -303,7 +305,7 @@ int libxl__domain_device_construct_rdm(libxl__gc *gc, for (i = 0; i d_config-num_rdms; i++) { rdm_start = d_config-rdms[i].start; rdm_size = d_config-rdms[i].size; -conflict = overlaps_rdm(0, args-lowmem_end, rdm_start, rdm_size); +conflict = overlaps_rdm(0, dom-lowmem_end, rdm_start, rdm_size); if (!conflict) continue; @@ -314,14 +316,14 @@ int libxl__domain_device_construct_rdm(libxl__gc *gc, * We will move downwards lowmem_end so we have to expand * highmem_end. */ -highmem_end += (args-lowmem_end - rdm_start); +highmem_end += (dom-lowmem_end - rdm_start); /* Now move downwards lowmem_end. */ -args-lowmem_end = rdm_start; +dom-lowmem_end = rdm_start; } } /* Sync highmem_end. */ -args-highmem_end = highmem_end; +dom-highmem_end = highmem_end; /* * Finally we can take same policy to check lowmem( 2G) and @@ -331,11 +333,11 @@ int libxl__domain_device_construct_rdm(libxl__gc *gc, rdm_start = d_config-rdms[i].start; rdm_size = d_config-rdms[i].size; /* Does this entry conflict with lowmem? */ -conflict = overlaps_rdm(0, args-lowmem_end, +conflict = overlaps_rdm(0, dom-lowmem_end, rdm_start, rdm_size); /* Does this entry conflict with highmem? 
*/ conflict |= overlaps_rdm((1ULL32), - args-highmem_end - (1ULL32), + dom-highmem_end - (1ULL32), rdm_start, rdm_size); if (!conflict) diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index 92c4278..bf9b65f 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -602,6 +602,63 @@ static int set_vnuma_info(libxl__gc *gc, uint32_t domid, return rc; } +static int libxl__build_dom(libxl__gc *gc, uint32_t domid, + libxl_domain_build_info *info, libxl__domain_build_state *state, + struct xc_dom_image *dom) +{ +uint64_t mem_kb; +int ret; + +if ( (ret = xc_dom_boot_xen_init(dom, CTX-xch, domid)) != 0 ) { +LOGE(ERROR, xc_dom_boot_xen_init failed); +goto out; +} +#ifdef GUEST_RAM_BASE +if ( (ret = xc_dom_rambase_init(dom, GUEST_RAM_BASE)) != 0 ) { +LOGE(ERROR, xc_dom_rambase failed); +
[Xen-devel] [PATCH v4 20/31] xen/x86: allow disabling the emulated VGA
Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/stdvga.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index f50bff7..a3296bd 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -555,6 +555,9 @@ void stdvga_init(struct domain *d)
     void *p;
     int i;
 
+    if ( !has_vvga(d) )
+        return;
+
     memset(s, 0, sizeof(*s));
     spin_lock_init(&s->lock);
 
@@ -594,6 +597,9 @@ void stdvga_deinit(struct domain *d)
     struct hvm_hw_stdvga *s = &d->arch.hvm_domain.stdvga;
     int i;
 
+    if ( !has_vvga(d) )
+        return;
+
     for ( i = 0; i != ARRAY_SIZE(s->vram_page); i++ )
     {
         if ( s->vram_page[i] == NULL )
-- 
1.9.5 (Apple Git-50.3)
Re: [Xen-devel] [PATCH v4 01/31] libxl: fix libxl__build_hvm error handling
On Fri, Aug 07, 2015 at 12:17:38PM +0200, Roger Pau Monne wrote:
> With the current code in libxl__build_hvm it is possible for the function to
> fail and still return 0.

It's hard to see where the bug is when this patch also does a bunch of
refactoring. It would be good if you could separate the bug fix from the other
name-changing bits, so that we can apply the bug fix for 4.6 and possibly queue
it up for backporting.

Wei.
[Xen-devel] [PATCH v4 22/31] xen/x86: allow disabling all emulated devices inside of Xen
Only allow enabling or disabling all the emulated devices inside of Xen; right
now Xen doesn't support enabling only specific emulated devices.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/domain.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index c508074..432fe43 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -563,7 +563,8 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
                                   XEN_X86_EMU_IOAPIC | XEN_X86_EMU_PIC |
                                   XEN_X86_EMU_PMU | XEN_X86_EMU_VGA |
                                   XEN_X86_EMU_IOMMU);
-        if ( (config->emulation_flags & emulation_mask) != emulation_mask )
+        if ( (config->emulation_flags & emulation_mask) != emulation_mask &&
+             (config->emulation_flags & emulation_mask) != 0 )
         {
             printk(XENLOG_G_ERR "d%d: Xen does not allow HVM creation with the "
                    "current selection of emulators: %#x.\n", d->domain_id,
-- 
1.9.5 (Apple Git-50.3)
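For readers skimming the hunk, the new condition accepts exactly two selections: every emulated device, or none of them. A minimal standalone sketch of that predicate follows. The function name and the SKETCH_EMU_* bit values are made up for illustration and are not the real XEN_X86_EMU_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bits for illustration only, not Xen's XEN_X86_EMU_* values. */
#define SKETCH_EMU_LAPIC (1u << 0)
#define SKETCH_EMU_HPET  (1u << 1)
#define SKETCH_EMU_VGA   (1u << 2)
#define SKETCH_EMU_ALL   (SKETCH_EMU_LAPIC | SKETCH_EMU_HPET | SKETCH_EMU_VGA)

/*
 * Returns true when the requested flags select either all of the devices
 * in the mask or none of them; any partial selection is rejected.
 */
static bool sketch_emulation_flags_ok(uint32_t flags, uint32_t mask)
{
    uint32_t selected = flags & mask;

    return selected == mask || selected == 0;
}
```

With this shape, sketch_emulation_flags_ok(0, SKETCH_EMU_ALL) and sketch_emulation_flags_ok(SKETCH_EMU_ALL, SKETCH_EMU_ALL) both pass, while a request for SKETCH_EMU_HPET alone is refused.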
[Xen-devel] [PATCH v4 14/31] xen/x86: allow disabling the emulated HPET
Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/hpet.c | 13 +++++++++++++
 xen/arch/x86/hvm/hvm.c  |  1 -
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
index edf9a17..266b587 100644
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -516,6 +516,9 @@ static int hpet_save(struct domain *d, hvm_domain_context_t *h)
     int rc;
     uint64_t guest_time;
 
+    if ( !has_vhpet(d) )
+        return 0;
+
     write_lock(&hp->lock);
     guest_time = guest_time_hpet(hp);
 
@@ -575,6 +578,9 @@ static int hpet_load(struct domain *d, hvm_domain_context_t *h)
     uint64_t guest_time;
     int i;
 
+    if ( !has_vhpet(d) )
+        return 0;
+
     write_lock(&hp->lock);
 
     /* Reload the HPET registers */
@@ -633,6 +639,9 @@ void hpet_init(struct domain *d)
     HPETState *h = domain_vhpet(d);
     int i;
 
+    if ( !has_vhpet(d) )
+        return;
+
     memset(h, 0, sizeof(HPETState));
 
     rwlock_init(&h->lock);
@@ -660,6 +669,7 @@ void hpet_init(struct domain *d)
     }
 
     register_mmio_handler(d, &hpet_mmio_ops);
+    d->arch.hvm_domain.params[HVM_PARAM_HPET_ENABLED] = 1;
 }
 
 void hpet_deinit(struct domain *d)
@@ -667,6 +677,9 @@ void hpet_deinit(struct domain *d)
     int i;
     HPETState *h = domain_vhpet(d);
 
+    if ( !has_vhpet(d) )
+        return;
+
     write_lock(&h->lock);
 
     if ( hpet_enabled(h) )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c957610..c778a20 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1594,7 +1594,6 @@ int hvm_domain_initialise(struct domain *d)
 
     hvm_init_guest_time(d);
 
-    d->arch.hvm_domain.params[HVM_PARAM_HPET_ENABLED] = 1;
     d->arch.hvm_domain.params[HVM_PARAM_TRIPLE_FAULT_REASON] = SHUTDOWN_reboot;
 
     vpic_init(d);
-- 
1.9.5 (Apple Git-50.3)
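The net effect of the hpet_init() and hvm_domain_initialise() hunks is that HVM_PARAM_HPET_ENABLED is only set for domains that actually have an emulated HPET, and that the save/load handlers report success without touching any state otherwise. A minimal standalone sketch of that behaviour is below. The struct, the flag value and all sketch_* names are hypothetical stand-ins, not Xen's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins, not Xen's real types or constants. */
#define SKETCH_EMU_HPET (1u << 1)

struct sketch_domain {
    uint32_t emulation_flags; /* which emulated devices the domain has */
    uint64_t hpet_enabled;    /* stands in for params[HVM_PARAM_HPET_ENABLED] */
};

static bool sketch_has_vhpet(const struct sketch_domain *d)
{
    return (d->emulation_flags & SKETCH_EMU_HPET) != 0;
}

/* Same shape as hpet_init(): bail out early when there is no HPET. */
static void sketch_hpet_init(struct sketch_domain *d)
{
    if ( !sketch_has_vhpet(d) )
        return;

    /* ... device state setup would go here ... */
    d->hpet_enabled = 1;
}

/* Same shape as the save/load guards: nothing to do counts as success. */
static int sketch_hpet_save(const struct sketch_domain *d)
{
    if ( !sketch_has_vhpet(d) )
        return 0;

    /* ... the HPET record would be written here ... */
    return 0;
}
```

A domain created without the HPET flag therefore never sees the parameter set, which is the point of moving the assignment next to register_mmio_handler().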
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen - version 2
On 07/08/15 11:37, Christoffer Dall wrote:
> On Fri, Aug 7, 2015 at 12:33 PM, Julien Grall julien.gr...@citrix.com wrote:
>> Hi Shannon,
>>
>> Just some clarification questions.
>>
>> On 07/08/15 03:11, Shannon Zhao wrote:
>>> 3. Dom0 gets grant table and event channel irq information
>>> -----------------------------------------------------------
>>> As said above, we assign the hypervisor_id be XenVMM to tell Dom0 that it
>>> runs on Xen hypervisor.
>>>
>>> For grant table, add two new HVM_PARAMs: HVM_PARAM_GNTTAB_START_ADDRESS
>>> and HVM_PARAM_GNTTAB_SIZE.
>>>
>>> For event channel irq, reuse HVM_PARAM_CALLBACK_IRQ and add a new
>>> delivery type:
>>> val[63:56] == 3: val[15:8] is flag: val[7:0] is a PPI (ARM and ARM64
>>> only)
>>
>> Can you describe the content of flag?
>>
>>> When constructing Dom0 in Xen, save these values. Then Dom0 could get
>>> them through hypercall HVMOP_get_param.
>>>
>>> 4. Map MMIO regions
>>> -------------------
>>> Register a bus_notifier for platform and amba bus in Linux. Add a new
>>> XENMAPSPACE XENMAPSPACE_dev_mmio. Within the register, check if the
>>> device is newly added, then call hypercall XENMEM_add_to_physmap to map
>>> the mmio regions.
>>>
>>> 5. Route device interrupts to Dom0
>>> ----------------------------------
>>> Route all the SPI interrupts to Dom0 before Dom0 booting.
>>
>> Not all the SPI will be routed to DOM0. Some are used by Xen and should
>> never be used by any guest. I have in mind the UART and SMMU interrupts.
>>
>> You will have to find a way to skip them nicely. Note that not all the
>> IRQs used by Xen are properly registered when we build DOM0 (see the
>> SMMU).
>
> Just to clarify; Xen should map all SPIs which are not reserved for Xen
> to Dom0, right?

Right.

-- 
Julien Grall
[Xen-devel] [PATCH v4 25/31] xen: allow HVM guests to use XENMEM_memory_map
Enable this hypercall for HVM guests in order to fetch the e820 memory map in
the absence of an emulated BIOS. The memory map is populated and notified to
Xen in arch_setup_meminit_hvm.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 tools/libxc/xc_dom_x86.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index b587b12..87bce6e 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -1205,6 +1205,7 @@ static int check_mmio_hole(uint64_t start, uint64_t memsize,
     return 1;
 }
 
+#define MAX_E820_ENTRIES    128
 static int meminit_hvm(struct xc_dom_image *dom)
 {
     unsigned long i, vmemid, nr_pages = dom->total_pages;
@@ -1225,6 +1226,8 @@ static int meminit_hvm(struct xc_dom_image *dom)
     unsigned int nr_vmemranges, nr_vnodes;
     xc_interface *xch = dom->xch;
     uint32_t domid = dom->guest_domid;
+    struct e820entry entries[MAX_E820_ENTRIES];
+    int e820_index = 0;
 
     if ( nr_pages > target_pages )
         memflags |= XENMEMF_populate_on_demand;
@@ -1275,6 +1278,13 @@ static int meminit_hvm(struct xc_dom_image *dom)
         vnode_to_pnode = dom->vnode_to_pnode;
     }
 
+    /* Add one additional memory range to account for the VGA hole */
+    if ( (nr_vmemranges + (dom->emulation ? 1 : 0)) > MAX_E820_ENTRIES )
+    {
+        DOMPRINTF("Too many memory ranges");
+        goto error_out;
+    }
+
     total_pages = 0;
     p2m_size = 0;
     for ( i = 0; i < nr_vmemranges; i++ )
@@ -1363,9 +1373,13 @@ static int meminit_hvm(struct xc_dom_image *dom)
      * Under 2MB mode, we allocate pages in batches of no more than 8MB to
      * ensure that we can be preempted and hence dom0 remains responsive.
      */
-    if ( dom->emulation )
+    if ( dom->emulation ) {
         rc = xc_domain_populate_physmap_exact(
             xch, domid, 0xa0, 0, memflags, &dom->p2m_host[0x00]);
+        entries[e820_index].addr = 0;
+        entries[e820_index].size = 0xa0 << PAGE_SHIFT;
+        entries[e820_index++].type = E820_RAM;
+    }
 
     stat_normal_pages = 0;
     for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
@@ -1392,6 +1406,12 @@ static int meminit_hvm(struct xc_dom_image *dom)
         else
             cur_pages = vmemranges[vmemid].start >> PAGE_SHIFT;
 
+        /* Build an e820 map. */
+        entries[e820_index].addr = cur_pages << PAGE_SHIFT;
+        entries[e820_index].size = vmemranges[vmemid].end -
+                                   entries[e820_index].addr;
+        entries[e820_index++].type = E820_RAM;
+
         while ( (rc == 0) && (end_pages > cur_pages) )
         {
             /* Clip count to maximum 1GB extent. */
@@ -1509,6 +1529,13 @@ static int meminit_hvm(struct xc_dom_image *dom)
     DPRINTF("  2MB PAGES: 0x%016lx\n", stat_2mb_pages);
     DPRINTF("  1GB PAGES: 0x%016lx\n", stat_1gb_pages);
 
+    rc = xc_domain_set_memory_map(xch, domid, entries, e820_index);
+    if ( rc != 0 )
+    {
+        DOMPRINTF("unable to set memory map.");
+        goto error_out;
+    }
+
     rc = 0;
     goto out;
  error_out:
-- 
1.9.5 (Apple Git-50.3)
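To illustrate how the e820 entries in the hunks above are derived, here is a small standalone sketch of the same arithmetic: each RAM entry starts at the first page of a range shifted into a byte address and runs up to the range's end address. The sketch_* names, the 4K page shift and the type value are assumptions made for the example. They mirror the shape of the patch but are not libxc's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants; the patch uses PAGE_SHIFT, E820_RAM and MAX_E820_ENTRIES. */
#define SKETCH_PAGE_SHIFT       12
#define SKETCH_E820_RAM         1
#define SKETCH_MAX_E820_ENTRIES 128

struct sketch_e820entry {
    uint64_t addr;  /* start of the region, in bytes */
    uint64_t size;  /* length of the region, in bytes */
    uint32_t type;  /* region type, RAM in this sketch */
};

/*
 * Append one RAM entry covering [first_page << shift, end_addr).
 * Returns 0 on success, -1 if the map is already full.
 */
static int sketch_e820_add_ram(struct sketch_e820entry *map, int *index,
                               uint64_t first_page, uint64_t end_addr)
{
    if ( *index >= SKETCH_MAX_E820_ENTRIES )
        return -1;

    map[*index].addr = first_page << SKETCH_PAGE_SHIFT;
    map[*index].size = end_addr - map[*index].addr;
    map[*index].type = SKETCH_E820_RAM;
    (*index)++;

    return 0;
}
```

For example, adding the low region with first_page 0 and end_addr 0xa0 << 12 yields an entry of size 0xa0000, which matches the 0xa0 << PAGE_SHIFT assignment in the patch.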
Re: [Xen-devel] virtio on pv/pvh xen
Add back xen-devel On Fri, Aug 07, 2015 at 06:45:10PM +0800, Lai Jiangshan wrote: On Fri, Aug 7, 2015 at 6:25 PM, Wei Liu wei.l...@citrix.com wrote: On Fri, Aug 07, 2015 at 06:01:23PM +0800, Lai Jiangshan wrote: On Wed, Aug 5, 2015 at 11:33 PM, Wei Liu wei.l...@citrix.com wrote: On Wed, Aug 05, 2015 at 11:09:41PM +0800, Lai Jiangshan wrote: Hi, Liu Does pv or pvh guest support virtio devices? No. If yes, how can I configure the guest? If not, how can I make it support? A new transport which makes use of xenbus and grant table needs to be developed. I don't think pvh need these. I think it still needs a new transport. Hi, Mukesh Rathor Does pvh guest support virtio devices? Currently there are three transports available: virtio over pci, virtio over mmio and virtio over channel I/O. They all seem to require some sort of emulation support. PVH doesn't have emulation support, which means you can't use existing virtio transports. Thank you for sharing the important knowledge. If you want to add emulation to PVH, in effect it's just a HVM guest. You can already use virtio over pci emulation on HVM guest today. BTW can I ask why you want virtio support? I need virtio-9pfs(share) performance(pvh). Do we have other non-networking sharedfs/virtfs/... for pvh? There was xenfs https://blog.xenproject.org/2009/03/26/status-of-xenfs/ but there is no active development. Not sure about its status nowadays. We have an OPW intern working on virtio-9p on Xen at the moment. Let's see how it goes. Wei. thanks, Lai Wei. Thanks Lai There were talks on standardising virtio and adding Xen PV transport support long time ago but no visible progress was made. Wei. Thanks Lai ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen - version 2
On Fri, 7 Aug 2015, Shannon Zhao wrote:
> This document is going to explain the design details of Xen booting with
> ACPI on ARM. Maybe parts of it may not be appropriate. Any comments are
> welcome.
>
> To Xen itself booting with ACPI, this is similar to Linux kernel except
> that Xen doesn't parse DSDT table. So I'll skip this part and focus on how
> Xen prepares ACPI tables for Dom0 and how Xen passes them to Dom0.
>
> 1. Copy and change some EFI and ACPI tables
> -------------------------------------------
> a) Copy EFI_SYSTEM_TABLE and change the value of FirmwareVendor,
>    VendorGuid, VendorTable, ConfigurationTable. These changes are not very
>    special and it just assign values to these members.
> b) Create EFI_MEMORY_DESCRIPTOR table. This will add memory start and size
>    information of Dom0. And Dom0 will get the memory information through
>    this EFI table.
> c) Copy FADT table. Change the value of arm_boot_flags to enable PSCI and
>    HVC. Let the hypervisor_id be XenVMM in order to tell Dom0 that it runs
>    on Xen hypervisor, then Dom0 could through HVM_PARAM to get some
>    information for booting necessity, such as grant table start address
>    and size. Change header revision, length and checksum as well.
> d) Copy GTDT table. Set non_secure_el2_interrupt and non_secure_el2_flags
>    to 0 to mask EL2 timer for Dom0.
> e) Copy MADT table. According to the value of dom0_max_vcpus to change the
>    number GICC entries.
> f) Create STAO table. This table is a newly added one that's used to
>    define a list of ACPI namespace names that are to be ignored by the
>    OSPM in Dom0. Currently we use it to tell OSPM should ignore UART
>    defined in SPCR table.
> g) Copy XSDT table. Add a new table entry for STAO and change other
>    table's entries.
> h) Change the value of xsdt_physical_address in RSDP table.
> i) The rest of tables are not copied or changed. They are reused including
>    DSDT, SPCR, etc.
>
> All these tables will be copied to Dom0 memory except that the reused
> tables(DSDT, SPCR, etc) will be mapped to Dom0.
>
> 2. Create minimal DT to pass required information to Dom0
> ----------------------------------------------------------
> The minimal DT mainly passes Dom0 bootargs, address and size of initrd (if
> available), address and size of uefi system table, address and size of
> uefi memory table, uefi-mmap-desc-size and uefi-mmap-desc-ver.
>
> An example of the minimal DT:
> / {
>     #address-cells = 2;
>     #size-cells = 1;
>     chosen {
>         bootargs = kernel=Image console=hvc0 earlycon=pl011,0x1c09 root=/dev/vda2 rw rootfstype=ext4 init=/bin/sh acpi=force;
>         linux,initrd-start = 0x;
>         linux,initrd-end = 0x;
>         linux,uefi-system-table = 0x;
>         linux,uefi-mmap-start = 0x;
>         linux,uefi-mmap-size = 0x;
>         linux,uefi-mmap-desc-size = 0x;
>         linux,uefi-mmap-desc-ver = 0x;
>     };
> };
>
> For details look at
> https://github.com/torvalds/linux/blob/master/Documentation/arm/uefi.txt
>
> 3. Dom0 gets grant table and event channel irq information
> -----------------------------------------------------------
> As said above, we assign the hypervisor_id be XenVMM to tell Dom0 that it
> runs on Xen hypervisor.
>
> For grant table, add two new HVM_PARAMs: HVM_PARAM_GNTTAB_START_ADDRESS
> and HVM_PARAM_GNTTAB_SIZE.
>
> For event channel irq, reuse HVM_PARAM_CALLBACK_IRQ and add a new delivery
> type:
> val[63:56] == 3: val[15:8] is flag: val[7:0] is a PPI (ARM and ARM64 only)
>
> When constructing Dom0 in Xen, save these values. Then Dom0 could get them
> through hypercall HVMOP_get_param.
>
> 4. Map MMIO regions
> -------------------
> Register a bus_notifier for platform and amba bus in Linux. Add a new
> XENMAPSPACE XENMAPSPACE_dev_mmio. Within the register, check if the device
> is newly added, then call hypercall XENMEM_add_to_physmap to map the mmio
> regions.
>
> 5. Route device interrupts to Dom0
> ----------------------------------
> Route all the SPI interrupts to Dom0 before Dom0 booting.
>
> Look forward to your comments. If you think it has no problem, giving your
> ack would be a big help. Then the patchset could move on. :)

Indeed.

Acked-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

It would be nice to have an Ack from Jan too. He should be back in a couple
of days.
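As an aside on section 3 of the design doc, the proposed HVM_PARAM_CALLBACK_IRQ delivery type packs three fields into one 64-bit value: the type in val[63:56], a flag byte in val[15:8] and the PPI number in val[7:0]. A minimal sketch of packing and unpacking that layout is below. The function and macro names are invented for the example, and the meaning of the flag byte is left open here, just as it is in the proposal.

```c
#include <assert.h>
#include <stdint.h>

/* Names are illustrative; only the bit layout comes from the design doc. */
#define SKETCH_CALLBACK_TYPE_SHIFT 56
#define SKETCH_CALLBACK_TYPE_PPI   3ULL  /* val[63:56] == 3 */

/* Pack the flag byte and PPI number into the proposed delivery type. */
static uint64_t sketch_callback_encode_ppi(uint8_t flag, uint8_t ppi)
{
    return (SKETCH_CALLBACK_TYPE_PPI << SKETCH_CALLBACK_TYPE_SHIFT) |
           ((uint64_t)flag << 8) |
           (uint64_t)ppi;
}

/*
 * Unpack a value of the proposed type.
 * Returns 0 on success, -1 if val[63:56] is not the PPI delivery type.
 */
static int sketch_callback_decode_ppi(uint64_t val, uint8_t *flag, uint8_t *ppi)
{
    if ( (val >> SKETCH_CALLBACK_TYPE_SHIFT) != SKETCH_CALLBACK_TYPE_PPI )
        return -1;

    *flag = (uint8_t)((val >> 8) & 0xff);
    *ppi = (uint8_t)(val & 0xff);

    return 0;
}
```

Bits 55:16 are simply left as zero in this sketch, since the proposal does not assign them a meaning.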
Re: [Xen-devel] virtio on pv/pvh xen
On Fri, Aug 07, 2015 at 06:01:23PM +0800, Lai Jiangshan wrote: On Wed, Aug 5, 2015 at 11:33 PM, Wei Liu wei.l...@citrix.com wrote: On Wed, Aug 05, 2015 at 11:09:41PM +0800, Lai Jiangshan wrote: Hi, Liu Does pv or pvh guest support virtio devices? No. If yes, how can I configure the guest? If not, how can I make it support? A new transport which makes use of xenbus and grant table needs to be developed. I don't think pvh need these. I think it still needs a new transport. Hi, Mukesh Rathor Does pvh guest support virtio devices? Currently there are three transports available: virtio over pci, virtio over mmio and virtio over channel I/O. They all seem to require some sort of emulation support. PVH doesn't have emulation support, which means you can't use existing virtio transports. If you want to add emulation to PVH, in effect it's just a HVM guest. You can already use virtio over pci emulation on HVM guest today. BTW can I ask why you want virtio support? Wei. Thanks Lai There were talks on standardising virtio and adding Xen PV transport support long time ago but no visible progress was made. Wei. Thanks Lai ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen - version 2
On Fri, Aug 7, 2015 at 12:33 PM, Julien Grall julien.gr...@citrix.com wrote:
> Hi Shannon,
>
> Just some clarification questions.
>
> On 07/08/15 03:11, Shannon Zhao wrote:
>> 3. Dom0 gets grant table and event channel irq information
>> -----------------------------------------------------------
>> As said above, we assign the hypervisor_id be XenVMM to tell Dom0 that it
>> runs on Xen hypervisor.
>>
>> For grant table, add two new HVM_PARAMs: HVM_PARAM_GNTTAB_START_ADDRESS
>> and HVM_PARAM_GNTTAB_SIZE.
>>
>> For event channel irq, reuse HVM_PARAM_CALLBACK_IRQ and add a new
>> delivery type:
>> val[63:56] == 3: val[15:8] is flag: val[7:0] is a PPI (ARM and ARM64
>> only)
>
> Can you describe the content of flag?
>
>> When constructing Dom0 in Xen, save these values. Then Dom0 could get
>> them through hypercall HVMOP_get_param.
>>
>> 4. Map MMIO regions
>> -------------------
>> Register a bus_notifier for platform and amba bus in Linux. Add a new
>> XENMAPSPACE XENMAPSPACE_dev_mmio. Within the register, check if the
>> device is newly added, then call hypercall XENMEM_add_to_physmap to map
>> the mmio regions.
>>
>> 5. Route device interrupts to Dom0
>> ----------------------------------
>> Route all the SPI interrupts to Dom0 before Dom0 booting.
>
> Not all the SPI will be routed to DOM0. Some are used by Xen and should
> never be used by any guest. I have in mind the UART and SMMU interrupts.
>
> You will have to find a way to skip them nicely. Note that not all the
> IRQs used by Xen are properly registered when we build DOM0 (see the
> SMMU).

Just to clarify; Xen should map all SPIs which are not reserved for Xen to
Dom0, right?

-Christoffer
[Xen-devel] [PATCH v4 24/31] libxc: allow creating domains without emulated devices.
Introduce a new flag in xc_dom_image that turns on and off the emulated devices. This prevents creating the VGA hole, the hvm_info page and the ioreq server pages. libxl unconditionally sets it to true for all HVM domains at the moment. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com --- Changes since v3: - Explain the meaning of the emulation xc_dom_image field. --- tools/libxc/include/xc_dom.h | 3 ++ tools/libxc/xc_dom_x86.c | 71 +--- tools/libxl/libxl_dom.c | 1 + 3 files changed, 44 insertions(+), 31 deletions(-) diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h index cda40d9..99225cf 100644 --- a/tools/libxc/include/xc_dom.h +++ b/tools/libxc/include/xc_dom.h @@ -194,6 +194,9 @@ struct xc_dom_image { xen_pfn_t lowmem_end; xen_pfn_t highmem_end; +/* If set disables the setup of the IOREQ pages and the VGA MMIO hole. */ +bool emulation; + /* Extra ACPI tables passed to HVMLOADER */ struct xc_hvm_firmware_module acpi_module; diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c index 18e3340..b587b12 100644 --- a/tools/libxc/xc_dom_x86.c +++ b/tools/libxc/xc_dom_x86.c @@ -522,12 +522,15 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom) xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES]; xc_interface *xch = dom-xch; -if ( (hvm_info_page = xc_map_foreign_range( - xch, domid, PAGE_SIZE, PROT_READ | PROT_WRITE, - HVM_INFO_PFN)) == NULL ) -goto error_out; -build_hvm_info(hvm_info_page, dom); -munmap(hvm_info_page, PAGE_SIZE); +if ( dom-emulation ) +{ +if ( (hvm_info_page = xc_map_foreign_range( + xch, domid, PAGE_SIZE, PROT_READ | PROT_WRITE, + HVM_INFO_PFN)) == NULL ) +goto error_out; +build_hvm_info(hvm_info_page, dom); +munmap(hvm_info_page, PAGE_SIZE); +} /* Allocate and clear special pages. 
*/ for ( i = 0; i NR_SPECIAL_PAGES; i++ ) @@ -559,30 +562,33 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom) xc_hvm_param_set(xch, domid, HVM_PARAM_SHARING_RING_PFN, special_pfn(SPECIALPAGE_SHARING)); -/* - * Allocate and clear additional ioreq server pages. The default - * server will use the IOREQ and BUFIOREQ special pages above. - */ -for ( i = 0; i NR_IOREQ_SERVER_PAGES; i++ ) -ioreq_server_array[i] = ioreq_server_pfn(i); - -rc = xc_domain_populate_physmap_exact(xch, domid, NR_IOREQ_SERVER_PAGES, 0, - 0, ioreq_server_array); -if ( rc != 0 ) +if ( dom-emulation ) { -DOMPRINTF(Could not allocate ioreq server pages.); -goto error_out; -} +/* + * Allocate and clear additional ioreq server pages. The default + * server will use the IOREQ and BUFIOREQ special pages above. + */ +for ( i = 0; i NR_IOREQ_SERVER_PAGES; i++ ) +ioreq_server_array[i] = ioreq_server_pfn(i); -if ( xc_clear_domain_pages(xch, domid, ioreq_server_pfn(0), - NR_IOREQ_SERVER_PAGES) ) +rc = xc_domain_populate_physmap_exact(xch, domid, NR_IOREQ_SERVER_PAGES, 0, + 0, ioreq_server_array); +if ( rc != 0 ) +{ +DOMPRINTF(Could not allocate ioreq server pages.); goto error_out; +} -/* Tell the domain where the pages are and how many there are */ -xc_hvm_param_set(xch, domid, HVM_PARAM_IOREQ_SERVER_PFN, - ioreq_server_pfn(0)); -xc_hvm_param_set(xch, domid, HVM_PARAM_NR_IOREQ_SERVER_PAGES, - NR_IOREQ_SERVER_PAGES); +if ( xc_clear_domain_pages(xch, domid, ioreq_server_pfn(0), + NR_IOREQ_SERVER_PAGES) ) +goto error_out; + +/* Tell the domain where the pages are and how many there are */ +xc_hvm_param_set(xch, domid, HVM_PARAM_IOREQ_SERVER_PFN, + ioreq_server_pfn(0)); +xc_hvm_param_set(xch, domid, HVM_PARAM_NR_IOREQ_SERVER_PAGES, + NR_IOREQ_SERVER_PAGES); +} /* * Identity-map page table is required for running with CR0.PG=0 when @@ -1320,7 +1326,8 @@ static int meminit_hvm(struct xc_dom_image *dom) * allocated is pointless. 
*/ if ( claim_enabled ) { -rc = xc_domain_claim_pages(xch, domid, target_pages - VGA_HOLE_SIZE); +rc = xc_domain_claim_pages(xch, domid, target_pages - + dom-emulation ? VGA_HOLE_SIZE : 0); if ( rc != 0 ) { DOMPRINTF(Could not allocate memory for HVM guest as
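One detail in the xc_domain_claim_pages() hunk may be worth double-checking: in C the conditional operator binds less tightly than binary minus, so target_pages - dom->emulation ? VGA_HOLE_SIZE : 0 groups as (target_pages - dom->emulation) ? VGA_HOLE_SIZE : 0. If the intent is to subtract the hole only when emulation is enabled, the ternary presumably wants its own parentheses. A small standalone sketch of the difference follows. The sketch_* names and the 0x20 hole size are placeholders for the example, not values taken from libxc.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder value for the example; stands in for VGA_HOLE_SIZE. */
#define SKETCH_VGA_HOLE_SIZE 0x20UL

/* How the expression in the hunk groups without extra parentheses. */
static unsigned long sketch_claim_as_parsed(unsigned long target_pages,
                                            bool emulation)
{
    return (target_pages - emulation) ? SKETCH_VGA_HOLE_SIZE : 0;
}

/* Subtract the hole only when emulated devices are present. */
static unsigned long sketch_claim_intended(unsigned long target_pages,
                                           bool emulation)
{
    return target_pages - (emulation ? SKETCH_VGA_HOLE_SIZE : 0);
}
```

With 1000 target pages and emulation enabled, the first form evaluates to the hole size itself (0x20), while the second gives 1000 - 0x20 pages to claim.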
[Xen-devel] [PATCH v4 13/31] xen/x86: allow disabling the emulated local apic
Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Boris Ostrovsky boris.ostrov...@oracle.com Cc: Suravee Suthikulpanit suravee.suthikulpa...@amd.com Cc: Aravind Gopalakrishnan aravind.gopalakrish...@amd.com Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com Cc: Jun Nakajima jun.nakaj...@intel.com Cc: Eddie Dong eddie.d...@intel.com Cc: Kevin Tian kevin.t...@intel.com --- xen/arch/x86/hvm/svm/svm.c | 16 +--- xen/arch/x86/hvm/vlapic.c | 30 +- xen/arch/x86/hvm/vmsi.c | 6 ++ xen/arch/x86/hvm/vmx/vmcs.c | 14 ++ xen/arch/x86/hvm/vmx/vmx.c | 9 - 5 files changed, 62 insertions(+), 13 deletions(-) diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c index 8de41fa..97dc507 100644 --- a/xen/arch/x86/hvm/svm/svm.c +++ b/xen/arch/x86/hvm/svm/svm.c @@ -1035,6 +1035,7 @@ static void noreturn svm_do_resume(struct vcpu *v) struct vmcb_struct *vmcb = v-arch.hvm_svm.vmcb; bool_t debug_state = v-domain-debugger_attached; bool_t vcpu_guestmode = 0; +struct vlapic *vlapic = vcpu_vlapic(v); if ( nestedhvm_enabled(v-domain) nestedhvm_vcpu_in_guestmode(v) ) vcpu_guestmode = 1; @@ -1058,14 +1059,14 @@ static void noreturn svm_do_resume(struct vcpu *v) hvm_asid_flush_vcpu(v); } -if ( !vcpu_guestmode ) +if ( !vcpu_guestmode !vlapic_hw_disabled(vlapic) ) { vintr_t intr; /* Reflect the vlapic's TPR in the hardware vtpr */ intr = vmcb_get_vintr(vmcb); intr.fields.tpr = -(vlapic_get_reg(vcpu_vlapic(v), APIC_TASKPRI) 0xFF) 4; +(vlapic_get_reg(vlapic, APIC_TASKPRI) 0xFF) 4; vmcb_set_vintr(vmcb, intr); } @@ -2294,6 +2295,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs) int inst_len, rc; vintr_t intr; bool_t vcpu_guestmode = 0; +struct vlapic *vlapic = vcpu_vlapic(v); hvm_invalidate_regs_fields(regs); @@ -2311,11 +2313,11 @@ void svm_vmexit_handler(struct cpu_user_regs *regs) * NB. We need to preserve the low bits of the TPR to make checked builds * of Windows work, even though they don't actually do anything. 
*/ -if ( !vcpu_guestmode ) { +if ( !vcpu_guestmode !vlapic_hw_disabled(vlapic) ) { intr = vmcb_get_vintr(vmcb); -vlapic_set_reg(vcpu_vlapic(v), APIC_TASKPRI, +vlapic_set_reg(vlapic, APIC_TASKPRI, ((intr.fields.tpr 0x0F) 4) | - (vlapic_get_reg(vcpu_vlapic(v), APIC_TASKPRI) 0x0F)); + (vlapic_get_reg(vlapic, APIC_TASKPRI) 0x0F)); } exit_reason = vmcb-exitcode; @@ -2697,14 +2699,14 @@ void svm_vmexit_handler(struct cpu_user_regs *regs) } out: -if ( vcpu_guestmode ) +if ( vcpu_guestmode || vlapic_hw_disabled(vlapic) ) /* Don't clobber TPR of the nested guest. */ return; /* The exit may have updated the TPR: reflect this in the hardware vtpr */ intr = vmcb_get_vintr(vmcb); intr.fields.tpr = -(vlapic_get_reg(vcpu_vlapic(v), APIC_TASKPRI) 0xFF) 4; +(vlapic_get_reg(vlapic, APIC_TASKPRI) 0xFF) 4; vmcb_set_vintr(vmcb, intr); } diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c index b893b40..e355679 100644 --- a/xen/arch/x86/hvm/vlapic.c +++ b/xen/arch/x86/hvm/vlapic.c @@ -993,6 +993,9 @@ static void set_x2apic_id(struct vlapic *vlapic) bool_t vlapic_msr_set(struct vlapic *vlapic, uint64_t value) { +if ( !has_vlapic(vlapic_domain(vlapic)) ) +return 0; + if ( (vlapic-hw.apic_base_msr ^ value) MSR_IA32_APICBASE_ENABLE ) { if ( unlikely(value MSR_IA32_APICBASE_EXTD) ) @@ -1042,8 +1045,7 @@ void vlapic_tdt_msr_set(struct vlapic *vlapic, uint64_t value) uint64_t guest_tsc; struct vcpu *v = vlapic_vcpu(vlapic); -/* may need to exclude some other conditions like vlapic-hw.disabled */ -if ( !vlapic_lvtt_tdt(vlapic) ) +if ( !vlapic_lvtt_tdt(vlapic) || vlapic_hw_disabled(vlapic) ) { HVM_DBG_LOG(DBG_LEVEL_VLAPIC_TIMER, ignore tsc deadline msr write); return; @@ -1118,6 +1120,9 @@ static int __vlapic_accept_pic_intr(struct vcpu *v) int vlapic_accept_pic_intr(struct vcpu *v) { +if ( vlapic_hw_disabled(vcpu_vlapic(v)) ) +return 0; + TRACE_2D(TRC_HVM_EMUL_LAPIC_PIC_INTR, (v == v-domain-arch.hvm_domain.i8259_target), v ? 
__vlapic_accept_pic_intr(v) : -1); @@ -1265,6 +1270,9 @@ static int lapic_save_hidden(struct domain *d, hvm_domain_context_t *h) struct vlapic *s; int rc = 0; +if ( !has_vlapic(d) ) +return 0; + for_each_vcpu ( d, v ) { s = vcpu_vlapic(v); @@ -1281,6 +1289,9 @@ static int lapic_save_regs(struct domain *d, hvm_domain_context_t *h) struct vlapic *s; int rc = 0; +if ( !has_vlapic(d) ) +return 0; + for_each_vcpu ( d, v
[Xen-devel] [PATCH v4 19/31] xen/x86: allow disabling the emulated pmu
Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/cpu/vpmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index 8af3df1..d5bb77d 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -439,6 +439,9 @@ void vpmu_initialise(struct vcpu *v)
     int ret;
     bool_t is_priv_vpmu = is_hardware_domain(v->domain);
 
+    if ( !has_vpmu(v->domain) )
+        return;
+
     BUILD_BUG_ON(sizeof(struct xen_pmu_intel_ctxt) > XENPMU_CTXT_PAD_SZ);
     BUILD_BUG_ON(sizeof(struct xen_pmu_amd_ctxt) > XENPMU_CTXT_PAD_SZ);
     BUILD_BUG_ON(sizeof(struct xen_pmu_regs) > XENPMU_REGS_PAD_SZ);
-- 
1.9.5 (Apple Git-50.3)
[Xen-devel] [PATCH v4 11/31] libxc: remove dead HVM building code
Remove xc_hvm_build_x86.c and xc_hvm_build_arm.c since xc_hvm_build is not longer used in order to create HVM guests. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com --- tools/libxc/Makefile | 2 - tools/libxc/include/xenguest.h| 44 --- tools/libxc/xc_hvm_build_arm.c| 48 --- tools/libxc/xc_hvm_build_x86.c| 806 -- tools/libxc/xg_private.c | 9 - tools/python/xen/lowlevel/xc/xc.c | 81 6 files changed, 990 deletions(-) delete mode 100644 tools/libxc/xc_hvm_build_arm.c delete mode 100644 tools/libxc/xc_hvm_build_x86.c diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile index b45380c..efffb8d 100644 --- a/tools/libxc/Makefile +++ b/tools/libxc/Makefile @@ -91,9 +91,7 @@ GUEST_SRCS-y += xc_dom_compat_linux.c GUEST_SRCS-$(CONFIG_X86) += xc_dom_x86.c GUEST_SRCS-$(CONFIG_X86) += xc_cpuid_x86.c -GUEST_SRCS-$(CONFIG_X86) += xc_hvm_build_x86.c GUEST_SRCS-$(CONFIG_ARM) += xc_dom_arm.c -GUEST_SRCS-$(CONFIG_ARM) += xc_hvm_build_arm.c ifeq ($(CONFIG_LIBXC_MINIOS),y) GUEST_SRCS-y += xc_dom_decompress_unsafe.c diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h index 1a1a185..ec67fbd 100644 --- a/tools/libxc/include/xenguest.h +++ b/tools/libxc/include/xenguest.h @@ -205,50 +205,6 @@ struct xc_hvm_firmware_module { uint64_t guest_addr_out; }; -struct xc_hvm_build_args { -uint64_t mem_size; /* Memory size in bytes. */ -uint64_t mem_target; /* Memory target in bytes. */ -uint64_t mmio_size; /* Size of the MMIO hole in bytes. */ -const char *image_file_name; /* File name of the image to load. */ - -/* Extra ACPI tables passed to HVMLOADER */ -struct xc_hvm_firmware_module acpi_module; - -/* Extra SMBIOS structures passed to HVMLOADER */ -struct xc_hvm_firmware_module smbios_module; -/* Whether to use claim hypercall (1 - enable, 0 - disable). 
*/ -int claim_enabled; - -/* vNUMA information*/ -xen_vmemrange_t *vmemranges; -unsigned int nr_vmemranges; -unsigned int *vnode_to_pnode; -unsigned int nr_vnodes; - -/* Out parameters */ -uint64_t lowmem_end; -uint64_t highmem_end; -uint64_t mmio_start; -}; - -/** - * Build a HVM domain. - * @parm xch libxc context handle. - * @parm domiddomain ID for the new domain. - * @parm hvm_args parameters for the new domain. - * - * The memory size and image file parameters are required, the rest - * are optional. - */ -int xc_hvm_build(xc_interface *xch, uint32_t domid, - struct xc_hvm_build_args *hvm_args); - -int xc_hvm_build_target_mem(xc_interface *xch, -uint32_t domid, -int memsize, -int target, -const char *image_name); - /* * Sets *lockfd to -1. * Has deallocated everything even on error. diff --git a/tools/libxc/xc_hvm_build_arm.c b/tools/libxc/xc_hvm_build_arm.c deleted file mode 100644 index 14f7c45..000 --- a/tools/libxc/xc_hvm_build_arm.c +++ /dev/null @@ -1,48 +0,0 @@ -/** - * This library is free software; you can redistribute it and/or - * modify it under the terms of the GNU Lesser General Public - * License as published by the Free Software Foundation; - * version 2.1 of the License. - * - * This library is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public - * License along with this library; If not, see http://www.gnu.org/licenses/. 
- * - * Copyright (c) 2011, Citrix Systems - */ - -#include inttypes.h -#include errno.h -#include xenctrl.h -#include xenguest.h - -int xc_hvm_build(xc_interface *xch, uint32_t domid, - struct xc_hvm_build_args *hvm_args) -{ -errno = ENOSYS; -return -1; -} - -int xc_hvm_build_target_mem(xc_interface *xch, - uint32_t domid, - int memsize, - int target, - const char *image_name) -{ -errno = ENOSYS; -return -1; -} - -/* - * Local variables: - * mode: C - * c-file-style: BSD - * c-basic-offset: 4 - * tab-width: 4 - * indent-tabs-mode: nil - * End: - */ diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c deleted file mode 100644 index 6f79686..000 --- a/tools/libxc/xc_hvm_build_x86.c +++ /dev/null @@ -1,806 +0,0 @@
[Xen-devel] [PATCH v4 30/31] libxc: switch xc_dom_elfloader to be used with HVMlite domains
Allow xc_dom_elfloader to report a guest type as hvm-3.0-x86_32 if it's
running inside of a HVM container and has the PHYS32_ENTRY elfnote set.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
Only xc_dom_elfloader has been switched to support HVMlite, other loaders
should also be switched once we have a HVMlite compatible kernel that uses
them.
---
 tools/libxc/xc_dom_elfloader.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/libxc/xc_dom_elfloader.c b/tools/libxc/xc_dom_elfloader.c
index 66ea9d6..f3a0ed7 100644
--- a/tools/libxc/xc_dom_elfloader.c
+++ b/tools/libxc/xc_dom_elfloader.c
@@ -56,6 +56,10 @@ static char *xc_dom_guest_type(struct xc_dom_image *dom,
 {
     uint64_t machine = elf_uval(elf, elf->ehdr, e_machine);
 
+    if ( dom->container_type == XC_DOM_HVM_CONTAINER &&
+         dom->parms.phys_entry != UNSET_ADDR )
+        return "hvm-3.0-x86_32";
+
     switch ( machine )
     {
     case EM_386:
-- 
1.9.5 (Apple Git-50.3)
[Xen-devel] [PATCH v4 12/31] xen/x86: add bitmap of enabled emulated devices
Introduce a bitmap in x86 xen_arch_domainconfig that allows enabling or disabling specific devices emulated inside of Xen for HVM guests. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- Changes since v3: - Return EOPNOTSUPP instead of ENOPERM if an invalid emulation mask is used. - Fix error messages (prefix them with d%d and use %#x instead of 0x%x). - Clearly state in the public header that emulation_flags should only be used with HVM guests. - Add a XEN_X86 prefix to the emulation flags defines. - Properly parenthese the has_* marcos. --- tools/libxl/libxl_x86.c | 8 ++-- xen/arch/x86/domain.c | 18 ++ xen/include/asm-x86/domain.h | 13 + xen/include/public/arch-x86/xen.h | 21 - 4 files changed, 57 insertions(+), 3 deletions(-) diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c index b887411..270f3cf 100644 --- a/tools/libxl/libxl_x86.c +++ b/tools/libxl/libxl_x86.c @@ -7,8 +7,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, libxl_domain_config *d_config, xc_domain_configuration_t *xc_config) { -/* No specific configuration right now */ - +if (d_config-c_info.type == LIBXL_DOMAIN_TYPE_HVM) +xc_config-emulation_flags = (XEN_X86_EMU_LAPIC | XEN_X86_EMU_HPET | + XEN_X86_EMU_PMTIMER | XEN_X86_EMU_RTC | + XEN_X86_EMU_IOAPIC | XEN_X86_EMU_PIC | + XEN_X86_EMU_PMU | XEN_X86_EMU_VGA | + XEN_X86_EMU_IOMMU); return 0; } diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 045f6ff..c508074 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -533,6 +533,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags, { int i, paging_initialised = 0; int rc = -ENOMEM; +uint32_t emulation_mask; d-arch.s3_integrity = !!(domcr_flags DOMCRF_s3_integrity); @@ -555,6 +556,23 @@ int 
arch_domain_create(struct domain *d, unsigned int domcr_flags, d-domain_id); } +if ( is_hvm_domain(d) ) +{ +emulation_mask = (XEN_X86_EMU_LAPIC | XEN_X86_EMU_HPET | + XEN_X86_EMU_PMTIMER | XEN_X86_EMU_RTC | + XEN_X86_EMU_IOAPIC | XEN_X86_EMU_PIC | + XEN_X86_EMU_PMU | XEN_X86_EMU_VGA | + XEN_X86_EMU_IOMMU); +if ( (config-emulation_flags emulation_mask) != emulation_mask ) +{ +printk(XENLOG_G_ERR d%d: Xen does not allow HVM creation with the + current selection of emulators: %#x.\n, d-domain_id, + config-emulation_flags); +return -EOPNOTSUPP; +} +d-arch.emulation_flags = config-emulation_flags; +} + if ( has_hvm_container_domain(d) ) { d-arch.hvm_domain.hap_enabled = diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h index 0fce09e..2527637 100644 --- a/xen/include/asm-x86/domain.h +++ b/xen/include/asm-x86/domain.h @@ -387,8 +387,21 @@ struct arch_domain bool_t mem_access_emulate_enabled; struct monitor_write_data *event_write_data; + +/* Emulated devices enabled bitmap. 
*/ +uint32_t emulation_flags; } __cacheline_aligned; +#define has_vlapic(d) ((d)-arch.emulation_flags XEN_X86_EMU_LAPIC) +#define has_vhpet(d)((d)-arch.emulation_flags XEN_X86_EMU_HPET) +#define has_vpmtimer(d) ((d)-arch.emulation_flags XEN_X86_EMU_PMTIMER) +#define has_vrtc(d) ((d)-arch.emulation_flags XEN_X86_EMU_RTC) +#define has_vioapic(d) ((d)-arch.emulation_flags XEN_X86_EMU_IOAPIC) +#define has_vpic(d) ((d)-arch.emulation_flags XEN_X86_EMU_PIC) +#define has_vpmu(d) ((d)-arch.emulation_flags XEN_X86_EMU_PMU) +#define has_vvga(d) ((d)-arch.emulation_flags XEN_X86_EMU_VGA) +#define has_viommu(d) ((d)-arch.emulation_flags XEN_X86_EMU_IOMMU) + #define has_arch_pdevs(d)(!list_empty((d)-arch.pdev_list)) #define gdt_ldt_pt_idx(v) \ diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h index 2ecc9c9..98cae41 100644 --- a/xen/include/public/arch-x86/xen.h +++ b/xen/include/public/arch-x86/xen.h @@ -268,7 +268,26 @@ typedef struct arch_shared_info arch_shared_info_t; * XEN_DOMCTL_INTERFACE_VERSION. */ struct xen_arch_domainconfig { -char dummy; +#define _XEN_X86_EMU_LAPIC 0 +#define XEN_X86_EMU_LAPIC (1U_XEN_X86_EMU_LAPIC) +#define _XEN_X86_EMU_HPET 1 +#define
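The bitmap check in arch_domain_create above can be sketched in isolation. This is a minimal illustration, not the Xen implementation: the DEMO_* names, the three-device mask and struct demo_domain are assumptions made up for the example. Only the idea of one bit per emulated device, parenthesised has_* style macros, and rejecting an incomplete mask with EOPNOTSUPP comes from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag names: one bit per emulated device. */
#define DEMO_EMU_LAPIC (1U << 0)
#define DEMO_EMU_HPET  (1U << 1)
#define DEMO_EMU_RTC   (1U << 2)
#define DEMO_EMU_ALL   (DEMO_EMU_LAPIC | DEMO_EMU_HPET | DEMO_EMU_RTC)

/* Hypothetical stand-in for the per-domain state that holds the bitmap. */
struct demo_domain {
    uint32_t emulation_flags;
};

/* Fully parenthesised so the macro composes safely inside larger expressions. */
#define demo_has_vrtc(d) (!!((d)->emulation_flags & DEMO_EMU_RTC))

/*
 * Accept the requested flags only if every known device is enabled.
 * Returns 0 on success, -EOPNOTSUPP otherwise (flags are left untouched).
 */
static int demo_set_emulation_flags(struct demo_domain *d, uint32_t requested)
{
    assert(d != NULL);

    if ( (requested & DEMO_EMU_ALL) != DEMO_EMU_ALL )
        return -EOPNOTSUPP;

    d->emulation_flags = requested;
    return 0;
}
```

With this shape, relaxing the policy later (allowing some devices to be disabled) would only require changing the comparison, while callers keep using the has_* style predicates.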
[Xen-devel] [PATCH v4 17/31] xen/x86: allow disabling the emulated IO APIC
Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/vioapic.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index d348235..30a4a0f 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -424,12 +424,20 @@ void vioapic_update_EOI(struct domain *d, u8 vector)
 static int ioapic_save(struct domain *d, hvm_domain_context_t *h)
 {
     struct hvm_hw_vioapic *s = domain_vioapic(d);
+
+    if ( !has_vioapic(d) )
+        return 0;
+
     return hvm_save_entry(IOAPIC, 0, h, s);
 }
 
 static int ioapic_load(struct domain *d, hvm_domain_context_t *h)
 {
     struct hvm_hw_vioapic *s = domain_vioapic(d);
+
+    if ( !has_vioapic(d) )
+        return 0;
+
     return hvm_load_entry(IOAPIC, h, s);
 }
 
@@ -448,6 +456,9 @@ void vioapic_reset(struct domain *d)
 
 int vioapic_init(struct domain *d)
 {
+    if ( !has_vioapic(d) )
+        return 0;
+
     if ( (d->arch.hvm_domain.vioapic == NULL) &&
          ((d->arch.hvm_domain.vioapic = xmalloc(struct hvm_vioapic)) == NULL) )
         return -ENOMEM;
@@ -462,6 +473,9 @@ int vioapic_init(struct domain *d)
 
 void vioapic_deinit(struct domain *d)
 {
+    if ( !has_vioapic(d) )
+        return;
+
     xfree(d->arch.hvm_domain.vioapic);
     d->arch.hvm_domain.vioapic = NULL;
 }
-- 
1.9.5 (Apple Git-50.3)
[Xen-devel] [PATCH v4 27/31] xenconsole: try to attach to PV console if HVM fails
HVM guests have always used the emulated serial console by default, but if
the emulated serial pty cannot be fetched from xenstore try to use the PV
console instead.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
Changes since v3:
 - Drop the usage of a label and instead use if conditions.
---
 tools/console/client/main.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/tools/console/client/main.c b/tools/console/client/main.c
index f130a60..d006fdc 100644
--- a/tools/console/client/main.c
+++ b/tools/console/client/main.c
@@ -333,7 +333,7 @@ int main(int argc, char **argv)
 		{ 0 },
 	};
 
-	char *dom_path = NULL, *path = NULL;
+	char *dom_path = NULL, *path = NULL, *test = NULL;
 	int spty, xsfd;
 	struct xs_handle *xs;
 	char *end;
@@ -415,9 +415,15 @@ int main(int argc, char **argv)
 	path = malloc(strlen(dom_path) + strlen("/device/console/0/tty") + 5);
 	if (path == NULL)
 		err(ENOMEM, "malloc");
-	if (type == CONSOLE_SERIAL)
+	if (type == CONSOLE_SERIAL) {
 		snprintf(path, strlen(dom_path) + strlen("/serial/0/tty") + 5, "%s/serial/%d/tty", dom_path, num);
-	else {
+		test = xs_read(xs, XBT_NULL, path, NULL);
+		free(test);
+		if (test == NULL)
+			type = CONSOLE_PV;
+	}
+	if (type == CONSOLE_PV) {
+
 		if (num == 0)
 			snprintf(path, strlen(dom_path) + strlen("/console/tty") + 1, "%s/console/tty", dom_path);
 		else
-- 
1.9.5 (Apple Git-50.3)
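The fallback described in this patch (probe the serial node first, switch to the PV console if it is missing) can be sketched without xenstore. Everything prefixed demo_ is hypothetical: the callback stands in for the xenstore read, and the path used for PV consoles other than 0 is an assumption, since that branch is cut off in the diff context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum demo_console_type { DEMO_CONSOLE_SERIAL, DEMO_CONSOLE_PV };

/* Hypothetical stand-in for a xenstore lookup: non-zero if the node exists. */
typedef int (*demo_node_exists_fn)(const char *path);

/* Stub: pretends every node exists. */
static int demo_has_serial(const char *path)
{
    (void)path;
    return 1;
}

/* Stub: pretends no serial node exists. */
static int demo_no_serial(const char *path)
{
    return strstr(path, "/serial/") == NULL;
}

/*
 * Build the tty path into 'path' and report which console type was chosen.
 * The serial node is tried first; if it is absent, fall back to the PV one.
 */
static enum demo_console_type
demo_pick_console(const char *dom_path, int num, demo_node_exists_fn exists,
                  char *path, size_t len)
{
    assert(dom_path != NULL && exists != NULL && path != NULL);

    snprintf(path, len, "%s/serial/%d/tty", dom_path, num);
    if ( exists(path) )
        return DEMO_CONSOLE_SERIAL;

    if ( num == 0 )
        snprintf(path, len, "%s/console/tty", dom_path);
    else /* assumed layout for secondary PV consoles */
        snprintf(path, len, "%s/device/console/%d/tty", dom_path, num);

    return DEMO_CONSOLE_PV;
}
```

Passing the lookup as a function pointer keeps the decision testable without a running xenstore daemon.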
[Xen-devel] [PATCH v4 18/31] xen/x86: allow disabling the emulated PIC
Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/vpic.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vpic.c b/xen/arch/x86/hvm/vpic.c
index 7c2edc8..5938f40 100644
--- a/xen/arch/x86/hvm/vpic.c
+++ b/xen/arch/x86/hvm/vpic.c
@@ -377,6 +377,9 @@ static int vpic_save(struct domain *d, hvm_domain_context_t *h)
     struct hvm_hw_vpic *s;
     int i;
 
+    if ( !has_vpic(d) )
+        return 0;
+
     /* Save the state of both PICs */
     for ( i = 0; i < 2 ; i++ )
     {
@@ -392,7 +395,10 @@ static int vpic_load(struct domain *d, hvm_domain_context_t *h)
 {
     struct hvm_hw_vpic *s;
     uint16_t inst;
-    
+
+    if ( !has_vpic(d) )
+        return 0;
+
     /* Which PIC is this? */
     inst = hvm_load_instance(h);
     if ( inst > 1 )
@@ -425,6 +431,9 @@ void vpic_reset(struct domain *d)
 
 void vpic_init(struct domain *d)
 {
+    if ( !has_vpic(d) )
+        return;
+
     vpic_reset(d);
 
     register_portio_handler(d, 0x20, 2, vpic_intercept_pic_io);
-- 
1.9.5 (Apple Git-50.3)
[Xen-devel] [PATCH v4 29/31] libxc/xen: introduce HVM_PARAM_MODLIST_PFN
This HVM parameter is used to pass a list of loaded modules to the guest. Right now the number of loaded modules is limited to 1 by the current implementation, but this interface allows passing more than one module. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- tools/libxc/xc_dom_x86.c| 20 xen/arch/x86/hvm/hvm.c | 2 ++ xen/include/public/hvm/params.h | 23 ++- 3 files changed, 44 insertions(+), 1 deletion(-) diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c index 369745d..1599de4 100644 --- a/tools/libxc/xc_dom_x86.c +++ b/tools/libxc/xc_dom_x86.c @@ -579,6 +579,26 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom) xc_hvm_param_set(xch, domid, HVM_PARAM_CMDLINE_PFN, cmdline_pfn); } +if ( dom-ramdisk_blob ) +{ +xen_pfn_t modlist_pfn = xc_dom_alloc_page(dom, module list); +uint64_t *modlist = xc_map_foreign_range(xch, domid, PAGE_SIZE, + PROT_READ | PROT_WRITE, + modlist_pfn); +if ( modlist == NULL ) { +DOMPRINTF(Unable to map module list page); +goto error_out; +} + +/* This is currently limited to only one module. 
*/ +modlist[0] = dom-ramdisk_seg.vstart - dom-parms.virt_base; +modlist[1] = dom-ramdisk_seg.vend - dom-ramdisk_seg.vstart; +modlist[2] = 0; +modlist[3] = 0; +munmap(modlist, PAGE_SIZE); +xc_hvm_param_set(xch, domid, HVM_PARAM_MODLIST_PFN, modlist_pfn); +} + if ( dom-emulation ) { /* diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 615ea30..f2223ea 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -5867,6 +5867,7 @@ static int hvm_allow_set_param(struct domain *d, case HVM_PARAM_STORE_EVTCHN: case HVM_PARAM_CONSOLE_EVTCHN: case HVM_PARAM_CMDLINE_PFN: +case HVM_PARAM_MODLIST_PFN: break; /* * The following parameters must not be set by the guest @@ -6099,6 +6100,7 @@ static int hvm_allow_get_param(struct domain *d, case HVM_PARAM_CONSOLE_EVTCHN: case HVM_PARAM_ALTP2M: case HVM_PARAM_CMDLINE_PFN: +case HVM_PARAM_MODLIST_PFN: break; /* * The following parameters must not be read by the guest diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h index aa926d4..96f944e 100644 --- a/xen/include/public/hvm/params.h +++ b/xen/include/public/hvm/params.h @@ -193,6 +193,27 @@ /* PFN of the command line. */ #define HVM_PARAM_CMDLINE_PFN 36 -#define HVM_NR_PARAMS 37 +/* + * List of modules passed to the kernel. + * + * The PFN returned by this HVM_PARAM points to a page that contains an + * array of unsigned 64bit integers encoded in little endian. + * + * The first integer contains the address where the module has been loaded, + * while the second contains the size of the module in bytes. The last element + * in the array is a module with address 0 and length 0: + * + * module[0] = address of 1st module + * module[1] = size of 1st module + * [...] + * module[N/2] = address of module N + * module[N/2+1] = size of module N + * [...] 
+ *     module[M] = 0
+ *     module[M+1] = 0
+ */
+#define HVM_PARAM_MODLIST_PFN 37
+
+#define HVM_NR_PARAMS 38
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.9.5 (Apple Git-50.3)
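A consumer of the module list page described in the header comment could walk it roughly as follows. This is a sketch under assumptions: the function name is hypothetical, the page is taken to be already mapped as an array of 64-bit words, and the host is taken to be little endian so no byte swapping is shown. Even slots hold addresses, odd slots hold sizes, and a (0, 0) pair ends the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count the (address, size) pairs in a module list.
 * Stops at the terminating (0, 0) pair, or when fewer than two words
 * remain within max_words, so a missing terminator cannot overrun.
 */
static size_t demo_count_modules(const uint64_t *modlist, size_t max_words)
{
    size_t n = 0;

    assert(modlist != NULL);

    while ( 2 * n + 1 < max_words )
    {
        uint64_t addr = modlist[2 * n];
        uint64_t size = modlist[2 * n + 1];

        if ( addr == 0 && size == 0 )
            break;
        n++;
    }

    return n;
}
```

For the single-module layout written by alloc_magic_pages_hvm (address, size, 0, 0), this returns 1.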
Re: [Xen-devel] Design doc of adding ACPI support for arm64 on Xen - version 2
On Fri, 7 Aug 2015, Christoffer Dall wrote:
> On Fri, Aug 7, 2015 at 12:33 PM, Julien Grall julien.gr...@citrix.com wrote:
>> Hi Shannon,
>>
>> Just some clarification questions.
>>
>> On 07/08/15 03:11, Shannon Zhao wrote:
>>> 3. Dom0 gets grant table and event channel irq information
>>> ---
>>> As said above, we assign the hypervisor_id be XenVMM to tell Dom0 that
>>> it runs on Xen hypervisor.
>>>
>>> For grant table, add two new HVM_PARAMs: HVM_PARAM_GNTTAB_START_ADDRESS
>>> and HVM_PARAM_GNTTAB_SIZE.
>>>
>>> For event channel irq, reuse HVM_PARAM_CALLBACK_IRQ and add a new
>>> delivery type:
>>> val[63:56] == 3: val[15:8] is flag: val[7:0] is a PPI (ARM and ARM64
>>> only)
>>
>> Can you describe the content of flag?
>>
>>> When constructing Dom0 in Xen, save these values. Then Dom0 could get
>>> them through hypercall HVMOP_get_param.
>>>
>>> 4. Map MMIO regions
>>> ---
>>> Register a bus_notifier for platform and amba bus in Linux. Add a new
>>> XENMAPSPACE XENMAPSPACE_dev_mmio. Within the register, check if the
>>> device is newly added, then call hypercall XENMEM_add_to_physmap to map
>>> the mmio regions.
>>>
>>> 5. Route device interrupts to Dom0
>>> ---
>>> Route all the SPI interrupts to Dom0 before Dom0 booting.
>>
>> Not all the SPI will be routed to DOM0. Some are used by Xen and should
>> never be used by any guest. I have in mind the UART and SMMU interrupts.
>>
>> You will have to find a way to skip them nicely. Note that not all the
>> IRQs used by Xen are properly registered when we build DOM0 (see the
>> SMMU).
>
> Just to clarify; Xen should map all SPIs which are not reserved for Xen
> to Dom0, right?

Yes, all SPIs that Xen does not use for itself.
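The proposed HVM_PARAM_CALLBACK_IRQ layout quoted above (delivery type 3 in val[63:56], a flag in val[15:8], the PPI number in val[7:0]) amounts to simple bit packing. The sketch below only illustrates that layout; the function and macro names are hypothetical, and the meaning of the flag byte is left open because the thread itself has not settled it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Delivery type carried in val[63:56] according to the design document. */
#define DEMO_CALLBACK_TYPE_PPI 3ULL

/* Pack flag into val[15:8] and the PPI number into val[7:0]. */
static uint64_t demo_encode_callback_ppi(uint8_t flag, uint8_t ppi)
{
    return (DEMO_CALLBACK_TYPE_PPI << 56) | ((uint64_t)flag << 8) | ppi;
}

/* Unpack a value; returns 0 on success, -1 if the delivery type differs. */
static int demo_decode_callback_ppi(uint64_t val, uint8_t *flag, uint8_t *ppi)
{
    assert(flag != NULL && ppi != NULL);

    if ( (val >> 56) != DEMO_CALLBACK_TYPE_PPI )
        return -1;

    *flag = (val >> 8) & 0xff;
    *ppi = val & 0xff;
    return 0;
}
```

Bits [55:16] are left as zero here, which is an assumption; the design document does not say whether they are reserved.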
[Xen-devel] [PATCH v4 16/31] xen/x86: allow disabling the emulated RTC
Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/rtc.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/xen/arch/x86/hvm/rtc.c b/xen/arch/x86/hvm/rtc.c
index a9efeaf..bc93f66 100644
--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -726,6 +726,9 @@ void rtc_migrate_timers(struct vcpu *v)
 {
     RTCState *s = vcpu_vrtc(v);
 
+    if ( !has_vrtc(v->domain) )
+        return;
+
     if ( v->vcpu_id == 0 )
     {
         migrate_timer(&s->update_timer, v->processor);;
@@ -739,6 +742,10 @@ static int rtc_save(struct domain *d, hvm_domain_context_t *h)
 {
     RTCState *s = domain_vrtc(d);
     int rc;
+
+    if ( !has_vrtc(d) )
+        return 0;
+
     spin_lock(&s->lock);
     rc = hvm_save_entry(RTC, 0, h, &s->hw);
     spin_unlock(&s->lock);
@@ -750,6 +757,9 @@ static int rtc_load(struct domain *d, hvm_domain_context_t *h)
 {
     RTCState *s = domain_vrtc(d);
 
+    if ( !has_vrtc(d) )
+        return 0;
+
     spin_lock(&s->lock);
 
     /* Restore the registers */
@@ -790,6 +800,9 @@ void rtc_init(struct domain *d)
 {
     RTCState *s = domain_vrtc(d);
 
+    if ( !has_vrtc(d) )
+        return;
+
     spin_lock_init(&s->lock);
 
     init_timer(&s->update_timer, rtc_update_timer, s, smp_processor_id());
@@ -820,6 +833,9 @@ void rtc_deinit(struct domain *d)
 {
     RTCState *s = domain_vrtc(d);
 
+    if ( !has_vrtc(d) )
+        return;
+
     spin_barrier(&s->lock);
 
     TRACE_0D(TRC_HVM_EMUL_RTC_STOP_TIMER);
-- 
1.9.5 (Apple Git-50.3)
[Xen-devel] [PATCH v4 28/31] libxc/xen: introduce HVM_PARAM_CMDLINE_PFN
This HVM parameter returns a PFN that contains the address of the memory page where the guest command line has been placed. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- tools/libxc/xc_dom_x86.c| 17 + xen/arch/x86/hvm/hvm.c | 2 ++ xen/include/public/hvm/params.h | 5 - 3 files changed, 23 insertions(+), 1 deletion(-) diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c index 87bce6e..369745d 100644 --- a/tools/libxc/xc_dom_x86.c +++ b/tools/libxc/xc_dom_x86.c @@ -562,6 +562,23 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom) xc_hvm_param_set(xch, domid, HVM_PARAM_SHARING_RING_PFN, special_pfn(SPECIALPAGE_SHARING)); +if ( dom-cmdline ) +{ +xen_pfn_t cmdline_pfn = xc_dom_alloc_page(dom, command line); +char *cmdline = xc_map_foreign_range(xch, domid, PAGE_SIZE, + PROT_READ | PROT_WRITE, + cmdline_pfn); +if ( cmdline == NULL ) { +DOMPRINTF(Unable to map command line page); +goto error_out; +} + +strncpy(cmdline, dom-cmdline, MAX_GUEST_CMDLINE); +cmdline[MAX_GUEST_CMDLINE - 1] = '\0'; +munmap(cmdline, PAGE_SIZE); +xc_hvm_param_set(xch, domid, HVM_PARAM_CMDLINE_PFN, cmdline_pfn); +} + if ( dom-emulation ) { /* diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 15ea5e2..615ea30 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -5866,6 +5866,7 @@ static int hvm_allow_set_param(struct domain *d, case HVM_PARAM_VM_GENERATION_ID_ADDR: case HVM_PARAM_STORE_EVTCHN: case HVM_PARAM_CONSOLE_EVTCHN: +case HVM_PARAM_CMDLINE_PFN: break; /* * The following parameters must not be set by the guest @@ -6097,6 +6098,7 @@ static int hvm_allow_get_param(struct domain *d, case HVM_PARAM_CONSOLE_PFN: case HVM_PARAM_CONSOLE_EVTCHN: case HVM_PARAM_ALTP2M: +case HVM_PARAM_CMDLINE_PFN: 
break; /* * The following parameters must not be read by the guest diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h index 147d9b8..aa926d4 100644 --- a/xen/include/public/hvm/params.h +++ b/xen/include/public/hvm/params.h @@ -190,6 +190,9 @@ /* Boolean: Enable altp2m */ #define HVM_PARAM_ALTP2M 35 -#define HVM_NR_PARAMS 36 +/* PFN of the command line. */ +#define HVM_PARAM_CMDLINE_PFN 36 + +#define HVM_NR_PARAMS 37 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */ -- 1.9.5 (Apple Git-50.3) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
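The command line copy in alloc_magic_pages_hvm relies on a common idiom: strncpy() does not terminate the destination when the source fills it, so the last byte is set explicitly afterwards. A minimal sketch of that idiom follows; DEMO_MAX_CMDLINE and the function name are hypothetical, and the limit is kept small only so that truncation is easy to see.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit, standing in for MAX_GUEST_CMDLINE in the patch. */
#define DEMO_MAX_CMDLINE 16

/*
 * Copy src into a buffer of DEMO_MAX_CMDLINE bytes.
 * Over-long input is truncated, and the result is always NUL terminated.
 */
static void demo_copy_cmdline(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    strncpy(dst, src, DEMO_MAX_CMDLINE);
    dst[DEMO_MAX_CMDLINE - 1] = '\0';
}
```

Silent truncation may or may not be acceptable for a guest command line; a caller that cares could compare strlen(src) against the limit before copying.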
Re: [Xen-devel] [PATCH v4 01/31] libxl: fix libxl__build_hvm error handling
On 07/08/15 at 12:49, Wei Liu wrote:
> On Fri, Aug 07, 2015 at 12:17:38PM +0200, Roger Pau Monne wrote:
>> With the current code in libxl__build_hvm it is possible for the
>> function to fail and still return 0.
>
> It's hard to see where the bug is when this patch also does a bunch of
> refactoring.

It refactors the error paths only, mainly replacing:

    if (libxl_call_foo(bar))
        error

to

    rc = libxl_call_foo(bar)
    if (rc != 0)
        error

So we can keep the error codes returned by auxiliary functions.

> It would be good if you can separate the bug fix from other name changing
> bits, so that we can apply that bug fix for 4.6 possibly queue it up for
> backporting.

There are no name changing bits AFAICT.

Roger.
Re: [Xen-devel] [PATCH v4 01/31] libxl: fix libxl__build_hvm error handling
On Fri, Aug 07, 2015 at 12:55:21PM +0200, Roger Pau Monné wrote:
> On 07/08/15 at 12:49, Wei Liu wrote:
>> On Fri, Aug 07, 2015 at 12:17:38PM +0200, Roger Pau Monne wrote:
>>> With the current code in libxl__build_hvm it is possible for the
>>> function to fail and still return 0.

I care about this bit, which states clearly there is a bug that needs
fixing.

>> It's hard to see where the bug is when this patch also does a bunch of
>> refactoring.
>
> It refactors the error paths only, mainly replacing:
>
>     if (libxl_call_foo(bar))
>         error
>
> to
>
>     rc = libxl_call_foo(bar)
>     if (rc != 0)
>         error

But this suggests there is no bug?

> So we can keep the error codes returned by auxiliary functions.
>
>> It would be good if you can separate the bug fix from other name
>> changing bits, so that we can apply that bug fix for 4.6 possibly queue
>> it up for backporting.
>
> There are no name changing bits AFAICT.

Changing ret to rc is a name change to me. It's a good thing to do to
comply with coding style, but mixing this with the bug fix makes it hard to
backport the fix itself.

Wei.

> Roger.
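The two error-path shapes under discussion can be compared side by side. This sketch is not libxl code: demo_step is a hypothetical helper standing in for an auxiliary call, and plain negative errno values are assumed instead of libxl's own error constants.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical auxiliary call: returns 0 or a negative error code. */
static int demo_step(int fail_with)
{
    assert(fail_with <= 0);
    return fail_with;
}

/* First shape: every failure collapses to -1, so the cause is lost. */
static int demo_build_lossy(int fail_with)
{
    if ( demo_step(fail_with) )
        return -1;

    return 0;
}

/* Second shape: the helper's own code is kept in rc and propagated. */
static int demo_build_keep_rc(int fail_with)
{
    int rc;

    rc = demo_step(fail_with);
    if ( rc != 0 )
        goto out;

    /* further steps would follow the same rc = ...; if (rc) goto out; form */

out:
    return rc;
}
```

The second form also makes it harder to fall through to a "return 0" after a failed call, since the value returned is whatever the last call produced.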
Re: [Xen-devel] [PATCH v4 08/31] libxc: rework BSP initialization
On Fri, Aug 07, 2015 at 12:17:45PM +0200, Roger Pau Monne wrote: Place the calls to xc_vcpu_setcontext and the allocation of the hypercall buffer into the arch-specific vcpu hooks. This is needed for the next patch, so x86 HVM guests can initialize the BSP using XEN_DOMCTL_sethvmcontext instead of XEN_DOMCTL_setvcpucontext. This patch should not introduce any functional change. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com --- tools/libxc/include/xc_dom.h | 2 +- tools/libxc/xc_dom_arm.c | 22 +- tools/libxc/xc_dom_boot.c| 23 +-- tools/libxc/xc_dom_x86.c | 26 -- 4 files changed, 39 insertions(+), 34 deletions(-) diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h index 5c1bb0f..0245d24 100644 --- a/tools/libxc/include/xc_dom.h +++ b/tools/libxc/include/xc_dom.h @@ -221,7 +221,7 @@ struct xc_dom_arch { /* arch-specific data structs setup */ int (*start_info) (struct xc_dom_image * dom); int (*shared_info) (struct xc_dom_image * dom, void *shared_info); -int (*vcpu) (struct xc_dom_image * dom, void *vcpu_ctxt); +int (*vcpu) (struct xc_dom_image * dom); int (*bootearly) (struct xc_dom_image * dom); int (*bootlate) (struct xc_dom_image * dom); diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c index 7548dae..8865097 100644 --- a/tools/libxc/xc_dom_arm.c +++ b/tools/libxc/xc_dom_arm.c @@ -119,9 +119,10 @@ static int shared_info_arm(struct xc_dom_image *dom, void *ptr) /* */ -static int vcpu_arm32(struct xc_dom_image *dom, void *ptr) +static int vcpu_arm32(struct xc_dom_image *dom) { -vcpu_guest_context_t *ctxt = ptr; +vcpu_guest_context_any_t any_ctx; +vcpu_guest_context_t *ctxt = any_ctx.c; I think you still need to allocate hypercall safe buffer here, as well as in other vcpu_* functions. 
DOMPRINTF_CALLED(dom-xch); @@ -154,12 +155,18 @@ static int vcpu_arm32(struct xc_dom_image *dom, void *ptr) DOMPRINTF(Initial state CPSR %#PRIx32 PC %#PRIx32, ctxt-user_regs.cpsr, ctxt-user_regs.pc32); -return 0; +rc = xc_vcpu_setcontext(dom-xch, dom-guest_domid, 0, any_ctx); +if ( rc != 0 ) +xc_dom_panic(dom-xch, XC_INTERNAL_ERROR, + %s: SETVCPUCONTEXT failed (rc=%d), __func__, rc); + +return rc; } [...] -ctxt = xc_hypercall_buffer_alloc(dom-xch, ctxt, sizeof(*ctxt)); -if ( ctxt == NULL ) -return -1; - DOMPRINTF_CALLED(dom-xch); /* misc stuff*/ @@ -259,13 +241,10 @@ int xc_dom_boot_image(struct xc_dom_image *dom) return rc; /* let the vm run */ -memset(ctxt, 0, sizeof(*ctxt)); -if ( (rc = dom-arch_hooks-vcpu(dom, ctxt)) != 0 ) +if ( (rc = dom-arch_hooks-vcpu(dom)) != 0 ) return rc; xc_dom_unmap_all(dom); This is not your problem, but this xc_dom_unmap_all is really suspicious. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Fw: drivers/xen/pci.c:31:25: fatal error: asm/pci_x86.h: No such file or directory
On Thu, 6 Aug 2015, Boris Ostrovsky wrote: On 08/06/2015 01:04 PM, Robert Richter wrote: Boris, we are working on acpi pci support for arm64. For this we are enabling PCI_MMCONFIG on arm64 which breaks compiling drivers/xen/pci.c. Looking into it there is x86 code in generic driver code introduced with: 8deb3eb1461e xen/mcfg: Call PHYSDEVOP_pci_mmcfg_reserved for MCFG areas. This implements: diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c index 18fff88254eb..d15f6e80479f 100644 --- a/drivers/xen/pci.c +++ b/drivers/xen/pci.c @@ -26,6 +26,7 @@ #include asm/xen/hypervisor.h #include asm/xen/hypercall.h #include ../pci/pci.h +#include asm/pci_x86.h static bool __read_mostly pci_seg_supported = true; @@ -192,3 +193,49 @@ static int __init register_xen_pci_notifier(void) } arch_initcall(register_xen_pci_notifier); + +#ifdef CONFIG_PCI_MMCONFIG +static int __init xen_mcfg_late(void) +{ + struct pci_mmcfg_region *cfg; + int rc; + + if (!xen_initial_domain()) + return 0; + + if ((pci_probe PCI_PROBE_MMCONF) == 0) + return 0; [...] There are no defines for pci_probe and PCI_PROBE_MMCONF other than for the x86 architecture. I see several ways to fix that: * moving code to arch/x86/pci/xen.c, If this routine is not going to be used by ARM I think this would be the way to go. Stefano --- will this be needed by ARM? Yes, I think it is most probable that this code will be needed on ARM, even though we are still at the design doc stage for Xen and ACPI on ARM at the moment. Besides I think you should be able to compile this function on arm64 just fine. The hypercall should build without issues. If the problem is just this one check, we should be able to rework it somehow. Is there something equivalent to pci_probe on arm? If not, we could just ifdef the check for x86. * make code dependent on arch X86, * make 'if (pci_probe PCI_PROBE_MMCONF) ...' an arch function to be implemented by archs, Otherwise I'd go with this. 
Or this of course -boris * reworking the code by removing the check (not sure if that could be done). I don't think PCI_XEN is enabled yet for arm64, so we just disable it for !X86 or move it to arch/x86. I suggest the latter. Is there any preference you have?
Re: [Xen-devel] [RFC 1/4] HVM x86 deprivileged mode: Page allocation helper
On 06/08/15 20:22, Andrew Cooper wrote:
> On 06/08/15 17:45, Ben Catterall wrote:
>> This allocation function is used by the deprivileged mode initialisation
>> code to allocate pages for the new page table mappings and page frames
>> on the HAP page heap.
>>
>> Signed-off-by: Ben Catterall ben.catter...@citrix.com
>
> This is fine for your test box, but isn't fine for systems out there
> without hardware EPT/NPT support. For older systems like that (or in
> certain specific workloads), shadow paging is used instead.
>
> This feature is applicable to any HVM domain, which means that it
> shouldn't depend on HAP or shadow paging.
>
> How much memory is allocated for the depriv area, and what exactly is
> allocated in total?

So, per-vcpu:
- a user mode stack which, from your comments in [RFC 2/4], can be 2 pages
- local data (may or may not be needed, depends on the device) which will
  be around a page or two.

Text segment: as per your comments in RFC 2/4, this will be changed to be
an alias so no extra memory.

> I expect it isn't very much, and would suggest using
> d->arch.paging.alloc_page() instead (which is the generic "get me some
> memory accounted against the domain" helper) which looks as if it should
> suffice. Depending on exactly how much memory is needed, you might need
> to bump the default minimum shadow pool size.
>
> ~Andrew

Ok, will do. That will also solve the EPT/NPT problem, thanks!

Ben
[Xen-devel] [PATCH v4 07/31] libxc: make arch_setup_boot{init/late} xc_dom_arch hooks
This should not introduce any functional change. Signed-off-by: Roger Pau Monné roger@citrix.com Reviewed-by: Andrew Cooper andrew.coo...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com --- Changes since v3: - Add Andrew Cooper Reviewed-by. - Add Wei Acked-by. --- tools/libxc/include/xc_dom.h | 7 ++- tools/libxc/xc_dom_arm.c | 20 +--- tools/libxc/xc_dom_boot.c| 4 ++-- tools/libxc/xc_dom_x86.c | 10 -- 4 files changed, 25 insertions(+), 16 deletions(-) diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h index c4b994f..5c1bb0f 100644 --- a/tools/libxc/include/xc_dom.h +++ b/tools/libxc/include/xc_dom.h @@ -222,6 +222,8 @@ struct xc_dom_arch { int (*start_info) (struct xc_dom_image * dom); int (*shared_info) (struct xc_dom_image * dom, void *shared_info); int (*vcpu) (struct xc_dom_image * dom, void *vcpu_ctxt); +int (*bootearly) (struct xc_dom_image * dom); +int (*bootlate) (struct xc_dom_image * dom); /* arch-specific memory initialization. 
*/ int (*meminit) (struct xc_dom_image * dom); @@ -401,11 +403,6 @@ static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn) return dom-p2m_host[pfn - dom-rambase_pfn]; } -/* --- arch bits --- */ - -int arch_setup_bootearly(struct xc_dom_image *dom); -int arch_setup_bootlate(struct xc_dom_image *dom); - /* * Local variables: * mode: C diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c index 24776ba..7548dae 100644 --- a/tools/libxc/xc_dom_arm.c +++ b/tools/libxc/xc_dom_arm.c @@ -489,13 +489,20 @@ static int meminit(struct xc_dom_image *dom) return 0; } -int arch_setup_bootearly(struct xc_dom_image *dom) +int xc_dom_feature_translated(struct xc_dom_image *dom) +{ +return 1; +} + +/* */ + +static int bootearly(struct xc_dom_image *dom) { DOMPRINTF(%s: doing nothing, __FUNCTION__); return 0; } -int arch_setup_bootlate(struct xc_dom_image *dom) +static int bootlate(struct xc_dom_image *dom) { /* XXX * map shared info @@ -505,11 +512,6 @@ int arch_setup_bootlate(struct xc_dom_image *dom) return 0; } -int xc_dom_feature_translated(struct xc_dom_image *dom) -{ -return 1; -} - /* */ static struct xc_dom_arch xc_dom_32 = { @@ -524,6 +526,8 @@ static struct xc_dom_arch xc_dom_32 = { .shared_info = shared_info_arm, .vcpu = vcpu_arm32, .meminit = meminit, +.bootearly = bootearly, +.bootlate = bootlate, }; static struct xc_dom_arch xc_dom_64 = { @@ -538,6 +542,8 @@ static struct xc_dom_arch xc_dom_64 = { .shared_info = shared_info_arm, .vcpu = vcpu_arm64, .meminit = meminit, +.bootearly = bootearly, +.bootlate = bootlate, }; static void __init register_arch_hooks(void) diff --git a/tools/libxc/xc_dom_boot.c b/tools/libxc/xc_dom_boot.c index bf2cd7b..e6f7794 100644 --- a/tools/libxc/xc_dom_boot.c +++ b/tools/libxc/xc_dom_boot.c @@ -208,7 +208,7 @@ int xc_dom_boot_image(struct xc_dom_image *dom) DOMPRINTF_CALLED(dom-xch); /* misc stuff*/ -if ( (rc = arch_setup_bootearly(dom)) != 0 ) +if ( (rc = dom-arch_hooks-bootearly(dom)) != 0 ) return rc; /* 
collect some info */ @@ -255,7 +255,7 @@ int xc_dom_boot_image(struct xc_dom_image *dom) xc_dom_log_memory_footprint(dom); /* misc x86 stuff */ -if ( (rc = arch_setup_bootlate(dom)) != 0 ) +if ( (rc = dom-arch_hooks-bootlate(dom)) != 0 ) return rc; /* let the vm run */ diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c index 07170c1..0f49e27 100644 --- a/tools/libxc/xc_dom_x86.c +++ b/tools/libxc/xc_dom_x86.c @@ -924,7 +924,9 @@ static int meminit_pv(struct xc_dom_image *dom) return rc; } -int arch_setup_bootearly(struct xc_dom_image *dom) +/* */ + +static int bootearly(struct xc_dom_image *dom) { DOMPRINTF(%s: doing nothing, __FUNCTION__); return 0; @@ -963,7 +965,7 @@ static int map_grant_table_frames(struct xc_dom_image *dom) return 0; } -int arch_setup_bootlate(struct xc_dom_image *dom) +static int bootlate_pv(struct xc_dom_image *dom) { static const struct { char *guest; @@ -1059,6 +1061,8 @@ static struct xc_dom_arch xc_dom_32_pae = { .shared_info = shared_info_x86_32, .vcpu = vcpu_x86_32, .meminit = meminit_pv, +.bootearly = bootearly, +.bootlate = bootlate_pv, }; static struct xc_dom_arch xc_dom_64 = { @@ -1073,6 +1077,8 @@ static struct xc_dom_arch xc_dom_64 = { .shared_info =
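A minimal sketch of the dispatch pattern this patch introduces: generic boot code calls bootearly/bootlate through a per-architecture hook table instead of global arch_setup_* symbols. The struct and function names below (dom_image, dom_arch, boot_image, arch_pv) are cut-down, hypothetical stand-ins and not the real libxc definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for struct xc_dom_image / xc_dom_arch. */
struct dom_image;

struct dom_arch {
    int (*bootearly)(struct dom_image *dom);
    int (*bootlate)(struct dom_image *dom);
};

struct dom_image {
    const struct dom_arch *arch_hooks;
    int late_done;              /* set by the bootlate hook in this sketch */
};

/* Arch-private hooks: static, so they no longer pollute the global namespace. */
static int bootearly_nop(struct dom_image *dom)
{
    (void)dom;                  /* nothing to do early on this "arch" */
    return 0;
}

static int bootlate_pv(struct dom_image *dom)
{
    dom->late_done = 1;
    return 0;
}

const struct dom_arch arch_pv = {
    .bootearly = bootearly_nop,
    .bootlate  = bootlate_pv,
};

/* Generic code: dispatch through the hook table, propagating any error. */
int boot_image(struct dom_image *dom)
{
    int rc;

    assert(dom->arch_hooks != NULL);

    if ( (rc = dom->arch_hooks->bootearly(dom)) != 0 )
        return rc;

    /* ... architecture-independent setup would happen here ... */

    if ( (rc = dom->arch_hooks->bootlate(dom)) != 0 )
        return rc;

    return 0;
}
```

With this shape, a different guest type can presumably reuse the generic path just by registering a different hook table.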
[Xen-devel] [PATCH v4 03/31] libxc: unify xc_dom_p2m_{host/guest}
Unify both functions into xc_dom_p2m. Should not introduce any functional change. Signed-off-by: Roger Pau Monné roger@citrix.com Reviewed-by: Andrew Cooper andrew.coop...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com Cc: Samuel Thibault samuel.thiba...@ens-lyon.org --- Changes since v3: - Add Andrew Cooper Reviewed-by. - Add Wei Acked-by. --- stubdom/grub/kexec.c | 4 ++-- tools/libxc/include/xc_dom.h | 14 ++ tools/libxc/xc_dom_boot.c | 10 +- tools/libxc/xc_dom_compat_linux.c | 4 ++-- tools/libxc/xc_dom_x86.c | 32 tools/libxl/libxl_dom.c | 4 ++-- 6 files changed, 29 insertions(+), 39 deletions(-) diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c index 4c33b25..0b2f4f3 100644 --- a/stubdom/grub/kexec.c +++ b/stubdom/grub/kexec.c @@ -358,9 +358,9 @@ void kexec(void *kernel, long kernel_size, void *module, long module_size, char #ifdef __x86_64__ MMUEXT_PIN_L4_TABLE, #endif -xc_dom_p2m_host(dom, dom-pgtables_seg.pfn), +xc_dom_p2m(dom, dom-pgtables_seg.pfn), dom-guest_domid)) != 0 ) { -grub_printf(pin_table(%lx) returned %d\n, xc_dom_p2m_host(dom, +grub_printf(pin_table(%lx) returned %d\n, xc_dom_p2m(dom, dom-pgtables_seg.pfn), rc); errnum = ERR_BOOT_FAILURE; goto out_remap; diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h index 600aef6..9cf13e2 100644 --- a/tools/libxc/include/xc_dom.h +++ b/tools/libxc/include/xc_dom.h @@ -375,19 +375,9 @@ static inline void *xc_dom_vaddr_to_ptr(struct xc_dom_image *dom, return ptr + offset; } -static inline xen_pfn_t xc_dom_p2m_host(struct xc_dom_image *dom, xen_pfn_t pfn) +static inline xen_pfn_t xc_dom_p2m(struct xc_dom_image *dom, xen_pfn_t pfn) { -if (dom-shadow_enabled) -return pfn; -if (pfn dom-rambase_pfn || pfn = dom-rambase_pfn + dom-total_pages) -return INVALID_MFN; -return dom-p2m_host[pfn - dom-rambase_pfn]; -} - 
-static inline xen_pfn_t xc_dom_p2m_guest(struct xc_dom_image *dom, - xen_pfn_t pfn) -{ -if (xc_dom_feature_translated(dom)) +if ( dom-shadow_enabled || xc_dom_feature_translated(dom) ) return pfn; if (pfn dom-rambase_pfn || pfn = dom-rambase_pfn + dom-total_pages) return INVALID_MFN; diff --git a/tools/libxc/xc_dom_boot.c b/tools/libxc/xc_dom_boot.c index 8e06406..7c30f96 100644 --- a/tools/libxc/xc_dom_boot.c +++ b/tools/libxc/xc_dom_boot.c @@ -53,7 +53,7 @@ static int setup_hypercall_page(struct xc_dom_image *dom) dom-parms.virt_hypercall, pfn); domctl.cmd = XEN_DOMCTL_hypercall_init; domctl.domain = dom-guest_domid; -domctl.u.hypercall_init.gmfn = xc_dom_p2m_guest(dom, pfn); +domctl.u.hypercall_init.gmfn = xc_dom_p2m(dom, pfn); rc = do_domctl(dom-xch, domctl); if ( rc != 0 ) xc_dom_panic(dom-xch, XC_INTERNAL_ERROR, @@ -83,7 +83,7 @@ static int clear_page(struct xc_dom_image *dom, xen_pfn_t pfn) if ( pfn == 0 ) return 0; -dst = xc_dom_p2m_host(dom, pfn); +dst = xc_dom_p2m(dom, pfn); DOMPRINTF(%s: pfn 0x% PRIpfn , mfn 0x% PRIpfn , __FUNCTION__, pfn, dst); rc = xc_clear_domain_page(dom-xch, dom-guest_domid, dst); @@ -177,7 +177,7 @@ void *xc_dom_boot_domU_map(struct xc_dom_image *dom, xen_pfn_t pfn, } for ( i = 0; i count; i++ ) -entries[i].mfn = xc_dom_p2m_host(dom, pfn + i); +entries[i].mfn = xc_dom_p2m(dom, pfn + i); ptr = xc_map_foreign_ranges(dom-xch, dom-guest_domid, count page_shift, PROT_READ | PROT_WRITE, 1 page_shift, @@ -434,8 +434,8 @@ int xc_dom_gnttab_init(struct xc_dom_image *dom) dom-console_domid, dom-xenstore_domid); } else { return xc_dom_gnttab_seed(dom-xch, dom-guest_domid, - xc_dom_p2m_host(dom, dom-console_pfn), - xc_dom_p2m_host(dom, dom-xenstore_pfn), + xc_dom_p2m(dom, dom-console_pfn), + xc_dom_p2m(dom, dom-xenstore_pfn), dom-console_domid, dom-xenstore_domid); } } diff --git a/tools/libxc/xc_dom_compat_linux.c b/tools/libxc/xc_dom_compat_linux.c index a3abb99..5c1f043 100644 --- a/tools/libxc/xc_dom_compat_linux.c +++ 
b/tools/libxc/xc_dom_compat_linux.c @@ -64,8 +64,8 @@ static int xc_linux_build_internal(struct xc_dom_image *dom, if ( (rc = xc_dom_gnttab_init(dom)) != 0) goto out; -*console_mfn = xc_dom_p2m_host(dom, dom-console_pfn); -*store_mfn = xc_dom_p2m_host(dom,
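The unified lookup can be sketched as follows, assuming the comparison operators lost from the diff above were `<` and `>=`. The types and the `translated` field are hypothetical stand-ins (the real code calls xc_dom_feature_translated(dom)), and INVALID_MFN_SKETCH is an illustrative sentinel.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pfn_t;
#define INVALID_MFN_SKETCH (~(pfn_t)0)

/* Hypothetical subset of struct xc_dom_image. */
struct dom_image {
    int shadow_enabled;
    int translated;          /* stands in for xc_dom_feature_translated(dom) */
    pfn_t rambase_pfn;
    pfn_t total_pages;
    const pfn_t *p2m_host;   /* pfn -> mfn table, indexed from rambase_pfn */
};

/* One helper replacing the former _host and _guest variants. */
pfn_t dom_p2m(const struct dom_image *dom, pfn_t pfn)
{
    /* Shadow or auto-translated guests use pfns directly. */
    if ( dom->shadow_enabled || dom->translated )
        return pfn;

    /* Reject pfns outside the guest's RAM range. */
    if ( pfn < dom->rambase_pfn ||
         pfn >= dom->rambase_pfn + dom->total_pages )
        return INVALID_MFN_SKETCH;

    return dom->p2m_host[pfn - dom->rambase_pfn];
}
```

The only difference between the two old helpers was the early-return condition, which is why OR-ing the two conditions is enough to merge them.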
[Xen-devel] [PATCH v4 04/31] libxc: introduce the notion of a container type
Introduce the notion of a container type into xc_dom_image. This will be
needed by later changes that will also use xc_dom_image in order to build
HVM guests.

Signed-off-by: Roger Pau Monné roger@citrix.com
Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Wei Liu wei.l...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
Changes since v3:
 - Add Andrew Cooper Reviewed-by.
 - Add Wei Acked-by.
---
 tools/libxc/include/xc_dom.h | 6 ++++++
 tools/libxc/xc_dom_x86.c     | 4 ++++
 tools/libxl/libxl_dom.c      | 1 +
 3 files changed, 11 insertions(+)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 9cf13e2..bc55ec9 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -179,6 +179,12 @@ struct xc_dom_image {
     struct xc_dom_arch *arch_hooks;
     /* allocate up to virt_alloc_end */
     int (*allocate) (struct xc_dom_image * dom, xen_vaddr_t up_to);
+
+    /* Container type (HVM or PV). */
+    enum {
+        XC_DOM_PV_CONTAINER,
+        XC_DOM_HVM_CONTAINER,
+    } container_type;
 };

 /* --- pluggable kernel loader - */
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index dc2f4aa..c7bfc0c 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -1071,6 +1071,10 @@ int arch_setup_bootlate(struct xc_dom_image *dom)

 int xc_dom_feature_translated(struct xc_dom_image *dom)
 {
+    /* Guests running inside HVM containers are always auto-translated. */
+    if ( dom->container_type == XC_DOM_HVM_CONTAINER )
+        return 1;
+
     return elf_xen_feature_get(XENFEAT_auto_translated_physmap, dom->f_active);
 }

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index c1d0d8c..92c4278 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -619,6 +619,7 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
     }

     dom->pvh_enabled = state->pvh_enabled;
+    dom->container_type = XC_DOM_PV_CONTAINER;

     LOG(DEBUG, "pv kernel mapped %d path %s",
         state->pv_kernel.mapped, state->pv_kernel.path);
-- 
1.9.5 (Apple Git-50.3)
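A small sketch of the check added to xc_dom_feature_translated(): an HVM container forces the answer to 1, otherwise the kernel's advertised feature decides. The enum and struct here are hypothetical, and `kernel_auto_translated` stands in for the elf_xen_feature_get() lookup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the container enum and image struct. */
enum container { PV_CONTAINER, HVM_CONTAINER };

struct dom_image {
    enum container container_type;
    bool kernel_auto_translated;   /* stand-in for the ELF feature lookup */
};

int feature_translated(const struct dom_image *dom)
{
    /* Guests inside HVM containers are always auto-translated. */
    if ( dom->container_type == HVM_CONTAINER )
        return 1;

    /* PV containers fall back to what the kernel advertises. */
    return dom->kernel_auto_translated ? 1 : 0;
}
```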
[Xen-devel] [PATCH v4 01/31] libxl: fix libxl__build_hvm error handling
With the current code in libxl__build_hvm it is possible for the function to fail and still return 0. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com --- tools/libxl/libxl_dom.c | 39 ++- 1 file changed, 22 insertions(+), 17 deletions(-) diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index e1f11a3..38fb939 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -766,7 +766,7 @@ static int hvm_build_set_params(xc_interface *handle, uint32_t domid, XC_PAGE_SIZE, PROT_READ | PROT_WRITE, HVM_INFO_PFN); if (va_map == NULL) -return -1; +return ERROR_FAIL; va_hvm = (struct hvm_info_table *)(va_map + HVM_INFO_OFFSET); va_hvm-apic_mode = libxl_defbool_val(info-u.hvm.apic); @@ -912,7 +912,7 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, { libxl_ctx *ctx = libxl__gc_owner(gc); struct xc_hvm_build_args args = {}; -int ret, rc = ERROR_FAIL; +int ret, rc; uint64_t mmio_start, lowmem_end, highmem_end; libxl_domain_build_info *const info = d_config-b_info; @@ -932,7 +932,9 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, if (max_ram_below_4g HVM_BELOW_4G_MMIO_START) args.mmio_size = info-u.hvm.mmio_hole_memkb 10; } -if (libxl__domain_firmware(gc, info, args)) { + +rc = libxl__domain_firmware(gc, info, args); +if (rc != 0) { LOG(ERROR, initializing domain firmware failed); goto out; } @@ -963,15 +965,15 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, if (info-num_vnuma_nodes != 0) { int i; -ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, args); -if (ret) { -LOGEV(ERROR, ret, hvm build vmemranges failed); +rc = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, args); +if (rc != 0) { +LOG(ERROR, hvm build vmemranges failed); goto out; } -ret = libxl__vnuma_config_check(gc, info, state); -if (ret) goto out; -ret = set_vnuma_info(gc, domid, info, 
state); -if (ret) goto out; +rc = libxl__vnuma_config_check(gc, info, state); +if (rc != 0) goto out; +rc = set_vnuma_info(gc, domid, info, state); +if (rc != 0) goto out; args.nr_vmemranges = state-num_vmemranges; args.vmemranges = libxl__malloc(gc, sizeof(*args.vmemranges) * @@ -994,31 +996,34 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, ret = xc_hvm_build(ctx-xch, domid, args); if (ret) { LOGEV(ERROR, ret, hvm building failed); +rc = ERROR_FAIL; goto out; } -if (libxl__arch_domain_construct_memmap(gc, d_config, domid, args)) { +rc = libxl__arch_domain_construct_memmap(gc, d_config, domid, args); +if (rc != 0) { LOG(ERROR, setting domain memory map failed); goto out; } -ret = hvm_build_set_params(ctx-xch, domid, info, state-store_port, +rc = hvm_build_set_params(ctx-xch, domid, info, state-store_port, state-store_mfn, state-console_port, state-console_mfn, state-store_domid, state-console_domid); -if (ret) { -LOGEV(ERROR, ret, hvm build set params failed); +if (rc != 0) { +LOG(ERROR, hvm build set params failed); goto out; } -ret = hvm_build_set_xs_values(gc, domid, args); -if (ret) { -LOG(ERROR, hvm build set xenstore values failed (ret=%d), ret); +rc = hvm_build_set_xs_values(gc, domid, args); +if (rc != 0) { +LOG(ERROR, hvm build set xenstore values failed); goto out; } return 0; out: +assert(rc != 0); return rc; } -- 1.9.5 (Apple Git-50.3) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
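The convention the patch moves to can be sketched like this: `rc` carries libxl-style error codes, `ret` carries the raw result of a libxc-style call and is translated before jumping to the error path, and the exit path asserts that it never returns success by accident. The step functions and ERROR_FAIL_SKETCH value are hypothetical placeholders, not libxl's.

```c
#include <assert.h>

#define ERROR_FAIL_SKETCH (-3)   /* illustrative value only */

typedef int (*step_fn)(void);

/* Hypothetical steps used to drive the sketch. */
int step_ok(void)         { return 0; }
int step_libxl_fail(void) { return ERROR_FAIL_SKETCH; }
int step_libxc_fail(void) { return -1; }

int build(step_fn libxl_step, step_fn libxc_step)
{
    int ret, rc;

    /* libxl-style call: its result already is a libxl error code. */
    rc = libxl_step();
    if (rc != 0)
        goto out;

    /* libxc-style call: translate its result before taking the error path. */
    ret = libxc_step();
    if (ret) {
        rc = ERROR_FAIL_SKETCH;
        goto out;
    }

    return 0;

out:
    /* The error path must never report success. */
    assert(rc != 0);
    return rc;
}
```

Leaving `rc` uninitialised at declaration, as the patch does, also lets the compiler flag any path that reaches `out` without setting it.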
[Xen-devel] [PATCH v4 31/31] libxl: allow the creation of HVM domains without a device model.
Replace the firmware loaded into HVM guests with an OS kernel. Since the
HVM builder now uses the PV xc_dom_* set of functions this kernel will be
parsed and loaded inside the guest like on PV, but the container is a pure
HVM guest.

Also, if device_model_version is set to none or a device model for the
specified domain is not present, unconditionally set the nic type to
LIBXL_NIC_TYPE_VIF.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
Changes since v3:
 - Add explicit /* fall through */ comments.
 - Expand libxl__device_nic_setdefault so that it sets the right nic type
   for HVMlite guests.
 - Remove stray space in hvm_build_set_params.
 - Fix the error paths of libxl__domain_firmware.
---
 docs/man/xl.cfg.pod.5        |  5
 tools/libxc/xc_dom_x86.c     |  7 +
 tools/libxl/libxl.c          | 39 ++---
 tools/libxl/libxl_create.c   | 16 ++-
 tools/libxl/libxl_dm.c       | 13 -
 tools/libxl/libxl_dom.c      | 68 ++--
 tools/libxl/libxl_internal.h |  5 +++-
 tools/libxl/libxl_types.idl  |  1 +
 tools/libxl/libxl_x86.c      |  4 ++-
 tools/libxl/xl_cmdimpl.c     |  2 ++
 10 files changed, 118 insertions(+), 42 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 80e51bb..8cd7726 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1741,6 +1741,11 @@ This device-model is the default for Linux dom0.
 Use the device-model based upon the historical Xen fork of Qemu. This
 device-model is still the default for NetBSD dom0.

+=item B<none>
+
+Don't use any device model. This requires a kernel capable of booting
+in this mode.
+
 =back

 It is recommended to accept the default value for new guests.
If diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c index 1599de4..d67feb0 100644 --- a/tools/libxc/xc_dom_x86.c +++ b/tools/libxc/xc_dom_x86.c @@ -1269,6 +1269,13 @@ static int meminit_hvm(struct xc_dom_image *dom) if ( nr_pages target_pages ) memflags |= XENMEMF_populate_on_demand; +/* Make sure there's a MMIO hole for the special pages. */ +if ( dom-mmio_size == 0 ) +{ +dom-mmio_size = NR_SPECIAL_PAGES PAGE_SHIFT; +dom-mmio_start = special_pfn(0); +} + if ( dom-nr_vmemranges == 0 ) { /* Build dummy vnode information diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 083f099..a01868a 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -1033,11 +1033,13 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid) } if (type == LIBXL_DOMAIN_TYPE_HVM) { -rc = libxl__domain_resume_device_model(gc, domid); -if (rc 0) { -LOG(ERROR, failed to unpause device model for domain %u:%d, -domid, rc); -goto out; +if (libxl__domain_has_device_model(gc, domid)) { +rc = libxl__domain_resume_device_model(gc, domid); +if (rc 0) { +LOG(ERROR, failed to unpause device model for domain %u:%d, +domid, rc); +goto out; +} } } ret = xc_domain_unpause(ctx-xch, domid); @@ -1567,7 +1569,6 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis) libxl_ctx *ctx = CTX; uint32_t domid = dis-domid; char *dom_path; -char *pid; int rc, dm_present; libxl__ev_child_init(dis-destroyer); @@ -1584,14 +1585,13 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis) switch (libxl__domain_type(gc, domid)) { case LIBXL_DOMAIN_TYPE_HVM: -if (!libxl_get_stubdom_id(CTX, domid)) -dm_present = 1; -else +if (libxl_get_stubdom_id(CTX, domid)) { dm_present = 0; -break; +break; +} +/* fall through */ case LIBXL_DOMAIN_TYPE_PV: -pid = libxl__xs_read(gc, XBT_NULL, libxl__sprintf(gc, /local/domain/%d/image/device-model-pid, domid)); -dm_present = (pid != NULL); +dm_present = libxl__domain_has_device_model(gc, domid); break; case 
LIBXL_DOMAIN_TYPE_INVALID: rc = ERROR_FAIL; @@ -3203,7 +3203,7 @@ out: /**/ int libxl__device_nic_setdefault(libxl__gc *gc, libxl_device_nic *nic, - uint32_t domid) + uint32_t domid, libxl_domain_build_info *info) { int rc; @@ -3240,8 +3240,15 @@ int libxl__device_nic_setdefault(libxl__gc *gc, libxl_device_nic *nic, switch (libxl__domain_type(gc, domid)) { case LIBXL_DOMAIN_TYPE_HVM: -if (!nic-nictype) -nic-nictype = LIBXL_NIC_TYPE_VIF_IOEMU; +if
[Xen-devel] [PATCH v4 15/31] xen/x86: allow disabling the pmtimer
Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/pmtimer.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/xen/arch/x86/hvm/pmtimer.c b/xen/arch/x86/hvm/pmtimer.c
index 07594e1..199a15e 100644
--- a/xen/arch/x86/hvm/pmtimer.c
+++ b/xen/arch/x86/hvm/pmtimer.c
@@ -247,6 +247,9 @@ static int pmtimer_save(struct domain *d, hvm_domain_context_t *h)
     uint32_t x, msb = s->pm.tmr_val & TMR_VAL_MSB;
     int rc;

+    if ( !has_vpmtimer(d) )
+        return 0;
+
     spin_lock(&s->lock);

     /* Update the counter to the guest's current time. We always save
@@ -271,6 +274,9 @@ static int pmtimer_load(struct domain *d, hvm_domain_context_t *h)
 {
     PMTState *s = &d->arch.hvm_domain.pl_time.vpmt;

+    if ( !has_vpmtimer(d) )
+        return 0;
+
     spin_lock(&s->lock);

     /* Reload the registers */
@@ -328,6 +334,9 @@ void pmtimer_init(struct vcpu *v)
 {
     PMTState *s = &v->domain->arch.hvm_domain.pl_time.vpmt;

+    if ( !has_vpmtimer(v->domain) )
+        return;
+
     spin_lock_init(&s->lock);

     s->scale = ((uint64_t)FREQUENCE_PMTIMER << 32) / SYSTEM_TIME_HZ;
@@ -348,6 +357,10 @@ void pmtimer_init(struct vcpu *v)
 void pmtimer_deinit(struct domain *d)
 {
     PMTState *s = &d->arch.hvm_domain.pl_time.vpmt;
+
+    if ( !has_vpmtimer(d) )
+        return;
+
     kill_timer(&s->timer);
 }

-- 
1.9.5 (Apple Git-50.3)
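The guard pattern used in each pmtimer entry point can be sketched as below: every function bails out early when the domain has no emulated PM timer, so init and deinit stay symmetric. The struct, its fields, and the *_sketch names are hypothetical; the real has_vpmtimer() is defined elsewhere in the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical domain state: whether the device is enabled, and a flag
 * standing in for "timer and lock have been set up". */
struct domain_sketch {
    bool vpmtimer;
    int  timer_armed;
};

static bool has_vpmtimer_sketch(const struct domain_sketch *d)
{
    return d->vpmtimer;
}

void pmtimer_init_sketch(struct domain_sketch *d)
{
    if ( !has_vpmtimer_sketch(d) )
        return;                 /* device disabled: leave state untouched */
    d->timer_armed = 1;
}

void pmtimer_deinit_sketch(struct domain_sketch *d)
{
    if ( !has_vpmtimer_sketch(d) )
        return;                 /* nothing was initialised, nothing to kill */
    d->timer_armed = 0;
}
```

Guarding deinit matters as much as guarding init: without it, teardown would touch a timer that was never set up.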
[Xen-devel] [PATCH v4 21/31] xen/x86: allow disabling the emulated IOMMU
Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
Cc: Aravind Gopalakrishnan aravind.gopalakrish...@amd.com
---
 xen/drivers/passthrough/amd/iommu_guest.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/amd/iommu_guest.c b/xen/drivers/passthrough/amd/iommu_guest.c
index e74f469..b4e75ac 100644
--- a/xen/drivers/passthrough/amd/iommu_guest.c
+++ b/xen/drivers/passthrough/amd/iommu_guest.c
@@ -887,7 +887,8 @@ int guest_iommu_init(struct domain* d)
     struct guest_iommu *iommu;
     struct hvm_iommu *hd = domain_hvm_iommu(d);

-    if ( !is_hvm_domain(d) || !iommu_enabled || !iommuv2_enabled )
+    if ( !is_hvm_domain(d) || !iommu_enabled || !iommuv2_enabled ||
+         !has_viommu(d) )
         return 0;

     iommu = xzalloc(struct guest_iommu);
-- 
1.9.5 (Apple Git-50.3)
[Xen-devel] [PATCH v4 23/31] elfnotes: introduce a new PHYS_ENTRY elfnote
This new elfnote contains the 32bit entry point into the kernel. Xen will
use this entry point in order to launch the guest kernel in 32bit protected
mode with paging disabled.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
 tools/xcutils/readnotes.c          |  3 +++
 xen/common/libelf/libelf-dominfo.c |  4 ++++
 xen/include/public/elfnote.h       | 11 ++++++++++-
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/xcutils/readnotes.c b/tools/xcutils/readnotes.c
index 5fa445e..e682dd1 100644
--- a/tools/xcutils/readnotes.c
+++ b/tools/xcutils/readnotes.c
@@ -159,6 +159,9 @@ static unsigned print_notes(struct elf_binary *elf, ELF_HANDLE_DECL(elf_note) st
         case XEN_ELFNOTE_L1_MFN_VALID:
             print_l1_mfn_valid_note("L1_MFN_VALID", elf , note);
             break;
+        case XEN_ELFNOTE_PHYS32_ENTRY:
+            print_numeric_note("PHYS32_ENTRY", elf , note);
+            break;
         default:
             printf("unknown note type %#x\n",
                    (unsigned)elf_uval(elf, note, type));
diff --git a/xen/common/libelf/libelf-dominfo.c b/xen/common/libelf/libelf-dominfo.c
index f929968..365e058 100644
--- a/xen/common/libelf/libelf-dominfo.c
+++ b/xen/common/libelf/libelf-dominfo.c
@@ -119,6 +119,7 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary *elf,
         [XEN_ELFNOTE_BSD_SYMTAB] = { "BSD_SYMTAB", 1},
         [XEN_ELFNOTE_SUSPEND_CANCEL] = { "SUSPEND_CANCEL", 0 },
         [XEN_ELFNOTE_MOD_START_PFN] = { "MOD_START_PFN", 0 },
+        [XEN_ELFNOTE_PHYS32_ENTRY] = { "PHYS32_ENTRY", 0 },
     };
 /* *INDENT-ON* */

@@ -212,6 +213,9 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary *elf,
                 elf, note, sizeof(*parms->f_supported), i);
         break;

+    case XEN_ELFNOTE_PHYS32_ENTRY:
+        parms->phys_entry = val;
+        break;
     }
     return 0;
 }
diff --git a/xen/include/public/elfnote.h b/xen/include/public/elfnote.h
index 3824a94..e6fc596 100644
--- a/xen/include/public/elfnote.h
+++ b/xen/include/public/elfnote.h
@@ -200,9 +200,18 @@
 #define XEN_ELFNOTE_SUPPORTED_FEATURES 17

 /*
+ * Physical entry point into the kernel.
+ *
+ * 32bit entry point into the kernel. Xen will use this entry point
+ * in order to launch the guest kernel in 32bit protected mode
+ * with paging disabled.
+ */
+#define XEN_ELFNOTE_PHYS32_ENTRY 18
+
+/*
  * The number of the highest elfnote defined.
  */
-#define XEN_ELFNOTE_MAX XEN_ELFNOTE_SUPPORTED_FEATURES
+#define XEN_ELFNOTE_MAX XEN_ELFNOTE_PHYS32_ENTRY

 /*
  * System information exported through crash notes.
-- 
1.9.5 (Apple Git-50.3)
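Why XEN_ELFNOTE_MAX has to be bumped together with the new note can be sketched with a descriptor table sized from the maximum. The NOTE_* macros mirror the values 17 and 18 from the patch; the table layout and the note_name() helper are hypothetical and not libelf's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

#define NOTE_SUPPORTED_FEATURES 17
#define NOTE_PHYS32_ENTRY       18
/* Must track the highest defined note, or the table below is too short. */
#define NOTE_MAX                NOTE_PHYS32_ENTRY

struct note_desc {
    const char *name;
    int is_string;      /* 0 = numeric note, 1 = string note */
};

/* Designated initialisers: unlisted slots stay zeroed (name == NULL). */
static const struct note_desc note_desc[NOTE_MAX + 1] = {
    [NOTE_PHYS32_ENTRY] = { "PHYS32_ENTRY", 0 },
};

/* Return the note's name, or NULL for unknown or out-of-range types. */
const char *note_name(unsigned int type)
{
    if ( type > NOTE_MAX || note_desc[type].name == NULL )
        return NULL;
    return note_desc[type].name;
}
```

If NOTE_MAX were left at 17 here, the designated initialiser for index 18 would not fit in the array and the compiler would reject it.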
Re: [Xen-devel] [PATCH v2] xen-apic: Enable on domU as well
On Fri, Aug 7, 2015 at 4:23 PM, Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote:
> Anyhow, your patch seems to fix a regression my patch
> feb44f1f7a4ac299d1ab1c3606860e70b9b89d69 x86/xen: Provide a Xen PV APIC
> driver to support >255 VCPUs introduced.

Ahhh, good, okay. That explains why I didn't encounter this with older
kernels. The whole picture makes sense now.

Thanks for reviewing this.

David - mergeable?
Re: [Xen-devel] Second regression due to libxl: Remove linux udev rules (2ba368d13893402b2f1fb3c283ddcc714659dd9b)
On 07/08/15 at 16:54, Konrad Rzeszutek Wilk wrote:
> Ok. I hadn't run your patch yet. Do you want me to run the latest staging
> instead once more with my test-case?

Yes please, 40s in my test case seemed to be fine.

Roger.
Re: [Xen-devel] [PATCH v5 0/6] xen/PMU: PMU support for Xen PV(H) guests
On 08/07/2015 11:50 AM, Julien Grall wrote:
> Hi,
>
> On 07/08/15 16:35, David Vrabel wrote:
> > On 02/07/15 15:53, Boris Ostrovsky wrote:
> > > I haven't posted Linux part of PV(H) VPMU support in a while but now
> > > that (hopefully) the hypervisor part is getting close to be done I
> > > think it's time to post it again.
> > >
> > > There are very few differences compared to the last version, mostly
> > > due to updates in shared structures layouts. Patches 1 and 4 have no
> > > changes at all and patch 5 has minor changes due to rebasing so I kept
> > > David's Reviewed-by tag.
> >
> > This breaks the arm and arm64 builds.
> >
> > In file included from /local/davidvr/work/k.org/tip/drivers/xen/sys-hypervisor.c:23:0:
> > /local/davidvr/work/k.org/tip/include/xen/interface/xenpmu.h:91:22: error: field ‘pmu’ has incomplete type
> >   struct xen_pmu_arch pmu;
> >                       ^
> > /local/davidvr/work/k.org/tip/drivers/xen/sys-hypervisor.c: In function ‘pmu_mode_store’:
> > /local/davidvr/work/k.org/tip/drivers/xen/sys-hypervisor.c:403:2: error: implicit declaration of function ‘HYPERVISOR_xenpmu_op’ [-Werror=implicit-function-declaration]
> >   ret = HYPERVISOR_xenpmu_op(XENPMU_mode_set, &xp);
> >   ^
>
> There is no PMU support for the moment on ARM and this hypercall is only
> used for x86. I would introduce a new CONFIG (CONFIG_XEN_PMMU) which is
> enabled for x86 and disabled for ARM.

CONFIG_XEN_VPMU, but yes.

> > cc1: some warnings being treated as errors
> > /local/davidvr/work/k.org/tip/scripts/Makefile.build:258: recipe for target 'drivers/xen/sys-hypervisor.o' failed
> > make[3]: *** [drivers/xen/sys-hypervisor.o] Error 1
> > make[3]: *** Waiting for unfinished jobs
> > /local/davidvr/work/k.org/tip/drivers/xen/xenfs/xensyms.c: In function ‘xensyms_next_sym’:
> > /local/davidvr/work/k.org/tip/drivers/xen/xenfs/xensyms.c:34:2: error: implicit declaration of function ‘HYPERVISOR_dom0_op’ [-Werror=implicit-function-declaration]
> >   ret = HYPERVISOR_dom0_op(&xs->op);
> >   ^
>
> DOM0 op doesn't exist for ARM and xensyms is not even plumbed. I would
> make sure that XEN_SYMS is not enabled for ARM, maybe by adding the line
> below in the kconfig?
>
> depends on X86 && XEN_DOM0 && XENFS

Yes.

Sorry for breakage. I usually build the hypervisor for ARM but clearly
didn't do this for Linux.

-boris
Re: [Xen-devel] [PATCH v2] xen-apic: Enable on domU as well
On Thu, Aug 06, 2015 at 06:37:05PM +0200, Jason A. Donenfeld wrote: It turns out that domU also requires the Xen APIC driver. Otherwise we get stuck in busy loops that never exit, such as in this stack trace: (gdb) target remote localhost: Remote debugging using localhost: __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56 56 while (native_apic_mem_read(APIC_ICR) APIC_ICR_BUSY) (gdb) bt #0 __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56 #1 __default_send_IPI_shortcut (shortcut=optimized out, dest=optimized out, vector=optimized out) at ./arch/x86/include/asm/ipi.h:75 #2 apic_send_IPI_self (vector=246) at arch/x86/kernel/apic/probe_64.c:54 #3 0x81011336 in arch_irq_work_raise () at arch/x86/kernel/irq_work.c:47 #4 0x8114990c in irq_work_queue (work=0x88000fc0e400) at kernel/irq_work.c:100 #5 0x8110c29d in wake_up_klogd () at kernel/printk/printk.c:2633 #6 0x8110ca60 in vprintk_emit (facility=0, level=optimized out, dict=0x0 irq_stack_union, dictlen=optimized out, fmt=optimized out, args=optimized out) at kernel/printk/printk.c:1778 #7 0x816010c8 in printk (fmt=optimized out) at kernel/printk/printk.c:1868 #8 0xc00013ea in ?? () #9 0x in ?? () Mailing-list-thread: https://lkml.org/lkml/2015/8/4/755 Signed-off-by: Jason A. Donenfeld ja...@zx2c4.com Cc: David Vrabel david.vra...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: sta...@vger.kernel.org While this patch is OK for the trees that implement the PV APIC driver it won't apply to older ones (and it does not need to). In the older ones this was working with f447d56d36af18c5104ff29dcb1327c0c0ac3634 xen: implement apic ipi interface, which should have worked for your case. 
And would have made the arch_irq_work_raise and such use the Xen code paths: 952 953 #ifdef CONFIG_SMP 954 apic-send_IPI_allbutself = xen_send_IPI_allbutself; 955 apic-send_IPI_mask_allbutself = xen_send_IPI_mask_allbutself; 956 apic-send_IPI_mask = xen_send_IPI_mask; 957 apic-send_IPI_all = xen_send_IPI_all; 958 apic-send_IPI_self = xen_send_IPI_self; 959 #endif Anyhow, your patch seems to fix a regression my patch feb44f1f7a4ac299d1ab1c3606860e70b9b89d69 x86/xen: Provide a Xen PV APIC driver to support 255 VCPUs introduced. I would to the stable.vger.kernel.org: # apply only to v4.1 As the earlier ones will work fine. Thank you! Oh, and Reviewed-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- arch/x86/xen/Makefile | 4 ++-- arch/x86/xen/xen-ops.h | 11 --- 2 files changed, 6 insertions(+), 9 deletions(-) diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile index 7322755..4b6e29a 100644 --- a/arch/x86/xen/Makefile +++ b/arch/x86/xen/Makefile @@ -13,13 +13,13 @@ CFLAGS_mmu.o := $(nostackp) obj-y:= enlighten.o setup.o multicalls.o mmu.o irq.o \ time.o xen-asm.o xen-asm_$(BITS).o \ grant-table.o suspend.o platform-pci-unplug.o \ - p2m.o + p2m.o apic.o obj-$(CONFIG_EVENT_TRACING) += trace.o obj-$(CONFIG_SMP)+= smp.o obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= spinlock.o obj-$(CONFIG_XEN_DEBUG_FS) += debugfs.o -obj-$(CONFIG_XEN_DOM0) += apic.o vga.o +obj-$(CONFIG_XEN_DOM0) += vga.o obj-$(CONFIG_SWIOTLB_XEN)+= pci-swiotlb-xen.o obj-$(CONFIG_XEN_EFI)+= efi.o diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h index c20fe29..cd248ff 100644 --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -98,20 +98,17 @@ static inline void xen_uninit_lock_cpu(int cpu) #endif struct dom0_vga_console_info; - #ifdef CONFIG_XEN_DOM0 void __init xen_init_vga(const struct dom0_vga_console_info *, size_t size); -void __init xen_init_apic(void); #else -static inline void __init xen_init_vga(const struct dom0_vga_console_info *info, -size_t size) -{ -} -static inline void 
__init xen_init_apic(void) +void __init xen_init_vga(const struct dom0_vga_console_info *info, + size_t size); { } #endif +void __init xen_init_apic(void); + #ifdef CONFIG_XEN_EFI extern void xen_efi_init(void); #else -- 2.5.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Second regression due to libxl: Remove linux udev rules (2ba368d13893402b2f1fb3c283ddcc714659dd9b)
On Tue, Aug 04, 2015 at 10:14:32AM +0200, Roger Pau Monné wrote: El 30/07/15 a les 10.53, Roger Pau Monné ha escrit: El 28/07/15 a les 21.47, Konrad Rzeszutek Wilk ha escrit: Hey, I launch a bunch of guests at the same time or in parallel and the scripts end up timing out with: Parsing config from //g-vm8.cfg WARNING: you seem to be using kernel directive to override HVM guest firmware. Ignore that. Use firmware_override instead if you really want a non-default firmware Jul 28 19:20:53 tst036 logger: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/13/5632 libxl: error: libxl_aoutils.c:539:async_exec_timeout: killing execution of /etc/xen/scripts/block add because of timeout libxl: error: libxl_create.c:1157:domcreate_launch_dm: unable to add disk devices libxl: error: libxl_dm.c:1955:kill_device_model: unable to find device model pid in /local/domain/13/image/device-model-pid libxl: error: libxl.c:1606:libxl__destroy_domid: libxl__destroy_device_model failed for 13 Jul 28 19:21:03 tst036 logger: /etc/xen/scripts/block: remove XENBUS_PATH=backend/vbd/13/5632 Jul 28 19:21:04 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/13/5632/hotplug-error xenstore-read backend/vbd/13/5632/node failed. backend/vbd/13/5632/hotplug-status error to xenstore. Jul 28 19:21:04 tst036 logger: /etc/xen/scripts/block: xenstore-read backend/vbd/13/5632/node failed. Jul 28 19:21:05 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/13/5632/hotplug-error /etc/xen/scripts/block failed; error detected. backend/vbd/13/5632/hotplug-status error to xenstore. Jul 28 19:21:05 tst036 logger: /etc/xen/scripts/block: /etc/xen/scripts/block failed; error detected. libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: /etc/xen/scripts/block remove [10344] exited with error status 1 libxl: error: libxl_device.c:1085:device_hotplug_child_death_cb: script: /etc/xen/scripts/block failed; error detected. 
libxl: error: libxl.c:1569:libxl__destroy_domid: non-existant domain 13 libxl: error: libxl.c:1527:domain_destroy_callback: unable to destroy guest with domid 13 libxl: error: libxl.c:1454:domain_destroy_cb: destruction of domain 13 failed And I cannot start the guest. While if I revert the mentioned commit everything works peachy. What is interesting is that if I have the revert I can see that the Jul 28 19:39:03 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/14/5632/physical-device 7:d to xenstore. Jul 28 19:39:03 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/14/5632/hotplug-status connected to xenstore. or often done much much later after xl create has started. Attached is the bad log and the good log. Can you do the same test with xl -vvv and the following patch applied (with and without 2ba368 reverted): Ping? Hey! I've looked into this, and AFAICT you were probably using the udev rules (you have run_hotplug_scripts=0 in xl.conf?) before 2ba368, and Correct. I think I needed that for driver domains and had left it in there. now you are forcefully switched to launching hotplug scripts from libxl. OK. The issue is that you have multiple guests all using the same image file, so the time to execute the block hotplug script is O(n), where n is the number of times the same image is used: shared_list=$(losetup -a | sed -n -e s@^\([^:]\+\)\(:[[:blank:]]\[0*${dev}\]:${inode}[[:blank:]](.*)\)@\1@p ) for dev in $shared_list do if [ -n $dev ] then check_file_sharing $file $dev $mode fi done This was not a problem when using udev, because there's no timeout, but libxl has a hard timeout (10s) regarding hotplug script execution. The only way I see to solve this is to remove the checks done in the block hotplug script, or to increase the timeout (but since the execution time is not bounded this is doomed to fail if enough guests are using the same image). Ok. I hadn't run your patch yet. 
Do you want me to run the latest staging instead once more with my test-case? Roger.
Re: [Xen-devel] [PATCH v7] run QEMU as non-root
On Mon, Jul 27, 2015 at 03:19:56PM +0200, Fabio Fantoni wrote:
On 23/07/2015 19:08, Stefano Stabellini wrote:

Try to use xen-qemudepriv-domid$domid first, then xen-qemudepriv-shared and root if everything else fails. The uids need to be manually created by the user or, more likely, by the xen package maintainer.

Expose a device_model_user setting in libxl_domain_build_info, so that opinionated callers, such as libvirt, can set any user they like. Do not fall back to root if device_model_user is set.

QEMU is going to setuid and setgid to the user ID and the group ID of the specified user, soon after initialization, before starting to deal with any guest IO.

To actually secure QEMU when running in Dom0, we need at least to deprivilege the privcmd and xenstore interfaces; this is just the first step in that direction.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

Thanks for this patch, I'll test it now. I think it would also be good to add a domU xl cfg parameter to set a custom user, instead of making this possible only through libxl from external tools. Is it possible to add it?

It looks trivial to me. The hardest part would be picking a name for the new option and writing that down in the manpage. :-)

Wei.
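The fallback order described in the commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the libxl code: pick_dm_user, user_exists_fn and only_shared_exists are hypothetical names, the user lookup is abstracted behind a callback (a real implementation would presumably consult the password database), and failing when an explicitly requested user is missing is an assumption about how "do not fall back to root" would behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: returns true if the named user exists. */
typedef bool (*user_exists_fn)(const char *name);

/*
 * Pick the user to run the device model as and write its name into buf.
 * Returns 0 on success, -1 if an explicitly requested user is missing.
 * Order: explicit user (no fallback), per-domain user, shared user, root.
 */
static int pick_dm_user(const char *explicit_user, unsigned int domid,
                        user_exists_fn exists, char *buf, size_t len)
{
    assert(exists != NULL && buf != NULL && len > 0);

    if (explicit_user != NULL) {
        /* Assumption: an explicit setting never falls back to root. */
        if (!exists(explicit_user))
            return -1;
        snprintf(buf, len, "%s", explicit_user);
        return 0;
    }

    snprintf(buf, len, "xen-qemudepriv-domid%u", domid);
    if (exists(buf))
        return 0;

    snprintf(buf, len, "%s", "xen-qemudepriv-shared");
    if (exists(buf))
        return 0;

    /* Last resort when nothing was requested explicitly. */
    snprintf(buf, len, "%s", "root");
    return 0;
}

/* Toy lookup for illustration: only the shared user exists. */
static bool only_shared_exists(const char *name)
{
    return strcmp(name, "xen-qemudepriv-shared") == 0;
}
```

With the toy lookup, a call for domid 13 without an explicit user selects xen-qemudepriv-shared, while an explicit but missing user yields an error instead of root.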
Re: [Xen-devel] [PATCH v4 13/31] xen/x86: allow disabling the emulated local apic
On 07/08/15 at 16:09, Boris Ostrovsky wrote:
On 08/07/2015 06:17 AM, Roger Pau Monne wrote:

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index a0a97e7..5acb246 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1027,6 +1027,20 @@ static int construct_vmcs(struct vcpu *v)
         ASSERT(!(v->arch.hvm_vmx.exec_control & CPU_BASED_RDTSC_EXITING));
     }
 
+    if ( !has_vlapic(d) )
+    {
+        /* Disable virtual apics, TPR */
+        v->arch.hvm_vmx.secondary_exec_control &=
+            ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES
+              | SECONDARY_EXEC_APIC_REGISTER_VIRT
+              | SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
+        v->arch.hvm_vmx.exec_control &= ~CPU_BASED_TPR_SHADOW;
+
+        /* In turn, disable posted interrupts. */
+        __vmwrite(PIN_BASED_VM_EXEC_CONTROL,
+                  vmx_pin_based_exec_control & ~PIN_BASED_POSTED_INTERRUPT);
+    }
+
     vmx_update_cpu_exec_control(v);

This is the same code as the one used right above, in 'if ( is_pvh_domain(d) )' clause. Can you combine the two?

No, it's not the same code. The PVH code disables unrestricted guest and sets the entry of the VM to be in long mode, which is not true for HVMlite.

Roger.
Re: [Xen-devel] [PATCH v4 for Xen 4.6 0/4] Enable per-VCPU parameter settings for RTDS scheduler
On Mon, Jul 27, 2015 at 10:14 AM, Dario Faggioli dario.faggi...@citrix.com wrote:
On Fri, 2015-07-10 at 23:52 -0500, Chong Li wrote:

Or by just adding a note before the actual output:

  # xl sched-rtds -d vm1
  Showing per-vm(s) default scheduling parameters, use `-v' for seeing the actual parameters of each vcpu
  Name    ID    Period    Budget
  vm1      1         1      4000

The latter is more verbose, but I think is what I see as more useful (and, probably, the easier to implement).

I agree. I'll add it in the next version.

Regards, Dario
-- 
This happens because I choose it to happen! (Raistlin Majere)
- Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

-- 
Chong Li
Department of Computer Science and Engineering
Washington University in St. Louis
Re: [Xen-devel] [PATCH v4 13/31] xen/x86: allow disabling the emulated local apic
On 07/08/15 11:17, Roger Pau Monne wrote:

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Boris Ostrovsky boris.ostrov...@oracle.com
Cc: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
Cc: Aravind Gopalakrishnan aravind.gopalakrish...@amd.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
Cc: Jun Nakajima jun.nakaj...@intel.com
Cc: Eddie Dong eddie.d...@intel.com
Cc: Kevin Tian kevin.t...@intel.com
---

Patches 13 - 21 (all the allow disable of $FOO ones)
Acked-by: Andrew Cooper andrew.coop...@citrix.com

I have not done a thorough review, but the patches look plausible to a first order approximation.

~Andrew
[Xen-devel] [PATCH v3 6/9] video/xen-fbfront: Further s/MFN/GFN clean-up
The PV driver xen-fbfront is only dealing with GFN and not MFN. Rename all the occurrences of MFN to GFN.

Also take the opportunity to replace the usage of pfn_to_gfn with xen_page_to_gfn.

Signed-off-by: Julien Grall julien.gr...@citrix.com
Reviewed-by: David Vrabel david.vra...@citrix.com
---
Cc: Jean-Christophe Plagniol-Villard plagn...@jcrosoft.com
Cc: Tomi Valkeinen tomi.valkei...@ti.com
Cc: linux-fb...@vger.kernel.org

Changes in v3:
  - Typos in the commit message
  - Renamed page_to_gfn to xen_page_to_gfn

Changes in v2:
  - Add David's reviewed-by
---
 drivers/video/fbdev/xen-fbfront.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c
index 25e3cce..0567d51 100644
--- a/drivers/video/fbdev/xen-fbfront.c
+++ b/drivers/video/fbdev/xen-fbfront.c
@@ -46,7 +46,7 @@ struct xenfb_info {
 	int nr_pages;
 	int irq;
 	struct xenfb_page *page;
-	unsigned long *mfns;
+	unsigned long *gfns;
 	int update_wanted; /* XENFB_TYPE_UPDATE wanted */
 	int feature_resize; /* XENFB_TYPE_RESIZE ok */
 	struct xenfb_resize resize; /* protected by resize_lock */
@@ -402,8 +402,8 @@ static int xenfb_probe(struct xenbus_device *dev,
 
 	info->nr_pages = (fb_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
 
-	info->mfns = vmalloc(sizeof(unsigned long) * info->nr_pages);
-	if (!info->mfns)
+	info->gfns = vmalloc(sizeof(unsigned long) * info->nr_pages);
+	if (!info->gfns)
 		goto error_nomem;
 
 	/* set up shared page */
@@ -530,29 +530,29 @@ static int xenfb_remove(struct xenbus_device *dev)
 		framebuffer_release(info->fb_info);
 	}
 	free_page((unsigned long)info->page);
-	vfree(info->mfns);
+	vfree(info->gfns);
 	vfree(info->fb);
 	kfree(info);
 
 	return 0;
 }
 
-static unsigned long vmalloc_to_mfn(void *address)
+static unsigned long vmalloc_to_gfn(void *address)
 {
-	return pfn_to_gfn(vmalloc_to_pfn(address));
+	return xen_page_to_gfn(vmalloc_to_page(address));
 }
 
 static void xenfb_init_shared_page(struct xenfb_info *info,
 				   struct fb_info *fb_info)
 {
 	int i;
-	int epd = PAGE_SIZE / sizeof(info->mfns[0]);
+	int epd = PAGE_SIZE / sizeof(info->gfns[0]);
 
 	for (i = 0; i < info->nr_pages; i++)
-		info->mfns[i] = vmalloc_to_mfn(info->fb + i * PAGE_SIZE);
+		info->gfns[i] = vmalloc_to_gfn(info->fb + i * PAGE_SIZE);
 
 	for (i = 0; i * epd < info->nr_pages; i++)
-		info->page->pd[i] = vmalloc_to_mfn(&info->mfns[i * epd]);
+		info->page->pd[i] = vmalloc_to_gfn(&info->gfns[i * epd]);
 
 	info->page->width = fb_info->var.xres;
 	info->page->height = fb_info->var.yres;
-- 
2.1.4
[Xen-devel] [PATCH v3 3/9] arm/xen: implement correctly pfn_to_mfn
After the commit introducing conversion between DMA and guest addresses, all the callers of pfn_to_mfn are expecting to get a GFN (Guest Frame Number). On ARM, all the guests are auto-translated so the GFN is equal to the Linux PFN (Pseudo-physical Frame Number).

The current implementation may return an MFN if the caller is passing a PFN associated to a mapped foreign grant. In practice, I haven't seen the problem on a running guest, but we should fix it for the sake of correctness.

Correct the implementation by always returning the pfn passed in parameter.

A follow-up patch will take care to rename pfn_to_mfn to a suitable name.

Signed-off-by: Julien Grall julien.gr...@citrix.com
Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
Cc: Russell King li...@arm.linux.org.uk
Cc: linux-arm-ker...@lists.infradead.org

Changes in v3:
  - Typos in the commit message
  - Add Stefano's reviewed-by
---
 arch/arm/include/asm/xen/page.h | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 5f76a9e..911d62b 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -36,14 +36,6 @@ extern struct rb_root phys_to_mach;
 
 static inline unsigned long pfn_to_mfn(unsigned long pfn)
 {
-	unsigned long mfn;
-
-	if (phys_to_mach.rb_node != NULL) {
-		mfn = __pfn_to_mfn(pfn);
-		if (mfn != INVALID_P2M_ENTRY)
-			return mfn;
-	}
-
 	return pfn;
 }
 
-- 
2.1.4
Re: [Xen-devel] [PATCH v4 24/31] libxc: allow creating domains without emulated devices.
On 07/08/15 11:18, Roger Pau Monne wrote:

@@ -1336,7 +1343,8 @@ static int meminit_hvm(struct xc_dom_image *dom)
      * tot_pages will be target_pages - VGA_HOLE_SIZE after
      * this call.
      */
-    rc = xc_domain_set_pod_target(xch, domid, target_pages - VGA_HOLE_SIZE,
+    rc = xc_domain_set_pod_target(xch, domid, target_pages -
+                                  dom->emulation ? VGA_HOLE_SIZE : 0,
                                   NULL, NULL, NULL);

This might be more cleanly expressed as having d->vga_hole_size which is either VGA_HOLE_SIZE or 0, depending on dom->emulation.

~Andrew
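One further reason to prefer a separate hole-size variable: in C, binary minus binds tighter than the conditional operator, so the expression as quoted above (without parentheses) would be parsed as a test on the result of the subtraction. The sketch below illustrates that under stated assumptions: the function names are hypothetical, the VGA_HOLE_SIZE value is only illustrative, and it is not the libxc code.

```c
#include <assert.h>

#define VGA_HOLE_SIZE 0x20UL /* illustrative value, in pages */

/*
 * How "target_pages - emulation ? VGA_HOLE_SIZE : 0" is parsed:
 * the subtraction becomes the condition of the ternary.
 */
static unsigned long pod_target_as_parsed(unsigned long target_pages,
                                          int emulation)
{
    return (target_pages - emulation) ? VGA_HOLE_SIZE : 0;
}

/* The apparent intent, using a separate hole-size variable. */
static unsigned long pod_target_intended(unsigned long target_pages,
                                         int emulation)
{
    unsigned long vga_hole_size = emulation ? VGA_HOLE_SIZE : 0;

    assert(target_pages >= vga_hole_size);
    return target_pages - vga_hole_size;
}
```

For target_pages = 1000 the first form returns VGA_HOLE_SIZE regardless of emulation, whereas the second returns 1000 minus the hole size only when emulation is enabled.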
[Xen-devel] [PATCH v3 4/9] xen: Use correctly the Xen memory terminologies
Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN is meant. I suspect this is because the first support for Xen was for PV. This resulted in some misimplementation of helpers on ARM and confused developers about the expected behavior.

For instance, with pfn_to_mfn, we expect to get an MFN based on the name. However, if we look at the implementation on x86, it's returning a GFN.

For clarity and to avoid new confusion, replace any reference to mfn with gfn in any helpers used by PV drivers. The x86 code will still keep some references to pfn_to_mfn but exclusively for PV (a BUG_ON has been added to ensure this). No changes have been made in the hypercall fields, even though they may be invalid, in order to keep them the same as the definition in the xen repo.

Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a name too close to the KVM function gfn_to_page.

Also take the opportunity to simplify simple constructions such as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex clean up will come in follow-up patches.

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb

Signed-off-by: Julien Grall julien.gr...@citrix.com
Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Acked-by: Dmitry Torokhov dmitry.torok...@gmail.com
Acked-by: Wei Liu wei.l...@citrix.com
---
Cc: Russell King li...@arm.linux.org.uk
Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Cc: Boris Ostrovsky boris.ostrov...@oracle.com
Cc: David Vrabel david.vra...@citrix.com
Cc: Thomas Gleixner t...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: H. Peter Anvin h...@zytor.com
Cc: x...@kernel.org
Cc: Roger Pau Monné roger@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Juergen Gross jgr...@suse.com
Cc: James E.J. Bottomley jbottom...@odin.com
Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
Cc: Jiri Slaby jsl...@suse.com
Cc: Jean-Christophe Plagniol-Villard plagn...@jcrosoft.com
Cc: Tomi Valkeinen tomi.valkei...@ti.com
Cc: linux-in...@vger.kernel.org
Cc: net...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-fb...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org

Note that in v2 I've re-introduced mfn_to_pfn & co only for x86 PV code. The helpers contain a BUG_ON to ensure that they are never called for auto-translated guests.

I did my best to determine whether mfn or gfn helpers should be used. However, I haven't tried to boot it.

It may be possible to do further cleanup in mmu.c, where I found some checks for auto-translated guests. I'm not sure why, given that the pvmmu callbacks are only used for non-auto-translated guests.

Changes in v3:
  - Add Stefano's reviewed-by (except for the x86 bits)
  - Add Wei (netback) and Dmitry's (input) acked-by
  - Keep the VIRT <-> MACHINE macro in the same order as before in arch/x86/include/asm/xen/page.h
  - Rename page_to_gfn to xen_page_to_gfn to avoid confusion with the KVM function gfn_to_page.
  - Typos in the commit title

Changes in v2:
  - Give directly the URL to the commit rather than the commit ID
  - xenstored_local_init: keep the cast to void *
  - Typos
  - Keep pfn_to_mfn for x86 and PV-only.
The *mfn* helpers are used in arch/x86/xen for enlighten.c, mmu.c, p2m.c, setup.c, smp.c and mm.c --- arch/arm/include/asm/xen/page.h | 13 +++-- arch/x86/include/asm/xen/page.h | 31 +-- arch/x86/xen/smp.c | 2 +- drivers/block/xen-blkfront.c| 6 +++--- drivers/input/misc/xen-kbdfront.c | 4 ++-- drivers/net/xen-netback/netback.c | 4 ++-- drivers/net/xen-netfront.c | 12 +++- drivers/scsi/xen-scsifront.c| 10 +- drivers/tty/hvc/hvc_xen.c | 5 +++-- drivers/video/fbdev/xen-fbfront.c | 4 ++-- drivers/xen/balloon.c | 2 +- drivers/xen/events/events_base.c| 2 +- drivers/xen/events/events_fifo.c| 4 ++-- drivers/xen/gntalloc.c | 3 ++- drivers/xen/manage.c| 2 +- drivers/xen/tmem.c | 4 ++-- drivers/xen/xenbus/xenbus_client.c | 2 +- drivers/xen/xenbus/xenbus_dev_backend.c | 2 +- drivers/xen/xenbus/xenbus_probe.c | 8 +++- include/xen/page.h | 4 ++-- 20 files changed, 73 insertions(+), 51 deletions(-) diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h index 911d62b..1279563 100644 --- a/arch/arm/include/asm/xen/page.h +++ b/arch/arm/include/asm/xen/page.h @@ -34,14 +34,15 @@ typedef struct xpaddr { unsigned long __pfn_to_mfn(unsigned long pfn); extern struct rb_root phys_to_mach; -static inline unsigned long pfn_to_mfn(unsigned long pfn) +/* Pseudo-physical - Guest conversion */ +static
Re: [Xen-devel] [PATCH 1/2] tools/libxl: Assert success of memory allocation in testidl
On Fri, Aug 07, 2015 at 03:06:23PM +0100, Andrew Cooper wrote:

The chances of an allocation failing are slim but nonzero. Assert success of each allocation to quieten Coverity, which re-notices defects each time the IDL changes.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com

Acked-by: Wei Liu wei.l...@citrix.com
Re: [Xen-devel] [PATCH] xen: arm: Support 32MB frametables
Hi Chris,

On 06/08/15 18:54, Chris (Christopher) Brand wrote:

setup_frametable_mappings() rounds frametable_size up to a multiple of 32MB. This is wasteful on systems with less than 4GB of RAM, although it does allow the contig bit to be set in the PTEs. Where the frametable is less than 32MB in size, instead round up to a multiple of 2MB, not setting the contig bit in the PTEs.

OOI, you win 30MB of RAM but how does this affect the performance?

Signed-off-by: Chris Brand chris.br...@broadcom.com
---
 xen/arch/arm/mm.c | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index a91ea774f1f9..47b6d5d44563 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c

 #ifdef CONFIG_ARM_32
+static void __init create_2mb_mappings(lpae_t *second,
+                                       unsigned long virt_offset,
+                                       unsigned long base_mfn,
+                                       unsigned long nr_mfns)
+{
+    unsigned long i, count;
+    lpae_t pte, *p;
+
+    ASSERT(!((virt_offset >> PAGE_SHIFT) % LPAE_ENTRIES));
+    ASSERT(!(base_mfn % LPAE_ENTRIES));
+    ASSERT(!(nr_mfns % LPAE_ENTRIES));
+
+    count = nr_mfns / LPAE_ENTRIES;
+    p = second + second_linear_offset(virt_offset);
+    pte = mfn_to_xen_entry(base_mfn, WRITEALLOC);
+    for ( i = 0; i < count; i++ )
+    {
+        write_pte(p + i, pte);
+        pte.pt.base += 1 << LPAE_SHIFT;
+    }
+    flush_xen_data_tlb_local();
+}
+

Can you rework create_32mb_mappings to take the size of the mappings as a parameter?

 /* Set up the xenheap: up to 1GB of contiguous, always-mapped memory. */
 void __init setup_xenheap_mappings(unsigned long base_mfn,
                                    unsigned long nr_mfns)
@@ -749,6 +772,7 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
     unsigned long nr_pdxs = pfn_to_pdx(nr_pages);
     unsigned long frametable_size = nr_pdxs * sizeof(struct page_info);
     unsigned long base_mfn;
+    unsigned long mask;
 #ifdef CONFIG_ARM_64
     lpae_t *second, pte;
     unsigned long nr_second, second_base;
@@ -757,8 +781,12 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
 
     frametable_base_pdx = pfn_to_pdx(ps >> PAGE_SHIFT);
 
-    /* Round up to 32M boundary */
-    frametable_size = (frametable_size + 0x1ffffff) & ~0x1ffffff;
+    /* Round up to 2M or 32M boundary, as appropriate. */
+    if ( frametable_size < MB(32) )
+        mask = MB(2) - 1;
+    else
+        mask = MB(32) - 1;
+    frametable_size = (frametable_size + mask) & ~mask;

You can use ROUNDUP(frametable_size, size) to avoid open-coding the mask.

Also, this code is common with ARM64. If we happen to have a board with a frametable smaller than 32MB, you will round up to 2MB and crash later in create_32mb_mappings because you don't support 2MB mappings for ARM64. It might be good to support 2MB for ARM64 too.

     base_mfn = alloc_boot_pages(frametable_size >> PAGE_SHIFT, 32<<(20-12));
 #ifdef CONFIG_ARM_64
@@ -773,7 +801,12 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
     }
     create_32mb_mappings(second, 0, base_mfn, frametable_size >> PAGE_SHIFT);
 #else
-    create_32mb_mappings(xen_second, FRAMETABLE_VIRT_START, base_mfn, frametable_size >> PAGE_SHIFT);
+    if ( frametable_size < MB(32) )
+        create_2mb_mappings(xen_second, FRAMETABLE_VIRT_START,
+                            base_mfn, frametable_size >> PAGE_SHIFT);
+    else
+        create_32mb_mappings(xen_second, FRAMETABLE_VIRT_START,
+                             base_mfn, frametable_size >> PAGE_SHIFT);

Passing the size/alignment as a parameter would have avoided adding this if/else. You can use the new parameter to ASSERT the input and to enable or not the contiguous bit.

Regards,

-- 
Julien Grall
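The rounding rule discussed in this review (2MB granularity below 32MB, 32MB otherwise) can be sketched in isolation as below. This is a standalone illustration, not the Xen code: roundup_pow2, frametable_granule and frametable_round are hypothetical names, and MB() is redefined locally so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

#define MB(x) ((uint64_t)(x) << 20)

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t roundup_pow2(uint64_t x, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Mapping granularity: 2MB for frametables below 32MB, 32MB otherwise. */
static uint64_t frametable_granule(uint64_t frametable_size)
{
    return frametable_size < MB(32) ? MB(2) : MB(32);
}

/* Size after rounding up to the chosen granularity. */
static uint64_t frametable_round(uint64_t frametable_size)
{
    return roundup_pow2(frametable_size,
                        frametable_granule(frametable_size));
}
```

Keeping the granule in one helper mirrors the review suggestion: the same value could be passed to the mapping function to pick the alignment checks and decide whether the contiguous bit applies.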
Re: [Xen-devel] [PATCH v4 11/31] libxc: remove dead HVM building code
On 07/08/15 11:17, Roger Pau Monne wrote:

Remove xc_hvm_build_x86.c and xc_hvm_build_arm.c since xc_hvm_build is no longer used in order to create HVM guests.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com

Even more slimming down!

Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
Re: [Xen-devel] [PATCH v4 12/31] xen/x86: add bitmap of enabled emulated devices
On 07/08/15 11:17, Roger Pau Monne wrote:

Introduce a bitmap in x86 xen_arch_domainconfig that allows enabling or disabling specific devices emulated inside of Xen for HVM guests.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com

Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
Re: [Xen-devel] [RFC 3/4] HVM x86 deprivileged mode: Code for switching into/out of deprivileged mode
On 07/08/15 13:51, Ben Catterall wrote: On 06/08/15 21:55, Andrew Cooper wrote: On 06/08/15 17:45, Ben Catterall wrote: The process to switch into and out of deprivileged mode can be likened to setjmp/longjmp. To enter deprivileged mode, we take a copy of the stack from the guest's registers up to the current stack pointer. This allows us to restore the stack when we have finished the deprivileged mode operation, meaning we can continue execution from that point. This is similar to if a context switch had happened. To exit deprivileged mode, we copy the stack back, replacing the current stack. We can then continue execution from where we left off, which will unwind the stack and free up resources. This method means that we do not need to change any other code paths and its invocation will be transparent to callers. This should allow the feature to be more easily deployed to different parts of Xen. Note that this copy of the stack is per-vcpu but, it will contain per-pcpu data. Extra work is needed to properly migrate vcpus between pcpus. Under what circumstances do you see there being persistent state in the depriv area between calls, given that the calls are synchronous from VM actions? I don't know if we can make these synchronous as we need a way to interrupt the vcpu if it's spinning for a long time. Otherwise an attacker could just spin in depriv and cause a DoS. With that in mind, the scheduler may decide to migrate the vcpu whilst it's in depriv mode which would mean this per-pcpu data is held in the stack copy which is then migrated to another pcpu incorrectly. If the emulator spins for a sufficient time, it is fine to shoot the domain. This is a strict improvement on the current behaviour where a spinning emulator would shoot the host, via a watchdog timeout. As said elsewhere, this kind of DoS is not a very interesting attack vector. State handling errors which cause Xen to change the wrong thing are far more interesting from a guests point of view. 
http://xenbits.xen.org/xsa/advisory-123.html (full host compromise) or http://xenbits.xen.org/xsa/advisory-108.html (read other guests data) are examples of kinds of interesting issues which could potentially be mitigated with this depriv infrastructure. The switch to and from deprivileged mode is performed using sysret and syscall respectively. I suspect we need to borrow the SS attribute workaround from Linux to make this function reliably on AMD systems. https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=61f01dd941ba9e06d2bf05994450ecc3d61b6b8b Ah! ok, I'll look into this. Thanks! Just be aware of it. Don't spend your time attempting to retrofit it to Xen. It is more work than it looks. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
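The setjmp/longjmp analogy used in the patch description can be made concrete with a small userspace sketch. It only illustrates the control flow (save a resume point, run an operation, jump back to where we left off); it does not copy stacks or change privilege levels, and all names in it are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf resume_point; /* stands in for the saved stack and context */
static int depriv_result;    /* value handed back by the operation */

/* Stand-in for the deprivileged work; it "exits" by jumping back. */
static void run_deprivileged(int input)
{
    depriv_result = input * 2;
    longjmp(resume_point, 1); /* analogous to leaving depriv mode */
}

/* Enter the operation and resume here once it has finished. */
static int call_into_depriv(int input)
{
    if (setjmp(resume_point) == 0) {
        /* First return from setjmp: context saved, enter the operation. */
        run_deprivileged(input);
        assert(!"unreachable: run_deprivileged() never returns");
    }
    /* Second return: execution continues from the saved point. */
    return depriv_result;
}
```

As in the description, the caller of call_into_depriv() sees an ordinary function call; the save and restore steps stay hidden inside it.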
Re: [Xen-devel] [PATCH for-4.6] tools/xenstore: Correct use of va_end() after va_copy()
On Fri, Aug 07, 2015 at 02:51:59PM +0100, Andrew Cooper wrote:

C requires that every use of va_copy() is matched with a va_end() call. This is especially important for x86_64 as va_{start,copy}() may need to allocate memory to generate a va_list containing parameters which were previously in registers.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com

Acked-by: Wei Liu wei.l...@citrix.com

This is a candidate for backport.

---
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
 tools/xenstore/talloc.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/xenstore/talloc.c b/tools/xenstore/talloc.c
index 54dbd02..d7edcf3 100644
--- a/tools/xenstore/talloc.c
+++ b/tools/xenstore/talloc.c
@@ -1101,13 +1101,16 @@ char *talloc_vasprintf(const void *t, const char *fmt, va_list ap)
 
 	/* this call looks strange, but it makes it work on older solaris boxes */
 	if ((len = vsnprintf(&c, 1, fmt, ap2)) < 0) {
+		va_end(ap2);
 		return NULL;
 	}
+	va_end(ap2);
 
 	ret = _talloc(t, len+1);
 	if (ret) {
 		VA_COPY(ap2, ap);
 		vsnprintf(ret, len+1, fmt, ap2);
+		va_end(ap2);
 		talloc_set_name_const(ret, ret);
 	}
 
@@ -1161,8 +1164,10 @@ static char *talloc_vasprintf_append(char *s, const char *fmt, va_list ap)
 		 * the original string. Most current callers of this
 		 * function expect it to never return NULL.
 		 */
+		va_end(ap2);
 		return s;
 	}
+	va_end(ap2);
 
 	s = talloc_realloc(NULL, s, char, s_len + len+1);
 	if (!s) return NULL;
@@ -1170,6 +1175,7 @@ static char *talloc_vasprintf_append(char *s, const char *fmt, va_list ap)
 	VA_COPY(ap2, ap);
 
 	vsnprintf(s+s_len, len+1, fmt, ap2);
+	va_end(ap2);
 	talloc_set_name_const(s, s);
 
 	return s;
-- 
1.7.10.4
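The pairing rule from the commit message can be shown in a standalone two-pass formatter. This is a minimal sketch using plain malloc() rather than talloc; vformat_alloc and format_alloc are hypothetical names, not functions from the xenstore sources.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format into a freshly malloc()ed string; returns NULL on failure. */
static char *vformat_alloc(const char *fmt, va_list ap)
{
    va_list ap2;
    char probe;
    char *ret;
    int len;

    assert(fmt != NULL);

    /* First pass: measure. The copy is ended before any return. */
    va_copy(ap2, ap);
    len = vsnprintf(&probe, 1, fmt, ap2);
    va_end(ap2);
    if (len < 0)
        return NULL;

    ret = malloc((size_t)len + 1);
    if (ret == NULL)
        return NULL;

    /* Second pass: a fresh copy, again paired with va_end(). */
    va_copy(ap2, ap);
    vsnprintf(ret, (size_t)len + 1, fmt, ap2);
    va_end(ap2);

    return ret;
}

/* Variadic convenience wrapper; va_start() is paired with va_end() too. */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap;
    char *ret;

    va_start(ap, fmt);
    ret = vformat_alloc(fmt, ap);
    va_end(ap);
    return ret;
}
```

Ending the copy immediately after each vsnprintf() call, rather than once per exit path, keeps the early-return branches from needing their own va_end().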
Re: [Xen-devel] [PATCH v4 31/31] libxl: allow the creation of HVM domains without a device model.
On Fri, Aug 07, 2015 at 12:18:08PM +0200, Roger Pau Monne wrote: Replace the firmware loaded into HVM guests with an OS kernel. Since the HVM builder now uses the PV xc_dom_* set of functions this kernel will be parsed and loaded inside the guest like on PV, but the container is a pure HVM guest. What is your plan in regards to the 'pvh' parameteR? Should it be repurposed so you can use both? Also, if device_model_version is set to none or a device model for the specified domain is not present unconditinally set the nic type to LIBXL_NIC_TYPE_VIF. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com --- Changes since v3: - Add explicit /* fall through */ comments. - Expand libxl__device_nic_setdefault so that it sets the right nic type for HVMlite guests. - Remove stray space in hvm_build_set_params. - Fix the error paths of libxl__domain_firmware. --- docs/man/xl.cfg.pod.5| 5 tools/libxc/xc_dom_x86.c | 7 + tools/libxl/libxl.c | 39 ++--- tools/libxl/libxl_create.c | 16 ++- tools/libxl/libxl_dm.c | 13 - tools/libxl/libxl_dom.c | 68 ++-- tools/libxl/libxl_internal.h | 5 +++- tools/libxl/libxl_types.idl | 1 + tools/libxl/libxl_x86.c | 4 ++- tools/libxl/xl_cmdimpl.c | 2 ++ 10 files changed, 118 insertions(+), 42 deletions(-) diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 index 80e51bb..8cd7726 100644 --- a/docs/man/xl.cfg.pod.5 +++ b/docs/man/xl.cfg.pod.5 @@ -1741,6 +1741,11 @@ This device-model is the default for Linux dom0. Use the device-model based upon the historical Xen fork of Qemu. This device-model is still the default for NetBSD dom0. +=item Bnone + +Don't use any device model. This requires a kernel capable of booting +in this mode. + =back It is recommended to accept the default value for new guests. 
If diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c index 1599de4..d67feb0 100644 --- a/tools/libxc/xc_dom_x86.c +++ b/tools/libxc/xc_dom_x86.c @@ -1269,6 +1269,13 @@ static int meminit_hvm(struct xc_dom_image *dom) if ( nr_pages target_pages ) memflags |= XENMEMF_populate_on_demand; +/* Make sure there's a MMIO hole for the special pages. */ +if ( dom-mmio_size == 0 ) +{ +dom-mmio_size = NR_SPECIAL_PAGES PAGE_SHIFT; +dom-mmio_start = special_pfn(0); +} + if ( dom-nr_vmemranges == 0 ) { /* Build dummy vnode information diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 083f099..a01868a 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -1033,11 +1033,13 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid) } if (type == LIBXL_DOMAIN_TYPE_HVM) { -rc = libxl__domain_resume_device_model(gc, domid); -if (rc 0) { -LOG(ERROR, failed to unpause device model for domain %u:%d, -domid, rc); -goto out; +if (libxl__domain_has_device_model(gc, domid)) { +rc = libxl__domain_resume_device_model(gc, domid); +if (rc 0) { +LOG(ERROR, failed to unpause device model for domain %u:%d, +domid, rc); +goto out; +} } } ret = xc_domain_unpause(ctx-xch, domid); @@ -1567,7 +1569,6 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis) libxl_ctx *ctx = CTX; uint32_t domid = dis-domid; char *dom_path; -char *pid; int rc, dm_present; libxl__ev_child_init(dis-destroyer); @@ -1584,14 +1585,13 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis) switch (libxl__domain_type(gc, domid)) { case LIBXL_DOMAIN_TYPE_HVM: -if (!libxl_get_stubdom_id(CTX, domid)) -dm_present = 1; -else +if (libxl_get_stubdom_id(CTX, domid)) { dm_present = 0; -break; +break; +} +/* fall through */ case LIBXL_DOMAIN_TYPE_PV: -pid = libxl__xs_read(gc, XBT_NULL, libxl__sprintf(gc, /local/domain/%d/image/device-model-pid, domid)); -dm_present = (pid != NULL); +dm_present = libxl__domain_has_device_model(gc, domid); break; case 
LIBXL_DOMAIN_TYPE_INVALID: rc = ERROR_FAIL; @@ -3203,7 +3203,7 @@ out: /**/ int libxl__device_nic_setdefault(libxl__gc *gc, libxl_device_nic *nic, - uint32_t domid) + uint32_t domid, libxl_domain_build_info
[Xen-devel] [qemu-upstream-unstable test] 60605: tolerable FAIL - PUSHED
flight 60605 qemu-upstream-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/60605/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds 11 guest-start fail baseline untested Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-qcow2 9 debian-di-installfail never pass test-armhf-armhf-xl-vhd 9 debian-di-installfail never pass test-armhf-armhf-libvirt-qcow2 9 debian-di-installfail never pass test-armhf-armhf-libvirt-vhd 9 debian-di-installfail never pass test-armhf-armhf-libvirt-raw 9 debian-di-installfail never pass test-amd64-amd64-libvirt-pair 21 guest-migrate/src_host/dst_host fail never pass test-amd64-i386-libvirt-pair 21 guest-migrate/src_host/dst_host fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-i386-libvirt-raw 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-amd64-amd64-libvirt-qcow2 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-i386-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-raw 9 
debian-di-installfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail never pass version targeted for testing: qemuubcf35eec0b621c46dbf0aeb40c6bc06b5d3981aa baseline version: qemuuc4a962ec0c61aa9b860a3635c8424472e6c2cc2c Last test of basis58880 2015-06-24 13:45:58 Z 44 days Failing since 59777 2015-07-20 12:49:32 Z 18 days 11 attempts Testing same since60546 2015-08-03 16:17:04 Z3 days2 attempts People who touched revisions under test: Gerd Hoffmann kra...@redhat.com Jan Beulich jbeul...@suse.com Kevin Wolf kw...@redhat.com Marc-André Lureau marcandre.lur...@gmail.com Paolo Bonzini pbonz...@redhat.com Stefan Hajnoczi stefa...@redhat.com Stefano Stabellini stefano.stabell...@eu.citrix.com jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvops
Re: [Xen-devel] [PATCH v4 31/31] libxl: allow the creation of HVM domains without a device model.
On 07/08/15 at 17:18, Konrad Rzeszutek Wilk wrote:
On Fri, Aug 07, 2015 at 12:18:08PM +0200, Roger Pau Monne wrote:

Replace the firmware loaded into HVM guests with an OS kernel. Since the HVM builder now uses the PV xc_dom_* set of functions, this kernel will be parsed and loaded inside the guest like on PV, but the container is a pure HVM guest.

What is your plan regarding the 'pvh' parameter? Should it be repurposed so you can use both?

I haven't thought about this, TBH. One option could be to make the pvh parameter an alias for builder='hvm' device_model_version='none' with a forced 64-bit entry point, but since PVH has never made it out of the experimental state I'm not sure it's worth keeping.

Roger.
Re: [Xen-devel] [PATCH v4 25/31] xen: allow HVM guests to use XENMEM_memory_map
On Fri, Aug 07, 2015 at 05:44:22PM +0200, Roger Pau Monné wrote:
On 07/08/15 at 14:22, Wei Liu wrote:
On Fri, Aug 07, 2015 at 12:18:02PM +0200, Roger Pau Monne wrote:

Enable this hypercall for HVM guests in order to fetch the e820 memory map in the absence of an emulated BIOS. The memory map is populated and notified to Xen in arch_setup_meminit_hvm.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 tools/libxc/xc_dom_x86.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index b587b12..87bce6e 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -1205,6 +1205,7 @@ static int check_mmio_hole(uint64_t start, uint64_t memsize,
     return 1;
 }

+#define MAX_E820_ENTRIES    128
 static int meminit_hvm(struct xc_dom_image *dom)
 {
     unsigned long i, vmemid, nr_pages = dom->total_pages;
@@ -1225,6 +1226,8 @@ static int meminit_hvm(struct xc_dom_image *dom)
     unsigned int nr_vmemranges, nr_vnodes;
     xc_interface *xch = dom->xch;
     uint32_t domid = dom->guest_domid;
+    struct e820entry entries[MAX_E820_ENTRIES];
+    int e820_index = 0;

     if ( nr_pages > target_pages )
         memflags |= XENMEMF_populate_on_demand;
@@ -1275,6 +1278,13 @@ static int meminit_hvm(struct xc_dom_image *dom)
         vnode_to_pnode = dom->vnode_to_pnode;
     }

+    /* Add one additional memory range to account for the VGA hole */
+    if ( (nr_vmemranges + (dom->emulation ? 1 : 0)) > MAX_E820_ENTRIES )
+    {
+        DOMPRINTF("Too many memory ranges");
+        goto error_out;
+    }
+
     total_pages = 0;
     p2m_size = 0;
     for ( i = 0; i < nr_vmemranges; i++ )
@@ -1363,9 +1373,13 @@ static int meminit_hvm(struct xc_dom_image *dom)
      * Under 2MB mode, we allocate pages in batches of no more than 8MB to
      * ensure that we can be preempted and hence dom0 remains responsive.
      */
-    if ( dom->emulation )
+    if ( dom->emulation ) {
         rc = xc_domain_populate_physmap_exact(
             xch, domid, 0xa0, 0, memflags, &dom->p2m_host[0x00]);
+        entries[e820_index].addr = 0;
+        entries[e820_index].size = 0xa0 << PAGE_SHIFT;
+        entries[e820_index++].type = E820_RAM;
+    }

     stat_normal_pages = 0;
     for ( vmemid = 0; vmemid < nr_vmemranges; vmemid++ )
@@ -1392,6 +1406,12 @@ static int meminit_hvm(struct xc_dom_image *dom)
         else
             cur_pages = vmemranges[vmemid].start >> PAGE_SHIFT;

+        /* Build an e820 map. */
+        entries[e820_index].addr = cur_pages << PAGE_SHIFT;
+        entries[e820_index].size = vmemranges[vmemid].end -
+                                   entries[e820_index].addr;
+        entries[e820_index++].type = E820_RAM;
+
         while ( (rc == 0) && (end_pages > cur_pages) )
         {
             /* Clip count to maximum 1GB extent. */
@@ -1509,6 +1529,13 @@ static int meminit_hvm(struct xc_dom_image *dom)
     DPRINTF("  2MB PAGES: 0x%016lx\n", stat_2mb_pages);
     DPRINTF("  1GB PAGES: 0x%016lx\n", stat_1gb_pages);

+    rc = xc_domain_set_memory_map(xch, domid, entries, e820_index);
+    if ( rc != 0 )
+    {
+        DOMPRINTF("unable to set memory map.");
+        goto error_out;
+    }
+

I think in the RDM series there is already a memory map generated in libxl. You might want to move this to libxl, which is supposed to be the arbitrator of what a guest should look like. What do you think?

I would like to do that; the only problem I see is that libxl doesn't have any notion of the special pages, so it doesn't know the size of the hole it has to make. I guess one solution would be moving the defines at the top of xc_dom_x86.c to a public libxc header so libxl can pick up the size and position of the hole.

But the code above has no notion of special pages either. Can it not just use the E820 map generated in libxl? I think at the time you wrote this patch libxl didn't generate an E820 map for HVM guests, but it does now, so maybe you can drop this patch?

BTW the title is not very accurate, because this patch doesn't involve allowing / disallowing HVM guests to use that hypercall.

Wei.

Roger.
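As an illustration of the e820 construction discussed in this thread, here is a minimal standalone sketch. It does not use libxc: every `sketch_`/`SKETCH_` name is a hypothetical stand-in (for `struct e820entry`, the vmemranges array, `PAGE_SHIFT`, `E820_RAM` and `MAX_E820_ENTRIES`), and the caller is assumed to pass ranges that already start above the VGA hole. It only shows the up-front bounds check and the one-entry-per-range layout, not the actual page population.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the libxc definitions used in the patch. */
#define SKETCH_PAGE_SHIFT   12
#define SKETCH_MAX_ENTRIES  128
#define SKETCH_E820_RAM     1

struct sketch_e820entry { uint64_t addr, size; uint32_t type; };
struct sketch_range { uint64_t start, end; };   /* byte addresses */

/*
 * Fill 'map' with one RAM entry per range, optionally preceded by an
 * entry for the 0xa0 pages below the VGA hole.  Returns the number of
 * entries written, or -1 if they would not fit in the map.
 */
static int sketch_build_e820(const struct sketch_range *ranges, size_t nr,
                             int emulation, struct sketch_e820entry *map)
{
    int idx = 0;
    size_t i;

    /* Check the worst case before writing anything. */
    if ( nr + (emulation ? 1 : 0) > SKETCH_MAX_ENTRIES )
        return -1;

    if ( emulation )
    {
        map[idx].addr = 0;
        map[idx].size = (uint64_t)0xa0 << SKETCH_PAGE_SHIFT;
        map[idx++].type = SKETCH_E820_RAM;
    }

    for ( i = 0; i < nr; i++ )
    {
        map[idx].addr = ranges[i].start;
        map[idx].size = ranges[i].end - ranges[i].start;
        map[idx++].type = SKETCH_E820_RAM;
    }

    return idx;
}
```

Doing the size check once, before the loop, keeps the fill code free of per-entry bounds tests, which is the same choice the patch makes.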
Re: [Xen-devel] [PATCH v4 09/31] libxc: introduce a xc_dom_arch for hvm-3.0-x86_32 guests
On 07/08/15 11:17, Roger Pau Monne wrote:
+static void build_hvm_info(void *hvm_info_page, struct xc_dom_image *dom)
+{
+    struct hvm_info_table *hvm_info = (struct hvm_info_table *)
+        (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
+    uint8_t sum;
+    int i;
+
+    memset(hvm_info_page, 0, PAGE_SIZE);
+
+    /* Fill in the header. */
+    memcpy(hvm_info->signature, "HVM INFO", sizeof(hvm_info->signature));
+    hvm_info->length = sizeof(struct hvm_info_table);
+
+    /* Sensible defaults: these can be overridden by the caller. */
+    hvm_info->apic_mode = 1;
+    hvm_info->nr_vcpus = 1;
+    memset(hvm_info->vcpu_online, 0xff, sizeof(hvm_info->vcpu_online));

I realise you are just copying existing code, so won't hold this against you, but these are not sensible defaults. There is a lot of cleanup which should be done, but this particular series is not the place for it.

 static int start_info_x86_32(struct xc_dom_image *dom)
@@ -682,6 +833,103 @@ static int vcpu_x86_64(struct xc_dom_image *dom)
     return rc;
 }

+static int vcpu_hvm(struct xc_dom_image *dom)
+{
+    struct {
+        struct hvm_save_descriptor header_d;
+        HVM_SAVE_TYPE(HEADER) header;
+        struct hvm_save_descriptor cpu_d;
+        HVM_SAVE_TYPE(CPU) cpu;
+        struct hvm_save_descriptor end_d;
+        HVM_SAVE_TYPE(END) end;
+    } bsp_ctx;
+    uint8_t *full_ctx = NULL;
+    int rc;
+
+    DOMPRINTF_CALLED(dom->xch);
+
+    /*
+     * Get the full HVM context in order to have the header, it is not
+     * possible to get the header with getcontext_partial, and crafting one
+     * from userspace is also not an option since cpuid is trapped and
+     * modified by Xen.
+     */

Eww. Again, not your fault so this patch is ok, but we should see about making things like this easier to do. I have a cunning plan as part of some longterm improvements to migration which might come in handy.

+
+    rc = xc_domain_hvm_getcontext(dom->xch, dom->guest_domid, NULL, 0);
+    if ( rc <= 0 )
+    {
+        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                     "%s: unable to fetch HVM context size (rc=%d)",
+                     __func__, rc);
+        return rc;
+    }
+    full_ctx = malloc(rc);
+    if ( full_ctx == NULL )
+    {
+        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                     "%s: unable to allocate memory for HVM context (rc=%d)",
+                     __func__, rc);
+        return -ENOMEM;
+    }
+
+    memset(full_ctx, 0, rc);

calloc() instead of malloc().

+
+    rc = xc_domain_hvm_getcontext(dom->xch, dom->guest_domid, full_ctx, rc);
+    if ( rc <= 0 )
+    {
+        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                     "%s: unable to fetch HVM context (rc=%d)",
+                     __func__, rc);
+        goto out;
+    }
+
+    /* Copy the header to our partial context. */
+    memset(&bsp_ctx, 0, sizeof(bsp_ctx));
+    memcpy(&bsp_ctx, full_ctx,
+ siz
+    return rc;
+}
+
 /* */
 static int x86_compat(xc_interface *xch, domid_t domid, char *guest_type)
@@ -762,7 +1010,7 @@ static int meminit_pv(struct xc_dom_image *dom)
     if ( dom->superpages )
     {
-        int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
+        int count = dom->total_pages >> SUPERPAGE_2MB_SHIFT;
         xen_pfn_t extents[count];

         dom->p2m_size = dom->total_pages;
@@ -773,9 +1021,9 @@ static int meminit_pv(struct xc_dom_image *dom)

         DOMPRINTF("Populating memory with %d superpages", count);
         for ( pfn = 0; pfn < count; pfn++ )
-            extents[pfn] = pfn << SUPERPAGE_PFN_SHIFT;
+            extents[pfn] = pfn << SUPERPAGE_2MB_SHIFT;
         rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
-                                               count, SUPERPAGE_PFN_SHIFT, 0,
+                                               count, SUPERPAGE_2MB_SHIFT, 0,
                                                extents);
         if ( rc )
             return rc;
@@ -785,7 +1033,7 @@ static int meminit_pv(struct xc_dom_image *dom)
         for ( i = 0; i < count; i++ )
         {
             mfn = extents[i];
-            for ( j = 0; j < SUPERPAGE_NR_PFNS; j++, pfn++ )
+            for ( j = 0; j < SUPERPAGE_2MB_NR_PFNS; j++, pfn++ )
                 dom->p2m_host[pfn] = mfn + j;
         }
     }
@@ -870,7 +1118,7 @@ static int meminit_pv(struct xc_dom_image *dom)

             pages = (vmemranges[i].end - vmemranges[i].start) >> PAGE_SHIFT;
-            super_pages = pages >> SUPERPAGE_PFN_SHIFT;
+            super_pages = pages >> SUPERPAGE_2MB_SHIFT;
             pfn_base = vmemranges[i].start >> PAGE_SHIFT;

             for ( pfn = pfn_base; pfn < pfn_base+pages; pfn++ )
@@ -883,11 +1131,11 @@ static int meminit_pv(struct xc_dom_image *dom)
Re: [Xen-devel] [PATCH v4 13/31] xen/x86: allow disabling the emulated local apic
On 08/07/2015 11:41 AM, Roger Pau Monné wrote:
On 07/08/15 at 16:09, Boris Ostrovsky wrote:
On 08/07/2015 06:17 AM, Roger Pau Monne wrote:

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index a0a97e7..5acb246 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1027,6 +1027,20 @@ static int construct_vmcs(struct vcpu *v)
         ASSERT(!(v->arch.hvm_vmx.exec_control & CPU_BASED_RDTSC_EXITING));
     }

+    if ( !has_vlapic(d) )
+    {
+        /* Disable virtual apics, TPR */
+        v->arch.hvm_vmx.secondary_exec_control &=
+            ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES
+              | SECONDARY_EXEC_APIC_REGISTER_VIRT
+              | SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
+        v->arch.hvm_vmx.exec_control &= ~CPU_BASED_TPR_SHADOW;
+
+        /* In turn, disable posted interrupts. */
+        __vmwrite(PIN_BASED_VM_EXEC_CONTROL,
+                  vmx_pin_based_exec_control & ~PIN_BASED_POSTED_INTERRUPT);
+    }
+
     vmx_update_cpu_exec_control(v);

This is the same code as the one used right above, in the 'if ( is_pvh_domain(d) )' clause. Can you combine the two?

No, it's not the same code. The PVH code disables unrestricted guest and sets the entry of the VM to be in long mode, which is not true for HVMlite.

Right, but the first part of that 'if' statement is the same as the one you are adding (including the comments). So I was suggesting

    if ( is_pvh_domain(d) || !has_vlapic(d) )
    {
        /* Disable virtual apics, TPR */
        v->arch.hvm_vmx.secondary_exec_control &=
            ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES
              | SECONDARY_EXEC_APIC_REGISTER_VIRT
              | SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
        v->arch.hvm_vmx.exec_control &= ~CPU_BASED_TPR_SHADOW;

        /* In turn, disable posted interrupts. */
        __vmwrite(PIN_BASED_VM_EXEC_CONTROL,
                  vmx_pin_based_exec_control & ~PIN_BASED_POSTED_INTERRUPT);
    }

    if ( is_pvh_domain(d) )
    {
        /* Unrestricted guest (real mode for EPT) */
        v->arch.hvm_vmx.secondary_exec_control &=
            ~SECONDARY_EXEC_UNRESTRICTED_GUEST;

        /* Start in 64-bit mode. PVH 32bitfixme. */
        vmentry_ctl |= VM_ENTRY_IA32E_MODE; /* GUEST_EFER.LME/LMA ignored */

        ASSERT(v->arch.hvm_vmx.exec_control &
               CPU_BASED_ACTIVATE_SECONDARY_CONTROLS);
        ASSERT(v->arch.hvm_vmx.exec_control & CPU_BASED_ACTIVATE_MSR_BITMAP);
        ASSERT(!(v->arch.hvm_vmx.exec_control & CPU_BASED_RDTSC_EXITING));
    }

-boris
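The restructuring suggested in this message can be shown in isolation. The sketch below is not Xen code: the `SK_` bit names and `sk_adjust_controls()` are hypothetical stand-ins for the VMX secondary execution controls and for the relevant part of construct_vmcs(). It only demonstrates the pattern of one shared condition for the common bit-clearing, followed by a narrower condition for the PVH-only part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical control bits standing in for the SECONDARY_EXEC_* flags. */
#define SK_VIRT_APIC_ACCESSES  (1u << 0)
#define SK_APIC_REGISTER_VIRT  (1u << 1)
#define SK_VIRT_INTR_DELIVERY  (1u << 2)
#define SK_UNRESTRICTED_GUEST  (1u << 3)

static uint32_t sk_adjust_controls(uint32_t ctl, bool is_pvh, bool has_vlapic)
{
    /* Shared part: PVH guests and guests without a vlapic both lose
     * the APIC virtualisation features. */
    if ( is_pvh || !has_vlapic )
        ctl &= ~(SK_VIRT_APIC_ACCESSES
                 | SK_APIC_REGISTER_VIRT
                 | SK_VIRT_INTR_DELIVERY);

    /* PVH-only part: additionally drop unrestricted guest. */
    if ( is_pvh )
        ctl &= ~SK_UNRESTRICTED_GUEST;

    return ctl;
}
```

With all four bits set on input, a guest that is not PVH and has no vlapic keeps only `SK_UNRESTRICTED_GUEST`, which matches the distinction Roger draws between PVH and HVMlite.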
Re: [Xen-devel] [PATCH for-4.6 0/2] Reduce the quantity of Coverity defects in testidl
On Fri, Aug 07, 2015 at 03:06:22PM +0100, Andrew Cooper wrote:

No functional change, but removes 48 defects which get intermittently reflagged every time the libxl IDL is altered. Sent for 4.6 at Ian Campbell's suggestion.

Andrew Cooper (2):
  tools/libxl: Assert success of memory allocation in testidl
  tools/libxl: Alter the use of rand() in testidl

 tools/libxl/gentest.py | 42 +++---
 1 file changed, 27 insertions(+), 15 deletions(-)

This series only affects the IDL test case, so it cannot cause any regression. Feel free to apply it when convenient.

Wei.

--
1.7.10.4
Re: [Xen-devel] [PATCH 2/2] tools/libxl: Alter the use of rand() in testidl
On Fri, Aug 07, 2015 at 03:06:24PM +0100, Andrew Cooper wrote:

Coverity warns for every occurrence of rand(), which is made worse because each time the IDL changes, some of the calls get re-flagged.

Collect all calls to rand() in a single function, test_rand(), which takes a modulo parameter for convenience. This turns 40 defects currently into 1, which won't get re-flagged when the IDL changes.

In addition, fix the erroneous random choice for libxl_defbool_set(). !!rand() % 1 is unconditionally 0, and even without the % 1 would still be very heavily skewed in one direction.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com

Acked-by: Wei Liu wei.l...@citrix.com
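To make the defbool remark concrete, here is a small standalone sketch. The patch itself generates C from gentest.py and its exact output is not shown in this thread, so the body of `test_rand()` and the two `*_defbool_choice()` helpers below are assumptions written for illustration only.

```c
#include <stdlib.h>

/* Assumed shape of the helper: the single place that calls rand(). */
static int test_rand(unsigned int modulo)
{
    return rand() % modulo;
}

/*
 * The old expression.  Unary '!' binds tighter than '%', so this is
 * (!!rand()) % 1, and anything modulo 1 is 0.
 */
static int old_defbool_choice(void)
{
    return !!rand() % 1;
}

/* One possible replacement (an assumption): a plain 0-or-1 choice. */
static int new_defbool_choice(void)
{
    return test_rand(2);
}
```

Dropping the `% 1` alone would not help either: `!!rand()` is 1 for every non-zero return value of rand(), so the result would be 1 almost always.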
Re: [Xen-devel] help: about arch/x86/xen/multicalls.c
On Thu, Aug 06, 2015 at 02:21:04PM +0800, 黄先生 wrote:

Hi all: my Linux kernel version is 2.6.32-15, and I built the kernel with the Xen compile options. But when my virtual machine starts on AWS, it shows the log below. Does anyone know what to do?

That is so ancient that, I am sorry to say, we can't help you there. Can you update to a newer kernel, please?

Can I avoid compiling arch/x86/xen/multicalls.c? How can I do that?

1 multicall(s) failed: cpu 0
Pid: 1026, comm: cut_demo_li Tainted: P 2.6.32.15-hermes-1 #10
Call Trace:
 [<c1003284>] xen_mc_flush+0xad/0x169
 [<c1005307>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c10037ad>] xen_leave_lazy_mmu+0x8/0xf
 [<c10641c3>] remap_pfn_range+0x311/0x392
 [<c1060f9e>] ? __do_fault+0x420/0x454
 [<c1169c87>] mmap_mem+0x5b/0x72
 [<c1067739>] mmap_region+0x1d2/0x355
 [<c1067bc9>] do_mmap_pgoff+0x232/0x294
 [<c105def4>] sys_mmap_pgoff+0x7e/0xe8
 [<c1006ed8>] sysenter_do_call+0x12/0x2c
------------[ cut here ]------------
WARNING: at arch/x86/xen/multicalls.c:182 xen_mc_flush+0x161/0x169()
Modules linked in: ncfm_load_lic(P) ncfm_syscfg(P) ncfm_ev(P) ncfm_igb ncfm_ixgbe ncfm_e1000 ncfm_e1000e ncfm_e100 ncfm_r8168 nedrv(P) ncsfkrnl(P)
Pid: 1026, comm: cut_demo_li Tainted: P 2.6.32.15-hermes-1 #10
Call Trace:
 [<c1003338>] ? xen_mc_flush+0x161/0x169
 [<c102bee2>] warn_slowpath_common+0x60/0x77
 [<c102bf06>] warn_slowpath_null+0xd/0x10
 [<c1003338>] xen_mc_flush+0x161/0x169
 [<c1005307>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c10037ad>] xen_leave_lazy_mmu+0x8/0xf
 [<c10641c3>] remap_pfn_range+0x311/0x392
 [<c1060f9e>] ? __do_fault+0x420/0x454
 [<c1169c87>] mmap_mem+0x5b/0x72
 [<c1067739>] mmap_region+0x1d2/0x355
 [<c1067bc9>] do_mmap_pgoff+0x232/0x294
 [<c105def4>] sys_mmap_pgoff+0x7e/0xe8
 [<c1006ed8>] sysenter_do_call+0x12/0x2c
---[ end trace 96d9d060aff577eb ]---
1 multicall(s) failed: cpu 0
Pid: 1026, comm: cut_demo_li Tainted: PW 2.6.32.15-hermes-1 #10
Call Trace:
 [<c1003284>] xen_mc_flush+0xad/0x169
 [<c1005307>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c10037ad>] xen_leave_lazy_mmu+0x8/0xf
 [<c10641c3>] remap_pfn_range+0x311/0x392
 [<c1169c87>] mmap_mem+0x5b/0x72
 [<c1067739>] mmap_region+0x1d2/0x355
 [<c1067bc9>] do_mmap_pgoff+0x232/0x294
 [<c105def4>] sys_mmap_pgoff+0x7e/0xe8
 [<c1006ed8>] sysenter_do_call+0x12/0x2c
------------[ cut here ]------------
WARNING: at arch/x86/xen/multicalls.c:182 xen_mc_flush+0x161/0x169()
Modules linked in: ncfm_load_lic(P) ncfm_syscfg(P) ncfm_ev(P) ncfm_igb ncfm_ixgbe ncfm_e1000 ncfm_e1000e ncfm_e100 ncfm_r8168 nedrv(P) ncsfkrnl(P)
Pid: 1026, comm: cut_demo_li Tainted: PW 2.6.32.15-hermes-1 #10
Call Trace:
 [<c1003338>] ? xen_mc_flush+0x161/0x169
 [<c102bee2>] warn_slowpath_common+0x60/0x77
 [<c102bf06>] warn_slowpath_null+0xd/0x10
 [<c1003338>] xen_mc_flush+0x161/0x169
 [<c1005307>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c10037ad>] xen_leave_lazy_mmu+0x8/0xf
 [<c10641c3>] remap_pfn_range+0x311/0x392
 [<c1169c87>] mmap_mem+0x5b/0x72
 [<c1067739>] mmap_region+0x1d2/0x355
 [<c1067bc9>] do_mmap_pgoff+0x232/0x294
 [<c105def4>] sys_mmap_pgoff+0x7e/0xe8
 [<c1006ed8>] sysenter_do_call+0x12/0x2c
---[ end trace 96d9d060aff577ec ]---
BUG: unable to handle kernel paging request at d42da180
IP: [<c1003c9d>] xen_set_pte+0x14/0x1b
*pdpt = 14055027 *pde = 00173067 *pte = 8000142da061
Oops: 0003 [#1] SMP
last sysfs file:
Modules linked in: ncfm_load_lic(P) ncfm_syscfg(P) ncfm_ev(P) ncfm_igb ncfm_ixgbe ncfm_e1000 ncfm_e1000e ncfm_e100 ncfm_r8168 nedrv(P) ncsfkrnl(P)
Pid: 1026, comm: cut_demo_li Tainted: PW (2.6.32.15-hermes-1 #10)
EIP: 0061:[<c1003c9d>] EFLAGS: 00010246 CPU: 0
EIP is at xen_set_pte+0x14/0x1b
EAX: EBX: 2127a227 ECX: 000c EDX:
ESI: 000c EDI: d42da180 EBP: d4145ee8 ESP: d4145edc
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
Process cut_demo_li (pid: 1026, ti=d4144000 task=d403c080 task.ti=d4144000)
Stack:
 b7830df4 d4145f00 c1004ae3 d42da180 0 d4145f84 c1062fc6 2127a227 000c d40f d4145f50 b7830df4 0 d414a6fc c3bc9e40 2127a227 000c 0180 d41c5de0
Call Trace:
 [<c1004ae3>] ? xen_set_pte_at+0xbd/0xc3
 [<c1062fc6>] ? handle_mm_fault+0x476/0xb15
 [<c104f600>] ? handle_level_irq+0xb3/0xbb
 [<c1009579>] ? handle_irq+0x3b/0x4a
 [<c101e24a>] ? do_page_fault+0x17f/0x1e3
 [<c101e0cb>] ? do_page_fault+0x0/0x1e3
 [<c12de08e>] ? error_code+0x66/0x6c
 [<c101e0cb>] ? do_page_fault+0x0/0x1e3
Code: 00 c7 42 10 00 00 00 00 c7 42 14 f0 7f 00 00 83 c4 0c 5b 5e 5f 5d c3 55 89 e5 57 89 c7 56 89 ce 53 89 d3 e8 3d 90 01 00 89 77 04 89 1f 5b 5e 5f 5d c3 55 89 e5 56 89 ce 53 89 c3 83 38 00 0f 84
EIP: [<c1003c9d>] xen_set_pte+0x14/0x1b SS:ESP 0069:d4145edc
CR2: d42da180
---[ end trace 96d9d060aff577ed ]---
note: cut_demo_li[1026] exited with preempt_count 1
Re: [Xen-devel] [PATCH v4 08/31] libxc: rework BSP initialization
On 07/08/15 11:17, Roger Pau Monne wrote:

Place the calls to xc_vcpu_setcontext and the allocation of the hypercall buffer into the arch-specific vcpu hooks. This is needed for the next patch, so x86 HVM guests can initialize the BSP using XEN_DOMCTL_sethvmcontext instead of XEN_DOMCTL_setvcpucontext. This patch should not introduce any functional change.

Signed-off-by: Roger Pau Monné roger@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com

Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
Re: [Xen-devel] [PATCH v7] run QEMU as non-root
On Thu, Jul 23, 2015 at 06:08:02PM +0100, Stefano Stabellini wrote:
[...]
+For security reasons, libxl tries to pass a non-root username to QEMU as
+argument. During initialization QEMU calls setuid and setgid with the
+user ID and the group ID of the user passed as argument.
+Libxl looks for the following users in this order:
+
+1) a user named xen-qemuuser-domid$domid,
+Where $domid is the domid of the domain being created.
+This requires the reservation of 65535 uids from xen-qemuuser-domid1
+to xen-qemuuser-domid65535. To use this mechanism, you might want to
+create a large number of users at installation time. For example:
+
+for ((i=1; i<65536; i++))
+do
+    adduser --no-create-home --system xen-qemuuser-domid$i
+done
+
+You might want to consider passing --group to adduser to create a new
+group for each new user.
+
+
+2) a user named xen-qemuuser-shared
+As a fall back if both 1) and 2) fail, libxl will use a single user for
                               ^^ This is 2)
+all QEMU instances. The user is named xen-qemuuser-shared. This is
+less secure but still better than running QEMU as root. Using this is as
+simple as creating just one more user on your host:
+
+adduser --no-create-home --system xen-qemuuser-shared
+
+
+3) root
+As a last resort, libxl will start QEMU as root.
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index efc0617..3f4283f 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -192,6 +192,12 @@
  * is not present, instead of ERROR_INVAL.
  */
 #define LIBXL_HAVE_ERROR_DOMAIN_NOTFOUND 1
+
+/* libxl_domain_build_info has device_model_user to specify the user to
+ * run the device model with. See docs/misc/qemu-deprivilege.txt.
+ */
+#define LIBXL_HAVE_DEVICE_MODEL_USER 1
+
 /*
  * libxl ABI compatibility
  *
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 0c6408d..24c43df 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -19,6 +19,8 @@

 #include "libxl_internal.h"
 #include <xen/hvm/e820.h>
+#include <sys/types.h>
+#include <pwd.h>

 static const char *libxl_tapif_script(libxl__gc *gc)
 {
@@ -418,6 +420,33 @@ static char *dm_spice_options(libxl__gc *gc,
     return opt;
 }

+/* return 1 if the user was found, 0 if it was not, -1 on error */
+static int libxl__dm_runas_helper(libxl__gc *gc, char *username)

const char *

+{
+    struct passwd pwd, *user = NULL;
+    char *buf = NULL;
+    long buf_size;
+
+    buf_size = sysconf(_SC_GETPW_R_SIZE_MAX);
+    if (buf_size < 0) {
+        LOGE(ERROR, "sysconf(_SC_GETPW_R_SIZE_MAX) returned error %ld",
+                buf_size);
+        return -1;
+    }
+
+retry:
+    buf = libxl__realloc(gc, buf, buf_size);

This should be libxl__alloc and placed out of the loop?

+    errno = 0;
+    getpwnam_r(username, &pwd, buf, buf_size, &user);
+    if (user != NULL)
+        return 1;
+    if (errno == ERANGE) {
+        buf_size += 128;
+        goto retry;
+    }

Please use for / while to loop. Also you might want to save and restore errno.

+    return 0;
+}
+
 static char ** libxl__build_device_model_args_new(libxl__gc *gc,
                                         const char *dm, int guest_domid,
                                         const libxl_domain_config *guest_config,
@@ -439,6 +468,7 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
     int i, connection, devid;
     uint64_t ram_size;
     const char *path, *chardev;
+    char *user = NULL;

     dm_args = flexarray_make(gc, 16, 1);

@@ -878,6 +908,31 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
         default:
             break;
         }
+
+        if (b_info->device_model_user) {
+            user = b_info->device_model_user;
+            goto end_search;
+        }
+
+        user = libxl__sprintf(gc, "%s%d", LIBXL_QEMU_USER_BASE, guest_domid);
+        if (libxl__dm_runas_helper(gc, user) > 0)
+            goto end_search;

You haven't checked if libxl__dm_runas_helper returns -1. In that case we should bail?

+
+        user = LIBXL_QEMU_USER_SHARED;
+        if (libxl__dm_runas_helper(gc, user) > 0) {
+            LOG(WARN, "Could not find user %s%d, falling back to %s",
+                    LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED);
+            goto end_search;
+        }
+
+        user = NULL;
+        LOG(WARN, "Could not find user %s, starting QEMU as root", LIBXL_QEMU_USER_SHARED);
+

Line too long.

Wei.
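For reference, the review comments on the lookup helper (loop with for/while rather than goto, and save and restore errno) could be addressed along the following lines. This is only a sketch outside libxl: `sketch_user_exists()` is a hypothetical name, plain realloc()/free() stand in for the libxl gc allocator, and falling back to 1024 bytes when sysconf() gives no hint is an assumption, not what the patch does. Note that POSIX getpwnam_r() reports ERANGE through its return value, so the sketch tests that rather than errno.

```c
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return 1 if the user was found, 0 if it was not, -1 on error.
 * The caller's errno is preserved.
 */
static int sketch_user_exists(const char *username)
{
    struct passwd pwd, *user = NULL;
    long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
    size_t buf_size = hint > 0 ? (size_t)hint : 1024; /* assumed fallback */
    char *buf = NULL;
    int saved_errno = errno;
    int ret;

    for ( ; ; )
    {
        char *tmp = realloc(buf, buf_size);

        if (tmp == NULL) {
            free(buf);
            errno = saved_errno;
            return -1;
        }
        buf = tmp;

        ret = getpwnam_r(username, &pwd, buf, buf_size, &user);
        if (ret != ERANGE)
            break;

        buf_size *= 2;      /* buffer too small: grow and try again */
    }

    free(buf);
    errno = saved_errno;

    if (user != NULL)
        return 1;
    return ret == 0 ? 0 : -1;
}
```

A caller following the search order in the documentation above would treat 1 as "use this user", 0 as "try the next candidate", and -1 as a reason to bail out, which is the case the last review comment asks about.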