Hi, I've made some changes to the virtualization support. This should work coupled with some changes in the vmm and a kernel cmdline parameter. As of now, I have linux working up to busybox while using the x2apic.
The following changes since commit 548361ec318c5381bc13eb7d9fa7e6f59d7081f1: Created a new Makefrag-user-app helper for building binaries (2015-12-21 12:56:49 -0500) are available in the git repository at: [email protected]:GanShun/akaros.git 8a85d7e19c6f0024a9733d7a0bcee08d687a23a6 for you to fetch changes up to 8a85d7e19c6f0024a9733d7a0bcee08d687a23a6: Virtualization changes to handle X2APIC mode. (2016-01-12 09:25:05 -0800) ---------------------------------------------------------------- GanShun (5): Removed lapic_set_id and lapic_set_logid functions Enabling X2APIC Added IOMMU initalization code and init_irte functionality Modified IOAPIC to emit interrupts in remappable format and init irte Virtualization changes to handle X2APIC mode. kern/arch/x86/Kbuild | 1 + kern/arch/x86/apic.c | 31 +++++++------- kern/arch/x86/apic.h | 116 +++++++++++++++++++++------------------------------- kern/arch/x86/apic9.c | 161 ++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------ kern/arch/x86/ioapic.c | 20 +++++++-- kern/arch/x86/ioapic.h | 3 +- kern/arch/x86/iommu.c | 177 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ kern/arch/x86/iommu.h | 105 +++++++++++++++++++++++++++++++++++++++++++++++ kern/arch/x86/mp.c | 3 +- kern/arch/x86/perfmon.c | 2 +- kern/arch/x86/pmap64.c | 4 +- kern/arch/x86/ros/mmu64.h | 5 +-- kern/arch/x86/smp.c | 1 - kern/arch/x86/smp_boot.c | 2 - kern/arch/x86/time.c | 4 +- kern/arch/x86/trap.c | 7 ++++ kern/arch/x86/trapentry64.S | 3 +- kern/arch/x86/vmm/intel/vmx.c | 19 +++++++-- kern/arch/x86/vmm/intel/vmx.h | 5 +++ kern/src/ktest/pb_ktests.c | 1 - tests/vmm/vmrunkernel.c | 43 +++++++++++++------- 21 files changed, 551 insertions(+), 162 deletions(-) create mode 100644 kern/arch/x86/iommu.c create mode 100644 kern/arch/x86/iommu.h On Thursday, December 17, 2015 at 4:01:12 PM UTC-8, Gan Shun wrote: > > Hi, These are my new changes. Please take a look and let me know if you > need anything else changed! This is on branch iommu-g > > Thanks! > The following changes since commit > 080cf1ff0e7510b551f6a29f12aa25b6cc53f16f: > > Added new kmalloc flag KMALLOC_ERROR (2015-12-17 15:24:44 -0500) > > are available in the git repository at: > > [email protected]:GanShun/akaros.git > > for you to fetch changes up to 1ed8ffc6356640b8c1b8ba5519623a0180c892ff: > > Virtualization changes to handle X2APIC mode. (2015-12-17 15:50:47 -0800) > > ---------------------------------------------------------------- > GanShun (5): > Removed lapic_set_id and lapic_set_logid functions > Enabling X2APIC > Added IOMMU initalization code and init_irte functionality > Modified IOAPIC to emit interrupts in remappable format and init irte > Virtualization changes to handle X2APIC mode. > > kern/arch/x86/Kbuild | 1 + > kern/arch/x86/apic.c | 31 +++++++++++----------- > kern/arch/x86/apic.h | 116 > +++++++++++++++++++++++++++++++------------------------------------------------ > kern/arch/x86/apic9.c | 161 > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------- > kern/arch/x86/ioapic.c | 20 +++++++++++--- > kern/arch/x86/ioapic.h | 3 ++- > kern/arch/x86/iommu.c | 177 > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > kern/arch/x86/iommu.h | 105 > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > kern/arch/x86/mp.c | 3 ++- > kern/arch/x86/perfmon.c | 2 +- > kern/arch/x86/pmap64.c | 4 +-- > kern/arch/x86/ros/mmu64.h | 5 ++-- > kern/arch/x86/smp.c | 1 - > kern/arch/x86/smp_boot.c | 2 -- > kern/arch/x86/time.c | 4 +-- > kern/arch/x86/trap.c | 7 +++++ > kern/arch/x86/trapentry64.S | 3 ++- > kern/arch/x86/vmm/intel/vmx.c | 11 ++++++-- > kern/arch/x86/vmm/intel/vmx.h | 5 ++++ > kern/src/ktest/pb_ktests.c | 1 - > tests/vmm/vmrunkernel.c | 43 +++++++++++++++++++----------- > 21 files changed, 544 insertions(+), 161 deletions(-) > create mode 100644 kern/arch/x86/iommu.c > create mode 100644 kern/arch/x86/iommu.h > > On Tue, Dec 15, 2015 at 3:18 PM Barret Rhoden <[email protected]> > wrote: > >> Hi - >> >> On 2015-12-09 at 17:40 Gan Shun <[email protected]> wrote: >> > [email protected]:GanShun/akaros.git >> > c0c444f9646147a530672546d5cdbaf28e357e70 >> >> Here's my comments on the first patch. I didn't get to the others >> yet, but figured I'd give you some early feedback. >> >> From glancing at the remaining commits, it looks like there's a bunch of >> things in "Swapped IPI" that fix up some of these things already, and >> the functionality of "Switched from APIC to X2APIC mode" was actually >> split across those two commits, with three IOMMU commits in the middle. >> >> But that last commit also has a bunch of IOMMU things too. (based on >> a quick glance) >> >> I'd try: >> >> split that last commit up into things related to the x2apic and things >> that aren't >> >> $ git rebase -i >> >> reorder the commits so that "Swapped IPI"'s x2apic stuff is right after >> "Switched from" >> >> Then probably squash those into one commit, remove all of the debugging >> code, and then depending on how the commit looks, break it back up into >> bite-sized chunks. >> >> For instance, removing all of the lapic_set_* functions could be a >> commit. Then do the main xAPIC->x2APIC commit. Then making the x2APIC >> changes for the guest could be another commit. >> >> Before doing any of that, make a branch to save your work: >> >> $ git branch temp-backup >> >> (or just use your github remote. =) ) >> >> Make sure you aren't on temp-backup. Then if you do horrendous things >> with git rebase, you don't lose your work. >> >> Ideally, we'd have a clean patchset that "tells the story" of how we >> modified Akaros from using the xAPIC to the x2APIC. >> >> If you run into any issues, let me know, and I'll be glad to help. >> >> Barret >> >> p.s. : here's my comments on patch 1. >> >> > From be8dbe315e02aae8662a3e1f45944d3a9c65f9ae Mon Sep 17 00:00:00 2001 >> > From: GanShun <[email protected]> >> > Date: Mon, 9 Nov 2015 13:04:34 -0800 >> > Subject: Switched from APIC to X2APIC mode >> >> > --- a/kern/arch/x86/apic.c >> > +++ b/kern/arch/x86/apic.c >> > @@ -15,6 +15,8 @@ >> > #include <bitmask.h> >> > #include <arch/topology.h> >> > >> > +//uint32_t apicrget(uint64_t r); >> > +//void apicrput(uint64_t r, uint32_t data); >> >> Can remove these. >> >> > @@ -45,11 +47,11 @@ void lapic_print_isr(void) >> > printk("LAPIC ISR on core %d\n--------------\n", core_id()); >> > for (int i = 7; i >= 0; i--) >> > printk("%3d-%3d: %p\n", (i + 1) * 32 - 1, i * 32, >> > - *(uint32_t*)(LAPIC_ISR + i * 0x10)); >> > + apicrget(LAPIC_ISR + i * 0x10)); >> >> I think any usage of 0x10 is from the xAPIC and needs to be converted. >> >> >> > --- a/kern/arch/x86/apic.h >> > +++ b/kern/arch/x86/apic.h >> >> > static inline uint32_t lapic_get_id(void) >> > { >> > - return read_mmreg32(LAPIC_ID) >> 24; >> > + //return apicrget(LAPIC_ID) >> 24; >> > + return apicrget(LAPIC_ID); >> > } >> >> What's the deal here? I thing the >> 24 was for the xAPIC, and we just >> read it directly for the x2. >> >> So since we're changing over to the x2 in this commit, let's remove the >> // line, and just do the x2APIC thing. >> >> > static inline void lapic_set_id(uint8_t id) >> > { >> > - write_mmreg32(LAPIC_ID, id << 24); >> > + apicrput(LAPIC_ID, id << 24); >> > } >> >> I think we can't do this in the x2APIC. In which case, we should remove >> the function. >> >> > static inline uint8_t lapic_get_logid(void) >> > { >> > - return read_mmreg32(LAPIC_LOGICAL_ID) >> 24; >> > + return apicrget(LAPIC_LOGICAL_ID) >> 24; >> > } >> >> Just apicrget, without the >> 24, right? >> >> >> > static inline void lapic_set_logid(uint8_t id) >> > { >> > - write_mmreg32(LAPIC_LOGICAL_ID, id << 24); >> > + apicrput(LAPIC_LOGICAL_ID, id << 24); >> > } >> >> We can't write this, so we need to remove this function. I think we >> only had one non-important user of this, so we can remove it. >> >> As a note on style, if there were a lot of users of set_id and >> set_logid, you could have made a separate commit that does nothing but >> removes those functions and fixes up whoever was using it. Not a big >> deal in this case, but for bigger commits, that'll keep things more >> understandable. >> >> > @@ -159,56 +164,59 @@ static inline void lapic_disable(void) >> > */ >> > static inline void lapic_wait_to_send(void) >> > { >> > - while (read_mmreg32(LAPIC_IPI_ICR_LOWER) & 0x1000) >> > + // we should not need this function as we are now using x2apic. >> > + return; >> > + >> > + while (apicrget(LAPIC_IPI_ICR_LOWER) & 0x1000) >> > __cpu_relax(); >> > } >> >> Yeah, so let's remove this function too (and all callers of it), >> instead of just returning early. >> >> > static inline void send_init_ipi(void) >> > { >> > - write_mmreg32(LAPIC_IPI_ICR_LOWER, 0x000c4500); >> > + apicrput(LAPIC_IPI_ICR_LOWER, 0x000c4500); >> >> I think calling the register LAPIC_IPI_ICR_LOWER is a bad idea now, >> since there is only one 64 bit register for the entire LAPIC_IPI_ICR. >> (Same comments for all of these ipi helpers). >> >> Also, apicrput() only takes a u32 right now, and the ICR takes a full 64 >> bits (I think). So that function's signature will need to change. >> >> > static inline void __send_ipi(uint8_t hw_coreid, uint8_t vector) >> > { >> > - write_mmreg32(LAPIC_IPI_ICR_UPPER, hw_coreid << 24); >> > - write_mmreg32(LAPIC_IPI_ICR_LOWER, 0x00004000 | vector); >> > + apicrput(LAPIC_IPI_ICR_UPPER, hw_coreid << 24); >> >> Won't that fault when we write to MSR 831? Also, we probably shouldn't >> shift the hw_coreid by 24 anymore. >> >> > + apicrput(LAPIC_IPI_ICR_LOWER, 0x00004000 | vector); >> > lapic_wait_to_send(); >> > } >> >> > diff --git a/kern/arch/x86/apic9.c b/kern/arch/x86/apic9.c >> > index d03d0d637c32..5d8f119bbeee 100644 >> > --- a/kern/arch/x86/apic9.c >> > +++ b/kern/arch/x86/apic9.c >> >> > -static uint32_t apicrget(int r) >> > +uint32_t apicrget(uint64_t r) >> > { >> > + //printk("APICRGET BEFORE SUBTRACTION: %llx\n", r); >> > + if (r >= 0x400) { >> > + r = r - LAPIC_BASE; >> > + // printk("APICRGET AFTER SUBTRACTION: %llx\n", r); >> > + } >> >> I guess this stuff is so that we apicrget can work for both xAPIC and >> x2APIC? We might as well just do the whole switch over and cut out all >> of this nastiness. >> >> Also, please try to avoid adding one-off debug printks to the commit >> history. Do a grep for printk or X2APIC in these diffs to see where >> you're adding them. >> >> > -static void apicrput(int r, uint32_t data) >> > +void apicrput(uint64_t r, uint32_t data) >> > { >> >> > printd("apicrput: %s = %p\n", apicregnames[r], data); >> > - write_mmreg32(apicbase + r, data); >> > + if (r == 0x310) { >> > + temp_data = read_msr(X2APICBase + 0x300/16); >> > + temp_data &= 0xFFFFFFFF; >> > + temp_data |= (uint64_t)data << 32; >> > + } else if (r == 0x300) { >> > + //temp_data = read_msr(X2APICBase + 0x300/16); >> > + //temp_data &= ~0xFFFFFFFF; >> > + temp_data = 0xFFFFFFFF00000000ULL; >> > + temp_data |= data; >> > + } else >> > + temp_data |= data; >> > + write_msr(X2APICBase + r/16, temp_data); >> >> This whole bit could be cleaned up too with a full switch to x2. >> >> >> > @@ -214,15 +263,27 @@ int apiconline(void) >> > uint64_t tsc; >> > uint32_t dfr, ver; >> > int apicno, nlvt; >> > + uint64_t msr_val; >> > + >> > + //X2APIC INIT >> > >> > + //printk("BEFORE X2APIC"); >> > + msr_val = read_msr(IA32_APIC_BASE); >> > + write_msr(IA32_APIC_BASE, msr_val | (3<<10)); >> > + //printk("AFTER X2APIC"); >> > + //printk("APIC ID: %d", apicrget(Id)); >> > if (!apicbase) { >> > printk("No apicbase on HW core %d!!\n", hw_core_id()); >> > return 0; >> > } >> > - if ((apicno = ((apicrget(Id) >> 24) & 0xff)) >= Napic) { >> > + apicno = (apicrget(Id) & 0xff); >> >> Don't mask that with 0xff. That was a leftover from the xAPIC, where we >> needed to select 8 bits from the ID register. Now, our apicno should be >> just the ID. Actually, we can use the lapic_get_id() helper. The >> reason we weren't before is that this is Plan 9 code that wasn't fully >> integrated into Akaros. >> >> > diff --git a/kern/arch/x86/smp_boot.c b/kern/arch/x86/smp_boot.c >> > index 7f6f4d9bf1fb..9660ecd36f27 100644 >> > --- a/kern/arch/x86/smp_boot.c >> > +++ b/kern/arch/x86/smp_boot.c >> > @@ -156,6 +156,7 @@ void smp_boot(void) >> > */ >> > udelay(500000); >> > >> > + printk("I DIDN'T STALL HERE"); >> >> > + printk("I DIDN'T STALL HERE the sequel"); >> >> Coming soon to a theatre near you! I'd watch that sequel. =) >> >> >> > // set a default logical id for now >> > - lapic_set_logid(lapic_get_id()); >> > + //lapic_set_logid(lapic_get_id()); >> >> Can just remove this. >> >> > --- a/kern/arch/x86/trap.c >> > +++ b/kern/arch/x86/trap.c >> > @@ -491,6 +491,8 @@ static bool vector_is_irq(int apic_vec) >> > * are all mapped up at PIC1_OFFSET for the cpu / irq_handler. */ >> > void handle_irq(struct hw_trapframe *hw_tf) >> > { >> > + if (hw_tf->tf_trapno != 240) >> > + printk("IRQ on :%d\n", hw_tf->tf_trapno); >> >> A little farther down in handle_irq, we have a set of checks you can >> turn on or off for debugging instead of doing this. Those checks also >> compile out since they use printd(). >> >> >> > diff --git a/kern/arch/x86/vmm/intel/vmx.c >> b/kern/arch/x86/vmm/intel/vmx.c >> > index bb80f15c2b0e..5c1b61b0d946 100644 >> > --- a/kern/arch/x86/vmm/intel/vmx.c >> > +++ b/kern/arch/x86/vmm/intel/vmx.c >> > @@ -557,16 +557,18 @@ static const struct vmxec cb2ec = { >> > .truemsr = MSR_IA32_VMX_PROCBASED_CTLS2, >> > >> > .must_be_1 = (SECONDARY_EXEC_ENABLE_EPT | >> > - SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES | >> > + //SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES | >> > SECONDARY_EXEC_APIC_REGISTER_VIRT | >> > SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY | >> > + SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE | >> > SECONDARY_EXEC_WBINVD_EXITING), >> >> If this is a permanent change (and I think it is), then let's not >> comment out SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES. So I guess this is >> also switching the guest over to using the x2APIC, which sounds >> reasonable. >> >> > .must_be_0 = ( >> > //SECONDARY_EXEC_APIC_REGISTER_VIRT | >> > //SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY | >> > SECONDARY_EXEC_DESCRIPTOR_EXITING | >> > - SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE | >> > + //SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE | >> > + SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES | >> >> Same here with .must_be_0. >> >> > @@ -2222,6 +2224,9 @@ int intel_vmm_init(void) { >> > /* FIXME: do we need APIC virtualization (flexpriority?) */ >> > >> > memset(msr_bitmap, 0xff, PAGE_SIZE); >> > + memset((void *)msr_bitmap+0x100, 0x0, 0x100/8); >> > + memset((void *)msr_bitmap+0x100+2048, 0x0, 0x100/8); >> >> What is this doing? The 0x100 range seems like magic. Are we letting >> the guest have unrestricted access to those MSRs? At the least, we >> could use a comment here so we know what we're doing. >> >> If there are a lot of guest changes associated with the x2APIC, we can >> do all of those in another commit. (see my previous notes on how to >> split a commit). >> >> >> >> >> -- >> You received this message because you are subscribed to the Google Groups >> "Akaros" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> To post to this group, send email to [email protected]. >> For more options, visit https://groups.google.com/d/optout. >> > -- You received this message because you are subscribed to the Google Groups "Akaros" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. For more options, visit https://groups.google.com/d/optout.
