On 18-5-2014 16:44, Anish wrote:
> Thanks for testing it.
>
>> Your patch applied cleanly to the working copy of the "bhyve_svm"-project.
>> I was then able to merge with HEAD
>> (using "theirs-full" on one file) and compile the kernel. So, to me it
>> looks OK to commit.
>
> Yes, that's correct. You have to retain changes in sys/amd64/vmm/amd/amdv.c
> from bhyve_svm branch.
>
>> Unfortunately, I am still not able to boot CentOS 6.5 using my Phenom
>> 1055T. It produces 200% load on the
>> host CPU, and the emulated machine generates endlessly:
>
> It's 200% load because the guest has 2 vcpus. It gets stuck in a loop even
> with a single processor (1 vcpu) after PCI probing [debug messages with linux
> .....earlyprintk=serial debug]
> 
> [    3.505631] pnp: PnP ACPI: found 5 devices
> [    3.506417] ACPI: bus type PNP unregistered
> [    3.635781] pci 0000:00:06.0: no compatible bridge window for [mem 0xfe440000-0xfe45ffff pref]
> [    3.637555] pci 0000:00:06.0: BAR 6: assigned [mem 0x80000000-0x8001ffff pref]
> [    3.638986] pci 0000:00:01.0: BAR 6: assigned [mem 0x80020000-0x800207ff pref]
> [    3.640416] pci 0000:00:04.0: BAR 6: assigned [mem 0x80020800-0x80020fff pref]
> [    3.641864] pci 0000:00:05.0: BAR 6: assigned [mem 0x80021000-0x800217ff pref]
> [    3.643259] pci 0000:00:00.0: not setting up bridge for bus 0000:01
> [    3.644550] pci_bus 0000:00: resource 4 [io  0x0000-0x0cf7]
> [    3.645670] pci_bus 0000:00: resource 5 [io  0x0d00-0xffff]
> [    3.646795] pci_bus 0000:00: resource 6 [mem 0x80000000-0xdfffffff]
> [    3.648031] pci_bus 0000:00: resource 7 [mem 0xd000000000-0xfcffffffff]
> [    3.650970] NET: Registered protocol family 2
> [    3.661491] TCP established hash table entries: 16384 (order: 6, 262144 bytes)
> [    3.671854] TCP bind hash table entries: 16384 (order: 6, 262144 bytes)
> [    3.681116] TCP: Hash tables configured (established 16384 bind 16384)
> [    3.683335] TCP: reno registered
> [    3.684243] UDP hash table entries: 1024 (order: 3, 32768 bytes)
> [    3.686484] UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes)
> [    3.691987] NET: Registered protocol family 1
> [    3.693382] pci 0000:00:01.0: Activating ISA DMA hang workarounds
> [    3.695214] PCI: CLS 64 bytes, default 64
> [    3.698176] Trying to unpack rootfs image as initramfs...
> [   30.595279] BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:1]
> [   30.596366] Modules linked in:
>
>> Additionally, it produces a lot of MSR requests:
>
> Yes, on AMD, Linux touches more MSRs (AMD-specific, address 0xC00XXXX)
> compared to Intel.
> 
> Thanks and regards,
> Anish
> 
> 
> On Fri, May 16, 2014 at 2:17 PM, Nils Beyer <n...@renzel.net> wrote:
> 
>> Hi Anish,
>>
>> Anish wrote:
>>> If the patches look good to you, we can submit them. I have been testing
>>> on a Phenom box which lacks some of the newer SVM features.
>>
>> Your patch applied cleanly to the working copy of the "bhyve_svm"-project.
>> I was then able to merge with HEAD
>> (using "theirs-full" on one file) and compile the kernel. So, to me it
>> looks OK to commit.
>>
>> Unfortunately, I am still not able to boot CentOS 6.5 using my Phenom
>> 1055T. It produces 200% load on the
>> host CPU, and the emulated machine generates endlessly:
>>
>> =======================================================================================
>> BUG: soft lockup - CPU#0 stuck for 67s! [swapper:1]
>> Modules linked in:
>> CPU 0
>> Modules linked in:
>>
>> Pid: 1, comm: swapper Not tainted 2.6.32-431.el6.x86_64 #1   BHYVE

And more...


>> I'd love to see CentOS perfectly running on my Phenom as it runs perfectly
>> on an Intel i3.
>>
>> If you need any further information/debug, please let me know...

I've been trying to get Ubuntu, CentOS and the like to run on AMD CPUs, and
I'm currently compiling a kernel, but it is dirt slow.

Attached is a patch I use to debug more of the MSRs; it also contains what
I do to get the TSC running. It helps, but things are still slow as molasses.
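
To make the read-only versus read/write mask in the patch a bit more
concrete, here is a minimal standalone sketch of the idea. It assumes two
intercept bits per MSR (read bit first, then write bit), packed four MSRs
to a byte. toy_msr_index() and toy_msr_allow() are made-up names for this
mail only; the real svm_msr_index() also has to map the separate MSR
address ranges, which this sketch ignores.

```c
#include <stdint.h>

/* Same values as the masks in the attached patch. */
#define MSR_RO 1	/* clear the read-intercept bit only */
#define MSR_RW 3	/* clear both the read- and write-intercept bits */

/*
 * Hypothetical stand-in for svm_msr_index(): assumes 2 bits per MSR and
 * 4 MSRs per byte, and treats msr_off as an offset within one range.
 */
static void
toy_msr_index(uint32_t msr_off, int *index, int *bit)
{
	*index = msr_off / 4;		/* byte that holds this MSR */
	*bit = (msr_off % 4) * 2;	/* first of its two bits */
}

/* Clear the intercept bits selected by mask for a single MSR. */
static void
toy_msr_allow(uint8_t *bitmap, uint32_t msr_off, int mask)
{
	int index, bit;

	toy_msr_index(msr_off, &index, &bit);
	bitmap[index] &= ~(mask << bit);
}
```

Starting from a bitmap filled with 0xFF (everything intercepted),
toy_msr_allow(bitmap, 0x10, MSR_RO) clears only the read bit for offset
0x10, so guest reads would pass through while writes still trap. With
MSR_RW both bits are cleared.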

For Ubuntu I also needed to fix part of the AHCI code, since it bails out
on ATA FLUSH.

I'm going to take a look at the recently posted diff that should bring
bhyve_svm in line with HEAD, and see if that speeds up my Ubuntu kernels.

--WjW

Index: sys/amd64/vmm/amd/svm.c
===================================================================
--- sys/amd64/vmm/amd/svm.c     (revision 264582)
+++ sys/amd64/vmm/amd/svm.c     (working copy)
@@ -82,6 +82,8 @@
 static bool svm_vmexit(struct svm_softc *svm_sc, int vcpu,
                        struct vm_exit *vmexit);
 static int svm_msr_rw_ok(uint8_t *btmap, uint64_t msr);
+static int svm_msr_ro_ok(uint8_t *btmap, uint64_t msr);
+static int svm_msr_rw_ro_ok(uint8_t *btmap, uint64_t msr, int mask);
 static int svm_msr_index(uint64_t msr, int *index, int *bit);
 
 static uint32_t svm_feature; /* AMD SVM features. */
@@ -315,9 +317,24 @@
 /*
  * Give virtual cpu the complete access to MSR(read & write).
  */
+#define MSR_RO 1
+#define MSR_RW 3
+
 static int
 svm_msr_rw_ok(uint8_t *perm_bitmap, uint64_t msr)
 {
+       return svm_msr_rw_ro_ok(perm_bitmap, msr, MSR_RW);
+}
+
+static int
+svm_msr_ro_ok(uint8_t *perm_bitmap, uint64_t msr)
+{
+       return svm_msr_rw_ro_ok(perm_bitmap, msr, MSR_RO);
+}
+
+static int
+svm_msr_rw_ro_ok(uint8_t *perm_bitmap, uint64_t msr, int mask)
+{
        int index, bit, err;
 
        err = svm_msr_index(msr, &index, &bit);
@@ -336,8 +353,12 @@
        }
 
        /* Disable intercept for read and write. */
-       perm_bitmap[index] &= ~(3 << bit);
-       CTR1(KTR_VMM, "Guest has full control on SVM:MSR(0x%lx).\n", msr);
+       perm_bitmap[index] &= ~(mask << bit);
+       if (mask==MSR_RW) {
+               CTR1(KTR_VMM, "Guest has Read/Write  control on SVM:MSR(0x%lx).\n", msr );
+       } else {
+               CTR1(KTR_VMM, "Guest has Read-only control on SVM:MSR(0x%lx).\n", msr );
+       }
        
        return (0);
 }
@@ -415,10 +436,26 @@
        svm_msr_rw_ok(svm_sc->msr_bitmap, MSR_SYSENTER_CS_MSR);
        svm_msr_rw_ok(svm_sc->msr_bitmap, MSR_SYSENTER_ESP_MSR);
        svm_msr_rw_ok(svm_sc->msr_bitmap, MSR_SYSENTER_EIP_MSR);
-       
+
+#define AMD_MSR_TSEG_BASE      0xc0010112
+#define AMD_MSR_OSVW_ID_LENGTH  0xc0010140      /* read */
+#define AMD_MSR_OSVW_STATUS     0xc0010141      /* read */
+#define AMD_MSR_MC4_CTL_MASK    0xc0010048
+
        /* For Nested Paging/RVI only. */
        svm_msr_rw_ok(svm_sc->msr_bitmap, MSR_PAT);
+       svm_msr_rw_ok(svm_sc->msr_bitmap, AMD_MSR_OSVW_ID_LENGTH);
+       svm_msr_rw_ok(svm_sc->msr_bitmap, AMD_MSR_OSVW_STATUS);
 
+       /*
+        * MSRs that are allowed to be read.
+        * most obvious one is the TSC read which could be time critical
+        */
+       svm_msr_ro_ok(svm_sc->msr_bitmap, MSR_TSC);
+       svm_msr_ro_ok(svm_sc->msr_bitmap, MSR_HWCR);
+       svm_msr_ro_ok(svm_sc->msr_bitmap, AMD_MSR_TSEG_BASE);
+       svm_msr_ro_ok(svm_sc->msr_bitmap, AMD_MSR_MC4_CTL_MASK);
+       
         /* Intercept access to all I/O ports. */
        memset(svm_sc->iopm_bitmap, 0xFF, sizeof(svm_sc->iopm_bitmap));
 
@@ -566,6 +603,13 @@
                                svm_efer(svm_sc, vcpu, info1);
                                break;
                        }
+                       if (ecx == MSR_TSC) {
+                               uint64_t tscval = rdtsc();
+                               VCPU_CTR0(svm_sc->vm, vcpu,"VMEXIT TSC MSR\n");
+                               state->rax = tscval & 0xffffffff;
+                               ctx->e.g.sctx_rdx = tscval >> 32;
+                               break;
+                       } 
                
                        retu = false;   
                        if (info1) {
Index: sys/amd64/vmm/intel/vmx.c
===================================================================
--- sys/amd64/vmm/intel/vmx.c   (revision 264582)
+++ sys/amd64/vmm/intel/vmx.c   (working copy)
@@ -109,6 +109,9 @@
 #define        guest_msr_rw(vmx, msr) \
        msr_bitmap_change_access((vmx)->msr_bitmap, (msr), MSR_BITMAP_ACCESS_RW)
 
+#define guest_msr_ro(vmx, msr) \
+    msr_bitmap_change_access((vmx)->msr_bitmap, (msr), MSR_BITMAP_ACCESS_READ)
+
 #define        HANDLED         1
 #define        UNHANDLED       0
 
@@ -786,6 +789,11 @@
         * MSR_EFER is saved and restored in the guest VMCS area on a
         * VM exit and entry respectively. It is also restored from the
         * host VMCS area on a VM exit.
+        *
+        * The TSC MSR is exposed read-only. Writes are disallowed as that
+        * will impact the host TSC.
+        * XXX Writes would be implemented with a wrmsr trap, and
+        * then modifying the TSC offset in the VMCS.
         */
        if (guest_msr_rw(vmx, MSR_GSBASE) ||
            guest_msr_rw(vmx, MSR_FSBASE) ||
@@ -793,7 +801,8 @@
            guest_msr_rw(vmx, MSR_SYSENTER_ESP_MSR) ||
            guest_msr_rw(vmx, MSR_SYSENTER_EIP_MSR) ||
            guest_msr_rw(vmx, MSR_KGSBASE) ||
-           guest_msr_rw(vmx, MSR_EFER))
+           guest_msr_rw(vmx, MSR_EFER) ||
+           guest_msr_ro(vmx, MSR_TSC))
                panic("vmx_vminit: error setting guest msr access");
 
        /*
Index: sys/amd64/vmm/io/vlapic.c
===================================================================
--- sys/amd64/vmm/io/vlapic.c   (revision 264582)
+++ sys/amd64/vmm/io/vlapic.c   (working copy)
@@ -143,7 +143,7 @@
 #define        VLAPIC_TIMER_UNLOCK(vlapic)     mtx_unlock_spin(&((vlapic)->timer_mtx))
 #define        VLAPIC_TIMER_LOCKED(vlapic)     mtx_owned(&((vlapic)->timer_mtx))
 
-#define VLAPIC_BUS_FREQ        tsc_freq
+#define VLAPIC_BUS_FREQ        (128*1024*1024)
 
 static __inline uint32_t
 vlapic_get_id(struct vlapic *vlapic)
Index: sys/amd64/vmm/vmm_msr.c
===================================================================
--- sys/amd64/vmm/vmm_msr.c     (revision 264582)
+++ sys/amd64/vmm/vmm_msr.c     (working copy)
@@ -113,6 +113,9 @@
                case MSR_MCG_CAP:
                        guest_msrs[i] = 0;
                        break;
+               case MSR_TSC:
+                       guest_msrs[i] = rdtsc();
+                       break;
                case MSR_PAT:
                        guest_msrs[i] = PAT_VALUE(0, PAT_WRITE_BACK)      |
                                PAT_VALUE(1, PAT_WRITE_THROUGH)   |
Index: sys/amd64/vmm/vmm_msr.h
===================================================================
--- sys/amd64/vmm/vmm_msr.h     (revision 264582)
+++ sys/amd64/vmm/vmm_msr.h     (working copy)
@@ -29,7 +29,7 @@
 #ifndef        _VMM_MSR_H_
 #define        _VMM_MSR_H_
 
-#define        VMM_MSR_NUM     16
+#define        VMM_MSR_NUM     17
 struct vm;
 
 void   vmm_msr_init(void);
Index: usr.sbin/bhyve/bhyverun.c
===================================================================
--- usr.sbin/bhyve/bhyverun.c   (revision 264582)
+++ usr.sbin/bhyve/bhyverun.c   (working copy)
@@ -52,6 +52,7 @@
 #include <vmmapi.h>
 
 #include "bhyverun.h"
+#include "compiledate.h"
 #include "acpi.h"
 #include "inout.h"
 #include "dbgport.h"
@@ -75,6 +76,8 @@
 
 #define MB             (1024UL * 1024)
 #define GB             (1024UL * MB)
+#define FALSE          0
+#define        TRUE            (!FALSE)
 
 typedef int (*vmexit_handler_t)(struct vmctx *, struct vm_exit *, int *vcpu);
 
@@ -139,8 +142,8 @@
                "       -S: <slot,driver,configinfo> legacy PCI slot config\n"
                "       -l: LPC device configuration\n"
                "       -m: memory size in MB\n"
-               "       -w: ignore unimplemented MSRs\n",
-               progname, (int)strlen(progname), "");
+               "       -w: ignore unimplemented MSRs\n"
+               ,progname, (int)strlen(progname), "");
 
        exit(code);
 }
@@ -287,10 +290,6 @@
        if (vme->u.inout.string || vme->u.inout.rep)
                return (VMEXIT_ABORT);
 
-       /* Special case of guest reset */
-       if (out && port == 0x64 && (uint8_t)eax == 0xFE)
-               return (vmexit_catch_reset());
-
         /* Extra-special case of host notifications */
         if (out && port == GUEST_NIO_PORT)
                 return (vmexit_handle_notify(ctx, vme, pvcpu, eax));
@@ -315,16 +314,16 @@
        uint64_t val;
        uint32_t eax, edx;
        int error;
+       val = 0;
 
-       val = 0;
        error = emulate_rdmsr(ctx, *pvcpu, vme->u.msr.code, &val);
+
        if (error != 0) {
-               fprintf(stderr, "rdmsr to register %#x on vcpu %d\n",
+               fprintf(stderr, "rdmsr to register %#x ignored on vcpu %d\n\r",
                    vme->u.msr.code, *pvcpu);
                if (strictmsr)
                        return (VMEXIT_ABORT);
        }
-
        eax = val;
        error = vm_set_register(ctx, *pvcpu, VM_REG_GUEST_RAX, eax);
        assert(error == 0);
@@ -332,7 +331,6 @@
        edx = val >> 32;
        error = vm_set_register(ctx, *pvcpu, VM_REG_GUEST_RDX, edx);
        assert(error == 0);
-
        return (VMEXIT_CONTINUE);
 }
 
@@ -343,7 +341,7 @@
 
        error = emulate_wrmsr(ctx, *pvcpu, vme->u.msr.code, vme->u.msr.wval);
        if (error != 0) {
-               fprintf(stderr, "wrmsr to register %#x(%#lx) on vcpu %d\n",
+               fprintf(stderr, "wrmsr to register %#x(%#lx) ignored on vcpu %d\n\r",
                    vme->u.msr.code, vme->u.msr.wval, *pvcpu);
                if (strictmsr)
                        return (VMEXIT_ABORT);
@@ -676,6 +674,7 @@
        argc -= optind;
        argv += optind;
 
+       printf("BHyve compiled: %s \n\r\n\r", compiledate );
        if (argc != 1)
                usage(1);
 
Index: usr.sbin/bhyve/xmsr.c
===================================================================
--- usr.sbin/bhyve/xmsr.c       (revision 264582)
+++ usr.sbin/bhyve/xmsr.c       (working copy)
@@ -38,24 +38,72 @@
 #include <stdlib.h>
 
 #include "xmsr.h"
+#include "xmsr-info.h"
 
+#define BIT(b) (1<<(b))
+#define FALSE  0
+#define        TRUE    (!FALSE)
+
 int
 emulate_wrmsr(struct vmctx *ctx, int vcpu, uint32_t code, uint64_t val)
 {
+       int retval = -1;
 
-       switch (code) {
+       switch (code) { 
        case 0xd04:                     /* Sandy Bridge uncore PMC MSRs */
        case 0xc24:
-               return (0);
+               /* simulate that these registers are written */
+               retval=(0);
+               break;
        default:
                break;
        }
-       return (-1);
+       fprintf(stderr,"wrmsr: %#x, %s, val: %li(%#lx).\n\r", 
+               code, xmsr_info_mnemonic(code), val, val); 
+       return retval;
 }
 
+/*
+ *     Return: error value
+ *             0 = instruction emulated
+ *             !0 = instruction ignored
+ */
 int
 emulate_rdmsr(struct vmctx *ctx, int vcpu, uint32_t code, uint64_t *val)
 {
+       int retval = 0;
 
-       return (-1);
+        switch (code) {
+        case 0xd04:                     /* Sandy Bridge uncore PMC MSRs */
+//               *val = (0);
+               break;
+        case 0xc24:
+//               *val = (0);
+               break;
+       case AMD_MSR_TSEG_BASE:
+//             *val = 0xcfe00000;
+               break;
+       case AMD_MSR_HWCR:
+               *val = (BIT(24)|BIT(4));
+               break;
+        case AMD_MSR_OSVW_ID_LENGTH:
+                *val = (4);
+               break;
+        case AMD_MSR_OSVW_STATUS:
+                *val = (BIT(3)|BIT(2));
+               break;
+//     case AMD_MSR_IBSCTL:
+//             *val = BIT(8);
+//             break;
+        default:
+               retval = 1;
+                break;
+        }
+       fprintf(stderr,"rdmsr(%i:%s): %#x, %s, val: %li(%#lx).\n\r",
+               retval, (retval==0?"ok":"err"),
+               code, xmsr_info_mnemonic(code), *val, *val); 
+       return retval;
+
 }
+
+
_______________________________________________
freebsd-virtualization@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to 
"freebsd-virtualization-unsubscr...@freebsd.org"
