[PATCH 0 of 3] update gdbstub support
This patch series updates the gdbstub support for kvm.
Patches 1 and 2 introduce basic PowerPC support, while patch 3 fixes generic
gdbstub code that was broken in a qemu merge.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2 of 3] [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub
# HG changeset patch
# User Christian Ehrhardt [EMAIL PROTECTED]
# Date 1228989956 -3600
# Node ID 6f228c807ad0b239b7342d2974debfc66418d784
# Parent 38846cef16e56c681da1ddc179e248972c8b2ff9
[PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub

From: Hollis Blanchard [EMAIL PROTECTED]

Add basic KVM PowerPC support to qemu's gdbstub introducing a kvm ppc style
mmu implementation that uses the kvm_translate ioctl.
This also requires to save the kvm registers prior to the 'm' gdb operations.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 gdbstub.c                   |    2 ++
 hw/ppc440_bamboo.c          |    1 +
 qemu-kvm-powerpc.c          |   28 ++++++++++++++++++++++++++++
 target-ppc/cpu.h            |    2 ++
 target-ppc/helper.c         |    4 ++++
 target-ppc/translate_init.c |    5 +++++
 6 files changed, 42 insertions(+)

[diff]

diff --git a/qemu/gdbstub.c b/qemu/gdbstub.c
--- a/qemu/gdbstub.c
+++ b/qemu/gdbstub.c
@@ -1374,6 +1374,7 @@ static int gdb_handle_packet(GDBState *s
         if (*p == ',')
             p++;
         len = strtoull(p, NULL, 16);
+        kvm_save_registers(s->g_cpu);
         if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 0) != 0) {
             put_packet (s, "E14");
         } else {
@@ -1389,6 +1390,7 @@ static int gdb_handle_packet(GDBState *s
         if (*p == ':')
             p++;
         hextomem(mem_buf, p, len);
+        kvm_save_registers(s->gcpu);
         if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 1) != 0)
             put_packet(s, "E14");
         else
diff --git a/qemu/hw/ppc440_bamboo.c b/qemu/hw/ppc440_bamboo.c
--- a/qemu/hw/ppc440_bamboo.c
+++ b/qemu/hw/ppc440_bamboo.c
@@ -99,6 +99,7 @@ void bamboo_init(ram_addr_t ram_size, in
         fprintf(stderr, "Unable to initialize CPU!\n");
         exit(1);
     }
+    env->mmu_model = POWERPC_MMU_KVM;
 
     /* call init */
     printf("Calling function ppc440_init\n");
diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c
--- a/qemu/qemu-kvm-powerpc.c
+++ b/qemu/qemu-kvm-powerpc.c
@@ -102,6 +102,7 @@ void kvm_arch_save_regs(CPUState *env)
 
     env->spr[SPR_SRR0] = regs.srr0;
     env->spr[SPR_SRR1] = regs.srr1;
+    env->spr[SPR_BOOKE_PID] = regs.pid;
 
     env->spr[SPR_SPRG0] = regs.sprg0;
     env->spr[SPR_SPRG1] = regs.sprg1;
@@ -219,6 +220,33 @@ int handle_powerpc_dcr_write(int vcpu, u
     return 0; /* XXX ignore failed DCR ops */
 }
 
+int mmukvm_get_physical_address(CPUState *env, mmu_ctx_t *ctx,
+                                target_ulong eaddr, int rw, int access_type)
+{
+    struct kvm_translation tr;
+    uint64_t pid;
+    uint64_t as;
+    int r;
+
+    pid = env->spr[SPR_BOOKE_PID];
+
+    if (access_type == ACCESS_CODE)
+        as = env->msr & msr_ir;
+    else
+        as = env->msr & msr_dr;
+
+    tr.linear_address = as << 40 | pid << 32 | eaddr;
+    r = kvm_translate(kvm_context, env->cpu_index, &tr);
+    if (r == -1)
+        return r;
+
+    if (!tr.valid)
+        return -EFAULT;
+
+    ctx->raddr = tr.physical_address;
+    return 0;
+}
+
 void kvm_arch_cpu_reset(CPUState *env)
 {
 }
diff --git a/qemu/target-ppc/cpu.h b/qemu/target-ppc/cpu.h
--- a/qemu/target-ppc/cpu.h
+++ b/qemu/target-ppc/cpu.h
@@ -98,6 +98,8 @@ enum powerpc_mmu_t {
     POWERPC_MMU_BOOKE_FSL  = 0x00000009,
     /* PowerPC 601 MMU model (specific BATs format)                */
     POWERPC_MMU_601        = 0x0000000A,
+    /* KVM managing the MMU state                                  */
+    POWERPC_MMU_KVM        = 0x0000000B,
 #if defined(TARGET_PPC64)
 #define POWERPC_MMU_64       0x00010000
     /* 64 bits PowerPC MMU                                         */
diff --git a/qemu/target-ppc/helper.c b/qemu/target-ppc/helper.c
--- a/qemu/target-ppc/helper.c
+++ b/qemu/target-ppc/helper.c
@@ -1429,6 +1429,10 @@ int get_physical_address (CPUState *env,
         fprintf(logfile, "%s\n", __func__);
     }
 #endif
+
+    if (env->mmu_model == POWERPC_MMU_KVM)
+        return mmukvm_get_physical_address(env, ctx, eaddr, rw, access_type);
+
     if ((access_type == ACCESS_CODE && msr_ir == 0) ||
         (access_type != ACCESS_CODE && msr_dr == 0)) {
         /* No address translation */
diff --git a/qemu/target-ppc/translate_init.c b/qemu/target-ppc/translate_init.c
--- a/qemu/target-ppc/translate_init.c
+++ b/qemu/target-ppc/translate_init.c
@@ -9273,6 +9273,11 @@ int cpu_ppc_register_internal (CPUPPCSta
         case POWERPC_MMU_601:
             mmu_model = "PowerPC 601";
             break;
+#ifdef KVM
+        case POWERPC_MMU_KVM:
+            mmu_model = "PowerPC KVM";
+            break;
+#endif
 #if defined (TARGET_PPC64)
         case POWERPC_MMU_64B:
             mmu_model = "PowerPC 64";
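
For readers following the translation path: mmukvm_get_physical_address() hands the kvm_translate ioctl a single 64-bit lookup key built from the address space, the PID and the effective address. The sketch below only illustrates that packing. It assumes a one-bit address space, an 8-bit PID and a 32-bit effective address, and pack_linear_address() is a hypothetical helper name that does not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: mirrors the expression
 * "as << 40 | pid << 32 | eaddr" used in mmukvm_get_physical_address().
 * Assumption: as is 0 or 1, pid fits in 8 bits, eaddr is 32 bits wide.
 */
static uint64_t pack_linear_address(uint64_t as, uint64_t pid, uint32_t eaddr)
{
    assert(as <= 1);
    assert(pid <= 0xff);
    /* bit 40: address space, bits 32-39: PID, bits 0-31: effective address */
    return (as << 40) | (pid << 32) | (uint64_t)eaddr;
}
```

With these widths the three fields cannot overlap, so the kernel side could split the key again with the matching shifts and masks.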
[PATCH 1 of 3] [PATCH] kvm-userspace: ppc: Add kvm_translate wrapper
# HG changeset patch
# User Christian Ehrhardt [EMAIL PROTECTED]
# Date 1228924564 -3600
# Node ID 38846cef16e56c681da1ddc179e248972c8b2ff9
# Parent 705d874ff7a24484eaa15ed75a748c4e1a70c2ef
[PATCH] kvm-userspace: ppc: Add kvm_translate wrapper

From: Hollis Blanchard [EMAIL PROTECTED]

Add kvm_translate() wrapper used to get mmu translations from userspace.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 libkvm.c |    5 +++++
 libkvm.h |    2 ++
 2 files changed, 7 insertions(+)

[diff]

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -987,6 +987,11 @@ int kvm_guest_debug(kvm_context_t kvm, i
 	return ioctl(kvm->vcpu_fd[vcpu], KVM_DEBUG_GUEST, dbg);
 }
 
+int kvm_translate(kvm_context_t kvm, int vcpu, struct kvm_translation *tr)
+{
+	return ioctl(kvm->vcpu_fd[vcpu], KVM_TRANSLATE, tr);
+}
+
 int kvm_set_signal_mask(kvm_context_t kvm, int vcpu, const sigset_t *sigset)
 {
 	struct kvm_signal_mask *sigmask;
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -639,6 +639,8 @@ int kvm_set_pit(kvm_context_t kvm, struc
 int kvm_set_pit(kvm_context_t kvm, struct kvm_pit_state *s);
 #endif
 
+int kvm_translate(kvm_context_t kvm, int vcpu, struct kvm_translation *tr);
+
 #endif
 
 #ifdef KVM_CAP_VAPIC
Re: How to use PCI-passthrough with kvm
Hi Weidong,

Thank you for your advice.

The other messages which I found are the following messages which KVM outputs:

BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)

and the IRQ status is as follows.

/proc/interrupts of the host OS:

cat /proc/interrupts
        CPU0   CPU1
  0:      45      0  IO-APIC-edge     timer
  1:       1      1  IO-APIC-edge     i8042
  4:       1      1  IO-APIC-edge
  9:       0      0  IO-APIC-fasteoi  acpi
 12:       2      2  IO-APIC-edge     i8042
 14:    8675   8589  IO-APIC-edge     ata_piix
 15:       0      0  IO-APIC-edge     ata_piix
 17:       0      1  IO-APIC-fasteoi  uhci_hcd:usb3, ehci_hcd:usb7
 18:    2201   2200  IO-APIC-fasteoi  uhci_hcd:usb1, uhci_hcd:usb6, pata_marvell
 19:      43     45  IO-APIC-fasteoi  uhci_hcd:usb5, ohci1394
 21:    1091   1060  IO-APIC-fasteoi  uhci_hcd:usb2, ata_piix, eth1
 23:       2      2  IO-APIC-fasteoi  uhci_hcd:usb4, ehci_hcd:usb8
505:    7198   7310  PCI-MSI-edge     [EMAIL PROTECTED]::00:02.0
506:      50     53  PCI-MSI-edge     kvm_assigned_msi_device
NMI:       0      0  Non-maskable interrupts
LOC:   87300  87267  Local timer interrupts
RES:   54424  49773  Rescheduling interrupts
CAL:494446           Function call interrupts
TLB:651625           TLB shootdowns
SPU:       0      0  Spurious interrupts
ERR:       0
MIS:       0

and /proc/interrupts of the guest OS:

$ cat /proc/interrupts
        CPU0
  0:   42779  IO-APIC-edge  timer
  1:      76  IO-APIC-edge  i8042
  2:       0  XT-PIC-XT     cascade
  4:       1  IO-APIC-edge
  8:       3  IO-APIC-edge  rtc
 10:      68  IO-APIC-edge  eth1
 11:      45  IO-APIC-edge  0, eth0
 12:     228  IO-APIC-edge  i8042
 14:    9982  IO-APIC-edge  libata
 15:     740  IO-APIC-edge  libata
NMI:       0  Non-maskable interrupts
LOC:   42638  Local timer interrupts
RES:       0  Rescheduling interrupts
CAL:       0  function call interrupts
TLB:       0  TLB shootdowns
TRM:       0  Thermal event interrupts
SPU:       0  Spurious interrupts
ERR:       0
MIS:       0

Note that eth1 is the Intel NIC.

Any ideas?

Thanks,
Kazushi
[PATCH] kvm-userspace: Load PCI option ROMs
Load assigned devices' PCI option ROMs to the RAM of
guest OS. And pass the corresponding devfns to BIOS.

Signed-off-by: Kechao Liu [EMAIL PROTECTED]
---
 bios/rombios.c              |   20 +-
 qemu/hw/device-assignment.c |  140 +++
 qemu/hw/device-assignment.h |    1 +
 qemu/hw/pc.c                |    8 ++-
 4 files changed, 163 insertions(+), 6 deletions(-)

diff --git a/bios/rombios.c b/bios/rombios.c
index 9a1cdd6..6d63568 100644
--- a/bios/rombios.c
+++ b/bios/rombios.c
@@ -10216,18 +10216,30 @@ rom_scan_loop:
   add  al, #0x04
 block_count_rounded:
 
-  xor  bx, bx   ;; Restore DS back to 0000:
-  mov  ds, bx
   push ax       ;; Save AX
   push di       ;; Save DI
                 ;; Push addr of ROM entry point
   push cx       ;; Push seg
   push #0x0003  ;; Push offset
 
+  ;; Get the BDF into ax before invoking the option ROM
+  mov  bl, [2]
+  mov  al, bl
+  shr  al, #7
+  cmp  al, #1
+  jne  fetch_bdf
+  mov  ax, ds   ;; Increment the DS since rom size larger than an segment
+  add  ax, #0x1000
+  mov  ds, ax
+fetch_bdf:
+  shl  bx, #9
+  xor  ax, ax
+  mov  al, [bx]
+
   ;; Point ES:DI at $PnP, which tells the ROM that we are a PnP BIOS.
   ;; That should stop it grabbing INT 19h; we will use its BEV instead.
-  mov  ax, #0xf000
-  mov  es, ax
+  mov  bx, #0xf000
+  mov  es, bx
   lea  di, pnp_string
 
   mov  bp, sp   ;; Call ROM init routine using seg:off on stack
diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 7a5..e53dda4 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -678,3 +678,143 @@ void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices)
         }
     }
 }
+
+/* Option ROM header */
+struct option_rom_header {
+    uint8_t signature[2];
+    uint8_t rom_size;
+    uint32_t entry_point;
+    uint8_t reserved[17];
+    uint16_t pci_header_offset;
+    uint16_t expansion_header_offset;
+} __attribute__ ((packed));
+
+/* Option ROM PCI data structure */
+struct option_rom_pci_header {
+    uint8_t signature[4];
+    uint16_t vendor_id;
+    uint16_t device_id;
+    uint16_t vital_product_data_offset;
+    uint16_t structure_length;
+    uint8_t structure_revision;
+    uint8_t class_code[3];
+    uint16_t image_length;
+    uint16_t image_revision;
+    uint8_t code_type;
+    uint8_t indicator;
+    uint16_t reserved;
+} __attribute__ ((packed));
+
+/*
+ * Scan the list of Option ROMs at roms. If a suitable Option ROM is found,
+ * allocate a ram space and copy it there. Then return its size aligned to
+ * both 2KB and target page size.
+ */
+#define OPTION_ROM_ALIGN(x) (((x) + 2047) & ~2047)
+static int scan_option_rom(uint8_t devfn, void *roms, ram_addr_t offset)
+{
+    int i, size;
+    uint8_t csum;
+    ram_addr_t addr, phys_addr;
+    struct option_rom_header *rom;
+    struct option_rom_pci_header *pcih;
+
+    rom = roms;
+
+    for ( ; ; ) {
+        /* Invalid signature means we're out of option ROMs. */
+        if (strncmp((char *)rom->signature, "\x55\xaa", 2) ||
+             (rom->rom_size == 0))
+            break;
+
+        /* Invalid checksum means we're out of option ROMs. */
+        csum = 0;
+        for (i = 0; i < (rom->rom_size * 512); i++)
+            csum += ((uint8_t *)rom)[i];
+        if (csum != 0)
+            break;
+
+        /* Check the PCI header (if any) for a match. */
+        pcih = (struct option_rom_pci_header *)
+                ((char *)rom + rom->pci_header_offset);
+        if ((rom->pci_header_offset != 0) &&
+             !strncmp((char *)pcih->signature, "PCIR", 4))
+            goto found;
+
+        rom = (struct option_rom_header *)((char *)rom + rom->rom_size * 512);
+    }
+
+    return 0;
+
+ found:
+    /* The size should be both 2K-aligned and page-aligned */
+    size = (TARGET_PAGE_SIZE < 0x800)
+        ? OPTION_ROM_ALIGN(rom->rom_size * 512 + 1)
+        : TARGET_PAGE_ALIGN(rom->rom_size * 512 + 1);
+
+    /* Size of all available ram space is 0x10000 (0xd0000 to 0xe0000) */
+    if ((offset + size) > 0x10000u) {
+        fprintf(stderr, "Option ROM size %x exceeds available space\n",
+                rom->rom_size * 512);
+        return 0;
+    }
+
+    addr = qemu_ram_alloc(size);
+    phys_addr = addr + phys_ram_base;
+
+    /* Write ROM data and devfn to phys_addr */
+    memcpy((void *)phys_addr, rom, rom->rom_size * 512);
+    *(uint8_t *)(phys_addr + rom->rom_size * 512) = devfn;
+
+    cpu_register_physical_memory(0xd0000 + offset, size, addr);
+
+    return size;
+}
+
+/*
+ * Scan the assigned devices for the devices that have an option ROM,
+ * and then load the corresponding ROM data to RAM.
+ */
+ram_addr_t assigned_dev_load_option_roms(ram_addr_t rom_base_offset)
+{
+    ram_addr_t offset = rom_base_offset;
+    AssignedDevInfo *adev;
+
+    LIST_FOREACH(adev, &adev_head, next) {
+        int size;
+        void *buf;
+        FILE *fp;
+        char rom_file[64];
+        char cmd[64];
+
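
The validity test that scan_option_rom() applies (0x55 0xAA signature, non-zero size in 512-byte units, all bytes summing to zero modulo 256) can be tried out in isolation. The following is a minimal sketch under stated assumptions, not the patch's code: option_rom_image_size() and align_2k() are hypothetical names, and the extra check against the available buffer length is an addition that the patch does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the image size in bytes if buf starts with
 * a plausible option ROM image, or 0 otherwise. The "avail" bound is an
 * assumption added for this sketch; the patch scans without it.
 */
static size_t option_rom_image_size(const uint8_t *buf, size_t avail)
{
    size_t size, i;
    uint8_t csum = 0;

    assert(buf != NULL);

    /* Signature 0x55 0xAA, then the size in units of 512 bytes. */
    if (avail < 3 || buf[0] != 0x55 || buf[1] != 0xaa || buf[2] == 0)
        return 0;

    size = (size_t)buf[2] * 512;
    if (size > avail)
        return 0;

    /* All bytes of a valid image sum to zero modulo 256. */
    for (i = 0; i < size; i++)
        csum = (uint8_t)(csum + buf[i]);

    return csum == 0 ? size : 0;
}

/* Round up to a 2 KB boundary, like OPTION_ROM_ALIGN in the patch. */
static size_t align_2k(size_t x)
{
    return (x + 2047) & ~(size_t)2047;
}
```

The patch reserves one extra byte after the image for the devfn, which is why it aligns rom_size * 512 + 1 rather than the image size itself.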
RE: How to use PCI-passthrough with kvm
It's not related to shared irq. MSI is already supported in KVM, so if the
device has MSI capability, there is no sharing irq issue.

Can you try latest kvm.git and kvm-userspace.git? At least you are not using
latest kvm-userspace because there is output "BUG: kvm_destroy_phys_mem:
invalid parameters (slot=-1)".

In addition, can you try out other device, such as add-on PCIe NIC or USB?

Regards,
Weidong
Re: [PATCH] [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub
This is v2, as version one had a typo in it that occurred when splitting patches.
Mercurial somehow lost my changes to the patch description explaining that,
but the patch is right this way.
[PATCH] [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub
# HG changeset patch
# User Christian Ehrhardt [EMAIL PROTECTED]
# Date 1228999833 -3600
# Node ID dc1466c9077ab162f4637fffee1869f26be02299
# Parent 4c07fe2a56c7653a9113e05bb08c2de9aec210ce
[PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub

From: Hollis Blanchard [EMAIL PROTECTED]

Add basic KVM PowerPC support to qemu's gdbstub introducing a kvm ppc style
mmu implementation that uses the kvm_translate ioctl.
This also requires to save the kvm registers prior to the 'm' gdb operations.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 gdbstub.c                   |    2 ++
 hw/ppc440_bamboo.c          |    1 +
 qemu-kvm-powerpc.c          |   28 ++++++++++++++++++++++++++++
 target-ppc/cpu.h            |    2 ++
 target-ppc/helper.c         |    4 ++++
 target-ppc/translate_init.c |    5 +++++
 6 files changed, 42 insertions(+)

[diff]

diff --git a/qemu/gdbstub.c b/qemu/gdbstub.c
--- a/qemu/gdbstub.c
+++ b/qemu/gdbstub.c
@@ -1374,6 +1374,7 @@ static int gdb_handle_packet(GDBState *s
         if (*p == ',')
             p++;
         len = strtoull(p, NULL, 16);
+        kvm_save_registers(s->g_cpu);
         if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 0) != 0) {
             put_packet (s, "E14");
         } else {
@@ -1389,6 +1390,7 @@ static int gdb_handle_packet(GDBState *s
         if (*p == ':')
             p++;
         hextomem(mem_buf, p, len);
+        kvm_save_registers(s->g_cpu);
         if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 1) != 0)
             put_packet(s, "E14");
         else
diff --git a/qemu/hw/ppc440_bamboo.c b/qemu/hw/ppc440_bamboo.c
--- a/qemu/hw/ppc440_bamboo.c
+++ b/qemu/hw/ppc440_bamboo.c
@@ -99,6 +99,7 @@ void bamboo_init(ram_addr_t ram_size, in
         fprintf(stderr, "Unable to initialize CPU!\n");
         exit(1);
     }
+    env->mmu_model = POWERPC_MMU_KVM;
 
     /* call init */
     printf("Calling function ppc440_init\n");
diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c
--- a/qemu/qemu-kvm-powerpc.c
+++ b/qemu/qemu-kvm-powerpc.c
@@ -102,6 +102,7 @@ void kvm_arch_save_regs(CPUState *env)
 
     env->spr[SPR_SRR0] = regs.srr0;
     env->spr[SPR_SRR1] = regs.srr1;
+    env->spr[SPR_BOOKE_PID] = regs.pid;
 
     env->spr[SPR_SPRG0] = regs.sprg0;
     env->spr[SPR_SPRG1] = regs.sprg1;
@@ -219,6 +220,33 @@ int handle_powerpc_dcr_write(int vcpu, u
     return 0; /* XXX ignore failed DCR ops */
 }
 
+int mmukvm_get_physical_address(CPUState *env, mmu_ctx_t *ctx,
+                                target_ulong eaddr, int rw, int access_type)
+{
+    struct kvm_translation tr;
+    uint64_t pid;
+    uint64_t as;
+    int r;
+
+    pid = env->spr[SPR_BOOKE_PID];
+
+    if (access_type == ACCESS_CODE)
+        as = env->msr & msr_ir;
+    else
+        as = env->msr & msr_dr;
+
+    tr.linear_address = as << 40 | pid << 32 | eaddr;
+    r = kvm_translate(kvm_context, env->cpu_index, &tr);
+    if (r == -1)
+        return r;
+
+    if (!tr.valid)
+        return -EFAULT;
+
+    ctx->raddr = tr.physical_address;
+    return 0;
+}
+
 void kvm_arch_cpu_reset(CPUState *env)
 {
 }
diff --git a/qemu/target-ppc/cpu.h b/qemu/target-ppc/cpu.h
--- a/qemu/target-ppc/cpu.h
+++ b/qemu/target-ppc/cpu.h
@@ -98,6 +98,8 @@ enum powerpc_mmu_t {
     POWERPC_MMU_BOOKE_FSL  = 0x00000009,
     /* PowerPC 601 MMU model (specific BATs format)                */
     POWERPC_MMU_601        = 0x0000000A,
+    /* KVM managing the MMU state                                  */
+    POWERPC_MMU_KVM        = 0x0000000B,
 #if defined(TARGET_PPC64)
 #define POWERPC_MMU_64       0x00010000
     /* 64 bits PowerPC MMU                                         */
diff --git a/qemu/target-ppc/helper.c b/qemu/target-ppc/helper.c
--- a/qemu/target-ppc/helper.c
+++ b/qemu/target-ppc/helper.c
@@ -1429,6 +1429,10 @@ int get_physical_address (CPUState *env,
         fprintf(logfile, "%s\n", __func__);
     }
 #endif
+
+    if (env->mmu_model == POWERPC_MMU_KVM)
+        return mmukvm_get_physical_address(env, ctx, eaddr, rw, access_type);
+
     if ((access_type == ACCESS_CODE && msr_ir == 0) ||
         (access_type != ACCESS_CODE && msr_dr == 0)) {
         /* No address translation */
diff --git a/qemu/target-ppc/translate_init.c b/qemu/target-ppc/translate_init.c
--- a/qemu/target-ppc/translate_init.c
+++ b/qemu/target-ppc/translate_init.c
@@ -9273,6 +9273,11 @@ int cpu_ppc_register_internal (CPUPPCSta
         case POWERPC_MMU_601:
             mmu_model = "PowerPC 601";
             break;
+#ifdef KVM
+        case POWERPC_MMU_KVM:
+            mmu_model = "PowerPC KVM";
+            break;
+#endif
 #if defined (TARGET_PPC64)
         case POWERPC_MMU_64B:
             mmu_model = "PowerPC 64";
Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool
> My current feeling is that this user thread aio thing will never satisfy
> enterprise usage and kernel aio is mandatory in my view.

I had the same feeling before too, but I thought clone aio was desirable as
an intermediate step, because it could help whatever other unix host OS
may not have native aio support. But if there's a problem with opening
the file multiple times (which btw is limiting the total number of bdev to a
dozen on a default ulimit -n with 64 max threads, but it's probably ok), then
we could as well stick to glibc aio, and perhaps wait for it to evolve with
aio_readv/writev (probably backed by a preadv/pwritev).

And we should concentrate on kernel aio and get rid of threads when the host
OS is linux. We can add a dependency where the dma api will not bounce and
linearize the buffer, only if the host backend supports native aio.

Does anybody have a patch implementing kernel aio that I can plug into the
dma zerocopy api? I'm not so sure clone aio is worth maintaining inside qemu
instead of evolving glibc and kernel with preadv/pwritev for the long term.

Thanks!
[PATCH 0/2] reduce code duplication
This is just a tiny nitpick I came across while hacking on other stuff. It reduces the code duplication, hopefully in a good way for other eyes too.
[PATCH 2/2] reduce code duplication
Code for all versions of notdirty_mem_write is quite alike.
Rewrite it as a generator macro.

Signed-off-by: Glauber Costa [EMAIL PROTECTED]
---
 exec.c |   66 ---
 1 files changed, 17 insertions(+), 49 deletions(-)

diff --git a/exec.c b/exec.c
index 35e0b8e..986c3b0 100644
--- a/exec.c
+++ b/exec.c
@@ -2455,43 +2455,27 @@ static CPUWriteMemoryFunc *unassigned_mem_write[3] = {
     unassigned_mem_writel,
 };
 
-static void notdirty_mem_writeb(void *opaque, target_phys_addr_t ram_addr,
-                                uint32_t val)
+static inline void store_size(target_phys_addr_t addr, uint32_t val, char siz)
 {
-    int dirty_flags;
-    dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
-    if (!(dirty_flags & CODE_DIRTY_FLAG)) {
-#if !defined(CONFIG_USER_ONLY)
-        tb_invalidate_phys_page_fast(ram_addr, 'b');
-        dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
-#endif
+    switch (siz) {
+    case 'b': stb_p(addr, val); break;
+    case 'w': stw_p(addr, val); break;
+    case 'l': stl_p(addr, val); break;
     }
-    stb_p(phys_ram_base + ram_addr, val);
-#ifdef USE_KQEMU
-    if (cpu_single_env->kqemu_enabled &&
-        (dirty_flags & KQEMU_MODIFY_PAGE_MASK) != KQEMU_MODIFY_PAGE_MASK)
-        kqemu_modify_page(cpu_single_env, ram_addr);
-#endif
-    dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
-    phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
-    /* we remove the notdirty callback only if the code has been
-       flushed */
-    if (dirty_flags == 0xff)
-        tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
 }
 
-static void notdirty_mem_writew(void *opaque, target_phys_addr_t ram_addr,
-                                uint32_t val)
+static inline void notdirty_mem_write_size(void *opaque, target_phys_addr_t ram_addr,
+                                uint32_t val, char siz)
 {
     int dirty_flags;
     dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
     if (!(dirty_flags & CODE_DIRTY_FLAG)) {
 #if !defined(CONFIG_USER_ONLY)
-        tb_invalidate_phys_page_fast(ram_addr, 'w');
+        tb_invalidate_phys_page_fast(ram_addr, siz);
         dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
 #endif
     }
-    stw_p(phys_ram_base + ram_addr, val);
+    store_size((target_phys_addr_t)phys_ram_base + ram_addr, val, siz);
 #ifdef USE_KQEMU
     if (cpu_single_env->kqemu_enabled &&
         (dirty_flags & KQEMU_MODIFY_PAGE_MASK) != KQEMU_MODIFY_PAGE_MASK)
@@ -2505,30 +2489,14 @@ static void notdirty_mem_writew(void *opaque, target_phys_addr_t ram_addr,
         tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
 }
 
-static void notdirty_mem_writel(void *opaque, target_phys_addr_t ram_addr,
-                                uint32_t val)
-{
-    int dirty_flags;
-    dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
-    if (!(dirty_flags & CODE_DIRTY_FLAG)) {
-#if !defined(CONFIG_USER_ONLY)
-        tb_invalidate_phys_page_fast(ram_addr, 'l');
-        dirty_flags = phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS];
-#endif
-    }
-    stl_p(phys_ram_base + ram_addr, val);
-#ifdef USE_KQEMU
-    if (cpu_single_env->kqemu_enabled &&
-        (dirty_flags & KQEMU_MODIFY_PAGE_MASK) != KQEMU_MODIFY_PAGE_MASK)
-        kqemu_modify_page(cpu_single_env, ram_addr);
-#endif
-    dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
-    phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
-    /* we remove the notdirty callback only if the code has been
-       flushed */
-    if (dirty_flags == 0xff)
-        tlb_set_dirty(cpu_single_env, cpu_single_env->mem_io_vaddr);
-}
+#define gen_notdirty(s) static void notdirty_mem_write##s(void *opaque, \
+                target_phys_addr_t ram_addr, uint32_t val)\
+{ notdirty_mem_write_size(opaque, ram_addr, val, #s[0]); }
+
+gen_notdirty(b)
+gen_notdirty(w)
+gen_notdirty(l)
+#undef gen_notdirty
 
 static CPUReadMemoryFunc *error_mem_read[3] = {
     NULL, /* never used */
-- 
1.5.6.5
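
The generator macro relies on two preprocessor operators: ## pastes the size tag into the function name, and #s[0] turns the tag into a string literal and takes its first character. A self-contained sketch of the same trick follows; store_width() and GEN_WIDTH are made-up names for illustration and stand in for notdirty_mem_write_size() and gen_notdirty, which need the rest of exec.c to build.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the shared implementation: reports how many
 * bytes a store with the given size tag would write.
 */
static int store_width(char siz)
{
    switch (siz) {
    case 'b': return 1;
    case 'w': return 2;
    case 'l': return 4;
    default:
        assert(!"unknown size tag");
        return 0;
    }
}

/*
 * width_##s pastes the tag into the function name (width_b, width_w, ...).
 * #s stringifies the tag ("b") and [0] picks its first character ('b'),
 * so each generated wrapper forwards its own tag to store_width().
 */
#define GEN_WIDTH(s) \
    static int width_##s(void) { return store_width(#s[0]); }

GEN_WIDTH(b)
GEN_WIDTH(w)
GEN_WIDTH(l)
#undef GEN_WIDTH
```

Since the tag is a compile-time constant in every wrapper, a compiler that inlines the shared function can usually drop the switch, which is presumably why the patch marks its helpers static inline.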
[ kvm-Bugs-2417350 ] external module trace support kvm-80
Bugs item #2417350, was opened at 2008-12-11 16:05
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2417350&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Henning Schild (hensch)
Assigned to: Nobody/Anonymous (nobody)
Summary: external module trace support kvm-80

Initial Comment:
kvm-80 calls marker_synchronize_unregister(); in kvm_trace. This call was introduced in 2.6.28 and thus can not be used in versions before that.
Surrounding the call with
#if LINUX_VERSION_CODE > KERNEL_VERSION(2,6,27)
seems to fix that.
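
As an illustration of the guard proposed in this report, here is a minimal sketch that builds outside a kernel tree. It makes two assumptions: KERNEL_VERSION is redefined locally with the encoding that <linux/version.h> uses for 2.6-era kernels, and LINUX_VERSION_CODE is hard-wired to 2.6.26 (the host version mentioned in a later report) unless the build already defines it. built_with_marker_sync() and marker_sync_available() are hypothetical names.

```c
#include <assert.h>

/* Same encoding as the kernel macro for 2.6-era versions; redefined here
 * so the sketch does not depend on kernel headers. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumption: stands in for the value <linux/version.h> would provide. */
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(2, 6, 26)
#endif

/* Returns 1 if the guarded call would be compiled in, 0 otherwise. */
static int built_with_marker_sync(void)
{
#if LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 27)
    /* marker_synchronize_unregister(); would be called here on 2.6.28+ */
    return 1;
#else
    return 0;
#endif
}

/* Run-time form of the same comparison, handy for checking the boundary. */
static int marker_sync_available(int version_code)
{
    assert(version_code >= 0);
    return version_code > KERNEL_VERSION(2, 6, 27);
}
```

Writing the condition as ">= KERNEL_VERSION(2,6,28)" would be equivalent and arguably states the intent more directly.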
[ kvm-Bugs-2417387 ] kvm-80, AMD, vga not updated
Bugs item #2417387, was opened at 2008-12-11 16:20
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2417387&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: amd
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Henning Schild (hensch)
Assigned to: Nobody/Anonymous (nobody)
Summary: kvm-80, AMD, vga not updated

Initial Comment:
I gave kvm-80 a try and came across some problems with the guest display not being updated.
On an Intel host
http://git.kernel.org/?p=virt/kvm/kvm-userspace.git;a=commitdiff;h=8eae225cf8cd82316fcc78569aeb1adbbc077cb8
fixed the problem. But on AMD the fix has no effect.

When I use a recent grml or KNOPPIX liveCD as guest, the display stops being refreshed as soon as the linux kernel switches to framebuffer. I can force refreshes by switching between monitor and display, or by using vnc and reconnecting over and over.

The problem appears with both vga models, cirrus and std; I did not try vmware. It's also independent of whether I enable nested paging or not. The host is a phenomx4 with linux 2.6.26 and external kvm-80.
Re: [PATCH 16/16 v6] PCI: document the new PCI boot parameters
--- On Wed, 12/10/08, Grant Grundler [EMAIL PROTECTED] wrote:

> Date: Wednesday, December 10, 2008, 10:33 PM
> On Thu, Dec 11, 2008 at 09:43:13AM +0800, Yu Zhao wrote:
> ...
> > I believe this is the only problem that preclude us having the run-time
> > resource rebalance. And I'm not sure how much effort we can fix it. Any
> > comments?
>
> Figure out the right sequence for driver resume so the probe function
> can call resume as well? Document the change and then start modifying
> drivers one-by-one. API changes are alot of work.
>
> grant
> --

I've been lurking, waiting to see such a discussion.

Alerting the PCI drivers that their resources have been changed from
underneath them by extending the suspend/resume model, or perhaps (heresy?)
adding a new callback entry point specifically for instructing PCI drivers to
re-read their BARs, etc. would be a step in the right direction to enable this
whole re-balancing work; well, granted, root/paging devs bound to PCI devs
could be tricky.

It seems entirely natural (to me) that Microsoft would shudder at the amount
of work and verification of all drivers to do this, but the Linux community's
attitude seems to embrace sweeping changes ;-).

-- LH
Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool
On Thu, Dec 11, 2008 at 04:24:37PM +0100, Gerd Hoffmann wrote:
> Well, linux kernel aio has its share of problems too:
>
> * Anthony mentioned it may block on certain circumstances (forgot which ones), and you can't figure beforehand to turn off aio then.

We've worse problems as long as bdrv_read/write are used by qcow2. And we can fix the host kernel in the long run if this becomes an issue.

> * It can't handle block allocation. Kernel handles that by doing such writes synchronously via VFS layer (instead of the separate aio code paths). Leads to horrible performance and bug reports such as "installs on sparse files are very slow".

I think here you mean O_DIRECT regardless of aio/sync API. I doubt aio has any relevance to block allocation in any way, so whatever problem we have with the kernel API and O_DIRECT should also be there with sync-api + userland threads and O_DIRECT.

> * support for vectored aio isn't that old. IIRC it was added sometime around 2.6.20 (newer than current suse/redhat enterprise versions). Which IMHO means you can't expect it being present unconditionally.

I think this is a false alarm: the whole point of kernel AIO is that even if O_DIRECT is enabled, all bios are pushed to the disk before the disk queue is unplugged, which is all we care about to get decent disk bandwidth with zerocopy dma. Or at least that's the way it's supposed to work if aio is implemented correctly at the bio level.

So in kernels that don't support IOCB_CMD_READV/WRITEV, we simply have to submit an array of iocb through io_submit (i.e. to convert the iov into a vector of iocb, instead of a single iocb pointing to the iov). Internally to io_submit a single dma command should be generated and the same sg list should be built as if we used READV/WRITEV. In theory READV/WRITEV should be just a cpu saving feature, it shouldn't influence disk bandwidth. If it does, it means the bio layer is broken and needs fixing.
If IOCB_CMD_READV/WRITEV is available, good; if not, we go with READ/WRITE and more iocb dynamically allocated. It just needs a conversion routine from iovec, file, offset to iocb pointer when IOCB_CMD_READV/WRITEV is not available. The iocb array can be preallocated along with the iovec when we detect IOCB_CMD_READV/WRITEV is not available. I've a cache layer that does this and I'll just provide an output selectable in iovec or iocb terms, with iocb selected if the host os is linux and IOCB_CMD_READV/WRITEV is not available.

> Threads will be there anyway for kvm smp.

Yes, I didn't mean those threads ;). I love threads, but I love threads that are CPU bound and allow to exploit the whole power of the system! But for storage, threads are purely overscheduling overhead as far as I can tell, given we've an async api to use and we already have to deal with the pain of async programming. So it's worth getting the full benefit of it (i.e. no thread/overscheduling overhead).

If aio inside the kernel is too complex, then use kernel threads; it's still better than user threads.

I mean if we keep only using threads we should get rid of bdrv_aio* completely and move the qcow2 code into a separate thread instead of keeping it running from the io thread. If we stick to threads then it's worth getting the full benefit of threads (i.e. not having to deal with the pains of async programming, and moving the qcow2 computation to a separate CPU). That is something I tried doing, but I ended up having to add locks all over qcow2 in order to submit multiple qcow2 requests in parallel (otherwise the lock would be global and I couldn't differentiate between a bdrv_read for qcow2 metadata that must be executed with the qcow2 mutex held, and a bdrv_aio_readv that can run lockless from the point of view of the current qcow2 instance - the qcow2 parent may take its own locks then etc..).
Basically it breaks all backends, something I'm not comfortable with right now just to get zerocopy dma working at platter speed. Hence I stick with async programming for now...

> Well, wait for glibc isn't going to fly. glibc waits for posix, and posix waits for a reference implementation (which will not be glibc).

Agree.

> > and kernel with preadv/pwritev
>
> With that in place you don't need kernel aio any more, then you can really do it in userspace with threads. But that probably would be linux-only ^W^W^W

Waiting for preadv/pwritev is just the 'quicker' version of waiting for glibc aio_readv. And because it remains linux-only, I prefer kernel AIO, which fixes cfq and should be the most optimal anyway (with or without READV/WRITEV support).

So in the end: we either open the file 64 times (which I think is perfectly coherent with nfs unless the nfs client is broken, but then Anthony may know nfs better, I'm not a heavy nfs user here), or we go with kernel AIO... you know my preference. That said, opening the file 64 times is probably simpler, if it has been confirmed that it doesn't break nfs. Breaking nfs is not
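A minimal sketch of the "conversion routine from iovec, file, offset to iocb" mentioned above, for the case where vectored AIO is not available. It assumes a simplified, hypothetical request struct (`struct sketch_iocb`) rather than the kernel's real `struct iocb`, and it only shows the offset bookkeeping: one request per iovec segment, at consecutive file offsets. Actual submission through io_submit is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical stand-in for a per-request control block. It carries only
 * the fields this sketch needs and is NOT the kernel ABI layout. */
struct sketch_iocb {
    int     fd;      /* file to read from or write to */
    void   *buf;     /* start of this segment's buffer */
    size_t  nbytes;  /* length of this segment */
    off_t   offset;  /* file offset for this segment */
};

/* Fill out[0..iovcnt-1] with one request per iovec segment. Segment i
 * starts where segment i-1 ended, so the array covers the same file
 * range that a single vectored request would. Returns the total bytes. */
static size_t iov_to_iocbs(int fd, const struct iovec *iov, int iovcnt,
                           off_t offset, struct sketch_iocb *out)
{
    size_t total = 0;

    assert(iov != NULL && out != NULL && iovcnt >= 0);
    for (int i = 0; i < iovcnt; i++) {
        out[i].fd = fd;
        out[i].buf = iov[i].iov_base;
        out[i].nbytes = iov[i].iov_len;
        out[i].offset = offset + (off_t)total;
        total += iov[i].iov_len;
    }
    return total;
}
```

The output array can be preallocated next to the iovec, as the mail suggests, since its length is always iovcnt.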
[PATCH] KVM: x86: Rework user space NMI injection as KVM_CAP_USER_NMI
There is no point in doing the ready_for_nmi_injection/ request_nmi_window dance with user space. First, we don't do this for in-kernel irqchip anyway, while the code path is the same as for user space irqchip mode. And second, there is nothing to loose if a pending NMI is overwritten by another one (in contrast to IRQs where we have to save the number). Actually, there is even the risk of raising spurious NMIs this way because the reason for the held-back NMI might already be handled while processing the first one. Therefore this patch creates a simplified user space NMI injection interface, exporting it under KVM_CAP_USER_NMI and dropping the old KVM_CAP_NMI capability. And this time we also take care to provide the interface only on archs supporting NMIs via KVM (right now only x86). Signed-off-by: Jan Kiszka [EMAIL PROTECTED] --- arch/x86/kvm/vmx.c | 24 ++-- arch/x86/kvm/x86.c | 28 ++-- include/linux/kvm.h | 11 +-- 3 files changed, 9 insertions(+), 54 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 487e1dc..6259d74 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2498,15 +2498,13 @@ static void do_interrupt_requests(struct kvm_vcpu *vcpu, } if (vcpu-arch.nmi_injected) { vmx_inject_nmi(vcpu); - if (vcpu-arch.nmi_pending || kvm_run-request_nmi_window) + if (vcpu-arch.nmi_pending) enable_nmi_window(vcpu); else if (vcpu-arch.irq_summary || kvm_run-request_interrupt_window) enable_irq_window(vcpu); return; } - if (!vcpu-arch.nmi_window_open || kvm_run-request_nmi_window) - enable_nmi_window(vcpu); if (vcpu-arch.interrupt_window_open) { if (vcpu-arch.irq_summary !vcpu-arch.interrupt.pending) @@ -3040,14 +3038,6 @@ static int handle_nmi_window(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control); ++vcpu-stat.nmi_window_exits; - /* -* If the user space waits to inject a NMI, exit as soon as possible -*/ - if (kvm_run-request_nmi_window !vcpu-arch.nmi_pending) { - 
kvm_run-exit_reason = KVM_EXIT_NMI_WINDOW_OPEN; - return 0; - } - return 1; } @@ -3162,7 +3152,7 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) vmx-soft_vnmi_blocked = 0; vcpu-arch.nmi_window_open = 1; } else if (vmx-vnmi_blocked_time 10LL - (kvm_run-request_nmi_window || vcpu-arch.nmi_pending)) { + vcpu-arch.nmi_pending) { /* * This CPU don't support us in finding the end of an * NMI-blocked window if the guest runs with IRQs @@ -3175,16 +3165,6 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) vmx-soft_vnmi_blocked = 0; vmx-vcpu.arch.nmi_window_open = 1; } - - /* -* If the user space waits to inject an NNI, exit ASAP -*/ - if (vcpu-arch.nmi_window_open kvm_run-request_nmi_window -!vcpu-arch.nmi_pending) { - kvm_run-exit_reason = KVM_EXIT_NMI_WINDOW_OPEN; - ++vcpu-stat.nmi_window_exits; - return 0; - } } if (exit_reason kvm_vmx_max_exit_handlers diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 10302d3..0e6aa81 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2887,37 +2887,18 @@ static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu, (kvm_x86_ops-get_rflags(vcpu) X86_EFLAGS_IF)); } -/* - * Check if userspace requested a NMI window, and that the NMI window - * is open. - * - * No need to exit to userspace if we already have a NMI queued. - */ -static int dm_request_for_nmi_injection(struct kvm_vcpu *vcpu, - struct kvm_run *kvm_run) -{ - return (!vcpu-arch.nmi_pending - kvm_run-request_nmi_window - vcpu-arch.nmi_window_open); -} - static void post_kvm_run_save(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) { kvm_run-if_flag = (kvm_x86_ops-get_rflags(vcpu) X86_EFLAGS_IF) != 0; kvm_run-cr8 = kvm_get_cr8(vcpu); kvm_run-apic_base = kvm_get_apic_base(vcpu); - if (irqchip_in_kernel(vcpu-kvm)) { + if (irqchip_in_kernel(vcpu-kvm)) kvm_run-ready_for_interrupt_injection = 1; - kvm_run-ready_for_nmi_injection = 1; - } else { + else
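The changelog's second argument (nothing is lost if a pending NMI is overwritten, unlike IRQs where the number must be saved) can be illustrated with a small sketch. The struct and helper names below are hypothetical and are not the kernel's data structures; the point is only that an NMI carries no vector, so a single flag is enough, while IRQs need per-vector state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified per-vcpu state (not struct kvm_vcpu). */
struct sketch_vcpu {
    bool     nmi_pending;    /* NMIs have no vector: one flag suffices */
    uint64_t irq_bitmap[4];  /* IRQs have a vector 0..255: one bit each */
};

/* Injecting an NMI twice leaves the same state as injecting it once. */
static void sketch_inject_nmi(struct sketch_vcpu *v)
{
    v->nmi_pending = true;
}

/* An IRQ must remember WHICH vector was raised. */
static void sketch_inject_irq(struct sketch_vcpu *v, unsigned vector)
{
    assert(vector < 256);
    v->irq_bitmap[vector / 64] |= (uint64_t)1 << (vector % 64);
}

/* Count distinct pending IRQ vectors. */
static int sketch_pending_irqs(const struct sketch_vcpu *v)
{
    int count = 0;

    for (int w = 0; w < 4; w++)
        for (uint64_t bits = v->irq_bitmap[w]; bits != 0; bits &= bits - 1)
            count++;
    return count;
}
```

Because the NMI state is a single latch, user space has no reason to wait for a "ready for NMI" signal before injecting; the worst case is that two injections collapse into one.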
Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool
Andrea Arcangeli wrote:
> > * It can't handle block allocation. Kernel handles that by doing such writes synchronously via VFS layer (instead of the separate aio code paths). Leads to horrible performance and bug reports such as "installs on sparse files are very slow".
>
> I think here you mean O_DIRECT regardless of aio/sync API,

Yes. But kernel aio requires O_DIRECT, so aio users are affected nevertheless.

> So in kernels that don't support IOCB_CMD_READV/WRITEV, we simply have to submit an array of iocb through io_submit (i.e. to convert the iov into a vector of iocb, instead of a single iocb pointing to the iov). Internally to io_submit a single dma command should be generated and the same sg list should be built as if we used READV/WRITEV. In theory READV/WRITEV should be just a cpu saving feature, it shouldn't influence disk bandwidth. If it does, it means the bio layer is broken and needs fixing.

Haven't tested that. Could be it isn't a big problem, extra code size for the two modes aside.

> > ahem: http://www.daemon-systems.org/man/preadv.2.html
>
> Too bad nobody implemented it yet...

Kernel side looks easy, attached patch + syscall table windup in all archs ...

cheers,
  Gerd

diff --git a/fs/read_write.c b/fs/read_write.c
index 969a6d9..d1ea2fd 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -701,6 +701,54 @@ sys_writev(unsigned long fd, const struct iovec __user *vec, unsigned long vlen)
 	return ret;
 }
 
+asmlinkage ssize_t sys_preadv(unsigned int fd, const struct iovec __user *vec,
+			      unsigned long vlen, loff_t pos)
+{
+	struct file *file;
+	ssize_t ret = -EBADF;
+	int fput_needed;
+
+	if (pos < 0)
+		return -EINVAL;
+
+	file = fget_light(fd, &fput_needed);
+	if (file) {
+		ret = -ESPIPE;
+		if (file->f_mode & FMODE_PREAD)
+			ret = vfs_readv(file, vec, vlen, &pos);
+		fput_light(file, fput_needed);
+	}
+
+	if (ret > 0)
+		add_rchar(current, ret);
+	inc_syscr(current);
+	return ret;
+}
+
+asmlinkage ssize_t sys_pwritev(unsigned int fd, const struct iovec __user *vec,
+			       unsigned long vlen, loff_t pos)
+{
+	struct file *file;
+	ssize_t ret = -EBADF;
+	int fput_needed;
+
+	if (pos < 0)
+		return -EINVAL;
+
+	file = fget_light(fd, &fput_needed);
+	if (file) {
+		ret = -ESPIPE;
+		if (file->f_mode & FMODE_PWRITE)
+			ret = vfs_writev(file, vec, vlen, &pos);
+		fput_light(file, fput_needed);
+	}
+
+	if (ret > 0)
+		add_wchar(current, ret);
+	inc_syscw(current);
+	return ret;
+}
+
 static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
 			   size_t count, loff_t max)
 {
Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool
Gerd Hoffmann wrote:
> Andrea Arcangeli wrote:
> > My current feeling is that this user thread aio thing will never satisfy enterprise usage and kernel aio is mandatory in my view.
>
> Well, linux kernel aio has its share of problems too:
>
> * Anthony mentioned it may block on certain circumstances (forgot which ones), and you can't figure beforehand to turn off aio then.
> * It can't handle block allocation. Kernel handles that by doing such writes synchronously via VFS layer (instead of the separate aio code paths). Leads to horrible performance and bug reports such as "installs on sparse files are very slow".
> * support for vectored aio isn't that old. IIRC it was added sometime around 2.6.20 (newer than current suse/redhat enterprise versions). Which IMHO means you can't expect it being present unconditionally.
>
> > And we should concentrate on kernel aio and get rid of threads when host OS is linux.
>
> Threads will be there anyway for kvm smp.
>
> > Has anybody a patch implementing kernel aio that I can plug into the dma zerocopy api? I'm not so sure clone aio is worth maintaining inside qemu instead of evolving glibc
>
> Well, wait for glibc isn't going to fly. glibc waits for posix, and posix waits for a reference implementation (which will not be glibc).
>
> > and kernel with preadv/pwritev
>
> With that in place you don't need kernel aio any more, then you can really do it in userspace with threads. But that probably would be linux-only ^W^W^W

linux-only is okay but we just need a relatively sane fall back. There have been preadv/pwritev patches posted before, they just for some reason never were merged.

http://lwn.net/Articles/163603/

> ahem: http://www.daemon-systems.org/man/preadv.2.html

Yeah, dunno if that's all BSDs or just NetBSD.

Regards,

Anthony Liguori

> cheers,
>   Gerd
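One possible shape for the "relatively sane fall back" on hosts without preadv is to loop over the iovec with plain pread(). The sketch below is only an illustration under that assumption: `preadv_fallback` is a hypothetical name, and unlike a real preadv the segments are not read atomically with respect to concurrent writers.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdio.h>      /* tmpfile/fileno, used by the usage check */
#include <string.h>     /* memcmp, used by the usage check */
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Read into each iovec segment in turn, starting at file offset `offset`.
 * Returns the number of bytes read (short on EOF), or -1 if the very
 * first read fails. The file position of fd is left untouched. */
static ssize_t preadv_fallback(int fd, const struct iovec *iov, int iovcnt,
                               off_t offset)
{
    ssize_t total = 0;

    for (int i = 0; i < iovcnt; i++) {
        size_t done = 0;

        while (done < iov[i].iov_len) {
            ssize_t n = pread(fd, (char *)iov[i].iov_base + done,
                              iov[i].iov_len - done, offset + total);
            if (n < 0) {
                if (errno == EINTR)
                    continue;               /* interrupted: retry */
                return total > 0 ? total : -1;
            }
            if (n == 0)
                return total;               /* end of file */
            done += (size_t)n;
            total += n;
        }
    }
    return total;
}
```

A thread pool can call such a helper from worker threads; because every call names its own offset, no shared file position has to be protected by a lock.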
Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool
On Thu, Dec 11, 2008 at 05:11:08PM +0100, Gerd Hoffmann wrote:
> Yes. But kernel aio requires O_DIRECT, so aio users are affected nevertheless.

Are you sure? It surely wasn't the case...

> Haven't tested that. Could be it isn't a big problem, extra code size for the two modes aside.

There shouldn't be any problem.

> Kernel side looks easy, attached patch + syscall table windup in all archs ...

So should we depend on this?
Re: [PATCH] KVM: use modern cpumask primitives, no cpumask_t on stack
On Mon, 2008-12-08 at 16:09 +, Avi Kivity wrote:
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index ba4275d..2d6ca79 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -568,14 +570,17 @@ static bool make_all_cpus_request(struct kvm *kvm, unsigned int req)
>  		if (test_and_set_bit(req, &vcpu->requests))
>  			continue;
>  		cpu = vcpu->cpu;
> -		if (cpu != -1 && cpu != me)
> -			cpu_set(cpu, cpus);
> -	}
> -	if (!cpus_empty(cpus)) {
> -		smp_call_function_mask(cpus, ack_flush, NULL, 1);
> -		called = true;
> +		if (cpus != NULL && cpu != -1 && cpu != me)
> +			cpumask_set_cpu(cpu, cpus);
>  	}
> +	if (unlikely(cpus == NULL))
> +		smp_call_function_many(cpu_online_mask, ack_flush, NULL, 1);
> +	else if (!cpumask_empty(cpus))
> +		smp_call_function_many(cpus, ack_flush, NULL, 1);
> +	else
> +		called = false;
>  	put_cpu();
> +	free_cpumask_var(cpus);
>  	return called;
>  }

This patch breaks uniprocessor builds, because smp_call_function_many() is only defined for CONFIG_SMP.

Avi, I think you should be able to build a PowerPC KVM kernel at this point? That would have caught this error.

Rusty, could you ack the following:

cpumask: define smp_call_function_many() for non-SMP builds

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]

diff --git a/include/linux/smp.h b/include/linux/smp.h
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -67,13 +67,6 @@ int smp_call_function(void(*func)(void *
 /* Deprecated: use smp_call_function_many() which uses a cpumask ptr. */
 int smp_call_function_mask(cpumask_t mask, void(*func)(void *info), void *info,
 				int wait);
-
-static inline void smp_call_function_many(const struct cpumask *mask,
-					  void (*func)(void *info), void *info,
-					  int wait)
-{
-	smp_call_function_mask(*mask, func, info, wait);
-}
 
 int smp_call_function_single(int cpuid, void (*func) (void *info), void *info,
 				int wait);
@@ -151,6 +144,13 @@ static inline void init_call_single_data
 }
 #endif /* !SMP */
 
+static inline void smp_call_function_many(const struct cpumask *mask,
+					  void (*func)(void *info), void *info,
+					  int wait)
+{
+	smp_call_function_mask(*mask, func, info, wait);
+}
+
 /*
  * smp_processor_id(): get the current CPU ID.
  *

--
Hollis Blanchard
IBM Linux Technology Center
RE: Virtio network performance problem
> I think it is an unsync tsc problem. First, make sure you pin all of the process threads. There is a thread per vcpu + io thread + more non-relevant ones. You can do it by adding taskset before the cmdline. Second, you said that you use an smp guest. So windows also sees unsync tsc. So, either test with a UP guest or learn how to pin the windows receiving ISR, DPC and the user app. Well, testing on Intel or newer AMD is another option. I tested it again now on Intel with a UP guest and there is no such problem. Hope to test it next week on an AMD SMP guest.
>
> Regards, Dor

I made sure to pin all 5 threads from my guest kvm process to two cores on a single socket, and also tried pinning them all to a single core, but neither made a difference. I then powered down all the cores (using the /sys entries) until only two cores on a single socket were left, which made no improvement. I then powered all cores down except one (and verified that only the one showed up in /proc/cpuinfo). Naturally the guest slowed a bit, but the erratic and negative pings from the guest to the host br0 were still there.

If I boot the guest with -smp 1 on the kvm command line, the ping times are fine. It's only when I start the guest with 2 VCPUs that the ping times are erratic and some pings are negative.

Just to recap, this is on a brand new Dell Poweredge R805 with dual quad-core Opteron 2350s, with KVM-79 built against the CentOS-packaged 2.6.18 kernel. The guest in question is Windows Server 2003 x64 SP2 with 2 VCPUs.

I couldn't think of anything else to try, so I loaded up another server, a new Dell Poweredge 2950 with dual quad-core Xeon E5430s. I installed and configured kvm-79 following the exact same process I used to configure it on the AMD host. When I boot the _same_ guest (it's on iSCSI) on the PE2950/Intel host, the ping times are perfect.
I'm interested to know why this is (before I purchase any more AMD servers) but my priority right now is trying to figure out what I'm doing wrong that is causing virtio networking to be extremely slow (usually around 55Mb/s) on every host I've built it on. I'm sure I'm doing something wrong and I just can't place it. I'll start another thread for that, though, because it seems to be a different problem. In the meantime I'll just keep this guest on my Intel servers, I guess.
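The "negative ping" symptom is what one would expect if the two timestamps of an interval are taken from counters that are not synchronized. The sketch below is a toy model under stated assumptions, not a measurement of any real host: each CPU's counter is modelled as a shared base plus a fixed, hypothetical per-CPU skew, and the elapsed time is computed as end minus start, the way a guest would.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: counter value seen on `cpu` = common base + that CPU's skew.
 * The skew table is a made-up illustration of an unsynchronized TSC. */
static int64_t sketch_read_tsc(int64_t base_ticks, const int64_t *skew, int cpu)
{
    return base_ticks + skew[cpu];
}

/* Elapsed ticks between two reads that may happen on different CPUs. */
static int64_t sketch_elapsed(int64_t start_base, int start_cpu,
                              int64_t end_base, int end_cpu,
                              const int64_t *skew)
{
    return sketch_read_tsc(end_base, skew, end_cpu) -
           sketch_read_tsc(start_base, skew, start_cpu);
}
```

With a skew of -5000 ticks on the second CPU, an interval that really lasted 2000 ticks is reported as -3000 ticks when the start is read on CPU 0 and the end on CPU 1. With a single VCPU both reads come from the same counter, which fits the observation that -smp 1 gives sane ping times.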
Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool
Andrea Arcangeli wrote:
> On Thu, Dec 11, 2008 at 05:11:08PM +0100, Gerd Hoffmann wrote:
> > Yes. But kernel aio requires O_DIRECT, so aio users are affected nevertheless.
>
> Are you sure? It surely wasn't the case...

Tons of docs say so, but it might be that they are wrong, I didn't check.

> > Kernel side looks easy, attached patch + syscall table windup in all archs ...
>
> So should we depend on this?

I suspect we will end up with multiple implementations anyway. So one could be preadv+threads. Probably quite portable if we manage to get the syscalls into linux kernel and glibc. All *BSDs have it already, for solaris I've found a feature request on that. Dunno for MacOS.

Additionally we could have OS-specific bits such as linux-aio. Maybe also posix-aio for the *BSD family in case their kernel support for that is better than what glibc provides (i.e. can handle multiple requests in parallel without the fdpool hack).

cheers,
  Gerd
Slow virtio networking
Can anyone provide any pointers as to what might cause virtio networking to be slow (~55Mb/s)? Here's what I've tried so far:

Guests tried:
Windows Server 2003 x32 SP2 UP guest w/ Windows guest drivers ver 3
Windows Server 2003 x64 SP2 SMP guest w/ Windows guest drivers ver 3

On the host:
On all the hosts I'm using CentOS 5.2 x64. Two of the three physical servers I've tried are dual Intel Xeon E5430s (Dell 2950s). The third server is a dual Opteron 2350 (Dell R805). All the servers are brand new and have almost no load currently.

I've tried using KVM-79 built against the stock 2.6.18 kernel, KVM-79 built against a custom 2.6.27.8 kernel, and git (on 12/10/08) built against 2.6.27.8. In all cases, I've made sure to copy the if_tun.h and virtio*.h from my current kernel includes into the kernel/include/linux directory in my kvm-79 or kvm-userspace source tree before building, since I've heard this is necessary to enable GSO support, although I've also been told that lacking GSO shouldn't make it quite as slow as it is.

In all cases, when I test throughput from the guest using virtio to the host's br0 (the bridge the guest is joined to), I get somewhere between 55Mb/s and 150Mb/s. This is even less than I get with the rtl8139 and e1000 emulation (between 200 and 350 Mb/s).

Everything else with this host and guest works great. Pings from guest to host are perfect (on the Intel servers, not on the AMD server, but that's another issue I believe).

Is there something I could be doing wrong during my host kernel config, or during my kvm build, etc. that could cause virtio to work, but so slowly? Any ideas are greatly appreciated.

-Adrian
[PATCH] KVM: VMX: Allow single-stepping when interruptible
Avi Kivity wrote:
> Jan Kiszka wrote:
> > When single-stepping, we have to ensure that the INT1 can make it through even if the guest itself is uninterruptible due to MOV SS or STI. VMENTRY will fail otherwise.
> >
> > Signed-off-by: Jan Kiszka jan.kis...@siemens.com
> > ---
> >
> >  arch/x86/kvm/vmx.c |   10 ++++++++--
> >  1 files changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 3a422dc..8e83102 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -1010,6 +1010,7 @@ static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
> >  static int set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg)
> >  {
> >  	int old_debug = vcpu->guest_debug;
> > +	u32 interruptibility;
> >  	unsigned long flags;
> >
> >  	vcpu->guest_debug = dbg->control;
> > @@ -1017,9 +1018,14 @@ static int set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg)
> >  		vcpu->guest_debug = 0;
> >
> >  	flags = vmcs_readl(GUEST_RFLAGS);
> > -	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
> > +	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
> >  		flags |= X86_EFLAGS_TF | X86_EFLAGS_RF;
> > -	else if (old_debug & KVM_GUESTDBG_SINGLESTEP)
> > +		/* We must be interruptible when single-stepping */
> > +		interruptibility = vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);
> > +		if (interruptibility & 3)
> > +			vmcs_write32(GUEST_INTERRUPTIBILITY_INFO,
> > +				     interruptibility & ~3);
>
> Could just write unconditionally - it's not like the write has any effect on speed. vmcs_clear_bits() will do it cleanly.
>
> But I'm worried about correctness. Suppose we're singlestepping a sti; hlt sequence. While we'll clear interruptibility, probably receive the debug trap (since that's a high priority exception), but then inject the interrupt before the hlt, hanging the guest. So we probably need to restore interruptibility on exit.

There was some issue with the original patch, but I think I have a safe version now that also works as well as the old one. Please see below, including comments. I'm still open to further concerns or better approaches. Sheng, maybe you can provide some more details on how one is supposed to handle this hairy case with VMX.

> This looks like a good candidate for a test case.

This will be more complicated than I'm currently able to handle: kvmctl would have to be extended to interact with the guest debug interface of kvm, setting appropriate breakpoints and handling the callbacks.

Jan

---

When single-stepping over STI and MOV SS, we must clear the corresponding interruptibility bits in the guest state. Otherwise vmentry fails as it then expects bit 14 (BS) in pending debug exceptions being set, but that's not correct for the guest debugging case.

Note that clearing those bits is safe as we check for interruptibility based on the original state and do not inject interrupts or NMIs if guest interruptibility was blocked.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---

 arch/x86/kvm/vmx.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ec37635..26f732c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2477,6 +2477,11 @@ static void do_interrupt_requests(struct kvm_vcpu *vcpu,
 {
 	vmx_update_window_states(vcpu);
 
+	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
+		vmcs_clear_bits(GUEST_INTERRUPTIBILITY_INFO,
+				GUEST_INTR_STATE_STI |
+				GUEST_INTR_STATE_MOV_SS);
+
 	if (vcpu->arch.nmi_pending && !vcpu->arch.nmi_injected) {
 		if (vcpu->arch.interrupt.pending) {
 			enable_nmi_window(vcpu);
@@ -3263,6 +3268,11 @@ static void vmx_intr_assist(struct kvm_vcpu *vcpu)
 
 	vmx_update_window_states(vcpu);
 
+	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
+		vmcs_clear_bits(GUEST_INTERRUPTIBILITY_INFO,
+				GUEST_INTR_STATE_STI |
+				GUEST_INTR_STATE_MOV_SS);
+
 	if (vcpu->arch.nmi_pending && !vcpu->arch.nmi_injected) {
 		if (vcpu->arch.interrupt.pending) {
 			enable_nmi_window(vcpu);
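The ordering that the second changelog relies on (decide about injection from the original interruptibility state, then clear the STI and MOV SS blocking bits for VM entry) can be sketched in plain C. Everything here is a userspace illustration: the names are hypothetical, the field is an ordinary variable rather than a VMCS field, and the bit values are taken to be bits 0 and 1, matching the mask of 3 used in the first version of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, matching the "& 3" / "& ~3" in the first patch. */
#define SKETCH_INTR_STATE_STI     0x1u
#define SKETCH_INTR_STATE_MOV_SS  0x2u

struct sketch_entry {
    bool     may_inject;   /* decided from the ORIGINAL guest state */
    uint32_t entry_state;  /* value to use for the next VM entry */
};

/* Prepare a single-stepped entry: remember whether the guest was
 * blocking interrupts, then clear only the two blocking bits. */
static struct sketch_entry sketch_prepare_singlestep(uint32_t interruptibility)
{
    const uint32_t mask = SKETCH_INTR_STATE_STI | SKETCH_INTR_STATE_MOV_SS;
    struct sketch_entry e;

    e.may_inject = (interruptibility & mask) == 0;
    e.entry_state = interruptibility & ~mask;
    return e;
}
```

Keeping may_inject separate from entry_state is what addresses the sti; hlt concern raised in the review: the interrupt shadow still suppresses injection for that step even though the bits are cleared for the entry itself.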
Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool
  Hi,

> > So one could be preadv+threads. Probably quite portable if we manage to get the syscalls into linux kernel and glibc. All *BSDs have it already, for solaris I've found a feature request on that. Dunno for MacOS.
>
> Who's taking care of submitting it to linux?

I will.

cheers,
  Gerd
Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool
On Thu, Dec 11, 2008 at 05:49:47PM +0100, Andrea Arcangeli wrote:
> On Thu, Dec 11, 2008 at 05:11:08PM +0100, Gerd Hoffmann wrote:
> > Yes. But kernel aio requires O_DIRECT, so aio users are affected nevertheless.
>
> Are you sure? It surely wasn't the case...

Mainline kernel aio only implements O_DIRECT. Some RHEL version had support for buffered kernel AIO.
[ kvm-Bugs-2418470 ] libvirt save/restore broken after upgrading to kvm-80
Bugs item #2418470, was opened at 2008-12-11 23:44
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2418470&group_id=180599

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: qemu
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Tuomas Jormola (tjormola)
Assigned to: Nobody/Anonymous (nobody)
Summary: libvirt save/restore broken after upgrading to kvm-80

Initial Comment:
Hi,

I'm running Ubuntu 8.10 on i386 (Intel Core2 Quad Q6600). The virsh commands save and restore work just fine with the default kvm-72 based setup. I upgraded to kvm-80 by backporting the kvm-79 package from the jaunty development tree with the kvm-80 upstream source applied on top. Everything else seems to work ok for me, except that libvirt based restore has stopped working. Save produces a dump of the vm memory, but when running restore, the machine just boots; it does not resume the saved state.

Save/restore also works when running kvm-72 user space with the kvm-80 kernel modules. Maybe there's something wrong with the migration code in kvm-80 user space; afaik libvirt uses that to implement save/restore.

Tuomas Jormola
Re: [patch 3/3] KVM: VMX: initialize TSC offset relative to vm creation time
On Thu, Dec 11, 2008 at 07:38:24PM +0200, Avi Kivity wrote: Marcelo Tosatti wrote: This looks fine, but have you tested it on a host with unsync tsc? I'm worried that we'll get regressions there even on uniprocessor guest. I'd like to keep the current behaviour for the special case of uniprocessor guest on unsync tsc host. I don't see how. For UP guests the TSC is initialized to zero during vcpu setup, similarly to the current behaviour. Can you explain? On a host with an unsync tsc, when you move the vcpu to another cpu, the tsc may jump backwards. Ok, this could cause the guest tsc to be initialized to a high value close to wraparound (in case the vcpu is migrated to a cpu with negative difference before vmx_vcpu_setup). What other regression could the updated patch introduce? diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 97215a4..5b70d83 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -378,6 +378,7 @@ struct kvm_arch{ unsigned long irq_sources_bitmap; unsigned long irq_states[KVM_IOAPIC_NUM_PINS]; + u64 vm_init_tsc; }; struct kvm_vm_stat { diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index e446f23..0879852 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -856,11 +856,8 @@ static u64 guest_read_tsc(void) * writes 'guest_tsc' into guest's timestamp counter register * guest_tsc = host_tsc + tsc_offset == tsc_offset = guest_tsc - host_tsc */ -static void guest_write_tsc(u64 guest_tsc) +static void guest_write_tsc(u64 guest_tsc, u64 host_tsc) { - u64 host_tsc; - - rdtscll(host_tsc); vmcs_write64(TSC_OFFSET, guest_tsc - host_tsc); } @@ -924,6 +921,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data) { struct vcpu_vmx *vmx = to_vmx(vcpu); struct kvm_msr_entry *msr; + u64 host_tsc; int ret = 0; switch (msr_index) { @@ -949,7 +947,8 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data) vmcs_writel(GUEST_SYSENTER_ESP, data); break; 
case MSR_IA32_TIME_STAMP_COUNTER: - guest_write_tsc(data); + rdtscll(host_tsc); + guest_write_tsc(data, host_tsc); break; case MSR_P6_PERFCTR0: case MSR_P6_PERFCTR1: @@ -2111,7 +2110,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx) { u32 host_sysenter_cs, msr_low, msr_high; u32 junk; - u64 host_pat; + u64 host_pat, tsc_this, tsc_base; unsigned long a; struct descriptor_table dt; int i; @@ -2239,6 +2238,12 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx) vmcs_writel(CR0_GUEST_HOST_MASK, ~0UL); vmcs_writel(CR4_GUEST_HOST_MASK, KVM_GUEST_CR4_MASK); + tsc_base = vmx-vcpu.kvm-arch.vm_init_tsc; + rdtscll(tsc_this); + if (tsc_this vmx-vcpu.kvm-arch.vm_init_tsc) + tsc_base = tsc_this; + + guest_write_tsc(0, tsc_base); return 0; } @@ -2331,8 +2336,6 @@ static int vmx_vcpu_reset(struct kvm_vcpu *vcpu) vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, 0); vmcs_write32(GUEST_PENDING_DBG_EXCEPTIONS, 0); - guest_write_tsc(0); - /* Special registers */ vmcs_write64(GUEST_IA32_DEBUGCTL, 0); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ba10287..b2d64eb 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4122,6 +4122,8 @@ struct kvm *kvm_arch_create_vm(void) /* Reserve bit 0 of irq_sources_bitmap for userspace irq source */ set_bit(KVM_USERSPACE_IRQ_SOURCE_ID, kvm-arch.irq_sources_bitmap); + rdtscll(kvm-arch.vm_init_tsc); + return kvm; } -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
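The arithmetic behind the patch follows from the comment it quotes: guest_tsc = host_tsc + tsc_offset, so tsc_offset = guest_tsc - host_tsc, all modulo 2^64. The sketch below replays that arithmetic in userspace with hypothetical helper names; it is not the kernel code. It also shows why the base is clamped to the current reading when a vcpu is set up on a CPU whose counter is behind the value sampled at VM creation.

```c
#include <assert.h>
#include <stdint.h>

/* tsc_offset = guest_tsc - host_tsc (unsigned, wraps modulo 2^64). */
static uint64_t sketch_tsc_offset(uint64_t guest_tsc, uint64_t host_tsc)
{
    return guest_tsc - host_tsc;
}

/* What the guest reads: host counter plus the programmed offset. */
static uint64_t sketch_guest_tsc(uint64_t host_tsc_now, uint64_t tsc_offset)
{
    return host_tsc_now + tsc_offset;
}

/* Base used at vcpu setup: the VM creation timestamp, unless the current
 * CPU's counter is behind it, in which case use the current reading. */
static uint64_t sketch_pick_base(uint64_t vm_init_tsc, uint64_t tsc_this)
{
    return tsc_this < vm_init_tsc ? tsc_this : vm_init_tsc;
}
```

With the VM created at host tick 1000, a vcpu set up at tick 1500 starts with a guest TSC of 500, so all vcpus share one origin. If the vcpu instead lands on a CPU that reads 900, an unclamped base of 1000 would make the guest see 2^64 - 100, a value close to wraparound; the clamped base of 900 makes it start at 0.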
[PATCH 0/11] Add E500 support in KVM
This patch set adds E500 support to KVM. It is already based on Hollis's feedback (a patchset which has not been committed yet).

By the way, the latest code seems to be broken; the build fails with this error:

---
  CC      arch/powerpc/kvm/../../../virt/kvm/kvm_main.o
arch/powerpc/kvm/../../../virt/kvm/kvm_main.c: In function ‘make_all_cpus_request’:
arch/powerpc/kvm/../../../virt/kvm/kvm_main.c:577: error: implicit declaration of function ‘smp_call_function_many’
make[1]: *** [arch/powerpc/kvm/../../../virt/kvm/kvm_main.o] Error 1
make: *** [arch/powerpc/kvm] Error 2
--

Fortunately, I could test these patches on top of a commit from a couple of days ago.

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 02/11] No need to include core header for KVM in asm-offsets.c currently
Signed-off-by: Liu Yu [EMAIL PROTECTED]
---
 arch/powerpc/kernel/asm-offsets.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index fc3b863..544a27f 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -49,7 +49,7 @@
 #include <asm/iseries/alpaca.h>
 #endif
 #ifdef CONFIG_KVM
-#include <asm/kvm_44x.h>
+#include <linux/kvm_host.h>
 #endif
 
 #if defined(CONFIG_BOOKE) || defined(CONFIG_40x)
@@ -355,8 +355,6 @@ int main(void)
 	DEFINE(PTE_SIZE, sizeof(pte_t));
 
 #ifdef CONFIG_KVM
-	DEFINE(TLBE_BYTES, sizeof(struct kvmppc_44x_tlbe));
-
 	DEFINE(VCPU_HOST_STACK, offsetof(struct kvm_vcpu, arch.host_stack));
 	DEFINE(VCPU_HOST_PID, offsetof(struct kvm_vcpu, arch.host_pid));
 	DEFINE(VCPU_GPRS, offsetof(struct kvm_vcpu, arch.gpr));
-- 
1.5.4
[PATCH 04/11] Put iccci into CONFIG_44x ifdef
E500 doesn't support this instruction.

Signed-off-by: Liu Yu [EMAIL PROTECTED]
---
 arch/powerpc/kvm/booke_interrupts.S |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S
index 084ebcd..4679ec2 100644
--- a/arch/powerpc/kvm/booke_interrupts.S
+++ b/arch/powerpc/kvm/booke_interrupts.S
@@ -347,7 +347,9 @@ lightweight_exit:
 	lwz	r3, VCPU_SHADOW_PID(r4)
 	mtspr	SPRN_PID, r3
 
+#ifdef CONFIG_44x
 	iccci	0, 0 /* XXX hack */
+#endif
 
 	/* Load some guest volatiles. */
 	lwz	r0, VCPU_GPR(r0)(r4)
-- 
1.5.4
[PATCH 05/11] E500 core-specific code
Signed-off-by: Liu Yu [EMAIL PROTECTED] --- arch/powerpc/include/asm/kvm_e500.h | 67 +++ arch/powerpc/kvm/e500.c | 151 +++ 2 files changed, 218 insertions(+), 0 deletions(-) create mode 100644 arch/powerpc/include/asm/kvm_e500.h create mode 100644 arch/powerpc/kvm/e500.c diff --git a/arch/powerpc/include/asm/kvm_e500.h b/arch/powerpc/include/asm/kvm_e500.h new file mode 100644 index 000..9d497ce --- /dev/null +++ b/arch/powerpc/include/asm/kvm_e500.h @@ -0,0 +1,67 @@ +/* + * Copyright (C) 2008 Freescale Semiconductor, Inc. All rights reserved. + * + * Author: Yu Liu, [EMAIL PROTECTED] + * + * Description: + * This file is derived from arch/powerpc/include/asm/kvm_44x.h, + * by Hollis Blanchard [EMAIL PROTECTED]. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + */ + +#ifndef __ASM_KVM_E500_H__ +#define __ASM_KVM_E500_H__ + +#include linux/kvm_host.h + +#define BOOKE_INTERRUPT_SIZE 36 + +#define E500_PID_NUM 3 +#define E500_TLB_NUM 2 + +struct tlbe{ + u32 mas1; + u32 mas2; + u32 mas3; + u32 mas7; +}; + +struct kvmppc_vcpu_e500 { + /* Unmodified copy of the guest's TLB. */ + struct tlbe *guest_tlb[E500_TLB_NUM]; + /* TLB that's actually used when the guest is running. */ + struct tlbe *shadow_tlb[E500_TLB_NUM]; + /* Pages which are referenced in the shadow TLB. 
*/ + struct page **shadow_pages[E500_TLB_NUM]; + + unsigned int guest_tlb_size[E500_TLB_NUM]; + unsigned int shadow_tlb_size[E500_TLB_NUM]; + unsigned int guest_tlb_nv[E500_TLB_NUM]; + + u32 host_pid[E500_PID_NUM]; + u32 pid[E500_PID_NUM]; + + u32 mas0; + u32 mas1; + u32 mas2; + u32 mas3; + u32 mas4; + u32 mas5; + u32 mas6; + u32 mas7; + u32 l1csr1; + u32 hid0; + u32 hid1; + + struct kvm_vcpu vcpu; +}; + +static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) +{ + return container_of(vcpu, struct kvmppc_vcpu_e500, vcpu); +} + +#endif /* __ASM_KVM_E500_H__ */ diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c new file mode 100644 index 000..b1950ff --- /dev/null +++ b/arch/powerpc/kvm/e500.c @@ -0,0 +1,151 @@ +/* + * Copyright (C) 2008 Freescale Semiconductor, Inc. All rights reserved. + * + * Author: Yu Liu, [EMAIL PROTECTED] + * + * Description: + * This file is derived from arch/powerpc/kvm/44x.c, + * by Hollis Blanchard [EMAIL PROTECTED]. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. 
+ */ + +#include linux/kvm_host.h +#include linux/err.h + +#include asm/reg.h +#include asm/cputable.h +#include asm/tlbflush.h +#include asm/kvm_e500.h +#include asm/kvm_ppc.h + +#include e500_tlb.h + +void kvmppc_core_load_host_debugstate(struct kvm_vcpu *vcpu) +{ +} + +void kvmppc_core_load_guest_debugstate(struct kvm_vcpu *vcpu) +{ +} + +void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu) +{ + kvmppc_e500_tlb_load(vcpu, cpu); +} + +void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu) +{ + kvmppc_e500_tlb_put(vcpu); +} + +int kvmppc_core_check_processor_compat(void) +{ + int r; + + if (strcmp(cur_cpu_spec-cpu_name, e500v2) == 0) + r = 0; + else + r = -ENOTSUPP; + + return r; +} + +int kvmppc_core_vcpu_setup(struct kvm_vcpu *vcpu) +{ + struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); + + kvmppc_e500_tlb_setup(vcpu_e500); + + /* Use the same core vertion as host's */ + vcpu-arch.pvr = mfspr(SPRN_PVR); + + return 0; +} + +/* 'linear_address' is actually an encoding of AS|PID|EADDR . */ +int kvmppc_core_vcpu_translate(struct kvm_vcpu *vcpu, + struct kvm_translation *tr) +{ + int index; + gva_t eaddr; + u8 pid; + u8 as; + + eaddr = tr-linear_address; + pid = (tr-linear_address 32) 0xff; + as = (tr-linear_address 40) 0x1; + + index = kvmppc_e500_tlb_search(vcpu, eaddr, as); + if (index 0) { + tr-valid = 0; + return 0; + } + + tr-physical_address = kvmppc_mmu_xlate(vcpu, index, eaddr); + /* XXX what does writeable and usermode even mean? */ + tr-valid = 1; + + return 0; +} + +struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, unsigned int id) +{ + struct kvmppc_vcpu_e500 *vcpu_e500; + struct kvm_vcpu *vcpu; + int err; + + vcpu_e500 = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL); + if (!vcpu_e500) { + err = -ENOMEM; + goto out; + } + + vcpu = vcpu_e500-vcpu; + err = kvm_vcpu_init(vcpu,
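The to_e500() helper in kvm_e500.h relies on the generic struct kvm_vcpu being embedded inside the core-specific struct kvmppc_vcpu_e500, so that a pointer to the inner member can be converted back to the outer structure. A minimal userspace sketch of that pattern follows. The struct layouts here are simplified placeholders, and the container_of macro is a plain offsetof-based stand-in for the kernel's version, not a copy of it.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Placeholder for the generic struct kvm_vcpu. */
struct vcpu {
    int id;
};

/* Placeholder for struct kvmppc_vcpu_e500: core state plus embedded vcpu. */
struct vcpu_e500 {
    unsigned int pid[3];
    struct vcpu vcpu;
};

/* Recover the core-specific structure from a pointer to its embedded member. */
static struct vcpu_e500 *to_e500(struct vcpu *vcpu)
{
    return container_of(vcpu, struct vcpu_e500, vcpu);
}
```

Generic code only ever sees the inner struct pointer; core-specific code calls to_e500() whenever it needs the surrounding state, which is why the patch allocates the outer structure and hands out the address of its vcpu member.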
[PATCH 06/11] E500 instructions emulation
Signed-off-by: Liu Yu [EMAIL PROTECTED] --- arch/powerpc/kvm/e500_emulate.c | 359 +++ 1 files changed, 359 insertions(+), 0 deletions(-) create mode 100644 arch/powerpc/kvm/e500_emulate.c diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c new file mode 100644 index 000..defe1d9 --- /dev/null +++ b/arch/powerpc/kvm/e500_emulate.c @@ -0,0 +1,359 @@ +/* + * Copyright (C) 2008 Freescale Semiconductor, Inc. All rights reserved. + * + * Author: Yu Liu, [EMAIL PROTECTED] + * + * Description: + * This file is derived from arch/powerpc/kvm/44x_emulate.c, + * by Hollis Blanchard [EMAIL PROTECTED]. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + */ + +#include asm/kvm_ppc.h +#include asm/disassemble.h +#include asm/kvm_e500.h + +#include booke.h +#include e500_tlb.h + +#define OP_RFI 19 + +#define XOP_RFI50 +#define XOP_MFMSR 83 +#define XOP_WRTEE 131 +#define XOP_MTMSR 146 +#define XOP_WRTEEI 163 +#define XOP_TLBIVAX786 +#define XOP_TLBSX 914 +#define XOP_TLBRE 946 +#define XOP_TLBWE 978 + +static void kvmppc_emul_rfi(struct kvm_vcpu *vcpu) +{ + vcpu-arch.pc = vcpu-arch.srr0; + kvmppc_set_msr(vcpu, vcpu-arch.srr1); +} + +int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, + unsigned int inst, int *advance) +{ + int emulated = EMULATE_DONE; + int ra; + int rb; + int rs; + int rt; + + switch (get_op(inst)) { + + case OP_RFI: + switch (get_xop(inst)) { + case XOP_RFI: + kvmppc_emul_rfi(vcpu); + *advance = 0; + break; + + default: + emulated = EMULATE_FAIL; + break; + } + break; + + case 31: + switch (get_xop(inst)) { + + case XOP_MFMSR: + rt = get_rt(inst); + vcpu-arch.gpr[rt] = vcpu-arch.msr; + break; + + case XOP_MTMSR: + rs = get_rs(inst); + kvmppc_set_msr(vcpu, vcpu-arch.gpr[rs]); + break; + + case XOP_WRTEE: + rs = get_rs(inst); + vcpu-arch.msr = (vcpu-arch.msr ~MSR_EE) + | 
(vcpu-arch.gpr[rs] MSR_EE); + break; + + case XOP_WRTEEI: + vcpu-arch.msr = (vcpu-arch.msr ~MSR_EE) + | (inst MSR_EE); + break; + + case XOP_TLBRE: + emulated = kvmppc_e500_emul_tlbre(vcpu); + break; + + case XOP_TLBWE: + emulated = kvmppc_e500_emul_tlbwe(vcpu); + break; + + case XOP_TLBSX: + rb = get_rb(inst); + emulated = kvmppc_e500_emul_tlbsx(vcpu,rb); + break; + + case XOP_TLBIVAX: + ra = get_ra(inst); + rb = get_rb(inst); + emulated = kvmppc_e500_emul_tlbivax(vcpu, ra, rb); + break; + + default: + emulated = EMULATE_FAIL; + } + + break; + + default: + emulated = EMULATE_FAIL; + } + + return emulated; +} + +int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs) +{ + struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); + + switch (sprn) { + /* E500 */ + case SPRN_PID: + vcpu_e500-pid[0] = vcpu-arch.pid = vcpu-arch.gpr[rs]; break; + case SPRN_PID1: + vcpu_e500-pid[1] = vcpu-arch.gpr[rs]; break; + case SPRN_PID2: + vcpu_e500-pid[2] = vcpu-arch.gpr[rs]; break; + case SPRN_MAS0: + vcpu_e500-mas0 = vcpu-arch.gpr[rs]; break; + case SPRN_MAS1: + vcpu_e500-mas1 = vcpu-arch.gpr[rs]; break; + case SPRN_MAS2: + vcpu_e500-mas2 = vcpu-arch.gpr[rs]; break; + case SPRN_MAS3: + vcpu_e500-mas3 = vcpu-arch.gpr[rs]; break; + case SPRN_MAS4: + vcpu_e500-mas4 = vcpu-arch.gpr[rs]; break; + case SPRN_MAS6: + vcpu_e500-mas6 = vcpu-arch.gpr[rs]; break; + case SPRN_MAS7: + vcpu_e500-mas7 = vcpu-arch.gpr[rs]; break; + case SPRN_L1CSR1: + vcpu_e500-l1csr1 = vcpu-arch.gpr[rs]; break; + case SPRN_HID0: + vcpu_e500-hid0 = vcpu-arch.gpr[rs]; break; + case SPRN_HID1: +
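The emulation code dispatches on the primary opcode (get_op) and the extended opcode (get_xop), and wrtee/wrteei replace only the EE bit of the guest MSR. Below is a small standalone sketch of both steps. The helper bodies are assumptions based on the standard PowerPC instruction layout rather than copies of asm/disassemble.h, and MSR_EE is assumed to be 0x8000.

```c
#include <stdint.h>

#define MSR_EE 0x8000u  /* assumed value of the external-interrupt enable bit */

/* Primary opcode: the top 6 bits of the instruction word. */
static unsigned int get_op(uint32_t inst)
{
    return inst >> 26;
}

/* Extended opcode: the 10 bits above the lowest bit. */
static unsigned int get_xop(uint32_t inst)
{
    return (inst >> 1) & 0x3ff;
}

/* Target register field. */
static unsigned int get_rt(uint32_t inst)
{
    return (inst >> 21) & 0x1f;
}

/* wrtee: take only the EE bit from the source register value. */
static uint32_t emul_wrtee(uint32_t msr, uint32_t rs_value)
{
    return (msr & ~MSR_EE) | (rs_value & MSR_EE);
}

/* wrteei: the E bit sits at the same position in the instruction word. */
static uint32_t emul_wrteei(uint32_t msr, uint32_t inst)
{
    return (msr & ~MSR_EE) | (inst & MSR_EE);
}
```

For example, 0x7C6000A6 (mfmsr r3) decodes to primary opcode 31 and extended opcode 83, matching XOP_MFMSR, while 0x4C000064 (rfi) decodes to OP_RFI 19 and XOP_RFI 50.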
[PATCH 08/11] Add kvmppc_mmu_dtlb/itlb_miss for booke
E500 needs to update some MMU registers when an ITLB or DTLB miss happens.

Signed-off-by: Liu Yu [EMAIL PROTECTED]
---
 arch/powerpc/include/asm/kvm_ppc.h |    2 ++
 arch/powerpc/kvm/44x_tlb.c         |    8 ++++++++
 arch/powerpc/kvm/booke.c           |    2 ++
 3 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index f4b041b..82547c8 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -63,6 +63,8 @@ extern int kvmppc_mmu_dtlb_index(struct kvm_vcpu *vcpu, gva_t eaddr);
 extern int kvmppc_mmu_itlb_index(struct kvm_vcpu *vcpu, gva_t eaddr);
 extern gpa_t kvmppc_mmu_xlate(struct kvm_vcpu *vcpu, unsigned int gtlb_index,
                               gva_t eaddr);
+extern void kvmppc_mmu_dtlb_miss(struct kvm_vcpu *vcpu);
+extern void kvmppc_mmu_itlb_miss(struct kvm_vcpu *vcpu);
 
 extern struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm,
                                                 unsigned int id);
diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c
index e67b731..4a16f47 100644
--- a/arch/powerpc/kvm/44x_tlb.c
+++ b/arch/powerpc/kvm/44x_tlb.c
@@ -232,6 +232,14 @@ int kvmppc_mmu_dtlb_index(struct kvm_vcpu *vcpu, gva_t eaddr)
 	return kvmppc_44x_tlb_index(vcpu, eaddr, vcpu->arch.pid, as);
 }
 
+void kvmppc_mmu_itlb_miss(struct kvm_vcpu *vcpu)
+{
+}
+
+void kvmppc_mmu_dtlb_miss(struct kvm_vcpu *vcpu)
+{
+}
+
 static void kvmppc_44x_shadow_release(struct kvmppc_vcpu_44x *vcpu_44x,
                                       unsigned int stlb_index)
 {
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index a73b395..933c406 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -295,6 +295,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_DTLB_MISS);
 		vcpu->arch.dear = vcpu->arch.fault_dear;
 		vcpu->arch.esr = vcpu->arch.fault_esr;
+		kvmppc_mmu_dtlb_miss(vcpu);
 		kvmppc_account_exit(vcpu, DTLB_REAL_MISS_EXITS);
 		r = RESUME_GUEST;
 		break;
@@ -337,6 +338,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		if (gtlb_index < 0) {
 			/* The guest didn't have a mapping for it. */
 			kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_ITLB_MISS);
+			kvmppc_mmu_itlb_miss(vcpu);
 			kvmppc_account_exit(vcpu, ITLB_REAL_MISS_EXITS);
 			break;
 		}
-- 
1.5.4
[PATCH 10/11] Fix IRQ priority search bug
Signed-off-by: Liu Yu [EMAIL PROTECTED]
---
 arch/powerpc/kvm/booke.c |    2 +-
 arch/powerpc/kvm/booke.h |    1 +
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 933c406..f192fbe 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -163,7 +163,7 @@ void kvmppc_core_deliver_interrupts(struct kvm_vcpu *vcpu)
 	unsigned int priority;
 
 	priority = __ffs(*pending);
-	while (priority <= BOOKE_MAX_INTERRUPT) {
+	while (priority <= BOOKE_IRQPRIO_MAX) {
 		if (kvmppc_booke_irqprio_deliver(vcpu, priority))
 			break;
 
diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h
index cf7c94c..52c0b1a 100644
--- a/arch/powerpc/kvm/booke.h
+++ b/arch/powerpc/kvm/booke.h
@@ -41,6 +41,7 @@
 #define BOOKE_IRQPRIO_EXTERNAL 13
 #define BOOKE_IRQPRIO_FIT 14
 #define BOOKE_IRQPRIO_DECREMENTER 15
+#define BOOKE_IRQPRIO_MAX 15
 
 /* Helper function for full MSR writes. No need to call this if only EE is
  * changing. */
-- 
1.5.4
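The loop touched by this patch walks a bitmap of pending priorities starting from the lowest set bit, and must be bounded by the largest priority number rather than by the largest interrupt number. The sketch below shows that search in plain C. It is only an illustration: next_set_bit and deliver_first_allowed are hypothetical helpers standing in for __ffs/find_next_bit and kvmppc_booke_irqprio_deliver, and the "allowed" mask replaces the real MSR-based checks.

```c
#include <limits.h>

#define IRQPRIO_MAX 15  /* highest valid priority number, as in this patch */

/* Index of the lowest set bit at or above 'from', or the bit count if none. */
static unsigned int next_set_bit(unsigned long word, unsigned int from)
{
    const unsigned int nbits = sizeof(word) * CHAR_BIT;

    for (unsigned int i = from; i < nbits; i++)
        if (word & (1UL << i))
            return i;
    return nbits;
}

/*
 * Walk the pending bitmap from the lowest-numbered priority upwards and
 * deliver the first one that is currently allowed. Returns the delivered
 * priority (clearing its pending bit), or -1 if nothing could be delivered.
 */
static int deliver_first_allowed(unsigned long *pending, unsigned long allowed)
{
    unsigned int priority = next_set_bit(*pending, 0);

    while (priority <= IRQPRIO_MAX) {
        if (allowed & (1UL << priority)) {
            *pending &= ~(1UL << priority);
            return (int)priority;
        }
        priority = next_set_bit(*pending, priority + 1);
    }
    return -1;
}
```

Keeping the bound in the priority number space matters once the two spaces diverge, for example when further interrupt numbers are added without a one-to-one priority mapping.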
[PATCH 11/11] Add extra E500 exceptions
Signed-off-by: Liu Yu [EMAIL PROTECTED] --- arch/powerpc/include/asm/kvm_asm.h |7 ++- arch/powerpc/include/asm/kvm_host.h |2 +- arch/powerpc/kvm/booke.c| 18 ++ arch/powerpc/kvm/booke.h| 30 ++ arch/powerpc/kvm/booke_interrupts.S |3 +++ arch/powerpc/kvm/e500.c | 20 +++- arch/powerpc/kvm/e500_emulate.c | 24 7 files changed, 89 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h index 2197764..56bfae5 100644 --- a/arch/powerpc/include/asm/kvm_asm.h +++ b/arch/powerpc/include/asm/kvm_asm.h @@ -42,7 +42,12 @@ #define BOOKE_INTERRUPT_DTLB_MISS 13 #define BOOKE_INTERRUPT_ITLB_MISS 14 #define BOOKE_INTERRUPT_DEBUG 15 -#define BOOKE_MAX_INTERRUPT 15 + +/* E500 */ +#define BOOKE_INTERRUPT_SPE_UNAVAIL 32 +#define BOOKE_INTERRUPT_SPE_FP_DATA 33 +#define BOOKE_INTERRUPT_SPE_FP_ROUND 34 +#define BOOKE_INTERRUPT_PERFORMANCE_MONITOR 35 #define RESUME_FLAG_NV (10) /* Reload guest nonvolatile state? */ #define RESUME_FLAG_HOST(11) /* Resume host? 
*/ diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 63962fa..5e22681 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -150,7 +150,7 @@ struct kvm_vcpu_arch { u32 tbu; u32 tcr; u32 tsr; - u32 ivor[16]; + u32 ivor[64]; ulong ivpr; u32 pir; diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index f192fbe..642e420 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -118,6 +118,9 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu, case BOOKE_IRQPRIO_DATA_STORAGE: case BOOKE_IRQPRIO_INST_STORAGE: case BOOKE_IRQPRIO_FP_UNAVAIL: + case BOOKE_IRQPRIO_SPE_UNAVAIL: + case BOOKE_IRQPRIO_SPE_FP_DATA: + case BOOKE_IRQPRIO_SPE_FP_ROUND: case BOOKE_IRQPRIO_AP_UNAVAIL: case BOOKE_IRQPRIO_ALIGNMENT: allowed = 1; @@ -261,6 +264,21 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, r = RESUME_GUEST; break; + case BOOKE_INTERRUPT_SPE_UNAVAIL: + kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_SPE_UNAVAIL); + r = RESUME_GUEST; + break; + + case BOOKE_INTERRUPT_SPE_FP_DATA: + kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_SPE_FP_DATA); + r = RESUME_GUEST; + break; + + case BOOKE_INTERRUPT_SPE_FP_ROUND: + kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_SPE_FP_ROUND); + r = RESUME_GUEST; + break; + case BOOKE_INTERRUPT_DATA_STORAGE: vcpu-arch.dear = vcpu-arch.fault_dear; vcpu-arch.esr = vcpu-arch.fault_esr; diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h index 52c0b1a..700013a 100644 --- a/arch/powerpc/kvm/booke.h +++ b/arch/powerpc/kvm/booke.h @@ -30,18 +30,24 @@ #define BOOKE_IRQPRIO_ALIGNMENT 2 #define BOOKE_IRQPRIO_PROGRAM 3 #define BOOKE_IRQPRIO_FP_UNAVAIL 4 -#define BOOKE_IRQPRIO_SYSCALL 5 -#define BOOKE_IRQPRIO_AP_UNAVAIL 6 -#define BOOKE_IRQPRIO_DTLB_MISS 7 -#define BOOKE_IRQPRIO_ITLB_MISS 8 -#define BOOKE_IRQPRIO_MACHINE_CHECK 9 -#define BOOKE_IRQPRIO_DEBUG 10 -#define BOOKE_IRQPRIO_CRITICAL 11 -#define 
BOOKE_IRQPRIO_WATCHDOG 12 -#define BOOKE_IRQPRIO_EXTERNAL 13 -#define BOOKE_IRQPRIO_FIT 14 -#define BOOKE_IRQPRIO_DECREMENTER 15 -#define BOOKE_IRQPRIO_MAX 15 +#define BOOKE_IRQPRIO_SPE_UNAVAIL 5 +#define BOOKE_IRQPRIO_SPE_FP_DATA 6 +#define BOOKE_IRQPRIO_SPE_FP_ROUND 7 +#define BOOKE_IRQPRIO_SYSCALL 8 +#define BOOKE_IRQPRIO_AP_UNAVAIL 9 +#define BOOKE_IRQPRIO_DTLB_MISS 10 +#define BOOKE_IRQPRIO_ITLB_MISS 11 +#define BOOKE_IRQPRIO_MACHINE_CHECK 12 +#define BOOKE_IRQPRIO_DEBUG 13 +#define BOOKE_IRQPRIO_CRITICAL 14 +#define BOOKE_IRQPRIO_WATCHDOG 15 +#define BOOKE_IRQPRIO_EXTERNAL 16 +#define BOOKE_IRQPRIO_FIT 17 +#define BOOKE_IRQPRIO_DECREMENTER 18 +#define BOOKE_IRQPRIO_PERFORMANCE_MONITOR 19 +#define BOOKE_IRQPRIO_MAX 19 + +extern unsigned long kvmppc_booke_handlers; /* Helper function for full MSR writes. No need to call this if only EE is * changing. */ diff --git a/arch/powerpc/kvm/booke_interrupts.S b/arch/powerpc/kvm/booke_interrupts.S index 4679ec2..d0c6f84 100644 --- a/arch/powerpc/kvm/booke_interrupts.S +++ b/arch/powerpc/kvm/booke_interrupts.S @@ -86,6 +86,9 @@ KVM_HANDLER BOOKE_INTERRUPT_WATCHDOG KVM_HANDLER BOOKE_INTERRUPT_DTLB_MISS KVM_HANDLER BOOKE_INTERRUPT_ITLB_MISS KVM_HANDLER BOOKE_INTERRUPT_DEBUG +KVM_HANDLER BOOKE_INTERRUPT_SPE_UNAVAIL +KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_DATA +KVM_HANDLER BOOKE_INTERRUPT_SPE_FP_ROUND _GLOBAL(kvmppc_handler_len) .long kvmppc_handler_1 - kvmppc_handler_0 diff --git
[PATCH 1 of 3] [PATCH] kvm-userspace: ppc: Add kvm_translate wrapper
# HG changeset patch
# User Christian Ehrhardt [EMAIL PROTECTED]
# Date 1228924564 -3600
# Node ID 38846cef16e56c681da1ddc179e248972c8b2ff9
# Parent  705d874ff7a24484eaa15ed75a748c4e1a70c2ef
[PATCH] kvm-userspace: ppc: Add kvm_translate wrapper

From: Hollis Blanchard [EMAIL PROTECTED]

Add kvm_translate() wrapper used to get mmu translations from userspace.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 libkvm.c |    5 +++++
 libkvm.h |    2 ++
 2 files changed, 7 insertions(+)

[diff]

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -987,6 +987,11 @@ int kvm_guest_debug(kvm_context_t kvm, i
 	return ioctl(kvm->vcpu_fd[vcpu], KVM_DEBUG_GUEST, dbg);
 }
 
+int kvm_translate(kvm_context_t kvm, int vcpu, struct kvm_translation *tr)
+{
+	return ioctl(kvm->vcpu_fd[vcpu], KVM_TRANSLATE, tr);
+}
+
 int kvm_set_signal_mask(kvm_context_t kvm, int vcpu, const sigset_t *sigset)
 {
 	struct kvm_signal_mask *sigmask;
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -639,6 +639,8 @@ int kvm_set_pit(kvm_context_t kvm, struc
 int kvm_set_pit(kvm_context_t kvm, struct kvm_pit_state *s);
 #endif
 
+int kvm_translate(kvm_context_t kvm, int vcpu, struct kvm_translation *tr);
+
 #endif
 
 #ifdef KVM_CAP_VAPIC
[PATCH 3 of 3] [PATCH] kvm-userspace: fix gdbstub kvm integration
# HG changeset patch
# User Christian Ehrhardt [EMAIL PROTECTED]
# Date 1228989958 -3600
# Node ID f80fb35de91fe69dae889c70948c9a53212ee444
# Parent  6f228c807ad0b239b7342d2974debfc66418d784
[PATCH] kvm-userspace: fix gdbstub kvm integration

From: Christian Ehrhardt [EMAIL PROTECTED]

Some recent qemu upstream merges brought in a new concept: env is no longer used as the current cpu in gdb_handle_packet. The kvm calls still use it, however, which leads to SIGSEGVs because env is not initialized when functions like kvm_save_registers are called. Instead, there is now a gdbstate structure that holds the current cpu, split for step/continue and other operations. This patch changes the kvm_save_registers calls in gdb_handle_packet to use the right CPUState variable.

Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 gdbstub.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

[diff]

diff --git a/qemu/gdbstub.c b/qemu/gdbstub.c
--- a/qemu/gdbstub.c
+++ b/qemu/gdbstub.c
@@ -1348,7 +1348,7 @@ static int gdb_handle_packet(GDBState *s
         }
         break;
     case 'g':
-        kvm_save_registers(env);
+        kvm_save_registers(s->g_cpu);
         len = 0;
         for (addr = 0; addr < num_g_regs; addr++) {
             reg_size = gdb_read_register(s->g_cpu, mem_buf + len, addr);
@@ -1366,7 +1366,7 @@ static int gdb_handle_packet(GDBState *s
             len -= reg_size;
             registers += reg_size;
         }
-        kvm_load_registers(env);
+        kvm_load_registers(s->g_cpu);
         put_packet(s, "OK");
         break;
     case 'm':
Re: [PATCH] [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub
This is v2, as version one had a typo in it that occurred when splitting the patches. Mercurial somehow lost my changes to the patch description explaining that, but the patch is right this way.

Christian Ehrhardt wrote:

# HG changeset patch
# User Christian Ehrhardt [EMAIL PROTECTED]
# Date 1228999833 -3600
# Node ID dc1466c9077ab162f4637fffee1869f26be02299
# Parent  4c07fe2a56c7653a9113e05bb08c2de9aec210ce
[PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub

From: Hollis Blanchard [EMAIL PROTECTED]

Add basic KVM PowerPC support to qemu's gdbstub, introducing a kvm ppc style mmu implementation that uses the kvm_translate ioctl. This also requires saving the kvm registers prior to the 'm' gdb operations.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
Signed-off-by: Christian Ehrhardt [EMAIL PROTECTED]
---

[diffstat]
 gdbstub.c                   |    2 ++
 hw/ppc440_bamboo.c          |    1 +
 qemu-kvm-powerpc.c          |   28 ++++++++++++++++++++++++++++
 target-ppc/cpu.h            |    2 ++
 target-ppc/helper.c         |    4 ++++
 target-ppc/translate_init.c |    5 +++++
 6 files changed, 42 insertions(+)

[diff]

diff --git a/qemu/gdbstub.c b/qemu/gdbstub.c
--- a/qemu/gdbstub.c
+++ b/qemu/gdbstub.c
@@ -1374,6 +1374,7 @@ static int gdb_handle_packet(GDBState *s
         if (*p == ',')
             p++;
         len = strtoull(p, NULL, 16);
+        kvm_save_registers(s->g_cpu);
         if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 0) != 0) {
             put_packet (s, "E14");
         } else {
@@ -1389,6 +1390,7 @@ static int gdb_handle_packet(GDBState *s
         if (*p == ':')
             p++;
         hextomem(mem_buf, p, len);
+        kvm_save_registers(s->g_cpu);
         if (cpu_memory_rw_debug(s->g_cpu, addr, mem_buf, len, 1) != 0)
             put_packet(s, "E14");
         else
diff --git a/qemu/hw/ppc440_bamboo.c b/qemu/hw/ppc440_bamboo.c
--- a/qemu/hw/ppc440_bamboo.c
+++ b/qemu/hw/ppc440_bamboo.c
@@ -99,6 +99,7 @@ void bamboo_init(ram_addr_t ram_size, in
         fprintf(stderr, "Unable to initialize CPU!\n");
         exit(1);
     }
+    env->mmu_model = POWERPC_MMU_KVM;
 
     /* call init */
     printf("Calling function ppc440_init\n");
diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c
--- a/qemu/qemu-kvm-powerpc.c
+++ b/qemu/qemu-kvm-powerpc.c
@@ -102,6 +102,7 @@ void kvm_arch_save_regs(CPUState *env)
 
     env->spr[SPR_SRR0] = regs.srr0;
     env->spr[SPR_SRR1] = regs.srr1;
+    env->spr[SPR_BOOKE_PID] = regs.pid;
 
     env->spr[SPR_SPRG0] = regs.sprg0;
     env->spr[SPR_SPRG1] = regs.sprg1;
@@ -219,6 +220,33 @@ int handle_powerpc_dcr_write(int vcpu, u
     return 0; /* XXX ignore failed DCR ops */
 }
 
+int mmukvm_get_physical_address(CPUState *env, mmu_ctx_t *ctx,
+                                target_ulong eaddr, int rw, int access_type)
+{
+    struct kvm_translation tr;
+    uint64_t pid;
+    uint64_t as;
+    int r;
+
+    pid = env->spr[SPR_BOOKE_PID];
+
+    if (access_type == ACCESS_CODE)
+        as = env->msr & msr_ir;
+    else
+        as = env->msr & msr_dr;
+
+    tr.linear_address = as << 40 | pid << 32 | eaddr;
+    r = kvm_translate(kvm_context, env->cpu_index, &tr);
+    if (r == -1)
+        return r;
+
+    if (!tr.valid)
+        return -EFAULT;
+
+    ctx->raddr = tr.physical_address;
+    return 0;
+}
+
 void kvm_arch_cpu_reset(CPUState *env)
 {
 }
diff --git a/qemu/target-ppc/cpu.h b/qemu/target-ppc/cpu.h
--- a/qemu/target-ppc/cpu.h
+++ b/qemu/target-ppc/cpu.h
@@ -98,6 +98,8 @@ enum powerpc_mmu_t {
     POWERPC_MMU_BOOKE_FSL = 0x0009,
     /* PowerPC 601 MMU model (specific BATs format) */
     POWERPC_MMU_601 = 0x000A,
+    /* KVM managing the MMU state */
+    POWERPC_MMU_KVM = 0x000B,
 #if defined(TARGET_PPC64)
 #define POWERPC_MMU_64 0x0001
     /* 64 bits PowerPC MMU */
diff --git a/qemu/target-ppc/helper.c b/qemu/target-ppc/helper.c
--- a/qemu/target-ppc/helper.c
+++ b/qemu/target-ppc/helper.c
@@ -1429,6 +1429,10 @@ int get_physical_address (CPUState *env,
         fprintf(logfile, "%s\n", __func__);
     }
 #endif
+
+    if (env->mmu_model == POWERPC_MMU_KVM)
+        return mmukvm_get_physical_address(env, ctx, eaddr, rw, access_type);
+
     if ((access_type == ACCESS_CODE && msr_ir == 0) ||
         (access_type != ACCESS_CODE && msr_dr == 0)) {
         /* No address translation */
diff --git a/qemu/target-ppc/translate_init.c b/qemu/target-ppc/translate_init.c
--- a/qemu/target-ppc/translate_init.c
+++ b/qemu/target-ppc/translate_init.c
@@ -9273,6 +9273,11 @@ int cpu_ppc_register_internal (CPUPPCSta
     case POWERPC_MMU_601:
         mmu_model = "PowerPC 601";
         break;
+#ifdef KVM
+    case POWERPC_MMU_KVM:
+        mmu_model = "PowerPC KVM";
+        break;
+#endif
 #if defined (TARGET_PPC64)
     case POWERPC_MMU_64B:
         mmu_model = "PowerPC 64";
--
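mmukvm_get_physical_address() packs the address space bit, the PID and the effective address into the single linear_address field of struct kvm_translation, and the kernel side unpacks them again. A minimal sketch of that encoding follows, assuming a 1-bit AS at bit 40, an 8-bit PID at bit 32 and a 32-bit effective address, as suggested by the shifts and masks used in these patches. The function names are illustrative only and do not exist in qemu or the kernel.

```c
#include <stdint.h>

/* Pack AS (1 bit), PID (8 bits) and a 32-bit effective address. */
static uint64_t encode_linear_address(unsigned int as, unsigned int pid,
                                      uint32_t eaddr)
{
    return ((uint64_t)(as & 0x1) << 40) |
           ((uint64_t)(pid & 0xff) << 32) |
           eaddr;
}

/* The low 32 bits carry the effective address. */
static uint32_t decode_eaddr(uint64_t linear_address)
{
    return (uint32_t)linear_address;
}

/* Bits 32..39 carry the PID. */
static uint8_t decode_pid(uint64_t linear_address)
{
    return (linear_address >> 32) & 0xff;
}

/* Bit 40 carries the address space. */
static uint8_t decode_as(uint64_t linear_address)
{
    return (linear_address >> 40) & 0x1;
}
```

Unlike the patch, this sketch masks AS and PID before shifting, so an out-of-range value cannot spill into a neighbouring field.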
[PATCH 2/6] kvm: sync vcpu state during initialization
Currently on x86, qemu initializes CPUState but KVM ignores it and does its own vcpu initialization. However, PowerPC KVM needs to be able to set the initial register state to support the -kernel and -append options.

Signed-off-by: Hollis Blanchard holl...@us.ibm.com
---
 kvm-all.c |   15 +++++++++++++++
 kvm.h     |    1 +
 vl.c      |   11 +++++++++++
 3 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index dad80df..11034df 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -141,6 +141,21 @@ err:
     return ret;
 }
 
+int kvm_sync_vcpus(void)
+{
+    CPUState *env;
+
+    for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        int ret;
+
+        ret = kvm_arch_put_registers(env);
+        if (ret)
+            return ret;
+    }
+
+    return 0;
+}
+
 /*
  * dirty pages logging control
  */
diff --git a/kvm.h b/kvm.h
index ac464ab..efce145 100644
--- a/kvm.h
+++ b/kvm.h
@@ -31,6 +31,7 @@ struct kvm_run;
 int kvm_init(int smp_cpus);
 
 int kvm_init_vcpu(CPUState *env);
+int kvm_sync_vcpus(void);
 
 int kvm_cpu_exec(CPUState *env);
 
diff --git a/vl.c b/vl.c
index c3a8d8f..0a02151 100644
--- a/vl.c
+++ b/vl.c
@@ -5456,6 +5456,17 @@ int main(int argc, char **argv, char **envp)
     machine->init(ram_size, vga_ram_size, boot_devices, ds,
                   kernel_filename, kernel_cmdline, initrd_filename, cpu_model);
 
+    /* Set KVM's vcpu state to qemu's initial CPUState. */
+    if (kvm_enabled()) {
+        int ret;
+
+        ret = kvm_sync_vcpus();
+        if (ret < 0) {
+            fprintf(stderr, "failed to initialize vcpus\n");
+            exit(1);
+        }
+    }
+
     /* init USB devices */
     if (usb_enabled) {
         for(i = 0; i < usb_devices_index; i++) {
-- 
1.5.6.5
[PATCH 3/6] Enable KVM for ppcemb.
Implement hooks called by generic KVM code. Also add code that will copy the host's CPU and timebase frequencies to the guest, which is necessary on KVM because the guest can directly access the timebase. Signed-off-by: Hollis Blanchard holl...@us.ibm.com --- Makefile.target |3 + configure|6 ++ target-ppc/helper.c |5 + target-ppc/kvm.c | 212 ++ target-ppc/kvm_ppc.c | 105 + target-ppc/kvm_ppc.h | 15 6 files changed, 346 insertions(+), 0 deletions(-) create mode 100644 target-ppc/kvm.c create mode 100644 target-ppc/kvm_ppc.c create mode 100644 target-ppc/kvm_ppc.h diff --git a/Makefile.target b/Makefile.target index 7152dff..d01231d 100644 --- a/Makefile.target +++ b/Makefile.target @@ -652,6 +652,9 @@ OBJS+= heathrow_pic.o grackle_pci.o ppc_oldworld.o OBJS+= unin_pci.o ppc_chrp.o # PowerPC 4xx boards OBJS+= pflash_cfi02.o ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o +ifdef CONFIG_KVM +OBJS+= kvm_ppc.o +endif # virtio support OBJS+= virtio.o virtio-blk.o virtio-balloon.o endif diff --git a/configure b/configure index 13f6358..c534441 100755 --- a/configure +++ b/configure @@ -1463,6 +1463,7 @@ gdb_xml_files= # Make sure the target and host cpus are compatible if test $kvm = yes -a ! 
\( $target_cpu = $cpu -o \ + \( $target_cpu = ppcemb -a $cpu = powerpc \) -o \ \( $target_cpu = x86_64 -a $cpu = i386 \) -o \ \( $target_cpu = i386 -a $cpu = x86_64 \) \) ; then kvm=no @@ -1557,6 +1558,11 @@ case $target_cpu in echo #define TARGET_ARCH \ppcemb\ $config_h echo #define TARGET_PPC 1 $config_h echo #define TARGET_PPCEMB 1 $config_h +if test $kvm = yes ; then + echo CONFIG_KVM=yes $config_mak + echo KVM_CFLAGS=$kvm_cflags $config_mak + echo #define CONFIG_KVM 1 $config_h +fi ;; ppc64) echo TARGET_ARCH=ppc64 $config_mak diff --git a/target-ppc/helper.c b/target-ppc/helper.c index 33e8b3b..0b93f1c 100644 --- a/target-ppc/helper.c +++ b/target-ppc/helper.c @@ -30,6 +30,7 @@ #include helper_regs.h #include qemu-common.h #include helper.h +#include kvm.h //#define DEBUG_MMU //#define DEBUG_BATS @@ -2939,6 +2940,10 @@ CPUPPCState *cpu_ppc_init (const char *cpu_model) env-cpu_model_str = cpu_model; cpu_ppc_register_internal(env, def); cpu_ppc_reset(env); + +if (kvm_enabled()) +kvm_init_vcpu(env); + return env; } diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c new file mode 100644 index 000..90b943b --- /dev/null +++ b/target-ppc/kvm.c @@ -0,0 +1,212 @@ +/* + * PowerPC implementation of KVM hooks + * + * Copyright IBM Corp. 2007 + * + * Authors: + * Jerone Young jyou...@us.ibm.com + * Christian Ehrhardt ehrha...@linux.vnet.ibm.com + * Hollis Blanchard holl...@us.ibm.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include sys/types.h +#include sys/ioctl.h +#include sys/mman.h + +#include linux/kvm.h + +#include helper_regs.h +#include qemu-common.h +#include qemu-timer.h +#include sysemu.h +#include kvm.h +#include kvm_ppc.h +#include cpu.h +#include device_tree.h + +//#define DEBUG_KVM + +#ifdef DEBUG_KVM +#define dprintf(fmt, ...) \ +do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0) +#else +#define dprintf(fmt, ...) 
\ +do { } while (0) +#endif + +int kvm_arch_init(KVMState *s, int smp_cpus) +{ + return 0; +} + +int kvm_arch_init_vcpu(CPUState *cenv) +{ + return 0; +} + +int kvm_arch_put_registers(CPUState *env) +{ + struct kvm_regs regs; + int ret; + int i; + + ret = kvm_vcpu_ioctl(env, KVM_GET_REGS, regs); + if (ret 0) + return ret; + + /* cr is untouched in qemu and not existant in CPUState fr ppr */ + /* hflags is a morphed to MSR on ppc, no need to sync that down to kvm */ + + regs.pc = env-nip; + + regs.ctr = env-ctr; + regs.lr = env-lr; + regs.xer = env-xer; + regs.msr = env-msr; + + regs.srr0 = env-spr[SPR_SRR0]; + regs.srr1 = env-spr[SPR_SRR1]; + + regs.sprg0 = env-spr[SPR_SPRG0]; + regs.sprg1 = env-spr[SPR_SPRG1]; + regs.sprg2 = env-spr[SPR_SPRG2]; + regs.sprg3 = env-spr[SPR_SPRG3]; + regs.sprg4 = env-spr[SPR_SPRG4]; + regs.sprg5 = env-spr[SPR_SPRG5]; + regs.sprg6 = env-spr[SPR_SPRG6]; + regs.sprg7 = env-spr[SPR_SPRG7]; + + for (i = 0;i 32; i++) + regs.gpr[i] = env-gpr[i]; + + ret = kvm_vcpu_ioctl(env, KVM_SET_REGS, regs); + if (ret 0) + return ret; + + return ret; +} + +int kvm_arch_get_registers(CPUState *env) +{ + struct kvm_regs regs; + uint32_t i, ret; + + ret = kvm_vcpu_ioctl(env, KVM_GET_REGS, regs); + if (ret 0) + return ret; + + env-ctr = regs.ctr; + env-lr = regs.lr;
[PATCH 4/6] Implement device tree support needed for Bamboo emulation
To implement the -kernel, -initrd, and -append options, 4xx board emulation must load the guest kernel as if firmware had loaded it. Where u-boot would be the firmware, we must load the flat device tree into memory and set key fields such as /chosen/bootargs.

This patch introduces a dependency on libfdt for flat device tree support.

Signed-off-by: Hollis Blanchard <holl...@us.ibm.com>
---
 Makefile.target |    4 ++
 configure       |   18
 device_tree.c   |  116 +++
 device_tree.h   |   26
 libfdt_env.h    |   22 ++
 5 files changed, 186 insertions(+), 0 deletions(-)
 create mode 100644 device_tree.c
 create mode 100644 device_tree.h
 create mode 100644 libfdt_env.h

diff --git a/Makefile.target b/Makefile.target
index d01231d..5da4994 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -655,6 +655,10 @@ OBJS+= pflash_cfi02.o ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o
 ifdef CONFIG_KVM
 OBJS+= kvm_ppc.o
 endif
+ifdef FDT_LIBS
+OBJS+= device_tree.o
+LIBS+= $(FDT_LIBS)
+endif
 # virtio support
 OBJS+= virtio.o virtio-blk.o virtio-balloon.o
 endif
diff --git a/configure b/configure
index c534441..b54c15d 100755
--- a/configure
+++ b/configure
@@ -119,6 +119,7 @@ kvm="yes"
 kerneldir=""
 aix="no"
 blobs="yes"
+fdt="yes"

 # OS specific
 targetos=`uname -s`
@@ -976,6 +977,18 @@ if $cc $ARCH_CFLAGS -o $TMPE $TMPC 2> /dev/null ; then
   iovec=yes
 fi

+##
+# fdt probe
+if test "$fdt" = "yes" ; then
+    fdt=no
+    cat > $TMPC << EOF
+int main(void) { return 0; }
+EOF
+  if $cc $ARCH_CFLAGS -o $TMPE ${OS_CFLAGS} $TMPC -lfdt 2> /dev/null ; then
+    fdt=yes
+  fi
+fi
+
 # Check if tools are available to build documentation.
 if [ -x "`which texi2html 2>/dev/null`" ] && \
    [ -x "`which pod2man 2>/dev/null`" ]; then
@@ -1051,6 +1064,7 @@ echo "vde support       $vde"
 echo "AIO support       $aio"
 echo "Install blobs     $blobs"
 echo "KVM support       $kvm"
+echo "fdt support       $kvm"

 if test $sdl_too_old = "yes"; then
 echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -1340,6 +1354,10 @@ fi
 if test "$iovec" = "yes" ; then
   echo "#define HAVE_IOVEC 1" >> $config_h
 fi
+if test "$fdt" = "yes" ; then
+  echo "#define HAVE_FDT 1" >> $config_h
+  echo "FDT_LIBS=-lfdt" >> $config_mak
+fi

 # XXX: suppress that
 if [ "$bsd" = "yes" ] ; then
diff --git a/device_tree.c b/device_tree.c
new file mode 100644
index 0000000..d7350e3
--- /dev/null
+++ b/device_tree.c
@@ -0,0 +1,116 @@
+/*
+ * Functions to help device tree manipulation using libfdt.
+ * It also provides functions to read entries from device tree proc
+ * interface.
+ *
+ * Copyright 2008 IBM Corporation.
+ * Authors: Jerone Young <jyou...@us.ibm.com>
+ *          Hollis Blanchard <holl...@us.ibm.com>
+ *
+ * This work is licensed under the GNU GPL license version 2 or later.
+ *
+ */
+
+#include <stdio.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+
+#include "config.h"
+#include "qemu-common.h"
+#include "sysemu.h"
+#include "device_tree.h"
+
+#include <libfdt.h>
+
+void *load_device_tree(const char *filename_path, void *load_addr)
+{
+    int dt_file_size;
+    int dt_file_load_size;
+    int new_dt_size;
+    int ret;
+    void *dt_file = NULL;
+    void *fdt;
+
+    dt_file_size = get_image_size(filename_path);
+    if (dt_file_size < 0) {
+        printf("Unable to get size of device tree file '%s'\n",
+            filename_path);
+        goto fail;
+    }
+
+    /* First allocate space in qemu for device tree */
+    dt_file = qemu_malloc(dt_file_size);
+    if (dt_file == NULL) {
+        printf("Unable to allocate memory in qemu for device tree\n");
+        goto fail;
+    }
+    memset(dt_file, 0, dt_file_size);
+
+    dt_file_load_size = load_image(filename_path, dt_file);
+
+    /* Second we place new copy of 2x size in guest memory
+     * This give us enough room for manipulation.
+     */
+    new_dt_size = dt_file_size * 2;
+
+    fdt = load_addr;
+    ret = fdt_open_into(dt_file, fdt, new_dt_size);
+    if (ret) {
+        printf("Unable to copy device tree in memory\n");
+        goto fail;
+    }
+
+    /* Check sanity of device tree */
+    if (fdt_check_header(fdt)) {
+        printf ("Device tree file loaded into memory is invalid: %s\n",
+            filename_path);
+        goto fail;
+    }
+    /* free qemu memory with old device tree */
+    qemu_free(dt_file);
+    return fdt;
+
+fail:
+    if (dt_file)
+        qemu_free(dt_file);
+    return NULL;
+}
+
+int qemu_devtree_setprop(void *fdt, const char *node_path,
+                         const char *property, uint32_t *val_array, int size)
+{
+    int offset;
+
+    offset = fdt_path_offset(fdt, node_path);
+    if (offset < 0)
+        return offset;
+
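The "check sanity of device tree" step in load_device_tree() relies on libfdt's fdt_check_header(). As a rough illustration of what such a check involves, here is a minimal sketch that does not use libfdt. It inspects only the first two big-endian header words of a flattened device tree, the magic number 0xd00dfeed and the total size; the real fdt_check_header() validates more than this, including version fields. The names fdt_blob_looks_sane, read_be32 and EXAMPLE_FDT_MAGIC are hypothetical and are not libfdt API.

```c
#include <stddef.h>
#include <stdint.h>

/* Magic number at offset 0 of a flattened device tree blob. */
#define EXAMPLE_FDT_MAGIC 0xd00dfeedu

/* Read a 32-bit big-endian value regardless of host endianness. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Return 1 if the blob starts with the FDT magic and its recorded
 * total size fits inside the buffer it was loaded into, else 0. */
static int fdt_blob_looks_sane(const void *blob, size_t buf_size)
{
    const unsigned char *p = blob;
    uint32_t totalsize;

    if (buf_size < 8)
        return 0;
    if (read_be32(p) != EXAMPLE_FDT_MAGIC)
        return 0;

    totalsize = read_be32(p + 4);   /* totalsize field at offset 4 */
    return totalsize >= 8 && totalsize <= buf_size;
}
```

Reading the fields byte by byte keeps the check independent of host byte order, which matters because the blob is always big-endian even when qemu runs on a little-endian host.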
[PATCH 5/6] PowerPC 440EP SoC emulation
Wire up the system-on-chip devices present on 440EP chips.

This patch is a little unusual in that qemu doesn't actually emulate the 440 core, but we use this board code with KVM (which does). If/when 440 core emulation is supported, the kvm_enabled() hack can be removed.

Signed-off-by: Hollis Blanchard <holl...@us.ibm.com>
---
 Makefile.target |    1 +
 hw/ppc440.c     |  144 +++
 hw/ppc440.h     |   20
 3 files changed, 165 insertions(+), 0 deletions(-)
 create mode 100644 hw/ppc440.c
 create mode 100644 hw/ppc440.h

diff --git a/Makefile.target b/Makefile.target
index 5da4994..6032af0 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -652,6 +652,7 @@ OBJS+= heathrow_pic.o grackle_pci.o ppc_oldworld.o
 OBJS+= unin_pci.o ppc_chrp.o
 # PowerPC 4xx boards
 OBJS+= pflash_cfi02.o ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o
+OBJS+= ppc440.o
 ifdef CONFIG_KVM
 OBJS+= kvm_ppc.o
 endif
diff --git a/hw/ppc440.c b/hw/ppc440.c
new file mode 100644
index 0000000..654249f
--- /dev/null
+++ b/hw/ppc440.c
@@ -0,0 +1,144 @@
+/*
+ * Qemu PowerPC 440 chip emulation
+ *
+ * Copyright 2007 IBM Corporation.
+ * Authors:
+ *  Jerone Young <jyou...@us.ibm.com>
+ *  Christian Ehrhardt <ehrha...@linux.vnet.ibm.com>
+ *  Hollis Blanchard <holl...@us.ibm.com>
+ *
+ * This work is licensed under the GNU GPL license version 2 or later.
+ *
+ */
+
+#include "hw.h"
+#include "isa.h"
+#include "ppc.h"
+#include "ppc4xx.h"
+#include "ppc440.h"
+#include "ppc405.h"
+#include "sysemu.h"
+#include "kvm.h"
+
+#define PPC440EP_SDRAM_NR_BANKS 4
+
+#define PPC440EP_PCI_CONFIG     0xeec00000
+#define PPC440EP_PCI_INTACK     0xeed00000
+#define PPC440EP_PCI_SPECIAL    0xeed00000
+#define PPC440EP_PCI_REGS       0xef400000
+#define PPC440EP_PCI_IO         0xe8000000
+#define PPC440EP_PCI_IOLEN      0x00010000
+
+static const unsigned int ppc440ep_sdram_bank_sizes[] = {
+    256<<20, 128<<20, 64<<20, 32<<20, 16<<20, 8<<20, 0
+};
+
+/* XXX move to ppc4xx_devs.c */
+/* Fill in consecutive SDRAM banks with 'ram_size' bytes of memory.
+ *
+ * The SDRAM controller supports a small number of banks, and each bank must be
+ * one of a small set of sizes. The number of banks and the supported sizes
+ * varies by SoC. */
+ram_addr_t ppc4xx_sdram_adjust(ram_addr_t ram_size, int nr_banks,
+                               target_phys_addr_t ram_bases[],
+                               target_phys_addr_t ram_sizes[],
+                               const unsigned int sdram_bank_sizes[])
+{
+    ram_addr_t ram_end = 0;
+    int i;
+    int j;
+
+    for (i = 0; i < nr_banks; i++) {
+        for (j = 0; sdram_bank_sizes[j] != 0; j++) {
+            unsigned int bank_size = sdram_bank_sizes[j];
+
+            if (bank_size <= ram_size) {
+                ram_bases[i] = ram_end;
+                ram_sizes[i] = bank_size;
+                ram_end += bank_size;
+                ram_size -= bank_size;
+                break;
+            }
+        }
+
+        if (!ram_size) {
+            /* No need to use the remaining banks. */
+            break;
+        }
+    }
+
+    if (ram_size)
+        printf("Truncating memory to %d MiB to fit SDRAM controller limits.\n",
+               (int)(ram_end >> 20));
+
+    return ram_end;
+}
+
+CPUState *ppc440ep_init(ram_addr_t *ram_size, PCIBus **pcip,
+                        const unsigned int pci_irq_nrs[4], int do_init)
+{
+    target_phys_addr_t ram_bases[PPC440EP_SDRAM_NR_BANKS];
+    target_phys_addr_t ram_sizes[PPC440EP_SDRAM_NR_BANKS];
+    CPUState *env;
+    ppc4xx_mmio_t *mmio;
+    qemu_irq *pic;
+    qemu_irq *irqs;
+    qemu_irq *pci_irqs;
+
+    env = cpu_ppc_init("440EP");
+    if (!env && kvm_enabled()) {
+        /* XXX Since qemu doesn't yet emulate 440, we just say it's a 405.
+         * Since KVM doesn't use qemu's CPU emulation it seems to be working
+         * OK. */
+        env = cpu_ppc_init("405");
+    }
+    if (!env) {
+        fprintf(stderr, "Unable to initialize CPU!\n");
+        exit(1);
+    }
+
+    ppc_dcr_init(env, NULL, NULL);
+
+    /* interrupt controller */
+    irqs = qemu_mallocz(sizeof(qemu_irq) * PPCUIC_OUTPUT_NB);
+    irqs[PPCUIC_OUTPUT_INT] = ((qemu_irq *)env->irq_inputs)[PPC40x_INPUT_INT];
+    irqs[PPCUIC_OUTPUT_CINT] = ((qemu_irq *)env->irq_inputs)[PPC40x_INPUT_CINT];
+    pic = ppcuic_init(env, irqs, 0x0C0, 0, 1);
+
+    /* SDRAM controller */
+    memset(ram_bases, 0, sizeof(ram_bases));
+    memset(ram_sizes, 0, sizeof(ram_sizes));
+    *ram_size = ppc4xx_sdram_adjust(*ram_size, PPC440EP_SDRAM_NR_BANKS,
+                                    ram_bases, ram_sizes,
+                                    ppc440ep_sdram_bank_sizes);
+    /* XXX 440EP's ECC interrupts are on UIC1, but we've only created UIC0. */
+    ppc405_sdram_init(env, pic[14], PPC440EP_SDRAM_NR_BANKS, ram_bases,
+                      ram_sizes, do_init);
+
+    /* PCI */
+    pci_irqs = qemu_malloc(sizeof(qemu_irq) * 4);
+    pci_irqs[0] = pic[pci_irq_nrs[0]];
+
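The bank-filling logic of ppc4xx_sdram_adjust() is easier to follow with concrete numbers. The following is a standalone sketch of the same greedy approach, assuming a four-bank controller and the bank sizes listed in the patch; sdram_fill and example_bank_sizes are hypothetical names, and uint64_t stands in for qemu's ram_addr_t and target_phys_addr_t.

```c
#include <stdint.h>

/* Allowed bank sizes in bytes, largest first, terminated by 0. */
static const unsigned int example_bank_sizes[] = {
    256u << 20, 128u << 20, 64u << 20, 32u << 20, 16u << 20, 8u << 20, 0
};

/* Greedily place the largest allowed bank that still fits into each
 * consecutive slot. Returns the number of bytes actually mapped, which
 * is less than ram_size when the request cannot be represented. */
static uint64_t sdram_fill(uint64_t ram_size, int nr_banks,
                           uint64_t bases[], uint64_t sizes[],
                           const unsigned int allowed[])
{
    uint64_t ram_end = 0;
    int i, j;

    for (i = 0; i < nr_banks && ram_size != 0; i++) {
        for (j = 0; allowed[j] != 0; j++) {
            if (allowed[j] <= ram_size) {
                bases[i] = ram_end;      /* banks are contiguous */
                sizes[i] = allowed[j];
                ram_end += allowed[j];
                ram_size -= allowed[j];
                break;
            }
        }
    }
    return ram_end;
}
```

With four banks, a request for 144 MiB is satisfied exactly by two banks of 128 MiB and 16 MiB. A request for 250 MiB fills all four banks with 128, 64, 32 and 16 MiB, so only 240 MiB is mapped and the remaining 10 MiB is dropped, which is the case the "Truncating memory" message in the patch reports.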
Re: [Qemu-devel] [PATCH 2/6] kvm: sync vcpu state during initialization
Hollis Blanchard wrote:
> Currently on x86, qemu initializes CPUState but KVM ignores it and does its
> own vcpu initialization. However, PowerPC KVM needs to be able to set the
> initial register state to support the -kernel and -append options.
>
> Signed-off-by: Hollis Blanchard <holl...@us.ibm.com>

This segfaults on x86 when using -enable-kvm.

Regards,

Anthony Liguori
Re: [Qemu-devel] [PATCH 6/6] IBM PowerPC 440EP Bamboo reference board emulation
On 12/11/08, Hollis Blanchard <holl...@us.ibm.com> wrote:
> Since most IO devices are integrated into the 440EP chip, Bamboo support
> mostly entails implementing the -kernel, -initrd, and -append options.
>
> These options are implemented by loading the guest as if u-boot had done it,
> i.e. loading a flat device tree, updating it to hold initrd addresses, ram
> size, and command line, and passing the FDT address in r3.
>
> Since we use it with KVM, we enable the virtio block driver and include hooks
> necessary for KVM support.
>
> Signed-off-by: Hollis Blanchard <holl...@us.ibm.com>

> --- a/target-ppc/machine.c
> +++ b/target-ppc/machine.c
> @@ -1,5 +1,6 @@
>  #include "hw/hw.h"
>  #include "hw/boards.h"
> +#include "kvm.h"

Shouldn't be necessary?
Re: [Qemu-devel] [PATCH 6/6] IBM PowerPC 440EP Bamboo reference board emulation
Hollis Blanchard wrote:
> Since most IO devices are integrated into the 440EP chip, Bamboo support
> mostly entails implementing the -kernel, -initrd, and -append options.
>
> These options are implemented by loading the guest as if u-boot had done it,
> i.e. loading a flat device tree, updating it to hold initrd addresses, ram
> size, and command line, and passing the FDT address in r3.
>
> Since we use it with KVM, we enable the virtio block driver and include hooks
> necessary for KVM support.
>
> Signed-off-by: Hollis Blanchard <holl...@us.ibm.com>
> ---
>  Makefile             |    2 +-
>  Makefile.target      |    2 +-
>  hw/boards.h          |    1 +
>  hw/ppc440_bamboo.c   |  190
>  pc-bios/bamboo.dtb   |  Bin 0 -> 3163 bytes
>  pc-bios/bamboo.dts   |  234 ++
>  target-ppc/machine.c |    2 +
>  7 files changed, 429 insertions(+), 2 deletions(-)
>  create mode 100644 hw/ppc440_bamboo.c
>  create mode 100644 pc-bios/bamboo.dtb
>  create mode 100644 pc-bios/bamboo.dts
>
> diff --git a/Makefile b/Makefile
> index a2a03ec..f9496fe 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -219,7 +219,7 @@ common de-ch es fo fr-ca hu ja mk nl-be pt sl tr
>  ifdef INSTALL_BLOBS
>  BLOBS=bios.bin vgabios.bin vgabios-cirrus.bin ppc_rom.bin \
>  video.x openbios-sparc32 openbios-sparc64 pxe-ne2k_pci.bin \
> -pxe-rtl8139.bin pxe-pcnet.bin pxe-e1000.bin
> +pxe-rtl8139.bin pxe-pcnet.bin pxe-e1000.bin bamboo.dtb
>  else
>  BLOBS=
>  endif
> diff --git a/Makefile.target b/Makefile.target
> index 6032af0..82bc746 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -652,7 +652,7 @@ OBJS+= heathrow_pic.o grackle_pci.o ppc_oldworld.o
>  OBJS+= unin_pci.o ppc_chrp.o
>  # PowerPC 4xx boards
>  OBJS+= pflash_cfi02.o ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o
> -OBJS+= ppc440.o
> +OBJS+= ppc440.o ppc440_bamboo.o
>  ifdef CONFIG_KVM
>  OBJS+= kvm_ppc.o
>  endif
> diff --git a/hw/boards.h b/hw/boards.h
> index d30c0fc..debe9a6 100644
> --- a/hw/boards.h
> +++ b/hw/boards.h
> @@ -38,6 +38,7 @@ extern QEMUMachine core99_machine;
>  extern QEMUMachine heathrow_machine;
>  extern QEMUMachine ref405ep_machine;
>  extern QEMUMachine taihu_machine;
> +extern QEMUMachine bamboo_machine;
>
>  /* mips_r4k.c */
>  extern QEMUMachine mips_machine;
> diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
> new file mode 100644
> index 0000000..b0e3106
> --- /dev/null
> +++ b/hw/ppc440_bamboo.c
> @@ -0,0 +1,190 @@
> +/*
> + * Qemu PowerPC 440 board emulation
> + *
> + * Copyright 2007 IBM Corporation.
> + * Authors:
> + *  Jerone Young <jyou...@us.ibm.com>
> + *  Christian Ehrhardt <ehrha...@linux.vnet.ibm.com>
> + *  Hollis Blanchard <holl...@us.ibm.com>
> + *
> + * This work is licensed under the GNU GPL license version 2 or later.
> + *
> + */
> +
> +#include "config.h"
> +#include "qemu-common.h"
> +#include "net.h"
> +#include "hw.h"
> +#include "pci.h"
> +#include "virtio-blk.h"
> +#include "boards.h"
> +#include "sysemu.h"
> +#include "ppc440.h"
> +#include "kvm.h"
> +#include "kvm_ppc.h"
> +#include "device_tree.h"
> +
> +#define BINARY_DEVICE_TREE_FILE "bamboo.dtb"
> +
> +static void *bamboo_load_device_tree(void *addr,
> +                                     uint32_t ramsize,
> +                                     target_phys_addr_t initrd_base,
> +                                     target_phys_addr_t initrd_size,
> +                                     const char *kernel_cmdline)
> +{
> +    void *fdt = NULL;
> +#ifdef HAVE_FDT

Is this at all usable without libfdt? If not, just don't compile this board in unless libfdt is present.

> +    uint32_t mem_reg_property[] = { 0, 0, ramsize };
> +    char *path = NULL;
> +    int len;
> +    int ret;
> +
> +    len = asprintf(&path, "%s/%s", bios_dir, BINARY_DEVICE_TREE_FILE);

asprintf() is a GNU-ism and won't compile on Win32.

> diff --git a/target-ppc/machine.c b/target-ppc/machine.c
> index be0cbe1..72f67d0 100644
> --- a/target-ppc/machine.c
> +++ b/target-ppc/machine.c
> @@ -1,5 +1,6 @@
>  #include "hw/hw.h"
>  #include "hw/boards.h"
> +#include "kvm.h"

Is this necessary?

>
>  void register_machines(void)
>  {
> @@ -8,6 +9,7 @@ void register_machines(void)
>      qemu_register_machine(&prep_machine);
>      qemu_register_machine(&ref405ep_machine);
>      qemu_register_machine(&taihu_machine);
> +    qemu_register_machine(&bamboo_machine);
>  }
>
>  void cpu_save(QEMUFile *f, void *opaque)

Regards,

Anthony Liguori
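On the asprintf() remark: one way to build the device tree path without the GNU extension is to size the buffer by hand. This is a minimal sketch under that assumption, not the fix that was eventually applied; join_path is a hypothetical helper name, and plain malloc() is used where qemu code would more likely call its own allocator.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated "dir/file" string, or NULL if allocation
 * fails. The caller owns the result and must free() it. */
static char *join_path(const char *dir, const char *file)
{
    size_t dlen = strlen(dir);
    size_t flen = strlen(file);
    char *path = malloc(dlen + 1 + flen + 1);   /* '/' and final NUL */

    if (path == NULL)
        return NULL;

    memcpy(path, dir, dlen);
    path[dlen] = '/';
    memcpy(path + dlen + 1, file, flen + 1);    /* copies the NUL too */
    return path;
}
```

Called as join_path(bios_dir, BINARY_DEVICE_TREE_FILE), this yields the same string the asprintf() call in the patch would produce, using only functions from the C standard library.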
Re: [Qemu-devel] [PATCH 3/6] Enable KVM for ppcemb.
On Thu, 2008-12-11 at 15:30 -0600, Anthony Liguori wrote:
> Hollis Blanchard wrote:
> > +int kvm_arch_get_registers(CPUState *env)
> > +{
> > +    struct kvm_regs regs;
> > +    uint32_t i, ret;
> > +
> > +    ret = kvm_vcpu_ioctl(env, KVM_GET_REGS, &regs);
> > +    if (ret < 0)
> > +        return ret;
> > +
> > +    env->ctr = regs.ctr;
> > +    env->lr = regs.lr;
> > +    env->xer = regs.xer;
> > +    env->msr = regs.msr;
> > +    /* calculate hflags based on the current msr using the ppc qemu helper */
> > +    hreg_compute_hflags(env);
>
> Do you need this? Practically speaking, I don't even think we need to
> maintain them on x86 anymore.

Ah, it seems you're right. That's good.

> > diff --git a/target-ppc/kvm_ppc.c b/target-ppc/kvm_ppc.c
> > new file mode 100644
> > index 0000000..b2b56df
> > --- /dev/null
> > +++ b/target-ppc/kvm_ppc.c
>
> Hence my confusion. These are just kvm related helper? I don't know that
> kvm_ppc.c is a very information name for this sort of stuff. Since this is
> really host specific, not target specific, why not move it out of target-ppc.

I could combine kvm_ppc.c into target-ppc/kvm.c. However, they're really two different things, and I thought it would cause the least confusion if they were logically separate. Most of it is hooks required by common code, and then some of it isn't. (I'm thinking about e.g. IA64 doing a copy/paste, and then wondering which functions they actually need to implement.)

Regardless, I will still need a kvm_ppc.h, so kvm_ppc.c seemed like a good place to match. I don't see that you can call any KVM code either host- or target-specific, since by definition they are the same.

--
Hollis Blanchard
IBM Linux Technology Center
Re: [Qemu-devel] [PATCH 6/6] IBM PowerPC 440EP Bamboo reference board emulation
On Thu, 2008-12-11 at 15:39 -0600, Anthony Liguori wrote:
> Hollis Blanchard wrote:
> > +
> > +#define BINARY_DEVICE_TREE_FILE "bamboo.dtb"
> > +
> > +static void *bamboo_load_device_tree(void *addr,
> > +                                     uint32_t ramsize,
> > +                                     target_phys_addr_t initrd_base,
> > +                                     target_phys_addr_t initrd_size,
> > +                                     const char *kernel_cmdline)
> > +{
> > +    void *fdt = NULL;
> > +#ifdef HAVE_FDT
>
> Is this at all usable without libfdt? If not, just don't compile this board
> in unless libfdt is present.

In practice, we've only tested with the -kernel option, which does require libfdt. However, in theory there is nothing that precludes running a firmware (such as u-boot) inside the KVM guest. Jean-Christophe is working on improving the ppc4xx device emulation so that becomes possible in the future.

--
Hollis Blanchard
IBM Linux Technology Center