Re: [Qemu-devel] [PATCH 2/4] spapr: Enable DABRX special register
On 04/04/2014 12:19 AM, Alexander Graf wrote: On 03.04.14 15:14, Alexey Kardashevskiy wrote: This advertises Data Address Breakpoint Register Extension (DABRX) to the guest via hyperrtas list and enables it to migrate. Do all CPUs we support (970 anyone) have DABRX support? 970MP and 970FX do. Support them too? Who cares? :) Also who handles this hcall in the TCG case? Good point... What about older host kernels that don't support xdabr yet? They will ignore FW_FEATURE_XDABR, no? What about PR KVM? Oh. Nothing. And we do not want to make this hcall-xdabr conditional, right? Drop the whole patch? I am really confused now. Alex Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/ppc/spapr.c | 1 + target-ppc/translate_init.c | 4 2 files changed, 5 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index a11e121..451c473 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -307,6 +307,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base, uint32_t start_prop = cpu_to_be32(initrd_base); uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size); char hypertas_prop[] = hcall-pft\0hcall-term\0hcall-dabr\0hcall-interrupt +\0hcall-xdabr \0hcall-tce\0hcall-vio\0hcall-splpar\0hcall-bulk\0hcall-set-mode; char qemu_hypertas_prop[] = hcall-memop1; uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)}; diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c index d07e186..1627bb0 100644 --- a/target-ppc/translate_init.c +++ b/target-ppc/translate_init.c @@ -7010,6 +7010,10 @@ static void init_proc_POWER7 (CPUPPCState *env) SPR_NOACCESS, SPR_NOACCESS, spr_read_generic, spr_write_generic, KVM_REG_PPC_PMC6, 0x); +spr_register_kvm(env, SPR_DABRX, DABRX, + SPR_NOACCESS, SPR_NOACCESS, + SPR_NOACCESS, SPR_NOACCESS, + KVM_REG_PPC_DABRX, 0x); #endif /* !CONFIG_USER_ONLY */ gen_spr_amr(env); /* XXX : not implemented */ -- Alexey
Re: [Qemu-devel] [PATCH] spapr: skip adding usb keyboard/mouse in case of -nodefaults
Nikunj A Dadhania nik...@linux.vnet.ibm.com writes: The following commit caused the regression in qemu-system-ppc64 7effdaa3: spapr: Fix return value of vga initialization d44229c5: Fix vga_interface_type for command line argument '-device VGA' Even when -nodefaults was provided, USB Keyboard and Mouse was added to the machine. This breaks libvirt which uses -nodefaults and adds the keyboard and mouse separately. The machine got 2 USB Keyboards and 2 USB Mouses. CC: Paolo Bonzini pbonz...@redhat.com CC: Mark Wu wu...@linux.vnet.ibm.com CC: Andreas Färber afaer...@suse.de Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com --- Removed the debug statement and fixed indentation breakage: hw/ppc/spapr.c | 6 +- include/sysemu/sysemu.h | 1 + vl.c| 5 + 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index a11e121..3095626 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -1328,7 +1328,11 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args) if (usb_enabled(spapr-has_graphics)) { pci_create_simple(phb-bus, -1, pci-ohci); -if (spapr-has_graphics) { +/* + * For VGA/VNC, by default add usb keyboard/mouse, if -nodefaults + * provided skip adding usb keyboard/mouse + */ +if (spapr-has_graphics qemu_has_defaults()) { usbdevice_create(keyboard); usbdevice_create(mouse); } diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index ba5c7f8..8e90ad0 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -200,6 +200,7 @@ DeviceState *get_boot_device(uint32_t position); QemuOpts *qemu_get_machine_opts(void); bool usb_enabled(bool default_usb); +bool qemu_has_defaults(void); extern QemuOptsList qemu_legacy_drive_opts; extern QemuOptsList qemu_common_drive_opts; diff --git a/vl.c b/vl.c index 9975e5a..4d5832c 100644 --- a/vl.c +++ b/vl.c @@ -981,6 +981,11 @@ bool usb_enabled(bool default_usb) has_defaults default_usb); } +bool qemu_has_defaults(void) +{ +return has_defaults; +} + #ifndef _WIN32 static int parse_add_fd(QemuOpts *opts, void *opaque) { Have you considered extending QEMUMachineInitArgs instead of adding this function?
Re: [Qemu-devel] [PATCH 2.0] PPC: E500: Set PIR default reset value rather than SPR value
Am 04.04.2014 um 02:26 schrieb Alexey Kardashevskiy a...@ozlabs.ru: On 04/04/2014 05:58 AM, Alexander Graf wrote: On 03.04.2014, at 20:55, Peter Maydell peter.mayd...@linaro.org wrote: On 3 April 2014 19:48, Alexander Graf ag...@suse.de wrote: We now reset SPRs to their reset values on CPU reset. So if we want to have an SPR persistently changed, we need to change its default reset value rather than the value itself manually. Do this for SPR_BOOKE_PIR, fixing e500v2 SMP boot. Reported-by: Frederic Konrad fred.kon...@greensocs.com Signed-off-by: Alexander Graf ag...@suse.de --- hw/ppc/e500.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c index d7ba25f..f984b3e 100644 --- a/hw/ppc/e500.c +++ b/hw/ppc/e500.c @@ -649,7 +649,7 @@ void ppce500_init(QEMUMachineInitArgs *args, PPCE500Params *params) input = (qemu_irq *)env-irq_inputs; irqs[i][OPENPIC_OUTPUT_INT] = input[PPCE500_INPUT_INT]; irqs[i][OPENPIC_OUTPUT_CINT] = input[PPCE500_INPUT_CINT]; -env-spr[SPR_BOOKE_PIR] = cs-cpu_index = i; +env-spr_cb[SPR_BOOKE_PIR].default_value = cs-cpu_index = i; Not an issue introduced with this patch, but should we really be missing with CPUState::cpu_index in a board file? (Nowhere else does, it's otherwise simply set to an appropriately incremented index by cpu_exec_init().) I rather suspect that cpu_index will already be set to the value we set it, in any case... Most likely, but I'd rather not touch that logic for 2.0 :) git grep cpu_index\s\+=[^=]: cpus.c:1404:cpu_index = 0; exec.c:482:cpu_index = 0; exec.c:486:cpu-cpu_index = cpu_index; hmp.c:742:cpu_index = qdict_get_int(qdict, index); hw/ppc/e500.c:652:env-spr[SPR_BOOKE_PIR] = cs-cpu_index = i; monitor.c:2225:int cpu_index = qdict_get_int(qdict, cpu_index); scripts/qmp/qmp-shell:183:self.__cpu_index = 0 scripts/qmp/qmp-shell:216:def __cmd_passthrough(self, cmdline, cpu_index = 0): scripts/qmp/qmp-shell:229:self.__cpu_index = idx Assigning cpu_index in e500.c looks to me as a strange idea. And it is hidden so nicely with two = in a row :-/ Out of curiosity - in the real e500 hardware, PIR does not get reset on CPU reset? On real hw PIR is a hardwired read-only register :) Alex -- Alexey
Re: [Qemu-devel] [PATCH] qga: trivial fix for unclear document of guest-set-time
On 04.04.2014 02:53, Amos Kong wrote: We mixedly used guest time, system time, hardware time, RTC in document, it's unclear. This patch just added two remarks of RTC and replace two guest time by guest's system time. Signed-off-by: Amos Kong ak...@redhat.com Reviewed-by: Michal Privoznik mpriv...@redhat.com Michal
[Qemu-devel] [PATCH target-arm v1 1/1] net: cadence_gem: Make phy respond to broadcast
Phys must respond to address 0 by specification. Implement. Signed-off-by: Nathan Rossi nathan.ro...@xilinx.com Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com --- hw/net/cadence_gem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c index 92dc2f2..e34b25e 100644 --- a/hw/net/cadence_gem.c +++ b/hw/net/cadence_gem.c @@ -1093,7 +1093,7 @@ static uint64_t gem_read(void *opaque, hwaddr offset, unsigned size) uint32_t phy_addr, reg_num; phy_addr = (retval GEM_PHYMNTNC_ADDR) GEM_PHYMNTNC_ADDR_SHFT; -if (phy_addr == BOARD_PHY_ADDRESS) { +if (phy_addr == BOARD_PHY_ADDRESS || phy_addr == 0) { reg_num = (retval GEM_PHYMNTNC_REG) GEM_PHYMNTNC_REG_SHIFT; retval = 0x; retval |= gem_phy_read(s, reg_num); @@ -1193,7 +1193,7 @@ static void gem_write(void *opaque, hwaddr offset, uint64_t val, uint32_t phy_addr, reg_num; phy_addr = (val GEM_PHYMNTNC_ADDR) GEM_PHYMNTNC_ADDR_SHFT; -if (phy_addr == BOARD_PHY_ADDRESS) { +if (phy_addr == BOARD_PHY_ADDRESS || phy_addr == 0) { reg_num = (val GEM_PHYMNTNC_REG) GEM_PHYMNTNC_REG_SHIFT; gem_phy_write(s, reg_num, val); } -- 1.9.1.1.gbb9f595
Re: [Qemu-devel] [PATCH 3/4] KVM: PPC: Support POWER8 registers
On 04/04/2014 06:12 AM, Tom Musta wrote: On 4/3/2014 8:33 AM, Alexander Graf wrote: On 03.04.14 15:14, Alexey Kardashevskiy wrote: This enabled KVM and migration support for a number of POWER8 registers: snip Tom, please have a look through this as well :). --- snip --- a/target-ppc/cpu.h +++ b/target-ppc/cpu.h @@ -426,6 +426,8 @@ struct ppc_slb_t { #define MSR_TAG 62 /* Tag-active mode (POWERx ?) */ #define MSR_ISF 61 /* Sixty-four-bit interrupt mode on 630 */ #define MSR_SHV 60 /* hypervisor state hflags */ +#define MSR_TS 33 /* Transactional state, 2 bits (Book3s) */ 2 bits means you want to add another define or at least a comment at bit 34. I find it rather counterintuitive to declare bit 33 MSR_TS as 2-bit wide too - you'd expect (3 MSR_TS) gives you the right mask, but then it'd have to be 34, no? Is this better? #define MSR_TS0 34 #define MSR_TS1 33 You should also add a decoder: #define msr_ts ((env-msr MSR_TS1) 3) Yes, this is better. Thanks! +target_ulong tm_gpr[32]; +ppc_avr_t tm_vsr[64]; +uint64_t tm_cr; +uint64_t tm_lr; +uint64_t tm_ctr; +uint64_t tm_fpscr; +uint64_t tm_amr; +uint64_t tm_ppr; +uint64_t tm_vrsave; +uint32_t tm_vscr; +uint64_t tm_dscr; +uint64_t tm_tar; }; If vscr is declared as 32 bits, should CR and VRSAVE also be 32-bits? Nope. linux-headers/asm-powerpc/kvm.h: #define KVM_REG_PPC_TM_CR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x60) #define KVM_REG_PPC_TM_LR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x61) #define KVM_REG_PPC_TM_CTR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x62) #define KVM_REG_PPC_TM_FPSCR(KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x63) #define KVM_REG_PPC_TM_AMR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x64) #define KVM_REG_PPC_TM_PPR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x65) #define KVM_REG_PPC_TM_VRSAVE (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x66) #define KVM_REG_PPC_TM_VSCR (KVM_REG_PPC_TM | KVM_REG_SIZE_U32 | 0x67) #define KVM_REG_PPC_TM_DSCR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x68) #define KVM_REG_PPC_TM_TAR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x69) For the reason unknown (Paul is not in the office to ask) all TM-related registers are defined as 64bit and only VSCR is 32bit. And this is in the host kernel already. diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 9974b10..ead69fa 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -866,6 +866,25 @@ int kvm_arch_put_registers(CPUState *cs, int level) } #ifdef TARGET_PPC64 +if ((cpu-env.msr MSR_TS) 3) { Ah, it works because you're shifting the other direction. That works. How about we just introduce an msr_ts() helper similar to the other lower case helpers to make this obvious? Agreed. diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c index 1627bb0..4b20c29 100644 --- a/target-ppc/translate_init.c +++ b/target-ppc/translate_init.c @@ -7025,14 +7025,22 @@ static void init_proc_POWER7 (CPUPPCState *env) SPR_NOACCESS, SPR_NOACCESS, spr_read_generic, SPR_NOACCESS, 0x8080); -spr_register(env, SPR_VRSAVE, SPR_VRSAVE, - spr_read_generic, spr_write_generic, - spr_read_generic, spr_write_generic, - 0x); -spr_register(env, SPR_PPR, PPR, - spr_read_generic, spr_write_generic, - spr_read_generic, spr_write_generic, - 0x); +spr_register_kvm(env, SPR_VRSAVE, VRSAVE, + spr_read_generic, spr_write_generic, + spr_read_generic, spr_write_generic, + KVM_REG_PPC_VRSAVE, 0x); +spr_register_kvm(env, SPR_PPR, PPR, + spr_read_generic, spr_write_generic, + spr_read_generic, spr_write_generic, + KVM_REG_PPC_PPR, 0x); +spr_register_kvm(env, SPR_BOOK3S_SIAR, SIAR, + spr_read_generic, spr_write_generic, + spr_read_generic, spr_write_generic, + KVM_REG_PPC_SIAR, 0x); +spr_register_kvm(env, SPR_BOOK3S_SDAR, SDAR, + spr_read_generic, spr_write_generic, + spr_read_generic, spr_write_generic, + KVM_REG_PPC_SDAR, 0x); /* Logical partitionning */ spr_register_kvm(env, SPR_LPCR, LPCR, SPR_NOACCESS, SPR_NOACCESS, These need to go into P8 as well? (see my comment for patch 2). Yes. VRSAVE, SIAR, SDAR are even defined for 970 (Alex, should I add them to 970 definitions?), PPR is not defined in any 970 spec but is in 2.04..2.07. -- Alexey
Re: [Qemu-devel] [PATCH 1/2] dma-helpers: Initialize DMAAIOCB in_cancel flag
On Fri, Mar 28, 2014 at 02:22:49PM +, Peter Maydell wrote: Initialize the dbs-in_cancel flag in dma_bdrv_io(), since qemu_aio_get() does not return zero-initialized memory. Spotted by the clang sanitizer (which complained when the value loaded in dma_complete() was not valid for a bool type); this might have resulted in leaking the AIO block. Signed-off-by: Peter Maydell peter.mayd...@linaro.org --- dma-helpers.c | 1 + 1 file changed, 1 insertion(+) diff --git a/dma-helpers.c b/dma-helpers.c index c9620a5..5f421e9 100644 --- a/dma-helpers.c +++ b/dma-helpers.c @@ -213,6 +213,7 @@ BlockDriverAIOCB *dma_bdrv_io( dbs-sg_cur_index = 0; dbs-sg_cur_byte = 0; dbs-dir = dir; +dbs-in_cancel = false; dbs-io_func = io_func; dbs-bh = NULL; qemu_iovec_init(dbs-iov, sg-nsg); Worth merging for 2.0. Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
[Qemu-devel] [PATCH] spapr_nvram: Correct max nvram size
Currently it is UINT16_MAX*16 = 65536*16 = 1048560 which is not a round number and therefore a bit confusing. This defines MAX_NVRAM_SIZE precisely as 1MB. Suggested-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/nvram/spapr_nvram.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/nvram/spapr_nvram.c b/hw/nvram/spapr_nvram.c index 635713e..af49c46 100644 --- a/hw/nvram/spapr_nvram.c +++ b/hw/nvram/spapr_nvram.c @@ -42,7 +42,7 @@ typedef struct sPAPRNVRAM { #define MIN_NVRAM_SIZE 8192 #define DEFAULT_NVRAM_SIZE 65536 -#define MAX_NVRAM_SIZE (UINT16_MAX * 16) +#define MAX_NVRAM_SIZE 1048576 static void rtas_nvram_fetch(PowerPCCPU *cpu, sPAPREnvironment *spapr, uint32_t token, uint32_t nargs, -- 1.8.4.rc4
Re: [Qemu-devel] [PATCH v7] net: L2TPv3 transport
On Mon, Mar 31, 2014 at 03:39:19PM +0100, anton.iva...@kot-begemot.co.uk wrote: +static void net_l2tpv3_process_queue(NetL2TPV3State *s) +{ +int size = 0; +struct iovec *vec; +bool bad_read; +int data_size; +struct mmsghdr *msgvec; + +/* go into ring mode only if there is a pending tail */ +if (s-queue_depth 0) { +do { +msgvec = s-msgvec + s-queue_tail; +if (msgvec-msg_len 0) { +data_size = msgvec-msg_len - s-header_size; +vec = msgvec-msg_hdr.msg_iov; +if ((data_size 0) +(l2tpv3_verify_header(s, vec-iov_base) == 0)) { +vec++; +/* Use the legacy delivery for now, we will + * switch to using our own ring as a queueing mechanism + * at a later date + */ +size = qemu_send_packet_async( +s-nc, +vec-iov_base, +data_size, +l2tpv3_send_completed +); +bad_read = false; +} else { +bad_read = true; +if (!s-header_mismatch) { +/* report error only once */ +error_report(l2tpv3 header verification failed); +s-header_mismatch = true; +} +} +} else { +bad_read = true; +} +if ((bad_read) || (size 0)) { +s-queue_tail = (s-queue_tail + 1) % MAX_L2TPV3_MSGCNT; +s-queue_depth--; +} +} while ( +(s-queue_depth 0) + qemu_can_send_packet(s-nc) +((size 0) || bad_read) +); This doesn't handle the qemu_send_packet_async() return 0 case correctly: When qemu_send_packet_async() returns 0 this function simply returns but doesn't turn off read poll. The packet is now queued in the net layer waiting for the peer to re-enable receive. When the socket becomes readable again, we will resend the same packet again. It will be queued again by the net layer. This can happen many times so the net layer's queue may fill up. Since read poll wasn't disabled we also burn CPU calling net_l2tpv3_send() from the event loop while the peer refuses to receive. In order to fix this: 1. Change the do ... while loop into a while loop. This guarantees we will only attempt to send a packet if the peer can receive. This is a safety measure. 2. Increment queue_tail when size == 0 because the net layer has queued the packet - we are done with it! 3. Disable read poll when size == 0 so that net_l2tpv3_send() will not be called until the peer re-enables receive again. We'll know because the queue will be flushed and l2tpv3_send_completed() gets called. I hope that once these changes are made the performance will also be much closer to what you had when l2tpv3 implemented its own buffering. +int net_init_l2tpv3(const NetClientOptions *opts, +const char *name, +NetClientState *peer) +{ + + +const NetdevL2TPv3Options *l2tpv3; +NetL2TPV3State *s; +NetClientState *nc; +int fd = -1, gairet; +struct addrinfo hints; +struct addrinfo *result = NULL; +char *srcport, *dstport; + +nc = qemu_new_net_client(net_l2tpv3_info, peer, l2tpv3, name); + +s = DO_UPCAST(NetL2TPV3State, nc, nc); + The file descriptor should be initialized to -1 so we don't accidentally close stdin (0) when a goto outerr path is taken in this function. s-fd = -1; +s-queue_head = 0; +s-queue_tail = 0; +s-header_mismatch = false; + +assert(opts-kind == NET_CLIENT_OPTIONS_KIND_L2TPV3); +l2tpv3 = opts-l2tpv3; + +if (l2tpv3-has_ipv6 l2tpv3-ipv6) { +s-ipv6 = l2tpv3-ipv6; +} else { +s-ipv6 = false; +} + +if (l2tpv3-has_rxcookie || l2tpv3-has_txcookie) { +if (l2tpv3-has_rxcookie l2tpv3-has_txcookie) { +s-cookie = true; +} else { +goto outerr; +} +} else { +s-cookie = false; +} + +if (l2tpv3-has_cookie64 || l2tpv3-cookie64) { +s-cookie_is_64 = true; +} else { +s-cookie_is_64 = false; +} + +if (l2tpv3-has_udp l2tpv3-udp) { +s-udp = true; +if (!(l2tpv3-has_srcport l2tpv3-has_dstport)) { +error_report(l2tpv3_open : need both src and dst port for udp); +goto outerr; +} else { +srcport = l2tpv3-srcport; +dstport = l2tpv3-dstport; +} +} else { +s-udp = false; +srcport = NULL; +dstport = NULL; +} + + +s-offset
Re: [Qemu-devel] [PATCH 2.0] PPC: E500: Set PIR default reset value rather than SPR value
Hi Alex, Seems to works for me ;). On 03/04/2014 20:48, Alexander Graf wrote: We now reset SPRs to their reset values on CPU reset. So if we want to have an SPR persistently changed, we need to change its default reset value rather than the value itself manually. Do this for SPR_BOOKE_PIR, fixing e500v2 SMP boot. Reported-by: Frederic Konrad fred.kon...@greensocs.com Signed-off-by: Alexander Graf ag...@suse.de Tested-by: KONRAD Frederic fred.kon...@greensocs.com Thanks, Fred --- hw/ppc/e500.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c index d7ba25f..f984b3e 100644 --- a/hw/ppc/e500.c +++ b/hw/ppc/e500.c @@ -649,7 +649,7 @@ void ppce500_init(QEMUMachineInitArgs *args, PPCE500Params *params) input = (qemu_irq *)env-irq_inputs; irqs[i][OPENPIC_OUTPUT_INT] = input[PPCE500_INPUT_INT]; irqs[i][OPENPIC_OUTPUT_CINT] = input[PPCE500_INPUT_CINT]; -env-spr[SPR_BOOKE_PIR] = cs-cpu_index = i; +env-spr_cb[SPR_BOOKE_PIR].default_value = cs-cpu_index = i; env-mpic_iack = MPC8544_CCSRBAR_BASE + MPC8544_MPIC_REGS_OFFSET + 0xa0;
Re: [Qemu-devel] [PATCH] qcow2: Flush metadata during read-only reopen
On Thu, Apr 03, 2014 at 03:46:32PM +0200, Kevin Wolf wrote: If lazy refcounts are enabled for a backing file, committing to this backing file may leave it in a dirty state even if the commit succeeds. The reason is that the bdrv_flush() call in bdrv_commit() doesn't flush refcount updates with lazy refcounts enabled, and qcow2_reopen_prepare() doesn't take care to flush metadata. In order to fix this, this patch also fixes qcow2_mark_clean(), which contains another ineffective bdrv_flush() call beause lazy refcounts are disabled only afterwards. All existing callers of qcow2_mark_clean() either don't modify refcounts or already flush manually, so that this fixes only a latent, but not yet actually triggerable bug. Another instance of the same problem is live snapshots. Again, a real corruption is prevented by an explicit flush for non-read-only images in external_snapshot_prepare(), but images using lazy refcounts stay dirty. Cc: qemu-sta...@nongnu.org Signed-off-by: Kevin Wolf kw...@redhat.com --- block/qcow2.c | 25 + tests/qemu-iotests/039 | 20 tests/qemu-iotests/039.out | 11 +++ 3 files changed, 52 insertions(+), 4 deletions(-) Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
Re: [Qemu-devel] [PATCH v5 34/37] target-arm: Implement CBAR for Cortex-A57
On 4 April 2014 06:32, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: +if (arm_feature(env, ARM_FEATURE_AARCH64)) { +/* 32 bit view is [31:18] 0...0 [43:32]. */ +uint32_t cbar32 = cpu-reset_cbar Should you extract64 on the lower order bits as well to avoid weird | results on a misaligned reset_cbar (or perhaps its worth an assert?). Can't assert, it's a QOM property; we could perhaps validate earlier on in init, but that might be a bit painful to find a suitable place to put it. extracting the low bits too seems a reasonable compromise. +| extract64(cpu-reset_cbar, 32, 12); +ARMCPRegInfo cbar_reginfo[] = { +{ .name = CBAR, + .type = ARM_CP_CONST, + .cp = 15, .crn = 15, .crm = 0, .opc1 = 4, .opc2 = 0, + .access = PL1_R, .resetvalue = cpu-reset_cbar }, +{ .name = CBAR_EL1, .state = ARM_CP_STATE_AA64, + .type = ARM_CP_CONST, + .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 3, .opc2 = 0, + .access = PL1_R, .resetvalue = cbar32 }, +REGINFO_SENTINEL +}; +/* We don't implement a r/w 64 bit CBAR currently */ Is it even valid? Writable CBAR seems like a bug to me that's just fixed in the V8 version bump. This is all IMPDEF anyway (and as you'll see from the next patch A15's CBAR is RO too). R/W CBAR isn't an inherently dumb idea: you could use it to make system designs where setting CBAR is the responsibility of board-specific bootrom or firmware before handoff to tho OS, for instance. Having the h/w hardwire it is probably more robust though. thanks -- PMM
Re: [Qemu-devel] [PATCH] spapr: skip adding usb keyboard/mouse in case of -nodefaults
Markus Armbruster arm...@redhat.com writes: Nikunj A Dadhania nik...@linux.vnet.ibm.com writes: Have you considered extending QEMUMachineInitArgs instead of adding this function? Did not think of this option earlier. You mean doing something like this? diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 3a13231..936a17f 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -1410,6 +1410,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args) const char *kernel_cmdline = args-kernel_cmdline; const char *initrd_filename = args-initrd_filename; const char *boot_device = args-boot_device; +unsigned int has_defaults = args-has_defaults; PowerPCCPU *cpu; CPUPPCState *env; PCIHostState *phb; @@ -1605,7 +1606,11 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args) if (usb_enabled(spapr-has_graphics)) { pci_create_simple(phb-bus, -1, pci-ohci); -if (spapr-has_graphics) { +/* + * For VGA/VNC, by default add usb keyboard/mouse, if -nodefaults + * provided skip adding usb keyboard/mouse + */ +if (spapr-has_graphics has_defaults) { usbdevice_create(keyboard); usbdevice_create(mouse); } diff --git a/include/hw/boards.h b/include/hw/boards.h index b1d4e9f..ee81ddf 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -16,6 +16,7 @@ typedef struct QEMUMachineInitArgs { const char *kernel_cmdline; const char *initrd_filename; const char *cpu_model; +unsigned int has_defaults; } QEMUMachineInitArgs; typedef void QEMUMachineInitFunc(QEMUMachineInitArgs *args); diff --git a/vl.c b/vl.c index 017f92d..0d6c36c 100644 --- a/vl.c +++ b/vl.c @@ -4348,7 +4348,8 @@ int main(int argc, char **argv, char **envp) .kernel_filename = kernel_filename, .kernel_cmdline = kernel_cmdline, .initrd_filename = initrd_filename, - .cpu_model = cpu_model }; + .cpu_model = cpu_model, + .has_defaults = has_defaults, }; machine-init(args); audio_init();
Re: [Qemu-devel] Documentation tools for QEMU
Il 03/04/2014 20:56, Anirudha Bose ha scritto: Hi Paolo ! I would like to volunteer to develop Documentation tools for QEMU during my summer holidays. Yesterday I had expressed my intent to work on this project in the IRC channel of QEMU. This is a follow up email of your reply. That's awesome. I've CCed qemu-devel so that other folks see what you'd like to do. I'll be away for three weeks starting tomorrow, unfortunately, so making this public is even more important. We can use Pandoc to convert all the docs in different formats to HTML, rather than integrating all of them to qemu-tech.texi. We can do the same thing with other files as well like qemu-docs.texi and qemu-img.texi. A challenge in this project would be to process the plaintext files which have no styles defined in them. So rather than making the script do it, wouldn't it be better if all of them are rewritten in Markdown so that the final HTML will have a consistent look? Yes, rewriting docs/ to Markdown is a good first step. Some things you could do include: - do the above Markdown rewrite, and test it with pandoc conversion to HTML - send patches for it, so that you can get warm with the patches review process - build a pandoc filter that converts @foo to `foo`. I think there are examples in pandoc's homepage. - look at command descriptions from qmp-commands.hx and start integrating missing information into qapi-schema.json. Make qapi-schema.json doc comments much closer to Markdown (plus with the @foo syntax). - study the QAPISchema class in scripts/qapi.py. Start thinking of a qapi-schema.json - Markdown converter and draft it in Python. Later we can parse the JSON too, to include type information in the documentation. - move *.texi files to docs/ and update the Makefile for it. What you expectations with this project? Would the final HTML documentation be hosted somewhere like readthedocs.org http://readthedocs.org? Have you considered using Spinx for this purpose? No, you know more than I do it seems. :) Paolo
Re: [Qemu-devel] [PATCH v5 01/24] vmstate: reduce code duplication
Michael S. Tsirkin m...@redhat.com wrote: move size offset and number of elements math out to functions, to reduce code duplication. Signed-off-by: Michael S. Tsirkin m...@redhat.com Cc: Dr. David Alan Gilbert dgilb...@redhat.com Reviewed-by: Juan Quintela quint...@redhat.com Added to migration branch for 2.1
[Qemu-devel] [PATCH-trivial v2] vl: Report accelerator not supported for target more nicely
When you ask for an accelerator not supported for your target, you get a bogus accelerator does not exist message: $ qemu-system-arm -machine none,accel=kvm KVM not supported for this target kvm accelerator does not exist. No accelerator found! Suppress it. Signed-off-by: Chen Gang gang.chen.5...@gmail.com --- vl.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/vl.c b/vl.c index 842e897..b4f98fa 100644 --- a/vl.c +++ b/vl.c @@ -2709,7 +2709,7 @@ static int configure_accelerator(QEMUMachine *machine) if (!accel_list[i].available()) { printf(%s not supported for this target\n, accel_list[i].name); -continue; +break; } *(accel_list[i].allowed) = true; ret = accel_list[i].init(machine); -- 1.7.9.5
Re: [Qemu-devel] [PATCH v5 02/24] vmstate: add VMS_MUST_EXIST
Michael S. Tsirkin m...@redhat.com wrote: Can be used to verify a required field exists or validate state in some other way. Signed-off-by: Michael S. Tsirkin m...@redhat.com Reviewed-by: Dr. David Alan Gilbert dgilb...@redhat.com What should this do? We can change the semantics of -field_exist() to add this functionality if needed, no? --- include/migration/vmstate.h | 1 + vmstate.c | 10 ++ 2 files changed, 11 insertions(+) diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h index e7e1705..de970ab 100644 --- a/include/migration/vmstate.h +++ b/include/migration/vmstate.h @@ -100,6 +100,7 @@ enum VMStateFlags { VMS_MULTIPLY = 0x200, /* multiply size field by field_size */ VMS_VARRAY_UINT8 = 0x400, /* Array with size in uint8_t field*/ VMS_VARRAY_UINT32= 0x800, /* Array with size in uint32_t field*/ +VMS_MUST_EXIST = 0x1000, /* Field must exist in input */ }; typedef struct { diff --git a/vmstate.c b/vmstate.c index dd6f834..f019228 100644 --- a/vmstate.c +++ b/vmstate.c @@ -102,6 +102,10 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd, return ret; } } +} else if (field-flags VMS_MUST_EXIST) { +fprintf(stderr, Input validation failed: %s/%s\n, +vmsd-name, field-name); +return -1; } field++; } @@ -142,6 +146,12 @@ void vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd, field-info-put(f, addr, size); } } +} else { +if (field-flags VMS_MUST_EXIST) { +fprintf(stderr, Output state validation failed: %s/%s\n, +vmsd-name, field-name); +assert(!(field-flags VMS_MUST_EXIST)); +} } field++; }
Re: [Qemu-devel] [PATCH v5 12/24] vmstate: fix buffer overflow in target-arm/machine.c
Michael S. Tsirkin m...@redhat.com wrote: CVE-2013-4531 cpreg_vmstate_indexes is a VARRAY_INT32. A negative value for cpreg_vmstate_array_len will cause a buffer overflow. VMSTATE_INT32_LE was supposed to protect against this but doesn't because it doesn't validate that input is non-negative. Fix this macro to valide the value appropriately. The only other user of VMSTATE_INT32_LE doesn't ever use negative numbers so it doesn't care. Reported-by: Anthony Liguori anth...@codemonkey.ws Signed-off-by: Michael S. Tsirkin m...@redhat.com Reviewed-by: Juan Quintela quint...@redhat.com Integrated.
Re: [Qemu-devel] [PATCH v5 22/24] vmstate: s/VMSTATE_INT32_LE/VMSTATE_INT32_POSITIVE_LE/
Michael S. Tsirkin m...@redhat.com wrote: As the macro verifies the value is positive, rename it to make the function clearer. Signed-off-by: Michael S. Tsirkin m...@redhat.com Reviewed-by: Juan Quintela quint...@redhat.com
Re: [Qemu-devel] [PATCH v5 24/24] savevm: Ignore minimum_version_id_old if there is no load_state_old
Michael S. Tsirkin m...@redhat.com wrote: From: Peter Maydell peter.mayd...@linaro.org At the moment we require vmstate definitions to set minimum_version_id_old to the same value as minimum_version_id if they do not provide a load_state_old handler. Since the load_state_old functionality is required only for a handful of devices that need to retain migration compatibility with a pre-vmstate implementation, this means the bulk of devices have pointless boilerplate. Relax the definition so that minimum_version_id_old is ignored if there is no load_state_old handler. Note that under the old scheme we would segfault if the vmstate specified a minimum_version_id_old that was less than minimum_version_id but did not provide a load_state_old function, and the incoming state specified a version number between minimum_version_id_old and minimum_version_id. Under the new scheme this will just result in our failing the migration. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Reviewed-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com Reviewed-by: Juan Quintela quint...@redhat.com Applied
[Qemu-devel] [PATCH V2 2/4] vmxnet3: validate queues configuration coming from quest
CVE-2013-4544 Signed-off-by: Dmitry Fleytman dmi...@daynix.com Reported-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com --- hw/net/vmxnet3.c | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c index 0b317f8..4fefc7b 100644 --- a/hw/net/vmxnet3.c +++ b/hw/net/vmxnet3.c @@ -1336,6 +1336,23 @@ static void vmxnet3_validate_interrupts(VMXNET3State *s) } } +static void vmxnet3_validate_queues(VMXNET3State *s) +{ +/* +* txq_num and rxq_num are total number of queues +* configured by guest. These numbers must not +* exceed corresponding maximal values. +*/ + +if (s-txq_num VMXNET3_DEVICE_MAX_TX_QUEUES) { +hw_error(Bad TX queues number: %d\n, s-txq_num); +} + +if (s-rxq_num VMXNET3_DEVICE_MAX_RX_QUEUES) { +hw_error(Bad RX queues number: %d\n, s-rxq_num); +} +} + static void vmxnet3_activate_device(VMXNET3State *s) { int i; @@ -1382,7 +1399,7 @@ static void vmxnet3_activate_device(VMXNET3State *s) VMXNET3_READ_DRV_SHARED8(s-drv_shmem, devRead.misc.numRxQueues); VMW_CFPRN(Number of TX/RX queues %u/%u, s-txq_num, s-rxq_num); -assert(s-txq_num = VMXNET3_DEVICE_MAX_TX_QUEUES); +vmxnet3_validate_queues(s); qdescr_table_pa = VMXNET3_READ_DRV_SHARED64(s-drv_shmem, devRead.misc.queueDescPA); -- 1.8.5.3
[Qemu-devel] [PATCH V2 4/4] vmxnet3: validate queues configuration read on migration
CVE-2013-4544 Signed-off-by: Dmitry Fleytman dmi...@daynix.com Reported-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com --- hw/net/vmxnet3.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c index a0723c0..ddcee4b 100644 --- a/hw/net/vmxnet3.c +++ b/hw/net/vmxnet3.c @@ -2391,6 +2391,7 @@ static int vmxnet3_post_load(void *opaque, int version_id) } } +vmxnet3_validate_queues(s); vmxnet3_validate_interrupts(s); return 0; -- 1.8.5.3
[Qemu-devel] [PATCH V2 1/4] vmxnet3: validate interrupt indices coming from guest
CVE-2013-4544 Signed-off-by: Dmitry Fleytman dmi...@daynix.com Reported-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com --- hw/net/vmxnet3.c | 36 ++-- 1 file changed, 34 insertions(+), 2 deletions(-) diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c index 5be807c..0b317f8 100644 --- a/hw/net/vmxnet3.c +++ b/hw/net/vmxnet3.c @@ -52,6 +52,9 @@ #define VMXNET3_DEVICE_VERSION0x1 #define VMXNET3_DEVICE_REVISION 0x1 +/* Number of interrupt vectors for non-MSIx modes */ +#define VMXNET3_MAX_NMSIX_INTRS (1) + /* Macros for rings descriptors access */ #define VMXNET3_READ_TX_QUEUE_DESCR8(dpa, field) \ (vmw_shmem_ld8(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field))) @@ -1305,6 +1308,34 @@ static bool vmxnet3_verify_intx(VMXNET3State *s, int intx) (pci_get_byte(s-parent_obj.config + PCI_INTERRUPT_PIN) - 1)); } +static void vmxnet3_validate_interrupt_idx(bool is_msix, int idx) +{ +int max_ints = is_msix ? VMXNET3_MAX_INTRS : VMXNET3_MAX_NMSIX_INTRS; +if (idx = max_ints) { +hw_error(Bad interrupt index: %d\n, idx); +} +} + +static void vmxnet3_validate_interrupts(VMXNET3State *s) +{ +int i; + +VMW_CFPRN(Verifying event interrupt index (%d), s-event_int_idx); +vmxnet3_validate_interrupt_idx(s-msix_used, s-event_int_idx); + +for (i = 0; i s-txq_num; i++) { +int idx = s-txq_descr[i].intr_idx; +VMW_CFPRN(Verifying TX queue %d interrupt index (%d), i, idx); +vmxnet3_validate_interrupt_idx(s-msix_used, idx); +} + +for (i = 0; i s-rxq_num; i++) { +int idx = s-rxq_descr[i].intr_idx; +VMW_CFPRN(Verifying RX queue %d interrupt index (%d), i, idx); +vmxnet3_validate_interrupt_idx(s-msix_used, idx); +} +} + static void vmxnet3_activate_device(VMXNET3State *s) { int i; @@ -1447,6 +1478,8 @@ static void vmxnet3_activate_device(VMXNET3State *s) sizeof(s-rxq_descr[i].rxq_stats)); } +vmxnet3_validate_interrupts(s); + /* Make sure everything is in place before device activation */ smp_wmb(); @@ -2005,7 +2038,6 @@ vmxnet3_cleanup_msix(VMXNET3State *s) } } -#define VMXNET3_MSI_NUM_VECTORS (1) #define VMXNET3_MSI_OFFSET(0x50) #define VMXNET3_USE_64BIT (true) #define VMXNET3_PER_VECTOR_MASK (false) @@ -2016,7 +2048,7 @@ vmxnet3_init_msi(VMXNET3State *s) PCIDevice *d = PCI_DEVICE(s); int res; -res = msi_init(d, VMXNET3_MSI_OFFSET, VMXNET3_MSI_NUM_VECTORS, +res = msi_init(d, VMXNET3_MSI_OFFSET, VMXNET3_MAX_NMSIX_INTRS, VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK); if (0 res) { VMW_WRPRN(Failed to initialize MSI, error %d, res); -- 1.8.5.3
[Qemu-devel] [PATCH V2 0/4] CVE-2013-4544
Changes since V1: * Comments added and extended as sugested by Dave and Michael Dmitry Fleytman (4): vmxnet3: validate interrupt indices coming from guest vmxnet3: validate queues configuration coming from quest vmxnet3: validate interrupt indices read on migration vmxnet3: validate queues configuration read on migration hw/net/vmxnet3.c | 58 +--- 1 file changed, 55 insertions(+), 3 deletions(-) -- 1.8.5.3
[Qemu-devel] [PATCH V2 3/4] vmxnet3: validate interrupt indices read on migration
CVE-2013-4544 Signed-off-by: Dmitry Fleytman dmi...@daynix.com Reported-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com --- hw/net/vmxnet3.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c index 4fefc7b..a0723c0 100644 --- a/hw/net/vmxnet3.c +++ b/hw/net/vmxnet3.c @@ -2391,6 +2391,8 @@ static int vmxnet3_post_load(void *opaque, int version_id) } } +vmxnet3_validate_interrupts(s); + return 0; } -- 1.8.5.3
Re: [Qemu-devel] [PATCH v4 27/30] vmxnet3: validate interrupt indices coming from guest
On Apr 3, 2014, at 19:07 PM, Michael S. Tsirkin m...@redhat.com wrote: On Tue, Apr 01, 2014 at 02:07:52PM +0100, Dr. David Alan Gilbert wrote: * Dmitry Fleytman (dmi...@daynix.com) wrote: On Apr 1, 2014, at 14:33 PM, Dr. David Alan Gilbert dgilb...@redhat.com wrote: * Michael S. Tsirkin (m...@redhat.com) wrote: From: Dmitry Fleytman dmi...@daynix.com CVE-2013-4544 Signed-off-by: Dmitry Fleytman dmi...@daynix.com Reported-by: Michael S. Tsirkin m...@redhat.com Signed-off-by: Michael S. Tsirkin m...@redhat.com --- hw/net/vmxnet3.c | 36 ++-- 1 file changed, 34 insertions(+), 2 deletions(-) diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c index 5be807c..a4b5c11 100644 --- a/hw/net/vmxnet3.c +++ b/hw/net/vmxnet3.c @@ -52,6 +52,9 @@ #define VMXNET3_DEVICE_VERSION0x1 #define VMXNET3_DEVICE_REVISION 0x1 +/* Number of interrupt vectors for INTx/MSI */ +#define VMXNET3_MAX_NMSIX_INTRS (1) As per Dmitry's reply this is apparently the number of non-MSIX interrupts; can we change the comment to make this clear. Not sure how to change it. There are three modes of operation: 1. INTx - 1 interrupt is used 2. MSI - 1 interrupt is used 3. MSIx - up to 25 interrupts are used. This define covers 2 first modes of operation. Would something like /* Number of interrupt vectors for non-MSIx modes */ be better? Yes - that's fine. Dave OK Dmitry, can you please resend patches with suggested comment changes? Done. Sent to the list. + /* Macros for rings descriptors access */ #define VMXNET3_READ_TX_QUEUE_DESCR8(dpa, field) \ (vmw_shmem_ld8(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field))) @@ -1305,6 +1308,34 @@ static bool vmxnet3_verify_intx(VMXNET3State *s, int intx) (pci_get_byte(s-parent_obj.config + PCI_INTERRUPT_PIN) - 1)); } +static void vmxnet3_validate_interrupt_idx(bool is_msix, int idx) +{ +int max_ints = is_msix ? VMXNET3_MAX_INTRS : VMXNET3_MAX_NMSIX_INTRS; +if (idx = max_ints) { +hw_error(Bad interrupt index: %d\n, idx); +} Can we avoid hw_error here, we're using in this in state load, so I'm OK at the thought of error_report, but then we should make this return a bool to say it's failed and bubble this up so that it just fails the post_load testlike any other bad state load. Dave +} + +static void vmxnet3_validate_interrupts(VMXNET3State *s) +{ +int i; + +VMW_CFPRN(Verifying event interrupt index (%d), s-event_int_idx); +vmxnet3_validate_interrupt_idx(s-msix_used, s-event_int_idx); + +for (i = 0; i s-txq_num; i++) { +int idx = s-txq_descr[i].intr_idx; +VMW_CFPRN(Verifying TX queue %d interrupt index (%d), i, idx); +vmxnet3_validate_interrupt_idx(s-msix_used, idx); +} + +for (i = 0; i s-rxq_num; i++) { +int idx = s-rxq_descr[i].intr_idx; +VMW_CFPRN(Verifying RX queue %d interrupt index (%d), i, idx); +vmxnet3_validate_interrupt_idx(s-msix_used, idx); +} +} + static void vmxnet3_activate_device(VMXNET3State *s) { int i; @@ -1447,6 +1478,8 @@ static void vmxnet3_activate_device(VMXNET3State *s) sizeof(s-rxq_descr[i].rxq_stats)); } +vmxnet3_validate_interrupts(s); + /* Make sure everything is in place before device activation */ smp_wmb(); @@ -2005,7 +2038,6 @@ vmxnet3_cleanup_msix(VMXNET3State *s) } } -#define VMXNET3_MSI_NUM_VECTORS (1) #define VMXNET3_MSI_OFFSET(0x50) #define VMXNET3_USE_64BIT (true) #define VMXNET3_PER_VECTOR_MASK (false) @@ -2016,7 +2048,7 @@ vmxnet3_init_msi(VMXNET3State *s) PCIDevice *d = PCI_DEVICE(s); int res; -res = msi_init(d, VMXNET3_MSI_OFFSET, VMXNET3_MSI_NUM_VECTORS, +res = msi_init(d, VMXNET3_MSI_OFFSET, VMXNET3_MAX_NMSIX_INTRS, VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK); if (0 res) { VMW_WRPRN(Failed to initialize MSI, error %d, res); -- MST -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH v5 09/24] hpet: fix buffer overrun on invalid state load
Michael S. Tsirkin m...@redhat.com wrote: CVE-2013-4527 hw/timer/hpet.c buffer overrun hpet is a VARRAY with a uint8 size but static array of 32 To fix, make sure num_timers is valid using VMSTATE_VALID hook. Reported-by: Anthony Liguori anth...@codemonkey.ws Signed-off-by: Michael S. Tsirkin m...@redhat.com Reviewed-by: Dr. David Alan Gilbert dgilb...@redhat.com Ok, seing what you want to do with VMSTATE_VALIDATE. Much better solution is add a -validate() field, and use it in the equivalent of LESS_EQUAL and rest of tests. Will sent a patch. Later, Juan. --- hw/timer/hpet.c | 13 + 1 file changed, 13 insertions(+) diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c index e15d6bc..2792f89 100644 --- a/hw/timer/hpet.c +++ b/hw/timer/hpet.c @@ -239,6 +239,18 @@ static int hpet_pre_load(void *opaque) return 0; } +static bool hpet_validate_num_timers(void *opaque, int version_id) +{ +HPETState *s = opaque; + +if (s-num_timers HPET_MIN_TIMERS) { +return false; +} else if (s-num_timers HPET_MAX_TIMERS) { +return false; +} +return true; +} + static int hpet_post_load(void *opaque, int version_id) { HPETState *s = opaque; @@ -307,6 +319,7 @@ static const VMStateDescription vmstate_hpet = { VMSTATE_UINT64(isr, HPETState), VMSTATE_UINT64(hpet_counter, HPETState), VMSTATE_UINT8_V(num_timers, HPETState, 2), +VMSTATE_VALIDATE(num_timers in range, hpet_validate_num_timers), VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0, vmstate_hpet_timer, HPETTimer), VMSTATE_END_OF_LIST()
Re: [Qemu-devel] [PATCH v5 02/24] vmstate: add VMS_MUST_EXIST
* Juan Quintela (quint...@redhat.com) wrote: Michael S. Tsirkin m...@redhat.com wrote: Can be used to verify a required field exists or validate state in some other way. Signed-off-by: Michael S. Tsirkin m...@redhat.com Reviewed-by: Dr. David Alan Gilbert dgilb...@redhat.com What should this do? We can change the semantics of -field_exist() to add this functionality if needed, no? Well that's what Michael is doing; he's changing the use of -field_exist() here, so that if this flag is set then a failure of field_exist causes a migration failure. Dave --- include/migration/vmstate.h | 1 + vmstate.c | 10 ++ 2 files changed, 11 insertions(+) diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h index e7e1705..de970ab 100644 --- a/include/migration/vmstate.h +++ b/include/migration/vmstate.h @@ -100,6 +100,7 @@ enum VMStateFlags { VMS_MULTIPLY = 0x200, /* multiply size field by field_size */ VMS_VARRAY_UINT8 = 0x400, /* Array with size in uint8_t field*/ VMS_VARRAY_UINT32= 0x800, /* Array with size in uint32_t field*/ +VMS_MUST_EXIST = 0x1000, /* Field must exist in input */ }; typedef struct { diff --git a/vmstate.c b/vmstate.c index dd6f834..f019228 100644 --- a/vmstate.c +++ b/vmstate.c @@ -102,6 +102,10 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription *vmsd, return ret; } } +} else if (field-flags VMS_MUST_EXIST) { +fprintf(stderr, Input validation failed: %s/%s\n, +vmsd-name, field-name); +return -1; } field++; } @@ -142,6 +146,12 @@ void vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd, field-info-put(f, addr, size); } } +} else { +if (field-flags VMS_MUST_EXIST) { +fprintf(stderr, Output state validation failed: %s/%s\n, +vmsd-name, field-name); +assert(!(field-flags VMS_MUST_EXIST)); +} } field++; } -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[Qemu-devel] [PATCH v5 00/10] migration: Optimizate the xbzrle and fix one corruption issue
From: ChenLiang chenlian...@huawei.com V5--V4 * Fix two issues: one is cache_insert don't update the page which has been in the cache. Another avoiding the risk that run xbzrle_encode_buffer on changing data. a. Optimization the xbzrle remarkable decrease the cache misses. The efficiency of compress increases more than fifty times. Before the patch set, the cache almost totally miss when the number of cache item less than the dirty page number. Now the hot pages in the cache will not be replaced by other pages. b. Reducing the data copy c. Fix one corruption issues. ChenLiang (10): XBZRLE: Fix one XBZRLE corruption issues migration: Add counts of updating the dirty bitmap migration: expose the bitmap_sync_count to the end user migration: expose xbzrle cache miss rate XBZRLE: optimize XBZRLE to decrease the cache misses XBZRLE: rebuild the cache_is_cached function xbzrle: don't check the value in the vm ram repeatedly xbzrle: check 8 bytes at a time after an concurrency scene migration: optimize xbzrle by reducing data copy migration: clear the dead code arch_init.c| 74 +- docs/xbzrle.txt| 8 hmp.c | 4 ++ include/migration/migration.h | 2 + include/migration/page_cache.h | 10 ++-- migration.c| 3 ++ page_cache.c | 101 +++-- qapi-schema.json | 9 +++- qmp-commands.hx| 15 -- xbzrle.c | 48 ++-- 10 files changed, 144 insertions(+), 130 deletions(-) -- 1.7.12.4
[Qemu-devel] [PATCH v5 09/10] migration: optimize xbzrle by reducing data copy
From: ChenLiang chenlian...@huawei.com Reducing data copy can reduce cpu overhead. Signed-off-by: ChenLiang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- arch_init.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/arch_init.c b/arch_init.c index 84a4bd3..94b62e2 100644 --- a/arch_init.c +++ b/arch_init.c @@ -373,11 +373,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data, prev_cached_page = get_cached_data(XBZRLE.cache, current_addr); -/* save current buffer into memory */ -memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE); - /* XBZRLE encoding (if there is no overflow) */ -encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf, +encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data, TARGET_PAGE_SIZE, XBZRLE.encoded_buf, TARGET_PAGE_SIZE); if (encoded_len == 0) { @@ -396,7 +393,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data, /* we need to update the data in the cache, in order to get the same data */ if (!last_stage) { -memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE); +xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page, + TARGET_PAGE_SIZE); } /* Send XBZRLE based compressed page */ -- 1.7.12.4
[Qemu-devel] [PATCH v5 10/10] migration: clear the dead code
From: ChenLiang chenlian...@huawei.com clear the dead code Signed-off-by: ChenLiang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com Reviewed-by: Juan Quintela quint...@redhat.com --- arch_init.c | 13 - page_cache.c | 58 -- 2 files changed, 71 deletions(-) diff --git a/arch_init.c b/arch_init.c index 94b62e2..db38c9a 100644 --- a/arch_init.c +++ b/arch_init.c @@ -164,14 +164,11 @@ static inline bool is_zero_range(uint8_t *p, uint64_t size) static struct { /* buffer used for XBZRLE encoding */ uint8_t *encoded_buf; -/* buffer for storing page content */ -uint8_t *current_buf; /* Cache for XBZRLE, Protected by lock. */ PageCache *cache; QemuMutex lock; } XBZRLE = { .encoded_buf = NULL, -.current_buf = NULL, .cache = NULL, }; /* buffer used for XBZRLE decoding */ @@ -723,10 +720,8 @@ static void migration_end(void) cache_fini(XBZRLE.cache); g_free(XBZRLE.cache); g_free(XBZRLE.encoded_buf); -g_free(XBZRLE.current_buf); XBZRLE.cache = NULL; XBZRLE.encoded_buf = NULL; -XBZRLE.current_buf = NULL; } XBZRLE_cache_unlock(); } @@ -779,14 +774,6 @@ static int ram_save_setup(QEMUFile *f, void *opaque) return -1; } -XBZRLE.current_buf = g_try_malloc(TARGET_PAGE_SIZE); -if (!XBZRLE.current_buf) { -DPRINTF(Error allocating current_buf\n); -g_free(XBZRLE.encoded_buf); -XBZRLE.encoded_buf = NULL; -return -1; -} - acct_clear(); } diff --git a/page_cache.c b/page_cache.c index 2626d3d..978dc1e 100644 --- a/page_cache.c +++ b/page_cache.c @@ -48,7 +48,6 @@ struct PageCache { CacheItem *page_cache; unsigned int page_size; int64_t max_num_items; -uint64_t max_item_age; int64_t num_items; }; @@ -76,7 +75,6 @@ PageCache *cache_init(int64_t num_pages, unsigned int page_size) } cache-page_size = page_size; cache-num_items = 0; -cache-max_item_age = 0; cache-max_num_items = num_pages; DPRINTF(Setting cache buckets to % PRId64 \n, cache-max_num_items); @@ -187,59 +185,3 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata, return 0; } - -int64_t cache_resize(PageCache *cache, int64_t new_num_pages) -{ -PageCache *new_cache; -int64_t i; - -CacheItem *old_it, *new_it; - -g_assert(cache); - -/* cache was not inited */ -if (cache-page_cache == NULL) { -return -1; -} - -/* same size */ -if (pow2floor(new_num_pages) == cache-max_num_items) { -return cache-max_num_items; -} - -new_cache = cache_init(new_num_pages, cache-page_size); -if (!(new_cache)) { -DPRINTF(Error creating new cache\n); -return -1; -} - -/* move all data from old cache */ -for (i = 0; i cache-max_num_items; i++) { -old_it = cache-page_cache[i]; -if (old_it-it_addr != -1) { -/* check for collision, if there is, keep MRU page */ -new_it = cache_get_by_addr(new_cache, old_it-it_addr); -if (new_it-it_data new_it-it_age = old_it-it_age) { -/* keep the MRU page */ -g_free(old_it-it_data); -} else { -if (!new_it-it_data) { -new_cache-num_items++; -} -g_free(new_it-it_data); -new_it-it_data = old_it-it_data; -new_it-it_age = old_it-it_age; -new_it-it_addr = old_it-it_addr; -} -} -} - -g_free(cache-page_cache); -cache-page_cache = new_cache-page_cache; -cache-max_num_items = new_cache-max_num_items; -cache-num_items = new_cache-num_items; - -g_free(new_cache); - -return cache-max_num_items; -} -- 1.7.12.4
[Qemu-devel] [PATCH v5 04/10] migration: expose xbzrle cache miss rate
From: ChenLiang chenlian...@huawei.com expose xbzrle cache miss rate Signed-off-by: ChenLiang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com Reviewed-by: Eric Blake ebl...@redhat.com --- arch_init.c | 18 ++ hmp.c | 2 ++ include/migration/migration.h | 1 + migration.c | 1 + qapi-schema.json | 5 - qmp-commands.hx | 2 ++ 6 files changed, 28 insertions(+), 1 deletion(-) diff --git a/arch_init.c b/arch_init.c index a7c87de..15ca4c0 100644 --- a/arch_init.c +++ b/arch_init.c @@ -235,6 +235,7 @@ typedef struct AccountingInfo { uint64_t xbzrle_bytes; uint64_t xbzrle_pages; uint64_t xbzrle_cache_miss; +double xbzrle_cache_miss_rate; uint64_t xbzrle_overflows; } AccountingInfo; @@ -290,6 +291,11 @@ uint64_t xbzrle_mig_pages_cache_miss(void) return acct_info.xbzrle_cache_miss; } +double xbzrle_mig_cache_miss_rate(void) +{ +return acct_info.xbzrle_cache_miss_rate; +} + uint64_t xbzrle_mig_pages_overflow(void) { return acct_info.xbzrle_overflows; @@ -488,6 +494,8 @@ static void migration_bitmap_sync(void) static int64_t num_dirty_pages_period; int64_t end_time; int64_t bytes_xfer_now; +static uint64_t xbzrle_cache_miss_prev; +static uint64_t iterations_prev; bitmap_sync_count++; @@ -531,6 +539,16 @@ static void migration_bitmap_sync(void) } else { mig_throttle_on = false; } +if (migrate_use_xbzrle()) { +if (iterations_prev != 0) { +acct_info.xbzrle_cache_miss_rate = + (double)(acct_info.xbzrle_cache_miss - +xbzrle_cache_miss_prev) / + (acct_info.iterations - iterations_prev); +} +iterations_prev = acct_info.iterations; +xbzrle_cache_miss_prev = acct_info.xbzrle_cache_miss; +} s-dirty_pages_rate = num_dirty_pages_period * 1000 / (end_time - start_time); s-dirty_bytes_rate = s-dirty_pages_rate * TARGET_PAGE_SIZE; diff --git a/hmp.c b/hmp.c index 77a8d18..18a850d 100644 --- a/hmp.c +++ b/hmp.c @@ -214,6 +214,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info-xbzrle_cache-pages); monitor_printf(mon, xbzrle cache miss: % PRIu64 \n, info-xbzrle_cache-cache_miss); +monitor_printf(mon, xbzrle cache miss rate: %0.2f\n, + info-xbzrle_cache-cache_miss_rate); monitor_printf(mon, xbzrle overflow : % PRIu64 \n, info-xbzrle_cache-overflow); } diff --git a/include/migration/migration.h b/include/migration/migration.h index 9e62b99..430e48d 100644 --- a/include/migration/migration.h +++ b/include/migration/migration.h @@ -126,6 +126,7 @@ uint64_t xbzrle_mig_bytes_transferred(void); uint64_t xbzrle_mig_pages_transferred(void); uint64_t xbzrle_mig_pages_overflow(void); uint64_t xbzrle_mig_pages_cache_miss(void); +double xbzrle_mig_cache_miss_rate(void); void ram_handle_compressed(void *host, uint8_t ch, uint64_t size); diff --git a/migration.c b/migration.c index 816f5fb..44b6ca2 100644 --- a/migration.c +++ b/migration.c @@ -174,6 +174,7 @@ static void get_xbzrle_cache_stats(MigrationInfo *info) info-xbzrle_cache-bytes = xbzrle_mig_bytes_transferred(); info-xbzrle_cache-pages = xbzrle_mig_pages_transferred(); info-xbzrle_cache-cache_miss = xbzrle_mig_pages_cache_miss(); +info-xbzrle_cache-cache_miss_rate = xbzrle_mig_cache_miss_rate(); info-xbzrle_cache-overflow = xbzrle_mig_pages_overflow(); } } diff --git a/qapi-schema.json b/qapi-schema.json index 8a60a6e..8450275 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -674,13 +674,16 @@ # # @cache-miss: number of cache miss # +# @cache-miss-rate: rate of cache miss (since 2.1) +# # @overflow: number of overflows # # Since: 1.2 ## { 'type': 'XBZRLECacheStats', 'data': {'cache-size': 'int', 'bytes': 'int', 'pages': 'int', - 'cache-miss': 'int', 'overflow': 'int' } } + 'cache-miss': 'int', 'cache-miss-rate': 'number', + 'overflow': 'int' } } ## # @MigrationInfo diff --git a/qmp-commands.hx b/qmp-commands.hx index aadcd04..f437937 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -2979,6 +2979,7 @@ The main json-object contains the following: - bytes: number of bytes transferred for XBZRLE compressed pages - pages: number of XBZRLE compressed pages - cache-miss: number of XBRZRLE page cache misses + - cache-miss-rate: rate of XBRZRLE page cache misses - overflow: number of times XBZRLE overflows. This means that the XBZRLE encoding was bigger than just sent the whole page, and then we sent the whole page instead (as as @@ -3087,6 +3088,7 @@ Examples:
[Qemu-devel] [PATCH v5 06/10] XBZRLE: rebuild the cache_is_cached function
From: ChenLiang chenlian...@huawei.com Rebuild the cache_is_cached function by cache_get_by_addr. And drops the asserts because the caller is also asserting the same thing. Signed-off-by: ChenLiang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com Reviewed-by: Dr. David Alan Gilbert dgilb...@redhat.com --- page_cache.c | 38 -- 1 file changed, 16 insertions(+), 22 deletions(-) diff --git a/page_cache.c b/page_cache.c index 78f7590..2626d3d 100644 --- a/page_cache.c +++ b/page_cache.c @@ -124,24 +124,6 @@ static size_t cache_get_cache_pos(const PageCache *cache, return pos; } -bool cache_is_cached(const PageCache *cache, uint64_t addr, - uint64_t current_age) -{ -size_t pos; - -g_assert(cache); -g_assert(cache-page_cache); - -pos = cache_get_cache_pos(cache, addr); - -if (cache-page_cache[pos].it_addr == addr) { -/* update the it_age when the cache hit */ -cache-page_cache[pos].it_age = current_age; -return true; -} -return false; -} - static CacheItem *cache_get_by_addr(const PageCache *cache, uint64_t addr) { size_t pos; @@ -159,14 +141,26 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr) return cache_get_by_addr(cache, addr)-it_data; } +bool cache_is_cached(const PageCache *cache, uint64_t addr, + uint64_t current_age) +{ +CacheItem *it; + +it = cache_get_by_addr(cache, addr); + +if (it-it_addr == addr) { +/* update the it_age when the cache hit */ +it-it_age = current_age; +return true; +} +return false; +} + int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata, uint64_t current_age) { -CacheItem *it = NULL; - -g_assert(cache); -g_assert(cache-page_cache); +CacheItem *it; /* actual update of entry */ it = cache_get_by_addr(cache, addr); -- 1.7.12.4
[Qemu-devel] [PATCH v5 03/10] migration: expose the bitmap_sync_count to the end
From: ChenLiang chenlian...@huawei.com expose the count that logs the times of updating the dirty bitmap to end user. Signed-off-by: ChenLiang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com Reviewed-by: Eric Blake ebl...@redhat.com --- arch_init.c | 1 + hmp.c | 2 ++ include/migration/migration.h | 1 + migration.c | 2 ++ qapi-schema.json | 4 +++- qmp-commands.hx | 13 + 6 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch_init.c b/arch_init.c index 200af0e..a7c87de 100644 --- a/arch_init.c +++ b/arch_init.c @@ -536,6 +536,7 @@ static void migration_bitmap_sync(void) s-dirty_bytes_rate = s-dirty_pages_rate * TARGET_PAGE_SIZE; start_time = end_time; num_dirty_pages_period = 0; +s-dirty_sync_count = bitmap_sync_count; } } diff --git a/hmp.c b/hmp.c index 2f279c4..77a8d18 100644 --- a/hmp.c +++ b/hmp.c @@ -188,6 +188,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info-ram-normal); monitor_printf(mon, normal bytes: % PRIu64 kbytes\n, info-ram-normal_bytes 10); +monitor_printf(mon, dirty sync count: % PRIu64 \n, + info-ram-dirty_sync_count); if (info-ram-dirty_pages_rate) { monitor_printf(mon, dirty pages rate: % PRIu64 pages\n, info-ram-dirty_pages_rate); diff --git a/include/migration/migration.h b/include/migration/migration.h index 3e1e6c7..9e62b99 100644 --- a/include/migration/migration.h +++ b/include/migration/migration.h @@ -61,6 +61,7 @@ struct MigrationState bool enabled_capabilities[MIGRATION_CAPABILITY_MAX]; int64_t xbzrle_cache_size; int64_t setup_time; +int64_t dirty_sync_count; }; void process_incoming_migration(QEMUFile *f); diff --git a/migration.c b/migration.c index bd1fb91..816f5fb 100644 --- a/migration.c +++ b/migration.c @@ -215,6 +215,7 @@ MigrationInfo *qmp_query_migrate(Error **errp) info-ram-normal_bytes = norm_mig_bytes_transferred(); info-ram-dirty_pages_rate = s-dirty_pages_rate; info-ram-mbps = s-mbps; +info-ram-dirty_sync_count = s-dirty_sync_count; if (blk_mig_active()) { info-has_disk = true; @@ -248,6 +249,7 @@ MigrationInfo *qmp_query_migrate(Error **errp) info-ram-normal = norm_mig_pages_transferred(); info-ram-normal_bytes = norm_mig_bytes_transferred(); info-ram-mbps = s-mbps; +info-ram-dirty_sync_count = s-dirty_sync_count; break; case MIG_STATE_ERROR: info-has_status = true; diff --git a/qapi-schema.json b/qapi-schema.json index 391356f..8a60a6e 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -651,13 +651,15 @@ # # @mbps: throughput in megabits/sec. (since 1.6) # +# @dirty-sync-count: number of times that dirty ram was synchronized (since 2.1) +# # Since: 0.14.0 ## { 'type': 'MigrationStats', 'data': {'transferred': 'int', 'remaining': 'int', 'total': 'int' , 'duplicate': 'int', 'skipped': 'int', 'normal': 'int', 'normal-bytes': 'int', 'dirty-pages-rate' : 'int', - 'mbps' : 'number' } } + 'mbps' : 'number', 'dirty-sync-count' : 'int' } } ## # @XBZRLECacheStats diff --git a/qmp-commands.hx b/qmp-commands.hx index ed3ab92..aadcd04 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -2967,6 +2967,7 @@ The main json-object contains the following: pages. This is just normal pages times size of one page, but this way upper levels don't need to care about page size (json-int) + - dirty-sync-count: times that dirty ram was synchronized (json-int) - disk: only present if status is active and it is a block migration, it is a json-object with the following disk information: - transferred: amount transferred in bytes (json-int) @@ -3004,7 +3005,8 @@ Examples: downtime:12345, duplicate:123, normal:123, - normal-bytes:123456 + normal-bytes:123456, + dirty-sync-count:15 } } } @@ -3029,7 +3031,8 @@ Examples: expected-downtime:12345, duplicate:123, normal:123, -normal-bytes:123456 +normal-bytes:123456, +dirty-sync-count:15 } } } @@ -3049,7 +3052,8 @@ Examples: expected-downtime:12345, duplicate:123, normal:123, -normal-bytes:123456 +normal-bytes:123456, +dirty-sync-count:15 }, disk:{ total:20971520, @@ -3075,7 +3079,8 @@ Examples: expected-downtime:12345, duplicate:10, normal:, -normal-bytes:3412992 +normal-bytes:3412992, +
[Qemu-devel] [PATCH v5 05/10] XBZRLE: optimize XBZRLE to decrease the cache misses
From: ChenLiang chenlian...@huawei.com Avoid hot pages being replaced by others to remarkably decrease cache misses Sample results with the test program which quote from xbzrle.txt ran in vm:(migrate bandwidth:1GE and xbzrle cache size 8MB) the test program: include stdlib.h include stdio.h int main() { char *buf = (char *) calloc(4096, 4096); while (1) { int i; for (i = 0; i 4096 * 4; i++) { buf[i * 4096 / 4]++; } printf(.); } } before this patch: virsh qemu-monitor-command test_vm '{execute: query-migrate}' {return:{expected-downtime:1020,xbzrle-cache:{bytes:1108284, cache-size:8388608,cache-miss-rate:0.987013,pages:18297,overflow:8, cache-miss:1228737},status:active,setup-time:10,total-time:52398, ram:{total:12466991104,remaining:1695744,mbps:935.559472, transferred:5780760580,dirty-sync-counter:271,duplicate:2878530, dirty-pages-rate:29130,skipped:0,normal-bytes:5748592640, normal:1403465}},id:libvirt-706} 18k pages sent compressed in 52 seconds. cache-miss-rate is 98.7%, totally miss. after optimizing: virsh qemu-monitor-command test_vm '{execute: query-migrate}' {return:{expected-downtime:2054,xbzrle-cache:{bytes:5066763, cache-size:8388608,cache-miss-rate:0.485924,pages:194823,overflow:0, cache-miss:210653},status:active,setup-time:11,total-time:18729, ram:{total:12466991104,remaining:3895296,mbps:937.663549, transferred:1615042219,dirty-sync-counter:98,duplicate:2869840, dirty-pages-rate:58781,skipped:0,normal-bytes:1588404224, normal:387794}},id:libvirt-266} 194k pages sent compressed in 18 seconds. The value of cache-miss-rate decrease to 48.59%. Signed-off-by: ChenLiang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- arch_init.c| 8 +--- docs/xbzrle.txt| 8 include/migration/page_cache.h | 10 +++--- page_cache.c | 23 +++ 4 files changed, 39 insertions(+), 10 deletions(-) diff --git a/arch_init.c b/arch_init.c index 15ca4c0..84a4bd3 100644 --- a/arch_init.c +++ b/arch_init.c @@ -343,7 +343,8 @@ static void xbzrle_cache_zero_page(ram_addr_t current_addr) /* We don't care if this fails to allocate a new cache page * as long as it updated an old one */ -cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE); +cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE, + bitmap_sync_count); } #define ENCODING_FLAG_XBZRLE 0x1 @@ -355,10 +356,11 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data, int encoded_len = 0, bytes_sent = -1; uint8_t *prev_cached_page; -if (!cache_is_cached(XBZRLE.cache, current_addr)) { +if (!cache_is_cached(XBZRLE.cache, current_addr, bitmap_sync_count)) { acct_info.xbzrle_cache_miss++; if (!last_stage) { -if (cache_insert(XBZRLE.cache, current_addr, *current_data) == -1) { +if (cache_insert(XBZRLE.cache, current_addr, *current_data, + bitmap_sync_count) == -1) { return -1; } else { /* update *current_data when the page has been diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt index cc3a26a..52c8511 100644 --- a/docs/xbzrle.txt +++ b/docs/xbzrle.txt @@ -71,6 +71,14 @@ encoded buffer: encoded length 24 e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69 +Cache update strategy += +Keeping the hot pages in the cache is effective for decreased cache +misses. XBZRLE uses a counter as the age of each page. The counter will +increase after each ram dirty bitmap sync. When a cache conflict is +detected, XBZRLE will only evict pages in the cache that are older than +a threshold. + Usage == 1. Verify the destination QEMU version is able to decode the new format. diff --git a/include/migration/page_cache.h b/include/migration/page_cache.h index 2d5ce2d..10ed532 100644 --- a/include/migration/page_cache.h +++ b/include/migration/page_cache.h @@ -43,8 +43,10 @@ void cache_fini(PageCache *cache); * * @cache pointer to the PageCache struct * @addr: page addr + * @current_age: current bitmap generation */ -bool cache_is_cached(const PageCache *cache, uint64_t addr); +bool cache_is_cached(const PageCache *cache, uint64_t addr, + uint64_t current_age); /** * get_cached_data: Get the data cached for an addr @@ -60,13 +62,15 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr); * cache_insert: insert the page into the cache. the page cache * will dup the data on insert. the previous value will be overwritten * - * Returns -1 on error + * Returns -1 when the page isn't inserted into cache * * @cache pointer to the PageCache struct * @addr: page address * @pdata: pointer to the page + * @current_age: current bitmap generation */ -int cache_insert(PageCache
[Qemu-devel] [PATCH v5 07/10] xbzrle: don't check the value in the vm ram repeatedly
From: ChenLiang chenlian...@huawei.com xbzrle_encode_buffer checks the value in the vm ram repeatedly. It is risk if runs xbzrle_encode_buffer on changing data. And it is not necessary. Reported-by: Dr. David Alan Gilbert dgilb...@redhat.com Signed-off-by: ChenLiang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- xbzrle.c | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/xbzrle.c b/xbzrle.c index fbcb35d..92cccd7 100644 --- a/xbzrle.c +++ b/xbzrle.c @@ -27,9 +27,10 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen, uint8_t *dst, int dlen) { uint32_t zrun_len = 0, nzrun_len = 0; -int d = 0, i = 0; +int d = 0, i = 0, j; long res, xor; uint8_t *nzrun_start = NULL; +uint8_t *xor_ptr = (uint8_t *)(xor); g_assert(!(((uintptr_t)old_buf | (uintptr_t)new_buf | slen) % sizeof(long))); @@ -82,6 +83,8 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen, if (d + 2 dlen) { return -1; } +i++; +nzrun_len++; /* not aligned to sizeof(long) */ res = (slen - i) % sizeof(long); while (res old_buf[i] != new_buf[i]) { @@ -98,11 +101,16 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen, xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i); if ((xor - mask) ~xor (mask 7)) { /* found the end of an nzrun within the current long */ -while (old_buf[i] != new_buf[i]) { -nzrun_len++; -i++; +for (j = 0; j sizeof(long); j++) { +if (0 == xor_ptr[j]) { +break; +} +} +i += j; +nzrun_len += j; +if (j != sizeof(long)) { +break; } -break; } else { i += sizeof(long); nzrun_len += sizeof(long); @@ -118,6 +126,8 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen, memcpy(dst + d, nzrun_start, nzrun_len); d += nzrun_len; nzrun_len = 0; +i++; +zrun_len++; } return d; -- 1.7.12.4
[Qemu-devel] [PATCH v5 08/10] xbzrle: check 8 bytes at a time after an concurrency scene
From: ChenLiang chenlian...@huawei.com The logic of old code is correct. But Checking byte by byte will consume time after an concurrency scene. Signed-off-by: ChenLiang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- xbzrle.c | 28 ++-- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/xbzrle.c b/xbzrle.c index 92cccd7..9d67309 100644 --- a/xbzrle.c +++ b/xbzrle.c @@ -51,16 +51,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen, /* word at a time for speed */ if (!res) { -while (i slen - (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) { -i += sizeof(long); -zrun_len += sizeof(long); -} - -/* go over the rest */ -while (i slen old_buf[i] == new_buf[i]) { -zrun_len++; -i++; +while (i slen) { +if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) { +i += sizeof(long); +zrun_len += sizeof(long); +} else { +/* go over the rest */ +for (j = 0; j sizeof(long); j++) { +if (old_buf[i] == new_buf[i]) { +i++; +zrun_len++; +} else { +break; +} +} +if (j != sizeof(long)) { +break; +} +} } } -- 1.7.12.4
[Qemu-devel] [PATCH v5 01/10] XBZRLE: Fix one XBZRLE corruption issues
From: ChenLiang chenlian...@huawei.com The page may not be inserted into cache after executing save_xbzrle_page. In case of failure to insert, the original page should be sent rather than the page in the cache. Signed-off-by: ChenLiang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com Reviewed-by: Juan Quintela quint...@redhat.com --- arch_init.c | 25 + 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/arch_init.c b/arch_init.c index 60c975d..2ac68c2 100644 --- a/arch_init.c +++ b/arch_init.c @@ -340,7 +340,7 @@ static void xbzrle_cache_zero_page(ram_addr_t current_addr) #define ENCODING_FLAG_XBZRLE 0x1 -static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data, +static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data, ram_addr_t current_addr, RAMBlock *block, ram_addr_t offset, int cont, bool last_stage) { @@ -348,19 +348,23 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data, uint8_t *prev_cached_page; if (!cache_is_cached(XBZRLE.cache, current_addr)) { +acct_info.xbzrle_cache_miss++; if (!last_stage) { -if (cache_insert(XBZRLE.cache, current_addr, current_data) == -1) { +if (cache_insert(XBZRLE.cache, current_addr, *current_data) == -1) { return -1; +} else { +/* update *current_data when the page has been + inserted into cache */ +*current_data = get_cached_data(XBZRLE.cache, current_addr); } } -acct_info.xbzrle_cache_miss++; return -1; } prev_cached_page = get_cached_data(XBZRLE.cache, current_addr); /* save current buffer into memory */ -memcpy(XBZRLE.current_buf, current_data, TARGET_PAGE_SIZE); +memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE); /* XBZRLE encoding (if there is no overflow) */ encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf, @@ -373,7 +377,10 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data, DPRINTF(Overflow\n); acct_info.xbzrle_overflows++; /* update data in the cache */ -memcpy(prev_cached_page, current_data, TARGET_PAGE_SIZE); +if (!last_stage) { +memcpy(prev_cached_page, *current_data, TARGET_PAGE_SIZE); +*current_data = prev_cached_page; +} return -1; } @@ -598,15 +605,9 @@ static int ram_save_block(QEMUFile *f, bool last_stage) */ xbzrle_cache_zero_page(current_addr); } else if (!ram_bulk_stage migrate_use_xbzrle()) { -bytes_sent = save_xbzrle_page(f, p, current_addr, block, +bytes_sent = save_xbzrle_page(f, p, current_addr, block, offset, cont, last_stage); if (!last_stage) { -/* We must send exactly what's in the xbzrle cache - * even if the page wasn't xbzrle compressed, so that - * it's right next time. - */ -p = get_cached_data(XBZRLE.cache, current_addr); - /* Can't send this cached data async, since the cache page * might get updated before it gets to the wire */ -- 1.7.12.4
[Qemu-devel] [PATCH v5 02/10] migration: Add counts of updating the dirty bitmap
From: ChenLiang chenlian...@huawei.com Add counts to log the times of updating the dirty bitmap. Signed-off-by: ChenLiang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com Reviewed-by: Eric Blake ebl...@redhat.com --- arch_init.c | 5 + 1 file changed, 5 insertions(+) diff --git a/arch_init.c b/arch_init.c index 2ac68c2..200af0e 100644 --- a/arch_init.c +++ b/arch_init.c @@ -110,6 +110,8 @@ static bool mig_throttle_on; static int dirty_rate_high_cnt; static void check_guest_throttling(void); +static uint64_t bitmap_sync_count; + /***/ /* ram save/restore */ @@ -487,6 +489,8 @@ static void migration_bitmap_sync(void) int64_t end_time; int64_t bytes_xfer_now; +bitmap_sync_count++; + if (!bytes_xfer_prev) { bytes_xfer_prev = ram_bytes_transferred(); } @@ -734,6 +738,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque) migration_dirty_pages = ram_pages; mig_throttle_on = false; dirty_rate_high_cnt = 0; +bitmap_sync_count = 0; if (migrate_use_xbzrle()) { qemu_mutex_lock_iothread(); -- 1.7.12.4
[Qemu-devel] [PATCH v3 4/4] gtk: Add Grab On Click option
I simply like it better, you don't? :) Signed-off-by: Takashi Iwai ti...@suse.de --- ui/gtk.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/ui/gtk.c b/ui/gtk.c index 6668bd8226d5..4427d9f6c1e9 100644 --- a/ui/gtk.c +++ b/ui/gtk.c @@ -141,6 +141,7 @@ typedef struct GtkDisplayState GtkWidget *zoom_fit_item; GtkWidget *grab_item; GtkWidget *grab_on_hover_item; +GtkWidget *grab_on_click_item; GtkWidget *vga_item; int nb_vcs; @@ -189,6 +190,11 @@ static bool gd_grab_on_hover(GtkDisplayState *s) return gtk_check_menu_item_get_active(GTK_CHECK_MENU_ITEM(s-grab_on_hover_item)); } +static bool gd_grab_on_click(GtkDisplayState *s) +{ +return gtk_check_menu_item_get_active(GTK_CHECK_MENU_ITEM(s-grab_on_click_item)); +} + static bool gd_on_vga(GtkDisplayState *s) { return gtk_notebook_get_current_page(GTK_NOTEBOOK(s-notebook)) == 0; @@ -685,6 +691,12 @@ static gboolean gd_button_event(GtkWidget *widget, GdkEventButton *button, GtkDisplayState *s = opaque; InputButton btn; +if (button-button == 1 button-type == GDK_BUTTON_PRESS +!gd_is_grab_active(s) gd_grab_on_click(s)) { +gtk_check_menu_item_set_active(GTK_CHECK_MENU_ITEM(s-grab_item), TRUE); +return TRUE; +} + if (button-button == 1) { btn = INPUT_BUTTON_LEFT; } else if (button-button == 2) { @@ -1417,6 +1429,9 @@ static GtkWidget *gd_create_menu_view(GtkDisplayState *s, GtkAccelGroup *accel_g s-grab_on_hover_item = gtk_check_menu_item_new_with_mnemonic(_(Grab On _Hover)); gtk_menu_shell_append(GTK_MENU_SHELL(view_menu), s-grab_on_hover_item); +s-grab_on_click_item = gtk_check_menu_item_new_with_mnemonic(_(Grab On _Click)); +gtk_menu_shell_append(GTK_MENU_SHELL(view_menu), s-grab_on_click_item); + s-grab_item = gtk_check_menu_item_new_with_mnemonic(_(_Grab Input)); gtk_menu_item_set_accel_path(GTK_MENU_ITEM(s-grab_item), QEMU/View/Grab Input); -- 1.9.1
[Qemu-devel] [PATCH v3 3/4] gtk: Remember the last grabbed pointer position
It's pretty annoying that the pointer reappears at a random place once after grabbing and ungrabbing the input. Better to restore to the original position where the pointer was grabbed. Reference: https://bugzilla.novell.com/show_bug.cgi?id=849587 Tested-by: Cole Robinson crobi...@redhat.com Reviewed-by: Cole Robinson crobi...@redhat.com Signed-off-by: Takashi Iwai ti...@suse.de --- ui/gtk.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/ui/gtk.c b/ui/gtk.c index 913cc3f70c02..6668bd8226d5 100644 --- a/ui/gtk.c +++ b/ui/gtk.c @@ -159,6 +159,8 @@ typedef struct GtkDisplayState gboolean last_set; int last_x; int last_y; +int grab_x_root; +int grab_y_root; double scale_x; double scale_y; @@ -971,8 +973,8 @@ static void gd_ungrab_keyboard(GtkDisplayState *s) static void gd_grab_pointer(GtkDisplayState *s) { -#if GTK_CHECK_VERSION(3, 0, 0) GdkDisplay *display = gtk_widget_get_display(s-drawing_area); +#if GTK_CHECK_VERSION(3, 0, 0) GdkDeviceManager *mgr = gdk_display_get_device_manager(display); GList *devices = gdk_device_manager_list_devices(mgr, GDK_DEVICE_TYPE_MASTER); @@ -996,6 +998,8 @@ static void gd_grab_pointer(GtkDisplayState *s) tmp = tmp-next; } g_list_free(devices); +gdk_device_get_position(gdk_device_manager_get_client_pointer(mgr), +NULL, s-grab_x_root, s-grab_y_root); #else gdk_pointer_grab(gtk_widget_get_window(s-drawing_area), FALSE, /* All events to come to our window directly */ @@ -1007,13 +1011,15 @@ static void gd_grab_pointer(GtkDisplayState *s) NULL, /* Allow cursor to move over entire desktop */ s-null_cursor, GDK_CURRENT_TIME); +gdk_display_get_pointer(display, NULL, +s-grab_x_root, s-grab_y_root, NULL); #endif } static void gd_ungrab_pointer(GtkDisplayState *s) { -#if GTK_CHECK_VERSION(3, 0, 0) GdkDisplay *display = gtk_widget_get_display(s-drawing_area); +#if GTK_CHECK_VERSION(3, 0, 0) GdkDeviceManager *mgr = gdk_display_get_device_manager(display); GList *devices = gdk_device_manager_list_devices(mgr, GDK_DEVICE_TYPE_MASTER); @@ -1027,8 +1033,14 @@ static void gd_ungrab_pointer(GtkDisplayState *s) tmp = tmp-next; } g_list_free(devices); +gdk_device_warp(gdk_device_manager_get_client_pointer(mgr), +gtk_widget_get_screen(s-drawing_area), +s-grab_x_root, s-grab_y_root); #else gdk_pointer_ungrab(GDK_CURRENT_TIME); +gdk_display_warp_pointer(display, + gtk_widget_get_screen(s-drawing_area), + s-grab_x_root, s-grab_y_root); #endif } -- 1.9.1
[Qemu-devel] [PATCH v3 1/4] gtk: Use gtk generic event signal instead of motion-notify-event
The GDK motion-notify-event isn't generated when the pointer goes out of the target window even if the pointer is grabbed, which essentially means to lose the pointer tracking in gtk-ui. Meanwhile the generic event signal is sent when the pointer is grabbed, so we can use this and pick the motion notify events manually there instead. Reference: https://bugzilla.novell.com/show_bug.cgi?id=849587 Tested-by: Cole Robinson crobi...@redhat.com Reviewed-by: Cole Robinson crobi...@redhat.com Signed-off-by: Takashi Iwai ti...@suse.de --- ui/gtk.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/ui/gtk.c b/ui/gtk.c index f056e4034b3e..c9f6d24eeb1a 100644 --- a/ui/gtk.c +++ b/ui/gtk.c @@ -765,6 +765,14 @@ static gboolean gd_key_event(GtkWidget *widget, GdkEventKey *key, void *opaque) return TRUE; } +static gboolean gd_event(GtkWidget *widget, GdkEvent *event, void *opaque) +{ +if (event-type == GDK_MOTION_NOTIFY) { +return gd_motion_event(widget, event-motion, opaque); +} +return FALSE; +} + /** Window Menu Actions **/ static void gd_menu_pause(GtkMenuItem *item, void *opaque) @@ -1267,8 +1275,8 @@ static void gd_connect_signals(GtkDisplayState *s) g_signal_connect(s-drawing_area, expose-event, G_CALLBACK(gd_expose_event), s); #endif -g_signal_connect(s-drawing_area, motion-notify-event, - G_CALLBACK(gd_motion_event), s); +g_signal_connect(s-drawing_area, event, + G_CALLBACK(gd_event), s); g_signal_connect(s-drawing_area, button-press-event, G_CALLBACK(gd_button_event), s); g_signal_connect(s-drawing_area, button-release-event, -- 1.9.1
[Qemu-devel] [PATCH v3 2/4] gtk: Fix the relative pointer tracking mode
The relative pointer tracking mode was still buggy even after the previous fix of the motion-notify-event since the events are filtered out when the pointer moves outside the drawing window due to the boundary check for the absolute mode. This patch fixes the issue by moving the unnecessary boundary check into the if block of absolute mode, and keep the coordinate in the relative mode even if it's outside the drawing area. But this makes the coordinate (last_x, last_y) possibly pointing to (-1,-1), introduce a new flag to indicate the last coordinate has been updated. Reference: https://bugzilla.novell.com/show_bug.cgi?id=849587 Tested-by: Cole Robinson crobi...@redhat.com Reviewed-by: Cole Robinson crobi...@redhat.com Signed-off-by: Takashi Iwai ti...@suse.de --- ui/gtk.c | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/ui/gtk.c b/ui/gtk.c index c9f6d24eeb1a..913cc3f70c02 100644 --- a/ui/gtk.c +++ b/ui/gtk.c @@ -156,6 +156,7 @@ typedef struct GtkDisplayState DisplayChangeListener dcl; DisplaySurface *ds; int button_mask; +gboolean last_set; int last_x; int last_y; @@ -616,25 +617,25 @@ static gboolean gd_motion_event(GtkWidget *widget, GdkEventMotion *motion, x = (motion-x - mx) / s-scale_x; y = (motion-y - my) / s-scale_y; -if (x 0 || y 0 || -x = surface_width(s-ds) || -y = surface_height(s-ds)) { -return TRUE; -} - if (qemu_input_is_absolute()) { +if (x 0 || y 0 || +x = surface_width(s-ds) || +y = surface_height(s-ds)) { +return TRUE; +} qemu_input_queue_abs(s-dcl.con, INPUT_AXIS_X, x, surface_width(s-ds)); qemu_input_queue_abs(s-dcl.con, INPUT_AXIS_Y, y, surface_height(s-ds)); qemu_input_event_sync(); -} else if (s-last_x != -1 s-last_y != -1 gd_is_grab_active(s)) { +} else if (s-last_set gd_is_grab_active(s)) { qemu_input_queue_rel(s-dcl.con, INPUT_AXIS_X, x - s-last_x); qemu_input_queue_rel(s-dcl.con, INPUT_AXIS_Y, y - s-last_y); qemu_input_event_sync(); } s-last_x = x; s-last_y = y; +s-last_set = TRUE; if (!qemu_input_is_absolute() gd_is_grab_active(s)) { GdkScreen *screen = gtk_widget_get_screen(s-drawing_area); @@ -669,8 +670,7 @@ static gboolean gd_motion_event(GtkWidget *widget, GdkEventMotion *motion, GdkDisplay *display = gtk_widget_get_display(widget); gdk_display_warp_pointer(display, screen, x, y); #endif -s-last_x = -1; -s-last_y = -1; +s-last_set = FALSE; return FALSE; } } -- 1.9.1
[Qemu-devel] [PATCH v3 0/4] Fix relative pointer tracking on Gtk UI
Hi, this is the revisited patch series. The only difference from v2 is that now they are checkpatch-clean and Cole's acks have been added to patches 1-3. Takashi
Re: [Qemu-devel] [PATCH-trivial v2] vl: Report accelerator not supported for target more nicely
Chen Gang gang.chen.5...@gmail.com writes: When you ask for an accelerator not supported for your target, you get a bogus accelerator does not exist message: $ qemu-system-arm -machine none,accel=kvm KVM not supported for this target kvm accelerator does not exist. No accelerator found! Suppress it. Signed-off-by: Chen Gang gang.chen.5...@gmail.com Reviewed-by: Markus Armbruster arm...@redhat.com
Re: [Qemu-devel] [PATCH] spapr: skip adding usb keyboard/mouse in case of -nodefaults
Nikunj A Dadhania nik...@linux.vnet.ibm.com writes: Markus Armbruster arm...@redhat.com writes: Nikunj A Dadhania nik...@linux.vnet.ibm.com writes: Have you considered extending QEMUMachineInitArgs instead of adding this function? Did not think of this option earlier. You mean doing something like this? Yes. Looks nicer, doesn't it?
Re: [Qemu-devel] [PATCH] spapr: skip adding usb keyboard/mouse in case of -nodefaults
Il 04/04/2014 12:58, Markus Armbruster ha scritto: Have you considered extending QEMUMachineInitArgs instead of adding this function? Did not think of this option earlier. You mean doing something like this? Yes. Looks nicer, doesn't it? I still think it's a libvirt bug. Mixing -nodefaults and -usb and -device is looking for trouble I think. -usb is a do-what-I-mean kind of option and it makes sense for it to add a keyboard and mouse, even with -nodefaults. Paolo
Re: [Qemu-devel] [PATCH] spapr: skip adding usb keyboard/mouse in case of -nodefaults
Il 04/04/2014 07:28, Nikunj A Dadhania ha scritto: And -usb is translated to adding pci-ohci controller for spapr Yeah, but with -nodefaults it's better to use -device directly. I think there is special handling for this in vl.c bool usb_enabled(bool default_usb) { return qemu_opt_get_bool(qemu_get_machine_opts(), usb, has_defaults default_usb); } And spapr.c uses this: if (usb_enabled(spapr-has_graphics)) { pci_create_simple(phb-bus, -1, pci-ohci); Sure. However, I'm saying that it's fine for spapr to make -usb mean OHCI, and also keyboard mouse if there is a VGA card in the system. If libvirt used -device pci-ohci unconditionally, it would fix the bug *and* it would ensure that the PCI slot of pci-ohci does not change due to some other unrelated reason. So I would rather have the fix in libvirt. Paolo
Re: [Qemu-devel] [PATCH] spapr: skip adding usb keyboard/mouse in case of -nodefaults
Paolo Bonzini pbonz...@redhat.com writes: Il 04/04/2014 12:58, Markus Armbruster ha scritto: Have you considered extending QEMUMachineInitArgs instead of adding this function? Did not think of this option earlier. You mean doing something like this? Yes. Looks nicer, doesn't it? I still think it's a libvirt bug. Mixing -nodefaults and -usb and -device is looking for trouble I think. -usb is a do-what-I-mean kind of option and it makes sense for it to add a keyboard and mouse, even with -nodefaults. I agree that management tools should not use -usb, except when they want to manage ancient versions of QEMU for some reason.
Re: [Qemu-devel] [PATCH] spapr: skip adding usb keyboard/mouse in case of -nodefaults
Paolo Bonzini pbonz...@redhat.com writes: Il 04/04/2014 07:28, Nikunj A Dadhania ha scritto: And -usb is translated to adding pci-ohci controller for spapr Yeah, but with -nodefaults it's better to use -device directly. I think there is special handling for this in vl.c bool usb_enabled(bool default_usb) { return qemu_opt_get_bool(qemu_get_machine_opts(), usb, has_defaults default_usb); } And spapr.c uses this: if (usb_enabled(spapr-has_graphics)) { pci_create_simple(phb-bus, -1, pci-ohci); Sure. However, I'm saying that it's fine for spapr to make -usb mean OHCI, and also keyboard mouse if there is a VGA card in the system. If libvirt used -device pci-ohci unconditionally, it would fix the bug *and* it would ensure that the PCI slot of pci-ohci does not change due to some other unrelated reason. So I would rather have the fix in libvirt. Sure, let me check this. Thanks, Nikunj
Re: [Qemu-devel] [PATCH] spapr: skip adding usb keyboard/mouse in case of -nodefaults
Il 04/04/2014 13:40, Nikunj A Dadhania ha scritto: Sure. However, I'm saying that it's fine for spapr to make -usb mean OHCI, and also keyboard mouse if there is a VGA card in the system. If libvirt used -device pci-ohci unconditionally, it would fix the bug *and* it would ensure that the PCI slot of pci-ohci does not change due to some other unrelated reason. So I would rather have the fix in libvirt. Sure, let me check this. It is probably not that easy---libvirt needs to have an implicit USB controller element in the domain XML or something like that, and let the user's XML override the default USB controller. But it would be much better and much more consistent with x86. Paolo
[Qemu-devel] [PATCH for-2.0 v2 0/2] fix bugs involving linux-user signal handling
This patch series fixes bugs reported by Andrei Warkentin involving signal handling in linux-user mode. The first is Andrei's first patch (though I have tweaked the commit message a little). The second patch is aimed at fixing the locking bug that Andrei noted, in a somewhat simpler way than his patches use. The test cases Andrei provided both pass with these patches. Changes v1-v2: * reset have_tb_lock to false when we unlock it on the after-longjmp code path Andrei Warkentin (1): page_check_range: don't bail out early after unprotecting page Peter Maydell (1): cpu-exec: Unlock tb_lock if we longjmp out of code generation cpu-exec.c | 8 translate-all.c | 1 - 2 files changed, 8 insertions(+), 1 deletion(-) -- 1.9.0
[Qemu-devel] [PATCH for-2.0] iscsi: Don't set error if already set in iscsi_do_inquiry
This eliminates the possible assertion failure in error_setg(). Signed-off-by: Fam Zheng f...@redhat.com --- block/iscsi.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/block/iscsi.c b/block/iscsi.c index 21c18a3..64a509f 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -1101,8 +1101,10 @@ static struct scsi_task *iscsi_do_inquiry(struct iscsi_context *iscsi, int lun, return task; fail: -error_setg(errp, iSCSI: Inquiry command failed : %s, - iscsi_get_error(iscsi)); +if (!error_is_set(errp)) { +error_setg(errp, iSCSI: Inquiry command failed : %s, + iscsi_get_error(iscsi)); +} if (task != NULL) { scsi_free_scsi_task(task); } -- 1.9.1
[Qemu-devel] [PATCH for-2.0 v2 1/2] page_check_range: don't bail out early after unprotecting page
From: Andrei Warkentin andrey.warken...@gmail.com When checking a page range, if we found that a page was made read-only by QEMU because it contained translated code, we were incorrectly returning immediately after unprotecting that page, rather than continuing to check the entire range, so we might fail to unprotect pages later in the range, or might incorrectly return a success result even if later pages were not writable. In particular, this could cause segfaults in a case where signals are delivered back to back on a target architecture which uses trampoline code in the stack frame (as AArch64 currently does). The second signal causes a segfault because the frame cannot be written to (it was protected because we translated and executed the restorer trampoline, and the unprotect logic did not unprotect the whole range). Signed-off-by: Andrei Warkentin andrey.warken...@gmail.com [PMM: expanded commit message a bit] Reviewed-by: Richard Henderson r...@twiddle.net Signed-off-by: Peter Maydell peter.mayd...@linaro.org --- translate-all.c | 1 - 1 file changed, 1 deletion(-) diff --git a/translate-all.c b/translate-all.c index f243c10..5759974 100644 --- a/translate-all.c +++ b/translate-all.c @@ -1777,7 +1777,6 @@ int page_check_range(target_ulong start, target_ulong len, int flags) return -1; } } -return 0; } } return 0; -- 1.9.0
[Qemu-devel] [PATCH for-2.0 v2 2/2] cpu-exec: Unlock tb_lock if we longjmp out of code generation
If the guest attempts to execute from unreadable memory, this will cause us to longjmp back to the main loop from inside the target frontend decoder. For linux-user mode, this means we will still hold the tb_ctx.tb_lock, and will deadlock when we try to start executing code again. Unlock the lock in the return-from-longjmp code path to avoid this. Signed-off-by: Peter Maydell peter.mayd...@linaro.org Acked-by: Andrei Warkentin andrey.warken...@gmail.com --- cpu-exec.c | 8 1 file changed, 8 insertions(+) diff --git a/cpu-exec.c b/cpu-exec.c index 0914d3c..2f54054 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -227,6 +227,8 @@ int cpu_exec(CPUArchState *env) TranslationBlock *tb; uint8_t *tc_ptr; uintptr_t next_tb; +/* This must be volatile so it is not trashed by longjmp() */ +volatile bool have_tb_lock = false; if (cpu-halted) { if (!cpu_has_work(cpu)) { @@ -600,6 +602,7 @@ int cpu_exec(CPUArchState *env) cpu_loop_exit(cpu); } spin_lock(tcg_ctx.tb_ctx.tb_lock); +have_tb_lock = true; tb = tb_find_fast(env); /* Note: we do it here to avoid a gcc bug on Mac OS X when doing it in tb_find_slow */ @@ -621,6 +624,7 @@ int cpu_exec(CPUArchState *env) tb_add_jump((TranslationBlock *)(next_tb ~TB_EXIT_MASK), next_tb TB_EXIT_MASK, tb); } +have_tb_lock = false; spin_unlock(tcg_ctx.tb_ctx.tb_lock); /* cpu_interrupt might be called while translating the @@ -692,6 +696,10 @@ int cpu_exec(CPUArchState *env) #ifdef TARGET_I386 x86_cpu = X86_CPU(cpu); #endif +if (have_tb_lock) { +spin_unlock(tcg_ctx.tb_ctx.tb_lock); +have_tb_lock = false; +} } } /* for(;;) */ -- 1.9.0
Re: [Qemu-devel] [PATCH for-2.0] iscsi: Don't set error if already set in iscsi_do_inquiry
Il 04/04/2014 13:53, Fam Zheng ha scritto: This eliminates the possible assertion failure in error_setg(). Signed-off-by: Fam Zheng f...@redhat.com --- block/iscsi.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/block/iscsi.c b/block/iscsi.c index 21c18a3..64a509f 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -1101,8 +1101,10 @@ static struct scsi_task *iscsi_do_inquiry(struct iscsi_context *iscsi, int lun, return task; fail: -error_setg(errp, iSCSI: Inquiry command failed : %s, - iscsi_get_error(iscsi)); +if (!error_is_set(errp)) { +error_setg(errp, iSCSI: Inquiry command failed : %s, + iscsi_get_error(iscsi)); +} if (task != NULL) { scsi_free_scsi_task(task); } Reviewed-by: Paolo Bonzini pbonz...@redhat.com Kevin, Stefan, can you send the pull request for this? Paolo
[Qemu-devel] [PATCH for-2.0 0/3] bdrv_open() fixes
Kevin Wolf (3): block: Don't parse 'filename' option qemu-iotests: Remove CR line endings in reference output block: Fix snapshot=on for protocol parsed from filename block.c| 152 +++-- include/block/block.h | 1 + tests/qemu-iotests/051 | 6 ++ tests/qemu-iotests/051.out | 35 +-- 4 files changed, 115 insertions(+), 79 deletions(-) -- 1.8.3.1
[Qemu-devel] [PATCH 3/3] block: Fix snapshot=on for protocol parsed from filename
Since commit 9fd3171a, BDRV_O_SNAPSHOT uses an option QDict to specify the originally requested image as the backing file of the newly created temporary snapshot. This means that the filename is stored in file.filename, which is an option that is not parsed for protocol names. Therefore things like -drive file=nbd:localhost:10809 were broken because it looked for a local file with the literal name 'nbd:localhost:10809'. This patch changes the way BDRV_O_SNAPSHOT works once again. We now open the originally requested image as normal, and then do a similar operation as for live snapshots to put the temporary snapshot on top. This way, both driver specific options and parsed filenames work. As a nice side effect, this results in code movement to factor bdrv_append_temp_snapshot() out. This is a good preparation for moving its call to drive_init() and friends eventually. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c| 144 +++-- include/block/block.h | 1 + tests/qemu-iotests/051 | 2 + tests/qemu-iotests/051.out | 14 + 4 files changed, 91 insertions(+), 70 deletions(-) diff --git a/block.c b/block.c index df2b8d1..7817fea 100644 --- a/block.c +++ b/block.c @@ -1162,6 +1162,69 @@ done: return ret; } +void bdrv_append_temp_snapshot(BlockDriverState *bs, Error **errp) +{ +/* TODO: extra byte is a hack to ensure MAX_PATH space on Windows. */ +char tmp_filename[PATH_MAX + 1]; + +int64_t total_size; +BlockDriver *bdrv_qcow2; +QEMUOptionParameter *create_options; +QDict *snapshot_options; +BlockDriverState *bs_snapshot; +Error *local_err; +int ret; + +/* if snapshot, we create a temporary backing file and open it + instead of opening 'filename' directly */ + +/* Get the required size from the image */ +total_size = bdrv_getlength(bs) BDRV_SECTOR_MASK; + +/* Create the temporary image */ +ret = get_tmp_filename(tmp_filename, sizeof(tmp_filename)); +if (ret 0) { +error_setg_errno(errp, -ret, Could not get temporary filename); +return; +} + +bdrv_qcow2 = bdrv_find_format(qcow2); +create_options = parse_option_parameters(, bdrv_qcow2-create_options, + NULL); + +set_option_parameter_int(create_options, BLOCK_OPT_SIZE, total_size); + +ret = bdrv_create(bdrv_qcow2, tmp_filename, create_options, local_err); +free_option_parameters(create_options); +if (ret 0) { +error_setg_errno(errp, -ret, Could not create temporary overlay + '%s': %s, tmp_filename, + error_get_pretty(local_err)); +error_free(local_err); +return; +} + +/* Prepare a new options QDict for the temporary file */ +snapshot_options = qdict_new(); +qdict_put(snapshot_options, file.driver, + qstring_from_str(file)); +qdict_put(snapshot_options, file.filename, + qstring_from_str(tmp_filename)); + +bs_snapshot = bdrv_new(); +bs_snapshot-is_temporary = 1; + +ret = bdrv_open(bs_snapshot, NULL, NULL, snapshot_options, +bs-open_flags ~BDRV_O_SNAPSHOT, bdrv_qcow2, local_err); +if (ret 0) { +error_propagate(errp, local_err); +return; +} + +bdrv_append(bs_snapshot, bs); +bdrv_reopen(bs_snapshot, bs_snapshot-open_flags ~BDRV_O_RDWR, NULL); +} + /* * Opens a disk image (raw, qcow2, vmdk, ...) * @@ -1182,8 +1245,6 @@ int bdrv_open(BlockDriverState **pbs, const char *filename, BlockDriver *drv, Error **errp) { int ret; -/* TODO: extra byte is a hack to ensure MAX_PATH space on Windows. */ -char tmp_filename[PATH_MAX + 1]; BlockDriverState *file = NULL, *bs; const char *drvname; Error *local_err = NULL; @@ -1243,74 +1304,6 @@ int bdrv_open(BlockDriverState **pbs, const char *filename, } } -/* For snapshot=on, create a temporary qcow2 overlay */ -if (flags BDRV_O_SNAPSHOT) { -BlockDriverState *bs1; -int64_t total_size; -BlockDriver *bdrv_qcow2; -QEMUOptionParameter *create_options; -QDict *snapshot_options; - -/* if snapshot, we create a temporary backing file and open it - instead of opening 'filename' directly */ - -/* Get the required size from the image */ -QINCREF(options); -bs1 = NULL; -ret = bdrv_open(bs1, filename, NULL, options, BDRV_O_NO_BACKING, -drv, local_err); -if (ret 0) { -goto fail; -} -total_size = bdrv_getlength(bs1) BDRV_SECTOR_MASK; - -bdrv_unref(bs1); - -/* Create the temporary image */ -ret = get_tmp_filename(tmp_filename, sizeof(tmp_filename)); -if (ret 0) { -error_setg_errno(errp, -ret, Could not get temporary filename); -goto fail; -
[Qemu-devel] [PATCH 1/3] block: Don't parse 'filename' option
When using the QDict option 'filename', it is supposed to be interpreted literally. The code did correctly avoid guessing the protocol from any string before the first colon, but it still called bdrv_parse_filename() which would, for example, incorrectly remove a 'file:' prefix in the raw-posix driver. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c| 8 tests/qemu-iotests/051 | 4 tests/qemu-iotests/051.out | 11 +++ 3 files changed, 19 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index 7a90a1b..df2b8d1 100644 --- a/block.c +++ b/block.c @@ -968,7 +968,7 @@ static int bdrv_file_open(BlockDriverState *bs, const char *filename, { BlockDriver *drv; const char *drvname; -bool allow_protocol_prefix = false; +bool parse_filename = false; Error *local_err = NULL; int ret; @@ -977,7 +977,7 @@ static int bdrv_file_open(BlockDriverState *bs, const char *filename, filename = qdict_get_try_str(*options, filename); } else if (filename !qdict_haskey(*options, filename)) { qdict_put(*options, filename, qstring_from_str(filename)); -allow_protocol_prefix = true; +parse_filename = true; } else { error_setg(errp, Can't specify 'file' and 'filename' options at the same time); @@ -994,7 +994,7 @@ static int bdrv_file_open(BlockDriverState *bs, const char *filename, } qdict_del(*options, driver); } else if (filename) { -drv = bdrv_find_protocol(filename, allow_protocol_prefix); +drv = bdrv_find_protocol(filename, parse_filename); if (!drv) { error_setg(errp, Unknown protocol); } @@ -1010,7 +1010,7 @@ static int bdrv_file_open(BlockDriverState *bs, const char *filename, } /* Parse the filename and open it */ -if (drv-bdrv_parse_filename filename) { +if (drv-bdrv_parse_filename parse_filename) { drv-bdrv_parse_filename(filename, *options, local_err); if (local_err) { error_propagate(errp, local_err); diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051 index 14694e1..2f79b26 100755 --- a/tests/qemu-iotests/051 +++ b/tests/qemu-iotests/051 @@ -204,6 +204,10 @@ run_qemu -hda foo:bar run_qemu -drive file=foo:bar run_qemu -drive file.filename=foo:bar +run_qemu -hda file:$TEST_IMG +run_qemu -drive file=file:$TEST_IMG +run_qemu -drive file.filename=file:$TEST_IMG + echo echo === Snapshot mode === echo diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out index f5e33ff..671ac5f 100644 --- a/tests/qemu-iotests/051.out +++ b/tests/qemu-iotests/051.out @@ -275,6 +275,17 @@ QEMU_PROG: -drive file=foo:bar: could not open disk image foo:bar: Unknown proto Testing: -drive file.filename=foo:bar QEMU_PROG: -drive file.filename=foo:bar: could not open disk image ide0-hd0: Could not open 'foo:bar': No such file or directory +Testing: -hda file:TEST_DIR/t.qcow2 +QEMU X.Y.Z monitor - type 'help' for more information +(qemu) q[K[Dqu[K[D[Dqui[K[D[D[Dquit[K + +Testing: -drive file=file:TEST_DIR/t.qcow2 +QEMU X.Y.Z monitor - type 'help' for more information +(qemu) q[K[Dqu[K[D[Dqui[K[D[D[Dquit[K + +Testing: -drive file.filename=file:TEST_DIR/t.qcow2 +QEMU_PROG: -drive file.filename=file:TEST_DIR/t.qcow2: could not open disk image ide0-hd0: Could not open 'file:TEST_DIR/t.qcow2': No such file or directory + === Snapshot mode === -- 1.8.3.1
[Qemu-devel] [PATCH 2/3] qemu-iotests: Remove CR line endings in reference output
qemu doesn't print these CRs any more. The test still didn't fail because the output comparison ignores line endings, but the change turns up each time when you want to update the output. Signed-off-by: Kevin Wolf kw...@redhat.com --- tests/qemu-iotests/051.out | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out index 671ac5f..a631b0b 100644 --- a/tests/qemu-iotests/051.out +++ b/tests/qemu-iotests/051.out @@ -44,11 +44,11 @@ QEMU_PROG: -drive file=TEST_DIR/t.qcow2,driver=foo: could not open disk image TE === Overriding backing file === Testing: -drive file=TEST_DIR/t.qcow2,driver=qcow2,backing.file.filename=TEST_DIR/t.qcow2.orig -nodefaults -QEMU X.Y.Z monitor - type 'help' for more information -(qemu) i[K[Din[K[D[Dinf[K[D[D[Dinfo[K[D[D[D[Dinfo [K[D[D[D[D[Dinfo b[K[D[D[D[D[D[Dinfo bl[K[D[D[D[D[D[D[Dinfo blo[K[D[D[D[D[D[D[D[Dinfo bloc[K[D[D[D[D[D[D[D[D[Dinfo block[K -ide0-hd0: TEST_DIR/t.qcow2 (qcow2) -Backing file: TEST_DIR/t.qcow2.orig (chain depth: 1) -(qemu) q[K[Dqu[K[D[Dqui[K[D[D[Dquit[K +QEMU X.Y.Z monitor - type 'help' for more information +(qemu) i[K[Din[K[D[Dinf[K[D[D[Dinfo[K[D[D[D[Dinfo [K[D[D[D[D[Dinfo b[K[D[D[D[D[D[Dinfo bl[K[D[D[D[D[D[D[Dinfo blo[K[D[D[D[D[D[D[D[Dinfo bloc[K[D[D[D[D[D[D[D[D[Dinfo block[K +ide0-hd0: TEST_DIR/t.qcow2 (qcow2) +Backing file: TEST_DIR/t.qcow2.orig (chain depth: 1) +(qemu) q[K[Dqu[K[D[Dqui[K[D[D[Dquit[K === Enable and disable lazy refcounting on the command line, plus some invalid values === -- 1.8.3.1
Re: [Qemu-devel] [PATCH for-2.0] iscsi: Don't set error if already set in iscsi_do_inquiry
Am 04.04.2014 um 13:57 hat Paolo Bonzini geschrieben: Il 04/04/2014 13:53, Fam Zheng ha scritto: This eliminates the possible assertion failure in error_setg(). Signed-off-by: Fam Zheng f...@redhat.com --- block/iscsi.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/block/iscsi.c b/block/iscsi.c index 21c18a3..64a509f 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -1101,8 +1101,10 @@ static struct scsi_task *iscsi_do_inquiry(struct iscsi_context *iscsi, int lun, return task; fail: -error_setg(errp, iSCSI: Inquiry command failed : %s, - iscsi_get_error(iscsi)); +if (!error_is_set(errp)) { +error_setg(errp, iSCSI: Inquiry command failed : %s, + iscsi_get_error(iscsi)); +} if (task != NULL) { scsi_free_scsi_task(task); } Reviewed-by: Paolo Bonzini pbonz...@redhat.com Kevin, Stefan, can you send the pull request for this? Thanks, applied to the block branch. Kevin
Re: [Qemu-devel] [PATCH v3 1/1] char/serial: Fix emptyness handling
On Fri, Mar 28, 2014 at 10:10 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 28/03/2014 12:43, Don Slutz ha scritto: Ping. (Since this is a bug fix, I think it can go into 2.0) -Don Slutz I think the problem is that not many people understand the 8250 device model. CCing someone who hopefully does... I have a bit of experience with 16550 :) Ill push for a merge on this one. Regards, Peter Paolo On 03/18/14 12:29, Don Slutz wrote: The commit 88c1ee73d3231c74ff90bcfc084a7589670ec244 char/serial: Fix emptyness check Still causes extra NULL byte(s) to be sent. So if the fifo is empty, do not send an extra NULL byte. Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com Signed-off-by: Don Slutz dsl...@verizon.com --- Changes V2 to v3 Revert v2 changes Fix coding style issues. Changes v1 to v2 Do full state change on fifo8_is_empty. hw/char/serial.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/hw/char/serial.c b/hw/char/serial.c index 6d3b5af..f4d167f 100644 --- a/hw/char/serial.c +++ b/hw/char/serial.c @@ -225,8 +225,10 @@ static gboolean serial_xmit(GIOChannel *chan, GIOCondition cond, void *opaque) if (s-tsr_retry = 0) { if (s-fcr UART_FCR_FE) { -s-tsr = fifo8_is_empty(s-xmit_fifo) ? -0 : fifo8_pop(s-xmit_fifo); +if (fifo8_is_empty(s-xmit_fifo)) { +return FALSE; +} +s-tsr = fifo8_pop(s-xmit_fifo); if (!s-xmit_fifo.num) { s-lsr |= UART_LSR_THRE; }
Re: [Qemu-devel] [PATCH v2] pseries: Update SLOF firmware image to qemu-slof-20140404
On 04/04/2014 02:57 AM, Alexey Kardashevskiy wrote: The change log is: Isolate sc 1 detection logic build: auto-detect ppc64 architecture cas: increase hcall buffer size to accomodate 256 cpus usb: change device tree naming usb-core: adjust port numbers in set_address virtio-scsi: correct srplun comment Fix kernel loading Workaround to make grub2 assign server ip from dhcp ack packet only ELF: Enter LE binary in LE mode ELF loading should fail for virt != phys Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru Thanks, applied to ppc-next. Let's shoot this for 2.0! Alex
Re: [Qemu-devel] [PATCH] spapr_nvram: Correct max nvram size
On 04/04/2014 09:26 AM, Alexey Kardashevskiy wrote: Currently it is UINT16_MAX*16 = 65536*16 = 1048560 which is not a round number and therefore a bit confusing. This defines MAX_NVRAM_SIZE precisely as 1MB. Suggested-by: Thomas Huth th...@linux.vnet.ibm.com Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru Thanks, applied to ppc-next-2.1. Alex
[Qemu-devel] [PATCH] qcow2: Put cache reference in error case
When qcow2_get_cluster_offset() sees a zero cluster in a version 2 image, it (rightfully) returns an error. But in doing so it shouldn't leak an L2 table cache reference. Signed-off-by: Kevin Wolf kw...@redhat.com --- block/qcow2-cluster.c | 1 + 1 file changed, 1 insertion(+) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 65550fd..54946c7 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -495,6 +495,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset, break; case QCOW2_CLUSTER_ZERO: if (s-qcow_version 3) { +qcow2_cache_put(bs, s-l2_table_cache, (void**) l2_table); return -EIO; } c = count_contiguous_clusters(nb_clusters, s-cluster_size, -- 1.8.3.1
Re: [Qemu-devel] [PATCH for-2.0] iscsi: Don't set error if already set in iscsi_do_inquiry
On Fri, Apr 04, 2014 at 07:53:29PM +0800, Fam Zheng wrote: This eliminates the possible assertion failure in error_setg(). Signed-off-by: Fam Zheng f...@redhat.com --- block/iscsi.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
Re: [Qemu-devel] [PATCH 2/4] spapr: Enable DABRX special register
On 04/04/2014 08:13 AM, Alexey Kardashevskiy wrote: On 04/04/2014 12:19 AM, Alexander Graf wrote: On 03.04.14 15:14, Alexey Kardashevskiy wrote: This advertises Data Address Breakpoint Register Extension (DABRX) to the guest via hyperrtas list and enables it to migrate. Do all CPUs we support (970 anyone) have DABRX support? 970MP and 970FX do. Support them too? Who cares? :) Well, we do support running KVM on these and for KVM we default to -cpu host, so they're important to keep in mind. Also who handles this hcall in the TCG case? Good point... What about older host kernels that don't support xdabr yet? They will ignore FW_FEATURE_XDABR, no? How does a host kernel ignore anything here? We're telling the guest that we support an hcall without asking the host at all. What about PR KVM? Oh. Nothing. And we do not want to make this hcall-xdabr conditional, right? Drop the whole patch? I am really confused now. I think we should properly check whether we can handle this hcall and/or implement a handler for TCG and hosts which don't handle it themselves. Alex
Re: [Qemu-devel] [PATCH 3/4] KVM: PPC: Support POWER8 registers
On 04/04/2014 08:58 AM, Alexey Kardashevskiy wrote: On 04/04/2014 06:12 AM, Tom Musta wrote: On 4/3/2014 8:33 AM, Alexander Graf wrote: On 03.04.14 15:14, Alexey Kardashevskiy wrote: This enabled KVM and migration support for a number of POWER8 registers: snip Tom, please have a look through this as well :). --- snip --- a/target-ppc/cpu.h +++ b/target-ppc/cpu.h @@ -426,6 +426,8 @@ struct ppc_slb_t { #define MSR_TAG 62 /* Tag-active mode (POWERx ?) */ #define MSR_ISF 61 /* Sixty-four-bit interrupt mode on 630 */ #define MSR_SHV 60 /* hypervisor state hflags */ +#define MSR_TS 33 /* Transactional state, 2 bits (Book3s) */ 2 bits means you want to add another define or at least a comment at bit 34. I find it rather counterintuitive to declare bit 33 MSR_TS as 2-bit wide too - you'd expect (3 MSR_TS) gives you the right mask, but then it'd have to be 34, no? Is this better? #define MSR_TS0 34 #define MSR_TS1 33 You should also add a decoder: #define msr_ts ((env-msr MSR_TS1) 3) Yes, this is better. Thanks! +target_ulong tm_gpr[32]; +ppc_avr_t tm_vsr[64]; +uint64_t tm_cr; +uint64_t tm_lr; +uint64_t tm_ctr; +uint64_t tm_fpscr; +uint64_t tm_amr; +uint64_t tm_ppr; +uint64_t tm_vrsave; +uint32_t tm_vscr; +uint64_t tm_dscr; +uint64_t tm_tar; }; If vscr is declared as 32 bits, should CR and VRSAVE also be 32-bits? Nope. linux-headers/asm-powerpc/kvm.h: #define KVM_REG_PPC_TM_CR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x60) #define KVM_REG_PPC_TM_LR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x61) #define KVM_REG_PPC_TM_CTR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x62) #define KVM_REG_PPC_TM_FPSCR(KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x63) #define KVM_REG_PPC_TM_AMR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x64) #define KVM_REG_PPC_TM_PPR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x65) #define KVM_REG_PPC_TM_VRSAVE (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x66) #define KVM_REG_PPC_TM_VSCR (KVM_REG_PPC_TM | KVM_REG_SIZE_U32 | 0x67) #define KVM_REG_PPC_TM_DSCR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x68) #define KVM_REG_PPC_TM_TAR (KVM_REG_PPC_TM | KVM_REG_SIZE_U64 | 0x69) For the reason unknown (Paul is not in the office to ask) all TM-related registers are defined as 64bit and only VSCR is 32bit. And this is in the host kernel already. diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 9974b10..ead69fa 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -866,6 +866,25 @@ int kvm_arch_put_registers(CPUState *cs, int level) } #ifdef TARGET_PPC64 +if ((cpu-env.msr MSR_TS) 3) { Ah, it works because you're shifting the other direction. That works. How about we just introduce an msr_ts() helper similar to the other lower case helpers to make this obvious? Agreed. diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c index 1627bb0..4b20c29 100644 --- a/target-ppc/translate_init.c +++ b/target-ppc/translate_init.c @@ -7025,14 +7025,22 @@ static void init_proc_POWER7 (CPUPPCState *env) SPR_NOACCESS, SPR_NOACCESS, spr_read_generic, SPR_NOACCESS, 0x8080); -spr_register(env, SPR_VRSAVE, SPR_VRSAVE, - spr_read_generic, spr_write_generic, - spr_read_generic, spr_write_generic, - 0x); -spr_register(env, SPR_PPR, PPR, - spr_read_generic, spr_write_generic, - spr_read_generic, spr_write_generic, - 0x); +spr_register_kvm(env, SPR_VRSAVE, VRSAVE, + spr_read_generic, spr_write_generic, + spr_read_generic, spr_write_generic, + KVM_REG_PPC_VRSAVE, 0x); +spr_register_kvm(env, SPR_PPR, PPR, + spr_read_generic, spr_write_generic, + spr_read_generic, spr_write_generic, + KVM_REG_PPC_PPR, 0x); +spr_register_kvm(env, SPR_BOOK3S_SIAR, SIAR, + spr_read_generic, spr_write_generic, + spr_read_generic, spr_write_generic, + KVM_REG_PPC_SIAR, 0x); +spr_register_kvm(env, SPR_BOOK3S_SDAR, SDAR, + spr_read_generic, spr_write_generic, + spr_read_generic, spr_write_generic, + KVM_REG_PPC_SDAR, 0x); /* Logical partitionning */ spr_register_kvm(env, SPR_LPCR, LPCR, SPR_NOACCESS, SPR_NOACCESS, These need to go into P8 as well? (see my comment for patch 2). Yes. VRSAVE, SIAR, SDAR are even defined for 970 (Alex, should I add them to 970 definitions?), PPR is not defined in any 970 spec but is in 2.04..2.07. Anything that 970 can work with should be handled in the 970 case as well, yes. Keep in mind that we
Re: [Qemu-devel] [RFC PATCH] target-ppc: enable migration within the same CPU family
On 04/04/2014 07:17 AM, Alexey Kardashevskiy wrote: On 03/24/2014 04:28 PM, Alexey Kardashevskiy wrote: Currently only migration fails if CPU version is different even a bit. For example, migration from POWER7 v2.0 to POWER7 v2.1 fails because of that. Since there is no difference between CPU versions which could affect migration stream, we can safely enable it. This adds a helper to find the closest POWERPC family class (i.e. first abstract class in hierarchy). This replaces VMSTATE_UINTTL_EQUAL statement with a custom handler which checks if the source and destination CPUs belong to the same family and fails if they are not. This adds a PVR reset to the default value as it will be overwritten by VMSTATE_UINTTL_ARRAY(env.spr, PowerPCCPU, 1024). Since the actual migration format is not changed by this patch, @version_id of vmstate_ppc_cpu does not have to be changed either. Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru Ping? Can't we just always allow migration to succeed? It's a problem of the tool stack above if it allows migration to an incompatible host, no? Alex
Re: [Qemu-devel] [PATCH v5 34/37] target-arm: Implement CBAR for Cortex-A57
On Fri, Apr 4, 2014 at 6:25 PM, Peter Maydell peter.mayd...@linaro.org wrote: On 4 April 2014 06:32, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: +if (arm_feature(env, ARM_FEATURE_AARCH64)) { +/* 32 bit view is [31:18] 0...0 [43:32]. */ +uint32_t cbar32 = cpu-reset_cbar Should you extract64 on the lower order bits as well to avoid weird | results on a misaligned reset_cbar (or perhaps its worth an assert?). Can't assert, it's a QOM property; we could perhaps validate earlier on in init, Is realize allowed to fail due to bad property values? Thinking more about it, perhaps the ideal solution is to populate the Error ** passed to realize and bail out and let the realize() caller deal with it. but that might be a bit painful to find a suitable place to put it. extracting the low bits too seems a reasonable compromise. Ok, Regards, Peter +| extract64(cpu-reset_cbar, 32, 12); +ARMCPRegInfo cbar_reginfo[] = { +{ .name = CBAR, + .type = ARM_CP_CONST, + .cp = 15, .crn = 15, .crm = 0, .opc1 = 4, .opc2 = 0, + .access = PL1_R, .resetvalue = cpu-reset_cbar }, +{ .name = CBAR_EL1, .state = ARM_CP_STATE_AA64, + .type = ARM_CP_CONST, + .opc0 = 3, .opc1 = 1, .crn = 15, .crm = 3, .opc2 = 0, + .access = PL1_R, .resetvalue = cbar32 }, +REGINFO_SENTINEL +}; +/* We don't implement a r/w 64 bit CBAR currently */ Is it even valid? Writable CBAR seems like a bug to me that's just fixed in the V8 version bump. This is all IMPDEF anyway (and as you'll see from the next patch A15's CBAR is RO too). R/W CBAR isn't an inherently dumb idea: you could use it to make system designs where setting CBAR is the responsibility of board-specific bootrom or firmware before handoff to tho OS, for instance. Having the h/w hardwire it is probably more robust though. thanks -- PMM
Re: [Qemu-devel] [PATCH 2/4] spapr: Enable DABRX special register
On 4/3/2014 7:51 PM, Alexey Kardashevskiy wrote: Since I'll be touching this code soon, I can make copy content of init_proc_POWER7 to init_proc_POWER8 and remove DABRX if this is what you mean. Ok? Yes it is. Thanks, Alexey.
Re: [Qemu-devel] [PATCH v5 34/37] target-arm: Implement CBAR for Cortex-A57
On 4 April 2014 13:32, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: On Fri, Apr 4, 2014 at 6:25 PM, Peter Maydell peter.mayd...@linaro.org wrote: On 4 April 2014 06:32, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: +if (arm_feature(env, ARM_FEATURE_AARCH64)) { +/* 32 bit view is [31:18] 0...0 [43:32]. */ +uint32_t cbar32 = cpu-reset_cbar Should you extract64 on the lower order bits as well to avoid weird | results on a misaligned reset_cbar (or perhaps its worth an assert?). Can't assert, it's a QOM property; we could perhaps validate earlier on in init, Is realize allowed to fail due to bad property values? Yes; see for instance hw/intc/arm_gic_common.c : realize gets an Error** so it can fail nicely in this situation. Thinking more about it, perhaps the ideal solution is to populate the Error ** passed to realize and bail out and let the realize() caller deal with it. Yep... but that might be a bit painful to find a suitable place to put it. extracting the low bits too seems a reasonable compromise. ...but as I say it doesn't really seem worth messing about plumbing the Error** into the right places for this corner case. thanks -- PMM
Re: [Qemu-devel] [PATCH] spapr: skip adding usb keyboard/mouse in case of -nodefaults
Am 04.04.2014 10:28, schrieb Nikunj A Dadhania: diff --git a/vl.c b/vl.c index 017f92d..0d6c36c 100644 --- a/vl.c +++ b/vl.c @@ -4348,7 +4348,8 @@ int main(int argc, char **argv, char **envp) .kernel_filename = kernel_filename, .kernel_cmdline = kernel_cmdline, .initrd_filename = initrd_filename, - .cpu_model = cpu_model }; + .cpu_model = cpu_model, + .has_defaults = has_defaults, }; If we do this, please put }; on the next line so that the next person appending something doesn't need to touch it again. :) I do agree with the others that libvirt shouldn't be using the legacy -usb option. Cheers, Andreas machine-init(args); audio_init(); -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH v5 06/37] target-arm: Provide syndrome information for MMU faults
On 1 April 2014 04:10, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote: On Sat, Mar 29, 2014 at 2:09 AM, Peter Maydell peter.mayd...@linaro.org wrote: +static inline uint32_t syn_insn_abort(bool same_el, int ea, int s1ptw, int fsc) Why the mix of bools and ints for the 1 bit fields? Lack of a strong opinion plus writing the code at different times. I'll make them all int I guess. thanks -- PMM
Re: [Qemu-devel] [PATCH v7] target-ppc: gdbstub allow byte swapping for reading/writing registers
On 4/1/2014 3:03 PM, Thomas Falcon wrote: This patch allows registers to be properly read from and written to when using the gdbstub to debug a ppc guest running in little endian mode. It accomplishes this goal by byte swapping the values of any registers if the MSR:LE value is set. Signed-off-by: Thomas Falcon tlfal...@linux.vnet.ibm.com snip index 1c91090..92649ed 100644 --- a/target-ppc/gdbstub.c +++ b/target-ppc/gdbstub.c @@ -21,6 +21,82 @@ #include qemu-common.h #include exec/gdbstub.h +static int ppc_cpu_gdb_register_len(int n) +{ +switch (n) { +case 0 ... 31: +/* gprs */ +return sizeof(target_ulong); +case 32 ... 63: +/* fprs */ +if (gdb_has_xml) { +return 0; +} +return 8; The magic numbers in the case statement are not ideal but there is precedent in the code already for this. Reviewed-by: Tom Musta tommu...@gmail.com Tested-by: Tom Musta tommu...@gmail.com
[Qemu-devel] [PATCH 01/35] qemu-option: introduce qemu_find_opts_singleton
From: Paolo Bonzini pbonz...@redhat.com Reviewed-by: Laszlo Ersek ler...@redhat.com Reviewed-by: Andreas Färber afaer...@suse.de Signed-off-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Igor Mammedov imamm...@redhat.com --- include/qemu/config-file.h | 2 ++ util/qemu-config.c | 14 ++ vl.c | 11 +-- 3 files changed, 17 insertions(+), 10 deletions(-) diff --git a/include/qemu/config-file.h b/include/qemu/config-file.h index dbd97c4..d4ba20e 100644 --- a/include/qemu/config-file.h +++ b/include/qemu/config-file.h @@ -8,6 +8,8 @@ QemuOptsList *qemu_find_opts(const char *group); QemuOptsList *qemu_find_opts_err(const char *group, Error **errp); +QemuOpts *qemu_find_opts_singleton(const char *group); + void qemu_add_opts(QemuOptsList *list); void qemu_add_drive_opts(QemuOptsList *list); int qemu_set_option(const char *str); diff --git a/util/qemu-config.c b/util/qemu-config.c index f610101..60051df 100644 --- a/util/qemu-config.c +++ b/util/qemu-config.c @@ -39,6 +39,20 @@ QemuOptsList *qemu_find_opts(const char *group) return ret; } +QemuOpts *qemu_find_opts_singleton(const char *group) +{ +QemuOptsList *list; +QemuOpts *opts; + +list = qemu_find_opts(group); +assert(list); +opts = qemu_opts_find(list, NULL); +if (!opts) { +opts = qemu_opts_create(list, NULL, 0, error_abort); +} +return opts; +} + static CommandLineParameterInfoList *query_option_descs(const QemuOptDesc *desc) { CommandLineParameterInfoList *param_list = NULL, *entry; diff --git a/vl.c b/vl.c index 9975e5a..1dd6319 100644 --- a/vl.c +++ b/vl.c @@ -517,16 +517,7 @@ static QemuOptsList qemu_name_opts = { */ QemuOpts *qemu_get_machine_opts(void) { -QemuOptsList *list; -QemuOpts *opts; - -list = qemu_find_opts(machine); -assert(list); -opts = qemu_opts_find(list, NULL); -if (!opts) { -opts = qemu_opts_create(list, NULL, 0, error_abort); -} -return opts; +return qemu_find_opts_singleton(machine); } const char *qemu_get_vm_name(void) -- 1.9.0
[Qemu-devel] [PATCH 00/35] pc: ACPI memory hotplug
What's new since v7: * Per Andreas' suggestion dropped DIMMBus concept. * Added hotplug binding for bus-less devices * DIMM device is split to backend and frontend. Therefore following command/options were added for supporting it: For memory-ram backend: CLI: -object-add memory-ram, with options: 'id' and 'size' For dimm frontend: option size became readonly, pulling it's size from attached backend added option memdev for specifying backend by 'id' * dropped support for 32 bit guests * failed hotplug action doesn't consume 1 slot anymore * vaious fixes adressing reviewer's comments most of them in ACPI part --- This series allows to hotplug 'arbitrary' DIMM devices specifying size, NUMA node mapping (guest side), slot and address where to map it, at runtime. Due to ACPI limitation there is need to specify a number of possible DIMM devices. For this task -m option was extended to support following format: -m [mem=]RamSize[,slots=N,maxmem=M] To allow memory hotplug user must specify a pair of additional parameters: 'slots' - number of possible increments 'maxmem' - max possible total memory size QEMU is allowed to use, including RamSize. minimal monitor command syntax to hotplug DIMM device: object_add memory-ram,id=memX,size=1G device_add dimm,id=dimmX,memdev=memX DIMM device provides following properties that could be used with device_add / -device to alter default behavior: id- unique string identifying device [mandatory] slot - number in range [0-slots) [optional], if not specified the first free slot is used node - NUMA node id [optional] (default: 0) size - amount of memory to add, readonly derived from backing memdev start - guest's physical address where to plug DIMM [optional], if not specified the first gap in hotplug memory region that fits DIMM is used -device option could be used for adding potentially hotunplugable DIMMs and also for specifying hotplugged DIMMs in migration case. Tested guests: - RHEL 6x64 - Windows 2012DCx64 - Windows 2008DCx64 Known limitations/bugs/TODOs: - hot-remove is not supported, yet - max number of supported DIMM devices 255 (due to ACPI object name limit), could be increased creating several containers and putting DIMMs there. (exercise for future) - e820 table doesn't include DIMM devices added with -device / (or after reboot devices added with device_add) - Windows 2008 remembers DIMM configuration, so if DIMM with other start/size is added into the same slot, it refuses to use it insisting on old mapping. QEMU git tree for testing is available at: https://github.com/imammedo/qemu/commits/memory-hotplug-v8 Example QEMU cmd line: qemu-system-x86_64 -enable-kvm -monitor unix:/tmp/mon,server,nowait \ -m 4096,slots=4,maxmem=8G guest.img PS: Windows guest requires SRAT table for hotplug to work so add an extra option: -numa node to QEMU command line. Igor Mammedov (34): vl: convert -m to QemuOpts object_add: allow completion handler to get canonical path add memdev backend infrastructure vl.c: extend -m option to support options for memory hotplug add pc-{i440fx,q35}-2.1 machine types pc: create custom generic PC machine type qdev: hotplug for buss-less devices qdev: expose DeviceState.hotplugged field as a property dimm: implement dimm device abstraction memory: add memory_region_is_mapped() API dimm: do not allow to set already busy memdev pc: initialize memory hotplug address space pc: exit QEMU if slots 256 pc: add 'etc/reserved-memory-end' fw_cfg interface for SeaBIOS pc: add memory hotplug handler to PC_MACHINE dimm: add busy address check and address auto-allocation dimm: add busy slot check and slot auto-allocation acpi: rename cpu_hotplug_defs.h to acpi_defs.h acpi: memory hotplug ACPI hardware implementation trace: add acpi memory hotplug IO region events trace: add DIMM slot address allocation for target-i386 acpi:piix4: make plug/unlug callbacks generic acpi:piix4: add memory hotplug handling pc: ich9 lpc: make it work with global/compat properties acpi:ich9: add memory hotplug handling pc: migrate piix4 ich9 MemHotplugState pc: propagate memory hotplug event to ACPI device pc: ACPI BIOS: punch holes in PCI0._CRS for memory hotplug IO region pc: ACPI BIOS: name CPU hotplug ACPI0004 device pc: ACPI BIOS: implement memory hotplug interface pc: ACPI BIOS: use enum for defining memory affinity flags pc: ACPI BIOS: reserve SRAT entry for hotplug mem hole pc: ACPI BIOS: make GPE.3 handle memory hotplug event on PIIX and Q35 machines pc: ACPI BIOS: update pregenerated ACPI table blobs Paolo Bonzini (1): qemu-option: introduce qemu_find_opts_singleton backends/Makefile.objs | 2 + backends/hostmem-ram.c | 54 +++ backends/hostmem.c | 110 +
[Qemu-devel] [PATCH 05/35] vl.c: extend -m option to support options for memory hotplug
Add following parameters: slots - total number of hotplug memory slots maxmem - maximum possible memory slots and maxmem should go in pair and maxmem should be greater than mem for memory hotplug to be enabled. Signed-off-by: Igor Mammedov imamm...@redhat.com --- include/hw/boards.h | 2 ++ qemu-options.hx | 9 ++--- vl.c| 51 +++ 3 files changed, 59 insertions(+), 3 deletions(-) diff --git a/include/hw/boards.h b/include/hw/boards.h index dd2c70d..3567190 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -10,6 +10,8 @@ typedef struct QEMUMachineInitArgs { const QEMUMachine *machine; ram_addr_t ram_size; +ram_addr_t maxram_size; +uint64_t ram_slots; const char *boot_order; const char *kernel_filename; const char *kernel_cmdline; diff --git a/qemu-options.hx b/qemu-options.hx index a5a412e..7934147 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -210,17 +210,20 @@ use is discouraged as it may be removed from future versions. ETEXI DEF(m, HAS_ARG, QEMU_OPTION_m, --m [size=]megs\n +-m[emory] [size=]megs[,slots=n,maxmem=size]\n configure guest RAM\n size: initial amount of guest memory (default: -stringify(DEFAULT_RAM_SIZE) MiB)\n, +stringify(DEFAULT_RAM_SIZE) MiB)\n +slots: number of hotplug slots (default: none)\n +maxmem: maximum amount of guest memory (default: none)\n, QEMU_ARCH_ALL) STEXI @item -m [size=]@var{megs} @findex -m Set virtual RAM size to @var{megs} megabytes. Default is 128 MiB. Optionally, a suffix of ``M'' or ``G'' can be used to signify a value in megabytes or -gigabytes respectively. +gigabytes respectively. Optional pair @var{slots}, @var{maxmem} could be used +to set amount of hotluggable memory slots and possible maximum amount of memory. ETEXI DEF(mem-path, HAS_ARG, QEMU_OPTION_mempath, diff --git a/vl.c b/vl.c index 2181b41..8f7b04e 100644 --- a/vl.c +++ b/vl.c @@ -520,6 +520,14 @@ static QemuOptsList qemu_mem_opts = { .name = size, .type = QEMU_OPT_SIZE, }, +{ +.name = slots, +.type = QEMU_OPT_NUMBER, +}, +{ +.name = maxmem, +.type = QEMU_OPT_SIZE, +}, { /* end of list */ } }, }; @@ -2975,6 +2983,8 @@ int main(int argc, char **argv, char **envp) const char *trace_file = NULL; const ram_addr_t default_ram_size = (ram_addr_t)DEFAULT_RAM_SIZE * 1024 * 1024; +ram_addr_t maxram_size = default_ram_size; +uint64_t ram_slots = 0; atexit(qemu_run_exit_notifiers); error_set_progname(argv[0]); @@ -3312,6 +3322,7 @@ int main(int argc, char **argv, char **envp) case QEMU_OPTION_m: { uint64_t sz; const char *mem_str; +const char *maxmem_str, *slots_str; opts = qemu_opts_parse(qemu_find_opts(memory), optarg, 1); @@ -3353,6 +3364,44 @@ int main(int argc, char **argv, char **envp) error_report(ram size too large); exit(EXIT_FAILURE); } + +maxmem_str = qemu_opt_get(opts, maxmem); +slots_str = qemu_opt_get(opts, slots); +if (maxmem_str slots_str) { +uint64_t slots; + +sz = qemu_opt_get_size(opts, maxmem, 0); +if (sz ram_size) { +fprintf(stderr, qemu: invalid -m option value: maxmem +(% PRIu64 ) = initial memory (% +PRIu64 )\n, sz, ram_size); +exit(EXIT_FAILURE); +} + +slots = qemu_opt_get_number(opts, slots, 0); +if ((sz ram_size) !slots) { +fprintf(stderr, qemu: invalid -m option value: maxmem +(% PRIu64 ) more than initial memory (% +PRIu64 ) but no hotplug slots where +specified\n, sz, ram_size); +exit(EXIT_FAILURE); +} + +if ((sz = ram_size) slots) { +fprintf(stderr, qemu: invalid -m option value: % +PRIu64 hotplug slots where specified but +maxmem (% PRIu64 ) = initial memory (% +PRIu64 )\n, slots, sz, ram_size); +exit(EXIT_FAILURE); +} +maxram_size = sz; +ram_slots = slots; +} else if ((!maxmem_str slots_str) || + (maxmem_str !slots_str)) { +
[Qemu-devel] [PATCH 04/35] add memdev backend infrastructure
Provides framework for splitting host RAM allocation/ policies into a separate backend that could be used by devices. Initially only legacy RAM backend is provided, which uses memory_region_init_ram() allocator and compatible with every CLI option that affects memory_region_init_ram(). Signed-off-by: Igor Mammedov imamm...@redhat.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- backends/Makefile.objs | 2 + backends/hostmem-ram.c | 54 +++ backends/hostmem.c | 110 +++ include/sysemu/hostmem.h | 60 ++ 4 files changed, 226 insertions(+) create mode 100644 backends/hostmem-ram.c create mode 100644 backends/hostmem.c create mode 100644 include/sysemu/hostmem.h diff --git a/backends/Makefile.objs b/backends/Makefile.objs index 42557d5..e6bdc11 100644 --- a/backends/Makefile.objs +++ b/backends/Makefile.objs @@ -6,3 +6,5 @@ common-obj-$(CONFIG_BRLAPI) += baum.o $(obj)/baum.o: QEMU_CFLAGS += $(SDL_CFLAGS) common-obj-$(CONFIG_TPM) += tpm.o + +common-obj-y += hostmem.o hostmem-ram.o diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c new file mode 100644 index 000..cbf7e5a --- /dev/null +++ b/backends/hostmem-ram.c @@ -0,0 +1,54 @@ +/* + * QEMU Host Memory Backend + * + * Copyright (C) 2013 Red Hat Inc + * + * Authors: + * Igor Mammedov imamm...@redhat.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#include sysemu/hostmem.h +#include qom/object_interfaces.h + +#define TYPE_MEMORY_BACKEND_RAM memory-ram + + +static void +ram_backend_memory_init(UserCreatable *uc, Error **errp) +{ +HostMemoryBackend *backend = MEMORY_BACKEND(uc); +char *path; + +if (!backend-size) { +error_setg(errp, can't create backend with size 0); +return; +} + +path = object_get_canonical_path_component(OBJECT(backend)); +memory_region_init_ram(backend-mr, OBJECT(backend), path, + backend-size); +g_free(path); +} + +static void +ram_backend_class_init(ObjectClass *oc, void *data) +{ +UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc); + +ucc-complete = ram_backend_memory_init; +} + +static const TypeInfo ram_backend_info = { +.name = TYPE_MEMORY_BACKEND_RAM, +.parent = TYPE_MEMORY_BACKEND, +.class_init = ram_backend_class_init, +}; + +static void register_types(void) +{ +type_register_static(ram_backend_info); +} + +type_init(register_types); diff --git a/backends/hostmem.c b/backends/hostmem.c new file mode 100644 index 000..0bd3900 --- /dev/null +++ b/backends/hostmem.c @@ -0,0 +1,110 @@ +/* + * QEMU Host Memory Backend + * + * Copyright (C) 2013 Red Hat Inc + * + * Authors: + * Igor Mammedov imamm...@redhat.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#include sysemu/hostmem.h +#include sysemu/sysemu.h +#include qapi/visitor.h +#include qapi/qmp/qerror.h +#include qemu/config-file.h +#include qom/object_interfaces.h + +static void +hostmemory_backend_get_size(Object *obj, Visitor *v, void *opaque, +const char *name, Error **errp) +{ +HostMemoryBackend *backend = MEMORY_BACKEND(obj); +uint64_t value = backend-size; + +visit_type_size(v, value, name, errp); +} + +static void +hostmemory_backend_set_size(Object *obj, Visitor *v, void *opaque, +const char *name, Error **errp) +{ +HostMemoryBackend *backend = MEMORY_BACKEND(obj); +uint64_t value; + +if (memory_region_size(backend-mr)) { +error_setg(errp, cannot change property value\n); +return; +} + +visit_type_size(v, value, name, errp); +if (error_is_set(errp)) { +return; +} +if (!value) { +error_setg(errp, Property '%s.%s' doesn't take value '% PRIu64 ', + object_get_typename(obj), name , value); +return; +} +backend-size = value; +} + +static void hostmemory_backend_initfn(Object *obj) +{ +object_property_add(obj, size, int, +hostmemory_backend_get_size, +hostmemory_backend_set_size, NULL, NULL, NULL); +} + +static void hostmemory_backend_finalize(Object *obj) +{ +HostMemoryBackend *backend = MEMORY_BACKEND(obj); + +if (memory_region_size(backend-mr)) { +memory_region_destroy(backend-mr); +} +} + +static void +hostmemory_backend_memory_init(UserCreatable *uc, Error **errp) +{ +error_setg(errp, memory_init is not implemented for type [%s], + object_get_typename(OBJECT(uc))); +} + +MemoryRegion * +host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp) +{ +return memory_region_size(backend-mr) ? backend-mr : NULL; +} + +static void +hostmemory_backend_class_init(ObjectClass *oc, void *data) +{
[Qemu-devel] [PATCH 02/35] vl: convert -m to QemuOpts
Adds option to -m size - startup memory amount For compatibility with legacy CLI if suffix-less number is passed, it assumes amount in Mb. Otherwise user is free to use suffixed number using suffixes b,k/K,M,G Signed-off-by: Igor Mammedov imamm...@redhat.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com --- qemu-options.hx | 9 +--- vl.c| 71 + 2 files changed, 63 insertions(+), 17 deletions(-) diff --git a/qemu-options.hx b/qemu-options.hx index 2d33815..a5a412e 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -210,10 +210,13 @@ use is discouraged as it may be removed from future versions. ETEXI DEF(m, HAS_ARG, QEMU_OPTION_m, --m megs set virtual RAM size to megs MB [default= -stringify(DEFAULT_RAM_SIZE) ]\n, QEMU_ARCH_ALL) +-m [size=]megs\n +configure guest RAM\n +size: initial amount of guest memory (default: +stringify(DEFAULT_RAM_SIZE) MiB)\n, +QEMU_ARCH_ALL) STEXI -@item -m @var{megs} +@item -m [size=]@var{megs} @findex -m Set virtual RAM size to @var{megs} megabytes. Default is 128 MiB. Optionally, a suffix of ``M'' or ``G'' can be used to signify a value in megabytes or diff --git a/vl.c b/vl.c index 1dd6319..67fd5bb 100644 --- a/vl.c +++ b/vl.c @@ -510,6 +510,20 @@ static QemuOptsList qemu_name_opts = { }, }; +static QemuOptsList qemu_mem_opts = { +.name = memory, +.implied_opt_name = size, +.head = QTAILQ_HEAD_INITIALIZER(qemu_mem_opts.head), +.merge_lists = true, +.desc = { +{ +.name = size, +.type = QEMU_OPT_SIZE, +}, +{ /* end of list */ } +}, +}; + /** * Get machine options * @@ -2955,6 +2969,8 @@ int main(int argc, char **argv, char **envp) }; const char *trace_events = NULL; const char *trace_file = NULL; +const ram_addr_t default_ram_size = (ram_addr_t)DEFAULT_RAM_SIZE * +1024 * 1024; atexit(qemu_run_exit_notifiers); error_set_progname(argv[0]); @@ -2978,6 +2994,7 @@ int main(int argc, char **argv, char **envp) qemu_add_opts(qemu_trace_opts); qemu_add_opts(qemu_option_rom_opts); qemu_add_opts(qemu_machine_opts); +qemu_add_opts(qemu_mem_opts); qemu_add_opts(qemu_smp_opts); qemu_add_opts(qemu_boot_opts); qemu_add_opts(qemu_sandbox_opts); @@ -3002,7 +3019,7 @@ int main(int argc, char **argv, char **envp) module_call_init(MODULE_INIT_MACHINE); machine_class = find_default_machine(); cpu_model = NULL; -ram_size = 0; +ram_size = default_ram_size; snapshot = 0; cyls = heads = secs = 0; translation = BIOS_ATA_TRANSLATION_AUTO; @@ -3289,20 +3306,48 @@ int main(int argc, char **argv, char **envp) exit(0); break; case QEMU_OPTION_m: { -int64_t value; uint64_t sz; -char *end; +const char *mem_str; -value = strtosz(optarg, end); -if (value 0 || *end) { -fprintf(stderr, qemu: invalid ram size: %s\n, optarg); -exit(1); +opts = qemu_opts_parse(qemu_find_opts(memory), + optarg, 1); +if (!opts) { +exit(EXIT_FAILURE); +} + +mem_str = qemu_opt_get(opts, size); +if (!mem_str) { +error_report(invalid -m option, missing 'size' option); +exit(EXIT_FAILURE); +} +if (!*mem_str) { +error_report(missing 'size' option value); +exit(EXIT_FAILURE); +} + +sz = qemu_opt_get_size(opts, size, ram_size); + +/* Fix up legacy suffix-less format */ +if (g_ascii_isdigit(mem_str[strlen(mem_str) - 1])) { +uint64_t overflow_check = sz; + +sz = 20; +if ((sz 20) != overflow_check) { +error_report(too large 'size' option value); +exit(EXIT_FAILURE); +} +} + +/* backward compatibility behaviour for case -m 0 */ +if (sz == 0) { +sz = default_ram_size; } -sz = QEMU_ALIGN_UP((uint64_t)value, 8192); + +sz = QEMU_ALIGN_UP(sz, 8192); ram_size = sz; if (ram_size != sz) { -fprintf(stderr, qemu: ram size too large\n); -exit(1); +error_report(ram size too large); +exit(EXIT_FAILURE); } break; } @@ -4145,10 +4190,8 @@ int
[Qemu-devel] [PATCH 08/35] qdev: hotplug for buss-less devices
Adds get_hotplug_handler() method to machine, and makes bus-less device to use it during hotplug as a means to discover hotplug handler controller. Returned controller is used to permorm a hotplug action. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/core/qdev.c | 13 + include/hw/boards.h | 8 2 files changed, 21 insertions(+) diff --git a/hw/core/qdev.c b/hw/core/qdev.c index 60f9df1..50bb8f5 100644 --- a/hw/core/qdev.c +++ b/hw/core/qdev.c @@ -34,6 +34,7 @@ #include qapi/qmp/qjson.h #include monitor/monitor.h #include hw/hotplug.h +#include hw/boards.h int qdev_hotplug = 0; static bool qdev_hot_added = false; @@ -761,6 +762,18 @@ static void device_set_realized(Object *obj, bool value, Error **err) local_err == NULL) { hotplug_handler_plug(dev-parent_bus-hotplug_handler, dev, local_err); +} else if (local_err == NULL + object_dynamic_cast(qdev_get_machine(), TYPE_MACHINE)) { +HotplugHandler *hotplug_ctrl; +MachineState *machine = MACHINE(qdev_get_machine()); +MachineClass *mc = MACHINE_GET_CLASS(machine); + +if (mc-get_hotplug_handler) { +hotplug_ctrl = mc-get_hotplug_handler(machine, dev); +if (hotplug_ctrl) { +hotplug_handler_plug(hotplug_ctrl, dev, local_err); +} +} } if (qdev_get_vmsd(dev) local_err == NULL) { diff --git a/include/hw/boards.h b/include/hw/boards.h index 3567190..67750b5 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -73,6 +73,11 @@ extern MachineState *current_machine; /** * MachineClass: * @qemu_machine: #QEMUMachine + * @get_hotplug_handler: this function is called during bus-less + *device hotplug. If defined it returns pointer to an instance + *of HotplugHandler object, which handles hotplug operation + *for a given @dev. It may return NULL if @dev doesn't require + *any actions to be performed by hotplug handler. */ struct MachineClass { /* private */ @@ -80,6 +85,9 @@ struct MachineClass { /* public */ QEMUMachine *qemu_machine; + +HotplugHandler *(*get_hotplug_handler)(MachineState *machine, + DeviceState *dev); }; /** -- 1.9.0
[Qemu-devel] [PATCH 14/35] pc: exit QEMU if slots 256
... which is current ACPI implementation limit. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/i386/pc.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 69e4225..6fe1803 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1204,6 +1204,12 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory, ram_addr_t hotplug_mem_size = machine-init_args.maxram_size - ram_size; +if (machine-init_args.ram_slots 256) { +error_report(unsupported amount of memory slots: %PRIu64, + machine-init_args.ram_slots); +exit(EXIT_FAILURE); +} + pcms-hotplug_memory_base = ROUND_UP(0x1ULL + above_4g_mem_size, 1ULL 30); -- 1.9.0
[Qemu-devel] [PATCH 11/35] memory: add memory_region_is_mapped() API
which allows to check if MemoryRegion is already mapped. Signed-off-by: Igor Mammedov imamm...@redhat.com --- include/exec/memory.h | 8 memory.c | 15 ++- 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index c084db2..200d3c5 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -848,6 +848,14 @@ void memory_region_set_alias_offset(MemoryRegion *mr, bool memory_region_present(MemoryRegion *parent, hwaddr addr); /** + * memory_region_is_mapped: returns true if #MemoryRegion is mapped + * into any address space. + * + * @mr: a #MemoryRegion which should be checked if it's mapped + */ +bool memory_region_is_mapped(MemoryRegion *mr); + +/** * memory_region_find: translate an address/size relative to a * MemoryRegion into a #MemoryRegionSection. * diff --git a/memory.c b/memory.c index 3f1df23..f4d8e69 100644 --- a/memory.c +++ b/memory.c @@ -492,7 +492,7 @@ static AddressSpace *memory_region_to_address_space(MemoryRegion *mr) return as; } } -abort(); +return NULL; } /* Render a memory region into the global view. Ranges in @view obscure @@ -1569,6 +1569,16 @@ bool memory_region_present(MemoryRegion *parent, hwaddr addr) return true; } +bool memory_region_is_mapped(MemoryRegion *mr) +{ +mr = memory_region_find(mr, 0, 1).mr; +if (!mr) { +return false; +} +memory_region_unref(mr); +return true; +} + MemoryRegionSection memory_region_find(MemoryRegion *mr, hwaddr addr, uint64_t size) { @@ -1586,6 +1596,9 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr, } as = memory_region_to_address_space(root); +if (!as) { +return ret; +} range = addrrange_make(int128_make64(addr), int128_make64(size)); view = address_space_get_flatview(as); -- 1.9.0
[Qemu-devel] [PATCH 25/35] pc: ich9 lpc: make it work with global/compat properties
Propeties of object should be available after its instances_init() callback is finished and not added in PCIDeviceClass.init which is roughly corresponds to realize() method. Moving properties adding into instances_init will fix missing property error when global/compat property mechanism is used. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/isa/lpc_ich9.c | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index 51ce12d..46de3b6 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -563,7 +563,14 @@ static void ich9_lpc_add_properties(ICH9LPCState *lpc) ich9_pm_add_properties(OBJECT(lpc), lpc-pm, NULL); } -static int ich9_lpc_initfn(PCIDevice *d) +static void ich9_lpc_initfn(Object *obj) +{ +ICH9LPCState *lpc = ICH9_LPC_DEVICE(obj); + +ich9_lpc_add_properties(lpc); +} + +static int ich9_lpc_init(PCIDevice *d) { ICH9LPCState *lpc = ICH9_LPC_DEVICE(d); ISABus *isa_bus; @@ -589,9 +596,6 @@ static int ich9_lpc_initfn(PCIDevice *d) memory_region_add_subregion_overlap(pci_address_space_io(d), ICH9_RST_CNT_IOPORT, lpc-rst_cnt_mem, 1); - -ich9_lpc_add_properties(lpc); - return 0; } @@ -642,7 +646,7 @@ static void ich9_lpc_class_init(ObjectClass *klass, void *data) set_bit(DEVICE_CATEGORY_BRIDGE, dc-categories); dc-reset = ich9_lpc_reset; -k-init = ich9_lpc_initfn; +k-init = ich9_lpc_init; dc-vmsd = vmstate_ich9_lpc; k-config_write = ich9_lpc_config_write; dc-desc = ICH9 LPC bridge; @@ -661,6 +665,7 @@ static const TypeInfo ich9_lpc_info = { .name = TYPE_ICH9_LPC_DEVICE, .parent = TYPE_PCI_DEVICE, .instance_size = sizeof(struct ICH9LPCState), +.instance_init = ich9_lpc_initfn, .class_init = ich9_lpc_class_init, }; -- 1.9.0
[Qemu-devel] [PATCH 07/35] pc: create custom generic PC machine type
it will be used for PC specific options/variables Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/i386/pc.c | 36 hw/i386/pc_piix.c| 36 ++-- hw/i386/pc_q35.c | 12 ++-- include/hw/i386/pc.h | 24 4 files changed, 84 insertions(+), 24 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 14f0d91..32b4003 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1429,3 +1429,39 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name) gsi_state-ioapic_irq[i] = qdev_get_gpio_in(dev, i); } } + +static void pc_generic_machine_class_init(ObjectClass *oc, void *data) +{ +MachineClass *mc = MACHINE_CLASS(oc); + +mc-qemu_machine = data; +} + +void qemu_register_pc_machine(QEMUMachine *m) +{ +char *name = g_strconcat(m-name, TYPE_MACHINE_SUFFIX, NULL); +TypeInfo ti = { +.name = name, +.parent = TYPE_PC_MACHINE, +.class_init = pc_generic_machine_class_init, +.class_data = (void *)m, +}; + +type_register(ti); +g_free(name); +} + +static const TypeInfo pc_machine_info = { +.name = TYPE_PC_MACHINE, +.parent = TYPE_MACHINE, +.abstract = true, +.instance_size = sizeof(PCMachineState), +.class_size = sizeof(PCMachineClass), +}; + +static void pc_machine_register_types(void) +{ +type_register_static(pc_machine_info); +} + +type_init(pc_machine_register_types) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index f0dc4d1..62f750d 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -835,25 +835,25 @@ static QEMUMachine xenfv_machine = { static void pc_machine_init(void) { -qemu_register_machine(pc_i440fx_machine_v2_1); -qemu_register_machine(pc_i440fx_machine_v2_0); -qemu_register_machine(pc_i440fx_machine_v1_7); -qemu_register_machine(pc_i440fx_machine_v1_6); -qemu_register_machine(pc_i440fx_machine_v1_5); -qemu_register_machine(pc_i440fx_machine_v1_4); -qemu_register_machine(pc_machine_v1_3); -qemu_register_machine(pc_machine_v1_2); -qemu_register_machine(pc_machine_v1_1); -qemu_register_machine(pc_machine_v1_0); -qemu_register_machine(pc_machine_v0_15); -qemu_register_machine(pc_machine_v0_14); -qemu_register_machine(pc_machine_v0_13); -qemu_register_machine(pc_machine_v0_12); -qemu_register_machine(pc_machine_v0_11); -qemu_register_machine(pc_machine_v0_10); -qemu_register_machine(isapc_machine); +qemu_register_pc_machine(pc_i440fx_machine_v2_1); +qemu_register_pc_machine(pc_i440fx_machine_v2_0); +qemu_register_pc_machine(pc_i440fx_machine_v1_7); +qemu_register_pc_machine(pc_i440fx_machine_v1_6); +qemu_register_pc_machine(pc_i440fx_machine_v1_5); +qemu_register_pc_machine(pc_i440fx_machine_v1_4); +qemu_register_pc_machine(pc_machine_v1_3); +qemu_register_pc_machine(pc_machine_v1_2); +qemu_register_pc_machine(pc_machine_v1_1); +qemu_register_pc_machine(pc_machine_v1_0); +qemu_register_pc_machine(pc_machine_v0_15); +qemu_register_pc_machine(pc_machine_v0_14); +qemu_register_pc_machine(pc_machine_v0_13); +qemu_register_pc_machine(pc_machine_v0_12); +qemu_register_pc_machine(pc_machine_v0_11); +qemu_register_pc_machine(pc_machine_v0_10); +qemu_register_pc_machine(isapc_machine); #ifdef CONFIG_XEN -qemu_register_machine(xenfv_machine); +qemu_register_pc_machine(xenfv_machine); #endif } diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 5ad31a5..ee68de7 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -376,12 +376,12 @@ static QEMUMachine pc_q35_machine_v1_4 = { static void pc_q35_machine_init(void) { -qemu_register_machine(pc_q35_machine_v2_1); -qemu_register_machine(pc_q35_machine_v2_0); -qemu_register_machine(pc_q35_machine_v1_7); -qemu_register_machine(pc_q35_machine_v1_6); -qemu_register_machine(pc_q35_machine_v1_5); -qemu_register_machine(pc_q35_machine_v1_4); +qemu_register_pc_machine(pc_q35_machine_v2_1); +qemu_register_pc_machine(pc_q35_machine_v2_0); +qemu_register_pc_machine(pc_q35_machine_v1_7); +qemu_register_pc_machine(pc_q35_machine_v1_6); +qemu_register_pc_machine(pc_q35_machine_v1_5); +qemu_register_pc_machine(pc_q35_machine_v1_4); } machine_init(pc_q35_machine_init); diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index 9010246..b3bc0f7 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -12,9 +12,33 @@ #include qemu/bitmap.h #include sysemu/sysemu.h #include hw/pci/pci.h +#include hw/boards.h #define HPET_INTCAP hpet-intcap +struct PCMachineState { +/* private */ +MachineState parent_obj; +}; + +struct PCMachineClass { +/* private */ +MachineClass parent_class; +}; + +typedef struct PCMachineState PCMachineState; +typedef struct PCMachineClass PCMachineClass; + +#define TYPE_PC_MACHINE
[Qemu-devel] [PATCH 13/35] pc: initialize memory hotplug address space
initialize and map hotplug memory address space container into guest's RAM address space. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/i386/pc.c | 19 +-- include/hw/i386/pc.h | 10 ++ 2 files changed, 27 insertions(+), 2 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 32b4003..69e4225 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1171,6 +1171,9 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory, MemoryRegion *ram, *option_rom_mr; MemoryRegion *ram_below_4g, *ram_above_4g; FWCfgState *fw_cfg; +ram_addr_t ram_size = below_4g_mem_size + above_4g_mem_size; +MachineState *machine = MACHINE(qdev_get_machine()); +PCMachineState *pcms = PC_MACHINE(machine); linux_boot = (kernel_filename != NULL); @@ -1179,8 +1182,7 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory, * with older qemus that used qemu_ram_alloc(). */ ram = g_malloc(sizeof(*ram)); -memory_region_init_ram(ram, NULL, pc.ram, - below_4g_mem_size + above_4g_mem_size); +memory_region_init_ram(ram, NULL, pc.ram, ram_size); vmstate_register_ram_global(ram); *ram_memory = ram; ram_below_4g = g_malloc(sizeof(*ram_below_4g)); @@ -1197,6 +1199,19 @@ FWCfgState *pc_memory_init(MemoryRegion *system_memory, e820_add_entry(0x1ULL, above_4g_mem_size, E820_RAM); } +/* initialize hotplug memory address space */ +if (ram_size machine-init_args.maxram_size) { +ram_addr_t hotplug_mem_size = +machine-init_args.maxram_size - ram_size; + +pcms-hotplug_memory_base = +ROUND_UP(0x1ULL + above_4g_mem_size, 1ULL 30); + +memory_region_init(pcms-hotplug_memory, OBJECT(pcms), + hotplug-memory, hotplug_mem_size); +memory_region_add_subregion(system_memory, pcms-hotplug_memory_base, +pcms-hotplug_memory); +} /* Initialize PC system firmware */ pc_system_firmware_init(rom_memory, guest_info-isapc_ram_fw); diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index b3bc0f7..a1f21ba 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -16,9 +16,19 @@ #define HPET_INTCAP hpet-intcap +/** + * PCMachineState: + * @hotplug_memory_base: address in guest RAM address space where hotplug memory + * address space begins. + * @hotplug_memory: hotplug memory addess space container + */ struct PCMachineState { /* private */ MachineState parent_obj; + +/* public */ +ram_addr_t hotplug_memory_base; +MemoryRegion hotplug_memory; }; struct PCMachineClass { -- 1.9.0
[Qemu-devel] [PATCH 32/35] pc: ACPI BIOS: use enum for defining memory affinity flags
replace magic numbers with enum describing Flags field of memory affinity in SRAT table. MemoryAffinityFlags enum will define flags decribed by: ACPI spec 5.0, 5.2.16.2 Memory Affinity Structure, Table 5-69 Flags - Memory Affinity Structure Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/i386/acpi-build.c | 23 --- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 6649480..ef89e99 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1165,15 +1165,22 @@ build_hpet(GArray *table_data, GArray *linker) (void *)hpet, HPET, sizeof(*hpet), 1); } +typedef enum { +MEM_AFFINITY_NOFLAGS = 0, +MEM_AFFINITY_ENABLED = (1 0), +MEM_AFFINITY_HOTPLUGGABLE = (1 1), +MEM_AFFINITY_NON_VOLATILE = (1 2), +} MemoryAffinityFlags; + static void -acpi_build_srat_memory(AcpiSratMemoryAffinity *numamem, - uint64_t base, uint64_t len, int node, int enabled) +acpi_build_srat_memory(AcpiSratMemoryAffinity *numamem, uint64_t base, + uint64_t len, int node, MemoryAffinityFlags flags) { numamem-type = ACPI_SRAT_MEMORY; numamem-length = sizeof(*numamem); memset(numamem-proximity, 0, 4); numamem-proximity[0] = node; -numamem-flags = cpu_to_le32(!!enabled); +numamem-flags = cpu_to_le32(flags); numamem-base_addr = cpu_to_le64(base); numamem-range_length = cpu_to_le64(len); } @@ -1221,7 +1228,7 @@ build_srat(GArray *table_data, GArray *linker, numa_start = table_data-len; numamem = acpi_data_push(table_data, sizeof *numamem); -acpi_build_srat_memory(numamem, 0, 640*1024, 0, 1); +acpi_build_srat_memory(numamem, 0, 640*1024, 0, MEM_AFFINITY_ENABLED); next_base = 1024 * 1024; for (i = 1; i guest_info-numa_nodes + 1; ++i) { mem_base = next_base; @@ -1237,19 +1244,21 @@ build_srat(GArray *table_data, GArray *linker, mem_len -= next_base - guest_info-ram_size_below_4g; if (mem_len 0) { numamem = acpi_data_push(table_data, sizeof *numamem); -acpi_build_srat_memory(numamem, mem_base, mem_len, i-1, 1); +acpi_build_srat_memory(numamem, mem_base, mem_len, i-1, + MEM_AFFINITY_ENABLED); } mem_base = 1ULL 32; mem_len = next_base - guest_info-ram_size_below_4g; next_base += (1ULL 32) - guest_info-ram_size_below_4g; } numamem = acpi_data_push(table_data, sizeof *numamem); -acpi_build_srat_memory(numamem, mem_base, mem_len, i - 1, 1); +acpi_build_srat_memory(numamem, mem_base, mem_len, i - 1, + MEM_AFFINITY_ENABLED); } slots = (table_data-len - numa_start) / sizeof *numamem; for (; slots guest_info-numa_nodes + 2; slots++) { numamem = acpi_data_push(table_data, sizeof *numamem); -acpi_build_srat_memory(numamem, 0, 0, 0, 0); +acpi_build_srat_memory(numamem, 0, 0, 0, MEM_AFFINITY_NOFLAGS); } build_header(linker, table_data, -- 1.9.0
[Qemu-devel] [PATCH 16/35] pc: add memory hotplug handler to PC_MACHINE
that will perform mapping of DIMM device into guest's RAM address space Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/i386/pc.c | 52 include/hw/i386/pc.h | 8 2 files changed, 60 insertions(+) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index b164e37..4038e2c 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -58,6 +58,7 @@ #include hw/boards.h #include hw/pci/pci_host.h #include acpi-build.h +#include hw/mem/dimm.h /* debug PC/ISA interrupts */ //#define DEBUG_IRQ @@ -1479,12 +1480,63 @@ void qemu_register_pc_machine(QEMUMachine *m) g_free(name); } +static void pc_dimm_plug(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) +{ +PCMachineState *pcms = PC_MACHINE(hotplug_dev); +DimmDevice *dimm = DIMM(dev); +DimmDeviceClass *ddc = DIMM_GET_CLASS(dimm); +MemoryRegion *mr = ddc-get_memory_region(dimm); + +memory_region_add_subregion(pcms-hotplug_memory, +dimm-start - pcms-hotplug_memory_base, +mr); +vmstate_register_ram(mr, dev); +} + +static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) +{ +if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) { +pc_dimm_plug(hotplug_dev, dev, errp); +} +} + +static HotplugHandler *pc_get_hotpug_handler(MachineState *machine, + DeviceState *dev) +{ +PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine); + +if (object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) { +return HOTPLUG_HANDLER(machine); +} + +return pcmc-get_hotplug_handler ? +pcmc-get_hotplug_handler(machine, dev) : NULL; +} + +static void pc_machine_class_init(ObjectClass *oc, void *data) +{ +MachineClass *mc = MACHINE_CLASS(oc); +PCMachineClass *pcmc = PC_MACHINE_CLASS(oc); +HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc); + +pcmc-get_hotplug_handler = mc-get_hotplug_handler; +mc-get_hotplug_handler = pc_get_hotpug_handler; +hc-plug = pc_machine_device_plug_cb; +} + static const TypeInfo pc_machine_info = { .name = TYPE_PC_MACHINE, .parent = TYPE_MACHINE, .abstract = true, .instance_size = sizeof(PCMachineState), .class_size = sizeof(PCMachineClass), +.class_init = pc_machine_class_init, +.interfaces = (InterfaceInfo[]) { + { TYPE_HOTPLUG_HANDLER }, + { } +}, }; static void pc_machine_register_types(void) diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index 36f2f52..059e137 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -31,9 +31,17 @@ struct PCMachineState { MemoryRegion hotplug_memory; }; +/** + * PCMachineClass: + * @get_hotplug_handler: pointer to parent class callback @get_hotplug_handler + */ struct PCMachineClass { /* private */ MachineClass parent_class; + +/* public */ +HotplugHandler *(*get_hotplug_handler)(MachineState *machine, + DeviceState *dev); }; typedef struct PCMachineState PCMachineState; -- 1.9.0
[Qemu-devel] [PATCH 29/35] pc: ACPI BIOS: punch holes in PCI0._CRS for memory hotplug IO region
... to make sure that IO range used by memory hotlug won't be used by PCI devices. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/i386/acpi-dsdt.dsl | 10 +- hw/i386/q35-acpi-dsdt.dsl | 10 +- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/hw/i386/acpi-dsdt.dsl b/hw/i386/acpi-dsdt.dsl index f93353f..b846195 100644 --- a/hw/i386/acpi-dsdt.dsl +++ b/hw/i386/acpi-dsdt.dsl @@ -39,9 +39,17 @@ DefinitionBlock ( WordIO(ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange, \ 0x, \ 0x, \ + 0x09FF, \ + 0x, \ + 0x0A00, \ + ,, , TypeStatic) \ + /* 0xa00-0xa17 hole for memory hotplug, include/hw/acpi/memory_hotplug.h:ACPI_MEMORY_HOTPLUG_BASE */ \ + WordIO(ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange, \ + 0x, \ + 0x0A18, \ 0x0CF7, \ 0x, \ - 0x0CF8, \ + 0x02E0, \ ,, , TypeStatic) \ WordIO(ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange, \ 0x, \ diff --git a/hw/i386/q35-acpi-dsdt.dsl b/hw/i386/q35-acpi-dsdt.dsl index 3838fc7..766e96d 100644 --- a/hw/i386/q35-acpi-dsdt.dsl +++ b/hw/i386/q35-acpi-dsdt.dsl @@ -52,9 +52,17 @@ DefinitionBlock ( WordIO(ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange, \ 0x, \ 0x, \ + 0x09FF, \ + 0x, \ + 0x0A00, \ + ,, , TypeStatic) \ + /* 0xa00-0xa17 hole for memory hotplug, include/hw/acpi/memory_hotplug.h:ACPI_MEMORY_HOTPLUG_BASE */ \ + WordIO(ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange, \ + 0x, \ + 0x0A18, \ 0x0CD7, \ 0x, \ - 0x0CD8, \ + 0x02C0, \ ,, , TypeStatic) \ /* 0xcd8-0xcf7 hole for CPU hotplug, hw/acpi/ich9.c:ICH9_PROC_BASE */ \ WordIO(ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange, \ -- 1.9.0
[Qemu-devel] [PATCH 18/35] dimm: add busy slot check and slot auto-allocation
- if slot property is not specified on -device/device_add command, treat default value as request for assigning DimmDevice to the first free slot. - if slot is provided with -device/device_add command, attempt to use it or fail command if it's already occupied. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/i386/pc.c | 15 +++ hw/mem/dimm.c | 47 +++ include/hw/mem/dimm.h | 2 ++ 3 files changed, 64 insertions(+) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 29382ec..e3e63bb 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1489,6 +1489,8 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev, DimmDeviceClass *ddc = DIMM_GET_CLASS(dimm); MemoryRegion *mr = ddc-get_memory_region(dimm); ram_addr_t addr = dimm-start; +MachineState *machine = MACHINE(hotplug_dev); +int slot = dimm-slot; addr = dimm_get_free_addr(pcms-hotplug_memory_base, memory_region_size(pcms-hotplug_memory), @@ -1498,6 +1500,19 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev, goto out; } object_property_set_int(OBJECT(dev), addr, start, local_err); +if (local_err) { +goto out; +} + +slot = dimm_get_free_slot(slot 0 ? NULL : slot, + machine-init_args.ram_slots, local_err); +if (local_err) { +goto out; +} +object_property_set_int(OBJECT(dev), slot, slot, local_err); +if (local_err) { +goto out; +} memory_region_add_subregion(pcms-hotplug_memory, addr - pcms-hotplug_memory_base, diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c index a10bd07..16fdf62 100644 --- a/hw/mem/dimm.c +++ b/hw/mem/dimm.c @@ -23,6 +23,53 @@ #include qapi/visitor.h #include qemu/range.h +static int dimm_slot2bitmap(Object *obj, void *opaque) +{ +unsigned long *bitmap = opaque; + +if (object_dynamic_cast(obj, TYPE_DIMM)) { +DeviceState *dev = DEVICE(obj); +if (dev-realized) { /* count only realized DIMMs */ +DimmDevice *d = DIMM(obj); +set_bit(d-slot, bitmap); +} +} + +object_child_foreach(obj, dimm_slot2bitmap, opaque); +return 0; +} + +int dimm_get_free_slot(const int *hint, int max_slots, Error **errp) +{ +unsigned long *bitmap = bitmap_new(max_slots); +int slot = 0; + +object_child_foreach(qdev_get_machine(), dimm_slot2bitmap, bitmap); + +/* check if requested slot is not occupied */ +if (hint) { +if (*hint = max_slots) { +error_setg(errp, invalid slot# %d, should be less than %d, + *hint, max_slots); +} +if (!test_bit(*hint, bitmap)) { +slot = *hint; +} else { +error_setg(errp, slot %d is busy, *hint); +} +goto out; +} + +/* search for free slot */ +slot = find_first_zero_bit(bitmap, max_slots); +if (slot == max_slots) { +error_setg(errp, no free slots available); +} +out: +g_free(bitmap); +return slot; +} + static gint dimm_addr_sort(gconstpointer a, gconstpointer b) { DimmDevice *x = DIMM(a); diff --git a/include/hw/mem/dimm.h b/include/hw/mem/dimm.h index d6e4d65..fa27208 100644 --- a/include/hw/mem/dimm.h +++ b/include/hw/mem/dimm.h @@ -67,4 +67,6 @@ uint64_t dimm_get_free_addr(uint64_t address_space_start, uint64_t address_space_size, uint64_t *hint, uint64_t size, Error **errp); + +int dimm_get_free_slot(const int *hint, int max_slots, Error **errp); #endif -- 1.9.0
[Qemu-devel] [PATCH 10/35] dimm: implement dimm device abstraction
Each hotplug-able memory slot is a DimmDevice. A hot-add operation for a DIMM: - creates a new DimmDevice and makes hotplug controller to map it into guest address space Hotplug operations are done through normal device_add commands. For migration case, all hotplugged DIMMs on source should be specified on target's command line using '-device' option with properties set to the same values as on source. To simplify review, patch introduces only DimmDevice QOM skeleton that will be extended by following patches to implement actual memory hotplug and related functions. Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com Signed-off-by: Igor Mammedov imamm...@redhat.com --- default-configs/i386-softmmu.mak | 1 + default-configs/x86_64-softmmu.mak | 1 + hw/Makefile.objs | 1 + hw/mem/Makefile.objs | 1 + hw/mem/dimm.c | 103 + include/hw/mem/dimm.h | 65 +++ 6 files changed, 172 insertions(+) create mode 100644 hw/mem/Makefile.objs create mode 100644 hw/mem/dimm.c create mode 100644 include/hw/mem/dimm.h diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak index 37ef90f..8e08841 100644 --- a/default-configs/i386-softmmu.mak +++ b/default-configs/i386-softmmu.mak @@ -44,3 +44,4 @@ CONFIG_APIC=y CONFIG_IOAPIC=y CONFIG_ICC_BUS=y CONFIG_PVPANIC=y +CONFIG_MEM_HOTPLUG=y diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak index 31bddce..66557ac 100644 --- a/default-configs/x86_64-softmmu.mak +++ b/default-configs/x86_64-softmmu.mak @@ -44,3 +44,4 @@ CONFIG_APIC=y CONFIG_IOAPIC=y CONFIG_ICC_BUS=y CONFIG_PVPANIC=y +CONFIG_MEM_HOTPLUG=y diff --git a/hw/Makefile.objs b/hw/Makefile.objs index d178b65..52a1464 100644 --- a/hw/Makefile.objs +++ b/hw/Makefile.objs @@ -29,6 +29,7 @@ devices-dirs-$(CONFIG_SOFTMMU) += usb/ devices-dirs-$(CONFIG_VIRTIO) += virtio/ devices-dirs-$(CONFIG_SOFTMMU) += watchdog/ devices-dirs-$(CONFIG_SOFTMMU) += xen/ +devices-dirs-$(CONFIG_MEM_HOTPLUG) += mem/ devices-dirs-y += core/ common-obj-y += $(devices-dirs-y) obj-y += $(devices-dirs-y) diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs new file mode 100644 index 000..7563ef5 --- /dev/null +++ b/hw/mem/Makefile.objs @@ -0,0 +1 @@ +common-obj-$(CONFIG_MEM_HOTPLUG) += dimm.o diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c new file mode 100644 index 000..76f2a11 --- /dev/null +++ b/hw/mem/dimm.c @@ -0,0 +1,103 @@ +/* + * Dimm device for Memory Hotplug + * + * Copyright ProfitBricks GmbH 2012 + * Copyright (C) 2014 Red Hat Inc + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see http://www.gnu.org/licenses/ + */ + +#include hw/mem/dimm.h +#include qemu/config-file.h +#include qapi/visitor.h + +static Property dimm_properties[] = { +DEFINE_PROP_UINT64(start, DimmDevice, start, 0), +DEFINE_PROP_UINT32(node, DimmDevice, node, 0), +DEFINE_PROP_INT32(slot, DimmDevice, slot, -1), +DEFINE_PROP_END_OF_LIST(), +}; + +static void dimm_get_size(Object *obj, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +int64_t value; +MemoryRegion *mr; +DimmDevice *dimm = DIMM(obj); + +mr = host_memory_backend_get_memory(dimm-hostmem, errp); +value = memory_region_size(mr); + +visit_type_int(v, value, name, errp); +} + +static void dimm_initfn(Object *obj) +{ +DimmDevice *dimm = DIMM(obj); + +object_property_add(obj, size, int, dimm_get_size, +NULL, NULL, NULL, error_abort); +object_property_add_link(obj, memdev, TYPE_MEMORY_BACKEND, + (Object **)dimm-hostmem, + qdev_prop_allow_set_link_before_realize, + OBJ_PROP_LINK_UNREF_ON_RELEASE, + error_abort); +} + +static void dimm_realize(DeviceState *dev, Error **errp) +{ +DimmDevice *dimm = DIMM(dev); + +if (!dimm-hostmem) { +error_setg(errp, 'memdev' property is not set); +return; +} + +if (!dev-id) { +error_setg(errp, 'id' property is not set); +return; +} +} + +static MemoryRegion *dimm_get_memory_region(DimmDevice *dimm) +{ +return host_memory_backend_get_memory(dimm-hostmem, error_abort); +} + +static void
[Qemu-devel] [PATCH 19/35] acpi: rename cpu_hotplug_defs.h to acpi_defs.h
to make it more generic, so it could be used for memory hotplug as well. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/i386/acpi-dsdt.dsl | 2 +- hw/i386/q35-acpi-dsdt.dsl | 2 +- include/hw/acpi/acpi_defs.h| 32 include/hw/acpi/cpu_hotplug.h | 2 +- include/hw/acpi/cpu_hotplug_defs.h | 32 5 files changed, 35 insertions(+), 35 deletions(-) create mode 100644 include/hw/acpi/acpi_defs.h delete mode 100644 include/hw/acpi/cpu_hotplug_defs.h diff --git a/hw/i386/acpi-dsdt.dsl b/hw/i386/acpi-dsdt.dsl index 0a1e252..f93353f 100644 --- a/hw/i386/acpi-dsdt.dsl +++ b/hw/i386/acpi-dsdt.dsl @@ -306,7 +306,7 @@ DefinitionBlock ( } } -#include hw/acpi/cpu_hotplug_defs.h +#include hw/acpi/acpi_defs.h #define CPU_STATUS_BASE PIIX4_CPU_HOTPLUG_IO_BASE #include acpi-dsdt-cpu-hotplug.dsl diff --git a/hw/i386/q35-acpi-dsdt.dsl b/hw/i386/q35-acpi-dsdt.dsl index f4d2a2d..3838fc7 100644 --- a/hw/i386/q35-acpi-dsdt.dsl +++ b/hw/i386/q35-acpi-dsdt.dsl @@ -402,7 +402,7 @@ DefinitionBlock ( define_gsi_link(GSIH, 0, 0x17) } -#include hw/acpi/cpu_hotplug_defs.h +#include hw/acpi/acpi_defs.h #define CPU_STATUS_BASE ICH9_CPU_HOTPLUG_IO_BASE #include acpi-dsdt-cpu-hotplug.dsl diff --git a/include/hw/acpi/acpi_defs.h b/include/hw/acpi/acpi_defs.h new file mode 100644 index 000..36b831f --- /dev/null +++ b/include/hw/acpi/acpi_defs.h @@ -0,0 +1,32 @@ +/* + * QEMU ACPI hotplug utilities shared defines + * + * Copyright (C) 2014 Red Hat Inc + * + * Authors: + * Igor Mammedov imamm...@redhat.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#ifndef ACPI_DEFS_H +#define ACPI_DEFS_H + +/* + * ONLY DEFINEs are permited in this file since it's shared + * between C and ASL code. + */ +#define ACPI_CPU_HOTPLUG_STATUS 4 + +/* Limit for CPU arch IDs for CPU hotplug. All hotpluggable CPUs should + * have CPUClass.get_arch_id() ACPI_CPU_HOTPLUG_ID_LIMIT. + */ +#define ACPI_CPU_HOTPLUG_ID_LIMIT 256 + +/* 256 CPU IDs, 8 bits per entry: */ +#define ACPI_GPE_PROC_LEN 32 + +#define ICH9_CPU_HOTPLUG_IO_BASE 0x0CD8 +#define PIIX4_CPU_HOTPLUG_IO_BASE 0xaf00 + +#endif diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h index 4576400..1562ce8 100644 --- a/include/hw/acpi/cpu_hotplug.h +++ b/include/hw/acpi/cpu_hotplug.h @@ -13,7 +13,7 @@ #define ACPI_HOTPLUG_H #include hw/acpi/acpi.h -#include hw/acpi/cpu_hotplug_defs.h +#include hw/acpi/acpi_defs.h typedef struct AcpiCpuHotplug { MemoryRegion io; diff --git a/include/hw/acpi/cpu_hotplug_defs.h b/include/hw/acpi/cpu_hotplug_defs.h deleted file mode 100644 index 9f33663..000 --- a/include/hw/acpi/cpu_hotplug_defs.h +++ /dev/null @@ -1,32 +0,0 @@ -/* - * QEMU ACPI hotplug utilities shared defines - * - * Copyright (C) 2013 Red Hat Inc - * - * Authors: - * Igor Mammedov imamm...@redhat.com - * - * This work is licensed under the terms of the GNU GPL, version 2 or later. - * See the COPYING file in the top-level directory. - */ -#ifndef ACPI_HOTPLUG_DEFS_H -#define ACPI_HOTPLUG_DEFS_H - -/* - * ONLY DEFINEs are permited in this file since it's shared - * between C and ASL code. - */ -#define ACPI_CPU_HOTPLUG_STATUS 4 - -/* Limit for CPU arch IDs for CPU hotplug. All hotpluggable CPUs should - * have CPUClass.get_arch_id() ACPI_CPU_HOTPLUG_ID_LIMIT. - */ -#define ACPI_CPU_HOTPLUG_ID_LIMIT 256 - -/* 256 CPU IDs, 8 bits per entry: */ -#define ACPI_GPE_PROC_LEN 32 - -#define ICH9_CPU_HOTPLUG_IO_BASE 0x0CD8 -#define PIIX4_CPU_HOTPLUG_IO_BASE 0xaf00 - -#endif -- 1.9.0
[Qemu-devel] [PATCH 33/35] pc: ACPI BIOS: reserve SRAT entry for hotplug mem hole
Needed for Windows to use hotplugged memory device, otherwise it complains that server is not configured for memory hotplug. Tests shows that aftewards it uses dynamically provided proximity value from _PXM() method if available. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/i386/acpi-build.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index ef89e99..012b100 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1197,6 +1197,8 @@ build_srat(GArray *table_data, GArray *linker, uint64_t curnode; int srat_start, numa_start, slots; uint64_t mem_len, mem_base, next_base; +PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); +ram_addr_t hotplug_as_size = memory_region_size(pcms-hotplug_memory); srat_start = table_data-len; @@ -1261,6 +1263,18 @@ build_srat(GArray *table_data, GArray *linker, acpi_build_srat_memory(numamem, 0, 0, 0, MEM_AFFINITY_NOFLAGS); } +/* + * Fake entry required by Windows to enable memory hotplug in OS. + * Individual DIMM devices override proximity set here via _PXM method, + * which returns associated with it NUMA node id. + */ +if (hotplug_as_size) { +numamem = acpi_data_push(table_data, sizeof *numamem); +acpi_build_srat_memory(numamem, pcms-hotplug_memory_base, + hotplug_as_size, 0, MEM_AFFINITY_HOTPLUGGABLE | + MEM_AFFINITY_ENABLED); +} + build_header(linker, table_data, (void *)(table_data-data + srat_start), SRAT, -- 1.9.0
[Qemu-devel] [PATCH 23/35] acpi:piix4: make plug/unlug callbacks generic
... and report error if plugged in device is not supported. Later generic callbacks will be used by memory hotplug. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/acpi/piix4.c | 31 ++- 1 file changed, 22 insertions(+), 9 deletions(-) diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index 67dc075..4341f82 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -310,19 +310,32 @@ static void piix4_pm_powerdown_req(Notifier *n, void *opaque) acpi_pm1_evt_power_down(s-ar); } -static void piix4_pci_device_plug_cb(HotplugHandler *hotplug_dev, - DeviceState *dev, Error **errp) +static void piix4_device_plug_cb(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) { PIIX4PMState *s = PIIX4_PM(hotplug_dev); -acpi_pcihp_device_plug_cb(s-ar, s-irq, s-acpi_pci_hotplug, dev, errp); + +if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { +acpi_pcihp_device_plug_cb(s-ar, s-irq, s-acpi_pci_hotplug, dev, + errp); +} else { +error_setg(errp, acpi: device plug request for not supported device +type: %s, object_get_typename(OBJECT(dev))); +} } -static void piix4_pci_device_unplug_cb(HotplugHandler *hotplug_dev, - DeviceState *dev, Error **errp) +static void piix4_device_unplug_cb(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) { PIIX4PMState *s = PIIX4_PM(hotplug_dev); -acpi_pcihp_device_unplug_cb(s-ar, s-irq, s-acpi_pci_hotplug, dev, -errp); + +if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { +acpi_pcihp_device_unplug_cb(s-ar, s-irq, s-acpi_pci_hotplug, dev, +errp); +} else { +error_setg(errp, acpi: device unplug request for not supported device +type: %s, object_get_typename(OBJECT(dev))); +} } static void piix4_update_bus_hotplug(PCIBus *pci_bus, void *opaque) @@ -553,8 +566,8 @@ static void piix4_pm_class_init(ObjectClass *klass, void *data) */ dc-cannot_instantiate_with_device_add_yet = true; dc-hotpluggable = false; -hc-plug = piix4_pci_device_plug_cb; -hc-unplug = piix4_pci_device_unplug_cb; +hc-plug = piix4_device_plug_cb; +hc-unplug = piix4_device_unplug_cb; } static const TypeInfo piix4_pm_info = { -- 1.9.0
[Qemu-devel] [PATCH 26/35] acpi:ich9: add memory hotplug handling
Add memory hotplug initialization/handling to ICH9 LPC device and enable it by default for post 2.0 machine types Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/acpi/ich9.c | 38 ++ hw/i386/pc_q35.c | 4 hw/isa/lpc_ich9.c | 20 include/hw/acpi/ich9.h | 4 include/hw/i386/pc.h | 8 5 files changed, 74 insertions(+) diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c index 0afac42..86c45ba 100644 --- a/hw/acpi/ich9.c +++ b/hw/acpi/ich9.c @@ -34,6 +34,7 @@ #include exec/address-spaces.h #include hw/i386/ich9.h +#include hw/mem/dimm.h //#define DEBUG @@ -224,6 +225,11 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm, pm-gpe_cpu, ICH9_CPU_HOTPLUG_IO_BASE); pm-cpu_added_notifier.notify = ich9_cpu_added_req; qemu_register_cpu_added_notifier(pm-cpu_added_notifier); + +if (pm-acpi_memory_hotplug.is_enabled) { +acpi_memory_hotplug_init(pci_address_space_io(lpc_pci), OBJECT(lpc_pci), + pm-acpi_memory_hotplug); +} } static void ich9_pm_get_gpe0_blk(Object *obj, Visitor *v, @@ -236,9 +242,25 @@ static void ich9_pm_get_gpe0_blk(Object *obj, Visitor *v, visit_type_uint32(v, value, name, errp); } +static bool ich9_pm_get_memory_hotplug_support(Object *obj, Error **errp) +{ +ICH9LPCState *s = ICH9_LPC_DEVICE(obj); + +return s-pm.acpi_memory_hotplug.is_enabled; +} + +static void ich9_pm_set_memory_hotplug_support(Object *obj, bool value, + Error **errp) +{ +ICH9LPCState *s = ICH9_LPC_DEVICE(obj); + +s-pm.acpi_memory_hotplug.is_enabled = value; +} + void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, Error **errp) { static const uint32_t gpe0_len = ICH9_PMIO_GPE0_LEN; +pm-acpi_memory_hotplug.is_enabled = true; object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE, pm-pm_io_base, errp); @@ -247,4 +269,20 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, Error **errp) NULL, NULL, pm, NULL); object_property_add_uint32_ptr(obj, ACPI_PM_PROP_GPE0_BLK_LEN, gpe0_len, errp); +object_property_add_bool(obj, memory-hotplug-support, + ich9_pm_get_memory_hotplug_support, + ich9_pm_set_memory_hotplug_support, + NULL); +} + +void ich9_pm_device_plug_cb(ICH9LPCPMRegs *pm, DeviceState *dev, Error **errp) +{ +if (pm-acpi_memory_hotplug.is_enabled +object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) { +acpi_memory_plug_cb(pm-acpi_regs, pm-irq, pm-acpi_memory_hotplug, +dev, errp); +} else { +error_setg(errp, acpi: device plug request for not supported device +type: %s, object_get_typename(OBJECT(dev))); +} } diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 89fd4b4..f9b74d5 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -327,6 +327,10 @@ static QEMUMachine pc_q35_machine_v2_0 = { .name = pc-q35-2.0, .alias = q35, .init = pc_q35_init_2_0, +.compat_props = (GlobalProperty[]) { +PC_Q35_COMPAT_2_0, +{ /* end of list */ } +}, }; #define PC_Q35_1_7_MACHINE_OPTIONS PC_Q35_MACHINE_OPTIONS diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c index 46de3b6..2adf29a 100644 --- a/hw/isa/lpc_ich9.c +++ b/hw/isa/lpc_ich9.c @@ -599,6 +599,19 @@ static int ich9_lpc_init(PCIDevice *d) return 0; } +static void ich9_device_plug_cb(HotplugHandler *hotplug_dev, +DeviceState *dev, Error **errp) +{ +ICH9LPCState *lpc = ICH9_LPC_DEVICE(hotplug_dev); + +ich9_pm_device_plug_cb(lpc-pm, dev, errp); +} + +static void ich9_device_unplug_cb(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) +{ +} + static bool ich9_rst_cnt_needed(void *opaque) { ICH9LPCState *lpc = opaque; @@ -643,6 +656,7 @@ static void ich9_lpc_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); +HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(klass); set_bit(DEVICE_CATEGORY_BRIDGE, dc-categories); dc-reset = ich9_lpc_reset; @@ -659,6 +673,8 @@ static void ich9_lpc_class_init(ObjectClass *klass, void *data) * pc_q35_init() */ dc-cannot_instantiate_with_device_add_yet = true; +hc-plug = ich9_device_plug_cb; +hc-unplug = ich9_device_unplug_cb; } static const TypeInfo ich9_lpc_info = { @@ -667,6 +683,10 @@ static const TypeInfo ich9_lpc_info = { .instance_size = sizeof(struct ICH9LPCState), .instance_init = ich9_lpc_initfn, .class_init = ich9_lpc_class_init, +.interfaces = (InterfaceInfo[]) { +{
[Qemu-devel] [PATCH 31/35] pc: ACPI BIOS: implement memory hotplug interface
- provides static SSDT object for memory hotplug - SSDT template for memory devices and runtime generator of them in SSDT table. Signed-off-by: Vasilis Liaskovitis vasilis.liaskovi...@profitbricks.com Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/i386/Makefile.objs | 3 +- hw/i386/acpi-build.c | 37 hw/i386/ssdt-mem.dsl | 75 +++ hw/i386/ssdt-misc.dsl | 163 ++ 4 files changed, 277 insertions(+), 1 deletion(-) create mode 100644 hw/i386/ssdt-mem.dsl diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs index 3df1612..fd04fc2 100644 --- a/hw/i386/Makefile.objs +++ b/hw/i386/Makefile.objs @@ -9,7 +9,8 @@ obj-y += acpi-build.o obj-y += bios-linker-loader.o hw/i386/acpi-build.o: hw/i386/acpi-build.c hw/i386/acpi-dsdt.hex \ hw/i386/ssdt-proc.hex hw/i386/ssdt-pcihp.hex hw/i386/ssdt-misc.hex \ - hw/i386/acpi-dsdt.hex hw/i386/q35-acpi-dsdt.hex + hw/i386/acpi-dsdt.hex hw/i386/q35-acpi-dsdt.hex \ + hw/i386/q35-acpi-dsdt.hex hw/i386/ssdt-mem.hex iasl-option=$(shell if test -z `$(1) $(2) 21 /dev/null` \ ; then echo $(2); else echo $(3); fi ;) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index a5d3fbf..6649480 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -37,6 +37,7 @@ #include bios-linker-loader.h #include hw/loader.h #include hw/isa/isa.h +#include hw/acpi/memory_hotplug.h /* Supported chipsets: */ #include hw/acpi/piix4.h @@ -664,6 +665,14 @@ static inline char acpi_get_hex(uint32_t val) #define ACPI_PCIQXL_SIZEOF (*ssdt_pciqxl_end - *ssdt_pciqxl_start) #define ACPI_PCIQXL_AML (ssdp_pcihp_aml + *ssdt_pciqxl_start) +#include hw/i386/ssdt-mem.hex + +/* 0x5B 0x82 DeviceOp PkgLength NameString DimmID */ +#define ACPI_MEM_OFFSET_HEX (*ssdt_mem_name - *ssdt_mem_start + 2) +#define ACPI_MEM_OFFSET_ID (*ssdt_mem_id - *ssdt_mem_start + 7) +#define ACPI_MEM_SIZEOF (*ssdt_mem_end - *ssdt_mem_start) +#define ACPI_MEM_AML (ssdm_mem_aml + *ssdt_mem_start) + #define ACPI_SSDT_SIGNATURE 0x54445353 /* SSDT */ #define ACPI_SSDT_HEADER_LENGTH 36 @@ -999,6 +1008,8 @@ build_ssdt(GArray *table_data, GArray *linker, AcpiCpuInfo *cpu, AcpiPmInfo *pm, AcpiMiscInfo *misc, PcPciInfo *pci, PcGuestInfo *guest_info) { +MachineState *machine = MACHINE(qdev_get_machine()); +uint32_t nr_mem = machine-init_args.ram_slots; unsigned acpi_cpus = guest_info-apic_id_limit; int ssdt_start = table_data-len; uint8_t *ssdt_ptr; @@ -1027,6 +1038,9 @@ build_ssdt(GArray *table_data, GArray *linker, ACPI_BUILD_SET_LE(ssdt_ptr, sizeof(ssdp_misc_aml), ssdt_isa_pest[0], 16, misc-pvpanic_port); +ACPI_BUILD_SET_LE(ssdt_ptr, sizeof(ssdp_misc_aml), + ssdt_mctrl_nr_slots[0], 32, nr_mem); + { GArray *sb_scope = build_alloc_array(); uint8_t op = 0x10; /* ScopeOp */ @@ -1080,6 +1094,29 @@ build_ssdt(GArray *table_data, GArray *linker, build_free_array(package); } +if (nr_mem) { +/* + * current device naming scheme dosen't support + * more that 256 memory devices + */ +assert(nr_mem = 256); +/* build memory devices */ +for (i = 0; i nr_mem; i++) { +char id[3]; +uint8_t *mem = acpi_data_push(sb_scope, ACPI_MEM_SIZEOF); + +snprintf(id, sizeof(id), %02X, i); +memcpy(mem, ACPI_MEM_AML, ACPI_MEM_SIZEOF); +memcpy(mem + ACPI_MEM_OFFSET_HEX, id, 2); +memcpy(mem + ACPI_MEM_OFFSET_ID, id, 2); +} + +/* build Method(MTFY, 2) { + * If (LEqual(Arg0, 0x00)) {Notify(MP00, Arg1)} ... + */ +build_append_notify_method(sb_scope, MTFY, MP%0.02X, nr_mem); +} + { AcpiBuildPciBusHotplugState hotplug_state; Object *pci_host; diff --git a/hw/i386/ssdt-mem.dsl b/hw/i386/ssdt-mem.dsl new file mode 100644 index 000..7f68750 --- /dev/null +++ b/hw/i386/ssdt-mem.dsl @@ -0,0 +1,75 @@ +/* + * Memory hotplug ACPI DSDT static objects definitions + * + * Copyright ProfitBricks GmbH 2012 + * Copyright (C) 2013 Red Hat Inc + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see
[Qemu-devel] [PATCH 34/35] pc: ACPI BIOS: make GPE.3 handle memory hotplug event on PIIX and Q35 machines
also make handler edge based to avoid loosing events, the same as it has been done for PCI and CPU hotplug handlers. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/i386/acpi-dsdt.dsl | 5 - hw/i386/q35-acpi-dsdt.dsl | 5 - 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/hw/i386/acpi-dsdt.dsl b/hw/i386/acpi-dsdt.dsl index b846195..d3fb533 100644 --- a/hw/i386/acpi-dsdt.dsl +++ b/hw/i386/acpi-dsdt.dsl @@ -322,6 +322,7 @@ DefinitionBlock ( / * General purpose events / +External(\_SB.MHPD.MESC, MethodObj) Scope(\_GPE) { Name(_HID, ACPI0006) @@ -338,7 +339,9 @@ DefinitionBlock ( // CPU hotplug event \_SB.PRSC() } -Method(_L03) { +Method(_E03) { +// Memory hotplug event +\_SB.MHPD.MESC() } Method(_L04) { } diff --git a/hw/i386/q35-acpi-dsdt.dsl b/hw/i386/q35-acpi-dsdt.dsl index 766e96d..a55e34f 100644 --- a/hw/i386/q35-acpi-dsdt.dsl +++ b/hw/i386/q35-acpi-dsdt.dsl @@ -418,6 +418,7 @@ DefinitionBlock ( / * General purpose events / +External(\_SB.MHPD.MESC, MethodObj) Scope(\_GPE) { Name(_HID, ACPI0006) @@ -430,7 +431,9 @@ DefinitionBlock ( // CPU hotplug event \_SB.PRSC() } -Method(_L03) { +Method(_E03) { +// Memory hotplug event +\_SB.MHPD.MESC() } Method(_L04) { } -- 1.9.0
Re: [Qemu-devel] [PATCH 2/3] qemu-iotests: Remove CR line endings in reference output
On 04.04.2014 14:03, Kevin Wolf wrote: qemu doesn't print these CRs any more. The test still didn't fail because the output comparison ignores line endings, but the change turns up each time when you want to update the output. Signed-off-by: Kevin Wolf kw...@redhat.com --- tests/qemu-iotests/051.out | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) This patch doesn't work so well over email, but looking at 051.out itself, it seems correct. Reviewed-by: Max Reitz mre...@redhat.com
[Qemu-devel] [PATCH 27/35] pc: migrate piix4 ich9 MemHotplugState
Adds an optional vmstate field that allows to migrate current state of acpi_memory_hotplug of ACPI PM device. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/acpi/ich9.c | 8 hw/acpi/memory_hotplug.c | 27 +++ hw/acpi/piix4.c | 8 include/hw/acpi/memory_hotplug.h | 7 +++ 4 files changed, 50 insertions(+) diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c index 86c45ba..e566f7e 100644 --- a/hw/acpi/ich9.c +++ b/hw/acpi/ich9.c @@ -140,6 +140,12 @@ static int ich9_pm_post_load(void *opaque, int version_id) .offset = vmstate_offset_pointer(_state, _field, uint8_t), \ } +static bool vmstate_test_use_memhp(void *opaque, int version_id) +{ +ICH9LPCPMRegs *s = opaque; +return s-acpi_memory_hotplug.is_enabled; +} + const VMStateDescription vmstate_ich9_pm = { .name = ich9_pm, .version_id = 1, @@ -156,6 +162,8 @@ const VMStateDescription vmstate_ich9_pm = { VMSTATE_GPE_ARRAY(acpi_regs.gpe.en, ICH9LPCPMRegs), VMSTATE_UINT32(smi_en, ICH9LPCPMRegs), VMSTATE_UINT32(smi_sts, ICH9LPCPMRegs), +VMSTATE_MEMORY_HOTPLUG(acpi_memory_hotplug, + ICH9LPCPMRegs, vmstate_test_use_memhp), VMSTATE_END_OF_LIST() } }; diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c index a5c3ef4..7d33cd7 100644 --- a/hw/acpi/memory_hotplug.c +++ b/hw/acpi/memory_hotplug.c @@ -155,3 +155,30 @@ void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, acpi_update_sci(ar, irq); return; } + +static const VMStateDescription vmstate_memhp_sts = { +.name = memory hotplug device state, +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField[]) { +VMSTATE_BOOL(is_enabled, MemStatus), +VMSTATE_BOOL(is_inserting, MemStatus), +VMSTATE_UINT32(ost_event, MemStatus), +VMSTATE_UINT32(ost_status, MemStatus), +VMSTATE_END_OF_LIST() +} +}; + +const VMStateDescription vmstate_memory_hotplug = { +.name = memory hotplug state, +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField[]) { +VMSTATE_UINT32(selector, MemHotplugState), +VMSTATE_STRUCT_VARRAY_POINTER_UINT32(devs, MemHotplugState, dev_count, + vmstate_memhp_sts, MemStatus), +VMSTATE_END_OF_LIST() +} +}; diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index 056a7bc..3028eb6 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -250,6 +250,12 @@ static bool vmstate_test_no_use_acpi_pci_hotplug(void *opaque, int version_id) return !s-use_acpi_pci_hotplug; } +static bool vmstate_test_use_memhp(void *opaque, int version_id) +{ +PIIX4PMState *s = opaque; +return s-acpi_memory_hotplug.is_enabled; +} + /* qemu-kvm 1.2 uses version 3 but advertised as 2 * To support incoming qemu-kvm 1.2 migration, change version_id * and minimum_version_id to 2 below (which breaks migration from @@ -280,6 +286,8 @@ static const VMStateDescription vmstate_acpi = { struct AcpiPciHpPciStatus), VMSTATE_PCI_HOTPLUG(acpi_pci_hotplug, PIIX4PMState, vmstate_test_use_acpi_pci_hotplug), +VMSTATE_MEMORY_HOTPLUG(acpi_memory_hotplug, + PIIX4PMState, vmstate_test_use_memhp), VMSTATE_END_OF_LIST() } }; diff --git a/include/hw/acpi/memory_hotplug.h b/include/hw/acpi/memory_hotplug.h index 912c53f..7251dd0 100644 --- a/include/hw/acpi/memory_hotplug.h +++ b/include/hw/acpi/memory_hotplug.h @@ -3,6 +3,7 @@ #include hw/qdev-core.h #include hw/acpi/acpi.h +#include migration/vmstate.h #define ACPI_MEMORY_HOTPLUG_STATUS 8 @@ -27,4 +28,10 @@ void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner, void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st, DeviceState *dev, Error **errp); + +extern const VMStateDescription vmstate_memory_hotplug; +#define VMSTATE_MEMORY_HOTPLUG(memhp, state, test_memhp_fn) \ +VMSTATE_STRUCT_TEST(memhp, state, test_memhp_fn, 1, \ +vmstate_memory_hotplug, MemHotplugState) + #endif -- 1.9.0
Re: [Qemu-devel] [PATCH 1/3] block: Don't parse 'filename' option
On 04.04.2014 14:03, Kevin Wolf wrote: When using the QDict option 'filename', it is supposed to be interpreted literally. The code did correctly avoid guessing the protocol from any string before the first colon, but it still called bdrv_parse_filename() which would, for example, incorrectly remove a 'file:' prefix in the raw-posix driver. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c| 8 tests/qemu-iotests/051 | 4 tests/qemu-iotests/051.out | 11 +++ 3 files changed, 19 insertions(+), 4 deletions(-) Reviewed-by: Max Reitz mre...@redhat.com