[Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
The goal is to sleep qemu whenever the guest clock is in advance compared to the host clock (we use the monotonic clocks). The amount of time to sleep is calculated in the execution loop in cpu_exec. At first, we tried to approximate at each for loop the real time elapsed while searching for a TB (generating or retrieving from cache) and executing it. We would then approximate the virtual time corresponding to the number of virtual instructions executed. The difference between these 2 values would allow us to know if the guest is in advance or delayed. However, the function used for measuring the real time (qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive. We had an added overhead of 13% of the total run time. Therefore, we modified the algorithm and only take into account the difference between the 2 clocks at the begining of the cpu_exec function. During the for loop we try to reduce the advance of the guest only by computing the virtual time elapsed and sleeping if necessary. The overhead is thus reduced to 3%. Even though this method still has a noticeable overhead, it no longer is a bottleneck in trying to achieve a better guest frequency for which the guest clock is faster than the host one. As for the the alignement of the 2 clocks, with the first algorithm the guest clock was oscillating between -1 and 1ms compared to the host clock. Using the second algorithm we notice that the guest is 5ms behind the host, which is still acceptable for our use case. The tests where conducted using fio and stress. The host machine in an i5 CPU at 3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm versatile-pb built with buildroot. Currently, on our test machine, the lowest icount we can achieve that is suitable for aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) are slower than the cpu tests (using stress). Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr Tested-by: Camille Bégué camille.be...@openwide.fr Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- cpu-exec.c | 91 cpus.c | 17 ++ include/qemu/timer.h | 1 + 3 files changed, 109 insertions(+) diff --git a/cpu-exec.c b/cpu-exec.c index 38e5f02..1a725b6 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -22,6 +22,84 @@ #include tcg.h #include qemu/atomic.h #include sysemu/qtest.h +#include qemu/timer.h + +/* -icount align implementation. */ + +typedef struct SyncClocks { +int64_t diff_clk; +int64_t original_instr_counter; +} SyncClocks; + +#if !defined(CONFIG_USER_ONLY) +/* Allow the guest to have a max 3ms advance. + * The difference between the 2 clocks could therefore + * oscillate around 0. + */ +#define VM_CLOCK_ADVANCE 300 + +static int64_t delay_host(int64_t diff_clk) +{ +if (diff_clk VM_CLOCK_ADVANCE) { +#ifndef _WIN32 +struct timespec sleep_delay, rem_delay; +sleep_delay.tv_sec = diff_clk / 10LL; +sleep_delay.tv_nsec = diff_clk % 10LL; +if (nanosleep(sleep_delay, rem_delay) 0) { +diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 10LL; +diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec; +} else { +diff_clk = 0; +} +#else +Sleep(diff_clk / SCALE_MS); +diff_clk = 0; +#endif +} +return diff_clk; +} + +static int64_t instr_to_vtime(int64_t instr_counter, const CPUState *cpu) +{ +int64_t instr_exec_time; +instr_exec_time = instr_counter - + (cpu-icount_extra + + cpu-icount_decr.u16.low); +instr_exec_time = instr_exec_time icount_time_shift; + +return instr_exec_time; +} + +static void align_clocks(SyncClocks *sc, const CPUState *cpu) +{ +if (!icount_align_option) { +return; +} +sc-diff_clk += instr_to_vtime(sc-original_instr_counter, cpu); +sc-original_instr_counter = cpu-icount_extra + cpu-icount_decr.u16.low; +sc-diff_clk = delay_host(sc-diff_clk); +} + +static void init_delay_params(SyncClocks *sc, + const CPUState *cpu) +{ +if (!icount_align_option) { +return; +} +sc-diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - + qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + + cpu_get_clock_offset(); +sc-original_instr_counter = cpu-icount_extra + cpu-icount_decr.u16.low; +} +#else +static void align_clocks(SyncClocks *sc, const CPUState *cpu) +{ +} + +static void init_delay_params(SyncClocks *sc, const CPUState *cpu) +{ +} +#endif /* CONFIG USER ONLY */ void cpu_loop_exit(CPUState *cpu) { @@ -227,6 +305,8 @@ int cpu_exec(CPUArchState *env) TranslationBlock *tb; uint8_t *tc_ptr; uintptr_t next_tb; +SyncClocks sc; + /* This must be volatile so it is not trashed by longjmp() */ volatile bool have_tb_lock = false; @@ -283,6 +363,13
[Qemu-devel] [PATCH V5 6/6] monitor: Add drift info to 'info jit'
Show in 'info jit' the current delay between the host clock and the guest clock. In addition, print the maximum advance and delay of the guest compared to the host. Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr Tested-by: Camille Bégué camille.be...@openwide.fr --- cpu-exec.c| 6 ++ cpus.c| 15 +++ include/qemu-common.h | 4 monitor.c | 1 + 4 files changed, 26 insertions(+) diff --git a/cpu-exec.c b/cpu-exec.c index 7259c94..533e0bf 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -117,6 +117,12 @@ static void init_delay_params(SyncClocks *sc, sc-realtime_clock + cpu_get_clock_offset(); sc-original_instr_counter = cpu-icount_extra + cpu-icount_decr.u16.low; +if (sc-diff_clk max_delay) { +max_delay = sc-diff_clk; +} +if (sc-diff_clk max_advance) { +max_advance = sc-diff_clk; +} /* Print every 2s max if the guest is late. We limit the number of printed messages to NB_PRINT_MAX(currently 100) */ print_delay(sc); diff --git a/cpus.c b/cpus.c index fcc5308..c98036e 100644 --- a/cpus.c +++ b/cpus.c @@ -64,6 +64,8 @@ #endif /* CONFIG_LINUX */ static CPUState *next_cpu; +int64_t max_delay; +int64_t max_advance; bool cpu_is_stopped(CPUState *cpu) { @@ -1521,3 +1523,16 @@ void qmp_inject_nmi(Error **errp) error_set(errp, QERR_UNSUPPORTED); #endif } + +void dump_drift_info(FILE *f, fprintf_function cpu_fprintf) +{ +cpu_fprintf(f, Host - Guest clock %ld(ms)\n, +(cpu_get_clock() - cpu_get_icount())/SCALE_MS); +if (icount_align_option) { +cpu_fprintf(f, Max guest delay %ld(ms)\n, -max_delay/SCALE_MS); +cpu_fprintf(f, Max guest advance %ld(ms)\n, max_advance/SCALE_MS); +} else { +cpu_fprintf(f, Max guest delay NA\n); +cpu_fprintf(f, Max guest advance NA\n); +} +} diff --git a/include/qemu-common.h b/include/qemu-common.h index d47aa02..55971df 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -110,6 +110,10 @@ void configure_icount(QemuOpts *opts, Error **errp); extern int use_icount; extern int icount_align_option; extern int icount_time_shift; +/* drift information for info jit command */ +extern int64_t max_delay; +extern int64_t max_advance; +void dump_drift_info(FILE *f, fprintf_function cpu_fprintf); #include qemu/osdep.h #include qemu/bswap.h diff --git a/monitor.c b/monitor.c index 5bc70a6..cdbaa60 100644 --- a/monitor.c +++ b/monitor.c @@ -1047,6 +1047,7 @@ static void do_info_registers(Monitor *mon, const QDict *qdict) static void do_info_jit(Monitor *mon, const QDict *qdict) { dump_exec_info((FILE *)mon, monitor_fprintf); +dump_drift_info((FILE *)mon, monitor_fprintf); } static void do_info_history(Monitor *mon, const QDict *qdict) -- 2.0.0.rc2
[Qemu-devel] [PATCH V5 5/6] cpu_exec: Print to console if the guest is late
If the align option is enabled, we print to the user whenever the guest clock is behind the host clock in order for he/she to have a hint about the actual performance. The maximum print interval is 2s and we limit the number of messages to 100. If desired, this can be changed in cpu-exec.c Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr Tested-by: Camille Bégué camille.be...@openwide.fr Conflicts: cpu-exec.c --- cpu-exec.c | 32 +++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/cpu-exec.c b/cpu-exec.c index 1a725b6..7259c94 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -29,6 +29,7 @@ typedef struct SyncClocks { int64_t diff_clk; int64_t original_instr_counter; +int64_t realtime_clock; } SyncClocks; #if !defined(CONFIG_USER_ONLY) @@ -37,6 +38,9 @@ typedef struct SyncClocks { * oscillate around 0. */ #define VM_CLOCK_ADVANCE 300 +#define THRESHOLD_REDUCE 1.5 +#define MAX_DELAY_PRINT_RATE 2 +#define MAX_NB_PRINTS 100 static int64_t delay_host(int64_t diff_clk) { @@ -80,16 +84,42 @@ static void align_clocks(SyncClocks *sc, const CPUState *cpu) sc-diff_clk = delay_host(sc-diff_clk); } +static void print_delay(const SyncClocks *sc) +{ +static float threshold_delay; +static int64_t last_realtime_clock; +static int nb_prints; + +if (icount_align_option +(sc-realtime_clock - last_realtime_clock) / 10LL += MAX_DELAY_PRINT_RATE nb_prints MAX_NB_PRINTS) { +if ((-sc-diff_clk / (float)10LL threshold_delay) || +(-sc-diff_clk / (float)10LL + (threshold_delay - THRESHOLD_REDUCE))) { +threshold_delay = (-sc-diff_clk / 10LL) + 1; +printf(Warning: The guest is now late by %.1f to %.1f seconds\n, + threshold_delay - 1, + threshold_delay); +nb_prints++; +last_realtime_clock = sc-realtime_clock; +} +} +} + static void init_delay_params(SyncClocks *sc, const CPUState *cpu) { if (!icount_align_option) { return; } +sc-realtime_clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); sc-diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - - qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + + sc-realtime_clock + cpu_get_clock_offset(); sc-original_instr_counter = cpu-icount_extra + cpu-icount_decr.u16.low; +/* Print every 2s max if the guest is late. We limit the number + of printed messages to NB_PRINT_MAX(currently 100) */ +print_delay(sc); } #else static void align_clocks(SyncClocks *sc, const CPUState *cpu) -- 2.0.0.rc2
Re: [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks
Am 25.07.2014 11:56, schrieb Sebastian Tanase: Sebastian Tanase (6): icount: Add QemuOpts for icount icount: Add align option to icount icount: Make icount_time_shift available everywhere cpu_exec: Add sleeping algorithm cpu_exec: Print to console if the guest is late FWIW the file name is cpu-exec, please fix if you need to resend. Thanks, Andreas monitor: Add drift info to 'info jit' cpu-exec.c| 127 ++ cpus.c| 59 +-- include/qemu-common.h | 9 +++- include/qemu/timer.h | 1 + monitor.c | 1 + qemu-options.hx | 17 +-- qtest.c | 13 +- vl.c | 39 +--- 8 files changed, 248 insertions(+), 18 deletions(-) -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
Il 25/07/2014 11:56, Sebastian Tanase ha scritto: The goal is to sleep qemu whenever the guest clock is in advance compared to the host clock (we use the monotonic clocks). The amount of time to sleep is calculated in the execution loop in cpu_exec. At first, we tried to approximate at each for loop the real time elapsed while searching for a TB (generating or retrieving from cache) and executing it. We would then approximate the virtual time corresponding to the number of virtual instructions executed. The difference between these 2 values would allow us to know if the guest is in advance or delayed. However, the function used for measuring the real time (qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive. We had an added overhead of 13% of the total run time. Therefore, we modified the algorithm and only take into account the difference between the 2 clocks at the begining of the cpu_exec function. During the for loop we try to reduce the advance of the guest only by computing the virtual time elapsed and sleeping if necessary. The overhead is thus reduced to 3%. Even though this method still has a noticeable overhead, it no longer is a bottleneck in trying to achieve a better guest frequency for which the guest clock is faster than the host one. As for the the alignement of the 2 clocks, with the first algorithm the guest clock was oscillating between -1 and 1ms compared to the host clock. Using the second algorithm we notice that the guest is 5ms behind the host, which is still acceptable for our use case. The tests where conducted using fio and stress. The host machine in an i5 CPU at 3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm versatile-pb built with buildroot. Currently, on our test machine, the lowest icount we can achieve that is suitable for aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) are slower than the cpu tests (using stress). Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr Tested-by: Camille Bégué camille.be...@openwide.fr Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- cpu-exec.c | 91 cpus.c | 17 ++ include/qemu/timer.h | 1 + 3 files changed, 109 insertions(+) diff --git a/cpu-exec.c b/cpu-exec.c index 38e5f02..1a725b6 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -22,6 +22,84 @@ #include tcg.h #include qemu/atomic.h #include sysemu/qtest.h +#include qemu/timer.h + +/* -icount align implementation. */ + +typedef struct SyncClocks { +int64_t diff_clk; +int64_t original_instr_counter; +} SyncClocks; + +#if !defined(CONFIG_USER_ONLY) +/* Allow the guest to have a max 3ms advance. + * The difference between the 2 clocks could therefore + * oscillate around 0. + */ +#define VM_CLOCK_ADVANCE 300 + +static int64_t delay_host(int64_t diff_clk) +{ +if (diff_clk VM_CLOCK_ADVANCE) { +#ifndef _WIN32 +struct timespec sleep_delay, rem_delay; +sleep_delay.tv_sec = diff_clk / 10LL; +sleep_delay.tv_nsec = diff_clk % 10LL; +if (nanosleep(sleep_delay, rem_delay) 0) { +diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 10LL; +diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec; +} else { +diff_clk = 0; +} +#else +Sleep(diff_clk / SCALE_MS); +diff_clk = 0; +#endif +} +return diff_clk; +} + +static int64_t instr_to_vtime(int64_t instr_counter, const CPUState *cpu) +{ +int64_t instr_exec_time; +instr_exec_time = instr_counter - + (cpu-icount_extra + + cpu-icount_decr.u16.low); +instr_exec_time = instr_exec_time icount_time_shift; + +return instr_exec_time; +} + +static void align_clocks(SyncClocks *sc, const CPUState *cpu) +{ +if (!icount_align_option) { +return; +} +sc-diff_clk += instr_to_vtime(sc-original_instr_counter, cpu); +sc-original_instr_counter = cpu-icount_extra + cpu-icount_decr.u16.low; +sc-diff_clk = delay_host(sc-diff_clk); +} Just two comments: 1) perhaps s/original/last/ in original_instr_counter? 2) I think I prefer this to be written like: instr_counter = cpu-icount_extra + cpu-icount_decr.u16.low; instr_exec_time = sc-original_instr_counter - instr_counter; sc-original_instr_counter = instr_counter sc-diff_clk += instr_exec_time icount_time_shift; sc-diff_clk = delay_host(sc-diff_clk); If you agree, I can do it when applying the patches. Thanks for your persistence, I'm very happy with this version! As a follow up, do you think it's possible to modify the places where you run align_clocks, so that you sleep with the iothread mutex *not* taken? Paolo
Re: [Qemu-devel] [PATCH 2/3] target-arm: A64: fix TLB flush instructions
Peter Maydell writes: On 24 July 2014 16:52, Alex Bennée alex.ben...@linaro.org wrote: +/* See: D4.7.2 TLB maintenance requirements and the TLB maintenance instructions + * Page D4-1736 (DDI0487A.b) For TLB maintenance instructions that + * take an address, the maintenance of VA[63:56] is interpreted as + * being the same as the maintenance of VA[55] + */ I'd rather we didn't quote this bit of the ARM ARM, because it's obviously mangled (I'm pretty sure it should say the value of VA[..]). Is it OK to still reference the ARM ARM because otherwise the sign extension would look a little weird without context (although obviously we have a commit message to say we fixed something). Otherwise Reviewed-by: Peter Maydell peter.mayd...@linaro.org thanks -- PMM -- Alex Bennée
Re: [Qemu-devel] [questions] about qemu log
Zhang Haoyu writes: Hi, all If I use qemu command directly to run vm, bypass libvirt, how to configure qemu to assure that each vm has its own log file, like vmname.log? For example, VM: rhel7-net has its own log file, rhel7-net.log, VM:rhel7-stor has its own log file, rhel7-stor.log. -D /path/to/unique/file/name.log Or am I misunderstanding what you want? -- Alex Bennée
[Qemu-devel] [PATCH] target-i386/fpu_helper.c: fbld instruction doesn't set minus sign
Obviously, there is a misprint in function implementation. From: Dmitry Poletaev poletaev-q...@yandex.ru Signed-off-by: Dmitry Poletaev poletaev-q...@yandex.ru --- target-i386/fpu_helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target-i386/fpu_helper.c b/target-i386/fpu_helper.c index 1b2900d..be1e545 100644 --- a/target-i386/fpu_helper.c +++ b/target-i386/fpu_helper.c @@ -622,7 +622,7 @@ void helper_fbld_ST0(CPUX86State *env, target_ulong ptr) } tmp = int64_to_floatx80(val, env-fp_status); if (cpu_ldub_data(env, ptr + 9) 0x80) { -floatx80_chs(tmp); +tmp = floatx80_chs(tmp); } fpush(env); ST0 = tmp; -- 1.8.4.msysgit.0
Re: [Qemu-devel] [questions] about qemu log
Hi, all If I use qemu command directly to run vm, bypass libvirt, how to configure qemu to assure that each vm has its own log file, like vmname.log? For example, VM: rhel7-net has its own log file, rhel7-net.log, VM:rhel7-stor has its own log file, rhel7-stor.log. -D /path/to/unique/file/name.log -D option is to configure qemu_logfile for the output logs which are controlled by qmp command logfile, which can be enabled/disabled on run time. I want to configure the log file for the output of fprintf(stderr, fmt, ...), .etc, i.e., how to redirect the output of fprintf(stderr, fmt, ...), or some other log-interface to a specified file? I saw the configuration in qemuStateInitialize(), libvirt code, but now I run the vm directly through qemu command, bypass libvirt. Thanks, Zhang Haoyu Or am I misunderstanding what you want?
[Qemu-devel] [PATCH RFC 0/3] dataplane: dataplane: more graceful error handling
Currently, qemu will take a hard exit if it fails to set up guest or host notifiers, giving no real clue as to what went wrong (e.g., when out of file descriptors). This patchset tries to make this more manageable: Both by improving the error message and by gracefully falling back to non-dataplane in case of errors. Patches are also available on git://github.com/cohuck/qemu dataplane-graceful-fail Thoughts? Cornelia Huck (3): dataplane: print why starting failed dataplane: fail notifier setting gracefully dataplane: stop trying on notifier error hw/block/dataplane/virtio-blk.c | 39 +-- 1 file changed, 29 insertions(+), 10 deletions(-) -- 1.7.9.5
[Qemu-devel] [PATCH RFC 1/3] dataplane: print why starting failed
Setting up guest or host notifiers may fail, but the user will have no idea why: Let's print the error returned by the callback. Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/block/dataplane/virtio-blk.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index d6ba65c..527a53c 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -218,6 +218,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); VirtIOBlock *vblk = VIRTIO_BLK(s-vdev); VirtQueue *vq; +int r; if (s-started) { return; @@ -236,16 +237,18 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) } /* Set up guest notifier (irq) */ -if (k-set_guest_notifiers(qbus-parent, 1, true) != 0) { -fprintf(stderr, virtio-blk failed to set guest notifier, -ensure -enable-kvm is set\n); +r = k-set_guest_notifiers(qbus-parent, 1, true); +if (r != 0) { +fprintf(stderr, virtio-blk failed to set guest notifier (%d), +ensure -enable-kvm is set\n, r); exit(1); } s-guest_notifier = virtio_queue_get_guest_notifier(vq); /* Set up virtqueue notify */ -if (k-set_host_notifier(qbus-parent, 0, true) != 0) { -fprintf(stderr, virtio-blk failed to set host notifier\n); +r = k-set_host_notifier(qbus-parent, 0, true); +if (r != 0) { +fprintf(stderr, virtio-blk failed to set host notifier (%d)\n, r); exit(1); } s-host_notifier = *virtio_queue_get_host_notifier(vq); -- 1.7.9.5
[Qemu-devel] [PATCH RFC 2/3] dataplane: fail notifier setting gracefully
The dataplane code is currently doing a hard exit if it fails to set up either guest or host notifiers. In practice, this may mean that a guest suddenly dies after a dataplane device failed to come up (e.g., when a file descriptor limit is hit for tne nth device). Let's just try to unwind the setup instead and return. Acked-by: Christian Borntraeger borntrae...@de.ibm.com Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com --- hw/block/dataplane/virtio-blk.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 527a53c..94e1a29 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -232,8 +232,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) vq = virtio_get_queue(s-vdev, 0); if (!vring_setup(s-vring, s-vdev, 0)) { -s-starting = false; -return; +goto fail_vring; } /* Set up guest notifier (irq) */ @@ -241,7 +240,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) if (r != 0) { fprintf(stderr, virtio-blk failed to set guest notifier (%d), ensure -enable-kvm is set\n, r); -exit(1); +goto fail_guest_notifiers; } s-guest_notifier = virtio_queue_get_guest_notifier(vq); @@ -249,7 +248,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) r = k-set_host_notifier(qbus-parent, 0, true); if (r != 0) { fprintf(stderr, virtio-blk failed to set host notifier (%d)\n, r); -exit(1); +goto fail_host_notifier; } s-host_notifier = *virtio_queue_get_host_notifier(vq); @@ -269,6 +268,14 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) aio_context_acquire(s-ctx); aio_set_event_notifier(s-ctx, s-host_notifier, handle_notify); aio_context_release(s-ctx); +return; + + fail_host_notifier: +k-set_guest_notifiers(qbus-parent, 1, false); + fail_guest_notifiers: +vring_teardown(s-vring, s-vdev, 0); + fail_vring: +s-starting = false; } /* Context: QEMU global mutex held */ -- 1.7.9.5
[Qemu-devel] [PULL for-2.1 0/3] Last minute patches
The following changes since commit f368c33d5ab09dd5656924185cd975b11838cd25: Update version for v2.1.0-rc3 release (2014-07-22 18:17:03 +0100) are available in the git repository at: git://github.com/bonzini/qemu.git tags/for-upstream for you to fetch changes up to d975e28437377bc9a65fb5f4f9486a74c7a124cc: qemu-char: ignore flow control if a PTY's slave is not connected (2014-07-24 18:31:50 +0200) Here are my patches for the last 2.1 rc. Paolo Bonzini (3): acpi-dsdt: procedurally generate _PRT pc: hack for migration compatibility from QEMU 2.0 qemu-char: ignore flow control if a PTY's slave is not connected hw/i386/acpi-build.c| 75 +- hw/i386/acpi-dsdt.dsl | 90 +- hw/i386/acpi-dsdt.hex.generated | 1910 +++ hw/i386/pc_piix.c | 19 + hw/i386/pc_q35.c|5 + include/hw/i386/pc.h|1 + qemu-char.c |7 + 7 files changed, 250 insertions(+), 1857 deletions(-) -- 1.8.3.1
[Qemu-devel] [PULL 2/3] pc: hack for migration compatibility from QEMU 2.0
Changing the ACPI table size causes migration to break, and the memory hotplug work opened our eyes on how horribly we were breaking things in 2.0 already. The ACPI table size is rounded to the next 4k, which one would think gives some headroom. In practice this is not the case, because the user can control the ACPI table size (each CPU adds 97 bytes to the SSDT and 8 to the MADT) and so some -smp values will break the 4k boundary and fail to migrate. Similarly, PCI bridges add ~1870 bytes to the SSDT. To fix this, hard-code 64k as the maximum ACPI table size, which (despite being an order of magnitude smaller than 640k) should be enough for everyone. To fix migration from QEMU 2.0, compute the payload size of QEMU 2.0 and always use that one. The previous patch shrunk the ACPI tables enough that the QEMU 2.0 size should always be enough; non-AML tables can change depending on the configuration (especially MADT, SRAT, HPET) but they remain the same between QEMU 2.0 and 2.1, so we only compute our padding based on the sizes of the SSDT and DSDT. Migration from QEMU 1.7 should work for guests that have a number of CPUs other than 12, 13, 14, 54, 55, 56, 97, 98, 139, 140. It was already broken from QEMU 1.7 to QEMU 2.0 in the same way, though. Even with this patch, QEMU 1.7 and 2.0 have two different ideas of -M pc-i440fx-2.0 when there are PCI bridges. Igor sent a patch to adopt the QEMU 1.7 definition. I think distributions should apply it if they move directly from QEMU 1.7 to 2.1+ without ever packaging version 2.0. Reviewed-by: Laszlo Ersek ler...@redhat.com Tested-by: Igor Mammedov imamm...@redhat.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- hw/i386/acpi-build.c | 75 hw/i386/pc_piix.c| 19 + hw/i386/pc_q35.c | 5 include/hw/i386/pc.h | 1 + 4 files changed, 95 insertions(+), 5 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index ebc5f03..7c6a7e3 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -25,7 +25,9 @@ #include glib.h #include qemu-common.h #include qemu/bitmap.h +#include qemu/osdep.h #include qemu/range.h +#include qemu/error-report.h #include hw/pci/pci.h #include qom/cpu.h #include hw/i386/pc.h @@ -52,6 +54,17 @@ #include qapi/qmp/qint.h #include qom/qom-qobject.h +/* These are used to size the ACPI tables for -M pc-i440fx-1.7 and + * -M pc-i440fx-2.0. Even if the actual amount of AML generated grows + * a little bit, there should be plenty of free space since the DSDT + * shrunk by ~1.5k between QEMU 2.0 and QEMU 2.1. + */ +#define ACPI_BUILD_CPU_AML_SIZE97 +#define ACPI_BUILD_BRIDGE_AML_SIZE 1875 +#define ACPI_BUILD_ALIGN_SIZE 0x1000 + +#define ACPI_BUILD_TABLE_SIZE 0x1 + typedef struct AcpiCpuInfo { DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT); } AcpiCpuInfo; @@ -87,6 +100,8 @@ typedef struct AcpiBuildPciBusHotplugState { struct AcpiBuildPciBusHotplugState *parent; } AcpiBuildPciBusHotplugState; +unsigned bsel_alloc; + static void acpi_get_dsdt(AcpiMiscInfo *info) { uint16_t *applesmc_sta; @@ -759,8 +774,8 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque) static void acpi_set_pci_info(void) { PCIBus *bus = find_i440fx(); /* TODO: Q35 support */ -unsigned bsel_alloc = 0; +assert(bsel_alloc == 0); if (bus) { /* Scan all PCI buses. Set property to enable acpi based hotplug. */ pci_for_each_bus_depth_first(bus, acpi_set_bsel, NULL, bsel_alloc); @@ -1440,13 +1455,14 @@ static void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables) { GArray *table_offsets; -unsigned facs, dsdt, rsdt; +unsigned facs, ssdt, dsdt, rsdt; AcpiCpuInfo cpu; AcpiPmInfo pm; AcpiMiscInfo misc; AcpiMcfgInfo mcfg; PcPciInfo pci; uint8_t *u; +size_t aml_len = 0; acpi_get_cpu_info(cpu); acpi_get_pm_info(pm); @@ -1474,13 +1490,20 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables) dsdt = tables-table_data-len; build_dsdt(tables-table_data, tables-linker, misc); +/* Count the size of the DSDT and SSDT, we will need it for legacy + * sizing of ACPI tables. + */ +aml_len += tables-table_data-len - dsdt; + /* ACPI tables pointed to by RSDT */ acpi_add_table(table_offsets, tables-table_data); build_fadt(tables-table_data, tables-linker, pm, facs, dsdt); +ssdt = tables-table_data-len; acpi_add_table(table_offsets, tables-table_data); build_ssdt(tables-table_data, tables-linker, cpu, pm, misc, pci, guest_info); +aml_len += tables-table_data-len - ssdt; acpi_add_table(table_offsets, tables-table_data); build_madt(tables-table_data, tables-linker, cpu, guest_info); @@ -1513,14 +1536,56 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables) /* RSDP is in FSEG memory, so allocate it separately
[Qemu-devel] [PULL 1/3] acpi-dsdt: procedurally generate _PRT
This replaces the _PRT constant with a method that computes it. The problem is that the DSDT+SSDT have grown from 2.0 to 2.1, enough to cross the 8k barrier (we align the ACPI tables to 4k before putting them in fw_cfg). This causes problems with migration and the pc-i440fx-2.0 machine type. The solution to the problem is to hardcode 64k as the limit, but this doesn't solve the bug with pc-i440fx-2.0. The fix will be for QEMU 2.1 to use exactly the same size as QEMU 2.0 for the ACPI tables. First, however, we must make the actual AML equal or smaller; to do this, rewrite _PRT in a way that saves over 1k of bytecode. Reviewed-by: Laszlo Ersek ler...@redhat.com Tested-by: Igor Mammedov imamm...@redhat.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- hw/i386/acpi-dsdt.dsl | 90 +- hw/i386/acpi-dsdt.hex.generated | 1910 +++ 2 files changed, 148 insertions(+), 1852 deletions(-) diff --git a/hw/i386/acpi-dsdt.dsl b/hw/i386/acpi-dsdt.dsl index 3cc0ea0..6ba0170 100644 --- a/hw/i386/acpi-dsdt.dsl +++ b/hw/i386/acpi-dsdt.dsl @@ -181,57 +181,45 @@ DefinitionBlock ( Scope(\_SB) { Scope(PCI0) { -Name(_PRT, Package() { -/* PCI IRQ routing table, example from ACPI 2.0a specification, - section 6.2.8.1 */ -/* Note: we provide the same info as the PCI routing - table of the Bochs BIOS */ - -#define prt_slot(nr, lnk0, lnk1, lnk2, lnk3) \ -Package() { nr##, 0, lnk0, 0 }, \ -Package() { nr##, 1, lnk1, 0 }, \ -Package() { nr##, 2, lnk2, 0 }, \ -Package() { nr##, 3, lnk3, 0 } - -#define prt_slot0(nr) prt_slot(nr, LNKD, LNKA, LNKB, LNKC) -#define prt_slot1(nr) prt_slot(nr, LNKA, LNKB, LNKC, LNKD) -#define prt_slot2(nr) prt_slot(nr, LNKB, LNKC, LNKD, LNKA) -#define prt_slot3(nr) prt_slot(nr, LNKC, LNKD, LNKA, LNKB) - -prt_slot0(0x), -/* Device 1 is power mgmt device, and can only use irq 9 */ -prt_slot(0x0001, LNKS, LNKB, LNKC, LNKD), -prt_slot2(0x0002), -prt_slot3(0x0003), -prt_slot0(0x0004), -prt_slot1(0x0005), -prt_slot2(0x0006), -prt_slot3(0x0007), -prt_slot0(0x0008), -prt_slot1(0x0009), -prt_slot2(0x000a), -prt_slot3(0x000b), -prt_slot0(0x000c), -prt_slot1(0x000d), -prt_slot2(0x000e), -prt_slot3(0x000f), -prt_slot0(0x0010), -prt_slot1(0x0011), -prt_slot2(0x0012), -prt_slot3(0x0013), -prt_slot0(0x0014), -prt_slot1(0x0015), -prt_slot2(0x0016), -prt_slot3(0x0017), -prt_slot0(0x0018), -prt_slot1(0x0019), -prt_slot2(0x001a), -prt_slot3(0x001b), -prt_slot0(0x001c), -prt_slot1(0x001d), -prt_slot2(0x001e), -prt_slot3(0x001f), -}) +Method (_PRT, 0) { +Store(Package(128) {}, Local0) +Store(Zero, Local1) +While(LLess(Local1, 128)) { +// slot = pin 2 +Store(ShiftRight(Local1, 2), Local2) + +// lnk = (slot + pin) 3 +Store(And(Add(Local1, Local2), 3), Local3) +If (LEqual(Local3, 0)) { +Store(Package(4) { Zero, Zero, LNKD, Zero }, Local4) +} +If (LEqual(Local3, 1)) { +// device 1 is the power-management device, needs SCI +If (LEqual(Local1, 4)) { +Store(Package(4) { Zero, Zero, LNKS, Zero }, Local4) +} Else { +Store(Package(4) { Zero, Zero, LNKA, Zero }, Local4) +} +} +If (LEqual(Local3, 2)) { +Store(Package(4) { Zero, Zero, LNKB, Zero }, Local4) +} +If (LEqual(Local3, 3)) { +Store(Package(4) { Zero, Zero, LNKC, Zero }, Local4) +} + +// Complete the interrupt routing entry: +//Package(4) { 0x[slot], [pin], [link], 0) } + +Store(Or(ShiftLeft(Local2, 16), 0x), Index(Local4, 0)) +Store(And(Local1, 3),Index(Local4, 1)) +Store(Local4,Index(Local0, Local1)) + +Increment(Local1) +} + +Return(Local0) +} } Field(PCI0.ISA.P40C, ByteAcc, NoLock, Preserve) { diff --git
[Qemu-devel] [PULL 3/3] qemu-char: ignore flow control if a PTY's slave is not connected
After commit f702e62 (serial: change retry logic to avoid concurrency, 2014-07-11), guest boot hangs if the backend is an unconnected PTY. The reason is that PTYs do not support G_IO_HUP, and serial_xmit is never called. To fix this, simply invoke serial_xmit immediately (via g_idle_source_new) when this happens. Tested-by: Pavel Hrdina phrd...@redhat.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- qemu-char.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/qemu-char.c b/qemu-char.c index 7acc03f..956be49 100644 --- a/qemu-char.c +++ b/qemu-char.c @@ -1168,6 +1168,9 @@ static int pty_chr_write(CharDriverState *chr, const uint8_t *buf, int len) static GSource *pty_chr_add_watch(CharDriverState *chr, GIOCondition cond) { PtyCharDriver *s = chr-opaque; +if (!s-connected) { +return NULL; +} return g_io_create_watch(s-fd, cond); } @@ -3664,6 +3667,10 @@ int qemu_chr_fe_add_watch(CharDriverState *s, GIOCondition cond, } src = s-chr_add_watch(s, cond); +if (!src) { +return -EINVAL; +} + g_source_set_callback(src, (GSourceFunc)func, user_data, NULL); tag = g_source_attach(src, NULL); g_source_unref(src); -- 1.8.3.1
[Qemu-devel] [PULL v2 0/1] Fix for -serial pty regression
The following changes since commit f368c33d5ab09dd5656924185cd975b11838cd25: Update version for v2.1.0-rc3 release (2014-07-22 18:17:03 +0100) are available in the git repository at: git://github.com/bonzini/qemu.git tags/for-upstream for you to fetch changes up to 62c339c5272ce8fbe8ca52695cee8ff40da7872e: qemu-char: ignore flow control if a PTY's slave is not connected (2014-07-25 14:36:07 +0200) Here is the serial fix for 2.1. Paolo Bonzini (1): qemu-char: ignore flow control if a PTY's slave is not connected qemu-char.c | 7 +++ 1 file changed, 7 insertions(+) -- 1.8.3.1
[Qemu-devel] [PULL v2 1/1] qemu-char: ignore flow control if a PTY's slave is not connected
After commit f702e62 (serial: change retry logic to avoid concurrency, 2014-07-11), guest boot hangs if the backend is an unconnected PTY. The reason is that PTYs do not support G_IO_HUP, and serial_xmit is never called. To fix this, simply invoke serial_xmit immediately (via g_idle_source_new) when this happens. Tested-by: Pavel Hrdina phrd...@redhat.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- qemu-char.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/qemu-char.c b/qemu-char.c index 7acc03f..956be49 100644 --- a/qemu-char.c +++ b/qemu-char.c @@ -1168,6 +1168,9 @@ static int pty_chr_write(CharDriverState *chr, const uint8_t *buf, int len) static GSource *pty_chr_add_watch(CharDriverState *chr, GIOCondition cond) { PtyCharDriver *s = chr-opaque; +if (!s-connected) { +return NULL; +} return g_io_create_watch(s-fd, cond); } @@ -3664,6 +3667,10 @@ int qemu_chr_fe_add_watch(CharDriverState *s, GIOCondition cond, } src = s-chr_add_watch(s, cond); +if (!src) { +return -EINVAL; +} + g_source_set_callback(src, (GSourceFunc)func, user_data, NULL); tag = g_source_attach(src, NULL); g_source_unref(src); -- 1.8.3.1
Re: [Qemu-devel] [PATCH 1/4] block/parallels: extend parallels format header with actual data values
On 24/07/14 22:34, Jeff Cody wrote: On Tue, Jul 22, 2014 at 05:19:34PM +0400, Denis V. Lunev wrote: Parallels image format has several additional fields inside: - nb_sectors is actually 64 bit wide. Upper 32bits are not used for images with signature WithoutFreeSpace and must be explicitely s/explicitely/explicitly zeroed according to Parallels. They will be used for images with signature WithouFreSpacExt - inuse is magic which means that the image is currently opened for read/write or was not closed correctly, the magic is 0x746f6e59 - data_off is the location of the first data block. It can be zero and in this case I think you may have forgotten to finish this sentence :) ok This patch adds these values to struct parallels_header and adds proper handling of nb_sectors for currently supported WithoutFreeSpace images. WithouFreSpacExt will be covered in the next patch. Signed-off-by: Denis V. Lunev d...@openvz.org CC: Kevin Wolf kw...@redhat.com CC: Stefan Hajnoczi stefa...@redhat.com --- block/parallels.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/block/parallels.c b/block/parallels.c index 1a5bd35..c44df87 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -41,8 +41,10 @@ struct parallels_header { uint32_t cylinders; uint32_t tracks; uint32_t catalog_entries; -uint32_t nb_sectors; -char padding[24]; +uint64_t nb_sectors; +uint32_t inuse; +uint32_t data_off; +char padding[12]; } QEMU_PACKED; typedef struct BDRVParallelsState { @@ -90,7 +92,7 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } -bs-total_sectors = le32_to_cpu(ph.nb_sectors); +bs-total_sectors = (uint32_t)le64_to_cpu(ph.nb_sectors); I think an explicit bit mask on the upper 32 bits would fit better here than a cast, especially since neither 'bs-total_sectors' nor 'ph.nb_sectors' is a uint32_t. E.g.: bs-total_sectors = 0x le64_to_cpu(ph.nb_sectors); ok, will do
[Qemu-devel] [Bug 1307225] Re: Running a virtual machine on a Haswell system produces machine check events
I can confirm this. Using qemu-kvm for three virtual machines on Ubuntu 14.04 LTS using a Intel i7-4770 Haswell based server. dmesg: [63429.847437] mce: [Hardware Error]: Machine check events logged [65996.795630] mce: [Hardware Error]: Machine check events logged mcelog: Hardware event. This is not a software error. MCE 0 CPU 2 BANK 0 TIME 1406265172 Fri Jul 25 07:12:52 2014 MCG status: MCi status: Corrected error Error enabled MCA: Internal parity error STATUS 904f0005 MCGSTATUS 0 MCGCAP c09 APICID 4 SOCKETID 0 CPUID Vendor Intel Family 6 Model 60 It's the same error everytime, only APICID and CPU numbers are different. The mce errors did not happen until i migrated the virtual machines from another system, the haswell-server was up for three days without any incidents, now, while running qemu-kvm there is a mce error every 6-12 hours. After the first errors, i called the support of my server provider, they first exchanged RAM, upgraded BIOS... Then, they replaced the whole server, only swapping my harddisks to the new one. But even that didn't help, i still got MCE errors. The harddisks where replaced too, one at a time (to resync raid). Now, i have a completely swapped hardware, but the MCE errors are still popping up. system information attached -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1307225 Title: Running a virtual machine on a Haswell system produces machine check events Status in QEMU: New Bug description: I'm running a virtual Windows SBS 2003 installation on a Xeon E3 Haswell system running Gentoo Linux. First, I used Qemu 1.5.3 (the latest stable version on Gentoo). I got a lot of machine check events (mce: [Hardware Error]: Machine check events logged) in dmesg that always looked like (using mcelog): Hardware event. This is not a software error. MCE 0 CPU 3 BANK 0 TIME 1397455091 Mon Apr 14 07:58:11 2014 MCG status: MCi status: Corrected error Error enabled MCA: Internal parity error STATUS 904f0005 MCGSTATUS 0 MCGCAP c09 APICID 6 SOCKETID 0 CPUID Vendor Intel Family 6 Model 60 I found this discussion on the vmware community: https://communities.vmware.com/thread/452344 It seems that this is (at least partly) caused by the Qemu machine. I switched to Qemu 1.7.0, the first version to use pc-i440fx-1.7. With this version, the errors almost disappeared, but from time to time, I still get machine check events. Anyways, they so not seem to affect neither the vm, nor the host. The Haswell machine has been set up and running for several days without a single error message. They only appear when the VM is running. so I think this is actually some problem with the Haswell architecture (and not a real hardware error). To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1307225/+subscriptions
Re: [Qemu-devel] [PATCH 3/4] block/parallels: split check for parallels format in parallels_open
On 24/07/14 22:50, Jeff Cody wrote: On Tue, Jul 22, 2014 at 05:19:36PM +0400, Denis V. Lunev wrote: and rework error path a bit. There is no difference at the moment, but the code will be definitely shorter when additional processing will be required for WithouFreSpacExt Signed-off-by: Denis V. Lunev d...@openvz.org CC: Kevin Wolf kw...@redhat.com CC: Stefan Hajnoczi stefa...@redhat.com --- block/parallels.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/block/parallels.c b/block/parallels.c index 8f9ec8a..02739cf 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -85,12 +85,10 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } -if (memcmp(ph.magic, HEADER_MAGIC, 16) || -(le32_to_cpu(ph.version) != HEADER_VERSION)) { -error_setg(errp, Image not in Parallels format); -ret = -EINVAL; -goto fail; -} +if (le32_to_cpu(ph.version) != HEADER_VERSION) +goto fail_format; +if (memcmp(ph.magic, HEADER_MAGIC, 16)) +goto fail_format; QEMU coding style dictates these statements have curly braces, even though they are just one liners. (If you run your patches against scripts/checkpatch.pl, it should catch most style issues). ok, I just used a kernel convention. Will redo this here and in the next path. This is not a problem :) Thank you for pointing this out. I think this patch could also just be squashed into patch 4, if desired. I'd prefer not to do this to have behavior changing and behavior non-changing stuff separated. bs-total_sectors = (uint32_t)le64_to_cpu(ph.nb_sectors); @@ -120,6 +118,9 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, qemu_co_mutex_init(s-lock); return 0; +fail_format: +error_setg(errp, Image not in Parallels format); +ret = -EINVAL; fail: g_free(s-catalog_bitmap); return ret; -- 1.9.1
[Qemu-devel] [Bug 1307225] Re: Running a virtual machine on a Haswell system produces machine check events
attachment logfiles, dmidecode, system information ** Attachment added: logfiles, dmidecode, system information https://bugs.launchpad.net/qemu/+bug/1307225/+attachment/4162599/+files/logfiles-mce.txt -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1307225 Title: Running a virtual machine on a Haswell system produces machine check events Status in QEMU: New Bug description: I'm running a virtual Windows SBS 2003 installation on a Xeon E3 Haswell system running Gentoo Linux. First, I used Qemu 1.5.3 (the latest stable version on Gentoo). I got a lot of machine check events (mce: [Hardware Error]: Machine check events logged) in dmesg that always looked like (using mcelog): Hardware event. This is not a software error. MCE 0 CPU 3 BANK 0 TIME 1397455091 Mon Apr 14 07:58:11 2014 MCG status: MCi status: Corrected error Error enabled MCA: Internal parity error STATUS 904f0005 MCGSTATUS 0 MCGCAP c09 APICID 6 SOCKETID 0 CPUID Vendor Intel Family 6 Model 60 I found this discussion on the vmware community: https://communities.vmware.com/thread/452344 It seems that this is (at least partly) caused by the Qemu machine. I switched to Qemu 1.7.0, the first version to use pc-i440fx-1.7. With this version, the errors almost disappeared, but from time to time, I still get machine check events. Anyways, they so not seem to affect neither the vm, nor the host. The Haswell machine has been set up and running for several days without a single error message. They only appear when the VM is running. so I think this is actually some problem with the Haswell architecture (and not a real hardware error). To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1307225/+subscriptions
Re: [Qemu-devel] [PATCH 4/4] block/parallels: 2TB+ parallels images support
On Fri, Jul 25, 2014 at 07:51:47AM +0400, Denis V. Lunev wrote: On 24/07/14 23:25, Jeff Cody wrote: On Tue, Jul 22, 2014 at 05:19:37PM +0400, Denis V. Lunev wrote: Parallels has released in the recent updates of Parallels Server 5/6 new addition to his image format. Images with signature WithouFreSpacExt have offsets in the catalog coded not as offsets in sectors (multiple of 512 bytes) but offsets coded in blocks (i.e. header-tracks * 512) In this case all 64 bits of header-nb_sectors are used for image size. This patch implements support of this for qemu-img. Signed-off-by: Denis V. Lunev d...@openvz.org CC: Kevin Wolf kw...@redhat.com CC: Stefan Hajnoczi stefa...@redhat.com --- block/parallels.c | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/block/parallels.c b/block/parallels.c index 02739cf..d9cb04f 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -30,6 +30,7 @@ /**/ #define HEADER_MAGIC WithoutFreeSpace +#define HEADER_MAGIC2 WithouFreSpacExt #define HEADER_VERSION 2 #define HEADER_SIZE 64 @@ -54,6 +55,8 @@ typedef struct BDRVParallelsState { unsigned int catalog_size; unsigned int tracks; + +unsigned int off_multiplier; } BDRVParallelsState; static int parallels_probe(const uint8_t *buf, int buf_size, const char *filename) @@ -63,7 +66,8 @@ static int parallels_probe(const uint8_t *buf, int buf_size, const char *filenam if (buf_size HEADER_SIZE) return 0; -if (!memcmp(ph-magic, HEADER_MAGIC, 16) +if ((!memcmp(ph-magic, HEADER_MAGIC, 16) || +!memcmp(ph-magic, HEADER_MAGIC2, 16)) (le32_to_cpu(ph-version) == HEADER_VERSION)) return 100; @@ -85,13 +89,18 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } +bs-total_sectors = le64_to_cpu(ph.nb_sectors); + bs-total_sectors is a int64_t, and ph.nb_sectors is a uint64_t. Should we do a sanity check here on the max number of sectors? in reality such image will not be usable as we will later have to multiply this count with 512 to calculate offset in the file if (le32_to_cpu(ph.version) != HEADER_VERSION) goto fail_format; -if (memcmp(ph.magic, HEADER_MAGIC, 16)) +if (!memcmp(ph.magic, HEADER_MAGIC, 16)) { +s-off_multiplier = 1; +bs-total_sectors = (uint32_t)bs-total_sectors; (same comment as in patch 1, w.r.t. cast vs. bitmask) ok +} else if (!memcmp(ph.magic, HEADER_MAGIC2, 16)) +s-off_multiplier = le32_to_cpu(ph.tracks); Is there a maximum size in the specification for ph.tracks? no, there is no limitation. Though in reality I have seen the following images only: 252 sectors, 63 sectors, 256 sectors, 2048 sectors and only 2048 was used with this new header. This is just an observation. +else goto fail_format; (same comment as the last patch, w.r.t. braces) -bs-total_sectors = (uint32_t)le64_to_cpu(ph.nb_sectors); - s-tracks = le32_to_cpu(ph.tracks); if (s-tracks == 0) { error_setg(errp, Invalid image: Zero sectors per track); @@ -137,7 +146,8 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num) /* not allocated */ if ((index s-catalog_size) || (s-catalog_bitmap[index] == 0)) return -1; -return (uint64_t)(s-catalog_bitmap[index] + offset) * 512; +return +((uint64_t)s-catalog_bitmap[index] * s-off_multiplier + offset) * 512; Do we need to worry about overflow here, depending on the value of off_multiplier? Also, (and this existed prior to this patch), - we are casting to uint64_t, although the function returns int64_t. good catch. I'll change the declaration to uint64_t and will return 0 from previous if aka Thanks. I'm not sure that is the best option though - the only place the return value from this function is used is in parallels_read(), which passes the output into bdrv_pread() as the offset. The offset parameter for bdrv_pread/pwrite is an int64_t, not a uint64_t. if ((index s-catalog_size) || (s-catalog_bitmap[index] == 0)) It is not possible to obtain 0 as a real offset of any sector as this is an offset of the header. Regarding overflow check. Do you think that we should return error to the upper layer or we could silently fill result data with zeroes as was done originally for the following case 'index s-catalog_size' ? I think it would be best to return error (anything 0 will also cause bdrv_pread to return -EIO, and it is also checked for in parallels_read). I also don't mean to over-complicate things here for you, which is why I asked about the max value of ph.tracks. If that has a reasonable bounds check, then we don't need to worry about overflow at all, and can just change the cast to an int64_t instead
Re: [Qemu-devel] [PATCH 4/4] block/parallels: 2TB+ parallels images support
On 25/07/14 17:08, Jeff Cody wrote: On Fri, Jul 25, 2014 at 07:51:47AM +0400, Denis V. Lunev wrote: On 24/07/14 23:25, Jeff Cody wrote: On Tue, Jul 22, 2014 at 05:19:37PM +0400, Denis V. Lunev wrote: Parallels has released in the recent updates of Parallels Server 5/6 new addition to his image format. Images with signature WithouFreSpacExt have offsets in the catalog coded not as offsets in sectors (multiple of 512 bytes) but offsets coded in blocks (i.e. header-tracks * 512) In this case all 64 bits of header-nb_sectors are used for image size. This patch implements support of this for qemu-img. Signed-off-by: Denis V. Lunev d...@openvz.org CC: Kevin Wolf kw...@redhat.com CC: Stefan Hajnoczi stefa...@redhat.com --- block/parallels.c | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/block/parallels.c b/block/parallels.c index 02739cf..d9cb04f 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -30,6 +30,7 @@ /**/ #define HEADER_MAGIC WithoutFreeSpace +#define HEADER_MAGIC2 WithouFreSpacExt #define HEADER_VERSION 2 #define HEADER_SIZE 64 @@ -54,6 +55,8 @@ typedef struct BDRVParallelsState { unsigned int catalog_size; unsigned int tracks; + +unsigned int off_multiplier; } BDRVParallelsState; static int parallels_probe(const uint8_t *buf, int buf_size, const char *filename) @@ -63,7 +66,8 @@ static int parallels_probe(const uint8_t *buf, int buf_size, const char *filenam if (buf_size HEADER_SIZE) return 0; -if (!memcmp(ph-magic, HEADER_MAGIC, 16) +if ((!memcmp(ph-magic, HEADER_MAGIC, 16) || +!memcmp(ph-magic, HEADER_MAGIC2, 16)) (le32_to_cpu(ph-version) == HEADER_VERSION)) return 100; @@ -85,13 +89,18 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } +bs-total_sectors = le64_to_cpu(ph.nb_sectors); + bs-total_sectors is a int64_t, and ph.nb_sectors is a uint64_t. Should we do a sanity check here on the max number of sectors? in reality such image will not be usable as we will later have to multiply this count with 512 to calculate offset in the file if (le32_to_cpu(ph.version) != HEADER_VERSION) goto fail_format; -if (memcmp(ph.magic, HEADER_MAGIC, 16)) +if (!memcmp(ph.magic, HEADER_MAGIC, 16)) { +s-off_multiplier = 1; +bs-total_sectors = (uint32_t)bs-total_sectors; (same comment as in patch 1, w.r.t. cast vs. bitmask) ok +} else if (!memcmp(ph.magic, HEADER_MAGIC2, 16)) +s-off_multiplier = le32_to_cpu(ph.tracks); Is there a maximum size in the specification for ph.tracks? no, there is no limitation. Though in reality I have seen the following images only: 252 sectors, 63 sectors, 256 sectors, 2048 sectors and only 2048 was used with this new header. This is just an observation. +else goto fail_format; (same comment as the last patch, w.r.t. braces) -bs-total_sectors = (uint32_t)le64_to_cpu(ph.nb_sectors); - s-tracks = le32_to_cpu(ph.tracks); if (s-tracks == 0) { error_setg(errp, Invalid image: Zero sectors per track); @@ -137,7 +146,8 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num) /* not allocated */ if ((index s-catalog_size) || (s-catalog_bitmap[index] == 0)) return -1; -return (uint64_t)(s-catalog_bitmap[index] + offset) * 512; +return +((uint64_t)s-catalog_bitmap[index] * s-off_multiplier + offset) * 512; Do we need to worry about overflow here, depending on the value of off_multiplier? Also, (and this existed prior to this patch), - we are casting to uint64_t, although the function returns int64_t. good catch. I'll change the declaration to uint64_t and will return 0 from previous if aka Thanks. I'm not sure that is the best option though - the only place the return value from this function is used is in parallels_read(), which passes the output into bdrv_pread() as the offset. The offset parameter for bdrv_pread/pwrite is an int64_t, not a uint64_t. if ((index s-catalog_size) || (s-catalog_bitmap[index] == 0)) It is not possible to obtain 0 as a real offset of any sector as this is an offset of the header. Regarding overflow check. Do you think that we should return error to the upper layer or we could silently fill result data with zeroes as was done originally for the following case 'index s-catalog_size' ? I think it would be best to return error (anything 0 will also cause bdrv_pread to return -EIO, and it is also checked for in parallels_read). I also don't mean to over-complicate things here for you, which is why I asked about the max value of ph.tracks. If that has a reasonable bounds check, then we don't need to worry about overflow at all, and can just change the cast to an int64_t instead of uint64_t. I think we are safe so long as
[Qemu-devel] [PATCH v2 2/4] libqemustub: add qemu_system_shutdown_force()
These sysemu functions will be needed by qemu-char.c, which is linked into tests/vhost-user-test without system emulation functionality. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- stubs/Makefile.objs | 1 + stubs/shutdown.c| 5 + 2 files changed, 6 insertions(+) create mode 100644 stubs/shutdown.c diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs index 528e161..f17d517 100644 --- a/stubs/Makefile.objs +++ b/stubs/Makefile.objs @@ -39,3 +39,4 @@ stub-obj-$(CONFIG_WIN32) += fd-register.o stub-obj-y += cpus.o stub-obj-y += kvm.o stub-obj-y += qmp_pc_dimm_device_list.o +stub-obj-y += shutdown.o diff --git a/stubs/shutdown.c b/stubs/shutdown.c new file mode 100644 index 000..37c96c0 --- /dev/null +++ b/stubs/shutdown.c @@ -0,0 +1,5 @@ +#include sysemu/sysemu.h + +void qemu_system_shutdown_force(void) +{ +} -- 1.9.3
[Qemu-devel] [PATCH v2 0/4] libqtest: solve QEMU process cleanup problem
v2: * Added qemu_system_shutdown_force() API [Peter] Test cases are supposed to clean up even if they fail. Historically libqtest has leaked QEMU processes and files. This caused annoyances and buildbot failures so it was gradually fixed. The solution we have for terminating the QEMU process if the test case fails was not very satisfactory. A SIGABRT handler in the test case sends SIGTERM to QEMU. This only works if the test case receives SIGABRT, not other ways in which it could die. The approach is ugly because it installs a global signal handler although libqtest is supposed to support multiple simultaneous instances. This patch series adds the new -chardev exit-on-eof option. QEMU will terminate if the socket/pipe/stdio receives EOF. That happens when the test case process terminates for any reason. By adding this option to -chardev we can use it with both -qtest and -qmp. Stefan Hajnoczi (4): vl: add qemu_system_shutdown_force() libqemustub: add qemu_system_shutdown_force() qemu-char: add -chardev exit-on-eof option libqtest: use -chardev exit-on-eof to clean up QEMU include/sysemu/char.h | 1 + include/sysemu/sysemu.h | 2 +- qapi-schema.json| 23 --- qemu-char.c | 33 +++-- qemu-options.hx | 19 +-- qmp.c | 3 +-- stubs/Makefile.objs | 1 + stubs/shutdown.c| 5 + tests/libqtest.c| 48 +++- ui/sdl.c| 3 +-- ui/sdl2.c | 6 ++ vl.c| 11 --- 12 files changed, 79 insertions(+), 76 deletions(-) create mode 100644 stubs/shutdown.c -- 1.9.3
[Qemu-devel] [PATCH v2 4/4] libqtest: use -chardev exit-on-eof to clean up QEMU
When the test case aborts it is important to terminate the QEMU process so it does not leak. This was implemented using a SIGABRT handler function in libqtest that sent SIGTERM to QEMU. The SIGABRT approach is messy because it requires a global signal handler but libqtest should support multiple simultaneous instances. Simplify the code using the new -chardev exit-on-eof option. QEMU will automatically exit when our qtest socket closes. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- tests/libqtest.c | 48 +++- 1 file changed, 3 insertions(+), 45 deletions(-) diff --git a/tests/libqtest.c b/tests/libqtest.c index 98e8f4b..6c3dd27 100644 --- a/tests/libqtest.c +++ b/tests/libqtest.c @@ -46,12 +46,8 @@ struct QTestState bool irq_level[MAX_IRQ]; GString *rx; pid_t qemu_pid; /* our child QEMU process */ -struct sigaction sigact_old; /* restored on exit */ }; -static GList *qtest_instances; -static struct sigaction sigact_old; - #define g_assert_no_errno(ret) do { \ g_assert_cmpint(ret, !=, -1); \ } while (0) @@ -110,32 +106,6 @@ static void kill_qemu(QTestState *s) } } -static void sigabrt_handler(int signo) -{ -GList *elem; -for (elem = qtest_instances; elem; elem = elem-next) { -kill_qemu(elem-data); -} -} - -static void setup_sigabrt_handler(void) -{ -struct sigaction sigact; - -/* Catch SIGABRT to clean up on g_assert() failure */ -sigact = (struct sigaction){ -.sa_handler = sigabrt_handler, -.sa_flags = SA_RESETHAND, -}; -sigemptyset(sigact.sa_mask); -sigaction(SIGABRT, sigact, sigact_old); -} - -static void cleanup_sigabrt_handler(void) -{ -sigaction(SIGABRT, sigact_old, NULL); -} - QTestState *qtest_init(const char *extra_args) { QTestState *s; @@ -156,17 +126,12 @@ QTestState *qtest_init(const char *extra_args) sock = init_socket(socket_path); qmpsock = init_socket(qmp_socket_path); -/* Only install SIGABRT handler once */ -if (!qtest_instances) { -setup_sigabrt_handler(); -} - -qtest_instances = g_list_prepend(qtest_instances, s); - s-qemu_pid = fork(); if (s-qemu_pid == 0) { command = g_strdup_printf(exec %s - -qtest unix:%s,nowait + -chardev socket,id=qtestdev,path=%s,nowait, + exit-on-eof + -qtest chardev:qtestdev -qtest-log /dev/null -qmp unix:%s,nowait -machine accel=qtest @@ -207,13 +172,6 @@ QTestState *qtest_init(const char *extra_args) void qtest_quit(QTestState *s) { -/* Uninstall SIGABRT handler on last instance */ -if (qtest_instances !qtest_instances-next) { -cleanup_sigabrt_handler(); -} - -qtest_instances = g_list_remove(qtest_instances, s); - kill_qemu(s); close(s-fd); close(s-qmp_fd); -- 1.9.3
[Qemu-devel] [PATCH v2 1/4] vl: add qemu_system_shutdown_force()
The QEMU --no-shutdown option prevents guests from shutting down. There are several cases where shutdown should be forced, even if --no-shutdown was given. This patch adds an API for forcing shutdown. This cleans up the code and hides the extern int no_shutdown variable which is currently being touched in various files. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- include/sysemu/sysemu.h | 2 +- qmp.c | 3 +-- ui/sdl.c| 3 +-- ui/sdl2.c | 6 ++ vl.c| 11 --- 5 files changed, 13 insertions(+), 12 deletions(-) diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index d8539fd..171d7d3 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -58,6 +58,7 @@ void qemu_system_wakeup_request(WakeupReason reason); void qemu_system_wakeup_enable(WakeupReason reason, bool enabled); void qemu_register_wakeup_notifier(Notifier *notifier); void qemu_system_shutdown_request(void); +void qemu_system_shutdown_force(void); void qemu_system_powerdown_request(void); void qemu_register_powerdown_notifier(Notifier *notifier); void qemu_system_debug_request(void); @@ -126,7 +127,6 @@ extern int max_cpus; extern int cursor_hide; extern int graphic_rotate; extern int no_quit; -extern int no_shutdown; extern int semihosting_enabled; extern int old_param; extern int boot_menu; diff --git a/qmp.c b/qmp.c index 0d2553a..cd4d6a7 100644 --- a/qmp.c +++ b/qmp.c @@ -86,8 +86,7 @@ UuidInfo *qmp_query_uuid(Error **errp) void qmp_quit(Error **errp) { -no_shutdown = 0; -qemu_system_shutdown_request(); +qemu_system_shutdown_force(); } void qmp_stop(Error **errp) diff --git a/ui/sdl.c b/ui/sdl.c index 4e7f920..7c3d91c 100644 --- a/ui/sdl.c +++ b/ui/sdl.c @@ -791,8 +791,7 @@ static void sdl_refresh(DisplayChangeListener *dcl) break; case SDL_QUIT: if (!no_quit) { -no_shutdown = 0; -qemu_system_shutdown_request(); +qemu_system_shutdown_force(); } break; case SDL_MOUSEMOTION: diff --git a/ui/sdl2.c b/ui/sdl2.c index fcac87b..f392528 100644 --- a/ui/sdl2.c +++ b/ui/sdl2.c @@ -711,8 +711,7 @@ static void handle_windowevent(DisplayChangeListener *dcl, SDL_Event *ev) break; case SDL_WINDOWEVENT_CLOSE: if (!no_quit) { -no_shutdown = 0; -qemu_system_shutdown_request(); +qemu_system_shutdown_force(); } break; } @@ -743,8 +742,7 @@ static void sdl_refresh(DisplayChangeListener *dcl) break; case SDL_QUIT: if (!no_quit) { -no_shutdown = 0; -qemu_system_shutdown_request(); +qemu_system_shutdown_force(); } break; case SDL_MOUSEMOTION: diff --git a/vl.c b/vl.c index fe451aa..c733e04 100644 --- a/vl.c +++ b/vl.c @@ -164,7 +164,7 @@ int acpi_enabled = 1; int no_hpet = 0; int fd_bootchk = 1; static int no_reboot; -int no_shutdown = 0; +static int no_shutdown = 0; int cursor_hide = 1; int graphic_rotate = 0; const char *watchdog; @@ -1915,8 +1915,7 @@ void qemu_system_killed(int signal, pid_t pid) { shutdown_signal = signal; shutdown_pid = pid; -no_shutdown = 0; -qemu_system_shutdown_request(); +qemu_system_shutdown_force(); } void qemu_system_shutdown_request(void) @@ -1926,6 +1925,12 @@ void qemu_system_shutdown_request(void) qemu_notify_event(); } +void qemu_system_shutdown_force(void) +{ +no_shutdown = 0; +qemu_system_shutdown_request(); +} + static void qemu_system_powerdown(void) { qapi_event_send_powerdown(error_abort); -- 1.9.3
[Qemu-devel] [PATCH v2 3/4] qemu-char: add -chardev exit-on-eof option
When QEMU is executed as part of a test case or from a script, it is usually desirable to exit if the parent process terminates. This ensures that leaked QEMU processes do not continue consuming resources after their parent has died. This patch adds the -chardev exit-on-eof option causing socket and pipe chardevs to exit QEMU upon close. This happens when a parent process deliberately closes its file descriptor but also when the kernel cleans up a crashed process. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- include/sysemu/char.h | 1 + qapi-schema.json | 23 --- qemu-char.c | 33 +++-- qemu-options.hx | 19 +-- 4 files changed, 57 insertions(+), 19 deletions(-) diff --git a/include/sysemu/char.h b/include/sysemu/char.h index 0bbd631..382b320 100644 --- a/include/sysemu/char.h +++ b/include/sysemu/char.h @@ -86,6 +86,7 @@ struct CharDriverState { guint fd_in_tag; QemuOpts *opts; QTAILQ_ENTRY(CharDriverState) next; +bool exit_on_eof; }; /** diff --git a/qapi-schema.json b/qapi-schema.json index b11aad2..9b13da1 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -2630,10 +2630,13 @@ # @device: The name of the special file for the device, # i.e. /dev/ttyS0 on Unix or COM1: on Windows # @type: What kind of device this is. +# @exit-on-eof: #optional terminate when other side closes the pipe +# (default: false, since: 2.2) # # Since: 1.4 ## -{ 'type': 'ChardevHostdev', 'data': { 'device' : 'str' } } +{ 'type': 'ChardevHostdev', 'data': { 'device' : 'str', + '*exit-on-eof' : 'bool' } } ## # @ChardevSocket: @@ -2648,14 +2651,17 @@ # @nodelay: #optional set TCP_NODELAY socket option (default: false) # @telnet: #optional enable telnet protocol on server # sockets (default: false) +# @exit-on-eof: #optional terminate when other side closes socket +# (default: false, since: 2.2) # # Since: 1.4 ## -{ 'type': 'ChardevSocket', 'data': { 'addr' : 'SocketAddress', - '*server' : 'bool', - '*wait': 'bool', - '*nodelay' : 'bool', - '*telnet' : 'bool' } } +{ 'type': 'ChardevSocket', 'data': { 'addr' : 'SocketAddress', + '*server' : 'bool', + '*wait': 'bool', + '*nodelay' : 'bool', + '*telnet' : 'bool', + '*exit-on-eof' : 'bool' } } ## # @ChardevUdp: @@ -2689,10 +2695,13 @@ # @signal: #optional Allow signals (such as SIGINT triggered by ^C) # be delivered to qemu. Default: true in -nographic mode, # false otherwise. +# @exit-on-eof: #optional terminate when other side sends EOF +# (default: false, since: 2.2) # # Since: 1.5 ## -{ 'type': 'ChardevStdio', 'data': { '*signal' : 'bool' } } +{ 'type': 'ChardevStdio', 'data': { '*signal' : 'bool', +'*exit-on-eof' : 'bool' } } ## # @ChardevSpiceChannel: diff --git a/qemu-char.c b/qemu-char.c index 7acc03f..1974b8f 100644 --- a/qemu-char.c +++ b/qemu-char.c @@ -110,9 +110,15 @@ void qemu_chr_be_event(CharDriverState *s, int event) break; } -if (!s-chr_event) -return; -s-chr_event(s-handler_opaque, event); +if (s-chr_event) { +s-chr_event(s-handler_opaque, event); +} + +if (s-exit_on_eof event == CHR_EVENT_CLOSED) { +fprintf(stderr, qemu: terminating due to eof on chardev '%s'\n, +s-label); +qemu_system_shutdown_force(); +} } void qemu_chr_be_generic_open(CharDriverState *s) @@ -991,6 +997,7 @@ static CharDriverState *qemu_chr_open_pipe(ChardevHostdev *opts) int fd_in, fd_out; char filename_in[256], filename_out[256]; const char *filename = opts-device; +CharDriverState *chr; if (filename == NULL) { fprintf(stderr, chardev: pipe: no filename given\n); @@ -1011,7 +1018,9 @@ static CharDriverState *qemu_chr_open_pipe(ChardevHostdev *opts) return NULL; } } -return qemu_chr_open_fd(fd_in, fd_out); +chr = qemu_chr_open_fd(fd_in, fd_out); +chr-exit_on_eof = opts-has_exit_on_eof opts-exit_on_eof; +return chr; } /* init terminal so that we can grab keys */ @@ -2893,6 +2902,7 @@ static void tcp_chr_close(CharDriverState *chr) static CharDriverState *qemu_chr_open_socket_fd(int fd, bool do_nodelay, bool is_listen, bool is_telnet, bool is_waitconnect, +bool is_exit_on_eof,
Re: [Qemu-devel] [questions] about qemu log
Am 25.07.2014 13:07, schrieb Zhang Haoyu: Hi, all If I use qemu command directly to run vm, bypass libvirt, how to configure qemu to assure that each vm has its own log file, like vmname.log? For example, VM: rhel7-net has its own log file, rhel7-net.log, VM:rhel7-stor has its own log file, rhel7-stor.log. -D /path/to/unique/file/name.log -D option is to configure qemu_logfile for the output logs which are controlled by qmp command logfile, which can be enabled/disabled on run time. I want to configure the log file for the output of fprintf(stderr, fmt, ...), .etc, i.e., how to redirect the output of fprintf(stderr, fmt, ...), or some other log-interface to a specified file? In a shell you would write something like: 2 stderr.log You may also want to toggle QEMU's -msg timestamp=on option. Cheers, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
[Qemu-devel] AArch64 ELF File Loading
Hi, I think the AArch64 port has a problem with a self-modifying code sequence that appears to run fine on other simulators, but I can't get QEMU to run the small bare metal test case I created to try to reproduce the issue. Any help would be appreciated. qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm /tmp/test-nooverwrite 21 | less qemu: fatal: Trying to execute code outside RAM or ROM at 0x qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm -bios /tmp/test-nooverwrite 21 | less qemu: fatal: Trying to execute code outside RAM or ROM at 0x qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm -kernel /tmp/test-nooverwrite 21 | less IN: 0x4000: e3a0 mov r0, #0 ; 0x0 0x4004: e59f1004 ldr r1, [pc, #4]; 0x4010 0x4008: e59f2004 ldr r2, [pc, #4]; 0x4014 0x400c: e59ff004 ldr pc, [pc, #4]; 0x4018 Trace 0x7f309f012000 [4000] Note that the above are A32 instructions, but my ELF is A64 and this is not the specified entry point. aarch64-linux-gnu-readelf -h /tmp/test-nooverwrite ELF Header: Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 Class: ELF64 Data: 2's complement, little endian Version: 1 (current) OS/ABI:UNIX - System V ABI Version: 0 Type: EXEC (Executable file) Machine: AArch64 Version: 0x1 Entry point address: 0x80001140 Start of program headers: 64 (bytes into file) Start of section headers: 186600 (bytes into file) Flags: 0x0 Size of this header: 64 (bytes) Size of program headers: 56 (bytes) Number of program headers: 3 Size of section headers: 64 (bytes) Number of section headers: 17 Section header string table index: 14 To generate a test bare metal executable, you can download the aarch64-none-elf toolchain from Linaro and: echo '#include stdio.h int main() { printf(Hello, world!\n); return 0; }' hello.c aarch64-none-elf-gcc -specs=aem-ve.specs hello.c -o hello Thanks, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Re: [Qemu-devel] AArch64 ELF File Loading
On 25 July 2014 15:01, Christopher Covington c...@codeaurora.org wrote: Hi, I think the AArch64 port has a problem with a self-modifying code sequence that appears to run fine on other simulators, but I can't get QEMU to run the small bare metal test case I created to try to reproduce the issue. Any help would be appreciated. qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm /tmp/test-nooverwrite 21 | less qemu: fatal: Trying to execute code outside RAM or ROM at 0x qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm -bios /tmp/test-nooverwrite 21 | less qemu: fatal: Trying to execute code outside RAM or ROM at 0x qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm -kernel /tmp/test-nooverwrite 21 | less You haven't specified a CPU type, and virt defaults to cortex-a15. Try -cpu cortex-a57. You haven't specified a memory size, and QEMU defaults to 128MB, which (given the start address of RAM) means there won't be any RAM at the load address you're trying to load your test program at. Try -m 3G, and/or make your ELF file load at an address closer to the start of RAM. thanks -- PMM
Re: [Qemu-devel] [PULL for-2.1 0/3] Last minute patches
Il 25/07/2014 14:15, Paolo Bonzini ha scritto: The following changes since commit f368c33d5ab09dd5656924185cd975b11838cd25: Update version for v2.1.0-rc3 release (2014-07-22 18:17:03 +0100) are available in the git repository at: git://github.com/bonzini/qemu.git tags/for-upstream for you to fetch changes up to d975e28437377bc9a65fb5f4f9486a74c7a124cc: qemu-char: ignore flow control if a PTY's slave is not connected (2014-07-24 18:31:50 +0200) Here are my patches for the last 2.1 rc. Paolo Bonzini (3): acpi-dsdt: procedurally generate _PRT pc: hack for migration compatibility from QEMU 2.0 qemu-char: ignore flow control if a PTY's slave is not connected Since Igor hasn't sent his patches, and I'm leaving the office, I pushed this to git://github.com/bonzini/qemu.git tags/for-upstream-full I don't know about tests/acpi-test-data/pc. It makes sense that this patch should modify something there, but: * make check passes * the test warns even before patch 1, for both the DSDT (modified by the patch) and SSDT (which this series doesn't touch at all) * I cannot get it to pass, except by blindly copying the actual output on the expected files * mst is on vacation and Marcel is off on Fridays Based on my understanding of the problem, it is not possible to fix the bug without hacks like this one, and even reverting all patches in this area would be more risky. Paolo
Re: [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
- Mail original - De: Paolo Bonzini pbonz...@redhat.com À: Sebastian Tanase sebastian.tan...@openwide.fr, qemu-devel@nongnu.org Cc: aligu...@amazon.com, afaer...@suse.de, r...@twiddle.net, peter maydell peter.mayd...@linaro.org, mich...@walle.cc, a...@alex.org.uk, stefa...@redhat.com, lcapitul...@redhat.com, crobi...@redhat.com, arm...@redhat.com, wenchaoq...@gmail.com, quint...@redhat.com, kw...@redhat.com, m...@redhat.com, pierre lemagourou pierre.lemagou...@openwide.fr, jeremy rosen jeremy.ro...@openwide.fr, camille begue camille.be...@openwide.fr Envoyé: Vendredi 25 Juillet 2014 12:13:44 Objet: Re: [PATCH V5 4/6] cpu_exec: Add sleeping algorithm Il 25/07/2014 11:56, Sebastian Tanase ha scritto: The goal is to sleep qemu whenever the guest clock is in advance compared to the host clock (we use the monotonic clocks). The amount of time to sleep is calculated in the execution loop in cpu_exec. At first, we tried to approximate at each for loop the real time elapsed while searching for a TB (generating or retrieving from cache) and executing it. We would then approximate the virtual time corresponding to the number of virtual instructions executed. The difference between these 2 values would allow us to know if the guest is in advance or delayed. However, the function used for measuring the real time (qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive. We had an added overhead of 13% of the total run time. Therefore, we modified the algorithm and only take into account the difference between the 2 clocks at the begining of the cpu_exec function. During the for loop we try to reduce the advance of the guest only by computing the virtual time elapsed and sleeping if necessary. The overhead is thus reduced to 3%. Even though this method still has a noticeable overhead, it no longer is a bottleneck in trying to achieve a better guest frequency for which the guest clock is faster than the host one. As for the the alignement of the 2 clocks, with the first algorithm the guest clock was oscillating between -1 and 1ms compared to the host clock. Using the second algorithm we notice that the guest is 5ms behind the host, which is still acceptable for our use case. The tests where conducted using fio and stress. The host machine in an i5 CPU at 3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm versatile-pb built with buildroot. Currently, on our test machine, the lowest icount we can achieve that is suitable for aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) are slower than the cpu tests (using stress). Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr Tested-by: Camille Bégué camille.be...@openwide.fr Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- cpu-exec.c | 91 cpus.c | 17 ++ include/qemu/timer.h | 1 + 3 files changed, 109 insertions(+) diff --git a/cpu-exec.c b/cpu-exec.c index 38e5f02..1a725b6 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -22,6 +22,84 @@ #include tcg.h #include qemu/atomic.h #include sysemu/qtest.h +#include qemu/timer.h + +/* -icount align implementation. */ + +typedef struct SyncClocks { +int64_t diff_clk; +int64_t original_instr_counter; +} SyncClocks; + +#if !defined(CONFIG_USER_ONLY) +/* Allow the guest to have a max 3ms advance. + * The difference between the 2 clocks could therefore + * oscillate around 0. + */ +#define VM_CLOCK_ADVANCE 300 + +static int64_t delay_host(int64_t diff_clk) +{ +if (diff_clk VM_CLOCK_ADVANCE) { +#ifndef _WIN32 +struct timespec sleep_delay, rem_delay; +sleep_delay.tv_sec = diff_clk / 10LL; +sleep_delay.tv_nsec = diff_clk % 10LL; +if (nanosleep(sleep_delay, rem_delay) 0) { +diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 10LL; +diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec; +} else { +diff_clk = 0; +} +#else +Sleep(diff_clk / SCALE_MS); +diff_clk = 0; +#endif +} +return diff_clk; +} + +static int64_t instr_to_vtime(int64_t instr_counter, const CPUState *cpu) +{ +int64_t instr_exec_time; +instr_exec_time = instr_counter - + (cpu-icount_extra + + cpu-icount_decr.u16.low); +instr_exec_time = instr_exec_time icount_time_shift; + +return instr_exec_time; +} + +static void align_clocks(SyncClocks *sc, const CPUState *cpu) +{ +if (!icount_align_option) { +return; +} +sc-diff_clk += instr_to_vtime(sc-original_instr_counter, cpu); +sc-original_instr_counter =
Re: [Qemu-devel] AArch64 ELF File Loading
Hi Peter, On 07/25/2014 10:07 AM, Peter Maydell wrote: On 25 July 2014 15:01, Christopher Covington c...@codeaurora.org wrote: Hi, I think the AArch64 port has a problem with a self-modifying code sequence that appears to run fine on other simulators, but I can't get QEMU to run the small bare metal test case I created to try to reproduce the issue. Any help would be appreciated. qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm /tmp/test-nooverwrite 21 | less qemu: fatal: Trying to execute code outside RAM or ROM at 0x qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm -bios /tmp/test-nooverwrite 21 | less qemu: fatal: Trying to execute code outside RAM or ROM at 0x qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm -kernel /tmp/test-nooverwrite 21 | less You haven't specified a CPU type, and virt defaults to cortex-a15. Try -cpu cortex-a57. That explains the A32 instructions. You haven't specified a memory size, and QEMU defaults to 128MB, which (given the start address of RAM) means there won't be any RAM at the load address you're trying to load your test program at. Try -m 3G, and/or make your ELF file load at an address closer to the start of RAM. Thanks for the suggestions. aarch64-none-elf-gcc -specs=rdimon.specs -Ttext=0x4000 hello.c -o hello aarch64-none-elf-readelf -h hello ELF Header: Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 Class: ELF64 Data: 2's complement, little endian Version: 1 (current) OS/ABI:UNIX - System V ABI Version: 0 Type: EXEC (Executable file) Machine: AArch64 Version: 0x1 Entry point address: 0x4158 Start of program headers: 64 (bytes into file) Start of section headers: 157432 (bytes into file) Flags: 0x0 Size of this header: 64 (bytes) Size of program headers: 56 (bytes) Number of program headers: 4 Size of section headers: 64 (bytes) Number of section headers: 17 Section header string table index: 14 qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt \ -cpu cortex-a57 -m 3G -semihosting -kernel hello qemu: fatal: Trying to execute code outside RAM or ROM at 0x Thanks, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Re: [Qemu-devel] AArch64 ELF File Loading
On 25 July 2014 15:35, Christopher Covington c...@codeaurora.org wrote: qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt \ -cpu cortex-a57 -m 3G -semihosting -kernel hello qemu: fatal: Trying to execute code outside RAM or ROM at 0x This means your code took an exception (and there's no RAM at the low address where the vector table is by default). Try -d in_asm,exec,int to get a better idea of what's being executed. Also, where do you expect the output from that printf to be going? Does your gcc/bare metal libc write to the UART? How does it know what address the UART is? If it's expecting to do semihosting for output, then you're running into the fact that we don't implement semihosting for AArch64 yet. thanks -- PMM
Re: [Qemu-devel] [PATCH 1/7] tests: Functions bus_foreach and device_find from libqos virtio API
On Thu, Jul 24, 2014 at 08:30:59PM +0200, Marc Marí wrote: +static QPCIBus *test_start(void) +{ +char cmdline[100]; +char tmp_path[] = /tmp/qtest.XX; +int fd, ret; + +/* Create a temporary raw image */ +fd = mkstemp(tmp_path); +g_assert_cmpint(fd, =, 0); +ret = ftruncate(fd, TEST_IMAGE_SIZE); +g_assert_cmpint(ret, ==, 0); +close(fd); + +last_tmp_path = g_malloc0(strlen(tmp_path)); +strcpy(last_tmp_path, tmp_path); + +snprintf(cmdline, 100, -drive if=none,id=drive0,file=%s +-device virtio-blk-pci,drive=drive0,addr=%x.%x, +tmp_path, PCI_SLOT, PCI_FN); +qtest_start(cmdline); Please unlink the temporary disk image file here. The QEMU process has a file descriptor open when we reach this point, so it's safe to delete it on disk (the file stays allocated until the last file descriptor is closed). This is important so that the temporary file is always deleted in failure cases. We do not reach test_end() when an assertion fails. pgpjYo2EWJU2Q.pgp Description: PGP signature
Re: [Qemu-devel] AArch64 ELF File Loading
Hi Peter, On 07/25/2014 10:41 AM, Peter Maydell wrote: On 25 July 2014 15:35, Christopher Covington c...@codeaurora.org wrote: qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt \ -cpu cortex-a57 -m 3G -semihosting -kernel hello qemu: fatal: Trying to execute code outside RAM or ROM at 0x This means your code took an exception (and there's no RAM at the low address where the vector table is by default). Try -d in_asm,exec,int to get a better idea of what's being executed. qemu-system-aarch64 -nodefaults -nographic -monitor none \ -M virt -cpu cortex-a57 -m 3G -semihosting -kernel hello \ -d in_asm,exec,int qemu: fatal: Trying to execute code outside RAM or ROM at 0x PC= SP= X00= X01= X02= ... The binary runs fine on at least one other simulator. Also, where do you expect the output from that printf to be going? Does your gcc/bare metal libc write to the UART? How does it know what address the UART is? If it's expecting to do semihosting for output, then you're running into the fact that we don't implement semihosting for AArch64 yet. I have local patches adding semihosting for AArch64. I hope eventually be able to share them and other changes, but the approvals will likely take a while longer. Here is an example that doesn't use semihosting. wget http://releases.linaro.org/14.06/components/toolchain/binaries/gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux.tar.xz tar xf gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux.tar.xz echo '.global _start _start: mrs x0, midr_el1 b _start' hello.S gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux/bin/aarch64-none-elf-gcc \ -nostdlib -Ttext=0x4000 hello.S -o hello qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt \ -cpu cortex-a57 -m 3G -semihosting -kernel hello -d in_asm,exec,int qemu: fatal: Trying to execute code outside RAM or ROM at 0x Thanks, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Re: [Qemu-devel] AArch64 ELF File Loading
On 25 July 2014 16:05, Christopher Covington c...@codeaurora.org wrote: I have local patches adding semihosting for AArch64. I hope eventually be able to share them and other changes, but the approvals will likely take a while longer. That would be good; there is demand from other quarters for semihosting support. wget http://releases.linaro.org/14.06/components/toolchain/binaries/gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux.tar.xz tar xf gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux.tar.xz echo '.global _start _start: mrs x0, midr_el1 b _start' hello.S gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux/bin/aarch64-none-elf-gcc \ -nostdlib -Ttext=0x4000 hello.S -o hello qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt \ -cpu cortex-a57 -m 3G -semihosting -kernel hello -d in_asm,exec,int qemu: fatal: Trying to execute code outside RAM or ROM at 0x Doh. We never implemented the AArch64 support for booting plain ELF files in hw/arm/boot.c:do_cpu_reset(). Patch coming up in a second; with that I can see via the gdbstub that we're going round the loop in the guest code. thanks -- PMM
Re: [Qemu-devel] [PATCH 2/7] tests: Add virtio device initialization
On Thu, Jul 24, 2014 at 08:31:00PM +0200, Marc Marí wrote: +static uint32_t qvirtio_pci_get_features(QVirtioDevice *d) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readl(dev-pdev, dev-addr + QVIRTIO_DEVICE_FEATURES); +} Unused? If it's unused, then it's untested. + +static uint8_t qvirtio_pci_get_status(QVirtioDevice *d) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readb(dev-pdev, dev-addr + QVIRTIO_DEVICE_STATUS); +} Unused? + +static void qvirtio_pci_set_status(QVirtioDevice *d, uint8_t val) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +qpci_io_writeb(dev-pdev, dev-addr + QVIRTIO_DEVICE_STATUS, val); Unused? @@ -73,3 +97,11 @@ QVirtioPCIDevice *qvirtio_pci_device_find(QPCIBus *bus, uint16_t device_type) return dev; } + +void qvirtio_pci_enable_device(QVirtioPCIDevice *d) +{ +qpci_device_enable(d-pdev); +d-addr = qpci_iomap(d-pdev, 0); +g_assert(d-addr != NULL); +} Where is qpci_iounmap() called to clean up? @@ -67,6 +69,18 @@ static void pci_basic(void) g_assert_cmphex(dev-vdev.device_type, ==, QVIRTIO_BLK_DEVICE_ID); g_assert_cmphex(dev-pdev-devfn, ==, ((PCI_SLOT 3) | PCI_FN)); +qvirtio_pci_enable_device(dev); +qvirtio_reset(qvirtio_pci, dev-vdev); +qvirtio_set_acknowledge(qvirtio_pci, dev-vdev); +qvirtio_set_driver(qvirtio_pci, dev-vdev); + +/* MSI-X is not enabled */ +addr = dev-addr + QVIRTIO_DEVICE_SPECIFIC_NO_MSIX; + +capacity = qpci_io_readl(dev-pdev, addr) | +qpci_io_readl(dev-pdev, addr+4); +g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE/512); Please add a qvirtio_config_read() function instead of directly accessing the virtio configuration space via PCI. Stefan pgpJkubtbLrqG.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH 5/7] libqos: Change free function called in malloc
On Thu, Jul 24, 2014 at 08:31:03PM +0200, Marc Marí wrote: Signed-off-by: Marc Marí marc.mari.barc...@gmail.com --- tests/libqos/malloc.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/libqos/malloc.h b/tests/libqos/malloc.h index 46f6000..5565381 100644 --- a/tests/libqos/malloc.h +++ b/tests/libqos/malloc.h @@ -32,7 +32,7 @@ static inline uint64_t guest_alloc(QGuestAllocator *allocator, size_t size) static inline void guest_free(QGuestAllocator *allocator, uint64_t addr) { -allocator-alloc(allocator, addr); +allocator-free(allocator, addr); } Hahahahahahaha! Reviewed-by: Stefan Hajnoczi stefa...@redhat.com pgp_Qq7L8ivx8.pgp Description: PGP signature
[Qemu-devel] [PATCH] hw/arm/boot: Set PC correctly when loading AArch64 ELF files
The code in do_cpu_reset() correctly handled AArch64 CPUs when running Linux kernels, but was missing code in the branch of the if() that deals with loading ELF files. Correctly jump to the ELF entry point on reset rather than leaving the reset PC at zero. Reported-by: Christopher Covington c...@codeaurora.org Signed-off-by: Peter Maydell peter.mayd...@linaro.org Cc: qemu-sta...@nongnu.org --- hw/arm/boot.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/hw/arm/boot.c b/hw/arm/boot.c index 3d1f4a2..1241761 100644 --- a/hw/arm/boot.c +++ b/hw/arm/boot.c @@ -417,8 +417,12 @@ static void do_cpu_reset(void *opaque) if (info) { if (!info-is_linux) { /* Jump to the entry point. */ -env-regs[15] = info-entry 0xfffe; -env-thumb = info-entry 1; +if (env-aarch64) { +env-pc = info-entry; +} else { +env-regs[15] = info-entry 0xfffe; +env-thumb = info-entry 1; +} } else { if (CPU(cpu) == first_cpu) { if (env-aarch64) { -- 1.9.1
Re: [Qemu-devel] [PATCH 3/7] libqtest: add QTEST_LOG for debugging qtest testcases
On Thu, Jul 24, 2014 at 08:31:01PM +0200, Marc Marí wrote: @@ -397,10 +398,18 @@ QDict *qtest_qmpv(QTestState *s, const char *fmt, va_list ap) /* No need to send anything for an empty QObject. */ if (qobj) { +size_t len; +int log = getenv(QTEST_LOG) != NULL; QString *qstr = qobject_to_json(qobj); const char *str = qstring_get_str(qstr); size_t size = qstring_get_length(qstr); +if (log) { +len = write(2, str, size); +if (len != size) { +fprintf(stderr, Could not log\n); It's a bit funny that we print an error message to stderr after failing to write to stderr. Why not just fprintf(stderr, %s, str) instead of using write()? pgpL2e_NNatZz.pgp Description: PGP signature
[Qemu-devel] [PATCH RFC v2 00/12] VMState testing
Hi, The following patch introduce a mechanism to test the correctness of the vmstate's information. This is achieved by saving the device states' information to a memory buffer and then clearing the states, followed by loading the data from the saved memory buffer. v1 -- v2: * Added a list containing all the devices that have been qdevified and gets registered with the SaveStateEntry. * Have provided a way to use either qemu_system_reset functionality or my own version of qdev entries untill all the devices have been qdevified. This gives us the privilege to test any device we want. * Rename some of the variables. I am very bad at naming convention, thanks to community, specially Eric, I try to improve it with every version. * On Eric's advice, I have separated all of the qmp and hmp interface patches. * Changed the DPRINTF statements as required. Dr. David Alan Gilbert (1): QEMUSizedBuffer/QEMUFile Sanidhya Kashyap (11): reset handler for qdevified devices VMState test: query command to extract the qdevified device names VMState test: hmp interface for showing qdevified devices VMstate test: basic VMState testing mechanism VMState test: hmp interface for vmstate testing VMState test: qmp interface for querying the vmstate testing process VMState test: hmp interface for querying the vmstate testing process VMState test: update period of vmstate testing process VMState test: hmp interface for period update VMState test: cancel mechanism for an already running vmstate testing process VMState test: hmp interface for cancel mechanism hmp-commands.hx | 48 + hmp.c | 73 hmp.h | 5 + include/migration/qemu-file.h | 27 +++ monitor.c | 14 ++ qapi-schema.json | 103 +++ qemu-file.c | 411 ++ qmp-commands.hx | 129 + savevm.c | 357 9 files changed, 1167 insertions(+) -- 1.9.3
[Qemu-devel] [PATCH RFC v2 01/12] QEMUSizedBuffer/QEMUFile
From: Dr. David Alan Gilbert dgilb...@redhat.com Stefan Berger's to create a QEMUFile that goes to a memory buffer; from: http://lists.gnu.org/archive/html/qemu-devel/2013-03/msg05036.html Using the QEMUFile interface, this patch adds support functions for operating on in-memory sized buffers that can be written to or read from. Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com Signed-off-by: Joel Schopp jsch...@linux.vnet.ibm.com --- include/migration/qemu-file.h | 27 +++ qemu-file.c | 411 ++ 2 files changed, 438 insertions(+) diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h index c90f529..14e1f4d 100644 --- a/include/migration/qemu-file.h +++ b/include/migration/qemu-file.h @@ -25,6 +25,8 @@ #define QEMU_FILE_H 1 #include exec/cpu-common.h +#include stdint.h + /* This function writes a chunk of data to a file at the given position. * The pos argument can be ignored if the file is only being used for * streaming. The handler should try to write all of the data it can. @@ -94,11 +96,31 @@ typedef struct QEMUFileOps { QEMURamSaveFunc *save_page; } QEMUFileOps; +struct QEMUSizedBuffer { +struct iovec *iov; +size_t n_iov; +size_t size; /* total allocated size in all iov's */ +size_t used; /* number of used bytes */ +}; + +typedef struct QEMUSizedBuffer QEMUSizedBuffer; +QEMUSizedBuffer *qsb_create(const uint8_t *buffer, size_t len); +QEMUSizedBuffer *qsb_clone(const QEMUSizedBuffer *); +void qsb_free(QEMUSizedBuffer *); +size_t qsb_set_length(QEMUSizedBuffer *qsb, size_t length); +size_t qsb_get_length(const QEMUSizedBuffer *qsb); +ssize_t qsb_get_buffer(const QEMUSizedBuffer *, off_t start, size_t count, + uint8_t **buf); +ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t *buf, + off_t pos, size_t count); + + QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops); QEMUFile *qemu_fopen(const char *filename, const char *mode); QEMUFile *qemu_fdopen(int fd, const char *mode); QEMUFile *qemu_fopen_socket(int fd, const char *mode); QEMUFile *qemu_popen_cmd(const char *command, const char *mode); +QEMUFile *qemu_bufopen(const char *mode, QEMUSizedBuffer *input); int qemu_get_fd(QEMUFile *f); int qemu_fclose(QEMUFile *f); int64_t qemu_ftell(QEMUFile *f); @@ -111,6 +133,11 @@ void qemu_put_byte(QEMUFile *f, int v); void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, int size); bool qemu_file_mode_is_not_valid(const char *mode); +/* + * For use on files opened with qemu_bufopen + */ +const QEMUSizedBuffer *qemu_buf_get(QEMUFile *f); + static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v) { qemu_put_byte(f, (int)v); diff --git a/qemu-file.c b/qemu-file.c index a8e3912..9fd1675 100644 --- a/qemu-file.c +++ b/qemu-file.c @@ -878,3 +878,414 @@ uint64_t qemu_get_be64(QEMUFile *f) v |= qemu_get_be32(f); return v; } + + +#define QSB_CHUNK_SIZE (1 10) +#define QSB_MAX_CHUNK_SIZE (10 * QSB_CHUNK_SIZE) + +/** + * Create a QEMUSizedBuffer + * This type of buffer uses scatter-gather lists internally and + * can grow to any size. Any data array in the scatter-gather list + * can hold different amount of bytes. + * + * @buffer: Optional buffer to copy into the QSB + * @len: size of initial buffer; if @buffer is given, buffer must + * hold at least len bytes + * + * Returns a pointer to a QEMUSizedBuffer + */ +QEMUSizedBuffer *qsb_create(const uint8_t *buffer, size_t len) +{ +QEMUSizedBuffer *qsb; +size_t alloc_len, num_chunks, i, to_copy; +size_t chunk_size = (len QSB_MAX_CHUNK_SIZE) +? QSB_MAX_CHUNK_SIZE +: QSB_CHUNK_SIZE; + +if (len == 0) { +/* we want to allocate at least one chunk */ +len = QSB_CHUNK_SIZE; +} + +num_chunks = DIV_ROUND_UP(len, chunk_size); +alloc_len = num_chunks * chunk_size; + +qsb = g_new0(QEMUSizedBuffer, 1); +qsb-iov = g_new0(struct iovec, num_chunks); +qsb-n_iov = num_chunks; + +for (i = 0; i num_chunks; i++) { +qsb-iov[i].iov_base = g_malloc0(chunk_size); +qsb-iov[i].iov_len = chunk_size; +if (buffer) { +to_copy = (len - qsb-used) chunk_size + ? chunk_size : (len - qsb-used); +memcpy(qsb-iov[i].iov_base, buffer[qsb-used], to_copy); +qsb-used += to_copy; +} +} + +qsb-size = alloc_len; + +return qsb; +} + +/** + * Free the QEMUSizedBuffer + * + * @qsb: The QEMUSizedBuffer to free + */ +void qsb_free(QEMUSizedBuffer *qsb) +{ +size_t i; + +if (!qsb) { +return; +} + +for (i = 0; i qsb-n_iov; i++) { +g_free(qsb-iov[i].iov_base); +} +g_free(qsb-iov); +g_free(qsb); +} + +/** + * Get the number of of used bytes in the QEMUSizedBuffer + * + * @qsb: A QEMUSizedBuffer + * + * Returns the number of bytes currently used
[Qemu-devel] [PATCH RFC v2 03/12] VMState test: query command to extract the qdevified device names
I have provided a qmp interface for getting the list of qdevified devices that have been registered with SaveVMHandlers. Signed-off-by: Sanidhya Kashyap sanidhya.ii...@gmail.com --- qapi-schema.json | 22 ++ qmp-commands.hx | 25 + savevm.c | 34 ++ 3 files changed, 81 insertions(+) diff --git a/qapi-schema.json b/qapi-schema.json index b11aad2..996e6b5 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3480,3 +3480,25 @@ # Since: 2.1 ## { 'command': 'rtc-reset-reinjection' } + +## +# @VMstatesQdevDevices +# +# list of qdevified devices that are registered with SaveStateEntry +# +# @device: list of qdevified device names +# +# Since 2.2 +## +{ 'type': 'VMStatesQdevDevices', + 'data': { 'device': ['str'] } } + +## +# @query-qdev-devices +# +# returns the VMStatesQdevDevices that have the associated value +# +# Since 2.2 +## +{ 'command': 'query-qdev-devices', + 'returns': 'VMStatesQdevDevices' } diff --git a/qmp-commands.hx b/qmp-commands.hx index 4be4765..2e20032 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3755,3 +3755,28 @@ Example: - { return: {} } EQMP + +{ +.name = query-qdev-devices, +.args_type = , +.mhandler.cmd_new = qmp_marshal_input_query_qdev_devices, +}, + +SQMP +query-qdev-devices +-- + +Shows registered Qdevified devices + + +Example (1): + +- { execute: query-qdev-devices } +- { return: [ + { + devices: [ kvm-tpr-opt, piix4_pm ] + } + ] + } + +EQMP diff --git a/savevm.c b/savevm.c index 0255fa0..7c1600a 100644 --- a/savevm.c +++ b/savevm.c @@ -1167,6 +1167,40 @@ void do_savevm(Monitor *mon, const QDict *qdict) } } +static strList *create_qdev_list(const char *name, strList *list) +{ +strList *temp_list; +int len; + +if (!list) { +list = g_malloc0(sizeof(strList)); +len = strlen(name); +list-value = g_malloc0(sizeof(char)*(len+1)); +strcpy(list-value, name); +list-next = NULL; +return list; +} +temp_list = g_malloc0(sizeof(strList)); +len = strlen(name); +temp_list-value = g_malloc0(sizeof(char)*(len+1)); +strcpy(temp_list-value, name); +temp_list-next = list; +list = temp_list; +return list; +} + +VMStatesQdevDevices *qmp_query_qdev_devices(Error **errp) +{ +VMStatesQdevResetEntry *qre; +VMStatesQdevDevices *qdev_devices = g_malloc0(sizeof(VMStatesQdevDevices)); + +QTAILQ_FOREACH(qre, vmstate_reset_handlers, entry) { +qdev_devices-device = create_qdev_list(qre-device_name, + qdev_devices-device); +} +return qdev_devices; +} + void qmp_xen_save_devices_state(const char *filename, Error **errp) { QEMUFile *f; -- 1.9.3
[Qemu-devel] [PATCH RFC v2 05/12] VMstate test: basic VMState testing mechanism
In this patch, I have made the following changes: * changed the DPRINT statement. * renamed the variables. * added noqdev variable which decides which option to use for resetting. * added devices option which can help in resetting one or many devices (only qdevified ones). * updated the documentation. Signed-off-by: Sanidhya Kashyap sanidhya.ii...@gmail.com --- qapi-schema.json | 26 ++ qmp-commands.hx | 37 savevm.c | 251 +++ 3 files changed, 314 insertions(+) diff --git a/qapi-schema.json b/qapi-schema.json index 996e6b5..ec48977 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3502,3 +3502,29 @@ ## { 'command': 'query-qdev-devices', 'returns': 'VMStatesQdevDevices' } + +## +# @test-vmstates +# +# tests the vmstates' value by dumping and loading in memory +# +# @iterations: (optional) The total iterations for vmstate testing. +# The min and max defind value is 10 and 100 respectively. +# +# @period: (optional) sleep interval between iteration (in milliseconds). +# The default interval is 100 milliseconds with min and max being +# 1 and 1 respectively. +# +# @noqdev: boolean variable which decides whether to use qdevified devices +# or not. Will be removed when all the devices have been qdevified. +# +# @devices: (optional) helps in resetting particular qdevified decices +# that have been registered with SaveStateEntry +# +# Since 2.2 +## +{ 'command': 'test-vmstates', + 'data': {'*iterations': 'int', + '*period': 'int', + 'noqdev': 'bool', + '*qdevices': 'VMStatesQdevDevices' } } diff --git a/qmp-commands.hx b/qmp-commands.hx index 2e20032..6210f56 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3778,5 +3778,42 @@ Example (1): } ] } +EQMP + +{ +.name = test-vmstates, +.args_type = iterations:i?,period:i?,noqdev:b,qdevices:O?, +.mhandler.cmd_new = qmp_marshal_input_test_vmstates, +}, + +SQMP +test-vmstates +- + +Tests the vmstates' entry by dumping and loading in/from memory + +Arguments: +- iterations: (optional) The total iterations for vmstate testing. + The min and max defined value is 10 and 100 respectively. + +- period: (optional) sleep interval between iteration (in milliseconds). +The default interval is 100 milliseconds with min and max being +1 and 1 respectively. + +- noqdev: boolean variable which decides whether to use qdev or not. +Will be removed when all the devices have been qdevified. + +- devices: (optional) helps in resetting particular qdevified decices + that have been registered with SaveStateEntry + + +Example: + +- { execute: test-vmstates, + arguments: { +iterations: 10, +period: 100, +noqdev: false } } +- { return: {} } EQMP diff --git a/savevm.c b/savevm.c index 7c1600a..304d8fc 100644 --- a/savevm.c +++ b/savevm.c @@ -1201,6 +1201,257 @@ VMStatesQdevDevices *qmp_query_qdev_devices(Error **errp) return qdev_devices; } +#define DEBUG_TEST_VMSTATES 1 + +#ifndef DEBUG_TEST_VMSTATES +#define DEBUG_TEST_VMSTATES 0 +#endif + +#define DPRINTF(fmt, ...) \ +do { \ +if (DEBUG_TEST_VMSTATES) { \ +printf(vmstate_test: fmt, ## __VA_ARGS__); \ +} \ +} while (0) + +#define TEST_VMSTATE_MIN_TIMES 10 +#define TEST_VMSTATE_MAX_TIMES 1000 + +#define TEST_VMSTATE_MIN_INTERVAL_MS 1 +#define TEST_VMSTATE_DEFAULT_INTERVAL_MS 100 +#define TEST_VMSTATE_MAX_INTERVAL_MS 1 + +typedef struct VMStateLogState VMStateLogState; + +struct VMStateLogState { +int64_t current_iteration; +int64_t iterations; +int64_t period; +bool active_state; +bool noqdev; +VMStatesQdevDevices *qdevices; +QEMUTimer *timer; + +QTAILQ_HEAD(qdev_list, VMStatesQdevResetEntry) qdev_list; +}; + +static VMStateLogState *vmstate_current_state(void) +{ +static VMStateLogState current_state = { +.active_state = false, +}; + +return current_state; +} + +static inline void test_vmstates_clear_qdev_entries(VMStateLogState *v) +{ +VMStatesQdevResetEntry *qre, *new_qre; +QTAILQ_FOREACH_SAFE(qre, v-qdev_list, entry, new_qre) { +QTAILQ_REMOVE(v-qdev_list, qre, entry); +} +} + +static inline bool check_device_name(VMStateLogState *v, + VMStatesQdevDevices *qdevices, + Error **errp) +{ +VMStatesQdevResetEntry *qre; +strList *devices_name = qdevices-device; +QTAILQ_INIT(v-qdev_list); +bool device_present; + +/* now, checking against each one */ +for (; devices_name; devices_name = devices_name-next) { +device_present = false; +VMStatesQdevResetEntry *new_qre; +QTAILQ_FOREACH(qre, vmstate_reset_handlers, entry) { +if (!strcmp(qre-device_name, devices_name-value)) { + +
[Qemu-devel] [PATCH RFC v2 04/12] VMState test: hmp interface for showing qdevified devices
This patch provides the hmp interface for qdevified devices list. Signed-off-by: Sanidhya Kashyap sanidhya.ii...@gmail.com --- hmp-commands.hx | 2 ++ hmp.c | 21 + hmp.h | 1 + monitor.c | 7 +++ 4 files changed, 31 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index d0943b1..4603de5 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1780,6 +1780,8 @@ show qdev device model list show roms @item info tpm show the TPM device +@item info qdev_devices +show the qdevified devices registered with migration capability @end table ETEXI diff --git a/hmp.c b/hmp.c index 4d1838e..d1dd7d2 100644 --- a/hmp.c +++ b/hmp.c @@ -1714,3 +1714,24 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict) monitor_printf(mon, \n); } + +void hmp_info_qdev_devices(Monitor *mon, const QDict *qdict) +{ +Error *err = NULL; +VMStatesQdevDevices *qdev_devices = qmp_query_qdev_devices(err); +strList *list = NULL; + +if (qdev_devices) { +list = qdev_devices-device; +while (list) { +monitor_printf(mon, %s\n, list-value); +list = list-next; +} +} + +if (err) { +hmp_handle_error(mon, err); +} + +qapi_free_strList(list); +} diff --git a/hmp.h b/hmp.h index 4fd3c4a..d179454 100644 --- a/hmp.h +++ b/hmp.h @@ -38,6 +38,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict); void hmp_info_pci(Monitor *mon, const QDict *qdict); void hmp_info_block_jobs(Monitor *mon, const QDict *qdict); void hmp_info_tpm(Monitor *mon, const QDict *qdict); +void hmp_info_qdev_devices(Monitor *mon, const QDict *qdict); void hmp_quit(Monitor *mon, const QDict *qdict); void hmp_stop(Monitor *mon, const QDict *qdict); void hmp_system_reset(Monitor *mon, const QDict *qdict); diff --git a/monitor.c b/monitor.c index 5bc70a6..bf828d6 100644 --- a/monitor.c +++ b/monitor.c @@ -2918,6 +2918,13 @@ static mon_cmd_t info_cmds[] = { .mhandler.cmd = hmp_info_memdev, }, { +.name = qdev_devices, +.args_type = , +.params = , +.help = show registered qdevified devices, +.mhandler.cmd = hmp_info_qdev_devices, +}, +{ .name = NULL, }, }; -- 1.9.3
[Qemu-devel] [PATCH RFC v2 02/12] reset handler for qdevified devices
I have added a structure containing the list of qdevified devices which have been added to the SaveVMHandlers. Since, I was unable to find any particular struct containing the information about all the qdevified devices. So, I have created my own version, which is very very specific. I shall be grateful if anyone can give some pointers on this. Signed-off-by: Sanidhya Kashyap sanidhya.ii...@gmail.com --- savevm.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/savevm.c b/savevm.c index e19ae0a..0255fa0 100644 --- a/savevm.c +++ b/savevm.c @@ -503,12 +503,29 @@ void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque) } } +/* + * Reset entry for qdevified devices. + * Contains all those devices which have been qdevified and are + * part of the SaveVMHandlers. This one allows resetting of + * single device at any time. + */ + +typedef struct VMStatesQdevResetEntry { +QTAILQ_ENTRY(VMStatesQdevResetEntry) entry; +DeviceState *dev; +char device_name[256]; +} VMStatesQdevResetEntry; + +static QTAILQ_HEAD(vmstate_reset_handlers, VMStatesQdevResetEntry) + vmstate_reset_handlers = QTAILQ_HEAD_INITIALIZER(vmstate_reset_handlers); + int vmstate_register_with_alias_id(DeviceState *dev, int instance_id, const VMStateDescription *vmsd, void *opaque, int alias_id, int required_for_version) { SaveStateEntry *se; +VMStatesQdevResetEntry *qre; /* If this triggers, alias support can be dropped for the vmsd. */ assert(alias_id == -1 || required_for_version = vmsd-minimum_version_id); @@ -521,6 +538,12 @@ int vmstate_register_with_alias_id(DeviceState *dev, int instance_id, se-alias_id = alias_id; if (dev) { + +qre = g_malloc0(sizeof(VMStatesQdevResetEntry)); +qre-dev = dev; +strcpy(qre-device_name, vmsd-name); +QTAILQ_INSERT_TAIL(vmstate_reset_handlers, qre, entry); + char *id = qdev_get_dev_path(dev); if (id) { pstrcpy(se-idstr, sizeof(se-idstr), id); @@ -551,6 +574,7 @@ void vmstate_unregister(DeviceState *dev, const VMStateDescription *vmsd, void *opaque) { SaveStateEntry *se, *new_se; +VMStatesQdevResetEntry *qre, *new_qre; QTAILQ_FOREACH_SAFE(se, savevm_handlers, entry, new_se) { if (se-vmsd == vmsd se-opaque == opaque) { @@ -561,6 +585,12 @@ void vmstate_unregister(DeviceState *dev, const VMStateDescription *vmsd, g_free(se); } } + +QTAILQ_FOREACH_SAFE(qre, vmstate_reset_handlers, entry, new_qre) { +if (dev qre-dev !strcmp(vmsd-name, qre-device_name)) { +QTAILQ_REMOVE(vmstate_reset_handlers, qre, entry); +} +} } static int vmstate_load(QEMUFile *f, SaveStateEntry *se, int version_id) -- 1.9.3
[Qemu-devel] [PATCH RFC v2 06/12] VMState test: hmp interface for vmstate testing
I have added the hmp interface for vmstate testing. Currently, the patch does not support the qdev list, since I could not figure out how to the pass the VMStatesQdevDevices struct which can be parsed and used. Signed-off-by: Sanidhya Kashyap sanidhya.ii...@gmail.com --- hmp-commands.hx | 15 +++ hmp.c | 18 ++ hmp.h | 1 + 3 files changed, 34 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index 4603de5..6af72a6 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1790,6 +1790,21 @@ STEXI show available trace events and their state ETEXI + { +.name = test-vmstates, +.args_type = iterations:i?,period:i?, +.params = total_iterations sleep_interval, +.help = test the vmstates by dumping and loading form memory\n\t\t\t + iterations: (optional) number of times, the vmstates will be tested\n\t\t\t + period: (optional) sleep interval in milliseconds between each iteration, +.mhandler.cmd = hmp_test_vmstates, +}, +STEXI +@item test-vmstates +@findex test-vmstates +dumps and reads the device state's data from the memory for testing purpose +ETEXI + STEXI @end table ETEXI diff --git a/hmp.c b/hmp.c index d1dd7d2..6c998d2 100644 --- a/hmp.c +++ b/hmp.c @@ -1735,3 +1735,21 @@ void hmp_info_qdev_devices(Monitor *mon, const QDict *qdict) qapi_free_strList(list); } + +void hmp_test_vmstates(Monitor *mon, const QDict *qdict) +{ +int has_iterations = qdict_haskey(qdict, iterations); +int64_t iterations = qdict_get_try_int(qdict, iterations, 10); +int has_period = qdict_haskey(qdict, period); +int64_t period = qdict_get_try_int(qdict, period, 100); + +Error *err = NULL; + +qmp_test_vmstates(has_iterations, iterations, has_period, period, + true, false, NULL, err); + +if (err) { +monitor_printf(mon, test-vmstates: %s\n, error_get_pretty(err)); +error_free(err); +} +} diff --git a/hmp.h b/hmp.h index d179454..41bc781 100644 --- a/hmp.h +++ b/hmp.h @@ -95,6 +95,7 @@ void hmp_cpu_add(Monitor *mon, const QDict *qdict); void hmp_object_add(Monitor *mon, const QDict *qdict); void hmp_object_del(Monitor *mon, const QDict *qdict); void hmp_info_memdev(Monitor *mon, const QDict *qdict); +void hmp_test_vmstates(Monitor *mon, const QDict *qdict); void object_add_completion(ReadLineState *rs, int nb_args, const char *str); void object_del_completion(ReadLineState *rs, int nb_args, const char *str); void device_add_completion(ReadLineState *rs, int nb_args, const char *str); -- 1.9.3
[Qemu-devel] [PATCH RFC v2 10/12] VMState test: hmp interface for period update
Signed-off-by: Sanidhya Kashyap sanidhya.ii...@gmail.com --- hmp-commands.hx | 15 +++ hmp.c | 14 ++ hmp.h | 1 + 3 files changed, 30 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index c1dc6a2..6d15184 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1807,6 +1807,21 @@ STEXI dumps and reads the device state's data from the memory for testing purpose ETEXI +{ +.name = test_vmstates_set_period, +.args_type = period:i, +.params = period, +.help = set the sleep interval for vmstates testing process\n\t\t\t + period: the new sleep interval value to replace the existing, +.mhandler.cmd = hmp_test_vmstates_set_period, +}, + +STEXI +@item test_vmstates_set_period @var{period} +@findex test_vmstates_set_period +Set the period to @var{period} (int) for vmstate testing process. +ETEXI + STEXI @end table ETEXI diff --git a/hmp.c b/hmp.c index 385fb99..f54b0b9 100644 --- a/hmp.c +++ b/hmp.c @@ -1767,3 +1767,17 @@ void hmp_info_test_vmstates(Monitor *mon, const QDict *qdict) qapi_free_VMStateLogStateInfo(log_info); } + +void hmp_test_vmstates_set_period(Monitor *mon, const QDict *qdict) +{ +int64_t period = qdict_get_int(qdict, period); +Error *err = NULL; + +qmp_test_vmstates_set_period(period, err); + +if (err) { +monitor_printf(mon, test-vmstates-set-period: %s\n, + error_get_pretty(err)); +error_free(err); +} +} diff --git a/hmp.h b/hmp.h index b77f14c..e1afde8 100644 --- a/hmp.h +++ b/hmp.h @@ -97,6 +97,7 @@ void hmp_object_add(Monitor *mon, const QDict *qdict); void hmp_object_del(Monitor *mon, const QDict *qdict); void hmp_info_memdev(Monitor *mon, const QDict *qdict); void hmp_test_vmstates(Monitor *mon, const QDict *qdict); +void hmp_test_vmstates_set_period(Monitor *mon, const QDict *qdict); void object_add_completion(ReadLineState *rs, int nb_args, const char *str); void object_del_completion(ReadLineState *rs, int nb_args, const char *str); void device_add_completion(ReadLineState *rs, int nb_args, const char *str); -- 1.9.3
[Qemu-devel] [PATCH RFC v2 08/12] VMState test: hmp interface for querying the vmstate testing process
Added a hmp interface for providing the information about the testing process. I have used the underscore as a separater on Eric's advice. But, I have found some of the commands having hyphen. Signed-off-by: Sanidhya Kashyap sanidhya.ii...@gmail.com --- hmp-commands.hx | 2 ++ hmp.c | 14 ++ hmp.h | 1 + monitor.c | 7 +++ 4 files changed, 24 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index 6af72a6..c1dc6a2 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1770,6 +1770,8 @@ show migration status show current migration capabilities @item info migrate_cache_size show current migration XBZRLE cache size +@item info test_vmstates +show current vmstates testing process info @item info balloon show balloon information @item info qtree diff --git a/hmp.c b/hmp.c index 9e01127..385fb99 100644 --- a/hmp.c +++ b/hmp.c @@ -1753,3 +1753,17 @@ void hmp_test_vmstates(Monitor *mon, const QDict *qdict) error_free(err); } } + +void hmp_info_test_vmstates(Monitor *mon, const QDict *qdict) +{ +VMStateLogStateInfo *log_info = qmp_query_test_vmstates(NULL); + +if (log_info) { +monitor_printf(mon, current-iteration: %PRId64 \n +iterations: %PRId64 \n +period: %PRId64 \n, log_info-current_iteration, +log_info-iterations, log_info-period); +} + +qapi_free_VMStateLogStateInfo(log_info); +} diff --git a/hmp.h b/hmp.h index 41bc781..b77f14c 100644 --- a/hmp.h +++ b/hmp.h @@ -39,6 +39,7 @@ void hmp_info_pci(Monitor *mon, const QDict *qdict); void hmp_info_block_jobs(Monitor *mon, const QDict *qdict); void hmp_info_tpm(Monitor *mon, const QDict *qdict); void hmp_info_qdev_devices(Monitor *mon, const QDict *qdict); +void hmp_info_test_vmstates(Monitor *mon, const QDict *qdict); void hmp_quit(Monitor *mon, const QDict *qdict); void hmp_stop(Monitor *mon, const QDict *qdict); void hmp_system_reset(Monitor *mon, const QDict *qdict); diff --git a/monitor.c b/monitor.c index bf828d6..427eef1 100644 --- a/monitor.c +++ b/monitor.c @@ -2862,6 +2862,13 @@ static mon_cmd_t info_cmds[] = { .mhandler.cmd = hmp_info_migrate_capabilities, }, { +.name = test_vmstates, +.args_type = , +.params = , +.help = show current vmstates testing process info, +.mhandler.cmd = hmp_info_test_vmstates, +}, +{ .name = migrate_cache_size, .args_type = , .params = , -- 1.9.3
[Qemu-devel] [PATCH RFC v2 07/12] VMState test: qmp interface for querying the vmstate testing process
This patch provides the information about an already executing testing process. I have modified the qmp command to query-test-vmstates from test-vmstates-get-info. Signed-off-by: Sanidhya Kashyap sanidhya.ii...@gmail.com --- qapi-schema.json | 34 ++ qmp-commands.hx | 25 + savevm.c | 18 ++ 3 files changed, 77 insertions(+) diff --git a/qapi-schema.json b/qapi-schema.json index ec48977..a12872f 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3528,3 +3528,37 @@ '*period': 'int', 'noqdev': 'bool', '*qdevices': 'VMStatesQdevDevices' } } + +## +# @VMStateLogStateInfo +# +# VMState testing information +# Tells about the current iteration, the total iterations +# that have been provided and the sleep interval +# +# @current-iteration: shows the current iteration at which +# the test is in. +# +# @iterations: the allocated total iterations for the vmstate +# testing process. +# +# @period: the allowed sleep interval between each iteration +# (in milliseconds). +# +# Since 2.2 +## +{ 'type': 'VMStateLogStateInfo', + 'data': { 'current-iteration': 'int', +'iterations':'int', +'period':'int' } } + +## +# @query-test-vmstates +# +# Get the current parameters value of the vmstate testing process. +# +# Returns VMStateLogStateInfo structure. +# +# Since 2.2 +## +{ 'command': 'query-test-vmstates', 'returns': 'VMStateLogStateInfo' } diff --git a/qmp-commands.hx b/qmp-commands.hx index 6210f56..0a40a88 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3817,3 +3817,28 @@ Example: noqdev: false } } - { return: {} } EQMP + +{ +.name = query-test-vmstates, +.args_type = , +.mhandler.cmd_new = qmp_marshal_input_query_test_vmstates, +}, + +SQMP +query-test-vmstates-info + + +Get the parameters information + +- current_iteration: the current iteration going on +- iterations: the total number of assigned iterations +- period: the sleep interval between the iteration + +Example: + +- { execute: query-test-vmstates } +- { return: { +current_iteration: 3, +iterations: 100, +period: 100 } } +EQMP diff --git a/savevm.c b/savevm.c index b5e53b8..793fee7 100644 --- a/savevm.c +++ b/savevm.c @@ -1451,6 +1451,24 @@ void qmp_test_vmstates(bool has_iterations, int64_t iterations, timer_mod(v-timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); } +VMStateLogStateInfo *qmp_query_test_vmstates(Error **errp) +{ +VMStateLogState *v = vmstate_current_state(); +VMStateLogStateInfo *log_info = NULL; + +if (!v-active_state) { +return log_info; +} + +log_info = g_malloc0(sizeof(VMStateLogStateInfo)); + +log_info-current_iteration = v-current_iteration; +log_info-iterations = v-iterations; +log_info-period = v-period; + +return log_info; +} + void qmp_xen_save_devices_state(const char *filename, Error **errp) { QEMUFile *f; -- 1.9.3
[Qemu-devel] [PATCH RFC v2 11/12] VMState test: cancel mechanism for an already running vmstate testing process
Signed-off-by: Sanidhya Kashyap sanidhya.ii...@gmail.com --- qapi-schema.json | 9 + qmp-commands.hx | 19 +++ savevm.c | 16 ++-- 3 files changed, 42 insertions(+), 2 deletions(-) diff --git a/qapi-schema.json b/qapi-schema.json index 13e922e..91f1672 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3574,3 +3574,12 @@ ## { 'command' : 'test-vmstates-set-period', 'data': { 'period': 'int' } } + +## +# @log-dirty-bitmap-cancel +# +# cancel the testing vmstates process +# +# Since 2.2 +## +{ 'command': 'test-vmstates-cancel' } diff --git a/qmp-commands.hx b/qmp-commands.hx index 2f019b0..1035885 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3865,3 +3865,22 @@ Example: - { return: {} } EQMP + { +.name = test-vmstates-cancel, +.args_type = , +.mhandler.cmd_new = qmp_marshal_input_test_vmstates_cancel, +}, + +SQMP +test-vmstates-cancel +-- + +Cancel the current vmstate testing process. + +Arguments: None. + +Example: + +- { execute: test-vmstates-cancel } +- { return: {} } +EQMP diff --git a/savevm.c b/savevm.c index 8b75691..66597db 100644 --- a/savevm.c +++ b/savevm.c @@ -1365,8 +1365,12 @@ static void vmstate_test_cb(void *opaque) if (saved_vm_running) { vm_start(); } -timer_mod(v-timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + - v-period); + if (v-active_state) { +timer_mod(v-timer, v-period + + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); +} else { +goto testing_end; +} return; } @@ -1482,6 +1486,14 @@ void qmp_test_vmstates_set_period(int64_t period, Error **errp) v-period = period; } +void qmp_test_vmstates_cancel(Error **errp) +{ +VMStateLogState *v = vmstate_current_state(); +if (v-active_state) { +v-active_state = false; +} +} + void qmp_xen_save_devices_state(const char *filename, Error **errp) { QEMUFile *f; -- 1.9.3
[Qemu-devel] [PATCH RFC v2 12/12] VMState test: hmp interface for cancel mechanism
Signed-off-by: Sanidhya Kashyap sanidhya.ii...@gmail.com --- hmp-commands.hx | 14 ++ hmp.c | 6 ++ hmp.h | 1 + 3 files changed, 21 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index 6d15184..fe224fc 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1822,6 +1822,20 @@ STEXI Set the period to @var{period} (int) for vmstate testing process. ETEXI + { + .name = test_vmstates_cancel, + .args_type = , + .params = , + .help = cancel the current vmstates testing process, + .mhandler.cmd = hmp_test_vmstates_cancel, +}, + +STEXI +@item test_vmstates_cancel +@findex test_vmstates_cancel +Cancel the current vmstates testing process +ETEXI + STEXI @end table ETEXI diff --git a/hmp.c b/hmp.c index f54b0b9..bbff92a 100644 --- a/hmp.c +++ b/hmp.c @@ -1781,3 +1781,9 @@ void hmp_test_vmstates_set_period(Monitor *mon, const QDict *qdict) error_free(err); } } + +void hmp_test_vmstates_cancel(Monitor *mon, const QDict *qdict) +{ + qmp_test_vmstates_cancel(NULL); +} + diff --git a/hmp.h b/hmp.h index e1afde8..1277dbc 100644 --- a/hmp.h +++ b/hmp.h @@ -98,6 +98,7 @@ void hmp_object_del(Monitor *mon, const QDict *qdict); void hmp_info_memdev(Monitor *mon, const QDict *qdict); void hmp_test_vmstates(Monitor *mon, const QDict *qdict); void hmp_test_vmstates_set_period(Monitor *mon, const QDict *qdict); +void hmp_test_vmstates_cancel(Monitor *mon, const QDict *qdict); void object_add_completion(ReadLineState *rs, int nb_args, const char *str); void object_del_completion(ReadLineState *rs, int nb_args, const char *str); void device_add_completion(ReadLineState *rs, int nb_args, const char *str); -- 1.9.3
[Qemu-devel] [PATCH RFC v2 09/12] VMState test: update period of vmstate testing process
No particular change, except variable name. Since I am not modifying other variables, so I have not made the command generic. Signed-off-by: Sanidhya Kashyap sanidhya.ii...@gmail.com --- qapi-schema.json | 12 qmp-commands.hx | 23 +++ savevm.c | 13 + 3 files changed, 48 insertions(+) diff --git a/qapi-schema.json b/qapi-schema.json index a12872f..13e922e 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3562,3 +3562,15 @@ # Since 2.2 ## { 'command': 'query-test-vmstates', 'returns': 'VMStateLogStateInfo' } + +## +# @test-vmstates-set-period +# +# sets the sleep interval between iterations of the vmstate testing process +# +# @period: the updated sleep interval value (in milliseconds) +# +# Since 2.2 +## +{ 'command' : 'test-vmstates-set-period', + 'data': { 'period': 'int' } } diff --git a/qmp-commands.hx b/qmp-commands.hx index 0a40a88..2f019b0 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3842,3 +3842,26 @@ Example: iterations: 100, period: 100 } } EQMP + +{ +.name = test-vmstates-set-period, +.args_type = period:i, +.mhandler.cmd_new = qmp_marshal_input_test_vmstates_set_period, +}, + +SQMP +test-vmstates-set-period + + +Update the sleep interval for the remaining iterations + +Arguments: + +- period: the updated sleep interval between iterations (json-int) + +Example: + +- { execute: test-vmstates-set-period, arguments: { period: 1024 } } +- { return: {} } +EQMP + diff --git a/savevm.c b/savevm.c index 797be57..d5fb93b 100644 --- a/savevm.c +++ b/savevm.c @@ -1470,6 +1470,19 @@ VMStateLogStateInfo *qmp_query_test_vmstates(Error **errp) return log_info; } +void qmp_test_vmstates_set_period(int64_t period, Error **errp) +{ +VMStateLogState *v = vmstate_current_state(); +if (period TEST_VMSTATE_MIN_INTERVAL_MS || +period TEST_VMSTATE_MAX_INTERVAL_MS) { +error_setg(errp, sleep interval (period) value must be + in the defined range [%d, %d](ms)\n, + TEST_VMSTATE_MIN_INTERVAL_MS, TEST_VMSTATE_MAX_INTERVAL_MS); +return; +} +v-period = period; +} + void qmp_xen_save_devices_state(const char *filename, Error **errp) { QEMUFile *f; -- 1.9.3
[Qemu-devel] [PATCH for-2.1 0/2] Fix migration failure due to ACPI tables size changes
Changing the ACPI table size causes migration to break, and the memory hotplug work opened our eyes on how horribly we were breaking things in 2.0 already. To trigger issue start QEMU-1.7 with -M pc-i440fx-1.7 -device pci-bridge,chassis_nr=1 and try to migrate to QEMU-2.1 or QEMU-2.0 as result target will fail with: qemu-system-x86_64: Length mismatch: /rom@etc/acpi/tables: 2000 in != 3000 This fix allows target QEMU to load smaller RAMBlock into a bigger one and fixes regression which was introduced since 2.0, allowing forward migration from 1.7/2.0 to 2.1 Fix is also suitable for stable-2.0. Igor Mammedov (2): migration: load smaller RAMBlock to a bigger one if permitted acpi: mark ACPI tables ROM blob as extend-able on migration arch_init.c | 19 ++- exec.c | 15 +-- hw/core/loader.c| 6 +- hw/i386/acpi-build.c| 2 +- include/exec/cpu-all.h | 15 ++- include/exec/memory.h | 10 ++ include/exec/ram_addr.h | 1 + include/hw/loader.h | 5 +++-- memory.c| 5 + 9 files changed, 62 insertions(+), 16 deletions(-) -- 1.8.3.1
[Qemu-devel] [PATCH for-2.1 2/2] acpi: mark ACPI tables ROM blob as extend-able on migration
It fixes migration failure for machine type pc-i440fx-1.7 from QEMU 1.7/2.0 to QEMU 2.1 Migration fails due to ACPI tables size grows across 1.7-2.1 versions. That causes ACPI tables ROM blob to change its size differently for the same configurations on different QEMU versions. As result migration code bails out when attempting to load smaller ROM blob into a bigger one on a newer QEMU. To trigger failure it's enough to add pci-bridge device to QEMU CLI Marking ACPI tables ROM blob as extend-able on migration allows QEMU to load smaller ACPI tables from QEMU 1.7/2.0, fixing forward migration failure introduced since 2.0 which affects only configurations that cause ACPI ROM blob size change. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/core/loader.c | 6 +- hw/i386/acpi-build.c | 2 +- include/hw/loader.h | 5 +++-- 3 files changed, 9 insertions(+), 4 deletions(-) diff --git a/hw/core/loader.c b/hw/core/loader.c index 2bf6b8f..d09b894 100644 --- a/hw/core/loader.c +++ b/hw/core/loader.c @@ -722,7 +722,8 @@ err: void *rom_add_blob(const char *name, const void *blob, size_t len, hwaddr addr, const char *fw_file_name, - FWCfgReadCallback fw_callback, void *callback_opaque) + FWCfgReadCallback fw_callback, void *callback_opaque, + bool extendable) { Rom *rom; void *data = NULL; @@ -742,6 +743,9 @@ void *rom_add_blob(const char *name, const void *blob, size_t len, if (rom_file_has_mr) { data = rom_set_mr(rom, OBJECT(fw_cfg), devpath); +if (extendable) { +memory_region_permit_extendable_migration(rom-mr); +} } else { data = rom-data; } diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index ebc5f03..651c06b 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1558,7 +1558,7 @@ static void *acpi_add_rom_blob(AcpiBuildState *build_state, GArray *blob, const char *name) { return rom_add_blob(name, blob-data, acpi_data_len(blob), -1, name, -acpi_build_update, build_state); +acpi_build_update, build_state, true); } static const VMStateDescription vmstate_acpi_build = { diff --git a/include/hw/loader.h b/include/hw/loader.h index 796cbf9..e5a24ac 100644 --- a/include/hw/loader.h +++ b/include/hw/loader.h @@ -57,7 +57,8 @@ int rom_add_file(const char *file, const char *fw_dir, bool option_rom); void *rom_add_blob(const char *name, const void *blob, size_t len, hwaddr addr, const char *fw_file_name, - FWCfgReadCallback fw_callback, void *callback_opaque); + FWCfgReadCallback fw_callback, void *callback_opaque, + bool extendable); int rom_add_elf_program(const char *name, void *data, size_t datasize, size_t romsize, hwaddr addr); int rom_load_all(void); @@ -70,7 +71,7 @@ void do_info_roms(Monitor *mon, const QDict *qdict); #define rom_add_file_fixed(_f, _a, _i) \ rom_add_file(_f, NULL, _a, _i, false) #define rom_add_blob_fixed(_f, _b, _l, _a) \ -rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL) +rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL, false) #define PC_ROM_MIN_VGA 0xc #define PC_ROM_MIN_OPTION 0xc8000 -- 1.8.3.1
Re: [Qemu-devel] [PATCH 4/7] libqos: Correct mask to align size to PAGE_SIZE in malloc-pc
On Thu, Jul 24, 2014 at 08:31:02PM +0200, Marc Marí wrote: Signed-off-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Marc Marí marc.mari.barc...@gmail.com --- tests/libqos/malloc-pc.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Stefan Hajnoczi stefa...@redhat.com pgpfJeOahV0Cb.pgp Description: PGP signature
[Qemu-devel] [PATCH for-2.1 1/2] migration: load smaller RAMBlock to a bigger one if permitted
Add API to mark memory region as extend-able on migration, to allow migration code to load smaller RAMBlock into a bigger one on destination QEMU instance. This will allow to fix broken migration from QEMU 1.7/2.0 to QEMU 2.1 due to ACPI tables size changes across 1.7/2.0/2.1 versions by marking ACPI tables ROM blob as extend-able. So that smaller tables from previos version could be always migrated to a bigger rom blob on new version. Credits-for-idea: Michael S. Tsirkin m...@redhat.com Signed-off-by: Igor Mammedov imamm...@redhat.com --- arch_init.c | 19 ++- exec.c | 15 +-- include/exec/cpu-all.h | 15 ++- include/exec/memory.h | 10 ++ include/exec/ram_addr.h | 1 + memory.c| 5 + 6 files changed, 53 insertions(+), 12 deletions(-) diff --git a/arch_init.c b/arch_init.c index 8ddaf35..812f8b5 100644 --- a/arch_init.c +++ b/arch_init.c @@ -1071,11 +1071,20 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) QTAILQ_FOREACH(block, ram_list.blocks, next) { if (!strncmp(id, block-idstr, sizeof(id))) { -if (block-length != length) { -error_report(Length mismatch: %s: RAM_ADDR_FMT - in != RAM_ADDR_FMT, id, length, - block-length); -ret = -EINVAL; +if (block-flags RAM_EXTENDABLE_ON_MIGRATE) { +if (block-length length) { +error_report(Length too big: %s: RAM_ADDR_FMT +RAM_ADDR_FMT, id, length, + block-length); +ret = -EINVAL; +} +} else { +if (block-length != length) { +error_report(Length mismatch: %s: RAM_ADDR_FMT + in != RAM_ADDR_FMT, id, length, + block-length); +ret = -EINVAL; +} } break; } diff --git a/exec.c b/exec.c index 765bd94..755b1d3 100644 --- a/exec.c +++ b/exec.c @@ -69,12 +69,6 @@ AddressSpace address_space_memory; MemoryRegion io_mem_rom, io_mem_notdirty; static MemoryRegion io_mem_unassigned; -/* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */ -#define RAM_PREALLOC (1 0) - -/* RAM is mmap-ed with MAP_SHARED */ -#define RAM_SHARED (1 1) - #endif struct CPUTailQ cpus = QTAILQ_HEAD_INITIALIZER(cpus); @@ -1214,6 +1208,15 @@ void qemu_ram_unset_idstr(ram_addr_t addr) } } +void qemu_ram_set_extendable_on_migration(ram_addr_t addr) +{ +RAMBlock *block = find_ram_block(addr); + +if (block) { +block-flags |= RAM_EXTENDABLE_ON_MIGRATE; +} +} + static int memory_try_enable_merging(void *addr, size_t len) { if (!qemu_opt_get_bool(qemu_get_machine_opts(), mem-merge, true)) { diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h index f91581f..2b9aea3 100644 --- a/include/exec/cpu-all.h +++ b/include/exec/cpu-all.h @@ -296,13 +296,26 @@ CPUArchState *cpu_copy(CPUArchState *env); #if !defined(CONFIG_USER_ONLY) /* memory API */ +typedef enum { +/* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */ +RAM_PREALLOC = 1 0, +/* RAM is mmap-ed with MAP_SHARED */ +RAM_SHARED = 1 1, +/* + * Allow at migration time to load RAMBlock of smaller size than + * destination RAMBlock is + */ +RAM_EXTENDABLE_ON_MIGRATE = 1 2, +RAM_FLAGS_END = 1 31 +} RAMBlockFlags; + typedef struct RAMBlock { struct MemoryRegion *mr; uint8_t *host; ram_addr_t offset; ram_addr_t length; -uint32_t flags; +RAMBlockFlags flags; char idstr[256]; /* Reads can take either the iothread or the ramlist lock. * Writes must take both locks. diff --git a/include/exec/memory.h b/include/exec/memory.h index e2c8e3e..6c03f70 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -894,6 +894,16 @@ bool memory_region_present(MemoryRegion *container, hwaddr addr); bool memory_region_is_mapped(MemoryRegion *mr); /** + * memory_region_permit_extendable_migration: allows to mark + * #MemoryRegion as extendable on migrartion, which permits to + * load source memory block of smaller size than destination memory block + * at migration time + * + * @mr: a #MemoryRegion which should be marked as extendable on migration + */ +void memory_region_permit_extendable_migration(MemoryRegion *mr); + +/** * memory_region_find: translate an address/size relative to a * MemoryRegion into a #MemoryRegionSection. * diff
Re: [Qemu-devel] [PULL for-2.1 0/3] Last minute patches
On 25 July 2014 15:22, Paolo Bonzini pbonz...@redhat.com wrote: Since Igor hasn't sent his patches, and I'm leaving the office, I pushed this to git://github.com/bonzini/qemu.git tags/for-upstream-full I don't know about tests/acpi-test-data/pc. It makes sense that this patch should modify something there, but: * make check passes * the test warns even before patch 1, for both the DSDT (modified by the patch) and SSDT (which this series doesn't touch at all) * I cannot get it to pass, except by blindly copying the actual output on the expected files * mst is on vacation and Marcel is off on Fridays Based on my understanding of the problem, it is not possible to fix the bug without hacks like this one, and even reverting all patches in this area would be more risky. Hmm. I'm not really sure what the right thing is, so what I'm planning to do is: * just apply the qemu-char fix for now * not tag -rc4 today * see if things are clearer on Monday (I see Igor has now sent out a patchset) * tag -rc4 Mon or Tues * slip the release date a few days (not a big deal, I think) thanks -- PMM
Re: [Qemu-devel] [PATCH] [RFC] Add machine type pc-1.0-qemu-kvm for live migrate compatibility with qemu-kvm
ping: anyone got any views on this one? On 22 Jul 2014, at 19:43, Alex Bligh a...@alex.org.uk wrote: Add a machine type pc-1.0-qemu-kvm for live migrate compatibility with qemu-kvm version 1.0. Signed-off-by: Alex Bligh a...@alex.org.uk --- hw/acpi/piix4.c | 49 -- hw/i386/pc_piix.c| 31 + hw/timer/i8254_common.c | 41 ++ include/hw/acpi/piix4.h |1 + include/hw/timer/i8254.h |2 ++ 5 files changed, 122 insertions(+), 2 deletions(-) This RFC patch adds inbound migrate capability from qemu-kvm version 1.0. The main ideas are those set out in Cole Robinson's patch here: http://pkgs.fedoraproject.org/cgit/qemu.git/tree/0001-Fix-migration-from-qemu-kvm.patch?h=f20 however, rather than patching statically (and breaking inbound migration on existing machine types), I have added a new machine type (pc-1.0-qemu-kvm) without affecting any other machine types. This requires 'hot patching' the VMStateDescription in a couple of places, which in turn is less than obvious as there may be (indeed are for i8259) derived classes. Whilst pretty nausea-inducing, this approach has the benefit of being entirely self-contained. I developed this on qemu 2.0 but have forward ported it (trivially) to master. My testing has been on a VM live-migrated-to-file from Ubuntu Precise qemu-kvm 1.0. Testing has been light to date (i.e. can I migrate it inbound with -S without anything complaining). I have not (yet) brought forward the qxl rom size (and possibly video ram) changes in Cole's patches as I'd prefer an assessment of whether this is the right approach first. diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c index b72b34e..708db79 100644 --- a/hw/acpi/piix4.c +++ b/hw/acpi/piix4.c @@ -267,8 +267,9 @@ static const VMStateDescription vmstate_memhp_state = { }; /* qemu-kvm 1.2 uses version 3 but advertised as 2 - * To support incoming qemu-kvm 1.2 migration, change version_id - * and minimum_version_id to 2 below (which breaks migration from + * To support incoming qemu-kvm 1.2 migration, we support + * via a command line option a change to minimum_version_id + * of 2 in a _compat structure (which breaks migration from * qemu 1.2). * */ @@ -307,6 +308,34 @@ static const VMStateDescription vmstate_acpi = { } }; +static const VMStateDescription vmstate_acpi_compat = { +.name = piix4_pm, +.version_id = 3, +.minimum_version_id = 2, +.minimum_version_id_old = 1, +.load_state_old = acpi_load_old, +.post_load = vmstate_acpi_post_load, +.fields = (VMStateField[]) { +VMSTATE_PCI_DEVICE(parent_obj, PIIX4PMState), +VMSTATE_UINT16(ar.pm1.evt.sts, PIIX4PMState), +VMSTATE_UINT16(ar.pm1.evt.en, PIIX4PMState), +VMSTATE_UINT16(ar.pm1.cnt.cnt, PIIX4PMState), +VMSTATE_STRUCT(apm, PIIX4PMState, 0, vmstate_apm, APMState), +VMSTATE_TIMER(ar.tmr.timer, PIIX4PMState), +VMSTATE_INT64(ar.tmr.overflow_time, PIIX4PMState), +VMSTATE_STRUCT(ar.gpe, PIIX4PMState, 2, vmstate_gpe, ACPIGPE), +VMSTATE_STRUCT_TEST( +acpi_pci_hotplug.acpi_pcihp_pci_status[ACPI_PCIHP_BSEL_DEFAULT], +PIIX4PMState, +vmstate_test_no_use_acpi_pci_hotplug, +2, vmstate_pci_status, +struct AcpiPciHpPciStatus), +VMSTATE_PCI_HOTPLUG(acpi_pci_hotplug, PIIX4PMState, +vmstate_test_use_acpi_pci_hotplug), +VMSTATE_END_OF_LIST() +} +}; + static void piix4_reset(void *opaque) { PIIX4PMState *s = opaque; @@ -619,6 +648,22 @@ static void piix4_pm_class_init(ObjectClass *klass, void *data) adevc-ospm_status = piix4_ospm_status; } +void piix4_pm_class_fix_compat(void) +{ +GSList *el, *devices = object_class_get_list(TYPE_DEVICE, false); +/* + * Change the vmstate field in this class and any derived classes + * if not overriden. As no other classes should be using this + * vmstate, we can simply iterate the class list + */ +for (el = devices; el; el = el-next) { +DeviceClass *dc = el-data; +if (dc-vmsd == vmstate_acpi) { +dc-vmsd = vmstate_acpi_compat; +} +} +} + static const TypeInfo piix4_pm_info = { .name = TYPE_PIIX4_PM, .parent= TYPE_PCI_DEVICE, diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index 7081c08..e400ea6 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -49,6 +49,8 @@ #include hw/acpi/acpi.h #include cpu.h #include qemu/error-report.h +#include hw/acpi/piix4.h +#include hw/timer/i8254.h #ifdef CONFIG_XEN # include xen/hvm/hvm_info_table.h #endif @@ -386,6 +388,15 @@ static void pc_init_pci_1_2(MachineState *machine) pc_init_pci(machine); } +/* PC machine init function for qemu-kvm 1.0 */
Re: [Qemu-devel] [PATCH] hw/arm/boot: Set PC correctly when loading AArch64 ELF files
On 07/25/2014 11:23 AM, Peter Maydell wrote: The code in do_cpu_reset() correctly handled AArch64 CPUs when running Linux kernels, but was missing code in the branch of the if() that deals with loading ELF files. Correctly jump to the ELF entry point on reset rather than leaving the reset PC at zero. Thanks Peter! With this patch I can see the first few instructions being executed. Tested-by: Christopher Covington c...@codeaurora.org (The default Newlib/libgloss wants to touch EL3 registers that QEMU doesn't yet have, but I can probably make my test case work with -nostdlib.) Thanks, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Re: [Qemu-devel] [PATCH 7/7] libqos: Added basic virtqueue support to virtio implementation
On Thu, Jul 24, 2014 at 08:31:05PM +0200, Marc Marí wrote: +static void qvirtio_pci_set_queue_address(QVirtioDevice *d, uint16_t addr) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +qpci_io_writel(dev-pdev, dev-addr + QVIRTIO_QUEUE_ADDRESS, addr); +} Why is addr uint16_t? It should be a 32-bit Page Frame Number). It's probably clearer to name it pfn instead of addr since it's not an address. pgp1kUPsDwLVV.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH for-2.1 0/2] Fix migration failure due to ACPI tables size changes
On 07/25/14 17:48, Igor Mammedov wrote: Changing the ACPI table size causes migration to break, and the memory hotplug work opened our eyes on how horribly we were breaking things in 2.0 already. To trigger issue start QEMU-1.7 with -M pc-i440fx-1.7 -device pci-bridge,chassis_nr=1 and try to migrate to QEMU-2.1 or QEMU-2.0 as result target will fail with: qemu-system-x86_64: Length mismatch: /rom@etc/acpi/tables: 2000 in != 3000 This fix allows target QEMU to load smaller RAMBlock into a bigger one and fixes regression which was introduced since 2.0, allowing forward migration from 1.7/2.0 to 2.1 Fix is also suitable for stable-2.0. Igor Mammedov (2): migration: load smaller RAMBlock to a bigger one if permitted acpi: mark ACPI tables ROM blob as extend-able on migration arch_init.c | 19 ++- exec.c | 15 +-- hw/core/loader.c| 6 +- hw/i386/acpi-build.c| 2 +- include/exec/cpu-all.h | 15 ++- include/exec/memory.h | 10 ++ include/exec/ram_addr.h | 1 + include/hw/loader.h | 5 +++-- memory.c| 5 + 9 files changed, 62 insertions(+), 16 deletions(-) I can name things in favor of both approaches, Paolo's and yours. Paolo's approach keeps the hack where the problem was introduced, in the ACPI generator. The local nature of the migration-related pig sty is very attractive for me. As I said on IRC, if you fix the migration problem in the APCI generator by elegant compat flags, or such an emergency hack as Paolo's, that's an implementation detail; it's important that it stay local to the ACPI generator. Your patchset leaks the problem into RAMBlocks. In favor of your idea OTOH, as you explained, it would work for all possible ACPI configurations, and not break for a random few ones like Paolo's approach (talking about patch #2 of that set, the procedural _PRT is great in any case). You could argue that by customizing the RAMBlocks as extensible (for ACPI / fw_cfg blob backing only) the leakage is contained. I don't know how to decide between these two approaches. The optimal one would be to reimplement the 2.0 -- 2.1 feature additions from scratch, introducing compat flags afresh. Won't happen I guess. In any case I'll try to review this set for the technical things. Thanks Laszlo
Re: [Qemu-devel] [PULL v2 0/1] Fix for -serial pty regression
On 25 July 2014 13:38, Paolo Bonzini pbonz...@redhat.com wrote: The following changes since commit f368c33d5ab09dd5656924185cd975b11838cd25: Update version for v2.1.0-rc3 release (2014-07-22 18:17:03 +0100) are available in the git repository at: git://github.com/bonzini/qemu.git tags/for-upstream for you to fetch changes up to 62c339c5272ce8fbe8ca52695cee8ff40da7872e: qemu-char: ignore flow control if a PTY's slave is not connected (2014-07-25 14:36:07 +0200) Here is the serial fix for 2.1. Paolo Bonzini (1): qemu-char: ignore flow control if a PTY's slave is not connected Applied this pullreq rather than the other (see comments on that one). thanks -- PMM
[Qemu-devel] [Bug 1348719] [NEW] arm64: -smp 2 hangs qemu
Public bug reported: It appears that smp is broken on qemu for arm64. I'm looking into the root cause but am curious if others can reproduce in their environments. Tested with commit f368c33d5ab09dd5656924185cd975b11838cd25 (July 22) from https://github.com/qemu/qemu.git [root@joelaarch64 ~]# /usr/local/bin/qemu-system-aarch64 --version QEMU emulator version 2.0.93, Copyright (c) 2003-2008 Fabrice Bellard works fine: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 1 -append console=ttyAMA0 console=ttyS0 root=/dev/vda2 hangs: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 2 -append console=ttyAMA0 console=ttyS0 root=/dev/vda2 (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) bt #0 0x03ffb6e50330 in ppoll () from /lib64/libc.so.6 #1 0x006631a0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=optimized out, __fds=optimized out) at /usr/include/bits/poll2.h:77 #2 qemu_poll_ns (fds=optimized out, nfds=optimized out, timeout=optimized out) at qemu-timer.c:314 #3 0x00662878 in os_host_main_loop_wait (timeout=optimized out) at main-loop.c:229 #4 main_loop_wait (nonblocking=optimized out) at main-loop.c:484 #5 0x0040fdf4 in main_loop () at vl.c:2010 #6 main (argc=optimized out, argv=optimized out, envp=optimized out) at vl.c:4541 (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) t 2 [Switching to thread 2 (Thread 0x3ffb64beef0 (LWP 7622))] #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 (gdb) bt #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 #1 0x0069d78c in sigwait_compat (opaque=0xd752c0) at util/compatfd.c:36 #2 0x03ffb6f07c20 in start_thread () from /lib64/libpthread.so.0 #3 0x03ffb6e5a80c in clone () from /lib64/libc.so.6 ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1348719 Title: arm64: -smp 2 hangs qemu Status in QEMU: New Bug description: It appears that smp is broken on qemu for arm64. I'm looking into the root cause but am curious if others can reproduce in their environments. Tested with commit f368c33d5ab09dd5656924185cd975b11838cd25 (July 22) from https://github.com/qemu/qemu.git [root@joelaarch64 ~]# /usr/local/bin/qemu-system-aarch64 --version QEMU emulator version 2.0.93, Copyright (c) 2003-2008 Fabrice Bellard works fine: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 1 -append console=ttyAMA0 console=ttyS0 root=/dev/vda2 hangs: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 2 -append console=ttyAMA0 console=ttyS0 root=/dev/vda2 (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) bt #0 0x03ffb6e50330 in ppoll () from /lib64/libc.so.6 #1 0x006631a0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=optimized out, __fds=optimized out) at /usr/include/bits/poll2.h:77 #2 qemu_poll_ns (fds=optimized out, nfds=optimized out, timeout=optimized out) at qemu-timer.c:314 #3 0x00662878 in os_host_main_loop_wait (timeout=optimized out) at main-loop.c:229 #4 main_loop_wait (nonblocking=optimized out) at main-loop.c:484 #5 0x0040fdf4 in main_loop () at vl.c:2010 #6 main (argc=optimized out, argv=optimized out, envp=optimized out) at vl.c:4541 (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) t 2 [Switching to thread 2 (Thread 0x3ffb64beef0 (LWP 7622))] #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 (gdb) bt #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 #1 0x0069d78c in sigwait_compat (opaque=0xd752c0) at util/compatfd.c:36 #2 0x03ffb6f07c20 in start_thread () from /lib64/libpthread.so.0 #3 0x03ffb6e5a80c in clone () from /lib64/libc.so.6 To manage notifications about this bug go to:
[Qemu-devel] [Bug 1348719] Re: arm64: -smp 2 hangs qemu
** Changed in: qemu Assignee: (unassigned) = Joel Schopp (joel-schopp) -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1348719 Title: arm64: -smp 2 hangs qemu Status in QEMU: New Bug description: It appears that smp is broken on qemu for arm64. I'm looking into the root cause but am curious if others can reproduce in their environments. Tested with commit f368c33d5ab09dd5656924185cd975b11838cd25 (July 22) from https://github.com/qemu/qemu.git [root@joelaarch64 ~]# /usr/local/bin/qemu-system-aarch64 --version QEMU emulator version 2.0.93, Copyright (c) 2003-2008 Fabrice Bellard works fine: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 1 -append console=ttyAMA0 console=ttyS0 root=/dev/vda2 hangs: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 2 -append console=ttyAMA0 console=ttyS0 root=/dev/vda2 (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) bt #0 0x03ffb6e50330 in ppoll () from /lib64/libc.so.6 #1 0x006631a0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=optimized out, __fds=optimized out) at /usr/include/bits/poll2.h:77 #2 qemu_poll_ns (fds=optimized out, nfds=optimized out, timeout=optimized out) at qemu-timer.c:314 #3 0x00662878 in os_host_main_loop_wait (timeout=optimized out) at main-loop.c:229 #4 main_loop_wait (nonblocking=optimized out) at main-loop.c:484 #5 0x0040fdf4 in main_loop () at vl.c:2010 #6 main (argc=optimized out, argv=optimized out, envp=optimized out) at vl.c:4541 (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) t 2 [Switching to thread 2 (Thread 0x3ffb64beef0 (LWP 7622))] #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 (gdb) bt #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 #1 0x0069d78c in sigwait_compat (opaque=0xd752c0) at util/compatfd.c:36 #2 0x03ffb6f07c20 in start_thread () from /lib64/libpthread.so.0 #3 0x03ffb6e5a80c in clone () from /lib64/libc.so.6 To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1348719/+subscriptions
Re: [Qemu-devel] [Intel-gfx] ResettRe: [Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support
On Thu, Jul 24, 2014 at 09:44:41AM +0800, Chen, Tiejun wrote: On 2014/7/24 4:54, Konrad Rzeszutek Wilk wrote: On Sat, Jul 19, 2014 at 12:27:21AM +, Kay, Allen M wrote: For the MCH PCI registers that do need to be read - can you tell us which ones those are? In qemu/hw/xen_pt_igd.c/igd_pci_read(), following MCH PCI config register reads are passthrough to the host HW. Some of the registers are needed by Ironlake GFX driver which we probably can remove. I did a trace recently on Broadwell, the number of register accessed are even smaller (0, 2, 2c, 2e, 50, 52, a0, a4). Given that we now have integrated MCH and GPU in the same socket, looks like driver can easily remove reads for offsets 0 - 0x2e. case 0x00:/* vendor id */ case 0x02:/* device id */ case 0x08:/* revision id */ case 0x2c:/* sybsystem vendor id */ case 0x2e:/* sybsystem id */ Right. We can fix the i915 to use the mechanism that Michael mentioned. (see attached RFC patches) case 0x50:/* SNB: processor graphics control register */ case 0x52:/* processor graphics control register */ case 0xa0:/* top of memory */ case 0xb0:/* ILK: BSM: should read from dev 2 offset 0x5c */ case 0x58:/* SNB: PAVPC Offset */ case 0xa4:/* SNB: graphics base of stolen memory */ case 0xa8:/* SNB: base of GTT stolen memory */ I dug in the intel-gtt.c (see ironlake_gtt_driver) to figure out this a bit more. As in, I speculated, what if we returned 0 (and not implement any support for reading from these registers). What would happen? 0x52 for Ironlake (g5): -- It looks like intel_gmch_probe is called when i915_gem_gtt_init starts (there is a lot of code that looks to be used between intel-gtt.c and i915.c). Anyhow the interesting parts are that i9xx_setup ends up calling ioremap the i915 BAR to setup some of these registers for older generations. Then i965_gtt_total_entries gets which reads 0x52, but it is only needed for v5 generation. For other (v4 and G33) it reads it from the GPU's 0x2020 register. If there is a mismatch, it writes to the GPU at 0x2020 to update the the size based on the bridge. And then it reads from 0x2020 and that is returned and stuck in intel_private.gtt_total_entries. So 0x52 in the emulated bridge could be populated with what the GPU has at 0x2020. And the writes go in the emulated area. 0x52 for v6 - v8: - We seem to go to gen6_gmch_probe which just figures out the the GTT based on the GPU's BAR sizes. The stolen values are read from 0x50 from the GPU. So no need to touch the bridge (see gen6_gmch_probe) OK, so no need to have 0x52 or 0x50 in the bridge. 0xA0: - Could not find any reference in the Linux code. Why would Windows driver need this? If we returned the _real_ TOM would it matter? Is it used to figure out the device should use 32-bit DMA operations or 40-bit? 0xb0 or 0x5c: - No mention of them in the Linux code. 0x58, 0xa4, 0xa8: - No usage of them in the Linux code. We seem to be using the 0x52 from the bridge and 0x2020 from the GPU for this functionality. So in theory*, if using Ironlake we need to have a proper value in 0x52 in the bridge. But for the later chipsets we do not need these values (I am assuming that intel_setup_mchbar can safely return as it does that for Ironlake and could very well for later generations). Can this be reflected in the Windows driver as well? P.S. *theory: That is assuming we modify the Linux i915_drv.c:intel_detect_pch to pick up the id as suggested earlier. See the RFC patches attached. (Not compile tested at all!) I take a look these patches, looks we still scan all PCI Bridge to walk all PCHs. So this means we still need to fake a PCI bridge, right? Or maybe you don't cover this problem this time. I totally forgot. Feel free to fix that. I prefer we should check dev slot to get that PCH like my previous patch, Uh? Your patch was checking for 0:1f.0, not the slot of the device. (see https://lkml.org/lkml/2014/6/19/121) gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type. Because Windows always use this way, so I think this point should be same between Linux and Windows. Didn't we discuss that we did not want to pass in PCH at all if we can? And from the observation I made above it seems that we can safely do it under Linux. With Windows I don't know - but I presume the answer is yes too. Or we need anther better way to unify all OSs. Yes. The observation is that a lot of these registers are aliased in the GPU. As such we can skip some of this bridge poking. Hm. I should have gotten further and also done this on
Re: [Qemu-devel] [PATCH 2/7] tests: Add virtio device initialization
El Fri, 25 Jul 2014 16:19:41 +0100 Stefan Hajnoczi stefa...@redhat.com escribió: On Thu, Jul 24, 2014 at 08:31:00PM +0200, Marc Marí wrote: +static uint32_t qvirtio_pci_get_features(QVirtioDevice *d) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readl(dev-pdev, dev-addr + QVIRTIO_DEVICE_FEATURES); +} Unused? If it's unused, then it's untested. Yes, moved to the other patch for v2 + +static uint8_t qvirtio_pci_get_status(QVirtioDevice *d) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readb(dev-pdev, dev-addr + QVIRTIO_DEVICE_STATUS); +} Unused? Used in virtio.c (qvirtio_reset / qvirtio_set_acknowledge / qvirtio_set_driver). + +static void qvirtio_pci_set_status(QVirtioDevice *d, uint8_t val) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +qpci_io_writeb(dev-pdev, dev-addr + QVIRTIO_DEVICE_STATUS, val); Unused? Also in virtio.c @@ -73,3 +97,11 @@ QVirtioPCIDevice *qvirtio_pci_device_find(QPCIBus *bus, uint16_t device_type) return dev; } + +void qvirtio_pci_enable_device(QVirtioPCIDevice *d) +{ +qpci_device_enable(d-pdev); +d-addr = qpci_iomap(d-pdev, 0); +g_assert(d-addr != NULL); +} Where is qpci_iounmap() called to clean up? Missed. Also, it is unimplemented. Marc
Re: [Qemu-devel] [PATCH for-2.1 2/2] acpi: mark ACPI tables ROM blob as extend-able on migration
Il 25/07/2014 17:48, Igor Mammedov ha scritto: It fixes migration failure for machine type pc-i440fx-1.7 from QEMU 1.7/2.0 to QEMU 2.1 Migration fails due to ACPI tables size grows across 1.7-2.1 versions. That causes ACPI tables ROM blob to change its size differently for the same configurations on different QEMU versions. As result migration code bails out when attempting to load smaller ROM blob into a bigger one on a newer QEMU. To trigger failure it's enough to add pci-bridge device to QEMU CLI Marking ACPI tables ROM blob as extend-able on migration allows QEMU to load smaller ACPI tables from QEMU 1.7/2.0, fixing forward migration failure introduced since 2.0 which affects only configurations that cause ACPI ROM blob size change. This works in this case, and it is more friendly to downstreams indeed. It also is simpler, at least on the surface. I think the ramifications could be a bit larger than with my own patches, but still I guess it's more appropriate at this point of the release cycle. It doesn't handle the case of ACPI tables that shrink, which can happen as well. I guess if this ever happens we can just hard-code the table size of the old versions to something big enough (64K?) and keep using fine-grained sizing. I'd like a day or two to mull about it, but I have it even if the patches are applied. Peter, feel free to go ahead with Igor's patches. Paolo
Re: [Qemu-devel] [PATCH for-2.1 1/2] migration: load smaller RAMBlock to a bigger one if permitted
On 07/25/14 17:48, Igor Mammedov wrote: Add API to mark memory region as extend-able on migration, to allow migration code to load smaller RAMBlock into a bigger one on destination QEMU instance. This will allow to fix broken migration from QEMU 1.7/2.0 to QEMU 2.1 due to ACPI tables size changes across 1.7/2.0/2.1 versions by marking ACPI tables ROM blob as extend-able. So that smaller tables from previos version could be always (previous) migrated to a bigger rom blob on new version. Credits-for-idea: Michael S. Tsirkin m...@redhat.com Signed-off-by: Igor Mammedov imamm...@redhat.com --- arch_init.c | 19 ++- exec.c | 15 +-- include/exec/cpu-all.h | 15 ++- include/exec/memory.h | 10 ++ include/exec/ram_addr.h | 1 + memory.c| 5 + 6 files changed, 53 insertions(+), 12 deletions(-) (Please pass the -O flag to git-format-patch, with an order file that moves header files (*.h) to the front. Header hunks should lead in a patch. Thanks.) diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index 6593be1..248c075 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -33,6 +33,7 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr); void *qemu_get_ram_ptr(ram_addr_t addr); void qemu_ram_free(ram_addr_t addr); void qemu_ram_free_from_ptr(ram_addr_t addr); +void qemu_ram_set_extendable_on_migration(ram_addr_t addr); static inline bool cpu_physical_memory_get_dirty(ram_addr_t start, ram_addr_t length, (Ugh. The declarations (prototypes) of qemu_ram_*() functions are scattered between ram_addr.h and cpu-common.h (both under include/exec). I can't see why that is a good thing (the function definitions all seem to be in exec.c).) diff --git a/exec.c b/exec.c index 765bd94..755b1d3 100644 --- a/exec.c +++ b/exec.c @@ -69,12 +69,6 @@ AddressSpace address_space_memory; MemoryRegion io_mem_rom, io_mem_notdirty; static MemoryRegion io_mem_unassigned; -/* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */ -#define RAM_PREALLOC (1 0) - -/* RAM is mmap-ed with MAP_SHARED */ -#define RAM_SHARED (1 1) - #endif I'm not sure these macros should be replaced with an enum; the new flag could be introduced just following the existent pattern. struct CPUTailQ cpus = QTAILQ_HEAD_INITIALIZER(cpus); @@ -1214,6 +1208,15 @@ void qemu_ram_unset_idstr(ram_addr_t addr) } } +void qemu_ram_set_extendable_on_migration(ram_addr_t addr) +{ +RAMBlock *block = find_ram_block(addr); + +if (block) { +block-flags |= RAM_EXTENDABLE_ON_MIGRATE; +} +} + Let's see how some oher qemu_ram_*() functions search for their blocks. We can form two classes; the first class takes a ram_addr_t into some RAMBlock, the second class only accepts/finds a ram_addr_t only at the beginning of some RAMBlock. (1) containment: qemu_get_ram_fd(): calls qemu_get_ram_block() qemu_get_ram_block_host_ptr: calls qemu_get_ram_block() qemu_get_ram_ptr(): calls qemu_get_ram_block() qemu_ram_remap():needs containment (2) exact block start match: qemu_ram_free(): needs (addr == block-offset) qemu_ram_free_from_ptr():needs (addr == block-offset) qemu_ram_set_idstr():calls find_ram_block() qemu_ram_unset_idstr(): calls find_ram_block() Your function will belong to the 2nd class. The definition is close to that of qemu_ram_unset_idstr(), another class member, OK. The declaration is close to qemu_ram_free_from_ptr(), which is another class member. OK. I'd throw in an assert(), rather than an if, just like qemu_ram_set_idstr() does. static int memory_try_enable_merging(void *addr, size_t len) { if (!qemu_opt_get_bool(qemu_get_machine_opts(), mem-merge, true)) { Let's see the caller: diff --git a/memory.c b/memory.c index 64d7176..744c746 100644 --- a/memory.c +++ b/memory.c @@ -1791,6 +1791,11 @@ bool memory_region_is_mapped(MemoryRegion *mr) return mr-container ? true : false; } +void memory_region_permit_extendable_migration(MemoryRegion *mr) +{ +qemu_ram_set_extendable_on_migration(mr-ram_addr); +} + MemoryRegionSection memory_region_find(MemoryRegion *mr, hwaddr addr, uint64_t size) { Looks matching, and consistent with other callers of the 2nd family functions. OK. diff --git a/include/exec/memory.h b/include/exec/memory.h index e2c8e3e..6c03f70 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -894,6 +894,16 @@ bool memory_region_present(MemoryRegion *container, hwaddr addr); bool memory_region_is_mapped(MemoryRegion *mr); /** + * memory_region_permit_extendable_migration: allows to mark Please refer to commit 9d85d557326df69fe0570e7de84b2f57e133c7e7 Author: Michael Tokarev
Re: [Qemu-devel] [PATCH for-2.1 2/2] acpi: mark ACPI tables ROM blob as extend-able on migration
On 07/25/14 17:48, Igor Mammedov wrote: It fixes migration failure for machine type pc-i440fx-1.7 from QEMU 1.7/2.0 to QEMU 2.1 Migration fails due to ACPI tables size grows across 1.7-2.1 versions. That causes ACPI tables ROM blob to change its size differently for the same configurations on different QEMU versions. As result migration code bails out when attempting to load smaller ROM blob into a bigger one on a newer QEMU. To trigger failure it's enough to add pci-bridge device to QEMU CLI Marking ACPI tables ROM blob as extend-able on migration allows QEMU to load smaller ACPI tables from QEMU 1.7/2.0, fixing forward migration failure introduced since 2.0 which affects only configurations that cause ACPI ROM blob size change. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/core/loader.c | 6 +- hw/i386/acpi-build.c | 2 +- include/hw/loader.h | 5 +++-- 3 files changed, 9 insertions(+), 4 deletions(-) diff --git a/hw/core/loader.c b/hw/core/loader.c index 2bf6b8f..d09b894 100644 --- a/hw/core/loader.c +++ b/hw/core/loader.c @@ -722,7 +722,8 @@ err: void *rom_add_blob(const char *name, const void *blob, size_t len, hwaddr addr, const char *fw_file_name, - FWCfgReadCallback fw_callback, void *callback_opaque) + FWCfgReadCallback fw_callback, void *callback_opaque, + bool extendable) { Rom *rom; void *data = NULL; @@ -742,6 +743,9 @@ void *rom_add_blob(const char *name, const void *blob, size_t len, if (rom_file_has_mr) { data = rom_set_mr(rom, OBJECT(fw_cfg), devpath); +if (extendable) { +memory_region_permit_extendable_migration(rom-mr); +} } else { data = rom-data; } diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index ebc5f03..651c06b 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1558,7 +1558,7 @@ static void *acpi_add_rom_blob(AcpiBuildState *build_state, GArray *blob, const char *name) { return rom_add_blob(name, blob-data, acpi_data_len(blob), -1, name, -acpi_build_update, build_state); +acpi_build_update, build_state, true); } static const VMStateDescription vmstate_acpi_build = { diff --git a/include/hw/loader.h b/include/hw/loader.h index 796cbf9..e5a24ac 100644 --- a/include/hw/loader.h +++ b/include/hw/loader.h @@ -57,7 +57,8 @@ int rom_add_file(const char *file, const char *fw_dir, bool option_rom); void *rom_add_blob(const char *name, const void *blob, size_t len, hwaddr addr, const char *fw_file_name, - FWCfgReadCallback fw_callback, void *callback_opaque); + FWCfgReadCallback fw_callback, void *callback_opaque, + bool extendable); int rom_add_elf_program(const char *name, void *data, size_t datasize, size_t romsize, hwaddr addr); int rom_load_all(void); @@ -70,7 +71,7 @@ void do_info_roms(Monitor *mon, const QDict *qdict); #define rom_add_file_fixed(_f, _a, _i) \ rom_add_file(_f, NULL, _a, _i, false) #define rom_add_blob_fixed(_f, _b, _l, _a) \ -rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL) +rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL, false) #define PC_ROM_MIN_VGA 0xc #define PC_ROM_MIN_OPTION 0xc8000 Reviewed-by: Laszlo Ersek ler...@redhat.com
[Qemu-devel] [PATCH 2/8] qemu-img: Add progress output for amend
Now that bdrv_amend_options() supports a status callback, use it to display a progress report. Signed-off-by: Max Reitz mre...@redhat.com --- qemu-img.c | 26 +++--- 1 file changed, 23 insertions(+), 3 deletions(-) diff --git a/qemu-img.c b/qemu-img.c index 90d6b79..a06f425 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -2732,6 +2732,13 @@ out: return 0; } +static void amend_status_cb(BlockDriverState *bs, +int64_t offset, int64_t total_work_size) +{ +(void)bs; +qemu_progress_print(100.f * offset / total_work_size, 0); +} + static int img_amend(int argc, char **argv) { int c, ret = 0; @@ -2740,12 +2747,12 @@ static int img_amend(int argc, char **argv) QemuOpts *opts = NULL; const char *fmt = NULL, *filename, *cache; int flags; -bool quiet = false; +bool quiet = false, progress = false; BlockDriverState *bs = NULL; cache = BDRV_DEFAULT_CACHE; for (;;) { -c = getopt(argc, argv, ho:f:t:q); +c = getopt(argc, argv, ho:f:t:pq); if (c == -1) { break; } @@ -2775,6 +2782,9 @@ static int img_amend(int argc, char **argv) case 't': cache = optarg; break; +case 'p': +progress = true; +break; case 'q': quiet = true; break; @@ -2785,6 +2795,11 @@ static int img_amend(int argc, char **argv) error_exit(Must specify options (-o)); } +if (quiet) { +progress = false; +} +qemu_progress_init(progress, 1.0); + filename = (optind == argc - 1) ? argv[argc - 1] : NULL; if (fmt has_help_option(options)) { /* If a format is explicitly specified (and possibly no filename is @@ -2827,13 +2842,18 @@ static int img_amend(int argc, char **argv) goto out; } -ret = bdrv_amend_options(bs, opts, NULL); +/* In case the driver does not call amend_status_cb() */ +qemu_progress_print(0.f, 0); +ret = bdrv_amend_options(bs, opts, amend_status_cb); +qemu_progress_print(100.f, 0); if (ret 0) { error_report(Error while amending options: %s, strerror(-ret)); goto out; } out: +qemu_progress_end(); + if (bs) { bdrv_unref(bs); } -- 2.0.1
[Qemu-devel] [PATCH 0/8] block/qcow2: Improve (?) zero cluster expansion
The main purpose of this series is to add a progress report to qemu-img amend. This is achieved by adding a callback function to bdrv_amend_options() - the reasons for this choice are explained in patch 1. While adapting qcow2's expand_zero_clusters_in_l1() accordingly, I noticed a way to simplify it and get rid of the rather ugly bitmap used there (patch 6) and even found a way to optimize it (patch 7). However, Kevin already expressed strong dislike about that patch 7, as it complicates things in a function which should rarely ever be used. I personally don't have a strong opinion on the topic. The optimization should significantly increase the expansion performance; on the other hand, it makes the code equally more complicated. Patch 8 does not really depend on patch 7; it contains a test case specifically for patch 7, but of course it works without that patch just as well. Therefore, we can simply drop patch 7 if we don't need it. In this version, the total size of the zero cluster expansion job is the number of zero cluster entries in all L2 tables multiplied by the cluster size. This of course requires that number to be calculated beforehand, which is complicated as well. An easier way (suggested by Kevin) would be to simply use the number of all L2 entries as a basis. This would not be as exact, but much easier to implement. Since I already implemented the more complicated version of everything, I'm just sending it as-is. I'll write an alternative to patch 5 which uses the simpler variant and send it for comparison later. This series depends on v2 of my qemu-img: Allow source cache mode specification seires. Max Reitz (8): block: Add status callback to bdrv_amend_options() qemu-img: Add progress output for amend qemu-img: Fix insignifcant memleak block/qcow2: Make get_refcount() global block/qcow2: Implement status CB for amend block/qcow2: Simplify shared L2 handling in amend block/qcow2: Speed up zero cluster expansion iotests: Expand test 061 block.c| 5 +- block/qcow2-cluster.c | 319 ++--- block/qcow2-refcount.c | 23 ++-- block/qcow2.c | 10 +- block/qcow2.h | 5 +- include/block/block.h | 5 +- include/block/block_int.h | 3 +- qemu-img.c | 30 - tests/qemu-iotests/061 | 41 ++ tests/qemu-iotests/061.out | 40 ++ tests/qemu-iotests/group | 2 +- 11 files changed, 379 insertions(+), 104 deletions(-) -- 2.0.1
[Qemu-devel] [PATCH 5/8] block/qcow2: Implement status CB for amend
The only really time-consuming operation potentially performed by qcow2_amend_options() is zero cluster expansion when downgrading qcow2 images from compat=1.1 to compat=0.10, so report status of that operation and that operation only through the status CB. For this, count the number of L2 zero entries, use this as the basis for the total amend job length and increase the current offset by the cluster size multiplied by the refcount of the L2 table when expanding zero clusters. Signed-off-by: Max Reitz mre...@redhat.com --- block/qcow2-cluster.c | 145 -- block/qcow2.c | 9 ++-- block/qcow2.h | 3 +- 3 files changed, 147 insertions(+), 10 deletions(-) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 4208dc0..0f52ef6 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -1548,10 +1548,20 @@ fail: * zero expansion (i.e., has been filled with zeroes and is referenced from an * L2 table). nb_clusters contains the total cluster count of the image file, * i.e., the number of bits in expanded_clusters. + * + * zero_clusters_count and *expanded_count are used to keep track of progress + * for status_cb(). zero_clusters_count contains the total number of L2 zero + * cluster entries. Accordingly, *expanded_count counts all visited L2 zero + * cluster entries; shared L2 tables are counted accordingly, that is, for each + * L2 zero cluster entry the count is increased by the refcount of the L2 table + * cluster. */ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, int l1_size, uint8_t **expanded_clusters, - uint64_t *nb_clusters) + uint64_t *nb_clusters, + int64_t *expanded_count, + int64_t zero_clusters_count, + BlockDriverAmendStatusCB *status_cb) { BDRVQcowState *s = bs-opaque; bool is_active_l1 = (l1_table == s-l1_table); @@ -1568,6 +1578,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, for (i = 0; i l1_size; i++) { uint64_t l2_offset = l1_table[i] L1E_OFFSET_MASK; bool l2_dirty = false; +int l2_refcount; if (!l2_offset) { /* unallocated */ @@ -1587,6 +1598,12 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, goto fail; } +l2_refcount = qcow2_get_refcount(bs, l2_offset s-cluster_bits); +if (l2_refcount 0) { +ret = l2_refcount; +goto fail; +} + for (j = 0; j s-l2_size; j++) { uint64_t l2_entry = be64_to_cpu(l2_table[j]); int64_t offset = l2_entry L2E_OFFSET_MASK, cluster_index; @@ -1617,6 +1634,8 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, continue; } +*expanded_count += l2_refcount; + if (!preallocated) { if (!bs-backing_hd) { /* not backed; therefore we can simply deallocate the @@ -1703,6 +1722,11 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, } } } + +if (status_cb) { +status_cb(bs, *expanded_count s-cluster_bits, + zero_clusters_count s-cluster_bits); +} } ret = 0; @@ -1724,26 +1748,137 @@ fail: } /* + * Counts all zero clusters in a specific L1 table. + */ +static int64_t count_zero_clusters_in_l1(BlockDriverState *bs, + uint64_t *l1_table, int l1_size) +{ +BDRVQcowState *s = bs-opaque; +bool is_active_l1 = (l1_table == s-l1_table); +uint64_t *l2_table = NULL; +int64_t count = 0; +int ret; +int i, j; + +if (!is_active_l1) { +l2_table = qemu_blockalign(bs, s-cluster_size); +} + +for (i = 0; i l1_size; i++) { +uint64_t l2_offset = l1_table[i] L1E_OFFSET_MASK; + +if (!l2_offset) { +continue; +} + +if (is_active_l1) { +ret = qcow2_cache_get(bs, s-l2_table_cache, l2_offset, + (void **)l2_table); +} else { +ret = bdrv_read(bs-file, l2_offset / BDRV_SECTOR_SIZE, +(void *)l2_table, s-cluster_sectors); +} +if (ret 0) { +goto fail; +} + +for (j = 0; j s-l2_size; j++) { +if (qcow2_get_cluster_type(be64_to_cpu(l2_table[j])) +== QCOW2_CLUSTER_ZERO) +{ +count++; +} +} + +if (is_active_l1) { +ret = qcow2_cache_put(bs, s-l2_table_cache, (void **)l2_table); +l2_table = NULL; +
[Qemu-devel] [PATCH 4/8] block/qcow2: Make get_refcount() global
Reading the refcount of a cluster is an operation which can be useful in all of the qcow2 code, so make that function globally available. Signed-off-by: Max Reitz mre...@redhat.com --- block/qcow2-refcount.c | 23 --- block/qcow2.h | 2 ++ 2 files changed, 14 insertions(+), 11 deletions(-) diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index cc6cf74..0c9887b 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -87,7 +87,7 @@ static int load_refcount_block(BlockDriverState *bs, * return value is the refcount of the cluster, negative values are -errno * and indicate an error. */ -static int get_refcount(BlockDriverState *bs, int64_t cluster_index) +int qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index) { BDRVQcowState *s = bs-opaque; uint64_t refcount_table_index, block_index; @@ -625,7 +625,7 @@ int qcow2_update_cluster_refcount(BlockDriverState *bs, return ret; } -return get_refcount(bs, cluster_index); +return qcow2_get_refcount(bs, cluster_index); } @@ -646,7 +646,7 @@ static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size) retry: for(i = 0; i nb_clusters; i++) { uint64_t next_cluster_index = s-free_cluster_index++; -refcount = get_refcount(bs, next_cluster_index); +refcount = qcow2_get_refcount(bs, next_cluster_index); if (refcount 0) { return refcount; @@ -710,7 +710,7 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset, /* Check how many clusters there are free */ cluster_index = offset s-cluster_bits; for(i = 0; i nb_clusters; i++) { -refcount = get_refcount(bs, cluster_index++); +refcount = qcow2_get_refcount(bs, cluster_index++); if (refcount 0) { return refcount; @@ -927,7 +927,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, cluster_index, addend, QCOW2_DISCARD_SNAPSHOT); } else { -refcount = get_refcount(bs, cluster_index); +refcount = qcow2_get_refcount(bs, cluster_index); } if (refcount 0) { @@ -967,7 +967,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, refcount = qcow2_update_cluster_refcount(bs, l2_offset s-cluster_bits, addend, QCOW2_DISCARD_SNAPSHOT); } else { -refcount = get_refcount(bs, l2_offset s-cluster_bits); +refcount = qcow2_get_refcount(bs, l2_offset s-cluster_bits); } if (refcount 0) { ret = refcount; @@ -1243,8 +1243,8 @@ fail: * Checks the OFLAG_COPIED flag for all L1 and L2 entries. * * This function does not print an error message nor does it increment - * check_errors if get_refcount fails (this is because such an error will have - * been already detected and sufficiently signaled by the calling function + * check_errors if qcow2_get_refcount fails (this is because such an error will + * have been already detected and sufficiently signaled by the calling function * (qcow2_check_refcounts) by the time this function is called). */ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res, @@ -1265,7 +1265,7 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res, continue; } -refcount = get_refcount(bs, l2_offset s-cluster_bits); +refcount = qcow2_get_refcount(bs, l2_offset s-cluster_bits); if (refcount 0) { /* don't print message nor increment check_errors */ continue; @@ -1307,7 +1307,8 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res, if ((cluster_type == QCOW2_CLUSTER_NORMAL) || ((cluster_type == QCOW2_CLUSTER_ZERO) (data_offset != 0))) { -refcount = get_refcount(bs, data_offset s-cluster_bits); +refcount = qcow2_get_refcount(bs, + data_offset s-cluster_bits); if (refcount 0) { /* don't print message nor increment check_errors */ continue; @@ -1597,7 +1598,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res, /* compare ref counts */ for (i = 0, highest_cluster = 0; i nb_clusters; i++) { -refcount1 = get_refcount(bs, i); +refcount1 = qcow2_get_refcount(bs, i); if (refcount1 0) { fprintf(stderr, Can't get refcount for cluster % PRId64 : %s\n, i, strerror(-refcount1)); diff --git a/block/qcow2.h b/block/qcow2.h index b49424b..b423b71 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -472,6 +472,8 @@ int
[Qemu-devel] [PATCH 3/8] qemu-img: Fix insignifcant memleak
As soon as options is set in img_amend(), it needs to be freed before the function returns. This leak is rather insignifcant, as qemu-img will exit subsequently anyway, but there's no point in not fixing it. Signed-off-by: Max Reitz mre...@redhat.com --- qemu-img.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/qemu-img.c b/qemu-img.c index a06f425..d73a68a 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -2809,7 +2809,9 @@ static int img_amend(int argc, char **argv) } if (optind != argc - 1) { -error_exit(Expecting one image file name); +error_report(Expecting one image file name); +ret = -1; +goto out; } flags = BDRV_O_FLAGS | BDRV_O_RDWR; -- 2.0.1
[Qemu-devel] [PATCH 7/8] block/qcow2: Speed up zero cluster expansion
Actually, we do not need to allocate a new data cluster for every zero cluster to be expanded: It is completely sufficient to rely on qcow2's COW part and instead create a single zero cluster and reuse it as much as possible. Signed-off-by: Max Reitz mre...@redhat.com --- block/qcow2-cluster.c | 119 ++ 1 file changed, 92 insertions(+), 27 deletions(-) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 905beb6..867db03 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -1558,6 +1558,9 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, BDRVQcowState *s = bs-opaque; bool is_active_l1 = (l1_table == s-l1_table); uint64_t *l2_table = NULL; +int64_t zeroed_cluster_offset = 0; +int zeroed_cluster_refcount = 0; +int last_zeroed_cluster_l1i = 0, last_zeroed_cluster_l2i = 0; int ret; int i, j; @@ -1617,47 +1620,79 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, continue; } -offset = qcow2_alloc_clusters(bs, s-cluster_size); -if (offset 0) { -ret = offset; -goto fail; +if (zeroed_cluster_offset) { +zeroed_cluster_refcount += l2_refcount; +if (zeroed_cluster_refcount 0x) { +zeroed_cluster_refcount = 0; +zeroed_cluster_offset = 0; +} } +if (!zeroed_cluster_offset) { +offset = qcow2_alloc_clusters(bs, s-cluster_size); +if (offset 0) { +ret = offset; +goto fail; +} -if (l2_refcount 1) { -/* For shared L2 tables, set the refcount accordingly (it is - * already 1 and needs to be l2_refcount) */ -ret = qcow2_update_cluster_refcount(bs, -offset s-cluster_bits, l2_refcount - 1, -QCOW2_DISCARD_OTHER); +ret = qcow2_pre_write_overlap_check(bs, 0, offset, +s-cluster_size); +if (ret 0) { +qcow2_free_clusters(bs, offset, s-cluster_size, +QCOW2_DISCARD_OTHER); +goto fail; +} + +ret = bdrv_write_zeroes(bs-file, offset / BDRV_SECTOR_SIZE, +s-cluster_sectors, 0); if (ret 0) { qcow2_free_clusters(bs, offset, s-cluster_size, QCOW2_DISCARD_OTHER); goto fail; } + +if (l2_refcount 1) { +ret = qcow2_update_cluster_refcount(bs, +offset s-cluster_bits, l2_refcount - 1, +QCOW2_DISCARD_OTHER); +if (ret 0) { +qcow2_free_clusters(bs, offset, s-cluster_size, +QCOW2_DISCARD_OTHER); +goto fail; +} +} + +zeroed_cluster_offset = offset; +zeroed_cluster_refcount = l2_refcount; +} else { +ret = qcow2_update_cluster_refcount(bs, +zeroed_cluster_offset s-cluster_bits, +l2_refcount, QCOW2_DISCARD_OTHER); +if (ret 0) { +goto fail; +} } + +offset = zeroed_cluster_offset; +last_zeroed_cluster_l1i = i; +last_zeroed_cluster_l2i = j; } -ret = qcow2_pre_write_overlap_check(bs, 0, offset, s-cluster_size); -if (ret 0) { -if (!preallocated) { -qcow2_free_clusters(bs, offset, s-cluster_size, -QCOW2_DISCARD_ALWAYS); +if (preallocated) { +ret = qcow2_pre_write_overlap_check(bs, 0, offset, +s-cluster_size); +if (ret 0) { +goto fail; } -goto fail; -} -ret = bdrv_write_zeroes(bs-file, offset / BDRV_SECTOR_SIZE, -s-cluster_sectors, 0); -if (ret 0) { -if (!preallocated) { -qcow2_free_clusters(bs, offset, s-cluster_size, -
[Qemu-devel] [PATCH 6/8] block/qcow2: Simplify shared L2 handling in amend
Currently, we have a bitmap for keeping track of which clusters have been created during the zero cluster expansion process. This was necessary because we need to properly increase the refcount for shared L2 tables. However, now we can simply take the L2 refcount and use it for the cluster allocated for expansion. This will be the correct refcount and therefore we don't have to remember that cluster having been allocated any more. Signed-off-by: Max Reitz mre...@redhat.com --- block/qcow2-cluster.c | 83 +-- 1 file changed, 21 insertions(+), 62 deletions(-) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 0f52ef6..905beb6 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -1543,12 +1543,6 @@ fail: * Expands all zero clusters in a specific L1 table (or deallocates them, for * non-backed non-pre-allocated zero clusters). * - * expanded_clusters is a bitmap where every bit corresponds to one cluster in - * the image file; a bit gets set if the corresponding cluster has been used for - * zero expansion (i.e., has been filled with zeroes and is referenced from an - * L2 table). nb_clusters contains the total cluster count of the image file, - * i.e., the number of bits in expanded_clusters. - * * zero_clusters_count and *expanded_count are used to keep track of progress * for status_cb(). zero_clusters_count contains the total number of L2 zero * cluster entries. Accordingly, *expanded_count counts all visited L2 zero @@ -1557,9 +1551,7 @@ fail: * cluster. */ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, - int l1_size, uint8_t **expanded_clusters, - uint64_t *nb_clusters, - int64_t *expanded_count, + int l1_size, int64_t *expanded_count, int64_t zero_clusters_count, BlockDriverAmendStatusCB *status_cb) { @@ -1606,31 +1598,11 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, for (j = 0; j s-l2_size; j++) { uint64_t l2_entry = be64_to_cpu(l2_table[j]); -int64_t offset = l2_entry L2E_OFFSET_MASK, cluster_index; +int64_t offset = l2_entry L2E_OFFSET_MASK; int cluster_type = qcow2_get_cluster_type(l2_entry); bool preallocated = offset != 0; -if (cluster_type == QCOW2_CLUSTER_NORMAL) { -cluster_index = offset s-cluster_bits; -assert((cluster_index = 0) (cluster_index *nb_clusters)); -if ((*expanded_clusters)[cluster_index / 8] -(1 (cluster_index % 8))) { -/* Probably a shared L2 table; this cluster was a zero - * cluster which has been expanded, its refcount - * therefore most likely requires an update. */ -ret = qcow2_update_cluster_refcount(bs, cluster_index, 1, -QCOW2_DISCARD_NEVER); -if (ret 0) { -goto fail; -} -/* Since we just increased the refcount, the COPIED flag may - * no longer be set. */ -l2_table[j] = cpu_to_be64(l2_entry ~QCOW_OFLAG_COPIED); -l2_dirty = true; -} -continue; -} -else if (qcow2_get_cluster_type(l2_entry) != QCOW2_CLUSTER_ZERO) { +if (cluster_type != QCOW2_CLUSTER_ZERO) { continue; } @@ -1650,6 +1622,19 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, ret = offset; goto fail; } + +if (l2_refcount 1) { +/* For shared L2 tables, set the refcount accordingly (it is + * already 1 and needs to be l2_refcount) */ +ret = qcow2_update_cluster_refcount(bs, +offset s-cluster_bits, l2_refcount - 1, +QCOW2_DISCARD_OTHER); +if (ret 0) { +qcow2_free_clusters(bs, offset, s-cluster_size, +QCOW2_DISCARD_OTHER); +goto fail; +} +} } ret = qcow2_pre_write_overlap_check(bs, 0, offset, s-cluster_size); @@ -1671,29 +1656,12 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, goto fail; } -l2_table[j] = cpu_to_be64(offset | QCOW_OFLAG_COPIED); -l2_dirty = true; - -cluster_index = offset
[Qemu-devel] [PATCH 8/8] iotests: Expand test 061
Add some tests for progress output and one test regarding the COPIED flag for shared zeroed clusters to 061. Signed-off-by: Max Reitz mre...@redhat.com --- tests/qemu-iotests/061 | 41 + tests/qemu-iotests/061.out | 40 tests/qemu-iotests/group | 2 +- 3 files changed, 82 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/061 b/tests/qemu-iotests/061 index ab98def..12ea687 100755 --- a/tests/qemu-iotests/061 +++ b/tests/qemu-iotests/061 @@ -209,6 +209,47 @@ $QEMU_IMG amend -o compat=0.10 $TEST_IMG _check_test_img $QEMU_IO -c read -P 0 0 64M $TEST_IMG | _filter_qemu_io +echo +echo === Testing multiple zeroed cluster allocations === +echo +IMGOPTS=compat=1.1 TEST_IMG=$TEST_IMG.base _make_test_img 64M +IMGOPTS=compat=1.1,cluster_size=512 make_test_img -b $TEST_IMG.base 64M +$QEMU_IO -c write -z 0 $(((2 * 65535 + 1) * 512)) $TEST_IMG \ +| _filter_qemu_io +$QEMU_IMG amend -o compat=0.10 $TEST_IMG +_check_test_img + +echo +echo === Testing progress report without snapshot === +echo +IMGOPTS=compat=1.1 TEST_IMG=$TEST_IMG.base _make_test_img 4G +IMGOPTS=compat=1.1 _make_test_img -b $TEST_IMG.base 4G +$QEMU_IO -c write -z 0 64k \ + -c write -z 1G 64k \ + -c write -z 2G 64k \ + -c write -z 3G 64k $TEST_IMG | _filter_qemu_io +$QEMU_IMG amend -p -o compat=0.10 $TEST_IMG +_check_test_img + +echo +echo === Testing progress report with snapshot === +echo +IMGOPTS=compat=1.1 TEST_IMG=$TEST_IMG.base _make_test_img 4G +IMGOPTS=compat=1.1 _make_test_img -b $TEST_IMG.base 4G +$QEMU_IO -c write -z 0 64k \ + -c write -z 1G 64k \ + -c write -z 2G 64k \ + -c write -z 3G 64k $TEST_IMG | _filter_qemu_io +$QEMU_IMG snapshot -c foo $TEST_IMG +# COW L2 table 0 +$QEMU_IO -c write -z 64k 64k $TEST_IMG | _filter_qemu_io +# The first four steps will advance by 2/9, because in the active L1 table, the +# first L2 table contains two zero cluster entries and all others contain one +# but are shared with the other L2 table; the final 1/9 step will be from the +# zero cluster entry in the L2 table owned exclusively by the snapshot. +$QEMU_IMG amend -p -o compat=0.10 $TEST_IMG +_check_test_img + # success, all done echo *** done rm -f $seq.full diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out index e372470..63d7bb2 100644 --- a/tests/qemu-iotests/061.out +++ b/tests/qemu-iotests/061.out @@ -384,4 +384,44 @@ wrote 67108864/67108864 bytes at offset 0 No errors were found on the image. read 67108864/67108864 bytes at offset 0 64 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) + +=== Testing multiple zeroed cluster allocations === + +Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 +./061: line 216: make_test_img: command not found +wrote 67108352/67108352 bytes at offset 0 +64 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +No errors were found on the image. + +=== Testing progress report without snapshot === + +Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=4294967296 +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4294967296 backing_file='TEST_DIR/t.IMGFMT.base' +wrote 65536/65536 bytes at offset 0 +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +wrote 65536/65536 bytes at offset 1073741824 +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +wrote 65536/65536 bytes at offset 2147483648 +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +wrote 65536/65536 bytes at offset 3221225472 +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +(0.00/100%) (25.00/100%) (50.00/100%) (75.00/100%) (100.00/100%) (100.00/100%) +No errors were found on the image. + +=== Testing progress report with snapshot === + +Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=4294967296 +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4294967296 backing_file='TEST_DIR/t.IMGFMT.base' +wrote 65536/65536 bytes at offset 0 +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +wrote 65536/65536 bytes at offset 1073741824 +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +wrote 65536/65536 bytes at offset 2147483648 +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +wrote 65536/65536 bytes at offset 3221225472 +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +wrote 65536/65536 bytes at offset 65536 +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +(0.00/100%) (22.22/100%) (44.44/100%) (66.67/100%) (88.89/100%) (100.00/100%) (100.00/100%) (100.00/100%) (100.00/100%) (100.00/100%) +No errors were found on the image. *** done diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group index 6e67f61..b27e2f9 100644 --- a/tests/qemu-iotests/group +++ b/tests/qemu-iotests/group @@ -67,7 +67,7 @@ 058 rw auto quick 059 rw auto quick 060 rw auto quick -061 rw auto quick +061 rw auto 062 rw auto quick 063 rw auto quick 064 rw auto
[Qemu-devel] [PATCH 1/8] block: Add status callback to bdrv_amend_options()
Depending on the changed options and the image format, bdrv_amend_options() may take a significant amount of time. In these cases, a way to be informed about the operation's status is desirable. Since the operation is rather complex and may fundamentally change the image, implementing it as AIO or a couroutine does not seem feasible. On the other hand, implementing it as a block job would be significantly more difficult than a simple callback and would not add benefits other than progress report to the amending operation, because it should not actually be run as a block job at all. A callback may not be very pretty, but it's very easy to implement and perfectly fits its purpose here. Signed-off-by: Max Reitz mre...@redhat.com --- block.c | 5 +++-- block/qcow2.c | 5 - include/block/block.h | 5 - include/block/block_int.h | 3 ++- qemu-img.c| 2 +- 5 files changed, 14 insertions(+), 6 deletions(-) diff --git a/block.c b/block.c index 3e252a2..4c8d4d8 100644 --- a/block.c +++ b/block.c @@ -5708,12 +5708,13 @@ void bdrv_add_before_write_notifier(BlockDriverState *bs, notifier_with_return_list_add(bs-before_write_notifiers, notifier); } -int bdrv_amend_options(BlockDriverState *bs, QemuOpts *opts) +int bdrv_amend_options(BlockDriverState *bs, QemuOpts *opts, + BlockDriverAmendStatusCB *status_cb) { if (!bs-drv-bdrv_amend_options) { return -ENOTSUP; } -return bs-drv-bdrv_amend_options(bs, opts); +return bs-drv-bdrv_amend_options(bs, opts, status_cb); } /* This function will be called by the bdrv_recurse_is_first_non_filter method diff --git a/block/qcow2.c b/block/qcow2.c index b0faa69..757f890 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2210,7 +2210,8 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version) return 0; } -static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts) +static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts, + BlockDriverAmendStatusCB *status_cb) { BDRVQcowState *s = bs-opaque; int old_version = s-qcow_version, new_version = old_version; @@ -2223,6 +2224,8 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts) int ret; QemuOptDesc *desc = opts-list-desc; +(void)status_cb; + while (desc desc-name) { if (!qemu_opt_find(opts, desc-name)) { /* only change explicitly defined options */ diff --git a/include/block/block.h b/include/block/block.h index 32d3676..f2b1799 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -309,7 +309,10 @@ typedef enum { int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix); -int bdrv_amend_options(BlockDriverState *bs_new, QemuOpts *opts); +typedef void BlockDriverAmendStatusCB(BlockDriverState *bs, int64_t offset, + int64_t total_work_size); +int bdrv_amend_options(BlockDriverState *bs_new, QemuOpts *opts, + BlockDriverAmendStatusCB *status_cb); /* external snapshots */ bool bdrv_recurse_is_first_non_filter(BlockDriverState *bs, diff --git a/include/block/block_int.h b/include/block/block_int.h index f6c3bef..5c5798d 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -228,7 +228,8 @@ struct BlockDriver { int (*bdrv_check)(BlockDriverState* bs, BdrvCheckResult *result, BdrvCheckMode fix); -int (*bdrv_amend_options)(BlockDriverState *bs, QemuOpts *opts); +int (*bdrv_amend_options)(BlockDriverState *bs, QemuOpts *opts, + BlockDriverAmendStatusCB *status_cb); void (*bdrv_debug_event)(BlockDriverState *bs, BlkDebugEvent event); diff --git a/qemu-img.c b/qemu-img.c index ef74ae4..90d6b79 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -2827,7 +2827,7 @@ static int img_amend(int argc, char **argv) goto out; } -ret = bdrv_amend_options(bs, opts); +ret = bdrv_amend_options(bs, opts, NULL); if (ret 0) { error_report(Error while amending options: %s, strerror(-ret)); goto out; -- 2.0.1
[Qemu-devel] [PATCH v2 0/7] Virtio PCI libqos driver
Add functions for virtio PCI libqos driver. Add more debugging tools. Solve bugs found while generating tests. Marc Marí (7): tests: Functions bus_foreach and device_find from libqos virtio API tests: Add virtio device initialization libqtest: add QTEST_LOG for debugging qtest testcases libqos: Correct mask to align size to PAGE_SIZE in malloc-pc libqos: Change free function called in malloc virtio-blk: Correct bug in support for flexible descriptor layout libqos: Added basic virtqueue support to virtio implementation hw/block/virtio-blk.c | 14 +-- tests/Makefile|3 +- tests/libqos/malloc-pc.c |2 +- tests/libqos/malloc.h |2 +- tests/libqos/virtio-pci.c | 219 tests/libqos/virtio-pci.h | 49 ++ tests/libqos/virtio.c | 144 + tests/libqos/virtio.h | 141 tests/libqtest.c |7 +- tests/virtio-blk-test.c | 222 +++-- 10 files changed, 784 insertions(+), 19 deletions(-) create mode 100644 tests/libqos/virtio-pci.c create mode 100644 tests/libqos/virtio-pci.h create mode 100644 tests/libqos/virtio.c create mode 100644 tests/libqos/virtio.h -- 1.7.10.4
[Qemu-devel] [PATCH v2 6/7] virtio-blk: Correct bug in support for flexible descriptor layout
Without this correction, only a three descriptor layout is accepted, and requests with just two descriptors are not completed and no error message is displayed. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com Signed-off-by: Marc Marí marc.mari.barc...@gmail.com --- hw/block/virtio-blk.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index c241c50..302c39e 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -404,19 +404,19 @@ void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb) * NB: per existing s/n string convention the string is * terminated by '\0' only when shorter than buffer. */ -strncpy(req-elem.in_sg[0].iov_base, -s-blk.serial ? s-blk.serial : , -MIN(req-elem.in_sg[0].iov_len, VIRTIO_BLK_ID_BYTES)); +const char *serial = s-blk.serial ? s-blk.serial : ; +size_t size = MIN(strlen(serial) + 1, + MIN(iov_size(in_iov, in_num), + VIRTIO_BLK_ID_BYTES)); +iov_from_buf(in_iov, in_num, 0, serial, size); virtio_blk_req_complete(req, VIRTIO_BLK_S_OK); virtio_blk_free_request(req); } else if (type VIRTIO_BLK_T_OUT) { -qemu_iovec_init_external(req-qiov, req-elem.out_sg[1], - req-elem.out_num - 1); +qemu_iovec_init_external(req-qiov, iov, out_num); virtio_blk_handle_write(req, mrb); } else if (type == VIRTIO_BLK_T_IN || type == VIRTIO_BLK_T_BARRIER) { /* VIRTIO_BLK_T_IN is 0, so we can't just it. */ -qemu_iovec_init_external(req-qiov, req-elem.in_sg[0], - req-elem.in_num - 1); +qemu_iovec_init_external(req-qiov, in_iov, in_num); virtio_blk_handle_read(req); } else { virtio_blk_req_complete(req, VIRTIO_BLK_S_UNSUPP); -- 1.7.10.4
[Qemu-devel] [PATCH v2 2/7] tests: Add virtio device initialization
Add functions to read and write virtio header fields. Add status bit setting in virtio-blk-device. Signed-off-by: Marc Marí marc.mari.barc...@gmail.com --- tests/Makefile|2 +- tests/libqos/virtio-pci.c | 57 + tests/libqos/virtio-pci.h | 18 ++ tests/libqos/virtio.c | 55 +++ tests/libqos/virtio.h | 30 tests/virtio-blk-test.c | 14 +++ 6 files changed, 175 insertions(+), 1 deletion(-) create mode 100644 tests/libqos/virtio.c diff --git a/tests/Makefile b/tests/Makefile index 7c0f670..e0e203f 100644 --- a/tests/Makefile +++ b/tests/Makefile @@ -294,7 +294,7 @@ libqos-obj-y += tests/libqos/i2c.o libqos-pc-obj-y = $(libqos-obj-y) tests/libqos/pci-pc.o libqos-pc-obj-y += tests/libqos/malloc-pc.o libqos-omap-obj-y = $(libqos-obj-y) tests/libqos/i2c-omap.o -libqos-virtio-obj-y = $(libqos-obj-y) $(libqos-pc-obj-y) tests/libqos/virtio-pci.o +libqos-virtio-obj-y = $(libqos-obj-y) $(libqos-pc-obj-y) tests/libqos/virtio.o tests/libqos/virtio-pci.o tests/rtc-test$(EXESUF): tests/rtc-test.o tests/m48t59-test$(EXESUF): tests/m48t59-test.o diff --git a/tests/libqos/virtio-pci.c b/tests/libqos/virtio-pci.c index fde1b1f..b707c8d 100644 --- a/tests/libqos/virtio-pci.c +++ b/tests/libqos/virtio-pci.c @@ -55,6 +55,51 @@ static void qvirtio_pci_assign_device(QVirtioDevice *d, void *data) *vpcidev = (QVirtioPCIDevice *)d; } +static uint8_t qvirtio_pci_config_readb(QVirtioDevice *d, void *addr) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readb(dev-pdev, addr); +} + +static uint16_t qvirtio_pci_config_readw(QVirtioDevice *d, void *addr) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readw(dev-pdev, addr); +} + +static uint32_t qvirtio_pci_config_readl(QVirtioDevice *d, void *addr) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readl(dev-pdev, addr); +} + +static uint64_t qvirtio_pci_config_readq(QVirtioDevice *d, void *addr) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readl(dev-pdev, addr) | qpci_io_readl(dev-pdev, addr+4); +} + +static uint8_t qvirtio_pci_get_status(QVirtioDevice *d) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readb(dev-pdev, dev-addr + QVIRTIO_DEVICE_STATUS); +} + +static void qvirtio_pci_set_status(QVirtioDevice *d, uint8_t val) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +qpci_io_writeb(dev-pdev, dev-addr + QVIRTIO_DEVICE_STATUS, val); +} + +const QVirtioBus qvirtio_pci = { +.config_readb = qvirtio_pci_config_readb, +.config_readw = qvirtio_pci_config_readw, +.config_readl = qvirtio_pci_config_readl, +.config_readq = qvirtio_pci_config_readq, +.get_status = qvirtio_pci_get_status, +.set_status = qvirtio_pci_set_status, +}; + void qvirtio_pci_foreach(QPCIBus *bus, uint16_t device_type, void (*func)(QVirtioDevice *d, void *data), void *data) { @@ -73,3 +118,15 @@ QVirtioPCIDevice *qvirtio_pci_device_find(QPCIBus *bus, uint16_t device_type) return dev; } + +void qvirtio_pci_enable_device(QVirtioPCIDevice *d) +{ +qpci_device_enable(d-pdev); +d-addr = qpci_iomap(d-pdev, 0); +g_assert(d-addr != NULL); +} + +void qvirtio_pci_disable_device(QVirtioPCIDevice *d) +{ +qpci_iounmap(d-pdev, d-addr); +} diff --git a/tests/libqos/virtio-pci.h b/tests/libqos/virtio-pci.h index 5101abb..3060238 100644 --- a/tests/libqos/virtio-pci.h +++ b/tests/libqos/virtio-pci.h @@ -13,12 +13,30 @@ #include libqos/virtio.h #include libqos/pci.h +#define QVIRTIO_DEVICE_FEATURES 0x00 +#define QVIRTIO_GUEST_FEATURES 0x04 +#define QVIRTIO_QUEUE_ADDRESS 0x08 +#define QVIRTIO_QUEUE_SIZE 0x0C +#define QVIRTIO_QUEUE_SELECT0x0E +#define QVIRTIO_QUEUE_NOTIFY0x10 +#define QVIRTIO_DEVICE_STATUS 0x12 +#define QVIRTIO_ISR_STATUS 0x13 +#define QVIRTIO_MSIX_CONF_VECTOR0x14 +#define QVIRTIO_MSIX_QUEUE_VECTOR 0x16 +#define QVIRTIO_DEVICE_SPECIFIC_MSIX0x18 +#define QVIRTIO_DEVICE_SPECIFIC_NO_MSIX 0x14 + typedef struct QVirtioPCIDevice { QVirtioDevice vdev; QPCIDevice *pdev; +void *addr; } QVirtioPCIDevice; +extern const QVirtioBus qvirtio_pci; + void qvirtio_pci_foreach(QPCIBus *bus, uint16_t device_type, void (*func)(QVirtioDevice *d, void *data), void *data); QVirtioPCIDevice *qvirtio_pci_device_find(QPCIBus *bus, uint16_t device_type); +void qvirtio_pci_enable_device(QVirtioPCIDevice *d); +void qvirtio_pci_disable_device(QVirtioPCIDevice *d); #endif diff --git a/tests/libqos/virtio.c b/tests/libqos/virtio.c new file mode 100644 index 000..577d679 --- /dev/null +++ b/tests/libqos/virtio.c @@ -0,0 +1,55 @@ +/* + * libqos virtio driver + * + * Copyright (c) 2014 Marc Marí + * + * This work is
[Qemu-devel] [PATCH v2 1/7] tests: Functions bus_foreach and device_find from libqos virtio API
Virtio header has been changed to compile and work with a real device. Functions bus_foreach and device_find have been implemented for PCI. Virtio-blk test case now opens a fake device. Signed-off-by: Marc Marí marc.mari.barc...@gmail.com --- tests/Makefile|3 +- tests/libqos/virtio-pci.c | 75 + tests/libqos/virtio-pci.h | 24 +++ tests/libqos/virtio.h | 23 ++ tests/virtio-blk-test.c | 61 +++- 5 files changed, 177 insertions(+), 9 deletions(-) create mode 100644 tests/libqos/virtio-pci.c create mode 100644 tests/libqos/virtio-pci.h create mode 100644 tests/libqos/virtio.h diff --git a/tests/Makefile b/tests/Makefile index 4b2e1bb..7c0f670 100644 --- a/tests/Makefile +++ b/tests/Makefile @@ -294,6 +294,7 @@ libqos-obj-y += tests/libqos/i2c.o libqos-pc-obj-y = $(libqos-obj-y) tests/libqos/pci-pc.o libqos-pc-obj-y += tests/libqos/malloc-pc.o libqos-omap-obj-y = $(libqos-obj-y) tests/libqos/i2c-omap.o +libqos-virtio-obj-y = $(libqos-obj-y) $(libqos-pc-obj-y) tests/libqos/virtio-pci.o tests/rtc-test$(EXESUF): tests/rtc-test.o tests/m48t59-test$(EXESUF): tests/m48t59-test.o @@ -315,7 +316,7 @@ tests/vmxnet3-test$(EXESUF): tests/vmxnet3-test.o tests/ne2000-test$(EXESUF): tests/ne2000-test.o tests/wdt_ib700-test$(EXESUF): tests/wdt_ib700-test.o tests/virtio-balloon-test$(EXESUF): tests/virtio-balloon-test.o -tests/virtio-blk-test$(EXESUF): tests/virtio-blk-test.o +tests/virtio-blk-test$(EXESUF): tests/virtio-blk-test.o $(libqos-virtio-obj-y) tests/virtio-net-test$(EXESUF): tests/virtio-net-test.o tests/virtio-rng-test$(EXESUF): tests/virtio-rng-test.o tests/virtio-scsi-test$(EXESUF): tests/virtio-scsi-test.o diff --git a/tests/libqos/virtio-pci.c b/tests/libqos/virtio-pci.c new file mode 100644 index 000..fde1b1f --- /dev/null +++ b/tests/libqos/virtio-pci.c @@ -0,0 +1,75 @@ +/* + * libqos virtio PCI driver + * + * Copyright (c) 2014 Marc Marí + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include glib.h +#include libqtest.h +#include libqos/virtio.h +#include libqos/virtio-pci.h +#include libqos/pci.h +#include libqos/pci-pc.h + +#include hw/pci/pci_regs.h + +typedef struct QVirtioPCIForeachData { +void (*func)(QVirtioDevice *d, void *data); +uint16_t device_type; +void *user_data; +} QVirtioPCIForeachData; + +static QVirtioPCIDevice *qpcidevice_to_qvirtiodevice(QPCIDevice *pdev) +{ +QVirtioPCIDevice *vpcidev; +vpcidev = g_malloc0(sizeof(*vpcidev)); + +if (pdev) { +vpcidev-pdev = pdev; +vpcidev-vdev.device_type = +qpci_config_readw(vpcidev-pdev, PCI_SUBSYSTEM_ID); +} + +return vpcidev; +} + +static void qvirtio_pci_foreach_callback( +QPCIDevice *dev, int devfn, void *data) +{ +QVirtioPCIForeachData *d = data; +QVirtioPCIDevice *vpcidev = qpcidevice_to_qvirtiodevice(dev); + +if (vpcidev-vdev.device_type == d-device_type) { +d-func(vpcidev-vdev, d-user_data); +} else { +g_free(vpcidev); +} +} + +static void qvirtio_pci_assign_device(QVirtioDevice *d, void *data) +{ +QVirtioPCIDevice **vpcidev = data; +*vpcidev = (QVirtioPCIDevice *)d; +} + +void qvirtio_pci_foreach(QPCIBus *bus, uint16_t device_type, +void (*func)(QVirtioDevice *d, void *data), void *data) +{ +QVirtioPCIForeachData d = { .func = func, +.device_type = device_type, +.user_data = data }; + +qpci_device_foreach(bus, QVIRTIO_VENDOR_ID, -1, +qvirtio_pci_foreach_callback, d); +} + +QVirtioPCIDevice *qvirtio_pci_device_find(QPCIBus *bus, uint16_t device_type) +{ +QVirtioPCIDevice *dev = NULL; +qvirtio_pci_foreach(bus, device_type, qvirtio_pci_assign_device, dev); + +return dev; +} diff --git a/tests/libqos/virtio-pci.h b/tests/libqos/virtio-pci.h new file mode 100644 index 000..5101abb --- /dev/null +++ b/tests/libqos/virtio-pci.h @@ -0,0 +1,24 @@ +/* + * libqos virtio PCI definitions + * + * Copyright (c) 2014 Marc Marí + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef LIBQOS_VIRTIO_PCI_H +#define LIBQOS_VIRTIO_PCI_H + +#include libqos/virtio.h +#include libqos/pci.h + +typedef struct QVirtioPCIDevice { +QVirtioDevice vdev; +QPCIDevice *pdev; +} QVirtioPCIDevice; + +void qvirtio_pci_foreach(QPCIBus *bus, uint16_t device_type, +void (*func)(QVirtioDevice *d, void *data), void *data); +QVirtioPCIDevice *qvirtio_pci_device_find(QPCIBus *bus, uint16_t device_type); +#endif diff --git a/tests/libqos/virtio.h b/tests/libqos/virtio.h new file mode 100644 index 000..2a05798 --- /dev/null +++
[Qemu-devel] [PATCH v2 5/7] libqos: Change free function called in malloc
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com Signed-off-by: Marc Marí marc.mari.barc...@gmail.com --- tests/libqos/malloc.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/libqos/malloc.h b/tests/libqos/malloc.h index 46f6000..5565381 100644 --- a/tests/libqos/malloc.h +++ b/tests/libqos/malloc.h @@ -32,7 +32,7 @@ static inline uint64_t guest_alloc(QGuestAllocator *allocator, size_t size) static inline void guest_free(QGuestAllocator *allocator, uint64_t addr) { -allocator-alloc(allocator, addr); +allocator-free(allocator, addr); } #endif -- 1.7.10.4
[Qemu-devel] [PATCH v2 7/7] libqos: Added basic virtqueue support to virtio implementation
Add status changing and feature negotiation. Add basic virtqueue support for adding and sending virtqueue requests. Add ISR checking. Signed-off-by: Marc Marí marc.mari.barc...@gmail.com --- tests/libqos/virtio-pci.c | 91 +- tests/libqos/virtio-pci.h |7 ++ tests/libqos/virtio.c | 89 ++ tests/libqos/virtio.h | 90 +- tests/virtio-blk-test.c | 155 +++-- 5 files changed, 425 insertions(+), 7 deletions(-) diff --git a/tests/libqos/virtio-pci.c b/tests/libqos/virtio-pci.c index b707c8d..5d00826 100644 --- a/tests/libqos/virtio-pci.c +++ b/tests/libqos/virtio-pci.c @@ -13,6 +13,8 @@ #include libqos/virtio-pci.h #include libqos/pci.h #include libqos/pci-pc.h +#include libqos/malloc.h +#include libqos/malloc-pc.h #include hw/pci/pci_regs.h @@ -79,16 +81,93 @@ static uint64_t qvirtio_pci_config_readq(QVirtioDevice *d, void *addr) return qpci_io_readl(dev-pdev, addr) | qpci_io_readl(dev-pdev, addr+4); } +static uint32_t qvirtio_pci_get_features(QVirtioDevice *d) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readl(dev-pdev, dev-addr + QVIRTIO_DEVICE_FEATURES); +} + +static void qvirtio_pci_set_features(QVirtioDevice *d, uint32_t features) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +qpci_io_writel(dev-pdev, dev-addr + QVIRTIO_GUEST_FEATURES, features); +} + static uint8_t qvirtio_pci_get_status(QVirtioDevice *d) { QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; return qpci_io_readb(dev-pdev, dev-addr + QVIRTIO_DEVICE_STATUS); } -static void qvirtio_pci_set_status(QVirtioDevice *d, uint8_t val) +static void qvirtio_pci_set_status(QVirtioDevice *d, uint8_t status) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +qpci_io_writeb(dev-pdev, dev-addr + QVIRTIO_DEVICE_STATUS, status); +} + +static uint8_t qvirtio_pci_get_isr_status(QVirtioDevice *d) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readb(dev-pdev, dev-addr + QVIRTIO_ISR_STATUS); +} + +static void qvirtio_pci_queue_select(QVirtioDevice *d, uint16_t index) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +qpci_io_writeb(dev-pdev, dev-addr + QVIRTIO_QUEUE_SELECT, index); +} + +static uint16_t qvirtio_pci_get_queue_size(QVirtioDevice *d) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +return qpci_io_readw(dev-pdev, dev-addr + QVIRTIO_QUEUE_SIZE); +} + +static void qvirtio_pci_set_queue_address(QVirtioDevice *d, uint32_t pfn) +{ +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; +qpci_io_writel(dev-pdev, dev-addr + QVIRTIO_QUEUE_ADDRESS, pfn); +} + +static QVirtQueue *qvirtio_pci_virtqueue_setup(QVirtioDevice *d, +QGuestAllocator *alloc, uint16_t index) +{ +uint16_t aux; +uint64_t addr; +QVirtQueue *vq; + +vq = g_malloc0(sizeof(*vq)); + +qvirtio_pci_queue_select(d, index); +vq-index = index; +vq-size = qvirtio_pci_get_queue_size(d); +vq-free_head = 0; +vq-num_free = vq-size; +vq-align = QVIRTIO_PCI_ALIGN; + +/* Check different than 0 */ +g_assert_cmpint(vq-size, !=, 0); + +/* Check power of 2 */ +aux = vq-size; +while ((aux 1) != 0) { +aux = aux 1; +} +g_assert_cmpint(aux, !=, 1); + +addr = guest_alloc(alloc, qvring_size(vq-size, QVIRTIO_PCI_ALIGN)); +qvring_init(alloc, vq, addr); +qvirtio_pci_set_queue_address(d, vq-desc/4096); + +/* TODO: MSI-X configuration */ + +return vq; +} + +static void qvirtio_pci_virtqueue_kick(QVirtioDevice *d, QVirtQueue *vq) { QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; -qpci_io_writeb(dev-pdev, dev-addr + QVIRTIO_DEVICE_STATUS, val); +qpci_io_writew(dev-pdev, dev-addr + QVIRTIO_QUEUE_NOTIFY, vq-index); } const QVirtioBus qvirtio_pci = { @@ -96,8 +175,16 @@ const QVirtioBus qvirtio_pci = { .config_readw = qvirtio_pci_config_readw, .config_readl = qvirtio_pci_config_readl, .config_readq = qvirtio_pci_config_readq, +.get_features = qvirtio_pci_get_features, +.set_features = qvirtio_pci_set_features, .get_status = qvirtio_pci_get_status, .set_status = qvirtio_pci_set_status, +.get_isr_status = qvirtio_pci_get_isr_status, +.queue_select = qvirtio_pci_queue_select, +.get_queue_size = qvirtio_pci_get_queue_size, +.set_queue_address = qvirtio_pci_set_queue_address, +.virtqueue_setup = qvirtio_pci_virtqueue_setup, +.virtqueue_kick = qvirtio_pci_virtqueue_kick, }; void qvirtio_pci_foreach(QPCIBus *bus, uint16_t device_type, diff --git a/tests/libqos/virtio-pci.h b/tests/libqos/virtio-pci.h index 3060238..1e1154a 100644 --- a/tests/libqos/virtio-pci.h +++ b/tests/libqos/virtio-pci.h @@ -26,6 +26,13 @@ #define QVIRTIO_DEVICE_SPECIFIC_MSIX0x18 #define QVIRTIO_DEVICE_SPECIFIC_NO_MSIX 0x14 +#define QVIRTIO_F_NOTIFY_ON_EMPTY 0x0100 +#define
[Qemu-devel] [PATCH v2 3/7] libqtest: add QTEST_LOG for debugging qtest testcases
Signed-off-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Marc Marí marc.mari.barc...@gmail.com --- tests/libqtest.c |7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/tests/libqtest.c b/tests/libqtest.c index 98e8f4b..fbd600d 100644 --- a/tests/libqtest.c +++ b/tests/libqtest.c @@ -167,11 +167,12 @@ QTestState *qtest_init(const char *extra_args) if (s-qemu_pid == 0) { command = g_strdup_printf(exec %s -qtest unix:%s,nowait - -qtest-log /dev/null + -qtest-log %s -qmp unix:%s,nowait -machine accel=qtest -display none %s, qemu_binary, socket_path, + getenv(QTEST_LOG) ? /dev/fd/2 : /dev/null, qmp_socket_path, extra_args ?: ); execlp(/bin/sh, sh, -c, command, NULL); @@ -397,10 +398,14 @@ QDict *qtest_qmpv(QTestState *s, const char *fmt, va_list ap) /* No need to send anything for an empty QObject. */ if (qobj) { +int log = getenv(QTEST_LOG) != NULL; QString *qstr = qobject_to_json(qobj); const char *str = qstring_get_str(qstr); size_t size = qstring_get_length(qstr); +if (log) { +fprintf(stderr, %s, str); +} /* Send QMP request */ socket_send(s-qmp_fd, str, size); -- 1.7.10.4
[Qemu-devel] [PATCH v2 4/7] libqos: Correct mask to align size to PAGE_SIZE in malloc-pc
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com Signed-off-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Marc Marí marc.mari.barc...@gmail.com --- tests/libqos/malloc-pc.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/libqos/malloc-pc.c b/tests/libqos/malloc-pc.c index db1496c..2efd095 100644 --- a/tests/libqos/malloc-pc.c +++ b/tests/libqos/malloc-pc.c @@ -36,7 +36,7 @@ static uint64_t pc_alloc(QGuestAllocator *allocator, size_t size) size += (PAGE_SIZE - 1); -size = PAGE_SIZE; +size = -PAGE_SIZE; g_assert_cmpint((s-start + size), =, s-end); -- 1.7.10.4
Re: [Qemu-devel] [PATCH] virtio-scsi: fix object check failure
Hi Ming Paolo, On Wed, 2014-06-18 at 23:11 +0800, Ming Lei wrote: On Wed, Jun 18, 2014 at 11:06 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 18/06/2014 17:02, Paolo Bonzini ha scritto: Il 18/06/2014 16:18, Ming Lei ha scritto: This should never be triggered by vhost-scsi. Perhaps a bug in the kernel? It can be triggered with rmmod, system suspend, reboot... Yes, but it should not. What happens if you change VHOST_SCSI_VQ_NUM_FIXED from 2 to 3? Sorry, brain fart here. The number of interrupts in virtio-scsi is queues+3 because of the configuration interrupt. This does not apply here. Anyway, vhost-scsi passes the number of virtqueues correctly to the kernel. QEMU might be triggering this because, when vhost is stopped, the queue is still not empty (as expected, since it's a receive queue). Can you check that this is the case? Yes, that is the case. If so, patching QEMU is correct, but you need to change VIRTIO_SCSI(vdev) to VIRTIO_SCSI_COMMON(vdev), and move events_dropped from VirtIOSCSI to VirtIOSCSICommon. This approach should be better. Just curious if this patch was picked up by upstream yet, and if it needs to be CC'd to stable for vhost-scsi code in = v1.5.y ..? Thanks, --nab
Re: [Qemu-devel] [PATCH v2 4/7] bootindex: delete bootindex when device is removed
Hi, Gerd -Original Message- From: Gerd Hoffmann [mailto:kra...@redhat.com] Sent: Friday, July 25, 2014 5:52 PM Subject: Re: [PATCH v2 4/7] bootindex: delete bootindex when device is removed Hi, +del_boot_device_path(dev); You can call this from device_finalize() instead of placing it into each individual device. Good idea! Thanks very much. Best regards, -Gonglei
Re: [Qemu-devel] [PATCH v2 1/7] bootindex: add modify_boot_device_path function
Hi, Gerd -Original Message- From: Gerd Hoffmann [mailto:kra...@redhat.com] Sent: Friday, July 25, 2014 5:46 PM Hi, +void modify_boot_device_path(int32_t bootindex, DeviceState *dev, + const char *suffix) +{ +FWBootEntry *node, *i; + +assert(dev != NULL || suffix != NULL); + +QTAILQ_FOREACH(i, fw_boot_order, link) { +if (i-bootindex == bootindex) { +qerror_report(ERROR_CLASS_GENERIC_ERROR, + The bootindex %d has already been used, bootindex); +return; +} +/* delete the same original dev */ +if (i-dev-id !strcmp(i-dev-id, dev-id)) { +QTAILQ_REMOVE(fw_boot_order, i, link); Ok ... +g_free(i-suffix); +g_free(i); ... but you should free the old entry later ... + +break; +} +} + +if (bootindex = 0) { +node = g_malloc0(sizeof(FWBootEntry)); +node-bootindex = bootindex; +node-suffix = g_strdup(suffix); ... because you can just copy the suffix from the old entry here, instead of expecting the caller pass it in. Okay, agreed. But we should also think about the situation which a device don't have old entry in global fw_boot_order list. So we can do like this: if (suffix) { node-suffix = g_strdup(suffix); } else if (has_old_entry) { node-suffix = g_strdup(old_entry-suffix); } else { node-suffix = NULL; } Best regards, -Gonglei
Re: [Qemu-devel] [questions] about qemu log
Hi, all If I use qemu command directly to run vm, bypass libvirt, how to configure qemu to assure that each vm has its own log file, like vmname.log? For example, VM: rhel7-net has its own log file, rhel7-net.log, VM:rhel7-stor has its own log file, rhel7-stor.log. -D /path/to/unique/file/name.log -D option is to configure qemu_logfile for the output logs which are controlled by qmp command logfile, which can be enabled/disabled on run time. I want to configure the log file for the output of fprintf(stderr, fmt, ...), .etc, i.e., how to redirect the output of fprintf(stderr, fmt, ...), or some other log-interface to a specified file? In a shell you would write something like: 2 stderr.log You may also want to toggle QEMU's -msg timestamp=on option. I think the -msg -msg timestamp=on option will add timestamp to the output of error_report(fmt, ...), but where is the output? Which file saves the output? And where is the output of fprintf(stderr, fmt, ...)? Should I redirect of stderr to specified log file? In libvirt code, when start a vm(qemuProcessStart), it will create a qemu log file named /var/log/libvirt/qemu/vmname.log, and redirect the stderr and stdout to file descriptor of this qemu log file. But if I run a vm directly by qemu command, bypass libvirt, then how to configure qemu to assure that each vm has its own log file, and how to redirect the stderr, stdout to each vm's own log file? Thanks, Zhang Haoyu Cheers, Andreas
Re: [Qemu-devel] [questions]_about_qemu_log
Hi, all If I use qemu command directly to run vm, bypass libvirt, how to configure qemu to assure that each vm has its own log file, like vmname.log? For example, VM: rhel7-net has its own log file, rhel7-net.log, VM:rhel7-stor has its own log file, rhel7-stor.log. -D /path/to/unique/file/name.log -D option is to configure qemu_logfile for the output logs which are controlled by qmp command logfile, which can be enabled/disabled on run time. I want to configure the log file for the output of fprintf(stderr, fmt, ...), .etc, i.e., how to redirect the output of fprintf(stderr, fmt, ...), or some other log-interface to a specified file? In a shell you would write something like: 2 stderr.log You may also want to toggle QEMU's -msg timestamp=on option. I think the -msg -msg timestamp=on option will add timestamp to the output of error_report(fmt, ...), but where is the output? Which file saves the output? If during the qmp operation, error_report will output to current monitor, but my concern is to stderr, which file saves the output? And where is the output of fprintf(stderr, fmt, ...)? Should I redirect of stderr to specified log file? In libvirt code, when start a vm(qemuProcessStart), it will create a qemu log file named /var/log/libvirt/qemu/vmname.log, and redirect the stderr and stdout to file descriptor of this qemu log file. But if I run a vm directly by qemu command, bypass libvirt, then how to configure qemu to assure that each vm has its own log file, and how to redirect the stderr, stdout to each vm's own log file? Thanks, Zhang Haoyu Cheers, Andreas
Re: [Qemu-devel] [PATCH] virtio-scsi: fix object check failure
On Sat, Jul 26, 2014 at 7:10 AM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: Hi Ming Paolo, On Wed, 2014-06-18 at 23:11 +0800, Ming Lei wrote: On Wed, Jun 18, 2014 at 11:06 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 18/06/2014 17:02, Paolo Bonzini ha scritto: Il 18/06/2014 16:18, Ming Lei ha scritto: This should never be triggered by vhost-scsi. Perhaps a bug in the kernel? It can be triggered with rmmod, system suspend, reboot... Yes, but it should not. What happens if you change VHOST_SCSI_VQ_NUM_FIXED from 2 to 3? Sorry, brain fart here. The number of interrupts in virtio-scsi is queues+3 because of the configuration interrupt. This does not apply here. Anyway, vhost-scsi passes the number of virtqueues correctly to the kernel. QEMU might be triggering this because, when vhost is stopped, the queue is still not empty (as expected, since it's a receive queue). Can you check that this is the case? Yes, that is the case. If so, patching QEMU is correct, but you need to change VIRTIO_SCSI(vdev) to VIRTIO_SCSI_COMMON(vdev), and move events_dropped from VirtIOSCSI to VirtIOSCSICommon. This approach should be better. Just curious if this patch was picked up by upstream yet, and if it needs to be CC'd to stable for vhost-scsi code in = v1.5.y ..? Below is the merged version which has been marked as stable: http://git.qemu.org/?p=qemu.git;a=commit;h=91d670fbf9945ca4ecbd123affb36889e7fe8a5d Thanks,
[Qemu-devel] [PATCH v3 4/7] bootindex: delete bootindex when device is removed
From: Gonglei arei.gong...@huawei.com Device should be removed from global boot list when it is hot-unplugged. Signed-off-by: Chenliang chenlian...@huawei.com Signed-off-by: Gonglei arei.gong...@huawei.com --- hw/core/qdev.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/core/qdev.c b/hw/core/qdev.c index da1ba48..7bc12bc 100644 --- a/hw/core/qdev.c +++ b/hw/core/qdev.c @@ -975,6 +975,8 @@ static void device_finalize(Object *obj) if (dev-opts) { qemu_opts_del(dev-opts); } +/* remove device's bootindex from global boot order list */ +del_boot_device_path(dev); QLIST_FOREACH_SAFE(ngl, dev-gpios, node, next) { QLIST_REMOVE(ngl, node); -- 1.7.12.4
[Qemu-devel] [PATCH v3 1/7] bootindex: add modify_boot_device_path function
From: Gonglei arei.gong...@huawei.com When we want to change one device's bootindex, we should lookup the device and change the bootindex. it is simply that remove it from the global boot list, then re-add it, sorted by new bootindex. If the new bootindex has already used by another device just throw an error. Signed-off-by: Gonglei arei.gong...@huawei.com Signed-off-by: Chenliang chenlian...@huawei.com --- include/sysemu/sysemu.h | 2 ++ vl.c| 58 + 2 files changed, 60 insertions(+) diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index d8539fd..e1b0659 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -209,6 +209,8 @@ void usb_info(Monitor *mon, const QDict *qdict); void add_boot_device_path(int32_t bootindex, DeviceState *dev, const char *suffix); +void modify_boot_device_path(int32_t bootindex, DeviceState *dev, + const char *suffix); char *get_boot_devices_list(size_t *size, bool ignore_suffixes); DeviceState *get_boot_device(uint32_t position); diff --git a/vl.c b/vl.c index fe451aa..3e42eaa 100644 --- a/vl.c +++ b/vl.c @@ -1248,6 +1248,64 @@ void add_boot_device_path(int32_t bootindex, DeviceState *dev, QTAILQ_INSERT_TAIL(fw_boot_order, node, link); } +void modify_boot_device_path(int32_t bootindex, DeviceState *dev, + const char *suffix) +{ +FWBootEntry *node, *i, *old_entry = NULL; + +assert(dev != NULL || suffix != NULL); + +if (bootindex = 0) { +QTAILQ_FOREACH(i, fw_boot_order, link) { +if (i-bootindex == bootindex) { +qerror_report(ERROR_CLASS_GENERIC_ERROR, + The bootindex %d has already been used, bootindex); +return; +} +} +} + +QTAILQ_FOREACH(i, fw_boot_order, link) { +/* delete the same original dev */ +if (i-dev-id !strcmp(i-dev-id, dev-id)) { +QTAILQ_REMOVE(fw_boot_order, i, link); +old_entry = i; + +break; +} +} + +if (bootindex = 0) { +node = g_malloc0(sizeof(FWBootEntry)); +node-bootindex = bootindex; +if (suffix) { +node-suffix = g_strdup(suffix); +} else if (old_entry) { +node-suffix = g_strdup(old_entry-suffix); +} else { +node-suffix = NULL; +} +node-dev = dev; + +/* add to the global boot list */ +QTAILQ_FOREACH(i, fw_boot_order, link) { +if (i-bootindex bootindex) { +continue; +} +QTAILQ_INSERT_BEFORE(i, node, link); +goto out; +} + +QTAILQ_INSERT_TAIL(fw_boot_order, node, link); +} + +out: +if (old_entry) { +g_free(old_entry-suffix); +g_free(old_entry); +} +} + DeviceState *get_boot_device(uint32_t position) { uint32_t counter = 0; -- 1.7.12.4
[Qemu-devel] [PATCH v3 0/7] modify boot order of guest, and take effect after rebooting
From: Gonglei arei.gong...@huawei.com Sometimes, we want to modify boot order of a guest, but no need to shutdown it. We can call dynamic changing bootindex of a guest, which can be assured taking effect just after the guest rebooting. For example, in P2V scene, we boot a guest and then attach a new system disk, for copying some thing. We want to assign the new disk as the booting disk, which means its bootindex=1. Different nics can be assigen different bootindex dynamically also make sense. The patchsets add one qmp interface, and add an fw_cfg_machine_reset() to achieve it. Steps of testing: ./qemu-system-x86_64 -enable-kvm -m 4096 -smp 4 -name redhat6.2 -drive \ file=/home/redhat6.2.img,if=none,id=drive-ide0-0-0 \ -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 \ -drive file=/home/RH-DVD1.iso,if=none,id=drive-ide0-0-1 \ -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=4 \ -vnc 0.0.0.0:10 -netdev type=user,id=net0 \ -device virtio-net-pci,netdev=net0,bootindex=3,id=nic1 \ -netdev type=user,id=net1 -device e1000,netdev=net1,bootindex=2,id=nic2 -monitor stdio QEMU 2.0.93 monitor - type 'help' for more information (qemu) set-bootindex ide0-0-1 1 The bootindex 1 has already been used (qemu) set-bootindex ide0-0-1 5 /disk@1 (qemu) set-bootindex ide0-0-1 0 (qemu) system_reset (qemu) set-bootindex ide0-0-1 1 The bootindex 1 has already been used (qemu) set-bootindex nic2 0 The bootindex 0 has already been used (qemu) set-bootindex ide0-0-1 -1 (qemu) set-bootindex nic2 0 (qemu) system_reset (qemu) Changes since v1: *rework by Gerd's suggestion: - split modify and del fw_boot_order for single function. - change modify bootindex's realization which simply lookup the device and modify the bootindex. if the new bootindex has already used by another device just throw an error. - change to del_boot_device_path(DeviceState *dev) and simply delete all entries belonging to the device. Changes since v2: *address Gerd's reviewing suggestion: - use the old entry's suffix, if the caller do not pass it in. - call del_boot_device_path() from device_finalize() instead of placing it into each individual device. Thanks Gerd. Gonglei (7): bootindex: add modify_boot_device_path function bootindex: add del_boot_device_path function fw_cfg: add fw_cfg_machine_reset function bootindex: delete bootindex when device is removed qmp: add set-bootindex command qemu-monitor: HMP set-bootindex wrapper spapr: fix possible memory leak hmp-commands.hx | 15 ++ hmp.c | 13 hmp.h | 1 + hw/core/qdev.c| 2 ++ hw/nvram/fw_cfg.c | 54 +- hw/ppc/spapr.c| 1 + include/hw/nvram/fw_cfg.h | 2 ++ include/sysemu/sysemu.h | 3 ++ qapi-schema.json | 16 ++ qmp-commands.hx | 24 +++ qmp.c | 17 +++ vl.c | 75 +++ 12 files changed, 216 insertions(+), 7 deletions(-) -- 1.7.12.4
[Qemu-devel] [PATCH v3 2/7] bootindex: add del_boot_device_path function
From: Gonglei arei.gong...@huawei.com Introduce a del_boot_device_path() cleanup fw_cfg content when hot-unplugging devcie refer to bootindex. Signed-off-by: Gonglei arei.gong...@huawei.com Signed-off-by: Chenliang chenlian...@huawei.com --- include/sysemu/sysemu.h | 1 + vl.c| 17 + 2 files changed, 18 insertions(+) diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index e1b0659..7a79ff4 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -209,6 +209,7 @@ void usb_info(Monitor *mon, const QDict *qdict); void add_boot_device_path(int32_t bootindex, DeviceState *dev, const char *suffix); +void del_boot_device_path(DeviceState *dev); void modify_boot_device_path(int32_t bootindex, DeviceState *dev, const char *suffix); char *get_boot_devices_list(size_t *size, bool ignore_suffixes); diff --git a/vl.c b/vl.c index 3e42eaa..0cdadb4 100644 --- a/vl.c +++ b/vl.c @@ -1248,6 +1248,23 @@ void add_boot_device_path(int32_t bootindex, DeviceState *dev, QTAILQ_INSERT_TAIL(fw_boot_order, node, link); } +void del_boot_device_path(DeviceState *dev) +{ +FWBootEntry *i; + +assert(dev != NULL); + +QTAILQ_FOREACH(i, fw_boot_order, link) { +if (i-dev-id dev-id !strcmp(i-dev-id, dev-id)) { +/* remove all entries of the assigend dev */ +QTAILQ_REMOVE(fw_boot_order, i, link); +g_free(i-suffix); +g_free(i); +break; +} +} +} + void modify_boot_device_path(int32_t bootindex, DeviceState *dev, const char *suffix) { -- 1.7.12.4