Re: [Qemu-devel] KVM call minutes for Feb 8
On 10 February 2011 07:47, Anthony Liguori <anth...@codemonkey.ws> wrote:
> So very concretely, I'm suggesting we do the following to target-i386:
>
> 2) get rid of the entire concept of machines.  Creating a i440fx is
> essentially equivalent to creating a bare machine.

Does that make any sense for anything other than target-i386?
The concept of a machine model seems a pretty obvious one for ARM
boards, for instance, and I'm not sure we'd gain much by having
i386 be different to the other architectures...

-- PMM
[Qemu-devel] Re: [PATCH uq/master -v2 2/2] KVM, MCE, unpoison memory address across reboot
On 2011-02-10 01:27, Huang Ying wrote:
> On Wed, 2011-02-09 at 16:00 +0800, Jan Kiszka wrote:
>> On 2011-02-09 04:00, Huang Ying wrote:
>>> In Linux kernel HWPoison processing implementation, the virtual
>>> address in processes mapping the error physical memory page is marked
>>> as HWPoison. So that, the further accessing to the virtual
>>> address will kill corresponding processes with SIGBUS.
>>>
>>> If the error physical memory page is used by a KVM guest, the SIGBUS
>>> will be sent to QEMU, and QEMU will simulate a MCE to report that
>>> memory error to the guest OS. If the guest OS can not recover from
>>> the error (for example, the page is accessed by kernel code), guest OS
>>> will reboot the system. But because the underlying host virtual
>>> address backing the guest physical memory is still poisoned, if the
>>> guest system accesses the corresponding guest physical memory even
>>> after rebooting, the SIGBUS will still be sent to QEMU and MCE will be
>>> simulated. That is, guest system can not recover via rebooting.
>>
>> Yeah, saw this already during my test...
>>
>>> In fact, across rebooting, the contents of guest physical memory page
>>> need not to be kept. We can allocate a new host physical page to
>>> back the corresponding guest physical address.
>>
>> I just wondering what would be architecturally suboptimal if we simply
>> remapped on SIGBUS directly. Would save us at least the bookkeeping.
>
> Because we can not change the content of memory silently during guest OS
> running, this may corrupts guest OS data structure and even ruins disk
> contents. But during rebooting, all guest OS state are discarded.

I was not talking about remapping more than just the pages that became
inaccessible, just like you do now. But I guess the problem is rather
that insane guests continuing to access those pages before reboot should
also still receive MCEs.

Jan
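For readers following the thread, the bookkeeping being discussed (remember the poisoned pages on SIGBUS, give them fresh backing at reset) could look roughly like the sketch below. It is not the code from the patch: poison_list_add() and poison_list_remap_all() are made-up names, the list stores host virtual addresses instead of ram_addr_t, and mapping a fresh anonymous page with MAP_FIXED is only one plausible way to replace the backing page.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping entry: one poisoned host page. */
typedef struct PoisonedPage {
    void *host_addr;
    struct PoisonedPage *next;
} PoisonedPage;

static PoisonedPage *poison_list;

/* Called when SIGBUS reports a page: remember it, ignoring duplicates.
   Returns 0 on success, -1 if the list entry cannot be allocated. */
static int poison_list_add(void *host_addr)
{
    PoisonedPage *p;

    for (p = poison_list; p; p = p->next) {
        if (p->host_addr == host_addr) {
            return 0;
        }
    }
    p = malloc(sizeof(*p));
    if (!p) {
        return -1;
    }
    p->host_addr = host_addr;
    p->next = poison_list;
    poison_list = p;
    return 0;
}

/* Called at guest reset: map a fresh zero page over every recorded
   address and forget the entry.  Returns the number of pages remapped,
   or -1 if mmap() fails (the failing entry stays on the list). */
static int poison_list_remap_all(void)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    int n = 0;

    while (poison_list) {
        PoisonedPage *p = poison_list;
        void *r = mmap(p->host_addr, page_size, PROT_READ | PROT_WRITE,
                       MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
        if (r == MAP_FAILED) {
            return -1;
        }
        poison_list = p->next;
        free(p);
        n++;
    }
    return n;
}
```

Nothing is remapped while the guest keeps running, so a guest that touches the page again before reboot would still get its MCE, which is the point Jan raises at the end of his mail.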
Re: [Qemu-devel] KVM call minutes for Feb 8
On 02/10/2011 09:16 AM, Peter Maydell wrote:
> On 10 February 2011 07:47, Anthony Liguori <anth...@codemonkey.ws> wrote:
>> So very concretely, I'm suggesting we do the following to target-i386:
>>
>> 2) get rid of the entire concept of machines.  Creating a i440fx is
>> essentially equivalent to creating a bare machine.
>
> Does that make any sense for anything other than target-i386?
> The concept of a machine model seems a pretty obvious one for ARM
> boards, for instance, and I'm not sure we'd gain much by having
> i386 be different to the other architectures...

Yes, it makes a lot of sense, I just don't know the component names as
well so bear with me :-)

There are two types of Versatile machines today, Versatile/AB and
Versatile/PB. They are both made with the same core, ARM926EJ-S, with
different expansions.

So you would model arm926ej-s as the chipset and then build up the
machines by modifying parameters of the chipset (like the board id)
and/or adding different components on top of it.

A good way to think about what I'm proposing is that machine-init really
should be a constructor for a device object.

Regards,

Anthony Liguori

> -- PMM
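One way to picture "machine-init as a constructor for a device object" is the sketch below. Everything in it is hypothetical: the Chipset struct, the function names, the board id values and the "expansion-bus" component are placeholders for illustration and do not come from the QEMU tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_COMPONENTS 8

/* Hypothetical device object for a chipset: a core plus tunable
   parameters and a list of components stacked on top of it. */
typedef struct Chipset {
    const char *core;
    unsigned board_id;
    const char *components[MAX_COMPONENTS];
    size_t nb_components;
} Chipset;

/* Base constructor shared by every machine built on this chipset. */
static void chipset_init(Chipset *c, const char *core, unsigned board_id)
{
    memset(c, 0, sizeof(*c));
    c->core = core;
    c->board_id = board_id;
}

/* Returns 0 on success, -1 if the component table is full. */
static int chipset_add_component(Chipset *c, const char *name)
{
    if (c->nb_components == MAX_COMPONENTS) {
        return -1;
    }
    c->components[c->nb_components++] = name;
    return 0;
}

/* "Machines" are then just constructors that parameterize the chipset.
   The board ids below are placeholders, not the real values. */
static void versatile_ab_init(Chipset *c)
{
    chipset_init(c, "arm926ej-s", 1);
}

static void versatile_pb_init(Chipset *c)
{
    chipset_init(c, "arm926ej-s", 2);
    chipset_add_component(c, "expansion-bus");   /* placeholder component */
}
```

Both constructors share the core and differ only in a parameter and in what is added on top, which is the structure described in the mail.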
[Qemu-devel] Re: [PATCH uq/master -v2 2/2] KVM, MCE, unpoison memory address across reboot
On 2011-02-10 01:27, Huang Ying wrote:
> @@ -1882,6 +1919,7 @@ int kvm_arch_on_sigbus_vcpu(CPUState *en
>                  hardware_memory_error();
>              }
>          }
> +        kvm_hwpoison_page_add(ram_addr);
>
>          if (code == BUS_MCEERR_AR) {
>              /* Fake an Intel architectural Data Load SRAR UCR */
> @@ -1926,6 +1964,7 @@ int kvm_arch_on_sigbus(int code, void *a
>                      "QEMU itself instead of guest system!: %p\n", addr);
>              return 0;
>          }
> +        kvm_hwpoison_page_add(ram_addr);
>          kvm_mce_inj_srao_memscrub2(first_cpu, paddr);
>      } else
>  #endif

Looks fine otherwise. Unless that simplification makes sense, I could
offer to include this into my MCE rework (there is some minor conflict).
If all goes well, that series should be posted during this week.

Please have a look at

git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream

and tell me if it works for you and your signed-off still applies.

Thanks,
Jan
Re: [Qemu-devel] [PATCH] Make tb_alloc static.
On Wed, Feb 09, 2011 at 07:52:52PM +0100, Aurelien Jarno wrote:
> What about moving tb_alloc() (with tb_free()) higher in the file? After
> all it make sense to have the function creating or destructing a tb
> before the function manipulating them.

Thanks. Like this ?

Tristan.

This function is only used within exec.c, so no need to make it public.

Signed-off-by: Tristan Gingold <ging...@adacore.com>
---
 exec-all.h |    1 -
 exec.c     |   52 ++--
 2 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index 81497c0..c062693 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -182,7 +182,6 @@ static inline unsigned int tb_phys_hash_func(tb_page_addr_t pc)
     return (pc >> 2) & (CODE_GEN_PHYS_HASH_SIZE - 1);
 }
 
-TranslationBlock *tb_alloc(target_ulong pc);
 void tb_free(TranslationBlock *tb);
 void tb_flush(CPUState *env);
 void tb_link_page(TranslationBlock *tb,
diff --git a/exec.c b/exec.c
index 477199b..9a7a752 100644
--- a/exec.c
+++ b/exec.c
@@ -649,6 +649,32 @@ void cpu_exec_init(CPUState *env)
 #endif
 }
 
+/* Allocate a new translation block. Flush the translation buffer if
+   too many translation blocks or too much generated code. */
+static TranslationBlock *tb_alloc(target_ulong pc)
+{
+    TranslationBlock *tb;
+
+    if (nb_tbs >= code_gen_max_blocks ||
+        (code_gen_ptr - code_gen_buffer) >= code_gen_buffer_max_size)
+        return NULL;
+    tb = &tbs[nb_tbs++];
+    tb->pc = pc;
+    tb->cflags = 0;
+    return tb;
+}
+
+void tb_free(TranslationBlock *tb)
+{
+    /* In practice this is mostly used for single use temporary TB
+       Ignore the hard cases and just back up if this TB happens to
+       be the last one generated.  */
+    if (nb_tbs > 0 && tb == &tbs[nb_tbs - 1]) {
+        code_gen_ptr = tb->tc_ptr;
+        nb_tbs--;
+    }
+}
+
 static inline void invalidate_page_bitmap(PageDesc *p)
 {
     if (p->code_bitmap) {
@@ -1227,32 +1253,6 @@ static inline void tb_alloc_page(TranslationBlock *tb,
 #endif /* TARGET_HAS_SMC */
 }
 
-/* Allocate a new translation block. Flush the translation buffer if
-   too many translation blocks or too much generated code. */
-TranslationBlock *tb_alloc(target_ulong pc)
-{
-    TranslationBlock *tb;
-
-    if (nb_tbs >= code_gen_max_blocks ||
-        (code_gen_ptr - code_gen_buffer) >= code_gen_buffer_max_size)
-        return NULL;
-    tb = &tbs[nb_tbs++];
-    tb->pc = pc;
-    tb->cflags = 0;
-    return tb;
-}
-
-void tb_free(TranslationBlock *tb)
-{
-    /* In practice this is mostly used for single use temporary TB
-       Ignore the hard cases and just back up if this TB happens to
-       be the last one generated.  */
-    if (nb_tbs > 0 && tb == &tbs[nb_tbs - 1]) {
-        code_gen_ptr = tb->tc_ptr;
-        nb_tbs--;
-    }
-}
-
 /* add a new TB and link it to the physical page tables. phys_page2 is
    (-1) to indicate that only one page contains the TB. */
 void tb_link_page(TranslationBlock *tb,
-- 
1.7.3.GIT
[Qemu-devel] [PATCH] Network functions patches for win32
This patch contains two fixes for network functions in a Windows
environment:

1. net/socket.c fix

The MSDN description of the WSAEALREADY error for the connect() function
includes the following: "To preserve backward compatibility, this error
is reported as WSAEINVAL to Winsock applications that link to either
Winsock.dll or Wsock32.dll". So a check for this error code is added to
allow network connections through sockets on Windows.

2. net/tap-win32.c fix

This fix allows connecting an internal VLAN to the external TAP
interface. As long as tap_win32_write() always returns 0, the TAP
network interface in QEMU is disabled.

Signed-off-by: Pavel Dovgalyuk <pavel.dovga...@gmail.com>
---
 net/socket.c    |    2 +-
 net/tap-win32.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index 3182b37..7337f4f 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -457,7 +457,7 @@ static int net_socket_connect_init(VLANState *vlan,
             } else if (err == EINPROGRESS) {
                 break;
 #ifdef _WIN32
-            } else if (err == WSAEALREADY) {
+            } else if (err == WSAEALREADY || err == WSAEINVAL) {
                 break;
 #endif
             } else {
diff --git a/net/tap-win32.c b/net/tap-win32.c
index 081904e..596132e 100644
--- a/net/tap-win32.c
+++ b/net/tap-win32.c
@@ -480,7 +480,7 @@ static int tap_win32_write(tap_win32_overlapped_t *overlapped,
         }
     }
 
-    return 0;
+    return write_size;
 }
 
 static DWORD WINAPI tap_win32_thread_entry(LPVOID param)
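The net/socket.c part boils down to treating one more error code as "connection still in progress". A stand-alone sketch of that classification follows; the enum, the function name and the on_windows flag are made up for illustration, and the two WSA constants are defined locally (with their winsock2.h values) only so that the sketch also compiles off Windows.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Winsock error codes, defined here only if the platform headers did
   not already provide them. */
#ifndef WSAEINVAL
#define WSAEINVAL   10022
#endif
#ifndef WSAEALREADY
#define WSAEALREADY 10037
#endif

/* Hypothetical result type for a non-blocking connect() attempt. */
typedef enum {
    CONNECT_RETRY,      /* call connect() again */
    CONNECT_PENDING,    /* connection is being established */
    CONNECT_FAILED      /* real error, give up */
} ConnectResult;

/* Classify the error left behind by a failed connect() call. */
static ConnectResult classify_connect_error(int err, bool on_windows)
{
    if (err == EINTR || err == EWOULDBLOCK) {
        return CONNECT_RETRY;
    }
    if (err == EINPROGRESS) {
        return CONNECT_PENDING;
    }
    /* Older Winsock builds report WSAEALREADY as WSAEINVAL, so both
       have to be accepted as "already connecting". */
    if (on_windows && (err == WSAEALREADY || err == WSAEINVAL)) {
        return CONNECT_PENDING;
    }
    return CONNECT_FAILED;
}
```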
[Qemu-devel] [PATCH 12/18] Insert event_tap_mmio() to cpu_physical_memory_rw() in exec.c.
Record mmio write event to replay it upon failover.

Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
---
 exec.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/exec.c b/exec.c
index e950df2..c81fd09 100644
--- a/exec.c
+++ b/exec.c
@@ -33,6 +33,7 @@
 #include "osdep.h"
 #include "kvm.h"
 #include "qemu-timer.h"
+#include "event-tap.h"
 #if defined(CONFIG_USER_ONLY)
 #include <qemu.h>
 #include <signal.h>
@@ -3632,6 +3633,9 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                 io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1);
                 if (p)
                     addr1 = (addr & ~TARGET_PAGE_MASK) + p->region_offset;
+
+                event_tap_mmio(addr, buf, len);
+
                 /* XXX: could force cpu_single_env to NULL to avoid
                    potential bugs */
                 if (l >= 4 && ((addr1 & 3) == 0)) {
-- 
1.7.1.2
[Qemu-devel] [PATCH 13/18] net: insert event-tap to qemu_send_packet() and qemu_sendv_packet_async().
event-tap function is called only when it is on.

Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
---
 net.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/net.c b/net.c
index 9ba5be2..1176124 100644
--- a/net.c
+++ b/net.c
@@ -36,6 +36,7 @@
 #include "qemu-common.h"
 #include "qemu_socket.h"
 #include "hw/qdev.h"
+#include "event-tap.h"
 
 static QTAILQ_HEAD(, VLANState) vlans;
 static QTAILQ_HEAD(, VLANClientState) non_vlan_clients;
@@ -559,6 +560,10 @@ ssize_t qemu_send_packet_async(VLANClientState *sender,
 
 void qemu_send_packet(VLANClientState *vc, const uint8_t *buf, int size)
 {
+    if (event_tap_is_on()) {
+        return event_tap_send_packet(vc, buf, size);
+    }
+
     qemu_send_packet_async(vc, buf, size, NULL);
 }
 
@@ -657,6 +662,10 @@ ssize_t qemu_sendv_packet_async(VLANClientState *sender,
 {
     NetQueue *queue;
 
+    if (event_tap_is_on()) {
+        return event_tap_sendv_packet_async(sender, iov, iovcnt, sent_cb);
+    }
+
     if (sender->link_down || (!sender->peer && !sender->vlan)) {
         return calc_iov_length(iov, iovcnt);
     }
-- 
1.7.1.2
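The hook pattern used by this patch (divert to the tap when it is on, otherwise fall through to the normal path) can be shown in isolation. The sketch below is self-contained and uses made-up names (tap_is_on(), tap_send(), real_send(), send_packet()) with byte counters standing in for the real network backends; it is not the event-tap API itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the real state: whether the tap is enabled, and how
   many bytes each path has seen. */
static bool tap_on;
static size_t tapped_bytes;
static size_t sent_bytes;

static bool tap_is_on(void)
{
    return tap_on;
}

/* Hypothetical tap path: would queue the packet for replay. */
static void tap_send(const unsigned char *buf, size_t size)
{
    (void)buf;
    tapped_bytes += size;
}

/* Hypothetical normal path: would hand the packet to the peer. */
static void real_send(const unsigned char *buf, size_t size)
{
    (void)buf;
    sent_bytes += size;
}

/* Entry point used by device emulation.  With the tap off this is a
   plain pass-through, so existing callers see no change. */
static void send_packet(const unsigned char *buf, size_t size)
{
    if (tap_is_on()) {
        tap_send(buf, size);
        return;
    }
    real_send(buf, size);
}
```

The early return is the whole mechanism: nothing else in the send path has to know whether the tap exists.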
[Qemu-devel] [PATCH 18/18] Introduce kemari: to enable FT migration mode (Kemari).
When kemari: is set in front of URI of migrate command, it will turn on
ft_mode to start FT migration mode (Kemari). On the receiver side, the
option looks like, -incoming kemari:protocol:address:port

Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
---
 hmp-commands.hx |    4 +++-
 migration.c     |   12 ++++++++++++
 qmp-commands.hx |    4 +++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 38e1eb7..ee14344 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -760,7 +760,9 @@ ETEXI
                       "\n\t\t\t -b for migration without shared storage with"
                       " full copy of disk\n\t\t\t -i for migration without "
                       "shared storage with incremental copy of disk "
-                      "(base image shared between src and destination)",
+                      "(base image shared between src and destination)"
+                      "\n\t\t\t put \"kemari:\" in front of URI to enable "
+                      "Fault Tolerance mode (Kemari protocol)",
         .user_print = monitor_user_noop,
         .mhandler.cmd_new = do_migrate,
     },
diff --git a/migration.c b/migration.c
index 7837c55..a3f7722 100644
--- a/migration.c
+++ b/migration.c
@@ -48,6 +48,12 @@ int qemu_start_incoming_migration(const char *uri)
     const char *p;
     int ret;
 
+    /* check ft_mode (Kemari protocol) */
+    if (strstart(uri, "kemari:", &p)) {
+        ft_mode = FT_INIT;
+        uri = p;
+    }
+
     if (strstart(uri, "tcp:", &p))
         ret = tcp_start_incoming_migration(p);
 #if !defined(WIN32)
@@ -99,6 +105,12 @@ int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data)
         return -1;
     }
 
+    /* check ft_mode (Kemari protocol) */
+    if (strstart(uri, "kemari:", &p)) {
+        ft_mode = FT_INIT;
+        uri = p;
+    }
+
     if (strstart(uri, "tcp:", &p)) {
         s = tcp_start_outgoing_migration(mon, p, max_throttle, detach,
                                          blk, inc);
diff --git a/qmp-commands.hx b/qmp-commands.hx
index df40a3d..68ca48a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -437,7 +437,9 @@ EQMP
                       "\n\t\t\t -b for migration without shared storage with"
                       " full copy of disk\n\t\t\t -i for migration without "
                       "shared storage with incremental copy of disk "
-                      "(base image shared between src and destination)",
+                      "(base image shared between src and destination)"
+                      "\n\t\t\t put \"kemari:\" in front of URI to enable "
+                      "Fault Tolerance mode (Kemari protocol)",
         .user_print = monitor_user_noop,
         .mhandler.cmd_new = do_migrate,
     },
-- 
1.7.1.2
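The URI handling in this patch is a prefix strip: if the URI starts with "kemari:", FT mode is switched on and the rest is parsed as an ordinary migration URI. A stand-alone sketch of that step is below. starts_with() is a local helper written for this example and modelled on how strstart() is used in the patch; MigrateTarget and parse_migrate_uri() are made-up names, and a bool stands in for the ft_mode state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local helper: if str begins with prefix, return true and (when rest
   is non-NULL) point *rest at the first character after the prefix. */
static bool starts_with(const char *str, const char *prefix,
                        const char **rest)
{
    size_t n = strlen(prefix);

    if (strncmp(str, prefix, n) != 0) {
        return false;
    }
    if (rest) {
        *rest = str + n;
    }
    return true;
}

/* Hypothetical result of parsing: FT mode flag plus remaining URI. */
typedef struct MigrateTarget {
    bool ft_mode;
    const char *uri;
} MigrateTarget;

/* Strip an optional "kemari:" prefix and remember that it was there. */
static MigrateTarget parse_migrate_uri(const char *uri)
{
    MigrateTarget t = { false, uri };
    const char *p;

    if (starts_with(uri, "kemari:", &p)) {
        t.ft_mode = true;
        t.uri = p;
    }
    return t;
}
```

With this shape, "kemari:tcp:host:port" and "tcp:host:port" reach the same transport code, and only the flag differs.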
[Qemu-devel] [PATCH 04/18] qemu-char: export socket_set_nodelay().
Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
---
 qemu-char.c   |    2 +-
 qemu_socket.h |    1 +
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index ee4f4ca..7286aeb 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -2111,7 +2111,7 @@ static void tcp_chr_telnet_init(int fd)
     send(fd, (char *)buf, 3, 0);
 }
 
-static void socket_set_nodelay(int fd)
+void socket_set_nodelay(int fd)
 {
     int val = 1;
     setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, (char *)&val, sizeof(val));
diff --git a/qemu_socket.h b/qemu_socket.h
index 897a8ae..b7f8465 100644
--- a/qemu_socket.h
+++ b/qemu_socket.h
@@ -36,6 +36,7 @@ int inet_aton(const char *cp, struct in_addr *ia);
 int qemu_socket(int domain, int type, int protocol);
 int qemu_accept(int s, struct sockaddr *addr, socklen_t *addrlen);
 void socket_set_nonblock(int fd);
+void socket_set_nodelay(int fd);
 int send_all(int fd, const void *buf, int len1);
 
 /* New, ipv6-ready socket helper functions, see qemu-sockets.c */
-- 
1.7.1.2
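For reference, the helper being exported sets TCP_NODELAY on a socket so that small writes are sent without waiting to be coalesced. A POSIX-only sketch with made-up names (set_nodelay(), get_nodelay()) is below; unlike the void helper in the patch it returns the setsockopt() result so a caller can check it.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Disable Nagle's algorithm on a TCP socket.
   Returns 0 on success, -1 on error (errno set by setsockopt). */
static int set_nodelay(int fd)
{
    int val = 1;

    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, sizeof(val));
}

/* Read the option back: 1 if set, 0 if clear, -1 on error. */
static int get_nodelay(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0) {
        return -1;
    }
    return val != 0;
}
```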
[Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.
Currently FdMigrationState doesn't support read(), and this patch
introduces it to get response from the other side.

Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
---
 migration-tcp.c |   15 +++++++++++++++
 migration.c     |   13 +++++++++++++
 migration.h     |    3 +++
 3 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/migration-tcp.c b/migration-tcp.c
index b55f419..55777c8 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -39,6 +39,20 @@ static int socket_write(FdMigrationState *s, const void * buf, size_t size)
     return send(s->fd, buf, size, 0);
 }
 
+static int socket_read(FdMigrationState *s, const void * buf, size_t size)
+{
+    ssize_t len;
+
+    do {
+        len = recv(s->fd, (void *)buf, size, 0);
+    } while (len == -1 && socket_error() == EINTR);
+    if (len == -1) {
+        len = -socket_error();
+    }
+
+    return len;
+}
+
 static int tcp_close(FdMigrationState *s)
 {
     DPRINTF("tcp_close\n");
@@ -94,6 +108,7 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon,
 
     s->get_error = socket_errno;
     s->write = socket_write;
+    s->read = socket_read;
     s->close = tcp_close;
     s->mig_state.cancel = migrate_fd_cancel;
     s->mig_state.get_status = migrate_fd_get_status;
diff --git a/migration.c b/migration.c
index 3612572..f0df5fc 100644
--- a/migration.c
+++ b/migration.c
@@ -340,6 +340,19 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)
     return ret;
 }
 
+int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size)
+{
+    FdMigrationState *s = opaque;
+    int ret;
+
+    ret = s->read(s, data, size);
+    if (ret == -1) {
+        ret = -(s->get_error(s));
+    }
+
+    return ret;
+}
+
 void migrate_fd_connect(FdMigrationState *s)
 {
     int ret;
diff --git a/migration.h b/migration.h
index 2170792..88a6987 100644
--- a/migration.h
+++ b/migration.h
@@ -48,6 +48,7 @@ struct FdMigrationState
     int (*get_error)(struct FdMigrationState*);
     int (*close)(struct FdMigrationState*);
     int (*write)(struct FdMigrationState*, const void *, size_t);
+    int (*read)(struct FdMigrationState *, const void *, size_t);
     void *opaque;
 };
 
@@ -116,6 +117,8 @@ void migrate_fd_put_notify(void *opaque);
 
 ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size);
 
+int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size);
+
 void migrate_fd_connect(FdMigrationState *s);
 
 void migrate_fd_put_ready(void *opaque);
-- 
1.7.1.2
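The core of socket_read() in this patch is the usual "retry on EINTR, report other failures as a negative errno" loop. A POSIX-only sketch of that idiom follows. read_retry() is a made-up name, plain errno replaces the socket_error() wrapper, and the buffer is taken as non-const here because recv() writes into it.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Receive up to size bytes from fd.  A call interrupted by a signal is
   restarted; any other failure is returned as -errno.  On success the
   return value is the number of bytes received (0 means the peer closed
   the connection). */
static ssize_t read_retry(int fd, void *buf, size_t size)
{
    ssize_t len;

    do {
        len = recv(fd, buf, size, 0);
    } while (len == -1 && errno == EINTR);

    if (len == -1) {
        len = -errno;
    }
    return len;
}
```

Because the error is already folded into the return value, a caller only needs to test for a negative result; checking for exactly -1 afterwards would miss most error codes.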
[Qemu-devel] [PATCH 08/18] savevm: introduce util functions to control ft_trans_file from savevm layer.
To utilize ft_trans_file function, savevm needs interfaces to be
exported.

Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
---
 hw/hw.h  |    5 ++
 savevm.c |  149 ++
 2 files changed, 154 insertions(+), 0 deletions(-)

diff --git a/hw/hw.h b/hw/hw.h
index a168a37..a9eff5a 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -51,6 +51,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
 QEMUFile *qemu_fopen(const char *filename, const char *mode);
 QEMUFile *qemu_fdopen(int fd, const char *mode);
 QEMUFile *qemu_fopen_socket(int fd);
+QEMUFile *qemu_fopen_ft_trans(int s_fd, int c_fd);
 QEMUFile *qemu_popen(FILE *popen_file, const char *mode);
 QEMUFile *qemu_popen_cmd(const char *command, const char *mode);
 int qemu_stdio_fd(QEMUFile *f);
@@ -60,6 +61,9 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
 void qemu_put_byte(QEMUFile *f, int v);
 void *qemu_realloc_buffer(QEMUFile *f, int size);
 void qemu_clear_buffer(QEMUFile *f);
+int qemu_ft_trans_begin(QEMUFile *f);
+int qemu_ft_trans_commit(QEMUFile *f);
+int qemu_ft_trans_cancel(QEMUFile *f);
 
 static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
 {
@@ -94,6 +98,7 @@ void qemu_file_set_error(QEMUFile *f);
  * halted due to rate limiting or EAGAIN errors occur as it can be used to
  * resume output. */
 void qemu_file_put_notify(QEMUFile *f);
+void qemu_file_get_notify(void *opaque);
 
 static inline void qemu_put_be64s(QEMUFile *f, const uint64_t *pv)
 {
diff --git a/savevm.c b/savevm.c
index 58e48e3..e44eccd 100644
--- a/savevm.c
+++ b/savevm.c
@@ -82,6 +82,7 @@
 #include "migration.h"
 #include "qemu_socket.h"
 #include "qemu-queue.h"
+#include "ft_trans_file.h"
 
 #define SELF_ANNOUNCE_ROUNDS 5
 
@@ -189,6 +190,13 @@ typedef struct QEMUFileSocket
     QEMUFile *file;
 } QEMUFileSocket;
 
+typedef struct QEMUFileSocketTrans
+{
+    int fd;
+    QEMUFileSocket *s;
+    VMChangeStateEntry *e;
+} QEMUFileSocketTrans;
+
 static int socket_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
 {
     QEMUFileSocket *s = opaque;
@@ -204,6 +212,22 @@ static int socket_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
     return len;
 }
 
+static ssize_t socket_put_buffer(void *opaque, const void *buf, size_t size)
+{
+    QEMUFileSocket *s = opaque;
+    ssize_t len;
+
+    do {
+        len = send(s->fd, (void *)buf, size, 0);
+    } while (len == -1 && socket_error() == EINTR);
+
+    if (len == -1) {
+        len = -socket_error();
+    }
+
+    return len;
+}
+
 static int socket_close(void *opaque)
 {
     QEMUFileSocket *s = opaque;
@@ -211,6 +235,70 @@ static int socket_close(void *opaque)
     return 0;
 }
 
+static int socket_trans_get_buffer(void *opaque, uint8_t *buf, int64_t pos, size_t size)
+{
+    QEMUFileSocketTrans *t = opaque;
+    QEMUFileSocket *s = t->s;
+    ssize_t len;
+
+    len = socket_get_buffer(s, buf, pos, size);
+
+    return len;
+}
+
+static ssize_t socket_trans_put_buffer(void *opaque, const void *buf, size_t size)
+{
+    QEMUFileSocketTrans *t = opaque;
+
+    return socket_put_buffer(t->s, buf, size);
+}
+
+
+static int socket_trans_get_ready(void *opaque)
+{
+    QEMUFileSocketTrans *t = opaque;
+    QEMUFileSocket *s = t->s;
+    QEMUFile *f = s->file;
+    int ret = 0;
+
+    ret = qemu_loadvm_state(f, 1);
+    if (ret < 0) {
+        fprintf(stderr,
+                "socket_trans_get_ready: error while loading vmstate\n");
+    }
+
+    return ret;
+}
+
+static int socket_trans_close(void *opaque)
+{
+    QEMUFileSocketTrans *t = opaque;
+    QEMUFileSocket *s = t->s;
+
+    qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
+    qemu_set_fd_handler2(t->fd, NULL, NULL, NULL, NULL);
+    qemu_del_vm_change_state_handler(t->e);
+    close(s->fd);
+    close(t->fd);
+    qemu_free(s);
+    qemu_free(t);
+
+    return 0;
+}
+
+static void socket_trans_resume(void *opaque, int running, int reason)
+{
+    QEMUFileSocketTrans *t = opaque;
+    QEMUFileSocket *s = t->s;
+
+    if (!running) {
+        return;
+    }
+
+    qemu_announce_self();
+    qemu_fclose(s->file);
+}
+
 static int stdio_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, int size)
 {
     QEMUFileStdio *s = opaque;
@@ -333,6 +421,26 @@ QEMUFile *qemu_fopen_socket(int fd)
     return s->file;
 }
 
+QEMUFile *qemu_fopen_ft_trans(int s_fd, int c_fd)
+{
+    QEMUFileSocketTrans *t = qemu_mallocz(sizeof(QEMUFileSocketTrans));
+    QEMUFileSocket *s = qemu_mallocz(sizeof(QEMUFileSocket));
+
+    t->s = s;
+    t->fd = s_fd;
+    t->e = qemu_add_vm_change_state_handler(socket_trans_resume, t);
+
+    s->fd = c_fd;
+    s->file = qemu_fopen_ops_ft_trans(t, socket_trans_put_buffer,
+                                      socket_trans_get_buffer, NULL,
+                                      socket_trans_get_ready,
+                                      migrate_fd_wait_for_unfreeze,
+                                      socket_trans_close, 0);
[Qemu-devel] [PATCH 14/18] block: insert event-tap to bdrv_aio_writev(), bdrv_aio_flush() and bdrv_flush().
event-tap function is called only when it is on, and requests were sent
from device emulators.

Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
Acked-by: Kevin Wolf <kw...@redhat.com>
---
 block.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index b476479..8ddce13 100644
--- a/block.c
+++ b/block.c
@@ -28,6 +28,7 @@
 #include "block_int.h"
 #include "module.h"
 #include "qemu-objects.h"
+#include "event-tap.h"
 
 #ifdef CONFIG_BSD
 #include <sys/types.h>
@@ -1482,6 +1483,10 @@ int bdrv_flush(BlockDriverState *bs)
     }
 
     if (bs->drv && bs->drv->bdrv_flush) {
+        if (*bs->device_name && event_tap_is_on()) {
+            event_tap_bdrv_flush();
+        }
+
         return bs->drv->bdrv_flush(bs);
     }
 
@@ -2117,6 +2122,11 @@ BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
     if (bdrv_check_request(bs, sector_num, nb_sectors))
         return NULL;
 
+    if (*bs->device_name && event_tap_is_on()) {
+        return event_tap_bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
+                                         cb, opaque);
+    }
+
     if (bs->dirty_bitmap) {
         blk_cb_data = blk_dirty_cb_alloc(bs, sector_num, nb_sectors, cb,
                                          opaque);
@@ -2380,6 +2390,11 @@ BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
 
     if (!drv)
         return NULL;
+
+    if (*bs->device_name && event_tap_is_on()) {
+        return event_tap_bdrv_aio_flush(bs, cb, opaque);
+    }
+
     return drv->bdrv_aio_flush(bs, cb, opaque);
 }
 
-- 
1.7.1.2
[Qemu-devel] [PATCH 00/18] Kemari for KVM v0.2.10
Hi,

This patch series is a revised version of Kemari for KVM, which
addresses the comments on the previous post. The current code is based
on qemu.git f26e5a54f0554798a2e6f7a074b809b13635d007.

The changes from v0.2.9 -> v0.2.10 are:

- change migrate format to kemari:protocol:host:port (Paolo)

The changes from v0.2.8 -> v0.2.9 are:

- abstract common code between qemu_savevm_{state,trans}_* (Paolo)
- change incoming format to kemari:protocol:host:port (Paolo)

The changes from v0.2.7 -> v0.2.8 are:

- fixed calling wrong cb in event-tap
- add missing qemu_aio_release in event-tap

The changes from v0.2.6 -> v0.2.7 are:

- add AIOCB, AIOPool and cancel functions (Kevin)
- insert event-tap for bdrv_flush (Kevin)
- add error handling when calling bdrv functions (Kevin)
- fix usage of qemu_aio_flush and bdrv_flush (Kevin)
- use bs in AIOCB on the primary (Kevin)
- reorder event-tap functions to gather with block/net (Kevin)
- fix checking bs->device_name (Kevin)

The changes from v0.2.5 -> v0.2.6 are:

- use qemu_{put,get}_be32() to save/load niov in event-tap

The changes from v0.2.4 -> v0.2.5 are:

- fixed braces and trailing spaces by using Blue's checkpatch.pl (Blue)
- event-tap: don't try to send blk_req if it's a bdrv_aio_flush event

The changes from v0.2.3 -> v0.2.4 are:

- call vm_start() before event_tap_flush_one() to avoid failure in
  virtio-net assertion
- add vm_change_state_handler to turn off ft_mode
- use qemu_iovec functions in event-tap
- remove duplicated code in migration
- remove unnecessary new line for error_report in ft_trans_file

The changes from v0.2.2 -> v0.2.3 are:

- queue async net requests without copying (MST)
  -- if not async, contents of the packets are sent to the secondary
- better description for option -k (MST)
- fix memory transfer failure
- fix ft transaction initiation failure

The changes from v0.2.1 -> v0.2.2 are:

- decrement last_avail_idx with inuse before saving (MST)
- remove qemu_aio_flush() and bdrv_flush_all() in migrate_ft_trans_commit()

The changes from v0.2 -> v0.2.1 are:

- Move event-tap to net/block layer and use stubs (Blue, Paul, MST, Kevin)
- Tap bdrv_aio_flush (Marcelo)
- Remove multiwrite interface in event-tap (Stefan)
- Fix event-tap to use pio/mmio to replay both net/block (Stefan)
- Improve error handling in event-tap (Stefan)
- Fix leak in event-tap (Stefan)
- Revise virtio last_avail_idx manipulation (MST)
- Clean up migration.c hook (Marcelo)
- Make deleting change state handler robust (Isaku, Anthony)

The changes from v0.1.1 -> v0.2 are:

- Introduce a queue in event-tap to make VM sync live.
- Change transaction receiver to a state machine for async receiving.
- Replace net/block layer functions with event-tap proxy functions.
- Remove dirty bitmap optimization for now.
- convert DPRINTF() in ft_trans_file to trace functions.
- convert fprintf() in ft_trans_file to error_report().
- improved error handling in ft_trans_file.
- add a tmp pointer to qemu_del_vm_change_state_handler.

The changes from v0.1 -> v0.1.1 are:

- events are tapped in net/block layer instead of device emulation layer.
- Introduce a new option for -incoming to accept FT transaction.
- Removed writev() support to QEMUFile and FdMigrationState for now.
  I would post this work in a different series.
- Modified virtio-blk save/load handler to send inuse variable to
  correctly replay.
- Removed configure --enable-ft-mode.
- Removed unnecessary check for qemu_realloc().

The first 6 patches modify several functions of qemu to prepare for
introducing Kemari specific components.

The next 6 patches are the components of Kemari. They introduce
event-tap and the FT transaction protocol file based on buffered file.
The design document of FT transaction protocol can be found at,
http://wiki.qemu.org/images/b/b1/Kemari_sender_receiver_0.5a.pdf

Then the following 2 patches modify net/block layer functions with
event-tap functions. Please note that if Kemari is off, event-tap will
just pass through, and there is almost no intrusion into existing
functions, including normal live migration.

Finally, the migration layer is modified to support Kemari in the last
4 patches. Again, there should be no effect if a user doesn't specify
Kemari specific options. The transaction is now async on both sender
and receiver side. The sender side respects the max_downtime to decide
when to switch from async to sync mode.

The repository contains all patches I'm sending with this message. For
those who want to try, please pull the following repository. It also
includes the dirty bitmap optimization, which isn't ready for posting
yet. To remove the dirty bitmap optimization, please look at HEAD~4 of
the tree.

git://kemari.git.sourceforge.net/gitroot/kemari/kemari next

Thanks,

Yoshi

Yoshiaki Tamura (18):
  Make QEMUFile buf expandable, and introduce qemu_realloc_buffer() and
    qemu_clear_buffer().
  Introduce read() to FdMigrationState.
  Introduce skip_header parameter to qemu_loadvm_state().
[Qemu-devel] [PATCH 03/18] Introduce skip_header parameter to qemu_loadvm_state().
Introduce skip_header parameter to qemu_loadvm_state() so that it can be
called iteratively without reading the header.

Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
---
 migration.c |    2 +-
 savevm.c    |   24 +---
 sysemu.h    |    2 +-
 3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/migration.c b/migration.c
index f0df5fc..dd3bf94 100644
--- a/migration.c
+++ b/migration.c
@@ -63,7 +63,7 @@ int qemu_start_incoming_migration(const char *uri)
 
 void process_incoming_migration(QEMUFile *f)
 {
-    if (qemu_loadvm_state(f) < 0) {
+    if (qemu_loadvm_state(f, 0) < 0) {
         fprintf(stderr, "load of migration failed\n");
         exit(0);
     }
diff --git a/savevm.c b/savevm.c
index 6c4c72b..58e48e3 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1716,7 +1716,7 @@ typedef struct LoadStateEntry {
     int version_id;
 } LoadStateEntry;
 
-int qemu_loadvm_state(QEMUFile *f)
+int qemu_loadvm_state(QEMUFile *f, int skip_header)
 {
     QLIST_HEAD(, LoadStateEntry) loadvm_handlers =
         QLIST_HEAD_INITIALIZER(loadvm_handlers);
@@ -1729,17 +1729,19 @@ int qemu_loadvm_state(QEMUFile *f)
         return -EINVAL;
     }
 
-    v = qemu_get_be32(f);
-    if (v != QEMU_VM_FILE_MAGIC)
-        return -EINVAL;
+    if (!skip_header) {
+        v = qemu_get_be32(f);
+        if (v != QEMU_VM_FILE_MAGIC)
+            return -EINVAL;
 
-    v = qemu_get_be32(f);
-    if (v == QEMU_VM_FILE_VERSION_COMPAT) {
-        fprintf(stderr, "SaveVM v2 format is obsolete and don't work anymore\n");
-        return -ENOTSUP;
+        v = qemu_get_be32(f);
+        if (v == QEMU_VM_FILE_VERSION_COMPAT) {
+            fprintf(stderr, "SaveVM v2 format is obsolete and don't work anymore\n");
+            return -ENOTSUP;
+        }
+        if (v != QEMU_VM_FILE_VERSION)
+            return -ENOTSUP;
     }
-    if (v != QEMU_VM_FILE_VERSION)
-        return -ENOTSUP;
 
     while ((section_type = qemu_get_byte(f)) != QEMU_VM_EOF) {
         uint32_t instance_id, version_id, section_id;
@@ -2062,7 +2064,7 @@ int load_vmstate(const char *name)
         return -EINVAL;
     }
 
-    ret = qemu_loadvm_state(f);
+    ret = qemu_loadvm_state(f, 0);
 
     qemu_fclose(f);
     if (ret < 0) {
diff --git a/sysemu.h b/sysemu.h
index 23ae17e..c86b4e8 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -81,7 +81,7 @@ int qemu_savevm_state_begin(Monitor *mon, QEMUFile *f, int blk_enable,
 int qemu_savevm_state_iterate(Monitor *mon, QEMUFile *f);
 int qemu_savevm_state_complete(Monitor *mon, QEMUFile *f);
 void qemu_savevm_state_cancel(Monitor *mon, QEMUFile *f);
-int qemu_loadvm_state(QEMUFile *f);
+int qemu_loadvm_state(QEMUFile *f, int skip_header);
 
 /* SLIRP */
 void do_info_slirp(Monitor *mon);
-- 
1.7.1.2
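The effect of skip_header can be illustrated on a toy in-memory stream: the first call validates a magic/version header, later calls on continuation data skip that check. The sketch below is self-contained and simplified. Stream, get_be32() and load_state() are made-up names, SKETCH_MAGIC and SKETCH_VERSION are placeholder constants, and the "payload" is just the number of bytes left after the optional header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder header constants for this sketch only. */
#define SKETCH_MAGIC   0x5145564dU
#define SKETCH_VERSION 3U

/* Minimal in-memory stand-in for a QEMUFile. */
typedef struct Stream {
    const uint8_t *data;
    size_t len;
    size_t pos;
} Stream;

/* Read one big-endian 32-bit value; returns 0, or -EINVAL on short input. */
static int get_be32(Stream *s, uint32_t *out)
{
    const uint8_t *p;

    if (s->len - s->pos < 4) {
        return -EINVAL;
    }
    p = s->data + s->pos;
    *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    s->pos += 4;
    return 0;
}

/* Validate the header unless skip_header is set, then report how many
   payload bytes follow.  Returns a negative errno value on a bad header. */
static int load_state(Stream *s, int skip_header)
{
    uint32_t v;

    if (!skip_header) {
        if (get_be32(s, &v) < 0 || v != SKETCH_MAGIC) {
            return -EINVAL;
        }
        if (get_be32(s, &v) < 0) {
            return -EINVAL;
        }
        if (v != SKETCH_VERSION) {
            return -ENOTSUP;
        }
    }
    return (int)(s->len - s->pos);
}
```

A receiver that gets one header followed by many transactions would call load_state(s, 0) once and load_state(s, 1) for every later transaction, which is how the flag is meant to be used when the function is called iteratively.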
[Qemu-devel] [PATCH 15/18] savevm: introduce qemu_savevm_trans_{begin, commit}.
Introduce qemu_savevm_trans_{begin,commit} to send the memory and device info together, while avoiding cancelling memory state tracking. This patch also abstracts common code between qemu_savevm_state_{begin,iterate,commit}. Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp --- savevm.c | 157 +++--- sysemu.h |2 + 2 files changed, 101 insertions(+), 58 deletions(-) diff --git a/savevm.c b/savevm.c index e44eccd..1c2a7fb 100644 --- a/savevm.c +++ b/savevm.c @@ -1601,29 +1601,68 @@ bool qemu_savevm_state_blocked(Monitor *mon) return false; } -int qemu_savevm_state_begin(Monitor *mon, QEMUFile *f, int blk_enable, -int shared) +/* + * section: header to write + * inc: if true, forces to pass SECTION_PART instead of SECTION_START + * pause: if true, breaks the loop when live handler returned 0 + */ +static int qemu_savevm_state_live(Monitor *mon, QEMUFile *f, int section, + bool inc, bool pause) { SaveStateEntry *se; +int skip = 0, ret; QTAILQ_FOREACH(se, savevm_handlers, entry) { -if(se-set_params == NULL) { +int len, stage; + +if (se-save_live_state == NULL) { continue; - } - se-set_params(blk_enable, shared, se-opaque); +} + +/* Section type */ +qemu_put_byte(f, section); +qemu_put_be32(f, se-section_id); + +if (section == QEMU_VM_SECTION_START) { +/* ID string */ +len = strlen(se-idstr); +qemu_put_byte(f, len); +qemu_put_buffer(f, (uint8_t *)se-idstr, len); + +qemu_put_be32(f, se-instance_id); +qemu_put_be32(f, se-version_id); + +stage = inc ? 
QEMU_VM_SECTION_PART : QEMU_VM_SECTION_START; +} else { +assert(inc); +stage = section; +} + +ret = se-save_live_state(mon, f, stage, se-opaque); +if (!ret) { +skip++; +if (pause) { +break; +} +} } - -qemu_put_be32(f, QEMU_VM_FILE_MAGIC); -qemu_put_be32(f, QEMU_VM_FILE_VERSION); + +return skip; +} + +static void qemu_savevm_state_full(QEMUFile *f) +{ +SaveStateEntry *se; QTAILQ_FOREACH(se, savevm_handlers, entry) { int len; -if (se-save_live_state == NULL) +if (se-save_state == NULL se-vmsd == NULL) { continue; +} /* Section type */ -qemu_put_byte(f, QEMU_VM_SECTION_START); +qemu_put_byte(f, QEMU_VM_SECTION_FULL); qemu_put_be32(f, se-section_id); /* ID string */ @@ -1634,9 +1673,29 @@ int qemu_savevm_state_begin(Monitor *mon, QEMUFile *f, int blk_enable, qemu_put_be32(f, se-instance_id); qemu_put_be32(f, se-version_id); -se-save_live_state(mon, f, QEMU_VM_SECTION_START, se-opaque); +vmstate_save(f, se); +} + +qemu_put_byte(f, QEMU_VM_EOF); +} + +int qemu_savevm_state_begin(Monitor *mon, QEMUFile *f, int blk_enable, +int shared) +{ +SaveStateEntry *se; + +QTAILQ_FOREACH(se, savevm_handlers, entry) { +if (se-set_params == NULL) { +continue; +} +se-set_params(blk_enable, shared, se-opaque); } +qemu_put_be32(f, QEMU_VM_FILE_MAGIC); +qemu_put_be32(f, QEMU_VM_FILE_VERSION); + +qemu_savevm_state_live(mon, f, QEMU_VM_SECTION_START, 0, 0); + if (qemu_file_has_error(f)) { qemu_savevm_state_cancel(mon, f); return -EIO; @@ -1647,29 +1706,16 @@ int qemu_savevm_state_begin(Monitor *mon, QEMUFile *f, int blk_enable, int qemu_savevm_state_iterate(Monitor *mon, QEMUFile *f) { -SaveStateEntry *se; int ret = 1; -QTAILQ_FOREACH(se, savevm_handlers, entry) { -if (se-save_live_state == NULL) -continue; - -/* Section type */ -qemu_put_byte(f, QEMU_VM_SECTION_PART); -qemu_put_be32(f, se-section_id); - -ret = se-save_live_state(mon, f, QEMU_VM_SECTION_PART, se-opaque); -if (!ret) { -/* Do not proceed to the next vmstate before this one reported - completion of the current stage. 
This serializes the migration - and reduces the probability that a faster changing state is - synchronized over and over again. */ -break; -} -} - -if (ret) +/* Do not proceed to the next vmstate before this one reported + completion of the current stage. This serializes the migration + and reduces the probability that a faster changing state is + synchronized over and over again. */ +ret = qemu_savevm_state_live(mon, f, QEMU_VM_SECTION_PART, 1, 1); +if (!ret) { return 1; +} if (qemu_file_has_error(f)) { qemu_savevm_state_cancel(mon, f); @@ -1681,46 +1727,41 @@ int
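The control flow that qemu_savevm_state_live() factors out of the begin/iterate paths can be sketched in isolation. This is a simplified stand-in and not the QEMU code: Handler, run_live_handlers() and countdown_handler() are hypothetical names, and the section headers written to the QEMUFile are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SaveStateEntry: a live-save callback that
 * returns 0 while it still has data to send and non-zero once the
 * current stage is complete. */
typedef struct Handler {
    int (*save_live_state)(void *opaque);
    void *opaque;
} Handler;

/* Walk the handlers in order and count how many reported "not done".
 * With pause set, stop at the first such handler so that later
 * handlers are not run before the earlier one has converged. */
static int run_live_handlers(const Handler *h, size_t n, bool pause)
{
    int skip = 0;

    for (size_t i = 0; i < n; i++) {
        if (h[i].save_live_state == NULL) {
            continue;               /* handler has no live phase */
        }
        if (!h[i].save_live_state(h[i].opaque)) {
            skip++;
            if (pause) {
                break;
            }
        }
    }
    return skip;
}

/* Toy handler (assumption): *opaque holds the number of rounds that
 * are still needed before the handler reports completion. */
static int countdown_handler(void *opaque)
{
    int *remaining = opaque;

    if (*remaining > 0) {
        (*remaining)--;
    }
    return *remaining == 0;
}
```

A return value of 0 means every live handler reported completion, which is the case the patch maps to "return 1" in qemu_savevm_state_iterate().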
[Qemu-devel] [PATCH 16/18] migration: introduce migrate_ft_trans_{put, get}_ready(), and modify migrate_fd_put_ready() when ft_mode is on.
Introduce migrate_ft_trans_put_ready() which kicks the FT transaction cycle. When ft_mode is on, migrate_fd_put_ready() would open ft_trans_file and turn on event_tap. To end or cancel FT transaction, ft_mode and event_tap is turned off. migrate_ft_trans_get_ready() is called to receive ack from the receiver. Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp --- migration.c | 261 ++- 1 files changed, 260 insertions(+), 1 deletions(-) diff --git a/migration.c b/migration.c index c5e0146..7837c55 100644 --- a/migration.c +++ b/migration.c @@ -21,6 +21,7 @@ #include qemu_socket.h #include block-migration.h #include qemu-objects.h +#include event-tap.h //#define DEBUG_MIGRATION @@ -283,6 +284,14 @@ void migrate_fd_error(FdMigrationState *s) migrate_fd_cleanup(s); } +static void migrate_ft_trans_error(FdMigrationState *s) +{ +ft_mode = FT_ERROR; +qemu_savevm_state_cancel(s-mon, s-file); +migrate_fd_error(s); +event_tap_unregister(); +} + int migrate_fd_cleanup(FdMigrationState *s) { int ret = 0; @@ -318,6 +327,17 @@ void migrate_fd_put_notify(void *opaque) qemu_file_put_notify(s-file); } +static void migrate_fd_get_notify(void *opaque) +{ +FdMigrationState *s = opaque; + +qemu_set_fd_handler2(s-fd, NULL, NULL, NULL, NULL); +qemu_file_get_notify(s-file); +if (qemu_file_has_error(s-file)) { +migrate_ft_trans_error(s); +} +} + ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size) { FdMigrationState *s = opaque; @@ -353,6 +373,10 @@ int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size) ret = -(s-get_error(s)); } +if (ret == -EAGAIN) { +qemu_set_fd_handler2(s-fd, NULL, migrate_fd_get_notify, NULL, s); +} + return ret; } @@ -379,6 +403,230 @@ void migrate_fd_connect(FdMigrationState *s) migrate_fd_put_ready(s); } +static int migrate_ft_trans_commit(void *opaque) +{ +FdMigrationState *s = opaque; +int ret = -1; + +if (ft_mode != FT_TRANSACTION_COMMIT ft_mode != FT_TRANSACTION_ATOMIC) { +fprintf(stderr, 
+migrate_ft_trans_commit: invalid ft_mode %d\n, ft_mode); +goto out; +} + +do { +if (ft_mode == FT_TRANSACTION_ATOMIC) { +if (qemu_ft_trans_begin(s-file) 0) { +fprintf(stderr, qemu_ft_trans_begin failed\n); +goto out; +} + +ret = qemu_savevm_trans_begin(s-mon, s-file, 0); +if (ret 0) { +fprintf(stderr, qemu_savevm_trans_begin failed\n); +goto out; +} + +ft_mode = FT_TRANSACTION_COMMIT; +if (ret) { +/* don't proceed until if fd isn't ready */ +goto out; +} +} + +/* make the VM state consistent by flushing outstanding events */ +vm_stop(0); + +/* send at full speed */ +qemu_file_set_rate_limit(s-file, 0); + +ret = qemu_savevm_trans_complete(s-mon, s-file); +if (ret 0) { +fprintf(stderr, qemu_savevm_trans_complete failed\n); +goto out; +} + +ret = qemu_ft_trans_commit(s-file); +if (ret 0) { +fprintf(stderr, qemu_ft_trans_commit failed\n); +goto out; +} + +if (ret) { +ft_mode = FT_TRANSACTION_RECV; +ret = 1; +goto out; +} + +/* flush and check if events are remaining */ +vm_start(); +ret = event_tap_flush_one(); +if (ret 0) { +fprintf(stderr, event_tap_flush_one failed\n); +goto out; +} + +ft_mode = ret ? 
FT_TRANSACTION_BEGIN : FT_TRANSACTION_ATOMIC; +} while (ft_mode != FT_TRANSACTION_BEGIN); + +vm_start(); +ret = 0; + +out: +return ret; +} + +static int migrate_ft_trans_get_ready(void *opaque) +{ +FdMigrationState *s = opaque; +int ret = -1; + +if (ft_mode != FT_TRANSACTION_RECV) { +fprintf(stderr, +migrate_ft_trans_get_ready: invalid ft_mode %d\n, ft_mode); +goto error_out; +} + +/* flush and check if events are remaining */ +vm_start(); +ret = event_tap_flush_one(); +if (ret 0) { +fprintf(stderr, event_tap_flush_one failed\n); +goto error_out; +} + +if (ret) { +ft_mode = FT_TRANSACTION_BEGIN; +} else { +ft_mode = FT_TRANSACTION_ATOMIC; + +ret = migrate_ft_trans_commit(s); +if (ret 0) { +goto error_out; +} +if (ret) { +goto out; +} +} + +vm_start(); +ret = 0; +goto out; + +error_out: +migrate_ft_trans_error(s); + +out: +return ret; +} + +static int migrate_ft_trans_put_ready(void) +{ +FdMigrationState *s = migrate_to_fms(current_migration); +
[Qemu-devel] [PATCH 11/18] ioport: insert event_tap_ioport() to ioport_write().
Record ioport event to replay it upon failover.

Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
---
 ioport.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/ioport.c b/ioport.c
index aa4188a..74aebf5 100644
--- a/ioport.c
+++ b/ioport.c
@@ -27,6 +27,7 @@

 #include "ioport.h"
 #include "trace.h"
+#include "event-tap.h"

 /***/
 /* IO Port */
@@ -76,6 +77,7 @@ static void ioport_write(int index, uint32_t address, uint32_t data)
         default_ioport_writel
     };
     IOPortWriteFunc *func = ioport_write_table[index][address];
+    event_tap_ioport(index, address, data);
     if (!func)
         func = default_func[index];
     func(ioport_opaque[address], address, data);
-- 
1.7.1.2
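As a rough illustration of "record, then dispatch", the following sketch keeps the most recent port write so that it can be issued again later. The struct fields follow the EventTapIOport layout shown in patch 09/18; everything else (record_ioport(), tapped_write(), replay_ioport(), the toy device) is a hypothetical name made up for this example and is not part of event-tap.

```c
#include <assert.h>
#include <stdint.h>

/* Same fields as EventTapIOport in patch 09/18. */
typedef struct IOportEvent {
    uint32_t address;
    uint32_t data;
    int index;          /* row of the write-handler table that was used */
} IOportEvent;

typedef void WriteFn(uint32_t address, uint32_t data);

static IOportEvent last_event;
static int have_event;

/* Hypothetical recorder: remember the most recent write. */
static void record_ioport(int index, uint32_t address, uint32_t data)
{
    last_event.index = index;
    last_event.address = address;
    last_event.data = data;
    have_event = 1;
}

/* Record first, then hand the write to the device handler, in the
 * same order as the hunk above. */
static void tapped_write(WriteFn *fn, int index, uint32_t address,
                         uint32_t data)
{
    record_ioport(index, address, data);
    fn(address, data);
}

/* Issue the recorded write again, e.g. after a failover.
 * Returns 1 if an event was replayed, 0 if nothing was recorded. */
static int replay_ioport(WriteFn *fn)
{
    if (!have_event) {
        return 0;
    }
    fn(last_event.address, last_event.data);
    have_event = 0;
    return 1;
}

/* Toy device (assumption): one register plus a write counter. */
static uint32_t dev_reg;
static int dev_writes;

static void dev_write(uint32_t address, uint32_t data)
{
    (void)address;
    dev_reg = data;
    dev_writes++;
}
```

Recording before the dispatch means the event is captured even if the handler itself triggers further activity.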
[Qemu-devel] [PATCH 07/18] Introduce fault tolerant VM transaction QEMUFile and ft_mode.
This code implements VM transaction protocol. Like buffered_file, it sits between savevm and migration layer. With this architecture, VM transaction protocol is implemented mostly independent from other existing code. Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp Signed-off-by: OHMURA Kei ohmura@lab.ntt.co.jp --- Makefile.objs |1 + ft_trans_file.c | 624 +++ ft_trans_file.h | 72 +++ migration.c |3 + trace-events| 15 ++ 5 files changed, 715 insertions(+), 0 deletions(-) create mode 100644 ft_trans_file.c create mode 100644 ft_trans_file.h diff --git a/Makefile.objs b/Makefile.objs index 353b1a8..04148b5 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -100,6 +100,7 @@ common-obj-y += msmouse.o ps2.o common-obj-y += qdev.o qdev-properties.o common-obj-y += block-migration.o common-obj-y += pflib.o +common-obj-y += ft_trans_file.o common-obj-$(CONFIG_BRLAPI) += baum.o common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o diff --git a/ft_trans_file.c b/ft_trans_file.c new file mode 100644 index 000..2b42b95 --- /dev/null +++ b/ft_trans_file.c @@ -0,0 +1,624 @@ +/* + * Fault tolerant VM transaction QEMUFile + * + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation. + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + * This source code is based on buffered_file.c. + * Copyright IBM, Corp. 
2008 + * Authors: + * Anthony Liguorialigu...@us.ibm.com + */ + +#include qemu-common.h +#include qemu-error.h +#include hw/hw.h +#include qemu-timer.h +#include sysemu.h +#include qemu-char.h +#include trace.h +#include ft_trans_file.h + +typedef struct FtTransHdr +{ +uint16_t cmd; +uint16_t id; +uint32_t seq; +uint32_t payload_len; +} FtTransHdr; + +typedef struct QEMUFileFtTrans +{ +FtTransPutBufferFunc *put_buffer; +FtTransGetBufferFunc *get_buffer; +FtTransPutReadyFunc *put_ready; +FtTransGetReadyFunc *get_ready; +FtTransWaitForUnfreezeFunc *wait_for_unfreeze; +FtTransCloseFunc *close; +void *opaque; +QEMUFile *file; + +enum QEMU_VM_TRANSACTION_STATE state; +uint32_t seq; +uint16_t id; + +int has_error; + +bool freeze_output; +bool freeze_input; +bool rate_limit; +bool is_sender; +bool is_payload; + +uint8_t *buf; +size_t buf_max_size; +size_t put_offset; +size_t get_offset; + +FtTransHdr header; +size_t header_offset; +} QEMUFileFtTrans; + +#define IO_BUF_SIZE 32768 + +static void ft_trans_append(QEMUFileFtTrans *s, +const uint8_t *buf, size_t size) +{ +if (size (s-buf_max_size - s-put_offset)) { +trace_ft_trans_realloc(s-buf_max_size, size + 1024); +s-buf_max_size += size + 1024; +s-buf = qemu_realloc(s-buf, s-buf_max_size); +} + +trace_ft_trans_append(size); +memcpy(s-buf + s-put_offset, buf, size); +s-put_offset += size; +} + +static void ft_trans_flush(QEMUFileFtTrans *s) +{ +size_t offset = 0; + +if (s-has_error) { +error_report(flush when error %d, bailing, s-has_error); +return; +} + +while (offset s-put_offset) { +ssize_t ret; + +ret = s-put_buffer(s-opaque, s-buf + offset, s-put_offset - offset); +if (ret == -EAGAIN) { +break; +} + +if (ret = 0) { +error_report(error flushing data, %s, strerror(errno)); +s-has_error = FT_TRANS_ERR_FLUSH; +break; +} else { +offset += ret; +} +} + +trace_ft_trans_flush(offset, s-put_offset); +memmove(s-buf, s-buf + offset, s-put_offset - offset); +s-put_offset -= offset; +s-freeze_output = !!s-put_offset; +} + +static 
ssize_t ft_trans_put(void *opaque, void *buf, int size) +{ +QEMUFileFtTrans *s = opaque; +size_t offset = 0; +ssize_t len; + +/* flush buffered data before putting next */ +if (s-put_offset) { +ft_trans_flush(s); +} + +while (!s-freeze_output offset size) { +len = s-put_buffer(s-opaque, (uint8_t *)buf + offset, size - offset); + +if (len == -EAGAIN) { +trace_ft_trans_freeze_output(); +s-freeze_output = 1; +break; +} + +if (len = 0) { +error_report(putting data failed, %s, strerror(errno)); +s-has_error = 1; +offset = -EINVAL; +break; +} + +offset += len; +} + +if (s-freeze_output) { +ft_trans_append(s, buf + offset, size - offset); +offset = size; +} + +return offset; +} + +static int ft_trans_send_header(QEMUFileFtTrans *s, +enum QEMU_VM_TRANSACTION_STATE state, +uint32_t payload_len) +{ +int ret; +FtTransHdr
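The buffering idea behind ft_trans_put() and ft_trans_flush() (write what the lower layer accepts, stash the rest when it reports -EAGAIN, and drain the stash before the next write) can be sketched without any QEMU types. All names below are hypothetical. Unlike the quoted hunk, which stores -EINVAL in a size_t offset, this sketch keeps error codes in a signed return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

typedef ssize_t PutFn(void *opaque, const uint8_t *buf, size_t size);

typedef struct BufferedOut {
    PutFn *put;         /* lower-layer write callback */
    void *opaque;
    uint8_t *buf;       /* bytes accepted from the caller but not yet sent */
    size_t cap;
    size_t pending;
    int frozen;         /* set while the lower layer reports -EAGAIN */
    int has_error;
} BufferedOut;

static int out_append(BufferedOut *s, const uint8_t *data, size_t size)
{
    if (size > s->cap - s->pending) {
        size_t ncap = s->pending + size + 1024;
        uint8_t *nbuf = realloc(s->buf, ncap);
        if (!nbuf) {
            s->has_error = 1;
            return -1;
        }
        s->buf = nbuf;
        s->cap = ncap;
    }
    memcpy(s->buf + s->pending, data, size);
    s->pending += size;
    return 0;
}

static void out_flush(BufferedOut *s)
{
    size_t off = 0;

    while (off < s->pending) {
        ssize_t ret = s->put(s->opaque, s->buf + off, s->pending - off);
        if (ret == -EAGAIN) {
            break;
        }
        if (ret <= 0) {
            s->has_error = 1;
            break;
        }
        off += (size_t)ret;
    }
    if (off > 0) {
        memmove(s->buf, s->buf + off, s->pending - off);
        s->pending -= off;
    }
    s->frozen = s->pending != 0;
}

/* Returns size when every byte was either sent or buffered,
 * or a negative errno value on failure. */
static ssize_t out_put(BufferedOut *s, const uint8_t *data, size_t size)
{
    size_t off = 0;

    if (s->pending) {
        out_flush(s);       /* older data must go out first */
    }
    if (s->has_error) {
        return -EIO;
    }
    while (!s->frozen && off < size) {
        ssize_t ret = s->put(s->opaque, data + off, size - off);
        if (ret == -EAGAIN) {
            s->frozen = 1;
            break;
        }
        if (ret <= 0) {
            s->has_error = 1;
            return -EIO;
        }
        off += (size_t)ret;
    }
    if (s->frozen && off < size && out_append(s, data + off, size - off) < 0) {
        return -ENOMEM;
    }
    return (ssize_t)size;
}

/* Toy sink (assumption): accepts at most budget bytes, then -EAGAIN. */
typedef struct Sink {
    uint8_t data[64];
    size_t len;
    size_t budget;
} Sink;

static ssize_t sink_put(void *opaque, const uint8_t *buf, size_t size)
{
    Sink *k = opaque;
    size_t n = size < k->budget ? size : k->budget;

    if (n == 0) {
        return -EAGAIN;
    }
    memcpy(k->data + k->len, buf, n);
    k->len += n;
    k->budget -= n;
    return (ssize_t)n;
}
```

Reporting the full size as consumed keeps the caller simple, at the cost of an unbounded buffer while the peer is slow.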
[Qemu-devel] Re: [PATCH] Fix multiple qemu-options.def generation
> diff --git a/Makefile.objs b/Makefile.objs
> index 4a1eaa1..ee9f190 100755
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -269,10 +269,10 @@ vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
>
>  vl.o: QEMU_CFLAGS+=$(SDL_CFLAGS)
>
> -vl.o: qemu-options.def
> +vl.o: ../qemu-options.def
>  os-posix.o: qemu-options.def
>  os-win32.o: qemu-options.def
>
> -qemu-options.def: $(SRC_PATH)/qemu-options.hx
> +%qemu-options.def: $(SRC_PATH)/qemu-options.hx
>  	$(call quiet-command,sh $(SRC_PATH)/hxtool -h < $< > $@,"  GEN   $(TARGET_DIR)$@")

This is wrong, I think the problem is that you are missing a vpath directive. Does this help?

diff --git a/rules.mak b/rules.mak
index ed59c9e..6f753ae 100644
--- a/rules.mak
+++ b/rules.mak
@@ -39,7 +39,7 @@ quiet-command = $(if $(V),$1,$(if $(2),@echo $2 && $1, @$1))
 cc-option = $(if $(shell $(CC) $1 $2 -S -o /dev/null -xc /dev/null \
               >/dev/null 2>&1 && echo OK), $2, $3)

-VPATH_SUFFIXES = %.c %.h %.S %.m %.mak %.texi
+VPATH_SUFFIXES = %.c %.h %.S %.m %.mak %.texi %.def
 set-vpath = $(if $1,$(foreach PATTERN,$(VPATH_SUFFIXES),$(eval vpath $(PATTERN) $1)))

 # find-in-path

Paolo
[Qemu-devel] Re: [PATCH 18/18] Introduce kemari: to enable FT migration mode (Kemari).
On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote: When kemari: is set in front of URI of migrate command, it will turn on ft_mode to start FT migration mode (Kemari). On the receiver side, the option looks like, -incoming kemari:protocol:address:port Signed-off-by: Yoshiaki Tamuratamura.yoshi...@lab.ntt.co.jp --- hmp-commands.hx |4 +++- migration.c | 12 qmp-commands.hx |4 +++- 3 files changed, 18 insertions(+), 2 deletions(-) diff --git a/hmp-commands.hx b/hmp-commands.hx index 38e1eb7..ee14344 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -760,7 +760,9 @@ ETEXI \n\t\t\t -b for migration without shared storage with full copy of disk\n\t\t\t -i for migration without shared storage with incremental copy of disk - (base image shared between src and destination), + (base image shared between src and destination) + \n\t\t\t put \kemari:\ in front of URI to enable + Fault Tolerance mode (Kemari protocol), .user_print = monitor_user_noop, .mhandler.cmd_new = do_migrate, }, diff --git a/migration.c b/migration.c index 7837c55..a3f7722 100644 --- a/migration.c +++ b/migration.c @@ -48,6 +48,12 @@ int qemu_start_incoming_migration(const char *uri) const char *p; int ret; +/* check ft_mode (Kemari protocol) */ +if (strstart(uri, kemari:,p)) { +ft_mode = FT_INIT; +uri = p; +} + if (strstart(uri, tcp:,p)) ret = tcp_start_incoming_migration(p); #if !defined(WIN32) @@ -99,6 +105,12 @@ int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data) return -1; } +/* check ft_mode (Kemari protocol) */ +if (strstart(uri, kemari:,p)) { +ft_mode = FT_INIT; +uri = p; +} + if (strstart(uri, tcp:,p)) { s = tcp_start_outgoing_migration(mon, p, max_throttle, detach, blk, inc); diff --git a/qmp-commands.hx b/qmp-commands.hx index df40a3d..68ca48a 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -437,7 +437,9 @@ EQMP \n\t\t\t -b for migration without shared storage with full copy of disk\n\t\t\t -i for migration without shared storage with incremental copy of disk - (base image 
shared between src and destination), + (base image shared between src and destination) + \n\t\t\t put \kemari:\ in front of URI to enable + Fault Tolerance mode (Kemari protocol), .user_print = monitor_user_noop, .mhandler.cmd_new = do_migrate, }, Acked-by: Paolo Bonzini pbonz...@redhat.com Paolo
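The prefix handling added to qemu_start_incoming_migration() and do_migrate() boils down to "strip kemari: if present and remember that FT mode was requested". A self-contained sketch follows; has_prefix() is a local stand-in written for this example (QEMU's own helper is strstart()), and strip_kemari() is a hypothetical wrapper, not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Local stand-in for a strstart()-style helper: returns 1 and stores
 * the remainder in *rest when str begins with prefix. */
static int has_prefix(const char *str, const char *prefix, const char **rest)
{
    size_t n = strlen(prefix);

    if (strncmp(str, prefix, n) != 0) {
        return 0;
    }
    if (rest) {
        *rest = str + n;
    }
    return 1;
}

/* Hypothetical helper: strip an optional "kemari:" prefix and report
 * whether it was present. Returns the URI that the normal transport
 * parsing (tcp:, exec:, ...) should see. */
static const char *strip_kemari(const char *uri, bool *ft_enabled)
{
    const char *p;

    *ft_enabled = false;
    if (has_prefix(uri, "kemari:", &p)) {
        *ft_enabled = true;
        uri = p;
    }
    return uri;
}
```

Because the remainder is handed on unchanged, the existing transport checks need no knowledge of the new prefix.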
RE: [Qemu-devel] Re: [PATCH] Fix multiple qemu-options.def generation
>> diff --git a/Makefile.objs b/Makefile.objs
>> index 4a1eaa1..ee9f190 100755
>> --- a/Makefile.objs
>> +++ b/Makefile.objs
>> @@ -269,10 +269,10 @@ vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
>>
>>  vl.o: QEMU_CFLAGS+=$(SDL_CFLAGS)
>>
>> -vl.o: qemu-options.def
>> +vl.o: ../qemu-options.def
>>  os-posix.o: qemu-options.def
>>  os-win32.o: qemu-options.def
>>
>> -qemu-options.def: $(SRC_PATH)/qemu-options.hx
>> +%qemu-options.def: $(SRC_PATH)/qemu-options.hx
>>  	$(call quiet-command,sh $(SRC_PATH)/hxtool -h < $< > $@,"  GEN   $(TARGET_DIR)$@")
>
> This is wrong, I think the problem is that you are missing a vpath directive. Does this help?

This patch was for an older version of qemu. The current one does not have this problem.

Pavel Dovgaluk
[Qemu-devel] [PATCH 09/18] Introduce event-tap.
event-tap controls when to start FT transaction, and provides proxy functions to called from net/block devices. While FT transaction, it queues up net/block requests, and flush them when the transaction gets completed. Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp Signed-off-by: OHMURA Kei ohmura@lab.ntt.co.jp --- Makefile.target |1 + event-tap.c | 939 +++ event-tap.h | 44 +++ qemu-tool.c | 28 ++ trace-events| 10 + 5 files changed, 1022 insertions(+), 0 deletions(-) create mode 100644 event-tap.c create mode 100644 event-tap.h diff --git a/Makefile.target b/Makefile.target index b0ba95f..edbdbee 100644 --- a/Makefile.target +++ b/Makefile.target @@ -199,6 +199,7 @@ obj-y += rwhandler.o obj-$(CONFIG_KVM) += kvm.o kvm-all.o obj-$(CONFIG_NO_KVM) += kvm-stub.o LIBS+=-lz +obj-y += event-tap.o QEMU_CFLAGS += $(VNC_TLS_CFLAGS) QEMU_CFLAGS += $(VNC_SASL_CFLAGS) diff --git a/event-tap.c b/event-tap.c new file mode 100644 index 000..f44d835 --- /dev/null +++ b/event-tap.c @@ -0,0 +1,939 @@ +/* + * Event Tap functions for QEMU + * + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation. + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. 
+ */ + +#include qemu-common.h +#include qemu-error.h +#include block.h +#include block_int.h +#include ioport.h +#include osdep.h +#include sysemu.h +#include hw/hw.h +#include net.h +#include event-tap.h +#include trace.h + +enum EVENT_TAP_STATE { +EVENT_TAP_OFF, +EVENT_TAP_ON, +EVENT_TAP_SUSPEND, +EVENT_TAP_FLUSH, +EVENT_TAP_LOAD, +EVENT_TAP_REPLAY, +}; + +static enum EVENT_TAP_STATE event_tap_state = EVENT_TAP_OFF; + +typedef struct EventTapIOport { +uint32_t address; +uint32_t data; +int index; +} EventTapIOport; + +#define MMIO_BUF_SIZE 8 + +typedef struct EventTapMMIO { +uint64_t address; +uint8_t buf[MMIO_BUF_SIZE]; +int len; +} EventTapMMIO; + +typedef struct EventTapNetReq { +char *device_name; +int iovcnt; +int vlan_id; +bool vlan_needed; +bool async; +struct iovec *iov; +NetPacketSent *sent_cb; +} EventTapNetReq; + +#define MAX_BLOCK_REQUEST 32 + +typedef struct EventTapAIOCB EventTapAIOCB; + +typedef struct EventTapBlkReq { +char *device_name; +int num_reqs; +int num_cbs; +bool is_flush; +BlockRequest reqs[MAX_BLOCK_REQUEST]; +EventTapAIOCB *acb[MAX_BLOCK_REQUEST]; +} EventTapBlkReq; + +#define EVENT_TAP_IOPORT (1 0) +#define EVENT_TAP_MMIO (1 1) +#define EVENT_TAP_NET(1 2) +#define EVENT_TAP_BLK(1 3) + +#define EVENT_TAP_TYPE_MASK (EVENT_TAP_NET - 1) + +typedef struct EventTapLog { +int mode; +union { +EventTapIOport ioport; +EventTapMMIO mmio; +}; +union { +EventTapNetReq net_req; +EventTapBlkReq blk_req; +}; +QTAILQ_ENTRY(EventTapLog) node; +} EventTapLog; + +struct EventTapAIOCB { +BlockDriverAIOCB common; +BlockDriverAIOCB *acb; +bool is_canceled; +}; + +static EventTapLog *last_event_tap; + +static QTAILQ_HEAD(, EventTapLog) event_list; +static QTAILQ_HEAD(, EventTapLog) event_pool; + +static int (*event_tap_cb)(void); +static QEMUBH *event_tap_bh; +static VMChangeStateEntry *vmstate; + +static void event_tap_bh_cb(void *p) +{ +if (event_tap_cb) { +event_tap_cb(); +} + +qemu_bh_delete(event_tap_bh); +event_tap_bh = NULL; +} + +static void 
event_tap_schedule_bh(void) +{ +trace_event_tap_ignore_bh(!!event_tap_bh); + +/* if bh is already set, we ignore it for now */ +if (event_tap_bh) { +return; +} + +event_tap_bh = qemu_bh_new(event_tap_bh_cb, NULL); +qemu_bh_schedule(event_tap_bh); + +return; +} + +static void *event_tap_alloc_log(void) +{ +EventTapLog *log; + +if (QTAILQ_EMPTY(event_pool)) { +log = qemu_mallocz(sizeof(EventTapLog)); +} else { +log = QTAILQ_FIRST(event_pool); +QTAILQ_REMOVE(event_pool, log, node); +} + +return log; +} + +static void event_tap_free_net_req(EventTapNetReq *net_req); +static void event_tap_free_blk_req(EventTapBlkReq *blk_req); + +static void event_tap_free_log(EventTapLog *log) +{ +int mode = log-mode ~EVENT_TAP_TYPE_MASK; + +if (mode == EVENT_TAP_NET) { +event_tap_free_net_req(log-net_req); +} else if (mode == EVENT_TAP_BLK) { +event_tap_free_blk_req(log-blk_req); +} + +log-mode = 0; + +/* return the log to event_pool */ +QTAILQ_INSERT_HEAD(event_pool, log, node); +} + +static void event_tap_free_pool(void) +{ +EventTapLog *log, *next; + +QTAILQ_FOREACH_SAFE(log, event_pool, node, next) { +QTAILQ_REMOVE(event_pool, log, node); +qemu_free(log); +} +} + +static void event_tap_free_net_req(EventTapNetReq *net_req) +{ +int i; + +if (!net_req-async) { +for
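The event_pool handling above (take a recycled EventTapLog from the pool if there is one, otherwise allocate; on free, reset the entry and push it back at the head) is a plain free-list pattern. The sketch below shows the same idea with a hand-rolled singly linked list instead of QTAILQ; LogEntry, LogPool and the pool_* functions are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct LogEntry {
    int mode;
    struct LogEntry *next;
} LogEntry;

typedef struct LogPool {
    LogEntry *free_list;    /* recycled entries, most recently freed first */
    size_t allocated;       /* entries obtained from calloc() so far */
} LogPool;

/* Reuse a recycled entry if one exists, otherwise allocate a zeroed one. */
static LogEntry *pool_get(LogPool *p)
{
    LogEntry *e = p->free_list;

    if (e) {
        p->free_list = e->next;
        e->next = NULL;
        return e;
    }
    e = calloc(1, sizeof(*e));
    if (e) {
        p->allocated++;
    }
    return e;
}

/* Reset the entry and return it to the head of the pool. */
static void pool_put(LogPool *p, LogEntry *e)
{
    e->mode = 0;
    e->next = p->free_list;
    p->free_list = e;
}

/* Release everything that is currently sitting in the pool. */
static void pool_destroy(LogPool *p)
{
    while (p->free_list) {
        LogEntry *e = p->free_list;
        p->free_list = e->next;
        free(e);
    }
}
```

Recycling avoids a malloc/free pair per tapped event, which matters if events arrive at device I/O rates.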
[Qemu-devel] Re: [PATCH] Correct win32 timers deleting v.3
On 02/02/2011 12:59 PM, Pavel Dovgaluk wrote:
> Hello.
>
> Anybody interested in this patch?

I'm planning to replace the multimedia timer with a queue timer. I'll send the patch soon(ish).

Paolo
Re: [Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.
2011/2/10 Anthony Liguori <anth...@codemonkey.ws>:
> On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote:
>> Currently FdMigrationState doesn't support read(), and this patch
>> introduces it to get response from the other side.
>>
>> Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
>
> Migration is unidirectional. Changing this is fundamental and not
> something to be done lightly.
>
> I thought we previously discussed using a protocol wrapper around the
> existing migration protocol?

AFAIR, I don't think we had that discussion before. I applied comments from Stefan though. If I missed the discussion, could you please give me the link?

Thanks,

Yoshi

> Regards,
>
> Anthony Liguori
>
>> ---
>>  migration-tcp.c |   15 +++++++++++++++
>>  migration.c     |   13 +++++++++++++
>>  migration.h     |    3 +++
>>  3 files changed, 31 insertions(+), 0 deletions(-)
>>
>> diff --git a/migration-tcp.c b/migration-tcp.c
>> index b55f419..55777c8 100644
>> --- a/migration-tcp.c
>> +++ b/migration-tcp.c
>> @@ -39,6 +39,20 @@ static int socket_write(FdMigrationState *s, const void * buf, size_t size)
>>      return send(s->fd, buf, size, 0);
>>  }
>>
>> +static int socket_read(FdMigrationState *s, const void * buf, size_t size)
>> +{
>> +    ssize_t len;
>> +
>> +    do {
>> +        len = recv(s->fd, (void *)buf, size, 0);
>> +    } while (len == -1 && socket_error() == EINTR);
>> +    if (len == -1) {
>> +        len = -socket_error();
>> +    }
>> +
>> +    return len;
>> +}
>> +
>>  static int tcp_close(FdMigrationState *s)
>>  {
>>      DPRINTF("tcp_close\n");
>> @@ -94,6 +108,7 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon,
>>
>>      s->get_error = socket_errno;
>>      s->write = socket_write;
>> +    s->read = socket_read;
>>      s->close = tcp_close;
>>      s->mig_state.cancel = migrate_fd_cancel;
>>      s->mig_state.get_status = migrate_fd_get_status;
>> diff --git a/migration.c b/migration.c
>> index 3612572..f0df5fc 100644
>> --- a/migration.c
>> +++ b/migration.c
>> @@ -340,6 +340,19 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)
>>      return ret;
>>  }
>>
>> +int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size)
>> +{
>> +    FdMigrationState *s = opaque;
>> +    int ret;
>> +
>> +    ret = s->read(s, data, size);
>> +    if (ret == -1) {
>> +        ret = -(s->get_error(s));
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>>  void migrate_fd_connect(FdMigrationState *s)
>>  {
>>      int ret;
>> diff --git a/migration.h b/migration.h
>> index 2170792..88a6987 100644
>> --- a/migration.h
>> +++ b/migration.h
>> @@ -48,6 +48,7 @@ struct FdMigrationState
>>      int (*get_error)(struct FdMigrationState*);
>>      int (*close)(struct FdMigrationState*);
>>      int (*write)(struct FdMigrationState*, const void *, size_t);
>> +    int (*read)(struct FdMigrationState *, const void *, size_t);
>>      void *opaque;
>>  };
>>
>> @@ -116,6 +117,8 @@ void migrate_fd_put_notify(void *opaque);
>>
>>  ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size);
>>
>> +int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size);
>> +
>>  void migrate_fd_connect(FdMigrationState *s);
>>
>>  void migrate_fd_put_ready(void *opaque);
Re: [Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.
On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote:
> Currently FdMigrationState doesn't support read(), and this patch
> introduces it to get response from the other side.
>
> Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>

Migration is unidirectional. Changing this is fundamental and not something to be done lightly.

I thought we previously discussed using a protocol wrapper around the existing migration protocol?

Regards,

Anthony Liguori

> ---
>  migration-tcp.c |   15 +++++++++++++++
>  migration.c     |   13 +++++++++++++
>  migration.h     |    3 +++
>  3 files changed, 31 insertions(+), 0 deletions(-)
>
> diff --git a/migration-tcp.c b/migration-tcp.c
> index b55f419..55777c8 100644
> --- a/migration-tcp.c
> +++ b/migration-tcp.c
> @@ -39,6 +39,20 @@ static int socket_write(FdMigrationState *s, const void * buf, size_t size)
>      return send(s->fd, buf, size, 0);
>  }
>
> +static int socket_read(FdMigrationState *s, const void * buf, size_t size)
> +{
> +    ssize_t len;
> +
> +    do {
> +        len = recv(s->fd, (void *)buf, size, 0);
> +    } while (len == -1 && socket_error() == EINTR);
> +    if (len == -1) {
> +        len = -socket_error();
> +    }
> +
> +    return len;
> +}
> +
>  static int tcp_close(FdMigrationState *s)
>  {
>      DPRINTF("tcp_close\n");
> @@ -94,6 +108,7 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon,
>
>      s->get_error = socket_errno;
>      s->write = socket_write;
> +    s->read = socket_read;
>      s->close = tcp_close;
>      s->mig_state.cancel = migrate_fd_cancel;
>      s->mig_state.get_status = migrate_fd_get_status;
> diff --git a/migration.c b/migration.c
> index 3612572..f0df5fc 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -340,6 +340,19 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)
>      return ret;
>  }
>
> +int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size)
> +{
> +    FdMigrationState *s = opaque;
> +    int ret;
> +
> +    ret = s->read(s, data, size);
> +    if (ret == -1) {
> +        ret = -(s->get_error(s));
> +    }
> +
> +    return ret;
> +}
> +
>  void migrate_fd_connect(FdMigrationState *s)
>  {
>      int ret;
> diff --git a/migration.h b/migration.h
> index 2170792..88a6987 100644
> --- a/migration.h
> +++ b/migration.h
> @@ -48,6 +48,7 @@ struct FdMigrationState
>      int (*get_error)(struct FdMigrationState*);
>      int (*close)(struct FdMigrationState*);
>      int (*write)(struct FdMigrationState*, const void *, size_t);
> +    int (*read)(struct FdMigrationState *, const void *, size_t);
>      void *opaque;
>  };
>
> @@ -116,6 +117,8 @@ void migrate_fd_put_notify(void *opaque);
>
>  ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size);
>
> +int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size);
> +
>  void migrate_fd_connect(FdMigrationState *s);
>
>  void migrate_fd_put_ready(void *opaque);
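For reference, the retry-on-EINTR pattern that socket_read() uses can be written as a standalone POSIX function. This is a sketch under assumptions, not the patch itself: it reads errno directly instead of going through QEMU's socket_error() wrapper, takes a non-const buffer (the patch declares const void * and casts it away), and returns ssize_t rather than int. The name recv_retry() is made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Receive up to size bytes, restarting the call when a signal
 * interrupts it. Returns the byte count, 0 when the peer has shut
 * down the connection, or a negative errno value on failure. */
static ssize_t recv_retry(int fd, void *buf, size_t size)
{
    ssize_t len;

    do {
        len = recv(fd, buf, size, 0);
    } while (len == -1 && errno == EINTR);

    if (len == -1) {
        len = -errno;
    }
    return len;
}
```

A return of -EAGAIN on a non-blocking socket is not an error in this scheme; the caller is expected to wait for readability and try again, which is what patch 16/18 does by registering a read handler.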
Re: [Qemu-devel] KVM call minutes for Feb 8
On 02/10/2011 10:07 AM, Gleb Natapov wrote: So what if it is easier, it doesn't mean it is correct thing to do. If we spend the next 10 years trying to do the correct thing for some arbitrary definition of correct, that's not terribly useful. It's really simple actually. Let's do the least clever thing and model how hardware actual works. Once we have that, we can try to be better than real hardware (if it's possible). If all composition is done through a factory interface, it doesn't. But my main argument here is that we shouldn't try to make all composition done through a factory interface--only where it makes sense. So very concretely, I'm suggesting we do the following to target-i386: 1) make the i440fx device have an embedded ide controller, piix3, and usb controller that get initialized automatically. The piix3 embeds the PCI-to-ISA bridge along with all of the default ISA devices (rtc, serial, etc.). This may be a problem even from security point of view. What if usb code (ide, serial, parallel) has guest exploitable bug? Currently I can happily continue running guests if they do not need affected subsystem. If we'll get it your way I will no longer be able to do so. qemu -device i440fx,ide=off If you really care to do this. But this desire to remove devices is silly IMHO. Concerns about security are misplaced. If you have to change the way a guest is invoked in order to eliminate security problems, then there's something seriously wrong. Regards, Anthony Liguori
Re: [Qemu-devel] [RFC][PATCH v6 00/04] qtest: qemu unit testing framework
On Wed, Feb 9, 2011 at 8:39 PM, Michael Roth <mdr...@linux.vnet.ibm.com> wrote:
> On 02/09/2011 01:42 PM, Blue Swirl wrote:
>> On Fri, Feb 4, 2011 at 3:49 PM, Michael Roth <mdr...@linux.vnet.ibm.com> wrote:
>>> These patches apply to master (2-04-2011), and can also be obtained from:
>>> git://repo.or.cz/qemu/mdroth.git qtest_v1
>>>
>>> OVERVIEW:
>>>
>>> QEMU currently lacks a standard means to do targeted unit testing of the device model. Frameworks like kvm-autotest interact via guest OS, which provide a highly abstracted interface to the underlying machine, and are susceptible to bugs in the guest OS itself. This allows for reasonable test coverage of guest functionality as a whole, but reduces the accuracy and specificity with which we can exercise paths in the underlying devices.
>>>
>>> The following patches provide the basic beginnings of a test framework which replaces vcpu threads with test threads that interact with the underlying machine directly, allowing for directed unit/performance testing of individual devices. Test modules are built directly into the qemu binary, and each module provides the following interfaces:
>>>
>>> init():
>>>   Called in place of qemu's normal machine initialization to set up devices explicitly. A full machine can be created here by calling into the normal init path, as well as minimal machines with a select set of buses/devices/IRQ handlers.
>>>
>>> run():
>>>   Test logic that interacts with the now-created machine.
>>>
>>> cleanup():
>>>   Currently unused, but potentially allows for chaining multiple tests together. Currently we run one module, then exit.
>>>
>>> As mentioned these are very early starting points. We're mostly looking for input from the community on the basic approach and overall requirements for an acceptable framework. A basic RTC test module is provided as an example.
>>>
>>> BUILD/EXAMPLE USAGE:
>>>
>>> $ ./configure --target-list=x86_64-softmmu --enable-qtest --enable-io-thread
>>> $ make
>>> $ ./x86_64-softmmu/qemu-system-x86_64 -test ?
>>> Available test modules:
>>> rtc
>>> $ ./x86_64-softmmu/qemu-system-x86_64 -test rtc
>>> ../qtest/qtest_rtc.c:test_drift():L94: hz: 2, duration_ms: 4999, exp_duration: 5000, drift ratio: 0.000200
>>> ../qtest/qtest_rtc.c:test_drift():L111: hz: 1024, duration_ms: 4999, exp_duration: 5000, drift ratio: 0.000200
>>>
>>> GENERAL PLAN:
>>>
>>> - Provide libraries for common operations like PCI device enumeration, APIC configuration, default-configured machine setup, interrupt handling, etc.
>>> - Develop tests as machine/target specific, potentially make some tests re-usable as interfaces are better defined
>>> - Do port i/o via cpu_in/cpu_out commands
>>> - Do guest memory access via a CPUPhysMemoryClient interface
>>> - Allow interrupts to be sent by writing to an FD, detection in test modules via select()/read()
>>>
>>> TODO:
>>>
>>> - A means to propagate test returns values to main i/o thread
>>> - Better defined test harness for individual test cases and/or modules, likely via GLib
>>> - Support for multiple test threads in a single test module for scalability testing
>>> - Modify vl.c hooks so tests can configure their own timers/clocksources
>>> - More test modules, improve current rtc module
>>> - Further implementing/fleshing out of the overall plan
>>>
>>> Comments/feedback are welcome!
>>
>> Would it be possible to couple this with the tracing or Kemari somehow so that you could capture, say, block device traces and feed them to test setup?
>
> I would think so...it's a pretty open ended framework, a unit test could, say, read in block device traces in some pre-defined format and then execute those against a block device. We're also planning on adding command-line parameters for tests, so a unit test could actually be used as a general testing utility. for instance:

That's a good point. Testing network, block, serial, etc device emulation requires mock host devices (netdev, drive, chardev).

Net needs a replay net client. A dump net client already exists for capturing packets.

Block has no record/replay but the Linux blktrace format might be good. There's also Kevin's blkdebug and CQ's blksim which might be extendable.

For chardev perhaps the existing options are already powerful enough, otherwise something like expect would be neat.

> qemu -test block-trace-virtio -test-opts tracefile=file,target_img=img,target_fmt=qcow2,comparison_img=img

Or more like the -device, -chardev, etc syntax:

qemu -test block-trace-virtio,tracefile=file,target_img=img,target_fmt=qcow2,comparison_img=img

Stefan
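To make the init()/run()/cleanup() module interface described in the overview more concrete, here is one way such a module table could look. This is purely an illustrative guess: TestModule, find_module(), run_module() and the toy rtc callbacks are hypothetical, the qtest series may define these differently, and the int result from run() is an assumption (the TODO list above notes that propagating return values is still open).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical module descriptor with the three hooks named in the
 * overview. */
typedef struct TestModule {
    const char *name;
    void (*init)(void);     /* build the machine or device under test */
    int (*run)(void);       /* test logic; 0 means success (assumption) */
    void (*cleanup)(void);  /* optional teardown */
} TestModule;

/* Toy "rtc" module: init marks the machine as ready, run checks it. */
static int machine_ready;

static void rtc_init(void)    { machine_ready = 1; }
static int  rtc_run(void)     { return machine_ready ? 0 : -1; }
static void rtc_cleanup(void) { machine_ready = 0; }

static const TestModule modules[] = {
    { "rtc", rtc_init, rtc_run, rtc_cleanup },
};

/* Look a module up by the name given on the command line. */
static const TestModule *find_module(const char *name)
{
    for (size_t i = 0; i < sizeof(modules) / sizeof(modules[0]); i++) {
        if (strcmp(modules[i].name, name) == 0) {
            return &modules[i];
        }
    }
    return NULL;
}

/* Run one module through its three phases. Returns run()'s result,
 * or -1 when no module of that name exists. */
static int run_module(const char *name)
{
    const TestModule *m = find_module(name);
    int ret;

    if (!m) {
        return -1;
    }
    m->init();
    ret = m->run();
    if (m->cleanup) {
        m->cleanup();
    }
    return ret;
}
```

With a table like this, listing the available modules for "-test ?" is a loop over the same array.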
Re: [Qemu-devel] KVM call minutes for Feb 8
On 02/10/2011 10:04 AM, Peter Maydell wrote: On 10 February 2011 08:36, Anthony Liguorianth...@codemonkey.ws wrote: On 02/10/2011 09:16 AM, Peter Maydell wrote: On 10 February 2011 07:47, Anthony Liguorianth...@codemonkey.wswrote: 2) get rid of the entire concept of machines. Creating a i440fx is essentially equivalent to creating a bare machine. Does that make any sense for anything other than target-i386? The concept of a machine model seems a pretty obvious one for ARM boards, for instance, and I'm not sure we'd gain much by having i386 be different to the other architectures... Yes, it makes a lot of sense, I just don't know the component names as well so bear with me :-) There are two types of Versatile machines today, Versatile/AB and Versatile/PB. They are both made with the same core, ARM926EJ-S, with different expansions. So you would model arm926ej-s as the chipset and then build up the machines by modifying parameters of the chipset (like the board id) and/or adding different components on top of it. Er, ARM926 is the CPU, it's not a chipset. The board ID is definitely not a property of an ARM926, it's a property of the board (clue is in the name :-)). I don't think versatile boards have a chipset really... As I said, I'm not well versed in the component names in ARM. But that said, an actual processor doesn't connect directly to a bunch of devices. It almost always go through some chipset and that chipset implements a lot of functionality typically. I think the name of the component I'm trying to refer to PL300 which I believe is the Northbridge used for the Versatile boards. In my understanding the machine is the thing that says I need a 926, and an MMC controller at this address, and some UARTS, and... ie it is the thing that does the modifying parameters and adding different components. So if we'd still be doing that I don't see how we've got rid of the concept. I guess I'm missing the point somehow. 
A machine today is basically the northbridge, southbridge, plus a bunch of default components to make the virtual hardware useful. I'm suggesting that we model a proper northbridge/southbridge. A good way to think about what I'm proposing is that machine-init really should be a constructor for a device object. If you mean that you want machines to be implemented under the hood as a single huge device you can only have one of that spans the entire memory map, well I guess that's an implementation detail. But conceptually machines really do exist, and we definitely still want users to be able to say I want a beagle machine; I want a versatile; I want an n900. An n900 is a very specific hardware configuration that is best represented by some sort of configuration file vs. something hard coded in QEMU. The question is, what level of component modelling do we need to do in order to make it practical to create such configurations from a file. Regards, Anthony Liguori -- PMM
[Qemu-devel] [PATCH 1/2] qdev: Allow hot-plug for lists with pre-filled descriptors
This will be needed for hot-plugging chardevs.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 monitor.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/monitor.c b/monitor.c
index 7fc311d..f3d7ab3 100644
--- a/monitor.c
+++ b/monitor.c
@@ -74,8 +74,6 @@
  * 'O'          option string of the form NAME=VALUE,...
  *              parsed according to QemuOptsList given by its name
  *              Example: 'device:O' uses qemu_device_opts.
- *              Restriction: only lists with empty desc are supported
- *              TODO lift the restriction
  * 'i'          32 bit integer
  * 'l'          target long (32 or 64 bit)
  * 'M'          just like 'l', except in user mode the value is
@@ -4064,7 +4062,7 @@ static const mon_cmd_t *monitor_parse_command(Monitor *mon,
                 QemuOpts *opts;

                 opts_list = qemu_find_opts(key);
-                if (!opts_list || opts_list->desc->name) {
+                if (!opts_list) {
                     goto bad_type;
                 }
                 while (qemu_isspace(*p)) {
--
1.7.4
[Qemu-devel] [PATCH 2/2] qdev: Allow chardevs to be hot-plugged
This commit enables chardevs to be hot-plugged to a running qemu machine. The syntax is similar to the -chardev command line:

(qemu) chardev_add socket,path=/tmp/foo,server,nowait,id=char0

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hmp-commands.hx |   16
 hw/qdev.c       |   15 +++
 hw/qdev.h       |    1 +
 qmp-commands.hx |   23 +++
 4 files changed, 55 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 38e1eb7..e0e6fc8 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -554,6 +554,22 @@ command @code{info usb} to see the devices you can remove.
 ETEXI

     {
+        .name       = "chardev_add",
+        .args_type  = "chardev:O",
+        .params     = "backend[,prop=value][,...],id=str",
+        .help       = "add chardev, like -chardev on the command line",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_chardev_add,
+    },
+
+STEXI
+@item chardev_add @var{config}
+@findex chardev_add
+
+Add chardev.
+ETEXI
+
+    {
         .name       = "device_add",
         .args_type  = "device:O",
         .params     = "driver[,prop=value][,...]",
diff --git a/hw/qdev.c b/hw/qdev.c
index c7fec44..1e24f58 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -861,6 +861,21 @@ void do_info_qdm(Monitor *mon)
     }
 }

+int do_chardev_add(Monitor *mon, const QDict *qdict, QObject **ret_data)
+{
+    QemuOpts *opts;
+
+    opts = qemu_opts_from_qdict(qemu_find_opts("chardev"), qdict);
+    if (!opts) {
+        return -1;
+    }
+    if (!qemu_chr_open_opts(opts, NULL)) {
+        qemu_opts_del(opts);
+        return -1;
+    }
+    return 0;
+}
+
 int do_device_add(Monitor *mon, const QDict *qdict, QObject **ret_data)
 {
     QemuOpts *opts;
diff --git a/hw/qdev.h b/hw/qdev.h
index 9808f85..5698713 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -212,6 +212,7 @@ BusState *sysbus_get_default(void);

 void do_info_qtree(Monitor *mon);
 void do_info_qdm(Monitor *mon);
+int do_chardev_add(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_device_add(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_device_del(Monitor *mon, const QDict *qdict, QObject **ret_data);

diff --git a/qmp-commands.hx b/qmp-commands.hx
index df40a3d..255da9a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -275,6 +275,29 @@ Example:
 EQMP

     {
+        .name       = "chardev_add",
+        .args_type  = "device:O",
+        .params     = "backend[,prop=value][,...],id=str",
+        .help       = "add chardev, like -chardev on the command line",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_chardev_add,
+    },
+
+SQMP
+chardev_add
+--
+
+Add a chardev.
+
+Arguments:
+
+- backend: the backend of the new chardev (json-string)
+- id: the chardev's ID, must be unique (json-string)
+- chardev properties
+
+EQMP
+
+    {
         .name       = "device_add",
         .args_type  = "device:O",
         .params     = "driver[,prop=value][,...]",
--
1.7.4
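As a rough illustration of the NAME=VALUE,... option-string form that chardev_add accepts, here is a minimal standalone sketch in C. It is not QEMU's QemuOpts parser: opt_get() is a hypothetical helper invented for this example, it treats the leading backend name as a bare flag, and it does no escaping.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not a QEMU API: looks up `key` in a string of
 * the form "name,key=value,flag,key=value" and copies its value into
 * `out`. Returns 1 if the key is present, 0 otherwise. A bare flag
 * such as "server" yields an empty value.
 */
static int opt_get(const char *opts, const char *key, char *out, size_t len)
{
    size_t klen = strlen(key);
    const char *p = opts;

    while (*p) {
        /* Length of the current comma-separated field. */
        const char *end = strchr(p, ',');
        size_t flen = end ? (size_t)(end - p) : strlen(p);

        /* Match either "key" alone or "key=..." exactly. */
        if (flen >= klen && strncmp(p, key, klen) == 0 &&
            (flen == klen || p[klen] == '=')) {
            const char *val = (flen == klen) ? "" : p + klen + 1;
            size_t vlen = (flen == klen) ? 0 : flen - klen - 1;

            snprintf(out, len, "%.*s", (int)vlen, val);
            return 1;
        }
        p += flen;
        if (*p == ',') {
            p++;
        }
    }
    return 0;
}
```

Called on the monitor example above, opt_get(opts, "path", ...) would yield /tmp/foo and opt_get(opts, "id", ...) would yield char0, while a prefix such as "no" does not match the nowait flag.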
Re: [Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.
On Thu, Feb 10, 2011 at 10:54:01AM +0100, Anthony Liguori wrote: On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote: Currently FdMigrationState doesn't support read(), and this patch introduces it to get response from the other side. Signed-off-by: Yoshiaki Tamuratamura.yoshi...@lab.ntt.co.jp Migration is unidirectional. Changing this is fundamental and not something to be done lightly. Making it bi-directional might break libvirt's save/restore to file support which uses migration, passing a unidirectional FD for the file. It could also break libvirt's secure tunnelled migration support which is currently only expecting to have data sent in one direction on the socket. Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.
2011/2/10 Daniel P. Berrange berra...@redhat.com: On Thu, Feb 10, 2011 at 10:54:01AM +0100, Anthony Liguori wrote: On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote: Currently FdMigrationState doesn't support read(), and this patch introduces it to get response from the other side. Signed-off-by: Yoshiaki Tamuratamura.yoshi...@lab.ntt.co.jp Migration is unidirectional. Changing this is fundamental and not something to be done lightly. Making it bi-directional might break libvirt's save/restore to file support which uses migration, passing a unidirectional FD for the file. It could also break libvirt's secure tunnelled migration support which is currently only expecting to have data sent in one direction on the socket. Hi Daniel, IIUC, this patch isn't something to make existing live migration bi-directional. Just opens up a way for Kemari to use it. Do you think it's dangerous for libvirt still? Thanks, Yoshi Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] KVM call minutes for Feb 8
On 02/10/2011 11:07 AM, Gleb Natapov wrote: On Thu, Feb 10, 2011 at 08:47:12AM +0100, Anthony Liguori wrote: On 02/09/2011 09:15 PM, Blue Swirl wrote: On Wed, Feb 9, 2011 at 9:59 PM, Anthony Liguorianth...@codemonkey.ws wrote: On 02/09/2011 06:48 PM, Blue Swirl wrote: ISASerialState dev; isa_serial_init(dev, 0, 0x274, 0x07, NULL, NULL); Do you mean that there should be a generic way of doing that, like sysbus_create_varargs() for qdev, or just add inline functions which hide qdev property setup? I still think that FDT should be used in the future. That would require that the properties can be set up mechanically, and I don't see how your proposal would help that. Yeah, I don't think that is a good idea anymore. I think this is part of why we're having so many problems with qdev. While (most?) hardware hierarchies can be represented by device tree syntax, not all valid device trees correspond to interface and/or useful hardware hierarchies. User creates a non-working machine and so gets to fix the problems? How is that a problem for us? It's not about creating a non-working machine. It's about what user-level abstraction we need to provide. It's a whole lot easier to implement an i440fx device with a fixed set of parameters than it is to make every possible subdevice have a proper factory interface along with mechanisms to hook everything together. So what if it is easier, it doesn't mean it is correct thing to do. What you are proposing is just a huge step backwards. May be we shouldn't support hooking everything together in completely arbitrary ways, but we shouldn't force isa/pci devices upon our users just because they are non-removable on real chip. I disagree. We don't want to deviate from the spec any more than we already do. The reason for wanting flexibility is because the code for the PIC or RTC, for example, can be used in other Super-IO chipsets or even standalone. If qemu only supported the 440FX chipset, we'd have no reason to make things flexible. 
So very concretely, I'm suggesting we do the following to target-i386: 1) make the i440fx device have an embedded ide controller, piix3, and usb controller that get initialized automatically. The piix3 embeds the PCI-to-ISA bridge along with all of the default ISA devices (rtc, serial, etc.). This may be a problem even from security point of view. What if usb code (ide, serial, parallel) has guest exploitable bug? Currently I can happily continue running guests if they do not need affected subsystem. If we'll get it your way I will no longer be able to do so. You can't just remove a device from a guest. You have to shut it down. When you power it back up, you may end up with different IRQ assignments or expose some guest bug. If you have a security issue in code that is exposed to the guest, you have to fix it. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] KVM call minutes for Feb 8
On Thu, Feb 10, 2011 at 11:19:48AM +0100, Anthony Liguori wrote: On 02/10/2011 11:10 AM, Gleb Natapov wrote: On Thu, Feb 10, 2011 at 11:00:50AM +0100, Anthony Liguori wrote: On 02/10/2011 10:07 AM, Gleb Natapov wrote: So what if it is easier, it doesn't mean it is correct thing to do. If we spend the next 10 years trying to do the correct thing for some arbitrary definition of correct, that's not terribly useful. Changing direction by 180 every 2 years even less useful. If we think through what we are doing and have a coherent architecture before changing direction, then we won't have this problem. I'd like to believe this :) It's really simple actually. Let's do the least clever thing and model how hardware actual works. Once we have that, we can try to be better than real hardware (if it's possible). I think out understanding on how HW actually works is very different. You are placing to much value on were device resides physically, for me it is completely unimportant detail. Not worth even mentioning. No, I place value on how things are modelled in the real world. Real world (physical HW) have consideration not relevant for our software emulation. Such as cost, physical dimension, power consumption and many other I am sure I missed. There simply aren't PC's out there that lack an RTC so I have no interest in jumping through hoops in QEMU to make it possible to do this without modifying QEMU code. It might sound nice to a developer but it's of absolutely no use to users. RTC is not good example. HPET suppose to replace it (and PIT too). AFAIC there are PCs without RTC already. Good example would be PIC or IOAPIC device and then I would agree with you that it is not worth it to make it possible to create x86 machine without them from command line if it means extra complexity. But how have you jumped from this to lets make usb mandatory? If all composition is done through a factory interface, it doesn't. 
But my main argument here is that we shouldn't try to make all composition done through a factory interface--only where it makes sense. So very concretely, I'm suggesting we do the following to target-i386: 1) make the i440fx device have an embedded ide controller, piix3, and usb controller that get initialized automatically. The piix3 embeds the PCI-to-ISA bridge along with all of the default ISA devices (rtc, serial, etc.). This may be a problem even from security point of view. What if usb code (ide, serial, parallel) has guest exploitable bug? Currently I can happily continue running guests if they do not need affected subsystem. If we'll get it your way I will no longer be able to do so. qemu -device i440fx,ide=off So you still need to support arbitrary composition. What's the difference? No, we don't. It's possible to have an 'rtc=off' option but I'm tremendously opposed to doing this. Arbitrary composition is not a useful goal IMHO. IMHO is different. We should support composition where it makes sense. For PIC-less x86 it doesn't make it. For usb-less or even ide-less it does. So why do you like -device i440fx over what we have now? Because I don't think tools like libvirt should be doing device composition to create an i440fx-like chipset. I think the current path we're on is pushing too much logic that belongs in QEMU into the management stack. I can agree with that. But from this it doesn't follow that we should get rid of composition. We shouldn't push composition of common HW to libvirt. Looking at libvirt command line I do not think we do it though. Typical libvirt command line specifies disks, networks, usb, vga. How -device i440fx will simplified that? Well usb could be omitted (but not -usbdevice table), disks are not property of i440fx so they will stay, since user may want to use virtio controller (which is not part of i440fx) this should stay too. 
Network obviously will have to be specified by libvirt too, vga may go to i440fx, but since libvirt supports qxl we will have to have a way to disable default vga and enable qxl instead. So will we really simplify libvirt's life by introducing -device i440fx? In current speak you propose will be implement by using i440fx machine type. Qdev will build it for you. If you had an i440fx machine type, that had no non-optional components added, and you could specify options to the machine type, yes. But I think you'll agree that there's no reason to not just treat the i440fx as a device. I do not agree. There is not such device as i440fx. This is just packaging. If you really care to do this. But this desire to remove devices is silly IMHO. Concerns about security are misplaced. If you have to change the way a guest is invoked in order to eliminate security problems, then there's something seriously wrong. No I do not. I do not create guest with unneeded devices from the beginning. There is very little that isn't 'unneeded'. That depends
Re: [Qemu-devel] KVM call minutes for Feb 8
On Thu, Feb 10, 2011 at 12:25:38PM +0200, Avi Kivity wrote: On 02/10/2011 11:07 AM, Gleb Natapov wrote: On Thu, Feb 10, 2011 at 08:47:12AM +0100, Anthony Liguori wrote: On 02/09/2011 09:15 PM, Blue Swirl wrote: On Wed, Feb 9, 2011 at 9:59 PM, Anthony Liguorianth...@codemonkey.ws wrote: On 02/09/2011 06:48 PM, Blue Swirl wrote: ISASerialState dev; isa_serial_init(dev, 0, 0x274, 0x07, NULL, NULL); Do you mean that there should be a generic way of doing that, like sysbus_create_varargs() for qdev, or just add inline functions which hide qdev property setup? I still think that FDT should be used in the future. That would require that the properties can be set up mechanically, and I don't see how your proposal would help that. Yeah, I don't think that is a good idea anymore. I think this is part of why we're having so many problems with qdev. While (most?) hardware hierarchies can be represented by device tree syntax, not all valid device trees correspond to interface and/or useful hardware hierarchies. User creates a non-working machine and so gets to fix the problems? How is that a problem for us? It's not about creating a non-working machine. It's about what user-level abstraction we need to provide. It's a whole lot easier to implement an i440fx device with a fixed set of parameters than it is to make every possible subdevice have a proper factory interface along with mechanisms to hook everything together. So what if it is easier, it doesn't mean it is correct thing to do. What you are proposing is just a huge step backwards. May be we shouldn't support hooking everything together in completely arbitrary ways, but we shouldn't force isa/pci devices upon our users just because they are non-removable on real chip. I disagree. We don't want to deviate from the spec any more than we already do. Which spec? Even in this discussion we completely mixed different things. 440FX is not a chipset. It is memory controller/pci host bridge. 
PIIX3/4 is the chipset which is just an arbitrary combination of devices put on the same chip. We do not deviate from spec when we implement those devices. The reason for wanting flexibility is because the code for the PIC or RTC, for example, can be used in other Super-IO chipsets or even standalone. If qemu only supported the 440FX chipset, we'd have no reason to make things flexible. Again you probably mean PIIX3. Even then removing unused ide will free one more PCI slot for my cool virtio disk array. The things is, from code point of view, it does not cost you extra to allow composition of ide since it is just a regular PCI device and we need to support composing those anyway. So very concretely, I'm suggesting we do the following to target-i386: 1) make the i440fx device have an embedded ide controller, piix3, and usb controller that get initialized automatically. The piix3 embeds the PCI-to-ISA bridge along with all of the default ISA devices (rtc, serial, etc.). This may be a problem even from security point of view. What if usb code (ide, serial, parallel) has guest exploitable bug? Currently I can happily continue running guests if they do not need affected subsystem. If we'll get it your way I will no longer be able to do so. You can't just remove a device from a guest. You have to shut it down. When you power it back up, you may end up with different IRQ assignments or expose some guest bug. As I answered to Anthony already I am not talking about changing HW configuration after guest is created rather about creating minimal HW setup for the task from the start. This means no soundcard or usb for Windows exchange server for instance. If you have a security issue in code that is exposed to the guest, you have to fix it. Of course. That is why it is a good idea to expose as little code to guest as possible. Don't you think so? -- Gleb.
[Qemu-devel] [PATCH 0.14/master v2 0/4] Error messages for unsupported image format features
With 0.15 we'll most likely get some incompatible image format extensions. This series prepares 0.14 to output more helpful messages if it stumbles over a too new image file.

Kevin Wolf (4):
  qerror: Add QERR_UNKNOWN_BLOCK_FORMAT_FEATURE
  qcow2: Report error for version > 2
  qed: Report error for unsupported features
  qemu-img: Improve error messages for failed bdrv_open

 block/qcow2.c |   13 +++--
 block/qed.c   |    9 -
 qemu-img.c    |   10 +++---
 qerror.c      |    5 +
 qerror.h      |    3 +++
 5 files changed, 34 insertions(+), 6 deletions(-)

--
1.7.2.3
[Qemu-devel] [PATCH v2 3/4] qed: Report error for unsupported features
Instead of just returning -ENOTSUP, generate a more detailed error.

Unfortunately we don't have a helpful text for features that we don't know yet, so just print the feature mask. It might be useful at least if someone asks for help.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qed.c |    9 -
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 3273448..75ae244 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -14,6 +14,7 @@

 #include "trace.h"
 #include "qed.h"
+#include "qerror.h"

 static void qed_aio_cancel(BlockDriverAIOCB *blockacb)
 {
@@ -311,7 +312,13 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags)
         return -EINVAL;
     }
     if (s->header.features & ~QED_FEATURE_MASK) {
-        return -ENOTSUP; /* image uses unsupported feature bits */
+        /* image uses unsupported feature bits */
+        char buf[64];
+        snprintf(buf, sizeof(buf), "%" PRIx64,
+            s->header.features & ~QED_FEATURE_MASK);
+        qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+            bs->device_name, "QED", buf);
+        return -ENOTSUP;
     }
     if (!qed_is_cluster_size_valid(s->header.cluster_size)) {
         return -EINVAL;
--
1.7.2.3
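The feature-mask check in this patch can be tried in isolation. The following is a minimal standalone sketch, not the QED driver code: KNOWN_FEATURE_MASK and format_unknown_features() are hypothetical names, and the mask value is made up for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: the feature bits this reader understands (bit 0 only). */
#define KNOWN_FEATURE_MASK UINT64_C(0x1)

/*
 * Returns the feature bits that are set in `features` but not covered
 * by KNOWN_FEATURE_MASK. If any are set, they are also written to `buf`
 * as a hexadecimal string; otherwise `buf` is left empty.
 */
static uint64_t format_unknown_features(uint64_t features, char *buf, size_t len)
{
    uint64_t unknown = features & ~KNOWN_FEATURE_MASK;

    if (len > 0) {
        buf[0] = '\0';
    }
    if (unknown) {
        /* PRIx64 expands to the right length modifier for uint64_t. */
        snprintf(buf, len, "%" PRIx64, unknown);
    }
    return unknown;
}
```

With this mask, a header carrying features 0x5 leaves bit 2 unknown, so the string handed to the error report would be "4".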
[Qemu-devel] [PATCH v2 1/4] qerror: Add QERR_UNKNOWN_BLOCK_FORMAT_FEATURE
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qerror.c |    5 +
 qerror.h |    3 +++
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/qerror.c b/qerror.c
index 9d0cdeb..4855604 100644
--- a/qerror.c
+++ b/qerror.c
@@ -201,6 +201,11 @@ static const QErrorStringTable qerror_table[] = {
         .desc      = "An undefined error has ocurred",
     },
     {
+        .error_fmt = QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+        .desc      = "'%(device)' uses a %(format) feature which is not "
+                     "supported by this qemu version: %(feature)",
+    },
+    {
         .error_fmt = QERR_VNC_SERVER_FAILED,
         .desc      = "Could not start VNC server on %(target)",
     },
diff --git a/qerror.h b/qerror.h
index b0f69da..f732d45 100644
--- a/qerror.h
+++ b/qerror.h
@@ -165,6 +165,9 @@ QError *qobject_to_qerror(const QObject *obj);
 #define QERR_UNDEFINED_ERROR \
     "{ 'class': 'UndefinedError', 'data': {} }"

+#define QERR_UNKNOWN_BLOCK_FORMAT_FEATURE \
+    "{ 'class': 'UnknownBlockFormatFeature', 'data': { 'device': %s, 'format': %s, 'feature': %s } }"
+
 #define QERR_VNC_SERVER_FAILED \
     "{ 'class': 'VNCServerFailed', 'data': { 'target': %s } }"

--
1.7.2.3
[Qemu-devel] [PATCH v2 2/4] qcow2: Report error for version > 2
The qcow2 driver is now declared responsible for any QCOW image that has version 2 or greater (before this, version 3 would be detected as raw).

For everything newer than version 2, an error is reported.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2.c |   13 +++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 551b3c2..75b8bec 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -28,6 +28,7 @@
 #include "aes.h"
 #include "block/qcow2.h"
 #include "qemu-error.h"
+#include "qerror.h"

 /*
   Differences with QCOW:
@@ -59,7 +60,7 @@ static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename)

     if (buf_size >= sizeof(QCowHeader) &&
         be32_to_cpu(cow_header->magic) == QCOW_MAGIC &&
-        be32_to_cpu(cow_header->version) == QCOW_VERSION)
+        be32_to_cpu(cow_header->version) >= QCOW_VERSION)
         return 100;
     else
         return 0;
@@ -163,10 +164,18 @@ static int qcow2_open(BlockDriverState *bs, int flags)
     be64_to_cpus(&header.snapshots_offset);
     be32_to_cpus(&header.nb_snapshots);

-    if (header.magic != QCOW_MAGIC || header.version != QCOW_VERSION) {
+    if (header.magic != QCOW_MAGIC) {
         ret = -EINVAL;
         goto fail;
     }
+    if (header.version != QCOW_VERSION) {
+        char version[64];
+        snprintf(version, sizeof(version), "QCOW version %d", header.version);
+        qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+            bs->device_name, "qcow2", version);
+        ret = -ENOTSUP;
+        goto fail;
+    }
     if (header.cluster_bits < MIN_CLUSTER_BITS ||
         header.cluster_bits > MAX_CLUSTER_BITS) {
         ret = -EINVAL;
--
1.7.2.3
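The split between probing and opening can be sketched outside of QEMU. This is a simplified standalone illustration, not the driver code: probe_score() and check_header() are hypothetical names, and only the first eight header bytes (big-endian magic and version) are examined.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define QCOW_MAGIC   0x514649fbu  /* 'Q', 'F', 'I', 0xfb */
#define QCOW_VERSION 2u

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Probe step: claim any image with the QCOW magic and version >= 2. */
static int probe_score(const uint8_t *buf, size_t len)
{
    if (len >= 8 && load_be32(buf) == QCOW_MAGIC &&
        load_be32(buf + 4) >= QCOW_VERSION) {
        return 100;
    }
    return 0;
}

/* Open step: bad magic is invalid; any version other than 2 is unsupported. */
static int check_header(const uint8_t *buf, size_t len)
{
    if (len < 8 || load_be32(buf) != QCOW_MAGIC) {
        return -EINVAL;
    }
    if (load_be32(buf + 4) != QCOW_VERSION) {
        return -ENOTSUP;
    }
    return 0;
}
```

Claiming newer versions during the probe is what lets the open step produce a specific "unsupported" error instead of the image silently falling through to the raw driver.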
[Qemu-devel] [PATCH v2 4/4] qemu-img: Improve error messages for failed bdrv_open
Output the error message string of the bdrv_open return code. Also set a non-empty device name for the images because the unknown feature error message includes it.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-img.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 4a37358..7e3cc4c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -213,8 +213,9 @@ static BlockDriverState *bdrv_new_open(const char *filename,
     BlockDriverState *bs;
     BlockDriver *drv;
     char password[256];
+    int ret;

-    bs = bdrv_new("");
+    bs = bdrv_new("image");

     if (fmt) {
         drv = bdrv_find_format(fmt);
@@ -225,10 +226,13 @@ static BlockDriverState *bdrv_new_open(const char *filename,
     } else {
         drv = NULL;
     }
-    if (bdrv_open(bs, filename, flags, drv) < 0) {
-        error_report("Could not open '%s'", filename);
+
+    ret = bdrv_open(bs, filename, flags, drv);
+    if (ret < 0) {
+        error_report("Could not open '%s': %s", filename, strerror(-ret));
         goto fail;
     }
+
     if (bdrv_is_encrypted(bs)) {
         printf("Disk image '%s' is encrypted.\n", filename);
         if (read_password(password, sizeof(password)) < 0) {
--
1.7.2.3
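The convention the patch relies on (a negative errno value as the return code, turned into text with strerror(-ret)) can be shown with a small standalone sketch. fake_open() and open_and_describe() are hypothetical stand-ins invented here; they are not part of qemu-img.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an open routine that fails with -errno. */
static int fake_open(const char *filename)
{
    (void)filename;
    return -ENOENT;
}

/*
 * Calls fake_open() and, on failure, formats a message that includes
 * the reason derived from the negative errno. Returns the open result.
 */
static int open_and_describe(const char *filename, char *out, size_t len)
{
    int ret = fake_open(filename);

    if (ret < 0) {
        /* Negate the return code to get the positive errno value. */
        snprintf(out, len, "Could not open '%s': %s", filename, strerror(-ret));
    } else if (len > 0) {
        out[0] = '\0';
    }
    return ret;
}
```

The exact wording after the colon comes from the C library, so it can differ between systems.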
[Qemu-devel] [PATCH v3 0/6] target-arm: Fix floating point conversions
This patchset fixes two issues:
 * default_nan_mode not being honoured for float-to-float conversions
 * half precision conversions being broken in a number of ways as well as not handling default_nan_mode.

With this patchset qemu passes random-instruction-selection tests for VCVT.F32.F16, VCVT.F16.F32, VCVTB and VCVTT, in both IEEE and non-IEEE modes, with and without default-NaN behaviour.

Christophe: this patchset includes your softfloat v3 patch, although I have split it up a little to keep the float16 bits separate.

Changes since v2:
 * added STRUCT_TYPES version of float16 and fixed various places which needed a make_float16()/float16_val() in order to compile with STRUCT_TYPES enabled
 * s/bits16/float16/ in patch 3 as suggested by Aurelien
 * fixed the types in the f16-related ARM helper wrappers in patch 6

Patch 2 is unchanged and so I've added Aurelien's reviewed-by signoff; the others all changed, although mostly in minor ways.

(Compiling with STRUCT_TYPES enabled also needs some fixes to existing float32/float64 code; I'll send a separate patchset for that.)

Christophe Lyon (1):
  softfloat: Honour default_nan_mode for float-to-float conversions

Peter Maydell (5):
  softfloat: Add float16 type and float16 NaN handling functions
  softfloat: Fix single-to-half precision float conversions
  softfloat: Correctly handle NaNs in float16_to_float32()
  target-arm: Silence NaNs resulting from half-precision conversions
  target-arm: Use standard FPSCR for Neon half-precision operations

 fpu/softfloat-specialize.h |  130 ++--
 fpu/softfloat.c            |  100 ++
 fpu/softfloat.h            |   19 ++-
 target-arm/helper.c        |   38 +++--
 target-arm/helpers.h       |    2 +
 target-arm/translate.c     |   16 +++---
 6 files changed, 251 insertions(+), 54 deletions(-)
Re: [Qemu-devel] KVM call minutes for Feb 8
On Thu, Feb 10, 2011 at 10:38:53AM +, Peter Maydell wrote:
> This is the system diagram for the Versatile Express:
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0447d/I1007683.html
> I don't know what you'd want to claim is a northbridge there. Basically there's an FPGA with a pile of devices in it, and there's a test chip with the core and some other devices in it. But from a modelling perspective this is all completely irrelevant because regardless of where the hardware designer put the devices, they're just devices at a particular point in the memory map and with a particular set of interrupt wiring and so on. I don't see the point in modelling a concept that has no user-visible effects and doesn't actually make the model any clearer or simpler.

Exactly. This is really the same with x86. The fact that some company put several devices on the same chip and gave it commercial name shouldn't govern our design.

> > A machine today is basically the northbridge, southbridge, plus a bunch of default components to make the virtual hardware useful.
>
> This doesn't really correspond to ARM boards I've looked at, by and large (for instance there's no mention of the word northbridge in the whole 3700 page OMAP3 TRM). PCs may be best modelled that way, sure, but I don't think you can cram everything into that mould.

Even on x86 this model is falling apart. Memory controller moves to cpu. PCI controller will follow.

> > > If you mean that you want machines to be implemented under the hood as a single huge device you can only have one of that spans the entire memory map, well I guess that's an implementation detail. But conceptually machines really do exist, and we definitely still want users to be able to say I want a beagle machine; I want a versatile; I want an n900.
> >
> > An n900 is a very specific hardware configuration that is best represented by some sort of configuration file vs. something hard coded in QEMU.
>
> Yes, that's the whole point -- machine == specific hardware configuration. That's not getting rid of machine, it's just saying we should have some custom scripting language to define them rather than doing them in C. You still want, fundamentally, to be able to say
> qemu-system-arm -M machinename

+1

--
Gleb.
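To make the "machine == specific hardware configuration" idea concrete, here is a small sketch of a machine described purely as data: a CPU name plus a list of devices with addresses and interrupt lines, looked up by name. Everything in it is hypothetical (the struct layout, the board name, the addresses and IRQ numbers); it is not how QEMU defines machines, only an illustration of the kind of table a configuration file could be loaded into.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One device instance on a board: which model, where it is mapped, which IRQ. */
struct device_desc {
    const char *model;
    uint64_t base;
    int irq;
};

/* A machine is just a name, a CPU, and a fixed list of devices. */
struct machine_desc {
    const char *name;
    const char *cpu;
    const struct device_desc *devices;
    size_t num_devices;
};

/* Hypothetical board; addresses and IRQ numbers are made up. */
static const struct device_desc demo_devices[] = {
    { "uart", 0x10000000, 5 },
    { "mmc",  0x10001000, 6 },
};

static const struct machine_desc machines[] = {
    { "demo-board", "arm926", demo_devices,
      sizeof(demo_devices) / sizeof(demo_devices[0]) },
};

/* Look a machine up by name, as a "-M machinename" option would. */
static const struct machine_desc *find_machine(const char *name)
{
    for (size_t i = 0; i < sizeof(machines) / sizeof(machines[0]); i++) {
        if (strcmp(machines[i].name, name) == 0) {
            return &machines[i];
        }
    }
    return NULL;
}
```

Whether such a table is written in C or parsed from a file, the user-visible concept stays the same: a named, specific hardware configuration.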
[Qemu-devel] [PATCH v3 5/6] target-arm: Silence NaNs resulting from half-precision conversions
Silence the NaNs that may result from half-precision conversion, as we do for the other conversions.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/helper.c |   12 ++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index d29c42b..e427747 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2627,14 +2627,22 @@ float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, CPUState *env)
 {
     float_status *s = &env->vfp.fp_status;
     int ieee = (env->vfp.xregs[ARM_VFP_FPSCR] & (1 << 26)) == 0;
-    return float16_to_float32(make_float16(a), ieee, s);
+    float32 r = float16_to_float32(make_float16(a), ieee, s);
+    if (ieee) {
+        return float32_maybe_silence_nan(r);
+    }
+    return r;
 }

 uint32_t HELPER(vfp_fcvt_f32_to_f16)(float32 a, CPUState *env)
 {
     float_status *s = &env->vfp.fp_status;
     int ieee = (env->vfp.xregs[ARM_VFP_FPSCR] & (1 << 26)) == 0;
-    return float16_val(float32_to_float16(a, ieee, s));
+    float16 r = float32_to_float16(a, ieee, s);
+    if (ieee) {
+        r = float16_maybe_silence_nan(r);
+    }
+    return float16_val(r);
 }

 float32 HELPER(recps_f32)(float32 a, float32 b, CPUState *env)
--
1.7.1
[Qemu-devel] [PATCH v3 4/6] softfloat: Correctly handle NaNs in float16_to_float32()
Correctly handle NaNs in float16_to_float32(), by defining and using a float16ToCommonNaN() function, as we do with the other formats.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 fpu/softfloat-specialize.h |   17 +
 fpu/softfloat.c            |    4 +---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h
index 1c0b12b..2d025bf 100644
--- a/fpu/softfloat-specialize.h
+++ b/fpu/softfloat-specialize.h
@@ -120,6 +120,23 @@ float16 float16_maybe_silence_nan(float16 a_)
 }

 /*----------------------------------------------------------------------------
+| Returns the result of converting the half-precision floating-point NaN
+| `a' to the canonical NaN format.  If `a' is a signaling NaN, the invalid
+| exception is raised.
+*----------------------------------------------------------------------------*/
+
+static commonNaNT float16ToCommonNaN( float16 a STATUS_PARAM )
+{
+    commonNaNT z;
+
+    if ( float16_is_signaling_nan( a ) ) float_raise( float_flag_invalid STATUS_VAR );
+    z.sign = float16_val(a) >> 15;
+    z.low = 0;
+    z.high = ((bits64) float16_val(a))<<54;
+    return z;
+}
+
+/*----------------------------------------------------------------------------
 | Returns the result of converting the canonical NaN `a' to the half-
 | precision floating-point format.
 *----------------------------------------------------------------------------*/
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 80d8cc4..3abd170 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2761,9 +2761,7 @@ float32 float16_to_float32(float16 a, flag ieee STATUS_PARAM)

     if (aExp == 0x1f && ieee) {
         if (aSig) {
-            /* Make sure correct exceptions are raised.  */
-            float32ToCommonNaN(a STATUS_VAR);
-            aSig |= 0x200;
+            return commonNaNToFloat32(float16ToCommonNaN(a STATUS_VAR) STATUS_VAR);
         }
         return packFloat32(aSign, 0xff, aSig << 13);
     }
--
1.7.1
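The bit manipulation behind half-precision NaN handling can be sketched on raw bit patterns. This is a standalone illustration rather than softfloat code: the f16_* helper names are hypothetical, exception flags are not modelled, and it assumes the IEEE/ARM convention in which the top fraction bit (bit 9 of a half-precision value) marks a quiet NaN.

```c
#include <stdint.h>

/* Half precision: 1 sign bit, 5 exponent bits, 10 fraction bits. */

/* NaN: exponent all ones and a non-zero fraction. */
static int f16_is_nan(uint16_t a)
{
    return (a & 0x7c00) == 0x7c00 && (a & 0x03ff) != 0;
}

/* Signaling NaN: a NaN whose top fraction bit (bit 9) is clear. */
static int f16_is_signaling_nan(uint16_t a)
{
    return f16_is_nan(a) && (a & 0x0200) == 0;
}

/* Silence a signaling NaN by setting the quiet bit; leave other values alone. */
static uint16_t f16_silence_nan(uint16_t a)
{
    return f16_is_signaling_nan(a) ? (uint16_t)(a | 0x0200) : a;
}

/*
 * Widen a half-precision NaN to single-precision bits: keep the sign,
 * force a quiet NaN (exponent 0xff, top fraction bit set), and move the
 * 10-bit payload to the top of the 23-bit fraction.
 */
static uint32_t f16_nan_to_f32_bits(uint16_t a)
{
    uint32_t sign = (uint32_t)(a >> 15) << 31;
    uint32_t payload = (uint32_t)(a & 0x03ff) << 13;

    return sign | 0x7fc00000u | payload;
}
```

For example, the signaling NaN 0x7d00 is silenced to 0x7f00, and widening it gives the single-precision pattern 0x7fe00000.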
[Qemu-devel] Re: [PATCH v2 3/4] qed: Report error for unsupported features
On Thu, Feb 10, 2011 at 11:18 AM, Kevin Wolf kw...@redhat.com wrote: Instead of just returning -ENOTSUP, generate a more detailed error. Unfortunately we don't have a helpful text for features that we don't know yet, so just print the feature mask. It might be useful at least if someone asks for help. Signed-off-by: Kevin Wolf kw...@redhat.com --- block/qed.c | 9 - 1 files changed, 8 insertions(+), 1 deletions(-) Thanks! Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Re: [Qemu-devel] KVM call minutes for Feb 8
On Thu, Feb 10, 2011 at 01:47:06PM +0100, Anthony Liguori wrote: On 02/10/2011 11:49 AM, Gleb Natapov wrote: On Thu, Feb 10, 2011 at 11:19:48AM +0100, Anthony Liguori wrote: On 02/10/2011 11:10 AM, Gleb Natapov wrote: On Thu, Feb 10, 2011 at 11:00:50AM +0100, Anthony Liguori wrote: On 02/10/2011 10:07 AM, Gleb Natapov wrote: So what if it is easier, it doesn't mean it is the correct thing to do. If we spend the next 10 years trying to do the correct thing for some arbitrary definition of correct, that's not terribly useful. Changing direction by 180 degrees every 2 years is even less useful. If we think through what we are doing and have a coherent architecture before changing direction, then we won't have this problem. I'd like to believe this :) It's really simple actually. Let's do the least clever thing and model how hardware actually works. Once we have that, we can try to be better than real hardware (if it's possible). I think our understanding of how HW actually works is very different. You are placing too much value on where a device resides physically; for me it is a completely unimportant detail, not even worth mentioning. No, I place value on how things are modelled in the real world. The real world (physical HW) has considerations that are not relevant for our software emulation, such as cost, physical dimensions, power consumption and many others I am sure I missed. There simply aren't PCs out there that lack an RTC so I have no interest in jumping through hoops in QEMU to make it possible to do this without modifying QEMU code. It might sound nice to a developer but it's of absolutely no use to users. RTC is not a good example. HPET is supposed to replace it (and the PIT too). HPETs embed RTCs to provide support for legacy implementations. This is an extremely good example of where our modelling breaks down. Take a close look at how the HPET and RTC emulations interact for an example of why we'd be much better off just implementing an RTC within an HPET. 
Yes, the HPET can provide legacy RTC timer functionality. No, I do not see why we should implement the RTC within the HPET. In your model we should remove the HPET code completely since the HPET is not present in the chipset emulated by QEMU. AFAIC there are PCs without an RTC already. The RTC also provides CMOS functionality and no PC can boot without CMOS. So no, there's nothing we'd consider a PC today that doesn't have an RTC. CMOS may be present even if RTC functionality is absent. Does an EFI-based machine still need CMOS though? A good example would be the PIC or IOAPIC device, and then I would agree with you that it is not worth making it possible to create an x86 machine without them from the command line if it means extra complexity. But how have you jumped from this to "let's make USB mandatory"? USB is mandatory in the PIIX3 but the only significant difference between the piix2 and piix3 is the addition of USB. Consequently, the main difference between an i440fx and i440bx is the use of a piix2 vs. a piix3. So if you really want to create the same PC we have today w/o USB, the right way to do it would be to have: -device i440,model=fx // with USB -device i440,model=bx // w/o USB Why not qemu -config piix2.cfg or qemu -config piix3.cfg? No need to make data into code. No, we don't. It's possible to have an 'rtc=off' option but I'm tremendously opposed to doing this. Arbitrary composition is not a useful goal IMHO. My opinion is different. We should support composition where it makes sense. For a PIC-less x86 it doesn't. For USB-less or even IDE-less it does. The right way to do a USB-less PC is to have an option to create an i440bx. Why is this the right way? An IDE-less PC is a bit more difficult because IDE is really baked into the concept of a PC. Chances are, there are more than a few guests out there that would have issues from there being no IDE bus present. None of my modern PCs have IDE. Many high-end PCs had SCSI instead of IDE in the past. 
If a guest can't run without IDE, you do not run it without IDE. So why do you like -device i440fx over what we have now? Because I don't think tools like libvirt should be doing device composition to create an i440fx-like chipset. I think the current path we're on is pushing too much logic that belongs in QEMU into the management stack. I can agree with that. But from this it doesn't follow that we should get rid of composition. We shouldn't push composition of common HW to libvirt. Looking at the libvirt command line I do not think we do, though. A typical libvirt command line specifies disks, networks, usb, vga. How will -device i440fx simplify that? Well, usb could be omitted (but not -usbdevice tablet), disks are not a property of i440fx so they will stay, and since the user may want to use a virtio controller (which is not part of i440fx) this should stay too. Network obviously will have to be specified by libvirt too, vga may go to i440fx, but since libvirt
Re: [Qemu-devel] KVM call minutes for Feb 8
On Thu, Feb 10, 2011 at 03:00:05PM +0200, Avi Kivity wrote: On 02/10/2011 02:51 PM, Anthony Liguori wrote: On 02/10/2011 12:13 PM, Gleb Natapov wrote: Which spec? Even in this discussion we completely mixed different things. 440FX is not a chipset. Yes, it is. It's a single silicon package with a defined pinout. If you don't believe me, re-read the spec. It's a MCM with the PIIX3 being internally connected. The connection between the i440fx and PIIX3 happens to be PCI but that's not always the case. Sometimes it's a proprietary bus. Aren't they two distinct chips, together comprising the chip-set? One (the northbridge) converts the system bus to PCI + some extra wires, the other (southbridge) bridges PCI to ISA and contains some embedded ISA devices. IIRC there are some wires between them that are not PCI. Yeah, 440fx is probably northbridge and PIIX3 southbridge. -- Gleb.
[Qemu-devel] [PATCH 0/2] softfloat: fix USE_SOFTFLOAT_STRUCT_TYPES compile failures
This patchset fixes some compilation failures which happen if you try to enable softfloat's USE_SOFTFLOAT_STRUCT_TYPES type-error-debugging switch.

This patchset leaves one error in float16_to_float32, because that is fixed in passing by my half-precision patchset, and I saw no point in deliberately creating a patch conflict.

I've only fixed the problems with core softfloat and the ARM targets; maintainers of other targets can fix their platforms if they think it's worth doing.

Peter Maydell (2):
  softfloat: Fix compilation failures with USE_SOFTFLOAT_STRUCT_TYPES
  linux-user/arm: fix compilation failures using softfloat's struct types

 fpu/softfloat.c                   |   30 +++++++++++++++---------------
 fpu/softfloat.h                   |    4 ++++
 linux-user/arm/nwfpe/fpa11_cpdt.c |    2 +-
 linux-user/arm/nwfpe/fpopcode.c   |   32 ++++++++++++++++----------------
 linux-user/signal.c               |    4 ++--
 5 files changed, 38 insertions(+), 34 deletions(-)
[Qemu-devel] [PATCH 1/2] softfloat: Fix compilation failures with USE_SOFTFLOAT_STRUCT_TYPES
Make softfloat compile with USE_SOFTFLOAT_STRUCT_TYPES defined, by adding and using new macros const_float32() and const_float64() so you can use array initializers in an array of float32/float64 whether the types are bare or wrapped in the structs.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 fpu/softfloat.c |   30 +++++++++++++++---------------
 fpu/softfloat.h |    4 ++++
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 17842f4..8de887d 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2172,21 +2172,21 @@ float32 float32_sqrt( float32 a STATUS_PARAM )
 
 static const float64 float32_exp2_coefficients[15] =
 {
-    make_float64( 0x3ff0000000000000ll ), /*  1 */
-    make_float64( 0x3fe0000000000000ll ), /*  2 */
-    make_float64( 0x3fc5555555555555ll ), /*  3 */
-    make_float64( 0x3fa5555555555555ll ), /*  4 */
-    make_float64( 0x3f81111111111111ll ), /*  5 */
-    make_float64( 0x3f56c16c16c16c17ll ), /*  6 */
-    make_float64( 0x3f2a01a01a01a01all ), /*  7 */
-    make_float64( 0x3efa01a01a01a01all ), /*  8 */
-    make_float64( 0x3ec71de3a556c734ll ), /*  9 */
-    make_float64( 0x3e927e4fb7789f5cll ), /* 10 */
-    make_float64( 0x3e5ae64567f544e4ll ), /* 11 */
-    make_float64( 0x3e21eed8eff8d898ll ), /* 12 */
-    make_float64( 0x3de6124613a86d09ll ), /* 13 */
-    make_float64( 0x3da93974a8c07c9dll ), /* 14 */
-    make_float64( 0x3d6ae7f3e733b81fll ), /* 15 */
+    const_float64( 0x3ff0000000000000ll ), /*  1 */
+    const_float64( 0x3fe0000000000000ll ), /*  2 */
+    const_float64( 0x3fc5555555555555ll ), /*  3 */
+    const_float64( 0x3fa5555555555555ll ), /*  4 */
+    const_float64( 0x3f81111111111111ll ), /*  5 */
+    const_float64( 0x3f56c16c16c16c17ll ), /*  6 */
+    const_float64( 0x3f2a01a01a01a01all ), /*  7 */
+    const_float64( 0x3efa01a01a01a01all ), /*  8 */
+    const_float64( 0x3ec71de3a556c734ll ), /*  9 */
+    const_float64( 0x3e927e4fb7789f5cll ), /* 10 */
+    const_float64( 0x3e5ae64567f544e4ll ), /* 11 */
+    const_float64( 0x3e21eed8eff8d898ll ), /* 12 */
+    const_float64( 0x3de6124613a86d09ll ), /* 13 */
+    const_float64( 0x3da93974a8c07c9dll ), /* 14 */
+    const_float64( 0x3d6ae7f3e733b81fll ), /* 15 */
 };
 
 float32 float32_exp2( float32 a STATUS_PARAM )
diff --git a/fpu/softfloat.h b/fpu/softfloat.h
index 4a5345c..aaf6afc 100644
--- a/fpu/softfloat.h
+++ b/fpu/softfloat.h
@@ -125,11 +125,13 @@ typedef struct {
 /* The cast ensures an error if the wrong type is passed.  */
 #define float32_val(x) (((float32)(x)).v)
 #define make_float32(x) __extension__ ({ float32 f32_val = {x}; f32_val; })
+#define const_float32(x) { x }
 typedef struct {
     uint64_t v;
 } float64;
 #define float64_val(x) (((float64)(x)).v)
 #define make_float64(x) __extension__ ({ float64 f64_val = {x}; f64_val; })
+#define const_float64(x) { x }
 #else
 typedef uint32_t float32;
 typedef uint64_t float64;
@@ -137,6 +139,8 @@ typedef uint64_t float64;
 #define float64_val(x) (x)
 #define make_float32(x) (x)
 #define make_float64(x) (x)
+#define const_float32(x) x
+#define const_float64(x) x
 #endif
 #ifdef FLOATX80
 typedef struct {
-- 
1.7.1
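As a rough illustration of why a separate const_float64() macro is needed, the sketch below wraps the value in a struct and initializes a file-scope array with a brace initializer. A GNU statement expression like the one in make_float64() is not a constant expression, so it cannot appear in a static initializer, whereas "{ x }" can. The float64_val() here is simplified (no type-checking cast), float64_to_host() is a hypothetical helper, and the sketch assumes the host double is IEEE 754 binary64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Struct-wrapped type, as with USE_SOFTFLOAT_STRUCT_TYPES defined. */
typedef struct {
    uint64_t v;
} float64;

/* Simplified unboxing macro for this sketch. */
#define float64_val(x)   ((x).v)
/* Brace initializer: usable in a static array initializer. */
#define const_float64(x) { x }

static const float64 exp2_coefficients[3] = {
    const_float64(0x3ff0000000000000ULL), /* 1/1! = 1.0 */
    const_float64(0x3fe0000000000000ULL), /* 1/2! = 0.5 */
    const_float64(0x3fc5555555555555ULL), /* 1/3!, roughly 0.1667 */
};

/* Hypothetical helper: reinterpret the stored bits as a host double. */
static double float64_to_host(float64 f)
{
    uint64_t bits = float64_val(f);
    double d;
    memcpy(&d, &bits, sizeof(d));
    return d;
}
```

When the types are bare integers instead, defining const_float64(x) as plain x gives the same array initializer without the inner braces, which is the point of having both variants in the header.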
[Qemu-devel] [PATCH 2/2] linux-user/arm: fix compilation failures using softfloat's struct types
Add uses of the float32/float64 boxing and unboxing macros so that the ARM linux-user targets will compile with USE_SOFTFLOAT_STRUCT_TYPES enabled.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 linux-user/arm/nwfpe/fpa11_cpdt.c |    2 +-
 linux-user/arm/nwfpe/fpopcode.c   |   32 ++++++++++++++++----------------
 linux-user/signal.c               |    4 ++--
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/linux-user/arm/nwfpe/fpa11_cpdt.c b/linux-user/arm/nwfpe/fpa11_cpdt.c
index 1346fd6..b12e27d 100644
--- a/linux-user/arm/nwfpe/fpa11_cpdt.c
+++ b/linux-user/arm/nwfpe/fpa11_cpdt.c
@@ -33,7 +33,7 @@ void loadSingle(const unsigned int Fn, target_ulong addr)
    FPA11 *fpa11 = GET_FPA11();
    fpa11->fType[Fn] = typeSingle;
    /* FIXME - handle failure of get_user() */
-   get_user_u32(fpa11->fpreg[Fn].fSingle, addr);
+   get_user_u32(float32_val(fpa11->fpreg[Fn].fSingle), addr);
 }
 
 static inline
diff --git a/linux-user/arm/nwfpe/fpopcode.c b/linux-user/arm/nwfpe/fpopcode.c
index 240061d..82ac92f 100644
--- a/linux-user/arm/nwfpe/fpopcode.c
+++ b/linux-user/arm/nwfpe/fpopcode.c
@@ -37,25 +37,25 @@ const floatx80 floatx80Constant[] = {
 };
 
 const float64 float64Constant[] = {
-  0x0000000000000000ULL,      /* double 0.0 */
-  0x3ff0000000000000ULL,      /* double 1.0 */
-  0x4000000000000000ULL,      /* double 2.0 */
-  0x4008000000000000ULL,      /* double 3.0 */
-  0x4010000000000000ULL,      /* double 4.0 */
-  0x4014000000000000ULL,      /* double 5.0 */
-  0x3fe0000000000000ULL,      /* double 0.5 */
-  0x4024000000000000ULL       /* double 10.0 */
+  const_float64(0x0000000000000000ULL),       /* double 0.0 */
+  const_float64(0x3ff0000000000000ULL),       /* double 1.0 */
+  const_float64(0x4000000000000000ULL),       /* double 2.0 */
+  const_float64(0x4008000000000000ULL),       /* double 3.0 */
+  const_float64(0x4010000000000000ULL),       /* double 4.0 */
+  const_float64(0x4014000000000000ULL),       /* double 5.0 */
+  const_float64(0x3fe0000000000000ULL),       /* double 0.5 */
+  const_float64(0x4024000000000000ULL)        /* double 10.0 */
 };
 
 const float32 float32Constant[] = {
-  0x00000000,                 /* single 0.0 */
-  0x3f800000,                 /* single 1.0 */
-  0x40000000,                 /* single 2.0 */
-  0x40400000,                 /* single 3.0 */
-  0x40800000,                 /* single 4.0 */
-  0x40a00000,                 /* single 5.0 */
-  0x3f000000,                 /* single 0.5 */
-  0x41200000                  /* single 10.0 */
+  const_float32(0x00000000),                  /* single 0.0 */
+  const_float32(0x3f800000),                  /* single 1.0 */
+  const_float32(0x40000000),                  /* single 2.0 */
+  const_float32(0x40400000),                  /* single 3.0 */
+  const_float32(0x40800000),                  /* single 4.0 */
+  const_float32(0x40a00000),                  /* single 5.0 */
+  const_float32(0x3f000000),                  /* single 0.5 */
+  const_float32(0x41200000)                   /* single 10.0 */
 };
 
 unsigned int getRegisterCount(const unsigned int opcode)
diff --git a/linux-user/signal.c b/linux-user/signal.c
index b01bd64..ce033e9 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -1299,7 +1299,7 @@ static abi_ulong *setup_sigframe_v2_vfp(abi_ulong *regspace, CPUState *env)
     __put_user(TARGET_VFP_MAGIC, &vfpframe->magic);
     __put_user(sizeof(*vfpframe), &vfpframe->size);
     for (i = 0; i < 32; i++) {
-        __put_user(env->vfp.regs[i], &vfpframe->ufp.fpregs[i]);
+        __put_user(float64_val(env->vfp.regs[i]), &vfpframe->ufp.fpregs[i]);
     }
     __put_user(vfp_get_fpscr(env), &vfpframe->ufp.fpscr);
     __put_user(env->vfp.xregs[ARM_VFP_FPEXC], &vfpframe->ufp_exc.fpexc);
@@ -1588,7 +1588,7 @@ static abi_ulong *restore_sigframe_v2_vfp(CPUState *env, abi_ulong *regspace)
         return 0;
     }
     for (i = 0; i < 32; i++) {
-        __get_user(env->vfp.regs[i], &vfpframe->ufp.fpregs[i]);
+        __get_user(float64_val(env->vfp.regs[i]), &vfpframe->ufp.fpregs[i]);
     }
     __get_user(fpscr, &vfpframe->ufp.fpscr);
     vfp_set_fpscr(env, fpscr);
-- 
1.7.1
Re: [Qemu-devel] [PATCH v3] Fix ATA SMART and CHECK POWER MODE
On Wed, 2011-02-09 at 17:22 -0600, Ryan Harper wrote: * Brian Wheeler bdwhe...@indiana.edu [2011-02-09 16:13]: This patch fixes two things: 1) CHECK POWER MODE The error return value wasn't always zero, so it would show up as offline. Error is now explicitly set to zero. 2) SMART The smart values that were returned were invalid and tools like skdump would not recognize that the smart data was actually valid and would dump weird output. The data has been fixed up and raw value support was added. Tools like skdump and palimpsest work as expected. v3 changes: don't reformat code I didn't change v2 changes: use single structure instead of one for thresholds and one for data. Signed-off-by: bdwhe...@indiana.edu diff --git a/hw/ide/core.c b/hw/ide/core.c index dd63664..b0b0b35 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -34,13 +34,26 @@ #include hw/ide/internal.h -static const int smart_attributes[][5] = { -/* id, flags, val, wrst, thrsh */ -{ 0x01, 0x03, 0x64, 0x64, 0x06}, /* raw read */ -{ 0x03, 0x03, 0x64, 0x64, 0x46}, /* spin up */ -{ 0x04, 0x02, 0x64, 0x64, 0x14}, /* start stop count */ -{ 0x05, 0x03, 0x64, 0x64, 0x36}, /* remapped sectors */ -{ 0x00, 0x00, 0x00, 0x00, 0x00} +/* These values were taking from a running system, specifically a + Seagate ST3500418AS */ These values ought to have meaning for your hardware, but won't for either the virtual disk, nor the underlying storage that the virtual disk is running on. Since we're not attempting to pass any of that info, nor keep it in-sync, it probably doesn't matter that much that we're just copying device specific data. I'm open to discussion on how much we care about the attribute values[1]. 1. https://secure.wikimedia.org/wikipedia/en/wiki/S.M.A.R.T.#ATA_S.M.A.R.T._attributes The main reason for this patch was to make sure the disk tools and smartd on linux were happy and returned reasonable values. 
At some point I may add on the ability to trigger a smart failure (by jumping the sectors remapped or something) but for now its read only and not really meaningful. +static const int smart_attributes[][12] = { +/* id, flags, hflags, val, wrst, raw (6 bytes), threshold */ +/* raw read error rate*/ +{ 0x01, 0x03, 0x00, 0x74, 0x63, 0x31, 0x6d, 0x3f, 0x0d, 0x00, 0x00, 0x06}, probably fine, but this is vendor hardware specific. I can't think of a better number other than 0. I've set it to zero. +/* spin up */ +{ 0x03, 0x03, 0x00, 0x61, 0x61, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}, default is probably fine as well, though it's dependent upon the hardware as well. Could be zero as well. I've set it to 16ms so skdump returns something other than 'n/a' +/* start stop count */ +{ 0x04, 0x02, 0x00, 0x64, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x14}, depends on hardware and power mgmt, any count is probably fine. +/* remapped sectors */ +{ 0x05, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x24}, Probably should be zero. When dumping it via skdump or smartctl out it reads as 0 sectors remapped (as indicated by the raw value). The value looks like its a countdown of sectors remaining, so setting it to the 'worst' value is equivalent to no sectors remapped. +/* power on hours */ +{ 0x09, 0x03, 0x00, 0x61, 0x61, 0x68, 0x0a, 0x00, 0x00, 0x00, 0x00, 0x00}, Zero. I'm going to set it to 1 (hour) so skdump returns something other than 'n/a' +/* power cycle count */ +{ 0x0c, 0x03, 0x00, 0x64, 0x64, 0x32, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}, Zero I've set it to zero. +/* airflow-temperature-celsius */ +{ 190, 0x03, 0x00, 0x64, 0x64, 0x1f, 0x00, 0x16, 0x22, 0x00, 0x00, 0x32}, Something resonably ambient 20-30C, current value is probably fine. it reads at 31.0C. I've set the value (and worst)so it matches the raw value. (100C - 31C = 69C (0x45)). 
I've also adjusted the raw value so it shows the Min/Max is 31C +/* end of list */ +{ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00} }; /* XXX: DVDs that could fit on a CD will be reported as a CD */ @@ -1843,6 +1856,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val) break; case WIN_CHECKPOWERMODE1: case WIN_CHECKPOWERMODE2: +s-error = 0; s-nsector = 0xff; /* device active or idle */ s-status = READY_STAT | SEEK_STAT; ide_set_irq(s-bus); @@ -2097,7 +2111,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val) if (smart_attributes[n][0] == 0) break; s-io_buffer[2+0+(n*12)] = smart_attributes[n][0]; - s-io_buffer[2+1+(n*12)] = smart_attributes[n][4]; + s-io_buffer[2+1+(n*12)] =
[Qemu-devel] [PATCH v4] Fix ATA SMART and CHECK POWER MODE
This patch fixes two things:

1) CHECK POWER MODE

The error return value wasn't always zero, so it would show up as offline. Error is now explicitly set to zero.

2) SMART

The smart values that were returned were invalid and tools like skdump would not recognize that the smart data was actually valid and would dump weird output. The data has been fixed up and raw value support was added. Tools like skdump and palimpsest work as expected.

v4 changes: incorporate changes from Ryan Harper
v3 changes: don't reformat code I didn't change
v2 changes: use single structure instead of one for thresholds and one for data.

Signed-off-by: bdwhe...@indiana.edu

diff --git a/hw/ide/core.c b/hw/ide/core.c
index dd63664..c806f31 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -34,13 +34,26 @@
 
 #include <hw/ide/internal.h>
 
-static const int smart_attributes[][5] = {
-    /* id,  flags, val, wrst, thrsh */
-    { 0x01, 0x03, 0x64, 0x64, 0x06}, /* raw read */
-    { 0x03, 0x03, 0x64, 0x64, 0x46}, /* spin up */
-    { 0x04, 0x02, 0x64, 0x64, 0x14}, /* start stop count */
-    { 0x05, 0x03, 0x64, 0x64, 0x36}, /* remapped sectors */
-    { 0x00, 0x00, 0x00, 0x00, 0x00}
+/* These values were based on a Seagate ST3500418AS but have been modified
+   to make more sense in QEMU */
+static const int smart_attributes[][12] = {
+    /* id,  flags, hflags, val, wrst, raw (6 bytes), threshold */
+    /* raw read error rate*/
+    { 0x01, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x06},
+    /* spin up */
+    { 0x03, 0x03, 0x00, 0x64, 0x64, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+    /* start stop count */
+    { 0x04, 0x02, 0x00, 0x64, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x14},
+    /* remapped sectors */
+    { 0x05, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x24},
+    /* power on hours */
+    { 0x09, 0x03, 0x00, 0x64, 0x64, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+    /* power cycle count */
+    { 0x0c, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+    /* airflow-temperature-celsius */
+    { 190,  0x03, 0x00, 0x45, 0x45, 0x1f, 0x00, 0x1f, 0x1f, 0x00, 0x00, 0x32},
+    /* end of list */
+    { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}
 };
 
 /* XXX: DVDs that could fit on a CD will be reported as a CD */
@@ -1843,6 +1856,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
         break;
     case WIN_CHECKPOWERMODE1:
     case WIN_CHECKPOWERMODE2:
+        s->error = 0;
         s->nsector = 0xff; /* device active or idle */
         s->status = READY_STAT | SEEK_STAT;
         ide_set_irq(s->bus);
@@ -2097,7 +2111,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
                 if (smart_attributes[n][0] == 0)
                     break;
                 s->io_buffer[2+0+(n*12)] = smart_attributes[n][0];
-                s->io_buffer[2+1+(n*12)] = smart_attributes[n][4];
+                s->io_buffer[2+1+(n*12)] = smart_attributes[n][11];
             }
             for (n=0; n<511; n++) /* checksum */
                 s->io_buffer[511] += s->io_buffer[n];
@@ -2110,12 +2124,13 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
             memset(s->io_buffer, 0, 0x200);
             s->io_buffer[0] = 0x01; /* smart struct version */
             for (n=0; n<30; n++) {
-                if (smart_attributes[n][0] == 0)
+                if (smart_attributes[n][0] == 0) {
                     break;
-                s->io_buffer[2+0+(n*12)] = smart_attributes[n][0];
-                s->io_buffer[2+1+(n*12)] = smart_attributes[n][1];
-                s->io_buffer[2+3+(n*12)] = smart_attributes[n][2];
-                s->io_buffer[2+4+(n*12)] = smart_attributes[n][3];
+                }
+                int i;
+                for(i = 0; i < 11; i++) {
+                    s->io_buffer[2+i+(n*12)] = smart_attributes[n][i];
+                }
             }
             s->io_buffer[362] = 0x02 | (s->smart_autosave?0x80:0x00);
             if (s->smart_selftest_count == 0) {
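The layout used by the patch above (a 2-byte header, 12-byte attribute entries, and a checksum byte that makes the 512-byte page sum to zero) can be tried out in isolation. The following is only a sketch under those assumptions, not the QEMU code: fill_smart_data(), smart_data_byte() and smart_data_sum() are hypothetical names, and just two rows of the table are reused.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* id, flags, hflags, val, wrst, raw (6 bytes), threshold; id 0 ends the list.
 * Two rows taken from the table in the patch. */
static const int attrs[][12] = {
    /* remapped sectors */
    { 0x05, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x24 },
    /* airflow-temperature-celsius: value 0x45 = 100 - 31, raw 0x1f = 31 */
    { 190,  0x03, 0x00, 0x45, 0x45, 0x1f, 0x00, 0x1f, 0x1f, 0x00, 0x00, 0x32 },
    /* end of list */
    { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }
};

/* Fill a 512-byte data page: version bytes, then one 12-byte entry per
 * attribute, then a checksum in the last byte. */
static void fill_smart_data(uint8_t *buf)
{
    int n, i;
    uint8_t sum = 0;

    memset(buf, 0, 512);
    buf[0] = 0x01; /* smart struct version */
    for (n = 0; n < 30; n++) {
        if (attrs[n][0] == 0) {
            break;
        }
        /* Copy fields 0..10; the threshold (field 11) is reported by the
         * separate thresholds command, not in this page. */
        for (i = 0; i < 11; i++) {
            buf[2 + i + n * 12] = (uint8_t)attrs[n][i];
        }
    }
    /* Two's-complement checksum: all 512 bytes sum to 0 modulo 256. */
    for (n = 0; n < 511; n++) {
        sum = (uint8_t)(sum + buf[n]);
    }
    buf[511] = (uint8_t)(0x100 - sum);
}

/* Hypothetical helpers to inspect the generated page. */
static uint8_t smart_data_byte(int offset)
{
    uint8_t buf[512];
    fill_smart_data(buf);
    return buf[offset];
}

static uint8_t smart_data_sum(void)
{
    uint8_t buf[512];
    uint8_t sum = 0;
    int n;

    fill_smart_data(buf);
    for (n = 0; n < 512; n++) {
        sum = (uint8_t)(sum + buf[n]);
    }
    return sum;
}
```

The temperature row shows the arithmetic discussed in the review: a raw reading of 31C is paired with a normalized value of 100 - 31 = 69 (0x45).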
[Qemu-devel] [Bug 498107] Re: www.qemu.org and www.nongnu.org/qemu have a lot of bugs
qemu 0.14.0 rc1

Not in qemu-doc.html: (qemu) info spice

BTW: The qemu instance crashed with info spice:

kvm ReactOS.img -spice port=12345,disable-ticketing -vga qxl -monitor stdio
(qemu) info spice
Server:
     address: 0.0.0.0:12345
        auth: none
qemu: qdict.c:193: qdict_get_obj: Assertion `obj != ((void *)0)' failed.

-- 
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/498107

Title: www.qemu.org and www.nongnu.org/qemu have a lot of bugs

Status in QEMU: New

Bug description:
The websites http://www.qemu.org and http://www.nongnu.org/qemu have a lot of bugs:

Why two websites with different outdated content?

No contact address
- It is not possible to contact the webmaster.

Outdated content
-- The current release is 0.12.0-rc2!
- http://www.nongnu.org/qemu/index.html
  Jul 30, 2009 QEMU version 0.11.0-rc1 is out
- http://www.nongnu.org/qemu/download.html
- http://www.qemu.org
  QEMU version 0.12.0-rc1 is out
- http://www.qemu.org/download.html

Many links are outdated or broken
- http://www.qemu.org/links.html
- http://www.nongnu.org/qemu/links.html
  For example QEMU on Windows. Why not http://www.davereyn.co.uk/download.htm ?
- http://www.qemu.org/user-doc.html
- http://www.nongnu.org/qemu/user-doc.html
  For example: Quick Start, FAQ and QEMU Wiki. No word about the coming end of KQEMU support.
- http://www.qemu.org/qemu-doc.html
- http://www.nongnu.org/qemu/qemu-doc.html
  There are a lot of differences to qemu --help and to help in the QEMU monitor.
  For example -rtc-td-hack -localtime -startdate -netdev -mem-path -mem-prealloc -tdf -nvram -enable-nesting -no-kvm-irqchip -no-kvm-pit -no-kvm-pit-reinjection -xen-domid id -xen-create -xen-attach -readconfig file -writeconfig file
  (qemu) host_net_redir
  (qemu) acl_reset

Please see also
- http://qemu-buch.de/d/Anhang/_Startoptionen_von_QEMU_und_KVM
- http://qemu-buch.de/d/Anhang/_QEMU-Monitor
[Qemu-devel] Re: AHCI in SeaBIOS
On Tue, Feb 08, 2011 at 12:57:41AM +0100, Alexander Graf wrote: Hi Kevin, Do you remember why you put AHCI in with default=n? I'd like to see it enabled in Qemu 0.14 and IIUC we use the default configuration for that. Hi Alex, Sorry - I've gotten behind on emails. There were two reasons for not enabling it by default - it had not been tested on real hardware, and it was unclear if doing the jump into 32bit mode would have an adverse impact. Speaking of which - it would be awesome if we had a companion SeaBIOS release with Qemu 0.14, so we don't have to fetch a random git snapshot but potentially even could maintain a stable SeaBIOS during the lifetime of 0.14 :). It's about time to make a new release of seabios, so I'll see if that can be done in a week or so. -Kevin
[Qemu-devel] Re: AHCI in SeaBIOS
Kevin O'Connor wrote: On Tue, Feb 08, 2011 at 12:57:41AM +0100, Alexander Graf wrote: Hi Kevin, Do you remember why you put AHCI in with default=n? I'd like to see it enabled in Qemu 0.14 and IIUC we use the default configuration for that. Hi Alex, Sorry - I've gotten behind on emails. There were two reasons for not enabling it by default - it had not been tested on real hardware, and it was unclear if doing the jump into 32bit mode would have an adverse impact. Do you have to do the jump even when AHCI is unused? If no, it shouldn't hurt, right? Anthony, can we enable it only for the Qemu build? Speaking of which - it would be awesome if we had a companion SeaBIOS release with Qemu 0.14, so we don't have to fetch a random git snapshot but potentially even could maintain a stable SeaBIOS during the lifetime of 0.14 :). It's about time to make a new release of seabios, so I'll see if that can be done in a week or so. Very nice, thank you! It would be great if we could sync that up with the 0.14 release. Anthony? Alex
[Qemu-devel] Re: AHCI in SeaBIOS
Kevin O'Connor wrote: On Thu, Feb 10, 2011 at 04:25:11PM +0100, Alexander Graf wrote: Kevin O'Connor wrote: On Tue, Feb 08, 2011 at 12:57:41AM +0100, Alexander Graf wrote: Do you remember why you put AHCI in with default=n? I'd like to see it enabled in Qemu 0.14 and IIUC we use the default configuration for that. There were two reasons for not enabling it by default - it had not been tested on real hardware, and it was unclear if doing the jump into 32bit mode would have an adverse impact. Do you have to do the jump even when AHCI is unused? If no, it shouldn't hurt, right? The 32bit jump is only done if an AHCI drive is found and one tries to read/write to it. Very good, so it really shouldn't hurt. We could add another option that only kicks off the detection when Qemu is found, no? That way we're sure to not break real hardware, but have the functionality in Qemu :) Alex
Re: [Qemu-devel] Re: AHCI in SeaBIOS
On 02/10/2011 04:25 PM, Alexander Graf wrote: Kevin O'Connor wrote: On Tue, Feb 08, 2011 at 12:57:41AM +0100, Alexander Graf wrote: Hi Kevin, Do you remember why you put AHCI in with default=n? I'd like to see it enabled in Qemu 0.14 and IIUC we use the default configuration for that. Hi Alex, Sorry - I've gotten behind on emails. There were two reasons for not enabling it by default - it had not been tested on real hardware, and it was unclear if doing the jump into 32bit mode would have an adverse impact. Do you have to do the jump even when AHCI is unused? If no, it shouldn't hurt, right? Anthony, can we enable it only for the Qemu build? Speaking of which - it would be awesome if we had a companion SeaBIOS release with Qemu 0.14, so we don't have to fetch a random git snapshot but potentially even could maintain a stable SeaBIOS during the lifetime of 0.14 :). It's about time to make a new release of seabios, so I'll see if that can be done in a week or so. Very nice, thank you! It would be great if we could sync that up with the 0.14 release. Anthony? We're right around the corner from -rc2 so let's wait until after 0.14 is released so we have a full release cycle to test it. Regards, Anthony Liguori Alex
[Qemu-devel] [PATCH 03/11] qcow2: Fix error handling for immediate backing file read failure
Requests could return success even though they failed when bdrv_aio_readv returned NULL for a backing file read.

Reported-by: Chunqiang Tang ct...@us.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 28338bf..647c2a4 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -479,8 +479,10 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
                 BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
                 acb->hd_aiocb = bdrv_aio_readv(bs->backing_hd, acb->sector_num,
                                     &acb->hd_qiov, n1, qcow2_aio_read_cb, acb);
-                if (acb->hd_aiocb == NULL)
+                if (acb->hd_aiocb == NULL) {
+                    ret = -EIO;
                     goto done;
+                }
             } else {
                 ret = qcow2_schedule_bh(qcow2_aio_read_bh, acb);
                 if (ret < 0)
-- 
1.7.2.3
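The failure mode fixed here is easy to reproduce in miniature: a status variable still holds the success value from an earlier step when a submit call fails immediately, so jumping to the common exit reports success. The sketch below is a simplified model only; submit_read() and start_backing_read() are hypothetical stand-ins and not block-layer APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an asynchronous submit call that returns
 * NULL when the request cannot even be started. */
static void *submit_read(int fail)
{
    static int token;
    return fail ? NULL : (void *)&token;
}

/* Returns 1 if a request is in flight (completion reported later),
 * otherwise the final status as seen by the caller. */
static int start_backing_read(int fail)
{
    int ret = 0; /* still holds the success of the previous step */
    void *req = submit_read(fail);

    if (req == NULL) {
        ret = -EIO; /* without this line the caller would see ret == 0 */
        goto done;
    }
    return 1;

done:
    return ret;
}
```

Setting the error code at the point of failure, rather than relying on whatever the variable held before, is the same change the patch makes before its "goto done".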
[Qemu-devel] [PULL 00/11] Block patches for master
The following changes since commit 6c5f738daec123020d32543fe90a6633a4f6643e:

  microblaze: Handle singlestepping over direct jmps (2011-02-10 00:46:09 +0100)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git for-anthony

Chunqiang Tang (1):
      QCOW2: bug fix - read base image beyond its size

Jes Sorensen (1):
      Change snapshot_blkdev hmp to use correct argument type for device

Kevin Wolf (7):
      qcow2: Fix error handling for immediate backing file read failure
      qcow2: Fix error handling for reading compressed clusters
      qerror: Add QERR_UNKNOWN_BLOCK_FORMAT_FEATURE
      qcow2: Report error for version > 2
      qed: Report error for unsupported features
      qemu-img: Improve error messages for failed bdrv_open
      qcow2: Fix order in L2 table COW

Markus Armbruster (2):
      blockdev: Plug memory leak in drive_uninit()
      blockdev: Plug memory leak in drive_init() error paths

 block/qcow2-cluster.c |   13
 block/qcow2.c         |   26
 block/qed.c           |    9
 blockdev.c            |   12
 cutils.c              |   31
 hmp-commands.hx       |    2
 qemu-common.h         |    2
 qemu-img.c            |   10
 qerror.c              |    5
 qerror.h              |    3
 10 files changed, 94 insertions(+), 19 deletions(-)
[Qemu-devel] [PATCH 09/11] blockdev: Plug memory leak in drive_uninit()
From: Markus Armbruster arm...@redhat.com

Started leaking in commit 1dae12e6.

Signed-off-by: Markus Armbruster arm...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 blockdev.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index ecfadc1..24d7658 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -182,6 +182,7 @@ static void drive_uninit(DriveInfo *dinfo)
 {
     qemu_opts_del(dinfo->opts);
     bdrv_delete(dinfo->bdrv);
+    qemu_free(dinfo->id);
     QTAILQ_REMOVE(&drives, dinfo, next);
     qemu_free(dinfo);
 }
-- 
1.7.2.3
[Qemu-devel] [PATCH 10/11] blockdev: Plug memory leak in drive_init() error paths
From: Markus Armbruster arm...@redhat.com

Should have spotted this when doing commit 319ae529.

Signed-off-by: Markus Armbruster arm...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 blockdev.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 24d7658..0690cc8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -526,7 +526,7 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
     } else if (ro == 1) {
         if (type != IF_SCSI && type != IF_VIRTIO && type != IF_FLOPPY && type != IF_NONE) {
             error_report("readonly not supported by this bus type");
-            return NULL;
+            goto err;
         }
     }
 
@@ -536,12 +536,19 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
     if (ret < 0) {
         error_report("could not open disk image %s: %s",
                      file, strerror(-ret));
-        return NULL;
+        goto err;
     }
 
     if (bdrv_key_required(dinfo->bdrv))
         autostart = 0;
     return dinfo;
+
+err:
+    bdrv_delete(dinfo->bdrv);
+    qemu_free(dinfo->id);
+    QTAILQ_REMOVE(&drives, dinfo, next);
+    qemu_free(dinfo);
+    return NULL;
 }
 
 void do_commit(Monitor *mon, const QDict *qdict)
-- 
1.7.2.3
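The "goto err" pattern used in this patch can be shown with a small self-contained example. This is only a sketch of the idiom, not the blockdev code: the drive type, drive_create(), and the counting allocator xalloc()/xfree() are all hypothetical, and the counter exists purely so that a leak would be visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *id;
    int opened;
} drive;

/* Counts outstanding allocations so that leaks show up in the tests. */
static int live_allocs;

static void *xalloc(size_t n)
{
    void *p = calloc(1, n);
    if (p == NULL) {
        abort();
    }
    live_allocs++;
    return p;
}

static void xfree(void *p)
{
    if (p != NULL) {
        live_allocs--;
    }
    free(p);
}

static drive *drive_create(const char *id, int fail_open)
{
    drive *d = xalloc(sizeof(*d));

    d->id = xalloc(strlen(id) + 1);
    strcpy(d->id, id);

    if (fail_open) {
        goto err; /* a bare "return NULL" here would leak d and d->id */
    }
    d->opened = 1;
    return d;

err:
    /* Single cleanup path shared by every failure after allocation. */
    xfree(d->id);
    xfree(d);
    return NULL;
}

static void drive_destroy(drive *d)
{
    xfree(d->id);
    xfree(d);
}

/* Hypothetical helpers reporting the allocation count after each scenario. */
static int leaks_after_failed_create(void)
{
    drive *d = drive_create("hd0", 1);
    return d == NULL ? live_allocs : -1;
}

static int leaks_after_create_and_destroy(void)
{
    drive *d = drive_create("hd0", 0);
    drive_destroy(d);
    return live_allocs;
}
```

Keeping the cleanup in one labelled block also means that a later field added to the structure only needs to be released in two places, the destructor and the error label, which is what the previous patch in this series had to fix for dinfo->id.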
Re: [Qemu-devel] [PATCH v2 6/6] target-arm: Use standard FPSCR for Neon half-precision operations
On Wed, Feb 09, 2011 at 04:27:30PM +0000, Peter Maydell wrote:
> The Neon half-precision conversion operations (VCVT.F16.F32 and
> VCVT.F32.F16) use ARM standard floating-point arithmetic, unlike
> the VFP versions (VCVTB and VCVTT).
>
> Signed-off-by: Peter Maydell peter.mayd...@linaro.org
> ---
>  target-arm/helper.c    |   26 ++++++++++++++++++++++----
>  target-arm/helpers.h   |    2 ++
>  target-arm/translate.c |   16 ++++++++--------
>  3 files changed, 32 insertions(+), 12 deletions(-)

Reviewed-by: Aurelien Jarno aurel...@aurel32.net

> diff --git a/target-arm/helper.c b/target-arm/helper.c
> index 503278c..d36f0f3 100644
> --- a/target-arm/helper.c
> +++ b/target-arm/helper.c
> @@ -2623,9 +2623,8 @@ VFP_CONV_FIX(ul, s, float32, uint32, u)
>  #undef VFP_CONV_FIX
>  
>  /* Half precision conversions.  */
> -float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, CPUState *env)
> +static float32 do_fcvt_f16_to_f32(uint32_t a, CPUState *env, float_status *s)
>  {
> -    float_status *s = &env->vfp.fp_status;
>      int ieee = (env->vfp.xregs[ARM_VFP_FPSCR] & (1 << 26)) == 0;
>      float32 r = float16_to_float32(a, ieee, s);
>      if (ieee) {
> @@ -2634,9 +2633,8 @@ float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, CPUState *env)
>      return r;
>  }
>  
> -uint32_t HELPER(vfp_fcvt_f32_to_f16)(float32 a, CPUState *env)
> +static uint32_t do_fcvt_f32_to_f16(float32 a, CPUState *env, float_status *s)
>  {
> -    float_status *s = &env->vfp.fp_status;
>      int ieee = (env->vfp.xregs[ARM_VFP_FPSCR] & (1 << 26)) == 0;
>      float16 r = float32_to_float16(a, ieee, s);
>      if (ieee) {
> @@ -2645,6 +2643,26 @@ uint32_t HELPER(vfp_fcvt_f32_to_f16)(float32 a, CPUState *env)
>      return r;
>  }
>  
> +float32 HELPER(neon_fcvt_f16_to_f32)(uint32_t a, CPUState *env)
> +{
> +    return do_fcvt_f16_to_f32(a, env, &env->vfp.standard_fp_status);
> +}
> +
> +float32 HELPER(neon_fcvt_f32_to_f16)(uint32_t a, CPUState *env)
> +{
> +    return do_fcvt_f32_to_f16(a, env, &env->vfp.standard_fp_status);
> +}
> +
> +float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, CPUState *env)
> +{
> +    return do_fcvt_f16_to_f32(a, env, &env->vfp.fp_status);
> +}
> +
> +float32 HELPER(vfp_fcvt_f32_to_f16)(uint32_t a, CPUState *env)
> +{
> +    return do_fcvt_f32_to_f16(a, env, &env->vfp.fp_status);
> +}
> +
>  float32 HELPER(recps_f32)(float32 a, float32 b, CPUState *env)
>  {
>      float_status *s = &env->vfp.fp_status;
> diff --git a/target-arm/helpers.h b/target-arm/helpers.h
> index 8a2564e..40264b4 100644
> --- a/target-arm/helpers.h
> +++ b/target-arm/helpers.h
> @@ -129,6 +129,8 @@ DEF_HELPER_3(vfp_ultod, f64, f64, i32, env)
>  
>  DEF_HELPER_2(vfp_fcvt_f16_to_f32, f32, i32, env)
>  DEF_HELPER_2(vfp_fcvt_f32_to_f16, i32, f32, env)
> +DEF_HELPER_2(neon_fcvt_f16_to_f32, f32, i32, env)
> +DEF_HELPER_2(neon_fcvt_f32_to_f16, i32, f32, env)
>  
>  DEF_HELPER_3(recps_f32, f32, f32, f32, env)
>  DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env)
> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index e4649e6..a867f55 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -5495,17 +5495,17 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
>                      tmp = new_tmp();
>                      tmp2 = new_tmp();
>                      tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 0));
> -                    gen_helper_vfp_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env);
> +                    gen_helper_neon_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env);
>                      tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 1));
> -                    gen_helper_vfp_fcvt_f32_to_f16(tmp2, cpu_F0s, cpu_env);
> +                    gen_helper_neon_fcvt_f32_to_f16(tmp2, cpu_F0s, cpu_env);
>                      tcg_gen_shli_i32(tmp2, tmp2, 16);
>                      tcg_gen_or_i32(tmp2, tmp2, tmp);
>                      tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 2));
> -                    gen_helper_vfp_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env);
> +                    gen_helper_neon_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env);
>                      tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 3));
>                      neon_store_reg(rd, 0, tmp2);
>                      tmp2 = new_tmp();
> -                    gen_helper_vfp_fcvt_f32_to_f16(tmp2, cpu_F0s, cpu_env);
> +                    gen_helper_neon_fcvt_f32_to_f16(tmp2, cpu_F0s, cpu_env);
>                      tcg_gen_shli_i32(tmp2, tmp2, 16);
>                      tcg_gen_or_i32(tmp2, tmp2, tmp);
>                      neon_store_reg(rd, 1, tmp2);
> @@ -5518,17 +5518,17 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
>                      tmp = neon_load_reg(rm, 0);
>                      tmp2 = neon_load_reg(rm, 1);
>                      tcg_gen_ext16u_i32(tmp3, tmp);
> -                    gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp3, cpu_env);
> +                    gen_helper_neon_fcvt_f16_to_f32(cpu_F0s, tmp3, cpu_env);
>                      tcg_gen_st_f32(cpu_F0s, cpu_env,
[Qemu-devel] [PATCH 02/11] QCOW2: bug fix - read base image beyond its size
From: Chunqiang Tang ct...@us.ibm.com

This patch fixes the following bug in QCOW2. For a QCOW2 image that is larger
than its base image, when handling a read request straddling over the end of
the base image, the QCOW2 driver attempts to read beyond the end of the base
image and the request would fail.

This bug was found by Fast Virtual Disk (FVD)'s fully automated testing tool.
The following test triggered the bug.

dd if=/dev/zero of=/var/ramdisk/truth.raw count=0 bs=1 seek=1098561536
dd if=/dev/zero of=/var/ramdisk/zero-500M.raw count=0 bs=1 seek=593099264
./qemu-img create -f qcow2 -ocluster_size=65536,backing_fmt=blksim -b /var/ramdisk/zero-500M.raw /var/ramdisk/test.qcow2 1098561536
./qemu-io --auto --seed=30477694 --truth=/var/ramdisk/truth.raw --format=qcow2 --test=blksim:/var/ramdisk/test.qcow2 --verify_write=true --compare_before=false --compare_after=true --round=10 --parallel=100 --io_size=10485760 --fail_prob=0 --cancel_prob=0 --instant_qemubh=true

Signed-off-by: Chunqiang Tang ct...@us.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2.c |    5 ++---
 cutils.c      |   31 +++++++++++++++++++++++++++++++
 qemu-common.h |    2 ++
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index a1773e4..28338bf 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -355,7 +355,7 @@ int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
     else
         n1 = bs->total_sectors - sector_num;

-    qemu_iovec_memset(qiov, 0, 512 * (nb_sectors - n1));
+    qemu_iovec_memset_skip(qiov, 0, 512 * (nb_sectors - n1), 512 * n1);

     return n1;
 }
@@ -478,8 +478,7 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
                 if (n1 > 0) {
                     BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
                     acb->hd_aiocb = bdrv_aio_readv(bs->backing_hd, acb->sector_num,
-                                        &acb->hd_qiov, acb->cur_nr_sectors,
-                                        qcow2_aio_read_cb, acb);
+                                        &acb->hd_qiov, n1, qcow2_aio_read_cb, acb);
                     if (acb->hd_aiocb == NULL)
                         goto done;
                 } else {
diff --git a/cutils.c b/cutils.c
index 8d562b2..f9a7e36 100644
--- a/cutils.c
+++ b/cutils.c
@@ -267,6 +267,37 @@ void qemu_iovec_memset(QEMUIOVector *qiov, int c, size_t count)
     }
 }

+void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count,
+                            size_t skip)
+{
+    int i;
+    size_t done;
+    void *iov_base;
+    uint64_t iov_len;
+
+    done = 0;
+    for (i = 0; (i < qiov->niov) && (done != count); i++) {
+        if (skip >= qiov->iov[i].iov_len) {
+            /* Skip the whole iov */
+            skip -= qiov->iov[i].iov_len;
+            continue;
+        } else {
+            /* Skip only part (or nothing) of the iov */
+            iov_base = (uint8_t*) qiov->iov[i].iov_base + skip;
+            iov_len = qiov->iov[i].iov_len - skip;
+            skip = 0;
+        }
+
+        if (done + iov_len > count) {
+            memset(iov_base, c, count - done);
+            break;
+        } else {
+            memset(iov_base, c, iov_len);
+        }
+        done += iov_len;
+    }
+}
+
 #ifndef _WIN32
 /* Sets a specific flag */
 int fcntl_setfl(int fd, int flag)
diff --git a/qemu-common.h b/qemu-common.h
index c7ff280..cb4b7e0 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -322,6 +322,8 @@ void qemu_iovec_reset(QEMUIOVector *qiov);
 void qemu_iovec_to_buffer(QEMUIOVector *qiov, void *buf);
 void qemu_iovec_from_buffer(QEMUIOVector *qiov, const void *buf, size_t count);
 void qemu_iovec_memset(QEMUIOVector *qiov, int c, size_t count);
+void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count,
+                            size_t skip);

 struct Monitor;
 typedef struct Monitor Monitor;
--
1.7.2.3
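For readers who want to try the skip-then-fill idea from the patch above outside of QEMU, here is a minimal standalone sketch. It assumes a plain POSIX struct iovec array instead of QEMUIOVector, and the name iov_memset_skip is hypothetical (it is not a QEMU function). Unlike the patch, it returns the number of bytes actually filled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper, not part of QEMU: fill `count` bytes with `c`,
 * starting `skip` bytes into the scatter/gather vector.
 * Returns the number of bytes written, which is less than `count`
 * if the vector is too short.
 */
static size_t iov_memset_skip(struct iovec *iov, int niov, int c,
                              size_t count, size_t skip)
{
    size_t done = 0;

    for (int i = 0; i < niov && done < count; i++) {
        if (skip >= iov[i].iov_len) {
            /* This element lies entirely inside the skipped prefix. */
            skip -= iov[i].iov_len;
            continue;
        }

        /* Skip only part (or nothing) of this element. */
        uint8_t *base = (uint8_t *)iov[i].iov_base + skip;
        size_t len = iov[i].iov_len - skip;
        skip = 0;

        /* Do not write past the requested byte count. */
        if (len > count - done) {
            len = count - done;
        }
        memset(base, c, len);
        done += len;
    }
    return done;
}
```

In the backing-file case the skip would correspond to the part of the request that the base image can still serve (512 * n1 in the patch), and count to the remainder that has to read as zeroes.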
[Qemu-devel] [PATCH 11/11] qcow2: Fix order in L2 table COW
When copying L2 tables (this happens only with internal snapshots), the order
wasn't completely safe, so that after a crash you could end up with an L2 table
whose refcount is too low, possibly leading to corruption in the long run.

This patch puts the operations in the right order: First allocate the new L2
table and replace the reference, and only then decrease the refcount of the old
table.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2-cluster.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 437aaa8..750abe3 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -515,13 +515,16 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
                 return ret;
             }
         } else {
-            /* FIXME Order */
-            if (l2_offset)
-                qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
+            /* First allocate a new L2 table (and do COW if needed) */
             ret = l2_allocate(bs, l1_index, &l2_table);
             if (ret < 0) {
                 return ret;
             }
+
+            /* Then decrease the refcount of the old table */
+            if (l2_offset) {
+                qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
+            }
             l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED;
         }
--
1.7.2.3
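The ordering argument can be illustrated with a toy in-memory model. This is only a sketch under simplifying assumptions: struct table and cow_table are hypothetical names, and nothing here touches disk, so it shows the order of the steps rather than real crash safety. The point is that the old table's refcount is dropped only after the new copy exists and the parent slot points at it, so an interruption in between leaves a refcount that is too high (a leak) rather than too low.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ENTRIES 4

/* Toy stand-in for a refcounted L2 table; not QEMU's on-disk format. */
struct table {
    int refcount;
    unsigned long entries[ENTRIES];
};

/*
 * Copy-on-write of the table referenced by *slot.
 * Returns 0 on success, -1 if allocation fails (the old table and its
 * refcount are then left untouched).
 */
static int cow_table(struct table **slot)
{
    struct table *old = *slot;

    /* Step 1: allocate the new table and copy the contents. */
    struct table *copy = malloc(sizeof(*copy));
    if (!copy) {
        return -1;
    }
    memcpy(copy->entries, old->entries, sizeof(old->entries));
    copy->refcount = 1;

    /* Step 2: replace the reference in the parent. */
    *slot = copy;

    /* Step 3: only now drop the reference to the old table. */
    old->refcount--;
    return 0;
}
```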
Re: [Qemu-devel] [PATCH 0/2] target-arm: Fix VQMOV(U)N
On Wed, Feb 09, 2011 at 03:42:31PM +0000, Peter Maydell wrote:
> This patchset fixes the VQMOV(U)N instructions (saturating narrowing
> conversions). Tested by random generation of instructions for VQMOVN,
> VQMOVUN, VMOVN.
>
> Patch 1/2 is the same as the one Christophe sent recently but I have
> corrected the authorship (this patch is from the meego tree).
>
> Juha Riihimäki (1):
>   target-arm: Fix VQMOVUN Neon instruction.
>
> Peter Maydell (1):
>   target-arm: Fix 32 bit signed saturating narrow
>
>  target-arm/helpers.h     |    3 ++
>  target-arm/neon_helper.c |   65 +-
>  target-arm/translate.c   |   28 +++
>  3 files changed, 89 insertions(+), 7 deletions(-)

Thanks, both applied.

--
Aurelien Jarno GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net
[Qemu-devel] [PATCH 01/11] Change snapshot_blkdev hmp to use correct argument type for device
From: Jes Sorensen jes.soren...@redhat.com

Pointed out by Markus

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hmp-commands.hx |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 38e1eb7..372bef4 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -822,7 +822,7 @@ ETEXI

     {
         .name       = "snapshot_blkdev",
-        .args_type  = "device:s,snapshot_file:s?,format:s?",
+        .args_type  = "device:B,snapshot_file:s?,format:s?",
         .params     = "device [new-image-file] [format]",
         .help       = "initiates a live snapshot\n\t\t\t"
                       "of device. If a new image file is specified, the\n\t\t\t"
--
1.7.2.3
[Qemu-devel] [PATCH 04/11] qcow2: Fix error handling for reading compressed clusters
When reading a compressed cluster failed, qcow2 falsely returned success.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Markus Armbruster arm...@redhat.com
---
 block/qcow2-cluster.c |    4 ++--
 block/qcow2.c         |    4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 5fb8c66..437aaa8 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -878,11 +878,11 @@ int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset)
         BLKDBG_EVENT(bs->file, BLKDBG_READ_COMPRESSED);
         ret = bdrv_read(bs->file, coffset >> 9, s->cluster_data, nb_csectors);
         if (ret < 0) {
-            return -1;
+            return ret;
         }
         if (decompress_buffer(s->cluster_cache, s->cluster_size,
                               s->cluster_data + sector_offset, csize) < 0) {
-            return -1;
+            return -EIO;
         }
         s->cluster_cache_offset = coffset;
     }
diff --git a/block/qcow2.c b/block/qcow2.c
index 647c2a4..551b3c2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -497,8 +497,10 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
             }
         } else if (acb->cluster_offset & QCOW_OFLAG_COMPRESSED) {
             /* add AIO support for compressed blocks ? */
-            if (qcow2_decompress_cluster(bs, acb->cluster_offset) < 0)
+            ret = qcow2_decompress_cluster(bs, acb->cluster_offset);
+            if (ret < 0) {
                 goto done;
+            }

             qemu_iovec_from_buffer(&acb->hd_qiov,
                 s->cluster_cache + index_in_cluster * 512,
--
1.7.2.3
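The pattern in the patch above (store the callee's return value, pass a negative errno through unchanged, and map a failure that carries no errno to -EIO) can be sketched in isolation. The functions fake_read, fake_decompress and load_compressed below are hypothetical stand-ins, not QEMU code; they only model the control flow.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a read: returns 0 or a negative errno. */
static int fake_read(int simulated_error)
{
    return simulated_error;
}

/* Hypothetical stand-in for a decoder that only reports -1 on failure. */
static int fake_decompress(int corrupt)
{
    return corrupt ? -1 : 0;
}

/*
 * Returns 0 on success or a negative errno. The caller keeps the value
 * in `ret`, so the reason for the failure is not lost on the way up.
 */
static int load_compressed(int read_error, int corrupt)
{
    int ret = fake_read(read_error);
    if (ret < 0) {
        return ret;         /* propagate the original error code */
    }
    if (fake_decompress(corrupt) < 0) {
        return -EIO;        /* no errno available, report an I/O error */
    }
    return 0;
}
```

A caller that tests only the condition and discards the value, as the old qcow2_aio_read_cb did, leaves its own ret untouched, which is presumably how the failed read ended up being reported as success.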
[Qemu-devel] [PATCH 08/11] qemu-img: Improve error messages for failed bdrv_open
Output the error message string of the bdrv_open return code. Also set a
non-empty device name for the images because the unknown feature error message
includes it.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Anthony Liguori aligu...@us.ibm.com
---
 qemu-img.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 4a37358..7e3cc4c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -213,8 +213,9 @@ static BlockDriverState *bdrv_new_open(const char *filename,
     BlockDriverState *bs;
     BlockDriver *drv;
     char password[256];
+    int ret;

-    bs = bdrv_new("");
+    bs = bdrv_new("image");

     if (fmt) {
         drv = bdrv_find_format(fmt);
@@ -225,10 +226,13 @@ static BlockDriverState *bdrv_new_open(const char *filename,
     } else {
         drv = NULL;
     }
-    if (bdrv_open(bs, filename, flags, drv) < 0) {
-        error_report("Could not open '%s'", filename);
+
+    ret = bdrv_open(bs, filename, flags, drv);
+    if (ret < 0) {
+        error_report("Could not open '%s': %s", filename, strerror(-ret));
         goto fail;
     }
+
     if (bdrv_is_encrypted(bs)) {
         printf("Disk image '%s' is encrypted.\n", filename);
         if (read_password(password, sizeof(password)) < 0) {
--
1.7.2.3
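As a small illustration of the strerror(-ret) idiom used in the patch above, the sketch below formats the same kind of message into a caller-supplied buffer. The helper name format_open_error is hypothetical; QEMU's error_report() is not used here so that the snippet stays self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: turn a negative errno return value into a
 * human-readable message. `ret` is expected to be negative, following
 * the convention that functions return -errno on failure.
 */
static int format_open_error(char *buf, size_t len,
                             const char *filename, int ret)
{
    assert(ret < 0);
    return snprintf(buf, len, "Could not open '%s': %s",
                    filename, strerror(-ret));
}
```

Negating the value before calling strerror() matters: strerror() expects the positive errno number, while the function being checked returns it negated.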
Re: [Qemu-devel] Re: [PATCH 2/7] Enable I/O thread and VNC threads by default
On 02/09/2011 06:35 PM, Aurelien Jarno wrote: On Tue, Feb 08, 2011 at 04:08:28PM +0100, Aurelien Jarno wrote: Aurelien Jarno a écrit : Paolo Bonzini a écrit : On 02/08/2011 12:15 PM, Aurelien Jarno wrote: however it should not be done ignoring all the*current* drawbacks of the iothread mode. We know them (at least for some of them), so let's try to solve them. Let's also enumerate them. From what I know: - performance regression in TCG mode I setup an x86_64 guest on an x86_64 host (Intel Xeon E5345). Nothing was running except the standard daemons and the CPU governor was set to performance on all CPU. I then compared the network performance using netperf in default mode, through a tap interface and a virtio nic. I got the following results (quite reproducible, std below 0.5): - without IO thread: 107.36 MB/s - with IO thread: 89.93 MB/s And the same test on the code from september 2009: - without IO thread: 141.8 MB/s virtio-net is super finicky regarding mitigation strategies and their relationship to the I/O thread. Different benchmarks will behave differently. virtio-blk is probably a better device to test as you'll get much more consistent results across different type of I/O patterns. Regards, Anthony Liguori
Re: [Qemu-devel] KVM call minutes for Feb 8
On 02/10/2011 03:20 PM, Gleb Natapov wrote: Jugging by how well all previous conversion went we will end up with one more way of creating devices. One legacy, another qdev and your new one. And what is the problem with qdev again (not that I am a big qdev fan)? We've really been arguing about probably the most minor aspect of the problem with qdev. All I'm really saying is that we shouldn't tie device construction to a factory interface as we do with qdev. That simply means that we should be able to do: RTC *rtc_create(arg1, arg2, arg2); And that a separate piece of code decides which devices are exposed through -device or device_add. Which devices are exposed is really a minor detail. That said, qdev has a number of significant limitations in my mind. The first is that the only relationship between devices is through the BusState interface. I don't think we should even try to have a generic bus model. When you look at how badly broken PCI hotplug is current in qdev, I think this is symptomatic of this. There's also no way in qdev to really have polymorphism. Interfaces really aren't meaningful in qdev so you have things like PCIDevice where some methods are stored in the object instead of the class dispatch table and you have overuse of static class members. And it's all unrelated to VMState. And this is just the basic mechanisms of qdev. The actual implementation is worse. The use of qemu_irq as gpio in the base class and overuse of SystemBus is really quite insane. And so far, the use of qdev has been entirely superficial. Devices still don't make use of bus level interfaces to do I/O so we don't have any better componentization than we did before qdev. The fact that there is no enough interest to convert all devices to it? I don't think there is any device that has been improved by qdev. -device is a nice feature, but it could have been implemented without qdev. Regards, Anthony Liguori How new way of doing things will solve this? 
Just to be clear I do not have problem with not having ability to compose x86 without pit or kbd controller. Basic things like RTC, pit, pic, ioapic, dma, kbd should be created unconditionally as part of x86 pc machine. But IMHO you are trying to take things to other extreme. -- Gleb.
[Qemu-devel] [PATCH] linux-user: fix compile failure if !CONFIG_USE_GUEST_BASE
If CONFIG_USE_GUEST_BASE is not defined, gcc complains:

linux-user/mmap.c:235: error: comparison of unsigned expression >= 0 is always true

because RESERVED_VA is #defined to 0. Since mmap_find_vma_reserved() will never
be called anyway if RESERVED_VA is always 0, fix this by simply #ifdef'ing away
the function and its callsite.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
I'm not a great fan of introducing #ifdefs, but I couldn't come up
with a cleaner way of shutting gcc up...

 linux-user/mmap.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index abf21f6..0cf22f8 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -216,6 +216,7 @@ static abi_ulong mmap_next_start = TASK_UNMAPPED_BASE;

 unsigned long last_brk;

+#ifdef CONFIG_USE_GUEST_BASE
 /* Subroutine of mmap_find_vma, used when we have pre-allocated a chunk
    of guest address space.  */
 static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size)
@@ -249,6 +250,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size)
     mmap_next_start = addr;
     return last_addr;
 }
+#endif

 /*
  * Find and reserve a free memory area of size 'size'. The search
@@ -271,9 +273,11 @@ abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size)

     size = HOST_PAGE_ALIGN(size);

+#ifdef CONFIG_USE_GUEST_BASE
     if (RESERVED_VA) {
         return mmap_find_vma_reserved(start, size);
     }
+#endif

     addr = start;
     wrapped = repeat = 0;
--
1.7.1
[Qemu-devel] Re: [PATCH uq/master -v2 2/2] KVM, MCE, unpoison memory address across reboot
On Wed, 2011-02-09 at 16:00 +0800, Jan Kiszka wrote: On 2011-02-09 04:00, Huang Ying wrote: In Linux kernel HWPoison processing implementation, the virtual address in processes mapping the error physical memory page is marked as HWPoison. So that, the further accessing to the virtual address will kill corresponding processes with SIGBUS. If the error physical memory page is used by a KVM guest, the SIGBUS will be sent to QEMU, and QEMU will simulate a MCE to report that memory error to the guest OS. If the guest OS can not recover from the error (for example, the page is accessed by kernel code), guest OS will reboot the system. But because the underlying host virtual address backing the guest physical memory is still poisoned, if the guest system accesses the corresponding guest physical memory even after rebooting, the SIGBUS will still be sent to QEMU and MCE will be simulated. That is, guest system can not recover via rebooting. Yeah, saw this already during my test... In fact, across rebooting, the contents of guest physical memory page need not to be kept. We can allocate a new host physical page to back the corresponding guest physical address. I just wondering what would be architecturally suboptimal if we simply remapped on SIGBUS directly. Would save us at least the bookkeeping. Because we can not change the content of memory silently during guest OS running, this may corrupts guest OS data structure and even ruins disk contents. But during rebooting, all guest OS state are discarded. [snip] @@ -1882,6 +1919,7 @@ int kvm_arch_on_sigbus_vcpu(CPUState *en hardware_memory_error(); } } +kvm_hwpoison_page_add(ram_addr); if (code == BUS_MCEERR_AR) { /* Fake an Intel architectural Data Load SRAR UCR */ @@ -1926,6 +1964,7 @@ int kvm_arch_on_sigbus(int code, void *a QEMU itself instead of guest system!: %p\n, addr); return 0; } +kvm_hwpoison_page_add(ram_addr); kvm_mce_inj_srao_memscrub2(first_cpu, paddr); } else #endif Looks fine otherwise. 
Unless that simplification makes sense, I could offer to include this into my MCE rework (there is some minor conflict). If all goes well, that series should be posted during this week. Thanks. Best Regards, Huang Ying
[Qemu-devel] Re: AHCI in SeaBIOS
On 02/10/2011 04:37 PM, Anthony Liguori wrote: We're right around the corner from -rc2 so let's wait until after 0.14 is released so we have a full release cycle to test it. And have unusable or only partly usable features (already mentioned multiple times: boot order, AHCI) in 0.14? Paolo
[Qemu-devel] [PATCH 5/7] include qemu-thread.h early
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Stefan Weil w...@mail.berlios.de
Cc: Blue Swirl blauwir...@gmail.com
---
 cpus.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/cpus.c b/cpus.c
index 4c9928e..68b3fcb 100644
--- a/cpus.c
+++ b/cpus.c
@@ -32,6 +32,7 @@
 #include "kvm.h"
 #include "exec-all.h"

+#include "qemu-thread.h"
 #include "cpus.h"
 #include "compatfd.h"
 #ifdef CONFIG_LINUX
@@ -319,8 +320,6 @@ void vm_stop(int reason)

 #else /* CONFIG_IOTHREAD */

-#include "qemu-thread.h"
-
 QemuMutex qemu_global_mutex;
 static QemuMutex qemu_fair_mutex;

--
1.7.3.5
[Qemu-devel] [PATCH 4/7] add win32 qemu-thread implementation
For now, qemu_cond_timedwait and qemu_mutex_timedlock are left as POSIX-only functions. They can be removed later, once the patches that remove their uses are in. Signed-off-by: Paolo Bonzini pbonz...@redhat.com Cc: Stefan Weil w...@mail.berlios.de Cc: Blue Swirl blauwir...@gmail.com --- Makefile.objs|4 +- qemu-thread.c = qemu-thread-posix.c |0 qemu-thread-posix.h | 18 +++ qemu-thread-win32.c | 272 ++ qemu-thread-win32.h | 22 +++ qemu-thread.h| 27 ++-- 6 files changed, 326 insertions(+), 17 deletions(-) rename qemu-thread.c = qemu-thread-posix.c (100%) create mode 100644 qemu-thread-posix.h create mode 100644 qemu-thread-win32.c create mode 100644 qemu-thread-win32.h diff --git a/Makefile.objs b/Makefile.objs index 353b1a8..19c31fc 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -140,8 +140,8 @@ endif common-obj-y += $(addprefix ui/, $(ui-obj-y)) common-obj-y += iov.o acl.o -common-obj-$(CONFIG_THREAD) += qemu-thread.o -common-obj-$(CONFIG_IOTHREAD) += compatfd.o +common-obj-$(CONFIG_POSIX) += qemu-thread-posix.o compatfd.o +common-obj-$(CONFIG_WIN32) += qemu-thread-win32.o common-obj-y += notify.o event_notifier.o common-obj-y += qemu-timer.o qemu-timer-common.o diff --git a/qemu-thread.c b/qemu-thread-posix.c similarity index 100% rename from qemu-thread.c rename to qemu-thread-posix.c diff --git a/qemu-thread-posix.h b/qemu-thread-posix.h new file mode 100644 index 000..7af371c --- /dev/null +++ b/qemu-thread-posix.h @@ -0,0 +1,18 @@ +#ifndef __QEMU_THREAD_POSIX_H +#define __QEMU_THREAD_POSIX_H 1 +#include pthread.h + +struct QemuMutex { +pthread_mutex_t lock; +}; + +struct QemuCond { +pthread_cond_t cond; +}; + +struct QemuThread { +pthread_t thread; +}; + +void qemu_thread_signal(QemuThread *thread, int sig); +#endif diff --git a/qemu-thread-win32.c b/qemu-thread-win32.c new file mode 100644 index 000..0465a9a --- /dev/null +++ b/qemu-thread-win32.c @@ -0,0 +1,272 @@ +/* + * Win32 implementation for mutex/cond/thread functions + * + * Copyright Red Hat, Inc. 
2010 + * + * Author: + * Paolo Bonzini pbonz...@redhat.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ +#include qemu-common.h +#include qemu-thread.h +#include process.h +#include assert.h +#include limits.h + +static void error_exit(int err, const char *msg) +{ +char *pstr; + +FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_ALLOCATE_BUFFER, + NULL, err, 0, (LPTSTR) pstr, 2, NULL); +fprintf(stderr, qemu: %s: %s\n, msg, pstr); +LocalFree(pstr); +exit(1); +} + +void qemu_mutex_init(QemuMutex *mutex) +{ +mutex-owner = 0; +InitializeCriticalSection(mutex-lock); +} + +void qemu_mutex_lock(QemuMutex *mutex) +{ +EnterCriticalSection(mutex-lock); + +/* Win32 CRITICAL_SECTIONs are recursive. Assert that we're not + * using them as such. + */ +assert (mutex-owner == 0); +mutex-owner = GetCurrentThreadId(); +} + +int qemu_mutex_trylock(QemuMutex *mutex) +{ +int owned; + +owned = TryEnterCriticalSection(mutex-lock); +if (owned) { +assert (mutex-owner == 0); +mutex-owner = GetCurrentThreadId(); +} +return !owned; +} + +void qemu_mutex_unlock(QemuMutex *mutex) +{ +assert (mutex-owner == GetCurrentThreadId()); +mutex-owner = 0; +LeaveCriticalSection(mutex-lock); +} + +void qemu_cond_init(QemuCond *cond) +{ +cond-waiters = 0; +cond-was_broadcast = 0; + +cond-sema = CreateSemaphore(NULL, 0, LONG_MAX, NULL); +if (!cond-sema) { +error_exit(GetLastError(), __func__); +} +cond-continue_broadcast = CreateEvent(NULL,/* security */ + FALSE, /* auto-reset */ + FALSE, /* not signaled */ + NULL); /* name */ +if (!cond-continue_broadcast) { +error_exit(GetLastError(), __func__); +} +} + +void qemu_cond_signal(QemuCond *cond) +{ +/* + * Signal only when there are waiters. cond-waiters is + * incremented by pthread_cond_wait under the external lock, + * so we are safe about that. 
+ * + * Waiting threads decrement it outside the external lock, but + * only if another thread is executing pthread_cond_broadcast and + * has the mutex. So, it also cannot be decremented concurrently + * with this particular access. + */ +if (cond-waiters 0) { +cond-waiters--; +if (!ReleaseSemaphore(cond-sema, 1, NULL)) { +error_exit(GetLastError(), __func__); +} +} +} + +void qemu_cond_broadcast(QemuCond *cond) +{ +BOOLEAN result; +/* + * As in pthread_cond_signal, access to cond-waiters and + *
[Qemu-devel] [PATCH 2/7] implement win32 dynticks timer
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Stefan Weil w...@mail.berlios.de
Cc: Blue Swirl blauwir...@gmail.com
---
 qemu-timer.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index b0db780..42960de 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -1006,6 +1006,7 @@ static void win32_stop_timer(struct qemu_alarm_timer *t)
 static void win32_rearm_timer(struct qemu_alarm_timer *t)
 {
     struct qemu_alarm_win32 *data = t->priv;
+    int nearest_delta_ms;

     assert(alarm_has_dynticks(t));
     if (!active_timers[QEMU_CLOCK_REALTIME] &&
@@ -1015,7 +1016,10 @@ static void win32_rearm_timer(struct qemu_alarm_timer *t)

     timeKillEvent(data->timerId);

-    data->timerId = timeSetEvent(1,
+    nearest_delta_ms = (qemu_next_alarm_deadline() + 999999) / 1000000;
+    if (nearest_delta_ms < 1)
+        nearest_delta_ms = 1;
+    data->timerId = timeSetEvent(nearest_delta_ms,
                         data->period,
                         host_alarm_handler,
                         (DWORD)t,
--
1.7.3.5
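The arithmetic in the patch above is a round-up division followed by a lower clamp. A minimal sketch follows, assuming the deadline is expressed in nanoseconds and the timer wants whole milliseconds; the function name deadline_ns_to_ms and the upper clamp to INT_MAX are additions for illustration and are not in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define NS_PER_MS 1000000LL

/*
 * Hypothetical helper: convert a deadline in nanoseconds to a timer
 * delay in milliseconds, rounding up so the timer never fires early,
 * and never returning less than 1 ms.
 */
static int deadline_ns_to_ms(int64_t deadline_ns)
{
    int64_t ms = (deadline_ns + NS_PER_MS - 1) / NS_PER_MS;

    if (ms < 1) {
        ms = 1;             /* already due or overdue: shortest delay */
    }
    if (ms > INT_MAX) {
        ms = INT_MAX;       /* keep the result representable as int */
    }
    return (int)ms;
}
```

Rounding up rather than truncating means a deadline 1.2 ms away is armed as 2 ms; truncation would arm 1 ms and wake slightly too early.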
[Qemu-devel] [PATCH 6/7] add assertions on the owner of a QemuMutex
These are already present in the Win32 implementation, add them to the pthread wrappers as well. Signed-off-by: Paolo Bonzini pbonz...@redhat.com Cc: Stefan Weil w...@mail.berlios.de Cc: Blue Swirl blauwir...@gmail.com --- qemu-thread-posix.c | 20 +++- qemu-thread-posix.h |1 + 2 files changed, 20 insertions(+), 1 deletions(-) diff --git a/qemu-thread-posix.c b/qemu-thread-posix.c index fbc78fe..e6cafd9 100644 --- a/qemu-thread-posix.c +++ b/qemu-thread-posix.c @@ -16,9 +16,12 @@ #include time.h #include signal.h #include stdint.h +#include assert.h #include string.h #include qemu-thread.h +static pthread_t pthread_null; + static void error_exit(int err, const char *msg) { fprintf(stderr, qemu: %s: %s\n, msg, strerror(err)); @@ -29,6 +32,7 @@ void qemu_mutex_init(QemuMutex *mutex) { int err; +mutex-owner = pthread_null; err = pthread_mutex_init(mutex-lock, NULL); if (err) error_exit(err, __func__); @@ -48,13 +52,22 @@ void qemu_mutex_lock(QemuMutex *mutex) int err; err = pthread_mutex_lock(mutex-lock); +assert (pthread_equal(mutex-owner, pthread_null)); +mutex-owner = pthread_self(); if (err) error_exit(err, __func__); } int qemu_mutex_trylock(QemuMutex *mutex) { -return pthread_mutex_trylock(mutex-lock); +int err; +err = pthread_mutex_trylock(mutex-lock); +if (err == 0) { +assert (pthread_equal(mutex-owner, pthread_null)); +mutex-owner = pthread_self(); +} + +return !!err; } static void timespec_add_ms(struct timespec *ts, uint64_t msecs) @@ -85,6 +98,8 @@ void qemu_mutex_unlock(QemuMutex *mutex) { int err; +assert (pthread_equal(mutex-owner, pthread_self())); +mutex-owner = pthread_null; err = pthread_mutex_unlock(mutex-lock); if (err) error_exit(err, __func__); @@ -130,7 +145,10 @@ void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex) { int err; +assert (pthread_equal(mutex-owner, pthread_self())); +mutex-owner = pthread_null; err = pthread_cond_wait(cond-cond, mutex-lock); +mutex-owner = pthread_self(); if (err) error_exit(err, __func__); } diff --git 
a/qemu-thread-posix.h b/qemu-thread-posix.h index 7af371c..11978db 100644 --- a/qemu-thread-posix.h +++ b/qemu-thread-posix.h @@ -4,6 +4,7 @@ struct QemuMutex { pthread_mutex_t lock; +pthread_t owner; }; struct QemuCond { -- 1.7.3.5
[Qemu-devel] [PATCH 7/7] remove CONFIG_THREAD
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Stefan Weil w...@mail.berlios.de
Cc: Blue Swirl blauwir...@gmail.com
---
 configure |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 598e8e1..46a6389 100755
--- a/configure
+++ b/configure
@@ -2609,7 +2609,6 @@ if test "$vnc_png" != "no" ; then
 fi
 if test "$vnc_thread" != "no" ; then
   echo "CONFIG_VNC_THREAD=y" >> $config_host_mak
-  echo "CONFIG_THREAD=y" >> $config_host_mak
 fi
 if test "$fnmatch" = "yes" ; then
   echo "CONFIG_FNMATCH=y" >> $config_host_mak
@@ -2696,7 +2695,6 @@ if test "$xen" = "yes" ; then
 fi
 if test "$io_thread" = "yes" ; then
   echo "CONFIG_IOTHREAD=y" >> $config_host_mak
-  echo "CONFIG_THREAD=y" >> $config_host_mak
 fi
 if test "$linux_aio" = "yes" ; then
   echo "CONFIG_LINUX_AIO=y" >> $config_host_mak
--
1.7.3.5
[Qemu-devel] [PATCH 1/7] unlock iothread during WaitForMultipleObjects
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Stefan Weil w...@mail.berlios.de
Cc: Blue Swirl blauwir...@gmail.com
---
 os-win32.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/os-win32.c b/os-win32.c
index 566d5e9..d907c59 100644
--- a/os-win32.c
+++ b/os-win32.c
@@ -140,7 +140,9 @@ void os_host_main_loop_wait(int *timeout)
         int err;
         WaitObjects *w = &wait_objects;

+        qemu_mutex_unlock_iothread();
         ret = WaitForMultipleObjects(w->num, w->events, FALSE, *timeout);
+        qemu_mutex_lock_iothread();
         if (WAIT_OBJECT_0 + 0 <= ret && ret <= WAIT_OBJECT_0 + w->num - 1) {
             if (w->func[ret - WAIT_OBJECT_0])
                 w->func[ret - WAIT_OBJECT_0](w->opaque[ret - WAIT_OBJECT_0]);
--
1.7.3.5
Re: [Qemu-devel] [PATCH 2/7] Enable I/O thread and VNC threads by default
On 02/09/2011 11:16 PM, Stefan Weil wrote: I decided to create a new directory structure hosts/w32, so files can be moved from the root to hosts/posix, hosts/w32, or hosts/xxx. Include chains reduce code modifications and conditional compilations. And people who don't want to see w32 support can remove it easily :-) Supporting I/O threads for W32 will be possible, too. I have patches for Win32 iothread, I'm just posting the series split into multiple pieces. Paolo
Re: [Qemu-devel] KVM call minutes for Feb 8
On 10 February 2011 08:36, Anthony Liguori anth...@codemonkey.ws wrote: On 02/10/2011 09:16 AM, Peter Maydell wrote: On 10 February 2011 07:47, Anthony Liguorianth...@codemonkey.ws wrote: 2) get rid of the entire concept of machines. Creating a i440fx is essentially equivalent to creating a bare machine. Does that make any sense for anything other than target-i386? The concept of a machine model seems a pretty obvious one for ARM boards, for instance, and I'm not sure we'd gain much by having i386 be different to the other architectures... Yes, it makes a lot of sense, I just don't know the component names as well so bear with me :-) There are two types of Versatile machines today, Versatile/AB and Versatile/PB. They are both made with the same core, ARM926EJ-S, with different expansions. So you would model arm926ej-s as the chipset and then build up the machines by modifying parameters of the chipset (like the board id) and/or adding different components on top of it. Er, ARM926 is the CPU, it's not a chipset. The board ID is definitely not a property of an ARM926, it's a property of the board (clue is in the name :-)). I don't think versatile boards have a chipset really... In my understanding the machine is the thing that says I need a 926, and an MMC controller at this address, and some UARTS, and... ie it is the thing that does the modifying parameters and adding different components. So if we'd still be doing that I don't see how we've got rid of the concept. I guess I'm missing the point somehow. A good way to think about what I'm proposing is that machine-init really should be a constructor for a device object. If you mean that you want machines to be implemented under the hood as a single huge device you can only have one of that spans the entire memory map, well I guess that's an implementation detail. 
But conceptually machines really do exist, and we definitely still want users to be able to say I want a beagle machine; I want a versatile; I want an n900. -- PMM
Re: [Qemu-devel] KVM call minutes for Feb 8
On Thu, Feb 10, 2011 at 08:47:12AM +0100, Anthony Liguori wrote: On 02/09/2011 09:15 PM, Blue Swirl wrote: On Wed, Feb 9, 2011 at 9:59 PM, Anthony Liguorianth...@codemonkey.ws wrote: On 02/09/2011 06:48 PM, Blue Swirl wrote: ISASerialState dev; isa_serial_init(dev, 0, 0x274, 0x07, NULL, NULL); Do you mean that there should be a generic way of doing that, like sysbus_create_varargs() for qdev, or just add inline functions which hide qdev property setup? I still think that FDT should be used in the future. That would require that the properties can be set up mechanically, and I don't see how your proposal would help that. Yeah, I don't think that is a good idea anymore. I think this is part of why we're having so many problems with qdev. While (most?) hardware hierarchies can be represented by device tree syntax, not all valid device trees correspond to interface and/or useful hardware hierarchies. User creates a non-working machine and so gets to fix the problems? How is that a problem for us? It's not about creating a non-working machine. It's about what user-level abstraction we need to provide. It's a whole lot easier to implement an i440fx device with a fixed set of parameters than it is to make every possible subdevice have a proper factory interface along with mechanisms to hook everything together. So what if it is easier, it doesn't mean it is correct thing to do. What you are proposing is just a huge step backwards. May be we shouldn't support hooking everything together in completely arbitrary ways, but we shouldn't force isa/pci devices upon our users just because they are non-removable on real chip. Basically, we're making things much harder for ourselves than we should. We want to have an interface to create large chunks of hardware (like an i440fx) which then results in a significant portion of a device tree. But how would this affect interface to devices? I don't see how that would be any different with current model and the function call model. 
If all composition is done through a factory interface, it doesn't. But my main argument here is that we shouldn't try to make all composition done through a factory interface--only where it makes sense. So very concretely, I'm suggesting we do the following to target-i386: 1) make the i440fx device have an embedded ide controller, piix3, and usb controller that get initialized automatically. The piix3 embeds the PCI-to-ISA bridge along with all of the default ISA devices (rtc, serial, etc.). This may be a problem even from security point of view. What if usb code (ide, serial, parallel) has guest exploitable bug? Currently I can happily continue running guests if they do not need affected subsystem. If we'll get it your way I will no longer be able to do so. 2) get rid of the entire concept of machines. Creating a i440fx is essentially equivalent to creating a bare machine. 3) just use the existing -device infrastructure to support all of this. A very simple device config corresponds to a very complex device tree but that's the desired effect. 4) model the CPUs as devices that take a pointer to a host controller, for x86, the normal case would be giving it a pointer to i440fx. Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Gleb.
[Qemu-devel] [PATCH 01/18] Make QEMUFile buf expandable, and introduce qemu_realloc_buffer() and qemu_clear_buffer().
Currently buf size is fixed at 32KB. It would be useful if it could be flexible.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 hw/hw.h  |    2 ++
 savevm.c |   20 +++-
 2 files changed, 21 insertions(+), 1 deletions(-)

diff --git a/hw/hw.h b/hw/hw.h
index 5e24329..a168a37 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -58,6 +58,8 @@ void qemu_fflush(QEMUFile *f);
 int qemu_fclose(QEMUFile *f);
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
 void qemu_put_byte(QEMUFile *f, int v);
+void *qemu_realloc_buffer(QEMUFile *f, int size);
+void qemu_clear_buffer(QEMUFile *f);

 static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
 {
diff --git a/savevm.c b/savevm.c
index 6d83b0f..6c4c72b 100644
--- a/savevm.c
+++ b/savevm.c
@@ -171,7 +171,8 @@ struct QEMUFile {
                            when reading */
     int buf_index;
     int buf_size; /* 0 when writing */
-    uint8_t buf[IO_BUF_SIZE];
+    int buf_max_size;
+    uint8_t *buf;

     int has_error;
 };
@@ -422,6 +423,9 @@ QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
     f->get_rate_limit = get_rate_limit;
     f->is_write = 0;

+    f->buf_max_size = IO_BUF_SIZE;
+    f->buf = qemu_malloc(sizeof(uint8_t) * f->buf_max_size);
+
     return f;
 }

@@ -452,6 +456,19 @@ void qemu_fflush(QEMUFile *f)
     }
 }

+void *qemu_realloc_buffer(QEMUFile *f, int size)
+{
+    f->buf_max_size = size;
+    f->buf = qemu_realloc(f->buf, f->buf_max_size);
+
+    return f->buf;
+}
+
+void qemu_clear_buffer(QEMUFile *f)
+{
+    f->buf_size = f->buf_index = f->buf_offset = 0;
+}
+
 static void qemu_fill_buffer(QEMUFile *f)
 {
     int len;
@@ -477,6 +494,7 @@ int qemu_fclose(QEMUFile *f)
     qemu_fflush(f);
     if (f->close)
         ret = f->close(f->opaque);
+    qemu_free(f->buf);
     qemu_free(f);
     return ret;
 }
--
1.7.1.2
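Editorial note: the growable-buffer idea in the patch above can be sketched outside QEMU roughly as follows. This is a minimal illustration in plain C using realloc(), not the QEMU code; GrowBuf, growbuf_init(), growbuf_resize() and growbuf_free() are hypothetical names. Unlike the patch, the sketch refuses to shrink below the bytes already buffered, which is an assumption about what a caller would want.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the buffer-related fields of QEMUFile. */
typedef struct GrowBuf {
    uint8_t *buf;
    int buf_index;     /* number of bytes currently buffered */
    int buf_max_size;  /* allocated capacity in bytes */
} GrowBuf;

/* Allocate the initial buffer; returns 0 on success, -1 on failure. */
static int growbuf_init(GrowBuf *b, int size)
{
    b->buf = malloc((size_t)size);
    b->buf_index = 0;
    b->buf_max_size = b->buf ? size : 0;
    return b->buf ? 0 : -1;
}

/* Change the capacity. Returns the (possibly moved) buffer, or NULL if the
 * request is invalid or allocation fails; on NULL the old buffer is kept. */
static uint8_t *growbuf_resize(GrowBuf *b, int size)
{
    uint8_t *p;

    if (size <= 0 || size < b->buf_index) {
        return NULL;   /* would drop data that is already buffered */
    }
    p = realloc(b->buf, (size_t)size);
    if (p == NULL) {
        return NULL;
    }
    b->buf = p;
    b->buf_max_size = size;
    return p;
}

/* Release the buffer and reset the bookkeeping. */
static void growbuf_free(GrowBuf *b)
{
    free(b->buf);
    b->buf = NULL;
    b->buf_index = 0;
    b->buf_max_size = 0;
}
```

Callers must re-read the buffer pointer after every resize, since realloc() may move the allocation.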
[Qemu-devel] Re: [PATCH 6/7] add assertions on the owner of a QemuMutex
On 2011-02-10 18:37, Paolo Bonzini wrote: These are already present in the Win32 implementation, add them to the pthread wrappers as well. Better use PTHREAD_MUTEX_ERRORCHECK. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH 17/18] migration-tcp: modify tcp_accept_incoming_migration() to handle ft_mode, and add a hack not to close fd when ft_mode is enabled.
When ft_mode is set in the header, tcp_accept_incoming_migration() sets ft_trans_incoming() as a callback, and call qemu_file_get_notify() to receive FT transaction iteratively. We also need a hack no to close fd before moving to ft_transaction mode, so that we can reuse the fd for it. vm_change_state_handler is added to turn off ft_mode when cont is pressed. Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp --- migration-tcp.c | 67 ++- 1 files changed, 66 insertions(+), 1 deletions(-) diff --git a/migration-tcp.c b/migration-tcp.c index 55777c8..84076d6 100644 --- a/migration-tcp.c +++ b/migration-tcp.c @@ -18,6 +18,8 @@ #include sysemu.h #include buffered_file.h #include block.h +#include ft_trans_file.h +#include event-tap.h //#define DEBUG_MIGRATION_TCP @@ -29,6 +31,8 @@ do { } while (0) #endif +static VMChangeStateEntry *vmstate; + static int socket_errno(FdMigrationState *s) { return socket_error(); @@ -56,7 +60,8 @@ static int socket_read(FdMigrationState *s, const void * buf, size_t size) static int tcp_close(FdMigrationState *s) { DPRINTF(tcp_close\n); -if (s-fd != -1) { +/* FIX ME: accessing ft_mode here isn't clean */ +if (s-fd != -1 ft_mode != FT_INIT) { close(s-fd); s-fd = -1; } @@ -150,6 +155,36 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon, return s-mig_state; } +static void ft_trans_incoming(void *opaque) +{ +QEMUFile *f = opaque; + +qemu_file_get_notify(f); +if (qemu_file_has_error(f)) { +ft_mode = FT_ERROR; +qemu_fclose(f); +} +} + +static void ft_trans_reset(void *opaque, int running, int reason) +{ +QEMUFile *f = opaque; + +if (running) { +if (ft_mode != FT_ERROR) { +qemu_fclose(f); +} +ft_mode = FT_OFF; +qemu_del_vm_change_state_handler(vmstate); +} +} + +static void ft_trans_schedule_replay(QEMUFile *f) +{ +event_tap_schedule_replay(); +vmstate = qemu_add_vm_change_state_handler(ft_trans_reset, f); +} + static void tcp_accept_incoming_migration(void *opaque) { struct sockaddr_in addr; @@ -175,8 +210,38 @@ static void 
tcp_accept_incoming_migration(void *opaque) goto out; } +if (ft_mode == FT_INIT) { +autostart = 0; +} + process_incoming_migration(f); + +if (ft_mode == FT_INIT) { +int ret; + +socket_set_nodelay(c); + +f = qemu_fopen_ft_trans(s, c); +if (f == NULL) { +fprintf(stderr, could not qemu_fopen_ft_trans\n); +goto out; +} + +/* need to wait sender to setup */ +ret = qemu_ft_trans_begin(f); +if (ret 0) { +goto out; +} + +qemu_set_fd_handler2(c, NULL, ft_trans_incoming, NULL, f); +ft_trans_schedule_replay(f); +ft_mode = FT_TRANSACTION_RECV; + +return; +} + qemu_fclose(f); + out: close(c); out2: -- 1.7.1.2
[Qemu-devel] [PATCH 10/18] Call init handler of event-tap at main() in vl.c.
Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 vl.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/vl.c b/vl.c
index 00155fb..f4d4abf 100644
--- a/vl.c
+++ b/vl.c
@@ -162,6 +162,7 @@ int main(int argc, char **argv)
 #include "qemu-queue.h"
 #include "cpus.h"
 #include "arch_init.h"
+#include "event-tap.h"

 #include "ui/qemu-spice.h"

@@ -2919,6 +2920,8 @@ int main(int argc, char **argv, char **envp)

     blk_mig_init();

+    event_tap_init();
+
     /* open the virtual block devices */
     if (snapshot)
         qemu_opts_foreach(qemu_find_opts("drive"), drive_enable_snapshot, NULL, 0);
--
1.7.1.2
[Qemu-devel] [PATCH 06/18] virtio: decrement last_avail_idx with inuse before saving.
For regular migration inuse == 0 always as requests are flushed before save. However, event-tap log when enabled introduces an extra queue for requests which is not being flushed, thus the last inuse requests are left in the event-tap queue. Move the last_avail_idx value sent to the remote back to make it repeat the last inuse requests.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 hw/virtio.c |   10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/hw/virtio.c b/hw/virtio.c
index 31bd9e3..f05d1b6 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -673,12 +673,20 @@ void virtio_save(VirtIODevice *vdev, QEMUFile *f)
     qemu_put_be32(f, i);

     for (i = 0; i < VIRTIO_PCI_QUEUE_MAX; i++) {
+        /* For regular migration inuse == 0 always as
+         * requests are flushed before save. However,
+         * event-tap log when enabled introduces an extra
+         * queue for requests which is not being flushed,
+         * thus the last inuse requests are left in the event-tap queue.
+         * Move the last_avail_idx value sent to the remote back
+         * to make it repeat the last inuse requests. */
+        uint16_t last_avail = vdev->vq[i].last_avail_idx - vdev->vq[i].inuse;
         if (vdev->vq[i].vring.num == 0)
             break;

         qemu_put_be32(f, vdev->vq[i].vring.num);
         qemu_put_be64(f, vdev->vq[i].pa);
-        qemu_put_be16s(f, &vdev->vq[i].last_avail_idx);
+        qemu_put_be16s(f, &last_avail);
         if (vdev->binding->save_queue)
             vdev->binding->save_queue(vdev->binding_opaque, i, f);
     }
--
1.7.1.2
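Editorial note: the subtraction in the patch above relies on 16-bit wraparound, because a vring index is a free-running 16-bit counter. A small standalone sketch of that arithmetic follows; rewind_avail_idx() is a hypothetical helper name used only for illustration.

```c
#include <stdint.h>

/* Move a free-running 16-bit ring index back by `inuse` entries.
 * Both operands are promoted to int for the subtraction; the cast
 * truncates the result back to 16 bits, i.e. modulo 65536. */
static uint16_t rewind_avail_idx(uint16_t last_avail_idx, uint16_t inuse)
{
    return (uint16_t)(last_avail_idx - inuse);
}
```

For example, rewinding index 2 by 5 in-flight requests gives 65533, which is the correct position once the counter is read modulo 65536.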
[Qemu-devel] [PATCH 05/18] vl.c: add deleted flag for deleting the handler.
Make deleting handlers robust against deletion of any elements in a handler by using a deleted flag like in file descriptors.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 vl.c |   13 +
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/vl.c b/vl.c
index ed2cdfa..00155fb 100644
--- a/vl.c
+++ b/vl.c
@@ -1158,6 +1158,7 @@ static void nographic_update(void *opaque)
 struct vm_change_state_entry {
     VMChangeStateHandler *cb;
     void *opaque;
+    int deleted;
     QLIST_ENTRY (vm_change_state_entry) entries;
 };

@@ -1178,8 +1179,7 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler(VMChangeStateHandler *cb,

 void qemu_del_vm_change_state_handler(VMChangeStateEntry *e)
 {
-    QLIST_REMOVE (e, entries);
-    qemu_free (e);
+    e->deleted = 1;
 }

 void vm_state_notify(int running, int reason)
@@ -1188,8 +1188,13 @@ void vm_state_notify(int running, int reason)

     trace_vm_state_notify(running, reason);

-    for (e = vm_change_state_head.lh_first; e; e = e->entries.le_next) {
-        e->cb(e->opaque, running, reason);
+    QLIST_FOREACH(e, &vm_change_state_head, entries) {
+        if (e->deleted) {
+            QLIST_REMOVE(e, entries);
+            qemu_free(e);
+        } else {
+            e->cb(e->opaque, running, reason);
+        }
     }
 }

--
1.7.1.2
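Editorial note: the deferred-deletion pattern from the patch above can be sketched with a hand-rolled singly linked list. This is an illustration under stated assumptions, not the QEMU implementation: Entry, handler_add(), handler_del(), notify() and count_cb() are hypothetical names. One difference is deliberate: the traversal never reads a field of an entry after freeing it, whereas the loop in the patch appears to advance through e->entries.le_next after qemu_free(e).

```c
#include <stdlib.h>

typedef void Handler(void *opaque, int running);

typedef struct Entry {
    Handler *cb;
    void *opaque;
    int deleted;          /* set by handler_del(), reaped by notify() */
    struct Entry *next;
} Entry;

static Entry *head;

/* Register a callback; returns the entry, or NULL on allocation failure. */
static Entry *handler_add(Handler *cb, void *opaque)
{
    Entry *e = calloc(1, sizeof(*e));

    if (e == NULL) {
        return NULL;
    }
    e->cb = cb;
    e->opaque = opaque;
    e->next = head;
    head = e;
    return e;
}

/* Only mark the entry; safe to call from inside a running callback. */
static void handler_del(Entry *e)
{
    e->deleted = 1;
}

/* Call every live handler and unlink/free the ones marked deleted. */
static void notify(int running)
{
    Entry **link = &head;

    while (*link != NULL) {
        Entry *e = *link;

        if (e->deleted) {
            *link = e->next;   /* unlink first, then free */
            free(e);
        } else {
            e->cb(e->opaque, running);
            link = &e->next;
        }
    }
}

/* Example callback: counts how many times it has been invoked. */
static void count_cb(void *opaque, int running)
{
    (void)running;
    ++*(int *)opaque;
}
```

An entry that deletes itself from its own callback is only flagged during that pass and is released on the next notify() call.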
[Qemu-devel] [PULL 0.14] linux-user fixes
The following changes since commit 343c1de916b1841cd5fd5f813add9c87590d72e8:

  x86: Fix MCA broadcast parameters for TCG case (2011-02-08 12:37:30 +0100)

are available in the git repository at:

  git://gitorious.org/qemu-maemo/qemu.git linux-user-for-0.14

Martin Mohring (1):
      linux-user: fix for loopmount ioctl

Stefan Weil (1):
      linux-user: Fix possible realloc memory leak

 linux-user/elfload.c |    8 +---
 linux-user/ioctls.h  |    2 --
 2 files changed, 5 insertions(+), 5 deletions(-)
Re: [Qemu-devel] [PATCH 2/7] Enable I/O thread and VNC threads by default
On 02/09/2011 11:16 PM, Stefan Weil wrote: The patch is available here: http://repo.or.cz/w/qemu/ar7.git/commitdiff/aabf11dc0a938b84d76d7c147cbf0445d7bee297 diff --git a/hosts/w32/include/signal.h b/hosts/w32/include/signal.h new file mode 100644 index 000..e45f03c --- /dev/null +++ b/hosts/w32/include/signal.h @@ -0,0 +1,20 @@ +/* + * QEMU w32 support + * + * Copyright (C) 2011 Stefan Weil + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef WIN32_SIGNAL_H +#define WIN32_SIGNAL_H + +#include_next signal.h +#include sys/types.h/* sigset_t */ + +int pthread_sigmask(int how, const sigset_t *set, sigset_t *oldset); +int sigfillset(sigset_t *set); + +#endif /* WIN32_SIGNAL_H */ diff --git a/hosts/w32/include/time.h b/hosts/w32/include/time.h new file mode 100644 index 000..0b997d3 --- /dev/null +++ b/hosts/w32/include/time.h @@ -0,0 +1,31 @@ +/* + * QEMU w32 support + * + * Copyright (C) 2011 Stefan Weil + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#if !defined(W32_TIME_H) +#define W32_TIME_H + +#include_next time.h + +#ifndef HAVE_STRUCT_TIMESPEC +#define HAVE_STRUCT_TIMESPEC 1 +struct timespec { +long tv_sec; +long tv_nsec; +}; +#endif /* HAVE_STRUCT_TIMESPEC */ + +typedef enum { + CLOCK_REALTIME = 0 +} clockid_t; + +int clock_getres (clockid_t clock_id, struct timespec *res); +int clock_gettime(clockid_t clock_id, struct timespec *pTimespec); + +#endif /* W32_TIME_H */ diff --git a/os-win32.c b/os-win32.c index b214e6a..7778366 100644 --- a/os-win32.c +++ b/os-win32.c @@ -36,6 +36,45 @@ /***/ /* Functions missing in mingw */ +#if defined(CONFIG_THREAD) + +int clock_gettime(clockid_t clock_id, struct timespec *pTimespec) +{ + int result = 0; + if (clock_id == CLOCK_REALTIME pTimespec != 0) { +DWORD t = GetTickCount(); +const unsigned cps = 1000; +struct timespec ts; +ts.tv_sec = t / cps; +ts.tv_nsec = (t % cps) * (10UL / cps); +*pTimespec = ts; + } else { +errno = EINVAL; +result = -1; + } + return result; +} Why is this needed? The only user of clock_gettime in the POSIX case is using CLOCK_MONOTONIC, and actually has a Win32 version already. +int pthread_sigmask(int how, const sigset_t *set, sigset_t *oldset) +{ +/* Dummy, do nothing. */ +return EINVAL; +} + +int sigfillset(sigset_t *set) +{ +int result = 0; +if (set) { +*(set) = (sigset_t)(-1); +} else { +errno = EINVAL; +result = -1; +} +return result; +} Instead of these, it's better to provide a Win32 implementation of mutexes and condvars. I'll submit it next week hopefully. Paolo
Re: [Qemu-devel] KVM call minutes for Feb 8
On Thu, Feb 10, 2011 at 11:00:50AM +0100, Anthony Liguori wrote: On 02/10/2011 10:07 AM, Gleb Natapov wrote: So what if it is easier, it doesn't mean it is correct thing to do. If we spend the next 10 years trying to do the correct thing for some arbitrary definition of correct, that's not terribly useful. Changing direction by 180 every 2 years even less useful. It's really simple actually. Let's do the least clever thing and model how hardware actual works. Once we have that, we can try to be better than real hardware (if it's possible). I think out understanding on how HW actually works is very different. You are placing to much value on were device resides physically, for me it is completely unimportant detail. Not worth even mentioning. If all composition is done through a factory interface, it doesn't. But my main argument here is that we shouldn't try to make all composition done through a factory interface--only where it makes sense. So very concretely, I'm suggesting we do the following to target-i386: 1) make the i440fx device have an embedded ide controller, piix3, and usb controller that get initialized automatically. The piix3 embeds the PCI-to-ISA bridge along with all of the default ISA devices (rtc, serial, etc.). This may be a problem even from security point of view. What if usb code (ide, serial, parallel) has guest exploitable bug? Currently I can happily continue running guests if they do not need affected subsystem. If we'll get it your way I will no longer be able to do so. qemu -device i440fx,ide=off So you still need to support arbitrary composition. What's the difference? So why do you like -device i440fx over what we have now? In current speak you propose will be implement by using i440fx machine type. Qdev will build it for you. If you really care to do this. But this desire to remove devices is silly IMHO. Concerns about security are misplaced. 
If you have to change the way a guest is invoked in order to eliminate security problems, then there's something seriously wrong. No I do not. I do not create guest with unneeded devices from the beginning. -- Gleb.
Re: [Qemu-devel] [PATCH] Make tb_alloc static.
On Thu, Feb 10, 2011 at 10:04:57AM +0100, Tristan Gingold wrote: On Wed, Feb 09, 2011 at 07:52:52PM +0100, Aurelien Jarno wrote: What about moving tb_alloc() (with tb_free()) higher in the file? After all it make sense to have the function creating or destructing a tb before the function manipulating them. Thanks. Like this ? Yes, perfect. Applied. Tristan. This function is only used within exec.c, so no need to make it public. Signed-off-by: Tristan Gingold ging...@adacore.com --- exec-all.h |1 - exec.c | 52 ++-- 2 files changed, 26 insertions(+), 27 deletions(-) diff --git a/exec-all.h b/exec-all.h index 81497c0..c062693 100644 --- a/exec-all.h +++ b/exec-all.h @@ -182,7 +182,6 @@ static inline unsigned int tb_phys_hash_func(tb_page_addr_t pc) return (pc 2) (CODE_GEN_PHYS_HASH_SIZE - 1); } -TranslationBlock *tb_alloc(target_ulong pc); void tb_free(TranslationBlock *tb); void tb_flush(CPUState *env); void tb_link_page(TranslationBlock *tb, diff --git a/exec.c b/exec.c index 477199b..9a7a752 100644 --- a/exec.c +++ b/exec.c @@ -649,6 +649,32 @@ void cpu_exec_init(CPUState *env) #endif } +/* Allocate a new translation block. Flush the translation buffer if + too many translation blocks or too much generated code. */ +static TranslationBlock *tb_alloc(target_ulong pc) +{ +TranslationBlock *tb; + +if (nb_tbs = code_gen_max_blocks || +(code_gen_ptr - code_gen_buffer) = code_gen_buffer_max_size) +return NULL; +tb = tbs[nb_tbs++]; +tb-pc = pc; +tb-cflags = 0; +return tb; +} + +void tb_free(TranslationBlock *tb) +{ +/* In practice this is mostly used for single use temporary TB + Ignore the hard cases and just back up if this TB happens to + be the last one generated. 
*/ +if (nb_tbs 0 tb == tbs[nb_tbs - 1]) { +code_gen_ptr = tb-tc_ptr; +nb_tbs--; +} +} + static inline void invalidate_page_bitmap(PageDesc *p) { if (p-code_bitmap) { @@ -1227,32 +1253,6 @@ static inline void tb_alloc_page(TranslationBlock *tb, #endif /* TARGET_HAS_SMC */ } -/* Allocate a new translation block. Flush the translation buffer if - too many translation blocks or too much generated code. */ -TranslationBlock *tb_alloc(target_ulong pc) -{ -TranslationBlock *tb; - -if (nb_tbs = code_gen_max_blocks || -(code_gen_ptr - code_gen_buffer) = code_gen_buffer_max_size) -return NULL; -tb = tbs[nb_tbs++]; -tb-pc = pc; -tb-cflags = 0; -return tb; -} - -void tb_free(TranslationBlock *tb) -{ -/* In practice this is mostly used for single use temporary TB - Ignore the hard cases and just back up if this TB happens to - be the last one generated. */ -if (nb_tbs 0 tb == tbs[nb_tbs - 1]) { -code_gen_ptr = tb-tc_ptr; -nb_tbs--; -} -} - /* add a new TB and link it to the physical page tables. phys_page2 is (-1) to indicate that only one page contains the TB. */ void tb_link_page(TranslationBlock *tb, -- 1.7.3.GIT -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH v3 0/6] target-arm: Fix floating point conversions
On Thu, Feb 10, 2011 at 11:28:55AM +, Peter Maydell wrote: This patchset fixes two issues: * default_nan_mode not being honoured for float-to-float conversions * half precision conversions being broken in a number of ways as well as not handling default_nan_mode. With this patchset qemu passes random-instruction-selection tests for VCVT.F32.F16, VCVT.F16.F32, VCVTB and VCVTT, in both IEEE and non-IEEE modes, with and without default-NaN behaviour. Christophe: this patchset includes your softfloat v3 patch, although I have split it up a little to keep the float16 bits separate. Changes since v2: * added STRUCT_TYPES version of float16 and fixed various places which needed a make_float16()/float16_val() in order to compile with STRUCT_TYPES enabled * s/bits16/float16/ in patch 3 as suggested by Aurelien * fixed the types in the f16-related ARM helper wrappers in patch 6 Patch 2 is unchanged and so I've added Aurelien's reviewed-by signoff; the others all changed, although mostly in minor ways. (Compiling with STRUCT_TYPES enabled also needs some fixes to existing float32/float64 code; I'll send a separate patchset for that.) Christophe Lyon (1): softfloat: Honour default_nan_mode for float-to-float conversions Peter Maydell (5): softfloat: Add float16 type and float16 NaN handling functions softfloat: Fix single-to-half precision float conversions softfloat: Correctly handle NaNs in float16_to_float32() target-arm: Silence NaNs resulting from half-precision conversions target-arm: Use standard FPSCR for Neon half-precision operations fpu/softfloat-specialize.h | 130 ++-- fpu/softfloat.c| 100 ++ fpu/softfloat.h| 19 ++- target-arm/helper.c| 38 +++-- target-arm/helpers.h |2 + target-arm/translate.c | 16 +++--- 6 files changed, 251 insertions(+), 54 deletions(-) Thanks, all applied. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] KVM call minutes for Feb 8
On 02/10/2011 09:47 AM, Anthony Liguori wrote: So very concretely, I'm suggesting we do the following to target-i386: 1) make the i440fx device have an embedded ide controller, piix3, and usb controller that get initialized automatically. The piix3 embeds the PCI-to-ISA bridge along with all of the default ISA devices (rtc, serial, etc.). This I like. 2) get rid of the entire concept of machines. Creating a i440fx is essentially equivalent to creating a bare machine. No, it's not. The 440fx does not include an IOAPIC, for example. There may be other optional components, or differences in wiring, that make two machines with i440fx not identical. 4) model the CPUs as devices that take a pointer to a host controller, for x86, the normal case would be giving it a pointer to i440fx. Surely the connection is via a bus? An x86 cpu talks to the bus, and there happens to be an 440fx north bridge at the end of it. It could also be a Q35 or something else. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.
2011/2/10 Daniel P. Berrange berra...@redhat.com: On Thu, Feb 10, 2011 at 07:23:33PM +0900, Yoshiaki Tamura wrote: 2011/2/10 Daniel P. Berrange berra...@redhat.com: On Thu, Feb 10, 2011 at 10:54:01AM +0100, Anthony Liguori wrote: On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote: Currently FdMigrationState doesn't support read(), and this patch introduces it to get response from the other side. Signed-off-by: Yoshiaki Tamuratamura.yoshi...@lab.ntt.co.jp Migration is unidirectional. Changing this is fundamental and not something to be done lightly. Making it bi-directional might break libvirt's save/restore to file support which uses migration, passing a unidirectional FD for the file. It could also break libvirt's secure tunnelled migration support which is currently only expecting to have data sent in one direction on the socket. Hi Daniel, IIUC, this patch isn't something to make existing live migration bi-directional. Just opens up a way for Kemari to use it. Do you think it's dangerous for libvirt still? The key is for it to be a no-op for any usage of the existing 'migrate' command. I had thought this was wiring up read into the event loop too, so it would be poll()ing for reads, but after re-reading I see this isn't the case here. It's a no-op for existing migration related code. Anthony, did you have the same concern? Yoshi Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] KVM call minutes for Feb 8
On 02/10/2011 11:10 AM, Gleb Natapov wrote: On Thu, Feb 10, 2011 at 11:00:50AM +0100, Anthony Liguori wrote: On 02/10/2011 10:07 AM, Gleb Natapov wrote: So what if it is easier, it doesn't mean it is correct thing to do. If we spend the next 10 years trying to do the correct thing for some arbitrary definition of correct, that's not terribly useful. Changing direction by 180 every 2 years even less useful. If we think through what we are doing and have a coherent architecture before changing direction, then we won't have this problem. It's really simple actually. Let's do the least clever thing and model how hardware actual works. Once we have that, we can try to be better than real hardware (if it's possible). I think out understanding on how HW actually works is very different. You are placing to much value on were device resides physically, for me it is completely unimportant detail. Not worth even mentioning. No, I place value on how things are modelled in the real world. There simply aren't PC's out there that lack an RTC so I have no interest in jumping through hoops in QEMU to make it possible to do this without modifying QEMU code. It might sound nice to a developer but it's of absolutely no use to users. If all composition is done through a factory interface, it doesn't. But my main argument here is that we shouldn't try to make all composition done through a factory interface--only where it makes sense. So very concretely, I'm suggesting we do the following to target-i386: 1) make the i440fx device have an embedded ide controller, piix3, and usb controller that get initialized automatically. The piix3 embeds the PCI-to-ISA bridge along with all of the default ISA devices (rtc, serial, etc.). This may be a problem even from security point of view. What if usb code (ide, serial, parallel) has guest exploitable bug? Currently I can happily continue running guests if they do not need affected subsystem. 
If we'll get it your way I will no longer be able to do so. qemu -device i440fx,ide=off So you still need to support arbitrary composition. What's the difference? No, we don't. It's possible to have an 'rtc=off' option but I'm tremendously opposed to doing this. Arbitrary composition is not a useful goal IMHO. So why do you like -device i440fx over what we have now? Because I don't think tools like libvirt should be doing device composition to create an i440fx-like chipset. I think the current path we're on is pushing too much logic that belongs in QEMU into the management stack. In current speak you propose will be implement by using i440fx machine type. Qdev will build it for you. If you had an i440fx machine type, that had no non-optional components added, and you could specify options to the machine type, yes. But I think you'll agree that there's no reason to not just treat the i440fx as a device. If you really care to do this. But this desire to remove devices is silly IMHO. Concerns about security are misplaced. If you have to change the way a guest is invoked in order to eliminate security problems, then there's something seriously wrong. No I do not. I do not create guest with unneeded devices from the beginning. There is very little that isn't 'unneeded'. Regards, Anthony Liguori -- Gleb.
Re: [Qemu-devel] KVM call minutes for Feb 8
On 02/10/2011 11:38 AM, Peter Maydell wrote: On 10 February 2011 10:13, Anthony Liguorianth...@codemonkey.ws wrote: On 02/10/2011 10:04 AM, Peter Maydell wrote: On 10 February 2011 08:36, Anthony Liguorianth...@codemonkey.wswrote: So you would model arm926ej-s as the chipset and then build up the machines by modifying parameters of the chipset (like the board id) and/or adding different components on top of it. Er, ARM926 is the CPU, it's not a chipset. The board ID is definitely not a property of an ARM926, it's a property of the board (clue is in the name :-)). I don't think versatile boards have a chipset really... As I said, I'm not well versed in the component names in ARM. But that said, an actual processor doesn't connect directly to a bunch of devices. It almost always go through some chipset and that chipset implements a lot of functionality typically. I think the name of the component I'm trying to refer to PL300 which I believe is the Northbridge used for the Versatile boards. PL300 is just a bus interconnect (so you can connect multiple AXI bus masters (cores) to multiple AXI bus slaves (devices)). Versatile PB doesn't have anything in the documentation that claims to be a Northbridge (PBX does, VExpress doesn't). This is the system diagram for the Versatile Express: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0447d/I1007683.html I don't know what you'd want to claim is a northbridge there. Basically there's an FPGA with a pile of devices in it, and there's a test chip with the core and some other devices in it. But from a modelling perspective this is all completely irrelevant because regardless of where the hardware designer put the devices, they're just devices at a particular point in the memory map and with a particular set of interrupt wiring and so on. But something interacts with each processor and dispatches the I/O operations in the address space, no? 
I can't believe there are 2^32 address lines coming off of every arm chip that each device connects. This relationship of how I/O fans out through various devices is important because occasionally platforms do weird things during I/O fan out like implement an IOMMU. If we don't model this I/O dispatch model within QEMU, then it's extremely difficult to implement things like IOMMUs. It might be the case that a platform has a chipset that is a pile of well isolated devices that are crammed in the same silicon space but that otherwise have very well defined interactions with each other. This is the exception though, not the rule. Particularly when looking at the relationship between certain devices on the PC (like the role the pckbd plays in address translation), things are simply not so idealized in practice. But if it makes sense for ARM to describe every single platform device through a factory interface, that's fine. Even in this case, you still want to model things like the distinction between the UART16650A and the ISA bus bridge for the serial device. In this case, you want to be able to do composition without going through a factory. An n900 is a very specific hardware configuration that is best represented by some sort of configuration file vs. something hard coded in QEMU. Yes, that's the whole point -- machine == specific hardware configuration. That's not getting rid of machine, it's just saying we should have some custom scripting language to define them rather than doing them in C. You still want, fundamentally, to be able to say qemu-system-arm -M machinename No, qemu-system-arm -M /path/to/n900.cfg But yeah, no disagreement there. But today, the machine concept in QEMU is definitely not a specific hardware configuration. Regards, Anthony Liguori -- PMM
[Qemu-devel] [PATCH] target-arm: Implement VMULL.P8
Implement VMULL.P8 (the 32x32->64 version of the polynomial multiply instruction).

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/helpers.h     |    1 +
 target-arm/neon_helper.c |   30 ++
 target-arm/translate.c   |    6 --
 3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/target-arm/helpers.h b/target-arm/helpers.h
index 4d0de00..0d37abe 100644
--- a/target-arm/helpers.h
+++ b/target-arm/helpers.h
@@ -275,6 +275,7 @@ DEF_HELPER_2(neon_sub_u16, i32, i32, i32)
 DEF_HELPER_2(neon_mul_u8, i32, i32, i32)
 DEF_HELPER_2(neon_mul_u16, i32, i32, i32)
 DEF_HELPER_2(neon_mul_p8, i32, i32, i32)
+DEF_HELPER_2(neon_mull_p8, i64, i32, i32)

 DEF_HELPER_2(neon_tst_u8, i32, i32, i32)
 DEF_HELPER_2(neon_tst_u16, i32, i32, i32)
diff --git a/target-arm/neon_helper.c b/target-arm/neon_helper.c
index 61890dd..b59ad38 100644
--- a/target-arm/neon_helper.c
+++ b/target-arm/neon_helper.c
@@ -895,6 +895,36 @@ uint32_t HELPER(neon_mul_p8)(uint32_t op1, uint32_t op2)
     return result;
 }

+uint64_t HELPER(neon_mull_p8)(uint32_t op1, uint32_t op2)
+{
+    uint64_t result = 0;
+    uint64_t mask;
+    uint64_t op2ex = op2;
+    op2ex = (op2ex & 0xff) |
+        ((op2ex & 0xff00) << 8) |
+        ((op2ex & 0xff0000) << 16) |
+        ((op2ex & 0xff000000) << 24);
+    while (op1) {
+        mask = 0;
+        if (op1 & 1) {
+            mask |= 0xffff;
+        }
+        if (op1 & (1 << 8)) {
+            mask |= (0xffffU << 16);
+        }
+        if (op1 & (1 << 16)) {
+            mask |= (0xffffULL << 32);
+        }
+        if (op1 & (1 << 24)) {
+            mask |= (0xffffULL << 48);
+        }
+        result ^= op2ex & mask;
+        op1 = (op1 >> 1) & 0x7f7f7f7f;
+        op2ex <<= 1;
+    }
+    return result;
+}
+
 #define NEON_FN(dest, src1, src2) dest = (src1 & src2) ? -1 : 0
 NEON_VOP(tst_u8, neon_u8, 4)
 NEON_VOP(tst_u16, neon_u16, 2)
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 3087a5d..f640a50 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -5124,8 +5124,10 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                     gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
                     break;
                 case 14: /* Polynomial VMULL */
-                    cpu_abort(env, "Polynomial VMULL not implemented");
-
+                    gen_helper_neon_mull_p8(cpu_V0, tmp, tmp2);
+                    dead_tmp(tmp2);
+                    dead_tmp(tmp);
+                    break;
                 default: /* 15 is RESERVED.  */
                     return 1;
                 }
--
1.7.1
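Editorial note: a polynomial (carry-less) multiply treats each operand as a polynomial over GF(2), so partial products are combined with XOR instead of addition. The sketch below is a simple per-lane reference version, written for clarity rather than speed; it is not the helper from the patch, and pmull8() and pmull8x4() are hypothetical names. It assumes lane i of the 64-bit result holds the 16-bit product of byte i of each operand.

```c
#include <stdint.h>

/* Carry-less 8x8 -> 16 bit multiply: XOR together b << i for every
 * set bit i of a. No carries propagate between bit positions. */
static uint16_t pmull8(uint8_t a, uint8_t b)
{
    uint16_t result = 0;
    uint16_t bex = b;          /* b widened so it can be shifted left */

    while (a) {
        if (a & 1) {
            result ^= bex;
        }
        a >>= 1;
        bex <<= 1;
    }
    return result;
}

/* Apply pmull8() to each of the four byte lanes of op1 and op2 and
 * pack the four 16-bit products into one 64-bit value. */
static uint64_t pmull8x4(uint32_t op1, uint32_t op2)
{
    uint64_t result = 0;
    int lane;

    for (lane = 0; lane < 4; lane++) {
        uint8_t a = (op1 >> (8 * lane)) & 0xff;
        uint8_t b = (op2 >> (8 * lane)) & 0xff;

        result |= (uint64_t)pmull8(a, b) << (16 * lane);
    }
    return result;
}
```

As a worked case, 0x03 times 0x03 is (x + 1)(x + 1) = x^2 + 1 over GF(2), so the product is 0x05 rather than the integer result 0x09.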
[Qemu-devel] Re: RFC: New API for PPC for vcpu mmu access
On Thu, Feb 10, 2011 at 12:55:22PM +0100, Alexander Graf wrote:
> Scott Wood wrote:
> > On Thu, 3 Feb 2011 10:19:06 +0100 Alexander Graf ag...@suse.de wrote:
> > > Yeah, that one's tricky. Usually the way the memory resolver in
> > > qemu works is as follows:
> > >
> > > * kvm goes to qemu
> > > * qemu fetches all mmu and register data from kvm
> > > * qemu runs its mmu resolution function as if the target was
> > >   emulated
> > >
> > > So the normal way would be to fetch _all_ TLB entries from KVM,
> > > shove them into env and implement the MMU in qemu (at least enough
> > > of it to enable debugging). No other target modifies this code
> > > path. But no other target needs to copy 30kb of data only to get
> > > the mmu data either :).
> >
> > I guess you mean that cpu_synchronize_state() is supposed to pull in
> > the MMU state, though I don't see where it gets called for 'm'/'M'
> > commands in the gdb stub.
>
> Well, we could also call it in get_phys_page_debug in target-ppc, but
> yes. I guess the reason it works for now is that SDR1 is pretty
> constant and was fetched earlier on. For BookE not syncing is
> obviously even more broken.
>
> > The MMU code seems to be pretty target-specific. It's not clear to
> > what extent there is a normal way, versus what book3s happens to
> > rely on in its get_physical_address() code. I don't think there are
> > any platforms supported yet (with both KVM and a non-empty
> > cpu_get_phys_page_debug() implementation) that have a pure
> > software-managed TLB. x86 has page tables, and book3s has the hash
> > table (603/e300 doesn't, or more accurately Linux doesn't use it,
> > but I guess that's not supported by KVM yet?).
>
> As for PPC, only 440, e500 and G3-5 are basically supported. It
> happens to work on POWER4 and above too and I've even got reports
> that it's good on e600 :).
>
> > We could probably do some sort of lazy state transfer only when MMU
> > code that needs it is run. This could initially include debug
> > translations, for testing a non-KVM-dependent
> > get_physical_address() implementation, but eventually that would use
> > KVM_TRANSLATE (when KVM is used) and thus not trigger the state
> > transfer.
>
> Yup :).
>
> > I'd also like to add an info tlb command, which would require the
> > state transfer.
>
> Very nice.
>
> > BTW, how much other than the MMU is missing to be able to run an
> > e500 target in qemu, without kvm?
>
> The last person working on BookE emulation was Edgar. Edgar, how far
> did you get?

Hi,

TBH, I don't really know. My goal was to get linux running on a PPC-440
embedded in the Xilinx FPGAs. I managed to fix enough BookE emulation
to get that far.

After that, we've done a few more hacks to run fsboot and uboot. Also,
we've added support for some of the BookE debug registers to be able to
run gdbserver from within linux guests. Some of these patches haven't
made it upstream yet.

I haven't taken the time to compare the specs to the qemu code, so I
don't really know how much is missing. My guess is that if you want to
run linux guests, the MMU won't be the limiting factor.

Cheers
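The "lazy state transfer" Scott describes in the quoted text can be illustrated with a small standalone sketch: the expensive copy of the TLB happens only when a debug translation actually needs it, and the local copy is marked stale whenever the guest has run. This is only an assumption about how such a scheme might look. Every name here (DemoCpu, demo_fetch_tlb, demo_sync_tlb, demo_guest_ran, demo_translate_debug) is hypothetical, and none of it is QEMU or KVM API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_TLB_ENTRIES 4      /* hypothetical size, tiny for illustration */
#define DEMO_PAGE_SHIFT  12     /* assumes 4 KiB pages */

typedef struct {
    uint32_t vpage;             /* virtual page number */
    uint32_t ppage;             /* physical page number */
    bool valid;
} DemoTlbEntry;

typedef struct {
    DemoTlbEntry tlb[DEMO_TLB_ENTRIES];
    bool tlb_synced;            /* false means the local copy is stale */
    unsigned fetch_count;       /* how many full transfers happened */
} DemoCpu;

/* Stand-in for the expensive transfer of all TLB entries from the
 * hypervisor. Here it just fills in a fixed mapping. */
static void demo_fetch_tlb(DemoCpu *cpu)
{
    for (size_t i = 0; i < DEMO_TLB_ENTRIES; i++) {
        cpu->tlb[i].vpage = (uint32_t)i;
        cpu->tlb[i].ppage = 0x100u + (uint32_t)i;
        cpu->tlb[i].valid = true;
    }
    cpu->fetch_count++;
}

/* Transfer the TLB only if the local copy is stale. */
static void demo_sync_tlb(DemoCpu *cpu)
{
    if (!cpu->tlb_synced) {
        demo_fetch_tlb(cpu);
        cpu->tlb_synced = true;
    }
}

/* Call after the guest has executed: its TLB may have changed. */
static void demo_guest_ran(DemoCpu *cpu)
{
    cpu->tlb_synced = false;
}

/* Debug translation: syncs on demand, then walks the local copy.
 * Returns true and stores the physical address on a hit. */
static bool demo_translate_debug(DemoCpu *cpu, uint32_t vaddr,
                                 uint32_t *paddr)
{
    uint32_t vpage = vaddr >> DEMO_PAGE_SHIFT;
    uint32_t offset = vaddr & ((1u << DEMO_PAGE_SHIFT) - 1);

    demo_sync_tlb(cpu);
    for (size_t i = 0; i < DEMO_TLB_ENTRIES; i++) {
        if (cpu->tlb[i].valid && cpu->tlb[i].vpage == vpage) {
            *paddr = (cpu->tlb[i].ppage << DEMO_PAGE_SHIFT) | offset;
            return true;
        }
    }
    return false;
}
```

In this sketch, repeated debug translations between guest runs cost a single transfer, which is the point of doing it lazily. A path that asks the hypervisor to translate directly, as Scott suggests for the KVM case, would skip demo_sync_tlb() altogether.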