Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Peter Maydell
On 10 February 2011 07:47, Anthony Liguori anth...@codemonkey.ws wrote:
> So very concretely, I'm suggesting we do the following to target-i386:
>
> 2) get rid of the entire concept of machines.  Creating a i440fx is
> essentially equivalent to creating a bare machine.

Does that make any sense for anything other than target-i386?
The concept of a machine model seems a pretty obvious one
for ARM boards, for instance, and I'm not sure we'd gain much
by having i386 be different to the other architectures...

-- PMM



[Qemu-devel] Re: [PATCH uq/master -v2 2/2] KVM, MCE, unpoison memory address across reboot

2011-02-10 Thread Jan Kiszka
On 2011-02-10 01:27, Huang Ying wrote:
> On Wed, 2011-02-09 at 16:00 +0800, Jan Kiszka wrote:
>> On 2011-02-09 04:00, Huang Ying wrote:
>>> In Linux kernel HWPoison processing implementation, the virtual
>>> address in processes mapping the error physical memory page is marked
>>> as HWPoison.  So that, the further accessing to the virtual
>>> address will kill corresponding processes with SIGBUS.
>>>
>>> If the error physical memory page is used by a KVM guest, the SIGBUS
>>> will be sent to QEMU, and QEMU will simulate a MCE to report that
>>> memory error to the guest OS.  If the guest OS can not recover from
>>> the error (for example, the page is accessed by kernel code), guest OS
>>> will reboot the system.  But because the underlying host virtual
>>> address backing the guest physical memory is still poisoned, if the
>>> guest system accesses the corresponding guest physical memory even
>>> after rebooting, the SIGBUS will still be sent to QEMU and MCE will be
>>> simulated.  That is, guest system can not recover via rebooting.
>>
>> Yeah, saw this already during my test...
>>
>>> In fact, across rebooting, the contents of guest physical memory page
>>> need not to be kept.  We can allocate a new host physical page to
>>> back the corresponding guest physical address.
>>
>> I just wondering what would be architecturally suboptimal if we simply
>> remapped on SIGBUS directly. Would save us at least the bookkeeping.
>
> Because we can not change the content of memory silently during guest OS
> running, this may corrupts guest OS data structure and even ruins disk
> contents.  But during rebooting, all guest OS state are discarded.

I was not talking about remapping more than just the pages that became
inaccessible, just like you do now. But I guess the problem is rather
that insane guests continuing to access those pages before reboot should
also still receive MCEs.

Jan
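
The bookkeeping discussed above can be sketched as follows. This is only a rough illustration under assumptions, not the code from the patch: the names poison_page_add() and poison_pages_remap_on_reset() are hypothetical, and it uses plain POSIX mmap() rather than QEMU's RAM helpers. A poisoned page is only recorded when the SIGBUS is handled, so the guest keeps faulting on it until it reboots; at reset, every recorded page is replaced by fresh anonymous memory.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping entry: one host range reported as poisoned. */
typedef struct PoisonedPage {
    void *host_addr;
    size_t len;
    struct PoisonedPage *next;
} PoisonedPage;

static PoisonedPage *poisoned_pages; /* list head, illustrative only */

/* Called from the SIGBUS path: remember the page, leave the mapping alone
 * so that accesses before the reboot still fault. */
static int poison_page_add(void *host_addr, size_t len)
{
    PoisonedPage *p;

    assert(host_addr != NULL);
    p = malloc(sizeof(*p));
    if (!p) {
        return -1;
    }
    p->host_addr = host_addr;
    p->len = len;
    p->next = poisoned_pages;
    poisoned_pages = p;
    return 0;
}

/* Called at guest reset: back each recorded range with new anonymous
 * memory. Returns the number of ranges remapped, or -1 on failure. */
static int poison_pages_remap_on_reset(void)
{
    int remapped = 0;

    while (poisoned_pages) {
        PoisonedPage *p = poisoned_pages;
        void *m = mmap(p->host_addr, p->len, PROT_READ | PROT_WRITE,
                       MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (m == MAP_FAILED) {
            return -1;
        }
        poisoned_pages = p->next;
        free(p);
        remapped++;
    }
    return remapped;
}
```

Remapping directly in the SIGBUS handler would drop the list, but it would also silently change guest memory while the guest is still running, which is the objection raised in the thread.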





Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Anthony Liguori

On 02/10/2011 09:16 AM, Peter Maydell wrote:
> On 10 February 2011 07:47, Anthony Liguori anth...@codemonkey.ws wrote:
>> So very concretely, I'm suggesting we do the following to target-i386:
>>
>> 2) get rid of the entire concept of machines.  Creating a i440fx is
>> essentially equivalent to creating a bare machine.
>
> Does that make any sense for anything other than target-i386?
> The concept of a machine model seems a pretty obvious one
> for ARM boards, for instance, and I'm not sure we'd gain much
> by having i386 be different to the other architectures...

Yes, it makes a lot of sense, I just don't know the component names as
well so bear with me :-)

There are two types of Versatile machines today, Versatile/AB and
Versatile/PB.  They are both made with the same core, ARM926EJ-S, with
different expansions.

So you would model arm926ej-s as the chipset and then build up the
machines by modifying parameters of the chipset (like the board id)
and/or adding different components on top of it.

A good way to think about what I'm proposing is that machine-init
really should be a constructor for a device object.


Regards,

Anthony Liguori
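
To make the "machine-init as a constructor" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the Device type, chipset_new(), device_add_child(), the expansion names and the board id values are made up and do not correspond to any existing QEMU API or real board id.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHILDREN 4

/* Hypothetical device object: a chipset plus the parts stacked on it. */
typedef struct Device {
    const char *name;
    unsigned board_id;               /* placeholder value, not a real id */
    const char *children[MAX_CHILDREN];
    int nb_children;
} Device;

/* Construct the shared chipset with a board-specific parameter. */
static Device *chipset_new(const char *name, unsigned board_id)
{
    Device *d = calloc(1, sizeof(*d));
    if (d) {
        d->name = name;
        d->board_id = board_id;
    }
    return d;
}

static int device_add_child(Device *d, const char *child)
{
    assert(d != NULL);
    if (d->nb_children >= MAX_CHILDREN) {
        return -1;
    }
    d->children[d->nb_children++] = child;
    return 0;
}

/* Each "machine" is just a constructor: same chipset, different
 * parameters and a different set of components on top. */
static Device *versatile_ab_new(void)
{
    Device *d = chipset_new("arm926ej-s", 1);
    if (d) {
        device_add_child(d, "expansion-a");
    }
    return d;
}

static Device *versatile_pb_new(void)
{
    Device *d = chipset_new("arm926ej-s", 2);
    if (d) {
        device_add_child(d, "expansion-a");
        device_add_child(d, "expansion-b");
    }
    return d;
}
```

In this reading there is no separate machine concept left: a board is whatever its constructor builds on top of the chipset.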


-- PMM

   





[Qemu-devel] Re: [PATCH uq/master -v2 2/2] KVM, MCE, unpoison memory address across reboot

2011-02-10 Thread Jan Kiszka
On 2011-02-10 01:27, Huang Ying wrote:
>>> @@ -1882,6 +1919,7 @@ int kvm_arch_on_sigbus_vcpu(CPUState *en
>>>  hardware_memory_error();
>>>  }
>>>  }
>>> +kvm_hwpoison_page_add(ram_addr);
>>>
>>>  if (code == BUS_MCEERR_AR) {
>>>  /* Fake an Intel architectural Data Load SRAR UCR */
>>> @@ -1926,6 +1964,7 @@ int kvm_arch_on_sigbus(int code, void *a
>>>  "QEMU itself instead of guest system!: %p\n", addr);
>>>  return 0;
>>>  }
>>> +kvm_hwpoison_page_add(ram_addr);
>>>  kvm_mce_inj_srao_memscrub2(first_cpu, paddr);
>>>  } else
>>>  #endif
>>
>> Looks fine otherwise. Unless that simplification makes sense, I could
>> offer to include this into my MCE rework (there is some minor conflict).
>> If all goes well, that series should be posted during this week.

Please have a look at

git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream

and tell me if it works for you and your signed-off still applies.

Thanks,
Jan





Re: [Qemu-devel] [PATCH] Make tb_alloc static.

2011-02-10 Thread Tristan Gingold
On Wed, Feb 09, 2011 at 07:52:52PM +0100, Aurelien Jarno wrote:
 
> What about moving tb_alloc() (with tb_free()) higher in the file? After
> all it make sense to have the function creating or destructing a tb
> before the function manipulating them.

Thanks.  Like this ?

Tristan.


This function is only used within exec.c, so no need to make it public.

Signed-off-by: Tristan Gingold ging...@adacore.com
---
 exec-all.h |1 -
 exec.c |   52 ++--
 2 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index 81497c0..c062693 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -182,7 +182,6 @@ static inline unsigned int tb_phys_hash_func(tb_page_addr_t pc)
     return (pc >> 2) & (CODE_GEN_PHYS_HASH_SIZE - 1);
 }
 
-TranslationBlock *tb_alloc(target_ulong pc);
 void tb_free(TranslationBlock *tb);
 void tb_flush(CPUState *env);
 void tb_link_page(TranslationBlock *tb,
diff --git a/exec.c b/exec.c
index 477199b..9a7a752 100644
--- a/exec.c
+++ b/exec.c
@@ -649,6 +649,32 @@ void cpu_exec_init(CPUState *env)
 #endif
 }
 
+/* Allocate a new translation block. Flush the translation buffer if
+   too many translation blocks or too much generated code. */
+static TranslationBlock *tb_alloc(target_ulong pc)
+{
+    TranslationBlock *tb;
+
+    if (nb_tbs >= code_gen_max_blocks ||
+        (code_gen_ptr - code_gen_buffer) >= code_gen_buffer_max_size)
+        return NULL;
+    tb = &tbs[nb_tbs++];
+    tb->pc = pc;
+    tb->cflags = 0;
+    return tb;
+}
+
+void tb_free(TranslationBlock *tb)
+{
+    /* In practice this is mostly used for single use temporary TB
+       Ignore the hard cases and just back up if this TB happens to
+       be the last one generated.  */
+    if (nb_tbs > 0 && tb == &tbs[nb_tbs - 1]) {
+        code_gen_ptr = tb->tc_ptr;
+        nb_tbs--;
+    }
+}
+
 static inline void invalidate_page_bitmap(PageDesc *p)
 {
     if (p->code_bitmap) {
@@ -1227,32 +1253,6 @@ static inline void tb_alloc_page(TranslationBlock *tb,
 #endif /* TARGET_HAS_SMC */
 }
 
-/* Allocate a new translation block. Flush the translation buffer if
-   too many translation blocks or too much generated code. */
-TranslationBlock *tb_alloc(target_ulong pc)
-{
-    TranslationBlock *tb;
-
-    if (nb_tbs >= code_gen_max_blocks ||
-        (code_gen_ptr - code_gen_buffer) >= code_gen_buffer_max_size)
-        return NULL;
-    tb = &tbs[nb_tbs++];
-    tb->pc = pc;
-    tb->cflags = 0;
-    return tb;
-}
-
-void tb_free(TranslationBlock *tb)
-{
-    /* In practice this is mostly used for single use temporary TB
-       Ignore the hard cases and just back up if this TB happens to
-       be the last one generated.  */
-    if (nb_tbs > 0 && tb == &tbs[nb_tbs - 1]) {
-        code_gen_ptr = tb->tc_ptr;
-        nb_tbs--;
-    }
-}
-
 /* add a new TB and link it to the physical page tables. phys_page2 is
(-1) to indicate that only one page contains the TB. */
 void tb_link_page(TranslationBlock *tb,
-- 
1.7.3.GIT
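
For readers unfamiliar with the allocation scheme that tb_alloc() and tb_free() implement, the following standalone sketch shows the same pattern: blocks are handed out from a fixed array by bumping a counter, and a free only takes effect when it is for the most recently allocated block. The names Block, block_alloc() and block_free() and the tiny MAX_BLOCKS limit are made up for this example; the real code also checks the code buffer size, which is left out here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 4   /* tiny limit, chosen only to keep the example short */

typedef struct Block {
    unsigned long pc;
    unsigned cflags;
} Block;

static Block blocks[MAX_BLOCKS];
static int nb_blocks;

/* Hand out the next slot, or NULL when the array is full
 * (the caller would then flush everything and retry). */
static Block *block_alloc(unsigned long pc)
{
    Block *b;

    if (nb_blocks >= MAX_BLOCKS) {
        return NULL;
    }
    b = &blocks[nb_blocks++];
    b->pc = pc;
    b->cflags = 0;
    return b;
}

/* Only the last allocated block can be given back; anything else is
 * ignored and stays allocated until the next full flush. */
static void block_free(Block *b)
{
    assert(b != NULL);
    if (nb_blocks > 0 && b == &blocks[nb_blocks - 1]) {
        nb_blocks--;
    }
}
```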




[Qemu-devel] [PATCH] Network functions patches for win32

2011-02-10 Thread Pavel Dovgaluk
This patch fixes network functions in the Windows environment and
consists of two parts:

1. net/socket.c fix
   MSDN includes the following in the WSAEALREADY error description for the
   connect() function: "To preserve backward compatibility, this error is
   reported as WSAEINVAL to Winsock applications that link to either
   Winsock.dll or Wsock32.dll". So a check for this error code was added to
   allow network connections through sockets on Windows.

2. net/tap-win32.c fix
   This fix allows connection of an internal VLAN to the external TAP
   interface. If the tap_win32_write() function always returns 0, the TAP
   network interface in QEMU is disabled.

Signed-off-by: Pavel Dovgalyuk pavel.dovga...@gmail.com
---
 net/socket.c|2 +-
 net/tap-win32.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index 3182b37..7337f4f 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -457,7 +457,7 @@ static int net_socket_connect_init(VLANState *vlan,
             } else if (err == EINPROGRESS) {
                 break;
 #ifdef _WIN32
-            } else if (err == WSAEALREADY) {
+            } else if (err == WSAEALREADY || err == WSAEINVAL) {
                 break;
 #endif
             } else {
diff --git a/net/tap-win32.c b/net/tap-win32.c
index 081904e..596132e 100644
--- a/net/tap-win32.c
+++ b/net/tap-win32.c
@@ -480,7 +480,7 @@ static int tap_win32_write(tap_win32_overlapped_t *overlapped,
         }
     }
 
-    return 0;
+    return write_size;
 }
 
 static DWORD WINAPI tap_win32_thread_entry(LPVOID param)
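
The net/socket.c change can be read as "which connect() errors mean the connection is still being set up". The sketch below isolates that decision in one function. The helper name connect_still_pending() is hypothetical; the two Winsock codes are defined by hand only so the example also builds off Windows, where the real values come from winsock2.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#ifdef _WIN32
#include <winsock2.h>
#else
/* Winsock error codes, defined here only so the sketch builds elsewhere. */
#define WSAEINVAL   10022
#define WSAEALREADY 10037
#endif

/* Returns true when a failed non-blocking connect() should be treated as
 * "in progress" rather than as a hard error. WSAEINVAL is accepted because
 * Winsock may report it in place of WSAEALREADY for compatibility. */
static bool connect_still_pending(int err)
{
    if (err == EINPROGRESS) {
        return true;
    }
    return err == WSAEALREADY || err == WSAEINVAL;
}
```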




[Qemu-devel] [PATCH 12/18] Insert event_tap_mmio() to cpu_physical_memory_rw() in exec.c.

2011-02-10 Thread Yoshiaki Tamura
Record mmio write event to replay it upon failover.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 exec.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/exec.c b/exec.c
index e950df2..c81fd09 100644
--- a/exec.c
+++ b/exec.c
@@ -33,6 +33,7 @@
 #include "osdep.h"
 #include "kvm.h"
 #include "qemu-timer.h"
+#include "event-tap.h"
 #if defined(CONFIG_USER_ONLY)
 #include <qemu.h>
 #include <signal.h>
@@ -3632,6 +3633,9 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                 io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1);
                 if (p)
                     addr1 = (addr & ~TARGET_PAGE_MASK) + p->region_offset;
+
+                event_tap_mmio(addr, buf, len);
+
                 /* XXX: could force cpu_single_env to NULL to avoid
                    potential bugs */
                 if (l >= 4 && ((addr1 & 3) == 0)) {
-- 
1.7.1.2




[Qemu-devel] [PATCH 13/18] net: insert event-tap to qemu_send_packet() and qemu_sendv_packet_async().

2011-02-10 Thread Yoshiaki Tamura
The event-tap function is called only when event-tap is on.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 net.c |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/net.c b/net.c
index 9ba5be2..1176124 100644
--- a/net.c
+++ b/net.c
@@ -36,6 +36,7 @@
 #include "qemu-common.h"
 #include "qemu_socket.h"
 #include "hw/qdev.h"
+#include "event-tap.h"
 
 static QTAILQ_HEAD(, VLANState) vlans;
 static QTAILQ_HEAD(, VLANClientState) non_vlan_clients;
@@ -559,6 +560,10 @@ ssize_t qemu_send_packet_async(VLANClientState *sender,
 
 void qemu_send_packet(VLANClientState *vc, const uint8_t *buf, int size)
 {
+    if (event_tap_is_on()) {
+        return event_tap_send_packet(vc, buf, size);
+    }
+
     qemu_send_packet_async(vc, buf, size, NULL);
 }
 
@@ -657,6 +662,10 @@ ssize_t qemu_sendv_packet_async(VLANClientState *sender,
 {
     NetQueue *queue;
 
+    if (event_tap_is_on()) {
+        return event_tap_sendv_packet_async(sender, iov, iovcnt, sent_cb);
+    }
+
     if (sender->link_down || (!sender->peer && !sender->vlan)) {
         return calc_iov_length(iov, iovcnt);
     }
-- 
1.7.1.2
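
The hook pattern used by this patch can be summarised in a few lines: the normal send path first asks whether the tap is on and, if so, hands the packet to the tap instead of sending it directly. The sketch below is a toy model under assumptions; tap_is_on(), tap_send_packet() and direct_send_packet() are stand-ins that only count calls, not the event-tap functions from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool tap_on;          /* stand-in for the event-tap on/off state */
static int tapped_packets;   /* packets diverted to the tap */
static int direct_packets;   /* packets sent on the normal path */

static bool tap_is_on(void)
{
    return tap_on;
}

static void tap_send_packet(const uint8_t *buf, size_t size)
{
    (void)buf;
    (void)size;
    tapped_packets++;        /* a real tap would queue the packet for replay */
}

static void direct_send_packet(const uint8_t *buf, size_t size)
{
    (void)buf;
    (void)size;
    direct_packets++;
}

/* Entry point used by callers: when the tap is off this is a plain
 * pass-through, so the normal path is unchanged. */
static void send_packet(const uint8_t *buf, size_t size)
{
    assert(buf != NULL);
    if (tap_is_on()) {
        tap_send_packet(buf, size);
        return;
    }
    direct_send_packet(buf, size);
}
```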




[Qemu-devel] [PATCH 18/18] Introduce kemari: to enable FT migration mode (Kemari).

2011-02-10 Thread Yoshiaki Tamura
When "kemari:" is put in front of the URI of the migrate command, it turns
on ft_mode to start FT migration mode (Kemari).  On the receiver side,
the option looks like: -incoming kemari:protocol:address:port

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 hmp-commands.hx |4 +++-
 migration.c |   12 
 qmp-commands.hx |4 +++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 38e1eb7..ee14344 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -760,7 +760,9 @@ ETEXI
                       "\n\t\t\t -b for migration without shared storage with"
                       " full copy of disk\n\t\t\t -i for migration without "
                       "shared storage with incremental copy of disk "
-                      "(base image shared between src and destination)",
+                      "(base image shared between src and destination)"
+                      "\n\t\t\t put \"kemari:\" in front of URI to enable "
+                      "Fault Tolerance mode (Kemari protocol)",
         .user_print = monitor_user_noop,
         .mhandler.cmd_new = do_migrate,
     },
diff --git a/migration.c b/migration.c
index 7837c55..a3f7722 100644
--- a/migration.c
+++ b/migration.c
@@ -48,6 +48,12 @@ int qemu_start_incoming_migration(const char *uri)
     const char *p;
     int ret;
 
+    /* check ft_mode (Kemari protocol) */
+    if (strstart(uri, "kemari:", &p)) {
+        ft_mode = FT_INIT;
+        uri = p;
+    }
+
     if (strstart(uri, "tcp:", &p))
         ret = tcp_start_incoming_migration(p);
 #if !defined(WIN32)
@@ -99,6 +105,12 @@ int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data)
         return -1;
     }
 
+    /* check ft_mode (Kemari protocol) */
+    if (strstart(uri, "kemari:", &p)) {
+        ft_mode = FT_INIT;
+        uri = p;
+    }
+
     if (strstart(uri, "tcp:", &p)) {
         s = tcp_start_outgoing_migration(mon, p, max_throttle, detach,
                                          blk, inc);
diff --git a/qmp-commands.hx b/qmp-commands.hx
index df40a3d..68ca48a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -437,7 +437,9 @@ EQMP
                       "\n\t\t\t -b for migration without shared storage with"
                       " full copy of disk\n\t\t\t -i for migration without "
                       "shared storage with incremental copy of disk "
-                      "(base image shared between src and destination)",
+                      "(base image shared between src and destination)"
+                      "\n\t\t\t put \"kemari:\" in front of URI to enable "
+                      "Fault Tolerance mode (Kemari protocol)",
         .user_print = monitor_user_noop,
         .mhandler.cmd_new = do_migrate,
     },
-- 
1.7.1.2
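
The URI handling in migration.c boils down to stripping an optional "kemari:" prefix and then parsing the remainder as a normal migration URI. A standalone sketch of that step follows. The helper has_prefix() is a local stand-in written for this example, and ParsedUri and parse_migrate_uri() are hypothetical names; the address in the usage is an arbitrary example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Local stand-in for a prefix test: returns true and points *rest just
 * past the prefix when str starts with it. */
static bool has_prefix(const char *str, const char *prefix, const char **rest)
{
    size_t n = strlen(prefix);

    if (strncmp(str, prefix, n) != 0) {
        return false;
    }
    if (rest) {
        *rest = str + n;
    }
    return true;
}

typedef struct ParsedUri {
    bool ft_mode;      /* true when the "kemari:" prefix was present */
    const char *uri;   /* remaining "protocol:address:port" part */
} ParsedUri;

static ParsedUri parse_migrate_uri(const char *uri)
{
    ParsedUri out = { false, uri };
    const char *p;

    assert(uri != NULL);
    if (has_prefix(uri, "kemari:", &p)) {
        out.ft_mode = true;
        out.uri = p;
    }
    return out;
}
```

With this shape, "kemari:tcp:192.168.0.1:4444" and "tcp:192.168.0.1:4444" reach the same protocol parser; only the mode flag differs.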




[Qemu-devel] [PATCH 04/18] qemu-char: export socket_set_nodelay().

2011-02-10 Thread Yoshiaki Tamura
Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 qemu-char.c   |2 +-
 qemu_socket.h |1 +
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index ee4f4ca..7286aeb 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -2111,7 +2111,7 @@ static void tcp_chr_telnet_init(int fd)
     send(fd, (char *)buf, 3, 0);
 }
 
-static void socket_set_nodelay(int fd)
+void socket_set_nodelay(int fd)
 {
     int val = 1;
     setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, (char *)&val, sizeof(val));
diff --git a/qemu_socket.h b/qemu_socket.h
index 897a8ae..b7f8465 100644
--- a/qemu_socket.h
+++ b/qemu_socket.h
@@ -36,6 +36,7 @@ int inet_aton(const char *cp, struct in_addr *ia);
 int qemu_socket(int domain, int type, int protocol);
 int qemu_accept(int s, struct sockaddr *addr, socklen_t *addrlen);
 void socket_set_nonblock(int fd);
+void socket_set_nodelay(int fd);
 int send_all(int fd, const void *buf, int len1);
 
 /* New, ipv6-ready socket helper functions, see qemu-sockets.c */
-- 
1.7.1.2




[Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.

2011-02-10 Thread Yoshiaki Tamura
Currently FdMigrationState doesn't support read(), and this patch
introduces it to get a response from the other side.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 migration-tcp.c |   15 +++
 migration.c |   13 +
 migration.h |3 +++
 3 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/migration-tcp.c b/migration-tcp.c
index b55f419..55777c8 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -39,6 +39,20 @@ static int socket_write(FdMigrationState *s, const void * buf, size_t size)
     return send(s->fd, buf, size, 0);
 }
 
+static int socket_read(FdMigrationState *s, const void * buf, size_t size)
+{
+    ssize_t len;
+
+    do {
+        len = recv(s->fd, (void *)buf, size, 0);
+    } while (len == -1 && socket_error() == EINTR);
+    if (len == -1) {
+        len = -socket_error();
+    }
+
+    return len;
+}
+
 static int tcp_close(FdMigrationState *s)
 {
     DPRINTF("tcp_close\n");
@@ -94,6 +108,7 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon,
 
     s->get_error = socket_errno;
     s->write = socket_write;
+    s->read = socket_read;
     s->close = tcp_close;
     s->mig_state.cancel = migrate_fd_cancel;
     s->mig_state.get_status = migrate_fd_get_status;
diff --git a/migration.c b/migration.c
index 3612572..f0df5fc 100644
--- a/migration.c
+++ b/migration.c
@@ -340,6 +340,19 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)
     return ret;
 }
 
+int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size)
+{
+    FdMigrationState *s = opaque;
+    int ret;
+
+    ret = s->read(s, data, size);
+    if (ret == -1) {
+        ret = -(s->get_error(s));
+    }
+
+    return ret;
+}
+
 void migrate_fd_connect(FdMigrationState *s)
 {
     int ret;
diff --git a/migration.h b/migration.h
index 2170792..88a6987 100644
--- a/migration.h
+++ b/migration.h
@@ -48,6 +48,7 @@ struct FdMigrationState
     int (*get_error)(struct FdMigrationState*);
     int (*close)(struct FdMigrationState*);
     int (*write)(struct FdMigrationState*, const void *, size_t);
+    int (*read)(struct FdMigrationState *, const void *, size_t);
     void *opaque;
 };
 
@@ -116,6 +117,8 @@ void migrate_fd_put_notify(void *opaque);
 
 ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size);
 
+int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size);
+
 void migrate_fd_connect(FdMigrationState *s);
 
 void migrate_fd_put_ready(void *opaque);
-- 
1.7.1.2
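
The read helper added by this patch follows the usual pattern of retrying recv() when it is interrupted by a signal and returning a negative error code otherwise. Here is a self-contained POSIX sketch of that pattern; read_retry() is a hypothetical name, and it uses errno directly where the patch goes through socket_error().

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Receive up to size bytes from fd. Retries when the call is interrupted
 * (EINTR). Returns the number of bytes read, 0 on orderly shutdown, or a
 * negative errno value on failure. */
static ssize_t read_retry(int fd, void *buf, size_t size)
{
    ssize_t len;

    do {
        len = recv(fd, buf, size, 0);
    } while (len == -1 && errno == EINTR);

    if (len == -1) {
        len = -errno;
    }
    return len;
}
```

Returning -errno keeps the error in the return value, so callers do not need a second call to find out what went wrong.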




[Qemu-devel] [PATCH 08/18] savevm: introduce util functions to control ft_trans_file from savevm layer.

2011-02-10 Thread Yoshiaki Tamura
To utilize ft_trans_file function, savevm needs interfaces to be
exported.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 hw/hw.h  |5 ++
 savevm.c |  149 ++
 2 files changed, 154 insertions(+), 0 deletions(-)

diff --git a/hw/hw.h b/hw/hw.h
index a168a37..a9eff5a 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -51,6 +51,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
 QEMUFile *qemu_fopen(const char *filename, const char *mode);
 QEMUFile *qemu_fdopen(int fd, const char *mode);
 QEMUFile *qemu_fopen_socket(int fd);
+QEMUFile *qemu_fopen_ft_trans(int s_fd, int c_fd);
 QEMUFile *qemu_popen(FILE *popen_file, const char *mode);
 QEMUFile *qemu_popen_cmd(const char *command, const char *mode);
 int qemu_stdio_fd(QEMUFile *f);
@@ -60,6 +61,9 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
 void qemu_put_byte(QEMUFile *f, int v);
 void *qemu_realloc_buffer(QEMUFile *f, int size);
 void qemu_clear_buffer(QEMUFile *f);
+int qemu_ft_trans_begin(QEMUFile *f);
+int qemu_ft_trans_commit(QEMUFile *f);
+int qemu_ft_trans_cancel(QEMUFile *f);
 
 static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
 {
@@ -94,6 +98,7 @@ void qemu_file_set_error(QEMUFile *f);
  * halted due to rate limiting or EAGAIN errors occur as it can be used to
  * resume output. */
 void qemu_file_put_notify(QEMUFile *f);
+void qemu_file_get_notify(void *opaque);
 
 static inline void qemu_put_be64s(QEMUFile *f, const uint64_t *pv)
 {
diff --git a/savevm.c b/savevm.c
index 58e48e3..e44eccd 100644
--- a/savevm.c
+++ b/savevm.c
@@ -82,6 +82,7 @@
 #include "migration.h"
 #include "qemu_socket.h"
 #include "qemu-queue.h"
+#include "ft_trans_file.h"
 
 #define SELF_ANNOUNCE_ROUNDS 5
 
@@ -189,6 +190,13 @@ typedef struct QEMUFileSocket
     QEMUFile *file;
 } QEMUFileSocket;
 
+typedef struct QEMUFileSocketTrans
+{
+    int fd;
+    QEMUFileSocket *s;
+    VMChangeStateEntry *e;
+} QEMUFileSocketTrans;
+
 static int socket_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
 {
     QEMUFileSocket *s = opaque;
@@ -204,6 +212,22 @@ static int socket_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
     return len;
 }
 
+static ssize_t socket_put_buffer(void *opaque, const void *buf, size_t size)
+{
+    QEMUFileSocket *s = opaque;
+    ssize_t len;
+
+    do {
+        len = send(s->fd, (void *)buf, size, 0);
+    } while (len == -1 && socket_error() == EINTR);
+
+    if (len == -1) {
+        len = -socket_error();
+    }
+
+    return len;
+}
+
 static int socket_close(void *opaque)
 {
     QEMUFileSocket *s = opaque;
@@ -211,6 +235,70 @@ static int socket_close(void *opaque)
     return 0;
 }
 
+static int socket_trans_get_buffer(void *opaque, uint8_t *buf, int64_t pos, size_t size)
+{
+    QEMUFileSocketTrans *t = opaque;
+    QEMUFileSocket *s = t->s;
+    ssize_t len;
+
+    len = socket_get_buffer(s, buf, pos, size);
+
+    return len;
+}
+
+static ssize_t socket_trans_put_buffer(void *opaque, const void *buf, size_t size)
+{
+    QEMUFileSocketTrans *t = opaque;
+
+    return socket_put_buffer(t->s, buf, size);
+}
+
+
+static int socket_trans_get_ready(void *opaque)
+{
+    QEMUFileSocketTrans *t = opaque;
+    QEMUFileSocket *s = t->s;
+    QEMUFile *f = s->file;
+    int ret = 0;
+
+    ret = qemu_loadvm_state(f, 1);
+    if (ret < 0) {
+        fprintf(stderr,
+                "socket_trans_get_ready: error while loading vmstate\n");
+    }
+
+    return ret;
+}
+
+static int socket_trans_close(void *opaque)
+{
+    QEMUFileSocketTrans *t = opaque;
+    QEMUFileSocket *s = t->s;
+
+    qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
+    qemu_set_fd_handler2(t->fd, NULL, NULL, NULL, NULL);
+    qemu_del_vm_change_state_handler(t->e);
+    close(s->fd);
+    close(t->fd);
+    qemu_free(s);
+    qemu_free(t);
+
+    return 0;
+}
+
+static void socket_trans_resume(void *opaque, int running, int reason)
+{
+    QEMUFileSocketTrans *t = opaque;
+    QEMUFileSocket *s = t->s;
+
+    if (!running) {
+        return;
+    }
+
+    qemu_announce_self();
+    qemu_fclose(s->file);
+}
+
 static int stdio_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, int size)
 {
     QEMUFileStdio *s = opaque;
@@ -333,6 +421,26 @@ QEMUFile *qemu_fopen_socket(int fd)
     return s->file;
 }
 
+QEMUFile *qemu_fopen_ft_trans(int s_fd, int c_fd)
+{
+    QEMUFileSocketTrans *t = qemu_mallocz(sizeof(QEMUFileSocketTrans));
+    QEMUFileSocket *s = qemu_mallocz(sizeof(QEMUFileSocket));
+
+    t->s = s;
+    t->fd = s_fd;
+    t->e = qemu_add_vm_change_state_handler(socket_trans_resume, t);
+
+    s->fd = c_fd;
+    s->file = qemu_fopen_ops_ft_trans(t, socket_trans_put_buffer,
+                                      socket_trans_get_buffer, NULL,
+                                      socket_trans_get_ready,
+                                      migrate_fd_wait_for_unfreeze,
+                                      socket_trans_close, 0);

[Qemu-devel] [PATCH 14/18] block: insert event-tap to bdrv_aio_writev(), bdrv_aio_flush() and bdrv_flush().

2011-02-10 Thread Yoshiaki Tamura
The event-tap function is called only when event-tap is on and the
request was sent from a device emulator.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
Acked-by: Kevin Wolf kw...@redhat.com
---
 block.c |   15 +++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index b476479..8ddce13 100644
--- a/block.c
+++ b/block.c
@@ -28,6 +28,7 @@
 #include "block_int.h"
 #include "module.h"
 #include "qemu-objects.h"
+#include "event-tap.h"
 
 #ifdef CONFIG_BSD
 #include <sys/types.h>
@@ -1482,6 +1483,10 @@ int bdrv_flush(BlockDriverState *bs)
     }
 
     if (bs->drv && bs->drv->bdrv_flush) {
+        if (*bs->device_name && event_tap_is_on()) {
+            event_tap_bdrv_flush();
+        }
+
         return bs->drv->bdrv_flush(bs);
     }
 
@@ -2117,6 +2122,11 @@ BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
     if (bdrv_check_request(bs, sector_num, nb_sectors))
         return NULL;
 
+    if (*bs->device_name && event_tap_is_on()) {
+        return event_tap_bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
+                                         cb, opaque);
+    }
+
     if (bs->dirty_bitmap) {
         blk_cb_data = blk_dirty_cb_alloc(bs, sector_num, nb_sectors, cb,
                                          opaque);
@@ -2380,6 +2390,11 @@ BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
 
     if (!drv)
         return NULL;
+
+    if (*bs->device_name && event_tap_is_on()) {
+        return event_tap_bdrv_aio_flush(bs, cb, opaque);
+    }
+
     return drv->bdrv_aio_flush(bs, cb, opaque);
 }
 
-- 
1.7.1.2




[Qemu-devel] [PATCH 00/18] Kemari for KVM v0.2.10

2011-02-10 Thread Yoshiaki Tamura
Hi,

This patch series is a revised version of Kemari for KVM, which
addresses the comments on the previous post.  The current code is based on
qemu.git f26e5a54f0554798a2e6f7a074b809b13635d007.

The changes from v0.2.9 -> v0.2.10 are:

- change migrate format to kemari:protocol:host:port (Paolo)

The changes from v0.2.8 -> v0.2.9 are:

- abstract common code between qemu_savevm_{state,trans}_* (Paolo)
- change incoming format to kemari:protocol:host:port (Paolo)

The changes from v0.2.7 -> v0.2.8 are:

- fixed calling wrong cb in event-tap
- add missing qemu_aio_release in event-tap

The changes from v0.2.6 -> v0.2.7 are:

- add AIOCB, AIOPool and cancel functions (Kevin)
- insert event-tap for bdrv_flush (Kevin)
- add error handling when calling bdrv functions (Kevin)
- fix usage of qemu_aio_flush and bdrv_flush (Kevin)
- use bs in AIOCB on the primary (Kevin)
- reorder event-tap functions to gather with block/net (Kevin)
- fix checking bs->device_name (Kevin)

The changes from v0.2.5 -> v0.2.6 are:

- use qemu_{put,get}_be32() to save/load niov in event-tap

The changes from v0.2.4 -> v0.2.5 are:

- fixed braces and trailing spaces by using Blue's checkpatch.pl (Blue)
- event-tap: don't try to send blk_req if it's a bdrv_aio_flush event

The changes from v0.2.3 -> v0.2.4 are:

- call vm_start() before event_tap_flush_one() to avoid failure in
  virtio-net assertion
- add vm_change_state_handler to turn off ft_mode
- use qemu_iovec functions in event-tap
- remove duplicated code in migration
- remove unnecessary new line for error_report in ft_trans_file

The changes from v0.2.2 -> v0.2.3 are:

- queue async net requests without copying (MST)
-- if not async, contents of the packets are sent to the secondary
- better description for option -k (MST)
- fix memory transfer failure
- fix ft transaction initiation failure

The changes from v0.2.1 -> v0.2.2 are:

- decrement last_avail_idx with inuse before saving (MST)
- remove qemu_aio_flush() and bdrv_flush_all() in migrate_ft_trans_commit()

The changes from v0.2 -> v0.2.1 are:

- Move event-tap to net/block layer and use stubs (Blue, Paul, MST, Kevin)
- Tap bdrv_aio_flush (Marcelo)
- Remove multiwrite interface in event-tap (Stefan)
- Fix event-tap to use pio/mmio to replay both net/block (Stefan)
- Improve error handling in event-tap (Stefan)
- Fix leak in event-tap (Stefan)
- Revise virtio last_avail_idx manipulation (MST)
- Clean up migration.c hook (Marcelo)
- Make deleting change state handler robust (Isaku, Anthony)

The changes from v0.1.1 -> v0.2 are:

- Introduce a queue in event-tap to make VM sync live.
- Change transaction receiver to a state machine for async receiving.
- Replace net/block layer functions with event-tap proxy functions.
- Remove dirty bitmap optimization for now.
- convert DPRINTF() in ft_trans_file to trace functions.
- convert fprintf() in ft_trans_file to error_report().
- improved error handling in ft_trans_file.
- add a tmp pointer to qemu_del_vm_change_state_handler.

The changes from v0.1 -> v0.1.1 are:

- events are tapped in net/block layer instead of device emulation layer.
- Introduce a new option for -incoming to accept FT transaction.

- Removed writev() support to QEMUFile and FdMigrationState for now.
  I would post this work in a different series.

- Modified virtio-blk save/load handler to send inuse variable to
  correctly replay.

- Removed configure --enable-ft-mode.
- Removed unnecessary check for qemu_realloc().

The first 6 patches modify several functions of qemu to prepare for
introducing Kemari specific components.

The next 6 patches are the components of Kemari.  They introduce
event-tap and the FT transaction protocol file based on buffered file.
The design document of the FT transaction protocol can be found at,
http://wiki.qemu.org/images/b/b1/Kemari_sender_receiver_0.5a.pdf

Then the following 2 patches modify net/block layer functions with
event-tap functions.  Please note that if Kemari is off, event-tap
will just pass through, and there is almost no intrusion into existing
functions including normal live migration.

Finally, the migration layer is modified to support Kemari in the
last 4 patches.  Again, there shouldn't be any effect if a user
doesn't specify Kemari specific options.  The transaction is now async
on both sender and receiver side.  The sender side respects the
max_downtime to decide when to switch from async to sync mode.

The repository contains all patches I'm sending with this message.
For those who want to try, please pull the following repository.  It
also includes the dirty bitmap optimization, which isn't ready for
posting yet.  To remove the dirty bitmap optimization, please look at
HEAD~4 of the tree.

git://kemari.git.sourceforge.net/gitroot/kemari/kemari next

Thanks,

Yoshi

Yoshiaki Tamura (18):
  Make QEMUFile buf expandable, and introduce qemu_realloc_buffer() and
qemu_clear_buffer().
  Introduce read() to FdMigrationState.
  Introduce skip_header parameter to qemu_loadvm_state().
  

[Qemu-devel] [PATCH 03/18] Introduce skip_header parameter to qemu_loadvm_state().

2011-02-10 Thread Yoshiaki Tamura
Introduce skip_header parameter to qemu_loadvm_state() so that it can
be called iteratively without reading the header.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 migration.c |2 +-
 savevm.c|   24 +---
 sysemu.h|2 +-
 3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/migration.c b/migration.c
index f0df5fc..dd3bf94 100644
--- a/migration.c
+++ b/migration.c
@@ -63,7 +63,7 @@ int qemu_start_incoming_migration(const char *uri)
 
 void process_incoming_migration(QEMUFile *f)
 {
-    if (qemu_loadvm_state(f) < 0) {
+    if (qemu_loadvm_state(f, 0) < 0) {
         fprintf(stderr, "load of migration failed\n");
         exit(0);
     }
diff --git a/savevm.c b/savevm.c
index 6c4c72b..58e48e3 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1716,7 +1716,7 @@ typedef struct LoadStateEntry {
     int version_id;
 } LoadStateEntry;
 
-int qemu_loadvm_state(QEMUFile *f)
+int qemu_loadvm_state(QEMUFile *f, int skip_header)
 {
     QLIST_HEAD(, LoadStateEntry) loadvm_handlers =
         QLIST_HEAD_INITIALIZER(loadvm_handlers);
@@ -1729,17 +1729,19 @@ int qemu_loadvm_state(QEMUFile *f)
         return -EINVAL;
     }
 
-    v = qemu_get_be32(f);
-    if (v != QEMU_VM_FILE_MAGIC)
-        return -EINVAL;
+    if (!skip_header) {
+        v = qemu_get_be32(f);
+        if (v != QEMU_VM_FILE_MAGIC)
+            return -EINVAL;
 
-    v = qemu_get_be32(f);
-    if (v == QEMU_VM_FILE_VERSION_COMPAT) {
-        fprintf(stderr, "SaveVM v2 format is obsolete and don't work anymore\n");
-        return -ENOTSUP;
+        v = qemu_get_be32(f);
+        if (v == QEMU_VM_FILE_VERSION_COMPAT) {
+            fprintf(stderr, "SaveVM v2 format is obsolete and don't work anymore\n");
+            return -ENOTSUP;
+        }
+        if (v != QEMU_VM_FILE_VERSION)
+            return -ENOTSUP;
     }
-    if (v != QEMU_VM_FILE_VERSION)
-        return -ENOTSUP;
 
     while ((section_type = qemu_get_byte(f)) != QEMU_VM_EOF) {
         uint32_t instance_id, version_id, section_id;
@@ -2062,7 +2064,7 @@ int load_vmstate(const char *name)
         return -EINVAL;
     }
 
-    ret = qemu_loadvm_state(f);
+    ret = qemu_loadvm_state(f, 0);
 
     qemu_fclose(f);
     if (ret < 0) {
diff --git a/sysemu.h b/sysemu.h
index 23ae17e..c86b4e8 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -81,7 +81,7 @@ int qemu_savevm_state_begin(Monitor *mon, QEMUFile *f, int blk_enable,
 int qemu_savevm_state_iterate(Monitor *mon, QEMUFile *f);
 int qemu_savevm_state_complete(Monitor *mon, QEMUFile *f);
 void qemu_savevm_state_cancel(Monitor *mon, QEMUFile *f);
-int qemu_loadvm_state(QEMUFile *f);
+int qemu_loadvm_state(QEMUFile *f, int skip_header);
 
 /* SLIRP */
 void do_info_slirp(Monitor *mon);
-- 
1.7.1.2
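
The effect of the skip_header parameter can be shown on a toy stream: the first load validates the magic and version words, and later loads on the same stream skip that check and go straight to the payload. The sketch below is an assumption-laden model, not QEMU code; Stream, get_be32(), load_state() and the STREAM_MAGIC and STREAM_VERSION values are placeholders chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define STREAM_MAGIC   0x5145564dU  /* placeholder magic for this sketch */
#define STREAM_VERSION 3U           /* placeholder version for this sketch */

typedef struct Stream {
    const uint8_t *data;
    size_t len;
    size_t pos;
} Stream;

/* Read a big-endian 32-bit word; bytes past the end read as zero. */
static uint32_t get_be32(Stream *s)
{
    uint32_t v = 0;
    int i;

    for (i = 0; i < 4; i++) {
        uint8_t b = s->pos < s->len ? s->data[s->pos++] : 0;
        v = (v << 8) | b;
    }
    return v;
}

/* Validate the header unless the caller says it was already consumed,
 * which allows the function to be called repeatedly on one stream. */
static int load_state(Stream *s, int skip_header)
{
    assert(s != NULL);
    if (!skip_header) {
        if (get_be32(s) != STREAM_MAGIC) {
            return -EINVAL;
        }
        if (get_be32(s) != STREAM_VERSION) {
            return -ENOTSUP;
        }
    }
    /* Section parsing would follow here. */
    return 0;
}
```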




[Qemu-devel] [PATCH 15/18] savevm: introduce qemu_savevm_trans_{begin, commit}.

2011-02-10 Thread Yoshiaki Tamura
Introduce qemu_savevm_trans_{begin,commit} to send the memory and
device info together, while avoiding cancelling memory state tracking.
This patch also abstracts common code between
qemu_savevm_state_{begin,iterate,commit}.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 savevm.c |  157 +++---
 sysemu.h |2 +
 2 files changed, 101 insertions(+), 58 deletions(-)

diff --git a/savevm.c b/savevm.c
index e44eccd..1c2a7fb 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1601,29 +1601,68 @@ bool qemu_savevm_state_blocked(Monitor *mon)
 return false;
 }
 
-int qemu_savevm_state_begin(Monitor *mon, QEMUFile *f, int blk_enable,
-int shared)
+/*
+ * section: header to write
+ * inc: if true, forces to pass SECTION_PART instead of SECTION_START
+ * pause: if true, breaks the loop when live handler returned 0
+ */
+static int qemu_savevm_state_live(Monitor *mon, QEMUFile *f, int section,
+  bool inc, bool pause)
 {
 SaveStateEntry *se;
+int skip = 0, ret;
 
 QTAILQ_FOREACH(se, &savevm_handlers, entry) {
-if(se->set_params == NULL) {
+int len, stage;
+
+if (se->save_live_state == NULL) {
 continue;
-   }
-   se->set_params(blk_enable, shared, se->opaque);
+}
+
+/* Section type */
+qemu_put_byte(f, section);
+qemu_put_be32(f, se->section_id);
+
+if (section == QEMU_VM_SECTION_START) {
+/* ID string */
+len = strlen(se->idstr);
+qemu_put_byte(f, len);
+qemu_put_buffer(f, (uint8_t *)se->idstr, len);
+
+qemu_put_be32(f, se->instance_id);
+qemu_put_be32(f, se->version_id);
+
+stage = inc ? QEMU_VM_SECTION_PART : QEMU_VM_SECTION_START;
+} else {
+assert(inc);
+stage = section;
+}
+
+ret = se->save_live_state(mon, f, stage, se->opaque);
+if (!ret) {
+skip++;
+if (pause) {
+break;
+}
+}
 }
-
-qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
-qemu_put_be32(f, QEMU_VM_FILE_VERSION);
+
+return skip;
+}
+
+static void qemu_savevm_state_full(QEMUFile *f)
+{
+SaveStateEntry *se;
 
 QTAILQ_FOREACH(se, &savevm_handlers, entry) {
 int len;
 
-if (se->save_live_state == NULL)
+if (se->save_state == NULL && se->vmsd == NULL) {
 continue;
+}
 
 /* Section type */
-qemu_put_byte(f, QEMU_VM_SECTION_START);
+qemu_put_byte(f, QEMU_VM_SECTION_FULL);
 qemu_put_be32(f, se->section_id);
 
 /* ID string */
@@ -1634,9 +1673,29 @@ int qemu_savevm_state_begin(Monitor *mon, QEMUFile *f, int blk_enable,
 qemu_put_be32(f, se->instance_id);
 qemu_put_be32(f, se->version_id);
 
-se->save_live_state(mon, f, QEMU_VM_SECTION_START, se->opaque);
+vmstate_save(f, se);
+}
+
+qemu_put_byte(f, QEMU_VM_EOF);
+}
+
+int qemu_savevm_state_begin(Monitor *mon, QEMUFile *f, int blk_enable,
+int shared)
+{
+SaveStateEntry *se;
+
+QTAILQ_FOREACH(se, &savevm_handlers, entry) {
+if (se->set_params == NULL) {
+continue;
+}
+se->set_params(blk_enable, shared, se->opaque);
 }
 
+qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
+qemu_put_be32(f, QEMU_VM_FILE_VERSION);
+
+qemu_savevm_state_live(mon, f, QEMU_VM_SECTION_START, 0, 0);
+
 if (qemu_file_has_error(f)) {
 qemu_savevm_state_cancel(mon, f);
 return -EIO;
@@ -1647,29 +1706,16 @@ int qemu_savevm_state_begin(Monitor *mon, QEMUFile *f, int blk_enable,
 
 int qemu_savevm_state_iterate(Monitor *mon, QEMUFile *f)
 {
-SaveStateEntry *se;
 int ret = 1;
 
-QTAILQ_FOREACH(se, &savevm_handlers, entry) {
-if (se->save_live_state == NULL)
-continue;
-
-/* Section type */
-qemu_put_byte(f, QEMU_VM_SECTION_PART);
-qemu_put_be32(f, se->section_id);
-
-ret = se->save_live_state(mon, f, QEMU_VM_SECTION_PART, se->opaque);
-if (!ret) {
-/* Do not proceed to the next vmstate before this one reported
-   completion of the current stage. This serializes the migration
-   and reduces the probability that a faster changing state is
-   synchronized over and over again. */
-break;
-}
-}
-
-if (ret)
+/* Do not proceed to the next vmstate before this one reported
+   completion of the current stage. This serializes the migration
+   and reduces the probability that a faster changing state is
+   synchronized over and over again. */
+ret = qemu_savevm_state_live(mon, f, QEMU_VM_SECTION_PART, 1, 1);
+if (!ret) {
 return 1;
+}
 
 if (qemu_file_has_error(f)) {
 qemu_savevm_state_cancel(mon, f);
@@ -1681,46 +1727,41 @@ int 
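
The section header that both qemu_savevm_state_live() and qemu_savevm_state_full() emit can be pictured with a small stand-alone sketch. It writes the same field order seen in the patch (type byte, section id, length-prefixed ID string, instance id, version id) into a caller-supplied buffer instead of a QEMUFile. The helper names and the numeric value of SKETCH_SECTION_START are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SECTION_START 0x01   /* assumed value, for illustration */

/* Store v big-endian at out and report how many bytes were written. */
static size_t sketch_put_be32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
    return 4;
}

/*
 * Serialise a "section start" header. Returns the number of bytes
 * written, or 0 if idstr does not fit in one length byte or the
 * buffer is too small.
 */
size_t sketch_put_section_start(uint8_t *out, size_t cap, uint32_t section_id,
                                const char *idstr, uint32_t instance_id,
                                uint32_t version_id)
{
    size_t len, n = 0;

    assert(out != NULL && idstr != NULL);
    len = strlen(idstr);
    if (len > 255 || 1 + 4 + 1 + len + 4 + 4 > cap) {
        return 0;
    }
    out[n++] = SKETCH_SECTION_START;            /* section type */
    n += sketch_put_be32(out + n, section_id);
    out[n++] = (uint8_t)len;                    /* ID string length */
    memcpy(out + n, idstr, len);                /* ID string, no NUL */
    n += len;
    n += sketch_put_be32(out + n, instance_id);
    n += sketch_put_be32(out + n, version_id);
    return n;
}
```

A "part" section would carry only the type byte and section id, which is why the patch writes the ID string and version fields only when section == QEMU_VM_SECTION_START.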

[Qemu-devel] [PATCH 16/18] migration: introduce migrate_ft_trans_{put, get}_ready(), and modify migrate_fd_put_ready() when ft_mode is on.

2011-02-10 Thread Yoshiaki Tamura
Introduce migrate_ft_trans_put_ready(), which kicks off the FT transaction
cycle.  When ft_mode is on, migrate_fd_put_ready() opens ft_trans_file
and turns on event_tap.  To end or cancel an FT transaction, ft_mode and
event_tap are turned off.  migrate_ft_trans_get_ready() is called to
receive the ack from the receiver.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 migration.c |  261 ++-
 1 files changed, 260 insertions(+), 1 deletions(-)

diff --git a/migration.c b/migration.c
index c5e0146..7837c55 100644
--- a/migration.c
+++ b/migration.c
@@ -21,6 +21,7 @@
 #include "qemu_socket.h"
 #include "block-migration.h"
 #include "qemu-objects.h"
+#include "event-tap.h"
 
 //#define DEBUG_MIGRATION
 
@@ -283,6 +284,14 @@ void migrate_fd_error(FdMigrationState *s)
 migrate_fd_cleanup(s);
 }
 
+static void migrate_ft_trans_error(FdMigrationState *s)
+{
+ft_mode = FT_ERROR;
+qemu_savevm_state_cancel(s->mon, s->file);
+migrate_fd_error(s);
+event_tap_unregister();
+}
+
 int migrate_fd_cleanup(FdMigrationState *s)
 {
 int ret = 0;
@@ -318,6 +327,17 @@ void migrate_fd_put_notify(void *opaque)
 qemu_file_put_notify(s->file);
 }
 
+static void migrate_fd_get_notify(void *opaque)
+{
+FdMigrationState *s = opaque;
+
+qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
+qemu_file_get_notify(s->file);
+if (qemu_file_has_error(s->file)) {
+migrate_ft_trans_error(s);
+}
+}
+
 ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)
 {
 FdMigrationState *s = opaque;
@@ -353,6 +373,10 @@ int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size)
 ret = -(s->get_error(s));
 }
 
+if (ret == -EAGAIN) {
+qemu_set_fd_handler2(s->fd, NULL, migrate_fd_get_notify, NULL, s);
+}
+
 return ret;
 }
 
@@ -379,6 +403,230 @@ void migrate_fd_connect(FdMigrationState *s)
 migrate_fd_put_ready(s);
 }
 
+static int migrate_ft_trans_commit(void *opaque)
+{
+FdMigrationState *s = opaque;
+int ret = -1;
+
+if (ft_mode != FT_TRANSACTION_COMMIT && ft_mode != FT_TRANSACTION_ATOMIC) {
+fprintf(stderr,
+"migrate_ft_trans_commit: invalid ft_mode %d\n", ft_mode);
+goto out;
+}
+
+do {
+if (ft_mode == FT_TRANSACTION_ATOMIC) {
+if (qemu_ft_trans_begin(s->file) < 0) {
+fprintf(stderr, "qemu_ft_trans_begin failed\n");
+goto out;
+}
+
+ret = qemu_savevm_trans_begin(s->mon, s->file, 0);
+if (ret < 0) {
+fprintf(stderr, "qemu_savevm_trans_begin failed\n");
+goto out;
+}
+
+ft_mode = FT_TRANSACTION_COMMIT;
+if (ret) {
+/* don't proceed until if fd isn't ready */
+goto out;
+}
+}
+
+/* make the VM state consistent by flushing outstanding events */
+vm_stop(0);
+
+/* send at full speed */
+qemu_file_set_rate_limit(s->file, 0);
+
+ret = qemu_savevm_trans_complete(s->mon, s->file);
+if (ret < 0) {
+fprintf(stderr, "qemu_savevm_trans_complete failed\n");
+goto out;
+}
+
+ret = qemu_ft_trans_commit(s->file);
+if (ret < 0) {
+fprintf(stderr, "qemu_ft_trans_commit failed\n");
+goto out;
+}
+
+if (ret) {
+ft_mode = FT_TRANSACTION_RECV;
+ret = 1;
+goto out;
+}
+
+/* flush and check if events are remaining */
+vm_start();
+ret = event_tap_flush_one();
+if (ret < 0) {
+fprintf(stderr, "event_tap_flush_one failed\n");
+goto out;
+}
+
+ft_mode =  ret ? FT_TRANSACTION_BEGIN : FT_TRANSACTION_ATOMIC;
+} while (ft_mode != FT_TRANSACTION_BEGIN);
+
+vm_start();
+ret = 0;
+
+out:
+return ret;
+}
+
+static int migrate_ft_trans_get_ready(void *opaque)
+{
+FdMigrationState *s = opaque;
+int ret = -1;
+
+if (ft_mode != FT_TRANSACTION_RECV) {
+fprintf(stderr,
+"migrate_ft_trans_get_ready: invalid ft_mode %d\n", ft_mode);
+goto error_out;
+}
+
+/* flush and check if events are remaining */
+vm_start();
+ret = event_tap_flush_one();
+if (ret < 0) {
+fprintf(stderr, "event_tap_flush_one failed\n");
+goto error_out;
+}
+
+if (ret) {
+ft_mode = FT_TRANSACTION_BEGIN;
+} else {
+ft_mode = FT_TRANSACTION_ATOMIC;
+
+ret = migrate_ft_trans_commit(s);
+if (ret < 0) {
+goto error_out;
+}
+if (ret) {
+goto out;
+}
+}
+
+vm_start();
+ret = 0;
+goto out;
+
+error_out:
+migrate_ft_trans_error(s);
+
+out:
+return ret;
+}
+
+static int migrate_ft_trans_put_ready(void)
+{
+FdMigrationState *s = migrate_to_fms(current_migration);
+

[Qemu-devel] [PATCH 11/18] ioport: insert event_tap_ioport() to ioport_write().

2011-02-10 Thread Yoshiaki Tamura
Record ioport event to replay it upon failover.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 ioport.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/ioport.c b/ioport.c
index aa4188a..74aebf5 100644
--- a/ioport.c
+++ b/ioport.c
@@ -27,6 +27,7 @@
 
 #include "ioport.h"
 #include "trace.h"
+#include "event-tap.h"
 
 /***/
 /* IO Port */
@@ -76,6 +77,7 @@ static void ioport_write(int index, uint32_t address, uint32_t data)
 default_ioport_writel
 };
 IOPortWriteFunc *func = ioport_write_table[index][address];
+event_tap_ioport(index, address, data);
 if (!func)
 func = default_func[index];
 func(ioport_opaque[address], address, data);
-- 
1.7.1.2
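
As a rough illustration of what hooking the tap in front of the dispatch buys, the sketch below records each port write in a small log before handing it to the device handler, so the log could later be replayed. This is only a guess at the shape of the mechanism: the fixed-size log, the on/off flag and all sketch_* names are hypothetical and are not taken from event-tap.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LOG_MAX 16

typedef struct SketchIOportEvent {
    int index;          /* access width selector: 0 = byte, 1 = word, 2 = long */
    uint32_t address;
    uint32_t data;
} SketchIOportEvent;

SketchIOportEvent sketch_log[SKETCH_LOG_MAX];
size_t sketch_log_len;
bool sketch_tap_on;
uint32_t sketch_last_written;   /* stands in for real device state */

/* Record the write if the tap is enabled and the log has room. */
static void sketch_event_tap_ioport(int index, uint32_t address, uint32_t data)
{
    if (!sketch_tap_on || sketch_log_len == SKETCH_LOG_MAX) {
        return;
    }
    sketch_log[sketch_log_len].index = index;
    sketch_log[sketch_log_len].address = address;
    sketch_log[sketch_log_len].data = data;
    sketch_log_len++;
}

/* Hypothetical device handler. */
static void sketch_device_write(uint32_t address, uint32_t data)
{
    (void)address;
    sketch_last_written = data;
}

/* The tap runs first, so the event is captured even if the handler misbehaves. */
void sketch_ioport_write(int index, uint32_t address, uint32_t data)
{
    assert(index >= 0 && index < 3);
    sketch_event_tap_ioport(index, address, data);
    sketch_device_write(address, data);
}
```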




[Qemu-devel] [PATCH 07/18] Introduce fault tolerant VM transaction QEMUFile and ft_mode.

2011-02-10 Thread Yoshiaki Tamura
This code implements the VM transaction protocol.  Like buffered_file, it
sits between the savevm and migration layers.  With this architecture, the
VM transaction protocol is implemented mostly independently of other
existing code.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
Signed-off-by: OHMURA Kei ohmura@lab.ntt.co.jp
---
 Makefile.objs   |1 +
 ft_trans_file.c |  624 +++
 ft_trans_file.h |   72 +++
 migration.c |3 +
 trace-events|   15 ++
 5 files changed, 715 insertions(+), 0 deletions(-)
 create mode 100644 ft_trans_file.c
 create mode 100644 ft_trans_file.h

diff --git a/Makefile.objs b/Makefile.objs
index 353b1a8..04148b5 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -100,6 +100,7 @@ common-obj-y += msmouse.o ps2.o
 common-obj-y += qdev.o qdev-properties.o
 common-obj-y += block-migration.o
 common-obj-y += pflib.o
+common-obj-y += ft_trans_file.o
 
 common-obj-$(CONFIG_BRLAPI) += baum.o
 common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
diff --git a/ft_trans_file.c b/ft_trans_file.c
new file mode 100644
index 000..2b42b95
--- /dev/null
+++ b/ft_trans_file.c
@@ -0,0 +1,624 @@
+/*
+ * Fault tolerant VM transaction QEMUFile
+ *
+ * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * This source code is based on buffered_file.c.
+ * Copyright IBM, Corp. 2008
+ * Authors:
+ *  Anthony Liguorialigu...@us.ibm.com
+ */
+
+#include "qemu-common.h"
+#include "qemu-error.h"
+#include "hw/hw.h"
+#include "qemu-timer.h"
+#include "sysemu.h"
+#include "qemu-char.h"
+#include "trace.h"
+#include "ft_trans_file.h"
+
+typedef struct FtTransHdr
+{
+uint16_t cmd;
+uint16_t id;
+uint32_t seq;
+uint32_t payload_len;
+} FtTransHdr;
+
+typedef struct QEMUFileFtTrans
+{
+FtTransPutBufferFunc *put_buffer;
+FtTransGetBufferFunc *get_buffer;
+FtTransPutReadyFunc *put_ready;
+FtTransGetReadyFunc *get_ready;
+FtTransWaitForUnfreezeFunc *wait_for_unfreeze;
+FtTransCloseFunc *close;
+void *opaque;
+QEMUFile *file;
+
+enum QEMU_VM_TRANSACTION_STATE state;
+uint32_t seq;
+uint16_t id;
+
+int has_error;
+
+bool freeze_output;
+bool freeze_input;
+bool rate_limit;
+bool is_sender;
+bool is_payload;
+
+uint8_t *buf;
+size_t buf_max_size;
+size_t put_offset;
+size_t get_offset;
+
+FtTransHdr header;
+size_t header_offset;
+} QEMUFileFtTrans;
+
+#define IO_BUF_SIZE 32768
+
+static void ft_trans_append(QEMUFileFtTrans *s,
+const uint8_t *buf, size_t size)
+{
+if (size > (s->buf_max_size - s->put_offset)) {
+trace_ft_trans_realloc(s->buf_max_size, size + 1024);
+s->buf_max_size += size + 1024;
+s->buf = qemu_realloc(s->buf, s->buf_max_size);
+}
+
+trace_ft_trans_append(size);
+memcpy(s->buf + s->put_offset, buf, size);
+s->put_offset += size;
+}
+
+static void ft_trans_flush(QEMUFileFtTrans *s)
+{
+size_t offset = 0;
+
+if (s->has_error) {
+error_report("flush when error %d, bailing", s->has_error);
+return;
+}
+
+while (offset < s->put_offset) {
+ssize_t ret;
+
+ret = s->put_buffer(s->opaque, s->buf + offset, s->put_offset - offset);
+if (ret == -EAGAIN) {
+break;
+}
+
+if (ret <= 0) {
+error_report("error flushing data, %s", strerror(errno));
+s->has_error = FT_TRANS_ERR_FLUSH;
+break;
+} else {
+offset += ret;
+}
+}
+
+trace_ft_trans_flush(offset, s->put_offset);
+memmove(s->buf, s->buf + offset, s->put_offset - offset);
+s->put_offset -= offset;
+s->freeze_output = !!s->put_offset;
+}
+
+static ssize_t ft_trans_put(void *opaque, void *buf, int size)
+{
+QEMUFileFtTrans *s = opaque;
+size_t offset = 0;
+ssize_t len;
+
+/* flush buffered data before putting next */
+if (s->put_offset) {
+ft_trans_flush(s);
+}
+
+while (!s->freeze_output && offset < size) {
+len = s->put_buffer(s->opaque, (uint8_t *)buf + offset, size - offset);
+
+if (len == -EAGAIN) {
+trace_ft_trans_freeze_output();
+s->freeze_output = 1;
+break;
+}
+
+if (len <= 0) {
+error_report("putting data failed, %s", strerror(errno));
+s->has_error = 1;
+offset = -EINVAL;
+break;
+}
+
+offset += len;
+}
+
+if (s->freeze_output) {
+ft_trans_append(s, buf + offset, size - offset);
+offset = size;
+}
+
+return offset;
+}
+
+static int ft_trans_send_header(QEMUFileFtTrans *s,
+enum QEMU_VM_TRANSACTION_STATE state,
+uint32_t payload_len)
+{
+int ret;
+FtTransHdr 
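
The put/append/flush trio above follows a common non-blocking output pattern: write as much as the transport accepts, park the remainder in a growable buffer when it reports -EAGAIN, and drain that buffer first on the next call. The stand-alone sketch below shows that pattern under simplifying assumptions. It is not the patch's code: it returns an error code instead of storing has_error, it uses realloc() rather than qemu_realloc(), and SketchPipe is a made-up transport that accepts a limited byte budget before it would block.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef long (*SketchWriteFn)(void *opaque, const uint8_t *data, size_t size);

typedef struct SketchOut {
    SketchWriteFn write;    /* returns bytes written, or -EAGAIN */
    void *opaque;
    uint8_t *pending;       /* bytes accepted from the caller but not yet sent */
    size_t pending_len;
    size_t pending_cap;
    bool frozen;            /* true while the transport would block */
} SketchOut;

/* Hypothetical transport: accepts at most `budget` more bytes. */
typedef struct SketchPipe {
    uint8_t data[64];
    size_t len;
    size_t budget;
} SketchPipe;

long sketch_pipe_write(void *opaque, const uint8_t *data, size_t size)
{
    SketchPipe *p = opaque;
    size_t n = size < p->budget ? size : p->budget;

    if (p->budget == 0) {
        return -EAGAIN;
    }
    if (n > sizeof(p->data) - p->len) {
        n = sizeof(p->data) - p->len;
    }
    memcpy(p->data + p->len, data, n);
    p->len += n;
    p->budget -= n;
    return (long)n;
}

/* Park unsent bytes, growing the buffer with some slack. */
static int sketch_append(SketchOut *s, const uint8_t *data, size_t size)
{
    if (size > s->pending_cap - s->pending_len) {
        size_t cap = s->pending_len + size + 1024;
        uint8_t *p = realloc(s->pending, cap);
        if (p == NULL) {
            return -ENOMEM;
        }
        s->pending = p;
        s->pending_cap = cap;
    }
    memcpy(s->pending + s->pending_len, data, size);
    s->pending_len += size;
    return 0;
}

/* Drain parked bytes; stay frozen if some remain. */
int sketch_flush(SketchOut *s)
{
    size_t off = 0;

    while (off < s->pending_len) {
        long ret = s->write(s->opaque, s->pending + off, s->pending_len - off);
        if (ret == -EAGAIN) {
            break;
        }
        if (ret <= 0) {
            return -EIO;
        }
        off += (size_t)ret;
    }
    if (off > 0) {
        memmove(s->pending, s->pending + off, s->pending_len - off);
        s->pending_len -= off;
    }
    s->frozen = s->pending_len != 0;
    return 0;
}

/* Send data in order: older parked bytes first, then the new bytes. */
int sketch_put(SketchOut *s, const uint8_t *data, size_t size)
{
    size_t off = 0;

    assert(s != NULL && s->write != NULL);
    if (s->pending_len && sketch_flush(s) < 0) {
        return -EIO;
    }
    while (!s->frozen && off < size) {
        long ret = s->write(s->opaque, data + off, size - off);
        if (ret == -EAGAIN) {
            s->frozen = true;
            break;
        }
        if (ret <= 0) {
            return -EIO;
        }
        off += (size_t)ret;
    }
    return s->frozen ? sketch_append(s, data + off, size - off) : 0;
}
```

Because new bytes are only appended behind whatever is already parked, ordering on the wire is preserved even across several would-block episodes.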

[Qemu-devel] Re: [PATCH] Fix multiple qemu-options.def generation

2011-02-10 Thread Paolo Bonzini
 diff --git a/Makefile.objs b/Makefile.objs
 index 4a1eaa1..ee9f190 100755
 --- a/Makefile.objs
 +++ b/Makefile.objs
 @@ -269,10 +269,10 @@ vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
  vl.o: QEMU_CFLAGS+=$(SDL_CFLAGS)
 -vl.o: qemu-options.def
 +vl.o: ../qemu-options.def
 os-posix.o: qemu-options.def
 os-win32.o: qemu-options.def
 -qemu-options.def: $(SRC_PATH)/qemu-options.hx
 +%qemu-options.def: $(SRC_PATH)/qemu-options.hx
$(call quiet-command,sh $(SRC_PATH)/hxtool -h < $< > $@,"  GEN   $(TARGET_DIR)$@")

This is wrong, I think the problem is that you are missing a vpath 
directive.

Does this help?

diff --git a/rules.mak b/rules.mak
index ed59c9e..6f753ae 100644
--- a/rules.mak
+++ b/rules.mak
@@ -39,7 +39,7 @@ quiet-command = $(if $(V),$1,$(if $(2),@echo $2 && $1, @$1))
 cc-option = $(if $(shell $(CC) $1 $2 -S -o /dev/null -xc /dev/null \
   >/dev/null 2>&1 && echo OK), $2, $3)
 
-VPATH_SUFFIXES = %.c %.h %.S %.m %.mak %.texi
+VPATH_SUFFIXES = %.c %.h %.S %.m %.mak %.texi %.def
 set-vpath = $(if $1,$(foreach PATTERN,$(VPATH_SUFFIXES),$(eval vpath $(PATTERN) $1)))
 
 # find-in-path

Paolo



[Qemu-devel] Re: [PATCH 18/18] Introduce kemari: to enable FT migration mode (Kemari).

2011-02-10 Thread Paolo Bonzini

On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote:

When kemari: is placed in front of the URI of the migrate command, it turns
on ft_mode to start FT migration mode (Kemari).  On the receiver side,
the option looks like -incoming kemari:protocol:address:port

Signed-off-by: Yoshiaki Tamuratamura.yoshi...@lab.ntt.co.jp
---
  hmp-commands.hx |4 +++-
  migration.c |   12 
  qmp-commands.hx |4 +++-
  3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 38e1eb7..ee14344 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -760,7 +760,9 @@ ETEXI
  \n\t\t\t -b for migration without shared storage with
   full copy of disk\n\t\t\t -i for migration without 
  shared storage with incremental copy of disk 
- (base image shared between src and destination),
+ (base image shared between src and destination)
+ \n\t\t\t put \kemari:\ in front of URI to enable 
+ Fault Tolerance mode (Kemari protocol),
  .user_print = monitor_user_noop,  
.mhandler.cmd_new = do_migrate,
  },
diff --git a/migration.c b/migration.c
index 7837c55..a3f7722 100644
--- a/migration.c
+++ b/migration.c
@@ -48,6 +48,12 @@ int qemu_start_incoming_migration(const char *uri)
  const char *p;
  int ret;

+/* check ft_mode (Kemari protocol) */
+if (strstart(uri, "kemari:", &p)) {
+ft_mode = FT_INIT;
+uri = p;
+}
+
  if (strstart(uri, "tcp:", &p))
  ret = tcp_start_incoming_migration(p);
  #if !defined(WIN32)
@@ -99,6 +105,12 @@ int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data)
  return -1;
  }

+/* check ft_mode (Kemari protocol) */
+if (strstart(uri, "kemari:", &p)) {
+ft_mode = FT_INIT;
+uri = p;
+}
+
  if (strstart(uri, "tcp:", &p)) {
  s = tcp_start_outgoing_migration(mon, p, max_throttle, detach,
   blk, inc);
diff --git a/qmp-commands.hx b/qmp-commands.hx
index df40a3d..68ca48a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -437,7 +437,9 @@ EQMP
  \n\t\t\t -b for migration without shared storage with
   full copy of disk\n\t\t\t -i for migration without 
  shared storage with incremental copy of disk 
- (base image shared between src and destination),
+ (base image shared between src and destination)
+ \n\t\t\t put \kemari:\ in front of URI to enable 
+ Fault Tolerance mode (Kemari protocol),
  .user_print = monitor_user_noop,  
.mhandler.cmd_new = do_migrate,
  },


Acked-by: Paolo Bonzini pbonz...@redhat.com

Paolo
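
For anyone skimming the thread, the prefix handling in this patch amounts to "strip an optional kemari: and remember that it was there". A self-contained sketch of that step follows. sketch_strstart() is a local stand-in written for this example, not QEMU's own strstart(), and sketch_strip_kemari() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* If str begins with prefix, point *rest just past it and return true. */
static bool sketch_strstart(const char *str, const char *prefix,
                            const char **rest)
{
    size_t n = strlen(prefix);

    if (strncmp(str, prefix, n) != 0) {
        return false;
    }
    if (rest != NULL) {
        *rest = str + n;
    }
    return true;
}

/*
 * Strip an optional "kemari:" prefix. *ft_requested reports whether the
 * prefix was present; the return value is the URI to hand to the normal
 * tcp:/exec:/unix:/fd: dispatch.
 */
const char *sketch_strip_kemari(const char *uri, bool *ft_requested)
{
    const char *p = uri;

    assert(uri != NULL && ft_requested != NULL);
    *ft_requested = sketch_strstart(uri, "kemari:", &p);
    return *ft_requested ? p : uri;
}
```

So a monitor command such as "migrate kemari:tcp:192.168.0.20:4444" would reach the transport code as plain "tcp:192.168.0.20:4444" with the FT flag set (the address here is just an example).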



RE: [Qemu-devel] Re: [PATCH] Fix multiple qemu-options.def generation

2011-02-10 Thread Pavel Dovgaluk
  diff --git a/Makefile.objs b/Makefile.objs
  index 4a1eaa1..ee9f190 100755
  --- a/Makefile.objs
  +++ b/Makefile.objs
  @@ -269,10 +269,10 @@ vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
   vl.o: QEMU_CFLAGS+=$(SDL_CFLAGS)
  -vl.o: qemu-options.def
  +vl.o: ../qemu-options.def
  os-posix.o: qemu-options.def
  os-win32.o: qemu-options.def
  -qemu-options.def: $(SRC_PATH)/qemu-options.hx
  +%qemu-options.def: $(SRC_PATH)/qemu-options.hx
  $(call quiet-command,sh $(SRC_PATH)/hxtool -h < $< > $@,"  GEN   $(TARGET_DIR)$@")
 
 This is wrong, I think the problem is that you are missing a vpath
 directive.
 
 Does this help?

 This patch was for older version of qemu.
 Current one does not have this problem.

Pavel Dovgaluk




[Qemu-devel] [PATCH 09/18] Introduce event-tap.

2011-02-10 Thread Yoshiaki Tamura
event-tap controls when to start an FT transaction, and provides proxy
functions to be called from net/block devices.  During an FT transaction,
it queues up net/block requests and flushes them when the transaction
completes.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
Signed-off-by: OHMURA Kei ohmura@lab.ntt.co.jp
---
 Makefile.target |1 +
 event-tap.c |  939 +++
 event-tap.h |   44 +++
 qemu-tool.c |   28 ++
 trace-events|   10 +
 5 files changed, 1022 insertions(+), 0 deletions(-)
 create mode 100644 event-tap.c
 create mode 100644 event-tap.h

diff --git a/Makefile.target b/Makefile.target
index b0ba95f..edbdbee 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -199,6 +199,7 @@ obj-y += rwhandler.o
 obj-$(CONFIG_KVM) += kvm.o kvm-all.o
 obj-$(CONFIG_NO_KVM) += kvm-stub.o
 LIBS+=-lz
+obj-y += event-tap.o
 
 QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
 QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
diff --git a/event-tap.c b/event-tap.c
new file mode 100644
index 000..f44d835
--- /dev/null
+++ b/event-tap.c
@@ -0,0 +1,939 @@
+/*
+ * Event Tap functions for QEMU
+ *
+ * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "qemu-common.h"
+#include "qemu-error.h"
+#include "block.h"
+#include "block_int.h"
+#include "ioport.h"
+#include "osdep.h"
+#include "sysemu.h"
+#include "hw/hw.h"
+#include "net.h"
+#include "event-tap.h"
+#include "trace.h"
+
+enum EVENT_TAP_STATE {
+EVENT_TAP_OFF,
+EVENT_TAP_ON,
+EVENT_TAP_SUSPEND,
+EVENT_TAP_FLUSH,
+EVENT_TAP_LOAD,
+EVENT_TAP_REPLAY,
+};
+
+static enum EVENT_TAP_STATE event_tap_state = EVENT_TAP_OFF;
+
+typedef struct EventTapIOport {
+uint32_t address;
+uint32_t data;
+int  index;
+} EventTapIOport;
+
+#define MMIO_BUF_SIZE 8
+
+typedef struct EventTapMMIO {
+uint64_t address;
+uint8_t  buf[MMIO_BUF_SIZE];
+int  len;
+} EventTapMMIO;
+
+typedef struct EventTapNetReq {
+char *device_name;
+int iovcnt;
+int vlan_id;
+bool vlan_needed;
+bool async;
+struct iovec *iov;
+NetPacketSent *sent_cb;
+} EventTapNetReq;
+
+#define MAX_BLOCK_REQUEST 32
+
+typedef struct EventTapAIOCB EventTapAIOCB;
+
+typedef struct EventTapBlkReq {
+char *device_name;
+int num_reqs;
+int num_cbs;
+bool is_flush;
+BlockRequest reqs[MAX_BLOCK_REQUEST];
+EventTapAIOCB *acb[MAX_BLOCK_REQUEST];
+} EventTapBlkReq;
+
+#define EVENT_TAP_IOPORT (1 << 0)
+#define EVENT_TAP_MMIO   (1 << 1)
+#define EVENT_TAP_NET    (1 << 2)
+#define EVENT_TAP_BLK    (1 << 3)
+
+#define EVENT_TAP_TYPE_MASK (EVENT_TAP_NET - 1)
+
+typedef struct EventTapLog {
+int mode;
+union {
+EventTapIOport ioport;
+EventTapMMIO mmio;
+};
+union {
+EventTapNetReq net_req;
+EventTapBlkReq blk_req;
+};
+QTAILQ_ENTRY(EventTapLog) node;
+} EventTapLog;
+
+struct EventTapAIOCB {
+BlockDriverAIOCB common;
+BlockDriverAIOCB *acb;
+bool is_canceled;
+};
+
+static EventTapLog *last_event_tap;
+
+static QTAILQ_HEAD(, EventTapLog) event_list;
+static QTAILQ_HEAD(, EventTapLog) event_pool;
+
+static int (*event_tap_cb)(void);
+static QEMUBH *event_tap_bh;
+static VMChangeStateEntry *vmstate;
+
+static void event_tap_bh_cb(void *p)
+{
+if (event_tap_cb) {
+event_tap_cb();
+}
+
+qemu_bh_delete(event_tap_bh);
+event_tap_bh = NULL;
+}
+
+static void event_tap_schedule_bh(void)
+{
+trace_event_tap_ignore_bh(!!event_tap_bh);
+
+/* if bh is already set, we ignore it for now */
+if (event_tap_bh) {
+return;
+}
+
+event_tap_bh = qemu_bh_new(event_tap_bh_cb, NULL);
+qemu_bh_schedule(event_tap_bh);
+
+return;
+}
+
+static void *event_tap_alloc_log(void)
+{
+EventTapLog *log;
+
+if (QTAILQ_EMPTY(&event_pool)) {
+log = qemu_mallocz(sizeof(EventTapLog));
+} else {
+log = QTAILQ_FIRST(&event_pool);
+QTAILQ_REMOVE(&event_pool, log, node);
+}
+
+return log;
+}
+
+static void event_tap_free_net_req(EventTapNetReq *net_req);
+static void event_tap_free_blk_req(EventTapBlkReq *blk_req);
+
+static void event_tap_free_log(EventTapLog *log)
+{
+int mode = log->mode & ~EVENT_TAP_TYPE_MASK;
+
+if (mode == EVENT_TAP_NET) {
+event_tap_free_net_req(&log->net_req);
+} else if (mode == EVENT_TAP_BLK) {
+event_tap_free_blk_req(&log->blk_req);
+}
+
+log->mode = 0;
+
+/* return the log to event_pool */
+QTAILQ_INSERT_HEAD(&event_pool, log, node);
+}
+
+static void event_tap_free_pool(void)
+{
+EventTapLog *log, *next;
+
+QTAILQ_FOREACH_SAFE(log, &event_pool, node, next) {
+QTAILQ_REMOVE(&event_pool, log, node);
+qemu_free(log);
+}
+}
+
+static void event_tap_free_net_req(EventTapNetReq *net_req)
+{
+int i;
+
+if (!net_req->async) {
+for 
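
The event_pool handling above is a free-list allocator: released log entries are kept for reuse instead of being returned to the heap, and the whole pool is freed in one pass at the end. A minimal sketch of the same idea follows, using a hand-rolled singly linked list in place of the QTAILQ macros and calloc()/free() in place of qemu_mallocz()/qemu_free(). All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct SketchLog {
    int mode;                   /* event type bits; 0 means unused */
    struct SketchLog *next;     /* link used only while on the pool */
} SketchLog;

SketchLog *sketch_pool;         /* head of the free list */

/* Take an entry from the pool if one is available, else allocate a zeroed one. */
SketchLog *sketch_alloc_log(void)
{
    SketchLog *log = sketch_pool;

    if (log != NULL) {
        sketch_pool = log->next;
        log->next = NULL;
        return log;
    }
    return calloc(1, sizeof(*log));
}

/* Reset the entry and push it back on the pool for reuse. */
void sketch_free_log(SketchLog *log)
{
    assert(log != NULL);
    log->mode = 0;
    log->next = sketch_pool;
    sketch_pool = log;
}

/* Release every pooled entry, for example when the tap is switched off. */
void sketch_free_pool(void)
{
    while (sketch_pool != NULL) {
        SketchLog *next = sketch_pool->next;
        free(sketch_pool);
        sketch_pool = next;
    }
}
```

The trade-off is the usual one for pools: allocation on the hot path becomes a pointer swap, at the cost of holding on to memory until the pool is explicitly drained.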

[Qemu-devel] Re: [PATCH] Correct win32 timers deleting v.3

2011-02-10 Thread Paolo Bonzini

On 02/02/2011 12:59 PM, Pavel Dovgaluk wrote:

Hello.

  Anybody interested in this patch?


I'm planning to replace the multimedia timer with a queue timer.  I'll 
send the patch soon(ish).


Paolo



Re: [Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.

2011-02-10 Thread Yoshiaki Tamura
2011/2/10 Anthony Liguori anth...@codemonkey.ws:
 On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote:

 Currently FdMigrationState doesn't support read(), and this patch
 introduces it to get response from the other side.

 Signed-off-by: Yoshiaki Tamuratamura.yoshi...@lab.ntt.co.jp


 Migration is unidirectional.  Changing this is fundamental and not something
 to be done lightly.

 I thought we previously discussed using a protocol wrapper around the
 existing migration protocol?

AFAIR, I don't think we had that discussion before.  I applied
comments from Stefan though.  If I missed the discussion, could
you please give me the link?

Thanks,

Yoshi


 Regards,

 Anthony Liguori

 ---
  migration-tcp.c |   15 +++
  migration.c     |   13 +
  migration.h     |    3 +++
  3 files changed, 31 insertions(+), 0 deletions(-)

 diff --git a/migration-tcp.c b/migration-tcp.c
 index b55f419..55777c8 100644
 --- a/migration-tcp.c
 +++ b/migration-tcp.c
 @@ -39,6 +39,20 @@ static int socket_write(FdMigrationState *s, const void * buf, size_t size)
      return send(s->fd, buf, size, 0);
  }

 +static int socket_read(FdMigrationState *s, const void * buf, size_t size)
 +{
 +    ssize_t len;
 +
 +    do {
 +        len = recv(s->fd, (void *)buf, size, 0);
 +    } while (len == -1 && socket_error() == EINTR);
 +    if (len == -1) {
 +        len = -socket_error();
 +    }
 +
 +    return len;
 +}
 +
  static int tcp_close(FdMigrationState *s)
  {
      DPRINTF("tcp_close\n");
 @@ -94,6 +108,7 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon,

      s->get_error = socket_errno;
      s->write = socket_write;
 +    s->read = socket_read;
      s->close = tcp_close;
      s->mig_state.cancel = migrate_fd_cancel;
      s->mig_state.get_status = migrate_fd_get_status;
 diff --git a/migration.c b/migration.c
 index 3612572..f0df5fc 100644
 --- a/migration.c
 +++ b/migration.c
 @@ -340,6 +340,19 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)
      return ret;
  }

 +int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size)
 +{
 +    FdMigrationState *s = opaque;
 +    int ret;
 +
 +    ret = s->read(s, data, size);
 +    if (ret == -1) {
 +        ret = -(s->get_error(s));
 +    }
 +
 +    return ret;
 +}
 +
  void migrate_fd_connect(FdMigrationState *s)
  {
      int ret;
 diff --git a/migration.h b/migration.h
 index 2170792..88a6987 100644
 --- a/migration.h
 +++ b/migration.h
 @@ -48,6 +48,7 @@ struct FdMigrationState
      int (*get_error)(struct FdMigrationState*);
      int (*close)(struct FdMigrationState*);
      int (*write)(struct FdMigrationState*, const void *, size_t);
 +    int (*read)(struct FdMigrationState *, const void *, size_t);
      void *opaque;
  };

 @@ -116,6 +117,8 @@ void migrate_fd_put_notify(void *opaque);

  ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size);

 +int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size);
 +
  void migrate_fd_connect(FdMigrationState *s);

  void migrate_fd_put_ready(void *opaque);






Re: [Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.

2011-02-10 Thread Anthony Liguori

On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote:

Currently FdMigrationState doesn't support read(), and this patch
introduces it to get response from the other side.

Signed-off-by: Yoshiaki Tamuratamura.yoshi...@lab.ntt.co.jp
   


Migration is unidirectional.  Changing this is fundamental and not 
something to be done lightly.


I thought we previously discussed using a protocol wrapper around the 
existing migration protocol?


Regards,

Anthony Liguori


---
  migration-tcp.c |   15 +++
  migration.c |   13 +
  migration.h |3 +++
  3 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/migration-tcp.c b/migration-tcp.c
index b55f419..55777c8 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -39,6 +39,20 @@ static int socket_write(FdMigrationState *s, const void * buf, size_t size)
  return send(s->fd, buf, size, 0);
  }

+static int socket_read(FdMigrationState *s, const void * buf, size_t size)
+{
+ssize_t len;
+
+do {
+len = recv(s->fd, (void *)buf, size, 0);
+} while (len == -1 && socket_error() == EINTR);
+if (len == -1) {
+len = -socket_error();
+}
+
+return len;
+}
+
  static int tcp_close(FdMigrationState *s)
  {
  DPRINTF("tcp_close\n");
@@ -94,6 +108,7 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon,

  s->get_error = socket_errno;
  s->write = socket_write;
+s->read = socket_read;
  s->close = tcp_close;
  s->mig_state.cancel = migrate_fd_cancel;
  s->mig_state.get_status = migrate_fd_get_status;
diff --git a/migration.c b/migration.c
index 3612572..f0df5fc 100644
--- a/migration.c
+++ b/migration.c
@@ -340,6 +340,19 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)
  return ret;
  }

+int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size)
+{
+FdMigrationState *s = opaque;
+int ret;
+
+ret = s->read(s, data, size);
+if (ret == -1) {
+ret = -(s->get_error(s));
+}
+
+return ret;
+}
+
  void migrate_fd_connect(FdMigrationState *s)
  {
  int ret;
diff --git a/migration.h b/migration.h
index 2170792..88a6987 100644
--- a/migration.h
+++ b/migration.h
@@ -48,6 +48,7 @@ struct FdMigrationState
  int (*get_error)(struct FdMigrationState*);
  int (*close)(struct FdMigrationState*);
  int (*write)(struct FdMigrationState*, const void *, size_t);
+int (*read)(struct FdMigrationState *, const void *, size_t);
  void *opaque;
  };

@@ -116,6 +117,8 @@ void migrate_fd_put_notify(void *opaque);

  ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size);

+int migrate_fd_get_buffer(void *opaque, uint8_t *data, int64_t pos, size_t size);
+
  void migrate_fd_connect(FdMigrationState *s);

  void migrate_fd_put_ready(void *opaque);
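
The socket_read() helper in the patch retries on EINTR and converts a failure into a negative errno value. The same convention can be tried outside QEMU with the sketch below, which swaps recv()/socket_error() for plain POSIX read() and errno so that it works on any file descriptor. The function name sketch_read_retry is made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to size bytes from fd. A call interrupted by a signal is
 * retried; any other failure is returned as a negative errno value,
 * so callers can test for -EAGAIN and friends directly.
 */
ssize_t sketch_read_retry(int fd, void *buf, size_t size)
{
    ssize_t len;

    assert(buf != NULL);
    do {
        len = read(fd, buf, size);
    } while (len == -1 && errno == EINTR);

    if (len == -1) {
        len = -errno;
    }
    return len;
}
```

Returning -errno directly is what allows a caller such as migrate_fd_get_buffer() to compare the result against -EAGAIN and register a read handler instead of treating it as a hard error.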
   





Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Anthony Liguori

On 02/10/2011 10:07 AM, Gleb Natapov wrote:

So what if it is easier, it doesn't mean it is correct thing to do.


If we spend the next 10 years trying to do the correct thing for some 
arbitrary definition of correct, that's not terribly useful.


It's really simple actually.  Let's do the least clever thing and model 
how hardware actually works.  Once we have that, we can try to be better 
than real hardware (if it's possible).





If all composition is done through a factory interface, it doesn't.
But my main argument here is that we shouldn't try to make all
composition done through a factory interface--only where it makes
sense.

So very concretely, I'm suggesting we do the following to target-i386:

1) make the i440fx device have an embedded ide controller, piix3,
and usb controller that get initialized automatically.  The piix3
embeds the PCI-to-ISA bridge along with all of the default ISA
devices (rtc, serial, etc.).
 

This may be a problem even from security point of view. What if usb code
(ide, serial, parallel) has guest exploitable bug? Currently I can happily
continue running guests if they do not need affected subsystem. If we'll
get it your way I will no longer be able to do so.
   


qemu -device i440fx,ide=off

If you really care to do this.  But this desire to remove devices is 
silly IMHO.  Concerns about security are misplaced.  If you have to 
change the way a guest is invoked in order to eliminate security 
problems, then there's something seriously wrong.


Regards,

Anthony Liguori



Re: [Qemu-devel] [RFC][PATCH v6 00/04] qtest: qemu unit testing framework

2011-02-10 Thread Stefan Hajnoczi
On Wed, Feb 9, 2011 at 8:39 PM, Michael Roth mdr...@linux.vnet.ibm.com wrote:
 On 02/09/2011 01:42 PM, Blue Swirl wrote:

 On Fri, Feb 4, 2011 at 3:49 PM, Michael Rothmdr...@linux.vnet.ibm.com
  wrote:

 These patches apply to master (2-04-2011), and can also be obtained from:
 git://repo.or.cz/qemu/mdroth.git qtest_v1

 OVERVIEW:

 QEMU currently lacks a standard means to do targeted unit testing of the
 device model. Frameworks like kvm-autotest interact via the guest OS, which
 provides a highly abstracted interface to the underlying machine and is
 susceptible to bugs in the guest OS itself. This allows for reasonable test
 coverage of guest functionality as a whole, but reduces the accuracy and
 specificity with which we can exercise paths in the underlying devices.

 The following patches provide the basic beginnings of a test framework
 which replaces vcpu threads with test threads that interact with the
 underlying machine directly, allowing for directed unit/performance testing
 of individual devices. Test modules are built directly into the qemu binary,
 and each module provides the following interfaces:

 init():
  Called in place of qemu's normal machine initialization to set up
 devices explicitly. A full machine can be created here by calling into the
 normal init path, as well as minimal machines with a select set of
 buses/devices/IRQ handlers.

 run():
  Test logic that interacts with the now-created machine.

 cleanup():
  Currently unused, but potentially allows for chaining multiple tests
 together. Currently we run one module, then exit.

 As mentioned these are very early starting points. We're mostly looking
 for input from the community on the basic approach and overall requirements
 for an acceptable framework. A basic RTC test module is provided as an
 example.
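
For illustration only (this is not taken from the patches): the three hooks
described above could be grouped into a per-module descriptor roughly as
follows. QTestModule, qtest_find, qtest_run_module and the demo_* functions
are hypothetical names; the real series may structure this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical module descriptor holding the three hooks from the overview. */
typedef struct QTestModule {
    const char *name;       /* name passed to -test */
    int (*init)(void);      /* build a full or minimal machine */
    int (*run)(void);       /* test logic against the created machine */
    void (*cleanup)(void);  /* currently unused, may be NULL */
} QTestModule;

/* Placeholder hooks so the sketch is self-contained. */
static int demo_init(void) { return 0; }
static int demo_run(void)  { return 0; }

static const QTestModule demo_modules[] = {
    { "rtc", demo_init, demo_run, NULL },
};

/* Look up a module by name; returns NULL if no module matches. */
static const QTestModule *qtest_find(const QTestModule *mods, size_t n,
                                     const char *name)
{
    assert(mods != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(mods[i].name, name) == 0) {
            return &mods[i];
        }
    }
    return NULL;
}

/* Run one module: init, then run, then cleanup if one is provided. */
static int qtest_run_module(const QTestModule *m)
{
    int ret = m->init();
    if (ret != 0) {
        return ret;         /* nothing was set up, so skip run and cleanup */
    }
    ret = m->run();
    if (m->cleanup) {
        m->cleanup();
    }
    return ret;
}
```

With a table like this, "-test ?" would just walk the array and print each
name, and "-test rtc" would call qtest_find() followed by qtest_run_module().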

 BUILD/EXAMPLE USAGE:

  $ ./configure --target-list=x86_64-softmmu --enable-qtest
 --enable-io-thread
  $ make
  $ ./x86_64-softmmu/qemu-system-x86_64 -test ?
  Available test modules:
  rtc
  $ ./x86_64-softmmu/qemu-system-x86_64 -test rtc
  ../qtest/qtest_rtc.c:test_drift():L94: hz: 2, duration_ms: 4999,
 exp_duration: 5000, drift ratio: 0.000200
  ../qtest/qtest_rtc.c:test_drift():L111: hz: 1024, duration_ms: 4999,
 exp_duration: 5000, drift ratio: 0.000200

 GENERAL PLAN:

  - Provide libraries for common operations like PCI device enumeration,
 APIC configuration, default-configured machine setup, interrupt handling,
 etc.
  - Develop tests as machine/target specific, potentially make some tests
 re-usable as interfaces are better defined
  - Do port i/o via cpu_in/cpu_out commands
  - Do guest memory access via a CPUPhysMemoryClient interface
  - Allow interrupts to be sent by writing to an FD, detection in test
 modules via select()/read()

 TODO:

  - A means to propagate test returns values to main i/o thread
  - Better defined test harness for individual test cases and/or modules,
 likely via GLib
  - Support for multiple test threads in a single test module for
 scalability testing
  - Modify vl.c hooks so tests can configure their own timers/clocksources
  - More test modules, improve current rtc module
  - Further implementing/fleshing out of the overall plan

 Comments/feedback are welcome!

 Would it be possible to couple this with the tracing or Kemari somehow
 so that you could capture, say, block device traces and feed them to
 test setup?

 I would think so...it's a pretty open ended framework, a unit test could,
 say, read in block device traces in some pre-defined format and then execute
 those against a block device. We're also planning on adding command-line
 parameters for tests, so a unit test could actually be used as a general
 testing utility. For instance:

That's a good point.  Testing network, block, serial, etc device
emulation requires mock host devices (netdev, drive, chardev).

Net needs a replay net client.  A dump net client already exists
for capturing packets.

Block has no record/replay but the Linux blktrace format might be
good.  There's also Kevin's blkdebug and CQ's blksim which might be
extendable.

For chardev perhaps the existing options are already powerful enough,
otherwise something like expect would be neat.

 qemu -test block-trace-virtio -test-opts
 tracefile=file,target_img=img,target_fmt=qcow2,comparison_img=img

Or more like the -device, -chardev, etc syntax:

qemu -test 
block-trace-virtio,tracefile=file,target_img=img,target_fmt=qcow2,comparison_img=img
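
As a rough standalone sketch of the "name,key=value,..." form suggested
above (inside QEMU this would presumably go through the existing QemuOpts
code instead; TestOpts, test_opts_parse and test_opts_get are made-up names
and the size limits are arbitrary):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPTS 8

typedef struct TestOpts {
    char name[64];              /* test module name */
    char key[MAX_OPTS][32];     /* option names */
    char val[MAX_OPTS][128];    /* option values */
    int nb_opts;
} TestOpts;

/* Parse "name,key=value,..." into opts. Returns 0 on success, -1 on error. */
static int test_opts_parse(const char *arg, TestOpts *opts)
{
    const char *p = arg;
    const char *end;
    size_t len;

    assert(arg != NULL && opts != NULL);
    memset(opts, 0, sizeof(*opts));   /* also NUL-terminates every string */

    /* First field: the module name, up to the first comma. */
    end = strchr(p, ',');
    len = end ? (size_t)(end - p) : strlen(p);
    if (len == 0 || len >= sizeof(opts->name)) {
        return -1;
    }
    memcpy(opts->name, p, len);

    /* Remaining fields: key=value pairs separated by commas. */
    while (end) {
        const char *eq;
        size_t klen, vlen;

        p = end + 1;
        end = strchr(p, ',');
        len = end ? (size_t)(end - p) : strlen(p);
        eq = memchr(p, '=', len);
        if (!eq || opts->nb_opts == MAX_OPTS) {
            return -1;
        }
        klen = (size_t)(eq - p);
        vlen = len - klen - 1;
        if (klen == 0 || klen >= sizeof(opts->key[0]) ||
            vlen >= sizeof(opts->val[0])) {
            return -1;
        }
        memcpy(opts->key[opts->nb_opts], p, klen);
        memcpy(opts->val[opts->nb_opts], eq + 1, vlen);
        opts->nb_opts++;
    }
    return 0;
}

/* Return the value stored for key, or NULL if the key was not given. */
static const char *test_opts_get(const TestOpts *opts, const char *key)
{
    for (int i = 0; i < opts->nb_opts; i++) {
        if (strcmp(opts->key[i], key) == 0) {
            return opts->val[i];
        }
    }
    return NULL;
}
```

A test module would then ask for "tracefile" or "target_fmt" by name and
fail early if a required option is missing.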

Stefan



Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Anthony Liguori

On 02/10/2011 10:04 AM, Peter Maydell wrote:

On 10 February 2011 08:36, Anthony Liguorianth...@codemonkey.ws  wrote:
   

On 02/10/2011 09:16 AM, Peter Maydell wrote:
 

On 10 February 2011 07:47, Anthony Liguorianth...@codemonkey.wswrote:
   

2) get rid of the entire concept of machines.  Creating a i440fx is
essentially equivalent to creating a bare machine.
 

Does that make any sense for anything other than target-i386?
The concept of a machine model seems a pretty obvious one
for ARM boards, for instance, and I'm not sure we'd gain much
by having i386 be different to the other architectures...
   

Yes, it makes a lot of sense, I just don't know the component names as well
so bear with me :-)

There are two types of Versatile machines today, Versatile/AB and
Versatile/PB.  They are both made with the same core, ARM926EJ-S, with
different expansions.

So you would model arm926ej-s as the chipset and then build up the machines
by modifying parameters of the chipset (like the board id) and/or adding
different components on top of it.
 

Er, ARM926 is the CPU, it's not a chipset. The board ID is definitely
not a property of an ARM926, it's a property of the board (clue is in
the name :-)). I don't think versatile boards have a chipset really...
   


As I said, I'm not well versed in the component names in ARM.

But that said, an actual processor doesn't connect directly to a bunch 
of devices.  It almost always goes through some chipset and that chipset 
implements a lot of functionality typically.


I think the name of the component I'm trying to refer to is PL300, which I 
believe is the Northbridge used for the Versatile boards.



In my understanding the machine is the thing that says I need a
926, and an MMC controller at this address, and some UARTS,
and... ie it is the thing that does the modifying parameters
and adding different components. So if we'd still be doing that
I don't see how we've got rid of the concept. I guess I'm missing
the point somehow.
   


A machine today is basically the northbridge, southbridge, plus a bunch 
of default components to make the virtual hardware useful.


I'm suggesting that we model a proper northbridge/southbridge.


A good way to think about what I'm proposing is that machine-init really
should be a constructor for a device object.
 

If you mean that you want machines to be implemented under the
hood as a single huge device you can only have one of that spans
the entire memory map, well I guess that's an implementation
detail. But conceptually machines really do exist, and we definitely
still want users to be able to say I want a beagle machine; I want
a versatile; I want an n900.
   


An n900 is a very specific hardware configuration that is best 
represented by some sort of configuration file vs. something hard coded 
in QEMU.


The question is, what level of component modelling do we need to do in 
order to make it practical to create such configurations from a file.
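
To make the "machine as configuration data" idea a bit more concrete, here
is a minimal sketch of a board expressed as a table of devices with their
place in the memory map. Everything in it (DeviceDesc, example_board,
board_find, the addresses and IRQ numbers) is invented for illustration and
does not describe any real board or existing QEMU interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a hypothetical machine description. */
typedef struct DeviceDesc {
    const char *type;   /* device model name */
    uint64_t base;      /* base address in the memory map */
    uint64_t size;      /* size of the mapped region */
    int irq;            /* interrupt line, -1 if none */
} DeviceDesc;

/* Made-up example board; addresses and IRQs are illustrative only. */
static const DeviceDesc example_board[] = {
    { "uart", 0x10000000, 0x1000, 1 },
    { "uart", 0x10001000, 0x1000, 2 },
    { "mmc",  0x10002000, 0x1000, 3 },
};

/* Return the device whose region contains addr, or NULL if none does. */
static const DeviceDesc *board_find(const DeviceDesc *devs, size_t n,
                                    uint64_t addr)
{
    assert(devs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (addr >= devs[i].base && addr - devs[i].base < devs[i].size) {
            return &devs[i];
        }
    }
    return NULL;
}
```

Whether such a table lives in C or in a configuration file, the open
question from the paragraph above stays the same: how fine-grained the
individual entries need to be.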


Regards,

Anthony Liguori


-- PMM
   





[Qemu-devel] [PATCH 1/2] qdev: Allow hot-plug for lists with pre-filled descriptors

2011-02-10 Thread Amit Shah
This will be needed for hot-plugging chardevs.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 monitor.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/monitor.c b/monitor.c
index 7fc311d..f3d7ab3 100644
--- a/monitor.c
+++ b/monitor.c
@@ -74,8 +74,6 @@
  * 'O'  option string of the form NAME=VALUE,...
  *  parsed according to QemuOptsList given by its name
  *  Example: 'device:O' uses qemu_device_opts.
- *  Restriction: only lists with empty desc are supported
- *  TODO lift the restriction
  * 'i'  32 bit integer
  * 'l'  target long (32 or 64 bit)
  * 'M'  just like 'l', except in user mode the value is
@@ -4064,7 +4062,7 @@ static const mon_cmd_t *monitor_parse_command(Monitor *mon,
 QemuOpts *opts;
 
 opts_list = qemu_find_opts(key);
-if (!opts_list || opts_list->desc->name) {
+if (!opts_list) {
 goto bad_type;
 }
 while (qemu_isspace(*p)) {
-- 
1.7.4




[Qemu-devel] [PATCH 2/2] qdev: Allow chardevs to be hot-plugged

2011-02-10 Thread Amit Shah
This commit enables chardevs to be hot-plugged to a running qemu
machine.  The syntax is similar to the -chardev command line:

(qemu) chardev_add socket,path=/tmp/foo,server,nowait,id=char0

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hmp-commands.hx |   16 
 hw/qdev.c   |   15 +++
 hw/qdev.h   |1 +
 qmp-commands.hx |   23 +++
 4 files changed, 55 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 38e1eb7..e0e6fc8 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -554,6 +554,22 @@ command @code{info usb} to see the devices you can remove.
 ETEXI
 
 {
+.name   = "chardev_add",
+.args_type  = "chardev:O",
+.params = "backend[,prop=value][,...],id=str",
+.help   = "add chardev, like -chardev on the command line",
+.user_print = monitor_user_noop,
+.mhandler.cmd_new = do_chardev_add,
+},
+
+STEXI
+@item chardev_add @var{config}
+@findex chardev_add
+
+Add chardev.
+ETEXI
+
+{
 .name   = "device_add",
 .args_type  = "device:O",
 .params = "driver[,prop=value][,...]",
diff --git a/hw/qdev.c b/hw/qdev.c
index c7fec44..1e24f58 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -861,6 +861,21 @@ void do_info_qdm(Monitor *mon)
 }
 }
 
+int do_chardev_add(Monitor *mon, const QDict *qdict, QObject **ret_data)
+{
+QemuOpts *opts;
+
+opts = qemu_opts_from_qdict(qemu_find_opts("chardev"), qdict);
+if (!opts) {
+return -1;
+}
+if (!qemu_chr_open_opts(opts, NULL)) {
+qemu_opts_del(opts);
+return -1;
+}
+return 0;
+}
+
 int do_device_add(Monitor *mon, const QDict *qdict, QObject **ret_data)
 {
 QemuOpts *opts;
diff --git a/hw/qdev.h b/hw/qdev.h
index 9808f85..5698713 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -212,6 +212,7 @@ BusState *sysbus_get_default(void);
 
 void do_info_qtree(Monitor *mon);
 void do_info_qdm(Monitor *mon);
+int do_chardev_add(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_device_add(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_device_del(Monitor *mon, const QDict *qdict, QObject **ret_data);
 
diff --git a/qmp-commands.hx b/qmp-commands.hx
index df40a3d..255da9a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -275,6 +275,29 @@ Example:
 EQMP
 
 {
+.name   = "chardev_add",
+.args_type  = "device:O",
+.params = "backend[,prop=value][,...],id=str",
+.help   = "add chardev, like -chardev on the command line",
+.user_print = monitor_user_noop,
+.mhandler.cmd_new = do_chardev_add,
+},
+
+SQMP
+chardev_add
+--
+
+Add a chardev.
+
+Arguments:
+
+- backend: the backend of the new chardev (json-string)
+- id: the chardev's ID, must be unique (json-string)
+- chardev properties
+
+EQMP
+
+{
 .name   = "device_add",
 .args_type  = "device:O",
 .params = "driver[,prop=value][,...]",
-- 
1.7.4




Re: [Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.

2011-02-10 Thread Daniel P. Berrange
On Thu, Feb 10, 2011 at 10:54:01AM +0100, Anthony Liguori wrote:
 On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote:
 Currently FdMigrationState doesn't support read(), and this patch
 introduces it to get response from the other side.
 
 Signed-off-by: Yoshiaki Tamuratamura.yoshi...@lab.ntt.co.jp
 
 Migration is unidirectional.  Changing this is fundamental and not
 something to be done lightly.

Making it bi-directional might break libvirt's save/restore
to file support which uses migration, passing a unidirectional
FD for the file. It could also break libvirt's secure tunnelled
migration support which is currently only expecting to have
data sent in one direction on the socket.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.

2011-02-10 Thread Yoshiaki Tamura
2011/2/10 Daniel P. Berrange berra...@redhat.com:
 On Thu, Feb 10, 2011 at 10:54:01AM +0100, Anthony Liguori wrote:
 On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote:
 Currently FdMigrationState doesn't support read(), and this patch
 introduces it to get response from the other side.
 
 Signed-off-by: Yoshiaki Tamuratamura.yoshi...@lab.ntt.co.jp

 Migration is unidirectional.  Changing this is fundamental and not
 something to be done lightly.

 Making it bi-directional might break libvirt's save/restore
 to file support which uses migration, passing a unidirectional
 FD for the file. It could also break libvirt's secure tunnelled
 migration support which is currently only expecting to have
 data sent in one direction on the socket.

Hi Daniel,

IIUC, this patch isn't something to make existing live migration
bi-directional.  Just opens up a way for Kemari to use it.  Do
you think it's dangerous for libvirt still?

Thanks,

Yoshi


 Daniel
 --
 |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org              -o-             http://virt-manager.org :|
 |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
 |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Avi Kivity

On 02/10/2011 11:07 AM, Gleb Natapov wrote:

On Thu, Feb 10, 2011 at 08:47:12AM +0100, Anthony Liguori wrote:
  On 02/09/2011 09:15 PM, Blue Swirl wrote:
  On Wed, Feb 9, 2011 at 9:59 PM, Anthony Liguorianth...@codemonkey.ws   
wrote:
  On 02/09/2011 06:48 PM, Blue Swirl wrote:
  ISASerialState dev;
  
  isa_serial_init(dev, 0, 0x274, 0x07, NULL, NULL);
  
  Do you mean that there should be a generic way of doing that, like
  sysbus_create_varargs() for qdev, or just add inline functions which
  hide qdev property setup?
  
  I still think that FDT should be used in the future. That would
  require that the properties can be set up mechanically, and I don't
  see how your proposal would help that.
  
  Yeah, I don't think that is a good idea anymore.  I think this is part of
  why we're having so many problems with qdev.
  
  While (most?) hardware hierarchies can be represented by device tree 
syntax,
  not all valid device trees correspond to interface and/or useful hardware
  hierarchies.
  User creates a non-working machine and so gets to fix the problems?
  How is that a problem for us?

  It's not about creating a non-working machine.  It's about what
  user-level abstraction we need to provide.

  It's a whole lot easier to implement an i440fx device with a fixed
  set of parameters than it is to make every possible subdevice have a
  proper factory interface along with mechanisms to hook everything
  together.

So what if it is easier, it doesn't mean it is correct thing to do. What
you are proposing is just a huge step backwards. May be we shouldn't
support hooking everything together in completely arbitrary ways, but we
shouldn't force isa/pci devices upon our users just because they are
non-removable on real chip.


I disagree.  We don't want to deviate from the spec any more than we 
already do.


The reason for wanting flexibility is because the code for the PIC or 
RTC, for example, can be used in other Super-IO chipsets or even 
standalone.  If qemu only supported the 440FX chipset, we'd have no 
reason to make things flexible.




  So very concretely, I'm suggesting we do the following to target-i386:

  1) make the i440fx device have an embedded ide controller, piix3,
  and usb controller that get initialized automatically.  The piix3
  embeds the PCI-to-ISA bridge along with all of the default ISA
  devices (rtc, serial, etc.).
This may be a problem even from security point of view. What if usb code
(ide, serial, parallel) has guest exploitable bug? Currently I can happily
continue running guests if they do not need affected subsystem. If we'll
get it your way I will no longer be able to do so.


You can't just remove a device from a guest.  You have to shut it down.  
When you power it back up, you may end up with different IRQ assignments 
or expose some guest bug.


If you have a security issue in code that is exposed to the guest, you 
have to fix it.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Gleb Natapov
On Thu, Feb 10, 2011 at 11:19:48AM +0100, Anthony Liguori wrote:
 On 02/10/2011 11:10 AM, Gleb Natapov wrote:
 On Thu, Feb 10, 2011 at 11:00:50AM +0100, Anthony Liguori wrote:
 On 02/10/2011 10:07 AM, Gleb Natapov wrote:
 So what if it is easier, it doesn't mean it is correct thing to do.
 If we spend the next 10 years trying to do the correct thing for
 some arbitrary definition of correct, that's not terribly useful.
 Changing direction by 180 degrees every 2 years is even less useful.
 
 If we think through what we are doing and have a coherent
 architecture before changing direction, then we won't have this
 problem.
 
I'd like to believe this :)

 It's really simple actually.  Let's do the least clever thing and
 model how hardware actually works.  Once we have that, we can try to
 be better than real hardware (if it's possible).
 I think our understanding of how HW actually works is very different.
 You are placing too much value on where a device resides physically; for me
 it is a completely unimportant detail. Not even worth mentioning.
 
 No, I place value on how things are modelled in the real world.
Real-world (physical) HW has considerations that are not relevant to our
software emulation, such as cost, physical dimensions, power consumption,
and many others I am sure I missed.

 
 There simply aren't PC's out there that lack an RTC so I have no
 interest in jumping through hoops in QEMU to make it possible to do
 this without modifying QEMU code.  It might sound nice to a
 developer but it's of absolutely no use to users.
 
RTC is not a good example. HPET is supposed to replace it (and PIT too). AFAIC
there are PCs without an RTC already. A good example would be the PIC or IOAPIC
device, and then I would agree with you that it is not worth making
it possible to create an x86 machine without them from the command line if it
means extra complexity. But how did you jump from this to making usb
mandatory?

 If all composition is done through a factory interface, it doesn't.
 But my main argument here is that we shouldn't try to make all
 composition done through a factory interface--only where it makes
 sense.
 
 So very concretely, I'm suggesting we do the following to target-i386:
 
 1) make the i440fx device have an embedded ide controller, piix3,
 and usb controller that get initialized automatically.  The piix3
 embeds the PCI-to-ISA bridge along with all of the default ISA
 devices (rtc, serial, etc.).
 This may be a problem even from security point of view. What if usb code
 (ide, serial, parallel) has guest exploitable bug? Currently I can happily
 continue running guests if they do not need affected subsystem. If we'll
 get it your way I will no longer be able to do so.
 qemu -device i440fx,ide=off
 
 So you still need to support arbitrary composition. What's the
 difference?
 
 No, we don't.  It's possible to have an 'rtc=off' option but I'm
 tremendously opposed to doing this.  Arbitrary composition is not a
 useful goal IMHO.
IMHO is different. We should support composition where it makes sense.
For PIC-less x86 it doesn't make it. For usb-less or even ide-less it
does.

 
   So why do you like -device i440fx over what we have now?
 
 Because I don't think tools like libvirt should be doing device
 composition to create an i440fx-like chipset.  I think the current
 path we're on is pushing too much logic that belongs in QEMU into
 the management stack.
I can agree with that. But from this it doesn't follow that we should
get rid of composition. We shouldn't push composition of common HW to
libvirt. Looking at libvirt command line I do not think we do it though.
A typical libvirt command line specifies disks, networks, usb, vga. How will
-device i440fx simplify that? Well usb could be omitted (but not
-usbdevice table), disks are not property of i440fx so they will stay,
since user may want to use virtio controller (which is not part of
i440fx) this should stay too. Network obviously will have to be
specified by libvirt too, vga may go to i440fx, but since libvirt
supports qxl we will have to have a way to disable default vga and
enable qxl instead. So will we really simplify libvirt's life by
introducing -device i440fx?

 
 In current terms, what you propose would be implemented by using the i440fx
 machine type. Qdev will build it for you.
 
 If you had an i440fx machine type, that had no non-optional
 components added, and you could specify options to the machine type,
 yes.  But I think you'll agree that there's no reason to not just
 treat the i440fx as a device.
I do not agree. There is no such device as i440fx. This is just
packaging.

 
 If you really care to do this.  But this desire to remove devices is
 silly IMHO.  Concerns about security are misplaced.  If you have to
 change the way a guest is invoked in order to eliminate security
 problems, then there's something seriously wrong.
 
 No I do not.  I do not create guest with unneeded devices from the
 beginning.
 
 There is very little that isn't 'unneeded'.
 
That depends 

Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Gleb Natapov
On Thu, Feb 10, 2011 at 12:25:38PM +0200, Avi Kivity wrote:
 On 02/10/2011 11:07 AM, Gleb Natapov wrote:
 On Thu, Feb 10, 2011 at 08:47:12AM +0100, Anthony Liguori wrote:
   On 02/09/2011 09:15 PM, Blue Swirl wrote:
   On Wed, Feb 9, 2011 at 9:59 PM, Anthony Liguorianth...@codemonkey.ws   
  wrote:
   On 02/09/2011 06:48 PM, Blue Swirl wrote:
   ISASerialState dev;
   
   isa_serial_init(dev, 0, 0x274, 0x07, NULL, NULL);
   
   Do you mean that there should be a generic way of doing that, like
   sysbus_create_varargs() for qdev, or just add inline functions which
   hide qdev property setup?
   
   I still think that FDT should be used in the future. That would
   require that the properties can be set up mechanically, and I don't
   see how your proposal would help that.
   
   Yeah, I don't think that is a good idea anymore.  I think this is part 
  of
   why we're having so many problems with qdev.
   
   While (most?) hardware hierarchies can be represented by device tree 
  syntax,
   not all valid device trees correspond to interface and/or useful 
  hardware
   hierarchies.
   User creates a non-working machine and so gets to fix the problems?
   How is that a problem for us?
 
   It's not about creating a non-working machine.  It's about what
   user-level abstraction we need to provide.
 
   It's a whole lot easier to implement an i440fx device with a fixed
   set of parameters than it is to make every possible subdevice have a
   proper factory interface along with mechanisms to hook everything
   together.
 
 So what if it is easier, it doesn't mean it is correct thing to do. What
 you are proposing is just a huge step backwards. May be we shouldn't
 support hooking everything together in completely arbitrary ways, but we
 shouldn't force isa/pci devices upon our users just because they are
 non-removable on real chip.
 
 I disagree.  We don't want to deviate from the spec any more than we
 already do.
 
Which spec? Even in this discussion we completely mixed different
things. 440FX is not a chipset. It is memory controller/pci host bridge.
PIIX3/4 is the chipset which is just an arbitrary combination of devices
put on the same chip. We do not deviate from spec when we implement
those devices.

 The reason for wanting flexibility is because the code for the PIC
 or RTC, for example, can be used in other Super-IO chipsets or even
 standalone.  If qemu only supported the 440FX chipset, we'd have no
 reason to make things flexible.
Again you probably mean PIIX3. Even then, removing unused ide will free
one more PCI slot for my cool virtio disk array. The thing is, from the
code point of view, it does not cost you extra to allow composition of
ide since it is just a regular PCI device and we need to support composing
those anyway.

 
 
   So very concretely, I'm suggesting we do the following to target-i386:
 
   1) make the i440fx device have an embedded ide controller, piix3,
   and usb controller that get initialized automatically.  The piix3
   embeds the PCI-to-ISA bridge along with all of the default ISA
   devices (rtc, serial, etc.).
 This may be a problem even from security point of view. What if usb code
 (ide, serial, parallel) has guest exploitable bug? Currently I can happily
 continue running guests if they do not need affected subsystem. If we'll
 get it your way I will no longer be able to do so.
 
 You can't just remove a device from a guest.  You have to shut it
 down.  When you power it back up, you may end up with different IRQ
 assignments or expose some guest bug.
As I already answered Anthony, I am not talking about changing the HW
configuration after the guest is created, but rather about creating a minimal
HW setup for the task from the start. This means no soundcard or usb for
a Windows Exchange server, for instance.

 
 If you have a security issue in code that is exposed to the guest,
 you have to fix it.
 
Of course. That is why it is a good idea to expose as little code to
guest as possible. Don't you think so?

--
Gleb.



[Qemu-devel] [PATCH 0.14/master v2 0/4] Error messages for unsupported image format features

2011-02-10 Thread Kevin Wolf
With 0.15 we'll most likely get some incompatible image format extensions. This
series prepares 0.14 to output more helpful messages if it stumbles over a too
new image file.

Kevin Wolf (4):
  qerror: Add QERR_UNKNOWN_BLOCK_FORMAT_FEATURE
  qcow2: Report error for version > 2
  qed: Report error for unsupported features
  qemu-img: Improve error messages for failed bdrv_open

 block/qcow2.c |   13 +++--
 block/qed.c   |9 -
 qemu-img.c|   10 +++---
 qerror.c  |5 +
 qerror.h  |3 +++
 5 files changed, 34 insertions(+), 6 deletions(-)

-- 
1.7.2.3




[Qemu-devel] [PATCH v2 3/4] qed: Report error for unsupported features

2011-02-10 Thread Kevin Wolf
Instead of just returning -ENOTSUP, generate a more detailed error.

Unfortunately we don't have a helpful text for features that we don't know yet,
so just print the feature mask. It might be useful at least if someone asks for
help.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qed.c |9 -
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 3273448..75ae244 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -14,6 +14,7 @@
 
 #include "trace.h"
 #include "qed.h"
+#include "qerror.h"
 
 static void qed_aio_cancel(BlockDriverAIOCB *blockacb)
 {
@@ -311,7 +312,13 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags)
 return -EINVAL;
 }
 if (s->header.features & ~QED_FEATURE_MASK) {
-return -ENOTSUP; /* image uses unsupported feature bits */
+/* image uses unsupported feature bits */
+char buf[64];
+snprintf(buf, sizeof(buf), "%" PRIx64,
+s->header.features & ~QED_FEATURE_MASK);
+qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+bs->device_name, "QED", buf);
+return -ENOTSUP;
 }
 if (!qed_is_cluster_size_valid(s->header.cluster_size)) {
 return -EINVAL;
-- 
1.7.2.3
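
For readers skimming the hunk above, the masking and formatting step can be
tried in isolation with something like the following. This is only a sketch:
KNOWN_FEATURE_MASK and check_features are invented names, and the mask value
is arbitrary rather than QED's real QED_FEATURE_MASK.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mask of the feature bits this reader understands. */
#define KNOWN_FEATURE_MASK UINT64_C(0x3)

/*
 * Write the unsupported feature bits as hex into buf.
 * Returns 0 if every set bit is known, -1 if any unknown bit is set.
 */
static int check_features(uint64_t features, char *buf, size_t len)
{
    uint64_t unknown = features & ~KNOWN_FEATURE_MASK;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    if (unknown == 0) {
        return 0;
    }
    /* PRIx64 keeps the format string correct on 32 and 64 bit hosts. */
    snprintf(buf, len, "%" PRIx64, unknown);
    return -1;
}
```

Only the bits outside the known mask end up in the message, which is what
makes the printed value useful when someone asks for help.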




[Qemu-devel] [PATCH v2 1/4] qerror: Add QERR_UNKNOWN_BLOCK_FORMAT_FEATURE

2011-02-10 Thread Kevin Wolf
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qerror.c |5 +
 qerror.h |3 +++
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/qerror.c b/qerror.c
index 9d0cdeb..4855604 100644
--- a/qerror.c
+++ b/qerror.c
@@ -201,6 +201,11 @@ static const QErrorStringTable qerror_table[] = {
 .desc  = "An undefined error has ocurred",
 },
 {
+.error_fmt = QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+.desc  = "'%(device)' uses a %(format) feature which is not "
+ "supported by this qemu version: %(feature)",
+},
+{
 .error_fmt = QERR_VNC_SERVER_FAILED,
 .desc  = "Could not start VNC server on %(target)",
 },
diff --git a/qerror.h b/qerror.h
index b0f69da..f732d45 100644
--- a/qerror.h
+++ b/qerror.h
@@ -165,6 +165,9 @@ QError *qobject_to_qerror(const QObject *obj);
 #define QERR_UNDEFINED_ERROR \
 "{ 'class': 'UndefinedError', 'data': {} }"
 
+#define QERR_UNKNOWN_BLOCK_FORMAT_FEATURE \
+"{ 'class': 'UnknownBlockFormatFeature', 'data': { 'device': %s, 'format': %s, 'feature': %s } }"
+
 #define QERR_VNC_SERVER_FAILED \
 "{ 'class': 'VNCServerFailed', 'data': { 'target': %s } }"
 
-- 
1.7.2.3




[Qemu-devel] [PATCH v2 2/4] qcow2: Report error for version > 2

2011-02-10 Thread Kevin Wolf
The qcow2 driver is now declared responsible for any QCOW image that has
version 2 or greater (before this, version 3 would be detected as raw).

For everything newer than version 2, an error is reported.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2.c |   13 +++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 551b3c2..75b8bec 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -28,6 +28,7 @@
 #include "aes.h"
 #include "block/qcow2.h"
 #include "qemu-error.h"
+#include "qerror.h"
 
 /*
   Differences with QCOW:
@@ -59,7 +60,7 @@ static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename)
 
 if (buf_size >= sizeof(QCowHeader) &&
 be32_to_cpu(cow_header->magic) == QCOW_MAGIC &&
-be32_to_cpu(cow_header->version) == QCOW_VERSION)
+be32_to_cpu(cow_header->version) >= QCOW_VERSION)
 return 100;
 else
 return 0;
@@ -163,10 +164,18 @@ static int qcow2_open(BlockDriverState *bs, int flags)
 be64_to_cpus(&header.snapshots_offset);
 be32_to_cpus(&header.nb_snapshots);
 
-if (header.magic != QCOW_MAGIC || header.version != QCOW_VERSION) {
+if (header.magic != QCOW_MAGIC) {
 ret = -EINVAL;
 goto fail;
 }
+if (header.version != QCOW_VERSION) {
+char version[64];
+snprintf(version, sizeof(version), "QCOW version %d", header.version);
+qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+bs->device_name, "qcow2", version);
+ret = -ENOTSUP;
+goto fail;
+}
 if (header.cluster_bits < MIN_CLUSTER_BITS ||
 header.cluster_bits > MAX_CLUSTER_BITS) {
 ret = -EINVAL;
-- 
1.7.2.3
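
The split between "probe claims the image" and "open rejects what it cannot
handle" can be sketched on its own as below. This is a simplified
illustration, not the driver code: it only looks at the first 8 header bytes
(magic and version), and read_be32, probe and open_check are names made up
for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define QCOW_MAGIC   UINT32_C(0x514649fb)   /* 'Q' 'F' 'I' 0xfb */
#define QCOW_VERSION 2

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Probe: claim any QCOW image with version >= 2 (score 100), else 0. */
static int probe(const uint8_t *buf, size_t size)
{
    assert(buf != NULL);
    if (size >= 8 && read_be32(buf) == QCOW_MAGIC &&
        read_be32(buf + 4) >= QCOW_VERSION) {
        return 100;
    }
    return 0;
}

/* Open: wrong magic is -EINVAL, any version other than 2 is -ENOTSUP. */
static int open_check(const uint8_t *buf, size_t size)
{
    assert(buf != NULL);
    if (size < 8 || read_be32(buf) != QCOW_MAGIC) {
        return -EINVAL;
    }
    if (read_be32(buf + 4) != QCOW_VERSION) {
        return -ENOTSUP;
    }
    return 0;
}
```

Because probe() now accepts version 3 as well, such an image reaches
open_check() and gets a clear -ENOTSUP instead of silently being treated as
raw.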




[Qemu-devel] [PATCH v2 4/4] qemu-img: Improve error messages for failed bdrv_open

2011-02-10 Thread Kevin Wolf
Output the error message string of the bdrv_open return code. Also set a
non-empty device name for the images because the unknown feature error message
includes it.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-img.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 4a37358..7e3cc4c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -213,8 +213,9 @@ static BlockDriverState *bdrv_new_open(const char *filename,
 BlockDriverState *bs;
 BlockDriver *drv;
 char password[256];
+int ret;
 
-bs = bdrv_new("");
+bs = bdrv_new("image");
 
 if (fmt) {
 drv = bdrv_find_format(fmt);
@@ -225,10 +226,13 @@ static BlockDriverState *bdrv_new_open(const char *filename,
 } else {
 drv = NULL;
 }
-if (bdrv_open(bs, filename, flags, drv) < 0) {
-error_report("Could not open '%s'", filename);
+
+ret = bdrv_open(bs, filename, flags, drv);
+if (ret < 0) {
+error_report("Could not open '%s': %s", filename, strerror(-ret));
 goto fail;
 }
+
 if (bdrv_is_encrypted(bs)) {
 printf("Disk image '%s' is encrypted.\n", filename);
 if (read_password(password, sizeof(password)) < 0) {
-- 
1.7.2.3




[Qemu-devel] [PATCH v3 0/6] target-arm: Fix floating point conversions

2011-02-10 Thread Peter Maydell
This patchset fixes two issues:
 * default_nan_mode not being honoured for float-to-float conversions
 * half precision conversions being broken in a number of ways as
   well as not handling default_nan_mode.

With this patchset qemu passes random-instruction-selection tests
for VCVT.F32.F16, VCVT.F16.F32, VCVTB and VCVTT, in both IEEE and
non-IEEE modes, with and without default-NaN behaviour.

Christophe: this patchset includes your softfloat v3 patch, although
I have split it up a little to keep the float16 bits separate.

Changes since v2:
 * added STRUCT_TYPES version of float16 and fixed various
   places which needed a make_float16()/float16_val() in order
   to compile with STRUCT_TYPES enabled
 * s/bits16/float16/ in patch 3 as suggested by Aurelien
 * fixed the types in the f16-related ARM helper wrappers in patch 6

Patch 2 is unchanged and so I've added Aurelien's reviewed-by
signoff; the others all changed, although mostly in minor ways.

(Compiling with STRUCT_TYPES enabled also needs some fixes to
existing float32/float64 code; I'll send a separate patchset
for that.)


Christophe Lyon (1):
  softfloat: Honour default_nan_mode for float-to-float conversions

Peter Maydell (5):
  softfloat: Add float16 type and float16 NaN handling functions
  softfloat: Fix single-to-half precision float conversions
  softfloat: Correctly handle NaNs in float16_to_float32()
  target-arm: Silence NaNs resulting from half-precision conversions
  target-arm: Use standard FPSCR for Neon half-precision operations

 fpu/softfloat-specialize.h |  130 ++--
 fpu/softfloat.c|  100 ++
 fpu/softfloat.h|   19 ++-
 target-arm/helper.c|   38 +++--
 target-arm/helpers.h   |2 +
 target-arm/translate.c |   16 +++---
 6 files changed, 251 insertions(+), 54 deletions(-)




Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Gleb Natapov
On Thu, Feb 10, 2011 at 10:38:53AM +, Peter Maydell wrote:
 This is the system diagram for the Versatile Express:
 http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0447d/I1007683.html
 I don't know what you'd want to claim is a northbridge there.
 Basically there's an FPGA with a pile of devices in it,
 and there's a test chip with the core and some other devices in
 it. But from a modelling perspective this is all completely
 irrelevant because regardless of where the hardware designer
 put the devices, they're just devices at a particular point in the
 memory map and with a particular set of interrupt wiring and so
 on. I don't see the point in modelling a concept that has no
 user-visible effects and doesn't actually make the model any
 clearer or simpler.
 
Exactly. This is really the same with x86. The fact that some company
put several devices on the same chip and gave it a commercial name
shouldn't govern our design.

 
  A machine today is basically the northbridge, southbridge, plus a bunch of
  default components to make the virtual hardware useful.
 
 This doesn't really correspond to ARM boards I've looked at,
 by and large (for instance there's no mention of the word northbridge
 in the whole 3700 page OMAP3 TRM). PCs may be best modelled
 that way, sure, but I don't think you can cram everything into that mould.
 
Even on x86 this model is falling apart. The memory controller is moving
into the CPU. The PCI controller will follow.

  If you mean that you want machines to be implemented under the
  hood as a single huge device you can only have one of that spans
  the entire memory map, well I guess that's an implementation
  detail. But conceptually machines really do exist, and we definitely
  still want users to be able to say I want a beagle machine; I want
  a versatile; I want an n900.
 
  An n900 is a very specific hardware configuration that is best represented
  by some sort of configuration file vs. something hard coded in QEMU.
 
 Yes, that's the whole point -- machine == specific hardware
 configuration.
 
 That's not getting rid of machine, it's just saying we should have
 some custom scripting language to define them rather than doing
 them in C. You still want, fundamentally, to be able to say
   qemu-system-arm -M machinename
 
+1

--
Gleb.



[Qemu-devel] [PATCH v3 5/6] target-arm: Silence NaNs resulting from half-precision conversions

2011-02-10 Thread Peter Maydell
Silence the NaNs that may result from half-precision conversion,
as we do for the other conversions.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/helper.c |   12 ++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index d29c42b..e427747 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -2627,14 +2627,22 @@ float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, CPUState *env)
 {
 float_status *s = &env->vfp.fp_status;
 int ieee = (env->vfp.xregs[ARM_VFP_FPSCR] & (1 << 26)) == 0;
-return float16_to_float32(make_float16(a), ieee, s);
+float32 r = float16_to_float32(make_float16(a), ieee, s);
+if (ieee) {
+return float32_maybe_silence_nan(r);
+}
+return r;
 }
 
 uint32_t HELPER(vfp_fcvt_f32_to_f16)(float32 a, CPUState *env)
 {
 float_status *s = &env->vfp.fp_status;
 int ieee = (env->vfp.xregs[ARM_VFP_FPSCR] & (1 << 26)) == 0;
-return float16_val(float32_to_float16(a, ieee, s));
+float16 r = float32_to_float16(a, ieee, s);
+if (ieee) {
+r = float16_maybe_silence_nan(r);
+}
+return float16_val(r);
 }
 
 float32 HELPER(recps_f32)(float32 a, float32 b, CPUState *env)
-- 
1.7.1
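
As an illustration of what the two helpers above do, here is a small sketch that works directly on raw bit patterns. It assumes the ARM convention where a set top fraction bit marks a quiet NaN, and that FPSCR bit 26 selects the alternative half-precision format. The function names are made up for this sketch and are not the softfloat API; unlike softfloat, nothing here raises exception flags.

```c
#include <stdint.h>

/* FPSCR bit 26: when set, the alternative (non-IEEE) half format is used. */
#define FPSCR_AHP (1u << 26)

/* Mirrors the "ieee" test in the helpers: IEEE mode when bit 26 is clear. */
static int fpscr_is_ieee_half(uint32_t fpscr)
{
    return (fpscr & FPSCR_AHP) == 0;
}

/* Quiet a single-precision NaN by setting fraction bit 22.
 * Non-NaN inputs (including infinities) are returned unchanged. */
static uint32_t f32_silence_nan(uint32_t bits)
{
    int is_nan = (bits & 0x7f800000u) == 0x7f800000u &&
                 (bits & 0x007fffffu) != 0;
    return is_nan ? (bits | 0x00400000u) : bits;
}

/* Same idea for half precision: the quiet bit is fraction bit 9 (0x200). */
static uint16_t f16_silence_nan(uint16_t bits)
{
    int is_nan = (bits & 0x7c00u) == 0x7c00u && (bits & 0x03ffu) != 0;
    return is_nan ? (uint16_t)(bits | 0x0200u) : bits;
}
```

In alternative-format mode there are no half-precision NaNs at all, which is why the patch only silences when ieee is true.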




[Qemu-devel] [PATCH v3 4/6] softfloat: Correctly handle NaNs in float16_to_float32()

2011-02-10 Thread Peter Maydell
Correctly handle NaNs in float16_to_float32(), by defining and
using a float16ToCommonNaN() function, as we do with the other formats.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 fpu/softfloat-specialize.h |   17 +
 fpu/softfloat.c|4 +---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h
index 1c0b12b..2d025bf 100644
--- a/fpu/softfloat-specialize.h
+++ b/fpu/softfloat-specialize.h
@@ -120,6 +120,23 @@ float16 float16_maybe_silence_nan(float16 a_)
 }
 
 /*
+| Returns the result of converting the half-precision floating-point NaN
+| `a' to the canonical NaN format.  If `a' is a signaling NaN, the invalid
+| exception is raised.
+**/
+
+static commonNaNT float16ToCommonNaN( float16 a STATUS_PARAM )
+{
+commonNaNT z;
+
+if ( float16_is_signaling_nan( a ) ) float_raise( float_flag_invalid STATUS_VAR );
+z.sign = float16_val(a) >> 15;
+z.low = 0;
+z.high = ((bits64) float16_val(a))<<54;
+return z;
+}
+
+/*
 | Returns the result of converting the canonical NaN `a' to the half-
 | precision floating-point format.
 **/
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 80d8cc4..3abd170 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2761,9 +2761,7 @@ float32 float16_to_float32(float16 a, flag ieee STATUS_PARAM)
 
 if (aExp == 0x1f && ieee) {
 if (aSig) {
-/* Make sure correct exceptions are raised.  */
-float32ToCommonNaN(a STATUS_VAR);
-aSig |= 0x200;
+return commonNaNToFloat32(float16ToCommonNaN(a STATUS_VAR) STATUS_VAR);
 }
 return packFloat32(aSign, 0xff, aSig << 13);
 }
-- 
1.7.1
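
To make the shifts in float16ToCommonNaN() easier to follow, the sketch below walks a half-precision NaN through a canonical intermediate form and out to single precision. The struct and function names are stand-ins invented for this note. The output step (forcing the quiet bit and taking the top fraction bits via a right shift by 41) is an assumption about how the canonical form is unpacked, and the invalid-exception signalling done by the real code is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for softfloat's canonical NaN structure. */
typedef struct {
    int sign;
    uint64_t high, low;
} common_nan;

/* Half -> canonical: shifting left by 54 pushes the sign and the five
 * exponent bits out of the 64-bit word, leaving the 10 fraction bits
 * left-justified in "high". */
static common_nan f16_to_common_nan(uint16_t a)
{
    common_nan z;
    z.sign = a >> 15;
    z.low = 0;
    z.high = (uint64_t)a << 54;
    return z;
}

/* Canonical -> single (assumed layout): sign, all-ones exponent plus
 * quiet bit, then the payload moved down into fraction bits 22..13. */
static uint32_t common_nan_to_f32(common_nan a)
{
    return ((uint32_t)a.sign << 31) | 0x7fc00000u | (uint32_t)(a.high >> 41);
}
```

For example, the half-precision pattern 0x7c01 carries payload bit 0, which ends up as bit 13 of the single-precision result.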




[Qemu-devel] Re: [PATCH v2 3/4] qed: Report error for unsupported features

2011-02-10 Thread Stefan Hajnoczi
On Thu, Feb 10, 2011 at 11:18 AM, Kevin Wolf kw...@redhat.com wrote:
 Instead of just returning -ENOTSUP, generate a more detailed error.

 Unfortunately we don't have a helpful text for features that we don't know 
 yet,
 so just print the feature mask. It might be useful at least if someone asks 
 for
 help.

 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  block/qed.c |    9 -
  1 files changed, 8 insertions(+), 1 deletions(-)

Thanks!

Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com



Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Gleb Natapov
On Thu, Feb 10, 2011 at 01:47:06PM +0100, Anthony Liguori wrote:
 On 02/10/2011 11:49 AM, Gleb Natapov wrote:
 On Thu, Feb 10, 2011 at 11:19:48AM +0100, Anthony Liguori wrote:
 On 02/10/2011 11:10 AM, Gleb Natapov wrote:
 On Thu, Feb 10, 2011 at 11:00:50AM +0100, Anthony Liguori wrote:
 On 02/10/2011 10:07 AM, Gleb Natapov wrote:
 So what if it is easier, it doesn't mean it is correct thing to do.
 If we spend the next 10 years trying to do the correct thing for
 some arbitrary definition of correct, that's not terribly useful.
 Changing direction by 180 every 2 years even less useful.
 If we think through what we are doing and have a coherent
 architecture before changing direction, then we won't have this
 problem.
 
 I'd like to believe this :)
 
 It's really simple actually.  Let's do the least clever thing and
 model how hardware actual works.  Once we have that, we can try to
 be better than real hardware (if it's possible).
 I think our understanding of how HW actually works is very different.
 You are placing too much value on where a device resides physically; for me
 it is a completely unimportant detail. Not worth even mentioning.
 No, I place value on how things are modelled in the real world.
 The real world (physical HW) has considerations not relevant to our
 software emulation, such as cost, physical dimensions, power consumption
 and many others I am sure I missed.
 
 There simply aren't PC's out there that lack an RTC so I have no
 interest in jumping through hoops in QEMU to make it possible to do
 this without modifying QEMU code.  It might sound nice to a
 developer but it's of absolutely no use to users.
 
 RTC is not a good example. HPET is supposed to replace it (and PIT too).
 
 HPET's embed RTCs to provide support for legacy implementations.
 This is extremely good example of where our modelling breaks down.
 Take a close look at how the HPET and RTC emulations interact for an
 example of why we'd be much better off just implementing an RTC
 within an HPET.
 
Yes, HPET can provide legacy RTC timer functionality. No, I do not see why
we should implement RTC within HPET. In your model we should remove the
HPET code completely, since HPET is not present in the chipset emulated by
QEMU.

   AFAIC
 there are PCs without RTC already.
 
 RTC also provides CMOS functionality and no PC can boot without
 CMOS.  So no, there's nothing we'd consider a PC today that doesn't
 have an RTC.
CMOS may be present even if RTC functionality is absent. Does an EFI-based
machine still need CMOS, though?

 
   Good example would be PIC or IOAPIC
 device and then I would agree with you that it is not worth it to make
 it possible to create x86 machine without them from command line if it
 means extra complexity. But how have you jumped from this to lets make usb
 mandatory?
 
 USB is mandatory in the PIIX3 but the only significant difference
 between the piix2 and piix3 is the addition of USB.
 Consequentially, the main difference between an i440fx and i440bx is
 the use of a piix2 vs. a piix3.  So if you really want to create the
 same PC we have today w/o USB, the right way to do it would be to
 have:
 
 -device i440,model=fx   // with USB
 -device i440,model=bx  // w/o USB

Why not qemu -config piix2.cfg or qemu -config piix3.cfg? No need to
make data into code.

 
 
 No, we don't. It's possible to have an 'rtc=off' option but I'm
 tremendously opposed to doing this.  Arbitrary composition is not a
 useful goal IMHO.
 IMHO is different. We should support composition where it makes sense.
 For PIC-less x86 it doesn't make it. For usb-less or even ide-less it
 does.
 
 The right way to do a USB-less PC is to have an option to create an i440bx.
Why is this the right way?

 
 An IDE-less PC is a bit more difficult because IDE is really baked
 into the concept of a PC.  Chances are, there are more than a few
 guests out there that would have issues from there being no IDE bus
 present.
 
None of my modern PCs have IDE. Many high-end PCs had SCSI instead of IDE
in the past. If a guest can't run without IDE, you do not run it without
IDE.

   So why do you like -device i440fx over what we have now?
 Because I don't think tools like libvirt should be doing device
 composition to create an i440fx-like chipset.  I think the current
 path we're on is pushing too much logic that belongs in QEMU into
 the management stack.
 I can agree with that. But from this it doesn't follow that we should
 get rid of composition. We shouldn't push composition of common HW to
 libvirt. Looking at libvirt command line I do not think we do it though.
 A typical libvirt command line specifies disks, networks, usb, vga. How
 will -device i440fx simplify that? Well usb could be omitted (but not
 -usbdevice table), disks are not property of i440fx so they will stay,
 since user may want to use virtio controller (which is not part of
 i440fx) this should stay too. Network obviously will have to be
 specified by libvirt too, vga may go to i440fx, but since libvirt
 

Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Gleb Natapov
On Thu, Feb 10, 2011 at 03:00:05PM +0200, Avi Kivity wrote:
 On 02/10/2011 02:51 PM, Anthony Liguori wrote:
 On 02/10/2011 12:13 PM, Gleb Natapov wrote:
 
 Which spec? Even in this discussion we completely mixed different
 things. 440FX is not a chipset.
 
 Yes, it is.  It's a single silicon package with a defined pinout.
 If you don't believe me, re-read the spec.
 
 It's a MCM with the PIIX3 being internally connected.   The
 connection between the i440fx and PIIX3 happens to be PCI but
 that's not always the case.  Sometimes it's a proprietary bus.
 
 Aren't they two distinct chips, together comprising the chip-set?
 
 One (the northbridge) converts the system bus to PCI + some extra
 wires, the other (southbridge) bridges PCI to ISA and contains some
 embedded ISA devices.  IIRC there are some wires between them that
 are not PCI.
 
Yeah, 440fx is probably the northbridge and PIIX3 the southbridge.

--
Gleb.



[Qemu-devel] [PATCH 0/2] softfloat: fix USE_SOFTFLOAT_STRUCT_TYPES compile failures

2011-02-10 Thread Peter Maydell
This patchset fixes some compilation failures which happen if you try to
enable softfloat's USE_SOFTFLOAT_STRUCT_TYPES type-error-debugging switch.

This patchset leaves one error in float16_to_float32, because
that is fixed in passing by my half-precision patchset, and I
saw no point in deliberately creating a patch conflict.

I've only fixed the problems with core softfloat and the ARM targets;
maintainers of other targets can fix their platforms if they think it's
worth doing.

Peter Maydell (2):
  softfloat: Fix compilation failures with USE_SOFTFLOAT_STRUCT_TYPES
  linux-user/arm: fix compilation failures using softfloat's struct
types

 fpu/softfloat.c   |   30 +++---
 fpu/softfloat.h   |4 
 linux-user/arm/nwfpe/fpa11_cpdt.c |2 +-
 linux-user/arm/nwfpe/fpopcode.c   |   32 
 linux-user/signal.c   |4 ++--
 5 files changed, 38 insertions(+), 34 deletions(-)




[Qemu-devel] [PATCH 1/2] softfloat: Fix compilation failures with USE_SOFTFLOAT_STRUCT_TYPES

2011-02-10 Thread Peter Maydell
Make softfloat compile with USE_SOFTFLOAT_STRUCT_TYPES defined, by
adding and using new macros const_float32() and const_float64() so
you can use array initializers in an array of float32/float64 whether
the types are bare or wrapped in the structs.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 fpu/softfloat.c |   30 +++---
 fpu/softfloat.h |4 
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 17842f4..8de887d 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2172,21 +2172,21 @@ float32 float32_sqrt( float32 a STATUS_PARAM )
 
 static const float64 float32_exp2_coefficients[15] =
 {
-make_float64( 0x3ff0000000000000ll ), /*  1 */
-make_float64( 0x3fe0000000000000ll ), /*  2 */
-make_float64( 0x3fc5555555555555ll ), /*  3 */
-make_float64( 0x3fa5555555555555ll ), /*  4 */
-make_float64( 0x3f81111111111111ll ), /*  5 */
-make_float64( 0x3f56c16c16c16c17ll ), /*  6 */
-make_float64( 0x3f2a01a01a01a01all ), /*  7 */
-make_float64( 0x3efa01a01a01a01all ), /*  8 */
-make_float64( 0x3ec71de3a556c734ll ), /*  9 */
-make_float64( 0x3e927e4fb7789f5cll ), /* 10 */
-make_float64( 0x3e5ae64567f544e4ll ), /* 11 */
-make_float64( 0x3e21eed8eff8d898ll ), /* 12 */
-make_float64( 0x3de6124613a86d09ll ), /* 13 */
-make_float64( 0x3da93974a8c07c9dll ), /* 14 */
-make_float64( 0x3d6ae7f3e733b81fll ), /* 15 */
+const_float64( 0x3ff0000000000000ll ), /*  1 */
+const_float64( 0x3fe0000000000000ll ), /*  2 */
+const_float64( 0x3fc5555555555555ll ), /*  3 */
+const_float64( 0x3fa5555555555555ll ), /*  4 */
+const_float64( 0x3f81111111111111ll ), /*  5 */
+const_float64( 0x3f56c16c16c16c17ll ), /*  6 */
+const_float64( 0x3f2a01a01a01a01all ), /*  7 */
+const_float64( 0x3efa01a01a01a01all ), /*  8 */
+const_float64( 0x3ec71de3a556c734ll ), /*  9 */
+const_float64( 0x3e927e4fb7789f5cll ), /* 10 */
+const_float64( 0x3e5ae64567f544e4ll ), /* 11 */
+const_float64( 0x3e21eed8eff8d898ll ), /* 12 */
+const_float64( 0x3de6124613a86d09ll ), /* 13 */
+const_float64( 0x3da93974a8c07c9dll ), /* 14 */
+const_float64( 0x3d6ae7f3e733b81fll ), /* 15 */
 };
 
 float32 float32_exp2( float32 a STATUS_PARAM )
diff --git a/fpu/softfloat.h b/fpu/softfloat.h
index 4a5345c..aaf6afc 100644
--- a/fpu/softfloat.h
+++ b/fpu/softfloat.h
@@ -125,11 +125,13 @@ typedef struct {
 /* The cast ensures an error if the wrong type is passed.  */
 #define float32_val(x) (((float32)(x)).v)
 #define make_float32(x) __extension__ ({ float32 f32_val = {x}; f32_val; })
+#define const_float32(x) { x }
 typedef struct {
 uint64_t v;
 } float64;
 #define float64_val(x) (((float64)(x)).v)
 #define make_float64(x) __extension__ ({ float64 f64_val = {x}; f64_val; })
+#define const_float64(x) { x }
 #else
 typedef uint32_t float32;
 typedef uint64_t float64;
@@ -137,6 +139,8 @@ typedef uint64_t float64;
 #define float64_val(x) (x)
 #define make_float32(x) (x)
 #define make_float64(x) (x)
+#define const_float32(x) x
+#define const_float64(x) x
 #endif
 #ifdef FLOATX80
 typedef struct {
-- 
1.7.1
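
The reason a separate const_ macro is needed: the struct-mode make_float64() shown in the context lines expands to a GNU statement expression, which is not a constant expression and so cannot appear in a static array initializer, whereas a brace initializer can. A cut-down sketch of the two-mode pattern, using a made-up USE_STRUCT_TYPES switch and a made-up table name in place of the real ones:

```c
#include <stdint.h>

/* Hypothetical switch standing in for USE_SOFTFLOAT_STRUCT_TYPES. */
#define USE_STRUCT_TYPES 1

#if USE_STRUCT_TYPES
typedef struct { uint64_t v; } float64;   /* wrapped: catches type mix-ups */
#define float64_val(x)   ((x).v)
#define const_float64(x) { x }            /* brace initializer for the struct */
#else
typedef uint64_t float64;                 /* bare integer */
#define float64_val(x)   (x)
#define const_float64(x) x
#endif

/* Works as a static initializer in both modes. */
static const float64 inverse_factorials[3] = {
    const_float64(0x3ff0000000000000ull), /* 1/1! = 1.0 */
    const_float64(0x3fe0000000000000ull), /* 1/2! = 0.5 */
    const_float64(0x3fc5555555555555ull), /* 1/3!       */
};
```

Flipping USE_STRUCT_TYPES to 0 should compile the same table unchanged, which is the property the patch is after.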




[Qemu-devel] [PATCH 2/2] linux-user/arm: fix compilation failures using softfloat's struct types

2011-02-10 Thread Peter Maydell
Add uses of the float32/float64 boxing and unboxing macros so that
the ARM linux-user targets will compile with USE_SOFTFLOAT_STRUCT_TYPES
enabled.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 linux-user/arm/nwfpe/fpa11_cpdt.c |2 +-
 linux-user/arm/nwfpe/fpopcode.c   |   32 
 linux-user/signal.c   |4 ++--
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/linux-user/arm/nwfpe/fpa11_cpdt.c b/linux-user/arm/nwfpe/fpa11_cpdt.c
index 1346fd6..b12e27d 100644
--- a/linux-user/arm/nwfpe/fpa11_cpdt.c
+++ b/linux-user/arm/nwfpe/fpa11_cpdt.c
@@ -33,7 +33,7 @@ void loadSingle(const unsigned int Fn, target_ulong addr)
FPA11 *fpa11 = GET_FPA11();
fpa11->fType[Fn] = typeSingle;
/* FIXME - handle failure of get_user() */
-   get_user_u32(fpa11->fpreg[Fn].fSingle, addr);
+   get_user_u32(float32_val(fpa11->fpreg[Fn].fSingle), addr);
 }
 
 static inline
diff --git a/linux-user/arm/nwfpe/fpopcode.c b/linux-user/arm/nwfpe/fpopcode.c
index 240061d..82ac92f 100644
--- a/linux-user/arm/nwfpe/fpopcode.c
+++ b/linux-user/arm/nwfpe/fpopcode.c
@@ -37,25 +37,25 @@ const floatx80 floatx80Constant[] = {
 };
 
 const float64 float64Constant[] = {
-  0x0000000000000000ULL,   /* double 0.0 */
-  0x3ff0000000000000ULL,   /* double 1.0 */
-  0x4000000000000000ULL,   /* double 2.0 */
-  0x4008000000000000ULL,   /* double 3.0 */
-  0x4010000000000000ULL,   /* double 4.0 */
-  0x4014000000000000ULL,   /* double 5.0 */
-  0x3fe0000000000000ULL,   /* double 0.5 */
-  0x4024000000000000ULL    /* double 10.0 */
+  const_float64(0x0000000000000000ULL),    /* double 0.0 */
+  const_float64(0x3ff0000000000000ULL),    /* double 1.0 */
+  const_float64(0x4000000000000000ULL),    /* double 2.0 */
+  const_float64(0x4008000000000000ULL),    /* double 3.0 */
+  const_float64(0x4010000000000000ULL),    /* double 4.0 */
+  const_float64(0x4014000000000000ULL),    /* double 5.0 */
+  const_float64(0x3fe0000000000000ULL),    /* double 0.5 */
+  const_float64(0x4024000000000000ULL)     /* double 10.0 */
 };
 
 const float32 float32Constant[] = {
-  0x00000000,  /* single 0.0 */
-  0x3f800000,  /* single 1.0 */
-  0x40000000,  /* single 2.0 */
-  0x40400000,  /* single 3.0 */
-  0x40800000,  /* single 4.0 */
-  0x40a00000,  /* single 5.0 */
-  0x3f000000,  /* single 0.5 */
-  0x41200000   /* single 10.0 */
+  const_float32(0x00000000),   /* single 0.0 */
+  const_float32(0x3f800000),   /* single 1.0 */
+  const_float32(0x40000000),   /* single 2.0 */
+  const_float32(0x40400000),   /* single 3.0 */
+  const_float32(0x40800000),   /* single 4.0 */
+  const_float32(0x40a00000),   /* single 5.0 */
+  const_float32(0x3f000000),   /* single 0.5 */
+  const_float32(0x41200000)    /* single 10.0 */
 };
 
 unsigned int getRegisterCount(const unsigned int opcode)
diff --git a/linux-user/signal.c b/linux-user/signal.c
index b01bd64..ce033e9 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -1299,7 +1299,7 @@ static abi_ulong *setup_sigframe_v2_vfp(abi_ulong *regspace, CPUState *env)
 __put_user(TARGET_VFP_MAGIC, &vfpframe->magic);
 __put_user(sizeof(*vfpframe), &vfpframe->size);
 for (i = 0; i < 32; i++) {
-__put_user(env->vfp.regs[i], &vfpframe->ufp.fpregs[i]);
+__put_user(float64_val(env->vfp.regs[i]), &vfpframe->ufp.fpregs[i]);
 }
 __put_user(vfp_get_fpscr(env), &vfpframe->ufp.fpscr);
 __put_user(env->vfp.xregs[ARM_VFP_FPEXC], &vfpframe->ufp_exc.fpexc);
@@ -1588,7 +1588,7 @@ static abi_ulong *restore_sigframe_v2_vfp(CPUState *env, abi_ulong *regspace)
 return 0;
 }
 for (i = 0; i < 32; i++) {
-__get_user(env->vfp.regs[i], &vfpframe->ufp.fpregs[i]);
+__get_user(float64_val(env->vfp.regs[i]), &vfpframe->ufp.fpregs[i]);
 }
 __get_user(fpscr, &vfpframe->ufp.fpscr);
 vfp_set_fpscr(env, fpscr);
-- 
1.7.1




Re: [Qemu-devel] [PATCH v3] Fix ATA SMART and CHECK POWER MODE

2011-02-10 Thread Brian Wheeler
On Wed, 2011-02-09 at 17:22 -0600, Ryan Harper wrote:
 * Brian Wheeler bdwhe...@indiana.edu [2011-02-09 16:13]:
  This patch fixes two things:
  
   1) CHECK POWER MODE
  
  The error return value wasn't always zero, so it would show up as
  offline.  Error is now explicitly set to zero.
  
   2) SMART
  
  The smart values that were returned were invalid and tools like skdump
  would not recognize that the smart data was actually valid and would
  dump weird output.  The data has been fixed up and raw value support
  was added.  Tools like skdump and palimpsest work as expected.
  
  v3 changes:  don't reformat code I didn't change
  v2 changes:  use single structure instead of one for thresholds and one
  for data.
  
  Signed-off-by: bdwhe...@indiana.edu
  
  diff --git a/hw/ide/core.c b/hw/ide/core.c
  index dd63664..b0b0b35 100644
  --- a/hw/ide/core.c
  +++ b/hw/ide/core.c
  @@ -34,13 +34,26 @@
  
   #include hw/ide/internal.h
  
  -static const int smart_attributes[][5] = {
  -/* id,  flags, val, wrst, thrsh */
  -{ 0x01, 0x03, 0x64, 0x64, 0x06}, /* raw read */
  -{ 0x03, 0x03, 0x64, 0x64, 0x46}, /* spin up */
  -{ 0x04, 0x02, 0x64, 0x64, 0x14}, /* start stop count */
  -{ 0x05, 0x03, 0x64, 0x64, 0x36}, /* remapped sectors */
  -{ 0x00, 0x00, 0x00, 0x00, 0x00}
  +/* These values were taking from a running system, specifically a
  +   Seagate ST3500418AS */
 
 
 These values ought to have meaning for your hardware, but won't for either the
 virtual disk, nor the underlying storage that the virtual disk is
 running on.  Since we're not attempting to pass any of that info, nor
 keep it in-sync, it probably doesn't matter that much that we're just
 copying device specific data.  I'm open to discussion on how much we
 care about the attribute values[1].
 
 1. 
 https://secure.wikimedia.org/wikipedia/en/wiki/S.M.A.R.T.#ATA_S.M.A.R.T._attributes
 
 

The main reason for this patch was to make sure the disk tools and
smartd on Linux were happy and returned reasonable values.  At some
point I may add the ability to trigger a SMART failure (by increasing
the remapped-sector count or something), but for now it's read-only and
not really meaningful.

  +static const int smart_attributes[][12] = {
  +/* id,  flags, hflags, val, wrst, raw (6 bytes), threshold */
  +/* raw read error rate*/
  +{ 0x01, 0x03, 0x00, 0x74, 0x63, 0x31, 0x6d, 0x3f, 0x0d, 0x00, 0x00, 
  0x06},
 
 probably fine, but this is vendor hardware specific.  I can't think of a
 better number other than 0.
 

I've set it to zero.

  +/* spin up */
  +{ 0x03, 0x03, 0x00, 0x61, 0x61, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 
  0x00},
 
 default is probably fine as well, though it's dependent upon the
 hardware as well.  Could be zero as well.
 

I've set it to 16ms so skdump returns something other than 'n/a'


  +/* start stop count */
  +{ 0x04, 0x02, 0x00, 0x64, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 
  0x14},
 
 depends on hardware and power mgmt, any count is probably fine.
 
  +/* remapped sectors */
  +{ 0x05, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 
  0x24},
 
 Probably should be zero.
 

When dumped via skdump or smartctl it reads as 0 sectors remapped (as
indicated by the raw value).  The value looks like it's a countdown of
sectors remaining, so setting it to the 'worst' value is equivalent to
no sectors remapped.


  +/* power on hours */
  +{ 0x09, 0x03, 0x00, 0x61, 0x61, 0x68, 0x0a, 0x00, 0x00, 0x00, 0x00, 
  0x00},
 
 Zero.
 

I'm going to set it to 1 (hour) so skdump returns something other than
'n/a'

  +/* power cycle count */
  +{ 0x0c, 0x03, 0x00, 0x64, 0x64, 0x32, 0x00, 0x00, 0x00, 0x00, 0x00, 
  0x00},
 
 Zero

I've set it to zero.

 
  +/* airflow-temperature-celsius */
  +{ 190,  0x03, 0x00, 0x64, 0x64, 0x1f, 0x00, 0x16, 0x22, 0x00, 0x00, 
  0x32},
 
 Something resonably ambient 20-30C, current value is probably fine.
 

It reads as 31.0C.  I've set the value (and worst) so that it matches the
raw value (100C - 31C = 69C, i.e. 0x45).  I've also adjusted the raw value
so it shows the min/max as 31C.
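
The arithmetic in the paragraph above, spelled out as a tiny sketch. The helper name is made up, and the "100 minus temperature" mapping is just the relation described in this mail, not something taken from the ATA specification:

```c
/* Normalized attribute value for attribute 190 as described above:
 * 100 minus the temperature in degrees Celsius. */
static int airflow_temp_normalized(int celsius)
{
    return 100 - celsius;
}
```

With a raw reading of 31 (0x1f) this gives 69, which is the 0x45 used for both the value and worst fields.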


  +/* end of list */
  +{ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 
  0x00}
   };
  


   /* XXX: DVDs that could fit on a CD will be reported as a CD */
  @@ -1843,6 +1856,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
   break;
   case WIN_CHECKPOWERMODE1:
   case WIN_CHECKPOWERMODE2:
  +s->error = 0;
   s->nsector = 0xff; /* device active or idle */
   s->status = READY_STAT | SEEK_STAT;
   ide_set_irq(s->bus);
  @@ -2097,7 +2111,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
  if (smart_attributes[n][0] == 0)
  break;
  s->io_buffer[2+0+(n*12)] = smart_attributes[n][0];
  -   s->io_buffer[2+1+(n*12)] = smart_attributes[n][4];
  +   s->io_buffer[2+1+(n*12)] = 

[Qemu-devel] [PATCH v4] Fix ATA SMART and CHECK POWER MODE

2011-02-10 Thread Brian Wheeler
This patch fixes two things:
 
 1) CHECK POWER MODE
 
The error return value wasn't always zero, so it would show up as
offline.  Error is now explicitly set to zero.
 
 2) SMART
 
The smart values that were returned were invalid and tools like skdump
would not recognize that the smart data was actually valid and would
dump weird output.  The data has been fixed up and raw value support
was added.  Tools like skdump and palimpsest work as expected.

v4 changes:  incorporate changes from Ryan Harper
v3 changes:  don't reformat code I didn't change
v2 changes:  use single structure instead of one for thresholds and one
for data.

Signed-off-by: bdwhe...@indiana.edu

diff --git a/hw/ide/core.c b/hw/ide/core.c
index dd63664..c806f31 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -34,13 +34,26 @@
 
 #include <hw/ide/internal.h>
 
-static const int smart_attributes[][5] = {
-/* id,  flags, val, wrst, thrsh */
-{ 0x01, 0x03, 0x64, 0x64, 0x06}, /* raw read */
-{ 0x03, 0x03, 0x64, 0x64, 0x46}, /* spin up */
-{ 0x04, 0x02, 0x64, 0x64, 0x14}, /* start stop count */
-{ 0x05, 0x03, 0x64, 0x64, 0x36}, /* remapped sectors */
-{ 0x00, 0x00, 0x00, 0x00, 0x00}
+/* These values were based on a Seagate ST3500418AS but have been modified
+   to make more sense in QEMU */
+static const int smart_attributes[][12] = {
+/* id,  flags, hflags, val, wrst, raw (6 bytes), threshold */
+/* raw read error rate*/
+{ 0x01, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x06},
+/* spin up */
+{ 0x03, 0x03, 0x00, 0x64, 0x64, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+/* start stop count */
+{ 0x04, 0x02, 0x00, 0x64, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x14},
+/* remapped sectors */
+{ 0x05, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x24},
+/* power on hours */
+{ 0x09, 0x03, 0x00, 0x64, 0x64, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+/* power cycle count */
+{ 0x0c, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+/* airflow-temperature-celsius */
+{ 190,  0x03, 0x00, 0x45, 0x45, 0x1f, 0x00, 0x1f, 0x1f, 0x00, 0x00, 0x32},
+/* end of list */
+{ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}
 };
 
 /* XXX: DVDs that could fit on a CD will be reported as a CD */
@@ -1843,6 +1856,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
 break;
 case WIN_CHECKPOWERMODE1:
 case WIN_CHECKPOWERMODE2:
+s->error = 0;
 s->nsector = 0xff; /* device active or idle */
 s->status = READY_STAT | SEEK_STAT;
 ide_set_irq(s->bus);
@@ -2097,7 +2111,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
if (smart_attributes[n][0] == 0)
break;
s->io_buffer[2+0+(n*12)] = smart_attributes[n][0];
-   s->io_buffer[2+1+(n*12)] = smart_attributes[n][4];
+   s->io_buffer[2+1+(n*12)] = smart_attributes[n][11];
}
for (n=0; n<511; n++) /* checksum */
s->io_buffer[511] += s->io_buffer[n];
@@ -2110,12 +2124,13 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
memset(s->io_buffer, 0, 0x200);
s->io_buffer[0] = 0x01; /* smart struct version */
for (n=0; n<30; n++) {
-   if (smart_attributes[n][0] == 0)
+   if (smart_attributes[n][0] == 0) {
break;
-   s->io_buffer[2+0+(n*12)] = smart_attributes[n][0];
-   s->io_buffer[2+1+(n*12)] = smart_attributes[n][1];
-   s->io_buffer[2+3+(n*12)] = smart_attributes[n][2];
-   s->io_buffer[2+4+(n*12)] = smart_attributes[n][3];
+   }
+   int i;
+   for(i = 0; i < 11; i++) {
+   s->io_buffer[2+i+(n*12)] = smart_attributes[n][i];
+   }
}
s->io_buffer[362] = 0x02 | (s->smart_autosave?0x80:0x00);
if (s->smart_selftest_count == 0) {
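
For anyone who wants to poke at the layout outside of QEMU, here is a standalone sketch of how the data page is filled from a 12-column table: 12-byte entries starting at offset 2, with the first 11 columns copied and the threshold column left out. The function name and the shortened table are made up for this note. The final checksum step (storing the two's complement so that all 512 bytes sum to zero modulo 256) is an assumption about the usual ATA convention; only the summing loop is visible in the hunks above.

```c
#include <stdint.h>
#include <string.h>

#define SMART_ENTRY_LEN 12

/* id, flags, hflags, val, wrst, raw (6 bytes), threshold.
 * Two rows borrowed from the v4 table, then the end-of-list marker. */
static const int attrs[][SMART_ENTRY_LEN] = {
    { 0x01, 0x03, 0x00, 0x64, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x06 },
    { 0x09, 0x03, 0x00, 0x64, 0x64, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 },
    { 0x00 }
};

/* Fill a 512-byte SMART data page: version byte, attribute entries,
 * then a checksum in the last byte. */
static void fill_smart_data(uint8_t buf[512])
{
    int n, i;
    uint8_t sum = 0;

    memset(buf, 0, 512);
    buf[0] = 0x01;                          /* smart struct version */
    for (n = 0; n < 30 && attrs[n][0] != 0; n++) {
        for (i = 0; i < 11; i++) {          /* skip column 11 (threshold) */
            buf[2 + i + n * 12] = (uint8_t)attrs[n][i];
        }
    }
    for (n = 0; n < 511; n++) {
        sum += buf[n];
    }
    buf[511] = (uint8_t)(0x100 - sum);      /* assumed: bytes sum to 0 mod 256 */
}
```

The thresholds page uses the same 12-byte stride but only stores the id and column 11, as in the first SMART hunk.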





[Qemu-devel] [Bug 498107] Re: www.qemu.org and www.nongnu.org/qemu have a lot of bugs

2011-02-10 Thread rowa
qemu 0.14.0 rc1

Not in qemu-doc.html:

(qemu) info spice


BTW: The qemu instance crashed with info spice:

kvm ReactOS.img -spice port=12345,disable-ticketing -vga qxl -monitor stdio

(qemu) info spice
Server:
 address: 0.0.0.0:12345
auth: none
qemu: qdict.c:193: qdict_get_obj: Assertion `obj != ((void *)0)' failed.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/498107

Title:
  www.qemu.org and www.nongnu.org/qemu have a lot of bugs

Status in QEMU:
  New

Bug description:
  The websites http://www.qemu.org and http://www.nongnu.org/qemu have a
  lot of bugs:

  Why two websites with different outdated content?

  No contact address
  ------------------
  It is not possible to contact the webmaster.

  Outdated content
  ----------------
  The current release is 0.12.0-rc2!
  - http://www.nongnu.org/qemu/index.html "Jul 30, 2009 QEMU version 0.11.0-rc1 is out"
  - http://www.nongnu.org/qemu/download.html
  - http://www.qemu.org "QEMU version 0.12.0-rc1 is out"
  - http://www.qemu.org/download.html

  Many links are outdated or broken
  ---------------------------------
  - http://www.qemu.org/links.html
  - http://www.nongnu.org/qemu/links.html
  For example, QEMU on Windows. Why not http://www.davereyn.co.uk/download.htm?

  - http://www.qemu.org/user-doc.html
  - http://www.nongnu.org/qemu/user-doc.html
  For example: Quick Start, FAQ and QEMU Wiki. 

  No word about the end of KQEMU support next.

  - http://www.qemu.org/qemu-doc.html
  - http://www.nongnu.org/qemu/qemu-doc.html
  There are a lot of differences to
  qemu --help
  help in QEMU-Monitor

  For example
  -rtc-td-hack
  -localtime
  -startdate
  -netdev
  -mem-path
  -mem-prealloc
  -tdf
  -nvram
  -enable-nesting
  -no-kvm-irqchip
  -no-kvm-pit
  -no-kvm-pit-reinjection
  -xen-domid id
  -xen-create
  -xen-attach
  -readconfig file
  -writeconfig file
  (qemu) host_net_redir
  (qemu) acl_reset

  Please see also

  - http://qemu-buch.de/d/Anhang/_Startoptionen_von_QEMU_und_KVM 
  - http://qemu-buch.de/d/Anhang/_QEMU-Monitor





[Qemu-devel] Re: AHCI in SeaBIOS

2011-02-10 Thread Kevin O'Connor
On Tue, Feb 08, 2011 at 12:57:41AM +0100, Alexander Graf wrote:
 Hi Kevin,
 
 Do you remember why you put AHCI in with default=n? I'd like to see
 it enabled in Qemu 0.14 and IIUC we use the default configuration
 for that.

Hi Alex,

Sorry - I've gotten behind on emails.

The reason for not enabling it by default was for two reasons - it had
not been tested on real hardware, and it was unclear if doing the jump
into 32bit mode would have adverse impact.

 Speaking of which - it would be awesome if we had a companion
 SeaBIOS release with Qemu 0.14, so we don't have to fetch a random
 git snapshot but potentially even could maintain a stable SeaBIOS
 during the lifetime of 0.14 :).

It's about time to make a new release of seabios, so I'll see if that
can be done in a week or so.

-Kevin



[Qemu-devel] Re: AHCI in SeaBIOS

2011-02-10 Thread Alexander Graf
Kevin O'Connor wrote:
> On Tue, Feb 08, 2011 at 12:57:41AM +0100, Alexander Graf wrote:
>> Hi Kevin,
>>
>> Do you remember why you put AHCI in with default=n? I'd like to see
>> it enabled in Qemu 0.14 and IIUC we use the default configuration
>> for that.
>
> Hi Alex,
>
> Sorry - I've gotten behind on emails.
>
> The reason for not enabling it by default was for two reasons - it had
> not been tested on real hardware, and it was unclear if doing the jump
> into 32bit mode would have adverse impact.

Do you have to do the jump even when AHCI is unused? If no, it shouldn't
hurt, right?

Anthony, can we enable it only for the Qemu build?

>> Speaking of which - it would be awesome if we had a companion
>> SeaBIOS release with Qemu 0.14, so we don't have to fetch a random
>> git snapshot but potentially even could maintain a stable SeaBIOS
>> during the lifetime of 0.14 :).
>
> It's about time to make a new release of seabios, so I'll see if that
> can be done in a week or so.

Very nice, thank you! It would be great if we could sync that up with
the 0.14 release. Anthony?


Alex




[Qemu-devel] Re: AHCI in SeaBIOS

2011-02-10 Thread Alexander Graf
Kevin O'Connor wrote:
> On Thu, Feb 10, 2011 at 04:25:11PM +0100, Alexander Graf wrote:
>> Kevin O'Connor wrote:
>>> On Tue, Feb 08, 2011 at 12:57:41AM +0100, Alexander Graf wrote:
>>>> Do you remember why you put AHCI in with default=n? I'd like to see
>>>> it enabled in Qemu 0.14 and IIUC we use the default configuration
>>>> for that.
>>>
>>> The reason for not enabling it by default was for two reasons - it had
>>> not been tested on real hardware, and it was unclear if doing the jump
>>> into 32bit mode would have adverse impact.
>>
>> Do you have to do the jump even when AHCI is unused? If no, it shouldn't
>> hurt, right?
>
> The 32bit jump is only done if an AHCI drive is found and one tries to
> read/write to it.

Very good, so it really shouldn't hurt. We could add another option that
only kicks off the detection when Qemu is found, no? That way we're sure
to not break real hardware, but have the functionality in Qemu :)


Alex




Re: [Qemu-devel] Re: AHCI in SeaBIOS

2011-02-10 Thread Anthony Liguori

On 02/10/2011 04:25 PM, Alexander Graf wrote:
> Kevin O'Connor wrote:
>> On Tue, Feb 08, 2011 at 12:57:41AM +0100, Alexander Graf wrote:
>>> Hi Kevin,
>>>
>>> Do you remember why you put AHCI in with default=n? I'd like to see
>>> it enabled in Qemu 0.14 and IIUC we use the default configuration
>>> for that.
>>
>> Hi Alex,
>>
>> Sorry - I've gotten behind on emails.
>>
>> The reason for not enabling it by default was for two reasons - it had
>> not been tested on real hardware, and it was unclear if doing the jump
>> into 32bit mode would have adverse impact.
>
> Do you have to do the jump even when AHCI is unused? If no, it shouldn't
> hurt, right?
>
> Anthony, can we enable it only for the Qemu build?
>
>>> Speaking of which - it would be awesome if we had a companion
>>> SeaBIOS release with Qemu 0.14, so we don't have to fetch a random
>>> git snapshot but potentially even could maintain a stable SeaBIOS
>>> during the lifetime of 0.14 :).
>>
>> It's about time to make a new release of seabios, so I'll see if that
>> can be done in a week or so.
>
> Very nice, thank you! It would be great if we could sync that up with
> the 0.14 release. Anthony?

We're right around the corner from -rc2 so let's wait until after 0.14
is released so we have a full release cycle to test it.

Regards,

Anthony Liguori

> Alex


   





[Qemu-devel] [PATCH 03/11] qcow2: Fix error handling for immediate backing file read failure

2011-02-10 Thread Kevin Wolf
Requests could return success even though they failed when bdrv_aio_readv
returned NULL for a backing file read.

Reported-by: Chunqiang Tang ct...@us.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 28338bf..647c2a4 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -479,8 +479,10 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
                 BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
                 acb->hd_aiocb = bdrv_aio_readv(bs->backing_hd, acb->sector_num,
                                     &acb->hd_qiov, n1, qcow2_aio_read_cb, acb);
-                if (acb->hd_aiocb == NULL)
+                if (acb->hd_aiocb == NULL) {
+                    ret = -EIO;
                     goto done;
+                }
             } else {
                 ret = qcow2_schedule_bh(qcow2_aio_read_bh, acb);
                 if (ret < 0)
-- 
1.7.2.3
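
The failure mode fixed by this patch can be sketched outside of QEMU. This is only an illustration under stated assumptions: Request, submit_fn, submit_ok, submit_fail and start_backing_read are hypothetical names, not QEMU APIs. The point is that the incoming status is 0, so jumping to the completion label without first storing an error reports success.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical request handle and submit callback, for illustration only. */
typedef struct Request { int id; } Request;
typedef Request *(*submit_fn)(void);

static Request dummy_request = { 1 };
static Request *submit_ok(void)   { return &dummy_request; }
static Request *submit_fail(void) { return NULL; }

/*
 * 'ret' is the status carried in from the previous step (0 on success).
 * If the submission fails immediately, it must be overwritten before
 * jumping to 'done'; otherwise the caller would see the stale 0.
 */
static int start_backing_read(submit_fn submit, int ret)
{
    Request *req = submit();

    if (req == NULL) {
        ret = -EIO;     /* the assignment the patch adds */
        goto done;
    }
    /* Request is in flight; its completion would be reported later. */
    return 0;

done:
    return ret;         /* stands in for completing the request with 'ret' */
}
```

Calling start_backing_read(submit_fail, 0) yields -EIO; without the assignment it would have yielded 0.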




[Qemu-devel] [PULL 00/11] Block patches for master

2011-02-10 Thread Kevin Wolf
The following changes since commit 6c5f738daec123020d32543fe90a6633a4f6643e:

  microblaze: Handle singlestepping over direct jmps (2011-02-10 00:46:09 +0100)

are available in the git repository at:
  git://repo.or.cz/qemu/kevin.git for-anthony

Chunqiang Tang (1):
  QCOW2: bug fix - read base image beyond its size

Jes Sorensen (1):
  Change snapshot_blkdev hmp to use correct argument type for device

Kevin Wolf (7):
  qcow2: Fix error handling for immediate backing file read failure
  qcow2: Fix error handling for reading compressed clusters
  qerror: Add QERR_UNKNOWN_BLOCK_FORMAT_FEATURE
  qcow2: Report error for version > 2
  qed: Report error for unsupported features
  qemu-img: Improve error messages for failed bdrv_open
  qcow2: Fix order in L2 table COW

Markus Armbruster (2):
  blockdev: Plug memory leak in drive_uninit()
  blockdev: Plug memory leak in drive_init() error paths

 block/qcow2-cluster.c |   13 -
 block/qcow2.c |   26 +++---
 block/qed.c   |9 -
 blockdev.c|   12 ++--
 cutils.c  |   31 +++
 hmp-commands.hx   |2 +-
 qemu-common.h |2 ++
 qemu-img.c|   10 +++---
 qerror.c  |5 +
 qerror.h  |3 +++
 10 files changed, 94 insertions(+), 19 deletions(-)



[Qemu-devel] [PATCH 09/11] blockdev: Plug memory leak in drive_uninit()

2011-02-10 Thread Kevin Wolf
From: Markus Armbruster arm...@redhat.com

Started leaking in commit 1dae12e6.

Signed-off-by: Markus Armbruster arm...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 blockdev.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index ecfadc1..24d7658 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -182,6 +182,7 @@ static void drive_uninit(DriveInfo *dinfo)
 {
     qemu_opts_del(dinfo->opts);
     bdrv_delete(dinfo->bdrv);
+    qemu_free(dinfo->id);
     QTAILQ_REMOVE(&drives, dinfo, next);
     qemu_free(dinfo);
 }
-- 
1.7.2.3




[Qemu-devel] [PATCH 10/11] blockdev: Plug memory leak in drive_init() error paths

2011-02-10 Thread Kevin Wolf
From: Markus Armbruster arm...@redhat.com

Should have spotted this when doing commit 319ae529.

Signed-off-by: Markus Armbruster arm...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 blockdev.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 24d7658..0690cc8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -526,7 +526,7 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
     } else if (ro == 1) {
         if (type != IF_SCSI && type != IF_VIRTIO && type != IF_FLOPPY && type != IF_NONE) {
             error_report("readonly not supported by this bus type");
-            return NULL;
+            goto err;
         }
     }
 
@@ -536,12 +536,19 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
     if (ret < 0) {
         error_report("could not open disk image %s: %s",
                      file, strerror(-ret));
-        return NULL;
+        goto err;
     }
 
     if (bdrv_key_required(dinfo->bdrv))
         autostart = 0;
     return dinfo;
+
+err:
+    bdrv_delete(dinfo->bdrv);
+    qemu_free(dinfo->id);
+    QTAILQ_REMOVE(&drives, dinfo, next);
+    qemu_free(dinfo);
+    return NULL;
 }
 
 void do_commit(Monitor *mon, const QDict *qdict)
-- 
1.7.2.3
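
The single-exit cleanup pattern used by this patch can be shown in isolation. The following is a minimal sketch, not QEMU code: Drive, drive_create and drive_destroy are hypothetical names, and plain malloc/free stand in for qemu_malloc/qemu_free. Every failure after the first allocation jumps to one label that releases whatever was acquired so far.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a drive description; not QEMU's DriveInfo. */
typedef struct Drive {
    char *id;
    void *backend;
} Drive;

static void drive_destroy(Drive *d)
{
    if (d) {
        free(d->backend);
        free(d->id);
        free(d);
    }
}

/* 'fail_open' simulates a late failure such as an image that cannot be opened. */
static Drive *drive_create(const char *id, int fail_open)
{
    size_t len = strlen(id) + 1;
    Drive *d = calloc(1, sizeof(*d));

    if (!d) {
        return NULL;            /* nothing to clean up yet */
    }
    d->id = malloc(len);
    if (!d->id) {
        goto err;
    }
    memcpy(d->id, id, len);

    d->backend = malloc(16);
    if (!d->backend) {
        goto err;
    }
    if (fail_open) {
        goto err;               /* a bare "return NULL" here would leak */
    }
    return d;

err:
    drive_destroy(d);           /* free(NULL) is a no-op, so partial state is fine */
    return NULL;
}
```

Because the structure is zero-initialized by calloc, the error path can free every member unconditionally regardless of how far construction got.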




Re: [Qemu-devel] [PATCH v2 6/6] target-arm: Use standard FPSCR for Neon half-precision operations

2011-02-10 Thread Aurelien Jarno
On Wed, Feb 09, 2011 at 04:27:30PM +, Peter Maydell wrote:
 The Neon half-precision conversion operations (VCVT.F16.F32 and
 VCVT.F32.F16) use ARM standard floating-point arithmetic, unlike
 the VFP versions (VCVTB and VCVTT).
 
 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
  target-arm/helper.c|   26 ++
  target-arm/helpers.h   |2 ++
  target-arm/translate.c |   16 
  3 files changed, 32 insertions(+), 12 deletions(-)

Reviewed-by: Aurelien Jarno aurel...@aurel32.net

 diff --git a/target-arm/helper.c b/target-arm/helper.c
 index 503278c..d36f0f3 100644
 --- a/target-arm/helper.c
 +++ b/target-arm/helper.c
 @@ -2623,9 +2623,8 @@ VFP_CONV_FIX(ul, s, float32, uint32, u)
  #undef VFP_CONV_FIX
  
  /* Half precision conversions.  */
 -float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, CPUState *env)
 +static float32 do_fcvt_f16_to_f32(uint32_t a, CPUState *env, float_status *s)
  {
 -float_status *s = env-vfp.fp_status;
  int ieee = (env-vfp.xregs[ARM_VFP_FPSCR]  (1  26)) == 0;
  float32 r = float16_to_float32(a, ieee, s);
  if (ieee) {
 @@ -2634,9 +2633,8 @@ float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, 
 CPUState *env)
  return r;
  }
  
 -uint32_t HELPER(vfp_fcvt_f32_to_f16)(float32 a, CPUState *env)
 +static uint32_t do_fcvt_f32_to_f16(float32 a, CPUState *env, float_status *s)
  {
 -float_status *s = env-vfp.fp_status;
  int ieee = (env-vfp.xregs[ARM_VFP_FPSCR]  (1  26)) == 0;
  float16 r = float32_to_float16(a, ieee, s);
  if (ieee) {
 @@ -2645,6 +2643,26 @@ uint32_t HELPER(vfp_fcvt_f32_to_f16)(float32 a, 
 CPUState *env)
  return r;
  }
  
 +float32 HELPER(neon_fcvt_f16_to_f32)(uint32_t a, CPUState *env)
 +{
 +return do_fcvt_f16_to_f32(a, env, env-vfp.standard_fp_status);
 +}
 +
 +float32 HELPER(neon_fcvt_f32_to_f16)(uint32_t a, CPUState *env)
 +{
 +return do_fcvt_f32_to_f16(a, env, env-vfp.standard_fp_status);
 +}
 +
 +float32 HELPER(vfp_fcvt_f16_to_f32)(uint32_t a, CPUState *env)
 +{
 +return do_fcvt_f16_to_f32(a, env, env-vfp.fp_status);
 +}
 +
 +float32 HELPER(vfp_fcvt_f32_to_f16)(uint32_t a, CPUState *env)
 +{
 +return do_fcvt_f32_to_f16(a, env, env-vfp.fp_status);
 +}
 +
  float32 HELPER(recps_f32)(float32 a, float32 b, CPUState *env)
  {
  float_status *s = env-vfp.fp_status;
 diff --git a/target-arm/helpers.h b/target-arm/helpers.h
 index 8a2564e..40264b4 100644
 --- a/target-arm/helpers.h
 +++ b/target-arm/helpers.h
 @@ -129,6 +129,8 @@ DEF_HELPER_3(vfp_ultod, f64, f64, i32, env)
  
  DEF_HELPER_2(vfp_fcvt_f16_to_f32, f32, i32, env)
  DEF_HELPER_2(vfp_fcvt_f32_to_f16, i32, f32, env)
 +DEF_HELPER_2(neon_fcvt_f16_to_f32, f32, i32, env)
 +DEF_HELPER_2(neon_fcvt_f32_to_f16, i32, f32, env)
  
  DEF_HELPER_3(recps_f32, f32, f32, f32, env)
  DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env)
 diff --git a/target-arm/translate.c b/target-arm/translate.c
 index e4649e6..a867f55 100644
 --- a/target-arm/translate.c
 +++ b/target-arm/translate.c
 @@ -5495,17 +5495,17 @@ static int disas_neon_data_insn(CPUState * env, 
 DisasContext *s, uint32_t insn)
  tmp = new_tmp();
  tmp2 = new_tmp();
  tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 0));
 -gen_helper_vfp_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env);
 +gen_helper_neon_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env);
  tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 1));
 -gen_helper_vfp_fcvt_f32_to_f16(tmp2, cpu_F0s, cpu_env);
 +gen_helper_neon_fcvt_f32_to_f16(tmp2, cpu_F0s, cpu_env);
  tcg_gen_shli_i32(tmp2, tmp2, 16);
  tcg_gen_or_i32(tmp2, tmp2, tmp);
  tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 2));
 -gen_helper_vfp_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env);
 +gen_helper_neon_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env);
  tcg_gen_ld_f32(cpu_F0s, cpu_env, neon_reg_offset(rm, 3));
  neon_store_reg(rd, 0, tmp2);
  tmp2 = new_tmp();
 -gen_helper_vfp_fcvt_f32_to_f16(tmp2, cpu_F0s, cpu_env);
 +gen_helper_neon_fcvt_f32_to_f16(tmp2, cpu_F0s, cpu_env);
  tcg_gen_shli_i32(tmp2, tmp2, 16);
  tcg_gen_or_i32(tmp2, tmp2, tmp);
  neon_store_reg(rd, 1, tmp2);
 @@ -5518,17 +5518,17 @@ static int disas_neon_data_insn(CPUState * env, 
 DisasContext *s, uint32_t insn)
  tmp = neon_load_reg(rm, 0);
  tmp2 = neon_load_reg(rm, 1);
  tcg_gen_ext16u_i32(tmp3, tmp);
 -gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp3, cpu_env);
 +gen_helper_neon_fcvt_f16_to_f32(cpu_F0s, tmp3, cpu_env);
  tcg_gen_st_f32(cpu_F0s, cpu_env, 

[Qemu-devel] [PATCH 02/11] QCOW2: bug fix - read base image beyond its size

2011-02-10 Thread Kevin Wolf
From: Chunqiang Tang ct...@us.ibm.com

This patch fixes the following bug in QCOW2. For a QCOW2 image that is larger
than its base image, when handling a read request straddling over the end of the
base image, the QCOW2 driver attempts to read beyond the end of the base image
and the request would fail.

This bug was found by Fast Virtual Disk (FVD)'s fully automated testing tool.
The following test triggered the bug.

dd if=/dev/zero of=/var/ramdisk/truth.raw count=0 bs=1 seek=1098561536
dd if=/dev/zero of=/var/ramdisk/zero-500M.raw count=0 bs=1 seek=593099264
./qemu-img create -f qcow2 -ocluster_size=65536,backing_fmt=blksim -b 
/var/ramdisk/zero-500M.raw /var/ramdisk/test.qcow2 1098561536
./qemu-io --auto --seed=30477694 --truth=/var/ramdisk/truth.raw --format=qcow2 
--test=blksim:/var/ramdisk/test.qcow2 --verify_write=true 
--compare_before=false --compare_after=true --round=10 --parallel=100 
--io_size=10485760 --fail_prob=0 --cancel_prob=0 --instant_qemubh=true

Signed-off-by: Chunqiang Tang ct...@us.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2.c |5 ++---
 cutils.c  |   31 +++
 qemu-common.h |2 ++
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index a1773e4..28338bf 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -355,7 +355,7 @@ int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
     else
         n1 = bs->total_sectors - sector_num;
 
-    qemu_iovec_memset(qiov, 0, 512 * (nb_sectors - n1));
+    qemu_iovec_memset_skip(qiov, 0, 512 * (nb_sectors - n1), 512 * n1);
 
     return n1;
 }
@@ -478,8 +478,7 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
             if (n1 > 0) {
                 BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
                 acb->hd_aiocb = bdrv_aio_readv(bs->backing_hd, acb->sector_num,
-                                    &acb->hd_qiov, acb->cur_nr_sectors,
-                                    qcow2_aio_read_cb, acb);
+                                    &acb->hd_qiov, n1, qcow2_aio_read_cb, acb);
                 if (acb->hd_aiocb == NULL)
                     goto done;
             } else {
diff --git a/cutils.c b/cutils.c
index 8d562b2..f9a7e36 100644
--- a/cutils.c
+++ b/cutils.c
@@ -267,6 +267,37 @@ void qemu_iovec_memset(QEMUIOVector *qiov, int c, size_t count)
     }
 }
 
+void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count,
+                            size_t skip)
+{
+    int i;
+    size_t done;
+    void *iov_base;
+    uint64_t iov_len;
+
+    done = 0;
+    for (i = 0; (i < qiov->niov) && (done != count); i++) {
+        if (skip >= qiov->iov[i].iov_len) {
+            /* Skip the whole iov */
+            skip -= qiov->iov[i].iov_len;
+            continue;
+        } else {
+            /* Skip only part (or nothing) of the iov */
+            iov_base = (uint8_t*) qiov->iov[i].iov_base + skip;
+            iov_len = qiov->iov[i].iov_len - skip;
+            skip = 0;
+        }
+
+        if (done + iov_len > count) {
+            memset(iov_base, c, count - done);
+            break;
+        } else {
+            memset(iov_base, c, iov_len);
+        }
+        done += iov_len;
+    }
+}
+
+
 #ifndef _WIN32
 /* Sets a specific flag */
 int fcntl_setfl(int fd, int flag)
diff --git a/qemu-common.h b/qemu-common.h
index c7ff280..cb4b7e0 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -322,6 +322,8 @@ void qemu_iovec_reset(QEMUIOVector *qiov);
 void qemu_iovec_to_buffer(QEMUIOVector *qiov, void *buf);
 void qemu_iovec_from_buffer(QEMUIOVector *qiov, const void *buf, size_t count);
 void qemu_iovec_memset(QEMUIOVector *qiov, int c, size_t count);
+void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count,
+size_t skip);
 
 struct Monitor;
 typedef struct Monitor Monitor;
-- 
1.7.2.3
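
To see what the "skip" variant does, here is a standalone sketch of the same idea on a plain POSIX struct iovec array instead of QEMU's QEMUIOVector. The name iov_memset_skip is hypothetical and the code is a simplified rewrite, not the function from the patch. It fills count bytes with c, starting skip bytes into the scatter/gather list, which is what is needed to zero only the part of a request that lies beyond the end of the base image.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec (POSIX) */

/*
 * Fill 'count' bytes with 'c', beginning 'skip' bytes into the list.
 * Bytes before 'skip' and after 'skip + count' are left untouched.
 */
static void iov_memset_skip(const struct iovec *iov, int niov,
                            int c, size_t count, size_t skip)
{
    size_t done = 0;
    int i;

    for (i = 0; i < niov && done < count; i++) {
        uint8_t *base;
        size_t len;

        if (skip >= iov[i].iov_len) {
            /* This element lies entirely inside the skipped prefix. */
            skip -= iov[i].iov_len;
            continue;
        }
        /* Skip only part (or nothing) of this element. */
        base = (uint8_t *)iov[i].iov_base + skip;
        len = iov[i].iov_len - skip;
        skip = 0;

        if (len > count - done) {
            len = count - done;     /* last, partial element */
        }
        memset(base, c, len);
        done += len;
    }
}
```

With two 4-byte buffers, skip = 2 and count = 3, the last two bytes of the first buffer and the first byte of the second buffer are overwritten.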




[Qemu-devel] [PATCH 11/11] qcow2: Fix order in L2 table COW

2011-02-10 Thread Kevin Wolf
When copying L2 tables (this happens only with internal snapshots), the order
wasn't completely safe, so that after a crash you could end up with a L2 table
that has too low refcount, possibly leading to corruption in the long run.

This patch puts the operations in the right order: First allocate the new
L2 table and replace the reference, and only then decrease the refcount of the
old table.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2-cluster.c |9 ++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 437aaa8..750abe3 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -515,13 +515,16 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
             return ret;
         }
     } else {
-        /* FIXME Order */
-        if (l2_offset)
-            qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
+        /* First allocate a new L2 table (and do COW if needed) */
         ret = l2_allocate(bs, l1_index, &l2_table);
         if (ret < 0) {
             return ret;
         }
+
+        /* Then decrease the refcount of the old table */
+        if (l2_offset) {
+            qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
+        }
         l2_offset = s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED;
     }
 
-- 
1.7.2.3
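
The ordering argument can be illustrated with a toy model. This is a sketch under loud assumptions: Image, alloc_cluster and cow_l2_table are hypothetical, refcounts live in a small in-memory array, and nothing here resembles the on-disk qcow2 layout. What it shows is the safe sequence from the commit message: allocate the new table, switch the reference, and only then drop the old refcount, so that a failure or crash in the middle never leaves a referenced table with too low a refcount.

```c
#include <errno.h>
#include <stdint.h>

#define NB_CLUSTERS 8

/* Toy image: per-cluster refcounts and one L1 entry naming the L2 cluster. */
typedef struct Image {
    uint16_t refcount[NB_CLUSTERS];
    int l1_entry;               /* 0 means "no L2 table yet" */
} Image;

/* Find a free cluster (cluster 0 is treated as reserved) and take a reference. */
static int alloc_cluster(Image *img)
{
    int i;

    for (i = 1; i < NB_CLUSTERS; i++) {
        if (img->refcount[i] == 0) {
            img->refcount[i] = 1;
            return i;
        }
    }
    return -ENOSPC;
}

static int cow_l2_table(Image *img)
{
    int old = img->l1_entry;
    int fresh;

    /* First allocate the new table; on failure the old state is intact. */
    fresh = alloc_cluster(img);
    if (fresh < 0) {
        return fresh;
    }
    /* Replace the reference. */
    img->l1_entry = fresh;

    /* Only then decrease the refcount of the old table. */
    if (old) {
        img->refcount[old]--;
    }
    return 0;
}
```

In the worst case this ordering leaks a cluster (a refcount that is too high), which is harmless compared to a refcount that is too low.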




Re: [Qemu-devel] [PATCH 0/2] target-arm: Fix VQMOV(U)N

2011-02-10 Thread Aurelien Jarno
On Wed, Feb 09, 2011 at 03:42:31PM +, Peter Maydell wrote:
 This patchset fixes the VQMOV(U)N instructions (saturating narrowing
 conversions). Tested by random generation of instructions for
 VQMOVN, VQMOVUN, VMOVN.
 
 Patch 1/2 is the same as the one Christophe sent recently but I have
 corrected the authorship (this patch is from the meego tree).
 
 Juha Riihimäki (1):
   target-arm: Fix VQMOVUN Neon instruction.
 
 Peter Maydell (1):
   target-arm: Fix 32 bit signed saturating narrow
 
  target-arm/helpers.h |3 ++
  target-arm/neon_helper.c |   65 
 +-
  target-arm/translate.c   |   28 +++
  3 files changed, 89 insertions(+), 7 deletions(-)
 
 

Thanks, both applied.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net



[Qemu-devel] [PATCH 01/11] Change snapshot_blkdev hmp to use correct argument type for device

2011-02-10 Thread Kevin Wolf
From: Jes Sorensen jes.soren...@redhat.com

Pointed out by Markus

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hmp-commands.hx |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 38e1eb7..372bef4 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -822,7 +822,7 @@ ETEXI
 
     {
         .name       = "snapshot_blkdev",
-        .args_type  = "device:s,snapshot_file:s?,format:s?",
+        .args_type  = "device:B,snapshot_file:s?,format:s?",
         .params     = "device [new-image-file] [format]",
         .help       = "initiates a live snapshot\n\t\t\t"
                       "of device. If a new image file is specified, the\n\t\t\t"
-- 
1.7.2.3




[Qemu-devel] [PATCH 04/11] qcow2: Fix error handling for reading compressed clusters

2011-02-10 Thread Kevin Wolf
When reading a compressed cluster failed, qcow2 falsely returned success.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Markus Armbruster arm...@redhat.com
---
 block/qcow2-cluster.c |4 ++--
 block/qcow2.c |4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 5fb8c66..437aaa8 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -878,11 +878,11 @@ int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset)
         BLKDBG_EVENT(bs->file, BLKDBG_READ_COMPRESSED);
         ret = bdrv_read(bs->file, coffset >> 9, s->cluster_data, nb_csectors);
         if (ret < 0) {
-            return -1;
+            return ret;
         }
         if (decompress_buffer(s->cluster_cache, s->cluster_size,
                               s->cluster_data + sector_offset, csize) < 0) {
-            return -1;
+            return -EIO;
         }
         s->cluster_cache_offset = coffset;
     }
diff --git a/block/qcow2.c b/block/qcow2.c
index 647c2a4..551b3c2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -497,8 +497,10 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
         }
     } else if (acb->cluster_offset & QCOW_OFLAG_COMPRESSED) {
         /* add AIO support for compressed blocks ? */
-        if (qcow2_decompress_cluster(bs, acb->cluster_offset) < 0)
+        ret = qcow2_decompress_cluster(bs, acb->cluster_offset);
+        if (ret < 0) {
             goto done;
+        }
 
         qemu_iovec_from_buffer(&acb->hd_qiov,
             s->cluster_cache + index_in_cluster * 512,
-- 
1.7.2.3




[Qemu-devel] [PATCH 08/11] qemu-img: Improve error messages for failed bdrv_open

2011-02-10 Thread Kevin Wolf
Output the error message string of the bdrv_open return code. Also set a
non-empty device name for the images because the unknown feature error message
includes it.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Anthony Liguori aligu...@us.ibm.com
---
 qemu-img.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 4a37358..7e3cc4c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -213,8 +213,9 @@ static BlockDriverState *bdrv_new_open(const char *filename,
     BlockDriverState *bs;
     BlockDriver *drv;
     char password[256];
+    int ret;
 
-    bs = bdrv_new("");
+    bs = bdrv_new("image");
 
     if (fmt) {
         drv = bdrv_find_format(fmt);
@@ -225,10 +226,13 @@ static BlockDriverState *bdrv_new_open(const char *filename,
     } else {
         drv = NULL;
     }
-    if (bdrv_open(bs, filename, flags, drv) < 0) {
-        error_report("Could not open '%s'", filename);
+
+    ret = bdrv_open(bs, filename, flags, drv);
+    if (ret < 0) {
+        error_report("Could not open '%s': %s", filename, strerror(-ret));
         goto fail;
     }
+
     if (bdrv_is_encrypted(bs)) {
         printf("Disk image '%s' is encrypted.\n", filename);
         if (read_password(password, sizeof(password)) < 0) {
-- 
1.7.2.3
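
The convention this patch relies on, returning a negative errno value and turning it into text with strerror(-ret), can be sketched as follows. open_image and format_open_error are hypothetical helpers written for this illustration; only fopen, errno, strerror and snprintf are real C library facilities.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Returns 0 on success or a negative errno value, in the style of bdrv_open. */
static int open_image(const char *filename)
{
    FILE *f = fopen(filename, "rb");

    if (!f) {
        return -errno;          /* capture errno right after the failing call */
    }
    fclose(f);
    return 0;
}

/* Build the improved message: the generic text plus the decoded reason. */
static int format_open_error(char *buf, size_t len,
                             const char *filename, int ret)
{
    return snprintf(buf, len, "Could not open '%s': %s",
                    filename, strerror(-ret));
}
```

A caller checks "ret < 0" once and passes the same value on to the formatter, so the reason for the failure is not lost between the two layers.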




Re: [Qemu-devel] Re: [PATCH 2/7] Enable I/O thread and VNC threads by default

2011-02-10 Thread Anthony Liguori

On 02/09/2011 06:35 PM, Aurelien Jarno wrote:
> On Tue, Feb 08, 2011 at 04:08:28PM +0100, Aurelien Jarno wrote:
>> Aurelien Jarno wrote:
>>> Paolo Bonzini wrote:
>>>> On 02/08/2011 12:15 PM, Aurelien Jarno wrote:
>>>>> however
>>>>> it should not be done ignoring all the *current* drawbacks of the
>>>>> iothread mode. We know them (at least for some of them), so let's try to
>>>>> solve them.
>>>>
>>>> Let's also enumerate them.
>>>
>>> From what I know:
>>> - performance regression in TCG mode
>>
>> I set up an x86_64 guest on an x86_64 host (Intel Xeon E5345). Nothing
>> was running except the standard daemons and the CPU governor was set to
>> performance on all CPUs. I then compared the network performance using
>> netperf in default mode, through a tap interface and a virtio nic. I got
>> the following results (quite reproducible, std below 0.5):
>> - without IO thread: 107.36 MB/s
>> - with IO thread: 89.93 MB/s
>
> And the same test on the code from September 2009:
> - without IO thread: 141.8 MB/s
   


virtio-net is super finicky regarding mitigation strategies and their 
relationship to the I/O thread.  Different benchmarks will behave 
differently.  virtio-blk is probably a better device to test as you'll 
get much more consistent results across different types of I/O patterns.


Regards,

Anthony Liguori





Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Anthony Liguori

On 02/10/2011 03:20 PM, Gleb Natapov wrote:
> Judging by how well all previous conversions went we will end up with one
> more way of creating devices. One legacy, another qdev and your new one.
> And what is the problem with qdev again (not that I am a big qdev fan)?
   


We've really been arguing about probably the most minor aspect of the 
problem with qdev.


All I'm really saying is that we shouldn't tie device construction to a 
factory interface as we do with qdev.


That simply means that we should be able to do:

RTC *rtc_create(arg1, arg2, arg3);

And that a separate piece of code decides which devices are exposed 
through -device or device_add.  Which devices are exposed is really a 
minor detail.
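
A rough sketch of that separation, with every name beyond rtc_create invented for illustration (the parameters base_year and irq, rtc_create_default, user_creatable and device_add_by_name are all assumptions, and none of this is qdev code): the device has an ordinary typed constructor, and a separate table decides what is reachable by name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical device state; the real RTC model has far more in it. */
typedef struct RTC {
    int base_year;
    int irq;
} RTC;

/* Plain constructor: board code can call this directly, with typed arguments. */
RTC *rtc_create(int base_year, int irq)
{
    RTC *rtc = malloc(sizeof(*rtc));

    if (rtc) {
        rtc->base_year = base_year;
        rtc->irq = irq;
    }
    return rtc;
}

/* Separate policy: which devices a user may create by name, and how. */
typedef void *(*create_default_fn)(void);

static void *rtc_create_default(void)
{
    return rtc_create(2000, 8);
}

static const struct {
    const char *name;
    create_default_fn create;
} user_creatable[] = {
    { "rtc", rtc_create_default },
};

/* Stand-in for what -device or device_add would consult. */
void *device_add_by_name(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(user_creatable) / sizeof(user_creatable[0]); i++) {
        if (strcmp(user_creatable[i].name, name) == 0) {
            return user_creatable[i].create();
        }
    }
    return NULL;    /* not exposed to the user */
}
```

Removing an entry from the table hides a device from the user without touching its constructor, which is the sense in which exposure is a minor detail.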


That said, qdev has a number of significant limitations in my mind.  The 
first is that the only relationship between devices is through the 
BusState interface.  I don't think we should even try to have a generic 
bus model.  When you look at how badly broken PCI hotplug currently is in 
qdev, I think it is symptomatic of this.


There's also no way in qdev to really have polymorphism.  Interfaces 
really aren't meaningful in qdev so you have things like PCIDevice where 
some methods are stored in the object instead of the class dispatch 
table and you have overuse of static class members.


And it's all unrelated to VMState.

And this is just the basic mechanisms of qdev.  The actual 
implementation is worse.  The use of qemu_irq as gpio in the base class 
and overuse of SystemBus is really quite insane.


And so far, the use of qdev has been entirely superficial.  Devices 
still don't make use of bus level interfaces to do I/O so we don't have 
any better componentization than we did before qdev.



> The fact that there is not enough interest to convert all devices to it?
   


I don't think there is any device that has been improved by qdev.  
-device is a nice feature, but it could have been implemented without qdev.


Regards,

Anthony Liguori


> How will the new way of doing things solve this?
>
> Just to be clear I do not have a problem with not having the ability to
> compose x86 without pit or kbd controller. Basic things like RTC, pit,
> pic, ioapic, dma, kbd should be created unconditionally as part of x86
> pc machine. But IMHO you are trying to take things to the other extreme.
>
> --
> Gleb.
   





[Qemu-devel] [PATCH] linux-user: fix compile failure if !CONFIG_USE_GUEST_BASE

2011-02-10 Thread Peter Maydell
If CONFIG_USE_GUEST_BASE is not defined, gcc complains:
 linux-user/mmap.c:235: error: comparison of unsigned expression >= 0 is always true

because RESERVED_VA is #defined to 0. Since mmap_find_vma_reserved()
will never be called anyway if RESERVED_VA is always 0, fix this by
simply #ifdef'ing away the function and its callsite.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
I'm not a great fan of introducing #ifdefs, but I couldn't come
up with a cleaner way of shutting gcc up...

 linux-user/mmap.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index abf21f6..0cf22f8 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -216,6 +216,7 @@ static abi_ulong mmap_next_start = TASK_UNMAPPED_BASE;
 
 unsigned long last_brk;
 
+#ifdef CONFIG_USE_GUEST_BASE
 /* Subroutine of mmap_find_vma, used when we have pre-allocated a chunk
of guest address space.  */
 static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size)
@@ -249,6 +250,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, 
abi_ulong size)
 mmap_next_start = addr;
 return last_addr;
 }
+#endif
 
 /*
  * Find and reserve a free memory area of size 'size'. The search
@@ -271,9 +273,11 @@ abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size)
 
 size = HOST_PAGE_ALIGN(size);
 
+#ifdef CONFIG_USE_GUEST_BASE
 if (RESERVED_VA) {
 return mmap_find_vma_reserved(start, size);
 }
+#endif
 
 addr = start;
 wrapped = repeat = 0;
-- 
1.7.1
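
The situation described in this patch can be reproduced in miniature. This is a sketch with invented names (DEMO_USE_RESERVED, DEMO_RESERVED_VA, find_in_reserved and find_vma are hypothetical, and the search logic is deliberately trivial): when the limit macro expands to the constant 0, any "unsigned >= limit" test inside the helper is always true, so the helper and its call site are compiled only when the limit can be non-zero.

```c
typedef unsigned long addr_t;

#ifdef DEMO_USE_RESERVED
static addr_t demo_reserved_va = 0x100000;
#define DEMO_RESERVED_VA demo_reserved_va
#else
#define DEMO_RESERVED_VA 0ul
#endif

#ifdef DEMO_USE_RESERVED
/*
 * Only meaningful when a region has been reserved. With the macro
 * defined as the constant 0, the comparison below would read
 * "unsigned value >= 0" and the compiler could warn about it.
 */
static addr_t find_in_reserved(addr_t start, addr_t size)
{
    if (start + size >= DEMO_RESERVED_VA) {
        return (addr_t)-1;      /* does not fit in the reserved region */
    }
    return start;
}
#endif

static addr_t find_vma(addr_t start, addr_t size)
{
#ifdef DEMO_USE_RESERVED
    if (DEMO_RESERVED_VA) {
        return find_in_reserved(start, size);
    }
#endif
    (void)size;
    return start;               /* generic search path, heavily simplified */
}
```

Built without DEMO_USE_RESERVED, neither the helper nor the always-true comparison reaches the compiler, which mirrors the effect of the two #ifdef blocks in the patch.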




[Qemu-devel] Re: [PATCH uq/master -v2 2/2] KVM, MCE, unpoison memory address across reboot

2011-02-10 Thread Huang Ying
On Wed, 2011-02-09 at 16:00 +0800, Jan Kiszka wrote:
> On 2011-02-09 04:00, Huang Ying wrote:
> > In Linux kernel HWPoison processing implementation, the virtual
> > address in processes mapping the error physical memory page is marked
> > as HWPoison.  So that, the further accessing to the virtual
> > address will kill corresponding processes with SIGBUS.
> >
> > If the error physical memory page is used by a KVM guest, the SIGBUS
> > will be sent to QEMU, and QEMU will simulate a MCE to report that
> > memory error to the guest OS.  If the guest OS can not recover from
> > the error (for example, the page is accessed by kernel code), guest OS
> > will reboot the system.  But because the underlying host virtual
> > address backing the guest physical memory is still poisoned, if the
> > guest system accesses the corresponding guest physical memory even
> > after rebooting, the SIGBUS will still be sent to QEMU and MCE will be
> > simulated.  That is, guest system can not recover via rebooting.
>
> Yeah, saw this already during my test...
>
> >
> > In fact, across rebooting, the contents of guest physical memory page
> > need not to be kept.  We can allocate a new host physical page to
> > back the corresponding guest physical address.
>
> I just wondering what would be architecturally suboptimal if we simply
> remapped on SIGBUS directly. Would save us at least the bookkeeping.

Because we can not silently change the contents of memory while the guest
OS is running. That could corrupt guest OS data structures and even ruin
disk contents.  But during rebooting, all guest OS state is discarded.

[snip]
> > @@ -1882,6 +1919,7 @@ int kvm_arch_on_sigbus_vcpu(CPUState *en
> >   hardware_memory_error();
> >   }
> >   }
> > +kvm_hwpoison_page_add(ram_addr);
> >
> >   if (code == BUS_MCEERR_AR) {
> >   /* Fake an Intel architectural Data Load SRAR UCR */
> > @@ -1926,6 +1964,7 @@ int kvm_arch_on_sigbus(int code, void *a
> >   "QEMU itself instead of guest system!: %p\n", addr);
> >   return 0;
> >   }
> > +kvm_hwpoison_page_add(ram_addr);
> >   kvm_mce_inj_srao_memscrub2(first_cpu, paddr);
> >   } else
> >   #endif
> >
> >
>
> Looks fine otherwise. Unless that simplification makes sense, I could
> offer to include this into my MCE rework (there is some minor conflict).
> If all goes well, that series should be posted during this week.

Thanks.

Best Regards,
Huang Ying
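
The bookkeeping discussed in this thread can be sketched roughly as below. Only the idea is taken from the mail (remember which guest RAM addresses were poisoned, and replace their backing when the guest resets); the names poison_list_add, poison_list_unpoison_all, remap_fn and count_remap are hypothetical, and the actual remapping of host memory is abstracted into a callback.

```c
#include <stdint.h>
#include <stdlib.h>

/* One entry per guest RAM page that received a memory error. */
typedef struct PoisonedPage {
    uint64_t ram_addr;
    struct PoisonedPage *next;
} PoisonedPage;

static PoisonedPage *poison_list;

/* Record a page on SIGBUS. Returns 1 if added, 0 if already known, -1 on OOM. */
static int poison_list_add(uint64_t ram_addr)
{
    PoisonedPage *p;

    for (p = poison_list; p; p = p->next) {
        if (p->ram_addr == ram_addr) {
            return 0;
        }
    }
    p = malloc(sizeof(*p));
    if (!p) {
        return -1;
    }
    p->ram_addr = ram_addr;
    p->next = poison_list;
    poison_list = p;
    return 1;
}

/* Callback that would give the page fresh host backing (assumption). */
typedef void (*remap_fn)(uint64_t ram_addr, void *opaque);

/* Example callback used for demonstration: it only counts invocations. */
static void count_remap(uint64_t ram_addr, void *opaque)
{
    (void)ram_addr;
    ++*(int *)opaque;
}

/* Run at guest reset, when old page contents no longer matter. */
static int poison_list_unpoison_all(remap_fn remap, void *opaque)
{
    int n = 0;

    while (poison_list) {
        PoisonedPage *p = poison_list;

        poison_list = p->next;
        remap(p->ram_addr, opaque);
        free(p);
        n++;
    }
    return n;
}
```

Deferring the remap to reset time is what avoids changing memory contents underneath a running guest.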




[Qemu-devel] Re: AHCI in SeaBIOS

2011-02-10 Thread Paolo Bonzini

On 02/10/2011 04:37 PM, Anthony Liguori wrote:
> We're right around the corner from -rc2 so let's wait until after 0.14
> is released so we have a full release cycle to test it.


And have unusable or only partly usable features (already mentioned 
multiple times: boot order, AHCI) in 0.14?


Paolo



[Qemu-devel] [PATCH 5/7] include qemu-thread.h early

2011-02-10 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Stefan Weil w...@mail.berlios.de
Cc: Blue Swirl blauwir...@gmail.com
---
 cpus.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/cpus.c b/cpus.c
index 4c9928e..68b3fcb 100644
--- a/cpus.c
+++ b/cpus.c
@@ -32,6 +32,7 @@
 #include "kvm.h"
 #include "exec-all.h"
 
+#include "qemu-thread.h"
 #include "cpus.h"
 #include "compatfd.h"
 #ifdef CONFIG_LINUX
@@ -319,8 +320,6 @@ void vm_stop(int reason)
 
 #else /* CONFIG_IOTHREAD */
 
-#include "qemu-thread.h"
-
 QemuMutex qemu_global_mutex;
 static QemuMutex qemu_fair_mutex;
 
 
-- 
1.7.3.5





[Qemu-devel] [PATCH 4/7] add win32 qemu-thread implementation

2011-02-10 Thread Paolo Bonzini
For now, qemu_cond_timedwait and qemu_mutex_timedlock are left as
POSIX-only functions.  They can be removed later, once the patches
that remove their uses are in.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Stefan Weil w...@mail.berlios.de
Cc: Blue Swirl blauwir...@gmail.com
---
 Makefile.objs|4 +-
 qemu-thread.c => qemu-thread-posix.c |0
 qemu-thread-posix.h  |   18 +++
 qemu-thread-win32.c  |  272 ++
 qemu-thread-win32.h  |   22 +++
 qemu-thread.h|   27 ++--
 6 files changed, 326 insertions(+), 17 deletions(-)
 rename qemu-thread.c => qemu-thread-posix.c (100%)
 create mode 100644 qemu-thread-posix.h
 create mode 100644 qemu-thread-win32.c
 create mode 100644 qemu-thread-win32.h

diff --git a/Makefile.objs b/Makefile.objs
index 353b1a8..19c31fc 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -140,8 +140,8 @@ endif
 common-obj-y += $(addprefix ui/, $(ui-obj-y))
 
 common-obj-y += iov.o acl.o
-common-obj-$(CONFIG_THREAD) += qemu-thread.o
-common-obj-$(CONFIG_IOTHREAD) += compatfd.o
+common-obj-$(CONFIG_POSIX) += qemu-thread-posix.o compatfd.o
+common-obj-$(CONFIG_WIN32) += qemu-thread-win32.o
 common-obj-y += notify.o event_notifier.o
 common-obj-y += qemu-timer.o qemu-timer-common.o
 
diff --git a/qemu-thread.c b/qemu-thread-posix.c
similarity index 100%
rename from qemu-thread.c
rename to qemu-thread-posix.c
diff --git a/qemu-thread-posix.h b/qemu-thread-posix.h
new file mode 100644
index 000..7af371c
--- /dev/null
+++ b/qemu-thread-posix.h
@@ -0,0 +1,18 @@
+#ifndef __QEMU_THREAD_POSIX_H
+#define __QEMU_THREAD_POSIX_H 1
+#include <pthread.h>
+
+struct QemuMutex {
+    pthread_mutex_t lock;
+};
+
+struct QemuCond {
+    pthread_cond_t cond;
+};
+
+struct QemuThread {
+    pthread_t thread;
+};
+
+void qemu_thread_signal(QemuThread *thread, int sig);
+#endif
diff --git a/qemu-thread-win32.c b/qemu-thread-win32.c
new file mode 100644
index 000..0465a9a
--- /dev/null
+++ b/qemu-thread-win32.c
@@ -0,0 +1,272 @@
+/*
+ * Win32 implementation for mutex/cond/thread functions
+ *
+ * Copyright Red Hat, Inc. 2010
+ *
+ * Author:
+ *  Paolo Bonzini pbonz...@redhat.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+#include "qemu-common.h"
+#include "qemu-thread.h"
+#include <process.h>
+#include <assert.h>
+#include <limits.h>
+
+static void error_exit(int err, const char *msg)
+{
+    char *pstr;
+
+    FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_ALLOCATE_BUFFER,
+                  NULL, err, 0, (LPTSTR)&pstr, 2, NULL);
+    fprintf(stderr, "qemu: %s: %s\n", msg, pstr);
+    LocalFree(pstr);
+    exit(1);
+}
+
+void qemu_mutex_init(QemuMutex *mutex)
+{
+    mutex->owner = 0;
+    InitializeCriticalSection(&mutex->lock);
+}
+
+void qemu_mutex_lock(QemuMutex *mutex)
+{
+    EnterCriticalSection(&mutex->lock);
+
+    /* Win32 CRITICAL_SECTIONs are recursive.  Assert that we're not
+     * using them as such.
+     */
+    assert (mutex->owner == 0);
+    mutex->owner = GetCurrentThreadId();
+}
+
+int qemu_mutex_trylock(QemuMutex *mutex)
+{
+    int owned;
+
+    owned = TryEnterCriticalSection(&mutex->lock);
+    if (owned) {
+        assert (mutex->owner == 0);
+        mutex->owner = GetCurrentThreadId();
+    }
+    return !owned;
+}
+
+void qemu_mutex_unlock(QemuMutex *mutex)
+{
+    assert (mutex->owner == GetCurrentThreadId());
+    mutex->owner = 0;
+    LeaveCriticalSection(&mutex->lock);
+}
+
+void qemu_cond_init(QemuCond *cond)
+{
+    cond->waiters = 0;
+    cond->was_broadcast = 0;
+
+    cond->sema = CreateSemaphore(NULL, 0, LONG_MAX, NULL);
+    if (!cond->sema) {
+        error_exit(GetLastError(), __func__);
+    }
+    cond->continue_broadcast = CreateEvent(NULL,    /* security */
+                                           FALSE,   /* auto-reset */
+                                           FALSE,   /* not signaled */
+                                           NULL);   /* name */
+    if (!cond->continue_broadcast) {
+        error_exit(GetLastError(), __func__);
+    }
+}
+
+void qemu_cond_signal(QemuCond *cond)
+{
+    /*
+     * Signal only when there are waiters.  cond->waiters is
+     * incremented by pthread_cond_wait under the external lock,
+     * so we are safe about that.
+     *
+     * Waiting threads decrement it outside the external lock, but
+     * only if another thread is executing pthread_cond_broadcast and
+     * has the mutex.  So, it also cannot be decremented concurrently
+     * with this particular access.
+     */
+    if (cond->waiters > 0) {
+        cond->waiters--;
+        if (!ReleaseSemaphore(cond->sema, 1, NULL)) {
+            error_exit(GetLastError(), __func__);
+        }
+    }
+}
+
+void qemu_cond_broadcast(QemuCond *cond)
+{
+    BOOLEAN result;
+    /*
+     * As in pthread_cond_signal, access to cond->waiters and
+ * 

[Qemu-devel] [PATCH 2/7] implement win32 dynticks timer

2011-02-10 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Stefan Weil w...@mail.berlios.de
Cc: Blue Swirl blauwir...@gmail.com
---
 qemu-timer.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index b0db780..42960de 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -1006,6 +1006,7 @@ static void win32_stop_timer(struct qemu_alarm_timer *t)
 static void win32_rearm_timer(struct qemu_alarm_timer *t)
 {
     struct qemu_alarm_win32 *data = t->priv;
+    int nearest_delta_ms;
 
     assert(alarm_has_dynticks(t));
     if (!active_timers[QEMU_CLOCK_REALTIME] &&
@@ -1015,7 +1016,10 @@ static void win32_rearm_timer(struct qemu_alarm_timer *t)
 
     timeKillEvent(data->timerId);
 
-    data->timerId = timeSetEvent(1,
+    nearest_delta_ms = (qemu_next_alarm_deadline() + 999999) / 1000000;
+    if (nearest_delta_ms < 1)
+        nearest_delta_ms = 1;
+    data->timerId = timeSetEvent(nearest_delta_ms,
                         data->period,
                         host_alarm_handler,
                         (DWORD)t,
-- 
1.7.3.5
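
The rounding step in win32_rearm_timer can be looked at in isolation.
What follows is a minimal sketch, not code from the patch: it assumes
the deadline is expressed in nanoseconds, and the helper name
ns_to_ms_ceil is made up for illustration.  It rounds up to whole
milliseconds and clamps the result to at least 1, so the timer is
never armed with a zero delay.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round a nanosecond delta up to whole
 * milliseconds and never return less than 1. */
static int64_t ns_to_ms_ceil(int64_t delta_ns)
{
    int64_t ms;

    assert(delta_ns >= 0);
    /* Adding 999999 before the integer division rounds up. */
    ms = (delta_ns + 999999) / 1000000;
    return ms < 1 ? 1 : ms;
}
```

Rounding up rather than down means the timer cannot fire before the
deadline; the cost is that it may fire up to one millisecond late.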





[Qemu-devel] [PATCH 6/7] add assertions on the owner of a QemuMutex

2011-02-10 Thread Paolo Bonzini
These are already present in the Win32 implementation, add them to
the pthread wrappers as well.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Stefan Weil w...@mail.berlios.de
Cc: Blue Swirl blauwir...@gmail.com
---
 qemu-thread-posix.c |   20 +++-
 qemu-thread-posix.h |1 +
 2 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/qemu-thread-posix.c b/qemu-thread-posix.c
index fbc78fe..e6cafd9 100644
--- a/qemu-thread-posix.c
+++ b/qemu-thread-posix.c
@@ -16,9 +16,12 @@
 #include <time.h>
 #include <signal.h>
 #include <stdint.h>
+#include <assert.h>
 #include <string.h>
 #include "qemu-thread.h"
 
+static pthread_t pthread_null;
+
 static void error_exit(int err, const char *msg)
 {
     fprintf(stderr, "qemu: %s: %s\n", msg, strerror(err));
@@ -29,6 +32,7 @@ void qemu_mutex_init(QemuMutex *mutex)
 {
     int err;
 
+    mutex->owner = pthread_null;
     err = pthread_mutex_init(&mutex->lock, NULL);
     if (err)
         error_exit(err, __func__);
@@ -48,13 +52,22 @@ void qemu_mutex_lock(QemuMutex *mutex)
     int err;
 
     err = pthread_mutex_lock(&mutex->lock);
+    assert (pthread_equal(mutex->owner, pthread_null));
+    mutex->owner = pthread_self();
     if (err)
         error_exit(err, __func__);
 }
 
 int qemu_mutex_trylock(QemuMutex *mutex)
 {
-    return pthread_mutex_trylock(&mutex->lock);
+    int err;
+    err = pthread_mutex_trylock(&mutex->lock);
+    if (err == 0) {
+        assert (pthread_equal(mutex->owner, pthread_null));
+        mutex->owner = pthread_self();
+    }
+
+    return !!err;
 }
 
 static void timespec_add_ms(struct timespec *ts, uint64_t msecs)
@@ -85,6 +98,8 @@ void qemu_mutex_unlock(QemuMutex *mutex)
 {
     int err;
 
+    assert (pthread_equal(mutex->owner, pthread_self()));
+    mutex->owner = pthread_null;
     err = pthread_mutex_unlock(&mutex->lock);
     if (err)
         error_exit(err, __func__);
@@ -130,7 +145,10 @@ void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex)
 {
     int err;
 
+    assert (pthread_equal(mutex->owner, pthread_self()));
+    mutex->owner = pthread_null;
     err = pthread_cond_wait(&cond->cond, &mutex->lock);
+    mutex->owner = pthread_self();
     if (err)
         error_exit(err, __func__);
 }
diff --git a/qemu-thread-posix.h b/qemu-thread-posix.h
index 7af371c..11978db 100644
--- a/qemu-thread-posix.h
+++ b/qemu-thread-posix.h
@@ -4,6 +4,7 @@
 
 struct QemuMutex {
     pthread_mutex_t lock;
+    pthread_t owner;
 };
 
 struct QemuCond {
-- 
1.7.3.5
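
For readers who want to try the owner-tracking idea outside QEMU, here
is a small standalone sketch.  It is not the patch's code: the
OwnedMutex type and the owned_mutex_* names are hypothetical, and it
uses an explicit "held" flag instead of comparing against a
zero-initialized pthread_t, since POSIX does not promise that an
all-zero pthread_t is an invalid thread id.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical wrapper type, not QEMU's QemuMutex. */
typedef struct {
    pthread_mutex_t lock;
    pthread_t owner;   /* only meaningful while held != 0 */
    int held;
} OwnedMutex;

static void check(int err)
{
    if (err) {
        abort();
    }
}

static void owned_mutex_init(OwnedMutex *m)
{
    m->held = 0;
    check(pthread_mutex_init(&m->lock, NULL));
}

static void owned_mutex_lock(OwnedMutex *m)
{
    check(pthread_mutex_lock(&m->lock));
    /* We now hold the lock, so nobody else may be recorded as owner. */
    assert(!m->held);
    m->owner = pthread_self();
    m->held = 1;
}

/* Returns 0 if the lock was taken, 1 otherwise. */
static int owned_mutex_trylock(OwnedMutex *m)
{
    int err = pthread_mutex_trylock(&m->lock);

    if (err == 0) {
        assert(!m->held);
        m->owner = pthread_self();
        m->held = 1;
    }
    return !!err;
}

static void owned_mutex_unlock(OwnedMutex *m)
{
    /* Only the recorded owner may unlock. */
    assert(m->held && pthread_equal(m->owner, pthread_self()));
    m->held = 0;
    check(pthread_mutex_unlock(&m->lock));
}
```

As in the patch, the owner is cleared before the underlying unlock, so
the field is only ever written by the thread that holds the lock.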





[Qemu-devel] [PATCH 7/7] remove CONFIG_THREAD

2011-02-10 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Stefan Weil w...@mail.berlios.de
Cc: Blue Swirl blauwir...@gmail.com
---
 configure |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 598e8e1..46a6389 100755
--- a/configure
+++ b/configure
@@ -2609,7 +2609,6 @@ if test "$vnc_png" != "no" ; then
 fi
 if test "$vnc_thread" != "no" ; then
   echo "CONFIG_VNC_THREAD=y" >> $config_host_mak
-  echo "CONFIG_THREAD=y" >> $config_host_mak
 fi
 if test "$fnmatch" = "yes" ; then
   echo "CONFIG_FNMATCH=y" >> $config_host_mak
@@ -2696,7 +2695,6 @@ if test "$xen" = "yes" ; then
 fi
 if test "$io_thread" = "yes" ; then
   echo "CONFIG_IOTHREAD=y" >> $config_host_mak
-  echo "CONFIG_THREAD=y" >> $config_host_mak
 fi
 if test "$linux_aio" = "yes" ; then
   echo "CONFIG_LINUX_AIO=y" >> $config_host_mak
-- 
1.7.3.5




[Qemu-devel] [PATCH 1/7] unlock iothread during WaitForMultipleObjects

2011-02-10 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Cc: Stefan Weil w...@mail.berlios.de
Cc: Blue Swirl blauwir...@gmail.com
---
 os-win32.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/os-win32.c b/os-win32.c
index 566d5e9..d907c59 100644
--- a/os-win32.c
+++ b/os-win32.c
@@ -140,7 +140,9 @@ void os_host_main_loop_wait(int *timeout)
     int err;
     WaitObjects *w = &wait_objects;
 
+    qemu_mutex_unlock_iothread();
     ret = WaitForMultipleObjects(w->num, w->events, FALSE, *timeout);
+    qemu_mutex_lock_iothread();
     if (WAIT_OBJECT_0 + 0 <= ret && ret <= WAIT_OBJECT_0 + w->num - 1) {
         if (w->func[ret - WAIT_OBJECT_0])
             w->func[ret - WAIT_OBJECT_0](w->opaque[ret - WAIT_OBJECT_0]);
-- 
1.7.3.5





Re: [Qemu-devel] [PATCH 2/7] Enable I/O thread and VNC threads by default

2011-02-10 Thread Paolo Bonzini

On 02/09/2011 11:16 PM, Stefan Weil wrote:


I decided to create a new directory structure hosts/w32, so files can
be moved from the root to hosts/posix, hosts/w32, or hosts/xxx.
Include chains reduce code modifications and conditional compilations.
And people who don't want to see w32 support can remove it easily :-)

Supporting I/O threads for W32 will be possible, too.


I have patches for Win32 iothread, I'm just posting the series split 
into multiple pieces.


Paolo



Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Peter Maydell
On 10 February 2011 08:36, Anthony Liguori <anth...@codemonkey.ws> wrote:
> On 02/10/2011 09:16 AM, Peter Maydell wrote:
>> On 10 February 2011 07:47, Anthony Liguori <anth...@codemonkey.ws> wrote:
>>> 2) get rid of the entire concept of machines.  Creating a i440fx is
>>> essentially equivalent to creating a bare machine.
>>
>> Does that make any sense for anything other than target-i386?
>> The concept of a machine model seems a pretty obvious one
>> for ARM boards, for instance, and I'm not sure we'd gain much
>> by having i386 be different to the other architectures...
>
> Yes, it makes a lot of sense, I just don't know the component names as well
> so bear with me :-)
>
> There are two types of Versatile machines today, Versatile/AB and
> Versatile/PB.  They are both made with the same core, ARM926EJ-S, with
> different expansions.
>
> So you would model arm926ej-s as the chipset and then build up the machines
> by modifying parameters of the chipset (like the board id) and/or adding
> different components on top of it.

Er, ARM926 is the CPU, it's not a chipset. The board ID is definitely
not a property of an ARM926, it's a property of the board (clue is in
the name :-)). I don't think versatile boards have a chipset really...

In my understanding the machine is the thing that says "I need a
926, and an MMC controller at this address, and some UARTS,
and..." ie it is the thing that does the "modifying parameters"
and "adding different components". So if we'd still be doing that
I don't see how we've got rid of the concept. I guess I'm missing
the point somehow.

> A good way to think about what I'm proposing is that machine->init really
> should be a constructor for a device object.

If you mean that you want machines to be implemented under the
hood as a single huge device you can only have one of that spans
the entire memory map, well I guess that's an implementation
detail. But conceptually machines really do exist, and we definitely
still want users to be able to say "I want a beagle machine; I want
a versatile; I want an n900".

-- PMM



Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Gleb Natapov
On Thu, Feb 10, 2011 at 08:47:12AM +0100, Anthony Liguori wrote:
 On 02/09/2011 09:15 PM, Blue Swirl wrote:
 On Wed, Feb 9, 2011 at 9:59 PM, Anthony Liguorianth...@codemonkey.ws  
 wrote:
 On 02/09/2011 06:48 PM, Blue Swirl wrote:
 ISASerialState dev;
 
 isa_serial_init(dev, 0, 0x274, 0x07, NULL, NULL);
 
 Do you mean that there should be a generic way of doing that, like
 sysbus_create_varargs() for qdev, or just add inline functions which
 hide qdev property setup?
 
 I still think that FDT should be used in the future. That would
 require that the properties can be set up mechanically, and I don't
 see how your proposal would help that.
 
 Yeah, I don't think that is a good idea anymore.  I think this is part of
 why we're having so many problems with qdev.
 
 While (most?) hardware hierarchies can be represented by device tree syntax,
 not all valid device trees correspond to interface and/or useful hardware
 hierarchies.
 User creates a non-working machine and so gets to fix the problems?
 How is that a problem for us?
 
 It's not about creating a non-working machine.  It's about what
 user-level abstraction we need to provide.
 
 It's a whole lot easier to implement an i440fx device with a fixed
 set of parameters than it is to make every possible subdevice have a
 proper factory interface along with mechanisms to hook everything
 together.
 
So what if it is easier? That doesn't mean it is the correct thing to do.
What you are proposing is just a huge step backwards. Maybe we shouldn't
support hooking everything together in completely arbitrary ways, but we
shouldn't force isa/pci devices upon our users just because they are
non-removable on a real chip.

 Basically, we're making things much harder for ourselves than we should.
 
 We want to have an interface to create large chunks of hardware (like an
 i440fx) which then results in a significant portion of a device tree.
 But how would this affect interface to devices? I don't see how that
 would be any different with current model and the function call model.
 
 If all composition is done through a factory interface, it doesn't.
 But my main argument here is that we shouldn't try to make all
 composition done through a factory interface--only where it makes
 sense.
 
 So very concretely, I'm suggesting we do the following to target-i386:
 
 1) make the i440fx device have an embedded ide controller, piix3,
 and usb controller that get initialized automatically.  The piix3
 embeds the PCI-to-ISA bridge along with all of the default ISA
 devices (rtc, serial, etc.).
This may be a problem even from a security point of view. What if the usb
code (or ide, serial, parallel) has a guest-exploitable bug? Currently I can
happily continue running guests if they do not need the affected subsystem.
If we do it your way I will no longer be able to do so.

 
 2) get rid of the entire concept of machines.  Creating a i440fx is
 essentially equivalent to creating a bare machine.
 
 3) just use the existing -device infrastructure to support all of
 this.  A very simple device config corresponds to a very complex
 device tree but that's the desired effect.
 
 4) model the CPUs as devices that take a pointer to a host
 controller, for x86, the normal case would be giving it a pointer to
 i440fx.
 
 Regards,
 
 Anthony Liguori
 

--
Gleb.



[Qemu-devel] [PATCH 01/18] Make QEMUFile buf expandable, and introduce qemu_realloc_buffer() and qemu_clear_buffer().

2011-02-10 Thread Yoshiaki Tamura
Currently buf size is fixed at 32KB.  It would be useful if it could
be flexible.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 hw/hw.h  |2 ++
 savevm.c |   20 +++-
 2 files changed, 21 insertions(+), 1 deletions(-)

diff --git a/hw/hw.h b/hw/hw.h
index 5e24329..a168a37 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -58,6 +58,8 @@ void qemu_fflush(QEMUFile *f);
 int qemu_fclose(QEMUFile *f);
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
 void qemu_put_byte(QEMUFile *f, int v);
+void *qemu_realloc_buffer(QEMUFile *f, int size);
+void qemu_clear_buffer(QEMUFile *f);
 
 static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
 {
diff --git a/savevm.c b/savevm.c
index 6d83b0f..6c4c72b 100644
--- a/savevm.c
+++ b/savevm.c
@@ -171,7 +171,8 @@ struct QEMUFile {
                            when reading */
     int buf_index;
     int buf_size; /* 0 when writing */
-    uint8_t buf[IO_BUF_SIZE];
+    int buf_max_size;
+    uint8_t *buf;
 
     int has_error;
 };
@@ -422,6 +423,9 @@ QEMUFile *qemu_fopen_ops(void *opaque, QEMUFilePutBufferFunc *put_buffer,
     f->get_rate_limit = get_rate_limit;
     f->is_write = 0;
 
+    f->buf_max_size = IO_BUF_SIZE;
+    f->buf = qemu_malloc(sizeof(uint8_t) * f->buf_max_size);
+
     return f;
 }
 
@@ -452,6 +456,19 @@ void qemu_fflush(QEMUFile *f)
     }
 }
 
+void *qemu_realloc_buffer(QEMUFile *f, int size)
+{
+    f->buf_max_size = size;
+    f->buf = qemu_realloc(f->buf, f->buf_max_size);
+
+    return f->buf;
+}
+
+void qemu_clear_buffer(QEMUFile *f)
+{
+    f->buf_size = f->buf_index = f->buf_offset = 0;
+}
+
 static void qemu_fill_buffer(QEMUFile *f)
 {
     int len;
@@ -477,6 +494,7 @@ int qemu_fclose(QEMUFile *f)
     qemu_fflush(f);
     if (f->close)
         ret = f->close(f->opaque);
+    qemu_free(f->buf);
     qemu_free(f);
     return ret;
 }
-- 
1.7.1.2
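
The move from a fixed array to a heap buffer with a separate capacity
field can be sketched outside QEMU as below.  This is only an
illustration under assumptions: GrowBuf and the growbuf_* functions
are hypothetical names, and plain malloc/realloc are used, so the
sketch has to check for allocation failure itself and keep the old
pointer on failure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the buffer fields of QEMUFile. */
typedef struct {
    uint8_t *buf;
    int buf_index;     /* read/write cursor */
    int buf_size;      /* bytes of valid data */
    int buf_max_size;  /* allocated capacity */
} GrowBuf;

static int growbuf_init(GrowBuf *b, int initial)
{
    b->buf = malloc((size_t)initial);
    if (!b->buf) {
        return -1;
    }
    b->buf_index = 0;
    b->buf_size = 0;
    b->buf_max_size = initial;
    return 0;
}

/* Change the capacity.  Refuses to shrink below the valid data and
 * leaves the buffer untouched if realloc fails. */
static int growbuf_resize(GrowBuf *b, int size)
{
    uint8_t *p;

    if (size <= 0 || size < b->buf_size) {
        return -1;
    }
    p = realloc(b->buf, (size_t)size);
    if (!p) {
        return -1;
    }
    b->buf = p;
    b->buf_max_size = size;
    return 0;
}

/* Forget buffered data without releasing the allocation. */
static void growbuf_clear(GrowBuf *b)
{
    b->buf_index = 0;
    b->buf_size = 0;
}

static void growbuf_free(GrowBuf *b)
{
    free(b->buf);
    b->buf = NULL;
    b->buf_max_size = 0;
}
```

Assigning the realloc result to a temporary first is what keeps the
original allocation reachable when the call fails.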




[Qemu-devel] Re: [PATCH 6/7] add assertions on the owner of a QemuMutex

2011-02-10 Thread Jan Kiszka
On 2011-02-10 18:37, Paolo Bonzini wrote:
> These are already present in the Win32 implementation, add them to
> the pthread wrappers as well.

Better use PTHREAD_MUTEX_ERRORCHECK.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Qemu-devel] [PATCH 17/18] migration-tcp: modify tcp_accept_incoming_migration() to handle ft_mode, and add a hack not to close fd when ft_mode is enabled.

2011-02-10 Thread Yoshiaki Tamura
When ft_mode is set in the header, tcp_accept_incoming_migration()
sets ft_trans_incoming() as a callback, and calls
qemu_file_get_notify() to receive FT transactions iteratively.  We also
need a hack not to close the fd before moving to ft_transaction mode, so
that we can reuse the fd for it.  vm_change_state_handler is added to
turn off ft_mode when cont is pressed.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 migration-tcp.c |   67 ++-
 1 files changed, 66 insertions(+), 1 deletions(-)

diff --git a/migration-tcp.c b/migration-tcp.c
index 55777c8..84076d6 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -18,6 +18,8 @@
 #include "sysemu.h"
 #include "buffered_file.h"
 #include "block.h"
+#include "ft_trans_file.h"
+#include "event-tap.h"
 
 //#define DEBUG_MIGRATION_TCP
 
@@ -29,6 +31,8 @@
     do { } while (0)
 #endif
 
+static VMChangeStateEntry *vmstate;
+
 static int socket_errno(FdMigrationState *s)
 {
     return socket_error();
@@ -56,7 +60,8 @@ static int socket_read(FdMigrationState *s, const void * buf, size_t size)
 static int tcp_close(FdMigrationState *s)
 {
     DPRINTF("tcp_close\n");
-    if (s->fd != -1) {
+    /* FIX ME: accessing ft_mode here isn't clean */
+    if (s->fd != -1 && ft_mode != FT_INIT) {
         close(s->fd);
         s->fd = -1;
     }
@@ -150,6 +155,36 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon,
     return &s->mig_state;
 }
 
+static void ft_trans_incoming(void *opaque)
+{
+    QEMUFile *f = opaque;
+
+    qemu_file_get_notify(f);
+    if (qemu_file_has_error(f)) {
+        ft_mode = FT_ERROR;
+        qemu_fclose(f);
+    }
+}
+
+static void ft_trans_reset(void *opaque, int running, int reason)
+{
+    QEMUFile *f = opaque;
+
+    if (running) {
+        if (ft_mode != FT_ERROR) {
+            qemu_fclose(f);
+        }
+        ft_mode = FT_OFF;
+        qemu_del_vm_change_state_handler(vmstate);
+    }
+}
+
+static void ft_trans_schedule_replay(QEMUFile *f)
+{
+    event_tap_schedule_replay();
+    vmstate = qemu_add_vm_change_state_handler(ft_trans_reset, f);
+}
+
 static void tcp_accept_incoming_migration(void *opaque)
 {
     struct sockaddr_in addr;
@@ -175,8 +210,38 @@ static void tcp_accept_incoming_migration(void *opaque)
         goto out;
     }
 
+    if (ft_mode == FT_INIT) {
+        autostart = 0;
+    }
+
     process_incoming_migration(f);
+
+    if (ft_mode == FT_INIT) {
+        int ret;
+
+        socket_set_nodelay(c);
+
+        f = qemu_fopen_ft_trans(s, c);
+        if (f == NULL) {
+            fprintf(stderr, "could not qemu_fopen_ft_trans\n");
+            goto out;
+        }
+
+        /* need to wait sender to setup */
+        ret = qemu_ft_trans_begin(f);
+        if (ret < 0) {
+            goto out;
+        }
+
+        qemu_set_fd_handler2(c, NULL, ft_trans_incoming, NULL, f);
+        ft_trans_schedule_replay(f);
+        ft_mode = FT_TRANSACTION_RECV;
+
+        return;
+    }
+
     qemu_fclose(f);
+
 out:
     close(c);
 out2:
-- 
1.7.1.2




[Qemu-devel] [PATCH 10/18] Call init handler of event-tap at main() in vl.c.

2011-02-10 Thread Yoshiaki Tamura
Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 vl.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/vl.c b/vl.c
index 00155fb..f4d4abf 100644
--- a/vl.c
+++ b/vl.c
@@ -162,6 +162,7 @@ int main(int argc, char **argv)
 #include "qemu-queue.h"
 #include "cpus.h"
 #include "arch_init.h"
+#include "event-tap.h"
 
 #include "ui/qemu-spice.h"
 
@@ -2919,6 +2920,8 @@ int main(int argc, char **argv, char **envp)
 
     blk_mig_init();
 
+    event_tap_init();
+
     /* open the virtual block devices */
     if (snapshot)
         qemu_opts_foreach(qemu_find_opts("drive"), drive_enable_snapshot, NULL, 0);
-- 
1.7.1.2




[Qemu-devel] [PATCH 06/18] virtio: decrement last_avail_idx with inuse before saving.

2011-02-10 Thread Yoshiaki Tamura
For regular migration inuse == 0 always as requests are flushed before
save. However, event-tap log when enabled introduces an extra queue
for requests which is not being flushed, thus the last inuse requests
are left in the event-tap queue.  Move the last_avail_idx value sent
to the remote back to make it repeat the last inuse requests.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 hw/virtio.c |   10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/hw/virtio.c b/hw/virtio.c
index 31bd9e3..f05d1b6 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -673,12 +673,20 @@ void virtio_save(VirtIODevice *vdev, QEMUFile *f)
     qemu_put_be32(f, i);
 
     for (i = 0; i < VIRTIO_PCI_QUEUE_MAX; i++) {
+        /* For regular migration inuse == 0 always as
+         * requests are flushed before save. However,
+         * event-tap log when enabled introduces an extra
+         * queue for requests which is not being flushed,
+         * thus the last inuse requests are left in the event-tap queue.
+         * Move the last_avail_idx value sent to the remote back
+         * to make it repeat the last inuse requests. */
+        uint16_t last_avail = vdev->vq[i].last_avail_idx - vdev->vq[i].inuse;
         if (vdev->vq[i].vring.num == 0)
             break;
 
         qemu_put_be32(f, vdev->vq[i].vring.num);
         qemu_put_be64(f, vdev->vq[i].pa);
-        qemu_put_be16s(f, &vdev->vq[i].last_avail_idx);
+        qemu_put_be16s(f, &last_avail);
         if (vdev->binding->save_queue)
             vdev->binding->save_queue(vdev->binding_opaque, i, f);
     }
-- 
1.7.1.2
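
The subtraction in this patch relies on 16-bit wraparound: ring
indices are free-running uint16_t counters, so moving the index back
by the number of in-flight requests has to be done modulo 65536.  A
minimal standalone sketch of just that arithmetic (the function name
rewind_avail_idx is hypothetical, not part of virtio.c):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: move a free-running 16-bit ring index back by
 * the number of requests still in flight.  The operands are promoted
 * to int for the subtraction; the cast back to uint16_t reduces the
 * result modulo 65536, which is the intended wraparound. */
static uint16_t rewind_avail_idx(uint16_t last_avail_idx, uint16_t inuse)
{
    return (uint16_t)(last_avail_idx - inuse);
}
```

For example, an index of 2 with 5 requests in flight rewinds to 65533,
and advancing that by 5 again wraps back to 2.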




[Qemu-devel] [PATCH 05/18] vl.c: add deleted flag for deleting the handler.

2011-02-10 Thread Yoshiaki Tamura
Make deleting handlers robust against deletion of any elements in a
handler by using a deleted flag like in file descriptors.

Signed-off-by: Yoshiaki Tamura tamura.yoshi...@lab.ntt.co.jp
---
 vl.c |   13 +
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/vl.c b/vl.c
index ed2cdfa..00155fb 100644
--- a/vl.c
+++ b/vl.c
@@ -1158,6 +1158,7 @@ static void nographic_update(void *opaque)
 struct vm_change_state_entry {
     VMChangeStateHandler *cb;
     void *opaque;
+    int deleted;
     QLIST_ENTRY (vm_change_state_entry) entries;
 };
 
@@ -1178,8 +1179,7 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler(VMChangeStateHandler *cb,
 
 void qemu_del_vm_change_state_handler(VMChangeStateEntry *e)
 {
-    QLIST_REMOVE (e, entries);
-    qemu_free (e);
+    e->deleted = 1;
 }
 
 void vm_state_notify(int running, int reason)
@@ -1188,8 +1188,13 @@ void vm_state_notify(int running, int reason)
 
     trace_vm_state_notify(running, reason);
 
-    for (e = vm_change_state_head.lh_first; e; e = e->entries.le_next) {
-        e->cb(e->opaque, running, reason);
+    QLIST_FOREACH(e, &vm_change_state_head, entries) {
+        if (e->deleted) {
+            QLIST_REMOVE(e, entries);
+            qemu_free(e);
+        } else {
+            e->cb(e->opaque, running, reason);
+        }
     }
 }
 }
 
-- 
1.7.1.2
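
The deferred-deletion idea can be tried in isolation with the sketch
below.  It is written against a hand-rolled singly linked list rather
than QEMU's QLIST macros, and every name in it (Entry, handler_add,
handler_del, handlers_notify, count_cb) is hypothetical.  One detail
that seems worth care in any such loop: an entry's successor has to be
read before the entry is freed, which the sketch does by unlinking
through a pointer to the previous link.

```c
#include <assert.h>
#include <stdlib.h>

typedef void Handler(void *opaque, int running);

/* Hypothetical handler list entry. */
typedef struct Entry {
    Handler *cb;
    void *opaque;
    int deleted;          /* set by handler_del, reaped on notify */
    struct Entry *next;
} Entry;

static Entry *handlers;

static Entry *handler_add(Handler *cb, void *opaque)
{
    Entry *e = calloc(1, sizeof(*e));

    if (!e) {
        return NULL;
    }
    e->cb = cb;
    e->opaque = opaque;
    e->next = handlers;
    handlers = e;
    return e;
}

/* Safe to call from inside a callback: only marks the entry. */
static void handler_del(Entry *e)
{
    e->deleted = 1;
}

static void handlers_notify(int running)
{
    Entry **link = &handlers;

    while (*link) {
        Entry *e = *link;

        if (e->deleted) {
            *link = e->next;   /* unlink first, using the successor */
            free(e);           /* then release the entry */
        } else {
            e->cb(e->opaque, running);
            link = &e->next;
        }
    }
}

/* Example callback: counts how often it was invoked. */
static void count_cb(void *opaque, int running)
{
    (void)running;
    ++*(int *)opaque;
}
```

An entry that marks itself deleted from its own callback stays linked
until the next notification pass, where it is unlinked and freed
without being called again.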




[Qemu-devel] [PULL 0.14] linux-user fixes

2011-02-10 Thread Riku Voipio
The following changes since commit 343c1de916b1841cd5fd5f813add9c87590d72e8:

  x86: Fix MCA broadcast parameters for TCG case (2011-02-08 12:37:30 +0100)

are available in the git repository at:
  git://gitorious.org/qemu-maemo/qemu.git linux-user-for-0.14

Martin Mohring (1):
  linux-user: fix for loopmount ioctl

Stefan Weil (1):
  linux-user: Fix possible realloc memory leak

 linux-user/elfload.c |8 +---
 linux-user/ioctls.h  |2 --
 2 files changed, 5 insertions(+), 5 deletions(-)




Re: [Qemu-devel] [PATCH 2/7] Enable I/O thread and VNC threads by default

2011-02-10 Thread Paolo Bonzini

On 02/09/2011 11:16 PM, Stefan Weil wrote:

The patch is available here:
http://repo.or.cz/w/qemu/ar7.git/commitdiff/aabf11dc0a938b84d76d7c147cbf0445d7bee297



diff --git a/hosts/w32/include/signal.h b/hosts/w32/include/signal.h
new file mode 100644
index 000..e45f03c
--- /dev/null
+++ b/hosts/w32/include/signal.h
@@ -0,0 +1,20 @@
+/*
+ * QEMU w32 support
+ *
+ * Copyright (C) 2011 Stefan Weil
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef WIN32_SIGNAL_H
+#define WIN32_SIGNAL_H
+
+#include_next <signal.h>
+#include <sys/types.h>    /* sigset_t */
+
+int pthread_sigmask(int how, const sigset_t *set, sigset_t *oldset);
+int sigfillset(sigset_t *set);
+
+#endif /* WIN32_SIGNAL_H */
diff --git a/hosts/w32/include/time.h b/hosts/w32/include/time.h
new file mode 100644
index 000..0b997d3
--- /dev/null
+++ b/hosts/w32/include/time.h
@@ -0,0 +1,31 @@
+/*
+ * QEMU w32 support
+ *
+ * Copyright (C) 2011 Stefan Weil
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#if !defined(W32_TIME_H)
+#define W32_TIME_H
+
+#include_next <time.h>
+
+#ifndef HAVE_STRUCT_TIMESPEC
+#define HAVE_STRUCT_TIMESPEC 1
+struct timespec {
+long tv_sec;
+long tv_nsec;
+};
+#endif /* HAVE_STRUCT_TIMESPEC */
+
+typedef enum {
+  CLOCK_REALTIME = 0
+} clockid_t;
+
+int clock_getres (clockid_t clock_id, struct timespec *res);
+int clock_gettime(clockid_t clock_id, struct timespec *pTimespec);
+
+#endif /* W32_TIME_H */
diff --git a/os-win32.c b/os-win32.c
index b214e6a..7778366 100644
--- a/os-win32.c
+++ b/os-win32.c
@@ -36,6 +36,45 @@
 /***/
 /* Functions missing in mingw */

+#if defined(CONFIG_THREAD)
+
+int clock_gettime(clockid_t clock_id, struct timespec *pTimespec)
+{
+  int result = 0;
+  if (clock_id == CLOCK_REALTIME && pTimespec != 0) {
+    DWORD t = GetTickCount();
+    const unsigned cps = 1000;
+    struct timespec ts;
+    ts.tv_sec  = t / cps;
+    ts.tv_nsec = (t % cps) * (1000000000UL / cps);
+    *pTimespec = ts;
+  } else {
+    errno = EINVAL;
+    result = -1;
+  }
+  return result;
+}


Why is this needed?  The only user of clock_gettime in the POSIX case is 
using CLOCK_MONOTONIC, and actually has a Win32 version already.



+int pthread_sigmask(int how, const sigset_t *set, sigset_t *oldset)
+{
+/* Dummy, do nothing. */
+return EINVAL;
+}
+
+int sigfillset(sigset_t *set)
+{
+int result = 0;
+if (set) {
+*(set) = (sigset_t)(-1);
+} else {
+errno = EINVAL;
+result = -1;
+}
+return result;
+}


Instead of these, it's better to provide a Win32 implementation of 
mutexes and condvars.  I'll submit it next week hopefully.


Paolo
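
As a side note on the quoted clock_gettime replacement: the only
arithmetic in it is the split of a millisecond tick count into seconds
and nanoseconds.  A portable sketch of that conversion, with the tick
count passed in as a parameter so that no Win32 call is needed (the
name ms_to_timespec is hypothetical):

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: split a millisecond count into the seconds
 * and nanoseconds fields of a struct timespec. */
static struct timespec ms_to_timespec(unsigned long ms)
{
    struct timespec ts;

    ts.tv_sec = (time_t)(ms / 1000);
    /* Remainder is below 1000, so the product fits in a long. */
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    return ts;
}
```

Note that a tick count measures time since boot, so a value built this
way is not wall-clock time even though the quoted code accepts only
CLOCK_REALTIME.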



Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Gleb Natapov
On Thu, Feb 10, 2011 at 11:00:50AM +0100, Anthony Liguori wrote:
 On 02/10/2011 10:07 AM, Gleb Natapov wrote:
 So what if it is easier, it doesn't mean it is correct thing to do.
 
 If we spend the next 10 years trying to do the correct thing for
 some arbitrary definition of correct, that's not terribly useful.
Changing direction by 180 degrees every 2 years is even less useful.

 
 It's really simple actually.  Let's do the least clever thing and
 model how hardware actual works.  Once we have that, we can try to
 be better than real hardware (if it's possible).
I think our understanding of how HW actually works is very different.
You are placing too much value on where a device resides physically; for
me it is a completely unimportant detail, not even worth mentioning.

 
 
 If all composition is done through a factory interface, it doesn't.
 But my main argument here is that we shouldn't try to make all
 composition done through a factory interface--only where it makes
 sense.
 
 So very concretely, I'm suggesting we do the following to target-i386:
 
 1) make the i440fx device have an embedded ide controller, piix3,
 and usb controller that get initialized automatically.  The piix3
 embeds the PCI-to-ISA bridge along with all of the default ISA
 devices (rtc, serial, etc.).
 This may be a problem even from security point of view. What if usb code
 (ide, serial, parallel) has guest exploitable bug? Currently I can happily
 continue running guests if they do not need affected subsystem. If we'll
 get it your way I will no longer be able to do so.
 
 qemu -device i440fx,ide=off
 
So you still need to support arbitrary composition. What's the
difference? Why do you prefer -device i440fx over what we have now?
In current terms, what you propose would be implemented by using the
i440fx machine type. Qdev will build it for you.

 If you really care to do this.  But this desire to remove devices is
 silly IMHO.  Concerns about security are misplaced.  If you have to
 change the way a guest is invoked in order to eliminate security
 problems, then there's something seriously wrong.
 
No, I do not.  I do not create guests with unneeded devices from the
beginning.

--
Gleb.



Re: [Qemu-devel] [PATCH] Make tb_alloc static.

2011-02-10 Thread Aurelien Jarno
On Thu, Feb 10, 2011 at 10:04:57AM +0100, Tristan Gingold wrote:
 On Wed, Feb 09, 2011 at 07:52:52PM +0100, Aurelien Jarno wrote:
  
  What about moving tb_alloc() (with tb_free()) higher in the file? After
  all it make sense to have the function creating or destructing a tb
  before the function manipulating them.
 
 Thanks.  Like this ?
 

Yes, perfect. Applied.

 Tristan.
 
 
 This function is only used within exec.c, so no need to make it public.
 
 Signed-off-by: Tristan Gingold ging...@adacore.com
 ---
  exec-all.h |1 -
  exec.c |   52 ++--
  2 files changed, 26 insertions(+), 27 deletions(-)
 
 diff --git a/exec-all.h b/exec-all.h
 index 81497c0..c062693 100644
 --- a/exec-all.h
 +++ b/exec-all.h
 @@ -182,7 +182,6 @@ static inline unsigned int tb_phys_hash_func(tb_page_addr_t pc)
      return (pc >> 2) & (CODE_GEN_PHYS_HASH_SIZE - 1);
  }
  
 -TranslationBlock *tb_alloc(target_ulong pc);
  void tb_free(TranslationBlock *tb);
  void tb_flush(CPUState *env);
  void tb_link_page(TranslationBlock *tb,
 diff --git a/exec.c b/exec.c
 index 477199b..9a7a752 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -649,6 +649,32 @@ void cpu_exec_init(CPUState *env)
  #endif
  }
  
 +/* Allocate a new translation block. Flush the translation buffer if
 +   too many translation blocks or too much generated code. */
 +static TranslationBlock *tb_alloc(target_ulong pc)
 +{
 +    TranslationBlock *tb;
 +
 +    if (nb_tbs >= code_gen_max_blocks ||
 +        (code_gen_ptr - code_gen_buffer) >= code_gen_buffer_max_size)
 +        return NULL;
 +    tb = &tbs[nb_tbs++];
 +    tb->pc = pc;
 +    tb->cflags = 0;
 +    return tb;
 +}
 +
 +void tb_free(TranslationBlock *tb)
 +{
 +    /* In practice this is mostly used for single use temporary TB
 +       Ignore the hard cases and just back up if this TB happens to
 +       be the last one generated.  */
 +    if (nb_tbs > 0 && tb == &tbs[nb_tbs - 1]) {
 +        code_gen_ptr = tb->tc_ptr;
 +        nb_tbs--;
 +    }
 +}
 +
  static inline void invalidate_page_bitmap(PageDesc *p)
  {
      if (p->code_bitmap) {
 @@ -1227,32 +1253,6 @@ static inline void tb_alloc_page(TranslationBlock *tb,
  #endif /* TARGET_HAS_SMC */
  }
  
 -/* Allocate a new translation block. Flush the translation buffer if
 -   too many translation blocks or too much generated code. */
 -TranslationBlock *tb_alloc(target_ulong pc)
 -{
 -    TranslationBlock *tb;
 -
 -    if (nb_tbs >= code_gen_max_blocks ||
 -        (code_gen_ptr - code_gen_buffer) >= code_gen_buffer_max_size)
 -        return NULL;
 -    tb = &tbs[nb_tbs++];
 -    tb->pc = pc;
 -    tb->cflags = 0;
 -    return tb;
 -}
 -
 -void tb_free(TranslationBlock *tb)
 -{
 -    /* In practice this is mostly used for single use temporary TB
 -       Ignore the hard cases and just back up if this TB happens to
 -       be the last one generated.  */
 -    if (nb_tbs > 0 && tb == &tbs[nb_tbs - 1]) {
 -        code_gen_ptr = tb->tc_ptr;
 -        nb_tbs--;
 -    }
 -}
 -
  /* add a new TB and link it to the physical page tables. phys_page2 is
 (-1) to indicate that only one page contains the TB. */
  void tb_link_page(TranslationBlock *tb,
 -- 
 1.7.3.GIT
 
 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH v3 0/6] target-arm: Fix floating point conversions

2011-02-10 Thread Aurelien Jarno
On Thu, Feb 10, 2011 at 11:28:55AM +, Peter Maydell wrote:
 This patchset fixes two issues:
  * default_nan_mode not being honoured for float-to-float conversions
  * half precision conversions being broken in a number of ways as
well as not handling default_nan_mode.
 
 With this patchset qemu passes random-instruction-selection tests
 for VCVT.F32.F16, VCVT.F16.F32, VCVTB and VCVTT, in both IEEE and
 non-IEEE modes, with and without default-NaN behaviour.
 
 Christophe: this patchset includes your softfloat v3 patch, although
 I have split it up a little to keep the float16 bits separate.
 
 Changes since v2:
  * added STRUCT_TYPES version of float16 and fixed various
places which needed a make_float16()/float16_val() in order
to compile with STRUCT_TYPES enabled
  * s/bits16/float16/ in patch 3 as suggested by Aurelien
  * fixed the types in the f16-related ARM helper wrappers in patch 6
 
 Patch 2 is unchanged and so I've added Aurelien's reviewed-by
 signoff; the others all changed, although mostly in minor ways.
 
 (Compiling with STRUCT_TYPES enabled also needs some fixes to
 existing float32/float64 code; I'll send a separate patchset
 for that.)
 
 
 Christophe Lyon (1):
   softfloat: Honour default_nan_mode for float-to-float conversions
 
 Peter Maydell (5):
   softfloat: Add float16 type and float16 NaN handling functions
   softfloat: Fix single-to-half precision float conversions
   softfloat: Correctly handle NaNs in float16_to_float32()
   target-arm: Silence NaNs resulting from half-precision conversions
   target-arm: Use standard FPSCR for Neon half-precision operations
 
  fpu/softfloat-specialize.h |  130 ++--
  fpu/softfloat.c            |  100 ++
  fpu/softfloat.h            |   19 ++-
  target-arm/helper.c        |   38 +++--
  target-arm/helpers.h       |    2 +
  target-arm/translate.c     |   16 +++---
  6 files changed, 251 insertions(+), 54 deletions(-)
 

Thanks, all applied.
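
For readers who have not looked at the softfloat side, here is a rough stand-alone sketch of what "honouring default_nan_mode" means for a half-to-single conversion. It is not the code from the patchset: f16_to_f32_bits() and F32_DEFAULT_NAN are names assumed for illustration, it works on raw bit patterns, it covers only the IEEE half-precision format (not the ARM alternative format), and it always quiets a propagated NaN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed name: the ARM default NaN for single precision
   (positive sign, quiet bit set, zero payload). */
#define F32_DEFAULT_NAN 0x7fc00000u

/* Convert IEEE half-precision bits to single-precision bits. */
static uint32_t f16_to_f32_bits(uint16_t h, bool default_nan_mode)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1fu;
    uint32_t frac = h & 0x3ffu;

    if (exp == 0x1f) {
        if (frac == 0) {
            return sign | 0x7f800000u;      /* infinity */
        }
        if (default_nan_mode) {
            return F32_DEFAULT_NAN;         /* drop sign and payload */
        }
        /* Propagate sign and payload, forcing the quiet bit. */
        return sign | 0x7f800000u | 0x00400000u | (frac << 13);
    }
    if (exp == 0) {
        if (frac == 0) {
            return sign;                    /* signed zero */
        }
        /* Subnormal input: shift until the implicit bit appears. */
        int shift = 0;
        while (!(frac & 0x400u)) {
            frac <<= 1;
            shift++;
        }
        frac &= 0x3ffu;
        /* Value is 1.frac * 2^(-14 - shift); single bias is 127. */
        return sign | ((uint32_t)(113 - shift) << 23) | (frac << 13);
    }
    /* Normal number: rebias the exponent from 15 to 127. */
    return sign | ((exp + 112u) << 23) | (frac << 13);
}
```

For example, under these assumptions the input 0x7c01 (a signalling NaN) becomes 0x7fc02000 when the payload is propagated, and 0x7fc00000 in default-NaN mode.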

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Avi Kivity

On 02/10/2011 09:47 AM, Anthony Liguori wrote:


So very concretely, I'm suggesting we do the following to target-i386:

1) make the i440fx device have an embedded ide controller, piix3, and 
usb controller that get initialized automatically.  The piix3 embeds 
the PCI-to-ISA bridge along with all of the default ISA devices (rtc, 
serial, etc.).


This I like.



2) get rid of the entire concept of machines.  Creating a i440fx is 
essentially equivalent to creating a bare machine.


No, it's not.  The 440fx does not include an IOAPIC, for example.  There 
may be other optional components, or differences in wiring, that make 
two machines with i440fx not identical.




4) model the CPUs as devices that take a pointer to a host controller, 
for x86, the normal case would be giving it a pointer to i440fx.




Surely the connection is via a bus?  An x86 cpu talks to the bus, and 
there happens to be an 440fx north bridge at the end of it.  It could 
also be a Q35 or something else.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 02/18] Introduce read() to FdMigrationState.

2011-02-10 Thread Yoshiaki Tamura
2011/2/10 Daniel P. Berrange <berra...@redhat.com>:
 On Thu, Feb 10, 2011 at 07:23:33PM +0900, Yoshiaki Tamura wrote:
 2011/2/10 Daniel P. Berrange <berra...@redhat.com>:
  On Thu, Feb 10, 2011 at 10:54:01AM +0100, Anthony Liguori wrote:
  On 02/10/2011 10:30 AM, Yoshiaki Tamura wrote:
  Currently FdMigrationState doesn't support read(), and this patch
  introduces it to get response from the other side.
  
  Signed-off-by: Yoshiaki Tamura <tamura.yoshi...@lab.ntt.co.jp>
 
  Migration is unidirectional.  Changing this is fundamental and not
  something to be done lightly.
 
  Making it bi-directional might break libvirt's save/restore
  to file support which uses migration, passing a unidirectional
  FD for the file. It could also break libvirt's secure tunnelled
  migration support which is currently only expecting to have
  data sent in one direction on the socket.

 Hi Daniel,

 IIUC, this patch isn't something to make existing live migration
 bi-directional.  Just opens up a way for Kemari to use it.  Do
 you think it's dangerous for libvirt still?

 The key is for it to be a no-op for any usage of the existing
 'migrate' command. I had thought this was wiring up read into
 the event loop too, so it would be poll()ing for reads, but
 after re-reading I see this isn't the case here.

It's a no-op for existing migration related code.  Anthony, did
you have the same concern?

Yoshi


 Regards,
 Daniel
 --
 |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org              -o-             http://virt-manager.org :|
 |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
 |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|




Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Anthony Liguori

On 02/10/2011 11:10 AM, Gleb Natapov wrote:

On Thu, Feb 10, 2011 at 11:00:50AM +0100, Anthony Liguori wrote:
   

On 02/10/2011 10:07 AM, Gleb Natapov wrote:
 

So what if it is easier, it doesn't mean it is correct thing to do.
   

If we spend the next 10 years trying to do the correct thing for
some arbitrary definition of correct, that's not terribly useful.
 

Changing direction by 180 degrees every 2 years is even less useful.
   


If we think through what we are doing and have a coherent architecture 
before changing direction, then we won't have this problem.



It's really simple actually.  Let's do the least clever thing and
model how hardware actually works.  Once we have that, we can try to
be better than real hardware (if it's possible).
 

I think our understanding of how HW actually works is very different.
You are placing too much value on where a device resides physically; for me
it is a completely unimportant detail, not even worth mentioning.
   


No, I place value on how things are modelled in the real world.

There simply aren't PC's out there that lack an RTC so I have no 
interest in jumping through hoops in QEMU to make it possible to do this 
without modifying QEMU code.  It might sound nice to a developer but 
it's of absolutely no use to users.



If all composition is done through a factory interface, it doesn't.
But my main argument here is that we shouldn't try to make all
composition done through a factory interface--only where it makes
sense.

So very concretely, I'm suggesting we do the following to target-i386:

1) make the i440fx device have an embedded ide controller, piix3,
and usb controller that get initialized automatically.  The piix3
embeds the PCI-to-ISA bridge along with all of the default ISA
devices (rtc, serial, etc.).
 

This may be a problem even from security point of view. What if usb code
(ide, serial, parallel) has guest exploitable bug? Currently I can happily
continue running guests if they do not need affected subsystem. If we'll
get it your way I will no longer be able to do so.
   

qemu -device i440fx,ide=off

 

So you still need to support arbitrary composition. What's the
difference?


No, we don't.  It's possible to have an 'rtc=off' option but I'm 
tremendously opposed to doing this.  Arbitrary composition is not a 
useful goal IMHO.



  So why do you like -device i440fx over what we have now?
   


Because I don't think tools like libvirt should be doing device 
composition to create an i440fx-like chipset.  I think the current path 
we're on is pushing too much logic that belongs in QEMU into the 
management stack.



In current speak, what you propose would be implemented by using the i440fx
machine type. Qdev will build it for you.
   


If you had an i440fx machine type, that had no non-optional components 
added, and you could specify options to the machine type, yes.  But I 
think you'll agree that there's no reason to not just treat the i440fx 
as a device.



If you really care to do this.  But this desire to remove devices is
silly IMHO.  Concerns about security are misplaced.  If you have to
change the way a guest is invoked in order to eliminate security
problems, then there's something seriously wrong.

 

No I do not.  I do not create guest with unneeded devices from the
beginning.
   


There is very little that isn't 'unneeded'.

Regards,

Anthony Liguori


--
Gleb.
   





Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-10 Thread Anthony Liguori

On 02/10/2011 11:38 AM, Peter Maydell wrote:

On 10 February 2011 10:13, Anthony Liguori <anth...@codemonkey.ws> wrote:
   

On 02/10/2011 10:04 AM, Peter Maydell wrote:
 

On 10 February 2011 08:36, Anthony Liguori <anth...@codemonkey.ws> wrote:
   

So you would model arm926ej-s as the chipset and then build up the
machines
by modifying parameters of the chipset (like the board id) and/or adding
different components on top of it.

 

Er, ARM926 is the CPU, it's not a chipset. The board ID is definitely
not a property of an ARM926, it's a property of the board (clue is in
the name :-)). I don't think versatile boards have a chipset really...

   

As I said, I'm not well versed in the component names in ARM.

But that said, an actual processor doesn't connect directly to a bunch of
devices.  It almost always goes through some chipset, and that chipset
typically implements a lot of functionality.

I think the name of the component I'm trying to refer to is PL300, which I
believe is the Northbridge used for the Versatile boards.
 

PL300 is just a bus interconnect (so you can connect multiple AXI
bus masters (cores) to multiple AXI bus slaves (devices)).
Versatile PB doesn't have anything in the documentation that claims
to be a Northbridge (PBX does, VExpress doesn't).

This is the system diagram for the Versatile Express:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0447d/I1007683.html
I don't know what you'd want to claim is a northbridge there.
Basically there's an FPGA with a pile of devices in it,
and there's a test chip with the core and some other devices in
it. But from a modelling perspective this is all completely
irrelevant because regardless of where the hardware designer
put the devices, they're just devices at a particular point in the
memory map and with a particular set of interrupt wiring and so
on.


But something interacts with each processor and dispatches the I/O 
operations in the address space, no?  I can't believe there are 2^32 
address lines coming off of every arm chip that each device connects.


This relationship of how I/O fans out through various devices is 
important because occasionally platforms do weird things during I/O fan 
out like implement an IOMMU.  If we don't model this I/O dispatch model 
within QEMU, then it's extremely difficult to implement things like IOMMUs.
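
To make the fan-out argument a bit more concrete, here is a toy sketch of dispatch through an intermediate component that may rewrite the address before picking a device. None of this is existing QEMU code: Device, Bridge, bridge_read() and the fixed-offset "IOMMU" are assumptions for illustration only, and a real IOMMU would of course walk translation tables rather than add a constant.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct Device {
    uint64_t base;                      /* first address claimed */
    uint64_t size;                      /* length of the claimed range */
    uint32_t (*read)(uint64_t offset);  /* handler, gets device-relative offset */
} Device;

typedef struct Bridge {
    uint64_t iommu_offset;  /* hypothetical translation: a fixed offset */
    const Device *devs;     /* devices sitting behind this bridge */
    size_t ndevs;
} Bridge;

/* Translate the incoming address, then hand the access to whichever
   device claims the translated address.  Returns 0 on success and
   -1 if no device claims it. */
static int bridge_read(const Bridge *b, uint64_t addr, uint32_t *out)
{
    uint64_t a = addr + b->iommu_offset;
    for (size_t i = 0; i < b->ndevs; i++) {
        const Device *d = &b->devs[i];
        if (a >= d->base && a - d->base < d->size) {
            *out = d->read(a - d->base);
            return 0;
        }
    }
    return -1;
}

/* Two dummy devices so the dispatch can be exercised. */
static uint32_t rtc_read(uint64_t off)  { return 0x100u + (uint32_t)off; }
static uint32_t uart_read(uint64_t off) { return 0x200u + (uint32_t)off; }
```

The devices never see the translation step; only the component that owns the dispatch does, which is why that component has to exist in the model.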


It might be the case that a platform has a chipset that is a pile of 
well isolated devices that are crammed in the same silicon space but 
that otherwise have very well defined interactions with each other.  
This is the exception though, not the rule.


Particularly when looking at the relationship between certain devices on 
the PC (like the role the pckbd plays in address translation), things 
are simply not so idealized in practice.


But if it makes sense for ARM to describe every single platform device 
through a factory interface, that's fine.


Even in this case, you still want to model things like the distinction 
between the UART16550A and the ISA bus bridge for the serial device.  In 
this case, you want to be able to do composition without going through a 
factory.



An n900 is a very specific hardware configuration that is best represented
by some sort of configuration file vs. something hard coded in QEMU.
 

Yes, that's the whole point -- machine == specific hardware
configuration.

That's not getting rid of machine, it's just saying we should have
some custom scripting language to define them rather than doing
them in C. You still want, fundamentally, to be able to say
   qemu-system-arm -M machinename
   


No, qemu-system-arm -M /path/to/n900.cfg

But yeah, no disagreement there.  But today, the machine concept in QEMU 
is definitely not a specific hardware configuration.


Regards,

Anthony Liguori


-- PMM

   





[Qemu-devel] [PATCH] target-arm: Implement VMULL.P8

2011-02-10 Thread Peter Maydell
Implement VMULL.P8 (the 32x32->64 version of the polynomial multiply
instruction).

Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
---
 target-arm/helpers.h |1 +
 target-arm/neon_helper.c |   30 ++
 target-arm/translate.c   |6 --
 3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/target-arm/helpers.h b/target-arm/helpers.h
index 4d0de00..0d37abe 100644
--- a/target-arm/helpers.h
+++ b/target-arm/helpers.h
@@ -275,6 +275,7 @@ DEF_HELPER_2(neon_sub_u16, i32, i32, i32)
 DEF_HELPER_2(neon_mul_u8, i32, i32, i32)
 DEF_HELPER_2(neon_mul_u16, i32, i32, i32)
 DEF_HELPER_2(neon_mul_p8, i32, i32, i32)
+DEF_HELPER_2(neon_mull_p8, i64, i32, i32)
 
 DEF_HELPER_2(neon_tst_u8, i32, i32, i32)
 DEF_HELPER_2(neon_tst_u16, i32, i32, i32)
diff --git a/target-arm/neon_helper.c b/target-arm/neon_helper.c
index 61890dd..b59ad38 100644
--- a/target-arm/neon_helper.c
+++ b/target-arm/neon_helper.c
@@ -895,6 +895,36 @@ uint32_t HELPER(neon_mul_p8)(uint32_t op1, uint32_t op2)
 return result;
 }
 
+uint64_t HELPER(neon_mull_p8)(uint32_t op1, uint32_t op2)
+{
+uint64_t result = 0;
+uint64_t mask;
+uint64_t op2ex = op2;
+op2ex = (op2ex & 0xff) |
+((op2ex & 0xff00) << 8) |
+((op2ex & 0xff0000) << 16) |
+((op2ex & 0xff000000) << 24);
+while (op1) {
+mask = 0;
+if (op1 & 1) {
+mask |= 0xffff;
+}
+if (op1 & (1 << 8)) {
+mask |= (0xffffU << 16);
+}
+if (op1 & (1 << 16)) {
+mask |= (0xffffULL << 32);
+}
+if (op1 & (1 << 24)) {
+mask |= (0xffffULL << 48);
+}
+result ^= op2ex & mask;
+op1 = (op1 >> 1) & 0x7f7f7f7f;
+op2ex <<= 1;
+}
+return result;
+}
+
 #define NEON_FN(dest, src1, src2) dest = (src1 & src2) ? -1 : 0
 NEON_VOP(tst_u8, neon_u8, 4)
 NEON_VOP(tst_u16, neon_u16, 2)
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 3087a5d..f640a50 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -5124,8 +5124,10 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
 gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
 break;
 case 14: /* Polynomial VMULL */
-cpu_abort(env, "Polynomial VMULL not implemented");
-
+gen_helper_neon_mull_p8(cpu_V0, tmp, tmp2);
+dead_tmp(tmp2);
+dead_tmp(tmp);
+break;
 default: /* 15 is RESERVED.  */
 return 1;
 }
-- 
1.7.1
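
For cross-checking the helper, a straightforward lane-by-lane reference may be useful. This is only a sketch and not part of the patch: pmul8() and ref_mull_p8() are names assumed here, and the code trades the helper's all-lanes-at-once masking for an obvious per-lane loop.

```c
#include <stdint.h>

/* Carry-less (polynomial over GF(2)) multiply of two 8-bit values.
   The product of two degree-7 polynomials fits in 15 bits. */
static uint16_t pmul8(uint8_t a, uint8_t b)
{
    uint16_t r = 0;
    for (int i = 0; i < 8; i++) {
        if (a & (1u << i)) {
            r ^= (uint16_t)((uint16_t)b << i);  /* XOR instead of add */
        }
    }
    return r;
}

/* Reference for the 32x32->64 form: four independent 8x8->16
   polynomial products, lane n of the result taken from byte n
   of each operand. */
static uint64_t ref_mull_p8(uint32_t op1, uint32_t op2)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint8_t a = (uint8_t)(op1 >> (8 * lane));
        uint8_t b = (uint8_t)(op2 >> (8 * lane));
        result |= (uint64_t)pmul8(a, b) << (16 * lane);
    }
    return result;
}
```

As a sanity check on the arithmetic: (x + 1) squared is x^2 + 1 over GF(2), so pmul8(3, 3) is 5 rather than 9.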




[Qemu-devel] Re: RFC: New API for PPC for vcpu mmu access

2011-02-10 Thread Edgar E. Iglesias
On Thu, Feb 10, 2011 at 12:55:22PM +0100, Alexander Graf wrote:
 Scott Wood wrote:
  On Thu, 3 Feb 2011 10:19:06 +0100
  Alexander Graf <ag...@suse.de> wrote:
 

  Yeah, that one's tricky. Usually the way the memory resolver in qemu works 
  is as follows:
 
   * kvm goes to qemu
   * qemu fetches all mmu and register data from kvm
   * qemu runs its mmu resolution function as if the target was emulated
 
  So the normal way would be to fetch _all_ TLB entries from KVM, shove 
  them into env and implement the MMU in qemu (at least enough of it to 
  enable debugging). No other target modifies this code path. But no other 
  target needs to copy > 30kb of data only to get the mmu data either :).
  
 
  I guess you mean that cpu_synchronize_state() is supposed to pull in the
  MMU state, though I don't see where it gets called for 'm'/'M' commands in
  the gdb stub.

 
 Well, we could also call it in get_phys_page_debug in target-ppc, but
 yes. I guess the reason it works for now is that SDR1 is pretty constant
 and was fetched earlier on. For BookE not syncing is obviously even more
 broken.
 
  The MMU code seems to be pretty target-specific.  It's not clear to what
  extent there is a normal way, versus what book3s happens to rely on in
  its get_physical_address() code.  I don't think there are any platforms
  supported yet (with both KVM and a non-empty cpu_get_phys_page_debug()
  implementation) that have a pure software-managed TLB.  x86 has page
  tables, and book3s has the hash table (603/e300 doesn't, or more accurately
  Linux doesn't use it, but I guess that's not supported by KVM yet?).

 
 As for PPC, only 440, e500 and G3-5 are basically supported. It happens
 to work on POWER4 and above too and I've even got reports that it's good
 on e600 :).
 
  We could probably do some sort of lazy state transfer only when MMU code
  that needs it is run.  This could initially include debug translations, for
  testing a non-KVM-dependent get_physical_address() implementation, but
  eventually that would use KVM_TRANSLATE (when KVM is used) and thus not

 
 Yup :).
 
  trigger the state transfer.  I'd also like to add an info tlb command,
  which would require the state transfer.

 
 Very nice.
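
The lazy transfer idea discussed above could look roughly like the following. This is a sketch under invented names, not the KVM or QEMU API: Vcpu, TlbEntry, vcpu_run(), tlb_sync() and debug_translate() are all hypothetical, the "kernel" copy is just another array, and pages are assumed to be 4 KiB.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_TLB 4

typedef struct TlbEntry {
    uint32_t vpage;   /* virtual page number */
    uint32_t ppage;   /* physical page number */
    bool valid;
} TlbEntry;

typedef struct Vcpu {
    TlbEntry kernel_tlb[NUM_TLB];  /* stand-in for state owned by the kernel */
    TlbEntry tlb[NUM_TLB];         /* userspace copy, possibly stale */
    bool tlb_synced;               /* true while the copy is up to date */
    int sync_count;                /* how many full copies were made */
} Vcpu;

/* Whenever the guest runs, the userspace copy must be treated as stale. */
static void vcpu_run(Vcpu *v)
{
    v->tlb_synced = false;
}

/* Pull the TLB over only if it has not been fetched since the last run. */
static void tlb_sync(Vcpu *v)
{
    if (!v->tlb_synced) {
        memcpy(v->tlb, v->kernel_tlb, sizeof(v->tlb));
        v->tlb_synced = true;
        v->sync_count++;
    }
}

/* Debug translation: the only path here that triggers the transfer. */
static bool debug_translate(Vcpu *v, uint32_t vaddr, uint32_t *paddr)
{
    tlb_sync(v);
    uint32_t vpage = vaddr >> 12;
    for (size_t i = 0; i < NUM_TLB; i++) {
        if (v->tlb[i].valid && v->tlb[i].vpage == vpage) {
            *paddr = (v->tlb[i].ppage << 12) | (vaddr & 0xfffu);
            return true;
        }
    }
    return false;
}
```

Several translations between two guest runs cost one copy, and code paths that never look at the MMU cost none.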
 
  BTW, how much other than the MMU is missing to be able to run an e500
  target in qemu, without kvm?

 
 The last person working on BookE emulation was Edgar. Edgar, how far did
 you get?

Hi,

TBH, I don't really know. My goal was to get linux running on a PPC-440
embedded in the Xilinx FPGAs. I managed to fix enough BookE emulation
to get that far.

After that, we've done a few more hacks to run fsboot and uboot. Also,
we've added support for some of the BookE debug registers to be able
to run gdbserver from within linux guests. Some of these patches haven't
made it upstream yet.

I haven't taken the time to compare the specs to qemu code, so I don't
really know how much is missing. My guess is that if you want to run
linux guests, the MMU won't be the limiting factor.

Cheers


