date:20100413

Re: [Qemu-devel] qemu per vm config file support

2010-04-13 Thread Amit Shah

On (Mon) Apr 12 2010 [21:06:29], Manish Regmi wrote:
 Hi,
   These days i am playing with qemu and kvm. The command line parameters can
 become very long. Is there any technical reason why config file is not
 implemented in qemu yet.

See -readconfig and -writeconfig. There's partial support for config
files; not all devices can be initialised via configs yet.

Amit

[Qemu-devel] Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.

2010-04-13 Thread Yoshiaki Tamura


Avi Kivity wrote:

On 04/12/2010 01:58 PM, Yoshiaki Tamura wrote:

Is it necessary to update migration and vga bitmaps?

We can simply update the master bitmap, and update the migration and vga
bitmaps only when they need it. That can be done in a different patch.



Let me explain the role of the master bitmap here.

In the previous discussion, the master bitmap represented at least one
dirty type is actually dirty. While implementing this approach, I
though there is one downside that upon resetting the bitmap, it needs
to check all dirty types whether they are already cleared to unset the
master bitmap, and this might hurt the performance in general.


The way I see it, the master bitmap is a buffer between the guest and
all the other client bitmaps (migration and vga). So, a bit can be set
in vga and cleared in master. Let's look at the following scenario:

1. guest touches address 0xa (page 0xa0)
2. qemu updates master bitmap bit 0xa0
3. vga timer fires
4. vga syncs the dirty bitmap for addresses 0xa-0xc
4.1 vga_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
4.2 migration_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
4.3 master_bitmap[0xa0-0xc0] = 0
5. vga draws based on vga_bitmap

the nice thing is that step 4.2 can be omitted if migration is not active.

so, client bitmaps are always updated when the client requests them,
master bitmap is only a buffer.


Thanks.  I think I'm finally getting your point.
At 4.2 and 4.3, is syncing to client necessary?
Simply doing OR at get_dirty like in this patch not enough?


In this patch, master bitmap represents all types are dirty, similar
to existing 0xff. With this approach, resetting the master bitmap can
be done without checking the other types. set_dirty_flags is actually
taking the burden in this case though. Anyway, IIUC somebody would be
unhappy depending on the role of the master bitmap.


Yes, the problem is that set_dirty_flags is the common case and also
uses random access, we'd want to make it touch only a single bit. client
access is rare, and also sequential, and therefore efficient.


I see.
So set_dirty_flags would be like,

1. set master first regard less of dirty type.
2. if dirty type was CODE, set the client.


Note that we should only allocate the migration and vga bitmaps when
migration or vga is active.


Right. May I do it in a different patch?


Sure, more patches is usually better.


I think this is about optimization. In this patch, I focused on not
breaking the existing function.


Yes, that's the best approach. But we do need to see how all the
incremental steps end up.


I totally agree.  Thanks for helping.

In summary, I'll prepare two patch sets.

1. v3 of this patch set.
- Change FLAGS value to (1,2,4,8), and add IDX (0,1,2,3)
- Use ffsl to convert FLAGS to IDX.
- Add help function which takes IDX.
- Change the behavior of set_dirty_flags as above.
- Change dirty bitmap access to a loop.
- Add brace after if ()
- Move some macros to qemu-common.h.

2. Allocate vga and migration dirty bitmap dynamically.

Please modify or add items if I were missing.

Re: [Qemu-devel] QEMU regression problems

2010-04-13 Thread Roy Tam

2010/4/12 Gerhard Wiesinger li...@wiesinger.com:
 Hello,

 Checkit reports some problems under DOS:
 1.) NPU functions are not correct: NPU Trigonometric Functions: FAILED.
 Seems to be a problem of the instruction set.
 2.) Real-Time Clock Alarm: FAILED (This might be also the reason for the
 KVM problem, see my previous post). Seems to be that real-time clock is not
 working correct.
 3.) There is also a problem with the reported base memory under QEMM386
 (HIMEM.SYS and EMM386.EXE is correct here). It is 646kB instead of 640kB.
 Therefore base memory test fails. I guess that reporting memory CMOS
 tables/interrupt functions are not implemented correctly.

 Details are listed below.

 All issues are NOT present under VMWare Server 2.0 and with real hardware.

 QEMU: 0.12.3 under Fedora 11, 2.6.30.10-105.2.23.fc11.x86 on AMD Phenom II
 Quad Core, x86_64-softmmu.

 Any comments?


Tested with Checkit 3.0 and QEMM 9.0.
- The NPU error is confirmed and it seems to be a floating point
rounding error in QEMU. This should be a 0.12 regression as it works
in 0.11.1 and 0.10.6.
- The RTC Alarm failure is confirmed too. Is it unimplemented in QEMU?
- The Base Memory  640K error seems to be SeaBIOS related. QEMU Bochs
BIOS(tested with both -old-bios hack in 0.12 series and old 0.11.1)
will just freeze after QEMU counted RAM.(Tested with ScriptPC and
Bochs).

 Thnx.

 Ciao,
 Gerhard

 --
 http://www.wiesinger.com/

 Details:
 1.)
NPU Trigonometric Functions.FAILED ***
Step 1, Expected0.42926, received0.42926

 Double 'Npu_oldans1' = 0.429259 (3FDB78F91894EFA5h).
 Double 'Npu_oldans2' = 0.628319 (3FE41B2F769CF0E0h).
 Double 'Npu_result ' = 0.429259 (3FDB78F91894EFA6h).

 2.)
 Compare Current TimePassed
DOS: 16:24:39.89Real-Time Clock: 16:24:39.00   (.89 apart)

 Compare Current DatePassed
DOS: 04/11/2010 Real-Time Clock: 04/11/2010.

 Real-Time Clock Alarm...FAILED ***

 Compare Elapsed TimePassed
DOS: 11.97 Seconds  Real-Time Clock: 12.00 Seconds   (.03 apart)

 3.) Known Memory:
Base646K   From  0K to646K   (000h to 00A17FFh)
Base Memory.FAILED ***
ERROR at Address   0Ah, Bits FEDCBA9876543210
ERROR at Address   0A0004h, Bits FEDCBA9876543210
ERROR at Address   0A0006h, Bits FEDCBA9876543210
ERROR at Address   0A0008h, Bits FEDCBA9876543210
ERROR at Address   0A000Ah, Bits FEDCBA9876543210
ERROR at Address   0A000Ch, Bits FEDCBA9876543210
ERROR at Address   0A000Eh, Bits FEDCBA9876543210
ERROR at Address   0A0010h, Bits FEDCBA9876543210
ERROR at Address   0A0012h, Bits FEDCBA9876543210
ERROR at Address   0A0014h, Bits FEDCBA9876543210
ERROR at Address   0A0016h, Bits FEDCBA9876543210
ERROR at Address   0A0018h, Bits FEDCBA9876543210
ERROR at Address   0A001Ah, Bits FEDCBA9876543210
ERROR at Address   0A001Ch, Bits FEDCBA9876543210
ERROR at Address   0A001Eh, Bits FEDCBA9876543210
ERROR at Address   0A0020h, Bits FEDCBA9876543210
ADDITIONAL MEMORY ERRORS WERE NOT LISTED DUE TO LACK OF SPACE.

[Qemu-devel] KVM call agenda for Apr 13

2010-04-13 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] Re: Missing singlestep for already-translated code?

2010-04-13 Thread takasi-y

Hi,
 So for the already-translated code, we will miss singlestep?
At least SH4(and mips?) shows such behaviour.
I think a patch below enables single stepping in such case, too.
But, I'm not sure if this behaviour is on purpose, nor this patch is correct.
/yoshii

diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index 3537f8c..dfa724a 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -300,7 +300,7 @@ static void gen_goto_tb(DisasContext * ctx, int n, 
target_ulong dest)
 tb = ctx-tb;
 
 if ((tb-pc  TARGET_PAGE_MASK) == (dest  TARGET_PAGE_MASK) 
-   !ctx-singlestep_enabled) {
+   !ctx-singlestep_enabled  !singlestep) {
/* Use a direct jump if in same page and singlestep not enabled */
 tcg_gen_goto_tb(n);
 tcg_gen_movi_i32(cpu_pc, dest);

[Qemu-devel] [PATCH] fix migration with large mem

2010-04-13 Thread Izik Eidus

From f881b371e08760a67bf1f5b992a586c3de600f7a Mon Sep 17 00:00:00 2001
From: Izik Eidus iei...@redhat.com
Date: Tue, 13 Apr 2010 12:24:57 +0300
Subject: [PATCH] fix migration with large mem

In cases of guests with large mem that have pages
that all their bytes content are the same, we will
spend alot of time reading the memory from the guest
(is_dup_page())

It is happening beacuse ram_save_live() function have
limit of how much we can send to the dest but not how
much we read from it, and in cases we have many is_dup_page()
hits, we might read huge amount of data without updating important
stuff like the timers...

The guest lose all its repsonsibility and have many softlock ups
inside itself.

this patch add limit on the size we can read from the guest each
iteration.

Thanks.

Signed-off-by: Izik Eidus iei...@redhat.com
---
 arch_init.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index cfc03ea..e27b1a0 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -88,6 +88,8 @@ const uint32_t arch_type = QEMU_ARCH;
 #define RAM_SAVE_FLAG_PAGE 0x08
 #define RAM_SAVE_FLAG_EOS  0x10
 
+#define MAX_SAVE_BLOCK_READ 10 * 1024 * 1024
+
 static int is_dup_page(uint8_t *page, uint8_t ch)
 {
 uint32_t val = ch  24 | ch  16 | ch  8 | ch;
@@ -175,6 +177,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 uint64_t bytes_transferred_last;
 double bwidth = 0;
 uint64_t expected_time = 0;
+int data_read = 0;
 
 if (stage  0) {
 cpu_physical_memory_set_dirty_tracking(0);
@@ -205,10 +208,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
void *opaque)
 bytes_transferred_last = bytes_transferred;
 bwidth = qemu_get_clock_ns(rt_clock);
 
-while (!qemu_file_rate_limit(f)) {
+while (!qemu_file_rate_limit(f)  data_read  MAX_SAVE_BLOCK_READ) {
 int ret;
 
 ret = ram_save_block(f);
+data_read += ret * TARGET_PAGE_SIZE;
 bytes_transferred += ret * TARGET_PAGE_SIZE;
 if (ret == 0) { /* no more blocks */
 break;
-- 
1.6.6.1

[Qemu-devel] [PATCH 1/3] block: Convert first_drv to QLIST

2010-04-13 Thread Stefan Hajnoczi

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 block.c |   22 --
 block_int.h |2 +-
 2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index 5d087de..12cf434 100644
--- a/block.c
+++ b/block.c
@@ -58,7 +58,8 @@ static int bdrv_write_em(BlockDriverState *bs, int64_t 
sector_num,
 static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
 QTAILQ_HEAD_INITIALIZER(bdrv_states);
 
-static BlockDriver *first_drv;
+static QLIST_HEAD(, BlockDriver) bdrv_drivers =
+QLIST_HEAD_INITIALIZER(bdrv_drivers);
 
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
@@ -142,8 +143,7 @@ void bdrv_register(BlockDriver *bdrv)
 if (!bdrv-bdrv_aio_flush)
 bdrv-bdrv_aio_flush = bdrv_aio_flush_em;
 
-bdrv-next = first_drv;
-first_drv = bdrv;
+QLIST_INSERT_HEAD(bdrv_drivers, bdrv, list);
 }
 
 /* create a new block device (by default it is empty) */
@@ -162,9 +162,10 @@ BlockDriverState *bdrv_new(const char *device_name)
 BlockDriver *bdrv_find_format(const char *format_name)
 {
 BlockDriver *drv1;
-for(drv1 = first_drv; drv1 != NULL; drv1 = drv1-next) {
-if (!strcmp(drv1-format_name, format_name))
+QLIST_FOREACH(drv1, bdrv_drivers, list) {
+if (!strcmp(drv1-format_name, format_name)) {
 return drv1;
+}
 }
 return NULL;
 }
@@ -265,10 +266,11 @@ static BlockDriver *find_protocol(const char *filename)
 len = sizeof(protocol) - 1;
 memcpy(protocol, filename, len);
 protocol[len] = '\0';
-for(drv1 = first_drv; drv1 != NULL; drv1 = drv1-next) {
+QLIST_FOREACH(drv1, bdrv_drivers, list) {
 if (drv1-protocol_name 
-!strcmp(drv1-protocol_name, protocol))
+!strcmp(drv1-protocol_name, protocol)) {
 return drv1;
+}
 }
 return NULL;
 }
@@ -282,7 +284,7 @@ static BlockDriver *find_hdev_driver(const char *filename)
 int score_max = 0, score;
 BlockDriver *drv = NULL, *d;
 
-for (d = first_drv; d; d = d-next) {
+QLIST_FOREACH(d, bdrv_drivers, list) {
 if (d-bdrv_probe_device) {
 score = d-bdrv_probe_device(filename);
 if (score  score_max) {
@@ -317,7 +319,7 @@ static BlockDriver *find_image_format(const char *filename)
 }
 
 score_max = 0;
-for(drv1 = first_drv; drv1 != NULL; drv1 = drv1-next) {
+QLIST_FOREACH(drv1, bdrv_drivers, list) {
 if (drv1-bdrv_probe) {
 score = drv1-bdrv_probe(buf, ret, filename);
 if (score  score_max) {
@@ -1157,7 +1159,7 @@ void bdrv_iterate_format(void (*it)(void *opaque, const 
char *name),
 {
 BlockDriver *drv;
 
-for (drv = first_drv; drv != NULL; drv = drv-next) {
+QLIST_FOREACH(drv, bdrv_drivers, list) {
 it(opaque, drv-format_name);
 }
 }
diff --git a/block_int.h b/block_int.h
index 466a38c..d4067ff 100644
--- a/block_int.h
+++ b/block_int.h
@@ -126,7 +126,7 @@ struct BlockDriver {
 /* Set if newly created images are not guaranteed to contain only zeros */
 int no_zero_init;
 
-struct BlockDriver *next;
+QLIST_ENTRY(BlockDriver) list;
 };
 
 struct BlockDriverState {
-- 
1.7.0

[Qemu-devel] [PATCH 2/3] qemu-img: Eliminate bdrv_new_open() code duplication

2010-04-13 Thread Stefan Hajnoczi

Several commands have code to create a BlockDriverState and open a file.
The bdrv_new_open() function can be used to perform these steps.  This
patch converts the qemu-img commands to actually use bdrv_new_open().

Replaced the bdrv_new_open() 'readonly' argument with bdrv_open()-style
flags to support generic flags like BDRV_O_NO_BACKING.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 qemu-img.c |   83 +++
 1 files changed, 10 insertions(+), 73 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 18e43a0..b30effa 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -190,12 +190,11 @@ static int read_password(char *buf, int buf_size)
 
 static BlockDriverState *bdrv_new_open(const char *filename,
const char *fmt,
-   int readonly)
+   int flags)
 {
 BlockDriverState *bs;
 BlockDriver *drv;
 char password[256];
-int flags = BRDV_O_FLAGS;
 
 bs = bdrv_new();
 if (!bs)
@@ -207,9 +206,6 @@ static BlockDriverState *bdrv_new_open(const char *filename,
 } else {
 drv = NULL;
 }
-if (!readonly) {
-flags |= BDRV_O_RDWR;
-}
 if (bdrv_open(bs, filename, flags, drv)  0) {
 error(Could not open '%s', filename);
 }
@@ -349,7 +345,7 @@ static int img_create(int argc, char **argv)
 }
 }
 
-bs = bdrv_new_open(backing_file-value.s, fmt, 1);
+bs = bdrv_new_open(backing_file-value.s, fmt, BRDV_O_FLAGS);
 bdrv_get_geometry(bs, size);
 size *= 512;
 bdrv_delete(bs);
@@ -384,7 +380,6 @@ static int img_check(int argc, char **argv)
 {
 int c, ret;
 const char *filename, *fmt;
-BlockDriver *drv;
 BlockDriverState *bs;
 
 fmt = NULL;
@@ -405,19 +400,7 @@ static int img_check(int argc, char **argv)
 help();
 filename = argv[optind++];
 
-bs = bdrv_new();
-if (!bs)
-error(Not enough memory);
-if (fmt) {
-drv = bdrv_find_format(fmt);
-if (!drv)
-error(Unknown file format '%s', fmt);
-} else {
-drv = NULL;
-}
-if (bdrv_open(bs, filename, BRDV_O_FLAGS, drv)  0) {
-error(Could not open '%s', filename);
-}
+bs = bdrv_new_open(filename, fmt, BRDV_O_FLAGS);
 ret = bdrv_check(bs);
 switch(ret) {
 case 0:
@@ -443,7 +426,6 @@ static int img_commit(int argc, char **argv)
 {
 int c, ret;
 const char *filename, *fmt;
-BlockDriver *drv;
 BlockDriverState *bs;
 
 fmt = NULL;
@@ -464,19 +446,7 @@ static int img_commit(int argc, char **argv)
 help();
 filename = argv[optind++];
 
-bs = bdrv_new();
-if (!bs)
-error(Not enough memory);
-if (fmt) {
-drv = bdrv_find_format(fmt);
-if (!drv)
-error(Unknown file format '%s', fmt);
-} else {
-drv = NULL;
-}
-if (bdrv_open(bs, filename, BRDV_O_FLAGS | BDRV_O_RDWR, drv)  0) {
-error(Could not open '%s', filename);
-}
+bs = bdrv_new_open(filename, fmt, BRDV_O_FLAGS | BDRV_O_RDWR);
 ret = bdrv_commit(bs);
 switch(ret) {
 case 0:
@@ -633,7 +603,7 @@ static int img_convert(int argc, char **argv)
 
 total_sectors = 0;
 for (bs_i = 0; bs_i  bs_n; bs_i++) {
-bs[bs_i] = bdrv_new_open(argv[optind + bs_i], fmt, 1);
+bs[bs_i] = bdrv_new_open(argv[optind + bs_i], fmt, BRDV_O_FLAGS);
 if (!bs[bs_i])
 error(Could not open '%s', argv[optind + bs_i]);
 bdrv_get_geometry(bs[bs_i], bs_sectors);
@@ -691,7 +661,7 @@ static int img_convert(int argc, char **argv)
 }
 }
 
-out_bs = bdrv_new_open(out_filename, out_fmt, 0);
+out_bs = bdrv_new_open(out_filename, out_fmt, BRDV_O_FLAGS | BDRV_O_RDWR);
 
 bs_i = 0;
 bs_offset = 0;
@@ -889,7 +859,6 @@ static int img_info(int argc, char **argv)
 {
 int c;
 const char *filename, *fmt;
-BlockDriver *drv;
 BlockDriverState *bs;
 char fmt_name[128], size_buf[128], dsize_buf[128];
 uint64_t total_sectors;
@@ -916,19 +885,7 @@ static int img_info(int argc, char **argv)
 help();
 filename = argv[optind++];
 
-bs = bdrv_new();
-if (!bs)
-error(Not enough memory);
-if (fmt) {
-drv = bdrv_find_format(fmt);
-if (!drv)
-error(Unknown file format '%s', fmt);
-} else {
-drv = NULL;
-}
-if (bdrv_open(bs, filename, BRDV_O_FLAGS | BDRV_O_NO_BACKING, drv)  0) {
-error(Could not open '%s', filename);
-}
+bs = bdrv_new_open(filename, fmt, BRDV_O_FLAGS | BDRV_O_NO_BACKING);
 bdrv_get_format(bs, fmt_name, sizeof(fmt_name));
 bdrv_get_geometry(bs, total_sectors);
 get_human_readable_size(size_buf, sizeof(size_buf), total_sectors * 512);
@@ -1028,13 +985,7 @@ static int img_snapshot(int argc, char **argv)
 filename =

[Qemu-devel] [PATCH 3/3] qemu-img: Fix BRDV_O_FLAGS typo

2010-04-13 Thread Stefan Hajnoczi

It should be BDRV_O_FLAGS instead of BRDV_O_FLAGS.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 qemu-img.c |   20 ++--
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index b30effa..7203b8b 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -37,7 +37,7 @@ typedef struct img_cmd_t {
 } img_cmd_t;
 
 /* Default to cache=writeback as data integrity is not important for qemu-tcg. 
*/
-#define BRDV_O_FLAGS BDRV_O_CACHE_WB
+#define BDRV_O_FLAGS BDRV_O_CACHE_WB
 
 static void QEMU_NORETURN error(const char *fmt, ...)
 {
@@ -345,7 +345,7 @@ static int img_create(int argc, char **argv)
 }
 }
 
-bs = bdrv_new_open(backing_file-value.s, fmt, BRDV_O_FLAGS);
+bs = bdrv_new_open(backing_file-value.s, fmt, BDRV_O_FLAGS);
 bdrv_get_geometry(bs, size);
 size *= 512;
 bdrv_delete(bs);
@@ -400,7 +400,7 @@ static int img_check(int argc, char **argv)
 help();
 filename = argv[optind++];
 
-bs = bdrv_new_open(filename, fmt, BRDV_O_FLAGS);
+bs = bdrv_new_open(filename, fmt, BDRV_O_FLAGS);
 ret = bdrv_check(bs);
 switch(ret) {
 case 0:
@@ -446,7 +446,7 @@ static int img_commit(int argc, char **argv)
 help();
 filename = argv[optind++];
 
-bs = bdrv_new_open(filename, fmt, BRDV_O_FLAGS | BDRV_O_RDWR);
+bs = bdrv_new_open(filename, fmt, BDRV_O_FLAGS | BDRV_O_RDWR);
 ret = bdrv_commit(bs);
 switch(ret) {
 case 0:
@@ -603,7 +603,7 @@ static int img_convert(int argc, char **argv)
 
 total_sectors = 0;
 for (bs_i = 0; bs_i  bs_n; bs_i++) {
-bs[bs_i] = bdrv_new_open(argv[optind + bs_i], fmt, BRDV_O_FLAGS);
+bs[bs_i] = bdrv_new_open(argv[optind + bs_i], fmt, BDRV_O_FLAGS);
 if (!bs[bs_i])
 error(Could not open '%s', argv[optind + bs_i]);
 bdrv_get_geometry(bs[bs_i], bs_sectors);
@@ -661,7 +661,7 @@ static int img_convert(int argc, char **argv)
 }
 }
 
-out_bs = bdrv_new_open(out_filename, out_fmt, BRDV_O_FLAGS | BDRV_O_RDWR);
+out_bs = bdrv_new_open(out_filename, out_fmt, BDRV_O_FLAGS | BDRV_O_RDWR);
 
 bs_i = 0;
 bs_offset = 0;
@@ -885,7 +885,7 @@ static int img_info(int argc, char **argv)
 help();
 filename = argv[optind++];
 
-bs = bdrv_new_open(filename, fmt, BRDV_O_FLAGS | BDRV_O_NO_BACKING);
+bs = bdrv_new_open(filename, fmt, BDRV_O_FLAGS | BDRV_O_NO_BACKING);
 bdrv_get_format(bs, fmt_name, sizeof(fmt_name));
 bdrv_get_geometry(bs, total_sectors);
 get_human_readable_size(size_buf, sizeof(size_buf), total_sectors * 512);
@@ -1075,7 +1075,7 @@ static int img_rebase(int argc, char **argv)
  * Ignore the old backing file for unsafe rebase in case we want to correct
  * the reference to a renamed or moved backing file.
  */
-flags = BRDV_O_FLAGS | BDRV_O_RDWR | (unsafe ? BDRV_O_NO_BACKING : 0);
+flags = BDRV_O_FLAGS | BDRV_O_RDWR | (unsafe ? BDRV_O_NO_BACKING : 0);
 bs = bdrv_new_open(filename, fmt, flags);
 
 /* Find the right drivers for the backing files */
@@ -1106,7 +1106,7 @@ static int img_rebase(int argc, char **argv)
 
 bs_old_backing = bdrv_new(old_backing);
 bdrv_get_backing_filename(bs, backing_name, sizeof(backing_name));
-if (bdrv_open(bs_old_backing, backing_name, BRDV_O_FLAGS,
+if (bdrv_open(bs_old_backing, backing_name, BDRV_O_FLAGS,
 old_backing_drv))
 {
 error(Could not open old backing file '%s', backing_name);
@@ -1114,7 +1114,7 @@ static int img_rebase(int argc, char **argv)
 }
 
 bs_new_backing = bdrv_new(new_backing);
-if (bdrv_open(bs_new_backing, out_baseimg, BRDV_O_FLAGS | BDRV_O_RDWR,
+if (bdrv_open(bs_new_backing, out_baseimg, BDRV_O_FLAGS | BDRV_O_RDWR,
 new_backing_drv))
 {
 error(Could not open new backing file '%s', out_baseimg);
-- 
1.7.0

[Qemu-devel] [PATCH 0/3] block and qemu-img: Further cleanups

2010-04-13 Thread Stefan Hajnoczi

These patches apply to Kevin's block branch.

[Qemu-devel] Re: Missing singlestep for already-translated code?

2010-04-13 Thread Jun Koi

On Tue, Apr 13, 2010 at 6:21 PM,  takas...@ops.dti.ne.jp wrote:
 Hi,
 So for the already-translated code, we will miss singlestep?
 At least SH4(and mips?) shows such behaviour.
 I think a patch below enables single stepping in such case, too.
 But, I'm not sure if this behaviour is on purpose, nor this patch is correct.
 /yoshii

 diff --git a/target-sh4/translate.c b/target-sh4/translate.c
 index 3537f8c..dfa724a 100644
 --- a/target-sh4/translate.c
 +++ b/target-sh4/translate.c
 @@ -300,7 +300,7 @@ static void gen_goto_tb(DisasContext * ctx, int n, 
 target_ulong dest)
     tb = ctx-tb;

     if ((tb-pc  TARGET_PAGE_MASK) == (dest  TARGET_PAGE_MASK) 
 -       !ctx-singlestep_enabled) {
 +       !ctx-singlestep_enabled  !singlestep) {
        /* Use a direct jump if in same page and singlestep not enabled */
         tcg_gen_goto_tb(n);
         tcg_gen_movi_i32(cpu_pc, dest);


The first glance: because you are patching translate.c, I dont think
you fixed the problem.

Thanks,
J

[Qemu-devel] Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.

2010-04-13 Thread Avi Kivity


On 04/13/2010 11:01 AM, Yoshiaki Tamura wrote:

Avi Kivity wrote:

On 04/12/2010 01:58 PM, Yoshiaki Tamura wrote:

Is it necessary to update migration and vga bitmaps?

We can simply update the master bitmap, and update the migration 
and vga

bitmaps only when they need it. That can be done in a different patch.



Let me explain the role of the master bitmap here.

In the previous discussion, the master bitmap represented at least one
dirty type is actually dirty. While implementing this approach, I
though there is one downside that upon resetting the bitmap, it needs
to check all dirty types whether they are already cleared to unset the
master bitmap, and this might hurt the performance in general.


The way I see it, the master bitmap is a buffer between the guest and
all the other client bitmaps (migration and vga). So, a bit can be set
in vga and cleared in master. Let's look at the following scenario:

1. guest touches address 0xa (page 0xa0)
2. qemu updates master bitmap bit 0xa0
3. vga timer fires
4. vga syncs the dirty bitmap for addresses 0xa-0xc
4.1 vga_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
4.2 migration_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
4.3 master_bitmap[0xa0-0xc0] = 0
5. vga draws based on vga_bitmap

the nice thing is that step 4.2 can be omitted if migration is not 
active.


so, client bitmaps are always updated when the client requests them,
master bitmap is only a buffer.


Thanks.  I think I'm finally getting your point.
At 4.2 and 4.3, is syncing to client necessary?


Yes.  We want to clear the master bitmap, otherwise on the next 
iteration we will get dirty bits that may be spurious.  But if we clear 
the dirty bitmap, we must copy it to all client bitmaps, not just vga, 
otherwise we lose data.



Simply doing OR at get_dirty like in this patch not enough?


If you don't clear the master bitmap, then you will get the same bits at 
the next iteration.





In this patch, master bitmap represents all types are dirty, similar
to existing 0xff. With this approach, resetting the master bitmap can
be done without checking the other types. set_dirty_flags is actually
taking the burden in this case though. Anyway, IIUC somebody would be
unhappy depending on the role of the master bitmap.


Yes, the problem is that set_dirty_flags is the common case and also
uses random access, we'd want to make it touch only a single bit. client
access is rare, and also sequential, and therefore efficient.


I see.
So set_dirty_flags would be like,

1. set master first regard less of dirty type.
2. if dirty type was CODE, set the client.


There really should be two APIs, one for general dirtyness and one for 
code dirty.  As Anthony said, the code dirty bitmap is completely 
separate from the other uses.


Consider starting by separating the two uses, it may clear up the code 
and make things easier later on.




In summary, I'll prepare two patch sets.

1. v3 of this patch set.
- Change FLAGS value to (1,2,4,8), and add IDX (0,1,2,3)
- Use ffsl to convert FLAGS to IDX.
- Add help function which takes IDX.
- Change the behavior of set_dirty_flags as above.
- Change dirty bitmap access to a loop.
- Add brace after if ()
- Move some macros to qemu-common.h.

2. Allocate vga and migration dirty bitmap dynamically.

Please modify or add items if I were missing.


I would add separation of CODE and the other dirty bitmaps.

--
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH] block.h: bdrv_create2 doesn't exist any more

2010-04-13 Thread Kevin Wolf

The bdrv_create2 implementation has been disappeared long ago. Remove its
prototype from the header file, too.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.h |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/block.h b/block.h
index 4a57dd5..05ad572 100644
--- a/block.h
+++ b/block.h
@@ -57,10 +57,6 @@ BlockDriver *bdrv_find_format(const char *format_name);
 BlockDriver *bdrv_find_whitelisted_format(const char *format_name);
 int bdrv_create(BlockDriver *drv, const char* filename,
 QEMUOptionParameter *options);
-int bdrv_create2(BlockDriver *drv,
- const char *filename, int64_t size_in_sectors,
- const char *backing_file, const char *backing_format,
- int flags);
 BlockDriverState *bdrv_new(const char *device_name);
 void bdrv_delete(BlockDriverState *bs);
 int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags);
-- 
1.6.6.1

[Qemu-devel] Re: [PATCH v2 3/6] Modifies wrapper functions for byte-based phys_ram_dirty bitmap to bit-based phys_ram_dirty bitmap.

2010-04-13 Thread Yoshiaki Tamura


Avi Kivity wrote:

On 04/13/2010 11:01 AM, Yoshiaki Tamura wrote:

Avi Kivity wrote:

On 04/12/2010 01:58 PM, Yoshiaki Tamura wrote:

Is it necessary to update migration and vga bitmaps?

We can simply update the master bitmap, and update the migration
and vga
bitmaps only when they need it. That can be done in a different patch.



Let me explain the role of the master bitmap here.

In the previous discussion, the master bitmap represented at least one
dirty type is actually dirty. While implementing this approach, I
though there is one downside that upon resetting the bitmap, it needs
to check all dirty types whether they are already cleared to unset the
master bitmap, and this might hurt the performance in general.


The way I see it, the master bitmap is a buffer between the guest and
all the other client bitmaps (migration and vga). So, a bit can be set
in vga and cleared in master. Let's look at the following scenario:

1. guest touches address 0xa (page 0xa0)
2. qemu updates master bitmap bit 0xa0
3. vga timer fires
4. vga syncs the dirty bitmap for addresses 0xa-0xc
4.1 vga_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
4.2 migration_bitmap[0xa0-0xc0] |= master_bitmap[0xa0-0xc0]
4.3 master_bitmap[0xa0-0xc0] = 0
5. vga draws based on vga_bitmap

the nice thing is that step 4.2 can be omitted if migration is not
active.

so, client bitmaps are always updated when the client requests them,
master bitmap is only a buffer.


Thanks. I think I'm finally getting your point.
At 4.2 and 4.3, is syncing to client necessary?


Yes. We want to clear the master bitmap, otherwise on the next iteration
we will get dirty bits that may be spurious. But if we clear the dirty
bitmap, we must copy it to all client bitmaps, not just vga, otherwise
we lose data.


Simply doing OR at get_dirty like in this patch not enough?


If you don't clear the master bitmap, then you will get the same bits at
the next iteration.


OK.
Master is just a cache/buffer, and upon get_dirty, we copy it to client bitmaps 
with for loop, clear the master and finally check actual client bitmap.  If 
we're going to allocate client bitmaps dynamically, we need to check whether it 
isn't null too.



In this patch, master bitmap represents all types are dirty, similar
to existing 0xff. With this approach, resetting the master bitmap can
be done without checking the other types. set_dirty_flags is actually
taking the burden in this case though. Anyway, IIUC somebody would be
unhappy depending on the role of the master bitmap.


Yes, the problem is that set_dirty_flags is the common case and also
uses random access, we'd want to make it touch only a single bit. client
access is rare, and also sequential, and therefore efficient.


I see.
So set_dirty_flags would be like,

1. set master first regard less of dirty type.
2. if dirty type was CODE, set the client.


There really should be two APIs, one for general dirtyness and one for
code dirty. As Anthony said, the code dirty bitmap is completely
separate from the other uses.

Consider starting by separating the two uses, it may clear up the code
and make things easier later on.


That really make sense.  A little bit hesitant because the code might look a bit 
duplicated.



In summary, I'll prepare two patch sets.

1. v3 of this patch set.
- Change FLAGS value to (1,2,4,8), and add IDX (0,1,2,3)
- Use ffsl to convert FLAGS to IDX.
- Add help function which takes IDX.
- Change the behavior of set_dirty_flags as above.
- Change dirty bitmap access to a loop.
- Add brace after if ()
- Move some macros to qemu-common.h.

2. Allocate vga and migration dirty bitmap dynamically.

Please modify or add items if I were missing.


I would add separation of CODE and the other dirty bitmaps.


I'm not sure I should starting separating first or later at this moment, but 
will consider it.

Re: [Qemu-devel] How to lock-up your tap-based VM network

2010-04-13 Thread Jan Kiszka

Paul Brook wrote:
 A major reason for this deadlock could likely be removed by shutting
 down the tap (if peered) or dropping packets in user space (in case of
 vlan) when a NIC is stopped or otherwise shut down. Currently most (if
 not all) NIC models seem to signal both queue full and RX disabled
 via !can_receive().
 
 No. A disabled device should return true from can_recieve, then discard the 
 packets in its receive callback. Failure to do so is a bug in the device. It 
 looks like the virtio-net device may be buggy.

That's not a virtio-only issue. In fact, we ran into this over pcnet,
and a quick check of other popular PCI NIC models (except for rtl8139)
revealed the same picture: They only report can_receive if their
receiver unit is up and ready (some also include the queue state, but
that's an add-on).

I think it's clear why: can_receive strongly suggests that a suspended
receiver should make the model return false. If we want to keep this
handler, it should be refactored to something like queue_full.

But before starting any refactoring endeavor: Do we have a consensus on
the direction? Refactor can_receive to queue_full? Or even drop it?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] How to lock-up your tap-based VM network

2010-04-13 Thread Jan Kiszka

Jamie Lokier wrote:
 Paul Brook wrote:
 A major reason for this deadlock could likely be removed by shutting
 down the tap (if peered) or dropping packets in user space (in case of
 vlan) when a NIC is stopped or otherwise shut down. Currently most (if
 not all) NIC models seem to signal both queue full and RX disabled
 via !can_receive().
 No. A disabled device should return true from can_recieve, then discard the 
 packets in its receive callback. Failure to do so is a bug in the device. It 
 looks like the virtio-net device may be buggy.
 
 I agree - or alternatively signal that there's no point sending it
 packets and they should be dropped without bothering to construct them.
 
 But anyway, this flow control mechanism is buggy - what if instead of
 an interface down, you just have a *slow* guest?  That should not push
 back so much that it makes other guests networking with each other
 slow down.

Indeed. So, instead of the current scheme that tries to stop the sender
when some receiver overflows, we must turn it up side down so that this
stop can _never_ happen. We may keep a sufficiently long queue to reduce
the risk of packet drops, but we can't prevent this for all cases anyway.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] How to lock-up your tap-based VM network

2010-04-13 Thread Jan Kiszka

Paul Brook wrote:
 But anyway, this flow control mechanism is buggy - what if instead of
 an interface down, you just have a *slow* guest?  That should not push
 back so much that it makes other guests networking with each other
 slow down.
 
 The OP described multiple guests connected via a host bridge. In this case it 
 is entirely the host's responsibility to arbitrate between multiple guests. 
 If 
 one interface can block the bridge simply by failing to respond in a timely 
 manner then this is a serious bug or misconfiguration of your host bridge.

It's not the bridge, I think it's limited to the tun driver as bridge
participant and how it is configured/used by QEMU.

 
 The reason tap_send exists is because slirp does not implement TCP flow 
 control properly.  Instead it relies on the can_send hook to only avoid 
 dropped packets.
 
 Using this in the tap code is debatable. My guess is that this is a hack to 
 workaround guest network stacks that expect throughput to be limited by line 
 speed (which is a virtual environment is effectively infinite), have poor 
 higher level flow control, and react badly to packet loss.

[ CC'ing some people who should have been involved in establishing the
TX mitigation for tap devices. Full thread under
http://thread.gmane.org/gmane.comp.emulators.qemu/67809 ]

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH] SCSI: Add disk reset handler

2010-04-13 Thread Jan Kiszka

Ensure that pending requests of an SCSI disk are purged on system reset
and also restore max_lba. The latter is now only present in the reset
handler as that one is already automatically called during init.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/scsi-disk.c |   32 
 1 files changed, 24 insertions(+), 8 deletions(-)

This should be applied to stable as well. I can provide a corresponding
patch once this one is acceptable.

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 77cb1da..50d5e9c 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -1010,22 +1010,42 @@ static int32_t scsi_send_command(SCSIDevice *d, 
uint32_t tag,
 }
 }
 
-static void scsi_destroy(SCSIDevice *dev)
+static void scsi_disk_purge_requests(SCSIDiskState *s)
 {
-SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
 SCSIDiskReq *r;
 
 while (!QTAILQ_EMPTY(s-qdev.requests)) {
 r = DO_UPCAST(SCSIDiskReq, req, QTAILQ_FIRST(s-qdev.requests));
 scsi_remove_request(r);
 }
+}
+
+static void scsi_disk_reset(DeviceState *dev)
+{
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev.qdev, dev);
+uint64_t nb_sectors;
+
+scsi_disk_purge_requests(s);
+
+bdrv_get_geometry(s-bs, nb_sectors);
+nb_sectors /= s-cluster_size;
+if (nb_sectors) {
+nb_sectors--;
+}
+s-max_lba = nb_sectors;
+}
+
+static void scsi_destroy(SCSIDevice *dev)
+{
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
+
+scsi_disk_purge_requests(s);
 drive_uninit(s-qdev.conf.dinfo);
 }
 
 static int scsi_disk_initfn(SCSIDevice *dev)
 {
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
-uint64_t nb_sectors;
 
 if (!s-qdev.conf.dinfo || !s-qdev.conf.dinfo-bdrv) {
 error_report(scsi-disk: drive property not set);
@@ -1046,11 +1066,6 @@ static int scsi_disk_initfn(SCSIDevice *dev)
 s-cluster_size = s-qdev.blocksize / 512;
 
 s-qdev.type = TYPE_DISK;
-bdrv_get_geometry(s-bs, nb_sectors);
-nb_sectors /= s-cluster_size;
-if (nb_sectors)
-nb_sectors--;
-s-max_lba = nb_sectors;
 qemu_add_vm_change_state_handler(scsi_dma_restart_cb, s);
 return 0;
 }
@@ -1059,6 +1074,7 @@ static SCSIDeviceInfo scsi_disk_info = {
 .qdev.name= scsi-disk,
 .qdev.desc= virtual scsi disk or cdrom,
 .qdev.size= sizeof(SCSIDiskState),
+.qdev.reset   = scsi_disk_reset,
 .init = scsi_disk_initfn,
 .destroy  = scsi_destroy,
 .send_command = scsi_send_command,

Re: [Qemu-devel] How to lock-up your tap-based VM network

2010-04-13 Thread Paul Brook

 Paul Brook wrote:
  But anyway, this flow control mechanism is buggy - what if instead of
  an interface down, you just have a *slow* guest?  That should not push
  back so much that it makes other guests networking with each other
  slow down.
 
  The OP described multiple guests connected via a host bridge. In this
  case it is entirely the host's responsibility to arbitrate between
  multiple guests. If one interface can block the bridge simply by failing
  to respond in a timely manner then this is a serious bug or
  misconfiguration of your host bridge.
 
 It's not the bridge, I think it's limited to the tun driver as bridge
 participant and how it is configured/used by QEMU.

If one tap interface can prevent correct operation of another by failing to 
read data, then this is clearly a host kernel bug. 
I've no idea whether it's a bug in the TAP implementation (e.g. shared queue 
between independent devices), a bug in the bridging implementation (drops 
unrelated packets when one port is slow), or some interaction between the two, 
but it's definitely not a qemu bug.

I'm sure there are plenty of ways to DoS a qemu instance, and prevent it 
reading data from its tap interface.  How/whether this effects other unrelated 
processes on the host machine is not something qemu can or should know about.

Paul

Re: [Qemu-devel] How to lock-up your tap-based VM network

2010-04-13 Thread Jan Kiszka

Paul Brook wrote:
 Paul Brook wrote:
 A major reason for this deadlock could likely be removed by shutting
 down the tap (if peered) or dropping packets in user space (in case of
 vlan) when a NIC is stopped or otherwise shut down. Currently most (if
 not all) NIC models seem to signal both queue full and RX disabled
 via !can_receive().
 No. A disabled device should return true from can_recieve, then discard
 the packets in its receive callback. Failure to do so is a bug in the
 device. It looks like the virtio-net device may be buggy.
 That's not a virtio-only issue. In fact, we ran into this over pcnet,
 and a quick check of other popular PCI NIC models (except for rtl8139)
 revealed the same picture: They only report can_receive if their
 receiver unit is up and ready (some also include the queue state, but
 that's an add-on).
 
 If so these are also buggy.
 
 I think it's clear why: can_receive strongly suggests that a suspended
 receiver should make the model return false. If we want to keep this
 handler, it should be refactored to something like queue_full.
 
 I don't see a need to refactor anything.  You just need to fix the devices 
 that incorrectly return false when their RX engine is disabled.

can_receive is a misleading name. NIC model developers will continue
to make mistakes when implementing this function. And I don't think we
want such implementations all around:

static int rtl8139_can_receive(VLANClientState *nc)
{
RTL8139State *s = DO_UPCAST(NICState, nc, nc)-opaque;
int avail;

/* Receive (drop) packets if card is disabled.  */
if (!s-clock_enabled)
  return 1;
if (!rtl8139_receiver_enabled(s))
  return 1;

if (rtl8139_cp_receiver_enabled(s)) {
/* ??? Flow control not implemented in c+ mode.
   This is a hack to work around slirp deficiencies anyway.  */
return 1;
} else {
avail = MOD2(s-RxBufferSize + s-RxBufPtr - s-RxBufAddr,
 s-RxBufferSize);
return (avail == 0 || avail = 1514);
}
}

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: Missing singlestep for already-translated code?

2010-04-13 Thread Jan Kiszka

Jun Koi wrote:
 Hi,
 
 I am looking into the singlestep command in monitor interface, and it
 seems that we only take into account the singlestep flag when we are
 translating code.
 So for the already-translated code, we will miss singlestep?

This feature is broken. For TCG, it should at least flush the
translation buffer, and for KVM it has to enable single-stepping in the
kernel. That's what happens automatically when you call cpu_single_step.
I guess 'singlestep' wants to be somehow orthogonal to this. But this is
the wrong approach.

Does anyone actually used this feature or still does so? It looks fairly
redundant to me, kind of a poor-man's gdb front-end as part of the
monitor console.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] How to lock-up your tap-based VM network

2010-04-13 Thread Paul Brook

 Paul Brook wrote:
  A major reason for this deadlock could likely be removed by shutting
  down the tap (if peered) or dropping packets in user space (in case of
  vlan) when a NIC is stopped or otherwise shut down. Currently most (if
  not all) NIC models seem to signal both queue full and RX disabled
  via !can_receive().
 
  No. A disabled device should return true from can_recieve, then discard
  the packets in its receive callback. Failure to do so is a bug in the
  device. It looks like the virtio-net device may be buggy.
 
 That's not a virtio-only issue. In fact, we ran into this over pcnet,
 and a quick check of other popular PCI NIC models (except for rtl8139)
 revealed the same picture: They only report can_receive if their
 receiver unit is up and ready (some also include the queue state, but
 that's an add-on).

If so these are also buggy.

 I think it's clear why: can_receive strongly suggests that a suspended
 receiver should make the model return false. If we want to keep this
 handler, it should be refactored to something like queue_full.

I don't see a need to refactor anything.  You just need to fix the devices 
that incorrectly return false when their RX engine is disabled.

Paul

[Qemu-devel] KVM call minutes for Apr 13

2010-04-13 Thread Chris Wright

No topics, short call

Re: [Qemu-devel] Re: Missing singlestep for already-translated code?

2010-04-13 Thread Alexander Graf


On 13.04.2010, at 15:36, Jan Kiszka wrote:

 Jun Koi wrote:
 Hi,
 
 I am looking into the singlestep command in monitor interface, and it
 seems that we only take into account the singlestep flag when we are
 translating code.
 So for the already-translated code, we will miss singlestep?
 
 This feature is broken. For TCG, it should at least flush the
 translation buffer, and for KVM it has to enable single-stepping in the
 kernel. That's what happens automatically when you call cpu_single_step.
 I guess 'singlestep' wants to be somehow orthogonal to this. But this is
 the wrong approach.
 
 Does anyone actually used this feature or still does so? It looks fairly
 redundant to me, kind of a poor-man's gdb front-end as part of the
 monitor console.

Not sure what it does, but I use -singlestep quite a lot to get register dumps 
for instructions when using -d cpu.

Alex

Re: [Qemu-devel] [PATCH 2/2] VirtIO RNG

2010-04-13 Thread Ian Molton

Paul Brook wrote:
 Paul Brook wrote:

 So now everything that looks like a stream of bytes has to use the
 virtio-serial code...
 
 IMO, yes. That's the whole point of virtio-serial.
 
 You can handle it however you like in your in your favourite guest OS, but I 
 object to qemu having multiple virtio devices that look and smell exactly 
 like 
 a serial port.

I dont think it looks and smells exactly like a serial port. The code is
simpler to begin with, nor does it support any out-of-band signalling.

 Why? Its not like it'll make the rng device any simpler, smaller,
 faster, or reduce its dependencies. Virtio is simple enough to begin with!
 
 I disagree. Past experience shows that it's more than possible to completely 
 screw up something as simple as a serial port.

Straw man. Serial ports might be fairly simple things, but my device is
simpler still than most of them, and you're talking about emulating a
serial port, where you have control lines, line disciplines, etc. to
emulate, all of which need to be right. None of that is a concern in my
driver.

 If everything that looks like a 
 serial port provides their own implementation then the chances are they'll 
 all 
 be missing something, and broken in different ways.

Better never write any code then, its garunteesd to have a bug in it.

I *know* what you're getting at - I removed hundreds of lines of common
code copied and randomly, brokenly, modified from the various ASoC
kernel drivers a few months back.

But this bit of code doesnt really share enough in common with anything
else to make it worth handling your way. Alternatively, prove me wrong!

 Take rate limiting as an example: Your implementation has at least one 
 outright bug (using gettimeofday),

Arguably a bug. I'd disagree that it causes any problems in practice
(qemu_gettimeofday, btw - which I was specifically requested to use
during an earlier review of this patch...)

 and various questionable characteristics (is a 1-second token bucket 
 appropriate).

So what? if people dont like the default policy, or its too limited,
they can feel free to contribute patches. Its hardly like this is core
code that minute changes will require months of review to ratify!

At least I'm doing something about the
total-lack-of-any-support-whatsoever situation.

 The EGD protocol bits are another example. You aren't exposing any of the 
 interesting bits of this protocol to the guest, so it seems entirely 
 reasonable to connect this to a serial port.

Except that then we need to have a little daemon to feed rate limited
EGD protocol data to said serial port.

So now you've got the situation of the USB-serial entropy key, feeding
the (potentially) proprietary egd-compatible-but-not-rate-limited daemon
 that feeds a rate limiting daemon into a serial port on qemu that feeds
the guest OS.

So this is running in a server farm with 64 or 128 guests sat on it.
each one with its own silly rate limiting daemon / serial port mux from
the egd daemon the USB stick feeds. What happens when you apt-get
upgrade and the egd daemon gets replaced? you really want to notify and
restart all 128 noddy mux daemons and reconnect qemu to them? Why dont
we just kill the guests and reboot them all too?

My patchset handled this case without any problems at all (the reader
would simply (and without blocking) reconect to the server).

The point is that you can take 'generic support' too far. Sure you CAN
model this as a bunch of serial ports, and feed them from rate limiting
daemons, which in turn read from some egd-speaking daemon or hardware,
but EGD protocol is *SO* simple. Why on earth go to these lengths when
it can be implemented and embedded in a handful of lines of code?

The chardev reconnect stuff was in the generic qemu chardev layer and
thus reuseable by other parts of qemu that need to maintain a persistent
state even when connected to other processes via sockets, pipes, fifos etc.

 That allows it to be used by guests that don't have a virtio-rng driver.

Thats already supported by use of TCP tunnels. but thats 1) not
efficient and 2) not very secure - few sources of entropy provide crypto
to garuntee that the data source hasnt been frobbed.   3) much more
vulnerable to attack from processes running on the guest (TCP is much
more exposde than the rng-core internals)

So, rather than bike-shedding, how about making some kind of decision?

Either:

1) Dictate how the code is going to be written (in which case, since you
know it so well, write it yourself)

or:

2) Accept that not everyone likes your (idea for a) implementation, that
prior interfaces exist on the guest side, and that a perfectly good and
non-intrusive way of doing it on qemu has beeen written for you.
(including a couple of other nice features / fixes along the way, like
adding the missing SIZE_ property, socket reconnect support, etc.)

Bear in mind that its rather unfair to (as has been in this case) have a
patch reviewed, modified based on

Re: [Qemu-devel] [PATCH 2/2] VirtIO RNG

2010-04-13 Thread Paul Brook

 Bear in mind that its rather unfair to (as has been in this case) have a
 patch reviewed, modified based on criticism, and then rejected after
 effort has been wasted working on it.

I'm pretty sure I raised all these issues the first time this code was 
submitted.

Paul

[Qemu-devel] [PATCH] tun: orphan an skb on tx

2010-04-13 Thread Michael S. Tsirkin

The following situation was observed in the field:
tap1 sends packets, tap2 does not consume them, as a result
tap1 can not be closed. This happens because
tun/tap devices can hang on to skbs undefinitely.

As noted by Herbert, possible solutions include a timeout followed by a
copy/change of ownership of the skb, or always copying/changing
ownership if we're going into a hostile device.

This patch implements the second approach.

Note: one issue still remaining is that since skbs
keep reference to tun socket and tun socket has a
reference to tun device, we won't flush backlog,
instead simply waiting for all skbs to get transmitted.
At least this is not user-triggerable, and
this was not reported in practice, my assumption is
other devices besides tap complete an skb
within finite time after it has been queued.

A possible solution for the second issue
would not to have socket reference the device,
instead, implement dev-destructor for tun, and
wait for all skbs to complete there, but this
needs some thought, probably too risky for 2.6.34.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
Tested-by: Yan Vugenfirer yvuge...@redhat.com

---

Please review the below, and consider for 2.6.34,
and stable trees.

 drivers/net/tun.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 96c39bd..4326520 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -387,6 +387,10 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, 
struct net_device *dev)
}
}
 
+   /* Orphan the skb - required as we might hang on to it
+* for indefinite time. */
+   skb_orphan(skb);
+
/* Enqueue packet */
skb_queue_tail(tun-socket.sk-sk_receive_queue, skb);
dev-trans_start = jiffies;
-- 
1.7.0.2.280.gc6f05

[Qemu-devel] Re: [PATCH] tun: orphan an skb on tx

2010-04-13 Thread Herbert Xu

On Tue, Apr 13, 2010 at 05:59:44PM +0300, Michael S. Tsirkin wrote:
 The following situation was observed in the field:
 tap1 sends packets, tap2 does not consume them, as a result
 tap1 can not be closed. This happens because
 tun/tap devices can hang on to skbs undefinitely.
 
 As noted by Herbert, possible solutions include a timeout followed by a
 copy/change of ownership of the skb, or always copying/changing
 ownership if we're going into a hostile device.
 
 This patch implements the second approach.
 
 Note: one issue still remaining is that since skbs
 keep reference to tun socket and tun socket has a
 reference to tun device, we won't flush backlog,
 instead simply waiting for all skbs to get transmitted.
 At least this is not user-triggerable, and
 this was not reported in practice, my assumption is
 other devices besides tap complete an skb
 within finite time after it has been queued.
 
 A possible solution for the second issue
 would not to have socket reference the device,
 instead, implement dev-destructor for tun, and
 wait for all skbs to complete there, but this
 needs some thought, probably too risky for 2.6.34.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 Tested-by: Yan Vugenfirer yvuge...@redhat.com

Acked-by: Herbert Xu herb...@gondor.apana.org.au

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [Qemu-devel] Re: Missing singlestep for already-translated code?

2010-04-13 Thread Jan Kiszka

Alexander Graf wrote:
 On 13.04.2010, at 15:36, Jan Kiszka wrote:
 
 Jun Koi wrote:
 Hi,

 I am looking into the singlestep command in monitor interface, and it
 seems that we only take into account the singlestep flag when we are
 translating code.
 So for the already-translated code, we will miss singlestep?
 This feature is broken. For TCG, it should at least flush the
 translation buffer, and for KVM it has to enable single-stepping in the
 kernel. That's what happens automatically when you call cpu_single_step.
 I guess 'singlestep' wants to be somehow orthogonal to this. But this is
 the wrong approach.

 Does anyone actually used this feature or still does so? It looks fairly
 redundant to me, kind of a poor-man's gdb front-end as part of the
 monitor console.
 
 Not sure what it does, but I use -singlestep quite a lot to get register 
 dumps for instructions when using -d cpu.

Ah, singlestep is not about stopping the VM after each instruction but
about limiting the TB length to a single instruction. Badly named and
poorly documented.

In that case, the dynamic switch should already be fine by adding a
tb_flush() on enable. Still, someone should also patch at least the docs.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 2/2] VirtIO RNG

2010-04-13 Thread Paul Brook

 So, rather than bike-shedding, how about making some kind of decision?

Ok, let me make this simple.

Features such as rate limiting and EGD protocol translation should not be part 
of individual device emulation. They are part of the host interface, not the 
guest machine.  If you really want these in qemu then they should be 
implemented as part of the chardev layer.

It may be a bit more work to implement these properly. However I'm not going 
to accept a half-assed implementation just because someone wrote it.


A direct result of this is that virtio-rng degenerates to a crippled virtual 
serial port. We already have a virtual serial port implementation designed for 
exactly this kind of application.  IMHO virtio-rng is a complete waste of time 
and space.

Paul

[Qemu-devel] Re: [PATCH] tun: orphan an skb on tx

2010-04-13 Thread Jan Kiszka

Michael S. Tsirkin wrote:
 The following situation was observed in the field:
 tap1 sends packets, tap2 does not consume them, as a result
 tap1 can not be closed.

And before that, tap1 may not be able to send further packets to anyone
else on the bridge as its TX resources were blocked by tap2 - that's
what we saw in the field.

 This happens because
 tun/tap devices can hang on to skbs undefinitely.
 
 As noted by Herbert, possible solutions include a timeout followed by a
 copy/change of ownership of the skb, or always copying/changing
 ownership if we're going into a hostile device.
 
 This patch implements the second approach.
 
 Note: one issue still remaining is that since skbs
 keep reference to tun socket and tun socket has a
 reference to tun device, we won't flush backlog,
 instead simply waiting for all skbs to get transmitted.
 At least this is not user-triggerable, and
 this was not reported in practice, my assumption is
 other devices besides tap complete an skb
 within finite time after it has been queued.
 
 A possible solution for the second issue
 would not to have socket reference the device,
 instead, implement dev-destructor for tun, and
 wait for all skbs to complete there, but this
 needs some thought, probably too risky for 2.6.34.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 Tested-by: Yan Vugenfirer yvuge...@redhat.com
 
 ---
 
 Please review the below, and consider for 2.6.34,
 and stable trees.
 
  drivers/net/tun.c |4 
  1 files changed, 4 insertions(+), 0 deletions(-)
 
 diff --git a/drivers/net/tun.c b/drivers/net/tun.c
 index 96c39bd..4326520 100644
 --- a/drivers/net/tun.c
 +++ b/drivers/net/tun.c
 @@ -387,6 +387,10 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, 
 struct net_device *dev)
   }
   }
  
 + /* Orphan the skb - required as we might hang on to it
 +  * for indefinite time. */
 + skb_orphan(skb);
 +
   /* Enqueue packet */
   skb_queue_tail(tun-socket.sk-sk_receive_queue, skb);
   dev-trans_start = jiffies;

Nice solution, thanks!

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: Problem with QEMU on KVM

2010-04-13 Thread Jan Kiszka

Mulyadi Santosa wrote:
 Hi Jamie...
 
 On Mon, Apr 12, 2010 at 19:07, Jamie Lokier ja...@shareable.org wrote:
 There are various -no-kvm-XXX options to try:

 -no-kvm-irqchip disable KVM kernel mode PIC/IOAPIC/LAPIC
 -no-kvm-pit disable KVM kernel mode PIT
 -no-kvm-pit-reinjection disable KVM kernel mode PIT interrupt reinjection
 
 I try to read the source code without too much luck understanding the
 meaning of the above parameters, especially the reinjection. Please
 CMIIW, interrupt reinjection is a way to handle lost ticks, right?

Those switches only exist in the qemu-kvm branch. The fact that
-enable-kvm has some effect for you makes me think that you tried
upstream KVM support (ie. the one that comes with vanilla QEMU), right?

It would be interesting to check if you can reproduce the problem also
with qemu-kvm(-0.12.3 or git head). As you may also face an issue of the
kvm kernel bits, it would be furthermore helpful to check with the
latest kernel from kvm.git or the modules build via kvm-kmod as well.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: [PATCH] tun: orphan an skb on tx

2010-04-13 Thread Jan Kiszka

Eric Dumazet wrote:
 Le mardi 13 avril 2010 à 17:36 +0200, Jan Kiszka a écrit :
 Michael S. Tsirkin wrote:
 The following situation was observed in the field:
 tap1 sends packets, tap2 does not consume them, as a result
 tap1 can not be closed.
 And before that, tap1 may not be able to send further packets to anyone
 else on the bridge as its TX resources were blocked by tap2 - that's
 what we saw in the field.

 
 After the patch, tap1 is able to flood tap2, and tap3/tap4 not able to
 send one single frame. Is it OK ?

I think if that's a real issue, you have to apply traffic shaping to the
untrusted nodes. The existing flow-control scheme was fragile anyway as
you had to translate packet lengths on TX side to packet counts on RX.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: [PATCH] tun: orphan an skb on tx

2010-04-13 Thread Michael S. Tsirkin

On Tue, Apr 13, 2010 at 06:40:38PM +0200, Eric Dumazet wrote:
 Le mardi 13 avril 2010 à 17:36 +0200, Jan Kiszka a écrit :
  Michael S. Tsirkin wrote:
   The following situation was observed in the field:
   tap1 sends packets, tap2 does not consume them, as a result
   tap1 can not be closed.
  
  And before that, tap1 may not be able to send further packets to anyone
  else on the bridge as its TX resources were blocked by tap2 - that's
  what we saw in the field.
  
 
 After the patch, tap1 is able to flood tap2, and tap3/tap4 not able to
 send one single frame. Is it OK ?

Yes :) This was always possible. Number of senders needed to flood
a receiver might vary depending on send/recv queue size
that you set. External sources can also fill your RX queue
if you let them. In the end, we need to rely on the scheduler for fairness,
or apply packet shaping.

 Back to the problem : tap1 cannot be closed.
 
 Why ? because of refcounts ?

Yes.

 When a socket with inflight tx packets is closed, we dont block the
 close, we only delay the socket freeing once all packets were delivered
 and freed.
 

Which is wrong, since this is under userspace control, so you get
unkillable processes.

-- 
MST

[Qemu-devel] Re: [PATCH] block: Split bdrv_open

2010-04-13 Thread Christoph Hellwig

On Mon, Apr 12, 2010 at 04:49:16PM +0200, Kevin Wolf wrote:
 bdrv_open contains quite some code that is only useful for opening images (as
 opposed to opening files by a protocol), for example snapshots.
 
 This patch splits the code so that we have bdrv_open_file() for files (uses
 protocols), bdrv_open() for images (uses format drivers) and bdrv_do_open() 
 for
 the code common for opening both images and files.
 
 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
 This patch applies on top of Christoph's RFC for the format/protocol split

I like this a lot.  A few comments:

 - why is bdrv_do_open below it's users in the code?  I really hate
   forward declarations of functions and they can usually be easily
   avoided.
 - a do a function name is not very meaningfull - what about
   bdrv_open_common instead?
 - doesn't the backing device handling only apply to image formats, too?

Re: [Qemu-devel] How to lock-up your tap-based VM network

2010-04-13 Thread Blue Swirl

On 4/12/10, Paul Brook p...@codesourcery.com wrote:
  A major reason for this deadlock could likely be removed by shutting
   down the tap (if peered) or dropping packets in user space (in case of
   vlan) when a NIC is stopped or otherwise shut down. Currently most (if
   not all) NIC models seem to signal both queue full and RX disabled
   via !can_receive().


 No. A disabled device should return true from can_recieve, then discard the
  packets in its receive callback. Failure to do so is a bug in the device. It
  looks like the virtio-net device may be buggy.

Awesome, it looks like a longstanding bug with pcnet/lance has is
fixed by this change! OpenBSD installer would hang when receiving
packages, now it works!

diff --git a/hw/pcnet.c b/hw/pcnet.c
index 5e63eb5..a04ad09 100644
--- a/hw/pcnet.c
+++ b/hw/pcnet.c
@@ -1031,8 +1031,6 @@ static int pcnet_tdte_poll(PCNetState *s)
 int pcnet_can_receive(VLANClientState *nc)
 {
 PCNetState *s = DO_UPCAST(NICState, nc, nc)-opaque;
-if (CSR_STOP(s) || CSR_SPND(s))
-return 0;

 return sizeof(s-buffer)-16;
 }

Re: [Qemu-devel] [GSoC 2010] Pass-through filesystem support.

2010-04-13 Thread jvrao

jvrao wrote:
 Alexander Graf wrote:
 On 12.04.2010, at 13:58, Jamie Lokier wrote:

 Mohammed Gamal wrote:
 On Mon, Apr 12, 2010 at 12:29 AM, Jamie Lokier ja...@shareable.org wrote:
 Javier Guerra Giraldez wrote:
 On Sat, Apr 10, 2010 at 7:42 AM, Mohammed Gamal m.gamal...@gmail.com 
 wrote:
 On Sat, Apr 10, 2010 at 2:12 PM, Jamie Lokier ja...@shareable.org 
 wrote:
 To throw a spanner in, the most widely supported filesystem across
 operating systems is probably NFS, version 2 :-)
 Remember that Windows usage on a VM is not some rare use case, and
 it'd be a little bit of a pain from a user's perspective to have to
 install a third party NFS client for every VM they use. Having
 something supported on the VM out of the box is a better option IMO.
 i don't think virtio-CIFS has any more support out of the box (on any
 system) than virtio-9P.
 It doesn't, but at least network-CIFS tends to work ok and is the
 method of choice for Windows VMs - when you can setup Samba on the
 host (which as previously noted you cannot always do non-disruptively
 with current Sambas).

 -- Jamie

 I think having support for both 9p and CIFS would be the best option.
 In that case the user will have the option to use either one,
 depending on how their guests support these filesystems. In that case
 I'd prefer to work on CIFS support while the 9p effort can still go
 on. I don't think both efforts are mutually exclusive.

 What do the rest of you guys think?
 I only noted NFS because most old OSes do not support CIFS or 9P -
 especially all the old unixes.

 I don't think old versions of MS-DOS and Windows (95, 98, ME, Nt4?)
 even support current CIFS.  They need extra server settings to work
 - such as setting passwords on the server to non-encrypted and other quirks.

 Meanwhile Windows Vista/2008/7 works better with SMB2, not CIFS, to
 properly see symlinks and hard links.

 So there is no really nice out of the box file service which works
 easily with all guest OSes.

 I'm guessing that out of all the filesystems, CIFS is the most widely
 supported in recent OSes (released in the last 10 years).  But I'm not
 really sure what the state of CIFS is for non-Windows, non-Linux,
 non-BSD guests.
 So what? If you want to have direct host fs access, install guest drivers. 
 If you can't, set up networking and use CIFS or NFS or whatever.

 I'm not sure why 9P is being pursued.  Does anything much support it,
 or do all OSes except quite recent Linux need a custom driver for 9P?
 Even Linux only got the first commit in the kernel 5 years ago, so
 probably it was only about 3 years ago that it will have begun
 appearing in stable distros, if at all.  Filesystem passthrough to
 Linux guests installed in the last couple of years is a useful
 feature, and I know that for many people that is their only use of
 KVM, but compared with CIFS' broad support it seems like quite a
 narrow goal.
 The goal is to have something simple and fast. We can fine-tune 9P to align 
 with the Linux VFS structures, making it really little overhead (and little 
 headache too). For Windows guests, nothing prevents us to expose yet another 
 9P flavor. That again would be aligned well with Windows's VFS and be slim 
 and fast there.

 The biggest problem I see with CIFS is that it's a huge beast. There are a 
 lot of corner cases where it just doesn't fit in. See my previous mail for 
 more details.

 As Alex mentioned, 9P is chosen for its mere simplicity and easy adaptability.
 NFS and CIFS does not give that flexibility. As we mentioned in the patch 
 series, we are 
 already seeing better numbers with 9P. Looking ahead 9P can embed KVM/QEMU 
 knowledge
 to share physical resources like memory/cache between the host and the guest.
 
 I think looking into the windows side of 9P client would be great option too. 
 The current patch on the mailing list supports .U (unix) protocol and will be 
 introducing
 .L (Linux) variant as we move forward.

Hi Mohammed,
Please let us know once you decide on where your interest lies.
Will be glad to have you on VirtFS (9P) though. :)


- JV

 
 - JV
 
 
 Alex

Re: [Qemu-devel] How to lock-up your tap-based VM network

2010-04-13 Thread Blue Swirl

On 4/13/10, Blue Swirl blauwir...@gmail.com wrote:
 On 4/12/10, Paul Brook p...@codesourcery.com wrote:
A major reason for this deadlock could likely be removed by shutting
 down the tap (if peered) or dropping packets in user space (in case of
 vlan) when a NIC is stopped or otherwise shut down. Currently most (if
 not all) NIC models seem to signal both queue full and RX disabled
 via !can_receive().
  
  
   No. A disabled device should return true from can_recieve, then discard the
packets in its receive callback. Failure to do so is a bug in the device. 
 It
looks like the virtio-net device may be buggy.


 Awesome, it looks like a longstanding bug with pcnet/lance has is
  fixed by this change! OpenBSD installer would hang when receiving
  packages, now it works!

I spoke too soon, networking works also without the patch now.

Re: [Qemu-devel] Problem with DOS application and 286 DOS Extender application

2010-04-13 Thread Gerhard Wiesinger


On Tue, 13 Apr 2010, Roy Tam wrote:


2010/4/13 Gerhard Wiesinger li...@wiesinger.com:

Hello,

I'm having a problem with a DOS application which uses a 286 DOS Extender,
error message is as the following:
unable to create task for execution
Interrupt 10 (Ah) while creating task: Invalid task segment selector.
Happens with QEMM386 and HIMEM.SYS/EMM386.EXE.

I guess the application does at this point swithing to 286 protected mode
and trying to move conventional memory up to EMS memory.

Issue is NOT present under VMWare Server 2.0 and with real hardware.

DOS; MS-DOS 6.22
QEMU: 0.12.3 under Fedora 11, 2.6.30.10-105.2.23.fc11.x86 on AMD Phenom II
Quad Core, x86_64-softmmu.

Any comments or ideas (I guess something with protected mode and MMU might
be wrong)?



You need to mention the program name so that people can try to
reproduce the bug.



It is a non public, proprietary application which uses the Ergo Computing 
286 DOS Extender. I guess some other application which use the same DOS 
extender have the same problem. So best thing is to find another 
application which uses the Ergo Computing 286 DOS Extender, too.


Ciao,
Gerhard

--
http://www.wiesinger.com/

Re: [Qemu-devel] Problem with QEMU on KVM

2010-04-13 Thread Gerhard Wiesinger


On Mon, 12 Apr 2010, Jamie Lokier wrote:


Mulyadi Santosa wrote:

Hi Gerhard...

On Sun, Apr 11, 2010 at 20:52, Gerhard Wiesinger li...@wiesinger.com wrote:

OK, uses the following ports:
Port 0x20: 8259 interrupt controller
Port 0x40: 8253 timer

Interrupt 0x1A:
ah=0x00: fetches system timer counters
ah=0x02: reads the clock
ah=0x04: fetches date

So there must be something wrong with KVM with the above functionality (I
guess the timers).


Hmm, my silly guess is, maybe the timer is seen as decrementedor
at least stuck.

I have very little knowledge about git, but maybe you can start doing
git bisect to narrow which git commit that introduce such behaviour.
Meanwhile, let's wait for comments from one of the KVM developers.


There are various -no-kvm-XXX options to try:

-no-kvm-irqchip disable KVM kernel mode PIC/IOAPIC/LAPIC
-no-kvm-pit disable KVM kernel mode PIT
-no-kvm-pit-reinjection disable KVM kernel mode PIT interrupt reinjection

Any of them might be causing this problem.

They disable in-kernel emulation of the interrupt controller and timer
chips, and force the qemu version to be used.  It's possible the
in-kernel emulation is buggier than the qemu version.


Hello Jamie, thanx so far.

I'm starting in this way:
/root/download/qemu/qemu-0.12.3/x86_64-softmmu/qemu-system-x86_64 -enable-kvm 
-a_lot_of_other_options

All options you mentioned don't work:
/root/download/qemu/qemu-0.12.3/x86_64-softmmu/qemu-system-x86_64: invalid 
option -- '-no-kvm-irqchip'
/root/download/qemu/qemu-0.12.3/x86_64-softmmu/qemu-system-x86_64: invalid 
option -- '-no-kvm-pit'
/root/download/qemu/qemu-0.12.3/x86_64-softmmu/qemu-system-x86_64: invalid 
option -- '-no-kvm-no-kvm-irqchip'

Is there another way of startup or is it working only on newer versions?

BTW: For me it is very unclear what's the difference between the 
following commands:

qemu
qemu -enable-kvm
kvm

How can I build the kvm executable? Different source tree? Newer version 
from branch?


Maybe someone can explain the different versions of qemu/kvm and branches 
etc. and possible features in detail.


Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/

Re: [Qemu-devel] [GSoC 2010] Pass-through filesystem support.

2010-04-13 Thread Mohammed Gamal

On Tue, Apr 13, 2010 at 9:08 PM, jvrao jv...@linux.vnet.ibm.com wrote:
 jvrao wrote:
 Alexander Graf wrote:
 On 12.04.2010, at 13:58, Jamie Lokier wrote:

 Mohammed Gamal wrote:
 On Mon, Apr 12, 2010 at 12:29 AM, Jamie Lokier ja...@shareable.org 
 wrote:
 Javier Guerra Giraldez wrote:
 On Sat, Apr 10, 2010 at 7:42 AM, Mohammed Gamal m.gamal...@gmail.com 
 wrote:
 On Sat, Apr 10, 2010 at 2:12 PM, Jamie Lokier ja...@shareable.org 
 wrote:
 To throw a spanner in, the most widely supported filesystem across
 operating systems is probably NFS, version 2 :-)
 Remember that Windows usage on a VM is not some rare use case, and
 it'd be a little bit of a pain from a user's perspective to have to
 install a third party NFS client for every VM they use. Having
 something supported on the VM out of the box is a better option IMO.
 i don't think virtio-CIFS has any more support out of the box (on any
 system) than virtio-9P.
 It doesn't, but at least network-CIFS tends to work ok and is the
 method of choice for Windows VMs - when you can setup Samba on the
 host (which as previously noted you cannot always do non-disruptively
 with current Sambas).

 -- Jamie

 I think having support for both 9p and CIFS would be the best option.
 In that case the user will have the option to use either one,
 depending on how their guests support these filesystems. In that case
 I'd prefer to work on CIFS support while the 9p effort can still go
 on. I don't think both efforts are mutually exclusive.

 What do the rest of you guys think?
 I only noted NFS because most old OSes do not support CIFS or 9P -
 especially all the old unixes.

 I don't think old versions of MS-DOS and Windows (95, 98, ME, Nt4?)
 even support current CIFS.  They need extra server settings to work
 - such as setting passwords on the server to non-encrypted and other 
 quirks.

 Meanwhile Windows Vista/2008/7 works better with SMB2, not CIFS, to
 properly see symlinks and hard links.

 So there is no really nice out of the box file service which works
 easily with all guest OSes.

 I'm guessing that out of all the filesystems, CIFS is the most widely
 supported in recent OSes (released in the last 10 years).  But I'm not
 really sure what the state of CIFS is for non-Windows, non-Linux,
 non-BSD guests.
 So what? If you want to have direct host fs access, install guest drivers. 
 If you can't, set up networking and use CIFS or NFS or whatever.

 I'm not sure why 9P is being pursued.  Does anything much support it,
 or do all OSes except quite recent Linux need a custom driver for 9P?
 Even Linux only got the first commit in the kernel 5 years ago, so
 probably it was only about 3 years ago that it will have begun
 appearing in stable distros, if at all.  Filesystem passthrough to
 Linux guests installed in the last couple of years is a useful
 feature, and I know that for many people that is their only use of
 KVM, but compared with CIFS' broad support it seems like quite a
 narrow goal.
 The goal is to have something simple and fast. We can fine-tune 9P to align 
 with the Linux VFS structures, making it really little overhead (and little 
 headache too). For Windows guests, nothing prevents us to expose yet 
 another 9P flavor. That again would be aligned well with Windows's VFS and 
 be slim and fast there.

 The biggest problem I see with CIFS is that it's a huge beast. There are a 
 lot of corner cases where it just doesn't fit in. See my previous mail for 
 more details.

 As Alex mentioned, 9P is chosen for its mere simplicity and easy 
 adaptability.
 NFS and CIFS does not give that flexibility. As we mentioned in the patch 
 series, we are
 already seeing better numbers with 9P. Looking ahead 9P can embed KVM/QEMU 
 knowledge
 to share physical resources like memory/cache between the host and the guest.

 I think looking into the windows side of 9P client would be great option too.
 The current patch on the mailing list supports .U (unix) protocol and will 
 be introducing
 .L (Linux) variant as we move forward.

 Hi Mohammed,
 Please let us know once you decide on where your interest lies.
 Will be glad to have you on VirtFS (9P) though. :)


 - JV


It seems the community is more keen on getting 9P support merged than
getting CIFS supported, and they have made good points to support
their argument. I'm not sure whether work on this project could fit in
as a GSoC project and if there is much remaining work that could make
it fit in that direction. But I'd be glad to volunteer anyway :)

Regards,
Mohammed

[Qemu-devel] Re: [PATCH] tun: orphan an skb on tx

2010-04-13 Thread Michael S. Tsirkin

On Tue, Apr 13, 2010 at 08:31:03PM +0200, Eric Dumazet wrote:
 Le mardi 13 avril 2010 à 20:39 +0300, Michael S. Tsirkin a écrit :
 
   When a socket with inflight tx packets is closed, we dont block the
   close, we only delay the socket freeing once all packets were delivered
   and freed.
   
  
  Which is wrong, since this is under userspace control, so you get
  unkillable processes.
  
 
 We do not get unkillable processes, at least with sockets I was thinking
 about (TCP/UDP ones).
 
 Maybe tun sockets can behave the same ?

Looks like that's what my patch does: ip_rcv seems to call
skb_orphan too.

 Herbert Acked your patch, so I guess its OK, but I think it can be
 dangerous.
 Anyway my feeling is that we try to add various mechanisms to keep a
 hostile user flooding another one.
 
 For example, UDP got memory accounting quite recently, and we added
 socket backlog limits very recently. It was considered not needed few
 years ago.

[Qemu-devel] Re: [PATCH] tun: orphan an skb on tx

2010-04-13 Thread Michael S. Tsirkin

On Tue, Apr 13, 2010 at 10:38:06PM +0200, Eric Dumazet wrote:
 Le mardi 13 avril 2010 à 23:25 +0300, Michael S. Tsirkin a écrit :
  On Tue, Apr 13, 2010 at 08:31:03PM +0200, Eric Dumazet wrote:
   Le mardi 13 avril 2010 à 20:39 +0300, Michael S. Tsirkin a écrit :
   
 When a socket with inflight tx packets is closed, we dont block the
 close, we only delay the socket freeing once all packets were 
 delivered
 and freed.
 

Which is wrong, since this is under userspace control, so you get
unkillable processes.

   
   We do not get unkillable processes, at least with sockets I was thinking
   about (TCP/UDP ones).
   
   Maybe tun sockets can behave the same ?
  
  Looks like that's what my patch does: ip_rcv seems to call
  skb_orphan too.
 
 Well, I was speaking of tx side, you speak of receiving side.

Point is, both ip_rcv and my patch call skb_orphan.

 An external flood (coming from another domain) is another problem.
 
 A sender might flood the 'network' inside our domain. How can we
 reasonably limit the sender ?
 
 Maybe the answer is 'We can not', but it should be stated somewhere, so
 that someone can address this point later.
 

And whatever's done should ideally work for tap to IP
and IP to IP sockets as well, not just tap to tap.

-- 
MST

[Qemu-devel] [PATCH v2] Write cmos hd data for ide drives using -device parm

2010-04-13 Thread Bruce Rogers


When specifying ide devices using -device, the cmos information 
which the bios depends on is not written. This patch generalizes 
the cmos hd data setting for the existing code path and adds the 
ability to call that code on a per machine, per ide drive basis. 

This is important for older guests, such as Windows XP, and Windows 
Server 2003. 

I needed to add an id to the IDEBus structure since there wasn't 
a way to identify whether an ide bus was primary or secondary in 
the lower level code. 

v2:Fixed same issue for isapc machine type (previous patch addressed
pc machine types only). Additionally, the bus naming needed to be fixed
in the isa case to be able to reference the secondary ide bus in -device
nomenclature.

Signed-off-by: Bruce Rogers brog...@novell.com 

Bruce 

 hw/ide/internal.h |3 +-
 hw/ide/isa.c  |8 +-
 hw/ide/piix.c |6 +++-
 hw/ide/qdev.c |   13 +++---
 hw/pc.c   |   68 --
 sysemu.h  |1 
 vl.c  |6 
 7 files changed, 71 insertions(+), 34 deletions(-)

diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index 027029e..4ea62bd 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -450,6 +450,7 @@ struct IDEBus {
 IDEState ifs[2];
 uint8_t unit;
 uint8_t cmd;
+uint8_t id;
 qemu_irq irq;
 };
 
@@ -565,7 +566,7 @@ void ide_init2(IDEBus *bus, DriveInfo *hd0, DriveInfo *hd1,
 void ide_init_ioport(IDEBus *bus, int iobase, int iobase2);
 
 /* hw/ide/qdev.c */
-void ide_bus_new(IDEBus *idebus, DeviceState *dev);
+void ide_bus_new(IDEBus *idebus, DeviceState *dev, const char *name);
 IDEDevice *ide_create_drive(IDEBus *bus, int unit, DriveInfo *drive);
 
 #endif /* HW_IDE_INTERNAL_H */
diff --git a/hw/ide/isa.c b/hw/ide/isa.c
index dff7c79..ac3ee3b 100644
--- a/hw/ide/isa.c
+++ b/hw/ide/isa.c
@@ -67,7 +67,13 @@ static int isa_ide_initfn(ISADevice *dev)
 {
 ISAIDEState *s = DO_UPCAST(ISAIDEState, dev, dev);
 
-ide_bus_new(s-bus, s-dev.qdev);
+if (s-iobase == 0x1f0) {
+ide_bus_new(s-bus, s-dev.qdev, ide.0);
+s-bus.id = 0;
+} else {
+ide_bus_new(s-bus, s-dev.qdev, ide.1);
+s-bus.id = 1;
+}
 ide_init_ioport(s-bus, s-iobase, s-iobase2);
 isa_init_irq(dev, s-irq, s-isairq);
 ide_init2(s-bus, NULL, NULL, s-irq);
diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index 295a93d..377fda8 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -131,11 +131,13 @@ static int pci_piix_ide_initfn(PCIIDEState *d)
 
 vmstate_register(0, vmstate_ide_pci, d);
 
-ide_bus_new(d-bus[0], d-dev.qdev);
-ide_bus_new(d-bus[1], d-dev.qdev);
+ide_bus_new(d-bus[0], d-dev.qdev, NULL);
+ide_bus_new(d-bus[1], d-dev.qdev, NULL);
 ide_init_ioport(d-bus[0], 0x1f0, 0x3f6);
 ide_init_ioport(d-bus[1], 0x170, 0x376);
 
+d-bus[0].id = 0;
+d-bus[1].id = 1;
 ide_init2(d-bus[0], NULL, NULL, isa_reserve_irq(14));
 ide_init2(d-bus[1], NULL, NULL, isa_reserve_irq(15));
 return 0;
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index b18693d..c868b8e 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -29,9 +29,9 @@ static struct BusInfo ide_bus_info = {
 .size  = sizeof(IDEBus),
 };
 
-void ide_bus_new(IDEBus *idebus, DeviceState *dev)
+void ide_bus_new(IDEBus *idebus, DeviceState *dev, const char *name)
 {
-qbus_create_inplace(idebus-qbus, ide_bus_info, dev, NULL);
+qbus_create_inplace(idebus-qbus, ide_bus_info, dev, name);
 }
 
 static int ide_qdev_init(DeviceState *qdev, DeviceInfo *base)
@@ -39,6 +39,7 @@ static int ide_qdev_init(DeviceState *qdev, DeviceInfo *base)
 IDEDevice *dev = DO_UPCAST(IDEDevice, qdev, qdev);
 IDEDeviceInfo *info = DO_UPCAST(IDEDeviceInfo, qdev, base);
 IDEBus *bus = DO_UPCAST(IDEBus, qbus, qdev-parent_bus);
+int i;
 
 if (!dev-conf.dinfo) {
 fprintf(stderr, %s: no drive specified\n, qdev-info-name);
@@ -65,7 +66,13 @@ static int ide_qdev_init(DeviceState *qdev, DeviceInfo *base)
 default:
 goto err;
 }
-return info-init(dev);
+dev-dinfo-bus = bus-id;
+dev-dinfo-unit = dev-unit;
+i = info-init(dev);
+if (i = 0) {
+machine_ide_finalize(dev-dinfo);
+}
+return i;
 
 err:
 return -1;
diff --git a/hw/pc.c b/hw/pc.c
index 69e597f..29f47d4 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -215,6 +215,44 @@ static void cmos_init_hd(int type_ofs, int info_ofs, 
BlockDriverState *hd)
 rtc_set_memory(s, info_ofs + 8, sectors);
 }
 
+static void set_cmos_hd_data(DriveInfo *dinfo)
+{
+static int hd_type = 0;
+static int hd_trans = 0;
+RTCState *s = rtc_state;
+int cylinders, heads, sectors, translation;
+
+if (dinfo-bus == 0) {
+if (dinfo-unit == 0) {
+hd_type |= 0xf0;
+cmos_init_hd(0x19, 0x1b, dinfo-bdrv);
+} else {
+hd_type |= 0x0f;
+cmos_init_hd(0x1a, 0x24, dinfo-bdrv);
+}
+rtc_set_memory(s, 0x12,

[Qemu-devel] Re: [PATCH] tun: orphan an skb on tx

2010-04-13 Thread Eric Dumazet

Le mardi 13 avril 2010 à 17:36 +0200, Jan Kiszka a écrit :
 Michael S. Tsirkin wrote:
  The following situation was observed in the field:
  tap1 sends packets, tap2 does not consume them, as a result
  tap1 can not be closed.
 
 And before that, tap1 may not be able to send further packets to anyone
 else on the bridge as its TX resources were blocked by tap2 - that's
 what we saw in the field.
 

After the patch, tap1 is able to flood tap2, and tap3/tap4 not able to
send one single frame. Is it OK ?

Back to the problem : tap1 cannot be closed.

Why ? because of refcounts ?

When a socket with inflight tx packets is closed, we dont block the
close, we only delay the socket freeing once all packets were delivered
and freed.

Re: [Qemu-devel] [PATCH] arm: Fix compiler warning (fprintf format string)

2010-04-13 Thread Aurelien Jarno

On Fri, Apr 09, 2010 at 10:49:50PM +0200, Stefan Weil wrote:
 When argument checking is enabled, gcc throws this error:
 
 error: format not a string literal and no format arguments
 
 The patch rewrites the statement to satisfy the compiler.
 
 Signed-off-by: Stefan Weil w...@mail.berlios.de

Thanks, applied.

 ---
  arm-dis.c |8 
  1 files changed, 4 insertions(+), 4 deletions(-)
 
 diff --git a/arm-dis.c b/arm-dis.c
 index 4fb899e..5028555 100644
 --- a/arm-dis.c
 +++ b/arm-dis.c
 @@ -3149,14 +3149,14 @@ print_insn_thumb16 (bfd_vma pc, struct 
 disassemble_info *info, long given)
 if (started)
   func (stream, , );
 started = 1;
 -   func (stream, arm_regnames[14] /* lr */);
 +   func (stream, %s, arm_regnames[14] /* lr */);
   }
  
 if (domaskpc)
   {
 if (started)
   func (stream, , );
 -   func (stream, arm_regnames[15] /* pc */);
 +   func (stream, %s, arm_regnames[15] /* pc */);
   }
  
 func (stream, });
 @@ -3699,7 +3699,7 @@ print_insn_thumb32 (bfd_vma pc, struct disassemble_info 
 *info, long given)
 }
   else
 {
 - func (stream, psr_name (given  0xff));
 + func (stream, %s, psr_name (given  0xff));
 }
   break;
  
 @@ -3707,7 +3707,7 @@ print_insn_thumb32 (bfd_vma pc, struct disassemble_info 
 *info, long given)
   if ((given  0xff) == 0)
 func (stream, %cPSR, (given  0x10) ? 'S' : 'C');
   else
 -   func (stream, psr_name (given  0xff));
 +   func (stream, %s, psr_name (given  0xff));
   break;
  
 case '0': case '1': case '2': case '3': case '4':
 -- 
 1.5.6.5
 
 
 
 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH] m68k: Fix compiler warning (fprintf format string)

2010-04-13 Thread Aurelien Jarno

On Fri, Apr 09, 2010 at 10:49:51PM +0200, Stefan Weil wrote:
 When argument checking is enabled, gcc throws this error:
 
 error: format not a string literal and no format arguments
 
 The patch rewrites the statement to satisfy the compiler.
 It also removes a type cast which is not needed.
 
 Signed-off-by: Stefan Weil w...@mail.berlios.de

Thanks, applied.

 ---
  m68k-dis.c |6 +++---
  1 files changed, 3 insertions(+), 3 deletions(-)
 
 diff --git a/m68k-dis.c b/m68k-dis.c
 index d38d5a2..7cd88f8 100644
 --- a/m68k-dis.c
 +++ b/m68k-dis.c
 @@ -1104,7 +1104,7 @@ print_insn_arg (const char *d,
{
  static const char *const cacheFieldName[] = { nc, dc, ic, bc 
 };
  val = fetch_arg (buffer, place, 2, info);
 -(*info-fprintf_func) (info-stream, cacheFieldName[val]);
 +(*info-fprintf_func) (info-stream, %s, cacheFieldName[val]);
  break;
}
  
 @@ -1199,7 +1199,7 @@ print_insn_arg (const char *d,
   {
 static const char *const scalefactor_name[] = { ,  };
 val = fetch_arg (buffer, place, 1, info);
 -   (*info-fprintf_func) (info-stream, scalefactor_name[val]);
 +   (*info-fprintf_func) (info-stream, %s, scalefactor_name[val]);
   }
else
   {
 @@ -1804,7 +1804,7 @@ match_insn_m68k (bfd_vma memaddr,
  
save_p = p;
info-print_address_func = dummy_print_address;
 -  info-fprintf_func = (fprintf_ftype) dummy_printer;
 +  info-fprintf_func = dummy_printer;
  
/* We scan the operands twice.  The first time we don't print anything,
   but look for errors.  */
 -- 
 1.5.6.5
 
 
 
 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH] sh4: Fix compiler warning (fprintf format string)

2010-04-13 Thread Aurelien Jarno

On Fri, Apr 09, 2010 at 10:49:52PM +0200, Stefan Weil wrote:
 When argument checking is enabled, gcc throws this error:
 
 error: format not a string literal and no format arguments
 
 The patch rewrites the statement to satisfy the compiler.
 
 Signed-off-by: Stefan Weil w...@mail.berlios.de

Thanks, applied.

 ---
  sh4-dis.c |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/sh4-dis.c b/sh4-dis.c
 index 41fd866..078a6b2 100644
 --- a/sh4-dis.c
 +++ b/sh4-dis.c
 @@ -1493,10 +1493,10 @@ print_insn_ppi (int field_b, struct disassemble_info 
 *info)
 print_dsp_reg (field_b  0xf, fprintf_fn, stream);
 break;
   case DSP_REG_X:
 -   fprintf_fn (stream, sx_tab[(field_b  6)  3]);
 +   fprintf_fn (stream, %s, sx_tab[(field_b  6)  3]);
 break;
   case DSP_REG_Y:
 -   fprintf_fn (stream, sy_tab[(field_b  4)  3]);
 +   fprintf_fn (stream, %s, sy_tab[(field_b  4)  3]);
 break;
   case A_MACH:
 fprintf_fn (stream, mach);
 -- 
 1.5.6.5
 
 
 
 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH] sparc: Fix compiler warning (fprintf format string)

2010-04-13 Thread Aurelien Jarno

On Fri, Apr 09, 2010 at 10:49:53PM +0200, Stefan Weil wrote:
 When argument checking is enabled, gcc throws this error:
 
 error: format not a string literal and no format arguments
 
 The patch rewrites the statement to satisfy the compiler.
 
 Signed-off-by: Stefan Weil w...@mail.berlios.de

Thanks, applied.

 ---
  sparc-dis.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/sparc-dis.c b/sparc-dis.c
 index 83a12ae..611e74f 100644
 --- a/sparc-dis.c
 +++ b/sparc-dis.c
 @@ -2778,7 +2778,7 @@ print_insn_sparc (bfd_vma memaddr, disassemble_info 
 *info)
/* Can't do simple format if source and dest are different.  */
continue;
  
 -  (*info-fprintf_func) (stream, opcode-name);
 +  (*info-fprintf_func) (stream, %s, opcode-name);
  
{
  const char *s;
 -- 
 1.5.6.5
 
 
 
 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH] linux-user: do_shmdt(): Fix page_set_flags's 2nd arg.

2010-04-13 Thread Aurelien Jarno

On Sun, Apr 11, 2010 at 02:09:57AM +0900, takas...@ops.dti.ne.jp wrote:
 2nd arg of page_set_flags() should be start+size, but size.
 
 Signed-off-by: Takashi YOSHII takas...@ops.dti.ne.jp

Thanks, applied.

 ---
  linux-user/syscall.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/linux-user/syscall.c b/linux-user/syscall.c
 index a03e432..26c0fb4 100644
 --- a/linux-user/syscall.c
 +++ b/linux-user/syscall.c
 @@ -2752,7 +2752,7 @@ static inline abi_long do_shmdt(abi_ulong shmaddr)
  for (i = 0; i  N_SHM_REGIONS; ++i) {
  if (shm_regions[i].start == shmaddr) {
  shm_regions[i].start = 0;
 -page_set_flags(shmaddr, shm_regions[i].size, 0);
 +page_set_flags(shmaddr, shmaddr + shm_regions[i].size, 0);
  break;
  }
  }
 -- 
 1.6.5
 
 
 
 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

[Qemu-devel] Qemu sharing files help

2010-04-13 Thread Arpit Patel

Hi,

I have Ubuntu on my host system and I am using following command on it, to
load kernel..

*qemu -kernel kernelimage -initrd initrd.img /dev/zero -append cmdline *

Was wondering, how can I share/copy files from host Ubuntu system to the one
with above command?

Someone mentioned using sshfs, but was hard for me to figure out, how ?



Thanks,
Arpit

Re: [Qemu-devel] Problem with DOS application and 286 DOS Extender application

2010-04-13 Thread Jamie Lokier

Gerhard Wiesinger wrote:
 It is a non public, proprietary application which uses the Ergo Computing 
 286 DOS Extender. I guess some other application which use the same DOS 
 extender have the same problem. So best thing is to find another 
 application which uses the Ergo Computing 286 DOS Extender, too.

The 286 was obsolete 20 years ago, although code depending on it
persisted for some years after.

I'm fairly sure the number of people using (or trying to use) Qemu
with 286-specific code is very small indeed, so unfortunately for a
286 problem, you will need to help reproduce it as much as you can for
it to be fixed.

Note that Qemu doesn't emulate segments properly even for 32-bit x86
code, and 16-bit (286) code depends on that all the more.  That may be
the problem.

Or it may be the reset using keyboard controller and BIOS method
used to switch from protected mode to real mode on a 286 is not
implemented properly, or is not supported by the BIOS properly.

Or it may simply be a bug in 16-bit task segment switching or
something like that, which is quite complex and so rarely used that it
might never have been properly tested.

Did you try running the application under Bochs, which has a more
accurate emulation of very old x86 CPUs?

-- Jamie

[Qemu-devel] Re: ehci update

2010-04-13 Thread Jan Kiszka

David S. Ahern wrote:
 After a month of code refactoring and clean ups, etc, I thought I would
 send along an update. The attached patch is relative to your ehci
 branch; I also attached the full usb-ehci.c file for easier reading.

Thanks for your work! I applied it and once again merged git head at
this chance.

Just one request for the future: Please keep a queue of incremental
changes. This EHCI beast is sufficiently tricky, and at some point
someone (you included) may want to go back in history to find
out where some change came from, and why it was applied.

E.g.: We apparently regressed further /wrt my favorite test case (as
it's self-contained): -usbdevice net. qemu is now entering an infinite
loop when you start dhcpcd in the guest on that interface.

 
 At this point I can get a Windows XP guest to format a 4GB key and read
 from and write to it. I can get an FC-12 guest to format a 4GB key and
 an 8GB key as well as read from and write to both. Write rates are on
 the order of 8 MB/sec for dd:
 
 # dd if=/dev/zero of=test bs=1M count=100 oflag=dsync
 100+0 records in
 100+0 records out
 104857600 bytes (105 MB) copied, 12.1205 s, 8.7 MB/s
 
 rsync of text files (e.g., /var/log) is on the order of 2MB/sec.
 
 4GB keys are definitely more stable; the 8GB is not recognized by
 Windows XP.
 
 It still needs a lot of love, but definitely an improvement from the
 last version. The biggest difference for the performance boost and
 stability is discovering that the usbfs in linux limits transactions to
 16k versus the EHCI spec which allows 20k per qTD. I added a hack to
 submit which detects 20k requests from a guest and breaks it up into 2
 requests through the host (a 16k and then a 4k).

Did someone already bring this up on LKML or wherever usbfs is
discussed? Should be fixable, I naively guess.

Jan



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] Re: ehci update

2010-04-13 Thread David S. Ahern



On 04/13/2010 05:35 PM, Jan Kiszka wrote:
 David S. Ahern wrote:
 After a month of code refactoring and clean ups, etc, I thought I would
 send along an update. The attached patch is relative to your ehci
 branch; I also attached the full usb-ehci.c file for easier reading.
 
 Thanks for your work! I applied it and once again merged git head at
 this chance.
 
 Just one request for the future: Please keep a queue of incremental
 changes. This EHCI beast is sufficiently tricky, and at some point
 someone (you included) may want to go back in history to find
 out where some change came from, and why it was applied.

Agreed. I will submit focused changes from now on.

 
 E.g.: We apparently regressed further /wrt my favorite test case (as
 it's self-contained): -usbdevice net. qemu is now entering an infinite
 loop when you start dhcpcd in the guest on that interface.

I've been focused on a single path -- async, bulk transfers to a USB
key. I take a look at the ethernet device as well.

I've struggled with the infinite loop part: the async train jumps the
track with FC-12 guest rather quickly (ie., the link list is no longer a
loop). I put in a loop detector in the advance_state function which
helps for storage devices - but clearly something is not right. I've
been roaming the ehci_hcd code in the kernel as well looking for clues.
A lot of details to gel and the day job keeps getting in the way. :-)

 

 At this point I can get a Windows XP guest to format a 4GB key and read
 from and write to it. I can get an FC-12 guest to format a 4GB key and
 an 8GB key as well as read from and write to both. Write rates are on
 the order of 8 MB/sec for dd:

 # dd if=/dev/zero of=test bs=1M count=100 oflag=dsync
 100+0 records in
 100+0 records out
 104857600 bytes (105 MB) copied, 12.1205 s, 8.7 MB/s

 rsync of text files (e.g., /var/log) is on the order of 2MB/sec.

 4GB keys are definitely more stable; the 8GB is not recognized by
 Windows XP.

 It still needs a lot of love, but definitely an improvement from the
 last version. The biggest difference for the performance boost and
 stability is discovering that the usbfs in linux limits transactions to
 16k versus the EHCI spec which allows 20k per qTD. I added a hack to
 submit which detects 20k requests from a guest and breaks it up into 2
 requests through the host (a 16k and then a 4k).
 
 Did someone already bring this up on LKML or wherever usbfs is
 discussed? Should be fixable, I naively guess.

I submitted the patch to linux-usb and it was nack'ed. The response was
that memory is allocated in powers of 2 so trying to up the limit from
16k to 20k means it will actually want to find 32k of contiguous memory.
The suggestion was to handle it with multiple requests within qemu. I
guess libusb does that.

Right now there is a hack in the ehci model to do this, but the
usb-linux code would be a better place since it might be specific to
linux hosts.

David

 
 Jan

[Qemu-devel] Re: [PATCH] tun: orphan an skb on tx

2010-04-13 Thread Herbert Xu

On Tue, Apr 13, 2010 at 08:31:03PM +0200, Eric Dumazet wrote:

 Herbert Acked your patch, so I guess its OK, but I think it can be
 dangerous.

The tun socket accounting was never designed to stop it from
flooding another tun interface.  It's there to stop it from
transmitting above a destination interface TX bandwidth and
cause unnecessary packet drops.  It also limits the total amount
of kernel memory that can be pinned down by a single tun interface.

In this case, all we're doing is shifting the accounting from the
hardware queue to the qdisc queue.

So your ability to flood a tun interface is essentially unchanged.

BTW we do the same thing in a number of hardware drivers, as well
as virtio-net.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [SeaBIOS] [Qemu-devel] QEMU regression problems

2010-04-13 Thread Kevin O'Connor

On Tue, Apr 13, 2010 at 02:26:10PM +0800, Roy Tam wrote:
 2010/4/12 Gerhard Wiesinger li...@wiesinger.com:
  3.) There is also a problem with the reported base memory under QEMM386
  (HIMEM.SYS and EMM386.EXE is correct here). It is 646kB instead of 640kB.
  Therefore base memory test fails. I guess that reporting memory CMOS
  tables/interrupt functions are not implemented correctly.
 
 - The Base Memory  640K error seems to be SeaBIOS related. QEMU Bochs
 BIOS(tested with both -old-bios hack in 0.12 series and old 0.11.1)
 will just freeze after QEMU counted RAM.(Tested with ScriptPC and
 Bochs).

The SeaBIOS log would really help.  This can be done by adding:

-chardev stdio,id=seabios -device isa-debugcon,iobase=0x402,chardev=seabios

to the qemu command line.

The memory can be obtained in several places (int 12, int 1588, int
15e801, int 15e820, and mem 40:13).  All look fine to me from looking
at the code.

-Kevin

Re: [Qemu-devel] Re: ehci update

2010-04-13 Thread Alexander Graf


On 14.04.2010, at 01:50, David S. Ahern wrote:

 
 
 On 04/13/2010 05:35 PM, Jan Kiszka wrote:
 David S. Ahern wrote:
 After a month of code refactoring and clean ups, etc, I thought I would
 send along an update. The attached patch is relative to your ehci
 branch; I also attached the full usb-ehci.c file for easier reading.
 
 Thanks for your work! I applied it and once again merged git head at
 this chance.
 
 Just one request for the future: Please keep a queue of incremental
 changes. This EHCI beast is sufficiently tricky, and at some point
 someone (you included) may want to go back in history to find
 out where some change came from, and why it was applied.
 
 Agreed. I will submit focused changes from now on.
 
 
 E.g.: We apparently regressed further /wrt my favorite test case (as
 it's self-contained): -usbdevice net. qemu is now entering an infinite
 loop when you start dhcpcd in the guest on that interface.
 
 I've been focused on a single path -- async, bulk transfers to a USB
 key. I take a look at the ethernet device as well.
 
 I've struggled with the infinite loop part: the async train jumps the
 track with FC-12 guest rather quickly (ie., the link list is no longer a
 loop). I put in a loop detector in the advance_state function which
 helps for storage devices - but clearly something is not right. I've
 been roaming the ehci_hcd code in the kernel as well looking for clues.
 A lot of details to gel and the day job keeps getting in the way. :-)
 
 
 
 At this point I can get a Windows XP guest to format a 4GB key and read
 from and write to it. I can get an FC-12 guest to format a 4GB key and
 an 8GB key as well as read from and write to both. Write rates are on
 the order of 8 MB/sec for dd:
 
 # dd if=/dev/zero of=test bs=1M count=100 oflag=dsync
 100+0 records in
 100+0 records out
 104857600 bytes (105 MB) copied, 12.1205 s, 8.7 MB/s
 
 rsync of text files (e.g., /var/log) is on the order of 2MB/sec.
 
 4GB keys are definitely more stable; the 8GB is not recognized by
 Windows XP.
 
 It still needs a lot of love, but definitely an improvement from the
 last version. The biggest difference for the performance boost and
 stability is discovering that the usbfs in linux limits transactions to
 16k versus the EHCI spec which allows 20k per qTD. I added a hack to
 submit which detects 20k requests from a guest and breaks it up into 2
 requests through the host (a 16k and then a 4k).
 
 Did someone already bring this up on LKML or wherever usbfs is
 discussed? Should be fixable, I naively guess.
 
 I submitted the patch to linux-usb and it was nack'ed. The response was
 that memory is allocated in powers of 2 so trying to up the limit from
 16k to 20k means it will actually want to find 32k of contiguous memory.
 The suggestion was to handle it with multiple requests within qemu. I
 guess libusb does that.

Any reason we're not using libusb?


Alex

[Qemu-devel] Re: SeaBIOS error with Juniper FreeBSD kernel

2010-04-13 Thread Kevin O'Connor

On Tue, Apr 13, 2010 at 08:48:35PM +0200, Bjørn Mork wrote:
 Gleb Natapov g...@redhat.com writes:
  On Sun, Feb 28, 2010 at 02:39:04PM -0500, Kevin O'Connor wrote:
  On Sun, Feb 21, 2010 at 04:18:38PM -0700, Brandon Bennett wrote:
   I have narrowed it down to SMBIOS.  If I disable CONFIG_SMBIOS the
   image boots up fine.
  If there is any seabios revision that works then it is possible to
  bisect to find problematic commit.
 I'm also bitten by this and have now attempted to bisect it.  The result
 was: 
 
 c95d2cee36db79c88253d43a90230ceadf0c26cf is first bad commit
 commit c95d2cee36db79c88253d43a90230ceadf0c26cf

It looks like memory layout changes in the f-segment is tickling the
underlying bug.  I don't think SMBIOS, the above commit, or the other
commit identified earlier are the root cause of the problem.  Instead,
I'd guess these commits just change the memory layout enough to avoid
the root cause.

This looks like it is going to require some careful study with a
debugger and some execution traces.  Unfortunately, since this image
isn't available for download it makes it difficult to track down.

-Kevin

[Qemu-devel] Re: [PULL] pci,vhost-net,eepro100

2010-04-13 Thread Anthony Liguori


On 04/11/2010 03:29 PM, Michael S. Tsirkin wrote:

The following changes since commit f7e2aca83419dde3c94fa1d5e615581bb4ded9c0:

   tcg/ppc: Fix typo (2010-04-06 03:10:03 +0400)

are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu.git for_anthony
   


Pulled.  Thanks.

Regards,

Anthony Liguori


David L Stevens (1):
   vhost: fix features ack

Marcelo Tosatti (1):
   vhost.c: includelinux/vhost.h  last

Michael S. Tsirkin (3):
   pci: add API to add capability at a known offset
   eepro100: convert to new capability API
   vhost-net: disable mergeable buffers

Stefan Weil (8):
   eepro100: Don't allow writing SCBStatus
   eepro100: Simplify status handling
   eepro100: Simplified device instantiation
   eepro100: Add new device variant i82801
   eepro100: Set configuration bit for standard TCB
   eepro100: Set power management capability using pci_reserve_capability
   eepro100: fix mapping of flash memory
   eepro100: fix PCI interrupt pin configuration regression

  hw/eepro100.c   |  534 --
  hw/pci.c|   17 ++-
  hw/pci.h|2 +
  hw/vhost.c  |2 +-
  hw/vhost_net.c  |1 +
  hw/virtio-net.c |8 +
  6 files changed, 223 insertions(+), 341 deletions(-)

Re: [Qemu-devel] Re: ehci update

2010-04-13 Thread David S. Ahern



On 04/13/2010 07:20 PM, Alexander Graf wrote:
 It still needs a lot of love, but definitely an improvement from the
 last version. The biggest difference for the performance boost and
 stability is discovering that the usbfs in linux limits transactions to
 16k versus the EHCI spec which allows 20k per qTD. I added a hack to
 submit which detects 20k requests from a guest and breaks it up into 2
 requests through the host (a 16k and then a 4k).

 Did someone already bring this up on LKML or wherever usbfs is
 discussed? Should be fixable, I naively guess.

 I submitted the patch to linux-usb and it was nack'ed. The response was
 that memory is allocated in powers of 2 so trying to up the limit from
 16k to 20k means it will actually want to find 32k of contiguous memory.
 The suggestion was to handle it with multiple requests within qemu. I
 guess libusb does that.
 
 Any reason we're not using libusb?

Good question. I was wondering the same. I was going to look at
converting usb-linux to use libusb1 when I get some time.

David

 
 
 Alex

[Qemu-devel] [PATCH v5 00/17] virtio-serial fixes, new abi for port discovery, flow control

2010-04-13 Thread Amit Shah

Hello,

These patches rework the way ports are announced to the guests. A
control message is used to let the guest know a new port is
added. Initial port discovery and port hot-plug work via this way now.

This was done to have the host and guest port numbering in sync to
avoid surprises after several hotplug/unplug operations and
migrations.

The ability to assign a particular port number to ports is also added
so that management software can control the placement of ports.

This iteration fixes a few things Juan pointed out.

Juan and Gerd have already acked v4 of the patch series, from which it
remains largely unchanged.

Differences from v4:
- Juan: qemu_free() handles NULL fine

- Juan: Send ports_map as an array of u32s instead of as a buffer
  (big/little endian issues)

- Juan: iov should be compiled only once (nothing target-specific)

- A few fixes from the other submission: discard piled up buffers in
  the vq after port close or hot-unplug.

Overall:
- Users can set the port id they wish to instantiate ports at by using
  the ,nr= parameter to 'virtserialport' and 'virtconsole' devices

- Migration fixes: refuse migration when:
  - number of active ports is different between the src and destination
  - max_nr_ports a device can support on the src is more

- If a qemu chardev connection to a port is closed on the dest while
  it was open on the src, inform the guest about this. (Also do the
  same for port closed on src but open on dest.)

- Use control messages for relaying new port information instead of
  config space (changes abi)

- Propagate error message from guest in instantiating a port or a
  device to the user.

- Handle scatter/gather for control output and data output from the
  guest

- Fix abuse of virtio api in the virtqueue_push() function

- Add an API for the ports for flow control: ports can signal when
  they're ready to accept data / stop sending data.

Amit Shah (17):
  virtio-serial: save/load: Ensure target has enough ports
  virtio-serial: save/load: Ensure nr_ports on src and dest are same.
  virtio-serial: save/load: Ensure we have hot-plugged ports
instantiated
  virtio-serial: save/load: Send target host connection status if
different
  virtio-serial: Use control messages to notify guest of new ports
  virtio-serial: whitespace: match surrounding code
  virtio-serial: Remove redundant check for 0-sized write request
  virtio-serial: Update copyright year to 2010
  virtio-serial: Propagate errors in initialising ports / devices in
guest
  virtio-serial: Send out guest data to ports only if port is opened
  iov: Introduce a new file for helpers around iovs, add iov_from_buf()
  iov: Add iov_to_buf and iov_size helpers
  virtio-serial: Handle scatter-gather buffers for control messages
  virtio-serial: Handle scatter/gather input from the guest
  virtio-serial: Apps should consume all data that guest sends out /
Fix virtio api abuse
  virtio-serial: Discard data that guest sends us when ports aren't
connected
  virtio-serial: Implement flow control for individual ports

 Makefile   |2 +
 Makefile.objs  |1 +
 hw/iov.c   |   70 ++
 hw/iov.h   |   19 +++
 hw/virtio-balloon.c|   35 +-
 hw/virtio-console.c|   11 +-
 hw/virtio-net.c|   20 +---
 hw/virtio-serial-bus.c |  333 
 hw/virtio-serial.h |   34 --
 9 files changed, 377 insertions(+), 148 deletions(-)
 create mode 100644 hw/iov.c
 create mode 100644 hw/iov.h

[Qemu-devel] [PATCH v5 01/17] virtio-serial: save/load: Ensure target has enough ports

2010-04-13 Thread Amit Shah

The target could be started with max_nr_ports for a virtio-serial device
lesser than what was available on the source machine. Fail the migration
in such a case.

Signed-off-by: Amit Shah amit.s...@redhat.com
Reported-by: Juan Quintela quint...@redhat.com
---
 hw/virtio-serial-bus.c |   13 +++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 17c1ec1..9a7f0c1 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -374,10 +374,13 @@ static void virtio_serial_save(QEMUFile *f, void *opaque)
 
 /* Items in struct VirtIOSerial */
 
+qemu_put_be32s(f, s-bus-max_nr_ports);
+
 /* Do this because we might have hot-unplugged some ports */
 nr_active_ports = 0;
-QTAILQ_FOREACH(port, s-ports, next)
+QTAILQ_FOREACH(port, s-ports, next) {
 nr_active_ports++;
+}
 
 qemu_put_be32s(f, nr_active_ports);
 
@@ -399,7 +402,7 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, 
int version_id)
 {
 VirtIOSerial *s = opaque;
 VirtIOSerialPort *port;
-uint32_t nr_active_ports;
+uint32_t max_nr_ports, nr_active_ports;
 unsigned int i;
 
 if (version_id  2) {
@@ -420,6 +423,12 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, 
int version_id)
 
 /* Items in struct VirtIOSerial */
 
+qemu_get_be32s(f, max_nr_ports);
+if (max_nr_ports  s-bus-max_nr_ports) {
+/* Source could have more ports than us. Fail migration. */
+return -EINVAL;
+}
+
 qemu_get_be32s(f, nr_active_ports);
 
 /* Items in struct VirtIOSerialPort */
-- 
1.6.2.5

[Qemu-devel] [PATCH v5 02/17] virtio-serial: save/load: Ensure nr_ports on src and dest are same.

2010-04-13 Thread Amit Shah

The number of ports on the source as well as the destination machines
should match. If they don't, it means some ports that got hotplugged on
the source aren't instantiated on the destination. Or that ports that
were hot-unplugged on the source are created on the destination.

Signed-off-by: Amit Shah amit.s...@redhat.com
Reported-by: Juan Quintela quint...@redhat.com
---
 hw/virtio-serial-bus.c |   18 --
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 9a7f0c1..d31e62d 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -402,7 +402,7 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, 
int version_id)
 {
 VirtIOSerial *s = opaque;
 VirtIOSerialPort *port;
-uint32_t max_nr_ports, nr_active_ports;
+uint32_t max_nr_ports, nr_active_ports, nr_ports;
 unsigned int i;
 
 if (version_id  2) {
@@ -419,7 +419,21 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, 
int version_id)
 /* The config space */
 qemu_get_be16s(f, s-config.cols);
 qemu_get_be16s(f, s-config.rows);
-s-config.nr_ports = qemu_get_be32(f);
+nr_ports = qemu_get_be32(f);
+
+if (nr_ports != s-config.nr_ports) {
+/*
+ * Source hot-plugged/unplugged ports and we don't have all of
+ * them here.
+ *
+ * Note: This condition cannot check for all hotplug/unplug
+ * events: eg, if one port was hot-plugged and one was
+ * unplugged, the nr_ports remains the same but the port id's
+ * would have changed and we won't catch it here. A later
+ * check for !find_port_by_id() will confirm if this happened.
+ */
+return -EINVAL;
+}
 
 /* Items in struct VirtIOSerial */
 
-- 
1.6.2.5

[Qemu-devel] [PATCH v5 06/17] virtio-serial: whitespace: match surrounding code

2010-04-13 Thread Amit Shah

The virtio-serial code doesn't mix declarations and definitions, so
separate them out on different lines.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-serial-bus.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index bb11a9b..8efba0b 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -354,7 +354,10 @@ static void handle_input(VirtIODevice *vdev, VirtQueue *vq)
 
 static uint32_t get_features(VirtIODevice *vdev, uint32_t features)
 {
-VirtIOSerial *vser = DO_UPCAST(VirtIOSerial, vdev, vdev);
+VirtIOSerial *vser;
+
+vser = DO_UPCAST(VirtIOSerial, vdev, vdev);
+
 if (vser-bus-max_nr_ports  1) {
 features |= (1  VIRTIO_CONSOLE_F_MULTIPORT);
 }
-- 
1.6.2.5

[Qemu-devel] [PATCH v5 03/17] virtio-serial: save/load: Ensure we have hot-plugged ports instantiated

2010-04-13 Thread Amit Shah

If some ports that were hot-plugged on the source are not available on
the destination, fail migration instead of trying to deref a NULL
pointer.

Signed-off-by: Amit Shah amit.s...@redhat.com
Reported-by: Juan Quintela quint...@redhat.com
---
 hw/virtio-serial-bus.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index d31e62d..5316ef6 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -451,6 +451,13 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, 
int version_id)
 
 id = qemu_get_be32(f);
 port = find_port_by_id(s, id);
+if (!port) {
+/*
+ * The requested port was hot-plugged on the source but we
+ * don't have it
+ */
+return -EINVAL;
+}
 
 port-guest_connected = qemu_get_byte(f);
 }
-- 
1.6.2.5

[Qemu-devel] [PATCH v5 07/17] virtio-serial: Remove redundant check for 0-sized write request

2010-04-13 Thread Amit Shah

The check for a 0-sized write request to a guest port is not necessary;
the while loop below won't be executed in this case and all will be
fine.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-serial-bus.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 8efba0b..c1d9851 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -91,9 +91,6 @@ static size_t write_to_port(VirtIOSerialPort *port,
 if (!virtio_queue_ready(vq)) {
 return 0;
 }
-if (!size) {
-return 0;
-}
 
 while (offset  size) {
 int i;
-- 
1.6.2.5

[Qemu-devel] [PATCH v5 05/17] virtio-serial: Use control messages to notify guest of new ports

2010-04-13 Thread Amit Shah

Allow the port 'id's to be set by a user on the command line. This is
needed by management apps that will want a stable port numbering scheme
for hot-plug/unplug and migration.

Since the port numbers are shared with the guest (to identify ports in
control messages), we just send a control message to the guest
indicating addition of new ports (hot-plug) or notifying the guest of
the available ports when the guest sends us a DEVICE_READY control
message.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-console.c|2 +
 hw/virtio-serial-bus.c |  184 +++
 hw/virtio-serial.h |   17 +++--
 3 files changed, 133 insertions(+), 70 deletions(-)

diff --git a/hw/virtio-console.c b/hw/virtio-console.c
index bd44ec6..17b221d 100644
--- a/hw/virtio-console.c
+++ b/hw/virtio-console.c
@@ -99,6 +99,7 @@ static VirtIOSerialPortInfo virtconsole_info = {
 .exit  = virtconsole_exitfn,
 .qdev.props = (Property[]) {
 DEFINE_PROP_UINT8(is_console, VirtConsole, port.is_console, 1),
+DEFINE_PROP_UINT32(nr, VirtConsole, port.id, VIRTIO_CONSOLE_BAD_ID),
 DEFINE_PROP_CHR(chardev, VirtConsole, chr),
 DEFINE_PROP_STRING(name, VirtConsole, port.name),
 DEFINE_PROP_END_OF_LIST(),
@@ -133,6 +134,7 @@ static VirtIOSerialPortInfo virtserialport_info = {
 .init  = virtserialport_initfn,
 .exit  = virtconsole_exitfn,
 .qdev.props = (Property[]) {
+DEFINE_PROP_UINT32(nr, VirtConsole, port.id, VIRTIO_CONSOLE_BAD_ID),
 DEFINE_PROP_CHR(chardev, VirtConsole, chr),
 DEFINE_PROP_STRING(name, VirtConsole, port.name),
 DEFINE_PROP_END_OF_LIST(),
diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 484dc94..bb11a9b 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -41,6 +41,10 @@ struct VirtIOSerial {
 VirtIOSerialBus *bus;
 
 QTAILQ_HEAD(, VirtIOSerialPort) ports;
+
+/* bitmap for identifying active ports */
+uint32_t *ports_map;
+
 struct virtio_console_config config;
 };
 
@@ -48,6 +52,10 @@ static VirtIOSerialPort *find_port_by_id(VirtIOSerial *vser, 
uint32_t id)
 {
 VirtIOSerialPort *port;
 
+if (id == VIRTIO_CONSOLE_BAD_ID) {
+return NULL;
+}
+
 QTAILQ_FOREACH(port, vser-ports, next) {
 if (port-id == id)
 return port;
@@ -208,14 +216,25 @@ static void handle_control_message(VirtIOSerial *vser, 
void *buf)
 size_t buffer_len;
 
 gcpkt = buf;
-port = find_port_by_id(vser, ldl_p(gcpkt-id));
-if (!port)
-return;
 
 cpkt.event = lduw_p(gcpkt-event);
 cpkt.value = lduw_p(gcpkt-value);
 
+port = find_port_by_id(vser, ldl_p(gcpkt-id));
+if (!port  cpkt.event != VIRTIO_CONSOLE_DEVICE_READY)
+return;
+
 switch(cpkt.event) {
+case VIRTIO_CONSOLE_DEVICE_READY:
+/*
+ * The device is up, we can now tell the device about all the
+ * ports we have here.
+ */
+QTAILQ_FOREACH(port, vser-ports, next) {
+send_control_event(port, VIRTIO_CONSOLE_PORT_ADD, 1);
+}
+break;
+
 case VIRTIO_CONSOLE_PORT_READY:
 /*
  * Now that we know the guest asked for the port name, we're
@@ -363,6 +382,7 @@ static void virtio_serial_save(QEMUFile *f, void *opaque)
 VirtIOSerial *s = opaque;
 VirtIOSerialPort *port;
 uint32_t nr_active_ports;
+unsigned int i;
 
 /* The virtio device */
 virtio_save(s-vdev, f);
@@ -370,13 +390,17 @@ static void virtio_serial_save(QEMUFile *f, void *opaque)
 /* The config space */
 qemu_put_be16s(f, s-config.cols);
 qemu_put_be16s(f, s-config.rows);
-qemu_put_be32s(f, s-config.nr_ports);
 
-/* Items in struct VirtIOSerial */
+qemu_put_be32s(f, s-config.max_nr_ports);
+
+/* The ports map */
+
+for (i = 0; i  (s-config.max_nr_ports + 31) / 32; i++) {
+qemu_put_be32s(f, s-ports_map[i]);
+}
 
-qemu_put_be32s(f, s-bus-max_nr_ports);
+/* Ports */
 
-/* Do this because we might have hot-unplugged some ports */
 nr_active_ports = 0;
 QTAILQ_FOREACH(port, s-ports, next) {
 nr_active_ports++;
@@ -388,11 +412,6 @@ static void virtio_serial_save(QEMUFile *f, void *opaque)
  * Items in struct VirtIOSerialPort.
  */
 QTAILQ_FOREACH(port, s-ports, next) {
-/*
- * We put the port number because we may not have an active
- * port at id 0 that's reserved for a console port, or in case
- * of ports that might have gotten unplugged
- */
 qemu_put_be32s(f, port-id);
 qemu_put_byte(f, port-guest_connected);
 qemu_put_byte(f, port-host_connected);
@@ -403,7 +422,8 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, 
int version_id)
 {
 VirtIOSerial *s = opaque;
 VirtIOSerialPort *port;
-uint32_t max_nr_ports, nr_active_ports, nr_ports;
+size_t ports_map_size;
+

2010-04-13 Thread Amit Shah

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-console.c|2 +-
 hw/virtio-serial-bus.c |2 +-
 hw/virtio-serial.h |2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/virtio-console.c b/hw/virtio-console.c
index 17b221d..6b8 100644
--- a/hw/virtio-console.c
+++ b/hw/virtio-console.c
@@ -1,7 +1,7 @@
 /*
  * Virtio Console and Generic Serial Port Devices
  *
- * Copyright Red Hat, Inc. 2009
+ * Copyright Red Hat, Inc. 2009, 2010
  *
  * Authors:
  *  Amit Shah amit.s...@redhat.com
diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index c1d9851..c77ea4f 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -1,7 +1,7 @@
 /*
  * A bus for connecting virtio serial and console ports
  *
- * Copyright (C) 2009 Red Hat, Inc.
+ * Copyright (C) 2009, 2010 Red Hat, Inc.
  *
  * Author(s):
  *  Amit Shah amit.s...@redhat.com
diff --git a/hw/virtio-serial.h b/hw/virtio-serial.h
index 0548689..f023873 100644
--- a/hw/virtio-serial.h
+++ b/hw/virtio-serial.h
@@ -2,7 +2,7 @@
  * Virtio Serial / Console Support
  *
  * Copyright IBM, Corp. 2008
- * Copyright Red Hat, Inc. 2009
+ * Copyright Red Hat, Inc. 2009, 2010
  *
  * Authors:
  *  Christian Ehrhardt ehrha...@linux.vnet.ibm.com
-- 
1.6.2.5

[Qemu-devel] [PATCH v5 09/17] virtio-serial: Propagate errors in initialising ports / devices in guest

2010-04-13 Thread Amit Shah

If adding of ports or devices in the guest fails we can send out a QMP
event so that management software can deal with it.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-serial-bus.c |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index c77ea4f..3a09f0d 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -223,6 +223,11 @@ static void handle_control_message(VirtIOSerial *vser, 
void *buf)
 
 switch(cpkt.event) {
 case VIRTIO_CONSOLE_DEVICE_READY:
+if (!cpkt.value) {
+error_report(virtio-serial-bus: Guest failure in adding device 
%s\n,
+ vser-bus-qbus.name);
+break;
+}
 /*
  * The device is up, we can now tell the device about all the
  * ports we have here.
@@ -233,6 +238,11 @@ static void handle_control_message(VirtIOSerial *vser, 
void *buf)
 break;
 
 case VIRTIO_CONSOLE_PORT_READY:
+if (!cpkt.value) {
+error_report(virtio-serial-bus: Guest failure in adding port %u 
for device %s\n,
+ port-id, vser-bus-qbus.name);
+break;
+}
 /*
  * Now that we know the guest asked for the port name, we're
  * sure the guest has initialised whatever state is necessary
-- 
1.6.2.5

69 matches

Mail list logo