[Qemu-devel] [RFC 09/24] qed: add qed_bdrv_get_mapping()

2011-07-28 Thread Devin Nakamura
Fuction to get drive mapping from qed images

Signed-off-by: Devin Nakamura 
---
 block/qed.c |   32 
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 00cf895..dadb7f8 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1451,6 +1451,37 @@ static int bdrv_qed_check(BlockDriverState *bs, 
BdrvCheckResult *result)
 return qed_check(s, result, false);
 }
 
+static int bdrv_qed_get_mapping(BlockDriverState *bs, uint64_t *guest_offset,
+uint64_t *host_offset,
+uint64_t *contiguous_bytes)
+{
+BDRVQEDState *s = bs->opaque;
+size_t l2_size = s->header.cluster_size * s->table_nelems;
+uint64_t pos = *guest_offset + *contiguous_bytes;
+uint64_t offset = pos;
+size_t len = 0;
+QEDRequest req = {.l2_table = NULL};
+int ret;
+
+if (pos >= s->header.image_size) {
+*contiguous_bytes = 0;
+return 0;
+}
+
+do {
+pos += len;
+ret = qed_find_cluster_sync(s, &req, pos, l2_size, &offset, &len);
+} while (ret != QED_CLUSTER_FOUND && pos < s->header.image_size);
+*guest_offset = pos;
+*host_offset = offset;
+if (pos >= s->header.image_size) {
+*contiguous_bytes = 0;
+} else {
+*contiguous_bytes = len;
+}
+return 0;
+}
+
 static QEMUOptionParameter qed_create_options[] = {
 {
 .name = BLOCK_OPT_SIZE,
@@ -1497,6 +1528,7 @@ static BlockDriver bdrv_qed = {
 .bdrv_get_info= bdrv_qed_get_info,
 .bdrv_change_backing_file = bdrv_qed_change_backing_file,
 .bdrv_check   = bdrv_qed_check,
+.bdrv_get_mapping = bdrv_qed_get_mapping,
 };
 
 static void bdrv_qed_init(void)
-- 
1.7.6.rc1




Re: [Qemu-devel] Log in to the OS through qemu

2011-07-28 Thread bala suru
Hi,
can tell me how to set-up th SDL window for the qemu ..?


On Thu, Jul 28, 2011 at 11:17 PM, Mulyadi Santosa  wrote:

> On Thu, Jul 28, 2011 at 23:19, bala suru  wrote:
> > Hi,
> > I have installed OS on VMs (KVM-Qemu) , one version is ttylinux 9.0 and
> > other is ttylinux 12 .
> >
> > I have installed these two with opennebula tool kit .  generally I try
> to
> > login to the OS using SSH , since ttylinux 12 needs to enable the
> ethernet
> > after logged in  to the OS, how to login to the OS through qemu ... i
> know
> > the username and password for that ..
>
> could you make the SDL window come up? that's the easiest way
> AFAIK...it represents your VM monitor
>
> --
> regards,
>
> Mulyadi Santosa
> Freelance Linux trainer and consultant
>
> blog: the-hydra.blogspot.com
> training: mulyaditraining.blogspot.com
>


[Qemu-devel] [RFC 21/24] qcow2: add qcow2_open_conversion_target()

2011-07-28 Thread Devin Nakamura
Still in very raw form.  Not likely to work yet

Signed-off-by: Devin Nakamura 
---
 block/qcow2.c |  124 +
 1 files changed, 124 insertions(+), 0 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 86df65d..f1e1e12 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1444,6 +1444,129 @@ static int 
qcow2_get_conversion_options(BlockDriverState *bs,
 }
 return 0;
 }
+
+
+static int qcow2_open_conversion_target(BlockDriverState *bs,
+BlockConversionOptions *drv_options,
+QEMUOptionParameter *usr_options)
+{
+uint64_t imgsize, sectorsize, clusterbits;
+uint64_t num_refcount_blocks;
+uint16_t *refcount_block;
+uint64_t table_index, table_limit, i;
+
+sectorsize = drv_options->cluster_size;
+clusterbits = 0;
+while (~sectorsize & 1) {
+sectorsize >>=1;
+clusterbits++;
+}
+if (sectorsize != 1) {
+return -EINVAL;
+}
+
+
+imgsize = drv_options->image_size;
+
+BDRVQcowState *s = bs->opaque;
+s->crypt_method_header = QCOW_CRYPT_NONE;
+s->cluster_bits = clusterbits;
+s->cluster_size = 1 << s->cluster_bits;
+s->cluster_sectors = 1 << (s->cluster_bits - 9);
+s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */
+s->l2_size = 1 << s->l2_bits;
+bs->total_sectors = imgsize / 512;
+s->csize_shift = (62 - (s->cluster_bits - 8));
+s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
+s->cluster_offset_mask = (1LL << s->csize_shift) - 1;
+//s->refcount_table_offset = 0;
+//s->refcount_table_size = 0;
+s->snapshots_offset = 0;
+s->nb_snapshots = 0;
+/* alloc L2 table/refcount block cache */
+s->l2_table_cache = qcow2_cache_create(bs, L2_CACHE_SIZE, true); //todo: 
double check writethrough
+s->refcount_block_cache = qcow2_cache_create(bs, REFCOUNT_CACHE_SIZE,
+true);
+
+s->cluster_cache = qemu_malloc(s->cluster_size);
+/* one more sector for decompressed data alignment */
+s->cluster_data = qemu_malloc(QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size
+  + 512);
+s->cluster_cache_offset = -1;
+
+//Make up a refcount table
+const uint64_t num_clusters = bs->file->total_sectors >>
+(clusterbits - BDRV_SECTOR_BITS);
+const uint64_t cluster_size = 1 << s->cluster_bits;
+const uint64_t uint64_per_cluster = cluster_size / sizeof(uint64_t);
+const uint64_t uint16_per_cluster = cluster_size / sizeof(uint16_t);
+const uint64_t num_l2_tables = (num_clusters + uint64_per_cluster - 1) / 
+uint64_per_cluster;
+const uint64_t num_l1_clusters = (num_l2_tables + uint64_per_cluster - 1)
+ / uint64_per_cluster);
+s->l1_size = num_l2_tables;
+s->l1_table = qemu_mallocz(s->l1_size * s->cluster_size);
+num_refcount_blocks = num_clusters + num_l2_tables + num_l1_clusters;
+num_refcount_blocks = (num_refcount_blocks + uint16_per_cluster - 1) /
+  uint16_per_cluster;
+
+/* Account for refcount blocks used to track refcount blocks */
+num_refcount_blocks *= uint16_per_cluster;
+num_refcount_blocks /= uint16_per_cluster -1;
+num_refcount_blocks += 1; //todo: fix rounding
+
+/* Account for refcount blocks used to track refcount table */
+num_refcount_blocks *= uint64_per_cluster;
+num_refcount_blocks /= uint64_per_cluster -1;
+num_refcount_blocks += 1; // todo: fix rounding
+
+s->refcount_table_size = (num_refcount_blocks / uint64_per_cluster) +1;
+s->refcount_table = qemu_mallocz(s->refcount_table_size * s->cluster_size);
+refcount_block = qemu_mallocz(s->cluster_size);
+
+//s->refcount_table_size = num_refcount_blocks;
+s->free_cluster_index = num_clusters;
+s->l1_table_offset = s->free_cluster_index << s->cluster_bits;
+s->free_cluster_index += num_l1_clusters;
+s->refcount_table_offset = s->free_cluster_index << s->cluster_bits;
+s->free_cluster_index++;
+
+/* assign offsets for all refcount blocks */
+for (table_index = 0; table_index < num_refcount_blocks; table_index++) {
+s->refcount_table[table_index] = s->free_cluster_index++ << 
s->cluster_bits;
+}
+
+/* set all refcounts in the block to 1 */
+for (i=0; i < uint16_per_cluster; i++) {
+refcount_block[i] = cpu_to_be16(1);
+}
+
+//TODO: double check this, it seems a bit fishy
+table_limit = s->free_cluster_index / uint16_per_cluster;
+for(table_index = 0; table_index < table_limit; table_index++) {
+bdrv_write_sync(bs->file, s->refcount_table[table_index] >> 
BDRV_SECTOR_BITS,
+(uint8_t*) refcount_block, s->cluster_sectors);
+}
+
+table_limit = s->free_cluster_index % uint16_per_cluster;
+for(i=table_limit; i < uint64_per_cluster; i++) {
+refcount_bl

[Qemu-devel] [RFC 10/24] qed: add qed_bdrv_map()

2011-07-28 Thread Devin Nakamura
Conflicts:

block_int.h

Signed-off-by: Devin Nakamura 
---
 block.c |4 ++--
 block.h |4 ++--
 block/qed.c |   47 +++
 3 files changed, 51 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 28b4418..dca3687 100644
--- a/block.c
+++ b/block.c
@@ -3112,8 +3112,8 @@ int bdrv_get_mapping(BlockDriverState *bs, uint64_t 
*guest_offset,
 contiguous_bytes);
 }
 
-int bdrv_map(BlockDriverState *bs, uint64_t *guest_offset,
- uint64_t *host_offset, uint64_t *contiguous_bytes)
+int bdrv_map(BlockDriverState *bs, uint64_t guest_offset,
+ uint64_t host_offset, uint64_t contiguous_bytes)
 {
 BlockDriver *drv = bs->drv;
 if (!drv) {
diff --git a/block.h b/block.h
index 2426b62..e0580f0 100644
--- a/block.h
+++ b/block.h
@@ -261,8 +261,8 @@ int bdrv_open_conversion_target(BlockDriverState **bs, 
BlockDriverState *file,
 const char *target_fmt);
 int bdrv_get_mapping(BlockDriverState *bs, uint64_t *guest_offset,
  uint64_t *host_offset, uint64_t *contiguous_bytes);
-int bdrv_map(BlockDriverState *bs, uint64_t *guest_offset,
- uint64_t *host_offset, uint64_t *contiguous_bytes);
+int bdrv_map(BlockDriverState *bs, uint64_t guest_offset,
+ uint64_t host_offset, uint64_t contiguous_bytes);
 int bdrv_copy_header(BlockDriverState *bs);
 
 typedef enum {
diff --git a/block/qed.c b/block/qed.c
index dadb7f8..daf82fd 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1482,6 +1482,52 @@ static int bdrv_qed_get_mapping(BlockDriverState *bs, 
uint64_t *guest_offset,
 return 0;
 }
 
+static int bdrv_qed_map(BlockDriverState *bs, uint64_t guest_offset,
+uint64_t host_offset, uint64_t contiguous_bytes)
+{
+BDRVQEDState* s = bs->opaque;
+if(guest_offset != qed_start_of_cluster(s, guest_offset)){
+return -EINVAL;
+}
+
+if(contiguous_bytes % s->header.cluster_size != 0){
+return -EINVAL;
+}
+
+while(contiguous_bytes > 0){
+uint64_t l1_index, l2_index, first_l2;
+int ret;
+QEDRequest req = {.l2_table = NULL};
+
+
+l1_index = qed_l1_index(s, guest_offset);
+l2_index = qed_l2_index(s, guest_offset);
+first_l2 = l2_index;
+if(!s->l1_table->offsets[l1_index]){
+CachedL2Table * table = qed_new_l2_table(s);
+s->l1_table->offsets[l1_index] = table->offset;
+qed_write_l1_table_sync(s, l1_index, 1);
+qed_commit_l2_cache_entry(&s->l2_cache, table);
+}
+
+ret = qed_read_l2_table_sync(s, &req, s->l1_table->offsets[l1_index]);
+if(ret){
+return ret;
+}
+
+for(;l2_index < s->table_nelems && contiguous_bytes > 0; l2_index++){
+req.l2_table->table->offsets[l2_index] = host_offset;
+host_offset += s->header.cluster_size;
+contiguous_bytes -= s->header.cluster_size;
+}
+
+ret = qed_write_l2_table_sync(s, &req, 0, s->table_nelems, true);
+qed_unref_l2_cache_entry(req.l2_table);
+
+}
+return 0;
+}
+
 static QEMUOptionParameter qed_create_options[] = {
 {
 .name = BLOCK_OPT_SIZE,
@@ -1529,6 +1575,7 @@ static BlockDriver bdrv_qed = {
 .bdrv_change_backing_file = bdrv_qed_change_backing_file,
 .bdrv_check   = bdrv_qed_check,
 .bdrv_get_mapping = bdrv_qed_get_mapping,
+.bdrv_map = bdrv_qed_map,
 };
 
 static void bdrv_qed_init(void)
-- 
1.7.6.rc1




[Qemu-devel] [RFC 12/24] qed: add bdrv_qed_copy_header()

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block/qed.c |   16 
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index b05224a..556512b 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1579,6 +1579,21 @@ static int bdrv_qed_map(BlockDriverState *bs, uint64_t 
guest_offset,
 return 0;
 }
 
+static int bdrv_qed_copy_header(BlockDriverState *bs)
+{
+BlockDriverState *backup;
+uint8_t buffer[512];
+/*TODO: change filename*/
+bdrv_create_file("headerbackup.temp", NULL);
+bdrv_file_open(&backup, "headerbackup.temp", BDRV_O_RDWR);
+bdrv_read(bs->file, 0, buffer, 1); /*TODO: check return code*/
+bdrv_write(backup, 0, buffer, 1); /*TODO: check return code*/
+bdrv_close(backup);
+
+qed_write_header_sync(bs->opaque);
+return 0;
+}
+
 static QEMUOptionParameter qed_create_options[] = {
 {
 .name = BLOCK_OPT_SIZE,
@@ -1628,6 +1643,7 @@ static BlockDriver bdrv_qed = {
 .bdrv_open_conversion_target = bdrv_qed_open_conversion_target,
 .bdrv_get_mapping= bdrv_qed_get_mapping,
 .bdrv_map= bdrv_qed_map,
+.bdrv_copy_header= bdrv_qed_copy_header,
 };
 
 static void bdrv_qed_init(void)
-- 
1.7.6.rc1




[Qemu-devel] [RFC 20/24] qcow2: add get_conversion_options()

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block/qcow2.c |   24 +---
 1 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 3bb28d2..86df65d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1427,6 +1427,23 @@ static int qcow2_copy_header(BlockDriverState *bs)
 return qcow2_drop_leaked_clusters(bs);
 }
 
+static int qcow2_get_conversion_options(BlockDriverState *bs,
+BlockConversionOptions *options)
+{
+BDRVQcowState *s = bs->opaque;
+
+options->image_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+options->cluster_size = s->cluster_sectors * BDRV_SECTOR_SIZE;
+options->allocation_size =  s->cluster_sectors * BDRV_SECTOR_SIZE;
+if (s->crypt_method == QCOW_CRYPT_NONE) {
+options->encryption_type = BLOCK_CRYPT_NONE;
+} else if (s->crypt_method == QCOW_CRYPT_AES) {
+options->encryption_type = BLOCK_CRYPT_AES;
+} else {
+return -1; //TODO: FIXME
+}
+return 0;
+}
 static QEMUOptionParameter qcow2_create_options[] = {
 {
 .name = BLOCK_OPT_SIZE,
@@ -1497,9 +1514,10 @@ static BlockDriver bdrv_qcow2 = {
 .create_options = qcow2_create_options,
 .bdrv_check = qcow2_check,
 
-.bdrv_get_mapping = qcow2_get_mapping,
-.bdrv_map = qcow2_map,
-.bdrv_copy_header = qcow2_copy_header,
+.bdrv_get_mapping= qcow2_get_mapping,
+.bdrv_map= qcow2_map,
+.bdrv_copy_header= qcow2_copy_header,
+.bdrv_get_conversion_options = qcow2_get_conversion_options,
 };
 
 static void bdrv_qcow2_init(void)
-- 
1.7.6.rc1




[Qemu-devel] [RFC 14/24] qcow2: fix typo in documentation for qcow2_get_cluster_offset()

2011-07-28 Thread Devin Nakamura
Documentation states the num is measured in clusters, but its
actually measured in sectors

Signed-off-by: Devin Nakamura 
---
 block/qcow2-cluster.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 882f50a..ca56918 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -381,10 +381,10 @@ static int copy_sectors(BlockDriverState *bs, uint64_t 
start_sect,
  * For a given offset of the disk image, find the cluster offset in
  * qcow2 file. The offset is stored in *cluster_offset.
  *
- * on entry, *num is the number of contiguous clusters we'd like to
+ * on entry, *num is the number of contiguous sectors we'd like to
  * access following offset.
  *
- * on exit, *num is the number of contiguous clusters we can read.
+ * on exit, *num is the number of contiguous sectors we can read.
  *
  * Return 0, if the offset is found
  * Return -errno, otherwise.
-- 
1.7.6.rc1




Re: [Qemu-devel] [PATCH] v6 revamp acpitable parsing and allow to specify complete (headerful) table

2011-07-28 Thread Michael Tokarev
29.07.2011 05:49, Isaku Yamahata wrote:
> On Fri, Jul 15, 2011 at 07:51:43PM +0300, Blue Swirl wrote:
>> On Fri, Jul 15, 2011 at 6:18 PM, John Baboval
>>  wrote:
>>> Is there something I can do to help take this patch the rest of the way?
>>>
>>> I'd hate to see it die because of a style issue and a compiler warning.
>>
>> There's also suspicious missing 'break' statement. How about fixing
>> the issues and submitting the patch?
> 
> I fixed the compile error.

Oh, another one? :)

> I think the missing break is intentional, so added an comment there. Michael?

Yes it's intentional, you're right.

> Blue, can you please take a look of it?

I think it's hopeless.

/mjt



[Qemu-devel] [RFC 04/24] block: add bdrv_get_mapping()

2011-07-28 Thread Devin Nakamura
Conflicts:

block.h

Signed-off-by: Devin Nakamura 
---
 block.c |   29 +
 block.h |2 ++
 2 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 9530577..d0019c4 100644
--- a/block.c
+++ b/block.c
@@ -3082,3 +3082,32 @@ int bdrv_get_conversion_options(BlockDriverState *bs,
 }
 return bs->drv->bdrv_get_conversion_options(bs, options);
 }
+
+/**
+ * Gets a mapping from an offset in the image to an offset within a file
+ *
+ * All offsets are in bytes. Functions starts its search at offset host_offset
+ * + count (offset within the image, not the file offset)
+ *
+ * @param guest_offset  used to calculate starting search location on
+ *  function call. On return it is the starting
+ *  offset of the mapping in the image.
+ * @param host_offset[out]  The starting file offset of the mapping.
+ * @param contiguous_bytes[out] The number of bytes for which this mapping is
+ *  contiguous. If 0, there are no more mapppings 
in
+ *  the image
+ */
+
+int bdrv_get_mapping(BlockDriverState *bs, uint64_t *guest_offset,
+ uint64_t *host_offset, uint64_t *contiguous_bytes)
+{
+BlockDriver *drv = bs->drv;
+if (!drv) {
+return -ENOMEDIUM;
+}
+if (!drv->bdrv_get_mapping) {
+return -ENOTSUP;
+}
+return drv->bdrv_get_mapping(bs, guest_offset, host_offset,
+contiguous_bytes);
+}
diff --git a/block.h b/block.h
index 662bae7..3459900 100644
--- a/block.h
+++ b/block.h
@@ -259,6 +259,8 @@ int bdrv_open_conversion_target(BlockDriverState **bs, 
BlockDriverState *file,
 BlockConversionOptions *drv_options,
 QEMUOptionParameter *usr_options,
 const char *target_fmt);
+int bdrv_get_mapping(BlockDriverState *bs, uint64_t *guest_offset,
+ uint64_t *host_offset, uint64_t *contiguous_bytes);
 typedef enum {
 BLKDBG_L1_UPDATE,
 
-- 
1.7.6.rc1




[Qemu-devel] [RFC 15/24] qcow2: split up the creation of new refcount table from the act of checking it

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block/qcow2-refcount.c |   39 +--
 1 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 14b2f67..75f1f88 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1075,17 +1075,10 @@ fail:
 return -EIO;
 }
 
-/*
- * Checks an image for refcount consistency.
- *
- * Returns 0 if no errors are found, the number of errors in case the image is
- * detected as corrupted, and -errno when an internal error occurred.
- */
-int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res)
-{
+static int inc_refcount_table(BlockDriverState *bs, BdrvCheckResult *res, 
uint16_t **table) {
 BDRVQcowState *s = bs->opaque;
 int64_t size;
-int nb_clusters, refcount1, refcount2, i;
+int nb_clusters, i;
 QCowSnapshot *sn;
 uint16_t *refcount_table;
 int ret;
@@ -1151,6 +1144,33 @@ int qcow2_check_refcounts(BlockDriverState *bs, 
BdrvCheckResult *res)
 }
 }
 }
+*table = refcount_table;
+ret = 0;
+
+fail:
+return ret;
+}
+
+/*
+ * Checks an image for refcount consistency.
+ *
+ * Returns 0 if no errors are found, the number of errors in case the image is
+ * detected as corrupted, and -errno when an internal error occurred.
+ */
+int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res)
+{
+BDRVQcowState *s = bs->opaque;
+int64_t size;
+int nb_clusters, refcount1, refcount2, i;
+uint16_t *refcount_table;
+int ret;
+
+size = bdrv_getlength(bs->file);
+nb_clusters = size_to_clusters(s, size);
+ret = inc_refcount_table(bs, res, &refcount_table);
+if (ret) {
+goto fail;
+}
 
 /* compare ref counts */
 for(i = 0; i < nb_clusters; i++) {
@@ -1182,4 +1202,3 @@ fail:
 
 return ret;
 }
-
-- 
1.7.6.rc1




[Qemu-devel] [RFC 13/24] qed: add bdrv_qed_get_conversion_options()

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block/qed.c |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 556512b..26e43e2 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1594,6 +1594,18 @@ static int bdrv_qed_copy_header(BlockDriverState *bs)
 return 0;
 }
 
+static int bdrv_qed_get_conversion_options(BlockDriverState *bs,
+   BlockConversionOptions *options)
+{
+BDRVQEDState* s = bs->opaque;
+
+options->encryption_type = 0;
+options->cluster_size = s->header.cluster_size;
+options->allocation_size = options->cluster_size;
+options->image_size = s->header.image_size;
+return 0;
+}
+
 static QEMUOptionParameter qed_create_options[] = {
 {
 .name = BLOCK_OPT_SIZE,
@@ -1644,6 +1656,7 @@ static BlockDriver bdrv_qed = {
 .bdrv_get_mapping= bdrv_qed_get_mapping,
 .bdrv_map= bdrv_qed_map,
 .bdrv_copy_header= bdrv_qed_copy_header,
+.bdrv_get_conversion_options = bdrv_qed_get_conversion_options,
 };
 
 static void bdrv_qed_init(void)
-- 
1.7.6.rc1




[Qemu-devel] [RFC 05/24] block: add bdrv_map()

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block.c |   14 ++
 block.h |2 ++
 2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index d0019c4..98a9491 100644
--- a/block.c
+++ b/block.c
@@ -3111,3 +3111,17 @@ int bdrv_get_mapping(BlockDriverState *bs, uint64_t 
*guest_offset,
 return drv->bdrv_get_mapping(bs, guest_offset, host_offset,
 contiguous_bytes);
 }
+
+int bdrv_map(BlockDriverState *bs, uint64_t *guest_offset,
+ uint64_t *host_offset, uint64_t *contiguous_bytes)
+{
+BlockDriver *drv = bs->drv;
+if (!drv) {
+return -ENOMEDIUM;
+}
+if (!drv->bdrv_map) {
+return -ENOTSUP;
+}
+return drv->bdrv_map(bs, guest_offset, host_offset,
+contiguous_bytes);
+}
diff --git a/block.h b/block.h
index 3459900..e021394 100644
--- a/block.h
+++ b/block.h
@@ -261,6 +261,8 @@ int bdrv_open_conversion_target(BlockDriverState **bs, 
BlockDriverState *file,
 const char *target_fmt);
 int bdrv_get_mapping(BlockDriverState *bs, uint64_t *guest_offset,
  uint64_t *host_offset, uint64_t *contiguous_bytes);
+int bdrv_map(BlockDriverState *bs, uint64_t *guest_offset,
+ uint64_t *host_offset, uint64_t *contiguous_bytes);
 typedef enum {
 BLKDBG_L1_UPDATE,
 
-- 
1.7.6.rc1




[Qemu-devel] [RFC 06/24] block: add bdrv_copy_header()

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block.c |   12 
 block.h |2 ++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 98a9491..28b4418 100644
--- a/block.c
+++ b/block.c
@@ -3125,3 +3125,15 @@ int bdrv_map(BlockDriverState *bs, uint64_t 
*guest_offset,
 return drv->bdrv_map(bs, guest_offset, host_offset,
 contiguous_bytes);
 }
+
+int bdrv_copy_header(BlockDriverState *bs)
+{
+BlockDriver *drv = bs->drv;
+if (!drv) {
+return -ENOMEDIUM;
+}
+if (!drv->bdrv_copy_header) {
+return -ENOTSUP;
+}
+return drv->bdrv_copy_header(bs);
+}
diff --git a/block.h b/block.h
index e021394..2426b62 100644
--- a/block.h
+++ b/block.h
@@ -263,6 +263,8 @@ int bdrv_get_mapping(BlockDriverState *bs, uint64_t 
*guest_offset,
  uint64_t *host_offset, uint64_t *contiguous_bytes);
 int bdrv_map(BlockDriverState *bs, uint64_t *guest_offset,
  uint64_t *host_offset, uint64_t *contiguous_bytes);
+int bdrv_copy_header(BlockDriverState *bs);
+
 typedef enum {
 BLKDBG_L1_UPDATE,
 
-- 
1.7.6.rc1




[Qemu-devel] [RFC 07/24] qed: make qed_alloc_clusters round up offset to nearest cluster

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block/qed.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 3970379..00cf895 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -263,6 +263,9 @@ static int qed_read_string(BlockDriverState *file, uint64_t 
offset, size_t n,
  */
 static uint64_t qed_alloc_clusters(BDRVQEDState *s, unsigned int n)
 {
+s->file_size = (s->file_size + s->header.cluster_size -1)
+/ s->header.cluster_size;
+s->file_size *= s->header.cluster_size;
 uint64_t offset = s->file_size;
 s->file_size += n * s->header.cluster_size;
 return offset;
-- 
1.7.6.rc1




[Qemu-devel] [RFC 22/24] qemu-io: make map command use new block mapping function

2011-07-28 Thread Devin Nakamura
bdrv_get_mapping will be used when it is defined,
otherwise default to old behaviour.

Signed-off-by: Devin Nakamura 
---
 qemu-io.c |   23 ++-
 1 files changed, 22 insertions(+), 1 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index a553d0c..caf51fe 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -1543,7 +1543,7 @@ static const cmdinfo_t alloc_cmd = {
 .oneline= "checks if a sector is present in the file",
 };
 
-static int map_f(int argc, char **argv)
+static int map_f_old(int argc, char **argv)
 {
 int64_t offset;
 int64_t nb_sectors;
@@ -1570,6 +1570,27 @@ static int map_f(int argc, char **argv)
 return 0;
 }
 
+static int map_f(int argc, char **argv)
+{
+uint64_t host_offset, guest_offset, count;
+int ret;
+
+if (!bs->drv->bdrv_get_mapping) {
+return map_f_old(argc, argv);
+}
+
+guest_offset = count = 0;
+printf("  Guest Offset   -Host Offset   -  Count\n");
+do {
+ret = bdrv_get_mapping(bs, &guest_offset, &host_offset, &count);
+if (count) {
+printf("%016lx - %016lx - %016lx\n", guest_offset, host_offset,
+   count);
+}
+} while (!ret && count > 0);
+return ret;
+}
+
 static const cmdinfo_t map_cmd = {
.name   = "map",
.argmin = 0,
-- 
1.7.6.rc1




Re: [Qemu-devel] [PATCH v2] pci: Common overflow prevention

2011-07-28 Thread Michael S. Tsirkin
On Fri, Jul 29, 2011 at 10:01:43AM +0900, Isaku Yamahata wrote:
> On Thu, Jul 28, 2011 at 11:40:21AM +0300, Michael S. Tsirkin wrote:
> > I don't see a problem with this, but could you please clarify when does
> > this happen? I think this is only possible for a pci device
> > behind an express root. If so, this belongs in pcie_host.c
> > 
> > I'd also like this info to be recorded in the commit log.
> 
> >From 1dd598fd35d4e988dc51487829ed66208ca89021 Mon Sep 17 00:00:00 2001
> Message-Id: 
> <1dd598fd35d4e988dc51487829ed66208ca89021.1311901239.git.yamah...@valinux.co.jp>
> From: Isaku Yamahata 
> Date: Fri, 29 Jul 2011 09:52:45 +0900
> Subject: [PATCH] pcie_host: limit check of pcie_mmcfg_data_write/read
> 
> This patch adds the check where the offset in the configuration space
> is in its configuration size.
> 
> MMCFG area allows access of pcie configuration space to be in
> offset [0, 4K).
> At the same time, conventional pci devices whose configuration space size
> is 256 bytes can be behind pcie-to-pci bridge.
> The access whose offset is [256, 4K) should have no effect
> on the conventional pci device
> Add the limit check and ignore such accesses.
> 
> Signed-off-by: Isaku Yamahata 

I tweaked the commit log and applied this.
Thanks!



[Qemu-devel] [RFC 17/24] qcow2: add qcow2_get_mapping

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block/qcow2.c |   36 
 1 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 48e1b95..05ea40c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1340,6 +1340,40 @@ static int qcow2_load_vmstate(BlockDriverState *bs, 
uint8_t *buf,
 return ret;
 }
 
+static int qcow2_get_mapping(BlockDriverState *bs, uint64_t *guest_offset,
+uint64_t *host_offset, uint64_t *contiguous_bytes)
+{
+uint64_t cluster_offset, pos;
+//BDRVQcowState *s = bs->opaque;
+int ret, count;
+pos = *guest_offset + *contiguous_bytes;
+
+if (pos >= bs->total_sectors << BDRV_SECTOR_BITS) {
+*contiguous_bytes = 0;
+return 0;
+}
+count = 0;
+do {
+pos += count << BDRV_SECTOR_BITS;
+count = INT_MAX;
+ret = qcow2_get_cluster_offset(bs, pos, &count, &cluster_offset);
+if (ret) {
+*contiguous_bytes = 0;
+return ret;
+}
+} while (!cluster_offset && pos < bs->total_sectors * BDRV_SECTOR_SIZE);
+
+if (pos >= bs->total_sectors << BDRV_SECTOR_BITS || !cluster_offset) {
+*contiguous_bytes = 0;
+return 0;
+}
+
+*contiguous_bytes = count << BDRV_SECTOR_BITS;
+*guest_offset = pos;
+*host_offset = cluster_offset;
+return 0;
+}
+
 static QEMUOptionParameter qcow2_create_options[] = {
 {
 .name = BLOCK_OPT_SIZE,
@@ -1409,6 +1443,8 @@ static BlockDriver bdrv_qcow2 = {
 
 .create_options = qcow2_create_options,
 .bdrv_check = qcow2_check,
+
+.bdrv_get_mapping = qcow2_get_mapping,
 };
 
 static void bdrv_qcow2_init(void)
-- 
1.7.6.rc1




[Qemu-devel] [RFC 08/24] qed: add qed_find_cluster_sync()

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block/qed-cluster.c |   35 +++
 block/qed.h |4 
 2 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/block/qed-cluster.c b/block/qed-cluster.c
index 3e19ad1..063b965 100644
--- a/block/qed-cluster.c
+++ b/block/qed-cluster.c
@@ -163,3 +163,38 @@ void qed_find_cluster(BDRVQEDState *s, QEDRequest 
*request, uint64_t pos,
 qed_read_l2_table(s, request, l2_offset,
   qed_find_cluster_cb, find_cluster_cb);
 }
+
+typedef struct {
+int ret;
+uint64_t *offset;
+size_t *len;
+} QEDFindClusterSyncCB;
+
+static void qed_find_cluster_sync_cb(void *opaque, int ret, uint64_t offset,
+  size_t len)
+{
+QEDFindClusterSyncCB *find_cluster_sync_cb = opaque;
+*find_cluster_sync_cb->offset = offset;
+*find_cluster_sync_cb->len = len;
+find_cluster_sync_cb->ret = ret;
+}
+
+int qed_find_cluster_sync(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
+  size_t len, uint64_t *offset,
+  size_t *contiguous_bytes)
+{
+QEDFindClusterSyncCB find_cluster_cb;
+find_cluster_cb.ret = -EINPROGRESS;
+find_cluster_cb.offset = offset;
+find_cluster_cb.len = contiguous_bytes;
+
+async_context_push();
+qed_find_cluster(s, request, pos, len, &qed_find_cluster_sync_cb,
+ &find_cluster_cb);
+while (find_cluster_cb.ret == -EINPROGRESS) {
+qemu_aio_wait();
+}
+async_context_pop();
+
+return find_cluster_cb.ret;
+}
diff --git a/block/qed.h b/block/qed.h
index 388fdb3..c899c15 100644
--- a/block/qed.h
+++ b/block/qed.h
@@ -239,6 +239,10 @@ int qed_write_l2_table_sync(BDRVQEDState *s, QEDRequest 
*request,
 void qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
   size_t len, QEDFindClusterFunc *cb, void *opaque);
 
+int qed_find_cluster_sync(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
+  size_t len, uint64_t *offset,
+  size_t *contiguous_bytes);
+
 /**
  * Consistency check
  */
-- 
1.7.6.rc1




[Qemu-devel] [RFC 24/24] qemu-img: add inplace conversion to qemu-img

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 Makefile |2 +
 qemu-img-cmds.hx |6 +
 qemu-img.c   |   64 ++
 3 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/Makefile b/Makefile
index daa3aa0..243c818 100644
--- a/Makefile
+++ b/Makefile
@@ -148,6 +148,8 @@ qemu-nbd$(EXESUF): qemu-nbd.o qemu-tool.o qemu-error.o 
$(oslib-obj-y) $(trace-ob
 
 qemu-io$(EXESUF): qemu-io.o cmd.o qemu-tool.o qemu-error.o $(oslib-obj-y) 
$(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y) 
qemu-timer-common.o
 
+qemu-convert$(EXESUF): qemu-convert.o cmd.o qemu-tool.o qemu-error.o 
$(oslib-obj-y) $(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y) 
qemu-timer-common.o
+
 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"  GEN  
 $@")
 
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 1299e83..ea97f83 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -33,6 +33,12 @@ STEXI
 @item convert [-c] [-p] [-f @var{fmt}] [-O @var{output_fmt}] [-o 
@var{options}] [-s @var{snapshot_name}] @var{filename} [@var{filename2} [...]] 
@var{output_filename}
 ETEXI
 
+DEF("convert-inplace", img_convert_inplace,
+"convert-inplace filename target_format")
+STEXI
+@item convert-inplace @var{filename} @var{target_format}
+ETEXI
+
 DEF("info", img_info,
 "info [-f fmt] filename")
 STEXI
diff --git a/qemu-img.c b/qemu-img.c
index b205e98..c22066e 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1024,6 +1024,70 @@ out:
 return 0;
 }
 
+static int img_convert_inplace(int argc, char **argv)
+{
+int ret;
+BlockDriverState *bs_src, *bs_tgt;
+QEMUOptionParameter *options = NULL;
+BlockConversionOptions drv_options;
+uint64_t guest_offset, host_offset, count;
+guest_offset = host_offset = count = 0;
+bs_src = bdrv_new("");
+ret = bdrv_open(bs_src, argv[1], BDRV_O_RDWR, NULL);
+if (ret) {
+printf("Failed to open source image: %s\n", argv[1]);
+return ret;
+}
+
+ret = bdrv_get_conversion_options(bs_src, &drv_options);
+if (ret) {
+printf("Failed getting conversion options\n");
+goto end_before_tgt;
+}
+
+ret = bdrv_open_conversion_target(&bs_tgt, bs_src->file, &drv_options, 
options, argv[2]);
+if (ret) {
+printf("Failed to open conversion target, format: %s\n", argv[2]);
+switch(ret) {
+case -ENOENT:
+printf("No block driver found\n"); break;
+case -ENOTSUP:
+printf("Not supported by block driver\n"); break;
+default:
+printf("Unknown error (%i)\n", ret);
+}
+goto end_before_tgt;
+}
+
+do {
+ret = bdrv_get_mapping(bs_src, &guest_offset, &host_offset, &count);
+if (!ret) {
+ret = bdrv_map(bs_tgt, guest_offset, host_offset, count);
+}
+} while (count != 0);
+
+if (ret) {
+printf("An error occured during mapping process\n");
+goto end;
+}
+ret = bdrv_copy_header(bs_tgt);
+if (ret) {
+printf("An error occured during header copy\n");
+goto end;
+}
+
+ret = 0;
+
+end:
+bdrv_close(bs_tgt);
+end_before_tgt:
+bdrv_close(bs_src);
+return ret;
+
+
+
+}
+
 
 static void dump_snapshots(BlockDriverState *bs)
 {
-- 
1.7.6.rc1




[Qemu-devel] [RFC 23/24] qemu-io: add setmap command

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 qemu-io.c |   24 
 1 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index caf51fe..a49f62a 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -1601,6 +1601,29 @@ static const cmdinfo_t map_cmd = {
 };
 
 
+static int setmap_f(int argc, char **argv)
+{
+uint64_t guest_offset, host_offset, count;
+guest_offset = cvtnum(argv[1]);
+host_offset = cvtnum(argv[2]);
+if (argc == 4) {
+count =  cvtnum(argv[3]);
+} else {
+/*count = bs->s*/
+count = 0; /*TODO: fix*/
+}
+return bdrv_map(bs, guest_offset, host_offset, count);
+}
+
+static const cmdinfo_t setmap_cmd = {
+.name = "setmap",
+.argmin = 2,
+.argmax = 3,
+.cfunc = setmap_f,
+.args = "",
+.oneline = "Sets mapping in image file",
+};
+
 static int close_f(int argc, char **argv)
 {
 bdrv_close(bs);
@@ -1835,6 +1858,7 @@ int main(int argc, char **argv)
 add_command(&discard_cmd);
 add_command(&alloc_cmd);
 add_command(&map_cmd);
+add_command(&setmap_cmd);
 
 add_args_command(init_args_command);
 add_check_command(init_check_command);
-- 
1.7.6.rc1




[Qemu-devel] [RFC 19/24] qcow2: add qcow2_copy_header()

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block/qcow2.c |   54 ++
 1 files changed, 54 insertions(+), 0 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index b75364d..3bb28d2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1374,6 +1374,59 @@ static int qcow2_get_mapping(BlockDriverState *bs, 
uint64_t *guest_offset,
 return 0;
 }
 
+static int qcow2_copy_header(BlockDriverState *bs)
+{
+BlockDriverState *backup;
+uint8_t buffer[512];
+int ret;
+QCowHeader header;
+BDRVQcowState *s = bs->opaque;
+ret = qcow2_cache_flush(bs, s->l2_table_cache);
+if (ret) {
+return ret;
+}
+
+/*TODO: change filename*/
+bdrv_create_file("headerbackup.temp", NULL);
+bdrv_file_open(&backup, "headerbackup.temp", BDRV_O_RDWR);
+bdrv_read(bs->file, 0, buffer, 1); /*TODO: check return code*/
+bdrv_write(backup, 0, buffer, 1); /*TODO: check return code*/
+bdrv_close(backup);
+
+header.magic = QCOW_MAGIC;
+header.version = 2;
+header.backing_file_offset = 0;
+header.backing_file_size = 0;
+header.size = bs->total_sectors << BDRV_SECTOR_BITS;
+header.cluster_bits = s->cluster_bits;
+header.crypt_method = QCOW_CRYPT_NONE;
+header.l1_table_offset = s->l1_table_offset;
+header.l1_size = s->l1_size;
+header.refcount_table_offset = s->refcount_table_offset;
+header.refcount_table_clusters = s->refcount_table_size; /*TODO: FIXME*/
+header.snapshots_offset = s->snapshots_offset;
+header.nb_snapshots = s->nb_snapshots;
+
+cpu_to_be64s(&header.backing_file_offset);
+cpu_to_be32s(&header.backing_file_size);
+cpu_to_be32s(&header.cluster_bits);
+cpu_to_be32s(&header.crypt_method);
+cpu_to_be32s(&header.l1_size);
+cpu_to_be64s(&header.l1_table_offset);
+cpu_to_be32s(&header.magic);
+cpu_to_be32s(&header.nb_snapshots);
+cpu_to_be32s(&header.refcount_table_clusters);
+cpu_to_be64s(&header.refcount_table_offset);
+cpu_to_be64s(&header.size);
+cpu_to_be64s(&header.snapshots_offset);
+cpu_to_be32s(&header.version);
+ret = bdrv_pwrite_sync(bs->file, 0, &header, sizeof(QCowHeader));
+if (ret) {
+return ret;
+}
+return qcow2_drop_leaked_clusters(bs);
+}
+
 static QEMUOptionParameter qcow2_create_options[] = {
 {
 .name = BLOCK_OPT_SIZE,
@@ -1446,6 +1499,7 @@ static BlockDriver bdrv_qcow2 = {
 
 .bdrv_get_mapping = qcow2_get_mapping,
 .bdrv_map = qcow2_map,
+.bdrv_copy_header = qcow2_copy_header,
 };
 
 static void bdrv_qcow2_init(void)
-- 
1.7.6.rc1




[Qemu-devel] [RFC 18/24] qcow2: add qcow2_map

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block/qcow2-cluster.c |   49 +
 block/qcow2.c |1 +
 block/qcow2.h |3 +++
 3 files changed, 53 insertions(+), 0 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index ca56918..848f2ee 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -977,3 +977,52 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t 
offset,
 
 return 0;
 }
+
+int qcow2_map(BlockDriverState *bs, uint64_t guest_offset,
+uint64_t host_offset, uint64_t contiguous_bytes)
+{
+BDRVQcowState *s = bs->opaque;
+unsigned int nelms;
+
+
+if ((s->cluster_size - 1) & guest_offset) {
+return -EINVAL;
+}
+
+if (contiguous_bytes % s->cluster_size) {
+return -EINVAL;
+}
+
+nelms = s->l2_size / sizeof(uint64_t);
+
+while (contiguous_bytes > 0) {
+unsigned int l1_index, l2_index;
+uint64_t *l2_table;
+int ret;
+l1_index = guest_offset >> (s->l2_bits + s->cluster_bits);
+l2_index = (guest_offset  >> s->cluster_bits) & (s->l2_size - 1);
+
+if (!s->l1_table[l1_index]) {
+ret = l2_allocate(bs, l1_index, &l2_table);
+if (ret) {
+return ret;
+}
+} else {
+ret = l2_load(bs, s->l1_table[l1_index] & ~QCOW_OFLAG_COPIED, 
&l2_table);
+if (ret) {
+return ret;
+}
+}
+qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+
+for (; l2_index < nelms && contiguous_bytes > 0;  l2_index++) {
+l2_table[l2_index] = cpu_to_be64(host_offset | QCOW_OFLAG_COPIED);
+host_offset += s->cluster_size;
+contiguous_bytes -= s->cluster_size;
+}
+
+qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
+
+}
+return 0;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 05ea40c..b75364d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1445,6 +1445,7 @@ static BlockDriver bdrv_qcow2 = {
 .bdrv_check = qcow2_check,
 
 .bdrv_get_mapping = qcow2_get_mapping,
+.bdrv_map = qcow2_map,
 };
 
 static void bdrv_qcow2_init(void)
diff --git a/block/qcow2.h b/block/qcow2.h
index e9efb74..feea866 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -245,4 +245,7 @@ int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache 
*c, uint64_t offset,
 void **table);
 int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
 
+int qcow2_map(BlockDriverState *bs, uint64_t guest_offset, uint64_t host_ofset,
+  uint64_t contiguous_bytes);
+
 #endif
-- 
1.7.6.rc1




[Qemu-devel] [RFC 16/24] qcow2: add qcow2_drop_leaked_clusters()

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block/qcow2-refcount.c |   34 ++
 block/qcow2.h  |2 ++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 75f1f88..2f78a71 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1202,3 +1202,37 @@ fail:
 
 return ret;
 }
+
+int qcow2_drop_leaked_clusters(BlockDriverState *bs) {
+BDRVQcowState *s = bs->opaque;
+int64_t size;
+int nb_clusters, refcount, i;
+uint16_t *refcount_table;
+BdrvCheckResult res;
+int ret;
+
+ret = inc_refcount_table(bs, &res, &refcount_table);
+if (ret) {
+goto fail;
+}
+size = bdrv_getlength(bs->file);
+nb_clusters = size_to_clusters(s, size);
+
+for(i = 0; i < nb_clusters; i++) {
+refcount = get_refcount(bs, i);
+refcount = refcount_table[i] - refcount;
+if (refcount < 0) { //we only want to change leaked clusters
+ret = update_cluster_refcount(bs, i , refcount);
+if (ret) {
+goto fail;
+}
+}
+}
+
+ret = 0;
+
+fail:
+qemu_free(refcount_table);
+
+return ret;
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index 6a0a21b..e9efb74 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -193,6 +193,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 
 int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res);
 
+int qcow2_drop_leaked_clusters(BlockDriverState *bs) ;
+
 /* qcow2-cluster.c functions */
 int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size);
 void qcow2_l2_cache_reset(BlockDriverState *bs);
-- 
1.7.6.rc1




[Qemu-devel] [RFC 11/24] qed: add open_conversion_target()

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block/qed.c |   86 +++---
 1 files changed, 69 insertions(+), 17 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index daf82fd..b05224a 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1451,6 +1451,57 @@ static int bdrv_qed_check(BlockDriverState *bs, 
BdrvCheckResult *result)
 return qed_check(s, result, false);
 }
 
+static int bdrv_qed_open_conversion_target(BlockDriverState *bs,
+   BlockConversionOptions *drv_options,
+   QEMUOptionParameter *usr_options)
+{
+BDRVQEDState *s = bs->opaque;
+s->bs = bs;
+if (drv_options->encryption_type != BLOCK_CRYPT_NONE) {
+return -1; //TODO: FIXME
+}
+if (drv_options->cluster_size != drv_options->allocation_size) {
+return -1; //TODO: FIXME
+}
+s->header.magic = QED_MAGIC;
+s->header.table_size = QED_DEFAULT_TABLE_SIZE;
+if(qed_is_cluster_size_valid(drv_options->cluster_size)) {
+s->header.cluster_size = drv_options->cluster_size;
+} else {
+return -EINVAL;
+}
+if(qed_is_image_size_valid(drv_options->image_size, s->header.cluster_size,
+   s->header.table_size)) {
+/*TODO: check if qed_is_image_size_valid
+* checks image_size % cluster_size =0 */
+s->header.image_size = drv_options->image_size;
+} else {
+return -EINVAL;
+}
+s->file_size = bs->file->total_sectors << BDRV_SECTOR_BITS;
+s->l1_table = qed_alloc_table(s);
+s->header.l1_table_offset = qed_alloc_clusters(s, s->header.table_size);
+QSIMPLEQ_INIT(&s->allocating_write_reqs);
+
+
+if (!qed_check_table_offset(s, s->header.l1_table_offset)) {
+return -EINVAL;
+}
+
+s->table_nelems = (s->header.cluster_size * s->header.table_size) /
+  sizeof(uint64_t);
+s->l2_shift = ffs(s->header.cluster_size) - 1;
+s->l2_mask = s->table_nelems - 1;
+s->l1_shift = s->l2_shift + ffs(s->table_nelems) - 1;
+
+qed_init_l2_cache(&s->l2_cache);
+
+s->need_check_timer = qemu_new_timer_ns(vm_clock,
+qed_need_check_timer_cb, s);
+qed_write_l1_table_sync(s, 0, s->table_nelems);
+return 0;
+}
+
 static int bdrv_qed_get_mapping(BlockDriverState *bs, uint64_t *guest_offset,
 uint64_t *host_offset,
 uint64_t *contiguous_bytes)
@@ -1559,23 +1610,24 @@ static BlockDriver bdrv_qed = {
 .instance_size= sizeof(BDRVQEDState),
 .create_options   = qed_create_options,
 
-.bdrv_probe   = bdrv_qed_probe,
-.bdrv_open= bdrv_qed_open,
-.bdrv_close   = bdrv_qed_close,
-.bdrv_create  = bdrv_qed_create,
-.bdrv_flush   = bdrv_qed_flush,
-.bdrv_is_allocated= bdrv_qed_is_allocated,
-.bdrv_make_empty  = bdrv_qed_make_empty,
-.bdrv_aio_readv   = bdrv_qed_aio_readv,
-.bdrv_aio_writev  = bdrv_qed_aio_writev,
-.bdrv_aio_flush   = bdrv_qed_aio_flush,
-.bdrv_truncate= bdrv_qed_truncate,
-.bdrv_getlength   = bdrv_qed_getlength,
-.bdrv_get_info= bdrv_qed_get_info,
-.bdrv_change_backing_file = bdrv_qed_change_backing_file,
-.bdrv_check   = bdrv_qed_check,
-.bdrv_get_mapping = bdrv_qed_get_mapping,
-.bdrv_map = bdrv_qed_map,
+.bdrv_probe  = bdrv_qed_probe,
+.bdrv_open   = bdrv_qed_open,
+.bdrv_close  = bdrv_qed_close,
+.bdrv_create = bdrv_qed_create,
+.bdrv_flush  = bdrv_qed_flush,
+.bdrv_is_allocated   = bdrv_qed_is_allocated,
+.bdrv_make_empty = bdrv_qed_make_empty,
+.bdrv_aio_readv  = bdrv_qed_aio_readv,
+.bdrv_aio_writev = bdrv_qed_aio_writev,
+.bdrv_aio_flush  = bdrv_qed_aio_flush,
+.bdrv_truncate   = bdrv_qed_truncate,
+.bdrv_getlength  = bdrv_qed_getlength,
+.bdrv_get_info   = bdrv_qed_get_info,
+.bdrv_change_backing_file= bdrv_qed_change_backing_file,
+.bdrv_check  = bdrv_qed_check,
+.bdrv_open_conversion_target = bdrv_qed_open_conversion_target,
+.bdrv_get_mapping= bdrv_qed_get_mapping,
+.bdrv_map= bdrv_qed_map,
 };
 
 static void bdrv_qed_init(void)
-- 
1.7.6.rc1




[Qemu-devel] [RFC 02/24] block: add bdrv_get_conversion_options()

2011-07-28 Thread Devin Nakamura

Signed-off-by: Devin Nakamura 
---
 block.c |   13 +
 block.h |2 ++
 2 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 9549b9e..4503b7b 100644
--- a/block.c
+++ b/block.c
@@ -3037,3 +3037,16 @@ out:
 
 return ret;
 }
+
+int bdrv_get_conversion_options(BlockDriverState *bs,
+BlockConversionOptions *options)
+{
+if (!bs->drv) {
+return -ENOENT;
+}
+
+if (!bs->drv->bdrv_get_conversion_options) {
+return -ENOTSUP;
+}
+return bs->drv->bdrv_get_conversion_options(bs, options);
+}
diff --git a/block.h b/block.h
index a1c4cc8..ad7e5ea 100644
--- a/block.h
+++ b/block.h
@@ -253,6 +253,8 @@ int bdrv_in_use(BlockDriverState *bs);
 
 typedef struct BlockConversionOptions BlockConversionOptions;
 
+int bdrv_get_conversion_options(BlockDriverState *bs,
+BlockConversionOptions *options);
 typedef enum {
 BLKDBG_L1_UPDATE,
 
-- 
1.7.6.rc1




[Qemu-devel] [RFC 03/24] block: add bdrv_open_conversion_target()

2011-07-28 Thread Devin Nakamura
Conflicts:

block.h

Signed-off-by: Devin Nakamura 
---
 block.c |   32 
 block.h |4 
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 4503b7b..9530577 100644
--- a/block.c
+++ b/block.c
@@ -3038,6 +3038,38 @@ out:
 return ret;
 }
 
+int bdrv_open_conversion_target(BlockDriverState **bs, BlockDriverState *file,
+BlockConversionOptions *drv_options,
+QEMUOptionParameter *usr_options,
+const char *target_fmt)
+{
+BlockDriver *drv;
+BlockDriverState *bss;
+
+drv = bdrv_find_format(target_fmt);
+if (!drv) {
+return -ENOENT;
+}
+
+if (!drv->bdrv_open_conversion_target) {
+return -ENOTSUP;
+}
+
+*bs = bdrv_new("");
+bss = *bs;
+bss->file = file;
+bss->total_sectors = drv_options->image_size >> BDRV_SECTOR_BITS;
+bss->encrypted = 0;
+bss->valid_key = 0;
+bss->open_flags = 0;
+/* buffer_alignment defaulted to 512, drivers can change this value */
+bss->buffer_alignment = 512;
+bss->opaque = qemu_mallocz(drv->instance_size);
+bss->drv = drv;
+return drv->bdrv_open_conversion_target(bss, drv_options, usr_options);
+
+}
+
 int bdrv_get_conversion_options(BlockDriverState *bs,
 BlockConversionOptions *options)
 {
diff --git a/block.h b/block.h
index ad7e5ea..662bae7 100644
--- a/block.h
+++ b/block.h
@@ -255,6 +255,10 @@ typedef struct BlockConversionOptions 
BlockConversionOptions;
 
 int bdrv_get_conversion_options(BlockDriverState *bs,
 BlockConversionOptions *options);
+int bdrv_open_conversion_target(BlockDriverState **bs, BlockDriverState *file,
+BlockConversionOptions *drv_options,
+QEMUOptionParameter *usr_options,
+const char *target_fmt);
 typedef enum {
 BLKDBG_L1_UPDATE,
 
-- 
1.7.6.rc1




[Qemu-devel] [RFC 01/24] block: add block conversion api

2011-07-28 Thread Devin Nakamura
add functions to block driver interface to support inplace image conversion

Signed-off-by: Devin Nakamura 
---
 block.h |2 +
 block_int.h |   88 +++
 2 files changed, 90 insertions(+), 0 deletions(-)

diff --git a/block.h b/block.h
index 59cc410..a1c4cc8 100644
--- a/block.h
+++ b/block.h
@@ -251,6 +251,8 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs);
 void bdrv_set_in_use(BlockDriverState *bs, int in_use);
 int bdrv_in_use(BlockDriverState *bs);
 
+typedef struct BlockConversionOptions BlockConversionOptions;
+
 typedef enum {
 BLKDBG_L1_UPDATE,
 
diff --git a/block_int.h b/block_int.h
index efb6803..84bf89e 100644
--- a/block_int.h
+++ b/block_int.h
@@ -41,6 +41,9 @@
 #define BLOCK_OPT_PREALLOC  "preallocation"
 #define BLOCK_OPT_SUBFMT"subformat"
 
+#define BLOCK_CRYPT_NONE0
+#define BLOCK_CRYPT_AES 1
+
 typedef struct AIOPool {
 void (*cancel)(BlockDriverAIOCB *acb);
 int aiocb_size;
@@ -139,6 +142,85 @@ struct BlockDriver {
  */
 int (*bdrv_has_zero_init)(BlockDriverState *bs);
 
+/* In-place image conversion */
+
+/**
+ * Opens an image conversion target.
+ *
+ * @param bs  Basic Initialization done by
+ *bdrv_open_conversion_target() Still need to set 
format
+ *specific data.
+ * @param usr_options Creation options.
+ * @param drv_options Conversion Options
+ * @returnReturns non-zero on failure.
+ */
+int (*bdrv_open_conversion_target)(BlockDriverState *bs,
+BlockConversionOptions *drv_options, QEMUOptionParameter *usr_options);
+
+/**
+ * Gets a mapping in the image file.
+ *
+ * The function starts searching for a mapping at
+ * starting_guest_offset = guest_offset + contiguous_bytes
+ *
+ * @param bs[in]   The image in which to find mapping.
+ * @param guest_offset[in,out] On function entry used to calculate
+ * starting search address.
+ * On function exit contains the starting
+ * guest offset of the mapping.
+ * @param host_offset[out] The starting image file offset for the
+ * mapping.
+ * @param contiguous_bytes[in,out] On function entry used to calculate
+ * starting search address.
+ * On function exit contains the number of
+ * bytes for which this mapping is valid.
+ * A value of 0 means there are no more
+ * mappings in the image.
+ * @return Returns non-zero on error.
+ */
+int (*bdrv_get_mapping)(BlockDriverState *bs, uint64_t *guest_offset,
+uint64_t *host_offset, uint64_t *contiguous_bytes);
+
+/**
+ * Sets a mapping in the image file.
+ *
+ * @param bs   Usualy opened with bdrv_open_conversion_target
+ * @param guest_offset The starting guest offset of the mapping
+ * (in bytes). Function fails and returns -EINVAL 
if
+ * not cluster aligned.
+ * @param host_offset  The starting image offset of the mapping
+ * (in bytes). Function fails and returns -EINVAL 
if
+ * not cluster aligned.
+ * @param contiguous_bytes The number of bytes for which this mapping 
exists
+ * Function fails and returns -EINVAL if not 
cluster
+ * aligned
+ * @return Returns non-zero on error
+ */
+int (*bdrv_map)(BlockDriverState *bs, uint64_t guest_offset,
+uint64_t host_offset, uint64_t contiguous_bytes);
+
+/**
+ * Copies out the header of a conversion target
+ *
+ * Saves the current header for the image in a temporary file and 
overwrites
+ * it with the header for the new format (at the moment the header is
+ * assumed to be 1 sector)
+ *
+ * @param bs  Usualy opened with bdrv_open_conversion_target().
+ * @returnReturns non-zero on failure
+ */
+int (*bdrv_copy_header) (BlockDriverState *bs);
+
+/**
+ * Asks the block driver what options should be used to create a conversion
+ * target.
+ *
+ * @param options[out] Block conversion options
+ */
+int (*bdrv_get_conversion_options)(BlockDriverState *bs,
+ BlockConversionOptions *options);
+
+
 QLIST_ENTRY(BlockDriver) list;
 };
 
@@ -263,4 +345,10 @@ static inline unsigned int 
get_physical_block_exp(BlockConf *conf)
 DEFINE_PROP_UINT32("discard_granularity", _state, \
_conf.discard_granularity, 0)
 
+struct BlockConversionOptions {
+i

[Qemu-devel] [RFC 00/24] inplace image conversion

2011-07-28 Thread Devin Nakamura
This series sets out the foundations of an api for in place image conversion, 
and implements it 
in the QED and QCOW2 drivers.

The basic flow of conversion is as follows:
The source image is openened as normal.

The source file is the queried for conversion options via 
bdrv_get_conversion_options. The conversion
options contain things like the size of the clusters, the type of encryption 
etc.

The target image is then opened with bdrv_open_conversion_target, passing the 
conversion optionons
among other things allong with it. This is basically a hybrid of bdrv_open and 
bdrv_create.

Mappings are then fetched and set using bdrv_get_mapping and bdrv_map 
respectively. They both deal 
with data in the format guest_offset, host_offset, and contiguous_bytes. For 
example if 
guest_offset = 31, host_offset = 41, and contiguous_bytes = 59 that would mean 
that offset 31 in the
virtual disk would be located at offset 41 in the image file and 32 at 42 and 
so on up to 89 at 99.

At the end of the mapping process the image is "finalized" with 
bdrv_copy_header, where the target
driver copies out the old header to a temporary file, and then writes out its 
own header into the 
image.

I've integerated inplace conversion into qemu-img in the form: 
qemu-img convert-inplace  

I've also integrated bdrv_get_mapping and bdrv_map into qemu-io (with the 
former replacing qemu-io's
internal mapping when then block driver supports it) 

Devin Nakamura (24):
  block: add block conversion api
  block: add bdrv_get_conversion_options()
  block: add bdrv_open_conversion_target()
  block: add bdrv_get_mapping()
  block: add bdrv_map()
  block: add bdrv_copy_header()
  qed: make qed_alloc_clusters round up offset to nearest cluster
  qed: add qed_find_cluster_sync()
  qed: add qed_bdrv_get_mapping()
  qed: add qed_bdrv_map()
  qed: add open_conversion_target()
  qed: add bdrv_qed_copy_header()
  qed: add bdrv_qed_get_conversion_options()
  qcow2: fix typo in documentation for qcow2_get_cluster_offset()
  qcow2: split up the creation of new refcount table from the act of
checking it
  qcow2: add qcow2_drop_leaked_clusters()
  qcow2: add qcow2_get_mapping
  qcow2: add qcow2_map
  qcow2: add qcow2_copy_header()
  qcow2: add get_conversion_options()
  qcow2: add qcow2_open_conversion_target()
  qemu-io: make map command use new block mapping function
  qemu-io: add setmap command
  qemu-img: add inplace conversion to qemu-img

 Makefile   |2 +
 block.c|  100 +
 block.h|   14 +++
 block/qcow2-cluster.c  |   53 +++-
 block/qcow2-refcount.c |   71 +--
 block/qcow2.c  |  233 
 block/qcow2.h  |5 +
 block/qed-cluster.c|   35 +++
 block/qed.c|  193 ---
 block/qed.h|4 +
 block_int.h|   88 ++
 qemu-img-cmds.hx   |6 ++
 qemu-img.c |   64 +
 qemu-io.c  |   47 ++-
 14 files changed, 888 insertions(+), 27 deletions(-)

-- 
1.7.6.rc1




[Qemu-devel] 0.15 testing, spice, wxp

2011-07-28 Thread Rick Vernam
I'd like to do some very high-level testing of qemu-stable-0.15 (qemu-system-
x86_64, specifically) running Linux host, Windows XP SP3 guest.
I'm afraid that I don't have a ready environment for building the windows 
guest bits, and the added time requirement simply puts it over the top for me.
Are there updated spice windows guest binaries available anywhere?  I suppose 
this would just be the QXL video driver and the guest agent?  From looking at 
spice-space.org/download.html I concluded that I'm looking for verion 0.8.x, 
but my searches have left me empty handed.

Thanks for any tips,
-Rick



Re: [Qemu-devel] [PATCH v2 1/1] The codes V2 for QEMU disk I/O limits.

2011-07-28 Thread Zhi Yong Wu
On Thu, Jul 28, 2011 at 10:42 PM, Marcelo Tosatti  wrote:
> On Thu, Jul 28, 2011 at 12:24:48PM +0800, Zhi Yong Wu wrote:
>> On Wed, Jul 27, 2011 at 11:49 PM, Marcelo Tosatti  
>> wrote:
>> > On Wed, Jul 27, 2011 at 06:17:15PM +0800, Zhi Yong Wu wrote:
>> >> >> +        wait_time = 1;
>> >> >> +    }
>> >> >> +
>> >> >> +    wait_time = wait_time + (slice_time - elapsed_time);
>> >> >> +    if (wait) {
>> >> >> +        *wait = wait_time * BLOCK_IO_SLICE_TIME * 10 + 1;
>> >> >> +    }
>> >> >
>> >> > The guest can keep submitting requests where "wait_time = 1" above,
>> >> > and the timer will be rearmed continuously in the future.
>> >
>> > This is wrong.
>> >
>> >>  Can't you
>> >> > simply arm the timer to the next slice start? _Some_ data must be
>> >> > transfered by then, anyway (and nothing can be transfered earlier than
>> >> > that).
>> >
>> > This is valid.
>> >
>> >> Sorry, i have got what you mean. Can you elaborate in more detail?
>> >
>> > Sorry, the bug i mentioned about timer being rearmed does not exist.
>> >
>> > But arming the timer for the last request as its done is 
>> > confusing/unnecessary.
>> >
>> > The timer processes queued requests, but the timer is armed accordingly
>> > to the last queued request in the slice. For example, if a request is
>> > submitted 1ms before the slice ends, the timer is armed 100ms +
>> > (slice_time - elapsed_time) in the future.
>>
>> If the timer is simply amred to the next slice start, this timer will
>> be a periodic timer, either the I/O rate can not be throttled under
>> the limits, or the enqueued request can be delayed to handled, this
>> will lower I/O rate seriously than the limits.
>
> Yes, periodic but disarmed when there is no queueing. I don't understand
> your point about low I/O rate.
We hope that at the same time the runtime I/O rate is throttled under
this limits, it can be very close to the limits. If the timer is
simpily armed to the next slice, the enqueued request could be delayed
a bit long to be handled, this could make current I/O rate lowerer
largely than the limits. So this could seriously hurt the I/O
performance.

>
>> Maybe the slice time should be variable with current I/O rate. What do
>> you think of it?
>
> Not sure if its necessary. The slice should be small to avoid excessive
> work on timer context, but the advantage of increasing the slice is
> very small if any. BTW, 10ms seems a better default than 100ms.
Thanks for your comments, if you're interested, pls take a look at the
code V3. It has added the codes for variable slice time.

>
>



-- 
Regards,

Zhi Yong Wu



Re: [Qemu-devel] [PATCH] v6 revamp acpitable parsing and allow to specify complete (headerful) table

2011-07-28 Thread Isaku Yamahata
On Fri, Jul 15, 2011 at 07:51:43PM +0300, Blue Swirl wrote:
> On Fri, Jul 15, 2011 at 6:18 PM, John Baboval
>  wrote:
> > Is there something I can do to help take this patch the rest of the way?
> >
> > I'd hate to see it die because of a style issue and a compiler warning.
> 
> There's also suspicious missing 'break' statement. How about fixing
> the issues and submitting the patch?

I fixed the compile error.
I think the missing break is intentional, so added an comment there. Michael?
Blue, can you please take a look of it?


>From 9a5e4158074ea251ab064a946927bcaf861f5c1e Mon Sep 17 00:00:00 2001
Message-Id: 
<9a5e4158074ea251ab064a946927bcaf861f5c1e.1311903724.git.yamah...@valinux.co.jp>
From: Michael Tokarev 
Date: Thu, 12 May 2011 18:44:17 +0400
Subject: [PATCH] revamp acpitable parsing and allow to specify complete 
(headerful) table

>From Michael Tokarev 

This patch almost rewrites acpi_table_add() function
(but still leaves it using old get_param_value() interface).
The result is that it's now possible to specify whole table
(together with a header) in an external file, instead of just
data portion, with a new file= parameter, but at the same time
it's still possible to specify header fields as before.

Now with the checkpatch.pl formatting fixes, thanks to
Stefan Hajnoczi for suggestions, with changes from
Isaku Yamahata, and with my further refinements.

Signed-off-by: Michael Tokarev 
Cc:: Isaku Yamahata 
Cc: John Baboval 
Cc: Blue Swirl 

---
v5: rediffed against current qemu/master.
v6: fix one "} else {" coding style defect (noted by Blue Swirl)
v7: style fix and added an comment for suspicious break
I think that the missing break of case 0 is intentional.
I added the fallthrough comment there.
---
 hw/acpi.c   |  298 ---
 qemu-options.hx |7 +-
 2 files changed, 178 insertions(+), 127 deletions(-)

diff --git a/hw/acpi.c b/hw/acpi.c
index ad40fb4..79ec66c 100644
--- a/hw/acpi.c
+++ b/hw/acpi.c
@@ -20,19 +20,30 @@
 #include "pc.h"
 #include "acpi.h"
 
-struct acpi_table_header
-{
-char signature [4];/* ACPI signature (4 ASCII characters) */
+struct acpi_table_header {
+uint16_t _length; /* our length, not actual part of the hdr */
+  /* XXX why we have 2 length fields here? */
+char sig[4];  /* ACPI signature (4 ASCII characters) */
 uint32_t length;  /* Length of table, in bytes, including header */
 uint8_t revision; /* ACPI Specification minor version # */
 uint8_t checksum; /* To make sum of entire table == 0 */
-char oem_id [6];   /* OEM identification */
-char oem_table_id [8]; /* OEM table identification */
+char oem_id[6];   /* OEM identification */
+char oem_table_id[8]; /* OEM table identification */
 uint32_t oem_revision;/* OEM revision number */
-char asl_compiler_id [4]; /* ASL compiler vendor ID */
+char asl_compiler_id[4];  /* ASL compiler vendor ID */
 uint32_t asl_compiler_revision; /* ASL compiler revision number */
 } __attribute__((packed));
 
+#define ACPI_TABLE_HDR_SIZE sizeof(struct acpi_table_header)
+#define ACPI_TABLE_PFX_SIZE sizeof(uint16_t)  /* size of the extra prefix */
+
+static const char dfl_hdr[ACPI_TABLE_HDR_SIZE] =
+"\0\0"   /* fake _length (2) */
+"QEMU\0\0\0\0\1\0"   /* sig (4), len(4), revno (1), csum (1) */
+"QEMUQEQEMUQEMU\1\0\0\0" /* OEM id (6), table (8), revno (4) */
+"QEMU\1\0\0\0"   /* ASL compiler ID (4), version (4) */
+;
+
 char *acpi_tables;
 size_t acpi_tables_len;
 
@@ -40,163 +51,198 @@ static int acpi_checksum(const uint8_t *data, int len)
 {
 int sum, i;
 sum = 0;
-for(i = 0; i < len; i++)
+for (i = 0; i < len; i++) {
 sum += data[i];
+}
 return (-sum) & 0xff;
 }
 
+/* like strncpy() but zero-fills the tail of destination */
+static void strzcpy(char *dst, const char *src, size_t size)
+{
+size_t len = strlen(src);
+if (len >= size) {
+len = size;
+} else {
+  memset(dst + len, 0, size - len);
+}
+memcpy(dst, src, len);
+}
+
+/* XXX fixme: this function uses obsolete argument parsing interface */
 int acpi_table_add(const char *t)
 {
-static const char *dfl_id = "QEMUQEMU";
 char buf[1024], *p, *f;
-struct acpi_table_header acpi_hdr;
 unsigned long val;
-uint32_t length;
-struct acpi_table_header *acpi_hdr_p;
-size_t off;
+size_t len, start, allen;
+bool has_header;
+int changed;
+int r;
+struct acpi_table_header hdr;
+
+r = 0;
+r |= get_param_value(buf, sizeof(buf), "data", t) ? 1 : 0;
+r |= get_param_value(buf, sizeof(buf), "file", t) ? 2 : 0;
+switch (r) {
+case 0:
+buf[0] = '\0';
+/* fallthrough for default behavior */
+case 1:
+has_header = false;
+break;
+case 2:
+has_header = true;
+ 

Re: [Qemu-devel] [PATCH 1/6] VMDK: enable twoGbMaxExtentFlat

2011-07-28 Thread Fam Zheng
Sorry, I missed one line in this patch. Will fix it in V2.

On Wed, Jul 27, 2011 at 5:27 PM, Fam Zheng  wrote:
> Enable the createType 'twoGbMaxExtentFlat'. The supporting code is
> already in.
>
> Signed-off-by: Fam Zheng 
> ---
>  block/vmdk.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/block/vmdk.c b/block/vmdk.c
> index 37478d2..9e6c67a 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -551,7 +551,8 @@ static int vmdk_open_desc_file(BlockDriverState *bs, int 
> flags)
>     if (vmdk_parse_description(buf, "createType", ct, sizeof(ct))) {
>         return -EINVAL;
>     }
> -    if (strcmp(ct, "monolithicFlat")) {
> +    if (strcmp(ct, "monolithicFlat") &&
> +        strcmp(ct, "twoGbMaxExtentFlat")) {
>         fprintf(stderr,
>                 "VMDK: Not supported image type \"%s\""".\n", ct);
>         return -ENOTSUP;
>



-- 
Best regards!
Fam Zheng



[Qemu-devel] [Bug 595438] Re: KVM segmentation fault, using SCSI+writeback and linux 2.4 guest

2011-07-28 Thread Serge Hallyn
** Changed in: kvm
   Status: Confirmed => Invalid

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/595438

Title:
  KVM segmentation fault, using SCSI+writeback and linux 2.4 guest

Status in Kernel Virtual Machine:
  Invalid
Status in QEMU:
  Fix Released
Status in qemu-kvm:
  Fix Released
Status in “qemu-kvm” package in Ubuntu:
  Fix Released
Status in “qemu-kvm” source package in Lucid:
  Fix Released
Status in “qemu-kvm” package in Debian:
  Fix Released

Bug description:
  I Use Ubuntu 32 bit 10.04 with standard KVM.
  I have Intel E7600  @ 3.06GHz processor with VMX

  In this system I Run:
  LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin 
QEMU_AUDIO_DRV=none /usr/bin/kvm -M pc-0.12 -enable-kvm -m 256 -smp 1 -name 
spamsender -uuid b9cacd5e-08f7-41fd-78c8-89cec59af881 -chardev 
socket,id=monitor,path=/var/lib/libvirt/qemu/spamsender.monitor,server,nowait 
-monitor chardev:monitor -boot d -drive 
file=/mnt/megadiff/cdiso_400_130.iso,if=ide,media=cdrom,index=2 -drive 
file=/home/mmarkk/spamsender2.img,if=scsi,index=0,format=qcow2,cache=writeback 
-net nic,macaddr=00:00:00:00:00:00,vlan=0,name=nic.0 -net tap,vlan=0,name=tap.0 
-chardev pty,id=serial0 -serial chardev:serial0 -parallel none -usb -vnc 
127.0.0.1:0 -vga cirrus

  .iso image contain custom distro of 2.4-linux kernel based system.
  During install process (when .tar.gz actively unpacked), kvm dead with
  segmentation fault.

  And ONLY when I choose scsi virtual disk and writeback simultaneously.
  But, writeback+ide, writethrough+scsi works OK.

  I use qcow2. It seems, that qcow does not have such problems.

  Virtual machine get down at random time during file copy. It seems,
  when qcow2 file size need to be expanded.

  IMPACT: kvm used with scsi virtual disk and writeback dies with
  segfault.

  FIX: is the inclusion of a patch cherry-picked from upstream which dequeues
  requests before invoking callbacks.  It is at
  
http://bazaar.launchpad.net/~serge-hallyn/ubuntu/lucid/qemu-kvm/fix-scsi-writeback/revision/70

  TO REPRODUCE: See the command above.

  REGRESSION POTENTIAL: this is cherry-picked from upstream, and has been
  tested by the bug reporter with no ill effects.

To manage notifications about this bug go to:
https://bugs.launchpad.net/kvm/+bug/595438/+subscriptions



Re: [Qemu-devel] [PATCH] fix disabling interrupts in sun4u

2011-07-28 Thread tsnsaito
At Thu, 28 Jul 2011 16:44:23 +0200,
Artyom Tarasenko wrote:
> On Thu, Jul 28, 2011 at 3:40 PM,   wrote:
> > At Thu, 28 Jul 2011 14:50:57 +0200,
> > Artyom Tarasenko wrote:
> >> On Thu, Jul 28, 2011 at 2:03 PM,   wrote:
> >> > At Thu, 28 Jul 2011 13:51:08 +0200,
> >> > Artyom Tarasenko wrote:
> >> >> On Thu, Jul 28, 2011 at 12:31 PM,   wrote:
> >> Do you have objections to this patch in its current form?
> >
> > No, I don't have any objections.
> >
> >> > The interrupts from APB would be reported by cpu_set_irq(),
> >> > but cpu_set_irq() seems to generate interrupt_vector_n traps.
> >>
> >> For me it's not obvious. The interrupt vector not just one line, but
> >> the vector, which is written in the corresponding CPU register (also
> >> missing in the current qemu implementation). On the real hardware the
> >> vector is created by the IOMMU (PBM/APB/...). If qemu intends to
> >> support multiple chipsets, we should keep it the way it's done on the
> >> real hardware (for instance the interrupt vectors for on-board devices
> >> on Ultra-1 and E6500 are not the same).
> >
> > Sorry, I can't keep up with this vector thing...
> > Does the CPU receive hardware interrupts as interrupt_vector traps
> > (trap type=0x60) regardless of the kind of the interrupt controller,
> > doesn't it?
> 
> It does indeed, but it also stores the interrupt vector identifying
> the initiator device, in a CPU register readable with asi 0x7f .
> What would APB pass to the cpu_set_irq? I see the three following variants:
> 
> a) it passes the PCI interrupt id, which is translated to the
> interrupt vector in cpu_set_irq()
> b) it passes the vector. This implies that 2048 (0-0x7ff) CPU
> interrupts have to be allocated.
> c) hack combining a+b: allocate only the interrupts known to be used
> and translate an internal interrupt id to a vector.
> 
> The variant "a" is bad because it doesn't allow support for different
> chipsets. The variant "b" is bad because qemu has to allocate way too
> many interrupts. Only few of them will be used actually. The variant
> "c" is bad, well, because it's a hack.
> 
> That's why I suggest using another interface between APB and CPU.

Thanks, I understand the reason for introducing a diffent interface
for device interrupts.
It might be difficult to implement the interface as the set_irq callback.

> >> I'd suggest APB shall use some other interface for communicating
> >> interrupts to the CPU. Something like
> >> cpu_receive_ivec(interrupt_vector).
> >>
> >> >> >> Signed-off-by: Artyom Tarasenko 
> >> >> >> ---
> >> >> >>  hw/sun4u.c |    6 --
> >> >> >>  1 files changed, 4 insertions(+), 2 deletions(-)
> >> >> >>
> >> >> >> diff --git a/hw/sun4u.c b/hw/sun4u.c
> >> >> >> index d7dcaf0..7f95aeb 100644
> >> >> >> --- a/hw/sun4u.c
> >> >> >> +++ b/hw/sun4u.c
> >> >> >> @@ -255,7 +255,7 @@ void cpu_check_irqs(CPUState *env)
> >> >> >>          pil |= 1 << 14;
> >> >> >>      }
> >> >> >>
> >> >> >> -    if (!pil) {
> >> >> >> +    if (pil < (2 << env->psrpil)){
> >> >> >>          if (env->interrupt_request & CPU_INTERRUPT_HARD) {
> >> >> >>              CPUIRQ_DPRINTF("Reset CPU IRQ (current interrupt %x)\n",
> >> >> >>                             env->interrupt_index);
> >> >> >> @@ -287,10 +287,12 @@ void cpu_check_irqs(CPUState *env)
> >> >> >>                  break;
> >> >> >>              }
> >> >> >>          }
> >> >> >> -    } else {
> >> >> >> +    } else if (env->interrupt_request & CPU_INTERRUPT_HARD) {
> >> >> >>          CPUIRQ_DPRINTF("Interrupts disabled, pil=%08x pil_in=%08x 
> >> >> >> softint=%08x "
> >> >> >>                         "current interrupt %x\n",
> >> >> >>                         pil, env->pil_in, env->softint, 
> >> >> >> env->interrupt_index);
> >> >> >> +        env->interrupt_index = 0;
> >> >> >> +        cpu_reset_interrupt(env, CPU_INTERRUPT_HARD);
> >> >> >>      }
> >> >> >>  }
> >> >> >>
> >> >> >> --
> >> >> >> 1.7.3.4
> >> >> >
> >> >> >
> >> >> > 
> >> >> > Tsuneo Saito 
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Regards,
> >> >> Artyom Tarasenko
> >> >>
> >> >> solaris/sparc under qemu blog: http://tyom.blogspot.com/
> >> >
> >> > 
> >> > Tsuneo Saito 
> >> >
> >>
> >> --
> >> Regards,
> >> Artyom Tarasenko
> >>
> >> solaris/sparc under qemu blog: http://tyom.blogspot.com/
> >
> > 
> > Tsuneo Saito 
> >
> 
> 
> 
> -- 
> Regards,
> Artyom Tarasenko
> 
> solaris/sparc under qemu blog: http://tyom.blogspot.com/


Tsuneo Saito 



Re: [Qemu-devel] [PATCH v2] pci: Common overflow prevention

2011-07-28 Thread Isaku Yamahata
On Thu, Jul 28, 2011 at 11:40:21AM +0300, Michael S. Tsirkin wrote:
> I don't see a problem with this, but could you please clarify when does
> this happen? I think this is only possible for a pci device
> behind an express root. If so, this belongs in pcie_host.c
> 
> I'd also like this info to be recorded in the commit log.

>From 1dd598fd35d4e988dc51487829ed66208ca89021 Mon Sep 17 00:00:00 2001
Message-Id: 
<1dd598fd35d4e988dc51487829ed66208ca89021.1311901239.git.yamah...@valinux.co.jp>
From: Isaku Yamahata 
Date: Fri, 29 Jul 2011 09:52:45 +0900
Subject: [PATCH] pcie_host: limit check of pcie_mmcfg_data_write/read

This patch adds the check where the offset in the configuration space
is in its configuration size.

MMCFG area allows access of pcie configuration space to be in
offset [0, 4K).
At the same time, conventional pci devices whose configuration space size
is 256 bytes can be behind pcie-to-pci bridge.
The access whose offset is [256, 4K) should have no effect
on the conventional pci device
Add the limit check and ignore such accesses.

Signed-off-by: Isaku Yamahata 
---
 hw/pcie_host.c |   28 ++--
 1 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/hw/pcie_host.c b/hw/pcie_host.c
index f0b3d13..f9fea3d 100644
--- a/hw/pcie_host.c
+++ b/hw/pcie_host.c
@@ -56,23 +56,39 @@ static void pcie_mmcfg_data_write(PCIBus *s,
   uint32_t mmcfg_addr, uint32_t val, int len)
 {
 PCIDevice *pci_dev = pcie_dev_find_by_mmcfg_addr(s, mmcfg_addr);
+uint32_t addr;
+uint32_t limit;
 
 if (!pci_dev) {
 return;
 }
-pci_host_config_write_common(pci_dev, PCIE_MMCFG_CONFOFFSET(mmcfg_addr),
- pci_config_size(pci_dev), val, len);
+addr = PCIE_MMCFG_CONFOFFSET(mmcfg_addr);
+limit = pci_config_size(pci_dev);
+if (limit <= addr) {
+/* conventional pci device can be behind pcie-to-pci bridge.
+   256 <= addr < 4K has no effects. */
+return;
+}
+pci_host_config_write_common(pci_dev, addr, limit, val, len);
 }
 
-static uint32_t pcie_mmcfg_data_read(PCIBus *s, uint32_t addr, int len)
+static uint32_t pcie_mmcfg_data_read(PCIBus *s, uint32_t mmcfg_addr, int len)
 {
-PCIDevice *pci_dev = pcie_dev_find_by_mmcfg_addr(s, addr);
+PCIDevice *pci_dev = pcie_dev_find_by_mmcfg_addr(s, mmcfg_addr);
+uint32_t addr;
+uint32_t limit;
 
 if (!pci_dev) {
 return ~0x0;
 }
-return pci_host_config_read_common(pci_dev, PCIE_MMCFG_CONFOFFSET(addr),
-   pci_config_size(pci_dev), len);
+addr = PCIE_MMCFG_CONFOFFSET(mmcfg_addr);
+limit = pci_config_size(pci_dev);
+if (limit <= addr) {
+/* conventional pci device can be behind pcie-to-pci bridge.
+   256 <= addr < 4K has no effects. */
+return ~0x0;
+}
+return pci_host_config_read_common(pci_dev, addr, limit, len);
 }
 
 static void pcie_mmcfg_data_writeb(void *opaque,
-- 
1.7.1.1



-- 
yamahata



Re: [Qemu-devel] RFC: moving fsfreeze support from the userland guest agent to the guest kernel

2011-07-28 Thread Fernando Luis Vazquez Cao

Michael Roth さんは書きました:

On 07/28/2011 03:03 AM, Andrea Arcangeli wrote:
On Thu, Jul 28, 2011 at 11:53:50AM +0900, Fernando Luis Vázquez Cao 
wrote:

On Wed, 2011-07-27 at 17:24 +0200, Andrea Arcangeli wrote:

making
sure no lib is calling any I/O function to be able to defreeze the
filesystems later, making sure the oom killer or a wrong kill -9
$RANDOM isn't killing the agent by mistake while the I/O is blocked
and the copy is going.


Yes with the current API if the agent is killed while the filesystems
are frozen we are screwed.

I have just submitted patches that implement a new API that should make
the virtualization use case more reliable. Basically, I am adding a new
ioctl, FIGETFREEZEFD, which freezes the indicated filesystem and 
returns

a file descriptor; as long as that file descriptor is held open, the
filesystem remains open. If the freeze file descriptor is closed (be it
through a explicit call to close(2) or as part of process exit
housekeeping) the associated filesystem is automatically thawed.

- fsfreeze: add ioctl to create a fd for freeze control
http://marc.info/?l=linux-fsdevel&m=131175212512290&w=2
- fsfreeze: add freeze fd ioctls
http://marc.info/?l=linux-fsdevel&m=131175220612341&w=2


This is probably how the API should have been implemented originally
instead of FIFREEZE/FITHAW.

It looks a bit overkill though, I would think it'd be enough to have
the fsfreeze forced at FIGETFREEZEFD, and the only way to thaw by
closing the file without requiring any of the
FS_FREEZE_FD/FS_THAW_FD/FS_ISFROZEN_FD. But I guess you have use cases


One of the crappy things about the current implementation is the 
inability to determine whether or not a filesystem is frozen. At least 
in the context of guest agent at least, it'd be nice if 
guest-fsfreeze-status checked the actual system state rather than some 
internal state that may not necessarily reflect reality (if we freeze, 
and some other application thaws, we currently still report the state 
as frozen).


Also in the context of the guest agent, we are indeed screwed if the 
agent gets killed while in a frozen state, and remain screwed even if 
it's restarted since we have no way of determining whether or not 
we're in a frozen state and thus should disable logging operations.


That is precisely the reason I added the new API.

We could check status by looking for a failure from the freeze 
operation, but if you're just interested in getting the state, having 
to potentially induce a freeze just to get at the state is really 
heavy-handed.


So having an open operation that doesn't force a freeze/thaw/status 
operation serves some fairly common use cases I think. 


Yep. If you think there is something missing API wise let me know and I 
will implement it.


Thanks,
Fernando



Re: [Qemu-devel] [PATCH 1/2] pflash: Support read-only mode

2011-07-28 Thread Jordan Justen
On Thu, Jul 28, 2011 at 11:26, Jan Kiszka  wrote:
> On 2011-07-27 17:38, Jordan Justen wrote:
>> On Wed, Jul 27, 2011 at 02:30, Jan Kiszka  wrote:
>>> On 2011-07-25 23:34, Jordan Justen wrote:
 Read-only mode is indicated by bdrv_is_read_only

 When read-only mode is enabled, no changes will be made
 to the flash image in memory, and no bdrv_write calls will be
 made.

 Signed-off-by: Jordan Justen 
 Cc: Jan Kiszka 
 Cc: Aurelien Jarno 
 Cc: Anthony Liguori 
 ---
  blockdev.c        |    3 +-
  hw/pflash_cfi01.c |   36 ++-
  hw/pflash_cfi02.c |   82 
 
  3 files changed, 68 insertions(+), 53 deletions(-)

 diff --git a/blockdev.c b/blockdev.c
 index 0b8d3a4..566a502 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -521,7 +521,8 @@ DriveInfo *drive_init(QemuOpts *opts, int 
 default_to_scsi)
          /* CDROM is fine for any interface, don't check.  */
          ro = 1;
      } else if (ro == 1) {
 -        if (type != IF_SCSI && type != IF_VIRTIO && type != IF_FLOPPY && 
 type != IF_NONE) {
 +        if (type != IF_SCSI && type != IF_VIRTIO && type != IF_FLOPPY &&
 +            type != IF_NONE && type != IF_PFLASH) {
              error_report("readonly not supported by this bus type");
              goto err;
          }
 diff --git a/hw/pflash_cfi01.c b/hw/pflash_cfi01.c
 index 90fdc84..11ac490 100644
 --- a/hw/pflash_cfi01.c
 +++ b/hw/pflash_cfi01.c
 @@ -284,8 +284,10 @@ static void pflash_write(pflash_t *pfl, 
 target_phys_addr_t offset,
                      TARGET_FMT_plx "\n",
                      __func__, offset, pfl->sector_len);

 -            memset(p + offset, 0xff, pfl->sector_len);
 -            pflash_update(pfl, offset, pfl->sector_len);
 +            if (!pfl->ro) {
 +                memset(p + offset, 0xff, pfl->sector_len);
 +                pflash_update(pfl, offset, pfl->sector_len);
 +            }
              pfl->status |= 0x80; /* Ready! */
              break;
          case 0x50: /* Clear status bits */
 @@ -324,8 +326,10 @@ static void pflash_write(pflash_t *pfl, 
 target_phys_addr_t offset,
          case 0x10: /* Single Byte Program */
          case 0x40: /* Single Byte Program */
              DPRINTF("%s: Single Byte Program\n", __func__);
 -            pflash_data_write(pfl, offset, value, width, be);
 -            pflash_update(pfl, offset, width);
 +            if (!pfl->ro) {
 +                pflash_data_write(pfl, offset, value, width, be);
 +                pflash_update(pfl, offset, width);
 +            }
              pfl->status |= 0x80; /* Ready! */
              pfl->wcycle = 0;
          break;
 @@ -373,7 +377,9 @@ static void pflash_write(pflash_t *pfl, 
 target_phys_addr_t offset,
      case 2:
          switch (pfl->cmd) {
          case 0xe8: /* Block write */
 -            pflash_data_write(pfl, offset, value, width, be);
 +            if (!pfl->ro) {
 +                pflash_data_write(pfl, offset, value, width, be);
 +            }

              pfl->status |= 0x80;

 @@ -383,8 +389,10 @@ static void pflash_write(pflash_t *pfl, 
 target_phys_addr_t offset,

                  DPRINTF("%s: block write finished\n", __func__);
                  pfl->wcycle++;
 -                /* Flush the entire write buffer onto backing storage.  */
 -                pflash_update(pfl, offset & mask, pfl->writeblock_size);
 +                if (!pfl->ro) {
 +                    /* Flush the entire write buffer onto backing 
 storage.  */
 +                    pflash_update(pfl, offset & mask, 
 pfl->writeblock_size);
 +                }
              }

              pfl->counter--;
 @@ -621,13 +629,13 @@ pflash_t *pflash_cfi01_register(target_phys_addr_t 
 base, ram_addr_t off,
              return NULL;
          }
      }
 -#if 0 /* XXX: there should be a bit to set up read-only,
 -       *      the same way the hardware does (with WP pin).
 -       */
 -    pfl->ro = 1;
 -#else
 -    pfl->ro = 0;
 -#endif
 +
 +    if (pfl->bs) {
 +        pfl->ro = bdrv_is_read_only(pfl->bs);
 +    } else {
 +        pfl->ro = 0;
 +    }
 +
      pfl->timer = qemu_new_timer_ns(vm_clock, pflash_timer, pfl);
      pfl->base = base;
      pfl->sector_len = sector_len;
 diff --git a/hw/pflash_cfi02.c b/hw/pflash_cfi02.c
 index 725cd1e..920209d 100644
 --- a/hw/pflash_cfi02.c
 +++ b/hw/pflash_cfi02.c
 @@ -315,35 +315,37 @@ static void pflash_write (pflash_t *pfl, 
 target_phys_addr_t offset,
              DPRINTF("%s: write data offset 

Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Alon Levy
On Thu, Jul 28, 2011 at 03:22:52PM -0300, Luiz Capitulino wrote:
> On Thu, 28 Jul 2011 20:04:58 +0200
> Jan Kiszka  wrote:
> 
> > On 2011-07-28 20:00, Luiz Capitulino wrote:
> > > On Thu, 28 Jul 2011 19:52:31 +0200
> > > Jan Kiszka  wrote:
> > > 
> > >> On 2011-07-28 19:48, Luiz Capitulino wrote:
> > >>> On Thu, 28 Jul 2011 14:39:23 -0300
> > >>> Luiz Capitulino  wrote:
> > >>>
> >  On Thu, 28 Jul 2011 17:20:41 +0200
> >  Jan Kiszka  wrote:
> > 
> > > On 2011-07-28 17:18, Luiz Capitulino wrote:
> > >> On Thu, 28 Jul 2011 16:19:19 +0200
> > >> Jan Kiszka  wrote:
> > >>
> > >>> On 2011-07-28 15:37, Avi Kivity wrote:
> >  On 07/28/2011 04:31 PM, Luiz Capitulino wrote:
> > > On Thu, 28 Jul 2011 10:23:22 +0300
> > > Avi Kivity  wrote:
> > >
> > >>  On 07/28/2011 12:44 AM, Blue Swirl wrote:
> > >>  >  On Wed, Jul 27, 2011 at 9:42 PM, Luiz
> > > Capitulino   wrote:
> > >>  >  >   This function should be used when the VM is not supposed 
> > >> to
> > > resume
> > >>  >  >   execution (eg. by issuing 'cont' monitor command).
> > >>  >  >
> > >>  >  >   Today, we allow the user to resume execution even when:
> > >>  >  >
> > >>  >  > o the guest shuts down and -no-shutdown is used
> > >>  >  > o there's a kvm internal error
> > >>  >  > o loading the VM state with -loadvm or "loadvm" in the
> > > monitor fails
> > >>  >  >
> > >>  >  >   I think only badness can happen from the cases above.
> > >>  >
> > >>  >  I'd suppose a system_reset should bring the system back to
> > > sanity and
> > >>  >  then clear vm_permanent_stopped (where's -ly?)
> > >
> > > What's -ly?
> > >
> > 
> >  permanent-ly.
> > 
> > >>  >  except maybe for KVM
> > >>  >  internal error if that can't be recovered. Then it would not 
> > >> very
> > >>  >  permanent anymore, so the name would need adjusting.
> > >>
> > >>  Currently, all kvm internal errors are recoverable by reset (and
> > >>  possibly by fiddling with memory/registers).
> > >
> > > Ok, but a poweroff in the guest isn't recoverable with 
> > > system_reset
> > > right? Or does it depend on the guest?
> > 
> >  Right, it's not recoverable if you shut the power down where the 
> >  tractor
> >  beam is coupled to the main reactor.
> > >>>
> > >>> system_reset will bring all emulated devices back into their 
> > >>> power-on
> > >>> state - unless we have remaining bugs to fix. Actually, one may 
> > >>> consider
> > >>> issuing this reset automatically on vm_start after "permant" 
> > >>> vm_stop.
> > 
> >  The only permanent vm_stop we'd have is poweroff when -no-shutdown is 
> >  used.
> > 
> >  Are you saying that system_reset should be able to recover from that 
> >  too?
> > >>>
> > >>> It already does, so we don't have permanent stops.
> > >>
> > >> Exactly. We just have stops over inconsistent states that require a
> > >> reset to continue with anything useful.
> > > 
> > > Yes. If I got you right, you suggest that we do the reset automatically.
> > > 
> > > I think it's better to let the user do it, because s/he might want to
> > > do something else before resetting. For example, for the kvm error the
> > > user might want to save the vm state.
> > 
> > Associating the reset with a cont means requesting an explicit action
> > from the user. I'm not suggesting to do the reset when the stop state is
> > entered.
> 
> I see. But automatically resetting on cont might be unexpected to the
> user, even on a bad state.

I use this feature a lot in testing, i.e. that there is no automatic reset
when I run a vm with -no-shutdown and I do a guest initiated shutdown.
Very convenient when running with snapshot - great time to do commit ide0.
I have a patch I send a month or so ago to make sure you can do this endlessly -
currently after the first shutdown the -no-shutdown set flag is reset.

> 
> Another option would be to add a force option to cont, where the reset is
> done when the state is invalid (otherwise cont will return an error).
> 
> I still prefer to let the user do it manually though.
> 
> > > For the poweroff case with -no-shutdown it's probably fine, but I don't
> > > want to hard code special cases. It's better and easier to treat them all
> > > as "require system_reset to recover".
> > 
> > In any case, we need to tag the current state as stopped-and-invalid or
> > so vs. a normal stop. That remains a valuable first step. How to deal
> > with that information is the second one.
> 



Re: [Qemu-devel] [PATCH 1/2] pflash: Support read-only mode

2011-07-28 Thread Jan Kiszka
On 2011-07-27 17:38, Jordan Justen wrote:
> On Wed, Jul 27, 2011 at 02:30, Jan Kiszka  wrote:
>> On 2011-07-25 23:34, Jordan Justen wrote:
>>> Read-only mode is indicated by bdrv_is_read_only
>>>
>>> When read-only mode is enabled, no changes will be made
>>> to the flash image in memory, and no bdrv_write calls will be
>>> made.
>>>
>>> Signed-off-by: Jordan Justen 
>>> Cc: Jan Kiszka 
>>> Cc: Aurelien Jarno 
>>> Cc: Anthony Liguori 
>>> ---
>>>  blockdev.c|3 +-
>>>  hw/pflash_cfi01.c |   36 ++-
>>>  hw/pflash_cfi02.c |   82 
>>> 
>>>  3 files changed, 68 insertions(+), 53 deletions(-)
>>>
>>> diff --git a/blockdev.c b/blockdev.c
>>> index 0b8d3a4..566a502 100644
>>> --- a/blockdev.c
>>> +++ b/blockdev.c
>>> @@ -521,7 +521,8 @@ DriveInfo *drive_init(QemuOpts *opts, int 
>>> default_to_scsi)
>>>  /* CDROM is fine for any interface, don't check.  */
>>>  ro = 1;
>>>  } else if (ro == 1) {
>>> -if (type != IF_SCSI && type != IF_VIRTIO && type != IF_FLOPPY && 
>>> type != IF_NONE) {
>>> +if (type != IF_SCSI && type != IF_VIRTIO && type != IF_FLOPPY &&
>>> +type != IF_NONE && type != IF_PFLASH) {
>>>  error_report("readonly not supported by this bus type");
>>>  goto err;
>>>  }
>>> diff --git a/hw/pflash_cfi01.c b/hw/pflash_cfi01.c
>>> index 90fdc84..11ac490 100644
>>> --- a/hw/pflash_cfi01.c
>>> +++ b/hw/pflash_cfi01.c
>>> @@ -284,8 +284,10 @@ static void pflash_write(pflash_t *pfl, 
>>> target_phys_addr_t offset,
>>>  TARGET_FMT_plx "\n",
>>>  __func__, offset, pfl->sector_len);
>>>
>>> -memset(p + offset, 0xff, pfl->sector_len);
>>> -pflash_update(pfl, offset, pfl->sector_len);
>>> +if (!pfl->ro) {
>>> +memset(p + offset, 0xff, pfl->sector_len);
>>> +pflash_update(pfl, offset, pfl->sector_len);
>>> +}
>>>  pfl->status |= 0x80; /* Ready! */
>>>  break;
>>>  case 0x50: /* Clear status bits */
>>> @@ -324,8 +326,10 @@ static void pflash_write(pflash_t *pfl, 
>>> target_phys_addr_t offset,
>>>  case 0x10: /* Single Byte Program */
>>>  case 0x40: /* Single Byte Program */
>>>  DPRINTF("%s: Single Byte Program\n", __func__);
>>> -pflash_data_write(pfl, offset, value, width, be);
>>> -pflash_update(pfl, offset, width);
>>> +if (!pfl->ro) {
>>> +pflash_data_write(pfl, offset, value, width, be);
>>> +pflash_update(pfl, offset, width);
>>> +}
>>>  pfl->status |= 0x80; /* Ready! */
>>>  pfl->wcycle = 0;
>>>  break;
>>> @@ -373,7 +377,9 @@ static void pflash_write(pflash_t *pfl, 
>>> target_phys_addr_t offset,
>>>  case 2:
>>>  switch (pfl->cmd) {
>>>  case 0xe8: /* Block write */
>>> -pflash_data_write(pfl, offset, value, width, be);
>>> +if (!pfl->ro) {
>>> +pflash_data_write(pfl, offset, value, width, be);
>>> +}
>>>
>>>  pfl->status |= 0x80;
>>>
>>> @@ -383,8 +389,10 @@ static void pflash_write(pflash_t *pfl, 
>>> target_phys_addr_t offset,
>>>
>>>  DPRINTF("%s: block write finished\n", __func__);
>>>  pfl->wcycle++;
>>> -/* Flush the entire write buffer onto backing storage.  */
>>> -pflash_update(pfl, offset & mask, pfl->writeblock_size);
>>> +if (!pfl->ro) {
>>> +/* Flush the entire write buffer onto backing storage. 
>>>  */
>>> +pflash_update(pfl, offset & mask, 
>>> pfl->writeblock_size);
>>> +}
>>>  }
>>>
>>>  pfl->counter--;
>>> @@ -621,13 +629,13 @@ pflash_t *pflash_cfi01_register(target_phys_addr_t 
>>> base, ram_addr_t off,
>>>  return NULL;
>>>  }
>>>  }
>>> -#if 0 /* XXX: there should be a bit to set up read-only,
>>> -   *  the same way the hardware does (with WP pin).
>>> -   */
>>> -pfl->ro = 1;
>>> -#else
>>> -pfl->ro = 0;
>>> -#endif
>>> +
>>> +if (pfl->bs) {
>>> +pfl->ro = bdrv_is_read_only(pfl->bs);
>>> +} else {
>>> +pfl->ro = 0;
>>> +}
>>> +
>>>  pfl->timer = qemu_new_timer_ns(vm_clock, pflash_timer, pfl);
>>>  pfl->base = base;
>>>  pfl->sector_len = sector_len;
>>> diff --git a/hw/pflash_cfi02.c b/hw/pflash_cfi02.c
>>> index 725cd1e..920209d 100644
>>> --- a/hw/pflash_cfi02.c
>>> +++ b/hw/pflash_cfi02.c
>>> @@ -315,35 +315,37 @@ static void pflash_write (pflash_t *pfl, 
>>> target_phys_addr_t offset,
>>>  DPRINTF("%s: write data offset " TARGET_FMT_plx " %08x %d\n",
>>>  __func__, offset, value, width);
>>>  p = pfl->storage;
>>> -switch (width) {
>>> -c

Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Luiz Capitulino
On Thu, 28 Jul 2011 20:04:58 +0200
Jan Kiszka  wrote:

> On 2011-07-28 20:00, Luiz Capitulino wrote:
> > On Thu, 28 Jul 2011 19:52:31 +0200
> > Jan Kiszka  wrote:
> > 
> >> On 2011-07-28 19:48, Luiz Capitulino wrote:
> >>> On Thu, 28 Jul 2011 14:39:23 -0300
> >>> Luiz Capitulino  wrote:
> >>>
>  On Thu, 28 Jul 2011 17:20:41 +0200
>  Jan Kiszka  wrote:
> 
> > On 2011-07-28 17:18, Luiz Capitulino wrote:
> >> On Thu, 28 Jul 2011 16:19:19 +0200
> >> Jan Kiszka  wrote:
> >>
> >>> On 2011-07-28 15:37, Avi Kivity wrote:
>  On 07/28/2011 04:31 PM, Luiz Capitulino wrote:
> > On Thu, 28 Jul 2011 10:23:22 +0300
> > Avi Kivity  wrote:
> >
> >>  On 07/28/2011 12:44 AM, Blue Swirl wrote:
> >>  >  On Wed, Jul 27, 2011 at 9:42 PM, Luiz
> > Capitulino   wrote:
> >>  >  >   This function should be used when the VM is not supposed to
> > resume
> >>  >  >   execution (eg. by issuing 'cont' monitor command).
> >>  >  >
> >>  >  >   Today, we allow the user to resume execution even when:
> >>  >  >
> >>  >  > o the guest shuts down and -no-shutdown is used
> >>  >  > o there's a kvm internal error
> >>  >  > o loading the VM state with -loadvm or "loadvm" in the
> > monitor fails
> >>  >  >
> >>  >  >   I think only badness can happen from the cases above.
> >>  >
> >>  >  I'd suppose a system_reset should bring the system back to
> > sanity and
> >>  >  then clear vm_permanent_stopped (where's -ly?)
> >
> > What's -ly?
> >
> 
>  permanent-ly.
> 
> >>  >  except maybe for KVM
> >>  >  internal error if that can't be recovered. Then it would not 
> >> very
> >>  >  permanent anymore, so the name would need adjusting.
> >>
> >>  Currently, all kvm internal errors are recoverable by reset (and
> >>  possibly by fiddling with memory/registers).
> >
> > Ok, but a poweroff in the guest isn't recoverable with system_reset
> > right? Or does it depend on the guest?
> 
>  Right, it's not recoverable if you shut the power down where the 
>  tractor
>  beam is coupled to the main reactor.
> >>>
> >>> system_reset will bring all emulated devices back into their power-on
> >>> state - unless we have remaining bugs to fix. Actually, one may 
> >>> consider
> >>> issuing this reset automatically on vm_start after "permant" vm_stop.
> 
>  The only permanent vm_stop we'd have is poweroff when -no-shutdown is 
>  used.
> 
>  Are you saying that system_reset should be able to recover from that too?
> >>>
> >>> It already does, so we don't have permanent stops.
> >>
> >> Exactly. We just have stops over inconsistent states that require a
> >> reset to continue with anything useful.
> > 
> > Yes. If I got you right, you suggest that we do the reset automatically.
> > 
> > I think it's better to let the user do it, because s/he might want to
> > do something else before resetting. For example, for the kvm error the
> > user might want to save the vm state.
> 
> Associating the reset with a cont means requesting an explicit action
> from the user. I'm not suggesting to do the reset when the stop state is
> entered.

I see. But automatically resetting on cont might be unexpected to the
user, even on a bad state.

Another option would be to add a force option to cont, where the reset is
done when the state is invalid (otherwise cont will return an error).

I still prefer to let the user do it manually though.

> > For the poweroff case with -no-shutdown it's probably fine, but I don't
> > want to hard code special cases. It's better and easier to treat them all
> > as "require system_reset to recover".
> 
> In any case, we need to tag the current state as stopped-and-invalid or
> so vs. a normal stop. That remains a valuable first step. How to deal
> with that information is the second one.



Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Jan Kiszka
On 2011-07-28 20:00, Luiz Capitulino wrote:
> On Thu, 28 Jul 2011 19:52:31 +0200
> Jan Kiszka  wrote:
> 
>> On 2011-07-28 19:48, Luiz Capitulino wrote:
>>> On Thu, 28 Jul 2011 14:39:23 -0300
>>> Luiz Capitulino  wrote:
>>>
 On Thu, 28 Jul 2011 17:20:41 +0200
 Jan Kiszka  wrote:

> On 2011-07-28 17:18, Luiz Capitulino wrote:
>> On Thu, 28 Jul 2011 16:19:19 +0200
>> Jan Kiszka  wrote:
>>
>>> On 2011-07-28 15:37, Avi Kivity wrote:
 On 07/28/2011 04:31 PM, Luiz Capitulino wrote:
> On Thu, 28 Jul 2011 10:23:22 +0300
> Avi Kivity  wrote:
>
>>  On 07/28/2011 12:44 AM, Blue Swirl wrote:
>>  >  On Wed, Jul 27, 2011 at 9:42 PM, Luiz
> Capitulino   wrote:
>>  >  >   This function should be used when the VM is not supposed to
> resume
>>  >  >   execution (eg. by issuing 'cont' monitor command).
>>  >  >
>>  >  >   Today, we allow the user to resume execution even when:
>>  >  >
>>  >  > o the guest shuts down and -no-shutdown is used
>>  >  > o there's a kvm internal error
>>  >  > o loading the VM state with -loadvm or "loadvm" in the
> monitor fails
>>  >  >
>>  >  >   I think only badness can happen from the cases above.
>>  >
>>  >  I'd suppose a system_reset should bring the system back to
> sanity and
>>  >  then clear vm_permanent_stopped (where's -ly?)
>
> What's -ly?
>

 permanent-ly.

>>  >  except maybe for KVM
>>  >  internal error if that can't be recovered. Then it would not very
>>  >  permanent anymore, so the name would need adjusting.
>>
>>  Currently, all kvm internal errors are recoverable by reset (and
>>  possibly by fiddling with memory/registers).
>
> Ok, but a poweroff in the guest isn't recoverable with system_reset
> right? Or does it depend on the guest?

 Right, it's not recoverable if you shut the power down where the 
 tractor
 beam is coupled to the main reactor.
>>>
>>> system_reset will bring all emulated devices back into their power-on
>>> state - unless we have remaining bugs to fix. Actually, one may consider
>>> issuing this reset automatically on vm_start after "permant" vm_stop.

 The only permanent vm_stop we'd have is poweroff when -no-shutdown is used.

 Are you saying that system_reset should be able to recover from that too?
>>>
>>> It already does, so we don't have permanent stops.
>>
>> Exactly. We just have stops over inconsistent states that require a
>> reset to continue with anything useful.
> 
> Yes. If I got you right, you suggest that we do the reset automatically.
> 
> I think it's better to let the user do it, because s/he might want to
> do something else before resetting. For example, for the kvm error the
> user might want to save the vm state.

Associating the reset with a cont means requesting an explicit action
from the user. I'm not suggesting to do the reset when the stop state is
entered.

> 
> For the poweroff case with -no-shutdown it's probably fine, but I don't
> want to hard code special cases. It's better and easier to treat them all
> as "require system_reset to recover".

In any case, we need to tag the current state as stopped-and-invalid or
so vs. a normal stop. That remains a valuable first step. How to deal
with that information is the second one.

Jan



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Luiz Capitulino
On Thu, 28 Jul 2011 19:52:31 +0200
Jan Kiszka  wrote:

> On 2011-07-28 19:48, Luiz Capitulino wrote:
> > On Thu, 28 Jul 2011 14:39:23 -0300
> > Luiz Capitulino  wrote:
> > 
> >> On Thu, 28 Jul 2011 17:20:41 +0200
> >> Jan Kiszka  wrote:
> >>
> >>> On 2011-07-28 17:18, Luiz Capitulino wrote:
>  On Thu, 28 Jul 2011 16:19:19 +0200
>  Jan Kiszka  wrote:
> 
> > On 2011-07-28 15:37, Avi Kivity wrote:
> >> On 07/28/2011 04:31 PM, Luiz Capitulino wrote:
> >>> On Thu, 28 Jul 2011 10:23:22 +0300
> >>> Avi Kivity  wrote:
> >>>
>   On 07/28/2011 12:44 AM, Blue Swirl wrote:
>   >  On Wed, Jul 27, 2011 at 9:42 PM, Luiz
> >>> Capitulino   wrote:
>   >  >   This function should be used when the VM is not supposed to
> >>> resume
>   >  >   execution (eg. by issuing 'cont' monitor command).
>   >  >
>   >  >   Today, we allow the user to resume execution even when:
>   >  >
>   >  > o the guest shuts down and -no-shutdown is used
>   >  > o there's a kvm internal error
>   >  > o loading the VM state with -loadvm or "loadvm" in the
> >>> monitor fails
>   >  >
>   >  >   I think only badness can happen from the cases above.
>   >
>   >  I'd suppose a system_reset should bring the system back to
> >>> sanity and
>   >  then clear vm_permanent_stopped (where's -ly?)
> >>>
> >>> What's -ly?
> >>>
> >>
> >> permanent-ly.
> >>
>   >  except maybe for KVM
>   >  internal error if that can't be recovered. Then it would not very
>   >  permanent anymore, so the name would need adjusting.
> 
>   Currently, all kvm internal errors are recoverable by reset (and
>   possibly by fiddling with memory/registers).
> >>>
> >>> Ok, but a poweroff in the guest isn't recoverable with system_reset
> >>> right? Or does it depend on the guest?
> >>
> >> Right, it's not recoverable if you shut the power down where the 
> >> tractor
> >> beam is coupled to the main reactor.
> >
> > system_reset will bring all emulated devices back into their power-on
> > state - unless we have remaining bugs to fix. Actually, one may consider
> > issuing this reset automatically on vm_start after "permant" vm_stop.
> >>
> >> The only permanent vm_stop we'd have is poweroff when -no-shutdown is used.
> >>
> >> Are you saying that system_reset should be able to recover from that too?
> > 
> > It already does, so we don't have permanent stops.
> 
> Exactly. We just have stops over inconsistent states that require a
> reset to continue with anything useful.

Yes. If I got you right, you suggest that we do the reset automatically.

I think it's better to let the user do it, because s/he might want to
do something else before resetting. For example, for the kvm error the
user might want to save the vm state.

For the poweroff case with -no-shutdown it's probably fine, but I don't
want to hard code special cases. It's better and easier to treat them all
as "require system_reset to recover".



Re: [Qemu-devel] [RFC][PATCH 0/21] QEMU Object Model

2011-07-28 Thread Anthony Liguori

On 07/28/2011 10:47 AM, Paolo Bonzini wrote:

On 07/28/2011 05:04 PM, Anthony Liguori wrote:

The only way I can see is to teach each device about this interface and
then have a common bus. That implies that you have:

class GoldfishEnumerator : public Device {
GoldfishDevice *slots[N];


FWIW, there's no hardcoded limit in the interface, and the list of
devices is unordered. But that only means you should attach it with

plug-set goldfish_tty::enumerator goldfish_enum

rather than

plug-set goldfish_enum::slots[12] goldfish_tty

If you can confirm that, that's fine.


Yes, you can certainly have an enumerator socket that when set, 
automatically connects the appropriate properties in the enumerator.


That way you don't have to connect things by hand.


};

interface GoldfishDevice {
const char *get_name();
uint64_t get_mmio_base();
...
};

class GoldfishNic : public Device, implements GoldfishDevice
{
const char *get_name(void) {
return "nic";
}


uint64_t mmio_base;
uint64_t get_mmio_base() { return mmio_base; }
uint64_t set_mmio_base(uint64_t addr) { mmio_base = addr; }


};


And that's exactly my point. It's a "stupid" interface full of
getters/setters, which is what you get if you use only interface
inheritance instead of, where appropriate, data containment.



You can certainly do:

struct GoldfishEnumInfo
{
const char *name;
uint64_t mmio_base;
};

interface GoldfishDevice {
GoldfishEnumInfo *get_info();
}

And then:

GoldfishEnumInfo *goldfish_nic_get_info(GoldFishNic *nic)
{
return nic->enuminfo;
}


Interfaces should be reserved for what really depends on the
_implementation_ of the GoldfishNic, not for accessing a bunch of
numbers.


Just define an interface that returns a struct then.  It's no more 
complicated than that.


What I struggle with is whether you're suggesting that the info isn't 
part of the device or whether it is part of the device and you just 
think that we shouldn't need 10 different accessors.



There is no implementation-dependent detail of that kind in the
GoldfishDevice (unlike other buses, even simple ones like I2C).


The PIC's view is more complicated than a Pin, and more similar to ISA.


ISA is just a pin. The ISA bus extender literally has five pins
corresponding to the ISA IRQs 7, 6, 5, 4, 3.


ISA is many pins. :) Goldfish looks similar (32 pins).


Sorry, I meant to say ISA IRQs are just exposed as pins.

My real point is, doing:

class IsaDevice : public Device {
   Pin irq[16];
};

class MyIsaDevice : public IsaDevice {
   int irq_index;
};

And then doing:

my_isa_device->irq_index = 5;

Is a very good way to model ISA IRQ selection.

You can simplify it by having a single IRQ output for each ISA device 
instead of any possible IRQ and then route the IRQ as part of the bus 
connection.


Regards,

Anthony Liguori


Paolo






Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Jan Kiszka
On 2011-07-28 19:48, Luiz Capitulino wrote:
> On Thu, 28 Jul 2011 14:39:23 -0300
> Luiz Capitulino  wrote:
> 
>> On Thu, 28 Jul 2011 17:20:41 +0200
>> Jan Kiszka  wrote:
>>
>>> On 2011-07-28 17:18, Luiz Capitulino wrote:
 On Thu, 28 Jul 2011 16:19:19 +0200
 Jan Kiszka  wrote:

> On 2011-07-28 15:37, Avi Kivity wrote:
>> On 07/28/2011 04:31 PM, Luiz Capitulino wrote:
>>> On Thu, 28 Jul 2011 10:23:22 +0300
>>> Avi Kivity  wrote:
>>>
  On 07/28/2011 12:44 AM, Blue Swirl wrote:
  >  On Wed, Jul 27, 2011 at 9:42 PM, Luiz
>>> Capitulino   wrote:
  >  >   This function should be used when the VM is not supposed to
>>> resume
  >  >   execution (eg. by issuing 'cont' monitor command).
  >  >
  >  >   Today, we allow the user to resume execution even when:
  >  >
  >  > o the guest shuts down and -no-shutdown is used
  >  > o there's a kvm internal error
  >  > o loading the VM state with -loadvm or "loadvm" in the
>>> monitor fails
  >  >
  >  >   I think only badness can happen from the cases above.
  >
  >  I'd suppose a system_reset should bring the system back to
>>> sanity and
  >  then clear vm_permanent_stopped (where's -ly?)
>>>
>>> What's -ly?
>>>
>>
>> permanent-ly.
>>
  >  except maybe for KVM
  >  internal error if that can't be recovered. Then it would not very
  >  permanent anymore, so the name would need adjusting.

  Currently, all kvm internal errors are recoverable by reset (and
  possibly by fiddling with memory/registers).
>>>
>>> Ok, but a poweroff in the guest isn't recoverable with system_reset
>>> right? Or does it depend on the guest?
>>
>> Right, it's not recoverable if you shut the power down where the tractor
>> beam is coupled to the main reactor.
>
> system_reset will bring all emulated devices back into their power-on
> state - unless we have remaining bugs to fix. Actually, one may consider
> issuing this reset automatically on vm_start after "permant" vm_stop.
>>
>> The only permanent vm_stop we'd have is poweroff when -no-shutdown is used.
>>
>> Are you saying that system_reset should be able to recover from that too?
> 
> It already does, so we don't have permanent stops.

Exactly. We just have stops over inconsistent states that require a
reset to continue with anything useful.

Jan



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Luiz Capitulino
On Thu, 28 Jul 2011 14:39:23 -0300
Luiz Capitulino  wrote:

> On Thu, 28 Jul 2011 17:20:41 +0200
> Jan Kiszka  wrote:
> 
> > On 2011-07-28 17:18, Luiz Capitulino wrote:
> > > On Thu, 28 Jul 2011 16:19:19 +0200
> > > Jan Kiszka  wrote:
> > > 
> > >> On 2011-07-28 15:37, Avi Kivity wrote:
> > >>> On 07/28/2011 04:31 PM, Luiz Capitulino wrote:
> >  On Thu, 28 Jul 2011 10:23:22 +0300
> >  Avi Kivity  wrote:
> > 
> > >  On 07/28/2011 12:44 AM, Blue Swirl wrote:
> > >  >  On Wed, Jul 27, 2011 at 9:42 PM, Luiz
> >  Capitulino   wrote:
> > >  >  >   This function should be used when the VM is not supposed to
> >  resume
> > >  >  >   execution (eg. by issuing 'cont' monitor command).
> > >  >  >
> > >  >  >   Today, we allow the user to resume execution even when:
> > >  >  >
> > >  >  > o the guest shuts down and -no-shutdown is used
> > >  >  > o there's a kvm internal error
> > >  >  > o loading the VM state with -loadvm or "loadvm" in the
> >  monitor fails
> > >  >  >
> > >  >  >   I think only badness can happen from the cases above.
> > >  >
> > >  >  I'd suppose a system_reset should bring the system back to
> >  sanity and
> > >  >  then clear vm_permanent_stopped (where's -ly?)
> > 
> >  What's -ly?
> > 
> > >>>
> > >>> permanent-ly.
> > >>>
> > >  >  except maybe for KVM
> > >  >  internal error if that can't be recovered. Then it would not very
> > >  >  permanent anymore, so the name would need adjusting.
> > >
> > >  Currently, all kvm internal errors are recoverable by reset (and
> > >  possibly by fiddling with memory/registers).
> > 
> >  Ok, but a poweroff in the guest isn't recoverable with system_reset
> >  right? Or does it depend on the guest?
> > >>>
> > >>> Right, it's not recoverable if you shut the power down where the tractor
> > >>> beam is coupled to the main reactor.
> > >>
> > >> system_reset will bring all emulated devices back into their power-on
> > >> state - unless we have remaining bugs to fix. Actually, one may consider
> > >> issuing this reset automatically on vm_start after "permant" vm_stop.
> 
> The only permanent vm_stop we'd have is poweroff when -no-shutdown is used.
> 
> Are you saying that system_reset should be able to recover from that too?

It already does, so we don't have permanent stops.

> 
> If yes, then I guess the right place is do_cont(). But I'm not sure we
> should do that (if -no-shutdown is not used, we'll exit() anyway).
> 
> > > 
> > > I'm going to abandon the vm_stop_permanent() idea.
> > > 
> > > I'm thinking something towards having a stop_kind argument to vm_stop(),
> > > which could be temporary (which is what we have today), only with
> > > system_reset or permanent.
> > > 
> > > Or maybe vm_stop_opt(), which allows for fine grained control of stop
> > > (vm_stop() would default to today's behavior)...
> > 
> > But the right point for reset is start, not stop. Otherwise we loose the
> > state before the stop.
> 




Re: [Qemu-devel] Log in to the OS through qemu

2011-07-28 Thread Mulyadi Santosa
On Thu, Jul 28, 2011 at 23:19, bala suru  wrote:
> Hi,
> I have installed OS on VMs (KVM-Qemu) , one version is ttylinux 9.0 and
> other is ttylinux 12 .
>
> I have installed these two with opennebula tool kit .  generally I try  to
> login to the OS using SSH , since ttylinux 12 needs to enable the ethernet
> after logged in  to the OS, how to login to the OS through qemu ... i know
> the username and password for that ..

could you make the SDL window come up? that's the easiest way
AFAIK...it represents your VM monitor

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com



Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Luiz Capitulino
On Thu, 28 Jul 2011 17:20:41 +0200
Jan Kiszka  wrote:

> On 2011-07-28 17:18, Luiz Capitulino wrote:
> > On Thu, 28 Jul 2011 16:19:19 +0200
> > Jan Kiszka  wrote:
> > 
> >> On 2011-07-28 15:37, Avi Kivity wrote:
> >>> On 07/28/2011 04:31 PM, Luiz Capitulino wrote:
>  On Thu, 28 Jul 2011 10:23:22 +0300
>  Avi Kivity  wrote:
> 
> >  On 07/28/2011 12:44 AM, Blue Swirl wrote:
> >  >  On Wed, Jul 27, 2011 at 9:42 PM, Luiz
>  Capitulino   wrote:
> >  >  >   This function should be used when the VM is not supposed to
>  resume
> >  >  >   execution (eg. by issuing 'cont' monitor command).
> >  >  >
> >  >  >   Today, we allow the user to resume execution even when:
> >  >  >
> >  >  > o the guest shuts down and -no-shutdown is used
> >  >  > o there's a kvm internal error
> >  >  > o loading the VM state with -loadvm or "loadvm" in the
>  monitor fails
> >  >  >
> >  >  >   I think only badness can happen from the cases above.
> >  >
> >  >  I'd suppose a system_reset should bring the system back to
>  sanity and
> >  >  then clear vm_permanent_stopped (where's -ly?)
> 
>  What's -ly?
> 
> >>>
> >>> permanent-ly.
> >>>
> >  >  except maybe for KVM
> >  >  internal error if that can't be recovered. Then it would not very
> >  >  permanent anymore, so the name would need adjusting.
> >
> >  Currently, all kvm internal errors are recoverable by reset (and
> >  possibly by fiddling with memory/registers).
> 
>  Ok, but a poweroff in the guest isn't recoverable with system_reset
>  right? Or does it depend on the guest?
> >>>
> >>> Right, it's not recoverable if you shut the power down where the tractor
> >>> beam is coupled to the main reactor.
> >>
> >> system_reset will bring all emulated devices back into their power-on
> >> state - unless we have remaining bugs to fix. Actually, one may consider
> >> issuing this reset automatically on vm_start after "permant" vm_stop.

The only permanent vm_stop we'd have is poweroff when -no-shutdown is used.

Are you saying that system_reset should be able to recover from that too?

If yes, then I guess the right place is do_cont(). But I'm not sure we
should do that (if -no-shutdown is not used, we'll exit() anyway).

> > 
> > I'm going to abandon the vm_stop_permanent() idea.
> > 
> > I'm thinking something towards having a stop_kind argument to vm_stop(),
> > which could be temporary (which is what we have today), only with
> > system_reset or permanent.
> > 
> > Or maybe vm_stop_opt(), which allows for fine grained control of stop
> > (vm_stop() would default to today's behavior)...
> 
> But the right point for reset is start, not stop. Otherwise we loose the
> state before the stop.




Re: [Qemu-devel] [PATCH] add QEMU_LD_PREFIX environment variable

2011-07-28 Thread Alexander Graf

On 28.07.2011, at 18:50, Geert Stappers wrote:

> On Thu, Jul 28, 2011 at 01:24:47PM +0200, Johannes Schauer wrote:
>> 
>> @Geert Stappers:
>> 
>> you are patching bsd-user/main.c and darwin-user/main.c as well. I take
>> it that you did test your changes on those platforms? does it work there
>> as well? I have no clue of darwin but is it really useful there?
> 
> They only check I did,
> was checking if BSD and Darwing have getenv(), they do.
> 
> I consider the 
> +/* read interp_prefix from environment variable */
> +if (getenv("QEMU_LD_PREFIX") != NULL) {
> +interp_prefix = getenv("QEMU_LD_PREFIX");
> to Darwin and BSD as harmless. In fact only reason for those additional
> lines were to bring attention to those operating systems.

It's called DYLD_PRELOAD on Darwin :). However, I do think that if anything 
this should be an alias, so you can set either of the two. If anyone cares 
about darwin-user at all.


Alex




Re: [Qemu-devel] [PATCH] add QEMU_LD_PREFIX environment variable

2011-07-28 Thread Geert Stappers
On Thu, Jul 28, 2011 at 01:24:47PM +0200, Johannes Schauer wrote:
> 
> @Geert Stappers:
> 
> you are patching bsd-user/main.c and darwin-user/main.c as well. I take
> it that you did test your changes on those platforms? does it work there
> as well? I have no clue of darwin but is it really useful there?

They only check I did,
was checking if BSD and Darwing have getenv(), they do.

I consider the 
+/* read interp_prefix from environment variable */
+if (getenv("QEMU_LD_PREFIX") != NULL) {
+interp_prefix = getenv("QEMU_LD_PREFIX");
to Darwin and BSD as harmless. In fact only reason for those additional
lines were to bring attention to those operating systems.

With the idea^Whope of better acception from upstream.


Cheers
Geert Stappers




Re: [Qemu-devel] [RFC][PATCH 0/21] QEMU Object Model

2011-07-28 Thread Paolo Bonzini

On 07/28/2011 05:04 PM, Anthony Liguori wrote:

The only way I can see is to teach each device about this interface and
then have a common bus.  That implies that you have:

class GoldfishEnumerator : public Device {
  GoldfishDevice *slots[N];


FWIW, there's no hardcoded limit in the interface, and the list of 
devices is unordered.  But that only means you should attach it with


plug-set goldfish_tty::enumerator goldfish_enum

rather than

plug-set goldfish_enum::slots[12] goldfish_tty

If you can confirm that, that's fine.


};

interface GoldfishDevice {
  const char *get_name();
  uint64_t get_mmio_base();
  ...
};

class GoldfishNic : public Device, implements GoldfishDevice
{
 const char *get_name(void) {
 return "nic";
 }


 uint64_t mmio_base;
 uint64_t get_mmio_base() { return mmio_base; }
 uint64_t set_mmio_base(uint64_t addr) { mmio_base = addr; }


};


And that's exactly my point.  It's a "stupid" interface full of 
getters/setters, which is what you get if you use only interface 
inheritance instead of, where appropriate, data containment.


Interfaces should be reserved for what really depends on the 
_implementation_ of the GoldfishNic, not for accessing a bunch of 
numbers.  There is no implementation-dependent detail of that kind in 
the GoldfishDevice (unlike other buses, even simple ones like I2C).



The PIC's view is more complicated than a Pin, and more similar to ISA.


ISA is just a pin.  The ISA bus extender literally has five pins
corresponding to the ISA IRQs 7, 6, 5, 4, 3.


ISA is many pins. :)  Goldfish looks similar (32 pins).

Paolo



[Qemu-devel] Log in to the OS through qemu

2011-07-28 Thread bala suru
Hi,
I have installed OS on VMs (KVM-Qemu) , one version is ttylinux 9.0 and
other is ttylinux 12 .

I have installed these two with opennebula tool kit .  generally I try  to
login to the OS using SSH , since ttylinux 12 needs to enable the ethernet
after logged in  to the OS, how to login to the OS through qemu ... i know
the username and password for that ..

I checked that through $virsh command ,Both OS running fine , and shows the
dmain id and status is running ..

regards
Bala


Re: [Qemu-devel] [PATCH 08/15] qmp: add block_stream command

2011-07-28 Thread Stefan Hajnoczi
On Thu, Jul 28, 2011 at 4:53 PM, Marcelo Tosatti  wrote:
> On Wed, Jul 27, 2011 at 02:44:48PM +0100, Stefan Hajnoczi wrote:
>> For leaf images with copy-on-read semantics, the stream command allows
>> the user to populate the image file by copying data from the backing
>> file while the guest is running.  Once all blocks have been streamed,
>> the dependency on the original backing file is removed.  Therefore,
>> stream commands can be used to implement post-copy live block migration
>> and rapid deployment.
>>
>> The command synopsis is:
>>
>> block_stream
>> 
>>
>> Copy data from a backing file into a block device.
>>
>> The block streaming operation is performed in the background until the
>> entire backing file has been copied.  This command returns immediately
>> once streaming has started.  The status of ongoing block streaming
>> operations can be checked with query-block-jobs.  The operation can be
>> stopped before it has completed using the block_job_cancel command.
>>
>> If a base file is specified then sectors are not copied from that base
>> file and its backing chain.  When streaming completes the image file
>> will have the base file as its backing file.  This can be used to stream
>> a subset of the backing file chain instead of flattening the entire
>> image.
>>
>> On successful completion the image file is updated to drop the backing
>> file.
>>
>> Arguments:
>>
>> - device: device name (json-string)
>> - base:   common backing file (json-string, optional)
>>
>> Errors:
>>
>> DeviceInUse:    streaming is already active on this device
>> DeviceNotFound: device name is invalid
>> NotSupported:   image streaming is not supported by this device
>>
>> Events:
>>
>> On completion the BLOCK_JOB_COMPLETED event is raised with the following
>> fields:
>>
>> - type:     job type ("stream" for image streaming, json-string)
>> - device:   device name (json-string)
>> - end:      maximum progress value (json-int)
>> - position: current progress value (json-int)
>> - speed:    rate limit, bytes per second (json-int)
>> - error:    error message (json-string, only on error)
>>
>> The completion event is raised both on success and on failure.  On
>> success position is equal to end.  On failure position and end can be
>> used to indicate at which point the operation failed.
>>
>> On failure the error field contains a human-readable error message.
>> There are no semantics other than that streaming has failed and clients
>> should not try to interpret the error string.
>>
>> Examples:
>>
>> -> { "execute": "block_stream", "arguments": { "device": "virtio0" } }
>> <- { "return":  {} }
>>
>> Signed-off-by: Adam Litke 
>> Signed-off-by: Stefan Hajnoczi 
>> ---
>>  blockdev.c      |  133 
>> +++
>>  blockdev.h      |    1 +
>>  hmp-commands.hx |   14 ++
>>  monitor.c       |    3 +
>>  monitor.h       |    1 +
>>  qerror.c        |    8 +++
>>  qerror.h        |    6 +++
>>  qmp-commands.hx |   64 ++
>>  8 files changed, 230 insertions(+), 0 deletions(-)
>>
>> diff --git a/blockdev.c b/blockdev.c
>> index b337732..cd5e49c 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -16,6 +16,7 @@
>>  #include "sysemu.h"
>>  #include "hw/qdev.h"
>>  #include "block_int.h"
>> +#include "qjson.h"
>>
>>  static QTAILQ_HEAD(drivelist, DriveInfo) drives = 
>> QTAILQ_HEAD_INITIALIZER(drives);
>>
>> @@ -50,6 +51,131 @@ static const int if_max_devs[IF_COUNT] = {
>>      [IF_SCSI] = 7,
>>  };
>>
>> +typedef struct StreamState {
>> +    int64_t offset;             /* current position in block device */
>> +    BlockDriverState *bs;
>> +    QEMUTimer *timer;
>> +    QLIST_ENTRY(StreamState) list;
>> +} StreamState;
>> +
>> +static QLIST_HEAD(, StreamState) block_streams =
>> +    QLIST_HEAD_INITIALIZER(block_streams);
>> +
>> +static QObject *stream_get_qobject(StreamState *s)
>> +{
>> +    const char *name = bdrv_get_device_name(s->bs);
>> +    int64_t len = bdrv_getlength(s->bs);
>> +
>> +    return qobject_from_jsonf("{ 'device': %s, 'type': 'stream', "
>> +                              "'offset': %" PRId64 ", 'len': %" PRId64 ", "
>> +                              "'speed': %" PRId64 " }",
>> +                              name, s->offset, len, (int64_t)0);
>> +}
>> +
>> +static void stream_mon_event(StreamState *s, int ret)
>> +{
>> +    QObject *data = stream_get_qobject(s);
>> +
>> +    if (ret < 0) {
>> +        QDict *qdict = qobject_to_qdict(data);
>> +
>> +        qdict_put(qdict, "error", qstring_from_str(strerror(-ret)));
>> +    }
>> +
>> +    monitor_protocol_event(QEVENT_BLOCK_JOB_COMPLETED, data);
>> +    qobject_decref(data);
>> +}
>> +
>> +static void stream_free(StreamState *s)
>> +{
>> +    QLIST_REMOVE(s, list);
>> +
>> +    qemu_del_timer(s->timer);
>> +    qemu_free_timer(s->timer);
>> +    qemu_free(s);
>> +}
>> +
>> +static void stream_complete(StreamState *s, int ret)
>> +{
>> +    stream_mon_event(s, ret);
>> +

Re: [Qemu-devel] [PATCH 08/15] qmp: add block_stream command

2011-07-28 Thread Marcelo Tosatti
On Wed, Jul 27, 2011 at 02:44:48PM +0100, Stefan Hajnoczi wrote:
> For leaf images with copy-on-read semantics, the stream command allows
> the user to populate the image file by copying data from the backing
> file while the guest is running.  Once all blocks have been streamed,
> the dependency on the original backing file is removed.  Therefore,
> stream commands can be used to implement post-copy live block migration
> and rapid deployment.
> 
> The command synopsis is:
> 
> block_stream
> 
> 
> Copy data from a backing file into a block device.
> 
> The block streaming operation is performed in the background until the
> entire backing file has been copied.  This command returns immediately
> once streaming has started.  The status of ongoing block streaming
> operations can be checked with query-block-jobs.  The operation can be
> stopped before it has completed using the block_job_cancel command.
> 
> If a base file is specified then sectors are not copied from that base
> file and its backing chain.  When streaming completes the image file
> will have the base file as its backing file.  This can be used to stream
> a subset of the backing file chain instead of flattening the entire
> image.
> 
> On successful completion the image file is updated to drop the backing
> file.
> 
> Arguments:
> 
> - device: device name (json-string)
> - base:   common backing file (json-string, optional)
> 
> Errors:
> 
> DeviceInUse:streaming is already active on this device
> DeviceNotFound: device name is invalid
> NotSupported:   image streaming is not supported by this device
> 
> Events:
> 
> On completion the BLOCK_JOB_COMPLETED event is raised with the following
> fields:
> 
> - type: job type ("stream" for image streaming, json-string)
> - device:   device name (json-string)
> - end:  maximum progress value (json-int)
> - position: current progress value (json-int)
> - speed:rate limit, bytes per second (json-int)
> - error:error message (json-string, only on error)
> 
> The completion event is raised both on success and on failure.  On
> success position is equal to end.  On failure position and end can be
> used to indicate at which point the operation failed.
> 
> On failure the error field contains a human-readable error message.
> There are no semantics other than that streaming has failed and clients
> should not try to interpret the error string.
> 
> Examples:
> 
> -> { "execute": "block_stream", "arguments": { "device": "virtio0" } }
> <- { "return":  {} }
> 
> Signed-off-by: Adam Litke 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  blockdev.c  |  133 
> +++
>  blockdev.h  |1 +
>  hmp-commands.hx |   14 ++
>  monitor.c   |3 +
>  monitor.h   |1 +
>  qerror.c|8 +++
>  qerror.h|6 +++
>  qmp-commands.hx |   64 ++
>  8 files changed, 230 insertions(+), 0 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index b337732..cd5e49c 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -16,6 +16,7 @@
>  #include "sysemu.h"
>  #include "hw/qdev.h"
>  #include "block_int.h"
> +#include "qjson.h"
>  
>  static QTAILQ_HEAD(drivelist, DriveInfo) drives = 
> QTAILQ_HEAD_INITIALIZER(drives);
>  
> @@ -50,6 +51,131 @@ static const int if_max_devs[IF_COUNT] = {
>  [IF_SCSI] = 7,
>  };
>  
> +typedef struct StreamState {
> +int64_t offset; /* current position in block device */
> +BlockDriverState *bs;
> +QEMUTimer *timer;
> +QLIST_ENTRY(StreamState) list;
> +} StreamState;
> +
> +static QLIST_HEAD(, StreamState) block_streams =
> +QLIST_HEAD_INITIALIZER(block_streams);
> +
> +static QObject *stream_get_qobject(StreamState *s)
> +{
> +const char *name = bdrv_get_device_name(s->bs);
> +int64_t len = bdrv_getlength(s->bs);
> +
> +return qobject_from_jsonf("{ 'device': %s, 'type': 'stream', "
> +  "'offset': %" PRId64 ", 'len': %" PRId64 ", "
> +  "'speed': %" PRId64 " }",
> +  name, s->offset, len, (int64_t)0);
> +}
> +
> +static void stream_mon_event(StreamState *s, int ret)
> +{
> +QObject *data = stream_get_qobject(s);
> +
> +if (ret < 0) {
> +QDict *qdict = qobject_to_qdict(data);
> +
> +qdict_put(qdict, "error", qstring_from_str(strerror(-ret)));
> +}
> +
> +monitor_protocol_event(QEVENT_BLOCK_JOB_COMPLETED, data);
> +qobject_decref(data);
> +}
> +
> +static void stream_free(StreamState *s)
> +{
> +QLIST_REMOVE(s, list);
> +
> +qemu_del_timer(s->timer);
> +qemu_free_timer(s->timer);
> +qemu_free(s);
> +}
> +
> +static void stream_complete(StreamState *s, int ret)
> +{
> +stream_mon_event(s, ret);
> +stream_free(s);
> +}
> +
> +static void stream_cb(void *opaque, int nb_sectors)
> +{
> +StreamState *s = opaque;
> +
> +if (nb_sectors < 0) {
> +stream_comple

Re: [Qemu-devel] RFC: moving fsfreeze support from the userland guest agent to the guest kernel

2011-07-28 Thread Michael Roth

On 07/28/2011 03:54 AM, Jes Sorensen wrote:

On 07/27/11 18:40, Andrea Arcangeli wrote:

Another thing to note is that snapshotting is not necessarily something

that should be completely transparent to the guest. One of the planned
future features for the guest agent (mentioned in the snapshot wiki, and
a common use case that I've seen come up elsewhere as well in the
context of database applications), is a way for userspace applications
to register callbacks to be made in the event of a freeze (dumping
application-managed caches to disk and things along that line). The

Not sure if the scripts are really needed or if they would just open a
brand new fsfreeze specific unix domain socket (created by the
database) to tell the database to freeze.

If the latter is the case, then it'd be better rather than changing
the database to open unix domain socket so the script can connect to
it when invoked (or maybe to just add some new function to the
protocol of an existing open unix domain socket), to instead change
the database to open a /dev/virtio-fsfreeze device, created by the
virtio-fsfreeze.ko virtio driver through udev. The database would poll
it, and it could read the request to freeze, and write into it that it
finished freezing when done. Then when all openers of the device
freezed, the virtio-fsfreeze.ko would go ahead freezing all the
filesystems, and then tell qemu when it's finished freezing. Then qemu
can finally block all the I/O and tell libvirt to go ahead with the
snapshot.


I think it could also be a combined operation, ie. having the freeze
happen in the kernel, but doing the callouts using a userspace daemon. I
like the userspace daemon for the callouts because it allows providing a
more sophisticated API than if we provide just a socket like interface.
In addition the callout is less critical wrt crashes than the fsfreeze
operations.



I'd prefer this approach as well. We could potentially implement it with 
a more general mechanism for executing scripts in the guest for whatever 
reason, rather than an fsfreeze-specific one.


Let the management layer handle the orchestration between the 2. Whether 
the freeze is kernel-driven or not I think can go either way, though the 
potential issues I mentioned in response the Fernando's post seem to 
those proposed changes are required for a "proper" guest agent 
implementation, and at that point we're talking about kernel changes 
either way for the functionality we ultimately want.


I think there may still be value in retaining the current fsfreeze 
support in the agent for older guests, however. What I'm convinced of 
now though is that the operation should not be tethered to the 
application callback operation, since that's applicable to other 
potential fsfreeze mechanisms.



Cheers,
Jes





Re: [Qemu-devel] [PATCH] net: fix default MAC assignment for hotplugged NICs

2011-07-28 Thread Marcelo Tosatti
On Thu, Jul 28, 2011 at 04:22:09PM +0100, Peter Maydell wrote:
> On 28 July 2011 16:12, Marcelo Tosatti  wrote:
> >
> > The index for assignment of default MAC address is duplicated:
> > qemu_macaddr_default_if_unset has its own variable and net_init_nic uses
> > the nic table index.
> 
> Isn't this already fixed by commit 6eed18568d ?
> 
> -- PMM

Yes, thanks for pointing out.




Re: [Qemu-devel] mips-linux-user and POSIX IPC

2011-07-28 Thread Riku Voipio
On Tue, Jul 26, 2011 at 08:56:38AM +, Holger Freyther wrote:
> For semctl qemu enters through the do_ipc method, it appears
> to be that the 'variable' ptr is really a ptr (to the stack) and
> needs to be dereferenced. The below snippet seems to fix that
> issue for me.
 
> My next problem is with do_shmctl, somehow third is NULL but it
> should point to the out parameter (and the application is doing
> this correctly as well). While trying to understand the issue it
> looks like target_to_host_shmid_ds will not properly unlock the
> struct on all paths.
 
> Is the IPC emulation supposed to work? Is this an 'obvious' API
> issue for MIPS?

The ltp testsuite has many semaphore and other ipc tests. Comparing
results of running the same ltp binaries on real mips and qemu-mips
should quickly reveal which syscall emulations are broken. And provide
a good testcase for fixes.

> @@ -2873,7 +2886,13 @@ static abi_long do_ipc(
>  break;
>  
>  case IPCOP_semctl:
> -ret = do_semctl(first, second, third, (union \n
> target_semun)(abi_ulong) ptr);
> +if (!lock_user_struct(VERIFY_READ, semun, ptr, 1))
> +   ret = -TARGET_EFAULT;
> +   else {
> +   __get_user(t_semun.buf, &semun->buf);
> +   ret = do_semctl(first, second, third, t_semun);
> +   unlock_user_struct(semun, ptr, 0);
> +   }
>  break;

This looks from a quick view a correct fix.



Re: [Qemu-devel] [PATCH] net: fix default MAC assignment for hotplugged NICs

2011-07-28 Thread Peter Maydell
On 28 July 2011 16:12, Marcelo Tosatti  wrote:
>
> The index for assignment of default MAC address is duplicated:
> qemu_macaddr_default_if_unset has its own variable and net_init_nic uses
> the nic table index.

Isn't this already fixed by commit 6eed18568d ?

-- PMM



Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Jan Kiszka
On 2011-07-28 17:18, Luiz Capitulino wrote:
> On Thu, 28 Jul 2011 16:19:19 +0200
> Jan Kiszka  wrote:
> 
>> On 2011-07-28 15:37, Avi Kivity wrote:
>>> On 07/28/2011 04:31 PM, Luiz Capitulino wrote:
 On Thu, 28 Jul 2011 10:23:22 +0300
 Avi Kivity  wrote:

>  On 07/28/2011 12:44 AM, Blue Swirl wrote:
>  >  On Wed, Jul 27, 2011 at 9:42 PM, Luiz
 Capitulino   wrote:
>  >  >   This function should be used when the VM is not supposed to
 resume
>  >  >   execution (eg. by issuing 'cont' monitor command).
>  >  >
>  >  >   Today, we allow the user to resume execution even when:
>  >  >
>  >  > o the guest shuts down and -no-shutdown is used
>  >  > o there's a kvm internal error
>  >  > o loading the VM state with -loadvm or "loadvm" in the
 monitor fails
>  >  >
>  >  >   I think only badness can happen from the cases above.
>  >
>  >  I'd suppose a system_reset should bring the system back to
 sanity and
>  >  then clear vm_permanent_stopped (where's -ly?)

 What's -ly?

>>>
>>> permanent-ly.
>>>
>  >  except maybe for KVM
>  >  internal error if that can't be recovered. Then it would not very
>  >  permanent anymore, so the name would need adjusting.
>
>  Currently, all kvm internal errors are recoverable by reset (and
>  possibly by fiddling with memory/registers).

 Ok, but a poweroff in the guest isn't recoverable with system_reset
 right? Or does it depend on the guest?
>>>
>>> Right, it's not recoverable if you shut the power down where the tractor
>>> beam is coupled to the main reactor.
>>
>> system_reset will bring all emulated devices back into their power-on
>> state - unless we have remaining bugs to fix. Actually, one may consider
>> issuing this reset automatically on vm_start after "permant" vm_stop.
> 
> I'm going to abandon the vm_stop_permanent() idea.
> 
> I'm thinking something towards having a stop_kind argument to vm_stop(),
> which could be temporary (which is what we have today), only with
> system_reset or permanent.
> 
> Or maybe vm_stop_opt(), which allows for fine grained control of stop
> (vm_stop() would default to today's behavior)...

But the right point for reset is start, not stop. Otherwise we loose the
state before the stop.

Jam



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Luiz Capitulino
On Thu, 28 Jul 2011 16:19:19 +0200
Jan Kiszka  wrote:

> On 2011-07-28 15:37, Avi Kivity wrote:
> > On 07/28/2011 04:31 PM, Luiz Capitulino wrote:
> >> On Thu, 28 Jul 2011 10:23:22 +0300
> >> Avi Kivity  wrote:
> >>
> >> >  On 07/28/2011 12:44 AM, Blue Swirl wrote:
> >> >  >  On Wed, Jul 27, 2011 at 9:42 PM, Luiz
> >> Capitulino   wrote:
> >> >  >  >   This function should be used when the VM is not supposed to
> >> resume
> >> >  >  >   execution (eg. by issuing 'cont' monitor command).
> >> >  >  >
> >> >  >  >   Today, we allow the user to resume execution even when:
> >> >  >  >
> >> >  >  > o the guest shuts down and -no-shutdown is used
> >> >  >  > o there's a kvm internal error
> >> >  >  > o loading the VM state with -loadvm or "loadvm" in the
> >> monitor fails
> >> >  >  >
> >> >  >  >   I think only badness can happen from the cases above.
> >> >  >
> >> >  >  I'd suppose a system_reset should bring the system back to
> >> sanity and
> >> >  >  then clear vm_permanent_stopped (where's -ly?)
> >>
> >> What's -ly?
> >>
> > 
> > permanent-ly.
> > 
> >> >  >  except maybe for KVM
> >> >  >  internal error if that can't be recovered. Then it would not very
> >> >  >  permanent anymore, so the name would need adjusting.
> >> >
> >> >  Currently, all kvm internal errors are recoverable by reset (and
> >> >  possibly by fiddling with memory/registers).
> >>
> >> Ok, but a poweroff in the guest isn't recoverable with system_reset
> >> right? Or does it depend on the guest?
> > 
> > Right, it's not recoverable if you shut the power down where the tractor
> > beam is coupled to the main reactor.
> 
> system_reset will bring all emulated devices back into their power-on
> state - unless we have remaining bugs to fix. Actually, one may consider
> issuing this reset automatically on vm_start after "permant" vm_stop.

I'm going to abandon the vm_stop_permanent() idea.

I'm thinking something towards having a stop_kind argument to vm_stop(),
which could be temporary (which is what we have today), only with
system_reset or permanent.

Or maybe vm_stop_opt(), which allows for fine grained control of stop
(vm_stop() would default to today's behavior)...



[Qemu-devel] [PATCH] net: fix default MAC assignment for hotplugged NICs

2011-07-28 Thread Marcelo Tosatti

The index for assignment of default MAC address is duplicated:
qemu_macaddr_default_if_unset has its own variable and net_init_nic uses
the nic table index.

This leads to assignment of same MAC addresses to NICs initialized via
command line and hotplugged ones.

Fix by not assigning default MAC address in net_init_nic, leaving
that job to qemu_macaddr_default_if_unset.

Signed-off-by: Marcelo Tosatti 
Reported-by: Amos Kong 
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=712046

diff --git a/net.c b/net.c
index 66123ad..f34400c 100644
--- a/net.c
+++ b/net.c
@@ -776,13 +776,6 @@ static int net_init_nic(QemuOpts *opts,
 nd->devaddr = qemu_strdup(qemu_opt_get(opts, "addr"));
 }
 
-nd->macaddr[0] = 0x52;
-nd->macaddr[1] = 0x54;
-nd->macaddr[2] = 0x00;
-nd->macaddr[3] = 0x12;
-nd->macaddr[4] = 0x34;
-nd->macaddr[5] = 0x56 + idx;
-
 if (qemu_opt_get(opts, "macaddr") &&
 net_parse_macaddr(nd->macaddr, qemu_opt_get(opts, "macaddr")) < 0) {
 error_report("invalid syntax for ethernet address");



Re: [Qemu-devel] RFC: moving fsfreeze support from the userland guest agent to the guest kernel

2011-07-28 Thread Michael Roth

On 07/28/2011 03:03 AM, Andrea Arcangeli wrote:

On Thu, Jul 28, 2011 at 11:53:50AM +0900, Fernando Luis Vázquez Cao wrote:

On Wed, 2011-07-27 at 17:24 +0200, Andrea Arcangeli wrote:

making
sure no lib is calling any I/O function to be able to defreeze the
filesystems later, making sure the oom killer or a wrong kill -9
$RANDOM isn't killing the agent by mistake while the I/O is blocked
and the copy is going.


Yes with the current API if the agent is killed while the filesystems
are frozen we are screwed.

I have just submitted patches that implement a new API that should make
the virtualization use case more reliable. Basically, I am adding a new
ioctl, FIGETFREEZEFD, which freezes the indicated filesystem and returns
a file descriptor; as long as that file descriptor is held open, the
filesystem remains open. If the freeze file descriptor is closed (be it
through a explicit call to close(2) or as part of process exit
housekeeping) the associated filesystem is automatically thawed.

- fsfreeze: add ioctl to create a fd for freeze control
   http://marc.info/?l=linux-fsdevel&m=131175212512290&w=2
- fsfreeze: add freeze fd ioctls
   http://marc.info/?l=linux-fsdevel&m=131175220612341&w=2


This is probably how the API should have been implemented originally
instead of FIFREEZE/FITHAW.

It looks a bit overkill though, I would think it'd be enough to have
the fsfreeze forced at FIGETFREEZEFD, and the only way to thaw by
closing the file without requiring any of the
FS_FREEZE_FD/FS_THAW_FD/FS_ISFROZEN_FD. But I guess you have use cases


One of the crappy things about the current implementation is the 
inability to determine whether or not a filesystem is frozen. At least 
in the context of guest agent at least, it'd be nice if 
guest-fsfreeze-status checked the actual system state rather than some 
internal state that may not necessarily reflect reality (if we freeze, 
and some other application thaws, we currently still report the state as 
frozen).


Also in the context of the guest agent, we are indeed screwed if the 
agent gets killed while in a frozen state, and remain screwed even if 
it's restarted since we have no way of determining whether or not we're 
in a frozen state and thus should disable logging operations.


We could check status by looking for a failure from the freeze 
operation, but if you're just interested in getting the state, having to 
potentially induce a freeze just to get at the state is really heavy-handed.


So having an open operation that doesn't force a freeze/thaw/status 
operation serves some fairly common use cases I think.



for those if you implemented it, maybe to check if root is stepping on
its own toes by checking if the fs is already freezed before freezing
it and returning failure if it is, running ioctl instead of opening
closing the file isn't necessarily better. At the very least the
get_user(should_freeze, argp) doesn't seem so necessary, it just
complicates the ioctl API a bit without much gain, I think it'd be
cleaner if the FS_FREEZE_FD was the only way to freeze then.

It's certainly a nice reliability improvement and safer API.

Now if you add a file descriptor to epoll/poll that userland can open
and talk to, to know when a fsfreeze is asked on a certain fs, a
fsfreeze userland agent (not virt related too) could open it and start
the scripts if that filesystem is being fsfreezed before calling
freeze_super().

Then a PARAVIRT_FSFREEZE=y/m driver could just invoke the fsfreeze
without any dependency on a virt specific guest agent.

Maybe Christoph's right there are filesystems in userland (not sure
how the storage is related, it's all about filesystems and apps as far
I can see, and it's all blkdev agnostic) that may make things more
complicated, but those usually have a kernel backend too (like
fuse). I may not see the full picture of the filesystem in userland or
how the storage agent in guest userland relates to this.

If you believe having libvirt talking QMP/QAPI over a virtio-serial
vmchannel with some virt specific guest userland agent bypassing qemu
entirely is better, that's ok with me, but there should be a strong
reason for it because the paravirt_fsfreeze.ko approach with a small
qemu backend and a qemu monitor command that starts paravirt-fsfreeze
in guest before going ahead blocking all I/O (to provide backwards
compatibility and reliable snapshots to guest OS that won't have the
paravirt fsfreeze too) looks more reliable, more compact and simpler
to use to me. I'll be surely ok either ways though.

Thanks,
Andrea





Re: [Qemu-devel] [RFC][PATCH 0/21] QEMU Object Model

2011-07-28 Thread Anthony Liguori

On 07/28/2011 09:41 AM, Paolo Bonzini wrote:

On 07/28/2011 04:03 PM, Anthony Liguori wrote:

No doubt about that. :) I'd put a lot more hope into Goldfish though.


What's unclear to me about the Goldfish enumerator is whether it should
be filled out through interaction with hardware devices or via some
other mechanism.

In many ways, it's similar to ACPI and a Device Tree. In both of those
cases, firmware actually is responsible for constructing those tables.


Yes, it is a flat device tree.

Since it supports hotplug (at least in theory, the Android emulator
predates qdev so it doesn't support it), I would say it is more similar
to PCI configuration space. The difference is that IRQ numbers and MMIO
base addresses are handed out by hardware (by a piece of the SoC) rather
than by the firmware.

So yes, the hardware would have some kind of bus to talk to the devices
and arbitrate hotplug/hotunplug. The only peculiarity being that the bus
enumerator hardcodes itself in the list it exposes, in addition to the
devices on the bus.

But that still means that the devices have two views:

1) the enumerator's view is either "this is my name, my MMIO base, my
IRQ base" or "I need 4k of MMIO and 1 IRQ line, please tell me
where/which are those", depending on the device;


I think it's important to ask, how would this be implemented in 
hardware.  The only way I can see is to teach each device about this 
interface and then have a common bus.  That implies that you have:


class GoldfishEnumerator : public Device {
 GoldfishDevice *slots[N];
};

interface GoldfishDevice {
 const char *get_name();
 uint64_t get_mmio_base();
 ...
};

class GoldfishNic : public Device, implements GoldfishDevice
{
const char *get_name(void) {
return "nic";
}
};

With respect to hotplug, that means that you have to hot plug the device 
to multiple busses.



2) the PIC's view is "please bring this IRQ line up/down" (the device
says which line, since the enumerator can assign those dynamically).

The PIC's view is more complicated than a Pin, and more similar to ISA.


ISA is just a pin.  The ISA bus extender literally has five pins 
corresponding to the ISA IRQs 7, 6, 5, 4, 3.


EISA adds 5 more pins for 10, 11, 12, 14, 15.

ISA devices "choose" their IRQ line by hardwiring their IRQ output pin 
to a specific IRQ line on the bus.


Regards,

Anthony Liguori



Paolo






Re: [Qemu-devel] [PATCH v2 1/1] The codes V2 for QEMU disk I/O limits.

2011-07-28 Thread Marcelo Tosatti
On Thu, Jul 28, 2011 at 12:24:48PM +0800, Zhi Yong Wu wrote:
> On Wed, Jul 27, 2011 at 11:49 PM, Marcelo Tosatti  wrote:
> > On Wed, Jul 27, 2011 at 06:17:15PM +0800, Zhi Yong Wu wrote:
> >> >> +        wait_time = 1;
> >> >> +    }
> >> >> +
> >> >> +    wait_time = wait_time + (slice_time - elapsed_time);
> >> >> +    if (wait) {
> >> >> +        *wait = wait_time * BLOCK_IO_SLICE_TIME * 10 + 1;
> >> >> +    }
> >> >
> >> > The guest can keep submitting requests where "wait_time = 1" above,
> >> > and the timer will be rearmed continuously in the future.
> >
> > This is wrong.
> >
> >>  Can't you
> >> > simply arm the timer to the next slice start? _Some_ data must be
> >> > transfered by then, anyway (and nothing can be transfered earlier than
> >> > that).
> >
> > This is valid.
> >
> >> Sorry, i have got what you mean. Can you elaborate in more detail?
> >
> > Sorry, the bug i mentioned about timer being rearmed does not exist.
> >
> > But arming the timer for the last request as its done is 
> > confusing/unnecessary.
> >
> > The timer processes queued requests, but the timer is armed accordingly
> > to the last queued request in the slice. For example, if a request is
> > submitted 1ms before the slice ends, the timer is armed 100ms +
> > (slice_time - elapsed_time) in the future.
> 
> If the timer is simply amred to the next slice start, this timer will
> be a periodic timer, either the I/O rate can not be throttled under
> the limits, or the enqueued request can be delayed to handled, this
> will lower I/O rate seriously than the limits.

Yes, periodic but disarmed when there is no queueing. I don't understand
your point about low I/O rate.

> Maybe the slice time should be variable with current I/O rate. What do
> you think of it?

Not sure if its necessary. The slice should be small to avoid excessive
work on timer context, but the advantage of increasing the slice is
very small if any. BTW, 10ms seems a better default than 100ms.




[Qemu-devel] [PATCH] lm832x: Take DeviceState pointer in lm832x_key_event()

2011-07-28 Thread Peter Maydell
Since lm832x has been qdev'ified, its users will generally
have a DeviceState pointer rather than an i2c_slave pointer,
so adjust lm832x_key_event's prototype to suit.

This allows the n810 (its only user) to actually pass a correct
pointer to it rather than NULL. The effect is that we no longer
segfault when a key is pressed.

Signed-off-by: Peter Maydell 
---
NB: this patch depends on the OMAP GPIO v2 patchset.

 hw/i2c.h |2 +-
 hw/lm832x.c  |4 ++--
 hw/nseries.c |7 +++
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/hw/i2c.h b/hw/i2c.h
index 5514402..9381d01 100644
--- a/hw/i2c.h
+++ b/hw/i2c.h
@@ -72,6 +72,6 @@ void wm8750_set_bclk_in(void *opaque, int new_hz);
 void tmp105_set(i2c_slave *i2c, int temp);
 
 /* lm832x.c */
-void lm832x_key_event(i2c_slave *i2c, int key, int state);
+void lm832x_key_event(DeviceState *dev, int key, int state);
 
 #endif
diff --git a/hw/lm832x.c b/hw/lm832x.c
index 590a4cc..992ce49 100644
--- a/hw/lm832x.c
+++ b/hw/lm832x.c
@@ -474,9 +474,9 @@ static int lm8323_init(i2c_slave *i2c)
 return 0;
 }
 
-void lm832x_key_event(struct i2c_slave *i2c, int key, int state)
+void lm832x_key_event(DeviceState *dev, int key, int state)
 {
-LM823KbdState *s = (LM823KbdState *) i2c;
+LM823KbdState *s = FROM_I2C_SLAVE(LM823KbdState, I2C_SLAVE_FROM_QDEV(dev));
 
 if ((s->status & INT_ERROR) && (s->error & ERR_FIFOOVR))
 return;
diff --git a/hw/nseries.c b/hw/nseries.c
index 32f2f53..45b52bb 100644
--- a/hw/nseries.c
+++ b/hw/nseries.c
@@ -45,7 +45,7 @@ struct n800_s {
 i2c_bus *i2c;
 
 int keymap[0x80];
-i2c_slave *kbd;
+DeviceState *kbd;
 
 TUSBState *usb;
 void *retu;
@@ -362,7 +362,6 @@ static int n810_keys[0x80] = {
 static void n810_kbd_setup(struct n800_s *s)
 {
 qemu_irq kbd_irq = qdev_get_gpio_in(s->cpu->gpio, N810_KEYBOARD_GPIO);
-DeviceState *dev;
 int i;
 
 for (i = 0; i < 0x80; i ++)
@@ -375,8 +374,8 @@ static void n810_kbd_setup(struct n800_s *s)
 
 /* Attach the LM8322 keyboard to the I2C bus,
  * should happen in n8x0_i2c_setup and s->kbd be initialised here.  */
-dev = i2c_create_slave(s->i2c, "lm8323", N810_LM8323_ADDR);
-qdev_connect_gpio_out(dev, 0, kbd_irq);
+s->kbd = i2c_create_slave(s->i2c, "lm8323", N810_LM8323_ADDR);
+qdev_connect_gpio_out(s->kbd, 0, kbd_irq);
 }
 
 /* LCD MIPI DBI-C controller (URAL) */
-- 
1.7.1




Re: [Qemu-devel] [PATCH] fix disabling interrupts in sun4u

2011-07-28 Thread Artyom Tarasenko
On Thu, Jul 28, 2011 at 3:40 PM,   wrote:
> At Thu, 28 Jul 2011 14:50:57 +0200,
> Artyom Tarasenko wrote:
>> On Thu, Jul 28, 2011 at 2:03 PM,   wrote:
>> > At Thu, 28 Jul 2011 13:51:08 +0200,
>> > Artyom Tarasenko wrote:
>> >> On Thu, Jul 28, 2011 at 12:31 PM,   wrote:
>> >> > Hi,
>> >> >
>> >> > At Mon, 25 Jul 2011 19:22:38 +0200,
>> >> > Artyom Tarasenko wrote:
>> >> >
>> >> >> clear interrupt request if the interrupt priority < CPU pil
>> >> >> clear hardware interrupt request if interrupts are disabled
>> >> >
>> >> > Not directly related to the fix, but I'd like to note a problem
>> >> > of hw/sun4u.c interrupt code:
>> >> >
>> >> > The interrupt code probably mixes hardware interrupts and
>> >> > software interrupts.
>> >> > %pil is for software interrupts (interrupt_level_n traps).
>> >> > %pil can not mask hardware interrupts (interrupt_vector traps);
>> >> > the CPU raises interrupt_vector traps even on %pil=15.
>> >> > But in cpu_check_irqs() and cpu_set_irq(), hardware interrupts
>> >> > seem to be masked by %pil.
>> >>
>> >> The interrupt_vector traps are currently not implemented, are they?
>> >> So it's hard to tell whether they are masked.
>> >
>> > Yes, interrupt_vector is not implemented yet.
>> > I failed to explain the problem.
>> > The problem is that cpu_set_irqs() should raise interrupt_vector
>> > traps but it raises interrupt_level_n traps actually.
>> > sun4uv_init() calls qemu_allocate_irqs() with cpu_set_irq as
>> > the 1st argument.  The allocated irqs (the irq variable) are
>> > passed to pci_apb_init().  APB should generate interrupt_vector
>> > traps (hardware interrupts), not the interrupt_vector_n traps.
>>
>> Yes, this is true. But it's more complicated than this: cpu_check_irqs
>> also checks tick/stick/hstick interrupts. They should produce the
>> interrupt_level_n traps as they currently do.
>
> That's right.
> tick/stick/hstick must raise interrupt_level_n traps.
>
>> The patch merely fixes the problem of hanging on a interrupt_vector_n
>> trap if the trap handler uses pil for interrupt masking. The problem
>> exists independently from interrupt_vector trap generation (as you
>> pointed out).
>
> I understand what is the problem that your patch is going to fix.
> Thanks for the explanation.
>

Sorry, I haven't implied that you don't. Meant to say "As you pointed
out, the problem exists independently...".

>> Do you have objections to this patch in its current form?
>
> No, I don't have any objections.
>
>> > The interrupts from APB would be reported by cpu_set_irq(),
>> > but cpu_set_irq() seems to generate interrupt_vector_n traps.
>>
>> For me it's not obvious. The interrupt vector not just one line, but
>> the vector, which is written in the corresponding CPU register (also
>> missing in the current qemu implementation). On the real hardware the
>> vector is created by the IOMMU (PBM/APB/...). If qemu intends to
>> support multiple chipsets, we should keep it the way it's done on the
>> real hardware (for instance the interrupt vectors for on-board devices
>> on Ultra-1 and E6500 are not the same).
>
> Sorry, I can't keep up with this vector thing...
> Does the CPU receive hardware interrupts as interrupt_vector traps
> (trap type=0x60) regardless of the kind of the interrupt controller,
> doesn't it?

It does indeed, but it also stores the interrupt vector identifying
the initiator device, in a CPU register readable with asi 0x7f .
What would APB pass to the cpu_set_irq? I see the three following variants:

a) it passes the PCI interrupt id, which is translated to the
interrupt vector in cpu_set_irq()
b) it passes the vector. This implies that 2048 (0-0x7ff) CPU
interrupts have to be allocated.
c) hack combining a+b: allocate only the interrupts known to be used
and translate an internal interrupt id to a vector.

The variant "a" is bad because it doesn't allow support for different
chipsets. The variant "b" is bad because qemu has to allocate way too
many interrupts. Only few of them will be used actually. The variant
"c" is bad, well, because it's a hack.

That's why I suggest using another interface between APB and CPU.

Have I overlooked something?

>> I'd suggest APB shall use some other interface for communicating
>> interrupts to the CPU. Something like
>> cpu_receive_ivec(interrupt_vector).
>>
>> >> >> Signed-off-by: Artyom Tarasenko 
>> >> >> ---
>> >> >>  hw/sun4u.c |    6 --
>> >> >>  1 files changed, 4 insertions(+), 2 deletions(-)
>> >> >>
>> >> >> diff --git a/hw/sun4u.c b/hw/sun4u.c
>> >> >> index d7dcaf0..7f95aeb 100644
>> >> >> --- a/hw/sun4u.c
>> >> >> +++ b/hw/sun4u.c
>> >> >> @@ -255,7 +255,7 @@ void cpu_check_irqs(CPUState *env)
>> >> >>          pil |= 1 << 14;
>> >> >>      }
>> >> >>
>> >> >> -    if (!pil) {
>> >> >> +    if (pil < (2 << env->psrpil)){
>> >> >>          if (env->interrupt_request & CPU_INTERRUPT_HARD) {
>> >> >>              CPUIRQ_DPRINTF("Reset CPU IRQ (current interrupt %x)\n",
>> >> >>                             en

Re: [Qemu-devel] [RFC][PATCH 0/21] QEMU Object Model

2011-07-28 Thread Paolo Bonzini

On 07/28/2011 04:03 PM, Anthony Liguori wrote:

No doubt about that. :) I'd put a lot more hope into Goldfish though.


What's unclear to me about the Goldfish enumerator is whether it should
be filled out through interaction with hardware devices or via some
other mechanism.

In many ways, it's similar to ACPI and a Device Tree.  In both of those
cases, firmware actually is responsible for constructing those tables.


Yes, it is a flat device tree.

Since it supports hotplug (at least in theory, the Android emulator 
predates qdev so it doesn't support it), I would say it is more similar 
to PCI configuration space.  The difference is that IRQ numbers and MMIO 
base addresses are handed out by hardware (by a piece of the SoC) rather 
than by the firmware.


So yes, the hardware would have some kind of bus to talk to the devices 
and arbitrate hotplug/hotunplug.  The only peculiarity being that the 
bus enumerator hardcodes itself in the list it exposes, in addition to 
the devices on the bus.


But that still means that the devices have two views:

1) the enumerator's view is either "this is my name, my MMIO base, my 
IRQ base" or "I need 4k of MMIO and 1 IRQ line, please tell me 
where/which are those", depending on the device;


2) the PIC's view is "please bring this IRQ line up/down" (the device 
says which line, since the enumerator can assign those dynamically).


The PIC's view is more complicated than a Pin, and more similar to ISA.

Paolo



[Qemu-devel] [Bug 524447] Re: virsh save is very slow

2011-07-28 Thread Michael Tokarev
I just wanted to point out that we've this patch in Debian since ages,
and it's been included in upstream for a long time too.  Added a debbug
reference for this as well.

** Bug watch added: Debian Bug tracker #597517
   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=597517

** Also affects: qemu-kvm (Debian) via
   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=597517
   Importance: Unknown
   Status: Unknown

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/524447

Title:
  virsh save is very slow

Status in libvirt virtualization API:
  Unknown
Status in QEMU:
  Fix Released
Status in “libvirt” package in Ubuntu:
  Invalid
Status in “qemu-kvm” package in Ubuntu:
  Fix Released
Status in “libvirt” source package in Lucid:
  Won't Fix
Status in “qemu-kvm” source package in Lucid:
  Won't Fix
Status in “libvirt” source package in Maverick:
  Won't Fix
Status in “qemu-kvm” source package in Maverick:
  Won't Fix
Status in “qemu-kvm” package in Debian:
  Unknown

Bug description:
  ==
  SRU Justification:
  1. impact: 'qemu save' is slow
  2. how addressed: a patch upstream fixes the case when a file does not 
announce when it is ready.
  3. patch: see the patch in linked bzr trees
  4. TEST CASE: see comment #4 for a specific recipe
  5. regression potential:  this patch only touches the vm save path.
  ==

  As reported here: http://www.redhat.com/archives/libvir-
  list/2009-December/msg00203.html

  "virsh save" is very slow - it writes the image at around 1MB/sec on
  my test system.

  (I think I saw a bug report for this issue on Fedora's bugzilla, but I
  can't find it now...)

  Confirmed under Karmic.

To manage notifications about this bug go to:
https://bugs.launchpad.net/libvirt/+bug/524447/+subscriptions



Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Jan Kiszka
On 2011-07-28 15:37, Avi Kivity wrote:
> On 07/28/2011 04:31 PM, Luiz Capitulino wrote:
>> On Thu, 28 Jul 2011 10:23:22 +0300
>> Avi Kivity  wrote:
>>
>> >  On 07/28/2011 12:44 AM, Blue Swirl wrote:
>> >  >  On Wed, Jul 27, 2011 at 9:42 PM, Luiz
>> Capitulino   wrote:
>> >  >  >   This function should be used when the VM is not supposed to
>> resume
>> >  >  >   execution (eg. by issuing 'cont' monitor command).
>> >  >  >
>> >  >  >   Today, we allow the user to resume execution even when:
>> >  >  >
>> >  >  > o the guest shuts down and -no-shutdown is used
>> >  >  > o there's a kvm internal error
>> >  >  > o loading the VM state with -loadvm or "loadvm" in the
>> monitor fails
>> >  >  >
>> >  >  >   I think only badness can happen from the cases above.
>> >  >
>> >  >  I'd suppose a system_reset should bring the system back to
>> sanity and
>> >  >  then clear vm_permanent_stopped (where's -ly?)
>>
>> What's -ly?
>>
> 
> permanent-ly.
> 
>> >  >  except maybe for KVM
>> >  >  internal error if that can't be recovered. Then it would not very
>> >  >  permanent anymore, so the name would need adjusting.
>> >
>> >  Currently, all kvm internal errors are recoverable by reset (and
>> >  possibly by fiddling with memory/registers).
>>
>> Ok, but a poweroff in the guest isn't recoverable with system_reset
>> right? Or does it depend on the guest?
> 
> Right, it's not recoverable if you shut the power down where the tractor
> beam is coupled to the main reactor.

system_reset will bring all emulated devices back into their power-on
state - unless we have remaining bugs to fix. Actually, one may consider
issuing this reset automatically on vm_start after "permant" vm_stop.

Jan



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] Volume key in qcow3?

2011-07-28 Thread Kevin Wolf
Am 28.07.2011 10:05, schrieb Frediano Ziglio:
> Hi,
>   I noted that AES encryption using qcow2 just use the password given
> as as key (and also truncating it to 16 bytes == 128 bits).
> This is prone to brute force attacks and is not also easy to change
> password (you have to decrypt and encrypt again the entire image).
> LUKS and EncFS use another way. They generate a random key (the
> "volume key") then use the password you give to encrypt N times (where
> N is decided by security level or automatically based on time to
> decrypt the volume key. To change the password just give the old one,
> get the volume key and encrypt again using the new one. LUKS support
> also multiple "slots" to allow multiple password and even using an
> external key file.
> Obviously this require an additional extension to qcow2 so I think it
> require a new qcow3 format.

Yes, once we have qcow3, adding things like this should be easy enough.
I think the idea makes sense.

Another thing to consider with encryption is that we don't encrypt
metadata currently. I'm not entirely sure if this is a good or a bad
thing. Metadata is relatively predictable and I think that might hurt
the encryption? Though I'm really not an expert in this area.

Kevin



Re: [Qemu-devel] [PATCH] [RFC] qcow2: group refcount updates during cow

2011-07-28 Thread Kevin Wolf
Am 28.07.2011 15:50, schrieb Frediano Ziglio:
> Well, I think this is the first real improve patch.
> Is more a RFC than a patch. Yes, some lines are terrible!
> It collapses refcount decrement during cow.
> From a first check time executing 015 test passed from about 600 seconds
> to 70.
> This at least prove that refcount updates counts!
> Some doubt:
> 1- place the code in qcow2-refcount.c as it update only refcount and not
>   cluster?
> 2- allow some sort of "begin transaction" / "commit" / "rollback" like 
>   databases instead?
> 3- allow changing tables from different coroutines?
> 
> 1) If you have a sequence like (1, 2, 4) probably these clusters are all in
> the same l2 table but with this code you get two write instead of one.
> I'm thinking about a function in qcow2-refcount.c that accept an array of 
> cluster
> instead of a start + len.
> 
> Signed-off-by: Frediano Ziglio 

I think what you're seeing is actually just one special case of a more
general problem. The problem is that we're interpreting writethrough
stricter than required.

The semantics that we really need is that on completion of a request,
all of its data and metadata must be flushed to disk. There is no
requirement that we flush all intermediate states.

My recent update to qcow2_update_snapshot_refcount() is just another
case of the same problem. I think the solution should be similar to what
I did there, i.e. switch the cache to writeback mode while we're
operating on it and switch back when we're done. We should probably have
functions that make both of this a one-liner (I think here we have some
similarity to your begin/commit idea).

With the right functions, this could become as easy as this (might need
better function names, but you get the idea):

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 882f50a..45b67b1 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -612,6 +612,8 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState
*bs, QCowL2Meta *m)
 if (m->nb_clusters == 0)
 return 0;

+qcow2_cache_disable_writethrough(bs);
+
 old_cluster = qemu_malloc(m->nb_clusters * sizeof(uint64_t));

 /* copy content of unmodified sectors */
@@ -683,6 +685,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState
*bs, QCowL2Meta *m)

 ret = 0;
 err:
+qcow2_cache_restore_writethrough(bs);
 qemu_free(old_cluster);
 return ret;
  }

Kevin



[Qemu-devel] [Bug 524447] Re: virsh save is very slow

2011-07-28 Thread Jeff Snider
Michael: Yes, that is the correct patch.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/524447

Title:
  virsh save is very slow

Status in libvirt virtualization API:
  Unknown
Status in QEMU:
  Fix Released
Status in “libvirt” package in Ubuntu:
  Invalid
Status in “qemu-kvm” package in Ubuntu:
  Fix Released
Status in “libvirt” source package in Lucid:
  Won't Fix
Status in “qemu-kvm” source package in Lucid:
  Won't Fix
Status in “libvirt” source package in Maverick:
  Won't Fix
Status in “qemu-kvm” source package in Maverick:
  Won't Fix

Bug description:
  ==
  SRU Justification:
  1. impact: 'qemu save' is slow
  2. how addressed: a patch upstream fixes the case when a file does not 
announce when it is ready.
  3. patch: see the patch in linked bzr trees
  4. TEST CASE: see comment #4 for a specific recipe
  5. regression potential:  this patch only touches the vm save path.
  ==

  As reported here: http://www.redhat.com/archives/libvir-
  list/2009-December/msg00203.html

  "virsh save" is very slow - it writes the image at around 1MB/sec on
  my test system.

  (I think I saw a bug report for this issue on Fedora's bugzilla, but I
  can't find it now...)

  Confirmed under Karmic.

To manage notifications about this bug go to:
https://bugs.launchpad.net/libvirt/+bug/524447/+subscriptions



Re: [Qemu-devel] [RFC][PATCH 0/21] QEMU Object Model

2011-07-28 Thread Anthony Liguori

On 07/28/2011 08:50 AM, Paolo Bonzini wrote:

On 07/28/2011 02:46 PM, Anthony Liguori wrote:


There are plenty of PV interfaces implemented by QEMU. Would you say the
same of virtio?


Virtio was designed to look like real hardware.

I would say that trying to fit XenStore into QOM would not be a good
exercise.


No doubt about that. :) I'd put a lot more hope into Goldfish though.


What's unclear to me about the Goldfish enumerator is whether it should 
be filled out through interaction with hardware devices or via some 
other mechanism.


In many ways, it's similar to ACPI and a Device Tree.  In both of those 
cases, firmware actually is responsible for constructing those tables.


Regards,

Anthony Liguori



Paolo






Re: [Qemu-devel] [PATCH 00/12] bugfix and qdevify NAND and ONENAND devices

2011-07-28 Thread Peter Maydell
Ping on this one too, since I'm nearly ready with the next chunk of
patches (18 patches fixing bugs and adding NAND support to omap_gpmc)
and they obviously depend on this set.

-- PMM

On 15 July 2011 15:58, Peter Maydell  wrote:
> This patchseries is more goodies from the Meego tree. Specifically, various
> bug fixes to hw/nand and hw/onenand, plus conversion of both to sysbus.
>
> [An advance note on the next thing due to come out of the pipe: qdeving
> nand and onenand allows us to move to qdevification of omap_gpmc, which
> takes both of those (and other things) as 'downstream' devices. That
> change will make it accept arbitrary SysBus devices as its 'downstream'
> devices rather than the current setup of taking an opaque pointer and a
> pair of resize/unmap function pointers. (There are also patches which add
> NAND support so you can tell omap_gpmc "this is a NAND device" and it then
> uses the nand_setpins/nand_getpins/nand_setio interface.) The rationale for
> using SysBus there rather than defining some new kind of bus is that you
> want to be able to plug any random thing into it; for instance the Overo
> board has a lan9118 ethernet controller hanging off the GPMC. ]
>
> A note on dependencies:
>  * This patch depends on the omap gpio patchset (although more textually
>   than seriously semantically)
>  * the six nand patches and the six onenand patches are broadly independent
>   of each other
>
> Juha Riihimäki (9):
>  hw/nand: Support large NAND devices
>  hw/nand: Support devices wider than 8 bits
>  hw/nand: Support multiple reads following READ STATUS
>  hw/nand: qdevify
>  onenand: Handle various ID fields separately
>  onenand: Ignore zero writes to boot command space
>  hw/onenand: program actions can only clear bits
>  hw/sysbus: Add sysbus_mmio_unmap() for unmapping a region
>  hw/onenand: qdevify
>
> Peter Maydell (3):
>  hw/nand: Pass block device state to init function
>  hw/nand: Writing to NAND can only clear bits
>  onenand: Pass BlockDriverState to init function
>
>  hw/axis_dev88.c |    8 +-
>  hw/flash.h      |   19 ++--
>  hw/nand.c       |  345 ---
>  hw/nseries.c    |   13 ++-
>  hw/onenand.c    |  373 
> +++
>  hw/spitz.c      |    6 +-
>  hw/sysbus.c     |   17 +++
>  hw/sysbus.h     |    1 +
>  hw/tc6393xb.c   |    7 +-
>  9 files changed, 561 insertions(+), 228 deletions(-)



Re: [Qemu-devel] [RFC][PATCH 0/21] QEMU Object Model

2011-07-28 Thread Paolo Bonzini

On 07/28/2011 02:46 PM, Anthony Liguori wrote:


There are plenty of PV interfaces implemented by QEMU. Would you say the
same of virtio?


Virtio was designed to look like real hardware.

I would say that trying to fit XenStore into QOM would not be a good
exercise.


No doubt about that. :)  I'd put a lot more hope into Goldfish though.

Paolo



[Qemu-devel] [PATCH] [RFC] qcow2: group refcount updates during cow

2011-07-28 Thread Frediano Ziglio
Well, I think this is the first real improve patch.
Is more a RFC than a patch. Yes, some lines are terrible!
It collapses refcount decrement during cow.
>From a first check time executing 015 test passed from about 600 seconds
to 70.
This at least prove that refcount updates counts!
Some doubt:
1- place the code in qcow2-refcount.c as it update only refcount and not
  cluster?
2- allow some sort of "begin transaction" / "commit" / "rollback" like 
  databases instead?
3- allow changing tables from different coroutines?

1) If you have a sequence like (1, 2, 4) probably these clusters are all in
the same l2 table but with this code you get two write instead of one.
I'm thinking about a function in qcow2-refcount.c that accept an array of 
cluster
instead of a start + len.

Signed-off-by: Frediano Ziglio 
---
 block/qcow2-cluster.c |   36 ++--
 1 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 81cf77d..da17365 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -675,10 +675,42 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, 
QCowL2Meta *m)
  * Also flush bs->file to get the right order for L2 and refcount update.
  */
 if (j != 0) {
+int64_t old_start = 0, old_end = -2;
+int count = 0;
 for (i = 0; i < j; i++) {
-qcow2_free_any_clusters(bs,
-be64_to_cpu(old_cluster[i]) & ~QCOW_OFLAG_COPIED, 1);
+old_cluster[i] = be64_to_cpu(old_cluster[i]) & ~QCOW_OFLAG_COPIED;
 }
+// XXX sort old_cluster
+for (i = 0; i < j; i++) {
+int64_t cluster = old_cluster[i];
+
+/* group if contiguos */
+if (old_end + 1 == (cluster >> s->cluster_bits)) {
+++old_end;
+continue;
+}
+
+/* handle */
+if (old_end > 0) {
+qcow2_free_any_clusters(bs, old_start << s->cluster_bits, 
old_end - old_start + 1);
+count += old_end - old_start + 1;
+}
+old_end = -2;
+
+/* handle compressed separately */
+if ((cluster & QCOW_OFLAG_COMPRESSED)) {
+qcow2_free_any_clusters(bs, cluster, 1);
+continue;
+}
+
+/* start a new group */
+old_start = old_end = cluster >> s->cluster_bits;
+}
+if (old_end > 0) {
+qcow2_free_any_clusters(bs, old_start << s->cluster_bits, old_end 
- old_start + 1);
+count += old_end - old_start + 1;
+}
+assert(count == j);
 }
 
 ret = 0;
-- 
1.7.1




Re: [Qemu-devel] [PATCH] fix disabling interrupts in sun4u

2011-07-28 Thread tsnsaito
At Thu, 28 Jul 2011 14:50:57 +0200,
Artyom Tarasenko wrote:
> On Thu, Jul 28, 2011 at 2:03 PM,   wrote:
> > At Thu, 28 Jul 2011 13:51:08 +0200,
> > Artyom Tarasenko wrote:
> >> On Thu, Jul 28, 2011 at 12:31 PM,   wrote:
> >> > Hi,
> >> >
> >> > At Mon, 25 Jul 2011 19:22:38 +0200,
> >> > Artyom Tarasenko wrote:
> >> >
> >> >> clear interrupt request if the interrupt priority < CPU pil
> >> >> clear hardware interrupt request if interrupts are disabled
> >> >
> >> > Not directly related to the fix, but I'd like to note a problem
> >> > of hw/sun4u.c interrupt code:
> >> >
> >> > The interrupt code probably mixes hardware interrupts and
> >> > software interrupts.
> >> > %pil is for software interrupts (interrupt_level_n traps).
> >> > %pil can not mask hardware interrupts (interrupt_vector traps);
> >> > the CPU raises interrupt_vector traps even on %pil=15.
> >> > But in cpu_check_irqs() and cpu_set_irq(), hardware interrupts
> >> > seem to be masked by %pil.
> >>
> >> The interrupt_vector traps are currently not implemented, are they?
> >> So it's hard to tell whether they are masked.
> >
> > Yes, interrupt_vector is not implemented yet.
> > I failed to explain the problem.
> > The problem is that cpu_set_irqs() should raise interrupt_vector
> > traps but it raises interrupt_level_n traps actually.
> > sun4uv_init() calls qemu_allocate_irqs() with cpu_set_irq as
> > the 1st argument.  The allocated irqs (the irq variable) are
> > passed to pci_apb_init().  APB should generate interrupt_vector
> > traps (hardware interrupts), not the interrupt_vector_n traps.
> 
> Yes, this is true. But it's more complicated than this: cpu_check_irqs
> also checks tick/stick/hstick interrupts. They should produce the
> interrupt_level_n traps as they currently do.

That's right.
tick/stick/hstick must raise interrupt_level_n traps.

> The patch merely fixes the problem of hanging on a interrupt_vector_n
> trap if the trap handler uses pil for interrupt masking. The problem
> exists independently from interrupt_vector trap generation (as you
> pointed out).

I understand what is the problem that your patch is going to fix.
Thanks for the explanation.

> Do you have objections to this patch in its current form?

No, I don't have any objections.

> > The interrupts from APB would be reported by cpu_set_irq(),
> > but cpu_set_irq() seems to generate interrupt_vector_n traps.
> 
> For me it's not obvious. The interrupt vector not just one line, but
> the vector, which is written in the corresponding CPU register (also
> missing in the current qemu implementation). On the real hardware the
> vector is created by the IOMMU (PBM/APB/...). If qemu intends to
> support multiple chipsets, we should keep it the way it's done on the
> real hardware (for instance the interrupt vectors for on-board devices
> on Ultra-1 and E6500 are not the same).

Sorry, I can't keep up with this vector thing...
Does the CPU receive hardware interrupts as interrupt_vector traps
(trap type=0x60) regardless of the kind of the interrupt controller,
doesn't it?

> I'd suggest APB shall use some other interface for communicating
> interrupts to the CPU. Something like
> cpu_receive_ivec(interrupt_vector).
> 
> >> >> Signed-off-by: Artyom Tarasenko 
> >> >> ---
> >> >>  hw/sun4u.c |    6 --
> >> >>  1 files changed, 4 insertions(+), 2 deletions(-)
> >> >>
> >> >> diff --git a/hw/sun4u.c b/hw/sun4u.c
> >> >> index d7dcaf0..7f95aeb 100644
> >> >> --- a/hw/sun4u.c
> >> >> +++ b/hw/sun4u.c
> >> >> @@ -255,7 +255,7 @@ void cpu_check_irqs(CPUState *env)
> >> >>          pil |= 1 << 14;
> >> >>      }
> >> >>
> >> >> -    if (!pil) {
> >> >> +    if (pil < (2 << env->psrpil)){
> >> >>          if (env->interrupt_request & CPU_INTERRUPT_HARD) {
> >> >>              CPUIRQ_DPRINTF("Reset CPU IRQ (current interrupt %x)\n",
> >> >>                             env->interrupt_index);
> >> >> @@ -287,10 +287,12 @@ void cpu_check_irqs(CPUState *env)
> >> >>                  break;
> >> >>              }
> >> >>          }
> >> >> -    } else {
> >> >> +    } else if (env->interrupt_request & CPU_INTERRUPT_HARD) {
> >> >>          CPUIRQ_DPRINTF("Interrupts disabled, pil=%08x pil_in=%08x 
> >> >> softint=%08x "
> >> >>                         "current interrupt %x\n",
> >> >>                         pil, env->pil_in, env->softint, 
> >> >> env->interrupt_index);
> >> >> +        env->interrupt_index = 0;
> >> >> +        cpu_reset_interrupt(env, CPU_INTERRUPT_HARD);
> >> >>      }
> >> >>  }
> >> >>
> >> >> --
> >> >> 1.7.3.4
> >> >
> >> >
> >> > 
> >> > Tsuneo Saito 
> >> >
> >>
> >>
> >>
> >> --
> >> Regards,
> >> Artyom Tarasenko
> >>
> >> solaris/sparc under qemu blog: http://tyom.blogspot.com/
> >
> > 
> > Tsuneo Saito 
> >
> 
> -- 
> Regards,
> Artyom Tarasenko
> 
> solaris/sparc under qemu blog: http://tyom.blogspot.com/


Tsuneo Saito 



[Qemu-devel] [PATCH 1/4] linux-user: define default cpu model in configure instead of linux-user/main.c

2011-07-28 Thread Laurent Vivier
Signed-off-by: Laurent Vivier 
---
 configure |   15 +++
 linux-user/main.c |   34 +-
 2 files changed, 16 insertions(+), 33 deletions(-)

diff --git a/configure b/configure
index fb8819b..c74a5f9 100755
--- a/configure
+++ b/configure
@@ -3075,6 +3075,7 @@ target_dir="$target"
 config_target_mak=$target_dir/config-target.mak
 target_arch2=`echo $target | cut -d '-' -f 1`
 target_bigendian="no"
+target_default_cpu="any"
 
 case "$target_arch2" in
   
armeb|lm32|m68k|microblaze|mips|mipsn32|mips64|ppc|ppcemb|ppc64|ppc64abi32|s390x|sh4eb|sparc|sparc64|sparc32plus)
@@ -3151,11 +3152,13 @@ TARGET_ABI_DIR=""
 case "$target_arch2" in
   i386)
 target_phys_bits=64
+target_default_cpu="qemu32"
   ;;
   x86_64)
 TARGET_BASE_ARCH=i386
 target_phys_bits=64
 target_long_alignment=8
+target_default_cpu="qemu64"
   ;;
   alpha)
 target_phys_bits=64
@@ -3173,6 +3176,7 @@ case "$target_arch2" in
   cris)
 target_nptl="yes"
 target_phys_bits=32
+target_default_cpu=""
   ;;
   lm32)
 target_phys_bits=32
@@ -3198,12 +3202,14 @@ case "$target_arch2" in
 echo "TARGET_ABI_MIPSO32=y" >> $config_target_mak
 target_nptl="yes"
 target_phys_bits=64
+target_default_cpu="24Kf"
   ;;
   mipsn32|mipsn32el)
 TARGET_ARCH=mipsn32
 TARGET_BASE_ARCH=mips
 echo "TARGET_ABI_MIPSN32=y" >> $config_target_mak
 target_phys_bits=64
+target_default_cpu="20Kc"
   ;;
   mips64|mips64el)
 TARGET_ARCH=mips64
@@ -3211,12 +3217,14 @@ case "$target_arch2" in
 echo "TARGET_ABI_MIPSN64=y" >> $config_target_mak
 target_phys_bits=64
 target_long_alignment=8
+target_default_cpu="20Kc"
   ;;
   ppc)
 gdb_xml_files="power-core.xml power-fpu.xml power-altivec.xml 
power-spe.xml"
 target_phys_bits=32
 target_nptl="yes"
 target_libs_softmmu="$fdt_libs"
+target_default_cpu="750"
   ;;
   ppcemb)
 TARGET_BASE_ARCH=ppc
@@ -3225,6 +3233,7 @@ case "$target_arch2" in
 target_phys_bits=64
 target_nptl="yes"
 target_libs_softmmu="$fdt_libs"
+target_default_cpu="750"
   ;;
   ppc64)
 TARGET_BASE_ARCH=ppc
@@ -3233,6 +3242,7 @@ case "$target_arch2" in
 target_phys_bits=64
 target_long_alignment=8
 target_libs_softmmu="$fdt_libs"
+target_default_cpu="970fx"
   ;;
   ppc64abi32)
 TARGET_ARCH=ppc64
@@ -3242,6 +3252,7 @@ case "$target_arch2" in
 gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml 
power-spe.xml"
 target_phys_bits=64
 target_libs_softmmu="$fdt_libs"
+target_default_cpu="750"
   ;;
   sh4|sh4eb)
 TARGET_ARCH=sh4
@@ -3251,11 +3262,13 @@ case "$target_arch2" in
   ;;
   sparc)
 target_phys_bits=64
+target_default_cpu="Fujitsu MB86904"
   ;;
   sparc64)
 TARGET_BASE_ARCH=sparc
 target_phys_bits=64
 target_long_alignment=8
+target_default_cpu="TI UltraSparc II"
   ;;
   sparc32plus)
 TARGET_ARCH=sparc64
@@ -3263,6 +3276,7 @@ case "$target_arch2" in
 TARGET_ABI_DIR=sparc
 echo "TARGET_ABI32=y" >> $config_target_mak
 target_phys_bits=64
+target_default_cpu="Fujitsu MB86904"
   ;;
   s390x)
 target_nptl="yes"
@@ -3281,6 +3295,7 @@ echo "TARGET_SHORT_ALIGNMENT=$target_short_alignment" >> 
$config_target_mak
 echo "TARGET_INT_ALIGNMENT=$target_int_alignment" >> $config_target_mak
 echo "TARGET_LONG_ALIGNMENT=$target_long_alignment" >> $config_target_mak
 echo "TARGET_LLONG_ALIGNMENT=$target_llong_alignment" >> $config_target_mak
+echo "TARGET_DEFAULT_CPU=\"$target_default_cpu\"" >> $config_target_mak
 echo "TARGET_ARCH=$TARGET_ARCH" >> $config_target_mak
 target_arch_name="`echo $TARGET_ARCH | tr '[:lower:]' '[:upper:]'`"
 echo "TARGET_$target_arch_name=y" >> $config_target_mak
diff --git a/linux-user/main.c b/linux-user/main.c
index 2135b9c..7180cee 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -3082,39 +3082,7 @@ int main(int argc, char **argv, char **envp)
 init_paths(interp_prefix);
 
 if (cpu_model == NULL) {
-#if defined(TARGET_I386)
-#ifdef TARGET_X86_64
-cpu_model = "qemu64";
-#else
-cpu_model = "qemu32";
-#endif
-#elif defined(TARGET_ARM)
-cpu_model = "any";
-#elif defined(TARGET_UNICORE32)
-cpu_model = "any";
-#elif defined(TARGET_M68K)
-cpu_model = "any";
-#elif defined(TARGET_SPARC)
-#ifdef TARGET_SPARC64
-cpu_model = "TI UltraSparc II";
-#else
-cpu_model = "Fujitsu MB86904";
-#endif
-#elif defined(TARGET_MIPS)
-#if defined(TARGET_ABI_MIPSN32) || defined(TARGET_ABI_MIPSN64)
-cpu_model = "20Kc";
-#else
-cpu_model = "24Kf";
-#endif
-#elif defined(TARGET_PPC)
-#ifdef TARGET_PPC64
-cpu_model = "970fx";
-#else
-cpu_model = "750";
-#endif
-#else
-cpu_model = "any";
-#endif
+cpu_model = TARGET_DEFAULT_CPU;
 }
 cpu_exec_init_all(0);
 /* NOTE: we need to init the CPU at this stage to get
-- 
1.7.4.1




[Qemu-devel] [PATCH 4/4] linux-user: define new environment variables

2011-07-28 Thread Laurent Vivier
QEMU_GDB=port allows to define gdb server port to wait on.
QEMU_DEBUG=options allows to activate log file (like -d options)

Signed-off-by: Laurent Vivier 
---
 linux-user/main.c |   11 ---
 qemu-doc.texi |4 
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 7180cee..ff1720b 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -2815,8 +2815,10 @@ static void usage(void)
"-strace  log system calls\n"
"\n"
"Environment variables:\n"
-   "QEMU_STRACE   Print system calls and arguments similar to 
the\n"
-   "  'strace' program.  Enable by setting to any 
value.\n"
+   "QEMU_STRACEPrint system calls and arguments similar to 
the\n"
+   "   'strace' program.  Enable by setting to any 
value.\n"
+   "QEMU_DEBUG=options Activate log. Use same options as '-d' 
options\n"
+   "QEMU_GDB=port  Wait gdb connection to port\n"
"You can use -E and -U options to set/unset environment variables\n"
"for target process.  It is possible to provide several variables\n"
"by repeating the option.  For example:\n"
@@ -2872,7 +2874,7 @@ int main(int argc, char **argv, char **envp)
 const char *filename;
 const char *cpu_model;
 const char *log_file = DEBUG_LOGFILE;
-const char *log_mask = NULL;
+const char *log_mask = getenv("QEMU_DEBUG");
 struct target_pt_regs regs1, *regs = ®s1;
 struct image_info info1, *info = &info1;
 struct linux_binprm bprm;
@@ -2919,6 +2921,9 @@ int main(int argc, char **argv, char **envp)
 #if defined(cpudef_setup)
 cpudef_setup(); /* parse cpu definitions in target config file (TBD) */
 #endif
+if (getenv("QEMU_GDB")) {
+  gdbstub_port = atoi(getenv("QEMU_GDB"));
+}
 
 optind = 1;
 for(;;) {
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 47e1991..330f9d0 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -2285,6 +2285,10 @@ space emulator hasn't implemented ptrace).  At the 
moment this is
 incomplete.  All system calls that don't have a specific argument
 format are printed with information for six arguments.  Many
 flag-style arguments don't have decoders and will show up as numbers.
+@item QEMU_DEBUG=options
+Activate log. Use same options as '-d' options.
+@item QEMU_GDB=port
+Wait gdb connection to port.
 @end table
 
 @node Other binaries
-- 
1.7.4.1




[Qemu-devel] [PATCH 3/4] linux-user,m68k: display default cpu

2011-07-28 Thread Laurent Vivier
Signed-off-by: Laurent Vivier 
---
 target-m68k/helper.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/target-m68k/helper.c b/target-m68k/helper.c
index a936fe7..f5d33cd 100644
--- a/target-m68k/helper.c
+++ b/target-m68k/helper.c
@@ -57,6 +57,11 @@ void m68k_cpu_list(FILE *f, fprintf_function cpu_fprintf)
 unsigned int i;
 
 for (i = 0; m68k_cpu_defs[i].name; i++) {
+if (strcmp(m68k_cpu_defs[i].name, TARGET_DEFAULT_CPU) == 0) {
+(*cpu_fprintf)(f, " >");
+} else {
+(*cpu_fprintf)(f, "  ");
+}
 (*cpu_fprintf)(f, "%s\n", m68k_cpu_defs[i].name);
 }
 }
-- 
1.7.4.1




[Qemu-devel] [PATCH 2/4] linux-user: specify the cpu model during configure

2011-07-28 Thread Laurent Vivier
This patch allows to set the default cpu model for a given architecture,
for instance:

 configure --target-list=m68k-linux-user --m68k-default-cpu=m68040

Signed-off-by: Laurent Vivier 
---
 configure |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index c74a5f9..d54f0ed 100755
--- a/configure
+++ b/configure
@@ -527,6 +527,10 @@ for opt do
   ;;
   --target-list=*) target_list="$optarg"
   ;;
+  --*-default-cpu=*)
+tmp=`expr "x$opt" : 'x--\(.*\)-default-cpu=.*'`
+eval ${tmp}_default_cpu="$optarg"
+  ;;
   --enable-trace-backend=*) trace_backend="$optarg"
   ;;
   --with-trace-file=*) trace_file="$optarg"
@@ -916,6 +920,7 @@ echo "   use %M for cpu name 
[$interp_prefix]"
 echo "  --target-list=LIST   set target list (default: build everything)"
 echo "Available targets: $default_target_list" | \
 fold -s -w 53 | sed -e 's/^/   /'
+echo "  --ARCH-default-cpu=CPU   set the default cpu for a given architecture"
 echo ""
 echo "Advanced options (experts only):"
 echo "  --source-path=PATH   path of source code [$source_path]"
@@ -3291,6 +3296,10 @@ case "$target_arch2" in
 exit 1
   ;;
 esac
+tmp_target_default_cpu=`eval echo \\$${target_arch2}_default_cpu`
+if [ "x$tmp_target_default_cpu" != "x" ] ; then
+  target_default_cpu="$tmp_target_default_cpu"
+fi
 echo "TARGET_SHORT_ALIGNMENT=$target_short_alignment" >> $config_target_mak
 echo "TARGET_INT_ALIGNMENT=$target_int_alignment" >> $config_target_mak
 echo "TARGET_LONG_ALIGNMENT=$target_long_alignment" >> $config_target_mak
-- 
1.7.4.1




[Qemu-devel] [PATCH 0/4] Set of patches for chrooted environment

2011-07-28 Thread Laurent Vivier
This set of patches helps to use qemu-linux-user in a chrooted environment.

It mostly allows to define the default cpu model as we can't use '-cpu' 
argument.
The last one defines enviromnent variables to be able to use log file and 
gdb server  ('-d' and '-g' arguments).

NOTE: I saw some comments in the mailing list about environment variables,
if patch #4 dislikes, I've also a patch providing a "qemu-wrapper" with the 
same behavior.

[PATCH 1/4] linux-user: define default cpu model in configure instead of 
linux-user/main.c
[PATCH 2/4] linux-user: specify the cpu model during configure
[PATCH 3/4] linux-user,m68k: display default cpu
[PATCH 4/4] linux-user: define new environment variables



Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Avi Kivity

On 07/28/2011 04:31 PM, Luiz Capitulino wrote:

On Thu, 28 Jul 2011 10:23:22 +0300
Avi Kivity  wrote:

>  On 07/28/2011 12:44 AM, Blue Swirl wrote:
>  >  On Wed, Jul 27, 2011 at 9:42 PM, Luiz Capitulino  
 wrote:
>  >  >   This function should be used when the VM is not supposed to resume
>  >  >   execution (eg. by issuing 'cont' monitor command).
>  >  >
>  >  >   Today, we allow the user to resume execution even when:
>  >  >
>  >  > o the guest shuts down and -no-shutdown is used
>  >  > o there's a kvm internal error
>  >  > o loading the VM state with -loadvm or "loadvm" in the monitor fails
>  >  >
>  >  >   I think only badness can happen from the cases above.
>  >
>  >  I'd suppose a system_reset should bring the system back to sanity and
>  >  then clear vm_permanent_stopped (where's -ly?)

What's -ly?



permanent-ly.


>  >  except maybe for KVM
>  >  internal error if that can't be recovered. Then it would not very
>  >  permanent anymore, so the name would need adjusting.
>
>  Currently, all kvm internal errors are recoverable by reset (and
>  possibly by fiddling with memory/registers).

Ok, but a poweroff in the guest isn't recoverable with system_reset
right? Or does it depend on the guest?


Right, it's not recoverable if you shut the power down where the tractor 
beam is coupled to the main reactor.



I get funny results if qemu is started with -no-shutdown and I run cont after
a poweroff in a F15 guest. Sometimes qemu will exit after a few seconds,
sometimes 'info status' will say 'running'.


Yes, best fixed.

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC] Introduce vm_stop_permanent()

2011-07-28 Thread Luiz Capitulino
On Thu, 28 Jul 2011 10:23:22 +0300
Avi Kivity  wrote:

> On 07/28/2011 12:44 AM, Blue Swirl wrote:
> > On Wed, Jul 27, 2011 at 9:42 PM, Luiz Capitulino  
> > wrote:
> > >  This function should be used when the VM is not supposed to resume
> > >  execution (eg. by issuing 'cont' monitor command).
> > >
> > >  Today, we allow the user to resume execution even when:
> > >
> > >o the guest shuts down and -no-shutdown is used
> > >o there's a kvm internal error
> > >o loading the VM state with -loadvm or "loadvm" in the monitor fails
> > >
> > >  I think only badness can happen from the cases above.
> >
> > I'd suppose a system_reset should bring the system back to sanity and
> > then clear vm_permanent_stopped (where's -ly?)

What's -ly?

> > except maybe for KVM
> > internal error if that can't be recovered. Then it would not very
> > permanent anymore, so the name would need adjusting.
> 
> Currently, all kvm internal errors are recoverable by reset (and 
> possibly by fiddling with memory/registers).

Ok, but a poweroff in the guest isn't recoverable with system_reset
right? Or does it depend on the guest?

I get funny results if qemu is started with -no-shutdown and I run cont after
a poweroff in a F15 guest. Sometimes qemu will exit after a few seconds,
sometimes 'info status' will say 'running'.



[Qemu-devel] [PATCH RESEND 3/3] xen: use uint64_t instead of target_ulong in cpu_ioreq_move

2011-07-28 Thread stefano.stabellini
From: Stefano Stabellini 

cpu_ioreq_move might move 8 bytes at a time so we must make sure that
the temporary variable can hold 8 bytes.

Signed-off-by: Stefano Stabellini 
---
 xen-all.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/xen-all.c b/xen-all.c
index 9eaeac1..3611e19 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -620,7 +620,7 @@ static void cpu_ioreq_move(ioreq_t *req)
 }
 }
 } else {
-target_ulong tmp;
+uint64_t tmp;
 
 if (req->dir == IOREQ_READ) {
 for (i = 0; i < req->count; i++) {
-- 
1.7.2.3




[Qemu-devel] [PATCH RESEND 1/3] Introduce a new 'connected' xendev op called when Connected.

2011-07-28 Thread stefano.stabellini
From: John Haxby 

Rename the existing xendev 'connect' op to 'initialised' and introduce
a new 'connected' op.  This new op, if defined, is called when the
backend is connected.  Note that since there is no state transition this
may be called more than once.

Signed-off-by: John Haxby 
Acked-by: Stefano Stabellini 
---
 hw/xen_backend.c |   39 +--
 hw/xen_backend.h |3 ++-
 hw/xen_console.c |4 ++--
 hw/xen_disk.c|2 +-
 hw/xen_nic.c |2 +-
 hw/xenfb.c   |8 
 6 files changed, 43 insertions(+), 15 deletions(-)

diff --git a/hw/xen_backend.c b/hw/xen_backend.c
index d881fa2..c506dfe 100644
--- a/hw/xen_backend.c
+++ b/hw/xen_backend.c
@@ -421,13 +421,13 @@ static int xen_be_try_init(struct XenDevice *xendev)
 }
 
 /*
- * Try to connect xendev.  Depends on the frontend being ready
+ * Try to initialise xendev.  Depends on the frontend being ready
  * for it (shared ring and evtchn info in xenstore, state being
  * Initialised or Connected).
  *
  * Goes to Connected on success.
  */
-static int xen_be_try_connect(struct XenDevice *xendev)
+static int xen_be_try_initialise(struct XenDevice *xendev)
 {
 int rc = 0;
 
@@ -441,11 +441,11 @@ static int xen_be_try_connect(struct XenDevice *xendev)
 }
 }
 
-if (xendev->ops->connect) {
-rc = xendev->ops->connect(xendev);
+if (xendev->ops->initialise) {
+rc = xendev->ops->initialise(xendev);
 }
 if (rc != 0) {
-xen_be_printf(xendev, 0, "connect() failed\n");
+xen_be_printf(xendev, 0, "initialise() failed\n");
 return rc;
 }
 
@@ -454,6 +454,28 @@ static int xen_be_try_connect(struct XenDevice *xendev)
 }
 
 /*
+ * Try to let xendev know that it is connected.  Depends on the
+ * frontend being Connected.  Note that this may be called more
+ * than once since the backend state is not modified.
+ */
+static void xen_be_try_connected(struct XenDevice *xendev)
+{
+if (!xendev->ops->connected)
+   return;
+
+if (xendev->fe_state != XenbusStateConnected) {
+   if (xendev->ops->flags & DEVOPS_FLAG_IGNORE_STATE) {
+   xen_be_printf(xendev, 2, "frontend not ready, ignoring\n");
+   } else {
+   xen_be_printf(xendev, 2, "frontend not ready (yet)\n");
+   return;
+   }
+}
+
+xendev->ops->connected(xendev);
+}
+
+/*
  * Teardown connection.
  *
  * Goes to Closed when done.
@@ -508,7 +530,12 @@ void xen_be_check_state(struct XenDevice *xendev)
 rc = xen_be_try_init(xendev);
 break;
 case XenbusStateInitWait:
-rc = xen_be_try_connect(xendev);
+rc = xen_be_try_initialise(xendev);
+break;
+case XenbusStateConnected:
+/* xendev->be_state doesn't change */
+xen_be_try_connected(xendev);
+rc = -1;
 break;
 case XenbusStateClosed:
 rc = xen_be_try_reset(xendev);
diff --git a/hw/xen_backend.h b/hw/xen_backend.h
index 6401c85..3305630 100644
--- a/hw/xen_backend.h
+++ b/hw/xen_backend.h
@@ -21,7 +21,8 @@ struct XenDevOps {
 uint32_t  flags;
 void  (*alloc)(struct XenDevice *xendev);
 int   (*init)(struct XenDevice *xendev);
-int   (*connect)(struct XenDevice *xendev);
+int   (*initialise)(struct XenDevice *xendev);
+void  (*connected)(struct XenDevice *xendev);
 void  (*event)(struct XenDevice *xendev);
 void  (*disconnect)(struct XenDevice *xendev);
 int   (*free)(struct XenDevice *xendev);
diff --git a/hw/xen_console.c b/hw/xen_console.c
index 8ef104c..c6bea81 100644
--- a/hw/xen_console.c
+++ b/hw/xen_console.c
@@ -212,7 +212,7 @@ out:
 return ret;
 }
 
-static int con_connect(struct XenDevice *xendev)
+static int con_initialise(struct XenDevice *xendev)
 {
 struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
 int limit;
@@ -273,7 +273,7 @@ struct XenDevOps xen_console_ops = {
 .size   = sizeof(struct XenConsole),
 .flags  = DEVOPS_FLAG_IGNORE_STATE,
 .init   = con_init,
-.connect= con_connect,
+.initialise = con_initialise,
 .event  = con_event,
 .disconnect = con_disconnect,
 };
diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index add815f..4abdaf9 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -846,7 +846,7 @@ struct XenDevOps xen_blkdev_ops = {
 .flags  = DEVOPS_FLAG_NEED_GNTDEV,
 .alloc  = blk_alloc,
 .init   = blk_init,
-.connect= blk_connect,
+.initialise= blk_connect,
 .disconnect = blk_disconnect,
 .event  = blk_event,
 .free   = blk_free,
diff --git a/hw/xen_nic.c b/hw/xen_nic.c
index ff86491..9a97d3e 100644
--- a/hw/xen_nic.c
+++ b/hw/xen_nic.c
@@ -433,7 +433,7 @@ struct XenDevOps xen_netdev_ops = {
 .size   = sizeof(struct XenNetDev),
 .flags  = DEVOPS_FLAG_NEED_GNTDEV,
 .init   = net_init,
-.connect 

[Qemu-devel] [PATCH RESEND 2/3] Move the xenfb pointer handler to the connected method

2011-07-28 Thread stefano.stabellini
From: John Haxby 

Ensure that we read "request-abs-pointer" after the frontend has written
it.  This means that we will correctly set up an ansolute or relative
pointer handler correctly.

Signed-off-by: John Haxby 
Acked-by: Stefano Stabellini 
---
 hw/xenfb.c |   19 ++-
 1 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/hw/xenfb.c b/hw/xenfb.c
index c001ab9..d964318 100644
--- a/hw/xenfb.c
+++ b/hw/xenfb.c
@@ -356,10 +356,6 @@ static int input_initialise(struct XenDevice *xendev)
 struct XenInput *in = container_of(xendev, struct XenInput, c.xendev);
 int rc;
 
-if (xenstore_read_fe_int(xendev, "request-abs-pointer",
- &in->abs_pointer_wanted) == -1)
-   in->abs_pointer_wanted = 0;
-
 if (!in->c.ds) {
 char *vfb = xenstore_read_str(NULL, "device/vfb");
 if (vfb == NULL) {
@@ -377,10 +373,22 @@ static int input_initialise(struct XenDevice *xendev)
return rc;
 
 qemu_add_kbd_event_handler(xenfb_key_event, in);
+return 0;
+}
+
+static void input_connected(struct XenDevice *xendev)
+{
+struct XenInput *in = container_of(xendev, struct XenInput, c.xendev);
+
+if (xenstore_read_fe_int(xendev, "request-abs-pointer",
+ &in->abs_pointer_wanted) == -1)
+   in->abs_pointer_wanted = 0;
+
+if (in->qmouse)
+   qemu_remove_mouse_event_handler(in->qmouse);
 in->qmouse = qemu_add_mouse_event_handler(xenfb_mouse_event, in,
  in->abs_pointer_wanted,
  "Xen PVFB Mouse");
-return 0;
 }
 
 static void input_disconnect(struct XenDevice *xendev)
@@ -960,6 +968,7 @@ struct XenDevOps xen_kbdmouse_ops = {
 .size   = sizeof(struct XenInput),
 .init   = input_init,
 .initialise = input_initialise,
+.connected  = input_connected,
 .disconnect = input_disconnect,
 .event  = input_event,
 };
-- 
1.7.2.3




Re: [Qemu-devel] [V5 Patch 3/4]Qemu: Command "block_set" for dynamic block params change

2011-07-28 Thread Kevin Wolf
Am 28.07.2011 15:10, schrieb Stefan Hajnoczi:
> On Thu, Jul 28, 2011 at 11:23 AM, Kevin Wolf  wrote:
>> Am 27.07.2011 16:51, schrieb Stefan Hajnoczi:
>>> I'll think about this some more, there are a couple of solutions like
>>> keeping only the file descriptor around, introducing a flush command
>>> that makes sure the file is in a clean state, or changing QED to not
>>> do this.
>>
>> Changing the format drivers doesn't really look like the right solution.
>>
>> Keeping the fd around looks okay, we can probably achieve this by
>> introducing a bdrv_reopen function. It means that we may need to reopen
>> the format layer, but it can't really fail.
> 
> My concern is that this assumes a single file descriptor.  For vmdk
> there may be multiple split files.
> 
> I'm almost starting to think bdrv_reopen() should be recursive down
> the BlockDriverState tree.

Yes, VMDK would have to call bdrv_reopen() for each file that it uses.

Kevin



Re: [Qemu-devel] Block layer roadmap

2011-07-28 Thread Stefan Hajnoczi
On Thu, Jul 28, 2011 at 2:10 PM, Kevin Wolf  wrote:
> Am 28.07.2011 14:54, schrieb Stefan Hajnoczi:
>> On Thu, Jul 28, 2011 at 1:35 PM, Kevin Wolf  wrote:
>>> Am 28.07.2011 14:09, schrieb Christoph Hellwig:
 On Wed, Jul 27, 2011 at 01:37:31PM +0100, Stefan Hajnoczi wrote:
> Coroutines in the block layer [Kevin]
>  * Programming model to simplify block drivers without blocking QEMU 
> threads

 Can anyone explain what the whole point of this is?  It really just is
 a bit of syntactic sugar for the current async state machines.  What does
 it buy us over going for real threading?
>>>
>>> The only current block driver that really does everything in an async
>>> state machine is qed. It's definitely not nice code, and having to
>>> convert all of the other block drivers to this would be a lot of work.
>>
>> Thanks Kevin :).
>
> I certainly didn't mean to attack your code or even yourself. It's not
> that qed is done particularly bad or anything. That the code isn't
> really nice is just the natural result of the callback-based programming
> model.

No worries, no offence taken :)

Stefan



Re: [Qemu-devel] [V5 Patch 3/4]Qemu: Command "block_set" for dynamic block params change

2011-07-28 Thread Stefan Hajnoczi
On Thu, Jul 28, 2011 at 11:23 AM, Kevin Wolf  wrote:
> Am 27.07.2011 16:51, schrieb Stefan Hajnoczi:
>> 2011/7/27 Michael Tokarev :
>>> 27.07.2011 15:30, Supriya Kannery wrote:
 New command "block_set" added for dynamically changing any of the block
 device parameters. For now, dynamic setting of hostcache params using this
 command is implemented. Other block device parameter changes, can be
 integrated in similar lines.

 Signed-off-by: Supriya Kannery 

 ---
  block.c         |   54 +
  block.h         |    2 +
  blockdev.c      |   61 
 
  blockdev.h      |    1
  hmp-commands.hx |   14 
  qemu-config.c   |   13 +++
  qemu-option.c   |   25 ++
  qemu-option.h   |    2 +
  qmp-commands.hx |   28 +
  9 files changed, 200 insertions(+)

 Index: qemu/block.c
 ===
 --- qemu.orig/block.c
 +++ qemu/block.c
 @@ -651,6 +651,34 @@ unlink_and_fail:
      return ret;
  }

 +int bdrv_reopen(BlockDriverState *bs, int bdrv_flags)
 +{
 +    BlockDriver *drv = bs->drv;
 +    int ret = 0, open_flags;
 +
 +    /* Quiesce IO for the given block device */
 +    qemu_aio_flush();
 +    if (bdrv_flush(bs)) {
 +        qerror_report(QERR_DATA_SYNC_FAILED, bs->device_name);
 +        return ret;
 +    }
 +    open_flags = bs->open_flags;
 +    bdrv_close(bs);
 +
 +    ret = bdrv_open(bs, bs->filename, bdrv_flags, drv);
 +    if (ret < 0) {
 +        /* Reopen failed. Try to open with original flags */
 +        qerror_report(QERR_REOPEN_FILE_FAILED, bs->filename);
 +        ret = bdrv_open(bs, bs->filename, open_flags, drv);
 +        if (ret < 0) {
 +            /* Reopen failed with orig and modified flags */
 +            abort();
 +        }
>>>
>>> Can we please avoid this stuff completely?  Just keep the
>>> old device open still, until you're sure new one is ok.
>>>
>>> Or else it will be quite dangerous command in many cases.
>>> For example, after -runas/-chroot, or additional selinux
>>> settings or whatnot.  And in this case, instead of merely
>>> returning an error, we'll see abort().  Boom.
>>
>> Slight complication for image formats that use a dirty bit here.  QED
>> has a dirty bit.  VMDK also specifies one but we don't implement it
>> today.
>>
>> If the image file is dirty then all its metadata will be scanned for
>> consistency when it is opened.  This increases the bdrv_open() time
>> and hence the downtime of the VM.  So it is not ideal to open the
>> image file twice, even though there is no consistency problem.
>
> In general I really like understatements, but opening an image file
> twice isn't only "not ideal", but should be considered verboten.

Point taken.

>> I'll think about this some more, there are a couple of solutions like
>> keeping only the file descriptor around, introducing a flush command
>> that makes sure the file is in a clean state, or changing QED to not
>> do this.
>
> Changing the format drivers doesn't really look like the right solution.
>
> Keeping the fd around looks okay, we can probably achieve this by
> introducing a bdrv_reopen function. It means that we may need to reopen
> the format layer, but it can't really fail.

My concern is that this assumes a single file descriptor.  For vmdk
there may be multiple split files.

I'm almost starting to think bdrv_reopen() should be recursive down
the BlockDriverState tree.

Stefan



Re: [Qemu-devel] Block layer roadmap

2011-07-28 Thread Kevin Wolf
Am 28.07.2011 14:54, schrieb Stefan Hajnoczi:
> On Thu, Jul 28, 2011 at 1:35 PM, Kevin Wolf  wrote:
>> Am 28.07.2011 14:09, schrieb Christoph Hellwig:
>>> On Wed, Jul 27, 2011 at 01:37:31PM +0100, Stefan Hajnoczi wrote:
 Coroutines in the block layer [Kevin]
  * Programming model to simplify block drivers without blocking QEMU 
 threads
>>>
>>> Can anyone explain what the whole point of this is?  It really just is
>>> a bit of syntactic sugar for the current async state machines.  What does
>>> it buy us over going for real threading?
>>
>> The only current block driver that really does everything in an async
>> state machine is qed. It's definitely not nice code, and having to
>> convert all of the other block drivers to this would be a lot of work.
> 
> Thanks Kevin :). 

I certainly didn't mean to attack your code or even yourself. It's not
that qed is done particularly bad or anything. That the code isn't
really nice is just the natural result of the callback-based programming
model.

>> So if it only means that we're making things async that would block the
>> VCPU until now, isn't that a great improvement already?
>>
>> The advantage compared to threading is that it allows an easy and
>> incremental transition.
> 
> Also remember that we already have a threads-based implementation of
> coroutines.  Later on we can do fine-grained locking and switch to
> threads directly instead of coroutines, if need be.

Might be an option for the future, but for now there's enough left to do
to take real advantage of coroutines.

Kevin



Re: [Qemu-devel] [PATCH] fix disabling interrupts in sun4u

2011-07-28 Thread Artyom Tarasenko
On Thu, Jul 28, 2011 at 2:10 PM, Tsuneo Saito  wrote:
> 2011/7/28 :
>>
>> At Thu, 28 Jul 2011 13:51:08 +0200,
>> Artyom Tarasenko wrote:
>> > On Thu, Jul 28, 2011 at 12:31 PM,   wrote:
>> > > Hi,
>> > >
>> > > At Mon, 25 Jul 2011 19:22:38 +0200,
>> > > Artyom Tarasenko wrote:
>> > >
>> > >> clear interrupt request if the interrupt priority < CPU pil
>> > >> clear hardware interrupt request if interrupts are disabled
>> > >
>> > > Not directly related to the fix, but I'd like to note a problem
>> > > of hw/sun4u.c interrupt code:
>> > >
>> > > The interrupt code probably mixes hardware interrupts and
>> > > software interrupts.
>> > > %pil is for software interrupts (interrupt_level_n traps).
>> > > %pil can not mask hardware interrupts (interrupt_vector traps);
>> > > the CPU raises interrupt_vector traps even on %pil=15.
>> > > But in cpu_check_irqs() and cpu_set_irq(), hardware interrupts
>> > > seem to be masked by %pil.
>> >
>> > The interrupt_vector traps are currently not implemented, are they?
>> > So it's hard to tell whether they are masked.
>>
>> Yes, interrupt_vector is not implemented yet.
>
> Ah, I intended to write:
>  No, interrupt_vector is not implemented yet.
> I'm very sorry for my engrish :-(

Not having English as my native language, I even haven't noticed
something wrong here.

Despite what they say the International language is not English, but
broken English.  :)

>> I failed to explain the problem.
>> The problem is that cpu_set_irqs() should raise interrupt_vector
>> traps but it raises interrupt_level_n traps actually.
>>
>> sun4uv_init() calls qemu_allocate_irqs() with cpu_set_irq as
>> the 1st argument.  The allocated irqs (the irq variable) are
>> passed to pci_apb_init().  APB should generate interrupt_vector
>> traps (hardware interrupts), not the interrupt_vector_n traps.
>> The interrupts from APB would be reported by cpu_set_irq(),
>> but cpu_set_irq() seems to generate interrupt_vector_n traps.
>>
>> > >> Signed-off-by: Artyom Tarasenko 
>> > >> ---
>> > >>  hw/sun4u.c |    6 --
>> > >>  1 files changed, 4 insertions(+), 2 deletions(-)
>> > >>
>> > >> diff --git a/hw/sun4u.c b/hw/sun4u.c
>> > >> index d7dcaf0..7f95aeb 100644
>> > >> --- a/hw/sun4u.c
>> > >> +++ b/hw/sun4u.c
>> > >> @@ -255,7 +255,7 @@ void cpu_check_irqs(CPUState *env)
>> > >>          pil |= 1 << 14;
>> > >>      }
>> > >>
>> > >> -    if (!pil) {
>> > >> +    if (pil < (2 << env->psrpil)){
>> > >>          if (env->interrupt_request & CPU_INTERRUPT_HARD) {
>> > >>              CPUIRQ_DPRINTF("Reset CPU IRQ (current interrupt %x)\n",
>> > >>                             env->interrupt_index);
>> > >> @@ -287,10 +287,12 @@ void cpu_check_irqs(CPUState *env)
>> > >>                  break;
>> > >>              }
>> > >>          }
>> > >> -    } else {
>> > >> +    } else if (env->interrupt_request & CPU_INTERRUPT_HARD) {
>> > >>          CPUIRQ_DPRINTF("Interrupts disabled, pil=%08x pil_in=%08x 
>> > >> softint=%08x "
>> > >>                         "current interrupt %x\n",
>> > >>                         pil, env->pil_in, env->softint, 
>> > >> env->interrupt_index);
>> > >> +        env->interrupt_index = 0;
>> > >> +        cpu_reset_interrupt(env, CPU_INTERRUPT_HARD);
>> > >>      }
>> > >>  }
>> > >>
>> > >> --
>> > >> 1.7.3.4
>> > >
>> > >
>> > > 
>> > > Tsuneo Saito 
>> > >
>> >
>> >
>> >
>> > --
>> > Regards,
>> > Artyom Tarasenko
>> >
>> > solaris/sparc under qemu blog: http://tyom.blogspot.com/
>>
>> 
>> Tsuneo Saito 
>



-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/



Re: [Qemu-devel] Block layer roadmap

2011-07-28 Thread Stefan Hajnoczi
On Thu, Jul 28, 2011 at 1:35 PM, Kevin Wolf  wrote:
> Am 28.07.2011 14:09, schrieb Christoph Hellwig:
>> On Wed, Jul 27, 2011 at 01:37:31PM +0100, Stefan Hajnoczi wrote:
>>> Coroutines in the block layer [Kevin]
>>>  * Programming model to simplify block drivers without blocking QEMU threads
>>
>> Can anyone explain what the whole point of this is?  It really just is
>> a bit of syntactic sugar for the current async state machines.  What does
>> it buy us over going for real threading?
>
> The only current block driver that really does everything in an async
> state machine is qed. It's definitely not nice code, and having to
> convert all of the other block drivers to this would be a lot of work.

Thanks Kevin :).  I do agree with the clumsiness of async callback
programming - a lot of code is spent bundling and unbundling variables
to pass between functions, not to mention that the control flow is
much harder to follow.

> So if it only means that we're making things async that would block the
> VCPU until now, isn't that a great improvement already?
>
> The advantage compared to threading is that it allows an easy and
> incremental transition.

Also remember that we already have a threads-based implementation of
coroutines.  Later on we can do fine-grained locking and switch to
threads directly instead of coroutines, if need be.

Stefan



Re: [Qemu-devel] Block layer roadmap

2011-07-28 Thread Anthony Liguori

On 07/28/2011 07:09 AM, Christoph Hellwig wrote:

On Wed, Jul 27, 2011 at 01:37:31PM +0100, Stefan Hajnoczi wrote:

Coroutines in the block layer [Kevin]
  * Programming model to simplify block drivers without blocking QEMU threads


Can anyone explain what the whole point of this is?  It really just is
a bit of syntactic sugar for the current async state machines.  What does
it buy us over going for real threading?


It is threading--just with a common locking model where a single big 
lock is held to make up for the fact that most of QEMU isn't reentrant.


By restructuring the code to be threaded, we can incrementally remove 
the big lock if we audit for use of non-reentrant functions and 
introduce more granular locking.


The whole ucontext/setjmp thing is just an optimization.  I would hope 
it entirely disappears long term as we promote coroutines to full threads.


Regards,

Anthony Liguori









Re: [Qemu-devel] [PATCH] fix disabling interrupts in sun4u

2011-07-28 Thread Artyom Tarasenko
On Thu, Jul 28, 2011 at 2:03 PM,   wrote:
> At Thu, 28 Jul 2011 13:51:08 +0200,
> Artyom Tarasenko wrote:
>> On Thu, Jul 28, 2011 at 12:31 PM,   wrote:
>> > Hi,
>> >
>> > At Mon, 25 Jul 2011 19:22:38 +0200,
>> > Artyom Tarasenko wrote:
>> >
>> >> clear interrupt request if the interrupt priority < CPU pil
>> >> clear hardware interrupt request if interrupts are disabled
>> >
>> > Not directly related to the fix, but I'd like to note a problem
>> > of hw/sun4u.c interrupt code:
>> >
>> > The interrupt code probably mixes hardware interrupts and
>> > software interrupts.
>> > %pil is for software interrupts (interrupt_level_n traps).
>> > %pil can not mask hardware interrupts (interrupt_vector traps);
>> > the CPU raises interrupt_vector traps even on %pil=15.
>> > But in cpu_check_irqs() and cpu_set_irq(), hardware interrupts
>> > seem to be masked by %pil.
>>
>> The interrupt_vector traps are currently not implemented, are they?
>> So it's hard to tell whether they are masked.
>
> Yes, interrupt_vector is not implemented yet.
> I failed to explain the problem.
> The problem is that cpu_set_irqs() should raise interrupt_vector
> traps but it raises interrupt_level_n traps actually.
> sun4uv_init() calls qemu_allocate_irqs() with cpu_set_irq as
> the 1st argument.  The allocated irqs (the irq variable) are
> passed to pci_apb_init().  APB should generate interrupt_vector
> traps (hardware interrupts), not the interrupt_vector_n traps.

Yes, this is true. But it's more complicated than this: cpu_check_irqs
also checks tick/stick/hstick interrupts. They should produce the
interrupt_level_n traps as they currently do.

The patch merely fixes the problem of hanging on a interrupt_vector_n
trap if the trap handler uses pil for interrupt masking. The problem
exists independently from interrupt_vector trap generation (as you
pointed out).

Do you have objections to this patch in its current form?

> The interrupts from APB would be reported by cpu_set_irq(),
> but cpu_set_irq() seems to generate interrupt_vector_n traps.

For me it's not obvious. The interrupt vector not just one line, but
the vector, which is written in the corresponding CPU register (also
missing in the current qemu implementation). On the real hardware the
vector is created by the IOMMU (PBM/APB/...). If qemu intends to
support multiple chipsets, we should keep it the way it's done on the
real hardware (for instance the interrupt vectors for on-board devices
on Ultra-1 and E6500 are not the same).

I'd suggest APB shall use some other interface for communicating
interrupts to the CPU. Something like
cpu_receive_ivec(interrupt_vector).

>> >> Signed-off-by: Artyom Tarasenko 
>> >> ---
>> >>  hw/sun4u.c |    6 --
>> >>  1 files changed, 4 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/hw/sun4u.c b/hw/sun4u.c
>> >> index d7dcaf0..7f95aeb 100644
>> >> --- a/hw/sun4u.c
>> >> +++ b/hw/sun4u.c
>> >> @@ -255,7 +255,7 @@ void cpu_check_irqs(CPUState *env)
>> >>          pil |= 1 << 14;
>> >>      }
>> >>
>> >> -    if (!pil) {
>> >> +    if (pil < (2 << env->psrpil)){
>> >>          if (env->interrupt_request & CPU_INTERRUPT_HARD) {
>> >>              CPUIRQ_DPRINTF("Reset CPU IRQ (current interrupt %x)\n",
>> >>                             env->interrupt_index);
>> >> @@ -287,10 +287,12 @@ void cpu_check_irqs(CPUState *env)
>> >>                  break;
>> >>              }
>> >>          }
>> >> -    } else {
>> >> +    } else if (env->interrupt_request & CPU_INTERRUPT_HARD) {
>> >>          CPUIRQ_DPRINTF("Interrupts disabled, pil=%08x pil_in=%08x 
>> >> softint=%08x "
>> >>                         "current interrupt %x\n",
>> >>                         pil, env->pil_in, env->softint, 
>> >> env->interrupt_index);
>> >> +        env->interrupt_index = 0;
>> >> +        cpu_reset_interrupt(env, CPU_INTERRUPT_HARD);
>> >>      }
>> >>  }
>> >>
>> >> --
>> >> 1.7.3.4
>> >
>> >
>> > 
>> > Tsuneo Saito 
>> >
>>
>>
>>
>> --
>> Regards,
>> Artyom Tarasenko
>>
>> solaris/sparc under qemu blog: http://tyom.blogspot.com/
>
> 
> Tsuneo Saito 
>

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/



Re: [Qemu-devel] [PATCH v2] pci: Common overflow prevention

2011-07-28 Thread Isaku Yamahata
On Thu, Jul 28, 2011 at 11:40:21AM +0300, Michael S. Tsirkin wrote:
> On Thu, Jul 28, 2011 at 04:23:24PM +0900, Isaku Yamahata wrote:
> > This might be a bit late comment...
> > 
> > On Fri, Jul 22, 2011 at 11:05:01AM +0200, Jan Kiszka wrote:
> > > diff --git a/hw/pci_host.c b/hw/pci_host.c
> > > index 728e2d4..bfdc321 100644
> > > --- a/hw/pci_host.c
> > > +++ b/hw/pci_host.c
> > > @@ -47,17 +47,33 @@ static inline PCIDevice *pci_dev_find_by_addr(PCIBus 
> > > *bus, uint32_t addr)
> > >  return pci_find_device(bus, bus_num, devfn);
> > >  }
> > >  
> > > +void pci_config_write_common(PCIDevice *pci_dev, uint32_t addr,
> > > + uint32_t limit, uint32_t val, uint32_t len)
> > > +{
> > > +assert(len <= 4);
> > > +pci_dev->config_write(pci_dev, addr, val, MIN(len, limit - addr));
> > > +}
> > > +
> > > +uint32_t pci_config_read_common(PCIDevice *pci_dev, uint32_t addr,
> > > +uint32_t limit, uint32_t len)
> > > +{
> > > +assert(len <= 4);
> > > +return pci_dev->config_read(pci_dev, addr, MIN(len, limit - addr));
> > > +}
> > > +
> > 
> > Since limit and addr is unsigned, MIN(len, limit - addr) = len if limit < 
> > addr.
> > So we need explicit "if (limit < addr) return;".
> > Here's the patch for pci branch.
> > 
> > >From 75c1a2b47c93ad987cd7a37fb62bda9a59f27948 Mon Sep 17 00:00:00 2001
> > Message-Id: 
> > <75c1a2b47c93ad987cd7a37fb62bda9a59f27948.1311837763.git.yamah...@valinux.co.jp>
> > From: Isaku Yamahata 
> > Date: Thu, 28 Jul 2011 16:20:28 +0900
> > Subject: [PATCH] pci/host: limit check of pci_host_config_read/write_common
> > 
> > This patch adds boundary check in pci_host_config_read/write_common()
> > Since limit and addr is unsigned, MIN(len, limit - addr) = len if limit < 
> > addr.
> > So we need explicit "if (limit <= addr) return;"
> > 
> > Signed-off-by: Isaku Yamahata 
> 
> I don't see a problem with this, but could you please clarify when does
> this happen? I think this is only possible for a pci device
> behind an express root. If so, this belongs in pcie_host.c

Right. I'll move the check into pcie_host.c

-- 
yamahata



Re: [Qemu-devel] [V5 Patch 3/4]Qemu: Command "block_set" for dynamic block params change

2011-07-28 Thread Anthony Liguori

On 07/28/2011 05:13 AM, Supriya Kannery wrote:

On 07/27/2011 09:32 PM, Anthony Liguori wrote:

On 07/27/2011 09:31 AM, Stefan Hajnoczi wrote:

On Wed, Jul 27, 2011 at 1:58 PM, Anthony
Liguori wrote:

Index: qemu/hmp-commands.hx
===
--- qemu.orig/hmp-commands.hx
+++ qemu/hmp-commands.hx
@@ -70,6 +70,20 @@ but should be used with extreme caution.
resizes image files, it can not resize block devices like LVM volumes.
ETEXI

+ {
+ .name = "block_set",
+ .args_type = "device:B,device:O",
+ .params = "device [prop=value][,...]",
+ .help = "Change block device parameters
[hostcache=on/off]",
+ .user_print = monitor_user_noop,
+ .mhandler.cmd_new = do_block_set,
+ },
+STEXI
+@item block_set @var{config}
+@findex block_set
+Change block device parameters (eg: hostcache=on/off) while guest is
running.
+ETEXI
+


block_set_hostcache() please.

Multiplexing commands is generally a bad idea. It weakens typing. In
the
absence of a generic way to set block device properties, implementing
properties as generic in the QMP layer seems like a bad idea to me.


The idea behind block_set was to have a unified interface for changing
block device parameters at runtime. This prevents us from reinventing
new commands from scratch. For example, block I/O throttling is
already queued up to add run-time parameters.

Without a unified command we have a bulkier QMP/HMP interface,
duplicated code, and possibly inconsistencies in syntax between the
commands. Isn't the best way to avoid these problems a unified
interface?

I understand the lack of type safety concern but in this case we
already have to manually pull parsed arguments (i.e. cast to specific
types and deal with invalid input). To me this is a reason *for*
using a unified interface like block_set.


Think about it from a client perspective. How do I determine which
properties are supported by this version of QEMU? I have no way to
identify programmatically what arguments are valid for block_set.



User can programmatically find out valid parameters for
block_set. Internally, validation of prop=value is done against -drive
parameter list and then, only the valid/implemented commands (for now,
hostcache) are accepted from that list. Please find execution output
attached:

(qemu) info block
ide0-hd0: removable=0 hostcache=1 file=../rhel6-32.qcow2 ro=0 drv=qcow2
encrypted=0
floppy0: removable=1 locked=0 hostcache=0 file=test.img ro=0 drv=raw
encrypted=0
ide1-cd0: removable=1 locked=0 hostcache=1 [not inserted]
sd0: removable=1 locked=0 hostcache=1 [not inserted]
(qemu) block
block_resize block_set block_passwd
(qemu) block_set


That's HMP btw, not QMP.


block_set: block device name expected
(qemu) block
block_resize block_set block_passwd
(qemu) help block_set
block_set device [prop=value][,...] -- Change block device parameters
[hostcache=on/off]


Parsing help text is not introspection!

Regards,

Anthony Liguori


(qemu) block_set ide
Device 'ide' not found
(qemu) block_set ide0-hd0
Usage: block_set device [prop=value][,...]
(qemu) block_set ide0-hd0 cache
Invalid parameter 'cache'
(qemu) block_set ide0-hd0 cache=on
Parameter 'hostcache' expects on/off
(qemu) block_set ide0-hd0 hostcache=on
(qemu) block_set ide0-hd0 hostcache=off
(qemu) info block
ide0-hd0: removable=0 hostcache=0 file=../rhel6-32.qcow2 ro=0 drv=qcow2
encrypted=0
floppy0: removable=1 locked=0 hostcache=0 file=test.img ro=0 drv=raw
encrypted=0
ide1-cd0: removable=1 locked=0 hostcache=1 [not inserted]
sd0: removable=1 locked=0 hostcache=1 [not inserted]


When we add further parameters, "usage" string can be enhanced to
include those parameters for informing user.

- Thanks, Supriya


OTOH, if you have strong types like block_set_hostcache, query-commands
tells me exactly what's supported.

Regards,

Anthony Liguori



Stefan












Re: [Qemu-devel] [RFC][PATCH 0/21] QEMU Object Model

2011-07-28 Thread Anthony Liguori

On 07/28/2011 02:36 AM, Paolo Bonzini wrote:

On 07/27/2011 10:01 PM, Anthony Liguori wrote:




That's milkymist, not GoldFish.


Oh, Goldfish is fake. It's not real hardware.

The enumerator device is not a real device. It's weird because it's
imaginary and was designed specifically within QEMU.

It's not a good example for discussing modelling.


There are plenty of PV interfaces implemented by QEMU. Would you say the
same of virtio?


Virtio was designed to look like real hardware.

I would say that trying to fit XenStore into QOM would not be a good 
exercise.


Regards,

Anthony Liguori



Paolo






Re: [Qemu-devel] [PATCH 1/2] linux aio: support flush operation

2011-07-28 Thread Kevin Wolf
Am 28.07.2011 14:15, schrieb Christoph Hellwig:
>> Christoph, on another note: Can we rely on Linux AIO never returning
>> short writes except on EOF? Currently we return -EINVAL in this case, so

"short reads" I meant, of course.

>> I hope it's true or we wouldn't return the correct error code.
> 
> More or less.  There's one corner case for all Linux I/O, and that is
> only writes up to INT_MAX are supported, and larger writes (and reads)
> get truncated to it.  It's pretty nasty, but Linux has been vocally
> opposed to fixing this issue.

I think we can safely ignore this. So just replacing the current
ret = -EINVAL; by a memset(buf + ret, 0, len - ret); ret = 0; should be
okay, right? (Of course using the qiov versions, but you get the idea)

Kevin



Re: [Qemu-devel] Block layer roadmap

2011-07-28 Thread Kevin Wolf
Am 28.07.2011 14:09, schrieb Christoph Hellwig:
> On Wed, Jul 27, 2011 at 01:37:31PM +0100, Stefan Hajnoczi wrote:
>> Coroutines in the block layer [Kevin]
>>  * Programming model to simplify block drivers without blocking QEMU threads
> 
> Can anyone explain what the whole point of this is?  It really just is
> a bit of syntactic sugar for the current async state machines.  What does
> it buy us over going for real threading?

The only current block driver that really does everything in an async
state machine is qed. It's definitely not nice code, and having to
convert all of the other block drivers to this would be a lot of work.

So if it only means that we're making things async that would block the
VCPU until now, isn't that a great improvement already?

The advantage compared to threading is that it allows an easy and
incremental transition.

Kevin



Re: [Qemu-devel] [PATCH v2 5/5] virtio-balloon: Unregister savevm section on device unplug

2011-07-28 Thread Michael S. Tsirkin
On Thu, Jul 28, 2011 at 02:21:32PM +0200, Markus Armbruster wrote:
> "Michael S. Tsirkin"  writes:
> 
> > On Thu, Jul 28, 2011 at 09:45:44AM +0200, Markus Armbruster wrote:
> >> I hate the virtio pointer thicket.
> >
> > The problem is that each device is both a virtio pci
> > device and a virtio net device.
> 
> In my possibly naive opinion, virtio-FOO-pci is a PCI device, and has a
> common virtio part.

Yes, but it also has a comon virtio-pci part with other
virtio-FOO-pci devices, and a common virtio-net
part with virtio-net-.

-- 
MST



  1   2   >