[Qemu-devel] [PATCH v2] Add interface to traverse the qmp command list by QmpCommand

2013-10-08 Thread Mark Wu
In the original code, qmp_get_command_list is used to construct
a list of all commands' name. To get the information of all qga
commands, it traverses the name list and search the command info
with its name.  So it can cause O(n^2) in the number of commands.

This patch adds an interface to traverse the qmp command list by
QmpCommand to replace qmp_get_command_list. It can decrease the
complexity from O(n^2) to O(n).

Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
---
Changes:
v2:
1. Keep the signature of qmp_command_is_enabled (per Eric and Michael)
2. Remove the unnecessary pointer castings (per Eric)

 include/qapi/qmp/dispatch.h |  5 ++--
 qapi/qmp-registry.c | 27 +++---
 qga/commands.c  | 38 ++---
 qga/main.c  | 68 +
 4 files changed, 48 insertions(+), 90 deletions(-)

diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
index 1ce11f5..b6eb49e 100644
--- a/include/qapi/qmp/dispatch.h
+++ b/include/qapi/qmp/dispatch.h
@@ -47,9 +47,10 @@ QmpCommand *qmp_find_command(const char *name);
 QObject *qmp_dispatch(QObject *request);
 void qmp_disable_command(const char *name);
 void qmp_enable_command(const char *name);
-bool qmp_command_is_enabled(const char *name);
-char **qmp_get_command_list(void);
+bool qmp_command_is_enabled(const QmpCommand *cmd);
 QObject *qmp_build_error_object(Error *errp);
+typedef void (*qmp_cmd_callback_fn)(QmpCommand *cmd, void *opaque);
+void qmp_for_each_command(qmp_cmd_callback_fn fn, void *opaque);
 
 #endif
 
diff --git a/qapi/qmp-registry.c b/qapi/qmp-registry.c
index 28bbbe8..3fcf10e 100644
--- a/qapi/qmp-registry.c
+++ b/qapi/qmp-registry.c
@@ -66,35 +66,16 @@ void qmp_enable_command(const char *name)
 qmp_toggle_command(name, true);
 }
 
-bool qmp_command_is_enabled(const char *name)
+bool qmp_command_is_enabled(const QmpCommand *cmd)
 {
-QmpCommand *cmd;
-
-QTAILQ_FOREACH(cmd, qmp_commands, node) {
-if (strcmp(cmd-name, name) == 0) {
-return cmd-enabled;
-}
-}
-
-return false;
+return cmd-enabled;
 }
 
-char **qmp_get_command_list(void)
+void qmp_for_each_command(qmp_cmd_callback_fn fn, void *opaque)
 {
 QmpCommand *cmd;
-int count = 1;
-char **list_head, **list;
-
-QTAILQ_FOREACH(cmd, qmp_commands, node) {
-count++;
-}
-
-list_head = list = g_malloc0(count * sizeof(char *));
 
 QTAILQ_FOREACH(cmd, qmp_commands, node) {
-*list = g_strdup(cmd-name);
-list++;
+fn(cmd, opaque);
 }
-
-return list_head;
 }
diff --git a/qga/commands.c b/qga/commands.c
index 528b082..063b22b 100644
--- a/qga/commands.c
+++ b/qga/commands.c
@@ -45,35 +45,27 @@ void qmp_guest_ping(Error **err)
 slog(guest-ping called);
 }
 
-struct GuestAgentInfo *qmp_guest_info(Error **err)
+static void qmp_command_info(QmpCommand *cmd, void *opaque)
 {
-GuestAgentInfo *info = g_malloc0(sizeof(GuestAgentInfo));
+GuestAgentInfo *info = opaque;
 GuestAgentCommandInfo *cmd_info;
 GuestAgentCommandInfoList *cmd_info_list;
-char **cmd_list_head, **cmd_list;
-
-info-version = g_strdup(QEMU_VERSION);
-
-cmd_list_head = cmd_list = qmp_get_command_list();
-if (*cmd_list_head == NULL) {
-goto out;
-}
 
-while (*cmd_list) {
-cmd_info = g_malloc0(sizeof(GuestAgentCommandInfo));
-cmd_info-name = g_strdup(*cmd_list);
-cmd_info-enabled = qmp_command_is_enabled(cmd_info-name);
+cmd_info = g_malloc0(sizeof(GuestAgentCommandInfo));
+cmd_info-name = g_strdup(cmd-name);
+cmd_info-enabled = qmp_command_is_enabled(cmd);
 
-cmd_info_list = g_malloc0(sizeof(GuestAgentCommandInfoList));
-cmd_info_list-value = cmd_info;
-cmd_info_list-next = info-supported_commands;
-info-supported_commands = cmd_info_list;
+cmd_info_list = g_malloc0(sizeof(GuestAgentCommandInfoList));
+cmd_info_list-value = cmd_info;
+cmd_info_list-next = info-supported_commands;
+info-supported_commands = cmd_info_list;
+}
 
-g_free(*cmd_list);
-cmd_list++;
-}
+struct GuestAgentInfo *qmp_guest_info(Error **err)
+{
+GuestAgentInfo *info = g_malloc0(sizeof(GuestAgentInfo));
 
-out:
-g_free(cmd_list_head);
+info-version = g_strdup(QEMU_VERSION);
+qmp_for_each_command(qmp_command_info, info);
 return info;
 }
diff --git a/qga/main.c b/qga/main.c
index 6c746c8..ff2ee03 100644
--- a/qga/main.c
+++ b/qga/main.c
@@ -347,48 +347,34 @@ static gint ga_strcmp(gconstpointer str1, gconstpointer 
str2)
 }
 
 /* disable commands that aren't safe for fsfreeze */
-static void ga_disable_non_whitelisted(void)
+static void ga_disable_non_whitelisted(QmpCommand *cmd, void *opaque)
 {
-char **list_head, **list;
 bool whitelisted;
 int i;
 
-list_head = list = qmp_get_command_list();
-while (*list != NULL) {
-whitelisted = false;
- 

[Qemu-devel] [PATCH v4] Extend qemu-ga's 'guest-info' command to expose flag 'success-response'

2013-10-08 Thread Mark Wu
Now we have several qemu-ga commands not returning response on success.
It has been documented in qga/qapi-schema.json already. This patch exposes
the 'success-response' flag by extending 'guest-info' command. With this
change, the clients can handle the command response more flexibly.

Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
---
Changes:
v4: 
Add signature of qmp_has_success_response per Michael.
v3: 
1. treat cmd-options as a bitmask instead of single option (per Eric) 
2. rebase on the patch  Add interface to traverse the qmp command list
by QmpCommand to avoid the O(n2) problem (per Eric and Michael)
v2: 
add the notation 'since 1.7' to the option 'success-response'
(per Eric Blake's comments)

 include/qapi/qmp/dispatch.h | 1 +
 qapi/qmp-registry.c | 5 +
 qga/commands.c  | 1 +
 qga/qapi-schema.json| 5 -
 4 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
index b6eb49e..cebf6aa 100644
--- a/include/qapi/qmp/dispatch.h
+++ b/include/qapi/qmp/dispatch.h
@@ -48,6 +48,7 @@ QObject *qmp_dispatch(QObject *request);
 void qmp_disable_command(const char *name);
 void qmp_enable_command(const char *name);
 bool qmp_command_is_enabled(const QmpCommand *cmd);
+bool qmp_has_success_response(const QmpCommand *cmd);
 QObject *qmp_build_error_object(Error *errp);
 typedef void (*qmp_cmd_callback_fn)(QmpCommand *cmd, void *opaque);
 void qmp_for_each_command(qmp_cmd_callback_fn fn, void *opaque);
diff --git a/qapi/qmp-registry.c b/qapi/qmp-registry.c
index 3fcf10e..c75c2e8 100644
--- a/qapi/qmp-registry.c
+++ b/qapi/qmp-registry.c
@@ -71,6 +71,11 @@ bool qmp_command_is_enabled(const QmpCommand *cmd)
 return cmd-enabled;
 }
 
+bool qmp_has_success_response(const QmpCommand *cmd)
+{
+   return !(cmd-options  QCO_NO_SUCCESS_RESP);
+}
+
 void qmp_for_each_command(qmp_cmd_callback_fn fn, void *opaque)
 {
 QmpCommand *cmd;
diff --git a/qga/commands.c b/qga/commands.c
index 063b22b..7f089ba 100644
--- a/qga/commands.c
+++ b/qga/commands.c
@@ -54,6 +54,7 @@ static void qmp_command_info(QmpCommand *cmd, void *opaque)
 cmd_info = g_malloc0(sizeof(GuestAgentCommandInfo));
 cmd_info-name = g_strdup(cmd-name);
 cmd_info-enabled = qmp_command_is_enabled(cmd);
+cmd_info-success_response = qmp_has_success_response(cmd);
 
 cmd_info_list = g_malloc0(sizeof(GuestAgentCommandInfoList));
 cmd_info_list-value = cmd_info;
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index 7155b7a..245f968 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -141,10 +141,13 @@
 #
 # @enabled: whether command is currently enabled by guest admin
 #
+# @success-response: whether command returns a response on success
+#(since 1.7)
+#
 # Since 1.1.0
 ##
 { 'type': 'GuestAgentCommandInfo',
-  'data': { 'name': 'str', 'enabled': 'bool' } }
+  'data': { 'name': 'str', 'enabled': 'bool', 'success-response': 'bool' } }
 
 ##
 # @GuestAgentInfo
-- 
1.8.3.1




Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements

2013-10-08 Thread Stefan Hajnoczi
On Mon, Oct 7, 2013 at 11:42 AM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 07/10/2013 10:42, Stefan Hajnoczi ha scritto:
 Could you make bdrv_co_write_zeroes() always use UNMAP, if possible, and
 avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
 few patches in this series I wondered why there is a need to expose
 flags at all...

 Sometimes it is useful to distinguish between zeroing at the image
 format level from discarding at the device level, but I don't think we
 make use of that yet.  I'd prefer to keep the interface simple for now
 and add flags later, if necessary.

 Or maybe I just missed something ;)

 The flag is needed to implement the right semantics for the SCSI WRITE
 SAME command, which are:

 - if the UNMAP bit is off, always write the sectors (that's
 bdrv_aio_write_zeroes without BDRV_REQ_MAY_UNMAP if the payload is zero,
 otherwise it's emulated with bdrv_aio_writev)

 - if the target can discard and write the specified payload, you can
 discard, else you must write the sectors with the correct payload
 (that's bdrv_aio_write_zeroes with BDRV_REQ_MAY_UNMAP).

 Contrast this with the UNMAP command, which does not make any guarantee
 on the content of the sectors after the command is completed (a few
 months ago we agreed that, even if you have discard_zeroes=true in the
 target, it is fine for UNMAP to do nothing).

Okay, then let's keep the patches to expose the flag.

Stefan



Re: [Qemu-devel] [PATCH v5 0/4] timers thread-safe stuff

2013-10-08 Thread Stefan Hajnoczi
On Mon, Oct 07, 2013 at 02:24:26PM +0200, Paolo Bonzini wrote:
 Stefan, will you pick this up next week or shall I?
 
 I have patches for thread-safe icount almost ready to post, and I am not
 sure through whom they are going to go.

Please include it in your pull request.

Kevin is merging block patches this week, the queue is fairly full so I
imagine he has plenty of other things to review.

Stefan



Re: [Qemu-devel] [patch 0/2] force -mem-path RAM allocation

2013-10-08 Thread Markus Armbruster
We have

-mem-path FILE  provide backing storage for guest RAM
-mem-prealloc   preallocate guest memory (use with -mem-path)

PATCH 2/2 adds

-mem-path-forcefail if unable to allocate RAM as specified by -mem-path

Looks like it's time to consolidate the options related to guest memory
into a single, QemuOpts-style -memory NAME=VALUE,...  What do you guys
think?



Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements

2013-10-08 Thread Peter Lieven

On 08.10.2013 09:02, Stefan Hajnoczi wrote:

On Mon, Oct 7, 2013 at 11:42 AM, Paolo Bonzini pbonz...@redhat.com wrote:

Il 07/10/2013 10:42, Stefan Hajnoczi ha scritto:

Could you make bdrv_co_write_zeroes() always use UNMAP, if possible, and
avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
few patches in this series I wondered why there is a need to expose
flags at all...

Sometimes it is useful to distinguish between zeroing at the image
format level from discarding at the device level, but I don't think we
make use of that yet.  I'd prefer to keep the interface simple for now
and add flags later, if necessary.

Or maybe I just missed something ;)

The flag is needed to implement the right semantics for the SCSI WRITE
SAME command, which are:

- if the UNMAP bit is off, always write the sectors (that's
bdrv_aio_write_zeroes without BDRV_REQ_MAY_UNMAP if the payload is zero,
otherwise it's emulated with bdrv_aio_writev)

- if the target can discard and write the specified payload, you can
discard, else you must write the sectors with the correct payload
(that's bdrv_aio_write_zeroes with BDRV_REQ_MAY_UNMAP).

Contrast this with the UNMAP command, which does not make any guarantee
on the content of the sectors after the command is completed (a few
months ago we agreed that, even if you have discard_zeroes=true in the
target, it is fine for UNMAP to do nothing).

Okay, then let's keep the patches to expose the flag.

Okay, then I can keep those.

Can you give a short hint if my approach with brdv_make_empty is what
you want? I would like to not change the parameters, so use BDRV_REQ_MAY_UNMAP
unconditionally.

int bdrv_make_empty(BlockDriverState *bs)
{
int64_t target_size = bdrv_getlength(bs) / BDRV_SECTOR_SIZE;
int64_t ret, nb_sectors, sector_num = 0;
int n;

if (bs-drv-bdrv_make_empty) {
return bs-drv-bdrv_make_empty(bs);
}

for (;;) {
nb_sectors = target_size - sector_num;
if (nb_sectors = 0) {
return 0;
}
if (nb_sectors  INT_MAX) {
nb_sectors = INT_MAX;
}
ret = bdrv_get_block_status(bs, sector_num, nb_sectors, n);
if (ret  BDRV_BLOCK_ZERO) {
sector_num += n;
continue;
}
ret = bdrv_write_zeroes(bs, sector_num, n, BDRV_REQ_MAY_UNMAP);
if (ret  0) {
error_report(error writing zeroes at sector % PRId64 : %s,
 sector_num, strerror(-ret));
return ret;
}
sector_num += n;
}
}



Re: [Qemu-devel] [patch 0/2] force -mem-path RAM allocation

2013-10-08 Thread Paolo Bonzini
Il 08/10/2013 09:32, Markus Armbruster ha scritto:
 We have
 
 -mem-path FILE  provide backing storage for guest RAM
 -mem-prealloc   preallocate guest memory (use with -mem-path)
 
 PATCH 2/2 adds
 
 -mem-path-forcefail if unable to allocate RAM as specified by 
 -mem-path
 
 Looks like it's time to consolidate the options related to guest memory
 into a single, QemuOpts-style -memory NAME=VALUE,...  What do you guys
 think?

Yes, we can use -numa memory (or -numa mem) that Wanlong Gao is
adding.  We can add path=, preallocate= and force= options there.

Paolo



Re: [Qemu-devel] [patch 1/2] qemu: mempath: prefault pages manually

2013-10-08 Thread Paolo Bonzini
Il 08/10/2013 02:41, Marcelo Tosatti ha scritto:
 +/* unblock SIGBUS */
 +pthread_sigmask(SIG_BLOCK, NULL, oldset);
 +sigemptyset(set);
 +sigaddset(set, SIGBUS);
 +pthread_sigmask(SIG_UNBLOCK, set, NULL);

Please instead modify qemu-thread-posix.c to unblock all per-thread
signals (SIGBUS, SIGSEGV, SIGILL, SIGFPE and SIGSYS).  There is no need
to keep those blocked.

Paolo



Re: [Qemu-devel] [PATCH v4 2/7] qmp: add internal sync mode common to mirror_start

2013-10-08 Thread Fam Zheng
On Mon, 09/30 08:49, Eric Blake wrote:
 On 09/30/2013 06:02 AM, Fam Zheng wrote:
  This adds a new sync mode common which only copies data that is above
  the common ancestor of source and target. In general, this could be useful
  in cases like:
  
  base_bs --- common_ancestor --- foo --- bar ---source
\
 \--- target
  
  Where data in foo, bar and source will be copied to target, once such
  common backing_hd sharing is introduced. For now, we could use a special
  case: If target is the ancestor of source, like,
  
  base_bs --- target --- foo --- bar ---source
  
  The data in foo, bar and source will be copied to target, like
  drive-commit, and when they are synced, the source bs replaces target
  bs. This is specifically useful for block commit of active layer.
  
  This mode is not available (-ENOTSUP) from QMP interface, it is only
  used internally by block commit code.
  
 
  +++ b/qapi-schema.json
  @@ -1363,7 +1363,7 @@
   # Since: 1.3
   ##
   { 'enum': 'MirrorSyncMode',
  -  'data': ['top', 'full', 'none'] }
  +  'data': ['top', 'full', 'none', 'common'] }
 
 Is it worth documenting the mode, in order to include a '(since 1.7)'
 notation, as well as a mention that this mode is not supported via QMP
 but only exists so that the code generator will support the mode needed
 internally?  Is there any way to refactor things so that you don't have
 to munge the QAPI just to provide this internal-only mode?
 

As described in commit message, this mode could be useful once blockdev-add has
device referencing (backing_hd sharing). For now, even with the same backing
file, they don't share BDS, so it's not working as expected and should be
disabled.

So do you think it OK to document as not implemented for now, and wait for
backing_hd sharing to enable it?

Thanks,

Fam




Re: [Qemu-devel] [PATCH v4 4/7] mirror: Add commit_job_type to perform commit with mirror code

2013-10-08 Thread Fam Zheng
On Tue, 10/01 11:13, Eric Blake wrote:
 On 09/30/2013 06:02 AM, Fam Zheng wrote:
  Commit active layer will be implemented in block/mirror.c, prepare a new
  job type to let it have a right type name for the user.
  
  Signed-off-by: Fam Zheng f...@redhat.com
  ---
   block/mirror.c| 12 +++-
   blockdev.c|  2 +-
   include/block/block_int.h |  2 ++
   3 files changed, 14 insertions(+), 2 deletions(-)
  
  diff --git a/block/mirror.c b/block/mirror.c
  index af6851f..20dcfb6 100644
  --- a/block/mirror.c
  +++ b/block/mirror.c
  @@ -532,10 +532,19 @@ static const BlockJobType mirror_job_type = {
   .complete  = mirror_complete,
   };
   
  +static const BlockJobType commit_job_type = {
  +.instance_size = sizeof(MirrorBlockJob),
  +.job_type  = commit,
 
 I still wonder if we should complete the conversion over to a QAPI enum
 type for all valid job types prior to hard-coding open strings through
 yet more of the code base.  As long as we don't have introspection done
 yet, we can still make the switch, and in the long run, having an enum
 of valid job types seems like it will be better for maintenance, all
 with no change to what is sent over the wire in QMP.
 
  @@ -390,6 +391,7 @@ void mirror_start(BlockDriverState *bs, 
  BlockDriverState *target,
 int64_t speed, int64_t granularity, int64_t buf_size,
 MirrorSyncMode mode, BlockdevOnError on_source_error,
 BlockdevOnError on_target_error,
  +  bool commit_job,
 
 If we DO create a QAPI enum for job type, then this parameter would be
 the enum type, rather than a bool.
 

Good point, I'll work on a QAPI enum and base this on it.

Fam



Re: [Qemu-devel] [PATCH qom-next 0/2] qdev-monitor: Reference counting follow-ups

2013-10-08 Thread Igor Mammedov
On Mon,  7 Oct 2013 18:43:59 +0200
Andreas Färber afaer...@suse.de wrote:

 Hello,
 
 I have queued bug fixes by Igor and Stefan for device_add on qom-next and
 am rearranging the following changes of mine on top.
 
 1) Further naming cleanups, now rebased on the bugfixes for easier 
 backporting.
 2) Inlining of qdev_init(), so that we always have unparent+unref pairs.
 
 If there's no objections, planning to include this in a pull tonight or 
 tomorrow.
 
 Regards,
 Andreas
 
 Cc: Igor Mammedov imamm...@redhat.com
 Cc: Stefan Hajnoczi stefa...@redhat.com
 Cc: Paolo Bonzini pbonz...@redhat.com
 Cc: Anthony Liguori anth...@codemonkey.ws
 
 Andreas Färber (2):
   qdev-monitor: Avoid qdev as variable name
   qdev-monitor: Inline qdev_init() for device_add
 
  qdev-monitor.c | 37 +
  1 file changed, 21 insertions(+), 16 deletions(-)
 

Reviewed-By: Igor Mammedov imamm...@redhat.com



[Qemu-devel] savevm/loadvm

2013-10-08 Thread Alexey Kardashevskiy
Hi!

I need the community help with savevm/loadvm.

I run QEMU like this:

./qemu-system-ppc64 \
 -drive file=virtimg/fc19_16GB.qcow2 \
 -nodefaults \
 -m 2048 \
 -machine pseries \
 -nographic \
 -vga none \
 -enable-kvm


The disk image is an 16GB qcow2 image.

Now I start the guest and do savevm 1 and loadvm 1 from the qemu
console. Everything works. Then I exit qemu, make sure that the snapshot is
there and run QEMU as above plus -loadvm 1. It fails with:

qemu-system-ppc64: qcow2: Loading snapshots with different disk size is not
implemented
qemu-system-ppc64: Error -95 while activating snapshot '2' on 'scsi0-hd0'

The check is added by commit 90b277593df873d3a2480f002e2eb5fe1f8e5277
qcow2: Save disk size in snapshot header.

As I cannot realize the whole idea of the patch, I looked a bit deeper.
This is the check:

int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
{
[...]
if (sn-disk_size != bs-total_sectors * BDRV_SECTOR_SIZE) {
error_report(qcow2: Loading snapshots with different disk 
size is not implemented);
ret = -ENOTSUP;
goto fail;
}


My understanding of the patch was that the disk_size should remain 16GB
(0x4..) as it uses bs-total_sectors and never changes it. And
bs-growable is 0 for qcow2 image because it is not really growable. At
least the total_sectors value from the qcow2 file header does not change
between QEMU starts.

However qcow2_save_vmstate() sets bs-growable to 1 for a short time
(commit 178e08a58f40dd5aef2ce774fe0850f5d0e56918 from 2009) and this
triggers a branch in bdrv_co_do_writev() which changes bs-total_sectors.
So when QEMU writes snapshots to the file, the disk_size field of a
snapshot has bigger value (for example 0x4.007b.8180).

And the check above fails. It does not fail if to do loadvm
_in_the_same_run_ after savevm because QEMU operates with the updated
bs-total_sectors.

What the proper fix would be? Or it is not a bug at all and I should be
using something else for -loadvm? Thanks.



-- 
Alexey



[Qemu-devel] [PATCH] scsi: Allocate SCSITargetReq r-buf dynamically

2013-10-08 Thread Asias He
r-buf is hardcoded to 2056 which is (256 + 1) * 8, allowing 256 luns at
most. If more than 256 luns are specified by user, we have buffer
overflow in scsi_target_emulate_report_luns.

To fix, we allocate the buffer dynamically.

Signed-off-by: Asias He as...@redhat.com
---
 hw/scsi/scsi-bus.c | 44 +---
 include/hw/scsi/scsi.h |  2 ++
 2 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 4d36841..d950e6f 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -11,6 +11,8 @@ static char *scsibus_get_dev_path(DeviceState *dev);
 static char *scsibus_get_fw_dev_path(DeviceState *dev);
 static int scsi_req_parse(SCSICommand *cmd, SCSIDevice *dev, uint8_t *buf);
 static void scsi_req_dequeue(SCSIRequest *req);
+static uint8_t *scsi_target_alloc_buf(SCSIRequest *req, size_t len);
+static void scsi_target_free_buf(SCSIRequest *req);
 
 static Property scsi_props[] = {
 DEFINE_PROP_UINT32(channel, SCSIDevice, channel, 0),
@@ -317,7 +319,8 @@ typedef struct SCSITargetReq SCSITargetReq;
 struct SCSITargetReq {
 SCSIRequest req;
 int len;
-uint8_t buf[2056];
+uint8_t *buf;
+int buf_len;
 };
 
 static void store_lun(uint8_t *outbuf, int lun)
@@ -361,14 +364,12 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq 
*r)
 if (!found_lun0) {
 n += 8;
 }
-len = MIN(n + 8, r-req.cmd.xfer  ~7);
-if (len  sizeof(r-buf)) {
-/* TODO:  256 LUNs? */
-return false;
-}
 
+scsi_target_alloc_buf(r-req, n + 8);
+
+len = MIN(n + 8, r-req.cmd.xfer  ~7);
 memset(r-buf, 0, len);
-stl_be_p(r-buf, n);
+stl_be_p(r-buf[0], n);
 i = found_lun0 ? 8 : 16;
 QTAILQ_FOREACH(kid, r-req.bus-qbus.children, sibling) {
 DeviceState *qdev = kid-child;
@@ -387,6 +388,9 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq 
*r)
 static bool scsi_target_emulate_inquiry(SCSITargetReq *r)
 {
 assert(r-req.dev-lun != r-req.lun);
+
+scsi_target_alloc_buf(r-req, SCSI_INQUIRY_LEN);
+
 if (r-req.cmd.buf[1]  0x2) {
 /* Command support data - optional, not implemented */
 return false;
@@ -411,7 +415,7 @@ static bool scsi_target_emulate_inquiry(SCSITargetReq *r)
 return false;
 }
 /* done with EVPD */
-assert(r-len  sizeof(r-buf));
+assert(r-len  r-buf_len);
 r-len = MIN(r-req.cmd.xfer, r-len);
 return true;
 }
@@ -455,8 +459,8 @@ static int32_t scsi_target_send_command(SCSIRequest *req, 
uint8_t *buf)
 }
 break;
 case REQUEST_SENSE:
-r-len = scsi_device_get_sense(r-req.dev, r-buf,
-   MIN(req-cmd.xfer, sizeof r-buf),
+scsi_target_alloc_buf(r-req, SCSI_SENSE_LEN);
+r-len = scsi_device_get_sense(r-req.dev, r-buf, r-buf_len,
(req-cmd.buf[1]  1) == 0);
 if (r-req.dev-sense_is_ua) {
 scsi_device_unit_attention_reported(req-dev);
@@ -501,11 +505,29 @@ static uint8_t *scsi_target_get_buf(SCSIRequest *req)
 return r-buf;
 }
 
+static uint8_t *scsi_target_alloc_buf(SCSIRequest *req, size_t len)
+{
+SCSITargetReq *r = DO_UPCAST(SCSITargetReq, req, req);
+
+r-buf = g_malloc(len);
+r-buf_len = len;
+
+return r-buf;
+}
+
+static void scsi_target_free_buf(SCSIRequest *req)
+{
+SCSITargetReq *r = DO_UPCAST(SCSITargetReq, req, req);
+
+g_free(r-buf);
+}
+
 static const struct SCSIReqOps reqops_target_command = {
 .size = sizeof(SCSITargetReq),
 .send_command = scsi_target_send_command,
 .read_data= scsi_target_read_data,
 .get_buf  = scsi_target_get_buf,
+.free_req = scsi_target_free_buf,
 };
 
 
@@ -1365,7 +1387,7 @@ int scsi_build_sense(uint8_t *in_buf, int in_len,
 buf[7] = 10;
 buf[12] = sense.asc;
 buf[13] = sense.ascq;
-return MIN(len, 18);
+return MIN(len, SCSI_SENSE_LEN);
 } else {
 /* Return descriptor format sense buffer */
 buf[0] = 0x72;
diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h
index 1b66510..76f6ac2 100644
--- a/include/hw/scsi/scsi.h
+++ b/include/hw/scsi/scsi.h
@@ -9,6 +9,8 @@
 #define MAX_SCSI_DEVS  255
 
 #define SCSI_CMD_BUF_SIZE 16
+#define SCSI_SENSE_LEN  18
+#define SCSI_INQUIRY_LEN36
 
 typedef struct SCSIBus SCSIBus;
 typedef struct SCSIBusInfo SCSIBusInfo;
-- 
1.8.3.1




[Qemu-devel] [PATCH 2/8] timers: add timer_mod_anticipate and timer_mod_anticipate_ns

2013-10-08 Thread Paolo Bonzini
These let a user anticipate the deadline of a timer, atomically with
other sites that call the function.  This helps avoiding complicated
lock hierarchies.  It is useful whenever the timer does work based on
the current value of the clock (rather than doing something periodically
on every tick).

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 include/qemu/timer.h | 26 ++
 qemu-timer.c | 29 +
 2 files changed, 55 insertions(+)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index b58903b..f215b0b 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -539,6 +539,19 @@ void timer_del(QEMUTimer *ts);
 void timer_mod_ns(QEMUTimer *ts, int64_t expire_time);
 
 /**
+ * timer_mod_anticipate_ns:
+ * @ts: the timer
+ * @expire_time: the expiry time in nanoseconds
+ *
+ * Modify a timer to expire at @expire_time or the current time,
+ * whichever comes earlier.
+ *
+ * This function is thread-safe but the timer and its timer list must not be
+ * freed while this function is running.
+ */
+void timer_mod_anticipate_ns(QEMUTimer *ts, int64_t expire_time);
+
+/**
  * timer_mod:
  * @ts: the timer
  * @expire_time: the expire time in the units associated with the timer
@@ -552,6 +565,19 @@ void timer_mod_ns(QEMUTimer *ts, int64_t expire_time);
 void timer_mod(QEMUTimer *ts, int64_t expire_timer);
 
 /**
+ * timer_mod_anticipate:
+ * @ts: the timer
+ * @expire_time: the expiry time in nanoseconds
+ *
+ * Modify a timer to expire at @expire_time or the current time, whichever
+ * comes earlier, taking into account the scale associated with the timer.
+ *
+ * This function is thread-safe but the timer and its timer list must not be
+ * freed while this function is running.
+ */
+void timer_mod_anticipate(QEMUTimer *ts, int64_t expire_time);
+
+/**
  * timer_pending:
  * @ts: the timer
  *
diff --git a/qemu-timer.c b/qemu-timer.c
index 95fc6eb..202e9a2 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -393,11 +393,40 @@ void timer_mod_ns(QEMUTimer *ts, int64_t expire_time)
 }
 }
 
+/* modify the current timer so that it will be fired when current_time
+   = expire_time or the current deadline, whichever comes earlier.
+   The corresponding callback will be called. */
+void timer_mod_anticipate_ns(QEMUTimer *ts, int64_t expire_time)
+{
+QEMUTimerList *timer_list = ts-timer_list;
+bool rearm;
+
+qemu_mutex_lock(timer_list-active_timers_lock);
+if (ts-expire_time == -1 || ts-expire_time  expire_time) {
+if (ts-expire_time != -1) {
+timer_del_locked(timer_list, ts);
+}
+rearm = timer_mod_ns_locked(timer_list, ts, expire_time);
+} else {
+rearm = false;
+}
+qemu_mutex_unlock(timer_list-active_timers_lock);
+
+if (rearm) {
+timerlist_rearm(timer_list);
+}
+}
+
 void timer_mod(QEMUTimer *ts, int64_t expire_time)
 {
 timer_mod_ns(ts, expire_time * ts-scale);
 }
 
+void timer_mod_anticipate(QEMUTimer *ts, int64_t expire_time)
+{
+timer_mod_anticipate_ns(ts, expire_time * ts-scale);
+}
+
 bool timer_pending(QEMUTimer *ts)
 {
 return ts-expire_time = 0;
-- 
1.8.3.1





[Qemu-devel] [PATCH 0/8] Make icount thread-safe

2013-10-08 Thread Paolo Bonzini
This series moves the icount state under the same seqlock as the normal
vm_clock implementation.

It is not yet 100% thread-safe, because the CPU list should be moved
under RCU protection (due to the call to !all_cpu_threads_idle()
in qemu_clock_warp).  However it is a substantial step forward, the
only uncovered case being CPU hotplug.

Please review.

Paolo

Paolo Bonzini (8):
  timers: extract timer_mod_ns_locked and timerlist_rearm
  timers: add timer_mod_anticipate and timer_mod_anticipate_ns
  timers: use cpu_get_icount() directly
  timers: reorganize icount_warp_rt
  timers: prepare the code for future races in calling qemu_clock_warp
  timers: introduce cpu_get_clock_locked
  timers: document (future) locking rules for icount
  timers: make icount thread-safe

 cpus.c | 110 -
 include/qemu/timer.h   |  26 +
 qemu-timer.c   |  74 +++--
 4 files changed, 163 insertions(+), 47 deletions(-)
 create mode 100644 include/qemu/seqlock.h

-- 
1.8.3.1




[Qemu-devel] [PATCH 4/8] timers: reorganize icount_warp_rt

2013-10-08 Thread Paolo Bonzini
To prepare for future code changes, move the increment of qemu_icount_bias
outside the if statement.

Also, hoist outside the if the check for timers that expired due to the
warping.  The check is redundant when !runstate_is_running(), but
doing it this way helps because the code that increments qemu_icount_bias
will be a critical section.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/cpus.c b/cpus.c
index f87ff6f..9f450ad 100644
--- a/cpus.c
+++ b/cpus.c
@@ -279,10 +279,10 @@ static void icount_warp_rt(void *opaque)
 
 if (runstate_is_running()) {
 int64_t clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
-int64_t warp_delta = clock - vm_clock_warp_start;
-if (use_icount == 1) {
-qemu_icount_bias += warp_delta;
-} else {
+int64_t warp_delta;
+
+warp_delta = clock - vm_clock_warp_start;
+if (use_icount == 2) {
 /*
  * In adaptive mode, do not let QEMU_CLOCK_VIRTUAL run too
  * far ahead of real time.
@@ -290,13 +290,15 @@ static void icount_warp_rt(void *opaque)
 int64_t cur_time = cpu_get_clock();
 int64_t cur_icount = cpu_get_icount();
 int64_t delta = cur_time - cur_icount;
-qemu_icount_bias += MIN(warp_delta, delta);
-}
-if (qemu_clock_expired(QEMU_CLOCK_VIRTUAL)) {
-qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+warp_delta = MIN(warp_delta, delta);
 }
+qemu_icount_bias += warp_delta;
 }
 vm_clock_warp_start = -1;
+
+if (qemu_clock_expired(QEMU_CLOCK_VIRTUAL)) {
+qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
+}
 }
 
 void qtest_clock_warp(int64_t dest)
-- 
1.8.3.1





[Qemu-devel] [PATCH 1/8] timers: extract timer_mod_ns_locked and timerlist_rearm

2013-10-08 Thread Paolo Bonzini
These will be reused in timer_mod_anticipate functions.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-timer.c | 51 ---
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 6b62e88..95fc6eb 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -338,6 +338,34 @@ static void timer_del_locked(QEMUTimerList *timer_list, 
QEMUTimer *ts)
 }
 }
 
+static bool timer_mod_ns_locked(QEMUTimerList *timer_list,
+QEMUTimer *ts, int64_t expire_time)
+{
+QEMUTimer **pt, *t;
+
+/* add the timer in the sorted list */
+pt = timer_list-active_timers;
+for (;;) {
+t = *pt;
+if (!timer_expired_ns(t, expire_time)) {
+break;
+}
+pt = t-next;
+}
+ts-expire_time = MAX(expire_time, 0);
+ts-next = *pt;
+*pt = ts;
+
+return pt == timer_list-active_timers;
+}
+
+static void timerlist_rearm(QEMUTimerList *timer_list)
+{
+/* Interrupt execution to force deadline recalculation.  */
+qemu_clock_warp(timer_list-clock-type);
+timerlist_notify(timer_list);
+}
+
 /* stop a timer, but do not dealloc it */
 void timer_del(QEMUTimer *ts)
 {
@@ -353,30 +381,15 @@ void timer_del(QEMUTimer *ts)
 void timer_mod_ns(QEMUTimer *ts, int64_t expire_time)
 {
 QEMUTimerList *timer_list = ts-timer_list;
-QEMUTimer **pt, *t;
+bool rearm;
 
 qemu_mutex_lock(timer_list-active_timers_lock);
 timer_del_locked(timer_list, ts);
-
-/* add the timer in the sorted list */
-pt = timer_list-active_timers;
-for(;;) {
-t = *pt;
-if (!timer_expired_ns(t, expire_time)) {
-break;
-}
-pt = t-next;
-}
-ts-expire_time = MAX(expire_time, 0);
-ts-next = *pt;
-*pt = ts;
+rearm = timer_mod_ns_locked(timer_list, ts, expire_time);
 qemu_mutex_unlock(timer_list-active_timers_lock);
 
-/* Rearm if necessary  */
-if (pt == timer_list-active_timers) {
-/* Interrupt execution to force deadline recalculation.  */
-qemu_clock_warp(timer_list-clock-type);
-timerlist_notify(timer_list);
+if (rearm) {
+timerlist_rearm(timer_list);
 }
 }
 
-- 
1.8.3.1





[Qemu-devel] [PATCH 3/8] timers: use cpu_get_icount() directly

2013-10-08 Thread Paolo Bonzini
This will help later when we will have to place these calls in
a critical section, and thus call a version of cpu_get_icount()
that does not take the lock.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/cpus.c b/cpus.c
index 870a832..f87ff6f 100644
--- a/cpus.c
+++ b/cpus.c
@@ -224,12 +224,15 @@ static void icount_adjust(void)
 int64_t cur_icount;
 int64_t delta;
 static int64_t last_delta;
+
 /* If the VM is not running, then do nothing.  */
 if (!runstate_is_running()) {
 return;
 }
+
 cur_time = cpu_get_clock();
-cur_icount = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+cur_icount = cpu_get_icount();
+
 delta = cur_icount - cur_time;
 /* FIXME: This is a very crude algorithm, somewhat prone to oscillation.  
*/
 if (delta  0
@@ -285,7 +288,7 @@ static void icount_warp_rt(void *opaque)
  * far ahead of real time.
  */
 int64_t cur_time = cpu_get_clock();
-int64_t cur_icount = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+int64_t cur_icount = cpu_get_icount();
 int64_t delta = cur_time - cur_icount;
 qemu_icount_bias += MIN(warp_delta, delta);
 }
-- 
1.8.3.1





[Qemu-devel] [PATCH 8/8] timers: make icount thread-safe

2013-10-08 Thread Paolo Bonzini
This lets threads other than the I/O thread use vm_clock even in -icount mode.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c | 42 ++
 1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/cpus.c b/cpus.c
index bc675a4..1e5cba4 100644
--- a/cpus.c
+++ b/cpus.c
@@ -133,7 +133,7 @@ typedef struct TimersState {
 static TimersState timers_state;
 
 /* Return the virtual CPU time, based on the instruction counter.  */
-int64_t cpu_get_icount(void)
+static int64_t cpu_get_icount_locked(void)
 {
 int64_t icount;
 CPUState *cpu = current_cpu;
@@ -149,6 +149,19 @@ int64_t cpu_get_icount(void)
 return qemu_icount_bias + (icount  icount_time_shift);
 }
 
+int64_t cpu_get_icount(void)
+{
+int64_t icount;
+unsigned start;
+
+do {
+start = seqlock_read_begin(timers_state.clock_seqlock);
+icount = cpu_get_icount_locked();
+} while (seqlock_read_retry(timers_state.clock_seqlock, start));
+
+return icount;
+}
+
 /* return the host CPU cycle counter and handle stop/restart */
 /* cpu_ticks is safely if holding BQL */
 int64_t cpu_get_ticks(void)
@@ -246,8 +259,9 @@ static void icount_adjust(void)
 return;
 }
 
-cur_time = cpu_get_clock();
-cur_icount = cpu_get_icount();
+seqlock_write_lock(timers_state.clock_seqlock);
+cur_time = cpu_get_clock_locked();
+cur_icount = cpu_get_icount_locked();
 
 delta = cur_icount - cur_time;
 /* FIXME: This is a very crude algorithm, somewhat prone to oscillation.  
*/
@@ -265,6 +279,7 @@ static void icount_adjust(void)
 }
 last_delta = delta;
 qemu_icount_bias = cur_icount - (qemu_icount  icount_time_shift);
+seqlock_write_unlock(timers_state.clock_seqlock);
 }
 
 static void icount_adjust_rt(void *opaque)
@@ -289,10 +304,14 @@ static int64_t qemu_icount_round(int64_t count)
 
 static void icount_warp_rt(void *opaque)
 {
-if (vm_clock_warp_start == -1) {
+/* The icount_warp_timer is rescheduled soon after vm_clock_warp_start
+ * changes from -1 to another value, so the race here is okay.
+ */
+if (atomic_read(vm_clock_warp_start) == -1) {
 return;
 }
 
+seqlock_write_lock(timers_state.clock_seqlock);
 if (runstate_is_running()) {
 int64_t clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 int64_t warp_delta;
@@ -303,14 +322,15 @@ static void icount_warp_rt(void *opaque)
  * In adaptive mode, do not let QEMU_CLOCK_VIRTUAL run too
  * far ahead of real time.
  */
-int64_t cur_time = cpu_get_clock();
-int64_t cur_icount = cpu_get_icount();
+int64_t cur_time = cpu_get_clock_locked();
+int64_t cur_icount = cpu_get_icount_locked();
 int64_t delta = cur_time - cur_icount;
 warp_delta = MIN(warp_delta, delta);
 }
 qemu_icount_bias += warp_delta;
 }
 vm_clock_warp_start = -1;
+seqlock_write_unlock(timers_state.clock_seqlock);
 
 if (qemu_clock_expired(QEMU_CLOCK_VIRTUAL)) {
 qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
@@ -324,7 +344,10 @@ void qtest_clock_warp(int64_t dest)
 while (clock  dest) {
 int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
 int64_t warp = MIN(dest - clock, deadline);
+seqlock_write_lock(timers_state.clock_seqlock);
 qemu_icount_bias += warp;
+seqlock_write_unlock(timers_state.clock_seqlock);
+
 qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
 clock = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 }
@@ -391,9 +415,11 @@ void qemu_clock_warp(QEMUClockType type)
  * you will not be sending network packets continuously instead of
  * every 100ms.
  */
+seqlock_write_lock(timers_state.clock_seqlock);
 if (vm_clock_warp_start == -1 || vm_clock_warp_start  clock) {
 vm_clock_warp_start = clock;
 }
+seqlock_write_unlock(timers_state.clock_seqlock);
 timer_mod_anticipate(icount_warp_timer, clock + deadline);
 } else if (deadline == 0) {
 qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
-- 
1.8.3.1




[Qemu-devel] [PATCH 5/8] timers: prepare the code for future races in calling qemu_clock_warp

2013-10-08 Thread Paolo Bonzini
Computing the deadline of all vm_clocks is somewhat expensive and calls
out to qemu-timer.c; two reasons not to do it in the seqlock's write-side
critical section.  This however opens the door for races in setting and
reading vm_clock_warp_start.

To plug them, we need to cover the case where a new deadline slips in
between the call to qemu_clock_deadline_ns_all and the actual modification
of the icount_warp_timer.  Restrict changes to vm_clock_warp_start and
the icount_warp_timer's expiration time, to only move them back (which
would simply cause an early wakeup).

If a vm_clock timer is cancelled while CPUs are idle, this might cause the
icount_warp_timer to fire unnecessarily.  This is not a problem, after it
fires the timer becomes inactive and the next call to timer_mod_anticipate
will be precise.

In addition to this, we must deactivate the icount_warp_timer _before_
checking whether CPUs are idle.  This way, if the last CPU becomes idle
during the call to timer_del we will still set up the icount_warp_timer.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/cpus.c b/cpus.c
index 9f450ad..08eaf23 100644
--- a/cpus.c
+++ b/cpus.c
@@ -319,6 +319,7 @@ void qtest_clock_warp(int64_t dest)
 
 void qemu_clock_warp(QEMUClockType type)
 {
+int64_t clock;
 int64_t deadline;
 
 /*
@@ -338,7 +339,7 @@ void qemu_clock_warp(QEMUClockType type)
  * the earliest QEMU_CLOCK_VIRTUAL timer.
  */
 icount_warp_rt(NULL);
-if (!all_cpu_threads_idle() || !qemu_clock_has_timers(QEMU_CLOCK_VIRTUAL)) 
{
-timer_del(icount_warp_timer);
+timer_del(icount_warp_timer);
+if (!all_cpu_threads_idle()) {
 return;
 }
@@ -348,17 +349,11 @@ void qemu_clock_warp(QEMUClockType type)
return;
 }
 
-vm_clock_warp_start = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 /* We want to use the earliest deadline from ALL vm_clocks */
+clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
-
-/* Maintain prior (possibly buggy) behaviour where if no deadline
- * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than
- * INT32_MAX nanoseconds ahead, we still use INT32_MAX
- * nanoseconds.
- */
-if ((deadline  0) || (deadline  INT32_MAX)) {
-deadline = INT32_MAX;
+if (deadline  0) {
+return;
 }
 
 if (deadline  0) {
@@ -379,7 +375,10 @@ void qemu_clock_warp(QEMUClockType type)
  * you will not be sending network packets continuously instead of
  * every 100ms.
  */
-timer_mod(icount_warp_timer, vm_clock_warp_start + deadline);
+if (vm_clock_warp_start == -1 || vm_clock_warp_start  clock) {
+vm_clock_warp_start = clock;
+}
+timer_mod_anticipate(icount_warp_timer, clock + deadline);
 } else if (deadline == 0) {
 qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
 }
-- 
1.8.3.1





[Qemu-devel] [PATCH 6/8] timers: introduce cpu_get_clock_locked

2013-10-08 Thread Paolo Bonzini
This fixes a deadlock in cpu_disable_ticks.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
Should be squashed in Ping Fan's patches.

 cpus.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/cpus.c b/cpus.c
index 08eaf23..01acce2 100644
--- a/cpus.c
+++ b/cpus.c
@@ -166,6 +166,20 @@ int64_t cpu_get_ticks(void)
 }
 }
 
+static int64_t cpu_get_clock_locked(void)
+{
+int64_t ti;
+
+if (!timers_state.cpu_ticks_enabled) {
+ti = timers_state.cpu_clock_offset;
+} else {
+ti = get_clock();
+ti += timers_state.cpu_clock_offset;
+}
+
+return ti;
+}
+
 /* return the host CPU monotonic timer and handle stop/restart */
 int64_t cpu_get_clock(void)
 {
@@ -174,12 +188,7 @@ int64_t cpu_get_clock(void)
 
 do {
 start = seqlock_read_begin(timers_state.clock_seqlock);
-if (!timers_state.cpu_ticks_enabled) {
-ti = timers_state.cpu_clock_offset;
-} else {
-ti = get_clock();
-ti += timers_state.cpu_clock_offset;
-}
+ti = cpu_get_clock_locked();
 } while (seqlock_read_retry(timers_state.clock_seqlock, start));
 
 return ti;
@@ -220,7 +233,7 @@ void cpu_disable_ticks(void)
 seqlock_write_lock(timers_state.clock_seqlock);
 if (timers_state.cpu_ticks_enabled) {
 timers_state.cpu_ticks_offset = cpu_get_ticks();
-timers_state.cpu_clock_offset = cpu_get_clock();
+timers_state.cpu_clock_offset = cpu_get_clock_locked();
 timers_state.cpu_ticks_enabled = 0;
 }
 seqlock_write_unlock(timers_state.clock_seqlock);
-- 
1.8.3.1





[Qemu-devel] [PATCH 7/8] timers: document (future) locking rules for icount

2013-10-08 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/cpus.c b/cpus.c
index 01acce2..bc675a4 100644
--- a/cpus.c
+++ b/cpus.c
@@ -98,17 +98,22 @@ static bool all_cpu_threads_idle(void)
 /***/
 /* guest cycle counter */
 
+/* Protected by TimersState seqlock */
+
+/* Compensate for varying guest execution speed.  */
+static int64_t qemu_icount_bias;
+static int64_t vm_clock_warp_start;
 /* Conversion factor from emulated instructions to virtual clock ticks.  */
 static int icount_time_shift;
 /* Arbitrarily pick 1MIPS as the minimum allowable speed.  */
 #define MAX_ICOUNT_SHIFT 10
-/* Compensate for varying guest execution speed.  */
-static int64_t qemu_icount_bias;
+
+/* Only written by TCG thread */
+static int64_t qemu_icount;
+
 static QEMUTimer *icount_rt_timer;
 static QEMUTimer *icount_vm_timer;
 static QEMUTimer *icount_warp_timer;
-static int64_t vm_clock_warp_start;
-static int64_t qemu_icount;
 
 typedef struct TimersState {
 int64_t cpu_ticks_prev;
@@ -232,6 +237,8 @@ static void icount_adjust(void)
 int64_t cur_time;
 int64_t cur_icount;
 int64_t delta;
+
+/* Protected by TimersState mutex.  */
 static int64_t last_delta;
 
 /* If the VM is not running, then do nothing.  */
-- 
1.8.3.1





Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements

2013-10-08 Thread Stefan Hajnoczi
On Tue, Oct 8, 2013 at 10:01 AM, Peter Lieven p...@kamp.de wrote:
 On 08.10.2013 09:02, Stefan Hajnoczi wrote:

 On Mon, Oct 7, 2013 at 11:42 AM, Paolo Bonzini pbonz...@redhat.com
 wrote:

 Il 07/10/2013 10:42, Stefan Hajnoczi ha scritto:

 Could you make bdrv_co_write_zeroes() always use UNMAP, if possible, and
 avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
 few patches in this series I wondered why there is a need to expose
 flags at all...

 Sometimes it is useful to distinguish between zeroing at the image
 format level from discarding at the device level, but I don't think we
 make use of that yet.  I'd prefer to keep the interface simple for now
 and add flags later, if necessary.

 Or maybe I just missed something ;)

 The flag is needed to implement the right semantics for the SCSI WRITE
 SAME command, which are:

 - if the UNMAP bit is off, always write the sectors (that's
 bdrv_aio_write_zeroes without BDRV_REQ_MAY_UNMAP if the payload is zero,
 otherwise it's emulated with bdrv_aio_writev)

 - if the target can discard and write the specified payload, you can
 discard, else you must write the sectors with the correct payload
 (that's bdrv_aio_write_zeroes with BDRV_REQ_MAY_UNMAP).

 Contrast this with the UNMAP command, which does not make any guarantee
 on the content of the sectors after the command is completed (a few
 months ago we agreed that, even if you have discard_zeroes=true in the
 target, it is fine for UNMAP to do nothing).

 Okay, then let's keep the patches to expose the flag.

 Okay, then I can keep those.

 Can you give a short hint if my approach with brdv_make_empty is what
 you want? I would like to not change the parameters, so use
 BDRV_REQ_MAY_UNMAP
 unconditionally.

 int bdrv_make_empty(BlockDriverState *bs)

The semantics of bdrv_make_empty() today are: deallocate all data in
the top layer of the image file.  If there is a backing file, reads
will fall back to the backing file.

The semantics that you want are zeroing the entire disk image
(efficiently, when possible).

A flags argument is needed to support both of sets of semantics.  If
you don't like that, then I suggest creating a new function called
bdrv_make_zero().

Stefan



Re: [Qemu-devel] savevm/loadvm

2013-10-08 Thread Paolo Bonzini
Il 08/10/2013 10:40, Alexey Kardashevskiy ha scritto:
 However qcow2_save_vmstate() sets bs-growable to 1 for a short time
 (commit 178e08a58f40dd5aef2ce774fe0850f5d0e56918 from 2009) and this
 triggers a branch in bdrv_co_do_writev() which changes bs-total_sectors.
 So when QEMU writes snapshots to the file, the disk_size field of a
 snapshot has bigger value (for example 0x4.007b.8180).

I think you need to modify qcow2_save_vmstate to save and restore
bs-total_sectors.  Can you test that and if so post the patch?

Paolo



Re: [Qemu-devel] [PATCH 1/8] timers: extract timer_mod_ns_locked and timerlist_rearm

2013-10-08 Thread Alex Bligh

On 8 Oct 2013, at 09:47, Paolo Bonzini wrote:

 These will be reused in timer_mod_anticipate functions.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Alex Bligh a...@alex.org.uk
 ---
 qemu-timer.c | 51 ---
 1 file changed, 32 insertions(+), 19 deletions(-)
 
 diff --git a/qemu-timer.c b/qemu-timer.c
 index 6b62e88..95fc6eb 100644
 --- a/qemu-timer.c
 +++ b/qemu-timer.c
 @@ -338,6 +338,34 @@ static void timer_del_locked(QEMUTimerList *timer_list, 
 QEMUTimer *ts)
 }
 }
 
 +static bool timer_mod_ns_locked(QEMUTimerList *timer_list,
 +QEMUTimer *ts, int64_t expire_time)
 +{
 +QEMUTimer **pt, *t;
 +
 +/* add the timer in the sorted list */
 +pt = timer_list-active_timers;
 +for (;;) {
 +t = *pt;
 +if (!timer_expired_ns(t, expire_time)) {
 +break;
 +}
 +pt = t-next;
 +}
 +ts-expire_time = MAX(expire_time, 0);
 +ts-next = *pt;
 +*pt = ts;
 +
 +return pt == timer_list-active_timers;
 +}
 +
 +static void timerlist_rearm(QEMUTimerList *timer_list)
 +{
 +/* Interrupt execution to force deadline recalculation.  */
 +qemu_clock_warp(timer_list-clock-type);
 +timerlist_notify(timer_list);
 +}
 +
 /* stop a timer, but do not dealloc it */
 void timer_del(QEMUTimer *ts)
 {
 @@ -353,30 +381,15 @@ void timer_del(QEMUTimer *ts)
 void timer_mod_ns(QEMUTimer *ts, int64_t expire_time)
 {
 QEMUTimerList *timer_list = ts-timer_list;
 -QEMUTimer **pt, *t;
 +bool rearm;
 
 qemu_mutex_lock(timer_list-active_timers_lock);
 timer_del_locked(timer_list, ts);
 -
 -/* add the timer in the sorted list */
 -pt = timer_list-active_timers;
 -for(;;) {
 -t = *pt;
 -if (!timer_expired_ns(t, expire_time)) {
 -break;
 -}
 -pt = t-next;
 -}
 -ts-expire_time = MAX(expire_time, 0);
 -ts-next = *pt;
 -*pt = ts;
 +rearm = timer_mod_ns_locked(timer_list, ts, expire_time);
 qemu_mutex_unlock(timer_list-active_timers_lock);
 
 -/* Rearm if necessary  */
 -if (pt == timer_list-active_timers) {
 -/* Interrupt execution to force deadline recalculation.  */
 -qemu_clock_warp(timer_list-clock-type);
 -timerlist_notify(timer_list);
 +if (rearm) {
 +timerlist_rearm(timer_list);
 }
 }
 
 -- 
 1.8.3.1
 
 
 
 

-- 
Alex Bligh







Re: [Qemu-devel] KVM Guest keymap issue

2013-10-08 Thread Matej Mailing
Hi,

the strange thing is that all other keys and combinations work except
those ccaron, Ccaron, scaron and Scaron, zcaron and ZCaron don't. In
our language there are many words containing those chars and I really
need to have them working.

When looking at the sl keymap file, those codes, even for all other
chars that I type with showkey --ascii, are different than the showkey
outputs, but they work (except those mentioned above).

Now I am totally confused on how could those that work, work ...

Thanks for any enlightenments in advance :)
Matej

2013/9/26 Matej Mailing mail...@tam.si:
 I am still pretty lost here, also after reading your link which shed a
 light to many things.

 Every suggestion and idea is very welcome!
 Thanks,
 Matej

 2013/9/24 Markus Armbruster arm...@redhat.com:
 Not specific to KVM, adding qemu-devel.

 Matej Mailing mail...@tam.si writes:

 Dear list,

 I have a problem with a Windows XP guest that I connect to via VNC and
 is using sl keymap (option -k sl).

 The guest is Windows XP and the problematic characters are s, c and z
 with caron... when I type them via VNC, they are not printed at all in
 virtual system... I have checked the file /usr/share/kvm/keymaps/sl
 and it seems that it contains different codes than I get when doing
 showkey --ascii on the host machine (running Ubuntu 12.04). I have
 tried to change the KVM's keymap file 'sl' with the codes I get from
 showkey, but they are still not printed in virtual system to which I
 am connected via VNC...

 I am totally lost with this issue, thanks for your time and ideas.

 Required reading for anyone struggling with virtual keyboards:

 https://www.berrange.com/posts/2010/07/04/more-than-you-or-i-ever-wanted-to-know-about-virtual-keyboard-handling/



Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements

2013-10-08 Thread Peter Lieven

On 08.10.2013 10:59, Stefan Hajnoczi wrote:

On Tue, Oct 8, 2013 at 10:01 AM, Peter Lieven p...@kamp.de wrote:

On 08.10.2013 09:02, Stefan Hajnoczi wrote:

On Mon, Oct 7, 2013 at 11:42 AM, Paolo Bonzini pbonz...@redhat.com
wrote:

Il 07/10/2013 10:42, Stefan Hajnoczi ha scritto:

Could you make bdrv_co_write_zeroes() always use UNMAP, if possible, and
avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
few patches in this series I wondered why there is a need to expose
flags at all...

Sometimes it is useful to distinguish between zeroing at the image
format level from discarding at the device level, but I don't think we
make use of that yet.  I'd prefer to keep the interface simple for now
and add flags later, if necessary.

Or maybe I just missed something ;)

The flag is needed to implement the right semantics for the SCSI WRITE
SAME command, which are:

- if the UNMAP bit is off, always write the sectors (that's
bdrv_aio_write_zeroes without BDRV_REQ_MAY_UNMAP if the payload is zero,
otherwise it's emulated with bdrv_aio_writev)

- if the target can discard and write the specified payload, you can
discard, else you must write the sectors with the correct payload
(that's bdrv_aio_write_zeroes with BDRV_REQ_MAY_UNMAP).

Contrast this with the UNMAP command, which does not make any guarantee
on the content of the sectors after the command is completed (a few
months ago we agreed that, even if you have discard_zeroes=true in the
target, it is fine for UNMAP to do nothing).

Okay, then let's keep the patches to expose the flag.

Okay, then I can keep those.

Can you give a short hint if my approach with brdv_make_empty is what
you want? I would like to not change the parameters, so use
BDRV_REQ_MAY_UNMAP
unconditionally.

int bdrv_make_empty(BlockDriverState *bs)

The semantics of bdrv_make_empty() today are: deallocate all data in
the top layer of the image file.  If there is a backing file, reads
will fall back to the backing file.

The semantics that you want are zeroing the entire disk image
(efficiently, when possible).

A flags argument is needed to support both of sets of semantics.  If
you don't like that, then I suggest creating a new function called
bdrv_make_zero().

Ok, that is what I would like to do. In this case I only have to rename
bdrv_zeroize to bdrv_make_zero. Ok ?

Peter




Re: [Qemu-devel] [PATCH 2/8] timers: add timer_mod_anticipate and timer_mod_anticipate_ns

2013-10-08 Thread Alex Bligh
Paolo,

On 8 Oct 2013, at 09:47, Paolo Bonzini wrote:
 
 --- a/qemu-timer.c
 +++ b/qemu-timer.c
 @@ -393,11 +393,40 @@ void timer_mod_ns(QEMUTimer *ts, int64_t expire_time)
 }
 }
 
 +/* modify the current timer so that it will be fired when current_time
 +   = expire_time or the current deadline, whichever comes earlier.
 +   The corresponding callback will be called. */
 +void timer_mod_anticipate_ns(QEMUTimer *ts, int64_t expire_time)
 +{
 +QEMUTimerList *timer_list = ts-timer_list;
 +bool rearm;
 +
 +qemu_mutex_lock(timer_list-active_timers_lock);
 +if (ts-expire_time == -1 || ts-expire_time  expire_time) {

So if we want to alter it ...

 +if (ts-expire_time != -1) {
 +timer_del_locked(timer_list, ts);
 +}

What's this bit for? Surely you've calculated whether you are
shortening the expiry time (above), so all you need do now is
modify it. Why delete it? timer_mod_ns doesn't make this
check?

Otherwise looks OK.

 +rearm = timer_mod_ns_locked(timer_list, ts, expire_time);
 +} else {
 +rearm = false;
 +}
 +qemu_mutex_unlock(timer_list-active_timers_lock);
 +
 +if (rearm) {
 +timerlist_rearm(timer_list);
 +}
 +}
 +
 void timer_mod(QEMUTimer *ts, int64_t expire_time)
 {
 timer_mod_ns(ts, expire_time * ts-scale);
 }
 
 +void timer_mod_anticipate(QEMUTimer *ts, int64_t expire_time)
 +{
 +timer_mod_anticipate_ns(ts, expire_time * ts-scale);
 +}
 +
 bool timer_pending(QEMUTimer *ts)
 {
 return ts-expire_time = 0;
 -- 
 1.8.3.1
 
 
 
 

-- 
Alex Bligh







Re: [Qemu-devel] savevm/loadvm

2013-10-08 Thread Kevin Wolf
Am 08.10.2013 um 11:04 hat Paolo Bonzini geschrieben:
 Il 08/10/2013 10:40, Alexey Kardashevskiy ha scritto:
  However qcow2_save_vmstate() sets bs-growable to 1 for a short time
  (commit 178e08a58f40dd5aef2ce774fe0850f5d0e56918 from 2009) and this
  triggers a branch in bdrv_co_do_writev() which changes bs-total_sectors.
  So when QEMU writes snapshots to the file, the disk_size field of a
  snapshot has bigger value (for example 0x4.007b.8180).
 
 I think you need to modify qcow2_save_vmstate to save and restore
 bs-total_sectors.  Can you test that and if so post the patch?

It's a regression introduced by commit df2a6f29, right?

What you suggest probably works as a quick hack to fix the bug. The VM
state functions in qcow2 are getting uglier and uglier, though. Maybe
they should avoid going through block.c and adding hacks to disable or
revert side effects.

Kevin



Re: [Qemu-devel] [PATCH 2/8] timers: add timer_mod_anticipate and timer_mod_anticipate_ns

2013-10-08 Thread Paolo Bonzini
Il 08/10/2013 11:15, Alex Bligh ha scritto:
 So if we want to alter it ...
 
  +if (ts-expire_time != -1) {
  +timer_del_locked(timer_list, ts);
  +}
 What's this bit for? Surely you've calculated whether you are
 shortening the expiry time (above), so all you need do now is
 modify it. Why delete it? timer_mod_ns doesn't make this
 check?

timer_mod_ns_locked does not remove the timer from the list:

qemu_mutex_lock(timer_list-active_timers_lock);
timer_del_locked(timer_list, ts);
rearm = timer_mod_ns_locked(timer_list, ts, expire_time);
qemu_mutex_unlock(timer_list-active_timers_lock);

This is doing the same.  The check in the if is not strictly
necessary, but it saves a useless visit of the list.  It could be added
to timer_mod_ns as well.

Paolo



Re: [Qemu-devel] [PATCHv3 00/20] block: logical block provisioning enhancements

2013-10-08 Thread Stefan Hajnoczi
On Tue, Oct 8, 2013 at 11:12 AM, Peter Lieven p...@kamp.de wrote:
 On 08.10.2013 10:59, Stefan Hajnoczi wrote:

 On Tue, Oct 8, 2013 at 10:01 AM, Peter Lieven p...@kamp.de wrote:

 On 08.10.2013 09:02, Stefan Hajnoczi wrote:

 On Mon, Oct 7, 2013 at 11:42 AM, Paolo Bonzini pbonz...@redhat.com
 wrote:

 Il 07/10/2013 10:42, Stefan Hajnoczi ha scritto:

 Could you make bdrv_co_write_zeroes() always use UNMAP, if possible,
 and
 avoid adding the new BDRV_REQ_MAY_UNMAP flag?  While reading the first
 few patches in this series I wondered why there is a need to expose
 flags at all...

 Sometimes it is useful to distinguish between zeroing at the image
 format level from discarding at the device level, but I don't think we
 make use of that yet.  I'd prefer to keep the interface simple for now
 and add flags later, if necessary.

 Or maybe I just missed something ;)

 The flag is needed to implement the right semantics for the SCSI WRITE
 SAME command, which are:

 - if the UNMAP bit is off, always write the sectors (that's
 bdrv_aio_write_zeroes without BDRV_REQ_MAY_UNMAP if the payload is
 zero,
 otherwise it's emulated with bdrv_aio_writev)

 - if the target can discard and write the specified payload, you can
 discard, else you must write the sectors with the correct payload
 (that's bdrv_aio_write_zeroes with BDRV_REQ_MAY_UNMAP).

 Contrast this with the UNMAP command, which does not make any guarantee
 on the content of the sectors after the command is completed (a few
 months ago we agreed that, even if you have discard_zeroes=true in the
 target, it is fine for UNMAP to do nothing).

 Okay, then let's keep the patches to expose the flag.

 Okay, then I can keep those.

 Can you give a short hint if my approach with brdv_make_empty is what
 you want? I would like to not change the parameters, so use
 BDRV_REQ_MAY_UNMAP
 unconditionally.

 int bdrv_make_empty(BlockDriverState *bs)

 The semantics of bdrv_make_empty() today are: deallocate all data in
 the top layer of the image file.  If there is a backing file, reads
 will fall back to the backing file.

 The semantics that you want are zeroing the entire disk image
 (efficiently, when possible).

 A flags argument is needed to support both of sets of semantics.  If
 you don't like that, then I suggest creating a new function called
 bdrv_make_zero().

 Ok, that is what I would like to do. In this case I only have to rename
 bdrv_zeroize to bdrv_make_zero. Ok ?

Yes, please.

Stefan



[Qemu-devel] [PATCH 0/3] qapi: introduce BlockJobType enum

2013-10-08 Thread Fam Zheng
Currently, block job type is hard coded string and could be repeated in
different places in the code base. Introduce a enum type in QAPI to make it
better for maintenance and introspection. The old BlockJobType struct is
renamed to BlockJobDriver and its field job_type becomes a BlockJobType enum.

Nothing is changed to the interface.

Fam Zheng (3):
  blockjob: rename BlockJobType to BlockJobDriver
  qapi: Introduce enum BlockJobType
  qapi: make use of new BlockJobType

 block/backup.c   |  6 +++---
 block/commit.c   |  6 +++---
 block/mirror.c   |  6 +++---
 block/stream.c   |  6 +++---
 blockjob.c   | 22 +++---
 include/block/blockjob.h | 14 +++---
 qapi-schema.json | 18 ++
 7 files changed, 48 insertions(+), 30 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH 1/3] blockjob: rename BlockJobType to BlockJobDriver

2013-10-08 Thread Fam Zheng
We will use BlockJobType as the enum type name of block jobs in QAPI,
rename current BlockJobType to BlockJobDriver, which will eventually
become a set of operations, similar to block drivers.

Signed-off-by: Fam Zheng f...@redhat.com
---
 block/backup.c   |  4 ++--
 block/commit.c   |  4 ++--
 block/mirror.c   |  4 ++--
 block/stream.c   |  4 ++--
 blockjob.c   | 22 +++---
 include/block/blockjob.h | 12 ++--
 6 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 04c4b5c..d374472 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -202,7 +202,7 @@ static void backup_iostatus_reset(BlockJob *job)
 bdrv_iostatus_reset(s-target);
 }
 
-static const BlockJobType backup_job_type = {
+static const BlockJobDriver backup_job_driver = {
 .instance_size  = sizeof(BackupBlockJob),
 .job_type   = backup,
 .set_speed  = backup_set_speed,
@@ -370,7 +370,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 return;
 }
 
-BackupBlockJob *job = block_job_create(backup_job_type, bs, speed,
+BackupBlockJob *job = block_job_create(backup_job_driver, bs, speed,
cb, opaque, errp);
 if (!job) {
 return;
diff --git a/block/commit.c b/block/commit.c
index ac4b7cc..5146138 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -173,7 +173,7 @@ static void commit_set_speed(BlockJob *job, int64_t speed, 
Error **errp)
 ratelimit_set_speed(s-limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
 }
 
-static const BlockJobType commit_job_type = {
+static const BlockJobDriver commit_job_driver = {
 .instance_size = sizeof(CommitBlockJob),
 .job_type  = commit,
 .set_speed = commit_set_speed,
@@ -238,7 +238,7 @@ void commit_start(BlockDriverState *bs, BlockDriverState 
*base,
 }
 
 
-s = block_job_create(commit_job_type, bs, speed, cb, opaque, errp);
+s = block_job_create(commit_job_driver, bs, speed, cb, opaque, errp);
 if (!s) {
 return;
 }
diff --git a/block/mirror.c b/block/mirror.c
index 6e7a274..991cc24 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -525,7 +525,7 @@ static void mirror_complete(BlockJob *job, Error **errp)
 block_job_resume(job);
 }
 
-static const BlockJobType mirror_job_type = {
+static const BlockJobDriver mirror_job_driver = {
 .instance_size = sizeof(MirrorBlockJob),
 .job_type  = mirror,
 .set_speed = mirror_set_speed,
@@ -563,7 +563,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState 
*target,
 return;
 }
 
-s = block_job_create(mirror_job_type, bs, speed, cb, opaque, errp);
+s = block_job_create(mirror_job_driver, bs, speed, cb, opaque, errp);
 if (!s) {
 return;
 }
diff --git a/block/stream.c b/block/stream.c
index 45837f4..7f412bd 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -203,7 +203,7 @@ static void stream_set_speed(BlockJob *job, int64_t speed, 
Error **errp)
 ratelimit_set_speed(s-limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
 }
 
-static const BlockJobType stream_job_type = {
+static const BlockJobDriver stream_job_driver = {
 .instance_size = sizeof(StreamBlockJob),
 .job_type  = stream,
 .set_speed = stream_set_speed,
@@ -224,7 +224,7 @@ void stream_start(BlockDriverState *bs, BlockDriverState 
*base,
 return;
 }
 
-s = block_job_create(stream_job_type, bs, speed, cb, opaque, errp);
+s = block_job_create(stream_job_driver, bs, speed, cb, opaque, errp);
 if (!s) {
 return;
 }
diff --git a/blockjob.c b/blockjob.c
index e7d49b7..6814e69 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -35,7 +35,7 @@
 #include qmp-commands.h
 #include qemu/timer.h
 
-void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs,
+void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
int64_t speed, BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
 {
@@ -48,8 +48,8 @@ void *block_job_create(const BlockJobType *job_type, 
BlockDriverState *bs,
 bdrv_ref(bs);
 bdrv_set_in_use(bs, 1);
 
-job = g_malloc0(job_type-instance_size);
-job-job_type  = job_type;
+job = g_malloc0(driver-instance_size);
+job-driver= driver;
 job-bs= bs;
 job-cb= cb;
 job-opaque= opaque;
@@ -87,11 +87,11 @@ void block_job_set_speed(BlockJob *job, int64_t speed, 
Error **errp)
 {
 Error *local_err = NULL;
 
-if (!job-job_type-set_speed) {
+if (!job-driver-set_speed) {
 error_set(errp, QERR_NOT_SUPPORTED);
 return;
 }
-job-job_type-set_speed(job, speed, local_err);
+job-driver-set_speed(job, speed, local_err);
 if (error_is_set(local_err)) {
 error_propagate(errp, local_err);
 return;
@@ -102,12 +102,12 @@ void 

[Qemu-devel] [PATCH 2/3] qapi: Introduce enum BlockJobType

2013-10-08 Thread Fam Zheng
This will replace the open coded block job type string for mirror,
commit and backup.

Signed-off-by: Fam Zheng f...@redhat.com
---
 qapi-schema.json | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/qapi-schema.json b/qapi-schema.json
index 145eca8..381ffbf 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1366,6 +1366,24 @@
   'data': ['top', 'full', 'none'] }
 
 ##
+# @BlockJobType:
+#
+# Type of a block job.
+#
+# @commit: block commit job type, see block-commit
+#
+# @stream: block stream job type, see block-stream
+#
+# @mirror: drive mirror job type, see drive-mirror
+#
+# @backup: drive backup job type, see drive-backup
+#
+# Since: 1.7
+##
+{ 'enum': 'BlockJobType',
+  'data': ['commit', 'stream', 'mirror', 'backup'] }
+
+##
 # @BlockJobInfo:
 #
 # Information about a long-running block device operation.
-- 
1.8.3.1




[Qemu-devel] [PATCH 3/3] qapi: make use of new BlockJobType

2013-10-08 Thread Fam Zheng
Switch the string to enum type BlockJobType in BlockJobDriver.

Signed-off-by: Fam Zheng f...@redhat.com
---
 block/backup.c   | 2 +-
 block/commit.c   | 2 +-
 block/mirror.c   | 2 +-
 block/stream.c   | 2 +-
 blockjob.c   | 4 ++--
 include/block/blockjob.h | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index d374472..cad14c9 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -204,7 +204,7 @@ static void backup_iostatus_reset(BlockJob *job)
 
 static const BlockJobDriver backup_job_driver = {
 .instance_size  = sizeof(BackupBlockJob),
-.job_type   = backup,
+.job_type   = BLOCK_JOB_TYPE_BACKUP,
 .set_speed  = backup_set_speed,
 .iostatus_reset = backup_iostatus_reset,
 };
diff --git a/block/commit.c b/block/commit.c
index 5146138..d4090cb 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -175,7 +175,7 @@ static void commit_set_speed(BlockJob *job, int64_t speed, 
Error **errp)
 
 static const BlockJobDriver commit_job_driver = {
 .instance_size = sizeof(CommitBlockJob),
-.job_type  = commit,
+.job_type  = BLOCK_JOB_TYPE_COMMIT,
 .set_speed = commit_set_speed,
 };
 
diff --git a/block/mirror.c b/block/mirror.c
index 991cc24..7b95acf 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -527,7 +527,7 @@ static void mirror_complete(BlockJob *job, Error **errp)
 
 static const BlockJobDriver mirror_job_driver = {
 .instance_size = sizeof(MirrorBlockJob),
-.job_type  = mirror,
+.job_type  = BLOCK_JOB_TYPE_MIRROR,
 .set_speed = mirror_set_speed,
 .iostatus_reset= mirror_iostatus_reset,
 .complete  = mirror_complete,
diff --git a/block/stream.c b/block/stream.c
index 7f412bd..694fd42 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -205,7 +205,7 @@ static void stream_set_speed(BlockJob *job, int64_t speed, 
Error **errp)
 
 static const BlockJobDriver stream_job_driver = {
 .instance_size = sizeof(StreamBlockJob),
-.job_type  = stream,
+.job_type  = BLOCK_JOB_TYPE_STREAM,
 .set_speed = stream_set_speed,
 };
 
diff --git a/blockjob.c b/blockjob.c
index 6814e69..9e5fd5c 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -209,7 +209,7 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, 
int64_t ns)
 BlockJobInfo *block_job_query(BlockJob *job)
 {
 BlockJobInfo *info = g_new0(BlockJobInfo, 1);
-info-type  = g_strdup(job-driver-job_type);
+info-type  = g_strdup(BlockJobType_lookup[job-driver-job_type]);
 info-device= g_strdup(bdrv_get_device_name(job-bs));
 info-len   = job-len;
 info-busy  = job-busy;
@@ -236,7 +236,7 @@ QObject *qobject_from_block_job(BlockJob *job)
   'len': % PRId64 ,
   'offset': % PRId64 ,
   'speed': % PRId64  },
-  job-driver-job_type,
+  BlockJobType_lookup[job-driver-job_type],
   bdrv_get_device_name(job-bs),
   job-len,
   job-offset,
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 99359b5..d76de62 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -37,7 +37,7 @@ typedef struct BlockJobDriver {
 size_t instance_size;
 
 /** String describing the operation, part of query-block-jobs QMP API */
-const char *job_type;
+BlockJobType job_type;
 
 /** Optional callback for job types that support setting a speed limit */
 void (*set_speed)(BlockJob *job, int64_t speed, Error **errp);
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v4 6/7] commit: remove unused check

2013-10-08 Thread Fam Zheng
On Mon, 09/30 14:17, Paolo Bonzini wrote:
 Il 30/09/2013 14:02, Fam Zheng ha scritto:
  We support top == active for commit now, remove the check which is dead
  code now.
  
  Signed-off-by: Fam Zheng f...@redhat.com
  ---
   block/commit.c | 7 ---
   1 file changed, 7 deletions(-)
  
  diff --git a/block/commit.c b/block/commit.c
  index ac4b7cc..086f8c9 100644
  --- a/block/commit.c
  +++ b/block/commit.c
  @@ -198,13 +198,6 @@ void commit_start(BlockDriverState *bs, 
  BlockDriverState *base,
   return;
   }
   
  -/* Once we support top == active layer, remove this check */
  -if (top == bs) {
  -error_setg(errp,
  -   Top image as the active layer is currently 
  unsupported);
  -return;
  -}
  -
   if (top == base) {
   error_setg(errp, Invalid files for merge: top and base are the 
  same);
   return;
  
 
 Perhaps this could even become an assertion, or it could take care of
 calling commit_active_start?
 

Good idea. I'll make it an assertion.

Thanks,
Fam



Re: [Qemu-devel] savevm/loadvm

2013-10-08 Thread Paolo Bonzini
Il 08/10/2013 11:23, Kevin Wolf ha scritto:
  I think you need to modify qcow2_save_vmstate to save and restore
  bs-total_sectors.  Can you test that and if so post the patch?
 It's a regression introduced by commit df2a6f29, right?

Yes, that's what introduced the if.

 What you suggest probably works as a quick hack to fix the bug. The VM
 state functions in qcow2 are getting uglier and uglier, though. Maybe
 they should avoid going through block.c and adding hacks to disable or
 revert side effects.

Yes, that would work too.

Paolo



Re: [Qemu-devel] BUG: RTC issue when Windows guest is idle

2013-10-08 Thread Xiexiangyou
Hi:

 I have met the same bug that windows2008 guest stop receive the RTC 
ticks when it in idle status by fortuitous.

When vnc connect, guest will resume to receive RTC ticks and the time run fast 
because of the coalesced timer



HPET is diabled, and RTC is set catchup, as following:

  clock offset='utc'

timer name='rtc' tickpolicy='catchup' track='guest'/

timer name='hpet' present='no'/

  /clock



My kvm module is version3.6. Should I upgrade it to latest version. Any 
suggestion is welcome !



Thanks!



--xiangyouxie



On Thu, Feb 21, 2013 at 06:16:10PM +, Matthew Anderson wrote:

 If this isn't the correct list just let me know,



 I've run into a bug whereby a Windows guest (tested on Server 2008R2 and

 2012) no longer receives RTC ticks when it has been idle for a random amount

 of time. HPET is disabled and the guest is running Hyper-V relaxed timers

 (same situation without hv_relaxed). The guest clock stands still and the

 qemu process uses very little CPU (0.5%, normally it's 5% when the guest is

 idle) . Eventually the guest stops responding to network requests but if you

 open the guest console via VNC and move the mouse around it comes back to

 life and QEMU replays the lost RTC ticks and the guest recovers. I've also

 been able to make it recover by querying the clock over the network via the

 net time command, you can see the clock stand still for 30 seconds then it

 replays the ticks and catches up.



 I've tried to reproduce the issue but it seems fairly illusive, the only way

 I've been able to reproduce it is by letting the VM's idle and waiting.

 Sometimes it's hours and sometimes minutes. Can anyone suggest a way to

 narrow the issue down?



 Qemu command line is-

 /usr/bin/kvm -name SQL01 -S -M pc-0.14 -cpu qemu64,hv_relaxed -enable-kvm -m

 2048 -smp 2,sockets=2,cores=1,threads=1 -uuid

 5f54333b-c250-aa72-c979-39d156814b85 -no-user-config -nodefaults -chardev

 socket,id=charmonitor,path=/var/lib/libvirt/qemu/iHost-SQL01.monitor,server,nowait

  -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime

 -no-hpet -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2

 -drive

 file=/mnt/gluster1-norep/iHost/SQL01.qed,if=none,id=drive-virtio-disk0,format=qed,cache=writeback

  -device

 virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0

  -drive

 file=/mnt/gluster1-norep/iHost/SQL01-Data.qed,if=none,id=drive-virtio-disk2,format=qed,cache=writeback

  -device

 virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk2,id=virtio-disk2

  -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device

 ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 -netdev

 tap,fd=29,id=hostnet0,vhost=on,vhostfd=39 -device

 virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:2c:8d:23,bus=pci.0,addr=0x3

  -chardev pty,id=charserial0 -device

 isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc

 127.0.0.1:22 -vga cirrus -device

 virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4



 Environment is -

 Mainline 3.7.5 and 3.8.0

 Qemu 1.2.2, 1.3.1 and 1.4.0





Re: [Qemu-devel] [PATCH v2 14/17] qemu-iotests: Check autodel behaviour for device_del

2013-10-08 Thread Kevin Wolf
Am 01.10.2013 um 19:06 hat Eric Blake geschrieben:
 On 10/01/2013 07:20 AM, Kevin Wolf wrote:
  Block devices creates with -drive and drive_add should automatically
  disappear if the guest device is unplugged. blockdev-add ones shouldn't.
  
  Signed-off-by: Kevin Wolf kw...@redhat.com
  Reviewed-by: Max Reitz mre...@redhat.com
  ---
   tests/qemu-iotests/064   | 133 
  +++
   tests/qemu-iotests/064.out   |  80 +++
   tests/qemu-iotests/common.filter |   8 +++
   tests/qemu-iotests/group |   1 +
   4 files changed, 222 insertions(+)
   create mode 100755 tests/qemu-iotests/064
   create mode 100644 tests/qemu-iotests/064.out
 
 Reviewed-by: Eric Blake ebl...@redhat.com
 
 
  +
  +_supported_fmt qcow2
  +_supported_proto file
  +_supported_os Linux
 
 Is this test really Linux-only?

Probably most tests aren't, but all of them have _supported_os Linux.
If anyone wanted to use this for something, they'd have to make a bigger
change than just to this patch.

Kevin



Re: [Qemu-devel] [RFC] qapi/error: Optional error propagation backtrace

2013-10-08 Thread Max Reitz

On 2013-09-09 15:53, Max Reitz wrote:

Add a configure switch which enables an error propagation backtrace.
This results in the error_set function prepending every message by the
source file name, function and line in which it was called, as well as
error_propagate appending this information to the propagated message,
resulting in a backtrace.

Signed-off-by: Max Reitz mre...@redhat.com


Ping – any comments on this?


---
Since this obviously breaks existing tests (and cannot really be used for
new tests since it includes a line number, which is a rather volatile
output) and is generally not very useful to normal users, the switch is
disabled by default. This functionality is intended for developers
tracking down error paths.

Since I do not know whether I am the only one considering it useful
in the first place, this is just an RFC for now.

Furthermore, I am not sure whether a configure switch is really the right
way to implement this.
---
  configure| 12 +++
  include/qapi/error.h | 40 +++-
  util/error.c | 57 +++-
  3 files changed, 90 insertions(+), 19 deletions(-)

diff --git a/configure b/configure
index e989609..4e43f7b 100755
--- a/configure
+++ b/configure
@@ -243,6 +243,7 @@ gtk=
  gtkabi=2.0
  tpm=no
  libssh2=
+error_backtrace=no
  
  # parse CC options first

  for opt do
@@ -949,6 +950,10 @@ for opt do
;;
--enable-libssh2) libssh2=yes
;;
+  --enable-error-bt) error_backtrace=yes
+  ;;
+  --disable-error-bt) error_backtrace=no
+  ;;
*) echo ERROR: unknown option $opt; show_help=yes
;;
esac
@@ -1168,6 +1173,8 @@ echo   --gcov=GCOV  use specified gcov 
[$gcov_tool]
  echo   --enable-tpm enable TPM support
  echo   --disable-libssh2disable ssh block device support
  echo   --enable-libssh2 enable ssh block device support
+echo   --disable-error-bt   disable backtraces on internal errors
+echo   --enable-error-btenable backtraces on internal errors
  echo 
  echo NOTE: The object files are built at the place where configure is 
launched
  exit 1
@@ -3650,6 +3657,7 @@ echo gcov  $gcov_tool
  echo gcov enabled  $gcov
  echo TPM support   $tpm
  echo libssh2 support   $libssh2
+echo error backtraces  $error_backtrace
  echo TPM passthrough   $tpm_passthrough
  echo QOM debugging $qom_cast_debug
  
@@ -4044,6 +4052,10 @@ if test $libssh2 = yes ; then

echo CONFIG_LIBSSH2=y  $config_host_mak
  fi
  
+if test $error_backtrace = yes ; then

+  echo CONFIG_ERROR_BACKTRACE=y  $config_host_mak
+fi
+
  if test $virtio_blk_data_plane = yes ; then
echo 'CONFIG_VIRTIO_BLK_DATA_PLANE=$(CONFIG_VIRTIO)'  $config_host_mak
  fi
diff --git a/include/qapi/error.h b/include/qapi/error.h
index ffd1cea..d3cfe35 100644
--- a/include/qapi/error.h
+++ b/include/qapi/error.h
@@ -25,16 +25,28 @@ typedef struct Error Error;
  /**
   * Set an indirect pointer to an error given a ErrorClass value and a
   * printf-style human message.  This function is not meant to be used outside
- * of QEMU.
+ * of QEMU.  The message will be prepended by the file/function/line 
information
+ * if CONFIG_ERROR_BACKTRACE is defined.
   */
-void error_set(Error **err, ErrorClass err_class, const char *fmt, ...) 
GCC_FMT_ATTR(3, 4);
+void error_set_bt(const char *file, const char *func, int line,
+  Error **err, ErrorClass err_class, const char *fmt, ...)
+GCC_FMT_ATTR(6, 7);
+
+#define error_set(...) \
+error_set_bt(__FILE__, __func__, __LINE__, __VA_ARGS__)
  
  /**

   * Set an indirect pointer to an error given a ErrorClass value and a
   * printf-style human message, followed by a strerror() string if
- * @os_error is not zero.
+ * @os_error is not zero.  The message will be prepended by the
+ * file/function/line information if CONFIG_ERROR_BACKTRACE is defined.
   */
-void error_set_errno(Error **err, int os_error, ErrorClass err_class, const 
char *fmt, ...) GCC_FMT_ATTR(4, 5);
+void error_set_errno_bt(const char *file, const char *func, int line,
+Error **err, int os_error, ErrorClass err_class,
+const char *fmt, ...) GCC_FMT_ATTR(7, 8);
+
+#define error_set_errno(...) \
+error_set_errno_bt(__FILE__, __func__, __LINE__, __VA_ARGS__)
  
  /**

   * Same as error_set(), but sets a generic error
@@ -47,7 +59,11 @@ void error_set_errno(Error **err, int os_error, ErrorClass 
err_class, const char
  /**
   * Helper for open() errors
   */
-void error_setg_file_open(Error **errp, int os_errno, const char *filename);
+void error_setg_file_open_bt(const char *file, const char *func, int line,
+ Error **errp, int os_errno, const char *filename);
+
+#define error_setg_file_open(...) \
+error_setg_file_open_bt(__FILE__, __func__, __LINE__, __VA_ARGS__)
  
  /**

   * Returns true if an indirect pointer to an error is 

[Qemu-devel] [PATCH] osdep: initialize glib threads in all QEMU tools

2013-10-08 Thread Stefan Hajnoczi
glib versions prior to 2.31.0 require an explicit g_thread_init() call
to enable multi-threading.

Failure to initialize threading causes glib to take single-threaded code
paths without synchronization.  For example, the g_slice allocator will
crash due to race conditions.

Fix this for all QEMU tool programs (qemu-nbd, qemu-io, qemu-img) by
moving the g_thread_init() call from vl.c:main() into a new
osdep.c:thread_init() constructor function.

thread_init() has __attribute__((constructor)) and is automatically
invoked by the runtime during startup.

We can now drop the simple trace backend's g_thread_init() call since
thread_init() already called it.

Note that we must keep coroutine-gthread.c's g_thread_init() call which
is located in a constructor function.  There is no guarantee for
constructor function ordering so thread_init() may only be called later.

Reported-by: Mario de Chenno mario.deche...@unina2.it
Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 trace/simple.c |  9 -
 util/osdep.c   | 18 ++
 vl.c   |  8 
 3 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/trace/simple.c b/trace/simple.c
index 1e3f691..7833309 100644
--- a/trace/simple.c
+++ b/trace/simple.c
@@ -404,15 +404,6 @@ bool trace_backend_init(const char *events, const char 
*file)
 {
 GThread *thread;
 
-if (!g_thread_supported()) {
-#if !GLIB_CHECK_VERSION(2, 31, 0)
-g_thread_init(NULL);
-#else
-fprintf(stderr, glib threading failed to initialize.\n);
-exit(1);
-#endif
-}
-
 #if !GLIB_CHECK_VERSION(2, 31, 0)
 trace_available_cond = g_cond_new();
 trace_empty_cond = g_cond_new();
diff --git a/util/osdep.c b/util/osdep.c
index 62072b4..e29ba6f 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -437,6 +437,24 @@ int socket_init(void)
 return 0;
 }
 
+/* Ensure that glib is running in multi-threaded mode */
+static void __attribute__((constructor)) thread_init(void)
+{
+if (!g_thread_supported()) {
+#if !GLIB_CHECK_VERSION(2, 31, 0)
+/* Old versions of glib require explicit initialization.  Failure to do
+ * this results in the single-threaded code paths being taken inside
+ * glib.  For example, the g_slice allocator will not be thread-safe
+ * and cause crashes.
+ */
+g_thread_init(NULL);
+#else
+fprintf(stderr, glib threading failed to initialize.\n);
+exit(1);
+#endif
+}
+}
+
 #ifndef CONFIG_IOVEC
 /* helper function for iov_send_recv() */
 static ssize_t
diff --git a/vl.c b/vl.c
index 983cdc6..e2dee8e 100644
--- a/vl.c
+++ b/vl.c
@@ -2857,14 +2857,6 @@ int main(int argc, char **argv, char **envp)
 error_set_progname(argv[0]);
 
 g_mem_set_vtable(mem_trace);
-if (!g_thread_supported()) {
-#if !GLIB_CHECK_VERSION(2, 31, 0)
-g_thread_init(NULL);
-#else
-fprintf(stderr, glib threading failed to initialize.\n);
-exit(1);
-#endif
-}
 
 module_call_init(MODULE_INIT_QOM);
 
-- 
1.8.3.1




[Qemu-devel] [Bug 1233225] Re: mips/mipsel linux user float division problem

2013-10-08 Thread Petar Jovanovic
This is a known issue.
There was a fix proposal by Thomas Schwinge back in June

http://patchwork.ozlabs.org/patch/250161/

but he has not updated the patch per suggestion ever since, though the patch
as is was much closer to correct behaviour than what it is now in the source.

If anyone is in hurry, he/she can pick up that change.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1233225

Title:
  mips/mipsel linux user float division problem

Status in QEMU:
  Confirmed

Bug description:
  Hi,

  I tested the following with the qemu git HEAD as of 2013-09-30 on
  Debian stable and testing. My host runs amd64 but I also tried this
  out inside a i386 chroot with the same result. The problem occurs for
  mips and mipsel. Given the following program:

  #include stdio.h
  int main(int argc, char **argv)
  {
  int a = 1;
  double d = a/2.0;
  printf(%f\n, d);
  return 0;
  }

  Instead of printing 0.5, it will print 2.0 if executed in qemu user
  mode.

  $ mipsel-linux-gnu-gcc mipstest.c
  $ ~/qemu/mipsel-linux-user/qemu-mipsel ./a.out
  2.0

  Expecting this to be a problem with my cross compiler (gcc-4.4 from
  emdebian) I ran a fully emulated debian squeeze environment inside
  qemu. There, I compiled the same program natively with gcc and as
  expected got 0.5 as the output. I also copied the cross compiled
  binary inside the emulated environment and also got 0.5 when I ran it.
  So the same mips/mipsel binary produces different output depending on
  whether it is run in a fully emulated environment or qemu user mode.

  Can anybody else reproduce this problem?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1233225/+subscriptions



Re: [Qemu-devel] [PATCH v2 2/2] KVM: s390: add floating irq controller

2013-10-08 Thread Jens Freimann
On Sat, Oct 05, 2013 at 01:53:33AM +0200, Alexander Graf wrote:
 
 On 06.09.2013, at 14:19, Jens Freimann wrote:
 
[snip] 
  -int kvm_s390_inject_vm(struct kvm *kvm,
  -  struct kvm_s390_interrupt *s390int)
  +static void __inject_vm(struct kvm *kvm, struct kvm_s390_interrupt_info 
  *inti)
 
 This really doesn't only inject, it enqueues an interrupt and also injects 
 them then, right?

hmm, maybe the term injecting is misleading. What is meant by injecting here is 
simply adding
it to the list. The actual injection into the guest lowcore is done by the 
do_deliver_interrupt
function.

  {
  struct kvm_s390_local_interrupt *li;
  struct kvm_s390_float_interrupt *fi;
  -   struct kvm_s390_interrupt_info *inti, *iter;
  +   struct kvm_s390_interrupt_info *iter;
  int sigcpu;
  
  +   mutex_lock(kvm-lock);
  +   fi = kvm-arch.float_int;
 
 You probably want to move this into your device structure eventually.

Yes, good point.
 
  +   spin_lock(fi-lock);
  +   if (!is_ioint(inti-type)) {
  +   list_add_tail(inti-list, fi-list);
  +   } else {
  +   u64 isc_bits = int_word_to_isc_bits(inti-io.io_int_word);
  +
  +   /* Keep I/O interrupts sorted in isc order. */
  +   list_for_each_entry(iter, fi-list, list) {
  +   if (!is_ioint(iter-type))
  +   continue;
  +   if (int_word_to_isc_bits(iter-io.io_int_word)
  +   = isc_bits)
  +   continue;
  +   break;
  +   }
  +   list_add_tail(inti-list, iter-list);
  +   }
  +   atomic_set(fi-active, 1);
  +   sigcpu = find_first_bit(fi-idle_mask, KVM_MAX_VCPUS);
  +   if (sigcpu == KVM_MAX_VCPUS) {
  +   do {
  +   sigcpu = fi-next_rr_cpu++;
  +   if (sigcpu == KVM_MAX_VCPUS)
  +   sigcpu = fi-next_rr_cpu = 0;
  +   } while (fi-local_int[sigcpu] == NULL);
  +   }
  +   li = fi-local_int[sigcpu];
  +   spin_lock_bh(li-lock);
  +   atomic_set_mask(CPUSTAT_EXT_INT, li-cpuflags);
  +   if (waitqueue_active(li-wq))
  +   wake_up_interruptible(li-wq);
 
 Does this kick the other vcpu to notify it that an irq is available?
 
 The code is very spaghetti style. There's certainly a good opportunity to 
 split things up and make it more readable here ;). I think splitting it into 
 enqueue and target cpu deliver parts makes a lot of sense for starters.

I'm working on a larger patchset that changes interrupt injection and delivery.
The rationale is to make it more readable, use less locking (get rid of lists 
use bitmaps) and become more PoP-compliant. I'll try to make it more obvious.
 
 Btw, usually the way these interrupt controllers work in KVM is that the CPU 
 local interrupt controller part fetches interrupts from the floating 
 interrupt controller. So in here you would only enqueue and maybe kick a 
 potential receiver of the interrupt. Before a vcpu goes back into guest 
 state, it asks the floating interrupt controller whether there's anything 
 pending for it. If so, it delivers it.

I think that's kind of what we're doing here. We enqueue by adding it to our 
list
of pending interrupts, look for an idle cpu, set the external interrupt flag 
and wake it up. 

 
 That way even if the vcpu you chose here happens to get preempted, you still 
 deliver the interrupt quickly through another vcpu.
 
  +   spin_unlock_bh(li-lock);
  +   spin_unlock(fi-lock);
  +   mutex_unlock(kvm-lock);
  +}
  +
  +int kvm_s390_inject_vm(struct kvm *kvm,
  +  struct kvm_s390_interrupt *s390int)
  +{
  +   struct kvm_s390_interrupt_info *inti;
  +
  inti = kzalloc(sizeof(*inti), GFP_KERNEL);
  if (!inti)
  return -ENOMEM;
  
  -   switch (s390int-type) {
  +   inti-type = s390int-type;
  +   switch (inti-type) {
  case KVM_S390_INT_VIRTIO:
  VM_EVENT(kvm, 5, inject: virtio parm:%x,parm64:%llx,
   s390int-parm, s390int-parm64);
  -   inti-type = s390int-type;
  inti-ext.ext_params = s390int-parm;
  inti-ext.ext_params2 = s390int-parm64;
  break;
  case KVM_S390_INT_SERVICE:
  VM_EVENT(kvm, 5, inject: sclp parm:%x, s390int-parm);
  -   inti-type = s390int-type;
  inti-ext.ext_params = s390int-parm;
  break;
  -   case KVM_S390_PROGRAM_INT:
  -   case KVM_S390_SIGP_STOP:
  -   case KVM_S390_INT_EXTERNAL_CALL:
  -   case KVM_S390_INT_EMERGENCY:
  -   kfree(inti);
  -   return -EINVAL;
  case KVM_S390_MCHK:
  VM_EVENT(kvm, 5, inject: machine check parm64:%llx,
   s390int-parm64);
  -   inti-type = s390int-type;
  inti-mchk.cr14 = s390int-parm; /* upper bits are not used */
  inti-mchk.mcic = s390int-parm64;
  break;
  case 

Re: [Qemu-devel] [PATCHv5] block/get_block_status: avoid redundant callouts on raw devices

2013-10-08 Thread Stefan Hajnoczi
On Mon, Oct 7, 2013 at 10:25 AM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 07/10/2013 07:59, Peter Lieven ha scritto:
 if a raw device like an iscsi target or host device is used
 the current implementation makes a second call out to get
 the block status of bs-file.

 Signed-off-by: Peter Lieven p...@kamp.de
 ---
 v5: add a generic get_lba_status function in the raw driver which
 adds the BDRV_BLOCK_RAW flag. bdrv_co_get_block_status will
 handle the callout to bs-file then.

 v4: use a flag to detect the raw driver instead of the strncmp
 hack

  block.c   |4 
  block/raw_bsd.c   |3 ++-
  include/block/block.h |4 
  3 files changed, 10 insertions(+), 1 deletion(-)

 diff --git a/block.c b/block.c
 index 93e113a..38a589e 100644
 --- a/block.c
 +++ b/block.c
 @@ -3147,6 +3147,10 @@ static int64_t coroutine_fn 
 bdrv_co_get_block_status(BlockDriverState *bs,
  return ret;
  }

 +if (ret  BDRV_BLOCK_RAW) {
 +return bdrv_get_block_status(bs-file, sector_num, nb_sectors, 
 pnum);

 Strictly speaking, this should probably do something like this:

   assert(ret  BDRV_BLOCK_OFFSET_VALID);
   return bdrv_get_block_status(bs-file, ret  BDRV_SECTOR_BITS,
nb_sectors, pnum);

 Or alternatively the raw driver should return just BDRV_BLOCK_RAW.

 As a third option, the raw driver could also return not just
 BDRV_BLOCK_RAW and BDRV_BLOCK_OFFSET_VALID, but also BDRV_BLOCK_DATA (so
 that the answer makes some sense even without going down to bs-file).

 But I'll let the block maintainers decide what to do.

The return value is a bitfield, so let's use those bits and take Option 3.

Stefan



[Qemu-devel] centos5 32bit and debian6 32bit problem with virtio-serial (guest-agent)

2013-10-08 Thread Hamed Afshar
Hi,
I've enabled qemu-guest-agent for a VM. When the guest is centos6, I can 
properly find /dev/virtio-ports/org.qemu.guest_agent.0.
But when the guest is either centos5 32bit or debian6 32bit (these are the OSes 
I've checked so far), I cannot find the device in /dev.
The command 'lspci' displays 00:08.0 Communication controller: Red Hat, Inc 
Virtio console.
I'm not sure if there is a problem with the driver or it's just mapped with 
other names in /dev.

I would be glad if you assist.

Thanks,
Hamed Afshar

[Qemu-devel] [PATCHv4 03/17] block: introduce BDRV_REQ_MAY_UNMAP request flag

2013-10-08 Thread Peter Lieven
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block-migration.c |3 ++-
 block.c   |4 
 block/backup.c|2 +-
 include/block/block.h |7 +++
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 713a8e3..fc4ef93 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -780,7 +780,8 @@ static int block_load(QEMUFile *f, void *opaque, int 
version_id)
 }
 
 if (flags  BLK_MIG_FLAG_ZERO_BLOCK) {
-ret = bdrv_write_zeroes(bs, addr, nr_sectors, 0);
+ret = bdrv_write_zeroes(bs, addr, nr_sectors,
+BDRV_REQ_MAY_UNMAP);
 } else {
 buf = g_malloc(BLOCK_SIZE);
 qemu_get_buffer(f, buf, BLOCK_SIZE);
diff --git a/block.c b/block.c
index 8ed53a2..0675257 100644
--- a/block.c
+++ b/block.c
@@ -2801,6 +2801,10 @@ int coroutine_fn bdrv_co_write_zeroes(BlockDriverState 
*bs,
 {
 trace_bdrv_co_write_zeroes(bs, sector_num, nb_sectors);
 
+if (!(bs-open_flags  BDRV_O_UNMAP)) {
+flags = ~BDRV_REQ_MAY_UNMAP;
+}
+
 return bdrv_co_do_writev(bs, sector_num, nb_sectors, NULL,
  BDRV_REQ_ZERO_WRITE | flags);
 }
diff --git a/block/backup.c b/block/backup.c
index 5f6a642..1ab080d 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -139,7 +139,7 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
 if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
 ret = bdrv_co_write_zeroes(job-target,
start * BACKUP_SECTORS_PER_CLUSTER,
-   n, 0);
+   n, BDRV_REQ_MAY_UNMAP);
 } else {
 ret = bdrv_co_writev(job-target,
  start * BACKUP_SECTORS_PER_CLUSTER, n,
diff --git a/include/block/block.h b/include/block/block.h
index 47c6475..b72c81a 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -65,6 +65,13 @@ typedef struct BlockDevOps {
 typedef enum {
 BDRV_REQ_COPY_ON_READ = 0x1,
 BDRV_REQ_ZERO_WRITE   = 0x2,
+/* The BDRV_REQ_MAY_UNMAP flag is used to indicate that the block driver
+ * is allowed to optimize a write zeroes request by unmapping (discarding)
+ * blocks if it is guaranteed that the result will read back as
+ * zeroes. The flag is only passed to the driver if the block device is
+ * opened with BDRV_O_UNMAP.
+ */
+BDRV_REQ_MAY_UNMAP= 0x4,
 } BdrvRequestFlags;
 
 #define BDRV_O_RDWR0x0002
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 05/17] block/raw: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes

2013-10-08 Thread Peter Lieven
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block/raw_bsd.c |   56 +--
 1 file changed, 34 insertions(+), 22 deletions(-)

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index d5ab295..8dc7bba 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -131,6 +131,16 @@ static int raw_has_zero_init(BlockDriverState *bs)
 return bdrv_has_zero_init(bs-file);
 }
 
+static bool raw_has_discard_zeroes(BlockDriverState *bs)
+{
+return bdrv_has_discard_zeroes(bs-file);
+}
+
+static bool raw_has_discard_write_zeroes(BlockDriverState *bs)
+{
+return bdrv_has_discard_write_zeroes(bs-file);
+}
+
 static int raw_create(const char *filename, QEMUOptionParameter *options,
   Error **errp)
 {
@@ -165,28 +175,30 @@ static int raw_probe(const uint8_t *buf, int buf_size, 
const char *filename)
 }
 
 static BlockDriver bdrv_raw = {
-.format_name  = raw,
-.bdrv_probe   = raw_probe,
-.bdrv_reopen_prepare  = raw_reopen_prepare,
-.bdrv_open= raw_open,
-.bdrv_close   = raw_close,
-.bdrv_create  = raw_create,
-.bdrv_co_readv= raw_co_readv,
-.bdrv_co_writev   = raw_co_writev,
-.bdrv_co_write_zeroes = raw_co_write_zeroes,
-.bdrv_co_discard  = raw_co_discard,
-.bdrv_co_get_block_status = raw_co_get_block_status,
-.bdrv_truncate= raw_truncate,
-.bdrv_getlength   = raw_getlength,
-.bdrv_get_info= raw_get_info,
-.bdrv_is_inserted = raw_is_inserted,
-.bdrv_media_changed   = raw_media_changed,
-.bdrv_eject   = raw_eject,
-.bdrv_lock_medium = raw_lock_medium,
-.bdrv_ioctl   = raw_ioctl,
-.bdrv_aio_ioctl   = raw_aio_ioctl,
-.create_options   = raw_create_options[0],
-.bdrv_has_zero_init   = raw_has_zero_init
+.format_name   = raw,
+.bdrv_probe= raw_probe,
+.bdrv_reopen_prepare   = raw_reopen_prepare,
+.bdrv_open = raw_open,
+.bdrv_close= raw_close,
+.bdrv_create   = raw_create,
+.bdrv_co_readv = raw_co_readv,
+.bdrv_co_writev= raw_co_writev,
+.bdrv_co_write_zeroes  = raw_co_write_zeroes,
+.bdrv_co_discard   = raw_co_discard,
+.bdrv_co_get_block_status  = raw_co_get_block_status,
+.bdrv_truncate = raw_truncate,
+.bdrv_getlength= raw_getlength,
+.bdrv_get_info = raw_get_info,
+.bdrv_is_inserted  = raw_is_inserted,
+.bdrv_media_changed= raw_media_changed,
+.bdrv_eject= raw_eject,
+.bdrv_lock_medium  = raw_lock_medium,
+.bdrv_ioctl= raw_ioctl,
+.bdrv_aio_ioctl= raw_aio_ioctl,
+.create_options= raw_create_options[0],
+.bdrv_has_zero_init= raw_has_zero_init,
+.bdrv_has_discard_zeroes   = raw_has_discard_zeroes,
+.bdrv_has_discard_write_zeroes = raw_has_discard_write_zeroes,
 };
 
 static void bdrv_raw_init(void)
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 01/17] block: make BdrvRequestFlags public

2013-10-08 Thread Peter Lieven
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block.c   |5 -
 include/block/block.h |5 +
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index 93e113a..08cba1e 100644
--- a/block.c
+++ b/block.c
@@ -51,11 +51,6 @@
 
 #define NOT_DONE 0x7fff /* used while emulated sync operation in progress 
*/
 
-typedef enum {
-BDRV_REQ_COPY_ON_READ = 0x1,
-BDRV_REQ_ZERO_WRITE   = 0x2,
-} BdrvRequestFlags;
-
 static void bdrv_dev_change_media_cb(BlockDriverState *bs, bool load);
 static BlockDriverAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
 int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
diff --git a/include/block/block.h b/include/block/block.h
index f808550..9fb0c3d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -62,6 +62,11 @@ typedef struct BlockDevOps {
 void (*resize_cb)(void *opaque);
 } BlockDevOps;
 
+typedef enum {
+BDRV_REQ_COPY_ON_READ = 0x1,
+BDRV_REQ_ZERO_WRITE   = 0x2,
+} BdrvRequestFlags;
+
 #define BDRV_O_RDWR0x0002
 #define BDRV_O_SNAPSHOT0x0008 /* open the file read only and save writes 
in a snapshot */
 #define BDRV_O_NOCACHE 0x0020 /* do not use the host page cache */
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 06/17] block: add BlockLimits structure to BlockDriverState

2013-10-08 Thread Peter Lieven
this patch adds BlockLimits which introduces discard and write_zeroes
limits and alignment information to the BlockDriverState.

Signed-off-by: Peter Lieven p...@kamp.de
---
 include/block/block_int.h |   17 +
 1 file changed, 17 insertions(+)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index ce5c62e..775b7ab 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -233,6 +233,20 @@ struct BlockDriver {
 QLIST_ENTRY(BlockDriver) list;
 };
 
+typedef struct BlockLimits {
+/* maximum number of sectors that can be discarded at once */
+int max_discard;
+
+/* optimal alignment for discard requests in sectors */
+int64_t discard_alignment;
+
+/* maximum number of sectors that can zeroized at once */
+int max_write_zeroes;
+
+/* optimal alignment for write zeroes requests in sectors */
+int64_t write_zeroes_alignment;
+} BlockLimits;
+
 /*
  * Note: the function bdrv_append() copies and swaps contents of
  * BlockDriverStates, so if you add new fields to this struct, please
@@ -286,6 +300,9 @@ struct BlockDriverState {
 uint64_t total_time_ns[BDRV_MAX_IOTYPE];
 uint64_t wr_highest_sector;
 
+/* I/O Limits */
+BlockLimits bl;
+
 /* Whether the disk can expand beyond total_sectors */
 int growable;
 
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 14/17] block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks

2013-10-08 Thread Peter Lieven
this patch does 2 things:
a) only do additional call outs if BDRV_BLOCK_ZERO is not already set.
b) use the newly introduced bdrv_has_discard_zeroes() to return the
   zero state of an unallocated block. the used callout to
   bdrv_has_zero_init() is only valid right after bdrv_create.

Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index fc931e3..1be4418 100644
--- a/block.c
+++ b/block.c
@@ -3247,8 +3247,8 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status(BlockDriverState *bs,
 return ret;
 }
 
-if (!(ret  BDRV_BLOCK_DATA)) {
-if (bdrv_has_zero_init(bs)) {
+if (!(ret  BDRV_BLOCK_DATA)  !(ret  BDRV_BLOCK_ZERO)) {
+if (bdrv_has_discard_zeroes(bs)) {
 ret |= BDRV_BLOCK_ZERO;
 } else if (bs-backing_hd) {
 BlockDriverState *bs2 = bs-backing_hd;
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 02/17] block: add flags to bdrv_*_write_zeroes

2013-10-08 Thread Peter Lieven
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block-migration.c |2 +-
 block.c   |   20 +++-
 block/backup.c|3 ++-
 block/qcow2-cluster.c |2 +-
 block/qcow2.c |2 +-
 block/qed.c   |3 ++-
 block/raw_bsd.c   |5 +++--
 block/vmdk.c  |3 ++-
 include/block/block.h |4 ++--
 include/block/block_int.h |2 +-
 qemu-io-cmds.c|2 +-
 11 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index daf9ec1..713a8e3 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -780,7 +780,7 @@ static int block_load(QEMUFile *f, void *opaque, int 
version_id)
 }
 
 if (flags  BLK_MIG_FLAG_ZERO_BLOCK) {
-ret = bdrv_write_zeroes(bs, addr, nr_sectors);
+ret = bdrv_write_zeroes(bs, addr, nr_sectors, 0);
 } else {
 buf = g_malloc(BLOCK_SIZE);
 qemu_get_buffer(f, buf, BLOCK_SIZE);
diff --git a/block.c b/block.c
index 08cba1e..8ed53a2 100644
--- a/block.c
+++ b/block.c
@@ -79,7 +79,7 @@ static BlockDriverAIOCB 
*bdrv_co_aio_rw_vector(BlockDriverState *bs,
bool is_write);
 static void coroutine_fn bdrv_co_do_rw(void *opaque);
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
-int64_t sector_num, int nb_sectors);
+int64_t sector_num, int nb_sectors, BdrvRequestFlags flags);
 
 static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
 QTAILQ_HEAD_INITIALIZER(bdrv_states);
@@ -2375,10 +2375,11 @@ int bdrv_writev(BlockDriverState *bs, int64_t 
sector_num, QEMUIOVector *qiov)
 return bdrv_rwv_co(bs, sector_num, qiov, true, 0);
 }
 
-int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
+int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
+  int nb_sectors, BdrvRequestFlags flags)
 {
 return bdrv_rw_co(bs, sector_num, NULL, nb_sectors, true,
-  BDRV_REQ_ZERO_WRITE);
+  BDRV_REQ_ZERO_WRITE | flags);
 }
 
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
@@ -2560,7 +2561,7 @@ static int coroutine_fn 
bdrv_co_do_copy_on_readv(BlockDriverState *bs,
 if (drv-bdrv_co_write_zeroes 
 buffer_is_zero(bounce_buffer, iov.iov_len)) {
 ret = bdrv_co_do_write_zeroes(bs, cluster_sector_num,
-  cluster_nb_sectors);
+  cluster_nb_sectors, 0);
 } else {
 /* This does not change the data on the disk, it is not necessary
  * to flush even in cache=writethrough mode.
@@ -2694,7 +2695,7 @@ int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState 
*bs,
 }
 
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
-int64_t sector_num, int nb_sectors)
+int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
 BlockDriver *drv = bs-drv;
 QEMUIOVector qiov;
@@ -2706,7 +2707,7 @@ static int coroutine_fn 
bdrv_co_do_write_zeroes(BlockDriverState *bs,
 
 /* First try the efficient write zeroes operation */
 if (drv-bdrv_co_write_zeroes) {
-ret = drv-bdrv_co_write_zeroes(bs, sector_num, nb_sectors);
+ret = drv-bdrv_co_write_zeroes(bs, sector_num, nb_sectors, flags);
 if (ret != -ENOTSUP) {
 return ret;
 }
@@ -2761,7 +2762,7 @@ static int coroutine_fn 
bdrv_co_do_writev(BlockDriverState *bs,
 if (ret  0) {
 /* Do nothing, write notifier decided to fail this request */
 } else if (flags  BDRV_REQ_ZERO_WRITE) {
-ret = bdrv_co_do_write_zeroes(bs, sector_num, nb_sectors);
+ret = bdrv_co_do_write_zeroes(bs, sector_num, nb_sectors, flags);
 } else {
 ret = drv-bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
 }
@@ -2795,12 +2796,13 @@ int coroutine_fn bdrv_co_writev(BlockDriverState *bs, 
int64_t sector_num,
 }
 
 int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs,
-  int64_t sector_num, int nb_sectors)
+  int64_t sector_num, int nb_sectors,
+  BdrvRequestFlags flags)
 {
 trace_bdrv_co_write_zeroes(bs, sector_num, nb_sectors);
 
 return bdrv_co_do_writev(bs, sector_num, nb_sectors, NULL,
- BDRV_REQ_ZERO_WRITE);
+ BDRV_REQ_ZERO_WRITE | flags);
 }
 
 /**
diff --git a/block/backup.c b/block/backup.c
index 04c4b5c..5f6a642 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -138,7 +138,8 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
 
 if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
 ret = bdrv_co_write_zeroes(job-target,
-   start * BACKUP_SECTORS_PER_CLUSTER, n);
+   

[Qemu-devel] [PATCHv4 07/17] block: honour BlockLimits in bdrv_co_do_write_zeroes

2013-10-08 Thread Peter Lieven
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block.c |   65 +++
 1 file changed, 49 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index 6a46bc2..7551751 100644
--- a/block.c
+++ b/block.c
@@ -2694,32 +2694,65 @@ int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState 
*bs,
 BDRV_REQ_COPY_ON_READ);
 }
 
+/* if no limit is specified in the BlockLimits use a default
+ * of 32768 512-byte sectors (16 MiB) per request.
+ */
+#define MAX_WRITE_ZEROES_DEFAULT 32768
+
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
 BlockDriver *drv = bs-drv;
 QEMUIOVector qiov;
-struct iovec iov;
-int ret;
+struct iovec iov = {0};
+int ret = 0;
 
-/* TODO Emulate only part of misaligned requests instead of letting block
- * drivers return -ENOTSUP and emulate everything */
+int max_write_zeroes = bs-bl.max_write_zeroes ?
+   bs-bl.max_write_zeroes : MAX_WRITE_ZEROES_DEFAULT;
 
-/* First try the efficient write zeroes operation */
-if (drv-bdrv_co_write_zeroes) {
-ret = drv-bdrv_co_write_zeroes(bs, sector_num, nb_sectors, flags);
-if (ret != -ENOTSUP) {
-return ret;
+while (nb_sectors  0  !ret) {
+int num = nb_sectors;
+
+/* align request */
+if (bs-bl.write_zeroes_alignment 
+num = bs-bl.write_zeroes_alignment 
+sector_num % bs-bl.write_zeroes_alignment) {
+if (num  bs-bl.write_zeroes_alignment) {
+num = bs-bl.write_zeroes_alignment;
+}
+num -= sector_num % bs-bl.write_zeroes_alignment;
 }
-}
 
-/* Fall back to bounce buffer if write zeroes is unsupported */
-iov.iov_len  = nb_sectors * BDRV_SECTOR_SIZE;
-iov.iov_base = qemu_blockalign(bs, iov.iov_len);
-memset(iov.iov_base, 0, iov.iov_len);
-qemu_iovec_init_external(qiov, iov, 1);
+/* limit request size */
+if (num  max_write_zeroes) {
+num = max_write_zeroes;
+}
+
+ret = -ENOTSUP;
+/* First try the efficient write zeroes operation */
+if (drv-bdrv_co_write_zeroes) {
+ret = drv-bdrv_co_write_zeroes(bs, sector_num, num, flags);
+}
+
+if (ret == -ENOTSUP) {
+/* Fall back to bounce buffer if write zeroes is unsupported */
+iov.iov_len = num * BDRV_SECTOR_SIZE;
+if (iov.iov_base == NULL) {
+/* allocate bounce buffer only once and ensure that it
+ * is big enough for this and all future requests.
+ */
+size_t bufsize = num = nb_sectors ? num : max_write_zeroes;
+iov.iov_base = qemu_blockalign(bs, bufsize * BDRV_SECTOR_SIZE);
+memset(iov.iov_base, 0, bufsize * BDRV_SECTOR_SIZE);
+}
+qemu_iovec_init_external(qiov, iov, 1);
 
-ret = drv-bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
+ret = drv-bdrv_co_writev(bs, sector_num, num, qiov);
+}
+
+sector_num += num;
+nb_sectors -= num;
+}
 
 qemu_vfree(iov.iov_base);
 return ret;
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 16/17] qemu-img: conditionally zero out target on convert

2013-10-08 Thread Peter Lieven
If the target has_zero_init = 0, but supports efficiently
writing zeroes by unmapping we call bdrv_zeroize to
avoid fully allocating the target. This currently
is designed especially for iscsi.

Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 qemu-img.c |   10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/qemu-img.c b/qemu-img.c
index c6eff15..2d46de8 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1353,7 +1353,7 @@ static int img_convert(int argc, char **argv)
 }
 }
 
-flags = BDRV_O_RDWR;
+flags = min_sparse ? (BDRV_O_RDWR | BDRV_O_UNMAP) : BDRV_O_RDWR;
 ret = bdrv_parse_cache_flags(cache, flags);
 if (ret  0) {
 error_report(Invalid cache option: %s, cache);
@@ -1469,6 +1469,14 @@ static int img_convert(int argc, char **argv)
 } else {
 int has_zero_init = min_sparse ? bdrv_has_zero_init(out_bs) : 0;
 
+if (!has_zero_init  bdrv_has_discard_write_zeroes(out_bs)) {
+ret = bdrv_make_zero(out_bs, BDRV_REQ_MAY_UNMAP);
+if (ret  0) {
+goto out;
+}
+has_zero_init = 1;
+}
+
 sector_num = 0; // total number of sectors converted so far
 nb_sectors = total_sectors - sector_num;
 if (nb_sectors != 0) {
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 00/17] block: logical block provisioning enhancements

2013-10-08 Thread Peter Lieven
this patch adds the ability for targets to stay sparse during
block migration (if the zero_blocks capability is set) and qemu-img convert
even if the target does not have has_zero_init = 1.

the series was especially developed for iSCSI, but it should also work
with other drivers with little or no adjustments. these adjustments
should be limited to providing block provisioning information through
get_block_info and/or honouring BDRV_REQ_MAY_UNMAP on writing zeroes.

v3-v4:
 - changed BlockLimits struct to typedef (Stefan, Eric)
 - renamed bdrv_zeroize to bdrv_make_zero (Stefan)
 - added comment about the -S flag of qemu-img convert in
   qemu-img.texi (Eric)
 - used struct assignment for bs-bl in raw_open (Stefan, Eric)
 - dropped 3 get_block_status fixes that are independent of
   this series and already partly merged.

v2-v3:
 - fix merge conflict in block/qcow2_cluster.c
 - changed return type of bdrv_has_discard_zeroes and
   bdrv_has_discard_write_zeroes to bool.
 - moved alignment and limits info to a BlockLimits struct (Paolo).
 - added magic constanst for default maximum in bdrv_co_do_write_zeroes
   and bdrv_co_discard (Eric).
 - bdrv_co_do_write_zeroes: allocating the bounce buffer only once (Eric),
   fixed bounce iov_len in the fall back path.
 - bdrv_zeroize: added inline docu (Eric) and do not mask flags passed
   to bdrv_write_zeroes (Eric).
 - qemu-img: changed the default hint for -S (min_sparse) in the usage
   help to 4k. not changing the default as it is unclear why this default
   was set. size suffixes are already supported (Eric).

v1-v2:
 - moved block max_discard and max_write_zeroes to BlockDriverState
 - added discard_alignment and write_zeroes_alignment to BlockDriverState
 - added bdrv_has_discard_zeroes() and bdrv_has_discard_write_zeroes()
 - added logic to bdrv_co_discard and bdrv_co_do_write_zeroes to honour
   limit and alignment info.
 - added support for -S 0 in qemu-img convert.

Peter Lieven (17):
  block: make BdrvRequestFlags public
  block: add flags to bdrv_*_write_zeroes
  block: introduce BDRV_REQ_MAY_UNMAP request flag
  block: introduce bdrv_has_discard_zeroes and
bdrv_has_discard_write_zeroes
  block/raw: add bdrv_has_discard_zeroes and
bdrv_has_discard_write_zeroes
  block: add BlockLimits structure to BlockDriverState
  block: honour BlockLimits in bdrv_co_do_write_zeroes
  block: honour BlockLimits in bdrv_co_discard
  iscsi: simplify iscsi_co_discard
  iscsi: set limits in BlockDriverState
  iscsi: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes
  iscsi: add bdrv_co_write_zeroes
  block: introduce bdrv_make_zero
  block/get_block_status: fix BDRV_BLOCK_ZERO for unallocated blocks
  qemu-img: add support for fully allocated images
  qemu-img: conditionally zero out target on convert
  block/raw: copy BlockLimits on raw_open

 block-migration.c |3 +-
 block.c   |  199 +
 block/backup.c|3 +-
 block/iscsi.c |  152 --
 block/qcow2-cluster.c |2 +-
 block/qcow2.c |2 +-
 block/qed.c   |3 +-
 block/raw_bsd.c   |   62 --
 block/vmdk.c  |3 +-
 include/block/block.h |   19 -
 include/block/block_int.h |   32 +++-
 qemu-img.c|   18 +++-
 qemu-img.texi |5 ++
 qemu-io-cmds.c|2 +-
 14 files changed, 394 insertions(+), 111 deletions(-)

-- 
1.7.9.5




[Qemu-devel] [PATCHv4 04/17] block: introduce bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes

2013-10-08 Thread Peter Lieven
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block.c   |   29 +
 include/block/block.h |2 ++
 include/block/block_int.h |   13 +
 3 files changed, 44 insertions(+)

diff --git a/block.c b/block.c
index 0675257..6a46bc2 100644
--- a/block.c
+++ b/block.c
@@ -3085,6 +3085,35 @@ int bdrv_has_zero_init(BlockDriverState *bs)
 return 0;
 }
 
+bool bdrv_has_discard_zeroes(BlockDriverState *bs)
+{
+assert(bs-drv);
+
+if (bs-backing_hd) {
+return false;
+}
+if (bs-drv-bdrv_has_discard_zeroes) {
+return bs-drv-bdrv_has_discard_zeroes(bs);
+}
+
+return false;
+}
+
+bool bdrv_has_discard_write_zeroes(BlockDriverState *bs)
+{
+assert(bs-drv);
+
+if (bs-backing_hd || !(bs-open_flags  BDRV_O_UNMAP)) {
+return false;
+}
+
+if (bs-drv-bdrv_has_discard_write_zeroes) {
+return bs-drv-bdrv_has_discard_write_zeroes(bs);
+}
+
+return false;
+}
+
 typedef struct BdrvCoGetBlockStatusData {
 BlockDriverState *bs;
 BlockDriverState *base;
diff --git a/include/block/block.h b/include/block/block.h
index b72c81a..2de226f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -310,6 +310,8 @@ int bdrv_discard(BlockDriverState *bs, int64_t sector_num, 
int nb_sectors);
 int bdrv_co_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
+bool bdrv_has_discard_zeroes(BlockDriverState *bs);
+bool bdrv_has_discard_write_zeroes(BlockDriverState *bs);
 int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
   int nb_sectors, int *pnum);
 int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f95dba7..ce5c62e 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -217,6 +217,19 @@ struct BlockDriver {
  */
 int (*bdrv_has_zero_init)(BlockDriverState *bs);
 
+/*
+ * Returns true if discarded blocks read back as zeroes.
+ */
+bool (*bdrv_has_discard_zeroes)(BlockDriverState *bs);
+
+/*
+ * Returns true if the driver can optimize writing zeroes by discarding
+ * sectors. It is additionally required that the block device is
+ * opened with BDRV_O_UNMAP and the that write zeroes request carries
+ * the BDRV_REQ_MAY_UNMAP flag for this to work.
+ */
+bool (*bdrv_has_discard_write_zeroes)(BlockDriverState *bs);
+
 QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 11/17] iscsi: add bdrv_has_discard_zeroes and bdrv_has_discard_write_zeroes

2013-10-08 Thread Peter Lieven
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block/iscsi.c |   16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 8ed2274..f0ac620 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1449,6 +1449,18 @@ static int iscsi_has_zero_init(BlockDriverState *bs)
 return 0;
 }
 
+static bool iscsi_has_discard_zeroes(BlockDriverState *bs)
+{
+IscsiLun *iscsilun = bs-opaque;
+return !!iscsilun-lbprz;
+}
+
+static bool iscsi_has_discard_write_zeroes(BlockDriverState *bs)
+{
+IscsiLun *iscsilun = bs-opaque;
+return iscsilun-lbprz  iscsilun-lbp.lbpws;
+}
+
 static int iscsi_create(const char *filename, QEMUOptionParameter *options,
 Error **errp)
 {
@@ -1535,7 +1547,9 @@ static BlockDriver bdrv_iscsi = {
 .bdrv_aio_writev = iscsi_aio_writev,
 .bdrv_aio_flush  = iscsi_aio_flush,
 
-.bdrv_has_zero_init = iscsi_has_zero_init,
+.bdrv_has_zero_init= iscsi_has_zero_init,
+.bdrv_has_discard_zeroes   = iscsi_has_discard_zeroes,
+.bdrv_has_discard_write_zeroes = iscsi_has_discard_write_zeroes,
 
 #ifdef __linux__
 .bdrv_ioctl   = iscsi_ioctl,
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 10/17] iscsi: set limits in BlockDriverState

2013-10-08 Thread Peter Lieven
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block/iscsi.c |   14 ++
 1 file changed, 14 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index be70ced..8ed2274 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1367,6 +1367,20 @@ static int iscsi_open(BlockDriverState *bs, QDict 
*options, int flags,
sizeof(struct scsi_inquiry_block_limits));
 scsi_free_scsi_task(task);
 task = NULL;
+
+if (iscsilun-bl.max_unmap  0x) {
+bs-bl.max_discard = sector_lun2qemu(iscsilun-bl.max_unmap,
+ iscsilun);
+}
+bs-bl.discard_alignment = sector_lun2qemu(iscsilun-bl.opt_unmap_gran,
+   iscsilun);
+
+if (iscsilun-bl.max_ws_len  0x) {
+bs-bl.max_write_zeroes = sector_lun2qemu(iscsilun-bl.max_ws_len,
+  iscsilun);
+}
+bs-bl.write_zeroes_alignment = 
sector_lun2qemu(iscsilun-bl.opt_unmap_gran,
+iscsilun);
 }
 
 #if defined(LIBISCSI_FEATURE_NOP_COUNTER)
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 08/17] block: honour BlockLimits in bdrv_co_discard

2013-10-08 Thread Peter Lieven
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block.c |   37 -
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 7551751..43d5f46 100644
--- a/block.c
+++ b/block.c
@@ -4209,6 +4209,11 @@ static void coroutine_fn bdrv_discard_co_entry(void 
*opaque)
 rwco-ret = bdrv_co_discard(rwco-bs, rwco-sector_num, rwco-nb_sectors);
 }
 
+/* if no limit is specified in the BlockLimits use a default
+ * of 32768 512-byte sectors (16 MiB) per request.
+ */
+#define MAX_DISCARD_DEFAULT 32768
+
 int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
  int nb_sectors)
 {
@@ -4230,7 +4235,37 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, 
int64_t sector_num,
 }
 
 if (bs-drv-bdrv_co_discard) {
-return bs-drv-bdrv_co_discard(bs, sector_num, nb_sectors);
+int max_discard = bs-bl.max_discard ?
+  bs-bl.max_discard : MAX_DISCARD_DEFAULT;
+
+while (nb_sectors  0) {
+int ret;
+int num = nb_sectors;
+
+/* align request */
+if (bs-bl.discard_alignment 
+num = bs-bl.discard_alignment 
+sector_num % bs-bl.discard_alignment) {
+if (num  bs-bl.discard_alignment) {
+num = bs-bl.discard_alignment;
+}
+num -= sector_num % bs-bl.discard_alignment;
+}
+
+/* limit request size */
+if (num  max_discard) {
+num = max_discard;
+}
+
+ret = bs-drv-bdrv_co_discard(bs, sector_num, num);
+if (ret) {
+return ret;
+}
+
+sector_num += num;
+nb_sectors -= num;
+}
+return 0;
 } else if (bs-drv-bdrv_aio_discard) {
 BlockDriverAIOCB *acb;
 CoroutineIOCompletion co = {
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 09/17] iscsi: simplify iscsi_co_discard

2013-10-08 Thread Peter Lieven
now that bdrv_co_discard can handle limits we do not need
the request split logic here anymore.

Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block/iscsi.c |   67 +
 1 file changed, 25 insertions(+), 42 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 6152ef1..be70ced 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -87,7 +87,6 @@ typedef struct IscsiAIOCB {
 #define NOP_INTERVAL 5000
 #define MAX_NOP_FAILURES 3
 #define ISCSI_CMD_RETRIES 5
-#define ISCSI_MAX_UNMAP 131072
 
 static void
 iscsi_bh_cb(void *p)
@@ -912,8 +911,6 @@ coroutine_fn iscsi_co_discard(BlockDriverState *bs, int64_t 
sector_num,
 IscsiLun *iscsilun = bs-opaque;
 struct IscsiTask iTask;
 struct unmap_list list;
-uint32_t nb_blocks;
-uint32_t max_unmap;
 
 if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
 return -EINVAL;
@@ -925,52 +922,38 @@ coroutine_fn iscsi_co_discard(BlockDriverState *bs, 
int64_t sector_num,
 }
 
 list.lba = sector_qemu2lun(sector_num, iscsilun);
-nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
+list.num = sector_qemu2lun(nb_sectors, iscsilun);
 
-max_unmap = iscsilun-bl.max_unmap;
-if (max_unmap == 0x) {
-max_unmap = ISCSI_MAX_UNMAP;
-}
-
-while (nb_blocks  0) {
-iscsi_co_init_iscsitask(iscsilun, iTask);
-list.num = nb_blocks;
-if (list.num  max_unmap) {
-list.num = max_unmap;
-}
+iscsi_co_init_iscsitask(iscsilun, iTask);
 retry:
-if (iscsi_unmap_task(iscsilun-iscsi, iscsilun-lun, 0, 0, list, 1,
- iscsi_co_generic_cb, iTask) == NULL) {
-return -EIO;
-}
-
-while (!iTask.complete) {
-iscsi_set_events(iscsilun);
-qemu_coroutine_yield();
-}
+if (iscsi_unmap_task(iscsilun-iscsi, iscsilun-lun, 0, 0, list, 1,
+ iscsi_co_generic_cb, iTask) == NULL) {
+return -EIO;
+}
 
-if (iTask.task != NULL) {
-scsi_free_scsi_task(iTask.task);
-iTask.task = NULL;
-}
+while (!iTask.complete) {
+iscsi_set_events(iscsilun);
+qemu_coroutine_yield();
+}
 
-if (iTask.do_retry) {
-goto retry;
-}
+if (iTask.task != NULL) {
+scsi_free_scsi_task(iTask.task);
+iTask.task = NULL;
+}
 
-if (iTask.status == SCSI_STATUS_CHECK_CONDITION) {
-/* the target might fail with a check condition if it
-   is not happy with the alignment of the UNMAP request
-   we silently fail in this case */
-return 0;
-}
+if (iTask.do_retry) {
+goto retry;
+}
 
-if (iTask.status != SCSI_STATUS_GOOD) {
-return -EIO;
-}
+if (iTask.status == SCSI_STATUS_CHECK_CONDITION) {
+/* the target might fail with a check condition if it
+   is not happy with the alignment of the UNMAP request
+   we silently fail in this case */
+return 0;
+}
 
-list.lba += list.num;
-nb_blocks -= list.num;
+if (iTask.status != SCSI_STATUS_GOOD) {
+return -EIO;
 }
 
 return 0;
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 12/17] iscsi: add bdrv_co_write_zeroes

2013-10-08 Thread Peter Lieven
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: Peter Lieven p...@kamp.de
---
 block/iscsi.c |   59 +
 1 file changed, 59 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index f0ac620..16510c2 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -56,6 +56,7 @@ typedef struct IscsiLun {
 uint8_t lbprz;
 struct scsi_inquiry_logical_block_provisioning lbp;
 struct scsi_inquiry_block_limits bl;
+unsigned char *zeroblock;
 } IscsiLun;
 
 typedef struct IscsiTask {
@@ -959,6 +960,62 @@ retry:
 return 0;
 }
 
+
+static int
+coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
+   int nb_sectors, BdrvRequestFlags flags)
+{
+IscsiLun *iscsilun = bs-opaque;
+struct IscsiTask iTask;
+uint64_t lba;
+uint32_t nb_blocks;
+
+if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
+return -EINVAL;
+}
+
+if (!iscsilun-lbp.lbpws) {
+/* WRITE SAME is not supported by the target */
+return -ENOTSUP;
+}
+
+lba = sector_qemu2lun(sector_num, iscsilun);
+nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
+
+if (iscsilun-zeroblock == NULL) {
+iscsilun-zeroblock = g_malloc0(iscsilun-block_size);
+}
+
+iscsi_co_init_iscsitask(iscsilun, iTask);
+retry:
+if (iscsi_writesame16_task(iscsilun-iscsi, iscsilun-lun, lba,
+   iscsilun-zeroblock, iscsilun-block_size,
+   nb_blocks, 0, !!(flags  BDRV_REQ_MAY_UNMAP),
+   0, 0, iscsi_co_generic_cb, iTask) == NULL) {
+return -EIO;
+}
+
+while (!iTask.complete) {
+iscsi_set_events(iscsilun);
+qemu_coroutine_yield();
+}
+
+if (iTask.task != NULL) {
+scsi_free_scsi_task(iTask.task);
+iTask.task = NULL;
+}
+
+if (iTask.do_retry) {
+goto retry;
+}
+
+if (iTask.status != SCSI_STATUS_GOOD) {
+return -EIO;
+}
+
+return 0;
+}
+
 static int parse_chap(struct iscsi_context *iscsi, const char *target)
 {
 QemuOptsList *list;
@@ -1421,6 +1478,7 @@ static void iscsi_close(BlockDriverState *bs)
 }
 qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), NULL, NULL, NULL);
 iscsi_destroy_context(iscsi);
+g_free(iscsilun-zeroblock);
 memset(iscsilun, 0, sizeof(IscsiLun));
 }
 
@@ -1542,6 +1600,7 @@ static BlockDriver bdrv_iscsi = {
 .bdrv_co_get_block_status = iscsi_co_get_block_status,
 #endif
 .bdrv_co_discard  = iscsi_co_discard,
+.bdrv_co_write_zeroes = iscsi_co_write_zeroes,
 
 .bdrv_aio_readv  = iscsi_aio_readv,
 .bdrv_aio_writev = iscsi_aio_writev,
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 13/17] block: introduce bdrv_make_zero

2013-10-08 Thread Peter Lieven
this patch adds a call to completely zero out a block device.
the operation is sped up by checking the block status and
only writing zeroes to the device if they currently do not
return zeroes. optionally the zero writing can be sped up
by setting the flag BDRV_REQ_MAY_UNMAP to emulate the zero
write by unmapping if the driver supports it.

Signed-off-by: Peter Lieven p...@kamp.de
---
 block.c   |   37 +
 include/block/block.h |1 +
 2 files changed, 38 insertions(+)

diff --git a/block.c b/block.c
index 43d5f46..fc931e3 100644
--- a/block.c
+++ b/block.c
@@ -2382,6 +2382,43 @@ int bdrv_write_zeroes(BlockDriverState *bs, int64_t 
sector_num,
   BDRV_REQ_ZERO_WRITE | flags);
 }
 
+/*
+ * Completely zero out a block device with the help of bdrv_write_zeroes.
+ * The operation is sped up by checking the block status and only writing
+ * zeroes to the device if they currently do not return zeroes. Optional
+ * flags are passed through to bdrv_write_zeroes (e.g. BDRV_REQ_MAY_UNMAP).
+ *
+ * Returns  0 on error, 0 on success. For error codes see bdrv_write().
+ */
+int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags)
+{
+int64_t target_size = bdrv_getlength(bs) / BDRV_SECTOR_SIZE;
+int64_t ret, nb_sectors, sector_num = 0;
+int n;
+
+for (;;) {
+nb_sectors = target_size - sector_num;
+if (nb_sectors = 0) {
+return 0;
+}
+if (nb_sectors  INT_MAX) {
+nb_sectors = INT_MAX;
+}
+ret = bdrv_get_block_status(bs, sector_num, nb_sectors, n);
+if (ret  BDRV_BLOCK_ZERO) {
+sector_num += n;
+continue;
+}
+ret = bdrv_write_zeroes(bs, sector_num, n, flags);
+if (ret  0) {
+error_report(error writing zeroes at sector % PRId64 : %s,
+ sector_num, strerror(-ret));
+return ret;
+}
+sector_num += n;
+}
+}
+
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
void *buf, int count1)
 {
diff --git a/include/block/block.h b/include/block/block.h
index 2de226f..f466953 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -196,6 +196,7 @@ int bdrv_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors);
 int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags);
+int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags);
 int bdrv_writev(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *qiov);
 int bdrv_pread(BlockDriverState *bs, int64_t offset,
void *buf, int count);
-- 
1.7.9.5




Re: [Qemu-devel] [PATCHv5] block/get_block_status: avoid redundant callouts on raw devices

2013-10-08 Thread Peter Lieven

On 07.10.2013 10:25, Paolo Bonzini wrote:

Il 07/10/2013 07:59, Peter Lieven ha scritto:

if a raw device like an iscsi target or host device is used
the current implementation makes a second call out to get
the block status of bs-file.

Signed-off-by: Peter Lieven p...@kamp.de
---
v5: add a generic get_lba_status function in the raw driver which
 adds the BDRV_BLOCK_RAW flag. bdrv_co_get_block_status will
 handle the callout to bs-file then.

v4: use a flag to detect the raw driver instead of the strncmp
 hack

  block.c   |4 
  block/raw_bsd.c   |3 ++-
  include/block/block.h |4 
  3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 93e113a..38a589e 100644
--- a/block.c
+++ b/block.c
@@ -3147,6 +3147,10 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status(BlockDriverState *bs,
  return ret;
  }
  
+if (ret  BDRV_BLOCK_RAW) {

+return bdrv_get_block_status(bs-file, sector_num, nb_sectors, pnum);

Strictly speaking, this should probably do something like this:

   assert(ret  BDRV_BLOCK_OFFSET_VALID);
   return bdrv_get_block_status(bs-file, ret  BDRV_SECTOR_BITS,
nb_sectors, pnum);

shouldn't the last line be:

  return bdrv_get_block_status(bs-file, ret  BDRV_SECTOR_BITS,
   *pnum, pnum);


This would of course require *pnum = nb_sectors in raw_co_get_block_status
?





[Qemu-devel] [PATCH v3 01/17] qapi-types/visit.py: Pass whole expr dict for structs

2013-10-08 Thread Kevin Wolf
Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 scripts/qapi-types.py | 11 ---
 scripts/qapi-visit.py |  8 ++--
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index 5222463..566fe5e 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -71,7 +71,7 @@ def generate_struct_fields(members):
  c_name=c_var(argname))
 if structured:
 push_indent()
-ret += generate_struct(, argname, argentry)
+ret += generate_struct({ field: argname, data: argentry})
 pop_indent()
 else:
 ret += mcgen('''
@@ -81,7 +81,12 @@ def generate_struct_fields(members):
 
 return ret
 
-def generate_struct(structname, fieldname, members):
+def generate_struct(expr):
+
+structname = expr.get('type', )
+fieldname = expr.get('field', )
+members = expr['data']
+
 ret = mcgen('''
 struct %(name)s
 {
@@ -417,7 +422,7 @@ if do_builtins:
 for expr in exprs:
 ret = \n
 if expr.has_key('type'):
-ret += generate_struct(expr['type'], , expr['data']) + \n
+ret += generate_struct(expr) + \n
 ret += generate_type_cleanup_decl(expr['type'] + List)
 fdef.write(generate_type_cleanup(expr['type'] + List) + \n)
 ret += generate_type_cleanup_decl(expr['type'])
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index 597cca4..1e44004 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -120,7 +120,11 @@ if (!err) {
 ''')
 return ret
 
-def generate_visit_struct(name, members):
+def generate_visit_struct(expr):
+
+name = expr['type']
+members = expr['data']
+
 ret = generate_visit_struct_fields(name, , , members)
 
 ret += mcgen('''
@@ -472,7 +476,7 @@ if do_builtins:
 
 for expr in exprs:
 if expr.has_key('type'):
-ret = generate_visit_struct(expr['type'], expr['data'])
+ret = generate_visit_struct(expr)
 ret += generate_visit_list(expr['type'], expr['data'])
 fdef.write(ret)
 
-- 
1.8.1.4




[Qemu-devel] [PATCH v3 04/17] blockdev: 'blockdev-add' QMP command

2013-10-08 Thread Kevin Wolf
For examples see the changes to qmp-commands.hx.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 blockdev.c   |  57 ++
 qapi-schema.json | 236 +++
 qmp-commands.hx  |  55 +
 3 files changed, 348 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 29a5b70..157c076 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -38,6 +38,8 @@
 #include qemu/option.h
 #include qemu/config-file.h
 #include qapi/qmp/types.h
+#include qapi-visit.h
+#include qapi/qmp-output-visitor.h
 #include sysemu/sysemu.h
 #include block/block_int.h
 #include qmp-commands.h
@@ -2066,6 +2068,61 @@ void qmp_block_job_complete(const char *device, Error 
**errp)
 block_job_complete(job, errp);
 }
 
+void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
+{
+QmpOutputVisitor *ov = qmp_output_visitor_new();
+QObject *obj;
+QDict *qdict;
+DriveInfo *dinfo;
+Error *local_err = NULL;
+
+/* Require an ID in the top level */
+if (!options-has_id) {
+error_setg(errp, Block device needs an ID);
+goto fail;
+}
+
+/* TODO Sort it out in raw-posix and drive_init: Reject aio=native with
+ * cache.direct=false instead of silently switching to aio=threads, except
+ * if called from drive_init.
+ *
+ * For now, simply forbidding the combination for all drivers will do. */
+if (options-has_aio  options-aio == BLOCKDEV_AIO_OPTIONS_NATIVE) {
+bool direct = options-cache-has_direct  options-cache-direct;
+if (!options-has_cache  !direct) {
+error_setg(errp, aio=native requires cache.direct=true);
+goto fail;
+}
+}
+
+visit_type_BlockdevOptions(qmp_output_get_visitor(ov),
+   options, NULL, local_err);
+if (error_is_set(local_err)) {
+error_propagate(errp, local_err);
+goto fail;
+}
+
+obj = qmp_output_get_qobject(ov);
+qdict = qobject_to_qdict(obj);
+
+qdict_flatten(qdict);
+
+QemuOpts *opts = qemu_opts_from_qdict(qemu_drive_opts, qdict, local_err);
+if (error_is_set(local_err)) {
+error_propagate(errp, local_err);
+goto fail;
+}
+
+dinfo = blockdev_init(opts, IF_NONE);
+if (!dinfo) {
+error_setg(errp, Could not open image);
+goto fail;
+}
+
+fail:
+qmp_output_visitor_cleanup(ov);
+}
+
 static void do_qmp_query_block_jobs_one(void *opaque, BlockDriverState *bs)
 {
 BlockJobInfoList **prev = opaque;
diff --git a/qapi-schema.json b/qapi-schema.json
index 145eca8..71e2b2e 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3902,3 +3902,239 @@
 ##
 { 'command': 'query-rx-filter', 'data': { '*name': 'str' },
   'returns': ['RxFilterInfo'] }
+
+
+##
+# @BlockdevDiscardOptions
+#
+# Determines how to handle discard requests.
+#
+# @ignore:  Ignore the request
+# @unmap:   Forward as an unmap request
+#
+# Since: 1.7
+##
+{ 'enum': 'BlockdevDiscardOptions',
+  'data': [ 'ignore', 'unmap' ] }
+
+##
+# @BlockdevAioOptions
+#
+# Selects the AIO backend to handle I/O requests
+#
+# @threads: Use qemu's thread pool
+# @native:  Use native AIO backend (only Linux and Windows)
+#
+# Since: 1.7
+##
+{ 'enum': 'BlockdevAioOptions',
+  'data': [ 'threads', 'native' ] }
+
+##
+# @BlockdevCacheOptions
+#
+# Includes cache-related options for block devices
+#
+# @writeback:   #optional enables writeback mode for any caches (default: true)
+# @direct:  #optional enables use of O_DIRECT (bypass the host page cache;
+#   default: false)
+# @no-flush:#optional ignore any flush requests for the device (default:
+#   false)
+#
+# Since: 1.7
+##
+{ 'type': 'BlockdevCacheOptions',
+  'data': { '*writeback': 'bool',
+'*direct': 'bool',
+'*no-flush': 'bool' } }
+
+##
+# @BlockdevOptionsBase
+#
+# Options that are available for all block devices, independent of the block
+# driver.
+#
+# @driver:  block driver name
+# @id:  #optional id by which the new block device can be referred to.
+#   This is a required option on the top level of blockdev-add, and
+#   currently not allowed on any other level.
+# @discard: #optional discard-related options (default: ignore)
+# @cache:   #optional cache-related options
+# @aio: #optional AIO backend (default: threads)
+# @rerror:  #optional how to handle read errors on the device
+#   (default: report)
+# @werror:  #optional how to handle write errors on the device
+#   (default: enospc)
+# @read-only:   #optional whether the block device should be read-only
+#   (default: false)
+#
+# Since: 1.7
+##
+{ 'type': 'BlockdevOptionsBase',
+  'data': { 'driver': 'str',
+'*id': 'str',
+'*discard': 'BlockdevDiscardOptions',
+'*cache': 'BlockdevCacheOptions',
+'*aio': 'BlockdevAioOptions',
+

[Qemu-devel] [PATCH v3 05/17] blockdev: Separate ID generation from DriveInfo creation

2013-10-08 Thread Kevin Wolf
blockdev-add shouldn't automatically generate IDs, but will keep most of
the DriveInfo creation code.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Wenchao Xia xiaw...@linux.vnet.ibm.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 blockdev.c| 32 +---
 include/qemu/option.h |  1 +
 util/qemu-option.c|  6 ++
 3 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 157c076..e7e6bdd 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -604,23 +604,25 @@ static DriveInfo *blockdev_init(QemuOpts *all_opts,
 return NULL;
 }
 
-/* init */
-
-dinfo = g_malloc0(sizeof(*dinfo));
-if ((buf = qemu_opts_id(opts)) != NULL) {
-dinfo-id = g_strdup(buf);
-} else {
-/* no id supplied - create one */
-dinfo-id = g_malloc0(32);
-if (type == IF_IDE || type == IF_SCSI)
+/* no id supplied - create one */
+if (qemu_opts_id(opts) == NULL) {
+char *new_id;
+if (type == IF_IDE || type == IF_SCSI) {
 mediastr = (media == MEDIA_CDROM) ? -cd : -hd;
-if (max_devs)
-snprintf(dinfo-id, 32, %s%i%s%i,
- if_name[type], bus_id, mediastr, unit_id);
-else
-snprintf(dinfo-id, 32, %s%s%i,
- if_name[type], mediastr, unit_id);
+}
+if (max_devs) {
+new_id = g_strdup_printf(%s%i%s%i, if_name[type], bus_id,
+ mediastr, unit_id);
+} else {
+new_id = g_strdup_printf(%s%s%i, if_name[type],
+ mediastr, unit_id);
+}
+qemu_opts_set_id(opts, new_id);
 }
+
+/* init */
+dinfo = g_malloc0(sizeof(*dinfo));
+dinfo-id = g_strdup(qemu_opts_id(opts));
 dinfo-bdrv = bdrv_new(dinfo-id);
 dinfo-bdrv-open_flags = snapshot ? BDRV_O_SNAPSHOT : 0;
 dinfo-bdrv-read_only = ro;
diff --git a/include/qemu/option.h b/include/qemu/option.h
index 63db4cc..5c0c6dd 100644
--- a/include/qemu/option.h
+++ b/include/qemu/option.h
@@ -142,6 +142,7 @@ void qemu_opts_loc_restore(QemuOpts *opts);
 int qemu_opts_set(QemuOptsList *list, const char *id,
   const char *name, const char *value);
 const char *qemu_opts_id(QemuOpts *opts);
+void qemu_opts_set_id(QemuOpts *opts, char *id);
 void qemu_opts_del(QemuOpts *opts);
 void qemu_opts_validate(QemuOpts *opts, const QemuOptDesc *desc, Error **errp);
 int qemu_opts_do_parse(QemuOpts *opts, const char *params, const char 
*firstname);
diff --git a/util/qemu-option.c b/util/qemu-option.c
index e0844a9..efcb5dc 100644
--- a/util/qemu-option.c
+++ b/util/qemu-option.c
@@ -834,6 +834,12 @@ const char *qemu_opts_id(QemuOpts *opts)
 return opts-id;
 }
 
+/* The id string will be g_free()d by qemu_opts_del */
+void qemu_opts_set_id(QemuOpts *opts, char *id)
+{
+opts-id = id;
+}
+
 void qemu_opts_del(QemuOpts *opts)
 {
 QemuOpt *opt;
-- 
1.8.1.4




[Qemu-devel] [PATCH v3 02/17] qapi-types/visit.py: Inheritance for structs

2013-10-08 Thread Kevin Wolf
This introduces a new 'base' key for struct definitions that refers to
another struct type. On the JSON level, the fields of the base type are
included directly into the same namespace as the fields of the defined
type, like with unions. On the C level, a pointer to a struct of the
base type is included.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 docs/qapi-code-gen.txt | 17 +
 scripts/qapi-types.py  |  4 
 scripts/qapi-visit.py  | 18 --
 3 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
index 0ce045c..91f44d0 100644
--- a/docs/qapi-code-gen.txt
+++ b/docs/qapi-code-gen.txt
@@ -53,6 +53,23 @@ The use of '*' as a prefix to the name means the member is 
optional.  Optional
 members should always be added to the end of the dictionary to preserve
 backwards compatibility.
 
+
+A complex type definition can specify another complex type as its base.
+In this case, the fields of the base type are included as top-level fields
+of the new complex type's dictionary in the QMP wire format. An example
+definition is:
+
+ { 'type': 'BlockdevOptionsGenericFormat', 'data': { 'file': 'str' } }
+ { 'type': 'BlockdevOptionsGenericCOWFormat',
+   'base': 'BlockdevOptionsGenericFormat',
+   'data': { '*backing': 'str' } }
+
+An example BlockdevOptionsGenericCOWFormat object on the wire could use
+both fields like this:
+
+ { file: /some/place/my-image,
+   backing: /some/place/my-backing-file }
+
 === Enumeration types ===
 
 An enumeration type is a dictionary containing a single key whose value is a
diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index 566fe5e..4a1652b 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -86,6 +86,7 @@ def generate_struct(expr):
 structname = expr.get('type', )
 fieldname = expr.get('field', )
 members = expr['data']
+base = expr.get('base')
 
 ret = mcgen('''
 struct %(name)s
@@ -93,6 +94,9 @@ struct %(name)s
 ''',
   name=structname)
 
+if base:
+ret += generate_struct_fields({'base': base})
+
 ret += generate_struct_fields(members)
 
 if len(fieldname):
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index 1e44004..c39e628 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -17,7 +17,7 @@ import os
 import getopt
 import errno
 
-def generate_visit_struct_fields(name, field_prefix, fn_prefix, members):
+def generate_visit_struct_fields(name, field_prefix, fn_prefix, members, base 
= None):
 substructs = []
 ret = ''
 full_name = name if not fn_prefix else %s_%s % (name, fn_prefix)
@@ -42,6 +42,19 @@ static void visit_type_%(full_name)s_fields(Visitor *m, 
%(name)s ** obj, Error *
 name=name, full_name=full_name)
 push_indent()
 
+if base:
+ret += mcgen('''
+visit_start_implicit_struct(m, obj ? (void**) (*obj)-%(c_name)s : NULL, 
sizeof(%(type)s), err);
+if (!err) {
+visit_type_%(type)s_fields(m, obj ? (*obj)-%(c_prefix)s%(c_name)s : 
NULL, err);
+error_propagate(errp, err);
+err = NULL;
+visit_end_implicit_struct(m, err);
+}
+''',
+ c_prefix=c_var(field_prefix),
+ type=type_name(base), c_name=c_var('base'))
+
 for argname, argentry, optional, structured in parse_args(members):
 if optional:
 ret += mcgen('''
@@ -124,8 +137,9 @@ def generate_visit_struct(expr):
 
 name = expr['type']
 members = expr['data']
+base = expr.get('base')
 
-ret = generate_visit_struct_fields(name, , , members)
+ret = generate_visit_struct_fields(name, , , members, base)
 
 ret += mcgen('''
 
-- 
1.8.1.4




[Qemu-devel] [PATCH v3 12/17] blockdev: Move virtio-blk device creation to drive_init

2013-10-08 Thread Kevin Wolf
Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 blockdev.c | 54 +++---
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 1eaefb0..1c05c7a 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -318,7 +318,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 int ro = 0;
 int bdrv_flags = 0;
 int on_read_error, on_write_error;
-const char *devaddr;
 DriveInfo *dinfo;
 ThrottleConfig cfg;
 int snapshot = 0;
@@ -472,20 +471,12 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 }
 }
 
-if ((devaddr = qemu_opt_get(opts, addr)) != NULL) {
-if (type != IF_VIRTIO) {
-error_report(addr is not supported by this bus type);
-return NULL;
-}
-}
-
 /* init */
 dinfo = g_malloc0(sizeof(*dinfo));
 dinfo-id = g_strdup(qemu_opts_id(opts));
 dinfo-bdrv = bdrv_new(dinfo-id);
 dinfo-bdrv-open_flags = snapshot ? BDRV_O_SNAPSHOT : 0;
 dinfo-bdrv-read_only = ro;
-dinfo-devaddr = devaddr;
 dinfo-type = type;
 dinfo-refcount = 1;
 if (serial != NULL) {
@@ -512,22 +503,8 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 case IF_FLOPPY:
 case IF_PFLASH:
 case IF_MTD:
-break;
 case IF_VIRTIO:
-{
-/* add virtio block device */
-QemuOpts *devopts;
-devopts = qemu_opts_create_nofail(qemu_find_opts(device));
-if (arch_type == QEMU_ARCH_S390X) {
-qemu_opt_set(devopts, driver, virtio-blk-s390);
-} else {
-qemu_opt_set(devopts, driver, virtio-blk-pci);
-}
-qemu_opt_set(devopts, drive, dinfo-id);
-if (devaddr)
-qemu_opt_set(devopts, addr, devaddr);
 break;
-}
 default:
 abort();
 }
@@ -651,6 +628,10 @@ QemuOptsList qemu_legacy_drive_opts = {
 .name = boot,
 .type = QEMU_OPT_BOOL,
 .help = (deprecated, ignored),
+},{
+.name = addr,
+.type = QEMU_OPT_STRING,
+.help = pci address (virtio only),
 },
 { /* end of list */ }
 },
@@ -666,6 +647,7 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 BlockInterfaceType type;
 int cyls, heads, secs, translation;
 int max_devs, bus_id, unit_id, index;
+const char *devaddr;
 Error *local_err = NULL;
 
 /* Change legacy command line options into QMP ones */
@@ -866,6 +848,27 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 g_free(new_id);
 }
 
+/* Add virtio block device */
+devaddr = qemu_opt_get(legacy_opts, addr);
+if (devaddr  type != IF_VIRTIO) {
+error_report(addr is not supported by this bus type);
+goto fail;
+}
+
+if (type == IF_VIRTIO) {
+QemuOpts *devopts;
+devopts = qemu_opts_create_nofail(qemu_find_opts(device));
+if (arch_type == QEMU_ARCH_S390X) {
+qemu_opt_set(devopts, driver, virtio-blk-s390);
+} else {
+qemu_opt_set(devopts, driver, virtio-blk-pci);
+}
+qemu_opt_set(devopts, drive, qdict_get_str(bs_opts, id));
+if (devaddr) {
+qemu_opt_set(devopts, addr, devaddr);
+}
+}
+
 /* Actual block device init: Functionality shared with blockdev-add */
 dinfo = blockdev_init(bs_opts, type, media);
 if (dinfo == NULL) {
@@ -883,6 +886,7 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 
 dinfo-bus = bus_id;
 dinfo-unit = unit_id;
+dinfo-devaddr = devaddr;
 
 fail:
 qemu_opts_del(legacy_opts);
@@ -2259,10 +2263,6 @@ QemuOptsList qemu_common_drive_opts = {
 .type = QEMU_OPT_STRING,
 .help = write error action,
 },{
-.name = addr,
-.type = QEMU_OPT_STRING,
-.help = pci address (virtio only),
-},{
 .name = read-only,
 .type = QEMU_OPT_BOOL,
 .help = open drive file as read-only,
-- 
1.8.1.4




[Qemu-devel] [PATCH v3 07/17] blockdev: Move parsing of 'media' option to drive_init

2013-10-08 Thread Kevin Wolf
This moves as much as possible of the processing of the 'media' option
to drive_init so that it can only be accessed using legacy functions,
but never with anything blockdev-add related.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Benoit Canet ben...@irqsave.net
Reviewed-by: Eric Blake ebl...@redhat.com
---
 blockdev.c | 73 ++
 1 file changed, 50 insertions(+), 23 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index cbe907b..0eaaffa 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -305,16 +305,18 @@ static bool check_throttle_config(ThrottleConfig *cfg, 
Error **errp)
 return true;
 }
 
+typedef enum { MEDIA_DISK, MEDIA_CDROM } DriveMediaType;
+
 /* Takes the ownership of bs_opts */
 static DriveInfo *blockdev_init(QDict *bs_opts,
-BlockInterfaceType block_default_type)
+BlockInterfaceType block_default_type,
+DriveMediaType media)
 {
 const char *buf;
 const char *file = NULL;
 const char *serial;
 const char *mediastr = ;
 BlockInterfaceType type;
-enum { MEDIA_DISK, MEDIA_CDROM } media;
 int bus_id, unit_id;
 int cyls, heads, secs, translation;
 int max_devs;
@@ -335,7 +337,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 BlockDriver *drv = NULL;
 
 translation = BIOS_ATA_TRANSLATION_AUTO;
-media = MEDIA_DISK;
 
 /* Check common options by copying from bs_opts to opts, all other options
  * stay in bs_opts for processing by bdrv_open(). */
@@ -422,19 +423,11 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
}
 }
 
-if ((buf = qemu_opt_get(opts, media)) != NULL) {
-if (!strcmp(buf, disk)) {
-   media = MEDIA_DISK;
-   } else if (!strcmp(buf, cdrom)) {
-if (cyls || secs || heads) {
-error_report(CHS can't be set with media=%s, buf);
-   return NULL;
-}
-   media = MEDIA_CDROM;
-   } else {
-   error_report('%s' invalid media, buf);
-   return NULL;
-   }
+if (media == MEDIA_CDROM) {
+if (cyls || secs || heads) {
+error_report(CHS can't be set with media=cdrom);
+return NULL;
+}
 }
 
 if ((buf = qemu_opt_get(opts, discard)) != NULL) {
@@ -755,11 +748,27 @@ static void qemu_opt_rename(QemuOpts *opts, const char 
*from, const char *to)
 }
 }
 
+QemuOptsList qemu_legacy_drive_opts = {
+.name = drive,
+.head = QTAILQ_HEAD_INITIALIZER(qemu_legacy_drive_opts.head),
+.desc = {
+{
+.name = media,
+.type = QEMU_OPT_STRING,
+.help = media type (disk, cdrom),
+},
+{ /* end of list */ }
+},
+};
+
 DriveInfo *drive_init(QemuOpts *all_opts, BlockInterfaceType 
block_default_type)
 {
 const char *value;
-DriveInfo *dinfo;
+DriveInfo *dinfo = NULL;
 QDict *bs_opts;
+QemuOpts *legacy_opts;
+DriveMediaType media = MEDIA_DISK;
+Error *local_err = NULL;
 
 /* Change legacy command line options into QMP ones */
 qemu_opt_rename(all_opts, iops, throttling.iops-total);
@@ -812,8 +821,29 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 bs_opts = qdict_new();
 qemu_opts_to_qdict(all_opts, bs_opts);
 
+legacy_opts = qemu_opts_create_nofail(qemu_legacy_drive_opts);
+qemu_opts_absorb_qdict(legacy_opts, bs_opts, local_err);
+if (error_is_set(local_err)) {
+qerror_report_err(local_err);
+error_free(local_err);
+goto fail;
+}
+
+/* Media type */
+value = qemu_opt_get(legacy_opts, media);
+if (value) {
+if (!strcmp(value, disk)) {
+media = MEDIA_DISK;
+} else if (!strcmp(value, cdrom)) {
+media = MEDIA_CDROM;
+} else {
+error_report('%s' invalid media, value);
+goto fail;
+}
+}
+
 /* Actual block device init: Functionality shared with blockdev-add */
-dinfo = blockdev_init(bs_opts, block_default_type);
+dinfo = blockdev_init(bs_opts, block_default_type, media);
 if (dinfo == NULL) {
 goto fail;
 }
@@ -823,6 +853,7 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 dinfo-opts = all_opts;
 
 fail:
+qemu_opts_del(legacy_opts);
 return dinfo;
 }
 
@@ -2115,7 +2146,7 @@ void qmp_blockdev_add(BlockdevOptions *options, Error 
**errp)
 
 qdict_flatten(qdict);
 
-dinfo = blockdev_init(qdict, IF_NONE);
+dinfo = blockdev_init(qdict, IF_NONE, MEDIA_DISK);
 if (!dinfo) {
 error_setg(errp, Could not open image);
 goto fail;
@@ -2184,10 +2215,6 @@ QemuOptsList qemu_common_drive_opts = {
 .type = QEMU_OPT_STRING,
 .help = chs translation (auto, lba. none),
 },{
-.name = media,
-.type = QEMU_OPT_STRING,

[Qemu-devel] [PATCH v3 09/17] blockdev: Moving parsing of geometry options to drive_init

2013-10-08 Thread Kevin Wolf
This moves all of the geometry options (cyls/heads/secs/trans) to
drive_init so that they can only be accessed using legacy functions, but
never with anything blockdev-add related.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 blockdev.c | 136 +++--
 1 file changed, 69 insertions(+), 67 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 58c3ab9..b1c7590 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -317,7 +317,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 const char *serial;
 const char *mediastr = ;
 int bus_id, unit_id;
-int cyls, heads, secs, translation;
 int max_devs;
 int index;
 int ro = 0;
@@ -335,8 +334,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 bool has_driver_specific_opts;
 BlockDriver *drv = NULL;
 
-translation = BIOS_ATA_TRANSLATION_AUTO;
-
 /* Check common options by copying from bs_opts to opts, all other options
  * stay in bs_opts for processing by bdrv_open(). */
 id = qdict_get_try_str(bs_opts, id);
@@ -365,10 +362,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 unit_id = qemu_opt_get_number(opts, unit, -1);
 index   = qemu_opt_get_number(opts, index, -1);
 
-cyls  = qemu_opt_get_number(opts, cyls, 0);
-heads = qemu_opt_get_number(opts, heads, 0);
-secs  = qemu_opt_get_number(opts, secs, 0);
-
 snapshot = qemu_opt_get_bool(opts, snapshot, 0);
 ro = qemu_opt_get_bool(opts, read-only, 0);
 copy_on_read = qemu_opt_get_bool(opts, copy-on-read, false);
@@ -378,46 +371,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 
 max_devs = if_max_devs[type];
 
-if (cyls || heads || secs) {
-if (cyls  1) {
-error_report(invalid physical cyls number);
-   return NULL;
-   }
-if (heads  1) {
-error_report(invalid physical heads number);
-   return NULL;
-   }
-if (secs  1) {
-error_report(invalid physical secs number);
-   return NULL;
-   }
-}
-
-if ((buf = qemu_opt_get(opts, trans)) != NULL) {
-if (!cyls) {
-error_report('%s' trans must be used with cyls, heads and secs,
- buf);
-return NULL;
-}
-if (!strcmp(buf, none))
-translation = BIOS_ATA_TRANSLATION_NONE;
-else if (!strcmp(buf, lba))
-translation = BIOS_ATA_TRANSLATION_LBA;
-else if (!strcmp(buf, auto))
-translation = BIOS_ATA_TRANSLATION_AUTO;
-   else {
-error_report('%s' invalid translation type, buf);
-   return NULL;
-   }
-}
-
-if (media == MEDIA_CDROM) {
-if (cyls || secs || heads) {
-error_report(CHS can't be set with media=cdrom);
-return NULL;
-}
-}
-
 if ((buf = qemu_opt_get(opts, discard)) != NULL) {
 if (bdrv_parse_discard_flags(buf, bdrv_flags) != 0) {
 error_report(invalid discard option);
@@ -612,10 +565,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 dinfo-type = type;
 dinfo-bus = bus_id;
 dinfo-unit = unit_id;
-dinfo-cyls = cyls;
-dinfo-heads = heads;
-dinfo-secs = secs;
-dinfo-trans = translation;
 dinfo-refcount = 1;
 if (serial != NULL) {
 dinfo-serial = g_strdup(serial);
@@ -748,6 +697,22 @@ QemuOptsList qemu_legacy_drive_opts = {
 .name = if,
 .type = QEMU_OPT_STRING,
 .help = interface (ide, scsi, sd, mtd, floppy, pflash, virtio),
+},{
+.name = cyls,
+.type = QEMU_OPT_NUMBER,
+.help = number of cylinders (ide disk geometry),
+},{
+.name = heads,
+.type = QEMU_OPT_NUMBER,
+.help = number of heads (ide disk geometry),
+},{
+.name = secs,
+.type = QEMU_OPT_NUMBER,
+.help = number of sectors (ide disk geometry),
+},{
+.name = trans,
+.type = QEMU_OPT_STRING,
+.help = chs translation (auto, lba, none),
 },
 { /* end of list */ }
 },
@@ -761,6 +726,7 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 QemuOpts *legacy_opts;
 DriveMediaType media = MEDIA_DISK;
 BlockInterfaceType type;
+int cyls, heads, secs, translation;
 Error *local_err = NULL;
 
 /* Change legacy command line options into QMP ones */
@@ -850,6 +816,53 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 type = block_default_type;
 }
 
+/* Geometry */
+cyls  = qemu_opt_get_number(legacy_opts, cyls, 0);
+heads = qemu_opt_get_number(legacy_opts, heads, 0);
+secs  = qemu_opt_get_number(legacy_opts, secs, 0);
+
+if (cyls || heads || secs) {
+if (cyls  1) {
+error_report(invalid physical cyls 

[Qemu-devel] [PATCH v3 15/17] blockdev: Remove 'media' parameter from blockdev_init()

2013-10-08 Thread Kevin Wolf
The remaining users shouldn't be there with blockdev-add and are easy to
move to drive_init().

Bonus bug fix: As a side effect, CD-ROM drives can now use block drivers
on the read-only whitelist without explicitly specifying read-only=on,
even if a format is explicitly specified.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 blockdev.c | 40 +++-
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 1aedef8..b39f2e7 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -309,8 +309,7 @@ typedef enum { MEDIA_DISK, MEDIA_CDROM } DriveMediaType;
 
 /* Takes the ownership of bs_opts */
 static DriveInfo *blockdev_init(QDict *bs_opts,
-BlockInterfaceType type,
-DriveMediaType media)
+BlockInterfaceType type)
 {
 const char *buf;
 const char *file = NULL;
@@ -492,22 +491,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 bdrv_set_io_limits(dinfo-bdrv, cfg);
 }
 
-switch(type) {
-case IF_IDE:
-case IF_SCSI:
-case IF_XEN:
-case IF_NONE:
-dinfo-media_cd = media == MEDIA_CDROM;
-break;
-case IF_SD:
-case IF_FLOPPY:
-case IF_PFLASH:
-case IF_MTD:
-case IF_VIRTIO:
-break;
-default:
-abort();
-}
 if (!file || !*file) {
 if (has_driver_specific_opts) {
 file = NULL;
@@ -529,11 +512,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 bdrv_flags |= BDRV_O_INCOMING;
 }
 
-if (media == MEDIA_CDROM) {
-/* CDROM is fine for any interface, don't check.  */
-ro = 1;
-}
-
 bdrv_flags |= ro ? 0 : BDRV_O_RDWR;
 
 if (ro  copy_on_read) {
@@ -717,6 +695,7 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 media = MEDIA_DISK;
 } else if (!strcmp(value, cdrom)) {
 media = MEDIA_CDROM;
+qdict_put(bs_opts, read-only, qstring_from_str(on));
 } else {
 error_report('%s' invalid media, value);
 goto fail;
@@ -864,7 +843,7 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 }
 
 /* Actual block device init: Functionality shared with blockdev-add */
-dinfo = blockdev_init(bs_opts, type, media);
+dinfo = blockdev_init(bs_opts, type);
 if (dinfo == NULL) {
 goto fail;
 }
@@ -882,6 +861,17 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 dinfo-unit = unit_id;
 dinfo-devaddr = devaddr;
 
+switch(type) {
+case IF_IDE:
+case IF_SCSI:
+case IF_XEN:
+case IF_NONE:
+dinfo-media_cd = media == MEDIA_CDROM;
+break;
+default:
+break;
+}
+
 fail:
 qemu_opts_del(legacy_opts);
 return dinfo;
@@ -2176,7 +2166,7 @@ void qmp_blockdev_add(BlockdevOptions *options, Error 
**errp)
 
 qdict_flatten(qdict);
 
-dinfo = blockdev_init(qdict, IF_NONE, MEDIA_DISK);
+dinfo = blockdev_init(qdict, IF_NONE);
 if (!dinfo) {
 error_setg(errp, Could not open image);
 goto fail;
-- 
1.8.1.4




[Qemu-devel] [PATCH v3 03/17] blockdev: Introduce DriveInfo.enable_auto_del

2013-10-08 Thread Kevin Wolf
BlockDriverStates shouldn't be affected by an unplugged guest device,
except if created with the legacy -drive command line option or the
drive_add HMP command.

Make the automatic deletion as well as cancelling of jobs conditional on
an enable_auto_del boolean that is only set in drive_init().

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
Reviewed-by: Wenchao Xia xiaw...@linux.vnet.ibm.com
---
 blockdev.c| 17 -
 include/sysemu/blockdev.h |  1 +
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 8aa66a9..29a5b70 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -89,6 +89,10 @@ void blockdev_mark_auto_del(BlockDriverState *bs)
 {
 DriveInfo *dinfo = drive_get_by_blockdev(bs);
 
+if (dinfo  !dinfo-enable_auto_del) {
+return;
+}
+
 if (bs-job) {
 block_job_cancel(bs-job);
 }
@@ -750,6 +754,7 @@ static void qemu_opt_rename(QemuOpts *opts, const char 
*from, const char *to)
 DriveInfo *drive_init(QemuOpts *all_opts, BlockInterfaceType 
block_default_type)
 {
 const char *value;
+DriveInfo *dinfo;
 
 /* Change legacy command line options into QMP ones */
 qemu_opt_rename(all_opts, iops, throttling.iops-total);
@@ -798,7 +803,17 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 qemu_opt_unset(all_opts, cache);
 }
 
-return blockdev_init(all_opts, block_default_type);
+/* Actual block device init: Functionality shared with blockdev-add */
+dinfo = blockdev_init(all_opts, block_default_type);
+if (dinfo == NULL) {
+goto fail;
+}
+
+/* Set legacy DriveInfo fields */
+dinfo-enable_auto_del = true;
+
+fail:
+return dinfo;
 }
 
 void do_commit(Monitor *mon, const QDict *qdict)
diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
index 804ec88..1082091 100644
--- a/include/sysemu/blockdev.h
+++ b/include/sysemu/blockdev.h
@@ -37,6 +37,7 @@ struct DriveInfo {
 int bus;
 int unit;
 int auto_del;   /* see blockdev_mark_auto_del() */
+bool enable_auto_del; /* Only for legacy drive_init() */
 int media_cd;
 int cyls, heads, secs, trans;
 QemuOpts *opts;
-- 
1.8.1.4




[Qemu-devel] [PATCH v3 10/17] blockdev: Move parsing of 'boot' option to drive_init

2013-10-08 Thread Kevin Wolf
It's already ignored and only prints a deprecation message. No use in
making it available in new interfaces.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 blockdev.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index b1c7590..85e412a 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -456,12 +456,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 return NULL;
 }
 
-if (qemu_opt_get(opts, boot) != NULL) {
-fprintf(stderr, qemu-kvm: boot=on|off is deprecated and will be 
-ignored. Future versions will reject this parameter. Please 
-update your scripts.\n);
-}
-
 on_write_error = BLOCKDEV_ON_ERROR_ENOSPC;
 if ((buf = qemu_opt_get(opts, werror)) != NULL) {
 if (type != IF_IDE  type != IF_SCSI  type != IF_VIRTIO  type != 
IF_NONE) {
@@ -713,6 +707,10 @@ QemuOptsList qemu_legacy_drive_opts = {
 .name = trans,
 .type = QEMU_OPT_STRING,
 .help = chs translation (auto, lba, none),
+},{
+.name = boot,
+.type = QEMU_OPT_BOOL,
+.help = (deprecated, ignored),
 },
 { /* end of list */ }
 },
@@ -788,6 +786,13 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 goto fail;
 }
 
+/* Deprecated option boot=[on|off] */
+if (qemu_opt_get(legacy_opts, boot) != NULL) {
+fprintf(stderr, qemu-kvm: boot=on|off is deprecated and will be 
+ignored. Future versions will reject this parameter. Please 
+update your scripts.\n);
+}
+
 /* Media type */
 value = qemu_opt_get(legacy_opts, media);
 if (value) {
@@ -2328,10 +2333,6 @@ QemuOptsList qemu_common_drive_opts = {
 .name = copy-on-read,
 .type = QEMU_OPT_BOOL,
 .help = copy read data from backing file into image file,
-},{
-.name = boot,
-.type = QEMU_OPT_BOOL,
-.help = (deprecated, ignored),
 },
 { /* end of list */ }
 },
-- 
1.8.1.4




[Qemu-devel] [PATCH v3 11/17] blockdev: Move bus/unit/index processing to drive_init

2013-10-08 Thread Kevin Wolf
This requires moving the automatic ID generation at the same time, so
let's do that as well.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 blockdev.c | 157 -
 1 file changed, 73 insertions(+), 84 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 85e412a..1eaefb0 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -315,10 +315,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 const char *buf;
 const char *file = NULL;
 const char *serial;
-const char *mediastr = ;
-int bus_id, unit_id;
-int max_devs;
-int index;
 int ro = 0;
 int bdrv_flags = 0;
 int on_read_error, on_write_error;
@@ -358,10 +354,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 has_driver_specific_opts = !!qdict_size(bs_opts);
 
 /* extract parameters */
-bus_id  = qemu_opt_get_number(opts, bus, 0);
-unit_id = qemu_opt_get_number(opts, unit, -1);
-index   = qemu_opt_get_number(opts, index, -1);
-
 snapshot = qemu_opt_get_bool(opts, snapshot, 0);
 ro = qemu_opt_get_bool(opts, read-only, 0);
 copy_on_read = qemu_opt_get_bool(opts, copy-on-read, false);
@@ -369,8 +361,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 file = qemu_opt_get(opts, file);
 serial = qemu_opt_get(opts, serial);
 
-max_devs = if_max_devs[type];
-
 if ((buf = qemu_opt_get(opts, discard)) != NULL) {
 if (bdrv_parse_discard_flags(buf, bdrv_flags) != 0) {
 error_report(invalid discard option);
@@ -489,66 +479,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 }
 }
 
-/* compute bus and unit according index */
-
-if (index != -1) {
-if (bus_id != 0 || unit_id != -1) {
-error_report(index cannot be used with bus and unit);
-return NULL;
-}
-bus_id = drive_index_to_bus_id(type, index);
-unit_id = drive_index_to_unit_id(type, index);
-}
-
-/* if user doesn't specify a unit_id,
- * try to find the first free
- */
-
-if (unit_id == -1) {
-   unit_id = 0;
-   while (drive_get(type, bus_id, unit_id) != NULL) {
-   unit_id++;
-   if (max_devs  unit_id = max_devs) {
-   unit_id -= max_devs;
-   bus_id++;
-   }
-   }
-}
-
-/* check unit id */
-
-if (max_devs  unit_id = max_devs) {
-error_report(unit %d too big (max is %d),
- unit_id, max_devs - 1);
-return NULL;
-}
-
-/*
- * catch multiple definitions
- */
-
-if (drive_get(type, bus_id, unit_id) != NULL) {
-error_report(drive with bus=%d, unit=%d (index=%d) exists,
- bus_id, unit_id, index);
-return NULL;
-}
-
-/* no id supplied - create one */
-if (qemu_opts_id(opts) == NULL) {
-char *new_id;
-if (type == IF_IDE || type == IF_SCSI) {
-mediastr = (media == MEDIA_CDROM) ? -cd : -hd;
-}
-if (max_devs) {
-new_id = g_strdup_printf(%s%i%s%i, if_name[type], bus_id,
- mediastr, unit_id);
-} else {
-new_id = g_strdup_printf(%s%s%i, if_name[type],
- mediastr, unit_id);
-}
-qemu_opts_set_id(opts, new_id);
-}
-
 /* init */
 dinfo = g_malloc0(sizeof(*dinfo));
 dinfo-id = g_strdup(qemu_opts_id(opts));
@@ -557,8 +487,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 dinfo-bdrv-read_only = ro;
 dinfo-devaddr = devaddr;
 dinfo-type = type;
-dinfo-bus = bus_id;
-dinfo-unit = unit_id;
 dinfo-refcount = 1;
 if (serial != NULL) {
 dinfo-serial = g_strdup(serial);
@@ -684,6 +612,18 @@ QemuOptsList qemu_legacy_drive_opts = {
 .head = QTAILQ_HEAD_INITIALIZER(qemu_legacy_drive_opts.head),
 .desc = {
 {
+.name = bus,
+.type = QEMU_OPT_NUMBER,
+.help = bus number,
+},{
+.name = unit,
+.type = QEMU_OPT_NUMBER,
+.help = unit number (i.e. lun for scsi),
+},{
+.name = index,
+.type = QEMU_OPT_NUMBER,
+.help = index number,
+},{
 .name = media,
 .type = QEMU_OPT_STRING,
 .help = media type (disk, cdrom),
@@ -725,6 +665,7 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 DriveMediaType media = MEDIA_DISK;
 BlockInterfaceType type;
 int cyls, heads, secs, translation;
+int max_devs, bus_id, unit_id, index;
 Error *local_err = NULL;
 
 /* Change legacy command line options into QMP ones */
@@ -868,6 +809,63 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 }
 }
 
+/* Device address specified by bus/unit or index.
+ * If none 

[Qemu-devel] [PATCH v3 06/17] blockdev: Pass QDict to blockdev_init()

2013-10-08 Thread Kevin Wolf
Working on a QDict instead of a QemuOpts that accepts anything is more
in line with bdrv_open(). A QDict is what qmp_blockdev_add() already has
anyway, so this saves additional conversions. And last, but not least,
it allows later patches to easily extract legacy options into a
separate, typed QemuOpts for drive_init() (the untyped QemuOpts that
drive_init already has doesn't allow access to numbers, only strings,
and is therefore useless without conversion).

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Benoit Canet ben...@irqsave.net
Reviewed-by: Eric Blake ebl...@redhat.com
---
 blockdev.c | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index e7e6bdd..cbe907b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -217,7 +217,10 @@ static void bdrv_format_print(void *opaque, const char 
*name)
 
 static void drive_uninit(DriveInfo *dinfo)
 {
-qemu_opts_del(dinfo-opts);
+if (dinfo-opts) {
+qemu_opts_del(dinfo-opts);
+}
+
 bdrv_unref(dinfo-bdrv);
 g_free(dinfo-id);
 QTAILQ_REMOVE(drives, dinfo, next);
@@ -302,7 +305,8 @@ static bool check_throttle_config(ThrottleConfig *cfg, 
Error **errp)
 return true;
 }
 
-static DriveInfo *blockdev_init(QemuOpts *all_opts,
+/* Takes the ownership of bs_opts */
+static DriveInfo *blockdev_init(QDict *bs_opts,
 BlockInterfaceType block_default_type)
 {
 const char *buf;
@@ -326,7 +330,6 @@ static DriveInfo *blockdev_init(QemuOpts *all_opts,
 int ret;
 Error *error = NULL;
 QemuOpts *opts;
-QDict *bs_opts;
 const char *id;
 bool has_driver_specific_opts;
 BlockDriver *drv = NULL;
@@ -334,9 +337,9 @@ static DriveInfo *blockdev_init(QemuOpts *all_opts,
 translation = BIOS_ATA_TRANSLATION_AUTO;
 media = MEDIA_DISK;
 
-/* Check common options by copying from all_opts to opts, all other options
- * are stored in bs_opts. */
-id = qemu_opts_id(all_opts);
+/* Check common options by copying from bs_opts to opts, all other options
+ * stay in bs_opts for processing by bdrv_open(). */
+id = qdict_get_try_str(bs_opts, id);
 opts = qemu_opts_create(qemu_common_drive_opts, id, 1, error);
 if (error_is_set(error)) {
 qerror_report_err(error);
@@ -344,8 +347,6 @@ static DriveInfo *blockdev_init(QemuOpts *all_opts,
 return NULL;
 }
 
-bs_opts = qdict_new();
-qemu_opts_to_qdict(all_opts, bs_opts);
 qemu_opts_absorb_qdict(opts, bs_opts, error);
 if (error_is_set(error)) {
 qerror_report_err(error);
@@ -634,7 +635,6 @@ static DriveInfo *blockdev_init(QemuOpts *all_opts,
 dinfo-heads = heads;
 dinfo-secs = secs;
 dinfo-trans = translation;
-dinfo-opts = all_opts;
 dinfo-refcount = 1;
 if (serial != NULL) {
 dinfo-serial = g_strdup(serial);
@@ -759,6 +759,7 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 {
 const char *value;
 DriveInfo *dinfo;
+QDict *bs_opts;
 
 /* Change legacy command line options into QMP ones */
 qemu_opt_rename(all_opts, iops, throttling.iops-total);
@@ -807,14 +808,19 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 qemu_opt_unset(all_opts, cache);
 }
 
+/* Get a QDict for processing the options */
+bs_opts = qdict_new();
+qemu_opts_to_qdict(all_opts, bs_opts);
+
 /* Actual block device init: Functionality shared with blockdev-add */
-dinfo = blockdev_init(all_opts, block_default_type);
+dinfo = blockdev_init(bs_opts, block_default_type);
 if (dinfo == NULL) {
 goto fail;
 }
 
 /* Set legacy DriveInfo fields */
 dinfo-enable_auto_del = true;
+dinfo-opts = all_opts;
 
 fail:
 return dinfo;
@@ -2109,13 +2115,7 @@ void qmp_blockdev_add(BlockdevOptions *options, Error 
**errp)
 
 qdict_flatten(qdict);
 
-QemuOpts *opts = qemu_opts_from_qdict(qemu_drive_opts, qdict, local_err);
-if (error_is_set(local_err)) {
-error_propagate(errp, local_err);
-goto fail;
-}
-
-dinfo = blockdev_init(opts, IF_NONE);
+dinfo = blockdev_init(qdict, IF_NONE);
 if (!dinfo) {
 error_setg(errp, Could not open image);
 goto fail;
-- 
1.8.1.4




[Qemu-devel] [PATCH v3 17/17] blockdev: blockdev_init() error conversion

2013-10-08 Thread Kevin Wolf
This gives us meaningful error messages for the blockdev-add QMP
command.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 blockdev.c | 59 +--
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 61dbf26..835c8c3 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -272,7 +272,7 @@ static void bdrv_put_ref_bh_schedule(BlockDriverState *bs)
 qemu_bh_schedule(s-bh);
 }
 
-static int parse_block_error_action(const char *buf, bool is_read)
+static int parse_block_error_action(const char *buf, bool is_read, Error 
**errp)
 {
 if (!strcmp(buf, ignore)) {
 return BLOCKDEV_ON_ERROR_IGNORE;
@@ -283,8 +283,8 @@ static int parse_block_error_action(const char *buf, bool 
is_read)
 } else if (!strcmp(buf, report)) {
 return BLOCKDEV_ON_ERROR_REPORT;
 } else {
-error_report('%s' invalid %s error action,
- buf, is_read ? read : write);
+error_setg(errp, '%s' invalid %s error action,
+   buf, is_read ? read : write);
 return -1;
 }
 }
@@ -309,7 +309,8 @@ typedef enum { MEDIA_DISK, MEDIA_CDROM } DriveMediaType;
 
 /* Takes the ownership of bs_opts */
 static DriveInfo *blockdev_init(QDict *bs_opts,
-BlockInterfaceType type)
+BlockInterfaceType type,
+Error **errp)
 {
 const char *buf;
 const char *file = NULL;
@@ -333,15 +334,13 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 id = qdict_get_try_str(bs_opts, id);
 opts = qemu_opts_create(qemu_common_drive_opts, id, 1, error);
 if (error_is_set(error)) {
-qerror_report_err(error);
-error_free(error);
+error_propagate(errp, error);
 return NULL;
 }
 
 qemu_opts_absorb_qdict(opts, bs_opts, error);
 if (error_is_set(error)) {
-qerror_report_err(error);
-error_free(error);
+error_propagate(errp, error);
 return NULL;
 }
 
@@ -361,7 +360,7 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 
 if ((buf = qemu_opt_get(opts, discard)) != NULL) {
 if (bdrv_parse_discard_flags(buf, bdrv_flags) != 0) {
-error_report(invalid discard option);
+error_setg(errp, invalid discard option);
 return NULL;
 }
 }
@@ -383,7 +382,7 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 } else if (!strcmp(buf, threads)) {
 /* this is the default */
 } else {
-   error_report(invalid aio option);
+   error_setg(errp, invalid aio option);
return NULL;
 }
 }
@@ -400,9 +399,10 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 drv = bdrv_find_whitelisted_format(buf, ro);
 if (!drv) {
 if (!ro  bdrv_find_whitelisted_format(buf, !ro)) {
-error_report('%s' can be only used as read-only device., 
buf);
+error_setg(errp, '%s' can be only used as read-only device.,
+   buf);
 } else {
-error_report('%s' invalid format, buf);
+error_setg(errp, '%s' invalid format, buf);
 }
 return NULL;
 }
@@ -439,20 +439,20 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 cfg.op_size = qemu_opt_get_number(opts, throttling.iops-size, 0);
 
 if (!check_throttle_config(cfg, error)) {
-error_report(%s, error_get_pretty(error));
-error_free(error);
+error_propagate(errp, error);
 return NULL;
 }
 
 on_write_error = BLOCKDEV_ON_ERROR_ENOSPC;
 if ((buf = qemu_opt_get(opts, werror)) != NULL) {
 if (type != IF_IDE  type != IF_SCSI  type != IF_VIRTIO  type != 
IF_NONE) {
-error_report(werror is not supported by this bus type);
+error_setg(errp, werror is not supported by this bus type);
 return NULL;
 }
 
-on_write_error = parse_block_error_action(buf, 0);
-if (on_write_error  0) {
+on_write_error = parse_block_error_action(buf, 0, error);
+if (error_is_set(error)) {
+error_propagate(errp, error);
 return NULL;
 }
 }
@@ -464,8 +464,9 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 return NULL;
 }
 
-on_read_error = parse_block_error_action(buf, 1);
-if (on_read_error  0) {
+on_read_error = parse_block_error_action(buf, 1, error);
+if (error_is_set(error)) {
+error_propagate(errp, error);
 return NULL;
 }
 }
@@ -518,8 +519,9 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 ret = bdrv_open(dinfo-bdrv, file, bs_opts, bdrv_flags, drv, error);
 
 if (ret  0) {
-error_report(could not open disk image %s: %s,
-

[Qemu-devel] [PATCH v3 08/17] blockdev: Move parsing of 'if' option to drive_init

2013-10-08 Thread Kevin Wolf
It's always IF_NONE for blockdev-add.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Benoit Canet ben...@irqsave.net
Reviewed-by: Eric Blake ebl...@redhat.com
---
 blockdev.c | 40 ++--
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 0eaaffa..58c3ab9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -309,14 +309,13 @@ typedef enum { MEDIA_DISK, MEDIA_CDROM } DriveMediaType;
 
 /* Takes the ownership of bs_opts */
 static DriveInfo *blockdev_init(QDict *bs_opts,
-BlockInterfaceType block_default_type,
+BlockInterfaceType type,
 DriveMediaType media)
 {
 const char *buf;
 const char *file = NULL;
 const char *serial;
 const char *mediastr = ;
-BlockInterfaceType type;
 int bus_id, unit_id;
 int cyls, heads, secs, translation;
 int max_devs;
@@ -377,17 +376,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 file = qemu_opt_get(opts, file);
 serial = qemu_opt_get(opts, serial);
 
-if ((buf = qemu_opt_get(opts, if)) != NULL) {
-for (type = 0; type  IF_COUNT  strcmp(buf, if_name[type]); type++)
-;
-if (type == IF_COUNT) {
-error_report(unsupported bus type '%s', buf);
-return NULL;
-   }
-} else {
-type = block_default_type;
-}
-
 max_devs = if_max_devs[type];
 
 if (cyls || heads || secs) {
@@ -756,6 +744,10 @@ QemuOptsList qemu_legacy_drive_opts = {
 .name = media,
 .type = QEMU_OPT_STRING,
 .help = media type (disk, cdrom),
+},{
+.name = if,
+.type = QEMU_OPT_STRING,
+.help = interface (ide, scsi, sd, mtd, floppy, pflash, virtio),
 },
 { /* end of list */ }
 },
@@ -768,6 +760,7 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 QDict *bs_opts;
 QemuOpts *legacy_opts;
 DriveMediaType media = MEDIA_DISK;
+BlockInterfaceType type;
 Error *local_err = NULL;
 
 /* Change legacy command line options into QMP ones */
@@ -842,8 +835,23 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 }
 }
 
+/* Controller type */
+value = qemu_opt_get(legacy_opts, if);
+if (value) {
+for (type = 0;
+ type  IF_COUNT  strcmp(value, if_name[type]);
+ type++) {
+}
+if (type == IF_COUNT) {
+error_report(unsupported bus type '%s', value);
+goto fail;
+}
+} else {
+type = block_default_type;
+}
+
 /* Actual block device init: Functionality shared with blockdev-add */
-dinfo = blockdev_init(bs_opts, block_default_type, media);
+dinfo = blockdev_init(bs_opts, type, media);
 if (dinfo == NULL) {
 goto fail;
 }
@@ -2191,10 +2199,6 @@ QemuOptsList qemu_common_drive_opts = {
 .type = QEMU_OPT_NUMBER,
 .help = unit number (i.e. lun for scsi),
 },{
-.name = if,
-.type = QEMU_OPT_STRING,
-.help = interface (ide, scsi, sd, mtd, floppy, pflash, virtio),
-},{
 .name = index,
 .type = QEMU_OPT_NUMBER,
 .help = index number,
-- 
1.8.1.4




[Qemu-devel] [PATCHv4 15/17] qemu-img: add support for fully allocated images

2013-10-08 Thread Peter Lieven
Signed-off-by: Peter Lieven p...@kamp.de
---
 qemu-img.c|8 +---
 qemu-img.texi |5 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 926f0a0..c6eff15 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -100,8 +100,10 @@ static void help(void)
  '-h' with or without a command shows this help and lists the 
supported formats\n
  '-p' show progress of command (only certain commands)\n
  '-q' use Quiet mode - do not print any output (except errors)\n
- '-S' indicates the consecutive number of bytes that must contain 
only zeros\n
-  for qemu-img to create a sparse image during conversion\n
+ '-S' indicates the consecutive number of bytes (defaults to 4k) 
that must\n
+  contain only zeros for qemu-img to create a sparse image 
during\n
+  conversion. if the number of bytes is 0 sparse files are 
disabled and\n
+  images will always be fully allocated\n
  '--output' takes the format in which the output must be done 
(human or json)\n
  '-n' skips the target volume creation (useful if the volume is 
created\n
   prior to running qemu-img)\n
@@ -1465,7 +1467,7 @@ static int img_convert(int argc, char **argv)
 /* signal EOF to align */
 bdrv_write_compressed(out_bs, 0, NULL, 0);
 } else {
-int has_zero_init = bdrv_has_zero_init(out_bs);
+int has_zero_init = min_sparse ? bdrv_has_zero_init(out_bs) : 0;
 
 sector_num = 0; // total number of sectors converted so far
 nb_sectors = total_sectors - sector_num;
diff --git a/qemu-img.texi b/qemu-img.texi
index 768054e..51a1ee5 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -193,6 +193,11 @@ Image conversion is also useful to get smaller image when 
using a
 growable format such as @code{qcow} or @code{cow}: the empty sectors
 are detected and suppressed from the destination image.
 
+@var{sparse_size} indicates the consecutive number of bytes (defaults to 4k)
+that must contain only zeros for qemu-img to create a sparse image during
+conversion. If the number of bytes is 0 sparse files are disabled and
+images will always be fully allocated.
+
 You can use the @var{backing_file} option to force the output image to be
 created as a copy on write image of the specified base image; the
 @var{backing_file} should have the same content as the input's base image,
-- 
1.7.9.5




[Qemu-devel] [PATCH v3 13/17] blockdev: Remove IF_* check for read-only blockdev_init

2013-10-08 Thread Kevin Wolf
IF_NONE allows read-only, which makes forbidding it in this place
for other types pretty much pointless.

Instead, make sure that all devices for which the check would have
errored out check in their init function that they don't get a read-only
BlockDriverState. This catches even cases where IF_NONE and -device is
used.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 blockdev.c | 6 --
 hw/block/m25p80.c  | 5 +
 hw/block/xen_disk.c| 5 +
 hw/sd/milkymist-memcard.c  | 4 
 hw/sd/omap_mmc.c   | 6 ++
 hw/sd/pl181.c  | 4 
 hw/sd/pxa2xx_mmci.c| 3 +++
 hw/sd/sd.c | 5 +
 hw/sd/sdhci.c  | 3 +++
 hw/sd/ssi-sd.c | 3 +++
 tests/qemu-iotests/051.out | 5 -
 11 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 1c05c7a..1aedef8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -532,12 +532,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 if (media == MEDIA_CDROM) {
 /* CDROM is fine for any interface, don't check.  */
 ro = 1;
-} else if (ro == 1) {
-if (type != IF_SCSI  type != IF_VIRTIO  type != IF_FLOPPY 
-type != IF_NONE  type != IF_PFLASH) {
-error_report(read-only not supported by this bus type);
-goto err;
-}
 }
 
 bdrv_flags |= ro ? 0 : BDRV_O_RDWR;
diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
index 8c3b7f0..02a1544 100644
--- a/hw/block/m25p80.c
+++ b/hw/block/m25p80.c
@@ -624,6 +624,11 @@ static int m25p80_init(SSISlave *ss)
 if (dinfo  dinfo-bdrv) {
 DB_PRINT_L(0, Binding to IF_MTD drive\n);
 s-bdrv = dinfo-bdrv;
+if (bdrv_is_read_only(s-bdrv)) {
+fprintf(stderr, Can't use a read-only drive);
+return 1;
+}
+
 /* FIXME: Move to late init */
 if (bdrv_read(s-bdrv, 0, s-storage, DIV_ROUND_UP(s-size,
 BDRV_SECTOR_SIZE))) {
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index f35fc59..253a45f 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -829,6 +829,11 @@ static int blk_connect(struct XenDevice *xendev)
 /* setup via qemu cmdline - already setup for us */
 xen_be_printf(blkdev-xendev, 2, get configured bdrv (cmdline 
setup)\n);
 blkdev-bs = blkdev-dinfo-bdrv;
+if (bdrv_is_read_only(blkdev-bs)  !readonly) {
+xen_be_printf(blkdev-xendev, 0, Unexpected read-only drive);
+blkdev-bs = NULL;
+return -1;
+}
 /* blkdev-bs is not create by us, we get a reference
  * so we can bdrv_unref() unconditionally */
 bdrv_ref(blkdev-bs);
diff --git a/hw/sd/milkymist-memcard.c b/hw/sd/milkymist-memcard.c
index 42613b3..d1168c9 100644
--- a/hw/sd/milkymist-memcard.c
+++ b/hw/sd/milkymist-memcard.c
@@ -255,6 +255,10 @@ static int milkymist_memcard_init(SysBusDevice *dev)
 
 dinfo = drive_get_next(IF_SD);
 s-card = sd_init(dinfo ? dinfo-bdrv : NULL, false);
+if (s-card == NULL) {
+return -1;
+}
+
 s-enabled = dinfo ? bdrv_is_inserted(dinfo-bdrv) : 0;
 
 memory_region_init_io(s-regs_region, OBJECT(s), memcard_mmio_ops, s,
diff --git a/hw/sd/omap_mmc.c b/hw/sd/omap_mmc.c
index bf5d1fb..937a478 100644
--- a/hw/sd/omap_mmc.c
+++ b/hw/sd/omap_mmc.c
@@ -593,6 +593,9 @@ struct omap_mmc_s *omap_mmc_init(hwaddr base,
 
 /* Instantiate the storage */
 s-card = sd_init(bd, false);
+if (s-card == NULL) {
+exit(1);
+}
 
 return s;
 }
@@ -618,6 +621,9 @@ struct omap_mmc_s *omap2_mmc_init(struct 
omap_target_agent_s *ta,
 
 /* Instantiate the storage */
 s-card = sd_init(bd, false);
+if (s-card == NULL) {
+exit(1);
+}
 
 s-cdet = qemu_allocate_irqs(omap_mmc_cover_cb, s, 1)[0];
 sd_set_cb(s-card, NULL, s-cdet);
diff --git a/hw/sd/pl181.c b/hw/sd/pl181.c
index 03875bf..c35896d 100644
--- a/hw/sd/pl181.c
+++ b/hw/sd/pl181.c
@@ -491,6 +491,10 @@ static int pl181_init(SysBusDevice *sbd)
 qdev_init_gpio_out(dev, s-cardstatus, 2);
 dinfo = drive_get_next(IF_SD);
 s-card = sd_init(dinfo ? dinfo-bdrv : NULL, false);
+if (s-card == NULL) {
+return -1;
+}
+
 return 0;
 }
 
diff --git a/hw/sd/pxa2xx_mmci.c b/hw/sd/pxa2xx_mmci.c
index 90c955f..b9d8b1a 100644
--- a/hw/sd/pxa2xx_mmci.c
+++ b/hw/sd/pxa2xx_mmci.c
@@ -539,6 +539,9 @@ PXA2xxMMCIState *pxa2xx_mmci_init(MemoryRegion *sysmem,
 
 /* Instantiate the actual storage */
 s-card = sd_init(bd, false);
+if (s-card == NULL) {
+exit(1);
+}
 
 register_savevm(NULL, pxa2xx_mmci, 0, 0,
 pxa2xx_mmci_save, pxa2xx_mmci_load, s);
diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 346d86f..7380f06 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -494,6 +494,11 @@ SDState *sd_init(BlockDriverState *bs, bool is_spi)
 {
 

Re: [Qemu-devel] [PATCH] osdep: initialize glib threads in all QEMU tools

2013-10-08 Thread Kevin Wolf
Am 08.10.2013 um 11:58 hat Stefan Hajnoczi geschrieben:
 glib versions prior to 2.31.0 require an explicit g_thread_init() call
 to enable multi-threading.
 
 Failure to initialize threading causes glib to take single-threaded code
 paths without synchronization.  For example, the g_slice allocator will
 crash due to race conditions.
 
 Fix this for all QEMU tool programs (qemu-nbd, qemu-io, qemu-img) by
 moving the g_thread_init() call from vl.c:main() into a new
 osdep.c:thread_init() constructor function.
 
 thread_init() has __attribute__((constructor)) and is automatically
 invoked by the runtime during startup.
 
 We can now drop the simple trace backend's g_thread_init() call since
 thread_init() already called it.
 
 Note that we must keep coroutine-gthread.c's g_thread_init() call which
 is located in a constructor function.  There is no guarantee for
 constructor function ordering so thread_init() may only be called later.

The glib documentation says:

Since version 2.24, calling g_thread_init() multiple times is
allowed, but nothing happens except for the first call.

I take that this means previously it wasn't allowed. qemu's configure
checks for a minimum version of 2.12, so we seems to support glib
versions that don't allow g_thread_init() to be called multiple times.

Do we need to protect against this?

Kevin



[Qemu-devel] [PATCH v3 14/17] qemu-iotests: Check autodel behaviour for device_del

2013-10-08 Thread Kevin Wolf
Block devices creates with -drive and drive_add should automatically
disappear if the guest device is unplugged. blockdev-add ones shouldn't.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 tests/qemu-iotests/064   | 133 +++
 tests/qemu-iotests/064.out   |  80 +++
 tests/qemu-iotests/common.filter |   8 +++
 tests/qemu-iotests/group |   1 +
 4 files changed, 222 insertions(+)
 create mode 100755 tests/qemu-iotests/064
 create mode 100644 tests/qemu-iotests/064.out

diff --git a/tests/qemu-iotests/064 b/tests/qemu-iotests/064
new file mode 100755
index 000..79dc38b
--- /dev/null
+++ b/tests/qemu-iotests/064
@@ -0,0 +1,133 @@
+#!/bin/bash
+#
+# Test automatic deletion of BDSes created by -drive/drive_add
+#
+# Copyright (C) 2013 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/.
+#
+
+# creator
+owner=kw...@redhat.com
+
+seq=`basename $0`
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+function do_run_qemu()
+{
+echo Testing: $@
+$QEMU -nographic -qmp stdio -serial none $@
+echo
+}
+
+function run_qemu()
+{
+do_run_qemu $@ 21 | _filter_testdir | _filter_qmp
+}
+
+size=128M
+
+_make_test_img $size
+
+echo
+echo === -drive/-device and device_del ===
+echo
+
+run_qemu -drive file=$TEST_IMG,format=$IMGFMT,if=none,id=disk -device 
virtio-blk-pci,drive=disk,id=virtio0 EOF
+{ execute: qmp_capabilities }
+{ execute: query-block }
+{ execute: device_del, arguments: { id: virtio0 } }
+{ execute: system_reset }
+{ execute: query-block }
+{ execute: quit }
+EOF
+
+echo
+echo === -drive/device_add and device_del ===
+echo
+
+run_qemu -drive file=$TEST_IMG,format=$IMGFMT,if=none,id=disk EOF
+{ execute: qmp_capabilities }
+{ execute: query-block }
+{ execute: device_add,
+   arguments: { driver: virtio-blk-pci, drive: disk,
+  id: virtio0 } }
+{ execute: device_del, arguments: { id: virtio0 } }
+{ execute: system_reset }
+{ execute: query-block }
+{ execute: quit }
+EOF
+
+echo
+echo === drive_add/device_add and device_del ===
+echo
+
+run_qemu EOF
+{ execute: qmp_capabilities }
+{ execute: human-monitor-command,
+  arguments: { command-line: drive_add 0 
file=$TEST_IMG,format=$IMGFMT,if=none,id=disk } }
+{ execute: query-block }
+{ execute: device_add,
+   arguments: { driver: virtio-blk-pci, drive: disk,
+  id: virtio0 } }
+{ execute: device_del, arguments: { id: virtio0 } }
+{ execute: system_reset }
+{ execute: query-block }
+{ execute: quit }
+EOF
+
+echo
+echo === blockdev_add/device_add and device_del ===
+echo
+
+run_qemu EOF
+{ execute: qmp_capabilities }
+{ execute: blockdev-add,
+  arguments: {
+  options: {
+driver: $IMGFMT,
+id: disk,
+file: {
+driver: file,
+filename: $TEST_IMG
+}
+  }
+}
+  }
+{ execute: query-block }
+{ execute: device_add,
+   arguments: { driver: virtio-blk-pci, drive: disk,
+  id: virtio0 } }
+{ execute: device_del, arguments: { id: virtio0 } }
+{ execute: system_reset }
+{ execute: query-block }
+{ execute: quit }
+EOF
+
+# success, all done
+echo *** done
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/064.out b/tests/qemu-iotests/064.out
new file mode 100644
index 000..8c7a334
--- /dev/null
+++ b/tests/qemu-iotests/064.out
@@ -0,0 +1,80 @@
+QA output created by 064
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 
+
+=== -drive/-device and device_del ===
+
+Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device 
virtio-blk-pci,drive=disk,id=virtio0
+QMP_VERSION
+{return: {}}
+{return: [{io-status: ok, device: disk, locked: false, 
removable: false, inserted: {iops_rd: 0, image: {virtual-size: 
134217728, filename: TEST_DIR/t.qcow2, cluster-size: 65536, format: 
qcow2, actual-size: 139264, dirty-flag: false}, iops_wr: 0, ro: 
false, backing_file_depth: 0, drv: qcow2, iops: 0, bps_wr: 0, 
encrypted: false, bps: 0, bps_rd: 0, file: TEST_DIR/t.qcow2, 
encryption_key_missing: false}, type: unknown}, {io-status: ok, 
device: ide1-cd0, locked: false, 

[Qemu-devel] [PATCH v3 16/17] blockdev: Don't disable COR automatically with blockdev-add

2013-10-08 Thread Kevin Wolf
If a read-only device is configured with copy-on-read=on, the old code
only prints a warning and automatically disables copy on read. Make it
a real error for blockdev-add.

Signed-off-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 block.c|  9 +++--
 blockdev.c | 31 +++
 2 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 93e113a..dc63f02 100644
--- a/block.c
+++ b/block.c
@@ -774,8 +774,13 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 }
 
 assert(bs-copy_on_read == 0); /* bdrv_new() and bdrv_close() make it so */
-if (!bs-read_only  (flags  BDRV_O_COPY_ON_READ)) {
-bdrv_enable_copy_on_read(bs);
+if (flags  BDRV_O_COPY_ON_READ) {
+if (!bs-read_only) {
+bdrv_enable_copy_on_read(bs);
+} else {
+error_setg(errp, Can't use copy-on-read on read-only device);
+return -EINVAL;
+}
 }
 
 if (filename != NULL) {
diff --git a/blockdev.c b/blockdev.c
index b39f2e7..61dbf26 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -514,10 +514,6 @@ static DriveInfo *blockdev_init(QDict *bs_opts,
 
 bdrv_flags |= ro ? 0 : BDRV_O_RDWR;
 
-if (ro  copy_on_read) {
-error_report(warning: disabling copy_on_read on read-only drive);
-}
-
 QINCREF(bs_opts);
 ret = bdrv_open(dinfo-bdrv, file, bs_opts, bdrv_flags, drv, error);
 
@@ -605,6 +601,18 @@ QemuOptsList qemu_legacy_drive_opts = {
 .type = QEMU_OPT_STRING,
 .help = pci address (virtio only),
 },
+
+/* Options that are passed on, but have special semantics with -drive 
*/
+{
+.name = read-only,
+.type = QEMU_OPT_BOOL,
+.help = open drive file as read-only,
+},{
+.name = copy-on-read,
+.type = QEMU_OPT_BOOL,
+.help = copy read data from backing file into image file,
+},
+
 { /* end of list */ }
 },
 };
@@ -620,6 +628,7 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 int cyls, heads, secs, translation;
 int max_devs, bus_id, unit_id, index;
 const char *devaddr;
+bool read_only, copy_on_read;
 Error *local_err = NULL;
 
 /* Change legacy command line options into QMP ones */
@@ -702,6 +711,20 @@ DriveInfo *drive_init(QemuOpts *all_opts, 
BlockInterfaceType block_default_type)
 }
 }
 
+/* copy-on-read is disabled with a warning for read-only devices */
+read_only = qemu_opt_get_bool(legacy_opts, read-only, false);
+copy_on_read = qemu_opt_get_bool(legacy_opts, copy-on-read, false);
+
+if (read_only  copy_on_read) {
+error_report(warning: disabling copy-on-read on read-only drive);
+copy_on_read = false;
+}
+
+qdict_put(bs_opts, read-only,
+  qstring_from_str(read_only ? on : off));
+qdict_put(bs_opts, copy-on-read,
+  qstring_from_str(copy_on_read ? on :off));
+
 /* Controller type */
 value = qemu_opt_get(legacy_opts, if);
 if (value) {
-- 
1.8.1.4




Re: [Qemu-devel] [PATCH] block/iscsi: introduce bdrv_co_{readv, writev, flush_to_disk}

2013-10-08 Thread Kevin Wolf
Am 02.10.2013 um 17:46 hat Paolo Bonzini geschrieben:
 Il 02/10/2013 17:41, Peter Lieven ha scritto:
  this converts read, write and flush functions from aio to coroutines.
 
 I'm not sure it's already the time for this...  Cancellation sucks in
 QEMU, and this is going to make things even worse.

Not sure what you're referring to. If you mean iscsi_aio_cancel(), isn't
it dead code anyway since we changed block.c to use coroutines for
everything? bdrv_co_io_em() even throws the acb away, so even if you
wanted, there's no way to cancel the request even today.

And the lines deleted/inserted are much in favour of this patch.

Kevin



Re: [Qemu-devel] [PATCH] block/iscsi: introduce bdrv_co_{readv, writev, flush_to_disk}

2013-10-08 Thread Paolo Bonzini
Il 08/10/2013 14:33, Kevin Wolf ha scritto:
   this converts read, write and flush functions from aio to coroutines.
  
  I'm not sure it's already the time for this...  Cancellation sucks in
  QEMU, and this is going to make things even worse.
 Not sure what you're referring to. If you mean iscsi_aio_cancel(), isn't
 it dead code anyway since we changed block.c to use coroutines for
 everything? bdrv_co_io_em() even throws the acb away, so even if you
 wanted, there's no way to cancel the request even today.

SCSI tries to use cancellation, and this results in VCPU threads
starving all other threads.  So I would like to introduce cancellation
points for coroutines.

Paolo



Re: [Qemu-devel] [PATCHv5] block/get_block_status: avoid redundant callouts on raw devices

2013-10-08 Thread Paolo Bonzini
Il 08/10/2013 14:05, Peter Lieven ha scritto:
 Strictly speaking, this should probably do something like this:

assert(ret  BDRV_BLOCK_OFFSET_VALID);
return bdrv_get_block_status(bs-file, ret  BDRV_SECTOR_BITS,
 nb_sectors, pnum);
 shouldn't the last line be:
 
   return bdrv_get_block_status(bs-file, ret  BDRV_SECTOR_BITS,
*pnum, pnum);
 
 
 This would of course require *pnum = nb_sectors in raw_co_get_block_status
 ?

Yes for both questions.

Paolo



Re: [Qemu-devel] [PATCH] qemu-iotests: Correct 026 output

2013-10-08 Thread Kevin Wolf
Am 02.10.2013 um 16:45 hat Max Reitz geschrieben:
 Because l2_allocate now frees the unused L2 cluster on error, the
 according test cases in 026 don't result in one leaked cluster anymore.
 
 Signed-off-by: Max Reitz mre...@redhat.com
 ---
 This patch depends on and is a follow-up to qcow2: Free allocated L2
 cluster on error which was part of the qcow2: Small error path fixes
 for l2_allocate series. This series is fully merged to Kevin's block
 branch, however, this one patch was excluded from his pull request last
 Friday, therefore, it is missing from master.
 ---
  tests/qemu-iotests/026.out | 32 
  tests/qemu-iotests/026.out.nocache | 32 
  2 files changed, 16 insertions(+), 48 deletions(-)

Stefan, it seems you already picked this up?

Kevin



Re: [Qemu-devel] [PATCH] block/iscsi: introduce bdrv_co_{readv, writev, flush_to_disk}

2013-10-08 Thread Kevin Wolf
Am 08.10.2013 um 14:35 hat Paolo Bonzini geschrieben:
 Il 08/10/2013 14:33, Kevin Wolf ha scritto:
this converts read, write and flush functions from aio to coroutines.
   
   I'm not sure it's already the time for this...  Cancellation sucks in
   QEMU, and this is going to make things even worse.
  Not sure what you're referring to. If you mean iscsi_aio_cancel(), isn't
  it dead code anyway since we changed block.c to use coroutines for
  everything? bdrv_co_io_em() even throws the acb away, so even if you
  wanted, there's no way to cancel the request even today.
 
 SCSI tries to use cancellation, and this results in VCPU threads
 starving all other threads.  So I would like to introduce cancellation
 points for coroutines.

Sounds like a nice thing to have, but it's unrelated to this patch.
Cancellation means waiting for request completion before and after the
patch.

Kevin



[Qemu-devel] [PATCHv6] block/get_block_status: avoid redundant callouts on raw devices

2013-10-08 Thread Peter Lieven
if a raw device like an iscsi target or host device is used
the current implementation makes a second call out to get
the block status of bs-file.

Signed-off-by: Peter Lieven p...@kamp.de
---
v6: made the result of raw_co_get_block_status valid by
adding BDRV_BLOCK_DATA and setting *pnum to nb_sectors
and calling bdrv_get_block_status on bs-file with
the offset and *pnum returned in the first callout.
   
v5: add a generic get_lba_status function in the raw driver which
adds the BDRV_BLOCK_RAW flag. bdrv_co_get_block_status will
handle the callout to bs-file then.

v4: use a flag to detect the raw driver instead of the strncmp
hack

 block.c   |6 ++
 block/raw_bsd.c   |4 +++-
 include/block/block.h |4 
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 93e113a..826eff0 100644
--- a/block.c
+++ b/block.c
@@ -3147,6 +3147,12 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status(BlockDriverState *bs,
 return ret;
 }
 
+if (ret  BDRV_BLOCK_RAW) {
+assert(ret  BDRV_BLOCK_OFFSET_VALID);
+return bdrv_get_block_status(bs-file, ret  BDRV_SECTOR_BITS,
+ *pnum, pnum);
+}
+
 if (!(ret  BDRV_BLOCK_DATA)) {
 if (bdrv_has_zero_init(bs)) {
 ret |= BDRV_BLOCK_ZERO;
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index d4ace60..d61906b 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -62,7 +62,9 @@ static int64_t coroutine_fn 
raw_co_get_block_status(BlockDriverState *bs,
 int64_t sector_num,
 int nb_sectors, int *pnum)
 {
-return bdrv_get_block_status(bs-file, sector_num, nb_sectors, pnum);
+*pnum = nb_sectors;
+return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
+   (sector_num  BDRV_SECTOR_BITS);
 }
 
 static int coroutine_fn raw_co_write_zeroes(BlockDriverState *bs,
diff --git a/include/block/block.h b/include/block/block.h
index f808550..003699e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -84,6 +84,9 @@ typedef struct BlockDevOps {
 /* BDRV_BLOCK_DATA: data is read from bs-file or another file
  * BDRV_BLOCK_ZERO: sectors read as zero
  * BDRV_BLOCK_OFFSET_VALID: sector stored in bs-file as raw data
+ * BDRV_BLOCK_RAW: used internally to indicate that the request
+ * was answered by the raw driver and that one
+ * should look in bs-file directly.
  *
  * If BDRV_BLOCK_OFFSET_VALID is set, bits 9-62 represent the offset in
  * bs-file where sector data can be read from as raw data.
@@ -105,6 +108,7 @@ typedef struct BlockDevOps {
 #define BDRV_BLOCK_DATA 1
 #define BDRV_BLOCK_ZERO 2
 #define BDRV_BLOCK_OFFSET_VALID 4
+#define BDRV_BLOCK_RAW  8
 #define BDRV_BLOCK_OFFSET_MASK  BDRV_SECTOR_MASK
 
 typedef enum {
-- 
1.7.9.5




[Qemu-devel] [PATCHv4 17/17] block/raw: copy BlockLimits on raw_open

2013-10-08 Thread Peter Lieven
Signed-off-by: Peter Lieven p...@kamp.de
---
 block/raw_bsd.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 8dc7bba..2c26b79 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -159,6 +159,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, 
int flags,
 Error **errp)
 {
 bs-sg = bs-file-sg;
+bs-bl = bs-file-bl;
 return 0;
 }
 
-- 
1.7.9.5




Re: [Qemu-devel] [PATCH 0/3] qapi: introduce BlockJobType enum

2013-10-08 Thread Kevin Wolf
Am 08.10.2013 um 11:29 hat Fam Zheng geschrieben:
 Currently, block job type is hard coded string and could be repeated in
 different places in the code base. Introduce a enum type in QAPI to make it
 better for maintenance and introspection. The old BlockJobType struct is
 renamed to BlockJobDriver and its field job_type becomes a BlockJobType 
 enum.
 
 Nothing is changed to the interface.
 
 Fam Zheng (3):
   blockjob: rename BlockJobType to BlockJobDriver
   qapi: Introduce enum BlockJobType
   qapi: make use of new BlockJobType

Thanks, applied to the block branch.

Kevin



Re: [Qemu-devel] [Xen-devel] Hvmloader: Add _STA for PCI hotplug slots

2013-10-08 Thread Fabio Fantoni

Il 29/09/2013 02:30, Gonglei (Arei) ha scritto:

-Original Message-
From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
Sent: Saturday, September 28, 2013 5:43 AM
To: Gonglei (Arei); anthony.per...@citrix.com; Stefano Stabellini
Cc: xen-de...@lists.xen.org; Hanweidong (Randy); Yanqiangjun; Luonengjun;
qemu-devel@nongnu.org; Gaowei (UVP); Huangweidong (Hardware)
Subject: Re: [Xen-devel] Hvmloader: Add _STA for PCI hotplug slots

On Fri, Sep 27, 2013 at 06:29:20AM +, Gonglei (Arei) wrote:

Hi,

Hey,

(CCing Stefano and Anthony).


In Xen platform, after using upstream qemu, the all of pci devices will show

hotplug in the windows guest.

In this situation, the windows guest may occur blue screen when VM' user

click the icon of VGA card for trying unplug VGA card.

However, we don't hope VM's user can do such dangerous operation, and

showing all pci devices inside the guest OS is unfriendly.

In addition, I find the traditional qemu have not this problem, and KVM also.


Is there any news about this patch please?



On the KVM platform, the seabios will read the RMV bits of pci slot (according

the 0xae08 I/O port register),

then modify the SSDT table.

The key steps as follows:
In Seabios:
#define PCI_RMV_BASE 0xae0c// 0xae08 I/O port register
static void* build_ssdt(void)
{
  ...
  // build Device object for each slot
  u32 rmvc_pcrm = inl(PCI_RMV_BASE);
  ...
}

In upstream Qemu, read 0xae0c I/O port register function:
static uint64_t pci_read(void *opaque, hwaddr addr, unsigned int size)
{
 ...
case PCI_RMV_BASE - PCI_HOTPLUG_ADDR:
 val = s-pci0_hotplug_enable;
 break;
}
s-pci0_hotplug_enable is set by the follow function:

static void piix4_update_hotplug(PIIX4PMState *s)
{
...
s-pci0_hotplug_enable = ~0;
 s-pci0_slot_device_present = 0;

 QTAILQ_FOREACH_SAFE(kid, bus-children, sibling, next) {
 DeviceState *qdev = kid-child;
 PCIDevice *pdev = PCI_DEVICE(qdev);
 PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(pdev);
 int slot = PCI_SLOT(pdev-devfn);

//setting by PCIDeviceClass *k-no_hotplug
 if (pc-no_hotplug) {
 s-pci0_hotplug_enable = ~(1U  slot);
 }

 s-pci0_slot_device_present |= (1U  slot);
 }
}

But, on the XEN platform, ACPI DSDT tables is produced by the hvmloader,
more details in this patch:


http://xen.1045712.n5.nabble.com/xen-unstable-hvmloader-acpi-dsdt-Fix-PCI-
hotplug-with-the-new-qemu-xen-td4947152.html

# Node ID 1a912ce93b506a185b54fd97986214e6eff8a0bc
# Parent  6bc03e22f921aadfa7e5cebe92100cb01377947d
hvmloader/acpi/dsdt: Fix PCI hotplug with the new qemu-xen.

oddly enough you did not CC the author of said patch?

I am doing that for you.

That's my mistake, thank you so much!

The ACPI PIIX4 device in QEMU upstream as not the same behavior to
handle PCI hotplug. This patch introduce the necessary change to the
DSDT ACPI table to behave as expceted by the new QEMU.

To switch to this new DSDT table version, there is a new option
--dm-version to mk_dsdt.

Change are inspired by SeaBIOS DSDT source code.

There is few things missing with the new QEMU:
   - QEMU provide the plugged/unplugged status only per slot (and not
 per func like qemu-xen-traditionnal.
   - I did not include the _STA ACPI method that give the status of a
 device (present, functionning properly) because qemu-xen does not
 handle it.
   - I did not include the _RMV method that say if the device can be
 removed,
 because the IO port of QEMU that give this status always return
 true. In
 SeaBIOS table, they have a specific _RMV method for VGA, ISA that
 return
 false. But I'm not sure that we can do the same in Xen.


now, I add the _STA method, return the different value according the 0xae08

I/O port register on read,

a pci device allow hotplug return 0x1f, a pci device don't allow return 0x1e.
Then the pci devices which don't allow hotplug will not show inside the guest

OS.

So you are saying that the 'the IO port of QEMU that give this status always
return
true.  is no longer correct?


Yes, now the RMV bits of pci slot is set by the PCIDeviceClass's no_hotplug 
attribute, such as:

static void cirrus_vga_class_init(ObjectClass *klass, void *data)
{
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);

 k-no_hotplug = 1;
 ... ...
}

If k-no_hotplug equal 1, the corresponding slot of PIIX4PMState 
pci0_hotplug_enable bit will be cleared.
Otherwise the slot's pci0_hotplug_enable bit will be set.


Index: tools/firmware/hvmloader/acpi/mk_dsdt.c



===

--- tools/firmware/hvmloader/acpi/mk_dsdt.c (revision 1105)
+++ tools/firmware/hvmloader/acpi/mk_dsdt.c (working copy)
@@ -437,6 +437,10 @@
  indent(); printf(B0EJ, 32,\n);
  pop_block();

+stmt(OperationRegion, SRMV, SystemIO, 

Re: [Qemu-devel] [PATCH 3/3] qapi: make use of new BlockJobType

2013-10-08 Thread Eric Blake
On 10/08/2013 03:29 AM, Fam Zheng wrote:
 Switch the string to enum type BlockJobType in BlockJobDriver.
 
 Signed-off-by: Fam Zheng f...@redhat.com
 ---

 +++ b/include/block/blockjob.h
 @@ -37,7 +37,7 @@ typedef struct BlockJobDriver {
  size_t instance_size;
  
  /** String describing the operation, part of query-block-jobs QMP API */
 -const char *job_type;
 +BlockJobType job_type;

Comment looks awkward now that it is not a string in memory; but it is
still a string on the wire, so I can live with it as-is.

Series: Reviewed-by: Eric Blake ebl...@redhat.com

Thanks for doing this!

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] osdep: initialize glib threads in all QEMU tools

2013-10-08 Thread Paolo Bonzini
Il 08/10/2013 14:25, Kevin Wolf ha scritto:
 The glib documentation says:
 
 Since version 2.24, calling g_thread_init() multiple times is
 allowed, but nothing happens except for the first call.
 
 I take that this means previously it wasn't allowed. qemu's configure
 checks for a minimum version of 2.12, so we seems to support glib
 versions that don't allow g_thread_init() to be called multiple times.
 
 Do we need to protect against this?

I think that's the point of the if (!g_thread_supported ()) tests.

Paolo



Re: [Qemu-devel] [PATCH v4] Extend qemu-ga's 'guest-info' command to expose flag 'success-response'

2013-10-08 Thread Eric Blake
On 10/08/2013 12:23 AM, Mark Wu wrote:
 Now we have several qemu-ga commands not returning response on success.
 It has been documented in qga/qapi-schema.json already. This patch exposes
 the 'success-response' flag by extending 'guest-info' command. With this
 change, the clients can handle the command response more flexibly.
 
 Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
 ---
 Changes:
 v4: 
 Add signature of qmp_has_success_response per Michael.
 v3: 
   1. treat cmd-options as a bitmask instead of single option (per Eric) 
   2. rebase on the patch  Add interface to traverse the qmp command list
 by QmpCommand to avoid the O(n2) problem (per Eric and Michael)
 v2: 
 add the notation 'since 1.7' to the option 'success-response'
 (per Eric Blake's comments)
 
  include/qapi/qmp/dispatch.h | 1 +
  qapi/qmp-registry.c | 5 +
  qga/commands.c  | 1 +
  qga/qapi-schema.json| 5 -
  4 files changed, 11 insertions(+), 1 deletion(-)

Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v2] Add interface to traverse the qmp command list by QmpCommand

2013-10-08 Thread Eric Blake
On 10/08/2013 12:23 AM, Mark Wu wrote:
 In the original code, qmp_get_command_list is used to construct
 a list of all commands' name. To get the information of all qga
 commands, it traverses the name list and search the command info
 with its name.  So it can cause O(n^2) in the number of commands.
 
 This patch adds an interface to traverse the qmp command list by
 QmpCommand to replace qmp_get_command_list. It can decrease the
 complexity from O(n^2) to O(n).
 
 Signed-off-by: Mark Wu wu...@linux.vnet.ibm.com
 ---
 Changes:
 v2:
   1. Keep the signature of qmp_command_is_enabled (per Eric and Michael)
 2. Remove the unnecessary pointer castings (per Eric)
 
  include/qapi/qmp/dispatch.h |  5 ++--
  qapi/qmp-registry.c | 27 +++---
  qga/commands.c  | 38 ++---
  qga/main.c  | 68 
 +
  4 files changed, 48 insertions(+), 90 deletions(-)

Reviewed-by: Eric Blake ebl...@redhat.com

 +++ b/qga/main.c
 @@ -347,48 +347,34 @@ static gint ga_strcmp(gconstpointer str1, gconstpointer 
 str2)
  }
  
  /* disable commands that aren't safe for fsfreeze */
 -static void ga_disable_non_whitelisted(void)
 +static void ga_disable_non_whitelisted(QmpCommand *cmd, void *opaque)
  {
 -char **list_head, **list;
  bool whitelisted;
  int i;
  
 -list_head = list = qmp_get_command_list();
 -while (*list != NULL) {
 -whitelisted = false;
 -i = 0;
 -while (ga_freeze_whitelist[i] != NULL) {
 -if (strcmp(*list, ga_freeze_whitelist[i]) == 0) {
 -whitelisted = true;
 -}
 -i++;
 -}
 -if (!whitelisted) {
 -g_debug(disabling command: %s, *list);
 -qmp_disable_command(*list);
 +whitelisted = false;
 +i = 0;
 +while (ga_freeze_whitelist[i] != NULL) {
 +if (strcmp(cmd-name, ga_freeze_whitelist[i]) == 0) {

This (and several other instances) accesses cmd-name directly where we
were formally using the string from the returned list.  Should these
spots be modified to go through an accessor method instead, similar to
qmp_command_is_enabled, so that QmpCommand can be treated as an opaque
type from this file?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 0/8] Make icount thread-safe

2013-10-08 Thread Andreas Färber
Am 08.10.2013 10:47, schrieb Paolo Bonzini:
 This series moves the icount state under the same seqlock as the normal
 vm_clock implementation.
 
 It is not yet 100% thread-safe, because the CPU list should be moved
 under RCU protection (due to the call to !all_cpu_threads_idle()
 in qemu_clock_warp).  However it is a substantial step forward, the
 only uncovered case being CPU hotplug.
 
 Please review.
 
 Paolo
 
 Paolo Bonzini (8):
   timers: extract timer_mod_ns_locked and timerlist_rearm
   timers: add timer_mod_anticipate and timer_mod_anticipate_ns

   timers: use cpu_get_icount() directly
   timers: reorganize icount_warp_rt
   timers: prepare the code for future races in calling qemu_clock_warp
   timers: introduce cpu_get_clock_locked
   timers: document (future) locking rules for icount
   timers: make icount thread-safe

These patches touch cpus.c exclusively, so timers: is rather misleading.

As you know I have pending patches (in need of rebase due to the
performance issue you raised) moving the icount CPU fields around.
Is there anything in particular I should be aware of? Looks to me as if
this may be orthogonal?

What about the previous patch disabling icount for -smp? Does this
series supersede it or does it fix different concurrency issues?

Thanks,
Andreas

  cpus.c | 110 
 -
  include/qemu/timer.h   |  26 +
  qemu-timer.c   |  74 +++--
  4 files changed, 163 insertions(+), 47 deletions(-)
  create mode 100644 include/qemu/seqlock.h

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH v5 0/3] pci: partially implement master abort protocol

2013-10-08 Thread Michael S. Tsirkin
On Mon, Sep 16, 2013 at 11:21:13AM +0300, Marcel Apfelbaum wrote:
 PCI spec requires that a transaction that has not been claimed
 by any PCI bus devices will be terminated by the initiator
 with master abort. For read transactions -1() is returned and 
 writes are silently dropped.

OK looks good to me, I put this on the pci branch.

 Implementation:
  - Allowed the MemoryRegion priority to be negative so a subregion will be
visible on all the addresses not covered by other container subregions.
  - Added a memory region with negative priority that extends over all the
pci address space. This region catches all the accesses
to the unassigned pci addresses.
  - The MemoryRegion's ops emulates the master abort scenario.
 
 I am working on implementing the following on top of this series
  - Implement upstream master abort
  - Handling of RECEIVED MASTER ABORT BIT in Status register
 
 Changes from v4:
  - Addressed Peter Maydell comments
- Changed memory patches commit comment
  - Addressed Michael S. Tsirkin comments
- Changed PCI master_abort_mem ops endian-nes to DEVICE_LITTLE_ENDIAN
 
 Changes from v3:
  - Addressed Peter Maydell comments
- Removed unnecessary changes to priority of MemoryListener
- Ensured that priority is now signed in all related places
- Added to memory docs explanation on signed priorities
  - Addresses Michael S. Tsirkin comments
- Changed the name of the new Memory region to master_abort_mem
- Made master abort priority INT_MIN instead of -1
  - Removed handling of RECEIVED MASTER ABORT BIT; it will be taken
care in a different series
 
 Changes from v2:
  - minor: changed nr of patches in the title
  - minor: modified series list
 
 Changes from v1:
  - pci-unassigned-mem MemoryRegion resides now in PCIBus and not on
 various Host Bridges
  - pci-unassgined-mem does not have a .valid.accept field and
 implements read write methods
 
 Marcel Apfelbaum (3):
   memory: Change MemoryRegion priorities from unsigned to signed
   docs/memory: Explictly state that MemoryRegion priority is signed
   hw/pci: partially handle pci master abort
 
  docs/memory.txt  |  4 
  hw/core/sysbus.c |  4 ++--
  hw/pci/pci.c | 27 +++
  include/exec/memory.h|  4 ++--
  include/hw/pci/pci_bus.h |  1 +
  include/hw/sysbus.h  |  2 +-
  memory.c |  4 ++--
  7 files changed, 39 insertions(+), 7 deletions(-)
 
 -- 
 1.8.3.1



Re: [Qemu-devel] [PATCH 0/8] Make icount thread-safe

2013-10-08 Thread Paolo Bonzini
Il 08/10/2013 15:47, Andreas Färber ha scritto:
 Am 08.10.2013 10:47, schrieb Paolo Bonzini:
 This series moves the icount state under the same seqlock as the normal
 vm_clock implementation.

 It is not yet 100% thread-safe, because the CPU list should be moved
 under RCU protection (due to the call to !all_cpu_threads_idle()
 in qemu_clock_warp).  However it is a substantial step forward, the
 only uncovered case being CPU hotplug.

 Please review.

 Paolo

 Paolo Bonzini (8):
   timers: extract timer_mod_ns_locked and timerlist_rearm
   timers: add timer_mod_anticipate and timer_mod_anticipate_ns
 
   timers: use cpu_get_icount() directly
   timers: reorganize icount_warp_rt
   timers: prepare the code for future races in calling qemu_clock_warp
   timers: introduce cpu_get_clock_locked
   timers: document (future) locking rules for icount
   timers: make icount thread-safe
 
 These patches touch cpus.c exclusively, so timers: is rather misleading.

I can change that to icount.

 As you know I have pending patches (in need of rebase due to the
 performance issue you raised) moving the icount CPU fields around.
 Is there anything in particular I should be aware of? Looks to me as if
 this may be orthogonal?

It's entirely orthogonal.  It doesn't affect the cpu-exec part of
icount, only the timers :) part.

 What about the previous patch disabling icount for -smp? Does this
 series supersede it or does it fix different concurrency issues?

This is for making accesses to icount safe without holding the BQL.
icount for -smp remains just as broken as before.

Paolo



Re: [Qemu-devel] [PATCH v7 4/6] qcow2: Add support for ImageInfoSpecific

2013-10-08 Thread Kevin Wolf
Am 02.10.2013 um 10:39 hat Max Reitz geschrieben:
 Add a new ImageInfoSpecificQCow2 type as a subtype of ImageInfoSpecific.
 This contains the compatibility level as a string and an optional
 lazy_refcounts boolean (optional means mandatory for compat = 1.1 and
 not available for compat == 0.10).
 
 Also, add qcow2_get_specific_info, which returns this information.
 
 Signed-off-by: Max Reitz mre...@redhat.com
 Reviewed-by: Eric Blake ebl...@redhat.com
 ---
  block/qcow2.c| 19 +++
  qapi-schema.json | 16 
  2 files changed, 35 insertions(+)
 
 diff --git a/block/qcow2.c b/block/qcow2.c
 index 4a9888c..396650d 100644
 --- a/block/qcow2.c
 +++ b/block/qcow2.c
 @@ -1810,6 +1810,24 @@ static int qcow2_get_info(BlockDriverState *bs, 
 BlockDriverInfo *bdi)
  return 0;
  }
  
 +static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs)
 +{
 +BDRVQcowState *s = bs-opaque;
 +ImageInfoSpecific *spec_info = g_new0(ImageInfoSpecific, 1);
 +
 +spec_info-kind = IMAGE_INFO_SPECIFIC_KIND_QCOW2;
 +spec_info-qcow2 = g_new0(ImageInfoSpecificQCow2, 1);
 +if (s-qcow_version == 2) {
 +spec_info-qcow2-compat = g_strdup(0.10);
 +} else if (s-qcow_version == 3) {
 +spec_info-qcow2-compat = g_strdup(1.1);
 +spec_info-qcow2-lazy_refcounts = s-use_lazy_refcounts;

If this function gets reused in a running qemu instance, e.g. from the
monitor, I think the semantics we'd want it to have is to describe the
image file, not the current options (which may have been modified on the
command line or using other monitor commands).

By checking s-compatible_features instead of s-use_lazy_refcounts you
would always return this information.

(Also, how could you leave out this opportunity to use compound
literals?! ;-))

 +spec_info-qcow2-has_lazy_refcounts = true;
 +}
 +
 +return spec_info;
 +}

Kevin



Re: [Qemu-devel] [PATCH] osdep: initialize glib threads in all QEMU tools

2013-10-08 Thread Kevin Wolf
Am 08.10.2013 um 15:08 hat Paolo Bonzini geschrieben:
 Il 08/10/2013 14:25, Kevin Wolf ha scritto:
  The glib documentation says:
  
  Since version 2.24, calling g_thread_init() multiple times is
  allowed, but nothing happens except for the first call.
  
  I take that this means previously it wasn't allowed. qemu's configure
  checks for a minimum version of 2.12, so we seems to support glib
  versions that don't allow g_thread_init() to be called multiple times.
  
  Do we need to protect against this?
 
 I think that's the point of the if (!g_thread_supported ()) tests.

Ah yes, I think you're right. Not the best function name I've ever seen
that glib uses there, but okay.

Kevin



[Qemu-devel] Current qemu-master hangs when used with qxl + linux guest

2013-10-08 Thread Hans de Goede

Hi All,

I'm having this weird problem with qemu master + spice/qxl using
guests. As soon as the guest starts Xorg, I get the following message
from qemu:

main-loop: WARNING: I/O thread spun for 1000 iterations

And from then on the guest hangs and qemu consumes 100% cpu. The qemu
console still works, and I can quit qemu that way.

Doing ctrl+c + a thread apply all bt in qemy shows one cpu thread waiting
for the iothread-lock, and all other threads waiting in poll.

This happens both with non kms guests (tried RHEL-6.5, older Fedoras)
as well as with kms guests (tried a fully up2date F-19).

Since I've not seen any similar reports, I assume it is something
with my setup ...

I've tried changing various things:
-removing the spice agent channel
-changing the number of virtual cpus (tried 1 and 2 virtual cpus)
-upgrading spice-server to the latest git master

But all to no avail.

This is with qemu-master build from source on a fully up2date
F-20 system, using the F-20 seabios files.

If someone has any clever ideas I'll happily try debugging this further.

Regards,

Hans



  1   2   3   >