[Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Peter Lieven
The block layer has silently merged write requests since
commit 40b4f539. This patch adds a knob to disable
this feature, as there has been some discussion lately
about whether multiwrite is a good idea at all, and because
merging distorts benchmark results.

Signed-off-by: Peter Lieven p...@kamp.de
---
 block.c   |4 
 block/qapi.c  |1 +
 blockdev.c|7 +++
 hmp.c |4 
 include/block/block_int.h |1 +
 qapi/block-core.json  |   10 +-
 qemu-options.hx   |1 +
 qmp-commands.hx   |2 ++
 8 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 27533f3..1658a72 100644
--- a/block.c
+++ b/block.c
@@ -4531,6 +4531,10 @@ static int multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,
 {
 int i, outidx;
 
+if (!bs->write_merging) {
+return num_reqs;
+}
+
+
 // Sort requests by start sector
 qsort(reqs, num_reqs, sizeof(*reqs), multiwrite_req_compare);
 
diff --git a/block/qapi.c b/block/qapi.c
index 9733ebd..02251dd 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -58,6 +58,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs)
 
 info->backing_file_depth = bdrv_get_backing_file_depth(bs);
 info->detect_zeroes = bs->detect_zeroes;
+info->write_merging = bs->write_merging;
 
 if (bs->io_limits_enabled) {
 ThrottleConfig cfg;
diff --git a/blockdev.c b/blockdev.c
index e595910..13e47b8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
*bs_opts,
 const char *id;
 bool has_driver_specific_opts;
 BlockdevDetectZeroesOptions detect_zeroes;
+bool write_merging;
 BlockDriver *drv = NULL;
 
 /* Check common options by copying from bs_opts to opts, all other options
@@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
 snapshot = qemu_opt_get_bool(opts, "snapshot", 0);
 ro = qemu_opt_get_bool(opts, "read-only", 0);
 copy_on_read = qemu_opt_get_bool(opts, "copy-on-read", false);
+write_merging = qemu_opt_get_bool(opts, "write-merging", true);
 
 if ((buf = qemu_opt_get(opts, "discard")) != NULL) {
 if (bdrv_parse_discard_flags(buf, &bdrv_flags) != 0) {
@@ -530,6 +532,7 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
 bs->open_flags = snapshot ? BDRV_O_SNAPSHOT : 0;
 bs->read_only = ro;
 bs->detect_zeroes = detect_zeroes;
+bs->write_merging = write_merging;
 
 bdrv_set_on_error(bs, on_read_error, on_write_error);
 
@@ -2746,6 +2749,10 @@ QemuOptsList qemu_common_drive_opts = {
 .name = "detect-zeroes",
 .type = QEMU_OPT_STRING,
 .help = "try to optimize zero writes (off, on, unmap)",
+},{
+.name = "write-merging",
+.type = QEMU_OPT_BOOL,
+.help = "enable write merging (default: true)",
 },
 { /* end of list */ }
 },
diff --git a/hmp.c b/hmp.c
index 63d7686..8d6ad0b 100644
--- a/hmp.c
+++ b/hmp.c
@@ -348,6 +348,10 @@ void hmp_info_block(Monitor *mon, const QDict *qdict)

BlockdevDetectZeroesOptions_lookup[info->value->inserted->detect_zeroes]);
 }
 
+if (!info->value->inserted->write_merging) {
+monitor_printf(mon, "Write Merging:off\n");
+}
+
 if (info->value->inserted->bps
 || info->value->inserted->bps_rd
 || info->value->inserted->bps_wr
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8d86a6c..39bbde2 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -407,6 +407,7 @@ struct BlockDriverState {
 
 QDict *options;
 BlockdevDetectZeroesOptions detect_zeroes;
+bool write_merging;
 
 /* The error object in use for blocking operations on backing_hd */
 Error *backing_blocker;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 8f7089e..4931bd9 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -214,6 +214,8 @@
 #
 # @detect_zeroes: detect and optimize zero writes (Since 2.1)
 #
+# @write_merging: true if write merging is enabled (Since 2.2)
+#
 # @bps: total throughput limit in bytes per second is specified
 #
 # @bps_rd: read throughput limit in bytes per second is specified
@@ -250,6 +252,7 @@
 '*backing_file': 'str', 'backing_file_depth': 'int',
 'encrypted': 'bool', 'encryption_key_missing': 'bool',
 'detect_zeroes': 'BlockdevDetectZeroesOptions',
+'write_merging': 'bool',
 'bps': 'int', 'bps_rd': 'int', 'bps_wr': 'int',
 'iops': 'int', 'iops_rd': 'int', 'iops_wr': 'int',
 'image': 'ImageInfo',
@@ -1180,6 +1183,10 @@
 # (default: false)
 # @detect-zeroes: #optional detect and optimize zero writes (Since 2.1)
 # (default: off)
+# @write-merging: #optional enable the merging of write 
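
For readers unfamiliar with write merging, the following is a minimal, self-contained sketch of what such a knob gates. It is not QEMU's multiwrite_merge(): the ToyReq type and toy_merge() are hypothetical names made up for illustration, and only requests that are exactly adjacent on disk are merged. With the flag off, the request array is returned untouched, mirroring the early return added in block.c above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical request type: a write of nb_sectors starting at sector. */
typedef struct ToyReq {
    int64_t sector;
    int nb_sectors;
} ToyReq;

/* qsort() comparator: order requests by start sector. */
static int toy_req_compare(const void *a, const void *b)
{
    const ToyReq *ra = a, *rb = b;

    if (ra->sector < rb->sector) {
        return -1;
    }
    if (ra->sector > rb->sector) {
        return 1;
    }
    return 0;
}

/*
 * Merge requests that are exactly adjacent on disk and return the new
 * number of requests. If write_merging is false, nothing is sorted or
 * merged and num_reqs is returned unchanged.
 */
static int toy_merge(ToyReq *reqs, int num_reqs, bool write_merging)
{
    int i, outidx;

    if (!write_merging || num_reqs < 2) {
        return num_reqs;
    }

    qsort(reqs, num_reqs, sizeof(*reqs), toy_req_compare);

    outidx = 0;
    for (i = 1; i < num_reqs; i++) {
        if (reqs[outidx].sector + reqs[outidx].nb_sectors == reqs[i].sector) {
            /* Adjacent: grow the previous request. */
            reqs[outidx].nb_sectors += reqs[i].nb_sectors;
        } else {
            /* Gap: start a new output request. */
            reqs[++outidx] = reqs[i];
        }
    }
    return outidx + 1;
}
```

This also shows why merging can distort a benchmark: three 8-sector writes issued by the guest may reach the backend as one 24-sector write, so the measured request count and size no longer match what was submitted.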

Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-20 Thread Linhaifeng


On 2014/10/20 13:32, Wen Congyang wrote:
 On 10/20/2014 12:48 PM, Linhaifeng wrote:


 On 2014/10/20 10:12, Wen Congyang wrote:
 On 10/18/2014 11:20 AM, Linhaifeng wrote:


 On 2014/10/17 21:26, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:57:27PM +0800, Linhaifeng wrote:


 On 2014/10/17 16:33, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:27:17PM +0800, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 A VM started with shared hugepages should close the hugepage file fd
 on exit, because the hugepage fd may have been sent to another process,
 e.g. a vhost-user application. If QEMU does not close the fd, the other
 process cannot free the hugepages other than by exiting, which is ugly,
 so QEMU should close all shared fds on exit.

 Signed-off-by: linhaifeng haifeng@huawei.com

 Err, all file descriptors are closed automatically when a process
 exits. So manually calling close(fd) before exit can't have any
 functional effect on a resource leak.

 If QEMU has sent the FD to another process, that process has a
 completely separate copy of the FD. Closing the FD in QEMU will
 not close the FD in the other process. You need the other process
 to exit for the copy to be closed.

 Regards,
 Daniel

 Hi,daniel

 QEMU sends the fd over a unix domain socket. The unix domain socket just
 installs the fd in the other process and increments f_count; if QEMU does
 not close the fd, f_count is not decremented. Then, even if the other
 process closes the fd, the hugepages are not freed unless the other
 process exits.

 The kernel always closes all FDs when a process exits. So if this FD is
 not being correctly closed then it is a kernel bug. There should never
 be any reason for an application to do close(fd) before exiting.

 Regards,
 Daniel

 Hi,daniel

 I don't think this is a kernel bug. Maybe this is a usage problem.
 If you open a file you should close it too.

 If you don't close it, the kernel will help you when the program exits.

 Yes, when the hugepage is used only by QEMU, the kernel will free the file
 object. If the hugepage is shared with another process, the kernel will
 not free the file when QEMU exits.
 
 Even if the hugepage is shared with the other process, the kernel will auto 
 close the fd when qemu
 exits. If the kernel doesn't do it, it is a kernel bug.
 
The kernel provides close() for this. If you call open() you must call close().
If not, the result is unpredictable.

 This is the Linux man page about how to free the resources of a file.
 http://linux.die.net/man/2/close


 I'm trying to describe my problem.

 For example, there are 2 VMs run with hugepage and the hugepage only for 
 QEMU to use.

 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the two VMs.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage.After this step the meminfo is :
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB
 5.shutdown VM with signal 15 without close(fd).After this step the meminfo 
 is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Yes,it works well,like you said the kernel recycle all resources.

 For another example,there are 2 VMs run with hugepage and share the 
 hugepage with vapp(a vhost-user application).

 The vapp is your internal application?

 Yes, vapp is an application that shares QEMU's hugepages. So there are two
 processes using the hugepages.


 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the first VM.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage and send the fd to vapp with unix domain 
 socket.After this step the meminfo is:
 
 Do you modify qemu?
 
 HugePages_Total:4096
 HugePages_Free: 2048
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the second VM.After this step the meminfo is:
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Then I want to close the first VM and run another VM.After close the first 
 VM and close the fd in vapp the meminfo is :
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Does the qemu still run after you close the first VM? If the qemu exits, 
 the fd will be closed by the kernel, so this
 bug is very strange.

 QEMU is not running once the first VM is closed. If another process is using
 the file, will it be closed by the kernel too?
 
 If qemu doesn't run after the first vm is closed, the fd should be closed 
 even if another process uses the file.
 


 So failed to run the third VM because the first VM have not 
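
The disagreement above is about how long a file stays pinned. A minimal sketch, assuming a POSIX system: an unlinked file stays alive as long as any descriptor for it exists, and closing one descriptor does not affect a duplicate. Here dup() stands in for passing the fd over a unix domain socket (SCM_RIGHTS), which likewise gives the receiver its own descriptor for the same open file. The function name and the /tmp scratch path are made up for illustration, and the sketch says nothing about hugetlbfs accounting specifically.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Returns 1 if data written through the original descriptor can still be
 * read through a duplicate after the file was unlinked and the original
 * descriptor was closed, 0 if not, and -1 on setup errors.
 */
static int dup_outlives_original(void)
{
    const char msg[] = "still here";
    char buf[sizeof(msg)] = { 0 };
    char path[64];
    int fd, copy, ok;

    /* Hypothetical scratch file name; any writable location would do. */
    snprintf(path, sizeof(path), "/tmp/fd-demo-%ld", (long)getpid());
    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0) {
        return -1;
    }
    unlink(path);                 /* the name is gone, the file is not */

    copy = dup(fd);               /* second descriptor for the same file */
    if (copy < 0) {
        close(fd);
        return -1;
    }

    if (write(fd, msg, sizeof(msg)) != (long)sizeof(msg)) {
        close(fd);
        close(copy);
        return -1;
    }
    close(fd);                    /* drops one reference only */

    /* The duplicate still reaches the data. */
    ok = lseek(copy, 0, SEEK_SET) == 0 &&
         read(copy, buf, sizeof(buf)) == (long)sizeof(msg) &&
         memcmp(buf, msg, sizeof(msg)) == 0;

    close(copy);                  /* last descriptor closed */
    return ok;
}
```

In this model the storage is released only when the last reference goes away, whether through an explicit close() or through process exit. A live mmap() of the file is a further reference of the same kind, which may matter for the vapp case described above.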

Re: [Qemu-devel] [PATCH 1/2] qcow2: Do not overflow when writing an L1 sector

2014-10-20 Thread Peter Lieven

On 16.10.2014 15:25, Max Reitz wrote:

While writing an L1 table sector, qcow2_write_l1_entry() copies the
respective range from s->l1_table to the local buf array. The size of
s->l1_table does not have to be a multiple of L1_ENTRIES_PER_SECTOR;
thus, limit the index which is used for copying all entries to the L1
size.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Max Reitz mre...@redhat.com
---
  block/qcow2-cluster.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index f7dd8c0..4d888c7 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -164,12 +164,14 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
  int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
  {
  BDRVQcowState *s = bs->opaque;
-uint64_t buf[L1_ENTRIES_PER_SECTOR];
+uint64_t buf[L1_ENTRIES_PER_SECTOR] = { 0 };
  int l1_start_index;
  int i, ret;
  
  l1_start_index = l1_index & ~(L1_ENTRIES_PER_SECTOR - 1);

-for (i = 0; i < L1_ENTRIES_PER_SECTOR; i++) {
+for (i = 0; i < L1_ENTRIES_PER_SECTOR && l1_start_index + i < s->l1_size;
+ i++)
+{
  buf[i] = cpu_to_be64(s->l1_table[l1_start_index + i]);
  }
  }
  


Reviewed-by: Peter Lieven p...@kamp.de
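
The fix can be tried out in isolation with a small standalone sketch. The names below (toy_fill_l1_sector, toy_cpu_to_be64, TOY_ENTRIES_PER_SECTOR) are hypothetical stand-ins rather than QEMU code, and the value 64 assumes 512-byte sectors holding 8-byte entries. The sketch shows the two parts of the patch: the buffer starts out zeroed, and the copy loop stops at the end of the table.

```c
#include <stdint.h>
#include <string.h>

#define TOY_ENTRIES_PER_SECTOR 64   /* assumed: 512-byte sector / 8 bytes */

/* Hypothetical stand-in for cpu_to_be64(): lay v out big-endian in memory. */
static uint64_t toy_cpu_to_be64(uint64_t v)
{
    uint8_t b[8];
    uint64_t out;
    int i;

    for (i = 0; i < 8; i++) {
        b[i] = (uint8_t)(v >> (56 - 8 * i));
    }
    memcpy(&out, b, sizeof(out));
    return out;
}

/*
 * Fill buf with the sector that contains l1_index. Entries past l1_size
 * stay zero instead of being read from beyond the end of the table.
 * Returns the number of entries actually copied.
 */
static int toy_fill_l1_sector(const uint64_t *l1_table, int l1_size,
                              int l1_index,
                              uint64_t buf[TOY_ENTRIES_PER_SECTOR])
{
    int l1_start_index = l1_index & ~(TOY_ENTRIES_PER_SECTOR - 1);
    int i;

    memset(buf, 0, TOY_ENTRIES_PER_SECTOR * sizeof(buf[0]));
    for (i = 0;
         i < TOY_ENTRIES_PER_SECTOR && l1_start_index + i < l1_size;
         i++) {
        buf[i] = toy_cpu_to_be64(l1_table[l1_start_index + i]);
    }
    return i;
}
```

With a 70-entry table and l1_index 65, the sector starts at index 64, six entries are copied, and the remaining 58 slots are written as zeros.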




Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-20 Thread Wen Congyang
On 10/20/2014 02:17 PM, Linhaifeng wrote:
 
 
 On 2014/10/20 13:32, Wen Congyang wrote:
 On 10/20/2014 12:48 PM, Linhaifeng wrote:


 On 2014/10/20 10:12, Wen Congyang wrote:
 On 10/18/2014 11:20 AM, Linhaifeng wrote:


 On 2014/10/17 21:26, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:57:27PM +0800, Linhaifeng wrote:


 On 2014/10/17 16:33, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:27:17PM +0800, haifeng@huawei.com wrote:
 From: linhaifeng haifeng@huawei.com

 A VM started with shared hugepages should close the hugepage file fd
 on exit, because the hugepage fd may have been sent to another process,
 e.g. a vhost-user application. If QEMU does not close the fd, the other
 process cannot free the hugepages other than by exiting, which is ugly,
 so QEMU should close all shared fds on exit.

 Signed-off-by: linhaifeng haifeng@huawei.com

 Err, all file descriptors are closed automatically when a process
 exits. So manually calling close(fd) before exit can't have any
 functional effect on a resource leak.

 If QEMU has sent the FD to another process, that process has a
 completely separate copy of the FD. Closing the FD in QEMU will
 not close the FD in the other process. You need the other process
 to exit for the copy to be closed.

 Regards,
 Daniel

 Hi,daniel

 QEMU sends the fd over a unix domain socket. The unix domain socket just
 installs the fd in the other process and increments f_count; if QEMU does
 not close the fd, f_count is not decremented. Then, even if the other
 process closes the fd, the hugepages are not freed unless the other
 process exits.

 The kernel always closes all FDs when a process exits. So if this FD is
 not being correctly closed then it is a kernel bug. There should never
 be any reason for an application to do close(fd) before exiting.

 Regards,
 Daniel

 Hi,daniel

 I don't think this is a kernel bug. Maybe this is a usage problem.
 If you open a file you should close it too.

 If you don't close it, the kernel will help you when the program exits.

 Yes, when the hugepage is used only by QEMU, the kernel will free the file
 object. If the hugepage is shared with another process, the kernel will
 not free the file when QEMU exits.

 Even if the hugepage is shared with the other process, the kernel will auto 
 close the fd when qemu
 exits. If the kernel doesn't do it, it is a kernel bug.

 The kernel provides close() for this. If you call open() you must call close().
 If not, the result is unpredictable.

No, if the program exits, the kernel must close all fds used by the program.
So, there is no need to close an fd before the program exits.

Thanks
Wen Congyang


 This is the Linux man page about how to free the resources of a file.
 http://linux.die.net/man/2/close


 I'm trying to describe my problem.

 For example, there are 2 VMs run with hugepage and the hugepage only for 
 QEMU to use.

 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the two VMs.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage.After this step the meminfo is :
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB
 5.shutdown VM with signal 15 without close(fd).After this step the 
 meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Yes,it works well,like you said the kernel recycle all resources.

 For another example,there are 2 VMs run with hugepage and share the 
 hugepage with vapp(a vhost-user application).

 The vapp is your internal application?

 Yes, vapp is an application that shares QEMU's hugepages. So there are two
 processes using the hugepages.


 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the first VM.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage and send the fd to vapp with unix domain 
 socket.After this step the meminfo is:

 Do you modify qemu?

 HugePages_Total:4096
 HugePages_Free: 2048
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the second VM.After this step the meminfo is:
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Then I want to close the first VM and run another VM.After close the 
 first VM and close the fd in vapp the meminfo is :
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Does the qemu still run after you close the first VM? If the qemu exits, 
 the fd will be closed by the kernel, so this
 bug is very strange.

 qemu is not run when close the first VM.If other process used the file will 
 be closed 

Re: [Qemu-devel] [PATCH 2/2] iotests: Add test for qcow2 L1 table update

2014-10-20 Thread Peter Lieven

On 16.10.2014 15:25, Max Reitz wrote:

Updating the L1 table should not result in random data being written.
This adds a test for that.

Signed-off-by: Max Reitz mre...@redhat.com
---
  tests/qemu-iotests/107 | 61 ++
  tests/qemu-iotests/107.out | 10 
  tests/qemu-iotests/group   |  1 +
  3 files changed, 72 insertions(+)
  create mode 100755 tests/qemu-iotests/107
  create mode 100644 tests/qemu-iotests/107.out

diff --git a/tests/qemu-iotests/107 b/tests/qemu-iotests/107
new file mode 100755
index 000..cad1cf9
--- /dev/null
+++ b/tests/qemu-iotests/107
@@ -0,0 +1,61 @@
+#!/bin/bash
+#
+# Tests updates of the qcow2 L1 table
+#
+# Copyright (C) 2014 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/.
+#
+
+# creator
+owner=mre...@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file


This test (and maybe other recently added ones) also works on NFS.
As qcow2 on NFS might be a reasonable combination, I would add it.

Peter




[Qemu-devel] [PATCH v3] virtio-pci: fix migration for pci bus master

2014-10-20 Thread Michael S. Tsirkin
Current support for bus master (clearing OK bit) together with the need to
support guests which do not enable PCI bus mastering, leads to extra state in
VIRTIO_PCI_FLAG_BUS_MASTER_BUG bit, which isn't robust in case of cross-version
migration for the case when guests use the device before setting DRIVER_OK.

Rip out this code, and replace it:
-   Modern QEMU doesn't need VIRTIO_PCI_FLAG_BUS_MASTER_BUG
so just drop it for latest machine type.
-   For compat machine types, set PCI_COMMAND if DRIVER_OK
is set.

As this is needed for 2.1 for both pc and ppc, move PC_COMPAT macros from pc.h
to a new common header.

Reviewed-by: Greg Kurz gk...@linux.vnet.ibm.com
Tested-by: Greg Kurz gk...@linux.vnet.ibm.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---

Alexander, could you pls ack me merging this?
Thanks!


changes from v2:
drop default = -1 from ppc - was a typo, reported by Greg

 hw/virtio/virtio-pci.h |  5 +
 include/hw/compat.h| 16 
 include/hw/i386/pc.h   | 10 ++
 hw/i386/pc_piix.c  |  2 +-
 hw/i386/pc_q35.c   |  2 +-
 hw/ppc/spapr.c |  7 +++
 hw/virtio/virtio-pci.c | 29 +++--
 7 files changed, 43 insertions(+), 28 deletions(-)
 create mode 100644 include/hw/compat.h

diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 1cea157..8873b6d 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -53,6 +53,11 @@ typedef struct VirtioBusClass VirtioPCIBusClass;
 #define VIRTIO_PCI_BUS_CLASS(klass) \
 OBJECT_CLASS_CHECK(VirtioPCIBusClass, klass, TYPE_VIRTIO_PCI_BUS)
 
+/* Need to activate work-arounds for buggy guests at vmstate load. */
+#define VIRTIO_PCI_FLAG_BUS_MASTER_BUG_MIGRATION_BIT  0
+#define VIRTIO_PCI_FLAG_BUS_MASTER_BUG_MIGRATION \
+(1 << VIRTIO_PCI_FLAG_BUS_MASTER_BUG_MIGRATION_BIT)
+
 /* Performance improves when virtqueue kick processing is decoupled from the
  * vcpu thread using ioeventfd for some devices. */
 #define VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT 1
diff --git a/include/hw/compat.h b/include/hw/compat.h
new file mode 100644
index 000..47f6ff5
--- /dev/null
+++ b/include/hw/compat.h
@@ -0,0 +1,16 @@
+#ifndef HW_COMPAT_H
+#define HW_COMPAT_H
+
+#define HW_COMPAT_2_1 \
+{\
+.driver   = "intel-hda",\
+.property = "old_msi_addr",\
+.value= "on",\
+},\
+{\
+.driver   = "virtio-pci",\
+.property = "virtio-pci-bus-master-bug-migration",\
+.value= "on",\
+}
+
+#endif /* HW_COMPAT_H */
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index bae023a..82ad046 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -14,6 +14,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/pci/pci.h"
 #include "hw/boards.h"
+#include "hw/compat.h"
 
 #define HPET_INTCAP "hpet-intcap"
 
@@ -307,15 +308,8 @@ int e820_add_entry(uint64_t, uint64_t, uint32_t);
 int e820_get_num_entries(void);
 bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 
-#define PC_COMPAT_2_1 \
-{\
-.driver   = "intel-hda",\
-.property = "old_msi_addr",\
-.value= "on",\
-}
-
 #define PC_COMPAT_2_0 \
-PC_COMPAT_2_1, \
+HW_COMPAT_2_1, \
 {\
 .driver   = "virtio-scsi-pci",\
 .property = "any_layout",\
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 553afdd..a1634ab 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -502,7 +502,7 @@ static QEMUMachine pc_i440fx_machine_v2_1 = {
 .name = "pc-i440fx-2.1",
 .init = pc_init_pci,
 .compat_props = (GlobalProperty[]) {
-PC_COMPAT_2_1,
+HW_COMPAT_2_1,
 { /* end of list */ }
 },
 };
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index a199043..f330f7a 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -365,7 +365,7 @@ static QEMUMachine pc_q35_machine_v2_1 = {
 .name = "pc-q35-2.1",
 .init = pc_q35_init,
 .compat_props = (GlobalProperty[]) {
-PC_COMPAT_2_1,
+HW_COMPAT_2_1,
 { /* end of list */ }
 },
 };
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 2becc9f..623f626 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -57,6 +57,8 @@
 #include "trace.h"
 #include "hw/nmi.h"
 
+#include "hw/compat.h"
+
 #include <libfdt.h>
 
 /* SLOF memory layout:
@@ -1689,10 +1691,15 @@ static const TypeInfo spapr_machine_info = {
 static void spapr_machine_2_1_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
+static GlobalProperty compat_props[] = {
+HW_COMPAT_2_1,
+{ /* end of list */ }
+};
 
 mc->name = "pseries-2.1";
 mc->desc = "pSeries Logical Partition (PAPR compliant) v2.1";
 mc->is_default = 0;
+mc->compat_props = compat_props;
 }
 
 static const TypeInfo spapr_machine_2_1_info = {
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index a827cd4..a499a3c 100644
--- a/hw/virtio/virtio-pci.c
+++ 
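
The way the shared compat macro composes across machine types can be illustrated with a standalone sketch. ToyGlobalProperty, the TOY_ macros, toy_compat_lookup() and the "toy-device" entry are all hypothetical and only mimic the shape of the code in this patch; the two 2.1 entries are copied from HW_COMPAT_2_1 above.

```c
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-in for QEMU's GlobalProperty. */
typedef struct ToyGlobalProperty {
    const char *driver;
    const char *property;
    const char *value;
} ToyGlobalProperty;

/* Entries shared by every architecture for the 2.1 machine types. */
#define TOY_COMPAT_2_1 \
    {\
        .driver   = "intel-hda",\
        .property = "old_msi_addr",\
        .value    = "on",\
    },\
    {\
        .driver   = "virtio-pci",\
        .property = "virtio-pci-bus-master-bug-migration",\
        .value    = "on",\
    }

/* An older machine type chains the newer macro and adds its own entry. */
#define TOY_COMPAT_2_0 \
    TOY_COMPAT_2_1, \
    {\
        .driver   = "toy-device",   /* made-up entry for illustration */\
        .property = "toy-prop",\
        .value    = "off",\
    }

static const ToyGlobalProperty toy_props_2_1[] = {
    TOY_COMPAT_2_1,
    { NULL, NULL, NULL } /* end of list */
};

static const ToyGlobalProperty toy_props_2_0[] = {
    TOY_COMPAT_2_0,
    { NULL, NULL, NULL } /* end of list */
};

/* Look up the compat value for driver.property, or NULL if it is not set. */
static const char *toy_compat_lookup(const ToyGlobalProperty *props,
                                     const char *driver,
                                     const char *property)
{
    for (; props->driver != NULL; props++) {
        if (strcmp(props->driver, driver) == 0 &&
            strcmp(props->property, property) == 0) {
            return props->value;
        }
    }
    return NULL;
}
```

Because the macro expands to bare initializers, any machine type on any architecture can include it in its own array, which is the reason given above for moving it out of pc.h into a common header.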

Re: [Qemu-devel] [PATCH 1/2] i386: Add a Virtual Machine Generation ID device

2014-10-20 Thread Michael S. Tsirkin
On Mon, Oct 20, 2014 at 08:57:05AM +0300, Gal Hammer wrote:
 On 19/10/2014 18:14, Michael S. Tsirkin wrote:
 On Sun, Oct 19, 2014 at 04:43:07PM +0300, Gal Hammer wrote:
 Based on Microsoft's specifications (paper can be downloaded from
 http://go.microsoft.com/fwlink/?LinkId=260709), add a device
 description to the SSDT ACPI table and its implementation.
 
 The GUID is set using a global vmgenid.uuid parameter.
 
 Signed-off-by: Gal Hammer gham...@redhat.com
 
 ---
   default-configs/i386-softmmu.mak |   1 +
   default-configs/x86_64-softmmu.mak   |   1 +
   hw/acpi/core.c   |   8 +++
   hw/acpi/ich9.c   |   8 +++
   hw/acpi/piix4.c  |   8 +++
   hw/i386/acpi-build.c |   8 +++
   hw/i386/acpi-dsdt.dsl|   4 +-
   hw/i386/acpi-dsdt.hex.generated  |   6 +-
   hw/i386/pc.c |   8 +++
   hw/i386/q35-acpi-dsdt.dsl|   5 +-
   hw/i386/q35-acpi-dsdt.hex.generated  |   8 +--
   hw/i386/ssdt-misc.dsl|  36 +++
   hw/i386/ssdt-misc.hex.generated  |   8 +--
   hw/isa/lpc_ich9.c|   1 +
   hw/misc/Makefile.objs|   1 +
   hw/misc/vmgenid.c| 116 
  +++
   include/hw/acpi/acpi.h   |   2 +
   include/hw/acpi/acpi_dev_interface.h |   4 ++
   include/hw/acpi/ich9.h   |   2 +
   include/hw/i386/pc.h |   3 +
   include/hw/misc/vmgenid.h|  21 +++
   21 files changed, 246 insertions(+), 13 deletions(-)
   create mode 100644 hw/misc/vmgenid.c
   create mode 100644 include/hw/misc/vmgenid.h
 
 Please document the host/guest API.
 It seems that you are using a hard-coded hardware address,
 and using up a GPE.
 
 I'll add a document file which describes the device's implementation.
 
 
 
 
 diff --git a/default-configs/i386-softmmu.mak 
 b/default-configs/i386-softmmu.mak
 index 8e08841..bd33c75 100644
 --- a/default-configs/i386-softmmu.mak
 +++ b/default-configs/i386-softmmu.mak
 @@ -45,3 +45,4 @@ CONFIG_IOAPIC=y
   CONFIG_ICC_BUS=y
   CONFIG_PVPANIC=y
   CONFIG_MEM_HOTPLUG=y
 +CONFIG_VMGENID=y
 diff --git a/default-configs/x86_64-softmmu.mak 
 b/default-configs/x86_64-softmmu.mak
 index 66557ac..006fc7c 100644
 --- a/default-configs/x86_64-softmmu.mak
 +++ b/default-configs/x86_64-softmmu.mak
 @@ -45,3 +45,4 @@ CONFIG_IOAPIC=y
   CONFIG_ICC_BUS=y
   CONFIG_PVPANIC=y
   CONFIG_MEM_HOTPLUG=y
 +CONFIG_VMGENID=y
 diff --git a/hw/acpi/core.c b/hw/acpi/core.c
 index a7368fb..a01c980 100644
 --- a/hw/acpi/core.c
 +++ b/hw/acpi/core.c
 @@ -28,6 +28,8 @@
   #include qapi-visit.h
   #include qapi-event.h
 
 +#define ACPI_VM_GENERATION_ID_CHANGED_STATUS 1
 +
   struct acpi_table_header {
   uint16_t _length; /* our length, not actual part of the hdr */
 /* allows easier parsing for fw_cfg clients 
  */
 @@ -680,3 +682,9 @@ void acpi_update_sci(ACPIREGS *regs, qemu_irq irq)
  (regs->pm1.evt.en & ACPI_BITMASK_TIMER_ENABLE) &&
  !(pm1a_sts & ACPI_BITMASK_TIMER_STATUS));
   }
 +
 +void acpi_vm_generation_id_changed(ACPIREGS *acpi_regs, qemu_irq irq)
 +{
 +acpi_regs->gpe.sts[0] |= ACPI_VM_GENERATION_ID_CHANGED_STATUS;
 +acpi_update_sci(acpi_regs, irq);
 +}
 diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
 index 7b14bbb..5501c0e 100644
 --- a/hw/acpi/ich9.c
 +++ b/hw/acpi/ich9.c
 @@ -316,3 +316,11 @@ void ich9_pm_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list)
 
   acpi_memory_ospm_status(&s->pm.acpi_memory_hotplug, list);
   }
 +
 +void ich9_vm_generation_id_changed(AcpiDeviceIf *adev)
 +{
 +ICH9LPCState *s = ICH9_LPC_DEVICE(adev);
 +ICH9LPCPMRegs *pm = &s->pm;
 +
 +acpi_vm_generation_id_changed(&pm->acpi_regs, pm->irq);
 +}
 diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
 index 0bfa814..ad0ef68 100644
 --- a/hw/acpi/piix4.c
 +++ b/hw/acpi/piix4.c
 @@ -580,6 +580,13 @@ static void piix4_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list)
   acpi_memory_ospm_status(&s->acpi_memory_hotplug, list);
   }
 
 +static void piix4_vm_generation_id_changed(AcpiDeviceIf *adev)
 +{
 +PIIX4PMState *s = PIIX4_PM(adev);
 +
 +acpi_vm_generation_id_changed(&s->ar, s->irq);
 +}
 +
   static Property piix4_pm_properties[] = {
   DEFINE_PROP_UINT32(smb_io_base, PIIX4PMState, smb_io_base, 0),
   DEFINE_PROP_UINT8(ACPI_PM_PROP_S3_DISABLED, PIIX4PMState, disable_s3, 
  0),
 @@ -617,6 +624,7 @@ static void piix4_pm_class_init(ObjectClass *klass, void *data)
   hc->plug = piix4_device_plug_cb;
   hc->unplug_request = piix4_device_unplug_request_cb;
   adevc->ospm_status = piix4_ospm_status;
 +adevc->vm_generation_id_changed = piix4_vm_generation_id_changed;
   }
 
   static const TypeInfo piix4_pm_info = {
 diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
 index 00be4bb..27d0494 100644
 --- a/hw/i386/acpi-build.c
 +++ 

Re: [Qemu-devel] [PATCH 6/6] vnc: track limit connections

2014-10-20 Thread Gerd Hoffmann

  Hi,

 If we set the max trying times, and then
 There are some concepts:
  - INTERVAL_TIME: a time window in which users can connect to the VNC server
  - REJECT_TIME: how long all connections are rejected
  - MAX_TRY_TIMES: the number of times users can connect to the VNC server within
    INTERVAL_TIME. If MAX_TRY_TIMES is reached, the server locks and no user can
    connect again until REJECT_TIME has elapsed. Already connected clients are
    not affected.

i.e. effectively rate-limit login attempts.  Makes sense to have an
option for that, although I'm not sure it is worth the trouble doing
something beyond a simple one attempt per second allowed (i.e. stop
polling the listening socket for a second after each accept).

cheers,
  Gerd
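
The simple variant suggested above (at most one accept per second) can be sketched as a tiny state machine. This is only an illustration under assumed names: ToyAcceptLimiter and its functions are hypothetical, not part of the VNC code, and the caller is assumed to supply a monotonic clock in milliseconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state; all times are in milliseconds. */
typedef struct ToyAcceptLimiter {
    int64_t min_interval_ms;   /* e.g. 1000 for one attempt per second */
    int64_t next_allowed_ms;   /* earliest time the next accept may happen */
} ToyAcceptLimiter;

static void toy_limiter_init(ToyAcceptLimiter *l, int64_t min_interval_ms)
{
    l->min_interval_ms = min_interval_ms;
    l->next_allowed_ms = 0;
}

/*
 * Returns true if a connection may be accepted at now_ms. On success it
 * starts the quiet period during which the listening socket would not be
 * polled.
 */
static bool toy_limiter_try_accept(ToyAcceptLimiter *l, int64_t now_ms)
{
    if (now_ms < l->next_allowed_ms) {
        return false;
    }
    l->next_allowed_ms = now_ms + l->min_interval_ms;
    return true;
}
```

Compared with the INTERVAL_TIME / MAX_TRY_TIMES / REJECT_TIME scheme quoted above, this needs a single timestamp of state and no counters, at the cost of also delaying legitimate clients that connect in quick succession.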






Re: [Qemu-devel] [PATCH v5 2/7] stm32f205_USART: Add the stm32f205 USART Controller

2014-10-20 Thread Peter Crosthwaite
On Thu, Oct 16, 2014 at 10:53 PM, Alistair Francis alistai...@gmail.com wrote:
 This patch adds the stm32f205 USART controller
 (UART also uses the same controller).

 Signed-off-by: Alistair Francis alistai...@gmail.com
 ---
  default-configs/arm-softmmu.mak   |   1 +
  hw/char/Makefile.objs |   1 +
  hw/char/stm32f205_usart.c | 218 
 ++
  include/hw/char/stm32f205_usart.h |  69 
  4 files changed, 289 insertions(+)
  create mode 100644 hw/char/stm32f205_usart.c
  create mode 100644 include/hw/char/stm32f205_usart.h

 diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
 index cf23b24..422dec0 100644
 --- a/default-configs/arm-softmmu.mak
 +++ b/default-configs/arm-softmmu.mak
 @@ -79,6 +79,7 @@ CONFIG_REALVIEW=y
  CONFIG_ZAURUS=y
  CONFIG_ZYNQ=y
  CONFIG_STM32F205_TIMER=y
 +CONFIG_STM32F205_USART=y

  CONFIG_VERSATILE_PCI=y
  CONFIG_VERSATILE_I2C=y
 diff --git a/hw/char/Makefile.objs b/hw/char/Makefile.objs
 index 317385d..c7b3ce4 100644
 --- a/hw/char/Makefile.objs
 +++ b/hw/char/Makefile.objs
 @@ -15,6 +15,7 @@ obj-$(CONFIG_OMAP) += omap_uart.o
  obj-$(CONFIG_SH4) += sh_serial.o
  obj-$(CONFIG_PSERIES) += spapr_vty.o
  obj-$(CONFIG_DIGIC) += digic-uart.o
 +obj-$(CONFIG_STM32F205_USART) += stm32f205_usart.o

  common-obj-$(CONFIG_ETRAXFS) += etraxfs_ser.o
  common-obj-$(CONFIG_ISA_DEBUG) += debugcon.o
 diff --git a/hw/char/stm32f205_usart.c b/hw/char/stm32f205_usart.c
 new file mode 100644
 index 000..9d399b8
 --- /dev/null
 +++ b/hw/char/stm32f205_usart.c
 @@ -0,0 +1,218 @@
 +/*
 + * STM32F205 USART
 + *
 + * Copyright (c) 2014 Alistair Francis alist...@alistair23.me
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a 
 copy
 + * of this software and associated documentation files (the Software), to 
 deal
 + * in the Software without restriction, including without limitation the 
 rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
 FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +
 +#include "hw/char/stm32f205_usart.h"
 +
 +#ifndef STM_USART_ERR_DEBUG
 +#define STM_USART_ERR_DEBUG 0
 +#endif
 +
 +#define DB_PRINT_L(lvl, fmt, args...) do { \
 +if (STM_USART_ERR_DEBUG >= lvl) { \
 +qemu_log("%s: " fmt, __func__, ## args); \
 +} \
 +} while (0);
 +
 +#define DB_PRINT(fmt, args...) DB_PRINT_L(1, fmt, ## args)
 +
 +static int stm32f205_usart_can_receive(void *opaque)
 +{
 +STM32f205UsartState *s = opaque;
 +
 +if (!(s->usart_sr & USART_SR_RXNE)) {
 +return 1;
 +}
 +
 +return 0;
 +}
 +
 +static void stm32f205_usart_receive(void *opaque, const uint8_t *buf, int size)
 +{
 +STM32f205UsartState *s = opaque;
 +
 +s->usart_dr = *buf;
 +
 +if (!(s->usart_cr1 & USART_CR1_UE && s->usart_cr1 & USART_CR1_RE)) {
 +/* USART not enabled - drop the chars */
 +DB_PRINT("Dropping the chars\n");
 +return;
 +}
 +
 +s->usart_sr |= USART_SR_RXNE;
 +
 +if (s->usart_cr1 & USART_CR1_RXNEIE) {
 +qemu_set_irq(s->irq, 1);
 +}
 +
 +DB_PRINT("Receiving: %c\n", s->usart_dr);
 +}
 +
 +static void stm32f205_usart_reset(DeviceState *dev)
 +{
 +STM32f205UsartState *s = STM32F205_USART(dev);
 +
 +s->usart_sr = USART_SR_RESET;
 +s->usart_dr = 0x;
 +s->usart_brr = 0x;
 +s->usart_cr1 = 0x;
 +s->usart_cr2 = 0x;
 +s->usart_cr3 = 0x;
 +s->usart_gtpr = 0x;
 +}
 +
 +static uint64_t stm32f205_usart_read(void *opaque, hwaddr addr,
 +   unsigned int size)
 +{
 +STM32f205UsartState *s = opaque;
 +uint64_t retvalue;
 +
 +DB_PRINT("Read 0x%"HWADDR_PRIx"\n", addr);
 +
 +switch (addr) {
 +case USART_SR:
 +retvalue = s->usart_sr;
 +s->usart_sr &= ~USART_SR_TC;
 +if (s->chr) {
 +qemu_chr_accept_input(s->chr);
 +}
 +return retvalue;
 +case USART_DR:
 +DB_PRINT("Value: 0x%" PRIx32 ", %c\n", s->usart_dr, (char) s->usart_dr);
 +s->usart_sr |= USART_SR_TXE;
 +s->usart_sr &= ~USART_SR_RXNE;

Do you need to qemu_chr_accept_input here?

 +return s->usart_dr & 0x3FF;
 +case USART_BRR:
 +return 

Re: [Qemu-devel] [PATCH v5 1/7] stm32f205_timer: Add the stm32f205 Timer

2014-10-20 Thread Peter Crosthwaite
Sorry about the review delay...

On Thu, Oct 16, 2014 at 10:53 PM, Alistair Francis alistai...@gmail.com wrote:
 This patch adds the stm32f205 timers: TIM2, TIM3, TIM4 and TIM5
 to QEMU.

 Signed-off-by: Alistair Francis alistai...@gmail.com
 ---
 V4:
  - Update timer units again
 - Thanks to Peter C
 V3:
  - Update debug statements
  - Correct the units for timer_mod
  - Correctly set timer_offset from resets
 V2:
  - Reorder the Makefile config
  - Fix up the debug printing
  - Correct the timer event trigger
 Changes from RFC:
  - Small changes to functionality and style. Thanks to Peter C
  - Rename to make the timer more generic
  - Split the config settings to device level

  default-configs/arm-softmmu.mak|   1 +
  hw/timer/Makefile.objs |   2 +
  hw/timer/stm32f205_timer.c | 318 
 +
  include/hw/timer/stm32f205_timer.h | 101 
  4 files changed, 422 insertions(+)
  create mode 100644 hw/timer/stm32f205_timer.c
  create mode 100644 include/hw/timer/stm32f205_timer.h

 diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
 index f3513fa..cf23b24 100644
 --- a/default-configs/arm-softmmu.mak
 +++ b/default-configs/arm-softmmu.mak
 @@ -78,6 +78,7 @@ CONFIG_NSERIES=y
  CONFIG_REALVIEW=y
  CONFIG_ZAURUS=y
  CONFIG_ZYNQ=y
 +CONFIG_STM32F205_TIMER=y

  CONFIG_VERSATILE_PCI=y
  CONFIG_VERSATILE_I2C=y
 diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs
 index 2c86c3d..4bd9617 100644
 --- a/hw/timer/Makefile.objs
 +++ b/hw/timer/Makefile.objs
 @@ -31,3 +31,5 @@ obj-$(CONFIG_DIGIC) += digic-timer.o
  obj-$(CONFIG_MC146818RTC) += mc146818rtc.o

  obj-$(CONFIG_ALLWINNER_A10_PIT) += allwinner-a10-pit.o
 +
 +common-obj-$(CONFIG_STM32F205_TIMER) += stm32f205_timer.o
 diff --git a/hw/timer/stm32f205_timer.c b/hw/timer/stm32f205_timer.c
 new file mode 100644
 index 000..aace8df
 --- /dev/null
 +++ b/hw/timer/stm32f205_timer.c
 @@ -0,0 +1,318 @@
 +/*
 + * STM32F205 Timer

ST's reference manual RM0033, which documents this timer, covers a larger
family of SoCs. I think you can change this from 205 to 2XX, probably
globally for the series.

 + *
 + * Copyright (c) 2014 Alistair Francis alist...@alistair23.me
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a 
 copy
 + * of this software and associated documentation files (the Software), to 
 deal
 + * in the Software without restriction, including without limitation the 
 rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
 FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +
 +#include hw/timer/stm32f205_timer.h
 +
 +#ifndef STM_TIMER_ERR_DEBUG
 +#define STM_TIMER_ERR_DEBUG 0
 +#endif
 +
 +#define DB_PRINT_L(lvl, fmt, args...) do { \
 +if (STM_TIMER_ERR_DEBUG = lvl) { \
 +qemu_log(%s:  fmt, __func__, ## args); \
 +} \
 +} while (0);
 +
 +#define DB_PRINT(fmt, args...) DB_PRINT_L(1, fmt, ## args)
 +
 +static void stm32f205_timer_set_alarm(STM32f205TimerState *s);
 +
 +static void stm32f205_timer_interrupt(void *opaque)
 +{
 +STM32f205TimerState *s = opaque;
 +
 +DB_PRINT("Interrupt\n");
 +
 +if (s->tim_dier & TIM_DIER_UIE && s->tim_cr1 & TIM_CR1_CEN) {
 +s->tim_sr |= 1;
 +qemu_irq_pulse(s->irq);
 +stm32f205_timer_set_alarm(s);
 +}
 +}
 +
 +static void stm32f205_timer_set_alarm(STM32f205TimerState *s)
 +{
 +uint32_t ticks;
 +int64_t now;
 +
 +DB_PRINT("Alarm set at: 0x%x\n", s->tim_cr1);
 +
 +now = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);

So "now" is in terms of ms.

 +ticks = s->tim_arr - ((s->tick_offset + (now * (s->freq_hz / 1000))) /

tick_offset is in terms of clock cycles before the prescaler. ticks and
tim_arr must be in terms of clock cycles after the prescaler.

I'm slightly hazy on the definition of tick_offset, but I'm guessing it's
the time offset of when the timer started, expressed in
before-prescaler cycles? I would then expect this to be:

ticks = tim_arr - (now * (scale) - tick_offset).

with (now * scale - tick_offset) / tim_psc corresponding to the
current value of the running timer (tim_cnt?).

 +(s->tim_psc + 1));

So in total this expression is calculating a number of clock cycles
until a hit as ticks.
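
The unit bookkeeping discussed above can be sketched as a standalone helper. This is only an illustration, not code from the patch: the function names are hypothetical, tick_offset is assumed to be the timer start time in before-prescaler cycles (the guess made above), and the division uses (psc + 1) as the patch does.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch: current counter value in
 * after-prescaler cycles. now_ms is virtual time in ms, freq_hz the input
 * clock, tick_offset the timer start in before-prescaler cycles (assumed). */
static uint32_t timer_count(int64_t now_ms, uint64_t freq_hz,
                            uint64_t tick_offset, uint32_t psc)
{
    /* ms -> before-prescaler cycles */
    uint64_t pre_cycles = (uint64_t)now_ms * (freq_hz / 1000);

    /* before-prescaler cycles since start -> after-prescaler cycles */
    return (uint32_t)((pre_cycles - tick_offset) / ((uint64_t)psc + 1));
}

/* After-prescaler cycles left until the counter reaches the reload value. */
static uint32_t timer_ticks_to_hit(int64_t now_ms, uint64_t freq_hz,
                                   uint64_t tick_offset, uint32_t psc,
                                   uint32_t arr)
{
    return arr - timer_count(now_ms, freq_hz, tick_offset, psc);
}
```

Keeping the conversion to after-prescaler cycles in one place means the subtraction from the reload value always compares like with like.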

 +
 +

Re: [Qemu-devel] [PATCH v5 4/7] target_arm: Remove memory region init from armv7m_init

2014-10-20 Thread Peter Crosthwaite
On Thu, Oct 16, 2014 at 10:54 PM, Alistair Francis alistai...@gmail.com wrote:
 This patch moves the memory region init code from the
 armv7m_init function to the stellaris_init function

 Signed-off-by: Alistair Francis alistai...@gmail.com

Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com

 ---
  hw/arm/armv7m.c  | 33 +++--
  hw/arm/stellaris.c   | 24 
  include/hw/arm/arm.h |  3 +--
  3 files changed, 24 insertions(+), 36 deletions(-)

 diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
 index ef24ca4..50281f7 100644
 --- a/hw/arm/armv7m.c
 +++ b/hw/arm/armv7m.c
 @@ -163,11 +163,10 @@ static void armv7m_reset(void *opaque)
  }

  /* Init CPU and memory for a v7-M based board.
 -   flash_size and sram_size are in kb.
 +   mem_size is in bytes.
 Returns the NVIC array.  */

 -qemu_irq *armv7m_init(MemoryRegion *system_memory,
 -  int flash_size, int sram_size,
 +qemu_irq *armv7m_init(MemoryRegion *system_memory, int mem_size,
const char *kernel_filename, const char *cpu_model)
  {
  ARMCPU *cpu;
 @@ -180,13 +179,8 @@ qemu_irq *armv7m_init(MemoryRegion *system_memory,
  uint64_t lowaddr;
  int i;
  int big_endian;
 -MemoryRegion *sram = g_new(MemoryRegion, 1);
 -MemoryRegion *flash = g_new(MemoryRegion, 1);
  MemoryRegion *hack = g_new(MemoryRegion, 1);

 -flash_size *= 1024;
 -sram_size *= 1024;
 -
  if (cpu_model == NULL) {
 cpu_model = cortex-m3;
  }
 @@ -197,27 +191,6 @@ qemu_irq *armv7m_init(MemoryRegion *system_memory,
  }
  env = cpu-env;

 -#if 0
 -/*  32Mb SRAM gets complicated because it overlaps the bitband area.
 -   We don't have proper commandline options, so allocate half of memory
 -   as SRAM, up to a maximum of 32Mb, and the rest as code.  */
 -if (ram_size  (512 + 32) * 1024 * 1024)
 -ram_size = (512 + 32) * 1024 * 1024;
 -sram_size = (ram_size / 2)  TARGET_PAGE_MASK;
 -if (sram_size  32 * 1024 * 1024)
 -sram_size = 32 * 1024 * 1024;
 -code_size = ram_size - sram_size;
 -#endif
 -
 -/* Flash programming is done via the SCU, so pretend it is ROM.  */
 -memory_region_init_ram(flash, NULL, armv7m.flash, flash_size,
 -   error_abort);
 -vmstate_register_ram_global(flash);
 -memory_region_set_readonly(flash, true);
 -memory_region_add_subregion(system_memory, 0, flash);
 -memory_region_init_ram(sram, NULL, armv7m.sram, sram_size, 
 error_abort);
 -vmstate_register_ram_global(sram);
 -memory_region_add_subregion(system_memory, 0x2000, sram);
  armv7m_bitband_init();

  nvic = qdev_create(NULL, armv7m_nvic);
 @@ -244,7 +217,7 @@ qemu_irq *armv7m_init(MemoryRegion *system_memory,
  image_size = load_elf(kernel_filename, NULL, NULL, entry, lowaddr,
NULL, big_endian, ELF_MACHINE, 1);
  if (image_size  0) {
 -image_size = load_image_targphys(kernel_filename, 0, flash_size);
 +image_size = load_image_targphys(kernel_filename, 0, mem_size);
  lowaddr = 0;
  }
  if (image_size  0) {
 diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
 index 64bd4b4..d0c61c5 100644
 --- a/hw/arm/stellaris.c
 +++ b/hw/arm/stellaris.c
 @@ -1220,10 +1220,26 @@ static void stellaris_init(const char 
 *kernel_filename, const char *cpu_model,
  int i;
  int j;

 -flash_size = ((board-dc0  0x) + 1)  1;
 -sram_size = (board-dc0  18) + 1;
 -pic = armv7m_init(get_system_memory(),
 -  flash_size, sram_size, kernel_filename, cpu_model);
 +MemoryRegion *sram = g_new(MemoryRegion, 1);
 +MemoryRegion *flash = g_new(MemoryRegion, 1);
 +MemoryRegion *system_memory = get_system_memory();
 +
 +flash_size = (((board-dc0  0x) + 1)  1) * 1024;
 +sram_size = ((board-dc0  18) + 1) * 1024;
 +
 +/* Flash programming is done via the SCU, so pretend it is ROM.  */
 +memory_region_init_ram(flash, NULL, stellaris.flash, flash_size,
 +   error_abort);
 +vmstate_register_ram_global(flash);
 +memory_region_set_readonly(flash, true);
 +memory_region_add_subregion(system_memory, 0, flash);
 +
 +memory_region_init_ram(sram, NULL, stellaris.sram, sram_size,
 +   error_abort);
 +vmstate_register_ram_global(sram);
 +memory_region_add_subregion(system_memory, 0x2000, sram);
 +
 +pic = armv7m_init(system_memory, flash_size, kernel_filename, cpu_model);

  if (board-dc1  (1  16)) {
  dev = sysbus_create_varargs(TYPE_STELLARIS_ADC, 0x40038000,
 diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h
 index cefc9e6..a112930 100644
 --- a/include/hw/arm/arm.h
 +++ b/include/hw/arm/arm.h
 @@ -15,8 +15,7 @@
  #include hw/irq.h

  /* armv7m.c */
 -qemu_irq *armv7m_init(MemoryRegion *system_memory,
 -  

Re: [Qemu-devel] [PATCH v5 5/7] target_arm: Parameterise the irq lines for armv7m_init

2014-10-20 Thread Peter Crosthwaite
On Thu, Oct 16, 2014 at 10:54 PM, Alistair Francis alistai...@gmail.com wrote:
 This patch allows the board to specify the number of NVIC interrupt
 lines when using armv7m_init.

 Signed-off-by: Alistair Francis alistai...@gmail.com

Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com

 ---
  hw/arm/armv7m.c  | 7 ---
  hw/arm/stellaris.c   | 5 -
  include/hw/arm/arm.h | 2 +-
  3 files changed, 9 insertions(+), 5 deletions(-)

 diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
 index 50281f7..7169027 100644
 --- a/hw/arm/armv7m.c
 +++ b/hw/arm/armv7m.c
 @@ -166,14 +166,14 @@ static void armv7m_reset(void *opaque)
 mem_size is in bytes.
 Returns the NVIC array.  */

 -qemu_irq *armv7m_init(MemoryRegion *system_memory, int mem_size,
 +qemu_irq *armv7m_init(MemoryRegion *system_memory, int mem_size, int num_irq,
const char *kernel_filename, const char *cpu_model)
  {
  ARMCPU *cpu;
  CPUARMState *env;
  DeviceState *nvic;
  /* FIXME: make this local state.  */
 -static qemu_irq pic[64];
 +qemu_irq *pic = g_new(qemu_irq, num_irq);
  int image_size;
  uint64_t entry;
  uint64_t lowaddr;
 @@ -194,11 +194,12 @@ qemu_irq *armv7m_init(MemoryRegion *system_memory, int 
 mem_size,
  armv7m_bitband_init();

  nvic = qdev_create(NULL, armv7m_nvic);
 +qdev_prop_set_uint32(nvic, num-irq, num_irq);
  env-nvic = nvic;
  qdev_init_nofail(nvic);
  sysbus_connect_irq(SYS_BUS_DEVICE(nvic), 0,
 qdev_get_gpio_in(DEVICE(cpu), ARM_CPU_IRQ));
 -for (i = 0; i  64; i++) {
 +for (i = 0; i  num_irq; i++) {
  pic[i] = qdev_get_gpio_in(nvic, i);
  }

 diff --git a/hw/arm/stellaris.c b/hw/arm/stellaris.c
 index d0c61c5..6fad10f 100644
 --- a/hw/arm/stellaris.c
 +++ b/hw/arm/stellaris.c
 @@ -29,6 +29,8 @@
  #define BP_OLED_SSI  0x02
  #define BP_GAMEPAD   0x04

 +#define NUM_IRQ_LINES 64
 +
  typedef const struct {
  const char *name;
  uint32_t did0;
 @@ -1239,7 +1241,8 @@ static void stellaris_init(const char *kernel_filename, 
 const char *cpu_model,
  vmstate_register_ram_global(sram);
  memory_region_add_subregion(system_memory, 0x2000, sram);

 -pic = armv7m_init(system_memory, flash_size, kernel_filename, cpu_model);
 +pic = armv7m_init(system_memory, flash_size, NUM_IRQ_LINES,
 +  kernel_filename, cpu_model);

  if (board-dc1  (1  16)) {
  dev = sysbus_create_varargs(TYPE_STELLARIS_ADC, 0x40038000,
 diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h
 index a112930..94e55a4 100644
 --- a/include/hw/arm/arm.h
 +++ b/include/hw/arm/arm.h
 @@ -15,7 +15,7 @@
  #include hw/irq.h

  /* armv7m.c */
 -qemu_irq *armv7m_init(MemoryRegion *system_memory, int mem_size,
 +qemu_irq *armv7m_init(MemoryRegion *system_memory, int mem_size, int num_irq,
const char *kernel_filename, const char *cpu_model);

  /* arm_boot.c */
 --
 1.9.1





Re: [Qemu-devel] [PATCH v5 3/7] stm32f205_SYSCFG: Add the stm32f205 SYSCFG

2014-10-20 Thread Peter Crosthwaite
On Thu, Oct 16, 2014 at 10:54 PM, Alistair Francis alistai...@gmail.com wrote:
 This patch adds the stm32f205 System Configuration
 Controller. This is used to configure what memory is mapped
 at address 0 (although that is not supported) as well
 as configure how the EXTI interrupts work (also not
 supported at the moment).

 This device is not required for basic examples, but more
 complex systems will require it (as well as the EXTI device)

 Signed-off-by: Alistair Francis alistai...@gmail.com
 ---
  default-configs/arm-softmmu.mak|   1 +
  hw/misc/Makefile.objs  |   1 +
  hw/misc/stm32f205_syscfg.c | 160 
 +
  include/hw/misc/stm32f205_syscfg.h |  61 ++
  4 files changed, 223 insertions(+)
  create mode 100644 hw/misc/stm32f205_syscfg.c
  create mode 100644 include/hw/misc/stm32f205_syscfg.h

 diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
 index 422dec0..a2ea8f7 100644
 --- a/default-configs/arm-softmmu.mak
 +++ b/default-configs/arm-softmmu.mak
 @@ -80,6 +80,7 @@ CONFIG_ZAURUS=y
  CONFIG_ZYNQ=y
  CONFIG_STM32F205_TIMER=y
  CONFIG_STM32F205_USART=y
 +CONFIG_STM32F205_SYSCFG=y

  CONFIG_VERSATILE_PCI=y
  CONFIG_VERSATILE_I2C=y
 diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
 index 979e532..63f03bd 100644
 --- a/hw/misc/Makefile.objs
 +++ b/hw/misc/Makefile.objs
 @@ -39,5 +39,6 @@ obj-$(CONFIG_OMAP) += omap_sdrc.o
  obj-$(CONFIG_OMAP) += omap_tap.o
  obj-$(CONFIG_SLAVIO) += slavio_misc.o
  obj-$(CONFIG_ZYNQ) += zynq_slcr.o
 +obj-$(CONFIG_STM32F205_SYSCFG) += stm32f205_syscfg.o

  obj-$(CONFIG_PVPANIC) += pvpanic.o
 diff --git a/hw/misc/stm32f205_syscfg.c b/hw/misc/stm32f205_syscfg.c
 new file mode 100644
 index 000..82aa50f
 --- /dev/null
 +++ b/hw/misc/stm32f205_syscfg.c
 @@ -0,0 +1,160 @@
 +/*
 + * STM32F205 SYSCFG

2XX

 + *
 + * Copyright (c) 2014 Alistair Francis alist...@alistair23.me
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a 
 copy
 + * of this software and associated documentation files (the Software), to 
 deal
 + * in the Software without restriction, including without limitation the 
 rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
 FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +
 +#include hw/misc/stm32f205_syscfg.h
 +
 +#ifndef STM_SYSCFG_ERR_DEBUG
 +#define STM_SYSCFG_ERR_DEBUG 0
 +#endif
 +
 +#define DB_PRINT_L(lvl, fmt, args...) do { \
 +if (STM_SYSCFG_ERR_DEBUG = lvl) { \
 +qemu_log(%s:  fmt, __func__, ## args); \
 +} \
 +} while (0);
 +
 +#define DB_PRINT(fmt, args...) DB_PRINT_L(1, fmt, ## args)
 +
 +static void stm32f205_syscfg_reset(DeviceState *dev)
 +{
 +STM32f205SyscfgState *s = STM32F205_SYSCFG(dev);
 +
 +s-syscfg_memrmp = 0x;
 +s-syscfg_pmc = 0x;
 +s-syscfg_exticr1 = 0x;
 +s-syscfg_exticr2 = 0x;
 +s-syscfg_exticr3 = 0x;
 +s-syscfg_exticr4 = 0x;
 +s-syscfg_cmpcr = 0x;
 +}
 +
 +static uint64_t stm32f205_syscfg_read(void *opaque, hwaddr addr,
 + unsigned int size)
 +{
 +STM32f205SyscfgState *s = opaque;
 +
 +DB_PRINT(0x%x\n, (uint) addr);
 +

HWADDR_PRIx

 +switch (addr) {
 +case SYSCFG_MEMRMP:
 +return s-syscfg_memrmp;
 +case SYSCFG_PMC:
 +return s-syscfg_pmc;
 +case SYSCFG_EXTICR1:
 +return s-syscfg_exticr1;
 +case SYSCFG_EXTICR2:
 +return s-syscfg_exticr2;
 +case SYSCFG_EXTICR3:
 +return s-syscfg_exticr3;
 +case SYSCFG_EXTICR4:
 +return s-syscfg_exticr4;
 +case SYSCFG_CMPCR:
 +return s-syscfg_cmpcr;
 +default:
 +qemu_log_mask(LOG_GUEST_ERROR,
 +  STM32F205_syscfg_read: Bad offset %x\n, (int)addr);

Use %s with __func__, and HWADDR_PRIx.
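
A standalone sketch of the suggested message format follows. It cannot use qemu_log_mask() outside the QEMU tree, so it formats into a buffer instead; demo_hwaddr and DEMO_HWADDR_PRIx are hypothetical stand-ins for QEMU's hwaddr and HWADDR_PRIx, on the assumption that hwaddr is a 64-bit unsigned type.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

typedef uint64_t demo_hwaddr;        /* stand-in for QEMU's hwaddr (assumed) */
#define DEMO_HWADDR_PRIx PRIx64      /* stand-in for HWADDR_PRIx (assumed) */

/* Format a "bad offset" message prefixed with the enclosing function name,
 * so the string does not have to be edited when the function is renamed. */
static int demo_syscfg_bad_offset(char *buf, size_t len, demo_hwaddr addr)
{
    return snprintf(buf, len, "%s: Bad offset 0x%" DEMO_HWADDR_PRIx "\n",
                    __func__, addr);
}
```

The format macro also removes the need for the (int) and (uint) casts, which would truncate a 64-bit address.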

 +return 0;
 +}
 +
 +return 0;
 +}
 +
 +static void stm32f205_syscfg_write(void *opaque, hwaddr addr,
 +   uint64_t val64, unsigned int size)
 +{
 +STM32f205SyscfgState *s = opaque;
 +uint32_t value = val64;
 +
 +DB_PRINT(0x%x, 0x%x\n, value, (uint) addr);
 +

HWADDR_PRIx

 +switch (addr) {
 +case SYSCFG_MEMRMP:
 +   

Re: [Qemu-devel] [PATCH v5 6/7] stm32f205: Add the stm32f205 SoC

2014-10-20 Thread Peter Crosthwaite
On Thu, Oct 16, 2014 at 10:54 PM, Alistair Francis alistai...@gmail.com wrote:
 This patch adds the stm32f205 SoC. This will be used by the
 Netduino 2 to create a machine.

 Signed-off-by: Alistair Francis alistai...@gmail.com
 ---
  default-configs/arm-softmmu.mak |   1 +
  hw/arm/Makefile.objs|   1 +
  hw/arm/stm32f205_soc.c  | 157 
 
  include/hw/arm/stm32f205_soc.h  |  69 ++
  4 files changed, 228 insertions(+)
  create mode 100644 hw/arm/stm32f205_soc.c
  create mode 100644 include/hw/arm/stm32f205_soc.h

 diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
 index a2ea8f7..8068100 100644
 --- a/default-configs/arm-softmmu.mak
 +++ b/default-configs/arm-softmmu.mak
 @@ -81,6 +81,7 @@ CONFIG_ZYNQ=y
  CONFIG_STM32F205_TIMER=y
  CONFIG_STM32F205_USART=y
  CONFIG_STM32F205_SYSCFG=y
 +CONFIG_STM32F205_SOC=y

  CONFIG_VERSATILE_PCI=y
  CONFIG_VERSATILE_I2C=y
 diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
 index 6088e53..9769317 100644
 --- a/hw/arm/Makefile.objs
 +++ b/hw/arm/Makefile.objs
 @@ -8,3 +8,4 @@ obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o 
 pxa2xx_pic.o
  obj-$(CONFIG_DIGIC) += digic.o
  obj-y += omap1.o omap2.o strongarm.o
  obj-$(CONFIG_ALLWINNER_A10) += allwinner-a10.o cubieboard.o
 +obj-$(CONFIG_STM32F205_SOC) += stm32f205_soc.o
 diff --git a/hw/arm/stm32f205_soc.c b/hw/arm/stm32f205_soc.c
 new file mode 100644
 index 000..bd9514e
 --- /dev/null
 +++ b/hw/arm/stm32f205_soc.c
 @@ -0,0 +1,157 @@
 +/*
 + * STM32F205 SoC
 + *
 + * Copyright (c) 2014 Alistair Francis alist...@alistair23.me
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a 
 copy
 + * of this software and associated documentation files (the Software), to 
 deal
 + * in the Software without restriction, including without limitation the 
 rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
 FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +
 +#include hw/arm/stm32f205_soc.h
 +
 +/* At the moment only Timer 2 to 5 are modelled */
 +static const uint32_t timer_addr[] = { 0x4000, 0x4400,
 +0x4800, 0x4C00 };
 +static const uint32_t usart_addr[] = { 0x40011000, 0x40004400,
 +0x40004800, 0x40004C00, 0x40005000, 0x40011400 };
 +

You have 6 addresses for USART ...

 +static const int timer_irq[] = {28, 29, 30, 50};
 +static const int usart_irq[] = {37, 38, 39, 52, 53, 71, 82, 83};
 +

... but 8 IRQs, and the loop below uses only 5 values. What's the system exactly?
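
One way to keep the three counts from drifting apart is to derive the loop bound from the tables and check at compile time that they agree. The sketch below is only an illustration: ARRAY_SIZE is defined locally, the accessor names are hypothetical, and pairing the six addresses with the first six IRQ values from the patch is an assumption, not something confirmed in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* USART base addresses as listed in the patch. */
static const uint32_t usart_addr[] = { 0x40011000, 0x40004400, 0x40004800,
                                       0x40004C00, 0x40005000, 0x40011400 };
/* Assumption: one IRQ per USART, using the first six values from the patch. */
static const int usart_irq[] = { 37, 38, 39, 52, 53, 71 };

#define NUM_USARTS ARRAY_SIZE(usart_addr)

/* The build fails if the two tables ever get out of step. */
_Static_assert(ARRAY_SIZE(usart_addr) == ARRAY_SIZE(usart_irq),
               "each USART needs exactly one address and one IRQ");

/* Bounds-checked lookups; loops would run for i < NUM_USARTS. */
static uint32_t usart_addr_for(size_t idx)
{
    return idx < NUM_USARTS ? usart_addr[idx] : 0;
}

static int usart_irq_for(size_t idx)
{
    return idx < NUM_USARTS ? usart_irq[idx] : -1;
}
```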

 +static void stm32f205_soc_initfn(Object *obj)
 +{
 +STM32F205State *s = STM32F205_SOC(obj);
 +int i;
 +
 +object_initialize(s-syscfg, sizeof(s-syscfg), TYPE_STM32F205_SYSCFG);
 +qdev_set_parent_bus(DEVICE(s-syscfg), sysbus_get_default());
 +
 +for (i = 0; i  5; i++) {
 +object_initialize(s-usart[i], sizeof(s-usart[i]),
 +  TYPE_STM32F205_USART);
 +qdev_set_parent_bus(DEVICE(s-usart[i]), sysbus_get_default());
 +}
 +
 +for (i = 0; i  4; i++) {
 +object_initialize(s-timer[i], sizeof(s-timer[i]),
 +  TYPE_STM32F205_TIMER);
 +qdev_set_parent_bus(DEVICE(s-timer[i]), sysbus_get_default());
 +}
 +}
 +
 +static void stm32f205_soc_realize(DeviceState *dev_soc, Error **errp)
 +{
 +STM32F205State *s = STM32F205_SOC(dev_soc);
 +DeviceState *syscfgdev, *usartdev, *timerdev;
 +SysBusDevice *syscfgbusdev, *usartbusdev, *timerbusdev;
 +qemu_irq *pic;;

stray ;

 +Error *err = NULL;
 +int i;
 +
 +MemoryRegion *system_memory = get_system_memory();
 +MemoryRegion *sram = g_new(MemoryRegion, 1);
 +MemoryRegion *flash = g_new(MemoryRegion, 1);
 +MemoryRegion *flash_alias = g_new(MemoryRegion, 1);
 +
 +memory_region_init_ram(flash, NULL, netduino.flash, FLASH_SIZE,
 +   error_abort);
 +memory_region_init_alias(flash_alias, NULL, netduino.flash.alias,
 + flash, 0, FLASH_SIZE);
 +
 +vmstate_register_ram_global(flash);
 +
 +memory_region_set_readonly(flash, true);
 +memory_region_set_readonly(flash_alias, true);
 +
 +

Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-20 Thread Linhaifeng
Hi all,

Maybe this is a bug in Unix domain sockets. I found that when QEMU sends the
fd to vapp, the fd's f_count is incremented twice in the kernel.

1. Kernel call chain when we call send:
unix_stream_sendmsg -> unix_scm_to_skb -> unix_attach_fds -> scm_fp_dup ->
get_file -> atomic_long_inc(&f->f_count)

Maybe f_count shouldn't be incremented when send is called.


2. Kernel call chain when we call recv:
unix_stream_recvmsg -> scm_fp_dup -> get_file -> atomic_long_inc(&f->f_count)



On 2014/10/20 14:26, Wen Congyang wrote:
 On 10/20/2014 02:17 PM, Linhaifeng wrote:


 On 2014/10/20 13:32, Wen Congyang wrote:
 On 10/20/2014 12:48 PM, Linhaifeng wrote:


 On 2014/10/20 10:12, Wen Congyang wrote:
 On 10/18/2014 11:20 AM, Linhaifeng wrote:


 On 2014/10/17 21:26, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:57:27PM +0800, Linhaifeng wrote:


 On 2014/10/17 16:33, Daniel P. Berrange wrote:
 On Fri, Oct 17, 2014 at 04:27:17PM +0800, haifeng@huawei.com 
 wrote:
 From: linhaifeng haifeng@huawei.com

 The VM start with share hugepage should close the hugefile fd
 when exit.Because the hugepage fd may be send to other process
 e.g vhost-user If qemu not close the fd the other process can
 not free the hugepage otherwise exit process,this is ugly,so
 qemu should close all shared fd when exit.

 Signed-off-by: linhaifeng haifeng@huawei.com

 Err, all file descriptors are closed automatically when a process
 exits. So manually calling close(fd) before exit can't have any
 functional effect on a resource leak.

 If QEMU has sent the FD to another process, that process has a
 completely separate copy of the FD. Closing the FD in QEMU will
 not close the FD in the other process. You need the other process
 to exit for the copy to be closed.

 Regards,
 Daniel

 Hi,daniel

 QEMU send the fd by unix domain socket.unix domain socket just install 
 the fd to
 other process and inc the f_count,if qemu not close the fd the f_count 
 is not dec.
 Then the other process even close the fd the hugepage would not freed 
 whise the other process exit.

 The kernel always closes all FDs when a process exits. So if this FD is
 not being correctly closed then it is a kernel bug. There should never
 be any reason for an application to do close(fd) before exiting.

 Regards,
 Daniel

 Hi,daniel

 I don't think this is kernel's bug.May be this a problem about usage.
 If you open a file you should close it too.

 If you don't close it, the kernel will help you when the program exits.

 Yes,when the hugepage is only used for qemu,the kernel will free the file 
 object.If the hugepage shared for other process,when qemu exit the kernel 
 will not free the file.

 Even if the hugepage is shared with the other process, the kernel will auto 
 close the fd when qemu
 exits. If the kernel doesn't do it, it is a kernel bug.

 Kernel supply close to fix this bug.If you call open you must call close.
 If not, the result is unpredictability.
 
 No, if the program exists, the kernel must close all fd used by the program.
 So, there is no need to close fd before program exists.
 
 Thanks
 Wen Congyang
 

 This is the Linux man page about how to free the resources of a file.
 http://linux.die.net/man/2/close


 I'm trying to describe my problem.

 For example, there are 2 VMs run with hugepage and the hugepage only for 
 QEMU to use.

 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the two VMs.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage.After this step the meminfo is :
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB
 5.shutdown VM with signal 15 without close(fd).After this step the 
 meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Yes,it works well,like you said the kernel recycle all resources.

 For another example,there are 2 VMs run with hugepage and share the 
 hugepage with vapp(a vhost-user application).

 The vapp is your internal application?

 Yes, vapp is an application that shares QEMU's hugepages. So there are two
 processes using the hugepages.


 Before run VM the meminfo is :
 HugePages_Total:4096
 HugePages_Free: 4096
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the first VM.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage and send the fd to vapp with unix domain 
 socket.After this step the meminfo is:

 Do you modify qemu?

 HugePages_Total:4096
 HugePages_Free: 2048
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Run the second VM.After this step the meminfo is:
 HugePages_Total:4096
 HugePages_Free:0
 HugePages_Rsvd:0
 HugePages_Surp:0
 Hugepagesize:   2048 kB

 Then I want 

Re: [Qemu-devel] writing a QEMU block driver

2014-10-20 Thread Max Reitz

On 2014-10-17 at 16:59, Sandeep Joshi wrote:


Hi there,

Do let me know if I am asking these questions on the wrong forum.  I'd 
like to write a QEMU block driver which forwards IO requests to a 
custom-built storage cluster.


I have seen Jeff Cody's presentation http://bugnik.us/kvm2013 and 
also browsed the source code for sheepdog, nbd and gluster in the 
block directory and had a few questions to confirm or correct my 
understanding.


1) What is the difference between bdrv_open and bdrv_file_open 
function pointers in the BlockDriver ?


I'm not sure, but the main difference should be that bdrv_file_open() is 
invoked for protocol block drivers, whereas bdrv_open() is invoked for 
format block drivers. A couple of months ago, there was still a 
top-level bdrv_file_open() function which has since been integrated into 
bdrv_open(), so we might probably want to remove bdrv_file_open() in the 
future as well...


But for now, use bdrv_file_open() for protocol drivers and bdrv_open() 
for format drivers.


2) Is it possible to implement only a protocol driver without a format 
driver (the distinction that Jeff made in his presentation above) ?  
In other words, can I only set the protocol_name and not 
format_name in BlockDriver ?  I'd like to support all image formats 
(qemu, raw, etc) without having to reimplement the logic for each.


Setting format_name does not make a block driver a format driver. A 
block driver can only be either protocol or format driver, and the 
distinction is probably made (again, I'd have to look it up to be sure) 
by protocol drivers setting protocol_name and bdrv_file_open(), whereas 
format drivers do not.


So you just need to set protocol_name and bdrv_file_open() (and 
format_name as well, see nbd for an example where protocol_name and 
format_name differ) and qemu knows your block driver is a protocol 
driver and any format drivers will work on top of it. You should not set 
bdrv_open(), however.


Once again, I'm not 100 % sure, but it should work that way.

Just by the way, I can very well imagine that the distinction between 
protocol and format block drivers will disappear (at least in the code) 
in the future. But that should not be any of your concern. :-)


3) The control flow for creating a file starts with the image format 
driver and later invokes the protocol driver.


image_driver->bdrv_create()
--> bdrv_create_file
  --> bdrv_find_protocol(filename)
  --> bdrv_create
---> Protocol_driver->bdrv_create()

Is this the case for all functions?   Does the read/write first flow 
through the image format driver before getting passed down to the 
protocol driver (possibly via some coroutine invoked from the block 
layer or virtio-blk ) ?  Can someone give me a hint as to how I can 
trace the control flow ?


Well, you can always use gdb with break points and backtraces. At least 
that's what I'd do.


For your first question: Yes, for each guest device or let's say virtual 
guest device (because creating an image is not done through a guest 
device, but the only thing missing from a guest device configuration is 
in fact the device itself), there is a tree of BlockDriverStates. Every 
request runs through the whole tree. It may not touch all nodes, but it 
will start from the top (which is normally a format BDS) and then 
proceed as far as the block drivers create new requests to their children.


Or, to be more technical: A request only goes to the topmost node in the 
BDS tree (the root). If need be, that node's driver will manually forward 
the request to its child (which normally is bs->file if bs is a pointer to 
the BlockDriverState) or children (e.g. bs->backing_hd, the backing file, or 
driver-specific things, such as the children for the quorum block driver 
which are not stored in the BlockDriverState).


This doesn't apply so well to bdrv_create(), because that function does 
not work on BlockDriverStates, but I'm hoping you're seeing the point.
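
To make the forwarding idea concrete, here is a toy model in plain C. It is not QEMU code: ToyNode and all the function names are made up for illustration, and the only thing borrowed from QEMU is the idea that a node has a "file" child, like bs->file.

```c
#include <stddef.h>

/* Toy stand-in for a BlockDriverState node; not QEMU's real structure. */
typedef struct ToyNode ToyNode;
struct ToyNode {
    const char *name;
    ToyNode *file;                          /* child, in the spirit of bs->file */
    int (*read)(ToyNode *self, int *hops);  /* driver callback */
};

/* Format-like driver: does its own work, then creates a request to its child. */
static int toy_format_read(ToyNode *self, int *hops)
{
    (*hops)++;
    return self->file ? self->file->read(self->file, hops) : -1;
}

/* Protocol-like driver: leaf of the tree, would talk to the storage itself. */
static int toy_protocol_read(ToyNode *self, int *hops)
{
    (void)self;
    (*hops)++;
    return 0;
}

/* Callers only ever hand a request to the root; the rest is up to the drivers. */
static int toy_submit_read(ToyNode *root, int *hops)
{
    *hops = 0;
    return root->read(root, hops);
}
```

With a format-like node stacked on a protocol-like node, a read submitted to the root visits two nodes; nothing outside the drivers walks the tree.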


Shameless self plug: Regarding this whole BDS tree thing I can recommend 
Kevin's and my presentation from this year's KVM Forum: 
http://events.linuxfoundation.org/sites/events/files/slides/blockdev.pdf



Max


Re: [Qemu-devel] [PATCH] fix the memory leak for share hugepage

2014-10-20 Thread Daniel P. Berrange
On Sat, Oct 18, 2014 at 11:20:13AM +0800, Linhaifeng wrote:
 
 
 On 2014/10/17 21:26, Daniel P. Berrange wrote:
  On Fri, Oct 17, 2014 at 04:57:27PM +0800, Linhaifeng wrote:
 
 
  On 2014/10/17 16:33, Daniel P. Berrange wrote:
  On Fri, Oct 17, 2014 at 04:27:17PM +0800, haifeng@huawei.com wrote:
  From: linhaifeng haifeng@huawei.com
 
  The VM start with share hugepage should close the hugefile fd
  when exit.Because the hugepage fd may be send to other process
  e.g vhost-user If qemu not close the fd the other process can
  not free the hugepage otherwise exit process,this is ugly,so
  qemu should close all shared fd when exit.
 
  Signed-off-by: linhaifeng haifeng@huawei.com
 
  Err, all file descriptors are closed automatically when a process
  exits. So manually calling close(fd) before exit can't have any
  functional effect on a resource leak.
 
  If QEMU has sent the FD to another process, that process has a
  completely separate copy of the FD. Closing the FD in QEMU will
  not close the FD in the other process. You need the other process
  to exit for the copy to be closed.
 
  Regards,
  Daniel
 
  Hi,daniel
 
  QEMU send the fd by unix domain socket.unix domain socket just install the 
  fd to
  other process and inc the f_count,if qemu not close the fd the f_count is 
  not dec.
  Then the other process even close the fd the hugepage would not freed 
  whise the other process exit.
  
  The kernel always closes all FDs when a process exits. So if this FD is
  not being correctly closed then it is a kernel bug. There should never
  be any reason for an application to do close(fd) before exiting.
  
  Regards,
  Daniel
  
 Hi,daniel
 
 I don't think this is kernel's bug.May be this a problem about usage.
 If you open a file you should close it too.

No, the standard UNIX semantics are that the kernel will close all
file descriptors when a process exits. There is *no* requirement to
manually close any file descriptors before exiting.

The only time you need to close files is if you are using a higher
level API (eg C library FILE * ) and you need to explicitly flush
any buffered I/O operations before exiting. This doesn't apply in the
case of huge pages.

 Run the first VM.QEMU deal with hugepage as follow steps:
 1.open
 2.unlink
 3.mmap
 4.use memory of hugepage and send the fd to vapp with unix domain 
 socket.After this step the meminfo is:

If the QEMU process has exited then the kernel has closed the FD that
QEMU had open. The logical implication is that the 'vapp' process still
has its copy of the file descriptor open even after QEMU has exited.
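
A small sketch of the reference semantics in question. It uses dup() inside one process as a stand-in for the descriptor a peer receives over a UNIX domain socket (an assumption made to keep the example short; the function name and the temporary path are made up, and an ordinary temp file replaces the hugepage file). It follows the open/unlink steps quoted above and shows that closing one descriptor leaves the other copy fully usable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Returns 1 if a duplicate descriptor is still usable after the original
 * was closed and the file unlinked, 0 if not, -1 if setup failed.
 */
static int copy_outlives_original(void)
{
    char path[] = "/tmp/fd-demo-XXXXXX";
    char byte = 0;
    int fd, copy, ok;

    fd = mkstemp(path);                 /* step 1: open */
    if (fd < 0) {
        return -1;
    }
    unlink(path);                       /* step 2: unlink */
    assert(access(path, F_OK) != 0);    /* the name is gone, the file is not */
    if (write(fd, "x", 1) != 1) {
        close(fd);
        return -1;
    }

    copy = dup(fd);                     /* stand-in for the peer's descriptor */
    close(fd);                          /* the original holder closes, or exits */
    if (copy < 0) {
        return -1;
    }

    /* The copy still refers to the same open file, so the byte is readable. */
    ok = lseek(copy, 0, SEEK_SET) == 0 &&
         read(copy, &byte, 1) == 1 && byte == 'x';
    close(copy);                        /* last reference dropped */
    return ok;
}
```

The storage can only be released once the last descriptor referring to the file is closed, which matches the point that the receiving process, not QEMU, is the one still holding the pages.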

Another possibility is that whatever launched QEMU has not done a waitpid,
thus leaving QEMU in a zombie state where file descriptors are not
yet cleaned up.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] spec, RFC: TLS support for NBD

2014-10-20 Thread Daniel P. Berrange
On Sat, Oct 18, 2014 at 07:33:22AM +0100, Richard W.M. Jones wrote:
 On Sat, Oct 18, 2014 at 12:03:23AM +0200, Wouter Verhelst wrote:
  Hi all,
  
  (added rjones from nbdkit fame -- hi there)
 
 [I'm happy to implement whatever you come up with, but I've added
 Florian Weimer to CC who is part of Red Hat's product security group]
 
  So I think the following would make sense to allow TLS in NBD.
  
  This would extend the newstyle negotiation by adding two options (i.e.,
  client requests), one server reply, and one server error as well as
  extend one existing reply, in the following manner:
  
  - The two new commands are NBD_OPT_PEEK_EXPORT and NBD_OPT_STARTTLS. The
former would be used to verify if the server will do TLS for a given
export:
  
C: NBD_OPT_PEEK_EXPORT
S: NBD_REP_SERVER, with an extra field after the export name
   containing flags that describe the export (R/O vs R/W state,
   whether TLS is allowed and/or required).

IMHO the server should never provide *any* information about the exported
volume(s) until the TLS layer has been fully setup. ie we shouldn't only
think about the actual block data transfers, we should protect the entire
NBD protocol even metadata related operations.

If the server indicates that TLS is allowed, the client may now issue
NBD_OPT_STARTTLS:
  
C: NBD_OPT_STARTTLS
S: NBD_REP_STARTTLS # or NBD_REP_ERR_POLICY, if unwilling
C: initiate TLS handshake
  
Once the TLS handshake has completed, negotiation should continue over
the secure channel. The client should initiate that by sending an
NBD_OPT_* message.
  
  - The server may reply to any and all negotiation request with
NBD_REP_ERR_TLS_REQD if it does not want to do anything without TLS.
However, if at least one export is supported without encryption, the
server must not in any case use this reply.
  
  There is no command to exit TLS again. I don't think that makes sense,
  but I could be persuaded otherwise with sound technical arguments.
  
  Thoughts?
  
  (full spec (with numbers etc) exists as an (uncommitted) diff to
  doc/proto.txt on my laptop, ...)

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v6 16/24] hw: Convert from BlockDriverState to BlockBackend, mostly

2014-10-20 Thread Markus Armbruster
Kevin Wolf kw...@redhat.com writes:

 Am 07.10.2014 um 13:59 hat Markus Armbruster geschrieben:
 Device models should access their block backends only through the
 block-backend.h API.  Convert them, and drop direct includes of
 inappropriate headers.
 
 Just four uses of BlockDriverState are left:
 
 * The Xen paravirtual block device backend (xen_disk.c) opens images
   itself when set up via xenbus, bypassing blockdev.c.  I figure it
   should go through qmp_blockdev_add() instead.
 
 * Device model usb-storage prompts for keys.  No other device model
   does, and this one probably shouldn't do it, either.
 
 * ide_issue_trim_cb() uses bdrv_aio_discard() instead of
   blk_aio_discard() because it fishes its backend out of a BlockAIOCB,
   which has only the BlockDriverState.
 
 * PC87312State has an unused BlockDriverState[] member.
 
 The next two commits take care of the latter two.
 
 Signed-off-by: Markus Armbruster arm...@redhat.com

 diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
 index 198da2e..8b3f352 100644
 --- a/include/qemu/typedefs.h
 +++ b/include/qemu/typedefs.h
 @@ -37,6 +37,7 @@ typedef struct HCIInfo HCIInfo;
  typedef struct AudioState AudioState;
  typedef struct BlockBackend BlockBackend;
  typedef struct BlockDriverState BlockDriverState;
 +typedef struct BlockBackend BlockBackend;
  typedef struct DriveInfo DriveInfo;
  typedef struct DisplayState DisplayState;
  typedef struct DisplayChangeListener DisplayChangeListener;

 This is a duplicate typedef (the first definition is even in the context
 of this hunk) and causes the build to fail on RHEL 6. I can drop the
 hunk while applying if you don't object.

Yes, please!



Re: [Qemu-devel] [PATCH v3] qemu-char: Do not disconnect when there's data for reading

2014-10-20 Thread Markus Armbruster
MAINTAINERS points to Anthony, and you duly cc'ed him, but he's
effectively retired.  Cc'ing recent committers, including Paolo and Peter.

Zifei Tong zifeit...@gmail.com writes:

 After commit 812c1057f6175ac9a9829fa2920a2b5783814193 (Handle G_IO_HUP
 in tcp_chr_read for tcp chardev), connections are disconnected when in
 G_IO_HUP condition.

 However, it's possible that there is still data for reading in the channel.
 In that case, the remaining data is not handled.

 I saw a related bug when running socat in write-only mode, after

   $ echo quit | socat -u - UNIX-CONNECT:qemu-monitor

 the monitor won't run the 'quit' command.

 Instead of GIOCondition, this patch uses the return value of tcp_chr_recv()
 to check the state of connection as suggested by Kirill.

 Cc: Kirill Batuzov batuz...@ispras.ru
 Cc: Nikolay Nikolaev n.nikol...@virtualopensystems.com
 Cc: Markus Armbruster arm...@redhat.com
 Cc: Anthony Liguori aligu...@amazon.com
 Signed-off-by: Zifei Tong zifeit...@gmail.com
 ---
 Changes in v3: handle EWOULDBLOCK, remove inaccurate comment

  qemu-char.c | 10 ++
  1 file changed, 2 insertions(+), 8 deletions(-)

 diff --git a/qemu-char.c b/qemu-char.c
 index 2a3cb9f..d1893a0 100644
 --- a/qemu-char.c
 +++ b/qemu-char.c
 @@ -2692,12 +2692,6 @@ static gboolean tcp_chr_read(GIOChannel *chan, 
 GIOCondition cond, void *opaque)
  uint8_t buf[READ_BUF_LEN];
  int len, size;
  
 -if (cond & G_IO_HUP) {
 -/* connection closed */
 -tcp_chr_disconnect(chr);
 -return TRUE;
 -}
 -
  if (!s->connected || s->max_size <= 0) {
  return TRUE;
  }
 @@ -2705,8 +2699,8 @@ static gboolean tcp_chr_read(GIOChannel *chan, 
 GIOCondition cond, void *opaque)
  if (len > s->max_size)
  len = s->max_size;
  size = tcp_chr_recv(chr, (void *)buf, len);
 -if (size == 0) {
 -/* connection closed */
 +if (size == 0 ||
 +(size < 0 && !(errno == EAGAIN || errno == EWOULDBLOCK || errno == 
 EINTR))) {
  tcp_chr_disconnect(chr);
  } else if (size > 0) {
  if (s->do_telnetopt)
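
The situation the patch addresses can be reproduced outside QEMU. The sketch below is an illustration, not code from the patch: the function name is made up and a socketpair stands in for the monitor socket. The peer writes "quit\n" and closes immediately, like the socat example; the reader then keeps reading until read() returns 0 instead of giving up on the hang-up condition.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns the number of bytes still readable after the peer has closed,
 * or -1 on failure. *revents_out receives what poll() reported.
 */
static long read_until_eof_after_close(char *out, size_t cap, short *revents_out)
{
    int sv[2];
    long total = 0;
    struct pollfd pfd;

    assert(out != NULL && cap > 0 && revents_out != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    if (write(sv[1], "quit\n", 5) != 5) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    close(sv[1]);                       /* peer hangs up with data still queued */

    pfd.fd = sv[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 1000) < 0) {
        close(sv[0]);
        return -1;
    }
    *revents_out = pfd.revents;         /* readable, possibly with a hang-up flag */

    while ((size_t)total < cap) {
        ssize_t n = read(sv[0], out + total, cap - (size_t)total);
        if (n > 0) {
            total += n;                 /* data queued before the close */
        } else if (n == 0) {
            break;                      /* EOF: only now is the connection done */
        } else if (errno != EINTR) {
            total = -1;
            break;
        }
    }
    close(sv[0]);
    return total;
}
```

Deciding "disconnected" from the return value of the read, as the patch does with tcp_chr_recv(), means the queued bytes are delivered first; whether poll() additionally reports a hang-up flag at that point is platform-dependent.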



Re: [Qemu-devel] [PATCH 2/2] vl.c: reduce exit on error code duplication

2014-10-20 Thread Markus Armbruster
Igor Mammedov imamm...@redhat.com writes:

 use exit_if_error() helper instead of a bunch of
 if (local_err) {
 error_report(foo);
 error_free(local_err);
 exit(1);
 }
 code blocks

 Signed-off-by: Igor Mammedov imamm...@redhat.com

A quick git-grep -B 2 -w exit shows the pattern exists outside vl.c.
The instances in hw/ are of course suspect.  Anyway, the helper seems
more generally useful.  Let's put it in util/ now, rather than move it
later.

Suggest to name it something like error_report_fatal(), to make both the
fact that it reports and that it exits obvious from the name.
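
For illustration only, such a helper could look roughly like this. ToyError is a stand-in for QEMU's Error object and toy_error_report_fatal() is a hypothetical name following the suggestion; the real helper would use error_report() and error_free() as in the pattern quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for QEMU's Error object; it only carries a message. */
typedef struct ToyError {
    const char *msg;
} ToyError;

/* Report and exit if an error is set; return normally if there is none. */
static void toy_error_report_fatal(const ToyError *err)
{
    if (!err) {
        return;                         /* nothing to report, carry on */
    }
    assert(err->msg != NULL);
    fprintf(stderr, "fatal: %s\n", err->msg);
    exit(1);
}
```

Call sites then shrink to a single line after each operation that may set an error, and the name says both what happens (report) and what follows (exit).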



Re: [Qemu-devel] [PATCH v2 0/2] target-xtensa: fix loading uImage kernels on MMUv2 cores

2014-10-20 Thread Alexander Graf



 Am 19.10.2014 um 07:03 schrieb Max Filippov jcmvb...@gmail.com:
 
 Hi,
 
 this series fixes loading uImage kernels on MMUv2 xtensa cores.
 
 U-boot for xtensa always treats uImage load address as virtual address.
 This is important when booting uImage on xtensa core with MMUv2, because
 MMUv2 has fixed non-identity virtual-to-physical mapping after reset.

Reviewed-by: Alexander Graf ag...@suse.de


Alex




[Qemu-devel] [PATCH v2] block: fix implicit conversion to invalid type

2014-10-20 Thread Igor Mammedov
change type of variable to expected IoOperationType which fixes compile
warning:

block.c:3655:20: warning: implicit conversion from enumeration
 type enum IoOperationType to different enumeration type BlockErrorAction

Signed-off-by: Igor Mammedov imamm...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
v2: fix spelling error in subj
---
 block.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index 27533f3..c15fad0 100644
--- a/block.c
+++ b/block.c
@@ -3650,10 +3650,10 @@ static void send_qmp_error_event(BlockDriverState *bs,
  BlockErrorAction action,
  bool is_read, int error)
 {
-BlockErrorAction ac;
+IoOperationType op;
 
-ac = is_read ? IO_OPERATION_TYPE_READ : IO_OPERATION_TYPE_WRITE;
-qapi_event_send_block_io_error(bdrv_get_device_name(bs), ac, action,
+op = is_read ? IO_OPERATION_TYPE_READ : IO_OPERATION_TYPE_WRITE;
+qapi_event_send_block_io_error(bdrv_get_device_name(bs), op, action,
bdrv_iostatus_is_enabled(bs),
error == ENOSPC, strerror(error),
error_abort);
-- 
1.9.3




Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Max Reitz

On 2014-10-20 at 08:14, Peter Lieven wrote:

the block layer silently merges write requests since


s/^t/T/


commit 40b4f539. This patch adds a knob to disable
this feature as there has been some discussion lately
if multiwrite is a good idea at all and as it falsifies
benchmarks.

Signed-off-by: Peter Lieven p...@kamp.de
---
  block.c   |4 
  block/qapi.c  |1 +
  blockdev.c|7 +++
  hmp.c |4 
  include/block/block_int.h |1 +
  qapi/block-core.json  |   10 +-
  qemu-options.hx   |1 +
  qmp-commands.hx   |2 ++
  8 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 27533f3..1658a72 100644
--- a/block.c
+++ b/block.c
@@ -4531,6 +4531,10 @@ static int multiwrite_merge(BlockDriverState *bs, 
BlockRequest *reqs,
  {
  int i, outidx;
  
+if (!bs->write_merging) {

+return num_reqs;
+}
+
  // Sort requests by start sector
  qsort(reqs, num_reqs, sizeof(*reqs), multiwrite_req_compare);
  
diff --git a/block/qapi.c b/block/qapi.c

index 9733ebd..02251dd 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -58,6 +58,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs)
  
  info->backing_file_depth = bdrv_get_backing_file_depth(bs);

  info->detect_zeroes = bs->detect_zeroes;
+info->write_merging = bs->write_merging;
  
  if (bs->io_limits_enabled) {

  ThrottleConfig cfg;
diff --git a/blockdev.c b/blockdev.c
index e595910..13e47b8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
*bs_opts,
  const char *id;
  bool has_driver_specific_opts;
  BlockdevDetectZeroesOptions detect_zeroes;
+bool write_merging;
  BlockDriver *drv = NULL;
  
  /* Check common options by copying from bs_opts to opts, all other options

@@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
*bs_opts,
  snapshot = qemu_opt_get_bool(opts, snapshot, 0);
  ro = qemu_opt_get_bool(opts, read-only, 0);
  copy_on_read = qemu_opt_get_bool(opts, copy-on-read, false);
+write_merging = qemu_opt_get_bool(opts, write-merging, true);


Using this option in blockdev_init() means that you can only enable or 
disable merging for the top layer (the root BDS). Furthermore, since you 
don't set bs->write_merging in bdrv_new() (or at least bdrv_open()), it 
actually defaults to false and only for the top layer it defaults to true.


Therefore, if after this patch a format block driver issues a multiwrite 
to its file, the write will not be merged and the user can do nothing 
about it. I don't suppose this is intentional...?


I propose evaluating the option in bdrv_open() and setting 
bs->write_merging there.
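
A toy model of the defaulting problem, assuming nodes are allocated zeroed (which is what "defaults to false" implies). ToyBDS and the toy_* functions are made-up names, not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy node; only the field under discussion is modelled. */
typedef struct ToyBDS {
    bool write_merging;
    struct ToyBDS *file;
} ToyBDS;

/* Allocation as the patch leaves it: zeroed, so the flag starts out false. */
static ToyBDS *toy_new_zeroed(void)
{
    return calloc(1, sizeof(ToyBDS));
}

/* Alternative: establish the default at the place where every node is made. */
static ToyBDS *toy_new_with_default(void)
{
    ToyBDS *bs = calloc(1, sizeof(*bs));
    if (bs) {
        bs->write_merging = true;
    }
    return bs;
}

/* Mirrors what blockdev_init() does in the patch: it only sees the root. */
static void toy_apply_drive_option(ToyBDS *root, bool write_merging)
{
    assert(root != NULL);
    root->write_merging = write_merging;
}
```

With toy_new_zeroed(), applying the drive option with the documented default of true leaves the child node at false; with toy_new_with_default(), every node starts out with merging enabled regardless of which layer the option parser touches.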



  if ((buf = qemu_opt_get(opts, discard)) != NULL) {
  if (bdrv_parse_discard_flags(buf, bdrv_flags) != 0) {
@@ -530,6 +532,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
*bs_opts,
  bs->open_flags = snapshot ? BDRV_O_SNAPSHOT : 0;
  bs->read_only = ro;
  bs->detect_zeroes = detect_zeroes;
+bs->write_merging = write_merging;
  
  bdrv_set_on_error(bs, on_read_error, on_write_error);
  
@@ -2746,6 +2749,10 @@ QemuOptsList qemu_common_drive_opts = {

  .name = detect-zeroes,
  .type = QEMU_OPT_STRING,
  .help = try to optimize zero writes (off, on, unmap),
+},{
+.name = write-merging,
+.type = QEMU_OPT_BOOL,
+.help = enable write merging (default: true),
  },
  { /* end of list */ }
  },
diff --git a/hmp.c b/hmp.c
index 63d7686..8d6ad0b 100644
--- a/hmp.c
+++ b/hmp.c
@@ -348,6 +348,10 @@ void hmp_info_block(Monitor *mon, const QDict *qdict)
 
BlockdevDetectZeroesOptions_lookup[info->value->inserted->detect_zeroes]);
  }
  
+if (!info->value->inserted->write_merging) {

+monitor_printf(mon, Write Merging:off\n);
+}
+
  if (info->value->inserted->bps
  || info->value->inserted->bps_rd
  || info->value->inserted->bps_wr
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8d86a6c..39bbde2 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -407,6 +407,7 @@ struct BlockDriverState {
  
  QDict *options;

  BlockdevDetectZeroesOptions detect_zeroes;
+bool write_merging;
  
  /* The error object in use for blocking operations on backing_hd */

  Error *backing_blocker;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 8f7089e..4931bd9 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -214,6 +214,8 @@
  #
  # @detect_zeroes: detect and optimize zero writes (Since 2.1)
  #
+# @write_merging: true if write merging is enabled (Since 2.2)
+#
  # @bps: total throughput limit in bytes per second is specified
  #
  

Re: [Qemu-devel] [PATCH 2/2] iotests: Add test for qcow2 L1 table update

2014-10-20 Thread Max Reitz

On 2014-10-20 at 08:25, Peter Lieven wrote:

On 16.10.2014 15:25, Max Reitz wrote:

Updating the L1 table should not result in random data being written.
This adds a test for that.

Signed-off-by: Max Reitz mre...@redhat.com
---
  tests/qemu-iotests/107 | 61 
++

  tests/qemu-iotests/107.out | 10 
  tests/qemu-iotests/group   |  1 +
  3 files changed, 72 insertions(+)
  create mode 100755 tests/qemu-iotests/107
  create mode 100644 tests/qemu-iotests/107.out

diff --git a/tests/qemu-iotests/107 b/tests/qemu-iotests/107
new file mode 100755
index 000..cad1cf9
--- /dev/null
+++ b/tests/qemu-iotests/107
@@ -0,0 +1,61 @@
+#!/bin/bash
+#
+# Tests updates of the qcow2 L1 table
+#
+# Copyright (C) 2014 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/.
+#
+
+# creator
+owner=mre...@redhat.com
+
+seq=$(basename $0)
+echo QA output created by $seq
+
+here=$PWD
+tmp=/tmp/$$
+status=1  # failure is the default!
+
+_cleanup()
+{
+_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file


This (and maybe other recently added tests) also works on NFS.
As NFS on QCOW2 might be a reasonable combination I would add it.


It probably works over some other protocols as well. I think we want to 
go through all the tests in a separate series and fix all of these 
cases. I'll fix it in a v2 if I need to do a v2, if not, I'll leave it 
as-is. If you want to test qcow2, you want to run all tests over file, 
probably. If you want to test nfs, you probably don't want to use qcow2 
but raw (because many of the currently existing tests are too limited in 
their format and protocol choices).


Since you suggested looking into which tests actually support more 
formats and protocols than they currently pretend to, feel free to send 
such a series. ;-)


Max



Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Peter Lieven

On 20.10.2014 10:59, Max Reitz wrote:

On 2014-10-20 at 08:14, Peter Lieven wrote:

the block layer silently merges write requests since


s/^t/T/


commit 40b4f539. This patch adds a knob to disable
this feature as there has been some discussion lately
if multiwrite is a good idea at all and as it falsifies
benchmarks.

Signed-off-by: Peter Lieven p...@kamp.de
---
  block.c   |4 
  block/qapi.c  |1 +
  blockdev.c|7 +++
  hmp.c |4 
  include/block/block_int.h |1 +
  qapi/block-core.json  |   10 +-
  qemu-options.hx   |1 +
  qmp-commands.hx   |2 ++
  8 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 27533f3..1658a72 100644
--- a/block.c
+++ b/block.c
@@ -4531,6 +4531,10 @@ static int multiwrite_merge(BlockDriverState *bs, 
BlockRequest *reqs,
  {
  int i, outidx;
  +if (!bs->write_merging) {
+return num_reqs;
+}
+
  // Sort requests by start sector
  qsort(reqs, num_reqs, sizeof(*reqs), multiwrite_req_compare);
  diff --git a/block/qapi.c b/block/qapi.c
index 9733ebd..02251dd 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -58,6 +58,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs)
info-backing_file_depth = bdrv_get_backing_file_depth(bs);
  info-detect_zeroes = bs-detect_zeroes;
+info-write_merging = bs-write_merging;
if (bs-io_limits_enabled) {
  ThrottleConfig cfg;
diff --git a/blockdev.c b/blockdev.c
index e595910..13e47b8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
*bs_opts,
  const char *id;
  bool has_driver_specific_opts;
  BlockdevDetectZeroesOptions detect_zeroes;
+bool write_merging;
  BlockDriver *drv = NULL;
/* Check common options by copying from bs_opts to opts, all other 
options
@@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
*bs_opts,
  snapshot = qemu_opt_get_bool(opts, snapshot, 0);
  ro = qemu_opt_get_bool(opts, read-only, 0);
  copy_on_read = qemu_opt_get_bool(opts, copy-on-read, false);
+write_merging = qemu_opt_get_bool(opts, write-merging, true);


Using this option in blockdev_init() means that you can only enable or disable merging for the top layer (the root BDS). Furthermore, since you don't set bs->write_merging in bdrv_new() (or at least bdrv_open()), it actually defaults to false and only 
for the top layer it defaults to true.


Therefore, if after this patch a format block driver issues a multiwrite to its 
file, the write will not be merged and the user can do nothing about it. I 
don't suppose this is intentional...?


I am not sure if a block driver actually can do this at all? The only way to 
enter multiwrite is from virtio_blk_handle_request in virtio-blk.c.



I propose evaluating the option in bdrv_open() and setting bs->write_merging 
there.


I wasn't aware actually. I remember that someone asked me to implement 
discard_zeroes in blockdev_init. I think it was something related to QMP. So we 
still might
need to check parameters at 2 positions? It is quite confusing which parameter 
has to be parsed where.

Peter




  if ((buf = qemu_opt_get(opts, discard)) != NULL) {
  if (bdrv_parse_discard_flags(buf, bdrv_flags) != 0) {
@@ -530,6 +532,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
*bs_opts,
  bs-open_flags = snapshot ? BDRV_O_SNAPSHOT : 0;
  bs-read_only = ro;
  bs-detect_zeroes = detect_zeroes;
+bs-write_merging = write_merging;
bdrv_set_on_error(bs, on_read_error, on_write_error);
  @@ -2746,6 +2749,10 @@ QemuOptsList qemu_common_drive_opts = {
  .name = detect-zeroes,
  .type = QEMU_OPT_STRING,
  .help = try to optimize zero writes (off, on, unmap),
+},{
+.name = write-merging,
+.type = QEMU_OPT_BOOL,
+.help = enable write merging (default: true),
  },
  { /* end of list */ }
  },
diff --git a/hmp.c b/hmp.c
index 63d7686..8d6ad0b 100644
--- a/hmp.c
+++ b/hmp.c
@@ -348,6 +348,10 @@ void hmp_info_block(Monitor *mon, const QDict *qdict)
BlockdevDetectZeroesOptions_lookup[info-value-inserted-detect_zeroes]);
  }
  +if (!info-value-inserted-write_merging) {
+monitor_printf(mon, Write Merging:off\n);
+}
+
  if (info-value-inserted-bps
  || info-value-inserted-bps_rd
  || info-value-inserted-bps_wr
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8d86a6c..39bbde2 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -407,6 +407,7 @@ struct BlockDriverState {
QDict *options;
  BlockdevDetectZeroesOptions detect_zeroes;
+bool write_merging;
/* The error object in use for blocking operations on 

Re: [Qemu-devel] [PATCH 2/2] iotests: Add test for qcow2 L1 table update

2014-10-20 Thread Peter Lieven

On 20.10.2014 11:09, Max Reitz wrote:

On 2014-10-20 at 08:25, Peter Lieven wrote:

On 16.10.2014 15:25, Max Reitz wrote:

Updating the L1 table should not result in random data being written.
This adds a test for that.

Signed-off-by: Max Reitz mre...@redhat.com
---
  tests/qemu-iotests/107 | 61 ++
  tests/qemu-iotests/107.out | 10 
  tests/qemu-iotests/group   |  1 +
  3 files changed, 72 insertions(+)
  create mode 100755 tests/qemu-iotests/107
  create mode 100644 tests/qemu-iotests/107.out

diff --git a/tests/qemu-iotests/107 b/tests/qemu-iotests/107
new file mode 100755
index 000..cad1cf9
--- /dev/null
+++ b/tests/qemu-iotests/107
@@ -0,0 +1,61 @@
+#!/bin/bash
+#
+# Tests updates of the qcow2 L1 table
+#
+# Copyright (C) 2014 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/.
+#
+
+# creator
+owner=mre...@redhat.com
+
+seq=$(basename $0)
+echo QA output created by $seq
+
+here=$PWD
+tmp=/tmp/$$
+status=1  # failure is the default!
+
+_cleanup()
+{
+_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file


This (and maybe other recently added tests) also works on NFS.
As NFS on QCOW2 might be a reasonable combination I would add it.


It probably works over some other protocols as well. I think we want to go through all the tests in a separate series and fix all of these cases. I'll fix it in a v2 if I need to do a v2, if not, I'll leave it as-is. If you want to test qcow2, you want 
to run all tests over file, probably. If you want to test nfs, you probably don't want to use qcow2 but raw (because many of the currently existing tests are too limited in their format and protocol choices).


Since you suggested looking into which tests actually support more formats and 
protocols than they currently pretend to, feel free to send such a series. ;-)


I actually did when writing the NFS driver. I did modifications to a lot of 
tests to support other protocols. But the approach was too hackish. But I would 
like to see at least
the tests that work without modifications to have supported protocols listed. 
And as someone likely will use a QCOW2 container on NFS I think it is an 
important
test case. At least to have all the tests included that are working out of the 
box.

Peter




[Qemu-devel] [PATCH] get_maintainer.pl: Default to --no-git-fallback

2014-10-20 Thread Markus Armbruster
Contributors rely on this script to find maintainers to copy.  The
script falls back to git when no exact MAINTAINERS pattern matches.
When that happens, recent contributors get copied, which tends not to be
particularly useful.  Some contributors find it even annoying.

Flip the default to don't fall back to git.  Use --git-fallback to
ask it to fall back to git.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
 scripts/get_maintainer.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 38334de..ec2d16f 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -28,7 +28,7 @@ my $email_git = 0;
 my $email_git_all_signature_types = 0;
 my $email_git_blame = 0;
 my $email_git_blame_signatures = 1;
-my $email_git_fallback = 1;
+my $email_git_fallback = 0;
 my $email_git_min_signatures = 1;
 my $email_git_max_maintainers = 5;
 my $email_git_min_percent = 5;
-- 
1.9.3




[Qemu-devel] MAINTAINERS leaves too many files uncovered

2014-10-20 Thread Markus Armbruster
In my experience, too many files are not covered by MAINTAINERS.
scripts/get_maintainer.pl falls back to git then, unless you say
--no-git-fallback.  Copies sent there tend to annoy their recipients
without accomplishing all that much.

Two obvious improvements:

* Easy: Flip scripts/get_maintainer.pl's default to --no-git-fallback.
  I'll post the obvious patch, please raise your objections there.

* Harder: improve MAINTAINERS coverage.

Let me back up subjective experience with hard data.  The tree has quite
a few files:

$ git-ls-files | wc -l
3746

Counting them by extension:

$ git-ls-files | sed -n 's#.*/##;s#.*\.##p' | sort | uniq -c | sort -nr
   1836 c
818 h
133 out
105 S
 97 objs
 69 s
 64 mak
 48 json
 47 py
 41 txt
 33 exit
 33 err
 16 xml
 16 bin
 13 rom
 12 sh
 12 dsl
[Long tail that doesn't add up to anything interesting omitted]

Let's look for .c not in MAINTAINERS:

$ for i in `git-ls-files`; do [ "`scripts/get_maintainer.pl -f --no-git-fallback $i`" ] || echo $i; done >unmaintained-files
$ grep -c '\.c$' unmaintained-files
1066

That's almost 60%.  Not good.

Apparently, nobody cares for tests:

$ grep '^tests/' unmaintained-files | grep -c '\.c$'
654
$ git-ls-files | grep '^tests/' | grep -c '\.c$'
664

Filtering those out leaves us with 412 unmaintained out of 1172, or
35% unmaintained.  Not good even if we (foolishly!) considered tests not
worthy of maintenance.

Maybe unmaintained files are much smaller.  David A.  Wheeler's
SLOCCount counts 570kSLOC in 1212 maintained files (+140 files sloccount
doesn't know how to count) vs. 300kSLOC in 1798 unmaintained files (+596
uncounted), or 35% unmaintained SLOC.  With tests/ ignored, it's 30%.
So, unmaintained files are indeed smaller, but 30-something percent is
still not good.
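
The .c file ratios above follow directly from the counts. A small sketch that redoes the arithmetic; the function names are made up, and only the file counts quoted in this mail are used.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage of part in whole, rounded to the nearest integer. */
static int percent(int part, int whole)
{
    assert(whole > 0);
    return (200 * part + whole) / (2 * whole);
}

/* Prints the two .c file ratios from the counts quoted in the mail. */
static void print_unmaintained_c_ratios(void)
{
    int all_c = 1836, unmaintained_c = 1066;        /* whole tree */
    int tests_c = 664, tests_unmaintained_c = 654;  /* under tests/ */
    int rest_c = all_c - tests_c;
    int rest_unmaintained_c = unmaintained_c - tests_unmaintained_c;

    printf("all .c files:   %d of %d (%d%%)\n",
           unmaintained_c, all_c, percent(unmaintained_c, all_c));
    printf("outside tests/: %d of %d (%d%%)\n",
           rest_unmaintained_c, rest_c,
           percent(rest_unmaintained_c, rest_c));
}
```

The "almost 60%" figure is 1066 of 1836, and the 412 of 1172 figure comes from subtracting the tests/ counts (654 and 664) from both sides.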

Where are the unmaintained files?  Top-scoring directories outside
tests/ and include/, files in subdirs not counted:

#files   directory
84  68%  .
63 100%  default-configs
48 100%  pc-bios
43  97%  stubs
39 100%  util
37 100%  pc-bios/keymaps
35  81%  hw/display
32  94%  scripts
26  92%  docs
26 100%  libcacard
23  69%  hw/misc
22  57%  hw/net
21  63%  hw/intc
19 100%  roms
18 100%  disas
18  56%  hw/timer
16 100%  hw/core
15  53%  hw/char
15 100%  qga
14 100%  docs/specs
12  92%  hw/input
12 100%  qobject
12 100%  target-m68k
11 100%  backends
11 100%  pc-bios/s390-ccw
10  71%  hw/dma

Ideas?  Takers?



Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Max Reitz

On 2014-10-20 at 11:14, Peter Lieven wrote:

On 20.10.2014 10:59, Max Reitz wrote:

On 2014-10-20 at 08:14, Peter Lieven wrote:

the block layer silently merges write requests since


s/^t/T/


commit 40b4f539. This patch adds a knob to disable
this feature as there has been some discussion lately
if multiwrite is a good idea at all and as it falsifies
benchmarks.

Signed-off-by: Peter Lieven p...@kamp.de
---
  block.c   |4 
  block/qapi.c  |1 +
  blockdev.c|7 +++
  hmp.c |4 
  include/block/block_int.h |1 +
  qapi/block-core.json  |   10 +-
  qemu-options.hx   |1 +
  qmp-commands.hx   |2 ++
  8 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 27533f3..1658a72 100644
--- a/block.c
+++ b/block.c
@@ -4531,6 +4531,10 @@ static int multiwrite_merge(BlockDriverState 
*bs, BlockRequest *reqs,

  {
  int i, outidx;
  +if (!bs->write_merging) {
+return num_reqs;
+}
+
  // Sort requests by start sector
  qsort(reqs, num_reqs, sizeof(*reqs), multiwrite_req_compare);
  diff --git a/block/qapi.c b/block/qapi.c
index 9733ebd..02251dd 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -58,6 +58,7 @@ BlockDeviceInfo 
*bdrv_block_device_info(BlockDriverState *bs)

info-backing_file_depth = bdrv_get_backing_file_depth(bs);
  info-detect_zeroes = bs-detect_zeroes;
+info-write_merging = bs-write_merging;
if (bs-io_limits_enabled) {
  ThrottleConfig cfg;
diff --git a/blockdev.c b/blockdev.c
index e595910..13e47b8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char 
*file, QDict *bs_opts,

  const char *id;
  bool has_driver_specific_opts;
  BlockdevDetectZeroesOptions detect_zeroes;
+bool write_merging;
  BlockDriver *drv = NULL;
/* Check common options by copying from bs_opts to opts, all 
other options
@@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char 
*file, QDict *bs_opts,

  snapshot = qemu_opt_get_bool(opts, snapshot, 0);
  ro = qemu_opt_get_bool(opts, read-only, 0);
  copy_on_read = qemu_opt_get_bool(opts, copy-on-read, false);
+write_merging = qemu_opt_get_bool(opts, write-merging, true);


Using this option in blockdev_init() means that you can only enable
or disable merging for the top layer (the root BDS). Furthermore,
since you don't set bs->write_merging in bdrv_new() (or at least
bdrv_open()), it actually defaults to false and only for the top
layer it defaults to true.


Therefore, if after this patch a format block driver issues a 
multiwrite to its file, the write will not be merged and the user can 
do nothing about it. I don't suppose this is intentional...?
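
The defaulting problem described above can be sketched in a few lines. This is only an illustration under stated assumptions: SketchBDS and the sketch_* functions are hypothetical stand-ins, not QEMU code, and the sketch assumes the node allocator zero-fills the structure, which is what makes the flag start out false everywhere except where blockdev_init() sets it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, stripped-down stand-in for a BlockDriverState. */
typedef struct SketchBDS {
    bool write_merging;
    struct SketchBDS *file;   /* child node, e.g. the protocol layer */
} SketchBDS;

/* Assumption: allocation zero-fills, so write_merging starts as false. */
static inline SketchBDS *sketch_bdrv_new(void)
{
    SketchBDS *bs = calloc(1, sizeof(*bs));
    assert(bs != NULL);
    return bs;
}

/* Mirrors the patch: only the root node receives the option value. */
static inline void sketch_blockdev_init(SketchBDS *root, bool opt_write_merging)
{
    root->write_merging = opt_write_merging;
}

/*
 * Mirrors the early return added to multiwrite_merge(). The merge itself
 * is not modelled; the sketch pretends all requests collapse into one.
 */
static inline int sketch_multiwrite_merge(const SketchBDS *bs, int num_reqs)
{
    if (!bs->write_merging) {
        return num_reqs;          /* merging disabled: requests untouched */
    }
    return num_reqs > 0 ? 1 : 0;  /* placeholder for the real merge */
}
```

With the option left at its default of true, the root node merges while a child node created the same way does not, which is the asymmetry pointed out above.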


I am not sure if a block driver actually can do this at all? The only 
way to enter multiwrite is from virtio_blk_handle_request in 
virtio-blk.c.


Well, there's also qemu-io -c multiwrite (which only accesses the root 
BDS as well). But other than that, yes, you're right. So, in practice it 
shouldn't matter.






I propose evaluating the option in bdrv_open() and setting
bs->write_merging there.


I wasn't aware actually. I remember that someone asked me to implement
discard_zeroes in blockdev_init. I think it was something related to
QMP. So we still might need to check parameters at 2 positions? It is
quite confusing which parameter has to be parsed where.


As for me, I don't know why some options are parsed in blockdev_init() 
at all. I guess all the options currently parsed in blockdev_init() 
should later be moved to the BlockBackend, at least that would be the 
idea. In practice, we cannot do that: Things like caching will stay in 
the BlockDriverState.


I think it's just broken. IMHO, everything related to the BB should be 
in blockdev_init() and everything related to the BDS should be in 
bdrv_open(). So the question is now whether you want write_merging to be 
in the BDS or in the BB. Considering BB is in Kevin's block branch as of 
last Friday, you might actually want to work on that branch and move the 
field into the BB if you decide that that's the place it should be in.


Max


Peter




  if ((buf = qemu_opt_get(opts, discard)) != NULL) {
  if (bdrv_parse_discard_flags(buf, bdrv_flags) != 0) {
@@ -530,6 +532,7 @@ static DriveInfo *blockdev_init(const char 
*file, QDict *bs_opts,

  bs-open_flags = snapshot ? BDRV_O_SNAPSHOT : 0;
  bs-read_only = ro;
  bs-detect_zeroes = detect_zeroes;
+bs-write_merging = write_merging;
bdrv_set_on_error(bs, on_read_error, on_write_error);
  @@ -2746,6 +2749,10 @@ QemuOptsList qemu_common_drive_opts = {
  .name = detect-zeroes,
  .type = QEMU_OPT_STRING,
  .help = try to optimize zero writes (off, on, unmap),
+},{
+.name = write-merging,
+.type = QEMU_OPT_BOOL,
+ 

[Qemu-devel] [PATCH] intel_iommu: fix VTD_SID_TO_BUS

2014-10-20 Thread Michael S. Tsirkin
(((sid) << 8) & 0xff) makes no sense
(((sid) >> 8) & 0xff) seems to be what was meant.

Suggested-by: Markus Armbruster arm...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
---

Compile-tested only.

 include/hw/i386/intel_iommu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index f4701e1..e321ee4 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -37,7 +37,7 @@
 #define VTD_PCI_DEVFN_MAX   256
 #define VTD_PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
 #define VTD_PCI_FUNC(devfn) ((devfn) & 0x07)
-#define VTD_SID_TO_BUS(sid) (((sid) << 8) & 0xff)
+#define VTD_SID_TO_BUS(sid) (((sid) >> 8) & 0xff)
 #define VTD_SID_TO_DEVFN(sid)   ((sid) & 0xff)
 
 #define DMAR_REG_SIZE   0x230
-- 
MST
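
A small sketch of why the left shift cannot work, assuming the usual 16-bit source-id layout with the bus number in the high byte and devfn in the low byte. make_sid() and broken_sid_to_bus() are hypothetical helpers added for illustration only; the two macros are the forms from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Corrected forms: the bus number is the high byte of the source-id. */
#define VTD_SID_TO_BUS(sid)   (((sid) >> 8) & 0xff)
#define VTD_SID_TO_DEVFN(sid) ((sid) & 0xff)

/* Hypothetical helper: compose a source-id from bus and devfn. */
static inline uint16_t make_sid(uint8_t bus, uint8_t devfn)
{
    assert(bus <= 0xff);  /* always true for uint8_t; documents the range */
    return (uint16_t)(((unsigned)bus << 8) | devfn);
}

/*
 * The old expression: after shifting left by 8, the low byte is always
 * zero, so masking with 0xff yields 0 for every input.
 */
static inline unsigned broken_sid_to_bus(uint16_t sid)
{
    return ((unsigned)sid << 8) & 0xff;
}
```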



Re: [Qemu-devel] [PATCH 2/2] vl.c: reduce exit on error code duplication

2014-10-20 Thread Igor Mammedov
On Wed, 15 Oct 2014 08:35:53 -0600
Eric Blake ebl...@redhat.com wrote:

 On 10/15/2014 05:03 AM, Igor Mammedov wrote:
  use exit_if_error() helper instead of a bunch of
  if (local_err) {
  error_report(foo);
  error_free(local_err);
  exit(1);
  }
  code blocks
  
  Signed-off-by: Igor Mammedov imamm...@redhat.com
  ---
   vl.c | 58 ++
   1 file changed, 30 insertions(+), 28 deletions(-)
 
 Not much net change, but I like the refactoring.
 
 
   static int default_driver_check(QemuOpts *opts, void *opaque)
   {
   const char *driver = qemu_opt_get(opts, driver);
  @@ -2380,11 +2404,7 @@ static int chardev_init_func(QemuOpts *opts, void 
  *opaque)
   Error *local_err = NULL;
   
   qemu_chr_new_from_opts(opts, NULL, local_err);
  -if (local_err) {
  -error_report(%s, error_get_pretty(local_err));
  -error_free(local_err);
  -return -1;
  -}
  +exit_if_error(local_err, NULL);
   return 0;
   }
 
 Idea for followup patch: this function now always returns 0 (if it
 returns at all); therefore, change its signature to void and simplify
 further.
it won't work in this case since it's called by qemu_opts_foreach()
which requires return value.
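
To illustrate the constraint: a callback passed to a foreach-style iterator has to keep the iterator's callback signature even if it never reports failure. The sketch below uses hypothetical names (sketch_opts_foreach, sketch_opts_func); it is not the real qemu_opts_foreach() interface, only a model of the same pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: a nonzero return stops the iteration. */
typedef int (*sketch_opts_func)(const char *opt, void *opaque);

/* Hypothetical iterator: stops at the first nonzero return value. */
static inline int sketch_opts_foreach(const char *const *opts, size_t n,
                                      sketch_opts_func func, void *opaque)
{
    for (size_t i = 0; i < n; i++) {
        int rc = func(opts[i], opaque);
        if (rc != 0) {
            return rc;
        }
    }
    return 0;
}

/*
 * A callback that handles its own errors (e.g. by exiting) still has to
 * return int, so it ends with an unconditional "return 0".
 */
static inline int sketch_count_cb(const char *opt, void *opaque)
{
    assert(opt != NULL);
    (*(int *)opaque)++;
    return 0;
}
```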

 
   
  @@ -2790,12 +2810,7 @@ static int machine_set_property(const char *name, 
  const char *value,
   string_input_visitor_cleanup(siv);
   g_free(qom_name);
   
  -if (local_err) {
  -qerror_report_err(local_err);
  -error_free(local_err);
  -return -1;
  -}
  -
  +exit_if_error(local_err, NULL);
   return 0;
   }
 
 Same idea for simplification.
ditto

 
 But as that should be a separate patch, this one is:
 Reviewed-by: Eric Blake ebl...@redhat.com
 




Re: [Qemu-devel] qemu opengl status

2014-10-20 Thread Gerd Hoffmann
On Fr, 2014-10-17 at 15:59 +0200, Stéphane ANCELOT wrote:
 Hi,
 I would like to run OpenGL application in a pentium3 guest (host x86_64).
 
 I would like to know if it is possible.
 can you give me directions ?

Plan is to support that with virtio-gpu.  Still work in progress.
Expect it with qemu 2.3 earliest.  Chances are high that it'll take a
bit longer though.

https://www.kraxel.org/slides/qemu-gfx/

cheers,
  Gerd





Re: [Qemu-devel] Close the BlockDriverState when guest eject the media

2014-10-20 Thread Kevin Wolf
Am 18.10.2014 um 12:02 hat Weidong Huang geschrieben:
 Hi ALL:
 
 There are two ways to eject the cdrom tray. One is the eject QMP
 command (eject_device).
 The other is by the guest (bdrv_eject). They have different results.

Yes, they are different things.

If a guest opens the tray (using bdrv_eject) and then closes it again,
with no user interaction in between, the virtual media must still be in
the drive and the guest must be able to access the same image again.
Calling bdrv_close() in this case would be a bug.

The goal of the monitor command eject on the other hand is to remove
the medium so that the drive is empty. That a device with a closed tray
has to be opened for this is only secondary.
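
The distinction can be modelled with two separate pieces of state, tray position and inserted medium. This is a hypothetical sketch (SketchDrive and the sketch_* functions are made up for illustration), not the block layer's actual data model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical drive model: tray position and medium are independent. */
typedef struct SketchDrive {
    bool tray_open;
    const char *medium;   /* NULL when the drive is empty */
} SketchDrive;

/* Guest-initiated tray movement: the medium stays where it is. */
static inline void sketch_guest_set_tray(SketchDrive *d, bool open)
{
    assert(d != NULL);
    d->tray_open = open;
}

/* Monitor eject: the goal is an empty drive; opening the tray is incidental. */
static inline void sketch_monitor_eject(SketchDrive *d)
{
    assert(d != NULL);
    d->tray_open = true;
    d->medium = NULL;
}
```

After the guest opens and closes the tray, the same medium is still present; only the monitor-style eject leaves the drive empty.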

 eject_device: close the BlockDriverState(bdrv_close(bs))
 bdrv_eject: don't close the BlockDriverState,
 
 This is ambiguous. So libvirt can't handle some situations.
 
 libvirt sends eject qmp command ---> qemu sends eject request to guest --->
 guest responds to qemu ---> qemu emits tray_open event to libvirt --->
 libvirt will not send change qmp command if media source is null. So
 the media is not replaced with the null source.

What is the problem that libvirt has with the guest opening the tray? I
don't think libvirt should even care about that case.

Kevin

 So close the BlockDriverState in bdrv_eject. Thanks.
 
 diff --git a/block.c b/block.c
 index d3aebeb..0be69de 100644
 --- a/block.c
 +++ b/block.c
 @@ -5276,6 +5276,10 @@ void bdrv_eject(BlockDriverState *bs, bool eject_flag)
  qapi_event_send_device_tray_moved(bdrv_get_device_name(bs),
eject_flag, error_abort);
  }
 +
 +if (eject_flag) {
 +bdrv_close(bs);
 +}
  }
 



Re: [Qemu-devel] spec, RFC: TLS support for NBD

2014-10-20 Thread Stefan Hajnoczi
On Mon, Oct 20, 2014 at 08:58:14AM +0100, Daniel P. Berrange wrote:
 On Sat, Oct 18, 2014 at 07:33:22AM +0100, Richard W.M. Jones wrote:
  On Sat, Oct 18, 2014 at 12:03:23AM +0200, Wouter Verhelst wrote:
   Hi all,
   
   (added rjones from nbdkit fame -- hi there)
  
  [I'm happy to implement whatever you come up with, but I've added
  Florian Weimer to CC who is part of Red Hat's product security group]
  
   So I think the following would make sense to allow TLS in NBD.
   
   This would extend the newstyle negotiation by adding two options (i.e.,
   client requests), one server reply, and one server error as well as
   extend one existing reply, in the following manner:
   
   - The two new commands are NBD_OPT_PEEK_EXPORT and NBD_OPT_STARTTLS. The
 former would be used to verify if the server will do TLS for a given
 export:
   
 C: NBD_OPT_PEEK_EXPORT
 S: NBD_REP_SERVER, with an extra field after the export name
containing flags that describe the export (R/O vs R/W state,
whether TLS is allowed and/or required).
 
 IMHO the server should never provide *any* information about the exported
 volume(s) until the TLS layer has been fully setup. ie we shouldn't only
 think about the actual block data transfers, we should protect the entire
 NBD protocol even metadata related operations.

This makes sense.

TLS is about the transport, not about a particular NBD export.  The only
thing that should be communicated is STARTTLS.

Stefan




Re: [Qemu-devel] [PATCH V5 0/8] cpu/acpi: convert cpu hot plug to hotplug_handler API

2014-10-20 Thread Gu Zheng
Hi Igor,
How about this version?

Regards,
Gu
On 10/10/2014 10:15 AM, Gu Zheng wrote:

 Previously we use cpu_added_notifiers to register cpu hotplug notifier 
 callback
 which is not able to pass/handle errors, so we switch it to unified hotplug
 handler API which allows to pass errors and would allow to cancel device_add
 in case of error.
 Thanks very much for Igor's review and suggestion.
 
 ---
 v5:
  -rebase on the latest upstream and fix some comments.
  Patch 4/8:
  -split the check out of acpi_dev block.
  Patch 5/8:
  -move CPU hot-plug notifier cleanup hunk into Patch 6/8.
  Patch 6/8:
  -delete the caller of notifier_list_notify() in this patch.
  Patch 8/8:
  -rename acpi_set_local_sts to acpi_set_cpu_present_bit for better 
 readability.
 
 v4:
  -split removal of CPU hotplug notifier into separate patch (Patch 6/8).
  Patch 1/7:
  -convert CPUState *cpu to DeviceState *dev like it's done for other handlers
   and do cast to CPU inside.
  Patch 5/7:
  -Make rtc_state as a link property in PCMachine rather than the global
   variables.
  -Split out the removal of unused notifier into separate patch.
  -Check the result of plug callback before update rtc_state.
 
 v3:
  -deal with start-up cpus in pc_cpu_plug as Igor suggested.
 
 v2:
  -Add 3 new patches(5/7,6/7,7/7), delete original patch 5/5.
   1/5--1/7
   2/5--2/7
   3/5--3/7
   4/5--4/7
  Patch 1/7:
  -add errp argument to catch error.
  -return error instead of aborting if cpu id is invalid.
  -make acpi_cpu_plug_cb as a wrapper around AcpiCpuHotplug_add.
  Patch 3/7:
  -remove unused AcpiCpuHotplug_add directly.
  Patch 5/7:
  -switch the last user of cpu hotplug notifier to hotplug handler API, and
   remove the unused cpu hotplug notify.
  Patch 6/7:
  -split the function rename (just cleanup) into single patch.
  Patch 7/7:
  -introduce help function acpi_set_local_sts to keep the bit setting in
   one place.
 ---
 
 Gu Zheng (8):
   acpi/cpu: add cpu hotplug callback function to match hotplug_handler
 API
   acpi:ich9: convert cpu hotplug to hotplug_handler API
   acpi:piix4: convert cpu hotplug to hotplug_handler API
   pc: add cpu hotplug handler to PC_MACHINE
   pc: Update rtc_cmos in pc_cpu_plug
   qom/cpu: remove the unused CPU hot-plug notifier
   cpu-hotplug: rename function for better readability
   acpi/cpu-hotplug: introduce help function to keep bit setting in one
 place
 
  hw/acpi/cpu_hotplug.c |   35 --
  hw/acpi/ich9.c|   17 ++
  hw/acpi/piix4.c   |   18 ++-
  hw/i386/pc.c  |   65 
 +++--
  hw/i386/pc_piix.c |2 +-
  hw/i386/pc_q35.c  |2 +-
  include/hw/acpi/cpu_hotplug.h |7 ++--
  include/hw/acpi/ich9.h|1 -
  include/hw/i386/pc.h  |3 +-
  include/sysemu/sysemu.h   |3 --
  qom/cpu.c |   10 --
  11 files changed, 84 insertions(+), 79 deletions(-)
 





Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Peter Lieven

On 20.10.2014 11:27, Max Reitz wrote:

On 2014-10-20 at 11:14, Peter Lieven wrote:

On 20.10.2014 10:59, Max Reitz wrote:

On 2014-10-20 at 08:14, Peter Lieven wrote:

the block layer silently merges write requests since


s/^t/T/


commit 40b4f539. This patch adds a knob to disable
this feature as there has been some discussion lately
if multiwrite is a good idea at all and as it falsifies
benchmarks.

Signed-off-by: Peter Lieven p...@kamp.de
---
  block.c   |4 
  block/qapi.c  |1 +
  blockdev.c|7 +++
  hmp.c |4 
  include/block/block_int.h |1 +
  qapi/block-core.json  |   10 +-
  qemu-options.hx   |1 +
  qmp-commands.hx   |2 ++
  8 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 27533f3..1658a72 100644
--- a/block.c
+++ b/block.c
@@ -4531,6 +4531,10 @@ static int multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,
 {
     int i, outidx;
 
+    if (!bs->write_merging) {
+        return num_reqs;
+    }
+
     // Sort requests by start sector
     qsort(reqs, num_reqs, sizeof(*reqs), multiwrite_req_compare);
  diff --git a/block/qapi.c b/block/qapi.c
index 9733ebd..02251dd 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -58,6 +58,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs)
info-backing_file_depth = bdrv_get_backing_file_depth(bs);
  info-detect_zeroes = bs-detect_zeroes;
+info-write_merging = bs-write_merging;
if (bs-io_limits_enabled) {
  ThrottleConfig cfg;
diff --git a/blockdev.c b/blockdev.c
index e595910..13e47b8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
*bs_opts,
  const char *id;
  bool has_driver_specific_opts;
  BlockdevDetectZeroesOptions detect_zeroes;
+bool write_merging;
  BlockDriver *drv = NULL;
/* Check common options by copying from bs_opts to opts, all other 
options
@@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
*bs_opts,
  snapshot = qemu_opt_get_bool(opts, snapshot, 0);
  ro = qemu_opt_get_bool(opts, read-only, 0);
  copy_on_read = qemu_opt_get_bool(opts, copy-on-read, false);
+write_merging = qemu_opt_get_bool(opts, write-merging, true);


Using this option in blockdev_init() means that you can only enable or disable
merging for the top layer (the root BDS). Furthermore, since you don't set
bs->write_merging in bdrv_new() (or at least bdrv_open()), it actually defaults
to false and only for the top layer it defaults to true.


Therefore, if after this patch a format block driver issues a multiwrite to its 
file, the write will not be merged and the user can do nothing about it. I 
don't suppose this is intentional...?


I am not sure if a block driver actually can do this at all? The only way to 
enter multiwrite is from virtio_blk_handle_request in virtio-blk.c.


Well, there's also qemu-io -c multiwrite (which only accesses the root BDS as 
well). But other than that, yes, you're right. So, in practice it shouldn't 
matter.





I propose evaluating the option in bdrv_open() and setting bs->write_merging
there.


I wasn't aware actually. I remember that someone asked me to implement
discard_zeroes in blockdev_init. I think it was something related to QMP. So we
still might need to check parameters at 2 positions? It is quite confusing which
parameter has to be parsed where.


As for me, I don't know why some options are parsed in blockdev_init() at all. I guess all the options currently parsed in blockdev_init() should later be moved to the BlockBackend, at least that would be the idea. In practice, we cannot do that: Things 
like caching will stay in the BlockDriverState.


I think it's just broken. IMHO, everything related to the BB should be in blockdev_init() and everything related to the BDS should be in bdrv_open(). So the question is now whether you want write_merging to be in the BDS or in the BB. Considering BB is 
in Kevin's block branch as of last Friday, you might actually want to work on that branch and move the field into the BB if you decide that that's the place it should be in.


Actually there are pros and cons for both BDS and BB. As of now my intention
was to be able to turn it off. As there are people who would like to see it
completely disappear, I would not spend too much effort on that switch today.
Looking at BB, it is a BDS thing and thus belongs in bdrv_open. But this is true
for discard_zeroes (and others) as well. Kevin, Stefan, ultimately where
should it be parsed?

I have on my todo list the following to give you a figure what might happen. 
All this is for 2.3+ except for the accounting maybe:

 - add accounting for merged requests (to have a metric for modifications)
 - evaluate if sorting the requests really helps
 - simplify the merge conditions and do not keep a 

Re: [Qemu-devel] [PATCH 4/6] target-mips: add restrictions for possible values in registers

2014-10-20 Thread Yongbok Kim

On 14/07/2014 17:19, Leon Alrae wrote:

In Release 6 not all the values are allowed to be written to a register.
If the value is not valid or unsupported then it should stay unchanged.

For pre-R6 the existing behaviour has been changed only for CP0_Index register
as the current implementation does not seem to be correct - it looks like it
tries to limit the input value but the limit is higher than the actual
number of tlb entries.

Signed-off-by: Leon Alraeleon.al...@imgtec.com
---
  target-mips/op_helper.c |   63 ++
  1 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 431f3a1..4be435c 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -959,14 +959,14 @@ target_ulong helper_dmfc0_watchlo(CPUMIPSState *env, 
uint32_t sel)
  
  void helper_mtc0_index(CPUMIPSState *env, target_ulong arg1)

  {
-int num = 1;
-unsigned int tmp = env-tlb-nb_tlb;
-
-do {
-tmp = 1;
-num = 1;
-} while (tmp);
-env-CP0_Index = (env-CP0_Index  0x8000) | (arg1  (num - 1));
+uint32_t index_p = env-CP0_Index  0x8000;
+uint32_t tlb_index = arg1  0x7fff;
+if (tlb_index  env-tlb-nb_tlb) {
+if (env-insn_flags  ISA_MIPS32R6) {
+index_p |= arg1  0x8000;
+}
+env-CP0_Index = index_p | tlb_index;
+}


Agree to restrict index for pre R6 as well. It is UNDEFINED and software 
shouldn't rely on the undefined behaviour.
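
A standalone sketch of the Index write rule under discussion, since the operators in the quoted hunk did not survive the archive. It assumes the P bit is bit 31 of CP0 Index and that an out-of-range TLB index leaves the register unchanged; sketch_write_index() and its parameters are hypothetical and only model the logic, they are not the helper from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_INDEX_P 0x80000000u   /* assumed position of the P bit */

/*
 * Returns the new Index value. The write is dropped when the requested
 * TLB index is not below the number of TLB entries. Only on R6 may the
 * written value set the P bit; otherwise the old P bit is preserved.
 */
static inline uint32_t sketch_write_index(uint32_t old_index, uint32_t arg,
                                          uint32_t nb_tlb, bool is_r6)
{
    uint32_t index_p = old_index & SKETCH_INDEX_P;
    uint32_t tlb_index = arg & ~SKETCH_INDEX_P;

    assert(nb_tlb > 0);
    if (tlb_index >= nb_tlb) {
        return old_index;             /* invalid value: stay unchanged */
    }
    if (is_r6) {
        index_p |= arg & SKETCH_INDEX_P;
    }
    return index_p | tlb_index;
}
```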



  }
  
  void helper_mtc0_mvpcontrol(CPUMIPSState *env, target_ulong arg1)

@@ -1294,8 +1294,13 @@ void helper_mtc0_context(CPUMIPSState *env, target_ulong 
arg1)
  
  void helper_mtc0_pagemask(CPUMIPSState *env, target_ulong arg1)

  {
-/* 1k pages not implemented */


This comment is still valid, but it already appears several times in the
code.

Agreed to remove the comment here.


-env-CP0_PageMask = arg1  (0x1FFF  (TARGET_PAGE_MASK  1));
+uint64_t mask = arg1  (TARGET_PAGE_BITS + 1);
+if (!(env-insn_flags  ISA_MIPS32R6) || (arg1 == ~0) ||
+(mask == 0x || mask == 0x0003 || mask == 0x000F ||
+ mask == 0x003F || mask == 0x00FF || mask == 0x03FF ||
+ mask == 0x0FFF || mask == 0x3FFF || mask == 0x)) {
+env-CP0_PageMask = arg1  (0x1FFF  (TARGET_PAGE_MASK  1));
+}
  }
  
  void helper_mtc0_pagegrain(CPUMIPSState *env, target_ulong arg1)

@@ -1309,7 +1314,13 @@ void helper_mtc0_pagegrain(CPUMIPSState *env, 
target_ulong arg1)
  
  void helper_mtc0_wired(CPUMIPSState *env, target_ulong arg1)

  {
-env-CP0_Wired = arg1 % env-tlb-nb_tlb;
+if (env-insn_flags  ISA_MIPS32R6) {
+if (arg1  env-tlb-nb_tlb) {
+env-CP0_Wired = arg1;


Wired field should be compared with Limit field (and as a result, number 
of entries in the TLB).



+}
+} else {
+env-CP0_Wired = arg1 % env-tlb-nb_tlb;
+}
  }
  
  void helper_mtc0_srsconf0(CPUMIPSState *env, target_ulong arg1)

@@ -1368,11 +1379,14 @@ void helper_mtc0_entryhi(CPUMIPSState *env, 
target_ulong arg1)
  }
  
  /* 1k pages not implemented */

-val = arg1  mask;
  #if defined(TARGET_MIPS64)
-val = env-SEGMask;
+if ((env-insn_flags  ISA_MIPS32R6)  extract64(arg1, 62, 2) == 0x2) {
+mask = ~(0x3ull  62);


If Config0_AT = 1, R field is restricted for 1 as well.


+}
+mask = env-SEGMask;
  #endif
  old = env-CP0_EntryHi;
+val = (arg1  mask) | (old  ~mask);
  env-CP0_EntryHi = val;
  if (env-CP0_Config3  (1  CP0C3_MT)) {
  sync_c0_entryhi(env, env-current_tc);
@@ -1402,6 +1416,13 @@ void helper_mtc0_status(CPUMIPSState *env, target_ulong 
arg1)
  uint32_t val, old;
  uint32_t mask = env-CP0_Status_rw_bitmask;
  
+if (env-insn_flags  ISA_MIPS32R6) {

+if (extract32(env-CP0_Status, CP0St_KSU, 2) == 0x3) {
+mask = ~(3  CP0St_KSU);
+}
+mask = ~(0x0018  arg1);
+}
+
  val = arg1  mask;
  old = env-CP0_Status;
  env-CP0_Status = (env-CP0_Status  ~mask) | val;
@@ -1457,6 +1478,9 @@ static void mtc0_cause(CPUMIPSState *cpu, target_ulong 
arg1)
  if (cpu-insn_flags  ISA_MIPS32R2) {
  mask |= 1  CP0Ca_DC;
  }
+if (cpu-insn_flags  ISA_MIPS32R6) {
+mask = ~((1  CP0Ca_WP)  arg1);
+}
  
  cpu-CP0_Cause = (cpu-CP0_Cause  ~mask) | (arg1  mask);
  
@@ -2381,8 +2405,9 @@ void helper_ctc1(CPUMIPSState *env, target_ulong arg1, uint32_t fs, uint32_t rt)

  }
  break;
  case 25:
-if (arg1  0xff00)
+if (env-insn_flags  ISA_MIPS32R6 || arg1  0xff00) {
  return;
+}
  env-active_fpu.fcr31 = (env-active_fpu.fcr31  0x017f) | ((arg1  
0xfe)  24) |
   ((arg1  0x1)  23);
  break;
@@ -2398,9 +2423,13 @@ void helper_ctc1(CPUMIPSState *env, target_ulong arg1, 
uint32_t fs, uint32_t rt)
   ((arg1  0x4) 

Re: [Qemu-devel] [PATCH 2/7] runstate: Add runstate store

2014-10-20 Thread Dr. David Alan Gilbert
* Juan Quintela (quint...@redhat.com) wrote:
 This allows us to store the current state to send it through migration.

Why store the runstate as a string?  The later code then ends up doing
string compares and things - why not just use the enum value?
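
To make the trade-off concrete, here is a hypothetical sketch of both representations (SketchRunState and the sketch_* names are invented for illustration and are not QEMU's RunState). With a string, the receiver has to map the name back and handle unknown names; with the enum value, validation is a range check.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical two-state enum standing in for the real run states. */
typedef enum {
    SKETCH_RUN_STATE_RUNNING,
    SKETCH_RUN_STATE_PAUSED,
    SKETCH_RUN_STATE_MAX
} SketchRunState;

static const char *const sketch_lookup[SKETCH_RUN_STATE_MAX] = {
    "running",
    "paused",
};

/* String form: linear search with strcmp, returns -1 for unknown names. */
static inline int sketch_state_from_name(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < SKETCH_RUN_STATE_MAX; i++) {
        if (strcmp(name, sketch_lookup[i]) == 0) {
            return i;
        }
    }
    return -1;
}

/* Enum form: a received integer only needs a range check. */
static inline int sketch_state_valid(int value)
{
    return value >= 0 && value < SKETCH_RUN_STATE_MAX;
}
```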

Dave

 Signed-off-by: Juan Quintela quint...@redhat.com
 ---
  include/sysemu/sysemu.h |  1 +
  vl.c| 10 ++
  2 files changed, 11 insertions(+)
 
 diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
 index d8539fd..ae217da 100644
 --- a/include/sysemu/sysemu.h
 +++ b/include/sysemu/sysemu.h
 @@ -28,6 +28,7 @@ bool runstate_check(RunState state);
  void runstate_set(RunState new_state);
  int runstate_is_running(void);
  bool runstate_needs_reset(void);
 +int runstate_store(char *str, int size);
  typedef struct vm_change_state_entry VMChangeStateEntry;
  typedef void VMChangeStateHandler(void *opaque, int running, RunState state);
 
 diff --git a/vl.c b/vl.c
 index 964d634..ce8e28b 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -677,6 +677,16 @@ bool runstate_check(RunState state)
  return current_run_state == state;
  }
 
 +int runstate_store(char *str, int size)
 +{
 +const char *state = RunState_lookup[current_run_state];
 +
 +if (strlen(state)+1 > size)
 +return -1;
 +strncpy(str, state, strlen(state)+1);
 +return 0;
 +}
 +
  static void runstate_init(void)
  {
  const RunStateTransition *p;
 -- 
 2.1.0
 
 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] Patch checking bot

2014-10-20 Thread Stefan Hajnoczi
Hi,
At KVM Forum 2014 we discussed a patch checking bot that automates patch
format checking and smoke testing:

1. Did the patch submitter include Signed-off-by?
2. Does checkpatch.pl pass?
3. Does the patch apply to qemu.git/master?
4. Does each patch compile?
5. Does the series pass make check and qemu-iotests?
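
As a rough idea of how cheap the first check is, a minimal sketch follows. It assumes the patch is available as one NUL-terminated string and that a sign-off counts only when "Signed-off-by:" starts a line; sketch_has_signed_off_by() is a hypothetical name, not part of any existing tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if any line of the patch text starts with the sign-off tag. */
static inline bool sketch_has_signed_off_by(const char *patch)
{
    static const char tag[] = "Signed-off-by:";
    const char *line = patch;

    assert(patch != NULL);
    while (line != NULL && *line != '\0') {
        if (strncmp(line, tag, sizeof(tag) - 1) == 0) {
            return true;
        }
        line = strchr(line, '\n');   /* advance to the next line, if any */
        if (line != NULL) {
            line++;
        }
    }
    return false;
}
```

Quoted or indented occurrences are deliberately not counted, so a reply that merely quotes a sign-off would not pass.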

Here are some thoughts on the patch checker:

If a patch series passes successfully, no email is sent.  If a patch
series fails, an email with the errors is sent as a reply to the patch
series email thread.  The patch submitter can then respond in case there
are false positive (e.g. from checkpatch.pl) - the bot doesn't care
about replies but it tells the human reviewers and maintainers what the
patch submitter intends to do.

The bot should detect new patches within 15 minutes so humans can rely
on it to perform these basic checks before they review the patch series.

There should be a web page showing the check status of each patch series
on the mailing list.  This allows anyone to see which patch series have
passed, failed, or are pending check.

Ideas on the implementation:

The patches tool allows querying patch series on the mailing list.  It
can be used to apply patches to a git tree and display patches in mbox
format:

  https://github.com/stefanha/patches/tree/stefanha-tweaks

Patch series contain untrusted code so it is critical that operations
are performed inside a sandbox.  Otherwise people could send email to
qemu-devel@nongnu.org with Makefile or checkpatch.pl changes that get
executed with the bot's privileges!

Use docker or lxc to run a container for builds.  The root file system
should be fresh for each build so previous builds cannot affect later
ones.  The container cannot have external networking connectivity (for
security).

Include automated deployment scripts so bot instances can be created
easily.  Here is an example of automated deployment scripts written with
Fabric that I use for VM that builds the QEMU patches database:

  https://github.com/stefanha/qemu-patches




Re: [Qemu-devel] [PATCH 0/3] parallels: additional iotests and a minor bugfix

2014-10-20 Thread Denis V. Lunev

On 08/10/14 13:13, Denis V. Lunev wrote:

Please find here authentic test material, i.e. Parallels images
with WithoutFreeSpace and WithouFreSpacExt signatures created
in an authentic way, plus a minor bug fix for an access to uninitialized
memory found by valgrind.

Signed-off-by: Denis V. Lunev d...@openvz.org
CC: Jeff Cody jc...@redhat.com
CC: Kevin Wolf kw...@redhat.com
CC: Stefan Hajnoczi stefa...@redhat.com

ping



Re: [Qemu-devel] [PATCH 5/6] target-mips: correctly handle access to unimplemented CP0 register

2014-10-20 Thread Yongbok Kim

On 14/07/2014 17:19, Leon Alrae wrote:

Release 6 limits the number of cases where software can cause UNDEFINED or
UNPREDICTABLE behaviour. In this case, when accessing reserved / unimplemented
CP0 register, writes are ignored and reads return 0.

In pre-R6 the behaviour is not specified, but generating RI exception is not
what the real HW does.

Additionally, remove CP0 Random register as it became reserved in Release 6.

Signed-off-by: Leon Alrae leon.al...@imgtec.com
---
  target-mips/translate.c |  546 +++
  1 files changed, 264 insertions(+), 282 deletions(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 4ed81fe..cd20f35 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -4627,6 +4627,13 @@ static inline void gen_mfc0_unimplemented(DisasContext 
*ctx, TCGv arg)
  }
  }
  
+#define CP0_CHECK(c)\

+do {\
+if (!(c)) { \
+goto cp0_unimplemented; \
+}   \
+} while (0)
+
  static void gen_mfc0(DisasContext *ctx, TCGv arg, int reg, int sel)
  {
  const char *rn = invalid;
@@ -4642,67 +4649,68 @@ static void gen_mfc0(DisasContext *ctx, TCGv arg, int 
reg, int sel)
  rn = Index;
  break;
  case 1:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_helper_mfc0_mvpcontrol(arg, cpu_env);
  rn = MVPControl;
  break;
  case 2:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_helper_mfc0_mvpconf0(arg, cpu_env);
  rn = MVPConf0;
  break;
  case 3:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_helper_mfc0_mvpconf1(arg, cpu_env);
  rn = MVPConf1;
  break;
  default:
-goto die;
+goto cp0_unimplemented;
  }
  break;
  case 1:
  switch (sel) {
  case 0:
+CP0_CHECK(!(ctx-insn_flags  ISA_MIPS32R6));
  gen_helper_mfc0_random(arg, cpu_env);
  rn = Random;
  break;
  case 1:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_mfc0_load32(arg, offsetof(CPUMIPSState, CP0_VPEControl));
  rn = VPEControl;
  break;
  case 2:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_mfc0_load32(arg, offsetof(CPUMIPSState, CP0_VPEConf0));
  rn = VPEConf0;
  break;
  case 3:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_mfc0_load32(arg, offsetof(CPUMIPSState, CP0_VPEConf1));
  rn = VPEConf1;
  break;
  case 4:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_mfc0_load64(arg, offsetof(CPUMIPSState, CP0_YQMask));
  rn = YQMask;
  break;
  case 5:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_mfc0_load64(arg, offsetof(CPUMIPSState, CP0_VPESchedule));
  rn = VPESchedule;
  break;
  case 6:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_mfc0_load64(arg, offsetof(CPUMIPSState, CP0_VPEScheFBack));
  rn = VPEScheFBack;
  break;
  case 7:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_mfc0_load32(arg, offsetof(CPUMIPSState, CP0_VPEOpt));
  rn = VPEOpt;
  break;
  default:
-goto die;
+goto cp0_unimplemented;
  }
  break;
  case 2:
@@ -4722,42 +4730,42 @@ static void gen_mfc0(DisasContext *ctx, TCGv arg, int 
reg, int sel)
  rn = EntryLo0;
  break;
  case 1:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_helper_mfc0_tcstatus(arg, cpu_env);
  rn = TCStatus;
  break;
  case 2:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_helper_mfc0_tcbind(arg, cpu_env);
  rn = TCBind;
  break;
  case 3:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_helper_mfc0_tcrestart(arg, cpu_env);
  rn = TCRestart;
  break;
  case 4:
-check_insn(ctx, ASE_MT);
+CP0_CHECK(ctx-insn_flags  ASE_MT);
  gen_helper_mfc0_tchalt(arg, 

Re: [Qemu-devel] [PATCH 5/7] migration: create now section to store global state

2014-10-20 Thread Dr. David Alan Gilbert
* Juan Quintela (quint...@redhat.com) wrote:
 This includes a new section that for now just stores the current qemu state.
 
 Right now, there is only one way to control the state of the
 target after migration.
 
 - If you run the target qemu with -S, it starts stopped.
 - If you run the target qemu without -S, it runs just after migration
 finishes.
 
 The problem here is what happens if we start the target without -S and
 an error occurs during migration that puts the current state as
 -EIO.  Migration would end (notice that the error happened doing block
 IO, network IO, i.e. nothing related to migration), and when
 migration finishes, we would just continue running on the destination,
 probably hanging the guest, corrupting data, whatever.

A couple of questions:
  1) Does the ordering of loading this state matter - let's say that the source
     was in an error state, then all the other device states get loaded and then
     it loads your global_state which tells the destination it is in error - is
     that too late? Could the device emulation have already started doing some IO
     or something to the devices which it wouldn't have done if it knew there
     was already a problem?

  2) What's the advantage of the optional section over using the 'command'
     sections I use in postcopy;
     http://lists.gnu.org/archive/html/qemu-devel/2014-10/msg00337.html ?

Dave

 Signed-off-by: Juan Quintela quint...@redhat.com
 ---
  include/migration/migration.h |  4 ++
  migration.c   | 88 
 +--
  vl.c  |  1 +
  3 files changed, 90 insertions(+), 3 deletions(-)
 
 diff --git a/include/migration/migration.h b/include/migration/migration.h
 index 3cb5ba8..bc1069b 100644
 --- a/include/migration/migration.h
 +++ b/include/migration/migration.h
 @@ -174,4 +174,8 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t 
 block_offset,
   ram_addr_t offset, size_t size,
   int *bytes_sent);
 
 +void register_global_state(void);
 +void global_state_store(void);
 +char *global_state_get_runstate(void);
 +
  #endif
 diff --git a/migration.c b/migration.c
 index 8d675b3..6f7e50e 100644
 --- a/migration.c
 +++ b/migration.c
 @@ -112,10 +112,20 @@ static void process_incoming_migration_co(void *opaque)
  exit(EXIT_FAILURE);
  }
 
 -if (autostart) {
 +/* runstate == "" means that we haven't received it through the
 + * wire, so we obey autostart.  runstate == "running" means that we
 + * need to run it, we need to make sure that we do it after
 + * everything else has finished.  Every other state change is done
 + * at the post_load function */
 +
 +if (strcmp(global_state_get_runstate(), "running") == 0 ) {
  vm_start();
 -} else {
 -runstate_set(RUN_STATE_PAUSED);
 +} else if (strcmp(global_state_get_runstate(), "") == 0 ) {
 +if (autostart) {
 +vm_start();
 +} else {
 +runstate_set(RUN_STATE_PAUSED);
 +}
  }
  }
 
 @@ -608,6 +618,7 @@ static void *migration_thread(void *opaque)
  qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
  old_vm_running = runstate_is_running();
 
 +global_state_store();
  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
  if (ret >= 0) {
  qemu_file_set_rate_limit(s->file, INT64_MAX);
 @@ -699,3 +710,74 @@ void migrate_fd_connect(MigrationState *s)
  qemu_thread_create(&s->thread, "migration", migration_thread, s,
 QEMU_THREAD_JOINABLE);
  }
 +
 +typedef struct {
 +int32_t size;
 +uint8_t runstate[100];
 +} GlobalState;
 +
 +static GlobalState global_state;
 +
 +void global_state_store(void)
 +{
 +if (runstate_store((char*)global_state.runstate,
 +   sizeof(global_state.runstate)) == -1) {
 +printf("Runstate is too big\n");
 +exit(-1);
 +}
 +}
 +
 +char *global_state_get_runstate(void)
 +{
 +return (char *)global_state.runstate;
 +}
 +
 +static int global_state_post_load(void *opaque, int version_id)
 +{
 +GlobalState *s = opaque;
 +int ret = 0;
 +char *runstate = (char*)s->runstate;
 +
 +printf("loaded state: %s\n", runstate);
 +
 +if (strcmp(runstate, "running") != 0) {
 +
 +RunState r = runstate_index(runstate);
 +
 +if (r == -1) {
 +printf("Unknown received state %s\n", runstate);
 +return -1;
 +}
 +ret = vm_stop_force_state(r);
 +}
 +
 +   return ret;
 +}
 +
 +static void global_state_pre_save(void *opaque)
 +{
 +GlobalState *s = opaque;
 +
 +s->size = strlen((char*)s->runstate) + 1;
 +printf("saved state: %s\n", s->runstate);
 +}
 +
 +static const VMStateDescription vmstate_globalstate = {
 +.name = "globalstate",
 +.version_id = 1,
 +.minimum_version_id = 1,
 

Re: [Qemu-devel] [PATCH 2/7] runstate: Add runstate store

2014-10-20 Thread Juan Quintela
Dr. David Alan Gilbert dgilb...@redhat.com wrote:
 * Juan Quintela (quint...@redhat.com) wrote:
 This allows us to store the current state to send it through migration.

 Why store the runstate as a string?  The later code then ends up doing
 string compares and things - why not just use the enum value?

How do you know that it has the same values on both sides?  As far as I can
see, all interaction with the outside is done with strings (i.e. QMP).

But it is easier for me if I can send the numeric value.

Libvirt folks?
Luiz?

What should I do?

Later, Juan.
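
The trade-off discussed here (names on the wire versus raw enum values) can be illustrated with a minimal sketch. The enum `DemoRunState`, the table `demo_runstate_names` and both helpers are hypothetical stand-ins, not QEMU's actual `RunState` definitions. The point is only that a name survives a renumbering of the enum between versions, at the cost of a lookup on the receiving side.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical run states; QEMU's real RunState enum has more members. */
typedef enum {
    DEMO_STATE_PAUSED,
    DEMO_STATE_RUNNING,
    DEMO_STATE_IO_ERROR,
    DEMO_STATE_MAX
} DemoRunState;

static const char *const demo_runstate_names[DEMO_STATE_MAX] = {
    [DEMO_STATE_PAUSED]   = "paused",
    [DEMO_STATE_RUNNING]  = "running",
    [DEMO_STATE_IO_ERROR] = "io-error",
};

/* Copy the name of `state` into buf. Returns 0 on success, or -1 if the
 * name (including its terminating NUL) does not fit into `size` bytes. */
static int demo_runstate_store(DemoRunState state, char *buf, size_t size)
{
    assert((int)state >= 0 && state < DEMO_STATE_MAX);

    const char *name = demo_runstate_names[state];
    size_t len = strlen(name) + 1;

    if (len > size) {
        return -1;
    }
    memcpy(buf, name, len);
    return 0;
}

/* Map a received name back to the local enum value, or -1 if unknown. */
static int demo_runstate_index(const char *name)
{
    for (int i = 0; i < DEMO_STATE_MAX; i++) {
        if (strcmp(name, demo_runstate_names[i]) == 0) {
            return i;
        }
    }
    return -1;
}
```

Sending the numeric value instead would remove the lookup, but both sides would then have to agree on the numbering forever.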


 Dave

 Signed-off-by: Juan Quintela quint...@redhat.com
 ---
  include/sysemu/sysemu.h |  1 +
  vl.c| 10 ++
  2 files changed, 11 insertions(+)
 
 diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
 index d8539fd..ae217da 100644
 --- a/include/sysemu/sysemu.h
 +++ b/include/sysemu/sysemu.h
 @@ -28,6 +28,7 @@ bool runstate_check(RunState state);
  void runstate_set(RunState new_state);
  int runstate_is_running(void);
  bool runstate_needs_reset(void);
 +int runstate_store(char *str, int size);
  typedef struct vm_change_state_entry VMChangeStateEntry;
  typedef void VMChangeStateHandler(void *opaque, int running, RunState state);
 
 diff --git a/vl.c b/vl.c
 index 964d634..ce8e28b 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -677,6 +677,16 @@ bool runstate_check(RunState state)
  return current_run_state == state;
  }
 
 +int runstate_store(char *str, int size)
 +{
 +const char *state = RunState_lookup[current_run_state];
 +
 +if (strlen(state)+1 > size)
 +return -1;
 +strncpy(str, state, strlen(state)+1);
 +return 0;
 +}
 +
  static void runstate_init(void)
  {
  const RunStateTransition *p;
 -- 
 2.1.0
 
 
 --
 Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH V5 0/8] cpu/acpi: convert cpu hot plug to hotplug_handler API

2014-10-20 Thread Igor Mammedov
On Mon, 20 Oct 2014 17:42:40 +0800
Gu Zheng guz.f...@cn.fujitsu.com wrote:

 Hi Igor,
 How about this version?
I'm reviewing it now.

 
 Regards,
 Gu
 On 10/10/2014 10:15 AM, Gu Zheng wrote:
 
  Previously we used cpu_added_notifiers to register the cpu hotplug notifier
  callback, which is not able to pass/handle errors, so we switch to the unified
  hotplug handler API, which allows passing errors and cancelling device_add
  in case of error.
  Thanks very much for Igor's review and suggestion.
  
  ---
  v5:
   -rebase on the latest upstream and fix some comments.
   Patch 4/8:
   -split the check out of acpi_dev block.
   Patch 5/8:
   -move CPU hot-plug notifier cleanup hunk into Patch 6/8.
   Patch 6/8:
   -delete the caller of notifier_list_notify() in this patch.
   Patch 8/8:
   -rename acpi_set_local_sts to acpi_set_cpu_present_bit for better readability.
  
  v4:
   -split removal of CPU hotplug notifier into separate patch (Patch 6/8).
   Patch 1/7:
   -convert CPUState *cpu to DeviceState *dev like it's done for other 
  handlers
and do cast to CPU inside.
   Patch 5/7:
   -Make rtc_state as a link property in PCMachine rather than the global
variables.
   -Split out the removal of unused notifier into separate patch.
   -Check the result of plug callback before update rtc_state.
  
  v3:
   -deal with start-up cpus in pc_cpu_plug as Igor suggested.
  
  v2:
   -Add 3 new patches(5/7,6/7,7/7), delete original patch 5/5.
1/5-->1/7
2/5-->2/7
3/5-->3/7
4/5-->4/7
   Patch 1/7:
   -add errp argument to catch error.
   -return error instead of aborting if cpu id is invalid.
   -make acpi_cpu_plug_cb as a wrapper around AcpiCpuHotplug_add.
   Patch 3/7:
   -remove unused AcpiCpuHotplug_add directly.
   Patch 5/7:
   -switch the last user of cpu hotplug notifier to hotplug handler API, and
remove the unused cpu hotplug notify.
   Patch 6/7:
   -split the function rename (just cleanup) into single patch.
   Patch 7/7:
   -introduce help function acpi_set_local_sts to keep the bit setting in
one place.
  ---
  
  Gu Zheng (8):
acpi/cpu: add cpu hotplug callback function to match hotplug_handler
  API
acpi:ich9: convert cpu hotplug to hotplug_handler API
acpi:piix4: convert cpu hotplug to hotplug_handler API
pc: add cpu hotplug handler to PC_MACHINE
pc: Update rtc_cmos in pc_cpu_plug
qom/cpu: remove the unused CPU hot-plug notifier
cpu-hotplug: rename function for better readability
acpi/cpu-hotplug: introduce help function to keep bit setting in one
  place
  
   hw/acpi/cpu_hotplug.c |   35 --
   hw/acpi/ich9.c|   17 ++
   hw/acpi/piix4.c   |   18 ++-
   hw/i386/pc.c  |   65 
  +++--
   hw/i386/pc_piix.c |2 +-
   hw/i386/pc_q35.c  |2 +-
   include/hw/acpi/cpu_hotplug.h |7 ++--
   include/hw/acpi/ich9.h|1 -
   include/hw/i386/pc.h  |3 +-
   include/sysemu/sysemu.h   |3 --
   qom/cpu.c |   10 --
   11 files changed, 84 insertions(+), 79 deletions(-)
  
 
 
 




Re: [Qemu-devel] [PATCH 2/6] target-mips: implement forbidden slot

2014-10-20 Thread Yongbok Kim

On 14/07/2014 17:19, Leon Alrae wrote:

When a conditional compact branch is encountered, decode one more instruction in
the current translation block - that will be the forbidden slot. The instruction in
the forbidden slot will be executed only if the conditional compact branch is not taken.

Any control transfer instruction (CTI), i.e. branches, jumps, ERET,
DERET, WAIT and PAUSE, will generate an RI exception if executed in a forbidden or
delay slot.

Signed-off-by: Leon Alrae leon.al...@imgtec.com
---
  target-mips/cpu.h   |5 ++-
  target-mips/translate.c |   89 +++---
  2 files changed, 63 insertions(+), 31 deletions(-)

diff --git a/target-mips/cpu.h b/target-mips/cpu.h
index 2a762d2..a35ab9d 100644
--- a/target-mips/cpu.h
+++ b/target-mips/cpu.h
@@ -462,7 +462,7 @@ struct CPUMIPSState {
  #define EXCP_INST_NOTAVAIL 0x2 /* No valid instruction word for BadInstr */
  uint32_t hflags;/* CPU State */
  /* TMASK defines different execution modes */
-#define MIPS_HFLAG_TMASK  0x2C07FF
+#define MIPS_HFLAG_TMASK  0x6C07FF
  #define MIPS_HFLAG_MODE   0x7 /* execution modes*/
  /* The KSU flags must be the lowest bits in hflags. The flag order
 must be the same as defined for CP0 Status. This allows to use
@@ -488,7 +488,7 @@ struct CPUMIPSState {
   * the delay slot, record what type of branch it is so that we can
   * resume translation properly.  It might be possible to reduce
   * this from three bits to two.  */
-#define MIPS_HFLAG_BMASK_BASE  0x03800
+#define MIPS_HFLAG_BMASK_BASE  0x403800
  #define MIPS_HFLAG_B  0x00800 /* Unconditional branch   */
  #define MIPS_HFLAG_BC 0x01000 /* Conditional branch */
  #define MIPS_HFLAG_BL 0x01800 /* Likely branch  */
@@ -506,6 +506,7 @@ struct CPUMIPSState {
  /* Extra flag about HWREna register. */
  #define MIPS_HFLAG_HWRENA_ULR 0x10 /* ULR bit from HWREna is set. */
  #define MIPS_HFLAG_SBRI  0x20 /* R6 SDBBP causes RI excpt. in user mode */
+#define MIPS_HFLAG_FBNSLOT 0x40 /* Forbidden slot   */
  target_ulong btarget;/* Jump / branch target   */
  target_ulong bcond;  /* Branch condition (if needed)   */
  
diff --git a/target-mips/translate.c b/target-mips/translate.c

index d0f695a..4ed81fe 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -7686,12 +7686,20 @@ static void gen_cp0 (CPUMIPSState *env, DisasContext *ctx, uint32_t opc, int rt,
  case OPC_ERET:
  opn = "eret";
  check_insn(ctx, ISA_MIPS2);
+if (ctx->insn_flags & ISA_MIPS32R6 && ctx->hflags & MIPS_HFLAG_BMASK) {
+MIPS_DEBUG("CTI in delay / forbidden slot");
+goto die;
+}
  gen_helper_eret(cpu_env);
  ctx->bstate = BS_EXCP;
  break;
  case OPC_DERET:
  opn = "deret";
  check_insn(ctx, ISA_MIPS32);
+if (ctx->insn_flags & ISA_MIPS32R6 && ctx->hflags & MIPS_HFLAG_BMASK) {
+MIPS_DEBUG("CTI in delay / forbidden slot");
+goto die;
+}
  if (!(ctx->hflags & MIPS_HFLAG_DM)) {
  MIPS_INVAL(opn);
  generate_exception(ctx, EXCP_RI);
@@ -7703,6 +7711,10 @@ static void gen_cp0 (CPUMIPSState *env, DisasContext *ctx, uint32_t opc, int rt,
  case OPC_WAIT:
  opn = "wait";
  check_insn(ctx, ISA_MIPS3 | ISA_MIPS32);
+if (ctx->insn_flags & ISA_MIPS32R6 && ctx->hflags & MIPS_HFLAG_BMASK) {
+MIPS_DEBUG("CTI in delay / forbidden slot");
+goto die;
+}
  /* If we get an exception, we want to restart at next instruction */
  ctx->pc += 4;
  save_cpu_state(ctx, 1);
@@ -7729,6 +7741,12 @@ static void gen_compute_branch1(DisasContext *ctx, uint32_t op,
  const char *opn = "cp1 cond branch";
  TCGv_i32 t0 = tcg_temp_new_i32();
  
+if (ctx->insn_flags & ISA_MIPS32R6 && ctx->hflags & MIPS_HFLAG_BMASK) {
+MIPS_DEBUG("CTI in delay / forbidden slot");
+generate_exception(ctx, EXCP_RI);
+goto out;
+}
+
  if (cc != 0)
  check_insn(ctx, ISA_MIPS4 | ISA_MIPS32);
  
@@ -10299,6 +10317,10 @@ static void gen_branch(DisasContext *ctx, int insn_bytes)
  save_cpu_state(ctx, 0);
  /* FIXME: Need to clear can_do_io.  */
  switch (proc_hflags & MIPS_HFLAG_BMASK_BASE) {
+case MIPS_HFLAG_FBNSLOT:
+MIPS_DEBUG("forbidden slot");
+gen_goto_tb(ctx, 0, ctx->pc + insn_bytes);
+break;
  case MIPS_HFLAG_B:
  /* unconditional branch */
  MIPS_DEBUG("unconditional branch");
@@ -15711,56 +15733,56 @@ static void gen_compute_compact_branch(DisasContext *ctx, uint32_t opc,
  gen_branch(ctx, 4);
  } else {
  /* Conditional compact branch */
-int l1 = gen_new_label();
+int fs = gen_new_label();
  save_cpu_state(ctx, 0);
  

Re: [Qemu-devel] [PATCH 5/7] migration: create now section to store global state

2014-10-20 Thread Kevin Wolf
Am 15.10.2014 um 09:55 hat Juan Quintela geschrieben:
 This includes a new section that for now just stores the current qemu state.
 
 Right now, there is only one way to control what the state of the
 target is after migration.
 
 - If you run the target qemu with -S, it would start stopped.
 - If you run the target qemu without -S, it would run just after migration
 finishes.
 
 The problem here is what happens if we start the target without -S and
 an error happens during migration that puts the current state as
 -EIO.  Migration would end (notice that the error happened doing block
 IO, network IO, i.e. nothing related to migration), and when
 migration finishes, we would just continue running on the destination,
 probably hanging the guest, corrupting data, whatever.
 
 Signed-off-by: Juan Quintela quint...@redhat.com
 ---
  include/migration/migration.h |  4 ++
  migration.c   | 88 
 +--
  vl.c  |  1 +
  3 files changed, 90 insertions(+), 3 deletions(-)
 
 diff --git a/include/migration/migration.h b/include/migration/migration.h
 index 3cb5ba8..bc1069b 100644
 --- a/include/migration/migration.h
 +++ b/include/migration/migration.h
 @@ -174,4 +174,8 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t 
 block_offset,
   ram_addr_t offset, size_t size,
   int *bytes_sent);
 
 +void register_global_state(void);
 +void global_state_store(void);
 +char *global_state_get_runstate(void);
 +
  #endif
 diff --git a/migration.c b/migration.c
 index 8d675b3..6f7e50e 100644
 --- a/migration.c
 +++ b/migration.c
 @@ -112,10 +112,20 @@ static void process_incoming_migration_co(void *opaque)
  exit(EXIT_FAILURE);
  }
 
 -if (autostart) {
 +/* runstate == "" means that we haven't received it through the
 + * wire, so we obey autostart.  runstate == "running" means that we
 + * need to run it, we need to make sure that we do it after
 + * everything else has finished.  Every other state change is done
 + * at the post_load function */
 +
 +if (strcmp(global_state_get_runstate(), "running") == 0 ) {
  vm_start();

Does this mean that -S is now ignored in the common case? Wouldn't it be
better to change only the case without -S? Otherwise I guess libvirt
will get quite confused.
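
One possible reading of this suggestion, sketched under stated assumptions: `-S` (autostart off) always leaves the guest paused, and the migrated run state is consulted only when autostart is on. `DemoAction` and `demo_incoming_action()` are hypothetical names used for illustration. The real code would call `vm_start()`, `runstate_set()` or `vm_stop_force_state()` rather than return a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    DEMO_ACTION_START,        /* resume the guest */
    DEMO_ACTION_PAUSE,        /* stay paused, as requested by -S */
    DEMO_ACTION_KEEP_STOPPED  /* keep the non-running state sent by the source */
} DemoAction;

/* `received` is the run state name from the wire, or "" if the source
 * did not send one (e.g. an older QEMU). */
static DemoAction demo_incoming_action(const char *received, bool autostart)
{
    assert(received != NULL);

    if (!autostart) {
        /* -S was given: management software expects a paused guest. */
        return DEMO_ACTION_PAUSE;
    }
    if (received[0] == '\0' || strcmp(received, "running") == 0) {
        return DEMO_ACTION_START;
    }
    /* Source was stopped (e.g. after an I/O error): do not auto-start. */
    return DEMO_ACTION_KEEP_STOPPED;
}
```

With this ordering, existing callers that pass `-S` see no change in behaviour; only the case without `-S` starts honouring the migrated state.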

 -} else {
 -runstate_set(RUN_STATE_PAUSED);
 +} else if (strcmp(global_state_get_runstate(), "") == 0 ) {
 +if (autostart) {
 +vm_start();
 +} else {
 +runstate_set(RUN_STATE_PAUSED);
 +}
  }
  }
 
 @@ -608,6 +618,7 @@ static void *migration_thread(void *opaque)
  qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
  old_vm_running = runstate_is_running();
 
 +global_state_store();
  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
  if (ret >= 0) {
  qemu_file_set_rate_limit(s->file, INT64_MAX);
 @@ -699,3 +710,74 @@ void migrate_fd_connect(MigrationState *s)
  qemu_thread_create(&s->thread, "migration", migration_thread, s,
 QEMU_THREAD_JOINABLE);
  }
 +
 +typedef struct {
 +int32_t size;
 +uint8_t runstate[100];
 +} GlobalState;
 +
 +static GlobalState global_state;
 +
 +void global_state_store(void)
 +{
 +if (runstate_store((char*)global_state.runstate,
 +   sizeof(global_state.runstate)) == -1) {
 +printf("Runstate is too big\n");
 +exit(-1);
 +}
 +}

Not sure if the concept of a single GlobalStore that calls all the
individual handlers for each piece of global state is optimal.

Can't we use something like vmstate_register()? Perhaps even the same
function, just with dev == NULL? (Actually, you even do this below, to
register the global state. So I guess I'm only disagreeing on the
granularity of having only a single section with a single handler
function for the whole global state.)

 +char *global_state_get_runstate(void)
 +{
 +return (char *)global_state.runstate;
 +}
 +
 +static int global_state_post_load(void *opaque, int version_id)
 +{
 +GlobalState *s = opaque;
 +int ret = 0;
 +char *runstate = (char*)s->runstate;
 +
 +printf("loaded state: %s\n", runstate);
 +
 +if (strcmp(runstate, "running") != 0) {
 +
 +RunState r = runstate_index(runstate);
 +
 +if (r == -1) {
 +printf("Unknown received state %s\n", runstate);
 +return -1;
 +}
 +ret = vm_stop_force_state(r);
 +}
 +
 +   return ret;
 +}
 +
 +static void global_state_pre_save(void *opaque)
 +{
 +GlobalState *s = opaque;
 +
 +s->size = strlen((char*)s->runstate) + 1;
 +printf("saved state: %s\n", s->runstate);
 +}
 +
 +static const VMStateDescription vmstate_globalstate = {
 +.name = "globalstate",
 +.version_id = 1,
 +.minimum_version_id = 1,
 +.post_load = 

Re: [Qemu-devel] Close the BlockDriverState when guest eject the media

2014-10-20 Thread Weidong Huang
On 2014/10/20 17:41, Kevin Wolf wrote:

 Am 18.10.2014 um 12:02 hat Weidong Huang geschrieben:
 Hi ALL:

 There are two ways to eject the cdrom tray. One is by the eject qmp
 command (eject_device).
 The other one is by the guest (bdrv_eject). They have different results.
 
 Yes, they are different things.
 
 If a guest opens the tray (using bdrv_eject) and then closes it again,
 with no user interaction in between, the virtual media must still be in
 the drive and the guest must be able to access the same image again.
 Calling bdrv_close() in this case would be a bug.
 
 The goal of the monitor command eject on the other hand is to remove
 the medium so that the drive is empty. That a device with a closed tray
 has to be opened for this is only secondary.


Thanks for your reply.

There is a problem.

1. Qemu receives the eject command.
2. It runs eject_request_cb when an eject request is issued from the monitor,
the tray is closed, and the medium is locked. But the drive is not closed.
3. The guest agrees to open the tray and qemu calls bdrv_eject to complete. The
drive is still not closed.

So the monitor command eject does not remove the medium in
this situation.

 
 eject_device: close the BlockDriverState(bdrv_close(bs))
 bdrv_eject: don't close the BlockDriverState,

 This is ambiguous. So libvirt can't handle some situations.

 libvirt sends the eject qmp command ---> qemu sends an eject request to the guest --->
 guest responds to qemu ---> qemu emits a tray_open event to libvirt --->
 libvirt will not send the change qmp command if the media source is null. So
 the media is not replaced with null.
 
 What is the problem that libvirt has with the guest opening the tray? I
 don't think libvirt should even care about that case.


For example, using libvirt to change the media with the XML below (media source is null):
<disk type='file' device='cdrom'>
<driver name='qemu'/>
<target dev='hdb' bus='ide'/>
</disk>

libvirt returns OK, but the media is still in the guest.
This is confusing.

Thanks.

 
 Kevin
 
 So close the BlockDriverState in bdrv_eject. Thanks.

 diff --git a/block.c b/block.c
 index d3aebeb..0be69de 100644
 --- a/block.c
 +++ b/block.c
 @@ -5276,6 +5276,10 @@ void bdrv_eject(BlockDriverState *bs, bool eject_flag)
  qapi_event_send_device_tray_moved(bdrv_get_device_name(bs),
eject_flag, &error_abort);
  }
 +
 +if (eject_flag) {
 +bdrv_close(bs);
 +}
  }

 
 .
 






Re: [Qemu-devel] [RFC 0/7] Optional toplevel sections

2014-10-20 Thread Kevin Wolf
Am 15.10.2014 um 09:55 hat Juan Quintela geschrieben:
 Hi
 
 by popular demand, and after too much time, this series.  This is an
 RFC to know what people think about how to use them, the interface
 proposed, whatever.
 [...]
 
 Kevin: You asked for optional sections in the past for the block
layer, would this proposal be enough for you?

I know I've asked on more than one occasion, and of course I don't
remember all the details any more. Anyway, I remember two cases offhand:

* qcow2 with patches like Delayed COW keeps internal block layer state
  in memory that might need to be migrated. This series looks fine for
  this case in principle, we'd just need to find a way to distinguish
  the affected BlockDriverStates. We can probably take a node-name if it
  exists (with Jeff's auto-naming patches not a problem, because then it
  would always exist)

  How do devices solve this? Do they use something like a qdev path to
  identify to which device a given section belongs?

* When a VM is stopped after an I/O error, we need to migrate the
  information about pending requests (bdrv_drain_all doesn't complete
  the failed requests). Currently we do this in device code, but it
  would be very nice to make this common block layer functionality.

  The problem here is that bdrv_aio_readv/writev get an opaque pointer
  back to the device, which of course becomes meaningless during
  migration.

  So this one is tricky even if we have optional top-level sections.

Kevin



Re: [Qemu-devel] Close the BlockDriverState when guest eject the media

2014-10-20 Thread Kevin Wolf
Am 20.10.2014 um 13:27 hat Weidong Huang geschrieben:
 On 2014/10/20 17:41, Kevin Wolf wrote:
 
  Am 18.10.2014 um 12:02 hat Weidong Huang geschrieben:
  Hi ALL:
 
  There are two ways to eject the cdrom tray. One is by the eject qmp
  command (eject_device).
  The other one is by the guest (bdrv_eject). They have different results.
  
  Yes, they are different things.
  
  If a guest opens the tray (using bdrv_eject) and then closes it again,
  with no user interaction in between, the virtual media must still be in
  the drive and the guest must be able to access the same image again.
  Calling bdrv_close() in this case would be a bug.
  
  The goal of the monitor command eject on the other hand is to remove
  the medium so that the drive is empty. That a device with a closed tray
  has to be opened for this is only secondary.
 
 
 Thanks for your reply.
 
 There is a problem.
 
 1. Qemu receives the eject command.
 2. It runs eject_request_cb when an eject request is issued from the monitor,
 the tray is closed, and the medium is locked. But the drive is not closed.
 3. The guest agrees to open the tray and qemu calls bdrv_eject to complete.
 The drive is still not closed.
 
 So the monitor command eject does not remove the medium in
 this situation.

Now I understand, thanks for explaining.

But I think libvirt can actually work correctly with what qemu offers
today. qemu returns an error if the medium cannot be removed with the
'eject' command and it only sends an eject request to the guest.

With this error, libvirt can know that the DEVICE_TRAY_MOVED event
doesn't mean that the medium has been removed, but that it needs to issue
another 'eject' command.

If this isn't implemented correctly in libvirt today, this needs a
libvirt fix rather than a qemu one.
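
The distinction between the guest opening the tray and the monitor removing the medium, together with the retry described above, can be modelled roughly as follows. This is a simplified illustration, not QEMU code: `DemoDrive`, `demo_guest_set_tray()` and `demo_monitor_eject()` are hypothetical, and the model assumes the monitor command fails whenever the guest has locked a closed tray.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool tray_open;
    bool locked;          /* guest has locked the tray */
    bool medium_present;
    bool eject_requested; /* an eject request was forwarded to the guest */
} DemoDrive;

/* Guest opens or closes the tray; the medium stays in the drive. */
static void demo_guest_set_tray(DemoDrive *d, bool open)
{
    assert(d != NULL);
    d->tray_open = open;
}

/* Monitor 'eject': returns 0 if the medium was removed, or -1 if only an
 * eject request could be sent to the guest (the caller must retry later). */
static int demo_monitor_eject(DemoDrive *d)
{
    assert(d != NULL);

    if (!d->tray_open && d->locked) {
        d->eject_requested = true;
        return -1;
    }
    d->tray_open = true;
    d->medium_present = false;
    return 0;
}
```

In this model, a management tool that gets -1 waits for the tray-moved notification and then calls the eject operation a second time, which is the sequence suggested for libvirt.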

  eject_device: close the BlockDriverState(bdrv_close(bs))
  bdrv_eject: don't close the BlockDriverState,
 
  This is ambiguous. So libvirt can't handle some situations.
 
  libvirt sends the eject qmp command ---> qemu sends an eject request to the guest --->
  guest responds to qemu ---> qemu emits a tray_open event to libvirt --->
  libvirt will not send the change qmp command if the media source is null. So
  the media is not replaced with null.
  
  What is the problem that libvirt has with the guest opening the tray? I
  don't think libvirt should even care about that case.
 
 
 For example, using libvirt to change the media with the XML below (media source is null):
 <disk type='file' device='cdrom'>
 <driver name='qemu'/>
 <target dev='hdb' bus='ide'/>
 </disk>
 
 libvirt returns OK, but the media is still in the guest.
 This is confusing.

Kevin



Re: [Qemu-devel] [PATCH] intel_iommu: fix VTD_SID_TO_BUS

2014-10-20 Thread Markus Armbruster
Michael S. Tsirkin m...@redhat.com writes:

 (((sid) >> 8) && 0xff)  makes no sense
 (((sid) >> 8) & 0xff) seems to be what was meant.

 Suggested-by: Markus Armbruster arm...@redhat.com

Actually by the reporter of https://bugs.launchpad.net/bugs/1382477

 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---

 Compile-tested only.

  include/hw/i386/intel_iommu.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
 index f4701e1..e321ee4 100644
 --- a/include/hw/i386/intel_iommu.h
 +++ b/include/hw/i386/intel_iommu.h
 @@ -37,7 +37,7 @@
  #define VTD_PCI_DEVFN_MAX   256
  #define VTD_PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
  #define VTD_PCI_FUNC(devfn) ((devfn) & 0x07)
 -#define VTD_SID_TO_BUS(sid) (((sid) >> 8) && 0xff)
 +#define VTD_SID_TO_BUS(sid) (((sid) >> 8) & 0xff)
  #define VTD_SID_TO_DEVFN(sid)   ((sid) & 0xff)
  
  #define DMAR_REG_SIZE   0x230

Can't find the spec right now, but it looks plausible enough.

Only use is in vtd_context_device_invalidate().  Bug's impact isn't
obvious to me.

Reviewed-by: Markus Armbruster arm...@redhat.com
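
To make the impact of the typo concrete, here is a small sketch contrasting the logical and bitwise forms. The `DEMO_` macro names and `demo_bus_of()` are made up for this example. It assumes the usual PCI layout of a 16-bit source id, with the bus number in bits 15:8 and devfn in bits 7:0.

```c
#include <assert.h>
#include <stdint.h>

/* Logical AND: yields only 0 or 1, depending on whether the bus is non-zero. */
#define DEMO_SID_TO_BUS_BROKEN(sid) (((sid) >> 8) && 0xff)
/* Bitwise AND: yields the actual bus number. */
#define DEMO_SID_TO_BUS_FIXED(sid)  (((sid) >> 8) & 0xff)
#define DEMO_SID_TO_DEVFN(sid)      ((sid) & 0xff)

static uint8_t demo_bus_of(uint16_t sid)
{
    unsigned bus = DEMO_SID_TO_BUS_FIXED(sid);

    assert(bus <= 0xff);
    return (uint8_t)bus;
}
```

With the broken macro, every device on a non-zero bus appears to sit on bus 1, so any lookup keyed by bus number would hit the wrong entry for buses other than 0 and 1.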



Re: [Qemu-devel] [PATCH v4 1/2] qcow2: Add qcow2_shrink_l1_and_l2_table for qcow2 shrinking

2014-10-20 Thread Max Reitz

On 2014-10-14 at 19:53, Jun Li wrote:

This patch implements the new function qcow2_shrink_l1_and_l2_table.
This function shrinks/discards the l1 and l2 tables when shrinking a qcow2 image.

Signed-off-by: Jun Li junm...@gmail.com
---
v4:
   Deal with COW clusters in the l2 table. When using COW, some values of
(l2_entry >> s->cluster_bits) will be larger than s->refcount_table_size, so
this l2_entry needs to be discarded.
v3:
   Fixed host cluster leak.
---
  block/qcow2-cluster.c | 186 ++
  block/qcow2.c |  40 +--
  block/qcow2.h |   2 +
  3 files changed, 224 insertions(+), 4 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index f7dd8c0..0664b8a 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -29,6 +29,9 @@
  #include "block/qcow2.h"
  #include "trace.h"
  
+static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
+   uint64_t **l2_table);
+
  int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
  bool exact_size)
  {
@@ -135,6 +138,189 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
  return ret;
  }
  
+int qcow2_shrink_l1_and_l2_table(BlockDriverState *bs, uint64_t new_l1_size,
+ int new_l2_index, bool exact_size)
+{
+BDRVQcowState *s = bs->opaque;
+int new_l1_size2, ret, i;
+uint64_t *new_l1_table;
+int64_t new_l1_table_offset;
+int64_t old_l1_table_offset, old_l1_size;
+uint8_t data[12];
+
+new_l1_size2 = sizeof(uint64_t) * new_l1_size;


new_l1_size is uint64_t, new_l1_size2 is int. Not a good idea.
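
The truncation risk can be avoided by checking the range before multiplying. A minimal sketch follows, where `demo_l1_table_bytes()` is a hypothetical helper and the `INT_MAX` ceiling is an assumption, chosen because the result is later passed to functions taking `int` sizes:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Compute entries * sizeof(uint64_t) into *bytes.
 * Returns 0 on success, or -1 (leaving *bytes untouched) if the result
 * would not fit into an int. */
static int demo_l1_table_bytes(uint64_t entries, uint64_t *bytes)
{
    assert(bytes != NULL);

    if (entries > (uint64_t)INT_MAX / sizeof(uint64_t)) {
        return -1;
    }
    *bytes = entries * sizeof(uint64_t);
    return 0;
}
```

Doing the comparison in 64-bit arithmetic before the multiplication means no intermediate value can wrap.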


+new_l1_table = qemu_try_blockalign(bs->file,
+   align_offset(new_l1_size2, 512));


As long as with 512 you mean the sector size, s/512/BDRV_SECTOR_SIZE/.


+if (new_l1_table == NULL) {
+return -ENOMEM;
+}
+memset(new_l1_table, 0, align_offset(new_l1_size2, 512));


Same here.


+
+/* shrinking l1 table */
+memcpy(new_l1_table, s->l1_table, new_l1_size2);


Please add an assert(new_l1_size <= s->l1_size) before this.


+
+/* write new table (align to cluster) */
+BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ALLOC_TABLE);


Well, it's not really growing, but fine.


+new_l1_table_offset = qcow2_alloc_clusters(bs, new_l1_size2);
+if (new_l1_table_offset < 0) {
+qemu_vfree(new_l1_table);
+return new_l1_table_offset;
+}
+
+ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+if (ret < 0) {
+goto fail;
+}
+
+/* the L1 position has not yet been updated, so these clusters must
+ * indeed be completely free */
+ret = qcow2_pre_write_overlap_check(bs, 0, new_l1_table_offset,
+new_l1_size2);
+if (ret < 0) {
+goto fail;
+}
+
+BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_WRITE_TABLE);
+
+for (i = 0; i < new_l1_size; i++) {
+new_l1_table[i] = cpu_to_be64(new_l1_table[i]);


You could have used cpu_to_be64s(&new_l1_table[i]) here.


+}
+
+ret = bdrv_pwrite_sync(bs->file, new_l1_table_offset,
+   new_l1_table, new_l1_size2);
+if (ret < 0) {
+goto fail;
+}
+
+for (i = 0; i < new_l1_size; i++) {
+new_l1_table[i] = be64_to_cpu(new_l1_table[i]);


Again, cpu_to_be64s() is an alternative.


+}
+
+/* set new table */
+BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ACTIVATE_TABLE);
+cpu_to_be32w((uint32_t *)data, new_l1_size);
+stq_be_p(data + 4, new_l1_table_offset);


Why not cpu_to_be64w()? I find it easier to read and it definitely fits
in better.



+ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size),
+   data, sizeof(data));
+if (ret < 0) {
+goto fail;
+}
+
+old_l1_table_offset = s->l1_table_offset;
+s->l1_table_offset = new_l1_table_offset;
+uint64_t *old_l1_table = s->l1_table;


Declarations have to be at the start of the block.


+s->l1_table = new_l1_table;
+old_l1_size = s->l1_size;
+s->l1_size = new_l1_size;
+
+int num = old_l1_size - s->l1_size;


Same here.


+
+while (num >= 0) {
+uint64_t l2_offset;
+int ret;
+uint64_t *l2_table, l2_entry;
+int last_free_cluster = 0;
+
+l2_offset = old_l1_table[s->l1_size + num - 1] & L1E_OFFSET_MASK;
+if (l2_offset == 0) {
+goto retry;
+}
+
+if (num == 0) {
+if (new_l2_index == 0) {
+goto retry;
+}
+last_free_cluster = new_l2_index + 1;
+}
+
+/* load l2_table into cache */
+ret = l2_load(bs, l2_offset, &l2_table);
+
+if (ret < 0) {
+goto fail;


Too bad we're freeing new_l1_table in fail. We should be freeing
old_l1_table.



+}
+
+for (i = s->l2_size; i > 0; i--) {
+l2_entry = be64_to_cpu(l2_table[i - 1]);


Okay... I'd 

[Qemu-devel] [PATCH] block: qemu-iotests change _supported_proto to file once more.

2014-10-20 Thread Peter Lieven
In preparation for possible automatic regression and performance
testing for the block layer, I found that the iotests don't work
for all protocols anymore.

In commit 1f7bf7d0 I started to change supported protocols from
generic to file for various tests. Unfortunately, some tests
added in the meantime again carry the generic protocol although they
can only work with file because they require local file access.

Conversely, for some tests that only supported file I added the
NFS protocol after confirming that they work.

Signed-off-by: Peter Lieven p...@kamp.de
---
 tests/qemu-iotests/075 |2 +-
 tests/qemu-iotests/076 |2 +-
 tests/qemu-iotests/078 |2 +-
 tests/qemu-iotests/079 |2 +-
 tests/qemu-iotests/080 |2 +-
 tests/qemu-iotests/081 |2 +-
 tests/qemu-iotests/082 |2 +-
 tests/qemu-iotests/084 |2 +-
 tests/qemu-iotests/086 |2 +-
 tests/qemu-iotests/088 |2 +-
 tests/qemu-iotests/090 |2 +-
 tests/qemu-iotests/092 |2 +-
 tests/qemu-iotests/103 |2 +-
 13 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/tests/qemu-iotests/075 b/tests/qemu-iotests/075
index 40032c5..6117660 100755
--- a/tests/qemu-iotests/075
+++ b/tests/qemu-iotests/075
@@ -39,7 +39,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
 . ./common.filter
 
 _supported_fmt cloop
-_supported_proto generic
+_supported_proto file
 _supported_os Linux
 
 block_size_offset=128
diff --git a/tests/qemu-iotests/076 b/tests/qemu-iotests/076
index b614a7d..bc47457 100755
--- a/tests/qemu-iotests/076
+++ b/tests/qemu-iotests/076
@@ -39,7 +39,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
 . ./common.filter
 
 _supported_fmt parallels
-_supported_proto generic
+_supported_proto file
 _supported_os Linux
 
 tracks_offset=$((0x1c))
diff --git a/tests/qemu-iotests/078 b/tests/qemu-iotests/078
index d4d6da7..7be2c3f 100755
--- a/tests/qemu-iotests/078
+++ b/tests/qemu-iotests/078
@@ -39,7 +39,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
 . ./common.filter
 
 _supported_fmt bochs
-_supported_proto generic
+_supported_proto file
 _supported_os Linux
 
 catalog_size_offset=$((0x48))
diff --git a/tests/qemu-iotests/079 b/tests/qemu-iotests/079
index 2142bbb..6613cfb 100755
--- a/tests/qemu-iotests/079
+++ b/tests/qemu-iotests/079
@@ -39,7 +39,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
 . ./common.filter
 
 _supported_fmt qcow2
-_supported_proto file
+_supported_proto file nfs
 _supported_os Linux
 
 function test_qemu_img()
diff --git a/tests/qemu-iotests/080 b/tests/qemu-iotests/080
index 6b3a3e7..9de337c 100755
--- a/tests/qemu-iotests/080
+++ b/tests/qemu-iotests/080
@@ -40,7 +40,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
 . ./common.filter
 
 _supported_fmt qcow2
-_supported_proto generic
+_supported_proto file
 _supported_os Linux
 
 header_size=104
diff --git a/tests/qemu-iotests/081 b/tests/qemu-iotests/081
index 7ae4be2..ed3c29e 100755
--- a/tests/qemu-iotests/081
+++ b/tests/qemu-iotests/081
@@ -41,7 +41,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
 . ./common.filter
 
 _supported_fmt raw
-_supported_proto generic
+_supported_proto file
 _supported_os Linux
 
 function do_run_qemu()
diff --git a/tests/qemu-iotests/082 b/tests/qemu-iotests/082
index 910b13e..e64de27 100755
--- a/tests/qemu-iotests/082
+++ b/tests/qemu-iotests/082
@@ -39,7 +39,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
 . ./common.filter
 
 _supported_fmt qcow2
-_supported_proto file
+_supported_proto file nfs
 _supported_os Linux
 
 function run_qemu_img()
diff --git a/tests/qemu-iotests/084 b/tests/qemu-iotests/084
index ae33c2c..2712c02 100755
--- a/tests/qemu-iotests/084
+++ b/tests/qemu-iotests/084
@@ -41,7 +41,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
 
 # This tests vdi-specific header fields
 _supported_fmt vdi
-_supported_proto generic
+_supported_proto file
 _supported_os Linux
 
 size=64M
diff --git a/tests/qemu-iotests/086 b/tests/qemu-iotests/086
index d9a80cf..234eb9a 100755
--- a/tests/qemu-iotests/086
+++ b/tests/qemu-iotests/086
@@ -39,7 +39,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
 . ./common.filter
 
 _supported_fmt qcow2
-_supported_proto file
+_supported_proto file nfs
 _supported_os Linux
 
 function run_qemu_img()
diff --git a/tests/qemu-iotests/088 b/tests/qemu-iotests/088
index c09adf8..f9c3129 100755
--- a/tests/qemu-iotests/088
+++ b/tests/qemu-iotests/088
@@ -40,7 +40,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
 . ./common.filter
 
 _supported_fmt vpc
-_supported_proto generic
+_supported_proto file
 _supported_os Linux
 
 offset_block_size=$((512 + 32))
diff --git a/tests/qemu-iotests/090 b/tests/qemu-iotests/090
index 8d032f8..70b5a6f 100755
--- a/tests/qemu-iotests/090
+++ b/tests/qemu-iotests/090
@@ -39,7 +39,7 @@ trap _cleanup; exit \$status 0 1 2 3 15
 . ./common.filter
 
 _supported_fmt qcow2
-_supported_proto file
+_supported_proto file nfs
 _supported_os Linux
 
 IMG_SIZE=128K
diff --git a/tests/qemu-iotests/092 b/tests/qemu-iotests/092
index a8c0c9c..52c529b 100755
--- 

Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Max Reitz

On 2014-10-20 at 12:03, Peter Lieven wrote:

On 20.10.2014 11:27, Max Reitz wrote:

On 2014-10-20 at 11:14, Peter Lieven wrote:

On 20.10.2014 10:59, Max Reitz wrote:

On 2014-10-20 at 08:14, Peter Lieven wrote:

the block layer silently merges write requests since


s/^t/T/


commit 40b4f539. This patch adds a knob to disable
this feature as there has been some discussion lately
if multiwrite is a good idea at all and as it falsifies
benchmarks.

Signed-off-by: Peter Lieven p...@kamp.de
---
  block.c   |4 
  block/qapi.c  |1 +
  blockdev.c|7 +++
  hmp.c |4 
  include/block/block_int.h |1 +
  qapi/block-core.json  |   10 +-
  qemu-options.hx   |1 +
  qmp-commands.hx   |2 ++
  8 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 27533f3..1658a72 100644
--- a/block.c
+++ b/block.c
@@ -4531,6 +4531,10 @@ static int 
multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,

  {
  int i, outidx;
+if (!bs->write_merging) {
+return num_reqs;
+}
+
  // Sort requests by start sector
  qsort(reqs, num_reqs, sizeof(*reqs), multiwrite_req_compare);
  diff --git a/block/qapi.c b/block/qapi.c
index 9733ebd..02251dd 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -58,6 +58,7 @@ BlockDeviceInfo 
*bdrv_block_device_info(BlockDriverState *bs)

info->backing_file_depth = bdrv_get_backing_file_depth(bs);
  info->detect_zeroes = bs->detect_zeroes;
+info->write_merging = bs->write_merging;
if (bs->io_limits_enabled) {
  ThrottleConfig cfg;
diff --git a/blockdev.c b/blockdev.c
index e595910..13e47b8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char 
*file, QDict *bs_opts,

  const char *id;
  bool has_driver_specific_opts;
  BlockdevDetectZeroesOptions detect_zeroes;
+bool write_merging;
  BlockDriver *drv = NULL;
/* Check common options by copying from bs_opts to opts, 
all other options
@@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char 
*file, QDict *bs_opts,

  snapshot = qemu_opt_get_bool(opts, "snapshot", 0);
  ro = qemu_opt_get_bool(opts, "read-only", 0);
  copy_on_read = qemu_opt_get_bool(opts, "copy-on-read", false);
+write_merging = qemu_opt_get_bool(opts, "write-merging", true);


Using this option in blockdev_init() means that you can only enable 
or disable merging for the top layer (the root BDS). Furthermore, 
since you don't set bs->write_merging in bdrv_new() (or at least 
bdrv_open()), it actually defaults to false, and only for the top 
layer does it default to true.


Therefore, if after this patch a format block driver issues a 
multiwrite to its file, the write will not be merged and the user 
can do nothing about it. I don't suppose this is intentional...?


I am not sure if a block driver actually can do this at all? The 
only way to enter multiwrite is from virtio_blk_handle_request in 
virtio-blk.c.


Well, there's also qemu-io -c multiwrite (which only accesses the 
root BDS as well). But other than that, yes, you're right. So, in 
practice it shouldn't matter.






I propose evaluating the option in bdrv_open() and setting 
bs->write_merging there.
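
A minimal sketch of the defaulting problem described above, using a hypothetical two-layer `node` struct rather than QEMU's real BlockDriverState. It assumes the structure is zero-initialised on allocation, so a flag set only on the root stays false on the layer beneath it. The names `node_new`, `open_root_only` and `open_every_layer` are illustrative and are not QEMU functions; the second helper also assumes every layer is opened with the same option value.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a block node; not QEMU's struct. */
struct node {
    bool write_merging;   /* false after zero-initialised allocation */
    struct node *file;    /* underlying layer, or NULL */
};

/* Stand-in for an allocator that zero-fills the new node. */
struct node *node_new(void)
{
    return calloc(1, sizeof(struct node));
}

/* Option applied to the root only, as when it is parsed for the top layer. */
void open_root_only(struct node *root, bool write_merging)
{
    root->write_merging = write_merging;
}

/* Option applied to each layer as it is opened (assumed behaviour). */
void open_every_layer(struct node *root, bool write_merging)
{
    for (struct node *n = root; n != NULL; n = n->file) {
        n->write_merging = write_merging;
    }
}
```

With the root-only variant, the lower layer keeps the zero-initialised value, which is the "defaults to false" case pointed out in the review.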


I wasn't aware actually. I remember that someone asked me to 
implement discard_zeroes in blockdev_init. I think it was something 
related to QMP. So we still might need to check parameters at 2 
positions? It is quite confusing which parameter has to be parsed where.


As for me, I don't know why some options are parsed in 
blockdev_init() at all. I guess all the options currently parsed in 
blockdev_init() should later be moved to the BlockBackend, at least 
that would be the idea. In practice, we cannot do that: Things like 
caching will stay in the BlockDriverState.


I think it's just broken. IMHO, everything related to the BB should 
be in blockdev_init() and everything related to the BDS should be in 
bdrv_open(). So the question is now whether you want write_merging to 
be in the BDS or in the BB. Considering BB is in Kevin's block branch 
as of last Friday, you might actually want to work on that branch and 
move the field into the BB if you decide that that's the place it 
should be in.


Actually there are pros and cons for both BDS and BB. As of now my 
intention was to be able to turn it off. As there are people who would 
like to see it completely disappear, I would not spend too much effort 
on that switch today.
Looking at BB, it is a BDS thing and thus belongs in bdrv_open. But 
this is true for discard_zeroes (and others) as well. Kevin, Stefan, 
ultimately where should it be parsed?


Yes, and for cache, too. That's what I meant with "it's just broken".

Max

I have on my todo list the following to give you a figure what might 
happen. All this is for 2.3+ except for the accounting maybe:


 - add accounting for merged 

Re: [Qemu-devel] spec, RFC: TLS support for NBD

2014-10-20 Thread Markus Armbruster
Stefan Hajnoczi stefa...@redhat.com writes:

 On Mon, Oct 20, 2014 at 08:58:14AM +0100, Daniel P. Berrange wrote:
 On Sat, Oct 18, 2014 at 07:33:22AM +0100, Richard W.M. Jones wrote:
  On Sat, Oct 18, 2014 at 12:03:23AM +0200, Wouter Verhelst wrote:
   Hi all,
   
   (added rjones from nbdkit fame -- hi there)
  
  [I'm happy to implement whatever you come up with, but I've added
  Florian Weimer to CC who is part of Red Hat's product security group]
  
   So I think the following would make sense to allow TLS in NBD.
   
   This would extend the newstyle negotiation by adding two options (i.e.,
   client requests), one server reply, and one server error as well as
   extend one existing reply, in the following manner:
   
   - The two new commands are NBD_OPT_PEEK_EXPORT and NBD_OPT_STARTTLS. The
 former would be used to verify if the server will do TLS for a given
 export:
   
 C: NBD_OPT_PEEK_EXPORT
 S: NBD_REP_SERVER, with an extra field after the export name
containing flags that describe the export (R/O vs R/W state,
whether TLS is allowed and/or required).
 
 IMHO the server should never provide *any* information about the exported
 volume(s) until the TLS layer has been fully setup. ie we shouldn't only
 think about the actual block data transfers, we should protect the entire
 NBD protocol even metadata related operations.

 This makes sense.

Seconded.

 TLS is about the transport, not about a particular NBD export.  The only
 thing that should be communicated is STARTTLS.

Furthermore, STARTTLS is vulnerable to active attacks: if you can get
between the peers, you can make them fall back to unencrypted silently.
How do you plan to guard against that?

See also https://www.agwa.name/blog/post/starttls_considered_harmful



Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Peter Lieven

On 20.10.2014 13:51, Max Reitz wrote:

On 2014-10-20 at 12:03, Peter Lieven wrote:

On 20.10.2014 11:27, Max Reitz wrote:


I think it's just broken. IMHO, everything related to the BB should be in blockdev_init() and everything related to the BDS should be in bdrv_open(). So the question is now whether you want write_merging to be in the BDS or in the BB. Considering BB 
is in Kevin's block branch as of last Friday, you might actually want to work on that branch and move the field into the BB if you decide that that's the place it should be in.


Actually there are pros and cons for both BDS and BB. As of now my intention 
was to be able to turn it off. As there are people who would like to see it 
completely disappear, I would not spend too much effort on that switch today.
Looking at BB, it is a BDS thing and thus belongs in bdrv_open. But this is true 
for discard_zeroes (and others) as well. Kevin, Stefan, ultimately where 
should it be parsed?


Yes, and for cache, too. That's what I meant with "it's just broken".


Looking at the old discussion about discard zeroes, it was recommended to put it 
into bdrv_open_common. If that's still the recommendation I will put it 

Re: [Qemu-devel] spec, RFC: TLS support for NBD

2014-10-20 Thread Daniel P. Berrange
On Mon, Oct 20, 2014 at 01:51:43PM +0200, Markus Armbruster wrote:
 Stefan Hajnoczi stefa...@redhat.com writes:
 
  On Mon, Oct 20, 2014 at 08:58:14AM +0100, Daniel P. Berrange wrote:
  On Sat, Oct 18, 2014 at 07:33:22AM +0100, Richard W.M. Jones wrote:
   On Sat, Oct 18, 2014 at 12:03:23AM +0200, Wouter Verhelst wrote:
Hi all,

(added rjones from nbdkit fame -- hi there)
   
   [I'm happy to implement whatever you come up with, but I've added
   Florian Weimer to CC who is part of Red Hat's product security group]
   
So I think the following would make sense to allow TLS in NBD.

This would extend the newstyle negotiation by adding two options (i.e.,
client requests), one server reply, and one server error as well as
extend one existing reply, in the following manner:

- The two new commands are NBD_OPT_PEEK_EXPORT and NBD_OPT_STARTTLS. 
The
  former would be used to verify if the server will do TLS for a given
  export:

  C: NBD_OPT_PEEK_EXPORT
  S: NBD_REP_SERVER, with an extra field after the export name
 containing flags that describe the export (R/O vs R/W state,
 whether TLS is allowed and/or required).
  
  IMHO the server should never provide *any* information about the exported
  volume(s) until the TLS layer has been fully setup. ie we shouldn't only
  think about the actual block data transfers, we should protect the entire
  NBD protocol even metadata related operations.
 
  This makes sense.
 
 Seconded.
 
  TLS is about the transport, not about a particular NBD export.  The only
  thing that should be communicated is STARTTLS.
 
 Furthermore, STARTTLS is vulnerable to active attacks: if you can get
 between the peers, you can make them fall back to unencrypted silently.
 How do you plan to guard against that?

Well the use of a STARTTLS message at a protocol level isn't vulnerable
per-se, rather it is the handling of it that matters. The key is what
happens if the server wants TLS and the client does not send a STARTTLS
message. If the server happily carries on with plain text that's bad. If
the server closes any connection that attempts to skip STARTTLS, that's
fine. Likewise if the client wants TLS and the server claims to not do
TLS, then the client should close the connection and not carry on. This
avoids the MITM downgrade problem.

So from the POV of QEMU / QEMU-NBD I'd expect us to have a CLI option
tls=on|off  and if the client / server are configured differently then
it would be a hard failure, never any negotiated fallback to plain text
if one requests TLS and the other doesn't.
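
The no-fallback policy described above can be sketched as a small decision function. This is only an illustration under stated assumptions: `tls_outcome` and `tls_decide` are hypothetical names, not part of QEMU or the NBD protocol, and "peer offers TLS" stands for whatever signal the peer gives (a STARTTLS option, or starting a TLS handshake directly).

```c
#include <stdbool.h>

/* Hypothetical result type for the connection policy. */
enum tls_outcome {
    TLS_PLAINTEXT,  /* both sides configured with tls=off */
    TLS_ENCRYPTED,  /* both sides configured with tls=on */
    TLS_REJECT      /* mismatch: close the connection, never downgrade */
};

/*
 * local_wants_tls: the local tls=on|off setting.
 * peer_offers_tls: whether the peer asked for / agreed to TLS.
 * Any disagreement is a hard failure, which is what prevents a
 * man-in-the-middle from forcing a silent fallback to plain text.
 */
enum tls_outcome tls_decide(bool local_wants_tls, bool peer_offers_tls)
{
    if (local_wants_tls != peer_offers_tls) {
        return TLS_REJECT;
    }
    return local_wants_tls ? TLS_ENCRYPTED : TLS_PLAINTEXT;
}
```

The important property is that there is no code path from a mismatch to `TLS_PLAINTEXT`.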

If QEMU relies on the CLI option, then technically we do not need any
NBD protocol level changes at all. A standard TLS handshake could be
started the moment the TCP connection is established. Only once the
TLS handshake completes would the NBD protocol start running.

The real / main benefit of having a STARTTLS message would be to give
better error reporting for clients not attempting TLS, e.g. so they could
report a clear "This server requires TLS" error instead of just seeing
unintelligible data from the NBD server and no clue that it is a TLS
handshake.

This is how the VNC integration works at least. The VNC server advertizes
that it requires the TLS auth protocol extension. If the VNC client does
not support this, the server will drop the connection and the VNC client
can at least report to the user that the server requested use of TLS.

The key is that no data or metadata that is in any way related to remote
desktop (or NBD volume) is exchanged between server/client until after
the TLS auth protocol completes.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] Close the BlockDriverState when guest eject the media

2014-10-20 Thread Markus Armbruster
Weidong Huang h...@huawei.com writes:

 On 2014/10/20 17:41, Kevin Wolf wrote:

 Am 18.10.2014 um 12:02 hat Weidong Huang geschrieben:
 Hi ALL:

 There are two ways to eject the cdrom tray. One is the eject
 QMP command (eject_device).
 The other is by the guest (bdrv_eject). They have different results.
 
 Yes, they are different things.
 
 If a guest opens the tray (using bdrv_eject) and then closes it again,
 with no user interaction in between, the virtual media must still be in
 the drive and the guest must be able to access the same image again.
 Calling bdrv_close() in this case would be a bug.
 
 The goal of the monitor command eject on the other hand is to remove
 the medium so that the drive is empty. That a device with a closed tray
 has to be opened for this is only secondary.


 Thanks for your reply.

 There is a problem.

 1. Qemu receive the eject command.
 2. Runs eject_request_cb when an eject request is issued from the
 monitor, the tray
 is closed, and the medium is locked. But the drive is not closed.

Yes, callback eject_request_cb() "runs when an eject request is issued
from the monitor, the tray is closed, and the medium is locked"
(quoting block.h).

 3. Guest agree with opening tray and qemu will call bdrv_eject to
 complete. The drive is
 still not close.

Yes, the guest honors the request by unlocking and opening the tray.
This calls bdrv_lock_medium(), then bdrv_eject().

 So the result of the monitor command eject is not to remove the
 medium in this situation.

Correct.  This is a known wart.  To work around it, wait for event
DEVICE_TRAY_MOVED and eject again.  Yes, this is racy: the guest can
reclose the tray and lock it before you get your eject in.

Your patch removes this wart, but regresses other scenarios:

commit 4be9762adb0947a353e6efef2fed354f69218bfb
Author: Markus Armbruster arm...@redhat.com
Date:   Tue Jul 27 14:02:01 2010 +0200

block: Change bdrv_eject() not to drop the image

bdrv_eject() gets called when a device model opens or closes the tray.

If the block driver implements method bdrv_eject(), that method gets
called.  Drivers host_cdrom implements it, and it opens and closes the
physical tray, and nothing else.  When a device model opens, then
closes the tray, media changes only if the user actively changes the
physical media while the tray is open.  This matches how physical
hardware behaves.

If the block driver doesn't implement method bdrv_eject(), we do
something quite different: opening the tray severs the connection to
the image by calling bdrv_close(), and closing the tray does nothing.
When the device model opens, then closes the tray, media is gone,
unless the user actively inserts another one while the tray is open,
with a suitable change command in the monitor.  This isn't how
physical hardware behaves.  Rather inconvenient when programs
helpfully eject media to give you a chance to change it.  The way
bdrv_eject() behaves here turns that chance into a must, which is not
what these programs or their users expect.

Change the default action not to call bdrv_close().  Instead, note the
tray status in new BlockDriverState member tray_open.  Use it in
bdrv_is_inserted().

Arguably, the device models should keep track of tray status
themselves.  But this is less invasive.

Signed-off-by: Markus Armbruster arm...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com

Programs really depend on "eject, load, get the same medium back"
behavior.  Example: https://bugzilla.redhat.com/show_bug.cgi?id=558256

We intend to provide new commands that behave better than eject.
Don't hold your breath.
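
The behaviour described in that commit message can be modelled with a tray flag kept separately from the image. This is a simplified sketch, not QEMU code: `struct drive`, `guest_set_tray`, `medium_is_inserted` and `remove_medium` are hypothetical names, and `remove_medium` models the intended outcome of a medium removal rather than what the current eject command does in every situation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical drive model: the image and the tray state are independent. */
struct drive {
    const char *image;  /* NULL when no medium is present */
    bool tray_open;
};

/* Guest opens or closes the tray; the image stays attached. */
void guest_set_tray(struct drive *d, bool open)
{
    d->tray_open = open;
}

/* A medium is usable only if an image is attached and the tray is closed. */
bool medium_is_inserted(const struct drive *d)
{
    return d->image != NULL && !d->tray_open;
}

/* Removing the medium drops the image itself, leaving the drive empty. */
void remove_medium(struct drive *d)
{
    d->image = NULL;
}
```

Opening and then closing the tray from the guest gives the same image back, while removing the medium leaves the drive empty regardless of tray position.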

 eject_device: close the BlockDriverState(bdrv_close(bs))
 bdrv_eject: don't close the BlockDriverState,

 This is ambiguous. So libvirt can't handle some situations.

 libvirt send eject qmp command --- qemu send eject request to guest ---
 guest respond to qemu --- qemu emit tray_open event to libvirt ---
 libvirt will not send change qmp command if media source is null. So
 the media is not be replace to the null.
 
 What is the problem that libvirt has with the guest opening the tray? I
 don't think libvirt should even care about that case.


 For example, using libvirt to change media with the XML below (media source is null):
 <disk type='file' device='cdrom'>
 <driver name='qemu'/>
 <target dev='hdb' bus='ide'/>
 </disk>

 libvirt returns OK, but the media is still in the guest.
 This is confusing.

libvirt bug, caused by the bad QEMU interface.



Re: [Qemu-devel] [PATCH] intel_iommu: fix VTD_SID_TO_BUS

2014-10-20 Thread Le Tan
Hi Markus,

2014-10-20 19:41 GMT+08:00 Markus Armbruster arm...@redhat.com:
 Michael S. Tsirkin m...@redhat.com writes:

 (((sid) << 8) & 0xff)  makes no sense
 (((sid) >> 8) & 0xff) seems to be what was meant.

 Suggested-by: Markus Armbruster arm...@redhat.com

 Actually by the reporter of https://bugs.launchpad.net/bugs/1382477

 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---

 Compile-tested only.

  include/hw/i386/intel_iommu.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
 index f4701e1..e321ee4 100644
 --- a/include/hw/i386/intel_iommu.h
 +++ b/include/hw/i386/intel_iommu.h
 @@ -37,7 +37,7 @@
  #define VTD_PCI_DEVFN_MAX   256
  #define VTD_PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
  #define VTD_PCI_FUNC(devfn) ((devfn) & 0x07)
 -#define VTD_SID_TO_BUS(sid) (((sid) << 8) & 0xff)
 +#define VTD_SID_TO_BUS(sid) (((sid) >> 8) & 0xff)
  #define VTD_SID_TO_DEVFN(sid)   ((sid) & 0xff)

  #define DMAR_REG_SIZE   0x230

 Can't find the spec right now, but it looks plausible enough.

Yes, this is a typo. I am sorry that I introduced such a mistake.
The spec is here in Section 3.4 :
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

VTD_SID_TO_BUS(sid) is intended to be used to get the bus id from the
source identifier.
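
As a worked example of that extraction, here is a small sketch that splits a 16-bit source identifier into its parts. It assumes the usual PCI layout (bus number in bits 15:8, devfn in bits 7:0, slot in devfn bits 7:3, function in bits 2:0) and that the mistaken variant shifted left instead of right. The function names are illustrative and are not the QEMU macros.

```c
#include <stdint.h>

/* Bus number: bits 15:8 of the source identifier. */
uint8_t sid_to_bus(uint16_t sid)
{
    return (uint8_t)((sid >> 8) & 0xff);
}

/* Device/function byte: bits 7:0 of the source identifier. */
uint8_t sid_to_devfn(uint16_t sid)
{
    return (uint8_t)(sid & 0xff);
}

/* Slot (device) number: bits 7:3 of devfn. */
uint8_t devfn_to_slot(uint8_t devfn)
{
    return (uint8_t)((devfn >> 3) & 0x1f);
}

/* Function number: bits 2:0 of devfn. */
uint8_t devfn_to_func(uint8_t devfn)
{
    return (uint8_t)(devfn & 0x07);
}

/*
 * Assumed form of the bug: shifting left by 8 clears the low byte,
 * so masking with 0xff afterwards always yields 0.
 */
uint8_t sid_to_bus_wrong_shift(uint16_t sid)
{
    return (uint8_t)(((uint32_t)sid << 8) & 0xff);
}
```

For a source identifier of 0x1a2b this gives bus 0x1a, devfn 0x2b, slot 5 and function 3, whereas the wrongly shifted variant reports bus 0 for every input.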

Thanks very much!

Regards,
Le

 Only use is in vtd_context_device_invalidate().  Bug's impact isn't
 obvious to me.

 Reviewed-by: Markus Armbruster arm...@redhat.com



Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Peter Lieven

On 20.10.2014 13:51, Max Reitz wrote:

On 2014-10-20 at 12:03, Peter Lieven wrote:

On 20.10.2014 11:27, Max Reitz wrote:


I think it's just broken. IMHO, everything related to the BB should be in blockdev_init() and everything related to the BDS should be in bdrv_open(). So the question is now whether you want write_merging to be in the BDS or in the BB. Considering BB 
is in Kevin's block branch as of last Friday, you might actually want to work on that branch and move the field into the BB if you decide that that's the place it should be in.


Actually there are pros and cons for both BDS and BB. As of now my intention 
was to be able to turn it off. As there are people who would like to see it 
completely disappear, I would not spend too much effort on that switch today.
Looking at BB, it is a BDS thing and thus belongs in bdrv_open. But this is true 
for discard_zeroes (and others) as well. Kevin, Stefan, ultimately where 
should it be parsed?


Yes, and for cache, too. That's what I meant with "it's just broken".


Can you help further here? I think my problem was that I don't have access to 
the command-line options in bdrv_open?!

Thank you,
Peter



[Qemu-devel] [PATCH v5 0/1] Add support for Xen access to vmport

2014-10-20 Thread Don Slutz
Changes v4 to v5:
  Paul Durrant
vmware_ioreq_t struct is not really a request any more. Maybe
vmware_regs_t?
  Renamed various parts from vmware_ioreq to vmware_regs.  Also
  HVM_PARAM_VMPORT_IOREQ_PFN to HVM_PARAM_VMPORT_REGS_PFN.
cpu_by_ioreq_id name implies the array is indexed by an id
carries in the ioreq.
  Renamed cpu_by_ioreq_id to cpu_by_vcpu_id.
Is cpu_get_vmport_ioreq_from_shared_memory worth its own
function?
  Moved in-line.
I don't think you need the barrier anyway.
  Dropped the barrier.
Oh, I now realize you mean the same theoretical rather than
actual limit, in which case this can be a build time check
anyway.
  Switch to build time check, move to a better place.
You could avoid passing state to both of them by setting
current_cpu here couldn't you?
  Yes, moved state usage to handle_vmport_ioreq().

  Stefano Stabellini
Error out if it fails with error != -ENOSYS.
  Done.

Changes RFC-v2x to v4:
  Stefano Stabellini
Please try to get rid of the #ifdefs.
  Moved 2 #ifdefs into hw/xen/xen_common.h

Changes v2 to RFC-v2x:
  Paul Durrant
Use a 2nd shared page.
  Added HVM_PARAM_VMPORT_IOREQ_PFN usage.

Changes v1 to v2:
   More info in commit message.

  Stefano Stabellini
the registers being passed explicitly by Xen rather than
hiding them into other ioreq fields.
   Added vmware_ioreq_t
  Paolo Bonzini & Alexander Graf
Fixup env access
  Added cpu_by_ioreq_id.
  Set current_cpu in regs_to_cpu(), clear in regs_from_cpu().
  Drop all changes to vmport.c

Note: to use this with Xen either a version of:

[Qemu-devel] [PATCH] -machine vmport=off: Allow disabling of VMWare ioport 
emulation

or

From f70663d9fb86914144ba340b6186cb1e67ac6eec Mon Sep 17 00:00:00 2001
From: Don Slutz dsl...@verizon.com
Date: Fri, 26 Sep 2014 08:11:39 -0400
Subject: [PATCH 1/2] hack: force enable vmport

Signed-off-by: Don Slutz dsl...@verizon.com
---
 hw/i386/pc_piix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 103d756..b76dfbc 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -234,7 +234,7 @@ static void pc_init1(MachineState *machine,
 pc_vga_init(isa_bus, pci_enabled ? pci_bus : NULL);
 
 /* init basic PC hardware */
-pc_basic_device_init(isa_bus, gsi, &rtc_state, &floppy, xen_enabled(),
+pc_basic_device_init(isa_bus, gsi, &rtc_state, &floppy, false,
 0x4);
 
 pc_nic_init(isa_bus, pci_bus);
-- 
1.8.4

needs to be done to QEMU.

And the Xen RFC patch:

[RFC][PATCH v2 1/1] Add IOREQ_TYPE_VMWARE_PORT

needs to be done to Xen.

Don Slutz (1):
  xen-hvm.c: Add support for Xen access to vmport

 include/hw/xen/xen_common.h |  22 +
 xen-hvm.c   | 110 ++--
 2 files changed, 127 insertions(+), 5 deletions(-)

-- 
1.8.4




[Qemu-devel] [PATCH v5 1/1] xen-hvm.c: Add support for Xen access to vmport

2014-10-20 Thread Don Slutz
This adds synchronisation of the 6 vcpu registers (only 32bits of
them) that vmport.c needs between Xen and QEMU.

This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
fetch and put these 6 vcpu registers used by the code in vmport.c
and vmmouse.c

The registers are passed in the new shared page provided by
HVM_PARAM_VMPORT_REGS_PFN.

Add new array to XenIOState that allows selection of current_cpu by
vcpu id.

Now pass XenIOState to handle_ioreq().

Add new routines regs_to_cpu(), regs_from_cpu(), and
handle_vmport_ioreq().

Signed-off-by: Don Slutz dsl...@verizon.com
---
v5:
vmware_ioreq_t struct is not really a request any more. Maybe
vmware_regs_t?
  Renamed various parts from vmware_ioreq to vmware_regs.  Also
  HVM_PARAM_VMPORT_IOREQ_PFN to HVM_PARAM_VMPORT_REGS_PFN.
cpu_by_ioreq_id name implies the array is indexed by an id
carries in the ioreq.
  Renamed cpu_by_ioreq_id to cpu_by_vcpu_id.
Is cpu_get_vmport_ioreq_from_shared_memory worth its own
function?
  Moved in-line.
I don't think you need the barrier anyway.
  Dropped the barrier.
Oh, I now realize you mean the same theoretical rather than
actual limit, in which case this can be a build time check
anyway.
  Switch to build time check, move to a better place.
You could avoid passing state to both of them by setting
current_cpu here couldn't you?
  Yes, moved state usage to handle_vmport_ioreq().
Error out if it fails with error != -ENOSYS.
  Done.

v4:
Please try to get rid of the #ifdefs.
  Moved 2 #ifdefs into hw/xen/xen_common.h


 include/hw/xen/xen_common.h |  22 +
 xen-hvm.c   | 110 ++--
 2 files changed, 127 insertions(+), 5 deletions(-)

diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
index 07731b9..42e3d77 100644
--- a/include/hw/xen/xen_common.h
+++ b/include/hw/xen/xen_common.h
@@ -164,4 +164,26 @@ void destroy_hvm_domain(bool reboot);
 /* shutdown/destroy current domain because of an error */
 void xen_shutdown_fatal_error(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
 
+#ifdef HVM_PARAM_VMPORT_REGS_PFN
+static inline int xen_get_vmport_regs_pfn(XenXC xc, domid_t dom,
+  unsigned long *vmport_regs_pfn)
+{
+return xc_get_hvm_param(xc, dom, HVM_PARAM_VMPORT_REGS_PFN,
+vmport_regs_pfn);
+}
+#else
+static inline int xen_get_vmport_regs_pfn(XenXC xc, domid_t dom,
+  unsigned long *vmport_regs_pfn)
+{
+return -ENOSYS;
+}
+#endif
+
+#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+/* Force a compilation error if condition is true */
+#define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
+#else
+#define BUILD_BUG_ON(cond) ((void)sizeof(struct { int:-!!(cond); }))
+#endif
+
 #endif /* QEMU_HW_XEN_COMMON_H */
diff --git a/xen-hvm.c b/xen-hvm.c
index 05e522c..b5ef683 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -41,6 +41,29 @@ static MemoryRegion *framebuffer;
 static bool xen_in_migration;
 
 /* Compatibility with older version */
+
+/* This allows QEMU to build on a system that has Xen 4.5 or earlier
+ * installed.  This here (not in hw/xen/xen_common.h) because xen/hvm/ioreq.h
+ * needs to be included before this block and hw/xen/xen_common.h needs to
+ * be included before xen/hvm/ioreq.h
+ */
+#ifndef IOREQ_TYPE_VMWARE_PORT
+#define IOREQ_TYPE_VMWARE_PORT  3
+struct vmware_regs {
+uint32_t esi;
+uint32_t edi;
+uint32_t ebx;
+uint32_t ecx;
+uint32_t edx;
+};
+typedef struct vmware_regs vmware_regs_t;
+
+struct shared_vmport_iopage {
+struct vmware_regs vcpu_vmport_regs[1];
+};
+typedef struct shared_vmport_iopage shared_vmport_iopage_t;
+#endif
+
 #if __XEN_LATEST_INTERFACE_VERSION__ < 0x0003020a
 static inline uint32_t xen_vcpu_eport(shared_iopage_t *shared_page, int i)
 {
@@ -79,8 +102,10 @@ typedef struct XenPhysmap {
 
 typedef struct XenIOState {
 shared_iopage_t *shared_page;
+shared_vmport_iopage_t *shared_vmport_page;
 buffered_iopage_t *buffered_io_page;
 QEMUTimer *buffered_io_timer;
+CPUState **cpu_by_vcpu_id;
 /* the evtchn port for polling the notification, */
 evtchn_port_t *ioreq_local_port;
 /* evtchn local port for buffered io */
@@ -101,6 +126,8 @@ typedef struct XenIOState {
 Notifier wakeup;
 } XenIOState;
 
+static void handle_ioreq(XenIOState *state, ioreq_t *req);
+
 /* Xen specific function for piix pci */
 
 int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
@@ -773,7 +800,50 @@ static void cpu_ioreq_move(ioreq_t *req)
 }
 }
 
-static void handle_ioreq(ioreq_t *req)
+static void regs_to_cpu(vmware_regs_t *vmport_regs, ioreq_t *req)
+{
+X86CPU *cpu;
+CPUX86State *env;
+
+cpu = X86_CPU(current_cpu);
+env = &cpu->env;
+env->regs[R_EAX] = req->data;
+env->regs[R_EBX] = vmport_regs->ebx;
+env->regs[R_ECX] = 

Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Max Reitz

On 2014-10-20 at 14:16, Peter Lieven wrote:

On 20.10.2014 13:51, Max Reitz wrote:

On 2014-10-20 at 12:03, Peter Lieven wrote:

On 20.10.2014 11:27, Max Reitz wrote:

On 2014-10-20 at 11:14, Peter Lieven wrote:

On 20.10.2014 10:59, Max Reitz wrote:

On 2014-10-20 at 08:14, Peter Lieven wrote:

the block layer silently merges write requests since


s/^t/T/


commit 40b4f539. This patch adds a knob to disable
this feature as there has been some discussion lately
if multiwrite is a good idea at all and as it falsifies
benchmarks.

Signed-off-by: Peter Lieven p...@kamp.de
---
  block.c   |4 
  block/qapi.c  |1 +
  blockdev.c|7 +++
  hmp.c |4 
  include/block/block_int.h |1 +
  qapi/block-core.json  |   10 +-
  qemu-options.hx   |1 +
  qmp-commands.hx   |2 ++
  8 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 27533f3..1658a72 100644
--- a/block.c
+++ b/block.c
@@ -4531,6 +4531,10 @@ static int 
multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,

  {
  int i, outidx;
  +if (!bs->write_merging) {
+return num_reqs;
+}
+
  // Sort requests by start sector
  qsort(reqs, num_reqs, sizeof(*reqs), 
multiwrite_req_compare);

  diff --git a/block/qapi.c b/block/qapi.c
index 9733ebd..02251dd 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -58,6 +58,7 @@ BlockDeviceInfo 
*bdrv_block_device_info(BlockDriverState *bs)

info->backing_file_depth = bdrv_get_backing_file_depth(bs);
  info->detect_zeroes = bs->detect_zeroes;
+info->write_merging = bs->write_merging;
if (bs->io_limits_enabled) {
  ThrottleConfig cfg;
diff --git a/blockdev.c b/blockdev.c
index e595910..13e47b8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char 
*file, QDict *bs_opts,

  const char *id;
  bool has_driver_specific_opts;
  BlockdevDetectZeroesOptions detect_zeroes;
+bool write_merging;
  BlockDriver *drv = NULL;
/* Check common options by copying from bs_opts to opts, 
all other options
@@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char 
*file, QDict *bs_opts,

  snapshot = qemu_opt_get_bool(opts, "snapshot", 0);
  ro = qemu_opt_get_bool(opts, "read-only", 0);
  copy_on_read = qemu_opt_get_bool(opts, "copy-on-read", 
false);
+write_merging = qemu_opt_get_bool(opts, "write-merging", 
true);


Using this option in blockdev_init() means that you can only 
enable or disable merging for the top layer (the root BDS). 
Furthermore, since you don't set bs->write_merging in bdrv_new() 
(or at least bdrv_open()), it actually defaults to false and only 
for the top layer it defaults to true.


Therefore, if after this patch a format block driver issues a 
multiwrite to its file, the write will not be merged and the user 
can do nothing about it. I don't suppose this is intentional...?


I am not sure if a block driver actually can do this at all? The 
only way to enter multiwrite is from virtio_blk_handle_request in 
virtio-blk.c.


Well, there's also qemu-io -c multiwrite (which only accesses the 
root BDS as well). But other than that, yes, you're right. So, in 
practice it shouldn't matter.






I propose evaluating the option in bdrv_open() and setting 
bs->write_merging there.
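
The default-value pitfall and the proposed fix can be shown with a reduced model. This is a sketch under assumptions, not QEMU code: `Node`, `node_new()`, `apply_option_to_root()` and `node_open()` are hypothetical names standing in for the BDS, bdrv_new(), the blockdev_init() parsing and bdrv_open() respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, much-reduced model of a block node. */
typedef struct Node {
    bool write_merging;
    struct Node *file;   /* child node, e.g. the protocol layer */
} Node;

/* Zero-filling allocation: write_merging starts out false. */
static Node *node_new(Node *file)
{
    Node *n = calloc(1, sizeof(*n));
    if (n) {
        n->file = file;
    }
    return n;
}

/* Pattern from the patch: only the root node receives the parsed option. */
static void apply_option_to_root(Node *root, bool write_merging)
{
    root->write_merging = write_merging;
}

/* Pattern proposed in the review: set the field whenever a node is opened. */
static Node *node_open(Node *file, bool write_merging)
{
    Node *n = node_new(file);
    if (n) {
        n->write_merging = write_merging;
    }
    return n;
}
```

In the first pattern a child created by node_new() keeps the zero default even when the user asked for merging; in the second, every layer gets an explicit value.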


I wasn't aware actually. I remember that someone asked me to 
implement discard_zeroes in blockdev_init. I think it was 
something related to QMP. So we still might
need to check parameters in two places? It is quite confusing 
which parameter has to be parsed where.


As for me, I don't know why some options are parsed in 
blockdev_init() at all. I guess all the options currently parsed in 
blockdev_init() should later be moved to the BlockBackend, at least 
that would be the idea. In practice, we cannot do that: Things like 
caching will stay in the BlockDriverState.


I think it's just broken. IMHO, everything related to the BB should 
be in blockdev_init() and everything related to the BDS should be 
in bdrv_open(). So the question is now whether you want 
write_merging to be in the BDS or in the BB. Considering BB is in 
Kevin's block branch as of last Friday, you might actually want to 
work on that branch and move the field into the BB if you decide 
that that's the place it should be in.


Actually there are pros and cons for both BDS and BB. As of now my 
intention was to be able to turn it off. As there are people who 
would like to see it completely disappear I would not spend too much 
effort on that switch today.
Looking at BB it is a BDS thing and thus belongs to bdrv_open. But 
this is true for discard_zeroes (and others) as well. Kevin, Stefan, 
ultimately where should it be parsed?


Yes, and for cache, too. That's what I meant with "it's just broken".


Can you help further here? I think my problem was that I don't have 
access to the 

Re: [Qemu-devel] spec, RFC: TLS support for NBD

2014-10-20 Thread Florian Weimer

On 10/20/2014 01:51 PM, Markus Armbruster wrote:

Furthermore, STARTTLS is vulnerable to active attacks: if you can get
between the peers, you can make them fall back to unencrypted silently.
How do you plan to guard against that?


The usual way to deal with this is to use different syntax for 
TLS-enabled and non-TLS addresses (e.g., https:// and http://).  With a 
TLS address, the client must enforce that only TLS-enabled connections 
are possible.  STARTTLS isn't the problem here, it's just an accident of 
history that many STARTTLS client implementations do not require a TLS 
handshake before proceeding.
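
The client-side rule described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `nbds://` scheme for TLS-only addresses next to a plain `nbd://` one; neither the scheme names nor the function names come from the NBD proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical scheme prefix; the thread does not define one. */
#define SCHEME_TLS "nbds://"

/* True when the address syntax itself demands TLS. */
static bool address_requires_tls(const char *uri)
{
    return strncmp(uri, SCHEME_TLS, strlen(SCHEME_TLS)) == 0;
}

/*
 * Decide whether the client may continue after negotiation.
 * server_offers_tls: the peer (or whoever sits in between) advertised STARTTLS.
 * handshake_done:    a TLS handshake actually completed.
 */
static bool may_proceed(const char *uri, bool server_offers_tls,
                        bool handshake_done)
{
    if (address_requires_tls(uri)) {
        /* Never fall back silently: no completed handshake, no connection. */
        return server_offers_tls && handshake_done;
    }
    /* Plain address: unencrypted operation is what the user asked for. */
    return true;
}
```

An attacker who strips the STARTTLS offer then causes a hard failure for a TLS address instead of a silent downgrade.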


I cannot comment on whether the proposed STARTTLS command is at the 
correct stage of the NBD protocol.  If there is a protocol description 
for NBD, I can have a look.


--
Florian Weimer / Red Hat Product Security



Re: [Qemu-devel] [PATCH] get_maintainer.pl: Default to --no-git-fallback

2014-10-20 Thread Don Slutz

I am happy with this so:

Reviewed-by: Don Slutz dsl...@verizon.com

   -Don Slutz

On 10/20/14 05:19, Markus Armbruster wrote:

Contributors rely on this script to find maintainers to copy.  The
script falls back to git when no exact MAINTAINERS pattern matches.
When that happens, recent contributors get copied, which tends not to be
particularly useful.  Some contributors even find it annoying.

Flip the default to "don't fall back to git".  Use --git-fallback to
ask it to fall back to git.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
  scripts/get_maintainer.pl | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 38334de..ec2d16f 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -28,7 +28,7 @@ my $email_git = 0;
  my $email_git_all_signature_types = 0;
  my $email_git_blame = 0;
  my $email_git_blame_signatures = 1;
-my $email_git_fallback = 1;
+my $email_git_fallback = 0;
  my $email_git_min_signatures = 1;
  my $email_git_max_maintainers = 5;
  my $email_git_min_percent = 5;





[Qemu-devel] [PATCH v2 0/2] vl.c: unify exit on error handling

2014-10-20 Thread Igor Mammedov

v2:
 * s/exit_if_error/error_report_fatal/
 * move error_report_fatal() into util/error.c

Reduces the amount of code needed to handle exit on error
for new users, from 5 LOC to 1 for each case.

Igor Mammedov (2):
  vl.c: use single local_err throughout main()
  vl.c: reduce exit on error code duplication

 include/qemu/error-report.h |  2 ++
 util/error.c| 24 
 vl.c| 38 +++---
 3 files changed, 33 insertions(+), 31 deletions(-)

-- 
1.9.3
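
The "5 LOC to 1" pattern can be tried outside QEMU with a libc-only sketch. It is an approximation under assumptions: the error is modelled as a plain string rather than an Error object, and `vformat_error()`, `format_error()` and `report_fatal()` are hypothetical names, not the helper proposed in this series.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<context>: <detail>", or just "<detail>" when fmt is NULL. */
static void vformat_error(char *out, size_t n, const char *detail,
                          const char *fmt, va_list ap)
{
    if (fmt) {
        char context[256];

        vsnprintf(context, sizeof(context), fmt, ap);
        snprintf(out, n, "%s: %s", context, detail);
    } else {
        snprintf(out, n, "%s", detail);
    }
}

/* Variadic wrapper that only formats; handy for testing. */
static void format_error(char *out, size_t n, const char *detail,
                         const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vformat_error(out, n, detail, fmt, ap);
    va_end(ap);
}

/* No-op when no error is pending, otherwise print and exit. */
static void report_fatal(const char *detail, const char *fmt, ...)
{
    char msg[512];
    va_list ap;

    if (!detail) {
        return;
    }
    va_start(ap, fmt);
    vformat_error(msg, sizeof(msg), detail, fmt, ap);
    va_end(ap);
    fprintf(stderr, "%s\n", msg);
    exit(EXIT_FAILURE);
}
```

A call site then shrinks to one line such as `report_fatal(err, "-incoming %s", incoming);`. Note that any caller that previously returned an error code now terminates the process instead.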




Re: [Qemu-devel] [PATCH v5 1/1] xen-hvm.c: Add support for Xen access to vmport

2014-10-20 Thread Paul Durrant
 -Original Message-
 From: Don Slutz [mailto:dsl...@verizon.com]
 Sent: 20 October 2014 13:19
 To: qemu-devel@nongnu.org; Paul Durrant
 Cc: xen-de...@lists.xensource.com; Alexander Graf; Andreas Färber;
 Anthony Liguori; Don Slutz; Marcel Apfelbaum; Markus Armbruster; Michael
 S. Tsirkin; Stefano Stabellini
 Subject: [PATCH v5 1/1] xen-hvm.c: Add support for Xen access to vmport
 
 This adds synchronisation of the 6 vcpu registers (only 32bits of
 them) that vmport.c needs between Xen and QEMU.
 
 This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
 fetch and put these 6 vcpu registers used by the code in vmport.c
 and vmmouse.c
 
 The registers are passed in the new shared page provided by
 HVM_PARAM_VMPORT_REGS_PFN.
 
 Add new array to XenIOState that allows selection of current_cpu by
 vcpu id.
 
 Now pass XenIOState to handle_ioreq().
 
 Add new routines regs_to_cpu(), regs_from_cpu(), and
 handle_vmport_ioreq().
 
 Signed-off-by: Don Slutz dsl...@verizon.com
 ---
 v5:
 vmware_ioreq_t struct is not really a request any more. Maybe
 vmware_regs_t?
   Renamed various parts from vmware_ioreq to vmware_regs.  Also
   HVM_PARAM_VMPORT_IOREQ_PFN to
 HVM_PARAM_VMPORT_REGS_PFN.
 cpu_by_ioreq_id name implies the array is indexed by an id
 carried in the ioreq.
   Renamed cpu_by_ioreq_id to cpu_by_vcpu_id.
 Is cpu_get_vmport_ioreq_from_shared_memory worth its own
 function?
   Moved in-line.
 I don't think you need the barrier anyway.
   Dropped the barrier.
 Oh, I now realize you mean the same theoretical rather than
 actual limit, in which case this can be a build time check
 anyway.
   Switch to build time check, move to a better place.
 You could avoid passing state to both of them by setting
 current_cpu here couldn't you?
   Yes, moved state usage to handle_vmport_ioreq().
 Error out if it fails with error != -ENOSYS.
   Done.
 
 v4:
 Please try to get rid of the #ifdefs.
   Moved 2 #ifdefs into hw/xen/xen_common.h
 

One possible nit, inline below, but...

Reviewed-by: Paul Durrant paul.durr...@citrix.com

 
  include/hw/xen/xen_common.h |  22 +
  xen-hvm.c   | 110
 ++--
  2 files changed, 127 insertions(+), 5 deletions(-)
 
 diff --git a/include/hw/xen/xen_common.h
 b/include/hw/xen/xen_common.h
 index 07731b9..42e3d77 100644
 --- a/include/hw/xen/xen_common.h
 +++ b/include/hw/xen/xen_common.h
 @@ -164,4 +164,26 @@ void destroy_hvm_domain(bool reboot);
  /* shutdown/destroy current domain because of an error */
  void xen_shutdown_fatal_error(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
 
 +#ifdef HVM_PARAM_VMPORT_REGS_PFN
 +static inline int xen_get_vmport_regs_pfn(XenXC xc, domid_t dom,
 +  unsigned long *vmport_regs_pfn)
 +{
 +return xc_get_hvm_param(xc, dom,
 HVM_PARAM_VMPORT_REGS_PFN,
 +vmport_regs_pfn);
 +}
 +#else
 +static inline int xen_get_vmport_regs_pfn(XenXC xc, domid_t dom,
 +  unsigned long *vmport_regs_pfn)
 +{
 +return -ENOSYS;
 +}
 +#endif
 +
 +#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
 +/* Force a compilation error if condition is true */
 +#define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
 +#else
 +#define BUILD_BUG_ON(cond) ((void)sizeof(struct { int:-!!(cond); }))
 +#endif
 +
  #endif /* QEMU_HW_XEN_COMMON_H */
 diff --git a/xen-hvm.c b/xen-hvm.c
 index 05e522c..b5ef683 100644
 --- a/xen-hvm.c
 +++ b/xen-hvm.c
 @@ -41,6 +41,29 @@ static MemoryRegion *framebuffer;
  static bool xen_in_migration;
 
  /* Compatibility with older version */
 +
 +/* This allows QEMU to build on a system that has Xen 4.5 or earlier
 + * installed.  This here (not in hw/xen/xen_common.h) because
 xen/hvm/ioreq.h
 + * needs to be included before this block and hw/xen/xen_common.h
 needs to
 + * be included before xen/hvm/ioreq.h
 + */
 +#ifndef IOREQ_TYPE_VMWARE_PORT
 +#define IOREQ_TYPE_VMWARE_PORT  3
 +struct vmware_regs {
 +uint32_t esi;
 +uint32_t edi;
 +uint32_t ebx;
 +uint32_t ecx;
 +uint32_t edx;
 +};
 +typedef struct vmware_regs vmware_regs_t;
 +
 +struct shared_vmport_iopage {
 +struct vmware_regs vcpu_vmport_regs[1];
 +};
 +typedef struct shared_vmport_iopage shared_vmport_iopage_t;
 +#endif
 +
  #if __XEN_LATEST_INTERFACE_VERSION__ < 0x0003020a
  static inline uint32_t xen_vcpu_eport(shared_iopage_t *shared_page, int i)
  {
 @@ -79,8 +102,10 @@ typedef struct XenPhysmap {
 
  typedef struct XenIOState {
  shared_iopage_t *shared_page;
 +shared_vmport_iopage_t *shared_vmport_page;
  buffered_iopage_t *buffered_io_page;
  QEMUTimer *buffered_io_timer;
 +CPUState **cpu_by_vcpu_id;
  /* the evtchn port for polling the notification, */
  evtchn_port_t *ioreq_local_port;
  /* evtchn local port for buffered io */
 @@ -101,6 +126,8 

[Qemu-devel] [PATCH v2 1/2] vl.c: use single local_err throughout main()

2014-10-20 Thread Igor Mammedov
Signed-off-by: Igor Mammedov imamm...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 vl.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/vl.c b/vl.c
index aee73e1..528c289 100644
--- a/vl.c
+++ b/vl.c
@@ -2771,6 +2771,7 @@ int main(int argc, char **argv, char **envp)
 };
 const char *trace_events = NULL;
 const char *trace_file = NULL;
+Error *local_err = NULL;
 const ram_addr_t default_ram_size = (ram_addr_t)DEFAULT_RAM_SIZE *
 1024 * 1024;
 ram_addr_t maxram_size = default_ram_size;
@@ -4072,7 +4073,6 @@ int main(int argc, char **argv, char **envp)
 configure_accelerator(current_machine);
 
 if (qtest_chrdev) {
-Error *local_err = NULL;
 qtest_init(qtest_chrdev, qtest_log, &local_err);
 if (local_err) {
 error_report("%s", error_get_pretty(local_err));
@@ -4316,7 +4316,6 @@ int main(int argc, char **argv, char **envp)
 #ifdef CONFIG_VNC
 /* init remote displays */
 if (vnc_display) {
-Error *local_err = NULL;
 vnc_display_init(ds);
 vnc_display_open(ds, vnc_display, &local_err);
 if (local_err != NULL) {
@@ -4371,7 +4370,6 @@ int main(int argc, char **argv, char **envp)
 }
 
 if (incoming) {
-Error *local_err = NULL;
 qemu_start_incoming_migration(incoming, &local_err);
 if (local_err) {
 error_report("-incoming %s: %s", incoming,
-- 
1.9.3




Re: [Qemu-devel] [PATCH v5 6/7] stm32f205: Add the stm32f205 SoC

2014-10-20 Thread Alistair Francis
On Mon, Oct 20, 2014 at 5:47 PM, Peter Crosthwaite
peter.crosthwa...@xilinx.com wrote:
 On Thu, Oct 16, 2014 at 10:54 PM, Alistair Francis alistai...@gmail.com 
 wrote:
 This patch adds the stm32f205 SoC. This will be used by the
 Netduino 2 to create a machine.

 Signed-off-by: Alistair Francis alistai...@gmail.com
 ---
  default-configs/arm-softmmu.mak |   1 +
  hw/arm/Makefile.objs|   1 +
  hw/arm/stm32f205_soc.c  | 157 
 
  include/hw/arm/stm32f205_soc.h  |  69 ++
  4 files changed, 228 insertions(+)
  create mode 100644 hw/arm/stm32f205_soc.c
  create mode 100644 include/hw/arm/stm32f205_soc.h

 diff --git a/default-configs/arm-softmmu.mak 
 b/default-configs/arm-softmmu.mak
 index a2ea8f7..8068100 100644
 --- a/default-configs/arm-softmmu.mak
 +++ b/default-configs/arm-softmmu.mak
 @@ -81,6 +81,7 @@ CONFIG_ZYNQ=y
  CONFIG_STM32F205_TIMER=y
  CONFIG_STM32F205_USART=y
  CONFIG_STM32F205_SYSCFG=y
 +CONFIG_STM32F205_SOC=y

  CONFIG_VERSATILE_PCI=y
  CONFIG_VERSATILE_I2C=y
 diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
 index 6088e53..9769317 100644
 --- a/hw/arm/Makefile.objs
 +++ b/hw/arm/Makefile.objs
 @@ -8,3 +8,4 @@ obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o 
 pxa2xx_pic.o
  obj-$(CONFIG_DIGIC) += digic.o
  obj-y += omap1.o omap2.o strongarm.o
  obj-$(CONFIG_ALLWINNER_A10) += allwinner-a10.o cubieboard.o
 +obj-$(CONFIG_STM32F205_SOC) += stm32f205_soc.o
 diff --git a/hw/arm/stm32f205_soc.c b/hw/arm/stm32f205_soc.c
 new file mode 100644
 index 000..bd9514e
 --- /dev/null
 +++ b/hw/arm/stm32f205_soc.c
 @@ -0,0 +1,157 @@
 +/*
 + * STM32F205 SoC
 + *
 + * Copyright (c) 2014 Alistair Francis alist...@alistair23.me
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a 
 copy
 + * of this software and associated documentation files (the "Software"), to 
 deal
 + * in the Software without restriction, including without limitation the 
 rights
 + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + * copies of the Software, and to permit persons to whom the Software is
 + * furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included 
 in
 + * all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
 OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
 OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
 FROM,
 + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 + * THE SOFTWARE.
 + */
 +
 +#include "hw/arm/stm32f205_soc.h"
 +
 +/* At the moment only Timer 2 to 5 are modelled */
 +static const uint32_t timer_addr[] = { 0x40000000, 0x40000400,
 +0x40000800, 0x40000C00 };
 +static const uint32_t usart_addr[] = { 0x40011000, 0x40004400,
 +0x40004800, 0x40004C00, 0x40005000, 0x40011400 };
 +

 You have 6 addresses for USART ...

 +static const int timer_irq[] = {28, 29, 30, 50};
 +static const int usart_irq[] = {37, 38, 39, 52, 53, 71, 82, 83};
 +

 ... but 8 IRQS and the loop below uses only 5 values. What's the system 
 exactly?

These must be left over from the Netduino Plus 2. I think it's just
the first five, but I'll
double check and fix in a respin
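
One way to catch this kind of table mismatch at build time is sketched below. It assumes a C11 compiler; `ARRAY_SIZE`, `NUM_USARTS` and `usart_irq_for_addr()` are hypothetical names, and the six address/IRQ pairs are illustrative only, since the thread leaves open which entries are correct for this SoC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative tables: one address and one IRQ per USART. */
static const uint32_t usart_addr[] = { 0x40011000, 0x40004400, 0x40004800,
                                       0x40004C00, 0x40005000, 0x40011400 };
static const int usart_irq[] = { 37, 38, 39, 52, 53, 71 };

/* Derive the count from the table instead of repeating a literal. */
#define NUM_USARTS ARRAY_SIZE(usart_addr)

/* The build fails if someone extends one table but not the other. */
_Static_assert(ARRAY_SIZE(usart_irq) == ARRAY_SIZE(usart_addr),
               "each USART needs exactly one address and one IRQ");

/* Loops take their bound from NUM_USARTS; returns -1 for unknown addresses. */
static int usart_irq_for_addr(uint32_t addr)
{
    size_t i;

    for (i = 0; i < NUM_USARTS; i++) {
        if (usart_addr[i] == addr) {
            return usart_irq[i];
        }
    }
    return -1;
}
```

With the count derived from the tables, the init and realize loops cannot drift away from the data the way a hard-coded 5 can.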


 +static void stm32f205_soc_initfn(Object *obj)
 +{
 +STM32F205State *s = STM32F205_SOC(obj);
 +int i;
 +
 +object_initialize(&s->syscfg, sizeof(s->syscfg), TYPE_STM32F205_SYSCFG);
 +qdev_set_parent_bus(DEVICE(&s->syscfg), sysbus_get_default());
 +
 +for (i = 0; i < 5; i++) {
 +object_initialize(&s->usart[i], sizeof(s->usart[i]),
 +  TYPE_STM32F205_USART);
 +qdev_set_parent_bus(DEVICE(&s->usart[i]), sysbus_get_default());
 +}
 +
 +for (i = 0; i < 4; i++) {
 +object_initialize(&s->timer[i], sizeof(s->timer[i]),
 +  TYPE_STM32F205_TIMER);
 +qdev_set_parent_bus(DEVICE(&s->timer[i]), sysbus_get_default());
 +}
 +}
 +}
 +
 +static void stm32f205_soc_realize(DeviceState *dev_soc, Error **errp)
 +{
 +STM32F205State *s = STM32F205_SOC(dev_soc);
 +DeviceState *syscfgdev, *usartdev, *timerdev;
 +SysBusDevice *syscfgbusdev, *usartbusdev, *timerbusdev;
 +qemu_irq *pic;;

 stray ;

Will fix


 +Error *err = NULL;
 +int i;
 +
 +MemoryRegion *system_memory = get_system_memory();
 +MemoryRegion *sram = g_new(MemoryRegion, 1);
 +MemoryRegion *flash = g_new(MemoryRegion, 1);
 +MemoryRegion *flash_alias = g_new(MemoryRegion, 1);
 +
 +memory_region_init_ram(flash, NULL, "netduino.flash", FLASH_SIZE,
 +   &error_abort);
 +memory_region_init_alias(flash_alias, NULL, 

[Qemu-devel] [PATCH v2 2/2] vl.c: reduce exit on error code duplication

2014-10-20 Thread Igor Mammedov
use error_report_fatal() helper instead of a bunch of
if (local_err) {
error_report(foo);
error_free(local_err);
exit(1);
}
code blocks

Signed-off-by: Igor Mammedov imamm...@redhat.com
---
 * s/exit_if_error/error_report_fatal/
 * move error_report_fatal() into util/error.c
---
 include/qemu/error-report.h |  2 ++
 util/error.c| 24 
 vl.c| 34 ++
 3 files changed, 32 insertions(+), 28 deletions(-)

diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
index 7ab2355..a80a1a7 100644
--- a/include/qemu/error-report.h
+++ b/include/qemu/error-report.h
@@ -16,6 +16,7 @@
 #include <stdarg.h>
 #include <stdbool.h>
 #include "qemu/compiler.h"
+#include "qapi/error.h"
 
 typedef struct Location {
 /* all members are private to qemu-error.c */
@@ -40,6 +41,7 @@ void error_printf_unless_qmp(const char *fmt, ...) 
GCC_FMT_ATTR(1, 2);
 void error_set_progname(const char *argv0);
 void error_vreport(const char *fmt, va_list ap) GCC_FMT_ATTR(1, 0);
 void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
+void error_report_fatal(Error *err, const char *fmt, ...) GCC_FMT_ATTR(2, 3);
 const char *error_get_progname(void);
 extern bool enable_timestamp_msg;
 
diff --git a/util/error.c b/util/error.c
index 2ace0d8..61d6d21 100644
--- a/util/error.c
+++ b/util/error.c
@@ -171,3 +171,27 @@ void error_propagate(Error **dst_errp, Error *local_err)
 error_free(local_err);
 }
 }
+
+void error_report_fatal(Error *err, const char *fmt, ...)
+{
+va_list ap;
+
+if (!err) {
+return;
+}
+
+if (fmt) {
+char *optional_msg = NULL;
+
+va_start(ap, fmt);
+optional_msg = g_strdup_vprintf(fmt, ap);
+va_end(ap);
+error_report("%s: %s", optional_msg, error_get_pretty(err));
+g_free(optional_msg);
+} else {
+error_report("%s", error_get_pretty(err));
+}
+
+error_free(err);
+exit(EXIT_FAILURE);
+}
diff --git a/vl.c b/vl.c
index 528c289..4f03b96 100644
--- a/vl.c
+++ b/vl.c
@@ -2264,11 +2264,7 @@ static int chardev_init_func(QemuOpts *opts, void 
*opaque)
 Error *local_err = NULL;
 
 qemu_chr_new_from_opts(opts, NULL, &local_err);
-if (local_err) {
-error_report("%s", error_get_pretty(local_err));
-error_free(local_err);
-return -1;
-}
+error_report_fatal(local_err, NULL);
 return 0;
 }
 
@@ -2674,12 +2670,7 @@ static int machine_set_property(const char *name, const 
char *value,
 string_input_visitor_cleanup(siv);
 g_free(qom_name);
 
-if (local_err) {
-qerror_report_err(local_err);
-error_free(local_err);
-return -1;
-}
-
+error_report_fatal(local_err, NULL);
 return 0;
 }
 
@@ -4074,11 +4065,7 @@ int main(int argc, char **argv, char **envp)
 
 if (qtest_chrdev) {
 qtest_init(qtest_chrdev, qtest_log, &local_err);
-if (local_err) {
-error_report("%s", error_get_pretty(local_err));
-error_free(local_err);
-exit(1);
-}
+error_report_fatal(local_err, NULL);
 }
 
 machine_opts = qemu_get_machine_opts();
@@ -4318,12 +4305,8 @@ int main(int argc, char **argv, char **envp)
 if (vnc_display) {
 vnc_display_init(ds);
 vnc_display_open(ds, vnc_display, &local_err);
-if (local_err != NULL) {
-error_report("Failed to start VNC server on `%s': %s",
- vnc_display, error_get_pretty(local_err));
-error_free(local_err);
-exit(1);
-}
+error_report_fatal(local_err, "Failed to start VNC server on '%s'",
+   vnc_display);
 
 if (show_vnc_port) {
 printf("VNC server running on `%s'\n", vnc_display_local_addr(ds));
@@ -4371,12 +4354,7 @@ int main(int argc, char **argv, char **envp)
 
 if (incoming) {
 qemu_start_incoming_migration(incoming, &local_err);
-if (local_err) {
-error_report("-incoming %s: %s", incoming,
- error_get_pretty(local_err));
-error_free(local_err);
-exit(1);
-}
+error_report_fatal(local_err, "-incoming %s", incoming);
 } else if (autostart) {
 vm_start();
 }
-- 
1.9.3




Re: [Qemu-devel] [PATCH v5 1/1] xen-hvm.c: Add support for Xen access to vmport

2014-10-20 Thread Don Slutz

On 10/20/14 08:30, Paul Durrant wrote:

-Original Message-
From: Don Slutz [mailto:dsl...@verizon.com]
Sent: 20 October 2014 13:19
To: qemu-devel@nongnu.org; Paul Durrant
Cc: xen-de...@lists.xensource.com; Alexander Graf; Andreas Färber;
Anthony Liguori; Don Slutz; Marcel Apfelbaum; Markus Armbruster; Michael
S. Tsirkin; Stefano Stabellini
Subject: [PATCH v5 1/1] xen-hvm.c: Add support for Xen access to vmport

This adds synchronisation of the 6 vcpu registers (only 32bits of
them) that vmport.c needs between Xen and QEMU.

This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
fetch and put these 6 vcpu registers used by the code in vmport.c
and vmmouse.c

The registers are passed in the new shared page provided by
HVM_PARAM_VMPORT_REGS_PFN.

Add new array to XenIOState that allows selection of current_cpu by
vcpu id.

Now pass XenIOState to handle_ioreq().

Add new routines regs_to_cpu(), regs_from_cpu(), and
handle_vmport_ioreq().

Signed-off-by: Don Slutz dsl...@verizon.com
---
v5:
 vmware_ioreq_t struct is not really a request any more. Maybe
 vmware_regs_t?
   Renamed various parts from vmware_ioreq to vmware_regs.  Also
   HVM_PARAM_VMPORT_IOREQ_PFN to
HVM_PARAM_VMPORT_REGS_PFN.
 cpu_by_ioreq_id name implies the array is indexed by an id
 carried in the ioreq.
   Renamed cpu_by_ioreq_id to cpu_by_vcpu_id.
 Is cpu_get_vmport_ioreq_from_shared_memory worth its own
 function?
   Moved in-line.
 I don't think you need the barrier anyway.
   Dropped the barrier.
 Oh, I now realize you mean the same theoretical rather than
 actual limit, in which case this can be a build time check
 anyway.
   Switch to build time check, move to a better place.
 You could avoid passing state to both of them by setting
 current_cpu here couldn't you?
   Yes, moved state usage to handle_vmport_ioreq().
 Error out if it fails with error != -ENOSYS.
   Done.

v4:
 Please try to get rid of the #ifdefs.
   Moved 2 #ifdefs into hw/xen/xen_common.h


One possible nit, inline below, but...

Reviewed-by: Paul Durrant paul.durr...@citrix.com


  include/hw/xen/xen_common.h |  22 +
  xen-hvm.c   | 110
++--
  2 files changed, 127 insertions(+), 5 deletions(-)

diff --git a/include/hw/xen/xen_common.h
b/include/hw/xen/xen_common.h
index 07731b9..42e3d77 100644
--- a/include/hw/xen/xen_common.h
+++ b/include/hw/xen/xen_common.h
@@ -164,4 +164,26 @@ void destroy_hvm_domain(bool reboot);
  /* shutdown/destroy current domain because of an error */
  void xen_shutdown_fatal_error(const char *fmt, ...) GCC_FMT_ATTR(1, 2);

+#ifdef HVM_PARAM_VMPORT_REGS_PFN
+static inline int xen_get_vmport_regs_pfn(XenXC xc, domid_t dom,
+  unsigned long *vmport_regs_pfn)
+{
+return xc_get_hvm_param(xc, dom,
HVM_PARAM_VMPORT_REGS_PFN,
+vmport_regs_pfn);
+}
+#else
+static inline int xen_get_vmport_regs_pfn(XenXC xc, domid_t dom,
+  unsigned long *vmport_regs_pfn)
+{
+return -ENOSYS;
+}
+#endif
+
+#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+/* Force a compilation error if condition is true */
+#define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
+#else
+#define BUILD_BUG_ON(cond) ((void)sizeof(struct { int:-!!(cond); }))
+#endif
+
  #endif /* QEMU_HW_XEN_COMMON_H */
diff --git a/xen-hvm.c b/xen-hvm.c
index 05e522c..b5ef683 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -41,6 +41,29 @@ static MemoryRegion *framebuffer;
  static bool xen_in_migration;

  /* Compatibility with older version */
+
+/* This allows QEMU to build on a system that has Xen 4.5 or earlier
+ * installed.  This here (not in hw/xen/xen_common.h) because
xen/hvm/ioreq.h
+ * needs to be included before this block and hw/xen/xen_common.h
needs to
+ * be included before xen/hvm/ioreq.h
+ */
+#ifndef IOREQ_TYPE_VMWARE_PORT
+#define IOREQ_TYPE_VMWARE_PORT  3
+struct vmware_regs {
+uint32_t esi;
+uint32_t edi;
+uint32_t ebx;
+uint32_t ecx;
+uint32_t edx;
+};
+typedef struct vmware_regs vmware_regs_t;
+
+struct shared_vmport_iopage {
+struct vmware_regs vcpu_vmport_regs[1];
+};
+typedef struct shared_vmport_iopage shared_vmport_iopage_t;
+#endif
+
  #if __XEN_LATEST_INTERFACE_VERSION__ < 0x0003020a
  static inline uint32_t xen_vcpu_eport(shared_iopage_t *shared_page, int i)
  {
@@ -79,8 +102,10 @@ typedef struct XenPhysmap {

  typedef struct XenIOState {
  shared_iopage_t *shared_page;
+shared_vmport_iopage_t *shared_vmport_page;
  buffered_iopage_t *buffered_io_page;
  QEMUTimer *buffered_io_timer;
+CPUState **cpu_by_vcpu_id;
  /* the evtchn port for polling the notification, */
  evtchn_port_t *ioreq_local_port;
  /* evtchn local port for buffered io */
@@ -101,6 +126,8 @@ typedef struct XenIOState {
  Notifier wakeup;
  } 

[Qemu-devel] [PATCH v6 1/1] xen-hvm.c: Add support for Xen access to vmport

2014-10-20 Thread Don Slutz
This adds synchronisation of the 6 vcpu registers (only 32bits of
them) that vmport.c needs between Xen and QEMU.

This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
fetch and put these 6 vcpu registers used by the code in vmport.c
and vmmouse.c

The registers are passed in the new shared page provided by
HVM_PARAM_VMPORT_REGS_PFN.

Add new array to XenIOState that allows selection of current_cpu by
vcpu id.

Now pass XenIOState to handle_ioreq().

Add new routines regs_to_cpu(), regs_from_cpu(), and
handle_vmport_ioreq().

Signed-off-by: Don Slutz dsl...@verizon.com
Reviewed-by: Paul Durrant paul.durr...@citrix.com
---
v6:
Do we need a forward declaration?
  Nope. Dropped.
Added Reviewed-by from Paul Durrant.

v5:
vmware_ioreq_t struct is not really a request any more. Maybe
vmware_regs_t?
  Renamed various parts from vmware_ioreq to vmware_regs.  Also
  HVM_PARAM_VMPORT_IOREQ_PFN to HVM_PARAM_VMPORT_REGS_PFN.
cpu_by_ioreq_id name implies the array is indexed by an id
carried in the ioreq.
  Renamed cpu_by_ioreq_id to cpu_by_vcpu_id.
Is cpu_get_vmport_ioreq_from_shared_memory worth its own
function?
  Moved in-line.
I don't think you need the barrier anyway.
  Dropped the barrier.
Oh, I now realize you mean the same theoretical rather than
actual limit, in which case this can be a build time check
anyway.
  Switch to build time check, move to a better place.
You could avoid passing state to both of them by setting
current_cpu here couldn't you?
  Yes, moved state usage to handle_vmport_ioreq().
Error out if it fails with error != -ENOSYS.
  Done.

v4:
Please try to get rid of the #ifdefs.
  Moved 2 #ifdefs into hw/xen/xen_common.h


 include/hw/xen/xen_common.h |  22 +
 xen-hvm.c   | 108 ++--
 2 files changed, 125 insertions(+), 5 deletions(-)

diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
index 07731b9..42e3d77 100644
--- a/include/hw/xen/xen_common.h
+++ b/include/hw/xen/xen_common.h
@@ -164,4 +164,26 @@ void destroy_hvm_domain(bool reboot);
 /* shutdown/destroy current domain because of an error */
 void xen_shutdown_fatal_error(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
 
+#ifdef HVM_PARAM_VMPORT_REGS_PFN
+static inline int xen_get_vmport_regs_pfn(XenXC xc, domid_t dom,
+  unsigned long *vmport_regs_pfn)
+{
+return xc_get_hvm_param(xc, dom, HVM_PARAM_VMPORT_REGS_PFN,
+vmport_regs_pfn);
+}
+#else
+static inline int xen_get_vmport_regs_pfn(XenXC xc, domid_t dom,
+  unsigned long *vmport_regs_pfn)
+{
+return -ENOSYS;
+}
+#endif
+
+#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+/* Force a compilation error if condition is true */
+#define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
+#else
+#define BUILD_BUG_ON(cond) ((void)sizeof(struct { int:-!!(cond); }))
+#endif
+
 #endif /* QEMU_HW_XEN_COMMON_H */
diff --git a/xen-hvm.c b/xen-hvm.c
index 05e522c..7c6291a 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -41,6 +41,29 @@ static MemoryRegion *framebuffer;
 static bool xen_in_migration;
 
 /* Compatibility with older version */
+
+/* This allows QEMU to build on a system that has Xen 4.5 or earlier
+ * installed.  This here (not in hw/xen/xen_common.h) because xen/hvm/ioreq.h
+ * needs to be included before this block and hw/xen/xen_common.h needs to
+ * be included before xen/hvm/ioreq.h
+ */
+#ifndef IOREQ_TYPE_VMWARE_PORT
+#define IOREQ_TYPE_VMWARE_PORT  3
+struct vmware_regs {
+uint32_t esi;
+uint32_t edi;
+uint32_t ebx;
+uint32_t ecx;
+uint32_t edx;
+};
+typedef struct vmware_regs vmware_regs_t;
+
+struct shared_vmport_iopage {
+struct vmware_regs vcpu_vmport_regs[1];
+};
+typedef struct shared_vmport_iopage shared_vmport_iopage_t;
+#endif
+
 #if __XEN_LATEST_INTERFACE_VERSION__ < 0x0003020a
 static inline uint32_t xen_vcpu_eport(shared_iopage_t *shared_page, int i)
 {
@@ -79,8 +102,10 @@ typedef struct XenPhysmap {
 
 typedef struct XenIOState {
 shared_iopage_t *shared_page;
+shared_vmport_iopage_t *shared_vmport_page;
 buffered_iopage_t *buffered_io_page;
 QEMUTimer *buffered_io_timer;
+CPUState **cpu_by_vcpu_id;
 /* the evtchn port for polling the notification, */
 evtchn_port_t *ioreq_local_port;
 /* evtchn local port for buffered io */
@@ -773,7 +798,50 @@ static void cpu_ioreq_move(ioreq_t *req)
 }
 }
 
-static void handle_ioreq(ioreq_t *req)
+static void regs_to_cpu(vmware_regs_t *vmport_regs, ioreq_t *req)
+{
+X86CPU *cpu;
+CPUX86State *env;
+
+cpu = X86_CPU(current_cpu);
+env = &cpu->env;
+env->regs[R_EAX] = req->data;
+env->regs[R_EBX] = vmport_regs->ebx;
+env->regs[R_ECX] = vmport_regs->ecx;
+env->regs[R_EDX] = vmport_regs->edx;
+env->regs[R_ESI] = vmport_regs->esi;
+

[Qemu-devel] [PATCH v6 0/1] Add support for Xen access to vmport

2014-10-20 Thread Don Slutz
Changes v5 to v6:
  Paul Durrant
Do we need a forward declaration?
  Nope. Dropped.
Added Reviewed-by.

Changes v4 to v5:
  Paul Durrant
vmware_ioreq_t struct is not really a request any more. Maybe
vmware_regs_t?
  Renamed various parts from vmware_ioreq to vmware_regs.  Also
  HVM_PARAM_VMPORT_IOREQ_PFN to HVM_PARAM_VMPORT_REGS_PFN.
cpu_by_ioreq_id name implies the array is indexed by an id
carried in the ioreq.
  Renamed cpu_by_ioreq_id to cpu_by_vcpu_id.
Is cpu_get_vmport_ioreq_from_shared_memory worth its own
function?
  Moved in-line.
I don't think you need the barrier anyway.
  Dropped the barrier.
Oh, I now realize you mean the same theoretical rather than
actual limit, in which case this can be a build time check
anyway.
  Switch to build time check, move to a better place.
You could avoid passing state to both of them by setting
current_cpu here couldn't you?
  Yes, moved state usage to handle_vmport_ioreq().

  Stefano Stabellini
Error out if it fails with error != -ENOSYS.
  Done.

Changes RFC-v2x to v4:
  Stefano Stabellini
Please try to get rid of the #ifdefs.
  Moved 2 #ifdefs into hw/xen/xen_common.h

Changes v2 to RFC-v2x:
  Paul Durrant
Use a 2nd shared page.
  Added HVM_PARAM_VMPORT_IOREQ_PFN usage.

Changes v1 to v2:
   More info in commit message.

  Stefano Stabellini
the registers being passed explicitly by Xen rather than
hiding them in other ioreq fields.
   Added vmware_ioreq_t
  Paolo Bonzini & Alexander Graf
Fixup env access
  Added cpu_by_ioreq_id.
  Set current_cpu in regs_to_cpu(), clear in regs_from_cpu().
  Drop all changes to vmport.c

Note: to use this with Xen either a version of:

[Qemu-devel] [PATCH] -machine vmport=off: Allow disabling of VMWare ioport 
emulation

or

From f70663d9fb86914144ba340b6186cb1e67ac6eec Mon Sep 17 00:00:00 2001
From: Don Slutz dsl...@verizon.com
Date: Fri, 26 Sep 2014 08:11:39 -0400
Subject: [PATCH 1/2] hack: force enable vmport

Signed-off-by: Don Slutz dsl...@verizon.com
---
 hw/i386/pc_piix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 103d756..b76dfbc 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -234,7 +234,7 @@ static void pc_init1(MachineState *machine,
 pc_vga_init(isa_bus, pci_enabled ? pci_bus : NULL);
 
 /* init basic PC hardware */
-pc_basic_device_init(isa_bus, gsi, &rtc_state, &floppy, xen_enabled(),
+pc_basic_device_init(isa_bus, gsi, &rtc_state, &floppy, false,
 0x4);
 
 pc_nic_init(isa_bus, pci_bus);
-- 
1.8.4

needs to be done to QEMU.

And the Xen RFC patch:

[RFC][PATCH v2 1/1] Add IOREQ_TYPE_VMWARE_PORT

needs to be done to Xen.

Don Slutz (1):
  xen-hvm.c: Add support for Xen access to vmport

 include/hw/xen/xen_common.h |  22 +
 xen-hvm.c   | 108 ++--
 2 files changed, 125 insertions(+), 5 deletions(-)

-- 
1.8.4




Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Peter Lieven

On 20.10.2014 14:19, Max Reitz wrote:

On 2014-10-20 at 14:16, Peter Lieven wrote:

On 20.10.2014 13:51, Max Reitz wrote:

On 2014-10-20 at 12:03, Peter Lieven wrote:

On 20.10.2014 11:27, Max Reitz wrote:

On 2014-10-20 at 11:14, Peter Lieven wrote:

On 20.10.2014 10:59, Max Reitz wrote:

On 2014-10-20 at 08:14, Peter Lieven wrote:

the block layer silently merges write requests since


s/^t/T/


commit 40b4f539. This patch adds a knob to disable
this feature as there has been some discussion lately
if multiwrite is a good idea at all and as it falsifies
benchmarks.

Signed-off-by: Peter Lieven p...@kamp.de
---
  block.c   |4 
  block/qapi.c  |1 +
  blockdev.c|7 +++
  hmp.c |4 
  include/block/block_int.h |1 +
  qapi/block-core.json  |   10 +-
  qemu-options.hx   |1 +
  qmp-commands.hx   |2 ++
  8 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 27533f3..1658a72 100644
--- a/block.c
+++ b/block.c
@@ -4531,6 +4531,10 @@ static int multiwrite_merge(BlockDriverState *bs, 
BlockRequest *reqs,
  {
  int i, outidx;
  +if (!bs->write_merging) {
+return num_reqs;
+}
+
  // Sort requests by start sector
  qsort(reqs, num_reqs, sizeof(*reqs), multiwrite_req_compare);
  diff --git a/block/qapi.c b/block/qapi.c
index 9733ebd..02251dd 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -58,6 +58,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs)
info->backing_file_depth = bdrv_get_backing_file_depth(bs);
  info->detect_zeroes = bs->detect_zeroes;
+info->write_merging = bs->write_merging;
if (bs->io_limits_enabled) {
  ThrottleConfig cfg;
diff --git a/blockdev.c b/blockdev.c
index e595910..13e47b8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
*bs_opts,
  const char *id;
  bool has_driver_specific_opts;
  BlockdevDetectZeroesOptions detect_zeroes;
+bool write_merging;
  BlockDriver *drv = NULL;
/* Check common options by copying from bs_opts to opts, all other 
options
@@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char *file, QDict 
*bs_opts,
  snapshot = qemu_opt_get_bool(opts, "snapshot", 0);
  ro = qemu_opt_get_bool(opts, "read-only", 0);
  copy_on_read = qemu_opt_get_bool(opts, "copy-on-read", false);
+write_merging = qemu_opt_get_bool(opts, "write-merging", true);


Using this option in blockdev_init() means that you can only enable or disable
merging for the top layer (the root BDS). Furthermore, since you don't set
bs->write_merging in bdrv_new() (or at least bdrv_open()), it actually defaults
to false and only for the top layer it defaults to true.


Therefore, if after this patch a format block driver issues a multiwrite to its 
file, the write will not be merged and the user can do nothing about it. I 
don't suppose this is intentional...?
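
To make the default-value concern above concrete, here is a small standalone sketch. It is not QEMU code: struct node, node_new(), node_new_with_default() and merge_requests() are hypothetical stand-ins, and the merge step only joins exactly adjacent requests, which is a simplification of what multiwrite_merge() does.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a block node. */
struct node {
    bool write_merging;
};

struct req {
    long sector;
    int nb_sectors;
};

/* Zero-initialising allocation: the flag silently starts out false. */
static struct node *node_new(void)
{
    return calloc(1, sizeof(struct node));
}

/* Variant that applies the intended default when the node is created. */
static struct node *node_new_with_default(void)
{
    struct node *n = calloc(1, sizeof(*n));
    if (n) {
        n->write_merging = true;
    }
    return n;
}

static int cmp_req(const void *a, const void *b)
{
    const struct req *x = a, *y = b;
    return (x->sector > y->sector) - (x->sector < y->sector);
}

/* Returns the number of requests left after merging adjacent ones. */
static int merge_requests(const struct node *n, struct req *reqs, int num_reqs)
{
    int i, outidx = 0;

    if (!n->write_merging || num_reqs == 0) {
        return num_reqs;            /* knob off: leave requests untouched */
    }
    qsort(reqs, num_reqs, sizeof(*reqs), cmp_req);
    for (i = 1; i < num_reqs; i++) {
        if (reqs[outidx].sector + reqs[outidx].nb_sectors == reqs[i].sector) {
            reqs[outidx].nb_sectors += reqs[i].nb_sectors;
        } else {
            reqs[++outidx] = reqs[i];
        }
    }
    return outidx + 1;
}
```

A node obtained from node_new() never merges unless something sets the flag later, which mirrors the situation described for non-root nodes.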


I am not sure if a block driver actually can do this at all? The only way to 
enter multiwrite is from virtio_blk_handle_request in virtio-blk.c.


Well, there's also qemu-io -c multiwrite (which only accesses the root BDS as 
well). But other than that, yes, you're right. So, in practice it shouldn't 
matter.





I propose evaluating the option in bdrv_open() and setting bs->write_merging 
there.


I wasn't aware actually. I remember that someone asked me to implement 
discard_zeroes in blockdev_init. I think it was something related to QMP. So we 
still might
need to check parameters at 2 positions? It is quite confusing which parameter 
has to be parsed where.


As for me, I don't know why some options are parsed in blockdev_init() at all. I guess all the options currently parsed in blockdev_init() should later be moved to the BlockBackend, at least that would be the idea. In practice, we cannot do that: 
Things like caching will stay in the BlockDriverState.


I think it's just broken. IMHO, everything related to the BB should be in blockdev_init() and everything related to the BDS should be in bdrv_open(). So the question is now whether you want write_merging to be in the BDS or in the BB. Considering BB 
is in Kevin's block branch as of last Friday, you might actually want to work on that branch and move the field into the BB if you decide that that's the place it should be in.


Actually, there are pros and cons for both BDS and BB. As of now my intention 
was to be able to turn it off. As there are people who would like to see it 
completely disappear I would not spend too much effort on that switch today.
Looking at BB it is a BDS thing and thus belongs to bdrv_open. But this is true 
for discard_zeroes (and others) as well. Kevin, Stefan, ultimately where 
should it be parsed?


Yes, and for cache, too. That's what I meant with "it's just broken".


Can you further help here. I think my problem was that I don't have 

Re: [Qemu-devel] spec, RFC: TLS support for NBD

2014-10-20 Thread Richard W.M. Jones
On Mon, Oct 20, 2014 at 01:56:43PM +0200, Florian Weimer wrote:
 On 10/20/2014 01:51 PM, Markus Armbruster wrote:
 Furthermore, STARTTLS is vulnerable to active attacks: if you can get
 between the peers, you can make them fall back to unencrypted silently.
 How do you plan to guard against that?
 
 The usual way to deal with this is to use different syntax for
 TLS-enabled and non-TLS addresses (e.g., https:// and http://).
 With a TLS address, the client must enforce that only TLS-enabled
 connections are possible.  STARTTLS isn't the problem here, it's
 just an accident of history that many STARTTLS client
 implementations do not require a TLS handshake before proceeding.
 
 I cannot comment on whether the proposed STARTTLS command is at the
 correct stage of the NBD protocol.  If there is a protocol
 description for NBD, I can have a look.

Two actually :-)  Both are covered here:

http://sourceforge.net/p/nbd/code/ci/master/tree/doc/proto.txt

I believe that the proposed changes only cover the new style
protocol.

There's no common syntax for nbd URLs that I'm aware of.  At least,
both qemu & guestfish have nbd:... strings that they can parse, but
both have a completely different syntax.  But we could still have a
client-side indication (flag or nbds:..) to say that we want to force
TLS.
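
A minimal sketch of what such a client-side indication could look like, assuming a hypothetical "nbds:" prefix means TLS is mandatory. None of these names come from an existing NBD client, and upgrading opportunistically for plain "nbd:" strings is a design choice of this sketch only.

```c
#include <stdbool.h>
#include <string.h>

enum nbd_action { NBD_PROCEED_PLAIN, NBD_PROCEED_TLS, NBD_ABORT };

/* Hypothetical convention: an "nbds:" prefix forces TLS. */
static bool uri_requires_tls(const char *uri)
{
    return strncmp(uri, "nbds:", 5) == 0;
}

/* Decide how to continue once we know whether the peer offers TLS. */
static enum nbd_action nbd_decide(const char *uri, bool server_offers_tls)
{
    if (uri_requires_tls(uri)) {
        /* Never fall back silently: a stripped TLS offer is an error. */
        return server_offers_tls ? NBD_PROCEED_TLS : NBD_ABORT;
    }
    /* Plain address: use TLS if offered, otherwise stay unencrypted. */
    return server_offers_tls ? NBD_PROCEED_TLS : NBD_PROCEED_PLAIN;
}
```

The point is the NBD_ABORT branch: with a TLS-only address, an attacker who removes the TLS offer gets a failed connection instead of a silent downgrade.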

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top



Re: [Qemu-devel] [PATCH v2 1/9] target-mips: add KScratch registers

2014-10-20 Thread Leon Alrae
Hi Yongbok,

On 14/10/2014 14:59, Yongbok Kim wrote:
 @@ -4611,6 +4612,15 @@ static inline void gen_mtc0_store64 (TCGv arg,
 target_ulong off)
   tcg_gen_st_tl(arg, cpu_env, off);
   }
   +static inline void gen_mfc0_unimplemented(DisasContext *ctx, TCGv arg)
 +{
 +if (ctx->insn_flags & ISA_MIPS32R6) {
 +tcg_gen_movi_tl(arg, 0);
 +} else {
 +tcg_gen_movi_tl(arg, ~0);
 +}
 +}
 +
 
 Not related with KScratch registers. It would be better to be a separate
 patch or
 as part of the patch [PATCH 5/6] target-mips: correctly handle access to
 unimplemented CP0 register.

Actually it is related to all cp0 registers and KScratch is the first
cp0 register added in the series, thus in my opinion this is a good
place for including the definition of gen_mfc0_unimplemented(). The
patch you mentioned is correcting the remaining (existing before this
patch) cp0 registers.
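
For readers without the TCG context, the behaviour of the quoted helper can be sketched as a plain function. EXAMPLE_ISA_R6 is a made-up flag bit, not the real ISA_MIPS32R6 value, and the 32-bit result type is an assumption (the real code works on target_ulong).

```c
#include <stdint.h>

/* Hypothetical flag bit used only in this sketch. */
#define EXAMPLE_ISA_R6 (1u << 0)

/* Value read from an unimplemented CP0 register:
 * all zeroes on Release 6, all ones otherwise. */
static uint32_t mfc0_unimplemented_value(uint32_t insn_flags)
{
    return (insn_flags & EXAMPLE_ISA_R6) ? 0u : ~0u;
}
```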

Regards,
Leon




Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Kevin Wolf
Am 20.10.2014 um 13:53 hat Peter Lieven geschrieben:
 On 20.10.2014 13:51, Max Reitz wrote:
 On 2014-10-20 at 12:03, Peter Lieven wrote:
 On 20.10.2014 11:27, Max Reitz wrote:
 On 2014-10-20 at 11:14, Peter Lieven wrote:
 On 20.10.2014 10:59, Max Reitz wrote:
 On 2014-10-20 at 08:14, Peter Lieven wrote:
 the block layer silently merges write requests since
 
 s/^t/T/
 
 commit 40b4f539. This patch adds a knob to disable
 this feature as there has been some discussion lately
 if multiwrite is a good idea at all and as it falsifies
 benchmarks.
 
 Signed-off-by: Peter Lieven p...@kamp.de
 ---
   block.c   |4 
   block/qapi.c  |1 +
   blockdev.c|7 +++
   hmp.c |4 
   include/block/block_int.h |1 +
   qapi/block-core.json  |   10 +-
   qemu-options.hx   |1 +
   qmp-commands.hx   |2 ++
   8 files changed, 29 insertions(+), 1 deletion(-)
 
 diff --git a/block.c b/block.c
 index 27533f3..1658a72 100644
 --- a/block.c
 +++ b/block.c
 @@ -4531,6 +4531,10 @@ static int multiwrite_merge(BlockDriverState 
 *bs, BlockRequest *reqs,
   {
   int i, outidx;
   +if (!bs->write_merging) {
 +return num_reqs;
 +}
 +
   // Sort requests by start sector
   qsort(reqs, num_reqs, sizeof(*reqs), multiwrite_req_compare);
   diff --git a/block/qapi.c b/block/qapi.c
 index 9733ebd..02251dd 100644
 --- a/block/qapi.c
 +++ b/block/qapi.c
 @@ -58,6 +58,7 @@ BlockDeviceInfo 
 *bdrv_block_device_info(BlockDriverState *bs)
 info->backing_file_depth = bdrv_get_backing_file_depth(bs);
   info->detect_zeroes = bs->detect_zeroes;
 +info->write_merging = bs->write_merging;
 if (bs->io_limits_enabled) {
   ThrottleConfig cfg;
 diff --git a/blockdev.c b/blockdev.c
 index e595910..13e47b8 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -378,6 +378,7 @@ static DriveInfo *blockdev_init(const char *file, 
 QDict *bs_opts,
   const char *id;
   bool has_driver_specific_opts;
   BlockdevDetectZeroesOptions detect_zeroes;
 +bool write_merging;
   BlockDriver *drv = NULL;
 /* Check common options by copying from bs_opts to opts, all 
  other options
 @@ -405,6 +406,7 @@ static DriveInfo *blockdev_init(const char *file, 
 QDict *bs_opts,
   snapshot = qemu_opt_get_bool(opts, "snapshot", 0);
   ro = qemu_opt_get_bool(opts, "read-only", 0);
   copy_on_read = qemu_opt_get_bool(opts, "copy-on-read", false);
 +write_merging = qemu_opt_get_bool(opts, "write-merging", true);
 
 Using this option in blockdev_init() means that you can
 only enable or disable merging for the top layer (the root
 BDS). Furthermore, since you don't set bs->write_merging
 in bdrv_new() (or at least bdrv_open()), it actually
 defaults to false and only for the top layer it defaults
 to true.
 
 Therefore, if after this patch a format block driver issues a multiwrite 
 to its file, the write will not be merged and the user can do nothing 
 about it. I don't suppose this is intentional...?
 
 I am not sure if a block driver actually can do this at all? The only way 
 to enter multiwrite is from virtio_blk_handle_request in virtio-blk.c.
 
 Well, there's also qemu-io -c multiwrite (which only accesses the root BDS 
 as well). But other than that, yes, you're right. So, in practice it 
 shouldn't matter.
 
 
 
 I propose evaluating the option in bdrv_open() and setting 
 bs->write_merging there.
 
 I wasn't aware actually. I remember that someone asked me to implement 
 discard_zeroes in blockdev_init. I think it was something related to QMP. 
 So we still might
 need to check parameters at 2 positions? It is quite confusing which 
 parameter has to be parsed where.
 
 As for me, I don't know why some options are parsed in
 blockdev_init() at all. I guess all the options currently
 parsed in blockdev_init() should later be moved to the
 BlockBackend, at least that would be the idea. In practice, we
 cannot do that: Things like caching will stay in the
 BlockDriverState.
 
 I think it's just broken. IMHO, everything related to the BB
 should be in blockdev_init() and everything related to the BDS
 should be in bdrv_open(). So the question is now whether you
 want write_merging to be in the BDS or in the BB. Considering
 BB is in Kevin's block branch as of last Friday, you might
 actually want to work on that branch and move the field into
 the BB if you decide that that's the place it should be in.
 
 Actually, there are pros and cons for both BDS and BB. As of now my 
 intention was to be able to turn it off. As there are people who would like 
 to see it completely disappear I would not spend too much effort on that 
 switch today.
 Looking at BB it is a BDS thing and thus belongs to bdrv_open. But this is 
 true for discard_zeroes (and others) as well. Kevin, Stefan, ultimately 
 where should it be parsed?
 
 Yes, and for cache, too. That's what I meant with it's just 

Re: [Qemu-devel] [PATCH] block: qemu-iotests change _supported_proto to file once more.

2014-10-20 Thread Benoît Canet
The Monday 20 Oct 2014 à 13:47:11 (+0200), Peter Lieven wrote :
 In preparation to possible automatic regression and performance
 testing for the block layer I found that the iotests don't work
 for all protocols anymore.
 
 In commit 1f7bf7d0 I started to change supported protocols from
 generic to file for various tests. Unfortunately, some tests
 added in the meantime again carry generic protocol although they
 can only work with file because they require local file access.
 
 The other way around for some tests that only support file I added
 NFS protocol after confirming they work.
 
 Signed-off-by: Peter Lieven p...@kamp.de
 ---
  tests/qemu-iotests/075 |2 +-
  tests/qemu-iotests/076 |2 +-
  tests/qemu-iotests/078 |2 +-
  tests/qemu-iotests/079 |2 +-
  tests/qemu-iotests/080 |2 +-
  tests/qemu-iotests/081 |2 +-
  tests/qemu-iotests/082 |2 +-
  tests/qemu-iotests/084 |2 +-
  tests/qemu-iotests/086 |2 +-
  tests/qemu-iotests/088 |2 +-
  tests/qemu-iotests/090 |2 +-
  tests/qemu-iotests/092 |2 +-
  tests/qemu-iotests/103 |2 +-
  13 files changed, 13 insertions(+), 13 deletions(-)
 
 diff --git a/tests/qemu-iotests/075 b/tests/qemu-iotests/075
 index 40032c5..6117660 100755
 --- a/tests/qemu-iotests/075
 +++ b/tests/qemu-iotests/075
 @@ -39,7 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  
  _supported_fmt cloop
 -_supported_proto generic
 +_supported_proto file
  _supported_os Linux
  
  block_size_offset=128
 diff --git a/tests/qemu-iotests/076 b/tests/qemu-iotests/076
 index b614a7d..bc47457 100755
 --- a/tests/qemu-iotests/076
 +++ b/tests/qemu-iotests/076
 @@ -39,7 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  
  _supported_fmt parallels
 -_supported_proto generic
 +_supported_proto file
  _supported_os Linux
  
  tracks_offset=$((0x1c))
 diff --git a/tests/qemu-iotests/078 b/tests/qemu-iotests/078
 index d4d6da7..7be2c3f 100755
 --- a/tests/qemu-iotests/078
 +++ b/tests/qemu-iotests/078
 @@ -39,7 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  
  _supported_fmt bochs
 -_supported_proto generic
 +_supported_proto file
  _supported_os Linux
  
  catalog_size_offset=$((0x48))
 diff --git a/tests/qemu-iotests/079 b/tests/qemu-iotests/079
 index 2142bbb..6613cfb 100755
 --- a/tests/qemu-iotests/079
 +++ b/tests/qemu-iotests/079
 @@ -39,7 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  
  _supported_fmt qcow2
 -_supported_proto file
 +_supported_proto file nfs
  _supported_os Linux
  
  function test_qemu_img()
 diff --git a/tests/qemu-iotests/080 b/tests/qemu-iotests/080
 index 6b3a3e7..9de337c 100755
 --- a/tests/qemu-iotests/080
 +++ b/tests/qemu-iotests/080
 @@ -40,7 +40,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  
  _supported_fmt qcow2
 -_supported_proto generic
 +_supported_proto file
  _supported_os Linux
  
  header_size=104
 diff --git a/tests/qemu-iotests/081 b/tests/qemu-iotests/081
 index 7ae4be2..ed3c29e 100755
 --- a/tests/qemu-iotests/081
 +++ b/tests/qemu-iotests/081
 @@ -41,7 +41,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  
  _supported_fmt raw
 -_supported_proto generic
 +_supported_proto file
  _supported_os Linux
  
  function do_run_qemu()
 diff --git a/tests/qemu-iotests/082 b/tests/qemu-iotests/082
 index 910b13e..e64de27 100755
 --- a/tests/qemu-iotests/082
 +++ b/tests/qemu-iotests/082
 @@ -39,7 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  
  _supported_fmt qcow2
 -_supported_proto file
 +_supported_proto file nfs
  _supported_os Linux
  
  function run_qemu_img()
 diff --git a/tests/qemu-iotests/084 b/tests/qemu-iotests/084
 index ae33c2c..2712c02 100755
 --- a/tests/qemu-iotests/084
 +++ b/tests/qemu-iotests/084
 @@ -41,7 +41,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  
  # This tests vdi-specific header fields
  _supported_fmt vdi
 -_supported_proto generic
 +_supported_proto file
  _supported_os Linux
  
  size=64M
 diff --git a/tests/qemu-iotests/086 b/tests/qemu-iotests/086
 index d9a80cf..234eb9a 100755
 --- a/tests/qemu-iotests/086
 +++ b/tests/qemu-iotests/086
 @@ -39,7 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  
  _supported_fmt qcow2
 -_supported_proto file
 +_supported_proto file nfs
  _supported_os Linux
  
  function run_qemu_img()
 diff --git a/tests/qemu-iotests/088 b/tests/qemu-iotests/088
 index c09adf8..f9c3129 100755
 --- a/tests/qemu-iotests/088
 +++ b/tests/qemu-iotests/088
 @@ -40,7 +40,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  
  _supported_fmt vpc
 -_supported_proto generic
 +_supported_proto file
  _supported_os Linux
  
  offset_block_size=$((512 + 32))
 diff --git a/tests/qemu-iotests/090 b/tests/qemu-iotests/090
 index 8d032f8..70b5a6f 100755
 --- a/tests/qemu-iotests/090
 +++ b/tests/qemu-iotests/090
 @@ -39,7 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  

Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Max Reitz

On 20.10.2014 at 14:48, Peter Lieven wrote:

On 20.10.2014 14:19, Max Reitz wrote:

On 2014-10-20 at 14:16, Peter Lieven wrote:

On 20.10.2014 13:51, Max Reitz wrote:

On 2014-10-20 at 12:03, Peter Lieven wrote:
[...]


Can you further help here. I think my problem was that I don't have 
access to the commandline options in bdrv_open?!


You do. It's the options QDict. :-)


Maybe I just don't get it.

If I specify

qemu -drive if=virtio,file=image.qcow2,write-merging=off

and check with

qdict_get_try_bool(options, "write-merging", true);

in bdrv_open() directly before bdrv_swap I always get true.


Hm, judging from fprintf(stderr, "%s\n", 
qstring_get_str(qobject_to_json_pretty(QOBJECT(options))));, it's there 
for me (directly after qdict_del(options, "node-name")). The output is:


Qemu wrote:

{
"filename": "image.qcow2"
}
{
"write-merging": "off"
}
qemu-system-x86_64: -drive 
if=virtio,file=image.qcow2,write-merging=off: could not open disk 
image image.qcow2: Block format 'qcow2' used by device 'virtio0' 
doesn't support the option 'write-merging'


But as you can see, it's a string and not a bool. So the problem is that 
there are (at least) two parameter types in qemu: One is just giving a 
QDict, and the other are QemuOpts. QDicts are just the raw user input 
and the user can only input strings, so everything is just a string. As 
far as I know, typing everything correctly is done by converting the 
QDict to a QemuOpts object (as you can see in generally every block 
driver which supports some options (e.g. qcow2) and also in 
blockdev_init(), it's qemu_opts_absorb_qdict()).


Sooo, right, I forgot that. Currently, there are no non-string 
non-block-driver-specific options for mid-tree BDS (in contrast to the 
root BDS, which are parsed in blockdev_init()), so you now have the 
honorable task of introducing such a QemuOptsList along with 
qemu_opts_absorb_qdict() and everything to bdrv_open_common(). *cough*
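
The string-versus-bool mismatch described above can be reproduced outside QEMU with a small sketch. It does not use QDict or QemuOpts; struct raw_opt and both lookup helpers are hypothetical and only mimic the two behaviours: a "try bool" lookup that falls back to the default for anything not already typed as a bool, and a typed lookup that converts "on"/"off" first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical raw option entry: user input arrives as a string. */
struct raw_opt {
    const char *key;
    const char *value;   /* string form, e.g. "off" */
    bool is_bool;        /* true only if stored as a typed bool */
    bool bool_value;
};

/* Falls back to def unless the entry exists AND is already a bool. */
static bool lookup_bool_untyped(const struct raw_opt *opts, size_t n,
                                const char *key, bool def)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(opts[i].key, key) == 0 && opts[i].is_bool) {
            return opts[i].bool_value;
        }
    }
    return def;
}

/* Converts the string form before giving up and using the default. */
static bool lookup_bool_typed(const struct raw_opt *opts, size_t n,
                              const char *key, bool def)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(opts[i].key, key) != 0) {
            continue;
        }
        if (opts[i].is_bool) {
            return opts[i].bool_value;
        }
        if (strcmp(opts[i].value, "on") == 0) {
            return true;
        }
        if (strcmp(opts[i].value, "off") == 0) {
            return false;
        }
    }
    return def;
}
```

With the single entry write-merging = "off", the untyped lookup returns the default (true), while the typed lookup returns false.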


Max



Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Peter Lieven

On 20.10.2014 15:15, Max Reitz wrote:

On 20.10.2014 at 14:48, Peter Lieven wrote:

On 20.10.2014 14:19, Max Reitz wrote:

On 2014-10-20 at 14:16, Peter Lieven wrote:

On 20.10.2014 13:51, Max Reitz wrote:

On 2014-10-20 at 12:03, Peter Lieven wrote:
[...]


Can you further help here. I think my problem was that I don't have access to 
the commandline options in bdrv_open?!


You do. It's the options QDict. :-)


Maybe I just don't get it.

If I specify

qemu -drive if=virtio,file=image.qcow2,write-merging=off

and check with

qdict_get_try_bool(options, "write-merging", true);

in bdrv_open() directly before bdrv_swap I always get true.


Hm, judging from fprintf(stderr, "%s\n", 
qstring_get_str(qobject_to_json_pretty(QOBJECT(options))));, it's there for me (directly after 
qdict_del(options, "node-name")). The output is:

Qemu wrote:

{
"filename": "image.qcow2"
}
{
"write-merging": "off"
}
qemu-system-x86_64: -drive if=virtio,file=image.qcow2,write-merging=off: could 
not open disk image image.qcow2: Block format 'qcow2' used by device 'virtio0' 
doesn't support the option 'write-merging'


But as you can see, it's a string and not a bool. So the problem is that there are (at least) two parameter types in qemu: One is just giving a QDict, and the other are QemuOpts. QDicts are just the raw user input and the user can only input strings, 
so everything is just a string. As far as I know, typing everything correctly is done by converting the QDict to a QemuOpts object (as you can see in generally every block driver which supports some options (e.g. qcow2) and also in blockdev_init(), it's 
qemu_opts_absorb_qdict()).


Sooo, right, I forgot that. Currently, there are no non-string non-block-driver-specific options for mid-tree BDS (in contrast to the root BDS, which are parsed in blockdev_init()), so you now have the honorable task of introducing such a QemuOptsList 
along with qemu_opts_absorb_qdict() and everything to bdrv_open_common(). *cough*


I would appreciate it if someone with better knowledge of this whole stuff would 
start this. Or we postpone this for now, until all the ongoing conversions are done.

Peter



Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Max Reitz

On 20.10.2014 at 15:19, Peter Lieven wrote:

On 20.10.2014 15:15, Max Reitz wrote:

On 20.10.2014 at 14:48, Peter Lieven wrote:

On 20.10.2014 14:19, Max Reitz wrote:

On 2014-10-20 at 14:16, Peter Lieven wrote:

On 20.10.2014 13:51, Max Reitz wrote:

On 2014-10-20 at 12:03, Peter Lieven wrote:
[...]


Can you further help here. I think my problem was that I don't 
have access to the commandline options in bdrv_open?!


You do. It's the options QDict. :-)


Maybe I just don't get it.

If I specify

qemu -drive if=virtio,file=image.qcow2,write-merging=off

and check with

qdict_get_try_bool(options, "write-merging", true);

in bdrv_open() directly before bdrv_swap I always get true.


Hm, judging from fprintf(stderr, "%s\n", 
qstring_get_str(qobject_to_json_pretty(QOBJECT(options))));, it's 
there for me (directly after qdict_del(options, "node-name")). The 
output is:


Qemu wrote:

{
"filename": "image.qcow2"
}
{
"write-merging": "off"
}
qemu-system-x86_64: -drive 
if=virtio,file=image.qcow2,write-merging=off: could not open disk 
image image.qcow2: Block format 'qcow2' used by device 'virtio0' 
doesn't support the option 'write-merging'


But as you can see, it's a string and not a bool. So the problem is 
that there are (at least) two parameter types in qemu: One is just 
giving a QDict, and the other are QemuOpts. QDicts are just the raw 
user input and the user can only input strings, so everything is just 
a string. As far as I know, typing everything correctly is done by 
converting the QDict to a QemuOpts object (as you can see in 
generally every block driver which supports some options (e.g. qcow2) 
and also in blockdev_init(), it's qemu_opts_absorb_qdict()).


Sooo, right, I forgot that. Currently, there are no non-string 
non-block-driver-specific options for mid-tree BDS (in contrast to 
the root BDS, which are parsed in blockdev_init()), so you now have 
the honorable task of introducing such a QemuOptsList along with 
qemu_opts_absorb_qdict() and everything to bdrv_open_common(). *cough*


I would appreciate it if someone with better knowledge of this whole 
stuff would start this. Or we postpone this for now, until all the ongoing 
conversions are done.


I can try and create some barebone which your patches can then be based 
on. I probably don't have the knowledge either, but I'm daring enough to 
do it anyway. ;-)


Max



Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Peter Lieven

On 20.10.2014 15:22, Max Reitz wrote:

On 20.10.2014 at 15:19, Peter Lieven wrote:

On 20.10.2014 15:15, Max Reitz wrote:

On 20.10.2014 at 14:48, Peter Lieven wrote:

On 20.10.2014 14:19, Max Reitz wrote:

On 2014-10-20 at 14:16, Peter Lieven wrote:

On 20.10.2014 13:51, Max Reitz wrote:

On 2014-10-20 at 12:03, Peter Lieven wrote:
[...]


Can you further help here. I think my problem was that I don't have access to 
the commandline options in bdrv_open?!


You do. It's the options QDict. :-)


Maybe I just don't get it.

If I specify

qemu -drive if=virtio,file=image.qcow2,write-merging=off

and check with

qdict_get_try_bool(options, "write-merging", true);

in bdrv_open() directly before bdrv_swap I always get true.


Hm, judging from fprintf(stderr, "%s\n", 
qstring_get_str(qobject_to_json_pretty(QOBJECT(options))));, it's there for me (directly after 
qdict_del(options, "node-name")). The output is:

Qemu wrote:

{
"filename": "image.qcow2"
}
{
"write-merging": "off"
}
qemu-system-x86_64: -drive if=virtio,file=image.qcow2,write-merging=off: could 
not open disk image image.qcow2: Block format 'qcow2' used by device 'virtio0' 
doesn't support the option 'write-merging'


But as you can see, it's a string and not a bool. So the problem is that there are (at least) two parameter types in qemu: One is just giving a QDict, and the other are QemuOpts. QDicts are just the raw user input and the user can only input 
strings, so everything is just a string. As far as I know, typing everything correctly is done by converting the QDict to a QemuOpts object (as you can see in generally every block driver which supports some options (e.g. qcow2) and also in 
blockdev_init(), it's qemu_opts_absorb_qdict()).


Sooo, right, I forgot that. Currently, there are no non-string non-block-driver-specific options for mid-tree BDS (in contrast to the root BDS, which are parsed in blockdev_init()), so you now have the honorable task of introducing such a QemuOptsList 
along with qemu_opts_absorb_qdict() and everything to bdrv_open_common(). *cough*


I would appreciate it if someone with better knowledge of this whole stuff would 
start this. Or we postpone this knob until all the ongoing conversions are done.


I can try and create some barebone which your patches can then be based on. I 
probably don't have the knowledge either, but I'm daring enough to do it 
anyway. ;-)


Thank you.

Peter



Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Kevin Wolf
On 20.10.2014 at 15:22, Max Reitz wrote:
 On 20.10.2014 at 15:19, Peter Lieven wrote:
 On 20.10.2014 15:15, Max Reitz wrote:
 On 20.10.2014 at 14:48, Peter Lieven wrote:
 On 20.10.2014 14:19, Max Reitz wrote:
 On 2014-10-20 at 14:16, Peter Lieven wrote:
 On 20.10.2014 13:51, Max Reitz wrote:
 On 2014-10-20 at 12:03, Peter Lieven wrote:
 [...]
 
 Can you further help here. I think my problem was that I
 don't have access to the commandline options in
 bdrv_open?!
 
 You do. It's the options QDict. :-)
 
 Maybe I just don't get it.
 
 If I specify
 
 qemu -drive if=virtio,file=image.qcow2,write-merging=off
 
 and check with
 
 qdict_get_try_bool(options, "write-merging", true);
 
 in bdrv_open() directly before bdrv_swap I always get true.
 
 Hm, judging from fprintf(stderr, "%s\n",
 qstring_get_str(qobject_to_json_pretty(QOBJECT(options))));,
 it's there for me (directly after qdict_del(options,
 "node-name")). The output is:
 
 Qemu wrote:
 {
     "filename": "image.qcow2"
 }
 {
     "write-merging": "off"
 }
 qemu-system-x86_64: -drive
 if=virtio,file=image.qcow2,write-merging=off: could not open
 disk image image.qcow2: Block format 'qcow2' used by device
 'virtio0' doesn't support the option 'write-merging'
 
 But as you can see, it's a string and not a bool. So the problem
 is that there are (at least) two parameter types in qemu: One
 is just giving a QDict, and the other are QemuOpts. QDicts are
 just the raw user input and the user can only input strings, so
 everything is just a string. As far as I know, typing everything
 correctly is done by converting the QDict to a QemuOpts object
 (as you can see in generally every block driver which supports
 some options (e.g. qcow2) and also in blockdev_init(), it's
 qemu_opts_absorb_qdict()).
 
 Sooo, right, I forgot that. Currently, there are no non-string
 non-block-driver-specific options for mid-tree BDS (in contrast
 to the root BDS, which are parsed in blockdev_init()), so you
 now have the honorable task of introducing such a QemuOptsList
 along with qemu_opts_absorb_qdict() and everything to
 bdrv_open_common(). *cough*
 
 I would appreciate it if someone with better knowledge of this whole
 stuff would start this. Or we postpone this knob until all the
 ongoing conversions are done.
 
 I can try and create some barebone which your patches can then be
 based on. I probably don't have the knowledge either, but I'm daring
 enough to do it anyway. ;-)

Actually I have some patches somewhere [1] that introduce a QemuOpts for
bdrv_open_common(). I intended to use that for cache modes, but as I
explained in our KVM Forum presentation, it's not quite as easy as I
thought it would be and so the patch series isn't ready yet.

Anyway, having the QemuOpts there for driver-independent options is
probably the way to go. Feel free to remove the caching from my
patch and keep only the node-name part. Then it can be a preparatory
patch for your series where you simply add a new option to the list.

Kevin

[1] 
http://repo.or.cz/w/qemu/kevin.git/commitdiff/9c22aee04cf0bdf6a3858340bc6ff27d6805254f



[Qemu-devel] [PULL 02/28] block/raw-posix: Fix disk corruption in try_fiemap

2014-10-20 Thread Kevin Wolf
From: Tony Breeds t...@bakeyournoodle.com

Using fiemap without FIEMAP_FLAG_SYNC is a known corrupter.

Add the FIEMAP_FLAG_SYNC flag to the FS_IOC_FIEMAP ioctl.  This has
the downside of significantly reducing performance.

Reported-By: Michael Steffens michael_steff...@posteo.de
Signed-off-by: Tony Breeds t...@bakeyournoodle.com
Cc: Kevin Wolf kw...@redhat.com
Cc: Markus Armbruster arm...@redhat.com
Cc: Stefan Hajnoczi stefa...@redhat.com
Cc: Max Reitz mre...@redhat.com
Cc: Pádraig Brady pbr...@redhat.com
Cc: Eric Blake ebl...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/raw-posix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 86ce4f2..d672c73 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1482,7 +1482,7 @@ static int64_t try_fiemap(BlockDriverState *bs, off_t start, off_t *data,
 
 f.fm.fm_start = start;
 f.fm.fm_length = (int64_t)nb_sectors * BDRV_SECTOR_SIZE;
-f.fm.fm_flags = 0;
+f.fm.fm_flags = FIEMAP_FLAG_SYNC;
 f.fm.fm_extent_count = 1;
 f.fm.fm_reserved = 0;
 if (ioctl(s->fd, FS_IOC_FIEMAP, &f) == -1) {
-- 
1.8.3.1




[Qemu-devel] [PULL 05/28] block: Split bdrv_new_root() off bdrv_new()

2014-10-20 Thread Kevin Wolf
From: Markus Armbruster arm...@redhat.com

Creating an anonymous BDS can't fail.  Make that obvious.

Signed-off-by: Markus Armbruster arm...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Benoît Canet benoit.ca...@nodalink.com
Reviewed-by: Kevin Wolf kw...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.c   | 28 +++-
 block/iscsi.c |  2 +-
 block/vvfat.c |  2 +-
 blockdev.c|  2 +-
 hw/block/xen_disk.c   |  2 +-
 include/block/block.h |  3 ++-
 qemu-img.c|  6 +++---
 qemu-io.c |  2 +-
 qemu-nbd.c|  2 +-
 9 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/block.c b/block.c
index 27533f3..42659ec 100644
--- a/block.c
+++ b/block.c
@@ -336,10 +336,11 @@ void bdrv_register(BlockDriver *bdrv)
 }
 
 /* create a new block device (by default it is empty) */
-BlockDriverState *bdrv_new(const char *device_name, Error **errp)
+BlockDriverState *bdrv_new_root(const char *device_name, Error **errp)
 {
 BlockDriverState *bs;
-int i;
+
+assert(*device_name);
 
 if (*device_name && !id_wellformed(device_name)) {
 error_setg(errp, "Invalid device name");
@@ -358,12 +359,21 @@ BlockDriverState *bdrv_new(const char *device_name, Error **errp)
 return NULL;
 }
 
+bs = bdrv_new();
+
+pstrcpy(bs->device_name, sizeof(bs->device_name), device_name);
+QTAILQ_INSERT_TAIL(&bdrv_states, bs, device_list);
+
+return bs;
+}
+
+BlockDriverState *bdrv_new(void)
+{
+BlockDriverState *bs;
+int i;
+
 bs = g_new0(BlockDriverState, 1);
 QLIST_INIT(&bs->dirty_bitmaps);
-pstrcpy(bs->device_name, sizeof(bs->device_name), device_name);
-if (device_name[0] != '\0') {
-QTAILQ_INSERT_TAIL(&bdrv_states, bs, device_list);
-}
 for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
 QLIST_INIT(&bs->op_blockers[i]);
 }
@@ -1224,7 +1234,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
 goto free_exit;
 }
 
-backing_hd = bdrv_new("", errp);
+backing_hd = bdrv_new();
 
 if (bs->backing_format[0] != '\0') {
 back_drv = bdrv_find_format(bs->backing_format);
@@ -1353,7 +1363,7 @@ int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
 qdict_put(snapshot_options, "file.filename",
   qstring_from_str(tmp_filename));
 
-bs_snapshot = bdrv_new("", &error_abort);
+bs_snapshot = bdrv_new();
 
 ret = bdrv_open(&bs_snapshot, NULL, NULL, snapshot_options,
 flags, &bdrv_qcow2, &local_err);
@@ -1424,7 +1434,7 @@ int bdrv_open(BlockDriverState **pbs, const char *filename,
 if (*pbs) {
 bs = *pbs;
 } else {
-bs = bdrv_new("", &error_abort);
+bs = bdrv_new();
 }
 
 /* NULL means an empty set of options */
diff --git a/block/iscsi.c b/block/iscsi.c
index 3a01de0..a7fb764 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1519,7 +1519,7 @@ static int iscsi_create(const char *filename, QemuOpts *opts, Error **errp)
 IscsiLun *iscsilun = NULL;
 QDict *bs_options;
 
-bs = bdrv_new("", &error_abort);
+bs = bdrv_new();
 
 /* Read out options */
 total_size = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
diff --git a/block/vvfat.c b/block/vvfat.c
index 731e591..6c9fde0 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -2939,7 +2939,7 @@ static int enable_write_target(BDRVVVFATState *s, Error **errp)
 unlink(s->qcow_filename);
 #endif
 
-bdrv_set_backing_hd(s->bs, bdrv_new("", &error_abort));
+bdrv_set_backing_hd(s->bs, bdrv_new());
 s->bs->backing_hd->drv = &vvfat_write_target;
 s->bs->backing_hd->opaque = g_new(void *, 1);
 *(void**)s->bs->backing_hd->opaque = s;
diff --git a/blockdev.c b/blockdev.c
index e595910..7608f46 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -523,7 +523,7 @@ static DriveInfo *blockdev_init(const char *file, QDict *bs_opts,
 }
 
 /* init */
-bs = bdrv_new(qemu_opts_id(opts), errp);
+bs = bdrv_new_root(qemu_opts_id(opts), errp);
 if (!bs) {
 goto early_err;
 }
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 0d27ab1..7f0f66b 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -858,7 +858,7 @@ static int blk_connect(struct XenDevice *xendev)
 
 /* setup via xenbus - create new block driver instance */
 xen_be_printf(&blkdev->xendev, 2, "create new bdrv (xenbus setup)\n");
-blkdev->bs = bdrv_new(blkdev->dev, NULL);
+blkdev->bs = bdrv_new_root(blkdev->dev, NULL);
 if (!blkdev->bs) {
 return -1;
 }
diff --git a/include/block/block.h b/include/block/block.h
index 3318f0d..a3039ce 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -204,7 +204,8 @@ BlockDriver *bdrv_find_whitelisted_format(const char *format_name,
 int bdrv_create(BlockDriver *drv, const char* filename,
 QemuOpts *opts, Error **errp);
 

[Qemu-devel] [PULL 00/28] Block patches

2014-10-20 Thread Kevin Wolf
The following changes since commit 5f77ef69a195098baddfdc6d189f1b4a94587378:

  glib: add compatibility interface for g_strcmp0() (2014-10-16 23:02:31 +0100)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to 84ebe3755f88be4c3733e997641fafd050a58810:

  block: Make device model's references to BlockBackend strong (2014-10-20 14:03:51 +0200)


Block patches


Markus Armbruster (24):
  block: Split bdrv_new_root() off bdrv_new()
  block: New BlockBackend
  block: Connect BlockBackend to BlockDriverState
  block: Connect BlockBackend and DriveInfo
  block: Code motion to get rid of stubs/blockdev.c
  block: Make BlockBackend own its BlockDriverState
  blockdev: Eliminate drive_del()
  block: Eliminate bdrv_iterate(), use bdrv_next()
  block: Eliminate BlockDriverState member device_name[]
  block: Merge BlockBackend and BlockDriverState name spaces
  block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()
  block: Rename BlockDriverAIOCB* to BlockAIOCB*
  block: Rename BlockDriverCompletionFunc to BlockCompletionFunc
  virtio-blk: Drop redundant VirtIOBlock member conf
  virtio-blk: Rename VirtIOBlkConf variables to conf
  hw: Convert from BlockDriverState to BlockBackend, mostly
  ide: Complete conversion from BlockDriverState to BlockBackend
  pc87312: Drop unused members of PC87312State
  blockdev: Drop superfluous DriveInfo member id
  blockdev: Fix blockdev-add not to create DriveInfo
  block/qapi: Convert qmp_query_block() to BlockBackend
  blockdev: Convert qmp_eject(), qmp_change_blockdev() to BlockBackend
  block: Lift device model API into BlockBackend
  block: Make device model's references to BlockBackend strong

Max Reitz (1):
  nbd: Fix filename generation

Tony Breeds (2):
  block/raw-posix: Fix disk corruption in try_fiemap
  block/raw-posix: use seek_hole ahead of fiemap

Zhang Haoyu (1):
  qcow2: fix leak of Qcow2DiscardRegion in update_refcount_discard

 block-migration.c|  44 ++-
 block.c  | 390 +++
 block/Makefile.objs  |   2 +-
 block/archipelago.c  |  28 +-
 block/backup.c   |   2 +-
 block/blkdebug.c |  18 +-
 block/blkverify.c|  18 +-
 block/block-backend.c| 631 +++
 block/commit.c   |   2 +-
 block/curl.c |   6 +-
 block/iscsi.c|  10 +-
 block/linux-aio.c|   8 +-
 block/mirror.c   |   9 +-
 block/nbd.c  |  44 ++-
 block/null.c |  34 +-
 block/qapi.c |  27 +-
 block/qcow.c |   4 +-
 block/qcow2-refcount.c   |   1 +
 block/qcow2.c|   4 +-
 block/qed-gencb.c|   4 +-
 block/qed-table.c|  10 +-
 block/qed.c  |  46 +--
 block/qed.h  |  12 +-
 block/quorum.c   |  42 +-
 block/raw-aio.h  |   8 +-
 block/raw-posix.c|  38 +-
 block/raw-win32.c|  16 +-
 block/raw_bsd.c  |   8 +-
 block/rbd.c  |  56 +--
 block/sheepdog.c |   4 +-
 block/stream.c   |   2 +-
 block/vdi.c  |   2 +-
 block/vhdx.c |   2 +-
 block/vmdk.c |   4 +-
 block/vpc.c  |   2 +-
 block/vvfat.c|   4 +-
 block/win32-aio.c|   6 +-
 blockdev.c   | 197 --
 blockjob.c   |   7 +-
 device-hotplug.c |   3 +-
 dma-helpers.c|  67 ++--
 docs/blkdebug.txt|   8 +-
 hw/arm/collie.c  |  10 +-
 hw/arm/gumstix.c |   6 +-
 hw/arm/highbank.c|   2 +-
 hw/arm/mainstone.c   |   8 +-
 hw/arm/musicpal.c|  13 +-
 hw/arm/nseries.c |   7 +-
 hw/arm/omap1.c   |   4 +-
 hw/arm/omap2.c   |   4 +-
 hw/arm/omap_sx1.c|  10 +-
 hw/arm/pxa2xx.c  |   7 +-
 hw/arm/realview.c|   2 +-
 

[Qemu-devel] [PULL 03/28] block/raw-posix: use seek_hole ahead of fiemap

2014-10-20 Thread Kevin Wolf
From: Tony Breeds t...@bakeyournoodle.com

try_fiemap() uses FIEMAP_FLAG_SYNC which has a significant performance
impact.

Prefer seek_hole() over fiemap() to avoid this impact where possible.
seek_hole is more widely used and, arguably, has potential to be
optimised in the kernel.

Reported-By: Michael Steffens michael_steff...@posteo.de
Signed-off-by: Tony Breeds t...@bakeyournoodle.com
Cc: Kevin Wolf kw...@redhat.com
Cc: Markus Armbruster arm...@redhat.com
Cc: Stefan Hajnoczi stefa...@redhat.com
Cc: Max Reitz mre...@redhat.com
Cc: Pádraig Brady pbr...@redhat.com
Cc: Eric Blake ebl...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/raw-posix.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index d672c73..eb76f0e 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1571,9 +1571,9 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
 
 start = sector_num * BDRV_SECTOR_SIZE;
 
-ret = try_fiemap(bs, start, &data, &hole, nb_sectors, pnum);
+ret = try_seek_hole(bs, start, &data, &hole, pnum);
 if (ret < 0) {
-ret = try_seek_hole(bs, start, &data, &hole, pnum);
+ret = try_fiemap(bs, start, &data, &hole, nb_sectors, pnum);
 if (ret < 0) {
 /* Assume everything is allocated. */
 data = 0;
-- 
1.8.3.1
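
The probing order this patch sets up (cheap probe first, expensive probe second, "assume allocated" last) can be pictured with a small standalone sketch. Nothing here is taken from block/raw-posix.c: probe_fn, probe_allocation() and the two stub probes are hypothetical, and the fallback values are an assumption modelled on the "Assume everything is allocated" branch in the hunk above.

```c
#include <stddef.h>

/* Hypothetical probe: returns 0 and fills *data / *hole on success, negative on failure. */
typedef int (*probe_fn)(long start, long *data, long *hole);

/*
 * Try each probe in the given order and stop at the first one that works.
 * If every probe fails, assume the whole range is allocated: data starts
 * at offset 0 and the next hole is right after the requested range.
 * Returns the index of the probe that succeeded, or -1 for the fallback.
 */
static int probe_allocation(const probe_fn *probes, size_t n,
                            long start, long length,
                            long *data, long *hole)
{
    for (size_t i = 0; i < n; i++) {
        if (probes[i](start, data, hole) >= 0) {
            return (int)i;
        }
    }
    *data = 0;
    *hole = start + length;
    return -1;
}

/* Stub probe that always reports "not supported". */
static int probe_unsupported(long start, long *data, long *hole)
{
    (void)start;
    (void)data;
    (void)hole;
    return -1;
}

/* Stub probe that reports 4096 bytes of data beginning at start. */
static int probe_fixed(long start, long *data, long *hole)
{
    *data = start;
    *hole = start + 4096;
    return 0;
}
```

Swapping two entries in the array is all it takes to change which probe is preferred, which is essentially what the two-line change in this patch does.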




[Qemu-devel] [PULL 01/28] qcow2: fix leak of Qcow2DiscardRegion in update_refcount_discard

2014-10-20 Thread Kevin Wolf
From: Zhang Haoyu zhan...@sangfor.com

When a Qcow2DiscardRegion is adjacent to another one referenced by d,
free the Qcow2DiscardRegion metadata referenced by p after
it has been removed from the s->discards queue.

Signed-off-by: Zhang Haoyu zhan...@sangfor.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2-refcount.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 2bcaaf9..c31d85a 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -524,6 +524,7 @@ found:
 QTAILQ_REMOVE(&s->discards, p, next);
 d->offset = MIN(d->offset, p->offset);
 d->bytes += p->bytes;
+g_free(p);
 }
 }
 
-- 
1.8.3.1
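
The leak fixed here follows a common pattern: an element is unlinked from a queue after its contents were merged into a neighbour, but never freed. A minimal standalone sketch of the corrected pattern is shown below. It uses a plain singly linked list instead of QEMU's QTAILQ macros, and struct region, region_push() and region_merge_adjacent() are hypothetical names invented for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queued discard request. */
struct region {
    uint64_t offset;
    uint64_t bytes;
    struct region *next;
};

/* Push a new region at the list head; returns the node, or NULL if allocation fails. */
static struct region *region_push(struct region **head,
                                  uint64_t offset, uint64_t bytes)
{
    struct region *r = malloc(sizeof(*r));

    if (!r) {
        return NULL;
    }
    r->offset = offset;
    r->bytes = bytes;
    r->next = *head;
    *head = r;
    return r;
}

/*
 * Merge every region that is exactly adjacent to d into d (single pass).
 * Each absorbed region is unlinked AND freed; unlinking alone would leak it.
 * Returns the number of regions absorbed.
 */
static int region_merge_adjacent(struct region **head, struct region *d)
{
    struct region **pp = head;
    int merged = 0;

    while (*pp) {
        struct region *p = *pp;

        if (p == d ||
            (p->offset != d->offset + d->bytes &&
             d->offset != p->offset + p->bytes)) {
            pp = &p->next;          /* not adjacent: keep it and move on */
            continue;
        }

        *pp = p->next;              /* unlink p from the list */
        if (p->offset < d->offset) {
            d->offset = p->offset;
        }
        d->bytes += p->bytes;
        free(p);                    /* the step the original code was missing */
        merged++;
    }
    return merged;
}
```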




[Qemu-devel] [PULL 10/28] block: Make BlockBackend own its BlockDriverState

2014-10-20 Thread Kevin Wolf
From: Markus Armbruster arm...@redhat.com

On BlockBackend destruction, unref its BlockDriverState.  Replaces the
callers' unrefs.

This turns the pointer from BlockBackend to BlockDriverState into a
strong reference, managed with bdrv_ref() / bdrv_unref().  The
back-pointer remains weak.

Signed-off-by: Markus Armbruster arm...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Kevin Wolf kw...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/block-backend.c |  6 ++
 blockdev.c|  7 +--
 hw/block/xen_disk.c   |  6 +++---
 qemu-img.c| 35 +--
 qemu-io.c |  5 -
 qemu-nbd.c|  1 -
 6 files changed, 7 insertions(+), 53 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index d4bdd48..6236b5b 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -54,8 +54,6 @@ BlockBackend *blk_new(const char *name, Error **errp)
 
 /*
  * Create a new BlockBackend with a new BlockDriverState attached.
- * Both have a reference count of one.  Caller owns *both* references.
- * TODO Let caller own only the BlockBackend reference
  * Otherwise just like blk_new(), which see.
  */
 BlockBackend *blk_new_with_bs(const char *name, Error **errp)
@@ -83,7 +81,9 @@ static void blk_delete(BlockBackend *blk)
 {
 assert(!blk->refcnt);
 if (blk->bs) {
+assert(blk->bs->blk == blk);
 blk->bs->blk = NULL;
+bdrv_unref(blk->bs);
 blk->bs = NULL;
 }
 /* Avoid double-remove after blk_hide_on_behalf_of_do_drive_del() */
@@ -119,8 +119,6 @@ void blk_ref(BlockBackend *blk)
  * Decrement @blk's reference count.
  * If this drops it to zero, destroy @blk.
  * For convenience, do nothing if @blk is null.
- * Does *not* touch the attached BlockDriverState's reference count.
- * TODO Decrement it!
  */
 void blk_unref(BlockBackend *blk)
 {
diff --git a/blockdev.c b/blockdev.c
index a00461d..63f797b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -278,10 +278,7 @@ static void bdrv_format_print(void *opaque, const char *name)
 
 void drive_del(DriveInfo *dinfo)
 {
-BlockBackend *blk = dinfo->bdrv->blk;
-
-bdrv_unref(dinfo->bdrv);
-blk_unref(blk);
+blk_unref(dinfo->bdrv->blk);
 }
 
 typedef struct {
@@ -583,7 +580,6 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
 return blk;
 
 err:
-bdrv_unref(bs);
 blk_unref(blk);
 early_err:
 qemu_opts_del(opts);
@@ -2608,7 +2604,6 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
 }
 
 if (bdrv_key_required(blk_bs(blk))) {
-bdrv_unref(blk_bs(blk));
 blk_unref(blk);
 error_setg(errp, "blockdev-add doesn't support encrypted devices");
 goto fail;
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 0022083..feb227f 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -872,7 +872,6 @@ static int blk_connect(struct XenDevice *xendev)
 xen_be_printf(&blkdev->xendev, 0, "error: %s\n",
   error_get_pretty(local_err));
 error_free(local_err);
-bdrv_unref(blkdev->bs);
 blk_unref(blk);
 blkdev->bs = NULL;
 return -1;
@@ -888,7 +887,9 @@ static int blk_connect(struct XenDevice *xendev)
 }
 /* blkdev->bs is not create by us, we get a reference
  * so we can bdrv_unref() unconditionally */
-bdrv_ref(blkdev->bs);
+/* Except we don't bdrv_unref() anymore, we blk_unref().
+ * Conditionally, because we can't easily blk_ref() here.
+ * TODO Clean this up! */
 }
 bdrv_attach_dev_nofail(blkdev->bs, blkdev);
 blkdev->file_size = bdrv_getlength(blkdev->bs);
@@ -988,7 +989,6 @@ static void blk_disconnect(struct XenDevice *xendev)
 
 if (blkdev->bs) {
 bdrv_detach_dev(blkdev->bs, blkdev);
-bdrv_unref(blkdev->bs);
 if (!blkdev->dinfo) {
 blk_unref(blk_by_name(blkdev->dev));
 }
diff --git a/qemu-img.c b/qemu-img.c
index 5548637..09e7e72 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -329,7 +329,6 @@ static BlockBackend *img_open(const char *id, const char *filename,
 }
 return blk;
 fail:
-bdrv_unref(bs);
 blk_unref(blk);
 return NULL;
 }
@@ -712,9 +711,7 @@ static int img_check(int argc, char **argv)
 
 fail:
 qapi_free_ImageCheck(check);
-bdrv_unref(bs);
 blk_unref(blk);
-
 return ret;
 }
 
@@ -786,7 +783,6 @@ static int img_commit(int argc, char **argv)
 break;
 }
 
-bdrv_unref(bs);
 blk_unref(blk);
 if (ret) {
 return 1;
@@ -1196,10 +1192,8 @@ static int img_compare(int argc, char **argv)
 out:
 qemu_vfree(buf1);
 qemu_vfree(buf2);
-bdrv_unref(bs2);
 blk_unref(blk2);
 out2:
-bdrv_unref(bs1);
 blk_unref(blk1);
 out3:
 qemu_progress_end();
@@ -1754,18 +1748,8 @@ out:
 qemu_opts_free(create_opts);
 qemu_vfree(buf);
 

[Qemu-devel] [PULL 26/28] blockdev: Convert qmp_eject(), qmp_change_blockdev() to BlockBackend

2014-10-20 Thread Kevin Wolf
From: Markus Armbruster arm...@redhat.com

Much more command code needs conversion.  I'm converting these now
because they're using bdrv_dev_* functions, which I'm about to lift
into BlockBackend.

Signed-off-by: Markus Armbruster arm...@redhat.com
Reviewed-by: Benoît Canet benoit.ca...@nodalink.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Kevin Wolf kw...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 blockdev.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 2a3d908..a32d84c 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1593,8 +1593,10 @@ exit:
 }
 
 
-static void eject_device(BlockDriverState *bs, int force, Error **errp)
+static void eject_device(BlockBackend *blk, int force, Error **errp)
 {
+BlockDriverState *bs = blk_bs(blk);
+
 if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_EJECT, errp)) {
 return;
 }
@@ -1618,15 +1620,15 @@ static void eject_device(BlockDriverState *bs, int force, Error **errp)
 
 void qmp_eject(const char *device, bool has_force, bool force, Error **errp)
 {
-BlockDriverState *bs;
+BlockBackend *blk;
 
-bs = bdrv_find(device);
-if (!bs) {
+blk = blk_by_name(device);
+if (!blk) {
 error_set(errp, QERR_DEVICE_NOT_FOUND, device);
 return;
 }
 
-eject_device(bs, force, errp);
+eject_device(blk, force, errp);
 }
 
 void qmp_block_passwd(bool has_device, const char *device,
@@ -1685,16 +1687,18 @@ static void qmp_bdrv_open_encrypted(BlockDriverState *bs, const char *filename,
 void qmp_change_blockdev(const char *device, const char *filename,
  const char *format, Error **errp)
 {
+BlockBackend *blk;
 BlockDriverState *bs;
 BlockDriver *drv = NULL;
 int bdrv_flags;
 Error *err = NULL;
 
-bs = bdrv_find(device);
-if (!bs) {
+blk = blk_by_name(device);
+if (!blk) {
 error_set(errp, QERR_DEVICE_NOT_FOUND, device);
 return;
 }
+bs = blk_bs(blk);
 
 if (format) {
 drv = bdrv_find_whitelisted_format(format, bs->read_only);
@@ -1704,7 +1708,7 @@ void qmp_change_blockdev(const char *device, const char *filename,
 }
 }
 
-eject_device(bs, 0, &err);
+eject_device(blk, 0, &err);
 if (err) {
 error_propagate(errp, err);
 return;
-- 
1.8.3.1




[Qemu-devel] [PULL 28/28] block: Make device model's references to BlockBackend strong

2014-10-20 Thread Kevin Wolf
From: Markus Armbruster arm...@redhat.com

Doesn't make a difference just yet, but it's the right thing to do.

Signed-off-by: Markus Armbruster arm...@redhat.com
Reviewed-by: Benoît Canet benoit.ca...@nodalink.com
Reviewed-by: Kevin Wolf kw...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/block-backend.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index bdcbac6..d0692b1 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -257,6 +257,7 @@ int blk_attach_dev(BlockBackend *blk, void *dev)
 if (blk->dev) {
 return -EBUSY;
 }
+blk_ref(blk);
 blk->dev = dev;
 bdrv_iostatus_reset(blk->bs);
 
@@ -290,6 +291,7 @@ void blk_detach_dev(BlockBackend *blk, void *dev)
 blk->dev_opaque = NULL;
 bdrv_set_guest_block_size(blk->bs, 512);
 qemu_coroutine_adjust_pool_size(-COROUTINE_POOL_RESERVATION);
+blk_unref(blk);
 }
 
 /*
-- 
1.8.3.1
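
What "strong reference" means for attach and detach can be shown with a small standalone sketch. This is not the BlockBackend code: struct backend and the backend_*() functions are hypothetical, the destroyed flag exists only so the example can observe destruction, and returning -1 instead of -EBUSY is a simplification.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object shared by its creator and a device model. */
struct backend {
    int refcnt;
    void *dev;          /* attached device model, or NULL */
    int *destroyed;     /* set to 1 on destruction so callers can observe it */
};

/* Create a backend holding one reference, owned by the caller. */
static struct backend *backend_new(int *destroyed)
{
    struct backend *b = calloc(1, sizeof(*b));

    if (b) {
        b->refcnt = 1;
        b->destroyed = destroyed;
    }
    return b;
}

static void backend_ref(struct backend *b)
{
    b->refcnt++;
}

/* Drop one reference; destroy the object when the count reaches zero. */
static void backend_unref(struct backend *b)
{
    if (b && --b->refcnt == 0) {
        if (b->destroyed) {
            *b->destroyed = 1;
        }
        free(b);
    }
}

/* Attach a device: the device's pointer now keeps the backend alive. */
static int backend_attach(struct backend *b, void *dev)
{
    if (b->dev) {
        return -1;      /* already attached */
    }
    backend_ref(b);
    b->dev = dev;
    return 0;
}

/* Detach the device and give back the reference taken in backend_attach(). */
static void backend_detach(struct backend *b, void *dev)
{
    if (b->dev != dev) {
        return;
    }
    b->dev = NULL;
    backend_unref(b);
}
```

With this arrangement the creator can drop its own reference while a device is still attached, and the object only goes away once the device detaches.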




[Qemu-devel] [PULL 19/28] virtio-blk: Rename VirtIOBlkConf variables to conf

2014-10-20 Thread Kevin Wolf
From: Markus Armbruster arm...@redhat.com

This is consistent with how VirtIOFOOConf variables are named
elsewhere, and makes blk available for BlockBackend variables.

Signed-off-by: Markus Armbruster arm...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/block/dataplane/virtio-blk.c | 33 ++-
 hw/block/dataplane/virtio-blk.h |  2 +-
 hw/block/virtio-blk.c   | 50 -
 include/hw/virtio/virtio-blk.h  |  2 +-
 4 files changed, 44 insertions(+), 43 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 5458f9d..bd250df 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -30,7 +30,7 @@ struct VirtIOBlockDataPlane {
 bool stopping;
 bool disabled;
 
-VirtIOBlkConf *blk;
+VirtIOBlkConf *conf;
 
 VirtIODevice *vdev;
 Vring vring;/* virtqueue vring */
@@ -94,7 +94,7 @@ static void handle_notify(EventNotifier *e)
 VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
 
 event_notifier_test_and_clear(&s->host_notifier);
-bdrv_io_plug(s->blk->conf.bs);
+bdrv_io_plug(s->conf->conf.bs);
 for (;;) {
 MultiReqBuffer mrb = {
 .num_writes = 0,
@@ -120,7 +120,7 @@ static void handle_notify(EventNotifier *e)
 virtio_blk_handle_request(req, &mrb);
 }
 
-virtio_submit_multiwrite(s->blk->conf.bs, &mrb);
+virtio_submit_multiwrite(s->conf->conf.bs, &mrb);
 
 if (likely(ret == -EAGAIN)) { /* vring emptied */
 /* Re-enable guest->host notifies and stop processing the vring.
@@ -133,11 +133,11 @@ static void handle_notify(EventNotifier *e)
 break;
 }
 }
-bdrv_io_unplug(s->blk->conf.bs);
+bdrv_io_unplug(s->conf->conf.bs);
 }
 
 /* Context: QEMU global mutex held */
-void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
+void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf,
   VirtIOBlockDataPlane **dataplane,
   Error **errp)
 {
@@ -148,7 +148,7 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
 
 *dataplane = NULL;
 
-if (!blk->data_plane && !blk->iothread) {
+if (!conf->data_plane && !conf->iothread) {
 return;
 }
 
@@ -163,7 +163,8 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
 /* If dataplane is (re-)enabled while the guest is running there could be
  * block jobs that can conflict.
  */
-if (bdrv_op_is_blocked(blk->conf.bs, BLOCK_OP_TYPE_DATAPLANE, &local_err)) {
+if (bdrv_op_is_blocked(conf->conf.bs, BLOCK_OP_TYPE_DATAPLANE,
+   &local_err)) {
 error_setg(errp, "cannot start dataplane thread: %s",
 error_get_pretty(local_err));
 error_free(local_err);
@@ -172,10 +173,10 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
 
 s = g_new0(VirtIOBlockDataPlane, 1);
 s->vdev = vdev;
-s->blk = blk;
+s->conf = conf;
 
-if (blk->iothread) {
-s->iothread = blk->iothread;
+if (conf->iothread) {
+s->iothread = conf->iothread;
 object_ref(OBJECT(s->iothread));
 } else {
 /* Create per-device IOThread if none specified.  This is for
@@ -192,9 +193,9 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
 s->bh = aio_bh_new(s->ctx, notify_guest_bh, s);
 
 error_setg(&s->blocker, "block device is in use by data plane");
-bdrv_op_block_all(blk->conf.bs, s->blocker);
-bdrv_op_unblock(blk->conf.bs, BLOCK_OP_TYPE_RESIZE, s->blocker);
-bdrv_op_unblock(blk->conf.bs, BLOCK_OP_TYPE_DRIVE_DEL, s->blocker);
+bdrv_op_block_all(conf->conf.bs, s->blocker);
+bdrv_op_unblock(conf->conf.bs, BLOCK_OP_TYPE_RESIZE, s->blocker);
+bdrv_op_unblock(conf->conf.bs, BLOCK_OP_TYPE_DRIVE_DEL, s->blocker);
 
 *dataplane = s;
 }
@@ -207,7 +208,7 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
 }
 
 virtio_blk_data_plane_stop(s);
-bdrv_op_unblock_all(s->blk->conf.bs, s->blocker);
+bdrv_op_unblock_all(s->conf->conf.bs, s->blocker);
 error_free(s->blocker);
 object_unref(OBJECT(s->iothread));
 qemu_bh_delete(s->bh);
@@ -262,7 +263,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 s->started = true;
 trace_virtio_blk_data_plane_start(s);
 
-bdrv_set_aio_context(s->blk->conf.bs, s->ctx);
+bdrv_set_aio_context(s->conf->conf.bs, s->ctx);
 
 /* Kick right away to begin processing requests already in vring */
 event_notifier_set(virtio_queue_get_host_notifier(vq));
@@ -308,7 +309,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 aio_set_event_notifier(s->ctx, &s->host_notifier, NULL);
 
 /* Drain and switch bs back to the QEMU main loop */
-bdrv_set_aio_context(s->blk->conf.bs, 

[Qemu-devel] [PULL 21/28] ide: Complete conversion from BlockDriverState to BlockBackend

2014-10-20 Thread Kevin Wolf
From: Markus Armbruster arm...@redhat.com

Add a BlockBackend member to TrimAIOCB, so ide_issue_trim_cb() can use
blk_aio_discard() instead of bdrv_aio_discard().

Signed-off-by: Markus Armbruster arm...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Kevin Wolf kw...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/ide/core.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index a5c4698..44e3d50 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -362,6 +362,7 @@ static void ide_set_signature(IDEState *s)
 
 typedef struct TrimAIOCB {
 BlockAIOCB common;
+BlockBackend *blk;
 QEMUBH *bh;
 int ret;
 QEMUIOVector *qiov;
@@ -421,8 +422,8 @@ static void ide_issue_trim_cb(void *opaque, int ret)
 }
 
 /* Got an entry! Submit and exit.  */
-iocb->aiocb = bdrv_aio_discard(iocb->common.bs, sector, count,
-   ide_issue_trim_cb, opaque);
+iocb->aiocb = blk_aio_discard(iocb->blk, sector, count,
+  ide_issue_trim_cb, opaque);
 return;
 }
 
@@ -446,6 +447,7 @@ BlockAIOCB *ide_issue_trim(BlockBackend *blk,
 TrimAIOCB *iocb;
 
 iocb = blk_aio_get(&trim_aiocb_info, blk, cb, opaque);
+iocb->blk = blk;
 iocb->bh = qemu_bh_new(ide_trim_bh_cb, iocb);
 iocb->ret = 0;
 iocb->qiov = qiov;
-- 
1.8.3.1




[Qemu-devel] [PULL 22/28] pc87312: Drop unused members of PC87312State

2014-10-20 Thread Kevin Wolf
From: Markus Armbruster arm...@redhat.com

Signed-off-by: Markus Armbruster arm...@redhat.com
Reviewed-by: Benoît Canet benoit.ca...@nodalink.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Kevin Wolf kw...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 include/hw/isa/pc87312.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/include/hw/isa/pc87312.h b/include/hw/isa/pc87312.h
index befc8bd..bf74470 100644
--- a/include/hw/isa/pc87312.h
+++ b/include/hw/isa/pc87312.h
@@ -47,13 +47,10 @@ typedef struct PC87312State {
 
 struct {
 ISADevice *dev;
-BlockDriverState *drive[2];
-uint32_t base;
 } fdc;
 
 struct {
 ISADevice *dev;
-uint32_t base;
 } ide;
 
 MemoryRegion io;
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Kevin Wolf
On 20.10.2014 at 15:47, Peter Lieven wrote:
 On 20.10.2014 15:31, Kevin Wolf wrote:
 On 20.10.2014 at 15:22, Max Reitz wrote:
 On 20.10.2014 at 15:19, Peter Lieven wrote:
 On 20.10.2014 15:15, Max Reitz wrote:
 On 20.10.2014 at 14:48, Peter Lieven wrote:
 On 20.10.2014 14:19, Max Reitz wrote:
 On 2014-10-20 at 14:16, Peter Lieven wrote:
 On 20.10.2014 13:51, Max Reitz wrote:
 On 2014-10-20 at 12:03, Peter Lieven wrote:
 [...]
 Can you further help here. I think my problem was that I
 don't have access to the commandline options in
 bdrv_open?!
 You do. It's the options QDict. :-)
 Maybe I just don't get it.
 
 If I specify
 
 qemu -drive if=virtio,file=image.qcow2,write-merging=off
 
 and check with
 
 qdict_get_try_bool(options, "write-merging", true);
 
 in bdrv_open() directly before bdrv_swap I always get true.
 Hm, judging from fprintf(stderr, "%s\n",
 qstring_get_str(qobject_to_json_pretty(QOBJECT(options))));,
 it's there for me (directly after qdict_del(options,
 "node-name")). The output is:
 
 Qemu wrote:
 {
     "filename": "image.qcow2"
 }
 {
     "write-merging": "off"
 }
 qemu-system-x86_64: -drive
 if=virtio,file=image.qcow2,write-merging=off: could not open
 disk image image.qcow2: Block format 'qcow2' used by device
 'virtio0' doesn't support the option 'write-merging'
 But as you can see, it's a string and not a bool. So the problem
 is that there are (at least) two parameter types in qemu: One
 is just giving a QDict, and the other are QemuOpts. QDicts are
 just the raw user input and the user can only input strings, so
 everything is just a string. As far as I know, typing everything
 correctly is done by converting the QDict to a QemuOpts object
 (as you can see in generally every block driver which supports
 some options (e.g. qcow2) and also in blockdev_init(), it's
 qemu_opts_absorb_qdict()).
 
 Sooo, right, I forgot that. Currently, there are no non-string
 non-block-driver-specific options for mid-tree BDS (in contrast
 to the root BDS, which are parsed in blockdev_init()), so you
 now have the honorable task of introducing such a QemuOptsList
 along with qemu_opts_absorb_qdict() and everything to
 bdrv_open_common(). *cough*
 I would appreciate it if someone with better knowledge of this whole
 stuff would start this. Or we postpone this knob until all the
 ongoing conversions are done.
 I can try to create a bare-bones version which your patches can then be
 based on. I probably don't have the knowledge either, but I'm daring
 enough to do it anyway. ;-)
 Actually I have some patches somewhere [1] that introduce a QemuOpts for
 bdrv_open_common(). I intended to use that for cache modes, but as I
 explained in our KVM Forum presentation, it's not quite as easy as I
 thought it would be and so the patch series isn't ready yet.
 
 Anyway, having the QemuOpts there for driver-independent options is
 probably the way to go. Feel free to remove the caching from my
 patch and keep only the node-name part. Then it can be a preparatory
 patch for your series where you simply add a new option to the list.
 
 Kevin
 
 [1] 
 http://repo.or.cz/w/qemu/kevin.git/commitdiff/9c22aee04cf0bdf6a3858340bc6ff27d6805254f
 
 Thank you.
 
 Would it be legit to recycle qemu_common_drive_opts from blockdev.c for this?

No, I don't think so. That one should in theory be only for BlockBackend
options. For the short term, it still mixes BB and BDS options, but BDS
options should be moved out step by step. In any case, it is only used
for the top level.

Any option that is parsed with qemu_opts_absorb_qdict() in
bdrv_open_common() must also be handled there. If you don't ensure that
and extract all the options that blockdev_init() knows without actually
handling them, it can happen that invalid options are silently ignored
(e.g. backing.werror should error out, but would be accepted).

And please coordinate with Max; if both of you write a patch, that's
wasted time.

Kevin
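
The points made in this message and in the quoted discussion (raw options arrive as strings, known options are absorbed into a typed list, and only options that are actually handled should be absorbed) can be sketched as a small self-contained model. This is not QEMU's API: RawOption, BoolOption, absorb_options() and count_leftover() are hypothetical stand-ins for a QDict entry, a QemuOptsList entry, qemu_opts_absorb_qdict() and the check for options that nobody consumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a QDict entry: raw user input, always a string. */
typedef struct {
    const char *key;
    const char *value;   /* set to NULL once the entry has been absorbed */
} RawOption;

/* Hypothetical stand-in for one known, typed (boolean) option. */
typedef struct {
    const char *name;
    bool value;          /* preset to the default; overwritten when absorbed */
} BoolOption;

/* Parse "on"/"off" into a bool. Returns false if the text is neither. */
static bool parse_on_off(const char *text, bool *out)
{
    if (strcmp(text, "on") == 0) {
        *out = true;
        return true;
    }
    if (strcmp(text, "off") == 0) {
        *out = false;
        return true;
    }
    return false;
}

/*
 * Move every raw entry that matches a known option into the typed list.
 * Returns 0 on success, -1 if a known option carries an unparsable value.
 * Entries that match no known option are left untouched.
 */
static int absorb_options(RawOption *raw, size_t n_raw,
                          BoolOption *known, size_t n_known)
{
    for (size_t i = 0; i < n_raw; i++) {
        for (size_t j = 0; j < n_known; j++) {
            if (raw[i].value && strcmp(raw[i].key, known[j].name) == 0) {
                if (!parse_on_off(raw[i].value, &known[j].value)) {
                    return -1;
                }
                raw[i].value = NULL;   /* consumed */
            }
        }
    }
    return 0;
}

/* Count entries nobody absorbed; a caller would fail the open if this is > 0. */
static size_t count_leftover(const RawOption *raw, size_t n_raw)
{
    size_t left = 0;
    for (size_t i = 0; i < n_raw; i++) {
        if (raw[i].value) {
            left++;
        }
    }
    return left;
}
```

In this model, passing write-merging=off together with an option the layer does not handle leaves exactly one leftover entry, so the caller can still report it as unsupported. Had the known list also contained that second option without any code acting on it, the leftover count would drop to zero and the invalid option would be accepted silently, which is the failure mode described above.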



Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Peter Lieven

On 20.10.2014 15:31, Kevin Wolf wrote:

[...]

Actually I have some patches somewhere [1] that introduce a QemuOpts for
bdrv_open_common(). I intended to use that for cache modes, but as I
explained in our KVM Forum presentation, it's not quite as easy as I
thought it would be and so the patch series isn't ready yet.

Anyway, having the QemuOpts there for driver-independent options is
probably the way to go. Feel free to remove the caching from my
patch and keep only the node-name part. Then it can be a preparatory
patch for your series where you simply add a new option to the list.

Kevin

[1] 
http://repo.or.cz/w/qemu/kevin.git/commitdiff/9c22aee04cf0bdf6a3858340bc6ff27d6805254f


Thank you.

Would it be legit to recycle qemu_common_drive_opts from blockdev.c for this?

Peter



[Qemu-devel] [question] savevm/delvm: Is it neccesary to perform bdrv_drain_all before savevm and delvm?

2014-10-20 Thread Zhang Haoyu
Hi,

I noticed that bdrv_drain_all() is performed in load_vmstate() before
bdrv_snapshot_goto(), and in qmp_transaction() before
internal_snapshot_prepare(). So is it also necessary to perform
bdrv_drain_all() in savevm and delvm?

Thanks,
Zhang Haoyu





Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Peter Lieven

On 20.10.2014 15:55, Kevin Wolf wrote:

On 20.10.2014 at 15:47, Peter Lieven wrote:

[...]

Thank you.

Would it be legit to recycle qemu_common_drive_opts from blockdev.c for this?

No, I don't think so. That one should in theory be only for BlockBackend
options. For the short term, it still mixes BB and BDS options, but BDS
options should be moved out step by step. In any case, it is only used
for the top level.

Any option that is parsed with qemu_opts_absorb_qdict() in
bdrv_open_common() must also be handled there. If you don't ensure that
and extract all the options that blockdev_init() knows without actually
handling them, it can happen that invalid options are silently ignored
(e.g. backing.werror should error out, but would be accepted).

And please coordinate with Max, if both of you write a patch, that's
wasted time.


Max, if you haven't started yet, I would use Kevin's patch as a basis?

Peter



Re: [Qemu-devel] [PATCH] block: add a knob to disable multiwrite_merge

2014-10-20 Thread Max Reitz

On 20.10.2014 at 15:59, Peter Lieven wrote:

On 20.10.2014 15:55, Kevin Wolf wrote:

[...]

And please coordinate with Max, if both of you write a patch, that's
wasted time.


Max, if you haven't started yet, I would use Kevin's patch as a basis?


No, I haven't. Feel free to.

Max



Re: [Qemu-devel] [question] savevm/delvm: Is it neccesary to perform bdrv_drain_all before savevm and delvm?

2014-10-20 Thread Kevin Wolf
On 20.10.2014 at 15:48, Zhang Haoyu wrote:
 Hi,
 
 I noticed that bdrv_drain_all() is performed in load_vmstate() before
 bdrv_snapshot_goto(), and in qmp_transaction() before
 internal_snapshot_prepare(). So is it also necessary to perform
 bdrv_drain_all() in savevm and delvm?

Definitely yes for savevm. do_savevm() calls it indirectly via
vm_stop(), so that part looks okay.

delvm doesn't affect the currently running VM, and therefore doesn't
interfere with guest requests that are in flight. So I think that a
bdrv_drain_all() isn't needed there.

Kevin
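
The reasoning for savevm can be illustrated with a toy model. Everything here is hypothetical and only mirrors the call order described above (stopping the VM drains in-flight requests, and the snapshot is taken afterwards); ToyVM, toy_drain_all(), toy_vm_stop() and toy_savevm() are illustrative names, not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a running VM with guest requests still in flight. */
typedef struct {
    int in_flight;   /* guest I/O requests submitted but not yet completed */
    bool running;
} ToyVM;

/* Stand-in for bdrv_drain_all(): complete requests until none are pending. */
static void toy_drain_all(ToyVM *vm)
{
    while (vm->in_flight > 0) {
        vm->in_flight--;   /* pretend one request has completed */
    }
}

/* Stand-in for vm_stop(): stop the guest, then drain what it submitted. */
static void toy_vm_stop(ToyVM *vm)
{
    vm->running = false;
    toy_drain_all(vm);
}

/* Stand-in for savevm: returns true if the snapshot saw a quiescent disk. */
static bool toy_savevm(ToyVM *vm)
{
    toy_vm_stop(vm);
    return vm->in_flight == 0;
}
```

A delvm counterpart is deliberately absent from the sketch: deleting a snapshot does not touch the state that in-flight guest requests operate on, so there is nothing to wait for.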


