Re: [Qemu-devel] [Qemu-block] [PATCH] raw-posix.c: Make physical devices usable in QEMU

2015-07-16 Thread Programmingkid

On Jul 16, 2015, at 9:19 AM, Stefan Hajnoczi wrote:

 On Thu, Jul 09, 2015 at 10:02:26AM -0400, Programmingkid wrote:
 
 On Jul 9, 2015, at 6:52 AM, Stefan Hajnoczi wrote:
 
 On Tue, Jul 07, 2015 at 01:33:23PM -0400, Programmingkid wrote:
 Make physical devices like a USB flash drive or a CDROM drive work in 
 QEMU. With
 this patch I can use a USB flash drive like a hard drive. Before this 
 patch, 
 QEMU would just quit with a message like resource busy.
 
 The commit message and the description are missing on Mac OS X.  It
 should be clear right away that this applies to Mac only.  This works
 fine on Linux and probably other host OSes.
 
 Yeah, that should have been done. Did you see any issues with the code?
 
 QEMU shouldn't silently open a different file than the one given by the
 user.  The user should give the exact device file they want.  If there
 is magic behavior it needs to be documented, but I don't see a reason
 why that's necessary in the case of device files.

I think you are reviewing an older patch. The newest one doesn't do that. 

 
 QEMU shouldn't mount/unmount the CD-ROM.  atexit(3) doesn't handle
 crashes or abort().  Users may be confused to find their CD-ROM
 unmounted in those cases and would see this as a bug.  Instead we should
 refuse mounted CD-ROMs so the user understands that block-level access
 requires them to unmount first.

That can be done. It just wouldn't be as user friendly as having QEMU do it for
the user :(

 
 The strcpy/sprintf usage in this patch is unsafe and can lead to buffer
 overflow, for example in the case of generating command-lines.  The
 command-line buffer is only MAXPATHLEN so prepending the command to the
 filename could exceed the buffer size.
 
 There is also a buffer overflow in the array of devices that need to be
 mounted.  What happens if there are more than 7 devices?

Ok. Will correct this issue. 




Re: [Qemu-devel] [PULL for-2.4 0/1] qxl: allow to specify head limit to qxl driver

2015-07-16 Thread Peter Maydell
On 16 July 2015 at 16:34, Gerd Hoffmann kra...@redhat.com wrote:
   Hi,

 A single spice patch, adding a new parameter to specify the number
 of heads for qxl (assuming spice-server version is new enough).

 please pull,
   Gerd

 The following changes since commit 2d5ee9e7a7dd495d233cf9613a865f63f88e3375:

   Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150716' into 
 staging (2015-07-16 10:40:23 +0100)

 are available in the git repository at:


   git://anongit.freedesktop.org/spice/qemu tags/pull-spice-20150716-1

 for you to fetch changes up to 567161fdd47aeb6987e700702f6bbfef04ae0236:

   qxl: allow to specify head limit to qxl driver (2015-07-16 17:31:05 +0200)

 
 qxl: allow to specify head limit to qxl driver

 

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PULL for-2.4 0/4] input: fixes for 2.4

2015-07-16 Thread Peter Maydell
On 16 July 2015 at 16:38, Gerd Hoffmann kra...@redhat.com wrote:
   Hi,

 A few input fixes for 2.4.  Also enable virtio-input builds on
 non-linux hosts after fixing up the ioctl include.

 please pull,
   Gerd

 The following changes since commit f3a1b5068cea303a55e2a21a97e66d057eaae638:

   Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
 (2015-07-13 13:35:51 +0100)

 are available in the git repository at:


   git://git.kraxel.org/qemu tags/pull-input-20150714-1

 for you to fetch changes up to 8121c09e8a52fd47254479d8f5ccbbc20e7bb718:

   hid: clarify hid_keyboard_process_keycode (2015-07-14 13:48:45 +0200)

 
 input: fixes for 2.4

 

I'm afraid this doesn't build for Windows:

In file included from
/home/petmay01/linaro/qemu-for-merges/hw/input/virtio-input.c:13:
/home/petmay01/linaro/qemu-for-merges/include/standard-headers/linux/input.h:890:1:
error: SW_MAX redefined
In file included from
/usr/lib/gcc/i586-mingw32msvc/4.2.1-sjlj/../../../../i586-mingw32msvc/include/windows.h:55,
 from
/home/petmay01/linaro/qemu-for-merges/include/sysemu/os-win32.h:29,
 from
/home/petmay01/linaro/qemu-for-merges/include/qemu-common.h:48,
 from
/home/petmay01/linaro/qemu-for-merges/include/qemu/iov.h:17,
 from
/home/petmay01/linaro/qemu-for-merges/hw/input/virtio-input.c:7:
/usr/lib/gcc/i586-mingw32msvc/4.2.1-sjlj/../../../../i586-mingw32msvc/include/winuser.h:729:1:
error: this is the location of the previous definition


thanks
-- PMM



Re: [Qemu-devel] [PULL 00/11] RCU, KVM, memory API, crypto, Coverity fixes for 2.4.0-rc1

2015-07-16 Thread Peter Maydell
On 16 July 2015 at 17:55, Paolo Bonzini pbonz...@redhat.com wrote:
 The following changes since commit 6169b60285fe1ff730d840a49527e721bfb30899:

   Update version for v2.4.0-rc0 release (2015-07-09 17:56:56 +0100)

 are available in the git repository at:

   git://github.com/bonzini/qemu.git tags/for-upstream

 for you to fetch changes up to 4a8775ab71d2186fc1cd585ea80c000409965cde:

   crypto: avoid undefined behavior in nettle calls (2015-07-16 18:54:21 +0200)

 
 * rcu_register_thread fixes.
 * MIPS-KVM fixes.
 * Coverity fixes.
 * Nettle function prototype fixes.
 * Memory API refcount fix.


I get a pile of assertions on OSX running rcutorture:

GTESTER tests/rcutorture
Assertion failed: (rcu_reader.depth == 0), function
rcu_unregister_thread, file /Users/pm215/srcAssertion failed:
(rcu_reader.de/qemu/util/rcu.c, line 304.
pth == 0), function rcu_unregister_thread, file
/Users/pm215/src/qemu/util/rcu.c, line 304.
Assertion failed: (rcu_reader.depth == 0), function
rcu_unregister_thread, file /Users/pm215/src/qemu/util/rcu.c, line
304.
Assertion failed: (rcu_reader.depth == 0), function
rcu_unregister_thread, file /Users/pm215/src/qemu/util/rcu.c, line
304.
Assertion failed: (rcu_reader.depth == 0), function
rcu_unregister_thread, file /Users/pm215/src/qemu/util/rcu.c, line
304.
Assertion failed: (rcu_reader.depth == 0), function
rcu_unregister_thread, file /Users/pm215/srcAssertion failed:
(rcu_reader.de/qemu/util/rcu.c, line 304.
pth == 0), function rcu_unregister_thread, file
/Users/pm215/src/qemu/util/rcu.c, line 304.
GTester: last random seed: R02S9b5149dbb406809df60686a3e8223c26
Assertion failed: (rcu_reader.depth == 0), function
rcu_unregister_thread, file /Users/pm215/src/qemu/util/rcu.c, line
304.
Assertion failed: (rcu_reader.depth == 0), function
rcu_unregister_thread, file /Users/pm215/src/qemu/util/rcu.c, line
304.
Assertion failed: (rcu_reader.depth == 0), function
rcu_unregister_thread, file /Users/pm215/src/qemu/util/rcu.c, line
304.
Assertion failed: (rcu_reader.deAssertion failed: (rcu_reader.depth ==
0), function rcu_unregistpth == 0), function rcu_unregister_thread,
file /Users/pm215/srcer_thread, file /Users/pm215/src/qemu/util/rcu.c,
line 304.
/qemu/util/rcu.c, line 304.
GTester: last random seed: R02Sb915fd85eca48d367fd186bdfd39d8c7
make: *** [check-tests/rcutorture] Error 1

-- PMM



Re: [Qemu-devel] [PATCH 3/3] PAM: make PAM emulation closer to documentation

2015-07-16 Thread Paolo Bonzini


On 16/07/2015 16:41, Ефимов Василий wrote:
 The main problem is rendering memory tree to FlatView.

I don't believe it's necessary to render a memory tree to the FlatView.
 You can use existing AddressSpaces.

 +/* Read from RAM and write to PCI */
 +memory_region_init_io(pam-region[1], OBJECT(dev), pam_ops, pam,
 +pam-r-ram-w-pci, size); 

 This can be done with memory_region_set_readonly on the RAM region.  You
 need to set mr-ops in order to point to pam_ops; for a first proof of
 concept you can just set the field directly.
 The idea is to read directly from system RAM region and to write
 to PCI using I/O (because I do not see another way to emulate
 access type driven redirection with existent API). If we create RAM
 and make it read only then new useless RAM block will be created.

Don't create RAM; modify the existing one.

 It is useless because of ram_addr of new region will be set to
 one within system RAM block. Hence, cleaner way is to create I/O region.

You can use the existing RAM region and modify its properties (i.e.
toggle mr-readonly) after setting special mr-ops.  The special mr-ops
will be used for writes when mr-readonly = true.

 Writes to the PCI memory space can use the PCI address space, with
 address_space_st*.
 There is no PCI AddressSpace (only MemoryRegion). But
 address_space_st* requires AddressSpace as argument.

Then create one with address_space_init.

However, can the guest see the difference between real mode 1 and the
fake mode 1 that QEMU implements?  Perhaps mode 1 can be left as is.

 +/* Read from PCI and write to RAM */
 +memory_region_init_io(pam-region[2], OBJECT(dev), pam_ops, pam,
 +pam-r-pci-w-ram, size);

 Here you cannot run code from ROM, so it can be a pure MMIO region.
 Reads can use address_space_ld*, while writes can use
 memory_region_get_ram_ptr.
 
 Even in this mode it is possible for code to be executed from ROM. This
 can happen when particular PCI address is within ROM device connected
 to PCI bus.

If it's just for pc.rom and isa-bios, introduce a new function
pam_create_pci_region that creates pc.rom with
memory_region_init_rom_device.  The mr-ops can write to RAM (mode 2) or
discard the write (mode 0).

They you can make pc.rom 256K instead of 128K, and instead of an alias,
you can manually copy the last 128K of the BIOS into the last 128K of
pc.rom.

Some adjustment will be necessary in order to support migration (perhaps
creating two 128K regions pc.rom and pc.rom.mirror), but for a proof of
concept the above should be enough.

Paolo



[Qemu-devel] [PATCH v13 18/19] i.MX: Add qtest support for I2C device emulator.

2015-07-16 Thread Jean-Christophe Dubois
This is using a ds1338 RTC chip on the I2C bus. This RTC chip is
not present on the real 3DS PDK board.

Signed-off-by: Jean-Christophe Dubois j...@tribudubois.net
---

Changes since v1:
* not present on v1

Changes since v2:
* use a common header file for I2C regs definition

Changes since v3:
* rework GPL headers.

Changes since v4:
* none

Changes since v5:
* none

Changes since v6:
* none

Changes since v7:
* adapt to new i.MX I2C header file.

Changes since v8:
* no change

Changes since v9:
* no change

Changes since v10:
* no change

Changes since v11:
* no change

Changes since v12:
* no change

 tests/Makefile |   3 +
 tests/ds1338-test.c|  75 ++
 tests/libqos/i2c-imx.c | 209 +
 tests/libqos/i2c.h |   3 +
 4 files changed, 290 insertions(+)
 create mode 100644 tests/ds1338-test.c
 create mode 100644 tests/libqos/i2c-imx.c

diff --git a/tests/Makefile b/tests/Makefile
index c5e4744..93890a8 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -193,6 +193,7 @@ check-qtest-sparc64-y = tests/endianness-test$(EXESUF)
 gcov-files-sparc-y += hw/timer/m48t59.c
 gcov-files-sparc64-y += hw/timer/m48t59.c
 check-qtest-arm-y = tests/tmp105-test$(EXESUF)
+check-qtest-arm-y = tests/ds1338-test$(EXESUF)
 gcov-files-arm-y += hw/misc/tmp105.c
 check-qtest-arm-y += tests/virtio-blk-test$(EXESUF)
 gcov-files-arm-y += arm-softmmu/hw/block/virtio-blk.c
@@ -342,6 +343,7 @@ libqos-pc-obj-y = $(libqos-obj-y) tests/libqos/pci-pc.o
 libqos-pc-obj-y += tests/libqos/malloc-pc.o tests/libqos/libqos-pc.o
 libqos-pc-obj-y += tests/libqos/ahci.o
 libqos-omap-obj-y = $(libqos-obj-y) tests/libqos/i2c-omap.o
+libqos-imx-obj-y = $(libqos-obj-y) tests/libqos/i2c-imx.o
 libqos-usb-obj-y = $(libqos-pc-obj-y) tests/libqos/usb.o
 libqos-virtio-obj-y = $(libqos-pc-obj-y) tests/libqos/virtio.o 
tests/libqos/virtio-pci.o tests/libqos/virtio-mmio.o 
tests/libqos/malloc-generic.o
 
@@ -356,6 +358,7 @@ tests/hd-geo-test$(EXESUF): tests/hd-geo-test.o
 tests/boot-order-test$(EXESUF): tests/boot-order-test.o $(libqos-obj-y)
 tests/bios-tables-test$(EXESUF): tests/bios-tables-test.o $(libqos-obj-y)
 tests/tmp105-test$(EXESUF): tests/tmp105-test.o $(libqos-omap-obj-y)
+tests/ds1338-test$(EXESUF): tests/ds1338-test.o $(libqos-imx-obj-y)
 tests/i440fx-test$(EXESUF): tests/i440fx-test.o $(libqos-pc-obj-y)
 tests/q35-test$(EXESUF): tests/q35-test.o $(libqos-pc-obj-y)
 tests/fw_cfg-test$(EXESUF): tests/fw_cfg-test.o $(libqos-pc-obj-y)
diff --git a/tests/ds1338-test.c b/tests/ds1338-test.c
new file mode 100644
index 000..fbc989b
--- /dev/null
+++ b/tests/ds1338-test.c
@@ -0,0 +1,75 @@
+/*
+ * QTest testcase for the DS1338 RTC
+ *
+ * Copyright (c) 2013 Jean-Christophe Dubois
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include libqtest.h
+#include libqos/i2c.h
+
+#include glib.h
+
+#define IMX25_I2C_0_BASE 0x43F8
+
+#define DS1338_ADDR 0x68
+
+static I2CAdapter *i2c;
+static uint8_t addr;
+
+#define bcd2bin(x)(((x)  0x0f) + ((x)  4) * 10)
+
+static void send_and_receive(void)
+{
+uint8_t cmd[1];
+uint8_t resp[7];
+time_t now = time(NULL);
+struct tm *tm_ptr = gmtime(now);
+
+/* reset the index in the RTC memory */
+cmd[0] = 0;
+i2c_send(i2c, addr, cmd, 1);
+
+/* retrieve the date */
+i2c_recv(i2c, addr, resp, 7);
+
+/* check retreived time againt local time */
+g_assert_cmpuint(bcd2bin(resp[4]), == , tm_ptr-tm_mday);
+g_assert_cmpuint(bcd2bin(resp[5]), == , 1 + tm_ptr-tm_mon);
+g_assert_cmpuint(2000 + bcd2bin(resp[6]), == , 1900 + tm_ptr-tm_year);
+}
+
+int main(int argc, char **argv)
+{
+QTestState *s = NULL;
+int ret;
+
+g_test_init(argc, argv, NULL);
+
+s = qtest_start(-display none -machine imx25_3ds);
+i2c = imx_i2c_create(IMX25_I2C_0_BASE);
+addr = DS1338_ADDR;
+
+qtest_add_func(/ds1338/tx-rx, send_and_receive);
+
+ret = g_test_run();
+
+if (s) {
+qtest_quit(s);
+}
+g_free(i2c);
+
+return ret;
+}
diff --git a/tests/libqos/i2c-imx.c b/tests/libqos/i2c-imx.c
new file mode 100644
index 000..b5cef66
--- /dev/null
+++ b/tests/libqos/i2c-imx.c
@@ -0,0 +1,209 @@
+/*
+ * QTest i.MX I2C driver
+ *
+ * Copyright (c) 2013 Jean-Christophe Dubois
+ *
+ *  This program is free 

Re: [Qemu-devel] [PATCH v2 00/13] tcg/sparc v8plus code generation

2015-07-16 Thread Richard Henderson

On 07/15/2015 09:54 PM, Aurelien Jarno wrote:

While I understand why we need the new trunc_shr_i32 opcode for MIPS64
(the 32-bit values must be kept sign-extended), I currently fail to
see why it is needed for SPARC.


As far as I recall, it improves code for extracting high parts of 64-bit 
quantities.  Without this, we wind up with a 64-bit shift, requiring a 64-bit 
temp register, followed by the real truncate which can copy the data to a 
32-bit destination register.



r~



[Qemu-devel] [PATCH 1/2] tcg/i386: Extend addresses for 32-bit guests

2015-07-16 Thread Richard Henderson
Removing the ??? comment explaining why it (mostly) worked.

Signed-off-by: Richard Henderson r...@twiddle.net
---
 tcg/i386/tcg-target.c | 105 +++---
 1 file changed, 65 insertions(+), 40 deletions(-)

diff --git a/tcg/i386/tcg-target.c b/tcg/i386/tcg-target.c
index ff4d9cf..bbe2963 100644
--- a/tcg/i386/tcg-target.c
+++ b/tcg/i386/tcg-target.c
@@ -1434,8 +1434,8 @@ static inline void setup_guest_base_seg(void) { }
 #endif /* SOFTMMU */
 
 static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
-   TCGReg base, intptr_t ofs, int seg,
-   TCGMemOp memop)
+   TCGReg base, int index, intptr_t ofs,
+   int seg, TCGMemOp memop)
 {
 const TCGMemOp real_bswap = memop  MO_BSWAP;
 TCGMemOp bswap = real_bswap;
@@ -1448,13 +1448,16 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, 
TCGReg datalo, TCGReg datahi,
 
 switch (memop  MO_SSIZE) {
 case MO_UB:
-tcg_out_modrm_offset(s, OPC_MOVZBL + seg, datalo, base, ofs);
+tcg_out_modrm_sib_offset(s, OPC_MOVZBL + seg, datalo,
+ base, index, 0, ofs);
 break;
 case MO_SB:
-tcg_out_modrm_offset(s, OPC_MOVSBL + P_REXW + seg, datalo, base, ofs);
+tcg_out_modrm_sib_offset(s, OPC_MOVSBL + P_REXW + seg, datalo,
+ base, index, 0, ofs);
 break;
 case MO_UW:
-tcg_out_modrm_offset(s, OPC_MOVZWL + seg, datalo, base, ofs);
+tcg_out_modrm_sib_offset(s, OPC_MOVZWL + seg, datalo,
+ base, index, 0, ofs);
 if (real_bswap) {
 tcg_out_rolw_8(s, datalo);
 }
@@ -1462,20 +1465,21 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, 
TCGReg datalo, TCGReg datahi,
 case MO_SW:
 if (real_bswap) {
 if (have_movbe) {
-tcg_out_modrm_offset(s, OPC_MOVBE_GyMy + P_DATA16 + seg,
- datalo, base, ofs);
+tcg_out_modrm_sib_offset(s, OPC_MOVBE_GyMy + P_DATA16 + seg,
+ datalo, base, index, 0, ofs);
 } else {
-tcg_out_modrm_offset(s, OPC_MOVZWL + seg, datalo, base, ofs);
+tcg_out_modrm_sib_offset(s, OPC_MOVZWL + seg, datalo,
+ base, index, 0, ofs);
 tcg_out_rolw_8(s, datalo);
 }
 tcg_out_modrm(s, OPC_MOVSWL + P_REXW, datalo, datalo);
 } else {
-tcg_out_modrm_offset(s, OPC_MOVSWL + P_REXW + seg,
- datalo, base, ofs);
+tcg_out_modrm_sib_offset(s, OPC_MOVSWL + P_REXW + seg,
+ datalo, base, index, 0, ofs);
 }
 break;
 case MO_UL:
-tcg_out_modrm_offset(s, movop + seg, datalo, base, ofs);
+tcg_out_modrm_sib_offset(s, movop + seg, datalo, base, index, 0, ofs);
 if (bswap) {
 tcg_out_bswap32(s, datalo);
 }
@@ -1483,19 +1487,22 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, 
TCGReg datalo, TCGReg datahi,
 #if TCG_TARGET_REG_BITS == 64
 case MO_SL:
 if (real_bswap) {
-tcg_out_modrm_offset(s, movop + seg, datalo, base, ofs);
+tcg_out_modrm_sib_offset(s, movop + seg, datalo,
+ base, index, 0, ofs);
 if (bswap) {
 tcg_out_bswap32(s, datalo);
 }
 tcg_out_ext32s(s, datalo, datalo);
 } else {
-tcg_out_modrm_offset(s, OPC_MOVSLQ + seg, datalo, base, ofs);
+tcg_out_modrm_sib_offset(s, OPC_MOVSLQ + seg, datalo,
+ base, index, 0, ofs);
 }
 break;
 #endif
 case MO_Q:
 if (TCG_TARGET_REG_BITS == 64) {
-tcg_out_modrm_offset(s, movop + P_REXW + seg, datalo, base, ofs);
+tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo,
+ base, index, 0, ofs);
 if (bswap) {
 tcg_out_bswap64(s, datalo);
 }
@@ -1506,11 +1513,15 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, 
TCGReg datalo, TCGReg datahi,
 datahi = t;
 }
 if (base != datalo) {
-tcg_out_modrm_offset(s, movop + seg, datalo, base, ofs);
-tcg_out_modrm_offset(s, movop + seg, datahi, base, ofs + 4);
+tcg_out_modrm_sib_offset(s, movop + seg, datalo,
+ base, index, 0, ofs);
+tcg_out_modrm_sib_offset(s, movop + seg, datahi,
+ base, index, 0, ofs + 4);
 } else {
-tcg_out_modrm_offset(s, movop + seg, datahi, base, ofs + 4);
-

Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-16 Thread Paolo Bonzini


On 16/07/2015 21:05, Richard W.M. Jones wrote:
 Sorry to spoil things, but I'm still seeing this bug, although it is
 now a lot less frequent with your patch.  I would estimate it happens
 more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs
 with qemu.git + the v2 patch series.
 
 It's the exact same hang in both cases.
 
 Is it possible that this patch doesn't completely close any race?
 
 Still, it is an improvement, so there is that.

I would guess instead that there are two separate bugs, but it's not
impossible that it's still there.

Paolo



Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-16 Thread Paolo Bonzini


On 17/07/2015 00:06, Paolo Bonzini wrote:
 
 
 On 16/07/2015 21:05, Richard W.M. Jones wrote:
 Sorry to spoil things, but I'm still seeing this bug, although it is
 now a lot less frequent with your patch.  I would estimate it happens
 more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs
 with qemu.git + the v2 patch series.

 It's the exact same hang in both cases.

 Is it possible that this patch doesn't completely close any race?

 Still, it is an improvement, so there is that.
 
 I would guess instead that there are two separate bugs, but it's not
 impossible that it's still there.

Reproduced after ~80 runs...

Paolo



[Qemu-devel] [PULL 2/8] mips/kvm: Sign extend registers written to KVM

2015-07-16 Thread Paolo Bonzini
From: James Hogan james.ho...@imgtec.com

In case we're running on a 64-bit host, be sure to sign extend the
general purpose registers and hi/lo/pc before writing them to KVM, so as
to take advantage of MIPS32/MIPS64 compatibility.

Signed-off-by: James Hogan james.ho...@imgtec.com
Cc: Paolo Bonzini pbonz...@redhat.com
Cc: Leon Alrae leon.al...@imgtec.com
Cc: Aurelien Jarno aurel...@aurel32.net
Cc: k...@vger.kernel.org
Cc: qemu-sta...@nongnu.org
Message-Id: 1429871214-23514-3-git-send-email-james.ho...@imgtec.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 target-mips/kvm.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target-mips/kvm.c b/target-mips/kvm.c
index 85256f3..d287d42 100644
--- a/target-mips/kvm.c
+++ b/target-mips/kvm.c
@@ -628,12 +628,12 @@ int kvm_arch_put_registers(CPUState *cs, int level)
 
 /* Set the registers based on QEMU's view of things */
 for (i = 0; i  32; i++) {
-regs.gpr[i] = env-active_tc.gpr[i];
+regs.gpr[i] = (int64_t)(target_long)env-active_tc.gpr[i];
 }
 
-regs.hi = env-active_tc.HI[0];
-regs.lo = env-active_tc.LO[0];
-regs.pc = env-active_tc.PC;
+regs.hi = (int64_t)(target_long)env-active_tc.HI[0];
+regs.lo = (int64_t)(target_long)env-active_tc.LO[0];
+regs.pc = (int64_t)(target_long)env-active_tc.PC;
 
 ret = kvm_vcpu_ioctl(cs, KVM_SET_REGS, regs);
 
-- 
2.4.3





[Qemu-devel] [Bug 1384892] Re: RTL8168 NIC VFIO not working anymore since QEMU 2.1

2015-07-16 Thread Thorsten Kohfeldt
I made some time for limited testing.
The released V2.1 puts the vfio-ed RTL8168 into a zombie state when running an 
IPFire VM, which requires a subsequent POWER cycle in order to let mii-tools 
show anything else than 'no link' (i.e. to get the LED back on). Pushing the 
reset button on the x64 metal was NOT enough.

Then I binary-patched my V2.1 qemu x64 executable so it compares against
a 'wrong' PCI ID inside vfio_probe_rtl8168_bar2_window_quirk() (to
confirm comment #3). Yes, that made it work again.

Further patch-attempts towards comment #10 all ended up in RTL zombie
state (even though the IPFire VM was otherwise responsive). Each time a
power cycle was required to get the RTL back alive.

Could there be a general problem because the RTL must be usable in MSI-X mode 
for Windows and in in MSI mode for Linux ?
Maybe a cmdline switch should be added to activate the quirk just when MSI-X is 
intended ?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1384892

Title:
  RTL8168 NIC VFIO not working anymore since QEMU 2.1

Status in QEMU:
  New

Bug description:
  After upgrading QEMU from 2.0 to 2.1 (and libiscsi from 1.7.0 to 1.12 as a 
dependency) my two RTL8168 NICs stopped working.
  The NICs do not respond to any command and even the LEDs on the network 
connection turn off, a few seconds after the VM started.
  To get them back running I had to downgrade to 2.0 and restart the system.
  Unfortunately, I have no clue what to do or how to debug this problem since 
there are no specific errors logged.
  I tried two different VMs: Debian Wheezy and IPFire (see attachment for 
further details).
  The QEMU 2.1 changelog states Support for RTL8168 NICs. so there were some 
major changes done, I guess.

  On the IPFire guest the kernel log shows many of these lines:
  r8169 :00:07.0 green1: rtl_eriar_cond == 1 (loop: 100, delay: 100)
  r8169 :00:07.0 green1: rtl_phy_reset_cond == 1 (loop: 100, delay: 1)

  On the Debian guest there is only:
  r8169 :00:07.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory
  r8169 :00:07.0: lan0: link down
  ADDRCONF(NETDEV_UP): lan0: link is not ready

  The commandline for IPFire can be seen in the attachment. It is the same for 
Debian.
  There are also the complete kernel logs for the working (2.0) and non-working 
(2.1) cases.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1384892/+subscriptions



Re: [Qemu-devel] [PATCH 2/2 v3] virtio-rng: Serve pending request if any after timer bumps up quota.

2015-07-16 Thread Amit Shah
On (Wed) 15 Jul 2015 [17:46:48], Pankaj Gupta wrote:
 We are arming timer when we get first request from guest.
 Even if guest pulls all the data we will be serving guest
 only when timer bumps up new quota. When timer expires
 we check if we have a pending request from guest, we
 serve it and re-arm the timer else we don't do any thing.
 
 Signed-off-by: Pankaj Gupta pagu...@redhat.com
 ---
  hw/virtio/virtio-rng.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)
 
 diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
 index ae04352..3d9a002 100644
 --- a/hw/virtio/virtio-rng.c
 +++ b/hw/virtio/virtio-rng.c
 @@ -142,9 +142,13 @@ static int virtio_rng_load(QEMUFile *f, void *opaque, 
 int version_id)
  static void check_rate_limit(void *opaque)
  {
  VirtIORNG *vrng = opaque;
 +size_t size;
  
  vrng-quota_remaining = vrng-conf.max_bytes;
 -virtio_rng_process(vrng);
 +size = get_request_size(vrng-vq, 0);
 +if (size  0) {
 +virtio_rng_process(vrng);
 +}
  vrng-activate_timer = true;

Ah, this isn't helping us much here; we might as well return in a
similar way from virtio_rng_process.

What I had written earlier was slightly different than this:

 So now with your changes, here is what we can do: instead of just
 activating the timer in check_rate_limit(), we can issue a
 get_request_size() call.  If the return value is  0, it means we have
 a buffer queued up by the guest, and we can then call
 virtio_rng_process() and set activated to true.  Else, no need to call
 virtio_rng_process at all, and the job for the timer is done, since
 there are no more outstanding requests from the guest.

Anyway we can look at that later, patch 1 is already a big improvement
so I think we should just stick with that, and think about other
things in a different series.

Thanks,


Amit



Re: [Qemu-devel] [PATCH] qemu-char: Fix missed data on unix socket

2015-07-16 Thread Amit Shah
On (Wed) 15 Jul 2015 [23:44:52], Nils Carlson wrote:
 On Mon, 13 Jul 2015, Nils Carlson wrote:
 
 On Mon, 13 Jul 2015, Amit Shah wrote:
 
 snip
 
 Also, returning TRUE there isn't right - if the connection ends, we
 should return FALSE.
 
 I agree that this seems reasonable. I will change it and re-test.
 
 
 I had a closer look, and it seems always returning true is intentional here,
 the called function, tcp_chr_disconnect(chr), handles the deregistration
 from handlers. If we were to return FALSE we would be duplicating work and
 possibly breaking things.

Not sure how.

Anyway, can you please start a new thread, with the author and
reviewers of the patch CC'ed, so they can chime in as well?

Amit



Re: [Qemu-devel] [PATCH V2] virtio-net: unbreak any layout

2015-07-16 Thread Michael S. Tsirkin
On Wed, Jul 15, 2015 at 03:56:07PM +0800, Jason Wang wrote:
 Commit 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
 (virtio-net: byteswap virtio-net header) breaks any layout by
 requiring out_sg[0].iov_len = n-guest_hdr_len. Fixing this by
 copying header to temporary buffer if swap is needed, and then use
 this buffer as part of out_sg.
 
 Fixes 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
 (virtio-net: byteswap virtio-net header)
 Cc: qemu-sta...@nongnu.org
 Cc: c...@fr.ibm.com
 Signed-off-by: Jason Wang jasow...@redhat.com
 ---
 Changes from V1:
 - avoid header copying if there's no need to do header swap
 - don't write the header back
 ---
  hw/net/virtio-net.c   | 17 ++---
  include/hw/virtio/virtio-access.h |  9 +
  2 files changed, 23 insertions(+), 3 deletions(-)
 
 diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
 index e3c2db3..12322bd 100644
 --- a/hw/net/virtio-net.c
 +++ b/hw/net/virtio-net.c
 @@ -1142,7 +1142,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
  ssize_t ret, len;
  unsigned int out_num = elem.out_num;
  struct iovec *out_sg = elem.out_sg[0];
 -struct iovec sg[VIRTQUEUE_MAX_SIZE];
 +struct iovec sg[VIRTQUEUE_MAX_SIZE], sg2[VIRTQUEUE_MAX_SIZE];
 +struct virtio_net_hdr hdr;
  
  if (out_num  1) {
  error_report(virtio-net header not in first element);
 @@ -1150,11 +1151,21 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
  }
  
  if (n-has_vnet_hdr) {
 -if (out_sg[0].iov_len  n-guest_hdr_len) {
 +if (iov_size(out_sg, out_num)  n-guest_hdr_len) {
  error_report(virtio-net header incorrect);
  exit(1);
  }

this scans the iov unnecessarily. How about checking return
code from iov_to_buf instead?

 -virtio_net_hdr_swap(vdev, (void *) out_sg[0].iov_base);
 +if (virtio_needs_swap(vdev)) {
 +iov_to_buf(out_sg, out_num, 0, hdr, sizeof(hdr));
 +virtio_net_hdr_swap(vdev, (void *) hdr);
 +sg2[0].iov_base = hdr;
 +sg2[0].iov_len = sizeof(hdr);
 +out_num = iov_copy(sg2[1], ARRAY_SIZE(sg2) - 1,
 +   out_sg, out_num,
 +   sizeof(hdr), -1);


This might truncate packet if it does not fit in 1024 anymore.
It might be better to just drop it.

 +out_num += 1;
 +out_sg = sg2;
 +}
  }
  
  /*
 diff --git a/include/hw/virtio/virtio-access.h 
 b/include/hw/virtio/virtio-access.h
 index cee5dd7..1ec1dfd 100644
 --- a/include/hw/virtio/virtio-access.h
 +++ b/include/hw/virtio/virtio-access.h
 @@ -143,6 +143,15 @@ static inline uint64_t virtio_ldq_p(VirtIODevice *vdev, 
 const void *ptr)
  }
  }
  
 +static inline bool virtio_needs_swap(VirtIODevice *vdev)
 +{
 +#ifdef HOST_WORDS_BIGENDIAN
 +return virtio_access_is_big_endian(vdev) ? false : true;
 +#else
 +return virtio_access_is_big_endian(vdev) ? true : false;
 +#endif
 +}
 +
  static inline uint16_t virtio_tswap16(VirtIODevice *vdev, uint16_t s)
  {
  #ifdef HOST_WORDS_BIGENDIAN
 -- 
 2.1.4



Re: [Qemu-devel] [PATCH 2/2 v3] virtio-rng: Serve pending request if any after timer bumps up quota.

2015-07-16 Thread Pankaj Gupta


- Original Message -
 From: Amit Shah amit.s...@redhat.com
 To: Pankaj Gupta pagu...@redhat.com
 Cc: qemu-devel@nongnu.org, m...@redhat.com
 Sent: Thursday, 16 July, 2015 11:35:36 AM
 Subject: Re: [Qemu-devel] [PATCH 2/2 v3] virtio-rng: Serve pending request if 
 any after timer bumps up quota.
 
 On (Wed) 15 Jul 2015 [17:46:48], Pankaj Gupta wrote:
  We are arming timer when we get first request from guest.
  Even if guest pulls all the data we will be serving guest
  only when timer bumps up new quota. When timer expires
  we check if we have a pending request from guest, we
  serve it and re-arm the timer else we don't do any thing.
  
  Signed-off-by: Pankaj Gupta pagu...@redhat.com
  ---
   hw/virtio/virtio-rng.c | 6 +-
   1 file changed, 5 insertions(+), 1 deletion(-)
  
  diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
  index ae04352..3d9a002 100644
  --- a/hw/virtio/virtio-rng.c
  +++ b/hw/virtio/virtio-rng.c
  @@ -142,9 +142,13 @@ static int virtio_rng_load(QEMUFile *f, void *opaque,
  int version_id)
   static void check_rate_limit(void *opaque)
   {
       VirtIORNG *vrng = opaque;
  +    size_t size;
   
       vrng-quota_remaining = vrng-conf.max_bytes;
  -    virtio_rng_process(vrng);
  +    size = get_request_size(vrng-vq, 0);
  +    if (size  0) {
  +        virtio_rng_process(vrng);
  +    }
       vrng-activate_timer = true;
 
 Ah, this isn't helping us much here; we might as well return in a
 similar way from virtio_rng_process.
 
 What I had written earlier was slightly different than this:
 
  So now with your changes, here is what we can do: instead of just
  activating the timer in check_rate_limit(), we can issue a
  get_request_size() call.  If the return value is  0, it means we have
  a buffer queued up by the guest, and we can then call
  virtio_rng_process() and set activated to true.  Else, no need to call
  virtio_rng_process at all, and the job for the timer is done, since
  there are no more outstanding requests from the guest.
 
 Anyway we can look at that later, patch 1 is already a big improvement
 so I think we should just stick with that, and think about other
 things in a different series.

Sure.

Thanks,
Pankaj
 
 Thanks,
 
 
 Amit
 
 



Re: [Qemu-devel] [PATCH V2] virtio-net: unbreak any layout

2015-07-16 Thread Michael S. Tsirkin
On Thu, Jul 16, 2015 at 01:54:38PM +0800, Jason Wang wrote:
 
 
 On 07/15/2015 07:23 PM, Michael S. Tsirkin wrote:
  On Wed, Jul 15, 2015 at 03:56:07PM +0800, Jason Wang wrote:
  Commit 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
  (virtio-net: byteswap virtio-net header) breaks any layout by
  requiring out_sg[0].iov_len = n-guest_hdr_len. Fixing this by
  copying header to temporary buffer if swap is needed, and then use
  this buffer as part of out_sg.
 
  Fixes 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
  (virtio-net: byteswap virtio-net header)
  Cc: qemu-sta...@nongnu.org
  Cc: c...@fr.ibm.com
  Signed-off-by: Jason Wang jasow...@redhat.com
  ---
  Changes from V1:
  - avoid header copying if there's no need to do header swap
  - don't write the header back
  ---
   hw/net/virtio-net.c   | 17 ++---
   include/hw/virtio/virtio-access.h |  9 +
   2 files changed, 23 insertions(+), 3 deletions(-)
 
  diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
  index e3c2db3..12322bd 100644
  --- a/hw/net/virtio-net.c
  +++ b/hw/net/virtio-net.c
  @@ -1142,7 +1142,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
   ssize_t ret, len;
   unsigned int out_num = elem.out_num;
   struct iovec *out_sg = elem.out_sg[0];
  -struct iovec sg[VIRTQUEUE_MAX_SIZE];
  +struct iovec sg[VIRTQUEUE_MAX_SIZE], sg2[VIRTQUEUE_MAX_SIZE];
  +struct virtio_net_hdr hdr;
   
   if (out_num  1) {
   error_report(virtio-net header not in first element);
  @@ -1150,11 +1151,21 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue 
  *q)
   }
   
   if (n-has_vnet_hdr) {
  -if (out_sg[0].iov_len  n-guest_hdr_len) {
  +if (iov_size(out_sg, out_num)  n-guest_hdr_len) {
   error_report(virtio-net header incorrect);
   exit(1);
   }
  -virtio_net_hdr_swap(vdev, (void *) out_sg[0].iov_base);
  +if (virtio_needs_swap(vdev)) {
  +iov_to_buf(out_sg, out_num, 0, hdr, sizeof(hdr));
  +virtio_net_hdr_swap(vdev, (void *) hdr);
  +sg2[0].iov_base = hdr;
  +sg2[0].iov_len = sizeof(hdr);
  +out_num = iov_copy(sg2[1], ARRAY_SIZE(sg2) - 1,
  +   out_sg, out_num,
  +   sizeof(hdr), -1);
  +out_num += 1;
  +out_sg = sg2;
  +}
   }
   
   /*
  I guess this works but iov cpy is pretty expensive.
 
  - When using the tun backend, we can just set VNETBE/VNETLE
correctly and avoid swaps in userspace.
It's easy to probe for - why don't we do this?
 
 We need to fix stable branches first. And if we do this for 2.4, if
 running on a older kernel which does not support VNETBE/VNETLE, do we
 want to fail the initialization? If yes, it breaks backward
 compatibility. If no, we still need the swap here.
 
 
  - OTOH when not using the tun backend, offloads are
generally disabled, so the swap isn't needed
since header is going to be all 0s.
 
 I agree it's better to check backend (e.g host_hdr_len) before trying to
 do the swap.
 
 
  - what's left is old kernels when using tun.
how about probing for that explicitly?
In other words, virtio_net_needs_swap as opposed to
the generic virtio_needs_swap.
 
 So we still need swap anyway. How about apply this patch first for
 stable and 2.4 and probe the VNETBE/VNETLE explicitly on top for 2.5?

I'm fine with it being a patch on top, but I think it's preferable
to use VNETBE/VNETLE even in 2.4 if that's available.

 
 
  diff --git a/include/hw/virtio/virtio-access.h 
  b/include/hw/virtio/virtio-access.h
  index cee5dd7..1ec1dfd 100644
  --- a/include/hw/virtio/virtio-access.h
  +++ b/include/hw/virtio/virtio-access.h
  @@ -143,6 +143,15 @@ static inline uint64_t virtio_ldq_p(VirtIODevice 
  *vdev, const void *ptr)
   }
   }
   
  +static inline bool virtio_needs_swap(VirtIODevice *vdev)
  +{
  +#ifdef HOST_WORDS_BIGENDIAN
  +return virtio_access_is_big_endian(vdev) ? false : true;
  +#else
  +return virtio_access_is_big_endian(vdev) ? true : false;
  +#endif
  +}
  +
   static inline uint16_t virtio_tswap16(VirtIODevice *vdev, uint16_t s)
   {
   #ifdef HOST_WORDS_BIGENDIAN
  -- 
  2.1.4



Re: [Qemu-devel] [PATCH v4 0/7] Fix QEMU crash during memory hotplug with vhost=on

2015-07-16 Thread Igor Mammedov
On Wed, 15 Jul 2015 19:32:31 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 On Wed, Jul 15, 2015 at 05:12:01PM +0200, Igor Mammedov wrote:
  On Thu,  9 Jul 2015 13:47:17 +0200
  Igor Mammedov imamm...@redhat.com wrote:
  
  there also is yet another issue with vhost-user. It also has
  very low limit on amount of memory regions (if I recall correctly 8)
  and it's possible to trigger even without memory hotplug.
  one just need to start QEMU with a several -numa memdev= options
  to create a necessary amount of memory regions to trigger it.
  
  lowrisk option to fix it would be increasing limit in vhost-user
  backend.
  
  another option is disabling vhost and fall-back to virtio,
  but I don't know much about vhost if it's possible to 
  to switch it off without loosing packets guest was sending
  at the moment and if it will work at all with vhost.
 
 With vhost-user you can't fall back to virtio: it's
 not an accelerator, it's the backend.
 
 Updating the protocol to support a bigger table
 is possible but old remotes won't be able to support it.
 
it looks like increasing limit is the only option left.

it's not ideal that old remotes /with hardcoded limit/
might not be able to handle bigger table but at least
new ones and ones that handle VhostUserMsg payload
dynamically would be able to work without crashing.



[Qemu-devel] [PULL 2/9] target-mips: fix to clear MSACSR.Cause

2015-07-16 Thread Leon Alrae
From: Yongbok Kim yongbok@imgtec.com

MSACSR.Cause bits are needed to be cleared before a vector floating-point
instructions.
FEXDO.df, FEXUPL.df and FEXUPR.df were missed out.

Signed-off-by: Yongbok Kim yongbok@imgtec.com
Reviewed-by: Aurelien Jarno aurel...@aurel32.net
Reviewed-by: Leon Alrae leon.al...@imgtec.com
Signed-off-by: Leon Alrae leon.al...@imgtec.com
---
 target-mips/msa_helper.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/target-mips/msa_helper.c b/target-mips/msa_helper.c
index 26ffdc7..a1cb48f 100644
--- a/target-mips/msa_helper.c
+++ b/target-mips/msa_helper.c
@@ -2642,6 +2642,8 @@ void helper_msa_fexdo_df(CPUMIPSState *env, uint32_t df, 
uint32_t wd,
 wr_t *pwt = (env-active_fpu.fpr[wt].wr);
 uint32_t i;
 
+clear_msacsr_cause(env);
+
 switch (df) {
 case DF_WORD:
 for (i = 0; i  DF_ELEMENTS(DF_WORD); i++) {
@@ -3192,6 +3194,8 @@ void helper_msa_fexupl_df(CPUMIPSState *env, uint32_t df, 
uint32_t wd,
 wr_t *pws = (env-active_fpu.fpr[ws].wr);
 uint32_t i;
 
+clear_msacsr_cause(env);
+
 switch (df) {
 case DF_WORD:
 for (i = 0; i  DF_ELEMENTS(DF_WORD); i++) {
@@ -3224,6 +3228,8 @@ void helper_msa_fexupr_df(CPUMIPSState *env, uint32_t df, 
uint32_t wd,
 wr_t *pws = (env-active_fpu.fpr[ws].wr);
 uint32_t i;
 
+clear_msacsr_cause(env);
+
 switch (df) {
 case DF_WORD:
 for (i = 0; i  DF_ELEMENTS(DF_WORD); i++) {
-- 
2.1.0




[Qemu-devel] [PULL 9/9] target-mips: fix page fault address for LWL/LWR/LDL/LDR

2015-07-16 Thread Leon Alrae
From: Aurelien Jarno aurel...@aurel32.net

When a LWL, LWR, LDL or LDR instruction triggers a page fault, QEMU
currently reports the aligned address in CP0 BadVAddr, while the Windows
NT kernel expects the unaligned address.

This patch adds a byte access with the unaligned address at the
beginning of the LWL/LWR/LDL/LDR instructions to possibly trigger a page
fault and fill the QEMU TLB.

Cc: Leon Alrae leon.al...@imgtec.com
Reported-by: Hervé Poussineau hpous...@reactos.org
Tested-by: Hervé Poussineau hpous...@reactos.org
Signed-off-by: Aurelien Jarno aurel...@aurel32.net
Signed-off-by: Leon Alrae leon.al...@imgtec.com
---
 target-mips/translate.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 4a1ffdb..d1de35a 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -2142,6 +2142,9 @@ static void gen_ld(DisasContext *ctx, uint32_t opc,
 break;
 case OPC_LDL:
 t1 = tcg_temp_new();
+/* Do a byte access to possibly trigger a page
+   fault with the unaligned address.  */
+tcg_gen_qemu_ld_tl(t1, t0, ctx-mem_idx, MO_UB);
 tcg_gen_andi_tl(t1, t0, 7);
 #ifndef TARGET_WORDS_BIGENDIAN
 tcg_gen_xori_tl(t1, t1, 7);
@@ -2163,6 +2166,9 @@ static void gen_ld(DisasContext *ctx, uint32_t opc,
 break;
 case OPC_LDR:
 t1 = tcg_temp_new();
+/* Do a byte access to possibly trigger a page
+   fault with the unaligned address.  */
+tcg_gen_qemu_ld_tl(t1, t0, ctx-mem_idx, MO_UB);
 tcg_gen_andi_tl(t1, t0, 7);
 #ifdef TARGET_WORDS_BIGENDIAN
 tcg_gen_xori_tl(t1, t1, 7);
@@ -2229,6 +2235,9 @@ static void gen_ld(DisasContext *ctx, uint32_t opc,
 break;
 case OPC_LWL:
 t1 = tcg_temp_new();
+/* Do a byte access to possibly trigger a page
+   fault with the unaligned address.  */
+tcg_gen_qemu_ld_tl(t1, t0, ctx-mem_idx, MO_UB);
 tcg_gen_andi_tl(t1, t0, 3);
 #ifndef TARGET_WORDS_BIGENDIAN
 tcg_gen_xori_tl(t1, t1, 3);
@@ -2251,6 +2260,9 @@ static void gen_ld(DisasContext *ctx, uint32_t opc,
 break;
 case OPC_LWR:
 t1 = tcg_temp_new();
+/* Do a byte access to possibly trigger a page
+   fault with the unaligned address.  */
+tcg_gen_qemu_ld_tl(t1, t0, ctx-mem_idx, MO_UB);
 tcg_gen_andi_tl(t1, t0, 3);
 #ifdef TARGET_WORDS_BIGENDIAN
 tcg_gen_xori_tl(t1, t1, 3);
-- 
2.1.0




Re: [Qemu-devel] [RFC PATCH v0] spapr: Abort when hash table size requirement isn't met

2015-07-16 Thread Bharata B Rao
On Wed, Jul 15, 2015 at 03:27:13PM +0530, Bharata B Rao wrote:
 [This patch addresses an issue which is not prominently seen in mainline,
 but seen frequently only in David's spapr-next branch. Though it is possible
 to see this issue with mainline too, the current version of the patch
 is intended for David's tree.]
 
 QEMU requests for hash table allocation through KVM_PPC_ALLOCATE_HTAB ioctl
 by providing the size hint via htab_shift value. Sometimes the hinted
 size requirement can't be met by the host and it returns with a lower
 value for htab_shift.
 
 This was fine until recently where the hash table size was dependent
 on guest RAM size. With the intention of supporting memory hotplug, hash
 table size was changed to depend on maxram size recently. Since it is
 typical to have maxram size to be much higher than RAM size, the possibility
 of host not being able to meet the size requirement has increased. This
 causes two problems:
 
 - When memory hotplug is supported, we will not be able to grow till
   maxram if the host wasn't able to satisfy the hash table size for the
   the full maxram range.

This is a recoverable condition where the hotplug can be gracefully failed.

 - During migration, we can end up having different htab_shift values (and
   hence different hash table sizes) at the source and target due to
   which the migration fails.

One possible way to solve this is to change (reduce) the maxram_size
based on the negotiated value of htab_shit and use the changed value
of maxram_size at the target during migration. However AFAIK, currently
there is no way to communicate the changed maxram_size back to libvirt,
so this solution may not be feasible.

So it is the question of whether to allow the guest to boot with reduced
hashtable size and fail migration (this is the current behaviour)

or

As done in this patch, prevent the booting of the VM altogether.

I am leaning towards the former. Thoughts ?

Regards,
Bharata.




Re: [Qemu-devel] [PATCH] qemu-char: Fix missed data on unix socket

2015-07-16 Thread Nils Carlson

On Thu, 16 Jul 2015, Amit Shah wrote:


On (Wed) 15 Jul 2015 [23:44:52], Nils Carlson wrote:

On Mon, 13 Jul 2015, Nils Carlson wrote:


On Mon, 13 Jul 2015, Amit Shah wrote:


snip


Also, returning TRUE there isn't right - if the connection ends, we
should return FALSE.


I agree that this seems reasonable. I will change it and re-test.



I had a closer look, and it seems always returning true is intentional here,
the called function, tcp_chr_disconnect(chr), handles the deregistration
from handlers. If we were to return FALSE we would be duplicating work and
possibly breaking things.


Not sure how.

Anyway, can you please start a new thread, with the author and
reviewers of the patch CC'ed, so they can chime in as well?


The authors and reviewers of which patch exactly?

The fix for windows which broke cloudstack was done in 812c1057.
The returning TRUE everywhere was done in commit cdbf6e16 by Paolo 
Bonzini, CC:d above.


I don't think there was anything wrong with either patch, the former one 
simply broken a strange behaviour CloudStack was relying on which I would 
say is dubious (fire and forget messaging.)


/Nils



[Qemu-devel] [Bug 1465935] Re: kvm_irqchip_commit_routes: Assertion `ret == 0' failed

2015-07-16 Thread Stefan Bader
The proposed fix seems not yet part of any qemu release but applied as

commit bdf026317daa3b9dfa281f29e96fbb6fd48394c8
Author: 马文霜 kevin...@tencent.com
Date:   Wed Jul 1 15:41:41 2015 +0200

Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES

to v2.4.0-rc0. So this would affect all current releases and the current
development release (Wily/15.10). I would start there with
reproduction/verification and work backwards from there.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1465935

Title:
  kvm_irqchip_commit_routes: Assertion `ret == 0' failed

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Confirmed
Status in qemu source package in Precise:
  New
Status in qemu source package in Trusty:
  New
Status in qemu source package in Utopic:
  New
Status in qemu source package in Vivid:
  New

Bug description:
  Several my QEMU instances crashed, and in the  qemu log, I can see
  this assertion failure,

 qemu-system-x86_64: /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:984:
  kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

  The QEMU version is 2.0.0, HV OS is ubuntu 12.04, kernel 3.2.0-38.
  Guest OS is RHEL 6.3.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1465935/+subscriptions



[Qemu-devel] [PULL 3/9] disas/mips: fix disassembling R6 instructions

2015-07-16 Thread Leon Alrae
From: Yongbok Kim yongbok@imgtec.com

In the Release 6 of the MIPS Architecture, LL, SC, LLD, SCD, PREF
and CACHE instructions have 9 bits offsets.

Signed-off-by: Yongbok Kim yongbok@imgtec.com
Reviewed-by: Aurelien Jarno aurel...@aurel32.net
Reviewed-by: Leon Alrae leon.al...@imgtec.com
Signed-off-by: Leon Alrae leon.al...@imgtec.com
---
 disas/mips.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/disas/mips.c b/disas/mips.c
index 32940fe..01336a8 100644
--- a/disas/mips.c
+++ b/disas/mips.c
@@ -1296,12 +1296,12 @@ const struct mips_opcode mips_builtin_opcodes[] =
 {dmod,d,s,t,0x00de, 0xfc0007ff, WR_d|RD_s|RD_t,   0, 
I64R6},
 {ddivu,   d,s,t,0x009f, 0xfc0007ff, WR_d|RD_s|RD_t,   0, 
I64R6},
 {dmodu,   d,s,t,0x00df, 0xfc0007ff, WR_d|RD_s|RD_t,   0, 
I64R6},
-{ll,  t,o(b),   0x7c36, 0xfc7f, LDD|RD_b|WR_t,0, 
I32R6},
-{sc,  t,o(b),   0x7c26, 0xfc7f, LDD|RD_b|WR_t,0, 
I32R6},
-{lld, t,o(b),   0x7c37, 0xfc7f, LDD|RD_b|WR_t,0, 
I64R6},
-{scd, t,o(b),   0x7c27, 0xfc7f, LDD|RD_b|WR_t,0, 
I64R6},
-{pref,h,o(b),   0x7c35, 0xfc7f, RD_b, 0, 
I32R6},
-{cache,   k,o(b),   0x7c25, 0xfc7f, RD_b, 0, 
I32R6},
+{ll,  t,+o(b),  0x7c36, 0xfc7f, LDD|RD_b|WR_t,0, 
I32R6},
+{sc,  t,+o(b),  0x7c26, 0xfc7f, LDD|RD_b|WR_t,0, 
I32R6},
+{lld, t,+o(b),  0x7c37, 0xfc7f, LDD|RD_b|WR_t,0, 
I64R6},
+{scd, t,+o(b),  0x7c27, 0xfc7f, LDD|RD_b|WR_t,0, 
I64R6},
+{pref,h,+o(b),  0x7c35, 0xfc7f, RD_b, 0, 
I32R6},
+{cache,   k,+o(b),  0x7c25, 0xfc7f, RD_b, 0, 
I32R6},
 {seleqz,  d,v,t,0x0035, 0xfc0007ff, WR_d|RD_s|RD_t,   0, 
I32R6},
 {selnez,  d,v,t,0x0037, 0xfc0007ff, WR_d|RD_s|RD_t,   0, 
I32R6},
 {maddf.s, D,S,T,0x4618, 0xffe0003f, WR_D|RD_S|RD_T|FP_S,  0, 
I32R6},
-- 
2.1.0




[Qemu-devel] [PULL 5/9] target-mips: correct DERET instruction

2015-07-16 Thread Leon Alrae
Fix Debug Mode flag clearing, and when DERET is placed between LL and SC
do not make SC fail.

Signed-off-by: Leon Alrae leon.al...@imgtec.com
Reviewed-by: Aurelien Jarno aurel...@aurel32.net
---
 target-mips/op_helper.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index d457a29..9c28631 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -2154,10 +2154,9 @@ void helper_deret(CPUMIPSState *env)
 debug_pre_eret(env);
 set_pc(env, env-CP0_DEPC);
 
-env-hflags = MIPS_HFLAG_DM;
+env-hflags = ~MIPS_HFLAG_DM;
 compute_hflags(env);
 debug_post_eret(env);
-env-lladdr = 1;
 }
 #endif /* !CONFIG_USER_ONLY */
 
-- 
2.1.0




[Qemu-devel] [PULL 1/9] target-mips: fix MIPS64R6-generic configuration

2015-07-16 Thread Leon Alrae
From: Yongbok Kim yongbok@imgtec.com

Fix core configuration for MIPS64R6-generic to make it as close as
I6400.
I6400 core has 48-bit of Virtual Address available (SEGBITS).
MIPS SIMD Architecture is available.
Rearrange order of bits to match the specification.

Signed-off-by: Yongbok Kim yongbok@imgtec.com
Reviewed-by: Aurelien Jarno aurel...@aurel32.net
Reviewed-by: Leon Alrae leon.al...@imgtec.com
Signed-off-by: Leon Alrae leon.al...@imgtec.com
---
 target-mips/mips-defs.h  |  2 +-
 target-mips/translate_init.c | 18 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/target-mips/mips-defs.h b/target-mips/mips-defs.h
index 20aa87c..53b185e 100644
--- a/target-mips/mips-defs.h
+++ b/target-mips/mips-defs.h
@@ -11,7 +11,7 @@
 #if defined(TARGET_MIPS64)
 #define TARGET_LONG_BITS 64
 #define TARGET_PHYS_ADDR_SPACE_BITS 48
-#define TARGET_VIRT_ADDR_SPACE_BITS 42
+#define TARGET_VIRT_ADDR_SPACE_BITS 48
 #else
 #define TARGET_LONG_BITS 32
 #define TARGET_PHYS_ADDR_SPACE_BITS 40
diff --git a/target-mips/translate_init.c b/target-mips/translate_init.c
index ddfaff8..9304e74 100644
--- a/target-mips/translate_init.c
+++ b/target-mips/translate_init.c
@@ -655,14 +655,14 @@ static const mips_def_t mips_defs[] =
(2  CP0C1_DS) | (4  CP0C1_DL) | (3  CP0C1_DA) |
(0  CP0C1_PC) | (1  CP0C1_WR) | (1  CP0C1_EP),
 .CP0_Config2 = MIPS_CONFIG2,
-.CP0_Config3 = MIPS_CONFIG3 | (1  CP0C3_RXI) | (1  CP0C3_BP) |
-   (1  CP0C3_BI) | (1  CP0C3_ULRI) | (1  CP0C3_LPA) |
-   (1U  CP0C3_M),
-.CP0_Config4 = MIPS_CONFIG4 | (0xfc  CP0C4_KScrExist) |
-   (3  CP0C4_IE) | (1  CP0C4_M),
+.CP0_Config3 = MIPS_CONFIG3 | (1U  CP0C3_M) | (1  CP0C3_MSAP) |
+   (1  CP0C3_BP) | (1  CP0C3_BI) | (1  CP0C3_ULRI) |
+   (1  CP0C3_RXI) | (1  CP0C3_LPA),
+.CP0_Config4 = MIPS_CONFIG4 | (1U  CP0C4_M) | (3  CP0C4_IE) |
+   (0xfc  CP0C4_KScrExist),
 .CP0_Config5 = MIPS_CONFIG5 | (1  CP0C5_LLB),
-.CP0_Config5_rw_bitmask = (1  CP0C5_SBRI) | (1  CP0C5_FRE) |
-  (1  CP0C5_UFE),
+.CP0_Config5_rw_bitmask = (1  CP0C5_MSAEn) | (1  CP0C5_SBRI) |
+  (1  CP0C5_FRE) | (1  CP0C5_UFE),
 .CP0_LLAddr_rw_bitmask = 0,
 .CP0_LLAddr_shift = 0,
 .SYNCI_Step = 32,
@@ -674,9 +674,9 @@ static const mips_def_t mips_defs[] =
 .CP1_fcr0 = (1  FCR0_FREP) | (1  FCR0_F64) | (1  FCR0_L) |
 (1  FCR0_W) | (1  FCR0_D) | (1  FCR0_S) |
 (0x00  FCR0_PRID) | (0x0  FCR0_REV),
-.SEGBITS = 42,
+.SEGBITS = 48,
 .PABITS = 48,
-.insn_flags = CPU_MIPS64R6,
+.insn_flags = CPU_MIPS64R6 | ASE_MSA,
 .mmu_type = MMU_TYPE_R4000,
 },
 {
-- 
2.1.0




[Qemu-devel] [PULL 6/9] target-mips: fix logically dead code reported by Coverity

2015-07-16 Thread Leon Alrae
Make use of CMPOP in floating-point compare instructions.

Signed-off-by: Leon Alrae leon.al...@imgtec.com
Reviewed-by: Aurelien Jarno aurel...@aurel32.net
---
 target-mips/translate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 7302857..4a1ffdb 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -9552,6 +9552,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
op1,
 gen_cmp_s(ctx, func-48, ft, fs, cc);
 opn = condnames[func-48];
 }
+optype = CMPOP;
 break;
 case OPC_ADD_D:
 check_cp1_registers(ctx, fs | ft | fd);
@@ -10036,6 +10037,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
op1,
 gen_cmp_d(ctx, func-48, ft, fs, cc);
 opn = condnames[func-48];
 }
+optype = CMPOP;
 break;
 case OPC_CVT_S_D:
 check_cp1_registers(ctx, fs);
@@ -10461,6 +10463,7 @@ static void gen_farith (DisasContext *ctx, enum fopcode 
op1,
 gen_cmp_ps(ctx, func-48, ft, fs, cc);
 opn = condnames[func-48];
 }
+optype = CMPOP;
 break;
 default:
 MIPS_INVAL(opn);
-- 
2.1.0




Re: [Qemu-devel] [PATCH] AioContext: fix broken ctx-dispatching optimization

2015-07-16 Thread Paolo Bonzini


On 16/07/2015 11:14, Kevin Wolf wrote:
 With this information, I understand that what has changed is that the
   return value of g_main_context_iteration() changes from true before this
   patch (had the aio_notify() from aio_set_fd_handler() pending) to false
   after the patch (aio_notify() doesn't inject an event any more).
   
   This should mean that like above we can assert that the first iteration
   returns false, i.e. reverse the assertion (and indeed, with this
   change the test still passes for me).
  
  I was a bit undecided about this.  In the end I decided that the calls
  to aio_poll/g_main_context_iteration were just to put the AioContext in
  a known state, and the assertions on the return value of g_assert were
  not really important.  For this reason, the while loop seemed to express
  the intentions best, and I made it consistent between the AioContext and
  GSource cases.
 You changed the AioContext case in this same patch, even if you didn't
 quote my comment on that hunk. :-)
 
 Both cases were asserting the return value before.

I'll change the testcase (other than the aio_notify testcase) in a
separate patch.

Paolo



Re: [Qemu-devel] [PULL 0/2] X86 queue, 2015-07-15

2015-07-16 Thread Peter Maydell
On 15 July 2015 at 21:27, Eduardo Habkost ehabk...@redhat.com wrote:
 The following changes since commit 7692401a0826803522cfde533bdcc149932ddc6a:

   Merge remote-tracking branch 
 'remotes/pmaydell/tags/pull-target-arm-20150715' into staging (2015-07-15 
 17:28:59 +0100)

 are available in the git repository at:

   git://github.com/ehabkost/qemu.git tags/x86-pull-request

 for you to fetch changes up to 3046bb5debc8153a542acb1df93b2a1a85527a15:

   target-i386: emulate CPUID level of real hardware (2015-07-15 17:05:59 
 -0300)

 
 X86 queue, 2015-07-15

 Two bug fixes:
 * Memory leak due to extra g_strdup() when registering X86CPU alias properties
 * Fix CPUID levels so that W10 insider can run as guest OS

Applied, thanks.

-- PMM



Re: [Qemu-devel] RFC: guest-side retrieval of fw_cfg file

2015-07-16 Thread Igor Mammedov
On Thu, 16 Jul 2015 10:30:41 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 On Thu, Jul 16, 2015 at 08:57:36AM +0200, Igor Mammedov wrote:
  On Wed, 15 Jul 2015 19:24:33 +0300
  Michael S. Tsirkin m...@redhat.com wrote:
  
   On Wed, Jul 15, 2015 at 06:01:42PM +0200, Igor Mammedov wrote:
On Wed, 15 Jul 2015 17:08:27 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 On Mon, Jul 13, 2015 at 04:09:37PM -0400, Gabriel L. Somlo wrote:
  Hi,
  
  A while ago I was pondering on the options available for retrieving
  a fw_cfg blob from the guest-side (now that we can insert fw_cfg
  files on the host-side command line, see commit 81b2b8106).
  
  So over the last couple of weekends I cooked up the sysfs kernel
  module below, which lists all fw_cfg files
  under /sys/firmware/fw_cfg/filename.
 
 One concern here is that there will be a conflict here if fw cfg
 is used by ACPI. I don't know whether that last is a good idea
 though, so maybe not a real concern. I think Igor
 wanted to make it so.

I don't see any conflict here so far, it's just guest side module that
accesses fw_cfg.
   
   If there's ACPI code that accesses fw_cfg in response to an interrupt,
   it'll race with fw cfg access by guest OS. On linux we might be able to
   block ACPI preventing it from running. We probably won't be able to do
   it on windows.
  wrt vmgenid series we were talking about possibility to access fw_cfg
  from ACPI device.init so it's unlikely that it will ever collide with
  much later sysfs accesses.
 
 Don't we need to get updates when it changes too?
I've thought that it's pretty static and doesn't change.

 
  But if ACPI will start accessing fw_cfg from
  other methods it will race for sure.
 
 Maybe we shouldn't poke at fw cfg in ACPI then.
Yep, maybe we shouldn't.

The last vmgenid series just maps UUID region
at fixed location (which is sufficient to build ACPI
tables at machine done time) so there isn't real need
to export address via fw_cfg.






[Qemu-devel] [PATCH v2 1/3] tests: remove irrelevant assertions from test-aio

2015-07-16 Thread Paolo Bonzini
In these tests, the purpose of the initial calls to aio_poll and
g_main_context_iteration is simply to put the AioContext in a
known state; the return value of the function does not really
matter.  The next patch will change those return values; change
the assertions to a while loop which expresses the intention
better.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 tests/test-aio.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tests/test-aio.c b/tests/test-aio.c
index a7cb5c9..e7bbb83 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -331,7 +331,7 @@ static void test_wait_event_notifier(void)
 EventNotifierTestData data = { .n = 0, .active = 1 };
 event_notifier_init(data.e, false);
 aio_set_event_notifier(ctx, data.e, event_ready_cb);
-g_assert(!aio_poll(ctx, false));
+while (aio_poll(ctx, false));
 g_assert_cmpint(data.n, ==, 0);
 g_assert_cmpint(data.active, ==, 1);
 
@@ -356,7 +356,7 @@ static void test_flush_event_notifier(void)
 EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true };
 event_notifier_init(data.e, false);
 aio_set_event_notifier(ctx, data.e, event_ready_cb);
-g_assert(!aio_poll(ctx, false));
+while (aio_poll(ctx, false));
 g_assert_cmpint(data.n, ==, 0);
 g_assert_cmpint(data.active, ==, 10);
 
@@ -669,7 +669,7 @@ static void test_source_wait_event_notifier(void)
 EventNotifierTestData data = { .n = 0, .active = 1 };
 event_notifier_init(data.e, false);
 aio_set_event_notifier(ctx, data.e, event_ready_cb);
-g_assert(g_main_context_iteration(NULL, false));
+while (g_main_context_iteration(NULL, false));
 g_assert_cmpint(data.n, ==, 0);
 g_assert_cmpint(data.active, ==, 1);
 
@@ -694,7 +694,7 @@ static void test_source_flush_event_notifier(void)
 EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true };
 event_notifier_init(data.e, false);
 aio_set_event_notifier(ctx, data.e, event_ready_cb);
-g_assert(g_main_context_iteration(NULL, false));
+while (g_main_context_iteration(NULL, false));
 g_assert_cmpint(data.n, ==, 0);
 g_assert_cmpint(data.active, ==, 10);
 
-- 
2.4.3





Re: [Qemu-devel] memory hotplug fails with mem-merge option

2015-07-16 Thread Bharata B Rao
On Thu, Jul 16, 2015 at 10:25:09AM +0530, Bharata B Rao wrote:
 Hi,
 
 When I use -machine pc,mem-merge=off|on -m 1G,slots=4,maxmem=2G, adding
 memory-backend-ram object fails like below. Same failure is seen with
 -machine pseries,mem-merge=on|off.
 
 (qemu) object_add memory-backend-ram,id=ram0,size=1G
 
 qemu-system-x86_64: util/qemu-option.c:388: qemu_opt_get_bool_helper: 
 Assertion `opt-desc  opt-desc-type == QEMU_OPT_BOOL' failed.

Same failure seen with -machine pseries,mem-merge=off when used together with
-object memory-backend-file on QEMU commandline in which case boot time
failure is observed.

Regards,
Bharata.




Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-16 Thread Kevin Wolf
Am 16.07.2015 um 11:56 hat Paolo Bonzini geschrieben:
 Apart from an additional assertion this is exactly the same code as v1,
 but split across three patches so that the important one focuses on the
 optimization.
 
 Paolo
 
 v1-v2
 Split some changes to the tests to a separate patch
 Fix commit message [Laszlo]
 Clarify do...while loop in aio-win32.c [Kevin]
 
 Paolo Bonzini (3):
   tests: remove irrelevant assertions from test-aio
   aio-win32: reorganize polling loop
   AioContext: fix broken ctx-dispatching optimization

Reviewed-by: Kevin Wolf kw...@redhat.com



Re: [Qemu-devel] [PATCH v9 0/4] remove icc bus/bridge

2015-07-16 Thread Igor Mammedov
On Thu, 16 Jul 2015 10:45:41 +0800
Zhu Guihua zhugh.f...@cn.fujitsu.com wrote:

 ping...
I'll look at it once 2.4 is released.

 
 On 07/03/2015 05:38 PM, Zhu Guihua wrote:
  ICC Bus was used for providing a hotpluggable bus for APIC and CPU,
  but now we use HotplugHandler to make hotplug. So ICC Bus is
  unnecessary.
 
  This code has passed the new pc-cpu-test.
  And I have tested with kvm along with kernel_irqchip=on/off,
  it works fine.
 
  This patch series is based on the latest master.
 
  v9:
-use a callback to correct reset sequence for x86
-update apic mmio mapping
 
  v8:
-add a wrapper to specify reset order
 
  v7:
-update to register reset handler for main_system_bus when created
-register reset handler for apic after all devices are initialized
 
  Chen Fan (2):
 apic: map APIC's MMIO region at each CPU's address space
 cpu/apic: drop icc bus/bridge
 
  Zhu Guihua (2):
 x86: use new method to correct reset sequence
 icc_bus: drop the unused files
 
default-configs/i386-softmmu.mak   |   1 -
default-configs/x86_64-softmmu.mak |   1 -
hw/cpu/Makefile.objs   |   1 -
hw/cpu/icc_bus.c   | 118 
  -
hw/i386/pc.c   |  43 +++---
hw/i386/pc_piix.c  |   9 +--
hw/i386/pc_q35.c   |   9 +--
hw/intc/apic_common.c  |  11 +---
include/hw/cpu/icc_bus.h   |  82 --
include/hw/i386/apic_internal.h|   7 ++-
include/hw/i386/pc.h   |   2 +-
target-i386/cpu.c  |  30 +++---
12 files changed, 52 insertions(+), 262 deletions(-)
delete mode 100644 hw/cpu/icc_bus.c
delete mode 100644 include/hw/cpu/icc_bus.h
 
 
 




Re: [Qemu-devel] write operations during read

2015-07-16 Thread Kevin Wolf
Hi Gleb,

Am 15.07.2015 um 13:04 hat Gleb Stepanov geschrieben:
We are testing disk I/O perfomance using fio tool. I've measured
 performance on qcow2 image mounted to nb0 device. I run tests and
 launch iostat -x 1 command to browse I/O operations. Test consists of
 sequential read operations with 1 MB size. I've get following output.
 
 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s
 sda   0.00 5.000.00  677.00 0.00 22812.00
 sdb   0.00 0.000.00  545.00 0.00 0.00
 dm-0  0.00 0.000.00  674.00 0.00 22784.00
 dm-1  0.00 0.000.007.00 0.0028.00
 dm-2  0.00 0.000.000.00 0.00 0.00
 nb0   0.00 0.00  432.00  182.00 55296.00 23296.00
 
 So, reading produce a lot of write operations. What is the reason why
 these write operations occured?

Unfortunately you didn't include your qemu command line and didn't
describe in detail what your test setup looked like, so I'll have to
take a guess.

Read requests never cause writes. However, you wrote 55 MB to the image
before running the test. Did you make sure that this data was already
flushed to the disk before starting your benchmark? If it was only
sitting in the kernel page cache, the kernel can decide to write it out
at any point.

 How to determina block size that qemu operates , because if i've
 writtien 55 mb with 1 mb blocks
 i should get 55 operations instead of 432.

If you didn't use O_DIRECT in the benchmark, chances are that the guest
kernel issues requests smaller than 1 MB to qemu.

Be aware that you might get a few metadata reads in the guest
filesystem, the qcow2 and the host filesystem layer, so slightly more
than 55 requests is expected. Each layer may also have to split requests
if the data is stored fragmented.

Also, did you make sure that no processes other than fio access the
disk?

Kevin



Re: [Qemu-devel] [PATCH v4 0/7] Fix QEMU crash during memory hotplug with vhost=on

2015-07-16 Thread Michael S. Tsirkin
On Thu, Jul 16, 2015 at 11:42:36AM +0200, Igor Mammedov wrote:
 On Thu, 16 Jul 2015 10:35:33 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Thu, Jul 16, 2015 at 09:26:21AM +0200, Igor Mammedov wrote:
   On Wed, 15 Jul 2015 19:32:31 +0300
   Michael S. Tsirkin m...@redhat.com wrote:
   
On Wed, Jul 15, 2015 at 05:12:01PM +0200, Igor Mammedov wrote:
 On Thu,  9 Jul 2015 13:47:17 +0200
 Igor Mammedov imamm...@redhat.com wrote:
 
 there also is yet another issue with vhost-user. It also has
 very low limit on amount of memory regions (if I recall correctly 8)
 and it's possible to trigger even without memory hotplug.
 one just need to start QEMU with a several -numa memdev= options
 to create a necessary amount of memory regions to trigger it.
 
 lowrisk option to fix it would be increasing limit in vhost-user
 backend.
 
 another option is disabling vhost and fall-back to virtio,
 but I don't know much about vhost if it's possible to 
 to switch it off without loosing packets guest was sending
 at the moment and if it will work at all with vhost.

With vhost-user you can't fall back to virtio: it's
not an accelerator, it's the backend.

Updating the protocol to support a bigger table
is possible but old remotes won't be able to support it.

   it looks like increasing limit is the only option left.
   
   it's not ideal that old remotes /with hardcoded limit/
   might not be able to handle bigger table but at least
   new ones and ones that handle VhostUserMsg payload
   dynamically would be able to work without crashing.
  
  I think we need a way for hotplug to fail gracefully.  As long as we
  don't implement the hva trick, it's needed for old kernels with vhost in
  kernel, too.
 I don't see a reliable way to fail hotplug though.
 
 In case of hotplug failure path comes from memory listener
 which can't fail by design but it fails in vhost case, i.e.
 vhost side doesn't follow protocol.
 
 We already have considered idea of querying vhost, for limit
 from memory hotplug handler before mapping memory region
 but it has drawbacks:
  1. amount of memory ranges changes during guest lifecycle
as it initializes different devices.
which leads to a case when we can hotplug more pc-dimms
than cold-plug.
Which leads to inability to migrate guest with hotplugged
pc-dimms since target QEMU won't start with that amount
of dimms from source due to hitting limit.
  2. it's ugly hack to query random 'vhost' entity when plugging
dimm device from modeling pov, but we can live with it
if it helps QEMU not to crash.
 
 If it's acceptable to break/ignore #1 issue, I can post related
 QEMU patches that I have, at least qemu won't crash with old
 vhost backends.

Old kvm has lower limits on 3 of slots as well. How is this handled?

-- 
MST



Re: [Qemu-devel] [PATCH] crypto/cipher-nettle.c: Pass correct function type to cbc_encrypt and cbc_decrypt.

2015-07-16 Thread Richard W.M. Jones
On Thu, Jul 16, 2015 at 11:04:15AM +0100, Peter Maydell wrote:
 Is this the same issue that Radim's patchset from a few days
 back is addressing?

Yes, looks like it, so ignore this patch.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW



Re: [Qemu-devel] [PATCH 3/3] PAM: make PAM emulation closer to documentation

2015-07-16 Thread Ефимов Василий

16.07.2015 12:05, Paolo Bonzini пишет:



On 16/07/2015 10:35, Efimov Vasily wrote:

This patch improves PAM emulation.

PAM defines 4 memory access redirection modes. In mode 1 reads are directed to
RAM and writes are directed to PCI. In mode 2 it is contrary. In mode 0 all
access is directed to PCI. In mode 3 it is directed to RAM. Modes 0 and 3 are
well emulated but modes 1 and 2 are not. The cause is: aliases are used
while more complicated logic is required.

The idea is to use ROM device like memory regions for mode 1 and 2 emulation
instead of aliases. Writes are directed to proper destination region by
specified I/O callback. Read redirection depends on type of source region.
In most cases source region is RAM (or ROM), so ram_addr of PAM region is set to
ram_addr of source region with offset. Otherwise, when source region is an I/O
region, reading is redirected to source region read callback by PAM region one.

Read source and write destination regions are updated by the memory
commit callback.

Note that we cannot use I/O region for PAM as it will violate trying to execute
code outside RAM or ROM assertion.

Signed-off-by: Efimov Vasily r...@ispras.ru
---
  hw/pci-host/pam.c | 238 +-
  include/hw/pci-host/pam.h |  10 +-
  2 files changed, 223 insertions(+), 25 deletions(-)

diff --git a/hw/pci-host/pam.c b/hw/pci-host/pam.c
index 17d826c..9729b2b 100644
--- a/hw/pci-host/pam.c
+++ b/hw/pci-host/pam.c
@@ -27,43 +27,233 @@
   * THE SOFTWARE.
   */

-#include qom/object.h
-#include sysemu/sysemu.h
  #include hw/pci-host/pam.h
+#include exec/address-spaces.h
+#include exec/memory-internal.h
+#include qemu/bswap.h
+
+static void pam_write(void *opaque, hwaddr addr, uint64_t data,
+  unsigned size)
+{
+PAMMemoryRegion *pam = (PAMMemoryRegion *) opaque;
+void *ptr;
+
+/* Destination region can be both RAM and IO. */
+if (!memory_access_is_direct(pam-mr_write_to, true)) {
+memory_region_dispatch_write(pam-mr_write_to,
+ addr + pam-write_offset, data, size,
+ MEMTXATTRS_UNSPECIFIED);
+} else {
+ptr = memory_region_get_ram_ptr(pam-mr_write_to) + addr
+  + pam-write_offset;
+
+switch (size) {
+case 1:
+stb_p(ptr, data);
+break;
+case 2:
+stw_he_p(ptr, data);
+break;
+case 4:
+stl_he_p(ptr, data);
+break;
+case 8:
+stq_he_p(ptr, data);
+break;
+default:
+abort();
+}
+
+invalidate_and_set_dirty(pam-mr_write_to, addr + pam-pam_offset,
+ size);
+}
+}
+


The idea is very good, but the implementation relies on copying a lot of
code from exec.c.

If it is about pam_write then, for instance, I could suggest to outline
corresponding part of address_space_rw to dedicated public function.
The solution will require endianness conversion because of the
part works with uint8_t buffer but not with uint64_t values.

The rest of code looks up destination or source region or child region
offset in memory sub-tree which root is PCI or RAM region provided on
PAM creation. We cannon use common address_space_translate because it
searches from address space root and will return current PAM region.
To summarize, I suggest to move the code to exec.c. It is generic
enough.


Could you use an IOMMU memory region instead?  Then a single region can
be used to implement all four modes, and you don't hit the trying to
execute code outside RAM or RAM.

Did you mean MemoryRegion.iommu_ops ? The feature does not allow to
change destination memory region. Also I has no find its using during
write access from CPU. And there is:

exec.c: address_space_translate_for_iotlb:
assert(!section-mr-iommu_ops);



Paolo






Re: [Qemu-devel] iothread: release iothread around aio_poll causes random hangs at startup

2015-07-16 Thread Christian Borntraeger
Am 10.06.2015 um 11:34 schrieb Fam Zheng:
 On Wed, 06/10 11:18, Christian Borntraeger wrote:
 Am 10.06.2015 um 04:12 schrieb Fam Zheng:
 On Tue, 06/09 11:01, Christian Borntraeger wrote:
 Am 09.06.2015 um 04:28 schrieb Fam Zheng:
 On Tue, 06/02 16:36, Christian Borntraeger wrote:
 Paolo,

 I bisected 
 commit a0710f7995f914e3044e5899bd8ff6c43c62f916
 Author: Paolo Bonzini pbonz...@redhat.com
 AuthorDate: Fri Feb 20 17:26:52 2015 +0100
 Commit: Kevin Wolf kw...@redhat.com
 CommitDate: Tue Apr 28 15:36:08 2015 +0200

 iothread: release iothread around aio_poll

 to cause a problem with hanging guests.

 Having many guests all with a kernel/ramdisk (via -kernel) and
 several null block devices will result in hangs. All hanging 
 guests are in partition detection code waiting for an I/O to return
 so very early maybe even the first I/O.

 Reverting that commit fixes the hangs.
 Any ideas?


For what its worth, I can no longer reproduce the issue on
current master + cherry-pick of a0710f7995f (iothread: release iothread around 
aio_poll)

bisect tells me that

commit 53ec73e264f481b79b52efcadc9ceb8f8996975c
Author: Fam Zheng f...@redhat.com
AuthorDate: Fri May 29 18:53:14 2015 +0800
Commit: Stefan Hajnoczi stefa...@redhat.com
CommitDate: Tue Jul 7 14:27:14 2015 +0100

block: Use bdrv_drain to replace uncessary bdrv_drain_all

made the problem will blk-null go away. I still dont understand why.

Christian




Re: [Qemu-devel] [PATCH v4 0/7] Fix QEMU crash during memory hotplug with vhost=on

2015-07-16 Thread Igor Mammedov
On Thu, 16 Jul 2015 10:35:33 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 On Thu, Jul 16, 2015 at 09:26:21AM +0200, Igor Mammedov wrote:
  On Wed, 15 Jul 2015 19:32:31 +0300
  Michael S. Tsirkin m...@redhat.com wrote:
  
   On Wed, Jul 15, 2015 at 05:12:01PM +0200, Igor Mammedov wrote:
On Thu,  9 Jul 2015 13:47:17 +0200
Igor Mammedov imamm...@redhat.com wrote:

there also is yet another issue with vhost-user. It also has
very low limit on amount of memory regions (if I recall correctly 8)
and it's possible to trigger even without memory hotplug.
one just need to start QEMU with a several -numa memdev= options
to create a necessary amount of memory regions to trigger it.

lowrisk option to fix it would be increasing limit in vhost-user
backend.

another option is disabling vhost and fall-back to virtio,
but I don't know much about vhost if it's possible to 
to switch it off without loosing packets guest was sending
at the moment and if it will work at all with vhost.
   
   With vhost-user you can't fall back to virtio: it's
   not an accelerator, it's the backend.
   
   Updating the protocol to support a bigger table
   is possible but old remotes won't be able to support it.
   
  it looks like increasing limit is the only option left.
  
  it's not ideal that old remotes /with hardcoded limit/
  might not be able to handle bigger table but at least
  new ones and ones that handle VhostUserMsg payload
  dynamically would be able to work without crashing.
 
 I think we need a way for hotplug to fail gracefully.  As long as we
 don't implement the hva trick, it's needed for old kernels with vhost in
 kernel, too.
I don't see a reliable way to fail hotplug though.

In case of hotplug failure path comes from memory listener
which can't fail by design but it fails in vhost case, i.e.
vhost side doesn't follow protocol.

We already have considered idea of querying vhost, for limit
from memory hotplug handler before mapping memory region
but it has drawbacks:
 1. amount of memory ranges changes during guest lifecycle
   as it initializes different devices.
   which leads to a case when we can hotplug more pc-dimms
   than cold-plug.
   Which leads to inability to migrate guest with hotplugged
   pc-dimms since target QEMU won't start with that amount
   of dimms from source due to hitting limit.
 2. it's ugly hack to query random 'vhost' entity when plugging
   dimm device from modeling pov, but we can live with it
   if it helps QEMU not to crash.

If it's acceptable to break/ignore #1 issue, I can post related
QEMU patches that I have, at least qemu won't crash with old
vhost backends.



[Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-16 Thread Paolo Bonzini
Apart from an additional assertion this is exactly the same code as v1,
but split across three patches so that the important one focuses on the
optimization.

Paolo

v1-v2
Split some changes to the tests to a separate patch
Fix commit message [Laszlo]
Clarify do...while loop in aio-win32.c [Kevin]

Paolo Bonzini (3):
  tests: remove irrelevant assertions from test-aio
  aio-win32: reorganize polling loop
  AioContext: fix broken ctx-dispatching optimization

 aio-posix.c | 19 ++--
 aio-win32.c | 41 ++
 async.c | 21 +-
 docs/aio_notify.promela | 77 +
 include/block/aio.h | 29 +++
 tests/test-aio.c| 26 +++--
 6 files changed, 98 insertions(+), 115 deletions(-)

-- 
2.4.3

diff from v1:
diff --git a/aio-win32.c b/aio-win32.c
index ae7c6cf..9d6c12f 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -312,7 +312,13 @@ bool aio_poll(AioContext *ctx, bool blocking)
 ctx-walking_handlers--;
 first = true;
 
-/* wait until next event */
+/* ctx-notifier is always registered.  */
+assert(count  0);
+
+/* Multiple iterations, all of them non-blocking except the first,
+ * may be necessary to process all pending events.  After the first
+ * WaitForMultipleObjects call ctx-notify_me will be decremented.
+ */
 do {
 HANDLE event;
 int ret;





[Qemu-devel] [PATCH v2 3/3] AioContext: fix broken ctx-dispatching optimization

2015-07-16 Thread Paolo Bonzini
This patch rewrites the ctx-dispatching optimization, which was the cause
of some mysterious hangs that could be reproduced on aarch64 KVM only.
The hangs were indirectly caused by aio_poll() and in particular by
flash memory updates's call to blk_write(), which invokes aio_poll().
Fun stuff: they had an extremely short race window, so much that
adding all kind of tracing to either the kernel or QEMU made it
go away (a single printf made it half as reproducible).

On the plus side, the failure mode (a hang until the next keypress)
made it very easy to examine the state of the process with a debugger.
And there was a very nice reproducer from Laszlo, which failed pretty
often (more than half of the time) on any version of QEMU with a non-debug
kernel; it also failed fast, while still in the firmware.  So, it could
have been worse.

For some unknown reason they happened only with virtio-scsi, but
that's not important.  It's more interesting that they disappeared with
io=native, making thread-pool.c a likely suspect for where the bug arose.
thread-pool.c is also one of the few places which use bottom halves
across threads, by the way.

I hope that no other similar bugs exist, but just in case :) I am
going to describe how the successful debugging went...  Since the
likely culprit was the ctx-dispatching optimization, which mostly
affects bottom halves, the first observation was that there are two
qemu_bh_schedule() invocations in the thread pool: the one in the aio
worker and the one in thread_pool_completion_bh.  The latter always
causes the optimization to trigger, the former may or may not.  In
order to restrict the possibilities, I introduced new functions
qemu_bh_schedule_slow() and qemu_bh_schedule_fast():

 /* qemu_bh_schedule_slow: */
 ctx = bh-ctx;
 bh-idle = 0;
 if (atomic_xchg(bh-scheduled, 1) == 0) {
 event_notifier_set(ctx-notifier);
 }

 /* qemu_bh_schedule_fast: */
 ctx = bh-ctx;
 bh-idle = 0;
 assert(ctx-dispatching);
 atomic_xchg(bh-scheduled, 1);

Notice how the atomic_xchg is still in qemu_bh_schedule_slow().  This
was already debated a few months ago, so I assumed it to be correct.
In retrospect this was a very good idea, as you'll see later.

Changing thread_pool_completion_bh() to qemu_bh_schedule_fast() didn't
trigger the assertion (as expected).  Changing the worker's invocation
to qemu_bh_schedule_slow() didn't hide the bug (another assumption
which luckily held).  This already limited heavily the amount of
interaction between the threads, hinting that the problematic events
must have triggered around thread_pool_completion_bh().

As mentioned early, invoking a debugger to examine the state of a
hung process was pretty easy; the iothread was always waiting on a
poll(..., -1) system call.  Infinite timeouts are much rarer on x86,
and this could be the reason why the bug was never observed there.
With the buggy sequence more or less resolved to an interaction between
thread_pool_completion_bh() and poll(..., -1), my tracing strategy was
to just add a few qemu_clock_get_ns(QEMU_CLOCK_REALTIME) calls, hoping
that the ordering of aio_ctx_prepare(), aio_ctx_dispatch, poll() and
qemu_bh_schedule_fast() would provide some hint.  The output was:

(gdb) p last_prepare
$3 = 103885451
(gdb) p last_dispatch
$4 = 103876492
(gdb) p last_poll
$5 = 115909333
(gdb) p last_schedule
$6 = 115925212

Notice how the last call to qemu_poll_ns() came after aio_ctx_dispatch().
This makes little sense unless there is an aio_poll() call involved,
and indeed with a slightly different instrumentation you can see that
there is one:

(gdb) p last_prepare
$3 = 107569679
(gdb) p last_dispatch
$4 = 107561600
(gdb) p last_aio_poll
$5 = 110671400
(gdb) p last_schedule
$6 = 110698917

So the scenario becomes clearer:

   iothread   VCPU thread
--
   aio_ctx_prepare
   aio_ctx_check
   qemu_poll_ns(timeout=-1)
  aio_poll
aio_dispatch
  thread_pool_completion_bh
qemu_bh_schedule()

At this point bh-scheduled = 1 and the iothread has not been woken up.
The solution must be close, but this alone should not be a problem,
because the bottom half is only rescheduled to account for rare situations
(see commit 3c80ca1, thread-pool: avoid deadlock in nested aio_poll()
calls, 2014-07-15).

Introducing a third thread---a thread pool worker thread, which
also does qemu_bh_schedule()---does bring out the problematic case.
The third thread must be awakened *after* the callback is complete and
thread_pool_completion_bh has redone the whole loop, explaining the
short race window.  And then this is what happens:

  thread pool worker

[Qemu-devel] [PATCH v2 2/3] aio-win32: reorganize polling loop

2015-07-16 Thread Paolo Bonzini
Preparatory bugfixes and tweaks to the loop before the next patch:

- disable dispatch optimization during aio_prepare.  This fixes a bug.

- do not modify blocking until after the first WaitForMultipleObjects
call.  This is needed in the next patch.

- change the loop to do...while.  This makes it obvious that the loop
is always entered at least once.  In the next patch this is important
because the first iteration undoes the ctx-notify_me increment that
happened before entering the loop.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 aio-win32.c | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/aio-win32.c b/aio-win32.c
index 233d8f5..9268b5c 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -284,11 +284,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
 int timeout;
 
 aio_context_acquire(ctx);
-have_select_revents = aio_prepare(ctx);
-if (have_select_revents) {
-blocking = false;
-}
-
 was_dispatching = ctx-dispatching;
 progress = false;
 
@@ -304,6 +299,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
  */
 aio_set_dispatching(ctx, !blocking);
 
+have_select_revents = aio_prepare(ctx);
+
 ctx-walking_handlers++;
 
 /* fill fd sets */
@@ -317,12 +314,18 @@ bool aio_poll(AioContext *ctx, bool blocking)
 ctx-walking_handlers--;
 first = true;
 
-/* wait until next event */
-while (count  0) {
+/* ctx-notifier is always registered.  */
+assert(count  0);
+
+/* Multiple iterations, all of them non-blocking except the first,
+ * may be necessary to process all pending events.  After the first
+ * WaitForMultipleObjects call ctx-notify_me will be decremented.
+ */
+do {
 HANDLE event;
 int ret;
 
-timeout = blocking
+timeout = blocking  !have_select_revents
 ? qemu_timeout_ns_to_ms(aio_compute_timeout(ctx)) : 0;
 if (timeout) {
 aio_context_release(ctx);
@@ -351,7 +354,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
 blocking = false;
 
 progress |= aio_dispatch_handlers(ctx, event);
-}
+} while (count  0);
 
 progress |= timerlistgroup_run_timers(ctx-tlg);
 
-- 
2.4.3





Re: [Qemu-devel] [PATCH] crypto/cipher-nettle.c: Pass correct function type to cbc_encrypt and cbc_decrypt.

2015-07-16 Thread Peter Maydell
On 16 July 2015 at 11:08, Paolo Bonzini pbonz...@redhat.com wrote:


 On 16/07/2015 12:04, Peter Maydell wrote:
 Is this the same issue that Radim's patchset from a few days
 back is addressing?

 Yes.  Are you going to apply it, or should I include it in the pull
 request I'll send out today?

If you put it in your pull request that would be easier for me.

thanks
-- PMM



Re: [Qemu-devel] RFC: guest-side retrieval of fw_cfg file

2015-07-16 Thread Michael S. Tsirkin
On Thu, Jul 16, 2015 at 11:50:38AM +0200, Igor Mammedov wrote:
 On Thu, 16 Jul 2015 10:30:41 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Thu, Jul 16, 2015 at 08:57:36AM +0200, Igor Mammedov wrote:
   On Wed, 15 Jul 2015 19:24:33 +0300
   Michael S. Tsirkin m...@redhat.com wrote:
   
On Wed, Jul 15, 2015 at 06:01:42PM +0200, Igor Mammedov wrote:
 On Wed, 15 Jul 2015 17:08:27 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Mon, Jul 13, 2015 at 04:09:37PM -0400, Gabriel L. Somlo wrote:
   Hi,
   
   A while ago I was pondering on the options available for 
   retrieving
   a fw_cfg blob from the guest-side (now that we can insert fw_cfg
   files on the host-side command line, see commit 81b2b8106).
   
   So over the last couple of weekends I cooked up the sysfs kernel
   module below, which lists all fw_cfg files
   under /sys/firmware/fw_cfg/filename.
  
  One concern here is that there will be a conflict here if fw cfg
  is used by ACPI. I don't know whether that last is a good idea
  though, so maybe not a real concern. I think Igor
  wanted to make it so.
 
 I don't see any conflict here so far, it's just guest side module that
 accesses fw_cfg.

If there's ACPI code that accesses fw_cfg in response to an interrupt,
it'll race with fw cfg access by guest OS. On linux we might be able to
block ACPI preventing it from running. We probably won't be able to do
it on windows.
   wrt vmgenid series we were talking about possibility to access fw_cfg
   from ACPI device.init so it's unlikely that it will ever collide with
   much later sysfs accesses.
  
  Don't we need to get updates when it changes too?
 I've thought that it's pretty static and doesn't change.

vm gen id? No, its dynamic.

  
   But if ACPI will start accessing fw_cfg from
   other methods it will race for sure.
  
  Maybe we shouldn't poke at fw cfg in ACPI then.
 Yep, maybe we shouldn't.
 
 The last vmgenid series just maps UUID region
 at fixed location (which is sufficient to build ACPI
 tables at machine done time) so there isn't real need
 to export address via fw_cfg.
 

Other versions just write into guest memory from QEMU,
that will work fine too.

-- 
MST



Re: [Qemu-devel] [PATCH 6/6] ARM: enable PMSAv7-style MPU on Cortex-M3/M4

2015-07-16 Thread Peter Maydell
On 15 July 2015 at 08:31, Alex Züpke alexander.zue...@hs-rm.de wrote:
 Am 14.07.2015 um 19:49 schrieb Peter Maydell:
 RASR is UNPREDICTABLE for non-word-size access, so we don't need
 this at all.

 It's from ARM recommended sample code:
 http://infocenter.arm.com/help/topic/com.arm.doc.dui0553a/BIHHHDDJ.html

Ah, thanks. It turns out that this is a difference between
implementations: earlier M profile cores (M3, M4) support
byte and halfword accesses here, but later ones don't, so
the architecture defines it as UNPREDICTABLE. Since we're
modelling an M3, it's probably safest to implement it.
You might find that extract32()/deposit32() make for a
cleaner way to write it, though.

-- PMM



Re: [Qemu-devel] [PATCH v4 0/7] Fix QEMU crash during memory hotplug with vhost=on

2015-07-16 Thread Igor Mammedov
On Thu, 16 Jul 2015 13:24:35 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 On Thu, Jul 16, 2015 at 11:42:36AM +0200, Igor Mammedov wrote:
  On Thu, 16 Jul 2015 10:35:33 +0300
  Michael S. Tsirkin m...@redhat.com wrote:
  
   On Thu, Jul 16, 2015 at 09:26:21AM +0200, Igor Mammedov wrote:
On Wed, 15 Jul 2015 19:32:31 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 On Wed, Jul 15, 2015 at 05:12:01PM +0200, Igor Mammedov wrote:
  On Thu,  9 Jul 2015 13:47:17 +0200
  Igor Mammedov imamm...@redhat.com wrote:
  
  there also is yet another issue with vhost-user. It also has
  very low limit on amount of memory regions (if I recall correctly 8)
  and it's possible to trigger even without memory hotplug.
  one just need to start QEMU with a several -numa memdev= options
  to create a necessary amount of memory regions to trigger it.
  
  lowrisk option to fix it would be increasing limit in vhost-user
  backend.
  
  another option is disabling vhost and fall-back to virtio,
  but I don't know much about vhost if it's possible to 
  to switch it off without loosing packets guest was sending
  at the moment and if it will work at all with vhost.
 
 With vhost-user you can't fall back to virtio: it's
 not an accelerator, it's the backend.
 
 Updating the protocol to support a bigger table
 is possible but old remotes won't be able to support it.
 
it looks like increasing limit is the only option left.

it's not ideal that old remotes /with hardcoded limit/
might not be able to handle bigger table but at least
new ones and ones that handle VhostUserMsg payload
dynamically would be able to work without crashing.
   
   I think we need a way for hotplug to fail gracefully.  As long as we
   don't implement the hva trick, it's needed for old kernels with vhost in
   kernel, too.
  I don't see a reliable way to fail hotplug though.
  
  In case of hotplug failure path comes from memory listener
  which can't fail by design but it fails in vhost case, i.e.
  vhost side doesn't follow protocol.
  
  We already have considered idea of querying vhost, for limit
  from memory hotplug handler before mapping memory region
  but it has drawbacks:
   1. amount of memory ranges changes during guest lifecycle
 as it initializes different devices.
 which leads to a case when we can hotplug more pc-dimms
 than cold-plug.
 Which leads to inability to migrate guest with hotplugged
 pc-dimms since target QEMU won't start with that amount
 of dimms from source due to hitting limit.
   2. it's ugly hack to query random 'vhost' entity when plugging
 dimm device from modeling pov, but we can live with it
 if it helps QEMU not to crash.
  
  If it's acceptable to break/ignore #1 issue, I can post related
  QEMU patches that I have, at least qemu won't crash with old
  vhost backends.
 
 Old kvm has lower limits on 3 of slots as well. How is this handled?
the same ugly/non perfect way,
pc_dimm_plug() -  pc_dimm_memory_plug() {
...
if (kvm_enabled()  !kvm_has_free_slot(machine)) {
error_setg(local_err, hypervisor has no free memory slots left);
...




Re: [Qemu-devel] [PATCH 3/3] PAM: make PAM emulation closer to documentation

2015-07-16 Thread Paolo Bonzini


On 16/07/2015 12:51, Ефимов Василий wrote:
 The rest of code looks up destination or source region or child region
 offset in memory sub-tree which root is PCI or RAM region provided on
 PAM creation. We cannon use common address_space_translate because it
 searches from address space root and will return current PAM region.
 To summarize, I suggest to move the code to exec.c. It is generic
 enough. 

All these mechanism are extremely low level.  They are encapsulated
within exec.c, and copying code to pam.c is not a good idea because you
already have all the AddressSpaces and RAM MemoryRegions you need.

 Could you use an IOMMU memory region instead?  Then a single region can
 be used to implement all four modes, and you don't hit the trying to
 execute code outside RAM or RAM.
 Did you mean MemoryRegion.iommu_ops ? The feature does not allow to
 change destination memory region.

It does.  You're right about this:

 exec.c: address_space_translate_for_iotlb:
 assert(!section-mr-iommu_ops); 

... but an IOMMU region is not needed, and I think you can do everything
without touching exec.c at all.

+/* Read from RAM and write to PCI */
+memory_region_init_io(pam-region[1], OBJECT(dev), pam_ops, pam,
+pam-r-ram-w-pci, size);

This can be done with memory_region_set_readonly on the RAM region.  You
need to set mr-ops in order to point to pam_ops; for a first proof of
concept you can just set the field directly.

Writes to the PCI memory space can use the PCI address space, with
address_space_st*.

+/* Read from PCI and write to RAM */
+memory_region_init_io(pam-region[2], OBJECT(dev), pam_ops, pam,
+pam-r-pci-w-ram, size);

Here you cannot run code from ROM, so it can be a pure MMIO region.
Reads can use address_space_ld*, while writes can use
memory_region_get_ram_ptr.

Paolo



Re: [Qemu-devel] [PATCH] AioContext: fix broken ctx-dispatching optimization

2015-07-16 Thread Kevin Wolf
Am 16.07.2015 um 00:59 hat Paolo Bonzini geschrieben:
 On 15/07/2015 22:14, Kevin Wolf wrote:
   index 233d8f5..ae7c6cf 100644
   --- a/aio-win32.c
   +++ b/aio-win32.c
   @@ -318,11 +313,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
first = true;

/* wait until next event */
   -while (count  0) {
   +do {
HANDLE event;
int ret;

   -timeout = blocking
   +timeout = blocking  !have_select_revents
? qemu_timeout_ns_to_ms(aio_compute_timeout(ctx)) : 0;
if (timeout) {
aio_context_release(ctx);
 
  Here I'm not sure why you switched to a do..while loop? It's not obvious
  to me how the change from aio_set_dispatching() to ctx-notify_me is
  related to this.
 
 I sort of expected a reviewer to ask me about this change.  I would have
 explained this in the commit message, but it was already long enough. :)

Rather than an extra section in the commit message, it should have been
a comment in the code at least because it's not just the change that is
confusing, but the resulting code is confusing as well.

The reason is that, at least to me, the do..while loop reads almost
like an assertion that count may indeed be 0.

 The count on entry is never zero, because the ctx-notifier is always
 registered.
 
 I changed the while to do...while so that it's obvious that
 ctx-notify_me is always decremented.
 
 In retrospect I could have added simply
 
 /* ctx-notifier is always registered.  */
 assert(count  0);
 
 above the do.

Yes, please do that, it should be much clearer.

  With this information, I understand that what has changed is that the
  return value of g_main_context_iteration() changes from true before this
  patch (had the aio_notify() from aio_set_fd_handler() pending) to false
  after the patch (aio_notify() doesn't inject an event any more).
  
  This should mean that like above we can assert that the first iteration
  returns false, i.e. reverse the assertion (and indeed, with this
  change the test still passes for me).
 
 I was a bit undecided about this.  In the end I decided that the calls
 to aio_poll/g_main_context_iteration were just to put the AioContext in
 a known state, and the assertions on the return value of g_assert were
 not really important.  For this reason, the while loop seemed to express
 the intentions best, and I made it consistent between the AioContext and
 GSource cases.

You changed the AioContext case in this same patch, even if you didn't
quote my comment on that hunk. :-)

Both cases were asserting the return value before.

Kevin



Re: [Qemu-devel] [PATCH] AioContext: fix broken ctx-dispatching optimization

2015-07-16 Thread Paolo Bonzini


On 16/07/2015 07:29, Fam Zheng wrote:
 This patch rewrites the ctx-dispatching optimization 
 
 I lost track, but what was the justification of this optimization?

qemu_bh_schedule's call to aio_notify were showing in the profile as 20%
or so.

Paolo

 Anyway, awesome debugging and explanations!



Re: [Qemu-devel] [PATCH] AioContext: fix broken ctx-dispatching optimization

2015-07-16 Thread Stefan Hajnoczi
On Thu, Jul 16, 2015 at 01:29:20PM +0800, Fam Zheng wrote:
 On Wed, 07/15 19:13, Paolo Bonzini wrote:
  This patch rewrites the ctx-dispatching optimization 
 
 I lost track, but what was the justification of this optimization?

To avoid signalling the event notifier when it's not necessary.


pgpOBgCfD5TyW.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH] AioContext: fix broken ctx-dispatching optimization

2015-07-16 Thread Stefan Hajnoczi
On Wed, Jul 15, 2015 at 07:13:23PM +0200, Paolo Bonzini wrote:
 This patch rewrites the ctx-dispatching optimization, which was the cause
 of some mysterious hangs that could be reproduced on aarch64 KVM only.
 The hangs were indirectly caused by aio_poll() and in particular by
 flash memory updates's call to blk_write(), which invokes aio_poll().
 Fun stuff: they had an extremely short race window, so much that
 adding all kind of tracing to either the kernel or QEMU made it
 go away (a single printf made it half as reproducible).
 
 On the plus side, the failure mode (a hang until the next keypress)
 made it very easy to examine the state of the process with a debugger.
 And there was a very nice reproducer from Laszlo, which failed pretty
 often (more than half of the time) on any version of QEMU with a non-debug
 kernel; it also failed fast, while still in the firmware.  So, it could
 have been worse.
 
 For some unknown reason they happened only with virtio-scsi, but
 that's not important.  It's more interesting that they disappeared with
 io=native, making thread-pool.c a likely suspect for where the bug arose.
 thread-pool.c is also one of the few places which use bottom halves
 across threads, by the way.
 
 I hope that no other similar bugs exist, but just in case :) I am
 going to describe how the successful debugging went...  Since the
 likely culprit was the ctx-dispatching optimization, which mostly
 affects bottom halves, the first observation was that there are two
 qemu_bh_schedule() invocations in the thread pool: the one in the aio
 worker and the one in thread_pool_completion_bh.  The latter always
 causes the optimization to trigger, the former may or may not.  In
 order to restrict the possibilities, I introduced new functions
 qemu_bh_schedule_slow() and qemu_bh_schedule_fast():
 
  /* qemu_bh_schedule_slow: */
  ctx = bh-ctx;
  bh-idle = 0;
  if (atomic_xchg(bh-scheduled, 1) == 0) {
  event_notifier_set(ctx-notifier);
  }
 
  /* qemu_bh_schedule_fast: */
  ctx = bh-ctx;
  bh-idle = 0;
  assert(ctx-dispatching);
  atomic_xchg(bh-scheduled, 1);
 
 Notice how the atomic_xchg is still in qemu_bh_schedule_slow().  This
 was already debated a few months ago, so I assumed it to be correct.
 In retrospect this was a very good idea, as you'll see later.
 
 Changing thread_pool_completion_bh() to qemu_bh_schedule_fast() didn't
 trigger the assertion (as expected).  Changing the worker's invocation
 to qemu_bh_schedule_slow() didn't hide the bug (another assumption
 which luckily held).  This already limited heavily the amount of
 interaction between the threads, hinting that the problematic events
 must have triggered around thread_pool_completion_bh().
 
 As mentioned early, invoking a debugger to examine the state of a
 hung process was pretty easy; the iothread was always waiting on a
 poll(..., -1) system call.  Infinite timeouts are much rarer on x86,
 and this could be the reason why the bug was never observed there.
 With the buggy sequence more or less resolved to an interaction between
 thread_pool_completion_bh() and poll(..., -1), my tracing strategy was
 to just add a few qemu_clock_get_ns(QEMU_CLOCK_REALTIME) calls, hoping
 that the ordering of aio_ctx_prepare(), aio_ctx_dispatch, poll() and
 qemu_bh_schedule_fast() would provide some hint.  The output was:
 
 (gdb) p last_prepare
 $3 = 103885451
 (gdb) p last_dispatch
 $4 = 103876492
 (gdb) p last_poll
 $5 = 115909333
 (gdb) p last_schedule
 $6 = 115925212
 
 Notice how the last call to qemu_poll_ns() came after aio_ctx_dispatch().
 This makes little sense unless there is an aio_poll() call involved,
 and indeed with a slightly different instrumentation you can see that
 there is one:
 
 (gdb) p last_prepare
 $3 = 107569679
 (gdb) p last_dispatch
 $4 = 107561600
 (gdb) p last_aio_poll
 $5 = 110671400
 (gdb) p last_schedule
 $6 = 110698917
 
 So the scenario becomes clearer:
 
iothread   VCPU thread
 --
aio_ctx_prepare
aio_ctx_check
qemu_poll_ns(timeout=-1)
   aio_poll
 aio_dispatch
   thread_pool_completion_bh
 qemu_bh_schedule()
 
 At this point bh-scheduled = 1 and the iothread has not been woken up.
 The solution must be close, but this alone should not be a problem,
 because the bottom half is only rescheduled to account for rare situations
 (see commit 3c80ca1, thread-pool: avoid deadlock in nested aio_poll()
 calls, 2014-07-15).
 
 Introducing a third thread---a thread pool worker thread, which
 also does qemu_bh_schedule()---does bring out the problematic case.
 The third thread must be awakened *after* the callback is complete and
 thread_pool_completion_bh has redone the whole loop, 

Re: [Qemu-devel] Patches

2015-07-16 Thread Stefan Hajnoczi
On Wed, Jul 15, 2015 at 12:26:44PM +0100, Peter Maydell wrote:
 On 15 July 2015 at 09:52, Stefan Hajnoczi stefa...@redhat.com wrote:
  On Tue, Jul 14, 2015 at 10:58:41AM +0100, Peter Maydell wrote:
  On 14 July 2015 at 09:12, Peter Crosthwaite
  peter.crosthwa...@xilinx.com wrote:
   Hi Stefan, Peter,
  
   Is patches down? I don't see any patches for the last 3 days and there
   is definitely new things on list:
  
   $ ./patches fetch http://vmsplice.net/~patches/patches.json
   Fetched info on 689 patch series
   Fetching mboxes...
   $ ./patches list | more
   Message-id: 1436556159-3002-1-git-send-email...@weilnetz.de
   From: Stefan Weil s...@weilnetz.de
   Date: 2015-07-10
   Tags: for-2.4
  [1/1] tci: Fix regression with INDEX_op_qemu_st_i32, 
   INDEX_op_qemu_st_i64
 
  The other patches instance also seems to have got wedged
  (http://wiki.qemu.org/patches/patches.json): that
  2015-07-10 patch is the last one it knows about too.
 
  Strange, that is the same file as
  http://qemu-project.org/patches/patches.json.
 
 It unwedged itself later that same day.

I experienced something weird today as well.  patches fetch downloaded
an incomplete patches.json file :(.

I'll have to keep an eye on the server...

Stefan


pgpDqjotpK7wm.pgp
Description: PGP signature


[Qemu-devel] [PATCH] crypto/cipher-nettle.c: Pass correct function type to cbc_encrypt and cbc_decrypt.

2015-07-16 Thread Richard W.M. Jones
The prototypes of the {nettle_}cbc_encrypt and cbc_decrypt functions
are:

void
cbc_encrypt(const void *ctx, nettle_cipher_func *f,
size_t block_size, uint8_t *iv,
size_t length, uint8_t *dst,
const uint8_t *src);

void
cbc_decrypt(const void *ctx, nettle_cipher_func *f,
size_t block_size, uint8_t *iv,
size_t length, uint8_t *dst,
const uint8_t *src);

Since we passed nettle_crypt_func (instead of nettle_cipher_func) as
the second argument, this gave the errors below.

In file included from crypto/cipher.c:71:0:
./crypto/cipher-nettle.c: In function ‘qcrypto_cipher_encrypt’:
./crypto/cipher-nettle.c:154:39: error: passing argument 2 of 
‘nettle_cbc_encrypt’ from incompatible pointer type 
[-Werror=incompatible-pointer-types]
 cbc_encrypt(ctx-ctx_encrypt, ctx-alg_encrypt,
   ^
In file included from ./crypto/cipher-nettle.c:24:0,
 from crypto/cipher.c:71:
/usr/include/nettle/cbc.h:48:1: note: expected ‘void (*)(const void *, size_t,  
uint8_t *, const uint8_t *) {aka void (*)(const void *, long unsigned int,  
unsigned char *, const unsigned char *)}’ but argument is of type ‘void 
(*)(void *, size_t,  uint8_t *, const uint8_t *) {aka void (*)(void *, long 
unsigned int,  unsigned char *, const unsigned char *)}’
 cbc_encrypt(const void *ctx, nettle_cipher_func *f,
 ^
In file included from crypto/cipher.c:71:0:
./crypto/cipher-nettle.c: In function ‘qcrypto_cipher_decrypt’:
./crypto/cipher-nettle.c:183:21: error: passing argument 2 of 
‘nettle_cbc_decrypt’ from incompatible pointer type 
[-Werror=incompatible-pointer-types]
 ctx-alg_decrypt, ctx-niv, ctx-iv,
 ^
In file included from ./crypto/cipher-nettle.c:24:0,
 from crypto/cipher.c:71:
/usr/include/nettle/cbc.h:54:1: note: expected ‘void (*)(const void *, size_t,  
uint8_t *, const uint8_t *) {aka void (*)(const void *, long unsigned int,  
unsigned char *, const unsigned char *)}’ but argument is of type ‘void 
(*)(void *, size_t,  uint8_t *, const uint8_t *) {aka void (*)(void *, long 
unsigned int,  unsigned char *, const unsigned char *)}’
 cbc_decrypt(const void *ctx, nettle_cipher_func *f,

Signed-off-by: Richard W.M. Jones rjo...@redhat.com
---
 crypto/cipher-nettle.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/crypto/cipher-nettle.c b/crypto/cipher-nettle.c
index e5a14bc..f06ea92 100644
--- a/crypto/cipher-nettle.c
+++ b/crypto/cipher-nettle.c
@@ -27,8 +27,8 @@ typedef struct QCryptoCipherNettle QCryptoCipherNettle;
 struct QCryptoCipherNettle {
 void *ctx_encrypt;
 void *ctx_decrypt;
-nettle_crypt_func *alg_encrypt;
-nettle_crypt_func *alg_decrypt;
+nettle_cipher_func *alg_encrypt;
+nettle_cipher_func *alg_decrypt;
 uint8_t *iv;
 size_t niv;
 };
@@ -83,8 +83,8 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
 des_set_key(ctx-ctx_encrypt, rfbkey);
 g_free(rfbkey);
 
-ctx-alg_encrypt = (nettle_crypt_func *)des_encrypt;
-ctx-alg_decrypt = (nettle_crypt_func *)des_decrypt;
+ctx-alg_encrypt = (nettle_cipher_func *)des_encrypt;
+ctx-alg_decrypt = (nettle_cipher_func *)des_decrypt;
 
 ctx-niv = DES_BLOCK_SIZE;
 break;
@@ -98,8 +98,8 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
 aes_set_encrypt_key(ctx-ctx_encrypt, nkey, key);
 aes_set_decrypt_key(ctx-ctx_decrypt, nkey, key);
 
-ctx-alg_encrypt = (nettle_crypt_func *)aes_encrypt;
-ctx-alg_decrypt = (nettle_crypt_func *)aes_decrypt;
+ctx-alg_encrypt = (nettle_cipher_func *)aes_encrypt;
+ctx-alg_decrypt = (nettle_cipher_func *)aes_decrypt;
 
 ctx-niv = AES_BLOCK_SIZE;
 break;
-- 
2.4.3




Re: [Qemu-devel] [PATCH] crypto/cipher-nettle.c: Pass correct function type to cbc_encrypt and cbc_decrypt.

2015-07-16 Thread Peter Maydell
On 16 July 2015 at 10:55, Richard W.M. Jones rjo...@redhat.com wrote:
 The prototypes of the {nettle_}cbc_encrypt and cbc_decrypt functions
 are:

 void
 cbc_encrypt(const void *ctx, nettle_cipher_func *f,
 size_t block_size, uint8_t *iv,
 size_t length, uint8_t *dst,
 const uint8_t *src);

 void
 cbc_decrypt(const void *ctx, nettle_cipher_func *f,
 size_t block_size, uint8_t *iv,
 size_t length, uint8_t *dst,
 const uint8_t *src);

 Since we passed nettle_crypt_func (instead of nettle_cipher_func) as
 the second argument, this gave the errors below.

 In file included from crypto/cipher.c:71:0:
 ./crypto/cipher-nettle.c: In function ‘qcrypto_cipher_encrypt’:
 ./crypto/cipher-nettle.c:154:39: error: passing argument 2 of 
 ‘nettle_cbc_encrypt’ from incompatible pointer type 
 [-Werror=incompatible-pointer-types]
  cbc_encrypt(ctx-ctx_encrypt, ctx-alg_encrypt,
^
 In file included from ./crypto/cipher-nettle.c:24:0,
  from crypto/cipher.c:71:
 /usr/include/nettle/cbc.h:48:1: note: expected ‘void (*)(const void *, 
 size_t,  uint8_t *, const uint8_t *) {aka void (*)(const void *, long 
 unsigned int,  unsigned char *, const unsigned char *)}’ but argument is of 
 type ‘void (*)(void *, size_t,  uint8_t *, const uint8_t *) {aka void 
 (*)(void *, long unsigned int,  unsigned char *, const unsigned char *)}’
  cbc_encrypt(const void *ctx, nettle_cipher_func *f,
  ^
 In file included from crypto/cipher.c:71:0:
 ./crypto/cipher-nettle.c: In function ‘qcrypto_cipher_decrypt’:
 ./crypto/cipher-nettle.c:183:21: error: passing argument 2 of 
 ‘nettle_cbc_decrypt’ from incompatible pointer type 
 [-Werror=incompatible-pointer-types]
  ctx-alg_decrypt, ctx-niv, ctx-iv,
  ^
 In file included from ./crypto/cipher-nettle.c:24:0,
  from crypto/cipher.c:71:
 /usr/include/nettle/cbc.h:54:1: note: expected ‘void (*)(const void *, 
 size_t,  uint8_t *, const uint8_t *) {aka void (*)(const void *, long 
 unsigned int,  unsigned char *, const unsigned char *)}’ but argument is of 
 type ‘void (*)(void *, size_t,  uint8_t *, const uint8_t *) {aka void 
 (*)(void *, long unsigned int,  unsigned char *, const unsigned char *)}’
  cbc_decrypt(const void *ctx, nettle_cipher_func *f,

Is this the same issue that Radim's patchset from a few days
back is addressing?

thanks
-- PMM



Re: [Qemu-devel] [PATCH] crypto/cipher-nettle.c: Pass correct function type to cbc_encrypt and cbc_decrypt.

2015-07-16 Thread Paolo Bonzini


On 16/07/2015 12:04, Peter Maydell wrote:
 Is this the same issue that Radim's patchset from a few days
 back is addressing?

Yes.  Are you going to apply it, or should I include it in the pull
request I'll send out today?

Paolo



Re: [Qemu-devel] [PATCH V2 1/2] tests: introduce basic pci test for virtio-net

2015-07-16 Thread Stefan Hajnoczi
On Mon, Jul 13, 2015 at 02:04:19PM +0800, Jason Wang wrote:
 Signed-off-by: Jason Wang jasow...@redhat.com
 ---
 Changes from V1:
 - replace the magic value 12 with a macro
 ---
  tests/Makefile  |   2 +-
  tests/virtio-net-test.c | 186 
 ++--
  2 files changed, 180 insertions(+), 8 deletions(-)

I haven't reviewed this series in detail but it's great that you are
doing this!

Stefan


pgpuyRglZJN3p.pgp
Description: PGP signature


Re: [Qemu-devel] [PULL 0/9] target-mips queue

2015-07-16 Thread Peter Maydell
On 16 July 2015 at 09:17, Leon Alrae leon.al...@imgtec.com wrote:
 Hi,

 This pull request contains MIPS bug fixes for 2.4-rc1.

 Thanks,
 Leon

 Cc: Peter Maydell peter.mayd...@linaro.org
 Cc: Aurelien Jarno aurel...@aurel32.net

 The following changes since commit 661725da09f47eb92d356fac10a4cf3b7ad1f61d:

   Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20150714' into 
 staging (2015-07-14 18:50:17 +0100)

 are available in the git repository at:

   git://github.com/lalrae/qemu.git tags/mips-20150716

 for you to fetch changes up to 908680c6441ac468f4871d513f42be396ea0d264:

   target-mips: fix page fault address for LWL/LWR/LDL/LDR (2015-07-15 
 14:07:25 +0100)

 
 MIPS patches 2015-07-16

 Changes:
 * bug fixes

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH 09/10 v12] target-tilegx: Generate tcg instructions to finish Hello world

2015-07-16 Thread gchen gchen
Hello Maintainers:

Please help review this patch when you have time.

Thanks.

On 06/13/2015 09:21 PM, Chen Gang wrote:
 Generate related tcg instructions, and qemu tilegx can finish running
 Hello world. The elf64 binary can be static or shared.

 Signed-off-by: Chen Gang gang.chen.5...@gmail.com
 ---
  target-tilegx/translate.c | 2966 
 +
  1 file changed, 2966 insertions(+)
  create mode 100644 target-tilegx/translate.c

 diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c
 new file mode 100644
 index 000..1dd3a43
 --- /dev/null
 +++ b/target-tilegx/translate.c
 @@ -0,0 +1,2966 @@
 +/*
 + * QEMU TILE-Gx CPU
 + *
 + *  Copyright (c) 2015 Chen Gang
 + *
 + * This library is free software; you can redistribute it and/or
 + * modify it under the terms of the GNU Lesser General Public
 + * License as published by the Free Software Foundation; either
 + * version 2.1 of the License, or (at your option) any later version.
 + *
 + * This library is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 + * Lesser General Public License for more details.
 + *
 + * You should have received a copy of the GNU Lesser General Public
 + * License along with this library; if not, see
 + * http://www.gnu.org/licenses/lgpl-2.1.html
 + */
 +
 +#include cpu.h
 +#include qemu/log.h
 +#include disas/disas.h
 +#include tcg-op.h
 +#include exec/cpu_ldst.h
 +#include opcode_tilegx.h
 +#include spr_def_64.h
 +
 +#define FMT64X %016 PRIx64
 +#define TILEGX_TMP_REGS(TILEGX_MAX_INSTRUCTIONS_PER_BUNDLE + 1)
 +
 +static TCGv_ptr cpu_env;
 +static TCGv cpu_pc;
 +static TCGv cpu_regs[TILEGX_R_COUNT];
 +static TCGv cpu_spregs[TILEGX_SPR_COUNT];
 +#if defined(CONFIG_USER_ONLY)
 +static TCGv_i32 cpu_excparam;
 +#endif
 +
 +static const char * const reg_names[] = {
 + r0,  r1,  r2,  r3,  r4,  r5,  r6,  r7,
 + r8,  r9, r10, r11, r12, r13, r14, r15,
 +r16, r17, r18, r19, r20, r21, r22, r23,
 +r24, r25, r26, r27, r28, r29, r30, r31,
 +r32, r33, r34, r35, r36, r37, r38, r39,
 +r40, r41, r42, r43, r44, r45, r46, r47,
 +r48, r49, r50, r51,  bp,  tp,  sp,  lr
 +};
 +
 +static const char * const spreg_names[] = {
 +cmpexch, criticalsec, simcontrol
 +};
 +
 +/* It is for temporary registers */
 +typedef struct DisasContextTemp {
 +uint8_t idx;   /* index */
 +TCGv val;  /* value */
 +} DisasContextTemp;
 +
 +/* This is the state at translation time.  */
 +typedef struct DisasContext {
 +uint64_t pc;   /* Current pc */
 +int exception; /* Current exception */
 +
 +TCGv zero; /* For zero register */
 +
 +DisasContextTemp *tmp_regcur;  /* Current temporary registers */
 +DisasContextTemp tmp_regs[TILEGX_TMP_REGS]; /* All temporary registers */
 +struct {
 +TCGCond cond;  /* Branch condition */
 +TCGv dest; /* pc jump destination, if will jump */
 +TCGv val1; /* Firt value for condition comparing */
 +TCGv val2; /* Second value for condition comparing */
 +} jmp; /* Jump object, only once in each TB 
 block */
 +} DisasContext;
 +
 +#include exec/gen-icount.h
 +
 +static void gen_exception(DisasContext *dc, int num)
 +{
 +TCGv_i32 tmp = tcg_const_i32(num);
 +
 +gen_helper_exception(cpu_env, tmp);
 +tcg_temp_free_i32(tmp);
 +}
 +
 +/*
 + * All exceptions which can still let working flow continue are all in pipe 
 x1,
 + * which is the last pipe of a bundle. So it is OK to only process the first
 + * exception within a bundle.
 + */
 +static void set_exception(DisasContext *dc, int num)
 +{
 +if (dc-exception == TILEGX_EXCP_NONE) {
 +dc-exception = num;
 +}
 +}
 +
 +static bool check_gr(DisasContext *dc, uint8_t reg)
 +{
 +if (likely(reg  TILEGX_R_COUNT)) {
 +return true;
 +}
 +
 +switch (reg) {
 +case TILEGX_R_SN:
 +case TILEGX_R_ZERO:
 +break;
 +case TILEGX_R_IDN0:
 +case TILEGX_R_IDN1:
 +set_exception(dc, TILEGX_EXCP_REG_IDN_ACCESS);
 +break;
 +case TILEGX_R_UDN0:
 +case TILEGX_R_UDN1:
 +case TILEGX_R_UDN2:
 +case TILEGX_R_UDN3:
 +set_exception(dc, TILEGX_EXCP_REG_UDN_ACCESS);
 +break;
 +default:
 +g_assert_not_reached();
 +}
 +return false;
 +}
 +
 +static TCGv load_zero(DisasContext *dc)
 +{
 +if (TCGV_IS_UNUSED_I64(dc-zero)) {
 +dc-zero = tcg_const_i64(0);
 +}
 +return dc-zero;
 +}
 +
 +static TCGv load_gr(DisasContext *dc, uint8_t reg)
 +{
 +if (check_gr(dc, reg)) {
 +return cpu_regs[reg];
 +}
 +return load_zero(dc);
 +}
 +
 +static TCGv dest_gr(DisasContext *dc, uint8_t rdst)
 

Re: [Qemu-devel] RFC: guest-side retrieval of fw_cfg file

2015-07-16 Thread Igor Mammedov
On Wed, 15 Jul 2015 19:24:33 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 On Wed, Jul 15, 2015 at 06:01:42PM +0200, Igor Mammedov wrote:
  On Wed, 15 Jul 2015 17:08:27 +0300
  Michael S. Tsirkin m...@redhat.com wrote:
  
   On Mon, Jul 13, 2015 at 04:09:37PM -0400, Gabriel L. Somlo wrote:
Hi,

A while ago I was pondering on the options available for retrieving
a fw_cfg blob from the guest-side (now that we can insert fw_cfg
files on the host-side command line, see commit 81b2b8106).

So over the last couple of weekends I cooked up the sysfs kernel
module below, which lists all fw_cfg files
under /sys/firmware/fw_cfg/filename.
   
   One concern here is that there will be a conflict here if fw cfg
   is used by ACPI. I don't know whether that last is a good idea
   though, so maybe not a real concern. I think Igor
   wanted to make it so.
  
  I don't see any conflict here so far, it's just guest side module that
  accesses fw_cfg.
 
 If there's ACPI code that accesses fw_cfg in response to an interrupt,
 it'll race with fw cfg access by guest OS. On linux we might be able to
 block ACPI preventing it from running. We probably won't be able to do
 it on windows.
wrt vmgenid series we were talking about possibility to access fw_cfg
from ACPI device.init so it's unlikely that it will ever collide with
much later sysfs accesses. But if ACPI will start accessing fw_cfg from
other methods it will race for sure.




[Qemu-devel] [PATCH 2/3] memory: make function memory_access_is_direct public

2015-07-16 Thread Efimov Vasily
Make function memory_access_is_direct public. It is required by PAM emulation.

Signed-off-by: Efimov Vasily r...@ispras.ru
---
 exec.c | 12 
 include/exec/memory-internal.h | 12 
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/exec.c b/exec.c
index 4e37ded..27064b8 100644
--- a/exec.c
+++ b/exec.c
@@ -372,18 +372,6 @@ address_space_translate_internal(AddressSpaceDispatch *d, 
hwaddr addr, hwaddr *x
 return section;
 }
 
-static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write)
-{
-if (memory_region_is_ram(mr)) {
-return !(is_write  mr-readonly);
-}
-if (memory_region_is_romd(mr)) {
-return !is_write;
-}
-
-return false;
-}
-
 /* Called from RCU critical section */
 MemoryRegion *address_space_translate(AddressSpace *as, hwaddr addr,
   hwaddr *xlat, hwaddr *plen,
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 801da82..89975b6 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -34,5 +34,17 @@ bool memory_region_access_valid(MemoryRegion *mr, hwaddr 
addr,
 void invalidate_and_set_dirty(MemoryRegion *mr, hwaddr addr,
  hwaddr length);
 
+static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write)
+{
+if (memory_region_is_ram(mr)) {
+return !(is_write  mr-readonly);
+}
+if (memory_region_is_romd(mr)) {
+return !is_write;
+}
+
+return false;
+}
+
 #endif
 #endif
-- 
1.9.1




[Qemu-devel] [PATCH 3/3] PAM: make PAM emulation closer to documentation

2015-07-16 Thread Efimov Vasily
This patch improves PAM emulation.

PAM defines 4 memory access redirection modes. In mode 1 reads are directed to
RAM and writes are directed to PCI. In mode 2 it is contrary. In mode 0 all
access is directed to PCI. In mode 3 it is directed to RAM. Modes 0 and 3 are
well emulated but modes 1 and 2 are not. The cause is: aliases are used
while more complicated logic is required.

The idea is to use ROM device like memory regions for mode 1 and 2 emulation
instead of aliases. Writes are directed to proper destination region by
specified I/O callback. Read redirection depends on type of source region.
In most cases source region is RAM (or ROM), so ram_addr of PAM region is set to
ram_addr of source region with offset. Otherwise, when source region is an I/O
region, reading is redirected to source region read callback by PAM region one.

Read source and write destination regions are updated by the memory
commit callback.

Note that we cannot use I/O region for PAM as it will violate trying to execute
code outside RAM or ROM assertion.

Signed-off-by: Efimov Vasily r...@ispras.ru
---
 hw/pci-host/pam.c | 238 +-
 include/hw/pci-host/pam.h |  10 +-
 2 files changed, 223 insertions(+), 25 deletions(-)

diff --git a/hw/pci-host/pam.c b/hw/pci-host/pam.c
index 17d826c..9729b2b 100644
--- a/hw/pci-host/pam.c
+++ b/hw/pci-host/pam.c
@@ -27,43 +27,233 @@
  * THE SOFTWARE.
  */
 
-#include qom/object.h
-#include sysemu/sysemu.h
 #include hw/pci-host/pam.h
+#include exec/address-spaces.h
+#include exec/memory-internal.h
+#include qemu/bswap.h
+
+static void pam_write(void *opaque, hwaddr addr, uint64_t data,
+  unsigned size)
+{
+PAMMemoryRegion *pam = (PAMMemoryRegion *) opaque;
+void *ptr;
+
+/* Destination region can be both RAM and IO. */
+if (!memory_access_is_direct(pam-mr_write_to, true)) {
+memory_region_dispatch_write(pam-mr_write_to,
+ addr + pam-write_offset, data, size,
+ MEMTXATTRS_UNSPECIFIED);
+} else {
+ptr = memory_region_get_ram_ptr(pam-mr_write_to) + addr
+  + pam-write_offset;
+
+switch (size) {
+case 1:
+stb_p(ptr, data);
+break;
+case 2:
+stw_he_p(ptr, data);
+break;
+case 4:
+stl_he_p(ptr, data);
+break;
+case 8:
+stq_he_p(ptr, data);
+break;
+default:
+abort();
+}
+
+invalidate_and_set_dirty(pam-mr_write_to, addr + pam-pam_offset,
+ size);
+}
+}
+
+static uint64_t pam_read(void *opaque, hwaddr addr, unsigned size)
+{
+PAMMemoryRegion *pam = (PAMMemoryRegion *) opaque;
+uint64_t ret = 0;
+
+/* Source region can be IO only. */
+memory_region_dispatch_read(pam-mr_read_from, pam-read_offset + addr,
+ret, size, MEMTXATTRS_UNSPECIFIED);
+
+return ret;
+}
+
+static MemoryRegionOps pam_ops = {
+.write = pam_write,
+.read = pam_read,
+.endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static void pam_set_current(PAMMemoryRegion *pam, unsigned new_current)
+{
+assert(new_current = 3);
+
+if (new_current != pam-current) {
+#ifdef DEBUG_PAM
+printf(PAM 0xTARGET_FMT_plx: %d = %s\n, pam-pam_offset,
+   new_current, pam-region[new_current].name);
+#endif
+memory_region_set_enabled(pam-region[pam-current], false);
+pam-current = new_current;
+memory_region_set_enabled(pam-region[new_current], true);
+}
+}
+
+static void pam_leaf_mr_lookup(MemoryRegion *mr, hwaddr offset,
+   MemoryRegion **leaf, hwaddr *offset_within_leaf)
+{
+MemoryRegion *other;
+if (mr-alias) {
+pam_leaf_mr_lookup(mr-alias, offset + mr-alias_offset, leaf,
+   offset_within_leaf);
+} else {
+if (QTAILQ_EMPTY(mr-subregions)
+ int128_lt(int128_make64(offset), mr-size)) {
+*leaf = mr;
+*offset_within_leaf = offset;
+} else {
+QTAILQ_FOREACH(other, mr-subregions, subregions_link) {
+if (!other-enabled) {
+continue;
+}
+if (other-addr = offset  int128_lt(int128_make64(offset),
+int128_add(int128_make64(other-addr), other-size))) {
+pam_leaf_mr_lookup(other, offset - other-addr, leaf,
+   offset_within_leaf);
+if (*leaf) {
+return;
+}
+}
+}
+*leaf = NULL;
+}
+}
+}
+
+static bool mr_has_mr(MemoryRegion *container, MemoryRegion *mr,
+  hwaddr *offset)
+{
+hwaddr tmp_offset;
+

[Qemu-devel] [PATCH 0/3] PAM emulation improvement

2015-07-16 Thread Efimov Vasily
This patch series improves PAM emulation.

PAM defines 4 memory access redirection modes. In mode 1 reads are directed to
RAM and writes are directed to PCI. In mode 2 it is contrary. In mode 0 all
access is directed to PCI. In mode 3 it is directed to RAM. Modes 0 and 3 are
well emulated but modes 1 and 2 are not. The cause is: aliases are used
while more complicated logic is required.

The idea is to use ROM device like memory regions for mode 1 and 2 emulation
instead of aliases. Writes are directed to proper destination region by
specified I/O callback. Read redirection depends on type of source region.
In most cases source region is RAM (or ROM), so ram_addr of PAM region is set to
ram_addr of source region with offset. Otherwise, when source region is an I/O
region, reading is redirected to source region read callback by PAM region one.

Read source and write destination regions are updated by the memory
commit callback.

Note that we cannot use I/O region for PAM as it will violate trying to execute
code outside RAM or ROM assertion.

Qemu distribution includes SeaBIOS which has hacks to work around incorrect
modes 1 and 2 emulation.

This patch series is tested using modified SeaBIOS. It is forced to use mode 2
for copying its data. BIOS reads a value from memory and immediately writes it
to same address. According to PAM definition, reads are directed to PCI (i.e. to
BIOS ROM) and writes are directed to RAM.

The patch for SeaBIOS is listed below.

Both SeaBIOS versions works with new PAM but the modified one does not work with
old PAM.

==
diff --git a/src/fw/shadow.c b/src/fw/shadow.c
index 4f6..5b0e527 100644
--- a/src/fw/shadow.c
+++ b/src/fw/shadow.c
@@ -26,32 +26,43 @@ static void
 __make_bios_writable_intel(u16 bdf, u32 pam0)
 {
 // Make ram from 0xc-0xf writable
-int clear = 0;
 int i;
+unsigned *mem, *mem_limit;
 for (i=0; i6; i++) {
 u32 pam = pam0 + 1 + i;
 int reg = pci_config_readb(bdf, pam);
-if (CONFIG_OPTIONROMS_DEPLOYED  (reg  0x11) != 0x11) {
-// Need to copy optionroms to work around qemu implementation
-void *mem = (void*)(BUILD_ROM_START + i * 32*1024);
-memcpy((void*)BUILD_BIOS_TMP_ADDR, mem, 32*1024);
+if ((reg  0x11) != 0x11) {
+mem = (unsigned *)(BUILD_ROM_START + i * 32 * 1024);
+pci_config_writeb(bdf, pam, 0x22);
+mem_limit = mem + 32 * 1024 / sizeof(unsigned);
+
+while (mem  mem_limit) {
+volatile unsigned tmp = *mem;
+*mem = tmp;
+mem++;
+}
 pci_config_writeb(bdf, pam, 0x33);
-memcpy(mem, (void*)BUILD_BIOS_TMP_ADDR, 32*1024);
-clear = 1;
 } else {
 pci_config_writeb(bdf, pam, 0x33);
 }
 }
-if (clear)
-memset((void*)BUILD_BIOS_TMP_ADDR, 0, 32*1024);
 
 // Make ram from 0xf-0x10 writable
 int reg = pci_config_readb(bdf, pam0);
-pci_config_writeb(bdf, pam0, 0x30);
 if (reg  0x10)
 // Ram already present.
 return;
 
+pci_config_writeb(bdf, pam0, 0x22);
+mem = (unsigned *)BUILD_BIOS_ADDR;
+mem_limit = mem + 32 * 1024 / sizeof(unsigned);
+while (mem  mem_limit) {
+volatile unsigned tmp = *mem;
+*mem = tmp;
+mem++;
+}
+pci_config_writeb(bdf, pam0, 0x33);
+
 // Copy bios.
 extern u8 code32flat_start[], code32flat_end[];
 memcpy(code32flat_start, code32flat_start + BIOS_SRC_OFFSET
@@ -61,17 +72,6 @@ __make_bios_writable_intel(u16 bdf, u32 pam0)
 static void
 make_bios_writable_intel(u16 bdf, u32 pam0)
 {
-int reg = pci_config_readb(bdf, pam0);
-if (!(reg  0x10)) {
-// QEMU doesn't fully implement the piix shadow capabilities -
-// if ram isn't backing the bios segment when shadowing is
-// disabled, the code itself wont be in memory.  So, run the
-// code from the high-memory flash location.
-u32 pos = (u32)__make_bios_writable_intel + BIOS_SRC_OFFSET;
-void (*func)(u16 bdf, u32 pam0) = (void*)pos;
-func(bdf, pam0);
-return;
-}
 // Ram already present - just enable writes
 __make_bios_writable_intel(bdf, pam0);
 }

Efimov Vasily (3):
  memory: make function invalidate_and_set_dirty public
  memory: make function memory_access_is_direct public
  PAM: make PAM emulation closer to documentation

 exec.c |  14 +--
 hw/pci-host/pam.c  | 238 -
 include/exec/memory-internal.h |  15 +++
 include/hw/pci-host/pam.h  |  10 +-
 4 files changed, 239 insertions(+), 38 deletions(-)

-- 
1.9.1




Re: [Qemu-devel] [PATCH V2] virtio-net: unbreak any layout

2015-07-16 Thread Michael S. Tsirkin
On Thu, Jul 16, 2015 at 02:49:48PM +0800, Jason Wang wrote:
 
 
 On 07/16/2015 02:42 PM, Michael S. Tsirkin wrote:
  On Wed, Jul 15, 2015 at 03:56:07PM +0800, Jason Wang wrote:
  Commit 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
  (virtio-net: byteswap virtio-net header) breaks any layout by
  requiring out_sg[0].iov_len = n-guest_hdr_len. Fixing this by
  copying header to temporary buffer if swap is needed, and then use
  this buffer as part of out_sg.
 
  Fixes 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
  (virtio-net: byteswap virtio-net header)
  Cc: qemu-sta...@nongnu.org
  Cc: c...@fr.ibm.com
  Signed-off-by: Jason Wang jasow...@redhat.com
  ---
  Changes from V1:
  - avoid header copying if there's no need to do header swap
  - don't write the header back
  ---
   hw/net/virtio-net.c   | 17 ++---
   include/hw/virtio/virtio-access.h |  9 +
   2 files changed, 23 insertions(+), 3 deletions(-)
 
  diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
  index e3c2db3..12322bd 100644
  --- a/hw/net/virtio-net.c
  +++ b/hw/net/virtio-net.c
  @@ -1142,7 +1142,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
   ssize_t ret, len;
   unsigned int out_num = elem.out_num;
   struct iovec *out_sg = elem.out_sg[0];
  -struct iovec sg[VIRTQUEUE_MAX_SIZE];
  +struct iovec sg[VIRTQUEUE_MAX_SIZE], sg2[VIRTQUEUE_MAX_SIZE];
  +struct virtio_net_hdr hdr;
   
   if (out_num  1) {
   error_report(virtio-net header not in first element);
  @@ -1150,11 +1151,21 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue 
  *q)
   }
   
   if (n-has_vnet_hdr) {
  -if (out_sg[0].iov_len  n-guest_hdr_len) {
  +if (iov_size(out_sg, out_num)  n-guest_hdr_len) {
   error_report(virtio-net header incorrect);
   exit(1);
   }
  this scans the iov unnecessarily. How about checking return
  code from iov_to_buf instead?
 
 Looks ok since hdr is very short anyway.

but iov_size scans the whole iov, not just the header.

 
  -virtio_net_hdr_swap(vdev, (void *) out_sg[0].iov_base);
  +if (virtio_needs_swap(vdev)) {
  +iov_to_buf(out_sg, out_num, 0, hdr, sizeof(hdr));
  +virtio_net_hdr_swap(vdev, (void *) hdr);
  +sg2[0].iov_base = hdr;
  +sg2[0].iov_len = sizeof(hdr);
  +out_num = iov_copy(sg2[1], ARRAY_SIZE(sg2) - 1,
  +   out_sg, out_num,
  +   sizeof(hdr), -1);
 
  This might truncate packet if it does not fit in 1024 anymore.
  It might be better to just drop it.
 
 Ok, will do this in V3.
 
 
  +out_num += 1;
  +out_sg = sg2;
  +}
   }
   
   /*
  diff --git a/include/hw/virtio/virtio-access.h 
  b/include/hw/virtio/virtio-access.h
  index cee5dd7..1ec1dfd 100644
  --- a/include/hw/virtio/virtio-access.h
  +++ b/include/hw/virtio/virtio-access.h
  @@ -143,6 +143,15 @@ static inline uint64_t virtio_ldq_p(VirtIODevice 
  *vdev, const void *ptr)
   }
   }
   
  +static inline bool virtio_needs_swap(VirtIODevice *vdev)
  +{
  +#ifdef HOST_WORDS_BIGENDIAN
  +return virtio_access_is_big_endian(vdev) ? false : true;
  +#else
  +return virtio_access_is_big_endian(vdev) ? true : false;
  +#endif
  +}
  +
   static inline uint16_t virtio_tswap16(VirtIODevice *vdev, uint16_t s)
   {
   #ifdef HOST_WORDS_BIGENDIAN
  -- 
  2.1.4



Re: [Qemu-devel] [PATCH V2] virtio-net: unbreak any layout

2015-07-16 Thread Jason Wang


On 07/16/2015 03:29 PM, Michael S. Tsirkin wrote:
 On Thu, Jul 16, 2015 at 02:49:48PM +0800, Jason Wang wrote:
  
  
  On 07/16/2015 02:42 PM, Michael S. Tsirkin wrote:
   On Wed, Jul 15, 2015 at 03:56:07PM +0800, Jason Wang wrote:
   Commit 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
   (virtio-net: byteswap virtio-net header) breaks any layout by
   requiring out_sg[0].iov_len = n-guest_hdr_len. Fixing this by
   copying header to temporary buffer if swap is needed, and then use
   this buffer as part of out_sg.
  
   Fixes 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
   (virtio-net: byteswap virtio-net header)
   Cc: qemu-sta...@nongnu.org
   Cc: c...@fr.ibm.com
   Signed-off-by: Jason Wang jasow...@redhat.com
   ---
   Changes from V1:
   - avoid header copying if there's no need to do header swap
   - don't write the header back
   ---
hw/net/virtio-net.c   | 17 ++---
include/hw/virtio/virtio-access.h |  9 +
2 files changed, 23 insertions(+), 3 deletions(-)
  
   diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
   index e3c2db3..12322bd 100644
   --- a/hw/net/virtio-net.c
   +++ b/hw/net/virtio-net.c
   @@ -1142,7 +1142,8 @@ static int32_t 
   virtio_net_flush_tx(VirtIONetQueue *q)
ssize_t ret, len;
unsigned int out_num = elem.out_num;
struct iovec *out_sg = elem.out_sg[0];
   -struct iovec sg[VIRTQUEUE_MAX_SIZE];
   +struct iovec sg[VIRTQUEUE_MAX_SIZE], sg2[VIRTQUEUE_MAX_SIZE];
   +struct virtio_net_hdr hdr;

if (out_num  1) {
error_report(virtio-net header not in first element);
   @@ -1150,11 +1151,21 @@ static int32_t 
   virtio_net_flush_tx(VirtIONetQueue *q)
}

if (n-has_vnet_hdr) {
   -if (out_sg[0].iov_len  n-guest_hdr_len) {
   +if (iov_size(out_sg, out_num)  n-guest_hdr_len) {
error_report(virtio-net header incorrect);
exit(1);
}
   this scans the iov unnecessarily. How about checking return
   code from iov_to_buf instead?
  
  Looks ok since hdr is very short anyway.
 but iov_size scans the whole iov, not just the header.


Right.



[Qemu-devel] [PULL 7/9] target-mips: fix resource leak reported by Coverity

2015-07-16 Thread Leon Alrae
UHI assert and link operations call lock_user_string() twice to obtain two
strings pointed by gpr[4] and gpr[5]. If the second lock_user_string()
fails, then the first one won't get freed. Fix this by introducing another
macro responsible for obtaining two strings and handling allocation
failure.

Signed-off-by: Leon Alrae leon.al...@imgtec.com
Reviewed-by: Aurelien Jarno aurel...@aurel32.net
---
 target-mips/mips-semi.c | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/target-mips/mips-semi.c b/target-mips/mips-semi.c
index 1162c76..5050940 100644
--- a/target-mips/mips-semi.c
+++ b/target-mips/mips-semi.c
@@ -220,6 +220,23 @@ static int copy_argn_to_target(CPUMIPSState *env, int 
arg_num,
 }   \
 } while (0)
 
+#define GET_TARGET_STRINGS_2(p, addr, p2, addr2)\
+do {\
+p = lock_user_string(addr); \
+if (!p) {   \
+gpr[2] = -1;\
+gpr[3] = EFAULT;\
+goto uhi_done;  \
+}   \
+p2 = lock_user_string(addr2);   \
+if (!p2) {  \
+unlock_user(p, addr, 0);\
+gpr[2] = -1;\
+gpr[3] = EFAULT;\
+goto uhi_done;  \
+}   \
+} while (0)
+
 #define FREE_TARGET_STRING(p, gpr)  \
 do {\
 unlock_user(p, gpr, 0); \
@@ -322,8 +339,7 @@ void helper_do_semihosting(CPUMIPSState *env)
 FREE_TARGET_STRING(p, gpr[4]);
 break;
 case UHI_assert:
-GET_TARGET_STRING(p, gpr[4]);
-GET_TARGET_STRING(p2, gpr[5]);
+GET_TARGET_STRINGS_2(p, gpr[4], p2, gpr[5]);
 printf(assertion ');
 printf(\%s\, p);
 printf(': file \%s\, line %d\n, p2, (int)gpr[6]);
@@ -341,8 +357,7 @@ void helper_do_semihosting(CPUMIPSState *env)
 break;
 #ifndef _WIN32
 case UHI_link:
-GET_TARGET_STRING(p, gpr[4]);
-GET_TARGET_STRING(p2, gpr[5]);
+GET_TARGET_STRINGS_2(p, gpr[4], p2, gpr[5]);
 gpr[2] = link(p, p2);
 gpr[3] = errno_mips(errno);
 FREE_TARGET_STRING(p2, gpr[5]);
-- 
2.1.0




[Qemu-devel] [PULL 8/9] linux-user: Fix MIPS N64 trap and break instruction bug

2015-07-16 Thread Leon Alrae
From: Andrew Bennett andrew.benn...@imgtec.com

For the MIPS N64 ABI when QEMU reads the break/trap instruction so that
it can inspect the break/trap code it reads 8 rather than 4 bytes
which means it finds the code field from the instruction after the
break/trap instruction.  This then causes the break/trap handling
code to fail because it does not understand the code number.

The fix forces QEMU to always read 4 bytes of instruction data rather
than deciding how much to read based on the ABI.

Signed-off-by: Andrew Bennett andrew.benn...@imgtec.com
Reviewed-by: Leon Alrae leon.al...@imgtec.com
Signed-off-by: Leon Alrae leon.al...@imgtec.com
---
 linux-user/main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 05914b1..fdee981 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -2577,7 +2577,7 @@ done_syscall:
 code = (trap_instr  6)  0x3f;
 }
 } else {
-ret = get_user_ual(trap_instr, env-active_tc.PC);
+ret = get_user_u32(trap_instr, env-active_tc.PC);
 if (ret != 0) {
 goto error;
 }
@@ -2611,7 +2611,7 @@ done_syscall:
 
 trap_instr = (instr[0]  16) | instr[1];
 } else {
-ret = get_user_ual(trap_instr, env-active_tc.PC);
+ret = get_user_u32(trap_instr, env-active_tc.PC);
 }
 
 if (ret != 0) {
-- 
2.1.0




[Qemu-devel] [PULL 4/9] target-mips: fix ASID synchronisation for MIPS MT

2015-07-16 Thread Leon Alrae
From: Aurelien Jarno aurel...@aurel32.net

When syncing the task ASID with EntryHi, correctly or the value instead
of assigning it.

Reported-by: Dr. David Alan Gilbert dgilb...@redhat.com
Signed-off-by: Aurelien Jarno aurel...@aurel32.net
Cc: Leon Alrae leon.al...@imgtec.com
Reviewed-by: Leon Alrae leon.al...@imgtec.com
Signed-off-by: Leon Alrae leon.al...@imgtec.com
---
 target-mips/op_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 2a9ddff..d457a29 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -661,7 +661,7 @@ static void sync_c0_tcstatus(CPUMIPSState *cpu, int tc,
 
 /* Sync the TASID with EntryHi.  */
 cpu-CP0_EntryHi = ~0xff;
-cpu-CP0_EntryHi = tasid;
+cpu-CP0_EntryHi |= tasid;
 
 compute_hflags(cpu);
 }
-- 
2.1.0




Re: [Qemu-devel] [PATCH V2] virtio-net: unbreak any layout

2015-07-16 Thread Jason Wang


On 07/16/2015 02:42 PM, Michael S. Tsirkin wrote:
 On Wed, Jul 15, 2015 at 03:56:07PM +0800, Jason Wang wrote:
 Commit 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
 (virtio-net: byteswap virtio-net header) breaks any layout by
 requiring out_sg[0].iov_len = n-guest_hdr_len. Fixing this by
 copying header to temporary buffer if swap is needed, and then use
 this buffer as part of out_sg.

 Fixes 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
 (virtio-net: byteswap virtio-net header)
 Cc: qemu-sta...@nongnu.org
 Cc: c...@fr.ibm.com
 Signed-off-by: Jason Wang jasow...@redhat.com
 ---
 Changes from V1:
 - avoid header copying if there's no need to do header swap
 - don't write the header back
 ---
  hw/net/virtio-net.c   | 17 ++---
  include/hw/virtio/virtio-access.h |  9 +
  2 files changed, 23 insertions(+), 3 deletions(-)

 diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
 index e3c2db3..12322bd 100644
 --- a/hw/net/virtio-net.c
 +++ b/hw/net/virtio-net.c
 @@ -1142,7 +1142,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
  ssize_t ret, len;
  unsigned int out_num = elem.out_num;
  struct iovec *out_sg = elem.out_sg[0];
 -struct iovec sg[VIRTQUEUE_MAX_SIZE];
 +struct iovec sg[VIRTQUEUE_MAX_SIZE], sg2[VIRTQUEUE_MAX_SIZE];
 +struct virtio_net_hdr hdr;
  
  if (out_num  1) {
  error_report(virtio-net header not in first element);
 @@ -1150,11 +1151,21 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
  }
  
  if (n-has_vnet_hdr) {
 -if (out_sg[0].iov_len  n-guest_hdr_len) {
 +if (iov_size(out_sg, out_num)  n-guest_hdr_len) {
  error_report(virtio-net header incorrect);
  exit(1);
  }
 this scans the iov unnecessarily. How about checking return
 code from iov_to_buf instead?

Looks ok since hdr is very short anyway.


 -virtio_net_hdr_swap(vdev, (void *) out_sg[0].iov_base);
 +if (virtio_needs_swap(vdev)) {
 +iov_to_buf(out_sg, out_num, 0, hdr, sizeof(hdr));
 +virtio_net_hdr_swap(vdev, (void *) hdr);
 +sg2[0].iov_base = hdr;
 +sg2[0].iov_len = sizeof(hdr);
 +out_num = iov_copy(sg2[1], ARRAY_SIZE(sg2) - 1,
 +   out_sg, out_num,
 +   sizeof(hdr), -1);

 This might truncate packet if it does not fit in 1024 anymore.
 It might be better to just drop it.

Ok, will do this in V3.


 +out_num += 1;
 +out_sg = sg2;
 +}
  }
  
  /*
 diff --git a/include/hw/virtio/virtio-access.h 
 b/include/hw/virtio/virtio-access.h
 index cee5dd7..1ec1dfd 100644
 --- a/include/hw/virtio/virtio-access.h
 +++ b/include/hw/virtio/virtio-access.h
 @@ -143,6 +143,15 @@ static inline uint64_t virtio_ldq_p(VirtIODevice *vdev, 
 const void *ptr)
  }
  }
  
 +static inline bool virtio_needs_swap(VirtIODevice *vdev)
 +{
 +#ifdef HOST_WORDS_BIGENDIAN
 +return virtio_access_is_big_endian(vdev) ? false : true;
 +#else
 +return virtio_access_is_big_endian(vdev) ? true : false;
 +#endif
 +}
 +
  static inline uint16_t virtio_tswap16(VirtIODevice *vdev, uint16_t s)
  {
  #ifdef HOST_WORDS_BIGENDIAN
 -- 
 2.1.4




[Qemu-devel] [PULL 0/9] target-mips queue

2015-07-16 Thread Leon Alrae
Hi,

This pull request contains MIPS bug fixes for 2.4-rc1.

Thanks,
Leon

Cc: Peter Maydell peter.mayd...@linaro.org
Cc: Aurelien Jarno aurel...@aurel32.net

The following changes since commit 661725da09f47eb92d356fac10a4cf3b7ad1f61d:

  Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20150714' into 
staging (2015-07-14 18:50:17 +0100)

are available in the git repository at:

  git://github.com/lalrae/qemu.git tags/mips-20150716

for you to fetch changes up to 908680c6441ac468f4871d513f42be396ea0d264:

  target-mips: fix page fault address for LWL/LWR/LDL/LDR (2015-07-15 14:07:25 
+0100)


MIPS patches 2015-07-16

Changes:
* bug fixes


Andrew Bennett (1):
  linux-user: Fix MIPS N64 trap and break instruction bug

Aurelien Jarno (2):
  target-mips: fix ASID synchronisation for MIPS MT
  target-mips: fix page fault address for LWL/LWR/LDL/LDR

Leon Alrae (3):
  target-mips: correct DERET instruction
  target-mips: fix logically dead code reported by Coverity
  target-mips: fix resource leak reported by Coverity

Yongbok Kim (3):
  target-mips: fix MIPS64R6-generic configuration
  target-mips: fix to clear MSACSR.Cause
  disas/mips: fix disassembling R6 instructions

 disas/mips.c | 12 ++--
 linux-user/main.c|  4 ++--
 target-mips/mips-defs.h  |  2 +-
 target-mips/mips-semi.c  | 23 +++
 target-mips/msa_helper.c |  6 ++
 target-mips/op_helper.c  |  5 ++---
 target-mips/translate.c  | 15 +++
 target-mips/translate_init.c | 18 +-
 8 files changed, 60 insertions(+), 25 deletions(-)



Re: [Qemu-devel] [PATCH 08/10 v12] target-tilegx: Add several helpers for instructions translation

2015-07-16 Thread gchen gchen
Hello Maintainers:brbrPlease help review this patch when you have 
time.brbrThanks.brbrOn 06/13/2015 09:19 PM, Chen Gang wrote:brgt; 
The related instructions are exception, cntlz, cnttz, shufflebytes, andbrgt; 
add_saturate.brgt;brgt; Signed-off-by: Chen Gang 
lt;gang.chen.5...@gmail.comgt;brgt; ---brgt;  target-tilegx/helper.c | 
83 ++brgt;  
target-tilegx/helper.h |  5 +++brgt;  2 files changed, 88 
insertions(+)brgt;  create mode 100644 target-tilegx/helper.cbrgt;  
create mode 100644 target-tilegx/helper.hbrgt;brgt; diff --git 
a/target-tilegx/helper.c b/target-tilegx/helper.cbrgt; new file mode 
100644brgt; index 000..5ab41cdbrgt; --- /dev/nullbrgt; +++ 
b/target-tilegx/helper.cbrgt; @@ -0,0 +1,83 @@brgt; +/*brgt; + * QEMU 
TILE-Gx helpersbrgt; + *brgt; + *  Copyright (c) 2015 Chen Gangbrgt; + 
*brgt; + * This library is free software; you can redistribute it 
and/orbrgt; + * modify it under the terms of the GNU Lesser General 
Publicbrgt; + * License as published by the Free Software Foundation; 
eitherbrgt; + * version 2.1 of the License, or (at your option) any later 
version.brgt; + *brgt; + * This library is distributed in the hope that 
it will be useful,brgt; + * but WITHOUT ANY WARRANTY; without even the 
implied warranty ofbrgt; + * MERCHANTABILITY or FITNESS FOR A PARTICULAR 
PURPOSE.  See the GNUbrgt; + * Lesser General Public License for more 
details.brgt; + *brgt; + * You should have received a copy of the GNU 
Lesser General Publicbrgt; + * License along with this library; if not, 
seebrgt; + * lt;http://www.gnu.org/licenses/lgpl-2.1.htmlgt;brgt; + 
*/brgt; +brgt; +#include cpu.hbrgt; +#include 
qemu-common.hbrgt; +#include exec/helper-proto.hbrgt; +brgt; 
+#define SIGNBIT32 0x8000brgt; +brgt; +int64_t 
helper_add_saturate(CPUTLGState *env, uint64_t rsrc, uint64_t rsrcb)brgt; 
+{brgt; +uint32_t rdst = rsrc + rsrcb;brgt; +brgt; +if (((rdst 
^ rsrc) amp; SIGNBIT32) amp;amp; !((rsrc ^ rsrcb) amp; SIGNBIT32)) 
{brgt; +rdst = ~(((int32_t)rsrc gt;gt; 31) ^ SIGNBIT32);brgt; + 
   }brgt; +brgt; +return (int64_t)rdst;brgt; +}brgt; +brgt; 
+void helper_exception(CPUTLGState *env, uint32_t excp)brgt; +{brgt; +
CPUState *cs = CPU(tilegx_env_get_cpu(env));brgt; +brgt; +
cs-gt;exception_index = excp;brgt; +cpu_loop_exit(cs);brgt; 
+}brgt; +brgt; +uint64_t helper_cntlz(uint64_t arg)brgt; +{brgt; +  
  return clz64(arg);brgt; +}brgt; +brgt; +uint64_t 
helper_cnttz(uint64_t arg)brgt; +{brgt; +return ctz64(arg);brgt; 
+}brgt; +brgt; +/*brgt; + * Functional Descriptionbrgt; + * 
uint64_t a = rf[SrcA];brgt; + * uint64_t b = rf[SrcB];brgt; + * 
uint64_t d = rf[Dest];brgt; + * uint64_t output = 0;brgt; + * 
unsigned int counter;brgt; + * for (counter = 0; counter lt; (WORD_SIZE 
/ BYTE_SIZE); counter++)brgt; + * {brgt; + * int sel = 
getByte (b, counter) amp; 0xf;brgt; + * uint8_t byte = (sel lt; 8) 
? getByte (d, sel) : getByte (a, (sel - 8));brgt; + * output = 
setByte (output, counter, byte);brgt; + * }brgt; + * rf[Dest] = 
output;brgt; + */brgt; +uint64_t helper_shufflebytes(uint64_t rdst, 
uint64_t rsrc, uint64_t rsrcb)brgt; +{brgt; +uint64_t vdst = 
0;brgt; +int count;brgt; +brgt; +for (count = 0; count lt; 
64; count += 8) {brgt; +uint64_t sel = rsrcb gt;gt; count;brgt; 
+uint64_t src = (sel amp; 8) ? rsrc : rdst;brgt; +vdst |= 
((src gt;gt; ((sel amp; 7) * 8)) amp; 0xff) lt;lt; count;brgt; +
}brgt; +brgt; +return vdst;brgt; +}brgt; diff --git 
a/target-tilegx/helper.h b/target-tilegx/helper.hbrgt; new file mode 
100644brgt; index 000..1411c19brgt; --- /dev/nullbrgt; +++ 
b/target-tilegx/helper.hbrgt; @@ -0,0 +1,5 @@brgt; 
+DEF_HELPER_2(exception, noreturn, env, i32)brgt; +DEF_HELPER_FLAGS_1(cntlz, 
TCG_CALL_NO_RWG_SE, i64, i64)brgt; +DEF_HELPER_FLAGS_1(cnttz, 
TCG_CALL_NO_RWG_SE, i64, i64)brgt; +DEF_HELPER_FLAGS_3(shufflebytes, 
TCG_CALL_NO_RWG_SE, i64, i64, i64, i64)brgt; +DEF_HELPER_3(add_saturate, 
s64, env, i64, i64)brgt;brbr--brChen GangbrbrOpen, share, and 
attitude like air, water, and life which God blessedbr



Re: [Qemu-devel] RFC: guest-side retrieval of fw_cfg file

2015-07-16 Thread Michael S. Tsirkin
On Thu, Jul 16, 2015 at 08:57:36AM +0200, Igor Mammedov wrote:
 On Wed, 15 Jul 2015 19:24:33 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Wed, Jul 15, 2015 at 06:01:42PM +0200, Igor Mammedov wrote:
   On Wed, 15 Jul 2015 17:08:27 +0300
   Michael S. Tsirkin m...@redhat.com wrote:
   
On Mon, Jul 13, 2015 at 04:09:37PM -0400, Gabriel L. Somlo wrote:
 Hi,
 
 A while ago I was pondering on the options available for retrieving
 a fw_cfg blob from the guest-side (now that we can insert fw_cfg
 files on the host-side command line, see commit 81b2b8106).
 
 So over the last couple of weekends I cooked up the sysfs kernel
 module below, which lists all fw_cfg files
 under /sys/firmware/fw_cfg/filename.

One concern here is that there will be a conflict here if fw cfg
is used by ACPI. I don't know whether that last is a good idea
though, so maybe not a real concern. I think Igor
wanted to make it so.
   
   I don't see any conflict here so far, it's just guest side module that
   accesses fw_cfg.
  
  If there's ACPI code that accesses fw_cfg in response to an interrupt,
  it'll race with fw cfg access by guest OS. On linux we might be able to
  block ACPI preventing it from running. We probably won't be able to do
  it on windows.
 wrt vmgenid series we were talking about possibility to access fw_cfg
 from ACPI device.init so it's unlikely that it will ever collide with
 much later sysfs accesses.

Don't we need to get updates when it changes too?

 But if ACPI will start accessing fw_cfg from
 other methods it will race for sure.

Maybe we shouldn't poke at fw cfg in ACPI then.

-- 
MST



Re: [Qemu-devel] [PATCH v4 0/7] Fix QEMU crash during memory hotplug with vhost=on

2015-07-16 Thread Michael S. Tsirkin
On Thu, Jul 16, 2015 at 09:26:21AM +0200, Igor Mammedov wrote:
 On Wed, 15 Jul 2015 19:32:31 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Wed, Jul 15, 2015 at 05:12:01PM +0200, Igor Mammedov wrote:
   On Thu,  9 Jul 2015 13:47:17 +0200
   Igor Mammedov imamm...@redhat.com wrote:
   
   there also is yet another issue with vhost-user. It also has
   very low limit on amount of memory regions (if I recall correctly 8)
   and it's possible to trigger even without memory hotplug.
   one just need to start QEMU with a several -numa memdev= options
   to create a necessary amount of memory regions to trigger it.
   
   lowrisk option to fix it would be increasing limit in vhost-user
   backend.
   
   another option is disabling vhost and fall-back to virtio,
   but I don't know much about vhost if it's possible to 
   to switch it off without loosing packets guest was sending
   at the moment and if it will work at all with vhost.
  
  With vhost-user you can't fall back to virtio: it's
  not an accelerator, it's the backend.
  
  Updating the protocol to support a bigger table
  is possible but old remotes won't be able to support it.
  
 it looks like increasing limit is the only option left.
 
 it's not ideal that old remotes /with hardcoded limit/
 might not be able to handle bigger table but at least
 new ones and ones that handle VhostUserMsg payload
 dynamically would be able to work without crashing.

I think we need a way for hotplug to fail gracefully.  As long as we
don't implement the hva trick, it's needed for old kernels with vhost in
kernel, too.

-- 
MST



[Qemu-devel] [PATCH 1/3] memory: make function invalidate_and_set_dirty public

2015-07-16 Thread Efimov Vasily
Make function invalidate_and_set_dirty public. It is required by PAM emulation.

Signed-off-by: Efimov Vasily r...@ispras.ru
---
 exec.c | 2 +-
 include/exec/memory-internal.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index 251dc79..4e37ded 100644
--- a/exec.c
+++ b/exec.c
@@ -2281,7 +2281,7 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
 
 #else
 
-static void invalidate_and_set_dirty(MemoryRegion *mr, hwaddr addr,
+void invalidate_and_set_dirty(MemoryRegion *mr, hwaddr addr,
  hwaddr length)
 {
 uint8_t dirty_log_mask = memory_region_get_dirty_log_mask(mr);
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index fb467ac..801da82 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -31,5 +31,8 @@ extern const MemoryRegionOps unassigned_mem_ops;
 bool memory_region_access_valid(MemoryRegion *mr, hwaddr addr,
 unsigned size, bool is_write);
 
+void invalidate_and_set_dirty(MemoryRegion *mr, hwaddr addr,
+ hwaddr length);
+
 #endif
 #endif
-- 
1.9.1




Re: [Qemu-devel] [PATCH 3/3] PAM: make PAM emulation closer to documentation

2015-07-16 Thread Paolo Bonzini


On 16/07/2015 10:35, Efimov Vasily wrote:
 This patch improves PAM emulation.
 
 PAM defines 4 memory access redirection modes. In mode 1 reads are directed to
 RAM and writes are directed to PCI. In mode 2 it is contrary. In mode 0 all
 access is directed to PCI. In mode 3 it is directed to RAM. Modes 0 and 3 are
 well emulated but modes 1 and 2 are not. The cause is: aliases are used
 while more complicated logic is required.
 
 The idea is to use ROM device like memory regions for mode 1 and 2 emulation
 instead of aliases. Writes are directed to proper destination region by
 specified I/O callback. Read redirection depends on type of source region.
 In most cases source region is RAM (or ROM), so ram_addr of PAM region is set 
 to
 ram_addr of source region with offset. Otherwise, when source region is an I/O
 region, reading is redirected to source region read callback by PAM region 
 one.
 
 Read source and write destination regions are updated by the memory
 commit callback.
 
 Note that we cannot use I/O region for PAM as it will violate trying to 
 execute
 code outside RAM or ROM assertion.
 
 Signed-off-by: Efimov Vasily r...@ispras.ru
 ---
  hw/pci-host/pam.c | 238 
 +-
  include/hw/pci-host/pam.h |  10 +-
  2 files changed, 223 insertions(+), 25 deletions(-)
 
 diff --git a/hw/pci-host/pam.c b/hw/pci-host/pam.c
 index 17d826c..9729b2b 100644
 --- a/hw/pci-host/pam.c
 +++ b/hw/pci-host/pam.c
 @@ -27,43 +27,233 @@
   * THE SOFTWARE.
   */
  
 -#include qom/object.h
 -#include sysemu/sysemu.h
  #include hw/pci-host/pam.h
 +#include exec/address-spaces.h
 +#include exec/memory-internal.h
 +#include qemu/bswap.h
 +
 +static void pam_write(void *opaque, hwaddr addr, uint64_t data,
 +  unsigned size)
 +{
 +PAMMemoryRegion *pam = (PAMMemoryRegion *) opaque;
 +void *ptr;
 +
 +/* Destination region can be both RAM and IO. */
 +if (!memory_access_is_direct(pam-mr_write_to, true)) {
 +memory_region_dispatch_write(pam-mr_write_to,
 + addr + pam-write_offset, data, size,
 + MEMTXATTRS_UNSPECIFIED);
 +} else {
 +ptr = memory_region_get_ram_ptr(pam-mr_write_to) + addr
 +  + 
 pam-write_offset;
 +
 +switch (size) {
 +case 1:
 +stb_p(ptr, data);
 +break;
 +case 2:
 +stw_he_p(ptr, data);
 +break;
 +case 4:
 +stl_he_p(ptr, data);
 +break;
 +case 8:
 +stq_he_p(ptr, data);
 +break;
 +default:
 +abort();
 +}
 +
 +invalidate_and_set_dirty(pam-mr_write_to, addr + pam-pam_offset,
 + size);
 +}
 +}
 +

The idea is very good, but the implementation relies on copying a lot of
code from exec.c.

Could you use an IOMMU memory region instead?  Then a single region can
be used to implement all four modes, and you don't hit the trying to
execute code outside RAM or RAM.

Paolo



Re: [Qemu-devel] iothread: release iothread around aio_poll causes random hangs at startup

2015-07-16 Thread Christian Borntraeger
Am 16.07.2015 um 13:20 schrieb Paolo Bonzini:
 
 
 On 16/07/2015 13:03, Christian Borntraeger wrote:
 For what its worth, I can no longer reproduce the issue on
 current master + cherry-pick of a0710f7995f (iothread: release iothread 
 around aio_poll)

 bisect tells me that

 commit 53ec73e264f481b79b52efcadc9ceb8f8996975c
 Author: Fam Zheng f...@redhat.com
 AuthorDate: Fri May 29 18:53:14 2015 +0800
 Commit: Stefan Hajnoczi stefa...@redhat.com
 CommitDate: Tue Jul 7 14:27:14 2015 +0100

 block: Use bdrv_drain to replace uncessary bdrv_drain_all

 made the problem will blk-null go away. I still dont understand why.
 
 It could be related to the AioContext problem that I'm fixing these
 days, too.  Good news, we'll requeue the patch for 2.5.

That was also something that I had in mind (in fact I retested this to check
the ctx patch). master + cherry-pick of a0710f7995f + revert of 53ec73e26 + 
this fix
still fails, so it was (is?) a different issue. The interesting part is that 
this
problem required 2 or more disk (and we replace drain_all with single drains) so
it somewhat sounds plausible.

Christian




[Qemu-devel] [PATCH v2] target-arm: kvm: Differentiate registers based on write-back levels

2015-07-16 Thread Christoffer Dall
Some registers like the CNTVCT register should only be written to the
kernel as part of machine initialization or on vmload operations, but
never during runtime, as this can potentially make time go backwards or
create inconsistent time observations between VCPUs.

Introduce a list of registers that should not be written back at runtime
and check this list on syncing the register state to the KVM state.

Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
---
Changes since RFC:
 - Move cpreg_level to kvm_arm_cpreg_level and into kvm32.c and kvm64.c
 - Changed struct name and declare as static const

 dtc  |  2 +-
 target-arm/kvm.c |  6 +-
 target-arm/kvm32.c   | 30 +-
 target-arm/kvm64.c   | 30 +-
 target-arm/kvm_arm.h | 12 +++-
 target-arm/machine.c |  2 +-
 6 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/dtc b/dtc
index 65cc4d2..bc895d6 16
--- a/dtc
+++ b/dtc
@@ -1 +1 @@
-Subproject commit 65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf
+Subproject commit bc895d6d09695d05ceb8b52486ffe861d6cfbdde
diff --git a/target-arm/kvm.c b/target-arm/kvm.c
index 548bfd7..b278542 100644
--- a/target-arm/kvm.c
+++ b/target-arm/kvm.c
@@ -409,7 +409,7 @@ bool write_kvmstate_to_list(ARMCPU *cpu)
 return ok;
 }
 
-bool write_list_to_kvmstate(ARMCPU *cpu)
+bool write_list_to_kvmstate(ARMCPU *cpu, int level)
 {
 CPUState *cs = CPU(cpu);
 int i;
@@ -421,6 +421,10 @@ bool write_list_to_kvmstate(ARMCPU *cpu)
 uint32_t v32;
 int ret;
 
+if (kvm_arm_cpreg_level(regidx)  level) {
+continue;
+}
+
 r.id = regidx;
 switch (regidx  KVM_REG_SIZE_MASK) {
 case KVM_REG_SIZE_U32:
diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c
index d7e7d68..6769815 100644
--- a/target-arm/kvm32.c
+++ b/target-arm/kvm32.c
@@ -153,6 +153,34 @@ bool kvm_arm_reg_syncs_via_cpreg_list(uint64_t regidx)
 }
 }
 
+typedef struct CPRegStateLevel {
+uint64_t regidx;
+int level;
+} CPRegStateLevel;
+
+/* All coprocessor registers not listed in the following table are assumed to
+ * be of the level KVM_PUT_RUNTIME_STATE, a register should be written less
+ * often, you must add it to this table with a state of either
+ * KVM_PUT_RESET_STATE or KVM_PUT_FULL_STATE.
+ */
+static const CPRegStateLevel non_runtime_cpregs[] = {
+{ KVM_REG_ARM_TIMER_CNT, KVM_PUT_FULL_STATE },
+};
+
+int kvm_arm_cpreg_level(uint64_t regidx)
+{
+int i;
+
+for (i = 0; i  ARRAY_SIZE(non_runtime_cpregs); i++) {
+const CPRegStateLevel *l = non_runtime_cpregs[i];
+if (l-regidx == regidx) {
+return l-level;
+}
+}
+
+return KVM_PUT_RUNTIME_STATE;
+}
+
 #define ARM_MPIDR_HWID_BITMASK 0xFF
 #define ARM_CPU_ID_MPIDR   0, 0, 0, 5
 
@@ -367,7 +395,7 @@ int kvm_arch_put_registers(CPUState *cs, int level)
  * managed to update the CPUARMState with, and only allowing those
  * to be written back up into the kernel).
  */
-if (!write_list_to_kvmstate(cpu)) {
+if (!write_list_to_kvmstate(cpu, level)) {
 return EINVAL;
 }
 
diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c
index ac34f51..d59f41c 100644
--- a/target-arm/kvm64.c
+++ b/target-arm/kvm64.c
@@ -139,6 +139,34 @@ bool kvm_arm_reg_syncs_via_cpreg_list(uint64_t regidx)
 }
 }
 
+typedef struct CPRegStateLevel {
+uint64_t regidx;
+int level;
+} CPRegStateLevel;
+
+/* All system not listed in the following table are assumed to be of the level
+ * KVM_PUT_RUNTIME_STATE, a register should be written less often, you must
+ * add it to this table with a state of either KVM_PUT_RESET_STATE or
+ * KVM_PUT_FULL_STATE.
+ */
+static const CPRegStateLevel non_runtime_cpregs[] = {
+{ KVM_REG_ARM_TIMER_CNT, KVM_PUT_FULL_STATE },
+};
+
+int kvm_arm_cpreg_level(uint64_t regidx)
+{
+int i;
+
+for (i = 0; i  ARRAY_SIZE(non_runtime_cpregs); i++) {
+const CPRegStateLevel *l = non_runtime_cpregs[i];
+if (l-regidx == regidx) {
+return l-level;
+}
+}
+
+return KVM_PUT_RUNTIME_STATE;
+}
+
 #define AARCH64_CORE_REG(x)   (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
  KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
 
@@ -280,7 +308,7 @@ int kvm_arch_put_registers(CPUState *cs, int level)
 return ret;
 }
 
-if (!write_list_to_kvmstate(cpu)) {
+if (!write_list_to_kvmstate(cpu, level)) {
 return EINVAL;
 }
 
diff --git a/target-arm/kvm_arm.h b/target-arm/kvm_arm.h
index 5abd591..7912d74 100644
--- a/target-arm/kvm_arm.h
+++ b/target-arm/kvm_arm.h
@@ -69,8 +69,18 @@ int kvm_arm_init_cpreg_list(ARMCPU *cpu);
 bool kvm_arm_reg_syncs_via_cpreg_list(uint64_t regidx);
 
 /**
+ * kvm_arm_cpreg_level
+ * regidx: KVM register index
+ *
+ * Return the level of this coprocessor/system register.  Return value is
+ * either KVM_PUT_RUNTIME_STATE, KVM_PUT_RESET_STATE, or KVM_PUT_FULL_STATE.
+ */

Re: [Qemu-devel] iothread: release iothread around aio_poll causes random hangs at startup

2015-07-16 Thread Paolo Bonzini


On 16/07/2015 13:24, Christian Borntraeger wrote:
 The interesting part is that this problem required 2 or more disk
 (and we replace drain_all with single drains) so it somewhat sounds
 plausible.

Yes, indeed.  It is very plausible.  I wanted to reproduce it these
days, so thanks for saving me a lot of time!  I'll test your exact setup
(master + AioContext fix + cherry-pick of a0710f7995f + revert of
53ec73e26).

Thanks,

Paolo



Re: [Qemu-devel] [PATCH 2/5] virtio-blk: disable scsi passthrough for 1.0 device

2015-07-16 Thread Michael S. Tsirkin
On Thu, Jul 16, 2015 at 02:37:12PM +0200, Cornelia Huck wrote:
 On Wed, 15 Jul 2015 21:51:32 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Wed, Jul 15, 2015 at 05:38:53PM +0200, Cornelia Huck wrote:
   On Wed, 15 Jul 2015 17:39:18 +0300
   Michael S. Tsirkin m...@redhat.com wrote:
   
On Wed, Jul 15, 2015 at 04:30:51PM +0200, Cornelia Huck wrote:
 On Wed, 15 Jul 2015 17:11:57 +0300
 Michael S. Tsirkin m...@redhat.com wrote:
 
  Fine, but revision is negotiated way before features are
  probed so why does it make a practical difference?
 
 Legacy drivers (that don't know about the set-revision 
 command) will
 read features without revision negotiation - we need to offer 
 them the
 legacy feature set.

Right. So simply do if (revision  1) return features  
0x
and that will do this, will it not?
   
   Not for bits that we want to offer for legacy but not for modern.
  
  I don't think this selective offering works at least for scsi.
  scsi is a backend feature, if you connect a modern device
  in front the device simply does not work.
  It therefore makes no sense to attach a transitional device
  to such a backend.
 
 My point is that we're losing legacy features with that approach, and
 it would not be possible to offer them to legacy guests with newer
 qemus (at least with ccw).

What's wrong with adding a disable-modern flag, like pci has?
User can set that to get a legacy device.
   
   The whole idea behind the revision-stuff was that we don't need
   something like disable-modern. If the device is able to figure out on
   its own if it is to act as a modern or a legacy device, why require
   user intervention?
  
  It's about compatibility, e.g. being able to test legacy mode
  in transitional drivers in guests.
 
 Let's step back a bit. Why do we need legacy? To support old hosts and
 old guests. So what we need is a test matrix combining old/modern hosts
 with old/modern guests. I don't think forcing to an old version is
 something we should spend time on except during development.
 
  Consider also bugs, e.g. the fact that linux guests lack WCE
  in modern mode ATM.
 
 If we consider this a bug, shouldn't VERSION_1 be forced off
 unconditionally until specs and codebase are fixed?

I don't see why.  We'll never fix all bugs.  It causes a performance
regression but otherwise things work correctly.
modern is off by default for now, this seems sufficient, if we
never even ship it as an option we'll never find all bugs.

  

 What about the other way around (i.e. scsi is configured, therefore 
 the
 device is legacy-only)? We'd only retain the scsi bit if it is 
 actually
 wanted by the user's configuration. I would need to enforce a max
 revision of 0 for such a device in ccw, and pci could disable modern
 for it.

Will have to think about it.
But I think a flag to disable/enable modern is useful in any case,
and it seems sufficient.
   
   I don't like the idea of disabling modern or legacy for ccw, where the
   differences between both are very minor.
   
   I also don't think requiring the user to specify a new flag on upgrade
   just to present the same features as before is a good idea: it is
   something that is easily missed and may lead to much headscratching.
  
  And doing this on a driver upgrade won't?  As I said, if you believe
  this feature has value, argue that we shouldn't drop scsi off in virtio
  1.0 then.
 
 I seem to have problems explaining myself :(
 
 I don't care about the feature, except for supporting old guests that
 should continue working as before even if the host updated. And without
 special configuration on the host, as this is too easily forgotten.
 What else is legacy good for, other than providing backwards
 compatibility?

Maybe it's me having trouble explaining.

There are two kind of blk devices:

- those that allow scsi passthrough
Such devices can't work properly through the modern interface.
We could add a modern interface but device can not operate through it.

- those that don't allow scsi passthrough
Such devices can't work properly through the modern interface.
We could add a modern interface and we'll get a transitional
device.


I think for 2.4 it's a good idea to avoid enabling modern interface
by default. Therefore, for 2.4 we can keep scsi enabled unless modern
is requested by user. I am also fine with just doing

if (modern  scsi)
exit;


-- 
MST



Re: [Qemu-devel] [Qemu-block] [PATCH] raw-posix.c: Make physical devices usable in QEMU

2015-07-16 Thread Stefan Hajnoczi
On Thu, Jul 09, 2015 at 10:02:26AM -0400, Programmingkid wrote:
 
 On Jul 9, 2015, at 6:52 AM, Stefan Hajnoczi wrote:
 
  On Tue, Jul 07, 2015 at 01:33:23PM -0400, Programmingkid wrote:
  Make physical devices like a USB flash drive or a CDROM drive work in 
  QEMU. With
  this patch I can use a USB flash drive like a hard drive. Before this 
  patch, 
  QEMU would just quit with a message like resource busy.
  
  The commit message and the description are missing on Mac OS X.  It
  should be clear right away that this applies to Mac only.  This works
  fine on Linux and probably other host OSes.
 
 Yeah, that should have been done. Did you see any issues with the code?

QEMU shouldn't silently open a different file than the one given by the
user.  The user should give the exact device file they want.  If there
is magic behavior it needs to be documented, but I don't see a reason
why that's necessary in the case of device files.

QEMU shouldn't mount/unmount the CD-ROM.  atexit(3) doesn't handle
crashes or abort().  Users may be confused to find their CD-ROM
unmounted in those cases and would see this as a bug.  Instead we should
refuse mounted CD-ROMs so the user understands that block-level access
requires them to unmount first.

The strcpy/sprintf usage in this patch is unsafe and can lead to buffer
overflow, for example in the case of generating command-lines.  The
command-line buffer is only MAXPATHLEN so prepending the command to the
filename could exceed the buffer size.

There is also a buffer overflow in the array of devices that need to be
mounted.  What happens if there are more than 7 devices?


pgpzWeEYBeNpv.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH] qemu-char: Fix missed data on unix socket

2015-07-16 Thread Amit Shah
On (Thu) 16 Jul 2015 [10:11:04], Nils Carlson wrote:
 On Thu, 16 Jul 2015, Amit Shah wrote:
 
 On (Wed) 15 Jul 2015 [23:44:52], Nils Carlson wrote:
 On Mon, 13 Jul 2015, Nils Carlson wrote:
 
 On Mon, 13 Jul 2015, Amit Shah wrote:
 
 snip
 
 Also, returning TRUE there isn't right - if the connection ends, we
 should return FALSE.
 
 I agree that this seems reasonable. I will change it and re-test.
 
 
 I had a closer look, and it seems always returning true is intentional here,
 the called function, tcp_chr_disconnect(chr), handles the deregistration
 from handlers. If we were to return FALSE we would be duplicating work and
 possibly breaking things.
 
 Not sure how.
 
 Anyway, can you please start a new thread, with the author and
 reviewers of the patch CC'ed, so they can chime in as well?
 
 The authors and reviewers of which patch exactly?
 
 The fix for windows which broke cloudstack was done in 812c1057.

Yea, the author of this patch.  Also, I meant the return TRUE in this
patch, not the one Paolo added.

 The returning TRUE everywhere was done in commit cdbf6e16 by Paolo Bonzini,
 CC:d above.
 
 I don't think there was anything wrong with either patch, the former one
 simply broken a strange behaviour CloudStack was relying on which I would
 say is dubious (fire and forget messaging.)

There need not be anything wrong with a previous patch; but just
because someone touched the area recently, they might have more
context, and even help in spotting the bug.


Amit



[Qemu-devel] [PATCH 0/4] target-arm: Implement Secure physical timer

2015-07-16 Thread Peter Maydell
We managed to forget to implement the Secure physical timer when
we added TZ support for AArch32. This patchset fixes the omission,
and wires up its interrupt.

This patchset sits on top of
https://git.linaro.org/people/peter.maydell/qemu-arm.git target-arm-post-2.4
which is where I'm putting patches which I've reviewed but which aren't
2.4 material. Currently that's just Edgar's hyp timer series. (Warning,
branch will rebase.)

Since we're now wiring up four timer interrupts in virt and
a15mpcore, I switched them to a data-driven loop for neatness.

Incidentally, the reason that kernels worked and continue to work
even if they're booting Secure is that they cope with timer interrupts
coming in via either the NS timer irq line or the S timer irq line.

The next step is to put the secure-GIC patchset on top of this.
I think we're going to need to make the virt board default to
non-secure at that point, because the QEMU UEFI blob doesn't cope
with being booted Secure. That's probably for the best anyway, since
then it will be the same for TCG and KVM.

thanks
-- PMM

Peter Maydell (4):
  target-arm: Add the AArch64 view of the Secure physical timer
  target-arm: Add AArch32 banked register access to secure physical
timer
  hw/arm/virt: Wire up secure timer interrupt
  hw/cpu/a15mpcore: Wire up hyp and secure physical timer interrupts

 hw/arm/virt.c|  28 +++--
 hw/cpu/a15mpcore.c   |  21 ++
 target-arm/cpu-qom.h |   1 +
 target-arm/cpu.c |   2 +
 target-arm/cpu.h |   3 +-
 target-arm/helper.c  | 114 +++
 6 files changed, 148 insertions(+), 21 deletions(-)

-- 
1.9.1




Re: [Qemu-devel] [PATCH 2/5] virtio-blk: disable scsi passthrough for 1.0 device

2015-07-16 Thread Cornelia Huck
On Wed, 15 Jul 2015 21:51:32 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 On Wed, Jul 15, 2015 at 05:38:53PM +0200, Cornelia Huck wrote:
  On Wed, 15 Jul 2015 17:39:18 +0300
  Michael S. Tsirkin m...@redhat.com wrote:
  
   On Wed, Jul 15, 2015 at 04:30:51PM +0200, Cornelia Huck wrote:
On Wed, 15 Jul 2015 17:11:57 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 Fine, but revision is negotiated way before features are
 probed so why does it make a practical difference?

Legacy drivers (that don't know about the set-revision command) 
will
read features without revision negotiation - we need to offer 
them the
legacy feature set.
   
   Right. So simply do if (revision  1) return features  0x
   and that will do this, will it not?
  
  Not for bits that we want to offer for legacy but not for modern.
 
 I don't think this selective offering works at least for scsi.
 scsi is a backend feature, if you connect a modern device
 in front the device simply does not work.
 It therefore makes no sense to attach a transitional device
 to such a backend.

My point is that we're losing legacy features with that approach, and
it would not be possible to offer them to legacy guests with newer
qemus (at least with ccw).
   
   What's wrong with adding a disable-modern flag, like pci has?
   User can set that to get a legacy device.
  
  The whole idea behind the revision-stuff was that we don't need
  something like disable-modern. If the device is able to figure out on
  its own if it is to act as a modern or a legacy device, why require
  user intervention?
 
 It's about compatibility, e.g. being able to test legacy mode
 in transitional drivers in guests.

Let's step back a bit. Why do we need legacy? To support old hosts and
old guests. So what we need is a test matrix combining old/modern hosts
with old/modern guests. I don't think forcing to an old version is
something we should spend time on except during development.

 Consider also bugs, e.g. the fact that linux guests lack WCE
 in modern mode ATM.

If we consider this a bug, shouldn't VERSION_1 be forced off
unconditionally until specs and codebase are fixed?

 
   
What about the other way around (i.e. scsi is configured, therefore the
device is legacy-only)? We'd only retain the scsi bit if it is actually
wanted by the user's configuration. I would need to enforce a max
revision of 0 for such a device in ccw, and pci could disable modern
for it.
   
   Will have to think about it.
   But I think a flag to disable/enable modern is useful in any case,
   and it seems sufficient.
  
  I don't like the idea of disabling modern or legacy for ccw, where the
  differences between both are very minor.
  
  I also don't think requiring the user to specify a new flag on upgrade
  just to present the same features as before is a good idea: it is
  something that is easily missed and may lead to much headscratching.
 
 And doing this on a driver upgrade won't?  As I said, if you believe
 this feature has value, argue that we shouldn't drop scsi off in virtio
 1.0 then.

I seem to have problems explaining myself :(

I don't care about the feature, except for supporting old guests that
should continue working as before even if the host updated. And without
special configuration on the host, as this is too easily forgotten.
What else is legacy good for, other than providing backwards
compatibility?




Re: [Qemu-devel] [PATCH 0/2] vmxnet3: Fix RX TCP/UDP checksum of partially summed packets

2015-07-16 Thread Stefan Hajnoczi
On Tue, Jul 14, 2015 at 11:55:14AM +0300, Shmulik Ladkani wrote:
 In case csum offloaded packet, vmxnet3 implementation always passes an
 RxCompDesc with the Checksum calculated and found correct notification
 to the OS. This emulates the observed ESXi behavior.
 
 Therefore, if packet has the NEEDS_CSUM bit set, we must calculate and
 place a fully computed checksum into the tcp/udp header. Otherwise, the
 OS driver will receive a checksum-correct indication but with the actual
 tcp/udp checksum field having just the pseudo header csum value.
 
 Fix, by converting partially summed packets to be fully checksummed.
 
 Shmulik Ladkani:
   net/vmxnet3: Refactor 'vmxnet_rx_pkt_attach_data'
 
 Dana Rubin:
   net/vmxnet3: Fix RX TCP/UDP checksum on partially summed packets
 
  hw/net/vmxnet3.c   | 59 
 ++
  hw/net/vmxnet_rx_pkt.c | 12 +++---
  hw/net/vmxnet_rx_pkt.h | 11 ++
  3 files changed, 79 insertions(+), 3 deletions(-)
 
 -- 
 1.9.1
 
 

Thanks, applied to my net tree:
https://github.com/stefanha/qemu/commits/net

Stefan


pgp6clwA4UFHu.pgp
Description: PGP signature


Re: [Qemu-devel] iothread: release iothread around aio_poll causes random hangs at startup

2015-07-16 Thread Paolo Bonzini


On 16/07/2015 13:03, Christian Borntraeger wrote:
 For what its worth, I can no longer reproduce the issue on
 current master + cherry-pick of a0710f7995f (iothread: release iothread 
 around aio_poll)
 
 bisect tells me that
 
 commit 53ec73e264f481b79b52efcadc9ceb8f8996975c
 Author: Fam Zheng f...@redhat.com
 AuthorDate: Fri May 29 18:53:14 2015 +0800
 Commit: Stefan Hajnoczi stefa...@redhat.com
 CommitDate: Tue Jul 7 14:27:14 2015 +0100
 
 block: Use bdrv_drain to replace uncessary bdrv_drain_all
 
 made the problem will blk-null go away. I still dont understand why.

It could be related to the AioContext problem that I'm fixing these
days, too.  Good news, we'll requeue the patch for 2.5.

Paolo



Re: [Qemu-devel] [PATCH v7 15/42] Return path: Source handling of return path

2015-07-16 Thread Dr. David Alan Gilbert
* Amit Shah (amit.s...@redhat.com) wrote:
 On (Tue) 16 Jun 2015 [11:26:28], Dr. David Alan Gilbert (git) wrote:
  From: Dr. David Alan Gilbert dgilb...@redhat.com
  
  Open a return path, and handle messages that are received upon it.
  
  Signed-off-by: Dr. David Alan Gilbert dgilb...@redhat.com
 
 
  -/* migration thread support */
  +/*
  + * Something bad happened to the RP stream, mark an error
  + * The caller shall print something to indicate why
  + */
  +static void source_return_path_bad(MigrationState *s)
 
 Can you rename this to something like
 
 mark_source_rp_bad()
 
 ?
 
 Intent is clearer that way.

Done.

 Also, the comment says caller will print something, but the
 invocations below are a mix of printfs and traces.  Not saying the
 caller has to print always, but maybe only comment needs update.

Yes, I've changed the comment, and changed one of the traces into
an error_report.

Thanks,

Dave

 
  +{
  +s-rp_state.error = true;
  +migrate_fd_cleanup_src_rp(s);
  +}
  +
  +/*
  + * Handles messages sent on the return path towards the source VM
  + *
  + */
  +static void *source_return_path_thread(void *opaque)
  +{
  +MigrationState *ms = opaque;
  +QEMUFile *rp = ms-rp_state.file;
  +uint16_t expected_len, header_len, header_type;
  +const int max_len = 512;
  +uint8_t buf[max_len];
  +uint32_t tmp32;
  +int res;
  +
  +trace_source_return_path_thread_entry();
  +while (rp  !qemu_file_get_error(rp) 
  +migration_already_active(ms)) {
  +trace_source_return_path_thread_loop_top();
  +header_type = qemu_get_be16(rp);
  +header_len = qemu_get_be16(rp);
  +
  +switch (header_type) {
  +case MIG_RP_MSG_SHUT:
  +case MIG_RP_MSG_PONG:
  +expected_len = 4;
  +break;
  +
  +default:
  +error_report(RP: Received invalid message 0x%04x length 
  0x%04x,
  +header_type, header_len);
  +source_return_path_bad(ms);
  +goto out;
  +}
   
  +if (header_len  expected_len) {
  +error_report(RP: Received message 0x%04x with
  +incorrect length %d expecting %d,
  +header_type, header_len,
  +expected_len);
  +source_return_path_bad(ms);
  +goto out;
  +}
  +
  +/* We know we've got a valid header by this point */
  +res = qemu_get_buffer(rp, buf, header_len);
  +if (res != header_len) {
  +trace_source_return_path_thread_failed_read_cmd_data();
  +source_return_path_bad(ms);
  +goto out;
  +}
 
   Amit
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] memory hotplug fails with mem-merge option

2015-07-16 Thread Igor Mammedov
On Thu, 16 Jul 2015 16:00:07 +0530
Bharata B Rao bhar...@linux.vnet.ibm.com wrote:

 On Thu, Jul 16, 2015 at 10:25:09AM +0530, Bharata B Rao wrote:
  Hi,
  
  When I use -machine pc,mem-merge=off|on -m 1G,slots=4,maxmem=2G, adding
  memory-backend-ram object fails like below. Same failure is seen with
  -machine pseries,mem-merge=on|off.
  
  (qemu) object_add memory-backend-ram,id=ram0,size=1G
  
  qemu-system-x86_64: util/qemu-option.c:388: qemu_opt_get_bool_helper: 
  Assertion `opt-desc  opt-desc-type == QEMU_OPT_BOOL' failed.
 
 Same failure seen with -machine pseries,mem-merge=off when used together with
 -object memory-backend-file on QEMU commandline in which case boot time
 failure is observed.
 
 Regards,
 Bharata.

looks like related to 

0a7cf217 util/qemu-config: fix regression of qmp_query_command_line_options
and
49d2e64 machine: remove qemu_machine_opts global list

CCing Marcel in hope he knows how to fix it right away.



Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-16 Thread Richard W.M. Jones
Sorry for the very long delay in replying to this.  I wanted to be
absolutely sure I was reproducing the bug.

Unfortunately I'm only able to reproduce the bug with qemu 2.3.0
(both the version in Fedora Rawhide, or the tagged v2.3.0 from git).

I cannot currently reproduce it at all with upstream qemu from git.

The patches understandably only apply to upstream qemu from git, and
have quite a few tricky conflicts with v2.3.0.

I'll keep trying on this one.  It may be that the window for the
bug to reproduce with qemu.git has got smaller.

Rich.

--

Notes on how I try to reproduce this:

(1) Using Fedora Rawhide aarch64

(2) libguestfs checked out and compiled from git

(3) kraxel's edk2.git-aarch64-0-20150713.b1115.g2ad9cf3.noarch

(4) heisenscsi.pl (attached).

$ export LIBGUESTFS_HV=/path/to/qemu/aarch64-softmmu/qemu-system-aarch64
$ while true; do echo .; ./run ./heisenscsi.pl ; done



-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org


heisenscsi.pl
Description: Perl program


Re: [Qemu-devel] [Bug 1474263] Re: Image format was not specified warning should be suppressed for the vvfat (and probably nbd) driver

2015-07-16 Thread Stefan Hajnoczi
On Wed, Jul 15, 2015 at 10:54:47AM -0600, Eric Blake wrote:
 On 07/15/2015 09:42 AM, Max Reitz wrote:
  Hi,
  
  Indeed using non-raw images should not be used over NBD. The warning
  however is not superfluous, since qemu does indeed probe the image
  format, so a malicious guest might write a qcow2 header into the raw
  image, thus making qemu probe a qcow2 image the next time the same
  configuration is used. The problem would be solved by not making qemu
  probe the image format over NBD, but always assume raw; but I guess this
  will break existing use cases, even though they were wrong from the
  start. Anyway, this is solved by explicitly specifying the image format
  to be raw, which is what the warning says.
 
 I could actually see the use of non-raw over NBD.  We support nested
 protocols (where you can use qcow2-qcow2-file), that is, where a file
 contains a qcow2 file whose contents are themselves a qcow2 image.
 (Perhaps useful in nested guests, where the outer qcow2 layer serves a
 disk to an L0 guest, which in turn uses the inner layer to present a
 disk to an L1 guest).  In such a case, opening just one layer of qcow2
 for service over NBD will expose the inner qcow2 image, and connecting
 qemu as an NBD client with format=raw will directly manipulate the qcow2
 data seen by the L0 guest, while connecting as an NBD client with
 format=qcow2 will see the raw data seen by the L1 guest.
 
 But it's more likely to encounter this scenario with NBD, and not with
 vvfat.

I agree that it's perfectly okay to use non-raw on top of NBD.

We allow image formats on host block devices and iSCSI LUNs.  Why
shouldn't they be allowed on NBD exports?

Stefan


pgp8CmTZLbvHt.pgp
Description: PGP signature


[Qemu-devel] [PATCH 3/4] hw/arm/virt: Wire up secure timer interrupt

2015-07-16 Thread Peter Maydell
Wire up the secure timer interrupt. Since we've defined
that the plain old physical timer is the NS timer, we can
drop the now-out-of-date comment about QEMU not having TZ.

Use a data-driven loop to wire up the timer interrupts, since
we now have four of them and the code is the same for each.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/arm/virt.c | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index aab99f7..95b1a9a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -392,20 +392,22 @@ static void create_gic(VirtBoardInfo *vbi, qemu_irq *pic)
 for (i = 0; i  smp_cpus; i++) {
 DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
 int ppibase = NUM_IRQS + i * GIC_INTERNAL + GIC_NR_SGIS;
-/* physical timer; we wire it up to the non-secure timer's ID,
- * since a real A15 always has TrustZone but QEMU doesn't.
+int irq;
+/* Mapping from the output timer irq lines from the CPU to the
+ * GIC PPI inputs we use for the virt board.
  */
-qdev_connect_gpio_out(cpudev, 0,
-  qdev_get_gpio_in(gicdev,
- ppibase + ARCH_TIMER_NS_EL1_IRQ));
-/* virtual timer */
-qdev_connect_gpio_out(cpudev, 1,
-  qdev_get_gpio_in(gicdev,
-   ppibase + ARCH_TIMER_VIRT_IRQ));
-/* Hypervisor timer.  */
-qdev_connect_gpio_out(cpudev, 2,
-  qdev_get_gpio_in(gicdev,
- ppibase + ARCH_TIMER_NS_EL2_IRQ));
+const int timer_irq[] = {
+[GTIMER_PHYS] = ARCH_TIMER_NS_EL1_IRQ,
+[GTIMER_VIRT] = ARCH_TIMER_VIRT_IRQ,
+[GTIMER_HYP]  = ARCH_TIMER_NS_EL2_IRQ,
+[GTIMER_SEC]  = ARCH_TIMER_S_EL1_IRQ,
+};
+
+for (irq = 0; irq  ARRAY_SIZE(timer_irq); irq++) {
+qdev_connect_gpio_out(cpudev, irq,
+  qdev_get_gpio_in(gicdev,
+   ppibase + timer_irq[irq]));
+}
 
 sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, 
ARM_CPU_IRQ));
 sysbus_connect_irq(gicbusdev, i + smp_cpus,
-- 
1.9.1




[Qemu-devel] [PATCH 1/4] target-arm: Add the AArch64 view of the Secure physical timer

2015-07-16 Thread Peter Maydell
On CPUs with EL3, there are two physical timers, one for Secure and one
for Non-secure. Implement this extra timer and the AArch64 registers
which access it.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/cpu-qom.h |  1 +
 target-arm/cpu.c |  2 ++
 target-arm/cpu.h |  3 +-
 target-arm/helper.c  | 87 
 4 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/target-arm/cpu-qom.h b/target-arm/cpu-qom.h
index 54db337..00c0716 100644
--- a/target-arm/cpu-qom.h
+++ b/target-arm/cpu-qom.h
@@ -225,6 +225,7 @@ int arm_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, 
int reg);
 void arm_gt_ptimer_cb(void *opaque);
 void arm_gt_vtimer_cb(void *opaque);
 void arm_gt_htimer_cb(void *opaque);
+void arm_gt_stimer_cb(void *opaque);
 
 #ifdef TARGET_AARCH64
 int aarch64_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 3525348..fafc3ed 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -455,6 +455,8 @@ static void arm_cpu_initfn(Object *obj)
 arm_gt_vtimer_cb, cpu);
 cpu-gt_timer[GTIMER_HYP] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
 arm_gt_htimer_cb, cpu);
+cpu-gt_timer[GTIMER_SEC] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
+arm_gt_stimer_cb, cpu);
 qdev_init_gpio_out(DEVICE(cpu), cpu-gt_timer_outputs,
ARRAY_SIZE(cpu-gt_timer_outputs));
 #endif
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 7346c5f..4a6cd51 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -114,7 +114,8 @@ typedef struct ARMGenericTimer {
 #define GTIMER_PHYS 0
 #define GTIMER_VIRT 1
 #define GTIMER_HYP  2
-#define NUM_GTIMERS 3
+#define GTIMER_SEC  3
+#define NUM_GTIMERS 4
 
 typedef struct {
 uint64_t raw_tcr;
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 4a7dd24..d31b946 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -1214,6 +1214,32 @@ static CPAccessResult gt_vtimer_access(CPUARMState *env, 
const ARMCPRegInfo *ri)
 return gt_timer_access(env, GTIMER_VIRT);
 }
 
+static CPAccessResult gt_stimer_access(CPUARMState *env,
+   const ARMCPRegInfo *ri)
+{
+/* The AArch64 register view of the secure physical timer is
+ * always accessible from EL3, and configurably accessible from
+ * Secure EL1.
+ */
+switch (arm_current_el(env)) {
+case 1:
+if (!arm_is_secure(env)) {
+return CP_ACCESS_TRAP;
+}
+if (!(env-cp15.scr_el3  SCR_ST)) {
+return CP_ACCESS_TRAP_EL3;
+}
+return CP_ACCESS_OK;
+case 0:
+case 2:
+return CP_ACCESS_TRAP;
+case 3:
+return CP_ACCESS_OK;
+default:
+g_assert_not_reached();
+}
+}
+
 static uint64_t gt_get_countervalue(CPUARMState *env)
 {
 return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) / GTIMER_SCALE;
@@ -1420,6 +1446,34 @@ static void gt_hyp_ctl_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 gt_ctl_write(env, ri, GTIMER_HYP, value);
 }
 
+static void gt_sec_timer_reset(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+gt_timer_reset(env, ri, GTIMER_SEC);
+}
+
+static void gt_sec_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+  uint64_t value)
+{
+gt_cval_write(env, ri, GTIMER_SEC, value);
+}
+
+static uint64_t gt_sec_tval_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+return gt_tval_read(env, ri, GTIMER_SEC);
+}
+
+static void gt_sec_tval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+  uint64_t value)
+{
+gt_tval_write(env, ri, GTIMER_SEC, value);
+}
+
+static void gt_sec_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
+  uint64_t value)
+{
+gt_ctl_write(env, ri, GTIMER_SEC, value);
+}
+
 void arm_gt_ptimer_cb(void *opaque)
 {
 ARMCPU *cpu = opaque;
@@ -1441,6 +1495,13 @@ void arm_gt_htimer_cb(void *opaque)
 gt_recalc_timer(cpu, GTIMER_HYP);
 }
 
+void arm_gt_stimer_cb(void *opaque)
+{
+ARMCPU *cpu = opaque;
+
+gt_recalc_timer(cpu, GTIMER_SEC);
+}
+
 static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
 /* Note that CNTFRQ is purely reads-as-written for the benefit
  * of software; writing it doesn't actually change the timer frequency.
@@ -1570,6 +1631,32 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
   .resetvalue = 0, .accessfn = gt_vtimer_access,
   .writefn = gt_virt_cval_write, .raw_writefn = raw_write,
 },
+/* Secure timer -- this is actually restricted to only EL3
+ * and configurably Secure-EL1 via the accessfn.
+ */
+{ .name = CNTPS_TVAL_EL1, .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 7, .crn = 14, .crm = 2, .opc2 = 0,
+  .type = ARM_CP_NO_RAW | ARM_CP_IO, 

[Qemu-devel] [PATCH 4/4] hw/cpu/a15mpcore: Wire up hyp and secure physical timer interrupts

2015-07-16 Thread Peter Maydell
Since we now support both the hypervisor and the secure physical timer, wire
their interrupt lines up in the a15mpcore wrapper object.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/cpu/a15mpcore.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/hw/cpu/a15mpcore.c b/hw/cpu/a15mpcore.c
index acc419e..49727d0 100644
--- a/hw/cpu/a15mpcore.c
+++ b/hw/cpu/a15mpcore.c
@@ -79,14 +79,21 @@ static void a15mp_priv_realize(DeviceState *dev, Error 
**errp)
 for (i = 0; i  s-num_cpu; i++) {
 DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
 int ppibase = s-num_irq - 32 + i * 32;
-/* physical timer; we wire it up to the non-secure timer's ID,
- * since a real A15 always has TrustZone but QEMU doesn't.
+int irq;
+/* Mapping from the output timer irq lines from the CPU to the
+ * GIC PPI inputs used on the A15:
  */
-qdev_connect_gpio_out(cpudev, 0,
-  qdev_get_gpio_in(gicdev, ppibase + 30));
-/* virtual timer */
-qdev_connect_gpio_out(cpudev, 1,
-  qdev_get_gpio_in(gicdev, ppibase + 27));
+const int timer_irq[] = {
+[GTIMER_PHYS] = 30,
+[GTIMER_VIRT] = 27,
+[GTIMER_HYP]  = 26,
+[GTIMER_SEC]  = 29,
+};
+for (irq = 0; irq  ARRAY_SIZE(timer_irq); irq++) {
+qdev_connect_gpio_out(cpudev, irq,
+  qdev_get_gpio_in(gicdev,
+   ppibase + timer_irq[irq]));
+}
 }
 
 /* Memory map (addresses are offsets from PERIPHBASE):
-- 
1.9.1




Re: [Qemu-devel] [virtio guest] vring_need_event() from virtqueue_kick_prepare()

2015-07-16 Thread Stefan Hajnoczi
On Thu, Jul 09, 2015 at 01:45:50PM +0300, Catalin Vasile wrote:
 I have managed to deal with it.
 The thing is I was using one of the latest versions of qemu, but an
 old Linux Kernel version of 3.12.

Old guest kernel or old host kernel?

New QEMU should support old guest kernels.  Maybe you have found a bug?


pgpAhWoBaNKMQ.pgp
Description: PGP signature


Re: [Qemu-devel] write operations during read

2015-07-16 Thread Stefan Hajnoczi
On Wed, Jul 15, 2015 at 02:04:17PM +0300, Gleb Stepanov wrote:
 Hello, all.

This is a duplicate.  Please see my reply to your previous email.


pgpngLK6Z9I3F.pgp
Description: PGP signature


Re: [Qemu-devel] RFC: guest-side retrieval of fw_cfg file

2015-07-16 Thread Igor Mammedov
On Thu, 16 Jul 2015 13:25:56 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 On Thu, Jul 16, 2015 at 11:50:38AM +0200, Igor Mammedov wrote:
  On Thu, 16 Jul 2015 10:30:41 +0300
  Michael S. Tsirkin m...@redhat.com wrote:
  
   On Thu, Jul 16, 2015 at 08:57:36AM +0200, Igor Mammedov wrote:
On Wed, 15 Jul 2015 19:24:33 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 On Wed, Jul 15, 2015 at 06:01:42PM +0200, Igor Mammedov wrote:
  On Wed, 15 Jul 2015 17:08:27 +0300
  Michael S. Tsirkin m...@redhat.com wrote:
  
   On Mon, Jul 13, 2015 at 04:09:37PM -0400, Gabriel L. Somlo wrote:
Hi,

A while ago I was pondering on the options available for 
retrieving
a fw_cfg blob from the guest-side (now that we can insert fw_cfg
files on the host-side command line, see commit 81b2b8106).

So over the last couple of weekends I cooked up the sysfs kernel
module below, which lists all fw_cfg files
under /sys/firmware/fw_cfg/filename.
   
   One concern here is that there will be a conflict here if fw cfg
   is used by ACPI. I don't know whether that last is a good idea
   though, so maybe not a real concern. I think Igor
   wanted to make it so.
  
  I don't see any conflict here so far, it's just guest side module 
  that
  accesses fw_cfg.
 
 If there's ACPI code that accesses fw_cfg in response to an interrupt,
 it'll race with fw cfg access by guest OS. On linux we might be able 
 to
 block ACPI preventing it from running. We probably won't be able to do
 it on windows.
wrt vmgenid series we were talking about possibility to access fw_cfg
from ACPI device.init so it's unlikely that it will ever collide with
much later sysfs accesses.
   
   Don't we need to get updates when it changes too?
  I've thought that it's pretty static and doesn't change.
 
 vm gen id? No, its dynamic.
nope, I've mean address of buffer, not UUID itself.
 
   
But if ACPI will start accessing fw_cfg from
other methods it will race for sure.
   
   Maybe we shouldn't poke at fw cfg in ACPI then.
  Yep, maybe we shouldn't.
  
  The last vmgenid series just maps UUID region
  at fixed location (which is sufficient to build ACPI
  tables at machine done time) so there isn't real need
  to export address via fw_cfg.
  
 
 Other versions just write into guest memory from QEMU,
 that will work fine too.
yep they do,
except of that they are more complex on ACPI side and
it's more difficult to debug. I like the last revision of
series for simplicity of design and behavior.




[Qemu-devel] [PATCH 2/4] target-arm: Add AArch32 banked register access to secure physical timer

2015-07-16 Thread Peter Maydell
If EL3 is AArch32, then the secure physical timer is accessed via
banking of the registers used for the non-secure physical timer.
Implement this banking.

Note that the access controls for the AArch32 banked registers
remain the same as the physical-timer checks; they are not the
same as the controls on the AArch64 secure timer registers.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/helper.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index d31b946..6fd5d93 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -1527,12 +1527,22 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
 },
 /* per-timer control */
 { .name = CNTP_CTL, .cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 1,
+  .secure = ARM_CP_SECSTATE_NS,
   .type = ARM_CP_IO | ARM_CP_ALIAS, .access = PL1_RW | PL0_R,
   .accessfn = gt_ptimer_access,
   .fieldoffset = offsetoflow32(CPUARMState,
cp15.c14_timer[GTIMER_PHYS].ctl),
   .writefn = gt_phys_ctl_write, .raw_writefn = raw_write,
 },
+{ .name = CNTP_CTL(S),
+  .cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 1,
+  .secure = ARM_CP_SECSTATE_S,
+  .type = ARM_CP_IO | ARM_CP_ALIAS, .access = PL1_RW | PL0_R,
+  .accessfn = gt_ptimer_access,
+  .fieldoffset = offsetoflow32(CPUARMState,
+   cp15.c14_timer[GTIMER_SEC].ctl),
+  .writefn = gt_sec_ctl_write, .raw_writefn = raw_write,
+},
 { .name = CNTP_CTL_EL0, .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 2, .opc2 = 1,
   .type = ARM_CP_IO, .access = PL1_RW | PL0_R,
@@ -1558,10 +1568,18 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
 },
 /* TimerValue views: a 32 bit downcounting view of the underlying state */
 { .name = CNTP_TVAL, .cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 0,
+  .secure = ARM_CP_SECSTATE_NS,
   .type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL1_RW | PL0_R,
   .accessfn = gt_ptimer_access,
   .readfn = gt_phys_tval_read, .writefn = gt_phys_tval_write,
 },
+{ .name = CNTP_TVAL(S),
+  .cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 0,
+  .secure = ARM_CP_SECSTATE_S,
+  .type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL1_RW | PL0_R,
+  .accessfn = gt_ptimer_access,
+  .readfn = gt_sec_tval_read, .writefn = gt_sec_tval_write,
+},
 { .name = CNTP_TVAL_EL0, .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 2, .opc2 = 0,
   .type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL1_RW | PL0_R,
@@ -1602,12 +1620,21 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
 },
 /* Comparison value, indicating when the timer goes off */
 { .name = CNTP_CVAL, .cp = 15, .crm = 14, .opc1 = 2,
+  .secure = ARM_CP_SECSTATE_NS,
   .access = PL1_RW | PL0_R,
   .type = ARM_CP_64BIT | ARM_CP_IO | ARM_CP_ALIAS,
   .fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_PHYS].cval),
   .accessfn = gt_ptimer_access,
   .writefn = gt_phys_cval_write, .raw_writefn = raw_write,
 },
+{ .name = CNTP_CVAL(S), .cp = 15, .crm = 14, .opc1 = 2,
+  .secure = ARM_CP_SECSTATE_S,
+  .access = PL1_RW | PL0_R,
+  .type = ARM_CP_64BIT | ARM_CP_IO | ARM_CP_ALIAS,
+  .fieldoffset = offsetof(CPUARMState, cp15.c14_timer[GTIMER_SEC].cval),
+  .accessfn = gt_ptimer_access,
+  .writefn = gt_sec_cval_write, .raw_writefn = raw_write,
+},
 { .name = CNTP_CVAL_EL0, .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 3, .crn = 14, .crm = 2, .opc2 = 2,
   .access = PL1_RW | PL0_R,
-- 
1.9.1




Re: [Qemu-devel] [Qemu-block] [PATCH 1/3] blockjob: Introduce block_job_relax_cpu

2015-07-16 Thread Stefan Hajnoczi
On Wed, Jul 15, 2015 at 06:32:54PM +0800, Fam Zheng wrote:
 On Tue, 07/14 13:31, Stefan Hajnoczi wrote:
  On Fri, Jul 10, 2015 at 05:42:48AM +0200, Alexandre DERUMIER wrote:
   By the way, why did you choose 10 milliseconds?  That is quite long.
   
   If this function is called once per 10 ms disk I/O operations then we
   lose 50% utilization.  1 ms or less would be reasonable.
   
   From my tests, 1ms is not enough, It still hanging in guest or qmp 
   queries.
   10ms give me optimal balance between bitmap scan speed and guest 
   responsiveness.
  
  Then I don't fully understand the bug.
  
  Fam: can you explain why 1ms isn't enough?
 
 In Alexandre's case, I suppose it's because the lseek is so slow that sleeping
 for 1ms would still let mirror coroutine to occupy, say, 90% of CPU time, so
 guest IO stutters. Perhaps we could move lseek to thread pool in the future.

Right, that's the real problem here.  If lseek is done in a worker
thread than the coroutine yields in the meantime and the responsiveness
problem is solved.

This sounds like an important fix in the early 2.5 release cycle.


pgpWzrqa3oBVa.pgp
Description: PGP signature


Re: [Qemu-devel] Write requests, during read operations

2015-07-16 Thread Stefan Hajnoczi
On Tue, Jul 14, 2015 at 05:39:13PM +0300, Gleb Stepanov wrote:
 I've measured performance on qcow2 image mounted to nb0 device.

nb0 device?

 I run tests and
 launch iostat -x 1 command to browse I/O operations. Test consists of
 sequential read operations with 1 MB size. I've get following output.
 
 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s
 sda   0.00 5.000.00  677.00 0.00 22812.00
 sdb   0.00 0.000.00  545.00 0.00 0.00
 dm-0  0.00 0.000.00  674.00 0.00 22784.00
 dm-1  0.00 0.000.007.00 0.0028.00
 dm-2  0.00 0.000.000.00 0.00 0.00
 nb0   0.00 0.00  432.00  182.00 55296.00 23296.00
 
 So, reading produce a lot of write operations. What is the reason why
 these write operations occured?
 How to determina block size that qemu operates , because if i've
 writtien 55 mb with 1 mb blocks
 i should get 55 operations instead of 432.

Please post details of the benchmark:
 * fio job file and fio command-line
 * commands used to launch QEMU or qemu-nbd


pgpmdxYy8OZh_.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH 3/3] PAM: make PAM emulation closer to documentation

2015-07-16 Thread Ефимов Василий

16.07.2015 14:10, Paolo Bonzini wrote:


 On 16/07/2015 12:51, Ефимов Василий wrote:
 The rest of code looks up destination or source region or child region
 offset in memory sub-tree which root is PCI or RAM region provided on
 PAM creation. We cannon use common address_space_translate because it
 searches from address space root and will return current PAM region.
 To summarize, I suggest to move the code to exec.c. It is generic
 enough.

 All these mechanism are extremely low level.  They are encapsulated
 within exec.c, and copying code to pam.c is not a good idea because you
 already have all the AddressSpaces and RAM MemoryRegions you need.
The core problem is there is no region type which can redirects access
depending on whether it is read or write. The access type driven alias
could be perfect. But it is quite difficult to invent such type of
alias (or to generalize existing). The main problem is rendering memory
tree to FlatView.

 Could you use an IOMMU memory region instead?  Then a single region can
 be used to implement all four modes, and you don't hit the trying to
 execute code outside RAM or RAM.
 Did you mean MemoryRegion.iommu_ops ? The feature does not allow to
 change destination memory region.

 It does.  You're right about this:

 exec.c: address_space_translate_for_iotlb:
  assert(!section-mr-iommu_ops);

 ... but an IOMMU region is not needed, and I think you can do everything
 without touching exec.c at all.

 +/* Read from RAM and write to PCI */
 +memory_region_init_io(pam-region[1], OBJECT(dev), pam_ops, pam,
 +pam-r-ram-w-pci, size);

 This can be done with memory_region_set_readonly on the RAM region.  You
 need to set mr-ops in order to point to pam_ops; for a first proof of
 concept you can just set the field directly.
The idea is to read directly from system RAM region and to write
to PCI using I/O (because I do not see another way to emulate
access type driven redirection with existent API). If we create RAM
and make it read only then new useless RAM block will be created.
It is useless because of ram_addr of new region will be set to
one within system RAM block. Hence, cleaner way is to create I/O region.

 Writes to the PCI memory space can use the PCI address space, with
 address_space_st*.
There is no PCI AddressSpace (only MemoryRegion). But
address_space_st* requires AddressSpace as argument.

 +/* Read from PCI and write to RAM */
 +memory_region_init_io(pam-region[2], OBJECT(dev), pam_ops, pam,
 +pam-r-pci-w-ram, size);

 Here you cannot run code from ROM, so it can be a pure MMIO region.
 Reads can use address_space_ld*, while writes can use
 memory_region_get_ram_ptr.

Even in this mode it is possible for code to be executed from ROM. This
can happen when particular PCI address is within ROM device connected
to PCI bus.

I do not have examples where this happens in mode 2, but in mode 0
(where reads are also directed to PCI) this happens at startup time
when BIOS is executed from ROM device on PCI. The path in memory tree
is system - pam-pci - pci - isa-bios - pc.bios where pc.bios is ROM
and pam-pci is alias.

If this happens when PAM is in mode 2 then path should become
system - pam-r-pci-w-ram where pam-r-pci-w-ram is ROM which ram_addr
points to RAM block of pc.bios. Write handler of pam-r-pci-w-ram
performs access to RAM block of pc.ram.

Implemented mechanism performs search for leaf region. And it supports
cases when leaf is I/O too. In this case pam-r-pci-w-ram (and
pam-r-ram-w-pci) becomes clear I/O and both its handlers
redirects access to corresponding region.

 Paolo


Vasily




Re: [Qemu-devel] [RFC PATCH qemu v3 4/4] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering)

2015-07-16 Thread Alex Williamson
On Thu, 2015-07-16 at 15:11 +1000, David Gibson wrote:
 On Tue, Jul 14, 2015 at 10:21:54PM +1000, Alexey Kardashevskiy wrote:
  This makes use of the new memory registering feature. The idea is
  to provide the userspace ability to notify the host kernel about pages
  which are going to be used for DMA. Having this information, the host
  kernel can pin them all once per user process, do locked pages
  accounting (once) and not spent time on doing that in real time with
  possible failures which cannot be handled nicely in some cases.
  
  This adds a guest RAM memory listener which notifies a VFIO container
  about memory which needs to be pinned/unpinned. VFIO MMIO regions
  (i.e. skip dump regions) are skipped.
  
  The feature is only enabled for SPAPR IOMMU v2. The host kernel changes
  are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does
  not call it when v2 is detected and enabled.
  
  This does not change the guest visible interface.
  
  Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 
 I've looked at this in more depth now, and attempting to unify the
 pre-reg and mapping listeners like this can't work - they need to be
 listening on different address spaces:  mapping actions need to be
 listening on the PCI address space, whereas the pre-reg needs to be
 listening on address_space_memory.  For x86 - for now - those end up
 being the same thing, but on Power they're not.
 
 We do need to be clear about what differences are due to the presence
 of a guest IOMMU versus which are due to arch or underlying IOMMU
 type.  For now Power has a guest IOMMU and x86 doesn't, but that could
 well change in future: we could well implement the guest side IOMMU
 for x86 in future (or x86 could invent a paravirt IOMMU interface).
 On the other side, BenH's experimental powernv machine type could
 introduce Power machines without a guest side IOMMU (or at least an
 optional guest side IOMMU).
 
 The quick and dirty approach here is:
1. Leave the main listener as is
2. Add a new pre-reg notifier to the spapr iommu specific code,
   which listens on address_space_memory, *not* the PCI space

It is dirty and that's exactly what I've been advising Alexey against
because we have entirely too much dirty spapr specific code that doesn't
need to be spapr specific.  I don't see why separate address space
matters, that's done at the point of registering the listener and so far
doesn't play much a role in the actual listener behavior, just which
regions it sees.
 
 The more generally correct approach, which allows for more complex
 IOMMU arrangements and the possibility of new IOMMU types with pre-reg
 is:
1. Have the core implement both a mapping listener and a pre-reg
   listener (optionally enabled by a per-iommu-type flag).
   Basically the first one sees what *is* mapped, the second sees
   what *could* be mapped.

This just sounds like different address spaces, address_space_memory vs
address_space_physical_memory.

 
2. As now, the mapping listener listens on PCI address space, if
   RAM blocks are added, immediately map them into the host IOMMU,
   if guest IOMMU blocks appear register a notifier which will
   mirror guest IOMMU mappings to the host IOMMU (this is what we
   do now).

Right, this is done now, nothing new required.

3. The pre-reg listener also listens on the PCI address space.  RAM
   blocks added are pre-registered immediately.  But, if guest
   IOMMU blocks are added, instead of registering a guest-iommu
   notifier, we register another listener on the *target* AS of the
   guest IOMMU, same callbacks as this one.  In practice that
   target AS will almost always resolve to address_space_memory,
   but this can at least in theory handle crazy guest setups with
   multiple layers of IOMMU.
 
4. Have to ensure that the pre-reg callbacks always happen before
   the mapping calls.  For a system with an IOMMU backend which
   requires pre-registration, but doesn't have a guest IOMMU, we
   need to pre-reg, then host-iommu-map RAM blocks that appear in
   PCI address space.

Both of these just seem like details of the iommu-type (pre-reg)
specific parts of the listener, I'm not understanding what's
fundamentally different about it that we can't do it now, within a
single listener, nor do I see how it's all that different from x86.  x86
mappings are more dynamic, but that's largely in legacy space for x86
due to some bizarre compatibility things.  x86 also tries to map mmio
regions for p2p, but these are just minor details to mappings that are
largely the same.  Thanks,

Alex




  1   2   3   >