date:20130506

Re: [Qemu-devel] [PATCH v2 0/6] proposal to make hostmem listener RAM unplug safe

2013-05-06 Thread Paolo Bonzini

On 06/05/2013 03:42, liu ping fan wrote:
 On Sat, May 4, 2013 at 5:53 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 On 03/05/2013 04:45, Liu Ping Fan wrote:
 v1-v2:
   1. Split the RCU prepared-style update and the RAM-Device refcnt
 monitoring into two patches (patches 2 and 4).
   2. Introduce AddrSpaceMem, which is similar to HostMem but based on an
 address space, while the original HostMem only serves the system memory
 address space.

 This looks suspiciously similar to FlatView, doesn't it?

 FlatView is used for all the listeners, including MMIO dispatching,
 which maps from hwaddr to DeviceState in order to dispatch accesses.
 Here, by contrast, we map from hwaddr to hva.
 
 Perhaps the right thing to do is to add the appropriate locking and
 RCU-style updating to address_space_update_topology and
 
 An RCU implementation is tied to the data structure it protects, and each
 listener has its own local table, so I think it is more reasonable to
 implement them separately.

I mentioned address_space_update_topology simply because it is where the
FlatView is replaced.

 memory_region_find.   (And replacing flatview_destroy with ref/unref
 similar to HostMem in your patch 2).  Then just switch dataplane to use
 memory_region_find...

 In fact, I think the HostMem listener can be a substitute for
 cpu_physical_memory_map(); the main issue may be migration
 support.  But before getting into big patches, I hope to have this smaller
 and simpler one.

I think replacing HostMem with FlatView is a smaller patch than these
ones.  I'll try to make a prototype.

Paolo
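
The ref/unref idea discussed above (dropping a reference to the old table instead of destroying it, so a reader can keep using it across a topology update) can be sketched roughly as follows. This is a single-threaded illustration only: View, view_ref(), view_unref(), view_get_current() and view_publish() are hypothetical names, not QEMU APIs, and a real version would still need the locking or RCU discussed in this thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a FlatView-like lookup table (not QEMU's type). */
typedef struct View {
    int refcount;
    int generation;   /* which topology update produced this view */
} View;

static View *current_view;   /* the currently published view */
static int views_freed;      /* bookkeeping for this sketch only */

static View *view_new(int generation)
{
    View *v = malloc(sizeof(*v));

    assert(v != NULL);
    v->refcount = 1;          /* reference owned by the publisher */
    v->generation = generation;
    return v;
}

static void view_ref(View *v)
{
    v->refcount++;
}

static void view_unref(View *v)
{
    if (--v->refcount == 0) {
        free(v);
        views_freed++;
    }
}

/* Reader side: take a reference so the view outlives a later update. */
static View *view_get_current(void)
{
    view_ref(current_view);
    return current_view;
}

/* Updater side: swap in the new view, then drop the publisher's reference
 * to the old one instead of destroying it outright. */
static void view_publish(View *next)
{
    View *old = current_view;

    current_view = next;
    if (old) {
        view_unref(old);
    }
}
```

With this shape, the old view is freed only when the last reader drops its reference, which is the property the unplug-safety discussion is after.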

Re: [Qemu-devel] [PATCH v3 2/4] Add i.MX I2C controller emulator

2013-05-06 Thread Jean-Christophe DUBOIS


On 05/06/2013 04:19 AM, Peter Crosthwaite wrote:

Hi JC,

On Mon, May 6, 2013 at 12:28 AM, Jean-Christophe DUBOIS
j...@tribudubois.net wrote:

The slave mode is not implemented.

Changes since v1:
 * use QOM cast
 * run checkpatch on code
 * added restriction on MemoryRegionOps
 * use DeviceClass::realize as init function

Changes since v2:
 * use CamelCase for state type
 * use extract32() for bit manipulation.
 * improve QOM cast
 * separate regs definition in its own file (to be reused by qtest)


Per-patch change logs need to go below the line ...


Signed-off-by: Jean-Christophe DUBOIS j...@tribudubois.net
---

... here. Otherwise it will go into the mainline git log on merge.

The edit can be done by hand before running git send-email. Personally,
though, I use a scripted approach to move my changelogs out of the commit
message before sending (my working tree has commits formatted similarly to
what you have sent here). Others using the manual flow tend to just
use cover-letter changelogs, I think.

OK



  hw/i2c/Makefile.objs  |   1 +
  hw/i2c/imx_i2c.c  | 352 ++
  hw/i2c/imx_i2c_regs.h |  63 +
  3 files changed, 416 insertions(+)
  create mode 100644 hw/i2c/imx_i2c.c
  create mode 100644 hw/i2c/imx_i2c_regs.h

diff --git a/hw/i2c/Makefile.objs b/hw/i2c/Makefile.objs
index 648278e..d27bbaa 100644
--- a/hw/i2c/Makefile.objs
+++ b/hw/i2c/Makefile.objs
@@ -4,4 +4,5 @@ common-obj-$(CONFIG_ACPI) += smbus_ich9.o
  common-obj-$(CONFIG_APM) += pm_smbus.o
  common-obj-$(CONFIG_BITBANG_I2C) += bitbang_i2c.o
  common-obj-$(CONFIG_EXYNOS4) += exynos4210_i2c.o
+common-obj-$(CONFIG_IMX_I2C) += imx_i2c.o
  obj-$(CONFIG_OMAP) += omap_i2c.o
diff --git a/hw/i2c/imx_i2c.c b/hw/i2c/imx_i2c.c
new file mode 100644
index 000..9debcce
--- /dev/null
+++ b/hw/i2c/imx_i2c.c
@@ -0,0 +1,352 @@
+/*
+ *  i.MX I2C Bus Serial Interface Emulation
+ *
+ *  Copyright (C) 2013 Jean-Christophe Dubois.
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see http://www.gnu.org/licenses/.
+ *
+ */
+
+#include "qemu/bitops.h"
+
+#include "hw/sysbus.h"
+#include "hw/i2c/i2c.h"
+
+#include "hw/i2c/imx_i2c_regs.h"
+
+#ifndef IMX_I2C_DEBUG
+#define IMX_I2C_DEBUG 0
+#endif
+
+#if IMX_I2C_DEBUG
+#define DPRINT(fmt, ...) \
+    do { fprintf(stderr, "imx_i2c[%s]: " fmt, __func__, ## __VA_ARGS__); \
+       } while (0)
+
+static const char *imx_i2c_get_regname(unsigned offset)
+{
+    switch (offset) {
+    case IADR_ADDR:
+        return "IADR";
+    case IFDR_ADDR:
+        return "IFDR";
+    case I2CR_ADDR:
+        return "I2CR";
+    case I2SR_ADDR:
+        return "I2SR";
+    case I2DR_ADDR:
+        return "I2DR";
+    default:
+        return "[?]";
+    }
+}
+#else
+#define DPRINT(fmt, args...)  do { } while (0)
+#endif
+
+#define TYPE_IMX_I2C  "imx.i2c"
+#define IMX_I2C(obj)  \
+    OBJECT_CHECK(IMXI2CState, (obj), TYPE_IMX_I2C)
+
+typedef struct IMXI2CState {
+    SysBusDevice parent_obj;
+
+    MemoryRegion iomem;
+    i2c_bus *bus;
+    qemu_irq irq;
+
+    uint16_t  address;
+
+    uint16_t iadr;
+    uint16_t ifdr;
+    uint16_t i2cr;
+    uint16_t i2sr;
+    uint16_t i2dr_read;
+    uint16_t i2dr_write;
+} IMXI2CState;
+
+static inline bool imx_i2c_is_enabled(IMXI2CState *s)
+{
+    return s->i2cr & I2CR_IEN;
+}
+
+static inline bool imx_i2c_interrupt_is_enabled(IMXI2CState *s)
+{
+    return s->i2cr & I2CR_IIEN;
+}
+
+static inline bool imx_i2c_is_master(IMXI2CState *s)
+{
+    return s->i2cr & I2CR_MSTA;
+}
+
+static inline bool imx_i2c_direction_is_tx(IMXI2CState *s)
+{
+    return s->i2cr & I2CR_MTX;
+}
+
+static void imx_i2c_reset(DeviceState *dev)
+{
+    IMXI2CState *s = IMX_I2C(dev);
+
+    if (s->address != ADDR_RESET) {
+        i2c_end_transfer(s->bus);
+    }

I don't think this is right, unless your device has actual logic that
cleans up the I2C bus on hard reset (which would be very strange). The
I2C bus should be responsible for its own reset (if one is needed
at all). As an I2C bus is stateless, cleanup of an in-flight
transaction should happen naturally when a reset happens on both the master
and slave side. Did this manifest for you as a bug at any stage? If so,
I think it's a bug in the I2C framework and worth an RFC.


No, I was trying to be extra cautious and leave a clean state. I'll 
remove the i2c_end_transfer().
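
As a rough sketch of the outcome agreed above, a reset handler that restores only the controller's own registers and never calls into the I2C bus could look like this. The struct, the function name and the reset constants are placeholders assumed for illustration; the real values belong in imx_i2c_regs.h and are not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder reset values, assumed for illustration only. */
#define SKETCH_ADDR_RESET 0xFF00
#define SKETCH_I2SR_RESET 0x0081

/* Hypothetical mirror of the register fields in IMXI2CState. */
typedef struct SketchI2CState {
    uint16_t address;
    uint16_t iadr;
    uint16_t ifdr;
    uint16_t i2cr;
    uint16_t i2sr;
    uint16_t i2dr_read;
    uint16_t i2dr_write;
} SketchI2CState;

/* Reset touches device-local state only; no bus transfer is ended here,
 * since the bus itself is stateless from the controller's point of view. */
static void sketch_i2c_reset(SketchI2CState *s)
{
    s->address = SKETCH_ADDR_RESET;
    s->iadr = 0;
    s->ifdr = 0;
    s->i2cr = 0;
    s->i2sr = SKETCH_I2SR_RESET;
    s->i2dr_read = 0;
    s->i2dr_write = 0;
}
```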

+
+

Re: [Qemu-devel] [uq/master PATCH] kvmvapic: add ioport read accessor

2013-05-06 Thread Jan Kiszka

On 2013-05-05 22:51, Marcelo Tosatti wrote:
 
 Necessary since memory region accessor assumes read and write
 methods are registered. Otherwise reading I/O port 0x7e segfaults.
 
 https://bugzilla.redhat.com/show_bug.cgi?id=954306
 
 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
 
 diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
 index 5b558aa..655483b 100644
 --- a/hw/i386/kvmvapic.c
 +++ b/hw/i386/kvmvapic.c
 @@ -687,8 +687,14 @@ static void vapic_write(void *opaque, hwaddr addr, 
 uint64_t data,
  }
  }
  
 +static uint64_t vapic_read(void *opaque, hwaddr addr, unsigned size)
 +{
 +return 0x;
 +}
 +
  static const MemoryRegionOps vapic_ops = {
  .write = vapic_write,
 +.read = vapic_read,
  .endianness = DEVICE_NATIVE_ENDIAN,
  };
  
 

Right. I'm just wondering why the guest reads from that port.

Reviewed-by: Jan Kiszka jan.kis...@siemens.com

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
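
To illustrate why the missing accessor crashes, here is a minimal sketch of an ops table whose dispatcher calls the read callback unconditionally, as the commit message describes. SketchOps and the sketch_* functions are hypothetical and much simpler than MemoryRegionOps, and the all-ones return value is an assumption of this sketch, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a MemoryRegionOps-style table. */
typedef struct SketchOps {
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned size);
    void (*write)(void *opaque, uint64_t addr, uint64_t data, unsigned size);
} SketchOps;

/* The dispatcher assumes ->read is registered; with a NULL callback this
 * call would dereference a null function pointer. */
static uint64_t sketch_dispatch_read(const SketchOps *ops, void *opaque,
                                     uint64_t addr, unsigned size)
{
    return ops->read(opaque, addr, size);
}

static void sketch_write(void *opaque, uint64_t addr, uint64_t data,
                         unsigned size)
{
    (void)opaque;
    (void)addr;
    (void)data;
    (void)size;
}

/* Stub read for a write-only port: all bits set for the access width
 * (1 to 8 bytes). The value is an assumption made for this sketch. */
static uint64_t sketch_read(void *opaque, uint64_t addr, unsigned size)
{
    (void)opaque;
    (void)addr;
    return size >= 8 ? UINT64_MAX : (UINT64_C(1) << (size * 8)) - 1;
}

static const SketchOps sketch_ops = {
    .read = sketch_read,
    .write = sketch_write,
};
```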

[Qemu-devel] [PATCH v3 0/5] KVM flash memory support

2013-05-06 Thread Jordan Justen

git://github.com/jljusten/qemu.git kvm-flash-v3

Utilize KVM_CAP_READONLY_MEM to support PC system flash emulation
with KVM.

v3:
 * Squash patches 2 & 3 based on Xiao's feedback that what I
   was calling a 'workaround' in patch 3 was actually what
   is required by the KVM READONLY memory support.

v2:
 * Remove rom_only from PC_COMPAT_1_4
 * Only enable flash when a pflash drive is created.

Jordan Justen (5):
  kvm: add kvm_readonly_mem_enabled
  kvm: support using KVM_MEM_READONLY flag for readonly regions
  pflash_cfi01: memory region should be set to enable readonly mode
  pc_sysfw: allow flash (-pflash) memory to be used with KVM
  pc_sysfw: change rom_only default to 0

 hw/block/pc_sysfw.c |   52 +--
 hw/block/pflash_cfi01.c |2 ++
 include/hw/i386/pc.h|4 
 include/sysemu/kvm.h|   10 +
 kvm-all.c   |   42 ++
 kvm-stub.c  |1 +
 6 files changed, 78 insertions(+), 33 deletions(-)

-- 
1.7.10.4

[Qemu-devel] [PATCH v3 2/5] kvm: support using KVM_MEM_READONLY flag for readonly regions

2013-05-06 Thread Jordan Justen

A slot that uses KVM_MEM_READONLY can be read from and code
can execute from the region, but writes will trap.

For regions that are readonly and also not readable, we
force the slot to be removed so reads or writes to the region
will trap. (A memory region in this state is not executable
within kvm.)

Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
Reviewed-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
---
 kvm-all.c |   36 +++-
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 1686adc..fffd2f4 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -201,12 +201,18 @@ static int kvm_set_user_memory_region(KVMState *s, KVMSlot *slot)
 
     mem.slot = slot->slot;
     mem.guest_phys_addr = slot->start_addr;
-    mem.memory_size = slot->memory_size;
     mem.userspace_addr = (unsigned long)slot->ram;
     mem.flags = slot->flags;
     if (s->migration_log) {
         mem.flags |= KVM_MEM_LOG_DIRTY_PAGES;
     }
+    if (mem.flags & KVM_MEM_READONLY && slot->memory_size != 0) {
+        /* Set the slot size to 0 before setting the slot to the desired
+         * value. This is needed based on KVM commit 75d61fbc. */
+        mem.memory_size = 0;
+        kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
+    }
+    mem.memory_size = slot->memory_size;
     return kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
 }
 
@@ -268,9 +274,14 @@ err:
  * dirty pages logging control
  */
 
-static int kvm_mem_flags(KVMState *s, bool log_dirty)
+static int kvm_mem_flags(KVMState *s, bool log_dirty, bool readonly)
 {
-    return log_dirty ? KVM_MEM_LOG_DIRTY_PAGES : 0;
+    int flags = 0;
+    flags = log_dirty ? KVM_MEM_LOG_DIRTY_PAGES : 0;
+    if (readonly && kvm_readonly_mem_allowed) {
+        flags |= KVM_MEM_READONLY;
+    }
+    return flags;
 }
 
 static int kvm_slot_dirty_pages_log_change(KVMSlot *mem, bool log_dirty)
@@ -281,7 +292,7 @@ static int kvm_slot_dirty_pages_log_change(KVMSlot *mem, bool log_dirty)
 
     old_flags = mem->flags;
 
-    flags = (mem->flags & ~mask) | kvm_mem_flags(s, log_dirty);
+    flags = (mem->flags & ~mask) | kvm_mem_flags(s, log_dirty, false);
     mem->flags = flags;
 
     /* If nothing changed effectively, no need to issue ioctl */
@@ -638,7 +649,14 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
     }
 
     if (!memory_region_is_ram(mr)) {
-        return;
+        if (!mr->readonly || !kvm_readonly_mem_allowed) {
+            return;
+        } else if (!mr->readable && add) {
+            /* If the memory range is not readable, then we actually want
+             * to remove the kvm memory slot so all accesses will trap. */
+            assert(mr->readonly && kvm_readonly_mem_allowed);
+            add = false;
+        }
     }
 
     ram = memory_region_get_ram_ptr(mr) + section->offset_within_region + delta;
@@ -687,7 +705,7 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
             mem->memory_size = old.memory_size;
             mem->start_addr = old.start_addr;
             mem->ram = old.ram;
-            mem->flags = kvm_mem_flags(s, log_dirty);
+            mem->flags = kvm_mem_flags(s, log_dirty, mr->readonly);
 
             err = kvm_set_user_memory_region(s, mem);
             if (err) {
@@ -708,7 +726,7 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
             mem->memory_size = start_addr - old.start_addr;
             mem->start_addr = old.start_addr;
             mem->ram = old.ram;
-            mem->flags =  kvm_mem_flags(s, log_dirty);
+            mem->flags =  kvm_mem_flags(s, log_dirty, mr->readonly);
 
             err = kvm_set_user_memory_region(s, mem);
             if (err) {
@@ -732,7 +750,7 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
             size_delta = mem->start_addr - old.start_addr;
             mem->memory_size = old.memory_size - size_delta;
             mem->ram = old.ram + size_delta;
-            mem->flags = kvm_mem_flags(s, log_dirty);
+            mem->flags = kvm_mem_flags(s, log_dirty, mr->readonly);
 
             err = kvm_set_user_memory_region(s, mem);
             if (err) {
@@ -754,7 +772,7 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
     mem->memory_size = size;
     mem->start_addr = start_addr;
     mem->ram = ram;
-    mem->flags = kvm_mem_flags(s, log_dirty);
+    mem->flags = kvm_mem_flags(s, log_dirty, mr->readonly);
 
     err = kvm_set_user_memory_region(s, mem);
     if (err) {
-- 
1.7.10.4
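
The slot handling described in this patch can be summarized with a small, self-contained sketch: one helper that computes the slot flags, and one that decides whether a region gets a slot, loses its slot, or is ignored. The SKETCH_* constants, SlotAction and both function names are hypothetical stand-ins defined locally so the example builds without kernel headers; this is an illustration of the logic under those assumptions, not the kvm-all.c code.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the KVM slot flag bits used in the patch. */
#define SKETCH_MEM_LOG_DIRTY_PAGES (1u << 0)
#define SKETCH_MEM_READONLY        (1u << 1)

typedef enum { SLOT_IGNORE, SLOT_ADD, SLOT_REMOVE } SlotAction;

/* Mirrors the shape of kvm_mem_flags(): readonly is only honoured when the
 * host supports the readonly memory capability. */
static unsigned sketch_mem_flags(bool log_dirty, bool readonly,
                                 bool readonly_allowed)
{
    unsigned flags = log_dirty ? SKETCH_MEM_LOG_DIRTY_PAGES : 0;

    if (readonly && readonly_allowed) {
        flags |= SKETCH_MEM_READONLY;
    }
    return flags;
}

/* Mirrors the early checks in kvm_set_phys_mem():
 *  - non-RAM regions are ignored unless they are readonly and the host
 *    supports readonly slots;
 *  - a readonly region that is not readable loses its slot, so that every
 *    access traps to the device model. */
static SlotAction sketch_slot_action(bool is_ram, bool readonly, bool readable,
                                     bool readonly_allowed, bool add)
{
    if (!is_ram) {
        if (!readonly || !readonly_allowed) {
            return SLOT_IGNORE;
        }
        if (!readable && add) {
            return SLOT_REMOVE;
        }
    }
    return add ? SLOT_ADD : SLOT_REMOVE;
}
```

In this reading, a flash device in plain read mode maps to SLOT_ADD with the readonly flag set, and the same device in command mode (not readable) maps to SLOT_REMOVE.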

[Qemu-devel] [PATCH v3 1/5] kvm: add kvm_readonly_mem_enabled

2013-05-06 Thread Jordan Justen

Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
---
 include/sysemu/kvm.h |   10 ++
 kvm-all.c|6 ++
 kvm-stub.c   |1 +
 3 files changed, 17 insertions(+)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 9735c1d..13c4b2e 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -45,6 +45,7 @@ extern bool kvm_async_interrupts_allowed;
 extern bool kvm_irqfds_allowed;
 extern bool kvm_msi_via_irqfd_allowed;
 extern bool kvm_gsi_routing_allowed;
+extern bool kvm_readonly_mem_allowed;
 
 #if defined CONFIG_KVM || !defined NEED_CPU_H
 #define kvm_enabled()   (kvm_allowed)
@@ -97,6 +98,14 @@ extern bool kvm_gsi_routing_allowed;
  */
 #define kvm_gsi_routing_enabled() (kvm_gsi_routing_allowed)
 
+/**
+ * kvm_readonly_mem_enabled:
+ *
+ * Returns: true if KVM readonly memory is enabled (ie the kernel
+ * supports it and we're running in a configuration that permits it).
+ */
+#define kvm_readonly_mem_enabled() (kvm_readonly_mem_allowed)
+
 #else
 #define kvm_enabled()   (0)
 #define kvm_irqchip_in_kernel() (false)
@@ -104,6 +113,7 @@ extern bool kvm_gsi_routing_allowed;
 #define kvm_irqfds_enabled() (false)
 #define kvm_msi_via_irqfd_enabled() (false)
 #define kvm_gsi_routing_allowed() (false)
+#define kvm_readonly_mem_enabled() (false)
 #endif
 
 struct kvm_run;
diff --git a/kvm-all.c b/kvm-all.c
index 3a31602..1686adc 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -111,6 +111,7 @@ bool kvm_irqfds_allowed;
 bool kvm_msi_via_irqfd_allowed;
 bool kvm_gsi_routing_allowed;
 bool kvm_allowed;
+bool kvm_readonly_mem_allowed;
 
 static const KVMCapabilityInfo kvm_required_capabilites[] = {
 KVM_CAP_INFO(USER_MEMORY),
@@ -1425,6 +1426,11 @@ int kvm_init(void)
         s->irq_set_ioctl = KVM_IRQ_LINE_STATUS;
     }
 
+#ifdef KVM_CAP_READONLY_MEM
+    kvm_readonly_mem_allowed =
+        (kvm_check_extension(s, KVM_CAP_READONLY_MEM) > 0);
+#endif
+
     ret = kvm_arch_init(s);
     if (ret < 0) {
         goto err;
diff --git a/kvm-stub.c b/kvm-stub.c
index b2c8f9b..22eaff0 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -26,6 +26,7 @@ bool kvm_irqfds_allowed;
 bool kvm_msi_via_irqfd_allowed;
 bool kvm_gsi_routing_allowed;
 bool kvm_allowed;
+bool kvm_readonly_mem_allowed;
 
 int kvm_init_vcpu(CPUState *cpu)
 {
-- 
1.7.10.4

[Qemu-devel] [PATCH v3 3/5] pflash_cfi01: memory region should be set to enable readonly mode

2013-05-06 Thread Jordan Justen

This causes any writes to the memory region to trap to the
device handler.

This is also important for KVM, because this allows the memory
region to be set using KVM_MEM_READONLY, which allows the memory
region to be read & executed. (Without this, KVM will not support
executing from the memory region.)

Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
---
 hw/block/pflash_cfi01.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
index 3ff20e0..b65225e 100644
--- a/hw/block/pflash_cfi01.c
+++ b/hw/block/pflash_cfi01.c
@@ -596,6 +596,8 @@ static int pflash_cfi01_init(SysBusDevice *dev)
         }
     }
 
+    memory_region_set_readonly(&pfl->mem, true);
+
     if (pfl->bs) {
         pfl->ro = bdrv_is_read_only(pfl->bs);
     } else {
-- 
1.7.10.4

[Qemu-devel] [PATCH v3 5/5] pc_sysfw: change rom_only default to 0

2013-05-06 Thread Jordan Justen

Now KVM can support a flash memory. This feature depends on
KVM_CAP_READONLY_MEM, which was introduced in Linux 3.7.

Flash memory will only be enabled if a pflash device is
created. (For example, by using the -pflash command line
parameter.)

Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
---
 hw/block/pc_sysfw.c  |2 +-
 include/hw/i386/pc.h |4 
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/hw/block/pc_sysfw.c b/hw/block/pc_sysfw.c
index 301eb96..46e794e 100644
--- a/hw/block/pc_sysfw.c
+++ b/hw/block/pc_sysfw.c
@@ -267,7 +267,7 @@ void pc_system_firmware_init(MemoryRegion *rom_memory)
 }
 
 static Property pcsysfw_properties[] = {
-    DEFINE_PROP_UINT8("rom_only", PcSysFwDevice, rom_only, 1),
+    DEFINE_PROP_UINT8("rom_only", PcSysFwDevice, rom_only, 0),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 41869e5..10c9347 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -238,10 +238,6 @@ int e820_add_entry(uint64_t, uint64_t, uint32_t);
             .driver   = "virtio-net-pci",\
             .property = "romfile",\
             .value    = "pxe-virtio.rom",\
-        },{\
-            .driver   = "pc-sysfw",\
-            .property = "rom_only",\
-            .value    = stringify(0),\
         }
 
 #endif
-- 
1.7.10.4

[Qemu-devel] [PATCH v3 4/5] pc_sysfw: allow flash (-pflash) memory to be used with KVM

2013-05-06 Thread Jordan Justen

When pc-sysfw.rom_only == 0, flash memory will be
usable with kvm. In order to enable flash memory mode,
a pflash device must be created. (For example, by
using the -pflash command line parameter.)

Usage of a flash memory device with kvm requires the
KVM READONLY memory capability, and kvm will abort if
a flash device is used with an older kvm which does
not support this capability.

If a flash device is not used, then qemu/kvm will
operate in the original rom-mode.

Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
---
 hw/block/pc_sysfw.c |   50 +++---
 1 file changed, 31 insertions(+), 19 deletions(-)

diff --git a/hw/block/pc_sysfw.c b/hw/block/pc_sysfw.c
index aad8614..301eb96 100644
--- a/hw/block/pc_sysfw.c
+++ b/hw/block/pc_sysfw.c
@@ -215,28 +215,40 @@ void pc_system_firmware_init(MemoryRegion *rom_memory)
 
     qdev_init_nofail(DEVICE(sysfw_dev));
 
-    if (sysfw_dev->rom_only) {
-        old_pc_system_rom_init(rom_memory);
-        return;
-    }
-
     pflash_drv = drive_get(IF_PFLASH, 0, 0);
 
-    /* Currently KVM cannot execute from device memory.
-       Use old rom based firmware initialization for KVM. */
-    /*
-     * This is a Bad Idea, because it makes enabling/disabling KVM
-     * guest-visible.  Do it only in bug-compatibility mode.
-     */
-    if (pc_sysfw_flash_vs_rom_bug_compatible && kvm_enabled()) {
-        if (pflash_drv != NULL) {
-            fprintf(stderr, "qemu: pflash cannot be used with kvm enabled\n");
-            exit(1);
-        } else {
-            sysfw_dev->rom_only = 1;
-            old_pc_system_rom_init(rom_memory);
-            return;
+    if (pc_sysfw_flash_vs_rom_bug_compatible) {
+        /*
+         * This is a Bad Idea, because it makes enabling/disabling KVM
+         * guest-visible.  Do it only in bug-compatibility mode.
+         */
+        if (kvm_enabled()) {
+            if (pflash_drv != NULL) {
+                fprintf(stderr, "qemu: pflash cannot be used with kvm enabled\n");
+                exit(1);
+            } else {
+                /* In old pc_sysfw_flash_vs_rom_bug_compatible mode, we assume
+                 * that KVM cannot execute from device memory. In this case, we
+                 * use old rom based firmware initialization for KVM. But, since
+                 * this is different from non-kvm mode, this behavior is
+                 * undesirable */
+                sysfw_dev->rom_only = 1;
+            }
         }
+    } else if (pflash_drv == NULL) {
+        /* When a pflash drive is not found, use rom-mode */
+        sysfw_dev->rom_only = 1;
+    } else if (kvm_enabled() && !kvm_readonly_mem_enabled()) {
+        /* Older KVM cannot execute from device memory. So, flash memory
+         * cannot be used unless the readonly memory kvm capability is present. */
+        fprintf(stderr, "qemu: pflash with kvm requires KVM readonly memory support\n");
+        exit(1);
+    }
+
+    /* If rom-mode is active, use the old pc system rom initialization. */
+    if (sysfw_dev->rom_only) {
+        old_pc_system_rom_init(rom_memory);
+        return;
     }
 
     /* If a pflash drive is not found, then create one using
-- 
1.7.10.4
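
The mode selection in this patch is easier to check as a pure decision function. The sketch below restates the branches of pc_system_firmware_init() from the diff above under the assumption that exit(1) is modelled as a FW_FATAL result; FwMode and sketch_firmware_mode() are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { FW_ROM, FW_FLASH, FW_FATAL } FwMode;

/* rom_only:         value of the pc-sysfw.rom_only property on entry
 * bug_compatible:   pc_sysfw_flash_vs_rom_bug_compatible
 * kvm:              kvm_enabled()
 * have_pflash:      a pflash drive was found
 * kvm_readonly_mem: kvm_readonly_mem_enabled() */
static FwMode sketch_firmware_mode(bool rom_only, bool bug_compatible,
                                   bool kvm, bool have_pflash,
                                   bool kvm_readonly_mem)
{
    if (bug_compatible) {
        if (kvm) {
            if (have_pflash) {
                return FW_FATAL;   /* old behaviour: pflash + kvm rejected */
            }
            rom_only = true;       /* old behaviour: kvm forces rom-mode */
        }
    } else if (!have_pflash) {
        rom_only = true;           /* no pflash drive: rom-mode */
    } else if (kvm && !kvm_readonly_mem) {
        return FW_FATAL;           /* flash needs KVM readonly memory */
    }

    return rom_only ? FW_ROM : FW_FLASH;
}
```

Read this way, the only new fatal case outside bug-compatibility mode is a pflash drive combined with a KVM that lacks the readonly memory capability.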

Re: [Qemu-devel] [PATCH 7/8] pseries: savevm support for PAPR virtual SCSI

2013-05-06 Thread Paolo Bonzini

On 03/05/2013 03:38, David Gibson wrote:
 This patch adds the necessary support for saving the state of the PAPR VIO
 virtual SCSI device.  This turns out to be trivial, because the generic
 SCSI code already quiesces the attached virtual SCSI bus.
 
 Signed-off-by: David Gibson da...@gibson.dropbear.id.au
 ---
  hw/scsi/spapr_vscsi.c |   28 
  1 file changed, 28 insertions(+)
 
 diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c
 index 3d322d5..f416871 100644
 --- a/hw/scsi/spapr_vscsi.c
 +++ b/hw/scsi/spapr_vscsi.c
 @@ -954,6 +954,33 @@ static Property spapr_vscsi_properties[] = {
  DEFINE_PROP_END_OF_LIST(),
  };
  
 +static void spapr_vscsi_pre_save(void *opaque)
 +{
 +    VSCSIState *s = opaque;
 +    int i;
 +
 +    /* Can't save active requests, apparently the general SCSI code
 +     * quiesces the queue for us on vmsave */
 +    for (i = 0; i < VSCSI_REQ_LIMIT; i++) {
 +        assert(!s->reqs[i].active);
 +    }
 +}

This is only true when the rerror and werror options have the values
"ignore" or "report".  See virtio-scsi for an example of how to save the
requests using the save_request and load_request callbacks in SCSIBusInfo.

Paolo

 +static const VMStateDescription vmstate_spapr_vscsi = {
 +    .name = "spapr_vscsi",
 +    .version_id = 1,
 +    .minimum_version_id = 1,
 +    .minimum_version_id_old = 1,
 +    .pre_save = spapr_vscsi_pre_save,
 +    .fields  = (VMStateField []) {
 +        VMSTATE_SPAPR_VIO(vdev, VSCSIState),
 +        /* VSCSI state */
 +        /*  */
 +
 +        VMSTATE_END_OF_LIST()
 +    },
 +};
 +
  static void spapr_vscsi_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
 @@ -968,6 +995,7 @@ static void spapr_vscsi_class_init(ObjectClass *klass, 
 void *data)
      k->signal_mask = 0x0001;
      dc->props = spapr_vscsi_properties;
      k->rtce_window_size = 0x1000;
 +    dc->vmsd = &vmstate_spapr_vscsi;
  }
  
  static const TypeInfo spapr_vscsi_info = {

Re: [Qemu-devel] [PATCH 2/9] qom: add object_property_add_unnamed_child

2013-05-06 Thread Paolo Bonzini

On 03/05/2013 18:03, Michael Roth wrote:
 This interface allows us to add a child property without specifying a
 name. Instead, a unique name is created and passed back after adding
 the property.
 
 Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
 ---
  include/qom/object.h |   16 
  qom/object.c |   25 +
  2 files changed, 41 insertions(+)
 
 diff --git a/include/qom/object.h b/include/qom/object.h
 index 86f1e2e..ca0fce8 100644
 --- a/include/qom/object.h
 +++ b/include/qom/object.h
 @@ -1041,6 +1041,22 @@ void object_property_add_child(Object *obj, const char 
 *name,
 Object *child, struct Error **errp);
  
  /**
 + * object_property_add_unnamed_child:
 + *
 + * @obj: the object to add a property to
 + * @child: the child object
 + * @errp: if an error occurs, a pointer to an area to store the error
 + *
 + * Same as object_property_add_child, but will allocate a unique name to
 + * identify the child property.
 + *
 + * Returns: The name assigned to the child property, or NULL on failure.
 + */
 +char *object_property_add_unnamed_child(Object *obj, Object *child,
 +struct Error **errp);
 +
 +/**
   * object_property_add_link:
   * @obj: the object to add a property to
   * @name: the name of the property
 diff --git a/qom/object.c b/qom/object.c
 index c932f64..229a9a7 100644
 --- a/qom/object.c
 +++ b/qom/object.c
 @@ -926,6 +926,31 @@ static void object_finalize_child_property(Object *obj, 
 const char *name,
  object_unref(child);
  }
  
 +char *object_property_add_unnamed_child(Object *obj, Object *child, Error **errp)
 +{
 +    int idx = 0;
 +    bool next_idx_found = false;
 +    char name[64];
 +    ObjectProperty *prop;
 +
 +    while (!next_idx_found) {
 +        sprintf(name, "unnamed[%d]", idx);
 +        QTAILQ_FOREACH(prop, &obj->properties, node) {
 +            if (strcmp(name, prop->name) == 0) {
 +                idx++;
 +                break;
 +            }
 +        }
 +        if (!prop) {
 +            next_idx_found = true;
 +        }
 +    }
 +
 +    object_property_add_child(obj, name, child, errp);
 +
 +    return error_is_set(errp) ? NULL : g_strdup(name);
 +}

This is O(n^3) for adding N children.  O(n^2) would be not-that-great
but fine; can you take the occasion to convert the properties list to a
hashtable?

Paolo
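
To make the cost concrete, the sketch below counts string comparisons for a scan with the same shape as the quoted loop, and contrasts it with a per-object counter. Note that the counter is a simpler technique swapped in for illustration, not the hash table suggested above, and it assumes nobody else registers names of the form unnamed[N]. SketchObject and both functions are hypothetical and unrelated to the real QOM structures.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_PROPS 64

/* Hypothetical stand-in for an object with a flat property list. */
typedef struct SketchObject {
    char names[SKETCH_MAX_PROPS][32];
    int count;
    unsigned long compares;   /* strcmp() calls made so far */
    int next_unnamed;         /* used only by the counter variant */
} SketchObject;

/* Scan variant: for each candidate index, walk the list from the start. */
static int sketch_add_scan(SketchObject *o)
{
    char name[32];
    int idx = 0;

    for (;;) {
        int i, clash = 0;

        snprintf(name, sizeof(name), "unnamed[%d]", idx);
        for (i = 0; i < o->count; i++) {
            o->compares++;
            if (strcmp(name, o->names[i]) == 0) {
                clash = 1;
                break;
            }
        }
        if (!clash) {
            break;
        }
        idx++;
    }

    assert(o->count < SKETCH_MAX_PROPS);
    strcpy(o->names[o->count++], name);
    return idx;
}

/* Counter variant: remember the next free index, no searching at all. */
static int sketch_add_counter(SketchObject *o)
{
    int idx = o->next_unnamed++;

    assert(o->count < SKETCH_MAX_PROPS);
    snprintf(o->names[o->count++], sizeof(o->names[0]), "unnamed[%d]", idx);
    return idx;
}
```

With k children already present, the scan variant needs k(k+1)/2 + k comparisons to add the next one, which is what makes the total cubic in the number of children.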

 +
  void object_property_add_child(Object *obj, const char *name,
 Object *child, Error **errp)
  {

Re: [Qemu-devel] [PATCH 1/9] qom: add qom_init_completion

2013-05-06 Thread Paolo Bonzini

On 03/05/2013 18:03, Michael Roth wrote:
 This is similar in concept to realize, though semantics are a
 bit more open-ended:
 
 An object might in some cases need a number of properties to be
 specified before it can be used/started/etc. This can't always
 be done via an open-ended new() function, the main example being objects
 that are created via the command-line by -object.
 
 To support these cases we allow a function, ->instance_init_completion,
 to be registered that will be called by the -object constructor, or can
 be called at the end of new() constructors and such.

This seems a lot like a realize property that cannot be set back to false...

Paolo

 Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
 ---
  include/qom/object.h |   19 +++
  qom/object.c |   21 +
  vl.c |2 ++
  3 files changed, 42 insertions(+)
 
 diff --git a/include/qom/object.h b/include/qom/object.h
 index d0f99c5..86f1e2e 100644
 --- a/include/qom/object.h
 +++ b/include/qom/object.h
 @@ -394,6 +394,11 @@ struct Object
   * @instance_init: This function is called to initialize an object.  The 
 parent
   *   class will have already been initialized so the type is only responsible
   *   for initializing its own members.
 + * @instance_init_completion: This function is used mainly in cases where an
 + *   object has been instantiated via the command-line, and is called once all
 + *   properties specified via command-line have been set for the object. This
 + *   is not called automatically, but manually via @object_init_completion once
 + *   the processing of said properties is completed.
   * @instance_finalize: This function is called during object destruction.  
 This
   *   is called before the parent @instance_finalize function has been called.
   *   An object should only free the members that are unique to its type in 
 this
 @@ -429,6 +434,7 @@ struct TypeInfo
  
  size_t instance_size;
  void (*instance_init)(Object *obj);
 +void (*instance_init_completion)(Object *obj);
  void (*instance_finalize)(Object *obj);
  
  bool abstract;
 @@ -562,6 +568,19 @@ struct InterfaceClass
  Object *object_new(const char *typename);
  
  /**
 + * object_init_completion:
 + * @obj: The object to complete initialization of
 + *
 + * In cases where an object is instantiated from a command-line with a number
 + * of properties specified as parameters (generally via -object), or for 
 cases
 + * where a new()/helper function is used to pass/set some minimal number of
 + * properties that are required prior to completion of object initialization,
 + * this function can be called to mark when that occurs to complete object
 + * initialization.
 + */
 +void object_init_completion(Object *obj);
 +
 +/**
   * object_new_with_type:
   * @type: The type of the object to instantiate.
   *
 diff --git a/qom/object.c b/qom/object.c
 index 75e6aac..c932f64 100644
 --- a/qom/object.c
 +++ b/qom/object.c
 @@ -50,6 +50,7 @@ struct TypeImpl
  void *class_data;
  
  void (*instance_init)(Object *obj);
 +void (*instance_init_completion)(Object *obj);
  void (*instance_finalize)(Object *obj);
  
  bool abstract;
 @@ -110,6 +111,7 @@ static TypeImpl *type_register_internal(const TypeInfo *info)
      ti->class_data = info->class_data;
  
      ti->instance_init = info->instance_init;
 +    ti->instance_init_completion = info->instance_init_completion;
      ti->instance_finalize = info->instance_finalize;
  
      ti->abstract = info->abstract;
 @@ -422,6 +424,25 @@ Object *object_new(const char *typename)
  return object_new_with_type(ti);
  }
  
 +
 +static void object_init_completion_with_type(Object *obj, TypeImpl *ti)
 +{
 +    if (type_has_parent(ti)) {
 +        object_init_completion_with_type(obj, type_get_parent(ti));
 +    }
 +
 +    if (ti->instance_init_completion) {
 +        ti->instance_init_completion(obj);
 +    }
 +}
 +
 +void object_init_completion(Object *obj)
 +{
 +    TypeImpl *ti = type_get_by_name(object_get_class(obj)->type->name);
 +
 +    object_init_completion_with_type(obj, ti);
 +}
 +
  Object *object_dynamic_cast(Object *obj, const char *typename)
  {
      if (obj && object_class_dynamic_cast(object_get_class(obj), typename)) {
 diff --git a/vl.c b/vl.c
 index 6e6225f..d454c86 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -2831,6 +2831,8 @@ static int object_create(QemuOpts *opts, void *opaque)
      object_property_add_child(container_get(object_get_root(), "/objects"),
                                id, obj, NULL);
  
 +    object_init_completion(obj);
 +
      return 0;
  }
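
The recursion in object_init_completion_with_type() above runs the hooks parent first, skipping types that do not register one. A minimal, self-contained sketch of that ordering follows; SketchType, SketchObj and the hook functions are hypothetical and far simpler than the QOM TypeImpl machinery.

```c
#include <assert.h>
#include <stddef.h>

/* Records the order in which completion hooks ran. */
typedef struct SketchObj {
    int log[8];
    int n;
} SketchObj;

/* Hypothetical type descriptor with an optional completion hook. */
typedef struct SketchType {
    const struct SketchType *parent;
    void (*init_completion)(SketchObj *obj);
} SketchType;

/* Walk up to the root first, then run hooks on the way back down, so a
 * parent's hook always runs before its children's. */
static void sketch_init_completion(SketchObj *obj, const SketchType *t)
{
    if (t->parent) {
        sketch_init_completion(obj, t->parent);
    }
    if (t->init_completion) {
        t->init_completion(obj);
    }
}

static void base_done(SketchObj *o) { o->log[o->n++] = 1; }
static void leaf_done(SketchObj *o) { o->log[o->n++] = 3; }

static const SketchType base_type = { NULL, base_done };
static const SketchType mid_type  = { &base_type, NULL };   /* no hook */
static const SketchType leaf_type = { &mid_type, leaf_done };
```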

Re: [Qemu-devel] [PATCH 7/9] iohandler: associate with main event loop via a QSource

2013-05-06 Thread Paolo Bonzini

On 03/05/2013 18:03, Michael Roth wrote:
 This introduces a GlibQContext wrapper around the main GMainContext
 event loop, and associates iohandlers with it via a QSource (which
 GlibQContext creates a GSource from so that it can be driven via
 GLib. A subsequent patch will drive the GlibQContext directly.)
 
 We also add QContext-aware functionality to iohandler interfaces
 so that they can be bound to other QContext event loops, and add
 non-global set_fd_handler() interfaces to facilitate this. This is made
 possible by simply searching a given QContext for a QSource by the name
 of "iohandler" so that we can attach event handlers to the associated
 IOHandlerState.
 
 Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com

This patch is why I think that this is a bit overengineered.  The main
loop is always glib-based, there should be no need to go through the
QSource abstraction.

BTW, this is broken for Win32.  The right thing to do here is to first
convert iohandler to a GSource in such a way that it works for both
POSIX and Win32, and then (if needed) we can later convert GSource to
QSource.

Paolo
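
Whichever source abstraction ends up driving it, the prepare step has to decide which events each registered handler currently wants. The sketch below shows one way to express that decision, following the read-side condition visible in the quoted iohandler_prepare() hunk. The write-side rule and the event bits are assumptions of this sketch (the quoted hunk is cut off before that point), and SketchHandler is a hypothetical stand-in for IOHandlerRecord.

```c
#include <assert.h>
#include <stddef.h>

/* Event bits local to this sketch (stand-ins for poll-style in/out bits). */
#define SKETCH_EV_IN  (1u << 0)
#define SKETCH_EV_OUT (1u << 1)

typedef struct SketchHandler {
    int  (*fd_read_poll)(void *opaque);   /* optional "can read now?" gate */
    void (*fd_read)(void *opaque);
    void (*fd_write)(void *opaque);
    void *opaque;
    int deleted;
} SketchHandler;

/* Deleted records want nothing. Reads are requested only when a read
 * handler exists and the optional gate does not veto it. Requesting
 * writes whenever a write handler exists is an assumption. */
static unsigned sketch_wanted_events(const SketchHandler *h)
{
    unsigned events = 0;

    if (h->deleted) {
        return 0;
    }
    if (h->fd_read &&
        (!h->fd_read_poll || h->fd_read_poll(h->opaque) != 0)) {
        events |= SKETCH_EV_IN;
    }
    if (h->fd_write) {
        events |= SKETCH_EV_OUT;
    }
    return events;
}

static int sketch_never_ready(void *opaque)
{
    (void)opaque;
    return 0;
}

static void sketch_noop(void *opaque)
{
    (void)opaque;
}
```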

 ---
  include/qemu/main-loop.h |   31 +-
  iohandler.c  |  238 
 --
  main-loop.c  |   21 +++-
  3 files changed, 213 insertions(+), 77 deletions(-)
 
 diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
 index 6f0200a..dbadf9f 100644
 --- a/include/qemu/main-loop.h
 +++ b/include/qemu/main-loop.h
 @@ -26,6 +26,7 @@
  #define QEMU_MAIN_LOOP_H 1
  
  #include "block/aio.h"
 +#include "qcontext/qcontext.h"
  
  #define SIG_IPI SIGUSR1
  
 @@ -168,9 +169,24 @@ void qemu_del_wait_object(HANDLE handle, WaitObjectFunc 
 *func, void *opaque);
  
  /* async I/O support */
  
 +#define QSOURCE_IOHANDLER "iohandler"
 +
  typedef void IOReadHandler(void *opaque, const uint8_t *buf, int size);
  typedef int IOCanReadHandler(void *opaque);
  
 +QContext *qemu_get_qcontext(void);
 +/**
 + * iohandler_attach: Attach a QSource to a QContext
 + *
 + * This enables the use of IOHandler interfaces such as
 + * set_fd_handler() on the given QContext. IOHandler lists will be
 + * tracked/handled/dispatched based on a named QSource that is added to
 + * the QContext
 + *
 + * @ctx: A QContext to add an IOHandler QSource to
 + */
 +void iohandler_attach(QContext *ctx);
 +
  /**
   * qemu_set_fd_handler2: Register a file descriptor with the main loop
   *
 @@ -217,6 +233,13 @@ int qemu_set_fd_handler2(int fd,
   IOHandler *fd_write,
   void *opaque);
  
 +int set_fd_handler2(QContext *ctx,
 +int fd,
 +IOCanReadHandler *fd_read_poll,
 +IOHandler *fd_read,
 +IOHandler *fd_write,
 +void *opaque);
 +
  /**
   * qemu_set_fd_handler: Register a file descriptor with the main loop
   *
 @@ -250,6 +273,12 @@ int qemu_set_fd_handler(int fd,
  IOHandler *fd_write,
  void *opaque);
  
 +int set_fd_handler(QContext *ctx,
 +   int fd,
 +   IOHandler *fd_read,
 +   IOHandler *fd_write,
 +   void *opaque);
 +
  #ifdef CONFIG_POSIX
  /**
   * qemu_add_child_watch: Register a child process for reaping.
 @@ -302,8 +331,6 @@ void qemu_mutex_unlock_iothread(void);
  /* internal interfaces */
  
  void qemu_fd_register(int fd);
 -void qemu_iohandler_fill(GArray *pollfds);
 -void qemu_iohandler_poll(GArray *pollfds, int rc);
  
  QEMUBH *qemu_bh_new(QEMUBHFunc *cb, void *opaque);
  void qemu_bh_schedule_idle(QEMUBH *bh);
 diff --git a/iohandler.c b/iohandler.c
 index ae2ef8f..8625272 100644
 --- a/iohandler.c
 +++ b/iohandler.c
 @@ -41,38 +41,170 @@ typedef struct IOHandlerRecord {
  int fd;
  int pollfds_idx;
  bool deleted;
 +GPollFD pfd;
 +bool pfd_added;
  } IOHandlerRecord;
  
 -static QLIST_HEAD(, IOHandlerRecord) io_handlers =
 -QLIST_HEAD_INITIALIZER(io_handlers);
 +typedef struct IOHandlerState {
 +QLIST_HEAD(, IOHandlerRecord) io_handlers;
 +} IOHandlerState;
  
 +static bool iohandler_prepare(QSource *qsource, int *timeout)
 +{
 +QSourceClass *qsourcek = QSOURCE_GET_CLASS(qsource);
 +IOHandlerState *s = qsourcek->get_user_data(qsource);
 +IOHandlerRecord *ioh;
  
 -/* XXX: fd_read_poll should be suppressed, but an API change is
 -   necessary in the character devices to suppress fd_can_read(). */
 -int qemu_set_fd_handler2(int fd,
 - IOCanReadHandler *fd_read_poll,
 - IOHandler *fd_read,
 - IOHandler *fd_write,
 - void *opaque)
 +QLIST_FOREACH(ioh, &s->io_handlers, next) {
 +int events = 0;
 +
 +if (ioh->deleted)
 +continue;
 +
 +if (ioh->fd_read &&
 +(!ioh->fd_read_poll ||
 + ioh->fd_read_poll(ioh->opaque) != 0)) {

Re: [Qemu-devel] [PATCH 9/9] dataplane: use a QContext event loop in place of custom thread

2013-05-06 Thread Paolo Bonzini

Il 03/05/2013 18:03, Michael Roth ha scritto:
 virtio-blk dataplane currently creates/manages its own thread to
 offload work to a separate event loop.
 
 This patch instead allows us to specify a QContext-based event loop by
 adding a "context" property for virtio-blk that we can use like so:
 
   qemu ... \
 -object glib-qcontext,id=ctx0,threaded=yes \
 -drive file=file.raw,id=drive0,aio=native,cache=none \
 -device virtio-blk,drive=drive0,scsi=off,x-data-plane=on,context=ctx0
 
 virtio-blk dataplane then simply attaches/detaches its AioContext to the
 ctx0 event loop on start/stop.
 
 This also makes available the option to drive a virtio-blk dataplane via
 the default main loop:
 
   qemu ... \
 -drive file=file.raw,id=drive0,aio=native,cache=none \
 -device virtio-blk,drive=drive0,scsi=off,x-data-plane=on,context=main
 
 This doesn't do much in and of itself, but helps to demonstrate how we
 might model a general mechanism to offload device workloads to separate
 threads.
 
 Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
 ---
  hw/block/dataplane/virtio-blk.c |   46 
 ---
  include/hw/virtio/virtio-blk.h  |7 --
  2 files changed, 19 insertions(+), 34 deletions(-)
 
 diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
 index 0356665..08ea10f 100644
 --- a/hw/block/dataplane/virtio-blk.c
 +++ b/hw/block/dataplane/virtio-blk.c
 @@ -24,6 +24,8 @@
  #include "virtio-blk.h"
  #include "block/aio.h"
  #include "hw/virtio/virtio-bus.h"
 +#include "qcontext/qcontext.h"
 +#include "qcontext/glib-qcontext.h"
  
  enum {
  SEG_MAX = 126,  /* maximum number of I/O segments */
 @@ -60,6 +62,7 @@ struct VirtIOBlockDataPlane {
   * use it).
   */
  AioContext *ctx;
 +QContext *qctx;
  EventNotifier io_notifier;  /* Linux AIO completion */
  EventNotifier host_notifier;/* doorbell */
  
 @@ -375,26 +378,6 @@ static void handle_io(EventNotifier *e)
  }
  }
  
 -static void *data_plane_thread(void *opaque)
 -{
 -VirtIOBlockDataPlane *s = opaque;
 -
 -do {
 -aio_poll(s->ctx, true);
 -} while (!s->stopping || s->num_reqs > 0);
 -return NULL;
 -}
 -
 -static void start_data_plane_bh(void *opaque)
 -{
 -VirtIOBlockDataPlane *s = opaque;
 -
 -qemu_bh_delete(s->start_bh);
 -s->start_bh = NULL;
 -qemu_thread_create(&s->thread, data_plane_thread,
 -   s, QEMU_THREAD_JOINABLE);
 -}
 -
  bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
VirtIOBlockDataPlane **dataplane)
  {
 @@ -460,6 +443,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
  VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
  VirtQueue *vq;
  int i;
 +Error *err = NULL;
  
  if (s->started) {
  return;
 @@ -502,9 +486,16 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
  /* Kick right away to begin processing requests already in vring */
  event_notifier_set(virtio_queue_get_host_notifier(vq));
  
 -/* Spawn thread in BH so it inherits iothread cpusets */
 -s->start_bh = qemu_bh_new(start_data_plane_bh, s);
 -qemu_bh_schedule(s->start_bh);
 +/* use QEMU main loop/context by default */
 +if (!s->blk->context) {
 +s->blk->context = g_strdup("main");
 +}

Or rather create a device-specific context by default?

Paolo

 +s->qctx = qcontext_find_by_name(s->blk->context, &err);
 +if (err) {
 +fprintf(stderr, "virtio-blk failed to start: %s", error_get_pretty(err));
 +exit(1);
 +}
 +aio_context_attach(s->ctx, s->qctx);
  }
  
  void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 @@ -517,15 +508,6 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
  s->stopping = true;
  trace_virtio_blk_data_plane_stop(s);
  
 -/* Stop thread or cancel pending thread creation BH */
 -if (s->start_bh) {
 -qemu_bh_delete(s->start_bh);
 -s->start_bh = NULL;
 -} else {
 -aio_notify(s->ctx);
 -qemu_thread_join(&s->thread);
 -}
 -
  aio_set_event_notifier(s->ctx, &s->io_notifier, NULL, NULL);
  ioq_cleanup(&s->ioqueue);
  
 diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
 index fc71853..c5514a4 100644
 --- a/include/hw/virtio/virtio-blk.h
 +++ b/include/hw/virtio/virtio-blk.h
 @@ -110,6 +110,7 @@ struct VirtIOBlkConf
  uint32_t scsi;
  uint32_t config_wce;
  uint32_t data_plane;
 +char *context;
  };
  
  struct VirtIOBlockDataPlane;
 @@ -138,13 +139,15 @@ typedef struct VirtIOBlock {
  DEFINE_BLOCK_CHS_PROPERTIES(_state, _field.conf),  \
  DEFINE_PROP_STRING("serial", _state, _field.serial),  \
  DEFINE_PROP_BIT("config-wce", _state, _field.config_wce, 0, true),  \
 -DEFINE_PROP_BIT("scsi", _state, _field.scsi, 0, true)
 +DEFINE_PROP_BIT("scsi", _state,

Re: [Qemu-devel] [RFC 0/9] QContext: QOM class to support multiple event loops

2013-05-06 Thread Paolo Bonzini

Il 03/05/2013 18:03, Michael Roth ha scritto:
 These patches apply on top of qemu.git master, and can also be obtained from:
 git://github.com/mdroth/qemu.git qcontext
 
 OVERVIEW
 
 This series introduces a set of QOM classes/interfaces for event
 registration/handling: QContext and QSource, which are based closely on
 their GMainContext/GSource GLib counterparts.
 
 QContexts can be created via the command-line via -object, and can also be
 instructed (via -object params/properties) to automatically start a
 thread/event-loop to handle QSources we attach to them.

This is an awesome idea.

However, it seems a bit overengineered.  Why do we need QSource at all?
 In my opinion, we should first change dataplane to use AioContext as a
GSource, and benchmark it thoroughly.  If it is fast enough, we can
just introduce a glib-based QContext and be done with it.  Hopefully
that is the case...

Paolo

Re: [Qemu-devel] [PATCH prep for-1.5 v2] prep: Add ELF support for -bios

2013-05-06 Thread Fabien Chouteau

On 05/05/2013 09:00 PM, Andreas Färber wrote:
 Am 05.05.2013 20:40, schrieb Alexander Graf:


 Am 05.05.2013 um 19:45 schrieb Andreas Färber andreas.faer...@web.de:

 This prepares for switching from OpenHack'Ware to OpenBIOS.

 While touching the error handling code, switch from aborting hw_error()
 to fprintf()+exit() and suppress failing without -bios for qtest.

 Signed-off-by: Andreas Färber andreas.faer...@web.de

 Acked-by: Alexander Graf ag...@suse.de
 
 Thanks, applied to prep-up:
 http://repo.or.cz/w/qemu/afaerber.git/shortlog/refs/heads/prep-up
 
 Sorry, forgot the changelog:
 * error handling was split up as suggested by Alex,
 * missing exit(1) calls were added and

BTW, why do you use fprintf()+exit() instead of hw_error()?

-- 
Fabien Chouteau

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Paolo Bonzini

Il 04/05/2013 12:42, Jan Kiszka ha scritto:
 On 2013-05-04 11:47, Paolo Bonzini wrote:
 Il 03/05/2013 10:04, Jan Kiszka ha scritto:
 We can't change the semantics of opaque as long as old_mmio /
 old_portio are around. But we need a flag anyway to indicate if
 a region is depending on BQL or not. Adding a separate Object
 *owner to MemoryRegion can serve both purposes. Then we define
 something like
 
 void memory_region_set_local_locking(MemoryRegion *mr, bool
 local_locking, Object *owner);
 
 to control the property (if local_locking is true, owner must
 be non-NULL, of course). That's quite similar to my old
 prototype here that had
 memory_region_set/clear_global_locking.
 
 I think setting the owner can be done separately from enabling
 local lock.  For example, memory_region_find could also have a
 variant that adds a ref to the owner.  It would be very similar
 to what Ping Fan is doing in the virtio-dataplane's HostMem data
 structure.
 
 That's trivial to break up, but I'm not sure if there will be
 reasonable scenarios where a region requires reference counting
 without being able to work without the BQL. RAM, e.g., should
 always work BQL-free (once we have the infrastructure in place).

I think we need to add an owner to all regions (tedious, but
doable---perhaps even scriptable).  The current code covers
address_space_rw, but memory_region_find remains callable only from
BQL-protected regions.  The caller of memory_region_find needs to be
able to inspect the MemoryRegion, even if it is just to fail on
non-RAM regions.  I would like to switch the dataplane code to use
memory_region_find.

BTW, have you seen
http://lwn.net/SubscriberLink/548909/b6fdd846f1232be6/ ? [*]  Perhaps
we can adopt something like that, it solves the same exact problem
that we have, and it's a well-known solution from the literature.

[*]  The subscriber link mechanism allows an LWN.net
 subscriber to generate a special URL for a
 subscription-only article. That URL can then be given to
 others, who will be able to access the article regardless
 of whether they are subscribed. This feature is made
 available as a service to LWN subscribers, and in the hope
 that they will use it to spread the word about their
 favorite LWN articles.

 And memory_region_find should likely always increment a reference
 if the target region has an owner. We should convert its users to
 properly dereference the region once done with it.

Yes.  But this is what requires you to have an owner for all regions.

Paolo

[Qemu-devel] [PATCH] qmp: fix handling of cmd with Equals in qmp-shell

2013-05-06 Thread Zhangleiqiang

qmp: fix handling of cmd with equal mark in qmp-shell

qmp-shell splits each argument of an input command into a name and a
value at the equals sign (=). But some commands take values that
contain an equals sign themselves, and for those the JSON built by
qmp-shell is not correct. For example, when using NBD as the target
of the block-backup command, the input
block-backup target=nbd+unix:///drive0?socket=/tmp/nbd.sock
will fail, because the JSON built will be as follows:

{
"execute":"block-backup",
"arguments":{"target":"nbd+unix:///drive0?socket"}
}

Fix it by joining all sections after the first one back together with
an equals sign in the __build_cmd function when the split produces
more than two sections.
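
As an illustration only (this is not the qmp-shell code itself, and the
function names are made up for the example), a minimal sketch of the
failure and of two equivalent ways to keep the value intact:

```python
def parse_naive(arg):
    """Split at every '=' and keep only the first two pieces (the bug)."""
    opt = arg.split('=')
    return opt[0], opt[1]


def parse_joined(arg):
    """Split at every '=', then re-join everything after the name."""
    opt = arg.split('=')
    if len(opt) > 2:
        opt[1] = '='.join(opt[1:])
    return opt[0], opt[1]


def parse_maxsplit(arg):
    """Split only at the first '=', so the value is never cut."""
    name, value = arg.split('=', 1)
    return name, value


arg = "target=nbd+unix:///drive0?socket=/tmp/nbd.sock"
print(parse_naive(arg))     # value is cut off after "socket"
print(parse_joined(arg))    # full value preserved
print(parse_maxsplit(arg))  # full value preserved
```

Both fixed variants give the same result; str.split with a maxsplit
of 1 simply avoids the re-join step.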

Signed-off-by: zhangleiqiang zhangleiqi...@huawei.com
---
 QMP/qmp-shell | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/QMP/qmp-shell b/QMP/qmp-shell
index d126e63..73cb3b6 100755
--- a/QMP/qmp-shell
+++ b/QMP/qmp-shell
@@ -99,6 +99,8 @@ class QMPShell(qmp.QEMUMonitorProtocol):
         for arg in cmdargs[1:]:
             opt = arg.split('=')
             try:
+                if(len(opt) > 2):
+                    opt[1] = '='.join(opt[1:])
                 value = int(opt[1])
             except ValueError:
                 if opt[1] == 'true':
-- 
1.8.1.4


--
Leiqzhang

Best Regards

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Jan Kiszka

On 2013-05-06 10:07, Paolo Bonzini wrote:
 Il 04/05/2013 12:42, Jan Kiszka ha scritto:
 On 2013-05-04 11:47, Paolo Bonzini wrote:
 Il 03/05/2013 10:04, Jan Kiszka ha scritto:
 We can't change the semantics of opaque as long as old_mmio /
 old_portio are around. But we need a flag anyway to indicate if
 a region is depending on BQL or not. Adding a separate Object
 *owner to MemoryRegion can serve both purposes. Then we define
 something like

 void memory_region_set_local_locking(MemoryRegion *mr, bool
 local_locking, Object *owner);

 to control the property (if local_locking is true, owner must
 be non-NULL, of course). That's quite similar to my old
 prototype here that had
 memory_region_set/clear_global_locking.

 I think setting the owner can be done separately from enabling
 local lock.  For example, memory_region_find could also have a
 variant that adds a ref to the owner.  It would be very similar
 to what Ping Fan is doing in the virtio-dataplane's HostMem data
 structure.
 
 That's trivial to break up, but I'm not sure if there will be
 reasonable scenarios where a region requires reference counting
 without being able to work without the BQL. RAM, e.g., should
 always work BQL-free (once we have the infrastructure in place).
 
 I think we need to add an owner to all regions (tedious, but
 doable---perhaps even scriptable).  The current code covers
 address_space_rw, but memory_region_find remains callable only from
 BQL-protected regions.  The caller of memory_region_find needs to be
 able to inspect the MemoryRegion, even if it is just to fail on
 non-RAM regions.  I would like to switch the dataplane code to use
 memory_region_find.
 
 BTW, have you seen
 http://lwn.net/SubscriberLink/548909/b6fdd846f1232be6/ ? [*]  Perhaps
 we can adopt something like that, it solves the same exact problem
 that we have, and it's a well-known solution from the literature.

That looks like a more handy wrapper around mutex_trylock + recursive locks.

But the problem is not the locking mechanism. The issue is the
(non-existing) roll-back logic in the device models. That's the tedious
work - if we would like to go that way. But I'm still optimistic we can
avoid it. We may need lock state inspection (mutex_is_locked), playing
with this ATM.

 
   [*]  The subscriber link mechanism allows an LWN.net
subscriber to generate a special URL for a
subscription-only article. That URL can then be given to
others, who will be able to access the article regardless
of whether they are subscribed. This feature is made
available as a service to LWN subscribers, and in the hope
that they will use it to spread the word about their
favorite LWN articles.
 
 And memory_region_find should likely always increment a reference
 if the target region has an owner. We should convert its users to
 properly dereference the region once done with it.
 
 Yes.  But this is what requires you to have an owner for all regions.

You don't need an owner for regions that are protected by the BQL (the
majority in the foreseeable future). For those regions, reference
counting can remain a nop, internally. But that's nothing their users
should care about.
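
A minimal sketch of that idea, as a toy model and not QEMU code (the
class and function names below are hypothetical): lookup always calls
ref/unref, and the calls simply do nothing for regions without an
owner, so callers never need to know the difference.

```python
class Owner:
    """Stand-in for the object that owns a region (hypothetical)."""
    def __init__(self, name):
        self.name = name
        self.refcount = 1


class Region:
    def __init__(self, name, start, size, owner=None):
        self.name = name
        self.start = start
        self.size = size
        self.owner = owner  # None models a region still protected by the BQL


def region_ref(region):
    # No-op for owner-less regions.
    if region.owner is not None:
        region.owner.refcount += 1


def region_unref(region):
    if region.owner is not None:
        region.owner.refcount -= 1


def find_region(regions, addr):
    """Return the region containing addr with a reference taken, or None."""
    for region in regions:
        if region.start <= addr < region.start + region.size:
            region_ref(region)
            return region
    return None


ram_owner = Owner("ram-device")
regions = [
    Region("ram", 0x0000, 0x1000, owner=ram_owner),
    Region("legacy-mmio", 0x1000, 0x100),  # no owner
]

found = find_region(regions, 0x0800)
print(found.name, ram_owner.refcount)
region_unref(found)  # the caller drops the reference once done
```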

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH v2 0/1] uhci: Use an intermediate buffer for usb packet data

2013-05-06 Thread Hans de Goede

Hi,

Due to various unfortunate reasons we cannot reliably detect a guest
cancelling a packet as soon as it happens, instead we detect cancels
with some delay.

When packets are handled async, and we directly pass the guest memory for
the packet to the usb-device as iovec, this means that the usb-device can
write to guest-memory which the guest has already re-used for other purposes
- not good!

This patch fixes this by adding an intermediate buffer and writing back not
only the result, but also the data, of async completed packets when scanning
the schedule.
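
A toy model of the idea, not the hcd-uhci.c code (all names here are
hypothetical): the device only ever writes into a private buffer, and
the data is copied into guest memory when the schedule scan finds a
completed packet that has not been cancelled.

```python
class GuestMemory:
    """Toy stand-in for guest RAM."""
    def __init__(self, size):
        self.ram = bytearray(size)

    def write(self, addr, data):
        self.ram[addr:addr + len(data)] = data

    def read(self, addr, length):
        return bytes(self.ram[addr:addr + length])


class AsyncPacket:
    def __init__(self, guest_addr, max_len):
        self.guest_addr = guest_addr
        self.buf = bytearray(max_len)  # intermediate buffer, not guest RAM
        self.actual_length = 0
        self.done = False
        self.cancelled = False


def device_complete(pkt, data):
    """The emulated USB device finishes later and fills the private buffer."""
    pkt.buf[:len(data)] = data
    pkt.actual_length = len(data)
    pkt.done = True


def scan_schedule(mem, pkt):
    """Write data back only for a completed packet that is still wanted."""
    if pkt.done and not pkt.cancelled:
        mem.write(pkt.guest_addr, pkt.buf[:pkt.actual_length])
        return True
    return False


mem = GuestMemory(64)

# The guest cancels a packet and reuses its buffer before the device is done.
stale = AsyncPacket(guest_addr=16, max_len=8)
stale.cancelled = True
mem.write(16, b"REUSED!!")
device_complete(stale, b"usbdata!")
scan_schedule(mem, stale)
print(mem.read(16, 8))  # the reused guest memory is untouched
```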

Changes in v2: Use usb_packet_addbuf instead of directly calling iovec_add

Signed-off-by: Hans de Goede hdego...@redhat.com

Regards,

Hans

[Qemu-devel] [PATCH] uhci: Use an intermediate buffer for usb packet data

2013-05-06 Thread Hans de Goede

Due to various unfortunate reasons we cannot reliably detect a guest
cancelling a packet as soon as it happens, instead we detect cancels
with some delay.

When packets are handled async, and we directly pass the guest memory for
the packet to the usb-device as iovec, this means that the usb-device can
write to guest-memory which the guest has already re-used for other purposes
- not good!

This patch fixes this by adding an intermediate buffer and writing back not
only the result, but also the data, of async completed packets when scanning
the schedule.

Signed-off-by: Hans de Goede hdego...@redhat.com
---
 hw/usb/hcd-uhci.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/hw/usb/hcd-uhci.c b/hw/usb/hcd-uhci.c
index f8c4286..c85b203 100644
--- a/hw/usb/hcd-uhci.c
+++ b/hw/usb/hcd-uhci.c
@@ -119,7 +119,8 @@ struct UHCIPCIDeviceClass {
 
 struct UHCIAsync {
 USBPacket packet;
-QEMUSGList sgl;
+uint8_t   static_buf[64]; /* 64 bytes is enough, except for isoc packets */
+uint8_t   *buf;
 UHCIQueue *queue;
 QTAILQ_ENTRY(UHCIAsync) next;
 uint32_t  td_addr;
@@ -264,7 +265,6 @@ static UHCIAsync *uhci_async_alloc(UHCIQueue *queue, uint32_t td_addr)
 async->queue = queue;
 async->td_addr = td_addr;
 usb_packet_init(&async->packet);
-pci_dma_sglist_init(&async->sgl, &queue->uhci->dev, 1);
 trace_usb_uhci_packet_add(async->queue->token, async->td_addr);
 
 return async;
@@ -274,7 +274,9 @@ static void uhci_async_free(UHCIAsync *async)
 {
 trace_usb_uhci_packet_del(async->queue->token, async->td_addr);
 usb_packet_cleanup(&async->packet);
-qemu_sglist_destroy(&async->sgl);
+if (async->buf != async->static_buf) {
+g_free(async->buf);
+}
 g_free(async);
 }
 
@@ -299,7 +301,6 @@ static void uhci_async_cancel(UHCIAsync *async)
  async->done);
 if (!async->done)
 usb_cancel_packet(&async->packet);
-usb_packet_unmap(&async->packet, &async->sgl);
 uhci_async_free(async);
 }
 
@@ -774,6 +775,7 @@ static int uhci_complete_td(UHCIState *s, UHCI_TD *td, UHCIAsync *async, uint32_
 *int_mask |= 0x01;
 
 if (pid == USB_TOKEN_IN) {
+pci_dma_write(&s->dev, td->buffer, async->buf, len);
 if ((td->ctrl & TD_CTRL_SPD) && len < max_len) {
 *int_mask |= 0x02;
 /* short packet: do not update QH */
@@ -881,12 +883,17 @@ static int uhci_handle_td(UHCIState *s, UHCIQueue *q, uint32_t qh_addr,
 spd = (pid == USB_TOKEN_IN && (td->ctrl & TD_CTRL_SPD) != 0);
 usb_packet_setup(&async->packet, pid, q->ep, 0, td_addr, spd,
  (td->ctrl & TD_CTRL_IOC) != 0);
-qemu_sglist_add(&async->sgl, td->buffer, max_len);
-usb_packet_map(&async->packet, &async->sgl);
+if (max_len <= sizeof(async->static_buf)) {
+async->buf = async->static_buf;
+} else {
+async->buf = g_malloc(max_len);
+}
+usb_packet_addbuf(&async->packet, async->buf, max_len);
 
 switch(pid) {
 case USB_TOKEN_OUT:
 case USB_TOKEN_SETUP:
+pci_dma_read(&s->dev, td->buffer, async->buf, max_len);
 usb_handle_packet(q->ep->dev, &async->packet);
 if (async->packet.status == USB_RET_SUCCESS) {
 async->packet.actual_length = max_len;
@@ -899,7 +906,6 @@ static int uhci_handle_td(UHCIState *s, UHCIQueue *q, uint32_t qh_addr,
 
 default:
 /* invalid pid : frame interrupted */
-usb_packet_unmap(&async->packet, &async->sgl);
 uhci_async_free(async);
 s->status |= UHCI_STS_HCPERR;
 uhci_update_irq(s);
@@ -916,7 +922,6 @@ static int uhci_handle_td(UHCIState *s, UHCIQueue *q, uint32_t qh_addr,
 
 done:
 ret = uhci_complete_td(s, td, async, int_mask);
-usb_packet_unmap(&async->packet, &async->sgl);
 uhci_async_free(async);
 return ret;
 }
-- 
1.8.2.1

Re: [Qemu-devel] [PATCH v2 1/4] Add i.MX FEC Ethernet driver

2013-05-06 Thread Michael S. Tsirkin

On Sun, May 05, 2013 at 11:00:24PM +0100, Peter Maydell wrote:
 On 5 May 2013 22:15, Michael S. Tsirkin m...@redhat.com wrote:
  On Sun, May 05, 2013 at 07:01:34PM +0100, Peter Maydell wrote:
  Sorry, you can't say this until we've sorted out the mess
  that is new-style networking options in a machine which
  creates embedded network controllers.
 
  What is missing exactly?
  Could you please give some examples of the problems
  that -netdev + -device has but -net does not have?
 
 -netdev + -device is fine (unsurprisingly since that's the
 PC usecase); -netdev + a device that's preinstantiated by the
 board is not so fine. And you can't use -device to instantiate
 most embedded network controllers because there's no way to
 wire up the IRQs and MMIOs.

Can't board code look for instantiated controllers
and wire them up?

 
 There's probably a nasty workaround involving '-global', but:
  * that requires the user to know the device name for the
onboard NIC for the board, which is a regression from
the -net situation
  * it's not clear how it works if the board has two NICs
of the same type

How does it work now?
I am guessing each -net nic gets mapped to a random device.
At some level that's worse than documenting about internal names,
we are teaching users to learn order of initialization
by trial and error and then rely on this.

  * if we claim -global is the right approach we need to actually
document it (and document all the board NIC names, yuck)
  * we need to fix existing boards which do the "don't instantiate
NIC unless the user said -net nic" trick by looking at
nd_table[]
  * we need to make the board code pass the right NIC properties
in both the legacy -net option and new style cases (at the
moment, for instance, lan911_init() insists on having a
NICInfo* passed to it)
 
 -net nic works for these use cases because it will operate on
 the NICs created by the machine models, because the machine
 models look at the nd_table[] when they create the NICs.
 
 thanks
 -- PMM


-- 
MST

Re: [Qemu-devel] [PATCH v2 1/4] Add i.MX FEC Ethernet driver

2013-05-06 Thread Peter Maydell

[cc'd Anthony since this has drifted into a more general topic]

On 6 May 2013 09:51, Michael S. Tsirkin m...@redhat.com wrote:
 On Sun, May 05, 2013 at 11:00:24PM +0100, Peter Maydell wrote:
 On 5 May 2013 22:15, Michael S. Tsirkin m...@redhat.com wrote:
  On Sun, May 05, 2013 at 07:01:34PM +0100, Peter Maydell wrote:
  Sorry, you can't say this until we've sorted out the mess
  that is new-style networking options in a machine which
  creates embedded network controllers.

  What is missing exactly?
  Could you please give some examples of the problems
  that -netdev + -device has but -net does not have?

 -netdev + -device is fine (unsurprisingly since that's the
 PC usecase); -netdev + a device that's preinstantiated by the
 board is not so fine. And you can't use -device to instantiate
 most embedded network controllers because there's no way to
 wire up the IRQs and MMIOs.

 Can't board code look for instantiated controllers
 and wire them up?

I don't think this will work, because -device does both
'instance_init' and 'realize', and some of the things the
board needs to set and wire up must be done before 'realize'.

 There's probably a nasty workaround involving '-global', but:
  * that requires the user to know the device name for the
onboard NIC for the board, which is a regression from
the -net situation
  * it's not clear how it works if the board has two NICs
of the same type

 How does it work now?
 I am guessing each -net nic gets mapped to a random device.
 At some level that's worse than documenting about internal names,
 we are teaching users to learn order of initialization
 by trial and error and then rely on this.

Well, it gets mapped to a specific device (hopefully we pick
the same order as the kernel so first nic is eth0, second
is eth1, and so on). This isn't a question of initialization
order, because you can happily initialize the NIC corresponding
to nd_table[1] before the one for nd_table[0] if you like.
It's just a matter of picking which bit of hardware we call
the first ethernet device, in the same way that we pick
one of two serial ports to call the first serial port.
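
A rough sketch of that point, as a toy model only (the board, the NIC
names and the dict layout are made up; in QEMU nd_table[] holds
NICInfo structs): each onboard controller is tied to a fixed
nd_table[] index, so the binding does not depend on the order in
which the board code creates the devices.

```python
# Index i holds the configuration from the i-th "-net nic" option.
nd_table = [
    {"macaddr": "52:54:00:12:34:56"},
    {"macaddr": "52:54:00:12:34:57"},
]

# Hypothetical board with two onboard controllers, each with a fixed slot.
BOARD_NIC_SLOTS = {"nic0": 0, "nic1": 1}


def create_board_nic(name):
    """Create an onboard NIC from its fixed nd_table[] slot, if configured."""
    slot = BOARD_NIC_SLOTS[name]
    if slot >= len(nd_table):
        return None  # the user did not ask for this NIC
    nic = {"name": name}
    nic.update(nd_table[slot])
    return nic


# Creating the second controller first does not change which config it gets.
second = create_board_nic("nic1")
first = create_board_nic("nic0")
print(first["macaddr"], second["macaddr"])
```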

thanks
-- PMM

Re: [Qemu-devel] [PATCH v2 1/4] Add i.MX FEC Ethernet driver

2013-05-06 Thread Michael S. Tsirkin

On Mon, May 06, 2013 at 10:08:42AM +0100, Peter Maydell wrote:
 [cc'd Anthony since this has drifted into a more general topic]
 
 On 6 May 2013 09:51, Michael S. Tsirkin m...@redhat.com wrote:
  On Sun, May 05, 2013 at 11:00:24PM +0100, Peter Maydell wrote:
  On 5 May 2013 22:15, Michael S. Tsirkin m...@redhat.com wrote:
   On Sun, May 05, 2013 at 07:01:34PM +0100, Peter Maydell wrote:
   Sorry, you can't say this until we've sorted out the mess
   that is new-style networking options in a machine which
   creates embedded network controllers.
 
   What is missing exactly?
   Could you please give some examples of the problems
   that -netdev + -device has but -net does not have?
 
  -netdev + -device is fine (unsurprisingly since that's the
  PC usecase); -netdev + a device that's preinstantiated by the
  board is not so fine. And you can't use -device to instantiate
  most embedded network controllers because there's no way to
  wire up the IRQs and MMIOs.
 
  Can't board code look for instantiated controllers
  and wire them up?
 
 I don't think this will work, because -device does both
 'instance_init' and 'realize', and some of the things the
 board needs to set and wire up must be done before 'realize'.

Well let's add a flag that tells QOM to delay realize then?
It's not abstract but maybe embedded type?

  There's probably a nasty workaround involving '-global', but:
   * that requires the user to know the device name for the
 onboard NIC for the board, which is a regression from
 the -net situation
   * it's not clear how it works if the board has two NICs
 of the same type
 
  How does it work now?
  I am guessing each -net nic gets mapped to a random device.
  At some level that's worse than documenting about internal names,
  we are teaching users to learn order of initialization
  by trial and error and then rely on this.
 
 Well, it gets mapped to a specific device (hopefully we pick
 the same order as the kernel so first nic is eth0, second
 is eth1, and so on). This isn't a question of initialization
 order, because you can happily initialize the NIC corresponding
 to nd_table[1] before the one for nd_table[0] if you like.
 It's just a matter of picking which bit of hardware we call
 the first ethernet device, in the same way that we pick
 one of two serial ports to call the first serial port.
 
 thanks
 -- PMM

In other words, it's an undocumented hack :(
Scary as it sounds, for this case I like documenting
internal names better.

Re: [Qemu-devel] [PATCH] [KVM] Needless to update msi route when only msi-x entry control section changed

2013-05-06 Thread Michael S. Tsirkin

On Mon, May 06, 2013 at 02:52:37AM +, Zhanghaoyu (A) wrote:
  Older Linux guests (e.g. RHEL 5.5) mask and unmask the MSI-X vector on
  every interrupt in their ISR. Each mask/unmask causes a VMEXIT, and QEMU
  then invokes kvm_irqchip_update_msi_route() to ask the KVM hypervisor to
  update the VM irq routing table. KVM has to synchronize RCU after
  updating the routing table, so a lot of time is spent waiting in
  wait_rcu_gp(). As a result CPU usage inside the VM is very high, while
  from the host's point of view the VM's total CPU usage is very low.
  Masking/unmasking an MSI-X vector only sets the entry's vector control
  field, so there is no need to update the VM irq routing table.
  
  Signed-off-by: Zhang Haoyu haoyu.zh...@huawei.com
  Signed-off-by: Huang Weidong weidong.hu...@huawei.com
  Signed-off-by: Qin Chuanyu qinchua...@huawei.com
  ---
  hw/i386/kvm/pci-assign.c | 3 +++
  1 files changed, 3 insertions(+)
  
  --- a/hw/i386/kvm/pci-assign.c  2013-05-04 15:53:18.0 +0800
  +++ b/hw/i386/kvm/pci-assign.c  2013-05-04 15:50:46.0 +0800
  @@ -1576,6 +1576,8 @@ static void assigned_dev_msix_mmio_write
   MSIMessage msg;
   int ret;
  
  +/* Needless to update msi route when only msi-x entry control section changed */
  +if ((addr & (PCI_MSIX_ENTRY_SIZE - 1)) != PCI_MSIX_ENTRY_VECTOR_CTRL){
   msg.address = entry->addr_lo |
   ((uint64_t)entry->addr_hi << 32);
   msg.data = entry->data;
  @@ -1585,6 +1587,7 @@ static void assigned_dev_msix_mmio_write
   if (ret) {
   error_report("Error updating irq routing entry (%d)", ret);
   }
  +}
   }
   }
   }
  
  Thanks,
  Zhang Haoyu
 
 
 If guest wants to update the vector, it does it like this:
 mask
 update
 unmask
 and it looks like the only point where we update the vector is on unmask, so 
 this patch will mean we don't update the vector ever.
 
 I'm not sure this combination (old guest + legacy device assignment
 framework) is worth optimizing. Can you try VFIO instead?
 
 But if it is, the right way to do this is probably along the lines of the 
 below patch. Want to try it out?
 
 diff --git a/kvm-all.c b/kvm-all.c
 index 2d92721..afe2327 100644
 --- a/kvm-all.c
 +++ b/kvm-all.c
 @@ -1006,6 +1006,11 @@ static int kvm_update_routing_entry(KVMState *s,
  continue;
  }
  
 +if (entry->type == new_entry->type &&
 +entry->flags == new_entry->flags &&
 +entry->u == new_entry->u) {
 +return 0;
 +}
  entry-type = new_entry-type;
  entry-flags = new_entry-flags;
  entry-u = new_entry-u;
 
 
 union type cannot be directly compared, I tried out below patch instead,
 --- a/kvm-all.c 2013-05-06 09:56:38.0 +0800
 +++ b/kvm-all.c 2013-05-06 09:56:45.0 +0800
 @@ -1008,6 +1008,12 @@ static int kvm_update_routing_entry(KVMS
  continue;
  }
 
 +if (entry->type == new_entry->type &&
 +entry->flags == new_entry->flags &&
 +!memcmp(&entry->u, &new_entry->u, sizeof(entry->u))) {
 +return 0;
 +}
 +
  entry-type = new_entry-type;
  entry-flags = new_entry-flags;
  entry-u = new_entry-u;
 
 MST's patch is more general than my first patch, which only changed
 assigned_dev_msix_mmio_write().
 It also covers the case where fields of the MSI-X entry other than the
 vector control field are rewritten with values identical to the old entry's.
 MST's patch also works in the non-passthrough scenario.
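
The "skip the update when nothing changed" check can be sketched in isolation. The struct below is a hypothetical stand-in, not the kernel's routing entry layout; the point is only that scalar fields compare with == while the union member has to go through memcmp(). One assumption worth stating: memcmp() also compares bytes of the union that the active member does not use, so both entries must have been zero-initialised for the comparison to be meaningful.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a routing entry; not the kernel's layout. */
struct route_entry {
    uint32_t gsi;
    uint32_t type;
    uint32_t flags;
    union {
        struct { uint32_t address_lo, address_hi, data; } msi;
        uint32_t pad[8];
    } u;
};

/* True if replacing old_e with new_e would change nothing. */
static bool route_entry_unchanged(const struct route_entry *old_e,
                                  const struct route_entry *new_e)
{
    return old_e->type == new_e->type &&
           old_e->flags == new_e->flags &&
           memcmp(&old_e->u, &new_e->u, sizeof(old_e->u)) == 0;
}
```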

Any numbers for either case?

 And, after MST's patch applied, the below check in 
 virtio_pci_vq_vector_unmask() can be removed.
 --- a/hw/virtio/virtio-pci.c2013-05-04 15:53:20.0 +0800
 +++ b/hw/virtio/virtio-pci.c2013-05-06 10:25:58.0 +0800
 @@ -619,12 +619,10 @@ static int virtio_pci_vq_vector_unmask(V
 
  if (proxy->vector_irqfd) {
      irqfd = &proxy->vector_irqfd[vector];
 -    if (irqfd->msg.data != msg.data || irqfd->msg.address != msg.address) {
      ret = kvm_irqchip_update_msi_route(kvm_state, irqfd->virq, msg);
      if (ret < 0) {
          return ret;
      }
 -    }
  }
 
  /* If guest supports masking, irqfd is already setup, unmask it.
 
 Thanks,
 Zhang Haoyu

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Paolo Bonzini

On 06/05/2013 10:40, Jan Kiszka wrote:



 And memory_region_find should likely always increment a reference
 if the target region has an owner. We should convert its users to
 properly dereference the region once done with it.

 Yes.  But this is what requires you to have an owner for all regions.
 
 You don't need an owner for regions that are protected by the BQL (the
 majority in the foreseeable future). For those regions, reference
 counting can remain a nop, internally.

The problem is that even if I/O for a region is supposed to happen
within the BQL, lookup can happen outside the BQL.  Lookup will use the
region even if it is just to discard it:

   VCPU thread (under BQL)              device thread
   ------------------------------------------------------------------------
                                        flatview_ref
                                        memory_region_find returns d->mr
                                        memory_region_ref(d->mr) /* nop */
   qdev_free(d)
     object_unparent(d)
       unrealize(d)
         memory_region_del_subregion(d->mr)
           FlatView updated, d->mr not in the new view

                                        flatview_unref
                                          memory_region_unref(d->mr)
                                            object_unref(d)
                                              free(d)
                                        if (!d->mr->is_ram) { /* BAD! */
                                          memory_region_unref(d->mr) /* nop */
                                          return error
                                        }


Here, the memory region is dereferenced *before* we know that it is BQL-free
(in fact, exactly to ascertain whether it is BQL-free).

We can hack around it by putting an is_ram field in FlatRange and
MemoryRegionSection, but it is not a solution.  Here is how giving an
owner to all regions fixes it:

   VCPU thread (under BQL)              device thread
   ------------------------------------------------------------------------
                                        flatview_ref
                                        memory_region_find returns d->mr
                                        memory_region_ref(d->mr)
                                          object_ref(d)

   qdev_free(d)
     object_unparent(d)
       unrealize(d)
         memory_region_del_subregion(d->mr)
           FlatView updated, d->mr not in the new view

                                        flatview_unref
                                          memory_region_unref(d->mr)
                                            object_unref(d) /* still alive! */

                                        if (!d->mr->is_ram) {
                                          memory_region_unref(d->mr)
                                            object_unref(d)
                                              free(d)
                                          return error
                                        }
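
The difference between the two timelines can be reduced to a small model. This is a sketch with hypothetical types (`struct owner`, `struct region`), not QEMU's object model, and it uses a plain int where real code would need an atomic counter. A region without an owner turns ref/unref into no-ops, which is the unsafe first case; a region with an owner pins the device until the last unref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical owner object with a plain (non-atomic) reference count. */
struct owner {
    int refcnt;
    bool freed;
};

struct region {
    struct owner *owner; /* NULL models a region without an owner */
    bool is_ram;
};

static void owner_ref(struct owner *o)
{
    o->refcnt++;
}

static void owner_unref(struct owner *o)
{
    if (--o->refcnt == 0) {
        o->freed = true; /* stands in for free(d) */
    }
}

/* Without an owner these do nothing, which is the unsafe case. */
static void region_ref(struct region *mr)
{
    if (mr->owner) {
        owner_ref(mr->owner);
    }
}

static void region_unref(struct region *mr)
{
    if (mr->owner) {
        owner_unref(mr->owner);
    }
}
```

In the owned case the lookup's reference keeps the device alive across the FlatView's unref, so reading `is_ram` afterwards is safe and the device is freed only by the final `region_unref()`.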

Paolo

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Jan Kiszka

On 2013-05-06 12:27, Paolo Bonzini wrote:
 On 06/05/2013 10:40, Jan Kiszka wrote:
 


 And memory_region_find should likely always increment a reference
 if the target region has an owner. We should convert its users to
 properly dereference the region once done with it.

 Yes.  But this is what requires you to have an owner for all regions.

 You don't need an owner for regions that are protected by the BQL (the
 majority in the foreseeable future). For those regions, reference
 counting can remain a nop, internally.
 
 The problem is that even if I/O for a region is supposed to happen
 within the BQL, lookup can happen outside the BQL.  Lookup will use the
 region even if it is just to discard it:
 
    VCPU thread (under BQL)              device thread
    ------------------------------------------------------------------------
                                         flatview_ref
                                         memory_region_find returns d->mr
                                         memory_region_ref(d->mr) /* nop */
    qdev_free(d)
      object_unparent(d)
        unrealize(d)
          memory_region_del_subregion(d->mr)
            FlatView updated, d->mr not in the new view

                                         flatview_unref
                                           memory_region_unref(d->mr)
                                             object_unref(d)
                                               free(d)
                                         if (!d->mr->is_ram) { /* BAD! */
                                           memory_region_unref(d->mr) /* nop */
                                           return error
                                         }
 
 
 Here, the memory region is dereferenced *before* we know that it is BQL-free
 (in fact, exactly to ascertain whether it is BQL-free).

Both flatview update and lookup *plus* locking type evaluation (i.e.
memory region dereferencing) always happen under the address space lock.
See Pingfan's patch.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Paolo Bonzini

On 06/05/2013 12:56, Jan Kiszka wrote:
 The problem is that even if I/O for a region is supposed to happen
 within the BQL, lookup can happen outside the BQL.  Lookup will use the
 region even if it is just to discard it:

    VCPU thread (under BQL)              device thread
    ------------------------------------------------------------------------
                                         flatview_ref
                                         memory_region_find returns d->mr
                                         memory_region_ref(d->mr) /* nop */
    qdev_free(d)
      object_unparent(d)
        unrealize(d)
          memory_region_del_subregion(d->mr)
            FlatView updated, d->mr not in the new view

                                         flatview_unref
                                           memory_region_unref(d->mr)
                                             object_unref(d)
                                               free(d)
                                         if (!d->mr->is_ram) { /* BAD! */
                                           memory_region_unref(d->mr) /* nop */
                                           return error
                                         }


 Here, the memory region is dereferenced *before* we know that it is BQL-free
 (in fact, exactly to ascertain whether it is BQL-free).
 
 Both flatview update and lookup *plus* locking type evaluation (i.e.
 memory region dereferencing) always happen under the address space lock.
 See Pingfan's patch.

That's true of address_space_rw/map, but I don't think it holds for
memory_region_find.

Paolo

Re: [Qemu-devel] [PATCH] prep: Fix software reset

2013-05-06 Thread Julio Guerra

2013/5/6 Andreas Färber andreas.faer...@web.de:

 Thanks, applied this bit to prep-up:
 http://repo.or.cz/w/qemu/afaerber.git/shortlog/refs/heads/prep-up


OK. I just saw Hervé's work on system I/O and your short discussion.
So should I just wait for now until a soft-reset method is agreed on and
implemented?

--
Julio Guerra

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Jan Kiszka

On 2013-05-06 12:58, Paolo Bonzini wrote:
 On 06/05/2013 12:56, Jan Kiszka wrote:
 The problem is that even if I/O for a region is supposed to happen
 within the BQL, lookup can happen outside the BQL.  Lookup will use the
 region even if it is just to discard it:

    VCPU thread (under BQL)              device thread
    ------------------------------------------------------------------------
                                         flatview_ref
                                         memory_region_find returns d->mr
                                         memory_region_ref(d->mr) /* nop */
    qdev_free(d)
      object_unparent(d)
        unrealize(d)
          memory_region_del_subregion(d->mr)
            FlatView updated, d->mr not in the new view

                                         flatview_unref
                                           memory_region_unref(d->mr)
                                             object_unref(d)
                                               free(d)
                                         if (!d->mr->is_ram) { /* BAD! */
                                           memory_region_unref(d->mr) /* nop */
                                           return error
                                         }


 Here, the memory region is dereferenced *before* we know that it is BQL-free
 (in fact, exactly to ascertain whether it is BQL-free).

 Both flatview update and lookup *plus* locking type evaluation (i.e.
 memory region dereferencing) always happen under the address space lock.
 See Pingfan's patch.
 
 That's true of address_space_rw/map, but I don't think it holds for
 memory_region_find.

It has to, or it would be broken: Either it is called on a region that
supports reference counting and, thus, increments the counter before
returning, or it has to be called with the BQL held.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH v7 5/7] memory: make mmio dispatch able to be out of biglock

2013-05-06 Thread Paolo Bonzini

On 25/11/2012 03:03, Liu Ping Fan wrote:
 @@ -3550,12 +3668,11 @@ void *address_space_map(AddressSpace *as,
  target_phys_addr_t *plen,
  bool is_write)
  {
 -    AddressSpaceDispatch *d = as->dispatch;
      target_phys_addr_t len = *plen;
      target_phys_addr_t todo = 0;
      int l;
      target_phys_addr_t page;
 -    MemoryRegionSection *section;
 +    MemoryRegionSection *section, mr_obj;
      ram_addr_t raddr = RAM_ADDR_MAX;
      ram_addr_t rlen;
      void *ret;
 @@ -3565,7 +3682,8 @@ void *address_space_map(AddressSpace *as,
          l = (page + TARGET_PAGE_SIZE) - addr;
          if (l > len)
              l = len;
 -        section = phys_page_find(d, page >> TARGET_PAGE_BITS);
 +        address_space_section_lookup_ref(as, page >> TARGET_PAGE_BITS, &mr_obj);
 +        section = &mr_obj;

          if (!(memory_region_is_ram(section->mr) && !section->readonly)) {
              if (todo || bounce.buffer) {
 @@ -3579,6 +3697,7 @@ void *address_space_map(AddressSpace *as,
              }

              *plen = l;
 +            memory_region_section_unref(&mr_obj);
              return bounce.buffer;
          }
          if (!todo) {
 @@ -3589,6 +3708,7 @@ void *address_space_map(AddressSpace *as,
          len -= l;
          addr += l;
          todo += l;
 +        memory_region_section_unref(&mr_obj);
  }
  rlen = todo;
  ret = qemu_ram_ptr_length(raddr, rlen);

I think this unref is wrong.  You need to delay it to the
address_space_unmap, and this in turn requires changing the signature
of address_space_map.

Paolo

Re: [Qemu-devel] [PATCH v7 5/7] memory: make mmio dispatch able to be out of biglock

2013-05-06 Thread Jan Kiszka

On 2013-05-06 13:21, Paolo Bonzini wrote:
 On 25/11/2012 03:03, Liu Ping Fan wrote:
 @@ -3550,12 +3668,11 @@ void *address_space_map(AddressSpace *as,
  target_phys_addr_t *plen,
  bool is_write)
  {
 -    AddressSpaceDispatch *d = as->dispatch;
      target_phys_addr_t len = *plen;
      target_phys_addr_t todo = 0;
      int l;
      target_phys_addr_t page;
 -    MemoryRegionSection *section;
 +    MemoryRegionSection *section, mr_obj;
      ram_addr_t raddr = RAM_ADDR_MAX;
      ram_addr_t rlen;
      void *ret;
 @@ -3565,7 +3682,8 @@ void *address_space_map(AddressSpace *as,
          l = (page + TARGET_PAGE_SIZE) - addr;
          if (l > len)
              l = len;
 -        section = phys_page_find(d, page >> TARGET_PAGE_BITS);
 +        address_space_section_lookup_ref(as, page >> TARGET_PAGE_BITS, &mr_obj);
 +        section = &mr_obj;

          if (!(memory_region_is_ram(section->mr) && !section->readonly)) {
              if (todo || bounce.buffer) {
 @@ -3579,6 +3697,7 @@ void *address_space_map(AddressSpace *as,
              }

              *plen = l;
 +            memory_region_section_unref(&mr_obj);
              return bounce.buffer;
          }
          if (!todo) {
 @@ -3589,6 +3708,7 @@ void *address_space_map(AddressSpace *as,
          len -= l;
          addr += l;
          todo += l;
 +        memory_region_section_unref(&mr_obj);
  }
  rlen = todo;
  ret = qemu_ram_ptr_length(raddr, rlen);
 
 I think this unref is wrong.  You need to delay it to the
 address_space_unmap, and this in turn requires changing the signature
 of address_space_map.

Can't RAMBlock hold a reference to the associated region? Then this
could be retrieved on unmap without bothering the caller.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH v7 5/7] memory: make mmio dispatch able to be out of biglock

2013-05-06 Thread Paolo Bonzini

On 06/05/2013 13:25, Jan Kiszka wrote:
 On 2013-05-06 13:21, Paolo Bonzini wrote:
 On 25/11/2012 03:03, Liu Ping Fan wrote:
 @@ -3550,12 +3668,11 @@ void *address_space_map(AddressSpace *as,
  target_phys_addr_t *plen,
  bool is_write)
  {
 -    AddressSpaceDispatch *d = as->dispatch;
      target_phys_addr_t len = *plen;
      target_phys_addr_t todo = 0;
      int l;
      target_phys_addr_t page;
 -    MemoryRegionSection *section;
 +    MemoryRegionSection *section, mr_obj;
      ram_addr_t raddr = RAM_ADDR_MAX;
      ram_addr_t rlen;
      void *ret;
 @@ -3565,7 +3682,8 @@ void *address_space_map(AddressSpace *as,
          l = (page + TARGET_PAGE_SIZE) - addr;
          if (l > len)
              l = len;
 -        section = phys_page_find(d, page >> TARGET_PAGE_BITS);
 +        address_space_section_lookup_ref(as, page >> TARGET_PAGE_BITS, &mr_obj);
 +        section = &mr_obj;

          if (!(memory_region_is_ram(section->mr) && !section->readonly)) {
              if (todo || bounce.buffer) {
 @@ -3579,6 +3697,7 @@ void *address_space_map(AddressSpace *as,
              }

              *plen = l;
 +            memory_region_section_unref(&mr_obj);
              return bounce.buffer;
          }
          if (!todo) {
 @@ -3589,6 +3708,7 @@ void *address_space_map(AddressSpace *as,
          len -= l;
          addr += l;
          todo += l;
 +        memory_region_section_unref(&mr_obj);
  }
  rlen = todo;
  ret = qemu_ram_ptr_length(raddr, rlen);

 I think this unref is wrong.  You need to delay it to the
 address_space_unmap, and this in turn requires changing the signature
 of address_space_map.
 
 Can't RAMBlock hold a reference to the associated region? Then this
 could be retrieved on unmap without bothering the caller.

Right you are. :)  In fact, RAMBlock already does have block->mr.
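
A rough sketch of that idea, assuming a much simplified block structure: the map side pins the region, and the unmap side finds the block from the host pointer alone and drops the pin through its `mr` field, so the signature of the map call does not have to change. All names here (`struct ram_block`, `ram_map`, `ram_unmap`, `block_from_host`) are hypothetical and only illustrate the shape of the approach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct region {
    int refcnt;
};

/* Hypothetical, simplified RAM block: a host range plus its region. */
struct ram_block {
    struct region *mr;
    uint8_t *host;
    size_t len;
};

/* Find the block containing host pointer p, or NULL if none does. */
static struct ram_block *block_from_host(struct ram_block *blocks, size_t n,
                                         const void *p)
{
    uintptr_t a = (uintptr_t)p;

    for (size_t i = 0; i < n; i++) {
        uintptr_t start = (uintptr_t)blocks[i].host;
        if (a >= start && a - start < blocks[i].len) {
            return &blocks[i];
        }
    }
    return NULL;
}

/* Map: pin the region for as long as the caller holds the pointer. */
static void *ram_map(struct ram_block *b, size_t offset)
{
    b->mr->refcnt++;
    return b->host + offset;
}

/* Unmap: recover the region from the pointer alone and drop the pin. */
static void ram_unmap(struct ram_block *blocks, size_t n, void *p)
{
    struct ram_block *b = block_from_host(blocks, n, p);

    if (b) {
        b->mr->refcnt--;
    }
}
```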

Paolo

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Paolo Bonzini

On 06/05/2013 13:11, Jan Kiszka wrote:
 On 2013-05-06 12:58, Paolo Bonzini wrote:
 On 06/05/2013 12:56, Jan Kiszka wrote:
 The problem is that even if I/O for a region is supposed to happen
 within the BQL, lookup can happen outside the BQL.  Lookup will use the
 region even if it is just to discard it:

    VCPU thread (under BQL)              device thread
    ------------------------------------------------------------------------
                                         flatview_ref
                                         memory_region_find returns d->mr
                                         memory_region_ref(d->mr) /* nop */
    qdev_free(d)
      object_unparent(d)
        unrealize(d)
          memory_region_del_subregion(d->mr)
            FlatView updated, d->mr not in the new view

                                         flatview_unref
                                           memory_region_unref(d->mr)
                                             object_unref(d)
                                               free(d)
                                         if (!d->mr->is_ram) { /* BAD! */
                                           memory_region_unref(d->mr) /* nop */
                                           return error
                                         }


 Here, the memory region is dereferenced *before* we know that it is 
 BQL-free
 (in fact, exactly to ascertain whether it is BQL-free).

 Both flatview update and lookup *plus* locking type evaluation (i.e.
 memory region dereferencing) always happen under the address space lock.
 See Pingfan's patch.

 That's true of address_space_rw/map, but I don't think it holds for
 memory_region_find.
 
 It has to, or it would be broken: Either it is called on a region that
 supports reference counting

You cannot know that in advance, can you?  The address is decided by the
guest.

 and, thus, increments the counter before
 returning, or it has to be called with the BQL held.

... or we need to support reference counting on all regions, so that the
other possibility is automatically true.

Strictly speaking, only regions that can be unplugged need to support
reference counting.


Paolo

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Jan Kiszka

On 2013-05-06 13:28, Paolo Bonzini wrote:
 On 06/05/2013 13:11, Jan Kiszka wrote:
 On 2013-05-06 12:58, Paolo Bonzini wrote:
 On 06/05/2013 12:56, Jan Kiszka wrote:
 The problem is that even if I/O for a region is supposed to happen
 within the BQL, lookup can happen outside the BQL.  Lookup will use the
 region even if it is just to discard it:

    VCPU thread (under BQL)              device thread
    ------------------------------------------------------------------------
                                         flatview_ref
                                         memory_region_find returns d->mr
                                         memory_region_ref(d->mr) /* nop */
    qdev_free(d)
      object_unparent(d)
        unrealize(d)
          memory_region_del_subregion(d->mr)
            FlatView updated, d->mr not in the new view

                                         flatview_unref
                                           memory_region_unref(d->mr)
                                             object_unref(d)
                                               free(d)
                                         if (!d->mr->is_ram) { /* BAD! */
                                           memory_region_unref(d->mr) /* nop */
                                           return error
                                         }


 Here, the memory region is dereferenced *before* we know that it is 
 BQL-free
 (in fact, exactly to ascertain whether it is BQL-free).

 Both flatview update and lookup *plus* locking type evaluation (i.e.
 memory region dereferencing) always happen under the address space lock.
 See Pingfan's patch.

 That's true of address_space_rw/map, but I don't think it holds for
 memory_region_find.

 It has to, or it would be broken: Either it is called on a region that
 supports reference counting
 
 You cannot know that in advance, can you?  The address is decided by the
 guest.

You need to help me again with the context: In which case is this a
hot-path that we want to keep BQL-free? Current users of
memory_region_find appear to be all relatively slow paths, thus are fine
with staying under BQL.

 
 and, thus, increments the counter before
 returning, or it has to be called with the BQL held.
 
 ... or we need to support reference counting on all regions, so that the
 other possibility is automatically true.
 
 Strictly speaking, only regions that can be unplugged need to support
 reference counting.

That should make the conversion, if actually required, more bearable.
Having to assign an owner to every region around as a precondition to
introduce a new concept with initially less than a handful of users
would be too much, I suppose.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Paolo Bonzini

On 06/05/2013 13:39, Jan Kiszka wrote:
 On 2013-05-06 13:28, Paolo Bonzini wrote:
 On 06/05/2013 13:11, Jan Kiszka wrote:
 On 2013-05-06 12:58, Paolo Bonzini wrote:
 On 06/05/2013 12:56, Jan Kiszka wrote:
 The problem is that even if I/O for a region is supposed to happen
 within the BQL, lookup can happen outside the BQL.  Lookup will use the
 region even if it is just to discard it:

    VCPU thread (under BQL)              device thread
    ------------------------------------------------------------------------
                                         flatview_ref
                                         memory_region_find returns d->mr
                                         memory_region_ref(d->mr) /* nop */
    qdev_free(d)
      object_unparent(d)
        unrealize(d)
          memory_region_del_subregion(d->mr)
            FlatView updated, d->mr not in the new view

                                         flatview_unref
                                           memory_region_unref(d->mr)
                                             object_unref(d)
                                               free(d)
                                         if (!d->mr->is_ram) { /* BAD! */
                                           memory_region_unref(d->mr) /* nop */
                                           return error
                                         }


 Here, the memory region is dereferenced *before* we know that it is 
 BQL-free
 (in fact, exactly to ascertain whether it is BQL-free).

 Both flatview update and lookup *plus* locking type evaluation (i.e.
 memory region dereferencing) always happen under the address space lock.
 See Pingfan's patch.

 That's true of address_space_rw/map, but I don't think it holds for
 memory_region_find.

 It has to, or it would be broken: Either it is called on a region that
 supports reference counting

 You cannot know that in advance, can you?  The address is decided by the
 guest.
 
 Need to help me again to get the context: In which case is this a
 hot-path that we want to keep BQL-free? Current users of
 memory_region_find appear to be all relatively slow paths, thus are fine
 with staying under BQL.

virtio-blk-dataplane is basically redoing memory_region_find with a
separate data structure, exactly so that it can run outside the BQL
before we get BQL-free MMIO dispatch.

I can try to post patches later today that actually use
memory_region_find instead.

 and, thus, increments the counter before
 returning, or it has to be called with the BQL held.

 ... or we need to support reference counting on all regions, so that the
 other possibility is automatically true.

 Strictly speaking, only regions that can be unplugged need to support
 reference counting.
 
 That should make the conversion, if actually required, more bearable.
 Having to assign an owner to every region around as a precondition to
 introduce a new concept with initially less than a handful of users
 would be too much, I suppose.

I agree.  Though I have some hope that it could be scripted, this
exception would make the conversion possible for non-qdevified devices.

Paolo

Re: [Qemu-devel] [uq/master PATCH] kvmvapic: add ioport read accessor

2013-05-06 Thread Gleb Natapov

On Sun, May 05, 2013 at 05:51:49PM -0300, Marcelo Tosatti wrote:
 
 Necessary since memory region accessor assumes read and write
 methods are registered. Otherwise reading I/O port 0x7e segfaults.
 
 https://bugzilla.redhat.com/show_bug.cgi?id=954306
 
 Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
 
Applied, thanks.

 diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
 index 5b558aa..655483b 100644
 --- a/hw/i386/kvmvapic.c
 +++ b/hw/i386/kvmvapic.c
 @@ -687,8 +687,14 @@ static void vapic_write(void *opaque, hwaddr addr, 
 uint64_t data,
  }
  }
  
 +static uint64_t vapic_read(void *opaque, hwaddr addr, unsigned size)
 +{
 +return 0x;
 +}
 +
  static const MemoryRegionOps vapic_ops = {
  .write = vapic_write,
 +.read = vapic_read,
  .endianness = DEVICE_NATIVE_ENDIAN,
  };
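
For illustration only, the failure mode and a defensive alternative can be sketched outside QEMU. The patch above fixes the crash by registering a read accessor; the sketch below instead shows a dispatcher that tolerates a missing callback, which is a different technique and not what was applied. `struct region_ops`, `dispatch_read` and `all_ones` are hypothetical names, and returning all ones for an unhandled read is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback table, loosely modelled on a read/write ops pair. */
struct region_ops {
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned size);
    void (*write)(void *opaque, uint64_t addr, uint64_t data, unsigned size);
};

/* All bits set for an access of `size` bytes. */
static uint64_t all_ones(unsigned size)
{
    return size >= 8 ? UINT64_MAX : (UINT64_C(1) << (size * 8)) - 1;
}

/*
 * Calling ops->read unconditionally would dereference a NULL function
 * pointer for write-only regions; check first and fall back to all ones.
 */
static uint64_t dispatch_read(const struct region_ops *ops, void *opaque,
                              uint64_t addr, unsigned size)
{
    if (!ops->read) {
        return all_ones(size);
    }
    return ops->read(opaque, addr, size);
}
```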
  
 

--
Gleb.

Re: [Qemu-devel] [PATCH V13 2/6] avoid duplication of default value in QemuOpts

2013-05-06 Thread Markus Armbruster

Dong Xu Wang wdon...@linux.vnet.ibm.com writes:

 According to Markus's comments, this patch moves the default value entirely
 to QemuOptDesc.

 When getting the value of an option that hasn't been set, and
 QemuOptDesc has a default value, return that.  Else, behave as
 before.

 Example: qemu_opt_get_number(opts, "foo", 42)

If foo has been set in opts, return its value.

Else, if opt's QemuOptDesc has a default value for foo, return
that.

Else, return 42.

Note that the last argument is useless when QemuOptDesc has a
default value.  Ugly.  If it bothers us, assert that the argument
equals the default from QemuOptDesc.

Last sentence is not 100% clear.  Either change it to "If it bothers us,
we could assert", or implement the assert.  Considering we're at v12, I
recommend the former.


 Example: qemu_opt_get(opts, "bar")

If bar has been set in opts, return its value.

Else, if opt's QemuOptDesc has a default value for bar, return
that.

Else, return NULL.
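
The lookup order described in the two examples can be sketched as a standalone function. This is not the QemuOpts implementation: `struct opt_desc`, `struct opt_value`, `find_desc` and `get_number` are hypothetical, and the default string is parsed with strtoull() purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor: option name plus optional default string. */
struct opt_desc {
    const char *name;
    const char *def_value_str;
};

/* Hypothetical explicitly-set option. */
struct opt_value {
    const char *name;
    uint64_t value;
};

/* Descriptor table is terminated by an entry with a NULL name. */
static const struct opt_desc *find_desc(const struct opt_desc *desc,
                                        const char *name)
{
    for (size_t i = 0; desc[i].name != NULL; i++) {
        if (strcmp(desc[i].name, name) == 0) {
            return &desc[i];
        }
    }
    return NULL;
}

/* 1. explicit value, 2. descriptor default, 3. caller's fallback. */
static uint64_t get_number(const struct opt_value *set, size_t nset,
                           const struct opt_desc *desc, const char *name,
                           uint64_t defval)
{
    for (size_t i = 0; i < nset; i++) {
        if (strcmp(set[i].name, name) == 0) {
            return set[i].value;
        }
    }

    const struct opt_desc *d = find_desc(desc, name);
    if (d && d->def_value_str) {
        return strtoull(d->def_value_str, NULL, 0);
    }
    return defval;
}
```

Note how the caller's fallback is only reached when the descriptor has no default, which is why the last argument becomes redundant once a default is declared.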

 Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com
 ---
  util/qemu-option.c | 58 
 +-
  1 file changed, 40 insertions(+), 18 deletions(-)

 diff --git a/util/qemu-option.c b/util/qemu-option.c
 index 57cdd57..4f94000 100644
 --- a/util/qemu-option.c
 +++ b/util/qemu-option.c
 @@ -525,10 +525,28 @@ static QemuOpt *qemu_opt_find(QemuOpts *opts, const 
 char *name)
  return NULL;
  }
  
 +static const QemuOptDesc *find_desc_by_name(const QemuOptDesc *desc,
 +                                            const char *name)
 +{
 +    int i;
 +
 +    for (i = 0; desc[i].name != NULL; i++) {
 +        if (strcmp(desc[i].name, name) == 0) {
 +            return &desc[i];
 +        }
 +    }
 +
 +    return NULL;
 +}
 +
  const char *qemu_opt_get(QemuOpts *opts, const char *name)
  {
      QemuOpt *opt = qemu_opt_find(opts, name);
 -    return opt ? opt->str : NULL;
 +    const QemuOptDesc *desc;
 +    desc = find_desc_by_name(opts->list->desc, name);
 +
 +    return opt ? opt->str :
 +        (desc && desc->def_value_str ? desc->def_value_str : NULL);
  }
  

I'd make this function work exactly like the ones that follow:

    if (!opt) {
        desc = find_desc_by_name(opts->list->desc, name);
        if (desc && desc->def_value_str) {
            return desc->def_value_str;
        }
    }
    return opt ? opt->str : NULL;

Matter of taste; no need to respin just for that.

  bool qemu_opt_has_help_opt(QemuOpts *opts)
 @@ -546,9 +564,15 @@ bool qemu_opt_has_help_opt(QemuOpts *opts)
  bool qemu_opt_get_bool(QemuOpts *opts, const char *name, bool defval)
  {
  QemuOpt *opt = qemu_opt_find(opts, name);
 +const QemuOptDesc *desc;
  
 -    if (opt == NULL)
 +    if (opt == NULL) {
 +        desc = find_desc_by_name(opts->list->desc, name);
 +        if (desc && desc->def_value_str) {
 +            parse_option_bool(name, desc->def_value_str, &defval, NULL);

Ignores errors.  Should only happen when somebody sets a bad
desc-def_value_str.  Because those are fixed at compile time, that's a
programming error.  Recommend to assert like this:

    parse_option_bool(name, desc->def_value_str, &defval, &local_err);
    assert(!local_err);
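
The reasoning behind the assert can be shown with a toy parser. The names below (`parse_bool`, `bool_from_default`) are hypothetical and the error is a plain string rather than an Error object; the sketch only illustrates that a default string fixed at compile time should never fail to parse, so a failure is treated as a programming error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser: sets *err to a message on bad input. */
static void parse_bool(const char *s, bool *out, const char **err)
{
    if (strcmp(s, "on") == 0) {
        *out = true;
    } else if (strcmp(s, "off") == 0) {
        *out = false;
    } else {
        *err = "expected 'on' or 'off'";
    }
}

/* Resolve a default: the descriptor string wins over the caller's value. */
static bool bool_from_default(const char *def_value_str, bool defval)
{
    const char *err = NULL;

    if (def_value_str) {
        parse_bool(def_value_str, &defval, &err);
        assert(!err); /* a bad compile-time default is a bug, not user error */
    }
    return defval;
}
```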

 +        }
          return defval;
 +    }
      assert(opt->desc && opt->desc->type == QEMU_OPT_BOOL);
      return opt->value.boolean;
  }
 @@ -556,9 +580,15 @@ bool qemu_opt_get_bool(QemuOpts *opts, const char *name, bool defval)
  uint64_t qemu_opt_get_number(QemuOpts *opts, const char *name, uint64_t defval)
  {
      QemuOpt *opt = qemu_opt_find(opts, name);
 +    const QemuOptDesc *desc;

 -    if (opt == NULL)
 +    if (opt == NULL) {
 +        desc = find_desc_by_name(opts->list->desc, name);
 +        if (desc && desc->def_value_str) {
 +            parse_option_number(name, desc->def_value_str, &defval, NULL);

Likewise.

 +        }
          return defval;
 +    }
      assert(opt->desc && opt->desc->type == QEMU_OPT_NUMBER);
      return opt->value.uint;
  }
 @@ -566,9 +596,15 @@ uint64_t qemu_opt_get_number(QemuOpts *opts, const char *name, uint64_t defval)
  uint64_t qemu_opt_get_size(QemuOpts *opts, const char *name, uint64_t defval)
  {
      QemuOpt *opt = qemu_opt_find(opts, name);
 +    const QemuOptDesc *desc;

 -    if (opt == NULL)
 +    if (opt == NULL) {
 +        desc = find_desc_by_name(opts->list->desc, name);
 +        if (desc && desc->def_value_str) {
 +            parse_option_size(name, desc->def_value_str, &defval, NULL);

Likewise.

 +        }
          return defval;
 +    }
      assert(opt->desc && opt->desc->type == QEMU_OPT_SIZE);
      return opt->value.uint;
  }
 @@ -609,20 +645,6 @@ static bool opts_accepts_any(const QemuOpts *opts)
      return opts->list->desc[0].name == NULL;
  }
  
 -static const QemuOptDesc *find_desc_by_name(const QemuOptDesc *desc,
 -const char *name)
 -{
 -int i;
 -
 -for (i = 0; desc[i].name != NULL; i++) {
 -if

Re: [Qemu-devel] [PATCH v2 1/4] Add i.MX FEC Ethernet driver

2013-05-06 Thread Peter Maydell

On 6 May 2013 10:24, Michael S. Tsirkin m...@redhat.com wrote:
 On Mon, May 06, 2013 at 10:08:42AM +0100, Peter Maydell wrote:
 On 6 May 2013 09:51, Michael S. Tsirkin m...@redhat.com wrote:
  On Sun, May 05, 2013 at 11:00:24PM +0100, Peter Maydell wrote:
  On 5 May 2013 22:15, Michael S. Tsirkin m...@redhat.com wrote:
   On Sun, May 05, 2013 at 07:01:34PM +0100, Peter Maydell wrote:

  Can't board code look for instantiated controllers
  and wire them up?

 I don't think this will work, because -device does both
 'instance_init' and 'realize', and some of the things the
 board needs to set and wire up must be done before 'realize'.

 Well let's add a flag that tells QM to delay realize then?
 It's not abstract but maybe embedded type?

This still requires users to know what their board's NIC
happens to be, and how do you match up the half-finished
thing created with -device to the device that the board
creates later?

  There's probably a nasty workaround involving '-global', but:
   * that requires the user to know the device name for the
 onboard NIC for the board, which is a regression from
 the -net situation
   * it's not clear how it works if the board has two NICs
 of the same type
 
  How does it work now?
  I am guessing each -net nic gets mapped to a random device.
  At some level that's worse than documenting about internal names,
  we are teaching users to learn order of initialization
  by trial and error and then rely on this.

 Well, it gets mapped to a specific device (hopefully we pick
 the same order as the kernel so first nic is eth0, second
 is eth1, and so on). This isn't a question of initialization
 order, because you can happily initialize the NIC corresponding
 to nd_table[1] before the one for nd_table[0] if you like.
 It's just a matter of picking which bit of hardware we call
 the first ethernet device, in the same way that we pick
 one of two serial ports to call the first serial port.

 In other words, it's an undocumented hack :(
 Scary as it sounds, for this case I like documenting
 internal names better.

How does that work when both internal NICs are the same kind
of device?

-- PMM

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Jan Kiszka

On 2013-05-06 13:47, Paolo Bonzini wrote:
 On 06/05/2013 13:39, Jan Kiszka wrote:
 On 2013-05-06 13:28, Paolo Bonzini wrote:
 On 06/05/2013 13:11, Jan Kiszka wrote:
 On 2013-05-06 12:58, Paolo Bonzini wrote:
 On 06/05/2013 12:56, Jan Kiszka wrote:
 The problem is that even if I/O for a region is supposed to happen
 within the BQL, lookup can happen outside the BQL.  Lookup will use the
 region even if it is just to discard it:

    VCPU thread (under BQL)              device thread
    ------------------------------------------------------------------------
                                         flatview_ref
                                         memory_region_find returns d->mr
                                         memory_region_ref(d->mr) /* nop */
    qdev_free(d)
      object_unparent(d)
        unrealize(d)
          memory_region_del_subregion(d->mr)
            FlatView updated, d->mr not in the new view

                                         flatview_unref
                                           memory_region_unref(d->mr)
                                             object_unref(d)
                                               free(d)
                                         if (!d->mr->is_ram) { /* BAD! */
                                           memory_region_unref(d->mr) /* nop */
                                           return error
                                         }


 Here, the memory region is dereferenced *before* we know that it is 
 BQL-free
 (in fact, exactly to ascertain whether it is BQL-free).

 Both flatview update and lookup *plus* locking type evaluation (i.e.
 memory region dereferencing) always happen under the address space lock.
 See Pingfan's patch.

 That's true of address_space_rw/map, but I don't think it holds for
 memory_region_find.

 It has to, or it would be broken: Either it is called on a region that
 supports reference counting

 You cannot know that in advance, can you?  The address is decided by the
 guest.

 Need to help me again to get the context: In which case is this a
 hot-path that we want to keep BQL-free? Current users of
 memory_region_find appear to be all relatively slow paths, thus are fine
 with staying under BQL.
 
 virtio-blk-dataplane is basically redoing memory_region_find with a
 separate data structure, exactly so that it can run outside the BQL
 before we get BQL-free MMIO dispatch.
 
 I can try to post patches later today that actually use
 memory_region_find instead.

We could define its semantics as follows: return a reference to the
corresponding memory region, provided this is safe. A reference is safe when
 - the region supports BQL-free operation (thus provides an owner to
   apply reference counting on)
 - the caller holds the BQL (check via qemu_mutex_iothread_is_locked()
   - to be implemented)

The latter implies that the BQL is not dropped before returning the
reference, but that's nothing memory_region_find can enforce.
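
These proposed semantics can be sketched as follows. The code is hypothetical: `struct region` and `region_find_checked` are made-up names, the `bql_held` parameter stands in for the qemu_mutex_iothread_is_locked() check that the text says is still to be implemented, and returning NULL in the unsafe case is an assumption of this sketch, since the proposal does not say what should happen then.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct region {
    bool has_owner; /* region supports reference counting via an owner */
    int refcnt;     /* plain int for illustration; real code needs atomics */
};

/*
 * Return mr if handing it out is safe, else NULL (assumed behaviour).
 * bql_held stands in for a "does the caller hold the BQL?" query.
 */
static struct region *region_find_checked(struct region *mr, bool bql_held)
{
    if (mr->has_owner) {
        mr->refcnt++; /* caller must unref once done with the region */
        return mr;
    }
    /* No owner: only safe while the caller keeps holding the BQL. */
    return bql_held ? mr : NULL;
}
```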

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH V13 1/6] add def_value_str in QemuOptDesc struct and rewrite qemu_opts_print

2013-05-06 Thread Markus Armbruster

Dong Xu Wang wdon...@linux.vnet.ibm.com writes:

 qemu_opts_print currently has no users, so the function can be rewritten
 safely.

 qemu_opts_print will be used by "qemu-img create", where it will
 produce the same output as the previous code.

 The behavior of this function has changed:

 1. Print every possible option, whether a value has been set or not.
 2. Option descriptors may provide a default value.
 3. Print to stdout instead of stderr.

 Previously the behavior was to print every option that has been set.
 Options that have not been set would be skipped.

 Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com
 ---
 v12-v13
 1) re-write commit message.

 v11-v12
 1) make def_value_str become the real default value string in opt_set
 function.

 v10-v11:
 1) print all values that have actually been assigned while accept-any
 cases.

 v7-v8:
 1) print elements = accept any params while opts_accepts_any() ==
 true.
 2) since def_print_str is the default value if an option isn't set,
 so rename it to def_value_str.


  include/qemu/option.h |  3 ++-
  util/qemu-option.c| 33 +++--
  2 files changed, 29 insertions(+), 7 deletions(-)

 diff --git a/include/qemu/option.h b/include/qemu/option.h
 index bdb6d21..b928ab0 100644
 --- a/include/qemu/option.h
 +++ b/include/qemu/option.h
 @@ -96,6 +96,7 @@ typedef struct QemuOptDesc {
  const char *name;
  enum QemuOptType type;
  const char *help;
 +const char *def_value_str;
  } QemuOptDesc;
  
  struct QemuOptsList {
 @@ -152,7 +153,7 @@ QDict *qemu_opts_to_qdict(QemuOpts *opts, QDict *qdict);
  void qemu_opts_absorb_qdict(QemuOpts *opts, QDict *qdict, Error **errp);
  
  typedef int (*qemu_opts_loopfunc)(QemuOpts *opts, void *opaque);
 -int qemu_opts_print(QemuOpts *opts, void *dummy);
 +int qemu_opts_print(QemuOpts *opts);
  int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void 
 *opaque,
int abort_on_failure);
  
 diff --git a/util/qemu-option.c b/util/qemu-option.c
 index 8b74bf1..57cdd57 100644
 --- a/util/qemu-option.c
 +++ b/util/qemu-option.c
 @@ -646,6 +646,7 @@ static void opt_set(QemuOpts *opts, const char *name, 
 const char *value,
  }
      opt->desc = desc;
      opt->str = g_strdup(value);
 +    opt->str = g_strdup(value ?: desc->def_value_str);

Memory leak.  Did you forget to delete the previous line?  You do it in
PATCH 4/6, plugging the leak...
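
The leak comes from assigning two freshly allocated copies to the same pointer. A minimal sketch of the single-assignment form, assuming plain malloc() in place of g_strdup() and a hypothetical struct opt (not QEMU's QemuOpt):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QemuOpt; only the string member matters here. */
struct opt {
    char *str;
};

/* Portable replacement for strdup()/g_strdup() used by this sketch. */
static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    if (p) {
        memcpy(p, s, n);
    }
    return p;
}

/* Duplicate exactly once, falling back to the default when value is NULL.
 * Assigning opt->str twice in a row would orphan the first copy.
 * Returns 0 on success, -1 if no string is available or allocation fails. */
static int opt_set_str(struct opt *opt, const char *value, const char *def)
{
    const char *src = value ? value : def;

    assert(opt != NULL);
    if (!src) {
        opt->str = NULL;
        return -1;
    }
    opt->str = dup_str(src);
    return opt->str ? 0 : -1;
}
```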

  qemu_opt_parse(opt, local_err);
  if (error_is_set(local_err)) {
  error_propagate(errp, local_err);

opt_set() now accepts a null value argument, and so do its callers
qemu_opt_set(), qemu_opt_set_err(), qemu_opts_set().  Why?

 @@ -860,16 +861,36 @@ void qemu_opts_del(QemuOpts *opts)
  g_free(opts);
  }
  
 -int qemu_opts_print(QemuOpts *opts, void *dummy)
 +int qemu_opts_print(QemuOpts *opts)
  {
  QemuOpt *opt;
 +QemuOptDesc *desc = opts-list-desc;
  
 -fprintf(stderr, %s: %s:, opts-list-name,
 -opts-id ? opts-id : noid);
 -QTAILQ_FOREACH(opt, opts-head, next) {
 -fprintf(stderr,  %s=\%s\, opt-name, opt-str);
 +if (desc[0].name == NULL) {
 +QTAILQ_FOREACH(opt, opts-head, next) {
 +printf(%s=\%s\ , opt-name, opt-str);
 +}
 +return 0;
 +}
 +for (; desc  desc-name; desc++) {
 +const char *value = desc-def_value_str;
 +QemuOpt *opt;
 +
 +opt = qemu_opt_find(opts, desc-name);
 +if (opt) {
 +value = opt-str;
 +}
 +
 +if (!value) {
 +continue;
 +}
 +
 +if (desc-type == QEMU_OPT_STRING) {
 +printf(%s='%s' , desc-name, value);
 +} else {
 +printf(%s=%s , desc-name, value);
 +}
  }
 -fprintf(stderr, \n);
  return 0;
  }

Re: [Qemu-devel] [PATCH V13 4/6] create some QemuOpts functons

2013-05-06 Thread Markus Armbruster

Dong Xu Wang wdon...@linux.vnet.ibm.com writes:

 These functions will be used in next commit. qemu_opt_get_(*)_del functions
 are used to make sure we have the same behaviors as before: after get an
 option value, options++.

I don't understand the last sentence.

 Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com
 ---
  include/qemu/option.h |  11 +-
  util/qemu-option.c| 103 
 ++
  2 files changed, 105 insertions(+), 9 deletions(-)

 diff --git a/include/qemu/option.h b/include/qemu/option.h
 index c7a5c14..d63e447 100644
 --- a/include/qemu/option.h
 +++ b/include/qemu/option.h
 @@ -108,6 +108,7 @@ struct QemuOptsList {
  };
  
  const char *qemu_opt_get(QemuOpts *opts, const char *name);
 +const char *qemu_opt_get_del(QemuOpts *opts, const char *name);
  /**
   * qemu_opt_has_help_opt:
   * @opts: options to search for a help request
 @@ -121,13 +122,18 @@ const char *qemu_opt_get(QemuOpts *opts, const char 
 *name);
   */
  bool qemu_opt_has_help_opt(QemuOpts *opts);
  bool qemu_opt_get_bool(QemuOpts *opts, const char *name, bool defval);
 +bool qemu_opt_get_bool_del(QemuOpts *opts, const char *name, bool defval);
  uint64_t qemu_opt_get_number(QemuOpts *opts, const char *name, uint64_t 
 defval);
  uint64_t qemu_opt_get_size(QemuOpts *opts, const char *name, uint64_t 
 defval);
 +uint64_t qemu_opt_get_size_del(QemuOpts *opts, const char *name,
 +   uint64_t defval);
  int qemu_opt_set(QemuOpts *opts, const char *name, const char *value);
 +int qemu_opt_replace_set(QemuOpts *opts, const char *name, const char 
 *value);
  void qemu_opt_set_err(QemuOpts *opts, const char *name, const char *value,
Error **errp);
  int qemu_opt_set_bool(QemuOpts *opts, const char *name, bool val);
  int qemu_opt_set_number(QemuOpts *opts, const char *name, int64_t val);
 +int qemu_opt_replace_set_number(QemuOpts *opts, const char *name, int64_t 
 val);
  typedef int (*qemu_opt_loopfunc)(const char *name, const char *value, void 
 *opaque);
  int qemu_opt_foreach(QemuOpts *opts, qemu_opt_loopfunc func, void *opaque,
   int abort_on_failure);
 @@ -144,7 +150,10 @@ const char *qemu_opts_id(QemuOpts *opts);
  void qemu_opts_del(QemuOpts *opts);
  void qemu_opts_validate(QemuOpts *opts, const QemuOptDesc *desc, Error 
 **errp);
  int qemu_opts_do_parse(QemuOpts *opts, const char *params, const char 
 *firstname);
 -QemuOpts *qemu_opts_parse(QemuOptsList *list, const char *params, int 
 permit_abbrev);
 +int qemu_opts_do_parse_replace(QemuOpts *opts, const char *params,
 +   const char *firstname);
 +QemuOpts *qemu_opts_parse(QemuOptsList *list, const char *params,
 +  int permit_abbrev);
  void qemu_opts_set_defaults(QemuOptsList *list, const char *params,
  int permit_abbrev);
  QemuOpts *qemu_opts_from_qdict(QemuOptsList *list, const QDict *qdict,
 diff --git a/util/qemu-option.c b/util/qemu-option.c
 index 0488c27..5db6d76 100644
 --- a/util/qemu-option.c
 +++ b/util/qemu-option.c
 @@ -33,6 +33,8 @@
  #include qapi/qmp/qerror.h
  #include qemu/option_int.h
  
 +static void qemu_opt_del(QemuOpt *opt);
 +
  /*
   * Extracts the name of an option from the parameter string (p points at the
   * first byte of the option name)
 @@ -549,6 +551,16 @@ const char *qemu_opt_get(QemuOpts *opts, const char 
 *name)
   const char *qemu_opt_get(QemuOpts *opts, const char *name)
   {
   QemuOpt *opt = qemu_opt_find(opts, name);
   const QemuOptDesc *desc;
   desc = find_desc_by_name(opts-list-desc, name);

   return opt ? opt-str :
  (desc  desc-def_value_str ? desc-def_value_str : NULL);
  }
  
 +const char *qemu_opt_get_del(QemuOpts *opts, const char *name)
 +{
 +QemuOpt *opt = qemu_opt_find(opts, name);
 +const char *str = opt ? g_strdup(opt-str) : NULL;
 +if (opt) {
 +qemu_opt_del(opt);
 +}
 +return str;
 +}
 +

Unlike qemu_opt_get(), this one doesn't use def_value_str.  Why?  Isn't
that a trap for users of this function?

Same question for the qemu_opt_get_FOO_del() that follow.
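
For illustration, a get-and-delete helper that does honor the default could look like the sketch below. It assumes simplified, hypothetical types (struct desc, struct opt with an in_use flag, flat arrays instead of QTAILQ) and is not the patch's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for QemuOptDesc and QemuOpt. */
struct desc { const char *name; const char *def_value_str; };
struct opt  { const char *name; char *str; int in_use; };

static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    if (p) {
        memcpy(p, s, n);
    }
    return p;
}

/* Return a caller-owned copy of the option's value and remove the option.
 * If the option was never set, fall back to the descriptor default so the
 * result agrees with what a plain "get" would have reported. */
static char *opt_get_del(struct opt *opts, size_t n_opts,
                         const struct desc *descs, size_t n_descs,
                         const char *name)
{
    size_t i;

    for (i = 0; i < n_opts; i++) {
        if (opts[i].in_use && strcmp(opts[i].name, name) == 0) {
            char *ret = opts[i].str;   /* transfer ownership, no extra copy */

            opts[i].str = NULL;
            opts[i].in_use = 0;
            return ret;
        }
    }
    for (i = 0; i < n_descs; i++) {
        if (strcmp(descs[i].name, name) == 0 && descs[i].def_value_str) {
            return dup_str(descs[i].def_value_str);
        }
    }
    return NULL;
}
```

In this sketch the caller always owns the returned string and must free it, whether it came from a set option or from the default.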

  bool qemu_opt_has_help_opt(QemuOpts *opts)
  {
  QemuOpt *opt;
 @@ -577,6 +589,22 @@ bool qemu_opt_get_bool(QemuOpts *opts, const char *name, 
 bool defval)
  return opt-value.boolean;
  }
  
 +bool qemu_opt_get_bool_del(QemuOpts *opts, const char *name, bool defval)
 +{
 +QemuOpt *opt = qemu_opt_find(opts, name);
 +bool ret;
 +
 +if (opt == NULL) {
 +return defval;
 +}
 +ret = opt-value.boolean;
 +assert(opt-desc  opt-desc-type == QEMU_OPT_BOOL);
 +if (opt) {
 +qemu_opt_del(opt);
 +}
 +return ret;
 +}
 +
  uint64_t qemu_opt_get_number(QemuOpts *opts, const char *name, uint64_t 
 defval)
  {
  QemuOpt *opt = qemu_opt_find(opts, name);
 @@ -609,6 +637,23 @@ uint64_t qemu_opt_get_size(QemuOpts *opts, const char 
 *name, uint64_t

Re: [Qemu-devel] [PATCH V13 0/6] replace QEMUOptionParameter with QemuOpts parser

2013-05-06 Thread Markus Armbruster

Dong Xu Wang wdon...@linux.vnet.ibm.com writes:

 These patches will replace QEMUOptionParameter with QemuOpts. Change logs
 please go to each patch's commit message.

I reviewed 1-4/6 for now.  I'm sorry it has taken me so long.  Let's
discuss my findings before I continue with 5/6, because that one's
*big*.

[Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue

2013-05-06 Thread Gleb Natapov

Anthony, please pull if it is not too late for 1.5.

The following changes since commit 467b34689d277fa56c09ad07ca0f08d7d7539f6d:

  Update OpenBIOS images (2013-05-05 09:53:22 +)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git uq/master

for you to fetch changes up to 0c1cd0ae2a4faabeb948b9a07ea1696e853de174:

  kvmvapic: add ioport read accessor (2013-05-06 14:52:26 +0300)


Marcelo Tosatti (1):
  kvmvapic: add ioport read accessor

 hw/i386/kvmvapic.c |6 ++
 1 file changed, 6 insertions(+)

[Qemu-devel] [PATCH 1/1] kvmvapic: add ioport read accessor

2013-05-06 Thread Gleb Natapov

From: Marcelo Tosatti mtosa...@redhat.com

Necessary since memory region accessor assumes read and write
methods are registered. Otherwise reading I/O port 0x7e segfaults.

https://bugzilla.redhat.com/show_bug.cgi?id=954306
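
The failure mode described here is a call through a NULL function pointer. The sketch below shows it with a hypothetical callback table; struct region_ops, dispatch_read() and dummy_read() are illustrative names, not QEMU's MemoryRegionOps API, and the value returned by dummy_read() is chosen arbitrarily. The patch avoids the crash by registering a read accessor; the NULL check in the dispatcher is an alternative shown only for contrast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified callback table. */
struct region_ops {
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned size);
    void (*write)(void *opaque, uint64_t addr, uint64_t data, unsigned size);
};

/* A dispatcher that calls ops->read unconditionally crashes when the
 * callback is NULL. This variant checks first and returns all-ones,
 * an assumption of this sketch for reads nobody handles. */
static uint64_t dispatch_read(const struct region_ops *ops, void *opaque,
                              uint64_t addr, unsigned size)
{
    if (!ops || !ops->read) {
        return UINT64_MAX;
    }
    return ops->read(opaque, addr, size);
}

/* Dummy accessor in the spirit of the patch: ignore the arguments and
 * return a fixed value. */
static uint64_t dummy_read(void *opaque, uint64_t addr, unsigned size)
{
    (void)opaque;
    (void)addr;
    (void)size;
    return 0;
}
```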

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
Reviewed-by: Jan Kiszka jan.kis...@siemens.com
Signed-off-by: Gleb Natapov g...@redhat.com
---
 hw/i386/kvmvapic.c |6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 5b558aa..655483b 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -687,8 +687,14 @@ static void vapic_write(void *opaque, hwaddr addr, 
uint64_t data,
 }
 }
 
+static uint64_t vapic_read(void *opaque, hwaddr addr, unsigned size)
+{
+return 0x;
+}
+
 static const MemoryRegionOps vapic_ops = {
 .write = vapic_write,
+.read = vapic_read,
 .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-- 
1.7.10.4

[Qemu-devel] [PATCH v3 0/5] proposal to make hostmem listener RAM unplug safe

2013-05-06 Thread Liu Ping Fan

Open issue:
  Regarding [PATCH v3 2/5] "hostmem: AddressSpace has its own map and maintained
  by RCU prepared style": Paolo may want to use memory_region_find() to
  re-implement hostmem. This is still under discussion.


v2->v3:
  1. Drop the memory region's ref in virtio-blk. All references are now kept
     inside Vring.

v1->v2:
  1. Split the RCU prepared style update and the RAM-Device refcnt monitoring
     into two patches (patches 2 and 4).
  2. Introduce AddrSpaceMem, which is similar to HostMem but based on an
     address space, whereas the original HostMem only serves the system
     memory address space.



Liu Ping Fan (5):
  hostmem: make hostmem single, not per Vring related
  hostmem: AddressSpace has its own map and maintained by RCU prepared
style
  memory: add ref/unref interface for MemroyRegionOps
  hostmem: hostmem listener pin RAM-Device by refcnt
  Vring: use hostmem's RAM safe api

 exec.c|2 +
 hw/block/dataplane/virtio-blk.c   |8 --
 hw/virtio/dataplane/hostmem.c |  150 -
 hw/virtio/dataplane/vring.c   |   98 -
 include/exec/memory.h |   10 ++
 include/hw/virtio/dataplane/hostmem.h |   33 +---
 include/hw/virtio/dataplane/vring.h   |   16 -
 memory.c  |   18 
 8 files changed, 251 insertions(+), 84 deletions(-)

-- 
1.7.4.4

[Qemu-devel] [PATCH v3 1/5] hostmem: make hostmem single, not per Vring related

2013-05-06 Thread Liu Ping Fan

From: Liu Ping Fan pingf...@linux.vnet.ibm.com

The hwaddr-to-hva mapping is system wide, so there is no need to
create it for each Vring.

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 exec.c|2 ++
 hw/virtio/dataplane/hostmem.c |   33 +++--
 hw/virtio/dataplane/vring.c   |   11 ---
 include/hw/virtio/dataplane/hostmem.h |   13 +
 include/hw/virtio/dataplane/vring.h   |1 -
 5 files changed, 34 insertions(+), 26 deletions(-)

diff --git a/exec.c b/exec.c
index fa1e0c3..1ec36a9 100644
--- a/exec.c
+++ b/exec.c
@@ -49,6 +49,7 @@
 #include translate-all.h
 
 #include exec/memory-internal.h
+#include hw/virtio/dataplane/hostmem.h
 
 //#define DEBUG_UNASSIGNED
 //#define DEBUG_SUBPAGE
@@ -1809,6 +1810,7 @@ static void memory_map_init(void)
 memory_listener_register(core_memory_listener, address_space_memory);
 memory_listener_register(io_memory_listener, address_space_io);
 memory_listener_register(tcg_memory_listener, address_space_memory);
+hostmem_init();
 
 dma_context_init(dma_context_memory, address_space_memory,
  NULL, NULL, NULL);
diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c
index 37292ff..756b09f 100644
--- a/hw/virtio/dataplane/hostmem.c
+++ b/hw/virtio/dataplane/hostmem.c
@@ -14,6 +14,10 @@
 #include exec/address-spaces.h
 #include hw/virtio/dataplane/hostmem.h
 
+HostMem *system_mem;
+
+static void hostmem_finalize(void);
+
 static int hostmem_lookup_cmp(const void *phys_, const void *region_)
 {
 hwaddr phys = *(const hwaddr *)phys_;
@@ -31,11 +35,12 @@ static int hostmem_lookup_cmp(const void *phys_, const void 
*region_)
 /**
  * Map guest physical address to host pointer
  */
-void *hostmem_lookup(HostMem *hostmem, hwaddr phys, hwaddr len, bool is_write)
+void *hostmem_lookup(hwaddr phys, hwaddr len, bool is_write)
 {
 HostMemRegion *region;
 void *host_addr = NULL;
 hwaddr offset_within_region;
+HostMem *hostmem = system_mem;
 
 qemu_mutex_lock(hostmem-current_regions_lock);
 region = bsearch(phys, hostmem-current_regions,
@@ -137,13 +142,12 @@ static void 
hostmem_listener_coalesced_mmio_dummy(MemoryListener *listener,
 {
 }
 
-void hostmem_init(HostMem *hostmem)
+void hostmem_init(void)
 {
-memset(hostmem, 0, sizeof(*hostmem));
+system_mem = g_new0(HostMem, 1);
+qemu_mutex_init(system_mem-current_regions_lock);
 
-qemu_mutex_init(hostmem-current_regions_lock);
-
-hostmem-listener = (MemoryListener){
+system_mem-listener = (MemoryListener) {
 .begin = hostmem_listener_dummy,
 .commit = hostmem_listener_commit,
 .region_add = hostmem_listener_append_region,
@@ -161,16 +165,17 @@ void hostmem_init(HostMem *hostmem)
 .priority = 10,
 };
 
-memory_listener_register(hostmem-listener, address_space_memory);
-if (hostmem-num_new_regions  0) {
-hostmem_listener_commit(hostmem-listener);
+memory_listener_register(system_mem-listener, address_space_memory);
+if (system_mem-num_new_regions  0) {
+hostmem_listener_commit(system_mem-listener);
 }
+atexit(hostmem_finalize);
 }
 
-void hostmem_finalize(HostMem *hostmem)
+static void hostmem_finalize(void)
 {
-memory_listener_unregister(hostmem-listener);
-g_free(hostmem-new_regions);
-g_free(hostmem-current_regions);
-qemu_mutex_destroy(hostmem-current_regions_lock);
+memory_listener_unregister(system_mem-listener);
+g_free(system_mem-new_regions);
+g_free(system_mem-current_regions);
+qemu_mutex_destroy(system_mem-current_regions_lock);
 }
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index e0d6e83..4d6d735 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -27,8 +27,7 @@ bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
 
 vring-broken = false;
 
-hostmem_init(vring-hostmem);
-vring_ptr = hostmem_lookup(vring-hostmem, vring_addr, vring_size, true);
+vring_ptr = hostmem_lookup(vring_addr, vring_size, true);
 if (!vring_ptr) {
 error_report(Failed to map vring 
  addr %# HWADDR_PRIx  size % HWADDR_PRIu,
@@ -51,7 +50,6 @@ bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
 
 void vring_teardown(Vring *vring)
 {
-hostmem_finalize(vring-hostmem);
 }
 
 /* Disable guest-host notifies */
@@ -138,8 +136,7 @@ static int get_indirect(Vring *vring,
 struct vring_desc *desc_ptr;
 
 /* Translate indirect descriptor */
-desc_ptr = hostmem_lookup(vring-hostmem,
-  indirect-addr + found * sizeof(desc),
+desc_ptr = hostmem_lookup(indirect-addr + found * sizeof(desc),
   sizeof(desc), false);
 if (!desc_ptr) {
 error_report(Failed to map indirect descriptor 
@@ -172,7 +169,7 @@ static int

[Qemu-devel] [PATCH v3 4/5] hostmem: hostmem listener pin RAM-Device by refcnt

2013-05-06 Thread Liu Ping Fan

From: Liu Ping Fan pingf...@linux.vnet.ibm.com

With the ref()/unref() interface of MemoryRegion, we can pin a RAM-Device
while using its memory and release it when done.

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 hw/virtio/dataplane/hostmem.c |   24 +++-
 hw/virtio/dataplane/vring.c   |8 
 include/hw/virtio/dataplane/hostmem.h |4 +++-
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c
index f31a703..420c9be 100644
--- a/hw/virtio/dataplane/hostmem.c
+++ b/hw/virtio/dataplane/hostmem.c
@@ -42,18 +42,23 @@ static void hostmem_ref(HostMem *hostmem)
 
 static void hostmem_unref(HostMem *hostmem)
 {
-int t;
+int i, t;
+HostMemRegion *hmr;
 
 t = __sync_sub_and_fetch(hostmem-ref, 1);
 assert(t = 0);
 if (!t) {
+for (i = 0; i  hostmem-num_current_regions; i++) {
+hmr = hostmem-current_regions[i];
+memory_region_unref(hmr-mr);
+}
 g_free(hostmem-current_regions);
 g_free(hostmem);
 }
 }
 
 static void *address_space_mem_lookup(AddrSpaceMem *as_mem, hwaddr phys,
-hwaddr len, bool is_write)
+hwaddr len, MemoryRegion **mr, bool is_write)
 {
 HostMemRegion *region;
 void *host_addr = NULL;
@@ -65,6 +70,9 @@ static void *address_space_mem_lookup(AddrSpaceMem *as_mem, 
hwaddr phys,
 hostmem_ref(hostmem);
 qemu_mutex_unlock(as_mem-cur_lock);
 
+if (mr) {
+*mr = NULL;
+}
 region = bsearch(phys, hostmem-current_regions,
  hostmem-num_current_regions,
  sizeof(hostmem-current_regions[0]),
@@ -79,7 +87,10 @@ static void *address_space_mem_lookup(AddrSpaceMem *as_mem, 
hwaddr phys,
 if (len = region-size - offset_within_region) {
 host_addr = region-host_addr + offset_within_region;
 }
-
+if (mr) {
+*mr = region-mr;
+memory_region_ref(*mr);
+}
 out:
 hostmem_unref(hostmem);
 return host_addr;
@@ -88,9 +99,10 @@ out:
 /**
  * Map guest physical address to host pointer
  */
-void *hostmem_lookup(hwaddr phys, hwaddr len, bool is_write)
+void *hostmem_lookup(hwaddr phys, hwaddr len, MemoryRegion **mr,
+bool is_write)
 {
-return address_space_mem_lookup(system_mem, phys, len, is_write);
+return address_space_mem_lookup(system_mem, phys, len, mr, is_write);
 }
 
 static void hostmem_listener_begin(MemoryListener *listener)
@@ -134,6 +146,7 @@ static void hostmem_append_new_region(HostMem *hostmem,
 hostmem-current_regions[num] = (HostMemRegion){
 .host_addr = ram_ptr + section-offset_within_region,
 .guest_addr = section-offset_within_address_space,
+.mr = section-mr,
 .size = section-size,
 .readonly = section-readonly,
 };
@@ -155,6 +168,7 @@ static void hostmem_listener_append_region(MemoryListener 
*listener,
 return;
 }
 
+memory_region_ref(section-mr);
 hostmem_append_new_region(as_mem-next_hostmem, section);
 }
 
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 4d6d735..e3c3afb 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -27,7 +27,7 @@ bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
 
 vring-broken = false;
 
-vring_ptr = hostmem_lookup(vring_addr, vring_size, true);
+vring_ptr = hostmem_lookup(vring_addr, vring_size, NULL, true);
 if (!vring_ptr) {
 error_report(Failed to map vring 
  addr %# HWADDR_PRIx  size % HWADDR_PRIu,
@@ -137,7 +137,7 @@ static int get_indirect(Vring *vring,
 
 /* Translate indirect descriptor */
 desc_ptr = hostmem_lookup(indirect-addr + found * sizeof(desc),
-  sizeof(desc), false);
+  sizeof(desc), NULL, false);
 if (!desc_ptr) {
 error_report(Failed to map indirect descriptor 
  addr %# PRIx64  len %zu,
@@ -169,7 +169,7 @@ static int get_indirect(Vring *vring,
 return -ENOBUFS;
 }
 
-iov-iov_base = hostmem_lookup(desc.addr, desc.len,
+iov-iov_base = hostmem_lookup(desc.addr, desc.len, NULL,
desc.flags  VRING_DESC_F_WRITE);
 if (!iov-iov_base) {
 error_report(Failed to map indirect descriptor
@@ -297,7 +297,7 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
 }
 
 /* TODO handle non-contiguous memory across region boundaries */
-iov-iov_base = hostmem_lookup(desc.addr, desc.len,
+iov-iov_base = hostmem_lookup(desc.addr, desc.len, NULL,
desc.flags  VRING_DESC_F_WRITE);
 if (!iov-iov_base) {
 error_report(Failed to map vring desc addr %# PRIx64  len %u,
diff --git a/include/hw/virtio/dataplane/hostmem.h 
b/include/hw/virtio/dataplane/hostmem.h

[Qemu-devel] [PATCH v3 2/5] hostmem: AddressSpace has its own map and maintained by RCU prepared style

2013-05-06 Thread Liu Ping Fan

From: Liu Ping Fan pingf...@linux.vnet.ibm.com

Each address space will have a map between its hwaddr and hva,
expressed as the struct AddrSpaceMem. Currently only the map of the
system memory address space is used by virtio devices; it is stored
in AddrSpaceMem *system_mem.

The map is maintained in RCU prepared style: the cur_hostmem,
next_hostmem and cur_lock fields in AddrSpaceMem serve this purpose.
cur_hostmem is used for lookups; next_hostmem is used for updates and
replaces cur_hostmem when the update is done.
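
The prepare-then-publish scheme can be sketched independently of QEMU. The code below is a simplified illustration, not the patch itself: struct map and struct as_mem are hypothetical stand-ins for HostMem and AddrSpaceMem, a pthread mutex replaces QemuMutex, and the reference count is kept under the same lock instead of using atomic builtins. As in the patch, a single writer is assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, simplified versions of HostMem and AddrSpaceMem. */
struct map {
    int ref;           /* protected by as_mem.lock in this sketch */
    size_t nregions;
};

struct as_mem {
    pthread_mutex_t lock;
    struct map *cur;   /* readers search this */
    struct map *next;  /* the single writer builds this */
};

/* Reader side: grab the current map and pin it. */
static struct map *map_get(struct as_mem *as)
{
    pthread_mutex_lock(&as->lock);
    struct map *m = as->cur;
    m->ref++;
    pthread_mutex_unlock(&as->lock);
    return m;
}

/* Drop a reference; the last one frees the map. A map can only reach
 * zero after it stopped being "cur", so no new reader can find it. */
static void map_put(struct as_mem *as, struct map *m)
{
    pthread_mutex_lock(&as->lock);
    int left = --m->ref;
    pthread_mutex_unlock(&as->lock);
    assert(left >= 0);
    if (left == 0) {
        free(m);
    }
}

/* Writer side, step 1: start building a fresh map. */
static void map_begin(struct as_mem *as)
{
    as->next = calloc(1, sizeof(*as->next));
    assert(as->next != NULL);
    as->next->ref = 1;     /* the reference held by as->cur once published */
}

/* Writer side, step 2: publish the new map and release the old one. */
static void map_commit(struct as_mem *as)
{
    pthread_mutex_lock(&as->lock);
    struct map *old = as->cur;
    as->cur = as->next;
    as->next = NULL;
    pthread_mutex_unlock(&as->lock);
    if (old) {
        map_put(as, old);
    }
}
```

Readers that obtained the old map before the commit keep using it until they call map_put(); only then is it freed.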

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 hw/virtio/dataplane/hostmem.c |  133 +++--
 include/hw/virtio/dataplane/hostmem.h |   20 +++--
 2 files changed, 103 insertions(+), 50 deletions(-)

diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c
index 756b09f..f31a703 100644
--- a/hw/virtio/dataplane/hostmem.c
+++ b/hw/virtio/dataplane/hostmem.c
@@ -14,7 +14,7 @@
 #include exec/address-spaces.h
 #include hw/virtio/dataplane/hostmem.h
 
-HostMem *system_mem;
+static AddrSpaceMem *system_mem;
 
 static void hostmem_finalize(void);
 
@@ -32,17 +32,39 @@ static int hostmem_lookup_cmp(const void *phys_, const void 
*region_)
 }
 }
 
-/**
- * Map guest physical address to host pointer
- */
-void *hostmem_lookup(hwaddr phys, hwaddr len, bool is_write)
+static void hostmem_ref(HostMem *hostmem)
+{
+    int t;
+
+    t = __sync_add_and_fetch(&hostmem->ref, 1);
+    assert(t > 0);
+}
+
+static void hostmem_unref(HostMem *hostmem)
+{
+    int t;
+
+    t = __sync_sub_and_fetch(&hostmem->ref, 1);
+    assert(t >= 0);
+    if (!t) {
+        g_free(hostmem->current_regions);
+        g_free(hostmem);
+    }
+}
+
+static void *address_space_mem_lookup(AddrSpaceMem *as_mem, hwaddr phys,
+hwaddr len, bool is_write)
 {
 HostMemRegion *region;
 void *host_addr = NULL;
 hwaddr offset_within_region;
-HostMem *hostmem = system_mem;
+HostMem *hostmem;
+
+qemu_mutex_lock(as_mem-cur_lock);
+hostmem = as_mem-cur_hostmem;
+hostmem_ref(hostmem);
+qemu_mutex_unlock(as_mem-cur_lock);
 
-qemu_mutex_lock(hostmem-current_regions_lock);
 region = bsearch(phys, hostmem-current_regions,
  hostmem-num_current_regions,
  sizeof(hostmem-current_regions[0]),
@@ -57,28 +79,45 @@ void *hostmem_lookup(hwaddr phys, hwaddr len, bool is_write)
 if (len = region-size - offset_within_region) {
 host_addr = region-host_addr + offset_within_region;
 }
-out:
-qemu_mutex_unlock(hostmem-current_regions_lock);
 
+out:
+hostmem_unref(hostmem);
 return host_addr;
 }
 
 /**
- * Install new regions list
+ * Map guest physical address to host pointer
  */
-static void hostmem_listener_commit(MemoryListener *listener)
+void *hostmem_lookup(hwaddr phys, hwaddr len, bool is_write)
 {
-HostMem *hostmem = container_of(listener, HostMem, listener);
+return address_space_mem_lookup(system_mem, phys, len, is_write);
+}
 
-qemu_mutex_lock(hostmem-current_regions_lock);
-g_free(hostmem-current_regions);
-hostmem-current_regions = hostmem-new_regions;
-hostmem-num_current_regions = hostmem-num_new_regions;
-qemu_mutex_unlock(hostmem-current_regions_lock);
+static void hostmem_listener_begin(MemoryListener *listener)
+{
+AddrSpaceMem *as_mem = container_of(listener, AddrSpaceMem, listener);
+
+as_mem-next_hostmem = g_new0(HostMem, 1);
+as_mem-next_hostmem-ref = 1;
+}
 
-/* Reset new regions list */
-hostmem-new_regions = NULL;
-hostmem-num_new_regions = 0;
+/**
+ * Install new regions list
+ */
+static void hostmem_listener_commit(MemoryListener *listener)
+{
+HostMem *tmp;
+AddrSpaceMem *as_mem = container_of(listener, AddrSpaceMem, listener);
+
+/* writer of cur_hostmem next_hostmem is serialized by biglock
+ * in hotplug path. So only take care of r/w on cur_hostmem
+ */
+tmp = as_mem-cur_hostmem;
+qemu_mutex_lock(as_mem-cur_lock);
+as_mem-cur_hostmem = as_mem-next_hostmem;
+qemu_mutex_unlock(as_mem-cur_lock);
+as_mem-next_hostmem = NULL;
+hostmem_unref(tmp);
 }
 
 /**
@@ -88,23 +127,23 @@ static void hostmem_append_new_region(HostMem *hostmem,
   MemoryRegionSection *section)
 {
 void *ram_ptr = memory_region_get_ram_ptr(section-mr);
-size_t num = hostmem-num_new_regions;
-size_t new_size = (num + 1) * sizeof(hostmem-new_regions[0]);
+size_t num = hostmem-num_current_regions;
+size_t new_size = (num + 1) * sizeof(hostmem-current_regions[0]);
 
-hostmem-new_regions = g_realloc(hostmem-new_regions, new_size);
-hostmem-new_regions[num] = (HostMemRegion){
+hostmem-current_regions = g_realloc(hostmem-current_regions, new_size);
+hostmem-current_regions[num] = (HostMemRegion){
 .host_addr = ram_ptr + section-offset_within_region,
 .guest_addr = section-offset_within_address_space,
 .size =

[Qemu-devel] [PATCH v3 3/5] memory: add ref/unref interface for MemroyRegionOps

2013-05-06 Thread Liu Ping Fan

From: Liu Ping Fan pingf...@linux.vnet.ibm.com

This pair of interfaces is optional, except for devices that are used
outside the biglock's protection with respect to hot unplug. Currently,
HostMem used by virtio-blk dataplane runs outside the biglock, so the RAM
device should implement them.

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 include/exec/memory.h |   10 ++
 memory.c  |   18 ++
 2 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9e88320..7e38fc1 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -54,6 +54,12 @@ struct MemoryRegionIORange {
  * Memory region callbacks
  */
 struct MemoryRegionOps {
+
+/* ref/unref pair is optional;  ref.
+ * inc refcnt of object who store MemoryRegion
+ */
+void (*ref)(void);
+void (*unref)(void);
 /* Read from the memory region. @addr is relative to @mr; @size is
  * in bytes. */
 uint64_t (*read)(void *opaque,
@@ -223,6 +229,10 @@ struct MemoryListener {
 QTAILQ_ENTRY(MemoryListener) link;
 };
 
+/**/
+bool memory_region_ref(MemoryRegion *mr);
+bool memory_region_unref(MemoryRegion *mr);
+
 /**
  * memory_region_init: Initialize a memory region
  *
diff --git a/memory.c b/memory.c
index 75ca281..c29998d 100644
--- a/memory.c
+++ b/memory.c
@@ -786,6 +786,24 @@ static bool memory_region_wrong_endianness(MemoryRegion 
*mr)
 #endif
 }
 
+bool memory_region_ref(MemoryRegion *mr)
+{
+    if (mr->ops && mr->ops->ref) {
+        mr->ops->ref();
+        return true;
+    }
+    return false;
+}
+
+bool memory_region_unref(MemoryRegion *mr)
+{
+    if (mr->ops && mr->ops->unref) {
+        mr->ops->unref();
+        return true;
+    }
+    return false;
+}
+
 void memory_region_init(MemoryRegion *mr,
 const char *name,
 uint64_t size)
-- 
1.7.4.4

[Qemu-devel] [PATCH v3 5/5] Vring: use hostmem's RAM safe api

2013-05-06 Thread Liu Ping Fan

From: Liu Ping Fan pingf...@linux.vnet.ibm.com

Until the memory operations (mm-ops) are done, we should guarantee the
validity of the regions used by the Vring itself and of the chunks pointed
to by vring descriptors. We achieve this by incrementing the refcnt of the
RAM-Device. When finished, we decrement this count through the MemoryRegion
interface.

We keep the MemoryRegion reference info entirely in Vring, so the caller
does not need to be aware of the references.
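
The bookkeeping described above amounts to remembering every region pinned for a request and releasing all of them at the end or on an error path. A minimal sketch, assuming a hypothetical struct region with a plain counter and a fixed-size struct pinned table (neither is taken from the patch):

```c
#include <assert.h>
#include <stddef.h>

struct region { int ref; };   /* hypothetical; stands in for a pinned RAM device */

#define MAX_PINNED 4

/* Per-request record of the regions pinned so far. */
struct pinned {
    struct region *mrs[MAX_PINNED];
    size_t n;
};

/* Pin a region for the duration of a request.
 * Returns 0 on success, -1 if the table is full. */
static int pin(struct pinned *p, struct region *r)
{
    if (p->n == MAX_PINNED) {
        return -1;
    }
    r->ref++;
    p->mrs[p->n++] = r;
    return 0;
}

/* Drop every reference taken so far; usable both when the request
 * completes and when descriptor parsing fails halfway. */
static void unpin_all(struct pinned *p)
{
    while (p->n > 0) {
        p->mrs[--p->n]->ref--;
    }
}
```

Because the table lives with the request, the code that submits I/O never has to touch the reference counts itself.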

Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com
---
 hw/block/dataplane/virtio-blk.c |8 ---
 hw/virtio/dataplane/vring.c |   93 +++---
 include/hw/virtio/dataplane/vring.h |   15 ++
 3 files changed, 89 insertions(+), 27 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 0356665..4babda1 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -25,14 +25,6 @@
 #include block/aio.h
 #include hw/virtio/virtio-bus.h
 
-enum {
-SEG_MAX = 126,  /* maximum number of I/O segments */
-VRING_MAX = SEG_MAX + 2,/* maximum number of vring descriptors */
-REQ_MAX = VRING_MAX,/* maximum number of requests in the vring,
- * is VRING_MAX / 2 with traditional and
- * VRING_MAX with indirect descriptors */
-};
-
 typedef struct {
 struct iocb iocb;   /* Linux AIO control block */
 QEMUIOVector *inhdr;/* iovecs for virtio_blk_inhdr */
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index e3c3afb..2cfd6d0 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -27,7 +27,7 @@ bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
 
 vring-broken = false;
 
-vring_ptr = hostmem_lookup(vring_addr, vring_size, NULL, true);
+vring_ptr = hostmem_lookup(vring_addr, vring_size, vring-vring_mr, true);
 if (!vring_ptr) {
 error_report(Failed to map vring 
  addr %# HWADDR_PRIx  size % HWADDR_PRIu,
@@ -50,6 +50,7 @@ bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
 
 void vring_teardown(Vring *vring)
 {
+memory_region_unref(vring-vring_mr);
 }
 
 /* Disable guest-host notifies */
@@ -109,11 +110,14 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
 static int get_indirect(Vring *vring,
 struct iovec iov[], struct iovec *iov_end,
 unsigned int *out_num, unsigned int *in_num,
-struct vring_desc *indirect)
+struct vring_desc *indirect,
+MemoryRegion ***mrs)
 {
 struct vring_desc desc;
 unsigned int i = 0, count, found = 0;
-
+MemoryRegion **cur = *mrs;
+int ret = 0;
+MemoryRegion *desc_mr;
 /* Sanity check */
 if (unlikely(indirect-len % sizeof(desc))) {
 error_report(Invalid length in indirect descriptor: 
@@ -137,49 +141,58 @@ static int get_indirect(Vring *vring,
 
 /* Translate indirect descriptor */
 desc_ptr = hostmem_lookup(indirect-addr + found * sizeof(desc),
-  sizeof(desc), NULL, false);
+  sizeof(desc),
+  desc_mr,
+  false);
 if (!desc_ptr) {
 error_report(Failed to map indirect descriptor 
  addr %# PRIx64  len %zu,
  (uint64_t)indirect-addr + found * sizeof(desc),
  sizeof(desc));
 vring-broken = true;
-return -EFAULT;
+ret = -EFAULT;
+goto fail;
 }
 desc = *desc_ptr;
 
 /* Ensure descriptor has been loaded before accessing fields */
 barrier(); /* read_barrier_depends(); */
+memory_region_unref(desc_mr);
 
 if (unlikely(++found  count)) {
 error_report(Loop detected: last one at %u 
  indirect size %u, i, count);
 vring-broken = true;
-return -EFAULT;
+ret = -EFAULT;
+goto fail;
 }
 
 if (unlikely(desc.flags  VRING_DESC_F_INDIRECT)) {
 error_report(Nested indirect descriptor);
 vring-broken = true;
-return -EFAULT;
+ret = -EFAULT;
+goto fail;
 }
 
 /* Stop for now if there are not enough iovecs available. */
 if (iov = iov_end) {
-return -ENOBUFS;
+ret = -ENOBUFS;
+goto fail;
 }
 
-iov-iov_base = hostmem_lookup(desc.addr, desc.len, NULL,
+iov-iov_base = hostmem_lookup(desc.addr, desc.len, cur,
desc.flags  VRING_DESC_F_WRITE);
 if (!iov-iov_base) {
 error_report(Failed to map indirect descriptor
  addr %# PRIx64  len %u,

Re: [Qemu-devel] QEMU aarch64 TCG target - testing question about x86-64

2013-05-06 Thread Claudio Fontana

On 14.03.2013 17:16, Peter Maydell wrote:
 On 14 March 2013 15:57, Claudio Fontana claudio.font...@huawei.com wrote:
 I am currently working on an aarch64 tcg target implementation,
 based on the available gdb patches contributed by ARM and the results
 of the linaro toolchain.
 
 Doing a target implementation based on the gdb/binutils
 patches and not the actual documentation is going to be
 enormously painful to review (to the point that I will almost
 certainly just say sorry, no), because it will basically
 be you have the semantics of this wrong, you have the
 decoding wrong all the way through for a whole pile of
 corner cases. You need to be working from the actual ARM
 documentation (which I regret is currently only available
 under NDA).
 
 See also the patchset that Alex Graf posted recently (which
 is a bunch of framework code but not the actual decoder).
 
 -- PMM
 

Well, we happen to have just completed a first working version of TCG support
for aarch64 here, and it has been tested successfully running on Foundation v8,
running the system emulation for various targets (at the moment armv5/linux,
armv7/linux, x86 FreeDOS, x86 Linux).

I understand that you have reservations about upstreaming this work for the
reasons you explain above, so for now it will be available to Huawei only. If
anybody is interested, I will be happy to send the patches.

Now I have a question regarding the test images. I have seen various QEMU
images at wiki.qemu.org/Testing

I have tested with some of those, but I don't see an x86-64 test case;
is there a reference test kernel/image for x86-64?

Thanks,

Claudio Fontana
Server OS Architect
Huawei Technologies Duesseldorf GmbH
Riesstraße 25 - 80992 München

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Paolo Bonzini

On 06/05/2013 14:06, Jan Kiszka wrote:
 On 2013-05-06 13:47, Paolo Bonzini wrote:
 On 06/05/2013 13:39, Jan Kiszka wrote:
 On 2013-05-06 13:28, Paolo Bonzini wrote:
 On 06/05/2013 13:11, Jan Kiszka wrote:
 On 2013-05-06 12:58, Paolo Bonzini wrote:
 On 06/05/2013 12:56, Jan Kiszka wrote:
 The problem is that even if I/O for a region is supposed to happen
 within the BQL, lookup can happen outside the BQL.  Lookup will use the
 region even if it is just to discard it:

    VCPU thread (under BQL)                   device thread
    ----------------------------------------------------------------------
                                              flatview_ref
                                              memory_region_find returns d->mr
                                              memory_region_ref(d->mr) /* nop */
    qdev_free(d)
      object_unparent(d)
        unrealize(d)
          memory_region_del_subregion(d->mr)
            FlatView updated, d->mr not in the new view
                                              flatview_unref
                                                memory_region_unref(d->mr)
                                                  object_unref(d)
                                                    free(d)
                                              if (!d->mr->is_ram) {  /* BAD! */
                                                memory_region_unref(d->mr) /* nop */
                                                return error
                                              }


 Here, the memory region is dereferenced *before* we know that it is 
 BQL-free
 (in fact, exactly to ascertain whether it is BQL-free).

 Both flatview update and lookup *plus* locking type evaluation (i.e.
 memory region dereferencing) always happen under the address space lock.
 See Pingfan's patch.

 That's true of address_space_rw/map, but I don't think it holds for
 memory_region_find.

 It has to, or it would be broken: Either it is called on a region that
 supports reference counting

 You cannot know that in advance, can you?  The address is decided by the
 guest.

 Need to help me again to get the context: In which case is this a
 hot-path that we want to keep BQL-free? Current users of
 memory_region_find appear to be all relatively slow paths, thus are fine
 with staying under BQL.

 virtio-blk-dataplane is basically redoing memory_region_find with a
 separate data structure, exactly so that it can run outside the BQL
 before we get BQL-free MMIO dispatch.

 I can try to post patches later today that actually use
 memory_region_find instead.
 
 We could define its semantics as follows: return a reference to the
 corresponding memory region, provided this is safe. A reference is safe when
  - the region supports BQL-free operation (thus provides an owner to
apply reference counting on)

This doesn't really work.  Regions that are known not to disappear (most
importantly, the main RAM region) also support BQL-free operation, but
have no owner right now.

Also, memory_region_find cannot know if it's returning a valid result,
and the caller cannot check it because the region may have disappeared
already when it is returned.

But I really would be surprised if adding an owner everywhere is so
hard...  let's try that first, it would solve the problem.

  - the caller holds the BQL (check via qemu_mutex_iothread_is_locked()
- to be implemented)
 
 The latter implies that the BQL is not dropped before returning the
 reference, but that's nothing memory_region_find can enforce.

Paolo

Re: [Qemu-devel] [PATCH 7/7] block: dump to monitor for bdrv_snapshot_dump() and bdrv_image_info_dump()

2013-05-06 Thread Luiz Capitulino

On Mon, 06 May 2013 10:09:43 +0800
Wenchao Xia xiaw...@linux.vnet.ibm.com wrote:

 On 2013-5-3 10:51, Wenchao Xia wrote:
  On 2013-5-2 20:02, Luiz Capitulino wrote:
  On Thu, 02 May 2013 10:05:08 +0800
  Wenchao Xia xiaw...@linux.vnet.ibm.com wrote:
 
  On 2013-4-30 3:05, Luiz Capitulino wrote:
  On Fri, 26 Apr 2013 16:46:57 +0200
  Stefan Hajnoczi stefa...@gmail.com wrote:
 
  On Fri, Apr 26, 2013 at 05:31:15PM +0800, Wenchao Xia wrote:
  @@ -2586,10 +2585,12 @@ void do_info_snapshots(Monitor *mon, const
  QDict *qdict)
 }
 
     if (total > 0) {
  -        monitor_printf(mon, "%s\n", bdrv_snapshot_dump(buf,
  sizeof(buf), NULL));
  +        bdrv_snapshot_dump(NULL);
  +        monitor_printf(mon, "\n");
 
  Luiz: any issue with mixing monitor_printf(mon) and
  monitor_vprintf(cur_mon) calls?  I guess there was a reason for
  explicitly passing mon instead of relying on cur_mon.
 
  where are they being mixed?
 
  bdrv_snapshot_dump() uses the global variable cur_mon internally
  instead of letting the caller pass in an explicit Monitor *mon; I guess
  that is the question.
 
  I'd have to see the code to tell, but yes, what Stefan described is the
  best practice for the Monitor.
 
 I think this will not be a problem until qemu wants more than one
  human monitor console; at that point we may need a data structure to tell
  where to output the string (stdout, *mon, or even stderr), and
  error_printf() would also need to be changed.
 
   Luiz, what do you think? I'd like to respin v2 if there are no issues with it.

As I said before, I'd have to see the code to tell. But answering your comment,
the code does support multiple monitors.

Re: [Qemu-devel] QEMU aarch64 TCG target - testing question about x86-64

2013-05-06 Thread Paolo Bonzini

On 06/05/2013 14:56, Claudio Fontana wrote:
 On 14.03.2013 17:16, Peter Maydell wrote:
 On 14 March 2013 15:57, Claudio Fontana claudio.font...@huawei.com wrote:
 I am currently working on an aarch64 tcg target implementation,
 based on the available gdb patches contributed by ARM and the results
 of the linaro toolchain.

 Doing a target implementation based on the gdb/binutils
 patches and not the actual documentation is going to be
 enormously painful to review (to the point that I will almost
 certainly just say sorry, no), because it will basically
 be you have the semantics of this wrong, you have the
 decoding wrong all the way through for a whole pile of
 corner cases. You need to be working from the actual ARM
 documentation (which I regret is currently only available
 under NDA).

 See also the patchset that Alex Graf posted recently (which
 is a bunch of framework code but not the actual decoder).

 -- PMM

 
 Well, we happen to have just completed a first working version of TCG support 
 for aarch64 here,
 and it has been tested successfully running on Foundation v8, running the 
 system emulation for various targets
 (at the moment armv5/linux, armv7/linux, x86 FreeDOS, X86 Linux).
 
 I understand that you have reservations on upstreaming this work for the 
 reasons you explain above,
 so for now it will be available to Huawei only. If anybody is interested, I 
 will be happy to send the patches.
 
 Now I have a question regarding the test images, I have seen various QEMU 
 images at
 wiki.qemu.org/Testing
 
 I have tested with some of those, but I don't see an x86-64 test case;
 is there a reference test kernel/image for x86-64?

No, usually people just do a smoke test using their favorite distro
and/or Windows.

More complete integration testing of i386/x86-64 images is done with
virt-test, which supports a variety of distros.  The closest thing to a
reference image is virt-test's JeOS image at
http://lmr.fedorapeople.org/jeos/jeos-17-64.qcow2.7z (should probably be
added to the list...), currently based on Fedora 17.

Paolo

Re: [Qemu-devel] QEMU aarch64 TCG target - testing question about x86-64

2013-05-06 Thread Peter Maydell

On 6 May 2013 13:56, Claudio Fontana claudio.font...@huawei.com wrote:
 On 14.03.2013 17:16, Peter Maydell wrote:
 On 14 March 2013 15:57, Claudio Fontana claudio.font...@huawei.com wrote:
 I am currently working on an aarch64 tcg target implementation,
 based on the available gdb patches contributed by ARM and the results
 of the linaro toolchain.

 Doing a target implementation based on the gdb/binutils
 patches and not the actual documentation is going to be
 enormously painful to review

 Well, we happen to have just completed a first working version
 of TCG support for aarch64 here, and it has been tested successfully
 running on Foundation v8, running the system emulation for various targets
 (at the moment armv5/linux, armv7/linux, x86 FreeDOS, X86 Linux).

Auugh. I've just realised I totally misread your initial email as
being a proposal for a QEMU target (ie target-*, to implement
guest AArch64 support), because up til now nobody at all has expressed
any interest in supporting QEMU on AArch64 hosts. My reasons for
preferring to use the official documentation for the guest support
are rather less applicable to adding host support.

 I understand that you have reservations on upstreaming this work
 for the reasons you explain above, so for now it will be available
 to Huawei only.

Since you've written it (and now I've realised my confusion!)
you may as well send the patches to qemu-devel, I think.

thanks
-- PMM

[Qemu-devel] [PATCH] Add 'maxqdepth' as an option to tty character devices.

2013-05-06 Thread John Baboval

From: John V. Baboval john.babo...@virtualcomputer.com

This parameter will cause writes to tty backed chardevs to return
-EAGAIN if the backing tty has buffered more than the specified
number of characters. When data is sent, the TIOCOUTQ ioctl is invoked
to determine the current TTY output buffer depth.
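
To illustrate the rule above, here is a standalone sketch of the length
clamping. It is not the patch's code: clamp_write_len() and its parameter
names are hypothetical, and inflight stands for the byte count that the
TIOCOUTQ ioctl would report for the tty.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): decide how many bytes may be
 * written when the tty already holds `inflight` unsent bytes and the queue
 * is capped at `maxqdepth` bytes.
 * Returns -EAGAIN when the queue is full, otherwise the number of bytes
 * (at most `len`) that still fit under the cap.
 */
int clamp_write_len(uint32_t inflight, int len, uint32_t maxqdepth)
{
    uint32_t room;

    assert(len >= 0);

    if (inflight >= maxqdepth) {
        return -EAGAIN;                 /* queue full: caller retries later */
    }

    room = maxqdepth - inflight;        /* cannot underflow after the check */
    if ((uint32_t)len > room) {
        len = (int)room;                /* room < len here, so it fits an int */
    }
    return len;
}
```

With a cap of 1, as suggested for the DTR/DSR case, at most one byte is
handed to the tty at a time, and a second write is refused until that
byte has drained.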

Background:

Some devices use DTR/DSR as flow control. (eg. Check/Receipt
printers with some POS software). When the device de-asserts
DTR, the guest OS notifies the application and new data is blocked.
When running on a QEMU serial port backed by a TTY, though the guest
stops transmitting, all the characters in the TTY output buffer are
still sent. The device buffer overflows and data is lost. In this
case the user could set maxqdepth=1.

Signed-off-by: John Baboval john.babo...@citrix.com
---
 include/sysemu/char.h |2 ++
 qapi-schema.json  |5 -
 qemu-char.c   |   40 +++-
 qemu-options.hx   |4 ++--
 4 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/include/sysemu/char.h b/include/sysemu/char.h
index 5e42c90..a94c1fb 100644
--- a/include/sysemu/char.h
+++ b/include/sysemu/char.h
@@ -43,6 +43,7 @@ typedef struct {
 
 #define CHR_IOCTL_SERIAL_SET_TIOCM   13
 #define CHR_IOCTL_SERIAL_GET_TIOCM   14
+#define CHR_IOCTL_SERIAL_TIOCOUTQ15
 
 #define CHR_TIOCM_CTS  0x020
 #define CHR_TIOCM_CAR  0x040
@@ -77,6 +78,7 @@ struct CharDriverState {
 int fe_open;
 int explicit_fe_open;
 int avail_connections;
+uint32_t maxqdepth;
 QemuOpts *opts;
 QTAILQ_ENTRY(CharDriverState) next;
 };
diff --git a/qapi-schema.json b/qapi-schema.json
index 7797400..029e7c9 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3182,11 +3182,14 @@
 #
 # @device: The name of the special file for the device,
 #  i.e. /dev/ttyS0 on Unix or COM1: on Windows
+# @maxqdepth: The maximum depth of the underlying tty
+  output queue (Unix) 
 # @type: What kind of device this is.
 #
 # Since: 1.4
 ##
-{ 'type': 'ChardevHostdev', 'data': { 'device' : 'str' } }
+{ 'type': 'ChardevHostdev', 'data': { 'device': 'str',
+  'maxqdepth' : 'int' } }
 
 ##
 # @ChardevSocket:
diff --git a/qemu-char.c b/qemu-char.c
index 64e824d..e2e4217 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -782,6 +782,7 @@ typedef struct FDCharDriver {
 GIOChannel *fd_in, *fd_out;
 guint fd_in_tag;
 int max_size;
+int tiocoutq_failed;
 QTAILQ_ENTRY(FDCharDriver) node;
 } FDCharDriver;
 
@@ -1260,6 +1261,22 @@ static CharDriverState *qemu_chr_open_pty(const char *id,
 return chr;
 }
 
+static int tty_serial_write(CharDriverState *chr, const uint8_t *buf, int len)
+{
+    FDCharDriver *s = chr->opaque;
+    uint32_t inflight = 0;
+
+    qemu_chr_fe_ioctl(chr, CHR_IOCTL_SERIAL_TIOCOUTQ, &inflight);
+    if (inflight >= chr->maxqdepth)
+        return -EAGAIN;
+
+    if (inflight + len > chr->maxqdepth) {
+        len = chr->maxqdepth - inflight;
+    }
+
+    return io_channel_send(s->fd_out, buf, len);
+}
+
 static void tty_serial_init(int fd, int speed,
 int parity, int data_bits, int stop_bits)
 {
@@ -1438,6 +1455,16 @@ static int tty_serial_ioctl(CharDriverState *chr, int 
cmd, void *arg)
 ioctl(g_io_channel_unix_get_fd(s-fd_in), TIOCMSET, targ);
 }
 break;
+    case CHR_IOCTL_SERIAL_TIOCOUTQ:
+        {
+            if (!s->tiocoutq_failed)
+                s->tiocoutq_failed = ioctl(g_io_channel_unix_get_fd(s->fd_in),
+                                           TIOCOUTQ, arg);
+
+            if (s->tiocoutq_failed)
+                *(unsigned int *)arg = 0;
+        }
+        break;
 default:
 return -ENOTSUP;
 }
@@ -1466,6 +1493,7 @@ static CharDriverState *qemu_chr_open_tty_fd(int fd)
 
     tty_serial_init(fd, 115200, 'N', 8, 1);
     chr = qemu_chr_open_fd(fd, fd);
+    chr->chr_write = tty_serial_write;
     chr->chr_ioctl = tty_serial_ioctl;
     chr->chr_close = qemu_chr_close_tty;
     return chr;
@@ -3172,6 +3200,8 @@ static void qemu_chr_parse_serial(QemuOpts *opts, 
ChardevBackend *backend,
 }
     backend->serial = g_new0(ChardevHostdev, 1);
     backend->serial->device = g_strdup(device);
+    backend->serial->maxqdepth =
+        qemu_opt_get_number(opts, "maxqdepth", -1);
 }
 
 static void qemu_chr_parse_parallel(QemuOpts *opts, ChardevBackend *backend,
@@ -3575,6 +3605,9 @@ QemuOptsList qemu_chardev_opts = {
 },{
             .name = "size",
             .type = QEMU_OPT_SIZE,
+        },{
+            .name = "maxqdepth",
+            .type = QEMU_OPT_NUMBER,
 },
 { /* end of list */ }
 },
@@ -3653,6 +3686,7 @@ static CharDriverState 
*qmp_chardev_open_serial(ChardevHostdev *serial,
 Error **errp)
 {
 #ifdef HAVE_CHARDEV_TTY
+CharDriverState *chr;
 int fd;
 
 fd = qmp_chardev_open_file_source(serial-device, O_RDWR,

Re: [Qemu-devel] [PATCH] Add 'maxqdepth' as an option to tty character devices.

2013-05-06 Thread Paolo Bonzini

On 06/05/2013 15:43, John Baboval wrote:
 From: John V. Baboval john.babo...@virtualcomputer.com
 
 This parameter will cause writes to tty backed chardevs to return
 -EAGAIN if the backing tty has buffered more than the specified
 number of characters. When data is sent, the TIOCOUTQ ioctl is invoked
 to determine the current TTY output buffer depth.
 
 Background:
 
 Some devices use DTR/DSR as flow control. (eg. Check/Receipt
 printers with some POS software). When the device de-asserts
 DTR, the guest OS notifies the application and new data is blocked.
 When running on a QEMU serial port backed by a TTY, though the guest
 stops transmitting, all the characters in the TTY output buffer are
 still sent. The device buffer overflows and data is lost. In this
 case the user could set maxqdepth=1.
 
 Signed-off-by: John Baboval john.babo...@citrix.com
 ---
  include/sysemu/char.h |2 ++
  qapi-schema.json  |5 -
  qemu-char.c   |   40 +++-
  qemu-options.hx   |4 ++--
  4 files changed, 47 insertions(+), 4 deletions(-)
 
 diff --git a/include/sysemu/char.h b/include/sysemu/char.h
 index 5e42c90..a94c1fb 100644
 --- a/include/sysemu/char.h
 +++ b/include/sysemu/char.h
 @@ -43,6 +43,7 @@ typedef struct {
  
  #define CHR_IOCTL_SERIAL_SET_TIOCM   13
  #define CHR_IOCTL_SERIAL_GET_TIOCM   14
 +#define CHR_IOCTL_SERIAL_TIOCOUTQ15
  
  #define CHR_TIOCM_CTS0x020
  #define CHR_TIOCM_CAR0x040
 @@ -77,6 +78,7 @@ struct CharDriverState {
  int fe_open;
  int explicit_fe_open;
  int avail_connections;
 +uint32_t maxqdepth;
  QemuOpts *opts;
  QTAILQ_ENTRY(CharDriverState) next;
  };
 diff --git a/qapi-schema.json b/qapi-schema.json
 index 7797400..029e7c9 100644
 --- a/qapi-schema.json
 +++ b/qapi-schema.json
 @@ -3182,11 +3182,14 @@
  #
  # @device: The name of the special file for the device,
  #  i.e. /dev/ttyS0 on Unix or COM1: on Windows
 +# @maxqdepth: The maximum depth of the underlying tty
 +  output queue (Unix) 
  # @type: What kind of device this is.
  #
  # Since: 1.4
  ##
 -{ 'type': 'ChardevHostdev', 'data': { 'device' : 'str' } }
 +{ 'type': 'ChardevHostdev', 'data': { 'device': 'str',
 +  'maxqdepth' : 'int' } }

This needs to be optional for backwards compatibility.  You can check
serial->has_maxqdepth and use a default value of -1 if it is false...
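
A minimal sketch of that pattern, assuming the usual QAPI convention that
an optional member comes with a boolean has_ flag in the generated struct.
HostdevSketch and hostdev_maxqdepth() are hypothetical stand-ins for
illustration, not the generated ChardevHostdev type or any existing helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the generated struct; only the fields used here. */
typedef struct {
    const char *device;
    bool has_maxqdepth;     /* true only when the caller supplied the option */
    int64_t maxqdepth;
} HostdevSketch;

/* Return the configured depth, or -1 (taken to mean "no limit") if absent. */
int64_t hostdev_maxqdepth(const HostdevSketch *dev)
{
    assert(dev != NULL);
    return dev->has_maxqdepth ? dev->maxqdepth : -1;
}
```

Existing callers that never set the option keep their current behaviour,
because the flag defaults to false in a zero-initialised struct.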

  
  ##
  # @ChardevSocket:
 diff --git a/qemu-char.c b/qemu-char.c
 index 64e824d..e2e4217 100644
 --- a/qemu-char.c
 +++ b/qemu-char.c
 @@ -782,6 +782,7 @@ typedef struct FDCharDriver {
  GIOChannel *fd_in, *fd_out;
  guint fd_in_tag;
  int max_size;
 +int tiocoutq_failed;
  QTAILQ_ENTRY(FDCharDriver) node;
  } FDCharDriver;
  
 @@ -1260,6 +1261,22 @@ static CharDriverState *qemu_chr_open_pty(const char 
 *id,
  return chr;
  }
  
 +static int tty_serial_write(CharDriverState *chr, const uint8_t *buf, int len)
 +{
 +    FDCharDriver *s = chr->opaque;
 +    uint32_t inflight = 0;
 +
 +    qemu_chr_fe_ioctl(chr, CHR_IOCTL_SERIAL_TIOCOUTQ, &inflight);
 +    if (inflight >= chr->maxqdepth)
 +        return -EAGAIN;
 +
 +    if (inflight + len > chr->maxqdepth) {
 +        len = chr->maxqdepth - inflight;
 +    }
 +
 +    return io_channel_send(s->fd_out, buf, len);
 +}
 +
  static void tty_serial_init(int fd, int speed,
  int parity, int data_bits, int stop_bits)
  {
 @@ -1438,6 +1455,16 @@ static int tty_serial_ioctl(CharDriverState *chr, int 
 cmd, void *arg)
  ioctl(g_io_channel_unix_get_fd(s-fd_in), TIOCMSET, targ);
  }
  break;
 +case CHR_IOCTL_SERIAL_TIOCOUTQ:
 +{
 +if (!s-tiocoutq_failed)
 +s-tiocoutq_failed = 
 ioctl(g_io_channel_unix_get_fd(s-fd_in),
 +   TIOCOUTQ, arg);
 +
 +if (s-tiocoutq_failed)
 +*(unsigned int *)arg = 0;
 +}
 +break;
  default:
  return -ENOTSUP;
  }
 @@ -1466,6 +1493,7 @@ static CharDriverState *qemu_chr_open_tty_fd(int fd)
  
  tty_serial_init(fd, 115200, 'N', 8, 1);
  chr = qemu_chr_open_fd(fd, fd);
 +chr-chr_write = tty_serial_write;
  chr-chr_ioctl = tty_serial_ioctl;
  chr-chr_close = qemu_chr_close_tty;
  return chr;
 @@ -3172,6 +3200,8 @@ static void qemu_chr_parse_serial(QemuOpts *opts, 
 ChardevBackend *backend,
  }
  backend-serial = g_new0(ChardevHostdev, 1);
  backend-serial-device = g_strdup(device);
 +backend-serial-maxqdepth =
 +qemu_opt_get_number(opts, maxqdepth, -1);

... and also set has_maxqdepth here.

Thanks,

Paolo

  }
  
  static void qemu_chr_parse_parallel(QemuOpts *opts, ChardevBackend *backend,
 @@ -3575,6 +3605,9 @@ QemuOptsList qemu_chardev_opts = {
  },{
  .name = size,
  .type = QEMU_OPT_SIZE,
 +},{
 +

[Qemu-devel] [PATCH] ahci: Don't allow creating slave drives

2013-05-06 Thread Kevin Wolf

An IDE bus provided by AHCI can only take a single IDE drive. If you add
a drive as slave, qemu used to accept the command line but the device
wouldn't be actually usable. Catch the situation instead and error out.
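
The check can be pictured with the following standalone sketch. BusSketch
and bus_assign_unit() are hypothetical names used only for illustration;
the real logic lives in ide_qdev_init() and differs in detail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the bus state; field names are illustrative. */
typedef struct {
    bool master_present;    /* unit 0 already taken */
    bool slave_present;     /* unit 1 already taken */
    int max_units;          /* 1 for an AHCI port, 2 for a legacy IDE bus */
} BusSketch;

/*
 * Pick a unit for a new drive. `requested` is -1 for "first free unit".
 * Returns the unit number, or -1 if the request cannot be satisfied.
 */
int bus_assign_unit(BusSketch *bus, int requested)
{
    int unit = requested;

    assert(bus->max_units == 1 || bus->max_units == 2);

    if (unit == -1) {
        unit = bus->master_present ? 1 : 0;
    }
    if (unit < 0 || unit >= bus->max_units) {
        return -1;          /* e.g. a slave on a single-unit AHCI port */
    }
    if (unit == 0 ? bus->master_present : bus->slave_present) {
        return -1;          /* slot already occupied */
    }
    if (unit == 0) {
        bus->master_present = true;
    } else {
        bus->slave_present = true;
    }
    return unit;
}
```

Note that the automatic choice also has to pass the limit: a second drive
added to an AHCI port without an explicit unit would otherwise silently
become unit 1.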

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/ide/ahci.c |  2 +-
 hw/ide/cmd646.c   |  2 +-
 hw/ide/internal.h |  3 ++-
 hw/ide/isa.c  |  2 +-
 hw/ide/macio.c|  2 +-
 hw/ide/mmio.c |  2 +-
 hw/ide/piix.c |  2 +-
 hw/ide/qdev.c | 10 +-
 hw/ide/via.c  |  2 +-
 9 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 3405583..eab6096 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -1163,7 +1163,7 @@ void ahci_init(AHCIState *s, DeviceState *qdev, 
DMAContext *dma, int ports)
     for (i = 0; i < s->ports; i++) {
         AHCIDevice *ad = &s->dev[i];
 
-        ide_bus_new(&ad->port, qdev, i);
+        ide_bus_new(&ad->port, qdev, i, 1);
         ide_init2(&ad->port, irqs[i]);
 
         ad->hba = s;
diff --git a/hw/ide/cmd646.c b/hw/ide/cmd646.c
index 541d4ef..a73eb9a 100644
--- a/hw/ide/cmd646.c
+++ b/hw/ide/cmd646.c
@@ -281,7 +281,7 @@ static int pci_cmd646_ide_initfn(PCIDevice *dev)
 
 irq = qemu_allocate_irqs(cmd646_set_irq, d, 2);
 for (i = 0; i  2; i++) {
-ide_bus_new(d-bus[i], d-dev.qdev, i);
+ide_bus_new(d-bus[i], d-dev.qdev, i, 2);
 ide_init2(d-bus[i], irq[i]);
 
 bmdma_init(d-bus[i], d-bmdma[i], d);
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index 0efb2da..03f1489 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -450,6 +450,7 @@ struct IDEBus {
 IDEDevice *slave;
 IDEState ifs[2];
 int bus_id;
+int max_units;
 IDEDMA *dma;
 uint8_t unit;
 uint8_t cmd;
@@ -574,7 +575,7 @@ void ide_atapi_cmd(IDEState *s);
 void ide_atapi_cmd_reply_end(IDEState *s);
 
 /* hw/ide/qdev.c */
-void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id);
+void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id, int max_units);
 IDEDevice *ide_create_drive(IDEBus *bus, int unit, DriveInfo *drive);
 
 #endif /* HW_IDE_INTERNAL_H */
diff --git a/hw/ide/isa.c b/hw/ide/isa.c
index 5e7422f..369a7fa 100644
--- a/hw/ide/isa.c
+++ b/hw/ide/isa.c
@@ -69,7 +69,7 @@ static int isa_ide_initfn(ISADevice *dev)
 {
 ISAIDEState *s = ISA_IDE(dev);
 
-ide_bus_new(s-bus, DEVICE(dev), 0);
+ide_bus_new(s-bus, DEVICE(dev), 0, 2);
 ide_init_ioport(s-bus, dev, s-iobase, s-iobase2);
 isa_init_irq(dev, s-irq, s-isairq);
 ide_init2(s-bus, s-irq);
diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index 64b2406..bf12a10 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -334,7 +334,7 @@ static void macio_ide_initfn(Object *obj)
 SysBusDevice *d = SYS_BUS_DEVICE(obj);
 MACIOIDEState *s = MACIO_IDE(obj);
 
-ide_bus_new(s-bus, DEVICE(obj), 0);
+ide_bus_new(s-bus, DEVICE(obj), 0, 2);
 memory_region_init_io(s-mem, pmac_ide_ops, s, pmac-ide, 0x1000);
 sysbus_init_mmio(d, s-mem);
 sysbus_init_irq(d, s-irq);
diff --git a/hw/ide/mmio.c b/hw/ide/mmio.c
index ce88c3a..e80e7e5 100644
--- a/hw/ide/mmio.c
+++ b/hw/ide/mmio.c
@@ -137,7 +137,7 @@ static void mmio_ide_initfn(Object *obj)
 SysBusDevice *d = SYS_BUS_DEVICE(obj);
 MMIOState *s = MMIO_IDE(obj);
 
-ide_bus_new(s-bus, DEVICE(obj), 0);
+ide_bus_new(s-bus, DEVICE(obj), 0, 2);
 sysbus_init_irq(d, s-irq);
 }
 
diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index 1de284d..bf2856f 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -135,7 +135,7 @@ static void pci_piix_init_ports(PCIIDEState *d) {
 int i;
 
 for (i = 0; i  2; i++) {
-ide_bus_new(d-bus[i], d-dev.qdev, i);
+ide_bus_new(d-bus[i], d-dev.qdev, i, 2);
 ide_init_ioport(d-bus[i], NULL, port_info[i].iobase,
 port_info[i].iobase2);
 ide_init2(d-bus[i], isa_get_irq(NULL, port_info[i].isairq));
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 8a9a891..6a272b0 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -47,10 +47,11 @@ static const TypeInfo ide_bus_info = {
 .class_init = ide_bus_class_init,
 };
 
-void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id)
+void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id, int max_units)
 {
     qbus_create_inplace(&idebus->qbus, TYPE_IDE_BUS, dev, NULL);
     idebus->bus_id = bus_id;
+    idebus->max_units = max_units;
 }
 
 static char *idebus_get_fw_dev_path(DeviceState *dev)
@@ -76,6 +77,13 @@ static int ide_qdev_init(DeviceState *qdev)
     if (dev->unit == -1) {
         dev->unit = bus->master ? 1 : 0;
     }
+
+    if (dev->unit >= bus->max_units) {
+        error_report("Can't create IDE unit %d, bus supports only %d units",
+                     dev->unit, bus->max_units);
+        goto err;
+    }
+
     switch (dev->unit) {
     case 0:
         if (bus->master) {
diff --git a/hw/ide/via.c b/hw/ide/via.c
index 9d6a644..5fe053c 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -158,7 +158,7 @@ static void

Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock

2013-05-06 Thread Jan Kiszka

On 2013-05-06 15:09, Paolo Bonzini wrote:
 On 06/05/2013 14:06, Jan Kiszka wrote:
 On 2013-05-06 13:47, Paolo Bonzini wrote:
 On 06/05/2013 13:39, Jan Kiszka wrote:
 On 2013-05-06 13:28, Paolo Bonzini wrote:
 On 06/05/2013 13:11, Jan Kiszka wrote:
 On 2013-05-06 12:58, Paolo Bonzini wrote:
 On 06/05/2013 12:56, Jan Kiszka wrote:
 The problem is that even if I/O for a region is supposed to happen
 within the BQL, lookup can happen outside the BQL.  Lookup will use 
 the
 region even if it is just to discard it:

    VCPU thread (under BQL)                   device thread
    ----------------------------------------------------------------------
                                              flatview_ref
                                              memory_region_find returns d->mr
                                              memory_region_ref(d->mr) /* nop */
    qdev_free(d)
      object_unparent(d)
        unrealize(d)
          memory_region_del_subregion(d->mr)
            FlatView updated, d->mr not in the new view
                                              flatview_unref
                                                memory_region_unref(d->mr)
                                                  object_unref(d)
                                                    free(d)
                                              if (!d->mr->is_ram) {  /* BAD! */
                                                memory_region_unref(d->mr) /* nop */
                                                return error
                                              }


 Here, the memory region is dereferenced *before* we know that it is 
 BQL-free
 (in fact, exactly to ascertain whether it is BQL-free).

 Both flatview update and lookup *plus* locking type evaluation (i.e.
 memory region dereferencing) always happen under the address space 
 lock.
 See Pingfan's patch.

 That's true of address_space_rw/map, but I don't think it holds for
 memory_region_find.

 It has to, or it would be broken: Either it is called on a region that
 supports reference counting

 You cannot know that in advance, can you?  The address is decided by the
 guest.

 Need to help me again to get the context: In which case is this a
 hot-path that we want to keep BQL-free? Current users of
 memory_region_find appear to be all relatively slow paths, thus are fine
 with staying under BQL.

 virtio-blk-dataplane is basically redoing memory_region_find with a
 separate data structure, exactly so that it can run outside the BQL
 before we get BQL-free MMIO dispatch.

 I can try to post patches later today that actually use
 memory_region_find instead.

 We could define its semantics as follows: return a reference to the
 corresponding memory region, provided this is safe. A reference is safe when
  - the region supports BQL-free operation (thus provides an owner to
apply reference counting on)
 
 This doesn't really work.  Regions that are known not to disappear (most
 importantly, the main RAM region) also support BQL-free operation, but
 have no owner right now.

Those few are much easier to convert than a full set of PCI and other
hot-pluggable devices, that's my point.

 
 Also, memory_region_find cannot know if it's returning a valid result,
 and the caller cannot check it because the region may have disappeared
 already when it is returned.

Again, we hold the address space lock while checking the conditions. If
a region does not support BQL-free mode and the BQL is not held, we have an
error and return NULL (or bail out with a runtime error).

 
 But I really would be surprised if adding an owner everywhere is so
 hard...  let's try that first, it would solve the problem.

If we can avoid it, that would only help the process. If we can't, ok.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

[Qemu-devel] [RFC PATCH 0/8] MemoryRegion and FlatView refcounting, replace hostmem with memory_region_find

2013-05-06 Thread Paolo Bonzini

Hi,

this is an alternative approach to refactoring of dataplane's HostMem
code.  Here, I take Ping Fan's idea of RCU-style updating of the
region list and apply it to the AddressSpace's FlatView.  With this
change, dataplane can simply use memory_region_find instead of
hostmem.

This is a somewhat larger change, but I prefer it for two reasons.

1) it splits the task of adding BQL-less memory dispatch in two parts,
   tackling memory_region_find first (which is simpler because locking
   is left to the caller).

2) HostMem duplicates a lot of the FlatView logic, and adding the
   RCU-style update in FlatView benefits everyone.

The missing ingredients here are:

1) remember and unreference the MemoryRegions that are used in
   a vring entry.  In order to implement this, it is probably simpler
   to change vring.c to use virtio.c's VirtQueueElement data structure.
   We want something like that anyway in order to support migration.

2) add an owner field to MemoryRegion, and set it for all MemoryRegions
   for hot-unpluggable devices.  In this series, ref/unref are stubs.
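
To make ingredient 2 concrete, here is a rough standalone sketch of ref/unref
forwarding to an owner. OwnerSketch, RegionSketch, region_ref() and
region_unref() are hypothetical names, the counter is deliberately not
atomic, and nothing here is the code from this series, where ref/unref are
still stubs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the device object that owns a region. */
typedef struct OwnerSketch {
    int refcount;               /* real code would need an atomic counter */
} OwnerSketch;

/* Hypothetical stand-in for a memory region. */
typedef struct RegionSketch {
    OwnerSketch *owner;         /* NULL for regions that never go away */
} RegionSketch;

/* Taking a reference on a region pins its owner, if it has one. */
void region_ref(RegionSketch *mr)
{
    if (mr->owner != NULL) {
        mr->owner->refcount++;
    }
}

/*
 * Drop a reference. Returns the owner's remaining count, or -1 for
 * owner-less regions, where the call is a no-op.
 */
int region_unref(RegionSketch *mr)
{
    if (mr->owner == NULL) {
        return -1;
    }
    assert(mr->owner->refcount > 0);
    return --mr->owner->refcount;
}
```

In this picture a region without an owner can be looked up safely only if
it is known never to disappear; every hot-unpluggable device would have to
set the owner for the reference to mean anything.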

For simplicity I based the patches on my IOMMU rebase.  I placed the
tree at git://github.com/bonzini/qemu.git, branch iommu.

Paolo

Paolo Bonzini (8):
  memory: add ref/unref calls
  exec: check MRU in qemu_ram_addr_from_host
  memory: return MemoryRegion from qemu_ram_addr_from_host
  memory: ref/unref memory across address_space_map/unmap
  memory: access FlatView from a local variable
  memory: use a new FlatView pointer on every topology update
  memory: add reference counting to FlatView
  dataplane: replace hostmem with memory_region_find

 exec.c|   63 +---
 hw/core/loader.c  |1 +
 hw/display/exynos4210_fimd.c  |6 +
 hw/display/framebuffer.c  |   10 +-
 hw/i386/kvm/ioapic.c  |2 +
 hw/i386/kvmvapic.c|1 +
 hw/misc/vfio.c|2 +
 hw/virtio/dataplane/Makefile.objs |2 +-
 hw/virtio/dataplane/hostmem.c |  176 -
 hw/virtio/dataplane/vring.c   |   56 +--
 hw/virtio/vhost.c |2 +
 hw/virtio/virtio-balloon.c|1 +
 hw/xen/xen_pt.c   |4 +
 include/exec/cpu-common.h |2 +-
 include/exec/memory.h |9 ++
 include/hw/virtio/dataplane/hostmem.h |   57 ---
 include/hw/virtio/dataplane/vring.h   |3 +-
 kvm-all.c |2 +
 memory.c  |  142 +-
 target-arm/kvm.c  |2 +
 target-i386/kvm.c |4 +-
 target-sparc/mmu_helper.c |1 +
 xen-all.c |2 +
 23 files changed, 253 insertions(+), 297 deletions(-)
 delete mode 100644 hw/virtio/dataplane/hostmem.c
 delete mode 100644 include/hw/virtio/dataplane/hostmem.h

[Qemu-devel] [RFC PATCH 6/8] memory: use a new FlatView pointer on every topology update

2013-05-06 Thread Paolo Bonzini

This is the first step towards converting as->current_map to
RCU-style updates, where the FlatView updates run concurrently
with uses of an old FlatView.
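
As a rough picture of where this is heading, the sketch below swaps in a
freshly allocated view on every update and lets a reference count decide
when the old one is freed. ViewSketch, SpaceSketch and the functions are
hypothetical, the code is single-threaded, and the synchronisation a real
implementation needs around the pointer fetch and the counter is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a FlatView. */
typedef struct ViewSketch {
    int refcount;
    int generation;             /* stands in for the list of ranges */
} ViewSketch;

/* Hypothetical stand-in for an AddressSpace. */
typedef struct SpaceSketch {
    ViewSketch *current;
} SpaceSketch;

ViewSketch *view_new(int generation)
{
    ViewSketch *v = malloc(sizeof(*v));

    assert(v != NULL);
    v->refcount = 1;            /* reference held by the address space */
    v->generation = generation;
    return v;
}

/* Drop one reference; returns 1 if the view was freed, 0 otherwise. */
int view_unref(ViewSketch *v)
{
    assert(v->refcount > 0);
    if (--v->refcount == 0) {
        free(v);
        return 1;
    }
    return 0;
}

/* Reader side: pin the current view so an update cannot free it. */
ViewSketch *space_get_view(SpaceSketch *as)
{
    as->current->refcount++;
    return as->current;
}

/*
 * Writer side: publish a new view, then drop the address space's
 * reference to the old one. Returns 1 if the old view was freed.
 */
int space_update(SpaceSketch *as, int generation)
{
    ViewSketch *old = as->current;

    as->current = view_new(generation);
    return view_unref(old);
}
```

A reader that pinned the old view keeps seeing a consistent snapshot until
it calls view_unref(), which is the property the RCU-style update is after.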

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 memory.c |   34 ++
 1 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/memory.c b/memory.c
index 553b04c..fbb2657 100644
--- a/memory.c
+++ b/memory.c
@@ -273,6 +273,7 @@ static void flatview_destroy(FlatView *view)
         memory_region_unref(view->ranges[i].mr);
     }
     g_free(view->ranges);
+    g_free(view);
 }
 
 static bool can_merge(FlatRange *r1, FlatRange *r2)
@@ -566,17 +567,18 @@ static void render_memory_region(FlatView *view,
 }
 
 /* Render a memory topology into a list of disjoint absolute ranges. */
-static FlatView generate_memory_topology(MemoryRegion *mr)
+static FlatView *generate_memory_topology(MemoryRegion *mr)
 {
-    FlatView view;
+    FlatView *view;
 
-    flatview_init(&view);
+    view = g_new(FlatView, 1);
+    flatview_init(view);
 
     if (mr) {
-        render_memory_region(&view, mr, int128_zero(),
+        render_memory_region(view, mr, int128_zero(),
                              addrrange_make(int128_zero(), int128_2_64()), false);
     }
-    flatview_simplify(&view);
+    flatview_simplify(view);
 
     return view;
 }
@@ -664,8 +666,8 @@ static void address_space_update_ioeventfds(AddressSpace 
*as)
 }
 
 static void address_space_update_topology_pass(AddressSpace *as,
-   FlatView old_view,
-   FlatView new_view,
+   const FlatView *old_view,
+   const FlatView *new_view,
bool adding)
 {
 unsigned iold, inew;
@@ -675,14 +677,14 @@ static void 
address_space_update_topology_pass(AddressSpace *as,
  * Kill ranges in the old map, and instantiate ranges in the new map.
  */
     iold = inew = 0;
-    while (iold < old_view.nr || inew < new_view.nr) {
-        if (iold < old_view.nr) {
-            frold = &old_view.ranges[iold];
+    while (iold < old_view->nr || inew < new_view->nr) {
+        if (iold < old_view->nr) {
+            frold = &old_view->ranges[iold];
         } else {
             frold = NULL;
         }
-        if (inew < new_view.nr) {
-            frnew = &new_view.ranges[inew];
+        if (inew < new_view->nr) {
+            frnew = &new_view->ranges[inew];
         } else {
             frnew = NULL;
         }
@@ -728,14 +730,14 @@ static void 
address_space_update_topology_pass(AddressSpace *as,
 
 static void address_space_update_topology(AddressSpace *as)
 {
-    FlatView old_view = *as->current_map;
-    FlatView new_view = generate_memory_topology(as->root);
+    FlatView *old_view = as->current_map;
+    FlatView *new_view = generate_memory_topology(as->root);
 
     address_space_update_topology_pass(as, old_view, new_view, false);
     address_space_update_topology_pass(as, old_view, new_view, true);
 
-    *as->current_map = new_view;
-    flatview_destroy(&old_view);
+    as->current_map = new_view;
+    flatview_destroy(old_view);
     address_space_update_ioeventfds(as);
 }
 
-- 
1.7.1

[Qemu-devel] [RFC PATCH 2/8] exec: check MRU in qemu_ram_addr_from_host

2013-05-06 Thread Paolo Bonzini

This function is not used outside the iothread mutex, so it
can use ram_list.mru_block.
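
The idea is a fast path that tries the most recently used block before
walking the list. A standalone sketch follows; BlockSketch,
BlockListSketch and addr_from_host() are hypothetical names, and a plain
array stands in for the QTAILQ of RAM blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAM block. */
typedef struct BlockSketch {
    uint8_t *host;              /* NULL when the block is not mapped */
    size_t length;
    size_t offset;              /* offset of the block in the RAM address space */
} BlockSketch;

/* Hypothetical stand-in for the RAM list, with its MRU hint. */
typedef struct BlockListSketch {
    BlockSketch *blocks;
    size_t count;
    BlockSketch *mru;           /* most recently used block, may be NULL */
} BlockListSketch;

/* Unsigned wrap-around makes pointers below the base fail the test too. */
static int block_contains(const BlockSketch *b, const uint8_t *ptr)
{
    return b->host != NULL &&
           (uintptr_t)ptr - (uintptr_t)b->host < b->length;
}

/* Translate a host pointer to a RAM address; returns 0 on success, -1 if unknown. */
int addr_from_host(const BlockListSketch *list, const uint8_t *ptr, size_t *addr)
{
    const BlockSketch *b = list->mru;
    size_t i;

    assert(addr != NULL);

    if (b == NULL || !block_contains(b, ptr)) {
        b = NULL;               /* fast path missed: fall back to the full walk */
        for (i = 0; i < list->count; i++) {
            if (block_contains(&list->blocks[i], ptr)) {
                b = &list->blocks[i];
                break;
            }
        }
        if (b == NULL) {
            return -1;
        }
    }
    *addr = b->offset + (size_t)(ptr - b->host);
    return 0;
}
```

Like the patch, the sketch only reads the MRU hint; keeping it up to date
is left to whoever already maintains it.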

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 exec.c |   12 ++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/exec.c b/exec.c
index 9f324bb..8e46228 100644
--- a/exec.c
+++ b/exec.c
@@ -1404,18 +1404,26 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t 
*ram_addr)
 return 0;
 }
 
+    block = ram_list.mru_block;
+    if (block && block->host && host - block->host < block->length) {
+        goto found;
+    }
+
     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
         /* This case append when the block is not mapped. */
         if (block->host == NULL) {
             continue;
         }
         if (host - block->host < block->length) {
-            *ram_addr = block->offset + (host - block->host);
-            return 0;
+            goto found;
         }
     }
 
     return -1;
+
+found:
+    *ram_addr = block->offset + (host - block->host);
+    return 0;
 }
 
 /* Some of the softmmu routines need to translate from a host pointer
-- 
1.7.1

[Qemu-devel] [RFC][PATCH 02/15] applesmc: replace register_ioport*

2013-05-06 Thread Jan Kiszka

Convert over to memory regions to obsolete register_ioport*.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/misc/applesmc.c |   48 
 1 files changed, 36 insertions(+), 12 deletions(-)

diff --git a/hw/misc/applesmc.c b/hw/misc/applesmc.c
index 78904a8..af24be1 100644
--- a/hw/misc/applesmc.c
+++ b/hw/misc/applesmc.c
@@ -73,6 +73,8 @@ typedef struct AppleSMCState AppleSMCState;
 struct AppleSMCState {
 ISADevice parent_obj;
 
+MemoryRegion io_data;
+MemoryRegion io_cmd;
 uint32_t iobase;
 uint8_t cmd;
 uint8_t status;
@@ -86,7 +88,8 @@ struct AppleSMCState {
 QLIST_HEAD(, AppleSMCData) data_def;
 };
 
-static void applesmc_io_cmd_writeb(void *opaque, uint32_t addr, uint32_t val)
+static void applesmc_io_cmd_write(void *opaque, hwaddr addr, uint64_t val,
+  unsigned size)
 {
 AppleSMCState *s = opaque;
 
@@ -115,7 +118,8 @@ static void applesmc_fill_data(AppleSMCState *s)
 }
 }
 
-static void applesmc_io_data_writeb(void *opaque, uint32_t addr, uint32_t val)
+static void applesmc_io_data_write(void *opaque, hwaddr addr, uint64_t val,
+   unsigned size)
 {
 AppleSMCState *s = opaque;
 
@@ -138,7 +142,8 @@ static void applesmc_io_data_writeb(void *opaque, uint32_t addr, uint32_t val)
 }
 }
 
-static uint32_t applesmc_io_data_readb(void *opaque, uint32_t addr1)
+static uint64_t applesmc_io_data_read(void *opaque, hwaddr addr1,
+  unsigned size)
 {
 AppleSMCState *s = opaque;
 uint8_t retval = 0;
@@ -162,7 +167,7 @@ static uint32_t applesmc_io_data_readb(void *opaque, uint32_t addr1)
 return retval;
 }
 
-static uint32_t applesmc_io_cmd_readb(void *opaque, uint32_t addr1)
+static uint64_t applesmc_io_cmd_read(void *opaque, hwaddr addr1, unsigned size)
 {
 AppleSMCState *s = opaque;
 
@@ -201,18 +206,37 @@ static void qdev_applesmc_isa_reset(DeviceState *dev)
 applesmc_add_key(s, "MSSD", 1, "\0x3");
 }
 
+static const MemoryRegionOps applesmc_data_io_ops = {
+.write = applesmc_io_data_write,
+.read = applesmc_io_data_read,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
+};
+
+static const MemoryRegionOps applesmc_cmd_io_ops = {
+.write = applesmc_io_cmd_write,
+.read = applesmc_io_cmd_read,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
+};
+
 static int applesmc_isa_init(ISADevice *dev)
 {
 AppleSMCState *s = APPLE_SMC(dev);
 
-register_ioport_read(s->iobase + APPLESMC_DATA_PORT, 4, 1,
- applesmc_io_data_readb, s);
-register_ioport_read(s->iobase + APPLESMC_CMD_PORT, 4, 1,
- applesmc_io_cmd_readb, s);
-register_ioport_write(s->iobase + APPLESMC_DATA_PORT, 4, 1,
-  applesmc_io_data_writeb, s);
-register_ioport_write(s->iobase + APPLESMC_CMD_PORT, 4, 1,
-  applesmc_io_cmd_writeb, s);
+memory_region_init_io(&s->io_data, &applesmc_data_io_ops, s,
+  "applesmc-data", 4);
+isa_register_ioport(dev, &s->io_data, s->iobase + APPLESMC_DATA_PORT);
+
+memory_region_init_io(&s->io_cmd, &applesmc_cmd_io_ops, s,
+  "applesmc-cmd", 4);
+isa_register_ioport(dev, &s->io_cmd, s->iobase + APPLESMC_CMD_PORT);
 
 if (!s->osk || (strlen(s->osk) != 64)) {
 fprintf(stderr, "WARNING: Using AppleSMC with invalid key\n");
-- 
1.7.3.4

[Qemu-devel] [RFC PATCH 3/8] memory: return MemoryRegion from qemu_ram_addr_from_host

2013-05-06 Thread Paolo Bonzini

It will be needed in the next patch.

The common parts of qemu_get_ram_ptr and qemu_ram_addr_from_host are
moved to a new function qemu_get_ram_block.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 exec.c|   35 +--
 include/exec/cpu-common.h |2 +-
 target-i386/kvm.c |4 ++--
 3 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/exec.c b/exec.c
index 8e46228..54b57fc 100644
--- a/exec.c
+++ b/exec.c
@@ -1287,15 +1287,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
 }
 #endif /* !_WIN32 */
 
-/* Return a host pointer to ram allocated with qemu_ram_alloc.
-   With the exception of the softmmu code in this file, this should
-   only be used for local memory (e.g. video ram) that the device owns,
-   and knows it isn't going to access beyond the end of the block.
-
-   It should not be used for general purpose DMA.
-   Use cpu_physical_memory_map/cpu_physical_memory_rw instead.
- */
-void *qemu_get_ram_ptr(ram_addr_t addr)
+static RAMBlock *qemu_get_ram_block(ram_addr_t addr)
 {
 RAMBlock *block;
 
@@ -1315,6 +1307,21 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
 
 found:
 ram_list.mru_block = block;
+return block;
+}
+
+/* Return a host pointer to ram allocated with qemu_ram_alloc.
+   With the exception of the softmmu code in this file, this should
+   only be used for local memory (e.g. video ram) that the device owns,
+   and knows it isn't going to access beyond the end of the block.
+
+   It should not be used for general purpose DMA.
+   Use cpu_physical_memory_map/cpu_physical_memory_rw instead.
+ */
+void *qemu_get_ram_ptr(ram_addr_t addr)
+{
+RAMBlock *block = qemu_get_ram_block(addr);
+
 if (xen_enabled()) {
 /* We need to check if the requested address is in the RAM
  * because we don't want to map the entire memory in QEMU.
@@ -1394,14 +1401,14 @@ void qemu_put_ram_ptr(void *addr)
 trace_qemu_put_ram_ptr(addr);
 }
 
-int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
+MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
 {
 RAMBlock *block;
 uint8_t *host = ptr;
 
 if (xen_enabled()) {
 *ram_addr = xen_ram_addr_from_mapcache(ptr);
-return 0;
+return qemu_get_ram_block(*ram_addr)->mr;
 }
 
 block = ram_list.mru_block;
@@ -1419,11 +1426,11 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
 }
 }
 
-return -1;
+return NULL;
 
 found:
 *ram_addr = block->offset + (host - block->host);
-return 0;
+return block->mr;
 }
 
 /* Some of the softmmu routines need to translate from a host pointer
@@ -1432,7 +1439,7 @@ ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
 {
 ram_addr_t ram_addr;
 
-if (qemu_ram_addr_from_host(ptr, &ram_addr)) {
+if (qemu_ram_addr_from_host(ptr, &ram_addr) == NULL) {
 fprintf(stderr, "Bad ram pointer %p\n", ptr);
 abort();
 }
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 2e5f11f..84dfd3b 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -53,7 +53,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
 void *qemu_get_ram_ptr(ram_addr_t addr);
 void qemu_put_ram_ptr(void *addr);
 /* This should not be used by devices.  */
-int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
+MemoryRegion *qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr);
 ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr);
 void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 9ffb6ca..7ba98cd 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -318,7 +318,7 @@ int kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
 
 if ((env->mcg_cap & MCG_SER_P) && addr
 && (code == BUS_MCEERR_AR || code == BUS_MCEERR_AO)) {
-if (qemu_ram_addr_from_host(addr, &ram_addr) ||
+if (qemu_ram_addr_from_host(addr, &ram_addr) == NULL ||
 !kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
 fprintf(stderr, "Hardware memory error for memory used by "
 "QEMU itself instead of guest system!\n");
@@ -350,7 +350,7 @@ int kvm_arch_on_sigbus(int code, void *addr)
 hwaddr paddr;
 
 /* Hope we are lucky for AO MCE */
-if (qemu_ram_addr_from_host(addr, &ram_addr) ||
+if (qemu_ram_addr_from_host(addr, &ram_addr) == NULL ||
 !kvm_physical_memory_addr_from_host(CPU(first_cpu)->kvm_state,
 addr, &paddr)) {
 fprintf(stderr, "Hardware memory error for memory used by "
-- 
1.7.1

[Qemu-devel] [RFC][PATCH 10/15] memory: Rework sub-page handling

2013-05-06 Thread Jan Kiszka

Simplify the sub-page handling by implementing it directly in the
dispatcher instead of using a redirection memory region. We extend the
phys_sections entries to optionally hold a pointer to the sub-section
table that used to reside in the subpage_t structure. IOW, we add one
optional dispatch level below the existing radix tree.

address_space_lookup_region is extended to take this additional level
into account. This direct dispatching to that target memory region will
also be helpful when we want to add per-region locking control.
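
To make the extra dispatch level concrete, here is a small standalone model.
It is a sketch under several assumptions, not the exec.c code: a flat
page_map array stands in for the radix tree, the page size is fixed at 4 KiB,
section index 0 plays the role of the unassigned section, and lookup() is a
hypothetical name for what address_space_lookup_region does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_BITS 12                       /* assumed 4 KiB pages */
#define PAGE_SIZE (1u << PAGE_BITS)
#define SUBSECTION_IDX(addr) ((addr) & (PAGE_SIZE - 1))
#define NB_PAGES 4                         /* toy address space */
#define NB_SECTIONS 8

typedef struct PhysSection {
    /* The real entry also embeds the MemoryRegionSection. */
    uint16_t *sub_section;  /* PAGE_SIZE entries, NULL for whole pages */
} PhysSection;

static PhysSection sections[NB_SECTIONS];
static uint16_t page_map[NB_PAGES];        /* stand-in for the radix tree */

/* Resolve addr to a section index, descending one more level if needed. */
static uint16_t lookup(uint32_t addr)
{
    uint16_t idx;

    assert((addr >> PAGE_BITS) < NB_PAGES);
    idx = page_map[addr >> PAGE_BITS];
    if (sections[idx].sub_section != NULL) {
        idx = sections[idx].sub_section[SUBSECTION_IDX(addr)];
    }
    return idx;
}

/* Point bytes [start, end] of a page-level entry at section `target`. */
static int subsection_register(PhysSection *ps, uint32_t start,
                               uint32_t end, uint16_t target)
{
    if (start > end || end >= PAGE_SIZE) {
        return -1;
    }
    if (ps->sub_section == NULL) {
        /* Zero-filled, so every byte starts out as unassigned (index 0). */
        ps->sub_section = calloc(PAGE_SIZE, sizeof(uint16_t));
        if (ps->sub_section == NULL) {
            return -1;
        }
    }
    for (uint32_t i = start; i <= end; i++) {
        ps->sub_section[i] = target;
    }
    return 0;
}
```

A page without a sub_section table costs one array access; only pages that
are actually split pay for the second lookup.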

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 exec.c|  175 +
 include/exec/memory.h |1 -
 memory.c  |1 -
 3 files changed, 59 insertions(+), 118 deletions(-)

diff --git a/exec.c b/exec.c
index 53c2778..3ee1f3f 100644
--- a/exec.c
+++ b/exec.c
@@ -51,7 +51,6 @@
 #include "exec/memory-internal.h"
 
 //#define DEBUG_UNASSIGNED
-//#define DEBUG_SUBPAGE
 
 #if !defined(CONFIG_USER_ONLY)
 int phys_ram_fd;
@@ -82,7 +81,14 @@ int use_icount;
 
 #if !defined(CONFIG_USER_ONLY)
 
-static MemoryRegionSection *phys_sections;
+#define SUBSECTION_IDX(addr) ((addr) & ~TARGET_PAGE_MASK)
+
+typedef struct PhysSection {
+MemoryRegionSection section;
+uint16_t *sub_section;
+} PhysSection;
+
+static PhysSection *phys_sections;
 static unsigned phys_sections_nb, phys_sections_nb_alloc;
 static uint16_t phys_section_unassigned;
 static uint16_t phys_section_notdirty;
@@ -182,8 +188,8 @@ static void phys_page_set(AddressSpaceDispatch *d,
 phys_page_set_level(&d->phys_map, &index, &nb, leaf, P_L2_LEVELS - 1);
 }
 
-static MemoryRegionSection *phys_page_find(AddressSpaceDispatch *d,
-   hwaddr index)
+static PhysSection *phys_section_find(AddressSpaceDispatch *d,
+  hwaddr index)
 {
 PhysPageEntry lp = d->phys_map;
 PhysPageEntry *p;
@@ -646,7 +652,7 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
and avoid full address decoding in every device.
We can't use the high bits of pd for this because
IO_MEM_ROMD uses these as a ram address.  */
-iotlb = section - phys_sections;
+iotlb = container_of(section, PhysSection, section) - phys_sections;
 iotlb += memory_region_section_addr(section, paddr);
 }
 
@@ -668,27 +674,13 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
 #endif /* defined(CONFIG_USER_ONLY) */
 
 #if !defined(CONFIG_USER_ONLY)
+static int subsection_register(PhysSection *psection, uint32_t start,
+   uint32_t end, uint16_t section);
+static void subsections_init(PhysSection *psection);
 
-#define SUBPAGE_IDX(addr) ((addr) & ~TARGET_PAGE_MASK)
-typedef struct subpage_t {
-MemoryRegion iomem;
-hwaddr base;
-uint16_t sub_section[TARGET_PAGE_SIZE];
-} subpage_t;
-
-static int subpage_register (subpage_t *mmio, uint32_t start, uint32_t end,
- uint16_t section);
-static subpage_t *subpage_init(hwaddr base);
 static void destroy_page_desc(uint16_t section_index)
 {
-MemoryRegionSection *section = &phys_sections[section_index];
-MemoryRegion *mr = section->mr;
-
-if (mr->subpage) {
-subpage_t *subpage = container_of(mr, subpage_t, iomem);
-memory_region_destroy(&subpage->iomem);
-g_free(subpage);
-}
+g_free(phys_sections[section_index].sub_section);
 }
 
 static void destroy_l2_mapping(PhysPageEntry *lp, unsigned level)
@@ -722,10 +714,11 @@ static uint16_t phys_section_add(MemoryRegionSection *section)
 {
 if (phys_sections_nb == phys_sections_nb_alloc) {
 phys_sections_nb_alloc = MAX(phys_sections_nb_alloc * 2, 16);
-phys_sections = g_renew(MemoryRegionSection, phys_sections,
+phys_sections = g_renew(PhysSection, phys_sections,
 phys_sections_nb_alloc);
 }
-phys_sections[phys_sections_nb] = *section;
+phys_sections[phys_sections_nb].section = *section;
+phys_sections[phys_sections_nb].sub_section = NULL;
 return phys_sections_nb++;
 }
 
@@ -734,31 +727,31 @@ static void phys_sections_clear(void)
 phys_sections_nb = 0;
 }
 
-static void register_subpage(AddressSpaceDispatch *d, MemoryRegionSection *section)
+static void register_subsection(AddressSpaceDispatch *d,
+MemoryRegionSection *section)
 {
-subpage_t *subpage;
 hwaddr base = section->offset_within_address_space
 & TARGET_PAGE_MASK;
-MemoryRegionSection *existing = phys_page_find(d, base >> TARGET_PAGE_BITS);
+PhysSection *psection = phys_section_find(d, base >> TARGET_PAGE_BITS);
 MemoryRegionSection subsection = {
 .offset_within_address_space = base,
 .size = TARGET_PAGE_SIZE,
 };
+uint16_t new_section;
 hwaddr start, end;
 
-assert(existing->mr->subpage || existing->mr == &io_mem_unassigned);
+

[Qemu-devel] [RFC][PATCH 00/15] Refactor portio dispatching

2013-05-06 Thread Jan Kiszka

This series converts the remaining users of register_ioport* to portio
lists, simplifies the handling of subpages and adds support for unaligned
memory region accesses. Then it replaces the current portio dispatcher
with the existing one for MMIO and removes several lines of code. This
also makes it possible to build BQL-free portio on top once we enhance
the memory layer accordingly.

Seems to work fine so far but surely requires thorough review. And I
would welcome early comments on the direction.

Jan


CC: malc av1...@comtv.ru

Jan Kiszka (15):
  adlib: replace register_ioport*
  applesmc: replace register_ioport*
  wdt_ib700: replace register_ioport*
  i82374: replace register_ioport*
  prep: replace register_ioport*
  vt82c686: replace register_ioport*
  Privatize register_ioport_read/write
  isa: implement isa_is_ioport_assigned via memory_region_find
  memory: Introduce address_space_lookup_region
  memory: Rework sub-page handling
  memory: Allow unaligned address_space_rw
  vmware-vga: Accept unaligned I/O accesses
  ioport: Switch dispatching to memory core layer
  ioport: Remove unused old dispatching services
  ioport: Move IOPortRead/WriteFunc typedefs to memory.h

 cputlb.c   |2 +-
 exec.c |  273 +
 hw/acpi/piix4.c|6 +-
 hw/audio/adlib.c   |   20 ++-
 hw/display/vmware_vga.c|4 +
 hw/dma/i82374.c|   17 ++-
 hw/isa/isa-bus.c   |   11 ++
 hw/isa/lpc_ich9.c  |8 +-
 hw/isa/vt82c686.c  |   40 --
 hw/misc/applesmc.c |   48 +--
 hw/ppc/prep.c  |   23 ++-
 hw/watchdog/wdt_ib700.c|   12 ++-
 include/exec/cputlb.h  |2 -
 include/exec/ioport.h  |   19 +---
 include/exec/iorange.h |   31 
 include/exec/memory-internal.h |2 -
 include/exec/memory.h  |   26 ++--
 include/hw/isa/isa.h   |2 +
 ioport.c   |  294 
 memory.c   |  102 +-
 translate-all.c|3 +-
 21 files changed, 311 insertions(+), 634 deletions(-)
 delete mode 100644 include/exec/iorange.h

-- 
1.7.3.4

[Qemu-devel] [RFC][PATCH 07/15] Privatize register_ioport_read/write

2013-05-06 Thread Jan Kiszka

No more users outside of ioport.c.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 include/exec/ioport.h |4 
 ioport.c  |8 
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/include/exec/ioport.h b/include/exec/ioport.h
index fc28350..4953892 100644
--- a/include/exec/ioport.h
+++ b/include/exec/ioport.h
@@ -39,10 +39,6 @@ typedef uint32_t (IOPortReadFunc)(void *opaque, uint32_t address);
 typedef void (IOPortDestructor)(void *opaque);
 
 void ioport_register(IORange *iorange);
-int register_ioport_read(pio_addr_t start, int length, int size,
- IOPortReadFunc *func, void *opaque);
-int register_ioport_write(pio_addr_t start, int length, int size,
-  IOPortWriteFunc *func, void *opaque);
 void isa_unassign_ioport(pio_addr_t start, int length);
 bool isa_is_ioport_assigned(pio_addr_t start);
 
diff --git a/ioport.c b/ioport.c
index a0ac2a0..d5b7fbd 100644
--- a/ioport.c
+++ b/ioport.c
@@ -139,8 +139,8 @@ static int ioport_bsize(int size, int *bsize)
 }
 
 /* size is the word size in byte */
-int register_ioport_read(pio_addr_t start, int length, int size,
- IOPortReadFunc *func, void *opaque)
+static int register_ioport_read(pio_addr_t start, int length, int size,
+IOPortReadFunc *func, void *opaque)
 {
 int i, bsize;
 
@@ -159,8 +159,8 @@ int register_ioport_read(pio_addr_t start, int length, int size,
 }
 
 /* size is the word size in byte */
-int register_ioport_write(pio_addr_t start, int length, int size,
-  IOPortWriteFunc *func, void *opaque)
+static int register_ioport_write(pio_addr_t start, int length, int size,
+ IOPortWriteFunc *func, void *opaque)
 {
 int i, bsize;
 
-- 
1.7.3.4

[Qemu-devel] [RFC][PATCH 06/15] vt82c686: replace register_ioport*

2013-05-06 Thread Jan Kiszka

Convert over to memory regions to obsolete register_ioport*.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/isa/vt82c686.c |   40 ++--
 1 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/hw/isa/vt82c686.c b/hw/isa/vt82c686.c
index 5261927..c0d9919 100644
--- a/hw/isa/vt82c686.c
+++ b/hw/isa/vt82c686.c
@@ -43,10 +43,12 @@ typedef struct SuperIOConfig
 
 typedef struct VT82C686BState {
 PCIDevice dev;
+MemoryRegion superio;
 SuperIOConfig superio_conf;
 } VT82C686BState;
 
-static void superio_ioport_writeb(void *opaque, uint32_t addr, uint32_t data)
+static void superio_ioport_writeb(void *opaque, hwaddr addr, uint64_t data,
+  unsigned size)
 {
 int can_write;
 SuperIOConfig *superio_conf = opaque;
@@ -93,7 +95,7 @@ static void superio_ioport_writeb(void *opaque, uint32_t addr, uint32_t data)
 }
 }
 
-static uint32_t superio_ioport_readb(void *opaque, uint32_t addr)
+static uint64_t superio_ioport_readb(void *opaque, hwaddr addr, unsigned size)
 {
 SuperIOConfig *superio_conf = opaque;
 
@@ -101,6 +103,16 @@ static uint32_t superio_ioport_readb(void *opaque, uint32_t addr)
 return (superio_conf->config[superio_conf->index]);
 }
 
+static const MemoryRegionOps superio_ops = {
+.read = superio_ioport_readb,
+.write = superio_ioport_writeb,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
+};
+
 static void vt82c686b_reset(void * opaque)
 {
 PCIDevice *d = opaque;
@@ -140,17 +152,7 @@ static void vt82c686b_write_config(PCIDevice * d, uint32_t address,
 
 pci_default_write_config(d, address, val, len);
 if (address == 0x85) {  /* enable or disable super IO configure */
-if (val & 0x2) {
-/* floppy also uses 0x3f0 and 0x3f1.
- * But we do not emulate flopy,so just set it here. */
-isa_unassign_ioport(0x3f0, 2);
-register_ioport_read(0x3f0, 2, 1, superio_ioport_readb,
- &vt686->superio_conf);
-register_ioport_write(0x3f0, 2, 1, superio_ioport_writeb,
-  &vt686->superio_conf);
-} else {
-isa_unassign_ioport(0x3f0, 2);
-}
+memory_region_set_enabled(&vt686->superio, val & 0x2);
 }
 }
 
@@ -423,11 +425,13 @@ static const VMStateDescription vmstate_via = {
 /* init the PCI-to-ISA bridge */
 static int vt82c686b_initfn(PCIDevice *d)
 {
+VT82C686BState *vt82c = DO_UPCAST(VT82C686BState, dev, d);
 uint8_t *pci_conf;
+ISABus *isa_bus;
 uint8_t *wmask;
 int i;
 
-isa_bus_new(&d->qdev, pci_address_space_io(d));
+isa_bus = isa_bus_new(&d->qdev, pci_address_space_io(d));
 
 pci_conf = d->config;
 pci_config_set_prog_interface(pci_conf, 0x0);
@@ -439,6 +443,14 @@ static int vt82c686b_initfn(PCIDevice *d)
}
 }
 
+memory_region_init_io(&vt82c->superio, &superio_ops, &vt82c->superio_conf,
+  "superio", 2);
+memory_region_set_enabled(&vt82c->superio, false);
+/* The floppy also uses 0x3f0 and 0x3f1.
+ * But we do not emulate a floppy, so just set it here. */
+memory_region_add_subregion(isa_bus->address_space_io, 0x3f0,
+&vt82c->superio);
+
 qemu_register_reset(vt82c686b_reset, d);
 
 return 0;
-- 
1.7.3.4

[Qemu-devel] [RFC][PATCH 15/15] ioport: Move IOPortRead/WriteFunc typedefs to memory.h

2013-05-06 Thread Jan Kiszka

Move the function types required for MemoryRegionPortio to memory.h.
This lets ioport.h depend on memory.h, which is more consistent than
the other way around.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 include/exec/ioport.h |8 +---
 include/exec/memory.h |4 +++-
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/include/exec/ioport.h b/include/exec/ioport.h
index ba3ebb8..c7da6d4 100644
--- a/include/exec/ioport.h
+++ b/include/exec/ioport.h
@@ -25,6 +25,7 @@
 #define IOPORT_H
 
 #include "qemu-common.h"
+#include "exec/memory.h"
 
 typedef uint32_t pio_addr_t;
 #define FMT_pioaddr PRIx32
@@ -32,10 +33,6 @@ typedef uint32_t pio_addr_t;
 #define MAX_IOPORTS (64 * 1024)
 #define IOPORTS_MASK (MAX_IOPORTS - 1)
 
-/* These should really be in isa.h, but are here to make pc.h happy.  */
-typedef void (IOPortWriteFunc)(void *opaque, uint32_t address, uint32_t data);
-typedef uint32_t (IOPortReadFunc)(void *opaque, uint32_t address);
-
 void cpu_outb(pio_addr_t addr, uint8_t val);
 void cpu_outw(pio_addr_t addr, uint16_t val);
 void cpu_outl(pio_addr_t addr, uint32_t val);
@@ -43,9 +40,6 @@ uint8_t cpu_inb(pio_addr_t addr);
 uint16_t cpu_inw(pio_addr_t addr);
 uint32_t cpu_inl(pio_addr_t addr);
 
-struct MemoryRegion;
-struct MemoryRegionPortio;
-
 typedef struct PortioList {
 const struct MemoryRegionPortio *ports;
 struct MemoryRegion *address_space;
diff --git a/include/exec/memory.h b/include/exec/memory.h
index cad73f5..7843076 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -22,7 +22,6 @@
 #include "exec/cpu-common.h"
 #include "exec/hwaddr.h"
 #include "qemu/queue.h"
-#include "exec/ioport.h"
 #include "qemu/int128.h"
 
 typedef struct MemoryRegionOps MemoryRegionOps;
@@ -136,6 +135,9 @@ struct MemoryRegion {
 MemoryRegionIoeventfd *ioeventfds;
 };
 
+typedef void (IOPortWriteFunc)(void *opaque, uint32_t address, uint32_t data);
+typedef uint32_t (IOPortReadFunc)(void *opaque, uint32_t address);
+
 struct MemoryRegionPortio {
 uint32_t offset;
 uint32_t len;
-- 
1.7.3.4

[Qemu-devel] [RFC PATCH 7/8] memory: add reference counting to FlatView

2013-05-06 Thread Paolo Bonzini

With this change, a FlatView can be used even after a concurrent
update has replaced it.  Because we do not have RCU, we use a
mutex to protect the small critical sections that read/write the
as->current_map pointer.  Accesses to the FlatView can be done
outside the mutex.

If a MemoryRegion will be used after the FlatView is unref-ed (or after
a MemoryListener callback has returned), a reference has to be added to
that MemoryRegion.  For example, memory_region_find adds a reference to
the MemoryRegion that it returns.
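
The lifetime rule can be tried out in isolation with the sketch below. It is
a model under stated assumptions rather than the memory.c code: View,
view_get, view_replace and views_freed are hypothetical names, a pthread
mutex stands in for QemuMutex, and the GCC/Clang __sync builtins are assumed
to be available.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct View {
    unsigned ref;
    int generation;          /* stand-in for the ranges array */
} View;

static pthread_mutex_t view_mutex = PTHREAD_MUTEX_INITIALIZER;
static View *current_view;   /* plays the role of as->current_map */
static int views_freed;      /* bookkeeping for the demonstration only */

static View *view_new(int generation)
{
    View *v = malloc(sizeof(*v));

    assert(v != NULL);
    v->ref = 1;              /* the reference owned by current_view */
    v->generation = generation;
    return v;
}

static void view_unref(View *v)
{
    /* The builtin returns the old value; 1 means this was the last ref. */
    if (__sync_fetch_and_sub(&v->ref, 1) == 1) {
        free(v);
        views_freed++;
    }
}

/* Reader: the pointer read and the increment form the critical section. */
static View *view_get(void)
{
    View *v;

    pthread_mutex_lock(&view_mutex);
    v = current_view;
    assert(v != NULL);
    __sync_fetch_and_add(&v->ref, 1);
    pthread_mutex_unlock(&view_mutex);
    return v;                /* usable without the mutex until unref */
}

/* Writer: publish a new view and drop the reference the pointer held. */
static void view_replace(View *new_view)
{
    pthread_mutex_lock(&view_mutex);
    if (current_view != NULL) {
        view_unref(current_view);
    }
    current_view = new_view;
    pthread_mutex_unlock(&view_mutex);
}
```

A reader that obtained a view before view_replace() keeps a valid object
until its own view_unref(), which is the property the patch relies on.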

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 memory.c |   75 +++--
 1 files changed, 67 insertions(+), 8 deletions(-)

diff --git a/memory.c b/memory.c
index fbb2657..f02a3cc 100644
--- a/memory.c
+++ b/memory.c
@@ -26,12 +26,26 @@ static unsigned memory_region_transaction_depth;
 static bool memory_region_update_pending;
 static bool global_dirty_log = false;
 
+/* Either the flat_view_mutex or the iothread mutex can be taken around reads
+ * of as->current_map; the critical section is extremely short, so I'm using a
+ * single mutex for every AS.  We could also RCU for the read-side.
+ *
+ * Both locks are taken while writing to as->current_map (with the iothread
+ * mutex outside).
+ */
+static QemuMutex flat_view_mutex;
+
 static QTAILQ_HEAD(memory_listeners, MemoryListener) memory_listeners
 = QTAILQ_HEAD_INITIALIZER(memory_listeners);
 
 static QTAILQ_HEAD(, AddressSpace) address_spaces
 = QTAILQ_HEAD_INITIALIZER(address_spaces);
 
+static void memory_init(void)
+{
+qemu_mutex_init(&flat_view_mutex);
+}
+
 typedef struct AddrRange AddrRange;
 
 /*
@@ -222,6 +236,7 @@ struct FlatRange {
  * order.
  */
 struct FlatView {
+unsigned ref;
 FlatRange *ranges;
 unsigned nr;
 unsigned nr_allocated;
@@ -243,6 +258,7 @@ static bool flatrange_equal(FlatRange *a, FlatRange *b)
 
 static void flatview_init(FlatView *view)
 {
+view->ref = 1;
 view->ranges = NULL;
 view->nr = 0;
 view->nr_allocated = 0;
@@ -276,6 +292,18 @@ static void flatview_destroy(FlatView *view)
 g_free(view);
 }
 
+static void flatview_ref(FlatView *view)
+{
+__sync_fetch_and_add(&view->ref, 1);
+}
+
+static void flatview_unref(FlatView *view)
+{
+if (__sync_fetch_and_sub(&view->ref, 1) == 1) {
+flatview_destroy(view);
+}
+}
+
 static bool can_merge(FlatRange *r1, FlatRange *r2)
 {
 return int128_eq(addrrange_end(r1->addr), r2->addr.start)
@@ -728,16 +756,38 @@ static void address_space_update_topology_pass(AddressSpace *as,
 }
 
 
+static FlatView *address_space_get_flatview(AddressSpace *as)
+{
+FlatView *view;
+
+qemu_mutex_lock(&flat_view_mutex);
+view = as->current_map;
+flatview_ref(view);
+qemu_mutex_unlock(&flat_view_mutex);
+return view;
+}
+
 static void address_space_update_topology(AddressSpace *as)
 {
-FlatView *old_view = as->current_map;
+FlatView *old_view = address_space_get_flatview(as);
 FlatView *new_view = generate_memory_topology(as->root);
 
 address_space_update_topology_pass(as, old_view, new_view, false);
 address_space_update_topology_pass(as, old_view, new_view, true);
 
+qemu_mutex_lock(&flat_view_mutex);
+flatview_unref(as->current_map);
 as->current_map = new_view;
-flatview_destroy(old_view);
+qemu_mutex_unlock(&flat_view_mutex);
+
+/* Note that all the old MemoryRegions are still alive up to this
+ * point.  This relieves most MemoryListeners from the need to
+ * ref/unref the MemoryRegions they get---unless they use them
+ * outside the iothread mutex, in which case precise reference
+ * counting is necessary.
+ */
+flatview_unref(old_view);
+
 address_space_update_ioeventfds(as);
 }
 
@@ -1138,12 +1188,13 @@ void memory_region_sync_dirty_bitmap(MemoryRegion *mr)
 FlatRange *fr;
 
 QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
-FlatView *view = as->current_map;
+FlatView *view = address_space_get_flatview(as);
 FOR_EACH_FLAT_RANGE(fr, view) {
 if (fr->mr == mr) {
 MEMORY_LISTENER_UPDATE_REGION(fr, as, Forward, log_sync);
 }
 }
+flatview_unref(view);
 }
 }
 
@@ -1195,7 +1246,7 @@ static void memory_region_update_coalesced_range_as(MemoryRegion *mr, AddressSpa
 AddrRange tmp;
 MemoryRegionSection section;
 
-view = as->current_map;
+view = address_space_get_flatview(as);
 FOR_EACH_FLAT_RANGE(fr, view) {
 if (fr->mr == mr) {
 section = (MemoryRegionSection) {
@@ -1221,6 +1272,7 @@ static void memory_region_update_coalesced_range_as(MemoryRegion *mr, AddressSpa
 }
 }
 }
+flatview_unref(view);
 }
 
 static void memory_region_update_coalesced_range(MemoryRegion *mr)
@@ -1508,7 +1560,7 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
 as = memory_region_to_address_space(root);
 range =

[Qemu-devel] [RFC][PATCH 01/15] adlib: replace register_ioport*

2013-05-06 Thread Jan Kiszka

Convert over to memory regions to obsolete register_ioport*.

CC: malc av1...@comtv.ru
Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/audio/adlib.c |   20 
 1 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/hw/audio/adlib.c b/hw/audio/adlib.c
index fc20857..9a27c01 100644
--- a/hw/audio/adlib.c
+++ b/hw/audio/adlib.c
@@ -283,9 +283,17 @@ static void Adlib_fini (AdlibState *s)
 AUD_remove_card (&s->card);
 }
 
+static MemoryRegionPortio adlib_portio_list[] = {
+{ 0x388, 4, 1, .read = adlib_read, .write = adlib_write, },
+{ 0, 4, 1, .read = adlib_read, .write = adlib_write, },
+{ 0, 2, 1, .read = adlib_read, .write = adlib_write, },
+PORTIO_END_OF_LIST(),
+};
+
 static int Adlib_initfn (ISADevice *dev)
 {
 AdlibState *s = ADLIB(dev);
+PortioList *port_list = g_new(PortioList, 1);
 struct audsettings as;
 
 if (glob_adlib) {
@@ -338,14 +346,10 @@ static int Adlib_initfn (ISADevice *dev)
 s->samples = AUD_get_buffer_size_out (s->voice) >> SHIFT;
 s->mixbuf = g_malloc0 (s->samples << SHIFT);
 
-register_ioport_read (0x388, 4, 1, adlib_read, s);
-register_ioport_write (0x388, 4, 1, adlib_write, s);
-
-register_ioport_read (s->port, 4, 1, adlib_read, s);
-register_ioport_write (s->port, 4, 1, adlib_write, s);
-
-register_ioport_read (s->port + 8, 2, 1, adlib_read, s);
-register_ioport_write (s->port + 8, 2, 1, adlib_write, s);
+adlib_portio_list[1].offset = s->port;
+adlib_portio_list[2].offset = s->port + 8;
+portio_list_init (port_list, adlib_portio_list, s, "adlib");
+portio_list_add (port_list, isa_address_space_io(dev), 0);
 
 return 0;
 }
-- 
1.7.3.4

[Qemu-devel] [RFC][PATCH 14/15] ioport: Remove unused old dispatching services

2013-05-06 Thread Jan Kiszka

Remove unused ioport_register and isa_unassign_ioport along with
everything that only those services used.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 include/exec/ioport.h  |5 -
 include/exec/iorange.h |   31 --
 include/exec/memory.h  |9 --
 ioport.c   |  238 
 4 files changed, 0 insertions(+), 283 deletions(-)
 delete mode 100644 include/exec/iorange.h

diff --git a/include/exec/ioport.h b/include/exec/ioport.h
index b476857..ba3ebb8 100644
--- a/include/exec/ioport.h
+++ b/include/exec/ioport.h
@@ -25,7 +25,6 @@
 #define IOPORT_H
 
 #include "qemu-common.h"
-#include "exec/iorange.h"
 
 typedef uint32_t pio_addr_t;
 #define FMT_pioaddr PRIx32
@@ -36,10 +35,6 @@ typedef uint32_t pio_addr_t;
 /* These should really be in isa.h, but are here to make pc.h happy.  */
 typedef void (IOPortWriteFunc)(void *opaque, uint32_t address, uint32_t data);
 typedef uint32_t (IOPortReadFunc)(void *opaque, uint32_t address);
-typedef void (IOPortDestructor)(void *opaque);
-
-void ioport_register(IORange *iorange);
-void isa_unassign_ioport(pio_addr_t start, int length);
 
 void cpu_outb(pio_addr_t addr, uint8_t val);
 void cpu_outw(pio_addr_t addr, uint16_t val);
diff --git a/include/exec/iorange.h b/include/exec/iorange.h
deleted file mode 100644
index cd980a8..000
--- a/include/exec/iorange.h
+++ /dev/null
@@ -1,31 +0,0 @@
-#ifndef IORANGE_H
-#define IORANGE_H
-
-#include <stdint.h>
-
-typedef struct IORange IORange;
-typedef struct IORangeOps IORangeOps;
-
-struct IORangeOps {
-void (*read)(IORange *iorange, uint64_t offset, unsigned width,
- uint64_t *data);
-void (*write)(IORange *iorange, uint64_t offset, unsigned width,
-  uint64_t data);
-void (*destructor)(IORange *iorange);
-};
-
-struct IORange {
-const IORangeOps *ops;
-uint64_t base;
-uint64_t len;
-};
-
-static inline void iorange_init(IORange *iorange, const IORangeOps *ops,
-uint64_t base, uint64_t len)
-{
-iorange->ops = ops;
-iorange->base = base;
-iorange->len = len;
-}
-
-#endif
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 5c9a958..cad73f5 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -22,7 +22,6 @@
 #include "exec/cpu-common.h"
 #include "exec/hwaddr.h"
 #include "qemu/queue.h"
-#include "exec/iorange.h"
 #include "exec/ioport.h"
 #include "qemu/int128.h"
 
@@ -42,14 +41,6 @@ struct MemoryRegionMmio {
 CPUWriteMemoryFunc *write[3];
 };
 
-/* Internal use; thunks between old-style IORange and MemoryRegions. */
-typedef struct MemoryRegionIORange MemoryRegionIORange;
-struct MemoryRegionIORange {
-IORange iorange;
-MemoryRegion *mr;
-hwaddr offset;
-};
-
 /*
  * Memory region callbacks
  */
diff --git a/ioport.c b/ioport.c
index 9f15567..87e 100644
--- a/ioport.c
+++ b/ioport.c
@@ -30,252 +30,14 @@
 #include "exec/memory.h"
 #include "exec/address-spaces.h"
 
-/***/
-/* IO Port */
-
-//#define DEBUG_UNUSED_IOPORT
 //#define DEBUG_IOPORT
 
-#ifdef DEBUG_UNUSED_IOPORT
-#  define LOG_UNUSED_IOPORT(fmt, ...) fprintf(stderr, fmt, ## __VA_ARGS__)
-#else
-#  define LOG_UNUSED_IOPORT(fmt, ...) do{ } while (0)
-#endif
-
 #ifdef DEBUG_IOPORT
 #  define LOG_IOPORT(...) qemu_log_mask(CPU_LOG_IOPORT, ## __VA_ARGS__)
 #else
 #  define LOG_IOPORT(...) do { } while (0)
 #endif
 
-/* XXX: use a two level table to limit memory usage */
-
-static void *ioport_opaque[MAX_IOPORTS];
-static IOPortReadFunc *ioport_read_table[3][MAX_IOPORTS];
-static IOPortWriteFunc *ioport_write_table[3][MAX_IOPORTS];
-static IOPortDestructor *ioport_destructor_table[MAX_IOPORTS];
-
-static IOPortReadFunc default_ioport_readb, default_ioport_readw, default_ioport_readl;
-static IOPortWriteFunc default_ioport_writeb, default_ioport_writew, default_ioport_writel;
-
-static uint32_t ioport_read(int index, uint32_t address)
-{
-static IOPortReadFunc * const default_func[3] = {
-default_ioport_readb,
-default_ioport_readw,
-default_ioport_readl
-};
-IOPortReadFunc *func = ioport_read_table[index][address];
-if (!func)
-func = default_func[index];
-return func(ioport_opaque[address], address);
-}
-
-static void ioport_write(int index, uint32_t address, uint32_t data)
-{
-static IOPortWriteFunc * const default_func[3] = {
-default_ioport_writeb,
-default_ioport_writew,
-default_ioport_writel
-};
-IOPortWriteFunc *func = ioport_write_table[index][address];
-if (!func)
-func = default_func[index];
-func(ioport_opaque[address], address, data);
-}
-
-static uint32_t default_ioport_readb(void *opaque, uint32_t address)
-{
-LOG_UNUSED_IOPORT("unused inb: port=0x%04"PRIx32"\n", address);
-return 0xff;
-}
-
-static void default_ioport_writeb(void *opaque, uint32_t address, uint32_t data)
-{
-

[Qemu-devel] [RFC][PATCH 05/15] prep: replace register_ioport*

2013-05-06 Thread Jan Kiszka

Convert over to memory regions to obsolete register_ioport*.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/ppc/prep.c |   23 +++
 1 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c
index 59c7da3..671dc26 100644
--- a/hw/ppc/prep.c
+++ b/hw/ppc/prep.c
@@ -429,6 +429,16 @@ static void ppc_prep_reset(void *opaque)
 cpu_reset(CPU(cpu));
 }
 
+static const MemoryRegionPortio prep_portio_list[] = {
+/* System control ports */
+{ 0x0092, 1, 1, .read = PREP_io_800_readb, .write = PREP_io_800_writeb, },
+{ 0x0800, 0x52, 1,
+  .read = PREP_io_800_readb, .write = PREP_io_800_writeb, },
+/* Special port to get debug messages from Open-Firmware */
+{ 0x0F00, 4, 1, .write = PPC_debug_write, },
+PORTIO_END_OF_LIST(),
+};
+
 /* PowerPC PREP hardware initialisation */
 static void ppc_prep_init(QEMUMachineInitArgs *args)
 {
@@ -445,6 +455,7 @@ static void ppc_prep_init(QEMUMachineInitArgs *args)
 nvram_t nvram;
 M48t59State *m48t59;
 MemoryRegion *PPC_io_memory = g_new(MemoryRegion, 1);
+PortioList *port_list = g_new(PortioList, 1);
 #if 0
 MemoryRegion *xcsr = g_new(MemoryRegion, 1);
 #endif
@@ -624,11 +635,10 @@ static void ppc_prep_init(QEMUMachineInitArgs *args)
 isa_create_simple(isa_bus, "i8042");
 
 sysctrl->reset_irq = first_cpu->irq_inputs[PPC6xx_INPUT_HRESET];
-/* System control ports */
-register_ioport_read(0x0092, 0x01, 1, PREP_io_800_readb, sysctrl);
-register_ioport_write(0x0092, 0x01, 1, PREP_io_800_writeb, sysctrl);
-register_ioport_read(0x0800, 0x52, 1, PREP_io_800_readb, sysctrl);
-register_ioport_write(0x0800, 0x52, 1, PREP_io_800_writeb, sysctrl);
+
+portio_list_init(port_list, prep_portio_list, sysctrl, "prep");
+portio_list_add(port_list, get_system_io(), 0x0);
+
 /* PowerPC control and status register group */
 #if 0
 memory_region_init_io(xcsr, &PPC_XCSR_ops, NULL, "ppc-xcsr", 0x1000);
@@ -655,9 +665,6 @@ static void ppc_prep_init(QEMUMachineInitArgs *args)
  /* XXX: need an option to load a NVRAM image */
  0,
  graphic_width, graphic_height, graphic_depth);
-
-/* Special port to get debug messages from Open-Firmware */
-register_ioport_write(0x0F00, 4, 1, PPC_debug_write, NULL);
 }
 
 static QEMUMachine prep_machine = {
-- 
1.7.3.4

[Qemu-devel] [RFC][PATCH 11/15] memory: Allow unaligned address_space_rw

2013-05-06 Thread Jan Kiszka

This will be needed for some corner cases with para-virtual I/O
ports.
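
As an illustration only, the width selection that this patch changes can be sketched in isolation. The helper name toy_access_size and its parameters are hypothetical and not part of QEMU; the unaligned flag stands in for mr->ops->impl.unaligned, and the sketch assumes the usual 4/2/1 byte fallback order shown in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the width in bytes of the next I/O access.
 *
 * remaining: bytes still to transfer (at least 1)
 * addr:      address within the region
 * unaligned: whether the region accepts unaligned accesses
 *            (stands in for mr->ops->impl.unaligned)
 */
unsigned toy_access_size(unsigned remaining, uint64_t addr, bool unaligned)
{
    /* Try a 32-bit access if enough bytes remain and alignment permits. */
    if (remaining >= 4 && ((addr & 3) == 0 || unaligned)) {
        return 4;
    }
    /* Otherwise try a 16-bit access under the same rule. */
    if (remaining >= 2 && ((addr & 1) == 0 || unaligned)) {
        return 2;
    }
    /* Fall back to a single byte. */
    return 1;
}
```

With unaligned set, a 4-byte transfer at an odd address stays one 32-bit access instead of being split into byte accesses.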

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 exec.c |   33 ++---
 1 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/exec.c b/exec.c
index 3ee1f3f..9c582b1 100644
--- a/exec.c
+++ b/exec.c
@@ -1833,38 +1833,41 @@ void address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
 uint8_t *ptr;
 uint32_t val;
 MemoryRegionSection *section;
+MemoryRegion *mr;
 
 while (len > 0) {
 l = ((addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
 if (l > len)
 l = len;
 section = address_space_lookup_region(as, addr);
+mr = section->mr;
 
 if (is_write) {
-if (!memory_region_is_ram(section->mr)) {
+if (!memory_region_is_ram(mr)) {
 hwaddr addr1;
 addr1 = memory_region_section_addr(section, addr);
 /* XXX: could force cpu_single_env to NULL to avoid
 potential bugs */
-if (l >= 4 && ((addr1 & 3) == 0)) {
+if (l >= 4 && ((addr1 & 3) == 0 || mr->ops->impl.unaligned)) {
 /* 32 bit write access */
 val = ldl_p(buf);
-io_mem_write(section->mr, addr1, val, 4);
+io_mem_write(mr, addr1, val, 4);
 l = 4;
-} else if (l >= 2 && ((addr1 & 1) == 0)) {
+} else if (l >= 2 &&
+   ((addr1 & 1) == 0 || mr->ops->impl.unaligned)) {
 /* 16 bit write access */
 val = lduw_p(buf);
-io_mem_write(section->mr, addr1, val, 2);
+io_mem_write(mr, addr1, val, 2);
 l = 2;
 } else {
 /* 8 bit write access */
 val = ldub_p(buf);
-io_mem_write(section->mr, addr1, val, 1);
+io_mem_write(mr, addr1, val, 1);
 l = 1;
 }
 } else if (!section->readonly) {
 ram_addr_t addr1;
-addr1 = memory_region_get_ram_addr(section->mr)
+addr1 = memory_region_get_ram_addr(mr)
 + memory_region_section_addr(section, addr);
 /* RAM case */
 ptr = qemu_get_ram_ptr(addr1);
@@ -1873,30 +1876,30 @@ void address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
 qemu_put_ram_ptr(ptr);
 }
 } else {
-if (!(memory_region_is_ram(section->mr) ||
-  memory_region_is_romd(section->mr))) {
+if (!(memory_region_is_ram(mr) || memory_region_is_romd(mr))) {
 hwaddr addr1;
 /* I/O case */
 addr1 = memory_region_section_addr(section, addr);
-if (l >= 4 && ((addr1 & 3) == 0)) {
+if (l >= 4 && ((addr1 & 3) == 0 || mr->ops->impl.unaligned)) {
 /* 32 bit read access */
-val = io_mem_read(section->mr, addr1, 4);
+val = io_mem_read(mr, addr1, 4);
 stl_p(buf, val);
 l = 4;
-} else if (l >= 2 && ((addr1 & 1) == 0)) {
+} else if (l >= 2 &&
+   ((addr1 & 1) == 0 || mr->ops->impl.unaligned)) {
 /* 16 bit read access */
-val = io_mem_read(section->mr, addr1, 2);
+val = io_mem_read(mr, addr1, 2);
 stw_p(buf, val);
 l = 2;
 } else {
 /* 8 bit read access */
-val = io_mem_read(section->mr, addr1, 1);
+val = io_mem_read(mr, addr1, 1);
 stb_p(buf, val);
 l = 1;
 }
 } else {
 /* RAM case */
-ptr = qemu_get_ram_ptr(section->mr->ram_addr
+ptr = qemu_get_ram_ptr(mr->ram_addr
 + memory_region_section_addr(section,
 addr));
 memcpy(buf, ptr, l);
-- 
1.7.3.4

[Qemu-devel] [RFC][PATCH 09/15] memory: Introduce address_space_lookup_region

2013-05-06 Thread Jan Kiszka

So far, this new service only replaces phys_page_find as the public API.
In a follow-up step, it will return the effective memory region for the
specified address, i.e. after resolving what are currently sub-pages.
Moreover, it will eventually encapsulate locking and reference counting
once we introduce BQL-free dispatching.
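
A minimal sketch of the wrapper idea, using toy stand-in types rather than the real QEMU structures: callers pass a plain address and the page-index shift is hidden inside the lookup. All Toy*/toy_* names, the 12-bit page size and the flat per-page array are assumptions made for illustration; the actual phys_page_find walks a multi-level table.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; the real QEMU types are far richer. */
#define TOY_PAGE_BITS 12
#define TOY_NUM_PAGES 16

typedef uint64_t toy_hwaddr;

typedef struct ToySection {
    const char *name;
} ToySection;

typedef struct ToyAddressSpace {
    /* One entry per page; NULL means nothing is mapped there. */
    ToySection *pages[TOY_NUM_PAGES];
} ToyAddressSpace;

/* Page-indexed lookup, the role phys_page_find plays in the patch. */
static ToySection *toy_page_find(ToyAddressSpace *as, toy_hwaddr index)
{
    return index < TOY_NUM_PAGES ? as->pages[index] : NULL;
}

/* Address-based wrapper: callers no longer compute the page index. */
ToySection *toy_lookup_region(ToyAddressSpace *as, toy_hwaddr addr)
{
    return toy_page_find(as, addr >> TOY_PAGE_BITS);
}
```

Keeping the index computation in one place is what would later allow the lookup to resolve sub-pages or take references without touching every caller.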

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 cputlb.c  |2 +-
 exec.c|   46 +-
 include/exec/cputlb.h |2 --
 include/exec/memory.h |9 +
 translate-all.c   |3 +--
 5 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index aba7e44..e2c95c1 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -254,7 +254,7 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
 if (size != TARGET_PAGE_SIZE) {
 tlb_add_large_page(env, vaddr, size);
 }
-section = phys_page_find(address_space_memory.dispatch, paddr >> TARGET_PAGE_BITS);
+section = address_space_lookup_region(&address_space_memory, paddr);
 #if defined(DEBUG_TLB)
 printf("tlb_set_page: vaddr=" TARGET_FMT_lx " paddr=0x" TARGET_FMT_plx
 " prot=%x idx=%d pd=0x%08lx\n",
diff --git a/exec.c b/exec.c
index 19725db..53c2778 100644
--- a/exec.c
+++ b/exec.c
@@ -182,7 +182,8 @@ static void phys_page_set(AddressSpaceDispatch *d,
 phys_page_set_level(&d->phys_map, &index, &nb, leaf, P_L2_LEVELS - 1);
 }
 
-MemoryRegionSection *phys_page_find(AddressSpaceDispatch *d, hwaddr index)
+static MemoryRegionSection *phys_page_find(AddressSpaceDispatch *d,
+   hwaddr index)
 {
 PhysPageEntry lp = d->phys_map;
 PhysPageEntry *p;
@@ -1894,19 +1895,16 @@ static void invalidate_and_set_dirty(hwaddr addr,
 void address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
 int len, bool is_write)
 {
-AddressSpaceDispatch *d = as->dispatch;
 int l;
 uint8_t *ptr;
 uint32_t val;
-hwaddr page;
 MemoryRegionSection *section;
 
 while (len > 0) {
-page = addr & TARGET_PAGE_MASK;
-l = (page + TARGET_PAGE_SIZE) - addr;
+l = ((addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
 if (l > len)
 l = len;
-section = phys_page_find(d, page >> TARGET_PAGE_BITS);
+section = address_space_lookup_region(as, addr);
 
 if (is_write) {
 if (!memory_region_is_ram(section->mr)) {
@@ -2006,18 +2004,15 @@ void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
 void cpu_physical_memory_write_rom(hwaddr addr,
 const uint8_t *buf, int len)
 {
-AddressSpaceDispatch *d = address_space_memory.dispatch;
 int l;
 uint8_t *ptr;
-hwaddr page;
 MemoryRegionSection *section;
 
 while (len > 0) {
-page = addr & TARGET_PAGE_MASK;
-l = (page + TARGET_PAGE_SIZE) - addr;
+l = ((addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
 if (l > len)
 l = len;
-section = phys_page_find(d, page >> TARGET_PAGE_BITS);
+section = address_space_lookup_region(&address_space_memory, addr);
 
 if (!(memory_region_is_ram(section->mr) ||
 memory_region_is_romd(section->mr))) {
@@ -2096,22 +2091,19 @@ void *address_space_map(AddressSpace *as,
 hwaddr *plen,
 bool is_write)
 {
-AddressSpaceDispatch *d = as->dispatch;
 hwaddr len = *plen;
 hwaddr todo = 0;
 int l;
-hwaddr page;
 MemoryRegionSection *section;
 ram_addr_t raddr = RAM_ADDR_MAX;
 ram_addr_t rlen;
 void *ret;
 
 while (len > 0) {
-page = addr & TARGET_PAGE_MASK;
-l = (page + TARGET_PAGE_SIZE) - addr;
+l = ((addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
 if (l > len)
 l = len;
-section = phys_page_find(d, page >> TARGET_PAGE_BITS);
+section = address_space_lookup_region(as, addr);
 
 if (!(memory_region_is_ram(section->mr) && !section->readonly)) {
 if (todo || bounce.buffer) {
@@ -2188,6 +2180,11 @@ void cpu_physical_memory_unmap(void *buffer, hwaddr len,
 return address_space_unmap(&address_space_memory, buffer, len, is_write, access_len);
 }
 
+MemoryRegionSection *address_space_lookup_region(AddressSpace *as, hwaddr addr)
+{
+return phys_page_find(as->dispatch, addr >> TARGET_PAGE_BITS);
+}
+
 /* warning: addr must be aligned */
 static inline uint32_t ldl_phys_internal(hwaddr addr,
 enum device_endian endian)
@@ -2196,7 +2193,7 @@ static inline uint32_t ldl_phys_internal(hwaddr addr,
 uint32_t val;
 MemoryRegionSection *section;
 
-section = phys_page_find(address_space_memory.dispatch, addr >> TARGET_PAGE_BITS);
+section = address_space_lookup_region(&address_space_memory, addr);
 
 if (!(memory_region_is_ram(section->mr) ||
 memory_region_is_romd(section->mr))) {
@@ -2255,7

[Qemu-devel] [RFC][PATCH 13/15] ioport: Switch dispatching to memory core layer

2013-05-06 Thread Jan Kiszka

The current ioport dispatcher is a complex beast, mostly due to the
need to deal with old portio interface users. But we can overcome it
without converting all portio users by embedding the required base
address of a MemoryRegionPortio access into that data structure. That
removes the need to have the additional MemoryRegionIORange structure
in the loop on every access.

To handle old portio memory ops, we simply hook
memory_region_iorange_read/write into the normal dispatch handler,
calling them when the standard region handler is not set, but old_portio is.

We can drop the additional aliasing of ioport regions and can also drop
the special address space listener. cpu_in/out now simply call into
address_space_read/write.
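
The fallback described above can be sketched with toy types; this is not the actual memory.c code. All Toy*/toy_* names are hypothetical. The base field mirrors the private field this patch adds to MemoryRegionPortio, and the sketch assumes old-style handlers expect an absolute port number, reconstructed here as base plus the region-relative address.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*ToyPortRead)(void *opaque, uint32_t port);

/* Old-style table entry; 'base' mirrors the private field the patch adds. */
typedef struct ToyPortio {
    uint32_t offset;   /* start of the range within the region */
    uint32_t len;      /* number of ports covered; 0 terminates the list */
    unsigned size;     /* access width in bytes */
    ToyPortRead read;
    uint32_t base;     /* absolute port number where the region starts */
} ToyPortio;

typedef struct ToyRegion {
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned size); /* new style */
    const ToyPortio *old_portio;                                   /* old style */
    void *opaque;
} ToyRegion;

/* Find the old-style entry covering this offset and access width. */
static const ToyPortio *toy_find_portio(const ToyRegion *mr, uint64_t offset,
                                        unsigned size)
{
    const ToyPortio *p;

    for (p = mr->old_portio; p->len != 0; p++) {
        if (offset >= p->offset && offset < (uint64_t)p->offset + p->len &&
            size == p->size) {
            return p;
        }
    }
    return NULL;
}

/* Normal dispatch first; fall back to old_portio when no handler is set. */
uint64_t toy_dispatch_read(const ToyRegion *mr, uint64_t addr, unsigned size)
{
    if (mr->read) {
        return mr->read(mr->opaque, addr, size);
    }
    if (mr->old_portio) {
        const ToyPortio *p = toy_find_portio(mr, addr, size);
        if (p && p->read) {
            /* Old handlers are given the absolute port number. */
            return p->read(mr->opaque, p->base + (uint32_t)addr);
        }
    }
    return UINT64_MAX; /* unassigned access reads as all ones */
}

/* Example old-style handler: returns the port number it was called with. */
uint32_t toy_echo_port(void *opaque, uint32_t port)
{
    (void)opaque;
    return port;
}
```

Because the base travels with each table entry, no per-access wrapper object such as MemoryRegionIORange is needed to recover the absolute port.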

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 exec.c |   25 --
 include/exec/ioport.h  |1 -
 include/exec/memory-internal.h |2 -
 include/exec/memory.h  |3 +-
 ioport.c   |   49 +++
 memory.c   |  101 ++-
 6 files changed, 66 insertions(+), 115 deletions(-)

diff --git a/exec.c b/exec.c
index 9c582b1..992e16a 100644
--- a/exec.c
+++ b/exec.c
@@ -1679,24 +1679,6 @@ static void core_log_global_stop(MemoryListener *listener)
 cpu_physical_memory_set_dirty_tracking(0);
 }
 
-static void io_region_add(MemoryListener *listener,
-  MemoryRegionSection *section)
-{
-MemoryRegionIORange *mrio = g_new(MemoryRegionIORange, 1);
-
-mrio->mr = section->mr;
-mrio->offset = section->offset_within_region;
-iorange_init(&mrio->iorange, &memory_region_iorange_ops,
- section->offset_within_address_space, section->size);
-ioport_register(&mrio->iorange);
-}
-
-static void io_region_del(MemoryListener *listener,
-  MemoryRegionSection *section)
-{
-isa_unassign_ioport(section->offset_within_address_space, section->size);
-}
-
 static MemoryListener core_memory_listener = {
 .begin = core_begin,
 .log_global_start = core_log_global_start,
@@ -1704,12 +1686,6 @@ static MemoryListener core_memory_listener = {
 .priority = 1,
 };
 
-static MemoryListener io_memory_listener = {
-.region_add = io_region_add,
-.region_del = io_region_del,
-.priority = 0,
-};
-
 static MemoryListener tcg_memory_listener = {
 .commit = tcg_commit,
 };
@@ -1752,7 +1728,6 @@ static void memory_map_init(void)
 address_space_io.name = "I/O";
 
 memory_listener_register(&core_memory_listener, &address_space_memory);
-memory_listener_register(&io_memory_listener, &address_space_io);
 memory_listener_register(&tcg_memory_listener, &address_space_memory);
 
 dma_context_init(&dma_context_memory, &address_space_memory,
diff --git a/include/exec/ioport.h b/include/exec/ioport.h
index eb99ffe..b476857 100644
--- a/include/exec/ioport.h
+++ b/include/exec/ioport.h
@@ -56,7 +56,6 @@ typedef struct PortioList {
 struct MemoryRegion *address_space;
 unsigned nr;
 struct MemoryRegion **regions;
-struct MemoryRegion **aliases;
 void *opaque;
 const char *name;
 } PortioList;
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 1b156fd..3134990 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -128,8 +128,6 @@ static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
  int dirty_flags);
 
-extern const IORangeOps memory_region_iorange_ops;
-
 #endif
 
 #endif
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 0087555..5c9a958 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -151,6 +151,7 @@ struct MemoryRegionPortio {
 unsigned size;
 IOPortReadFunc *read;
 IOPortWriteFunc *write;
+uint32_t base; /* private field */
 };
 
 #define PORTIO_END_OF_LIST() { }
@@ -810,7 +811,7 @@ void mtree_info(fprintf_function mon_printf, void *f);
  * address_space_init: initializes an address space
  *
  * @as: an uninitialized #AddressSpace
- * @root: a #MemoryRegion that routes addesses for the address space
+ * @root: a #MemoryRegion that routes addresses for the address space
  */
 void address_space_init(AddressSpace *as, MemoryRegion *root);
 
diff --git a/ioport.c b/ioport.c
index 56470c5..9f15567 100644
--- a/ioport.c
+++ b/ioport.c
@@ -28,6 +28,7 @@
 #include "exec/ioport.h"
 #include "trace.h"
 #include "exec/memory.h"
+#include "exec/address-spaces.h"
 
 /***/
 /* IO Port */
@@ -279,27 +280,34 @@ void cpu_outb(pio_addr_t addr, uint8_t val)
 {
 LOG_IOPORT("outb: %04"FMT_pioaddr" %02"PRIx8"\n", addr, val);
 trace_cpu_out(addr, val);
-ioport_write(0, addr, val);
+address_space_write(&address_space_io, addr, &val, 1);
 }
 
 void cpu_outw(pio_addr_t addr, uint16_t val)
 {
+uint8_t

Re: [Qemu-devel] [RFC][PATCH 09/15] memory: Introduce address_space_lookup_region

2013-05-06 Thread Paolo Bonzini

Il 06/05/2013 16:26, Jan Kiszka ha scritto:
 This new service so far only replaces phys_page_find as public API. In a
 follow-up step, it will return the effective memory region for the
 specified address, i.e. after resolving what are currently sub-pages.
 Moreover, it will also once encapsulate locking and reference counting
 when we introduce BQL-free dispatching.

In my IOMMU rebase I have a similar function:

/* address_space_translate: translate an address range into an address space
 * into a MemoryRegionSection and an address range into that section.
 *
 * @as: #AddressSpace to be accessed
 * @addr: address within that address space
 * @xlat: pointer to address within the returned memory region section's
 * #MemoryRegion.
 * @len: pointer to length
 * @is_write: indicates the transfer direction
 */
MemoryRegionSection *address_space_translate(AddressSpace *as, hwaddr addr,
 hwaddr *xlat, hwaddr *len,
 bool is_write);

It wraps (actually, replaces) both phys_page_find and
memory_region_section_addr.

Paolo

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  cputlb.c  |2 +-
  exec.c|   46 +-
  include/exec/cputlb.h |2 --
  include/exec/memory.h |9 +
  translate-all.c   |3 +--
  5 files changed, 32 insertions(+), 30 deletions(-)
 
 diff --git a/cputlb.c b/cputlb.c
 index aba7e44..e2c95c1 100644
 --- a/cputlb.c
 +++ b/cputlb.c
 @@ -254,7 +254,7 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
  if (size != TARGET_PAGE_SIZE) {
  tlb_add_large_page(env, vaddr, size);
  }
 -section = phys_page_find(address_space_memory.dispatch, paddr  
 TARGET_PAGE_BITS);
 +section = address_space_lookup_region(address_space_memory, paddr);
  #if defined(DEBUG_TLB)
  printf(tlb_set_page: vaddr= TARGET_FMT_lx  paddr=0x TARGET_FMT_plx
  prot=%x idx=%d pd=0x%08lx\n,
 diff --git a/exec.c b/exec.c
 index 19725db..53c2778 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -182,7 +182,8 @@ static void phys_page_set(AddressSpaceDispatch *d,
  phys_page_set_level(d-phys_map, index, nb, leaf, P_L2_LEVELS - 1);
  }
  
 -MemoryRegionSection *phys_page_find(AddressSpaceDispatch *d, hwaddr index)
 +static MemoryRegionSection *phys_page_find(AddressSpaceDispatch *d,
 +   hwaddr index)
  {
  PhysPageEntry lp = d-phys_map;
  PhysPageEntry *p;
 @@ -1894,19 +1895,16 @@ static void invalidate_and_set_dirty(hwaddr addr,
  void address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
int len, bool is_write)
  {
 -AddressSpaceDispatch *d = as-dispatch;
  int l;
  uint8_t *ptr;
  uint32_t val;
 -hwaddr page;
  MemoryRegionSection *section;
  
  while (len  0) {
 -page = addr  TARGET_PAGE_MASK;
 -l = (page + TARGET_PAGE_SIZE) - addr;
 +l = ((addr  TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
  if (l  len)
  l = len;
 -section = phys_page_find(d, page  TARGET_PAGE_BITS);
 +section = address_space_lookup_region(as, addr);
  
  if (is_write) {
  if (!memory_region_is_ram(section-mr)) {
 @@ -2006,18 +2004,15 @@ void cpu_physical_memory_rw(hwaddr addr, uint8_t *buf,
  void cpu_physical_memory_write_rom(hwaddr addr,
 const uint8_t *buf, int len)
  {
 -AddressSpaceDispatch *d = address_space_memory.dispatch;
  int l;
  uint8_t *ptr;
 -hwaddr page;
  MemoryRegionSection *section;
  
  while (len  0) {
 -page = addr  TARGET_PAGE_MASK;
 -l = (page + TARGET_PAGE_SIZE) - addr;
 +l = ((addr  TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
  if (l  len)
  l = len;
 -section = phys_page_find(d, page  TARGET_PAGE_BITS);
 +section = address_space_lookup_region(address_space_memory, addr);
  
  if (!(memory_region_is_ram(section-mr) ||
memory_region_is_romd(section-mr))) {
 @@ -2096,22 +2091,19 @@ void *address_space_map(AddressSpace *as,
  hwaddr *plen,
  bool is_write)
  {
 -AddressSpaceDispatch *d = as-dispatch;
  hwaddr len = *plen;
  hwaddr todo = 0;
  int l;
 -hwaddr page;
  MemoryRegionSection *section;
  ram_addr_t raddr = RAM_ADDR_MAX;
  ram_addr_t rlen;
  void *ret;
  
  while (len  0) {
 -page = addr  TARGET_PAGE_MASK;
 -l = (page + TARGET_PAGE_SIZE) - addr;
 +l = ((addr  TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
  if (l  len)
  l = len;
 -section = phys_page_find(d, page  TARGET_PAGE_BITS);
 +section = address_space_lookup_region(as, addr);
  
  if (!(memory_region_is_ram(section-mr)  !section-readonly)) {
  if (todo ||

Re: [Qemu-devel] [RFC][PATCH 12/15] vmware-vga: Accept unaligned I/O accesses

2013-05-06 Thread Paolo Bonzini

Il 06/05/2013 16:26, Jan Kiszka ha scritto:
 Before switching to the memory core dispatcher, we need to make sure
 that this pv-device will continue to receive unaligned portio accesses.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/display/vmware_vga.c |4 
  1 files changed, 4 insertions(+), 0 deletions(-)
 
 diff --git a/hw/display/vmware_vga.c b/hw/display/vmware_vga.c
 index fd3569d..ec41681 100644
 --- a/hw/display/vmware_vga.c
 +++ b/hw/display/vmware_vga.c
 @@ -1241,6 +1241,10 @@ static const MemoryRegionOps vmsvga_io_ops = {
  .valid = {
  .min_access_size = 4,
  .max_access_size = 4,
 +.unaligned = true,
 +},
 +.impl = {
 +.unaligned = true,
  },
  };
  
 

The Xen platform device needs the same.

Paolo

Re: [Qemu-devel] [RFC][PATCH 05/15] prep: replace register_ioport*

2013-05-06 Thread Andreas Färber

Am 06.05.2013 16:26, schrieb Jan Kiszka:
 Convert over to memory regions to obsolete register_ioport*.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/ppc/prep.c |   23 +++
  1 files changed, 15 insertions(+), 8 deletions(-)

As a heads-up, for PReP we've been preparing to move this System I/O to
a qdev/QOM device, so this would likely need rebasing at some point.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [RFC][PATCH 12/15] vmware-vga: Accept unaligned I/O accesses

2013-05-06 Thread Jan Kiszka

On 2013-05-06 16:40, Paolo Bonzini wrote:
 Il 06/05/2013 16:26, Jan Kiszka ha scritto:
 Before switching to the memory core dispatcher, we need to make sure
 that this pv-device will continue to receive unaligned portio accesses.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/display/vmware_vga.c |4 
  1 files changed, 4 insertions(+), 0 deletions(-)

 diff --git a/hw/display/vmware_vga.c b/hw/display/vmware_vga.c
 index fd3569d..ec41681 100644
 --- a/hw/display/vmware_vga.c
 +++ b/hw/display/vmware_vga.c
 @@ -1241,6 +1241,10 @@ static const MemoryRegionOps vmsvga_io_ops = {
  .valid = {
  .min_access_size = 4,
  .max_access_size = 4,
 +.unaligned = true,
 +},
 +.impl = {
 +.unaligned = true,
  },
  };
  

 
 The Xen platform device needs the same.

OK, good to know. In theory, this should only affect weird PV, so I hope
that is all.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [RFC][PATCH 00/15] Refactor portio dispatching

2013-05-06 Thread Andreas Färber

Am 06.05.2013 16:26, schrieb Jan Kiszka:
 This series converts the remaining users of register_ioport* to portio
 lists,

Why does it need to be lists? Is there anything wrong with using
isa_register_ioport() as done in Hervé's previous ioport cleanup series?
(Hope you were aware of those ppc-centered discussions?)

Andreas

 simplifies the handling of subpages and adds support for unaligned
 memory region accesses. Then it replaces the current portio dispatcher
 with the existing one for MMIO and removes several lines of code. This
 also allows to build BQL-free portio on top once we enhance the memory
 layer accordingly.
 
 Seems to work fine so far but surely requires thorough review. And I
 would welcome early comments on the direction.
 
 Jan
 
 
 CC: malc av1...@comtv.ru
 
 Jan Kiszka (15):
   adlib: replace register_ioport*
   applesmc: replace register_ioport*
   wdt_ib700: replace register_ioport*
   i82374: replace register_ioport*
   prep: replace register_ioport*
   vt82c686: replace register_ioport*
   Privatize register_ioport_read/write
   isa: implement isa_is_ioport_assigned via memory_region_find
   memory: Introduce address_space_lookup_region
   memory: Rework sub-page handling
   memory: Allow unaligned address_space_rw
   vmware-vga: Accept unaligned I/O accesses
   ioport: Switch dispatching to memory core layer
   ioport: Remove unused old dispatching services
   ioport: Move IOPortRead/WriteFunc typedefs to memory.h
 
  cputlb.c   |2 +-
  exec.c |  273 +
  hw/acpi/piix4.c|6 +-
  hw/audio/adlib.c   |   20 ++-
  hw/display/vmware_vga.c|4 +
  hw/dma/i82374.c|   17 ++-
  hw/isa/isa-bus.c   |   11 ++
  hw/isa/lpc_ich9.c  |8 +-
  hw/isa/vt82c686.c  |   40 --
  hw/misc/applesmc.c |   48 +--
  hw/ppc/prep.c  |   23 ++-
  hw/watchdog/wdt_ib700.c|   12 ++-
  include/exec/cputlb.h  |2 -
  include/exec/ioport.h  |   19 +---
  include/exec/iorange.h |   31 
  include/exec/memory-internal.h |2 -
  include/exec/memory.h  |   26 ++--
  include/hw/isa/isa.h   |2 +
  ioport.c   |  294 
 
  memory.c   |  102 +-
  translate-all.c|3 +-
  21 files changed, 311 insertions(+), 634 deletions(-)
  delete mode 100644 include/exec/iorange.h
 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [RFC][PATCH 09/15] memory: Introduce address_space_lookup_region

2013-05-06 Thread Jan Kiszka

On 2013-05-06 16:39, Paolo Bonzini wrote:
 Il 06/05/2013 16:26, Jan Kiszka ha scritto:
 This new service so far only replaces phys_page_find as public API. In a
 follow-up step, it will return the effective memory region for the
 specified address, i.e. after resolving what are currently sub-pages.
 Moreover, it will also once encapsulate locking and reference counting
 when we introduce BQL-free dispatching.
 
 In my IOMMU rebase I have a similar function:
 
 /* address_space_translate: translate an address range into an address space
  * into a MemoryRegionSection and an address range into that section.
  *
  * @as: #AddressSpace to be accessed
  * @addr: address within that address space
  * @xlat: pointer to address within the returned memory region section's
  * #MemoryRegion.
  * @len: pointer to length
  * @is_write: indicates the transfer direction
  */
 MemoryRegionSection *address_space_translate(AddressSpace *as, hwaddr addr,
  hwaddr *xlat, hwaddr *len,
  bool is_write);
 
 It wraps (actually, replaces) both phys_page_find and
 memory_region_section_addr.

Good, looks like we are heading in similar directions. What is the
purpose of len? When does is_write matter?

In a later step, this should become something like
address_space_get_region_ref (to be paired with memory_region_unref,
once done). So this one also takes care of incrementing the reference
counter or acquiring the BQL, as necessary. Currently, it asks the
caller to specify if the BQL is already held, but that will change.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [RFC][PATCH 00/15] Refactor portio dispatching

2013-05-06 Thread Jan Kiszka

On 2013-05-06 16:50, Andreas Färber wrote:
 Am 06.05.2013 16:26, schrieb Jan Kiszka:
 This series converts the remaining users of register_ioport* to portio
 lists,
 
 Why does it need to be lists? Is there anything wrong with using
 isa_register_ioport() as done in Hervé's previous ioport cleanup series?
 (Hope you were aware of those ppc-centered discussions?)

Lists are the straightforward way to convert, specifically for I/O port
assignments that are scattered. On some lazy day, someone can convert
them into multiple memory regions. That's beyond the scope of this
series and more risky.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [RFC][PATCH 08/15] isa: implement isa_is_ioport_assigned via memory_region_find

2013-05-06 Thread Andreas Färber

Am 06.05.2013 16:26, schrieb Jan Kiszka:
 Move isa_is_ioport_assigned to the ISA core and implement it via a
 memory region lookup. As all IO ports are now directly or indirectly
 registered via the memory API, this becomes possible and will finally
 allow us to drop the ioport tables.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/acpi/piix4.c   |6 +++---
  hw/isa/isa-bus.c  |   11 +++
  hw/isa/lpc_ich9.c |8 
  include/exec/ioport.h |1 -
  include/hw/isa/isa.h  |2 ++
  ioport.c  |7 ---
  6 files changed, 20 insertions(+), 15 deletions(-)
 
 diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
 index c4af1cc..5955217 100644
 --- a/hw/acpi/piix4.c
 +++ b/hw/acpi/piix4.c
 @@ -386,10 +386,10 @@ static void piix4_pm_machine_ready(Notifier *n, void 
 *opaque)
  uint8_t *pci_conf;
  
  pci_conf = s-dev.config;
 -pci_conf[0x5f] = (isa_is_ioport_assigned(0x378) ? 0x80 : 0) | 0x10;
 +pci_conf[0x5f] = (isa_is_ioport_assigned(NULL, 0x378) ? 0x80 : 0) | 0x10;
  pci_conf[0x63] = 0x60;
 -pci_conf[0x67] = (isa_is_ioport_assigned(0x3f8) ? 0x08 : 0) |
 - (isa_is_ioport_assigned(0x2f8) ? 0x90 : 0);
 +pci_conf[0x67] = (isa_is_ioport_assigned(NULL, 0x3f8) ? 0x08 : 0) |
 +(isa_is_ioport_assigned(NULL, 0x2f8) ? 0x90 : 0);
  
  }
  

Is there really no way to access the ISABus from this device? Would be
nice to get rid of global ISA variables and not introduce more
dependencies. :)

Andreas

[...]
 diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
 index 667e882..641227a 100644
 --- a/hw/isa/lpc_ich9.c
 +++ b/hw/isa/lpc_ich9.c
 @@ -480,19 +480,19 @@ static void ich9_lpc_machine_ready(Notifier *n, void 
 *opaque)
  uint8_t *pci_conf;
  
  pci_conf = s-d.config;
 -if (isa_is_ioport_assigned(0x3f8)) {
 +if (isa_is_ioport_assigned(s-isa_bus, 0x3f8)) {
  /* com1 */
  pci_conf[0x82] |= 0x01;
  }
 -if (isa_is_ioport_assigned(0x2f8)) {
 +if (isa_is_ioport_assigned(s-isa_bus, 0x2f8)) {
  /* com2 */
  pci_conf[0x82] |= 0x02;
  }
 -if (isa_is_ioport_assigned(0x378)) {
 +if (isa_is_ioport_assigned(s-isa_bus, 0x378)) {
  /* lpt */
  pci_conf[0x82] |= 0x04;
  }
 -if (isa_is_ioport_assigned(0x3f0)) {
 +if (isa_is_ioport_assigned(s-isa_bus, 0x3f0)) {
  /* floppy */
  pci_conf[0x82] |= 0x08;
  }
[snip]

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [RFC][PATCH 09/15] memory: Introduce address_space_lookup_region

2013-05-06 Thread Paolo Bonzini

Il 06/05/2013 16:51, Jan Kiszka ha scritto:
 On 2013-05-06 16:39, Paolo Bonzini wrote:
 Il 06/05/2013 16:26, Jan Kiszka ha scritto:
 This new service so far only replaces phys_page_find as public API. In a
 follow-up step, it will return the effective memory region for the
 specified address, i.e. after resolving what are currently sub-pages.
 Moreover, it will also once encapsulate locking and reference counting
 when we introduce BQL-free dispatching.

 In my IOMMU rebase I have a similar function:

 /* address_space_translate: translate an address range into an address space
  * into a MemoryRegionSection and an address range into that section.
  *
  * @as: #AddressSpace to be accessed
  * @addr: address within that address space
  * @xlat: pointer to address within the returned memory region section's
  * #MemoryRegion.
  * @len: pointer to length
  * @is_write: indicates the transfer direction
  */
 MemoryRegionSection *address_space_translate(AddressSpace *as, hwaddr addr,
  hwaddr *xlat, hwaddr *len,
  bool is_write);

 It wraps (actually, replaces) both phys_page_find and
 memory_region_section_addr.
 
 Good, looks like we are heading in similar directions. What is the
 purpose of len? When does is_write matter?

Both matter when adding the IOMMU.  is_write is needed to check for
permissions, and len because the translation will be valid for one page
only (not for the full size of the MemoryRegion).
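
To make the role of the two parameters concrete, here is a toy sketch, not the address_space_translate from the IOMMU rebase. The table layout, the 4 KiB page size and all Toy*/toy_* names are assumptions for illustration. It shows is_write selecting which permission is checked, and *len being clamped so the result never extends past the translated page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_NUM_PAGES 4u

/* One toy IOMMU mapping per input page. */
typedef struct ToyIommuEntry {
    uint64_t target_page;  /* base address of the translated page */
    bool readable;
    bool writable;
} ToyIommuEntry;

/*
 * Translate addr through the table. On success, store the translated
 * address in *xlat and shrink *len so the range stays within one page.
 * Returns false if the page is out of range or the access is not permitted.
 */
bool toy_translate(const ToyIommuEntry *table, uint64_t addr,
                   uint64_t *xlat, uint64_t *len, bool is_write)
{
    uint64_t page = addr / TOY_PAGE_SIZE;
    uint64_t offset = addr % TOY_PAGE_SIZE;
    uint64_t room = TOY_PAGE_SIZE - offset;
    const ToyIommuEntry *e;

    if (page >= TOY_NUM_PAGES) {
        return false;
    }
    e = &table[page];

    /* is_write decides which permission bit applies. */
    if (is_write ? !e->writable : !e->readable) {
        return false;
    }

    *xlat = e->target_page + offset;

    /* The translation is only valid up to the end of this page. */
    if (*len > room) {
        *len = room;
    }
    return true;
}
```

A caller would loop, advancing by the returned *len, so that each page gets its own translation and permission check.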

I can implement address_space_translate on top of
address_space_lookup_region and include your next patch too, so we're
fine here.

Paolo

 In a later step, this should become something like
 address_space_get_region_ref (to be paired with memory_region_unref,
 once done). So this one also takes care of incrementing the reference
 counter or acquiring the BQL, as necessary. Currently, it asks the
 caller to specify if the BQL is already held, but that will change.
 
 Jan

Re: [Qemu-devel] [RFC][PATCH 11/15] memory: Allow unaligned address_space_rw

2013-05-06 Thread Paolo Bonzini

Il 06/05/2013 16:26, Jan Kiszka ha scritto:
 This will be needed for some corner cases with para-virtual the I/O
 ports.
 
 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  exec.c |   33 ++---
  1 files changed, 18 insertions(+), 15 deletions(-)
 
 diff --git a/exec.c b/exec.c
 index 3ee1f3f..9c582b1 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -1833,38 +1833,41 @@ void address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
  uint8_t *ptr;
  uint32_t val;
  MemoryRegionSection *section;
 +MemoryRegion *mr;
  
  while (len > 0) {
  l = ((addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
  if (l > len)
  l = len;
  section = address_space_lookup_region(as, addr);
 +mr = section->mr;
  
  if (is_write) {
 -if (!memory_region_is_ram(section->mr)) {
 +if (!memory_region_is_ram(mr)) {
  hwaddr addr1;
  addr1 = memory_region_section_addr(section, addr);
  /* XXX: could force cpu_single_env to NULL to avoid
 potential bugs */
 -if (l >= 4 && ((addr1 & 3) == 0)) {
 +if (l >= 4 && ((addr1 & 3) == 0 || mr->ops->impl.unaligned)) {

Does the length matter at all if unaligned accesses are allowed?  I
think it shouldn't...

Paolo

  /* 32 bit write access */
  val = ldl_p(buf);
 -io_mem_write(section->mr, addr1, val, 4);
 +io_mem_write(mr, addr1, val, 4);
  l = 4;
 -} else if (l >= 2 && ((addr1 & 1) == 0)) {
 +} else if (l >= 2 &&
 +   ((addr1 & 1) == 0 || mr->ops->impl.unaligned)) {
  /* 16 bit write access */
  val = lduw_p(buf);
 -io_mem_write(section->mr, addr1, val, 2);
 +io_mem_write(mr, addr1, val, 2);
  l = 2;
  } else {
  /* 8 bit write access */
  val = ldub_p(buf);
 -io_mem_write(section->mr, addr1, val, 1);
 +io_mem_write(mr, addr1, val, 1);
  l = 1;
  }
  } else if (!section->readonly) {
  ram_addr_t addr1;
 -addr1 = memory_region_get_ram_addr(section->mr)
 +addr1 = memory_region_get_ram_addr(mr)
  + memory_region_section_addr(section, addr);
  /* RAM case */
  ptr = qemu_get_ram_ptr(addr1);
 @@ -1873,30 +1876,30 @@ void address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
  qemu_put_ram_ptr(ptr);
  }
  } else {
 -if (!(memory_region_is_ram(section->mr) ||
 -  memory_region_is_romd(section->mr))) {
 +if (!(memory_region_is_ram(mr) || memory_region_is_romd(mr))) {
  hwaddr addr1;
  /* I/O case */
  addr1 = memory_region_section_addr(section, addr);
 -if (l >= 4 && ((addr1 & 3) == 0)) {
 +if (l >= 4 && ((addr1 & 3) == 0 || mr->ops->impl.unaligned)) {
  /* 32 bit read access */
 -val = io_mem_read(section->mr, addr1, 4);
 +val = io_mem_read(mr, addr1, 4);
  stl_p(buf, val);
  l = 4;
 -} else if (l >= 2 && ((addr1 & 1) == 0)) {
 +} else if (l >= 2 &&
 +   ((addr1 & 1) == 0 || mr->ops->impl.unaligned)) {
  /* 16 bit read access */
 -val = io_mem_read(section->mr, addr1, 2);
 +val = io_mem_read(mr, addr1, 2);
  stw_p(buf, val);
  l = 2;
  } else {
  /* 8 bit read access */
 -val = io_mem_read(section->mr, addr1, 1);
 +val = io_mem_read(mr, addr1, 1);
  stb_p(buf, val);
  l = 1;
  }
  } else {
  /* RAM case */
 -ptr = qemu_get_ram_ptr(section->mr->ram_addr
 +ptr = qemu_get_ram_ptr(mr->ram_addr
 + memory_region_section_addr(section,
  addr));
  memcpy(buf, ptr, l);

Re: [Qemu-devel] [PATCH] Add 'maxqdepth' as an option to tty character devices.

2013-05-06 Thread Eric Blake

On 05/06/2013 07:43 AM, John Baboval wrote:
 From: John V. Baboval john.babo...@virtualcomputer.com
 
 This parameter will cause writes to tty backed chardevs to return
 -EAGAIN if the backing tty has buffered more than the specified
 number of characters. When data is sent, the TIOCOUTQ ioctl is invoked
 to determine the current TTY output buffer depth.
 

Reviewing just the interface portion of the patch:

 +++ b/qapi-schema.json
 @@ -3182,11 +3182,14 @@
  #
  # @device: The name of the special file for the device,
  #  i.e. /dev/ttyS0 on Unix or COM1: on Windows
 +# @maxqdepth: The maximum depth of the underlying tty
 +  output queue (Unix) 

Trailing whitespace.  Run your patch through scripts/checkpatch.pl.

Since you are adding a new member, you should use a (since 1.6)
comment on this line.  Also, most interfaces tend to use a blank line
between member documentation.

  # @type: What kind of device this is.

Hmm - we have a pre-existing documentation bug - this line probably
should have been deleted during commit d36b2b90.

  #
  # Since: 1.4
  ##
 -{ 'type': 'ChardevHostdev', 'data': { 'device' : 'str' } }
 +{ 'type': 'ChardevHostdev', 'data': { 'device': 'str',
 +  'maxqdepth' : 'int' } }

Ouch - this says that maxqdepth is mandatory.  But that is a
backwards-incompatible change with apps that target the 'chardev-add'
QMP command of qemu 1.4.  You MUST make it optional, since older apps
will not be providing it.
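
For illustration only, here is a sketch of how an optional member could look on the C side.  It assumes the usual convention of QEMU's generated QAPI types, where an optional member is paired with a has_ flag; ToyChardevHostdev, toy_tty_queued and toy_check_qdepth are hypothetical names, not code from the patch, and treating a missing maxqdepth as "no limit" is my assumption.  TIOCOUTQ is the ioctl named in the patch description.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <termios.h>

/* Hypothetical stand-in for the generated type with an optional member. */
typedef struct ToyChardevHostdev {
    const char *device;
    bool has_maxqdepth;     /* false when an older client omits the member */
    int64_t maxqdepth;
} ToyChardevHostdev;

/* Bytes still queued in the tty output buffer, or -1 with errno set. */
int toy_tty_queued(int fd)
{
    int queued = 0;

    if (ioctl(fd, TIOCOUTQ, &queued) < 0) {
        return -1;
    }
    return queued;
}

/*
 * Returns 0 if a write may proceed, -EAGAIN if the queue already holds
 * more than maxqdepth characters.  Without maxqdepth there is no limit
 * (assumption), which keeps existing 1.4 clients working unchanged.
 */
int toy_check_qdepth(const ToyChardevHostdev *cfg, int queued)
{
    if (cfg->has_maxqdepth && queued > cfg->maxqdepth) {
        return -EAGAIN;
    }
    return 0;
}
```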

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [RFC][PATCH 11/15] memory: Allow unaligned address_space_rw

2013-05-06 Thread Jan Kiszka

On 2013-05-06 16:55, Paolo Bonzini wrote:
 On 06/05/2013 16:26, Jan Kiszka wrote:
 This will be needed for some corner cases with para-virtual the I/O
 ports.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  exec.c |   33 ++---
  1 files changed, 18 insertions(+), 15 deletions(-)

 diff --git a/exec.c b/exec.c
 index 3ee1f3f..9c582b1 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -1833,38 +1833,41 @@ void address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
  uint8_t *ptr;
  uint32_t val;
  MemoryRegionSection *section;
 +MemoryRegion *mr;
  
  while (len > 0) {
  l = ((addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
  if (l > len)
  l = len;
  section = address_space_lookup_region(as, addr);
 +mr = section->mr;
  
  if (is_write) {
 -if (!memory_region_is_ram(section->mr)) {
 +if (!memory_region_is_ram(mr)) {
  hwaddr addr1;
  addr1 = memory_region_section_addr(section, addr);
  /* XXX: could force cpu_single_env to NULL to avoid
 potential bugs */
 -if (l >= 4 && ((addr1 & 3) == 0)) {
 +if (l >= 4 && ((addr1 & 3) == 0 || mr->ops->impl.unaligned)) {
 
 Does the length matter at all if unaligned accesses are allowed?  I
 think it shouldn't...

What do you mean? The length test here is not about alignment, it's
about proper split-up depending on the input size (we cannot use 32 bit
for all accesses and do not want to make them all byte accesses, do we?).
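
The split-up described above can be sketched in isolation.  This is a simplified model of the size selection in the quoted hunk, not the exec.c code itself; toy_access_size and toy_count_accesses are hypothetical helpers, and unaligned_ok stands in for mr->ops->impl.unaligned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Width in bytes (4, 2 or 1) of the next device access. */
unsigned toy_access_size(uint64_t addr, uint64_t len, bool unaligned_ok)
{
    if (len >= 4 && ((addr & 3) == 0 || unaligned_ok)) {
        return 4;
    }
    if (len >= 2 && ((addr & 1) == 0 || unaligned_ok)) {
        return 2;
    }
    return 1;
}

/* Number of device accesses a transfer of len bytes is split into. */
unsigned toy_count_accesses(uint64_t addr, uint64_t len, bool unaligned_ok)
{
    unsigned n = 0;

    while (len > 0) {
        unsigned l = toy_access_size(addr, len, unaligned_ok);

        addr += l;
        len -= l;
        n++;
    }
    return n;
}
```

With these helpers, a 4-byte transfer at an odd address is split into three accesses (1 + 2 + 1) when alignment is enforced and stays a single 32-bit access when unaligned accesses are allowed; the length test still decides the widest usable size in both cases.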

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [RFC][PATCH 08/15] isa: implement isa_is_ioport_assigned via memory_region_find

2013-05-06 Thread Jan Kiszka

On 2013-05-06 16:55, Andreas Färber wrote:
 On 06.05.2013 16:26, Jan Kiszka wrote:
 Move isa_is_ioport_assigned to the ISA core and implement it via a
 memory region lookup. As all IO ports are now directly or indirectly
 registered via the memory API, this becomes possible and will finally
 allow us to drop the ioport tables.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/acpi/piix4.c   |6 +++---
  hw/isa/isa-bus.c  |   11 +++
  hw/isa/lpc_ich9.c |8 
  include/exec/ioport.h |1 -
  include/hw/isa/isa.h  |2 ++
  ioport.c  |7 ---
  6 files changed, 20 insertions(+), 15 deletions(-)

 diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
 index c4af1cc..5955217 100644
 --- a/hw/acpi/piix4.c
 +++ b/hw/acpi/piix4.c
 @@ -386,10 +386,10 @@ static void piix4_pm_machine_ready(Notifier *n, void 
 *opaque)
  uint8_t *pci_conf;
  
  pci_conf = s->dev.config;
 -pci_conf[0x5f] = (isa_is_ioport_assigned(0x378) ? 0x80 : 0) | 0x10;
 +pci_conf[0x5f] = (isa_is_ioport_assigned(NULL, 0x378) ? 0x80 : 0) | 
 0x10;
  pci_conf[0x63] = 0x60;
 -pci_conf[0x67] = (isa_is_ioport_assigned(0x3f8) ? 0x08 : 0) |
 -(isa_is_ioport_assigned(0x2f8) ? 0x90 : 0);
 +pci_conf[0x67] = (isa_is_ioport_assigned(NULL, 0x3f8) ? 0x08 : 0) |
 +(isa_is_ioport_assigned(NULL, 0x2f8) ? 0x90 : 0);
  
  }
  
 
 Is there really no way to access the ISABus from this device? Would be
 nice to get rid of global ISA variables and not introduce more
 dependencies. :)

There is likely a way; I just didn't find a direct one. So I prefer to
limit the scope and leave such cleanups for later. Keep in mind that the
existing API had an implicit NULL device, so this is no regression!

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH for-1.5] virtio-pci: bugfix

2013-05-06 Thread Michael S. Tsirkin

mask notifiers are never called without msix,
so devices with backend masking like vhost don't work.
Call mask notifiers explicitly at
startup/cleanup to make it work.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
Tested-by: Alexander Graf ag...@suse.de

---
 hw/virtio/virtio-pci.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 8bba0f3..d0fcc6c 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -758,6 +758,10 @@ static int virtio_pci_set_guest_notifier(DeviceState *d, 
int n, bool assign,
 event_notifier_cleanup(notifier);
 }
 
+if (!msix_enabled(&proxy->pci_dev) && proxy->vdev->guest_notifier_mask) {
+proxy->vdev->guest_notifier_mask(proxy->vdev, n, !assign);
+}
+
 return 0;
 }
 
-- 
MST

Re: [Qemu-devel] [RFC][PATCH 11/15] memory: Allow unaligned address_space_rw

2013-05-06 Thread Paolo Bonzini

On 06/05/2013 16:58, Jan Kiszka wrote:
 On 2013-05-06 16:55, Paolo Bonzini wrote:
 On 06/05/2013 16:26, Jan Kiszka wrote:
 This will be needed for some corner cases with para-virtual the I/O
 ports.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  exec.c |   33 ++---
  1 files changed, 18 insertions(+), 15 deletions(-)

 diff --git a/exec.c b/exec.c
 index 3ee1f3f..9c582b1 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -1833,38 +1833,41 @@ void address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
  uint8_t *ptr;
  uint32_t val;
  MemoryRegionSection *section;
 +MemoryRegion *mr;
  
  while (len > 0) {
  l = ((addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE) - addr;
  if (l > len)
  l = len;
  section = address_space_lookup_region(as, addr);
 +mr = section->mr;
  
  if (is_write) {
 -if (!memory_region_is_ram(section->mr)) {
 +if (!memory_region_is_ram(mr)) {
  hwaddr addr1;
  addr1 = memory_region_section_addr(section, addr);
  /* XXX: could force cpu_single_env to NULL to avoid
 potential bugs */
 -if (l >= 4 && ((addr1 & 3) == 0)) {
 +if (l >= 4 && ((addr1 & 3) == 0 || mr->ops->impl.unaligned)) {

 Does the length matter at all if unaligned accesses are allowed?  I
 think it shouldn't...
 
 What do you mean? The length test here is not about alignment, it's
 about proper split-up depending on the input size (we cannot use 32 bit
 for all accesses and do not want to make them all byte accesses, do we?).

Oh right, I was thinking of my IOMMU tree where I have:

l = len;
section = address_space_translate(as, addr, &addr1, &l, is_write);

if (is_write) {
if (!memory_region_is_ram(section->mr)) {
/* XXX: could force cpu_single_env to NULL to avoid
   potential bugs */
if (l >= 4 && ((addr1 & 3) == 0)) {

and address_space_translate does:

*plen = MIN(section->size - addr, *plen);

It should not do this if section->mr->ops.unaligned.  Whoever rebases on top
of the other should keep this in mind.

Paolo

Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/7] pci: add MPC105 PCI host bridge emulation

2013-05-06 Thread Alexander Graf


On 05/03/2013 07:57 AM, Hervé Poussineau wrote:

Alexander Graf wrote:


On 02.05.2013 at 22:08, Hervé Poussineau hpous...@reactos.org wrote:


Non-contiguous I/O is not implemented.

There is also somewhere a bug in the memory controller, which means
that some real firmwares may not detect the correct amount of memory.
This can be bypassed by adding '-m 1G' on the command line.

Add x-auto-conf property, to automatically configure the memory
controller at startup. This will be required by OpenBIOS, which
doesn't know how to do it.


Why not teach it? I'd prefer to see that logic in firmware.


Me too, but I'm not confident enough in my capabilities to do it.


Huh? Why not? Most of the device initialization code in OpenBIOS happens 
in C, so you don't even have to touch Forth code :).


Autoconfiguration is only in one place of the code, so I think it can 
be removed easily once OpenBIOS has this logic.


I'd prefer if we could come up with a clean model from the start. It 
really shouldn't be hard at all.







Signed-off-by: Hervé Poussineau hpous...@reactos.org
---
default-configs/ppc-softmmu.mak |1 +
hw/pci-host/Makefile.objs   |1 +
hw/pci-host/mpc105.c|  488 
+++

include/hw/pci/pci_ids.h|1 +
trace-events|7 +
5 files changed, 498 insertions(+)
create mode 100644 hw/pci-host/mpc105.c

diff --git a/default-configs/ppc-softmmu.mak 
b/default-configs/ppc-softmmu.mak

index cc3587f..f79b058 100644
--- a/default-configs/ppc-softmmu.mak
+++ b/default-configs/ppc-softmmu.mak
@@ -28,6 +28,7 @@ CONFIG_MAC_NVRAM=y
CONFIG_MAC_DBDMA=y
CONFIG_HEATHROW_PIC=y
CONFIG_GRACKLE_PCI=y
+CONFIG_MPC105_PCI=y
CONFIG_UNIN_PCI=y
CONFIG_DEC_PCI=y
CONFIG_PPCE500_PCI=y
diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index 909e702..ec4427b 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -3,6 +3,7 @@ common-obj-y += pam.o
# PPC devices
common-obj-$(CONFIG_PREP_PCI) += prep.o
common-obj-$(CONFIG_GRACKLE_PCI) += grackle.o
+common-obj-$(CONFIG_MPC105_PCI) += mpc105.o
# NewWorld PowerMac
common-obj-$(CONFIG_UNIN_PCI) += uninorth.o
common-obj-$(CONFIG_DEC_PCI) += dec.o
diff --git a/hw/pci-host/mpc105.c b/hw/pci-host/mpc105.c
new file mode 100644
index 000..8e4cc95
--- /dev/null
+++ b/hw/pci-host/mpc105.c
@@ -0,0 +1,488 @@
+/*
+ * QEMU MPC-105 Eagle PCI host
+ *
+ * Copyright (c) 2013 Hervé Poussineau
+ *
+ * This program is free software: you can redistribute it and/or 
modify
+ * it under the terms of the GNU General Public License as 
published by

+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) version 3 or any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see 
http://www.gnu.org/licenses/.

+ */
+
+#include "hw/pci/pci.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_host.h"
+#include "hw/i386/pc.h"


That include sounds odd :).


Sure, but you need to access pic_read_irq() :)
hw/pci-host/prep.c also includes it.


Phew. Would be a good thing to pull out of there. But it's out of the 
scope for this set.







+#include "hw/loader.h"
+#include "exec/address-spaces.h"
+#include "elf.h"
+#include "trace.h"
+
+#define TYPE_MPC105_PCI_HOST_BRIDGE "mpc105-pcihost"
+#define MPC105_PCI_HOST_BRIDGE(obj) \
+OBJECT_CHECK(Mpc105HostState, (obj), TYPE_MPC105_PCI_HOST_BRIDGE)
+
+#define TYPE_MPC105 "mpc105"
+#define MPC105(obj) \
+OBJECT_CHECK(Mpc105State, (obj), TYPE_MPC105)
+
+#define MEM_STA_03     0x0080
+#define MEM_STA_47     0x0084
+#define EXT_MEM_STA_03 0x0088
+#define EXT_MEM_STA_47 0x008c
+#define MEM_END_03     0x0090
+#define MEM_END_47     0x0094
+#define EXT_MEM_END_03 0x0098
+#define EXT_MEM_END_47 0x009c
+#define MEM_BANK_EN    0x00a0
+#define PROC_CFG_A8    0x00a8
+#define PROC_CFG_AC    0x00ac
+#define ALT_OSV_1      0x00ba
+#define ERR_EN_REG1    0x00c0
+#define ERR_DR1        0x00c1
+#define ERR_EN_REG2    0x00c4
+#define MEM_CFG_1      0x00f0
+#define MEM_CFG_2      0x00f4
+#define MEM_CFG_4      0x00fc
+
+#define MEM_CFG_1_MEMGO (1 << 19)
+
+#define BIOS_SIZE (1024 * 1024)
+
+typedef struct Mpc105State {
+PCIDevice parent_obj;
+uint32_t ram_size;
+uint32_t elf_machine;
+uint32_t x_auto_conf;
+char *bios_name;
+MemoryRegion bios;
+MemoryRegion simm[8];
+bool use_sizer[8];
+/* use a sizer to allow access to only part of a simm */
+MemoryRegion sizer[8];
+} Mpc105State;
+
+static uint64_t mpc105_unassigned_read(void *opaque, hwaddr addr,
+   unsigned int size)
+{
+trace_mpc105_unassigned_mem_read(addr);
+return 0;
+}
+
+static void

[Qemu-devel] [RFC PATCH 4/8] memory: ref/unref memory across address_space_map/unmap

2013-05-06 Thread Paolo Bonzini

The iothread mutex might be released between map and unmap, so the
mapped region might disappear.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 exec.c |   12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/exec.c b/exec.c
index 54b57fc..54ed203 100644
--- a/exec.c
+++ b/exec.c
@@ -2077,6 +2077,7 @@ void cpu_physical_memory_write_rom(hwaddr addr,
 }
 
 typedef struct {
+MemoryRegion *mr;
 void *buffer;
 hwaddr addr;
 hwaddr len;
@@ -2171,15 +2172,18 @@ void *address_space_map(AddressSpace *as,
 bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, TARGET_PAGE_SIZE);
 bounce.addr = addr;
 bounce.len = l;
+bounce.mr = section->mr;
 if (!is_write) {
 address_space_read(as, addr, bounce.buffer, l);
 }
 
 *plen = l;
+memory_region_ref(section->mr);
 return bounce.buffer;
 }
 if (!todo) {
 raddr = memory_region_get_ram_addr(section->mr) + xlat;
+memory_region_ref(section->mr);
 } else {
 if (memory_region_get_ram_addr(section->mr) + xlat != raddr + todo) {
 break;
@@ -2204,8 +2208,12 @@ void address_space_unmap(AddressSpace *as, void *buffer, 
hwaddr len,
  int is_write, hwaddr access_len)
 {
 if (buffer != bounce.buffer) {
+MemoryRegion *mr;
+ram_addr_t addr1;
+
+mr = qemu_ram_addr_from_host(buffer, &addr1);
+assert(mr);
 if (is_write) {
-ram_addr_t addr1 = qemu_ram_addr_from_host_nofail(buffer);
 while (access_len) {
 unsigned l;
 l = TARGET_PAGE_SIZE;
@@ -2219,6 +2227,7 @@ void address_space_unmap(AddressSpace *as, void *buffer, 
hwaddr len,
 if (xen_enabled()) {
 xen_invalidate_map_cache_entry(buffer);
 }
+memory_region_unref(mr);
 return;
 }
 if (is_write) {
@@ -2226,6 +2235,7 @@ void address_space_unmap(AddressSpace *as, void *buffer, 
hwaddr len,
 }
 qemu_vfree(bounce.buffer);
 bounce.buffer = NULL;
+memory_region_unref(bounce.mr);
 cpu_notify_map_clients();
 }
 
-- 
1.7.1
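
The reasoning in the commit message above can be illustrated with a toy model.  This is a sketch under simplifying assumptions, not the QEMU implementation: ToyRegion and the toy_* functions are hypothetical, there is no real locking, and the region is passed to toy_unmap directly, whereas the patch looks it up from the host pointer or from bounce.mr.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region with an owner reference; not QEMU's MemoryRegion. */
typedef struct ToyRegion {
    int refcount;           /* starts at 1 for the owning device */
    bool destroyed;         /* set once the last reference is gone */
    unsigned char ram[64];
} ToyRegion;

void toy_region_ref(ToyRegion *r)
{
    r->refcount++;
}

void toy_region_unref(ToyRegion *r)
{
    if (--r->refcount == 0) {
        r->destroyed = true;
    }
}

/* Hot-unplug drops the owner's reference; it may run between map and unmap. */
void toy_region_unplug(ToyRegion *r)
{
    toy_region_unref(r);
}

/* Mapping takes a reference so the memory outlives a concurrent unplug. */
unsigned char *toy_map(ToyRegion *r, size_t offset)
{
    toy_region_ref(r);
    return &r->ram[offset];
}

/* Unmapping gives the reference back; only now may the region go away. */
void toy_unmap(ToyRegion *r)
{
    toy_region_unref(r);
}
```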

Re: [Qemu-devel] [RFC][PATCH 08/15] isa: implement isa_is_ioport_assigned via memory_region_find

2013-05-06 Thread Paolo Bonzini

On 06/05/2013 17:02, Jan Kiszka wrote:
 On 2013-05-06 16:59, Paolo Bonzini wrote:
 On 06/05/2013 16:55, Andreas Färber wrote:
 On 06.05.2013 16:26, Jan Kiszka wrote:
 Move isa_is_ioport_assigned to the ISA core and implement it via a
 memory region lookup. As all IO ports are now directly or indirectly
 registered via the memory API, this becomes possible and will finally
 allow us to drop the ioport tables.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/acpi/piix4.c   |6 +++---
  hw/isa/isa-bus.c  |   11 +++
  hw/isa/lpc_ich9.c |8 
  include/exec/ioport.h |1 -
  include/hw/isa/isa.h  |2 ++
  ioport.c  |7 ---
  6 files changed, 20 insertions(+), 15 deletions(-)

 diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
 index c4af1cc..5955217 100644
 --- a/hw/acpi/piix4.c
 +++ b/hw/acpi/piix4.c
 @@ -386,10 +386,10 @@ static void piix4_pm_machine_ready(Notifier *n, void 
 *opaque)
  uint8_t *pci_conf;
  
  pci_conf = s->dev.config;
 -pci_conf[0x5f] = (isa_is_ioport_assigned(0x378) ? 0x80 : 0) | 0x10;
 +pci_conf[0x5f] = (isa_is_ioport_assigned(NULL, 0x378) ? 0x80 : 0) | 
 0x10;
  pci_conf[0x63] = 0x60;
 -pci_conf[0x67] = (isa_is_ioport_assigned(0x3f8) ? 0x08 : 0) |
 -  (isa_is_ioport_assigned(0x2f8) ? 0x90 : 0);
 +pci_conf[0x67] = (isa_is_ioport_assigned(NULL, 0x3f8) ? 0x08 : 0) |
 +(isa_is_ioport_assigned(NULL, 0x2f8) ? 0x90 : 0);
  
  }
  

 Is there really no way to access the ISABus from this device? Would be
 nice to get rid of global ISA variables and not introduce more
 dependencies. :)

 There's always a way to find the ISABus via QOM:

 ISABus *isa_bus = (ISABus *) object_resolve_path_type("", TYPE_ISA_BUS, NULL);
 
 Err, in what way is this better? It also assumes that there is only one.

I didn't say it is better. :)  Unfortunately, the PIIX4 has these
registers on the wrong function; ICH9 fixed it.

You could make this take an AddressSpace instead of an ISABus, and use
pci_address_space_io.  The assumption then becomes that the ISA and PM
devices have the same address spaces, which is somewhat hackish but
reasonable.

In fact, with the other new API I have added (address_space_valid), this
would become address_space_valid(pci_address_space_io(dev), 0x2f8, 1)
and similar.  I need to check, but you could remove
isa_is_ioport_assigned completely.
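
As a rough sketch of that direction, the register setup from the quoted hunk can be written against an abstract "is this port claimed?" probe, so that the caller decides whether the answer comes from an ISABus, an AddressSpace, or something else.  ToyPortProbe, toy_fill_legacy_bits and toy_port_in_list are hypothetical names; only the register offsets and bit values are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical probe: returns true if something claims the I/O port. */
typedef bool (*ToyPortProbe)(void *opaque, uint16_t port);

/* Fill the PIIX4 PM config bytes exactly as in the quoted hunk. */
void toy_fill_legacy_bits(uint8_t *conf, ToyPortProbe assigned, void *opaque)
{
    conf[0x5f] = (assigned(opaque, 0x378) ? 0x80 : 0) | 0x10;
    conf[0x63] = 0x60;
    conf[0x67] = (assigned(opaque, 0x3f8) ? 0x08 : 0) |
                 (assigned(opaque, 0x2f8) ? 0x90 : 0);
}

/* Example probe: opaque points at a zero-terminated list of claimed ports. */
bool toy_port_in_list(void *opaque, uint16_t port)
{
    const uint16_t *p = opaque;

    for (; *p; p++) {
        if (*p == port) {
            return true;
        }
    }
    return false;
}
```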

Paolo

Re: [Qemu-devel] [RFC][PATCH 08/15] isa: implement isa_is_ioport_assigned via memory_region_find

2013-05-06 Thread Jan Kiszka

On 2013-05-06 16:59, Paolo Bonzini wrote:
 On 06/05/2013 16:55, Andreas Färber wrote:
 On 06.05.2013 16:26, Jan Kiszka wrote:
 Move isa_is_ioport_assigned to the ISA core and implement it via a
 memory region lookup. As all IO ports are now directly or indirectly
 registered via the memory API, this becomes possible and will finally
 allow us to drop the ioport tables.

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 ---
  hw/acpi/piix4.c   |6 +++---
  hw/isa/isa-bus.c  |   11 +++
  hw/isa/lpc_ich9.c |8 
  include/exec/ioport.h |1 -
  include/hw/isa/isa.h  |2 ++
  ioport.c  |7 ---
  6 files changed, 20 insertions(+), 15 deletions(-)

 diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
 index c4af1cc..5955217 100644
 --- a/hw/acpi/piix4.c
 +++ b/hw/acpi/piix4.c
 @@ -386,10 +386,10 @@ static void piix4_pm_machine_ready(Notifier *n, void 
 *opaque)
  uint8_t *pci_conf;
  
  pci_conf = s->dev.config;
 -pci_conf[0x5f] = (isa_is_ioport_assigned(0x378) ? 0x80 : 0) | 0x10;
 +pci_conf[0x5f] = (isa_is_ioport_assigned(NULL, 0x378) ? 0x80 : 0) | 
 0x10;
  pci_conf[0x63] = 0x60;
 -pci_conf[0x67] = (isa_is_ioport_assigned(0x3f8) ? 0x08 : 0) |
 -   (isa_is_ioport_assigned(0x2f8) ? 0x90 : 0);
 +pci_conf[0x67] = (isa_is_ioport_assigned(NULL, 0x3f8) ? 0x08 : 0) |
 +(isa_is_ioport_assigned(NULL, 0x2f8) ? 0x90 : 0);
  
  }
  

 Is there really no way to access the ISABus from this device? Would be
 nice to get rid of global ISA variables and not introduce more
 dependencies. :)
 
 There's always a way to find the ISABus via QOM:
 
 ISABus *isa_bus = (ISABus *) object_resolve_path_type("", TYPE_ISA_BUS, NULL);

Err, in what way is this better? It also assumes that there is only one.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

[Qemu-devel] [RFC PATCH 5/8] memory: access FlatView from a local variable

2013-05-06 Thread Paolo Bonzini

We will soon require accesses to as->current_map to be placed under
a lock (with reference counting so as to keep the critical section
small).  To simplify this change, always fetch as->current_map into
a local variable and access it through that variable.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 memory.c |   31 +--
 1 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/memory.c b/memory.c
index bc4cf4b..553b04c 100644
--- a/memory.c
+++ b/memory.c
@@ -632,13 +632,15 @@ static void address_space_add_del_ioeventfds(AddressSpace 
*as,
 
 static void address_space_update_ioeventfds(AddressSpace *as)
 {
+FlatView *view;
 FlatRange *fr;
 unsigned ioeventfd_nb = 0;
 MemoryRegionIoeventfd *ioeventfds = NULL;
 AddrRange tmp;
 unsigned i;
 
-FOR_EACH_FLAT_RANGE(fr, as->current_map) {
+view = as->current_map;
+FOR_EACH_FLAT_RANGE(fr, view) {
 for (i = 0; i < fr->mr->ioeventfd_nb; ++i) {
 tmp = addrrange_shift(fr->mr->ioeventfds[i].addr,
   int128_sub(fr->addr.start,
@@ -1134,7 +1136,8 @@ void memory_region_sync_dirty_bitmap(MemoryRegion *mr)
 FlatRange *fr;
 
 QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
-FOR_EACH_FLAT_RANGE(fr, as->current_map) {
+FlatView *view = as->current_map;
+FOR_EACH_FLAT_RANGE(fr, view) {
 if (fr->mr == mr) {
 MEMORY_LISTENER_UPDATE_REGION(fr, as, Forward, log_sync);
 }
@@ -1184,12 +1187,14 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr)
 
 static void memory_region_update_coalesced_range_as(MemoryRegion *mr, 
AddressSpace *as)
 {
+FlatView *view;
 FlatRange *fr;
 CoalescedMemoryRange *cmr;
 AddrRange tmp;
 MemoryRegionSection section;
 
-FOR_EACH_FLAT_RANGE(fr, as->current_map) {
+view = as->current_map;
+FOR_EACH_FLAT_RANGE(fr, view) {
 if (fr->mr == mr) {
 section = (MemoryRegionSection) {
 .address_space = as,
@@ -1476,9 +1481,9 @@ static int cmp_flatrange_addr(const void *addr_, const 
void *fr_)
 return 0;
 }
 
-static FlatRange *address_space_lookup(AddressSpace *as, AddrRange addr)
+static FlatRange *flatview_lookup(FlatView *view, AddrRange addr)
 {
-return bsearch(&addr, as->current_map->ranges, as->current_map->nr,
+return bsearch(&addr, view->ranges, view->nr,
sizeof(FlatRange), cmp_flatrange_addr);
 }
 
@@ -1489,6 +1494,7 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
 MemoryRegion *root;
 AddressSpace *as;
 AddrRange range;
+FlatView *view;
 FlatRange *fr;
 
 addr += mr->addr;
@@ -1499,13 +1505,14 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
 
 as = memory_region_to_address_space(root);
 range = addrrange_make(int128_make64(addr), int128_make64(size));
-fr = address_space_lookup(as, range);
+
+view = as->current_map;
+fr = flatview_lookup(view, range);
 if (!fr) {
 return ret;
 }
 
-while (fr > as->current_map->ranges
-&& addrrange_intersects(fr[-1].addr, range)) {
+while (fr > view->ranges && addrrange_intersects(fr[-1].addr, range)) {
 --fr;
 }
 
@@ -1525,9 +1532,11 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
 
 void address_space_sync_dirty_bitmap(AddressSpace *as)
 {
+FlatView *view;
 FlatRange *fr;
 
-FOR_EACH_FLAT_RANGE(fr, as->current_map) {
+view = as->current_map;
+FOR_EACH_FLAT_RANGE(fr, view) {
 MEMORY_LISTENER_UPDATE_REGION(fr, as, Forward, log_sync);
 }
 }
@@ -1547,6 +1556,7 @@ void memory_global_dirty_log_stop(void)
 static void listener_add_address_space(MemoryListener *listener,
AddressSpace *as)
 {
+FlatView *view;
 FlatRange *fr;
 
 if (listener->address_space_filter
@@ -1560,7 +1570,8 @@ static void listener_add_address_space(MemoryListener 
*listener,
 }
 }
 
-FOR_EACH_FLAT_RANGE(fr, as->current_map) {
+view = as->current_map;
+FOR_EACH_FLAT_RANGE(fr, view) {
 MemoryRegionSection section = {
 .mr = fr->mr,
 .address_space = as,
-- 
1.7.1
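
Why the local variable matters once updates can happen concurrently can be shown with a small model.  This is a sketch with hypothetical types (ToyView, ToySpace); the "concurrent" replacement of current_map is simulated inside the loop, and the comment marks where a real implementation would hold the lock mentioned in the commit message.

```c
/* Hypothetical flattened view with a reference count. */
typedef struct ToyView {
    int refcount;
    unsigned nr;
    int ranges[4];
} ToyView;

typedef struct ToySpace {
    ToyView *current_map;
} ToySpace;

/* Fetch the current view once; a real version would do this under a lock. */
ToyView *toy_view_get(ToySpace *as)
{
    ToyView *view = as->current_map;

    view->refcount++;
    return view;
}

void toy_view_put(ToyView *view)
{
    view->refcount--;
}

/*
 * Sum all ranges of the view that was current on entry.  If replacement
 * is non-NULL, current_map is swapped in the middle of the walk to mimic
 * a topology update; the walk keeps using its own snapshot.
 */
int toy_sum_ranges(ToySpace *as, ToyView *replacement)
{
    ToyView *view = toy_view_get(as);
    int sum = 0;
    unsigned i;

    for (i = 0; i < view->nr; i++) {
        if (i == 0 && replacement) {
            as->current_map = replacement;
        }
        sum += view->ranges[i];
    }
    toy_view_put(view);
    return sum;
}
```

Had the loop re-read as->current_map on every iteration, it would have mixed entries from two different views after the swap.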

Re: [Qemu-devel] [RFC 0/9] QContext: QOM class to support multiple event loops

2013-05-06 Thread Anthony Liguori

Paolo Bonzini pbonz...@redhat.com writes:

 On 03/05/2013 18:03, Michael Roth wrote:
 These patches apply on top of qemu.git master, and can also be obtained from:
 git://github.com/mdroth/qemu.git qcontext
 
 OVERVIEW
 
 This series introduces a set of QOM classes/interfaces for event
 registration/handling: QContext and QSource, which are based closely on
 their GMainContext/GSource GLib counterparts.
 
 QContexts can be created via the command-line via -object, and can also be
 instructed (via -object params/properties) to automatically start a
 thread/event-loop to handle QSources we attach to them.

 This is an awesome idea.

Ack.

 However, it seems a bit overengineered.

Ack.

  Why do we need QSource at all?
  In my opinion, we should first change dataplane to use AioContext as a
 GSource, and benchmark it thoroughly.  If it is fast enough, we can
 just introduce a glib-based QContext and be done with it.  Hopefully
 that is the case...

Why even bother with QContext then?

Regards,

Anthony Liguori


 Paolo

[Qemu-devel] [RFC][PATCH 04/15] i82374: replace register_ioport*

2013-05-06 Thread Jan Kiszka

Convert over to memory regions to obsolete register_ioport*.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/dma/i82374.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/hw/dma/i82374.c b/hw/dma/i82374.c
index f3d1924..3cc9aab 100644
--- a/hw/dma/i82374.c
+++ b/hw/dma/i82374.c
@@ -124,16 +124,23 @@ static const VMStateDescription vmstate_isa_i82374 = {
 },
 };
 
+static const MemoryRegionPortio i82374_portio_list[] = {
+{ 0x0A, 1, 1, .read = i82374_read_isr, },
+{ 0x10, 8, 1, .write = i82374_write_command, },
+{ 0x18, 8, 1, .read = i82374_read_status, },
+{ 0x20, 0x20, 1,
+  .write = i82374_write_descriptor, .read = i82374_read_descriptor, },
+PORTIO_END_OF_LIST(),
+};
+
 static int i82374_isa_init(ISADevice *dev)
 {
 ISAi82374State *isa = I82374(dev);
 I82374State *s = &isa->state;
+PortioList *port_list = g_new(PortioList, 1);
 
-register_ioport_read(isa->iobase + 0x0A, 1, 1, i82374_read_isr, s);
-register_ioport_write(isa->iobase + 0x10, 8, 1, i82374_write_command, s);
-register_ioport_read(isa->iobase + 0x18, 8, 1, i82374_read_status, s);
-register_ioport_write(isa->iobase + 0x20, 0x20, 1, i82374_write_descriptor, s);
-register_ioport_read(isa->iobase + 0x20, 0x20, 1, i82374_read_descriptor, s);
+portio_list_init(port_list, i82374_portio_list, s, "i82374");
+portio_list_add(port_list, isa_address_space_io(dev), isa->iobase);
 
 i82374_init(s);
 
-- 
1.7.3.4
