date:20150717

Re: [Qemu-devel] [PATCH v2] target-arm: kvm: Differentiate registers based on write-back levels

2015-07-17 Thread Peter Maydell

On 16 July 2015 at 12:34, Christoffer Dall christoffer.d...@linaro.org wrote:
 Some registers like the CNTVCT register should only be written to the
 kernel as part of machine initialization or on vmload operations, but
 never during runtime, as this can potentially make time go backwards or
 create inconsistent time observations between VCPUs.

 Introduce a list of registers that should not be written back at runtime
 and check this list on syncing the register state to the KVM state.

Thanks. I think this should go into QEMU 2.4, given that it fixes
a bug with time misbehaving in guests. Are you happy that it's
received enough testing? (I have given 32-bit KVM a spin but have
no convenient 64-bit box to test with, and besides, I didn't notice the
bug in the first place :-))

 Signed-off-by: Christoffer Dall christoffer.d...@linaro.org
 ---
 Changes since RFC:
  - Move cpreg_level to kvm_arm_cpreg_level and into kvm32.c and kvm64.c
  - Changed struct name and declare as static const

I have a couple of minor comments on the comments below, and you
forgot to update the stub version of write_list_to_kvmstate().
I can just fix these up as I put it into target-arm.next, though.
Fixed up version at:

 https://git.linaro.org/people/peter.maydell/qemu-arm.git target-arm.next

If interested parties could test that by end-of-Monday that
would be nice (since rc2 is scheduled for Tuesday).

  dtc  |  2 +-
  target-arm/kvm.c |  6 +-
  target-arm/kvm32.c   | 30 +-
  target-arm/kvm64.c   | 30 +-
  target-arm/kvm_arm.h | 12 +++-
  target-arm/machine.c |  2 +-
  6 files changed, 76 insertions(+), 6 deletions(-)

 diff --git a/dtc b/dtc
 index 65cc4d2..bc895d6 16
 --- a/dtc
 +++ b/dtc
 @@ -1 +1 @@
 -Subproject commit 65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf
 +Subproject commit bc895d6d09695d05ceb8b52486ffe861d6cfbdde

Stray submodule change :-)

 diff --git a/target-arm/kvm32.c b/target-arm/kvm32.c
 index d7e7d68..6769815 100644
 --- a/target-arm/kvm32.c
 +++ b/target-arm/kvm32.c
 @@ -153,6 +153,34 @@ bool kvm_arm_reg_syncs_via_cpreg_list(uint64_t regidx)
  }
  }

 +typedef struct CPRegStateLevel {
 +uint64_t regidx;
 +int level;
 +} CPRegStateLevel;
 +
 +/* All coprocessor registers not listed in the following table are assumed to
 + * be of the level KVM_PUT_RUNTIME_STATE, a register should be written less

. If a register.

 + * often, you must add it to this table with a state of either
 + * KVM_PUT_RESET_STATE or KVM_PUT_FULL_STATE.
 + */
 +static const CPRegStateLevel non_runtime_cpregs[] = {
 +{ KVM_REG_ARM_TIMER_CNT, KVM_PUT_FULL_STATE },
 +};

 diff --git a/target-arm/kvm64.c b/target-arm/kvm64.c
 index ac34f51..d59f41c 100644
 --- a/target-arm/kvm64.c
 +++ b/target-arm/kvm64.c
 @@ -139,6 +139,34 @@ bool kvm_arm_reg_syncs_via_cpreg_list(uint64_t regidx)
  }
  }

 +typedef struct CPRegStateLevel {
 +uint64_t regidx;
 +int level;
 +} CPRegStateLevel;
 +
 +/* All system not listed in the following table are assumed to be of the 
 level

system registers

 + * KVM_PUT_RUNTIME_STATE, a register should be written less often, you must

. If a register

 + * add it to this table with a state of either KVM_PUT_RESET_STATE or
 + * KVM_PUT_FULL_STATE.
 + */
 +static const CPRegStateLevel non_runtime_cpregs[] = {
 +{ KVM_REG_ARM_TIMER_CNT, KVM_PUT_FULL_STATE },
 +};

 --- a/target-arm/kvm_arm.h
 +++ b/target-arm/kvm_arm.h

 @@ -83,7 +93,7 @@ bool kvm_arm_reg_syncs_via_cpreg_list(uint64_t regidx);
  * Note that we do not stop early on failure -- we will attempt
   * writing all registers in the list.
  */
 -bool write_list_to_kvmstate(ARMCPU *cpu);
 +bool write_list_to_kvmstate(ARMCPU *cpu, int level);

You forgot to update the stub function in target-arm/kvm-stub.c,
so this breaks compilation on non-ARM hosts.

thanks
-- PMM

[Qemu-devel] [PULL 6/7] ipxe: don't override GITVERSION

2015-07-17 Thread Gerd Hoffmann

We had build problems due to the git version checking in the ipxe build
system in the past.  Don't remember the details, but the problem seems
to be gone now, so lets remove the workaround.

Signed-off-by: Gerd Hoffmann kra...@redhat.com

[ most likely ipxe commit 6153c09c41034250408f3596555fcaae715da46c:
  [build] Set GITVERSION only if there is a git repository ]

Reviewed-by: Laszlo Ersek ler...@redhat.com
---
 roms/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/roms/Makefile b/roms/Makefile
index 4971f6b..55fc45b 100644
--- a/roms/Makefile
+++ b/roms/Makefile
@@ -121,12 +121,12 @@ efi-rom-%: build-pxe-roms build-efi-roms
-o ../pc-bios/efi-$*.rom
 
 build-pxe-roms: ipxe/src/config/local/general.h 
ipxe/src/config/local/branding.h
-   $(MAKE) -C ipxe/src GITVERSION= \
+   $(MAKE) -C ipxe/src \
CROSS_COMPILE=$(x86_64_cross_prefix) \
$(patsubst %,bin/%.rom,$(pxerom_targets))
 
 build-efi-roms: build-pxe-roms ipxe/src/config/local/general.h 
ipxe/src/config/local/branding.h
-   $(MAKE) -C ipxe/src GITVERSION= \
+   $(MAKE) -C ipxe/src \
CROSS_COMPILE=$(x86_64_cross_prefix) \
$(patsubst %,bin-i386-efi/%.efidrv,$(pxerom_targets)) \
$(patsubst %,bin-x86_64-efi/%.efidrv,$(pxerom_targets))
-- 
1.8.3.1

[Qemu-devel] [PULL 3/7] ipxe: rm local config in cleanup

2015-07-17 Thread Gerd Hoffmann

ipxe build now generates empty local header files in case they are
not present.  Let's remove them on cleanup to make sure we store a
fresh copy on the next build.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
Acked-by: Laszlo Ersek ler...@redhat.com
---
 roms/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/roms/Makefile b/roms/Makefile
index 7b3f156..c0153c7 100644
--- a/roms/Makefile
+++ b/roms/Makefile
@@ -153,5 +153,6 @@ clean:
$(MAKE) -C sgabios clean
rm -f sgabios/.depend
$(MAKE) -C ipxe/src veryclean
+   (cd ipxe; rm -f src/config/local/*.h)
$(MAKE) -C SLOF clean
rm -rf u-boot/build.e500
-- 
1.8.3.1

[Qemu-devel] [PULL 4/7] ipxe: disable load file protocol

2015-07-17 Thread Gerd Hoffmann

Activate the opt-out added by one not-upstream ipxe patch:

[efi] make load file protocol optional

Signed-off-by: Gerd Hoffmann kra...@redhat.com
Reviewed-by: Laszlo Ersek ler...@redhat.com
---
 roms/config.ipxe.general.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/roms/config.ipxe.general.h b/roms/config.ipxe.general.h
index 619ee4c..2df042a 100644
--- a/roms/config.ipxe.general.h
+++ b/roms/config.ipxe.general.h
@@ -2,3 +2,4 @@
 #define BANNER_TIMEOUT 30
 #undef ROM_BANNER_TIMEOUT
 #define ROM_BANNER_TIMEOUT 0
+#undef EFI_PROTO_LOAD_FILE
-- 
1.8.3.1

[Qemu-devel] [PULL for-2.4 0/7] update ipxe roms, fix efi support

2015-07-17 Thread Gerd Hoffmann

  Hi,

This pull finally fixes the efi boot support.  ipxe is updated to the
latest master, two non-upstream commits needed to make efi work are
added on top, and the build process is tweaked a bit.

The ipxe changes are pushed to git://git.kraxel.org/ipxe (branch qemu,
tag qemu-2.4).  They should be mirrored to git://git.qemu.org/ipxe.git
before merging this pull request.

please pull,
  Gerd

The following changes since commit 2d5ee9e7a7dd495d233cf9613a865f63f88e3375:

  Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150716' into staging 
(2015-07-16 10:40:23 +0100)

are available in the git repository at:


  git://git.kraxel.org/qemu tags/pull-ipxe-20150717-1

for you to fetch changes up to 4f0c601b71e53a7d225a1913b784242400788991:

  ipxe: update binaries (2015-07-16 17:39:12 +0200)


update ipxe roms, fix efi support


Gerd Hoffmann (7):
  ipxe: update from 35c53797 to 24112d9 (upstream/master)
  ipxe: update to 87981bb (qemu)
  ipxe: rm local config in cleanup
  ipxe: disable load file protocol
  ipxe: add qemu branding
  ipxe: don't override GITVERSION
  ipxe: update binaries

 pc-bios/efi-e1000.rom   | Bin 197120 - 192512 bytes
 pc-bios/efi-eepro100.rom| Bin 197632 - 192512 bytes
 pc-bios/efi-ne2k_pci.rom| Bin 195584 - 190976 bytes
 pc-bios/efi-pcnet.rom   | Bin 195584 - 190976 bytes
 pc-bios/efi-rtl8139.rom | Bin 200192 - 194560 bytes
 pc-bios/efi-virtio.rom  | Bin 194048 - 188928 bytes
 roms/Makefile   |   9 +
 roms/config.ipxe.branding.h |   2 ++
 roms/config.ipxe.general.h  |   1 +
 roms/ipxe   |   2 +-
 10 files changed, 9 insertions(+), 5 deletions(-)
 create mode 100644 roms/config.ipxe.branding.h

[Qemu-devel] [PULL 5/7] ipxe: add qemu branding

2015-07-17 Thread Gerd Hoffmann

Apply qemu-project.org branding, so the official builds can easily be
identified in the banner.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
Acked-by: Laszlo Ersek ler...@redhat.com
---
 roms/Makefile   | 4 ++--
 roms/config.ipxe.branding.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)
 create mode 100644 roms/config.ipxe.branding.h

diff --git a/roms/Makefile b/roms/Makefile
index c0153c7..4971f6b 100644
--- a/roms/Makefile
+++ b/roms/Makefile
@@ -120,12 +120,12 @@ efi-rom-%: build-pxe-roms build-efi-roms
-ec ipxe/src/bin-x86_64-efi/$(VID)$(DID).efidrv \
-o ../pc-bios/efi-$*.rom
 
-build-pxe-roms: ipxe/src/config/local/general.h
+build-pxe-roms: ipxe/src/config/local/general.h 
ipxe/src/config/local/branding.h
$(MAKE) -C ipxe/src GITVERSION= \
CROSS_COMPILE=$(x86_64_cross_prefix) \
$(patsubst %,bin/%.rom,$(pxerom_targets))
 
-build-efi-roms: build-pxe-roms ipxe/src/config/local/general.h
+build-efi-roms: build-pxe-roms ipxe/src/config/local/general.h 
ipxe/src/config/local/branding.h
$(MAKE) -C ipxe/src GITVERSION= \
CROSS_COMPILE=$(x86_64_cross_prefix) \
$(patsubst %,bin-i386-efi/%.efidrv,$(pxerom_targets)) \
diff --git a/roms/config.ipxe.branding.h b/roms/config.ipxe.branding.h
new file mode 100644
index 000..324995e
--- /dev/null
+++ b/roms/config.ipxe.branding.h
@@ -0,0 +1,2 @@
+#undef PRODUCT_NAME
+#define PRODUCT_NAME iPXE for qemu-project.org
-- 
1.8.3.1

[Qemu-devel] [PULL 7/7] ipxe: update binaries

2015-07-17 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann kra...@redhat.com
Reviewed-by: Laszlo Ersek ler...@redhat.com
---
 pc-bios/efi-e1000.rom| Bin 197120 - 192512 bytes
 pc-bios/efi-eepro100.rom | Bin 197632 - 192512 bytes
 pc-bios/efi-ne2k_pci.rom | Bin 195584 - 190976 bytes
 pc-bios/efi-pcnet.rom| Bin 195584 - 190976 bytes
 pc-bios/efi-rtl8139.rom  | Bin 200192 - 194560 bytes
 pc-bios/efi-virtio.rom   | Bin 194048 - 188928 bytes
 6 files changed, 0 insertions(+), 0 deletions(-)

diff --git a/pc-bios/efi-e1000.rom b/pc-bios/efi-e1000.rom
index 4e29d9d..203bbaf 100644
Binary files a/pc-bios/efi-e1000.rom and b/pc-bios/efi-e1000.rom differ
diff --git a/pc-bios/efi-eepro100.rom b/pc-bios/efi-eepro100.rom
index 2a92d6f..0df3135 100644
Binary files a/pc-bios/efi-eepro100.rom and b/pc-bios/efi-eepro100.rom differ
diff --git a/pc-bios/efi-ne2k_pci.rom b/pc-bios/efi-ne2k_pci.rom
index 6366017..91c0ebb 100644
Binary files a/pc-bios/efi-ne2k_pci.rom and b/pc-bios/efi-ne2k_pci.rom differ
diff --git a/pc-bios/efi-pcnet.rom b/pc-bios/efi-pcnet.rom
index a61f586..4d79e37 100644
Binary files a/pc-bios/efi-pcnet.rom and b/pc-bios/efi-pcnet.rom differ
diff --git a/pc-bios/efi-rtl8139.rom b/pc-bios/efi-rtl8139.rom
index c9c77ea..0d3287b 100644
Binary files a/pc-bios/efi-rtl8139.rom and b/pc-bios/efi-rtl8139.rom differ
diff --git a/pc-bios/efi-virtio.rom b/pc-bios/efi-virtio.rom
index eec2790..8f9a9b4 100644
Binary files a/pc-bios/efi-virtio.rom and b/pc-bios/efi-virtio.rom differ
-- 
1.8.3.1

[Qemu-devel] [PULL 2/7] ipxe: update to 87981bb (qemu)

2015-07-17 Thread Gerd Hoffmann

Add two patches we've been struggling to get upstream.
They are available from git://git.qemu.org/ipxe.git qemu

git shortlog


Gerd Hoffmann (1):
  [efi] make load file protocol optional

Laszlo Ersek (1):
  efi_snp: improve compliance with the EFI_SIMPLE_NETWORK_PROTOCOL spec

Signed-off-by: Gerd Hoffmann kra...@redhat.com
Reviewed-by: Laszlo Ersek ler...@redhat.com
---
 roms/ipxe | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/roms/ipxe b/roms/ipxe
index 24112d9..87981bb 16
--- a/roms/ipxe
+++ b/roms/ipxe
@@ -1 +1 @@
-Subproject commit 24112d91a0ab4c29794bc7750cbfc4166bc9026a
+Subproject commit 87981bb66f5f496f124469fb6c13aa13ee00061e
-- 
1.8.3.1

[Qemu-devel] [PULL 1/7] ipxe: update from 35c53797 to 24112d9 (upstream/master)

2015-07-17 Thread Gerd Hoffmann

git shortlog


Alex Williamson (1):
  [dhcp] Extract timing parameters out to config/dhcp.h

Bernd Wiebelt (1):
  [tg3] Add support for BCM57766

Christian Hesse (3):
  [intel] Add PCI device IDs for Intel I218-LM and I218-V
  [build] Add missing const qualifiers
  [ath9k] Remove confusing logic inversion in an ANI variable

Ed Swierk (1):
  [intel] Update PCI device IDs for Intel 82599 and X540 10G NICs

Laszlo Ersek (1):
  [virtio] Downgrade per-iobuf debug messages to DBGC2

Michael Brown (228):
  [device] Provide a driver-private data field for root devices
  [iobuf] Add iob_split() to split an I/O buffer into portions
  [rndis] Add generic RNDIS device abstraction
  [hyperv] Add support for Hyper-V hypervisor
  [hyperv] Add support for VMBus devices
  [hyperv] Add support for NetVSC paravirtual network devices
  [rndis] Send RNDIS_INITIALISE_MSG
  [rndis] Send RNDIS_HALT_MSG
  [hyperv] Tear down NetVSC RX buffer GPADL after closing VMBus device
  [rndis] Clear receive filter when closing the device
  [hyperv] Receive all VMBus messages in a poll
  [hyperv] Increase TX ring size
  [hyperv] Assume that VMBus xfer page ranges correspond to RNDIS messages
  [rndis] Ignore start-of-day RNDIS_INDICATE_STATUS_MSG with status 
0x40020006
  [hyperv] Tidy up debug output
  [hyperv] Require support for VMBus version 3.0 or newer
  [build] Include Hyper-V driver in the all-drivers build
  [pci] Allow drivers to specify a PCI class
  [romprefix] Ensure UNDI loader can be included by all ROM types
  [usb] Add basic support for USB devices
  [usb] Add basic support for USB hubs
  [usb] Add support for xHCI host controllers
  [ncm] Add support for CDC-NCM USB Ethernet devices
  [usb] Report xHCI host controller events
  [ncm] Use large multi-packet buffers by default
  [tftp] Explicitly abort connection whenever parent interface is closed
  [uri] Allow tftp_uri() to construct a URI with a custom port
  [pxe] Use tftp_uri() to construct PXE TFTP URIs
  [pxe] Maintain a queue for received PXE UDP packets
  [ncm] Reserve headroom in received packets
  [usb] Try multiple USB device configurations
  [usb] Handle CDC union functional descriptors
  [usb] Parse endpoint descriptor bInterval field
  [usb] Allow usb_stream() to enforce a terminating short packet
  [ecm] Add support for CDC-ECM USB Ethernet devices
  [xhci] Delay after (possibly) forcing port link state to RxDetect
  [build] Move branding information to config/branding.h
  [build] Use PRODUCT_SHORT_NAME for end-user visible strings
  [build] Allow product URI to be customised via config/branding.h
  [build] Allow error message URI to be customised via config/branding.h
  [build] Allow command help text URI to be customised via config/branding.h
  [build] Allow setting help text URI to be customised via config/branding.h
  [build] Allow product tag line to be customised via config/branding.h
  [rndis] Add rndis_rx_err()
  [usb] Handle port status changes received after failing to find a driver
  [efi] Disallow R_X86_64_32 relocations
  [build] Apply the -fno-PIE -nopie workaround only to i386 builds
  [usb] Provide generic framework for refilling receive endpoints
  [usb] Use generic refill framework for USB hub interrupt endpoints
  [ecm] Use generic refill framework for bulk IN and interrupt endpoints
  [ncm] Use generic refill framework for bulk IN and interrupt endpoints
  [libc] Remove unused string functions
  [libc] Rewrite string functions
  [test] Add self-tests for more string functions
  [test] Add constant-length memset() self-tests
  [libc] Reduce size of memset()
  [usb] Add generic USB network device framework
  [ecm] Use generic USB network device framework
  [ncm] Use generic USB network device framework
  [timer] Rewrite the 8254 Programmable Interval Timer support
  [xhci] Leak memory if controller fails to disable slot
  [xhci] Abort commands on timeout
  [test] Add IPv4 self-tests
  [legal] Add missing copyright header to net/ipv4.c
  [ipv4] Rewrite inet_aton()
  [libc] Rewrite strtoul()
  [hyperv] Check for required features
  [prefix] Use .bss16 as temporary stack space for calls to install_block
  [zbin] Use LZMA compression
  [zbin] Perform extra normalisation after completing decompression
  [prefix] Call decompressor in flat real mode when DEBUG=libprefix is 
enabled
  [zbin] Allow decompressor to generate debug output via BIOS console
  [zbin] Fix check for existence of most recent output byte
  [zbin] Remove now-unused unnrv2b.S decompressor
  [legal] Update GPLv2 licence text
  [legal] Include full licence text for all GPL2_OR_LATER files
  [mucurses] Add missing FILE_LICENCE declarations

[Qemu-devel] [RFC PATCH V3 0/3] Multithread TCG async_safe_work part.

2015-07-17 Thread fred . konrad

From: KONRAD Frederic fred.kon...@greensocs.com

This is the async_safe_work introduction bit of the Multithread TCG work.
Rebased on current upstream (6169b60285fe1ff730d840a49527e721bfb30899).

(Currently untested as I need to rebase MTTCG first.)

It can be cloned here:
http://git.greensocs.com/fkonrad/mttcg.git branch async_work_v3

The first patch introduces a mutex to protect the existing queued_work_*
CPUState members against multiple (concurent) access.

The second patch introduces a tcg_exec_flag which will be 1 when we are inside
cpu_exec(), -1 when we must not enter the cpu execution and 0 when we are
allowed to do so. This is required as safe work need to be sure that's all vCPU
are outside cpu_exec().

The last patch introduces async_safe_work. It allows to add some work which will
be done asynchronously but only when all vCPUs are outside cpu_exec(). The tcg
thread will wait that no vCPUs have any pending safe work before reentering
cpu-exec().

Changes V2 - V3:
  * Check atomically we are not in the executiong loop to fix the race condition
which might happen.
Changes V1 - V2:
  * Release the lock while running the callback for both async and safe work.

KONRAD Frederic (3):
  cpus: protect queued_work_* with work_mutex.
  cpus: add tcg_exec_flag.
  cpus: introduce async_run_safe_work_on_cpu.

 cpu-exec.c|  10 
 cpus.c| 160 --
 include/qom/cpu.h |  57 +++
 qom/cpu.c |  20 +++
 4 files changed, 207 insertions(+), 40 deletions(-)

-- 
1.9.0

[Qemu-devel] [RFC PATCH V3 1/3] cpus: protect queued_work_* with work_mutex.

2015-07-17 Thread fred . konrad

From: KONRAD Frederic fred.kon...@greensocs.com

This protects queued_work_* used by async_run_on_cpu, run_on_cpu and
flush_queued_work with a new lock (work_mutex) to prevent multiple (concurrent)
access.

Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com

Changes V1 - V2:
  * Unlock the mutex while running the callback.
---
 cpus.c| 11 +++
 include/qom/cpu.h |  3 +++
 qom/cpu.c |  1 +
 3 files changed, 15 insertions(+)

diff --git a/cpus.c b/cpus.c
index b00a423..eabd4b1 100644
--- a/cpus.c
+++ b/cpus.c
@@ -845,6 +845,8 @@ void run_on_cpu(CPUState *cpu, void (*func)(void *data), 
void *data)
 wi.func = func;
 wi.data = data;
 wi.free = false;
+
+qemu_mutex_lock(cpu-work_mutex);
 if (cpu-queued_work_first == NULL) {
 cpu-queued_work_first = wi;
 } else {
@@ -853,6 +855,7 @@ void run_on_cpu(CPUState *cpu, void (*func)(void *data), 
void *data)
 cpu-queued_work_last = wi;
 wi.next = NULL;
 wi.done = false;
+qemu_mutex_unlock(cpu-work_mutex);
 
 qemu_cpu_kick(cpu);
 while (!wi.done) {
@@ -876,6 +879,8 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void 
*data), void *data)
 wi-func = func;
 wi-data = data;
 wi-free = true;
+
+qemu_mutex_lock(cpu-work_mutex);
 if (cpu-queued_work_first == NULL) {
 cpu-queued_work_first = wi;
 } else {
@@ -884,6 +889,7 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void 
*data), void *data)
 cpu-queued_work_last = wi;
 wi-next = NULL;
 wi-done = false;
+qemu_mutex_unlock(cpu-work_mutex);
 
 qemu_cpu_kick(cpu);
 }
@@ -896,15 +902,20 @@ static void flush_queued_work(CPUState *cpu)
 return;
 }
 
+qemu_mutex_lock(cpu-work_mutex);
 while ((wi = cpu-queued_work_first)) {
 cpu-queued_work_first = wi-next;
+qemu_mutex_unlock(cpu-work_mutex);
 wi-func(wi-data);
+qemu_mutex_lock(cpu-work_mutex);
 wi-done = true;
 if (wi-free) {
 g_free(wi);
 }
 }
 cpu-queued_work_last = NULL;
+qemu_mutex_unlock(cpu-work_mutex);
+
 qemu_cond_broadcast(qemu_work_cond);
 }
 
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 20aabc9..efa9624 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -242,6 +242,8 @@ struct kvm_run;
  * @mem_io_pc: Host Program Counter at which the memory was accessed.
  * @mem_io_vaddr: Target virtual address at which the memory was accessed.
  * @kvm_fd: vCPU file descriptor for KVM.
+ * @work_mutex: Lock to prevent multiple access to queued_work_*.
+ * @queued_work_first: First asynchronous work pending.
  *
  * State of one CPU core or thread.
  */
@@ -262,6 +264,7 @@ struct CPUState {
 uint32_t host_tid;
 bool running;
 struct QemuCond *halt_cond;
+QemuMutex work_mutex;
 struct qemu_work_item *queued_work_first, *queued_work_last;
 bool thread_kicked;
 bool created;
diff --git a/qom/cpu.c b/qom/cpu.c
index eb9cfec..4e12598 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -316,6 +316,7 @@ static void cpu_common_initfn(Object *obj)
 cpu-gdb_num_regs = cpu-gdb_num_g_regs = cc-gdb_num_core_regs;
 QTAILQ_INIT(cpu-breakpoints);
 QTAILQ_INIT(cpu-watchpoints);
+qemu_mutex_init(cpu-work_mutex);
 }
 
 static void cpu_common_finalize(Object *obj)
-- 
1.9.0

[Qemu-devel] [RFC PATCH V3 2/3] cpus: add tcg_exec_flag.

2015-07-17 Thread fred . konrad

From: KONRAD Frederic fred.kon...@greensocs.com

This flag indicates the state of the VCPU thread:
  * 0 if the VCPU is allowed to execute code.
  * 1 if the VCPU is currently executing code.
  * -1 if the VCPU is not allowed to execute code.

This allows to atomically check and run safe work or check and continue the TCG
execution.

Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com
Changes V2 - V3:
  * introduce a third state which allow or not the execution.
  * atomically check and set the flag when starting or blocking the code 
execution.
Changes V1 - V2:
  * do both tcg_executing = 0 or 1 in cpu_exec().
---
 cpu-exec.c|  5 +
 include/qom/cpu.h | 32 
 qom/cpu.c | 19 +++
 3 files changed, 56 insertions(+)

diff --git a/cpu-exec.c b/cpu-exec.c
index 75694f3..e1a 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -371,6 +371,10 @@ int cpu_exec(CPUState *cpu)
 cpu-halted = 0;
 }
 
+if (!tcg_cpu_try_start_execution(cpu)) {
+cpu-exit_request = 1;
+return 0;
+}
 current_cpu = cpu;
 
 /* As long as current_cpu is null, up to the assignment just above,
@@ -583,5 +587,6 @@ int cpu_exec(CPUState *cpu)
 
 /* fail safe : never use current_cpu outside cpu_exec() */
 current_cpu = NULL;
+tcg_cpu_allow_execution(cpu);
 return ret;
 }
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index efa9624..de7487e 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -226,6 +226,7 @@ struct kvm_run;
  * @stopped: Indicates the CPU has been artificially stopped.
  * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
  *   CPU and return to its top level loop.
+ * @tcg_exec_flag: See tcg_cpu_flag_* function.
  * @singlestep_enabled: Flags for single-stepping.
  * @icount_extra: Instructions until next timer event.
  * @icount_decr: Number of cycles left, with interrupt flag in high bit.
@@ -322,6 +323,8 @@ struct CPUState {
(absolute value) offset as small as possible.  This reduces code
size, especially for hosts without large memory offsets.  */
 volatile sig_atomic_t tcg_exit_req;
+
+int tcg_exec_flag;
 };
 
 QTAILQ_HEAD(CPUTailQ, CPUState);
@@ -337,6 +340,35 @@ extern struct CPUTailQ cpus;
 DECLARE_TLS(CPUState *, current_cpu);
 #define current_cpu tls_var(current_cpu)
 
+
+/**
+ * tcg_cpu_try_block_execution
+ * @cpu: The CPU to block the execution
+ *
+ * Try to set the tcg_exec_flag to -1 saying the CPU can't execute code if the
+ * CPU is not executing code.
+ * Returns true if the cpu execution is blocked, false otherwise.
+ */
+bool tcg_cpu_try_block_execution(CPUState *cpu);
+
+/**
+ * tcg_cpu_allow_execution
+ * @cpu: The CPU to allow the execution.
+ *
+ * Just reset the state of tcg_exec_flag, and allow the execution of some code.
+ */
+void tcg_cpu_allow_execution(CPUState *cpu);
+
+/**
+ * tcg_cpu_try_start_execution
+ * @cpu: The CPU to start the execution.
+ *
+ * Just set the tcg_exec_flag to 1 saying the CPU is executing code if the CPU
+ * is allowed to run some code.
+ * Returns true if the cpu can execute, false otherwise.
+ */
+bool tcg_cpu_try_start_execution(CPUState *cpu);
+
 /**
  * cpu_paging_enabled:
  * @cpu: The CPU whose state is to be inspected.
diff --git a/qom/cpu.c b/qom/cpu.c
index 4e12598..e32f90c 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -26,6 +26,23 @@
 #include qemu/error-report.h
 #include sysemu/sysemu.h
 
+bool tcg_cpu_try_block_execution(CPUState *cpu)
+{
+return (atomic_cmpxchg(cpu-tcg_exec_flag, 0, -1)
+   || (cpu-tcg_exec_flag == -1));
+}
+
+void tcg_cpu_allow_execution(CPUState *cpu)
+{
+cpu-tcg_exec_flag = 0;
+}
+
+bool tcg_cpu_try_start_execution(CPUState *cpu)
+{
+return (atomic_cmpxchg(cpu-tcg_exec_flag, 0, 1)
+   || (cpu-tcg_exec_flag == 1));
+}
+
 bool cpu_exists(int64_t id)
 {
 CPUState *cpu;
@@ -249,6 +266,8 @@ static void cpu_common_reset(CPUState *cpu)
 cpu-icount_decr.u32 = 0;
 cpu-can_do_io = 0;
 cpu-exception_index = -1;
+
+tcg_cpu_allow_execution(cpu);
 memset(cpu-tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof(void *));
 }
 
-- 
1.9.0

[Qemu-devel] [RFC PATCH V3 3/3] cpus: introduce async_run_safe_work_on_cpu.

2015-07-17 Thread fred . konrad

From: KONRAD Frederic fred.kon...@greensocs.com

We already had async_run_on_cpu but we need all VCPUs outside their execution
loop to execute some tb_flush/invalidate task:

async_run_on_cpu_safe schedule a work on a VCPU but the work start when no more
VCPUs are executing code.
When a safe work is pending cpu_has_work returns true, so cpu_exec returns and
the VCPUs can't enters execution loop. cpu_thread_is_idle returns false so at
the moment where all VCPUs are stop || stopped the safe work queue can be
flushed.

Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com

Changes V3 - V4:
  * Use tcg_cpu_try_block_execution.
  * Use a counter to know how many safe work are pending.
Changes V2 - V3:
  * Unlock the mutex while executing the callback.
Changes V1 - V2:
  * Move qemu_cpu_kick_thread to avoid prototype declaration.
  * Use the work_mutex lock to protect the queued_safe_work_* structures.
---
 cpu-exec.c|   5 ++
 cpus.c| 149 +++---
 include/qom/cpu.h |  24 -
 3 files changed, 137 insertions(+), 41 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index e1a..97805cc 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -363,6 +363,11 @@ int cpu_exec(CPUState *cpu)
 /* This must be volatile so it is not trashed by longjmp() */
 volatile bool have_tb_lock = false;
 
+if (async_safe_work_pending()) {
+cpu-exit_request = 1;
+return 0;
+}
+
 if (cpu-halted) {
 if (!cpu_has_work(cpu)) {
 return EXCP_HALTED;
diff --git a/cpus.c b/cpus.c
index eabd4b1..2250296 100644
--- a/cpus.c
+++ b/cpus.c
@@ -69,6 +69,8 @@ static CPUState *next_cpu;
 int64_t max_delay;
 int64_t max_advance;
 
+int safe_work_pending; /* Number of safe work pending for all VCPUs. */
+
 bool cpu_is_stopped(CPUState *cpu)
 {
 return cpu-stopped || !runstate_is_running();
@@ -76,7 +78,7 @@ bool cpu_is_stopped(CPUState *cpu)
 
 static bool cpu_thread_is_idle(CPUState *cpu)
 {
-if (cpu-stop || cpu-queued_work_first) {
+if (cpu-stop || cpu-queued_work_first || cpu-queued_safe_work_first) {
 return false;
 }
 if (cpu_is_stopped(cpu)) {
@@ -833,6 +835,45 @@ void qemu_init_cpu_loop(void)
 qemu_thread_get_self(io_thread);
 }
 
+static void qemu_cpu_kick_thread(CPUState *cpu)
+{
+#ifndef _WIN32
+int err;
+
+err = pthread_kill(cpu-thread-thread, SIG_IPI);
+if (err) {
+fprintf(stderr, qemu:%s: %s, __func__, strerror(err));
+exit(1);
+}
+#else /* _WIN32 */
+if (!qemu_cpu_is_self(cpu)) {
+CONTEXT tcgContext;
+
+if (SuspendThread(cpu-hThread) == (DWORD)-1) {
+fprintf(stderr, qemu:%s: GetLastError:%lu\n, __func__,
+GetLastError());
+exit(1);
+}
+
+/* On multi-core systems, we are not sure that the thread is actually
+ * suspended until we can get the context.
+ */
+tcgContext.ContextFlags = CONTEXT_CONTROL;
+while (GetThreadContext(cpu-hThread, tcgContext) != 0) {
+continue;
+}
+
+cpu_signal(0);
+
+if (ResumeThread(cpu-hThread) == (DWORD)-1) {
+fprintf(stderr, qemu:%s: GetLastError:%lu\n, __func__,
+GetLastError());
+exit(1);
+}
+}
+#endif
+}
+
 void run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
 {
 struct qemu_work_item wi;
@@ -894,6 +935,70 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void 
*data), void *data)
 qemu_cpu_kick(cpu);
 }
 
+void async_run_safe_work_on_cpu(CPUState *cpu, void (*func)(void *data),
+void *data)
+{
+struct qemu_work_item *wi;
+
+wi = g_malloc0(sizeof(struct qemu_work_item));
+wi-func = func;
+wi-data = data;
+wi-free = true;
+
+atomic_inc(safe_work_pending);
+qemu_mutex_lock(cpu-work_mutex);
+if (cpu-queued_safe_work_first == NULL) {
+cpu-queued_safe_work_first = wi;
+} else {
+cpu-queued_safe_work_last-next = wi;
+}
+cpu-queued_safe_work_last = wi;
+wi-next = NULL;
+wi-done = false;
+qemu_mutex_unlock(cpu-work_mutex);
+
+CPU_FOREACH(cpu) {
+qemu_cpu_kick_thread(cpu);
+}
+}
+
+static void flush_queued_safe_work(CPUState *cpu)
+{
+struct qemu_work_item *wi;
+CPUState *other_cpu;
+
+if (cpu-queued_safe_work_first == NULL) {
+return;
+}
+
+CPU_FOREACH(other_cpu) {
+if (!tcg_cpu_try_block_execution(other_cpu)) {
+return;
+}
+}
+
+qemu_mutex_lock(cpu-work_mutex);
+while ((wi = cpu-queued_safe_work_first)) {
+cpu-queued_safe_work_first = wi-next;
+qemu_mutex_unlock(cpu-work_mutex);
+wi-func(wi-data);
+qemu_mutex_lock(cpu-work_mutex);
+wi-done = true;
+if (wi-free) {
+g_free(wi);
+}
+atomic_dec(safe_work_pending);
+}
+cpu-queued_safe_work_last

[Qemu-devel] [PULL 4/4] Revert xhci: set timer to retry xfers

2015-07-17 Thread Gerd Hoffmann

This reverts commit 4e8cfbe1143d8384387595b500212d7a7f11aeae.

We should not poll via timer, and with ccid being fixed
to properly notify us about pending transfers we don't have to.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/usb/hcd-xhci.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index 90a5fbf..c673bed 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -,8 +,6 @@ static void xhci_kick_ep(XHCIState *xhci, unsigned int 
slotid,
 if (xfer-running_retry) {
 DPRINTF(xhci: xfer nacked, stopping schedule\n);
 epctx-retry = xfer;
-timer_mod(epctx-kick_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) 
+
-  epctx-interval * 125000);
 break;
 }
 }
-- 
1.8.3.1

Re: [Qemu-devel] vhost-user memory regions.

2015-07-17 Thread Marc-André Lureau

Hi Naredula

On Mon, Jul 13, 2015 at 10:35 AM, Naredula Janardhana Reddy
naredula.j...@gmail.com wrote:

 In the QEMU 2.3.0 , The memory regions published to user space application
 is limited address space, not the complete memory regions of quest vm, due
 to this , the user level app cannot decode all the memory block addresses.
 This is needed for reading the packet from virtio ring.

 Is there any configuration  in qemu to overcome the above  or it is bug in
 virtio-user?.

 Test details:
 Qemu cmd line:
 /opt/qemu.2.3.0/bin/qemu-system-x86_64 -enable-kvm -gdb
 tcp::1336,server,nowait
 -monitor tcp::51008,server,nowait,nodelay -m 256M -mem-path /hugetlbfs
 -mem-prealloc
 -smp 2 -chardev backend=socket,id=ch1,path=/opt_src/port1 -netdev
 vhost-user,id=guest0,chardev=ch1 -device
 virtio-net-pci,mac=00:30:48:DB:5E:01,netdev=guest0,mq=on,vectors=8 -vnc :8
 -serial telnet::50008,server,nowait -daemonize -append
 net_send_int_disable=1 ipaddr=192.168.122.3 gw=192.168.122.1 nic_intr_off=1
 netbh_tightloop=0 page_cache=1  -kernel /opt_src/jiny/jiny_image.bin -drive
 if=virtio,id=hdr0,file=/opt_src/jiny/disk
 In the below message

 The below is message recevied at the user level application from
 vhost-user(qemu), the size is only 655k.  but needed is 256M(size of  memory
 or shared memory size of hugetlb).  I have Temporarly hardcoded the size in
 the app by changing the size to 256M from 655K. so that network packet
 address can able to decode from  the virtio ring.

 Cmd: VHOST_USER_SET_MEM_TABLE (0x5)
 flags: 0x1
 regions: 2
 region:
 gpa = 0x0
 size = 655360
 ua = 0x2ac0
 region:
 gpa = 0x0
 size = 786432
 ua = 0xff4


I got a similar error when using vapp
(https://github.com/virtualopensystems/vapp), it's missing mmap_offset
field.
You need this fix: https://github.com/virtualopensystems/vapp/pull/3/files

-- 
Marc-André Lureau

[Qemu-devel] [FIX PATCH] pc-dimm: Fail pc-dimm realization for invalid nodes in non-NUMA configuration

2015-07-17 Thread Bharata B Rao

pc_dimm_realize() validates the NUMA node to which memory hotplug is
being performed only in case of NUMA configuration. Include a check
to fail invalid nodes in case of non-NUMA configuration too.

Signed-off-by: Bharata B Rao bhar...@linux.vnet.ibm.com
---
 hw/mem/pc-dimm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index bb04862..099025e 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -414,10 +414,11 @@ static void pc_dimm_realize(DeviceState *dev, Error 
**errp)
 error_setg(errp, ' PC_DIMM_MEMDEV_PROP ' property is not set);
 return;
 }
-if ((nb_numa_nodes  0)  (dimm-node = nb_numa_nodes)) {
+if (((nb_numa_nodes  0)  (dimm-node = nb_numa_nodes)) ||
+(!nb_numa_nodes  dimm-node)) {
 error_setg(errp, 'DIMM property  PC_DIMM_NODE_PROP  has value %
PRIu32 ' which exceeds the number of numa nodes: %d,
-   dimm-node, nb_numa_nodes);
+   dimm-node, nb_numa_nodes ? nb_numa_nodes : 1);
 return;
 }
 }
-- 
2.1.0

[Qemu-devel] [RFC PATCH 1/2] spapr: add dumpdtb support

2015-07-17 Thread Andrew Jones

Signed-off-by: Andrew Jones drjo...@redhat.com
---
 hw/ppc/spapr.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a6f19473cf278..c1cbf3387ae0c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -30,6 +30,7 @@
 #include hw/fw-path-provider.h
 #include elf.h
 #include net/net.h
+#include sysemu/device_tree.h
 #include sysemu/block-backend.h
 #include sysemu/cpus.h
 #include sysemu/kvm.h
@@ -822,6 +823,7 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr,
 exit(1);
 }
 
+qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt));
 cpu_physical_memory_write(fdt_addr, fdt, fdt_totalsize(fdt));
 
 g_free(bootlist);
-- 
2.4.3

Re: [Qemu-devel] [RFC v3 06/13] target-i386: translate: implement qemu_ldlink and qemu_stcond ops

2015-07-17 Thread Alex Bennée


Alvise Rigo a.r...@virtualopensystems.com writes:

 Implement strex and ldrex instruction relying on TCG's qemu_ldlink and
 qemu_stcond.  For the time being only 32bit configurations are supported.

 Suggested-by: Jani Kokkonen jani.kokko...@huawei.com
 Suggested-by: Claudio Fontana claudio.font...@huawei.com
 Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
 ---
  tcg/i386/tcg-target.c | 136 
 ++
  1 file changed, 114 insertions(+), 22 deletions(-)

 diff --git a/tcg/i386/tcg-target.c b/tcg/i386/tcg-target.c
 index 0d7c99c..d8250a9 100644
 --- a/tcg/i386/tcg-target.c
 +++ b/tcg/i386/tcg-target.c
 @@ -1141,6 +1141,17 @@ static void * const qemu_ld_helpers[16] = {
  [MO_BEQ]  = helper_be_ldq_mmu,
  };
  
 +/* LoadLink helpers, only unsigned. Use the macro below to access them. */
 +static void * const qemu_ldex_helpers[16] = {
 +[MO_LEUL] = helper_le_ldlinkul_mmu,
 +};
 +
 +#define LDEX_HELPER(mem_op) \
 +({  \
 +assert(mem_op  MO_EXCL);   \
 +qemu_ldex_helpers[((int)mem_op - MO_EXCL)]; \
 +})
 +
  /* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr,
   * uintxx_t val, int mmu_idx, uintptr_t 
 ra)
   */
 @@ -1154,6 +1165,17 @@ static void * const qemu_st_helpers[16] = {
  [MO_BEQ]  = helper_be_stq_mmu,
  };
  
 +/* StoreConditional helpers. Use the macro below to access them. */
 +static void * const qemu_stex_helpers[16] = {
 +[MO_LEUL] = helper_le_stcondl_mmu,
 +};
 +
 +#define STEX_HELPER(mem_op) \
 +({  \
 +assert(mem_op  MO_EXCL);   \
 +qemu_stex_helpers[(int)mem_op - MO_EXCL];   \
 +})
 +

Same comments as for target-arm.

Do we need to be protecting backends with HAS_LDST_EXCL defines or some
such macro hackery? What currently happens if you use the new TCG ops
when the backend doesn't support them? Is supporting all backends a
prerequisite for the series?

  /* Perform the TLB load and compare.
  
 Inputs:
 @@ -1249,6 +1271,7 @@ static inline void tcg_out_tlb_load(TCGContext *s, 
 TCGReg addrlo, TCGReg addrhi,
   * for a load or store, so that we can later generate the correct helper code
   */
  static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi,
 +TCGReg llsc_success,
  TCGReg datalo, TCGReg datahi,
  TCGReg addrlo, TCGReg addrhi,
  tcg_insn_unit *raddr,
 @@ -1257,6 +1280,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool 
 is_ld, TCGMemOpIdx oi,
  TCGLabelQemuLdst *label = new_ldst_label(s);
  
  label-is_ld = is_ld;
 +label-llsc_success = llsc_success;
  label-oi = oi;
  label-datalo_reg = datalo;
  label-datahi_reg = datahi;
 @@ -1311,7 +1335,11 @@ static void tcg_out_qemu_ld_slow_path(TCGContext *s, 
 TCGLabelQemuLdst *l)
   (uintptr_t)l-raddr);
  }
  
 -tcg_out_call(s, qemu_ld_helpers[opc  (MO_BSWAP | MO_SIZE)]);
 +if (opc  MO_EXCL) {
 +tcg_out_call(s, LDEX_HELPER(opc));
 +} else {
 +tcg_out_call(s, qemu_ld_helpers[opc  ~MO_SIGN]);
 +}
  
  data_reg = l-datalo_reg;
  switch (opc  MO_SSIZE) {
 @@ -1415,9 +1443,16 @@ static void tcg_out_qemu_st_slow_path(TCGContext *s, 
 TCGLabelQemuLdst *l)
  }
  }
  
 -/* Tail call to the helper, with the return address back inline.  */
 -tcg_out_push(s, retaddr);
 -tcg_out_jmp(s, qemu_st_helpers[opc  (MO_BSWAP | MO_SIZE)]);
 +if (opc  MO_EXCL) {
 +tcg_out_call(s, STEX_HELPER(opc));
 +/* Save the output of the StoreConditional */
 +tcg_out_mov(s, TCG_TYPE_I32, l-llsc_success, TCG_REG_EAX);
 +tcg_out_jmp(s, l-raddr);
 +} else {
 +/* Tail call to the helper, with the return address back inline.  
 */
 +tcg_out_push(s, retaddr);
 +tcg_out_jmp(s, qemu_st_helpers[opc]);
 +}
  }
  #elif defined(__x86_64__)  defined(__linux__)
  # include asm/prctl.h
 @@ -1530,7 +1565,8 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, 
 TCGReg datalo, TCGReg datahi,
  /* XXX: qemu_ld and qemu_st could be modified to clobber only EDX and
 EAX. It will be useful once fixed registers globals are less
 common. */
 -static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
 +static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64,
 +bool isLoadLink)
  {
  TCGReg datalo, datahi, addrlo;
  TCGReg addrhi __attribute__((unused));
 @@ -1553,14 +1589,34 @@ static void

[Qemu-devel] [PULL 3/4] usb-ccid: add missing wakeup calls

2015-07-17 Thread Gerd Hoffmann

Properly notify the host adapter that we have
data pending, so it doesn't has to poll us.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/usb/dev-smartcard-reader.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/usb/dev-smartcard-reader.c b/hw/usb/dev-smartcard-reader.c
index 904b568..8952eff 100644
--- a/hw/usb/dev-smartcard-reader.c
+++ b/hw/usb/dev-smartcard-reader.c
@@ -283,6 +283,7 @@ typedef struct CCIDBus {
 typedef struct USBCCIDState {
 USBDevice dev;
 USBEndpoint *intr;
+USBEndpoint *bulk;
 CCIDBus bus;
 CCIDCardState *card;
 BulkIn bulk_in_pending[BULK_IN_PENDING_NUM]; /* circular */
@@ -769,6 +770,7 @@ static void ccid_write_slot_status(USBCCIDState *s, 
CCID_Header *recv)
 h-b.bError = s-bError;
 h-bClockStatus = CLOCK_STATUS_RUNNING;
 ccid_reset_error_status(s);
+usb_wakeup(s-bulk, 0);
 }
 
 static void ccid_write_parameters(USBCCIDState *s, CCID_Header *recv)
@@ -789,6 +791,7 @@ static void ccid_write_parameters(USBCCIDState *s, 
CCID_Header *recv)
 h-bProtocolNum = s-bProtocolNum;
 h-abProtocolDataStructure = s-abProtocolDataStructure;
 ccid_reset_error_status(s);
+usb_wakeup(s-bulk, 0);
 }
 
 static void ccid_write_data_block(USBCCIDState *s, uint8_t slot, uint8_t seq,
@@ -810,6 +813,7 @@ static void ccid_write_data_block(USBCCIDState *s, uint8_t 
slot, uint8_t seq,
 }
 memcpy(p-abData, data, len);
 ccid_reset_error_status(s);
+usb_wakeup(s-bulk, 0);
 }
 
 static void ccid_report_error_failed(USBCCIDState *s, uint8_t error)
@@ -1319,6 +1323,7 @@ static void ccid_realize(USBDevice *dev, Error **errp)
 NULL);
 qbus_set_hotplug_handler(BUS(s-bus), DEVICE(dev), error_abort);
 s-intr = usb_ep_get(dev, USB_TOKEN_IN, CCID_INT_IN_EP);
+s-bulk = usb_ep_get(dev, USB_TOKEN_IN, CCID_BULK_IN_EP);
 s-card = NULL;
 s-migration_state = MIGRATION_NONE;
 s-migration_target_ip = 0;
-- 
1.8.3.1

[Qemu-devel] [PULL for-2.4 0/4] usb: fixes for 2.4 (ccid, xhci and usb-host)

2015-07-17 Thread Gerd Hoffmann

  Hi,

Here are usb fixes for 2.4.

please pull,
  Gerd

The following changes since commit 2d5ee9e7a7dd495d233cf9613a865f63f88e3375:

  Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150716' into staging 
(2015-07-16 10:40:23 +0100)

are available in the git repository at:


  git://git.kraxel.org/qemu tags/pull-usb-20150717-1

for you to fetch changes up to 92fdfa4bef9c92addcc009dd3e0131172b4fdc78:

  Revert xhci: set timer to retry xfers (2015-07-17 13:20:53 +0200)


usb: fixes for 2.4 (ccid, xhci and usb-host)


Gerd Hoffmann (3):
  usb-ccid: fix 61b4887b41b270bc837ead57bc502d904af023bb
  usb-ccid: add missing wakeup calls
  Revert xhci: set timer to retry xfers

Lin Ma (1):
  Re-attach usb device to kernel while usb_host_open fails

 hw/usb/dev-smartcard-reader.c | 21 +
 hw/usb/hcd-xhci.c |  2 --
 hw/usb/host-libusb.c  |  3 +++
 3 files changed, 16 insertions(+), 10 deletions(-)

[Qemu-devel] [PULL 2/4] usb-ccid: fix 61b4887b41b270bc837ead57bc502d904af023bb

2015-07-17 Thread Gerd Hoffmann

QOMification dropped the parent device lookup, fix it.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/usb/dev-smartcard-reader.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/hw/usb/dev-smartcard-reader.c b/hw/usb/dev-smartcard-reader.c
index de534ba..904b568 100644
--- a/hw/usb/dev-smartcard-reader.c
+++ b/hw/usb/dev-smartcard-reader.c
@@ -1184,7 +1184,7 @@ void ccid_card_send_apdu_to_guest(CCIDCardState *card,
   uint8_t *apdu, uint32_t len)
 {
 DeviceState *qdev = DEVICE(card);
-USBDevice *dev = USB_DEVICE(qdev);
+USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent);
 USBCCIDState *s = USB_CCID_DEV(dev);
 Answer *answer;
 
@@ -1207,7 +1207,7 @@ void ccid_card_send_apdu_to_guest(CCIDCardState *card,
 void ccid_card_card_removed(CCIDCardState *card)
 {
 DeviceState *qdev = DEVICE(card);
-USBDevice *dev = USB_DEVICE(qdev);
+USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent);
 USBCCIDState *s = USB_CCID_DEV(dev);
 
 ccid_on_slot_change(s, false);
@@ -1218,7 +1218,7 @@ void ccid_card_card_removed(CCIDCardState *card)
 int ccid_card_ccid_attach(CCIDCardState *card)
 {
 DeviceState *qdev = DEVICE(card);
-USBDevice *dev = USB_DEVICE(qdev);
+USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent);
 USBCCIDState *s = USB_CCID_DEV(dev);
 
 DPRINTF(s, 1, CCID Attach\n);
@@ -1231,7 +1231,7 @@ int ccid_card_ccid_attach(CCIDCardState *card)
 void ccid_card_ccid_detach(CCIDCardState *card)
 {
 DeviceState *qdev = DEVICE(card);
-USBDevice *dev = USB_DEVICE(qdev);
+USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent);
 USBCCIDState *s = USB_CCID_DEV(dev);
 
 DPRINTF(s, 1, CCID Detach\n);
@@ -1244,7 +1244,7 @@ void ccid_card_ccid_detach(CCIDCardState *card)
 void ccid_card_card_error(CCIDCardState *card, uint64_t error)
 {
 DeviceState *qdev = DEVICE(card);
-USBDevice *dev = USB_DEVICE(qdev);
+USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent);
 USBCCIDState *s = USB_CCID_DEV(dev);
 
 s-bmCommandStatus = COMMAND_STATUS_FAILED;
@@ -1263,7 +1263,7 @@ void ccid_card_card_error(CCIDCardState *card, uint64_t 
error)
 void ccid_card_card_inserted(CCIDCardState *card)
 {
 DeviceState *qdev = DEVICE(card);
-USBDevice *dev = USB_DEVICE(qdev);
+USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent);
 USBCCIDState *s = USB_CCID_DEV(dev);
 
 s-bmCommandStatus = COMMAND_STATUS_NO_ERROR;
@@ -1275,7 +1275,7 @@ static int ccid_card_exit(DeviceState *qdev)
 {
 int ret = 0;
 CCIDCardState *card = CCID_CARD(qdev);
-USBDevice *dev = USB_DEVICE(qdev);
+USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent);
 USBCCIDState *s = USB_CCID_DEV(dev);
 
 if (ccid_card_inserted(s)) {
@@ -1289,7 +1289,7 @@ static int ccid_card_exit(DeviceState *qdev)
 static int ccid_card_init(DeviceState *qdev)
 {
 CCIDCardState *card = CCID_CARD(qdev);
-USBDevice *dev = USB_DEVICE(qdev);
+USBDevice *dev = USB_DEVICE(qdev-parent_bus-parent);
 USBCCIDState *s = USB_CCID_DEV(dev);
 int ret = 0;
 
-- 
1.8.3.1

[Qemu-devel] [RFC PATCH 2/2] spapr: -kernel: allow linking with specified addr

2015-07-17 Thread Andrew Jones

I've started playing with adding ppc support to kvm-unit-tests,
using spapr for the machine model. I wanted to link the unit test
at 0x40 to match qemu's load address, making the unit test
startup code simpler, but ended up with 0x80 instead, due to
how translate_kernel_address works. The translation makes sense
for how Linux kernels are linked (always at 0xc000 or
0xc000), but for the unit test case we need to avoid
adding the offset.

Signed-off-by: Andrew Jones drjo...@redhat.com
---
Big RFC because I don't know if the always at 0xc... statement
is 100% true for Linux, nor if this patch would break other stuff...

 hw/ppc/spapr.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index c1cbf3387ae0c..4f1548f5168db 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -832,6 +832,9 @@ static void spapr_finalize_fdt(sPAPRMachineState *spapr,
 
 static uint64_t translate_kernel_address(void *opaque, uint64_t addr)
 {
+if ((addr  0x0fff) == addr) {
+return addr;
+}
 return (addr  0x0fff) + KERNEL_LOAD_ADDR;
 }
 
-- 
2.4.3

[Qemu-devel] [RFC PATCH 0/2] spapr: changes for kvm-unit-tests

2015-07-17 Thread Andrew Jones

I've started playing with adding ppc support to kvm-unit-tests,
using spapr for the machine model. Here are a couple patches
to spapr I've made.

Thanks,
drew

Andrew Jones (2):
  spapr: add dumpdtb support
  spapr: -kernel: allow linking with specified addr

 hw/ppc/spapr.c | 5 +
 1 file changed, 5 insertions(+)

-- 
2.4.3

Re: [Qemu-devel] [PULL for-2.4 0/4] usb: fixes for 2.4 (ccid, xhci and usb-host)

2015-07-17 Thread Peter Maydell

On 17 July 2015 at 12:29, Gerd Hoffmann kra...@redhat.com wrote:
   Hi,

 Here are usb fixes for 2.4.

 please pull,
   Gerd

 The following changes since commit 2d5ee9e7a7dd495d233cf9613a865f63f88e3375:

   Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150716' into 
 staging (2015-07-16 10:40:23 +0100)

 are available in the git repository at:


   git://git.kraxel.org/qemu tags/pull-usb-20150717-1

 for you to fetch changes up to 92fdfa4bef9c92addcc009dd3e0131172b4fdc78:

   Revert xhci: set timer to retry xfers (2015-07-17 13:20:53 +0200)

 
 usb: fixes for 2.4 (ccid, xhci and usb-host)

Applied, thanks.

-- PMM

Re: [Qemu-devel] [RFC v3 05/13] target-arm: translate: implement qemu_ldlink and qemu_stcond ops

2015-07-17 Thread Alex Bennée


Alvise Rigo a.r...@virtualopensystems.com writes:

 Implement strex and ldrex instruction relying on TCG's qemu_ldlink and
 qemu_stcond.  For the time being only the 32bit instructions are supported.

 Suggested-by: Jani Kokkonen jani.kokko...@huawei.com
 Suggested-by: Claudio Fontana claudio.font...@huawei.com
 Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
 ---
  target-arm/translate.c |  87 ++-
  tcg/arm/tcg-target.c   | 121 
 +
  2 files changed, 178 insertions(+), 30 deletions(-)

 diff --git a/target-arm/translate.c b/target-arm/translate.c
 index 80302cd..0366c76 100644
 --- a/target-arm/translate.c
 +++ b/target-arm/translate.c
 @@ -72,6 +72,8 @@ static TCGv_i64 cpu_exclusive_test;
  static TCGv_i32 cpu_exclusive_info;
  #endif
  
 +static TCGv_i32 cpu_ll_sc_context;
 +
  /* FIXME:  These should be removed.  */
  static TCGv_i32 cpu_F0s, cpu_F1s;
  static TCGv_i64 cpu_F0d, cpu_F1d;
 @@ -103,6 +105,8 @@ void arm_translate_init(void)
  offsetof(CPUARMState, exclusive_addr), exclusive_addr);
  cpu_exclusive_val = tcg_global_mem_new_i64(TCG_AREG0,
  offsetof(CPUARMState, exclusive_val), exclusive_val);
 +cpu_ll_sc_context = tcg_global_mem_new_i32(TCG_AREG0,
 +offsetof(CPUARMState, ll_sc_context), ll_sc_context);
  #ifdef CONFIG_USER_ONLY
  cpu_exclusive_test = tcg_global_mem_new_i64(TCG_AREG0,
  offsetof(CPUARMState, exclusive_test), exclusive_test);
 @@ -961,6 +965,18 @@ DO_GEN_ST(8, MO_UB)
  DO_GEN_ST(16, MO_TEUW)
  DO_GEN_ST(32, MO_TEUL)
  
 +/* Load/Store exclusive generators (always unsigned) */
 +static inline void gen_aa32_ldex32(TCGv_i32 val, TCGv_i32 addr, int index)
 +{
 +tcg_gen_qemu_ldlink_i32(val, addr, index, MO_TEUL | MO_EXCL);
 +}
 +
 +static inline void gen_aa32_stex32(TCGv_i32 is_dirty, TCGv_i32 val,
 +   TCGv_i32 addr, int index)
 +{
 +tcg_gen_qemu_stcond_i32(is_dirty, val, addr, index, MO_TEUL | MO_EXCL);
 +}
 +
  static inline void gen_set_pc_im(DisasContext *s, target_ulong val)
  {
  tcg_gen_movi_i32(cpu_R[15], val);
 @@ -7427,6 +7443,26 @@ static void gen_load_exclusive(DisasContext *s, int 
 rt, int rt2,
  store_reg(s, rt, tmp);
  }
  
 +static void gen_load_exclusive_multi(DisasContext *s, int rt, int rt2,
 + TCGv_i32 addr, int size)
 +{
 +TCGv_i32 tmp = tcg_temp_new_i32();
 +
 +switch (size) {
 +case 0:
 +case 1:
 +abort();
 +case 2:
 +gen_aa32_ldex32(tmp, addr, get_mem_index(s));
 +break;
 +case 3:
 +default:
 +abort();
 +}
 +
 +store_reg(s, rt, tmp);
 +}
 +
  static void gen_clrex(DisasContext *s)
  {
  gen_helper_atomic_clear(cpu_env);
 @@ -7460,6 +7496,52 @@ static void gen_store_exclusive(DisasContext *s, int 
 rd, int rt, int rt2,
  tcg_temp_free_i64(val);
  tcg_temp_free_i32(tmp_size);
  }
 +
 +static void gen_store_exclusive_multi(DisasContext *s, int rd, int rt, int 
 rt2,
 +  TCGv_i32 addr, int size)
 +{
 +TCGv_i32 tmp;
 +TCGv_i32 is_dirty;
 +TCGLabel *done_label;
 +TCGLabel *fail_label;
 +
 +fail_label = gen_new_label();
 +done_label = gen_new_label();
 +
 +tmp = tcg_temp_new_i32();
 +is_dirty = tcg_temp_new_i32();
 +
 +/* Fail if we are not in LL/SC context. */
 +tcg_gen_brcondi_i32(TCG_COND_NE, cpu_ll_sc_context, 1, fail_label);
 +
 +tmp = load_reg(s, rt);
 +switch (size) {
 +case 0:
 +case 1:
 +abort();
 +break;
 +case 2:
 +gen_aa32_stex32(is_dirty, tmp, addr, get_mem_index(s));
 +break;
 +case 3:
 +default:
 +abort();
 +}
 +
 +tcg_temp_free_i32(tmp);
 +
 +/* Check if the store conditional has to fail. */
 +tcg_gen_brcondi_i32(TCG_COND_EQ, is_dirty, 1, fail_label);
 +tcg_temp_free_i32(is_dirty);
 +
 +tcg_temp_free_i32(tmp);
 +
 +tcg_gen_movi_i32(cpu_R[rd], 0); /* is_dirty = 0 */
 +tcg_gen_br(done_label);
 +gen_set_label(fail_label);
 +tcg_gen_movi_i32(cpu_R[rd], 1); /* is_dirty = 1 */
 +gen_set_label(done_label);
 +}
  #endif
  
  /* gen_srs:
 @@ -8308,7 +8390,7 @@ static void disas_arm_insn(DisasContext *s, unsigned 
 int insn)
  } else if (insn  (1  20)) {
  switch (op1) {
  case 0: /* ldrex */
 -gen_load_exclusive(s, rd, 15, addr, 2);
 +gen_load_exclusive_multi(s, rd, 15, addr, 2);
  break;
  case 1: /* ldrexd */
  gen_load_exclusive(s, rd, rd + 1, addr, 3);
 @@ -8326,7 +8408,8 @@ static void disas_arm_insn(DisasContext *s, unsigned 
 int insn)
  rm = insn  0xf;
  switch (op1) {

Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-17 Thread Paolo Bonzini



On 17/07/2015 15:28, Marc Zyngier wrote:
  
  Marc, does it ring any bell?
 Well, this is an example of a guest accessing non-memory using an
 instruction that we cannot safely emulate - not an IO accessor (load
 multiple, for example).
 
 In this case, we kill the guest (we could as well inject a fault).

Yup, I later found this in the dmesg:

[1875219.903969] kvm [14843]: load/store instruction decoding not implemented

 This vcpu seems to be accessing 0x918 (the mmio structure points
 there), but I can't immediately relate it to the content of the
 registers.

0x918 is the pl011, which makes some sense.

Have you ever seen a corrupted register dump?  The guest RAM goes
from 0x4000 to 0xbfff, so SP is pointing outside memory.

  PC=bf671210  SP=c1f0
  X00=0a003e70 X01= X02=bf680548 
  X03=0030
  X04=bbb5fc18 X05=004b7770 X06=bf721930 
  X07=009a
  X08=bf716858 X09=0090 X10= 
  X11=0046
  X12=a007e03a X13= X14= 
  X15=
  X16=bf716df0 X17= X18= 
  X19=bf6f5f18
  X20= X21= X22= 
  X23=
  X24= X25= X26= 
  X27=
  X28= X29= X30= 
  PSTATE=6305 (flags -ZC-)

If the register dump is not corrupted, execution went in the weeds...
I don't have the guest anymore but it's just firmware so the memory
contents are well reproducible:

0xbf671200:  a97d6ffa  ldmdbge  sp!, {r1, r3, r4, r5, r6, r7, 
r8, r9, sl, fp, sp, lr}^
0xbf671204:  a97e77fc  ldmdbge  lr!, {r2, r3, r4, r5, r6, r7, 
r8, r9, sl, ip, sp, lr}^
0xbf671208:  f85f03fe  undefined instruction 0xf85f03fe
0xbf67120c:  910803ff  strdls   r0, [r8, -pc]
0xbf671210:  ad7007e0  ldclge   7, cr0, [r0, #-896]!
0xbf671214:  ad710fe2  ldclge   15, cr0, [r1, #-904]!
0xbf671218:  ad7217e4  ldclge   7, cr1, [r2, #-912]!
0xbf67121c:  ad731fe6  ldclge   15, cr1, [r3, #-920]!
0xbf671220:  ad7427e8  ldclge   7, cr2, [r4, #-928]!
0xbf671224:  ad752fea  ldclge   15, cr2, [r5, #-936]!

 What looks a bit weird is that all the registers are clamped to 32bit,
 but the PSTATE indicates it is running in 64bit (EL1h, which makes the
 PC being utterly wrong).

The PC can be okay since UEFI code doesn't really use virtual addressing,
but the other registers are weird indeed.

 What are you running there?

Just firmware.  It's a UEFI reboot loop (as soon as you get to the
UEFI shell QEMU exits and the script starts a new one).

The kernel is an old 3.19 one, I'll upgrade and retest.

Paolo

[Qemu-devel] [PATCH 1/4] Revert vhost-user: add multi queue support

2015-07-17 Thread Michael S. Tsirkin

This reverts commit 830d70db692e374b5f4407f96a1ceefdcc97.

The interface as merged isn't fully backwards-compatible with existing
clients. Revert it.  Let's redo this after 2.4, based on protocol
extensions in follow-up patches.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 qapi-schema.json  |  6 +-
 hw/net/vhost_net.c|  3 +--
 hw/virtio/vhost-user.c| 11 +--
 net/vhost-user.c  | 37 +
 docs/specs/vhost-user.txt |  5 -
 qemu-options.hx   |  5 ++---
 6 files changed, 18 insertions(+), 49 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index 106008c..88a3883 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2463,16 +2463,12 @@
 #
 # @vhostforce: #optional vhost on for non-MSIX virtio guests (default: false).
 #
-# @queues: #optional number of queues to be created for multiqueue vhost-user
-#  (default: 1) (Since 2.4)
-#
 # Since 2.1
 ##
 { 'struct': 'NetdevVhostUserOptions',
   'data': {
 'chardev':'str',
-'*vhostforce':'bool',
-'*queues':'uint32' } }
+'*vhostforce':'bool' } }
 
 ##
 # @NetClientOptions
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 9bd360b..5c1d11f 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -160,7 +160,6 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
 
 net-dev.nvqs = 2;
 net-dev.vqs = net-vqs;
-net-dev.vq_index = net-nc-queue_index;
 
 r = vhost_dev_init(net-dev, options-opaque,
options-backend_type);
@@ -287,7 +286,7 @@ static void vhost_net_stop_one(struct vhost_net *net,
 for (file.index = 0; file.index  net-dev.nvqs; ++file.index) {
 const VhostOps *vhost_ops = net-dev.vhost_ops;
 int r = vhost_ops-vhost_call(net-dev, VHOST_RESET_OWNER,
-  file);
+  NULL);
 assert(r = 0);
 }
 }
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index d6f2163..e7ab829 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -210,12 +210,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 break;
 
 case VHOST_SET_OWNER:
-break;
-
 case VHOST_RESET_OWNER:
-memcpy(msg.state, arg, sizeof(struct vhost_vring_state));
-msg.state.index += dev-vq_index;
-msg.size = sizeof(m.state);
 break;
 
 case VHOST_SET_MEM_TABLE:
@@ -258,20 +253,17 @@ static int vhost_user_call(struct vhost_dev *dev, 
unsigned long int request,
 case VHOST_SET_VRING_NUM:
 case VHOST_SET_VRING_BASE:
 memcpy(msg.state, arg, sizeof(struct vhost_vring_state));
-msg.state.index += dev-vq_index;
 msg.size = sizeof(m.state);
 break;
 
 case VHOST_GET_VRING_BASE:
 memcpy(msg.state, arg, sizeof(struct vhost_vring_state));
-msg.state.index += dev-vq_index;
 msg.size = sizeof(m.state);
 need_reply = 1;
 break;
 
 case VHOST_SET_VRING_ADDR:
 memcpy(msg.addr, arg, sizeof(struct vhost_vring_addr));
-msg.addr.index += dev-vq_index;
 msg.size = sizeof(m.addr);
 break;
 
@@ -279,7 +271,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 case VHOST_SET_VRING_CALL:
 case VHOST_SET_VRING_ERR:
 file = arg;
-msg.u64 = (file-index + dev-vq_index)  VHOST_USER_VRING_IDX_MASK;
+msg.u64 = file-index  VHOST_USER_VRING_IDX_MASK;
 msg.size = sizeof(m.u64);
 if (ioeventfd_enabled()  file-fd  0) {
 fds[fd_num++] = file-fd;
@@ -321,7 +313,6 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 error_report(Received bad msg size.);
 return -1;
 }
-msg.state.index -= dev-vq_index;
 memcpy(arg, msg.state, sizeof(struct vhost_vring_state));
 break;
 default:
diff --git a/net/vhost-user.c b/net/vhost-user.c
index b51bc04..93dcecd 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -120,39 +120,35 @@ static void net_vhost_user_event(void *opaque, int event)
 case CHR_EVENT_OPENED:
 vhost_user_start(s);
 net_vhost_link_down(s, false);
-error_report(chardev \%s\ went up, s-nc.info_str);
+error_report(chardev \%s\ went up, s-chr-label);
 break;
 case CHR_EVENT_CLOSED:
 net_vhost_link_down(s, true);
 vhost_user_stop(s);
-error_report(chardev \%s\ went down, s-nc.info_str);
+error_report(chardev \%s\ went down, s-chr-label);
 break;
 }
 }
 
 static int net_vhost_user_init(NetClientState *peer, const char *device,
-   const char *name, CharDriverState *chr,
-   uint32_t queues)
+   const char

[Qemu-devel] [PATCH 0/4] vhost-user: protocol updates

2015-07-17 Thread Michael S. Tsirkin

This patchset sets the stage for extending the vhost user
protocol, with full backwards compatibility.

The approach is to negotiate feature bits queried and
acknowledged during device setup.

For now, there's no new functionality: two new messages
are added that will allow negotiating new messages
required for functionality such as MQ and migration.

For now, I used the feature bit 30 to signal support for these new messages,
and we now have 64 more bits to play.

The patches can be found in my tree, branch vhost-user.

Only patch 1 is intended for 2.4.

Posting early so people working on extensions such as
migration can review this - but please note
the protocol is not set in stone yet.

Michael S. Tsirkin (4):
  Revert vhost-user: add multi queue support
  vhost-user: refactor ioctl translation
  vhost-user: add protocol feature negotiation
  vhost-user: unit test for new messages

 qapi-schema.json  |   6 +-
 include/hw/virtio/vhost.h |   1 +
 hw/net/vhost_net.c|   5 +-
 hw/virtio/vhost-user.c| 150 ++
 net/vhost-user.c  |  37 
 tests/vhost-user-test.c   |  19 ++
 docs/specs/vhost-user.txt |  40 +++--
 qemu-options.hx   |   5 +-
 8 files changed, 174 insertions(+), 89 deletions(-)

-- 
MST

[Qemu-devel] [PATCH 3/4] vhost-user: add protocol feature negotiation

2015-07-17 Thread Michael S. Tsirkin

Support a separate bitmask for vhost-user protocol features,
and messages to get/set protocol features.

Invoke them at init.

No features are defined yet.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 include/hw/virtio/vhost.h |  1 +
 hw/net/vhost_net.c|  2 ++
 hw/virtio/vhost-user.c| 52 +++
 docs/specs/vhost-user.txt | 37 +
 4 files changed, 92 insertions(+)

diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index dd51050..6467c73 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -47,6 +47,7 @@ struct vhost_dev {
 unsigned long long features;
 unsigned long long acked_features;
 unsigned long long backend_features;
+unsigned long long protocol_features;
 bool started;
 bool log_enabled;
 unsigned long long log_size;
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 5c1d11f..c864237 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -152,8 +152,10 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
 net-dev.backend_features = qemu_has_vnet_hdr(options-net_backend)
 ? 0 : (1ULL  VHOST_NET_F_VIRTIO_NET_HDR);
 net-backend = r;
+net-dev.protocol_features = 0;
 } else {
 net-dev.backend_features = 0;
+net-dev.protocol_features = 0;
 net-backend = -1;
 }
 net-nc = options-net_backend;
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 5a83d00..c4428a1 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -25,6 +25,9 @@
 
 #define VHOST_MEMORY_MAX_NREGIONS8
 
+#define VHOST_USER_F_PROTOCOL_FEATURES 30
+#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x0ULL
+
 typedef enum VhostUserRequest {
 VHOST_USER_NONE = 0,
 VHOST_USER_GET_FEATURES = 1,
@@ -41,6 +44,8 @@ typedef enum VhostUserRequest {
 VHOST_USER_SET_VRING_KICK = 12,
 VHOST_USER_SET_VRING_CALL = 13,
 VHOST_USER_SET_VRING_ERR = 14,
+VHOST_USER_GET_PROTOCOL_FEATURES = 15,
+VHOST_USER_SET_PROTOCOL_FEATURES = 26,
 VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -332,10 +337,57 @@ static int vhost_user_call(struct vhost_dev *dev, 
unsigned long int request,
 
 static int vhost_user_init(struct vhost_dev *dev, void *opaque)
 {
+VhostUserMsg msg = { 0 };
+int err;
+
 assert(dev-vhost_ops-backend_type == VHOST_BACKEND_TYPE_USER);
 
 dev-opaque = opaque;
 
+msg.request = VHOST_USER_GET_FEATURES;
+msg.flags = VHOST_USER_VERSION;
+msg.size = 0;
+
+err = vhost_user_write(dev, msg, NULL, 0);
+if (err  0) {
+return err;
+}
+
+err = vhost_user_read(dev, msg);
+if (err  0) {
+return err;
+}
+
+if (__virtio_has_feature(msg.u64, VHOST_USER_F_PROTOCOL_FEATURES)) {
+dev-backend_features |= 1ULL  VHOST_USER_F_PROTOCOL_FEATURES;
+
+msg.request = VHOST_USER_GET_PROTOCOL_FEATURES;
+msg.flags = VHOST_USER_VERSION;
+msg.size = 0;
+
+err = vhost_user_write(dev, msg, NULL, 0);
+if (err  0) {
+return err;
+}
+
+err = vhost_user_read(dev, msg);
+if (err  0) {
+return err;
+}
+
+dev-protocol_features = msg.u64  VHOST_USER_PROTOCOL_FEATURE_MASK;
+
+msg.request = VHOST_USER_SET_PROTOCOL_FEATURES;
+msg.flags = VHOST_USER_VERSION;
+msg.u64 = dev-protocol_features;
+msg.size = sizeof msg.u64;
+
+err = vhost_user_write(dev, msg, NULL, 0);
+if (err  0) {
+return err;
+}
+}
+
 return 0;
 }
 
diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 650bb18..0062baa 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -113,6 +113,7 @@ message replies. Most of the requests don't require 
replies. Here is a list of
 the ones that do:
 
  * VHOST_GET_FEATURES
+ * VHOST_GET_PROTOCOL_FEATURES
  * VHOST_GET_VRING_BASE
 
 There are several messages that the master sends with file descriptors passed
@@ -127,6 +128,13 @@ in the ancillary data:
 If Master is unable to send the full message or receives a wrong reply it will
 close the connection. An optional reconnection mechanism can be implemented.
 
+Any protocol extensions are gated by protocol feature bits,
+which allows full backwards compatibility on both master
+and slave.
+As older slaves don't support negotiating protocol features,
+a feature bit was dedicated for this purpose:
+#define VHOST_USER_F_PROTOCOL_FEATURES 30
+
 Message types
 -
 
@@ -138,6 +146,8 @@ Message types
   Slave payload: u64
 
   Get from the underlying vhost implementation the features bitmask.
+  Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for
+  VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES.
 
  * VHOST_USER_SET_FEATURES
 
@@ -146,6 +156,33 @@ Message types
   Master payload: u64

[Qemu-devel] [PATCH 2/4] vhost-user: refactor ioctl translation

2015-07-17 Thread Michael S. Tsirkin

translate at the outset, have rest of code use vhost-user features
exclusively.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 hw/virtio/vhost-user.c | 87 ++
 1 file changed, 46 insertions(+), 41 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index e7ab829..5a83d00 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -89,35 +89,40 @@ static bool ioeventfd_enabled(void)
 return kvm_enabled()  kvm_eventfds_enabled();
 }
 
-static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
--1, /* VHOST_USER_NONE */
-VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */
-VHOST_SET_FEATURES, /* VHOST_USER_SET_FEATURES */
-VHOST_SET_OWNER,/* VHOST_USER_SET_OWNER */
-VHOST_RESET_OWNER,  /* VHOST_USER_RESET_OWNER */
-VHOST_SET_MEM_TABLE,/* VHOST_USER_SET_MEM_TABLE */
-VHOST_SET_LOG_BASE, /* VHOST_USER_SET_LOG_BASE */
-VHOST_SET_LOG_FD,   /* VHOST_USER_SET_LOG_FD */
-VHOST_SET_VRING_NUM,/* VHOST_USER_SET_VRING_NUM */
-VHOST_SET_VRING_ADDR,   /* VHOST_USER_SET_VRING_ADDR */
-VHOST_SET_VRING_BASE,   /* VHOST_USER_SET_VRING_BASE */
-VHOST_GET_VRING_BASE,   /* VHOST_USER_GET_VRING_BASE */
-VHOST_SET_VRING_KICK,   /* VHOST_USER_SET_VRING_KICK */
-VHOST_SET_VRING_CALL,   /* VHOST_USER_SET_VRING_CALL */
-VHOST_SET_VRING_ERR /* VHOST_USER_SET_VRING_ERR */
-};
-
 static VhostUserRequest vhost_user_request_translate(unsigned long int request)
 {
-VhostUserRequest idx;
-
-for (idx = 0; idx  VHOST_USER_MAX; idx++) {
-if (ioctl_to_vhost_user_request[idx] == request) {
-break;
-}
+switch (request) {
+case VHOST_GET_FEATURES:
+return VHOST_USER_GET_FEATURES;
+case VHOST_SET_FEATURES:
+return VHOST_USER_SET_FEATURES;
+case VHOST_SET_OWNER:
+return VHOST_USER_SET_OWNER;
+case VHOST_RESET_OWNER:
+return VHOST_USER_RESET_OWNER;
+case VHOST_SET_MEM_TABLE:
+return VHOST_USER_SET_MEM_TABLE;
+case VHOST_SET_LOG_BASE:
+return VHOST_USER_SET_LOG_BASE;
+case VHOST_SET_LOG_FD:
+return VHOST_USER_SET_LOG_FD;
+case VHOST_SET_VRING_NUM:
+return VHOST_USER_SET_VRING_NUM;
+case VHOST_SET_VRING_ADDR:
+return VHOST_USER_SET_VRING_ADDR;
+case VHOST_SET_VRING_BASE:
+return VHOST_USER_SET_VRING_BASE;
+case VHOST_GET_VRING_BASE:
+return VHOST_USER_GET_VRING_BASE;
+case VHOST_SET_VRING_KICK:
+return VHOST_USER_SET_VRING_KICK;
+case VHOST_SET_VRING_CALL:
+return VHOST_USER_SET_VRING_CALL;
+case VHOST_SET_VRING_ERR:
+return VHOST_USER_SET_VRING_ERR;
+default:
+return VHOST_USER_MAX;
 }
-
-return (idx == VHOST_USER_MAX) ? VHOST_USER_NONE : idx;
 }
 
 static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
@@ -198,22 +203,22 @@ static int vhost_user_call(struct vhost_dev *dev, 
unsigned long int request,
 msg.flags = VHOST_USER_VERSION;
 msg.size = 0;
 
-switch (request) {
-case VHOST_GET_FEATURES:
+switch (msg_request) {
+case VHOST_USER_GET_FEATURES:
 need_reply = 1;
 break;
 
-case VHOST_SET_FEATURES:
-case VHOST_SET_LOG_BASE:
+case VHOST_USER_SET_FEATURES:
+case VHOST_USER_SET_LOG_BASE:
 msg.u64 = *((__u64 *) arg);
 msg.size = sizeof(m.u64);
 break;
 
-case VHOST_SET_OWNER:
-case VHOST_RESET_OWNER:
+case VHOST_USER_SET_OWNER:
+case VHOST_USER_RESET_OWNER:
 break;
 
-case VHOST_SET_MEM_TABLE:
+case VHOST_USER_SET_MEM_TABLE:
 for (i = 0; i  dev-mem-nregions; ++i) {
 struct vhost_memory_region *reg = dev-mem-regions + i;
 ram_addr_t ram_addr;
@@ -246,30 +251,30 @@ static int vhost_user_call(struct vhost_dev *dev, 
unsigned long int request,
 
 break;
 
-case VHOST_SET_LOG_FD:
+case VHOST_USER_SET_LOG_FD:
 fds[fd_num++] = *((int *) arg);
 break;
 
-case VHOST_SET_VRING_NUM:
-case VHOST_SET_VRING_BASE:
+case VHOST_USER_SET_VRING_NUM:
+case VHOST_USER_SET_VRING_BASE:
 memcpy(msg.state, arg, sizeof(struct vhost_vring_state));
 msg.size = sizeof(m.state);
 break;
 
-case VHOST_GET_VRING_BASE:
+case VHOST_USER_GET_VRING_BASE:
 memcpy(msg.state, arg, sizeof(struct vhost_vring_state));
 msg.size = sizeof(m.state);
 need_reply = 1;
 break;
 
-case VHOST_SET_VRING_ADDR:
+case VHOST_USER_SET_VRING_ADDR:
 memcpy(msg.addr, arg, sizeof(struct vhost_vring_addr));
 msg.size = sizeof(m.addr);
 break;
 
-case VHOST_SET_VRING_KICK:
-case VHOST_SET_VRING_CALL:
-case VHOST_SET_VRING_ERR:
+case VHOST_USER_SET_VRING_KICK:
+case VHOST_USER_SET_VRING_CALL:
+case VHOST_USER_SET_VRING_ERR:
 file = arg;

[Qemu-devel] [PATCH 4/4] vhost-user: unit test for new messages

2015-07-17 Thread Michael S. Tsirkin

Data is empty for now, but do make sure master
sets the new feature bit flag.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 tests/vhost-user-test.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
index 75fedf0..228acb6 100644
--- a/tests/vhost-user-test.c
+++ b/tests/vhost-user-test.c
@@ -53,6 +53,8 @@
 
 #define VHOST_MEMORY_MAX_NREGIONS8
 
+#define VHOST_USER_F_PROTOCOL_FEATURES 30
+
 typedef enum VhostUserRequest {
 VHOST_USER_NONE = 0,
 VHOST_USER_GET_FEATURES = 1,
@@ -69,6 +71,8 @@ typedef enum VhostUserRequest {
 VHOST_USER_SET_VRING_KICK = 12,
 VHOST_USER_SET_VRING_CALL = 13,
 VHOST_USER_SET_VRING_ERR = 14,
+VHOST_USER_GET_PROTOCOL_FEATURES = 15,
+VHOST_USER_SET_PROTOCOL_FEATURES = 16,
 VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -293,11 +297,26 @@ static void chr_read(void *opaque, const uint8_t *buf, 
int size)
 /* send back features to qemu */
 msg.flags |= VHOST_USER_REPLY_MASK;
 msg.size = sizeof(m.u64);
+msg.u64 = 0x1ULL  VHOST_USER_F_PROTOCOL_FEATURES;
+p = (uint8_t *) msg;
+qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size);
+break;
+
+case VHOST_USER_SET_FEATURES:
+   g_assert_cmpint(msg.u64  (0x1ULL  VHOST_USER_F_PROTOCOL_FEATURES),
+   !=, 0ULL);
+break;
+
+case VHOST_USER_GET_PROTOCOL_FEATURES:
+/* send back features to qemu */
+msg.flags |= VHOST_USER_REPLY_MASK;
+msg.size = sizeof(m.u64);
 msg.u64 = 0;
 p = (uint8_t *) msg;
 qemu_chr_fe_write_all(chr, p, VHOST_USER_HDR_SIZE + msg.size);
 break;
 
+
 case VHOST_USER_GET_VRING_BASE:
 /* send back vring base to qemu */
 msg.flags |= VHOST_USER_REPLY_MASK;
-- 
MST

[Qemu-devel] [PATCH] Merge memory_region_init_reservation() into memory_region_init_io()

2015-07-17 Thread Pavel Fedin

Just speficying ops = NULL in some cases can be more convenient than having
two functions. See vGICv3 v5 patch set for at least one such case.

GICv2 code can be refactored in a way similar to what is done in the mentioned 
patchset, killing
some code duplication.

Signed-off-by: Pavel Fedin p.fe...@samsung.com
---
 include/exec/memory.h | 14 +++---
 memory.c  | 10 +-
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 1394715..3b5e44e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -436,6 +436,9 @@ void memory_region_init_alias(MemoryRegion *mr,
  * memory_region_init_rom_device:  Initialize a ROM memory region.  Writes are
  * handled via callbacks.
  *
+ * If NULL callbacks pointer is given, then I/O space is not supposed to be
+ * handled by QEMU itself. Any access via the memory API will cause an abort().
+ *
  * @mr: the #MemoryRegion to be initialized.
  * @owner: the object that tracks the region's reference count
  * @ops: callbacks for write access handling.
@@ -458,16 +461,21 @@ void memory_region_init_rom_device(MemoryRegion *mr,
  * A reservation region primariy serves debugging purposes.  It claims I/O
  * space that is not supposed to be handled by QEMU itself.  Any access via
  * the memory API will cause an abort().
+ * This function is deprecated. Use memory_region_init_io() with NULL
+ * callbacks instead.
  *
  * @mr: the #MemoryRegion to be initialized
  * @owner: the object that tracks the region's reference count
  * @name: used for debugging; not visible to the user or ABI
  * @size: size of the region.
  */
-void memory_region_init_reservation(MemoryRegion *mr,
-struct Object *owner,
+static inline void memory_region_init_reservation(MemoryRegion *mr,
+Object *owner,
 const char *name,
-uint64_t size);
+uint64_t size)
+{
+memory_region_init_io(mr, owner, NULL, mr, name, size);
+}
 
 /**
  * memory_region_init_iommu: Initialize a memory region that translates
diff --git a/memory.c b/memory.c
index 0acebb1..b843952 100644
--- a/memory.c
+++ b/memory.c
@@ -1187,7 +1187,7 @@ void memory_region_init_io(MemoryRegion *mr,
uint64_t size)
 {
 memory_region_init(mr, owner, name, size);
-mr-ops = ops;
+mr-ops = ops ? ops : unassigned_mem_ops;
 mr-opaque = opaque;
 mr-terminates = true;
 }
@@ -1307,14 +1307,6 @@ void memory_region_init_iommu(MemoryRegion *mr,
 notifier_list_init(mr-iommu_notify);
 }
 
-void memory_region_init_reservation(MemoryRegion *mr,
-Object *owner,
-const char *name,
-uint64_t size)
-{
-memory_region_init_io(mr, owner, unassigned_mem_ops, mr, name, size);
-}
-
 static void memory_region_finalize(Object *obj)
 {
 MemoryRegion *mr = MEMORY_REGION(obj);
-- 
1.9.5.msysgit.0

Re: [Qemu-devel] [PATCH for-2.4 1/4] Revert vhost-user: add multi queue support

2015-07-17 Thread Eric Blake

On 07/17/2015 08:09 AM, Michael S. Tsirkin wrote:
 This reverts commit 830d70db692e374b5f4407f96a1ceefdcc97.
 
 The interface as merged isn't fully backwards-compatible with existing
 clients. Revert it.  Let's redo this after 2.4, based on protocol
 extensions in follow-up patches.
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 ---
  qapi-schema.json  |  6 +-
  hw/net/vhost_net.c|  3 +--
  hw/virtio/vhost-user.c| 11 +--
  net/vhost-user.c  | 37 +
  docs/specs/vhost-user.txt |  5 -
  qemu-options.hx   |  5 ++---
  6 files changed, 18 insertions(+), 49 deletions(-)

Reviewed-by: Eric Blake ebl...@redhat.com

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-17 Thread Marc Zyngier

On Fri, 17 Jul 2015 15:04:27 +0100
Paolo Bonzini pbonz...@redhat.com wrote:

 
 
 On 17/07/2015 15:28, Marc Zyngier wrote:
   
   Marc, does it ring any bell?
  Well, this is an example of a guest accessing non-memory using an
  instruction that we cannot safely emulate - not an IO accessor (load
  multiple, for example).
  
  In this case, we kill the guest (we could as well inject a fault).
 
 Yup, I later found this in the dmesg:
 
 [1875219.903969] kvm [14843]: load/store instruction decoding not implemented
 
  This vcpu seems to be accessing 0x918 (the mmio structure points
  there), but I can't immediately relate it to the content of the
  registers.
 
 0x918 is the pl011, which makes some sense.
 
 Have you ever seen a corrupted register dump?  The guest RAM goes
 from 0x4000 to 0xbfff, so SP is pointing outside memory.

I've never seen such a corruption - so far.

   PC=bf671210  SP=c1f0
   X00=0a003e70 X01= X02=bf680548 
   X03=0030
   X04=bbb5fc18 X05=004b7770 X06=bf721930 
   X07=009a
   X08=bf716858 X09=0090 X10= 
   X11=0046
   X12=a007e03a X13= X14= 
   X15=
   X16=bf716df0 X17= X18= 
   X19=bf6f5f18
   X20= X21= X22= 
   X23=
   X24= X25= X26= 
   X27=
   X28= X29= X30= 
   PSTATE=6305 (flags -ZC-)
 
 If the register dump is not corrupted, execution went in the weeds...
 I don't have the guest anymore but it's just firmware so the memory
 contents are well reproducible:
 
 0xbf671200:  a97d6ffa  ldmdbgesp!, {r1, r3, r4, r5, r6, r7, 
 r8, r9, sl, fp, sp, lr}^
 0xbf671204:  a97e77fc  ldmdbgelr!, {r2, r3, r4, r5, r6, r7, 
 r8, r9, sl, ip, sp, lr}^
 0xbf671208:  f85f03fe  undefined instruction 0xf85f03fe
 0xbf67120c:  910803ff  strdls r0, [r8, -pc]
 0xbf671210:  ad7007e0  ldclge 7, cr0, [r0, #-896]!
 0xbf671214:  ad710fe2  ldclge 15, cr0, [r1, #-904]!
 0xbf671218:  ad7217e4  ldclge 7, cr1, [r2, #-912]!
 0xbf67121c:  ad731fe6  ldclge 15, cr1, [r3, #-920]!
 0xbf671220:  ad7427e8  ldclge 7, cr2, [r4, #-928]!
 0xbf671224:  ad752fea  ldclge 15, cr2, [r5, #-936]!

But that's all 32bit code, and your guest was running in 64bit. What
does it look like as A64?

  What looks a bit weird is that all the registers are clamped to 32bit,
  but the PSTATE indicates it is running in 64bit (EL1h, which makes the
  PC being utterly wrong).
 
 The PC can be okay since UEFI code doesn't really use virtual addressing,
 but the other registers are weird indeed.

It definitely looks like something tramped on your register file. KVM
doesn't do that at all (we use the whole AArch64 register file
irrespective of the execution state).

Is your UEFI guest 32 or 64bit?

M.
-- 
Jazz is not dead. It just smells funny.

[Qemu-devel] [PATCH v2 2/2] blockdev: always compile in -drive aio= parsing

2015-07-17 Thread Stefan Hajnoczi

CONFIG_LINUX_AIO is an implementation detail of raw-posix.c.  Don't
mention CONFIG_LINUX_AIO in blockdev.c.  Let raw-posix.c decide what to
do with BDRV_O_NATIVE_AIO if CONFIG_LINUX_AIO is not defined.

Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 blockdev.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 62a4586..37b91c8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -405,7 +405,6 @@ static BlockBackend *blockdev_init(const char *file, QDict 
*bs_opts,
 bdrv_flags |= BDRV_O_NO_FLUSH;
 }
 
-#ifdef CONFIG_LINUX_AIO
 if ((buf = qemu_opt_get(opts, aio)) != NULL) {
 if (!strcmp(buf, native)) {
 bdrv_flags |= BDRV_O_NATIVE_AIO;
@@ -416,7 +415,6 @@ static BlockBackend *blockdev_init(const char *file, QDict 
*bs_opts,
goto early_err;
 }
 }
-#endif
 
 if ((buf = qemu_opt_get(opts, format)) != NULL) {
 if (is_help_option(buf)) {
-- 
2.4.3

[Qemu-devel] [PATCH v2 0/2] block: warn about aio=native if libaio is unavailable

2015-07-17 Thread Stefan Hajnoczi

v2:
 * Banish CONFIG_LINUX_AIO from blockdev.c, that is raw-posix.c's business
   [Kevin]
 * Print the warning in the same way as the aio=native,cache.direct=off
   deprecation warning [Kevin]

Open question: what about the Windows case?  We now pass the
FILE_FLAG_OVERLAPPED flag which we didn't do before for -drive aio=native.

Stefan Hajnoczi (2):
  raw-posix: warn about BDRV_O_NATIVE_AIO if libaio is unavailable
  blockdev: always compile in -drive aio= parsing

 block/raw-posix.c | 11 ++-
 blockdev.c|  2 --
 2 files changed, 10 insertions(+), 3 deletions(-)

-- 
2.4.3

[Qemu-devel] [PATCH v2 1/2] raw-posix: warn about BDRV_O_NATIVE_AIO if libaio is unavailable

2015-07-17 Thread Stefan Hajnoczi

raw-posix.c silently ignores BDRV_O_NATIVE_AIO if libaio is unavailable.
It is confusing when aio=native performance is identical to aio=threads
because the binary was accidentally built without libaio.

Print a deprecation warning if -drive aio=native is used with a binary
that does not support libaio.  There are probably users using aio=native
who would be inconvenienced if QEMU suddenly refused to start their
guests.  In the future this will become an error.

Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
 block/raw-posix.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 855febe..e09019c 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -519,7 +519,16 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
  future QEMU versions.\n,
  bs-filename);
 }
-#endif
+#else
+if (bdrv_flags  BDRV_O_NATIVE_AIO) {
+error_printf(WARNING: aio=native was specified for '%s', but 
+ is not supported in this build. Falling back to 
+ aio=threads.\n
+  This will become an error condition in 
+ future QEMU versions.\n,
+ bs-filename);
+}
+#endif /* !defined(CONFIG_LINUX_AIO) */
 
 s-has_discard = true;
 s-has_write_zeroes = true;
-- 
2.4.3

Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-17 Thread Laszlo Ersek

On 07/17/15 15:28, Marc Zyngier wrote:
 On Fri, 17 Jul 2015 10:30:38 +0100
 Paolo Bonzini pbonz...@redhat.com wrote:
 


 On 17/07/2015 06:44, Paolo Bonzini wrote:


 On 16/07/2015 21:05, Richard W.M. Jones wrote:

 Sorry to spoil things, but I'm still seeing this bug, although it is
 now a lot less frequent with your patch.  I would estimate it happens
 more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs
 with qemu.git + the v2 patch series.

 It's the exact same hang in both cases.

 Is it possible that this patch doesn't completely close any race?

 Still, it is an improvement, so there is that.

 Would seem at first glance like a different bug.

 Interestingly, adding some tracing (qemu_clock_get_ns) makes the bug
 more likely: now it reproduces in about 10 tries.  Of course :) adding
 other kinds of tracing instead make it go away again (50 tries).

 Perhaps this:

i/o thread vcpu thread   worker thread
-
lock_iothread
notify_me = 1
...
unlock_iothread
   lock_iothread
   notify_me = 3
   ppoll
   notify_me = 1
  bh-scheduled = 1
  event_notifier_set
   event_notifier_test_and_clear
ppoll
 ^^ hang
 
 In the exact shape above, it doesn't seem too likely to happen, but
 perhaps there's another simpler case.  Still, the bug exists.

 The above is not really related to notify_me.  Here the notification is
 not being optimized away!  So I wonder if this one has been there forever.

 Fam suggested putting the event_notifier_test_and_clear before
 aio_bh_poll(), but it does not work.  I'll look more close

 However, an unconditional event_notifier_test_and_clear is pretty
 expensive.  On one hand, obviously correctness comes first.  On the
 other hand, an expensive operation at the wrong place can mask the race
 very easily; I'll let the fix run for a while, but I'm not sure if a
 successful test really says anything useful.

 So it may not be useful, but still successful test is successful. :)
 The following patch, which also includes the delta between v2 and v3
 of this series, survived 674 reboots before hitting a definitely
 unrelated problem:

 error: kvm run failed Function not implemented
 PC=bf671210  SP=c1f0
 X00=0a003e70 X01= X02=bf680548 
 X03=0030
 X04=bbb5fc18 X05=004b7770 X06=bf721930 
 X07=009a
 X08=bf716858 X09=0090 X10= 
 X11=0046
 X12=a007e03a X13= X14= 
 X15=
 X16=bf716df0 X17= X18= 
 X19=bf6f5f18
 X20= X21= X22= 
 X23=
 X24= X25= X26= 
 X27=
 X28= X29= X30= 
 PSTATE=6305 (flags -ZC-)

 For the record, this is the kvm_run struct:

 $6 = {request_interrupt_window = 0 '\000', padding1 = 
 \000\000\000\000\000\000, exit_reason = 0, 
   ready_for_interrupt_injection = 0 '\000', if_flag = 0 '\000', flags = 0, 
 cr8 = 0, apic_base = 0, {hw = {
   hardware_exit_reason = 150994968}, fail_entry = 
 {hardware_entry_failure_reason = 150994968}, ex = {
   exception = 150994968, error_code = 0}, io = {direction = 24 '\030', 
 size = 0 '\000', port = 2304, 
   count = 0, data_offset = 144}, debug = {arch = {No data fields}}, 
 mmio = {phys_addr = 150994968, 
   data = \220\000\000\000\000\000\000, len = 4, is_write = 0 '\000'}, 
 hypercall = {nr = 150994968, 
   args = {144, 4, 0, 0, 0, 0}, ret = 0, longmode = 0, pad = 0}, 
 tpr_access = {rip = 150994968, 
   is_write = 144, pad = 0}, s390_sieic = {icptcode = 24 '\030', ipa = 
 2304, ipb = 0}, 
 s390_reset_flags = 150994968, s390_ucontrol = {trans_exc_code = 
 150994968, pgm_code = 144}, dcr = {
   dcrn = 150994968, data = 0, is_write = 144 '\220'}, internal = 
 {suberror = 150994968, ndata = 0, 
   data = {144, 4, 0 repeats 14 times}}, osi = {gprs = {150994968, 144, 
 4, 0 repeats 29 times}}, 
 papr_hcall = {nr = 150994968, ret = 144, args = {4, 0, 0, 0, 0, 0, 0, 0, 
 0}}, s390_tsch = {
   subchannel_id = 24, subchannel_nr = 2304, io_int_parm = 0, io_int_word 
 = 144, ipb = 0, 
   dequeued = 4 '\004'}, epr = {epr = 150994968}, system_event = {type = 
 150994968, flags = 144}, 
 s390_stsi = {addr = 150994968, ar = 144 '\220', reserved = 0 '\000', fc 
 = 0 '\000', sel1 = 0 '\000', 
   sel2 = 0}, 
 padding = 
 \030\000\000\t\000\000\000\000\220\000\000\000\000\000\000\000\004, '\000' 
 repeats 238 times}, kvm_valid_regs = 0,

[Qemu-devel] STM32F407-Discovery led blinking video

2015-07-17 Thread Liviu Ionescu

as planned, the GNU ARM Eclipse QEMU branch is now fully functional for simple 
LED blinking applications.

the board image is displayed in a SDL window and the LEDs are coloured 
rectangles shown over the image.

a short demo video is available from:
 
  http://gnuarmeclipse.livius.net/mw/images/f/fa/Stm32f4-discovery.mov


regards,

Liviu

Re: [Qemu-devel] [PATCH v2] raw-posix.c: Make physical devices usable in QEMU under Mac OS X host

2015-07-17 Thread Stefan Hajnoczi

On Thu, Jul 16, 2015 at 04:46:07PM -0400, Programmingkid wrote:
 @@ -2014,7 +2015,9 @@ kern_return_t GetBSDPath( io_iterator_t mediaIterator, 
 char *bsdPath, CFIndex ma
  if ( bsdPathAsCFString ) {
  size_t devPathLength;
  strcpy( bsdPath, _PATH_DEV );
 -strcat( bsdPath, r );
 +if (flags  BDRV_O_NOCACHE) {
 +strcat(bsdPath, r);
 +}
  devPathLength = strlen( bsdPath );
  if ( CFStringGetCString( bsdPathAsCFString, bsdPath + 
 devPathLength, maxPathSize - devPathLength, kCFStringEncodingASCII ) ) {
  kernResult = KERN_SUCCESS;

Is this the fix that makes CD-ROM passthrough work for you?

Does the guest boot successfully when you do:

  -drive if=ide,media=cdrom,cache=none,file=/dev/cdrom

?

 @@ -2027,7 +2030,67 @@ kern_return_t GetBSDPath( io_iterator_t mediaIterator, 
 char *bsdPath, CFIndex ma
  return kernResult;
  }
  
 -#endif
 +/* Sets up a physical device for use in QEMU */
 +static void setupDevice(const char *bsdPath)
 +{
 +   /*
 +* Mac OS X does not like allowing QEMU to use physical devices that are
 +* mounted. Attempts to do so result in 'Resource busy' errors.
 +*/
 +
 +int fd;
 +fd = qemu_open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE);
 +
 +/* if the device fails to open */
 +if (fd  0) {
 +printf(Error: failed to open %s\n, bsdPath);
 +printf(If device %s is mounted on the desktop, unmount it
 +first before using it in QEMU.\n, bsdPath);
 +printf(\nCommand to unmount device: diskutil unmountDisk %s, 
 bsdPath);
 +printf(\nCommand to mount device: diskutil mountDisk %s\n\n, 
 bsdPath);
 +}

This error message is printed regardless of the errno value.  What is
the specific errno value when open(2) fails because the device is
mounted?

Also, can you move this after the raw_open_common() call to avoid the
setupCDROM()/setupDevice() changes made in this patch and duplicating
the error message.  It doesn't seem necessary to open the files ahead of
raw_open_common() since the code continues in the error case anyway:

ret = raw_open_common(bs, options, flags, 0, local_err);
if (ret  0) {
if (local_err) {
error_propagate(errp, local_err);
}
#if defined(__APPLE__)  defined(__MACH__)
if (strstart(filename, /dev/) == 0  ret == -EBUSY) { /* or whatever 
*/
error_report(If device %s is mounted on the desktop, unmount it
  first before using it in QEMU., bsdPath);
error_report(Command to unmount device: diskutil unmountDisk %s, 
bsdPath);
error_report(Command to mount device: diskutil mountDisk %s, 
bsdPath);
}
#endif /* defined(__APPLE__)  defined(__MACH__) */
return ret;
}

That's a much smaller change.

 +
 +/* if the device opens */
 +else {
 +qemu_close(fd);
 +}
 +}
 +
 +/* Sets up a real cdrom for use in QEMU */
 +static void setupCDROM(char *bsdPath)
 +{
 +int index, numOfTestPartitions = 2, fd;
 +char testPartition[MAXPATHLEN];
 +bool partitionFound = false;
 +
 +/* look for a working partition */
 +for (index = 0; index  numOfTestPartitions; index++) {
 +strncpy(testPartition, bsdPath, MAXPATHLEN);

The safe way to use strncpy() is:

  strncpy(testPartition, bsdPath, MAXPATHLEN - 1);
  testPartition[MAXPATHLEN - 1] = '\0';

but pstrcpy() is a easier to use correctly.  Please use that instead.

 +snprintf(testPartition, MAXPATHLEN, %ss%d, testPartition, index);
 +fd = qemu_open(testPartition, O_RDONLY | O_BINARY | O_LARGEFILE);
 +if (fd  0) {

Should be fd = 0 since fd = 0 is valid too, although unlikely.

 +partitionFound = true;
 +qemu_close(fd);
 +break;
 +}
 +}
 +
 +/* if a working partition on the device was not found */
 +if (partitionFound == false) {
 +printf(Error: Failed to find a working partition on disc!\n);
 +printf(If your disc is mounted on the desktop, trying unmounting it
 +first before using it in QEMU.\n);
 +printf(\nCommand to unmount disc: 
 +diskutil unmountDisk %s\n, bsdPath);
 +printf(Command to mount disc: 
 +   diskutil mountDisk %s\n\n, bsdPath);
 +}
 +
 +DPRINTF(Using %s as CDROM\n, testPartition);
 +strncpy(bsdPath, testPartition, MAXPATHLEN);

Please use pstrcpy().

 +}
 +
 +#endif /* defined(__APPLE__)  defined(__MACH__) */
  
  static int hdev_probe_device(const char *filename)
  {
 @@ -2119,30 +2182,28 @@ static int hdev_open(BlockDriverState *bs, QDict 
 *options, int flags,
  #if defined(__APPLE__)  defined(__MACH__)
  const char *filename = qdict_get_str(options, filename);
  
 -if (strstart(filename, /dev/cdrom, NULL)) {
 -kern_return_t kernResult;
 -io_iterator_t mediaIterator;
 -char

Re: [Qemu-devel] [RFC PATCH qemu v3 4/4] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering)

2015-07-17 Thread David Gibson

On Fri, Jul 17, 2015 at 05:13:37PM +1000, Alexey Kardashevskiy wrote:
 On 07/16/2015 03:11 PM, David Gibson wrote:
 On Tue, Jul 14, 2015 at 10:21:54PM +1000, Alexey Kardashevskiy wrote:
 This makes use of the new memory registering feature. The idea is
 to provide the userspace ability to notify the host kernel about pages
 which are going to be used for DMA. Having this information, the host
 kernel can pin them all once per user process, do locked pages
 accounting (once) and not spent time on doing that in real time with
 possible failures which cannot be handled nicely in some cases.
 
 This adds a guest RAM memory listener which notifies a VFIO container
 about memory which needs to be pinned/unpinned. VFIO MMIO regions
 (i.e. skip dump regions) are skipped.
 
 The feature is only enabled for SPAPR IOMMU v2. The host kernel changes
 are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does
 not call it when v2 is detected and enabled.
 
 This does not change the guest visible interface.
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 
 I've looked at this in more depth now, and attempting to unify the
 pre-reg and mapping listeners like this can't work - they need to be
 listening on different address spaces:  mapping actions need to be
 listening on the PCI address space, whereas the pre-reg needs to be
 listening on address_space_memory.  For x86 - for now - those end up
 being the same thing, but on Power they're not.
 
 We do need to be clear about what differences are due to the presence
 of a guest IOMMU versus which are due to arch or underlying IOMMU
 type.  For now Power has a guest IOMMU and x86 doesn't, but that could
 well change in future: we could well implement the guest side IOMMU
 for x86 in future (or x86 could invent a paravirt IOMMU interface).
 On the other side, BenH's experimental powernv machine type could
 introduce Power machines without a guest side IOMMU (or at least an
 optional guest side IOMMU).
 
 The quick and dirty approach here is:
 1. Leave the main listener as is
 2. Add a new pre-reg notifier to the spapr iommu specific code,
which listens on address_space_memory, *not* the PCI space
 
 The more generally correct approach, which allows for more complex
 IOMMU arrangements and the possibility of new IOMMU types with pre-reg
 is:
 1. Have the core implement both a mapping listener and a pre-reg
listener (optionally enabled by a per-iommu-type flag).
Basically the first one sees what *is* mapped, the second sees
what *could* be mapped.
 
 2. As now, the mapping listener listens on PCI address space, if
RAM blocks are added, immediately map them into the host IOMMU,
if guest IOMMU blocks appear register a notifier which will
mirror guest IOMMU mappings to the host IOMMU (this is what we
do now).
 
 3. The pre-reg listener also listens on the PCI address space.  RAM
blocks added are pre-registered immediately.
 
 
 PCI address space listeners won't be notified about RAM blocks on sPAPR.

Sure they will - if any RAM blocks were mapped directly into PCI
address space, the listener would be notified.  It's just that no RAM
blocks are directly mapped into PCI space, only partially mapped in
via IOMMU blocks.

But the idea is this scheme could handle a platform that has both a
bypass DMA window which maps directly onto a block of ram and an
IOMMU controlled DMA window.  Or one which could have either setup
depending on circumstances (which is probably true of BenH's powernv
machine type).

 But, if guest
IOMMU blocks are added, instead of registering a guest-iommu
notifier,
 
 guest-iommu notifier is the one called via memory_region_notify_iommu()
 from H_PUT_TCE? Instead implies dropping it, how this can work?

Because the other listener - the mapping listener at (2) handles that
part.  The pre-reg listener doesn't.

But as noted in by other mail this whole scheme doesn't work without a
way to discover an IOMMU region's target AS in advance, which doesn't
currently exist.

 we register another listener on the *target* AS of the
guest IOMMU, same callbacks as this one.  In practice that
target AS will almost always resolve to address_space_memory,
but this can at least in theory handle crazy guest setups with
multiple layers of IOMMU.
 
 4. Have to ensure that the pre-reg callbacks always happen before
the mapping calls.  For a system with an IOMMU backend which
requires pre-registration, but doesn't have a guest IOMMU, we
need to pre-reg, then host-iommu-map RAM blocks that appear in
PCI address space.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgp019P3Hv12i.pgp
Description: PGP

Re: [Qemu-devel] [PATCH v2 00/13] tcg/sparc v8plus code generation

2015-07-17 Thread Aurelien Jarno

On 2015-07-17 12:23, Aurelien Jarno wrote:
 On 2015-07-16 22:29, Richard Henderson wrote:
  On 07/15/2015 09:54 PM, Aurelien Jarno wrote:
  While I understand why we need the new trunc_shr_i32 opcode for MIPS64
  (the 32-bit values must be kept sign-extended), I currently fail to
  see why it is needed for SPARC.
  
  As far as I recall, it improves code for extracting high parts of 64-bit
  quantities.  Without this, we wind up with a 64-bit shift, requiring a
  64-bit temp register, followed by the real truncate which can copy the
  data to a 32-bit destination register.
 
 Ok, I understand the use case now. So it's not for correctness, but
 rather to generate more optimized code.

OTOH, it means that we always have to go through a 32-bit register first
when truncating a 64-bit value.

I mean we gain in the following case:
  shr_i64 t64, t64, i
  trunc_i64_i32 t32, t64
  ...

But we lose in the following case:
  trunc_i64_i32 t32, t64
  neg t32, t32
  ...

Overall I guess the advantages far outweigh the disadvantages.

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [RFC v3 08/13] exec.c: introduce a simple rendezvous support

2015-07-17 Thread Alex Bennée


Alvise Rigo a.r...@virtualopensystems.com writes:

 When a vCPU is about to set a memory page as exclusive, it needs to wait
 that all the running vCPUs finish to execute the current TB and to be aware
 of the exact moment when that happens. For this, add a simple rendezvous
 mechanism that will be used in softmmu_llsc_template.h to implement the
 ldlink operation.

 Suggested-by: Jani Kokkonen jani.kokko...@huawei.com
 Suggested-by: Claudio Fontana claudio.font...@huawei.com
 Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
 ---
  cpus.c|  5 +
  exec.c| 45 +
  include/qom/cpu.h | 16 
  3 files changed, 66 insertions(+)

 diff --git a/cpus.c b/cpus.c
 index aee445a..f4d938e 100644
 --- a/cpus.c
 +++ b/cpus.c
 @@ -1423,6 +1423,11 @@ static int tcg_cpu_exec(CPUArchState *env)
  qemu_mutex_unlock_iothread();
  ret = cpu_exec(env);
  cpu-tcg_executing = 0;
 +
 +if (unlikely(cpu-pending_rdv)) {
 +cpu_exit_do_rendezvous(cpu);
 +}
 +

I'll ignore this stuff for now as I assume we can all use the async work
patch of Fred's?


  qemu_mutex_lock_iothread();
  #ifdef CONFIG_PROFILER
  tcg_time += profile_getclock() - ti;
 diff --git a/exec.c b/exec.c
 index 964e922..51958ed 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -746,6 +746,51 @@ void cpu_breakpoint_remove_all(CPUState *cpu, int mask)
  }
  }
  
 +/* Rendezvous implementation.
 + * The corresponding definitions are in include/qom/cpu.h. */
 +CpuExitRendezvous cpu_exit_rendezvous;
 +inline void cpu_exit_init_rendezvous(void)
 +{
 +CpuExitRendezvous *rdv = cpu_exit_rendezvous;
 +
 +rdv-attendees = 0;
 +}
 +
 +inline void cpu_exit_rendezvous_add_attendee(CPUState *cpu)
 +{
 +CpuExitRendezvous *rdv = cpu_exit_rendezvous;
 +
 +if (!cpu-pending_rdv) {
 +cpu-pending_rdv = 1;
 +atomic_inc(rdv-attendees);
 +}
 +}
 +
 +void cpu_exit_do_rendezvous(CPUState *cpu)
 +{
 +CpuExitRendezvous *rdv = cpu_exit_rendezvous;
 +
 +atomic_dec(rdv-attendees);
 +
 +cpu-pending_rdv = 0;
 +}
 +
 +void cpu_exit_rendezvous_wait_others(CPUState *cpu)
 +{
 +CpuExitRendezvous *rdv = cpu_exit_rendezvous;
 +
 +while (rdv-attendees) {
 +g_usleep(TCG_RDV_POLLING_PERIOD);
 +}
 +}
 +
 +void cpu_exit_rendezvous_release(void)
 +{
 +CpuExitRendezvous *rdv = cpu_exit_rendezvous;
 +
 +rdv-attendees = 0;
 +}
 +
  /* enable or disable single step mode. EXCP_DEBUG is returned by the
 CPU loop after each instruction */
  void cpu_single_step(CPUState *cpu, int enabled)
 diff --git a/include/qom/cpu.h b/include/qom/cpu.h
 index 8f3fe56..8d121b3 100644
 --- a/include/qom/cpu.h
 +++ b/include/qom/cpu.h
 @@ -201,6 +201,20 @@ typedef struct CPUWatchpoint {
  QTAILQ_ENTRY(CPUWatchpoint) entry;
  } CPUWatchpoint;
  
 +/* Rendezvous support */
 +#define TCG_RDV_POLLING_PERIOD 10
 +typedef struct CpuExitRendezvous {
 +volatile int attendees;
 +QemuMutex lock;
 +} CpuExitRendezvous;
 +
 +extern CpuExitRendezvous cpu_exit_rendezvous;
 +void cpu_exit_init_rendezvous(void);
 +void cpu_exit_rendezvous_add_attendee(CPUState *cpu);
 +void cpu_exit_do_rendezvous(CPUState *cpu);
 +void cpu_exit_rendezvous_wait_others(CPUState *cpu);
 +void cpu_exit_rendezvous_release(void);
 +
  struct KVMState;
  struct kvm_run;
  
 @@ -291,6 +305,8 @@ struct CPUState {
  
  void *opaque;
  
 +volatile int pending_rdv;
 +

I will however echo the hmmm on Fred's patch about the optimistic use
of volatile here. As Peter mentioned it is a red flag and I would prefer
explicit memory consistency behaviour used and documented.


  /* In order to avoid passing too many arguments to the MMIO helpers,
   * we store some rarely used information in the CPU context.
   */

-- 
Alex Bennée

Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-17 Thread Marc Zyngier

On Fri, 17 Jul 2015 14:39:55 +0100
Laszlo Ersek ler...@redhat.com wrote:

 On 07/17/15 15:28, Marc Zyngier wrote:
  On Fri, 17 Jul 2015 10:30:38 +0100
  Paolo Bonzini pbonz...@redhat.com wrote:
  
 
 
  On 17/07/2015 06:44, Paolo Bonzini wrote:
 
 
  On 16/07/2015 21:05, Richard W.M. Jones wrote:
 
  Sorry to spoil things, but I'm still seeing this bug, although it is
  now a lot less frequent with your patch.  I would estimate it happens
  more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs
  with qemu.git + the v2 patch series.
 
  It's the exact same hang in both cases.
 
  Is it possible that this patch doesn't completely close any race?
 
  Still, it is an improvement, so there is that.
 
  Would seem at first glance like a different bug.
 
  Interestingly, adding some tracing (qemu_clock_get_ns) makes the bug
  more likely: now it reproduces in about 10 tries.  Of course :) adding
  other kinds of tracing instead make it go away again (50 tries).
 
  Perhaps this:
 
 i/o thread vcpu thread   worker thread
 -
 lock_iothread
 notify_me = 1
 ...
 unlock_iothread
lock_iothread
notify_me = 3
ppoll
notify_me = 1
   bh-scheduled = 1
   event_notifier_set
event_notifier_test_and_clear
 ppoll
  ^^ hang

  In the exact shape above, it doesn't seem too likely to happen, but
  perhaps there's another simpler case.  Still, the bug exists.
 
  The above is not really related to notify_me.  Here the notification is
  not being optimized away!  So I wonder if this one has been there forever.
 
  Fam suggested putting the event_notifier_test_and_clear before
  aio_bh_poll(), but it does not work.  I'll look more close
 
  However, an unconditional event_notifier_test_and_clear is pretty
  expensive.  On one hand, obviously correctness comes first.  On the
  other hand, an expensive operation at the wrong place can mask the race
  very easily; I'll let the fix run for a while, but I'm not sure if a
  successful test really says anything useful.
 
  So it may not be useful, but still successful test is successful. :)
  The following patch, which also includes the delta between v2 and v3
  of this series, survived 674 reboots before hitting a definitely
  unrelated problem:
 
  error: kvm run failed Function not implemented
  PC=bf671210  SP=c1f0
  X00=0a003e70 X01= X02=bf680548 
  X03=0030
  X04=bbb5fc18 X05=004b7770 X06=bf721930 
  X07=009a
  X08=bf716858 X09=0090 X10= 
  X11=0046
  X12=a007e03a X13= X14= 
  X15=
  X16=bf716df0 X17= X18= 
  X19=bf6f5f18
  X20= X21= X22= 
  X23=
  X24= X25= X26= 
  X27=
  X28= X29= X30= 
  PSTATE=6305 (flags -ZC-)
 
  For the record, this is the kvm_run struct:
 
  $6 = {request_interrupt_window = 0 '\000', padding1 = 
  \000\000\000\000\000\000, exit_reason = 0, 
ready_for_interrupt_injection = 0 '\000', if_flag = 0 '\000', flags = 0, 
  cr8 = 0, apic_base = 0, {hw = {
hardware_exit_reason = 150994968}, fail_entry = 
  {hardware_entry_failure_reason = 150994968}, ex = {
exception = 150994968, error_code = 0}, io = {direction = 24 '\030', 
  size = 0 '\000', port = 2304, 
count = 0, data_offset = 144}, debug = {arch = {No data fields}}, 
  mmio = {phys_addr = 150994968, 
data = \220\000\000\000\000\000\000, len = 4, is_write = 0 
  '\000'}, hypercall = {nr = 150994968, 
args = {144, 4, 0, 0, 0, 0}, ret = 0, longmode = 0, pad = 0}, 
  tpr_access = {rip = 150994968, 
is_write = 144, pad = 0}, s390_sieic = {icptcode = 24 '\030', ipa = 
  2304, ipb = 0}, 
  s390_reset_flags = 150994968, s390_ucontrol = {trans_exc_code = 
  150994968, pgm_code = 144}, dcr = {
dcrn = 150994968, data = 0, is_write = 144 '\220'}, internal = 
  {suberror = 150994968, ndata = 0, 
data = {144, 4, 0 repeats 14 times}}, osi = {gprs = {150994968, 
  144, 4, 0 repeats 29 times}}, 
  papr_hcall = {nr = 150994968, ret = 144, args = {4, 0, 0, 0, 0, 0, 0, 
  0, 0}}, s390_tsch = {
subchannel_id = 24, subchannel_nr = 2304, io_int_parm = 0, 
  io_int_word = 144, ipb = 0, 
dequeued = 4 '\004'}, epr = {epr = 150994968}, system_event = {type 
  = 150994968, flags = 144}, 
  s390_stsi = {addr = 150994968, ar = 144 '\220', reserved = 0 '\000',

[Qemu-devel] [PULL for-2.4 1/1] virtio-rng: trigger timer only when guest requests for entropy

2015-07-17 Thread Amit Shah

From: Pankaj Gupta pagu...@redhat.com

This patch triggers timer only when guest requests for
entropy. As soon as first request from guest for entropy
comes we set the timer. Timer bumps up the quota value
when it gets triggered.

Signed-off-by: Pankaj Gupta pagu...@redhat.com
Reviewed-by: Amit Shah amit.s...@redhat.com
Message-Id: 1436962608-9961-2-git-send-email-pagu...@redhat.com

[Re-worded patch subject, removed extra whitespace -- Amit]

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio/virtio-rng.c | 15 ---
 include/hw/virtio/virtio-rng.h |  1 +
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
index 740ed31..6e5f022 100644
--- a/hw/virtio/virtio-rng.c
+++ b/hw/virtio/virtio-rng.c
@@ -77,6 +77,12 @@ static void virtio_rng_process(VirtIORNG *vrng)
 return;
 }
 
+if (vrng-activate_timer) {
+timer_mod(vrng-rate_limit_timer,
+  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 
vrng-conf.period_ms);
+vrng-activate_timer = false;
+}
+
 if (vrng-quota_remaining  0) {
 quota = 0;
 } else {
@@ -138,8 +144,7 @@ static void check_rate_limit(void *opaque)
 
 vrng-quota_remaining = vrng-conf.max_bytes;
 virtio_rng_process(vrng);
-timer_mod(vrng-rate_limit_timer,
-   qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 
vrng-conf.period_ms);
+vrng-activate_timer = true;
 }
 
 static void virtio_rng_device_realize(DeviceState *dev, Error **errp)
@@ -195,13 +200,9 @@ static void virtio_rng_device_realize(DeviceState *dev, 
Error **errp)
 
 vrng-vq = virtio_add_queue(vdev, 8, handle_input);
 vrng-quota_remaining = vrng-conf.max_bytes;
-
 vrng-rate_limit_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
check_rate_limit, vrng);
-
-timer_mod(vrng-rate_limit_timer,
-   qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 
vrng-conf.period_ms);
-
+vrng-activate_timer = true;
 register_savevm(dev, virtio-rng, -1, 1, virtio_rng_save,
 virtio_rng_load, vrng);
 }
diff --git a/include/hw/virtio/virtio-rng.h b/include/hw/virtio/virtio-rng.h
index 0316488..3f07de7 100644
--- a/include/hw/virtio/virtio-rng.h
+++ b/include/hw/virtio/virtio-rng.h
@@ -44,6 +44,7 @@ typedef struct VirtIORNG {
  */
 QEMUTimer *rate_limit_timer;
 int64_t quota_remaining;
+bool activate_timer;
 } VirtIORNG;
 
 #endif
-- 
2.4.3

[Qemu-devel] [PULL for-2.4 0/1] virtio-rng: reduce wakeups

2015-07-17 Thread Amit Shah

The following changes since commit 5b5e8cdd7da7a2214dd062afff5b866234aab228:

  Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20150717-1' into 
staging (2015-07-17 12:39:12 +0100)

are available in the git repository at:

  https://git.kernel.org/pub/scm/virt/qemu/amit/virtio-rng.git tags/vrng-2.4

for you to fetch changes up to 621a20e08155179b1902c428361e80f41429f50d:

  virtio-rng: trigger timer only when guest requests for entropy (2015-07-17 
19:05:16 +0530)


Fire timer only when required.  Brings down wakeups by a big number.



Pankaj Gupta (1):
  virtio-rng: trigger timer only when guest requests for entropy

 hw/virtio/virtio-rng.c | 15 ---
 include/hw/virtio/virtio-rng.h |  1 +
 2 files changed, 9 insertions(+), 7 deletions(-)

-- 
2.4.3

Re: [Qemu-devel] [PATCH v4 1/1] vhost user: add support of live migration

2015-07-17 Thread Paolo Bonzini



On 17/07/2015 15:35, Marc-André Lureau wrote:
  LOG_FD is implemented in the kernel drivers/vhost/vhost.c.  It seems to
  be an eventfd-like mechanism to save on dirty bitmap scans.  However,
  it's not well documented how to implement it in a correct way.
 and it's not used by qemu, so hard to say if it actually work well.
 
  In any case, live migration needs a new message type (like LOG_MMAP_FD)
  in the vhost-user protocol.

 Yes, perhaps with size and offset. I am looking at this.

The offset should be 0...

Do you know the size of the ram_addr_t space from
VHOST_USER_SET_MEM_TABLE's user address and size fields?

If the size isn't needed, you can reuse LOG_BASE, ignoring the content
of the payload and adding the SCM_RIGHTS file descriptor.

Paolo

Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-17 Thread Richard W.M. Jones

On Fri, Jul 17, 2015 at 02:48:40PM +0100, Marc Zyngier wrote:
 Still: there is nothing in the registers that remotely points to that
 area. X0 is the closest, but it'd take a big negative offset to get
 there.
 
 Is that a Linux kernel? or something else?

You're sure it's not this one?
https://bugzilla.redhat.com/show_bug.cgi?id=1194366
That was caused by ftrace screwing up guest memory, so it was
effectively running random code.  It is also fixed (by you in fact).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

Re: [Qemu-devel] [RFC v3 08/13] exec.c: introduce a simple rendezvous support

2015-07-17 Thread alvise rigo

On Fri, Jul 17, 2015 at 3:45 PM, Alex Bennée alex.ben...@linaro.org wrote:

 Alvise Rigo a.r...@virtualopensystems.com writes:

 When a vCPU is about to set a memory page as exclusive, it needs to wait
 that all the running vCPUs finish to execute the current TB and to be aware
 of the exact moment when that happens. For this, add a simple rendezvous
 mechanism that will be used in softmmu_llsc_template.h to implement the
 ldlink operation.

 Suggested-by: Jani Kokkonen jani.kokko...@huawei.com
 Suggested-by: Claudio Fontana claudio.font...@huawei.com
 Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
 ---
  cpus.c|  5 +
  exec.c| 45 +
  include/qom/cpu.h | 16 
  3 files changed, 66 insertions(+)

 diff --git a/cpus.c b/cpus.c
 index aee445a..f4d938e 100644
 --- a/cpus.c
 +++ b/cpus.c
 @@ -1423,6 +1423,11 @@ static int tcg_cpu_exec(CPUArchState *env)
  qemu_mutex_unlock_iothread();
  ret = cpu_exec(env);
  cpu-tcg_executing = 0;
 +
 +if (unlikely(cpu-pending_rdv)) {
 +cpu_exit_do_rendezvous(cpu);
 +}
 +

 I'll ignore this stuff for now as I assume we can all use the async work
 patch of Fred's?

Yes, it will be more likely based on the plain async_run_on_cpu.



  qemu_mutex_lock_iothread();
  #ifdef CONFIG_PROFILER
  tcg_time += profile_getclock() - ti;
 diff --git a/exec.c b/exec.c
 index 964e922..51958ed 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -746,6 +746,51 @@ void cpu_breakpoint_remove_all(CPUState *cpu, int mask)
  }
  }

 +/* Rendezvous implementation.
 + * The corresponding definitions are in include/qom/cpu.h. */
 +CpuExitRendezvous cpu_exit_rendezvous;
 +inline void cpu_exit_init_rendezvous(void)
 +{
 +CpuExitRendezvous *rdv = cpu_exit_rendezvous;
 +
 +rdv-attendees = 0;
 +}
 +
 +inline void cpu_exit_rendezvous_add_attendee(CPUState *cpu)
 +{
 +CpuExitRendezvous *rdv = cpu_exit_rendezvous;
 +
 +if (!cpu-pending_rdv) {
 +cpu-pending_rdv = 1;
 +atomic_inc(rdv-attendees);
 +}
 +}
 +
 +void cpu_exit_do_rendezvous(CPUState *cpu)
 +{
 +CpuExitRendezvous *rdv = cpu_exit_rendezvous;
 +
 +atomic_dec(rdv-attendees);
 +
 +cpu-pending_rdv = 0;
 +}
 +
 +void cpu_exit_rendezvous_wait_others(CPUState *cpu)
 +{
 +CpuExitRendezvous *rdv = cpu_exit_rendezvous;
 +
 +while (rdv-attendees) {
 +g_usleep(TCG_RDV_POLLING_PERIOD);
 +}
 +}
 +
 +void cpu_exit_rendezvous_release(void)
 +{
 +CpuExitRendezvous *rdv = cpu_exit_rendezvous;
 +
 +rdv-attendees = 0;
 +}
 +
  /* enable or disable single step mode. EXCP_DEBUG is returned by the
 CPU loop after each instruction */
  void cpu_single_step(CPUState *cpu, int enabled)
 diff --git a/include/qom/cpu.h b/include/qom/cpu.h
 index 8f3fe56..8d121b3 100644
 --- a/include/qom/cpu.h
 +++ b/include/qom/cpu.h
 @@ -201,6 +201,20 @@ typedef struct CPUWatchpoint {
  QTAILQ_ENTRY(CPUWatchpoint) entry;
  } CPUWatchpoint;

 +/* Rendezvous support */
 +#define TCG_RDV_POLLING_PERIOD 10
 +typedef struct CpuExitRendezvous {
 +volatile int attendees;
 +QemuMutex lock;
 +} CpuExitRendezvous;
 +
 +extern CpuExitRendezvous cpu_exit_rendezvous;
 +void cpu_exit_init_rendezvous(void);
 +void cpu_exit_rendezvous_add_attendee(CPUState *cpu);
 +void cpu_exit_do_rendezvous(CPUState *cpu);
 +void cpu_exit_rendezvous_wait_others(CPUState *cpu);
 +void cpu_exit_rendezvous_release(void);
 +
  struct KVMState;
  struct kvm_run;

 @@ -291,6 +305,8 @@ struct CPUState {

  void *opaque;

 +volatile int pending_rdv;
 +

 I will however echo the hmmm on Fred's patch about the optimistic use
 of volatile here. As Peter mentioned it is a red flag and I would prefer
 explicit memory consistency behaviour used and documented.

In my local branch I'm now using atomic_set/atomic_read, basically
what is defined in qemu/atomic.h. Is this enough?

Regards,
alvise



  /* In order to avoid passing too many arguments to the MMIO helpers,
   * we store some rarely used information in the CPU context.
   */

 --
 Alex Bennée

Re: [Qemu-devel] [PATCH 2/2 v3] virtio-rng: Serve pending request if any after timer bumps up quota.

2015-07-17 Thread Amit Shah

On (Thu) 16 Jul 2015 [02:34:34], Pankaj Gupta wrote:
  Anyway we can look at that later, patch 1 is already a big improvement
  so I think we should just stick with that, and think about other
  things in a different series.
 
 Sure.

I think we can apply patch 1 for 2.4, since it reduces the wakeups we
cause.  If you can post powertop numbers, that'll be great.

I'll do a pull req in the meantime.

Thanks,

Amit

Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-17 Thread Laszlo Ersek

On 07/17/15 15:48, Marc Zyngier wrote:
 On Fri, 17 Jul 2015 14:39:55 +0100
 Laszlo Ersek ler...@redhat.com wrote:
 
 On 07/17/15 15:28, Marc Zyngier wrote:
 On Fri, 17 Jul 2015 10:30:38 +0100
 Paolo Bonzini pbonz...@redhat.com wrote:



 On 17/07/2015 06:44, Paolo Bonzini wrote:


 On 16/07/2015 21:05, Richard W.M. Jones wrote:

 Sorry to spoil things, but I'm still seeing this bug, although it is
 now a lot less frequent with your patch.  I would estimate it happens
 more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs
 with qemu.git + the v2 patch series.

 It's the exact same hang in both cases.

 Is it possible that this patch doesn't completely close any race?

 Still, it is an improvement, so there is that.

 Would seem at first glance like a different bug.

 Interestingly, adding some tracing (qemu_clock_get_ns) makes the bug
 more likely: now it reproduces in about 10 tries.  Of course :) adding
 other kinds of tracing instead make it go away again (50 tries).

 Perhaps this:

i/o thread vcpu thread   worker thread
-
lock_iothread
notify_me = 1
...
unlock_iothread
   lock_iothread
   notify_me = 3
   ppoll
   notify_me = 1
  bh-scheduled = 1
  event_notifier_set
   event_notifier_test_and_clear
ppoll
 ^^ hang
   
 In the exact shape above, it doesn't seem too likely to happen, but
 perhaps there's another simpler case.  Still, the bug exists.

 The above is not really related to notify_me.  Here the notification is
 not being optimized away!  So I wonder if this one has been there forever.

 Fam suggested putting the event_notifier_test_and_clear before
 aio_bh_poll(), but it does not work.  I'll look more close

 However, an unconditional event_notifier_test_and_clear is pretty
 expensive.  On one hand, obviously correctness comes first.  On the
 other hand, an expensive operation at the wrong place can mask the race
 very easily; I'll let the fix run for a while, but I'm not sure if a
 successful test really says anything useful.

 So it may not be useful, but still successful test is successful. :)
 The following patch, which also includes the delta between v2 and v3
 of this series, survived 674 reboots before hitting a definitely
 unrelated problem:

 error: kvm run failed Function not implemented
 PC=bf671210  SP=c1f0
 X00=0a003e70 X01= X02=bf680548 
 X03=0030
 X04=bbb5fc18 X05=004b7770 X06=bf721930 
 X07=009a
 X08=bf716858 X09=0090 X10= 
 X11=0046
 X12=a007e03a X13= X14= 
 X15=
 X16=bf716df0 X17= X18= 
 X19=bf6f5f18
 X20= X21= X22= 
 X23=
 X24= X25= X26= 
 X27=
 X28= X29= X30= 
 PSTATE=6305 (flags -ZC-)

 For the record, this is the kvm_run struct:

 $6 = {request_interrupt_window = 0 '\000', padding1 = 
 \000\000\000\000\000\000, exit_reason = 0, 
   ready_for_interrupt_injection = 0 '\000', if_flag = 0 '\000', flags = 0, 
 cr8 = 0, apic_base = 0, {hw = {
   hardware_exit_reason = 150994968}, fail_entry = 
 {hardware_entry_failure_reason = 150994968}, ex = {
   exception = 150994968, error_code = 0}, io = {direction = 24 '\030', 
 size = 0 '\000', port = 2304, 
   count = 0, data_offset = 144}, debug = {arch = {No data fields}}, 
 mmio = {phys_addr = 150994968, 
   data = \220\000\000\000\000\000\000, len = 4, is_write = 0 
 '\000'}, hypercall = {nr = 150994968, 
   args = {144, 4, 0, 0, 0, 0}, ret = 0, longmode = 0, pad = 0}, 
 tpr_access = {rip = 150994968, 
   is_write = 144, pad = 0}, s390_sieic = {icptcode = 24 '\030', ipa = 
 2304, ipb = 0}, 
 s390_reset_flags = 150994968, s390_ucontrol = {trans_exc_code = 
 150994968, pgm_code = 144}, dcr = {
   dcrn = 150994968, data = 0, is_write = 144 '\220'}, internal = 
 {suberror = 150994968, ndata = 0, 
   data = {144, 4, 0 repeats 14 times}}, osi = {gprs = {150994968, 
 144, 4, 0 repeats 29 times}}, 
 papr_hcall = {nr = 150994968, ret = 144, args = {4, 0, 0, 0, 0, 0, 0, 
 0, 0}}, s390_tsch = {
   subchannel_id = 24, subchannel_nr = 2304, io_int_parm = 0, 
 io_int_word = 144, ipb = 0, 
   dequeued = 4 '\004'}, epr = {epr = 150994968}, system_event = {type 
 = 150994968, flags = 144}, 
 s390_stsi = {addr = 150994968, ar = 144 '\220', reserved = 0 '\000', 
 fc = 0 '\000', sel1 = 0 '\000', 
   sel2 = 0}, 
 padding =

Re: [Qemu-devel] [PATCH] blockdev: warn about aio=native if libaio is unavailable

2015-07-17 Thread Stefan Hajnoczi

On Fri, Jul 17, 2015 at 12:56:15PM +0200, Kevin Wolf wrote:
 Am 17.07.2015 um 11:59 hat Stefan Hajnoczi geschrieben:
  QEMU silently ignores aio=native if libaio is unavailable.  It is
  confusing when aio=native performance is identical to aio=threads
  because the binary was accidentally built without libaio.
  
  Use error_report() because failing would break backward compatibility.
  There are probably users using aio=native who would be inconvenienced if
  QEMU suddenly refused to start their guests.
  
  Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
 
 I hope not too many people are using aio=native without having libaio
 compiled in... Can we make it a message like for the case with
 aio=native,cache.direct=off, i.e. a deprecation warning that allows us
 to make this an error in a few releases?

Yes, I'll move the warning to raw-posix.c so all callers benefit from
it.


pgpoOON4yc1s8.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-17 Thread Marc Zyngier

On Fri, 17 Jul 2015 14:53:06 +0100
Richard W.M. Jones rjo...@redhat.com wrote:

 On Fri, Jul 17, 2015 at 02:48:40PM +0100, Marc Zyngier wrote:
  Still: there is nothing in the registers that remotely points to that
  area. X0 is the closest, but it'd take a big negative offset to get
  there.
  
  Is that a Linux kernel? or something else?
 
 You're sure it's not this one?
 https://bugzilla.redhat.com/show_bug.cgi?id=1194366
 That was caused by ftrace screwing up guest memory, so it was
 effectively running random code.  It is also fixed (by you in fact).

Don't think so. The bug you quote was the guest kernel being buggy, and
touching non-memory space.

This new issue seems different - this is not a Linux kernel, by the
look of it.

M.
-- 
Jazz is not dead. It just smells funny.

Re: [Qemu-devel] [PATCH v4 1/1] vhost user: add support of live migration

2015-07-17 Thread Paolo Bonzini



On 17/07/2015 12:34, Marc-André Lureau wrote:
 On Fri, Jul 17, 2015 at 4:25 AM, Paolo Bonzini pbonz...@redhat.com wrote:
  But LOG_BASE makes little sense across processes, and LOG_FD is unused
  in QEMU, isn't it?  So this patch is not enough to add support of live
  migration.

 You are right, LOG_BASE doesn't make much sense, and LOG_FD isn't
 used, despite some code already there. Furthermore, the log is
 allocated with regular malloc, hardly shareable.

LOG_FD is implemented in the kernel drivers/vhost/vhost.c.  It seems to
be an eventfd-like mechanism to save on dirty bitmap scans.  However,
it's not well documented how to implement it in a correct way.

In any case, live migration needs a new message type (like LOG_MMAP_FD)
in the vhost-user protocol.

Paolo

Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-17 Thread Richard W.M. Jones

On Fri, Jul 17, 2015 at 11:30:38AM +0200, Paolo Bonzini wrote:
 error: kvm run failed Function not implemented
 PC=bf671210  SP=c1f0
 X00=0a003e70 X01= X02=bf680548 
 X03=0030
 X04=bbb5fc18 X05=004b7770 X06=bf721930 
 X07=009a
 X08=bf716858 X09=0090 X10= 
 X11=0046
 X12=a007e03a X13= X14= 
 X15=
 X16=bf716df0 X17= X18= 
 X19=bf6f5f18
 X20= X21= X22= 
 X23=
 X24= X25= X26= 
 X27=
 X28= X29= X30= 
 PSTATE=6305 (flags -ZC-)

Vaguely reminiscent of this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=1194366

(See comment 7 in particular)

 diff --git a/aio-posix.c b/aio-posix.c
 index 268d14d..d2011d0 100644
 --- a/aio-posix.c
 +++ b/aio-posix.c
 @@ -273,6 +273,13 @@ bool aio_poll(AioContext *ctx, bool blocking)
  aio_context_acquire(ctx);
  }
  
 +/* This should be optimized... */
 +event_notifier_test_and_clear(ctx-notifier);
 +
 +if (blocking) {
 +atomic_sub(ctx-notify_me, 2);
 +}
 +
  /* if we have any readable fds, dispatch event */
  if (ret  0) {
  for (i = 0; i  npfd; i++) {
 @@ -283,10 +290,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
  npfd = 0;
  ctx-walking_handlers--;
  
 -if (blocking) {
 -atomic_sub(ctx-notify_me, 2);
 -}
 -
  /* Run dispatch even if there were no readable fds to run timers */
  if (aio_dispatch(ctx)) {
  progress = true;
 diff --git a/aio-win32.c b/aio-win32.c
 index 2bfd5f8..33809fd 100644
 --- a/aio-win32.c
 +++ b/aio-win32.c
 @@ -326,6 +326,10 @@ bool aio_poll(AioContext *ctx, bool blocking)
  if (timeout) {
  aio_context_acquire(ctx);
  }
 +
 +/* This should be optimized... */
 +event_notifier_test_and_clear(ctx-notifier);
 +
  if (blocking) {
  assert(first);
  atomic_sub(ctx-notify_me, 2);
 diff --git a/async.c b/async.c
 index 9204907..120e183 100644
 --- a/async.c
 +++ b/async.c
 @@ -202,6 +202,9 @@ aio_ctx_check(GSource *source)
  AioContext *ctx = (AioContext *) source;
  QEMUBH *bh;
  
 +/* This should be optimized... */
 +event_notifier_test_and_clear(ctx-notifier);
 +
  atomic_and(ctx-notify_me, ~1);
  for (bh = ctx-first_bh; bh; bh = bh-next) {
  if (!bh-deleted  bh-scheduled) {
 @@ -280,6 +280,10 @@ static void aio_rfifolock_cb(void *opaque)
  aio_notify(opaque);
  }
  
 +static void event_notifier_dummy_cb(EventNotifier *e)
 +{
 +}
 +
  AioContext *aio_context_new(Error **errp)
  {
  int ret;
 @@ -292,7 +296,7 @@ AioContext *aio_context_new(Error **errp)
  return NULL;
  }
  g_source_set_can_recurse(ctx-source, true);
 -aio_set_event_notifier(ctx, ctx-notifier, NULL);
 +aio_set_event_notifier(ctx, ctx-notifier, event_notifier_dummy_cb);
  ctx-thread_pool = NULL;
  qemu_mutex_init(ctx-bh_lock);
  rfifolock_init(ctx-lock, aio_rfifolock_cb, ctx);

With this patch, I've got to 500 iterations without seeing the error.

Still testing it however ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/

[Qemu-devel] [PATCH v5 1/5] Implement GIC-500 base class

2015-07-17 Thread Pavel Fedin

From: Shlomo Pongratz shlomo.pongr...@huawei.com

This class is to be used by both software and KVM implementations of GICv3

Signed-off-by: Shlomo Pongratz shlomo.pongr...@huawei.com
Signed-off-by: Pavel Fedin p.fe...@samsung.com
---
 hw/intc/Makefile.objs  |   1 +
 hw/intc/arm_gicv3_common.c | 260 +
 hw/intc/gicv3_internal.h   | 156 ++
 include/hw/intc/arm_gicv3_common.h | 113 
 4 files changed, 530 insertions(+)
 create mode 100644 hw/intc/arm_gicv3_common.c
 create mode 100644 hw/intc/gicv3_internal.h
 create mode 100644 include/hw/intc/arm_gicv3_common.h

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 092d8a8..1317e5a 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -12,6 +12,7 @@ common-obj-$(CONFIG_IOAPIC) += ioapic_common.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gic_common.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gic.o
 common-obj-$(CONFIG_ARM_GIC) += arm_gicv2m.o
+common-obj-$(CONFIG_ARM_GIC) += arm_gicv3_common.o
 common-obj-$(CONFIG_OPENPIC) += openpic.o
 
 obj-$(CONFIG_APIC) += apic.o apic_common.o
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
new file mode 100644
index 000..ef6f22e
--- /dev/null
+++ b/hw/intc/arm_gicv3_common.c
@@ -0,0 +1,260 @@
+/*
+ * ARM GICv3 support - common bits of emulated and KVM kernel model
+ *
+ * Copyright (c) 2012 Linaro Limited
+ * Copyright (c) 2015 Huawei.
+ * Written by Peter Maydell
+ * Extended to 64 cores by Shlomo Pongratz
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include gicv3_internal.h
+
+static void gicv3_pre_save(void *opaque)
+{
+GICv3State *s = (GICv3State *)opaque;
+ARMGICv3CommonClass *c = ARM_GICV3_COMMON_GET_CLASS(s);
+
+if (c-pre_save) {
+c-pre_save(s);
+}
+}
+
+static int gicv3_post_load(void *opaque, int version_id)
+{
+GICv3State *s = (GICv3State *)opaque;
+ARMGICv3CommonClass *c = ARM_GICV3_COMMON_GET_CLASS(s);
+
+if (c-post_load) {
+c-post_load(s);
+}
+return 0;
+}
+
+static const VMStateDescription vmstate_gicv3_irq_state = {
+.name = arm_gicv3_irq_state,
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT64(pending, gicv3_irq_state),
+VMSTATE_UINT64(active, gicv3_irq_state),
+VMSTATE_UINT64(level, gicv3_irq_state),
+VMSTATE_UINT64(group, gicv3_irq_state),
+VMSTATE_BOOL(edge_trigger, gicv3_irq_state),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_gicv3_sgi_state = {
+.name = arm_gicv3_sgi_state,
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT64_ARRAY(pending, gicv3_sgi_state, GICV3_NCPU),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_gicv3 = {
+.name = arm_gicv3,
+.version_id = 7,
+.minimum_version_id = 7,
+.pre_save = gicv3_pre_save,
+.post_load = gicv3_post_load,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(ctlr, GICv3State),
+VMSTATE_UINT32_ARRAY(cpu_ctlr, GICv3State, GICV3_NCPU),
+VMSTATE_STRUCT_ARRAY(irq_state, GICv3State, GICV3_MAXIRQ, 1,
+ vmstate_gicv3_irq_state, gicv3_irq_state),
+VMSTATE_UINT64_ARRAY(irq_target, GICv3State, GICV3_MAXIRQ),
+VMSTATE_UINT8_2DARRAY(priority1, GICv3State, GIC_INTERNAL,
+  GICV3_NCPU),
+VMSTATE_UINT8_ARRAY(priority2, GICv3State,
+GICV3_MAXIRQ - GIC_INTERNAL),
+VMSTATE_UINT16_2DARRAY(last_active, GICv3State, GICV3_MAXIRQ,
+   GICV3_NCPU),
+VMSTATE_STRUCT_ARRAY(sgi_state, GICv3State, GIC_NR_SGIS, 1,
+ vmstate_gicv3_sgi_state, gicv3_sgi_state),
+VMSTATE_UINT16_ARRAY(priority_mask, GICv3State, GICV3_NCPU),
+VMSTATE_UINT16_ARRAY(running_irq, GICv3State, GICV3_NCPU),
+VMSTATE_UINT16_ARRAY(running_priority, GICv3State, GICV3_NCPU),
+VMSTATE_UINT16_ARRAY(current_pending, GICv3State, GICV3_NCPU),
+VMSTATE_END_OF_LIST()
+}
+};
+
+void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
+  const MemoryRegionOps *dist_ops,
+

Re: [Qemu-devel] [RFC v3 06/13] target-i386: translate: implement qemu_ldlink and qemu_stcond ops

2015-07-17 Thread alvise rigo

On Fri, Jul 17, 2015 at 2:56 PM, Alex Bennée alex.ben...@linaro.org wrote:

 Alvise Rigo a.r...@virtualopensystems.com writes:

 Implement strex and ldrex instruction relying on TCG's qemu_ldlink and
 qemu_stcond.  For the time being only 32bit configurations are supported.

 Suggested-by: Jani Kokkonen jani.kokko...@huawei.com
 Suggested-by: Claudio Fontana claudio.font...@huawei.com
 Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
 ---
  tcg/i386/tcg-target.c | 136 
 ++
  1 file changed, 114 insertions(+), 22 deletions(-)

 diff --git a/tcg/i386/tcg-target.c b/tcg/i386/tcg-target.c
 index 0d7c99c..d8250a9 100644
 --- a/tcg/i386/tcg-target.c
 +++ b/tcg/i386/tcg-target.c
 @@ -1141,6 +1141,17 @@ static void * const qemu_ld_helpers[16] = {
  [MO_BEQ]  = helper_be_ldq_mmu,
  };

 +/* LoadLink helpers, only unsigned. Use the macro below to access them. */
 +static void * const qemu_ldex_helpers[16] = {
 +[MO_LEUL] = helper_le_ldlinkul_mmu,
 +};
 +
 +#define LDEX_HELPER(mem_op) \
 +({  \
 +assert(mem_op  MO_EXCL);   \
 +qemu_ldex_helpers[((int)mem_op - MO_EXCL)]; \
 +})
 +
  /* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr,
   * uintxx_t val, int mmu_idx, uintptr_t 
 ra)
   */
 @@ -1154,6 +1165,17 @@ static void * const qemu_st_helpers[16] = {
  [MO_BEQ]  = helper_be_stq_mmu,
  };

 +/* StoreConditional helpers. Use the macro below to access them. */
 +static void * const qemu_stex_helpers[16] = {
 +[MO_LEUL] = helper_le_stcondl_mmu,
 +};
 +
 +#define STEX_HELPER(mem_op) \
 +({  \
 +assert(mem_op  MO_EXCL);   \
 +qemu_stex_helpers[(int)mem_op - MO_EXCL];   \
 +})
 +

 Same comments as for target-arm.

 Do we need to be protecting backends with HAS_LDST_EXCL defines or some
 such macro hackery? What currently happens if you use the new TCG ops
 when the backend doesn't support them? Is supporting all backends a
 prerequisite for the series?

I think that the ideal approach would be to have all the backends
implementing the slowpath for atomic instructions so that the
HAS_LDST_EXCL macro will not be needed. Then a frontend can rely on
the slowpath or not.
So, ideally, it's a prerequisite.

Regards,
alvise


  /* Perform the TLB load and compare.

 Inputs:
 @@ -1249,6 +1271,7 @@ static inline void tcg_out_tlb_load(TCGContext *s, 
 TCGReg addrlo, TCGReg addrhi,
   * for a load or store, so that we can later generate the correct helper 
 code
   */
  static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGMemOpIdx oi,
 +TCGReg llsc_success,
  TCGReg datalo, TCGReg datahi,
  TCGReg addrlo, TCGReg addrhi,
  tcg_insn_unit *raddr,
 @@ -1257,6 +1280,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool 
 is_ld, TCGMemOpIdx oi,
  TCGLabelQemuLdst *label = new_ldst_label(s);

  label-is_ld = is_ld;
 +label-llsc_success = llsc_success;
  label-oi = oi;
  label-datalo_reg = datalo;
  label-datahi_reg = datahi;
 @@ -1311,7 +1335,11 @@ static void tcg_out_qemu_ld_slow_path(TCGContext *s, 
 TCGLabelQemuLdst *l)
   (uintptr_t)l-raddr);
  }

 -tcg_out_call(s, qemu_ld_helpers[opc  (MO_BSWAP | MO_SIZE)]);
 +if (opc  MO_EXCL) {
 +tcg_out_call(s, LDEX_HELPER(opc));
 +} else {
 +tcg_out_call(s, qemu_ld_helpers[opc  ~MO_SIGN]);
 +}

  data_reg = l-datalo_reg;
  switch (opc  MO_SSIZE) {
 @@ -1415,9 +1443,16 @@ static void tcg_out_qemu_st_slow_path(TCGContext *s, 
 TCGLabelQemuLdst *l)
  }
  }

 -/* Tail call to the helper, with the return address back inline.  */
 -tcg_out_push(s, retaddr);
 -tcg_out_jmp(s, qemu_st_helpers[opc  (MO_BSWAP | MO_SIZE)]);
 +if (opc  MO_EXCL) {
 +tcg_out_call(s, STEX_HELPER(opc));
 +/* Save the output of the StoreConditional */
 +tcg_out_mov(s, TCG_TYPE_I32, l-llsc_success, TCG_REG_EAX);
 +tcg_out_jmp(s, l-raddr);
 +} else {
 +/* Tail call to the helper, with the return address back inline.  
 */
 +tcg_out_push(s, retaddr);
 +tcg_out_jmp(s, qemu_st_helpers[opc]);
 +}
  }
  #elif defined(__x86_64__)  defined(__linux__)
  # include asm/prctl.h
 @@ -1530,7 +1565,8 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, 
 TCGReg datalo, TCGReg datahi,
  /* XXX: qemu_ld and qemu_st could be modified to clobber only EDX and
 EAX. It will be useful once fixed registers globals are less
 common. */

[Qemu-devel] [PATCH v5 3/5] Introduce irqchip type specification for KVM

2015-07-17 Thread Pavel Fedin

This patch introduces kernel_irqchip_type member in Machine class, which
it passed to kvm_arch_irqchip_create. It allows machine models to specify
correct GIC type during KVM capability verification. The variable is
defined as int in order to be architecture-agnostic for potential future
uses by other architectures.

Signed-off-by: Pavel Fedin p.fe...@samsung.com
---
 hw/arm/vexpress.c|  1 +
 hw/arm/virt.c|  3 +++
 include/hw/boards.h  |  1 +
 include/sysemu/kvm.h |  3 ++-
 kvm-all.c|  2 +-
 stubs/kvm.c  |  2 +-
 target-arm/kvm.c | 13 +++--
 7 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
index da21788..0675a00 100644
--- a/hw/arm/vexpress.c
+++ b/hw/arm/vexpress.c
@@ -556,6 +556,7 @@ static void vexpress_common_init(MachineState *machine)
 const hwaddr *map = daughterboard-motherboard_map;
 int i;
 
+machine-kernel_irqchip_type = KVM_DEV_TYPE_ARM_VGIC_V2;
 daughterboard-init(vms, machine-ram_size, machine-cpu_model, pic);
 
 /*
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 4846892..2e7d858 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -945,6 +945,9 @@ static void virt_instance_init(Object *obj)
 Set on/off to enable/disable the ARM 
 Security Extensions (TrustZone),
 NULL);
+
+/* Default GIC type is v2 */
+vms-parent.kernel_irqchip_type = KVM_DEV_TYPE_ARM_VGIC_V2;
 }
 
 static void virt_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 2aec9cb..37eb767 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -127,6 +127,7 @@ struct MachineState {
 char *accel;
 bool kernel_irqchip_allowed;
 bool kernel_irqchip_required;
+int kernel_irqchip_type;
 int kvm_shadow_mem;
 char *dtb;
 char *dumpdtb;
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 983e99e..8f4d485 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -434,6 +434,7 @@ void kvm_init_irq_routing(KVMState *s);
 /**
  * kvm_arch_irqchip_create:
  * @KVMState: The KVMState pointer
+ * @type: irqchip type, architecture-specific
  *
  * Allow architectures to create an in-kernel irq chip themselves.
  *
@@ -441,7 +442,7 @@ void kvm_init_irq_routing(KVMState *s);
  *0: irq chip was not created
  *   0: irq chip was created
  */
-int kvm_arch_irqchip_create(KVMState *s);
+int kvm_arch_irqchip_create(KVMState *s, int type);
 
 /**
  * kvm_set_one_reg - set a register value in KVM via KVM_SET_ONE_REG ioctl
diff --git a/kvm-all.c b/kvm-all.c
index 06e06f2..8df938d 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1395,7 +1395,7 @@ static void kvm_irqchip_create(MachineState *machine, 
KVMState *s)
 
 /* First probe and see if there's a arch-specific hook to create the
  * in-kernel irqchip for us */
-ret = kvm_arch_irqchip_create(s);
+ret = kvm_arch_irqchip_create(s, machine-kernel_irqchip_type);
 if (ret == 0) {
 ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP);
 }
diff --git a/stubs/kvm.c b/stubs/kvm.c
index e7c60b6..a8505ff 100644
--- a/stubs/kvm.c
+++ b/stubs/kvm.c
@@ -1,7 +1,7 @@
 #include qemu-common.h
 #include sysemu/kvm.h
 
-int kvm_arch_irqchip_create(KVMState *s)
+int kvm_arch_irqchip_create(KVMState *s, int type)
 {
 return 0;
 }
diff --git a/target-arm/kvm.c b/target-arm/kvm.c
index 548bfd7..3992c1f 100644
--- a/target-arm/kvm.c
+++ b/target-arm/kvm.c
@@ -579,19 +579,28 @@ void kvm_arch_init_irq_routing(KVMState *s)
 {
 }
 
-int kvm_arch_irqchip_create(KVMState *s)
+int kvm_arch_irqchip_create(KVMState *s, int type)
 {
 int ret;
 
+/* Failure here means forgotten initialization of
+ * machine-kernel_irqchip_type in model code
+ */
+assert(type != 0);
+
 /* If we can create the VGIC using the newer device control API, we
  * let the device do this when it initializes itself, otherwise we
  * fall back to the old API */
 
-ret = kvm_create_device(s, KVM_DEV_TYPE_ARM_VGIC_V2, true);
+ret = kvm_create_device(s, type, true);
 if (ret == 0) {
 return 1;
 }
 
+/* Fallback will create VGIC v2 */
+if (type != KVM_DEV_TYPE_ARM_VGIC_V2) {
+return ret;
+}
 return 0;
 }
 
-- 
1.9.5.msysgit.0

[Qemu-devel] [PATCH v5 2/5] Extract some reusable vGIC code

2015-07-17 Thread Pavel Fedin

These functions are useful also for vGICv3 implementation. Make them accessible
from within other modules.

Actually kvm_dist_get() and kvm_dist_put() could also be made reusable, but
they would require two extra parameters (s-dev_fd and s-num_cpu) as well as
lots of typecasts of 's' to DeviceState * and back to GICState *. This makes
the code very ugly so i decided to stop at this point. I tried also an
approach with making a base class for all possible GICs, but it would contain
only three variables (dev_fd, cpu_num and irq_num), and accessing them through
the rest of the code would be again tedious (either ugly casts or qemu-style
separate object pointer). So i disliked it too.

Signed-off-by: Pavel Fedin p.fe...@samsung.com
---
 hw/intc/arm_gic_kvm.c | 84 ---
 hw/intc/vgic_common.h | 43 ++
 2 files changed, 82 insertions(+), 45 deletions(-)
 create mode 100644 hw/intc/vgic_common.h

diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
index f56bff1..9b6d702 100644
--- a/hw/intc/arm_gic_kvm.c
+++ b/hw/intc/arm_gic_kvm.c
@@ -23,6 +23,7 @@
 #include sysemu/kvm.h
 #include kvm_arm.h
 #include gic_internal.h
+#include vgic_common.h
 
 //#define DEBUG_GIC_KVM
 
@@ -52,7 +53,7 @@ typedef struct KVMARMGICClass {
 void (*parent_reset)(DeviceState *dev);
 } KVMARMGICClass;
 
-static void kvm_arm_gic_set_irq(void *opaque, int irq, int level)
+void kvm_arm_gic_set_irq(uint32_t num_irq, int irq, int level)
 {
 /* Meaning of the 'irq' parameter:
  *  [0..N-1] : external interrupts
@@ -63,10 +64,9 @@ static void kvm_arm_gic_set_irq(void *opaque, int irq, int 
level)
  * has separate fields in the irq number for type,
  * CPU number and interrupt number.
  */
-GICState *s = (GICState *)opaque;
 int kvm_irq, irqtype, cpu;
 
-if (irq  (s-num_irq - GIC_INTERNAL)) {
+if (irq  (num_irq - GIC_INTERNAL)) {
 /* External interrupt. The kernel numbers these like the GIC
  * hardware, with external interrupt IDs starting after the
  * internal ones.
@@ -77,7 +77,7 @@ static void kvm_arm_gic_set_irq(void *opaque, int irq, int 
level)
 } else {
 /* Internal interrupt: decode into (cpu, interrupt id) */
 irqtype = KVM_ARM_IRQ_TYPE_PPI;
-irq -= (s-num_irq - GIC_INTERNAL);
+irq -= (num_irq - GIC_INTERNAL);
 cpu = irq / GIC_INTERNAL;
 irq %= GIC_INTERNAL;
 }
@@ -87,12 +87,19 @@ static void kvm_arm_gic_set_irq(void *opaque, int irq, int 
level)
 kvm_set_irq(kvm_state, kvm_irq, !!level);
 }
 
+static void kvm_arm_gicv2_set_irq(void *opaque, int irq, int level)
+{
+GICState *s = (GICState *)opaque;
+
+kvm_arm_gic_set_irq(s-num_irq, irq, level);
+}
+
 static bool kvm_arm_gic_can_save_restore(GICState *s)
 {
 return s-dev_fd = 0;
 }
 
-static bool kvm_gic_supports_attr(GICState *s, int group, int attrnum)
+bool kvm_gic_supports_attr(int dev_fd, int group, int attrnum)
 {
 struct kvm_device_attr attr = {
 .group = group,
@@ -100,14 +107,14 @@ static bool kvm_gic_supports_attr(GICState *s, int group, 
int attrnum)
 .flags = 0,
 };
 
-if (s-dev_fd == -1) {
+if (dev_fd == -1) {
 return false;
 }
 
-return kvm_device_ioctl(s-dev_fd, KVM_HAS_DEVICE_ATTR, attr) == 0;
+return kvm_device_ioctl(dev_fd, KVM_HAS_DEVICE_ATTR, attr) == 0;
 }
 
-static void kvm_gic_access(GICState *s, int group, int offset,
+void kvm_gic_access(int dev_fd, int group, int offset,
int cpu, uint32_t *val, bool write)
 {
 struct kvm_device_attr attr;
@@ -130,7 +137,7 @@ static void kvm_gic_access(GICState *s, int group, int 
offset,
 type = KVM_GET_DEVICE_ATTR;
 }
 
-err = kvm_device_ioctl(s-dev_fd, type, attr);
+err = kvm_device_ioctl(dev_fd, type, attr);
 if (err  0) {
 fprintf(stderr, KVM_{SET/GET}_DEVICE_ATTR failed: %s\n,
 strerror(-err));
@@ -138,20 +145,6 @@ static void kvm_gic_access(GICState *s, int group, int 
offset,
 }
 }
 
-static void kvm_gicd_access(GICState *s, int offset, int cpu,
-uint32_t *val, bool write)
-{
-kvm_gic_access(s, KVM_DEV_ARM_VGIC_GRP_DIST_REGS,
-   offset, cpu, val, write);
-}
-
-static void kvm_gicc_access(GICState *s, int offset, int cpu,
-uint32_t *val, bool write)
-{
-kvm_gic_access(s, KVM_DEV_ARM_VGIC_GRP_CPU_REGS,
-   offset, cpu, val, write);
-}
-
 #define for_each_irq_reg(_ctr, _max_irq, _field_width) \
 for (_ctr = 0; _ctr  ((_max_irq) / (32 / (_field_width))); _ctr++)
 
@@ -291,7 +284,7 @@ static void kvm_dist_get(GICState *s, uint32_t offset, int 
width,
 irq = i * regsz;
 cpu = 0;
 while ((cpu  s-num_cpu  irq  GIC_INTERNAL) || cpu == 0) {
-kvm_gicd_access(s, offset, cpu, reg, false);
+kvm_gicd_access(s-dev_fd, offset, cpu, reg,

Re: [Qemu-devel] [RFC v3 07/13] ram_addr.h: Make exclusive bitmap accessors atomic

2015-07-17 Thread Alex Bennée


Alvise Rigo a.r...@virtualopensystems.com writes:

 Suggested-by: Jani Kokkonen jani.kokko...@huawei.com
 Suggested-by: Claudio Fontana claudio.font...@huawei.com
 Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
 ---
  include/exec/ram_addr.h | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

 diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
 index 2766541..e51bd65 100644
 --- a/include/exec/ram_addr.h
 +++ b/include/exec/ram_addr.h
 @@ -255,7 +255,7 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(unsigned 
 long *dest,
  /* Exclusive bitmap accessors. */
  static inline void cpu_physical_memory_set_excl_dirty(ram_addr_t addr)
  {
 -set_bit(addr  TARGET_PAGE_BITS,
 +set_bit_atomic(addr  TARGET_PAGE_BITS,
  ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE]);
  }
  
 @@ -267,8 +267,8 @@ static inline int 
 cpu_physical_memory_excl_is_dirty(ram_addr_t addr)
  
  static inline void cpu_physical_memory_clear_excl_dirty(ram_addr_t addr)
  {
 -clear_bit(addr  TARGET_PAGE_BITS,
 -  ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE]);
 +
 bitmap_test_and_clear_atomic(ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE],
 + addr  TARGET_PAGE_BITS, 1);

Does this call for simply implementing a clear_bit_atomic() rather than
the fancy bitmap_test_and_clear_atomic. Looking at atomic.h it seems the
primitives you need are there.

  }
  
  #endif

-- 
Alex Bennée

Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-17 Thread Marc Zyngier

On Fri, 17 Jul 2015 10:30:38 +0100
Paolo Bonzini pbonz...@redhat.com wrote:

 
 
 On 17/07/2015 06:44, Paolo Bonzini wrote:
  
  
  On 16/07/2015 21:05, Richard W.M. Jones wrote:
 
  Sorry to spoil things, but I'm still seeing this bug, although it is
  now a lot less frequent with your patch.  I would estimate it happens
  more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs
  with qemu.git + the v2 patch series.
 
  It's the exact same hang in both cases.
 
  Is it possible that this patch doesn't completely close any race?
 
  Still, it is an improvement, so there is that.
  
  Would seem at first glance like a different bug.
  
  Interestingly, adding some tracing (qemu_clock_get_ns) makes the bug
  more likely: now it reproduces in about 10 tries.  Of course :) adding
  other kinds of tracing instead make it go away again (50 tries).
  
  Perhaps this:
  
 i/o thread vcpu thread   worker thread
 -
 lock_iothread
 notify_me = 1
 ...
 unlock_iothread
lock_iothread
notify_me = 3
ppoll
notify_me = 1
   bh-scheduled = 1
   event_notifier_set
event_notifier_test_and_clear
 ppoll
  ^^ hang
  
  In the exact shape above, it doesn't seem too likely to happen, but
  perhaps there's another simpler case.  Still, the bug exists.
  
  The above is not really related to notify_me.  Here the notification is
  not being optimized away!  So I wonder if this one has been there forever.
  
  Fam suggested putting the event_notifier_test_and_clear before
  aio_bh_poll(), but it does not work.  I'll look more close
  
  However, an unconditional event_notifier_test_and_clear is pretty
  expensive.  On one hand, obviously correctness comes first.  On the
  other hand, an expensive operation at the wrong place can mask the race
  very easily; I'll let the fix run for a while, but I'm not sure if a
  successful test really says anything useful.
 
 So it may not be useful, but still successful test is successful. :)
 The following patch, which also includes the delta between v2 and v3
 of this series, survived 674 reboots before hitting a definitely
 unrelated problem:
 
 error: kvm run failed Function not implemented
 PC=bf671210  SP=c1f0
 X00=0a003e70 X01= X02=bf680548 
 X03=0030
 X04=bbb5fc18 X05=004b7770 X06=bf721930 
 X07=009a
 X08=bf716858 X09=0090 X10= 
 X11=0046
 X12=a007e03a X13= X14= 
 X15=
 X16=bf716df0 X17= X18= 
 X19=bf6f5f18
 X20= X21= X22= 
 X23=
 X24= X25= X26= 
 X27=
 X28= X29= X30= 
 PSTATE=6305 (flags -ZC-)
 
 For the record, this is the kvm_run struct:
 
 $6 = {request_interrupt_window = 0 '\000', padding1 = 
 \000\000\000\000\000\000, exit_reason = 0, 
   ready_for_interrupt_injection = 0 '\000', if_flag = 0 '\000', flags = 0, 
 cr8 = 0, apic_base = 0, {hw = {
   hardware_exit_reason = 150994968}, fail_entry = 
 {hardware_entry_failure_reason = 150994968}, ex = {
   exception = 150994968, error_code = 0}, io = {direction = 24 '\030', 
 size = 0 '\000', port = 2304, 
   count = 0, data_offset = 144}, debug = {arch = {No data fields}}, 
 mmio = {phys_addr = 150994968, 
   data = \220\000\000\000\000\000\000, len = 4, is_write = 0 '\000'}, 
 hypercall = {nr = 150994968, 
   args = {144, 4, 0, 0, 0, 0}, ret = 0, longmode = 0, pad = 0}, 
 tpr_access = {rip = 150994968, 
   is_write = 144, pad = 0}, s390_sieic = {icptcode = 24 '\030', ipa = 
 2304, ipb = 0}, 
 s390_reset_flags = 150994968, s390_ucontrol = {trans_exc_code = 
 150994968, pgm_code = 144}, dcr = {
   dcrn = 150994968, data = 0, is_write = 144 '\220'}, internal = 
 {suberror = 150994968, ndata = 0, 
   data = {144, 4, 0 repeats 14 times}}, osi = {gprs = {150994968, 144, 
 4, 0 repeats 29 times}}, 
 papr_hcall = {nr = 150994968, ret = 144, args = {4, 0, 0, 0, 0, 0, 0, 0, 
 0}}, s390_tsch = {
   subchannel_id = 24, subchannel_nr = 2304, io_int_parm = 0, io_int_word 
 = 144, ipb = 0, 
   dequeued = 4 '\004'}, epr = {epr = 150994968}, system_event = {type = 
 150994968, flags = 144}, 
 s390_stsi = {addr = 150994968, ar = 144 '\220', reserved = 0 '\000', fc = 
 0 '\000', sel1 = 0 '\000', 
   sel2 = 0}, 
 padding = 
 \030\000\000\t\000\000\000\000\220\000\000\000\000\000\000\000\004, '\000' 
 repeats 238

Re: [Qemu-devel] [PATCH v4 1/1] vhost user: add support of live migration

2015-07-17 Thread Marc-André Lureau

Hi

On Fri, Jul 17, 2015 at 2:57 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 LOG_FD is implemented in the kernel drivers/vhost/vhost.c.  It seems to
 be an eventfd-like mechanism to save on dirty bitmap scans.  However,
 it's not well documented how to implement it in a correct way.

and it's not used by qemu, so hard to say if it actually work well.

 In any case, live migration needs a new message type (like LOG_MMAP_FD)
 in the vhost-user protocol.

Yes, perhaps with size and offset. I am looking at this.


-- 
Marc-André Lureau

Re: [Qemu-devel] [virtio guest] vring_need_event() from virtqueue_kick_prepare()

2015-07-17 Thread Stefan Hajnoczi

On Fri, Jul 17, 2015 at 09:30:03AM +0300, Catalin Vasile wrote:
 Do you mean vhost_net - old kernel, qemu - latest, guest - latest?

I mean whatever combination didn't work for your custom vhost module.
If vhost_net is broken too there is probably a vhost/QEMU bug that
should be fixed.

Stefan


pgpaYd2oKLC3H.pgp
Description: PGP signature

Re: [Qemu-devel] [RFC v3 05/13] target-arm: translate: implement qemu_ldlink and qemu_stcond ops

2015-07-17 Thread alvise rigo

On Fri, Jul 17, 2015 at 2:51 PM, Alex Bennée alex.ben...@linaro.org wrote:

 Alvise Rigo a.r...@virtualopensystems.com writes:

 Implement strex and ldrex instruction relying on TCG's qemu_ldlink and
 qemu_stcond.  For the time being only the 32bit instructions are supported.

 Suggested-by: Jani Kokkonen jani.kokko...@huawei.com
 Suggested-by: Claudio Fontana claudio.font...@huawei.com
 Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
 ---
  target-arm/translate.c |  87 ++-
  tcg/arm/tcg-target.c   | 121 
 +
  2 files changed, 178 insertions(+), 30 deletions(-)

 diff --git a/target-arm/translate.c b/target-arm/translate.c
 index 80302cd..0366c76 100644
 --- a/target-arm/translate.c
 +++ b/target-arm/translate.c
 @@ -72,6 +72,8 @@ static TCGv_i64 cpu_exclusive_test;
  static TCGv_i32 cpu_exclusive_info;
  #endif

 +static TCGv_i32 cpu_ll_sc_context;
 +
  /* FIXME:  These should be removed.  */
  static TCGv_i32 cpu_F0s, cpu_F1s;
  static TCGv_i64 cpu_F0d, cpu_F1d;
 @@ -103,6 +105,8 @@ void arm_translate_init(void)
  offsetof(CPUARMState, exclusive_addr), exclusive_addr);
  cpu_exclusive_val = tcg_global_mem_new_i64(TCG_AREG0,
  offsetof(CPUARMState, exclusive_val), exclusive_val);
 +cpu_ll_sc_context = tcg_global_mem_new_i32(TCG_AREG0,
 +offsetof(CPUARMState, ll_sc_context), ll_sc_context);
  #ifdef CONFIG_USER_ONLY
  cpu_exclusive_test = tcg_global_mem_new_i64(TCG_AREG0,
  offsetof(CPUARMState, exclusive_test), exclusive_test);
 @@ -961,6 +965,18 @@ DO_GEN_ST(8, MO_UB)
  DO_GEN_ST(16, MO_TEUW)
  DO_GEN_ST(32, MO_TEUL)

 +/* Load/Store exclusive generators (always unsigned) */
 +static inline void gen_aa32_ldex32(TCGv_i32 val, TCGv_i32 addr, int index)
 +{
 +tcg_gen_qemu_ldlink_i32(val, addr, index, MO_TEUL | MO_EXCL);
 +}
 +
 +static inline void gen_aa32_stex32(TCGv_i32 is_dirty, TCGv_i32 val,
 +   TCGv_i32 addr, int index)
 +{
 +tcg_gen_qemu_stcond_i32(is_dirty, val, addr, index, MO_TEUL | MO_EXCL);
 +}
 +
  static inline void gen_set_pc_im(DisasContext *s, target_ulong val)
  {
  tcg_gen_movi_i32(cpu_R[15], val);
 @@ -7427,6 +7443,26 @@ static void gen_load_exclusive(DisasContext *s, int 
 rt, int rt2,
  store_reg(s, rt, tmp);
  }

 +static void gen_load_exclusive_multi(DisasContext *s, int rt, int rt2,
 + TCGv_i32 addr, int size)
 +{
 +TCGv_i32 tmp = tcg_temp_new_i32();
 +
 +switch (size) {
 +case 0:
 +case 1:
 +abort();
 +case 2:
 +gen_aa32_ldex32(tmp, addr, get_mem_index(s));
 +break;
 +case 3:
 +default:
 +abort();
 +}
 +
 +store_reg(s, rt, tmp);
 +}
 +
  static void gen_clrex(DisasContext *s)
  {
  gen_helper_atomic_clear(cpu_env);
 @@ -7460,6 +7496,52 @@ static void gen_store_exclusive(DisasContext *s, int 
 rd, int rt, int rt2,
  tcg_temp_free_i64(val);
  tcg_temp_free_i32(tmp_size);
  }
 +
 +static void gen_store_exclusive_multi(DisasContext *s, int rd, int rt, int 
 rt2,
 +  TCGv_i32 addr, int size)
 +{
 +TCGv_i32 tmp;
 +TCGv_i32 is_dirty;
 +TCGLabel *done_label;
 +TCGLabel *fail_label;
 +
 +fail_label = gen_new_label();
 +done_label = gen_new_label();
 +
 +tmp = tcg_temp_new_i32();
 +is_dirty = tcg_temp_new_i32();
 +
 +/* Fail if we are not in LL/SC context. */
 +tcg_gen_brcondi_i32(TCG_COND_NE, cpu_ll_sc_context, 1, fail_label);
 +
 +tmp = load_reg(s, rt);
 +switch (size) {
 +case 0:
 +case 1:
 +abort();
 +break;
 +case 2:
 +gen_aa32_stex32(is_dirty, tmp, addr, get_mem_index(s));
 +break;
 +case 3:
 +default:
 +abort();
 +}
 +
 +tcg_temp_free_i32(tmp);
 +
 +/* Check if the store conditional has to fail. */
 +tcg_gen_brcondi_i32(TCG_COND_EQ, is_dirty, 1, fail_label);
 +tcg_temp_free_i32(is_dirty);
 +
 +tcg_temp_free_i32(tmp);
 +
 +tcg_gen_movi_i32(cpu_R[rd], 0); /* is_dirty = 0 */
 +tcg_gen_br(done_label);
 +gen_set_label(fail_label);
 +tcg_gen_movi_i32(cpu_R[rd], 1); /* is_dirty = 1 */
 +gen_set_label(done_label);
 +}
  #endif

  /* gen_srs:
 @@ -8308,7 +8390,7 @@ static void disas_arm_insn(DisasContext *s, unsigned 
 int insn)
  } else if (insn  (1  20)) {
  switch (op1) {
  case 0: /* ldrex */
 -gen_load_exclusive(s, rd, 15, addr, 2);
 +gen_load_exclusive_multi(s, rd, 15, addr, 
 2);
  break;
  case 1: /* ldrexd */
  gen_load_exclusive(s, rd, rd + 1, addr, 3);
 @@ -8326,7 +8408,8 @@ static void disas_arm_insn(DisasContext *s, unsigned 
 int insn)
  rm =

Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx-dispatching is dead, all hail ctx-notify_me

2015-07-17 Thread Paolo Bonzini



On 17/07/2015 14:58, Richard W.M. Jones wrote:
 On Fri, Jul 17, 2015 at 11:30:38AM +0200, Paolo Bonzini wrote:
 error: kvm run failed Function not implemented
 PC=bf671210  SP=c1f0
 X00=0a003e70 X01= X02=bf680548 
 X03=0030
 X04=bbb5fc18 X05=004b7770 X06=bf721930 
 X07=009a
 X08=bf716858 X09=0090 X10= 
 X11=0046
 X12=a007e03a X13= X14= 
 X15=
 X16=bf716df0 X17= X18= 
 X19=bf6f5f18
 X20= X21= X22= 
 X23=
 X24= X25= X26= 
 X27=
 X28= X29= X30= 
 PSTATE=6305 (flags -ZC-)
 
 Vaguely reminiscent of this bug:
 
 https://bugzilla.redhat.com/show_bug.cgi?id=1194366
 
 (See comment 7 in particular)

Must be it, I was using an old kernel.

Thanks!

Paolo

 diff --git a/aio-posix.c b/aio-posix.c
 index 268d14d..d2011d0 100644
 --- a/aio-posix.c
 +++ b/aio-posix.c
 @@ -273,6 +273,13 @@ bool aio_poll(AioContext *ctx, bool blocking)
  aio_context_acquire(ctx);
  }
  
 +/* This should be optimized... */
 +event_notifier_test_and_clear(ctx-notifier);
 +
 +if (blocking) {
 +atomic_sub(ctx-notify_me, 2);
 +}
 +
  /* if we have any readable fds, dispatch event */
  if (ret  0) {
  for (i = 0; i  npfd; i++) {
 @@ -283,10 +290,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
  npfd = 0;
  ctx-walking_handlers--;
  
 -if (blocking) {
 -atomic_sub(ctx-notify_me, 2);
 -}
 -
  /* Run dispatch even if there were no readable fds to run timers */
  if (aio_dispatch(ctx)) {
  progress = true;
 diff --git a/aio-win32.c b/aio-win32.c
 index 2bfd5f8..33809fd 100644
 --- a/aio-win32.c
 +++ b/aio-win32.c
 @@ -326,6 +326,10 @@ bool aio_poll(AioContext *ctx, bool blocking)
  if (timeout) {
  aio_context_acquire(ctx);
  }
 +
 +/* This should be optimized... */
 +event_notifier_test_and_clear(ctx-notifier);
 +
  if (blocking) {
  assert(first);
  atomic_sub(ctx-notify_me, 2);
 diff --git a/async.c b/async.c
 index 9204907..120e183 100644
 --- a/async.c
 +++ b/async.c
 @@ -202,6 +202,9 @@ aio_ctx_check(GSource *source)
  AioContext *ctx = (AioContext *) source;
  QEMUBH *bh;
  
 +/* This should be optimized... */
 +event_notifier_test_and_clear(ctx-notifier);
 +
  atomic_and(ctx-notify_me, ~1);
  for (bh = ctx-first_bh; bh; bh = bh-next) {
  if (!bh-deleted  bh-scheduled) {
 @@ -280,6 +280,10 @@ static void aio_rfifolock_cb(void *opaque)
  aio_notify(opaque);
  }
  
 +static void event_notifier_dummy_cb(EventNotifier *e)
 +{
 +}
 +
  AioContext *aio_context_new(Error **errp)
  {
  int ret;
 @@ -292,7 +296,7 @@ AioContext *aio_context_new(Error **errp)
  return NULL;
  }
  g_source_set_can_recurse(ctx-source, true);
 -aio_set_event_notifier(ctx, ctx-notifier, NULL);
 +aio_set_event_notifier(ctx, ctx-notifier, event_notifier_dummy_cb);
  ctx-thread_pool = NULL;
  qemu_mutex_init(ctx-bh_lock);
  rfifolock_init(ctx-lock, aio_rfifolock_cb, ctx);
 
 With this patch, I've got to 500 iterations without seeing the error.
 
 Still testing it however ...
 
 Rich.

[Qemu-devel] [PATCH v5 5/5] Add gicversion option to virt machine

2015-07-17 Thread Pavel Fedin

Set kernel_irqchip_type according to value of the option and pass it
around where necessary. Instantiate devices and fdt nodes according
to the choice.

mac_cpus for virt machine increased to 64. GICv2 compatibility check
happens inside arm_gic_common_realize().

Signed-off-by: Pavel Fedin p.fe...@samsung.com
---
 hw/arm/virt.c | 137 ++
 include/hw/arm/fdt.h  |   2 +-
 include/hw/arm/virt.h |   6 ++-
 3 files changed, 121 insertions(+), 24 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 2e7d858..afbeedf 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -48,6 +48,8 @@
 #include hw/arm/sysbus-fdt.h
 #include hw/platform-bus.h
 #include hw/arm/fdt.h
+#include linux/kvm.h /* For KVM_DEV_TYPE_ARM_VGIC_V{2|3} */
+#include qapi/visitor.h
 
 /* Number of external interrupt lines to configure the GIC with */
 #define NUM_IRQS 256
@@ -106,7 +108,12 @@ static const MemMapEntry a15memmap[] = {
 /* GIC distributor and CPU interfaces sit inside the CPU peripheral space 
*/
 [VIRT_GIC_DIST] =   { 0x0800, 0x0001 },
 [VIRT_GIC_CPU] ={ 0x0801, 0x0001 },
-[VIRT_GIC_V2M] ={ 0x0802, 0x1000 },
+[VIRT_GIC_V2M] ={ 0x0802, 0x0001 },
+/* On v3 VIRT_GIC_DIST_MBI and VIRT_ITS_CONTROL take place of
+ * VIRT_GIC_CPU and VIRT_GIC_V2M respectively
+ */
+[VIRT_ITS_TRANSLATION] ={ 0x0803, 0x0001 },
+[VIRT_LPI] ={ 0x0804, 0x0080 },
 [VIRT_UART] =   { 0x0900, 0x1000 },
 [VIRT_RTC] ={ 0x0901, 0x1000 },
 [VIRT_FW_CFG] = { 0x0902, 0x000a },
@@ -256,10 +263,13 @@ static void fdt_add_timer_nodes(const VirtBoardInfo *vbi)
  * they are edge-triggered.
  */
 ARMCPU *armcpu;
+uint32_t max;
 uint32_t irqflags = GIC_FDT_IRQ_FLAGS_EDGE_LO_HI;
 
+/* Argument is 32 bit but 8 bits are reserved for flags */
+max = (vbi-smp_cpus = 24) ? 24 : vbi-smp_cpus;
 irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
- GIC_FDT_IRQ_PPI_CPU_WIDTH, (1  vbi-smp_cpus) - 1);
+ GIC_FDT_IRQ_PPI_CPU_WIDTH, (1  max) - 1);
 
 qemu_fdt_add_subnode(vbi-fdt, /timer);
 
@@ -283,6 +293,18 @@ static void fdt_add_cpu_nodes(const VirtBoardInfo *vbi)
 {
 int cpu;
 
+/*
+ * From Documentation/devicetree/bindings/arm/cpus.txt
+ *  On ARM v8 64-bit systems value should be set to 2,
+ *  that corresponds to the MPIDR_EL1 register size.
+ *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
+ *  in the system, #address-cells can be set to 1, since
+ *  MPIDR_EL1[63:32] bits are not used for CPUs
+ *  identification.
+ *
+ *  Now GIC500 doesn't support affinities 2  3 so currently
+ *  #address-cells can stay 1 until future GIC
+ */
 qemu_fdt_add_subnode(vbi-fdt, /cpus);
 qemu_fdt_setprop_cell(vbi-fdt, /cpus, #address-cells, 0x1);
 qemu_fdt_setprop_cell(vbi-fdt, /cpus, #size-cells, 0x0);
@@ -319,25 +341,36 @@ static void fdt_add_v2m_gic_node(VirtBoardInfo *vbi)
 qemu_fdt_setprop_cell(vbi-fdt, /intc/v2m, phandle, vbi-v2m_phandle);
 }
 
-static void fdt_add_gic_node(VirtBoardInfo *vbi)
+static void fdt_add_gic_node(VirtBoardInfo *vbi, int type)
 {
 vbi-gic_phandle = qemu_fdt_alloc_phandle(vbi-fdt);
 qemu_fdt_setprop_cell(vbi-fdt, /, interrupt-parent, vbi-gic_phandle);
 
 qemu_fdt_add_subnode(vbi-fdt, /intc);
-/* 'cortex-a15-gic' means 'GIC v2' */
-qemu_fdt_setprop_string(vbi-fdt, /intc, compatible,
-arm,cortex-a15-gic);
 qemu_fdt_setprop_cell(vbi-fdt, /intc, #interrupt-cells, 3);
 qemu_fdt_setprop(vbi-fdt, /intc, interrupt-controller, NULL, 0);
-qemu_fdt_setprop_sized_cells(vbi-fdt, /intc, reg,
- 2, vbi-memmap[VIRT_GIC_DIST].base,
- 2, vbi-memmap[VIRT_GIC_DIST].size,
- 2, vbi-memmap[VIRT_GIC_CPU].base,
- 2, vbi-memmap[VIRT_GIC_CPU].size);
 qemu_fdt_setprop_cell(vbi-fdt, /intc, #address-cells, 0x2);
 qemu_fdt_setprop_cell(vbi-fdt, /intc, #size-cells, 0x2);
 qemu_fdt_setprop(vbi-fdt, /intc, ranges, NULL, 0);
+if (type == KVM_DEV_TYPE_ARM_VGIC_V3) {
+qemu_fdt_setprop_string(vbi-fdt, /intc, compatible,
+arm,gic-v3);
+qemu_fdt_setprop_sized_cells(vbi-fdt, /intc, reg,
+ 2, vbi-memmap[VIRT_GIC_DIST].base,
+ 2, vbi-memmap[VIRT_GIC_DIST].size,
+ 2, vbi-memmap[VIRT_LPI].base,
+ 2, vbi-memmap[VIRT_LPI].size);
+} else {
+/* 'cortex-a15-gic' means 'GIC v2' */
+qemu_fdt_setprop_string(vbi-fdt, /intc, compatible,
+

[Qemu-devel] [PATCH v5 0/5] [PATCH v4 0/5] vGICv3 support

2015-07-17 Thread Pavel Fedin

This series introduces support for GICv3 by KVM. Software emulation is
currently not supported.

Differences from v4:
- Do not reintroduce several constants shared with GICv2, reuse them instead.
- Added gicv3_init_irqs_and_mmio() in base class, to be used by both software
  emulation and KVM code. Avoids code duplication.
- Do not add NULL msi-parent phandle to PCI device in the FDT
- Removed a couple of stale things from virt.c

Differences from v3:
- Fixed stupid build breakage in patch 0002
- Rebased on top of current master, patch 0003 adjusted according to
  kvm_irqchip_create() changes
- Added assertion against uninitialized kernel_irqchip_type
- Removed kernel_irqchip_type initialization from models which do not
  use KVM vGIC

Differences from v2:
- Removed some unrelated and unnecessary changes from virt machine,
  occasionally slipped in; some of them caused qemu to crash on ARM32.
- Fixed build for ARM32; vGICv3 code requires definitions which are
  present only in ARM64 kernel

Differences from v1:
- Base class included, taken from the series by Shlomo Pongratz:
  http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg01512.html
  The code is refactored as little as possible in order to simplify
  further addition of software emulation:
  - Minor fixes in code style and comments, according to old reviews
  - Removed REV_V3 definition because it's currently not used, and it does
not add any meaning to number 3.
  - Removed reserved regions for MBI and ITS (except for 'virt' machine
memory map). These should go to separate classes when implemented.
- Improved commit messages
- vGIC patches restructured
- Use 'gicversion' option instead of virt-v3 machine

Pavel Fedin (4):
  Extract some reusable vGIC code
  Introduce irqchip type specification for KVM
  Initial implementation of vGICv3
  Add gicversion option to virt machine

Shlomo Pongratz (1):
  Implement GIC-500 base class

 hw/arm/vexpress.c  |   1 +
 hw/arm/virt.c  | 140 
 hw/intc/Makefile.objs  |   4 +
 hw/intc/arm_gic_kvm.c  |  84 ++--
 hw/intc/arm_gicv3_common.c | 260 +
 hw/intc/arm_gicv3_kvm.c| 159 +++
 hw/intc/gicv3_internal.h   | 156 ++
 hw/intc/vgic_common.h  |  43 ++
 include/hw/arm/fdt.h   |   2 +-
 include/hw/arm/virt.h  |   6 +-
 include/hw/boards.h|   1 +
 include/hw/intc/arm_gicv3_common.h | 113 
 include/sysemu/kvm.h   |   3 +-
 kvm-all.c  |   2 +-
 stubs/kvm.c|   2 +-
 target-arm/kvm.c   |  13 +-
 16 files changed, 915 insertions(+), 74 deletions(-)
 create mode 100644 hw/intc/arm_gicv3_common.c
 create mode 100644 hw/intc/arm_gicv3_kvm.c
 create mode 100644 hw/intc/gicv3_internal.h
 create mode 100644 hw/intc/vgic_common.h
 create mode 100644 include/hw/intc/arm_gicv3_common.h

-- 
1.9.5.msysgit.0

[Qemu-devel] [PATCH v5 4/5] Initial implementation of vGICv3

2015-07-17 Thread Pavel Fedin

Get/put routines are missing, live migration is not possible.

Signed-off-by: Pavel Fedin p.fe...@samsung.com
---
 hw/intc/Makefile.objs   |   3 +
 hw/intc/arm_gicv3_kvm.c | 159 
 2 files changed, 162 insertions(+)
 create mode 100644 hw/intc/arm_gicv3_kvm.c

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 1317e5a..e2525a8 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -17,6 +17,9 @@ common-obj-$(CONFIG_OPENPIC) += openpic.o
 
 obj-$(CONFIG_APIC) += apic.o apic_common.o
 obj-$(CONFIG_ARM_GIC_KVM) += arm_gic_kvm.o
+ifeq ($(ARCH), aarch64) # Only 64-bit KVM can use these
+obj-$(CONFIG_ARM_GIC_KVM) += arm_gicv3_kvm.o
+endif
 obj-$(CONFIG_STELLARIS) += armv7m_nvic.o
 obj-$(CONFIG_EXYNOS4) += exynos4210_gic.o exynos4210_combiner.o
 obj-$(CONFIG_GRLIB) += grlib_irqmp.o
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
new file mode 100644
index 000..e2724bc
--- /dev/null
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -0,0 +1,159 @@
+/*
+ * ARM Generic Interrupt Controller using KVM in-kernel support
+ *
+ * Copyright (c) 2015 Samsung Electronics Co., Ltd.
+ * Written by Pavel Fedin
+ * Based on vGICv2 code by Peter Maydell
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include hw/sysbus.h
+#include sysemu/kvm.h
+#include kvm_arm.h
+#include gicv3_internal.h
+#include vgic_common.h
+
+#ifdef DEBUG_GICV3_KVM
+static const int debug_gicv3_kvm = 1;
+#else
+static const int debug_gicv3_kvm = 0;
+#endif
+
+#define DPRINTF(fmt, ...) do { \
+if (debug_gicv3_kvm) { \
+printf(kvm_gicv3:  fmt , ## __VA_ARGS__); \
+} \
+} while (0)
+
+#define TYPE_KVM_ARM_GICV3 kvm-arm-gicv3
+#define KVM_ARM_GICV3(obj) \
+ OBJECT_CHECK(GICv3State, (obj), TYPE_KVM_ARM_GICV3)
+#define KVM_ARM_GICV3_CLASS(klass) \
+ OBJECT_CLASS_CHECK(KVMARMGICv3Class, (klass), TYPE_KVM_ARM_GICV3)
+#define KVM_ARM_GICV3_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(KVMARMGICv3Class, (obj), TYPE_KVM_ARM_GICV3)
+
+typedef struct KVMARMGICv3Class {
+ARMGICv3CommonClass parent_class;
+DeviceRealize parent_realize;
+void (*parent_reset)(DeviceState *dev);
+} KVMARMGICv3Class;
+
+static void kvm_arm_gicv3_set_irq(void *opaque, int irq, int level)
+{
+GICv3State *s = (GICv3State *)opaque;
+
+kvm_arm_gic_set_irq(s-num_irq, irq, level);
+}
+
+static void kvm_arm_gicv3_put(GICv3State *s)
+{
+/* TODO */
+DPRINTF(Cannot put kernel gic state, no kernel interface\n);
+}
+
+static void kvm_arm_gicv3_get(GICv3State *s)
+{
+/* TODO */
+DPRINTF(Cannot get kernel gic state, no kernel interface\n);
+}
+
+static void kvm_arm_gicv3_reset(DeviceState *dev)
+{
+GICv3State *s = ARM_GICV3_COMMON(dev);
+KVMARMGICv3Class *kgc = KVM_ARM_GICV3_GET_CLASS(s);
+
+DPRINTF(Reset\n);
+
+kgc-parent_reset(dev);
+kvm_arm_gicv3_put(s);
+}
+
+static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
+{
+GICv3State *s = KVM_ARM_GICV3(dev);
+KVMARMGICv3Class *kgc = KVM_ARM_GICV3_GET_CLASS(s);
+Error *local_err = NULL;
+int ret;
+
+DPRINTF(kvm_arm_gicv3_realize\n);
+
+kgc-parent_realize(dev, local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
+gicv3_init_irqs_and_mmio(s, kvm_arm_gicv3_set_irq, NULL, NULL);
+
+/* Try to create the device via the device control API */
+s-dev_fd = -1;
+ret = kvm_create_device(kvm_state, KVM_DEV_TYPE_ARM_VGIC_V3, false);
+if (ret = 0) {
+s-dev_fd = ret;
+} else if (ret != -ENODEV  ret != -ENOTSUP) {
+error_setg_errno(errp, -ret, error creating in-kernel VGIC);
+return;
+}
+
+if (kvm_gic_supports_attr(s-dev_fd, KVM_DEV_ARM_VGIC_GRP_NR_IRQS, 0)) {
+uint32_t numirqs = s-num_irq;
+DPRINTF(KVM_DEV_ARM_VGIC_GRP_NR_IRQS = %u\n, numirqs);
+kvm_gic_access(s-dev_fd, KVM_DEV_ARM_VGIC_GRP_NR_IRQS,
+   0, 0, numirqs, 1);
+}
+
+/* Tell the kernel to complete VGIC initialization now */
+if (kvm_gic_supports_attr(s-dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
+  KVM_DEV_ARM_VGIC_CTRL_INIT)) {
+   DPRINTF(KVM_DEV_ARM_VGIC_CTRL_INIT\n);
+kvm_gic_access(s-dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
+  KVM_DEV_ARM_VGIC_CTRL_INIT, 0, 0, 1);
+}
+
+

Re: [Qemu-devel] [PATCH v1 4/6] xen: use errno instead of rc for xc_domain_add_to_physmap

2015-07-17 Thread Stefano Stabellini

On Thu, 2 Jul 2015, Konrad Rzeszutek Wilk wrote:
 In Xen 4.6 commit cd2f100f0f61b3f333d52d1737dd73f02daee592
 libxc: Fix do_memory_op to return negative value on errors
 made the libxc API less odd-ball: On errors, return value is
 -1 and error code is in errno. On success the return value
 is either 0 or an positive value.
 
 Since we could be running with an old toolstack in which the
 Exx value is in rc or the newer, we add an wrapper around
 the xc_domain_add_to_physmap (called xen_xc_domain_add_to_physmap)
 which will always return the EXX.
 
 Suggested-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
 Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 ---
  include/hw/xen/xen_common.h | 22 ++
  xen-hvm.c   |  4 ++--
  2 files changed, 24 insertions(+), 2 deletions(-)
 
 diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
 index 38f29fb..fc66c3c 100644
 --- a/include/hw/xen/xen_common.h
 +++ b/include/hw/xen/xen_common.h
 @@ -407,4 +407,26 @@ static inline int xen_set_ioreq_server_state(XenXC xc, 
 domid_t dom,
  
  #endif
  
 +#if CONFIG_XEN_CTRL_INTERFACE_VERSION  460

This patch is fine as is and you can add my

Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

however there is no code to detect 460 at the moment, see
configure:1844. We need one more test case in the configure script to
detect 4.6.


 +static inline int xen_xc_domain_add_to_physmap(XenXC xch, uint32_t domid,
 +   unsigned int space,
 +   unsigned long idx,
 +   xen_pfn_t gpfn)
 +{
 +return xc_domain_add_to_physmap(xch, domid, space, idx, gpfn);
 +}
 +#else
 +static inline int xen_xc_domain_add_to_physmap(XenXC xch, uint32_t domid,
 +   unsigned int space,
 +   unsigned long idx,
 +   xen_pfn_t gpfn)
 +{
 +/* In Xen 4.6 rc is -1 and errno contains the error value. */
 +int rc = xc_domain_add_to_physmap(xch, domid, space, idx, gpfn);
 +if (rc == -1)
 +return errno;
 +return rc;
 +}
 +#endif
 +
  #endif /* QEMU_HW_XEN_COMMON_H */
 diff --git a/xen-hvm.c b/xen-hvm.c
 index 0408462..fae2d22 100644
 --- a/xen-hvm.c
 +++ b/xen-hvm.c
 @@ -345,7 +345,7 @@ go_physmap:
  unsigned long idx = pfn + i;
  xen_pfn_t gpfn = start_gpfn + i;
  
 -rc = xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, 
 idx, gpfn);
 +rc = xen_xc_domain_add_to_physmap(xen_xc, xen_domid, 
 XENMAPSPACE_gmfn, idx, gpfn);
  if (rc) {
  DPRINTF(add_to_physmap MFN %PRI_xen_pfn to PFN %
  PRI_xen_pfn failed: %d\n, idx, gpfn, rc);
 @@ -422,7 +422,7 @@ static int xen_remove_from_physmap(XenIOState *state,
  xen_pfn_t idx = start_addr + i;
  xen_pfn_t gpfn = phys_offset + i;
  
 -rc = xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, 
 idx, gpfn);
 +rc = xen_xc_domain_add_to_physmap(xen_xc, xen_domid, 
 XENMAPSPACE_gmfn, idx, gpfn);
  if (rc) {
  fprintf(stderr, add_to_physmap MFN %PRI_xen_pfn to PFN %
  PRI_xen_pfn failed: %d\n, idx, gpfn, rc);
 -- 
 2.1.0

Re: [Qemu-devel] [PATCH v1 5/6] xen/pt/msi: Add the register value when printing logging and error messages

2015-07-17 Thread Stefano Stabellini

On Thu, 2 Jul 2015, Konrad Rzeszutek Wilk wrote:
 We would like to know what the MSI register value is to help
 in troubleshooting in the field. As such modify the logging
 logic to include such details in xen_pt_msgctrl_reg_write.
 
 Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com

Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com


  hw/xen/xen_pt_config_init.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)
 
 diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
 index dd37be3..462e1b9 100644
 --- a/hw/xen/xen_pt_config_init.c
 +++ b/hw/xen/xen_pt_config_init.c
 @@ -1086,7 +1086,7 @@ static int 
 xen_pt_msgctrl_reg_write(XenPCIPassthroughState *s,
  /* setup MSI pirq for the first time */
  if (!msi-initialized) {
  /* Init physical one */
 -XEN_PT_LOG(s-dev, setup MSI\n);
 +XEN_PT_LOG(s-dev, setup MSI (register: %x).\n, *val);
  if (xen_pt_msi_setup(s)) {
  /* We do not broadcast the error to the framework code, so
   * that MSI errors are contained in MSI emulation code and
 @@ -1094,12 +1094,12 @@ static int 
 xen_pt_msgctrl_reg_write(XenPCIPassthroughState *s,
   * Guest MSI would be actually not working.
   */
  *val = ~PCI_MSI_FLAGS_ENABLE;
 -XEN_PT_WARN(s-dev, Can not map MSI.\n);
 +XEN_PT_WARN(s-dev, Can not map MSI (register: %x)!\n, 
 *val);
  return 0;
  }
  if (xen_pt_msi_update(s)) {
  *val = ~PCI_MSI_FLAGS_ENABLE;
 -XEN_PT_WARN(s-dev, Can not bind MSI\n);
 +XEN_PT_WARN(s-dev, Can not bind MSI (register: %x)!\n, 
 *val);
  return 0;
  }
  msi-initialized = true;
 -- 
 2.1.0

Re: [Qemu-devel] [RFC v3 12/13] softmmu_llsc_template.h: move to multithreading

2015-07-17 Thread Alex Bennée


Alvise Rigo a.r...@virtualopensystems.com writes:

 Update the TCG LL/SC instructions to work in multi-threading.

 The basic idea remains untouched, but the whole mechanism is improved to
 make use of the callback support to query TLB flush requests and the
 rendezvous callback to synchronize all the currently running vCPUs.

 In essence, if a vCPU wants to LL to a page which is not already set as
 EXCL, it will arrange a rendezvous with all the vCPUs that are executing
 a TB and query a TLB flush for *all* the vCPUs.
 Doing so, we make sure that:
 - the running vCPUs do not touch the EXCL page while the requesting vCPU
   is setting the transaction to EXCL of the page
 - all the vCPUs will have the EXCL flag in the TLB entry for that
   specific page *before* entering the next TB

 Suggested-by: Jani Kokkonen jani.kokko...@huawei.com
 Suggested-by: Claudio Fontana claudio.font...@huawei.com
 Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
 ---
  cputlb.c|  2 +
  include/exec/cpu-defs.h |  4 ++
  softmmu_llsc_template.h | 97 
 -
  3 files changed, 69 insertions(+), 34 deletions(-)

 diff --git a/cputlb.c b/cputlb.c
 index 66df41a..0566e0f 100644
 --- a/cputlb.c
 +++ b/cputlb.c
 @@ -30,6 +30,8 @@
  #include exec/ram_addr.h
  #include tcg/tcg.h
  
 +#include sysemu/cpus.h
 +
  void qemu_mutex_lock_iothread(void);
  void qemu_mutex_unlock_iothread(void);
  
 diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
 index c73a75f..40742b3 100644
 --- a/include/exec/cpu-defs.h
 +++ b/include/exec/cpu-defs.h
 @@ -169,5 +169,9 @@ typedef struct CPUIOTLBEntry {
  /* Used for atomic instruction translation. */  \
  bool ll_sc_context; \
  hwaddr excl_protected_hwaddr;   \
 +/* Used to carry the stcond result and also as a flag to flag a
 + * normal store access made by a stcond. */ \
 +int excl_succeeded; \
 +
  
  #endif
 diff --git a/softmmu_llsc_template.h b/softmmu_llsc_template.h
 index 81e9d8e..4105e72 100644
 --- a/softmmu_llsc_template.h
 +++ b/softmmu_llsc_template.h
 @@ -54,7 +54,21 @@
   (TARGET_PAGE_MASK | TLB_INVALID_MASK)); 
 \
  })   
 \
  
 -#define EXCLUSIVE_RESET_ADDR ULLONG_MAX
 +#define is_read_tlb_entry_set(env, page, index)  
 \
 +({   
 \
 +(addr  TARGET_PAGE_MASK)
 \
 + == ((env-tlb_table[mmu_idx][index].addr_read) 
 \
 + (TARGET_PAGE_MASK | TLB_INVALID_MASK)); 
 \
 +})   
 \
 +
 +/* Whenever a SC operation fails, we add a small delay to reduce the
 + * concurrency among the atomic instruction emulation code. Without this 
 delay,
 + * in very congested situation where plain stores make all the pending LLs
 + * fail, the code could reach a stalling situation in which all the SCs 
 happen
 + * to fail.
 + * TODO: make the delay dynamic according with the SC failing rate.
 + * */
 +#define TCG_ATOMIC_INSN_EMUL_DELAY 100

I'd be tempted to split out this sort of chicanery into a separate patch. 

  
  WORD_TYPE helper_le_ldlink_name(CPUArchState *env, target_ulong addr,
  TCGMemOpIdx oi, uintptr_t retaddr)
 @@ -65,35 +79,58 @@ WORD_TYPE helper_le_ldlink_name(CPUArchState *env, 
 target_ulong addr,
  hwaddr hw_addr;
  unsigned mmu_idx = get_mmuidx(oi);
  
 -/* Use the proper load helper from cpu_ldst.h */
 -ret = helper_ld_legacy(env, addr, mmu_idx, retaddr);
 -
 -/* The last legacy access ensures that the TLB and IOTLB entry for 'addr'
 - * have been created. */
  index = (addr  TARGET_PAGE_BITS)  (CPU_TLB_SIZE - 1);
 +if (!is_read_tlb_entry_set(env, addr, index)) {
 +tlb_fill(ENV_GET_CPU(env), addr, READ_ACCESS_TYPE, mmu_idx, retaddr);
 +}
  
  /* hw_addr = hwaddr of the page (i.e. section-mr-ram_addr + xlat)
   * plus the offset (i.e. addr  ~TARGET_PAGE_MASK) */
  hw_addr = (env-iotlb[mmu_idx][index].addr  TARGET_PAGE_MASK) + addr;
  
  /* Set the exclusive-protected hwaddr. */
 -env-excl_protected_hwaddr = hw_addr;
 -env-ll_sc_context = true;
 +qemu_mutex_lock(tcg_excl_access_lock);
 +if (cpu_physical_memory_excl_is_dirty(hw_addr)  !exit_flush_request) {
 +exit_flush_request = 1;
  
 -/* No need to mask hw_addr with TARGET_PAGE_MASK since
 - * cpu_physical_memory_excl_is_dirty() will take care of that. */
 -if (cpu_physical_memory_excl_is_dirty(hw_addr)) {
 -

Re: [Qemu-devel] [PATCH 2/2] tcg/i386: Reserve register for guest_base if a segment isn't available

2015-07-17 Thread Paolo Bonzini



On 17/07/2015 03:33, Peter Maydell wrote:
 On 16 July 2015 at 22:25, Richard Henderson r...@twiddle.net wrote:
  This saves 2 insns and 10 bytes from the implementation of
  each memory operation.
 
 Do we have an idea of which platforms/configs don't let
 us have a segment register for guest_base?

Everything non-Linux I think, which for user-mode emulation means bsd-user.

Paolo

Re: [Qemu-devel] [PATCH v1 6/6] xen/pt: Use XEN_PT_LOG properly to guard against compiler warnings.

2015-07-17 Thread Stefano Stabellini

On Thu, 2 Jul 2015, Konrad Rzeszutek Wilk wrote:
 If XEN_PT_LOGGING_ENABLED is enabled the XEN_PT_LOG macros start
 using the first argument. Which means if within the function there
 is only one user of the argument ('d') and XEN_PT_LOGGING_ENABLED
 is not set, we get compiler warnings. This is not the case now
 but with the xen/pt: Use xen_host_pci_get_[byte|word] instead of dev.config
 we will hit - so this sync up the function to the rest of them.
 
 Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com

Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com


  hw/xen/xen_pt_config_init.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
 index 462e1b9..21d4938 100644
 --- a/hw/xen/xen_pt_config_init.c
 +++ b/hw/xen/xen_pt_config_init.c
 @@ -1418,7 +1418,7 @@ static int 
 xen_pt_msixctrl_reg_init(XenPCIPassthroughState *s,
  reg_field = pci_get_word(d-config + real_offset);
  
  if (reg_field  PCI_MSIX_FLAGS_ENABLE) {
 -XEN_PT_LOG(d, MSIX already enabled, disabling it first\n);
 +XEN_PT_LOG(s-dev, MSIX already enabled, disabling it first\n);
  xen_host_pci_set_word(s-real_device, real_offset,
reg_field  ~PCI_MSIX_FLAGS_ENABLE);
  }
 -- 
 2.1.0

Re: [Qemu-devel] [RFC v3 12/13] softmmu_llsc_template.h: move to multithreading

2015-07-17 Thread alvise rigo

On Fri, Jul 17, 2015 at 5:27 PM, Alex Bennée alex.ben...@linaro.org wrote:

 Alvise Rigo a.r...@virtualopensystems.com writes:

 Update the TCG LL/SC instructions to work in multi-threading.

 The basic idea remains untouched, but the whole mechanism is improved to
 make use of the callback support to query TLB flush requests and the
 rendezvous callback to synchronize all the currently running vCPUs.

 In essence, if a vCPU wants to LL to a page which is not already set as
 EXCL, it will arrange a rendezvous with all the vCPUs that are executing
 a TB and query a TLB flush for *all* the vCPUs.
 Doing so, we make sure that:
 - the running vCPUs do not touch the EXCL page while the requesting vCPU
   is setting the transaction to EXCL of the page
 - all the vCPUs will have the EXCL flag in the TLB entry for that
   specific page *before* entering the next TB

 Suggested-by: Jani Kokkonen jani.kokko...@huawei.com
 Suggested-by: Claudio Fontana claudio.font...@huawei.com
 Signed-off-by: Alvise Rigo a.r...@virtualopensystems.com
 ---
  cputlb.c|  2 +
  include/exec/cpu-defs.h |  4 ++
  softmmu_llsc_template.h | 97 
 -
  3 files changed, 69 insertions(+), 34 deletions(-)

 diff --git a/cputlb.c b/cputlb.c
 index 66df41a..0566e0f 100644
 --- a/cputlb.c
 +++ b/cputlb.c
 @@ -30,6 +30,8 @@
  #include exec/ram_addr.h
  #include tcg/tcg.h

 +#include sysemu/cpus.h
 +
  void qemu_mutex_lock_iothread(void);
  void qemu_mutex_unlock_iothread(void);

 diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
 index c73a75f..40742b3 100644
 --- a/include/exec/cpu-defs.h
 +++ b/include/exec/cpu-defs.h
 @@ -169,5 +169,9 @@ typedef struct CPUIOTLBEntry {
  /* Used for atomic instruction translation. */  \
  bool ll_sc_context; \
  hwaddr excl_protected_hwaddr;   \
 +/* Used to carry the stcond result and also as a flag to flag a
 + * normal store access made by a stcond. */ \
 +int excl_succeeded; \
 +

  #endif
 diff --git a/softmmu_llsc_template.h b/softmmu_llsc_template.h
 index 81e9d8e..4105e72 100644
 --- a/softmmu_llsc_template.h
 +++ b/softmmu_llsc_template.h
 @@ -54,7 +54,21 @@
   (TARGET_PAGE_MASK | TLB_INVALID_MASK));
  \
  })  
  \

 -#define EXCLUSIVE_RESET_ADDR ULLONG_MAX
 +#define is_read_tlb_entry_set(env, page, index) 
  \
 +({  
  \
 +(addr  TARGET_PAGE_MASK)   
  \
 + == ((env-tlb_table[mmu_idx][index].addr_read)
  \
 + (TARGET_PAGE_MASK | TLB_INVALID_MASK));
  \
 +})  
  \
 +
 +/* Whenever a SC operation fails, we add a small delay to reduce the
 + * concurrency among the atomic instruction emulation code. Without this 
 delay,
 + * in very congested situation where plain stores make all the pending LLs
 + * fail, the code could reach a stalling situation in which all the SCs 
 happen
 + * to fail.
 + * TODO: make the delay dynamic according with the SC failing rate.
 + * */
 +#define TCG_ATOMIC_INSN_EMUL_DELAY 100

 I'd be tempted to split out this sort of chicanery into a separate patch.

OK, I think it's a good idea since it's not strictly required.

Regards,
alvise



  WORD_TYPE helper_le_ldlink_name(CPUArchState *env, target_ulong addr,
  TCGMemOpIdx oi, uintptr_t retaddr)
 @@ -65,35 +79,58 @@ WORD_TYPE helper_le_ldlink_name(CPUArchState *env, 
 target_ulong addr,
  hwaddr hw_addr;
  unsigned mmu_idx = get_mmuidx(oi);

 -/* Use the proper load helper from cpu_ldst.h */
 -ret = helper_ld_legacy(env, addr, mmu_idx, retaddr);
 -
 -/* The last legacy access ensures that the TLB and IOTLB entry for 
 'addr'
 - * have been created. */
  index = (addr  TARGET_PAGE_BITS)  (CPU_TLB_SIZE - 1);
 +if (!is_read_tlb_entry_set(env, addr, index)) {
 +tlb_fill(ENV_GET_CPU(env), addr, READ_ACCESS_TYPE, mmu_idx, 
 retaddr);
 +}

  /* hw_addr = hwaddr of the page (i.e. section-mr-ram_addr + xlat)
   * plus the offset (i.e. addr  ~TARGET_PAGE_MASK) */
  hw_addr = (env-iotlb[mmu_idx][index].addr  TARGET_PAGE_MASK) + addr;

  /* Set the exclusive-protected hwaddr. */
 -env-excl_protected_hwaddr = hw_addr;
 -env-ll_sc_context = true;
 +qemu_mutex_lock(tcg_excl_access_lock);
 +if (cpu_physical_memory_excl_is_dirty(hw_addr)  !exit_flush_request) {
 +exit_flush_request = 1;

 -/* No need to mask hw_addr with TARGET_PAGE_MASK since
 - *

Re: [Qemu-devel] [PATCH 2/2] tcg/i386: Reserve register for guest_base if a segment isn't available

2015-07-17 Thread Richard Henderson


On 07/17/2015 02:33 AM, Peter Maydell wrote:

On 16 July 2015 at 22:25, Richard Henderson r...@twiddle.net wrote:

This saves 2 insns and 10 bytes from the implementation of
each memory operation.


Do we have an idea of which platforms/configs don't let
us have a segment register for guest_base?


We've only bothered to fill in code for Linux so far.

These days we should probably probe for fsgsbase insns and use those.  Though I 
don't know how many of the various operating systems enable that...



r~

Re: [Qemu-devel] [PATCH RFC 2/9] tcg: don't abuse TCG type in tcg_gen_trunc_shr_i64_i32

2015-07-17 Thread Richard Henderson

On 07/15/2015 12:03 PM, Aurelien Jarno wrote:

The tcg_gen_trunc_shr_i64_i32 function takes a 64-bit argument and
returns a 32-bit value. Directly call tcg_gen_op3 with the correct
types instead of calling tcg_gen_op3i_i32 and abusing the TCG types.

Cc: Paolo Bonzinipbonz...@redhat.com
Cc: Richard Hendersonr...@twiddle.net
Signed-off-by: Aurelien Jarnoaurel...@aurel32.net
---
  tcg/tcg-op.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson r...@twiddle.net

r~

Re: [Qemu-devel] [PATCH RFC 3/9] tcg: implement real ext_i32_i64 and extu_i32_i64 ops

2015-07-17 Thread Richard Henderson

On 07/15/2015 12:03 PM, Aurelien Jarno wrote:

Implement optional but real ext_i32_i64 and extu_i32_i64 ops. When
implemented, these ensure that a 32-bit value is always converted to
a 64-bit value and not propagated through the register allocator or
the optimizer.

Cc: Paolo Bonzinipbonz...@redhat.com
Cc: Richard Hendersonr...@twiddle.net
Signed-off-by: Aurelien Jarnoaurel...@aurel32.net
---
  tcg/aarch64/tcg-target.h | 6 +-
  tcg/i386/tcg-target.h| 7 ++-
  tcg/ia64/tcg-target.h| 6 +-
  tcg/ppc/tcg-target.h | 7 ++-
  tcg/s390/tcg-target.h| 6 +-
  tcg/sparc/tcg-target.h   | 6 +-
  tcg/tcg-op.c | 6 ++
  tcg/tcg-opc.h| 3 +++
  tcg/tci/tcg-target.h | 7 ++-
  9 files changed, 47 insertions(+), 7 deletions(-)

Reviewed-by: Richard Henderson r...@twiddle.net

r~

Re: [Qemu-devel] [PATCH RFC 1/9] tcg: rename trunc_shr_i32 into trunc_shr_i64_i32

2015-07-17 Thread Richard Henderson


On 07/15/2015 12:03 PM, Aurelien Jarno wrote:

The op is sometimes named trunc_shr_i32 and sometimes trunc_shr_i64_i32,
and the name in the README doesn't match the name offered to the
frontends.

Always use the long name to make it clear it is a size changing op.

Cc: Paolo Bonzinipbonz...@redhat.com
Cc: Richard Hendersonr...@twiddle.net
Signed-off-by: Aurelien Jarnoaurel...@aurel32.net
---
  tcg/README   | 2 +-
  tcg/aarch64/tcg-target.h | 2 +-
  tcg/i386/tcg-target.h| 2 +-
  tcg/ia64/tcg-target.h| 2 +-
  tcg/optimize.c   | 6 +++---
  tcg/ppc/tcg-target.h | 2 +-
  tcg/s390/tcg-target.h| 2 +-
  tcg/sparc/tcg-target.c   | 4 ++--
  tcg/sparc/tcg-target.h   | 2 +-
  tcg/tcg-op.c | 4 ++--
  tcg/tcg-opc.h| 4 ++--
  tcg/tcg.h| 2 +-
  tcg/tci/tcg-target.h | 2 +-
  13 files changed, 18 insertions(+), 18 deletions(-)


Reviewed-by: Richard Henderson r...@twiddle.net


r~

Re: [Qemu-devel] [PATCH RFC 7/9] tcg: replace ext/u_i32_i64 by a mov when not implemented

2015-07-17 Thread Richard Henderson


On 07/15/2015 12:03 PM, Aurelien Jarno wrote:

When ext_i32_i64 and extu_i32_i64 ops are not implemented, this means
that the register is already properly zero/sign extended, so we can
simply replace it by a mov.

In practice it means at least one of the two ops should always be
implemented on 64-bit targets.

Cc: Paolo Bonzinipbonz...@redhat.com
Cc: Richard Hendersonr...@twiddle.net
Signed-off-by: Aurelien Jarnoaurel...@aurel32.net
---
  tcg/tcg-op.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


If we're going to do this (and of course pick a solution for all of the other 
backends), I think perhaps x86 should choose trunc + exts as the two that 
should be implemented, leaving extu the one that can be folded away.


Something to experiment with...


r~

Re: [Qemu-devel] [virtio guest] vring_need_event() from virtqueue_kick_prepare()

2015-07-17 Thread Catalin Vasile

Do you mean vhost_net - old kernel, qemu - latest, guest - latest?

On Thu, Jul 16, 2015 at 7:33 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Thu, Jul 16, 2015 at 1:54 PM, Catalin Vasile
 catalinvasil...@gmail.com wrote:
 Both. The compiled kernel was common for both.

 Does vhost_net work with the old kernel + new QEMU combo?

 Stefan

Re: [Qemu-devel] [PULL for-2.4 0/4] input: fixes for 2.4

2015-07-17 Thread Gerd Hoffmann

  Hi,

 I'm afraid this doesn't build for Windows:
 
 In file included from
 /home/petmay01/linaro/qemu-for-merges/hw/input/virtio-input.c:13:
 /home/petmay01/linaro/qemu-for-merges/include/standard-headers/linux/input.h:890:1:
 error: SW_MAX redefined
 In file included from
 /usr/lib/gcc/i586-mingw32msvc/4.2.1-sjlj/../../../../i586-mingw32msvc/include/windows.h:55,
  from
 /home/petmay01/linaro/qemu-for-merges/include/sysemu/os-win32.h:29,
  from
 /home/petmay01/linaro/qemu-for-merges/include/qemu-common.h:48,
  from
 /home/petmay01/linaro/qemu-for-merges/include/qemu/iov.h:17,
  from
 /home/petmay01/linaro/qemu-for-merges/hw/input/virtio-input.c:7:
 /usr/lib/gcc/i586-mingw32msvc/4.2.1-sjlj/../../../../i586-mingw32msvc/include/winuser.h:729:1:
 error: this is the location of the previous definition

Lovely.  Looks like a bug in the mingw headers to me, especially as my
machine cross-builds this just fine for both win32 and win64.

We are in hard-freeze though, so no time to experiments here, I'll redo
the pull with patch #3 dropped, lets sort this post-2.4.

cheers,
  Gerd

[Qemu-devel] [PULL 1/3] virtio-input: fix segfault in virtio_input_hid_properties

2015-07-17 Thread Gerd Hoffmann

From: Lin Ma l...@suse.com

commit 5cce173 introduced virtio-input segfault, This patch fixes it.

Signed-off-by: Lin Ma l...@suse.com
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/input/virtio-input-hid.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/input/virtio-input-hid.c b/hw/input/virtio-input-hid.c
index 616a815..4d85dad 100644
--- a/hw/input/virtio-input-hid.c
+++ b/hw/input/virtio-input-hid.c
@@ -308,6 +308,7 @@ static void virtio_input_hid_handle_status(VirtIOInput 
*vinput,
 static Property virtio_input_hid_properties[] = {
 DEFINE_PROP_STRING(display, VirtIOInputHID, display),
 DEFINE_PROP_UINT32(head, VirtIOInputHID, head, 0),
+DEFINE_PROP_END_OF_LIST(),
 };
 
 static void virtio_input_hid_class_init(ObjectClass *klass, void *data)
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH] hostmem: Fix qemu_opt_get_bool() crash in host_memory_backend_init()

2015-07-17 Thread Marcel Apfelbaum


On 07/16/2015 11:39 PM, Eduardo Habkost wrote:

This fixes the following crash, introduced by commit
49d2e648e8087d154d8bf8b91f27c8e05e79d5a6:

   $ gdb --args qemu-system-x86_64 -machine pc,mem-merge=off -object 
memory-backend-ram,id=ram-node0,size=1024
   [...]
   Program received signal SIGABRT, Aborted.
   (gdb) bt
   #0  0x7253b8c7 in raise () at /lib64/libc.so.6
   #1  0x7253d52a in abort () at /lib64/libc.so.6
   #2  0x7253446d in __assert_fail_base () at /lib64/libc.so.6
   #3  0x72534522 in  () at /lib64/libc.so.6
   #4  0x558bb80a in qemu_opt_get_bool_helper (opts=0x5621b650, 
name=name@entry=0x558ec922 mem-merge, defval=defval@entry=true, 
del=del@entry=false) at qemu/util/qemu-option.c:388
   #5  0x558bbb5a in qemu_opt_get_bool (opts=optimized out, 
name=name@entry=0x558ec922 mem-merge, defval=defval@entry=true) at 
qemu/util/qemu-option.c:398
   #6  0x55720a24 in host_memory_backend_init (obj=0x562ac970) at 
qemu/backends/hostmem.c:226

Instead of using qemu_opt_get_bool(), that didn't work with
qemu_machine_opts for a long time, we can use the machine QOM properties
directly.

This was the intent.



Signed-off-by: Eduardo Habkost ehabk...@redhat.com
---
  backends/hostmem.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 61c1ac0..38a32ed 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -10,6 +10,7 @@
   * See the COPYING file in the top-level directory.
   */
  #include sysemu/hostmem.h
+#include hw/boards.h
  #include qapi/visitor.h
  #include qapi-types.h
  #include qapi-visit.h
@@ -223,10 +224,10 @@ static void host_memory_backend_init(Object *obj)
  {
  HostMemoryBackend *backend = MEMORY_BACKEND(obj);

-backend-merge = qemu_opt_get_bool(qemu_get_machine_opts(),
-   mem-merge, true);
-backend-dump = qemu_opt_get_bool(qemu_get_machine_opts(),
-  dump-guest-core, true);
+backend-merge = object_property_get_bool(OBJECT(current_machine),
+  mem-merge, error_abort);
+backend-dump = object_property_get_bool(OBJECT(current_machine),
+dump-guest-core, error_abort);
  backend-prealloc = mem_prealloc;

  object_property_add_bool(obj, merge,



Thank you for fixing this,

Reviewed-by: Marcel Apfelbaum mar...@redhat.com

Re: [Qemu-devel] memory hotplug fails with mem-merge option

2015-07-17 Thread Marcel Apfelbaum


On 07/16/2015 03:24 PM, Igor Mammedov wrote:

On Thu, 16 Jul 2015 16:00:07 +0530
Bharata B Rao bhar...@linux.vnet.ibm.com wrote:


On Thu, Jul 16, 2015 at 10:25:09AM +0530, Bharata B Rao wrote:

Hi,

When I use -machine pc,mem-merge=off|on -m 1G,slots=4,maxmem=2G, adding
memory-backend-ram object fails like below. Same failure is seen with
-machine pseries,mem-merge=on|off.

(qemu) object_add memory-backend-ram,id=ram0,size=1G

qemu-system-x86_64: util/qemu-option.c:388: qemu_opt_get_bool_helper: Assertion `opt-desc 
 opt-desc-type == QEMU_OPT_BOOL' failed.


Same failure seen with -machine pseries,mem-merge=off when used together with
-object memory-backend-file on QEMU commandline in which case boot time
failure is observed.

Regards,
Bharata.


looks like related to

0a7cf217 util/qemu-config: fix regression of qmp_query_command_line_options
and
49d2e64 machine: remove qemu_machine_opts global list

CCing Marcel in hope he knows how to fix it right away.

Hi,

Can you please check that Eduardo's patch fixes this?
[PATCH] hostmem: Fix qemu_opt_get_bool() crash in host_memory_backend_init()
https://www.mail-archive.com/qemu-devel@nongnu.org/msg310915.html

Thanks,
Marcel

Re: [Qemu-devel] net: Next steps to deprecate -net

2015-07-17 Thread Thomas Huth

On 07/17/2015 09:25 AM, Peter Maydell wrote:
 On 17 July 2015 at 07:53, Thomas Huth th...@redhat.com wrote:
 Ok, assuming that my Network traffic dumping for -netdev devices patch
 series is going to solve the dumping-for-netdev problem, how do we
 tackle the remaining problems that we have to solve before we can
 deprecate -net? Does anybody have a survey of the (onboard) NICs that
 can only be configured with -net but not with -device? Could they
 nowadays be changed to work with -device, too, or are there still major
 obstacles to solve first?
 
 The problem is that -device says create a new device and
 configure it like this. But onboard NICs are created by
 the board, so we want let the user say how to configure
 those devices, not create new ones...

Ok, I see ... maybe it makes sense to simply keep -net nic to be able
to configure the default/onboard NIC, and only to remove all the other
-net options instead (-net user etc.). The disliked vlan/hub concept
could then be removed, too, since -net nic can be used together with
-netdev nowadays by using something like -net nic,netdev=xxx as far
as I know. That would clean up most points of confusion, I think, and
would not cause too much code churn for the onboard NICs. Does that
sound feasible?

 Thomas





signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] memory hotplug fails with mem-merge option

2015-07-17 Thread Bharata B Rao

On Fri, Jul 17, 2015 at 10:26:16AM +0300, Marcel Apfelbaum wrote:
 On 07/16/2015 03:24 PM, Igor Mammedov wrote:
 On Thu, 16 Jul 2015 16:00:07 +0530
 Bharata B Rao bhar...@linux.vnet.ibm.com wrote:
 
 On Thu, Jul 16, 2015 at 10:25:09AM +0530, Bharata B Rao wrote:
 Hi,
 
 When I use -machine pc,mem-merge=off|on -m 1G,slots=4,maxmem=2G, adding
 memory-backend-ram object fails like below. Same failure is seen with
 -machine pseries,mem-merge=on|off.
 
 (qemu) object_add memory-backend-ram,id=ram0,size=1G
 
 qemu-system-x86_64: util/qemu-option.c:388: qemu_opt_get_bool_helper: 
 Assertion `opt-desc  opt-desc-type == QEMU_OPT_BOOL' failed.
 
 Same failure seen with -machine pseries,mem-merge=off when used together 
 with
 -object memory-backend-file on QEMU commandline in which case boot time
 failure is observed.
 
 Regards,
 Bharata.
 
 looks like related to
 
 0a7cf217 util/qemu-config: fix regression of qmp_query_command_line_options
 and
 49d2e64 machine: remove qemu_machine_opts global list
 
 CCing Marcel in hope he knows how to fix it right away.
 Hi,
 
 Can you please check that Eduardo's patch fixes this?
 [PATCH] hostmem: Fix qemu_opt_get_bool() crash in 
 host_memory_backend_init()
 https://www.mail-archive.com/qemu-devel@nongnu.org/msg310915.html

Yes, this fixes the problem.

Regards,
Bharata.

Re: [Qemu-devel] [PATCH v2 3/3] AioContext: fix broken ctx-dispatching optimization

2015-07-17 Thread Stefan Hajnoczi

On Fri, Jul 17, 2015 at 06:17:47AM +0200, Paolo Bonzini wrote:
 
 
 On 16/07/2015 11:56, Paolo Bonzini wrote:
  @@ -286,13 +283,15 @@ bool aio_poll(AioContext *ctx, bool blocking)
   npfd = 0;
   ctx-walking_handlers--;
   
  +if (blocking) {
  +atomic_sub(ctx-notify_me, 2);
  +}
  +
 
 I kept this place for subtracting notify_me because it is the same place
 where aio_set_dispatching was called.  However, it can be anticipated to
 
  /* if we have any readable fds, dispatch event */
  if (ret  0) {
  for (i = 0; i  npfd; i++) {
 
 i.e. right after poll.  As poll has exited, it can't be blocking the
 thread anymore.  Stefan, please let me send v3 on Monday.

Sure.

Stefan


pgp5ys_kE4p0H.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH RFC 4/9] tcg/optimize: add optimizations for ext_i32_i64 and extu_i32_i64 ops

2015-07-17 Thread Richard Henderson

On 07/15/2015 12:03 PM, Aurelien Jarno wrote:

They behave the same as ext32s_i64 and ext32u_i64 from the constant
folding and zero propagation point of view, except that they can't
be replaced by a mov, so we don't compute the affected value.

Cc: Paolo Bonzinipbonz...@redhat.com
Cc: Richard Hendersonr...@twiddle.net
Signed-off-by: Aurelien Jarnoaurel...@aurel32.net
---
  tcg/optimize.c | 10 ++
  1 file changed, 10 insertions(+)

Reviewed-by: Richard Henderson r...@twiddle.net

r~

[Qemu-devel] [PATCH V3] virtio-net: unbreak any layout

2015-07-17 Thread Jason Wang

Commit 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
(virtio-net: byteswap virtio-net header) breaks any layout by
requiring out_sg[0].iov_len = n-guest_hdr_len. Fixing this by
copying header to temporary buffer if swap is needed, and then use
this buffer as part of out_sg.

Fixes 032a74a1c0fcdd5fd1c69e56126b4c857ee36611
(virtio-net: byteswap virtio-net header)
Cc: qemu-sta...@nongnu.org
Cc: c...@fr.ibm.com
Signed-off-by: Jason Wang jasow...@redhat.com
---
Changes from V2:
- Drop the packet that will be truncated
- Avoid scan the whole iov by just checking the return of iov_to_buf()
Changes from V1:
- avoid header copying if there's no need to do header swap
- don't write the header back
---
 hw/net/virtio-net.c   | 23 ++-
 include/hw/virtio/virtio-access.h |  9 +
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index e3c2db3..9f7e91d 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1142,7 +1142,8 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
 ssize_t ret, len;
 unsigned int out_num = elem.out_num;
 struct iovec *out_sg = elem.out_sg[0];
-struct iovec sg[VIRTQUEUE_MAX_SIZE];
+struct iovec sg[VIRTQUEUE_MAX_SIZE], sg2[VIRTQUEUE_MAX_SIZE + 1];
+struct virtio_net_hdr_mrg_rxbuf mhdr;
 
 if (out_num  1) {
 error_report(virtio-net header not in first element);
@@ -1150,13 +1151,25 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
 }
 
 if (n-has_vnet_hdr) {
-if (out_sg[0].iov_len  n-guest_hdr_len) {
+if (iov_to_buf(out_sg, out_num, 0, mhdr, n-guest_hdr_len) 
+n-guest_hdr_len) {
 error_report(virtio-net header incorrect);
 exit(1);
 }
-virtio_net_hdr_swap(vdev, (void *) out_sg[0].iov_base);
+if (virtio_needs_swap(vdev)) {
+virtio_net_hdr_swap(vdev, (void *) mhdr);
+sg2[0].iov_base = mhdr;
+sg2[0].iov_len = n-guest_hdr_len;
+out_num = iov_copy(sg2[1], ARRAY_SIZE(sg2) - 1,
+   out_sg, out_num,
+   n-guest_hdr_len, -1);
+if (out_num == VIRTQUEUE_MAX_SIZE) {
+goto drop;
+   }
+out_num += 1;
+out_sg = sg2;
+   }
 }
-
 /*
  * If host wants to see the guest header as is, we can
  * pass it on unchanged. Otherwise, copy just the parts
@@ -1186,7 +1199,7 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
 }
 
 len += ret;
-
+drop:
 virtqueue_push(q-tx_vq, elem, 0);
 virtio_notify(vdev, q-tx_vq);
 
diff --git a/include/hw/virtio/virtio-access.h 
b/include/hw/virtio/virtio-access.h
index cee5dd7..1ec1dfd 100644
--- a/include/hw/virtio/virtio-access.h
+++ b/include/hw/virtio/virtio-access.h
@@ -143,6 +143,15 @@ static inline uint64_t virtio_ldq_p(VirtIODevice *vdev, 
const void *ptr)
 }
 }
 
+static inline bool virtio_needs_swap(VirtIODevice *vdev)
+{
+#ifdef HOST_WORDS_BIGENDIAN
+return virtio_access_is_big_endian(vdev) ? false : true;
+#else
+return virtio_access_is_big_endian(vdev) ? true : false;
+#endif
+}
+
 static inline uint16_t virtio_tswap16(VirtIODevice *vdev, uint16_t s)
 {
 #ifdef HOST_WORDS_BIGENDIAN
-- 
2.1.4

Re: [Qemu-devel] net: Next steps to deprecate -net (was: [RFC PATCH] Enable vlans and dump for -netdev, too)

2015-07-17 Thread Stefan Hajnoczi

On Fri, Jul 17, 2015 at 08:53:08AM +0200, Thomas Huth wrote:
 On 05/26/2015 04:29 PM, Markus Armbruster wrote:
  Stefan Hajnoczi stefa...@redhat.com writes:
 [...]
  We thought the QEMU vlan concept would be dropped completely in the
  future, so it was never added to -netdev.  No patches to do that have
  been posted over the years, so I think it was more of a conceptual goal
  than a concrete requirement.
  
  Well, patches to do that first need to replace the VLAN-only dump
  feature.
  
  To fully deprecate -net, we also have to replace -net nic for
  configuring onboard NICs.
  
  Prior discussion:
  http://lists.nongnu.org/archive/html/qemu-devel/2013-02/msg03743.html
  
  We haven't really tried either.
 
 Ok, assuming that my Network traffic dumping for -netdev devices patch
 series is going to solve the dumping-for-netdev problem, how do we
 tackle the remaining problems that we have to solve before we can
 deprecate -net? Does anybody have a survey of the (onboard) NICs that
 can only be configured with -net but not with -device? Could they
 nowadays be changed to work with -device, too, or are there still major
 obstacles to solve first?

Take a look at nd_table[] and nb_nics.  That's the array of -net nic
devices.  The boards look into the array to grab NICs.

The default NIC is set in net_init_clients() as a -net nic option.

The weird thing about -net nic is that the device is not created in
net/net.c by the -net nic command-line option.  It just adds the
information to the nd_table[] array.

It's the board that has to instantiate nd_table[] entries.  For example,
pc_nic_init() adds devices for x86 guests.

That's about all I know or have thought about so far.  It would be nice
to get rid of -net but it will take some work and is a QEMU 3.0 feature
since it breaks backwards compatibility.


pgp0U_M3wqYnm.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH RFC 8/9] tcg/optimize: do not simplify size changing moves

2015-07-17 Thread Richard Henderson


On 07/15/2015 12:03 PM, Aurelien Jarno wrote:

Now that we have real size changing ops, we don't need to marked high
bits of the destination as garbage. The goal of the optimizer is to
predict the value of the temps (and not of the registers) and do
simplifications when possible. The problem there is therefore not the
fact that those bits are not counted as garbage, but that a size
changing op is replaced by a move.

This patch is basically a revert of 24666baf, including the changes that
have been made since then.

Cc: Paolo Bonzini pbonz...@redhat.com
Cc: Richard Henderson r...@twiddle.net
Signed-off-by: Aurelien Jarno aurel...@aurel32.net


What we're missing here is whether the omitted size changing op is extu or 
exts.  Mask should be extended to match.  Which means keeping most of this code.



r~

Re: [Qemu-devel] [RFC PATCH qemu v3 4/4] vfio: spapr: Add SPAPR IOMMU v2 support (DMA memory preregistering)

2015-07-17 Thread Alexey Kardashevskiy


On 07/16/2015 03:11 PM, David Gibson wrote:

On Tue, Jul 14, 2015 at 10:21:54PM +1000, Alexey Kardashevskiy wrote:

This makes use of the new memory registering feature. The idea is
to provide the userspace ability to notify the host kernel about pages
which are going to be used for DMA. Having this information, the host
kernel can pin them all once per user process, do locked pages
accounting (once) and not spent time on doing that in real time with
possible failures which cannot be handled nicely in some cases.

This adds a guest RAM memory listener which notifies a VFIO container
about memory which needs to be pinned/unpinned. VFIO MMIO regions
(i.e. skip dump regions) are skipped.

The feature is only enabled for SPAPR IOMMU v2. The host kernel changes
are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does
not call it when v2 is detected and enabled.

This does not change the guest visible interface.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru


I've looked at this in more depth now, and attempting to unify the
pre-reg and mapping listeners like this can't work - they need to be
listening on different address spaces:  mapping actions need to be
listening on the PCI address space, whereas the pre-reg needs to be
listening on address_space_memory.  For x86 - for now - those end up
being the same thing, but on Power they're not.

We do need to be clear about what differences are due to the presence
of a guest IOMMU versus which are due to arch or underlying IOMMU
type.  For now Power has a guest IOMMU and x86 doesn't, but that could
well change in future: we could well implement the guest side IOMMU
for x86 in future (or x86 could invent a paravirt IOMMU interface).
On the other side, BenH's experimental powernv machine type could
introduce Power machines without a guest side IOMMU (or at least an
optional guest side IOMMU).

The quick and dirty approach here is:
1. Leave the main listener as is
2. Add a new pre-reg notifier to the spapr iommu specific code,
   which listens on address_space_memory, *not* the PCI space

The more generally correct approach, which allows for more complex
IOMMU arrangements and the possibility of new IOMMU types with pre-reg
is:
1. Have the core implement both a mapping listener and a pre-reg
   listener (optionally enabled by a per-iommu-type flag).
   Basically the first one sees what *is* mapped, the second sees
   what *could* be mapped.

2. As now, the mapping listener listens on PCI address space, if
   RAM blocks are added, immediately map them into the host IOMMU,
   if guest IOMMU blocks appear register a notifier which will
   mirror guest IOMMU mappings to the host IOMMU (this is what we
   do now).

3. The pre-reg listener also listens on the PCI address space.  RAM
   blocks added are pre-registered immediately.



PCI address space listeners won't be notified about RAM blocks on sPAPR.



But, if guest
   IOMMU blocks are added, instead of registering a guest-iommu
   notifier,


guest-iommu notifier is the one called via memory_region_notify_iommu() 
from H_PUT_TCE? Instead implies dropping it, how this can work?




we register another listener on the *target* AS of the
   guest IOMMU, same callbacks as this one.  In practice that
   target AS will almost always resolve to address_space_memory,
   but this can at least in theory handle crazy guest setups with
   multiple layers of IOMMU.

4. Have to ensure that the pre-reg callbacks always happen before
   the mapping calls.  For a system with an IOMMU backend which
   requires pre-registration, but doesn't have a guest IOMMU, we
   need to pre-reg, then host-iommu-map RAM blocks that appear in
   PCI address space.





--
Alexey

[Qemu-devel] [PATCH V3 1/2] tests: introduce basic pci test for virtio-net

2015-07-17 Thread Jason Wang

Signed-off-by: Jason Wang jasow...@redhat.com
---
Changes from V2:
- Remove duplicated code of gtest_start()
Changes from V1:
- replace the magic value 12 with a macro
---
 tests/Makefile  |   2 +-
 tests/virtio-net-test.c | 185 ++--
 2 files changed, 179 insertions(+), 8 deletions(-)

diff --git a/tests/Makefile b/tests/Makefile
index 2c4b8dc..5805326 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -381,7 +381,7 @@ tests/wdt_ib700-test$(EXESUF): tests/wdt_ib700-test.o
 tests/tco-test$(EXESUF): tests/tco-test.o $(libqos-pc-obj-y)
 tests/virtio-balloon-test$(EXESUF): tests/virtio-balloon-test.o
 tests/virtio-blk-test$(EXESUF): tests/virtio-blk-test.o $(libqos-virtio-obj-y)
-tests/virtio-net-test$(EXESUF): tests/virtio-net-test.o $(libqos-pc-obj-y)
+tests/virtio-net-test$(EXESUF): tests/virtio-net-test.o $(libqos-pc-obj-y) 
$(libqos-virtio-obj-y)
 tests/virtio-rng-test$(EXESUF): tests/virtio-rng-test.o $(libqos-pc-obj-y)
 tests/virtio-scsi-test$(EXESUF): tests/virtio-scsi-test.o 
$(libqos-virtio-obj-y)
 tests/virtio-9p-test$(EXESUF): tests/virtio-9p-test.o
diff --git a/tests/virtio-net-test.c b/tests/virtio-net-test.c
index ea7478c..b37dd3e 100644
--- a/tests/virtio-net-test.c
+++ b/tests/virtio-net-test.c
@@ -10,20 +10,192 @@
 #include glib.h
 #include string.h
 #include libqtest.h
+#include qemu-common.h
+#include qemu/sockets.h
 #include qemu/osdep.h
-#include libqos/pci.h
+#include qemu/iov.h
+#include libqos/pci-pc.h
+#include libqos/virtio.h
+#include libqos/virtio-pci.h
+#include libqos/malloc.h
+#include libqos/malloc-pc.h
+#include libqos/malloc-generic.h
+#include qemu/bswap.h
+#include hw/virtio/virtio-net.h
 
 #define PCI_SLOT_HP 0x06
+#define PCI_SLOT0x04
+#define PCI_FN  0x00
 
-/* Tests only initialization so far. TODO: Replace with functional tests */
-static void pci_nop(void)
+#define QVIRTIO_NET_TIMEOUT_US (30 * 1000 * 1000)
+#define VNET_HDR_SIZE sizeof(struct virtio_net_hdr_mrg_rxbuf)
+
+static void test_end(void)
+{
+qtest_end();
+}
+
+#ifndef _WIN32
+
+static QVirtioPCIDevice *virtio_net_pci_init(QPCIBus *bus, int slot)
+{
+QVirtioPCIDevice *dev;
+
+dev = qvirtio_pci_device_find(bus, QVIRTIO_NET_DEVICE_ID);
+g_assert(dev != NULL);
+g_assert_cmphex(dev-vdev.device_type, ==, QVIRTIO_NET_DEVICE_ID);
+
+qvirtio_pci_device_enable(dev);
+qvirtio_reset(qvirtio_pci, dev-vdev);
+qvirtio_set_acknowledge(qvirtio_pci, dev-vdev);
+qvirtio_set_driver(qvirtio_pci, dev-vdev);
+
+return dev;
+}
+
+static QPCIBus *pci_test_start(int socket)
+{
+char *cmdline;
+
+cmdline = g_strdup_printf(-netdev socket,fd=%d,id=hs0 -device 
+  virtio-net-pci,netdev=hs0, socket);
+qtest_start(cmdline);
+g_free(cmdline);
+
+return qpci_init_pc();
+}
+
+static void driver_init(const QVirtioBus *bus, QVirtioDevice *dev)
+{
+uint32_t features;
+
+features = qvirtio_get_features(bus, dev);
+features = features  ~(QVIRTIO_F_BAD_FEATURE |
+QVIRTIO_F_RING_INDIRECT_DESC |
+QVIRTIO_F_RING_EVENT_IDX);
+qvirtio_set_features(bus, dev, features);
+
+qvirtio_set_driver_ok(bus, dev);
+}
+
+static void rx_test(const QVirtioBus *bus, QVirtioDevice *dev,
+QGuestAllocator *alloc, QVirtQueue *vq,
+int socket)
+{
+uint64_t req_addr;
+uint32_t free_head;
+char test[] = TEST;
+char buffer[64];
+int len = htonl(sizeof(test));
+struct iovec iov[] = {
+{
+.iov_base = len,
+.iov_len = sizeof(len),
+}, {
+.iov_base = test,
+.iov_len = sizeof(test),
+},
+};
+int ret;
+
+req_addr = guest_alloc(alloc, 64);
+
+free_head = qvirtqueue_add(vq, req_addr, 64, true, false);
+qvirtqueue_kick(bus, dev, vq, free_head);
+
+ret = iov_send(socket, iov, 2, 0, sizeof(len) + sizeof(test));
+g_assert_cmpint(ret, ==, sizeof(test) + sizeof(len));
+
+qvirtio_wait_queue_isr(bus, dev, vq, QVIRTIO_NET_TIMEOUT_US);
+memread(req_addr + VNET_HDR_SIZE, buffer, sizeof(test));
+g_assert_cmpstr(buffer, ==, TEST);
+
+guest_free(alloc, req_addr);
+}
+
+static void tx_test(const QVirtioBus *bus, QVirtioDevice *dev,
+QGuestAllocator *alloc, QVirtQueue *vq,
+int socket)
+{
+uint64_t req_addr;
+uint32_t free_head;
+uint32_t len;
+char buffer[64];
+int ret;
+
+req_addr = guest_alloc(alloc, 64);
+memwrite(req_addr + VNET_HDR_SIZE, TEST, 4);
+
+free_head = qvirtqueue_add(vq, req_addr, 64, false, false);
+qvirtqueue_kick(bus, dev, vq, free_head);
+
+qvirtio_wait_queue_isr(bus, dev, vq, QVIRTIO_NET_TIMEOUT_US);
+guest_free(alloc, req_addr);
+
+ret = qemu_recv(socket, len, sizeof(len), 0);
+g_assert_cmpint(ret, ==, sizeof(len));
+len = ntohl(len);
+
+ret =

Re: [Qemu-devel] net: Next steps to deprecate -net (was: [RFC PATCH] Enable vlans and dump for -netdev, too)

2015-07-17 Thread Peter Maydell

On 17 July 2015 at 07:53, Thomas Huth th...@redhat.com wrote:
 Ok, assuming that my Network traffic dumping for -netdev devices patch
 series is going to solve the dumping-for-netdev problem, how do we
 tackle the remaining problems that we have to solve before we can
 deprecate -net? Does anybody have a survey of the (onboard) NICs that
 can only be configured with -net but not with -device? Could they
 nowadays be changed to work with -device, too, or are there still major
 obstacles to solve first?

The problem is that -device says create a new device and
configure it like this. But onboard NICs are created by
the board, so we want let the user say how to configure
those devices, not create new ones...

-- PMM

[Qemu-devel] [Bug 1465935] Re: kvm_irqchip_commit_routes: Assertion `ret == 0' failed

2015-07-17 Thread Stefan Bader

Unfortunately I seem to be unable to get this bug triggered with the 
reproducer. It could be a detail of the guest setup I am missing. Since I do 
not have access to RHEL I used CentOS 6.3 in a 8core guest with 2 virtio disks. 
Host was 14.04. Left the script running for quite a bit but no crash happened.
So it would be up to you to confirm that with a current 14.04 host you still 
can trigger the bug and with the patched version of qemu from 
http://people.canonical.com/~smb/lp1465935/ it would be gone. Thanks.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1465935

Title:
  kvm_irqchip_commit_routes: Assertion `ret == 0' failed

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Confirmed
Status in qemu source package in Precise:
  New
Status in qemu source package in Trusty:
  New
Status in qemu source package in Utopic:
  New
Status in qemu source package in Vivid:
  New

Bug description:
  Several my QEMU instances crashed, and in the  qemu log, I can see
  this assertion failure,

 qemu-system-x86_64: /build/buildd/qemu-2.0.0+dfsg/kvm-all.c:984:
  kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

  The QEMU version is 2.0.0, HV OS is ubuntu 12.04, kernel 3.2.0-38.
  Guest OS is RHEL 6.3.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1465935/+subscriptions

Re: [Qemu-devel] [PATCH RFC 9/9] tcg: update README about size changing ops

2015-07-17 Thread Richard Henderson


On 07/15/2015 12:03 PM, Aurelien Jarno wrote:

+These ops are all optional in that case they are implemented as mov.
+This is to allow some optimizations if the target maintains registers
+zero or sign extended. For example a MIPS64 CPU requires that all
+32-bit values are stored sign-extended in the registers. This means
+the trunc_shr_i64_i32 should sign-extend the value when moving it
+from a 64-bit to a 32-bit register. It also means ext_i32_i64 can be
+implemented as a simple mov as the value is already sign extended.


We need better wording.  Each one of the three are optional, and the other two 
must be implemented.  I think we ought to have a check in tcg.c about this, in 
tcg_add_target_add_op_defs.



r~

[Qemu-devel] net: Next steps to deprecate -net (was: [RFC PATCH] Enable vlans and dump for -netdev, too)

2015-07-17 Thread Thomas Huth

On 05/26/2015 04:29 PM, Markus Armbruster wrote:
 Stefan Hajnoczi stefa...@redhat.com writes:
[...]
 We thought the QEMU vlan concept would be dropped completely in the
 future, so it was never added to -netdev.  No patches to do that have
 been posted over the years, so I think it was more of a conceptual goal
 than a concrete requirement.
 
 Well, patches to do that first need to replace the VLAN-only dump
 feature.
 
 To fully deprecate -net, we also have to replace -net nic for
 configuring onboard NICs.
 
 Prior discussion:
 http://lists.nongnu.org/archive/html/qemu-devel/2013-02/msg03743.html
 
 We haven't really tried either.

Ok, assuming that my Network traffic dumping for -netdev devices patch
series is going to solve the dumping-for-netdev problem, how do we
tackle the remaining problems that we have to solve before we can
deprecate -net? Does anybody have a survey of the (onboard) NICs that
can only be configured with -net but not with -device? Could they
nowadays be changed to work with -device, too, or are there still major
obstacles to solve first?

 Thomas




signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PULL v2 for-2.4 0/3] input: fixes for 2.4

2015-07-17 Thread Gerd Hoffmann

  Hi,

Dropped the patch to enable virtio-input for non-linux systems.
Otherwise unmodified.

please pull,
  Gerd

The following changes since commit 2d5ee9e7a7dd495d233cf9613a865f63f88e3375:

  Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150716' into staging 
(2015-07-16 10:40:23 +0100)

are available in the git repository at:


  git://git.kraxel.org/qemu tags/pull-input-20150717-1

for you to fetch changes up to 562f93754b95fd6dc65ad9a2aa15a90b2da7e8a4:

  hid: clarify hid_keyboard_process_keycode (2015-07-17 08:44:41 +0200)


input: fixes for 2.4


Gerd Hoffmann (1):
  virtio-input: move sys/ioctl.h include

Lin Ma (1):
  virtio-input: fix segfault in virtio_input_hid_properties

Paolo Bonzini (1):
  hid: clarify hid_keyboard_process_keycode

 hw/input/hid.c | 32 
 hw/input/virtio-input-hid.c|  1 +
 hw/input/virtio-input-host.c   |  1 +
 include/standard-headers/linux/input.h |  1 -
 scripts/update-linux-headers.sh|  1 +
 5 files changed, 31 insertions(+), 5 deletions(-)

[Qemu-devel] [PULL 3/3] hid: clarify hid_keyboard_process_keycode

2015-07-17 Thread Gerd Hoffmann

From: Paolo Bonzini pbonz...@redhat.com

Coverity thinks the fallthroughs are smelly.  They are correct, but
everything else in this function is like wut?.

Refer explicitly to bits 8 and 9 of hs-kbd.modifiers instead of
shifting right first and using (1  7).  Document what the scancode
is when hid_code is 0xe0.  And add plenty of comments.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/input/hid.c | 32 
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/hw/input/hid.c b/hw/input/hid.c
index 6841cb8..21ebd9e 100644
--- a/hw/input/hid.c
+++ b/hw/input/hid.c
@@ -239,7 +239,7 @@ static void hid_keyboard_event(DeviceState *dev, 
QemuConsole *src,
 
 static void hid_keyboard_process_keycode(HIDState *hs)
 {
-uint8_t hid_code, key;
+uint8_t hid_code, index, key;
 int i, keycode, slot;
 
 if (hs-n == 0) {
@@ -249,7 +249,8 @@ static void hid_keyboard_process_keycode(HIDState *hs)
 keycode = hs-kbd.keycodes[slot];
 
 key = keycode  0x7f;
-hid_code = hid_usage_keys[key | ((hs-kbd.modifiers  1)  (1  7))];
+index = key | ((hs-kbd.modifiers  (1  8))  1);
+hid_code = hid_usage_keys[index];
 hs-kbd.modifiers = ~(1  8);
 
 switch (hid_code) {
@@ -257,18 +258,41 @@ static void hid_keyboard_process_keycode(HIDState *hs)
 return;
 
 case 0xe0:
+assert(key == 0x1d);
 if (hs-kbd.modifiers  (1  9)) {
-hs-kbd.modifiers ^= 3  8;
+/* The hid_codes for the 0xe1/0x1d scancode sequence are 0xe9/0xe0.
+ * Here we're processing the second hid_code.  By dropping bit 9
+ * and setting bit 8, the scancode after 0x1d will access the
+ * second half of the table.
+ */
+hs-kbd.modifiers ^= (1  8) | (1  9);
 return;
 }
+/* fall through to process Ctrl_L */
 case 0xe1 ... 0xe7:
+/* Ctrl_L/Ctrl_R, Shift_L/Shift_R, Alt_L/Alt_R, Win_L/Win_R.
+ * Handle releases here, or fall through to process presses.
+ */
 if (keycode  (1  7)) {
 hs-kbd.modifiers = ~(1  (hid_code  0x0f));
 return;
 }
-case 0xe8 ... 0xef:
+/* fall through */
+case 0xe8 ... 0xe9:
+/* USB modifiers are just 1 byte long.  Bits 8 and 9 of
+ * hs-kbd.modifiers implement a state machine that detects the
+ * 0xe0 and 0xe1/0x1d sequences.  These bits do not follow the
+ * usual rules where bit 7 marks released keys; they are cleared
+ * elsewhere in the function as the state machine dictates.
+ */
 hs-kbd.modifiers |= 1  (hid_code  0x0f);
 return;
+
+case 0xea ... 0xef:
+abort();
+
+default:
+break;
 }
 
 if (keycode  (1  7)) {
-- 
1.8.3.1

[Qemu-devel] [PULL 2/3] virtio-input: move sys/ioctl.h include

2015-07-17 Thread Gerd Hoffmann

Drop from include/standard-headers/linux/input.h
Add to hw/input/virtio-input-host.c instead.

That allows to build virtio-input (except pass-through) on windows.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/input/virtio-input-host.c   | 1 +
 include/standard-headers/linux/input.h | 1 -
 scripts/update-linux-headers.sh| 1 +
 3 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/input/virtio-input-host.c b/hw/input/virtio-input-host.c
index f7e3d84..8978f16 100644
--- a/hw/input/virtio-input-host.c
+++ b/hw/input/virtio-input-host.c
@@ -11,6 +11,7 @@
 #include hw/virtio/virtio.h
 #include hw/virtio/virtio-input.h
 
+#include sys/ioctl.h
 #include standard-headers/linux/input.h
 
 /* - */
diff --git a/include/standard-headers/linux/input.h 
b/include/standard-headers/linux/input.h
index a459dd2..b003c67 100644
--- a/include/standard-headers/linux/input.h
+++ b/include/standard-headers/linux/input.h
@@ -10,7 +10,6 @@
 
 
 #include sys/time.h
-#include sys/ioctl.h
 #include sys/types.h
 #include standard-headers/linux/types.h
 
diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 47378d9..f0e830c 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -56,6 +56,7 @@ cp_virtio() {
 -e 's/__bitwise__//' \
 -e 's/__attribute__((packed))/QEMU_PACKED/' \
 -e 's/__inline__/inline/' \
+-e '/sys\/ioctl.h/d' \
 $f  $to/$header;
 done
 fi
-- 
1.8.3.1

1 2 >

1 - 100 of 169 matches

Mail list logo