date:20150812

Re: [Qemu-devel] [Xen-devel] [PATCH v1] xenpt: Properly handle 64-bit bar with more than 4G size

2015-08-12 Thread Konrad Rzeszutek Wilk

On Wed, Aug 12, 2015 at 08:53:44AM +, Wu, Feng wrote:
 
 
  -Original Message-
  From: Jan Beulich [mailto:jbeul...@suse.com]
  Sent: Wednesday, August 12, 2015 4:43 PM
  To: Wu, Feng
  Cc: stefano.stabell...@eu.citrix.com; xen-de...@lists.xensource.com;
  qemu-devel@nongnu.org
  Subject: RE: [Qemu-devel] [Xen-devel] [PATCH v1] xenpt: Properly handle 
  64-bit
  bar with more than 4G size
  
   On 12.08.15 at 09:10, feng...@intel.com wrote:
  
  
   -Original Message-
   From: qemu-devel-bounces+feng.wu=intel@nongnu.org
   [mailto:qemu-devel-bounces+feng.wu=intel@nongnu.org] On Behalf Of
   Jan Beulich
   Sent: Wednesday, August 12, 2015 2:59 PM
   To: Wu, Feng
   Cc: xen-de...@lists.xensource.com; qemu-devel@nongnu.org;
   stefano.stabell...@eu.citrix.com
   Subject: Re: [Qemu-devel] [Xen-devel] [PATCH v1] xenpt: Properly handle
  64-bit
   bar with more than 4G size
  
On 05.08.15 at 04:02, feng...@intel.com wrote:
@@ -491,8 +474,9 @@ static int
   xen_pt_bar_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry,
 bar_ro_mask = XEN_PT_BAR_IO_RO_MASK | (r_size - 1);
 break;
 case XEN_PT_BAR_FLAG_UPPER:
+r = d-io_regions[index-1];
  
   Perhaps worth an assert(index  0)?
  
   No problem, I will add it. BTW, do you have any other comments about this
   patch? If no, I am
   going to send out the new version with this changes.
  
  No - everything else looks to make sense (but continues to need
  testing).
  
 
 I don't have such a device in hand. Can anybody who has such a device help to 
 test this
 patch? It would be highly appreciated!

Um, 4GB MMIO bars? Wouldn't the Nvidia Quadro K6000 12GB GDDR do it? I am sure
that Intel would be OK expensing $4K of hardware :-)

 
 Thanks,
 Feng
 
  Jan
 
 
 ___
 Xen-devel mailing list
 xen-de...@lists.xen.org
 http://lists.xen.org/xen-devel

Re: [Qemu-devel] [PATCH] MAINTAINERS: list smbios maintainers

2015-08-12 Thread Wei Huang




On 8/12/15 04:20, Michael S. Tsirkin wrote:

Now that smbios has its own directory, list its
maintainers. Same people as ACPI so just reuse that
entry.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
  MAINTAINERS | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 978b717..a059d5d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -645,13 +645,15 @@ S: Supported
  F: include/hw/pci/*
  F: hw/pci/*

-ACPI
+ACPI/SMBIOS
  M: Michael S. Tsirkin m...@redhat.com
  M: Igor Mammedov imamm...@redhat.com
  S: Supported
  F: include/hw/acpi/*
+F: include/hw/smbios/*
  F: hw/mem/*
  F: hw/acpi/*
+F: hw/smbios/*
  F: hw/i386/acpi-build.[hc]
  F: hw/i386/*dsl
  F: hw/arm/virt-acpi-build.c


Acked-by: Wei Huang w...@redhat.com

Note that SMBIOS patch isn't in yet.

-Wei

Re: [Qemu-devel] [PATCH for-2.5 15/30] m68k: add more modes to movem

2015-08-12 Thread Richard Henderson

On 08/12/2015 01:07 AM, Andreas Schwab wrote:
 Richard Henderson r...@twiddle.net writes:
 
 On 08/09/2015 01:13 PM, Laurent Vivier wrote:
 +opsize = (insn  0x40) != 0 ? OS_LONG : OS_WORD;
 +incr = opsize_bytes(opsize);
 +if (!is_load  (insn  070) == 040) {
 +for (i = 15; i = 0; i--, mask = 1) {

 This has got to be wrong.  Just because it's pre-decrement doesn't mean
 you should skip all of the loads.
 
 Pre-dec only supports reg-to-mem, and is special because mask is bit
 reversed.

Ah, I'd never noticed that.  A comment to that effect would be good.


r~

Re: [Qemu-devel] [RFC PATCH V7 00/19] Multithread TCG.

2015-08-12 Thread Frederic Konrad


On 11/08/2015 15:59, Frederic Konrad wrote:

On 11/08/2015 14:45, Paolo Bonzini wrote:

On 10/08/2015 17:26, fred.kon...@greensocs.com wrote:

From: KONRAD Frederic fred.kon...@greensocs.com

This is the 7th round of the MTTCG patch series.

Thanks to look at this.

Here is a list of issues that I found:

- tb_lock usage in tb_find_fast is complicated and introduces the need
for other complicated code such as the tb_invalidate callback. Instead,
the tb locking should reuse the cpu-exec.c code for user-mode emulation,
with additional locking in the spots identified by Fred.
The reason for this is that locking around tb_find_fast just kills the 
performance.




- tb_lock uses a recursive lock, but this is not necessary.  Did I ever
say I dislike recursive mutexes? :)  The wrappers
tb_lock()/tb_unlock()/tb_lock_reset() can catch recursive locking for
us, so it's not hard to do without it.
True this is definitely not nice but seems some code eg: tb_invalidate 
are called from
different place with or without the tb_lock. We might probably be able 
to lock the

caller.



- code_bitmap is not protected by any mutex
(tb_invalidate_phys_page_fast is called with the iothread mutex taken,
but other users of code_bitmap do not use it).  Writes should be
protected by the tb_lock, reads by either tb_lock or RCU.

Yes good point I missed this one.


- memory barriers are probably requested around accesses to
-exit_request.  -thread_kicked also needs to be accessed with atomics,
because async_run_{,safe_}on_cpu can be called outside the big QEMU 
lock.


- the whole signal-based qemu_cpu_kick can just go away.  Just setting
tcg_exit_req and exit_request will kick the TCG thread.  The hairy Win32
SuspendThread/ResumeThread goes away too.  I suggest doing it now,
because proving it unnecessary is easier than proving it correct.
Just setting tcg_exit_req and exit_request and signal the 
cpu-halt_cond I guess?

BTW that affect KVM as well. Seems this mechanism is used as well with
qemu_cpu_kick_self().. Which is a little strange as it seems the SIGIPI 
trigger a

dummy signal handler?

memset(sigact, 0, sizeof(sigact));
sigact.sa_handler = dummy_signal;
sigaction(SIG_IPI, sigact, NULL);





- user-mode emulation is broken (does not compile)

- the big QEMU lock is not taken anywhere for MMIO accesses that require
it (i.e. basically all of them)

Isn't that handled by prepare_mmio_access?



- some code wants to be called _outside_ the big QEMU lock, for example
because it longjmps back to cpu_exec.  For example, I suspect that the
notdirty callbacks must be marked with 
memory_region_clear_global_locking.


Fred


I've started looking at them (and documenting the locking conventions
for functions), and I hope to post it to some git repo later this week.

Paolo


It can be cloned from:
g...@git.greensocs.com:fkonrad/mttcg.git branch multi_tcg_v7.

This patch-set try to address the different issues in the global 
picture of

MTTCG, presented on the wiki.

== Needed patch for our work ==

Some preliminaries are needed for our work:
  * current_cpu doesn't make sense in mttcg so a tcg_executing flag 
is added to

the CPUState.
  * We need to run some work safely when all VCPUs are outside their 
execution
loop. This is done with the async_run_safe_work_on_cpu function 
introduced

in this series.
  * QemuSpin lock is introduced (on posix only yet) to allow a 
faster handling of

atomic instruction.

== Code generation and cache ==

As Qemu stands, there is no protection at all against two threads 
attempting to

generate code at the same time or modifying a TranslationBlock.
The protect TBContext with tb_lock patch address the issue of code 
generation

and makes all the tb_* function thread safe (except tb_flush).
This raised the question of one or multiple caches. We choosed to 
use one
unified cache because it's easier as a first step and since the 
structure of
QEMU effectively has a ‘local’ cache per CPU in the form of the jump 
cache, we

don't see the benefit of having two pools of tbs.

== Dirty tracking ==

Protecting the IOs:
To allows all VCPUs threads to run at the same time we need to drop the
global_mutex as soon as possible. The io access need to take the 
mutex. This is
likely to change when 
http://thread.gmane.org/gmane.comp.emulators.qemu/345258

will be upstreamed.

Invalidation of TranslationBlocks:
We can have all VCPUs running during an invalidation. Each VCPU is 
able to clean
it's jump cache itself as it is in CPUState so that can be handled 
by a simple

call to async_run_on_cpu. However tb_invalidate also writes to the
TranslationBlock which is shared as we have only one pool.
Hence this part of invalidate requires all VCPUs to exit before it 
can be done.

Hence the async_run_safe_work_on_cpu is introduced to handle this case.

== Atomic instruction ==

For now only ARM on x64 is supported by using an cmpxchg instruction.
Specifically the limitation of this approach is

Re: [Qemu-devel] [PATCH v8 3/5] Introduce irqchip type specification for KVM

2015-08-12 Thread Eric Auger

On 08/12/2015 04:24 PM, Christoffer Dall wrote:
 On Wed, Aug 12, 2015 at 4:14 PM, Eric Auger eric.au...@linaro.org wrote:
 Hi,
 On 08/12/2015 03:23 PM, Christoffer Dall wrote:
 On Wed, Aug 12, 2015 at 2:59 PM, Peter Maydell peter.mayd...@linaro.org 
 wrote:
 On 12 August 2015 at 13:27, Pavel Fedin p.fe...@samsung.com wrote:
  Hello!

 I still think this is the wrong approach -- see my remarks
 in the previous round of patch review.

  You know... I thought a little bit...
  So far, test = true in KVM_CREATE_DEVICE means that we just want to know 
 whether this type is supported. No actual actions is done by the kernel. 
 Is it correct?
 yes
 If yes, we can just leave this test as it is, because if it says that
 GICv2 is supported by KVM_CREATE_DEVICE, then:
 1. We use new API. No KVM_IRQCHIP_CREATE.
 2. GICv3 may be supported.

  Therefore, if we do this check, and it succeeds, then we just
 proceed, and later actually try to create GICv3. If it fails for some
 reason, we will see error message anyway. So would it be OK just not
 to touch kvm_arch_irqchip_create() at all?

 No, because then if the kernel only supports GICv3 the
 code in kvm_arch_irqchip_create() (as it is currently written)
 will erroneously fall back to using the old API.

 Christoffer: the question was, why does kvm_arch_irqchip_create()
 not only check the KVM_CAP_DEVICE_CTRL but also try to see
 if it can KVM_CREATE_DEVICE the GICv2 in order to avoid
 falling back to the old pre-KVM_CREATE_DEVICE API ? Are there
 kernels which have the capability bit set but which can't
 actually use KVM_CREATE_DEVICE to create the irqchip?

 My thinking probably was that technically the KVM_CAP_DEVICE_CTRL is
 an orthogonal concept from how to create the vgic, and you could
 advertise this capability without also supporting the GICv2 device
 type.

 However, I don't believe such kernels exist
 the capability was advertised for arm/arm64 with GICv2
 7330672  KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC (1 year, 8
 months ago) Christoffer Dall
 so effectively I don't think we have any arm kernel advertising the CAP
 without GICv3.

  and they cannot in the
 future as that would be because we would remove an actively supported
 API.


 My preference here would be for kvm_arch_irqchip_create()
 to just use 'is the KVM_CAP_DEVICE_CTRL capability set'
 for its can we use the new API test; that will then
 work whether we have a GICv2 or GICv3 in the host. (The
 actual GIC device creation later on might fail, of course,
 but that can be handled at that point. The only thing
 we need to do as early as kvm_arch_irqchip_create is
 determine whether we must use the old API.)
 another way was proposed in the past was consisting in calling first
 ret = kvm_create_device(s, KVM_DEV_TYPE_ARM_VGIC_V3, true);
 if this succeeds this means we have the new API (the only one used vy
 v3) and hence we have it as well for VGIC_V2, we can return ...
 if this fails we just can say VGICv3 KVM device hasn't registered, try
 KVM_DEV_TYPE_ARM_VGIC_V2
 if we have it return else fall back to older API

 I think it worked as well?

 Besides I think Peter's suggestion is simpler.

 For info we also use KVM VFIO device now.

 Note that there's a difference between I called the create-device
 ioctl, and the creation of the device failed vs. I called the ioctl
 and I got an error because the ioctl is not supported, only in the
 latter case should you fall back to the older API.
yes understood; practically kvm_ioctl_create_device used in test mode
just looks for the specific device in the registered KVM device list and
that's it. whatever the case, API not supported or device not found it
reports -ENODEV.

Eric
 
 Not sure if the end result is the same with the suggested approach,
 based on the right error values etc.
 
 -Christoffer

Re: [Qemu-devel] [PATCH v8 3/5] Introduce irqchip type specification for KVM

2015-08-12 Thread Christoffer Dall

On Wed, Aug 12, 2015 at 4:14 PM, Eric Auger eric.au...@linaro.org wrote:
 Hi,
 On 08/12/2015 03:23 PM, Christoffer Dall wrote:
 On Wed, Aug 12, 2015 at 2:59 PM, Peter Maydell peter.mayd...@linaro.org 
 wrote:
 On 12 August 2015 at 13:27, Pavel Fedin p.fe...@samsung.com wrote:
  Hello!

 I still think this is the wrong approach -- see my remarks
 in the previous round of patch review.

  You know... I thought a little bit...
  So far, test = true in KVM_CREATE_DEVICE means that we just want to know 
 whether this type is supported. No actual actions is done by the kernel. 
 Is it correct?
 yes
 If yes, we can just leave this test as it is, because if it says that
 GICv2 is supported by KVM_CREATE_DEVICE, then:
 1. We use new API. No KVM_IRQCHIP_CREATE.
 2. GICv3 may be supported.

  Therefore, if we do this check, and it succeeds, then we just
 proceed, and later actually try to create GICv3. If it fails for some
 reason, we will see error message anyway. So would it be OK just not
 to touch kvm_arch_irqchip_create() at all?

 No, because then if the kernel only supports GICv3 the
 code in kvm_arch_irqchip_create() (as it is currently written)
 will erroneously fall back to using the old API.

 Christoffer: the question was, why does kvm_arch_irqchip_create()
 not only check the KVM_CAP_DEVICE_CTRL but also try to see
 if it can KVM_CREATE_DEVICE the GICv2 in order to avoid
 falling back to the old pre-KVM_CREATE_DEVICE API ? Are there
 kernels which have the capability bit set but which can't
 actually use KVM_CREATE_DEVICE to create the irqchip?

 My thinking probably was that technically the KVM_CAP_DEVICE_CTRL is
 an orthogonal concept from how to create the vgic, and you could
 advertise this capability without also supporting the GICv2 device
 type.

 However, I don't believe such kernels exist
 the capability was advertised for arm/arm64 with GICv2
 7330672  KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC (1 year, 8
 months ago) Christoffer Dall
 so effectively I don't think we have any arm kernel advertising the CAP
 without GICv3.

  and they cannot in the
 future as that would be because we would remove an actively supported
 API.


 My preference here would be for kvm_arch_irqchip_create()
 to just use 'is the KVM_CAP_DEVICE_CTRL capability set'
 for its can we use the new API test; that will then
 work whether we have a GICv2 or GICv3 in the host. (The
 actual GIC device creation later on might fail, of course,
 but that can be handled at that point. The only thing
 we need to do as early as kvm_arch_irqchip_create is
 determine whether we must use the old API.)
 another way was proposed in the past was consisting in calling first
 ret = kvm_create_device(s, KVM_DEV_TYPE_ARM_VGIC_V3, true);
 if this succeeds this means we have the new API (the only one used vy
 v3) and hence we have it as well for VGIC_V2, we can return ...
 if this fails we just can say VGICv3 KVM device hasn't registered, try
 KVM_DEV_TYPE_ARM_VGIC_V2
 if we have it return else fall back to older API

 I think it worked as well?

 Besides I think Peter's suggestion is simpler.

 For info we also use KVM VFIO device now.

Note that there's a difference between I called the create-device
ioctl, and the creation of the device failed vs. I called the ioctl
and I got an error because the ioctl is not supported, only in the
latter case should you fall back to the older API.

Not sure if the end result is the same with the suggested approach,
based on the right error values etc.

-Christoffer

Re: [Qemu-devel] [RFC v4 1/9] exec.c: Add new exclusive bitmap to ram_list

2015-08-12 Thread alvise rigo

On Wed, Aug 12, 2015 at 4:10 PM, Paolo Bonzini pbonz...@redhat.com wrote:


 On 12/08/2015 16:04, alvise rigo wrote:
  clear algorithm:
 if bytemap[vaddr] == 254
bytemap[vaddr] = CPU_ID
 Isn't this also required for the clear algorithm?

 if bytemap[vaddr]  254
 /* this can happen for the TLB_EXCL slow path effect */
 bytemap[vaddr] = 255

 I don't think so because clear doesn't clear TLB_EXCL.  But maybe we're
 talking about two different things. :)

Let me go through it again, I might have missed something :)

alvise


 Paolo

Re: [Qemu-devel] [PATCH v8 3/5] Introduce irqchip type specification for KVM

2015-08-12 Thread Eric Auger

Hi,
On 08/12/2015 03:23 PM, Christoffer Dall wrote:
 On Wed, Aug 12, 2015 at 2:59 PM, Peter Maydell peter.mayd...@linaro.org 
 wrote:
 On 12 August 2015 at 13:27, Pavel Fedin p.fe...@samsung.com wrote:
  Hello!

 I still think this is the wrong approach -- see my remarks
 in the previous round of patch review.

  You know... I thought a little bit...
  So far, test = true in KVM_CREATE_DEVICE means that we just want to know 
 whether this type is supported. No actual actions is done by the kernel. Is 
 it correct? 
yes
If yes, we can just leave this test as it is, because if it says that
GICv2 is supported by KVM_CREATE_DEVICE, then:
 1. We use new API. No KVM_IRQCHIP_CREATE.
 2. GICv3 may be supported.

  Therefore, if we do this check, and it succeeds, then we just
 proceed, and later actually try to create GICv3. If it fails for some
 reason, we will see error message anyway. So would it be OK just not
 to touch kvm_arch_irqchip_create() at all?

 No, because then if the kernel only supports GICv3 the
 code in kvm_arch_irqchip_create() (as it is currently written)
 will erroneously fall back to using the old API.

 Christoffer: the question was, why does kvm_arch_irqchip_create()
 not only check the KVM_CAP_DEVICE_CTRL but also try to see
 if it can KVM_CREATE_DEVICE the GICv2 in order to avoid
 falling back to the old pre-KVM_CREATE_DEVICE API ? Are there
 kernels which have the capability bit set but which can't
 actually use KVM_CREATE_DEVICE to create the irqchip?
 
 My thinking probably was that technically the KVM_CAP_DEVICE_CTRL is
 an orthogonal concept from how to create the vgic, and you could
 advertise this capability without also supporting the GICv2 device
 type.
 
 However, I don't believe such kernels exist
the capability was advertised for arm/arm64 with GICv2
7330672  KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC (1 year, 8
months ago) Christoffer Dall
so effectively I don't think we have any arm kernel advertising the CAP
without GICv3.

 and they cannot in the
 future as that would be because we would remove an actively supported
 API.
 

 My preference here would be for kvm_arch_irqchip_create()
 to just use 'is the KVM_CAP_DEVICE_CTRL capability set'
 for its can we use the new API test; that will then
 work whether we have a GICv2 or GICv3 in the host. (The
 actual GIC device creation later on might fail, of course,
 but that can be handled at that point. The only thing
 we need to do as early as kvm_arch_irqchip_create is
 determine whether we must use the old API.)
another way was proposed in the past was consisting in calling first
ret = kvm_create_device(s, KVM_DEV_TYPE_ARM_VGIC_V3, true);
if this succeeds this means we have the new API (the only one used vy
v3) and hence we have it as well for VGIC_V2, we can return ...
if this fails we just can say VGICv3 KVM device hasn't registered, try
KVM_DEV_TYPE_ARM_VGIC_V2
if we have it return else fall back to older API

I think it worked as well?

Besides I think Peter's suggestion is simpler.

For info we also use KVM VFIO device now.

Best Regards

Eric

 I'm fine with this.
 
 -Christoffer

[Qemu-devel] [PATCH] sh4: Fix initramfs initialization for endiannes-mismatched targets

2015-08-12 Thread Guenter Roeck

If host and target endianness does not match, loding an initramfs does not work.
Fix by writing boot parameters with appropriate endianness conversion.

Signed-off-by: Guenter Roeck li...@roeck-us.net
---
 hw/sh4/r2d.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/sh4/r2d.c b/hw/sh4/r2d.c
index 5e22ed7..3b0b2ec 100644
--- a/hw/sh4/r2d.c
+++ b/hw/sh4/r2d.c
@@ -338,9 +338,9 @@ static void r2d_init(MachineState *machine)
 }
 
 /* initialization which should be done by firmware */
-boot_params.loader_type = 1;
-boot_params.initrd_start = INITRD_LOAD_OFFSET;
-boot_params.initrd_size = initrd_size;
+boot_params.loader_type = tswap32(1);
+boot_params.initrd_start = tswap32(INITRD_LOAD_OFFSET);
+boot_params.initrd_size = tswap32(initrd_size);
 }
 
 if (kernel_cmdline) {
-- 
2.1.4

[Qemu-devel] [PATCH] hw/misc/zynq_slcr: Change CPU clock rate

2015-08-12 Thread Guenter Roeck

The Linux kernel only accepts 34 Khz and 67 Khz clock rates, and
may crash if the actual clock rate is too low. The clock rate used to be
(ps-clk-frequency * 26 / 4), which resulted in a CPU frequency of
21 Khz if ps-clk-frequency was set to  Hz. Change it to
(ps-clk-frequency * 20 / 2) = 33 Khz to make Linux happy.

Signed-off-by: Guenter Roeck li...@roeck-us.net
---
 hw/misc/zynq_slcr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/misc/zynq_slcr.c b/hw/misc/zynq_slcr.c
index 964f253..d3e1ce0 100644
--- a/hw/misc/zynq_slcr.c
+++ b/hw/misc/zynq_slcr.c
@@ -189,7 +189,7 @@ static void zynq_slcr_reset(DeviceState *d)
 
 s-regs[LOCKSTA] = 1;
 /* 0x100 - 0x11C */
-s-regs[ARM_PLL_CTRL]   = 0x0001A008;
+s-regs[ARM_PLL_CTRL]   = 0x00014008;
 s-regs[DDR_PLL_CTRL]   = 0x0001A008;
 s-regs[IO_PLL_CTRL]= 0x0001A008;
 s-regs[PLL_STATUS] = 0x003F;
@@ -198,7 +198,7 @@ static void zynq_slcr_reset(DeviceState *d)
 s-regs[IO_PLL_CFG] = 0x00014000;
 
 /* 0x120 - 0x16C */
-s-regs[ARM_CLK_CTRL]   = 0x1F000400;
+s-regs[ARM_CLK_CTRL]   = 0x1F000200;
 s-regs[DDR_CLK_CTRL]   = 0x1843;
 s-regs[DCI_CLK_CTRL]   = 0x01E03201;
 s-regs[APER_CLK_CTRL]  = 0x01FFCCCD;
-- 
2.1.4

Re: [Qemu-devel] [PATCH] Makefile.target: include top level build dir in vpath

2015-08-12 Thread Michael Marineau

On Aug 12, 2015 6:32 AM, Paolo Bonzini pbonz...@redhat.com wrote:



 On 09/08/2015 09:02, Michael Marineau wrote:
  Using ccache with CCACHE_BASEDIR set to $(SRC_PATH) or a parent will
  rewrite all absolute paths to relative paths. This interacts poorly with
  QEMU's two-level build directory scheme. For example, lets say
  BUILD_DIR=$(SRC_PATH)/build so build/blockdev.d will contain:
 
blockdev.o: ../blockdev.c ../include/sysemu/block-backend.h \
 
  Now the target build under build/x86_64-softmmu or similar will depend
  on ../blockdev.o which in turn will get make to source ../blockdev.d to
  check its dependencies. Since make always considers paths relative to
  the current working directory rather than the makefile the path appeared
  in the relative path to ../blockdev.c is useless.
 
  This change simply adds the top level build directory to vpath so paths
  relative to the source directory, top build directory, and target build
  directory all work just fine.
  ---
   Makefile.target | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
  diff --git a/Makefile.target b/Makefile.target
  index 3e7aafd..dc32294 100644
  --- a/Makefile.target
  +++ b/Makefile.target
  @@ -7,7 +7,7 @@ include config-target.mak
   include config-devices.mak
   include $(SRC_PATH)/rules.mak
 
  -$(call set-vpath, $(SRC_PATH))
  +$(call set-vpath, $(SRC_PATH):$(BUILD_DIR))
   ifdef CONFIG_LINUX
   QEMU_CFLAGS += -I../linux-headers
   endif
 

 Hi,

 can you reply with Signed-off-by: Michael Marineau
 michael.marin...@coreos.com, representing that you've read and agreed
 to the Developer Certificate of Origin (http://developercertificate.org/)?

 This is necessary to include your patch in QEMU.

 Paolo

Oops, forgot about that.

Signed-off-by: Michael Marineau
michael.marin...@coreos.com

Re: [Qemu-devel] [PATCH v2] target-cris: update CPU state save/load to use VMStateDescription

2015-08-12 Thread Edgar E. Iglesias

On Fri, Aug 07, 2015 at 05:02:14PM +0100, Peter Maydell wrote:
 From: Juan Quintela quint...@redhat.com
 
 Update the CRIS CPU state save/load to use a VMStateDescription struct
 rather than cpu_save/cpu_load functions.
 
 Have to define TLBSet struct.
 Multidimensional arrays in C are a mess, just unroll them.
 
 Signed-off-by: Juan Quintela quint...@redhat.com


Acked-by: Edgar E. Iglesias edgar.igles...@xilinx.com



 [PMM:
  * expand commit message a little since it's no longer one patch in
a 35-patch series
  * add header/copyright comment to machine.c; credited copyright is
Red Hat and author is Juan, since this commit gives the file all-new
contents; license is LGPL-2-or-later, to match other target-cris code
  * remove hardcoded tab
  * add fields for locked_irq, interrupt_vector, fault_vector, trap_vector
  * drop minimum_version_id_old fields
  * bump version_id to 2 as we are not compatible with old state format
  * remove unnecessary hw/boards.h include
  * update to register via dc-vmsd]
 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
 
 This is a patch of Juan's from way back in 2012, which I am resurrecting
 because we now have only two CPUs which still use old-style non-VMState
 save/load (CRIS and SPARC). If we can update them both we can drop the
 machinery in the common code which supports this.
 
 Notes:
  * CPUCRISState indent style is somewhat mismatched (cf load_info);
I took the minimal make checkpatch happy approach, but am
happy to do something else with the line changed here
  * I have added a copyright header to target-cris/machine.c, because it
did not have one at all, and this commit effectively gives the
file all-new contents. I have set it up as LGPLv2-or-later,
copyright Red Hat, author Juan. Please let me know if you would
prefer something else!
  * I added vmstate entries for four fields which did not previously
get saved and restored, which is presumably a bug fix...
  * vmsave/vmload on the axis-dev88 board does not currently seem to
work (among other obvious problems, there is no vmstate support
in the interrupt controller), so we're limited to looks good
on code review here.
 
 Changes v1-v2:
  * actually register the vmstate struct, by setting dc-vmsd
(discussion on IRC with Andreas a few weeks back indicated that
dc-vmsd is preferred over cc-vmsd for new CPUs)
  * update it to be the right format for a dc-vmsd struct
 
 Testing: I checked with a host gdb that we seem to be filling in
 the registers with the right values. Since none of the devices
 in the axis model (including the interrupt controller) support
 vmstate save/load, the run after load from snapshot goes wrong pretty
 quickly, obviously.
 
  target-cris/cpu-qom.h |   4 ++
  target-cris/cpu.c |   1 +
  target-cris/cpu.h |  13 ++--
  target-cris/machine.c | 167 
 +-
  4 files changed, 95 insertions(+), 90 deletions(-)
 
 diff --git a/target-cris/cpu-qom.h b/target-cris/cpu-qom.h
 index 6fc30c2..df4c0b5 100644
 --- a/target-cris/cpu-qom.h
 +++ b/target-cris/cpu-qom.h
 @@ -73,6 +73,10 @@ static inline CRISCPU *cris_env_get_cpu(CPUCRISState *env)
  
  #define ENV_OFFSET offsetof(CRISCPU, env)
  
 +#ifndef CONFIG_USER_ONLY
 +extern const struct VMStateDescription vmstate_cris_cpu;
 +#endif
 +
  void cris_cpu_do_interrupt(CPUState *cpu);
  void crisv10_cpu_do_interrupt(CPUState *cpu);
  bool cris_cpu_exec_interrupt(CPUState *cpu, int int_req);
 diff --git a/target-cris/cpu.c b/target-cris/cpu.c
 index b17e849..d461e07 100644
 --- a/target-cris/cpu.c
 +++ b/target-cris/cpu.c
 @@ -302,6 +302,7 @@ static void cris_cpu_class_init(ObjectClass *oc, void 
 *data)
  cc-handle_mmu_fault = cris_cpu_handle_mmu_fault;
  #else
  cc-get_phys_page_debug = cris_cpu_get_phys_page_debug;
 +dc-vmsd = vmstate_cris_cpu;
  #endif
  
  cc-gdb_num_core_regs = 49;
 diff --git a/target-cris/cpu.h b/target-cris/cpu.h
 index d422e35..6760dc6 100644
 --- a/target-cris/cpu.h
 +++ b/target-cris/cpu.h
 @@ -108,6 +108,11 @@
  
  #define NB_MMU_MODES 2
  
 +typedef struct {
 +uint32_t hi;
 +uint32_t lo;
 +} TLBSet;
 +
  typedef struct CPUCRISState {
   uint32_t regs[16];
   /* P0 - P15 are referred to as special registers in the docs.  */
 @@ -161,11 +166,7 @@ typedef struct CPUCRISState {
*
* One for I and another for D.
*/
 - struct
 - {
 - uint32_t hi;
 - uint32_t lo;
 - } tlbsets[2][4][16];
 +TLBSet tlbsets[2][4][16];
  
   CPU_COMMON
  
 @@ -227,8 +228,6 @@ enum {
  #define cpu_gen_code cpu_cris_gen_code
  #define cpu_signal_handler cpu_cris_signal_handler
  
 -#define CPU_SAVE_VERSION 1
 -
  /* MMU modes definitions */
  #define MMU_MODE0_SUFFIX _kernel
  #define MMU_MODE1_SUFFIX _user
 diff --git a/target-cris/machine.c b/target-cris/machine.c
 index 8f9c0dd..983b67c 100644
 --- a/target-cris/machine.c
 +++

Re: [Qemu-devel] [RFC PATCH V7 16/19] translate-all: introduces tb_flush_safe.

2015-08-12 Thread Paolo Bonzini



On 12/08/2015 16:11, Frederic Konrad wrote:
 You could also allocate a new code buffer and free the old one with
 call_rcu.  This should simplify things a lot.

 Depending the size of the code buffer this might be a good idea. :).

32 megabytes.

Paolo

Re: [Qemu-devel] [PATCH] linux-user: elfload: Still use TARGET_PAGE_SIZE for i386 guest

2015-08-12 Thread Richard Henderson

On 08/12/2015 12:59 AM, gchen gchen wrote:
 Nack.  There's 99 problems with host page size guest page size.  This
 solves none of them, and in the hackiest way possible.

 
 Under alpha virtual machine, if set i386 guest page size 8KB, it will
 cause failure directly (any dynamically linked binaries can not work).

Yes, I know.  The same thing happens when running i386 guests on other
(non-virtual) hosts.  E.g. Sparc64's 8kB page, PowerPC64's 64kB page.

 Do you have any other ideas for solving this issue?

The only complete solution that I see is to use softmmu with linux-user, so
that we properly emulate the guest pages.  Yes, it will cause quite some
slow-down in emulation, but I believe it's the only reliable way.


r~

[Qemu-devel] [PATCH] block/raw-posix: Use raw_normalize_devicepath()

2015-08-12 Thread Max Reitz

The filename given to qemu_open() in block/raw-posix.c should generally
have been processed by raw_normalize_devicepath(); unless we are only
probing (in which case the caller often checks whether the file is a
block device or not, and this property will be changed by
raw_normalize_devicepath() on NetBSD) or it is about a deprecated device
(i.e. floppy).

Signed-off-by: Max Reitz mre...@redhat.com
---
 block/raw-posix.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 855febe..30df8ad 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -670,11 +670,17 @@ static int raw_reopen_prepare(BDRVReopenState *state,
 
 /* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
 if (raw_s-fd == -1) {
-assert(!(raw_s-open_flags  O_CREAT));
-raw_s-fd = qemu_open(state-bs-filename, raw_s-open_flags);
-if (raw_s-fd == -1) {
-error_setg_errno(errp, errno, Could not reopen file);
-ret = -1;
+const char *normalized_filename = state-bs-filename;
+ret = raw_normalize_devicepath(normalized_filename);
+if (ret  0) {
+error_setg_errno(errp, -ret, Could not normalize device path);
+} else {
+assert(!(raw_s-open_flags  O_CREAT));
+raw_s-fd = qemu_open(normalized_filename, raw_s-open_flags);
+if (raw_s-fd == -1) {
+error_setg_errno(errp, errno, Could not reopen file);
+ret = -1;
+}
 }
 }
 
@@ -2314,6 +2320,12 @@ static int hdev_create(const char *filename, QemuOpts 
*opts,
 
 (void)has_prefix;
 
+ret = raw_normalize_devicepath(filename);
+if (ret  0) {
+error_setg_errno(errp, -ret, Could not normalize device path);
+return ret;
+}
+
 /* Read out options */
 total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
   BDRV_SECTOR_SIZE);
-- 
2.4.6

Re: [Qemu-devel] [RFC PATCH V7 00/19] Multithread TCG.

2015-08-12 Thread Paolo Bonzini



On 12/08/2015 17:19, Frederic Konrad wrote:
 BTW that affect KVM as well. Seems this mechanism is used as well with
 qemu_cpu_kick_self().. Which is a little strange as it seems the SIGIPI
 trigger a
 dummy signal handler?
 
 memset(sigact, 0, sizeof(sigact));
 sigact.sa_handler = dummy_signal;
 sigaction(SIG_IPI, sigact, NULL);

KVM is different, the signal handler is used to kick the VM out of
KVM_RUN.  We're going to add another path (a ioctl) but it cannot use
the same code as TCG.

qemu_cpu_kick_self is needed in some special cases where KVM tells you
call KVM_RUN asap but you know you have more work to do in userspace.
 Calling qemu_cpu_kick_self lets you call KVM_RUN work and immediately
do the userspace work.

Paolo

[Qemu-devel] [PATCH v3 0/5] block: Drop BDS.filename

2015-08-12 Thread Max Reitz

This series depends on
[PATCH] block/raw-posix: Use raw_normalize_devicepath().


The BDS filename field is generally only used when opening disk images
or emitting error or warning messages, the only exception to this rule
is the map command of qemu-img. However, using exact_filename there
instead should not be a problem. Therefore, we can drop the filename
field from the BlockDriverState and use a function instead which builds
the filename from scratch when called.

This is slower than reading a static char array but the problem of that
static array is that it may become obsolete due to changes in any
BlockDriverState or in the BDS graph. Using a function which rebuilds
the filename every time it is called resolves this problem.

The disadvantage of worse performance is negligible, on the other hand.
After patch 2 of this series, which replaces some queries of
BDS.filename by reads from somewhere else (mostly BDS.exact_filename),
the filename field is only used when a disk image is opened or some
message should be emitted, both of which cases do not suffer from the
performance hit.


http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg07025.html
gives an example of how to achieve an outdated filename, so we do need
this series. We can technically fix it differently (by appropriately
invoking bdrv_refresh_filename()), but that is pretty difficult™. I find
this series simpler and it it's something we wanted to do anyway.

Regarding the fear that this might change current behavior, especially
for libvirt which used filenames to identify nodes in the BDS graph:
Since bdrv_open() already calls bdrv_refresh_filename() today, and
apparently everything works fine, this series will most likely not break
anything. The only thing that will change is if the BDS graph is
dynamically reconfigured at runtime, and in that case it's most likely a
bug fix. Also, I don't think libvirt supports such cases already.


tl;dr: This series is a bug fix and I don't think it'll break legacy
management applications relying on the filename to identify a BDS; as
long as they are not trying to continue that practice with more advanced
configurations (e.g. with Quorum).


v2:
- Patches 1 and 4: Store filename in heap rather than on the stack
  [Wen Congyang]
- Patch 2:
  - Rebased on
[PATCH] block/raw-posix: Use raw_normalize_devicepath()
  - Fixed 85-character line
- Patch 3: Added wrapper function which facilitates generation of a
  filename into a temporary heap buffer


git-backport-diff against v2:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/5:[0011] [FC] 'block: Change bdrv_get_encrypted_filename()'
002/5:[0007] [FC] 'block: Avoid BlockDriverState.filename'
003/5:[0009] [FC] 'block: Add bdrv_filename()'
004/5:[0129] [FC] 'block: Drop BlockDriverState.filename'
005/5:[] [--] 'iotests: Test changed Quorum filename'


Max Reitz (5):
  block: Change bdrv_get_encrypted_filename()
  block: Avoid BlockDriverState.filename
  block: Add bdrv_filename()
  block: Drop BlockDriverState.filename
  iotests: Test changed Quorum filename

 block.c   | 108 +-
 block/commit.c|   4 +-
 block/gluster.c   |   2 +-
 block/mirror.c|  16 +--
 block/qapi.c  |   4 +-
 block/raw-posix.c |  20 ++---
 block/raw-win32.c |   4 +-
 block/raw_bsd.c   |   4 +-
 block/vhdx-log.c  |   5 ++-
 block/vmdk.c  |  22 +++---
 block/vpc.c   |   7 ++-
 blockdev.c|  22 +++---
 include/block/block.h |   4 +-
 include/block/block_int.h |   1 -
 monitor.c |   6 ++-
 qemu-img.c|   2 +-
 tests/qemu-iotests/041|  17 ++--
 17 files changed, 179 insertions(+), 69 deletions(-)

-- 
2.4.6

[Qemu-devel] [PATCH v3 5/5] iotests: Test changed Quorum filename

2015-08-12 Thread Max Reitz

After drive-mirror replacing a Quorum child, the filename of the Quorum
BDS should reflect the change. This patch replaces the existing test for
whether the operation did actually exchange the BDS (which simply tested
whether the new BDS existed) by a test which examines the children list
contained in the Quorum BDS's filename as returned by query-block.

As a nice side effect of confirming that the new BDS actually belongs to
the Quorum BDS, this checks whether the filename was properly updated.

Signed-off-by: Max Reitz mre...@redhat.com
---
 tests/qemu-iotests/041 | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 3d46ed7..316f42a 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -18,6 +18,7 @@
 # along with this program.  If not, see http://www.gnu.org/licenses/.
 #
 
+import json
 import time
 import os
 import iotests
@@ -908,10 +909,18 @@ class TestRepairQuorum(iotests.QMPTestCase):
 self.assert_qmp(result, 'return', {})
 
 self.complete_and_wait(drive=quorum0)
-result = self.vm.qmp('query-named-block-nodes')
-self.assert_qmp(result, 'return[0]/file', quorum_repair_img)
-# TODO: a better test requiring some QEMU infrastructure will be added
-#   to check that this file is really driven by quorum
+
+result = self.vm.qmp('query-block')
+for blockdev in result['return']:
+if blockdev['device'] == 'quorum0':
+filename = blockdev['inserted']['image']['filename']
+self.assertEquals(filename[:5], 'json:')
+children = json.loads(filename[5:])['children']
+self.assertEquals(children[0]['file']['filename'], quorum_img1)
+self.assertEquals(children[1]['file']['filename'],
+  quorum_repair_img)
+self.assertEquals(children[2]['file']['filename'], quorum_img3)
+
 self.vm.shutdown()
 
 if __name__ == '__main__':
-- 
2.4.6

[Qemu-devel] [PATCH v3 2/5] block: Avoid BlockDriverState.filename

2015-08-12 Thread Max Reitz

In places which directly pass a filename to the OS, we should not use
the filename field at all but exact_filename instead (although the
former currently equals the latter if that is set).

In qemu-img's map command, we should be using the filename field; but
since this commit prepares to remove that field, using exact_filename is
fine, too (this is the only user of BlockDriverState.filename which
frequently queries that field).

Signed-off-by: Max Reitz mre...@redhat.com
---
 block.c   | 5 +++--
 block/gluster.c   | 2 +-
 block/raw-posix.c | 8 
 block/raw-win32.c | 4 ++--
 qemu-img.c| 2 +-
 5 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 41b0f85..1572785 100644
--- a/block.c
+++ b/block.c
@@ -918,8 +918,9 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 if (ret  0) {
 if (local_err) {
 error_propagate(errp, local_err);
-} else if (bs-filename[0]) {
-error_setg_errno(errp, -ret, Could not open '%s', bs-filename);
+} else if (bs-exact_filename[0]) {
+error_setg_errno(errp, -ret, Could not open '%s',
+ bs-exact_filename);
 } else {
 error_setg_errno(errp, -ret, Could not open image);
 }
diff --git a/block/gluster.c b/block/gluster.c
index 1eb3a8c..176682b 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -358,7 +358,7 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState 
*state,
 
 gconf = g_new0(GlusterConf, 1);
 
-reop_s-glfs = qemu_gluster_init(gconf, state-bs-filename, errp);
+reop_s-glfs = qemu_gluster_init(gconf, state-bs-exact_filename, errp);
 if (reop_s-glfs == NULL) {
 ret = -errno;
 goto exit;
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 30df8ad..10f8f62 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -670,7 +670,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
 
 /* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
 if (raw_s-fd == -1) {
-const char *normalized_filename = state-bs-filename;
+const char *normalized_filename = state-bs-exact_filename;
 ret = raw_normalize_devicepath(normalized_filename);
 if (ret  0) {
 error_setg_errno(errp, -ret, Could not normalize device path);
@@ -2201,7 +2201,7 @@ static int fd_open(BlockDriverState *bs)
 DPRINTF(No floppy (open delayed)\n);
 return -EIO;
 }
-s-fd = qemu_open(bs-filename, s-open_flags  ~O_NONBLOCK);
+s-fd = qemu_open(bs-exact_filename, s-open_flags  ~O_NONBLOCK);
 if (s-fd  0) {
 s-fd_error_time = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 s-fd_got_error = 1;
@@ -2498,7 +2498,7 @@ static void floppy_eject(BlockDriverState *bs, bool 
eject_flag)
 qemu_close(s-fd);
 s-fd = -1;
 }
-fd = qemu_open(bs-filename, s-open_flags | O_NONBLOCK);
+fd = qemu_open(bs-exact_filename, s-open_flags | O_NONBLOCK);
 if (fd = 0) {
 if (ioctl(fd, FDEJECT, 0)  0)
 perror(FDEJECT);
@@ -2722,7 +2722,7 @@ static int cdrom_reopen(BlockDriverState *bs)
  */
 if (s-fd = 0)
 qemu_close(s-fd);
-fd = qemu_open(bs-filename, s-open_flags, 0644);
+fd = qemu_open(bs-exact_filename, s-open_flags, 0644);
 if (fd  0) {
 s-fd = -1;
 return -EIO;
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 68f2338..5c8a894 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -417,7 +417,7 @@ static void raw_close(BlockDriverState *bs)
 
 CloseHandle(s-hfile);
 if (bs-open_flags  BDRV_O_TEMPORARY) {
-unlink(bs-filename);
+unlink(bs-exact_filename);
 }
 }
 
@@ -485,7 +485,7 @@ static int64_t raw_get_allocated_file_size(BlockDriverState 
*bs)
   DWORD * high);
 get_compressed_t get_compressed;
 struct _stati64 st;
-const char *filename = bs-filename;
+const char *filename = bs-exact_filename;
 /* WinNT support GetCompressedFileSize to determine allocate size */
 get_compressed =
 (get_compressed_t) GetProcAddress(GetModuleHandle(kernel32),
diff --git a/qemu-img.c b/qemu-img.c
index 75f4ee4..3d5587a 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2155,7 +2155,7 @@ static void dump_map_entry(OutputFormat output_format, 
MapEntry *e,
 }
 if ((e-flags  (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) == BDRV_BLOCK_DATA) 
{
 printf(%#-16PRIx64%#-16PRIx64%#-16PRIx64%s\n,
-   e-start, e-length, e-offset, e-bs-filename);
+   e-start, e-length, e-offset, e-bs-exact_filename);
 }
 /* This format ignores the distinction between 0, ZERO and ZERO|DATA.
  * Modify the flags here to allow more coalescing.
-- 
2.4.6

[Qemu-devel] [PATCH v3 1/5] block: Change bdrv_get_encrypted_filename()

2015-08-12 Thread Max Reitz

Instead of returning a pointer to the filename, copy it into a buffer
specified by the caller.

Signed-off-by: Max Reitz mre...@redhat.com
---
 block.c   | 25 ++---
 include/block/block.h |  2 +-
 monitor.c |  6 +-
 3 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index d088ee0..41b0f85 100644
--- a/block.c
+++ b/block.c
@@ -2760,10 +2760,13 @@ void bdrv_add_key(BlockDriverState *bs, const char 
*key, Error **errp)
 }
 } else {
 if (bdrv_key_required(bs)) {
+char *enc_filename = g_malloc(PATH_MAX);
+bdrv_get_encrypted_filename(bs, enc_filename, PATH_MAX);
 error_set(errp, ERROR_CLASS_DEVICE_ENCRYPTED,
   '%s' (%s) is encrypted,
   bdrv_get_device_or_node_name(bs),
-  bdrv_get_encrypted_filename(bs));
+  enc_filename);
+g_free(enc_filename);
 }
 }
 }
@@ -2980,14 +2983,22 @@ bool bdrv_can_write_zeroes_with_unmap(BlockDriverState 
*bs)
 return false;
 }
 
-const char *bdrv_get_encrypted_filename(BlockDriverState *bs)
+char *bdrv_get_encrypted_filename(BlockDriverState *bs, char *dest, size_t sz)
 {
-if (bs-backing_hd  bs-backing_hd-encrypted)
-return bs-backing_file;
-else if (bs-encrypted)
-return bs-filename;
-else
+if (sz  INT_MAX) {
+sz = INT_MAX;
+}
+
+if (bs-backing_hd  bs-backing_hd-encrypted) {
+pstrcpy(dest, sz, bs-backing_file);
+return dest;
+} else if (bs-encrypted) {
+pstrcpy(dest, sz, bs-filename);
+return dest;
+} else {
+dest[0] = '\0';
 return NULL;
+}
 }
 
 void bdrv_get_backing_filename(BlockDriverState *bs,
diff --git a/include/block/block.h b/include/block/block.h
index 37916f7..a78e4f1 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -430,7 +430,7 @@ void bdrv_round_to_clusters(BlockDriverState *bs,
 int64_t *cluster_sector_num,
 int *cluster_nb_sectors);
 
-const char *bdrv_get_encrypted_filename(BlockDriverState *bs);
+char *bdrv_get_encrypted_filename(BlockDriverState *bs, char *dest, size_t sz);
 void bdrv_get_backing_filename(BlockDriverState *bs,
char *filename, int filename_size);
 void bdrv_get_full_backing_filename(BlockDriverState *bs,
diff --git a/monitor.c b/monitor.c
index aeea2b5..a7d1650 100644
--- a/monitor.c
+++ b/monitor.c
@@ -5292,10 +5292,14 @@ int monitor_read_bdrv_key_start(Monitor *mon, 
BlockDriverState *bs,
 BlockCompletionFunc *completion_cb,
 void *opaque)
 {
+char *enc_filename;
 int err;
 
+enc_filename = g_malloc(PATH_MAX);
+bdrv_get_encrypted_filename(bs, enc_filename, PATH_MAX);
 monitor_printf(mon, %s (%s) is encrypted.\n, bdrv_get_device_name(bs),
-   bdrv_get_encrypted_filename(bs));
+   enc_filename);
+g_free(enc_filename);
 
 mon-password_completion_cb = completion_cb;
 mon-password_opaque = opaque;
-- 
2.4.6

[Qemu-devel] [PATCH v3 4/5] block: Drop BlockDriverState.filename

2015-08-12 Thread Max Reitz

That field is now only used during initialization of BlockDriverStates
(opening images) and for error or warning messages. Performance is not
that much of an issue here, so we can drop the field and replace its use
by a call to bdrv_filename() or bdrv_filename_alloc(). By doing so we
can ensure the result always to be recent, whereas the contents of
BlockDriverState.filename may have been obsoleted by manipulations of
single BlockDriverStates or of the BDS graph.

The users of the BDS filename field were changed as follows:
- copying the filename into another buffer is trivially replaced by
  using bdrv_filename() instead of the copy function
- strdup() on the filename is replaced by a call to
  bdrv_filename_alloc()
- bdrv_filename(bs, bs-filename, sizeof(bs-filename)) is replaced by
  bdrv_refresh_filename(bs)
  (these were introduced by the previous patch)
- anywhere else bdrv_filename_alloc() is used, any access to
  BlockDriverState.filename is then replaced by the buffer returned, and
  it is freed when it is no longer needed

Signed-off-by: Max Reitz mre...@redhat.com
---
 block.c   | 47 ++-
 block/blkverify.c |  3 +--
 block/commit.c|  4 +++-
 block/mirror.c| 16 
 block/qapi.c  |  4 ++--
 block/quorum.c|  2 +-
 block/raw-posix.c | 12 +---
 block/raw_bsd.c   |  4 +++-
 block/vhdx-log.c  |  5 -
 block/vmdk.c  | 22 --
 block/vpc.c   |  7 +--
 blockdev.c| 22 +-
 include/block/block_int.h |  1 -
 13 files changed, 103 insertions(+), 46 deletions(-)

diff --git a/block.c b/block.c
index d2a3e8d..c53d1d6 100644
--- a/block.c
+++ b/block.c
@@ -227,10 +227,10 @@ void bdrv_get_full_backing_filename_from_filename(const 
char *backed,
 void bdrv_get_full_backing_filename(BlockDriverState *bs, char *dest, size_t 
sz,
 Error **errp)
 {
-char *backed = bs-exact_filename[0] ? bs-exact_filename : bs-filename;
-
+char *backed = bdrv_filename_alloc(bs);
 bdrv_get_full_backing_filename_from_filename(backed, bs-backing_file,
  dest, sz, errp);
+g_free(backed);
 }
 
 void bdrv_register(BlockDriver *bdrv)
@@ -821,6 +821,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 QDict *options, int flags, BlockDriver *drv, Error **errp)
 {
 int ret, open_flags;
+char *filename_buffer = NULL;
 const char *filename;
 const char *node_name = NULL;
 QemuOpts *opts;
@@ -831,7 +832,8 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 assert(options != NULL  bs-options != options);
 
 if (file != NULL) {
-filename = file-filename;
+filename_buffer = bdrv_filename_alloc(file);
+filename = filename_buffer;
 } else {
 filename = qdict_get_try_str(options, filename);
 }
@@ -839,7 +841,8 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 if (drv-bdrv_needs_filename  !filename) {
 error_setg(errp, The '%s' block driver requires a file name,
drv-format_name);
-return -EINVAL;
+ret = -EINVAL;
+goto fail_filename;
 }
 
 trace_bdrv_open_common(bs, filename ?: , flags, drv-format_name);
@@ -888,11 +891,10 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 }
 
 if (filename != NULL) {
-pstrcpy(bs-filename, sizeof(bs-filename), filename);
+pstrcpy(bs-exact_filename, sizeof(bs-exact_filename), filename);
 } else {
-bs-filename[0] = '\0';
+bs-exact_filename[0] = '\0';
 }
-pstrcpy(bs-exact_filename, sizeof(bs-exact_filename), bs-filename);
 
 bs-drv = drv;
 bs-opaque = g_malloc0(drv-instance_size);
@@ -952,6 +954,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 assert((bs-request_alignment != 0) || bdrv_is_sg(bs));
 
 qemu_opts_del(opts);
+g_free(filename_buffer);
 return 0;
 
 free_and_fail:
@@ -961,6 +964,8 @@ free_and_fail:
 bs-drv = NULL;
 fail_opts:
 qemu_opts_del(opts);
+fail_filename:
+g_free(filename_buffer);
 return ret;
 }
 
@@ -1158,7 +1163,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd)
 }
 bs-backing_child = bdrv_attach_child(bs, backing_hd, child_backing);
 bs-open_flags = ~BDRV_O_NO_BACKING;
-pstrcpy(bs-backing_file, sizeof(bs-backing_file), backing_hd-filename);
+bdrv_filename(backing_hd, bs-backing_file, sizeof(bs-backing_file));
 pstrcpy(bs-backing_format, sizeof(bs-backing_format),
 backing_hd-drv ? backing_hd-drv-format_name : );
 
@@ -1559,7 +1564,7 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
const char *filename,
 }
 }
 
-

[Qemu-devel] [PATCH v3 3/5] block: Add bdrv_filename()

2015-08-12 Thread Max Reitz

Split the part which actually refreshes the BlockDriverState.filename
field off of bdrv_refresh_filename() into a more generic function
bdrv_filename(), which first calls bdrv_refresh_filename() and then
stores a qemu-usable filename into the given buffer instead of
BlockDriverState.filename.

Since bdrv_refresh_filename() therefore no longer refreshes that field,
some calls to that function have to be replaced by calls to
bdrv_filename() manually refreshing the BDS filename field (this is
only temporary).

Additionally, a wrapper function bdrv_filename_alloc() is added which
allocates a buffer of size PATH_MAX, call bdrv_filename() on that buffer
and returns it, since needing a temporary buffer for the filename is a
rather common pattern.

Signed-off-by: Max Reitz mre...@redhat.com
---
 block.c   | 39 ---
 block/blkverify.c |  3 ++-
 block/quorum.c|  2 +-
 include/block/block.h |  2 ++
 4 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index 1572785..d2a3e8d 100644
--- a/block.c
+++ b/block.c
@@ -1559,7 +1559,7 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
const char *filename,
 }
 }
 
-bdrv_refresh_filename(bs);
+bdrv_filename(bs, bs-filename, sizeof(bs-filename));
 
 /* For snapshot=on, create a temporary qcow2 overlay. bs points to the
  * temporary snapshot afterwards. */
@@ -4153,9 +4153,6 @@ static bool append_open_options(QDict *d, 
BlockDriverState *bs)
  *  - full_open_options: Options which, when given when opening a block device
  *   (without a filename), result in a BDS (mostly)
  *   equalling the given one
- *  - filename: If exact_filename is set, it is copied here. Otherwise,
- *  full_open_options is converted to a JSON object, prefixed with
- *  json: (for use through the JSON pseudo protocol) and put 
here.
  */
 void bdrv_refresh_filename(BlockDriverState *bs)
 {
@@ -4242,15 +4239,43 @@ void bdrv_refresh_filename(BlockDriverState *bs)
 
 bs-full_open_options = opts;
 }
+}
+
+/* First refreshes exact_filename and full_open_options by calling
+ * bdrv_refresh_filename(). Then, if exact_filename is set, it is copied into
+ * the target buffer. Otherwise, full_open_options is converted to a JSON
+ * object, prefixed with json: (for use through the JSON pseudo protocol) and
+ * put there.
+ *
+ * If sz  0, the string put into the buffer will always be null-terminated.
+ *
+ * Returns @dest.
+ */
+char *bdrv_filename(BlockDriverState *bs, char *dest, size_t sz)
+{
+bdrv_refresh_filename(bs);
+
+if (sz  INT_MAX) {
+sz = INT_MAX;
+}
 
 if (bs-exact_filename[0]) {
-pstrcpy(bs-filename, sizeof(bs-filename), bs-exact_filename);
+pstrcpy(dest, sz, bs-exact_filename);
 } else if (bs-full_open_options) {
 QString *json = qobject_to_json(QOBJECT(bs-full_open_options));
-snprintf(bs-filename, sizeof(bs-filename), json:%s,
- qstring_get_str(json));
+snprintf(dest, sz, json:%s, qstring_get_str(json));
 QDECREF(json);
 }
+
+return dest;
+}
+
+/* Wrapper around bdrv_filename() using g_malloc(PATH_MAX) for the destination
+ * buffer.
+ */
+char *bdrv_filename_alloc(BlockDriverState *bs)
+{
+return bdrv_filename(bs, g_malloc(PATH_MAX), PATH_MAX);
 }
 
 /* This accessor function purpose is to allow the device models to access the
diff --git a/block/blkverify.c b/block/blkverify.c
index d277e63..cb9ce02 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -309,7 +309,8 @@ static void blkverify_refresh_filename(BlockDriverState *bs)
 BDRVBlkverifyState *s = bs-opaque;
 
 /* bs-file has already been refreshed */
-bdrv_refresh_filename(s-test_file);
+bdrv_filename(s-test_file, s-test_file-filename,
+  sizeof(s-test_file-filename));
 
 if (bs-file-full_open_options  s-test_file-full_open_options) {
 QDict *opts = qdict_new();
diff --git a/block/quorum.c b/block/quorum.c
index 2f6c45f..00d1fb0 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -1003,7 +1003,7 @@ static void quorum_refresh_filename(BlockDriverState *bs)
 int i;
 
 for (i = 0; i  s-num_children; i++) {
-bdrv_refresh_filename(s-bs[i]);
+bdrv_filename(s-bs[i], s-bs[i]-filename, 
sizeof(s-bs[i]-filename));
 if (!s-bs[i]-full_open_options) {
 return;
 }
diff --git a/include/block/block.h b/include/block/block.h
index a78e4f1..4be6c18 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -267,6 +267,8 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState 
*bs,
 const char *backing_file);
 int bdrv_get_backing_file_depth(BlockDriverState *bs);
 void bdrv_refresh_filename(BlockDriverState *bs);
+char *bdrv_filename(BlockDriverState *bs, char *dest, size_t sz);
+char *bdrv_filename_alloc(BlockDriverState *bs);
 int

Re: [Qemu-devel] [PATCH 08/10] tcg: add memory barriers in page_find_alloc accesses

2015-08-12 Thread Emilio G. Cota

On Wed, Aug 12, 2015 at 18:41:00 +0200, Paolo Bonzini wrote:
 page_find is reading the radix tree outside all locks, so it has to
 use the RCU primitives.  It does not need RCU critical sections
 because the PageDescs are never removed, so there is never a need
 to wait for the end of code sections that use a PageDesc.

Note that rcu_find_alloc might end up writing to the tree, see below.

BTW the fact that there are no removals makes the use of RCU unnecessary.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  translate-all.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)
 
 diff --git a/translate-all.c b/translate-all.c
 index 7727091..78a787d 100644
 --- a/translate-all.c
 +++ b/translate-all.c
 @@ -437,14 +437,14 @@ static PageDesc *page_find_alloc(tb_page_addr_t index, 
 int alloc)
  
  /* Level 2..N-1.  */
  for (i = V_L1_SHIFT / V_L2_BITS - 1; i  0; i--) {
 -void **p = *lp;
 +void **p = atomic_rcu_read(lp);
  
  if (p == NULL) {
  if (!alloc) {
  return NULL;
  }
  p = g_new0(void *, V_L2_SIZE);
 -*lp = p;
 +atomic_rcu_set(lp, p);
  }
  
  lp = p + ((index  (i * V_L2_BITS))  (V_L2_SIZE - 1));
 @@ -456,7 +456,7 @@ static PageDesc *page_find_alloc(tb_page_addr_t index, 
 int alloc)
  return NULL;
  }
  pd = g_new0(PageDesc, V_L2_SIZE);
 -*lp = pd;
 +atomic_rcu_set(lp, pd);

rcu_set is not enough; a cmpxchg with a fail path would be needed here, since
if the find_alloc is called without any locks held (as described in the commit 
message)
several threads could concurrently write to the same node, corrupting the tree.

I argue however that it is better to call page_find/_alloc with a mutex held,
since otherwise we'd have to add per-PageDesc locks (it's very common to
call page_find and then update the PageDesc). I have an RFC patchset for 
multithreaded tcg
that deals with this, will submit once I bring it up to date with upstream.

Emilio

Re: [Qemu-devel] [RFC 1/3] monitor: split MonitorQAPIEventState

2015-08-12 Thread Eric Blake

On 08/12/2015 02:00 PM, Laszlo Ersek wrote:
 Assume there has been a long period of silence (no attempts to emit an
 event). Now some client code makes a call to emit the event.
 
 Will that event be emitted immediately, or will it be delayed to see if
 more are coming? I'd like to understand this aspect first.
 
 I think the first instance of the event, after the grace period, should
 be emitted immediately, and further instances that quickly follow should
 be suppressed.

That has always been the goal of event throttling: when a new event
arrives and the timer is not running, emit it right away and start the
timer. If another event arrives while the timer has not expired, save
it.  If multiple events arrive during the timer, the last one received
overwrites any others.  Then, when the timer expires, either there is a
saved event (one or more events arrived during the throttling window),
so we send it and restart another throttling window, or there was no
event pending so the line is quiet and the next event can send right
away.  Thus, management will get notification as soon as possible that
events are starting to happen after a period of quiet, but will then not
be spammed any faster than once per throttle period (once per second)
with additional events.  When additional events are sent, they are
always the most accurate state of the system at the time the event was
queued, but intermediate events may have been lost.  Furthermore, unless
the events have not been happening, the throttling of the event means
the management might not see the event until nearly a second late,
although the timestamp associated with the event should still be
accurate to the time it was queued up, not finally sent.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH for-2.5 08/30] m68k: update CPU flags management

2015-08-12 Thread Laurent Vivier



Le 12/08/2015 07:12, Richard Henderson a écrit :
 On 08/09/2015 01:13 PM, Laurent Vivier wrote:
 @@ -798,9 +796,9 @@ void HELPER(mac_set_flags)(CPUM68KState *env,
 uint32_t acc)
   }
   }

 -void HELPER(flush_flags)(CPUM68KState *env, uint32_t cc_op)
 +uint32_t HELPER(flush_flags)(CPUM68KState *env, uint32_t op)
   {
 -cpu_m68k_flush_flags(env, cc_op);
 +return cpu_m68k_flush_flags(env, op);
   }
 
 Since this function no longer modifies ENV, it probably deserves a
 better name than flush_flags.  FWIW cc_compute_all isn't a bad name,
 if you're copying i386 anyway...
 
 -DEF_HELPER_2(flush_flags, void, env, i32)
 +DEF_HELPER_2(flush_flags, i32, env, i32)
 
 Modify to use DEF_HELPER_FLAGS while you're at it.  At the moment it
 reads some globals, but doesn't write any, or have any other side effects.

It writes env-cc_x, so I guess I can't use DEF_HELPER_FLAGS ?

 
   static inline void gen_flush_flags(DisasContext *s)
   {
   if (s-cc_op == CC_OP_FLAGS)
   return;
 -gen_flush_cc_op(s);
 -gen_helper_flush_flags(cpu_env, QREG_CC_OP);
 -s-cc_op = CC_OP_FLAGS;
 +if (s-cc_op == CC_OP_DYNAMIC) {
 +gen_helper_flush_flags(QREG_CC_DEST, cpu_env, QREG_CC_OP);
 +} else {
 +gen_helper_flush_flags(QREG_CC_DEST, cpu_env,
 tcg_const_i32(s-cc_op));
 +}
 
 That const needs to be freed.

perhaps I'm wrong, what I had understood is:

tcg_const_i32() creates a tcg_temp_new_i32(), and tcg_temp_new_i32() are
automatically freed at end of tcg block (whereas tcg_const_local adn
tcg_temp_local are not).

 @@ -1248,7 +1294,6 @@ DISAS_INSN(bitop_im)
   DEST_EA(env, insn, opsize, tmp, addr);
   }
   }
 -
   DISAS_INSN(arith_im)
   {
   int op;
 
 Careful with the errant whitespace changes.
 
 @@ -1706,16 +1745,18 @@ DISAS_INSN(branch)
   /* bsr */
   gen_push(s, tcg_const_i32(s-pc));
   }
 -gen_flush_cc_op(s);
   if (op  1) {
   /* Bcc */
   l1 = gen_new_label();
   gen_jmpcc(s, ((insn  8)  0xf) ^ 1, l1);
 +update_cc_op(s);
   gen_jmp_tb(s, 1, base + offset);
   gen_set_label(l1);
 +update_cc_op(s);
   gen_jmp_tb(s, 0, s-pc);
 
 Ideally you'd do this only once, before the jmpcc.
 
 
 r~

Re: [Qemu-devel] [PATCH for-2.5 30/30] m68k: add bitfield instructions

2015-08-12 Thread Richard Henderson

On 08/09/2015 01:13 PM, Laurent Vivier wrote:
  uint32_t HELPER(rol32)(uint32_t val, uint32_t shift)
  {
  uint32_t result;
 @@ -1227,6 +1241,53 @@ void HELPER(set_mac_extu)(CPUM68KState *env, uint32_t 
 val, uint32_t acc)
  env-macc[acc + 1] = res;
  }
  
 +/* load from a bitfield */
 +
 +uint64_t HELPER(bitfield_load)(uint32_t addr, uint32_t offset, uint32_t 
 width)
 +{
 +uint8_t data[8];
 +uint64_t bitfield;
 +int size;
 +int i;
 +
 +size = (offset + width + 7)  3;
 +#if defined(CONFIG_USER_ONLY)
 +cpu_memory_rw_debug(NULL, (target_ulong)addr, data, size, 0);
 +#else
 +cpu_physical_memory_rw(addr, data, size, 0);
 +#endif

Err, this bypasses virtual memory.  Definitely not correct.

You'll need to use cpu_ld*_data instead.

 +DISAS_INSN(bitfield_reg)
 +{
 +uint16_t ext;
 +TCGv tmp;
 +TCGv tmp1;
 +TCGv reg;
 +TCGv offset;
 +TCGv width;
 +int op;
 +TCGv reg2;
 +TCGv mask;
 +
 +reg = DREG(insn, 0);
 +op = (insn  8)  7;
 +ext = read_im16(env, s);
 +
 +bitfield_param(ext, offset, width, mask);
 +
 +if (ext  0x0800) {
 +tcg_gen_andi_i32(offset, offset, 31);
 +}
 +gen_helper_ror32(mask, mask, offset);

Ah, the curious unused helpers.  Anyway, tcg_gen_rotr_i32.

Anyway, there's so much redundant computation in here let's name some
sub-expressions, and give them their own temporaries.  Let the tcg optimizer do
its job if they turn out to be unused for a specific case.

  mask_msb = mask before the rotate (at msb).
  mask_inp = mask after the rotate (in place).

 +tmp = tcg_temp_new_i32();
 +tcg_gen_and_i32(tmp, reg, mask);

  oldf_inp = tmp (old field in place).

 +
 +tmp1 = tcg_temp_new_i32();
 +gen_helper_rol32(tmp1, tmp, offset);

  oldf_msb = tmp1 (old field at msb).

 +tcg_gen_sub_i32(tmp2, tcg_const_i32(32), width);

  comp_width = tmp2 (compliment of width).

Use tcg_gen_subfi so you won't leak the const.
Compute this once for all of the sub-cases.

 +switch (op) {
 +case 0: /* bftst */
 +break;
 +case 1: /* bfextu */
 +tcg_gen_add_i32(tmp1, offset, width);
 +tcg_gen_andi_i32(tmp1, tmp1, 31);
 +gen_helper_rol32(reg2, tmp, tmp1);

  tcg_gen_shr_i32(reg2, oldf_msb, comp_width);

 +break;
 +case 2: /* bfchg */
 +tcg_gen_xor_i32(reg, reg, mask);
 +break;
 +case 3: /* bfexts */
 +gen_helper_rol32(reg2, tmp, offset);
 +tcg_gen_sub_i32(width, tcg_const_i32(32), width);
 +tcg_gen_sar_i32(reg2, reg2, width);

  tcg_gen_sar_i32(reg2, oldf_msb, comp_width);

 +case 4: /* bfclr */
 +tcg_gen_not_i32(mask, mask);
 +tcg_gen_and_i32(reg, reg, mask);

  tcg_gen_andc_i32(reg, reg, mask_inp);

 +break;
 +case 5: /* bfffo */
 +gen_helper_rol32(reg2, tmp, offset);
 +gen_helper_bfffo(tmp, tmp, width);
 +tcg_gen_add_i32(reg2, tmp, offset);

There's a typo here if you look close.  That said,

  tcg_gen_orc_i32(tmp, oldf_msb, mask_msb);
  gen_helper_clo32(tmp, tmp);
  tcg_gen_add_i32(reg2, tmp, offset);

with

uint32_t helper_clo32(uint32_t x)
{
  return clo32(x);
}

The first opcode sets all bits outside the field, so that (if the field is
smaller than 32 bits) we're guaranteed to find a one.  At which point there's
really very little that needs doing in the helper.

 +case 6: /* bfset */
 +tcg_gen_or_i32(reg, reg, mask);
 +break;

  tcg_gen_or_i32(reg, reg, mask_inp);

 +case 7: /* bfins */
 +tcg_gen_shl_i32(tmp1, tcg_const_i32(1), width);

Undefined if width == 32.  That said...

 +tcg_gen_subi_i32(tmp1, tmp1, 1);
 +tcg_gen_and_i32(tmp, reg2, tmp1);
 +tcg_gen_add_i32(tmp1, offset, width);
 +tcg_gen_andi_i32(tmp1, tmp1, 31);
 +gen_helper_ror32(tmp, tmp, tmp1);
 +tcg_gen_not_i32(mask, mask);
 +tcg_gen_and_i32(reg, reg, mask);
 +tcg_gen_or_i32(reg, reg, tmp);
 +break;

  /* Rotate the source so that the field is at msb.  */
  tcg_gen_rotr_i32(tmp, reg2, width);

  /* Isolate the field and set flags.  */
  tcg_gen_and_i32(tmp, tmp, mask_msb);
  gen_logic_cc(s, tmp, OS_LONG);

  /* Rotate the field into position.  */
  tcg_gen_rotr_i32(tmp, tmp, offset);

  /* Merge field into destination.  */
  tcg_gen_andc_i32(reg, reg, mask_inp);
  tcg_gen_or_i32(reg, reg, tmp);
  return;

Handle the non-bfins after the switch with

  gen_logic_cc(s, oldf_msb, OS_LONG);

 +DISAS_INSN(bitfield_mem)
 +{
 +uint16_t ext;
 +int op;
 +TCGv_i64 bitfield;
 +TCGv_i64 mask_bitfield;
 +TCGv mask_cc;
 +TCGv shift;
 +TCGv val;
 +TCGv src;
 +TCGv offset;
 +TCGv width;
 +TCGv reg;
 +TCGv tmp;
 +
 +op = (insn  8)  7;
 +ext = read_im16(env, s);
 +src = gen_lea(env, s, insn, OS_LONG);
 +if (IS_NULL_QREG(src)) {
 +gen_addr_fault(s);
 +return;
 +}
 +
 +bitfield_param(ext, offset, width,

Re: [Qemu-devel] [RFC 1/3] monitor: split MonitorQAPIEventState

2015-08-12 Thread Marc-André Lureau

Hi

- Original Message -
 On 08/12/15 21:46, marcandre.lur...@redhat.com wrote:
  From: Marc-André Lureau marcandre.lur...@redhat.com
  
  Create a seperate pending event structure MonitorQAPIEventPending.
  Use a MonitorQAPIEventDelay callback to handle the delaying. This
  allows other implementations of throttling.
  
  Signed-off-by: Marc-André Lureau marcandre.lur...@redhat.com
  ---
   monitor.c| 124
   +--
   trace-events |   2 +-
   2 files changed, 79 insertions(+), 47 deletions(-)
 
 Assume there has been a long period of silence (no attempts to emit an
 event). Now some client code makes a call to emit the event.
 
 Will that event be emitted immediately, or will it be delayed to see if
 more are coming? I'd like to understand this aspect first.
 
 I think the first instance of the event, after the grace period, should
 be emitted immediately, and further instances that quickly follow should
 be suppressed.

This is what qemu already does. The first event is sent immediately, the
later ones may be delayed (but there will be at least one event every period,
if the event is flooded). This patch 1/3 doesn't change the logic, only
it split things to make them a bit more modular.

So the rest of the patches do not change the qemu delay logic, it adds a way to
delay based on (event + id) instead of just (event). It does that by
adding an additional id hashtable for the event type. My hope is
that this apporach could be reused if other field or combinations of fields
are necessary for other events, but for now, it's hardcoded for id. 

cheers

Re: [Qemu-devel] [PATCH for-2.5 08/30] m68k: update CPU flags management

2015-08-12 Thread Laurent Vivier



Le 12/08/2015 23:19, Richard Henderson a écrit :
 On 08/12/2015 01:56 PM, Laurent Vivier wrote:
 -DEF_HELPER_2(flush_flags, void, env, i32)
 +DEF_HELPER_2(flush_flags, i32, env, i32)

 Modify to use DEF_HELPER_FLAGS while you're at it.  At the moment it
 reads some globals, but doesn't write any, or have any other side
 effects.

 It writes env-cc_x, so I guess I can't use DEF_HELPER_FLAGS ?
 
 Ah, missed that.  So, no, not usefully.
 
 That const needs to be freed.

 perhaps I'm wrong, what I had understood is:

 tcg_const_i32() creates a tcg_temp_new_i32(), and tcg_temp_new_i32() are
 automatically freed at end of tcg block (whereas tcg_const_local adn
 tcg_temp_local are not).
 
 They are freed at the end of a basic block.  But the total number of
 temps affects the speed of the tcg code generator.  So you can improve
 the speed of qemu by freeing temporaries that are no longer needed.

OK, thank you. I was wondering if it is useful to free temp or not...

Laurent

[Qemu-devel] [PATCH] hw/misc: Add support for ADC controller in Xilinx Zynq 7000

2015-08-12 Thread Guenter Roeck

Add support for the Xilinx XADC core used in Zynq 7000.

References:
- Zynq-7000 All Programmable SoC Technical Reference Manual
- 7 Series FPGAs and Zynq-7000 All Programmable SoC XADC
  Dual 12-Bit 1 MSPS Analog-to-Digital Converter

Tested with Linux using qemu machine xilinx-zynq-a9 with devicetree
files zynq-zc702.dtb and zynq-zc706.dtb, and kernel configuration
multi_v7_defconfig.

Signed-off-by: Guenter Roeck li...@roeck-us.net
---
 hw/arm/xilinx_zynq.c  |   5 +
 hw/misc/Makefile.objs |   1 +
 hw/misc/zynq_xadc.c   | 305 ++
 3 files changed, 311 insertions(+)
 create mode 100644 hw/misc/zynq_xadc.c

diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index a4e7b5c..195e9f1 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -225,6 +225,11 @@ static void zynq_init(MachineState *machine)
 sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, 0xE0101000);
 sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[79-IRQ_OFFSET]);
 
+dev = qdev_create(NULL, xilinx,zynq_xadc);
+qdev_init_nofail(dev);
+sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, 0xF8007100);
+sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[39-IRQ_OFFSET]);
+
 dev = qdev_create(NULL, pl330);
 qdev_prop_set_uint8(dev, num_chnls,  8);
 qdev_prop_set_uint8(dev, num_periph_req,  4);
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 4aa76ff..5f76f05 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -36,6 +36,7 @@ obj-$(CONFIG_OMAP) += omap_sdrc.o
 obj-$(CONFIG_OMAP) += omap_tap.o
 obj-$(CONFIG_SLAVIO) += slavio_misc.o
 obj-$(CONFIG_ZYNQ) += zynq_slcr.o
+obj-$(CONFIG_ZYNQ) += zynq_xadc.o
 obj-$(CONFIG_STM32F2XX_SYSCFG) += stm32f2xx_syscfg.o
 
 obj-$(CONFIG_PVPANIC) += pvpanic.o
diff --git a/hw/misc/zynq_xadc.c b/hw/misc/zynq_xadc.c
new file mode 100644
index 000..480e2bf
--- /dev/null
+++ b/hw/misc/zynq_xadc.c
@@ -0,0 +1,305 @@
+/*
+ * ADC registers for Xilinx Zynq Platform
+ *
+ * Copyright (c) 2015 Guenter Roeck
+ * Based on hw/misc/zynq_slcr.c, written by Michal Simek
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include hw/hw.h
+#include qemu/timer.h
+#include hw/sysbus.h
+#include sysemu/sysemu.h
+
+enum {
+CFG= 0x000 / 4,
+INTSTS,
+INTMSK,
+STATUS,
+CFIFO,
+DFIFO,
+CTL,
+};
+
+#define XADC_ZYNQ_CFG_ENABLEBIT(31)
+#define XADC_ZYNQ_CFG_CFIFOTH_RD(x) (((x)  20)  0x0f)
+#define XADC_ZYNQ_CFG_DFIFOTH_RD(x) (((x)  16)  0x0f)
+#define XADC_ZYNQ_CFG_WEDGE BIT(13)
+#define XADC_ZYNQ_CFG_REDGE BIT(12)
+#define XADC_ZYNQ_CFG_TCKRATE_DIV2  (0x0  8)
+#define XADC_ZYNQ_CFG_TCKRATE_DIV4  (0x1  8)
+#define XADC_ZYNQ_CFG_TCKRATE_DIV8  (0x2  8)
+#define XADC_ZYNQ_CFG_TCKRATE_DIV16 (0x3  8)
+#define XADC_ZYNQ_CFG_IGAP_MASK 0x1f
+#define XADC_ZYNQ_CFG_IGAP(x)   ((x)  XADC_ZYNQ_CFG_IGAP_MASK)
+
+#define XADC_ZYNQ_INT_CFIFO_LTH BIT(9)
+#define XADC_ZYNQ_INT_DFIFO_GTH BIT(8)
+#define XADC_ZYNQ_INT_ALARM_MASK0xff
+#define XADC_ZYNQ_INT_ALARM_OFFSET  0
+
+#define XADC_ZYNQ_STATUS_DFIFO_LVL(x)   (((x)  0x0f)  12)
+#define XADC_ZYNQ_STATUS_CFIFOF BIT(11)
+#define XADC_ZYNQ_STATUS_CFIFOE BIT(10)
+#define XADC_ZYNQ_STATUS_DFIFOF BIT(9)
+#define XADC_ZYNQ_STATUS_DFIFOE BIT(8)
+#define XADC_ZYNQ_STATUS_OT BIT(7)
+#define XADC_ZYNQ_STATUS_ALM(x) BIT(x)
+
+#define XADC_ZYNQ_CTL_RESET BIT(4)
+
+#define XADC_ZYNQ_CMD_NOP   0x00
+#define XADC_ZYNQ_CMD_READ  0x01
+#define XADC_ZYNQ_CMD_WRITE 0x02
+
+#define ZYNQ_XADC_MMIO_SIZE 0x0020
+#define ZYNQ_XADC_NUM_IO_REGS   (ZYNQ_XADC_MMIO_SIZE / 4)
+#define ZYNQ_XADC_NUM_ADC_REGS  128
+#define ZYNQ_XADC_FIFO_DEPTH15
+
+#define TYPE_ZYNQ_XADC  xilinx,zynq_xadc
+#define ZYNQ_XADC(obj) \
+OBJECT_CHECK(ZynqXADCState, (obj), TYPE_ZYNQ_XADC)
+
+typedef struct ZynqXADCState {
+SysBusDevice parent_obj;
+
+MemoryRegion iomem;
+
+uint32_t regs[ZYNQ_XADC_NUM_IO_REGS];   /* I/O registers */
+uint16_t xadc_regs[ZYNQ_XADC_NUM_ADC_REGS]; /* XADC registers */
+uint16_t xadc_read_reg_previous;/* result of most recent read command 
*/
+uint16_t xadc_dfifo[ZYNQ_XADC_FIFO_DEPTH];  /* data fifo */
+uint16_t xadc_dfifo_depth;  /* entries in data fifo */
+
+struct IRQState *irq;
+
+} ZynqXADCState;
+
+static void _zynq_xadc_reset(ZynqXADCState *s)
+{
+int i;
+
+s-regs[CFG] = XADC_ZYNQ_CFG_IGAP(0x14) | XADC_ZYNQ_CFG_TCKRATE_DIV4 |
+XADC_ZYNQ_CFG_REDGE;
+

Re: [Qemu-devel] [PATCH for-2.5 20/30] m68k: add exg

2015-08-12 Thread Laurent Vivier



Le 12/08/2015 19:05, Richard Henderson a écrit :
 On 08/09/2015 01:13 PM, Laurent Vivier wrote:
 Signed-off-by: Laurent Vivier laur...@vivier.eu
 ---
  target-m68k/translate.c | 34 +-
  1 file changed, 33 insertions(+), 1 deletion(-)

 diff --git a/target-m68k/translate.c b/target-m68k/translate.c
 index adf4521..b7d15e9 100644
 --- a/target-m68k/translate.c
 +++ b/target-m68k/translate.c
 @@ -2035,10 +2035,42 @@ DISAS_INSN(and)
  TCGv dest;
  TCGv addr;
  int opsize;
 +int exg_mode;
  
 +dest = tcg_temp_new();
 +
 +/* exg */
 +
 +exg_mode = insn  0x1f8;
 
 Likewise, surely we can decode EXG separately from AND, and avoid doing so for
 coldfire.

I agree for CMPM, not for EXG.

Let's have a look to instructions encoding :)

AND   1100dddooommmrrr

  ddd   data register number
  ooo   opmode,  invalid: 011, 111
  mmmrrrea mode, if ooo = { 000, 001, 010} invalid: 001000 .. 00
 if ooo = { 100, 101, 110} invalid; 00 .. 00

EXG   1100xxx1oyyy

  xxx   register
  o valid: 01000, 01001, 10001
  yyy   register

So, EXG is an AND with

  ooo 101, 110
  mmm 000, 001

which are invalid combinations for AND.

IMHO, EXG looks like a wart on the AND and should be decoded like that...

I don't know how to add this easily in the table... except by adding 3
entries to decode 1 instruction. Is it acceptable ?

Laurent

Re: [Qemu-devel] [PATCH for-2.5 20/30] m68k: add exg

2015-08-12 Thread Richard Henderson


On 08/12/2015 03:43 PM, Laurent Vivier wrote:



Le 12/08/2015 19:05, Richard Henderson a écrit :

On 08/09/2015 01:13 PM, Laurent Vivier wrote:

Signed-off-by: Laurent Vivier laur...@vivier.eu
---
  target-m68k/translate.c | 34 +-
  1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index adf4521..b7d15e9 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2035,10 +2035,42 @@ DISAS_INSN(and)
  TCGv dest;
  TCGv addr;
  int opsize;
+int exg_mode;

+dest = tcg_temp_new();
+
+/* exg */
+
+exg_mode = insn  0x1f8;


Likewise, surely we can decode EXG separately from AND, and avoid doing so for
coldfire.


I agree for CMPM, not for EXG.

Let's have a look to instructions encoding :)

AND   1100dddooommmrrr

   ddd   data register number
   ooo   opmode,  invalid: 011, 111
   mmmrrrea mode, if ooo = { 000, 001, 010} invalid: 001000 .. 00
  if ooo = { 100, 101, 110} invalid; 00 .. 00

EXG   1100xxx1oyyy

   xxx   register
   o valid: 01000, 01001, 10001
   yyy   register

So, EXG is an AND with

   ooo 101, 110
   mmm 000, 001

which are invalid combinations for AND.

IMHO, EXG looks like a wart on the AND and should be decoded like that...


Hmm, perhaps you're right.  On the other hand, maybe we should rename the 
function and_exg, and also properly check for M68000 before accepting exg?



r~

Re: [Qemu-devel] [PATCH for-2.5 20/30] m68k: add exg

2015-08-12 Thread Laurent Vivier



Le 13/08/2015 01:09, Richard Henderson a écrit :
 On 08/12/2015 03:43 PM, Laurent Vivier wrote:


 Le 12/08/2015 19:05, Richard Henderson a écrit :
 On 08/09/2015 01:13 PM, Laurent Vivier wrote:
 Signed-off-by: Laurent Vivier laur...@vivier.eu
 ---
   target-m68k/translate.c | 34 +-
   1 file changed, 33 insertions(+), 1 deletion(-)

 diff --git a/target-m68k/translate.c b/target-m68k/translate.c
 index adf4521..b7d15e9 100644
 --- a/target-m68k/translate.c
 +++ b/target-m68k/translate.c
 @@ -2035,10 +2035,42 @@ DISAS_INSN(and)
   TCGv dest;
   TCGv addr;
   int opsize;
 +int exg_mode;

 +dest = tcg_temp_new();
 +
 +/* exg */
 +
 +exg_mode = insn  0x1f8;

 Likewise, surely we can decode EXG separately from AND, and avoid
 doing so for
 coldfire.

 I agree for CMPM, not for EXG.

 Let's have a look to instructions encoding :)

 AND   1100dddooommmrrr

ddd   data register number
ooo   opmode,  invalid: 011, 111
mmmrrrea mode, if ooo = { 000, 001, 010} invalid: 001000 .. 00
   if ooo = { 100, 101, 110} invalid; 00 .. 00

 EXG   1100xxx1oyyy

xxx   register
o valid: 01000, 01001, 10001
yyy   register

 So, EXG is an AND with

ooo 101, 110
mmm 000, 001

 which are invalid combinations for AND.

 IMHO, EXG looks like a wart on the AND and should be decoded like that...
 
 Hmm, perhaps you're right.  On the other hand, maybe we should rename
 the function and_exg, and also properly check for M68000 before
 accepting exg?

I agree.

Thank you for all your comments.
Laurent

Re: [Qemu-devel] [PATCH for-2.5 27/30] m68k: add addx/subx/negx

2015-08-12 Thread Laurent Vivier



Le 12/08/2015 20:46, Richard Henderson a écrit :
 On 08/09/2015 01:13 PM, Laurent Vivier wrote:
 +return (op1  ~((1UL  bits) - 1)) | res; \
 
 deposit32(op1, res, bits, 0)
 

You mean: deposit32(op1, 0, bits, res) ?

Laurent

Re: [Qemu-devel] [PATCH for-2.5 08/30] m68k: update CPU flags management

2015-08-12 Thread Richard Henderson


On 08/12/2015 01:56 PM, Laurent Vivier wrote:

-DEF_HELPER_2(flush_flags, void, env, i32)
+DEF_HELPER_2(flush_flags, i32, env, i32)


Modify to use DEF_HELPER_FLAGS while you're at it.  At the moment it
reads some globals, but doesn't write any, or have any other side effects.


It writes env-cc_x, so I guess I can't use DEF_HELPER_FLAGS ?


Ah, missed that.  So, no, not usefully.


That const needs to be freed.


perhaps I'm wrong, what I had understood is:

tcg_const_i32() creates a tcg_temp_new_i32(), and tcg_temp_new_i32() are
automatically freed at end of tcg block (whereas tcg_const_local adn
tcg_temp_local are not).


They are freed at the end of a basic block.  But the total number of temps 
affects the speed of the tcg code generator.  So you can improve the speed of 
qemu by freeing temporaries that are no longer needed.



r~

Re: [Qemu-devel] [PATCH v4 1/3] linux-headers: Add eeh.h

2015-08-12 Thread Gavin Shan

On Wed, Aug 12, 2015 at 12:00:35PM +1000, Alexey Kardashevskiy wrote:
On 08/11/2015 07:11 PM, Peter Maydell wrote:
On 10 August 2015 at 08:13, Gavin Shan gws...@linux.vnet.ibm.com wrote:
The header file was introduced by following Linux upstream commits:

 commit ed3e81f (powerpc/eeh: Move PE state constants around)
 commit ec33d36 (powerpc/eeh: Introduce eeh_pe_inject_err())

Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  linux-headers/asm-powerpc/eeh.h | 56 
 +
  1 file changed, 56 insertions(+)
  create mode 100644 linux-headers/asm-powerpc/eeh.h

Shouldn't this be added by updating scripts/update-linux-headers.sh
and then doing a plain synchronize headers against kernel version $X ?


I also thought this is the protocol but then I looked into the git history
and discovered this is not always the case :)



So should I use scripts/update-linux-headers.sh or what I had is fine?

Otherwise you won't get any future changes to this file.


Thanks,
Gavin

Re: [Qemu-devel] [PATCH for-2.5 00/30] 680x0 instructions emulation

2015-08-12 Thread Richard Henderson


On 08/09/2015 01:13 PM, Laurent Vivier wrote:

   m68k: allow to update flags with operation on words and bytes
   m68k: update CPU flags management
   m68k: add X flag helpers


I wonder if we can talk about a different mechanism for tracking flags.

The x86 scheme tracks flags with 3 words: { op, dest, src }.  This is easy to 
handle arithmetic, logicals, and comparisons.  But shifts and other weird 
things require complex helpers.


The m68k scheme, which you're enhancing, is the same, except with an additional 
word for X.  It's this extra X flag that complicates things.


The arm scheme tracks 4 flags with 4 individual words.  This is more 
complicated for arithmetic (especially computing overflow), but it rarely 
requires complicated helpers.


I propose using a hybrid between the two, using 6 words: { op, c, v, z, n, x }.

op = OP_FLAGS (i.e. fully computed state)

  X = cc_x is 0/1
  C = cc_c is 0/1
  V = cc_v  0
  N = cc_n  0
  Z = cc_z == 0

op = OP_ADD

  X = C = cc_x is 0/1

  cc_n = result
  cc_v = src1
  cc_c = unused
  cc_z = unused

  N = cc_n  0
  Z = cc_n == 0
  V = ((result ^ src1)  ~((result - src1) ^ src1))  0

op = OP_SUB

  like OP_ADD, except
  V = ((result ^ src1)  (src1 ^ (src1 - result)))  0

op = OP_LOGIC

  X = cc_x is 0/1
  C = 0
  V = 0

  cc_n = result
  N = cc_n  0
  Z = cc_n == 0

All of the byte and word variants are handled by sign-extending the quantities.

This keeps X up-to-date at all times.  It minimizes the amount to data that 
needs to be synced back to env before memory operations.  It leaves enough data 
to easily compute signed comparisons.


The more complicated cases of addx and shifts, now simply operate in OP_FLAGS 
mode and don't require helpers.


I'll follow up with some patches...


r~

Re: [Qemu-devel] [PATCH] linux-user: elfload: Still use TARGET_PAGE_SIZE for i386 guest

2015-08-12 Thread Alexander Graf



 Am 13.08.2015 um 04:45 schrieb gchen gchen xili_gchen_5...@hotmail.com:
 
 On 2015年08月12日 23:06, Richard Henderson wrote:
 On 08/12/2015 12:59 AM, gchen gchen wrote:
 Nack.  There's 99 problems with host page size guest page size.  This
 solves none of them, and in the hackiest way possible.
 
 Under alpha virtual machine, if set i386 guest page size 8KB, it will
 cause failure directly (any dynamically linked binaries can not work).
 
 Yes, I know.  The same thing happens when running i386 guests on other
 (non-virtual) hosts.  E.g. Sparc64's 8kB page, PowerPC64's 64kB page.
 
 Yes. The reason why I am only focus on Alpha is the machine which I am
 working for is almost the same as Alpha. But this machine is very slow,
 its performance maybe like 10 years ago's x86_64 laptop.

What does Almost the same as Alpha mean? Does it use 8k or 4k page size?

 
 Do you have any other ideas for solving this issue?
 
 The only complete solution that I see is to use softmmu with linux-user, so
 that we properly emulate the guest pages.  Yes, it will cause quite some
 slow-down in emulation, but I believe it's the only reliable way.
 
 I have tried softmmu, for me, the performance is not acceptable, we can
 not use this way.

Softmmu but without system emulation. Performance should be about half of 
today's linux-user.


Alex

Re: [Qemu-devel] [PATCH v6 0/2] vhost user: Add live migration

2015-08-12 Thread Ouyang, Changchun

 -Original Message-
 From: Marc-André Lureau [mailto:marcandre.lur...@gmail.com]
 Sent: Wednesday, August 12, 2015 6:07 PM
 To: Michael S. Tsirkin
 Cc: Thibaut Collet; QEMU; stefa...@redhat.com; Jason Wang; Paolo Bonzini;
 Linhaifeng; Ouyang, Changchun
 Subject: Re: [PATCH v6 0/2] vhost user: Add live migration

 Hi

 On Wed, Aug 12, 2015 at 9:25 AM, Michael S. Tsirkin m...@redhat.com
 wrote:
  I think these patches need to be rebased on top of Marc Andre's ones,
  and use protocol flags to negotiate capabilities.
  Right?

 Correct. His patches should be applied before my migration tests, though.

Good point, do we have plan how to make these patches applied serially?
M.s.t's protocol feature patch,  marc andre's patch, Thibaut's patch for adding 
live migration, and my patch for vhost-user multi queue. 

 --
 Marc-André Lureau

Re: [Qemu-devel] Qemu-devel Digest, Vol 149, Issue 266

2015-08-12 Thread Ouyang, Changchun

 -Original Message-
 Date: Wed, 12 Aug 2015 14:15:54 +0300
 From: Michael S. Tsirkin m...@redhat.com
 To: Marcel Apfelbaum mar...@redhat.com
 Cc: qemu-devel@nongnu.org
 Subject: Re: [Qemu-devel] [PATCH] virtio/vhost: drop unnecessary
   VHOST_SET_VRING call
 Message-ID: 20150812141448-mutt-send-email-...@redhat.com
 Content-Type: text/plain; charset=us-ascii

 On Wed, Aug 12, 2015 at 02:10:56PM +0300, Marcel Apfelbaum wrote:
  On 08/12/2015 01:34 PM, Michael S. Tsirkin wrote:
  On Wed, Aug 12, 2015 at 01:19:51PM +0300, Marcel Apfelbaum wrote:
  No need to send VHOST_SET_VRING_CALL to backend before the
  negotiation with the guest is finished.

  Signed-off-by: Marcel Apfelbaum mar...@redhat.com

  Well - we do need to set it to the masked notifier initially to avoid
  losing events.  You can't just drop it - need to move this call
  somewhere else.

Agree with m.s.t.
We could not drop it.
Vhost-user multi queue also need this.

  What do we need to set?
  I just dropped the call to VHOST_SET_VRING_CALL.

  Thanks,
  Marcel

 We use two eventfds: masked and unmasked one.
 We switch dynamically dependent on msi mask value.
 Code assumes we start out masked, so we need to match that.

  ---
hw/virtio/vhost.c | 13 +
1 file changed, 1 insertion(+), 12 deletions(-)

  diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index
  2712c6f..b448542 100644
  --- a/hw/virtio/vhost.c
  +++ b/hw/virtio/vhost.c
  @@ -875,24 +875,13 @@ static void
 vhost_eventfd_del(MemoryListener
  *listener,
static int vhost_virtqueue_init(struct vhost_dev *dev,
struct vhost_virtqueue *vq, int n)
{
  -struct vhost_vring_file file = {
  -.index = n,
  -};
int r = event_notifier_init(vq-masked_notifier, 0);
  +
if (r  0) {
return r;
}

  -file.fd = event_notifier_get_fd(vq-masked_notifier);
  -r = dev-vhost_ops-vhost_call(dev, VHOST_SET_VRING_CALL, file);
  -if (r) {
  -r = -errno;
  -goto fail_call;
  -}
return 0;
  -fail_call:
  -event_notifier_cleanup(vq-masked_notifier);
  -return r;
}

static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
  --
  2.1.0

[Qemu-devel] [v2 1/4] migration: do cleanup operation after completion

2015-08-12 Thread Liang Li

Because of the patch 3ea3b7fa9af067982f34b of kvm, now the migration_end()
is a time consuming operation, which takes about dozens of milliseconds, and
will prolong VM downtime. Such an operation should be done after migration
completion.

For a VM with 8G RAM, this patch can reduce the VM downtime about 32 ms during
live migration.

Signed-off-by: Liang Li liang.z...@intel.com
Reviewed-by: Paolo Bonzini pbonz...@redhat.com
---
 migration/block.c |  1 -
 migration/migration.c | 14 +++---
 migration/ram.c   |  1 -
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/migration/block.c b/migration/block.c
index ed865ed..85496fd 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -750,7 +750,6 @@ static int block_save_complete(QEMUFile *f, void *opaque)
 
 qemu_put_be64(f, BLK_MIG_FLAG_EOS);
 
-blk_mig_cleanup();
 return 0;
 }
 
diff --git a/migration/migration.c b/migration/migration.c
index 662e77e..4ddb9ad 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -560,12 +560,9 @@ static void migrate_fd_cleanup(void *opaque)
 
 assert(s-state != MIGRATION_STATUS_ACTIVE);
 
-if (s-state != MIGRATION_STATUS_COMPLETED) {
-qemu_savevm_state_cancel();
-if (s-state == MIGRATION_STATUS_CANCELLING) {
-migrate_set_state(s, MIGRATION_STATUS_CANCELLING,
-  MIGRATION_STATUS_CANCELLED);
-}
+if (s-state == MIGRATION_STATUS_CANCELLING) {
+migrate_set_state(s, MIGRATION_STATUS_CANCELLING,
+  MIGRATION_STATUS_CANCELLED);
 }
 
 notifier_list_notify(migration_state_notifiers, s);
@@ -923,6 +920,7 @@ static void *migration_thread(void *opaque)
 int64_t initial_bytes = 0;
 int64_t max_size = 0;
 int64_t start_time = initial_time;
+int64_t end_time;
 bool old_vm_running = false;
 
 rcu_register_thread();
@@ -1007,9 +1005,11 @@ static void *migration_thread(void *opaque)
 }
 }
 
+end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+
 qemu_mutex_lock_iothread();
+qemu_savevm_state_cancel();
 if (s-state == MIGRATION_STATUS_COMPLETED) {
-int64_t end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 uint64_t transferred_bytes = qemu_ftell(s-file);
 s-total_time = end_time - s-total_time;
 s-downtime = end_time - start_time;
diff --git a/migration/ram.c b/migration/ram.c
index 7f007e6..6249f6e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1269,7 +1269,6 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 
 rcu_read_unlock();
 
-migration_end();
 qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
 
 return 0;
-- 
1.9.1

[Qemu-devel] [v2 3/4] migration: rename cancel to cleanup in SaveVMHandles

2015-08-12 Thread Liang Li

'cleanup' seems more appropriate than 'cancel'.

Signed-off-by: Liang Li liang.z...@intel.com
---
 include/migration/vmstate.h | 2 +-
 migration/block.c   | 2 +-
 migration/ram.c | 2 +-
 migration/savevm.c  | 4 ++--
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 2e5a97d..24b939c 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -39,7 +39,7 @@ typedef struct SaveVMHandlers {
 void (*set_params)(const MigrationParams *params, void * opaque);
 SaveStateHandler *save_state;
 
-void (*cancel)(void *opaque);
+void (*cleanup)(void *opaque);
 int (*save_live_complete)(QEMUFile *f, void *opaque);
 
 /* This runs both outside and inside the iothread lock.  */
diff --git a/migration/block.c b/migration/block.c
index 85496fd..55442bd 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -879,7 +879,7 @@ static SaveVMHandlers savevm_block_handlers = {
 .save_live_complete = block_save_complete,
 .save_live_pending = block_save_pending,
 .load_state = block_load,
-.cancel = block_migration_cancel,
+.cleanup = block_migration_cancel,
 .is_active = block_is_active,
 };
 
diff --git a/migration/ram.c b/migration/ram.c
index 6249f6e..e7f711e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1610,7 +1610,7 @@ static SaveVMHandlers savevm_ram_handlers = {
 .save_live_complete = ram_save_complete,
 .save_live_pending = ram_save_pending,
 .load_state = ram_load,
-.cancel = ram_migration_cancel,
+.cleanup = ram_migration_cancel,
 };
 
 void ram_mig_init(void)
diff --git a/migration/savevm.c b/migration/savevm.c
index b141b17..96d0aab 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -911,8 +911,8 @@ void qemu_savevm_state_cleanup(void)
 
 trace_savevm_state_cleanup();
 QTAILQ_FOREACH(se, savevm_state.handlers, entry) {
-if (se-ops  se-ops-cancel) {
-se-ops-cancel(se-opaque);
+if (se-ops  se-ops-cleanup) {
+se-ops-cleanup(se-opaque);
 }
 }
 }
-- 
1.9.1

[Qemu-devel] [v2 2/4] migration: rename qemu_savevm_state_cancel

2015-08-12 Thread Liang Li

The function qemu_savevm_state_cancel is called after the migration
in migration_thread, it seems strange to 'cancel' it after completion,
rename it to qemu_savevm_state_cleanup looks better.

Signed-off-by: Liang Li liang.z...@intel.com
---
 include/sysemu/sysemu.h | 2 +-
 migration/migration.c   | 2 +-
 migration/savevm.c  | 6 +++---
 trace-events| 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 44570d1..9cc0240 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -88,7 +88,7 @@ void qemu_savevm_state_begin(QEMUFile *f,
 void qemu_savevm_state_header(QEMUFile *f);
 int qemu_savevm_state_iterate(QEMUFile *f);
 void qemu_savevm_state_complete(QEMUFile *f);
-void qemu_savevm_state_cancel(void);
+void qemu_savevm_state_cleanup(void);
 uint64_t qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size);
 int qemu_loadvm_state(QEMUFile *f);
 
diff --git a/migration/migration.c b/migration/migration.c
index 4ddb9ad..1ff30fe 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1008,7 +1008,7 @@ static void *migration_thread(void *opaque)
 end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 
 qemu_mutex_lock_iothread();
-qemu_savevm_state_cancel();
+qemu_savevm_state_cleanup();
 if (s-state == MIGRATION_STATUS_COMPLETED) {
 uint64_t transferred_bytes = qemu_ftell(s-file);
 s-total_time = end_time - s-total_time;
diff --git a/migration/savevm.c b/migration/savevm.c
index 6071215..b141b17 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -905,11 +905,11 @@ uint64_t qemu_savevm_state_pending(QEMUFile *f, uint64_t 
max_size)
 return ret;
 }
 
-void qemu_savevm_state_cancel(void)
+void qemu_savevm_state_cleanup(void)
 {
 SaveStateEntry *se;
 
-trace_savevm_state_cancel();
+trace_savevm_state_cleanup();
 QTAILQ_FOREACH(se, savevm_state.handlers, entry) {
 if (se-ops  se-ops-cancel) {
 se-ops-cancel(se-opaque);
@@ -946,7 +946,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
 ret = qemu_file_get_error(f);
 }
 if (ret != 0) {
-qemu_savevm_state_cancel();
+qemu_savevm_state_cleanup();
 error_setg_errno(errp, -ret, Error while writing VM state);
 }
 return ret;
diff --git a/trace-events b/trace-events
index 94bf3bb..102373c 100644
--- a/trace-events
+++ b/trace-events
@@ -1195,7 +1195,7 @@ savevm_state_begin(void) 
 savevm_state_header(void) 
 savevm_state_iterate(void) 
 savevm_state_complete(void) 
-savevm_state_cancel(void) 
+savevm_state_cleanup(void) 
 vmstate_save(const char *idstr, const char *vmsd_name) %s, %s
 vmstate_load(const char *idstr, const char *vmsd_name) %s, %s
 qemu_announce_self_iter(const char *mac) %s
-- 
1.9.1

[Qemu-devel] [v2 4/4] migration: code clean up

2015-08-12 Thread Liang Li

Just clean up code, no behavior change.

Signed-off-by: Liang Li liang.z...@intel.com
---
 migration/block.c | 9 ++---
 migration/ram.c   | 9 ++---
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/migration/block.c b/migration/block.c
index 55442bd..869798c 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -591,7 +591,7 @@ static int64_t get_remaining_dirty(void)
 
 /* Called with iothread lock taken.  */
 
-static void blk_mig_cleanup(void)
+static void block_migration_cleanup(void *opaque)
 {
 BlkMigDevState *bmds;
 BlkMigBlock *blk;
@@ -618,11 +618,6 @@ static void blk_mig_cleanup(void)
 blk_mig_unlock();
 }
 
-static void block_migration_cancel(void *opaque)
-{
-blk_mig_cleanup();
-}
-
 static int block_save_setup(QEMUFile *f, void *opaque)
 {
 int ret;
@@ -879,7 +874,7 @@ static SaveVMHandlers savevm_block_handlers = {
 .save_live_complete = block_save_complete,
 .save_live_pending = block_save_pending,
 .load_state = block_load,
-.cleanup = block_migration_cancel,
+.cleanup = block_migration_cleanup,
 .is_active = block_is_active,
 };
 
diff --git a/migration/ram.c b/migration/ram.c
index e7f711e..8b045fe 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1024,7 +1024,7 @@ void free_xbzrle_decoded_buf(void)
 xbzrle_decoded_buf = NULL;
 }
 
-static void migration_end(void)
+static void ram_migration_cleanup(void *opaque)
 {
 /* caller have hold iothread lock or is in a bh, so there is
  * no writing race against this migration_bitmap
@@ -1049,11 +1049,6 @@ static void migration_end(void)
 XBZRLE_cache_unlock();
 }
 
-static void ram_migration_cancel(void *opaque)
-{
-migration_end();
-}
-
 static void reset_ram_globals(void)
 {
 last_seen_block = NULL;
@@ -1610,7 +1605,7 @@ static SaveVMHandlers savevm_ram_handlers = {
 .save_live_complete = ram_save_complete,
 .save_live_pending = ram_save_pending,
 .load_state = ram_load,
-.cleanup = ram_migration_cancel,
+.cleanup = ram_migration_cleanup,
 };
 
 void ram_mig_init(void)
-- 
1.9.1

[Qemu-devel] [v2 0/4] Fix long vm downtime during live migration

2015-08-12 Thread Liang Li

Some cleanup operations take long time during the pause and copy stage,
especially with the KVM patch 3ea3b7fa9af067, do these operations after
the completion of live migration can help to reduce VM downtime.

Ony the first patch changes the behavior, the rest 3 patches are for code
cleanup.

Changes:
  * Remove qemu_savevm_sate_cancel() in migrate_fd_cleanup()
  * Add 2 more patches for code clean up

Liang Li (4):
  migration: do cleanup operation after completion
  migration: rename qemu_savevm_state_cancel
  migration: rename cancel to cleanup in SaveVMHandles
  migration: code clean up

 include/migration/vmstate.h |  2 +-
 include/sysemu/sysemu.h |  2 +-
 migration/block.c   | 10 ++
 migration/migration.c   | 14 +++---
 migration/ram.c | 10 ++
 migration/savevm.c  | 10 +-
 trace-events|  2 +-
 7 files changed, 19 insertions(+), 31 deletions(-)

-- 
1.9.1

[Qemu-devel] about the patch kvmclock Ensure proper env-tsc value for kvmclock_current_nsec calculation

2015-08-12 Thread Li, Liang Z

Hi Paolo  Marcelo,

Could please point out what issue the patch 317b0a6d8ba44e try to fix?  I 
found in live migration the cpu_synchronize_all_states will be called twice, 
and it will take more than 1 ms sometimes. I try to do some optimization but 
lack the knowledge about the background. 

Thanks

Liang

Re: [Qemu-devel] [PATCH for-2.5 27/30] m68k: add addx/subx/negx

2015-08-12 Thread Richard Henderson


On 08/12/2015 05:11 PM, Laurent Vivier wrote:



Le 12/08/2015 20:46, Richard Henderson a écrit :

On 08/09/2015 01:13 PM, Laurent Vivier wrote:

+return (op1  ~((1UL  bits) - 1)) | res; \


deposit32(op1, res, bits, 0)



You mean: deposit32(op1, 0, bits, res) ?


Yes, sorry.  The operand order is different for tcg_gen_deposit_i32.  ;-)


r~

Re: [Qemu-devel] [PATCH] monitor: remove QAPI_EVENT_VSERPORT_CHANGE throttle

2015-08-12 Thread Amit Shah

On (Tue) 11 Aug 2015 [20:21:18], Laszlo Ersek wrote:
 On 08/11/15 19:04, marcandre.lur...@redhat.com wrote:
  From: Marc-André Lureau marcandre.lur...@redhat.com
  
  QAPI_EVENT_VSERPORT_CHANGE reports changes of a virtio serial port
  state. However, the events may be for different ports, but the throttle
  mechanism may replace the event for a different port, since it only
  checks the event type.
  
  libvirt relies on a correct state to be reported for all channels: the
  qemu-ga commands may no longer work if the state is reported
  disconnected. This can be triggered easily by having more than 1
  virtio-serial (qemu-ga + spice agent for example), and restarting
  quickly daemons or more realistically going quickly in and out of
  suspend.
  
  In a future patch, we may want to throttle events based on their
  arguments, but this will likely require dynamic allocations and more
  complicated code to insert/lookup pending events based on various
  arguments (id in QAPI_EVENT_VSERPORT_CHANGE case).
  
  Fixes:
  https://bugzilla.redhat.com/show_bug.cgi?id=1244064
  
  Signed-off-by: Marc-André Lureau marcandre.lur...@redhat.com
  ---
   monitor.c | 1 -
   1 file changed, 1 deletion(-)
  
  diff --git a/monitor.c b/monitor.c
  index aeea2b5..e4d56f7 100644
  --- a/monitor.c
  +++ b/monitor.c
  @@ -558,7 +558,6 @@ static void monitor_qapi_event_init(void)
   monitor_qapi_event_throttle(QAPI_EVENT_BALLOON_CHANGE, 1000);
   monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000);
   monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000);
  -monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000);
   
   qmp_event_set_func_emit(monitor_qapi_event_queue);
   }
  
 
 I don't mind the change (and the point of argument sensitivity is not
 lost on me), but note that this undoes the protection that is spelled
 out in the leading comment of the function, not visible in the context here:
 
 /* Limit guest-triggerable events to 1 per second */
 
 That was probably put in place in order to prevent a malicious guest
 from spamming the log files (and the CPU usage) of the management apps.
 
 One solution to that would be arg sensitivity. Another would be a
 burst-capable (ie. a slowly re-filling, limited size token bucket)
 throttle, maintained per event type.
 
 Until one of those gets written, I guess this patch is acceptable -- as
 long as mgmt people are okay with it. Daniel seems to be, so I don't mind.

OK - so I'll queue it.

Thanks,

Amit

Re: [Qemu-devel] [PATCH v3 2/3] sPAPR: Support RTAS call ibm, {open, close}-errinjct

2015-08-12 Thread Gavin Shan

On Tue, Aug 11, 2015 at 11:54:08AM +1000, Alexey Kardashevskiy wrote:
On 08/11/2015 10:57 AM, Gavin Shan wrote:
On Mon, Aug 10, 2015 at 10:24:56PM +1000, David Gibson wrote:
On Fri, Aug 07, 2015 at 01:33:32PM +1000, Gavin Shan wrote:
The patch supports RTAS calls ibm,{open,close}-errinjct to
manupliate the token, which is passed to RTAS call ibm,errinjct
to indicate the valid context for error injection. Each VM is
permitted to have only one token at once and we simply have one
random number for that.

Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  hw/ppc/spapr.c |  5 
  hw/ppc/spapr_rtas.c| 66 
 ++
  include/hw/ppc/spapr.h | 10 +++-
  3 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index dfd808f..1ebd0b2 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1225,6 +1225,11 @@ static const VMStateDescription vmstate_spapr = {
  VMSTATE_UINT64_TEST(rtc_offset, sPAPRMachineState, 
 version_before_3),

  VMSTATE_PPC_TIMEBASE_V(tb, sPAPRMachineState, 2),
+
+/* Error injection token */
+VMSTATE_BOOL(is_errinjct_opened, sPAPRMachineState),
+VMSTATE_UINT32(errinjct_next_token, sPAPRMachineState),

Because you're adding fields to the vmstate you'll need to define a
new version number and make these fields only considered in the new
version.


Agree, do we have an example for me to refer to?


Technically it is enough to send a token which is not opened when it is
zero. The chunk below demonstrates versions use.


Agree. we don't have to track the token is opened or not. It's fine to
have not opened state in the target VM. I'll integrate this piece of
code into next revision. Thanks!

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e046265..e568d41 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1213,17 +1213,21 @@ static bool version_before_3(void *opaque, int
version_id)

 static const VMStateDescription vmstate_spapr = {
 .name = spapr,
-.version_id = 3,
+.version_id = 4,
 .minimum_version_id = 1,
 .post_load = spapr_post_load,
 .fields = (VMStateField[]) {
 /* used to be @next_irq */
 VMSTATE_UNUSED_BUFFER(version_before_3, 0, 4),

+
 /* RTC offset */
 VMSTATE_UINT64_TEST(rtc_offset, sPAPRMachineState, version_before_3),

 VMSTATE_PPC_TIMEBASE_V(tb, sPAPRMachineState, 2),
+
+/* Error injection token */
+VMSTATE_UINT32_V(errinjct_next_token, sPAPRMachineState, 4),
 VMSTATE_END_OF_LIST()
 },
 };



Technically you only need to transfer whether the token is open, and
*if* it is open the current token value.  Not sure if it's worth
restricting to that though.


It would be fine to transfer the token even it's closed, I think.


Thanks,
Gavin

Re: [Qemu-devel] [PATCH v4 1/3] linux-headers: Add eeh.h

2015-08-12 Thread David Gibson

On Thu, Aug 13, 2015 at 11:30:37AM +1000, Gavin Shan wrote:
 On Wed, Aug 12, 2015 at 12:00:35PM +1000, Alexey Kardashevskiy wrote:
 On 08/11/2015 07:11 PM, Peter Maydell wrote:
 On 10 August 2015 at 08:13, Gavin Shan gws...@linux.vnet.ibm.com wrote:
 The header file was introduced by following Linux upstream commits:
 
  commit ed3e81f (powerpc/eeh: Move PE state constants around)
  commit ec33d36 (powerpc/eeh: Introduce eeh_pe_inject_err())
 
 Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
 ---
   linux-headers/asm-powerpc/eeh.h | 56 
  +
   1 file changed, 56 insertions(+)
   create mode 100644 linux-headers/asm-powerpc/eeh.h
 
 Shouldn't this be added by updating scripts/update-linux-headers.sh
 and then doing a plain synchronize headers against kernel version $X ?
 
 
 I also thought this is the protocol but then I looked into the git history
 and discovered this is not always the case :)
 
 
 
 So should I use scripts/update-linux-headers.sh or what I had is
 fine?

Please use update-linux-headers.sh - and make sure you also update the
script itself (if necessary) so it will pull in future updates to
eeh.h

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgplWSfQxnsgz.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH] linux-user: elfload: Still use TARGET_PAGE_SIZE for i386 guest

2015-08-12 Thread gchen gchen

On 2015年08月12日 23:06, Richard Henderson wrote:brgt; On 08/12/2015 12:59 AM, 
gchen gchen wrote:brgt;gt;gt; Nack.  There's 99 problems with host page 
sizegt; guest page size.  Thisbrgt;gt;gt; solves none of them, and in the 
hackiest way possible.brgt;gt;gt;brgt;gt;brgt;gt; Under alpha 
virtual machine, if set i386 guest page size 8KB, it willbrgt;gt; cause 
failure directly (any dynamically linked binaries can not 
work).brgt;brgt; Yes, I know.  The same thing happens when running i386 
guests on otherbrgt; (non-virtual) hosts.  E.g. Sparc64's 8kB page, 
PowerPC64's 64kB page.brgt;brbrYes. The reason why I am only focus on 
Alpha is the machine which I ambrworking for is almost the same as Alpha. 
But this machine is very slow,brits performance maybe like 10 years ago's 
x86_64 laptop.brbrgt;gt; Do you have any other ideas for solving this 
issue?brgt;brgt; The only complete solution that I see is to use softmmu 
with linux-user, sobrgt; that we properly emulate the guest pages.  Yes, it 
will cause quite somebrgt; slow-down in emulation, but I believe it's the 
only reliable way.brgt;brbrI have tried softmmu, for me, the performance 
is not acceptable, we canbrnot use this way.brbrOur main goal is let 
most of Windows XP graphic free programs (but notbropen source), can be used 
by user under Alpha like machine, also itsbrperformance and stability are 
acceptable.brbrbrThanks.br--brChen GangbrbrOpen, share, and 
attitude like air, water, and life which God blessedbr

[Qemu-devel] [Consult] linux-user: Let 8KB host support 4KB guest

2015-08-12 Thread gchen gchen

Hello All:

For my company, it is useful to let qemu linux-user 8KB host support 4KB
guest, it may have many issues, but I want to try to fix them (with the
aid from qemu members). So I want to consult:

 - Is it valuable and possible to fix these issues? (I guess, it is
   valuable and possible, but need much work).

 - How many resources do we need? (e.g. I can spend at least 1 months
   working days on it, with the aid from qemu members, is it enough?).

 - Can we describe all related issues ? (also describing each root cause
   will be better).


Thanks.
--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

Re: [Qemu-devel] [PATCH] linux-user: elfload: Still use TARGET_PAGE_SIZE for i386 guest

2015-08-12 Thread gchen gchen

On 2015年08月12日 23:06, Richard Henderson wrote:
 On 08/12/2015 12:59 AM, gchen gchen wrote:
 Nack.  There's 99 problems with host page size guest page size.  This
 solves none of them, and in the hackiest way possible.


 Under alpha virtual machine, if set i386 guest page size 8KB, it will
 cause failure directly (any dynamically linked binaries can not work).

 Yes, I know.  The same thing happens when running i386 guests on other
 (non-virtual) hosts.  E.g. Sparc64's 8kB page, PowerPC64's 64kB page.


Yes. The reason why I am only focus on Alpha is the machine which I am
working for is almost the same as Alpha. But this machine is very slow,
its performance maybe like 10 years ago's x86_64 laptop.

 Do you have any other ideas for solving this issue?

 The only complete solution that I see is to use softmmu with linux-user, so
 that we properly emulate the guest pages.  Yes, it will cause quite some
 slow-down in emulation, but I believe it's the only reliable way.


I have tried softmmu, for me, the performance is not acceptable, we can
not use this way.

Our main goal is let most of Windows XP graphic free programs (but not
open source), can be used by user under Alpha like machine, also its
performance and stability are acceptable.


Thanks.
--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

[Qemu-devel] [PATCH] linux-user: elfload: Still use TARGET_PAGE_SIZE for i386 guest

2015-08-12 Thread gchen gchen

Under Alpha host, for ubuntu12.04.5 i386 guest, it will cause failure:
Invalid ELF image for this architecture.

The related issue commit is a70daba linux-user: Tell guest about big
host page sizes.

Signed-off-by: Chen Gang gang.chen.5...@gmail.com
---
 linux-user/elfload.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 1788368..f4cf9b6 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1614,7 +1614,11 @@ static abi_ulong create_elf_tables(abi_ulong p, int 
argc, int envc,
 NEW_AUX_ENT(AT_PHDR, (abi_ulong)(info-load_addr + exec-e_phoff));
 NEW_AUX_ENT(AT_PHENT, (abi_ulong)(sizeof (struct elf_phdr)));
 NEW_AUX_ENT(AT_PHNUM, (abi_ulong)(exec-e_phnum));
+#ifdef TARGET_I386
+NEW_AUX_ENT(AT_PAGESZ, (abi_ulong)TARGET_PAGE_SIZE);
+#else
 NEW_AUX_ENT(AT_PAGESZ, (abi_ulong)(MAX(TARGET_PAGE_SIZE, getpagesize(;
+#endif
 NEW_AUX_ENT(AT_BASE, (abi_ulong)(interp_info ? interp_info-load_addr : 
0));
 NEW_AUX_ENT(AT_FLAGS, (abi_ulong)0);
 NEW_AUX_ENT(AT_ENTRY, info-entry);
--
1.9.1

Re: [Qemu-devel] [PATCH v6 0/2] vhost user: Add live migration

2015-08-12 Thread Michael S. Tsirkin

On Thu, Aug 06, 2015 at 10:45:07AM +0200, Thibaut Collet wrote:
 v5-v6
 1. First patch: remove a warning log
 2. Second patch: rename some functions to be more explicit on the purpose of
these functions.
 
 The first patch provides limited live migration:
 - guest without GUEST_ANNOUNCE capabilities does not announce migration ending
 and peers talking to the migrated guest can suffer important network outage.
 - Some packets sent by remote peers to the guest can be lost during migration.
 
 The second patch fixes limitation for guest without GUEST_ANNOUNCE 
 capabilities
 and patches from Marc Andre Lureau fix potential packet's lost during 
 migration.
 
 Thibaut Collet (2):
   vhost user: add support of live migration
   vhost user: add rarp sending after live migration for legacy guest

I think these patches need to be rebased on top of Marc Andre's ones,
and use protocol flags to negotiate capabilities.
Right?


  docs/specs/vhost-user.txt |   15 +++
  hw/net/vhost_net.c|   18 ++
  hw/virtio/vhost-backend.c |3 ++-
  hw/virtio/vhost-user.c|   32 ++--
  include/hw/virtio/vhost-backend.h |2 ++
  include/net/vhost_net.h   |1 +
  net/vhost-user.c  |   31 +--
  7 files changed, 97 insertions(+), 5 deletions(-)
 
 -- 
 1.7.10.4

Re: [Qemu-devel] [PATCH for-2.5 0/4] vhost: cleanups and switching to sorted memory map

2015-08-12 Thread Michael S. Tsirkin

On Tue, Jul 28, 2015 at 04:52:49PM +0200, Igor Mammedov wrote:
 making memory map a sorted array helps to simplify
 and speed up lookup/insertion and deletion ops on it.
 It also makes insertion/deteletion code easier to read.

I'm a bit confused by all the vhost patches you sent.
Is this series still something you want me to merge?
Or was it the one that caused memory corruptions with
mem hotplug?

 Igor Mammedov (4):
   vhost: codding style fix tab indents
   vhost: simplify/speedify vhost_dev_assign_memory()
   vhost: switch region lookup from linear to bsearch
   vhost: simplify/speedify vhost_dev_unassign_memory()
 
  hw/virtio/vhost.c | 252 
 ++
  1 file changed, 123 insertions(+), 129 deletions(-)
 
 -- 
 1.8.3.1

Re: [Qemu-devel] [Xen-devel] [PATCH v1] xenpt: Properly handle 64-bit bar with more than 4G size

2015-08-12 Thread Wu, Feng



 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Wednesday, August 12, 2015 4:43 PM
 To: Wu, Feng
 Cc: stefano.stabell...@eu.citrix.com; xen-de...@lists.xensource.com;
 qemu-devel@nongnu.org
 Subject: RE: [Qemu-devel] [Xen-devel] [PATCH v1] xenpt: Properly handle 64-bit
 bar with more than 4G size
 
  On 12.08.15 at 09:10, feng...@intel.com wrote:
 
 
  -Original Message-
  From: qemu-devel-bounces+feng.wu=intel@nongnu.org
  [mailto:qemu-devel-bounces+feng.wu=intel@nongnu.org] On Behalf Of
  Jan Beulich
  Sent: Wednesday, August 12, 2015 2:59 PM
  To: Wu, Feng
  Cc: xen-de...@lists.xensource.com; qemu-devel@nongnu.org;
  stefano.stabell...@eu.citrix.com
  Subject: Re: [Qemu-devel] [Xen-devel] [PATCH v1] xenpt: Properly handle
 64-bit
  bar with more than 4G size
 
   On 05.08.15 at 04:02, feng...@intel.com wrote:
   @@ -491,8 +474,9 @@ static int
  xen_pt_bar_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry,
bar_ro_mask = XEN_PT_BAR_IO_RO_MASK | (r_size - 1);
break;
case XEN_PT_BAR_FLAG_UPPER:
   +r = d-io_regions[index-1];
 
  Perhaps worth an assert(index  0)?
 
  No problem, I will add it. BTW, do you have any other comments about this
  patch? If no, I am
  going to send out the new version with this changes.
 
 No - everything else looks to make sense (but continues to need
 testing).
 

I don't have such a device in hand. Can anybody who has such a device help to 
test this
patch? It would be highly appreciated!

Thanks,
Feng

 Jan

Re: [Qemu-devel] [PATCH] linux-user: elfload: Still use TARGET_PAGE_SIZE for i386 guest

2015-08-12 Thread Richard Henderson


On 08/11/2015 11:03 PM, gchen gchen wrote:

Under Alpha host, for ubuntu12.04.5 i386 guest, it will cause failure:
Invalid ELF image for this architecture.

The related issue commit is a70daba linux-user: Tell guest about big
host page sizes.

Signed-off-by: Chen Ganggang.chen.5...@gmail.com
---
  linux-user/elfload.c | 4 
  1 file changed, 4 insertions(+)


Nack.  There's 99 problems with host page size  guest page size.  This solves 
none of them, and in the hackiest way possible.



r~

Re: [Qemu-devel] [PATCH v2 2/6] hw/virtio/virtio-pci: Use pow2ceil() rather than hand-calculation

2015-08-12 Thread Michael S. Tsirkin

On Fri, Jul 24, 2015 at 01:33:08PM +0100, Peter Maydell wrote:
 Use the utility function pow2ceil() for rounding up to the next
 largest power of 2, rather than inline calculation.
 
 Signed-off-by: Peter Maydell peter.mayd...@linaro.org

Reviewed-by: Michael S. Tsirkin m...@redhat.com

 ---
  hw/virtio/virtio-pci.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)
 
 diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
 index 283401a..845f52f 100644
 --- a/hw/virtio/virtio-pci.c
 +++ b/hw/virtio/virtio-pci.c
 @@ -1497,9 +1497,7 @@ static void virtio_pci_device_plugged(DeviceState *d, 
 Error **errp)
  if (legacy) {
  size = VIRTIO_PCI_REGION_SIZE(proxy-pci_dev)
  + virtio_bus_get_vdev_config_len(bus);
 -if (size  (size - 1)) {
 -size = 1  qemu_fls(size);
 -}
 +size = pow2ceil(size);
  
  memory_region_init_io(proxy-bar, OBJECT(proxy),
virtio_pci_config_ops,
 -- 
 1.9.1

Re: [Qemu-devel] [PATCH for-2.5 12/30] m68k: Manage divw overflow

2015-08-12 Thread Richard Henderson


On 08/09/2015 01:13 PM, Laurent Vivier wrote:

Overflow may be detected and set before the instruction completes.
If the instruction detects an overflow, it sets the overflow condition
code, and the operands are unaffected.


May also implies may not.  I presume this is important for matching hardware? 
Is there some program you know of that depends on this?



+/* dest.l / src.w */
+
+dest = DREG(insn, 9);
+tcg_gen_mov_i32(QREG_DIV1, dest);
+
  SRC_EA(env, src, OS_WORD, sign, NULL);
  tcg_gen_mov_i32(QREG_DIV2, src);
+
+/* div1 / div2 */
+
  if (sign) {
  gen_helper_divs(cpu_env, tcg_const_i32(1));
  } else {
  gen_helper_divu(cpu_env, tcg_const_i32(1));
  }

+set_cc_op(s, CC_OP_FLAGS);
+
+l1 = gen_new_label();
+gen_jmpcc(s, 9 /* V */, l1);
  tmp = tcg_temp_new();
  src = tcg_temp_new();
  tcg_gen_ext16u_i32(tmp, QREG_DIV1);
  tcg_gen_shli_i32(src, QREG_DIV2, 16);
-tcg_gen_or_i32(reg, tmp, src);
-set_cc_op(s, CC_OP_FLAGS);
+tcg_gen_or_i32(dest, tmp, src);
+gen_set_label(l1);


All that said, it's possible to implement this branch inside the helper via 
exception.  Or simply return the inputs to effect no change.



r~

[Qemu-devel] [PATCH v6 1/2] vhost-user: add multi queue support

2015-08-12 Thread Ouyang Changchun

Based on patch by Nikolay Nikolaev:
Vhost-user will implement the multi queue support in a similar way
to what vhost already has - a separate thread for each queue.
To enable the multi queue functionality - a new command line parameter
queues is introduced for the vhost-user netdev.

The RESET_OWNER change is based on commit:
   294ce717e0f212ed0763307f3eab72b4a1bdf4d0
If it is reverted, the patch need update for it accordingly.

Signed-off-by: Nikolay Nikolaev n.nikol...@virtualopensystems.com
Signed-off-by: Changchun Ouyang changchun.ouy...@intel.com
---
Changes since v5:
 - fix the message descption for VHOST_RESET_OWNER in vhost-user txt

Changes since v4:
 - remove the unnecessary trailing '\n'

Changes since v3:
 - fix one typo and wrap one long line

Changes since v2:
 - fix vq index issue for set_vring_call
   When it is the case of VHOST_SET_VRING_CALL, The vq_index is not initialized 
before it is used,
   thus it could be a random value. The random value leads to crash in vhost 
after passing down
   to vhost, as vhost use this random value to index an array index.
 - fix the typo in the doc and description
 - address vq index for reset_owner

Changes since v1:
 - use s-nc.info_str when bringing up/down the backend

 docs/specs/vhost-user.txt |  7 ++-
 hw/net/vhost_net.c|  3 ++-
 hw/virtio/vhost-user.c| 11 ++-
 net/vhost-user.c  | 37 -
 qapi-schema.json  |  6 +-
 qemu-options.hx   |  5 +++--
 6 files changed, 50 insertions(+), 19 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 70da3b1..9390f89 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -135,6 +135,11 @@ As older slaves don't support negotiating protocol 
features,
 a feature bit was dedicated for this purpose:
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
 
+Multi queue support
+---
+The protocol supports multiple queues by setting all index fields in the sent
+messages to a properly calculated value.
+
 Message types
 -
 
@@ -198,7 +203,7 @@ Message types
 
   Id: 4
   Equivalent ioctl: VHOST_RESET_OWNER
-  Master payload: N/A
+  Master payload: vring state description
 
   Issued when a new connection is about to be closed. The Master will no
   longer own this connection (and will usually close it).
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 1f25cb3..9cd6c05 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -159,6 +159,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
 
 net-dev.nvqs = 2;
 net-dev.vqs = net-vqs;
+net-dev.vq_index = net-nc-queue_index;
 
 r = vhost_dev_init(net-dev, options-opaque,
options-backend_type, options-force);
@@ -269,7 +270,7 @@ static void vhost_net_stop_one(struct vhost_net *net,
 for (file.index = 0; file.index  net-dev.nvqs; ++file.index) {
 const VhostOps *vhost_ops = net-dev.vhost_ops;
 int r = vhost_ops-vhost_call(net-dev, VHOST_RESET_OWNER,
-  NULL);
+  file);
 assert(r = 0);
 }
 }
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 27ba035..fb11d4c 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -219,7 +219,12 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 break;
 
 case VHOST_USER_SET_OWNER:
+break;
+
 case VHOST_USER_RESET_OWNER:
+memcpy(msg.state, arg, sizeof(struct vhost_vring_state));
+msg.state.index += dev-vq_index;
+msg.size = sizeof(m.state);
 break;
 
 case VHOST_USER_SET_MEM_TABLE:
@@ -262,17 +267,20 @@ static int vhost_user_call(struct vhost_dev *dev, 
unsigned long int request,
 case VHOST_USER_SET_VRING_NUM:
 case VHOST_USER_SET_VRING_BASE:
 memcpy(msg.state, arg, sizeof(struct vhost_vring_state));
+msg.state.index += dev-vq_index;
 msg.size = sizeof(m.state);
 break;
 
 case VHOST_USER_GET_VRING_BASE:
 memcpy(msg.state, arg, sizeof(struct vhost_vring_state));
+msg.state.index += dev-vq_index;
 msg.size = sizeof(m.state);
 need_reply = 1;
 break;
 
 case VHOST_USER_SET_VRING_ADDR:
 memcpy(msg.addr, arg, sizeof(struct vhost_vring_addr));
+msg.addr.index += dev-vq_index;
 msg.size = sizeof(m.addr);
 break;
 
@@ -280,7 +288,7 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned 
long int request,
 case VHOST_USER_SET_VRING_CALL:
 case VHOST_USER_SET_VRING_ERR:
 file = arg;
-msg.u64 = file-index  VHOST_USER_VRING_IDX_MASK;
+msg.u64 = (file-index + dev-vq_index)  VHOST_USER_VRING_IDX_MASK;
 msg.size = sizeof(m.u64);
 if (ioeventfd_enabled()  file-fd  0) {
 fds[fd_num++]

[Qemu-devel] [PATCH v6 0/2] vhost-user multi queue support

2015-08-12 Thread Ouyang Changchun

Vhost-user will implement the multi queue support in a similar way
to what vhost already has - a separate thread for each queue.
To enable the multi queue functionality - a new command line parameter
queues is introduced for the vhost-user netdev.

Use new message VHOST_USER_SET_VRING_FLAG to enable and disable an
actual virt queue, which is similar to attach/detach queue for tap device.
virtio driver on guest doesn't have to use max virt queue pair, it could
enable any number of virt queue ranging from 1 to max virt queue pair.
 
It requires that VHOST_USER_F_PROTOCOL_FEATURES is present.

Any comment is welcome for the patch set.

It has dependency on a few other patch(set):
vhost-user: protocol updates series
proposed earlier by Michael S. Tsirkin

RESET_OWNER change is based on commit:
294ce717e0f212ed0763307f3eab72b4a1bdf4d0  

Changchun Ouyang (2):
  vhost-user: add multi queue support
  vhost-user: new protocol feature for multi queue

 docs/specs/vhost-user.txt | 24 +++-
 hw/net/vhost_net.c| 21 +-
 hw/net/virtio-net.c   |  2 ++
 hw/virtio/vhost-user.c| 46 ---
 include/hw/virtio/vhost-backend.h |  2 ++
 include/net/vhost_net.h   |  1 +
 net/vhost-user.c  | 37 ---
 qapi-schema.json  |  6 -
 qemu-options.hx   |  5 +++--
 9 files changed, 123 insertions(+), 21 deletions(-)

-- 
1.8.4.2

Re: [Qemu-devel] [PATCH] linux-user: elfload: Still use TARGET_PAGE_SIZE for i386 guest

2015-08-12 Thread gchen gchen

On 2015年08月12日 14:45, Richard Henderson wrote:
 On 08/11/2015 11:03 PM, gchen gchen wrote:
 Under Alpha host, for ubuntu12.04.5 i386 guest, it will cause failure:
 Invalid ELF image for this architecture.

 The related issue commit is a70daba linux-user: Tell guest about big
 host page sizes.

 Signed-off-by: Chen Ganggang.chen.5...@gmail.com
 ---
   linux-user/elfload.c | 4 
   1 file changed, 4 insertions(+)

 Nack.  There's 99 problems with host page size guest page size.  This
 solves none of them, and in the hackiest way possible.


Under alpha virtual machine, if set i386 guest page size 8KB, it will
cause failure directly (any dynamically linked binaries can not work).

This fix can let i386 bash/cp/vi/ls run under alpha virtual machine,
although there maybe be many other issues (which I did not meet, now).

Do you have any other ideas for solving this issue?  e.g. can we rebuild
all i386 programs with segment alignment to 8KB? I am not quite sure,
but some i386 programs (which are interested in) have no source code!
:-(.


Thanks.
--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

Re: [Qemu-devel] [PATCH for-2.5 15/30] m68k: add more modes to movem

2015-08-12 Thread Andreas Schwab

Richard Henderson r...@twiddle.net writes:

 On 08/09/2015 01:13 PM, Laurent Vivier wrote:
 +opsize = (insn  0x40) != 0 ? OS_LONG : OS_WORD;
 +incr = opsize_bytes(opsize);
 +if (!is_load  (insn  070) == 040) {
 +for (i = 15; i = 0; i--, mask = 1) {

 This has got to be wrong.  Just because it's pre-decrement doesn't mean
 you should skip all of the loads.

Pre-dec only supports reg-to-mem, and is special because mask is bit
reversed.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.

Re: [Qemu-devel] [PATCH for-2.5 05/30] m68k: define operand sizes

2015-08-12 Thread Laurent Vivier



Le 12/08/2015 06:07, Richard Henderson a écrit :
 On 08/09/2015 01:13 PM, Laurent Vivier wrote:
 -#define OS_BYTE 0
 -#define OS_WORD 1
 -#define OS_LONG 2
 -#define OS_SINGLE 4
 -#define OS_DOUBLE 5
 +#define OS_BYTE 1
 +#define OS_WORD 2
 +#define OS_LONG 3
 +#define OS_SINGLE   4
 +#define OS_DOUBLE   5
 +#define OS_EXTENDED 6
 +#define OS_PACKED   7

 
 Is there a reason you've skipped the 0 value when adding the new values?

I think there is no reason, but if I change the value I have to update
abdc_mem, sbcd_mem instructions as they use it as an
incrementer/decrementer. I agree, it's a strange idea.

 
 +static inline int insn_opsize(int insn, int pos)
 +{
 +switch ((insn  pos)  3) {
 
 
 In particular, that change means that insn_opsize is more complicated
 than needed.  Further, is there any reason for POS to be a varable? 
 Isn't it at the same place for all insns?
 
 +static inline int ext_opsize(int ext, int pos)
 
 This should probably wait until the fp insns get added.

Yes.

Laurent

[Qemu-devel] [PATCH v6 2/2] vhost-user: new protocol feature for multi queue

2015-08-12 Thread Ouyang Changchun

This patch is based on top of vhost-user: protocol updates series
proposed earlier by Michael S. Tsirkin.

Use new message VHOST_USER_SET_VRING_FLAG to enable and disable an
actual virt queue, which is similar to attach/detach queue for tap device.
virtio driver on guest doesn't have to use max virt queue pair, it could
enable any number of virt queue ranging from 1 to max virt queue pair.

It requires that VHOST_USER_F_PROTOCOL_FEATURES is present.

Signed-off-by: Changchun Ouyang changchun.ouy...@intel.com
---
This is added since v5

 docs/specs/vhost-user.txt | 17 +
 hw/net/vhost_net.c| 18 ++
 hw/net/virtio-net.c   |  2 ++
 hw/virtio/vhost-user.c| 35 +--
 include/hw/virtio/vhost-backend.h |  2 ++
 include/net/vhost_net.h   |  1 +
 6 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 9390f89..cca3e5b 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -135,6 +135,10 @@ As older slaves don't support negotiating protocol 
features,
 a feature bit was dedicated for this purpose:
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
 
+The Slave uses vring flag to notify the vhost-user whether one virtq is enabled
+or not. This request doesn't require replies:
+#define VHOST_USER_PROTOCOL_F_VRING_FLAG 2
+
 Multi queue support
 ---
 The protocol supports multiple queues by setting all index fields in the sent
@@ -306,3 +310,16 @@ Message types
   Bits (0-7) of the payload contain the vring index. Bit 8 is the
   invalid FD flag. This flag is set when there is no file descriptor
   in the ancillary data.
+
+ * VHOST_USER_SET_VRING_FLAG
+
+  Id: 18
+  Equivalent ioctl: N/A
+  Master payload: vring state description
+
+  Set the flag(enable or disable) in the vring, the vhost user backend
+  enable or disable the vring according to state.num. Olny legal if feature
+  bit VHOST_USER_F_PROTOCOL_FEATURES is present in VHOST_USER_GET_FEATURE
+  and feature bit VHOST_USER_PROTOCOL_F_VRING_FLAG is present in
+  VHOST_USER_GET_PROTOCOL_FEATURES. The vring is enabled when state.num is
+  1, otherwise, the vring is disabled.
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 9cd6c05..5fa341c 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -405,6 +405,19 @@ VHostNetState *get_vhost_net(NetClientState *nc)
 
 return vhost_net;
 }
+
+int vhost_set_vring_flag(NetClientState *nc, unsigned int enable)
+{
+if (nc-info-type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
+struct vhost_net *net = get_vhost_net(nc);
+const VhostOps *vhost_ops = net-dev.vhost_ops;
+if (vhost_ops-vhost_backend_mq_set_vring_flag)
+return vhost_ops-vhost_backend_mq_set_vring_flag(net-dev, 
enable);
+}
+
+return 0;
+}
+
 #else
 struct vhost_net *vhost_net_init(VhostNetOptions *options)
 {
@@ -455,4 +468,9 @@ VHostNetState *get_vhost_net(NetClientState *nc)
 {
 return 0;
 }
+
+int vhost_set_vring_flag(NetClientState *nc, unsigned int enable)
+{
+return 0;
+}
 #endif
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 3af6faf..272b77d 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -396,6 +396,7 @@ static int peer_attach(VirtIONet *n, int index)
 }
 
 if (nc-peer-info-type != NET_CLIENT_OPTIONS_KIND_TAP) {
+vhost_set_vring_flag(nc-peer, 1);
 return 0;
 }
 
@@ -411,6 +412,7 @@ static int peer_detach(VirtIONet *n, int index)
 }
 
 if (nc-peer-info-type !=  NET_CLIENT_OPTIONS_KIND_TAP) {
+vhost_set_vring_flag(nc-peer, 0);
 return 0;
 }
 
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index fb11d4c..d806ce2 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -25,7 +25,8 @@
 
 #define VHOST_MEMORY_MAX_NREGIONS8
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
-#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x0ULL
+#define VHOST_USER_PROTOCOL_F_VRING_FLAG 2
+#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x7ULL
 
 typedef enum VhostUserRequest {
 VHOST_USER_NONE = 0,
@@ -45,6 +46,7 @@ typedef enum VhostUserRequest {
 VHOST_USER_SET_VRING_ERR = 14,
 VHOST_USER_GET_PROTOCOL_FEATURES = 15,
 VHOST_USER_SET_PROTOCOL_FEATURES = 16,
+VHOST_USER_SET_VRING_FLAG = 18,
 VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -399,6 +401,34 @@ static int vhost_user_init(struct vhost_dev *dev, void 
*opaque)
 return 0;
 }
 
+static int vhost_user_set_vring_flag(struct vhost_dev *dev, unsigned int 
enable)
+{
+VhostUserMsg msg = { 0 };
+int err;
+
+assert(dev-vhost_ops-backend_type == VHOST_BACKEND_TYPE_USER);
+
+if ((dev-backend_features  ( 1ULL  VHOST_USER_F_PROTOCOL_FEATURES)) == 
0)
+return -1;
+
+if ((dev-protocol_features  (1ULL  VHOST_USER_PROTOCOL_F_VRING_FLAG)) 
== 0)
+return -1;
+
+

Re: [Qemu-devel] [PATCH v2 1/6] hw/pci: Use pow2ceil() rather than hand-calculation

2015-08-12 Thread Michael S. Tsirkin

On Fri, Jul 24, 2015 at 01:33:07PM +0100, Peter Maydell wrote:
 A couple of places in hw/pci use an inline calculation to round a
 size up to the next largest power of 2. We have a utility routine
 for this, so use it.
 
 (The behaviour of the old code is different if the size value
 is 0 -- it would leave it as 0 rather than rounding up to 1,
 but in both cases we know the size can't be 0.
 In the case where the size value had bit 31 set, the old code
 would invoke undefined behaviour; the new code will give a
 result of 0. Presumably that could never happen either.)
 
 Signed-off-by: Peter Maydell peter.mayd...@linaro.org

Reviewed-by: Michael S. Tsirkin m...@redhat.com

 ---
  hw/pci/msix.c | 4 +---
  hw/pci/pci.c  | 4 +---
  2 files changed, 2 insertions(+), 6 deletions(-)
 
 diff --git a/hw/pci/msix.c b/hw/pci/msix.c
 index 7716bf3..2fdada4 100644
 --- a/hw/pci/msix.c
 +++ b/hw/pci/msix.c
 @@ -314,9 +314,7 @@ int msix_init_exclusive_bar(PCIDevice *dev, unsigned 
 short nentries,
  bar_size = bar_pba_offset + bar_pba_size;
  }
  
 -if (bar_size  (bar_size - 1)) {
 -bar_size = 1  qemu_fls(bar_size);
 -}
 +bar_size = pow2ceil(bar_size);
  
  name = g_strdup_printf(%s-msix, dev-name);
  memory_region_init(dev-msix_exclusive_bar, OBJECT(dev), name, 
 bar_size);
 diff --git a/hw/pci/pci.c b/hw/pci/pci.c
 index a017614..502da8d 100644
 --- a/hw/pci/pci.c
 +++ b/hw/pci/pci.c
 @@ -2065,9 +2065,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool 
 is_default_rom,
  g_free(path);
  return;
  }
 -if (size  (size - 1)) {
 -size = 1  qemu_fls(size);
 -}
 +size = pow2ceil(size);
  
  vmsd = qdev_get_vmsd(DEVICE(pdev));
  
 -- 
 1.9.1

Re: [Qemu-devel] [PATCH for-2.5 13/30] m68k: set Z and N on divu/muls overflow as a real 68040

2015-08-12 Thread Richard Henderson


On 08/09/2015 01:13 PM, Laurent Vivier wrote:

This allows to compare simulation results with a real 68040.

Signed-off-by: Laurent Vivierlaur...@vivier.eu
---
  target-m68k/op_helper.c | 38 ++
  1 file changed, 26 insertions(+), 12 deletions(-)


A good incentive, yes, but...  hmm.  Certainly another incentive to split 
helpers, rather than pass in word.


I don't have a way to review this though, so the best I can do is

Acked-by: Richard Henderson r...@twiddle.net


r~

Re: [Qemu-devel] [Xen-devel] [PATCH v1] xenpt: Properly handle 64-bit bar with more than 4G size

2015-08-12 Thread Wu, Feng



 -Original Message-
 From: qemu-devel-bounces+feng.wu=intel@nongnu.org
 [mailto:qemu-devel-bounces+feng.wu=intel@nongnu.org] On Behalf Of
 Jan Beulich
 Sent: Wednesday, August 12, 2015 2:59 PM
 To: Wu, Feng
 Cc: xen-de...@lists.xensource.com; qemu-devel@nongnu.org;
 stefano.stabell...@eu.citrix.com
 Subject: Re: [Qemu-devel] [Xen-devel] [PATCH v1] xenpt: Properly handle 64-bit
 bar with more than 4G size
 
  On 05.08.15 at 04:02, feng...@intel.com wrote:
  @@ -491,8 +474,9 @@ static int
 xen_pt_bar_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry,
   bar_ro_mask = XEN_PT_BAR_IO_RO_MASK | (r_size - 1);
   break;
   case XEN_PT_BAR_FLAG_UPPER:
  +r = d-io_regions[index-1];
 
 Perhaps worth an assert(index  0)?

No problem, I will add it. BTW, do you have any other comments about this 
patch? If no, I am
going to send out the new version with this changes.

Thanks,
Feng

 
 Jan

Re: [Qemu-devel] [RFC v4 1/9] exec.c: Add new exclusive bitmap to ram_list

2015-08-12 Thread alvise rigo

I think that tlb_flush_entry is not enough, since in theory another
vCPU could have a different TLB address referring the same phys
address.

alvise

On Tue, Aug 11, 2015 at 6:32 PM, Paolo Bonzini pbonz...@redhat.com wrote:


 On 11/08/2015 18:11, alvise rigo wrote:
  Why flush the entire cache (I understand you mean TLB)?
 Sorry, I meant the TLB.
 If for each removal of an exclusive entry we set also the bit to 1, we
 force the following LL to make a tlb_flush() on every vCPU.

 What if you only flush one entry with tlb_flush_entry (on every vCPU)?

 Paolo

Re: [Qemu-devel] [PATCH for-2.5 01/30] m68k: define m680x0 CPUs and features

2015-08-12 Thread Laurent Vivier



Le 12/08/2015 01:13, Richard Henderson a écrit :
 On 08/09/2015 01:13 PM, Laurent Vivier wrote:
  INSN(undef, , , CF_ISA_A);
 +INSN(undef, , , M68000);
  INSN(arith_im,  0080, fff8, CF_ISA_A);
 +INSN(arith_im,  , ff00, M68000);
 +INSN(undef, 00c0, ffc0, M68000);
  INSN(bitrev,00c0, fff8, CF_ISA_APLUSC);
  INSN(bitop_reg, 0100, f1c0, CF_ISA_A);
 +INSN(bitop_reg, 0100, f1c0, M68000);
  INSN(bitop_reg, 0140, f1c0, CF_ISA_A);
 +INSN(bitop_reg, 0140, f1c0, M68000);
 
 There's a *lot* of repetition in here.
 
 Can we also introduce a BASE() macro that's like INSN() except that it doesn't
 bother checking m68k_feature?  That way if both CF_ISA_A and M68000 are set, 
 we
 don't have to duplicate the entry.

Thank you, good idea.

Laurent

Re: [Qemu-devel] [PATCH COLO-Frame v8 00/34] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)

2015-08-12 Thread zhanghailiang


On 2015/8/5 19:24, Dr. David Alan Gilbert wrote:

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:

This is the 8th version of COLO.

Here is only COLO frame part, include: VM checkpoint,
failover, proxy API, block replication API, not include block replication.
The block part is treated as a separate series.

As usual, we provide 'basic' and 'developing' branches in github:
https://github.com/coloft/qemu/commits/colo-v1.5-basic
https://github.com/coloft/qemu/commits/colo-v1.5-developing (more features)

The 'basic' branch is exactly the same with this patch series,
We will keep this series simple as possible, just for easy review.

The extra features in colo-v1.5-developing branch:
1) Separate ram and device save/load process to reduce size of extra memory
used during checkpoint
2) Live migrate part of dirty pages to slave during sleep time.
3) You get the statistic info about checkpoint by command 'info migrate'


I'm hitting a problem that I think is due to the new global_state section
that Juan recently added; if I cause a failover I hit:

ERROR: invalid runstate transition: 'colo' - 'prelaunch'

(on the secondary).
I think the problem is that, the global_state is only sent for any 'unusual' 
states,
so in the first migration that gets done at startup, 'prelaunch' is included in 
the stream
in the global state, but then for later checkpoints the global_state probably 
isn't
sent.

I hacked around it by making global_state_needed return false; I guess
we need to find a better fix!



Hi Dave,

This problem can be fixed by the following modification:

diff --git a/migration/colo.c b/migration/colo.c
index 89f4a3f..fccb384 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -424,6 +424,10 @@ static void *colo_thread(void *opaque)
 qemu_mutex_unlock_iothread();
 trace_colo_vm_state_change(stop, run);

+if (global_state_store()  0) {
+goto out;
+}
+

Actually, the global_state is sent in every cycle of COLO checkpoint.
But the value is always the old one (prelaunch). It is only stored
in the first migration's last stage, but not been updated after going into colo 
mode,
where the new state is 'running'.


Thanks,
zhanghailiang






Please reference to the follow link to test COLO.
http://wiki.qemu.org/Features/COLO.

COLO is a totally new feature which is still in early stage,
your comments and feedback are warmly welcomed.

NOTE:
We have decided to re-implement the colo proxy in userspace (In qemu exactly).
you can find the discussion about why  how to realize the colo proxy in qemu 
from the follow link:
http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04069.html

TODO:
1. COLO function switch on/off
2. The capability of continuous FT
3. Optimize the performance.

v8:
- Move some global variables into MigrationIncomingState and MigrationState
- Move some cleanup work form colo thread and colo incoming thread into failover
   BH function and also fix the code logic for the cleanup work.
- fix the bug that colo thread and colo incoming thread possibly block in the
   socket 'recv' call when do failover work.
- Optimize colo_flush_ram_cache()
- Add migration state for incoming side, we use the state to verify if migration
   incoming side is in COLO state or not (Patch 5).
- Drop the patch 'COLO: Disable qdev hotplug when VM is in COLO mode', since it 
is not correct.

zhanghailiang (34):
   configure: Add parameter for configure to enable/disable COLO support
   migration: Introduce capability 'colo' to migration
   COLO: migrate colo related info to slave
   colo-comm/migration: skip colo info section for special cases
   migration: Add state records for migration incoming
   migration: Integrate COLO checkpoint process into migration
   migration: Integrate COLO checkpoint process into loadvm
   COLO: Implement colo checkpoint protocol
   COLO: Add a new RunState RUN_STATE_COLO
   QEMUSizedBuffer: Introduce two help functions for qsb
   COLO: Save VM state to slave when do checkpoint
   COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
   COLO VMstate: Load VM state into qsb before restore it
   arch_init: Start to trace dirty pages of SVM
   COLO RAM: Flush cached RAM into SVM's memory
   COLO failover: Introduce a new command to trigger a failover
   COLO failover: Introduce state to record failover process
   COLO failover: Implement COLO primary/secondary vm failover work
   qmp event: Add event notification for COLO error
   COLO failover: Don't do failover during loading VM's state
   COLO: Add new command parameter 'forward_nic' 'colo_script' for net
   COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
   tap: Make launch_script() public
   COLO NIC: Implement colo nic device interface configure()
   colo-nic: Handle secondary VM's original net device configure
   COLO NIC: Implement colo nic init/destroy function
   COLO NIC: Some init work related with proxy module
   COLO: Handle nfnetlink message from proxy module

Re: [Qemu-devel] [Xen-devel] [PATCH v1] xenpt: Properly handle 64-bit bar with more than 4G size

2015-08-12 Thread Jan Beulich

 On 12.08.15 at 09:10, feng...@intel.com wrote:

 
 -Original Message-
 From: qemu-devel-bounces+feng.wu=intel@nongnu.org 
 [mailto:qemu-devel-bounces+feng.wu=intel@nongnu.org] On Behalf Of
 Jan Beulich
 Sent: Wednesday, August 12, 2015 2:59 PM
 To: Wu, Feng
 Cc: xen-de...@lists.xensource.com; qemu-devel@nongnu.org;
 stefano.stabell...@eu.citrix.com 
 Subject: Re: [Qemu-devel] [Xen-devel] [PATCH v1] xenpt: Properly handle 
 64-bit
 bar with more than 4G size
 
  On 05.08.15 at 04:02, feng...@intel.com wrote:
  @@ -491,8 +474,9 @@ static int
 xen_pt_bar_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry,
   bar_ro_mask = XEN_PT_BAR_IO_RO_MASK | (r_size - 1);
   break;
   case XEN_PT_BAR_FLAG_UPPER:
  +r = d-io_regions[index-1];
 
 Perhaps worth an assert(index  0)?
 
 No problem, I will add it. BTW, do you have any other comments about this 
 patch? If no, I am
 going to send out the new version with this changes.

No - everything else looks to make sense (but continues to need
testing).

Jan

Re: [Qemu-devel] [PATCH for-2.5 05/30] m68k: define operand sizes

2015-08-12 Thread Andreas Schwab

Laurent Vivier laur...@vivier.eu writes:

 Le 12/08/2015 06:07, Richard Henderson a écrit :
 Is there a reason you've skipped the 0 value when adding the new values?

 I think there is no reason, but if I change the value I have to update
 abdc_mem, sbcd_mem instructions as they use it as an
 incrementer/decrementer. I agree, it's a strange idea.

Those uses are really opsize_bytes(OS_BYTE), technically.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.

[Qemu-devel] [PATCH] user-exec: alpha-host: Add type cast to avoid compiling warning

2015-08-12 Thread gchen gchen

The related building warnings in alpha virtual machine:

CCi386-linux-user/user-exec.o
  user-exec.c: In function 'cpu_x86_signal_handler':
  user-exec.c:363:20: error: initialization makes pointer from integer without 
a cast [-Werror=int-conversion]
   uint32_t *pc = uc-uc_mcontext.sc_pc;
  ^
  user-exec.c:383:30: error: passing argument 1 of 'handle_cpu_signal' makes 
integer from pointer without a cast [-Werror=int-conversion]
   return handle_cpu_signal(pc, (unsigned long)info-si_addr,
^
  user-exec.c:86:19: note: expected 'uintptr_t {aka long unsigned int}' but 
argument is of type 'uint32_t * {aka unsigned int *}'
   static inline int handle_cpu_signal(uintptr_t pc, unsigned long address,
 ^

Signed-off-by: Chen Gang gang.chen.5...@gmail.com
---
 user-exec.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/user-exec.c b/user-exec.c
index ed9a07f..baaeb09 100644
--- a/user-exec.c
+++ b/user-exec.c
@@ -360,7 +360,7 @@ int cpu_signal_handler(int host_signum, void *pinfo,
 {
 siginfo_t *info = pinfo;
 struct ucontext *uc = puc;
-uint32_t *pc = uc-uc_mcontext.sc_pc;
+uint32_t *pc = (uint32_t *)uc-uc_mcontext.sc_pc;
 uint32_t insn = *pc;
 int is_write = 0;

@@ -380,7 +380,7 @@ int cpu_signal_handler(int host_signum, void *pinfo,
 is_write = 1;
 }

-return handle_cpu_signal(pc, (unsigned long)info-si_addr,
+return handle_cpu_signal((unsigned long)pc, (unsigned long)info-si_addr,
  is_write, uc-uc_sigmask, puc);
 }
 #elif defined(__sparc__)
--
1.9.1

Re: [Qemu-devel] [Xen-devel] [PATCH v1] xenpt: Properly handle 64-bit bar with more than 4G size

2015-08-12 Thread Jan Beulich

 On 05.08.15 at 04:02, feng...@intel.com wrote:
 @@ -491,8 +474,9 @@ static int xen_pt_bar_reg_write(XenPCIPassthroughState 
 *s, XenPTReg *cfg_entry,
  bar_ro_mask = XEN_PT_BAR_IO_RO_MASK | (r_size - 1);
  break;
  case XEN_PT_BAR_FLAG_UPPER:
 +r = d-io_regions[index-1];

Perhaps worth an assert(index  0)?

Jan

Re: [Qemu-devel] [PATCH for-2.5 14/30] m68k: allow adda/suba to add/sub word

2015-08-12 Thread Richard Henderson


On 08/09/2015 01:13 PM, Laurent Vivier wrote:

Signed-off-by: Laurent Vivierlaur...@vivier.eu
---
  target-m68k/translate.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson r...@twiddle.net

r~

Re: [Qemu-devel] [PATCH for-2.5 15/30] m68k: add more modes to movem

2015-08-12 Thread Richard Henderson


On 08/09/2015 01:13 PM, Laurent Vivier wrote:

+opsize = (insn  0x40) != 0 ? OS_LONG : OS_WORD;
+incr = opsize_bytes(opsize);
+if (!is_load  (insn  070) == 040) {
+for (i = 15; i = 0; i--, mask = 1) {


This has got to be wrong.  Just because it's pre-decrement doesn't mean you 
should skip all of the loads.



r~

Re: [Qemu-devel] [PATCH] linux-user: elfload: Still use TARGET_PAGE_SIZE for i386 guest

2015-08-12 Thread gchen gchen

On 2015年08月12日 15:59, gchen gchen wrote:
 On 2015年08月12日 14:45, Richard Henderson wrote:
 On 08/11/2015 11:03 PM, gchen gchen wrote:
 Under Alpha host, for ubuntu12.04.5 i386 guest, it will cause failure:
 Invalid ELF image for this architecture.

 The related issue commit is a70daba linux-user: Tell guest about big
 host page sizes.

 Signed-off-by: Chen Ganggang.chen.5...@gmail.com
 ---
   linux-user/elfload.c | 4 
   1 file changed, 4 insertions(+)

 Nack.  There's 99 problems with host page size guest page size.  This
 solves none of them, and in the hackiest way possible.


 Under alpha virtual machine, if set i386 guest page size 8KB, it will
 cause failure directly (any dynamically linked binaries can not work).

 This fix can let i386 bash/cp/vi/ls run under alpha virtual machine,
 although there maybe be many other issues (which I did not meet, now).

 Do you have any other ideas for solving this issue?  e.g. can we rebuild
 all i386 programs with segment alignment to 8KB? I am not quite sure,
 but some i386 programs (which are interested in) have no source code!
 :-(.


By the way, the patches about the alpha are my current work, I can do
it during my work time, it will have no any negative effect with my
tilegx linux-user development in my free time. :-)

One of my current work contents are: run ubuntu12.04.5 (no graphic)
under alpha virtual machine, and run wine under ubuntu12.04.5 i386
guest under ubuntu alpha vm, and wine will run Win32 graphic programs.

 - I want to x86_64 laptop run Alpha linux-user run i386 linux-user run
   wine run Win32 graphic programs. I guess, I need build X libs under
   ubuntu Alpha vm (which nographic) -- use them for alpha linux-user.

 - At present, x86_64 laptop run arm linux-user run i386 linux-user run
   wine run Windows i386 graphic programs are OK, the performance is
   acceptable! the qemu code keeps no touch -- our qemu is very good!!!.


Welcome any ideas, suggestions and completions.

Thanks.
--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

Re: [Qemu-devel] [PATCH 1/2] migration: do cleanup operation after completion

2015-08-12 Thread Li, Liang Z

 
 On 12/08/2015 23:04, Liang Li wrote:
  @@ -1008,8 +1009,10 @@ static void *migration_thread(void *opaque)
   }
 
   qemu_mutex_lock_iothread();
  +end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
  +qemu_savevm_state_cancel();
  +
 
 You can remove the qemu_savevm_state_cancel() call from
 migrate_fd_cleanup, too.  Probably best to post a v2 with that change as well.
 
 Paolo

You are right. Done.

Liang

[Qemu-devel] [PATCH 1/2] migration: do cleanup operation after completion

2015-08-12 Thread Liang Li

Because of the patch 3ea3b7fa9af067982f34b of kvm, now the migration_end()
is a time consuming operation, which takes about dozens of milliseconds, and
will prolong VM downtime. Such an operation should be done after migration
completion.

For a VM with 8G RAM, this patch can reduce the VM downtime about 32 ms during
live migration.

Signed-off-by: Liang Li liang.z...@intel.com
---
 migration/block.c | 1 -
 migration/migration.c | 5 -
 migration/ram.c   | 1 -
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/migration/block.c b/migration/block.c
index ed865ed..85496fd 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -750,7 +750,6 @@ static int block_save_complete(QEMUFile *f, void *opaque)
 
 qemu_put_be64(f, BLK_MIG_FLAG_EOS);
 
-blk_mig_cleanup();
 return 0;
 }
 
diff --git a/migration/migration.c b/migration/migration.c
index 662e77e..c22095e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -923,6 +923,7 @@ static void *migration_thread(void *opaque)
 int64_t initial_bytes = 0;
 int64_t max_size = 0;
 int64_t start_time = initial_time;
+int64_t end_time;
 bool old_vm_running = false;
 
 rcu_register_thread();
@@ -1008,8 +1009,10 @@ static void *migration_thread(void *opaque)
 }
 
 qemu_mutex_lock_iothread();
+end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+qemu_savevm_state_cancel();
+
 if (s-state == MIGRATION_STATUS_COMPLETED) {
-int64_t end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 uint64_t transferred_bytes = qemu_ftell(s-file);
 s-total_time = end_time - s-total_time;
 s-downtime = end_time - start_time;
diff --git a/migration/ram.c b/migration/ram.c
index 7f007e6..6249f6e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1269,7 +1269,6 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 
 rcu_read_unlock();
 
-migration_end();
 qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
 
 return 0;
-- 
1.9.1

[Qemu-devel] [PATCH 2/2] migration: rename qemu_savevm_state_cancel

2015-08-12 Thread Liang Li

The function qemu_savevm_state_cancel is called after the migration
in migration_thread, it seems strange to 'cancel' it after completion,
rename it to qemu_savevm_state_cleanup looks better.

Signed-off-by: Liang Li liang.z...@intel.com
---
 include/sysemu/sysemu.h | 2 +-
 migration/migration.c   | 4 ++--
 migration/savevm.c  | 6 +++---
 trace-events| 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 44570d1..9cc0240 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -88,7 +88,7 @@ void qemu_savevm_state_begin(QEMUFile *f,
 void qemu_savevm_state_header(QEMUFile *f);
 int qemu_savevm_state_iterate(QEMUFile *f);
 void qemu_savevm_state_complete(QEMUFile *f);
-void qemu_savevm_state_cancel(void);
+void qemu_savevm_state_cleanup(void);
 uint64_t qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size);
 int qemu_loadvm_state(QEMUFile *f);
 
diff --git a/migration/migration.c b/migration/migration.c
index c22095e..e7fe50f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -561,7 +561,7 @@ static void migrate_fd_cleanup(void *opaque)
 assert(s-state != MIGRATION_STATUS_ACTIVE);
 
 if (s-state != MIGRATION_STATUS_COMPLETED) {
-qemu_savevm_state_cancel();
+qemu_savevm_state_cleanup();
 if (s-state == MIGRATION_STATUS_CANCELLING) {
 migrate_set_state(s, MIGRATION_STATUS_CANCELLING,
   MIGRATION_STATUS_CANCELLED);
@@ -1010,7 +1010,7 @@ static void *migration_thread(void *opaque)
 
 qemu_mutex_lock_iothread();
 end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
-qemu_savevm_state_cancel();
+qemu_savevm_state_cleanup();
 
 if (s-state == MIGRATION_STATUS_COMPLETED) {
 uint64_t transferred_bytes = qemu_ftell(s-file);
diff --git a/migration/savevm.c b/migration/savevm.c
index 6071215..b141b17 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -905,11 +905,11 @@ uint64_t qemu_savevm_state_pending(QEMUFile *f, uint64_t 
max_size)
 return ret;
 }
 
-void qemu_savevm_state_cancel(void)
+void qemu_savevm_state_cleanup(void)
 {
 SaveStateEntry *se;
 
-trace_savevm_state_cancel();
+trace_savevm_state_cleanup();
 QTAILQ_FOREACH(se, savevm_state.handlers, entry) {
 if (se-ops  se-ops-cancel) {
 se-ops-cancel(se-opaque);
@@ -946,7 +946,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
 ret = qemu_file_get_error(f);
 }
 if (ret != 0) {
-qemu_savevm_state_cancel();
+qemu_savevm_state_cleanup();
 error_setg_errno(errp, -ret, Error while writing VM state);
 }
 return ret;
diff --git a/trace-events b/trace-events
index 94bf3bb..102373c 100644
--- a/trace-events
+++ b/trace-events
@@ -1195,7 +1195,7 @@ savevm_state_begin(void) 
 savevm_state_header(void) 
 savevm_state_iterate(void) 
 savevm_state_complete(void) 
-savevm_state_cancel(void) 
+savevm_state_cleanup(void) 
 vmstate_save(const char *idstr, const char *vmsd_name) %s, %s
 vmstate_load(const char *idstr, const char *vmsd_name) %s, %s
 qemu_announce_self_iter(const char *mac) %s
-- 
1.9.1

[Qemu-devel] [PATCH] MAINTAINERS: list smbios maintainers

2015-08-12 Thread Michael S. Tsirkin

Now that smbios has its own directory, list its
maintainers. Same people as ACPI so just reuse that
entry.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 MAINTAINERS | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 978b717..a059d5d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -645,13 +645,15 @@ S: Supported
 F: include/hw/pci/*
 F: hw/pci/*
 
-ACPI
+ACPI/SMBIOS
 M: Michael S. Tsirkin m...@redhat.com
 M: Igor Mammedov imamm...@redhat.com
 S: Supported
 F: include/hw/acpi/*
+F: include/hw/smbios/*
 F: hw/mem/*
 F: hw/acpi/*
+F: hw/smbios/*
 F: hw/i386/acpi-build.[hc]
 F: hw/i386/*dsl
 F: hw/arm/virt-acpi-build.c
-- 
MST

Re: [Qemu-devel] [PATCH v6 0/2] vhost user: Add live migration

2015-08-12 Thread Marc-André Lureau

Hi

On Wed, Aug 12, 2015 at 9:25 AM, Michael S. Tsirkin m...@redhat.com wrote:
 I think these patches need to be rebased on top of Marc Andre's ones,
 and use protocol flags to negotiate capabilities.
 Right?

Correct. His patches should be applied before my migration tests, though.

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH 0/2] Fix long vm downtime during live migration

2015-08-12 Thread Paolo Bonzini



On 12/08/2015 23:04, Liang Li wrote:
 Some cleanup operations take long time during the pause and copy stage,
 especially with the KVM patch 3ea3b7fa9af067, do these operation after
 the completion of live migration can help to reduce VM downtime.
 
 
 Liang Li (2):
   migration: do cleanup operation after completion
   migration: rename qemu_savevm_state_cancel
 
  include/sysemu/sysemu.h | 2 +-
  migration/block.c   | 1 -
  migration/migration.c   | 7 +--
  migration/ram.c | 1 -
  migration/savevm.c  | 6 +++---
  trace-events| 2 +-
  6 files changed, 10 insertions(+), 9 deletions(-)
 

That's really a clever solution!

Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Cc: qemu-sta...@nongnu.org

[Qemu-devel] [PATCH] virtio/vhost: drop unnecessary VHOST_SET_VRING call

2015-08-12 Thread Marcel Apfelbaum

No need to send VHOST_SET_VRING_CALL to backend
before the negotiation with the guest is finished.

Signed-off-by: Marcel Apfelbaum mar...@redhat.com
---
 hw/virtio/vhost.c | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2712c6f..b448542 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -875,24 +875,13 @@ static void vhost_eventfd_del(MemoryListener *listener,
 static int vhost_virtqueue_init(struct vhost_dev *dev,
 struct vhost_virtqueue *vq, int n)
 {
-struct vhost_vring_file file = {
-.index = n,
-};
 int r = event_notifier_init(vq-masked_notifier, 0);
+
 if (r  0) {
 return r;
 }
 
-file.fd = event_notifier_get_fd(vq-masked_notifier);
-r = dev-vhost_ops-vhost_call(dev, VHOST_SET_VRING_CALL, file);
-if (r) {
-r = -errno;
-goto fail_call;
-}
 return 0;
-fail_call:
-event_notifier_cleanup(vq-masked_notifier);
-return r;
 }
 
 static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
-- 
2.1.0

Re: [Qemu-devel] [PATCH] virtio/vhost: drop unnecessary VHOST_SET_VRING call

2015-08-12 Thread Michael S. Tsirkin

On Wed, Aug 12, 2015 at 01:19:51PM +0300, Marcel Apfelbaum wrote:
 No need to send VHOST_SET_VRING_CALL to backend
 before the negotiation with the guest is finished.
 
 Signed-off-by: Marcel Apfelbaum mar...@redhat.com

Well - we do need to set it to the masked notifier initially to avoid
losing events.  You can't just drop it - need to move this call
somewhere else.

 ---
  hw/virtio/vhost.c | 13 +
  1 file changed, 1 insertion(+), 12 deletions(-)
 
 diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
 index 2712c6f..b448542 100644
 --- a/hw/virtio/vhost.c
 +++ b/hw/virtio/vhost.c
 @@ -875,24 +875,13 @@ static void vhost_eventfd_del(MemoryListener *listener,
  static int vhost_virtqueue_init(struct vhost_dev *dev,
  struct vhost_virtqueue *vq, int n)
  {
 -struct vhost_vring_file file = {
 -.index = n,
 -};
  int r = event_notifier_init(vq-masked_notifier, 0);
 +
  if (r  0) {
  return r;
  }
  
 -file.fd = event_notifier_get_fd(vq-masked_notifier);
 -r = dev-vhost_ops-vhost_call(dev, VHOST_SET_VRING_CALL, file);
 -if (r) {
 -r = -errno;
 -goto fail_call;
 -}
  return 0;
 -fail_call:
 -event_notifier_cleanup(vq-masked_notifier);
 -return r;
  }
  
  static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
 -- 
 2.1.0

Re: [Qemu-devel] [ARM SMBIOS V3 PATCH 4/5] smbios: add smbios 3.0 support

2015-08-12 Thread Michael S. Tsirkin

On Tue, Aug 11, 2015 at 10:08:21PM -0400, Wei Huang wrote:
 This patch adds support for SMBIOS 3.0 entry point. When caller invokes
 smbios_set_defaults(), it can specify entry point as 2.1 or 3.0. Then
 smbios_get_tables() will return the entry point table in right format.
 
 Acked-by: Gabriel Somlo so...@cmu.edu
 Tested-by: Gabriel Somlo so...@cmu.edu
 Tested-by: Leif Lindholm leif.lindh...@linaro.org
 Signed-off-by: Wei Huang w...@redhat.com
 ---
  hw/i386/pc_piix.c  |  3 +-
  hw/i386/pc_q35.c   |  3 +-
  hw/smbios/smbios.c | 83 
 +-
  include/hw/smbios/smbios.h | 51 
  4 files changed, 101 insertions(+), 39 deletions(-)
 
 diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
 index 653c710..04636b1 100644
 --- a/hw/i386/pc_piix.c
 +++ b/hw/i386/pc_piix.c
 @@ -173,7 +173,8 @@ static void pc_init1(MachineState *machine)
  MachineClass *mc = MACHINE_GET_CLASS(machine);
  /* These values are guest ABI, do not change */
  smbios_set_defaults(QEMU, Standard PC (i440FX + PIIX, 1996),
 -mc-name, smbios_legacy_mode, 
 smbios_uuid_encoded);
 +mc-name, smbios_legacy_mode, 
 smbios_uuid_encoded,
 +SMBIOS_ENTRY_POINT_21);
  }
  
  /* allocate ram and load rom/bios */
 diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
 index d83df14..061507d 100644
 --- a/hw/i386/pc_q35.c
 +++ b/hw/i386/pc_q35.c
 @@ -165,7 +165,8 @@ static void pc_q35_init(MachineState *machine)
  if (smbios_defaults) {
  /* These values are guest ABI, do not change */
  smbios_set_defaults(QEMU, Standard PC (Q35 + ICH9, 2009),
 -mc-name, smbios_legacy_mode, 
 smbios_uuid_encoded);
 +mc-name, smbios_legacy_mode, 
 smbios_uuid_encoded,
 +SMBIOS_ENTRY_POINT_21);
  }
  
  /* allocate ram and load rom/bios */
 diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
 index efdbb5d..de3cab5 100644
 --- a/hw/smbios/smbios.c
 +++ b/hw/smbios/smbios.c
 @@ -55,7 +55,10 @@ static uint8_t *smbios_tables;
  static size_t smbios_tables_len;
  static unsigned smbios_table_max;
  static unsigned smbios_table_cnt;
 -static struct smbios_entry_point ep;
 +static SmbiosEntryPointType smbios_ep_type = SMBIOS_ENTRY_POINT_21;
 +
 +static SmbiosEntryPoint ep;
 +static size_t ep_length;

I think it's better to drop ep_length and write a function that
retrieves length from ep, based on anchor string.  Don't duplicate
information.


  
  static int smbios_type4_count = 0;
  static bool smbios_immutable;
 @@ -771,11 +774,12 @@ void smbios_set_cpuid(uint32_t version, uint32_t 
 features)
  
  void smbios_set_defaults(const char *manufacturer, const char *product,
   const char *version, bool legacy_mode,
 - bool uuid_encoded)
 + bool uuid_encoded, SmbiosEntryPointType ep_type)
  {
  smbios_have_defaults = true;
  smbios_legacy = legacy_mode;
  smbios_uuid_encoded = uuid_encoded;
 +smbios_ep_type = ep_type;
  
  /* drop unwanted version of command-line file blob(s) */
  if (smbios_legacy) {
 @@ -808,26 +812,59 @@ void smbios_set_defaults(const char *manufacturer, 
 const char *product,
  
  static void smbios_entry_point_setup(void)
  {
 -memcpy(ep.anchor_string, _SM_, 4);
 -memcpy(ep.intermediate_anchor_string, _DMI_, 5);
 -ep.length = sizeof(struct smbios_entry_point);
 -ep.entry_point_revision = 0; /* formatted_area reserved, per spec v2.1+ 
 */
 -memset(ep.formatted_area, 0, 5);
 -
 -/* compliant with smbios spec v2.8 */
 -ep.smbios_major_version = 2;
 -ep.smbios_minor_version = 8;
 -ep.smbios_bcd_revision = 0x28;
 -
 -/* set during table construction, but BIOS may override: */
 -ep.structure_table_length = cpu_to_le16(smbios_tables_len);
 -ep.max_structure_size = cpu_to_le16(smbios_table_max);
 -ep.number_of_structures = cpu_to_le16(smbios_table_cnt);
 -
 -/* BIOS must recalculate: */
 -ep.checksum = 0;
 -ep.intermediate_checksum = 0;
 -ep.structure_table_address = cpu_to_le32(0);
 +switch (smbios_ep_type) {
 +case SMBIOS_ENTRY_POINT_21:
 +memcpy(ep.ep21.anchor_string, _SM_, 4);
 +memcpy(ep.ep21.intermediate_anchor_string, _DMI_, 5);
 +ep.ep21.length = sizeof(struct smbios_21_entry_point);
 +ep.ep21.entry_point_revision = 0; /* formatted_area reserved */
 +memset(ep.ep21.formatted_area, 0, 5);
 +
 +/* compliant with smbios spec v2.8 */
 +ep.ep21.smbios_major_version = 2;
 +ep.ep21.smbios_minor_version = 8;
 +ep.ep21.smbios_bcd_revision = 0x28;
 +
 +/* set during table construction, but BIOS may override: */
 +ep.ep21.structure_table_length = cpu_to_le16(smbios_tables_len);
 +

[Qemu-devel] [PATCH 0/2] Fix long vm downtime during live migration

2015-08-12 Thread Liang Li

Some cleanup operations take long time during the pause and copy stage,
especially with the KVM patch 3ea3b7fa9af067, do these operation after
the completion of live migration can help to reduce VM downtime.


Liang Li (2):
  migration: do cleanup operation after completion
  migration: rename qemu_savevm_state_cancel

 include/sysemu/sysemu.h | 2 +-
 migration/block.c   | 1 -
 migration/migration.c   | 7 +--
 migration/ram.c | 1 -
 migration/savevm.c  | 6 +++---
 trace-events| 2 +-
 6 files changed, 10 insertions(+), 9 deletions(-)

-- 
1.9.1

Re: [Qemu-devel] [RFC PATCH V7 09/19] Drop global lock during TCG code execution

2015-08-12 Thread Paolo Bonzini



On 11/08/2015 23:34, Frederic Konrad wrote:

 Also if qemu_cond_broadcast(qemu_io_proceeded_cond) is being dropped
 there is no point keeping the guff around in qemu_tcg_wait_io_event.

 Yes good point.
 
 BTW this leads to high consumption of host CPU eg: 100% per VCPU thread as
 the VCPUs thread are no longer waiting for qemu_io_proceeded_cond.

If the guest CPU is busy waiting, that's expected.  But if the guest CPU
is halted, it should not have 100% host CPU consumption.

Paolo

Re: [Qemu-devel] [PATCH] monitor: remove QAPI_EVENT_VSERPORT_CHANGE throttle

2015-08-12 Thread Daniel P. Berrange

On Tue, Aug 11, 2015 at 08:21:18PM +0200, Laszlo Ersek wrote:
 On 08/11/15 19:04, marcandre.lur...@redhat.com wrote:
  From: Marc-André Lureau marcandre.lur...@redhat.com
  
  QAPI_EVENT_VSERPORT_CHANGE reports changes of a virtio serial port
  state. However, the events may be for different ports, but the throttle
  mechanism may replace the event for a different port, since it only
  checks the event type.
  
  libvirt relies on a correct state to be reported for all channels: the
  qemu-ga commands may no longer work if the state is reported
  disconnected. This can be triggered easily by having more than 1
  virtio-serial (qemu-ga + spice agent for example), and restarting
  quickly daemons or more realistically going quickly in and out of
  suspend.
  
  In a future patch, we may want to throttle events based on their
  arguments, but this will likely require dynamic allocations and more
  complicated code to insert/lookup pending events based on various
  arguments (id in QAPI_EVENT_VSERPORT_CHANGE case).
  
  Fixes:
  https://bugzilla.redhat.com/show_bug.cgi?id=1244064
  
  Signed-off-by: Marc-André Lureau marcandre.lur...@redhat.com
  ---
   monitor.c | 1 -
   1 file changed, 1 deletion(-)
  
  diff --git a/monitor.c b/monitor.c
  index aeea2b5..e4d56f7 100644
  --- a/monitor.c
  +++ b/monitor.c
  @@ -558,7 +558,6 @@ static void monitor_qapi_event_init(void)
   monitor_qapi_event_throttle(QAPI_EVENT_BALLOON_CHANGE, 1000);
   monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000);
   monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000);
  -monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000);
   
   qmp_event_set_func_emit(monitor_qapi_event_queue);
   }
  
 
 I don't mind the change (and the point of argument sensitivity is not
 lost on me), but note that this undoes the protection that is spelled
 out in the leading comment of the function, not visible in the context here:
 
 /* Limit guest-triggerable events to 1 per second */
 
 That was probably put in place in order to prevent a malicious guest
 from spamming the log files (and the CPU usage) of the management apps.
 
 One solution to that would be arg sensitivity. Another would be a
 burst-capable (ie. a slowly re-filling, limited size token bucket)
 throttle, maintained per event type.
 
 Until one of those gets written, I guess this patch is acceptable -- as
 long as mgmt people are okay with it. Daniel seems to be, so I don't mind.

Not having rate limiting is certainly an undesirable situation, but
rate limiting which discards unrecoverable data is even worse as it
makes the event useless for the app. We've got a number of events which
are not rate limited for this reason. Ultimately we do need to figure
out a better way to rate limit such events, but until then we have no
option but to skip rate limiting for events which have this issue.

NB, some events which have args are acceptable to rate limit. eg the
balloon change event is fine, because apps generally only care about
the /current/ balloon level, so if 3 balloon events are emitted and
the first 2 are discarded that is fine, because the most recent balloon
level is the only one that matters.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [PATCH] virtio/vhost: drop unnecessary VHOST_SET_VRING call

2015-08-12 Thread Michael S. Tsirkin

On Wed, Aug 12, 2015 at 02:10:56PM +0300, Marcel Apfelbaum wrote:
 On 08/12/2015 01:34 PM, Michael S. Tsirkin wrote:
 On Wed, Aug 12, 2015 at 01:19:51PM +0300, Marcel Apfelbaum wrote:
 No need to send VHOST_SET_VRING_CALL to backend
 before the negotiation with the guest is finished.
 
 Signed-off-by: Marcel Apfelbaum mar...@redhat.com
 
 Well - we do need to set it to the masked notifier initially to avoid
 losing events.  You can't just drop it - need to move this call
 somewhere else.
 What do we need to set?
 I just dropped the call to VHOST_SET_VRING_CALL.
 
 Thanks,
 Marcel

We use two eventfds: masked and unmasked one.
We switch dynamically dependent on msi mask value.
Code assumes we start out masked, so we need to match that.


 
 ---
   hw/virtio/vhost.c | 13 +
   1 file changed, 1 insertion(+), 12 deletions(-)
 
 diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
 index 2712c6f..b448542 100644
 --- a/hw/virtio/vhost.c
 +++ b/hw/virtio/vhost.c
 @@ -875,24 +875,13 @@ static void vhost_eventfd_del(MemoryListener 
 *listener,
   static int vhost_virtqueue_init(struct vhost_dev *dev,
   struct vhost_virtqueue *vq, int n)
   {
 -struct vhost_vring_file file = {
 -.index = n,
 -};
   int r = event_notifier_init(vq-masked_notifier, 0);
 +
   if (r  0) {
   return r;
   }
 
 -file.fd = event_notifier_get_fd(vq-masked_notifier);
 -r = dev-vhost_ops-vhost_call(dev, VHOST_SET_VRING_CALL, file);
 -if (r) {
 -r = -errno;
 -goto fail_call;
 -}
   return 0;
 -fail_call:
 -event_notifier_cleanup(vq-masked_notifier);
 -return r;
   }
 
   static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
 --
 2.1.0

Re: [Qemu-devel] [PATCH 1/2] migration: do cleanup operation after completion

2015-08-12 Thread Paolo Bonzini



On 12/08/2015 23:04, Liang Li wrote:
 @@ -1008,8 +1009,10 @@ static void *migration_thread(void *opaque)
  }
  
  qemu_mutex_lock_iothread();
 +end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 +qemu_savevm_state_cancel();
 +

You can remove the qemu_savevm_state_cancel() call from
migrate_fd_cleanup, too.  Probably best to post a v2 with that change as
well.

Paolo

Re: [Qemu-devel] [PATCH] virtio/vhost: drop unnecessary VHOST_SET_VRING call

2015-08-12 Thread Marcel Apfelbaum


On 08/12/2015 01:34 PM, Michael S. Tsirkin wrote:

On Wed, Aug 12, 2015 at 01:19:51PM +0300, Marcel Apfelbaum wrote:

No need to send VHOST_SET_VRING_CALL to backend
before the negotiation with the guest is finished.

Signed-off-by: Marcel Apfelbaum mar...@redhat.com


Well - we do need to set it to the masked notifier initially to avoid
losing events.  You can't just drop it - need to move this call
somewhere else.

What do we need to set?
I just dropped the call to VHOST_SET_VRING_CALL.

Thanks,
Marcel




---
  hw/virtio/vhost.c | 13 +
  1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2712c6f..b448542 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -875,24 +875,13 @@ static void vhost_eventfd_del(MemoryListener *listener,
  static int vhost_virtqueue_init(struct vhost_dev *dev,
  struct vhost_virtqueue *vq, int n)
  {
-struct vhost_vring_file file = {
-.index = n,
-};
  int r = event_notifier_init(vq-masked_notifier, 0);
+
  if (r  0) {
  return r;
  }

-file.fd = event_notifier_get_fd(vq-masked_notifier);
-r = dev-vhost_ops-vhost_call(dev, VHOST_SET_VRING_CALL, file);
-if (r) {
-r = -errno;
-goto fail_call;
-}
  return 0;
-fail_call:
-event_notifier_cleanup(vq-masked_notifier);
-return r;
  }

  static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
--
2.1.0

Re: [Qemu-devel] [PATCH for-2.5 16/30] m68k: Add all access modes and data sizes to some 680x0 instructions

2015-08-12 Thread Richard Henderson

On 08/09/2015 01:13 PM, Laurent Vivier wrote:
  case 6: /* cmpi */
  tcg_gen_mov_i32(dest, src1);
  tcg_gen_subi_i32(dest, dest, im);
  gen_update_cc_add(dest, tcg_const_i32(im));
 -set_cc_op(s, CC_OP_SUB);
 +SET_CC_OP(opsize, SUB);
  break;
  default:
  abort();
  }
 -if (op != 6) {
 -DEST_EA(env, insn, OS_LONG, dest, addr);
 -}
 +DEST_EA(env, insn, opsize, dest, addr);

It appears you've lost the if that doesn't write back the result of compare.
Did you intend to change the case 6 break to a return?


r~

[Qemu-devel] [PATCH 02/10] cpus: remove tcg_halt_cond global variable.

2015-08-12 Thread Paolo Bonzini

From: KONRAD Frederic fred.kon...@greensocs.com

This removes tcg_halt_cond global variable.
We need one QemuCond per virtual cpu for multithread TCG.

Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com
Message-Id: 1439220437-23957-9-git-send-email-fred.kon...@greensocs.com
[Keep tcg_halt_cond for bisectability, while making it static. - Paolo]
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/cpus.c b/cpus.c
index 9224488..8884278 100644
--- a/cpus.c
+++ b/cpus.c
@@ -813,7 +813,6 @@ static unsigned iothread_requesting_mutex;
 static QemuThread io_thread;
 
 static QemuThread *tcg_cpu_thread;
-static QemuCond *tcg_halt_cond;
 
 /* cpu creation */
 static QemuCond qemu_cpu_cond;
@@ -933,15 +932,13 @@ static void qemu_wait_io_event_common(CPUState *cpu)
 cpu-thread_kicked = false;
 }
 
-static void qemu_tcg_wait_io_event(void)
+static void qemu_tcg_wait_io_event(CPUState *cpu)
 {
-CPUState *cpu;
-
 while (all_cpu_threads_idle()) {
/* Start accounting real time to the virtual clock if the CPUs
   are idle.  */
 qemu_clock_warp(QEMU_CLOCK_VIRTUAL);
-qemu_cond_wait(tcg_halt_cond, qemu_global_mutex);
+qemu_cond_wait(cpu-halt_cond, qemu_global_mutex);
 }
 
 while (iothread_requesting_mutex) {
@@ -1067,7 +1064,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 
 /* wait for initial kick-off after machine start */
 while (first_cpu-stopped) {
-qemu_cond_wait(tcg_halt_cond, qemu_global_mutex);
+qemu_cond_wait(first_cpu-halt_cond, qemu_global_mutex);
 
 /* process any pending work */
 CPU_FOREACH(cpu) {
@@ -1088,7 +1085,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
 }
 }
-qemu_tcg_wait_io_event();
+qemu_tcg_wait_io_event(QTAILQ_FIRST(cpus));
 }
 
 return NULL;
@@ -1265,6 +1262,7 @@ void resume_all_vcpus(void)
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
 char thread_name[VCPU_THREAD_NAME_SIZE];
+static QemuCond *tcg_halt_cond;
 
 tcg_cpu_address_space_init(cpu, cpu-as);
 
-- 
1.8.3.1

[Qemu-devel] [PATCH 01/10] cpus: protect work list with work_mutex

2015-08-12 Thread Paolo Bonzini

From: KONRAD Frederic fred.kon...@greensocs.com

Protect the list of queued work items with something other than
the BQL, as a preparation for running the work items outside it.

Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c| 22 ++
 include/qom/cpu.h |  6 +-
 qom/cpu.c |  1 +
 3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/cpus.c b/cpus.c
index c1e74d9..9224488 100644
--- a/cpus.c
+++ b/cpus.c
@@ -845,6 +845,8 @@ void run_on_cpu(CPUState *cpu, void (*func)(void *data), 
void *data)
 wi.func = func;
 wi.data = data;
 wi.free = false;
+
+qemu_mutex_lock(cpu-work_mutex);
 if (cpu-queued_work_first == NULL) {
 cpu-queued_work_first = wi;
 } else {
@@ -853,9 +855,10 @@ void run_on_cpu(CPUState *cpu, void (*func)(void *data), 
void *data)
 cpu-queued_work_last = wi;
 wi.next = NULL;
 wi.done = false;
+qemu_mutex_unlock(cpu-work_mutex);
 
 qemu_cpu_kick(cpu);
-while (!wi.done) {
+while (!atomic_mb_read(wi.done)) {
 CPUState *self_cpu = current_cpu;
 
 qemu_cond_wait(qemu_work_cond, qemu_global_mutex);
@@ -876,6 +879,8 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void 
*data), void *data)
 wi-func = func;
 wi-data = data;
 wi-free = true;
+
+qemu_mutex_lock(cpu-work_mutex);
 if (cpu-queued_work_first == NULL) {
 cpu-queued_work_first = wi;
 } else {
@@ -884,6 +889,7 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void 
*data), void *data)
 cpu-queued_work_last = wi;
 wi-next = NULL;
 wi-done = false;
+qemu_mutex_unlock(cpu-work_mutex);
 
 qemu_cpu_kick(cpu);
 }
@@ -896,15 +902,23 @@ static void flush_queued_work(CPUState *cpu)
 return;
 }
 
-while ((wi = cpu-queued_work_first)) {
+qemu_mutex_lock(cpu-work_mutex);
+while (cpu-queued_work_first != NULL) {
+wi = cpu-queued_work_first;
 cpu-queued_work_first = wi-next;
+if (!cpu-queued_work_first) {
+cpu-queued_work_last = NULL;
+}
+qemu_mutex_unlock(cpu-work_mutex);
 wi-func(wi-data);
-wi-done = true;
+qemu_mutex_lock(cpu-work_mutex);
 if (wi-free) {
 g_free(wi);
+} else {
+atomic_mb_set(wi-done, true);
 }
 }
-cpu-queued_work_last = NULL;
+qemu_mutex_unlock(cpu-work_mutex);
 qemu_cond_broadcast(qemu_work_cond);
 }
 
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 39712ab..77bbff2 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -244,6 +244,8 @@ struct kvm_run;
  * @mem_io_pc: Host Program Counter at which the memory was accessed.
  * @mem_io_vaddr: Target virtual address at which the memory was accessed.
  * @kvm_fd: vCPU file descriptor for KVM.
+ * @work_mutex: Lock to prevent multiple access to queued_work_*.
+ * @queued_work_first: First asynchronous work pending.
  *
  * State of one CPU core or thread.
  */
@@ -264,7 +266,6 @@ struct CPUState {
 uint32_t host_tid;
 bool running;
 struct QemuCond *halt_cond;
-struct qemu_work_item *queued_work_first, *queued_work_last;
 bool thread_kicked;
 bool created;
 bool stop;
@@ -275,6 +276,9 @@ struct CPUState {
 int64_t icount_extra;
 sigjmp_buf jmp_env;
 
+QemuMutex work_mutex;
+struct qemu_work_item *queued_work_first, *queued_work_last;
+
 AddressSpace *as;
 struct AddressSpaceDispatch *memory_dispatch;
 MemoryListener *tcg_as_listener;
diff --git a/qom/cpu.c b/qom/cpu.c
index 62f4b5d..3e93223 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -314,6 +314,7 @@ static void cpu_common_initfn(Object *obj)
 
 cpu-cpu_index = -1;
 cpu-gdb_num_regs = cpu-gdb_num_g_regs = cc-gdb_num_core_regs;
+qemu_mutex_init(cpu-work_mutex);
 QTAILQ_INIT(cpu-breakpoints);
 QTAILQ_INIT(cpu-watchpoints);
 }
-- 
1.8.3.1

[Qemu-devel] [Bug 1297218] Re: guest hangs after live migration due to tsc jump

2015-08-12 Thread Serge Hallyn

Could someone confirm whether this is fixed in 15.04 and/or 15.10?

** Changed in: qemu (Ubuntu)
   Status: Triaged = Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1297218

Title:
  guest hangs after live migration due to tsc jump

Status in QEMU:
  New
Status in glusterfs package in Ubuntu:
  Invalid
Status in qemu package in Ubuntu:
  Incomplete
Status in glusterfs source package in Trusty:
  Confirmed
Status in qemu source package in Trusty:
  Confirmed

Bug description:
  We have two identical Ubuntu servers running libvirt/kvm/qemu, sharing
  a Gluster filesystem. Guests can be live migrated between them.
  However, live migration often leads to the guest being stuck at 100%
  for a while. In that case, the dmesg output for such a guest will show
  (once it recovers): Clocksource tsc unstable (delta = 662463064082
  ns). In this particular example, a guest was migrated and only after
  11 minutes (662 seconds) did it become responsive again.

  It seems that newly booted guests doe not suffer from this problem,
  these can be migrated back and forth at will. After a day or so, the
  problem becomes apparent. It also seems that migrating from server A
  to server B causes much more problems than going from B back to A. If
  necessary, I can do more measurements to qualify these observations.

  The VM servers run Ubuntu 13.04 with these packages:
  Kernel: 3.8.0-35-generic x86_64
  Libvirt: 1.0.2
  Qemu: 1.4.0
  Gluster-fs: 3.4.2 (libvirt access the images via the filesystem, not using 
libgfapi yet as the Ubuntu libvirt is not linked against libgfapi).
  The interconnect between both machines (both for migration and gluster) is 
10GbE. 
  Both servers are synced to NTP and well within 1ms form one another.

  Guests are either Ubuntu 13.04 or 13.10.

  On the guests, the current_clocksource is kvm-clock.
  The XML definition of the guests only contains:  clock offset='utc'/ 

  Now as far as I've read in the documentation of kvm-clock, it specifically 
supports live migrations, so I'm a bit surprised at these problems. There isn't 
all that much information to find on these issue, although I have found 
postings by others that seem to have run into the same issues, but without a 
solution.
  --- 
  ApportVersion: 2.14.1-0ubuntu3
  Architecture: amd64
  DistroRelease: Ubuntu 14.04
  Package: libvirt (not installed)
  ProcCmdline: BOOT_IMAGE=/boot/vmlinuz-3.13.0-24-generic 
root=UUID=1b0c3c6d-a9b8-4e84-b076-117ae267d178 ro console=ttyS1,115200n8 
BOOTIF=01-00-25-90-75-b5-c8
  ProcVersionSignature: Ubuntu 3.13.0-24.47-generic 3.13.9
  Tags:  trusty apparmor apparmor apparmor apparmor apparmor
  Uname: Linux 3.13.0-24-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: True
  modified.conffile..etc.default.libvirt.bin: [modified]
  modified.conffile..etc.libvirt.libvirtd.conf: [modified]
  modified.conffile..etc.libvirt.qemu.conf: [modified]
  modified.conffile..etc.libvirt.qemu.networks.default.xml: [deleted]
  mtime.conffile..etc.default.libvirt.bin: 2014-05-12T19:07:40.020662
  mtime.conffile..etc.libvirt.libvirtd.conf: 2014-05-13T14:40:25.894837
  mtime.conffile..etc.libvirt.qemu.conf: 2014-05-12T18:58:27.885506

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1297218/+subscriptions

Re: [Qemu-devel] [PATCH for-2.5 17/30] m68k: ori/andi/subi/addi/eori/cmpi can modify SR/CCR

2015-08-12 Thread Richard Henderson

On 08/09/2015 01:13 PM, Laurent Vivier wrote:
 Signed-off-by: Laurent Vivier laur...@vivier.eu
 ---
  target-m68k/translate.c | 95 
 ++---
  1 file changed, 58 insertions(+), 37 deletions(-)
 
 diff --git a/target-m68k/translate.c b/target-m68k/translate.c
 index 6a426e1..9e379b3 100644
 --- a/target-m68k/translate.c
 +++ b/target-m68k/translate.c
 @@ -1343,6 +1343,42 @@ DISAS_INSN(bitop_im)
  DEST_EA(env, insn, opsize, tmp, addr);
  }
  }
 +
 +static TCGv gen_get_ccr(DisasContext *s)
 +{
 +TCGv dest;
 +
 +gen_flush_flags(s);
 +dest = tcg_temp_new();
 +tcg_gen_shli_i32(dest, QREG_CC_X, 4);
 +tcg_gen_or_i32(dest, dest, QREG_CC_DEST);
 +return dest;
 +}
 +
 +static TCGv gen_get_sr(DisasContext *s)
 +{
 +TCGv ccr;
 +TCGv sr;
 +
 +ccr = gen_get_ccr(s);
 +sr = tcg_temp_new();
 +tcg_gen_andi_i32(sr, QREG_SR, 0xffe0);
 +tcg_gen_or_i32(sr, sr, ccr);
 +return sr;
 +}

Leaking the temporary produced by gen_get_ccr.

 +
 +static void gen_set_sr(DisasContext *s, TCGv val, int ccr_only)
 +{
 +TCGv tmp;
 +tmp = tcg_temp_new();
 +tcg_gen_andi_i32(QREG_CC_DEST, val, 0xf);
 +tcg_gen_shri_i32(tmp, val, 4);
 +tcg_gen_andi_i32(QREG_CC_X, tmp, 1);
 +if (!ccr_only) {
 +gen_helper_set_sr(cpu_env, val);
 +}
 +}

Leaking tmp.  And you don't even need to allocate it -- perform the shift into
QREG_CC_X.

 +
  DISAS_INSN(arith_im)
  {
  int op;
 @@ -1367,7 +1403,20 @@ DISAS_INSN(arith_im)
  default:
 abort();
  }
 -SRC_EA(env, src1, opsize, -1, (op == 6) ? NULL : addr);
 +if ((op == 0 || op == 1) 

The subject line is surely misleading, as this is only ANDI/ORI, right?  Again,
some more commentary would be helpful.

 +(insn  0x3f) == 0x3c) {
 +if (opsize == OS_BYTE) {
 +src1 = gen_get_ccr(s);
 +} else {
 +if (IS_USER(s)) {
 +gen_exception(s, s-pc - 2, EXCP_PRIVILEGE);
 +return;
 +}
 +src1 = gen_get_sr(s);
 +}
 +} else {
 +SRC_EA(env, src1, opsize, -1, (op == 6) ? NULL : addr);
 +}
  dest = tcg_temp_new();
  switch (op) {
  case 0: /* ori */
 @@ -1405,7 +1454,14 @@ DISAS_INSN(arith_im)
  default:
  abort();
  }
 -DEST_EA(env, insn, opsize, dest, addr);
 +if (op != 6) {
 +if ((op == 0 || op == 1) 
 +(insn  0x3f) == 0x3c) {
 +gen_set_sr(s, dest, opsize == OS_BYTE);
 +} else {
 +DEST_EA(env, insn, opsize, dest, addr);
 +}
 +}

That said, I think this code should be rearranged so that you don't have to
replicate so many conditionals.  In particular, the only thing of use in the
middle of import for the ccr insns are two lines: tcg_gen_andi/ori_tl.

I think it would be better to structure as

  if ((insn  0x3f) == 0x3c  (op == 0 || op == 1)) {
if (opsize == OS_BYTE) {
  src1 = gen_get_ccr ();
} else {
  ...
}
if (op == 0) {
  tcg_gen_ori_i32 ...
} else {
  tcg_gen_andi_i32 ...
}
gen_set_sr(s, dest, opsize == OS_BYTE);
return;
  }

  // existing code


r~

[Qemu-devel] [Bug 1163034] Re: linux-user mode can't handle guest setting RLIMIT_AS (hangs running gnutls28 configure check code)

2015-08-12 Thread Serge Hallyn

This will come in when implemented upstream.

** Changed in: qemu (Ubuntu)
   Importance: High = Wishlist

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1163034

Title:
  linux-user mode can't handle guest setting RLIMIT_AS (hangs running
  gnutls28 configure check code)

Status in QEMU:
  Confirmed
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  Please look at
  
https://code.launchpad.net/~costamagnagianfranco/+archive/costamagnagianfranco-ppa/+packages
  and
  
https://code.launchpad.net/~costamagnagianfranco/+archive/costamagnagianfranco-ppa/+build/4457434

  I cannot make gnutls28 build on armhf, I suspect a builder problem

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1163034/+subscriptions

Re: [Qemu-devel] [PATCH for-2.5 19/30] m68k: add cmpm

2015-08-12 Thread Richard Henderson

On 08/09/2015 01:13 PM, Laurent Vivier wrote:
 Signed-off-by: Laurent Vivier laur...@vivier.eu
 ---
  target-m68k/translate.c | 18 ++
  1 file changed, 18 insertions(+)
 
 diff --git a/target-m68k/translate.c b/target-m68k/translate.c
 index ae57792..adf4521 100644
 --- a/target-m68k/translate.c
 +++ b/target-m68k/translate.c
 @@ -2002,6 +2002,24 @@ DISAS_INSN(eor)
  
  opsize = insn_opsize(insn, 6);
  
 +if (((insn  3)  7) == 1) {
 +/* cmpm */

Surely this can be separated out from EOR via masks at register_opcode time.
And since this isn't a coldfire instruction...

 +reg = AREG(insn, 0);
 +src = gen_load(s, opsize, reg, 1);
 +tcg_gen_addi_i32(reg, reg, opsize_bytes(opsize));
 +
 +reg = AREG(insn, 9);
 +dest = gen_load(s, opsize, reg, 1);
 +tcg_gen_addi_i32(reg, reg, opsize_bytes(opsize));
 +
 +reg = tcg_temp_new();
 +tcg_gen_sub_i32(reg, dest, src);
 +gen_update_cc_add(reg, src);

No need for the extra temp, reg.  Simply modify dest.

 +SET_CC_OP(opsize, SUB);
 +
 +return;
 +}
 +
  SRC_EA(env, src, opsize, -1, addr);
  reg = DREG(insn, 9);
  dest = tcg_temp_new();

1 2 >

1 - 100 of 195 matches

Mail list logo