Re: [Qemu-devel] [PATCH v9] Add optionrom compatible with fw_cfg DMA version

2016-05-27 Thread Stefan Hajnoczi
On Mon, May 23, 2016 at 07:11:32PM +0100, Richard W.M. Jones wrote:
> v8 -> v9:
> 
>  - Add a workaround for GCC < 4.9.
> 
>  - Add linuxbios_dma.bin to Makefile.
> 
>  - Change Marc Mari's email address to new one.
> 
>  - Tested on RHEL 7.2 (gcc-4.8.5-4.el7.x86_64).
> 
> Rich.
> 

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [RFC PATCH v4 0/3] Add Mediated device support[was: Add vGPU support]

2016-05-27 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Friday, May 27, 2016 10:55 PM
> 
> On Fri, 27 May 2016 11:02:46 +
> "Tian, Kevin"  wrote:
> 
> > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > Sent: Wednesday, May 25, 2016 9:44 PM
> > >
> > > On Wed, 25 May 2016 07:13:58 +
> > > "Tian, Kevin"  wrote:
> > >
> > > > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com]
> > > > > Sent: Wednesday, May 25, 2016 3:58 AM
> > > > >
> > > > > > This series adds Mediated device support to v4.6 Linux host kernel.
> > > > > > Purpose of this series is to provide a common interface for mediated
> > > > > > device management that can be used by different devices. This series
> > > > > > introduces Mdev core module that create and manage mediated devices,
> > > > > > VFIO based driver for mediated PCI devices that are created by Mdev
> > > > > > core module and update VFIO type1 IOMMU module to support mediated
> > > > > > devices.
> > > >
> > > > Thanks. "Mediated device" is more generic than previous one. :-)
> > > >
> > > > >
> > > > > What's new in v4?
> > > > > - Renamed 'vgpu' module to 'mdev' module that represent generic term
> > > > >   'Mediated device'.
> > > > > > - Moved mdev directory to drivers/vfio directory as this is the
> > > > > >   extension of VFIO APIs for mediated devices.
> > > > > > - Updated mdev driver to be flexible to register multiple types of
> > > > > >   drivers to mdev_bus_type bus.
> > > > > > - Updated mdev core driver with mdev_put_device() and
> > > > > >   mdev_get_device() for mediated devices.
> > > > >
> > > > >
> > > >
> > > > Just curious. In this version you move the whole mdev core under
> > > > VFIO now. Sorry if I missed any agreement on this change. IIRC Alex
> > > > doesn't want VFIO to manage mdev life-cycle directly. Instead VFIO is
> > > > just a mdev driver on created mediated devices
> > >
> > > I did originally suggest keeping them separate, but as we've progressed
> > > through the implementation, it's become more clear that the mediated
> > > device interface is very much tied to the vfio interface, acting mostly
> > > as a passthrough.  So I thought it made sense to pull them together.
> > > Still open to discussion of course.  Thanks,
> > >
> >
> > The main benefit of maintaining a separate mdev framework, IMHO, is
> > to allow better support of both KVM and Xen. Xen doesn't work with VFIO
> > today, because other VM's memory is not allocated from Dom0 which
> > means VFIO within Dom0 doesn't have view/permission to control isolation
> > for other VMs.
> 
> Isn't this just a matter of the vfio iommu model selected?  There could
> be a vfio-iommu-xen that knows how to do the grant calls.
> 
> > However, after some thinking I think it might not be a big problem to
> > combine VFIO/mdev together, if we extend Xen to just use VFIO for
> > resource enumeration. In such model, VFIO still behaves as a single
> > kernel portal to enumerate mediated devices to user space, but give up
> > permission control to Qemu which will request a secure agent - Xen
> > hypervisor - to ensure isolation of VM usage on mediated device (including
> > EPT/IOMMU configuration).
> 
> The whole point here is to use the vfio user api and we seem to be
> progressing towards using vfio-core as a conduit where the mediated
> driver api is also fairly vfio-ish.  So it seems we're really headed
> towards a vfio-mediated device rather than some sort of generic mediated
> driver interface.  I would object to leaving permission control to
> QEMU, QEMU is just a vfio user, there are others like DPDK.  The kernel
> needs to be in charge of protecting itself and users from each other,
> QEMU can't do this, which is part of the reason that KVM has moved to vfio
> rather than the pci-sysfs resource interface.
> 
> > I'm not sure whether VFIO can support this usage today. It is somehow
> > similar to channel io passthru in s390, where we also rely on Qemu to
> > mediate ccw commands to ensure isolation. Maybe just some slight
> > extension is required (e.g. not assume some API must be invoked). Of
> > course Qemu side vfio code also need some change. If this can work,
> > at least we can first put it as the enumeration interface for mediated
> > device in Xen. In the future it may be extended to cover normal Xen
> > PCI assignment as well instead of using sysfs to read PCI resource
> > today.
> 
> The channel io proposal doesn't rely on QEMU for security either, the
> mediation occurs in the host kernel, parsing the ccw command program,
> and doing translations to replace the guest physical addresses with
> verified and pinned host physical addresses before submitting the
> program to be run.  A mediated device is policed by the mediated
> vendor driver in the host kernel, QEMU is untrusted, just like any
> other user.
> 
> If xen is currently using pci-sysfs for mapping device 

Re: [Qemu-devel] [Qemu-block] [PATCH v2 0/4] Drop virtio-{blk, scsi} op blockers

2016-05-27 Thread Stefan Hajnoczi
On Mon, May 23, 2016 at 10:19:34AM +0800, Fam Zheng wrote:
> v2: Switch to bdrv_lookup_bs on target_bs. [Kevin]
> Rebase onto master.
> Add Michael's ack-by in virtio patches.
> 
> We are ready to get rid of dataplane's op blockers altogether. Most operations
> are already unblocked in virtio-blk, and the rest remain for virtio-scsi only
> because we haven't got around to adding the counterpart unblocking code.
> 
> The first patch fixes an existing bug with blockdev-backup. Then the op
> blockers are removed from both devices.
> 
> Fam Zheng (4):
>   blockdev-backup: Use bdrv_lookup_bs on target
>   blockdev-backup: Don't move target AioContext if it's attached
>   virtio-blk: Remove op blocker for dataplane
>   virtio-scsi: Remove op blocker for dataplane
> 
>  blockdev.c                      | 23 ---
>  hw/block/dataplane/virtio-blk.c | 63 -
>  hw/scsi/virtio-scsi.c           | 62 
>  include/hw/virtio/virtio-scsi.h | 11 ---
>  4 files changed, 13 insertions(+), 146 deletions(-)
> 
> -- 
> 2.8.2
> 
> 

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan




Re: [Qemu-devel] [RFC] virtio-blk: simple multithreaded MQ implementation for bdrv_raw

2016-05-27 Thread Stefan Hajnoczi
On Fri, May 27, 2016 at 01:55:04PM +0200, Roman Pen wrote:
> Hello, all.
> 
> This is RFC because mostly this patch is a quick attempt to get true
> multithreaded multiqueue support for a block device with native AIO.
> The goal is to squeeze everything possible on lockless IO path from
> MQ block on a guest to MQ block on a host.
> 
> To avoid any locks in qemu backend and not to introduce thread safety
> into qemu block-layer I open same backend device several times, one
> device per one MQ.  e.g. the following is the stack for a virtio-blk
> with num-queues=2:
> 
>          VirtIOBlock
>         /           \
>  VirtQueue#0   VirtQueue#1
>   IOThread#0    IOThread#1
>      BH#0          BH#1
>   Backend#0     Backend#1
>         \           /
>          /dev/null0
> 
> To group all objects related to one vq new structure is introduced:
> 
> typedef struct VirtQueueCtx {
>     BlockBackend *blk;
>     struct VirtIOBlock *s;
>     VirtQueue *vq;
>     void *rq;
>     QEMUBH *bh;
>     QEMUBH *batch_notify_bh;
>     IOThread *iothread;
>     Notifier insert_notifier;
>     Notifier remove_notifier;
>     /* Operation blocker on BDS */
>     Error *blocker;
> } VirtQueueCtx;
> 
> And VirtIOBlock includes an array of these contexts:
> 
>  typedef struct VirtIOBlock {
>      VirtIODevice parent_obj;
> +    VirtQueueCtx mq[VIRTIO_QUEUE_MAX];
>      ...
> 
> This patch is based on Stefan's series: "virtio-blk: multiqueue support",
> with a minor difference: I reverted "virtio-blk: multiqueue batch notify",
> which does not make a lot of sense when each VQ is handled by its own
> iothread.
> 
> The qemu configuration stays the same, i.e. put num-queues=N and N
> iothreads will be started on demand and N drives will be opened:
> 
> qemu -device virtio-blk-pci,num-queues=8
> 
> My configuration is the following:
> 
> host:
> Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz,
> 8 CPUs,
> /dev/nullb0 as backend with the following parameters:
>   $ cat /sys/module/null_blk/parameters/submit_queues
>   8
>   $ cat /sys/module/null_blk/parameters/irqmode
>   1
> 
> guest:
> 8 VCPUs
> 
> qemu:
> -object iothread,id=t0 \
> -drive if=none,id=d0,file=/dev/nullb0,format=raw,snapshot=off,cache=none,aio=native \
> -device virtio-blk-pci,num-queues=$N,iothread=t0,drive=d0,disable-modern=off,disable-legacy=on
> 
> where $N varies during the tests.
> 
> fio:
> [global]
> description=Emulation of Storage Server Access Pattern
> bssplit=512/20:1k/16:2k/9:4k/12:8k/19:16k/10:32k/8:64k/4
> fadvise_hint=0
> rw=randrw:2
> direct=1
> 
> ioengine=libaio
> iodepth=64
> iodepth_batch_submit=64
> iodepth_batch_complete=64
> numjobs=8
> gtod_reduce=1
> group_reporting=1
> 
> time_based=1
> runtime=30
> 
> [job]
> filename=/dev/vda
> 
> Results:
>   num-queues      RD bw      WR bw
>   ----------      -----      -----
>
>   * with 1 iothread *
>
>   1 thr 1 mq      1225MB/s   1221MB/s
>   1 thr 2 mq      1559MB/s   1553MB/s
>   1 thr 4 mq      1729MB/s   1725MB/s
>   1 thr 8 mq      1660MB/s   1655MB/s
>
>   * with N iothreads *
>
>   2 thr 2 mq      1845MB/s   1842MB/s
>   4 thr 4 mq      2187MB/s   2183MB/s
>   8 thr 8 mq      1383MB/s   1378MB/s
> 
> Obviously, 8 iothreads + 8 vcpu threads is too much for my machine
> with 8 CPUs, but 4 iothreads show quite good result.

Cool, thanks for trying this experiment and posting results.

It's encouraging to see the improvement.  Did you use any CPU affinity
settings to co-locate vcpu and iothreads onto host CPUs?

Stefan




Re: [Qemu-devel] [PATCH 0/9] virtio-blk: multiqueue support

2016-05-27 Thread Stefan Hajnoczi
On Tue, May 24, 2016 at 02:51:04PM +0200, Christian Borntraeger wrote:
> On 05/21/2016 01:40 AM, Stefan Hajnoczi wrote:
> > The virtio_blk guest driver has supported multiple virtqueues since Linux
> > 3.17.  This patch series adds multiple virtqueues to QEMU's virtio-blk
> > emulated device.
> >
> > Ming Lei sent patches previously but these were not merged.  This series
> > implements virtio-blk multiqueue for QEMU from scratch since the codebase
> > has changed.  Live migration support for s->rq was also missing from the
> > previous series and has been added.
> >
> > It's important to note that QEMU's block layer does not support multiqueue
> > yet.  Therefore virtio-blk device processes all virtqueues in the same
> > AioContext (IOThread).  Further work is necessary to take advantage of
> > multiqueue support in QEMU's block layer once it becomes available.
> > 
> > I will post performance results once they are ready.
> > 
> > Stefan Hajnoczi (9):
> >   virtio-blk: use batch notify in non-dataplane case
> >   virtio-blk: tell dataplane which vq to notify
> >   virtio-blk: associate request with a virtqueue
> >   virtio-blk: add VirtIOBlockConf->num_queues
> >   virtio-blk: multiqueue batch notify
> >   vmstate: add VMSTATE_VARRAY_UINT32_ALLOC
> >   virtio-blk: live migrate s->rq with multiqueue
> >   virtio-blk: dataplane multiqueue support
> >   virtio-blk: add num-queues device property
> > 
> >  hw/block/dataplane/virtio-blk.c |  68 +++---
> >  hw/block/dataplane/virtio-blk.h |   2 +-
> >  hw/block/virtio-blk.c           | 200 
> >  include/hw/virtio/virtio-blk.h  |  13 ++-
> >  include/migration/vmstate.h     |  10 ++
> >  5 files changed, 241 insertions(+), 52 deletions(-)
> > 
> 
> With 2.6 I see 2 host threads consuming a CPU when running fio in a single CPU
> guest with a null-blk device and iothread for that disk (the vcpu thread and
> the iothread). With this patchset the main thread also consumes almost 80% of
> a CPU doing polling in main_loop_wait. I have not even changed the num-queues
> values.
> 
> So in essence 3 vs 2 host cpus.

Do you know which patch causes this?  Patch 1 maybe?

I will take a look for v2.  Thanks for the heads up.

Stefan




Re: [Qemu-devel] [PATCH 7/9] virtio-blk: live migrate s->rq with multiqueue

2016-05-27 Thread Stefan Hajnoczi
On Sat, May 21, 2016 at 05:37:27PM +0200, Paolo Bonzini wrote:
> 
> 
> On 21/05/2016 01:40, Stefan Hajnoczi wrote:
> >  while (req) {
> >  qemu_put_sbyte(f, 1);
> 
> Could you just put an extra 32-bit queue id here if num_queues > 1?  A
> guest with num_queues > 1 cannot be started on pre-2.7 QEMU, so you can
> change the migration format (if virtio were using vmstate, it would be a
> VMSTATE_UINT32_TEST).

That's a good point: the same command-line would not work on old QEMU so
there is no possibility of new->old migration in the first place!

Your solution is much simpler than my hack (which took me a long time to
find and I was kind of proud of).  Will change for v2.
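
A minimal sketch of the format change Paolo suggests, written as standalone C rather than QEMU code. The `Buf` type and the `put_sbyte()`/`put_be32()` helpers are hypothetical stand-ins for the migration stream and its put functions, and the request payload is left out; only the conditional queue-id field is shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for a migration stream (not QEMUFile).
 * No bounds checking: this is a sketch, not production code. */
typedef struct {
    uint8_t data[256];
    size_t len;
} Buf;

static void put_sbyte(Buf *b, int8_t v)
{
    b->data[b->len++] = (uint8_t)v;
}

static void put_be32(Buf *b, uint32_t v)
{
    b->data[b->len++] = (uint8_t)(v >> 24);
    b->data[b->len++] = (uint8_t)(v >> 16);
    b->data[b->len++] = (uint8_t)(v >> 8);
    b->data[b->len++] = (uint8_t)v;
}

/*
 * Write one marker byte per pending request, followed by the queue index
 * only when the device has more than one queue.  With num_queues == 1 the
 * byte stream is identical to the single-queue format.
 */
static void save_requests(Buf *b, const uint32_t *req_queue, size_t nreq,
                          uint32_t num_queues)
{
    for (size_t i = 0; i < nreq; i++) {
        put_sbyte(b, 1);                 /* "another request follows" */
        if (num_queues > 1) {
            put_be32(b, req_queue[i]);   /* extra field, multiqueue only */
        }
    }
    put_sbyte(b, 0);                     /* end-of-list marker */
}
```

The point of the design is that the extra field only appears for configurations that older QEMU could not have started anyway, so single-queue streams stay byte-compatible.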

Stefan




Re: [Qemu-devel] [PATCH 5/9] virtio-blk: multiqueue batch notify

2016-05-27 Thread Stefan Hajnoczi
On Sat, May 21, 2016 at 06:02:52PM +0200, Paolo Bonzini wrote:
> 
> 
> On 21/05/2016 01:40, Stefan Hajnoczi wrote:
> > +while ((i = find_next_bit(s->batch_notify_vqs, nvqs, i)) < nvqs) {
> > +VirtQueue *vq = virtio_get_queue(vdev, i);
> > +
> > +bitmap_clear(s->batch_notify_vqs, i, 1);
> 
> clear_bit?

Ignorance on my part.  Thanks!

> > +if (s->dataplane_started && !s->dataplane_disabled) {
> > +virtio_blk_data_plane_notify(s->dataplane, vq);
> > +} else {
> > +virtio_notify(vdev, vq);
> > +}
> 
> The find_next_bit loop is not very efficient and could use something
> similar to commit 41074f3 ("omap_intc: convert ffs(3) to ctz32() in
> omap_inth_sir_update()", 2015-04-28).  But it can be improved later.

Cool, will try that for inspiration in v2.
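
For reference, the count-trailing-zeros idiom Paolo points to can be sketched as below. This is an illustration on a single 32-bit word, not the code from the cited commit; `collect_set_bits()` is a hypothetical helper name, and `__builtin_ctz` (a GCC/Clang builtin) is used here in place of QEMU's `ctz32()`.

```c
#include <stdint.h>

/*
 * Visit every set bit of a 32-bit word, lowest first.  Each iteration
 * finds the lowest set bit with a count-trailing-zeros operation and then
 * clears it, so the loop runs once per set bit rather than once per bit
 * position.  The indices are stored in out[], which must have room for
 * 32 entries; the return value is the number of set bits found.
 */
static int collect_set_bits(uint32_t word, int *out)
{
    int n = 0;

    while (word) {
        int i = __builtin_ctz(word);  /* index of lowest set bit; word != 0 */
        out[n++] = i;
        word &= word - 1;             /* clear that bit */
    }
    return n;
}
```

In the notify loop, each collected index would correspond to one virtqueue that needs a notification.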

Stefan




Re: [Qemu-devel] [PATCH] tests: avoid coroutine pool test crash

2016-05-27 Thread Stefan Hajnoczi
On Fri, May 20, 2016 at 11:00:31AM -0700, Stefan Hajnoczi wrote:
> Skip the test_co_queue test case if the coroutine pool is not enabled.
> The test case does not work without the pool because it touches memory
> belonging to a freed coroutine (on purpose).
> 
> Reported-by: Eduardo Habkost 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  tests/test-coroutine.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan




Re: [Qemu-devel] [PATCH] linux-user: provide frame information in x86-64 safe_syscall

2016-05-27 Thread Richard Henderson

On 05/27/2016 09:34 AM, Peter Maydell wrote:
> On 27 May 2016 at 17:21, Richard Henderson  wrote:
> > On 05/27/2016 08:06 AM, Peter Maydell wrote:
> > > @@ -31,6 +32,8 @@ safe_syscall_base:
> > >   * does not list any ABI differences regarding stack alignment.)
> > >   */
> > >  push    %rbp
> > > +.cfi_def_cfa_offset 16
> > > +.cfi_offset rbp,-16
> >
> > While this is correct, there are two other directives that make it easier to
> > describe changes without having to compute globally correct constants.  Here
> > they would be:
> >
> > .cfi_adjust_cfa_offset 8
> >
> > Add 8 to the offset, i.e. decrement the SP by 8.
>
> Presumably .cfi_startproc sets the initial offset to 8?
> (It's not documented that it does so, which is I think partly why
> I preferred to use a directive that definitely set the offset
> to the right thing.)


It is documented to set up the normal no-instructions-executed call frame. 
Which in the case of x86, does have a non-zero offset.


There is a ".cfi_startproc simple" that begins a frame with no opcodes at all.


r~



Re: [Qemu-devel] [PATCH v6 04/15] include/processor.h: define cpu_relax()

2016-05-27 Thread Emilio G. Cota
On Fri, May 27, 2016 at 23:53:01 +0300, Sergey Fedorov wrote:
> On 25/05/16 04:13, Emilio G. Cota wrote:
> > Taken from the linux kernel.
> >
> > Reviewed-by: Richard Henderson 
> > Reviewed-by: Alex Bennée 
> > Signed-off-by: Emilio G. Cota 
> > ---
> >  include/qemu/processor.h | 30 ++
> >  1 file changed, 30 insertions(+)
> >  create mode 100644 include/qemu/processor.h
> >
> > diff --git a/include/qemu/processor.h b/include/qemu/processor.h
> > new file mode 100644
> > index 000..42bcc99
> > --- /dev/null
> > +++ b/include/qemu/processor.h
> > @@ -0,0 +1,30 @@
> > +/*
> > + * Copyright (C) 2016, Emilio G. Cota 
> > + *
> 
> If it's taken from the Linux kernel shouldn't it be attributed here?

It's "taken" as in "we do like the kernel does", not as in
"we're just copying this code".

Emilio



Re: [Qemu-devel] [PATCH v6 04/15] include/processor.h: define cpu_relax()

2016-05-27 Thread Sergey Fedorov
On 25/05/16 04:13, Emilio G. Cota wrote:
> Taken from the linux kernel.
>
> Reviewed-by: Richard Henderson 
> Reviewed-by: Alex Bennée 
> Signed-off-by: Emilio G. Cota 
> ---
>  include/qemu/processor.h | 30 ++
>  1 file changed, 30 insertions(+)
>  create mode 100644 include/qemu/processor.h
>
> diff --git a/include/qemu/processor.h b/include/qemu/processor.h
> new file mode 100644
> index 000..42bcc99
> --- /dev/null
> +++ b/include/qemu/processor.h
> @@ -0,0 +1,30 @@
> +/*
> + * Copyright (C) 2016, Emilio G. Cota 
> + *

If it's taken from the Linux kernel shouldn't it be attributed here?

> + * License: GNU GPL, version 2.
> + *   See the COPYING file in the top-level directory.
> + */
> +#ifndef QEMU_PROCESSOR_H
> +#define QEMU_PROCESSOR_H
> +
> +#include "qemu/atomic.h"
> +
> +#if defined(__i386__) || defined(__x86_64__)
> +# define cpu_relax() asm volatile("rep; nop" ::: "memory")
> +
> +#elif defined(__ia64__)
> +# define cpu_relax() asm volatile("hint @pause" ::: "memory")
> +
> +#elif defined(__aarch64__)
> +# define cpu_relax() asm volatile("yield" ::: "memory")
> +
> +#elif defined(__powerpc64__)
> +/* set Hardware Multi-Threading (HMT) priority to low; then back to medium */
> +# define cpu_relax() asm volatile("or 1, 1, 1;"
> +  "or 2, 2, 2;" ::: "memory")
> +
> +#else
> +# define cpu_relax() barrier()
> +#endif
> +
> +#endif /* QEMU_PROCESSOR_H */
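
A short sketch of how a macro like this is typically used in a busy-wait loop. It assumes GCC or Clang and defines only the generic fallback (a compiler barrier, like the `#else` branch above); `spin_wait()` is a hypothetical helper, not part of the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Fallback for this sketch only: a plain compiler barrier. */
#ifndef cpu_relax
# define cpu_relax() __asm__ __volatile__("" ::: "memory")
#endif

/*
 * Poll *flag up to max_spins times, relaxing the CPU between polls.
 * Returns true if the flag was observed set, false if the budget ran out.
 */
static bool spin_wait(atomic_bool *flag, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(flag, memory_order_acquire)) {
            return true;
        }
        cpu_relax();
    }
    return false;
}
```

On architectures with a dedicated hint instruction, the per-arch definitions in the patch would replace the fallback so that the spinning thread yields pipeline resources to its sibling hardware thread.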




Re: [Qemu-devel] [PATCH v3 2/2] trace: [all] Add "guest_mem_before" event

2016-05-27 Thread Stefan Hajnoczi
On Wed, May 18, 2016 at 08:45:35PM +0200, Lluís Vilanova wrote:
> Richard Henderson writes:
> 
> > On 05/18/2016 03:47 AM, Lluís Vilanova wrote:
> >> Signed-off-by: Lluís Vilanova 
> >> ---
> >> include/exec/cpu_ldst_template.h          |   25 
> >> include/exec/cpu_ldst_useronly_template.h |   22 ++
> >> tcg/tcg-op.c                              |   32 ++--
> >> trace-events                              |   22 ++
> >> trace/mem-internal.h                      |   46 +
> >> trace/mem.h                               |   34 +
> >> 6 files changed, 177 insertions(+), 4 deletions(-)
> >> create mode 100644 trace/mem-internal.h
> >> create mode 100644 trace/mem.h
> >> 
> >> diff --git a/include/exec/cpu_ldst_template.h b/include/exec/cpu_ldst_template.h
> >> index 3091c00..eaf69a1 100644
> >> --- a/include/exec/cpu_ldst_template.h
> >> +++ b/include/exec/cpu_ldst_template.h
> >> @@ -23,6 +23,13 @@
> >> * You should have received a copy of the GNU Lesser General Public
> >> * License along with this library; if not, see 
> >> .
> >> */
> >> +
> >> +#if !defined(SOFTMMU_CODE_ACCESS)
> >> +#include "trace.h"
> >> +#endif
> >> +
> >> +#include "trace/mem.h"
> >> +
> >> #if DATA_SIZE == 8
> >> #define SUFFIX q
> >> #define USUFFIX q
> >> @@ -80,6 +87,12 @@ glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(CPUArchState *env,
> >> int mmu_idx;
> >> TCGMemOpIdx oi;
> >> 
> >> +#if !defined(SOFTMMU_CODE_ACCESS)
> >> +trace_guest_mem_before_exec(
> >> +ENV_GET_CPU(env), ptr,
> >> +trace_mem_build_info(SHIFT, false, MO_TE, false));
> >> +#endif
> 
> > I don't understand what this event is supposed to be tracing.
> > There's no documentation at all, even in the commit log.
> 
> I'm not sure if you mean for this file or the "guest_mem_before" event on the
> whole patch. The event description is in the "trace-events" file. Although
> terse, I think it's pretty explanatory, but I can expand it if it's not clear.

Good idea, please do that in the commit description for this patch.

Thanks,
Stefan




Re: [Qemu-devel] [PATCH 0/3] trace: enable tracing in qemu-io/qemu-nbd

2016-05-27 Thread Denis V. Lunev

On 05/27/2016 11:39 PM, Stefan Hajnoczi wrote:
> On Tue, May 17, 2016 at 11:20:28AM +0300, Denis V. Lunev wrote:
> > Actually this is a rework of the original patch, set as a part of write-zeroes
> > patchset. Moving it out to process via trace tree.
> >
> > Signed-off-by: Denis V. Lunev 
> > CC: Paolo Bonzini 
> > CC: Stefan Hajnoczi 
> > CC: Kevin Wolf 
> >
> > Denis V. Lunev (3):
> >   trace: move qemu_trace_opts to trace/control.c
> >   trace: enable tracing in qemu-io
> >   trace: enable tracing in qemu-nbd
> >
> >  qemu-io.c   | 13 ++---
> >  qemu-nbd.c  | 15 +++
> >  trace/control.c | 44 +++-
> >  trace/control.h | 24 +---
> >  vl.c| 37 +
> >  5 files changed, 82 insertions(+), 51 deletions(-)
>
> Looks useful.  I'd like to apply the v2 when you send it.
>
> Stefan

yep. I'll do that tomorrow or on monday.



Re: [Qemu-devel] [PATCH 0/3] trace: enable tracing in qemu-io/qemu-nbd

2016-05-27 Thread Stefan Hajnoczi
On Tue, May 17, 2016 at 11:20:28AM +0300, Denis V. Lunev wrote:
> Actually this is a rework of the original patch, set as a part of write-zeroes
> patchset. Moving it out to process via trace tree.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Paolo Bonzini 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> 
> Denis V. Lunev (3):
>   trace: move qemu_trace_opts to trace/control.c
>   trace: enable tracing in qemu-io
>   trace: enable tracing in qemu-nbd
> 
>  qemu-io.c   | 13 ++---
>  qemu-nbd.c  | 15 +++
>  trace/control.c | 44 +++-
>  trace/control.h | 24 +---
>  vl.c| 37 +
>  5 files changed, 82 insertions(+), 51 deletions(-)

Looks useful.  I'd like to apply the v2 when you send it.

Stefan




Re: [Qemu-devel] [PATCH 8/9] target-i386: Use "-" instead of "_" on all feature names

2016-05-27 Thread Eduardo Habkost
On Tue, May 24, 2016 at 03:22:27PM +0200, Igor Mammedov wrote:
> On Tue, 24 May 2016 09:34:05 -0300
> Eduardo Habkost  wrote:
> 
> > On Tue, May 24, 2016 at 02:17:03PM +0200, Igor Mammedov wrote:
> > > On Fri,  6 May 2016 15:11:31 -0300
> > > Eduardo Habkost  wrote:
[...]
> > > > -/* Convert all '_' in a feature string option name to '-', to make feature
> > > > - * name conform to QOM property naming rule, which uses '-' instead of '_'.
> > > > +/* Convert all '_' in a feature string option name to '-', to keep compatibility
> > > > + * with old feature names that used "_" instead of "-".
> > > >   */
> > > >  static inline void feat2prop(char *s)
> > > >  {
> > > > @@ -1925,8 +1925,10 @@ static void x86_cpu_parse_featurestr(CPUState *cs, char *features,
> > > >  while (featurestr) {
> > > >  char *val;  
> > > I'd place a single feat2prop() here
> > > and delete it from other call sites in this function.  
> > 
> > A previous version of this patch had it. But it would change the
> > property value too, not just the property name (breaking stuff
> > like "model-id=some_string").
> >
> it's a bug in feat2prop(), which probably should be fixed there,
> so it would do what the comment above it says. Or as an alternative:

The comment above it doesn't say anything about stopping at a '='
delimiter. I always expected it to just replace "_" with "-" in a
null-terminated string.

(I am not completely against making it stop at '=', but I believe
your suggestion below sounds better).

> 
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index ca2a893..e46e4c3 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -1941,14 +1941,16 @@ static void x86_cpu_parse_featurestr(CPUState *cs, char *features
>  featurestr = features ? strtok(features, ",") : NULL;
>  
>  while (featurestr) {
> -char *val;
> +char *val = strchr(featurestr, '=');
> +if (val) {
> +*val = 0; val++;
> +}
> +feat2prop(featurestr);

This would make "+feature=FOO" be treated as a valid option, and it
isn't. It would keep the existing behavior if we did this:

-  if (featurestr[0] == '+') {
+  if (featurestr[0] == '+' && !val) {
   add_flagname_to_bitmaps(featurestr + 1, plus_features, &local_err);
-  } else if (featurestr[0] == '-') {
+  } else if (featurestr[0] == '-' && !val) {
   add_flagname_to_bitmaps(featurestr + 1, minus_features, &local_err);

In either case, I prefer to get this optimization reviewed as a
separate patch. Can you send it as a follow-up?

> -} else if ((val = strchr(featurestr, '='))) {
> -*val = 0; val++;
> -feat2prop(featurestr);
> +} else if (val) {
>  if (!strcmp(featurestr, "xlevel")) {
>  char *err;
>  char num[32];
> @@ -2000,7 +2002,6 @@ static void x86_cpu_parse_featurestr(CPUState *cs, char *features,
>  object_property_parse(OBJECT(cpu), val, featurestr, &local_err);
>  }
>  } else {
> -feat2prop(featurestr);
>  object_property_parse(OBJECT(cpu), "on", featurestr, &local_err);
>  }
>  if (local_err) {
> 
> 
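
The idea under discussion (split at '=' first, then normalise only the name) can be sketched in isolation as follows. This is a standalone illustration, not the QEMU function: `underscores_to_dashes()` and `split_and_normalise()` are hypothetical names, and the handling of a leading '+' or '-' flag is left out.

```c
#include <string.h>

/* Replace every '_' with '-' in a NUL-terminated string, in place. */
static void underscores_to_dashes(char *s)
{
    for (; *s; s++) {
        if (*s == '_') {
            *s = '-';
        }
    }
}

/*
 * Split "name=value" at the first '=', normalise only the name, and
 * return a pointer to the value (or NULL when there is no '=').
 * The input buffer is modified in place; the value is left untouched,
 * so something like "model_id=some_string" keeps its underscore in the
 * value part.
 */
static char *split_and_normalise(char *featurestr)
{
    char *val = strchr(featurestr, '=');

    if (val) {
        *val++ = '\0';
    }
    underscores_to_dashes(featurestr);
    return val;
}
```

Because the split happens before the conversion, the name and the value can no longer be confused, which is the property the thread is after.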

-- 
Eduardo



Re: [Qemu-devel] [libvirt] [PATCH 7/9] qmp: Add runnability information to query-cpu-definitions

2016-05-27 Thread Eduardo Habkost
Just noticed that I hadn't replied to this yet. Sorry for the
long delay!

On Thu, May 12, 2016 at 09:46:25AM +0200, Markus Armbruster wrote:
> Eduardo Habkost  writes:
[...]
> > ##
> > # @CpuDefinitionInfo:
> > #
> > # Virtual CPU definition.
> > #
> > # @name: the name of the CPU definition
> > # @runnable: #optional. whether the CPU model us usable with the
> > #current machine and accelerator. Omitted if we don't
> > #know the answer. (since 2.7)
> > # @unavailable-features: List of attributes that prevent the CPU
> 
> Unless you drop the * sigil from '*unavailable-features', you need to
> insert #optional after the colon.

Fixed.

> 
> > #model from running in the current host.
> > #(since 2.7)
> > #
> > # @unavailable-features is a list of QOM property names that
> > # represent CPU model attributes that prevent the CPU from running.
> > # If the QOM property is read-only, that means the CPU model can
> > # never run in the current host. If the property is read-write, it
> > # means that it MAY be possible to run the CPU model in the current
> > # host if that property is changed. Management software can use it
> > # as hints to suggest or choose an alternative for the user, or
> > # just to generate meaningful error messages explaining why the CPU
> > # model can't be used.
> > #
> > # Since: 1.2.0
> > ##
> 
> Better.
> 
> Next issue: how @runnable and @unavailable-features are related isn't
> fully documented.  Here's my guess:
> 
> Combinations possible?             @runnable
> @unavailable-features    absent    false    true
> absent                   yes       ?        ?
> present, empty           ?         ?        ?
> present, non-empty       ?         yes      no

unavailable-features should be present only if runnable is false.
It may be absent or empty if the architecture code still doesn't
provide detailed info.

Once we have additional architectures implementing the new
fields, we can consider requiring unavailable-features to be
always present (and non-empty) if runnable is false.

In other words:

Combinations possible?             @runnable
@unavailable-features    absent    false    true
absent                   yes       yes[1]   yes
present, empty           no        yes[1]   no
present, non-empty       no        yes      no

[1] I would like it to be "no", but I prefer to make it mandatory
only after we get some experience with other architectures.
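
The table above can also be read as a small predicate, which may be useful to a management-software author validating a reply. This is a sketch only; the enum and function names are hypothetical and do not come from the QAPI schema.

```c
#include <stdbool.h>

/* Tri-state for the optional @runnable field. */
typedef enum { RUNNABLE_ABSENT, RUNNABLE_FALSE, RUNNABLE_TRUE } RunnableState;

/* State of the optional @unavailable-features list. */
typedef enum { FEATS_ABSENT, FEATS_EMPTY, FEATS_NONEMPTY } FeatsState;

/*
 * Returns true for the combinations marked "yes" (including "yes[1]")
 * in the table: the list may always be absent, and may be present
 * (empty or not) only when @runnable is false.
 */
static bool combination_allowed(RunnableState r, FeatsState f)
{
    if (f == FEATS_ABSENT) {
        return true;
    }
    return r == RUNNABLE_FALSE;
}
```

If the "yes[1]" cells later become "no", only the `FEATS_ABSENT` and `FEATS_EMPTY` handling for `RUNNABLE_FALSE` would need to change.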


I'm making the following changes to the documentation:

 # Virtual CPU definition.
 #
 # @name: the name of the CPU definition
-# @runnable: #optional. whether the CPU model us usable with the
+# @runnable: #optional Whether the CPU model is usable with the
 #current machine and accelerator. Omitted if we don't
 #know the answer. (since 2.7)
-# @unavailable-features: List of attributes that prevent the CPU
-#model from running in the current host.
+# @unavailable-features: #optional List of attributes that prevent
+#the CPU model from running in the current
+#host. Present only if @runnable is false.
 #(since 2.7)
 #
 # @unavailable-features is a list of QOM property names that

-- 
Eduardo



Re: [Qemu-devel] [PATCH v6 03/15] seqlock: rename write_lock/unlock to write_begin/end

2016-05-27 Thread Sergey Fedorov
On 25/05/16 04:13, Emilio G. Cota wrote:
> It is a more appropriate name, now that the mutex embedded
> in the seqlock is gone.
>
> Reviewed-by: Alex Bennée 
> Reviewed-by: Richard Henderson 
> Signed-off-by: Emilio G. Cota 

Reviewed-by: Sergey Fedorov 
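
For readers unfamiliar with the pattern behind the begin/end naming, here is a minimal seqlock sketch using C11 atomics. It is not QEMU's include/qemu/seqlock.h; the `ClockState` type and the `clock_state_write()`/`clock_state_read()` functions are hypothetical, and writers are assumed to be serialised by some lock defined elsewhere, as the series expects.

```c
#include <stdatomic.h>

typedef struct {
    atomic_uint sequence;        /* odd while an update is in progress */
    atomic_llong ticks_offset;   /* protected values */
    atomic_llong clock_offset;
} ClockState;

/* Writer side: callers must already hold whatever lock serialises writers. */
static void clock_state_write(ClockState *c, long long ticks, long long clk)
{
    unsigned s = atomic_load_explicit(&c->sequence, memory_order_relaxed);

    /* "write begin": make the counter odd and order it before the data */
    atomic_store_explicit(&c->sequence, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&c->ticks_offset, ticks, memory_order_relaxed);
    atomic_store_explicit(&c->clock_offset, clk, memory_order_relaxed);

    /* "write end": publish the data; the counter becomes even again */
    atomic_store_explicit(&c->sequence, s + 2, memory_order_release);
}

/* Reader side: lock-free, retries if a write overlapped the read. */
static void clock_state_read(ClockState *c, long long *ticks, long long *clk)
{
    unsigned s0, s1;

    do {
        s0 = atomic_load_explicit(&c->sequence, memory_order_acquire);
        *ticks = atomic_load_explicit(&c->ticks_offset, memory_order_relaxed);
        *clk = atomic_load_explicit(&c->clock_offset, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&c->sequence, memory_order_relaxed);
    } while ((s0 & 1) || s0 != s1);
}
```

Nothing in the write path takes a lock, which is why "begin/end" describes it better than "lock/unlock".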

> ---
>  cpus.c                 | 28 ++--
>  include/qemu/seqlock.h |  4 ++--
>  2 files changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/cpus.c b/cpus.c
> index dd86da5..735c9b2 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -247,13 +247,13 @@ int64_t cpu_get_clock(void)
>  void cpu_enable_ticks(void)
>  {
>  /* Here, the really thing protected by seqlock is cpu_clock_offset. */
> -seqlock_write_lock(&timers_state.vm_clock_seqlock);
> +seqlock_write_begin(&timers_state.vm_clock_seqlock);
>  if (!timers_state.cpu_ticks_enabled) {
>  timers_state.cpu_ticks_offset -= cpu_get_host_ticks();
>  timers_state.cpu_clock_offset -= get_clock();
>  timers_state.cpu_ticks_enabled = 1;
>  }
> -seqlock_write_unlock(&timers_state.vm_clock_seqlock);
> +seqlock_write_end(&timers_state.vm_clock_seqlock);
>  }
>  
>  /* disable cpu_get_ticks() : the clock is stopped. You must not call
> @@ -263,13 +263,13 @@ void cpu_enable_ticks(void)
>  void cpu_disable_ticks(void)
>  {
>  /* Here, the really thing protected by seqlock is cpu_clock_offset. */
> -seqlock_write_lock(&timers_state.vm_clock_seqlock);
> +seqlock_write_begin(&timers_state.vm_clock_seqlock);
>  if (timers_state.cpu_ticks_enabled) {
>  timers_state.cpu_ticks_offset += cpu_get_host_ticks();
>  timers_state.cpu_clock_offset = cpu_get_clock_locked();
>  timers_state.cpu_ticks_enabled = 0;
>  }
> -seqlock_write_unlock(&timers_state.vm_clock_seqlock);
> +seqlock_write_end(&timers_state.vm_clock_seqlock);
>  }
>  
>  /* Correlation between real and virtual time is always going to be
> @@ -292,7 +292,7 @@ static void icount_adjust(void)
>  return;
>  }
>  
> -seqlock_write_lock(&timers_state.vm_clock_seqlock);
> +seqlock_write_begin(&timers_state.vm_clock_seqlock);
>  cur_time = cpu_get_clock_locked();
>  cur_icount = cpu_get_icount_locked();
>  
> @@ -313,7 +313,7 @@ static void icount_adjust(void)
>  last_delta = delta;
>  timers_state.qemu_icount_bias = cur_icount
>- (timers_state.qemu_icount << icount_time_shift);
> -seqlock_write_unlock(&timers_state.vm_clock_seqlock);
> +seqlock_write_end(&timers_state.vm_clock_seqlock);
>  }
>  
>  static void icount_adjust_rt(void *opaque)
> @@ -353,7 +353,7 @@ static void icount_warp_rt(void)
>  return;
>  }
>  
> -seqlock_write_lock(&timers_state.vm_clock_seqlock);
> +seqlock_write_begin(&timers_state.vm_clock_seqlock);
>  if (runstate_is_running()) {
>  int64_t clock = REPLAY_CLOCK(REPLAY_CLOCK_VIRTUAL_RT,
>   cpu_get_clock_locked());
> @@ -372,7 +372,7 @@ static void icount_warp_rt(void)
>  timers_state.qemu_icount_bias += warp_delta;
>  }
>  vm_clock_warp_start = -1;
> -seqlock_write_unlock(&timers_state.vm_clock_seqlock);
> +seqlock_write_end(&timers_state.vm_clock_seqlock);
>  
>  if (qemu_clock_expired(QEMU_CLOCK_VIRTUAL)) {
>  qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
> @@ -397,9 +397,9 @@ void qtest_clock_warp(int64_t dest)
>  int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
>  int64_t warp = qemu_soonest_timeout(dest - clock, deadline);
>  
> -seqlock_write_lock(&timers_state.vm_clock_seqlock);
> +seqlock_write_begin(&timers_state.vm_clock_seqlock);
>  timers_state.qemu_icount_bias += warp;
> -seqlock_write_unlock(&timers_state.vm_clock_seqlock);
> +seqlock_write_end(&timers_state.vm_clock_seqlock);
>  
>  qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
>  timerlist_run_timers(aio_context->tlg.tl[QEMU_CLOCK_VIRTUAL]);
> @@ -466,9 +466,9 @@ void qemu_start_warp_timer(void)
>   * It is useful when we want a deterministic execution time,
>   * isolated from host latencies.
>   */
> -seqlock_write_lock(&timers_state.vm_clock_seqlock);
> +seqlock_write_begin(&timers_state.vm_clock_seqlock);
>  timers_state.qemu_icount_bias += deadline;
> -seqlock_write_unlock(&timers_state.vm_clock_seqlock);
> +seqlock_write_end(&timers_state.vm_clock_seqlock);
>  qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
>  } else {
>  /*
> @@ -479,11 +479,11 @@ void qemu_start_warp_timer(void)
>   * you will not be sending network packets continuously instead 
> of
>   * every 100ms.
>   */
> -seqlock_write_lock(&timers_state.vm_clock_seqlock);
> +seqlock_write_begin(&timers_state.vm_clock_seqlock);
>  if (vm_clock_warp_start == -1 || vm_clock_warp_start > clock) {
>  vm_clock_warp_start = clock;
> 

Re: [Qemu-devel] [PATCH v6 02/15] seqlock: remove optional mutex

2016-05-27 Thread Sergey Fedorov
On 25/05/16 04:13, Emilio G. Cota wrote:
> This option is unused; besides, it bloats the struct when not needed.
> Let's just let writers define their own locks elsewhere.
>
> Reviewed-by: Alex Bennée 
> Reviewed-by: Richard Henderson 
> Signed-off-by: Emilio G. Cota 

Reviewed-by: Sergey Fedorov 

> ---
>  cpus.c |  2 +-
>  include/qemu/seqlock.h | 10 +-
>  2 files changed, 2 insertions(+), 10 deletions(-)
>
> diff --git a/cpus.c b/cpus.c
> index cbeb1f6..dd86da5 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -619,7 +619,7 @@ int cpu_throttle_get_percentage(void)
>  
>  void cpu_ticks_init(void)
>  {
> -seqlock_init(&timers_state.vm_clock_seqlock, NULL);
> +seqlock_init(&timers_state.vm_clock_seqlock);
>  vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
>  throttle_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL_RT,
> cpu_throttle_timer_tick, NULL);
> diff --git a/include/qemu/seqlock.h b/include/qemu/seqlock.h
> index 70b01fd..e673482 100644
> --- a/include/qemu/seqlock.h
> +++ b/include/qemu/seqlock.h
> @@ -19,22 +19,17 @@
>  typedef struct QemuSeqLock QemuSeqLock;
>  
>  struct QemuSeqLock {
> -QemuMutex *mutex;
>  unsigned sequence;
>  };
>  
> -static inline void seqlock_init(QemuSeqLock *sl, QemuMutex *mutex)
> +static inline void seqlock_init(QemuSeqLock *sl)
>  {
> -sl->mutex = mutex;
>  sl->sequence = 0;
>  }
>  
>  /* Lock out other writers and update the count.  */
>  static inline void seqlock_write_lock(QemuSeqLock *sl)
>  {
> -if (sl->mutex) {
> -qemu_mutex_lock(sl->mutex);
> -}
>  ++sl->sequence;
>  
>  /* Write sequence before updating other fields.  */
> @@ -47,9 +42,6 @@ static inline void seqlock_write_unlock(QemuSeqLock *sl)
>  smp_wmb();
>  
>  ++sl->sequence;
> -if (sl->mutex) {
> -qemu_mutex_unlock(sl->mutex);
> -}
>  }
>  
>  static inline unsigned seqlock_read_begin(QemuSeqLock *sl)
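For illustration, here is a minimal sketch of what "writers define their
own locks elsewhere" could look like once the mutex is gone from the struct.
This is not QEMU code: it uses C11 atomics instead of QEMU's barrier
helpers, and TimersState, timers_set(), timers_sum() and the atomic_flag
spinlock are hypothetical stand-ins for whatever state and lock a caller
actually has.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* The seqlock itself carries only the sequence counter. */
typedef struct {
    atomic_uint sequence;
} SeqLock;

static void seqlock_write_begin(SeqLock *sl)
{
    unsigned s = atomic_load_explicit(&sl->sequence, memory_order_relaxed);
    assert((s & 1) == 0);   /* caller must already exclude other writers */
    atomic_store_explicit(&sl->sequence, s + 1, memory_order_relaxed);
    /* Order the sequence update before the data stores that follow. */
    atomic_thread_fence(memory_order_release);
}

static void seqlock_write_end(SeqLock *sl)
{
    unsigned s = atomic_load_explicit(&sl->sequence, memory_order_relaxed);
    /* Release store: data stores are visible before the even sequence. */
    atomic_store_explicit(&sl->sequence, s + 1, memory_order_release);
}

static unsigned seqlock_read_begin(SeqLock *sl)
{
    /* Mask the low bit so a read that overlaps a write always retries. */
    return atomic_load_explicit(&sl->sequence, memory_order_acquire) & ~1u;
}

static int seqlock_read_retry(SeqLock *sl, unsigned start)
{
    /* Order the data loads before re-reading the sequence. */
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&sl->sequence, memory_order_relaxed) != start;
}

/* Hypothetical caller state: the writer lock lives next to the data. */
typedef struct {
    atomic_flag writer_lock;          /* assumption: any lock would do */
    SeqLock sl;
    _Atomic int64_t ticks_offset;
    _Atomic int64_t clock_offset;
} TimersState;

void timers_set(TimersState *ts, int64_t ticks, int64_t clock)
{
    /* Writers serialize among themselves with their own lock. */
    while (atomic_flag_test_and_set_explicit(&ts->writer_lock,
                                             memory_order_acquire)) {
    }
    seqlock_write_begin(&ts->sl);
    atomic_store_explicit(&ts->ticks_offset, ticks, memory_order_relaxed);
    atomic_store_explicit(&ts->clock_offset, clock, memory_order_relaxed);
    seqlock_write_end(&ts->sl);
    atomic_flag_clear_explicit(&ts->writer_lock, memory_order_release);
}

int64_t timers_sum(TimersState *ts)
{
    unsigned start;
    int64_t ticks, clock;

    /* Readers take no lock; they retry if a writer interfered. */
    do {
        start = seqlock_read_begin(&ts->sl);
        ticks = atomic_load_explicit(&ts->ticks_offset, memory_order_relaxed);
        clock = atomic_load_explicit(&ts->clock_offset, memory_order_relaxed);
    } while (seqlock_read_retry(&ts->sl, start));
    return ticks + clock;
}
```

The point of the layout is that SeqLock stays a single word, while a caller
that already holds some other lock on the write side pays nothing extra.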




Re: [Qemu-devel] [PATCH v6 01/15] compiler.h: add QEMU_ALIGNED() to enforce struct alignment

2016-05-27 Thread Sergey Fedorov
On 25/05/16 04:13, Emilio G. Cota wrote:
> Reviewed-by: Richard Henderson 
> Reviewed-by: Alex Bennée 
> Signed-off-by: Emilio G. Cota 

Reviewed-by: Sergey Fedorov 

> ---
>  include/qemu/compiler.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
> index 8f1cc7b..b64f899 100644
> --- a/include/qemu/compiler.h
> +++ b/include/qemu/compiler.h
> @@ -41,6 +41,8 @@
>  # define QEMU_PACKED __attribute__((packed))
>  #endif
>  
> +#define QEMU_ALIGNED(X) __attribute__((aligned(X)))
> +
>  #ifndef glue
>  #define xglue(x, y) x ## y
>  #define glue(x, y) xglue(x, y)




Re: [Qemu-devel] [PATCH v3] target-i386: implement CPUID[0xB] (Extended Topology Enumeration)

2016-05-27 Thread Eduardo Habkost
On Thu, May 12, 2016 at 07:24:26PM +0200, Radim Krčmář wrote:
> I looked at a dozen Intel CPU that have this CPUID and all of them
> always had Core offset as 1 (a wasted bit when hyperthreading is
> disabled) and Package offset at least 4 (wasted bits at <= 4 cores).
> 
> QEMU uses more compact IDs and it doesn't make much sense to change it
> now.  I keep the SMT and Core sub-leaves even if there is just one
> thread/core;  it makes the code simpler and there should be no harm.
> 
> Signed-off-by: Radim Krčmář 

Reviewed-by: Eduardo Habkost 

Applied to x86-next.

-- 
Eduardo



[Qemu-devel] [PATCH] pc: Add 2.7 machine

2016-05-27 Thread Eduardo Habkost
From: Igor Mammedov 

Signed-off-by: Igor Mammedov 
Reviewed-by: Eduardo Habkost 
Signed-off-by: Eduardo Habkost 
---
This is blocking some patches from getting included (e.g. the
CPUID[0xB] patch from Radim), so I'm submitting Igor's RFC as
[PATCH].

I will add it to my x86-next queue so I can queue x86 patches
that depend on it. But I plan to wait until it gets merged
through the PC tree and rebase before sending an x86 pull request.
---
 hw/i386/pc_piix.c| 16 +---
 hw/i386/pc_q35.c | 13 +++--
 include/hw/i386/pc.h |  4 
 3 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 24e7042..d3bbe8a 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -416,13 +416,25 @@ static void pc_i440fx_machine_options(MachineClass *m)
 m->default_display = "std";
 }
 
-static void pc_i440fx_2_6_machine_options(MachineClass *m)
+static void pc_i440fx_2_7_machine_options(MachineClass *m)
 {
 pc_i440fx_machine_options(m);
 m->alias = "pc";
 m->is_default = 1;
 }
 
+DEFINE_I440FX_MACHINE(v2_7, "pc-i440fx-2.7", NULL,
+  pc_i440fx_2_7_machine_options);
+
+
+static void pc_i440fx_2_6_machine_options(MachineClass *m)
+{
+pc_i440fx_2_7_machine_options(m);
+m->is_default = 0;
+m->alias = NULL;
+SET_MACHINE_COMPAT(m, PC_COMPAT_2_6);
+}
+
 DEFINE_I440FX_MACHINE(v2_6, "pc-i440fx-2.6", NULL,
   pc_i440fx_2_6_machine_options);
 
@@ -431,8 +443,6 @@ static void pc_i440fx_2_5_machine_options(MachineClass *m)
 {
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_i440fx_2_6_machine_options(m);
-m->alias = NULL;
-m->is_default = 0;
 pcmc->save_tsc_khz = false;
 m->legacy_fw_cfg_order = 1;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_5);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 04aae89..4787df1 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -283,12 +283,22 @@ static void pc_q35_machine_options(MachineClass *m)
 m->no_floppy = 1;
 }
 
-static void pc_q35_2_6_machine_options(MachineClass *m)
+static void pc_q35_2_7_machine_options(MachineClass *m)
 {
 pc_q35_machine_options(m);
 m->alias = "q35";
 }
 
+DEFINE_Q35_MACHINE(v2_7, "pc-q35-2.7", NULL,
+   pc_q35_2_7_machine_options);
+
+static void pc_q35_2_6_machine_options(MachineClass *m)
+{
+pc_q35_2_7_machine_options(m);
+m->alias = NULL;
+SET_MACHINE_COMPAT(m, PC_COMPAT_2_6);
+}
+
 DEFINE_Q35_MACHINE(v2_6, "pc-q35-2.6", NULL,
pc_q35_2_6_machine_options);
 
@@ -296,7 +306,6 @@ static void pc_q35_2_5_machine_options(MachineClass *m)
 {
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_q35_2_6_machine_options(m);
-m->alias = NULL;
 pcmc->save_tsc_khz = false;
 m->legacy_fw_cfg_order = 1;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_5);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 9ca2309..ca23609 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -356,7 +356,11 @@ int e820_add_entry(uint64_t, uint64_t, uint32_t);
 int e820_get_num_entries(void);
 bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 
+#define PC_COMPAT_2_6 \
+HW_COMPAT_2_6
+
 #define PC_COMPAT_2_5 \
+PC_COMPAT_2_6 \
 HW_COMPAT_2_5
 
 /* Helper for setting model-id for CPU models that changed model-id
-- 
2.5.5




Re: [Qemu-devel] [RFC v2 11/11] tcg: enable thread-per-vCPU

2016-05-27 Thread Sergey Fedorov
On 27/05/16 18:25, Paolo Bonzini wrote:
>
> On 27/05/2016 17:07, Sergey Fedorov wrote:
>>  1. Make 'cpu->thread_kicked' access atomic
>>  2. Remove global 'exit_request' and use per-CPU 'exit_request'
>>  3. Change how 'current_cpu' is set
>>  4. Reorganize round-robin CPU TCG thread function
>>  5. Enable 'mmap_lock' for system mode emulation (do we really want 
>> this?)
 No, I don't think so.

>>  6. Enable 'tb_lock' for system mode emulation
>>  7. Introduce per-CPU TCG thread function
 At least 2/3/7 must be done at the same time, but I agree that this
 patch could use some splitting. :)
>> Hmm, 2/3 do also change single-threaded CPU loop. I think they should
>> apply separately from 7.
> Reviewed the patch now, and I'm not sure how you can do 2/3 for the
> single-threaded CPU loop.  They could be moved out of cpu_exec and into
> cpus.c (in a separate patch), but you need exit_request and
> tcg_current_cpu to properly kick the single-threaded CPU loop out of
> qemu_tcg_cpu_thread_fn.

To summarize my IRC chat with Paolo: we want run_on_cpu() to be served
as soon as possible so that it does not block the IO thread for too long.
Removing the global 'exit_request' would mean that a run_on_cpu() request
from the IO thread would not be served until the single-threaded CPU loop
schedules the target CPU. This doesn't seem acceptable.

NB: Calling run_on_cpu() for another CPU from the CPU thread would cause a
deadlock in the single-threaded round-robin CPU loop.

Thanks,
Sergey



Re: [Qemu-devel] [PATCH 09/10] block: fix backup in vmdk format image

2016-05-27 Thread Stefan Hajnoczi
On Sat, May 14, 2016 at 03:45:57PM +0300, Denis V. Lunev wrote:
> From: Pavel Butsykin 
> 
> The vmdk format has the extents and bs->file can be equal to the first
> extension. Before start of the backup we do detach the old context on the
> target drive at the bdrv_attach_aio_context. For the vmdk drive this means
> a double detach of the same block driver state, because the detach occurs
> for s->extents[0].file and bs->file.
> 
> To fix we  just skip the detach if s->extents[i].file and bs->file are the
> same. This approach is already used in the vmdk_free_extents() and the
> vmdk_get_allocated_file_size(), so it won't be some innovation :)
> 
> Signed-off-by: Pavel Butsykin 
> Signed-off-by: Denis V. Lunev 
> CC: Jeff Cody 
> CC: Markus Armbruster 
> CC: Eric Blake 
> CC: John Snow 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> ---
>  block/vmdk.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/block/vmdk.c b/block/vmdk.c
> index 9530b30..0550924 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -2305,7 +2305,10 @@ static void vmdk_detach_aio_context(BlockDriverState 
> *bs)
>  int i;
>  
>  for (i = 0; i < s->num_extents; i++) {
> -bdrv_detach_aio_context(s->extents[i].file->bs);
> +BdrvChild *file = s->extents[i].file;
> +if (file != bs->file) {
> +bdrv_detach_aio_context(file->bs);
> +}
>  }
>  }

I think this fix is no longer necessary.  Max eliminated
vmdk_detach_aio_context() here:

[PULL 24/31] block: Propagate AioContext change to all children




Re: [Qemu-devel] [PATCH 07/10] blockdev-backup: added support for data compression

2016-05-27 Thread Stefan Hajnoczi
On Sat, May 14, 2016 at 03:45:55PM +0300, Denis V. Lunev wrote:
> From: Pavel Butsykin 
> 
> The idea is simple - backup is "written-once" data. It is written block
> by block and it is large enough. It would be nice to save storage
> space and compress it.
> 
> Signed-off-by: Pavel Butsykin 
> Signed-off-by: Denis V. Lunev 
> CC: Jeff Cody 
> CC: Markus Armbruster 
> CC: Eric Blake 
> CC: John Snow 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> ---
>  blockdev.c   | 10 +-
>  qapi/block-core.json |  1 +
>  qmp-commands.hx  |  3 ++-
>  3 files changed, 12 insertions(+), 2 deletions(-)

Aside from the API doc issues that Eric mentioned:

Reviewed-by: Stefan Hajnoczi 




Re: [Qemu-devel] [PATCH 04/10] qcow: add qcow_co_write_compressed

2016-05-27 Thread Stefan Hajnoczi
On Sat, May 14, 2016 at 03:45:52PM +0300, Denis V. Lunev wrote:
> +qemu_co_mutex_lock(&s->lock);
> +cluster_offset = get_cluster_offset(bs, sector_num << 9, 2, out_len, 0, 0);
> +qemu_co_mutex_unlock(&s->lock);
> +if (cluster_offset == 0) {
> +ret = -EIO;
> +goto fail;
> +}
> +cluster_offset &= s->cluster_offset_mask;
> +
> +iov = (struct iovec) {
> +.iov_base   = out_buf,
> +.iov_len= out_len,
> +};
> +qemu_iovec_init_external(&hd_qiov, &iov, 1);
> +ret = bdrv_co_pwritev(bs->file->bs, cluster_offset, out_len, &hd_qiov, 0);

Not sure if this has the same race condition as the qcow2 patch.  It
seems that bdrv_getlength() is used to extend the file on a per-sector
basis.  That would mean compressed data is not packed inside sectors and
no read-modify-write race condition exists, but I haven't fully audited
get_cluster_offset().

Stefan




Re: [Qemu-devel] [PATCH 06/10] drive-backup: added support for data compression

2016-05-27 Thread Stefan Hajnoczi
On Sat, May 14, 2016 at 03:45:54PM +0300, Denis V. Lunev wrote:
> From: Pavel Butsykin 
> 
> The idea is simple - backup is "written-once" data. It is written block
> by block and it is large enough. It would be nice to save storage
> space and compress it.
> 
> The patch adds a flag to the qmp/hmp drive-backup command which enables
> block compression. Compression should be implemented in the format driver
> to enable this feature.
> 
> There are some limitations of the format driver to allow compressed writes.
> We can write data only once. Though for backup this is perfectly fine.
> These limitations are maintained by the driver and the error will be
> reported if we are doing something wrong.
> 
> Signed-off-by: Pavel Butsykin 
> Signed-off-by: Denis V. Lunev 
> CC: Jeff Cody 
> CC: Markus Armbruster 
> CC: Eric Blake 
> CC: John Snow 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> ---
>  block/backup.c| 13 +
>  blockdev.c| 12 ++--
>  hmp-commands.hx   |  8 +---
>  hmp.c |  3 ++-
>  include/block/block_int.h |  1 +
>  qapi/block-core.json  |  2 +-
>  qmp-commands.hx   |  4 +++-
>  7 files changed, 35 insertions(+), 8 deletions(-)

Aside from the API doc issues that Eric mentioned:

Reviewed-by: Stefan Hajnoczi 




Re: [Qemu-devel] [PATCH] linux-user: provide frame information in x86-64 safe_syscall

2016-05-27 Thread Peter Maydell
On 27 May 2016 at 16:06, Peter Maydell  wrote:
>  return_ERESTARTSYS:
>  /* code path when we didn't execute the syscall */
> +.cfi_restore_state
>  mov $-TARGET_ERESTARTSYS, %rax
>  pop %rbp
> +.cfi_def_cfa_offset 8
> +.cfi_restore ebp

These should read '.cfi_restore rbp'. I have no idea how that got through
testing since it doesn't compile at all...

thanks
-- PMM



Re: [Qemu-devel] [PATCH 03/10] vmdk: add vmdk_co_write_compressed

2016-05-27 Thread Stefan Hajnoczi
On Sat, May 14, 2016 at 03:45:51PM +0300, Denis V. Lunev wrote:
> From: Pavel Butsykin 
> 
> Added implementation of the vmdk_co_write_compressed function that
> will allow us to safely use compressed writes for the vmdk from running
> VMs.
> 
> Signed-off-by: Pavel Butsykin 
> Signed-off-by: Denis V. Lunev 
> CC: Jeff Cody 
> CC: Markus Armbruster 
> CC: Eric Blake 
> CC: John Snow 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> ---
>  block/vmdk.c | 56 ++--
>  1 file changed, 6 insertions(+), 50 deletions(-)

Reviewed-by: Stefan Hajnoczi 




Re: [Qemu-devel] [PATCH 02/10] qcow2: add qcow2_co_write_compressed

2016-05-27 Thread Stefan Hajnoczi
On Sat, May 14, 2016 at 03:45:50PM +0300, Denis V. Lunev wrote:
> +qemu_co_mutex_lock(&s->lock);
> +cluster_offset = \
> +qcow2_alloc_compressed_cluster_offset(bs, sector_num << 9, out_len);

The backslash isn't necessary for wrapping lines in C.  This kind of
thing is only necessary in languages like Python where the grammar is
whitespace sensitive.

The C compiler is happy with an arbitrary amount of whitespace
(newlines) in the middle of a statement.  The backslash in C is handled
by the preprocessor: it joins the line.  That's useful for macro
definitions where you need to tell the preprocessor that several lines
belong to one macro definition.  But it's not needed for normal C code.
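
To make the distinction concrete, here is a small sketch. The names
(SECTOR_TO_BYTES, alloc_offset, wrapped_call) are made up for the example
and are not from the patch; alloc_offset just echoes its argument.

```c
#include <assert.h>
#include <stdint.h>

/* A macro definition ends at the end of the logical line, so the
 * backslash is required here: the preprocessor joins both lines. */
#define SECTOR_TO_BYTES(sector_num) \
    ((uint64_t)(sector_num) << 9)

/* Hypothetical stand-in for an allocator; it only echoes the offset. */
static uint64_t alloc_offset(uint64_t byte_offset, int len)
{
    assert(len > 0);
    return byte_offset;
}

uint64_t wrapped_call(uint64_t sector_num, int out_len)
{
    uint64_t cluster_offset;

    /* An ordinary statement: a plain newline after '=' is enough. */
    cluster_offset =
        alloc_offset(SECTOR_TO_BYTES(sector_num), out_len);
    return cluster_offset;
}
```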

> +if (!cluster_offset) {
> +qemu_co_mutex_unlock(&s->lock);
> +ret = -EIO;
> +goto fail;
> +}
> +cluster_offset &= s->cluster_offset_mask;
>  
> -BLKDBG_EVENT(bs->file, BLKDBG_WRITE_COMPRESSED);
> -ret = bdrv_pwrite(bs->file->bs, cluster_offset, out_buf, out_len);
> -if (ret < 0) {
> -goto fail;
> -}
> +ret = qcow2_pre_write_overlap_check(bs, 0, cluster_offset, out_len);
> +qemu_co_mutex_unlock(&s->lock);
> +if (ret < 0) {
> +goto fail;
>  }
>  
> +iov = (struct iovec) {
> +.iov_base   = out_buf,
> +.iov_len= out_len,
> +};
> +qemu_iovec_init_external(&hd_qiov, &iov, 1);
> +
> +BLKDBG_EVENT(bs->file, BLKDBG_WRITE_COMPRESSED);
> +ret = bdrv_co_pwritev(bs->file->bs, cluster_offset, out_len, &hd_qiov, 0);

There is a race condition here:

If the newly allocated cluster is only partially filled by compressed
data then qcow2_alloc_compressed_cluster_offset() remembers that more
bytes are still available in the cluster.  The
qcow2_alloc_compressed_cluster_offset() caller will continue filling the
same cluster.

Imagine two compressed writes running at the same time.  Write A
allocates just a few bytes so write B shares a sector with the first
write:

 Sector 1
  |AAAB|

The race condition is that bdrv_co_pwritev() uses read-modify-write (a
bounce buffer).  If both requests call bdrv_co_pwritev() around the same
time then the following could happen:

 Sector 1
  |000B|

or:

 Sector 1
  |AAA0|

It's necessary to hold s->lock around the compressed data write to avoid
this race condition.
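
The interleaving can be replayed deterministically in a toy model. This is
only a single-threaded sketch with made-up names (Disk, Bounce, rmw_*); it
does not use the block layer API, and the 4-byte "sector" mirrors the
diagrams above.

```c
#include <assert.h>
#include <string.h>

#define SECTOR_SIZE 4

typedef struct {
    char data[SECTOR_SIZE];   /* the shared on-disk sector */
} Disk;

typedef struct {
    char buf[SECTOR_SIZE];    /* one request's private bounce buffer */
} Bounce;

static void rmw_read(const Disk *d, Bounce *b)
{
    memcpy(b->buf, d->data, SECTOR_SIZE);
}

static void rmw_modify(Bounce *b, int off, int len, char fill)
{
    assert(off >= 0 && len >= 0 && off + len <= SECTOR_SIZE);
    memset(b->buf + off, fill, (size_t)len);
}

static void rmw_write(Disk *d, const Bounce *b)
{
    memcpy(d->data, b->buf, SECTOR_SIZE);
}

/* Both requests read the sector before either one writes it back. */
void racy_interleaving(Disk *d)
{
    Bounce a, b;

    rmw_read(d, &a);
    rmw_read(d, &b);
    rmw_modify(&a, 0, 3, 'A');
    rmw_modify(&b, 3, 1, 'B');
    rmw_write(d, &a);         /* sector is now AAA0 */
    rmw_write(d, &b);         /* sector is now 000B: A's bytes are lost */
}

/* With a lock held across each read-modify-write, B sees A's data. */
void serialized(Disk *d)
{
    Bounce a, b;

    rmw_read(d, &a);
    rmw_modify(&a, 0, 3, 'A');
    rmw_write(d, &a);

    rmw_read(d, &b);
    rmw_modify(&b, 3, 1, 'B');
    rmw_write(d, &b);         /* sector is now AAAB */
}
```

Swapping the order of the two rmw_write() calls in racy_interleaving()
gives the other bad outcome, AAA0.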




[Qemu-devel] [PATCH v13 3/8] hw/ptimer: Update .delta on period/freq change

2016-05-27 Thread Dmitry Osipenko
The delta value must be updated on a period/freq change; otherwise a running
timer is restarted with the counter reloaded from the old delta. Only the
m68k/mcf520x and arm/arm_timer devices currently change the frequency
correctly, i.e. by stopping the timer first. Perform the delta update to fix
the affected devices and to rule out further mistakes.

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Peter Crosthwaite 
---
 hw/core/ptimer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 7e6fc2d..76ebe9b 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -190,6 +190,7 @@ void ptimer_stop(ptimer_state *s)
 /* Set counter increment interval in nanoseconds.  */
 void ptimer_set_period(ptimer_state *s, int64_t period)
 {
+s->delta = ptimer_get_count(s);
 s->period = period;
 s->period_frac = 0;
 if (s->enabled) {
@@ -201,6 +202,7 @@ void ptimer_set_period(ptimer_state *s, int64_t period)
 /* Set counter frequency in Hz.  */
 void ptimer_set_freq(ptimer_state *s, uint32_t freq)
 {
+s->delta = ptimer_get_count(s);
 s->period = 1000000000ll / freq;
 s->period_frac = (1000000000ll << 32) / freq;
 if (s->enabled) {
-- 
2.8.3




[Qemu-devel] [PATCH v13 1/8] hw/ptimer: Fix issues caused by the adjusted timer limit value

2016-05-27 Thread Dmitry Osipenko
There are multiple issues related to a timer with an adjusted .limit value:

1) ptimer_get_count() returns an incorrect counter value for a disabled
timer after loading the counter with a small value, because the adjusted
limit value is used instead of the original.

For instance:
1) ptimer_stop(t)
2) ptimer_set_period(t, 1)
3) ptimer_set_limit(t, 0, 1)
4) ptimer_get_count(t) <-- would return 1 instead of 0

2) ptimer_get_count() might return an incorrect value for a timer running
with an adjusted limit value.

For instance:
1) ptimer_stop(t)
2) ptimer_set_period(t, 1)
3) ptimer_set_limit(t, 10, 1)
4) ptimer_run(t)
5) ptimer_get_count(t) <-- might return value > 10

3) Neither ptimer_set_period() nor ptimer_set_freq() adjusts the
limit value, so it is still possible to make the timer timeout value
arbitrarily small.

For instance:
1) ptimer_set_period(t, 1)
2) ptimer_set_limit(t, 1, 0)
3) ptimer_set_period(t, 1) <-- bypass limit correction

Fix all of the above issues by adjusting the timer period instead of the
limit. Perform the adjustment for periodic timers only. Use the delta value
instead of the limit to decide whether the adjustment is required, as the
limit could be altered while the timer is running, resulting in an incorrect
value returned by ptimer_get_count().
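
As a rough illustration of the period adjustment (not the patch itself):
the sketch below assumes a 10000 ns floor, taken from the "about ten
microseconds" comment in the code, and ignores period_frac and icount mode.
The function names and MIN_TIMEOUT_NS are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: floor of ten microseconds, expressed in nanoseconds. */
#define MIN_TIMEOUT_NS 10000

/* Period actually used for scheduling; the stored limit is untouched. */
uint64_t effective_period(uint64_t delta, uint64_t period_ns, int periodic)
{
    assert(delta != 0);
    if (periodic && delta * period_ns < MIN_TIMEOUT_NS) {
        /* Stretch the period so that delta ticks take about 10 us. */
        return MIN_TIMEOUT_NS / delta;
    }
    return period_ns;
}

/* Time until the next event for a counter that has delta ticks left. */
uint64_t timeout_ns(uint64_t delta, uint64_t period_ns, int periodic)
{
    return delta * effective_period(delta, period_ns, periodic);
}
```

Because only the scheduling period changes, reading the counter back still
reports values in terms of the limit the guest programmed.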

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Peter Crosthwaite 
---
 hw/core/ptimer.c | 51 +++
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 153c835..16d7dd5 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -35,6 +35,9 @@ static void ptimer_trigger(ptimer_state *s)
 
 static void ptimer_reload(ptimer_state *s)
 {
+uint32_t period_frac = s->period_frac;
+uint64_t period = s->period;
+
 if (s->delta == 0) {
 ptimer_trigger(s);
 s->delta = s->limit;
@@ -45,10 +48,24 @@ static void ptimer_reload(ptimer_state *s)
 return;
 }
 
+/*
+ * Artificially limit timeout rate to something
+ * achievable under QEMU.  Otherwise, QEMU spends all
+ * its time generating timer interrupts, and there
+ * is no forward progress.
+ * About ten microseconds is the fastest that really works
+ * on the current generation of host machines.
+ */
+
+if (s->enabled == 1 && (s->delta * period < 10000) && !use_icount) {
+period = 10000 / s->delta;
+period_frac = 0;
+}
+
 s->last_event = s->next_event;
-s->next_event = s->last_event + s->delta * s->period;
-if (s->period_frac) {
-s->next_event += ((int64_t)s->period_frac * s->delta) >> 32;
+s->next_event = s->last_event + s->delta * period;
+if (period_frac) {
+s->next_event += ((int64_t)period_frac * s->delta) >> 32;
 }
 timer_mod(s->timer, s->next_event);
 }
@@ -83,6 +100,13 @@ uint64_t ptimer_get_count(ptimer_state *s)
 uint64_t div;
 int clz1, clz2;
 int shift;
+uint32_t period_frac = s->period_frac;
+uint64_t period = s->period;
+
+if ((s->enabled == 1) && !use_icount && (s->delta * period < 10000)) {
+period = 10000 / s->delta;
+period_frac = 0;
+}
 
 /* We need to divide time by period, where time is stored in
rem (64-bit integer) and period is stored in period/period_frac
@@ -95,7 +119,7 @@ uint64_t ptimer_get_count(ptimer_state *s)
 */
 
 rem = s->next_event - now;
-div = s->period;
+div = period;
 
 clz1 = clz64(rem);
 clz2 = clz64(div);
@@ -104,13 +128,13 @@ uint64_t ptimer_get_count(ptimer_state *s)
 rem <<= shift;
 div <<= shift;
 if (shift >= 32) {
-div |= ((uint64_t)s->period_frac << (shift - 32));
+div |= ((uint64_t)period_frac << (shift - 32));
 } else {
 if (shift != 0)
-div |= (s->period_frac >> (32 - shift));
+div |= (period_frac >> (32 - shift));
 /* Look at remaining bits of period_frac and round div up if 
necessary.  */
-if ((uint32_t)(s->period_frac << shift))
+if ((uint32_t)(period_frac << shift))
 div += 1;
 }
 counter = rem / div;
@@ -182,19 +206,6 @@ void ptimer_set_freq(ptimer_state *s, uint32_t freq)
count = limit.  */
 void ptimer_set_limit(ptimer_state *s, uint64_t limit, int reload)
 {
-/*
- * Artificially limit timeout rate to something
- * achievable under QEMU.  Otherwise, QEMU spends all
- * its time generating timer interrupts, and there
- * is no forward progress.
- * About ten microseconds is the fastest that really works
- * on the current generation of host 

[Qemu-devel] [PATCH v13 2/8] hw/ptimer: Perform counter wrap around if timer already expired

2016-05-27 Thread Dmitry Osipenko
ptimer_get_count() might be called after the QEMU timer has already expired.
In that case ptimer would return counter = 0, which might be undesirable
for a polled timer. Wrap the counter around for a periodic timer to keep
its values distributed. To emulate certain hardware more accurately, don't
wrap around in icount mode; return counter = 0 in that case (this doesn't
affect the polled counter distribution).
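
A simplified sketch of the wrap-around arithmetic, for illustration only:
it uses the same formula as the patch (limit - (counter - 1) % limit) but
leaves out period_frac, oneshot mode and icount. wrapped_count() is a
made-up helper, not a ptimer function.

```c
#include <assert.h>
#include <stdint.h>

/* Counter value of a periodic timer polled at time 'now' (ns). */
uint64_t wrapped_count(int64_t now, int64_t next_event,
                       uint64_t period_ns, uint64_t limit)
{
    uint64_t ticks;

    assert(period_ns != 0 && limit != 0);

    if (now < next_event) {
        /* Not expired yet: ticks remaining until the event. */
        return (uint64_t)(next_event - now) / period_ns;
    }

    /* Expired: count whole ticks elapsed since the missed event. */
    ticks = (uint64_t)(now - next_event) / period_ns;
    if (ticks == 0) {
        return 0;
    }
    /* Wrap around as if the counter had been reloaded from limit. */
    return limit - (ticks - 1) % limit;
}
```

With limit = 10 and period = 100 ns, a poll 250 ns after the missed event
yields 9 rather than a constant 0.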

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Peter Crosthwaite 
---
 hw/core/ptimer.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 16d7dd5..7e6fc2d 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -84,14 +84,16 @@ static void ptimer_tick(void *opaque)
 
 uint64_t ptimer_get_count(ptimer_state *s)
 {
-int64_t now;
 uint64_t counter;
 
 if (s->enabled) {
-now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+int64_t next = s->next_event;
+bool expired = (now - next >= 0);
+bool oneshot = (s->enabled == 2);
+
 /* Figure out the current counter value.  */
-if (now - s->next_event > 0
-|| s->period == 0) {
+if (s->period == 0 || (expired && (oneshot || use_icount))) {
 /* Prevent timer underflowing if it should already have
triggered.  */
 counter = 0;
@@ -103,7 +105,7 @@ uint64_t ptimer_get_count(ptimer_state *s)
 uint32_t period_frac = s->period_frac;
 uint64_t period = s->period;
 
-if ((s->enabled == 1) && !use_icount && (s->delta * period < 10000)) {
+if (!oneshot && (s->delta * period < 10000) && !use_icount) {
 period = 10000 / s->delta;
 period_frac = 0;
 }
@@ -118,7 +120,7 @@ uint64_t ptimer_get_count(ptimer_state *s)
backwards.
 */
 
-rem = s->next_event - now;
+rem = expired ? now - next : next - now;
 div = period;
 
 clz1 = clz64(rem);
@@ -138,6 +140,11 @@ uint64_t ptimer_get_count(ptimer_state *s)
 div += 1;
 }
 counter = rem / div;
+
+if (expired && counter != 0) {
+/* Wrap around periodic counter.  */
+counter = s->limit - (counter - 1) % s->limit;
+}
 }
 } else {
 counter = s->delta;
-- 
2.8.3




[Qemu-devel] [PATCH v13 4/8] hw/ptimer: Support "on the fly" timer mode switch

2016-05-27 Thread Dmitry Osipenko
Allow switching between periodic <-> oneshot modes while the timer is running.

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Peter Crosthwaite 
---
 hw/core/ptimer.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 76ebe9b..d0b2f38 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -163,16 +163,17 @@ void ptimer_set_count(ptimer_state *s, uint64_t count)
 
 void ptimer_run(ptimer_state *s, int oneshot)
 {
-if (s->enabled) {
-return;
-}
-if (s->period == 0) {
+bool was_disabled = !s->enabled;
+
+if (was_disabled && s->period == 0) {
 fprintf(stderr, "Timer with period zero, disabling\n");
 return;
 }
 s->enabled = oneshot ? 2 : 1;
-s->next_event = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-ptimer_reload(s);
+if (was_disabled) {
+s->next_event = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+ptimer_reload(s);
+}
 }
 
 /* Pause a timer.  Note that this may cause it to "lose" time, even if it
-- 
2.8.3




[Qemu-devel] [PATCH v13 6/8] hw/ptimer: Support running with counter = 0 by introducing new policy feature

2016-05-27 Thread Dmitry Osipenko
Currently ptimer prints an error message and clears the enable flag when
arming a timer that has delta = load = 0. That is actually a valid case for
most timers, e.g. an instant IRQ trigger for a oneshot timer or continuous
triggering in periodic mode. There are different possible policies for how a
particular timer could interpret setting the counter to 0, such as:

1) Immediate stop with triggering the interrupt in oneshot mode. Immediate
   wraparound with triggering the interrupt in continuous mode.

2) Immediate stop without triggering the interrupt in oneshot mode. Immediate
   wraparound without triggering the interrupt in continuous mode.

3) Stopping with triggering the interrupt after one period in oneshot mode.
   Wraparound with triggering the interrupt in continuous mode after one
   period.

4) Stopping without triggering the interrupt after one period in oneshot mode.
   Wraparound without triggering the interrupt in continuous mode after one
   period.

Timers may also handle oneshot and continuous modes differently.

Given that running the timer with counter/period equal to 0 is currently
treated as an error case, it's not obvious which policies should be supported
by ptimer. Let's implement the third policy for now, as it is known to be
used by the ARM MPTimer.

Explicitly forbid running with counter/period == 0 in all cases by aborting
QEMU execution instead of printing the error message.

Bump VMSD version since a new VMState member is added.

Signed-off-by: Dmitry Osipenko 
---
 hw/core/ptimer.c| 53 +++--
 include/hw/ptimer.h |  6 ++
 2 files changed, 37 insertions(+), 22 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 05b0c27..9bc70f5 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -21,6 +21,7 @@ struct ptimer_state
 int64_t period;
 int64_t last_event;
 int64_t next_event;
+uint8_t policy;
 QEMUBH *bh;
 QEMUTimer *timer;
 };
@@ -37,15 +38,16 @@ static void ptimer_reload(ptimer_state *s)
 {
 uint32_t period_frac = s->period_frac;
 uint64_t period = s->period;
+uint64_t delta = MAX(1, s->delta);
 
-if (s->delta == 0) {
-ptimer_trigger(s);
-s->delta = s->limit;
+if (s->delta == 0 && s->policy == UNIMPLEMENTED) {
+hw_error("ptimer: Running with counter=0 is unimplemented by " \
+ "this timer, fix it!\n");
 }
-if (s->delta == 0 || s->period == 0) {
-fprintf(stderr, "Timer with period zero, disabling\n");
-s->enabled = 0;
-return;
+
+if (period == 0) {
+hw_error("ptimer: Timer tries to run with period=0, behaviour is " \
+ "undefined, fix it!\n");
 }
 
 /*
@@ -57,15 +59,15 @@ static void ptimer_reload(ptimer_state *s)
  * on the current generation of host machines.
  */
 
-if (s->enabled == 1 && (s->delta * period < 10000) && !use_icount) {
-period = 10000 / s->delta;
+if (s->enabled == 1 && (delta * period < 10000) && !use_icount) {
+period = 10000 / delta;
 period_frac = 0;
 }
 
 s->last_event = s->next_event;
-s->next_event = s->last_event + s->delta * period;
+s->next_event = s->last_event + delta * period;
 if (period_frac) {
-s->next_event += ((int64_t)period_frac * s->delta) >> 32;
+s->next_event += ((int64_t)period_frac * delta) >> 32;
 }
 timer_mod(s->timer, s->next_event);
 }
@@ -73,27 +75,30 @@ static void ptimer_reload(ptimer_state *s)
 static void ptimer_tick(void *opaque)
 {
 ptimer_state *s = (ptimer_state *)opaque;
-ptimer_trigger(s);
-s->delta = 0;
-if (s->enabled == 2) {
+
+s->delta = (s->enabled == 1) ? s->limit : 0;
+
+if (s->delta == 0) {
 s->enabled = 0;
 } else {
 ptimer_reload(s);
 }
+
+ptimer_trigger(s);
 }
 
 uint64_t ptimer_get_count(ptimer_state *s)
 {
 uint64_t counter;
 
-if (s->enabled) {
+if (s->enabled && s->delta != 0) {
 int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 int64_t next = s->next_event;
 bool expired = (now - next >= 0);
 bool oneshot = (s->enabled == 2);
 
 /* Figure out the current counter value.  */
-if (s->period == 0 || (expired && (oneshot || use_icount))) {
+if (expired && (oneshot || use_icount)) {
 /* Prevent timer underflowing if it should already have
triggered.  */
 counter = 0;
@@ -165,11 +170,8 @@ void ptimer_run(ptimer_state *s, int oneshot)
 {
 bool was_disabled = !s->enabled;
 
-if (was_disabled && s->period == 0) {
-fprintf(stderr, "Timer with period zero, disabling\n");
-return;
-}
 s->enabled = oneshot ? 2 : 1;
+
 if (was_disabled) {
 s->next_event = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 ptimer_reload(s);
@@ -203,6 +205,7 @@ void ptimer_set_period(ptimer_state *s, int64_t period)
 /* Set counter frequency in 

[Qemu-devel] [PATCH v13 8/8] arm_mptimer: Convert to use ptimer

2016-05-27 Thread Dmitry Osipenko
The current ARM MPTimer implementation uses QEMUTimer for the actual timer.
This implementation isn't complete and mostly duplicates what the generic
ptimer already does well.

Conversion to ptimer brings the following benefits and fixes:
- Simple timer pausing implementation
- Fixes counter value preservation after stopping the timer
- Correctly handles prescaler != 0 / counter = 0 / load = 0 cases
- Code simplification and reduction

Use new ptimer policy feature to correctly emulate running with/setting
counter = 0.

Bump VMSD to version 3, since VMState is changed and is not compatible
with the previous implementation.

Signed-off-by: Dmitry Osipenko 
---
 hw/timer/arm_mptimer.c | 135 ++---
 include/hw/timer/arm_mptimer.h |   5 +-
 2 files changed, 72 insertions(+), 68 deletions(-)

diff --git a/hw/timer/arm_mptimer.c b/hw/timer/arm_mptimer.c
index d66bbf0..bffc506 100644
--- a/hw/timer/arm_mptimer.c
+++ b/hw/timer/arm_mptimer.c
@@ -20,9 +20,10 @@
  */
 
 #include "qemu/osdep.h"
+#include "hw/ptimer.h"
 #include "hw/timer/arm_mptimer.h"
 #include "qapi/error.h"
-#include "qemu/timer.h"
+#include "qemu/main-loop.h"
 #include "qom/cpu.h"
 
 /* This device implements the per-cpu private timer and watchdog block
@@ -44,55 +45,54 @@ static inline void timerblock_update_irq(TimerBlock *tb)
 }
 
 /* Return conversion factor from mpcore timer ticks to qemu timer ticks.  */
-static inline uint32_t timerblock_scale(TimerBlock *tb)
+static inline uint32_t timerblock_scale(uint32_t control)
 {
-return (((tb->control >> 8) & 0xff) + 1) * 10;
+return (((control >> 8) & 0xff) + 1) * 10;
 }
 
-static void timerblock_reload(TimerBlock *tb, int restart)
+static inline void timerblock_set_count(struct ptimer_state *timer,
+uint32_t control, uint64_t *count)
 {
-if (tb->count == 0) {
-return;
+/* PTimer would immediately trigger interrupt for periodic timer
+ * when counter set to 0, MPtimer under certain condition only.
+ */
+if ((control & 3) == 3 && (control & 0xff00) == 0 && *count == 0) {
+*count = ptimer_get_limit(timer);
 }
-if (restart) {
-tb->tick = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+ptimer_set_count(timer, *count);
+}
+
+static inline void timerblock_run(struct ptimer_state *timer,
+  uint32_t control, uint32_t load)
+{
+if ((control & 1) && ((control & 0xff00) || load != 0)) {
+ptimer_run(timer, !(control & 2));
 }
-tb->tick += (int64_t)tb->count * timerblock_scale(tb);
-timer_mod(tb->timer, tb->tick);
 }
 
 static void timerblock_tick(void *opaque)
 {
 TimerBlock *tb = (TimerBlock *)opaque;
 tb->status = 1;
-if (tb->control & 2) {
-tb->count = tb->load;
-timerblock_reload(tb, 0);
-} else {
-tb->count = 0;
-}
 timerblock_update_irq(tb);
+/* Periodic timer with load = 0 and prescaler != 0 would re-trigger
+ * IRQ after one period, otherwise it either stops or wraps around.
+ */
+if ((tb->control & 2) && (tb->control & 0xff00) &&
+ptimer_get_limit(tb->timer) == 0) {
+ptimer_run(tb->timer, 0);
+}
 }
 
 static uint64_t timerblock_read(void *opaque, hwaddr addr,
 unsigned size)
 {
 TimerBlock *tb = (TimerBlock *)opaque;
-int64_t val;
 switch (addr) {
 case 0: /* Load */
-return tb->load;
+return ptimer_get_limit(tb->timer);
 case 4: /* Counter.  */
-if (((tb->control & 1) == 0) || (tb->count == 0)) {
-return 0;
-}
-/* Slow and ugly, but hopefully won't happen too often.  */
-val = tb->tick - qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-val /= timerblock_scale(tb);
-if (val < 0) {
-val = 0;
-}
-return val;
+return ptimer_get_count(tb->timer);
 case 8: /* Control.  */
 return tb->control;
 case 12: /* Interrupt status.  */
@@ -106,37 +106,45 @@ static void timerblock_write(void *opaque, hwaddr addr,
  uint64_t value, unsigned size)
 {
 TimerBlock *tb = (TimerBlock *)opaque;
-int64_t old;
+uint32_t control = tb->control;
 switch (addr) {
 case 0: /* Load */
-tb->load = value;
-/* Fall through.  */
-case 4: /* Counter.  */
-if ((tb->control & 1) && tb->count) {
-/* Cancel the previous timer.  */
-timer_del(tb->timer);
+/* Setting load to 0 stops the timer without doing the tick if
+ * prescaler = 0.
+ */
+if ((control & 1) && (control & 0xff00) == 0 && value == 0) {
+ptimer_stop(tb->timer);
 }
-tb->count = value;
-if (tb->control & 1) {
-timerblock_reload(tb, 1);
+ptimer_set_limit(tb->timer, value, 1);
+

[Qemu-devel] [PATCH v13 7/8] hw/ptimer: Fix counter - 1 returned by ptimer_get_count for the active timer

2016-05-27 Thread Dmitry Osipenko
Because ptimer_get_count rounds down, it returns counter - 1 for an active
timer. That is incorrect: the counter should decrement only after a period
has expired, not before. That is, if a running timer has been loaded with
value X, the counter should stay at X until the period expires and decrement
afterwards. Fix this by adding 1 to the counter value for an active,
unexpired timer.

Signed-off-by: Dmitry Osipenko 
---
 hw/core/ptimer.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 9bc70f5..c9f2604 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -89,10 +89,10 @@ static void ptimer_tick(void *opaque)
 
 uint64_t ptimer_get_count(ptimer_state *s)
 {
+int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 uint64_t counter;
 
-if (s->enabled && s->delta != 0) {
-int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+if (s->enabled && s->delta != 0 && now != s->last_event) {
 int64_t next = s->next_event;
 bool expired = (now - next >= 0);
 bool oneshot = (s->enabled == 2);
@@ -144,7 +144,7 @@ uint64_t ptimer_get_count(ptimer_state *s)
 if ((uint32_t)(period_frac << shift))
 div += 1;
 }
-counter = rem / div;
+counter = rem / div + (expired ? 0 : 1);
 
 if (expired && counter != 0) {
 /* Wrap around periodic counter.  */
-- 
2.8.3
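
The rounding issue that patch 7/8 fixes can be illustrated with a small
standalone model. This is only a sketch under simplifying assumptions, not
the ptimer code: it assumes a fixed integer period with no fractional part
and ignores icount mode. The names count_rounded_down and count_fixed are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model (not QEMU code): remaining_ns is the time left until
 * the timer expires, period_ns is the length of one tick.
 */

/* Old behaviour: plain integer division rounds down. */
static uint64_t count_rounded_down(uint64_t remaining_ns, uint64_t period_ns)
{
    assert(period_ns != 0);
    return remaining_ns / period_ns;
}

/*
 * Behaviour after the fix: while the timer has not expired, the tick
 * currently in progress still counts, so add 1 to the quotient.
 */
static uint64_t count_fixed(uint64_t remaining_ns, uint64_t period_ns,
                            bool expired)
{
    assert(period_ns != 0);
    return remaining_ns / period_ns + (expired ? 0 : 1);
}
```

With a timer loaded with 10 and a 100 ns period, 1 ns after the start there
are 999 ns remaining: the rounded-down variant reports 9, the fixed variant
reports 10. The model does not cover the exact instant of loading, which the
patch handles separately through the now != s->last_event check.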




[Qemu-devel] [PATCH v13 0/8] PTimer fixes/features and ARM MPTimer conversion

2016-05-27 Thread Dmitry Osipenko
Hello,

The current QEMU ARM MPTimer device model implements only a subset of the
hardware's behavior, so this patch series adds the missing parts by
converting the MPTimer to use the generic ptimer helper. It also fixes some
important ptimer bugs and adds new features that the ARM MPTimer requires.

Emulation behavior is verified against the real HW by running specially
crafted MPTimer tests in both icount and non-icount modes:

https://gist.github.com/digetx/dbd46109503b1a91941a


Changelog for the ARM MPTimer QEMUTimer to ptimer conversion:

V2: Fixed changing periodic timer counter value "on the fly". I added a
test to the gist to cover that issue.

V3: Fixed starting the timer with load = 0 and counter != 0, added tests
to the gist for this issue. Changed vmstate version for all VMSD's,
since loadvm doesn't check version of nested VMSD.

V4: Fixed spurious IT bit set for the timer starting in the periodic mode
with counter = 0. Test added.

V5: Code cleanup, now depends on ptimer_set_limit() fix.

V6: No code change, added test to check ptimer_get_count() with corrected
.limit value.

V7: No change.

V8: No change.

V9: No change.

V10: Correctly handle cases when counter = load = 0 and prescaler != 0,
 i.e. triggering interrupt in that case. Call ptimer_* only when
 certain MPTimer state was changed, like prescaler change. Factor out
 timerblock_set_count from timerblock_run and inline both.
 Tests updated.

V11: Fixed missed periodic timer stopping on setting counter => 0 with
 load = 0 and prescaler = 0.

v12: Timer isn't doing uninterruptible IRQ, but ticks continuously.
 On setting counter/load to 0 with prescaler != 0, timer would trigger
 IRQ after one period. Verified on real HW, tests updated.

v13: Rebased on recent master, use new ptimer policy feature.

Patches for ptimer are introduced since V5 of "ARM MPTimer conversion".

Changelog for the ptimer patches:

V5: Only fixed ptimer_set_limit() for the disabled timer.

V6: As pointed out by Peter Maydell, there are other issues beyond
ptimer_set_limit(), so V6 is supposed to cover all those issues.

V7: Added accidentally removed !use_icount check.
Added missed "else" statement.

V8: Adjust period instead of the limit and do it for periodic timer only
(.limit adjusting bug). Added patch/fix for freq/period change and
ptimer_get_count() improvement.

V9: Don't do wrap around if counter == 0, otherwise polled periodic
timer won't ever return counter = 0.

V10: Addressed V8/9 review comments.
 Adjust timer period based on delta instead of limit.
 Don't wrap around when in icount mode.
 New patches: "on the fly" mode switch, silence error msg when
  delta = load = 0, introduce ptimer_get_limit.

V11: Dropped timer tick from "Perform tick and counter wrap around if
 timer already expired" patch since it would cause a bogus tick after
 QEMU has been reset if ptimer was stopped and its QEMUTimer expired
 during reset.
 Patch "Legalize running with delta = load = 0" now explicitly
 forbids period = 0.

v12: Fixed missed abort on setting freq > 10.
 New patches:
"Fix counter - 1 returned by ptimer_get_count for the active timer"
"Perform delayed tick instead of immediate if delta = 0"

v13: Squashed the following two patches from v12:

hw/ptimer: Legalize running with delta = load = 0 and abort on period = 0
hw/ptimer: Perform delayed tick instead of immediate if delta = 0

into

hw/ptimer: Support running with counter = 0 by introducing new policy 
feature

The second patch has no r-b. I omitted Peter Crosthwaite's r-b from the
first patch. I think it's not valid now; otherwise, please let me know.

Dmitry Osipenko (8):
  hw/ptimer: Fix issues caused by the adjusted timer limit value
  hw/ptimer: Perform counter wrap around if timer already expired
  hw/ptimer: Update .delta on period/freq change
  hw/ptimer: Support "on the fly" timer mode switch
  hw/ptimer: Introduce ptimer_get_limit
  hw/ptimer: Support running with counter = 0 by introducing new policy
feature
  hw/ptimer: Fix counter - 1 returned by ptimer_get_count for the active
timer
  arm_mptimer: Convert to use ptimer

 hw/core/ptimer.c   | 131 ---
 hw/timer/arm_mptimer.c | 135 ++---
 include/hw/ptimer.h|   7 +++
 include/hw/timer/arm_mptimer.h |   5 +-
 4 files changed, 162 insertions(+), 116 deletions(-)

-- 
2.8.3
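
The control-register conditions that the MPTimer conversion depends on can be
written as small pure functions. This is a sketch, not the patch itself: the
expressions follow timerblock_scale() and timerblock_run() from patch 8/8,
while the macro and function names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the control register bits used by the patch. */
#define CTRL_ENABLE         (1u << 0)  /* timer enabled */
#define CTRL_AUTO_RELOAD    (1u << 1)  /* periodic mode when set */
#define CTRL_PRESCALER_MASK 0xff00u    /* prescaler lives in bits 15:8 */

/* Conversion factor from MPCore timer ticks to QEMU timer ticks. */
static uint32_t mptimer_scale(uint32_t control)
{
    return (((control >> 8) & 0xff) + 1) * 10;
}

/*
 * The timer is started only if it is enabled and either the prescaler
 * is non-zero or the load value is non-zero.
 */
static bool mptimer_should_run(uint32_t control, uint32_t load)
{
    return (control & CTRL_ENABLE) &&
           ((control & CTRL_PRESCALER_MASK) || load != 0);
}

/* One-shot mode is selected when auto-reload is clear. */
static bool mptimer_is_oneshot(uint32_t control)
{
    return !(control & CTRL_AUTO_RELOAD);
}
```

For example, an enabled timer with prescaler = 0 and load = 0 is not started,
whereas the same timer with prescaler = 1 is started even though load = 0,
which matches the v12 changelog entry about triggering an IRQ after one
period.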




[Qemu-devel] [PATCH v13 5/8] hw/ptimer: Introduce ptimer_get_limit

2016-05-27 Thread Dmitry Osipenko
Currently ptimer users have to store a copy of the limit value, because
ptimer doesn't provide a way to retrieve the limit. Let's provide one.

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Peter Crosthwaite 
---
 hw/core/ptimer.c| 5 +
 include/hw/ptimer.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index d0b2f38..05b0c27 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -225,6 +225,11 @@ void ptimer_set_limit(ptimer_state *s, uint64_t limit, int reload)
 }
 }
 
+uint64_t ptimer_get_limit(ptimer_state *s)
+{
+return s->limit;
+}
+
 const VMStateDescription vmstate_ptimer = {
 .name = "ptimer",
 .version_id = 1,
diff --git a/include/hw/ptimer.h b/include/hw/ptimer.h
index 8ebacbb..e397db5 100644
--- a/include/hw/ptimer.h
+++ b/include/hw/ptimer.h
@@ -19,6 +19,7 @@ typedef void (*ptimer_cb)(void *opaque);
 ptimer_state *ptimer_init(QEMUBH *bh);
 void ptimer_set_period(ptimer_state *s, int64_t period);
 void ptimer_set_freq(ptimer_state *s, uint32_t freq);
+uint64_t ptimer_get_limit(ptimer_state *s);
 void ptimer_set_limit(ptimer_state *s, uint64_t limit, int reload);
 uint64_t ptimer_get_count(ptimer_state *s);
 void ptimer_set_count(ptimer_state *s, uint64_t count);
-- 
2.8.3




Re: [Qemu-devel] [PULL v3 00/31] Misc changes for 2016-05-27

2016-05-27 Thread Peter Maydell
On 27 May 2016 at 17:11, Paolo Bonzini  wrote:
> The following changes since commit 2c56d06bafd8933d2a9c6e0aeb5d45f7c1fb5616:
>
>   Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging 
> (2016-05-26 14:29:30 +0100)
>
> are available in the git repository at:
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 05d158af4bb817b05074a3846dce74466aea8d39:
>
>   exec: hide mr->ram_addr from qemu_get_ram_ptr users (2016-05-27 18:09:13 
> +0200)
>
> 
> * docs/atomics fixes and atomic_rcu_* optimization (Emilio)
> * NBD bugfix (Eric)
> * Memory fixes and cleanups (Paolo, Paul)
> * scsi-block support for SCSI status, including persistent
>   reservations (Paolo)
> * linuxboot support for fw_cfg DMA (Marc, Richard Jones)
> * kvm_stat moves to the Linux repository
> * SCSI bug fixes (Peter, Prasad)
> * Killing qemu_char_get_next_serial, non-ARM parts (Xiaoqiang)
>
> 

Doesn't build on clang :-(

  CC    optionrom/linuxboot_dma.o
clang: error: unknown argument: '-fno-toplevel-reorder'
clang: error: unknown argument: '-fno-toplevel-reorder'

thanks
-- PMM



Re: [Qemu-devel] [PATCH] linux-user: provide frame information in x86-64 safe_syscall

2016-05-27 Thread Peter Maydell
On 27 May 2016 at 17:21, Richard Henderson  wrote:
> On 05/27/2016 08:06 AM, Peter Maydell wrote:
>>
>> @@ -31,6 +32,8 @@ safe_syscall_base:
>>   * does not list any ABI differences regarding stack alignment.)
>>   */
>>  push    %rbp
>> +.cfi_def_cfa_offset 16
>> +.cfi_offset rbp,-16
>
>
> While this is correct, there are two other directives that make it easier to
> describe changes without having to compute globally correct constants.  Here
> they would be:
>
> .cfi_adjust_cfa_offset 8
>
> Add 8 to the offset, i.e. decrement the SP by 8.

Presumably .cfi_startproc sets the initial offset to 8?
(It's not documented that it does so, which is I think partly why
I preferred to use a directive that definitely set the offset
to the right thing.)

> .cfi_rel_offset rbp, 0
>
> Save rbp at the current top-of-stack.
>
> The assembler will compute the absolute values from these relative values.
> Using them makes it easy to see from a narrow window of code that the
> annotations are correct.
>
> Otherwise,
>
> Reviewed-by: Richard Henderson 

Thanks. I'll spin a v2 with your changes in it next week.

-- PMM



Re: [Qemu-devel] [RFC PATCH v2 0/2] spapr: Memory hot-unplug support

2016-05-27 Thread Michael Roth
Quoting Thomas Huth (2016-05-27 10:48:45)
>  Hi Bharata,
> 
> On 15.03.2016 05:38, Bharata B Rao wrote:
> > This patchset adds memory hot removal support for PowerPC sPAPR.
> > This new version switches to using the proposed "count-indexed" type of
> > hotplug identifier which allows to hot remove a number of LMBs starting
> > with a given DRC index.
> > 
> > This count-indexed hotplug identifier isn't yet part of PAPR.
> 
> Just for clarification / my understanding: That means we also need a
> modified guest to support this new interface? If yes, did you post such
> patches somewhere else already, too?

No patches posted yet, but hopefully soon. These bits will likely be added
as part of an effort that moves all memory hotplug/unplug into guest
kernel instead of relying on drmgr. Most of the bits for in-kernel
memory hotplug are already upstream, but there's a number of other
requirements in the spec update (like a new hotplug interrupt/queue
instead of re-using EPOW) that need to be addressed as part of the
switchover.

> 
>  Thomas
> 




Re: [Qemu-devel] [PATCH] linux-user: provide frame information in x86-64 safe_syscall

2016-05-27 Thread Richard Henderson

On 05/27/2016 08:06 AM, Peter Maydell wrote:

@@ -31,6 +32,8 @@ safe_syscall_base:
  * does not list any ABI differences regarding stack alignment.)
  */
 push    %rbp
+.cfi_def_cfa_offset 16
+.cfi_offset rbp,-16


While this is correct, there are two other directives that make it easier to 
describe changes without having to compute globally correct constants.  Here 
they would be:


.cfi_adjust_cfa_offset 8

Add 8 to the offset, i.e. decrement the SP by 8.

.cfi_rel_offset rbp, 0

Save rbp at the current top-of-stack.

The assembler will compute the absolute values from these relative values. 
Using them makes it easy to see from a narrow window of code that the 
annotations are correct.


Otherwise,

Reviewed-by: Richard Henderson 


r~
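
The relationship between the relative and absolute directives can be checked
with a little arithmetic. The sketch below models only the assembler's CFA
bookkeeping. It assumes the x86-64 convention that the CFA is %rsp + 8 at
function entry, and the struct and function names are made up for
illustration.

```c
#include <assert.h>

/* Hypothetical model: the CFA is tracked as %rsp + cfa_offset. */
struct cfa_state {
    int cfa_offset;
};

/* Assumption: at function entry only the return address is on the stack. */
static struct cfa_state cfa_at_function_entry(void)
{
    struct cfa_state s = { 8 };
    return s;
}

/* Models .cfi_adjust_cfa_offset: add delta to the current CFA offset. */
static void cfi_adjust_cfa_offset(struct cfa_state *s, int delta)
{
    assert(s);
    s->cfa_offset += delta;
}

/*
 * Models .cfi_rel_offset: the register is saved at offset_from_sp bytes
 * above the current stack pointer. Returns the CFA-relative offset that
 * the equivalent .cfi_offset directive would carry.
 */
static int cfi_rel_offset(const struct cfa_state *s, int offset_from_sp)
{
    assert(s);
    return offset_from_sp - s->cfa_offset;
}
```

Starting from the entry state, adjusting by 8 for the push gives a CFA offset
of 16, and saving rbp at offset 0 from the stack pointer resolves to -16,
the same constants as the .cfi_def_cfa_offset 16 and .cfi_offset rbp,-16
pair in the original patch.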



[Qemu-devel] [PULL v3 00/31] Misc changes for 2016-05-27

2016-05-27 Thread Paolo Bonzini
The following changes since commit 2c56d06bafd8933d2a9c6e0aeb5d45f7c1fb5616:

  Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging 
(2016-05-26 14:29:30 +0100)

are available in the git repository at:

  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 05d158af4bb817b05074a3846dce74466aea8d39:

  exec: hide mr->ram_addr from qemu_get_ram_ptr users (2016-05-27 18:09:13 
+0200)


* docs/atomics fixes and atomic_rcu_* optimization (Emilio)
* NBD bugfix (Eric)
* Memory fixes and cleanups (Paolo, Paul)
* scsi-block support for SCSI status, including persistent
  reservations (Paolo)
* linuxboot support for fw_cfg DMA (Marc, Richard Jones)
* kvm_stat moves to the Linux repository
* SCSI bug fixes (Peter, Prasad)
* Killing qemu_char_get_next_serial, non-ARM parts (Xiaoqiang)


Emilio G. Cota (3):
  docs/atomics: update atomic_read/set comparison with Linux
  atomics: emit an smp_read_barrier_depends() barrier only for Alpha and 
Thread Sanitizer
  atomics: do not emit consume barrier for atomic_rcu_read

Eric Blake (1):
  nbd: Don't trim unrequested bytes

Fam Zheng (1):
  scsi-generic: Merge block max xfer len in INQUIRY response

Marc Marí (1):
  Add optionrom compatible with fw_cfg DMA version

Paolo Bonzini (13):
  Revert "memory: Drop FlatRange.romd_mode"
  kvm_stat: Remove
  bt: rewrite csrhci_write to avoid out-of-bounds writes
  docs/atomics: update comparison with Linux
  scsi-disk: introduce a common base class
  scsi-disk: introduce dma_readv and dma_writev
  scsi-disk: add need_fua_emulation to SCSIDiskClass
  scsi-disk: introduce scsi_disk_req_check_error
  scsi-block: always use SG_IO
  memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr
  exec: remove ram_addr argument from qemu_ram_block_from_host
  memory: split memory_region_from_host from qemu_ram_addr_from_host
  exec: hide mr->ram_addr from qemu_get_ram_ptr users

Paul Durrant (1):
  xen-hvm: ignore background I/O sections

Peter Lieven (1):
  block/iscsi: avoid potential overflow of acb->task->cdb

Prasad J Pandit (5):
  scsi: pvscsi: check command descriptor ring buffer size (CVE-2016-4952)
  scsi: mptsas: infinite loop while fetching requests
  scsi: megasas: use appropriate property buffer size
  scsi: megasas: initialise local configuration data buffer
  scsi: megasas: check 'read_queue_head' index value

xiaoqiang zhao (5):
  hw/char: QOM'ify escc.c
  hw/char: QOM'ify etraxfs_ser.c
  hw/char: QOM'ify lm32_juart.c
  hw/char: QOM'ify lm32_uart.c
  hw/char: QOM'ify milkymist-uart.c

 .gitignore|   4 +
 Makefile  |  11 +-
 block/iscsi.c |   7 +
 configure |  20 +
 cputlb.c  |   3 +-
 docs/atomics.txt  |  38 +-
 exec.c| 110 ++---
 hw/bt/hci-csr.c   |  67 +++-
 hw/char/escc.c|  30 +-
 hw/char/etraxfs_ser.c |  27 +-
 hw/char/lm32_juart.c  |  17 +-
 hw/char/lm32_uart.c   |  28 +-
 hw/char/milkymist-uart.c  |  10 +-
 hw/cris/axis_dev88.c  |   4 +-
 hw/i386/pc.c  |  10 +-
 hw/lm32/lm32.h|  19 +-
 hw/lm32/lm32_boards.c |   9 +-
 hw/lm32/milkymist-hw.h|   4 +-
 hw/lm32/milkymist.c   |   4 +-
 hw/misc/ivshmem.c |   5 +-
 hw/nvram/fw_cfg.c |   2 +-
 hw/scsi/megasas.c |   6 +-
 hw/scsi/mptsas.c  |   9 +-
 hw/scsi/scsi-disk.c   | 415 +--
 hw/scsi/scsi-generic.c|  12 +
 hw/scsi/vmw_pvscsi.c  |  24 +-
 hw/virtio/vhost-user.c|  25 +-
 include/exec/cpu-common.h |   4 +-
 include/exec/memory.h |  36 +-
 include/exec/ram_addr.h   |   3 -
 include/hw/cris/etraxfs.h |  16 +
 include/hw/nvram/fw_cfg.h |   1 +
 include/qemu/atomic.h |  25 +-
 memory.c  |  43 +-
 migration/postcopy-ram.c  |   3 +-
 nbd/server.c  |  20 +-
 pc-bios/optionrom/Makefile|  20 +-
 pc-bios/optionrom/code16gcc.h |   3 +
 pc-bios/optionrom/linuxboot_dma.c | 292 ++
 scripts/dump-guest-memory.py  |  19 +-
 scripts/kvm/kvm_stat  | 825 --
 scripts/kvm/kvm_stat.texi |  55 ---
 target-i386/kvm.c |   6 +-
 xen-hvm.c |  14 +-
 44 files changed, 1057 insertions(+), 1248 deletions(-)
 create mode 100644 pc-bios/optionrom/code16gcc.h
 create mode 100644 pc-bios/optionrom/linuxboot_dma.c
 delete 

[Qemu-devel] [PULL 23/31] scsi-disk: introduce dma_readv and dma_writev

2016-05-27 Thread Paolo Bonzini
These are replacements for blk_aio_readv and blk_aio_writev that allow
customization of the data path.  They reuse the DMA helpers' DMAIOFunc
callback type, so that the same function can be used in either the
QEMUSGList or the bounce-buffered case.

This customization will be needed in the next patch to do zero-copy
SG_IO on scsi-block.

Signed-off-by: Paolo Bonzini 
---
 hw/scsi/scsi-disk.c | 67 +
 1 file changed, 52 insertions(+), 15 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 2d9dcde..6506257 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -55,7 +55,18 @@ do { printf("scsi-disk: " fmt , ## __VA_ARGS__); } while (0)
 
 #define TYPE_SCSI_DISK_BASE "scsi-disk-base"
 
-typedef struct SCSIDiskState SCSIDiskState;
+#define SCSI_DISK_BASE(obj) \
+ OBJECT_CHECK(SCSIDiskState, (obj), TYPE_SCSI_DISK_BASE)
+#define SCSI_DISK_BASE_CLASS(klass) \
+ OBJECT_CLASS_CHECK(SCSIDiskClass, (klass), TYPE_SCSI_DISK_BASE)
+#define SCSI_DISK_BASE_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(SCSIDiskClass, (obj), TYPE_SCSI_DISK_BASE)
+
+typedef struct SCSIDiskClass {
+SCSIDeviceClass parent_class;
+DMAIOFunc   *dma_readv;
+DMAIOFunc   *dma_writev;
+} SCSIDiskClass;
 
 typedef struct SCSIDiskReq {
 SCSIRequest req;
@@ -73,7 +84,7 @@ typedef struct SCSIDiskReq {
 #define SCSI_DISK_F_DPOFUA                1
 #define SCSI_DISK_F_NO_REMOVABLE_DEVOPS   2
 
-struct SCSIDiskState
+typedef struct SCSIDiskState
 {
 SCSIDevice qdev;
 uint32_t features;
@@ -90,7 +101,7 @@ struct SCSIDiskState
 char *product;
 bool tray_open;
 bool tray_locked;
-};
+} SCSIDiskState;
 
 static int scsi_handle_rw_error(SCSIDiskReq *r, int error, bool acct_failed);
 
@@ -317,6 +328,7 @@ done:
 static void scsi_do_read(SCSIDiskReq *r, int ret)
 {
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+SCSIDiskClass *sdc = (SCSIDiskClass *) object_get_class(OBJECT(s));
 
 assert (r->req.aiocb == NULL);
 
@@ -337,16 +349,16 @@ static void scsi_do_read(SCSIDiskReq *r, int ret)
 if (r->req.sg) {
 dma_acct_start(s->qdev.conf.blk, &r->acct, r->req.sg, BLOCK_ACCT_READ);
 r->req.resid -= r->req.sg->size;
-r->req.aiocb = dma_blk_read(s->qdev.conf.blk, r->req.sg,
-r->sector << BDRV_SECTOR_BITS,
-scsi_dma_complete, r);
+r->req.aiocb = dma_blk_io(blk_get_aio_context(s->qdev.conf.blk),
+  r->req.sg, r->sector << BDRV_SECTOR_BITS,
+  sdc->dma_readv, r, scsi_dma_complete, r,
+  DMA_DIRECTION_FROM_DEVICE);
 } else {
 scsi_init_iovec(r, SCSI_DMA_BUF_SIZE);
 block_acct_start(blk_get_stats(s->qdev.conf.blk), &r->acct,
  r->qiov.size, BLOCK_ACCT_READ);
-r->req.aiocb = blk_aio_preadv(s->qdev.conf.blk,
-  r->sector << BDRV_SECTOR_BITS, &r->qiov,
-  0, scsi_read_complete, r);
+r->req.aiocb = sdc->dma_readv(r->sector, &r->qiov,
+  scsi_read_complete, r, r);
 }
 
 done:
@@ -506,6 +518,7 @@ static void scsi_write_data(SCSIRequest *req)
 {
 SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req);
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+SCSIDiskClass *sdc = (SCSIDiskClass *) object_get_class(OBJECT(s));
 
 /* No data transfer may already be in progress */
 assert(r->req.aiocb == NULL);
@@ -542,15 +555,15 @@ static void scsi_write_data(SCSIRequest *req)
 if (r->req.sg) {
 dma_acct_start(s->qdev.conf.blk, &r->acct, r->req.sg, BLOCK_ACCT_WRITE);
 r->req.resid -= r->req.sg->size;
-r->req.aiocb = dma_blk_write(s->qdev.conf.blk, r->req.sg,
- r->sector << BDRV_SECTOR_BITS,
- scsi_dma_complete, r);
+r->req.aiocb = dma_blk_io(blk_get_aio_context(s->qdev.conf.blk),
+  r->req.sg, r->sector << BDRV_SECTOR_BITS,
+  sdc->dma_writev, r, scsi_dma_complete, r,
+  DMA_DIRECTION_TO_DEVICE);
 } else {
 block_acct_start(blk_get_stats(s->qdev.conf.blk), &r->acct,
  r->qiov.size, BLOCK_ACCT_WRITE);
-r->req.aiocb = blk_aio_pwritev(s->qdev.conf.blk,
-   r->sector << BDRV_SECTOR_BITS, &r->qiov,
-   0, scsi_write_complete, r);
+r->req.aiocb = sdc->dma_writev(r->sector << BDRV_SECTOR_BITS, &r->qiov,
+   scsi_write_complete, r, r);
 }
 }
 
@@ -2658,12 +2671,35 @@ static int scsi_block_parse_cdb(SCSIDevice *d, SCSICommand *cmd,
 
 #endif
 
+static
+BlockAIOCB 

Re: [Qemu-devel] [PATCH v8 11/12] vfio: register aer resume notification handler for aer resume

2016-05-27 Thread Alex Williamson
On Fri, 27 May 2016 10:12:11 +0800
Zhou Jie  wrote:

> From: Chen Fan 
> 
> For supporting aer recovery, host and guest would run the same aer
> recovery code, that would do the secondary bus reset if the error
> is fatal, the aer recovery process:
>   1. error_detected
>   2. reset_link (if fatal)
>   3. slot_reset/mmio_enabled
>   4. resume
> 
> This means the host does a secondary bus reset in step 2 to reset
> the physical devices under the bus, which leaves the devices in
> D3 state for a short time. In qemu, we register an error-detected
> handler that is invoked when the host broadcasts the error-detected
> event in step 1. If the guest does reset_link while the host is
> doing reset_link, it may cause a fatal error. To avoid that, we
> introduce a resume notifier to ensure the host reset has completed.
> In qemu, the aer recovery process:
>   1. Detect support for resume notification
>  If host vfio driver does not support for resume notification,
>  directly fail to boot up VM as with aer enabled.
>   2. Immediately notify the VM on error detected.
>   3. Delay the guest directed bus reset.
>   4. Wait for resume notification.
>  If we don't get the resume notification from the host after
>  some timeout, we would abort the guest directed bus reset
>  altogether and unplug of the device to prevent it from further
>  interacting with the VM.
>   5. After get the resume notification reset bus.


It seems like we have a number of questions open in the thread with MST
from the previous version, particularly whether we should actually drop
the resume notifier and block the reset in the kernel.  The concern
being that it's not very well specified what we can and cannot do
between the error interrupt and the resume interrupt.  We'd probably
need some other indication of whether the host has this capability,
perhaps a flag in vfio_device_info.  Appreciate your opinions there.
Thanks,

Alex

 
> Signed-off-by: Chen Fan 
> Signed-off-by: Zhou Jie 
> ---
> v7->v8
>  *Add a comment for why VFIO_RESET_TIMEOUT value is 1000.
>  *change vfio_resume_cap to pci_aer_has_resume.
>  *change vfio_resume_notifier_handler to vfio_aer_resume_notifier_handler.
>  *change reset_timer to pci_aer_reset_blocked_timer.
>  *Remove error_report for not supporting resume notification in
>   vfio_populate_device function.
>  *All error and resume tracking is done with atomic bitmap on function 0.
>  *Remove stalling any access to the device until resume is signaled.
>   Because guest OS need read configure space to know the reason of error.
>   And it should up to guest OS to decide to stop access BAR region.
>  *Still use timer to delay reset.
>   Because vfio_aer_resume_notifier_handler can't be invoked when
>   vfio_pci_reset is blocked.
> 
>  hw/vfio/pci.c  | 223 
> -
>  hw/vfio/pci.h  |   5 +
>  linux-headers/linux/vfio.h |   1 +
>  3 files changed, 228 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 6877a3d..77d86d8 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -35,6 +35,12 @@
>  #include "trace.h"
>  
>  #define MSIX_CAP_LENGTH 12
> +/*
> + * Timeout for waiting resume notification, it is 1 second.
> + * The resume notification will be sent after host aer error recovery.
> + * For hardware bus reset 1 second will be enough.
> + */
> +#define VFIO_RESET_TIMEOUT 1000
>  
>  static void vfio_disable_interrupts(VFIOPCIDevice *vdev);
>  static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
> @@ -1937,6 +1943,14 @@ static void vfio_check_hot_bus_reset(VFIOPCIDevice *vdev, Error **errp)
>  VFIOGroup *group;
>  int ret, i, devfn, range_limit;
>  
> +if (!vdev->pci_aer_has_resume) {
> +error_setg(errp, "vfio: Cannot enable AER for device %s,"
> +   " host vfio driver does not support for"
> +   " resume notification",
> +   vdev->vbasedev.name);
> +return;
> +}
> +
>  ret = vfio_get_hot_reset_info(vdev, &info);
>  if (ret) {
>  error_setg(errp, "vfio: Cannot enable AER for device %s,"
> @@ -2594,6 +2608,17 @@ static int vfio_populate_device(VFIOPCIDevice *vdev)
>   vbasedev->name);
>  }
>  
> +irq_info.index = VFIO_PCI_RESUME_IRQ_INDEX;
> +
> +ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info);
> +if (ret) {
> +/* This can fail for an old kernel or legacy PCI dev */
> +trace_vfio_populate_device_get_irq_info_failure();
> +ret = 0;
> +} else if (irq_info.count == 1) {
> +vdev->pci_aer_has_resume = true;
> +}
> +
>  error:
>  return ret;
>  }
> @@ -2606,6 +2631,72 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
>  vfio_put_base_device(&vdev->vbasedev);
>  }
>  
> +static void 

[Qemu-devel] [PATCH v4 1/2] exec: [tcg] Track which vCPU is performing translation and execution

2016-05-27 Thread Lluís Vilanova
Information is tracked inside the TCGContext structure, and later used
by tracing events with the 'tcg' and 'vcpu' properties.

The 'cpu' field is used to check tracing of translation-time
events ("*_trans"). The 'tcg_env' field is used to pass it to
execution-time events ("*_exec").

Signed-off-by: Lluís Vilanova 
Reviewed-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target-alpha/translate.c  |1 +
 target-arm/translate.c|1 +
 target-cris/translate.c   |1 +
 target-cris/translate_v10.c   |1 +
 target-i386/translate.c   |1 +
 target-lm32/translate.c   |1 +
 target-m68k/translate.c   |1 +
 target-microblaze/translate.c |1 +
 target-mips/translate.c   |1 +
 target-moxie/translate.c  |1 +
 target-openrisc/translate.c   |1 +
 target-ppc/translate.c|1 +
 target-s390x/translate.c  |1 +
 target-sh4/translate.c|1 +
 target-sparc/translate.c  |1 +
 target-tilegx/translate.c |1 +
 target-tricore/translate.c|1 +
 target-unicore32/translate.c  |1 +
 target-xtensa/translate.c |1 +
 tcg/tcg.h |4 
 translate-all.c   |2 ++
 21 files changed, 25 insertions(+)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 5b86992..67681f6 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -150,6 +150,7 @@ void alpha_translate_init(void)
 done_init = 1;
 
 cpu_env = tcg_global_reg_new_ptr(TCG_AREG0, "env");
+tcg_ctx.tcg_env = cpu_env;
 
 for (i = 0; i < 31; i++) {
 cpu_std_ir[i] = tcg_global_mem_new_i64(cpu_env,
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 940ec8d..1a7496b 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -84,6 +84,7 @@ void arm_translate_init(void)
 int i;
 
 cpu_env = tcg_global_reg_new_ptr(TCG_AREG0, "env");
+tcg_ctx.tcg_env = cpu_env;
 
 for (i = 0; i < 16; i++) {
 cpu_R[i] = tcg_global_mem_new_i32(cpu_env,
diff --git a/target-cris/translate.c b/target-cris/translate.c
index a73176c..f603af3 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3364,6 +3364,7 @@ void cris_initialize_tcg(void)
 int i;
 
 cpu_env = tcg_global_reg_new_ptr(TCG_AREG0, "env");
+tcg_ctx.tcg_env = cpu_env;
 cc_x = tcg_global_mem_new(cpu_env,
   offsetof(CPUCRISState, cc_x), "cc_x");
 cc_src = tcg_global_mem_new(cpu_env,
diff --git a/target-cris/translate_v10.c b/target-cris/translate_v10.c
index 7607ead..f2e9768 100644
--- a/target-cris/translate_v10.c
+++ b/target-cris/translate_v10.c
@@ -1250,6 +1250,7 @@ void cris_initialize_crisv10_tcg(void)
 int i;
 
 cpu_env = tcg_global_reg_new_ptr(TCG_AREG0, "env");
+tcg_ctx.tcg_env = cpu_env;
 cc_x = tcg_global_mem_new(cpu_env,
   offsetof(CPUCRISState, cc_x), "cc_x");
 cc_src = tcg_global_mem_new(cpu_env,
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 1a1214d..7a6ef7c 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -8135,6 +8135,7 @@ void tcg_x86_init(void)
 int i;
 
 cpu_env = tcg_global_reg_new_ptr(TCG_AREG0, "env");
+tcg_ctx.tcg_env = cpu_env;
 cpu_cc_op = tcg_global_mem_new_i32(cpu_env,
offsetof(CPUX86State, cc_op), "cc_op");
 cpu_cc_dst = tcg_global_mem_new(cpu_env, offsetof(CPUX86State, cc_dst),
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index 256a51f..b2e5a3e 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1191,6 +1191,7 @@ void lm32_translate_init(void)
 int i;
 
 cpu_env = tcg_global_reg_new_ptr(TCG_AREG0, "env");
+tcg_ctx.tcg_env = cpu_env;
 
 for (i = 0; i < ARRAY_SIZE(cpu_R); i++) {
 cpu_R[i] = tcg_global_mem_new(cpu_env,
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 7560c3a..f90f80e 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -77,6 +77,7 @@ void m68k_tcg_init(void)
 int i;
 
 cpu_env = tcg_global_reg_new_ptr(TCG_AREG0, "env");
+tcg_ctx.tcg_env = cpu_env;
 
 #define DEFO32(name, offset) \
 QREG_##name = tcg_global_mem_new_i32(cpu_env, \
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index f944965..05092f1 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1869,6 +1869,7 @@ void mb_tcg_init(void)
 int i;
 
 cpu_env = tcg_global_reg_new_ptr(TCG_AREG0, "env");
+tcg_ctx.tcg_env = cpu_env;
 
 env_debug = tcg_global_mem_new(cpu_env,
 offsetof(CPUMBState, debug),
diff --git a/target-mips/translate.c b/target-mips/translate.c
index a3a05ec..24f994c 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -19993,6 +19993,7 @@ void mips_tcg_init(void)
 

[Qemu-devel] [PATCH v4 2/2] trace: [all] Add "guest_mem_before" event

2016-05-27 Thread Lluís Vilanova
The event is described in "trace-events". Note that the "MO_AMASK" flag
is not traced, since it does not seem to affect the visible semantics of
instructions.
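
For readers unfamiliar with the "info" argument, the following is a minimal
sketch of how such a value could be packed from the four arguments passed to
trace_mem_build_info() in the hunks below (size shift, sign extension,
endianness, store). The bit layout and the mem_info_* names are assumptions
made for illustration only; the real encoding lives in trace/mem-internal.h,
which is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, chosen for this sketch only. */
#define INFO_SIZE_SHIFT_MASK 0x0fu      /* log2 of the access size in bytes */
#define INFO_SIGN_EXTEND     (1u << 4)  /* loaded value is sign-extended */
#define INFO_BIG_ENDIAN      (1u << 5)  /* access is big-endian */
#define INFO_STORE           (1u << 6)  /* access is a store, not a load */

/* Pack the properties of one memory access into a single byte. */
static inline uint8_t mem_info_pack(unsigned size_shift, bool sign_extend,
                                    bool big_endian, bool store)
{
    uint8_t info;

    assert(size_shift <= INFO_SIZE_SHIFT_MASK);
    info = (uint8_t)size_shift;
    if (sign_extend) {
        info |= INFO_SIGN_EXTEND;
    }
    if (big_endian) {
        info |= INFO_BIG_ENDIAN;
    }
    if (store) {
        info |= INFO_STORE;
    }
    return info;
}

/* Recover the access size in bytes from a packed value. */
static inline unsigned mem_info_size_bytes(uint8_t info)
{
    return 1u << (info & INFO_SIZE_SHIFT_MASK);
}

/* Tell whether a packed value describes a store. */
static inline bool mem_info_is_store(uint8_t info)
{
    return (info & INFO_STORE) != 0;
}
```

A trace consumer would apply the inverse helpers to each event, for example
mem_info_size_bytes(info) to get the width of the access.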

Signed-off-by: Lluís Vilanova 
---
 include/exec/cpu_ldst_template.h          |   25
 include/exec/cpu_ldst_useronly_template.h |   22 ++
 tcg/tcg-op.c                              |   32 ++--
 trace-events                              |   22 ++
 trace/mem-internal.h                      |   46 +
 trace/mem.h                               |   34 +
 6 files changed, 177 insertions(+), 4 deletions(-)
 create mode 100644 trace/mem-internal.h
 create mode 100644 trace/mem.h

diff --git a/include/exec/cpu_ldst_template.h b/include/exec/cpu_ldst_template.h
index 3091c00..eaf69a1 100644
--- a/include/exec/cpu_ldst_template.h
+++ b/include/exec/cpu_ldst_template.h
@@ -23,6 +23,13 @@
  * You should have received a copy of the GNU Lesser General Public
  * License along with this library; if not, see .
  */
+
+#if !defined(SOFTMMU_CODE_ACCESS)
+#include "trace.h"
+#endif
+
+#include "trace/mem.h"
+
 #if DATA_SIZE == 8
 #define SUFFIX q
 #define USUFFIX q
@@ -80,6 +87,12 @@ glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(CPUArchState *env,
 int mmu_idx;
 TCGMemOpIdx oi;
 
+#if !defined(SOFTMMU_CODE_ACCESS)
+trace_guest_mem_before_exec(
+ENV_GET_CPU(env), ptr,
+trace_mem_build_info(SHIFT, false, MO_TE, false));
+#endif
+
 addr = ptr;
 page_index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
 mmu_idx = CPU_MMU_INDEX;
@@ -112,6 +125,12 @@ glue(glue(glue(cpu_lds, SUFFIX), MEMSUFFIX), _ra)(CPUArchState *env,
 int mmu_idx;
 TCGMemOpIdx oi;
 
+#if !defined(SOFTMMU_CODE_ACCESS)
+trace_guest_mem_before_exec(
+ENV_GET_CPU(env), ptr,
+trace_mem_build_info(SHIFT, true, MO_TE, false));
+#endif
+
 addr = ptr;
 page_index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
 mmu_idx = CPU_MMU_INDEX;
@@ -148,6 +167,12 @@ glue(glue(glue(cpu_st, SUFFIX), MEMSUFFIX), _ra)(CPUArchState *env,
 int mmu_idx;
 TCGMemOpIdx oi;
 
+#if !defined(SOFTMMU_CODE_ACCESS)
+trace_guest_mem_before_exec(
+ENV_GET_CPU(env), ptr,
+trace_mem_build_info(SHIFT, false, MO_TE, true));
+#endif
+
 addr = ptr;
 page_index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
 mmu_idx = CPU_MMU_INDEX;
diff --git a/include/exec/cpu_ldst_useronly_template.h b/include/exec/cpu_ldst_useronly_template.h
index 040b147..b1378bf 100644
--- a/include/exec/cpu_ldst_useronly_template.h
+++ b/include/exec/cpu_ldst_useronly_template.h
@@ -22,6 +22,13 @@
  * You should have received a copy of the GNU Lesser General Public
  * License along with this library; if not, see .
  */
+
+#if !defined(CODE_ACCESS)
+#include "trace.h"
+#endif
+
+#include "trace/mem.h"
+
 #if DATA_SIZE == 8
 #define SUFFIX q
 #define USUFFIX q
@@ -53,6 +60,11 @@
 static inline RES_TYPE
 glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr)
 {
+#if !defined(CODE_ACCESS)
+trace_guest_mem_before_exec(
+ENV_GET_CPU(env), ptr,
+trace_mem_build_info(DATA_SIZE, false, MO_TE, false));
+#endif
 return glue(glue(ld, USUFFIX), _p)(g2h(ptr));
 }
 
@@ -68,6 +80,11 @@ glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(CPUArchState *env,
 static inline int
 glue(glue(cpu_lds, SUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr)
 {
+#if !defined(CODE_ACCESS)
+trace_guest_mem_before_exec(
+ENV_GET_CPU(env), ptr,
+trace_mem_build_info(DATA_SIZE, true, MO_TE, false));
+#endif
 return glue(glue(lds, SUFFIX), _p)(g2h(ptr));
 }
 
@@ -85,6 +102,11 @@ static inline void
 glue(glue(cpu_st, SUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr,
   RES_TYPE v)
 {
+#if !defined(CODE_ACCESS)
+trace_guest_mem_before_exec(
+ENV_GET_CPU(env), ptr,
+trace_mem_build_info(DATA_SIZE, false, MO_TE, true));
+#endif
 glue(glue(st, SUFFIX), _p)(g2h(ptr), v);
 }
 
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index f554b86..3b7e3ff 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -25,6 +25,8 @@
 #include "qemu/osdep.h"
 #include "tcg.h"
 #include "tcg-op.h"
+#include "trace-tcg.h"
+#include "trace/mem.h"
 
 /* Reduce the number of ifdefs below.  This assumes that all uses of
TCGV_HIGH and TCGV_LOW are properly protected by a conditional that
@@ -1904,22 +1906,41 @@ static void gen_ldst_i64(TCGOpcode opc, TCGv_i64 val, TCGv addr,
 #endif
 }
 
-void tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
+static inline void do_tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg idx,
+  TCGMemOp memop)
 {
 memop = tcg_canonicalize_memop(memop, 0, 0);
 gen_ldst_i32(INDEX_op_qemu_ld_i32, 

[Qemu-devel] [PATCH v4 0/2] trace: Add event for vCPU memory accesses

2016-05-27 Thread Lluís Vilanova
This series adds an event to track information related to memory accesses
performed by the guest CPUs ("guest_mem_before").

A future series might extend this to contain the physical address and memory
value (e.g., "guest_mem_after").

Signed-off-by: Lluís Vilanova 
---

Changes in v4
-------------

* Clarify alignment info is not on the trace.
* Add event information on commit log. [Richard Henderson]


Changes in v3
-------------

* Set "tcg_ctx.cpu" to NULL when unused. [Paolo Bonzini]
* Clarify how the 'info' field is interpreted.
* Fix argument size in 'info' field when using ld/st handlers.
* Fix reset of unused bits in 'info' field.


Changes in v2
-------------

* Rebase on bfc766d.
* Rename "guest_vmem" to "guest_mem_before"
* Add memory access information. [suggested by Peter Maydell]
* Drop event "guest_vmem_user_syscall". [suggested by Peter Maydell]


Lluís Vilanova (2):
  exec: [tcg] Track which vCPU is performing translation and execution
  trace: [all] Add "guest_mem_before" event


 include/exec/cpu_ldst_template.h          |   25
 include/exec/cpu_ldst_useronly_template.h |   22 ++
 target-alpha/translate.c                  |    1 +
 target-arm/translate.c                    |    1 +
 target-cris/translate.c                   |    1 +
 target-cris/translate_v10.c               |    1 +
 target-i386/translate.c                   |    1 +
 target-lm32/translate.c                   |    1 +
 target-m68k/translate.c                   |    1 +
 target-microblaze/translate.c             |    1 +
 target-mips/translate.c                   |    1 +
 target-moxie/translate.c                  |    1 +
 target-openrisc/translate.c               |    1 +
 target-ppc/translate.c                    |    1 +
 target-s390x/translate.c                  |    1 +
 target-sh4/translate.c                    |    1 +
 target-sparc/translate.c                  |    1 +
 target-tilegx/translate.c                 |    1 +
 target-tricore/translate.c                |    1 +
 target-unicore32/translate.c              |    1 +
 target-xtensa/translate.c                 |    1 +
 tcg/tcg-op.c                              |   32 ++--
 tcg/tcg.h                                 |    4 +++
 trace-events                              |   22 ++
 trace/mem-internal.h                      |   46 +
 trace/mem.h                               |   34 +
 translate-all.c                           |    2 +
 27 files changed, 202 insertions(+), 4 deletions(-)
 create mode 100644 trace/mem-internal.h
 create mode 100644 trace/mem.h


To: qemu-devel@nongnu.org
Cc: Stefan Hajnoczi 
Cc: Peter Maydell 



Re: [Qemu-devel] [PATCH v4 1/1] Introduce "xen-load-devices-state"

2016-05-27 Thread Amit Shah
Dave, can you take a look?

Thanks,

On (Mon) 11 Apr 2016 [11:56:02], Changlong Xie wrote:
> From: Wen Congyang 
> 
> Introduce a "xen-load-devices-state" QAPI command that can be used to
> load the state of all devices, but not the RAM or the block devices of
> the VM.
> 
> We only have the HMP commands savevm/loadvm and the QMP command
> xen-save-devices-state.
> 
> We use this new command for COLO:
> 1. suspend both primary vm and secondary vm
> 2. sync the state
> 3. resume both primary vm and secondary vm
> 
> In such a case, we need to be able to update all devices' state at any time.
> 
> Signed-off-by: Wen Congyang 
> Signed-off-by: Changlong Xie 
> ---
>  migration/savevm.c | 36 
>  qapi-schema.json   | 14 ++
>  qmp-commands.hx| 27 +++
>  3 files changed, 77 insertions(+)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 16ba443..d361a29 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -30,6 +30,7 @@
>  #include "hw/boards.h"
>  #include "hw/hw.h"
>  #include "hw/qdev.h"
> +#include "hw/xen/xen.h"
>  #include "net/net.h"
>  #include "monitor/monitor.h"
>  #include "sysemu/sysemu.h"
> @@ -1775,6 +1776,12 @@ qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
>  return -EINVAL;
>  }
>  
> +/* Validate if it is a device's state */
> +if (xen_enabled() && se->is_ram) {
> +error_report("loadvm: %s RAM loading not allowed on Xen", idstr);
> +return -EINVAL;
> +}
> +
>  /* Add entry */
>  le = g_malloc0(sizeof(*le));
>  
> @@ -2084,6 +2091,35 @@ void qmp_xen_save_devices_state(const char *filename, Error **errp)
>  }
>  }
>  
> +void qmp_xen_load_devices_state(const char *filename, Error **errp)
> +{
> +QEMUFile *f;
> +int ret;
> +
> +/* Guest must be paused before loading the device state; the RAM state
> + * will already have been loaded by xc
> + */
> +if (runstate_is_running()) {
> +error_setg(errp, "Cannot update device state while vm is running");
> +return;
> +}
> +vm_stop(RUN_STATE_RESTORE_VM);
> +
> +f = qemu_fopen(filename, "rb");
> +if (!f) {
> +error_setg_file_open(errp, errno, filename);
> +return;
> +}
> +
> +migration_incoming_state_new(f);
> +ret = qemu_loadvm_state(f);
> +qemu_fclose(f);
> +if (ret < 0) {
> +error_setg(errp, QERR_IO_ERROR);
> +}
> +migration_incoming_state_destroy();
> +}
> +
>  int load_vmstate(const char *name)
>  {
>  BlockDriverState *bs, *bs_vm_state;
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 54634c4..132264f 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -4144,6 +4144,20 @@
>'data': [ 'none', 'record', 'play' ] }
>  
>  ##
> +# @xen-load-devices-state:
> +#
> +# Load the state of all devices from file. The RAM and the block devices
> +# of the VM are not loaded by this command.
> +#
> +# @filename: the file to load the state of the devices from as binary
> +# data. See xen-save-devices-state.txt for a description of the binary
> +# format.
> +#
> +# Since: 2.7
> +##
> +{ 'command': 'xen-load-devices-state', 'data': {'filename': 'str'} }
> +
> +##
>  # @GICCapability:
>  #
>  # The struct describes capability for a specific GIC (Generic
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index de896a5..68620f6 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -587,6 +587,33 @@ Example:
>  EQMP
>  
>  {
> +.name   = "xen-load-devices-state",
> +.args_type  = "filename:F",
> +.mhandler.cmd_new = qmp_marshal_xen_load_devices_state,
> +},
> +
> +SQMP
> +xen-load-devices-state
> +--
> +
> +Load the state of all devices from file. The RAM and the block devices
> +of the VM are not loaded by this command.
> +
> +Arguments:
> +
> +- "filename": the file to load the state of the devices from as binary
> +data. See xen-save-devices-state.txt for a description of the binary
> +format.
> +
> +Example:
> +
> +-> { "execute": "xen-load-devices-state",
> + "arguments": { "filename": "/tmp/resume" } }
> +<- { "return": {} }
> +
> +EQMP
> +
> +{
>  .name   = "xen-set-global-dirty-log",
>  .args_type  = "enable:b",
>  .mhandler.cmd_new = qmp_marshal_xen_set_global_dirty_log,
> -- 
> 1.9.3
> 
> 
> 

Amit



Re: [Qemu-devel] [RFC PATCH v2 0/2] spapr: Memory hot-unplug support

2016-05-27 Thread Thomas Huth
 Hi Bharata,

On 15.03.2016 05:38, Bharata B Rao wrote:
> This patchset adds memory hot removal support for PowerPC sPAPR.
> This new version switches to using the proposed "count-indexed" type of
> hotplug identifier which allows to hot remove a number of LMBs starting
> with a given DRC index.
> 
> This count-indexed hotplug identifier isn't yet part of PAPR.

Just for clarification / my understanding: That means we also need a
modified guest to support this new interface? If yes, did you post such
patches somewhere else already, too?

 Thomas




Re: [Qemu-devel] [PULL v2 00/31] Misc changes for 2016-05-27

2016-05-27 Thread Paolo Bonzini


On 27/05/2016 17:30, Peter Maydell wrote:
> This version fails on the "retypedefing a typedef" clang warning:
> 
> /home/petmay01/linaro/qemu-for-merges/hw/scsi/scsi-disk.c:73:3: error:
> redefinition of typedef 'SCSIDiskClass' is a C11 feature
> [-Werror,-Wtypedef-redefinition]
> } SCSIDiskClass;
>   ^
> /home/petmay01/linaro/qemu-for-merges/hw/scsi/scsi-disk.c:66:30: note:
> previous definition is here
> typedef struct SCSIDiskClass SCSIDiskClass;
>  ^
> 1 error generated.

Ugh, I am so much waiting for the docker series to get in...

Paolo



Re: [Qemu-devel] [RFC PATCH v4 2/3] VFIO driver for mediated PCI device

2016-05-27 Thread Alex Williamson
On Fri, 27 May 2016 10:03:31 +
"Tian, Kevin"  wrote:

> > From: Kirti Wankhede
> > Sent: Wednesday, May 25, 2016 9:05 PM
> > 
> >   
> > >> +{
> > >> +int ret = -EINVAL;
> > >> +struct phy_device *phy_dev = mdevice->phy_dev;
> > >> +
> > >> +if (dev_is_pci(phy_dev->dev) && phy_dev->ops->get_region_info) {
> > >> +mutex_lock(>ops_lock);
> > >> +ret = phy_dev->ops->get_region_info(mdevice, index,
> > >> +vfio_region_info);
> > >> +mutex_unlock(>ops_lock);
> > >> +}
> > >> +return ret;
> > >> +}
> > >> +
> > >> +static int mdev_read_base(struct vfio_mdevice *vdev)  
> > >
> > > similar as earlier comment - vdev or mdev?
> > >  
> > 
> > Here vdev is of type 'vfio_mdevice', that's why vdev, mdev doesn't suit
> > here. Changing it to 'vmdev' in next patch set.
> >   
> 
> 'vmdev' looks more confusing... :-)
> 
> Alex, can you give your thought here?

I don't see any problem with vmdev personally, are you unhappy with it
because it includes 'vm'?  It seems like it has a valid rationale, so
long as it's used consistently, I'm happy.  Thanks,

Alex



Re: [Qemu-devel] [PULL v2 00/31] Misc changes for 2016-05-27

2016-05-27 Thread Peter Maydell
On 27 May 2016 at 15:09, Paolo Bonzini  wrote:
> The following changes since commit 2c56d06bafd8933d2a9c6e0aeb5d45f7c1fb5616:
>
>   Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging 
> (2016-05-26 14:29:30 +0100)
>
> are available in the git repository at:
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 103876bc5e1ea67eec386461114df5c953110e34:
>
>   exec: hide mr->ram_addr from qemu_get_ram_ptr users (2016-05-27 16:07:32 
> +0200)
>
> 
> * docs/atomics fixes and atomic_rcu_* optimization (Emilio)
> * NBD bugfix (Eric)
> * Memory fixes and cleanups (Paolo, Paul)
> * scsi-block support for SCSI status, including persistent
>   reservations (Paolo)
> * linuxboot support for fw_cfg DMA (Marc, Richard Jones)
> * kvm_stat moves to the Linux repository
> * SCSI bug fixes (Peter, Prasad)
> * Killing qemu_char_get_next_serial, non-ARM parts (Xiaoqiang)
>
> 

This version fails on the "retypedefing a typedef" clang warning:

/home/petmay01/linaro/qemu-for-merges/hw/scsi/scsi-disk.c:73:3: error:
redefinition of typedef 'SCSIDiskClass' is a C11 feature
[-Werror,-Wtypedef-redefinition]
} SCSIDiskClass;
  ^
/home/petmay01/linaro/qemu-for-merges/hw/scsi/scsi-disk.c:66:30: note:
previous definition is here
typedef struct SCSIDiskClass SCSIDiskClass;
 ^
1 error generated.
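
For anyone hitting the same diagnostic, here is a minimal sketch of the
pattern that avoids it: declare the typedef once, then define the struct by
its tag only. The DemoDiskClass type and its members are made up for this
example and are not the scsi-disk.c definitions.

```c
#include <assert.h>
#include <stddef.h>

/* The one and only typedef of the name (forward declaration). */
typedef struct DemoDiskClass DemoDiskClass;

/*
 * Define the struct by tag. Writing
 *     typedef struct DemoDiskClass { ... } DemoDiskClass;
 * here instead would declare the typedef a second time, which is only
 * valid from C11 onwards and triggers -Wtypedef-redefinition in clang.
 */
struct DemoDiskClass {
    int (*get_block_size)(const DemoDiskClass *klass);
    int block_size;
};

static int demo_get_block_size(const DemoDiskClass *klass)
{
    return klass->block_size;
}

/* Build an instance and call through the function pointer. */
static int demo_block_size(void)
{
    DemoDiskClass klass = { demo_get_block_size, 512 };

    assert(klass.get_block_size != NULL);
    return klass.get_block_size(&klass);
}
```

The forward typedef is still usable inside the struct body (as in the
function pointer above), so nothing is lost by dropping the second typedef.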

thanks
-- PMM



Re: [Qemu-devel] [RFC v2 11/11] tcg: enable thread-per-vCPU

2016-05-27 Thread Sergey Fedorov
On 27/05/16 17:55, Paolo Bonzini wrote:
>
> On 27/05/2016 15:57, Sergey Fedorov wrote:
>>  1. Make 'cpu->thread_kicked' access atomic
>>  2. Remove global 'exit_request' and use per-CPU 'exit_request'
>>  3. Change how 'current_cpu' is set
>>  4. Reorganize round-robin CPU TCG thread function
>>  5. Enable 'mmap_lock' for system mode emulation (do we really want this?)
> No, I don't think so.
>
>>  6. Enable 'tb_lock' for system mode emulation
>>  7. Introduce per-CPU TCG thread function
> At least 2/3/7 must be done at the same time, but I agree that this
> patch could use some splitting. :)

Hmm, 2/3 do also change single-threaded CPU loop. I think they should
apply separately from 7.
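
To make items 1 and 2 concrete, a rough sketch of what atomic access to a
kick flag plus a per-CPU exit request could look like, written with C11
<stdatomic.h> rather than QEMU's own atomic helpers. DemoCPU and the demo_*
functions are hypothetical and do not mirror the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state; no global exit_request is involved. */
typedef struct DemoCPU {
    atomic_bool thread_kicked;  /* a kick is already in flight */
    atomic_int exit_request;    /* ask this CPU to leave its loop */
} DemoCPU;

static void demo_cpu_init(DemoCPU *cpu)
{
    assert(cpu != NULL);
    atomic_init(&cpu->thread_kicked, false);
    atomic_init(&cpu->exit_request, 0);
}

/* Returns true only for the caller that actually performed the kick. */
static bool demo_cpu_kick(DemoCPU *cpu)
{
    if (atomic_exchange(&cpu->thread_kicked, true)) {
        return false;           /* somebody else already kicked */
    }
    atomic_store(&cpu->exit_request, 1);
    return true;
}

/* Called by the CPU's own thread once it has left its execution loop. */
static bool demo_cpu_handle_kick(DemoCPU *cpu)
{
    bool requested = atomic_exchange(&cpu->exit_request, 0) != 0;

    atomic_store(&cpu->thread_kicked, false);
    return requested;
}
```

Because the request is per-CPU, kicking one CPU leaves the others untouched,
which is the property a single global flag cannot give.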

Thanks,
Sergey



Re: [Qemu-devel] [RFC v2 11/11] tcg: enable thread-per-vCPU

2016-05-27 Thread Paolo Bonzini


On 27/05/2016 17:07, Sergey Fedorov wrote:
>>> >>  1. Make 'cpu->thread_kicked' access atomic
>>> >>  2. Remove global 'exit_request' and use per-CPU 'exit_request'
>>> >>  3. Change how 'current_cpu' is set
>>> >>  4. Reorganize round-robin CPU TCG thread function
>>> >>  5. Enable 'mmap_lock' for system mode emulation (do we really want 
>>> >> this?)
>> > No, I don't think so.
>> >
>>> >>  6. Enable 'tb_lock' for system mode emulation
>>> >>  7. Introduce per-CPU TCG thread function
>> > At least 2/3/7 must be done at the same time, but I agree that this
>> > patch could use some splitting. :)
> Hmm, 2/3 do also change single-threaded CPU loop. I think they should
> apply separately from 7.

Reviewed the patch now, and I'm not sure how you can do 2/3 for the
single-threaded CPU loop.  They could be moved out of cpu_exec and into
cpus.c (in a separate patch), but you need exit_request and
tcg_current_cpu to properly kick the single-threaded CPU loop out of
qemu_tcg_cpu_thread_fn.

Thanks,

Paolo



[Qemu-devel] [PATCH v2 19/19] linux-user: Avoid possible misalignment in target_to_host_siginfo()

2016-05-27 Thread Peter Maydell
Reimplement target_to_host_siginfo() to use __get_user(), which
handles possibly misaligned source guest structures correctly.
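
As background, a minimal sketch of why a direct field access can be a
problem and how a byte-wise copy avoids it. This is not the __get_user()
macro itself; demo_get_u32() and demo_bswap32() are illustrative names, and
the swap flag stands in for the target/host endianness decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t demo_bswap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v >> 8) & 0x0000ff00u) | (v >> 24);
}

/*
 * Read a 32-bit field from an address that may not be 4-byte aligned.
 * Dereferencing a misaligned uint32_t pointer is undefined behaviour and
 * can fault on some hosts; memcpy() has no alignment requirement.
 */
static uint32_t demo_get_u32(const void *src, bool swap)
{
    uint32_t v;

    assert(src != NULL);
    memcpy(&v, src, sizeof(v));
    return swap ? demo_bswap32(v) : v;
}
```

Compilers typically turn the memcpy() into a single load on hosts that
tolerate unaligned access, so the safe form costs little.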

Signed-off-by: Peter Maydell 
---
 linux-user/signal.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 7e2a80f..8417da7 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -409,13 +409,18 @@ void host_to_target_siginfo(target_siginfo_t *tinfo, const siginfo_t *info)
 /* XXX: find a solution for 64 bit (additional malloced data is needed) */
 void target_to_host_siginfo(siginfo_t *info, const target_siginfo_t *tinfo)
 {
-info->si_signo = tswap32(tinfo->si_signo);
-info->si_errno = tswap32(tinfo->si_errno);
-info->si_code = tswap32(tinfo->si_code);
-info->si_pid = tswap32(tinfo->_sifields._rt._pid);
-info->si_uid = tswap32(tinfo->_sifields._rt._uid);
-info->si_value.sival_ptr =
-(void *)(long)tswapal(tinfo->_sifields._rt._sigval.sival_ptr);
+/* This conversion is used only for the rt_sigqueueinfo syscall,
+ * and so we know that the _rt fields are the valid ones.
+ */
+abi_ulong sival_ptr;
+
+__get_user(info->si_signo, &tinfo->si_signo);
+__get_user(info->si_errno, &tinfo->si_errno);
+__get_user(info->si_code, &tinfo->si_code);
+__get_user(info->si_pid, &tinfo->_sifields._rt._pid);
+__get_user(info->si_uid, &tinfo->_sifields._rt._uid);
+__get_user(sival_ptr, &tinfo->_sifields._rt._sigval.sival_ptr);
+info->si_value.sival_ptr = (void *)(long)sival_ptr;
 }
 
 static int fatal_signal (int sig)
-- 
1.9.1




Re: [Qemu-devel] [PATCH 2/2] macio: switch over to new byte-aligned DMA helpers

2016-05-27 Thread Mark Cave-Ayland
On 27/05/16 16:02, John Snow wrote:

> On 05/27/2016 04:48 AM, Mark Cave-Ayland wrote:
>> Now that the DMA helpers are byte-aligned they can be called directly from
>> the macio routines rather than emulating byte-aligned accesses via multiple
>> block-level accesses.
>>
>> Signed-off-by: Mark Cave-Ayland 
>> ---
>>  hw/ide/macio.c |  213 
>> 
> 
> ^ _cool_ ^
> 
> I assume you'll actually not be needing me to deal with this until you
> resolve the HACK in the pre-requisite patch, yes?

Yes, that's correct. The reason I posted it as-is was to provide Paolo
with a test case and for Aurelien to verify the TRIM behaviour. Once the
hack is no longer required, I'll resubmit it properly.


ATB,

Mark.




[Qemu-devel] [PATCH v2 15/19] linux-user: Use safe_syscall for kill, tkill and tgkill syscalls

2016-05-27 Thread Peter Maydell
Use the safe_syscall wrapper for the kill, tkill and tgkill syscalls.
Without this, if a thread sent a SIGKILL to itself it could kill the
thread before we had a chance to process a signal that arrived just
before the SIGKILL, and that signal would get lost.

We drop all the ifdeffery for tkill and tgkill, because every guest
architecture we support implements them, and they've been in Linux
since 2003 so we can assume the host headers define the __NR_tkill
and __NR_tgkill constants.
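
For illustration only, this is roughly what relying on the host constant
looks like when calling tgkill directly through syscall(2) on a Linux host.
It is a plain host call, not the safe_syscall wrapper this patch uses, and
the demo_* names are made up.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Send sig to thread tid in thread group tgid; 0 on success, -1 + errno. */
static int demo_tgkill(pid_t tgid, pid_t tid, int sig)
{
    return (int)syscall(SYS_tgkill, tgid, tid, sig);
}

/* Probe the calling thread with signal 0, which only does error checking. */
static int demo_probe_self(void)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);

    assert(tid > 0);
    return demo_tgkill(getpid(), tid, 0);
}
```

No #ifdef guard is needed around SYS_tgkill here, which is the same
assumption the patch makes about __NR_tkill and __NR_tgkill.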

Signed-off-by: Peter Maydell 
---
Timothy's patchset used block_signals() for this, but I
think safe_syscall is nicer when it can be used, since
it's lower overhead and less intrusive. Also extended to
cover all the thread-killing syscalls rather than just kill.
---
 linux-user/syscall.c | 23 +++
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index cb5d519..549f571 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -186,8 +186,6 @@ static type name (type1 arg1,type2 arg2,type3 arg3,type4 arg4,type5 arg5,   \
 #define __NR_sys_getpriority __NR_getpriority
 #define __NR_sys_rt_sigqueueinfo __NR_rt_sigqueueinfo
 #define __NR_sys_syslog __NR_syslog
-#define __NR_sys_tgkill __NR_tgkill
-#define __NR_sys_tkill __NR_tkill
 #define __NR_sys_futex __NR_futex
 #define __NR_sys_inotify_init __NR_inotify_init
 #define __NR_sys_inotify_add_watch __NR_inotify_add_watch
@@ -225,12 +223,6 @@ _syscall5(int, _llseek,  uint,  fd, ulong, hi, ulong, lo,
 #endif
 _syscall3(int,sys_rt_sigqueueinfo,int,pid,int,sig,siginfo_t *,uinfo)
 _syscall3(int,sys_syslog,int,type,char*,bufp,int,len)
-#if defined(TARGET_NR_tgkill) && defined(__NR_tgkill)
-_syscall3(int,sys_tgkill,int,tgid,int,pid,int,sig)
-#endif
-#if defined(TARGET_NR_tkill) && defined(__NR_tkill)
-_syscall2(int,sys_tkill,int,tid,int,sig)
-#endif
 #ifdef __NR_exit_group
 _syscall1(int,exit_group,int,error_code)
 #endif
@@ -704,6 +696,9 @@ safe_syscall6(int, pselect6, int, nfds, fd_set *, readfds, fd_set *, writefds, \
 safe_syscall6(int,futex,int *,uaddr,int,op,int,val, \
   const struct timespec *,timeout,int *,uaddr2,int,val3)
 safe_syscall2(int, rt_sigsuspend, sigset_t *, newset, size_t, sigsetsize)
+safe_syscall2(int, kill, pid_t, pid, int, sig)
+safe_syscall2(int, tkill, int, tid, int, sig)
+safe_syscall3(int, tgkill, int, tgid, int, pid, int, sig)
 
 static inline int host_to_target_sock_type(int host_type)
 {
@@ -6528,7 +6523,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 ret = 0;
 break;
 case TARGET_NR_kill:
-ret = get_errno(kill(arg1, target_to_host_signal(arg2)));
+ret = get_errno(safe_kill(arg1, target_to_host_signal(arg2)));
 break;
 #ifdef TARGET_NR_rename
 case TARGET_NR_rename:
@@ -9764,18 +9759,14 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 break;
 #endif
 
-#if defined(TARGET_NR_tkill) && defined(__NR_tkill)
 case TARGET_NR_tkill:
-ret = get_errno(sys_tkill((int)arg1, target_to_host_signal(arg2)));
+ret = get_errno(safe_tkill((int)arg1, target_to_host_signal(arg2)));
 break;
-#endif
 
-#if defined(TARGET_NR_tgkill) && defined(__NR_tgkill)
 case TARGET_NR_tgkill:
-   ret = get_errno(sys_tgkill((int)arg1, (int)arg2,
+ret = get_errno(safe_tgkill((int)arg1, (int)arg2,
 target_to_host_signal(arg3)));
-   break;
-#endif
+break;
 
 #ifdef TARGET_NR_set_robust_list
 case TARGET_NR_set_robust_list:
-- 
1.9.1




[Qemu-devel] [PATCH v2 07/19] linux-user: Fix race between multiple signals

2016-05-27 Thread Peter Maydell
If multiple host signals are received in quick succession they would
be queued in TaskState then delivered to the guest in spite of
signals being supposed to be blocked by the guest signal handler's
sa_mask. Fix this by decoupling the guest signal mask from the
host signal mask, so we can have protected sections where all
host signals are blocked. In particular we block signals from
when host_signal_handler() queues a signal from the guest until
process_pending_signals() has unqueued it. We also block signals
while we are manipulating the guest signal mask in emulation of
sigprocmask and similar syscalls.

Blocking host signals also ensures the correct behaviour with respect
to multiple threads and the overrun count of timer related signals.
Alas blocking and queuing in qemu is still needed because of virtual
processor exceptions, SIGSEGV and SIGBUS.

Blocking signals inside process_pending_signals() protects against
concurrency problems that would otherwise happen if host_signal_handler()
ran and accessed the signal data structures while process_pending_signals()
was manipulating them.

Since we now track the guest signal mask separately from that
of the host, the sigsuspend system calls must track the signal
mask passed to them, because when we process signals as we leave
the sigsuspend the guest signal mask in force is that passed to
sigsuspend.

Signed-off-by: Timothy Edward Baldwin 
Message-id: 1441497448-32489-19-git-send-email-t.e.baldwi...@members.leeds.ac.uk
[PMM: make signal_pending a simple flag rather than a word with two flag bits;
 ensure we don't call block_signals() twice in sigreturn codepaths;
 document and assert() the guarantee that using do_sigprocmask() to
 get the current mask never fails;  use the qemu atomics.h functions
 rather than raw volatile variable access; add extra commentary and
 documentation; block SIGSEGV/SIGBUS in block_signals() and in
 process_pending_signals() because they can't occur synchronously here;
 check the right do_sigprocmask() call for errors in ssetmask syscall;
 expand commit message; fixed sigsuspend() hanging]
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 linux-user/qemu.h|  50 +++-
 linux-user/signal.c  | 163 +++
 linux-user/syscall.c |  73 ---
 3 files changed, 213 insertions(+), 73 deletions(-)

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index f09b750..5138289 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -123,14 +123,33 @@ typedef struct TaskState {
 #endif
 uint32_t stack_base;
 int used; /* non zero if used */
-bool sigsegv_blocked; /* SIGSEGV blocked by guest */
 struct image_info *info;
 struct linux_binprm *bprm;
 
 struct emulated_sigtable sigtab[TARGET_NSIG];
 struct sigqueue sigqueue_table[MAX_SIGQUEUE_SIZE]; /* siginfo queue */
 struct sigqueue *first_free; /* first free siginfo queue entry */
-int signal_pending; /* non zero if a signal may be pending */
+/* This thread's signal mask, as requested by the guest program.
+ * The actual signal mask of this thread may differ:
+ *  + we don't let SIGSEGV and SIGBUS be blocked while running guest code
+ *  + sometimes we block all signals to avoid races
+ */
+sigset_t signal_mask;
+/* The signal mask imposed by a guest sigsuspend syscall, if we are
+ * currently in the middle of such a syscall
+ */
+sigset_t sigsuspend_mask;
+/* Nonzero if we're leaving a sigsuspend and sigsuspend_mask is valid. */
+int in_sigsuspend;
+
+/* Nonzero if process_pending_signals() needs to do something (either
+ * handle a pending signal or unblock signals).
+ * This flag is written from a signal handler so should be accessed via
+ * the atomic_read() and atomic_write() functions. (It is not accessed
+ * from multiple threads.)
+ */
+int signal_pending;
+
 } __attribute__((aligned(16))) TaskState;
 
 extern char *exec_path;
@@ -235,6 +254,12 @@ unsigned long init_guest_space(unsigned long host_start,
  * It's also OK to implement these with safe_syscall, though it will be
  * a little less efficient if a signal is delivered at the 'wrong' moment.
  *
+ * Some non-interruptible syscalls need to be handled using block_signals()
+ * to block signals for the duration of the syscall. This mainly applies
+ * to code which needs to modify the data structures used by the
+ * host_signal_handler() function and the functions it calls, including
+ * all syscalls which change the thread's signal mask.
+ *
  * (2) Interruptible syscalls
  *
  * These are guest syscalls that can be interrupted by signals and
@@ -266,6 +291,8 @@ unsigned long init_guest_space(unsigned long host_start,
  * you make in the implementation returns either -TARGET_ERESTARTSYS or
  * EINTR though.)
  *
+ * block_signals() cannot be used 

[Qemu-devel] [PATCH v2 04/19] linux-user: Factor out uses of do_sigprocmask() from sigreturn code

2016-05-27 Thread Peter Maydell
All the architecture specific handlers for sigreturn include calls
to do_sigprocmask(SIG_SETMASK, &set, NULL) to set the signal mask
from the uc_sigmask in the context being restored. Factor these
out into calls to a set_sigmask() function. The next patch will
want to add code which is not run when setting the signal mask
via do_sigreturn, and this change allows us to separate the two
cases.

Signed-off-by: Peter Maydell 
---
 linux-user/signal.c | 55 +++--
 1 file changed, 32 insertions(+), 23 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 5069c3f..1b86a85 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -239,6 +239,15 @@ int do_sigprocmask(int how, const sigset_t *set, sigset_t *oldset)
 return ret;
 }
 
+#if !defined(TARGET_OPENRISC) && !defined(TARGET_UNICORE32) && \
+!defined(TARGET_X86_64)
+/* Just set the guest's signal mask to the specified value */
+static void set_sigmask(const sigset_t *set)
+{
+do_sigprocmask(SIG_SETMASK, set, NULL);
+}
+#endif
+
 /* siginfo conversion */
 
 static inline void host_to_target_siginfo_noswap(target_siginfo_t *tinfo,
@@ -1093,7 +1102,7 @@ long do_sigreturn(CPUX86State *env)
 }
 
 target_to_host_sigset_internal(, _set);
-do_sigprocmask(SIG_SETMASK, , NULL);
+set_sigmask();
 
 /* restore registers */
 if (restore_sigcontext(env, >sc))
@@ -1118,7 +1127,7 @@ long do_rt_sigreturn(CPUX86State *env)
 if (!lock_user_struct(VERIFY_READ, frame, frame_addr, 1))
 goto badframe;
 target_to_host_sigset(, >uc.tuc_sigmask);
-do_sigprocmask(SIG_SETMASK, , NULL);
+set_sigmask();
 
 if (restore_sigcontext(env, >uc.tuc_mcontext)) {
 goto badframe;
@@ -1258,7 +1267,7 @@ static int target_restore_sigframe(CPUARMState *env,
 uint64_t pstate;
 
 target_to_host_sigset(, >uc.tuc_sigmask);
-do_sigprocmask(SIG_SETMASK, , NULL);
+set_sigmask();
 
 for (i = 0; i < 31; i++) {
 __get_user(env->xregs[i], >uc.tuc_mcontext.regs[i]);
@@ -1900,7 +1909,7 @@ static long do_sigreturn_v1(CPUARMState *env)
 }
 
 target_to_host_sigset_internal(_set, );
-do_sigprocmask(SIG_SETMASK, _set, NULL);
+set_sigmask(_set);
 
 if (restore_sigcontext(env, >sc)) {
 goto badframe;
@@ -1981,7 +1990,7 @@ static int do_sigframe_return_v2(CPUARMState *env, target_ulong frame_addr,
 abi_ulong *regspace;
 
 target_to_host_sigset(_set, >tuc_sigmask);
-do_sigprocmask(SIG_SETMASK, _set, NULL);
+set_sigmask(_set);
 
 if (restore_sigcontext(env, >tuc_mcontext))
 return 1;
@@ -2077,7 +2086,7 @@ static long do_rt_sigreturn_v1(CPUARMState *env)
 }
 
 target_to_host_sigset(_set, >uc.tuc_sigmask);
-do_sigprocmask(SIG_SETMASK, _set, NULL);
+set_sigmask(_set);
 
 if (restore_sigcontext(env, >uc.tuc_mcontext)) {
 goto badframe;
@@ -2453,7 +2462,7 @@ long do_sigreturn(CPUSPARCState *env)
 }
 
 target_to_host_sigset_internal(_set, );
-do_sigprocmask(SIG_SETMASK, _set, NULL);
+set_sigmask(_set);
 
 if (err) {
 goto segv_and_exit;
@@ -2576,7 +2585,7 @@ void sparc64_set_context(CPUSPARCState *env)
 }
 }
 target_to_host_sigset_internal(, _set);
-do_sigprocmask(SIG_SETMASK, , NULL);
+set_sigmask();
 }
 env->pc = pc;
 env->npc = npc;
@@ -2993,7 +3002,7 @@ long do_sigreturn(CPUMIPSState *regs)
 }
 
 target_to_host_sigset_internal(, _set);
-do_sigprocmask(SIG_SETMASK, , NULL);
+set_sigmask();
 
 restore_sigcontext(regs, >sf_sc);
 
@@ -3097,7 +3106,7 @@ long do_rt_sigreturn(CPUMIPSState *env)
 }
 
 target_to_host_sigset(, >rs_uc.tuc_sigmask);
-do_sigprocmask(SIG_SETMASK, , NULL);
+set_sigmask();
 
 restore_sigcontext(env, >rs_uc.tuc_mcontext);
 
@@ -3371,7 +3380,7 @@ long do_sigreturn(CPUSH4State *regs)
 goto badframe;
 
 target_to_host_sigset_internal(, _set);
-do_sigprocmask(SIG_SETMASK, , NULL);
+set_sigmask();
 
 restore_sigcontext(regs, >sc);
 
@@ -3397,7 +3406,7 @@ long do_rt_sigreturn(CPUSH4State *regs)
 }
 
 target_to_host_sigset(, >uc.tuc_sigmask);
-do_sigprocmask(SIG_SETMASK, , NULL);
+set_sigmask();
 
 restore_sigcontext(regs, >uc.tuc_mcontext);
 
@@ -3621,7 +3630,7 @@ long do_sigreturn(CPUMBState *env)
 __get_user(target_set.sig[i], >extramask[i - 1]);
 }
 target_to_host_sigset_internal(, _set);
-do_sigprocmask(SIG_SETMASK, , NULL);
+set_sigmask();
 
 restore_sigcontext(>uc.tuc_mcontext, env);
 /* We got here through a sigreturn syscall, our path back is via an
@@ -3792,7 +3801,7 @@ long do_sigreturn(CPUCRISState *env)
 __get_user(target_set.sig[i], >extramask[i - 1]);
 }
 target_to_host_sigset_internal(, _set);
-do_sigprocmask(SIG_SETMASK, , NULL);
+set_sigmask();
 
 restore_sigcontext(>sc, env);
 

[Qemu-devel] [PATCH v2 05/19] linux-user: Define macro for size of host kernel sigset_t

2016-05-27 Thread Peter Maydell
Some host syscalls take an argument specifying the size of a
host kernel's sigset_t (which isn't necessarily the same as
that of the host libc's type of that name). Instead of hardcoding
_NSIG / 8 where we do this, define and use a SIGSET_T_SIZE macro.

Signed-off-by: Peter Maydell 
---
 linux-user/syscall.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index df70255..e4b7404 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -119,6 +119,10 @@ int __clone2(int (*fn)(void *), void *child_stack_base,
 #define VFAT_IOCTL_READDIR_BOTH  _IOR('r', 1, struct linux_dirent [2])
 #define VFAT_IOCTL_READDIR_SHORT _IOR('r', 2, struct linux_dirent [2])
 
+/* This is the size of the host kernel's sigset_t, needed where we make
+ * direct system calls that take a sigset_t pointer and a size.
+ */
+#define SIGSET_T_SIZE (_NSIG / 8)
 
 #undef _syscall0
 #undef _syscall1
@@ -7221,7 +7225,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 /* Extract the two packed args for the sigset */
 if (arg6) {
 sig_ptr = 
-sig.size = _NSIG / 8;
+sig.size = SIGSET_T_SIZE;
 
 arg7 = lock_user(VERIFY_READ, arg6, sizeof(*arg7) * 2, 1);
 if (!arg7) {
@@ -8275,7 +8279,8 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 set = NULL;
 }
 
-ret = get_errno(sys_ppoll(pfd, nfds, timeout_ts, set, _NSIG/8));
+ret = get_errno(sys_ppoll(pfd, nfds, timeout_ts,
+  set, SIGSET_T_SIZE));
 
 if (!is_error(ret) && arg3) {
 host_to_target_timespec(arg3, timeout_ts);
-- 
1.9.1
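
To illustrate why the size argument matters, here is a minimal sketch (not code from the patch). It assumes a Linux host with glibc, where the kernel's sigset_t is _NSIG / 8 bytes while the libc type of the same name is larger. The helper name probe_sigset_size() is made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Same definition as the macro added by the patch. */
#define SIGSET_T_SIZE (_NSIG / 8)

/* Hypothetical helper: query the current signal mask with a raw
 * rt_sigprocmask syscall, passing 'size' as the sigset size.
 * Returns 0 on success or the errno value on failure.
 */
static int probe_sigset_size(size_t size)
{
    sigset_t old;

    sigemptyset(&old);
    /* A NULL new set means "just report the old mask". */
    if (syscall(SYS_rt_sigprocmask, SIG_BLOCK, NULL, &old, size) == -1) {
        return errno;
    }
    return 0;
}
```

Under those assumptions, the kernel accepts SIGSET_T_SIZE and rejects sizeof(sigset_t) with EINVAL, which is why direct syscalls cannot simply pass the libc type's size.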




[Qemu-devel] [PATCH] linux-user: provide frame information in x86-64 safe_syscall

2016-05-27 Thread Peter Maydell
Use cfi directives in the x86-64 safe_syscall to allow gdb to get
backtraces right from within it. (In particular this will be
quite a common situation if the user interrupts QEMU while it's
in a blocked safe-syscall: at the point of the syscall insn RBP
is in use for something else, and so gdb can't find the frame then
without assistance.)

Signed-off-by: Peter Maydell 
---
The requirements for frame information annotations seem to be a bit
of an undocumented black art, but I think I have these right. At
any rate, gdb now gives correct backtraces from all points in
the routine as far as I can see. Review appreciated...


 linux-user/host/x86_64/safe-syscall.inc.S | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/linux-user/host/x86_64/safe-syscall.inc.S b/linux-user/host/x86_64/safe-syscall.inc.S
index dde434c..bbb1eca 100644
--- a/linux-user/host/x86_64/safe-syscall.inc.S
+++ b/linux-user/host/x86_64/safe-syscall.inc.S
@@ -24,6 +24,7 @@
  * -1-and-errno-set convention is done by the calling wrapper.
  */
 safe_syscall_base:
+.cfi_startproc
 /* This saves a frame pointer and aligns the stack for the syscall.
  * (It's unclear if the syscall ABI has the same stack alignment
  * requirements as the userspace function call ABI, but better safe than
@@ -31,6 +32,8 @@ safe_syscall_base:
  * does not list any ABI differences regarding stack alignment.)
  */
 push%rbp
+.cfi_def_cfa_offset 16
+.cfi_offset rbp,-16
 
 /* The syscall calling convention isn't the same as the
  * C one:
@@ -70,12 +73,19 @@ safe_syscall_start:
 safe_syscall_end:
 /* code path for having successfully executed the syscall */
 pop %rbp
+.cfi_remember_state
+.cfi_def_cfa_offset 8
+.cfi_restore rbp
 ret
 
 return_ERESTARTSYS:
 /* code path when we didn't execute the syscall */
+.cfi_restore_state
 mov $-TARGET_ERESTARTSYS, %rax
 pop %rbp
+.cfi_def_cfa_offset 8
+.cfi_restore rbp
 ret
+.cfi_endproc
 
 .size   safe_syscall_base, .-safe_syscall_base
-- 
1.9.1




[Qemu-devel] [PATCH v2 08/19] linux-user: Remove redundant default action check in queue_signal()

2016-05-27 Thread Peter Maydell
From: Timothy E Baldwin 

Both queue_signal() and process_pending_signals() checked for default
actions of signals. This is redundant, and it also causes fatal and stopping
signals to incorrectly interrupt guest system calls.

The code in queue_signal() is removed.

Signed-off-by: Timothy Edward Baldwin 
Message-id: 1441497448-32489-21-git-send-email-t.e.baldwi...@members.leeds.ac.uk
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 linux-user/signal.c | 37 -
 1 file changed, 37 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index a89853d..2c6790d 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -525,46 +525,10 @@ int queue_signal(CPUArchState *env, int sig, target_siginfo_t *info)
 TaskState *ts = cpu->opaque;
 struct emulated_sigtable *k;
 struct sigqueue *q, **pq;
-abi_ulong handler;
-int queue;
 
 trace_user_queue_signal(env, sig);
 k = >sigtab[sig - 1];
-queue = gdb_queuesig ();
-handler = sigact_table[sig - 1]._sa_handler;
 
-if (sig == TARGET_SIGSEGV && sigismember(>signal_mask, SIGSEGV)) {
-/* Guest has blocked SIGSEGV but we got one anyway. Assume this
- * is a forced SIGSEGV (ie one the kernel handles via force_sig_info
- * because it got a real MMU fault). A blocked SIGSEGV in that
- * situation is treated as if using the default handler. This is
- * not correct if some other process has randomly sent us a SIGSEGV
- * via kill(), but that is not easy to distinguish at this point,
- * so we assume it doesn't happen.
- */
-handler = TARGET_SIG_DFL;
-}
-
-if (!queue && handler == TARGET_SIG_DFL) {
-if (sig == TARGET_SIGTSTP || sig == TARGET_SIGTTIN || sig == TARGET_SIGTTOU) {
-kill(getpid(),SIGSTOP);
-return 0;
-} else
-/* default handler : ignore some signal. The other are fatal */
-if (sig != TARGET_SIGCHLD &&
-sig != TARGET_SIGURG &&
-sig != TARGET_SIGWINCH &&
-sig != TARGET_SIGCONT) {
-force_sig(sig);
-} else {
-return 0; /* indicate ignored */
-}
-} else if (!queue && handler == TARGET_SIG_IGN) {
-/* ignore signal */
-return 0;
-} else if (!queue && handler == TARGET_SIG_ERR) {
-force_sig(sig);
-} else {
 pq = >first;
 if (sig < TARGET_SIGRTMIN) {
 /* if non real time signal, we queue exactly one signal */
@@ -591,7 +555,6 @@ int queue_signal(CPUArchState *env, int sig, target_siginfo_t *info)
 /* signal that a new signal is pending */
 atomic_set(>signal_pending, 1);
 return 1; /* indicates that the signal was queued */
-}
 }
 
 #ifndef HAVE_SAFE_SYSCALL
-- 
1.9.1




[Qemu-devel] [PATCH v2 09/19] linux-user: Remove redundant gdb_queuesig()

2016-05-27 Thread Peter Maydell
From: Timothy E Baldwin 

Signed-off-by: Timothy Edward Baldwin 
Message-id: 1441497448-32489-22-git-send-email-t.e.baldwi...@members.leeds.ac.uk
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 gdbstub.c  | 13 -
 include/exec/gdbstub.h |  1 -
 2 files changed, 14 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index b9e3710..ae83028 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -1494,19 +1494,6 @@ void gdb_exit(CPUArchState *env, int code)
 
 #ifdef CONFIG_USER_ONLY
 int
-gdb_queuesig (void)
-{
-GDBState *s;
-
-s = gdbserver_state;
-
-if (gdbserver_fd < 0 || s->fd < 0)
-return 0;
-else
-return 1;
-}
-
-int
 gdb_handlesig(CPUState *cpu, int sig)
 {
 GDBState *s;
diff --git a/include/exec/gdbstub.h b/include/exec/gdbstub.h
index 8e3f8d8..f9708bb 100644
--- a/include/exec/gdbstub.h
+++ b/include/exec/gdbstub.h
@@ -48,7 +48,6 @@ int use_gdb_syscalls(void);
 void gdb_set_stop_cpu(CPUState *cpu);
 void gdb_exit(CPUArchState *, int);
 #ifdef CONFIG_USER_ONLY
-int gdb_queuesig (void);
 int gdb_handlesig(CPUState *, int);
 void gdb_signalled(CPUArchState *, int);
 void gdbserver_fork(CPUState *);
-- 
1.9.1




[Qemu-devel] [PATCH v2 06/19] linux-user: Use safe_syscall for sigsuspend syscalls

2016-05-27 Thread Peter Maydell
Use the safe_syscall wrapper for sigsuspend syscalls. This
means that we will definitely deliver a signal that arrives
before we do the sigsuspend call, rather than blocking first
and delivering afterwards.

Signed-off-by: Peter Maydell 
---
 linux-user/syscall.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index e4b7404..083f26f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -703,6 +703,7 @@ safe_syscall6(int, pselect6, int, nfds, fd_set *, readfds, fd_set *, writefds, \
   fd_set *, exceptfds, struct timespec *, timeout, void *, sig)
 safe_syscall6(int,futex,int *,uaddr,int,op,int,val, \
   const struct timespec *,timeout,int *,uaddr2,int,val3)
+safe_syscall2(int, rt_sigsuspend, sigset_t *, newset, size_t, sigsetsize)
 
 static inline int host_to_target_sock_type(int host_type)
 {
@@ -7007,7 +7008,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 target_to_host_old_sigset(, p);
 unlock_user(p, arg1, 0);
 #endif
-ret = get_errno(sigsuspend());
+ret = get_errno(safe_rt_sigsuspend(, SIGSET_T_SIZE));
 }
 break;
 #endif
@@ -7018,7 +7019,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 goto efault;
 target_to_host_sigset(, p);
 unlock_user(p, arg1, 0);
-ret = get_errno(sigsuspend());
+ret = get_errno(safe_rt_sigsuspend(, SIGSET_T_SIZE));
 }
 break;
 case TARGET_NR_rt_sigtimedwait:
-- 
1.9.1
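
The concern about a signal arriving just before the sigsuspend call can be seen in plain POSIX terms. The sketch below is not QEMU's safe_syscall mechanism; it only shows that a signal which is already pending when sigsuspend() is entered gets delivered by that call instead of being lost while the caller blocks. The names on_usr1() and wait_for_pending_usr1() are invented for the example, and a single-threaded process is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_usr1;

static void on_usr1(int sig)
{
    (void)sig;
    got_usr1 = 1;
}

/* Hypothetical demo: SIGUSR1 becomes pending before sigsuspend() runs,
 * and sigsuspend() still delivers it because the mask swap and the wait
 * happen atomically.
 * Returns 1 if the handler ran and sigsuspend() failed with EINTR.
 */
static int wait_for_pending_usr1(void)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    int ret, saved_errno;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1) {
        return 0;
    }

    /* Block SIGUSR1 so that it stays pending instead of being handled. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &old);

    got_usr1 = 0;
    raise(SIGUSR1);            /* the "early" signal */

    /* Wait with SIGUSR1 unblocked; the pending signal is taken at once. */
    wait_mask = old;
    sigdelset(&wait_mask, SIGUSR1);
    ret = sigsuspend(&wait_mask);
    saved_errno = errno;

    sigprocmask(SIG_SETMASK, &old, NULL);
    return got_usr1 == 1 && ret == -1 && saved_errno == EINTR;
}
```

An emulator cannot keep host signals blocked across arbitrary guest code in the same way, which is presumably why the patch routes the call through the safe_syscall wrapper instead.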




Re: [Qemu-devel] [PATCH 2/2] macio: switch over to new byte-aligned DMA helpers

2016-05-27 Thread John Snow


On 05/27/2016 04:48 AM, Mark Cave-Ayland wrote:
> Now that the DMA helpers are byte-aligned they can be called directly from
> the macio routines rather than emulating byte-aligned accesses via multiple
> block-level accesses.
> 
> Signed-off-by: Mark Cave-Ayland 
> ---
>  hw/ide/macio.c |  213 
> 

^ _cool_ ^

I assume you'll actually not be needing me to deal with this until you
resolve the HACK in the pre-requisite patch, yes?

>  1 file changed, 28 insertions(+), 185 deletions(-)
> 
> diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> index 42ad68a..e4e567e 100644
> --- a/hw/ide/macio.c
> +++ b/hw/ide/macio.c
> @@ -52,187 +52,6 @@ static const int debug_macio = 0;
>  
>  #define MACIO_PAGE_SIZE 4096
>  



[Qemu-devel] [PATCH v2 17/19] linux-user: Use both si_code and si_signo when converting siginfo_t

2016-05-27 Thread Peter Maydell
The siginfo_t struct includes a union. The correct way to identify
which fields of the union are relevant is complicated, because we
have to use a combination of the si_code and si_signo to figure out
which of the union's members are valid.  (Within the host kernel it
is always possible to tell, but the kernel carefully avoids giving
userspace the high 16 bits of si_code, so we don't have the
information to do this the easy way...) We therefore make our best
guess, bearing in mind that a guest can spoof most of the si_codes
via rt_sigqueueinfo() if it likes.  Once we have made our guess, we
record it in the top 16 bits of the si_code, so that tswap_siginfo()
later can use it.  tswap_siginfo() then strips these top bits out
before writing si_code to the guest (sign-extending the lower bits).

This fixes a bug where fields were sometimes wrong; in particular
the LTP kill10 test went into an infinite loop because its signal
handler got a si_pid value of 0 rather than the pid of the sending
process.

As part of this change, we switch to using __put_user() in the
tswap_siginfo code which writes out the byteswapped values to
the target memory, in case the target memory pointer is not
sufficiently aligned for the host CPU's requirements.

Signed-off-by: Peter Maydell 
---
 linux-user/signal.c   | 165 --
 linux-user/syscall_defs.h |  15 +
 2 files changed, 131 insertions(+), 49 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index b21d6bf..8ea0cbf 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -17,6 +17,7 @@
  *  along with this program; if not, see .
  */
 #include "qemu/osdep.h"
+#include "qemu/bitops.h"
 #include 
 #include 
 
@@ -274,70 +275,129 @@ static inline void host_to_target_siginfo_noswap(target_siginfo_t *tinfo,
  const siginfo_t *info)
 {
 int sig = host_to_target_signal(info->si_signo);
+int si_code = info->si_code;
+int si_type;
 tinfo->si_signo = sig;
 tinfo->si_errno = 0;
 tinfo->si_code = info->si_code;
 
-if (sig == TARGET_SIGILL || sig == TARGET_SIGFPE || sig == TARGET_SIGSEGV
-|| sig == TARGET_SIGBUS || sig == TARGET_SIGTRAP) {
-/* Should never come here, but who knows. The information for
-   the target is irrelevant.  */
-tinfo->_sifields._sigfault._addr = 0;
-} else if (sig == TARGET_SIGIO) {
-tinfo->_sifields._sigpoll._band = info->si_band;
-tinfo->_sifields._sigpoll._fd = info->si_fd;
-} else if (sig == TARGET_SIGCHLD) {
-tinfo->_sifields._sigchld._pid = info->si_pid;
-tinfo->_sifields._sigchld._uid = info->si_uid;
-tinfo->_sifields._sigchld._status
+/* This is awkward, because we have to use a combination of
+ * the si_code and si_signo to figure out which of the union's
+ * members are valid. (Within the host kernel it is always possible
+ * to tell, but the kernel carefully avoids giving userspace the
+ * high 16 bits of si_code, so we don't have the information to
+ * do this the easy way...) We therefore make our best guess,
+ * bearing in mind that a guest can spoof most of the si_codes
+ * via rt_sigqueueinfo() if it likes.
+ *
+ * Once we have made our guess, we record it in the top 16 bits of
+ * the si_code, so that tswap_siginfo() later can use it.
+ * tswap_siginfo() will strip these top bits out before writing
+ * si_code to the guest (sign-extending the lower bits).
+ */
+
+switch (si_code) {
+case SI_USER:
+case SI_TKILL:
+case SI_KERNEL:
+/* Sent via kill(), tkill() or tgkill(), or direct from the kernel.
+ * These are the only unspoofable si_code values.
+ */
+tinfo->_sifields._kill._pid = info->si_pid;
+tinfo->_sifields._kill._uid = info->si_uid;
+si_type = QEMU_SI_KILL;
+break;
+default:
+/* Everything else is spoofable. Make best guess based on signal */
+switch (sig) {
+case TARGET_SIGCHLD:
+tinfo->_sifields._sigchld._pid = info->si_pid;
+tinfo->_sifields._sigchld._uid = info->si_uid;
+tinfo->_sifields._sigchld._status
 = host_to_target_waitstatus(info->si_status);
-tinfo->_sifields._sigchld._utime = info->si_utime;
-tinfo->_sifields._sigchld._stime = info->si_stime;
-} else if (sig >= TARGET_SIGRTMIN) {
-tinfo->_sifields._rt._pid = info->si_pid;
-tinfo->_sifields._rt._uid = info->si_uid;
-/* XXX: potential problem if 64 bit */
-tinfo->_sifields._rt._sigval.sival_ptr
+tinfo->_sifields._sigchld._utime = info->si_utime;
+tinfo->_sifields._sigchld._stime = info->si_stime;
+si_type = QEMU_SI_CHLD;
+break;
+case TARGET_SIGIO:
+

[Qemu-devel] [PATCH v2 03/19] linux-user: Fix stray tab-indent

2016-05-27 Thread Peter Maydell
Fix a stray tab-indented line in linux-user/signal.c.

Signed-off-by: Peter Maydell 
---
 linux-user/signal.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 5f98c71..5069c3f 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -5847,8 +5847,9 @@ static void handle_pending_signal(CPUArchState *cpu_env, int sig)
 else
 setup_frame(sig, sa, _old_set, cpu_env);
 #endif
-   if (sa->sa_flags & TARGET_SA_RESETHAND)
+if (sa->sa_flags & TARGET_SA_RESETHAND) {
 sa->_sa_handler = TARGET_SIG_DFL;
+}
 }
 if (q != >info)
 free_sigqueue(cpu_env, q);
-- 
1.9.1




Re: [Qemu-devel] [RFC PATCH v4 0/3] Add Mediated device support[was: Add vGPU support]

2016-05-27 Thread Alex Williamson
On Fri, 27 May 2016 11:02:46 +
"Tian, Kevin"  wrote:

> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Wednesday, May 25, 2016 9:44 PM
> > 
> > On Wed, 25 May 2016 07:13:58 +
> > "Tian, Kevin"  wrote:
> >   
> > > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com]
> > > > Sent: Wednesday, May 25, 2016 3:58 AM
> > > >
> > > > This series adds Mediated device support to v4.6 Linux host kernel. 
> > > > Purpose
> > > > of this series is to provide a common interface for mediated device
> > > > management that can be used by different devices. This series introduces
> > > > Mdev core module that create and manage mediated devices, VFIO based 
> > > > driver
> > > > for mediated PCI devices that are created by Mdev core module and update
> > > > VFIO type1 IOMMU module to support mediated devices.  
> > >
> > > Thanks. "Mediated device" is more generic than previous one. :-)
> > >  
> > > >
> > > > What's new in v4?
> > > > - Renamed 'vgpu' module to 'mdev' module that represent generic term
> > > >   'Mediated device'.
> > > > - Moved mdev directory to drivers/vfio directory as this is the 
> > > > extension
> > > >   of VFIO APIs for mediated devices.
> > > > - Updated mdev driver to be flexible to register multiple types of 
> > > > drivers
> > > >   to mdev_bus_type bus.
> > > > - Updated mdev core driver with mdev_put_device() and mdev_get_device() 
> > > > for
> > > >   mediated devices.
> > > >
> > > >  
> > >
> > > Just curious. In this version you move the whole mdev core under
> > > VFIO now. Sorry if I missed any agreement on this change. IIRC Alex
> > > doesn't want VFIO to manage mdev life-cycle directly. Instead VFIO is
> > > just a mdev driver on created mediated devices  
> > 
> > I did originally suggest keeping them separate, but as we've progressed
> > through the implementation, it's become more clear that the mediated
> > device interface is very much tied to the vfio interface, acting mostly
> > as a passthrough.  So I thought it made sense to pull them together.
> > Still open to discussion of course.  Thanks,
> >   
> 
> The main benefit of maintaining a separate mdev framework, IMHO, is
> to allow better support of both KVM and Xen. Xen doesn't work with VFIO
> today, because other VM's memory is not allocated from Dom0 which
> means VFIO within Dom0 doesn't have view/permission to control isolation 
> for other VMs.

Isn't this just a matter of the vfio iommu model selected?  There could
be a vfio-iommu-xen that knows how to do the grant calls.

> However, after some thinking I think it might not be a big problem to
> combine VFIO/mdev together, if we extend Xen to just use VFIO for
> resource enumeration. In such model, VFIO still behaves as a single 
> kernel portal to enumerate mediated devices to user space, but give up 
> permission control to Qemu which will request a secure agent - Xen
> hypervisor - to ensure isolation of VM usage on mediated device (including
> EPT/IOMMU configuration).

The whole point here is to use the vfio user api and we seem to be
progressing towards using vfio-core as a conduit where the mediated
driver api is also fairly vfio-ish.  So it seems we're really headed
towards a vfio-mediated device rather than some sort of generic mediated
driver interface.  I would object to leaving permission control to
QEMU; QEMU is just a vfio user, and there are others like DPDK.  The kernel
needs to be in charge of protecting itself and users from each other.
QEMU can't do this, which is part of the reason that KVM has moved to vfio
rather than the pci-sysfs resource interface.
 
> I'm not sure whether VFIO can support this usage today. It is somehow 
> similar to channel io passthru in s390, where we also rely on Qemu to 
> mediate ccw commands to ensure isolation. Maybe just some slight 
> extension is required (e.g. not assume some API must be invoked). Of 
> course Qemu side vfio code also need some change. If this can work, 
> at least we can first put it as the enumeration interface for mediated 
> device in Xen. In the future it may be extended to cover normal Xen 
> PCI assignment as well instead of using sysfs to read PCI resource
> today.

The channel io proposal doesn't rely on QEMU for security either, the
mediation occurs in the host kernel, parsing the ccw command program,
and doing translations to replace the guest physical addresses with
verified and pinned host physical addresses before submitting the
program to be run.  A mediated device is policed by the mediated
vendor driver in the host kernel, QEMU is untrusted, just like any
other user.

If Xen is currently using pci-sysfs for mapping device resources, then
vfio should be directly usable, which leaves the IOMMU interfaces, such
as pinning and mapping user memory and making use of the IOMMU API.
That part of vfio is fairly modular, though IOMMU groups are a fairly
fundamental concept within the core.  Thanks,

Alex



[Qemu-devel] [PATCH v2 02/19] linux-user: Move handle_pending_signal() to avoid need for declaration

2016-05-27 Thread Peter Maydell
Move the handle_pending_signal() function above process_pending_signals()
to avoid the need for a forward declaration. (Whitespace only change.)

Signed-off-by: Peter Maydell 
---
 linux-user/signal.c | 44 +---
 1 file changed, 21 insertions(+), 23 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index a9ac491..5f98c71 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -5765,29 +5765,6 @@ long do_rt_sigreturn(CPUArchState *env)
 
 #endif
 
-static void handle_pending_signal(CPUArchState *cpu_env, int sig);
-
-void process_pending_signals(CPUArchState *cpu_env)
-{
-CPUState *cpu = ENV_GET_CPU(cpu_env);
-int sig;
-TaskState *ts = cpu->opaque;
-
-if (!ts->signal_pending)
-return;
-
-/* FIXME: This is not threadsafe.  */
-for(sig = 1; sig <= TARGET_NSIG; sig++) {
-if (ts->sigtab[sig - 1].pending) {
-handle_pending_signal(cpu_env, sig);
-return;
-}
-}
-/* if no signal is pending, just return */
-ts->signal_pending = 0;
-return;
-}
-
 static void handle_pending_signal(CPUArchState *cpu_env, int sig)
 {
 CPUState *cpu = ENV_GET_CPU(cpu_env);
@@ -5876,3 +5853,24 @@ static void handle_pending_signal(CPUArchState *cpu_env, int sig)
 if (q != >info)
 free_sigqueue(cpu_env, q);
 }
+
+void process_pending_signals(CPUArchState *cpu_env)
+{
+CPUState *cpu = ENV_GET_CPU(cpu_env);
+int sig;
+TaskState *ts = cpu->opaque;
+
+if (!ts->signal_pending)
+return;
+
+/* FIXME: This is not threadsafe.  */
+for(sig = 1; sig <= TARGET_NSIG; sig++) {
+if (ts->sigtab[sig - 1].pending) {
+handle_pending_signal(cpu_env, sig);
+return;
+}
+}
+/* if no signal is pending, just return */
+ts->signal_pending = 0;
+return;
+}
-- 
1.9.1




[Qemu-devel] [PATCH v2 12/19] linux-user: Block signals during sigaction() handling

2016-05-27 Thread Peter Maydell
From: Timothy E Baldwin 

Block signals while emulating sigaction. This is a non-interruptible
syscall, and using block_signals() avoids races where the host
signal handler is invoked and tries to examine the signal handler
data structures while we are updating them.

Signed-off-by: Timothy Edward Baldwin 
Message-id: 1441497448-32489-29-git-send-email-t.e.baldwi...@members.leeds.ac.uk
[PMM: expanded commit message]
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 linux-user/signal.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index f489028..b21d6bf 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -640,7 +640,7 @@ out:
 return ret;
 }
 
-/* do_sigaction() return host values and errnos */
+/* do_sigaction() return target values and host errnos */
 int do_sigaction(int sig, const struct target_sigaction *act,
  struct target_sigaction *oact)
 {
@@ -649,8 +649,14 @@ int do_sigaction(int sig, const struct target_sigaction *act,
 int host_sig;
 int ret = 0;
 
-if (sig < 1 || sig > TARGET_NSIG || sig == TARGET_SIGKILL || sig == TARGET_SIGSTOP)
-return -EINVAL;
+if (sig < 1 || sig > TARGET_NSIG || sig == TARGET_SIGKILL || sig == TARGET_SIGSTOP) {
+return -TARGET_EINVAL;
+}
+
+if (block_signals()) {
+return -TARGET_ERESTARTSYS;
+}
+
 k = _table[sig - 1];
 if (oact) {
 __put_user(k->_sa_handler, >_sa_handler);
-- 
1.9.1
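
The idea of blocking signals around an update to handler data can be sketched with plain sigprocmask(). This is only an illustration under stated assumptions: QEMU's block_signals() is not shown in this excerpt and may work differently, and demo_handlers, demo_set_handler() and demo_is_blocked() are names invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

#define DEMO_NSIG 64

/* Hypothetical stand-in for a per-signal handler table. */
static void (*demo_handlers[DEMO_NSIG])(int);

/* Update one table entry with every blockable signal masked, so that a
 * signal handler running on this thread cannot observe the table while
 * it is being modified. The previous mask is restored afterwards.
 * Returns 0 on success, -1 on a bad signal number or mask failure.
 */
static int demo_set_handler(int sig, void (*handler)(int))
{
    sigset_t all, old;

    if (sig < 1 || sig > DEMO_NSIG) {
        return -1;
    }
    sigfillset(&all);
    if (sigprocmask(SIG_SETMASK, &all, &old) == -1) {
        return -1;
    }
    demo_handlers[sig - 1] = handler;
    return sigprocmask(SIG_SETMASK, &old, NULL);
}

/* Returns 1 if 'sig' is blocked in the calling thread's current mask. */
static int demo_is_blocked(int sig)
{
    sigset_t cur;

    sigprocmask(SIG_BLOCK, NULL, &cur);
    return sigismember(&cur, sig) == 1;
}
```

In a multi-threaded program pthread_sigmask() would be the appropriate call; sigprocmask() is used here only to keep the sketch short.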




[Qemu-devel] [PATCH v2 10/19] linux-user: Remove real-time signal queuing

2016-05-27 Thread Peter Maydell
From: Timothy E Baldwin 

As host signals are now blocked whenever guest signals are blocked, the
queue of realtime signals is now in Linux. The QEMU queue is now
redundant and can be removed. (We already did not queue non-RT signals, and
none of the calls to queue_signal() except the one in host_signal_handler()
pass an RT signal number.)

Signed-off-by: Timothy Edward Baldwin 
Message-id: 1441497448-32489-23-git-send-email-t.e.baldwi...@members.leeds.ac.uk
Reviewed-by: Peter Maydell 
[PMM: minor commit message tweak]
Signed-off-by: Peter Maydell 
---
 linux-user/main.c   |  7 --
 linux-user/qemu.h   | 11 +
 linux-user/signal.c | 70 ++---
 3 files changed, 14 insertions(+), 74 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index b2bc6ab..b6da0ba 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -3794,14 +3794,7 @@ void stop_all_tasks(void)
 /* Assumes contents are already zeroed.  */
 void init_task_state(TaskState *ts)
 {
-int i;
- 
 ts->used = 1;
-ts->first_free = ts->sigqueue_table;
-for (i = 0; i < MAX_SIGQUEUE_SIZE - 1; i++) {
-ts->sigqueue_table[i].next = >sigqueue_table[i + 1];
-}
-ts->sigqueue_table[i].next = NULL;
 }
 
 CPUArchState *cpu_copy(CPUArchState *env)
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 5138289..b201f90 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -78,16 +78,9 @@ struct vm86_saved_state {
 
 #define MAX_SIGQUEUE_SIZE 1024
 
-struct sigqueue {
-struct sigqueue *next;
-target_siginfo_t info;
-};
-
 struct emulated_sigtable {
 int pending; /* true if signal is pending */
-struct sigqueue *first;
-struct sigqueue info; /* in order to always have memory for the
- first signal, we put it here */
+target_siginfo_t info;
 };
 
 /* NOTE: we force a big alignment so that the stack stored after is
@@ -127,8 +120,6 @@ typedef struct TaskState {
 struct linux_binprm *bprm;
 
 struct emulated_sigtable sigtab[TARGET_NSIG];
-struct sigqueue sigqueue_table[MAX_SIGQUEUE_SIZE]; /* siginfo queue */
-struct sigqueue *first_free; /* first free siginfo queue entry */
 /* This thread's signal mask, as requested by the guest program.
  * The actual signal mask of this thread may differ:
  *  + we don't let SIGSEGV and SIGBUS be blocked while running guest code
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 2c6790d..5db1c0b 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -441,27 +441,6 @@ void signal_init(void)
 }
 }
 
-/* signal queue handling */
-
-static inline struct sigqueue *alloc_sigqueue(CPUArchState *env)
-{
-CPUState *cpu = ENV_GET_CPU(env);
-TaskState *ts = cpu->opaque;
-struct sigqueue *q = ts->first_free;
-if (!q)
-return NULL;
-ts->first_free = q->next;
-return q;
-}
-
-static inline void free_sigqueue(CPUArchState *env, struct sigqueue *q)
-{
-CPUState *cpu = ENV_GET_CPU(env);
-TaskState *ts = cpu->opaque;
-
-q->next = ts->first_free;
-ts->first_free = q;
-}
 
 /* abort execution with signal */
 static void QEMU_NORETURN force_sig(int target_sig)
@@ -524,37 +503,20 @@ int queue_signal(CPUArchState *env, int sig, target_siginfo_t *info)
 CPUState *cpu = ENV_GET_CPU(env);
 TaskState *ts = cpu->opaque;
 struct emulated_sigtable *k;
-struct sigqueue *q, **pq;
 
 trace_user_queue_signal(env, sig);
 k = >sigtab[sig - 1];
 
-pq = >first;
-if (sig < TARGET_SIGRTMIN) {
-/* if non real time signal, we queue exactly one signal */
-if (!k->pending)
-q = >info;
-else
-return 0;
-} else {
-if (!k->pending) {
-/* first signal */
-q = >info;
-} else {
-q = alloc_sigqueue(env);
-if (!q)
-return -EAGAIN;
-while (*pq != NULL)
-pq = &(*pq)->next;
-}
-}
-*pq = q;
-q->info = *info;
-q->next = NULL;
-k->pending = 1;
-/* signal that a new signal is pending */
-atomic_set(>signal_pending, 1);
-return 1; /* indicates that the signal was queued */
+/* we queue exactly one signal */
+if (k->pending) {
+return 0;
+}
+
+k->info = *info;
+k->pending = 1;
+/* signal that a new signal is pending */
+atomic_set(>signal_pending, 1);
+return 1; /* indicates that the signal was queued */
 }
 
 #ifndef HAVE_SAFE_SYSCALL
@@ -5783,16 +5745,12 @@ static void handle_pending_signal(CPUArchState *cpu_env, int sig)
 sigset_t set;
 target_sigset_t target_old_set;
 struct target_sigaction *sa;
-struct sigqueue *q;
 TaskState 
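
The claim that "the queue of realtime signals is now in Linux" relies on the host kernel queuing multiple instances of a blocked real-time signal. A small sketch of that behaviour follows, assuming a single-threaded Linux process; demo_rt_queue() is a made-up name and the code is unrelated to QEMU's implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical demo: queue 'n' instances of SIGRTMIN while it is
 * blocked, then drain them without blocking. The payload of each
 * instance is stored in out[]. Returns the number of signals received.
 */
static int demo_rt_queue(int n, int *out)
{
    sigset_t set, old;
    struct timespec zero = { 0, 0 };
    siginfo_t info;
    int i, got = 0;

    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    sigprocmask(SIG_BLOCK, &set, &old);

    for (i = 0; i < n; i++) {
        union sigval v;

        v.sival_int = i + 1;
        if (sigqueue(getpid(), SIGRTMIN, v) == -1) {
            break;
        }
    }

    /* Each queued instance is returned separately by the kernel. */
    while (got < n && sigtimedwait(&set, &info, &zero) == SIGRTMIN) {
        out[got++] = info.si_value.sival_int;
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    return got;
}
```

A standard (non-RT) signal sent several times while blocked is typically delivered only once, which matches the remark above that non-RT signals were never queued.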

[Qemu-devel] [PATCH v2 11/19] linux-user: Queue synchronous signals separately

2016-05-27 Thread Peter Maydell
From: Timothy E Baldwin 

If a synchronous signal and an asynchronous signal arrive nearly simultaneously,
and the signal number of the asynchronous signal is lower than that of the
synchronous signal, then the handler for the asynchronous signal would be called
first, and then the handler for the synchronous signal would be called within or
after the first handler with an incorrect context.

This is fixed by queuing synchronous signals separately. Note that this does
risk delaying an asynchronous signal until the synchronous signal handler
returns rather than handling the signal on another thread, but this seems
unlikely to cause problems for real guest programs and is unavoidable unless
we could guarantee to roll back and reexecute whatever guest instruction
caused the synchronous signal (which would be a bit odd if we've already
logged its execution, for instance, and would require careful analysis of
all guest CPUs to check it was possible in all cases).

Signed-off-by: Timothy Edward Baldwin 
Message-id: 1441497448-32489-24-git-send-email-t.e.baldwi...@members.leeds.ac.uk
[PMM: added a comment]
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 linux-user/qemu.h   |  1 +
 linux-user/signal.c | 74 ++---
 2 files changed, 43 insertions(+), 32 deletions(-)

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index b201f90..6bd7b32 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -119,6 +119,7 @@ typedef struct TaskState {
 struct image_info *info;
 struct linux_binprm *bprm;
 
+struct emulated_sigtable sync_signal;
 struct emulated_sigtable sigtab[TARGET_NSIG];
 /* This thread's signal mask, as requested by the guest program.
  * The actual signal mask of this thread may differ:
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 5db1c0b..f489028 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -502,18 +502,11 @@ int queue_signal(CPUArchState *env, int sig, target_siginfo_t *info)
 {
 CPUState *cpu = ENV_GET_CPU(env);
 TaskState *ts = cpu->opaque;
-struct emulated_sigtable *k;
 
 trace_user_queue_signal(env, sig);
-k = >sigtab[sig - 1];
-
-/* we queue exactly one signal */
-if (k->pending) {
-return 0;
-}
 
-k->info = *info;
-k->pending = 1;
+ts->sync_signal.info = *info;
+ts->sync_signal.pending = sig;
 /* signal that a new signal is pending */
 atomic_set(>signal_pending, 1);
 return 1; /* indicates that the signal was queued */
@@ -530,9 +523,13 @@ static void host_signal_handler(int host_signum, siginfo_t *info,
 void *puc)
 {
 CPUArchState *env = thread_cpu->env_ptr;
+CPUState *cpu = ENV_GET_CPU(env);
+TaskState *ts = cpu->opaque;
+
 int sig;
 target_siginfo_t tinfo;
 ucontext_t *uc = puc;
+struct emulated_sigtable *k;
 
 /* the CPU emulator uses some host signals to detect exceptions,
we forward to it some signals */
@@ -551,20 +548,23 @@ static void host_signal_handler(int host_signum, siginfo_t *info,
 rewind_if_in_safe_syscall(puc);
 
 host_to_target_siginfo_noswap(, info);
-if (queue_signal(env, sig, ) == 1) {
-/* Block host signals until target signal handler entered. We
- * can't block SIGSEGV or SIGBUS while we're executing guest
- * code in case the guest code provokes one in the window between
- * now and it getting out to the main loop. Signals will be
- * unblocked again in process_pending_signals().
- */
-sigfillset(&uc->uc_sigmask);
-sigdelset(&uc->uc_sigmask, SIGSEGV);
-sigdelset(&uc->uc_sigmask, SIGBUS);
+k = &ts->sigtab[sig - 1];
+k->info = tinfo;
+k->pending = sig;
+ts->signal_pending = 1;
+
+/* Block host signals until target signal handler entered. We
+ * can't block SIGSEGV or SIGBUS while we're executing guest
+ * code in case the guest code provokes one in the window between
+ * now and it getting out to the main loop. Signals will be
+ * unblocked again in process_pending_signals().
+ */
+sigfillset(&uc->uc_sigmask);
+sigdelset(&uc->uc_sigmask, SIGSEGV);
+sigdelset(&uc->uc_sigmask, SIGBUS);
 
-/* interrupt the virtual CPU as soon as possible */
-cpu_exit(thread_cpu);
-}
+/* interrupt the virtual CPU as soon as possible */
+cpu_exit(thread_cpu);
 }
 
 /* do_sigaltstack() returns target values and errnos. */
@@ -5761,14 +5761,6 @@ static void handle_pending_signal(CPUArchState *cpu_env, int sig)
 handler = sa->_sa_handler;
 }
 
-if (sig == TARGET_SIGSEGV && sigismember(&ts->signal_mask, SIGSEGV)) {
-/* Guest has blocked SIGSEGV but we got one anyway. Assume this
- * is a forced SIGSEGV (ie one the kernel handles via force_sig_info
- * because it 

[Qemu-devel] [PATCH v2 01/19] linux-user: Factor out handle_signal code from process_pending_signals()

2016-05-27 Thread Peter Maydell
Factor out the code to handle a single signal from the
process_pending_signals() function. The use of goto for flow control
is OK currently, but would get significantly uglier if extended to
allow running the handle_signal code multiple times.

Signed-off-by: Peter Maydell 
---
 linux-user/signal.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 8090b4d..a9ac491 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -5765,33 +5765,40 @@ long do_rt_sigreturn(CPUArchState *env)
 
 #endif
 
+static void handle_pending_signal(CPUArchState *cpu_env, int sig);
+
 void process_pending_signals(CPUArchState *cpu_env)
 {
 CPUState *cpu = ENV_GET_CPU(cpu_env);
 int sig;
-abi_ulong handler;
-sigset_t set, old_set;
-target_sigset_t target_old_set;
-struct emulated_sigtable *k;
-struct target_sigaction *sa;
-struct sigqueue *q;
 TaskState *ts = cpu->opaque;
 
 if (!ts->signal_pending)
 return;
 
 /* FIXME: This is not threadsafe.  */
-k = ts->sigtab;
 for(sig = 1; sig <= TARGET_NSIG; sig++) {
-if (k->pending)
-goto handle_signal;
-k++;
+if (ts->sigtab[sig - 1].pending) {
+handle_pending_signal(cpu_env, sig);
+return;
+}
 }
 /* if no signal is pending, just return */
 ts->signal_pending = 0;
 return;
+}
+
+static void handle_pending_signal(CPUArchState *cpu_env, int sig)
+{
+CPUState *cpu = ENV_GET_CPU(cpu_env);
+abi_ulong handler;
+sigset_t set, old_set;
+target_sigset_t target_old_set;
+struct target_sigaction *sa;
+struct sigqueue *q;
+TaskState *ts = cpu->opaque;
+struct emulated_sigtable *k = &ts->sigtab[sig - 1];
 
- handle_signal:
 trace_user_handle_signal(cpu_env, sig);
 /* dequeue signal */
 q = k->first;
-- 
1.9.1




Re: [Qemu-devel] [RFC v2 11/11] tcg: enable thread-per-vCPU

2016-05-27 Thread Paolo Bonzini


On 27/05/2016 15:57, Sergey Fedorov wrote:
>  1. Make 'cpu->thread_kicked' access atomic
>  2. Remove global 'exit_request' and use per-CPU 'exit_request'
>  3. Change how 'current_cpu' is set
>  4. Reorganize round-robin CPU TCG thread function
>  5. Enable 'mmap_lock' for system mode emulation (do we really want this?)

No, I don't think so.

>  6. Enable 'tb_lock' for system mode emulation
>  7. Introduce per-CPU TCG thread function

At least 2/3/7 must be done at the same time, but I agree that this
patch could use some splitting. :)

Thanks,

Paolo



[Qemu-devel] [PATCH v2 14/19] linux-user: Restart exit() if signal pending

2016-05-27 Thread Peter Maydell
From: Timothy E Baldwin 

Without this a signal could vanish on thread exit.

Signed-off-by: Timothy Edward Baldwin 
Message-id: 1441497448-32489-26-git-send-email-t.e.baldwi...@members.leeds.ac.uk
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 linux-user/syscall.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 3fc9c8a..cb5d519 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -5999,8 +5999,12 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
However in threaded applictions it is used for thread termination,
and _exit_group is used for application termination.
Do thread termination if we have more then one thread.  */
-/* FIXME: This probably breaks if a signal arrives.  We should probably
-   be disabling signals.  */
+
+if (block_signals()) {
+ret = -TARGET_ERESTARTSYS;
+break;
+}
+
 if (CPU_NEXT(first_cpu)) {
 TaskState *ts;
 
-- 
1.9.1




[Qemu-devel] [PATCH v2 13/19] linux-user: pause() should not pause if signal pending

2016-05-27 Thread Peter Maydell
From: Timothy E Baldwin 

Fix races between signal handling and the pause syscall by
reimplementing it using block_signals() and sigsuspend().
(Using safe_syscall(pause) would also work, except that the
pause syscall doesn't exist on all architectures.)

Signed-off-by: Timothy Edward Baldwin 
Message-id: 1441497448-32489-28-git-send-email-t.e.baldwi...@members.leeds.ac.uk
[PMM: tweaked commit message]
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 linux-user/syscall.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 5a34642..3fc9c8a 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6418,7 +6418,10 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 #ifdef TARGET_NR_pause /* not on alpha */
 case TARGET_NR_pause:
-ret = get_errno(pause());
+if (!block_signals()) {
+sigsuspend(&((TaskState *)cpu->opaque)->signal_mask);
+}
+ret = -TARGET_EINTR;
 break;
 #endif
 #ifdef TARGET_NR_utime
-- 
1.9.1
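
As a rough illustration of the pause() change above, here is a minimal standalone sketch, not the QEMU code itself. The flag `signal_pending_sketch` and both function names are hypothetical stand-ins for `ts->signal_pending`, `block_signals()` and the `TARGET_NR_pause` case, and it returns the host `EINTR` value where the patch returns `-TARGET_EINTR`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for sigprocmask()/sigsuspend() under strict -std=c11 */
#endif
#include <errno.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical flag; the patch uses ts->signal_pending for this role. */
static atomic_int signal_pending_sketch;

/* Block all host signals and report whether a guest signal was already
 * queued before we got here. */
static int block_signals_sketch(void)
{
    sigset_t all;

    sigfillset(&all);
    sigprocmask(SIG_SETMASK, &all, NULL);
    return atomic_exchange(&signal_pending_sketch, 1);
}

/* pause() emulation: sleep only if nothing is pending. sigsuspend()
 * installs the guest's mask and waits in one atomic step, so a signal
 * cannot slip in between the check and the sleep. */
static int pause_sketch(const sigset_t *guest_mask)
{
    if (!block_signals_sketch()) {
        sigsuspend(guest_mask);
    }
    return -EINTR;
}
```

With a signal already marked pending, `pause_sketch()` returns immediately instead of sleeping, which is the behaviour the subject line asks for.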




[Qemu-devel] [PATCH v2 16/19] linux-user: Restart fork() if signals pending

2016-05-27 Thread Peter Maydell
From: Timothy E Baldwin 

If there is a signal pending during fork() the signal handler will
erroneously be called in both the parent and child, so handle any
pending signals first.

Signed-off-by: Timothy Edward Baldwin 
Message-id: 1441497448-32489-20-git-send-email-t.e.baldwi...@members.leeds.ac.uk
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 linux-user/syscall.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 549f571..7d5f123 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -4796,6 +4796,11 @@ static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp,
 if ((flags & ~(CSIGNAL | CLONE_NPTL_FLAGS2)) != 0) {
 return -TARGET_EINVAL;
 }
+
+if (block_signals()) {
+return -TARGET_ERESTARTSYS;
+}
+
 fork_start();
 ret = fork();
 if (ret == 0) {
-- 
1.9.1
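
The restart idea used here (and in the exit() patch) can be sketched outside QEMU as below. This is only a model under stated assumptions: `guest_sketch`, the three functions and `SKETCH_ERESTARTSYS` are hypothetical names, and the "signal delivery" is a counter rather than a real handler invocation. The value 512 mirrors the Linux kernel's internal ERESTARTSYS.

```c
/* Hypothetical stand-in for -TARGET_ERESTARTSYS. */
#define SKETCH_ERESTARTSYS 512

struct guest_sketch {
    int signal_pending;  /* nonzero if a guest signal is queued */
    int handlers_run;    /* how many guest handlers have been invoked */
    int forks_done;      /* how many times the fork body actually ran */
};

/* Syscall body: refuse to start while a signal is pending. */
static long do_fork_sketch(struct guest_sketch *g)
{
    if (g->signal_pending) {
        return -SKETCH_ERESTARTSYS;
    }
    g->forks_done++;
    return 0;
}

/* Stand-in for the main loop delivering queued signals. */
static void deliver_pending_sketch(struct guest_sketch *g)
{
    if (g->signal_pending) {
        g->signal_pending = 0;
        g->handlers_run++;
    }
}

/* Main-loop view: deliver signals, then re-issue the same syscall. */
static long run_syscall_sketch(struct guest_sketch *g)
{
    long ret;

    while ((ret = do_fork_sketch(g)) == -SKETCH_ERESTARTSYS) {
        deliver_pending_sketch(g);
    }
    return ret;
}
```

The point of the ordering is that the handler runs exactly once, before the fork, so it cannot be observed in both parent and child.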




[Qemu-devel] [PATCH v2 18/19] linux-user: Avoid possible misalignment in host_to_target_siginfo()

2016-05-27 Thread Peter Maydell
host_to_target_siginfo() is implemented by a combination of
host_to_target_siginfo_noswap() followed by tswap_siginfo().
The first of these two functions assumes that the target_siginfo_t
it is writing to is correctly aligned, but the pointer passed
into host_to_target_siginfo() is directly from the guest and
might be misaligned. Use a local variable to avoid this problem.
(tswap_siginfo() does now correctly handle a misaligned destination.)

Signed-off-by: Peter Maydell 
---
 linux-user/signal.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 8ea0cbf..7e2a80f 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -400,8 +400,9 @@ static void tswap_siginfo(target_siginfo_t *tinfo,
 
 void host_to_target_siginfo(target_siginfo_t *tinfo, const siginfo_t *info)
 {
-host_to_target_siginfo_noswap(tinfo, info);
-tswap_siginfo(tinfo, tinfo);
+target_siginfo_t tgt_tmp;
+host_to_target_siginfo_noswap(&tgt_tmp, info);
+tswap_siginfo(tinfo, &tgt_tmp);
 }
 
 /* XXX: we support only POSIX RT signals are used. */
-- 
1.9.1
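
The "fill an aligned local, then copy out" approach described in this commit message can be illustrated with a small standalone sketch. The struct and function names are hypothetical, and `memcpy()` stands in for the alignment-safe store that the patch gets from tswap_siginfo().

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, much reduced stand-in for target_siginfo_t. */
struct info_sketch {
    int32_t signo;
    int32_t code;
    int64_t value;
};

/* Assumes 'out' is correctly aligned, like host_to_target_siginfo_noswap(). */
static void fill_info_sketch(struct info_sketch *out, int32_t signo, int32_t code)
{
    out->signo = signo;
    out->code = code;
    out->value = 0;
}

/* 'guest_dst' may be misaligned, so never write through it as a struct
 * pointer: build the value in an aligned local and copy the bytes. */
static void store_info_sketch(void *guest_dst, int32_t signo, int32_t code)
{
    struct info_sketch tmp;

    fill_info_sketch(&tmp, signo, code);
    memcpy(guest_dst, &tmp, sizeof(tmp));
}
```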




[Qemu-devel] [PATCH v2 00/19] linux-user: fix various signal race conditions

2016-05-27 Thread Peter Maydell

This patchset overhauls the linux-user signal handling code to
fix a number of race conditions. It is essentially a v2 of
Timothy Baldwin's original patchset, though I have addressed
code review issues, refactored it a little, fixed the occasional
minor bug and added some patches of my own for other issues I
spotted along the way.

The meat of the patchset is splitting out the guest thread's idea
of its signal mask from the host thread's actual signal mask. This
allows us to temporarily block all host signals as a method for
fixing some races:

 * block signals in host signal handler until we have processed
   the signal queue to deliver the guest signal (fixes a race
   where a second host signal could arrive and we would deliver
   it even if the guest handler's signal mask should prevent it)
 * block signals while we are manipulating QEMU data structures which
   the host signal handler reads (eg in sigaction syscall emulation)
 * block signals to fix races between signals and noninterruptible
   syscalls like pause, which we could in theory do with safe_syscall
   but which would be a pain to do that way because of variations
   in whether syscalls exist on different host architectures
 * block signals to fix races for complicated syscalls like fork
   which would be too painful to handle by trying to roll back
   if something was interrupted partway through
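
To make the guest-mask/host-mask split concrete, here is a minimal standalone sketch of the idea, assuming a single thread and C11 atomics. `task_sketch` and the three functions are hypothetical names chosen for illustration; the series itself keeps this state in TaskState and uses the qemu atomics helpers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for sigprocmask() under strict -std=c11 */
#endif
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-thread state. */
struct task_sketch {
    sigset_t guest_mask;        /* the mask the guest asked for */
    atomic_int signal_pending;  /* set by the host handler or when blocking */
};

/* Block every host signal. Returns nonzero if a signal was already
 * pending, in which case the caller should back out and restart. */
static int block_signals_sketch(struct task_sketch *ts)
{
    sigset_t all;

    sigfillset(&all);
    sigprocmask(SIG_SETMASK, &all, NULL);
    return atomic_exchange(&ts->signal_pending, 1);
}

/* Guest sigprocmask emulation while blocked: only the guest's idea of
 * the mask changes, the host mask stays fully blocked. */
static void set_guest_mask_sketch(struct task_sketch *ts, const sigset_t *newmask)
{
    ts->guest_mask = *newmask;
}

/* Once the pending queue has been drained, make the host mask match
 * the guest mask again. */
static void unblock_signals_sketch(struct task_sketch *ts)
{
    atomic_store(&ts->signal_pending, 0);
    sigprocmask(SIG_SETMASK, &ts->guest_mask, NULL);
}
```

Because the guest mask is only a recorded value while signals are blocked, the emulation code can update it without racing against the host signal handler.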

We also:
 * remove a lot of code that is made redundant by processing
   default signal actions in one place rather than two
 * make sure that synchronous signals correctly take priority
   over asynchronous signals
 * use safe_syscall for sigsuspend
 * use safe_syscall for kill/tkill/tgkill
 * make a better guess at which bits of the union in siginfo_t
   need to be converted by looking at si_code as well as si_signo
 * use __get_user and __put_user for siginfo conversion to avoid
   potential problems with misaligned guest addresses
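
As a rough illustration of the "synchronous signals take priority" point, the selection logic might look like the sketch below. The types, the table size and the function name are hypothetical; the real series uses `ts->sync_signal` and `ts->sigtab[]` in TaskState.

```c
#define NSIG_SKETCH 64  /* hypothetical table size */

struct pending_slot {
    int pending;  /* 0 if empty, otherwise the signal number */
};

struct sigstate_sketch {
    struct pending_slot sync_signal;          /* e.g. a fault raised by guest code */
    struct pending_slot sigtab[NSIG_SKETCH];  /* asynchronous signals from the host */
};

/* Return the next signal to deliver and clear its slot, or 0 if none.
 * The synchronous slot is always checked first. */
static int next_signal_sketch(struct sigstate_sketch *s)
{
    int sig = s->sync_signal.pending;

    if (sig) {
        s->sync_signal.pending = 0;
        return sig;
    }
    for (sig = 1; sig <= NSIG_SKETCH; sig++) {
        if (s->sigtab[sig - 1].pending) {
            s->sigtab[sig - 1].pending = 0;
            return sig;
        }
    }
    return 0;
}
```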

Changes since Timothy's v1 patchset:
 * some patches at the front to factor out handle_pending_signal()
   to avoid using goto for flow control logic
 * new function set_sigmask() for setting signal mask when we have
   already blocked signals -- this allows us to define calling block_signals()
   twice to be illegal, which then means we can have signal_pending be a
   simple flag rather than a word with two flag bits in it
 * use the qemu atomics.h functions rather than raw volatile variable
   (it makes it clearer that the variable has to be handled with care IMHO)
 * bunch of extra commentary
 * add code to stop sigprocmask being able to mark SIGKILL, SIGSTOP as blocked
 * fixed handling of ssetmask
 * new patches to better handle conversion of siginfo_t structures
   (these fix problems with LTP tests like kill10 which try to kill
   processes by sending them an asynchronous SIGSEGV and expect to
   see the si_pid field in the resulting siginfo in the recipient.)

With all of these fixes plus the safe_syscall patches now in
master, the following LTP test cases which used to hang now do not:

 mq_timedreceive01 mq_timedsend01 kill10 kill11 msgrcv03
 nanosleep04 splice02 waitpid02 inotify06 pselect02 pselect02_64

(Not all of these were signal related issues, and a few might have
been fixed some time back.)

Next on my todo list after this is to expand the safe_syscall
support to more host architectures. There are also a few more
bugs lurking I suspect.

thanks
-- PMM


Peter Maydell (11):
  linux-user: Factor out handle_signal code from
process_pending_signals()
  linux-user: Move handle_pending_signal() to avoid need for declaration
  linux-user: Fix stray tab-indent
  linux-user: Factor out uses of do_sigprocmask() from sigreturn code
  linux-user: Define macro for size of host kernel sigset_t
  linux-user: Use safe_syscall for sigsuspend syscalls
  linux-user: Fix race between multiple signals
  linux-user: Use safe_syscall for kill, tkill and tgkill syscalls
  linux-user: Use both si_code and si_signo when converting siginfo_t
  linux-user: Avoid possible misalignment in host_to_target_siginfo()
  linux-user: Avoid possible misalignment in target_to_host_siginfo()

Timothy E Baldwin (8):
  linux-user: Remove redundant default action check in queue_signal()
  linux-user: Remove redundant gdb_queuesig()
  linux-user: Remove real-time signal queuing
  linux-user: Queue synchronous signals separately
  linux-user: Block signals during sigaction() handling
  linux-user: pause() should not pause if signal pending
  linux-user: Restart exit() if signal pending
  linux-user: Restart fork() if signals pending

 gdbstub.c |  13 --
 include/exec/gdbstub.h|   1 -
 linux-user/main.c |   7 -
 linux-user/qemu.h |  62 -
 linux-user/signal.c   | 572 ++
 linux-user/syscall.c  | 124 ++
 linux-user/syscall_defs.h |  15 ++
 7 files changed, 476 insertions(+), 318 deletions(-)

-- 
1.9.1




[Qemu-devel] [PATCH v6 11/15] docker: Add mingw test

2016-05-27 Thread Fam Zheng
Reviewed-by: Alex Bennée 
Signed-off-by: Fam Zheng 
---
 tests/docker/test-mingw | 34 ++
 1 file changed, 34 insertions(+)
 create mode 100755 tests/docker/test-mingw

diff --git a/tests/docker/test-mingw b/tests/docker/test-mingw
new file mode 100755
index 000..c03757a
--- /dev/null
+++ b/tests/docker/test-mingw
@@ -0,0 +1,34 @@
+#!/bin/bash -e
+#
+# Cross compile QEMU with mingw toolchain on Linux.
+#
+# Copyright (c) 2016 Red Hat Inc.
+#
+# Authors:
+#  Fam Zheng 
+#
+# This work is licensed under the terms of the GNU GPL, version 2
+# or (at your option) any later version. See the COPYING file in
+# the top-level directory.
+
+. common.rc
+
+requires mingw dtc
+
+for prefix in x86_64-w64-mingw32- i686-w64-mingw32-; do
+TARGET_LIST=x86_64-softmmu,aarch64-softmmu \
+build_qemu --cross-prefix=$prefix \
+--enable-trace-backends=simple \
+--enable-debug \
+--enable-gnutls \
+--enable-nettle \
+--enable-curl \
+--enable-vnc \
+--enable-bzip2 \
+--enable-guest-agent \
+--with-sdlabi=1.2 \
+--with-gtkabi=2.0
+make clean
+
+done
+
-- 
2.8.3




Re: [Qemu-devel] [PATCH v2 10/12] tcg/tci: Add support for fence

2016-05-27 Thread Pranith Kumar

Hi Sergey,

Sergey Fedorov writes:

> On 27/05/16 04:00, Richard Henderson wrote:
>> diff --git a/tci.c b/tci.c
>> index b488c0d..53b3f71 100644
>> --- a/tci.c
>> +++ b/tci.c
>> @@ -1236,6 +1236,9 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
>>  tcg_abort();
>>  }
>>  break;
>> +case INDEX_op_fence:
>> +smp_mb();
>> +break;
>>  default:
>>  TODO();
>>  break;
>
> A bit of bike-shedding. While there's no common ISA term for "memory
> barrier" (also known as a "membar", "memory fence", etc.), we already
> refer to it as a "memory barrier" (or "mb") in include/qemu/atomic.h and
> docs/atomics.txt. Why don't be consistent and avoid introducing yet
> another term for the same thing?
>

Fair point. Do you think tcg_out_mb() is better then?

Thanks,
-- 
Pranith



Re: [Qemu-devel] [PULL 00/31] Misc changes for 2016-05-27

2016-05-27 Thread Richard W.M. Jones
On Fri, May 27, 2016 at 04:06:07PM +0200, Paolo Bonzini wrote:
> 
> 
> On 27/05/2016 15:38, Richard W.M. Jones wrote:
> > One way to solve this (which works for me) is as below.  There are
> > some other approaches, eg. using -fno-leading-underscore, or using a
> > conditional macro to mangle the name.  However I have no idea if there
> > is some preferred way.
> > 
> > Rich.
> > 
> > diff --git a/pc-bios/optionrom/linuxboot_dma.c b/pc-bios/optionrom/linuxboot_dma.c
> > index 86ef1ce..8509b28 100644
> > --- a/pc-bios/optionrom/linuxboot_dma.c
> > +++ b/pc-bios/optionrom/linuxboot_dma.c
> > @@ -213,6 +213,9 @@ static uint32_t get_e801_addr(void)
> >  return ret;
> >  }
> >  
> > +/* Force the asm name without leading underscore, even on Win32. */
> > +extern void load_kernel(void) asm("load_kernel");
> > +
> >  void load_kernel(void)
> >  {
> >  void *setup_addr;
> 
> Yes, that's what I wanted to do.  I also need
> 
> diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile
> index 2b11cd3..14e7f71 100644
> --- a/pc-bios/optionrom/Makefile
> +++ b/pc-bios/optionrom/Makefile
> @@ -31,6 +31,7 @@ build-all: multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin
> 
>  ifdef CONFIG_WIN32
>  LD_EMULATION = i386pe
> +CFLAGS += -Wa,-32
>  else
>  LD_EMULATION = elf_i386
>  endif
> 
> to work around what is likely a GCC bug.

OK I don't need that with mingw32-gcc-6.1.0-1.fc24.x86_64, but on the
other hand it doesn't negatively affect builds for me either.  Next
version coming up soon.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v



Re: [Qemu-devel] [RFC v2 11/11] tcg: enable thread-per-vCPU

2016-05-27 Thread Sergey Fedorov
On 05/04/16 18:32, Alex Bennée wrote:
> From: KONRAD Frederic 
>
> This allows the user to switch on multi-thread behaviour and spawn a
> thread per-vCPU. For a simple test like:
>
>   ./arm/run ./arm/locking-test.flat -smp 4 -tcg mttcg=on
>
> Will now use 4 vCPU threads and have an expected FAIL (instead of the
> unexpected PASS) as the default mode of the test has no protection when
> incrementing a shared variable.
>
> However we still default to a single thread for all vCPUs as individual
> front-end and back-ends need additional fixes to safely support:
>   - atomic behaviour
>   - tb invalidation
>   - memory ordering
>
> The function default_mttcg_enabled can be tweaked as support is added.
>
> As assumptions about tcg_current_cpu are no longer relevant to the
> single-threaded kick routine we need to save the current information
> somewhere else for the timer.

It was hard to read this patch the first time through. It seems to pursue
multiple goals at the same time and needs to be split into at least the
following parts:
 1. Make 'cpu->thread_kicked' access atomic
 2. Remove global 'exit_request' and use per-CPU 'exit_request'
 3. Change how 'current_cpu' is set
 4. Reorganize round-robin CPU TCG thread function
 5. Enable 'mmap_lock' for system mode emulation (do we really want this?)
 6. Enable 'tb_lock' for system mode emulation
 7. Introduce per-CPU TCG thread function

It would be easier and faster to review these changes as separate
patches. I have tried to mark each individual change below with the
numbers from the list above.
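
For item 1, a minimal sketch of what an atomic 'thread_kicked' could look like, using C11 atomics rather than the qemu atomic.h helpers. The struct and function names are hypothetical, and the counter stands in for actually signalling the vCPU thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical cut-down vCPU state. */
struct vcpu_sketch {
    atomic_bool thread_kicked;
    int kicks_sent;  /* stands in for sending a signal to the vCPU thread */
};

/* Kick the vCPU at most once until the kick has been handled.
 * atomic_exchange() makes the test-and-set a single step, so two
 * threads kicking concurrently cannot both send the signal. */
static bool kick_vcpu_sketch(struct vcpu_sketch *cpu)
{
    if (atomic_exchange(&cpu->thread_kicked, true)) {
        return false;  /* somebody else already kicked it */
    }
    cpu->kicks_sent++;
    return true;
}

/* Called by the vCPU thread once it has noticed the kick. */
static void kick_handled_sketch(struct vcpu_sketch *cpu)
{
    atomic_store(&cpu->thread_kicked, false);
}
```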
 
>
> Signed-off-by: KONRAD Frederic 
> Signed-off-by: Paolo Bonzini 
> [AJB: Some fixes, conditionally, commit rewording]
> Signed-off-by: Alex Bennée 
>
> ---
> v1 (ajb):
>   - fix merge conflicts
>   - maintain single-thread approach
> v2
>   - re-base fixes (no longer has tb_find_fast lock tweak ahead)
>   - remove bogus break condition on cpu->stop/stopped
>   - only process exiting cpus exit_request
>   - handle all cpus idle case (fixes shutdown issues)
>   - sleep on EXCP_HALTED in mttcg mode (prevent crash on start-up)
>   - move icount timer into helper
> ---
>  cpu-exec-common.c   |   1 -
>  cpu-exec.c  |  15 
>  cpus.c  | 216 +---
>  include/exec/exec-all.h |   4 -
>  translate-all.c |   8 --
>  5 files changed, 150 insertions(+), 94 deletions(-)
>
> diff --git a/cpu-exec-common.c b/cpu-exec-common.c
> index 1b1731c..3d7eaa3 100644
> --- a/cpu-exec-common.c
> +++ b/cpu-exec-common.c
> @@ -23,7 +23,6 @@
>  #include "exec/memory-internal.h"
>  
>  bool exit_request;
> -CPUState *tcg_current_cpu;

This relates to item 2. There are no more references to this global
'exit_request'; let's just remove it.

>  
>  /* exit the current TB from a signal handler. The host registers are
> restored in a state compatible with the CPU emulator
> diff --git a/cpu-exec.c b/cpu-exec.c
> index f558508..42cec05 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -292,7 +292,6 @@ static TranslationBlock *tb_find_slow(CPUState *cpu,
>  goto found;
>  }
>  
> -#ifdef CONFIG_USER_ONLY

5, 6.

>  /* mmap_lock is needed by tb_gen_code, and mmap_lock must be
>   * taken outside tb_lock.  Since we're momentarily dropping
>   * tb_lock, there's a chance that our desired tb has been
> @@ -306,15 +305,12 @@ static TranslationBlock *tb_find_slow(CPUState *cpu,
>  mmap_unlock();
>  goto found;
>  }
> -#endif

5, 6.

>  
>  /* if no translated code available, then translate it now */
>  cpu->tb_invalidated_flag = false;
>  tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
>  
> -#ifdef CONFIG_USER_ONLY
>  mmap_unlock();
> -#endif

5.

>  
>  found:
>  /* we add the TB in the virtual pc hash table */
> @@ -388,13 +384,8 @@ int cpu_exec(CPUState *cpu)
>  cpu->halted = 0;
>  }
>  
> -atomic_mb_set(&tcg_current_cpu, cpu);
>  rcu_read_lock();
>  
> -if (unlikely(atomic_mb_read(&exit_request))) {
> -cpu->exit_request = 1;
> -}
> -

2.

>  cc->cpu_exec_enter(cpu);
>  
>  /* Calculate difference between guest clock and host clock.
> @@ -515,7 +506,6 @@ int cpu_exec(CPUState *cpu)
>  }
>  if (unlikely(cpu->exit_request
>   || replay_has_interrupt())) {
> -cpu->exit_request = 0;

2.

>  cpu->exception_index = EXCP_INTERRUPT;
>  cpu_loop_exit(cpu);
>  }
> @@ -629,10 +619,5 @@ int cpu_exec(CPUState *cpu)
>  cc->cpu_exec_exit(cpu);
>  rcu_read_unlock();
>  
> -/* fail safe : never use current_cpu outside cpu_exec() */
> -current_cpu = NULL;

We still want to reset 'current_cpu' when leaving cpu_exec(), don't we?

> -
> -/* Does not need atomic_mb_set because a spurious wakeup is okay.  

Re: [Qemu-devel] [PATCH v2 01/12] Introduce TCGOpcode for fence instruction

2016-05-27 Thread Pranith Kumar

Sergey Fedorov writes:

> On 27/05/16 04:00, Richard Henderson wrote:
>> diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
>> index 6d0410c..b772d90 100644
>> --- a/tcg/tcg-opc.h
>> +++ b/tcg/tcg-opc.h
>> @@ -42,6 +42,8 @@ DEF(br, 0, 0, 1, TCG_OPF_BB_END)
>>  # define IMPL64  TCG_OPF_64BIT
>>  #endif
>>  
>> +DEF(fence, 0, 0, 0, TCG_OPF_SIDE_EFFECTS)
>> +
>
> I still think this TCG op needs to have a constant argument of a barrier
> type. So that we can distinguish between full, read and write memory
> barriers.
>

Yes, I have a version with this fixed. I will post my patches (v3) with this
changed.

Thanks,
-- 
Pranith
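
To illustrate the suggestion of giving the op a barrier-type argument, here is a small standalone sketch of how an interpreter might dispatch on it. The enum and function names are hypothetical and are not the TCG API; mapping read and write barriers to C11 acquire and release fences is an approximation chosen for the example.

```c
#include <stdatomic.h>

/* Hypothetical barrier kinds carried as a constant op argument. */
enum mb_type_sketch {
    MB_SKETCH_FULL,
    MB_SKETCH_READ,
    MB_SKETCH_WRITE
};

/* Pick the weakest C11 fence that still covers the requested kind. */
static memory_order mb_order_sketch(enum mb_type_sketch type)
{
    switch (type) {
    case MB_SKETCH_READ:
        return memory_order_acquire;
    case MB_SKETCH_WRITE:
        return memory_order_release;
    default:
        return memory_order_seq_cst;
    }
}

/* What an interpreter case for the op could do with its argument. */
static void exec_mb_sketch(enum mb_type_sketch type)
{
    atomic_thread_fence(mb_order_sketch(type));
}
```

A backend that only has a full barrier can ignore the argument and always emit the strongest form, so carrying the type costs nothing for simple hosts.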



Re: [Qemu-devel] [PATCH v2 10/12] tcg/tci: Add support for fence

2016-05-27 Thread Pranith Kumar
On Fri, May 27, 2016 at 10:20 AM, Sergey Fedorov  wrote:
>>>> +case INDEX_op_fence:
>>>> +smp_mb();
>>>> +break;
>>>>  default:
>>>>  TODO();
>>>>  break;
>>> A bit of bike-shedding. While there's no common ISA term for "memory
>>> barrier" (also known as a "membar", "memory fence", etc.), we already
>>> refer to it as a "memory barrier" (or "mb") in include/qemu/atomic.h and
>>> docs/atomics.txt. Why don't be consistent and avoid introducing yet
>>> another term for the same thing?
>>>
>> Fair point. Do you think tcg_out_mb() is better then?
>
> Yes, if used together with 'INDEX_op_mb', of course.
>

OK. I'll make the change. Thanks for the feedback!

-- 
Pranith



[Qemu-devel] [PULL 01/31] Add optionrom compatible with fw_cfg DMA version

2016-05-27 Thread Paolo Bonzini
From: Marc Marí 

This optionrom is based on linuxboot.S.

Signed-off-by: Marc Marí 
Signed-off-by: Richard W.M. Jones 
Message-Id: <1464027093-24073-2-git-send-email-rjo...@redhat.com>
[Add -fno-toplevel-reorder and fix Win32 compilation. - Paolo]
Signed-off-by: Paolo Bonzini 
---
 .gitignore|   4 +
 Makefile  |   2 +-
 configure |  20 +++
 hw/i386/pc.c  |  10 +-
 hw/nvram/fw_cfg.c |   2 +-
 include/hw/nvram/fw_cfg.h |   1 +
 pc-bios/optionrom/Makefile|  20 ++-
 pc-bios/optionrom/code16gcc.h |   3 +
 pc-bios/optionrom/linuxboot_dma.c | 292 ++
 9 files changed, 348 insertions(+), 6 deletions(-)
 create mode 100644 pc-bios/optionrom/code16gcc.h
 create mode 100644 pc-bios/optionrom/linuxboot_dma.c

diff --git a/.gitignore b/.gitignore
index 88a80ff..101d1e0 100644
--- a/.gitignore
+++ b/.gitignore
@@ -94,6 +94,10 @@
 /pc-bios/optionrom/linuxboot.bin
 /pc-bios/optionrom/linuxboot.raw
 /pc-bios/optionrom/linuxboot.img
+/pc-bios/optionrom/linuxboot_dma.asm
+/pc-bios/optionrom/linuxboot_dma.bin
+/pc-bios/optionrom/linuxboot_dma.raw
+/pc-bios/optionrom/linuxboot_dma.img
 /pc-bios/optionrom/multiboot.asm
 /pc-bios/optionrom/multiboot.bin
 /pc-bios/optionrom/multiboot.raw
diff --git a/Makefile b/Makefile
index a5d7e62..3a9782e 100644
--- a/Makefile
+++ b/Makefile
@@ -400,7 +400,7 @@ efi-e1000.rom efi-eepro100.rom efi-ne2k_pci.rom \
 efi-pcnet.rom efi-rtl8139.rom efi-virtio.rom \
 qemu-icon.bmp qemu_logo_no_text.svg \
 bamboo.dtb petalogix-s3adsp1800.dtb petalogix-ml605.dtb \
-multiboot.bin linuxboot.bin kvmvapic.bin \
+multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin \
 s390-ccw.img \
 spapr-rtas.bin slof.bin \
 palcode-clipper \
diff --git a/configure b/configure
index b5aab72..6d4cbbd 100755
--- a/configure
+++ b/configure
@@ -237,6 +237,7 @@ fortify_source=""
 strip_opt="yes"
 tcg_interpreter="no"
 bigendian="no"
+compiler_m16="no"
 mingw32="no"
 gcov="no"
 gcov_tool="gcov"
@@ -1524,6 +1525,21 @@ if test "$static" = "yes" ; then
   fi
 fi
 
+# Check if the compiler supports -m16 to generate i8086 binaries.
+#
+# GCC < 4.9 didn't, so we have to work around that when building the
+# linuxboot_dma option ROM.  When GCC < 4.9 is considered sufficiently
+# old that we no longer care about it, we can remove this section and
+# CONFIG_COMPILER_M16 which will simplify the build.
+if [ "$cpu" = "i386" -o "$cpu" = "x86_64" ] ; then
+  cat > $TMPC << EOF
+int main(void) { return 0; }
+EOF
+  if compile_prog "-m16" "" ; then
+compiler_m16=yes
+  fi
+fi
+
 # Unconditional check for compiler __thread support
   cat > $TMPC << EOF
 static __thread int tls_var;
@@ -4780,6 +4796,7 @@ fi
 echo "module support$modules"
 echo "host CPU  $cpu"
 echo "host big endian   $bigendian"
+echo "compiler has -m16 $compiler_m16"
 echo "target list   $target_list"
 echo "tcg debug enabled $debug_tcg"
 echo "gprof enabled $gprof"
@@ -4928,6 +4945,9 @@ fi
 if test "$bigendian" = "yes" ; then
   echo "HOST_WORDS_BIGENDIAN=y" >> $config_host_mak
 fi
+if test "$compiler_m16" = "yes" ; then
+  echo "CONFIG_COMPILER_M16=y" >> $config_host_mak
+fi
 if test "$mingw32" = "yes" ; then
   echo "CONFIG_WIN32=y" >> $config_host_mak
   rc_version=`cat $source_path/VERSION`
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index e29ccc8..2ab7b42 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1000,8 +1000,13 @@ static void load_linux(PCMachineState *pcms,
 fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
 
-option_rom[nb_option_roms].name = "linuxboot.bin";
-option_rom[nb_option_roms].bootindex = 0;
+if (fw_cfg_dma_enabled(fw_cfg)) {
+option_rom[nb_option_roms].name = "linuxboot_dma.bin";
+option_rom[nb_option_roms].bootindex = 0;
+} else {
+option_rom[nb_option_roms].name = "linuxboot.bin";
+option_rom[nb_option_roms].bootindex = 0;
+}
 nb_option_roms++;
 }
 
@@ -1264,6 +1269,7 @@ void xen_load_linux(PCMachineState *pcms)
 load_linux(pcms, fw_cfg);
 for (i = 0; i < nb_option_roms; i++) {
 assert(!strcmp(option_rom[i].name, "linuxboot.bin") ||
+   !strcmp(option_rom[i].name, "linuxboot_dma.bin") ||
!strcmp(option_rom[i].name, "multiboot.bin"));
 rom_add_option(option_rom[i].name, option_rom[i].bootindex);
 }
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index cdbdfb5..6ac486e 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -552,7 +552,7 @@ static bool is_version_1(void *opaque, int version_id)
 return version_id == 1;
 }
 
-static bool fw_cfg_dma_enabled(void *opaque)
+bool fw_cfg_dma_enabled(void *opaque)
 {
 FWCfgState *s = opaque;
 
diff --git a/include/hw/nvram/fw_cfg.h 

[Qemu-devel] [PULL v2 00/31] Misc changes for 2016-05-27

2016-05-27 Thread Paolo Bonzini
The following changes since commit 2c56d06bafd8933d2a9c6e0aeb5d45f7c1fb5616:

  Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging (2016-05-26 14:29:30 +0100)

are available in the git repository at:

  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 103876bc5e1ea67eec386461114df5c953110e34:

  exec: hide mr->ram_addr from qemu_get_ram_ptr users (2016-05-27 16:07:32 +0200)


* docs/atomics fixes and atomic_rcu_* optimization (Emilio)
* NBD bugfix (Eric)
* Memory fixes and cleanups (Paolo, Paul)
* scsi-block support for SCSI status, including persistent
  reservations (Paolo)
* linuxboot support for fw_cfg DMA (Marc, Richard Jones)
* kvm_stat moves to the Linux repository
* SCSI bug fixes (Peter, Prasad)
* Killing qemu_char_get_next_serial, non-ARM parts (Xiaoqiang)


Emilio G. Cota (3):
  docs/atomics: update atomic_read/set comparison with Linux
  atomics: emit an smp_read_barrier_depends() barrier only for Alpha and Thread Sanitizer
  atomics: do not emit consume barrier for atomic_rcu_read

Eric Blake (1):
  nbd: Don't trim unrequested bytes

Fam Zheng (1):
  scsi-generic: Merge block max xfer len in INQUIRY response

Marc Marí (1):
  Add optionrom compatible with fw_cfg DMA version

Paolo Bonzini (13):
  Revert "memory: Drop FlatRange.romd_mode"
  kvm_stat: Remove
  bt: rewrite csrhci_write to avoid out-of-bounds writes
  docs/atomics: update comparison with Linux
  scsi-disk: introduce a common base class
  scsi-disk: introduce dma_readv and dma_writev
  scsi-disk: add need_fua_emulation to SCSIDiskClass
  scsi-disk: introduce scsi_disk_req_check_error
  scsi-block: always use SG_IO
  memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr
  exec: remove ram_addr argument from qemu_ram_block_from_host
  memory: split memory_region_from_host from qemu_ram_addr_from_host
  exec: hide mr->ram_addr from qemu_get_ram_ptr users

Paul Durrant (1):
  xen-hvm: ignore background I/O sections

Peter Lieven (1):
  block/iscsi: avoid potential overflow of acb->task->cdb

Prasad J Pandit (5):
  scsi: pvscsi: check command descriptor ring buffer size (CVE-2016-4952)
  scsi: mptsas: infinite loop while fetching requests
  scsi: megasas: use appropriate property buffer size
  scsi: megasas: initialise local configuration data buffer
  scsi: megasas: check 'read_queue_head' index value

xiaoqiang zhao (5):
  hw/char: QOM'ify escc.c
  hw/char: QOM'ify etraxfs_ser.c
  hw/char: QOM'ify lm32_juart.c
  hw/char: QOM'ify lm32_uart.c
  hw/char: QOM'ify milkymist-uart.c

 .gitignore                        |   4 +
 Makefile                          |  11 +-
 block/iscsi.c                     |   7 +
 configure                         |  20 +
 cputlb.c                          |   3 +-
 docs/atomics.txt                  |  38 +-
 exec.c                            | 110 ++---
 hw/bt/hci-csr.c                   |  67 +++-
 hw/char/escc.c                    |  30 +-
 hw/char/etraxfs_ser.c             |  27 +-
 hw/char/lm32_juart.c              |  17 +-
 hw/char/lm32_uart.c               |  28 +-
 hw/char/milkymist-uart.c          |  10 +-
 hw/cris/axis_dev88.c              |   4 +-
 hw/i386/pc.c                      |  10 +-
 hw/lm32/lm32.h                    |  19 +-
 hw/lm32/lm32_boards.c             |   9 +-
 hw/lm32/milkymist-hw.h            |   4 +-
 hw/lm32/milkymist.c               |   4 +-
 hw/misc/ivshmem.c                 |   5 +-
 hw/nvram/fw_cfg.c                 |   2 +-
 hw/scsi/megasas.c                 |   6 +-
 hw/scsi/mptsas.c                  |   9 +-
 hw/scsi/scsi-disk.c               | 412 +--
 hw/scsi/scsi-generic.c            |  12 +
 hw/scsi/vmw_pvscsi.c              |  24 +-
 hw/virtio/vhost-user.c            |  25 +-
 include/exec/cpu-common.h         |   4 +-
 include/exec/memory.h             |  36 +-
 include/exec/ram_addr.h           |   3 -
 include/hw/cris/etraxfs.h         |  16 +
 include/hw/nvram/fw_cfg.h         |   1 +
 include/qemu/atomic.h             |  25 +-
 memory.c                          |  43 +-
 migration/postcopy-ram.c          |   3 +-
 nbd/server.c                      |  20 +-
 pc-bios/optionrom/Makefile        |  20 +-
 pc-bios/optionrom/code16gcc.h     |   3 +
 pc-bios/optionrom/linuxboot_dma.c | 292 ++
 scripts/dump-guest-memory.py      |  19 +-
 scripts/kvm/kvm_stat              | 825 --
 scripts/kvm/kvm_stat.texi         |  55 ---
 target-i386/kvm.c                 |   6 +-
 xen-hvm.c                         |  14 +-
 44 files changed, 1057 insertions(+), 1245 deletions(-)
 create mode 100644 pc-bios/optionrom/code16gcc.h
 create mode 100644 pc-bios/optionrom/linuxboot_dma.c
 delete 

Re: [Qemu-devel] [PULL 00/31] Misc changes for 2016-05-27

2016-05-27 Thread Paolo Bonzini


On 27/05/2016 15:38, Richard W.M. Jones wrote:
> One way to solve this (which works for me) is as below.  There are
> some other approaches, eg. using -fno-leading-underscore, or using a
> conditional macro to mangle the name.  However I have no idea if there
> is some preferred way.
> 
> Rich.
> 
> diff --git a/pc-bios/optionrom/linuxboot_dma.c b/pc-bios/optionrom/linuxboot_dma.c
> index 86ef1ce..8509b28 100644
> --- a/pc-bios/optionrom/linuxboot_dma.c
> +++ b/pc-bios/optionrom/linuxboot_dma.c
> @@ -213,6 +213,9 @@ static uint32_t get_e801_addr(void)
>  return ret;
>  }
>  
> +/* Force the asm name without leading underscore, even on Win32. */
> +extern void load_kernel(void) asm("load_kernel");
> +
>  void load_kernel(void)
>  {
>  void *setup_addr;

Yes, that's what I wanted to do.  I also need

diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile
index 2b11cd3..14e7f71 100644
--- a/pc-bios/optionrom/Makefile
+++ b/pc-bios/optionrom/Makefile
@@ -31,6 +31,7 @@ build-all: multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin

 ifdef CONFIG_WIN32
 LD_EMULATION = i386pe
+CFLAGS += -Wa,-32
 else
 LD_EMULATION = elf_i386
 endif

to work around what is likely a GCC bug.

Paolo
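
The symbol mismatch behind the "undefined reference to `load_kernel'" error can be illustrated with a toy model. This is a minimal Python 3 sketch, not code from the patch: assembler_name and the two prefix constants are hypothetical names, and the model assumes only that i386 PE targets prefix C symbols with an underscore while an asm("...") label is emitted verbatim.

```python
ELF_PREFIX = ""   # assumption: i386 ELF targets add no prefix to C symbols
PE_PREFIX = "_"   # assumption: i386 PE (Win32) targets add a leading underscore


def assembler_name(c_name, user_label_prefix, asm_label=None):
    """Return the symbol name the compiler would emit for a C function."""
    if asm_label is not None:
        # An explicit asm("...") label is used as-is, with no prefix added.
        return asm_label
    return user_label_prefix + c_name


# Without the label, the Win32 build defines "_load_kernel", so a reference
# to plain "load_kernel" cannot be resolved by the linker.
win32_plain = assembler_name("load_kernel", PE_PREFIX)
# With the label from the patch, both targets agree on the name.
win32_fixed = assembler_name("load_kernel", PE_PREFIX, asm_label="load_kernel")
elf_plain = assembler_name("load_kernel", ELF_PREFIX)
```

In this model the extern declaration with asm("load_kernel") is what makes the Win32 and ELF builds agree on the name.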



Re: [Qemu-devel] [PATCH v2 10/12] tcg/tci: Add support for fence

2016-05-27 Thread Sergey Fedorov
On 27/05/16 17:17, Pranith Kumar wrote:
> Hi Sergey,
>
> Sergey Fedorov writes:
>
>> On 27/05/16 04:00, Richard Henderson wrote:
>>> diff --git a/tci.c b/tci.c
>>> index b488c0d..53b3f71 100644
>>> --- a/tci.c
>>> +++ b/tci.c
>>> @@ -1236,6 +1236,9 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
>>>  tcg_abort();
>>>  }
>>>  break;
>>> +case INDEX_op_fence:
>>> +smp_mb();
>>> +break;
>>>  default:
>>>  TODO();
>>>  break;
>> A bit of bike-shedding. While there's no common ISA term for "memory
>> barrier" (also known as a "membar", "memory fence", etc.), we already
>> refer to it as a "memory barrier" (or "mb") in include/qemu/atomic.h and
>> docs/atomics.txt. Why not be consistent and avoid introducing yet
>> another term for the same thing?
>>
> Fair point. Do you think tcg_out_mb() is better then?

Yes, if used together with 'INDEX_op_mb', of course.

Kind regards,
Sergey



Re: [Qemu-devel] [PULL 01/31] Add optionrom compatible with fw_cfg DMA version

2016-05-27 Thread Richard W.M. Jones
On Fri, May 27, 2016 at 04:09:32PM +0200, Paolo Bonzini wrote:
> From: Marc Marí 
> 
> This optionrom is based on linuxboot.S.
> 
> Signed-off-by: Marc Marí 
> Signed-off-by: Richard W.M. Jones 
> Message-Id: <1464027093-24073-2-git-send-email-rjo...@redhat.com>
> [Add -fno-toplevel-reorder and fix Win32 compilation. - Paolo]
> Signed-off-by: Paolo Bonzini 

This matches my version, minus a comment, and I just test-built it on
GCC (Linux & Win32 cross), and clang, so ...

Acked-by: Richard W.M. Jones 

Rich.

>  .gitignore                        |   4 +
>  Makefile                          |   2 +-
>  configure                         |  20 +++
>  hw/i386/pc.c                      |  10 +-
>  hw/nvram/fw_cfg.c                 |   2 +-
>  include/hw/nvram/fw_cfg.h         |   1 +
>  pc-bios/optionrom/Makefile        |  20 ++-
>  pc-bios/optionrom/code16gcc.h     |   3 +
>  pc-bios/optionrom/linuxboot_dma.c | 292 ++
>  9 files changed, 348 insertions(+), 6 deletions(-)
>  create mode 100644 pc-bios/optionrom/code16gcc.h
>  create mode 100644 pc-bios/optionrom/linuxboot_dma.c
> 
> diff --git a/.gitignore b/.gitignore
> index 88a80ff..101d1e0 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -94,6 +94,10 @@
>  /pc-bios/optionrom/linuxboot.bin
>  /pc-bios/optionrom/linuxboot.raw
>  /pc-bios/optionrom/linuxboot.img
> +/pc-bios/optionrom/linuxboot_dma.asm
> +/pc-bios/optionrom/linuxboot_dma.bin
> +/pc-bios/optionrom/linuxboot_dma.raw
> +/pc-bios/optionrom/linuxboot_dma.img
>  /pc-bios/optionrom/multiboot.asm
>  /pc-bios/optionrom/multiboot.bin
>  /pc-bios/optionrom/multiboot.raw
> diff --git a/Makefile b/Makefile
> index a5d7e62..3a9782e 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -400,7 +400,7 @@ efi-e1000.rom efi-eepro100.rom efi-ne2k_pci.rom \
>  efi-pcnet.rom efi-rtl8139.rom efi-virtio.rom \
>  qemu-icon.bmp qemu_logo_no_text.svg \
>  bamboo.dtb petalogix-s3adsp1800.dtb petalogix-ml605.dtb \
> -multiboot.bin linuxboot.bin kvmvapic.bin \
> +multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin \
>  s390-ccw.img \
>  spapr-rtas.bin slof.bin \
>  palcode-clipper \
> diff --git a/configure b/configure
> index b5aab72..6d4cbbd 100755
> --- a/configure
> +++ b/configure
> @@ -237,6 +237,7 @@ fortify_source=""
>  strip_opt="yes"
>  tcg_interpreter="no"
>  bigendian="no"
> +compiler_m16="no"
>  mingw32="no"
>  gcov="no"
>  gcov_tool="gcov"
> @@ -1524,6 +1525,21 @@ if test "$static" = "yes" ; then
>fi
>  fi
>  
> +# Check if the compiler supports -m16 to generate i8086 binaries.
> +#
> +# GCC < 4.9 didn't, so we have to work around that when building the
> +# linuxboot_dma option ROM.  When GCC < 4.9 is considered sufficiently
> +# old that we no longer care about it, we can remove this section and
> +# CONFIG_COMPILER_M16 which will simplify the build.
> +if [ "$cpu" = "i386" -o "$cpu" = "x86_64" ] ; then
> +  cat > $TMPC << EOF
> +int main(void) { return 0; }
> +EOF
> +  if compile_prog "-m16" "" ; then
> +compiler_m16=yes
> +  fi
> +fi
> +
>  # Unconditional check for compiler __thread support
>cat > $TMPC << EOF
>  static __thread int tls_var;
> @@ -4780,6 +4796,7 @@ fi
>  echo "module support$modules"
>  echo "host CPU  $cpu"
>  echo "host big endian   $bigendian"
> +echo "compiler has -m16 $compiler_m16"
>  echo "target list   $target_list"
>  echo "tcg debug enabled $debug_tcg"
>  echo "gprof enabled $gprof"
> @@ -4928,6 +4945,9 @@ fi
>  if test "$bigendian" = "yes" ; then
>echo "HOST_WORDS_BIGENDIAN=y" >> $config_host_mak
>  fi
> +if test "$compiler_m16" = "yes" ; then
> +  echo "CONFIG_COMPILER_M16=y" >> $config_host_mak
> +fi
>  if test "$mingw32" = "yes" ; then
>echo "CONFIG_WIN32=y" >> $config_host_mak
>rc_version=`cat $source_path/VERSION`
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index e29ccc8..2ab7b42 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1000,8 +1000,13 @@ static void load_linux(PCMachineState *pcms,
>  fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
>  fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
>  
> -option_rom[nb_option_roms].name = "linuxboot.bin";
> -option_rom[nb_option_roms].bootindex = 0;
> +if (fw_cfg_dma_enabled(fw_cfg)) {
> +option_rom[nb_option_roms].name = "linuxboot_dma.bin";
> +option_rom[nb_option_roms].bootindex = 0;
> +} else {
> +option_rom[nb_option_roms].name = "linuxboot.bin";
> +option_rom[nb_option_roms].bootindex = 0;
> +}
>  nb_option_roms++;
>  }
>  
> @@ -1264,6 +1269,7 @@ void xen_load_linux(PCMachineState *pcms)
>  load_linux(pcms, fw_cfg);
>  for (i = 0; i < nb_option_roms; i++) {
>  assert(!strcmp(option_rom[i].name, "linuxboot.bin") ||
> +   !strcmp(option_rom[i].name, "linuxboot_dma.bin") ||
> !strcmp(option_rom[i].name, 

Re: [Qemu-devel] [PULL 00/31] Misc changes for 2016-05-27

2016-05-27 Thread Peter Maydell
On 27 May 2016 at 14:38, Richard W.M. Jones  wrote:
> On Fri, May 27, 2016 at 02:04:52PM +0100, Peter Maydell wrote:
>> With V=1:
>>
>> i686-w64-mingw32-ld -m i386pe -Ttext 0 -e _start -s -o
>> linuxboot_dma.img linuxboot_dma.o
>> linuxboot_dma.o:linuxboot_dma.c:(.text+0x57): undefined reference to
>> `load_kernel'
>>
>> Building an image for the target using our host compiler seems like
>> an odd choice,
>
> Which compiler should I be using?

I would have expected that how we build a guest image ought to
be the same regardless of how we're building QEMU itself.
(You wouldn't try to build an i386 image with $(CC) if you're
on an ARM host, for instance.)
However, given that the makefile already works this way, better
to go with the flow...

thanks
-- PMM



[Qemu-devel] [PATCH v6 12/15] docker: Add travis tool

2016-05-27 Thread Fam Zheng
The script is not prefixed with test- so it won't run with "make docker-test",
because it can take too long.

Run it with "make docker-travis@ubuntu".

Reviewed-by: Alex Bennée 
Signed-off-by: Fam Zheng 
---
 tests/docker/travis    | 21 +
 tests/docker/travis.py | 48 +
 2 files changed, 69 insertions(+)
 create mode 100755 tests/docker/travis
 create mode 100755 tests/docker/travis.py

diff --git a/tests/docker/travis b/tests/docker/travis
new file mode 100755
index 000..d345393
--- /dev/null
+++ b/tests/docker/travis
@@ -0,0 +1,21 @@
+#!/bin/bash -e
+#
+# Mimic a travis testing matrix
+#
+# Copyright (c) 2016 Red Hat Inc.
+#
+# Authors:
+#  Fam Zheng 
+#
+# This work is licensed under the terms of the GNU GPL, version 2
+# or (at your option) any later version. See the COPYING file in
+# the top-level directory.
+
+. common.rc
+
+requires pyyaml
+cmdfile=/tmp/travis_cmd_list.sh
+$QEMU_SRC/tests/docker/travis.py $QEMU_SRC/.travis.yml > $cmdfile
+chmod +x $cmdfile
+cd "$QEMU_SRC"
+$cmdfile
diff --git a/tests/docker/travis.py b/tests/docker/travis.py
new file mode 100755
index 000..8dcc964
--- /dev/null
+++ b/tests/docker/travis.py
@@ -0,0 +1,48 @@
+#!/usr/bin/env python
+#
+# Travis YAML config parser
+#
+# Copyright (c) 2016 Red Hat Inc.
+#
+# Authors:
+#  Fam Zheng 
+#
+# This work is licensed under the terms of the GNU GPL, version 2
+# or (at your option) any later version. See the COPYING file in
+# the top-level directory.
+
+import sys
+import yaml
+import itertools
+
+def load_yaml(fname):
+    return yaml.load(open(fname, "r").read())
+
+def conf_iter(conf):
+    def env_to_list(env):
+        return env if isinstance(env, list) else [env]
+    global_env = conf["env"]["global"]
+    for entry in conf["matrix"]["include"]:
+        yield {"env": global_env + env_to_list(entry["env"]),
+               "compiler": entry["compiler"]}
+    for entry in itertools.product(conf["compiler"],
+                                   conf["env"]["matrix"]):
+        yield {"env": global_env + env_to_list(entry[1]),
+               "compiler": entry[0]}
+
+def main():
+    if len(sys.argv) < 2:
+        sys.stderr.write("Usage: %s \n" % sys.argv[0])
+        return 1
+    conf = load_yaml(sys.argv[1])
+    for config in conf_iter(conf):
+        print "("
+        print "\n".join(config["env"])
+        print "alias cc=" + config["compiler"]
+        print "\n".join(conf["before_script"])
+        print "\n".join(conf["script"])
+        print ")"
+    return 0
+
+if __name__ == "__main__":
+    sys.exit(main())
-- 
2.8.3
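
The matrix expansion that travis.py performs can be sketched outside the patch. This is a minimal Python 3 illustration, not part of the series: conf_iter mirrors the generator in the patch, while sample_conf is a hand-written, hypothetical stand-in for a parsed .travis.yml whose values are made up for the example.

```python
import itertools


def conf_iter(conf):
    """Yield one {"env", "compiler"} dict per build configuration."""
    def env_to_list(env):
        return env if isinstance(env, list) else [env]

    global_env = conf["env"]["global"]
    # Explicit entries listed under matrix.include come first.
    for entry in conf["matrix"]["include"]:
        yield {"env": global_env + env_to_list(entry["env"]),
               "compiler": entry["compiler"]}
    # Then every compiler is paired with every env.matrix line.
    for compiler, env in itertools.product(conf["compiler"],
                                           conf["env"]["matrix"]):
        yield {"env": global_env + env_to_list(env),
               "compiler": compiler}


# Hypothetical sample data; the real script loads this from .travis.yml.
sample_conf = {
    "compiler": ["gcc", "clang"],
    "env": {
        "global": ["TEST_CMD='make check'"],
        "matrix": ["CONFIG=''", "CONFIG='--enable-debug'"],
    },
    "matrix": {
        "include": [{"env": "CONFIG='--disable-werror'", "compiler": "gcc"}],
    },
}

configs = list(conf_iter(sample_conf))
```

With this sample, the one matrix.include entry is followed by the 2 x 2 compiler/env pairs, so five configurations are produced, each carrying the global environment lines first.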




[Qemu-devel] [PATCH v6 14/15] docker: Add EXTRA_CONFIGURE_OPTS

2016-05-27 Thread Fam Zheng
Whatever is passed in this variable will be appended to all
configure commands.

Signed-off-by: Fam Zheng 
Reviewed-by: Alex Bennée 
---
 tests/docker/Makefile.include | 3 +++
 tests/docker/common.rc        | 1 +
 2 files changed, 4 insertions(+)

diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
index 372733d..e09a6c7 100644
--- a/tests/docker/Makefile.include
+++ b/tests/docker/Makefile.include
@@ -86,6 +86,8 @@ docker:
@echo
@echo 'Special variables:'
@echo 'TARGET_LIST=a,b,cOverride target list in builds.'
+   @echo 'EXTRA_CONFIGURE_OPTS="..."'
+   @echo ' Extra configure options.'
@echo 'IMAGES="a b c ..":   Filters which images to build or run.'
@echo 'TESTS="x y z .." Filters which tests to run (for docker-test).'
@echo 'J=[0..9]*Overrides the -jN parameter for make commands'
@@ -106,6 +108,7 @@ docker-run-%: docker-qemu-src
-t \
$(if $(DEBUG),-i,--net=none) \
-e TARGET_LIST=$(TARGET_LIST) \
+   -e EXTRA_CONFIGURE_OPTS=$(EXTRA_CONFIGURE_OPTS) \
-e V=$V -e J=$J -e DEBUG=$(DEBUG)\
-e CCACHE_DIR=/var/tmp/ccache \
-v $$(realpath $(SRC_COPY)):/var/tmp/qemu:z$(COMMA)ro \
diff --git a/tests/docker/common.rc b/tests/docker/common.rc
index 74b89d6..c493eeb 100755
--- a/tests/docker/common.rc
+++ b/tests/docker/common.rc
@@ -26,6 +26,7 @@ build_qemu()
 $QEMU_SRC/configure \
 --target-list="${TARGET_LIST}" \
 --prefix="$PWD/install" \
+$EXTRA_CONFIGURE_OPTS \
 "$@"
 make $MAKEFLAGS
 }
-- 
2.8.3
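
Because build_qemu expands $EXTRA_CONFIGURE_OPTS unquoted, the shell splits its value on whitespace before configure sees it. The effect can be modelled as below. This is a minimal Python 3 sketch, not part of the patch: configure_argv is a hypothetical helper, the option values are only illustrative, and the model ignores glob expansion and assumes the default IFS.

```python
def configure_argv(qemu_src, target_list, prefix, extra_opts="", passthrough=()):
    """Model the argument vector that build_qemu hands to configure."""
    argv = [
        qemu_src + "/configure",
        "--target-list=" + target_list,
        "--prefix=" + prefix,
    ]
    # An unquoted variable is split on whitespace; quote characters inside
    # the value are not interpreted, so str.split() is the closest model.
    argv += extra_opts.split()
    # "$@" is quoted in the script, so these arguments pass through unchanged.
    argv += list(passthrough)
    return argv


argv = configure_argv("/var/tmp/qemu", "x86_64-softmmu", "/tmp/install",
                      extra_opts="--enable-debug --disable-werror")
```

An empty or unset variable contributes no arguments at all, which is why the extra line is harmless for existing callers.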




Re: [Qemu-devel] [PULL v2 00/38] linux-user pull request

2016-05-27 Thread Peter Maydell
On 27 May 2016 at 13:59,  <riku.voi...@linaro.org> wrote:
> From: Riku Voipio <riku.voi...@linaro.org>
>
> The following changes since commit 287db79df8af8e31f18e262feb5e05103a09e4d4:
>
>   Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into 
> staging (2016-05-24 13:06:33 +0100)
>
> are available in the git repository at:
>
>   git://git.linaro.org/people/riku.voipio/qemu.git 
> tags/pull-linux-user-20160527
>
> for you to fetch changes up to 49e55cbacf4ad08f831b9f3f9cb0f3082883a3a1:
>
>   linux-user,target-ppc: fix use of MSR_LE (2016-05-27 14:50:40 +0300)
>
> linux-user pull request v2 for May 2016

Applied, thanks.

(I should be able to send out the next batch of patches later today ;-))

-- PMM



[Qemu-devel] [PATCH v6 07/15] docker: Add common.rc

2016-05-27 Thread Fam Zheng
"requires" checks the "FEATURES" environment variable for the specified
prerequisites, and skips execution of the test if any is not found.

"build_qemu" is the central routine to compile QEMU for tests to call.

Reviewed-by: Alex Bennée 
Signed-off-by: Fam Zheng 
---
 tests/docker/common.rc | 31 +++
 1 file changed, 31 insertions(+)
 create mode 100755 tests/docker/common.rc

diff --git a/tests/docker/common.rc b/tests/docker/common.rc
new file mode 100755
index 000..74b89d6
--- /dev/null
+++ b/tests/docker/common.rc
@@ -0,0 +1,31 @@
+#!/bin/sh
+#
+# Common routines for docker test scripts.
+#
+# Copyright (c) 2016 Red Hat Inc.
+#
+# Authors:
+#  Fam Zheng 
+#
+# This work is licensed under the terms of the GNU GPL, version 2
+# or (at your option) any later version. See the COPYING file in
+# the top-level directory.
+
+requires()
+{
+for c in $@; do
+if ! echo "$FEATURES" | grep -wq -e "$c"; then
+echo "Prerequisite '$c' not present, skip"
+exit 0
+fi
+done
+}
+
+build_qemu()
+{
+$QEMU_SRC/configure \
+--target-list="${TARGET_LIST}" \
+--prefix="$PWD/install" \
+"$@"
+make $MAKEFLAGS
+}
-- 
2.8.3
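
The whole-word check in requires can be modelled as follows. This is a minimal Python 3 sketch, not part of the patch: missing_features is a hypothetical name, and \b only approximates grep -w. The shell version also treats each name as a regular expression, which re.escape deliberately does not.

```python
import re


def missing_features(features, wanted):
    """Return the names in `wanted` that are not whole words in `features`."""
    missing = []
    for name in wanted:
        # The match must not be part of a longer word, as with grep -w.
        if not re.search(r"\b" + re.escape(name) + r"\b", features):
            missing.append(name)
    return missing


# Hypothetical FEATURES value for illustration.
features = "clang pyyaml"
absent = missing_features(features, ["pyyaml", "mingw"])
```

In the shell script the first missing prerequisite ends the test with exit status 0, so an unmet requirement is reported as a skip rather than a failure.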




[Qemu-devel] [PATCH v6 15/15] MAINTAINERS: Add tests/docker

2016-05-27 Thread Fam Zheng
Reviewed-by: Alex Bennée 
Signed-off-by: Fam Zheng 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3c949d5..091272e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1615,3 +1615,10 @@ Build system architecture
 M: Daniel P. Berrange 
 S: Odd Fixes
 F: docs/build-system.txt
+
+Docker testing
+--
+Docker based testing framework and cases
+M: Fam Zheng 
+S: Maintained
+F: tests/docker/
-- 
2.8.3




Re: [Qemu-devel] [PATCH v3] linux-user: Fix qemu-binfmt-conf.h to store config across reboot

2016-05-27 Thread Alexander Graf

On 05/25/2016 05:51 PM, Laurent Vivier wrote:


On 25/02/2016 at 17:28, Laurent Vivier wrote:

Please, Alex, Michael:

We need your ack/review.

Someone? :)


It's definitely an improvement over today's situation.

Reviewed-by: Alexander Graf 


Alex




[Qemu-devel] [PATCH v6 13/15] docs: Add text for tests/docker in build-system.txt

2016-05-27 Thread Fam Zheng
Reviewed-by: Alex Bennée 
Signed-off-by: Fam Zheng 
---
 docs/build-system.txt | 5 +
 1 file changed, 5 insertions(+)

diff --git a/docs/build-system.txt b/docs/build-system.txt
index 5ea..2af1e66 100644
--- a/docs/build-system.txt
+++ b/docs/build-system.txt
@@ -438,6 +438,11 @@ top level Makefile, so anything defined in this file will influence the
 entire build system. Care needs to be taken when writing rules for tests
 to ensure they only apply to the unit test execution / build.
 
+- tests/docker/Makefile.include
+
+Rules for Docker tests. Like tests/Makefile, this file is included
+directly by the top level Makefile, so anything defined in this file
+will influence the entire build system.
 
 - po/Makefile
 
-- 
2.8.3




[Qemu-devel] [PATCH v6 03/15] rules.mak: Avoid double include

2016-05-27 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 rules.mak | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/rules.mak b/rules.mak
index 4a8f464..7d8cf06 100644
--- a/rules.mak
+++ b/rules.mak
@@ -1,3 +1,5 @@
+ifeq ($(RULES_MAK),)
+RULES_MAK := 1
 
 COMMA := ,
 
@@ -355,3 +357,4 @@ define unnest-vars
 $(eval -include $(patsubst %.o,%.d,$(patsubst %.mo,%.d,$($v
 $(eval $v := $(filter-out %/,$($v
 endef
+endif
-- 
2.8.3



