date:20151006

Re: [Qemu-devel] rfc: vhost user enhancements for vm2vm communication

2015-10-06 Thread Michael S. Tsirkin

On Tue, Oct 06, 2015 at 02:42:34PM -0700, Nakajima, Jun wrote:
> Hi Michael,
> 
> Looks like the discussions tapered off, but do you have a plan to
> implement this if people are eventually fine with it? We want to
> extend this to support multiple VMs.

Absolutely. We are just back from holidays, and started looking at who
does what. If anyone wants to help, that'd also be nice.


> On Mon, Aug 31, 2015 at 11:35 AM, Nakajima, Jun  
> wrote:
> > On Mon, Aug 31, 2015 at 7:11 AM, Michael S. Tsirkin  wrote:
> >> Hello!
> >> During the KVM forum, we discussed supporting virtio on top
> >> of ivshmem. I have considered it, and came up with an alternative
> >> that has several advantages over that - please see below.
> >> Comments welcome.
> >
> > Hi Michael,
> >
> > I like this, and it should be able to achieve what I presented at KVM
> > Forum (vhost-user-shmem).
> > Comments below.
> >
> >>
> >> -
> >>
> >> Existing solutions to userspace switching between VMs on the
> >> same host are vhost-user and ivshmem.
> >>
> >> vhost-user works by mapping memory of all VMs being bridged into the
> >> switch memory space.
> >>
> >> By comparison, ivshmem works by exposing a shared region of memory to all 
> >> VMs.
> >> VMs are required to use this region to store packets. The switch only
> >> needs access to this region.
> >>
> >> Another difference between vhost-user and ivshmem surfaces when polling
> >> is used. With vhost-user, the switch is required to handle
> >> data movement between VMs; if using polling, this means that one host CPU
> >> needs to be sacrificed for this task.
> >>
> >> This is easiest to understand when one of the VMs is
> >> used with VF pass-through. This can be schematically shown below:
> >>
> >> +-- VM1 --------------+            +---VM2-----------+
> >> | virtio-pci          +-vhost-user-+ virtio-pci -- VF | -- VFIO -- IOMMU -- NIC
> >> +---------------------+            +-----------------+
> >>
> >>
> >> With ivshmem, in theory, communication can happen directly, with two VMs
> >> polling the shared memory region.
> >>
> >>
> >> I won't spend time listing advantages of vhost-user over ivshmem.
> >> Instead, having identified two advantages of ivshmem over vhost-user,
> >> below is a proposal to extend vhost-user to gain the advantages
> >> of ivshmem.
> >>
> >>
> >> 1: virtio in guest can be extended to allow support
> >> for IOMMUs. This provides guest with full flexibility
> >> about memory which is readable or writable by each device.
> >
> > I assume that you meant VFIO only for virtio by "use of VFIO".  To get
> > VFIO working for general direct-I/O (including VFs) in guests, as you
> > know, we need to virtualize IOMMU (e.g. VT-d) and the interrupt
> > remapping table on x86 (i.e. nested VT-d).
> >
> >> By setting up a virtio device for each other VM we need to
> >> communicate to, guest gets full control of its security, from
> >> mapping all memory (like with current vhost-user) to only
> >> mapping buffers used for networking (like ivshmem) to
> >> transient mappings for the duration of data transfer only.
> >
> > And I think that we can use VMFUNC to have such transient mappings.
> >
> >> This also allows use of VFIO within guests, for improved
> >> security.
> >>
> >> vhost user would need to be extended to send the
> >> mappings programmed by guest IOMMU.
> >
> > Right. We need to think about cases where other VMs (VM3, etc.) join
> > the group or some existing VM leaves.
> > PCI hot-plug should work there (as you point out at "Advantages over
> > ivshmem" below).
> >
> >>
> >> 2. qemu can be extended to serve as a vhost-user client:
> >> receive remote VM mappings over the vhost-user protocol, and
> >> map them into another VM's memory.
> >> This mapping can take, for example, the form of
> >> a BAR of a pci device, which I'll call here vhost-pci -
> >> with bus address allowed
> >> by VM1's IOMMU mappings being translated into
> >> offsets within this BAR within VM2's physical
> >> memory space.
> >
> > I think it's sensible.
> >
> >>
> >> Since the translation can be a simple one, VM2
> >> can perform it within its vhost-pci device driver.
> >>
> >> While this setup would be the most useful with polling,
> >> VM1's ioeventfd can also be mapped to
> >> another VM2's irqfd, and vice versa, such that VMs
> >> can trigger interrupts to each other without need
> >> for a helper thread on the host.
> >>
> >>
> >> The resulting channel might look something like the following:
> >>
> >> +-- VM1 --------------+  +---VM2-----------+
> >> | virtio-pci -- iommu +--+ vhost-pci -- VF | -- VFIO -- IOMMU -- NIC
> >> +---------------------+  +-----------------+
> >>
> >> Comparing the two diagrams, a vhost-user thread on the host is
> >> no longer required, reducing the host CPU utilization when
> >> polling is active.  At the same time, VM2 cannot access all of VM1's
> >> memory - it is limited by the iommu configuration set up by VM1.
> >>
> >>
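
For readers following the proposal, the address translation that VM2's
vhost-pci driver would perform might look roughly like the sketch below.
This is only an illustration under stated assumptions: the struct layout
and the names vhost_pci_map and vhost_pci_translate are hypothetical and
do not come from the thread or from any existing QEMU or vhost-user code.
It assumes VM2 has received a table of the ranges allowed by VM1's IOMMU,
each paired with the offset at which that range appears inside the BAR.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one range allowed by VM1's IOMMU, as forwarded to VM2. */
struct vhost_pci_map {
    uint64_t bus_addr;   /* start of the range in VM1's bus address space */
    uint64_t len;        /* length of the range in bytes */
    uint64_t bar_offset; /* where the range starts inside VM2's BAR */
};

/*
 * Translate a VM1 bus address into an offset within the BAR.
 * Returns 0 and stores the offset on success, or -1 if the requested
 * [bus_addr, bus_addr + len) range is not fully covered by one mapping.
 */
static int vhost_pci_translate(const struct vhost_pci_map *maps, size_t n,
                               uint64_t bus_addr, uint64_t len,
                               uint64_t *bar_offset)
{
    for (size_t i = 0; i < n; i++) {
        const struct vhost_pci_map *m = &maps[i];
        uint64_t delta;

        if (bus_addr < m->bus_addr) {
            continue;
        }
        delta = bus_addr - m->bus_addr;
        /* Written this way to avoid overflow in bus_addr + len. */
        if (delta >= m->len || len > m->len - delta) {
            continue;
        }
        *bar_offset = m->bar_offset + delta;
        return 0;
    }
    return -1; /* not mapped by VM1's IOMMU: VM2 must not touch it */
}
```

A linear scan is enough for a sketch; the point is only that the lookup is
simple enough to live in the guest driver, and that any address outside
VM1's IOMMU mappings has no valid BAR offset at all.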

Re: [Qemu-devel] [PATCH v2 0/5] simplified QEMU guest exec

2015-10-06 Thread Denis V. Lunev


On 10/05/2015 05:57 PM, Denis V. Lunev wrote:

This patchset provides simplified guest-exec functionality. The
idea is simple. We drop the original guest-pipe-open etc. stuff and provide
a simple and dumb API:
- spawn process (originally with stdin/stdout/stderr as /dev/null)
- later simple buffer is added for this purpose

That is all for now.

Changes from v1:
- use g_new0() instead of g_malloc0
- added explicit 'exited' bool to GuestExecStatus
- reworked documentation for GuestExecStatus
- added comment about platform-specific signals and exception codes
- replaces 'pid' with 'handle' in guest-exec api

Signed-off-by: Denis V. Lunev 
Signed-off-by: Yuri Pudgorodskiy 
CC: Michael Roth 

Denis V. Lunev (2):
   qga: drop guest_file_init helper and replace it with static
 initializers
   qga: handle possible SIGPIPE in guest-file-write

Yuri Pudgorodskiy (3):
   qga: guest exec functionality
   qga: handle G_IO_STATUS_AGAIN in ga_channel_write_all()
   qga: guest-exec simple stdin/stdout/stderr redirection

  qga/channel-posix.c  |  23 ++--
  qga/commands-posix.c |  10 +-
  qga/commands-win32.c |  10 +-
  qga/commands.c   | 363 +++
  qga/main.c   |   6 +
  qga/qapi-schema.json |  67 ++
  6 files changed, 453 insertions(+), 26 deletions(-)


Michael,

we are really near soft freeze. Could you please spend a bit of time
and look at this? The amount of changes is not that big in
comparison with the previous set.

Den

P.S. Sorry in advance if this ping happens too early.
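
As an illustration of the "spawn process with stdin/stdout/stderr as
/dev/null" step described in the cover letter, a minimal POSIX sketch
could look like the following. This is not the qga implementation; the
helper name spawn_with_null_stdio is hypothetical, and error reporting is
reduced to an exit code of 127 in the child.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: start argv[0] with all three standard streams
 * redirected to /dev/null. Returns the child pid, or -1 if fork() failed.
 */
static pid_t spawn_with_null_stdio(char *const argv[])
{
    pid_t pid = fork();
    int fd;

    if (pid != 0) {
        return pid; /* parent: child pid, or -1 on fork failure */
    }

    /* Child: replace fds 0, 1 and 2 with /dev/null, then exec. */
    fd = open("/dev/null", O_RDWR);
    if (fd < 0) {
        _exit(127);
    }
    if (dup2(fd, STDIN_FILENO) < 0 ||
        dup2(fd, STDOUT_FILENO) < 0 ||
        dup2(fd, STDERR_FILENO) < 0) {
        _exit(127);
    }
    if (fd > STDERR_FILENO) {
        close(fd);
    }
    execvp(argv[0], argv);
    _exit(127); /* only reached if exec failed */
}
```

The caller would later collect the exit status with waitpid(), which is
roughly what an 'exited' flag plus exit code in a status reply has to
report back.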

Re: [Qemu-devel] [Qemu-block] [PATCH] gluster: allocate GlusterAIOCBs on the stack

2015-10-06 Thread Stefan Hajnoczi

On Thu, Oct 01, 2015 at 01:04:38PM +0200, Paolo Bonzini wrote:
> This is simpler now that the driver has been converted to coroutines.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  block/gluster.c | 86 
> ++---
>  1 file changed, 33 insertions(+), 53 deletions(-)

CCing Jeff on Gluster patches.

> diff --git a/block/gluster.c b/block/gluster.c
> index 1eb3a8c..0857c14 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -429,28 +429,23 @@ static coroutine_fn int 
> qemu_gluster_co_write_zeroes(BlockDriverState *bs,
>  int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
>  {
>  int ret;
> -GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
> +GlusterAIOCB acb;
>  BDRVGlusterState *s = bs->opaque;
>  off_t size = nb_sectors * BDRV_SECTOR_SIZE;
>  off_t offset = sector_num * BDRV_SECTOR_SIZE;
>  
> -acb->size = size;
> -acb->ret = 0;
> -acb->coroutine = qemu_coroutine_self();
> -acb->aio_context = bdrv_get_aio_context(bs);
> +acb.size = size;
> +acb.ret = 0;
> +acb.coroutine = qemu_coroutine_self();
> +acb.aio_context = bdrv_get_aio_context(bs);
>  
> -ret = glfs_zerofill_async(s->fd, offset, size, &gluster_finish_aiocb, 
> acb);
> +ret = glfs_zerofill_async(s->fd, offset, size, gluster_finish_aiocb, 
> &acb);
>  if (ret < 0) {
> -ret = -errno;
> -goto out;
> +return -errno;
>  }
>  
>  qemu_coroutine_yield();
> -ret = acb->ret;
> -
> -out:
> -g_slice_free(GlusterAIOCB, acb);
> -return ret;
> +return acb.ret;
>  }
>  
>  static inline bool gluster_supports_zerofill(void)
> @@ -541,35 +536,30 @@ static coroutine_fn int 
> qemu_gluster_co_rw(BlockDriverState *bs,
>  int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int write)
>  {
>  int ret;
> -GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
> +GlusterAIOCB acb;
>  BDRVGlusterState *s = bs->opaque;
>  size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>  off_t offset = sector_num * BDRV_SECTOR_SIZE;
>  
> -acb->size = size;
> -acb->ret = 0;
> -acb->coroutine = qemu_coroutine_self();
> -acb->aio_context = bdrv_get_aio_context(bs);
> +acb.size = size;
> +acb.ret = 0;
> +acb.coroutine = qemu_coroutine_self();
> +acb.aio_context = bdrv_get_aio_context(bs);
>  
>  if (write) {
>  ret = glfs_pwritev_async(s->fd, qiov->iov, qiov->niov, offset, 0,
> -&gluster_finish_aiocb, acb);
> +gluster_finish_aiocb, &acb);
>  } else {
>  ret = glfs_preadv_async(s->fd, qiov->iov, qiov->niov, offset, 0,
> -&gluster_finish_aiocb, acb);
> +gluster_finish_aiocb, &acb);
>  }
>  
>  if (ret < 0) {
> -ret = -errno;
> -goto out;
> +return -errno;
>  }
>  
>  qemu_coroutine_yield();
> -ret = acb->ret;
> -
> -out:
> -g_slice_free(GlusterAIOCB, acb);
> -return ret;
> +return acb.ret;
>  }
>  
>  static int qemu_gluster_truncate(BlockDriverState *bs, int64_t offset)
> @@ -600,26 +590,21 @@ static coroutine_fn int 
> qemu_gluster_co_writev(BlockDriverState *bs,
>  static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
>  {
>  int ret;
> -GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
> +GlusterAIOCB acb;
>  BDRVGlusterState *s = bs->opaque;
>  
> -acb->size = 0;
> -acb->ret = 0;
> -acb->coroutine = qemu_coroutine_self();
> -acb->aio_context = bdrv_get_aio_context(bs);
> +acb.size = 0;
> +acb.ret = 0;
> +acb.coroutine = qemu_coroutine_self();
> +acb.aio_context = bdrv_get_aio_context(bs);
>  
> -ret = glfs_fsync_async(s->fd, &gluster_finish_aiocb, acb);
> +ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
>  if (ret < 0) {
> -ret = -errno;
> -goto out;
> +return -errno;
>  }
>  
>  qemu_coroutine_yield();
> -ret = acb->ret;
> -
> -out:
> -g_slice_free(GlusterAIOCB, acb);
> -return ret;
> +return acb.ret;
>  }
>  
>  #ifdef CONFIG_GLUSTERFS_DISCARD
> @@ -627,28 +612,23 @@ static coroutine_fn int 
> qemu_gluster_co_discard(BlockDriverState *bs,
>  int64_t sector_num, int nb_sectors)
>  {
>  int ret;
> -GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
> +GlusterAIOCB acb;
>  BDRVGlusterState *s = bs->opaque;
>  size_t size = nb_sectors * BDRV_SECTOR_SIZE;
>  off_t offset = sector_num * BDRV_SECTOR_SIZE;
>  
> -acb->size = 0;
> -acb->ret = 0;
> -acb->coroutine = qemu_coroutine_self();
> -acb->aio_context = bdrv_get_aio_context(bs);
> +acb.size = 0;
> +acb.ret = 0;
> +acb.coroutine = qemu_coroutine_self();
> +acb.aio_context = bdrv_get_aio_context(bs);
>  
> -ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, 
> acb);
> +ret = glfs_discard_async(s->fd, offset, size, gluster_finish_aiocb, 
> &acb);
>  if (ret < 0) {
> -ret = -errno;
> -

Re: [Qemu-devel] [PATCH v4 3/7] Implement fw_cfg DMA interface

2015-10-06 Thread Peter Maydell

On 6 October 2015 at 15:44, Stefan Hajnoczi  wrote:
> On Thu, Oct 01, 2015 at 02:16:55PM +0200, Marc Marí wrote:
>> @@ -292,6 +307,119 @@ static void fw_cfg_data_mem_write(void *opaque, hwaddr 
>> addr,
>>  } while (i);
>>  }
>>
>> +static void fw_cfg_dma_transfer(FWCfgState *s)
>> +{
>> +dma_addr_t len;
>> +FWCfgDmaAccess dma;
>> +int arch;
>> +FWCfgEntry *e;
>> +int read;
>> +dma_addr_t dma_addr;
>> +
>> +/* Reset the address before the next access */
>> +dma_addr = s->dma_addr;
>> +s->dma_addr = 0;
>> +
>> +dma.address = ldq_be_dma(s->dma_as,
>> +dma_addr + offsetof(FWCfgDmaAccess, address));
>> +dma.length = ldl_be_dma(s->dma_as,
>> +dma_addr + offsetof(FWCfgDmaAccess, length));
>> +dma.control = ldl_be_dma(s->dma_as,
>> +dma_addr + offsetof(FWCfgDmaAccess, control));
>
> ldq_be_dma() doesn't report errors.  If dma_addr is invalid the return
> value could be anything.  Memory corruption inside the guest is possible
> if the address/length/control values happen to cause a memory read
> operation!

We discussed this in a previous revision. IMHO if the guest has
passed us a bogus dma_addr it should expect memory corruption.
We only need to be sure we don't allow a VM escape.

> Instead, please use:
>
> if (dma_memory_read(s->dma_as, dma_addr, &dma, sizeof(dma))) {
> stl_be_dma(s->dma_as, dma_addr + offsetof(FWCfgDmaAccess, control),
>FW_CFG_DMA_CTL_ERROR);

If the guest handed us a bad dma_addr then this write will also
be bogus and could corrupt the guest's memory.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v4 3/7] Implement fw_cfg DMA interface

2015-10-06 Thread Marc Marí

On Tue, 6 Oct 2015 15:44:53 +0100
Stefan Hajnoczi  wrote:

> On Thu, Oct 01, 2015 at 02:16:55PM +0200, Marc Marí wrote:
> > @@ -292,6 +307,119 @@ static void fw_cfg_data_mem_write(void
> > *opaque, hwaddr addr, } while (i);
> >  }
> >  
> > +static void fw_cfg_dma_transfer(FWCfgState *s)
> > +{
> > +dma_addr_t len;
> > +FWCfgDmaAccess dma;
> > +int arch;
> > +FWCfgEntry *e;
> > +int read;
> > +dma_addr_t dma_addr;
> > +
> > +/* Reset the address before the next access */
> > +dma_addr = s->dma_addr;
> > +s->dma_addr = 0;
> > +
> > +dma.address = ldq_be_dma(s->dma_as,
> > +dma_addr + offsetof(FWCfgDmaAccess,
> > address));
> > +dma.length = ldl_be_dma(s->dma_as,
> > +dma_addr + offsetof(FWCfgDmaAccess,
> > length));
> > +dma.control = ldl_be_dma(s->dma_as,
> > +dma_addr + offsetof(FWCfgDmaAccess,
> > control));
> 
> ldq_be_dma() doesn't report errors.  If dma_addr is invalid the return
> value could be anything.  Memory corruption inside the guest is
> possible if the address/length/control values happen to cause a
> memory read operation!
> 
> Instead, please use:
> 
> > if (dma_memory_read(s->dma_as, dma_addr, &dma, sizeof(dma))) {
> stl_be_dma(s->dma_as, dma_addr + offsetof(FWCfgDmaAccess,
> control), FW_CFG_DMA_CTL_ERROR);
> return;
> }
> 
> dma.address = be64_to_cpu(dma.address);
> dma.length = be32_to_cpu(dma.length);
> dma.control = be32_to_cpu(dma.control);
> 
> > +
> > +if (dma.control & FW_CFG_DMA_CTL_SELECT) {
> > +fw_cfg_select(s, dma.control >> 16);
> > +}
> > +
> > +arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> > > +e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> > +
> > +if (dma.control & FW_CFG_DMA_CTL_READ) {
> > +read = 1;
> > +} else if (dma.control & FW_CFG_DMA_CTL_SKIP) {
> > +read = 0;
> > +} else {
> > +dma.length = 0;
> > +}
> > +
> > +dma.control = 0;
> > +
> > +while (dma.length > 0 && !(dma.control &
> > FW_CFG_DMA_CTL_ERROR)) {
> > +if (s->cur_entry == FW_CFG_INVALID || !e->data ||
> > +s->cur_offset >= e->len) {
> > +len = dma.length;
> > +
> > +/* If the access is not a read access, it will be a
> > skip access,
> > + * tested before.
> > + */
> > +if (read) {
> > +if (dma_memory_set(s->dma_as, dma.address, 0,
> > len)) {
> > +dma.control |= FW_CFG_DMA_CTL_ERROR;
> > +}
> > +}
> > +
> > +} else {
> > +if (dma.length <= (e->len - s->cur_offset)) {
> > +len = dma.length;
> > +} else {
> > +len = (e->len - s->cur_offset);
> > +}
> > +
> > +if (e->read_callback) {
> > +e->read_callback(e->callback_opaque,
> > s->cur_offset);
> > +}
> > +
> > +/* If the access is not a read access, it will be a
> > skip access,
> > + * tested before.
> > + */
> > +if (read) {
> > +if (dma_memory_write(s->dma_as, dma.address,
> > > +&e->data[s->cur_offset], len))
> > {
> > +dma.control |= FW_CFG_DMA_CTL_ERROR;
> > +}
> > +}
> > +
> > +s->cur_offset += len;
> > +}
> > +
> > +dma.address += len;
> > +dma.length  -= len;
> 
> I thought these fields are written back to guest memory?  For example,
> so the guest knows how many bytes were read before the error occurred.

This was proposed here:

http://lists.gnu.org/archive/html/qemu-devel/2015-08/msg04001.html

I also don't see much benefit in knowing how many bytes were read. If
the guest is trying to read an invalid entry or past the end of that
entry, the memory will be filled with zeros. The only moment when an
error would be reported is when there's some problem in the mapping.
And this problem is bad enough to just abort the whole operation.

Regards
Marc
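
To make the zero-fill behaviour described above concrete, here is a small
stand-alone sketch. It is not the patch code: fw_entry_read is a
hypothetical helper that works on plain buffers instead of guest memory,
and it folds the "invalid entry" and "past the end" cases into one. As in
the quoted loop, the entry offset only advances by the number of bytes
that actually came from the entry.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy up to len bytes from an entry into dst,
 * starting at *offset. Bytes requested past the end of the entry (or
 * from an entry with no data) are returned as zeros, so the read itself
 * never fails.
 */
static void fw_entry_read(const uint8_t *data, size_t data_len,
                          size_t *offset, uint8_t *dst, size_t len)
{
    size_t avail = (data && *offset < data_len) ? data_len - *offset : 0;
    size_t n = len < avail ? len : avail;

    if (n > 0) {
        memcpy(dst, data + *offset, n);
    }
    memset(dst + n, 0, len - n); /* zero-fill whatever the entry lacks */
    *offset += n;
}
```

With this behaviour the only failure left to report is a problem writing
to the destination itself, which corresponds to the mapping error
mentioned in the reply.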

Re: [Qemu-devel] [Qemu-block] [PATCH] throttle: test that snapshots move the throttling configuration

2015-10-06 Thread Kevin Wolf

Am 18.09.2015 um 17:54 hat Max Reitz geschrieben:
> On 17.09.2015 16:33, Alberto Garcia wrote:
> > If a snapshot is performed on a device that has I/O limits they should
> > be moved to the target image (the new active layer).
> > 
> > Signed-off-by: Alberto Garcia 
> > ---
> >  tests/qemu-iotests/096 | 69 
> > ++
> >  tests/qemu-iotests/096.out |  5 
> >  tests/qemu-iotests/group   |  1 +
> >  3 files changed, 75 insertions(+)
> >  create mode 100644 tests/qemu-iotests/096
> >  create mode 100644 tests/qemu-iotests/096.out
> 
> Looks good, I'd just like to throw in that 096 is in use by my
> looks-dead-but-actually-is-not and
> only-waiting-for-the-blockbackend-and-media-series-to-get-merged series
> "block: Rework bdrv_close_all()":
> http://lists.nongnu.org/archive/html/qemu-block/2015-03/msg00048.html
> 
> But then again, once the prerequisites are met, I'll have to send a new
> version of that series anyway...

It's generally a good idea to check the mailing list for test numbers
taken by posted, but not yet merged patches. If there are any gaps in
groups, chances are high that some patch is using it, so don't fill gaps
without checking that first.

> So, since 096 is not a magic number I'm extremely keen on keeping in my
> greedy claws:
> 
> Reviewed-by: Max Reitz 

Thanks, applied to the block branch.

Kevin



Re: [Qemu-devel] [v4][PATCH 2/2] libxl: introduce gfx_passthru_kind

2015-10-06 Thread Stefano Stabellini

On Fri, 25 Sep 2015, Ian Campbell wrote:
> On Fri, 2015-09-18 at 16:30 +0800, Tiejun Chen wrote:
> > Although we already have 'gfx_passthru' in b_info, this doesn't suffice
> > after we want to handle IGD specifically. Now we define a new field of
> > type, gfx_passthru_kind, to indicate we're trying to pass IGD. Actually
> > this means we can benefit this to support other specific devices just
> > by extending gfx_passthru_kind. And then we can cooperate with
> > gfx_passthru to address IGD cases as follows:
> > 
> > gfx_passthru = 0=> sets build_info.u.gfx_passthru to false
> > gfx_passthru = 1=> sets build_info.u.gfx_passthru to true and
> >build_info.u.gfx_passthru_kind to DEFAULT
> > gfx_passthru = "igd"=> sets build_info.u.gfx_passthru to true
> >and build_info.u.gfx_passthru_kind to IGD
> > 
> > Here if gfx_passthru_kind = DEFAULT, we will call
> > libxl__is_igd_vga_passthru() to check if we're hitting that table to need
> > to pass that option to qemu. But if gfx_passthru_kind = "igd" we always
> > force to pass that.
> > 
> > And "-gfx_passthru" is just introduced to work for qemu-xen-traditional
> > so we should get this away from libxl__build_device_model_args_new() in
> > the case of qemu upstream.
> > 
> > Signed-off-by: Tiejun Chen 
> 
> Acked + applied both patches, thanks.
> 
> Stefano -- are the QEMU side patches in qemu-upstream-unstable.git yet? If
> not I suppose this is a call/reminder to backport them from mainline or
> whatever.

No, they are not, they'll be in qemu-upstream for Xen 4.7.
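
For reference, the gfx_passthru mapping from the quoted commit message
can be sketched as a small parser. The type and function names here
(gfx_cfg, parse_gfx_passthru, GFX_KIND_*) are made up for illustration
and are not the libxl or xl identifiers. Leaving the kind at its default
when passthrough is disabled is also an assumption; the commit message
does not say what happens to it in that case.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the libxl fields discussed above. */
enum gfx_kind { GFX_KIND_DEFAULT, GFX_KIND_IGD };

struct gfx_cfg {
    bool passthru;
    enum gfx_kind kind;
};

/*
 * Map the config value to (passthru, kind):
 *   "0"   => passthru = false
 *   "1"   => passthru = true, kind = DEFAULT
 *   "igd" => passthru = true, kind = IGD
 * Returns 0 on success, -1 for an unrecognised value.
 */
static int parse_gfx_passthru(const char *val, struct gfx_cfg *out)
{
    if (strcmp(val, "0") == 0) {
        out->passthru = false;
        out->kind = GFX_KIND_DEFAULT; /* assumption: kind is unused here */
        return 0;
    }
    if (strcmp(val, "1") == 0) {
        out->passthru = true;
        out->kind = GFX_KIND_DEFAULT;
        return 0;
    }
    if (strcmp(val, "igd") == 0) {
        out->passthru = true;
        out->kind = GFX_KIND_IGD;
        return 0;
    }
    return -1;
}
```

Supporting another device type would then only need a new enum value and
one more string comparison, which is the extensibility argument made in
the commit message.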

Re: [Qemu-devel] [PATCH v6 2/4] block: support passing 'backing': '' to 'blockdev-add'

2015-10-06 Thread Kevin Wolf

Am 22.09.2015 um 15:28 hat Alberto Garcia geschrieben:
> Passing an empty string allows opening an image but not its backing
> file. This was already described in the API documentation, only the
> implementation was missing.
> 
> This is useful for creating snapshots using images opened with
> blockdev-add, since they are not supposed to have a backing image
> before the operation.
> 
> Signed-off-by: Alberto Garcia 
> Reviewed-by: Max Reitz 
> Reviewed-by: Eric Blake 

Reviewed-by: Kevin Wolf

Re: [Qemu-devel] [PATCH v6 3/4] block: add a 'blockdev-snapshot' QMP command

2015-10-06 Thread Alberto Garcia

On Tue 06 Oct 2015 05:30:07 PM CEST, Kevin Wolf wrote:
>> -options = qdict_new();
>> -if (has_snapshot_node_name) {
>> -qdict_put(options, "node-name",
>> -  qstring_from_str(snapshot_node_name));
>> +if (snapshot_node_name && bdrv_find_node(snapshot_node_name)) {
>> +error_setg(errp, "New snapshot node name already exists");
>> +return;
>> +}
>
> Preexisting, but shouldn't we use bdrv_lookup_bs() here (because devices
> and node names share a namespace)?

I think you're right, good catch!

>> +if (state->new_bs->blk != NULL) {
>> +error_setg(errp, "The snapshot is already in use by %s",
>> +   blk_name(state->new_bs->blk));
>> +return;
>> +}
>
> Is it even possible yet to create a root BDS without a BB?

It is possible with Max's series, on which mine depends.

   http://patchwork.ozlabs.org/patch/519375/

>> +if (bdrv_op_is_blocked(state->new_bs, BLOCK_OP_TYPE_EXTERNAL_SNAPSHOT,
>> +   errp)) {
>> +return;
>> +}
>> +
>> +if (state->new_bs->backing_hd != NULL) {
>> +error_setg(errp, "The snapshot already has a backing image");
>>  }
>
> The error cases after bdrv_open() should probably bdrv_unref() the
> node.

I don't think it's necessary, external_snapshot_abort() already takes
care of that.

Thanks for reviewing!

Berto

Re: [Qemu-devel] [PATCH v5] linux-user/syscall.c: malloc()/calloc() to g_malloc()/g_try_malloc()/g_new0()

2015-10-06 Thread Stefan Hajnoczi

On Tue, Oct 6, 2015 at 3:06 PM, Harmandeep Kaur
 wrote:
> Convert malloc()/ calloc() calls to g_malloc()/ g_try_malloc()/ g_new0()
>
> Using GLib functions there is no need to check return value.
> It aborts the execution if allocation fails (in most of the cases).

This explains how the glib functions work, but for the commit
description I suggest:

Commit 7267c0947d7e8ae5dff7bafd932c3bc285f43e5c ("Use glib memory
allocation and free functions") converted qemu_malloc() to g_malloc().
malloc(3) users were not converted since they didn't go through
qemu_malloc().  All heap memory allocation should go through glib so
we can take advantage of a single memory allocator and its
debugging/tracing features.  Stop using malloc(3) directly.

[Qemu-devel] [PATCH 0/2] hw/arm/virt: max-cpus init/check fixup

2015-10-06 Thread Andrew Jones

Andrew Jones (2):
  [RFC] arm_gic_common.h: add gicv2 aliases for defines
  hw/arm/virt: don't use a15memmap directly

 hw/arm/virt.c| 22 +-
 include/hw/intc/arm_gic_common.h |  2 ++
 2 files changed, 19 insertions(+), 5 deletions(-)

-- 
2.4.3

[Qemu-devel] [PATCH 1/2] [RFC] arm_gic_common.h: add gicv2 aliases for defines

2015-10-06 Thread Andrew Jones

I'm not sure if arm_gic_common.h is supposed to be common, not
only between tcg and kvm, but also v2 and v3, but it currently
is (arm_gicv3_common.h includes it, and it's the only gic header
included by hw/arm/virt.c). If it should be the super-common
header, then it's unfortunate that the define names are too
generic. This patch doesn't help much, as it doesn't rename
anything, but it does start heading down the right path. With
it, code including the super-common header can start using more
appropriate names for a couple very gic-version-specific defines.

Signed-off-by: Andrew Jones 
---
 include/hw/intc/arm_gic_common.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/hw/intc/arm_gic_common.h b/include/hw/intc/arm_gic_common.h
index 564a72b2cf77f..299226064a30f 100644
--- a/include/hw/intc/arm_gic_common.h
+++ b/include/hw/intc/arm_gic_common.h
@@ -25,11 +25,13 @@
 
 /* Maximum number of possible interrupts, determined by the GIC architecture */
 #define GIC_MAXIRQ 1020
+#define GICV2_MAXIRQ GIC_MAXIRQ
 /* First 32 are private to each CPU (SGIs and PPIs). */
 #define GIC_INTERNAL 32
 #define GIC_NR_SGIS 16
 /* Maximum number of possible CPU interfaces, determined by GIC architecture */
 #define GIC_NCPU 8
+#define GICV2_NCPU GIC_NCPU
 
 #define MAX_NR_GROUP_PRIO 128
 #define GIC_NR_APRS (MAX_NR_GROUP_PRIO / 32)
-- 
2.4.3

Re: [Qemu-devel] [PATCH v4 3/7] Implement fw_cfg DMA interface

2015-10-06 Thread Stefan Hajnoczi

On Thu, Oct 01, 2015 at 02:16:55PM +0200, Marc Marí wrote:
> @@ -292,6 +307,119 @@ static void fw_cfg_data_mem_write(void *opaque, hwaddr 
> addr,
>  } while (i);
>  }
>  
> +static void fw_cfg_dma_transfer(FWCfgState *s)
> +{
> +dma_addr_t len;
> +FWCfgDmaAccess dma;
> +int arch;
> +FWCfgEntry *e;
> +int read;
> +dma_addr_t dma_addr;
> +
> +/* Reset the address before the next access */
> +dma_addr = s->dma_addr;
> +s->dma_addr = 0;
> +
> +dma.address = ldq_be_dma(s->dma_as,
> +dma_addr + offsetof(FWCfgDmaAccess, address));
> +dma.length = ldl_be_dma(s->dma_as,
> +dma_addr + offsetof(FWCfgDmaAccess, length));
> +dma.control = ldl_be_dma(s->dma_as,
> +dma_addr + offsetof(FWCfgDmaAccess, control));

ldq_be_dma() doesn't report errors.  If dma_addr is invalid the return
value could be anything.  Memory corruption inside the guest is possible
if the address/length/control values happen to cause a memory read
operation!

Instead, please use:

if (dma_memory_read(s->dma_as, dma_addr, &dma, sizeof(dma))) {
stl_be_dma(s->dma_as, dma_addr + offsetof(FWCfgDmaAccess, control),
   FW_CFG_DMA_CTL_ERROR);
return;
}

dma.address = be64_to_cpu(dma.address);
dma.length = be32_to_cpu(dma.length);
dma.control = be32_to_cpu(dma.control);

> +
> +if (dma.control & FW_CFG_DMA_CTL_SELECT) {
> +fw_cfg_select(s, dma.control >> 16);
> +}
> +
> +arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
> +e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
> +
> +if (dma.control & FW_CFG_DMA_CTL_READ) {
> +read = 1;
> +} else if (dma.control & FW_CFG_DMA_CTL_SKIP) {
> +read = 0;
> +} else {
> +dma.length = 0;
> +}
> +
> +dma.control = 0;
> +
> +while (dma.length > 0 && !(dma.control & FW_CFG_DMA_CTL_ERROR)) {
> +if (s->cur_entry == FW_CFG_INVALID || !e->data ||
> +s->cur_offset >= e->len) {
> +len = dma.length;
> +
> +/* If the access is not a read access, it will be a skip access,
> + * tested before.
> + */
> +if (read) {
> +if (dma_memory_set(s->dma_as, dma.address, 0, len)) {
> +dma.control |= FW_CFG_DMA_CTL_ERROR;
> +}
> +}
> +
> +} else {
> +if (dma.length <= (e->len - s->cur_offset)) {
> +len = dma.length;
> +} else {
> +len = (e->len - s->cur_offset);
> +}
> +
> +if (e->read_callback) {
> +e->read_callback(e->callback_opaque, s->cur_offset);
> +}
> +
> +/* If the access is not a read access, it will be a skip access,
> + * tested before.
> + */
> +if (read) {
> +if (dma_memory_write(s->dma_as, dma.address,
> +&e->data[s->cur_offset], len)) {
> +dma.control |= FW_CFG_DMA_CTL_ERROR;
> +}
> +}
> +
> +s->cur_offset += len;
> +}
> +
> +dma.address += len;
> +dma.length  -= len;

I thought these fields are written back to guest memory?  For example,
so the guest knows how many bytes were read before the error occurred.
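
As a stand-alone illustration of the "read the whole descriptor once,
then convert from big-endian" approach suggested above, the sketch below
decodes a 16-byte descriptor from a plain byte buffer. The field order
(control, length, address) is an assumption, since the struct definition
is not quoted in this mail. The names dma_access, dma_access_from_wire
and dma_selected_key are hypothetical. Only the SELECT bit value and the
"control >> 16" key extraction are taken from the quoted patch.

```c
#include <stdint.h>

#define DMA_CTL_SELECT 0x08 /* same value as FW_CFG_DMA_CTL_SELECT */

/* Assumed layout of the guest-provided descriptor (all big-endian). */
struct dma_access {
    uint32_t control;
    uint32_t length;
    uint64_t address;
};

static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t load_be64(const uint8_t *p)
{
    return ((uint64_t)load_be32(p) << 32) | load_be32(p + 4);
}

/* Decode a descriptor that has already been copied out of guest memory. */
static void dma_access_from_wire(const uint8_t raw[16], struct dma_access *d)
{
    d->control = load_be32(raw);
    d->length = load_be32(raw + 4);
    d->address = load_be64(raw + 8);
}

/* Returns the selected key, or -1 if the SELECT bit is not set. */
static int dma_selected_key(uint32_t control)
{
    return (control & DMA_CTL_SELECT) ? (int)(control >> 16) : -1;
}
```

Copying the descriptor in one operation gives a single place to detect a
failed read, which is the point of the suggestion; whether reporting that
failure back through the same address is useful is a separate question.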

Re: [Qemu-devel] [PATCHv2] fw_cfg: Define a static signature to be returned on DMA port reads

2015-10-06 Thread Laszlo Ersek

On 10/06/15 16:29, Kevin O'Connor wrote:
> On Tue, Oct 06, 2015 at 09:30:18AM +0200, Laszlo Ersek wrote:
>> On 10/06/15 01:51, Kevin O'Connor wrote:
>>> Return a static signature ("QEMU CFG") if the guest does a read to the
>>> DMA address io register.
>>>
>>> Signed-off-by: Kevin O'Connor 
>>> ---
>>>
>>> Marc, if you decide to respin your fw_cfg series, I've updated the dma
>>> signature patch.  This addresses the comments from Stefan, and I hope
>>> it addresses the comments from Laszlo.
>>
>> Thank you -- I didn't know about extract64().
>>
>> The patch looks good to me, but I think the QEMU coding style requires
>> /* ... */ comments, and forbids //.
> 
> I always forget about that one.  Marc, if you do respin the series I
> updated the patch below.
> 
> -Kevin
> 
> 
> commit 02a449ece95da00fa64a9c704555c6afa8e03579
> Author: Kevin O'Connor 
> Date:   Thu Oct 1 14:16:59 2015 +0200
> 
> fw_cfg: Define a static signature to be returned on DMA port reads
> 
> Return a static signature ("QEMU CFG") if the guest does a read to the
> DMA address io register.
> 
> Reviewed-by: Stefan Hajnoczi 
> Signed-off-by: Kevin O'Connor 
> 
> diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
> index 2d6b2da..cbdce7d 100644
> --- a/docs/specs/fw_cfg.txt
> +++ b/docs/specs/fw_cfg.txt
> @@ -93,6 +93,9 @@ by selecting the "signature" item using key 0x0000 
> (FW_CFG_SIGNATURE),
>  and reading four bytes from the data register. If the fw_cfg device is
>  present, the four bytes read will contain the characters "QEMU".
>  
> +If the DMA interface is available, then reading the DMA Address
> +Register returns 0x51454d5520434647 ("QEMU CFG" in big-endian format).
> +
>  === Revision / feature bitmap (Key 0x0001, FW_CFG_ID) ===
>  
>  A 32-bit little-endian unsigned int, this item is used to check for enabled
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 59933b3..5098bfc 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -53,6 +53,8 @@
>  #define FW_CFG_DMA_CTL_SKIP0x04
>  #define FW_CFG_DMA_CTL_SELECT  0x08
>  
> +#define FW_CFG_DMA_SIGNATURE 0x51454d5520434647 /* "QEMU CFG" */
> +
>  typedef struct FWCfgEntry {
>  uint32_t len;
>  uint8_t *data;
> @@ -393,6 +395,13 @@ static void fw_cfg_dma_transfer(FWCfgState *s)
>  trace_fw_cfg_read(s, 0);
>  }
>  
> +static uint64_t fw_cfg_dma_mem_read(void *opaque, hwaddr addr,
> +unsigned size)
> +{
> +/* Return a signature value (and handle various read sizes) */
> +return extract64(FW_CFG_DMA_SIGNATURE, (8 - addr - size) * 8, size * 8);
> +}
> +
>  static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr,
>   uint64_t value, unsigned size)
>  {
> @@ -416,8 +425,8 @@ static void fw_cfg_dma_mem_write(void *opaque, hwaddr 
> addr,
>  static bool fw_cfg_dma_mem_valid(void *opaque, hwaddr addr,
>unsigned size, bool is_write)
>  {
> -return is_write && ((size == 4 && (addr == 0 || addr == 4)) ||
> -(size == 8 && addr == 0));
> +return !is_write || ((size == 4 && (addr == 0 || addr == 4)) ||
> + (size == 8 && addr == 0));
>  }
>  
>  static bool fw_cfg_data_mem_valid(void *opaque, hwaddr addr,
> @@ -488,6 +497,7 @@ static const MemoryRegionOps fw_cfg_comb_mem_ops = {
>  };
>  
>  static const MemoryRegionOps fw_cfg_dma_mem_ops = {
> +.read = fw_cfg_dma_mem_read,
>  .write = fw_cfg_dma_mem_write,
>  .endianness = DEVICE_BIG_ENDIAN,
>  .valid.accepts = fw_cfg_dma_mem_valid,
> 

Reviewed-by: Laszlo Ersek 

Thanks!
Laszlo
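
To show how the extract64() expression in the patch handles different
read sizes, here is a stand-alone sketch. extract_bits64 is a local
stand-in written for this example, not QEMU's helper, and signature_read
is a hypothetical wrapper that mirrors the expression from the patch. It
assumes addr + size <= 8 and size >= 1.

```c
#include <stdint.h>

#define DMA_SIGNATURE 0x51454d5520434647ULL /* "QEMU CFG" */

/*
 * Return 'length' bits of 'value' starting at bit 'start'.
 * Requires 1 <= length <= 64 and start + length <= 64.
 */
static uint64_t extract_bits64(uint64_t value, unsigned start,
                               unsigned length)
{
    return (value >> start) & (~0ULL >> (64 - length));
}

/*
 * Big-endian view of the signature: offset 0 is the most significant
 * byte, so a read of 'size' bytes at 'addr' takes the bits that end
 * (8 - addr - size) bytes above the least significant end.
 */
static uint64_t signature_read(unsigned addr, unsigned size)
{
    return extract_bits64(DMA_SIGNATURE, (8 - addr - size) * 8, size * 8);
}
```

A 4-byte read at offset 0 therefore yields "QEMU" and a 4-byte read at
offset 4 yields " CFG", while an 8-byte read at offset 0 returns the
whole value.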

[Qemu-devel] How to get started with the source code of Qemu?

2015-10-06 Thread Aaron Elkins

Hi all,

I am new to Qemu, and I'm extremely interested in understanding how the source 
code of Qemu works. But after I downloaded the whole project, I got lost in it;
the project is too large for me to know where to start.

Can anyone here point me to some useful documents or guides to help me 
get started in understanding the source code?

What knowledge is required to understand the source code?

BTW, I know this project is not that simple to understand, but I would like to 
try. Even if I need to learn a lot of other things first, please at least
help me get started.

Thanks

-Aaron

[Qemu-devel] [PATCH v5] target-tilegx: Support iret instruction and related special registers

2015-10-06 Thread Chen Gang

From fa0950e403bbb98989117f632215ae0e698457d7 Mon Sep 17 00:00:00 2001
From: Chen Gang 
Date: Sun, 4 Oct 2015 17:41:14 +0800
Subject: [PATCH v5] target-tilegx: Support iret instruction and related special 
registers

According to the __longjmp tilegx libc implementation, with reference to the
tilegx ISA document, and as suggested by a tilegx architecture member, we can
treat the iret instruction as "jrp lr". The related code is below:

  ENTRY (__longjmp)
         FEEDBACK_ENTER(__longjmp)

  #define RESTORE(r) { LD r, r0 ; ADDI_PTR r0, r0, REGSIZE }
         FOR_EACH_CALLEE_SAVED_REG(RESTORE)

         {
          LD r2, r0       ; retrieve ICS bit from jmp_buf
          movei r3, 1
          CMPEQI r0, r1, 0
         }

         {
          mtspr INTERRUPT_CRITICAL_SECTION, r3
          shli r2, r2, SPR_EX_CONTEXT_0_1__ICS_SHIFT
         }

         {
          mtspr EX_CONTEXT_0_0, lr
          ori r2, r2, RETURN_PL
         }

         {
          or r0, r1, r0
          mtspr EX_CONTEXT_0_1, r2
         }

         iret

         jrp lr

EX_CONTEXT_0_0 holds the jump address, and EX_CONTEXT_0_1 holds the value
for INTERRUPT_CRITICAL_SECTION, which should only be 0 or 1 in user mode;
any other value causes a target SEGV (the patch doesn't implement system mode).

iret performs "jrp EX_CONTEXT_0_0"; for __longjmp that is lr (but in other
cases it may not be).  The last "jrp lr" in __longjmp is there for
historical reasons and might get removed in the future.

Signed-off-by: Chen Gang 
---
 target-tilegx/cpu.h       |  2 ++
 target-tilegx/helper.c    | 22 ++
 target-tilegx/helper.h    |  1 +
 target-tilegx/translate.c | 14 +-
 4 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/target-tilegx/cpu.h b/target-tilegx/cpu.h
index 6f04fe7..6c0fd53 100644
--- a/target-tilegx/cpu.h
+++ b/target-tilegx/cpu.h
@@ -53,6 +53,8 @@ enum {
     TILEGX_SPR_CMPEXCH = 0,
     TILEGX_SPR_CRITICAL_SEC = 1,
     TILEGX_SPR_SIM_CONTROL = 2,
+    TILEGX_SPR_EX_CONTEXT_0_0 = 3,
+    TILEGX_SPR_EX_CONTEXT_0_1 = 4,
     TILEGX_SPR_COUNT
 };
 
diff --git a/target-tilegx/helper.c b/target-tilegx/helper.c
index 36b287f..dda821f 100644
--- a/target-tilegx/helper.c
+++ b/target-tilegx/helper.c
@@ -22,6 +22,7 @@
 #include "qemu-common.h"
 #include "exec/helper-proto.h"
 #include <zlib.h> /* For crc32 */
+#include "syscall_defs.h"
 
 void helper_exception(CPUTLGState *env, uint32_t excp)
 {
@@ -31,6 +32,27 @@ void helper_exception(CPUTLGState *env, uint32_t excp)
     cpu_loop_exit(cs);
 }
 
+void helper_ext01_ics(CPUTLGState *env)
+{
+    uint64_t val = env->spregs[TILEGX_SPR_EX_CONTEXT_0_1];
+
+    switch (val) {
+    case 0:
+    case 1:
+        env->spregs[TILEGX_SPR_CRITICAL_SEC] = val;
+        break;
+    default:
+#if defined(CONFIG_USER_ONLY)
+        env->signo = TARGET_SIGILL;
+        env->sigcode = TARGET_ILL_ILLOPC;
+        helper_exception(env, TILEGX_EXCP_SIGNAL);
+#else
+        helper_exception(env, TILEGX_EXCP_OPCODE_UNIMPLEMENTED);
+#endif
+        break;
+    }
+}
+
 uint64_t helper_cntlz(uint64_t arg)
 {
     return clz64(arg);
diff --git a/target-tilegx/helper.h b/target-tilegx/helper.h
index bbcc476..9281d0f 100644
--- a/target-tilegx/helper.h
+++ b/target-tilegx/helper.h
@@ -1,4 +1,5 @@
 DEF_HELPER_2(exception, noreturn, env, i32)
+DEF_HELPER_1(ext01_ics, void, env)
 DEF_HELPER_FLAGS_1(cntlz, TCG_CALL_NO_RWG_SE, i64, i64)
 DEF_HELPER_FLAGS_1(cnttz, TCG_CALL_NO_RWG_SE, i64, i64)
 DEF_HELPER_FLAGS_1(pcnt, TCG_CALL_NO_RWG_SE, i64, i64)
diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c
index ab3fc81..acb9ec4 100644
--- a/target-tilegx/translate.c
+++ b/target-tilegx/translate.c
@@ -529,6 +529,15 @@ static TileExcp gen_rr_opcode(DisasContext *dc, unsigned opext,
         /* ??? This should yield, especially in system mode.  */
         mnemonic = "nap";
         goto done0;
+    case OE_RR_X1(IRET):
+        gen_helper_ext01_ics(cpu_env);
+        dc->jmp.cond = TCG_COND_ALWAYS;
+        dc->jmp.dest = tcg_temp_new();
+        tcg_gen_ld_tl(dc->jmp.dest, cpu_env,
+                      offsetof(CPUTLGState, 
spregs[TILEGX_SPR_EX_CONTEXT_0_0]));
+        tcg_gen_andi_tl(dc->jmp.dest, dc->jmp.dest, ~7);
+        mnemonic = "iret";
+        goto done0;
     case OE_RR_X1(SWINT0):
     case OE_RR_X1(SWINT2):
     case OE_RR_X1(SWINT3):
@@ -606,7 +615,6 @@ static TileExcp gen_rr_opcode(DisasContext *dc, unsigned opext,
         break;
     case OE_RR_X0(FSINGLE_PACK1):
     case OE_RR_Y0(FSINGLE_PACK1):
-    case OE_RR_X1(IRET):
         return TILEGX_EXCP_OPCODE_UNIMPLEMENTED;
     case OE_RR_X1(LD1S):
         memop = MO_SB;
@@ -1947,6 +1955,10 @@ static const TileSPR *find_spr(unsigned spr)
       offsetof(CPUTLGState, spregs[TILEGX_SPR_CRITICAL_SEC]), 0, 0)
     D(SIM_CONTROL,
       offsetof(CPUTLGState, spregs[TILEGX_SPR_SIM_CONTROL]), 0, 0)
+    D(EX_CONTEXT_0_0,
+      offsetof(CPUTLGState, spregs[TILEGX_SPR_EX_CONTEXT_0_0]), 0, 0)
+
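For illustration, the iret behaviour described in the commit message can be modelled in a few lines of plain C. This is only a sketch under stated assumptions: SketchCPU and sketch_iret are hypothetical names, the state is reduced to four fields, and a false return stands in for the signal or exception that the real helper raises.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced CPU state; not the real CPUTLGState. */
typedef struct {
    uint64_t pc;
    uint64_t ex_context_0_0;   /* address that iret jumps to */
    uint64_t ex_context_0_1;   /* value destined for the ICS register */
    uint64_t critical_sec;     /* INTERRUPT_CRITICAL_SECTION */
} SketchCPU;

/* Returns true if iret completed, false if it would raise an exception. */
static bool sketch_iret(SketchCPU *cpu)
{
    assert(cpu != NULL);
    if (cpu->ex_context_0_1 > 1) {
        return false;          /* only 0 or 1 is accepted in user mode */
    }
    cpu->critical_sec = cpu->ex_context_0_1;
    /* Clear the low three bits, as the patch does with "~7". */
    cpu->pc = cpu->ex_context_0_0 & ~(uint64_t)7;
    return true;
}
```

In this model, a saved address of 0x1003 with EX_CONTEXT_0_1 set to 1 resumes at 0x1000 with the critical-section flag set, while an EX_CONTEXT_0_1 value of 2 is rejected.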

Re: [Qemu-devel] [PATCH] throttle: test that snapshots move the throttling configuration

2015-10-06 Thread Alberto Garcia

Ping

On Thu 17 Sep 2015 04:33:06 PM CEST, Alberto Garcia wrote:
> If a snapshot is performed on a device that has I/O limits they should
> be moved to the target image (the new active layer).
>
> Signed-off-by: Alberto Garcia 
> ---
>  tests/qemu-iotests/096 | 69 
> ++
>  tests/qemu-iotests/096.out |  5 
>  tests/qemu-iotests/group   |  1 +
>  3 files changed, 75 insertions(+)
>  create mode 100644 tests/qemu-iotests/096
>  create mode 100644 tests/qemu-iotests/096.out
>
> diff --git a/tests/qemu-iotests/096 b/tests/qemu-iotests/096
> new file mode 100644
> index 000..e34204b
> --- /dev/null
> +++ b/tests/qemu-iotests/096
> @@ -0,0 +1,69 @@
> +#!/usr/bin/env python
> +#
> +# Test that snapshots move the throttling configuration to the active
> +# layer
> +#
> +# Copyright (C) 2015 Igalia, S.L.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +#
> +
> +import iotests
> +import os
> +
> +class TestLiveSnapshot(iotests.QMPTestCase):
> +    base_img = os.path.join(iotests.test_dir, 'base.img')
> +    target_img = os.path.join(iotests.test_dir, 'target.img')
> +    group = 'mygroup'
> +    iops = 6000
> +    iops_size = 1024
> +
> +    def setUp(self):
> +        opts = []
> +        opts.append('node-name=base')
> +        opts.append('throttling.group=%s' % self.group)
> +        opts.append('throttling.iops-total=%d' % self.iops)
> +        opts.append('throttling.iops-size=%d' % self.iops_size)
> +        iotests.qemu_img('create', '-f', iotests.imgfmt, self.base_img, '100M')
> +        self.vm = iotests.VM().add_drive(self.base_img, ','.join(opts))
> +        self.vm.launch()
> +
> +    def tearDown(self):
> +        self.vm.shutdown()
> +        os.remove(self.base_img)
> +        os.remove(self.target_img)
> +
> +    def checkConfig(self, active_layer):
> +        result = self.vm.qmp('query-named-block-nodes')
> +        for r in result['return']:
> +            if r['node-name'] == active_layer:
> +                self.assertEqual(r['group'], self.group)
> +                self.assertEqual(r['iops'], self.iops)
> +                self.assertEqual(r['iops_size'], self.iops_size)
> +            else:
> +                self.assertFalse(r.has_key('group'))
> +                self.assertEqual(r['iops'], 0)
> +                self.assertFalse(r.has_key('iops_size'))
> +
> +    def testSnapshot(self):
> +        self.checkConfig('base')
> +        self.vm.qmp('blockdev-snapshot-sync',
> +                    node_name = 'base',
> +                    snapshot_node_name = 'target',
> +                    snapshot_file = self.target_img,
> +                    format = iotests.imgfmt)
> +        self.checkConfig('target')
> +
> +if __name__ == '__main__':
> +    iotests.main(supported_fmts=['qcow2'])
> diff --git a/tests/qemu-iotests/096.out b/tests/qemu-iotests/096.out
> new file mode 100644
> index 000..ae1213e
> --- /dev/null
> +++ b/tests/qemu-iotests/096.out
> @@ -0,0 +1,5 @@
> +.
> +--
> +Ran 1 tests
> +
> +OK
> diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
> index 439b1d2..30c784e 100644
> --- a/tests/qemu-iotests/group
> +++ b/tests/qemu-iotests/group
> @@ -102,6 +102,7 @@
>  093 auto
>  094 rw auto quick
>  095 rw auto quick
> +096 rw auto quick
>  097 rw auto backing
>  098 rw auto backing quick
>  099 rw auto quick
> -- 
> 2.5.1
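
The behaviour the quoted test checks for (I/O limits follow the new active layer, and the old layer is left unthrottled) can be sketched in plain C. This is an illustrative model only: SketchThrottle and sketch_snapshot_move_limits are hypothetical names and do not correspond to QEMU's throttling structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-node throttling settings; not QEMU's real structures. */
typedef struct {
    const char *group;    /* NULL when the node is not throttled */
    unsigned iops;
    unsigned iops_size;
} SketchThrottle;

/* A snapshot puts `overlay` on top of `base`: the limits move to the
 * new active layer and the old one ends up with no limits at all. */
static void sketch_snapshot_move_limits(SketchThrottle *base,
                                        SketchThrottle *overlay)
{
    assert(base != NULL && overlay != NULL);
    *overlay = *base;
    base->group = NULL;
    base->iops = 0;
    base->iops_size = 0;
}
```

With the values used in the test (group 'mygroup', 6000 IOPS, iops-size 1024), the overlay ends up carrying all three settings and the base node reports none, which mirrors what checkConfig() asserts before and after the snapshot.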

[Qemu-devel] [PATCH 2/2] hw/arm/virt: don't use a15memmap directly

2015-10-06 Thread Andrew Jones

We should always go through VirtBoardInfo when we need the memmap.
To avoid using a15memmap directly, in this case, we need to defer
the max-cpus check from class init time to instance init time. In
class init we now use MAX_CPUMASK_BITS for max_cpus initialization,
which is the maximum QEMU supports, and also, incidentally, the
maximum KVM/gicv3 currently supports. Also, a nice side-effect of
delaying the max-cpus check is that we now get more appropriate
error messages for gicv2 machines that try to configure more than
123 cpus. Before this patch it would complain that the requested
number of cpus was greater than 123, but for gicv2 configs, it
should complain that the number is greater than 8.

Signed-off-by: Andrew Jones 
---
 hw/arm/virt.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index d25d6cfce74cd..a9901983731ae 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -918,7 +918,7 @@ static void machvirt_init(MachineState *machine)
 qemu_irq pic[NUM_IRQS];
 MemoryRegion *sysmem = get_system_memory();
 int gic_version = vms->gic_version;
-int n;
+int n, max_cpus;
 MemoryRegion *ram = g_new(MemoryRegion, 1);
 const char *cpu_model = machine->cpu_model;
 VirtBoardInfo *vbi;
@@ -952,6 +952,21 @@ static void machvirt_init(MachineState *machine)
 exit(1);
 }
 
+/* The maximum number of CPUs depends on the GIC version, or on how
+ * many redistributors we can fit into the memory map.
+ */
+if (gic_version == 3) {
+max_cpus = vbi->memmap[VIRT_GIC_REDIST].size / 0x20000;
+} else {
+max_cpus = GICV2_NCPU;
+}
+
+if (smp_cpus > max_cpus) {
+error_report("mach-virt: Number of SMP cpus requested (%d), "
+ "exceeds max cpus supported %d", smp_cpus, max_cpus);
+exit(1);
+}
+
 vbi->smp_cpus = smp_cpus;
 
 if (machine->ram_size > vbi->memmap[VIRT_MEM].size) {
@@ -1150,10 +1165,7 @@ static void virt_class_init(ObjectClass *oc, void *data)
 
 mc->desc = "ARM Virtual Machine",
 mc->init = machvirt_init;
-/* Our maximum number of CPUs depends on how many redistributors
- * we can fit into memory map
- */
-mc->max_cpus = a15memmap[VIRT_GIC_REDIST].size / 0x20000;
+mc->max_cpus = MAX_CPUMASK_BITS;
 mc->has_dynamic_sysbus = true;
 mc->block_default_type = IF_VIRTIO;
 mc->no_cdrom = 1;
-- 
2.4.3
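
As a worked example of the check that moves to instance init time, here is a minimal standalone sketch. The per-redistributor size (0x20000) and the region size used in the usage note (0x00F60000) are assumptions chosen so that the GICv3 result matches the 123-CPU figure in the commit message; the names sketch_max_cpus and sketch_smp_ok are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_GICV2_NCPU        8         /* GICv2 limit named in the commit message */
#define SKETCH_REDIST_FRAME_SIZE 0x20000u  /* assumed bytes per GICv3 redistributor */

/* Maximum CPU count for a given GIC version and redistributor region size. */
static int sketch_max_cpus(int gic_version, uint64_t redist_region_size)
{
    assert(gic_version == 2 || gic_version == 3);
    if (gic_version == 3) {
        return (int)(redist_region_size / SKETCH_REDIST_FRAME_SIZE);
    }
    return SKETCH_GICV2_NCPU;
}

/* True if the requested CPU count fits the limit for this configuration. */
static bool sketch_smp_ok(int smp_cpus, int gic_version,
                          uint64_t redist_region_size)
{
    return smp_cpus <= sketch_max_cpus(gic_version, redist_region_size);
}
```

Under these assumptions, a region of 0x00F60000 bytes gives a GICv3 limit of 123, while a GICv2 machine is capped at 8 regardless of the region size, so a request for 9 CPUs is rejected on GICv2 and accepted on GICv3.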

Re: [Qemu-devel] [PATCH] scsi: switch from g_slice allocator to malloc

2015-10-06 Thread Stefan Hajnoczi

On Thu, Oct 01, 2015 at 01:04:40PM +0200, Paolo Bonzini wrote:
> Simplify memory allocation by sticking with a single API.  GSlice
> is not that fast anyway (tcmalloc/jemalloc are better).
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  hw/scsi/scsi-bus.c  |  4 ++--
>  hw/scsi/virtio-scsi-dataplane.c | 10 +-
>  hw/scsi/virtio-scsi.c   | 12 +---
>  3 files changed, 12 insertions(+), 14 deletions(-)

Reviewed-by: Stefan Hajnoczi

Re: [Qemu-devel] [Fix PATCH] Qemu/Xen: Fix early freeing MSIX MMIO memory region

2015-10-06 Thread Lan, Tianyu




On 10/6/2015 9:49 PM, Stefano Stabellini wrote:

On Tue, 6 Oct 2015, Paolo Bonzini wrote:

On 05/10/2015 18:53, Stefano Stabellini wrote:

This patch is to fix the issue via moving MSIX MMIO memory region into
struct XenPCIPassthroughState and free it together with pt device's obj.


Given that all the MSI-X related info are in XenPTMSIX, I would prefer
to keep the mmio memory region there, if possible.

Couldn't you just unhook msix->mmio from XenPCIPassthroughState's object
in xen_pt_msix_delete?  Calling object_property_del_child or
object_unparent?


This is the right thing to do, but there are two separate things to fix.

One is the use-after-free of msix->mmio, the other is that freeing
s->msix and in general xen_pt_config_delete should be done from the
.instance_finalize callback.  This is documented in docs/memory.txt.

This is an attempt at a patch (not even compiled):

diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
index e3d7194..e476bac 100644
--- a/hw/xen/xen_pt_msi.c
+++ b/hw/xen/xen_pt_msi.c
@@ -610,7 +610,7 @@ error_out:
  return rc;
  }

-void xen_pt_msix_delete(XenPCIPassthroughState *s)
+void xen_pt_msix_unmap(XenPCIPassthroughState *s)
  {
  XenPTMSIX *msix = s->msix;

@@ -627,6 +627,17 @@ void xen_pt_msix_delete(XenPCIPassthroughState *s)
  }

  memory_region_del_subregion(&s->bar[msix->bar_index], &msix->mmio);
+}
+
+void xen_pt_msix_delete(XenPCIPassthroughState *s)
+{
+XenPTMSIX *msix = s->msix;
+
+if (!msix) {
+return;
+}
+
+object_unparent(&msix->mmio);

  g_free(s->msix);
  s->msix = NULL;

where xen_pt_config_unmap would be called from xen_pt_destroy, and the call
to xen_pt_config_delete would be moved to xen_pci_passthrough_info's
instance_finalize member.
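
The two-phase teardown proposed here (unmap when the device is destroyed, free only at finalize time) can be illustrated with a small standalone model. This is a sketch with hypothetical types: SketchMSIX and SketchDevice merely stand in for XenPTMSIX and XenPCIPassthroughState, and a boolean stands in for the memory region being mapped into the BAR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for XenPTMSIX and XenPCIPassthroughState. */
typedef struct {
    bool mapped;       /* MMIO region is still reachable through the BAR */
} SketchMSIX;

typedef struct {
    SketchMSIX *msix;
} SketchDevice;

/* Phase 1 (device teardown): make the region unreachable, keep the memory. */
static void sketch_msix_unmap(SketchDevice *s)
{
    if (s->msix) {
        s->msix->mapped = false;
    }
}

/* Phase 2 (finalize): nothing can reach the region any more, so free it. */
static void sketch_msix_delete(SketchDevice *s)
{
    if (!s->msix) {
        return;
    }
    assert(!s->msix->mapped);  /* freeing a still-mapped region is the bug */
    free(s->msix);
    s->msix = NULL;
}
```

The point of the split is ordering: the state survives the unmap step, so anything still holding a reference between the two phases does not touch freed memory.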


Thanks for the explanation and the code. It makes sense to me.

Lan, could you please write up a patch based on this approach and test
it?



Sure. I will update the patch following this guidance. Thanks Paolo & Stefano.

Re: [Qemu-devel] [PULL 00/10] VFIO updates for 2015-10-05

2015-10-06 Thread Peter Maydell

On 5 October 2015 at 21:36, Alex Williamson  wrote:
> The following changes since commit c0b520dfb8890294a9f8879f4759172900585995:
>
>   Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
> (2015-10-02 16:59:21 +0100)
>
> are available in the git repository at:
>
>
>   git://github.com/awilliam/qemu-vfio.git tags/vfio-update-20151005.0
>
> for you to fetch changes up to 727299697dd31f0e1ccecc7eab1bf658e8ed3079:
>
>   vfio: Expose a VFIO PCI device's group for EEH (2015-10-05 12:40:13 -0600)
>
> 
> VFIO updates 2015-10-05
>
>  - Change platform device IRQ setup sequence for compatibility
>with upcoming IRQ forwarding (Eric Auger)
>  - Extensions to support vfio-pci devices on spapr-pci-host-bridge
>(David Gibson)
>
> 

Hi. I'm afraid this fails to build with clang:

In file included from /home/petmay01/linaro/qemu-for-merges/hw/vfio/pci.c:38:
/home/petmay01/linaro/qemu-for-merges/include/hw/vfio/vfio-pci.h:7:26:
error: redefinition of typedef 'VFIOGroup' is a C11 feature
[-Werror,-Wtypedef-redefinition]
typedef struct VFIOGroup VFIOGroup;
 ^
/home/petmay01/linaro/qemu-for-merges/include/hw/vfio/vfio-common.h:117:3:
note: previous definition is here
} VFIOGroup;
  ^
1 error generated.

thanks
-- PMM
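
For illustration, one common way to avoid this class of error when building as pre-C11 C is to introduce the typedef name in exactly one place and refer to the struct tag everywhere else. The sketch below shows that pattern in a single file; it is not necessarily the fix that was applied to the series, and the groupid field and sketch_group_id function are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The only place that introduces the typedef name (think: one shared header). */
typedef struct VFIOGroup VFIOGroup;

/* Other headers can name the type through its struct tag, without
 * repeating the typedef. */
struct VFIOGroup;
static int sketch_group_id(const struct VFIOGroup *group);

/* The definition also uses only the tag, so the typedef name is never
 * declared a second time. */
struct VFIOGroup {
    int groupid;   /* hypothetical field for this sketch */
};

static int sketch_group_id(const struct VFIOGroup *group)
{
    assert(group != NULL);
    return group->groupid;
}
```

Repeating a struct tag declaration is valid in every C standard, whereas repeating a typedef is only permitted from C11 onwards, which is what the -Wtypedef-redefinition diagnostic reports.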

Re: [Qemu-devel] [PATCH v18 00/21] Deterministic replay core

2015-10-06 Thread Paolo Bonzini



On 21/09/2015 09:12, Pavel Dovgaluk wrote:
> Hi!
> 
> Paolo, have you reviewed these patches?

Pavel,

I think this is ready to go in.  Here are my final changes,
can you ack them?

Thanks,

Paolo

diff --git a/Makefile.objs b/Makefile.objs
index bc43e5c..ba4b45e 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -58,6 +58,8 @@ common-obj-y += audio/
 common-obj-y += hw/
 common-obj-y += accel.o
 
+common-obj-y += replay/
+
 common-obj-y += ui/
 common-obj-y += bt-host.o bt-vhci.o
 bt-host.o-cflags := $(BLUEZ_CFLAGS)
diff --git a/Makefile.target b/Makefile.target
index ca8f351..962d004 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -88,7 +88,6 @@ obj-y = exec.o translate-all.o cpu-exec.o
 obj-y += translate-common.o
 obj-y += cpu-exec-common.o
 obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
-obj-y += replay/
 obj-$(CONFIG_TCG_INTERPRETER) += tci.o
 obj-y += tcg/tcg-common.o
 obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
diff --git a/qemu-timer.c b/qemu-timer.c
index e7a5c96..023d0fa 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -572,15 +572,14 @@ int64_t timerlistgroup_deadline_ns(QEMUTimerListGroup *tlg)
 QEMUClockType type;
 bool play = replay_mode == REPLAY_MODE_PLAY;
 for (type = 0; type < QEMU_CLOCK_MAX; type++) {
-if (qemu_clock_use_for_deadline(tlg->tl[type]->clock->type)) {
-if (!play || tlg->tl[type]->clock->type == QEMU_CLOCK_REALTIME) {
+if (qemu_clock_use_for_deadline(type)) {
+if (!play || type == QEMU_CLOCK_REALTIME) {
 deadline = qemu_soonest_timeout(deadline,
-timerlist_deadline_ns(
-tlg->tl[type]));
+timerlist_deadline_ns(tlg->tl[type]));
 } else {
 /* Read clock from the replay file and
do not calculate the deadline, based on virtual clock. */
-qemu_clock_get_ns(tlg->tl[type]->clock->type);
+qemu_clock_get_ns(type);
 }
 }
 }
@@ -606,8 +605,7 @@ int64_t qemu_clock_get_ns(QEMUClockType type)
 now = REPLAY_CLOCK(REPLAY_CLOCK_HOST, get_clock_realtime());
 last = clock->last;
 clock->last = now;
-if ((now < last || now > (last + get_max_clock_jump()))
-&& replay_mode == REPLAY_MODE_NONE) {
+if (now < last || now > (last + get_max_clock_jump())) {
 notifier_list_notify(&clock->reset_notifiers, &now);
 }
 return now;
diff --git a/replay/Makefile.objs b/replay/Makefile.objs
index 1267969..186dbcf 100755
--- a/replay/Makefile.objs
+++ b/replay/Makefile.objs
@@ -1,6 +1,6 @@
-obj-$(CONFIG_SOFTMMU) += replay.o
-obj-$(CONFIG_SOFTMMU) += replay-internal.o
-obj-$(CONFIG_SOFTMMU) += replay-events.o
-obj-$(CONFIG_SOFTMMU) += replay-time.o
-obj-$(CONFIG_SOFTMMU) += replay-input.o
-obj-$(CONFIG_USER_ONLY) += replay-user.o
+common-obj-$(CONFIG_SOFTMMU) += replay.o
+common-obj-$(CONFIG_SOFTMMU) += replay-internal.o
+common-obj-$(CONFIG_SOFTMMU) += replay-events.o
+common-obj-$(CONFIG_SOFTMMU) += replay-time.o
+common-obj-$(CONFIG_SOFTMMU) += replay-input.o
+common-obj-$(CONFIG_USER_ONLY) += replay-user.o
diff --git a/replay/replay-internal.c b/replay/replay-internal.c
index 69fe49f..4b6a83a 100755
--- a/replay/replay-internal.c
+++ b/replay/replay-internal.c
@@ -196,7 +196,7 @@ void replay_save_instructions(void)
 if (replay_file && replay_mode == REPLAY_MODE_RECORD) {
 replay_mutex_lock();
 int diff = (int)(replay_get_current_step() - replay_state.current_step);
-if (first_cpu != NULL && diff > 0) {
+if (diff > 0) {
 replay_put_event(EVENT_INSTRUCTION);
 replay_put_dword(diff);
 replay_state.current_step += diff;
diff --git a/stubs/replay.c b/stubs/replay.c
index f7f74c9..a52182d 100755
--- a/stubs/replay.c
+++ b/stubs/replay.c
@@ -21,11 +21,6 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint)
 return 0;
 }
 
-int runstate_is_running(void)
-{
-return 0;
-}
-
 bool replay_events_enabled(void)
 {
 return false;

Re: [Qemu-devel] [PULL 00/10] VFIO updates for 2015-10-05

2015-10-06 Thread Alex Williamson

On Tue, 2015-10-06 at 15:50 +0100, Peter Maydell wrote:
> On 5 October 2015 at 21:36, Alex Williamson  
> wrote:
> > The following changes since commit c0b520dfb8890294a9f8879f4759172900585995:
> >
> >   Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
> > (2015-10-02 16:59:21 +0100)
> >
> > are available in the git repository at:
> >
> >
> >   git://github.com/awilliam/qemu-vfio.git tags/vfio-update-20151005.0
> >
> > for you to fetch changes up to 727299697dd31f0e1ccecc7eab1bf658e8ed3079:
> >
> >   vfio: Expose a VFIO PCI device's group for EEH (2015-10-05 12:40:13 -0600)
> >
> > 
> > VFIO updates 2015-10-05
> >
> >  - Change platform device IRQ setup sequence for compatibility
> >with upcoming IRQ forwarding (Eric Auger)
> >  - Extensions to support vfio-pci devices on spapr-pci-host-bridge
> >(David Gibson)
> >
> > 
> 
> Hi. I'm afraid this fails to build with clang:
> 
> In file included from /home/petmay01/linaro/qemu-for-merges/hw/vfio/pci.c:38:
> /home/petmay01/linaro/qemu-for-merges/include/hw/vfio/vfio-pci.h:7:26:
> error: redefinition of typedef 'VFIOGroup' is a C11 feature
> [-Werror,-Wtypedef-redefinition]
> typedef struct VFIOGroup VFIOGroup;
>  ^
> /home/petmay01/linaro/qemu-for-merges/include/hw/vfio/vfio-common.h:117:3:
> note: previous definition is here
> } VFIOGroup;
>   ^
> 1 error generated.

Thanks Peter.  David, can you send a replacement for that last patch?
Thanks,

Alex

Re: [Qemu-devel] [PATCHv2] fw_cfg: Define a static signature to be returned on DMA port reads

2015-10-06 Thread Kevin O'Connor

On Tue, Oct 06, 2015 at 09:30:18AM +0200, Laszlo Ersek wrote:
> On 10/06/15 01:51, Kevin O'Connor wrote:
> > Return a static signature ("QEMU CFG") if the guest does a read to the
> > DMA address io register.
> > 
> > Signed-off-by: Kevin O'Connor 
> > ---
> > 
> > Marc, if you decide to respin your fw_cfg series, I've updated the dma
> > signature patch.  This addresses the comments from Stefan, and I hope
> > it addresses the comments from Laszlo.
> 
> Thank you -- I didn't know about extract64().
> 
> The patch looks good to me, but I think the QEMU coding style requires
> /* ... */ comments, and forbids //.

I always forget about that one.  Marc, if you do respin the series I
updated the patch below.

-Kevin


commit 02a449ece95da00fa64a9c704555c6afa8e03579
Author: Kevin O'Connor 
Date:   Thu Oct 1 14:16:59 2015 +0200

fw_cfg: Define a static signature to be returned on DMA port reads

Return a static signature ("QEMU CFG") if the guest does a read to the
DMA address io register.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin O'Connor 

diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
index 2d6b2da..cbdce7d 100644
--- a/docs/specs/fw_cfg.txt
+++ b/docs/specs/fw_cfg.txt
@@ -93,6 +93,9 @@ by selecting the "signature" item using key 0x0000 (FW_CFG_SIGNATURE),
 and reading four bytes from the data register. If the fw_cfg device is
 present, the four bytes read will contain the characters "QEMU".
 
+If the DMA interface is available, then reading the DMA Address
+Register returns 0x51454d5520434647 ("QEMU CFG" in big-endian format).
+
 === Revision / feature bitmap (Key 0x0001, FW_CFG_ID) ===
 
 A 32-bit little-endian unsigned int, this item is used to check for enabled
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 59933b3..5098bfc 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -53,6 +53,8 @@
 #define FW_CFG_DMA_CTL_SKIP0x04
 #define FW_CFG_DMA_CTL_SELECT  0x08
 
+#define FW_CFG_DMA_SIGNATURE 0x51454d5520434647 /* "QEMU CFG" */
+
 typedef struct FWCfgEntry {
 uint32_t len;
 uint8_t *data;
@@ -393,6 +395,13 @@ static void fw_cfg_dma_transfer(FWCfgState *s)
 trace_fw_cfg_read(s, 0);
 }
 
+static uint64_t fw_cfg_dma_mem_read(void *opaque, hwaddr addr,
+unsigned size)
+{
+/* Return a signature value (and handle various read sizes) */
+return extract64(FW_CFG_DMA_SIGNATURE, (8 - addr - size) * 8, size * 8);
+}
+
 static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr,
  uint64_t value, unsigned size)
 {
@@ -416,8 +425,8 @@ static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr,
 static bool fw_cfg_dma_mem_valid(void *opaque, hwaddr addr,
   unsigned size, bool is_write)
 {
-return is_write && ((size == 4 && (addr == 0 || addr == 4)) ||
-(size == 8 && addr == 0));
+return !is_write || ((size == 4 && (addr == 0 || addr == 4)) ||
+ (size == 8 && addr == 0));
 }
 
 static bool fw_cfg_data_mem_valid(void *opaque, hwaddr addr,
@@ -488,6 +497,7 @@ static const MemoryRegionOps fw_cfg_comb_mem_ops = {
 };
 
 static const MemoryRegionOps fw_cfg_dma_mem_ops = {
+.read = fw_cfg_dma_mem_read,
 .write = fw_cfg_dma_mem_write,
 .endianness = DEVICE_BIG_ENDIAN,
 .valid.accepts = fw_cfg_dma_mem_valid,
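
To make the offset arithmetic in fw_cfg_dma_mem_read() concrete, here is a standalone sketch of what reads of different sizes return. sketch_extract64 is a stand-in written for this example (it is meant to behave like QEMU's extract64(), taking bits from the least significant end), and sketch_dma_sig_read is a hypothetical name for the read handler.

```c
#include <assert.h>
#include <stdint.h>

/* "QEMU CFG" as a big-endian 64-bit value, as defined in the patch. */
#define FW_CFG_DMA_SIGNATURE 0x51454d5520434647ULL

/* Stand-in for extract64(): return `length` bits of `value`, starting
 * at bit `start`, where bit 0 is the least significant bit. */
static uint64_t sketch_extract64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/* Model of a big-endian register read of `size` bytes at byte offset
 * `addr`: offset 0 holds the most significant byte of the signature. */
static uint64_t sketch_dma_sig_read(unsigned addr, unsigned size)
{
    assert(size >= 1 && addr + size <= 8);
    return sketch_extract64(FW_CFG_DMA_SIGNATURE,
                            (int)((8 - addr - size) * 8), (int)(size * 8));
}
```

In this model an 8-byte read at offset 0 returns the full value, a 4-byte read returns 0x51454d55 ("QEMU") at offset 0 and 0x20434647 (" CFG") at offset 4, and single-byte reads at offsets 0 and 7 return 0x51 ('Q') and 0x47 ('G').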

Re: [Qemu-devel] [PATCH v6 3/4] block: add a 'blockdev-snapshot' QMP command

2015-10-06 Thread Kevin Wolf

Am 22.09.2015 um 15:28 hat Alberto Garcia geschrieben:
> One of the limitations of the 'blockdev-snapshot-sync' command is that
> it does not allow passing BlockdevOptions to the newly created
> snapshots, so they are always opened using the default values.
> 
> Extending the command to allow passing options is not a practical
> solution because there is overlap between those options and some of
> the existing parameters of the command.
> 
> This patch introduces a new 'blockdev-snapshot' command with a simpler
> interface: it just takes two references to existing block devices that
> will be used as the source and target for the snapshot.
> 
> Since the main difference between the two commands is that one of them
> creates and opens the target image, while the other uses an already
> opened one, the bulk of the implementation is shared.
> 
> Signed-off-by: Alberto Garcia 
> Cc: Eric Blake 
> Cc: Max Reitz 
> ---
>  blockdev.c   | 163 
> ---
>  qapi-schema.json |   2 +
>  qapi/block-core.json |  28 +
>  qmp-commands.hx  |  38 
>  4 files changed, 171 insertions(+), 60 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 1a5b889..daf72f3 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1183,6 +1183,18 @@ void qmp_blockdev_snapshot_sync(bool has_device, const char *device,
> &snapshot, errp);
>  }
>  
> +void qmp_blockdev_snapshot(const char *node, const char *overlay,
> +   Error **errp)
> +{
> +BlockdevSnapshot snapshot_data = {
> +.node = (char *) node,
> +.overlay = (char *) overlay
> +};
> +
> +blockdev_do_action(TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT,
> +   &snapshot_data, errp);
> +}
> +
>  void qmp_blockdev_snapshot_internal_sync(const char *device,
>   const char *name,
>   Error **errp)
> @@ -1521,57 +1533,48 @@ typedef struct ExternalSnapshotState {
>  static void external_snapshot_prepare(BlkTransactionState *common,
>Error **errp)
>  {
> -int flags, ret;
> -QDict *options;
> +int flags = 0, ret;
> +QDict *options = NULL;
>  Error *local_err = NULL;
> -bool has_device = false;
> +/* Device and node name of the image to generate the snapshot from */
>  const char *device;
> -bool has_node_name = false;
>  const char *node_name;
> -bool has_snapshot_node_name = false;
> -const char *snapshot_node_name;
> +/* Reference to the new image (for 'blockdev-snapshot') */
> +const char *snapshot_ref;
> +/* File name of the new image (for 'blockdev-snapshot-sync') */
>  const char *new_image_file;
> -const char *format = "qcow2";
> -enum NewImageMode mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
>  ExternalSnapshotState *state =
>   DO_UPCAST(ExternalSnapshotState, common, common);
>  TransactionAction *action = common->action;
>  
> -/* get parameters */
> -g_assert(action->kind == TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC);
> -
> -has_device = action->blockdev_snapshot_sync->has_device;
> -device = action->blockdev_snapshot_sync->device;
> -has_node_name = action->blockdev_snapshot_sync->has_node_name;
> -node_name = action->blockdev_snapshot_sync->node_name;
> -has_snapshot_node_name =
> -action->blockdev_snapshot_sync->has_snapshot_node_name;
> -snapshot_node_name = action->blockdev_snapshot_sync->snapshot_node_name;
> -
> -new_image_file = action->blockdev_snapshot_sync->snapshot_file;
> -if (action->blockdev_snapshot_sync->has_format) {
> -format = action->blockdev_snapshot_sync->format;
> -}
> -if (action->blockdev_snapshot_sync->has_mode) {
> -mode = action->blockdev_snapshot_sync->mode;
> +/* 'blockdev-snapshot' and 'blockdev-snapshot-sync' have similar
> + * purpose but a different set of parameters */
> +switch (action->kind) {
> +case TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT:
> +{
> +BlockdevSnapshot *s = action->blockdev_snapshot;
> +device = s->node;
> +node_name = s->node;
> +new_image_file = NULL;
> +snapshot_ref = s->overlay;
> +}
> +break;
> +case TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC:
> +{
> +BlockdevSnapshotSync *s = action->blockdev_snapshot_sync;
> +device = s->has_device ? s->device : NULL;
> +node_name = s->has_node_name ? s->node_name : NULL;
> +new_image_file = s->snapshot_file;
> +snapshot_ref = NULL;
> +}
> +break;
> +default:
> +g_assert_not_reached();
>  }
>  
>  /* start processing */
> -state->old_bs = bdrv_lookup_bs(has_device ?

Re: [Qemu-devel] [PATCH] gluster: allocate GlusterAIOCBs on the stack

2015-10-06 Thread Jeff Cody

On Thu, Oct 01, 2015 at 01:04:38PM +0200, Paolo Bonzini wrote:
> This is simpler now that the driver has been converted to coroutines.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  block/gluster.c | 86 
> ++---
>  1 file changed, 33 insertions(+), 53 deletions(-)
>

Thanks,

Applied to my block branch:

git git://github.com/codyprime/qemu-kvm-jtc.git block

-Jeff

Re: [Qemu-devel] [PATCH v12 02/10] init/cleanup of netfilter object

2015-10-06 Thread Yang Hongyang




On 10/01/2015 12:59 AM, Markus Armbruster wrote:

Yang Hongyang  writes:


Add a netfilter object based on QOM.

A netfilter is attached to a netdev, captures all network packets
that pass through the netdev. When we delete the netdev, we also
delete the netfilter object attached to it, because if the netdev is
removed, the filter which attached to it is useless.

QTAILQ_ENTRY next used by netdev, filter belongs to the specific netdev is
in this queue.


I'm afraid this paragraph is incomprehensible.  What are you trying to
say?


It was just meant to explain what the next field is used for. It is there
because we had a global list before; since the global list was removed from
this series, I will remove this paragraph.




Signed-off-by: Yang Hongyang 
---
  include/net/filter.h|  62 ++
  include/net/net.h   |   1 +
  include/qemu/typedefs.h |   1 +
  net/Makefile.objs   |   1 +
  net/filter.c| 138 
  net/net.c   |   7 +++
  qapi-schema.json|  19 +++
  7 files changed, 229 insertions(+)
  create mode 100644 include/net/filter.h
  create mode 100644 net/filter.c

diff --git a/include/net/filter.h b/include/net/filter.h
new file mode 100644
index 000..e3e14ea
--- /dev/null
+++ b/include/net/filter.h
@@ -0,0 +1,62 @@
+/*
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Author: Yang Hongyang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_NET_FILTER_H
+#define QEMU_NET_FILTER_H
+
+#include "qom/object.h"
+#include "qemu-common.h"
+#include "qemu/typedefs.h"
+#include "net/queue.h"
+
+#define TYPE_NETFILTER "netfilter"
+#define NETFILTER(obj) \
+OBJECT_CHECK(NetFilterState, (obj), TYPE_NETFILTER)
+#define NETFILTER_GET_CLASS(obj) \
+OBJECT_GET_CLASS(NetFilterClass, (obj), TYPE_NETFILTER)
+#define NETFILTER_CLASS(klass) \
+OBJECT_CLASS_CHECK(NetFilterClass, (klass), TYPE_NETFILTER)
+
+typedef void (FilterSetup) (NetFilterState *nf, Error **errp);
+typedef void (FilterCleanup) (NetFilterState *nf);
+/*
+ * Return:
+ *   0: finished handling the packet, we should continue
+ *   size: filter stolen this packet, we stop pass this packet further
+ */
+typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,
+   NetClientState *sender,
+   unsigned flags,
+   const struct iovec *iov,
+   int iovcnt,
+   NetPacketSent *sent_cb);
+
+struct NetFilterClass {
+ObjectClass parent_class;
+
+/* optional */
+FilterSetup *setup;
+FilterCleanup *cleanup;
+/* mandatory */
+FilterReceiveIOV *receive_iov;
+};
+typedef struct NetFilterClass NetFilterClass;


No separate typedef, please.


+
+
+struct NetFilterState {
+/* private */
+Object parent;
+
+/* protected */
+char *netdev_id;
+NetClientState *netdev;
+NetFilterDirection direction;
+QTAILQ_ENTRY(NetFilterState) next;
+};
+
+#endif /* QEMU_NET_FILTER_H */
diff --git a/include/net/net.h b/include/net/net.h
index 6a6cbef..36e5fab 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -92,6 +92,7 @@ struct NetClientState {
  NetClientDestructor *destructor;
  unsigned int queue_index;
  unsigned rxfilter_notify_enabled:1;
+QTAILQ_HEAD(, NetFilterState) filters;
  };

  typedef struct NICState {
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index 3a835ff..ee1ce1d 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -45,6 +45,7 @@ typedef struct Monitor Monitor;
  typedef struct MouseTransformInfo MouseTransformInfo;
  typedef struct MSIMessage MSIMessage;
  typedef struct NetClientState NetClientState;
+typedef struct NetFilterState NetFilterState;
  typedef struct NICInfo NICInfo;
  typedef struct PcGuestInfo PcGuestInfo;
  typedef struct PCIBridge PCIBridge;
diff --git a/net/Makefile.objs b/net/Makefile.objs
index ec19cb3..914aec0 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -13,3 +13,4 @@ common-obj-$(CONFIG_HAIKU) += tap-haiku.o
  common-obj-$(CONFIG_SLIRP) += slirp.o
  common-obj-$(CONFIG_VDE) += vde.o
  common-obj-$(CONFIG_NETMAP) += netmap.o
+common-obj-y += filter.o
diff --git a/net/filter.c b/net/filter.c
new file mode 100644
index 000..34e32cd
--- /dev/null
+++ b/net/filter.c
@@ -0,0 +1,138 @@
+/*
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Author: Yang Hongyang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu-common.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/error-report.h"
+
+#include "net/filter.h"
+#include "net/net.h"
+#include "net/vhost_net.h"
+#include

Re: [Qemu-devel] [PATCH v12 02/10] init/cleanup of netfilter object

2015-10-06 Thread Yang Hongyang




On 10/01/2015 12:59 AM, Markus Armbruster wrote:

Yang Hongyang  writes:


Add a netfilter object based on QOM.

A netfilter is attached to a netdev, captures all network packets
that pass through the netdev. When we delete the netdev, we also
delete the netfilter object attached to it, because if the netdev is
removed, the filter which attached to it is useless.

QTAILQ_ENTRY next used by netdev, filter belongs to the specific netdev is
in this queue.


I'm afraid this paragraph is incomprehensible.  What are you trying to
say?


Signed-off-by: Yang Hongyang 
---
  include/net/filter.h|  62 ++
  include/net/net.h   |   1 +
  include/qemu/typedefs.h |   1 +
  net/Makefile.objs   |   1 +
  net/filter.c| 138 
  net/net.c   |   7 +++
  qapi-schema.json|  19 +++
  7 files changed, 229 insertions(+)
  create mode 100644 include/net/filter.h
  create mode 100644 net/filter.c

diff --git a/include/net/filter.h b/include/net/filter.h
new file mode 100644
index 000..e3e14ea
--- /dev/null
+++ b/include/net/filter.h
@@ -0,0 +1,62 @@
+/*
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Author: Yang Hongyang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_NET_FILTER_H
+#define QEMU_NET_FILTER_H
+
+#include "qom/object.h"
+#include "qemu-common.h"
+#include "qemu/typedefs.h"
+#include "net/queue.h"
+
+#define TYPE_NETFILTER "netfilter"
+#define NETFILTER(obj) \
+OBJECT_CHECK(NetFilterState, (obj), TYPE_NETFILTER)
+#define NETFILTER_GET_CLASS(obj) \
+OBJECT_GET_CLASS(NetFilterClass, (obj), TYPE_NETFILTER)
+#define NETFILTER_CLASS(klass) \
+OBJECT_CLASS_CHECK(NetFilterClass, (klass), TYPE_NETFILTER)
+
+typedef void (FilterSetup) (NetFilterState *nf, Error **errp);
+typedef void (FilterCleanup) (NetFilterState *nf);
+/*
+ * Return:
+ *   0: finished handling the packet, we should continue
+ *   size: filter stolen this packet, we stop pass this packet further
+ */
+typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,
+   NetClientState *sender,
+   unsigned flags,
+   const struct iovec *iov,
+   int iovcnt,
+   NetPacketSent *sent_cb);
+
+struct NetFilterClass {
+ObjectClass parent_class;
+
+/* optional */
+FilterSetup *setup;
+FilterCleanup *cleanup;
+/* mandatory */
+FilterReceiveIOV *receive_iov;
+};
+typedef struct NetFilterClass NetFilterClass;


No separate typedef, please.


It seems this check in checkpatch.pl is still present; it reports an error
if the typedef is combined, but I will do as you said anyway.




+
+
+struct NetFilterState {
+/* private */
+Object parent;
+
+/* protected */
+char *netdev_id;
+NetClientState *netdev;
+NetFilterDirection direction;
+QTAILQ_ENTRY(NetFilterState) next;
+};
+
+#endif /* QEMU_NET_FILTER_H */
diff --git a/include/net/net.h b/include/net/net.h
index 6a6cbef..36e5fab 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -92,6 +92,7 @@ struct NetClientState {
  NetClientDestructor *destructor;
  unsigned int queue_index;
  unsigned rxfilter_notify_enabled:1;
+QTAILQ_HEAD(, NetFilterState) filters;
  };

  typedef struct NICState {
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index 3a835ff..ee1ce1d 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -45,6 +45,7 @@ typedef struct Monitor Monitor;
  typedef struct MouseTransformInfo MouseTransformInfo;
  typedef struct MSIMessage MSIMessage;
  typedef struct NetClientState NetClientState;
+typedef struct NetFilterState NetFilterState;
  typedef struct NICInfo NICInfo;
  typedef struct PcGuestInfo PcGuestInfo;
  typedef struct PCIBridge PCIBridge;
diff --git a/net/Makefile.objs b/net/Makefile.objs
index ec19cb3..914aec0 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -13,3 +13,4 @@ common-obj-$(CONFIG_HAIKU) += tap-haiku.o
  common-obj-$(CONFIG_SLIRP) += slirp.o
  common-obj-$(CONFIG_VDE) += vde.o
  common-obj-$(CONFIG_NETMAP) += netmap.o
+common-obj-y += filter.o
diff --git a/net/filter.c b/net/filter.c
new file mode 100644
index 000..34e32cd
--- /dev/null
+++ b/net/filter.c
@@ -0,0 +1,138 @@
+/*
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Author: Yang Hongyang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu-common.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/error-report.h"
+
+#include "net/filter.h"
+#include "net/net.h"
+#include "net/vhost_net.h"
+#include "qom/object_interfaces.h"
+
+static char

[Qemu-devel] [PATCH v13 02/10] init/cleanup of netfilter object

2015-10-06 Thread Yang Hongyang

Add a netfilter object based on QOM.

A netfilter is attached to a netdev and captures all network packets
that pass through that netdev. When the netdev is deleted, the netfilter
object attached to it is deleted as well, because a filter without its
netdev is useless.

Signed-off-by: Yang Hongyang 
---
 include/net/filter.h|  61 +
 include/net/net.h   |   1 +
 include/qemu/typedefs.h |   1 +
 net/Makefile.objs   |   1 +
 net/filter.c| 138 
 net/net.c   |   7 +++
 qapi-schema.json|  20 +++
 7 files changed, 229 insertions(+)
 create mode 100644 include/net/filter.h
 create mode 100644 net/filter.c

diff --git a/include/net/filter.h b/include/net/filter.h
new file mode 100644
index 000..be27dee
--- /dev/null
+++ b/include/net/filter.h
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Author: Yang Hongyang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_NET_FILTER_H
+#define QEMU_NET_FILTER_H
+
+#include "qom/object.h"
+#include "qemu-common.h"
+#include "qemu/typedefs.h"
+#include "net/queue.h"
+
+#define TYPE_NETFILTER "netfilter"
+#define NETFILTER(obj) \
+OBJECT_CHECK(NetFilterState, (obj), TYPE_NETFILTER)
+#define NETFILTER_GET_CLASS(obj) \
+OBJECT_GET_CLASS(NetFilterClass, (obj), TYPE_NETFILTER)
+#define NETFILTER_CLASS(klass) \
+OBJECT_CLASS_CHECK(NetFilterClass, (klass), TYPE_NETFILTER)
+
+typedef void (FilterSetup) (NetFilterState *nf, Error **errp);
+typedef void (FilterCleanup) (NetFilterState *nf);
+/*
+ * Return:
+ *   0: finished handling the packet, we should continue
+ *   size: filter stole this packet, we stop passing this packet further
+ */
+typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,
+   NetClientState *sender,
+   unsigned flags,
+   const struct iovec *iov,
+   int iovcnt,
+   NetPacketSent *sent_cb);
+
+typedef struct NetFilterClass {
+ObjectClass parent_class;
+
+/* optional */
+FilterSetup *setup;
+FilterCleanup *cleanup;
+/* mandatory */
+FilterReceiveIOV *receive_iov;
+} NetFilterClass;
+
+
+struct NetFilterState {
+/* private */
+Object parent;
+
+/* protected */
+char *netdev_id;
+NetClientState *netdev;
+NetFilterDirection direction;
+QTAILQ_ENTRY(NetFilterState) next;
+};
+
+#endif /* QEMU_NET_FILTER_H */
diff --git a/include/net/net.h b/include/net/net.h
index 6a6cbef..36e5fab 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -92,6 +92,7 @@ struct NetClientState {
 NetClientDestructor *destructor;
 unsigned int queue_index;
 unsigned rxfilter_notify_enabled:1;
+QTAILQ_HEAD(, NetFilterState) filters;
 };
 
 typedef struct NICState {
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index 3a835ff..ee1ce1d 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -45,6 +45,7 @@ typedef struct Monitor Monitor;
 typedef struct MouseTransformInfo MouseTransformInfo;
 typedef struct MSIMessage MSIMessage;
 typedef struct NetClientState NetClientState;
+typedef struct NetFilterState NetFilterState;
 typedef struct NICInfo NICInfo;
 typedef struct PcGuestInfo PcGuestInfo;
 typedef struct PCIBridge PCIBridge;
diff --git a/net/Makefile.objs b/net/Makefile.objs
index ec19cb3..914aec0 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -13,3 +13,4 @@ common-obj-$(CONFIG_HAIKU) += tap-haiku.o
 common-obj-$(CONFIG_SLIRP) += slirp.o
 common-obj-$(CONFIG_VDE) += vde.o
 common-obj-$(CONFIG_NETMAP) += netmap.o
+common-obj-y += filter.o
diff --git a/net/filter.c b/net/filter.c
new file mode 100644
index 000..d406259
--- /dev/null
+++ b/net/filter.c
@@ -0,0 +1,138 @@
+/*
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Author: Yang Hongyang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu-common.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/error-report.h"
+
+#include "net/filter.h"
+#include "net/net.h"
+#include "net/vhost_net.h"
+#include "qom/object_interfaces.h"
+
+static char *netfilter_get_netdev_id(Object *obj, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+
+return g_strdup(nf->netdev_id);
+}
+
+static void netfilter_set_netdev_id(Object *obj, const char *str, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+
+nf->netdev_id = g_strdup(str);
+}
+
+static int netfilter_get_direction(Object *obj, Error **errp G_GNUC_UNUSED)
+{
+NetFilterState *nf = NETFILTER(obj);
+return nf->direction;
+}
+
+static void

[Qemu-devel] [PATCH v13 03/10] netfilter: hook packets before net queue send

2015-10-06 Thread Yang Hongyang

Capture packets that will be sent.

Signed-off-by: Yang Hongyang 
Reviewed-by: Thomas Huth 
Signed-off-by: Jason Wang 
---
 include/net/filter.h |  8 +++
 net/filter.c | 17 ++
 net/net.c| 66 
 3 files changed, 91 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index be27dee..db035b6 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -58,4 +58,12 @@ struct NetFilterState {
 QTAILQ_ENTRY(NetFilterState) next;
 };
 
+ssize_t qemu_netfilter_receive(NetFilterState *nf,
+   NetFilterDirection direction,
+   NetClientState *sender,
+   unsigned flags,
+   const struct iovec *iov,
+   int iovcnt,
+   NetPacketSent *sent_cb);
+
 #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter.c b/net/filter.c
index d406259..147c57f 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -15,6 +15,23 @@
 #include "net/vhost_net.h"
 #include "qom/object_interfaces.h"
 
+ssize_t qemu_netfilter_receive(NetFilterState *nf,
+   NetFilterDirection direction,
+   NetClientState *sender,
+   unsigned flags,
+   const struct iovec *iov,
+   int iovcnt,
+   NetPacketSent *sent_cb)
+{
+if (nf->direction == direction ||
+nf->direction == NET_FILTER_DIRECTION_ALL) {
+return NETFILTER_GET_CLASS(OBJECT(nf))->receive_iov(
+   nf, sender, flags, iov, iovcnt, sent_cb);
+}
+
+return 0;
+}
+
 static char *netfilter_get_netdev_id(Object *obj, Error **errp)
 {
 NetFilterState *nf = NETFILTER(obj);
diff --git a/net/net.c b/net/net.c
index 033f4f3..e27643d 100644
--- a/net/net.c
+++ b/net/net.c
@@ -561,6 +561,44 @@ int qemu_can_send_packet(NetClientState *sender)
 return 1;
 }
 
+static ssize_t filter_receive_iov(NetClientState *nc,
+  NetFilterDirection direction,
+  NetClientState *sender,
+  unsigned flags,
+  const struct iovec *iov,
+  int iovcnt,
+  NetPacketSent *sent_cb)
+{
+ssize_t ret = 0;
+NetFilterState *nf = NULL;
+
+QTAILQ_FOREACH(nf, &nc->filters, next) {
+ret = qemu_netfilter_receive(nf, direction, sender, flags, iov,
+ iovcnt, sent_cb);
+if (ret) {
+return ret;
+}
+}
+
+return ret;
+}
+
+static ssize_t filter_receive(NetClientState *nc,
+  NetFilterDirection direction,
+  NetClientState *sender,
+  unsigned flags,
+  const uint8_t *data,
+  size_t size,
+  NetPacketSent *sent_cb)
+{
+struct iovec iov = {
+.iov_base = (void *)data,
+.iov_len = size
+};
+
+return filter_receive_iov(nc, direction, sender, flags, &iov, 1, sent_cb);
+}
+
 ssize_t qemu_deliver_packet(NetClientState *sender,
 unsigned flags,
 const uint8_t *data,
@@ -632,6 +670,7 @@ static ssize_t qemu_send_packet_async_with_flags(NetClientState *sender,
  NetPacketSent *sent_cb)
 {
 NetQueue *queue;
+int ret;
 
 #ifdef DEBUG_NET
 printf("qemu_send_packet_async:\n");
@@ -642,6 +681,19 @@ static ssize_t qemu_send_packet_async_with_flags(NetClientState *sender,
 return size;
 }
 
+/* Let filters handle the packet first */
+ret = filter_receive(sender, NET_FILTER_DIRECTION_TX,
+ sender, flags, buf, size, sent_cb);
+if (ret) {
+return ret;
+}
+
+ret = filter_receive(sender->peer, NET_FILTER_DIRECTION_RX,
+ sender, flags, buf, size, sent_cb);
+if (ret) {
+return ret;
+}
+
 queue = sender->peer->incoming_queue;
 
 return qemu_net_queue_send(queue, sender, flags, buf, size, sent_cb);
@@ -712,11 +764,25 @@ ssize_t qemu_sendv_packet_async(NetClientState *sender,
 NetPacketSent *sent_cb)
 {
 NetQueue *queue;
+int ret;
 
 if (sender->link_down || !sender->peer) {
 return iov_size(iov, iovcnt);
 }
 
+/* Let filters handle the packet first */
+ret = filter_receive_iov(sender, NET_FILTER_DIRECTION_TX, sender,
+ QEMU_NET_PACKET_FLAG_NONE, iov, iovcnt, sent_cb);
+if (ret) {
+return ret;
+}
+
+ret =

[Qemu-devel] [PATCH v13 01/10] vl.c: init delayed object after net_init_clients

2015-10-06 Thread Yang Hongyang

Init delayed objects after net_init_clients, because netfilters need
to be initialized after the net clients have been initialized.

Signed-off-by: Yang Hongyang 
---
 vl.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/vl.c b/vl.c
index f2bd8d2..366f38f 100644
--- a/vl.c
+++ b/vl.c
@@ -2767,6 +2767,7 @@ static bool object_create_initial(const char *type)
 if (g_str_equal(type, "rng-egd")) {
 return false;
 }
+/* TODO: implement netfilters */
 return true;
 }
 
@@ -4286,12 +4287,6 @@ int main(int argc, char **argv, char **envp)
 exit(0);
 }
 
-if (qemu_opts_foreach(qemu_find_opts("object"),
-  object_create,
-  object_create_delayed, NULL)) {
-exit(1);
-}
-
 machine_opts = qemu_get_machine_opts();
 if (qemu_opt_foreach(machine_opts, machine_set_property, current_machine,
  NULL)) {
@@ -4397,6 +4392,12 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 
+if (qemu_opts_foreach(qemu_find_opts("object"),
+  object_create,
+  object_create_delayed, NULL)) {
+exit(1);
+}
+
 #ifdef CONFIG_TPM
 if (tpm_init() < 0) {
 exit(1);
-- 
1.9.1
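
The ordering this patch sets up can be sketched as a small standalone
model. This is only an illustration, not the vl.c code: StartupLog,
log_event(), create_objects() and run_startup() are hypothetical names,
and treating "filter-buffer" as a delayed type is an assumption made
for the example (only "rng-egd" is visible in the hunk above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { MAX_EVENTS = 16 };

/* Records the order in which things get created (hypothetical helper). */
typedef struct {
    const char *events[MAX_EVENTS];
    size_t n;
} StartupLog;

static void log_event(StartupLog *log, const char *what)
{
    assert(log->n < MAX_EVENTS);
    log->events[log->n++] = what;
}

/* Modelled on object_create_initial(): false means "create later". */
static bool create_in_initial_pass(const char *type)
{
    if (strcmp(type, "rng-egd") == 0) {
        return false;
    }
    /* Assumption for this sketch: netfilter types are delayed too. */
    if (strcmp(type, "filter-buffer") == 0) {
        return false;
    }
    return true;
}

/* Create only the objects that belong to the requested pass. */
static void create_objects(const char *const *types, size_t n,
                           bool initial, StartupLog *log)
{
    for (size_t i = 0; i < n; i++) {
        if (create_in_initial_pass(types[i]) == initial) {
            log_event(log, types[i]);
        }
    }
}

/* Initial objects, then net clients, then the delayed objects. */
static void run_startup(const char *const *types, size_t n, StartupLog *log)
{
    create_objects(types, n, true, log);
    log_event(log, "net_init_clients");
    create_objects(types, n, false, log);
}
```

With the types "rng-random", "filter-buffer" and "rng-egd", the log ends
up with "rng-random" before "net_init_clients" and the other two after
it, which is the property a netfilter needs: its netdev already exists
when the filter object is completed.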

[Qemu-devel] [PATCH v13 00/10] Add a netfilter object and netbuffer filter

2015-10-06 Thread Yang Hongyang

This series adds an abstract netfilter object that captures all network
packets on its associated netdev. It also implements a concrete buffer
filter based on this abstract object. The "buffer" netfilter can be used by
VM FT solutions like MicroCheckpointing to buffer/release packets, or to
simulate packet delay.

You can also get the series from:
https://github.com/macrosheep/qemu/tree/netfilter-v12

Usage:
 -netdev tap,id=bn0
 -device e1000,netdev=bn0
 -object filter-buffer,id=f0,netdev=bn0,queue=rx,interval=1000

Dynamically add/remove netfilters:
 object_add filter-buffer,id=f0,netdev=bn0,queue=rx,interval=1000
 object_del f0

NOTE:
 interval is in microseconds and can't be omitted.
 queue is optional and is one of rx|tx|all; the default is "all". See
 enum NetFilterDirection for details.
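
How the queue setting decides which filters see a packet can be sketched
as follows. This is a simplified standalone model, not the QEMU code:
Direction, Filter, pass_through(), steal() and run_filters() are
hypothetical stand-ins, and a plain linked list replaces the QTAILQ. It
assumes the convention from the series that a filter returns 0 to let
the packet continue and the packet size when it takes the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical stand-in for enum NetFilterDirection. */
typedef enum { DIR_ALL, DIR_RX, DIR_TX } Direction;

typedef struct Filter Filter;
typedef ssize_t (*ReceiveFn)(Filter *f, size_t size);

struct Filter {
    Direction direction; /* the "queue" option: rx, tx or all */
    ReceiveFn receive;
    unsigned seen;       /* packets this filter was shown */
    Filter *next;
};

/* Looks at the packet and lets it continue down the chain. */
static ssize_t pass_through(Filter *f, size_t size)
{
    (void)size;
    f->seen++;
    return 0;
}

/* Takes the packet: returning its size stops further processing. */
static ssize_t steal(Filter *f, size_t size)
{
    f->seen++;
    return (ssize_t)size;
}

/* Show the packet to each filter whose direction matches. */
static ssize_t run_filters(Filter *head, Direction dir, size_t size)
{
    for (Filter *f = head; f != NULL; f = f->next) {
        if (f->direction != dir && f->direction != DIR_ALL) {
            continue; /* this filter does not watch this direction */
        }
        ssize_t ret = f->receive(f, size);
        if (ret != 0) {
            return ret; /* packet was taken by this filter */
        }
    }
    return 0; /* nobody took it; the caller goes on to queue it */
}
```

A filter configured with queue=rx is skipped for tx traffic, and a
filter left at the default "all" is consulted in both directions.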

v13:
 - Address Markus's comments.

v12:
 - Address Markus's comments.
 - Rebased to the latest master.

v11:
 - address Jason's comments
 - add multiqueue support, the last 2 patches
 - rebased to the latest master

v10:
 - Reimplemented using QOM (suggested by stefan)
 - Do not export NetQueue internals (suggested by stefan)
 - see individual patch for detail

v9:
 - squash command description and help to patch 1&3
 - qapi changes according to Markus's comments
 - see individual patch for detail

v8:
 - some minor fixes according to Thomas's comments
 - rebased to the latest master branch

v7:
 - print filter info when execute 'info network'
 - addressed Jason's comments

v6:
 - add multiqueue support, please see individual patch for detail

v5:
 - add a sent_cb param to filter receive_iov api
 - squash the 4th patch into patch 3
 - remove dummy sent_cb (buffer filter)
 - addressed Jason's other comments, see individual patches for detail

v4:
 - get rid of struct Filter
 - squash the 4th patch into patch 2
 - fix qemu_netfilter_pass_to_next_iov
 - get rid of bh (buffer filter)
 - release the packet to next filter instead of to receiver (buffer filter)

v3:
 - add an api to pass the packet to next filter
 - remove netfilters when delete netdev
 - add qtest testcases for netfilter
 - addressed comments from Jason

v2:
 - add a chain option to netfilter object
 - move the hook place earlier, before net_queue_send
 - drop the unused api in buffer filter
 - squash buffer filter patches into one
 - remove receive() api from netfilter, only receive_iov() is enough
 - addressed comments from Jason

v1:
 initial patch.

Yang Hongyang (10):
  vl.c: init delayed object after net_init_clients
  init/cleanup of netfilter object
  netfilter: hook packets before net queue send
  net: merge qemu_deliver_packet and qemu_deliver_packet_iov
  net/queue: introduce NetQueueDeliverFunc
  netfilter: add an API to pass the packet to next filter
  netfilter: print filter info associate with the netdev
  net/queue: export qemu_net_queue_append_iov
  netfilter: add a netbuffer filter
  tests: add test cases for netfilter object

 include/net/filter.h|  77 
 include/net/net.h   |   6 +-
 include/net/queue.h |  20 -
 include/qemu/typedefs.h |   1 +
 net/Makefile.objs   |   2 +
 net/filter-buffer.c | 186 ++
 net/filter.c| 233 
 net/net.c   | 121 +++--
 net/queue.c |  24 +++--
 qapi-schema.json|  20 +
 qemu-options.hx |  17 
 tests/.gitignore|   1 +
 tests/Makefile  |   2 +
 tests/test-netfilter.c  | 200 +
 vl.c|  17 ++--
 15 files changed, 878 insertions(+), 49 deletions(-)
 create mode 100644 include/net/filter.h
 create mode 100644 net/filter-buffer.c
 create mode 100644 net/filter.c
 create mode 100644 tests/test-netfilter.c

-- 
1.9.1

[Qemu-devel] [PATCH v13 04/10] net: merge qemu_deliver_packet and qemu_deliver_packet_iov

2015-10-06 Thread Yang Hongyang

qemu_deliver_packet_iov already has a compat delivery path, so we
can drop qemu_deliver_packet.

Signed-off-by: Yang Hongyang 
Signed-off-by: Jason Wang 
---
 include/net/net.h |  5 -
 net/net.c | 51 ---
 net/queue.c   |  6 +-
 3 files changed, 21 insertions(+), 41 deletions(-)

diff --git a/include/net/net.h b/include/net/net.h
index 36e5fab..7af3e15 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -152,11 +152,6 @@ void qemu_check_nic_model(NICInfo *nd, const char *model);
 int qemu_find_nic_model(NICInfo *nd, const char * const *models,
 const char *default_model);
 
-ssize_t qemu_deliver_packet(NetClientState *sender,
-unsigned flags,
-const uint8_t *data,
-size_t size,
-void *opaque);
 ssize_t qemu_deliver_packet_iov(NetClientState *sender,
 unsigned flags,
 const struct iovec *iov,
diff --git a/net/net.c b/net/net.c
index e27643d..2f939f9 100644
--- a/net/net.c
+++ b/net/net.c
@@ -599,36 +599,6 @@ static ssize_t filter_receive(NetClientState *nc,
     return filter_receive_iov(nc, direction, sender, flags, &iov, 1, sent_cb);
 }
 
-ssize_t qemu_deliver_packet(NetClientState *sender,
-unsigned flags,
-const uint8_t *data,
-size_t size,
-void *opaque)
-{
-NetClientState *nc = opaque;
-ssize_t ret;
-
-if (nc->link_down) {
-return size;
-}
-
-if (nc->receive_disabled) {
-return 0;
-}
-
-if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) {
-ret = nc->info->receive_raw(nc, data, size);
-} else {
-ret = nc->info->receive(nc, data, size);
-}
-
-if (ret == 0) {
-nc->receive_disabled = 1;
-}
-
-return ret;
-}
-
 void qemu_purge_queued_packets(NetClientState *nc)
 {
 if (!nc->peer) {
@@ -719,14 +689,25 @@ ssize_t qemu_send_packet_raw(NetClientState *nc, const uint8_t *buf, int size)
 }
 
 static ssize_t nc_sendv_compat(NetClientState *nc, const struct iovec *iov,
-   int iovcnt)
+   int iovcnt, unsigned flags)
 {
-uint8_t buffer[NET_BUFSIZE];
+uint8_t buf[NET_BUFSIZE];
+uint8_t *buffer;
 size_t offset;
 
-offset = iov_to_buf(iov, iovcnt, 0, buffer, sizeof(buffer));
+if (iovcnt == 1) {
+buffer = iov[0].iov_base;
+offset = iov[0].iov_len;
+} else {
+buffer = buf;
+offset = iov_to_buf(iov, iovcnt, 0, buffer, sizeof(buf));
+}
 
-return nc->info->receive(nc, buffer, offset);
+if (flags & QEMU_NET_PACKET_FLAG_RAW && nc->info->receive_raw) {
+return nc->info->receive_raw(nc, buffer, offset);
+} else {
+return nc->info->receive(nc, buffer, offset);
+}
 }
 
 ssize_t qemu_deliver_packet_iov(NetClientState *sender,
@@ -749,7 +730,7 @@ ssize_t qemu_deliver_packet_iov(NetClientState *sender,
 if (nc->info->receive_iov) {
 ret = nc->info->receive_iov(nc, iov, iovcnt);
 } else {
-ret = nc_sendv_compat(nc, iov, iovcnt);
+ret = nc_sendv_compat(nc, iov, iovcnt, flags);
 }
 
 if (ret == 0) {
diff --git a/net/queue.c b/net/queue.c
index ebbe2bb..cf8db3a 100644
--- a/net/queue.c
+++ b/net/queue.c
@@ -152,9 +152,13 @@ static ssize_t qemu_net_queue_deliver(NetQueue *queue,
   size_t size)
 {
 ssize_t ret = -1;
+struct iovec iov = {
+.iov_base = (void *)data,
+.iov_len = size
+};
 
 queue->delivering = 1;
-ret = qemu_deliver_packet(sender, flags, data, size, queue->opaque);
+ret = qemu_deliver_packet_iov(sender, flags, &iov, 1, queue->opaque);
 queue->delivering = 0;
 
 return ret;
-- 
1.9.1
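
The single-element fast path in the compat delivery can be sketched in
isolation. This is a minimal model under stated assumptions, not the
net.c code: flatten_iov() is a hypothetical stand-in for iov_to_buf(),
and linearize() is a made-up name. The scratch buffer capacity is passed
explicitly, since sizeof applied to a pointer yields the pointer size
rather than the size of the buffer it points to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h> /* struct iovec */

/* Copy the iovec elements back to back into buf, up to cap bytes. */
static size_t flatten_iov(const struct iovec *iov, int iovcnt,
                          uint8_t *buf, size_t cap)
{
    size_t off = 0;

    for (int i = 0; i < iovcnt && off < cap; i++) {
        size_t n = iov[i].iov_len;
        if (n > cap - off) {
            n = cap - off; /* truncate instead of overflowing */
        }
        memcpy(buf + off, iov[i].iov_base, n);
        off += n;
    }
    return off;
}

/*
 * Return a contiguous view of the packet and store its length in *len.
 * A single-element iovec is used in place; only a multi-element one is
 * copied into the caller-provided scratch buffer.
 */
static const uint8_t *linearize(const struct iovec *iov, int iovcnt,
                                uint8_t *scratch, size_t scratch_size,
                                size_t *len)
{
    assert(iov != NULL && iovcnt > 0 && len != NULL);

    if (iovcnt == 1) {
        *len = iov[0].iov_len;
        return iov[0].iov_base;
    }
    *len = flatten_iov(iov, iovcnt, scratch, scratch_size);
    return scratch;
}
```

The common case of a packet that is already contiguous therefore costs
no copy, which is what lets a single delivery function serve both the
flat-buffer and the scatter-gather callers.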

[Qemu-devel] [PATCH v13 08/10] net/queue: export qemu_net_queue_append_iov

2015-10-06 Thread Yang Hongyang

This will be used by buffer filter implementation later to
queue packets.

Signed-off-by: Yang Hongyang 
Reviewed-by: Thomas Huth 
Signed-off-by: Jason Wang 
---
 include/net/queue.h |  7 +++
 net/queue.c | 12 ++--
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/net/queue.h b/include/net/queue.h
index b4a7183..5469fdb 100644
--- a/include/net/queue.h
+++ b/include/net/queue.h
@@ -47,6 +47,13 @@ typedef ssize_t (NetQueueDeliverFunc)(NetClientState *sender,
 
 NetQueue *qemu_new_net_queue(NetQueueDeliverFunc *deliver, void *opaque);
 
+void qemu_net_queue_append_iov(NetQueue *queue,
+   NetClientState *sender,
+   unsigned flags,
+   const struct iovec *iov,
+   int iovcnt,
+   NetPacketSent *sent_cb);
+
 void qemu_del_net_queue(NetQueue *queue);
 
 ssize_t qemu_net_queue_send(NetQueue *queue,
diff --git a/net/queue.c b/net/queue.c
index 16dddf0..de8b9d3 100644
--- a/net/queue.c
+++ b/net/queue.c
@@ -112,12 +112,12 @@ static void qemu_net_queue_append(NetQueue *queue,
     QTAILQ_INSERT_TAIL(&queue->packets, packet, entry);
 }
 
-static void qemu_net_queue_append_iov(NetQueue *queue,
-  NetClientState *sender,
-  unsigned flags,
-  const struct iovec *iov,
-  int iovcnt,
-  NetPacketSent *sent_cb)
+void qemu_net_queue_append_iov(NetQueue *queue,
+   NetClientState *sender,
+   unsigned flags,
+   const struct iovec *iov,
+   int iovcnt,
+   NetPacketSent *sent_cb)
 {
 NetPacket *packet;
 size_t max_len = 0;
-- 
1.9.1
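
What "queueing" a scatter-gather packet involves can be sketched with a
small standalone queue. This is an assumption-laden illustration, not
the net/queue.c implementation: Packet, PacketQueue, queue_init(),
queue_append_iov() and queue_pop() are hypothetical names. The point it
shows is that the queued packet owns a private copy of the data, so the
sender may reuse its buffers as soon as the append returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h> /* struct iovec */

typedef struct Packet {
    struct Packet *next;
    size_t size;
    unsigned char data[]; /* private copy of the payload */
} Packet;

typedef struct {
    Packet *head;
    Packet **tail; /* points at the last "next" link */
    size_t count;
} PacketQueue;

static void queue_init(PacketQueue *q)
{
    q->head = NULL;
    q->tail = &q->head;
    q->count = 0;
}

/* Copy all iovec elements into one new packet and append it. */
static int queue_append_iov(PacketQueue *q, const struct iovec *iov,
                            int iovcnt)
{
    size_t total = 0;
    size_t off = 0;

    assert(q != NULL && iov != NULL && iovcnt > 0);
    for (int i = 0; i < iovcnt; i++) {
        total += iov[i].iov_len;
    }

    Packet *p = malloc(sizeof(*p) + total);
    if (p == NULL) {
        return -1;
    }
    p->next = NULL;
    p->size = total;
    for (int i = 0; i < iovcnt; i++) {
        memcpy(p->data + off, iov[i].iov_base, iov[i].iov_len);
        off += iov[i].iov_len;
    }

    *q->tail = p;
    q->tail = &p->next;
    q->count++;
    return 0;
}

/* Detach and return the oldest packet; the caller frees it. */
static Packet *queue_pop(PacketQueue *q)
{
    Packet *p = q->head;

    if (p != NULL) {
        q->head = p->next;
        if (q->head == NULL) {
            q->tail = &q->head;
        }
        q->count--;
        p->next = NULL;
    }
    return p;
}
```

A filter that holds packets back needs exactly this kind of append
operation, which is why the function has to be visible outside the
queue implementation.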

[Qemu-devel] [PATCH v13 09/10] netfilter: add a netbuffer filter

2015-10-06 Thread Yang Hongyang

This filter buffers and releases packets. It can be used with
MicroCheckpointing or other Remus-like VM FT solutions.
You can also use it to simulate network delay.

Usage:
 -netdev tap,id=bn0
 -object filter-buffer,id=f0,netdev=bn0,queue=rx,interval=1000

NOTE:
 Interval is in microseconds, it can't be omitted currently, and can't be 0.
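
The hold-then-release behaviour can be modelled without any QEMU
infrastructure. The sketch below is a simplification under stated
assumptions: BufferFilter, buffer_setup(), buffer_receive() and
buffer_tick() are hypothetical names, packets are only counted rather
than stored, and the caller supplies the current time instead of a real
timer firing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t interval_us; /* release period, must be non-zero */
    int64_t deadline_us;  /* next release time */
    unsigned held;        /* packets currently buffered */
    unsigned released;    /* packets released so far */
} BufferFilter;

/* Reject a zero interval, otherwise arm the first release. */
static bool buffer_setup(BufferFilter *bf, uint32_t interval_us,
                         int64_t now_us)
{
    if (interval_us == 0) {
        return false;
    }
    bf->interval_us = interval_us;
    bf->deadline_us = now_us + interval_us;
    bf->held = 0;
    bf->released = 0;
    return true;
}

/* Every packet is held until the next release. */
static void buffer_receive(BufferFilter *bf)
{
    bf->held++;
}

/* Release everything once the deadline has passed, then rearm. */
static void buffer_tick(BufferFilter *bf, int64_t now_us)
{
    assert(bf->interval_us != 0);
    if (now_us >= bf->deadline_us) {
        bf->released += bf->held;
        bf->held = 0;
        bf->deadline_us = now_us + bf->interval_us;
    }
}
```

With interval=1000, two packets received at time 0 stay held at 999 us
and are both released at 1000 us, after which the next release is due
at 2000 us.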

Signed-off-by: Yang Hongyang 
Signed-off-by: Jason Wang 
---
 net/Makefile.objs   |   1 +
 net/filter-buffer.c | 186 
 qemu-options.hx |  17 +
 vl.c|   6 +-
 4 files changed, 209 insertions(+), 1 deletion(-)
 create mode 100644 net/filter-buffer.c

diff --git a/net/Makefile.objs b/net/Makefile.objs
index 914aec0..5fa2f97 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -14,3 +14,4 @@ common-obj-$(CONFIG_SLIRP) += slirp.o
 common-obj-$(CONFIG_VDE) += vde.o
 common-obj-$(CONFIG_NETMAP) += netmap.o
 common-obj-y += filter.o
+common-obj-y += filter-buffer.o
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
new file mode 100644
index 000..336bdcc
--- /dev/null
+++ b/net/filter-buffer.c
@@ -0,0 +1,186 @@
+/*
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Author: Yang Hongyang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "net/filter.h"
+#include "net/queue.h"
+#include "qemu-common.h"
+#include "qemu/timer.h"
+#include "qemu/iov.h"
+#include "qapi/qmp/qerror.h"
+#include "qapi-visit.h"
+#include "qom/object.h"
+
+#define TYPE_FILTER_BUFFER "filter-buffer"
+
+#define FILTER_BUFFER(obj) \
+OBJECT_CHECK(FilterBufferState, (obj), TYPE_FILTER_BUFFER)
+
+typedef struct FilterBufferState {
+NetFilterState parent_obj;
+
+NetQueue *incoming_queue;
+uint32_t interval;
+QEMUTimer release_timer;
+} FilterBufferState;
+
+static void filter_buffer_flush(NetFilterState *nf)
+{
+FilterBufferState *s = FILTER_BUFFER(nf);
+
+if (!qemu_net_queue_flush(s->incoming_queue)) {
+/* Unable to empty the queue, purge remaining packets */
+qemu_net_queue_purge(s->incoming_queue, nf->netdev);
+}
+}
+
+static void filter_buffer_release_timer(void *opaque)
+{
+NetFilterState *nf = opaque;
+FilterBufferState *s = FILTER_BUFFER(nf);
+
+/*
+ * Note: filter_buffer_flush() drops packets that can't be sent
+ * TODO: We should leave them queued.  But currently there's no way
+ * for the next filter or receiver to notify us that it can receive
+ * more packets.
+ */
+filter_buffer_flush(nf);
+/* Timer rearmed to fire again in s->interval microseconds. */
+timer_mod(&s->release_timer,
+  qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + s->interval);
+}
+
+/* filter APIs */
+static ssize_t filter_buffer_receive_iov(NetFilterState *nf,
+ NetClientState *sender,
+ unsigned flags,
+ const struct iovec *iov,
+ int iovcnt,
+ NetPacketSent *sent_cb)
+{
+FilterBufferState *s = FILTER_BUFFER(nf);
+
+/*
+ * We return size when buffering a packet; the sender will treat it as
+ * an already sent packet, so sent_cb should not be called later.
+ *
+ * FIXME: Even if the guest can't receive packets for some reasons,
+ * the filter can still accept packets until its internal queue is full.
+ * For example:
+ *   For some reason, receiver could not receive more packets
+ * (.can_receive() returns zero). Without a filter, at most one packet
+ * will be queued in incoming queue and sender's poll will be disabled
+ * until its sent_cb() is called. With a filter, it will keep receiving
+ * the packets without caring about the receiver. This is suboptimal.
+ * May need more thoughts (e.g keeping sent_cb).
+ */
+qemu_net_queue_append_iov(s->incoming_queue, sender, flags,
+  iov, iovcnt, NULL);
+return iov_size(iov, iovcnt);
+}
+
+static void filter_buffer_cleanup(NetFilterState *nf)
+{
+FilterBufferState *s = FILTER_BUFFER(nf);
+
+if (s->interval) {
+timer_del(&s->release_timer);
+}
+
+/* flush packets */
+if (s->incoming_queue) {
+filter_buffer_flush(nf);
+g_free(s->incoming_queue);
+}
+}
+
+static void filter_buffer_setup(NetFilterState *nf, Error **errp)
+{
+FilterBufferState *s = FILTER_BUFFER(nf);
+
+/*
+ * We may want to accept zero interval when VM FT solutions like MC
+ * or COLO use this filter to release packets on demand.
+ */
+if (!s->interval) {
+error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "interval",
+   "a non-zero interval");
+return;
+}
+
+s->incoming_queue =

[Qemu-devel] [PATCH v13 07/10] netfilter: print filter info associate with the netdev

2015-10-06 Thread Yang Hongyang

From: Yang Hongyang 

When executing "info network", also print filter info.
Add an info_str member to NetFilterState to store filter-specific
info.

Signed-off-by: Yang Hongyang 
Signed-off-by: Jason Wang 
---
 include/net/filter.h |  1 +
 net/filter.c | 20 
 net/net.c| 11 +++
 3 files changed, 32 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 5639976..2deda36 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -55,6 +55,7 @@ struct NetFilterState {
 char *netdev_id;
 NetClientState *netdev;
 NetFilterDirection direction;
+char info_str[256];
 QTAILQ_ENTRY(NetFilterState) next;
 };
 
diff --git a/net/filter.c b/net/filter.c
index 5d5022f..326f2b5 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -15,6 +15,7 @@
 #include "net/vhost_net.h"
 #include "qom/object_interfaces.h"
 #include "qemu/iov.h"
+#include "qapi/string-output-visitor.h"
 
 ssize_t qemu_netfilter_receive(NetFilterState *nf,
NetFilterDirection direction,
@@ -134,6 +135,9 @@ static void netfilter_complete(UserCreatable *uc, Error **errp)
 NetFilterClass *nfc = NETFILTER_GET_CLASS(uc);
 int queues;
 Error *local_err = NULL;
+char *str, *info;
+ObjectProperty *prop;
+StringOutputVisitor *ov;
 
 if (!nf->netdev_id) {
 error_setg(errp, "Parameter 'netdev' is required");
@@ -167,6 +171,22 @@ static void netfilter_complete(UserCreatable *uc, Error **errp)
 }
 }
     QTAILQ_INSERT_TAIL(&nf->netdev->filters, nf, next);
+
+/* generate info str */
+QTAILQ_FOREACH(prop, &OBJECT(nf)->properties, node) {
+if (!strcmp(prop->name, "type")) {
+continue;
+}
+ov = string_output_visitor_new(false);
+object_property_get(OBJECT(nf), string_output_get_visitor(ov),
+prop->name, errp);
+str = string_output_get_string(ov);
+string_output_visitor_cleanup(ov);
+info = g_strdup_printf(",%s=%s", prop->name, str);
+g_strlcat(nf->info_str, info, sizeof(nf->info_str));
+g_free(str);
+g_free(info);
+}
 }
 
 static void netfilter_finalize(Object *obj)
diff --git a/net/net.c b/net/net.c
index c0ebb13..39af893 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1179,10 +1179,21 @@ void qmp_netdev_del(const char *id, Error **errp)
 
 void print_net_client(Monitor *mon, NetClientState *nc)
 {
+NetFilterState *nf;
+
 monitor_printf(mon, "%s: index=%d,type=%s,%s\n", nc->name,
nc->queue_index,
NetClientOptionsKind_lookup[nc->info->type],
nc->info_str);
+if (!QTAILQ_EMPTY(&nc->filters)) {
+monitor_printf(mon, "filters:\n");
+}
+QTAILQ_FOREACH(nf, &nc->filters, next) {
+monitor_printf(mon, "  - %s: type=%s%s\n",
+   object_get_canonical_path_component(OBJECT(nf)),
+   object_get_typename(OBJECT(nf)),
+   nf->info_str);
+}
 }
 
 RxFilterInfoList *qmp_query_rx_filter(bool has_name, const char *name,
-- 
1.9.1
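
The way the info string is assembled, one ",name=value" fragment per
property appended into a fixed-size buffer, can be sketched in plain C.
This is an illustration only: append_prop() is a hypothetical helper
that uses snprintf() where the patch uses g_strdup_printf() and
g_strlcat(), and the property names in the usage note are assumed from
the usage lines in the cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Append ",name=value" to the NUL-terminated string in dst, which has
 * room for cap bytes in total. The result is always NUL-terminated;
 * the return value is false when the fragment had to be truncated.
 */
static bool append_prop(char *dst, size_t cap, const char *name,
                        const char *value)
{
    size_t used = strlen(dst);

    assert(used < cap);
    int n = snprintf(dst + used, cap - used, ",%s=%s", name, value);
    return n >= 0 && (size_t)n < cap - used;
}
```

Starting from an empty string, appending netdev=bn0 and then queue=rx
yields ",netdev=bn0,queue=rx". Because every fragment begins with a
comma, the string can be printed directly after "type=%s", as the
monitor_printf() format in the patch does.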

[Qemu-devel] [PATCH v13 06/10] netfilter: add an API to pass the packet to next filter

2015-10-06 Thread Yang Hongyang

Add an API, qemu_netfilter_pass_to_next(), to pass the packet
to the next filter.

Signed-off-by: Yang Hongyang 
Reviewed-by: Thomas Huth 
Signed-off-by: Jason Wang 
---
 include/net/filter.h |  7 +++
 net/filter.c | 58 
 2 files changed, 65 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index db035b6..5639976 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -66,4 +66,11 @@ ssize_t qemu_netfilter_receive(NetFilterState *nf,
int iovcnt,
NetPacketSent *sent_cb);
 
+/* pass the packet to the next filter */
+ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
+unsigned flags,
+const struct iovec *iov,
+int iovcnt,
+void *opaque);
+
 #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter.c b/net/filter.c
index 147c57f..5d5022f 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -14,6 +14,7 @@
 #include "net/net.h"
 #include "net/vhost_net.h"
 #include "qom/object_interfaces.h"
+#include "qemu/iov.h"
 
 ssize_t qemu_netfilter_receive(NetFilterState *nf,
NetFilterDirection direction,
@@ -32,6 +33,63 @@ ssize_t qemu_netfilter_receive(NetFilterState *nf,
 return 0;
 }
 
+ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
+unsigned flags,
+const struct iovec *iov,
+int iovcnt,
+void *opaque)
+{
+int ret = 0;
+int direction;
+NetFilterState *nf = opaque;
+NetFilterState *next = QTAILQ_NEXT(nf, next);
+
+if (!sender || !sender->peer) {
+/* no receiver, or sender has been deleted; no need to pass it further */
+goto out;
+}
+
+if (nf->direction == NET_FILTER_DIRECTION_ALL) {
+if (sender == nf->netdev) {
+/* This packet is sent by netdev itself */
+direction = NET_FILTER_DIRECTION_TX;
+} else {
+direction = NET_FILTER_DIRECTION_RX;
+}
+} else {
+direction = nf->direction;
+}
+
+while (next) {
+/*
+ * If qemu_netfilter_pass_to_next() has been called, the packet
+ * was held by a filter that already returned its size to the
+ * sender, so sent_cb must not be called later; just pass NULL
+ * to the next filter.
+ */
+ret = qemu_netfilter_receive(next, direction, sender, flags, iov,
+ iovcnt, NULL);
+if (ret) {
+return ret;
+}
+next = QTAILQ_NEXT(next, next);
+}
+
+/*
+ * We have gone through all filters; pass the packet to the receiver.
+ * Do the validity check again in case the sender or receiver was
+ * deleted while we went through the filters.
+ */
+if (sender && sender->peer) {
+qemu_net_queue_send_iov(sender->peer->incoming_queue,
+sender, flags, iov, iovcnt, NULL);
+}
+
+out:
+/* no receiver, or sender been deleted */
+return iov_size(iov, iovcnt);
+}
+
 static char *netfilter_get_netdev_id(Object *obj, Error **errp)
 {
 NetFilterState *nf = NETFILTER(obj);
-- 
1.9.1
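
The chain walk that qemu_netfilter_pass_to_next() performs can be illustrated
with a small, self-contained sketch. Everything below (the Filter type,
filter_pass(), filter_hold(), pass_to_next()) is a hypothetical stand-in
written for illustration, not QEMU's API; it only assumes what the patch
shows: resume after the calling filter, stop as soon as a later filter returns
nonzero, and deliver to the receiver once every remaining filter has let the
packet through.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical, simplified stand-in for a filter; not QEMU's NetFilterState. */
typedef struct Filter Filter;
struct Filter {
    /* Returns 0 to let the packet continue, nonzero if the filter kept it. */
    ssize_t (*receive)(Filter *f, const void *buf, size_t len);
    Filter *next;
    int seen;   /* number of packets this filter has looked at */
};

/* A filter that inspects the packet and lets it continue. */
ssize_t filter_pass(Filter *f, const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    f->seen++;
    return 0;
}

/* A filter that holds the packet and reports its size as consumed. */
ssize_t filter_hold(Filter *f, const void *buf, size_t len)
{
    (void)buf;
    f->seen++;
    return (ssize_t)len;
}

/*
 * Resume delivery after 'from'. Returns 0 if a later filter held the
 * packet, or 1 if it reached the receiver (modelled here as a counter).
 */
int pass_to_next(Filter *from, const void *buf, size_t len, int *delivered)
{
    Filter *f;

    for (f = from->next; f != NULL; f = f->next) {
        if (f->receive(f, buf, len) != 0) {
            return 0;   /* held again; stop walking the chain */
        }
    }
    (*delivered)++;     /* went through all filters; hand to the receiver */
    return 1;
}
```

In this sketch the filter that calls pass_to_next() is itself skipped, which
mirrors starting the walk at QTAILQ_NEXT(nf, next) in the patch.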

Re: [Qemu-devel] How to get started with the source code of Qemu?

2015-10-06 Thread Aaron Elkins

Hi Bastian,

Thanks for your suggestion. I have decided to do as you said and start by 
picking some interesting parts.

-Aaron

On Oct 7, 2015, at 1:04 AM, Bastian Koppelmann  
wrote:

Hi Aaron,

On 10/06/2015 04:17 PM, Aaron Elkins wrote:
> Hi all,
> 
> I am new to Qemu, and I’m extremely interested in understanding how the 
> source code of Qemu work. But after
> I downloaded the whole project, I just lost in it, the project is too large 
> for me to get started.
> 
> If anyone here can point me to some useful document or some guides, to make 
> me get started in understanding
> the source code?

It depends on your area of interest. Or are you looking for a general overview 
of QEMU?

When I started with QEMU, I picked a part that looked interesting, found an 
interesting-sounding function, set a breakpoint on it in gdb, and slowly stepped 
through it in order to understand it. Looking at the backtrace helps to see 
where the function was called from, which points to more interesting functions 
to set breakpoints on.

Sadly, there is not a lot of documentation today. For some areas you have a good 
chance if you look into the docs/ directory, but mostly the source code is the 
documentation. We talked about that issue at the QEMU Summit 2015 and would 
like to change it. However, it depends on how willing people are to write 
high-level documentation.

If you are interested in the tcg-frontend part of QEMU, I can give you some 
hints.
> 
> What knowledge are required to understand the source code?
> 
> BTW, i know this project is not that simple to understand, but I would like 
> to try, even I need to know a lot
> of other knowledge before that, but at least let me get started.
> 
> Thanks
> 
> -Aaron
> 

Cheers,
Bastian

[Qemu-devel] [PATCH v13 05/10] net/queue: introduce NetQueueDeliverFunc

2015-10-06 Thread Yang Hongyang

net/queue.c has logic to send/queue/flush packets, but the
qemu_deliver_packet_iov() call is hardcoded. Abstract this
function so that netfilter can use its own deliver function.

Signed-off-by: Yang Hongyang 
Cc: Stefan Hajnoczi 
Signed-off-by: Jason Wang 
---
 include/net/queue.h | 13 -
 net/net.c   |  2 +-
 net/queue.c |  8 +---
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/include/net/queue.h b/include/net/queue.h
index fc02b33..b4a7183 100644
--- a/include/net/queue.h
+++ b/include/net/queue.h
@@ -34,7 +34,18 @@ typedef void (NetPacketSent) (NetClientState *sender, ssize_t ret);
 #define QEMU_NET_PACKET_FLAG_NONE  0
 #define QEMU_NET_PACKET_FLAG_RAW  (1<<0)
 
-NetQueue *qemu_new_net_queue(void *opaque);
+/* Returns:
+ *   >0 - success
+ *0 - queue packet for future redelivery
+ *   <0 - failure (discard packet)
+ */
+typedef ssize_t (NetQueueDeliverFunc)(NetClientState *sender,
+  unsigned flags,
+  const struct iovec *iov,
+  int iovcnt,
+  void *opaque);
+
+NetQueue *qemu_new_net_queue(NetQueueDeliverFunc *deliver, void *opaque);
 
 void qemu_del_net_queue(NetQueue *queue);
 
diff --git a/net/net.c b/net/net.c
index 2f939f9..c0ebb13 100644
--- a/net/net.c
+++ b/net/net.c
@@ -286,7 +286,7 @@ static void qemu_net_client_setup(NetClientState *nc,
 }
 QTAILQ_INSERT_TAIL(&net_clients, nc, next);
 
-nc->incoming_queue = qemu_new_net_queue(nc);
+nc->incoming_queue = qemu_new_net_queue(qemu_deliver_packet_iov, nc);
 nc->destructor = destructor;
 QTAILQ_INIT(&nc->filters);
 }
diff --git a/net/queue.c b/net/queue.c
index cf8db3a..16dddf0 100644
--- a/net/queue.c
+++ b/net/queue.c
@@ -52,13 +52,14 @@ struct NetQueue {
 void *opaque;
 uint32_t nq_maxlen;
 uint32_t nq_count;
+NetQueueDeliverFunc *deliver;
 
 QTAILQ_HEAD(packets, NetPacket) packets;
 
 unsigned delivering : 1;
 };
 
-NetQueue *qemu_new_net_queue(void *opaque)
+NetQueue *qemu_new_net_queue(NetQueueDeliverFunc *deliver, void *opaque)
 {
 NetQueue *queue;
 
@@ -67,6 +68,7 @@ NetQueue *qemu_new_net_queue(void *opaque)
 queue->opaque = opaque;
 queue->nq_maxlen = 1;
 queue->nq_count = 0;
+queue->deliver = deliver;
 
 QTAILQ_INIT(&queue->packets);
 
@@ -158,7 +160,7 @@ static ssize_t qemu_net_queue_deliver(NetQueue *queue,
 };
 
 queue->delivering = 1;
-ret = qemu_deliver_packet_iov(sender, flags, &iov, 1, queue->opaque);
+ret = queue->deliver(sender, flags, &iov, 1, queue->opaque);
 queue->delivering = 0;
 
 return ret;
@@ -173,7 +175,7 @@ static ssize_t qemu_net_queue_deliver_iov(NetQueue *queue,
 ssize_t ret = -1;
 
 queue->delivering = 1;
-ret = qemu_deliver_packet_iov(sender, flags, iov, iovcnt, queue->opaque);
+ret = queue->deliver(sender, flags, iov, iovcnt, queue->opaque);
 queue->delivering = 0;
 
 return ret;
-- 
1.9.1
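
As a rough illustration of the idea in this patch, the following
self-contained sketch shows a queue that stores a deliver callback at creation
time and calls it instead of a hardcoded function. The names Queue,
DeliverFunc, queue_new(), queue_send() and count_bytes() are assumptions made
up for this example; only the overall shape (callback plus opaque pointer,
"delivering" flag set around the call) follows the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t */

/* Hypothetical deliver callback type, modelled on NetQueueDeliverFunc. */
typedef ssize_t (DeliverFunc)(const void *buf, size_t len, void *opaque);

/* Hypothetical, much reduced queue; it does not actually queue packets. */
typedef struct Queue {
    DeliverFunc *deliver;   /* chosen by whoever creates the queue */
    void *opaque;           /* passed back to the callback untouched */
    unsigned delivering : 1;
} Queue;

/* Create a queue bound to a deliver function; returns NULL on failure. */
Queue *queue_new(DeliverFunc *deliver, void *opaque)
{
    Queue *q = calloc(1, sizeof(*q));

    if (q != NULL) {
        q->deliver = deliver;
        q->opaque = opaque;
    }
    return q;
}

/* Deliver one buffer through whatever callback the queue was given. */
ssize_t queue_send(Queue *q, const void *buf, size_t len)
{
    ssize_t ret;

    q->delivering = 1;
    ret = q->deliver(buf, len, q->opaque);
    q->delivering = 0;
    return ret;
}

/* Example callback: adds the length to the size_t that opaque points to. */
ssize_t count_bytes(const void *buf, size_t len, void *opaque)
{
    (void)buf;
    *(size_t *)opaque += len;
    return (ssize_t)len;
}
```

A caller that wants the old behaviour passes the default deliver function, as
net/net.c does with qemu_deliver_packet_iov in the patch; a filter can pass
its own.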

Re: [Qemu-devel] How to get started with the source code of Qemu?

2015-10-06 Thread Aaron Elkins

Hi Peter,

Thanks for the suggestion.

Computer hardware architecture, that’s an interesting thing.

-Aaron

On Oct 7, 2015, at 4:08 AM, Peter Crosthwaite  
wrote:

On Tue, Oct 6, 2015 at 7:17 AM, Aaron Elkins  wrote:
> Hi all,
> 
> I am new to Qemu, and I’m extremely interested in understanding how the 
> source code of Qemu work. But after
> I downloaded the whole project, I just lost in it, the project is too large 
> for me to get started.
> 

It does a lot. What is your use case? Very few people have an
understanding of the entire code base. Someone might be able to point
you in a specific direction if you give more detail.

> If anyone here can point me to some useful document or some guides, to make 
> me get started in understanding
> the source code?
> 
> What knowledge are required to understand the source code?
> 

C coding. Git. Computer hardware architecture.

Regards,
Peter

> BTW, i know this project is not that simple to understand, but I would like 
> to try, even I need to know a lot
> of other knowledge before that, but at least let me get started.
> 
> Thanks
> 
> -Aaron
> 
> 
>

Re: [Qemu-devel] [PATCH v12 00/10] Add a netfilter object and netbuffer filter

2015-10-06 Thread Yang Hongyang




On 10/07/2015 09:33 AM, Yang Hongyang wrote:



On 10/01/2015 01:43 AM, Markus Armbruster wrote:

Yang Hongyang  writes:


This patch adds a netfilter abstract object, which captures all network packets
on the associated netdev. It also implements a concrete filter, "buffer", based
on this abstract object. The "buffer" netfilter could be used by VM FT solutions
like MicroCheckpointing to buffer/release packets, or to simulate
packet delay.

You can also get the series from:
https://github.com/macrosheep/qemu/tree/netfilter-v12

Usage:
  -netdev tap,id=bn0
  -device e1000,netdev=bn0
  -object filter-buffer,id=f0,netdev=bn0,queue=rx,interval=1000

dynamically add/remove netfilters:
  object_add filter-buffer,id=f0,netdev=bn0,queue=rx,interval=1000
  object_del f0

NOTE:
  interval is in microseconds and can't be omitted.
  queue is optional, and is one of rx|tx|all, default is "all". See
  enum NetFilterDirection for detail.


I reviewed the patches touching the QAPI schema or the command line.
Only a few simple issues left.  One more respin should do it.
.



Thanks, will do.


Hi Markus,

  I've already addressed all your comments on v12 and sent out a v13. Could
you please review the last bits? Thanks very much.






--
Thanks,
Yang.

[Qemu-devel] [PATCH v13 10/10] tests: add test cases for netfilter object

2015-10-06 Thread Yang Hongyang

Use the qtest QMP interface to implement the following cases:
1) add/remove a netfilter
2) add a netfilter, then delete the netdev
3) add/remove more than one netfilter
4) add more than one netfilter, then delete the netdev

Signed-off-by: Yang Hongyang 
Signed-off-by: Jason Wang 
---
 tests/.gitignore   |   1 +
 tests/Makefile |   2 +
 tests/test-netfilter.c | 200 +
 3 files changed, 203 insertions(+)
 create mode 100644 tests/test-netfilter.c

diff --git a/tests/.gitignore b/tests/.gitignore
index a607bdd..65496aa 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -49,5 +49,6 @@ test-vmstate
 test-write-threshold
 test-x86-cpuid
 test-xbzrle
+test-netfilter
 *-test
 qapi-schema/*.test.*
diff --git a/tests/Makefile b/tests/Makefile
index e6474ba..78d5d0a 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -191,6 +191,7 @@ gcov-files-i386-y += hw/pci-host/q35.c
 ifeq ($(CONFIG_VHOST_NET),y)
 check-qtest-i386-$(CONFIG_LINUX) += tests/vhost-user-test$(EXESUF)
 endif
+check-qtest-i386-y += tests/test-netfilter$(EXESUF)
 check-qtest-x86_64-y = $(check-qtest-i386-y)
 gcov-files-i386-y += i386-softmmu/hw/timer/mc146818rtc.c
 gcov-files-x86_64-y = $(subst 
i386-softmmu/,x86_64-softmmu/,$(gcov-files-i386-y))
@@ -437,6 +438,7 @@ tests/vhost-user-test$(EXESUF): tests/vhost-user-test.o 
qemu-char.o qemu-timer.o
 tests/qemu-iotests/socket_scm_helper$(EXESUF): 
tests/qemu-iotests/socket_scm_helper.o
 tests/test-qemu-opts$(EXESUF): tests/test-qemu-opts.o $(test-util-obj-y)
 tests/test-write-threshold$(EXESUF): tests/test-write-threshold.o 
$(test-block-obj-y)
+tests/test-netfilter$(EXESUF): tests/test-netfilter.o $(qtest-obj-y)
 
 ifeq ($(CONFIG_POSIX),y)
 LIBS += -lutil
diff --git a/tests/test-netfilter.c b/tests/test-netfilter.c
new file mode 100644
index 000..303deb7
--- /dev/null
+++ b/tests/test-netfilter.c
@@ -0,0 +1,200 @@
+/*
+ * QTest testcase for netfilter
+ *
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Author: Yang Hongyang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include 
+#include "libqtest.h"
+
+/* add a netfilter to a netdev and then remove it */
+static void add_one_netfilter(void)
+{
+QDict *response;
+
+response = qmp("{'execute': 'object-add',"
+   " 'arguments': {"
+   "   'qom-type': 'filter-buffer',"
+   "   'id': 'qtest-f0',"
+   "   'props': {"
+   " 'netdev': 'qtest-bn0',"
+   " 'queue': 'rx',"
+   " 'interval': 1000"
+   "}}}");
+
+g_assert(response);
+g_assert(!qdict_haskey(response, "error"));
+QDECREF(response);
+
+response = qmp("{'execute': 'object-del',"
+   " 'arguments': {"
+   "   'id': 'qtest-f0'"
+   "}}");
+g_assert(response);
+g_assert(!qdict_haskey(response, "error"));
+QDECREF(response);
+}
+
+/* add a netfilter to a netdev and then remove the netdev */
+static void remove_netdev_with_one_netfilter(void)
+{
+QDict *response;
+
+response = qmp("{'execute': 'object-add',"
+   " 'arguments': {"
+   "   'qom-type': 'filter-buffer',"
+   "   'id': 'qtest-f0',"
+   "   'props': {"
+   " 'netdev': 'qtest-bn0',"
+   " 'queue': 'rx',"
+   " 'interval': 1000"
+   "}}}");
+
+g_assert(response);
+g_assert(!qdict_haskey(response, "error"));
+QDECREF(response);
+
+response = qmp("{'execute': 'netdev_del',"
+   " 'arguments': {"
+   "   'id': 'qtest-bn0'"
+   "}}");
+g_assert(response);
+g_assert(!qdict_haskey(response, "error"));
+QDECREF(response);
+
+/* add back the netdev */
+response = qmp("{'execute': 'netdev_add',"
+   " 'arguments': {"
+   "   'type': 'user',"
+   "   'id': 'qtest-bn0'"
+   "}}");
+g_assert(response);
+g_assert(!qdict_haskey(response, "error"));
+QDECREF(response);
+}
+
+/* add multi(2) netfilters to a netdev and then remove them */
+static void add_multi_netfilter(void)
+{
+QDict *response;
+
+response = qmp("{'execute': 'object-add',"
+   " 'arguments': {"
+   "   'qom-type': 'filter-buffer',"
+   "   'id': 'qtest-f0',"
+   "   'props': {"
+   " 'netdev': 'qtest-bn0',"
+   " 'queue': 'rx',"
+   " 'interval': 1000"
+   "}}}");
+
+g_assert(response);
+g_assert(!qdict_haskey(response, "error"));
+QDECREF(response);
+
+response = qmp("{'execute':

[Qemu-devel] How to build the latest Qemu on Mac OS X 10.11 (El Capitan ) ?

2015-10-06 Thread Aaron Elkins

Hi all,

I am currently working on Mac OS X 10.11 (El Capitan). Can I build Qemu on it, 
and if so, how?

Thanks

-Aaron

Re: [Qemu-devel] [PATCH] qobject: Replace property list with GHashTable

2015-10-06 Thread David Gibson

On Tue, Oct 06, 2015 at 03:02:17PM +0200, Laszlo Ersek wrote:
> David,
> 
> On 10/06/15 14:41, Pavel Fedin wrote:
> > ARM GICv3 systems with large number of CPUs create lots of IRQ pins. Since
> > every pin is represented as a property, number of these properties becomes
> > very large. Every property add first makes sure there's no duplicates.
> > Traversing the list becomes very slow, therefore qemu initialization takes
> > significant time (several seconds for e. g. 16 CPUs).
> > 
> > This patch replaces list with GHashTable, making lookup very fast. The only
> > drawback is that object_child_foreach() and object_child_foreach_recursive()
> > cannot modify their objects during traversal, since GHashTableIter does not
> > have modify-safe version. However, the code seems not to modify objects via
> > these functions.
> > 
> > Signed-off-by: Pavel Fedin 
> > ---
> >  include/qom/object.h |  4 +--
> >  qmp.c|  8 +++--
> >  qom/object.c | 98 
> > +++-
> >  vl.c |  4 ++-
> >  4 files changed, 54 insertions(+), 60 deletions(-)
> 
> Shouldn't this help similarly with the problem that 94649d423e worked
> around? (Although that patch has standalone merits of course.)

Yes it will.  The change to hash table would reduce my nasty O(n^3) case
to O(n^2) without 94649d423e, and to O(n) with it..
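
To make the cost of the duplicate check concrete, here is a small
self-contained sketch that counts string comparisons when unique names are
added either to one list or to a fixed number of hash buckets. It is not the
QOM code and does not use GLib; the Node type, hash_str(),
chain_add_unique() and count_comparisons() are hypothetical helpers, and the
fixed-size table is a simplification (a real hash table such as GHashTable
grows as entries are added).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64

typedef struct Node {
    char name[32];
    struct Node *next;
} Node;

/* Simple string hash (djb2 variant), chosen only for illustration. */
unsigned long hash_str(const char *s)
{
    unsigned long h = 5381;

    while (*s) {
        h = h * 33 + (unsigned char)*s++;
    }
    return h;
}

/*
 * Add 'name' to the chain unless it is already present, counting every
 * strcmp() in *cmps. Returns 1 if added, 0 if it was a duplicate.
 */
int chain_add_unique(Node **head, const char *name, unsigned long *cmps)
{
    Node *n;

    for (n = *head; n != NULL; n = n->next) {
        (*cmps)++;
        if (strcmp(n->name, name) == 0) {
            return 0;
        }
    }
    n = malloc(sizeof(*n));
    if (n == NULL) {
        abort();
    }
    snprintf(n->name, sizeof(n->name), "%s", name);
    n->next = *head;
    *head = n;
    return 1;
}

/*
 * Add names "pin0".."pin<count-1>" using 'nbuckets' chains (1..NBUCKETS)
 * and return the number of comparisons. nbuckets == 1 is a plain list.
 */
unsigned long count_comparisons(int count, int nbuckets)
{
    Node *buckets[NBUCKETS] = { NULL };
    unsigned long cmps = 0;
    char name[32];
    int i;

    for (i = 0; i < count; i++) {
        snprintf(name, sizeof(name), "pin%d", i);
        chain_add_unique(&buckets[hash_str(name) % nbuckets], name, &cmps);
    }
    for (i = 0; i < nbuckets; i++) {
        while (buckets[i] != NULL) {
            Node *next = buckets[i]->next;
            free(buckets[i]);
            buckets[i] = next;
        }
    }
    return cmps;
}
```

With a single list, adding n unique names costs n(n-1)/2 comparisons in this
sketch, which is the quadratic behaviour the hash table is meant to remove.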

> 
> Thanks
> Laszlo
> 
> > diff --git a/include/qom/object.h b/include/qom/object.h
> > index be7280c..b100923 100644
> > --- a/include/qom/object.h
> > +++ b/include/qom/object.h
> > @@ -345,7 +345,7 @@ typedef struct ObjectProperty
> >  ObjectPropertyRelease *release;
> >  void *opaque;
> >  
> > -QTAILQ_ENTRY(ObjectProperty) node;
> > +Object *obj;
> >  } ObjectProperty;
> >  
> >  /**
> > @@ -405,7 +405,7 @@ struct Object
> >  /*< private >*/
> >  ObjectClass *class;
> >  ObjectFree *free;
> > -QTAILQ_HEAD(, ObjectProperty) properties;
> > +GHashTable *properties;
> >  uint32_t ref;
> >  Object *parent;
> >  };
> > diff --git a/qmp.c b/qmp.c
> > index 057a7cb..683f427 100644
> > --- a/qmp.c
> > +++ b/qmp.c
> > @@ -207,6 +207,7 @@ ObjectPropertyInfoList *qmp_qom_list(const char *path, 
> > Error **errp)
> >  Object *obj;
> >  bool ambiguous = false;
> >  ObjectPropertyInfoList *props = NULL;
> > +GHashTableIter iter;
> >  ObjectProperty *prop;
> >  
> >  obj = object_resolve_path(path, &ambiguous);
> > @@ -220,7 +221,8 @@ ObjectPropertyInfoList *qmp_qom_list(const char *path, 
> > Error **errp)
> >  return NULL;
> >  }
> >  
> > -QTAILQ_FOREACH(prop, &obj->properties, node) {
> > +g_hash_table_iter_init(&iter, obj->properties);
> > +while (g_hash_table_iter_next(&iter, NULL, (gpointer *)&prop)) {
> >  ObjectPropertyInfoList *entry = g_malloc0(sizeof(*entry));
> >  
> >  entry->value = g_malloc0(sizeof(ObjectPropertyInfo));
> > @@ -499,6 +501,7 @@ DevicePropertyInfoList 
> > *qmp_device_list_properties(const char *typename,
> >  {
> >  ObjectClass *klass;
> >  Object *obj;
> > +GHashTableIter iter;
> >  ObjectProperty *prop;
> >  DevicePropertyInfoList *prop_list = NULL;
> >  
> > @@ -517,7 +520,8 @@ DevicePropertyInfoList 
> > *qmp_device_list_properties(const char *typename,
> >  
> >  obj = object_new(typename);
> >  
> > -QTAILQ_FOREACH(prop, &obj->properties, node) {
> > +g_hash_table_iter_init(&iter, obj->properties);
> > +while (g_hash_table_iter_next(&iter, NULL, (gpointer *)&prop)) {
> >  DevicePropertyInfo *info;
> >  DevicePropertyInfoList *entry;
> >  
> > diff --git a/qom/object.c b/qom/object.c
> > index 4805328..9649243 100644
> > --- a/qom/object.c
> > +++ b/qom/object.c
> > @@ -326,6 +326,20 @@ static void object_post_init_with_type(Object *obj, 
> > TypeImpl *ti)
> >  }
> >  }
> >  
> > +static void property_destroy(gpointer data)
> > +{
> > +ObjectProperty *prop = data;
> > +
> > +if (prop->release) {
> > +prop->release(prop->obj, prop->name, prop->opaque);
> > +}
> > +
> > +g_free(prop->name);
> > +g_free(prop->type);
> > +g_free(prop->description);
> > +g_free(prop);
> > +}
> > +
> >  void object_initialize_with_type(void *data, size_t size, TypeImpl *type)
> >  {
> >  Object *obj = data;
> > @@ -340,7 +354,8 @@ void object_initialize_with_type(void *data, size_t 
> > size, TypeImpl *type)
> >  memset(obj, 0, type->instance_size);
> >  obj->class = type->class;
> >  object_ref(obj);
> > -QTAILQ_INIT(&obj->properties);
> > +obj->properties = g_hash_table_new_full(g_str_hash, g_str_equal,
> > +NULL, property_destroy);
> >  object_init_with_type(obj, type);
> >  object_post_init_with_type(obj, type);
> >  }
> > @@ -357,40 +372,19 @@ static inline bool 
> > object_property_is_child(ObjectProperty *prop)
> >  return strstart(prop->type, "child<", NULL);
> >  }
> >  
> > -static

Re: [Qemu-devel] [PATCH] qobject: Replace property list with GHashTable

2015-10-06 Thread David Gibson

On Tue, Oct 06, 2015 at 03:41:56PM +0300, Pavel Fedin wrote:
> ARM GICv3 systems with large number of CPUs create lots of IRQ pins. Since
> every pin is represented as a property, number of these properties becomes
> very large. Every property add first makes sure there's no duplicates.
> Traversing the list becomes very slow, therefore qemu initialization takes
> significant time (several seconds for e. g. 16 CPUs).
> 
> This patch replaces list with GHashTable, making lookup very fast. The only
> drawback is that object_child_foreach() and object_child_foreach_recursive()
> cannot modify their objects during traversal, since GHashTableIter does not
> have modify-safe version. However, the code seems not to modify objects via
> these functions.

Hmm.. modifying a child object internally should be fine, shouldn't
it?  IIUC only trying to remove it, change the key or the pointer to
the value should be problematic.

> Signed-off-by: Pavel Fedin 
> ---
>  include/qom/object.h |  4 +--
>  qmp.c|  8 +++--
>  qom/object.c | 98 
> +++-
>  vl.c |  4 ++-
>  4 files changed, 54 insertions(+), 60 deletions(-)
> 
> diff --git a/include/qom/object.h b/include/qom/object.h
> index be7280c..b100923 100644
> --- a/include/qom/object.h
> +++ b/include/qom/object.h
> @@ -345,7 +345,7 @@ typedef struct ObjectProperty
>  ObjectPropertyRelease *release;
>  void *opaque;
>  
> -QTAILQ_ENTRY(ObjectProperty) node;
> +Object *obj;

What is the new obj pointer here for?

>  } ObjectProperty;
>  
>  /**
> @@ -405,7 +405,7 @@ struct Object
>  /*< private >*/
>  ObjectClass *class;
>  ObjectFree *free;
> -QTAILQ_HEAD(, ObjectProperty) properties;
> +GHashTable *properties;

How much extra memory does each Object take with no (or few) properties by
using a hash table rather than a simple list here?

>  uint32_t ref;
>  Object *parent;
>  };
> diff --git a/qmp.c b/qmp.c
> index 057a7cb..683f427 100644
> --- a/qmp.c
> +++ b/qmp.c
> @@ -207,6 +207,7 @@ ObjectPropertyInfoList *qmp_qom_list(const char *path, 
> Error **errp)
>  Object *obj;
>  bool ambiguous = false;
>  ObjectPropertyInfoList *props = NULL;
> +GHashTableIter iter;
>  ObjectProperty *prop;
>  
>  obj = object_resolve_path(path, &ambiguous);
> @@ -220,7 +221,8 @@ ObjectPropertyInfoList *qmp_qom_list(const char *path, 
> Error **errp)
>  return NULL;
>  }
>  
> -QTAILQ_FOREACH(prop, &obj->properties, node) {
> +g_hash_table_iter_init(&iter, obj->properties);
> +while (g_hash_table_iter_next(&iter, NULL, (gpointer *)&prop)) {
>  ObjectPropertyInfoList *entry = g_malloc0(sizeof(*entry));
>  
>  entry->value = g_malloc0(sizeof(ObjectPropertyInfo));
> @@ -499,6 +501,7 @@ DevicePropertyInfoList *qmp_device_list_properties(const 
> char *typename,
>  {
>  ObjectClass *klass;
>  Object *obj;
> +GHashTableIter iter;
>  ObjectProperty *prop;
>  DevicePropertyInfoList *prop_list = NULL;
>  
> @@ -517,7 +520,8 @@ DevicePropertyInfoList *qmp_device_list_properties(const 
> char *typename,
>  
>  obj = object_new(typename);
>  
> -QTAILQ_FOREACH(prop, &obj->properties, node) {
> +g_hash_table_iter_init(&iter, obj->properties);
> +while (g_hash_table_iter_next(&iter, NULL, (gpointer *)&prop)) {
>  DevicePropertyInfo *info;
>  DevicePropertyInfoList *entry;
>  
> diff --git a/qom/object.c b/qom/object.c
> index 4805328..9649243 100644
> --- a/qom/object.c
> +++ b/qom/object.c
> @@ -326,6 +326,20 @@ static void object_post_init_with_type(Object *obj, 
> TypeImpl *ti)
>  }
>  }
>  
> +static void property_destroy(gpointer data)
> +{
> +ObjectProperty *prop = data;
> +
> +if (prop->release) {
> +prop->release(prop->obj, prop->name, prop->opaque);
> +}
> +
> +g_free(prop->name);
> +g_free(prop->type);
> +g_free(prop->description);
> +g_free(prop);
> +}
> +
>  void object_initialize_with_type(void *data, size_t size, TypeImpl *type)
>  {
>  Object *obj = data;
> @@ -340,7 +354,8 @@ void object_initialize_with_type(void *data, size_t size, 
> TypeImpl *type)
>  memset(obj, 0, type->instance_size);
>  obj->class = type->class;
>  object_ref(obj);
> -QTAILQ_INIT(&obj->properties);
> +obj->properties = g_hash_table_new_full(g_str_hash, g_str_equal,
> +NULL, property_destroy);
>  object_init_with_type(obj, type);
>  object_post_init_with_type(obj, type);
>  }
> @@ -357,40 +372,19 @@ static inline bool 
> object_property_is_child(ObjectProperty *prop)
>  return strstart(prop->type, "child<", NULL);
>  }
>  
> -static void object_property_del_all(Object *obj)
> +static gboolean object_property_del_child(gpointer key, gpointer value,
> +  gpointer child)
>  {
> -while (!QTAILQ_EMPTY(&obj->properties)) {
> -ObjectProperty *prop =

Re: [Qemu-devel] [PULL 00/10] VFIO updates for 2015-10-05

2015-10-06 Thread David Gibson

On Tue, Oct 06, 2015 at 09:35:17AM -0600, Alex Williamson wrote:
> On Tue, 2015-10-06 at 15:50 +0100, Peter Maydell wrote:
> > On 5 October 2015 at 21:36, Alex Williamson  
> > wrote:
> > > The following changes since commit 
> > > c0b520dfb8890294a9f8879f4759172900585995:
> > >
> > >   Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into 
> > > staging (2015-10-02 16:59:21 +0100)
> > >
> > > are available in the git repository at:
> > >
> > >
> > >   git://github.com/awilliam/qemu-vfio.git tags/vfio-update-20151005.0
> > >
> > > for you to fetch changes up to 727299697dd31f0e1ccecc7eab1bf658e8ed3079:
> > >
> > >   vfio: Expose a VFIO PCI device's group for EEH (2015-10-05 12:40:13 
> > > -0600)
> > >
> > > 
> > > VFIO updates 2015-10-05
> > >
> > >  - Change platform device IRQ setup sequence for compatibility
> > >with upcoming IRQ forwarding (Eric Auger)
> > >  - Extensions to support vfio-pci devices on spapr-pci-host-bridge
> > >(David Gibson)
> > >
> > > 
> > 
> > Hi. I'm afraid this fails to build with clang:
> > 
> > In file included from 
> > /home/petmay01/linaro/qemu-for-merges/hw/vfio/pci.c:38:
> > /home/petmay01/linaro/qemu-for-merges/include/hw/vfio/vfio-pci.h:7:26:
> > error: redefinition of typedef 'VFIOGroup' is a C11 feature
> > [-Werror,-Wtypedef-redefinition]
> > typedef struct VFIOGroup VFIOGroup;
> >  ^
> > /home/petmay01/linaro/qemu-for-merges/include/hw/vfio/vfio-common.h:117:3:
> > note: previous definition is here
> > } VFIOGroup;
> >   ^
> > 1 error generated.
> 
> Thanks Peter.  David, can you send a replacement for that last patch?
> Thanks,

Actually, can you just drop the last patch for now?

The use case I thought I had for it turns out not to be.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH] target-arm: Avoid calling arm_el_is_aa64() function for unimplemented EL

2015-10-06 Thread Sergey Sorokin

That is ok.

06.10.2015, 00:55, "Peter Maydell" :
> On 2 October 2015 at 14:21, Sergey Sorokin  wrote:
>>  It is incorrect to call arm_el_is_aa64() function for unimplemented EL.
>>  This patch fixes several attempts to do so.
>>
>>  Signed-off-by: Sergey Sorokin 
>>  ---
>>   target-arm/cpu.h | 8 +---
>>   target-arm/helper.c | 15 +--
>>   2 files changed, 18 insertions(+), 5 deletions(-)
>>
>>  diff --git a/target-arm/cpu.h b/target-arm/cpu.h
>>  index cc1578c..df456a5 100644
>>  --- a/target-arm/cpu.h
>>  +++ b/target-arm/cpu.h
>>  @@ -1015,11 +1015,11 @@ static inline bool access_secure_reg(CPUARMState 
>> *env)
>>    */
>>   #define A32_BANKED_CURRENT_REG_GET(_env, _regname) \
>>   A32_BANKED_REG_GET((_env), _regname, \
>>  - ((!arm_el_is_aa64((_env), 3) && arm_is_secure(_env
>>  + (arm_is_secure(_env) && !arm_el_is_aa64((_env), 3)))
>>
>>   #define A32_BANKED_CURRENT_REG_SET(_env, _regname, _val) \
>>   A32_BANKED_REG_SET((_env), _regname, \
>>  - ((!arm_el_is_aa64((_env), 3) && arm_is_secure(_env))), \
>>  + (arm_is_secure(_env) && !arm_el_is_aa64((_env), 3)), \
>>  (_val))
>>
>>   void arm_cpu_list(FILE *f, fprintf_function cpu_fprintf);
>>  @@ -1586,7 +1586,9 @@ static inline bool arm_excp_unmasked(CPUState *cs, 
>> unsigned int excp_idx,
>>    * interrupt.
>>    */
>>   if ((target_el > cur_el) && (target_el != 1)) {
>>  - if (arm_el_is_aa64(env, 3) || ((scr || hcr) && (!secure))) {
>>  + /* ARM_FEATURE_AARCH64 enabled means the higher EL is AArch64. */
>>  + if (arm_feature(env, ARM_FEATURE_AARCH64) ||
>>  + ((scr || hcr) && (!secure))) {
>>   unmasked = 1;
>>   }
>>   }
>
> I know we discussed this one before, but having looked more carefully
> at it, I think the reason it's weird is that the original code is
> only correct if we're not implementing EL2.
>
> For instance (table D1-14 in the v8 ARMARM rev A.g)
> if we're in an all-AArch64 environment executing in Secure EL0
> then the interrupt mask is supposed to have an effect if the interrupt
> is targeting EL2. But the current code will always set 'unmasked' to 1 if
> EL3 is 64 bit.
>
> So I think what the code ought to read is:
>
> if ((target_el > cur_el) && (target_el != 1)) {
> /* Exceptions targeting a higher EL may not be maskable */
> if (arm_feature(env, ARM_FEATURE_AARCH64)) {
> /* 64-bit masking rules are simple: exceptions to EL3
>  * can't be masked, and exceptions to EL2 can only be
>  * masked from Secure state.
>  */
> if (target_el == 3 || !secure) {
> unmasked = 1;
> }
> } else {
> /* The old 32-bit-only environment has a more complicated
>  * masking setup.
>  */
> if ((scr || hcr) && !secure) {
> unmasked = 1;
> }
> }
> }
>
> Except that then for the AArch64 case we've just calculated
> scr and hcr and then not needed them. I think most of the
> code calculating them ought to move into the else clause here.
>
> I'll write a patch for this and post it tomorrow.
>
> In the meantime, we can just make the comment say
>
>    /* ARM_FEATURE_AARCH64 enabled means the highest EL is AArch64.
> * This code currently assumes that EL2 is not implemented
> * (and so that highest EL will be 3 and the target_el also 3).
> */
>
>>  diff --git a/target-arm/helper.c b/target-arm/helper.c
>>  index 8367997..1f11dbd 100644
>>  --- a/target-arm/helper.c
>>  +++ b/target-arm/helper.c
>>  @@ -5220,11 +5220,22 @@ uint32_t arm_phys_excp_target_el(CPUState *cs, 
>> uint32_t excp_idx,
>>    uint32_t cur_el, bool secure)
>>   {
>>   CPUARMState *env = cs->env_ptr;
>>  - int rw = ((env->cp15.scr_el3 & SCR_RW) == SCR_RW);
>>  + int rw;
>>   int scr;
>>   int hcr;
>>   int target_el;
>>  - int is64 = arm_el_is_aa64(env, 3);
>>  + /* Is the higher EL AArch64? */
>
> "highest". I pointed this one out before...
>
>>  + int is64 = arm_feature(env, ARM_FEATURE_AARCH64);
>>  +
>>  + /* If the highest EL is in AArch64 state, and EL3 is not implemented,
>>  + * we must behave as if EL3 is implemented and is in AArch64 state.
>>  + * Therefore we need appropriate RW bit.
>>  + */
>
> I think we could put this comment into the else branch, and
> rephrase it a bit:
>
> + if (arm_feature(env, ARM_FEATURE_EL3)) {
> + rw = ((env->cp15.scr_el3 & SCR_RW) == SCR_RW);
> + } else {
> + /* Either EL2 is the highest EL (and so the EL2 register width
> + * is given by is64); or there is no EL2 or EL3, in which case
> + * the value of 'rw' does not affect the table lookup anyway.
> + */
> + rw = is64;
> + }
>
>>  + if (arm_feature(env, ARM_FEATURE_EL3)) {
>>  + rw = ((env->cp15.scr_el3 & SCR_RW) == SCR_RW);
>>  + } else {
>>  + rw = is64;
>>  + }
>
> What I can

[Qemu-devel] Debugging Migration

2015-10-06 Thread John Snow

Is there a convenient way of "pausing" or stalling a live migration to
allow methodical testing of race conditions?

I'd like to instrument something along the lines of:

(1) Live migration begins.
(2) migration is artificially halted or paused, but QEMU is allowed to run.
(3) Some additional qtest/QMP commands are received and processed.
(4) migration is allowed to resume.

Does anyone have perhaps even test patches to instrument this sort of
thing, or is it up to detective john to add it if he wants it?

Thanks,
--js

[Qemu-devel] [PULL 02/48] msix: add VMSTATE_MSIX_TEST

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

ivshmem is going to use MSIX state conditionally.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 include/hw/pci/msix.h | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/include/hw/pci/msix.h b/include/hw/pci/msix.h
index 954d82b..72e5f93 100644
--- a/include/hw/pci/msix.h
+++ b/include/hw/pci/msix.h
@@ -46,12 +46,16 @@ void msix_unset_vector_notifiers(PCIDevice *dev);
 
 extern const VMStateDescription vmstate_msix;
 
-#define VMSTATE_MSIX(_field, _state) {   \
-.name   = (stringify(_field)),   \
-.size   = sizeof(PCIDevice), \
-.vmsd   = &vmstate_msix, \
-.flags  = VMS_STRUCT,\
-.offset = vmstate_offset_value(_state, _field, PCIDevice),   \
+#define VMSTATE_MSIX_TEST(_field, _state, _test) {   \
+.name = (stringify(_field)), \
+.size = sizeof(PCIDevice),   \
+.vmsd = &vmstate_msix,   \
+.flags= VMS_STRUCT,  \
+.offset   = vmstate_offset_value(_state, _field, PCIDevice), \
+.field_exists = (_test)  \
 }
 
+#define VMSTATE_MSIX(_f, _s) \
+VMSTATE_MSIX_TEST(_f, _s, NULL)
+
 #endif
-- 
2.4.3

[Qemu-devel] [PULL 18/48] ivshmem: improve error handling

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

The test whether the chardev is an AF_UNIX socket rejects
"-chardev socket,id=chr0,path=/tmp/foo,server,nowait -device
ivshmem,chardev=chr0", but fails to explain why.

Use an explicit error on why a chardev may be rejected.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 50f9c8f..d7a00bd 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -301,7 +301,7 @@ static CharDriverState* create_eventfd_chr_device(void * 
opaque, EventNotifier *
 chr = qemu_chr_open_eventfd(eventfd);
 
 if (chr == NULL) {
-error_report("creating eventfd for eventfd %d failed", eventfd);
+error_report("creating chardriver for eventfd %d failed", eventfd);
 return NULL;
 }
 qemu_chr_fe_claim_no_fail(chr);
@@ -778,8 +778,12 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 attr |= PCI_BASE_ADDRESS_MEM_TYPE_64;
 }
 
-if ((s->server_chr != NULL) &&
-(strncmp(s->server_chr->filename, "unix:", 5) == 0)) {
+if (s->server_chr != NULL) {
+if (strncmp(s->server_chr->filename, "unix:", 5)) {
+error_setg(errp, "chardev is not a unix client socket");
+return;
+}
+
 /* if we get a UNIX socket as the parameter we will talk
  * to the ivshmem server to receive the memory region */
 
-- 
2.4.3
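
A minimal sketch of the "reject with an explicit reason" approach, separate
from QEMU's Error API: the helper below is hypothetical
(check_server_chardev() does not exist in QEMU) and simply returns a message
instead of silently taking another code path. It assumes, as the patch does,
that an acceptable chardev has a filename starting with "unix:".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns NULL if the chardev filename is acceptable,
 * otherwise a static string explaining why it was rejected.
 */
const char *check_server_chardev(const char *filename)
{
    if (filename == NULL) {
        return "chardev has no filename";
    }
    if (strncmp(filename, "unix:", 5) != 0) {
        return "chardev is not a unix client socket";
    }
    return NULL;
}
```

The caller can then report the returned string to the user and abort device
setup, rather than continuing as if no chardev had been given.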

[Qemu-devel] [kvm-unit-tests PATCHv3 3/3] arm: pmu: Add CPI checking

2015-10-06 Thread Christopher Covington

Check the numbers of cycles per instruction (CPI) implied by ARM PMU
cycle counter values. Check that in -icount mode these strictly
match the specified rate.

Signed-off-by: Christopher Covington 
---
 arm/pmu.c | 72 ++-
 arm/unittests.cfg | 13 ++
 2 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/arm/pmu.c b/arm/pmu.c
index 589e605..0ad113d 100644
--- a/arm/pmu.c
+++ b/arm/pmu.c
@@ -84,12 +84,82 @@ static bool check_cycles_increase(void)
return true;
 }
 
-int main(void)
+/* Execute a known number of guest instructions. Only odd instruction counts
+ * greater than or equal to 3 are supported by the in-line assembly code. The
+ * control register (PMCR_EL0) is initialized with the provided value (allowing
+ * for example for the cycle counter or event counters to be reset). At the end
+ * of the exact instruction loop, zero is written to PMCR_EL0 to disable
+ * counting, allowing the cycle counter or event counters to be read at the
+ * leisure of the calling code.
+ */
+static void measure_instrs(int num, struct pmu_data pmcr)
+{
+   int i = (num - 1) / 2;
+
+   if (num < 3 || ((num - 1) % 2))
+   abort();
+
+   asm volatile(
+   "msr pmcr_el0, %[pmcr]\n"
+   "1: subs %[i], %[i], #1\n"
+   "b.gt 1b\n"
+   "msr pmcr_el0, xzr"
+   : [i] "+r" (i) : [pmcr] "r" (pmcr) : "cc");
+}
+
+/* Measure cycle counts for various known instruction counts. Ensure that the
+ * cycle counter progresses (similar to check_cycles_increase() but with more
+ * instructions and using reset and stop controls). If supplied a positive,
+ * nonzero CPI parameter, also strictly check that every measurement matches
+ * it. Strict CPI checking is used to test -icount mode.
+ */
+static bool check_cpi(int cpi)
+{
+   struct pmu_data pmcr;
+
+   pmcr.cycle_counter_reset = 1;
+   pmcr.enable = 1;
+
+   if (cpi > 0)
+   printf("Checking for CPI=%d.\n", cpi);
+   printf("instrs : cycles0 cycles1 ...\n");
+
+   for (int i = 3; i < 300; i += 32) {
+   int avg, sum = 0;
+
+   printf("%d :", i);
+   for (int j = 0; j < samples; j++) {
+   int cycles;
+
+   measure_instrs(i, pmcr);
+   asm volatile("mrs %0, pmccntr_el0" : "=r" (cycles));
+   printf(" %d", cycles);
+
+   if (!cycles || (cpi > 0 && cycles != i * cpi)) {
+   printf("\n");
+   return false;
+   }
+
+   sum += cycles;
+   }
+   avg = sum / samples;
+   printf(" sum=%d avg=%d avg_ipc=%d avg_cpi=%d\n",
+   sum, avg, i / avg, avg / i);
+   }
+
+   return true;
+}
+
+int main(int argc, char *argv[])
 {
report_prefix_push("pmu");
 
report("Control register", check_pmcr());
report("Monotonically increasing cycle count", check_cycles_increase());
 
+   int cpi = (argc == 1 ? atol(argv[0]) : 0);
+
+   report("Cycle/instruction ratio", check_cpi(cpi));
+
return report_summary();
 }
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index fd94adb..333ee0d 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -39,4 +39,17 @@ groups = selftest
 # Test PMU support without -icount
 [pmu]
 file = pmu.flat
+extra_params = -append '-1'
+groups = pmu
+
+# Test PMU support with -icount IPC=1
+[pmu-icount-1]
+file = pmu.flat
+extra_params = -icount 0 -append '1'
+groups = pmu
+
+# Test PMU support with -icount IPC=256
+[pmu-icount-256]
+file = pmu.flat
+extra_params = -icount 8 -append '256'
 groups = pmu
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
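
The arithmetic behind the checks in this patch can be sketched on its own. This is a rough sketch, not the test code: loop_iterations() mirrors the (num - 1) / 2 computation and the odd-count restriction in measure_instrs(), and cycles_match_cpi() mirrors the per-sample condition in check_cpi(). Both names are hypothetical. cpi_for_icount_shift() encodes an assumption, namely that CPI equals 2^shift; that relation is inferred only from the two unittests.cfg entries above, which pair -icount 0 with 1 and -icount 8 with 256.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: loop iterations needed for `num` instructions.
 * Returns -1 for counts the loop cannot produce (even, or below 3). */
int loop_iterations(int num)
{
    if (num < 3 || (num - 1) % 2 != 0) {
        return -1;
    }
    return (num - 1) / 2;
}

/* Hypothetical helper: a sample passes when the cycle count is nonzero
 * and, if a positive CPI was requested, equals instrs * cpi exactly. */
bool cycles_match_cpi(int instrs, int cycles, int cpi)
{
    if (cycles == 0) {
        return false;
    }
    return cpi <= 0 || cycles == instrs * cpi;
}

/* Assumption, inferred from the config pairs (0, 1) and (8, 256):
 * the expected CPI for an -icount shift value is 2^shift. */
int cpi_for_icount_shift(int shift)
{
    return 1 << shift;
}
```

For instance, 3 instructions at CPI 256 should yield exactly 768 cycles under strict checking, while a non-positive CPI only requires that the counter moved.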

Re: [Qemu-devel] How to get started with the source code of Qemu?

2015-10-06 Thread Eric Blake

On 10/06/2015 08:17 AM, Aaron Elkins wrote:
> Hi all,
> 
> I am new to Qemu, and I’m extremely interested in understanding how the
> source code of Qemu works. But after
> I downloaded the whole project, I just got lost in it; the project is too large
> for me to get started.

Welcome.

As a piece of general advice, one of the best ways to get started (on
any project, not just qemu) is to start reading the mailing list, pick a
subject line that sounds interesting, and review someone else's
patches on that topic.  Even if you admit that your review is weak
because you are not familiar with the code, the mere act of trying to
understand why someone else's patch was submitted will get you more
familiar with that part of the code base than randomly looking through
files on your own.

It also helps to point out that reviews tend to be the bottleneck, so
the more people that are contributing reviews in addition to their own
code, the faster the project can evolve.  It's easier to write your own
patches, and have a chance of them being reviewed in turn, if you have
already proven your willingness to review code from others first.

> 
> Can anyone here point me to some useful documents or guides to help
> me get started in understanding
> the source code?

Qemu is probably big enough that no one person understands the entire
code base.  Rather, we have quite a few subject-matter experts on
various pieces of the overall project.  So don't feel bad if you don't
understand everything; it is enough to pick one topic that sounds
interesting and try to understand that.

There may be good wiki or blog pages with introductions to high-level
overviews of what qemu is doing, but I'll let others point those out (as
I'm not personally familiar with where such introductory materials would
live).  In fact, if you'd like to take good notes of what you are
learning, perhaps you could turn that into a tutorial for the next new
contributor, and do a much better job at the task than someone who has
been on the list for years and takes certain things for granted.

> 
> What knowledge is required to understand the source code?

Most of qemu is written in C, so having a good grasp on the language
helps. But even if you are weak in C, studying qemu and reading the
mailing list will help you learn and improve your skills.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org


Re: [Qemu-devel] Debugging Migration

2015-10-06 Thread Dr. David Alan Gilbert

* John Snow (js...@redhat.com) wrote:
> Is there a convenient way of "pausing" or stalling a live migration to
> allow methodical testing of race conditions?
> 
> I'd like to instrument something along the lines of:
> 
> (1) Live migration begins.
> (2) migration is artificially halted or paused, but QEMU is allowed to run.
> (3) Some additional qtest/QMP commands are received and processed.
> (4) migration is allowed to resume.
> 
> Does anyone have perhaps even test patches to instrument this sort of
> thing, or is it up to detective john to add it if he wants it?

If you catch it during the iterative stage you can probably just
gdb or ctrl-z the destination and the migration thread should block;
or alternatively migrate to a pipe and similarly ctrl-z whatever is
there.

Mostly I do a few things:
  1) use tracing to follow it
 mostly just stderr tracing, but I've done systemtap scripts
 for some hairy stuff.
  2) Set the migration speed (migrate_set_speed) very very low
  3) Keep the source busy dirtying memory.

Of course that does lead to the question of what fun problem are
you trying to debug?

Dave

> --js
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] [PULL 15/48] ivshmem: initialize max_peer to -1

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

There is no peer when the device is initialized, so do not allow a
doorbell for nonexistent peer 0.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 374ecff..0716deb 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -532,8 +532,6 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, 
int size)
 if (incoming_posn == -1) {
 void * map_ptr;
 
-s->max_peer = 0;
-
 if (check_shm_size(s, incoming_fd, &err) == -1) {
 error_report_err(err);
 close(incoming_fd);
@@ -723,6 +721,8 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 PCI_BASE_ADDRESS_MEM_PREFETCH;
 Error *local_err = NULL;
 
+s->max_peer = -1;
+
 if (s->sizearg == NULL) {
 s->ivshmem_size = 4 << 20; /* 4 MB default */
 } else {
-- 
2.4.3
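
Why the initial value matters can be shown with a small range check. This sketch assumes that the doorbell path rejects destinations above max_peer; that is an assumption about the surrounding code, which this patch does not show, and doorbell_dest_in_range() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper. Assumption: a doorbell write is only honoured
 * when the destination peer id lies in [0, max_peer]. */
bool doorbell_dest_in_range(int dest, int max_peer)
{
    return dest >= 0 && dest <= max_peer;
}
```

Under that assumption, a max_peer of 0 lets a doorbell for peer 0 through before any peer exists, whereas -1 rejects every destination until a real peer raises the bound.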

[Qemu-devel] [PULL 00/48] ivshmem series

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

The following changes since commit 5fdb4671b08e0d1631447e81348b2b50a6b85bf7:

  Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into 
staging (2015-10-06 13:42:33 +0100)

are available in the git repository at:

  https://github.com/elmarco/qemu tags/ivshmem-series

for you to fetch changes up to 097cadb155ef22be286af1403240b4fbf0f038ef:

  ivshmem: use little-endian int64_t for the protocol (2015-10-06 21:17:22 
+0200)


Ivshmem series


David Marchand (3):
  contrib: add ivshmem client and server
  docs: update ivshmem device spec
  ivshmem: add check on protocol version in QEMU

Marc-André Lureau (45):
  char: add qemu_chr_free()
  msix: add VMSTATE_MSIX_TEST
  ivhsmem: read do not accept more than sizeof(long)
  ivshmem: fix number of bytes to push to fifo
  ivshmem: factor out the incoming fifo handling
  ivshmem: remove unnecessary dup()
  ivshmem: remove superflous ivshmem_attr field
  ivshmem: remove useless doorbell field
  ivshmem: more qdev conversion
  ivshmem: remove last exit(1)
  ivshmem: limit maximum number of peers to G_MAXUINT16
  ivshmem: simplify around increase_dynamic_storage()
  ivshmem: allocate eventfds in resize_peers()
  ivshmem: remove useless ivshmem_update_irq() val argument
  ivshmem: initialize max_peer to -1
  ivshmem: remove max_peer field
  ivshmem: improve debug messages
  ivshmem: improve error handling
  ivshmem: print error on invalid peer id
  ivshmem: simplify a bit the code
  ivshmem: use common return
  ivshmem: use common is_power_of_2()
  ivshmem: migrate with VMStateDescription
  ivshmem: shmfd can be 0
  ivshmem: check shm isn't already initialized
  ivshmem: add device description
  ivshmem: fix pci_ivshmem_exit()
  ivshmem: replace 'guest' for 'peer' appropriately
  ivshmem: error on too many eventfd received
  ivshmem: reset mask on device reset
  ivshmem-client: check the number of vectors
  ivshmem-server: use a uint16 for client ID
  ivshmem-server: fix hugetlbfs support
  contrib: remove unnecessary strdup()
  msix: implement pba write (but read-only)
  qtest: add qtest_add_abrt_handler()
  glib-compat: add 2.38/2.40/2.46 asserts
  tests: add ivshmem qtest
  ivshmem: do not keep shm_fd open
  ivshmem: use qemu_strtosz()
  ivshmem: add hostmem backend
  ivshmem: remove EventfdEntry.vector
  ivshmem: rename MSI eventfd_table
  ivshmem: use kvm irqfd for msi notifications
  ivshmem: use little-endian int64_t for the protocol

 Makefile|   8 +
 Makefile.objs   |   5 +
 configure   |   1 +
 contrib/ivshmem-client/Makefile.objs|   1 +
 contrib/ivshmem-client/ivshmem-client.c | 446 
++
 contrib/ivshmem-client/ivshmem-client.h | 213 +
 contrib/ivshmem-client/main.c   | 239 +++
 contrib/ivshmem-server/Makefile.objs|   1 +
 contrib/ivshmem-server/ivshmem-server.c | 480 
+
 contrib/ivshmem-server/ivshmem-server.h | 166 +
 contrib/ivshmem-server/main.c   | 263 
 docs/specs/ivshmem_device_spec.txt  | 127 +++---
 hw/misc/ivshmem.c   | 833 

 hw/pci/msix.c   |   6 +
 include/glib-compat.h   |  61 +
 include/hw/misc/ivshmem.h   |  25 ++
 include/hw/pci/msix.h   |  16 +-
 include/sysemu/char.h   |  10 +-
 qemu-char.c |   9 +-
 qemu-doc.texi   |  10 +-
 tests/Makefile  |   3 +
 tests/ivshmem-test.c| 496 
++
 tests/libqtest.c|  37 ++-
 tests/libqtest.h|   2 +
 24 files changed, 3142 insertions(+), 316 deletions(-)
 create mode 100644 contrib/ivshmem-client/Makefile.objs
 create mode 100644 contrib/ivshmem-client/ivshmem-client.c
 create mode 100644 contrib/ivshmem-client/ivshmem-client.h
 create mode 100644 contrib/ivshmem-client/main.c
 create mode 100644 contrib/ivshmem-server/Makefile.objs
 create mode 100644 contrib/ivshmem-server/ivshmem-server.c
 create mode 100644 contrib/ivshmem-server/ivshmem-server.h
 create mode 100644 contrib/ivshmem-server/main.c
 create mode 100644 include/hw/misc/ivshmem.h
 create mode 100644 tests/ivshmem-test.c

[Qemu-devel] [PULL 14/48] ivshmem: remove useless ivshmem_update_irq() val argument

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

val isn't used in ivshmem_update_irq() function.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 19640bb..374ecff 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -123,7 +123,7 @@ static inline bool is_power_of_two(uint64_t x) {
 }
 
 /* accessing registers - based on rtl8139 */
-static void ivshmem_update_irq(IVShmemState *s, int val)
+static void ivshmem_update_irq(IVShmemState *s)
 {
 PCIDevice *d = PCI_DEVICE(s);
 int isr;
@@ -144,7 +144,7 @@ static void ivshmem_IntrMask_write(IVShmemState *s, 
uint32_t val)
 
 s->intrmask = val;
 
-ivshmem_update_irq(s, val);
+ivshmem_update_irq(s);
 }
 
 static uint32_t ivshmem_IntrMask_read(IVShmemState *s)
@@ -162,7 +162,7 @@ static void ivshmem_IntrStatus_write(IVShmemState *s, 
uint32_t val)
 
 s->intrstatus = val;
 
-ivshmem_update_irq(s, val);
+ivshmem_update_irq(s);
 }
 
 static uint32_t ivshmem_IntrStatus_read(IVShmemState *s)
@@ -172,7 +172,7 @@ static uint32_t ivshmem_IntrStatus_read(IVShmemState *s)
 /* reading ISR clears all interrupts */
 s->intrstatus = 0;
 
-ivshmem_update_irq(s, 0);
+ivshmem_update_irq(s);
 
 return ret;
 }
-- 
2.4.3

[Qemu-devel] [PULL 04/48] ivshmem: fix number of bytes to push to fifo

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

If the fifo has 0 bytes, and the read is of size 1, the call to
fifo8_push_all() will copy off boundary data.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index fb53b3f..2162d02 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -455,7 +455,7 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, 
int size)
 uint32_t num;
 
 IVSHMEM_DPRINTF("short read of %d bytes\n", size);
-num = MAX(size, sizeof(long) - fifo8_num_used(&s->incoming_fifo));
+num = MIN(size, sizeof(long) - fifo8_num_used(&s->incoming_fifo));
 fifo8_push_all(&s->incoming_fifo, buf, num);
 if (fifo8_num_used(&s->incoming_fifo) < sizeof(incoming_posn)) {
 return;
-- 
2.4.3
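
The difference between MAX and MIN here is easiest to see with numbers. A minimal sketch of the clamp, with hypothetical names (bytes_to_push() and its parameters are not from QEMU): it returns how many bytes of a short read may be pushed into a FIFO of a given capacity that already holds some bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes from a read of `size` bytes that
 * fit into a FIFO with `capacity` total bytes and `used` bytes filled.
 * This is MIN(size, capacity - used), the form the patch switches to. */
size_t bytes_to_push(size_t size, size_t capacity, size_t used)
{
    size_t room;

    assert(used <= capacity);        /* guard against unsigned underflow */
    room = capacity - used;
    return size < room ? size : room;
}
```

For the case in the commit message (empty 8-byte FIFO, read of size 1) this yields 1. The old MAX form would have yielded 8, so the push would read 7 bytes past the end of the 1-byte input buffer.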

[Qemu-devel] [PULL 37/48] contrib: remove unnecessary strdup()

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

getopt() optarg points to argv memory, no need to dup those values,
fixes small leaks detected by clang-analyzer.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 contrib/ivshmem-client/main.c | 2 +-
 contrib/ivshmem-server/main.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/contrib/ivshmem-client/main.c b/contrib/ivshmem-client/main.c
index 44531f5..28dd81e 100644
--- a/contrib/ivshmem-client/main.c
+++ b/contrib/ivshmem-client/main.c
@@ -53,7 +53,7 @@ ivshmem_client_parse_args(IvshmemClientArgs *args, int argc, 
char *argv[])
 break;
 
 case 'S': /* unix_sock_path */
-args->unix_sock_path = strdup(optarg);
+args->unix_sock_path = optarg;
 break;
 
 default:
diff --git a/contrib/ivshmem-server/main.c b/contrib/ivshmem-server/main.c
index fb60af1..24a3ba2 100644
--- a/contrib/ivshmem-server/main.c
+++ b/contrib/ivshmem-server/main.c
@@ -92,15 +92,15 @@ ivshmem_server_parse_args(IvshmemServerArgs *args, int 
argc, char *argv[])
 break;
 
 case 'p': /* pid_file */
-args->pid_file = strdup(optarg);
+args->pid_file = optarg;
 break;
 
 case 'S': /* unix_socket_path */
-args->unix_socket_path = strdup(optarg);
+args->unix_socket_path = optarg;
 break;
 
 case 'm': /* shm_path */
-args->shm_path = strdup(optarg);
+args->shm_path = optarg;
 break;
 
 case 'l': /* shm_size */
-- 
2.4.3

[Qemu-devel] [PULL 24/48] ivshmem: shmfd can be 0

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

0 is a valid fd value, so change conditions and set -1 value early

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 0ccf932..d3d0204 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -233,7 +233,7 @@ static uint64_t ivshmem_io_read(void *opaque, hwaddr addr,
 
 case IVPOSITION:
 /* return my VM ID if the memory is mapped */
-if (s->shm_fd > 0) {
+if (s->shm_fd >= 0) {
 ret = s->vm_id;
 } else {
 ret = -1;
@@ -665,6 +665,8 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 PCI_BASE_ADDRESS_MEM_PREFETCH;
 Error *local_err = NULL;
 
+s->shm_fd = -1;
+
 if (s->sizearg == NULL) {
 s->ivshmem_size = 4 << 20; /* 4 MB default */
 } else {
@@ -709,8 +711,6 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 
 pci_config_set_interrupt_pin(pci_conf, 1);
 
-s->shm_fd = 0;
-
 memory_region_init_io(&s->ivshmem_mmio, OBJECT(s), &ivshmem_mmio_ops, s,
   "ivshmem-mmio", IVSHMEM_REG_BAR_SIZE);
 
-- 
2.4.3
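
The convention this patch adopts is the usual one for file descriptors: -1 means "none", and anything from 0 up is valid. A minimal sketch with a hypothetical stand-in for the device state (struct shm_state and the helper names are illustrative, not QEMU's):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; only the fd is modelled. */
struct shm_state {
    int shm_fd;
};

/* Mark the fd as unset early, before anything can test it. */
void shm_state_init(struct shm_state *s)
{
    s->shm_fd = -1;
}

/* 0 is a valid descriptor, so the test must be >= 0 rather than > 0. */
bool shm_is_mapped(const struct shm_state *s)
{
    return s->shm_fd >= 0;
}

/* Returns 1 when a fresh state reads as unmapped and fd 0 as mapped. */
int demo_shm_fd_checks(void)
{
    struct shm_state s;

    shm_state_init(&s);
    if (shm_is_mapped(&s)) {
        return 0;
    }
    s.shm_fd = 0;
    return shm_is_mapped(&s) ? 1 : 0;
}
```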

[Qemu-devel] [PULL 03/48] ivhsmem: read do not accept more than sizeof(long)

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

ivshmem_read() only reads sizeof(long) from the input buffer.  Accepting
more could lead to fifo8 abort() on 32bit systems if fifo is not empty.

A following patch will change the protocol to 64-bit little-endian
instead.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index cc76989..fb53b3f 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -272,7 +272,7 @@ static void ivshmem_receive(void *opaque, const uint8_t 
*buf, int size)
 
 static int ivshmem_can_receive(void * opaque)
 {
-return 8;
+return sizeof(long);
 }
 
 static void ivshmem_event(void *opaque, int event)
-- 
2.4.3

[Qemu-devel] [PULL 27/48] ivshmem: fix pci_ivshmem_exit()

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

Free all objects owned by the device, making sure the device is free,
fixing hot-unplug.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 38 +++---
 1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 7be3d5e..d1b5d35 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -800,15 +800,47 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 static void pci_ivshmem_exit(PCIDevice *dev)
 {
 IVShmemState *s = IVSHMEM(dev);
+int i;
+
+fifo8_destroy(&s->incoming_fifo);
 
 if (s->migration_blocker) {
 migrate_del_blocker(s->migration_blocker);
 error_free(s->migration_blocker);
 }
 
-memory_region_del_subregion(&s->bar, &s->ivshmem);
-vmstate_unregister_ram(&s->ivshmem, DEVICE(dev));
-fifo8_destroy(&s->incoming_fifo);
+if (s->shm_fd >= 0) {
+void *addr = memory_region_get_ram_ptr(&s->ivshmem);
+
+vmstate_unregister_ram(&s->ivshmem, DEVICE(dev));
+memory_region_del_subregion(&s->bar, &s->ivshmem);
+
+if (munmap(addr, s->ivshmem_size) == -1) {
+error_report("Failed to munmap shared memory %s", strerror(errno));
+}
+}
+
+if (s->eventfd_chr) {
+for (i = 0; i < s->vectors; i++) {
+if (s->eventfd_chr[i]) {
+qemu_chr_free(s->eventfd_chr[i]);
+}
+}
+g_free(s->eventfd_chr);
+}
+
+if (s->peers) {
+for (i = 0; i < s->nb_peers; i++) {
+close_guest_eventfds(s, i);
+}
+g_free(s->peers);
+}
+
+if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
+msix_uninit_exclusive_bar(dev);
+}
+
+g_free(s->eventfd_table);
 }
 
 static bool test_msix(void *opaque, int version_id)
-- 
2.4.3

[Qemu-devel] [PULL 39/48] qtest: add qtest_add_abrt_handler()

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

Allow a test to add abort handlers, use GHook for all handlers.

There is currently no way to remove a handler, but it could be
later added if needed.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 tests/libqtest.c | 37 -
 tests/libqtest.h |  2 ++
 2 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/tests/libqtest.c b/tests/libqtest.c
index e5188e0..4a3a6ad 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -49,6 +49,7 @@ struct QTestState
 struct sigaction sigact_old; /* restored on exit */
 };
 
+static GHookList abrt_hooks;
 static GList *qtest_instances;
 static struct sigaction sigact_old;
 
@@ -112,10 +113,7 @@ static void kill_qemu(QTestState *s)
 
 static void sigabrt_handler(int signo)
 {
-GList *elem;
-for (elem = qtest_instances; elem; elem = elem->next) {
-kill_qemu(elem->data);
-}
+g_hook_list_invoke(&abrt_hooks, FALSE);
 }
 
 static void setup_sigabrt_handler(void)
@@ -136,6 +134,23 @@ static void cleanup_sigabrt_handler(void)
 sigaction(SIGABRT, &sigact_old, NULL);
 }
 
+void qtest_add_abrt_handler(void (*fn), const void *data)
+{
+GHook *hook;
+
+/* Only install SIGABRT handler once */
+if (!abrt_hooks.is_setup) {
+g_hook_list_init(&abrt_hooks, sizeof(GHook));
+setup_sigabrt_handler();
+}
+
+hook = g_hook_alloc(&abrt_hooks);
+hook->func = fn;
+hook->data = (void *)data;
+
+g_hook_prepend(&abrt_hooks, hook);
+}
+
 QTestState *qtest_init(const char *extra_args)
 {
 QTestState *s;
@@ -156,12 +171,7 @@ QTestState *qtest_init(const char *extra_args)
 sock = init_socket(socket_path);
 qmpsock = init_socket(qmp_socket_path);
 
-/* Only install SIGABRT handler once */
-if (!qtest_instances) {
-setup_sigabrt_handler();
-}
-
-qtest_instances = g_list_prepend(qtest_instances, s);
+qtest_add_abrt_handler(kill_qemu, s);
 
 s->qemu_pid = fork();
 if (s->qemu_pid == 0) {
@@ -209,13 +219,14 @@ QTestState *qtest_init(const char *extra_args)
 
 void qtest_quit(QTestState *s)
 {
+qtest_instances = g_list_remove(qtest_instances, s);
+g_hook_destroy_link(&abrt_hooks, g_hook_find_data(&abrt_hooks, TRUE, s));
+
 /* Uninstall SIGABRT handler on last instance */
-if (qtest_instances && !qtest_instances->next) {
+if (!qtest_instances) {
 cleanup_sigabrt_handler();
 }
 
-qtest_instances = g_list_remove(qtest_instances, s);
-
 kill_qemu(s);
 close(s->fd);
 close(s->qmp_fd);
diff --git a/tests/libqtest.h b/tests/libqtest.h
index ec42031..f02c07c 100644
--- a/tests/libqtest.h
+++ b/tests/libqtest.h
@@ -427,6 +427,8 @@ void qtest_add_data_func(const char *str, const void *data, 
void (*fn));
 g_free(path); \
 } while (0)
 
+void qtest_add_abrt_handler(void (*fn), const void *data);
+
 /**
  * qtest_start:
  * @args: other arguments to pass to QEMU
-- 
2.4.3
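
The handler-list idea in this patch can be illustrated without GLib. The sketch below is a plain-C singly linked list standing in for GHookList. It is not the libqtest code and does not use the GLib API, and every name in it is hypothetical. Handlers are prepended, found by their data pointer for removal, and invoked in list order.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*abrt_fn)(void *data);

/* One registered handler; a stand-in for a GHook. */
struct abrt_hook {
    abrt_fn fn;
    void *data;
    struct abrt_hook *next;
};

static struct abrt_hook *abrt_hooks; /* list head, newest first */

/* Prepend a handler; returns 0 on success, -1 if allocation fails. */
int add_abrt_hook(abrt_fn fn, void *data)
{
    struct abrt_hook *h = malloc(sizeof(*h));

    if (h == NULL) {
        return -1;
    }
    h->fn = fn;
    h->data = data;
    h->next = abrt_hooks;
    abrt_hooks = h;
    return 0;
}

/* Remove the first handler registered with this data pointer, if any. */
void remove_abrt_hook(void *data)
{
    struct abrt_hook **pp = &abrt_hooks;

    while (*pp != NULL) {
        if ((*pp)->data == data) {
            struct abrt_hook *dead = *pp;

            *pp = dead->next;
            free(dead);
            return;
        }
        pp = &(*pp)->next;
    }
}

/* Call every registered handler with its own data pointer. */
void invoke_abrt_hooks(void)
{
    struct abrt_hook *h;

    for (h = abrt_hooks; h != NULL; h = h->next) {
        h->fn(h->data);
    }
}

static void bump(void *data)
{
    ++*(int *)data;
}

/* Returns 1 when only the handler that is still registered has run. */
int demo_abrt_hooks(void)
{
    int a = 0, b = 0;

    if (add_abrt_hook(bump, &a) != 0 || add_abrt_hook(bump, &b) != 0) {
        return 0;
    }
    remove_abrt_hook(&a);
    invoke_abrt_hooks();
    remove_abrt_hook(&b);
    return a == 0 && b == 1;
}
```

Keying removal on the data pointer matches how the patch looks up a hook by the QTestState it was registered with.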

[Qemu-devel] [PULL 17/48] ivshmem: improve debug messages

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

Some misc improvements to ivshmem debug.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index c4c130d..50f9c8f 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -208,10 +208,13 @@ static void ivshmem_io_write(void *opaque, hwaddr addr,
 if (vector < s->peers[dest].nb_eventfds) {
 IVSHMEM_DPRINTF("Notifying VM %d on vector %d\n", dest, 
vector);
 event_notifier_set(&s->peers[dest].eventfds[vector]);
+} else {
+IVSHMEM_DPRINTF("Invalid destination vector %d on VM %d\n",
+vector, dest);
 }
 break;
 default:
-IVSHMEM_DPRINTF("Invalid VM Doorbell VM %d\n", dest);
+IVSHMEM_DPRINTF("Unhandled write " TARGET_FMT_plx "\n", addr);
 }
 }
 
@@ -263,9 +266,9 @@ static void ivshmem_receive(void *opaque, const uint8_t 
*buf, int size)
 {
 IVShmemState *s = opaque;
 
-ivshmem_IntrStatus_write(s, *buf);
+IVSHMEM_DPRINTF("ivshmem_receive 0x%02x size: %d\n", *buf, size);
 
-IVSHMEM_DPRINTF("ivshmem_receive 0x%02x\n", *buf);
+ivshmem_IntrStatus_write(s, *buf);
 }
 
 static int ivshmem_can_receive(void * opaque)
@@ -592,6 +595,7 @@ static void ivshmem_use_msix(IVShmemState * s)
 PCIDevice *d = PCI_DEVICE(s);
 int i;
 
+IVSHMEM_DPRINTF("%s, msix present: %d\n", __func__, msix_present(d));
 if (!msix_present(d)) {
 return;
 }
-- 
2.4.3

[Qemu-devel] [kvm-unit-tests PATCHv3 1/3] arm: Add PMU test

2015-10-06 Thread Christopher Covington

Beginning with a simple sanity check of the control register, add
a unit test for the ARM Performance Monitors Unit (PMU).

Signed-off-by: Christopher Covington 
---
 arm/pmu.c   | 66 +
 arm/unittests.cfg   |  5 
 config/config-arm64.mak |  4 ++-
 3 files changed, 74 insertions(+), 1 deletion(-)
 create mode 100644 arm/pmu.c

diff --git a/arm/pmu.c b/arm/pmu.c
new file mode 100644
index 000..91a3688
--- /dev/null
+++ b/arm/pmu.c
@@ -0,0 +1,66 @@
+/*
+ * Test the ARM Performance Monitors Unit (PMU).
+ *
+ * Copyright 2015 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU Lesser General Public License version 2.1 and
+ * only version 2.1 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
+ * for more details.
+ */
+#include "libcflat.h"
+
+struct pmu_data {
+   union {
+   uint32_t pmcr_el0;
+   struct {
+   uint32_t enable:1;
+   uint32_t event_counter_reset:1;
+   uint32_t cycle_counter_reset:1;
+   uint32_t cycle_counter_clock_divider:1;
+   uint32_t event_counter_export:1;
+   uint32_t cycle_counter_disable_when_prohibited:1;
+   uint32_t cycle_counter_long:1;
+   uint32_t zeros:4;
+   uint32_t num_counters:5;
+   uint32_t identification_code:8;
+   uint32_t implementer:8;
+   };
+   };
+};
+
+/* As a simple sanity check on the PMCR_EL0, ensure the implementer field isn't
+ * null. Also print out a couple other interesting fields for diagnostic
+ * purposes. For example, as of fall 2015, QEMU TCG mode doesn't implement
+ * event counters and therefore reports zero of them, but hopefully support for
+ * at least the instructions event will be added in the future and the reported
+ * number of event counters will become nonzero.
+ */
+static bool check_pmcr(void)
+{
+   struct pmu_data pmcr;
+
+   asm volatile("mrs %0, pmcr_el0" : "=r" (pmcr));
+
+   printf("PMU implementer: %c\n", pmcr.implementer);
+   printf("Identification code: 0x%x\n", pmcr.identification_code);
+   printf("Event counters:  %d\n", pmcr.num_counters);
+
+   if (pmcr.implementer)
+   return true;
+
+   return false;
+}
+
+int main(void)
+{
+   report_prefix_push("pmu");
+
+   report("Control register", check_pmcr());
+
+   return report_summary();
+}
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index e068a0c..fd94adb 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -35,3 +35,8 @@ file = selftest.flat
 smp = `getconf _NPROCESSORS_CONF`
 extra_params = -append 'smp'
 groups = selftest
+
+# Test PMU support without -icount
+[pmu]
+file = pmu.flat
+groups = pmu
diff --git a/config/config-arm64.mak b/config/config-arm64.mak
index d61b703..140b611 100644
--- a/config/config-arm64.mak
+++ b/config/config-arm64.mak
@@ -12,9 +12,11 @@ cflatobjs += lib/arm64/processor.o
 cflatobjs += lib/arm64/spinlock.o
 
 # arm64 specific tests
-tests =
+tests = $(TEST_DIR)/pmu.flat
 
 include config/config-arm-common.mak
 
 arch_clean: arm_clean
$(RM) lib/arm64/.*.d
+
+$(TEST_DIR)/pmu.elf: $(cstart.o) $(TEST_DIR)/pmu.o
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

Re: [Qemu-devel] [PATCH 1/5] ide/atapi: make PIO read requests async

2015-10-06 Thread John Snow



On 10/06/2015 02:31 PM, Peter Lieven wrote:
> Am 06.10.2015 um 19:56 schrieb John Snow:
>>
>> On 10/06/2015 01:12 PM, Peter Lieven wrote:
 Am 06.10.2015 um 19:07 schrieb John Snow :



> On 10/06/2015 05:20 AM, Peter Lieven wrote:
>> Am 06.10.2015 um 10:57 schrieb Kevin Wolf:
>> Am 05.10.2015 um 23:15 hat John Snow geschrieben:
>>> On 09/21/2015 08:25 AM, Peter Lieven wrote:
 PIO read requests on the ATAPI interface used to be sync blk requests.
 This has two significant drawbacks. First the main loop hangs until an
 I/O request is completed and secondly if the I/O request does not
 complete (e.g. due to an unresponsive storage) Qemu hangs completely.
Maybe you can have a look at the other patches of this series as well?
Then I can
respin the whole series.



 Signed-off-by: Peter Lieven 
 ---
  hw/ide/atapi.c | 69
 --
  1 file changed, 43 insertions(+), 26 deletions(-)

 diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
 index 747f466..9257e1c 100644
 --- a/hw/ide/atapi.c
 +++ b/hw/ide/atapi.c
 @@ -105,31 +105,51 @@ static void cd_data_to_raw(uint8_t *buf, int lba)
  memset(buf, 0, 288);
  }
  -static int cd_read_sector(IDEState *s, int lba, uint8_t *buf, int
 sector_size)
 +static void cd_read_sector_cb(void *opaque, int ret)
  {
 -int ret;
 +IDEState *s = opaque;
  -switch(sector_size) {
 -case 2048:
 -block_acct_start(blk_get_stats(s->blk), &s->acct,
 - 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
 -ret = blk_read(s->blk, (int64_t)lba << 2, buf, 4);
 -block_acct_done(blk_get_stats(s->blk), &s->acct);
 -break;
 -case 2352:
 -block_acct_start(blk_get_stats(s->blk), &s->acct,
 - 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
 -ret = blk_read(s->blk, (int64_t)lba << 2, buf + 16, 4);
 -block_acct_done(blk_get_stats(s->blk), &s->acct);
 -if (ret < 0)
 -return ret;
 -cd_data_to_raw(buf, lba);
 -break;
 -default:
 -ret = -EIO;
 -break;
 +block_acct_done(blk_get_stats(s->blk), &s->acct);
 +
 +if (ret < 0) {
 +ide_atapi_io_error(s, ret);
 +return;
 +}
 +
 +if (s->cd_sector_size == 2352) {
 +cd_data_to_raw(s->io_buffer, s->lba);
  }
 -return ret;
 +
 +s->lba++;
 +s->io_buffer_index = 0;
 +s->status &= ~BUSY_STAT;
 +
 +ide_atapi_cmd_reply_end(s);
 +}
 +
 +static int cd_read_sector(IDEState *s, int lba, void *buf, int
 sector_size)
 +{
 +if (sector_size != 2048 && sector_size != 2352) {
 +return -EINVAL;
 +}
 +
 +s->iov.iov_base = buf;
 +if (sector_size == 2352) {
 +buf += 4;
 +}
>> This doesn't look quite right, buf is never read after this.
>>
>> Also, why +=4 when it was originally buf + 16?
> You are right. I mixed that up.
>
 +
 +s->iov.iov_len = 4 * BDRV_SECTOR_SIZE;
 +qemu_iovec_init_external(&s->qiov, &s->iov, 1);
 +
 +if (blk_aio_readv(s->blk, (int64_t)lba << 2, &s->qiov, 4,
 +  cd_read_sector_cb, s) == NULL) {
 +return -EIO;
 +}
 +
 +block_acct_start(blk_get_stats(s->blk), &s->acct,
 + 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
 +s->status |= BUSY_STAT;
 +return 0;
  }
>>> We discussed this off-list a bit, but for upstream synchronization:
>>>
>>> Unfortunately, I believe making cd_read_sector here non-blocking makes
>>> ide_atapi_cmd_reply_end non-blocking, and as a result makes calls to
>>> s->end_transfer_func() nonblocking, which functions like ide_data_readw
>>> are not prepared to cope with.
>> I don't think that's a problem as long as BSY is set while the
>> asynchronous command is running and DRQ is cleared. The latter will
>> protect ide_data_readw(). ide_sector_read() does essentially the same
>> thing.
> I was thinking the same. Without the BSY it's not working at all.
>
>> Or maybe I'm just missing what you're trying to say.
>>
>>> My suggestion is to buffer an entire DRQ block of data at once
>>> (byte_count_limit) to avoid the problem.
>> No matter whether there is a problem or not, buffering more

Re: [Qemu-devel] [PATCH 1/5] ide/atapi: make PIO read requests async

2015-10-06 Thread Peter Lieven


> Am 06.10.2015 um 19:07 schrieb John Snow :
> 
> 
> 
>> On 10/06/2015 05:20 AM, Peter Lieven wrote:
>>> Am 06.10.2015 um 10:57 schrieb Kevin Wolf:
>>> Am 05.10.2015 um 23:15 hat John Snow geschrieben:
 
 On 09/21/2015 08:25 AM, Peter Lieven wrote:
> PIO read requests on the ATAPI interface used to be sync blk requests.
> This has two significant drawbacks. First, the main loop hangs until an
> I/O request is completed, and second, if the I/O request does not
> complete (e.g. due to unresponsive storage), Qemu hangs completely.
> 
> Signed-off-by: Peter Lieven 
> ---
>  hw/ide/atapi.c | 69
> --
>  1 file changed, 43 insertions(+), 26 deletions(-)
> 
> diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
> index 747f466..9257e1c 100644
> --- a/hw/ide/atapi.c
> +++ b/hw/ide/atapi.c
> @@ -105,31 +105,51 @@ static void cd_data_to_raw(uint8_t *buf, int lba)
>  memset(buf, 0, 288);
>  }
>  -static int cd_read_sector(IDEState *s, int lba, uint8_t *buf, int
> sector_size)
> +static void cd_read_sector_cb(void *opaque, int ret)
>  {
> -int ret;
> +IDEState *s = opaque;
>  -switch(sector_size) {
> -case 2048:
> -block_acct_start(blk_get_stats(s->blk), &s->acct,
> - 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> -ret = blk_read(s->blk, (int64_t)lba << 2, buf, 4);
> -block_acct_done(blk_get_stats(s->blk), &s->acct);
> -break;
> -case 2352:
> -block_acct_start(blk_get_stats(s->blk), &s->acct,
> - 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> -ret = blk_read(s->blk, (int64_t)lba << 2, buf + 16, 4);
> -block_acct_done(blk_get_stats(s->blk), &s->acct);
> -if (ret < 0)
> -return ret;
> -cd_data_to_raw(buf, lba);
> -break;
> -default:
> -ret = -EIO;
> -break;
> +block_acct_done(blk_get_stats(s->blk), &s->acct);
> +
> +if (ret < 0) {
> +ide_atapi_io_error(s, ret);
> +return;
> +}
> +
> +if (s->cd_sector_size == 2352) {
> +cd_data_to_raw(s->io_buffer, s->lba);
>  }
> -return ret;
> +
> +s->lba++;
> +s->io_buffer_index = 0;
> +s->status &= ~BUSY_STAT;
> +
> +ide_atapi_cmd_reply_end(s);
> +}
> +
> +static int cd_read_sector(IDEState *s, int lba, void *buf, int
> sector_size)
> +{
> +if (sector_size != 2048 && sector_size != 2352) {
> +return -EINVAL;
> +}
> +
> +s->iov.iov_base = buf;
> +if (sector_size == 2352) {
> +buf += 4;
> +}
>>> This doesn't look quite right, buf is never read after this.
>>> 
>>> Also, why +=4 when it was originally buf + 16?
>> 
>> You are right. I mixed that up.
>> 
>>> 
> +
> +s->iov.iov_len = 4 * BDRV_SECTOR_SIZE;
> +qemu_iovec_init_external(&s->qiov, &s->iov, 1);
> +
> +if (blk_aio_readv(s->blk, (int64_t)lba << 2, &s->qiov, 4,
> +  cd_read_sector_cb, s) == NULL) {
> +return -EIO;
> +}
> +
> +block_acct_start(blk_get_stats(s->blk), &s->acct,
> + 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> +s->status |= BUSY_STAT;
> +return 0;
>  }
 We discussed this off-list a bit, but for upstream synchronization:
 
 Unfortunately, I believe making cd_read_sector here non-blocking makes
 ide_atapi_cmd_reply_end non-blocking, and as a result makes calls to
 s->end_transfer_func() nonblocking, which functions like ide_data_readw
 are not prepared to cope with.
>>> I don't think that's a problem as long as BSY is set while the
>>> asynchronous command is running and DRQ is cleared. The latter will
>>> protect ide_data_readw(). ide_sector_read() does essentially the same
>>> thing.
>> 
>> I was thinking the same. Without the BSY it's not working at all.
>> 
>>> 
>>> Or maybe I'm just missing what you're trying to say.
>>> 
 My suggestion is to buffer an entire DRQ block of data at once
 (byte_count_limit) to avoid the problem.
>>> No matter whether there is a problem or not, buffering more data at once
>>> (and therefore doing less requests) is better for performance anyway.
>> 
>> It's possible to do only one read in the backend and read the whole
>> request into the IO buffer. I'll send a follow-up.
> 
> Be cautious: we only have 128K (+4 bytes) to play with in the io_buffer
> and the READ10 cdb can request up to 128MiB! For performance, it might
> be nice to always buffer something like:
> 
> MIN(128K, nb_sectors * sector_size)

Isn't nb_sectors limited to CD_MAX_SECTORS (32)?

Peter


> 
> and then as the guest drains the DRQ block of size
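
For reference, the MIN(128K, nb_sectors * sector_size) idea quoted above could be sketched roughly as follows. This is only an illustration, not code from the patch: atapi_chunk_bytes() and IO_BUFFER_CAPACITY are hypothetical names, the capacity is simply the 128K figure mentioned in the thread, and rounding the cap down to a whole number of sectors is an extra assumption that nobody in the thread asked for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed buffer capacity: the "128K" figure from the thread. */
#define IO_BUFFER_CAPACITY ((size_t)128 * 1024)

/*
 * Hypothetical helper: how many bytes to fetch from the backend in one
 * read. It is the smaller of the full request and the buffer capacity,
 * with the capacity rounded down to a whole number of sectors.
 */
static size_t atapi_chunk_bytes(uint32_t nb_sectors, uint32_t sector_size)
{
    assert(sector_size == 2048 || sector_size == 2352);

    /* 64-bit product so a large sector count cannot overflow. */
    uint64_t want = (uint64_t)nb_sectors * sector_size;
    uint64_t cap = IO_BUFFER_CAPACITY - (IO_BUFFER_CAPACITY % sector_size);

    return (size_t)(want < cap ? want : cap);
}
```

With 2048-byte sectors the cap is the full 128 KiB (64 sectors); with 2352-byte sectors only 55 whole sectors fit, so the cap drops to 129360 bytes.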

[Qemu-devel] [PULL 26/48] ivshmem: add device description

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 9023f95..7be3d5e 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -925,6 +925,7 @@ static void ivshmem_class_init(ObjectClass *klass, void 
*data)
 dc->props = ivshmem_properties;
 dc->vmsd = &ivshmem_vmsd;
 set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+dc->desc = "Inter-VM shared memory";
 }
 
 static const TypeInfo ivshmem_info = {
-- 
2.4.3

[Qemu-devel] [PATCH V10 1/3] sd.h: Move sd.h to include/hw/sd/

2015-10-06 Thread Sai Pavan Boddu

Create a sd directory under include/hw/ and move sd.h to
include/hw/sd/

Signed-off-by: Sai Pavan Boddu 
Reviewed-by: Alistair Francis 
Reviewed-by: Peter Crosthwaite 
---
Changes for V10:
Fix Commit Message.
Changes for V9:
None.
Changes for V8:
None
Changes for V7:
None
Changes for V6:
Fix commit message.
Changes for V5:
None
Changes for V4:
Fix commit message.
Changes for V3:
None.
---
 hw/sd/milkymist-memcard.c | 2 +-
 hw/sd/omap_mmc.c  | 2 +-
 hw/sd/pl181.c | 2 +-
 hw/sd/pxa2xx_mmci.c   | 2 +-
 hw/sd/sd.c| 2 +-
 hw/sd/sdhci.h | 2 +-
 hw/sd/ssi-sd.c| 2 +-
 include/hw/{ => sd}/sd.h  | 0
 8 files changed, 7 insertions(+), 7 deletions(-)
 rename include/hw/{ => sd}/sd.h (100%)

diff --git a/hw/sd/milkymist-memcard.c b/hw/sd/milkymist-memcard.c
index 2209ef1..b430d56 100644
--- a/hw/sd/milkymist-memcard.c
+++ b/hw/sd/milkymist-memcard.c
@@ -28,7 +28,7 @@
 #include "qemu/error-report.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/blockdev.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 
 enum {
 ENABLE_CMD_TX   = (1<<0),
diff --git a/hw/sd/omap_mmc.c b/hw/sd/omap_mmc.c
index 35d8033..5bc4719 100644
--- a/hw/sd/omap_mmc.c
+++ b/hw/sd/omap_mmc.c
@@ -18,7 +18,7 @@
  */
 #include "hw/hw.h"
 #include "hw/arm/omap.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 
 struct omap_mmc_s {
 qemu_irq irq;
diff --git a/hw/sd/pl181.c b/hw/sd/pl181.c
index 5242176..326c53a 100644
--- a/hw/sd/pl181.c
+++ b/hw/sd/pl181.c
@@ -10,7 +10,7 @@
 #include "sysemu/block-backend.h"
 #include "sysemu/blockdev.h"
 #include "hw/sysbus.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 
 //#define DEBUG_PL181 1
 
diff --git a/hw/sd/pxa2xx_mmci.c b/hw/sd/pxa2xx_mmci.c
index d1fe6d5..b217080 100644
--- a/hw/sd/pxa2xx_mmci.c
+++ b/hw/sd/pxa2xx_mmci.c
@@ -12,7 +12,7 @@
 
 #include "hw/hw.h"
 #include "hw/arm/pxa.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 #include "hw/qdev.h"
 
 struct PXA2xxMMCIState {
diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index 3e2a451..06f56d5 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -31,7 +31,7 @@
 
 #include "hw/hw.h"
 #include "sysemu/block-backend.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 #include "qemu/bitmap.h"
 
 //#define DEBUG_SD 1
diff --git a/hw/sd/sdhci.h b/hw/sd/sdhci.h
index 3352d23..a45593f 100644
--- a/hw/sd/sdhci.h
+++ b/hw/sd/sdhci.h
@@ -28,7 +28,7 @@
 #include "qemu-common.h"
 #include "hw/pci/pci.h"
 #include "hw/sysbus.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 
 /* R/W SDMA System Address register 0x0 */
 #define SDHC_SYSAD 0x00
diff --git a/hw/sd/ssi-sd.c b/hw/sd/ssi-sd.c
index e4b2d4f..c49ff62 100644
--- a/hw/sd/ssi-sd.c
+++ b/hw/sd/ssi-sd.c
@@ -13,7 +13,7 @@
 #include "sysemu/block-backend.h"
 #include "sysemu/blockdev.h"
 #include "hw/ssi.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 
 //#define DEBUG_SSI_SD 1
 
diff --git a/include/hw/sd.h b/include/hw/sd/sd.h
similarity index 100%
rename from include/hw/sd.h
rename to include/hw/sd/sd.h
-- 
2.1.4

[Qemu-devel] [PULL 45/48] ivshmem: remove EventfdEntry.vector

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

No need to store an extra int for the vector number when it can be
computed easily by looking at the position in the array.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 2fdb92b..3283874 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -68,7 +68,6 @@ typedef struct Peer {
 
 typedef struct EventfdEntry {
 PCIDevice *pdev;
-int vector;
 } EventfdEntry;
 
 typedef struct IVShmemState {
@@ -287,9 +286,11 @@ static void fake_irqfd(void *opaque, const uint8_t *buf, 
int size) {
 
 EventfdEntry *entry = opaque;
 PCIDevice *pdev = entry->pdev;
+IVShmemState *s = IVSHMEM(pdev);
+int vector = entry - s->eventfd_table;
 
-IVSHMEM_DPRINTF("interrupt on vector %p %d\n", pdev, entry->vector);
-msix_notify(pdev, entry->vector);
+IVSHMEM_DPRINTF("interrupt on vector %p %d\n", pdev, vector);
+msix_notify(pdev, vector);
 }
 
 static CharDriverState* create_eventfd_chr_device(void * opaque, EventNotifier 
*n,
@@ -311,7 +312,6 @@ static CharDriverState* create_eventfd_chr_device(void * 
opaque, EventNotifier *
 /* if MSI is supported we need multiple interrupts */
 if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
 s->eventfd_table[vector].pdev = PCI_DEVICE(s);
-s->eventfd_table[vector].vector = vector;
 
 qemu_chr_add_handlers(chr, ivshmem_can_receive, fake_irqfd,
                       ivshmem_event, &s->eventfd_table[vector]);
-- 
2.4.3
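
As an aside, the "position in the array" trick from the commit message can be shown in isolation. The sketch below is not the ivshmem code: VectorEntry and vector_index() are hypothetical stand-ins, and the bounds assertion is an addition for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-vector entry; not the QEMU type. */
typedef struct VectorEntry {
    void *owner;
} VectorEntry;

/*
 * Recover the index of 'entry' inside the array that starts at 'table'.
 * Subtracting two pointers into the same array yields the distance in
 * elements, so no separate index field has to be stored.
 */
static ptrdiff_t vector_index(const VectorEntry *table, size_t n,
                              const VectorEntry *entry)
{
    assert(entry >= table && entry < table + n);
    return entry - table;
}
```

This only works while the callback's opaque pointer really points into the table it is subtracted from, which is why the sketch checks the bounds.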

[Qemu-devel] [PULL 33/48] ivshmem-server: use a uint16 for client ID

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

In practice, the number of VMs is limited to MAXUINT16 in ivshmem, so use
the same limit on the server (removes a theoretical infinite loop).

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 contrib/ivshmem-server/ivshmem-server.c | 11 ++-
 contrib/ivshmem-server/ivshmem-server.h |  2 +-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/contrib/ivshmem-server/ivshmem-server.c 
b/contrib/ivshmem-server/ivshmem-server.c
index 16ee583..972fda2 100644
--- a/contrib/ivshmem-server/ivshmem-server.c
+++ b/contrib/ivshmem-server/ivshmem-server.c
@@ -145,9 +145,18 @@ ivshmem_server_handle_new_conn(IvshmemServer *server)
 peer->sock_fd = newfd;
 
 /* get an unused peer id */
-while (ivshmem_server_search_peer(server, server->cur_id) != NULL) {
+/* XXX: this could use id allocation such as Linux IDA, or simply
+ * a free-list */
+for (i = 0; i < G_MAXUINT16; i++) {
+if (ivshmem_server_search_peer(server, server->cur_id) == NULL) {
+break;
+}
 server->cur_id++;
 }
+if (i == G_MAXUINT16) {
+IVSHMEM_SERVER_DEBUG(server, "cannot allocate new client id\n");
+goto fail;
+}
 peer->id = server->cur_id++;
 
 /* create eventfd, one per vector */
diff --git a/contrib/ivshmem-server/ivshmem-server.h 
b/contrib/ivshmem-server/ivshmem-server.h
index cd584fc..2176d5e 100644
--- a/contrib/ivshmem-server/ivshmem-server.h
+++ b/contrib/ivshmem-server/ivshmem-server.h
@@ -70,7 +70,7 @@ typedef struct IvshmemServer {
 size_t shm_size; /**< size of shm */
 int shm_fd;  /**< shm file descriptor */
 unsigned n_vectors;  /**< number of vectors */
-long cur_id; /**< id to be given to next client */
+uint16_t cur_id; /**< id to be given to next client */
 bool verbose;/**< true in verbose mode */
 IvshmemServerPeerList peer_list; /**< list of peers */
 } IvshmemServer;
-- 
2.4.3
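
A standalone sketch of a bounded search for a free 16-bit id, in the spirit of the loop above. It is an illustration under assumptions, not the server code: IdPool and id_pool_alloc() are hypothetical, the peer list is replaced by a flat "in use" table, and the loop tries all 65536 values rather than the G_MAXUINT16 iterations used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ID_SPACE 65536u  /* number of distinct uint16_t values */

/* Hypothetical allocator state: one "in use" flag per possible id. */
typedef struct IdPool {
    bool used[ID_SPACE];
    uint16_t cur_id;     /* next candidate; wraps from 65535 back to 0 */
} IdPool;

/* Returns the allocated id, or -1 if every id is already taken. */
static int id_pool_alloc(IdPool *pool)
{
    assert(pool != NULL);

    /* The explicit try counter is what rules out an endless loop. */
    for (uint32_t tries = 0; tries < ID_SPACE; tries++) {
        uint16_t id = pool->cur_id++;
        if (!pool->used[id]) {
            pool->used[id] = true;
            return id;
        }
    }
    return -1;
}
```

As the XXX comment in the patch notes, a free-list or an IDA-style allocator would avoid the linear scan; the sketch keeps the scan only because it mirrors the loop being discussed.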

[Qemu-devel] [PULL 25/48] ivshmem: check shm isn't already initialized

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

The server should not change the shm. This isn't handled by qemu, so
we should verify it in qemu.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index d3d0204..9023f95 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -533,6 +533,12 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, 
int size)
 if (incoming_posn == -1) {
 void * map_ptr;
 
+if (s->shm_fd >= 0) {
+error_report("shm already initialized");
+close(incoming_fd);
+return;
+}
+
 if (check_shm_size(s, incoming_fd, &err) == -1) {
 error_report_err(err);
 close(incoming_fd);
-- 
2.4.3

[Qemu-devel] [PULL 21/48] ivshmem: use common return

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

Both if branches return, move this out to common end.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index c054e52..fbb6f40 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -525,13 +525,12 @@ static void ivshmem_read(void *opaque, const uint8_t 
*buf, int size)
 if (incoming_posn >= 0 && s->vm_id == -1) {
 /* receive our posn */
 s->vm_id = incoming_posn;
-return;
 } else {
 /* otherwise an fd == -1 means an existing guest has gone away */
 IVSHMEM_DPRINTF("posn %ld has gone away\n", incoming_posn);
 close_guest_eventfds(s, incoming_posn);
-return;
 }
+return;
 }
 
 /* if the position is -1, then it's shared memory region fd */
-- 
2.4.3

Re: [Qemu-devel] [PATCH v3 1/4] firmware: introduce sysfs driver for QEMU's fw_cfg device

2015-10-06 Thread Andy Lutomirski

On Sat, Oct 3, 2015 at 4:28 PM, Gabriel L. Somlo  wrote:
> From: Gabriel Somlo 
>
> Make fw_cfg entries of type "file" available via sysfs. Entries
> are listed under /sys/firmware/qemu_fw_cfg/by_key, in folders
> named after each entry's selector key. Filename, selector value,
> and size read-only attributes are included for each entry. Also,
> a "raw" attribute allows retrieval of the full binary content of
> each entry.
>
> This patch also provides a documentation file outlining the
> guest-side "hardware" interface exposed by the QEMU fw_cfg device.
>

What's the status of "by_name"?  There's a single (presumably
incorrect) mention of it in a comment in this patch.

I would prefer if the kernel populated by_name itself rather than
deferring that to udev, since I'd like to use this facility in virtme,
and I'd like to use fw_cfg very early on boot before I even start
udev.

--Andy

[Qemu-devel] [PATCH v2] Remove macros IO_READ_PROTO and IO_WRITE_PROTO

2015-10-06 Thread Nutan Shinde

Signed-off-by: Nutan Shinde 
---
 hw/audio/adlib.c  | 28 +++
 hw/audio/es1370.c | 60 +-
 hw/audio/gus.c| 26 +++---
 hw/audio/sb16.c   | 66 +++
 4 files changed, 90 insertions(+), 90 deletions(-)

diff --git a/hw/audio/adlib.c b/hw/audio/adlib.c
index af39920..966c590 100644
--- a/hw/audio/adlib.c
+++ b/hw/audio/adlib.c
@@ -50,7 +50,7 @@
 
 #ifdef HAS_YMF262
 #include "ymf262.h"
-void YMF262UpdateOneQEMU (int which, INT16 *dst, int length);
+void YMF262UpdateOneQEMU(int which, INT16 *dst, int length);
 #define SHIFT 2
 #else
 #include "fmopl.h"
@@ -86,7 +86,7 @@ typedef struct {
 
 static AdlibState *glob_adlib;
 
-static void adlib_stop_opl_timer (AdlibState *s, size_t n)
+static void adlib_stop_opl_timer(AdlibState *s, size_t n)
 {
 #ifdef HAS_YMF262
 YMF262TimerOver (0, n);
@@ -96,7 +96,7 @@ static void adlib_stop_opl_timer (AdlibState *s, size_t n)
 s->ticking[n] = 0;
 }
 
-static void adlib_kill_timers (AdlibState *s)
+static void adlib_kill_timers(AdlibState *s)
 {
 size_t i;
 
@@ -119,7 +119,7 @@ static void adlib_kill_timers (AdlibState *s)
 }
 }
 
-static void adlib_write (void *opaque, uint32_t nport, uint32_t val)
+static void adlib_write(void *opaque, uint32_t nport, uint32_t val)
 {
 AdlibState *s = opaque;
 int a = nport & 3;
@@ -136,7 +136,7 @@ static void adlib_write (void *opaque, uint32_t nport, 
uint32_t val)
 #endif
 }
 
-static uint32_t adlib_read (void *opaque, uint32_t nport)
+static uint32_t adlib_read(void *opaque, uint32_t nport)
 {
 AdlibState *s = opaque;
 uint8_t data;
@@ -152,7 +152,7 @@ static uint32_t adlib_read (void *opaque, uint32_t nport)
 return data;
 }
 
-static void timer_handler (int c, double interval_Sec)
+static void timer_handler(int c, double interval_Sec)
 {
 AdlibState *s = glob_adlib;
 unsigned n = c & 1;
@@ -177,7 +177,7 @@ static void timer_handler (int c, double interval_Sec)
 AUD_init_time_stamp_out (s->voice, &s->ats);
 }
 
-static int write_audio (AdlibState *s, int samples)
+static int write_audio(AdlibState *s, int samples)
 {
 int net = 0;
 int pos = s->pos;
@@ -208,7 +208,7 @@ static int write_audio (AdlibState *s, int samples)
 return net;
 }
 
-static void adlib_callback (void *opaque, int free)
+static void adlib_callback(void *opaque, int free)
 {
 AdlibState *s = opaque;
 int samples, net = 0, to_play, written;
@@ -259,7 +259,7 @@ static void adlib_callback (void *opaque, int free)
 }
 }
 
-static void Adlib_fini (AdlibState *s)
+static void Adlib_fini(AdlibState *s)
 {
 #ifdef HAS_YMF262
 YMF262Shutdown ();
@@ -284,7 +284,7 @@ static MemoryRegionPortio adlib_portio_list[] = {
 PORTIO_END_OF_LIST(),
 };
 
-static void adlib_realizefn (DeviceState *dev, Error **errp)
+static void adlib_realizefn(DeviceState *dev, Error **errp)
 {
 AdlibState *s = ADLIB(dev);
 struct audsettings as;
@@ -337,7 +337,7 @@ static void adlib_realizefn (DeviceState *dev, Error **errp)
 return;
 }
 
-s->samples = AUD_get_buffer_size_out (s->voice) >> SHIFT;
+s->samples = AUD_get_buffer_size_out(s->voice) >> SHIFT;
 s->mixbuf = g_malloc0 (s->samples << SHIFT);
 
 adlib_portio_list[0].offset = s->port;
@@ -352,7 +352,7 @@ static Property adlib_properties[] = {
 DEFINE_PROP_END_OF_LIST (),
 };
 
-static void adlib_class_initfn (ObjectClass *klass, void *data)
+static void adlib_class_initfn(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS (klass);
 
@@ -369,13 +369,13 @@ static const TypeInfo adlib_info = {
 .class_init= adlib_class_initfn,
 };
 
-static int Adlib_init (ISABus *bus)
+static int Adlib_init(ISABus *bus)
 {
 isa_create_simple (bus, TYPE_ADLIB);
 return 0;
 }
 
-static void adlib_register_types (void)
+static void adlib_register_types(void)
 {
 type_register_static (&adlib_info);
 isa_register_soundhw("adlib", ADLIB_DESC, Adlib_init);
diff --git a/hw/audio/es1370.c b/hw/audio/es1370.c
index dfb7c79..edb4864 100644
--- a/hw/audio/es1370.c
+++ b/hw/audio/es1370.c
@@ -157,15 +157,15 @@ static const unsigned dac1_samplerate[] = { 5512, 11025, 
22050, 44100 };
 #define DAC2_CHANNEL 1
 #define ADC_CHANNEL 2
 
-static void es1370_dac1_callback (void *opaque, int free);
-static void es1370_dac2_callback (void *opaque, int free);
-static void es1370_adc_callback (void *opaque, int avail);
+static void es1370_dac1_callback(void *opaque, int free);
+static void es1370_dac2_callback(void *opaque, int free);
+static void es1370_adc_callback(void *opaque, int avail);
 
 #ifdef DEBUG_ES1370
 
 #define ldebug(...) AUD_log ("es1370", __VA_ARGS__)
 
-static void print_ctl (uint32_t val)
+static void print_ctl(uint32_t val)
 {
 char buf[1024];
 
@@ -196,7 +196,7 @@ static void print_ctl (uint32_t val)
  buf);
 }
 
-static void print_sctl

Re: [Qemu-devel] [PATCH v18 00/21] Deterministic replay core

2015-10-06 Thread Paolo Bonzini



On 06/10/2015 17:09, Paolo Bonzini wrote:
> 
> 
> On 21/09/2015 09:12, Pavel Dovgaluk wrote:
>> Hi!
>>
>> Paolo, have you reviewed these patches?
> 
> Pavel,
> 
> I think this is ready to go in.  Here are my final changes,
> can you ack them?

Hmm, there are a few other issues that the patch I pasted doesn't fix.

Thanks,

Paolo

diff --git a/Makefile.objs b/Makefile.objs
index bc43e5c..ba4b45e 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -58,6 +58,8 @@ common-obj-y += audio/
 common-obj-y += hw/
 common-obj-y += accel.o
 
+common-obj-y += replay/
+
 common-obj-y += ui/
 common-obj-y += bt-host.o bt-vhci.o
 bt-host.o-cflags := $(BLUEZ_CFLAGS)
diff --git a/Makefile.target b/Makefile.target
index ca8f351..962d004 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -88,7 +88,6 @@ obj-y = exec.o translate-all.o cpu-exec.o
 obj-y += translate-common.o
 obj-y += cpu-exec-common.o
 obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
-obj-y += replay/
 obj-$(CONFIG_TCG_INTERPRETER) += tci.o
 obj-y += tcg/tcg-common.o
 obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
diff --git a/cpu-exec.c b/cpu-exec.c
index 2b83e18..0850f8c 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -30,7 +30,7 @@
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
 #include "hw/i386/apic.h"
 #endif
-#include "replay/replay.h"
+#include "sysemu/replay.h"
 
 /* -icount align implementation. */
 
diff --git a/cpus.c b/cpus.c
index 5130806..7e846e3 100644
--- a/cpus.c
+++ b/cpus.c
@@ -42,7 +42,7 @@
 #include "qemu/seqlock.h"
 #include "qapi-event.h"
 #include "hw/nmi.h"
-#include "replay/replay.h"
+#include "sysemu/replay.h"
 
 #ifndef _WIN32
 #include "qemu/compatfd.h"
diff --git a/exec.c b/exec.c
index 11954b9..ab0be16 100644
--- a/exec.c
+++ b/exec.c
@@ -50,7 +50,7 @@
 #include "qemu/rcu_queue.h"
 #include "qemu/main-loop.h"
 #include "translate-all.h"
-#include "replay/replay.h"
+#include "sysemu/replay.h"
 
 #include "exec/memory-internal.h"
 #include "exec/ram_addr.h"
diff --git a/hw/bt/hci.c b/hw/bt/hci.c
index 93dd1dc..2151d01 100644
--- a/hw/bt/hci.c
+++ b/hw/bt/hci.c
@@ -24,7 +24,7 @@
 #include "sysemu/bt.h"
 #include "hw/bt.h"
 #include "qapi/qmp/qerror.h"
-#include "replay/replay.h"
+#include "sysemu/replay.h"
 
 struct bt_hci_s {
 uint8_t *(*evt_packet)(void *opaque);
@@ -2193,7 +2193,7 @@ struct HCIInfo *bt_new_hci(struct bt_scatternet_s *net)
 
 s->device.handle_destroy = bt_hci_destroy;
 
-error_set(&s->replay_blocker, ERROR_CLASS_REPLAY_NOT_SUPPORTED, "bt hci");
+error_setg(&s->replay_blocker, QERR_REPLAY_NOT_SUPPORTED, "-bt hci");
 replay_add_blocker(s->replay_blocker);
 
 return &s->info;
diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index c56078d..1d7aa9b 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -9,7 +9,7 @@
 #include "qemu/timer.h"
 #include "hw/ptimer.h"
 #include "qemu/host-utils.h"
-#include "replay/replay.h"
+#include "sysemu/replay.h"
 
 struct ptimer_state
 {
diff --git a/include/qapi/qmp/qerror.h b/include/qapi/qmp/qerror.h
index 0781a7f..f601499 100644
--- a/include/qapi/qmp/qerror.h
+++ b/include/qapi/qmp/qerror.h
@@ -107,6 +107,6 @@
 "this feature or command is not currently supported"
 
 #define QERR_REPLAY_NOT_SUPPORTED \
-ERROR_CLASS_GENERIC_ERROR, "Record/replay feature is not supported for 
'%s'"
+"Record/replay feature is not supported for '%s'"
 
 #endif /* QERROR_H */
diff --git a/replay/replay.h b/include/sysemu/replay.h
similarity index 100%
rename from replay/replay.h
rename to include/sysemu/replay.h
diff --git a/qapi/common.json b/qapi/common.json
index d80e3d4..bad56bf 100644
--- a/qapi/common.json
+++ b/qapi/common.json
@@ -22,15 +22,11 @@
 # @KVMMissingCap: the requested operation can't be fulfilled because a
 # required KVM capability is missing
 #
-# @ReplayNotSupported: the requested feature is not supported with
-#  record/replay mode enabled
-#
 # Since: 1.2
 ##
 { 'enum': 'ErrorClass',
   'data': [ 'GenericError', 'CommandNotFound', 'DeviceEncrypted',
-'DeviceNotActive', 'DeviceNotFound', 'KVMMissingCap',
-'ReplayNotSupported' ] }
+'DeviceNotActive', 'DeviceNotFound', 'KVMMissingCap' ] }
 
 ##
 # @VersionTriple
diff --git a/qemu-timer.c b/qemu-timer.c
index e7a5c96..80f8231 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -24,7 +24,7 @@
 
 #include "qemu/main-loop.h"
 #include "qemu/timer.h"
-#include "replay/replay.h"
+#include "sysemu/replay.h"
 #include "sysemu/sysemu.h"
 
 #ifdef CONFIG_POSIX
@@ -488,20 +488,20 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 break;
 default:
 case QEMU_CLOCK_VIRTUAL:
-if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
-|| !replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
+if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL) ||
+(replay_mode == REPLAY_MODE_PLAY && !runstate_is_running())) {
 goto out;
 }
 break;
 case QEMU_CLOCK_HOST:
-

Re: [Qemu-devel] [PATCH v1 1/1] sdhci.c: Limit the maximum block size

2015-10-06 Thread Peter Crosthwaite

On Tue, Oct 6, 2015 at 10:40 AM, Alistair Francis
 wrote:
> It is possible for the guest to set an invalid block
> size which is larger than the fifo_buffer[] array. This
> could cause a buffer overflow.
>
> To avoid this, limit the maximum size of the blksize variable.
>
> Signed-off-by: Alistair Francis 
> Suggested-by: Igor Mitsyanko 
> Reported-by: Intel Security ATR 
> Reviewed-by: Stefan Hajnoczi 

Reviewed-by: Peter Crosthwaite 

With Pavan's patches and now this, the SD patches are starting to pile
up on list. What queue do they target? target-arm (as lead/major user)
or something block-related?

Regards,
Peter

> ---
>
>  hw/sd/sdhci.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
> index 65304cf..1d47f5c 100644
> --- a/hw/sd/sdhci.c
> +++ b/hw/sd/sdhci.c
> @@ -1006,6 +1006,16 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, 
> unsigned size)
>  MASKED_WRITE(s->blksize, mask, value);
>  MASKED_WRITE(s->blkcnt, mask >> 16, value >> 16);
>  }
> +
> +/* Limit block size to the maximum buffer size */
> +if (extract32(s->blksize, 0, 12) > s->buf_maxsz) {
> +qemu_log_mask(LOG_GUEST_ERROR, "%s: Size 0x%x is larger than " \
> +  "the maximum buffer 0x%x", __func__, s->blksize,
> +  s->buf_maxsz);
> +
> +s->blksize = deposit32(s->blksize, 0, 12, s->buf_maxsz);
> +}
> +
>  break;
>  case SDHC_ARGUMENT:
>  MASKED_WRITE(s->argument, mask, value);
> --
> 2.1.4
>
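
To make the effect of the clamp concrete, here is a minimal standalone sketch of the same idea. The helpers field_get() and field_set() are local stand-ins modeled on what extract32() and deposit32() do in the quoted hunk; they are not QEMU's implementations, and clamp_blksize() is a hypothetical wrapper for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Read 'length' bits of 'value' starting at bit 'start'. */
static uint32_t field_get(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (UINT32_MAX >> (32 - length));
}

/* Return 'value' with the same bit field replaced by 'field'. */
static uint32_t field_set(uint32_t value, int start, int length,
                          uint32_t field)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    uint32_t mask = (UINT32_MAX >> (32 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/*
 * Clamp the low 12 bits (the transfer block size) to buf_maxsz and
 * leave the remaining bits of the register untouched.
 */
static uint32_t clamp_blksize(uint32_t blksize, uint32_t buf_maxsz)
{
    if (field_get(blksize, 0, 12) > buf_maxsz) {
        blksize = field_set(blksize, 0, 12, buf_maxsz);
    }
    return blksize;
}
```

For example, a register value of 0x7FFF with a 0x200-byte buffer becomes 0x7200: the size field is clamped while the upper bits survive.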

[Qemu-devel] [PULL 12/48] ivshmem: simplify around increase_dynamic_storage()

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

Set the number of peers and array allocation in a single place. Rename
to better reflect the function content.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 27 +++
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 3787398..6f41960 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -417,30 +417,28 @@ static void close_guest_eventfds(IVShmemState *s, int 
posn)
 
 /* this function increase the dynamic storage need to store data about other
  * guests */
-static int increase_dynamic_storage(IVShmemState *s, int new_min_size)
+static int resize_peers(IVShmemState *s, int new_min_size)
 {
 
-int j, old_nb_alloc;
+int j, old_size;
 
 /* limit number of max peers */
 if (new_min_size <= 0 || new_min_size > IVSHMEM_MAX_PEERS) {
 return -1;
 }
-
-old_nb_alloc = s->nb_peers;
-
-if (new_min_size >= s->nb_peers) {
-/* +1 because #new_min_size is used as last array index */
-s->nb_peers = new_min_size + 1;
-} else {
+if (new_min_size <= s->nb_peers) {
 return 0;
 }
 
+old_size = s->nb_peers;
+s->nb_peers = new_min_size;
+
 IVSHMEM_DPRINTF("bumping storage to %d guests\n", s->nb_peers);
+
 s->peers = g_realloc(s->peers, s->nb_peers * sizeof(Peer));
 
 /* zero out new pointers */
-for (j = old_nb_alloc; j < s->nb_peers; j++) {
+for (j = old_size; j < s->nb_peers; j++) {
 s->peers[j].eventfds = NULL;
 s->peers[j].nb_eventfds = 0;
 }
@@ -508,8 +506,8 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, 
int size)
 
 /* make sure we have enough space for this guest */
 if (incoming_posn >= s->nb_peers) {
-if (increase_dynamic_storage(s, incoming_posn) < 0) {
-error_report("increase_dynamic_storage() failed");
+if (resize_peers(s, incoming_posn + 1) < 0) {
+error_report("failed to resize peers array");
 if (incoming_fd != -1) {
 close(incoming_fd);
 }
@@ -812,12 +810,9 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 }
 
 /* we allocate enough space for 16 guests and grow as needed */
-s->nb_peers = 16;
+resize_peers(s, 16);
 s->vm_id = -1;
 
-/* allocate/initialize space for interrupt handling */
-s->peers = g_malloc0(s->nb_peers * sizeof(Peer));
-
 pci_register_bar(dev, 2, attr, &s->bar);
 
 s->eventfd_chr = g_malloc0(s->vectors * sizeof(CharDriverState *));
-- 
2.4.3
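
The grow-only resize pattern used by resize_peers() can be sketched on its own as below. This is an illustration, not the ivshmem code: Slot, SlotTable, slot_table_resize() and MAX_SLOTS are hypothetical names, and plain realloc() is used instead of g_realloc(), so the sketch has to handle allocation failure itself.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_SLOTS 65536  /* assumed bound, standing in for IVSHMEM_MAX_PEERS */

typedef struct Slot {
    int *fds;
    int nb_fds;
} Slot;

typedef struct SlotTable {
    Slot *slots;
    int nb_slots;
} SlotTable;

/*
 * Grow the table so it holds at least new_min_size entries. The table
 * never shrinks. Returns 0 on success and -1 on a bad size or on
 * allocation failure.
 */
static int slot_table_resize(SlotTable *t, int new_min_size)
{
    assert(t != NULL);

    if (new_min_size <= 0 || new_min_size > MAX_SLOTS) {
        return -1;
    }
    if (new_min_size <= t->nb_slots) {
        return 0;
    }

    Slot *grown = realloc(t->slots, (size_t)new_min_size * sizeof(Slot));
    if (grown == NULL) {
        return -1;  /* the old array is still valid and still owned by t */
    }

    /* Only the newly added tail needs initialising. */
    for (int j = t->nb_slots; j < new_min_size; j++) {
        grown[j].fds = NULL;
        grown[j].nb_fds = 0;
    }

    t->slots = grown;
    t->nb_slots = new_min_size;
    return 0;
}
```

Starting from a zeroed SlotTable, the first call doubles as the initial allocation, which matches how the patch replaces the separate g_malloc0() with a call to resize_peers(s, 16).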

[Qemu-devel] [PULL 46/48] ivshmem: rename MSI eventfd_table

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

The array is used to have vector specific data, so use a more
descriptive name.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 3283874..8581d43 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -66,9 +66,9 @@ typedef struct Peer {
 EventNotifier *eventfds;
 } Peer;
 
-typedef struct EventfdEntry {
+typedef struct MSIVector {
 PCIDevice *pdev;
-} EventfdEntry;
+} MSIVector;
 
 typedef struct IVShmemState {
 /*< private >*/
@@ -99,7 +99,7 @@ typedef struct IVShmemState {
 int vm_id;
 uint32_t vectors;
 uint32_t features;
-EventfdEntry *eventfd_table;
+MSIVector *msi_vectors;
 
 Error *migration_blocker;
 
@@ -284,10 +284,10 @@ static void ivshmem_event(void *opaque, int event)
 
 static void fake_irqfd(void *opaque, const uint8_t *buf, int size) {
 
-EventfdEntry *entry = opaque;
+MSIVector *entry = opaque;
 PCIDevice *pdev = entry->pdev;
 IVShmemState *s = IVSHMEM(pdev);
-int vector = entry - s->eventfd_table;
+int vector = entry - s->msi_vectors;
 
 IVSHMEM_DPRINTF("interrupt on vector %p %d\n", pdev, vector);
 msix_notify(pdev, vector);
@@ -311,10 +311,10 @@ static CharDriverState* create_eventfd_chr_device(void * 
opaque, EventNotifier *
 
 /* if MSI is supported we need multiple interrupts */
 if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
-s->eventfd_table[vector].pdev = PCI_DEVICE(s);
+s->msi_vectors[vector].pdev = PCI_DEVICE(s);
 
 qemu_chr_add_handlers(chr, ivshmem_can_receive, fake_irqfd,
-                      ivshmem_event, &s->eventfd_table[vector]);
+                      ivshmem_event, &s->msi_vectors[vector]);
 } else {
 qemu_chr_add_handlers(chr, ivshmem_can_receive, ivshmem_receive,
   ivshmem_event, s);
@@ -660,7 +660,7 @@ static int ivshmem_setup_msi(IVShmemState * s)
 IVSHMEM_DPRINTF("msix initialized (%d vectors)\n", s->vectors);
 
 /* allocate QEMU char devices for receiving interrupts */
-s->eventfd_table = g_malloc0(s->vectors * sizeof(EventfdEntry));
+s->msi_vectors = g_malloc0(s->vectors * sizeof(MSIVector));
 
 ivshmem_use_msix(s);
 return 0;
@@ -865,7 +865,7 @@ static void pci_ivshmem_exit(PCIDevice *dev)
 msix_uninit_exclusive_bar(dev);
 }
 
-g_free(s->eventfd_table);
+g_free(s->msi_vectors);
 }
 
 static bool test_msix(void *opaque, int version_id)
-- 
2.4.3

[Qemu-devel] [PULL 28/48] ivshmem: replace 'guest' for 'peer' appropriately

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

The terms 'guest' and 'peer' are sometimes used interchangeably, which may
be confusing. Instead, use 'peer' for the remote instances of ivshmem
clients, and 'guest' for the local VM.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index d1b5d35..0e31d1d 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -89,7 +89,7 @@ typedef struct IVShmemState {
 int shm_fd; /* shared memory file descriptor */
 
 Peer *peers;
-int nb_peers; /* how many guests we have space for */
+int nb_peers; /* how many peers we have space for */
 
 int vm_id;
 uint32_t vectors;
@@ -387,9 +387,9 @@ static void ivshmem_del_eventfd(IVShmemState *s, int posn, 
int i)
                       &s->peers[posn].eventfds[i]);
 }
 
-static void close_guest_eventfds(IVShmemState *s, int posn)
+static void close_peer_eventfds(IVShmemState *s, int posn)
 {
-int i, guest_curr_max;
+int i, n;
 
 if (!ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
 return;
@@ -399,14 +399,14 @@ static void close_guest_eventfds(IVShmemState *s, int 
posn)
 return;
 }
 
-guest_curr_max = s->peers[posn].nb_eventfds;
+n = s->peers[posn].nb_eventfds;
 
 memory_region_transaction_begin();
-for (i = 0; i < guest_curr_max; i++) {
+for (i = 0; i < n; i++) {
 ivshmem_del_eventfd(s, posn, i);
 }
 memory_region_transaction_commit();
-for (i = 0; i < guest_curr_max; i++) {
+for (i = 0; i < n; i++) {
 event_notifier_cleanup(&s->peers[posn].eventfds[i]);
 }
 
@@ -415,7 +415,7 @@ static void close_guest_eventfds(IVShmemState *s, int posn)
 }
 
 /* this function increase the dynamic storage need to store data about other
- * guests */
+ * peers */
 static int resize_peers(IVShmemState *s, int new_min_size)
 {
 
@@ -432,7 +432,7 @@ static int resize_peers(IVShmemState *s, int new_min_size)
 old_size = s->nb_peers;
 s->nb_peers = new_min_size;
 
-IVSHMEM_DPRINTF("bumping storage to %d guests\n", s->nb_peers);
+IVSHMEM_DPRINTF("bumping storage to %d peers\n", s->nb_peers);
 
 s->peers = g_realloc(s->peers, s->nb_peers * sizeof(Peer));
 
@@ -503,7 +503,7 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, 
int size)
 incoming_fd = qemu_chr_fe_get_msgfd(s->server_chr);
 IVSHMEM_DPRINTF("posn is %ld, fd is %d\n", incoming_posn, incoming_fd);
 
-/* make sure we have enough space for this guest */
+/* make sure we have enough space for this peer */
 if (incoming_posn >= s->nb_peers) {
 if (resize_peers(s, incoming_posn + 1) < 0) {
 error_report("failed to resize peers array");
@@ -522,9 +522,9 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, 
int size)
 /* receive our posn */
 s->vm_id = incoming_posn;
 } else {
-/* otherwise an fd == -1 means an existing guest has gone away */
+/* otherwise an fd == -1 means an existing peer has gone away */
 IVSHMEM_DPRINTF("posn %ld has gone away\n", incoming_posn);
-close_guest_eventfds(s, incoming_posn);
+close_peer_eventfds(s, incoming_posn);
 }
 return;
 }
@@ -573,7 +573,7 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, 
int size)
 /* get a new eventfd: */
 new_eventfd = peer->nb_eventfds++;
 
-/* this is an eventfd for a particular guest VM */
+/* this is an eventfd for a particular peer VM */
 IVSHMEM_DPRINTF("eventfds[%ld][%d] = %d\n", incoming_posn,
 new_eventfd, incoming_fd);
 event_notifier_init_fd(&peer->eventfds[new_eventfd], incoming_fd);
@@ -753,7 +753,7 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 return;
 }
 
-/* we allocate enough space for 16 guests and grow as needed */
+/* we allocate enough space for 16 peers and grow as needed */
 resize_peers(s, 16);
 s->vm_id = -1;
 
@@ -831,7 +831,7 @@ static void pci_ivshmem_exit(PCIDevice *dev)
 
 if (s->peers) {
 for (i = 0; i < s->nb_peers; i++) {
-close_guest_eventfds(s, i);
+close_peer_eventfds(s, i);
 }
 g_free(s->peers);
 }
-- 
2.4.3

Re: [Qemu-devel] [PATCH v3 0/4] SysFS driver for QEMU fw_cfg device

2015-10-06 Thread Laszlo Ersek

On 10/05/15 15:05, Mark Rutland wrote:
>>> I'm not sure I follow what the difficulty with supporting DT in addition
>>> to ACPI is? It looks like all you need is a compatible string and a reg
>>> entry.
>>
>> Bearing in mind that I have almost no experience with arm:
>>
>> I started out by probing all possible port-io and mmio locations where
>> fw_cfg registers might have been found, from a "classic" module_init
>> method.
>>
>> Arm has DT, which as far as I understand will answer the following two
>> questions: 1. Do I have fw_cfg ? 2. If yes, what address range does it use ?
>> So that I could continue using a classic module_init, but won't need
>> to probe for the device.
>>
>> PC (my primary architecture, the one I actually care about) does not
>> have DT. If I want to share the same code, I can't probe, so if I try
>> DT and don't find fw_cfg there (or somehow DT is no-op-ed out because
>> I'm on a PC guest), I could somehow look it up in ACPI the same way
>> (i.e., use ACPI as sort of a stand-in for DT).
> 
> I'd imagine that it's simple to have something in your probe path like:
> 
> if (pdev->dev.of_node)
>   parse_dt(pdev);
> else
>   parse_acpi(pdev);
> 
>> But all ACPI-enabled drivers I could find use dedicated macros (i.e.
>> no more classic module_init() and module_exit(), but rather
>> module_acpi_driver() with .add and .remove methods on an acpi_driver
>> object, etc.) Not sure how I'd glue DT back into something like that.
> 
> You don't have to use those macros, and can simply use the classic
> module_{init,exit} functions, calling the requisite acpi driver
> registration functions at module {init,exit} time.
> 
>> In addition, Michael's comment earlier in the thread suggests that
>> even my current acpi version isn't sufficiently "orthodox" w.r.t.
>> ACPI, and I should be providing the hardware access routine as
>> an ACPI/AML routine, to avoid race conditions with the rest of ACPI,
>> and for encapsulation. I.e. it's even rude to use the fw_cfg node's
>> ACPI _CRS method (the part where I'd be treating it like a DT stand-in
>> only to query fw_cfg's hardware specifics).
> 
> As Peter stated, this sounds very much like it rules out sharing the
> interface with FW generally (and is certainly scary).
> 
>> So far, all the information I've been able to pull together points
>> away from a dual DT + ACPI all-in-one solution for fw_cfg. If you know
>> of an example where that's done in an acceptable way, please let
>> me know so I can use it for inspiration...
> 
> I'm not immediately aware, but I would imagine you could search for
> files that had both an of_match_table and a acpi_bus_register_driver
> call.

One file that I think is an example for this (and I have looked at
before) is: "drivers/virtio/virtio_mmio.c".

Virtio-mmio is supposed to be enumerable in both ACPI and DT virtual
machines. For the QEMU side, grep QEMU for "LNRO0005" vs. "virtio,mmio".

Thanks
Laszlo

Re: [Qemu-devel] [PATCH v7 08/18] qapi: Test use of 'number' within alternates

2015-10-06 Thread Markus Armbruster

Eric Blake  writes:

> On 09/29/2015 04:21 PM, Eric Blake wrote:
>> Add some testsuite exposure for use of a 'number' as part of
>> an alternate.  The current state of the tree has a few bugs
>> exposed by this: our input parser depends on the ordering of
>> how the qapi schema declared the alternate, and the parser
>> does not accept integers for a 'number' in an alternate even
>> though it does for numbers outside of an alternate.
>> 
>> Mixing 'int' and 'number' in the same alternate is unusual,
>> since both are supplied by json-numbers, but there does not
>> seem to be a technical reason to forbid it given that our
>> json lexer distinguishes between json-numbers that can be
>> represented as an int vs. those that cannot.
>> 
>> Improve the existing test_visitor_in_alternate() to match the
>> style of the new test_visitor_in_alternate_number(), and to
>> ensure full coverage of all possible qtype parsing.
>> 
>> Signed-off-by: Eric Blake 
>> 
>> ---
>
>> +static void test_visitor_in_alternate_number(TestInputVisitorData *data,
>> + const void *unused)
>> +{
>> +Visitor *v;
>> +Error *err = NULL;
>> +AltStrBool *asb;
>> +AltStrNum *asn;
>> +AltNumStr *ans;
>> +AltStrInt *asi;
>> +AltIntNum *ain;
>> +AltNumInt *ani;
>> +
>> +/* Parsing an int */
>> +
>> +v = visitor_input_test_init(data, "42");
>> +visit_type_AltStrBool(v, &asb, NULL, &err);
>> +g_assert(err);
>> +qapi_free_AltStrBool(asb);
>> +visitor_input_teardown(data, NULL);
>
> This fails to reset err = NULL...
>
>> +
>> +/* FIXME: Order of alternate should not affect semantics; asn should
>> + * parse the same as ans */
>> +v = visitor_input_test_init(data, "42");
>> +visit_type_AltStrNum(v, &asn, NULL, &err);
>> +/* FIXME g_assert_cmpint(asn->kind, == ALT_STR_NUM_KIND_N); */
>> +/* FIXME g_assert_cmpfloat(asn->n, ==, 42); */
>> +g_assert(err);
>> +error_free(err);
>> +err = NULL;
>
> ...which means that this test is not reliable.  Do you need a v8, or can
> you squash this in?
>
>
> diff --git a/tests/test-qmp-input-visitor.c b/tests/test-qmp-input-visitor.c
> index 1b5a369..6104ac6 100644
> --- a/tests/test-qmp-input-visitor.c
> +++ b/tests/test-qmp-input-visitor.c
> @@ -395,6 +395,8 @@ static void
> test_visitor_in_alternate_number(TestInputVisitorData *data,
>  v = visitor_input_test_init(data, "42");
>  visit_type_AltStrBool(v, &asb, NULL, &err);
>  g_assert(err);
> +error_free(err);
> +err = NULL;
>  qapi_free_AltStrBool(asb);
>
>  v = visitor_input_test_init(data, "42");

Squashed & pushed to qapi-next.

I also updated commit messages to document the tweaks made on commit.

Re: [Qemu-devel] [PATCHv2] fw_cfg: Define a static signature to be returned on DMA port reads

2015-10-06 Thread Laszlo Ersek

On 10/06/15 01:51, Kevin O'Connor wrote:
> Return a static signature ("QEMU CFG") if the guest does a read to the
> DMA address io register.
> 
> Signed-off-by: Kevin O'Connor 
> ---
> 
> Marc, if you decide to respin your fw_cfg series, I've updated the dma
> signature patch.  This addresses the comments from Stefan, and I hope
> it addresses the comments from Laszlo.

Thank you -- I didn't know about extract64().

The patch looks good to me, but I think the QEMU coding style requires
/* ... */ comments, and forbids //.

... "scripts/checkpatch.pl" has the following snippet:

# no C99 // comments
if ($line =~ m{//}) {
ERROR("do not use C99 // comments\n" . $herecurr);
}
...

Thanks!
Laszlo

> 
> BTW, if you wanted to, it's possible to use deposit64 in
> fw_cfg_dma_mem_write() to support all possible (validly aligned) write
> sizes.  Then fw_cfg_dma_mem_valid() shouldn't be needed.  Something
> like:
> 
> static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr,
>  uint64_t value, unsigned size)
> {
> FWCfgState *s = opaque;
> s->dma_addr = deposit64(s->dma_addr, (8 - addr - size)*8, size*8, value);
> if (addr + size >= 8) {
> fw_cfg_dma_transfer(s);
> }
> }
> 
> ---
>  docs/specs/fw_cfg.txt |  3 +++
>  hw/nvram/fw_cfg.c | 14 --
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
> index 2d6b2da..cbdce7d 100644
> --- a/docs/specs/fw_cfg.txt
> +++ b/docs/specs/fw_cfg.txt
> @@ -93,6 +93,9 @@ by selecting the "signature" item using key 0x 
> (FW_CFG_SIGNATURE),
>  and reading four bytes from the data register. If the fw_cfg device is
>  present, the four bytes read will contain the characters "QEMU".
>  
> +If the DMA interface is available, then reading the DMA Address
> +Register returns 0x51454d5520434647 ("QEMU CFG" in big-endian format).
> +
>  === Revision / feature bitmap (Key 0x0001, FW_CFG_ID) ===
>  
>  A 32-bit little-endian unsigned int, this item is used to check for enabled
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 59933b3..cf5c5c4 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -53,6 +53,8 @@
>  #define FW_CFG_DMA_CTL_SKIP0x04
>  #define FW_CFG_DMA_CTL_SELECT  0x08
>  
> +#define FW_CFG_DMA_SIGNATURE 0x51454d5520434647 /* "QEMU CFG" */
> +
>  typedef struct FWCfgEntry {
>  uint32_t len;
>  uint8_t *data;
> @@ -393,6 +395,13 @@ static void fw_cfg_dma_transfer(FWCfgState *s)
>  trace_fw_cfg_read(s, 0);
>  }
>  
> +static uint64_t fw_cfg_dma_mem_read(void *opaque, hwaddr addr,
> +unsigned size)
> +{
> +// Return a signature value (and handle various read sizes)
> +return extract64(FW_CFG_DMA_SIGNATURE, (8 - addr - size) * 8, size*8);
> +}
> +
>  static void fw_cfg_dma_mem_write(void *opaque, hwaddr addr,
>   uint64_t value, unsigned size)
>  {
> @@ -416,8 +425,8 @@ static void fw_cfg_dma_mem_write(void *opaque, hwaddr 
> addr,
>  static bool fw_cfg_dma_mem_valid(void *opaque, hwaddr addr,
>unsigned size, bool is_write)
>  {
> -return is_write && ((size == 4 && (addr == 0 || addr == 4)) ||
> -(size == 8 && addr == 0));
> +return !is_write || ((size == 4 && (addr == 0 || addr == 4)) ||
> + (size == 8 && addr == 0));
>  }
>  
>  static bool fw_cfg_data_mem_valid(void *opaque, hwaddr addr,
> @@ -488,6 +497,7 @@ static const MemoryRegionOps fw_cfg_comb_mem_ops = {
>  };
>  
>  static const MemoryRegionOps fw_cfg_dma_mem_ops = {
> +.read = fw_cfg_dma_mem_read,
>  .write = fw_cfg_dma_mem_write,
>  .endianness = DEVICE_BIG_ENDIAN,
>  .valid.accepts = fw_cfg_dma_mem_valid,
>
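The size handling discussed above can be tried outside QEMU. What follows is
only a sketch: extract_bits64() and deposit_bits64() are local stand-ins
written for this example (they are meant to behave like QEMU's extract64()
and deposit64(), but they are not the QEMU implementations), and the names
dma_signature_read() and dma_addr_write() are hypothetical. The register is
treated as big-endian, so offset 0 holds the most significant byte.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_SIGNATURE 0x51454d5520434647ULL /* "QEMU CFG" */

/* Stand-in helper: return 'length' bits of 'value', starting at bit 'start'. */
static uint64_t extract_bits64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/* Stand-in helper: replace 'length' bits of 'value' at 'start' with 'field'. */
static uint64_t deposit_bits64(uint64_t value, int start, int length,
                               uint64_t field)
{
    uint64_t mask;

    assert(start >= 0 && length > 0 && length <= 64 - start);
    mask = (~0ULL >> (64 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* Read 'size' bytes at byte offset 'addr' of the 8-byte signature. */
static uint64_t dma_signature_read(unsigned addr, unsigned size)
{
    assert(size > 0 && addr + size <= 8);
    return extract_bits64(DMA_SIGNATURE, (8 - addr - size) * 8, size * 8);
}

/* Merge a 'size'-byte write at byte offset 'addr' into the 64-bit address. */
static uint64_t dma_addr_write(uint64_t dma_addr, unsigned addr,
                               unsigned size, uint64_t value)
{
    assert(size > 0 && addr + size <= 8);
    return deposit_bits64(dma_addr, (8 - addr - size) * 8, size * 8, value);
}
```

With these helpers a 4-byte read at offset 0 yields the "QEMU" half of the
signature and a 4-byte read at offset 4 yields " CFG"; two 4-byte writes at
offsets 0 and 4 assemble the same value as a single 8-byte write.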

[Qemu-devel] [PATCH 3/4] why is runstate_is_running needed?

2015-10-06 Thread Paolo Bonzini

It doesn't seem correct to call it for all checkpoints, but why
is it right for timerlist_run_timers?
---
 qemu-timer.c   | 9 +++--
 stubs/replay.c | 5 -
 2 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 3c6e4c3..f16e422 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -488,20 +488,17 @@ bool timerlist_run_timers(QEMUTimerList *timer_list)
 break;
 default:
 case QEMU_CLOCK_VIRTUAL:
-if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
-|| !replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
+if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL)) {
 goto out;
 }
 break;
 case QEMU_CLOCK_HOST:
-if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
-|| !replay_checkpoint(CHECKPOINT_CLOCK_HOST)) {
+if (!replay_checkpoint(CHECKPOINT_CLOCK_HOST)) {
 goto out;
 }
 break;
 case QEMU_CLOCK_VIRTUAL_RT:
-if ((replay_mode != REPLAY_MODE_NONE && !runstate_is_running())
-|| !replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) {
+if (!replay_checkpoint(CHECKPOINT_CLOCK_VIRTUAL_RT)) {
 goto out;
 }
 break;
diff --git a/stubs/replay.c b/stubs/replay.c
index 71fa7d5..42d01b5 100755
--- a/stubs/replay.c
+++ b/stubs/replay.c
@@ -22,11 +22,6 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint)
 return true;
 }
 
-int runstate_is_running(void)
-{
-abort();
-}
-
 bool replay_events_enabled(void)
 {
 return false;
-- 
2.5.0

Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification

2015-10-06 Thread John Snow



On 09/05/2015 01:33 PM, Vladimir Sementsov-Ogievskiy wrote:
> On 05.09.2015 19:43, Vladimir Sementsov-Ogievskiy wrote:
>> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
>> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
>> other drives (there may be qcow2 file with zero disk size but with
>> several dirty bitmaps for other drives).
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---
>>   docs/specs/qcow2.txt | 127
>> ++-
>>   1 file changed, 126 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>> index 121dfc8..5fc0365 100644
>> --- a/docs/specs/qcow2.txt
>> +++ b/docs/specs/qcow2.txt
>> @@ -103,7 +103,13 @@ in the description of a field.
>>   write to an image with unknown auto-clear
>> features if it
>>   clears the respective bits from this field first.
>>   -Bits 0-63:  Reserved (set to 0)
>> +Bit 0:  Dirty bitmaps bit. If this bit is set
>> then
>> +there is a _consistent_ Dirty bitmaps
>> extension
>> +in the image. If it is not set, but
>> there is a
>> +Dirty bitmaps extension, its data
>> should be
>> +considered as inconsistent.
>> +
>> +Bits 1-63:  Reserved (set to 0)
>>  96 -  99:  refcount_order
>>   Describes the width of a reference count block
>> entry (width
>> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like the
>> following:
>>   0x - End of the header extension area
>>   0xE2792ACA - Backing file format name
>>   0x6803f857 - Feature name table
>> +0x23852875 - Dirty bitmaps
>>   other  - Unknown header extension, can
>> be safely
>>ignored
>>   @@ -166,6 +173,24 @@ the header extension data. Each entry look like
>> this:
>>   terminated if it has full length)
>> +== Dirty bitmaps ==
>> +
>> +Dirty bitmaps is an optional header extension. It provides an ability
>> to store
>> +dirty bitmaps in a qcow2 image. The fields are:
>> +
>> +  0 -  3:  nb_dirty_bitmaps
>> +   The number of dirty bitmaps contained in the
>> image. Valid
>> +   values: 0 - 65535.
>> +
>> +  4 -  7:  dirty_bitmap_directory_size
>> +   Size of the Dirty Bitmap Directory in bytes. Valid
>> values:
>> +   0 - 67108864 (= 1024 * nb_dirty_bitmaps).
>> +
>> +  8 - 15:  dirty_bitmap_directory_offset
>> +   Offset into the image file at which the Dirty Bitmap
>> +   Directory starts. Must be aligned to a cluster
>> boundary.
>> +
>> +
>>   == Host cluster management ==
>> qcow2 manages the allocation of host clusters by maintaining a
>> reference count
>> @@ -360,3 +385,103 @@ Snapshot table entry:
>> variable:   Padding to round up the snapshot table entry
>> size to the
>>   next multiple of 8.
>> +
>> +
>> +== Dirty bitmaps ==
>> +
>> +The feature supports storing dirty bitmaps in a qcow2 image.
>> +
>> +=== Cluster mapping ===
>> +
>> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
>> +bitmaps to host clusters. It is called Dirty Bitmap Table.
>> +
>> +The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
>> +Directory Entry) and may use multiple clusters, however it must be
>> contiguous
>> +in the image file.
>> +
>> +Given an offset (in bytes) into the bitmap, the offset into the image
>> file can
>> +be obtained as follows:
>> +
>> +byte_offset =
>> +dirty_bitmap_table[offset / cluster_size] + (offset %
>> cluster_size)
>> +
>> +Taking into account the granularity of the bitmap, an offset in bits
>> into the
>> +image file can be obtained like this:
>> +
>> +bit_offset =
>> +byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr /
>> granularity) % 8
>> +
>> +Here bit_nr is a number of "virtual" bit of the bitmap, which is
>> covered by
>> +"physical" bit with number (bit_nr / granularity).
>> +
>> +Dirty Bitmap Table entry:
>> +
>> +Bit  0 -  8:Reserved
>> +
>> + 9 - 55:Bits 9-55 of host cluster offset. Must be aligned
>> to a
>> +cluster boundary. If the offset is 0, the cluster is
>> +unallocated, and should be read as all zeros.
>> +
>> +56 - 63:Reserved
>> +
>> +=== Dirty Bitmap Directory ===
>> +
>> +Each dirty bitmap, saved in the image is described in the Dirty Bitmap
>> +Directory entry. Dirty Bitmap Directory is a contiguous area in the
>> image file,
>> +whose starting offset and

[Qemu-devel] [PULL 44/48] ivshmem: add hostmem backend

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

Instead of handling allocation, teach ivshmem to use a memory backend.
This makes it possible to use hugetlbfs-backed memory.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c| 84 +---
 tests/ivshmem-test.c | 12 
 2 files changed, 78 insertions(+), 18 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 707e82c..2fdb92b 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -26,6 +26,8 @@
 #include "qemu/event_notifier.h"
 #include "qemu/fifo8.h"
 #include "sysemu/char.h"
+#include "sysemu/hostmem.h"
+#include "qapi/visitor.h"
 
 #include "hw/misc/ivshmem.h"
 
@@ -57,6 +59,8 @@
 #define IVSHMEM(obj) \
 OBJECT_CHECK(IVShmemState, (obj), TYPE_IVSHMEM)
 
+#define IVSHMEM_MEMDEV_PROP "memdev"
+
 typedef struct Peer {
 int nb_eventfds;
 EventNotifier *eventfds;
@@ -72,6 +76,7 @@ typedef struct IVShmemState {
 PCIDevice parent_obj;
 /*< public >*/
 
+HostMemoryBackend *hostmem;
 uint32_t intrmask;
 uint32_t intrstatus;
 
@@ -674,7 +679,22 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 uint8_t attr = PCI_BASE_ADDRESS_SPACE_MEMORY |
 PCI_BASE_ADDRESS_MEM_PREFETCH;
 
-if (s->sizearg == NULL) {
+if (!!s->server_chr + !!s->shmobj + !!s->hostmem != 1) {
+error_setg(errp, "You must specify either a shmobj, a chardev"
+   " or a hostmem");
+return;
+}
+
+if (s->hostmem) {
+MemoryRegion *mr;
+
+if (s->sizearg) {
+g_warning("size argument ignored with hostmem");
+}
+
+mr = host_memory_backend_get_memory(s->hostmem, errp);
+s->ivshmem_size = memory_region_size(mr);
+} else if (s->sizearg == NULL) {
 s->ivshmem_size = 4 << 20; /* 4 MB default */
 } else {
 char *end;
@@ -732,7 +752,16 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 attr |= PCI_BASE_ADDRESS_MEM_TYPE_64;
 }
 
-if (s->server_chr != NULL) {
+if (s->hostmem != NULL) {
+MemoryRegion *mr;
+
+IVSHMEM_DPRINTF("using hostmem\n");
+
+mr = host_memory_backend_get_memory(MEMORY_BACKEND(s->hostmem), errp);
+vmstate_register_ram(mr, DEVICE(s));
+memory_region_add_subregion(&s->bar, 0, mr);
+pci_register_bar(PCI_DEVICE(s), 2, attr, &s->bar);
+} else if (s->server_chr != NULL) {
 if (strncmp(s->server_chr->filename, "unix:", 5)) {
 error_setg(errp, "chardev is not a unix client socket");
 return;
@@ -741,12 +770,6 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 /* if we get a UNIX socket as the parameter we will talk
  * to the ivshmem server to receive the memory region */
 
-if (s->shmobj != NULL) {
-error_setg(errp, "do not specify both 'chardev' "
-   "and 'shm' with ivshmem");
-return;
-}
-
 IVSHMEM_DPRINTF("using shared memory server (socket = %s)\n",
 s->server_chr->filename);
 
@@ -770,11 +793,6 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 /* just map the file immediately, we're not using a server */
 int fd;
 
-if (s->shmobj == NULL) {
-error_setg(errp, "Must specify 'chardev' or 'shm' to ivshmem");
-return;
-}
-
 IVSHMEM_DPRINTF("using shm_open (shm object = %s)\n", s->shmobj);
 
 /* try opening with O_EXCL and if it succeeds zero the memory
@@ -814,14 +832,17 @@ static void pci_ivshmem_exit(PCIDevice *dev)
 }
 
 if (memory_region_is_mapped(&s->ivshmem)) {
-void *addr = memory_region_get_ram_ptr(&s->ivshmem);
+if (!s->hostmem) {
+void *addr = memory_region_get_ram_ptr(&s->ivshmem);
+
+if (munmap(addr, s->ivshmem_size) == -1) {
+error_report("Failed to munmap shared memory %s",
+ strerror(errno));
+}
+}
 
 vmstate_unregister_ram(&s->ivshmem, DEVICE(dev));
 memory_region_del_subregion(&s->bar, &s->ivshmem);
-
-if (munmap(addr, s->ivshmem_size) == -1) {
-error_report("Failed to munmap shared memory %s", strerror(errno));
-}
 }
 
 if (s->eventfd_chr) {
@@ -964,10 +985,37 @@ static void ivshmem_class_init(ObjectClass *klass, void 
*data)
 dc->desc = "Inter-VM shared memory";
 }
 
+static void ivshmem_check_memdev_is_busy(Object *obj, const char *name,
+ Object *val, Error **errp)
+{
+MemoryRegion *mr;
+
+mr = host_memory_backend_get_memory(MEMORY_BACKEND(val), errp);
+if (memory_region_is_mapped(mr)) {
+char *path = object_get_canonical_path_component(val);
+error_setg(errp, "can't use already busy memdev: %s", path);

[Qemu-devel] [PATCH 2/4] more replay fixes

2015-10-06 Thread Paolo Bonzini

1) Compile files once

2) Move include file from replay/replay.h to include/sysemu/replay.h.

3) Fix Error usage

4) cleanup timerlistgroup_deadline_ns a bit and allow clock jump
notifiers to run

5) move replay-user.c to stubs/
---
 Makefile.objs   |  2 ++
 Makefile.target |  1 -
 cpu-exec.c  |  2 +-
 cpus.c  |  2 +-
 exec.c  |  2 +-
 hw/bt/hci.c |  4 ++--
 hw/core/ptimer.c|  2 +-
 include/qapi/qmp/qerror.h   |  2 +-
 {replay => include/sysemu}/replay.h |  0
 qapi/common.json|  6 +-
 qemu-timer.c| 14 ++
 replay/Makefile.objs| 11 +--
 replay/replay-events.c  |  2 +-
 replay/replay-input.c   |  2 +-
 replay/replay-internal.c|  4 ++--
 replay/replay-internal.h|  0
 replay/replay-time.c|  2 +-
 replay/replay.c |  2 +-
 stubs/Makefile.objs |  1 +
 {replay => stubs}/replay-user.c |  6 +-
 stubs/replay.c  |  9 +++--
 ui/input.c  |  2 +-
 vl.c|  6 +++---
 23 files changed, 40 insertions(+), 44 deletions(-)
 rename {replay => include/sysemu}/replay.h (100%)
 mode change 100755 => 100644 replay/Makefile.objs
 mode change 100755 => 100644 replay/replay-events.c
 mode change 100755 => 100644 replay/replay-input.c
 mode change 100755 => 100644 replay/replay-internal.c
 mode change 100755 => 100644 replay/replay-internal.h
 mode change 100755 => 100644 replay/replay-time.c
 mode change 100755 => 100644 replay/replay.c
 rename {replay => stubs}/replay-user.c (90%)

diff --git a/Makefile.objs b/Makefile.objs
index bc43e5c..ba4b45e 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -58,6 +58,8 @@ common-obj-y += audio/
 common-obj-y += hw/
 common-obj-y += accel.o
 
+common-obj-y += replay/
+
 common-obj-y += ui/
 common-obj-y += bt-host.o bt-vhci.o
 bt-host.o-cflags := $(BLUEZ_CFLAGS)
diff --git a/Makefile.target b/Makefile.target
index ca8f351..962d004 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -88,7 +88,6 @@ obj-y = exec.o translate-all.o cpu-exec.o
 obj-y += translate-common.o
 obj-y += cpu-exec-common.o
 obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
-obj-y += replay/
 obj-$(CONFIG_TCG_INTERPRETER) += tci.o
 obj-y += tcg/tcg-common.o
 obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
diff --git a/cpu-exec.c b/cpu-exec.c
index 2b83e18..0850f8c 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -30,7 +30,7 @@
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
 #include "hw/i386/apic.h"
 #endif
-#include "replay/replay.h"
+#include "sysemu/replay.h"
 
 /* -icount align implementation. */
 
diff --git a/cpus.c b/cpus.c
index 5130806..7e846e3 100644
--- a/cpus.c
+++ b/cpus.c
@@ -42,7 +42,7 @@
 #include "qemu/seqlock.h"
 #include "qapi-event.h"
 #include "hw/nmi.h"
-#include "replay/replay.h"
+#include "sysemu/replay.h"
 
 #ifndef _WIN32
 #include "qemu/compatfd.h"
diff --git a/exec.c b/exec.c
index dba9258..38f968a 100644
--- a/exec.c
+++ b/exec.c
@@ -50,7 +50,7 @@
 #include "qemu/rcu_queue.h"
 #include "qemu/main-loop.h"
 #include "translate-all.h"
-#include "replay/replay.h"
+#include "sysemu/replay.h"
 
 #include "exec/memory-internal.h"
 #include "exec/ram_addr.h"
diff --git a/hw/bt/hci.c b/hw/bt/hci.c
index 93dd1dc..2151d01 100644
--- a/hw/bt/hci.c
+++ b/hw/bt/hci.c
@@ -24,7 +24,7 @@
 #include "sysemu/bt.h"
 #include "hw/bt.h"
 #include "qapi/qmp/qerror.h"
-#include "replay/replay.h"
+#include "sysemu/replay.h"
 
 struct bt_hci_s {
 uint8_t *(*evt_packet)(void *opaque);
@@ -2193,7 +2193,7 @@ struct HCIInfo *bt_new_hci(struct bt_scatternet_s *net)
 
 s->device.handle_destroy = bt_hci_destroy;
 
-error_set(&s->replay_blocker, ERROR_CLASS_REPLAY_NOT_SUPPORTED, "bt hci");
+error_setg(&s->replay_blocker, QERR_REPLAY_NOT_SUPPORTED, "-bt hci");
 replay_add_blocker(s->replay_blocker);
 
 return &s->info;
diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 86d544f..edf077c 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -9,7 +9,7 @@
 #include "qemu/timer.h"
 #include "hw/ptimer.h"
 #include "qemu/host-utils.h"
-#include "replay/replay.h"
+#include "sysemu/replay.h"
 
 struct ptimer_state
 {
diff --git a/include/qapi/qmp/qerror.h b/include/qapi/qmp/qerror.h
index 0781a7f..f601499 100644
--- a/include/qapi/qmp/qerror.h
+++ b/include/qapi/qmp/qerror.h
@@ -107,6 +107,6 @@
 "this feature or command is not currently supported"
 
 #define QERR_REPLAY_NOT_SUPPORTED \
-ERROR_CLASS_GENERIC_ERROR, "Record/replay feature is not supported for 
'%s'"
+"Record/replay feature is not supported for '%s'"
 
 #endif /* QERROR_H */
diff --git a/replay/replay.h b/include/sysemu/replay.h
similarity index 100%
rename from replay/replay.h
rename to include/sysemu/replay.h

[Qemu-devel] [PATCH 4/4] events doubts

2015-10-06 Thread Paolo Bonzini

It is not clear what separates REPLAY_ASYNC_EVENT_BH from other async
events.  It seems to be an ordering issue, but then why do input events
not have to be looked up in the queue?  It would be much simpler if they
were all handled the same way.
---
 replay/replay-events.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/replay/replay-events.c b/replay/replay-events.c
index 402f644..d6c61f6 100644
--- a/replay/replay-events.c
+++ b/replay/replay-events.c
@@ -203,13 +203,15 @@ static Event *replay_read_event(int checkpoint)
 return NULL;
 }
 
-/* Events that has not to be in the queue */
+/* Read event-specific data */
 switch (read_event_kind) {
 case REPLAY_ASYNC_EVENT_BH:
 if (read_id == -1) {
 read_id = replay_get_qword();
 }
 break;
+
+/* Events that do not have to be in the queue - ### WHY? */
 case REPLAY_ASYNC_EVENT_INPUT:
 event = g_malloc0(sizeof(Event));
 event->event_kind = read_event_kind;
@@ -220,6 +222,7 @@ static Event *replay_read_event(int checkpoint)
 event->event_kind = read_event_kind;
 event->opaque = 0;
 return event;
+
 default:
 error_report("Unknown ID %d of replay event", read_event_kind);
 exit(1);
@@ -239,8 +242,6 @@ static Event *replay_read_event(int checkpoint)
 return NULL;
 }
 
-/* Read event-specific data */
-
 return event;
 }
 
-- 
2.5.0

[Qemu-devel] [PULL 43/48] ivshmem: use qemu_strtosz()

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

Use the common qemu utility function to parse the memory size.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 36 +---
 1 file changed, 5 insertions(+), 31 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index b873c23..707e82c 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -646,33 +646,6 @@ static void ivshmem_reset(DeviceState *d)
 ivshmem_use_msix(s);
 }
 
-static uint64_t ivshmem_get_size(IVShmemState * s, Error **errp) {
-
-uint64_t value;
-char *ptr;
-
-value = strtoull(s->sizearg, &ptr, 10);
-switch (*ptr) {
-case 0: case 'M': case 'm':
-value <<= 20;
-break;
-case 'G': case 'g':
-value <<= 30;
-break;
-default:
-error_setg(errp, "invalid ram size: %s", s->sizearg);
-return 0;
-}
-
-/* BARs must be a power of 2 */
-if (!is_power_of_2(value)) {
-error_setg(errp, "size must be power of 2");
-return 0;
-}
-
-return value;
-}
-
 static int ivshmem_setup_msi(IVShmemState * s)
 {
 if (msix_init_exclusive_bar(PCI_DEVICE(s), s->vectors, 1)) {
@@ -700,16 +673,17 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 uint8_t *pci_conf;
 uint8_t attr = PCI_BASE_ADDRESS_SPACE_MEMORY |
 PCI_BASE_ADDRESS_MEM_PREFETCH;
-Error *local_err = NULL;
 
 if (s->sizearg == NULL) {
 s->ivshmem_size = 4 << 20; /* 4 MB default */
 } else {
-s->ivshmem_size = ivshmem_get_size(s, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+char *end;
+int64_t size = qemu_strtosz(s->sizearg, &end);
+if (size < 0 || *end != '\0' || !is_power_of_2(size)) {
+error_setg(errp, "Invalid size %s", s->sizearg);
 return;
 }
+s->ivshmem_size = size;
 }
 
 fifo8_create(&s->incoming_fifo, sizeof(long));
-- 
2.4.3
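
For readers without a QEMU tree at hand, the check the new code performs
(parse a size, reject trailing garbage, require a power of two) can be
sketched in plain C. parse_pow2_size() below is a hypothetical stand-in
written for this example; it is not qemu_strtosz() and does not claim to
match its full suffix handling. Treating a bare number as megabytes is an
assumption that mirrors the removed ivshmem_get_size().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a decimal size with an optional K/M/G suffix (bare numbers are
 * taken as megabytes). Returns the size in bytes, or -1 if the string is
 * malformed, overflows, or is not a power of two.
 */
static int64_t parse_pow2_size(const char *arg)
{
    char *end;
    unsigned long long value;
    unsigned shift;

    if (*arg < '0' || *arg > '9') {
        return -1; /* reject empty strings, signs and leading blanks */
    }

    errno = 0;
    value = strtoull(arg, &end, 10);
    if (errno != 0) {
        return -1;
    }

    switch (*end) {
    case '\0':
        shift = 20;
        break;
    case 'K': case 'k':
        shift = 10;
        end++;
        break;
    case 'M': case 'm':
        shift = 20;
        end++;
        break;
    case 'G': case 'g':
        shift = 30;
        end++;
        break;
    default:
        return -1;
    }

    if (*end != '\0') {
        return -1; /* trailing garbage after the suffix */
    }
    if (value == 0 || value > ((unsigned long long)INT64_MAX >> shift)) {
        return -1;
    }

    value <<= shift;
    if (value & (value - 1)) {
        return -1; /* BAR sizes must be a power of two */
    }
    return (int64_t)value;
}
```

Under these assumptions "4" and "4M" both give 4 MB, "1G" gives 1 GB, and
"3M" or "16x" are rejected.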

Re: [Qemu-devel] [PATCH 2/4] more replay fixes

2015-10-06 Thread Eric Blake

On 10/06/2015 02:00 PM, Paolo Bonzini wrote:
> 1) Compile files once
> 
> 2) Move include file from replay/replay.h to include/sysemu/replay.h.
> 
> 3) Fix Error usage
> 
> 4) cleanup timerlistgroup_deadline_ns a bit and allow clock jump
> notifiers to run
> 
> 5) move replay-user.c to stubs/
> ---

> +++ b/include/qapi/qmp/qerror.h
> @@ -107,6 +107,6 @@
>  "this feature or command is not currently supported"
>  
>  #define QERR_REPLAY_NOT_SUPPORTED \
> -ERROR_CLASS_GENERIC_ERROR, "Record/replay feature is not supported for 
> '%s'"
> +"Record/replay feature is not supported for '%s'"

We should not be adding new #defines to this file.  Instead, inline the
message into the callers that do error_setg() (I see hw/bt/hci.c as the
first such caller).

> +++ b/qapi/common.json
> @@ -22,15 +22,11 @@
>  # @KVMMissingCap: the requested operation can't be fulfilled because a
>  # required KVM capability is missing
>  #
> -# @ReplayNotSupported: the requested feature is not supported with
> -#  record/replay mode enabled
> -#
>  # Since: 1.2
>  ##
>  { 'enum': 'ErrorClass',
>'data': [ 'GenericError', 'CommandNotFound', 'DeviceEncrypted',
> -'DeviceNotActive', 'DeviceNotFound', 'KVMMissingCap',
> -'ReplayNotSupported' ] }
> +'DeviceNotActive', 'DeviceNotFound', 'KVMMissingCap' ] }

Thank you for this. We definitely do not want to be adding new error
classes without a very strong reason, and even if such classes are
added, they must properly be documented as 'Since 2.5'.

> +++ b/vl.c
> @@ -122,7 +122,7 @@ int main(int argc, char **argv)
>  #include "qapi-event.h"
>  #include "exec/semihost.h"
>  #include "crypto/init.h"
> -#include "replay/replay.h"
> +#include "sysemu/replay.h"
>  #include "qapi/qmp/qerror.h"
>  
>  #define MAX_VIRTIO_CONSOLES 1
> @@ -851,7 +851,7 @@ static void configure_rtc(QemuOpts *opts)
>  } else if (!strcmp(value, "localtime")) {
>  Error *blocker = NULL;
>  rtc_utc = 0;
> -error_set(&blocker, ERROR_CLASS_REPLAY_NOT_SUPPORTED,
> +error_setg(&blocker, QERR_REPLAY_NOT_SUPPORTED,
>"-rtc base=localtime");
>  replay_add_blocker(blocker);
>  } else {
> @@ -1258,7 +1258,7 @@ static void smp_parse(QemuOpts *opts)
>  
>  if (smp_cpus > 1 || smp_cores > 1 || smp_threads > 1) {
>  Error *blocker = NULL;
> -error_set(&blocker, ERROR_CLASS_REPLAY_NOT_SUPPORTED, "smp");
> +error_setg(&blocker, QERR_REPLAY_NOT_SUPPORTED, "smp");
>  replay_add_blocker(blocker);

Okay, I see that there is more than one location with the same failure,
which is where using the #define sort of makes it nicer to guarantee a
consistent message.  But in general, use of error_setg() with a macro
that passes a %s to printf at a distance is ugly, and should be avoided
compared to just inlining the error message directly or writing a helper
method that can properly set a consistent message.
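
A helper along these lines is roughly what is meant. This is only a
standalone sketch: the function name and the message text are assumptions,
not taken from the patch, and snprintf() stands in for error_setg() so that
it compiles outside the QEMU tree. In QEMU proper the helper would call
error_setg() on a local Error pointer and then replay_add_blocker().

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (name and message text are illustrative only):
 * builds the single shared "not supported" message, so that no caller
 * has to pass a format string through a macro.
 */
static int replay_blocker_message(char *buf, size_t len, const char *feature)
{
    assert(buf != NULL && feature != NULL);

    /* The format string lives in exactly one place, next to its argument. */
    return snprintf(buf, len,
                    "Record/replay feature is not supported for '%s'",
                    feature);
}
```

Callers then pass only the feature name ("smp", "-rtc base=localtime"),
and the compiler can check the format string against its argument.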

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




[Qemu-devel] [PULL 41/48] tests: add ivshmem qtest

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

Adds 4 ivshmem tests:
- single qemu instance and basic IO
- pair of instances, check memory sharing
- pair of instances with server, and MSIX
- hot plug/unplug

A temporary shm is created, as well as a directory for the server
socket; both should be cleaned up on exit and abort.

Cc: Cam Macdonell 
CC: Andreas Färber 
Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 tests/Makefile   |   3 +
 tests/ivshmem-test.c | 484 +++
 2 files changed, 487 insertions(+)
 create mode 100644 tests/ivshmem-test.c

diff --git a/tests/Makefile b/tests/Makefile
index e6474ba..4f78ea4 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -146,6 +146,8 @@ gcov-files-pci-y += hw/display/virtio-gpu-pci.c
 gcov-files-pci-$(CONFIG_VIRTIO_VGA) += hw/display/virtio-vga.c
 check-qtest-pci-y += tests/intel-hda-test$(EXESUF)
 gcov-files-pci-y += hw/audio/intel-hda.c hw/audio/hda-codec.c
+check-qtest-pci-$(CONFIG_LINUX) += tests/ivshmem-test$(EXESUF)
+gcov-files-pci-y += hw/misc/ivshmem.c
 
 check-qtest-i386-y = tests/endianness-test$(EXESUF)
 check-qtest-i386-y += tests/fdc-test$(EXESUF)
@@ -437,6 +439,7 @@ tests/vhost-user-test$(EXESUF): tests/vhost-user-test.o qemu-char.o qemu-timer.o
 tests/qemu-iotests/socket_scm_helper$(EXESUF): tests/qemu-iotests/socket_scm_helper.o
 tests/test-qemu-opts$(EXESUF): tests/test-qemu-opts.o $(test-util-obj-y)
 tests/test-write-threshold$(EXESUF): tests/test-write-threshold.o $(test-block-obj-y)
+tests/ivshmem-test$(EXESUF): tests/ivshmem-test.o contrib/ivshmem-server/ivshmem-server.o $(libqos-pc-obj-y)
 
 ifeq ($(CONFIG_POSIX),y)
 LIBS += -lutil
diff --git a/tests/ivshmem-test.c b/tests/ivshmem-test.c
new file mode 100644
index 000..f872592
--- /dev/null
+++ b/tests/ivshmem-test.c
@@ -0,0 +1,484 @@
+/*
+ * QTest testcase for ivshmem
+ *
+ * Copyright (c) 2015 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "contrib/ivshmem-server/ivshmem-server.h"
+#include "libqos/pci-pc.h"
+#include "libqtest.h"
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+
+#if GLIB_CHECK_VERSION(2, 32, 0)
+#define HAVE_THREAD_NEW
+#endif
+
+#define TMPSHMSIZE (1 << 20)
+static char *tmpshm;
+static void *tmpshmem;
+static char *tmpdir;
+static char *tmpserver;
+
+static void save_fn(QPCIDevice *dev, int devfn, void *data)
+{
+QPCIDevice **pdev = (QPCIDevice **) data;
+
+*pdev = dev;
+}
+
+static QPCIDevice *get_device(void)
+{
+QPCIDevice *dev;
+QPCIBus *pcibus;
+
+pcibus = qpci_init_pc();
+qpci_device_foreach(pcibus, 0x1af4, 0x1110, save_fn, );
+g_assert(dev != NULL);
+
+return dev;
+}
+
+typedef struct _IVState {
+QTestState *qtest;
+void *reg_base, *mem_base;
+QPCIDevice *dev;
+} IVState;
+
+enum Reg {
+INTRMASK = 0,
+INTRSTATUS = 4,
+IVPOSITION = 8,
+DOORBELL = 12,
+};
+
+static const char* reg2str(enum Reg reg) {
+switch (reg) {
+case INTRMASK:
+return "IntrMask";
+case INTRSTATUS:
+return "IntrStatus";
+case IVPOSITION:
+return "IVPosition";
+case DOORBELL:
+return "DoorBell";
+default:
+return NULL;
+}
+}
+
+static inline unsigned in_reg(IVState *s, enum Reg reg)
+{
+const char *name = reg2str(reg);
+QTestState *qtest = global_qtest;
+unsigned res;
+
+global_qtest = s->qtest;
+res = qpci_io_readl(s->dev, s->reg_base + reg);
+g_test_message("*%s -> %x\n", name, res);
+global_qtest = qtest;
+
+return res;
+}
+
+static inline void out_reg(IVState *s, enum Reg reg, unsigned v)
+{
+const char *name = reg2str(reg);
+QTestState *qtest = global_qtest;
+
+global_qtest = s->qtest;
+g_test_message("%x -> *%s\n", v, name);
+qpci_io_writel(s->dev, s->reg_base + reg, v);
+global_qtest = qtest;
+}
+
+static void setup_vm_cmd(IVState *s, const char *cmd, bool msix)
+{
+uint64_t barsize;
+
+s->qtest = qtest_start(cmd);
+
+s->dev = get_device();
+
+/* FIXME: other bar order fails, mappings changes */
+s->mem_base = qpci_iomap(s->dev, 2, );
+g_assert_nonnull(s->mem_base);
+g_assert_cmpuint(barsize, ==, TMPSHMSIZE);
+
+if (msix) {
+qpci_msix_enable(s->dev);
+}
+
+s->reg_base = qpci_iomap(s->dev, 0, );
+g_assert_nonnull(s->reg_base);
+g_assert_cmpuint(barsize, ==, 256);
+
+qpci_device_enable(s->dev);
+}
+
+static void setup_vm(IVState *s)
+{
+char *cmd = g_strdup_printf("-device ivshmem,shm=%s,size=1M", tmpshm);
+
+setup_vm_cmd(s, cmd, false);
+
+g_free(cmd);
+}
+
+static void test_ivshmem_single(void)
+{
+IVState state, *s;
+uint32_t data[1024];
+int i;
+
+

[Qemu-devel] [PULL 29/48] ivshmem: error on too many eventfd received

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

The number of eventfd that can be handled per peer is limited by the
number of vectors. Return an error when receiving too many of them.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 0e31d1d..50af4c7 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -571,6 +571,14 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, int size)
 /* each peer has an associated array of eventfds, and we keep
  * track of how many eventfds received so far */
 /* get a new eventfd: */
+/* get a new eventfd */
+if (peer->nb_eventfds >= s->vectors) {
+error_report("Too many eventfd received, device has %d vectors",
+ s->vectors);
+close(incoming_fd);
+return;
+}
+
 new_eventfd = peer->nb_eventfds++;
 
 /* this is an eventfd for a particular peer VM */
-- 
2.4.3

[Qemu-devel] [PATCHv4] target-arm: Implement AArch64 OSLAR/OSLSR_EL1 sysregs

2015-10-06 Thread Davorin Mista

Added oslar_write function to OSLAR_EL1 sysreg, using a status variable
in CPUARMState.cp15 struct (oslsr_el1). This variable is also linked
to the newly added read-only OSLSR_EL1 register.

Linux reads from this register during its suspend/resume procedure.

Signed-off-by: Davorin Mista 

---
Changed in v2:
-switched from using dummy registers to an actual register implementation
-implemented write function for OSLAR_EL1 sysreg
-added state variable to ARMCPUState struct

Changed in v3:
-renamed variable to oslsr_el1 and moved to cp15
-renamed write function to oslar_write
-support both 32bit and 64bit ARM in oslar_write
-moved resetvalue to the corresponding read-only register
-removed "dummy" comments above registers

Changed in v4:
-added type = ARM_CP_NO_RAW
-removed fieldOffset for OSLAR register
-tested with QEMU mainline (git.qemu.org/qemu.git)
---
 target-arm/cpu.h|  1 +
 target-arm/helper.c | 22 --
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index cc1578c..9b80c26 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -378,6 +378,7 @@ typedef struct CPUARMState {
 uint64_t dbgwvr[16]; /* watchpoint value registers */
 uint64_t dbgwcr[16]; /* watchpoint control registers */
 uint64_t mdscr_el1;
+uint64_t oslsr_el1; /* OS Lock Status */
 /* If the counter is enabled, this stores the last time the counter
  * was reset. Otherwise it stores the counter value
  */
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 8367997..33a3e3f 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -3564,6 +3564,20 @@ static CPAccessResult ctr_el0_access(CPUARMState *env, const ARMCPRegInfo *ri)
 return CP_ACCESS_OK;
 }
 
+/* write to oslsr_el1 (OS lock status) state variable */
+static void oslar_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
+{
+int oslock;
+
+if (ri->state == ARM_CP_STATE_AA32) {
+oslock = (value == 0xC5ACCE55);
+} else {
+oslock = value & 1;
+}
+
+env->cp15.oslsr_el1 = deposit32(env->cp15.oslsr_el1, 1, 1, oslock);
+}
+
 static const ARMCPRegInfo debug_cp_reginfo[] = {
 /* DBGDRAR, DBGDSAR: always RAZ since we don't implement memory mapped
  * debug components. The AArch64 version of DBGDRAR is named MDRAR_EL1;
@@ -3592,10 +3606,14 @@ static const ARMCPRegInfo debug_cp_reginfo[] = {
   .type = ARM_CP_ALIAS,
   .access = PL1_R,
   .fieldoffset = offsetof(CPUARMState, cp15.mdscr_el1), },
-/* We define a dummy WI OSLAR_EL1, because Linux writes to it. */
 { .name = "OSLAR_EL1", .state = ARM_CP_STATE_BOTH,
   .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 0, .opc2 = 4,
-  .access = PL1_W, .type = ARM_CP_NOP },
+  .access = PL1_W, .type = ARM_CP_NO_RAW,
+  .writefn = oslar_write },
+{ .name = "OSLSR_EL1", .state = ARM_CP_STATE_BOTH,
+  .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 1, .opc2 = 4,
+  .access = PL1_R, .resetvalue = 10,
+  .fieldoffset = offsetof(CPUARMState, cp15.oslsr_el1) },
 /* Dummy OSDLR_EL1: 32-bit Linux will read this */
 { .name = "OSDLR_EL1", .state = ARM_CP_STATE_BOTH,
   .cp = 14, .opc0 = 2, .opc1 = 0, .crn = 1, .crm = 3, .opc2 = 4,
-- 
2.6.0

Re: [Qemu-devel] [kvm-unit-tests PATCHv3 2/3] arm: pmu: Check cycle count increases

2015-10-06 Thread Andrew Jones

On Tue, Oct 06, 2015 at 01:49:25PM -0400, Christopher Covington wrote:
> Ensure that reads of the PMCCNTR_EL0 are monotonically increasing,
> even for the smallest delta of two subsequent reads.
> 
> Signed-off-by: Christopher Covington 
> ---
>  arm/pmu.c | 29 +
>  1 file changed, 29 insertions(+)
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> index 91a3688..589e605 100644
> --- a/arm/pmu.c
> +++ b/arm/pmu.c
> @@ -33,6 +33,8 @@ struct pmu_data {
>   };
>  };
>  
> +static const int samples = 10;

#define NR_SAMPLES 10

> +
>  /* As a simple sanity check on the PMCR_EL0, ensure the implementer field isn't
>   * null. Also print out a couple other interesting fields for diagnostic
>   * purposes. For example, as of fall 2015, QEMU TCG mode doesn't implement
> @@ -56,11 +58,38 @@ static bool check_pmcr(void)
>   return false;
>  }
>  
> +/* Ensure that the cycle counter progresses between back-to-back reads.
> + */

style nit: your block comments don't have an opening wing (the preferred
kernel style - and, fwiw, my preference too...)
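
That is, the opening wing goes on a line of its own. A small standalone
illustration follows. The function is a made-up stand-in that checks an
array of values rather than reading PMCCNTR_EL0, so that it compiles
anywhere; only the comment layout and the NR_SAMPLES define reflect the
review comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SAMPLES 10

/*
 * Ensure that every value in a series of counter reads is strictly
 * greater than the one before it.
 */
static bool reads_strictly_increase(const uint64_t *reads, int n)
{
	assert(reads != NULL);

	for (int i = 1; i < n; i++) {
		if (reads[i - 1] >= reads[i])
			return false;
	}

	return true;
}
```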

> +static bool check_cycles_increase(void)
> +{
> + struct pmu_data pmcr;
> +
> + pmcr.enable = 1;
> + asm volatile("msr pmcr_el0, %0" : : "r" (pmcr));
> +
> + for (int i = 0; i < samples; i++) {
> + int a, b;
> +
> + asm volatile(
> + "mrs %[a], pmccntr_el0\n"
> + "mrs %[b], pmccntr_el0\n"
> + : [a] "=r" (a), [b] "=r" (b));
> +
> + if (a >= b) {
> + printf("Read %d then %d.\n", a, b);
> + return false;
> + }
> + }
> +
> + return true;
> +}
> +
>  int main(void)
>  {
>   report_prefix_push("pmu");
>  
>   report("Control register", check_pmcr());
> + report("Monotonically increasing cycle count", check_cycles_increase());
>  
>   return report_summary();
>  }
> -- 
> Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
> 
>

[Qemu-devel] [RFH PATCH 0/4] record/replay fixups and doubts

2015-10-06 Thread Paolo Bonzini

These are some comments I have about the record/replay code.  I can
integrate these in your patches myself, but I need an ack/tested-by and
in some cases more answers...  Please take a look.

Paolo Bonzini (4):
  replay: generalize ptimer event to bottom halves
  more replay fixes
  why is runstate_is_running needed?
  events doubts

 Makefile.objs   |  2 ++
 Makefile.target |  1 -
 cpu-exec.c  |  2 +-
 cpus.c  |  2 +-
 exec.c  |  2 +-
 hw/bt/hci.c |  4 ++--
 hw/core/ptimer.c|  8 ++--
 include/qapi/qmp/qerror.h   |  2 +-
 {replay => include/sysemu}/replay.h |  4 ++--
 qapi/common.json|  6 +-
 qemu-timer.c| 23 +--
 replay/Makefile.objs| 11 +--
 replay/replay-events.c  | 24 +++-
 replay/replay-input.c   |  2 +-
 replay/replay-internal.c|  4 ++--
 replay/replay-internal.h|  2 +-
 replay/replay-time.c|  2 +-
 replay/replay.c |  2 +-
 stubs/Makefile.objs |  1 +
 {replay => stubs}/replay-user.c |  6 +-
 stubs/replay.c  | 10 +-
 ui/input.c  |  2 +-
 vl.c|  6 +++---
 23 files changed, 59 insertions(+), 69 deletions(-)
 rename {replay => include/sysemu}/replay.h (97%)
 mode change 100755 => 100644 replay/Makefile.objs
 mode change 100755 => 100644 replay/replay-events.c
 mode change 100755 => 100644 replay/replay-input.c
 mode change 100755 => 100644 replay/replay-internal.c
 mode change 100755 => 100644 replay/replay-internal.h
 mode change 100755 => 100644 replay/replay-time.c
 mode change 100755 => 100644 replay/replay.c
 rename {replay => stubs}/replay-user.c (90%)

-- 
2.5.0

[Qemu-devel] [PATCH 1/4] replay: generalize ptimer event to bottom halves

2015-10-06 Thread Paolo Bonzini

Make the code a bit more type safe and follow the same scheme as
replay_input_event and replay_input_sync_event.

Signed-off-by: Paolo Bonzini 
---
 hw/core/ptimer.c |  6 +-
 replay/replay-events.c   | 15 ++-
 replay/replay-internal.h |  2 +-
 replay/replay.h  |  4 ++--
 4 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index c56078d..86d544f 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -28,11 +28,7 @@ struct ptimer_state
 static void ptimer_trigger(ptimer_state *s)
 {
 if (s->bh) {
-if (replay_mode != REPLAY_MODE_NONE) {
-replay_add_ptimer_event(s->bh, replay_get_current_step());
-} else {
-qemu_bh_schedule(s->bh);
-}
+replay_bh_schedule_event(s->bh);
 }
 }
 
diff --git a/replay/replay-events.c b/replay/replay-events.c
index 23f3b12..06dd4ca 100755
--- a/replay/replay-events.c
+++ b/replay/replay-events.c
@@ -37,7 +37,7 @@ static bool events_enabled;
 static void replay_run_event(Event *event)
 {
 switch (event->event_kind) {
-case REPLAY_ASYNC_EVENT_PTIMER:
+case REPLAY_ASYNC_EVENT_BH:
 aio_bh_call(event->opaque);
 break;
 case REPLAY_ASYNC_EVENT_INPUT:
@@ -129,9 +129,14 @@ static void replay_add_event(ReplayAsyncEventKind event_kind,
 replay_mutex_unlock();
 }
 
-void replay_add_ptimer_event(void *bh, uint64_t id)
+void replay_bh_schedule_event(QEMUBH *bh)
 {
-replay_add_event(REPLAY_ASYNC_EVENT_PTIMER, bh, NULL, id);
+if (replay_mode != REPLAY_MODE_NONE) {
+uint64_t id = replay_get_current_step();
+replay_add_event(REPLAY_ASYNC_EVENT_BH, bh, NULL, id);
+} else {
+qemu_bh_schedule(bh);
+}
 }
 
 void replay_add_input_event(struct InputEvent *event)
@@ -154,7 +159,7 @@ static void replay_save_event(Event *event, int checkpoint)
 
 /* save event-specific data */
 switch (event->event_kind) {
-case REPLAY_ASYNC_EVENT_PTIMER:
+case REPLAY_ASYNC_EVENT_BH:
 replay_put_qword(event->id);
 break;
 case REPLAY_ASYNC_EVENT_INPUT:
@@ -200,7 +205,7 @@ static Event *replay_read_event(int checkpoint)
 
 /* Events that has not to be in the queue */
 switch (read_event_kind) {
-case REPLAY_ASYNC_EVENT_PTIMER:
+case REPLAY_ASYNC_EVENT_BH:
 if (read_id == -1) {
 read_id = replay_get_qword();
 }
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index 04d2e1b..77e0d29 100755
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -41,7 +41,7 @@ enum ReplayEvents {
 /* Asynchronous events IDs */
 
 enum ReplayAsyncEventKind {
-REPLAY_ASYNC_EVENT_PTIMER,
+REPLAY_ASYNC_EVENT_BH,
 REPLAY_ASYNC_EVENT_INPUT,
 REPLAY_ASYNC_EVENT_INPUT_SYNC,
 REPLAY_ASYNC_COUNT
diff --git a/replay/replay.h b/replay/replay.h
index cbb4e11..abb4688 100755
--- a/replay/replay.h
+++ b/replay/replay.h
@@ -110,8 +110,8 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint);
 void replay_disable_events(void);
 /*! Returns true when saving events is enabled */
 bool replay_events_enabled(void);
-/*! Adds ptimer event to the queue */
-void replay_add_ptimer_event(void *bh, uint64_t id);
+/*! Adds bottom half event to the queue */
+void replay_bh_schedule_event(QEMUBH *bh);
 /*! Adds input event to the queue */
 void replay_input_event(QemuConsole *src, InputEvent *evt);
 /*! Adds input sync event to the queue */
-- 
2.5.0

Re: [Qemu-devel] How to get started with the source code of Qemu?

2015-10-06 Thread Peter Crosthwaite

On Tue, Oct 6, 2015 at 7:17 AM, Aaron Elkins  wrote:
> Hi all,
>
> I am new to Qemu, and I’m extremely interested in understanding how the 
> source code of Qemu work. But after
> I downloaded the whole project, I just lost in it, the project is too large 
> for me to get started.
>

It does a lot; what is your use case? Very few people have an
understanding of the entire code base. Someone might be able to point
you in a specific direction if you give more detail.

> If anyone here can point me to some useful document or some guides, to make 
> me get started in understanding
> the source code?
>
> What knowledge are required to understand the source code?
>

C coding. Git. Computer hardware architecture.

Regards,
Peter

> BTW, i know this project is not that simple to understand, but I would like 
> to try, even I need to know a lot
> of other knowledge before that, but at least let me get started.
>
> Thanks
>
> -Aaron
>
>
>

[Qemu-devel] [kvm-unit-tests PATCHv3 2/3] arm: pmu: Check cycle count increases

2015-10-06 Thread Christopher Covington

Ensure that reads of the PMCCNTR_EL0 are monotonically increasing,
even for the smallest delta of two subsequent reads.

Signed-off-by: Christopher Covington 
---
 arm/pmu.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/arm/pmu.c b/arm/pmu.c
index 91a3688..589e605 100644
--- a/arm/pmu.c
+++ b/arm/pmu.c
@@ -33,6 +33,8 @@ struct pmu_data {
};
 };
 
+static const int samples = 10;
+
 /* As a simple sanity check on the PMCR_EL0, ensure the implementer field isn't
  * null. Also print out a couple other interesting fields for diagnostic
  * purposes. For example, as of fall 2015, QEMU TCG mode doesn't implement
@@ -56,11 +58,38 @@ static bool check_pmcr(void)
return false;
 }
 
+/* Ensure that the cycle counter progresses between back-to-back reads.
+ */
+static bool check_cycles_increase(void)
+{
+   struct pmu_data pmcr;
+
+   pmcr.enable = 1;
+   asm volatile("msr pmcr_el0, %0" : : "r" (pmcr));
+
+   for (int i = 0; i < samples; i++) {
+   int a, b;
+
+   asm volatile(
+   "mrs %[a], pmccntr_el0\n"
+   "mrs %[b], pmccntr_el0\n"
+   : [a] "=r" (a), [b] "=r" (b));
+
+   if (a >= b) {
+   printf("Read %d then %d.\n", a, b);
+   return false;
+   }
+   }
+
+   return true;
+}
+
 int main(void)
 {
report_prefix_push("pmu");
 
report("Control register", check_pmcr());
+   report("Monotonically increasing cycle count", check_cycles_increase());
 
return report_summary();
 }
-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

[Qemu-devel] [PULL 47/48] ivshmem: use kvm irqfd for msi notifications

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

Use irqfd to reduce context switches when notifying the guest.
If the host doesn't support kvm irqfd, regular msi notifications are
still supported.

Note: the ivshmem implementation doesn't allow switching between MSI and
IO interrupts; this patch doesn't either.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Paolo Bonzini 
---
 hw/misc/ivshmem.c | 180 --
 1 file changed, 174 insertions(+), 6 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 8581d43..7c7c80d 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -19,6 +19,7 @@
 #include "hw/hw.h"
 #include "hw/i386/pc.h"
 #include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
 #include "hw/pci/msix.h"
 #include "sysemu/kvm.h"
 #include "migration/migration.h"
@@ -68,6 +69,7 @@ typedef struct Peer {
 
 typedef struct MSIVector {
 PCIDevice *pdev;
+int virq;
 } MSIVector;
 
 typedef struct IVShmemState {
@@ -293,13 +295,73 @@ static void fake_irqfd(void *opaque, const uint8_t *buf, int size) {
 msix_notify(pdev, vector);
 }
 
+static int ivshmem_vector_unmask(PCIDevice *dev, unsigned vector,
+ MSIMessage msg)
+{
+IVShmemState *s = IVSHMEM(dev);
+EventNotifier *n = >peers[s->vm_id].eventfds[vector];
+MSIVector *v = >msi_vectors[vector];
+int ret;
+
+IVSHMEM_DPRINTF("vector unmask %p %d\n", dev, vector);
+
+ret = kvm_irqchip_update_msi_route(kvm_state, v->virq, msg);
+if (ret < 0) {
+return ret;
+}
+
+return kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, n, NULL, v->virq);
+}
+
+static void ivshmem_vector_mask(PCIDevice *dev, unsigned vector)
+{
+IVShmemState *s = IVSHMEM(dev);
+EventNotifier *n = >peers[s->vm_id].eventfds[vector];
+int ret;
+
+IVSHMEM_DPRINTF("vector mask %p %d\n", dev, vector);
+
+ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n,
+s->msi_vectors[vector].virq);
+if (ret != 0) {
+error_report("remove_irqfd_notifier_gsi failed");
+}
+}
+
+static void ivshmem_vector_poll(PCIDevice *dev,
+unsigned int vector_start,
+unsigned int vector_end)
+{
+IVShmemState *s = IVSHMEM(dev);
+unsigned int vector;
+
+IVSHMEM_DPRINTF("vector poll %p %d-%d\n", dev, vector_start, vector_end);
+
+vector_end = MIN(vector_end, s->vectors);
+
+for (vector = vector_start; vector < vector_end; vector++) {
+EventNotifier *notifier = >peers[s->vm_id].eventfds[vector];
+
+if (!msix_is_masked(dev, vector)) {
+continue;
+}
+
+if (event_notifier_test_and_clear(notifier)) {
+msix_set_pending(dev, vector);
+}
+}
+}
+
 static CharDriverState* create_eventfd_chr_device(void * opaque, EventNotifier 
*n,
   int vector)
 {
 /* create a event character device based on the passed eventfd */
 IVShmemState *s = opaque;
-CharDriverState * chr;
+PCIDevice *pdev = PCI_DEVICE(s);
 int eventfd = event_notifier_get_fd(n);
+CharDriverState *chr;
+
+s->msi_vectors[vector].pdev = pdev;
 
 chr = qemu_chr_open_eventfd(eventfd);
 
@@ -484,6 +546,58 @@ static bool fifo_update_and_get(IVShmemState *s, const uint8_t *buf, int size,
 return true;
 }
 
+static int ivshmem_add_kvm_msi_virq(IVShmemState *s, int vector)
+{
+PCIDevice *pdev = PCI_DEVICE(s);
+MSIMessage msg = msix_get_message(pdev, vector);
+int ret;
+
+IVSHMEM_DPRINTF("ivshmem_add_kvm_msi_virq vector:%d\n", vector);
+
+if (s->msi_vectors[vector].pdev != NULL) {
+return 0;
+}
+
+ret = kvm_irqchip_add_msi_route(kvm_state, msg);
+if (ret < 0) {
+error_report("ivshmem: kvm_irqchip_add_msi_route failed");
+return -1;
+}
+
+s->msi_vectors[vector].virq = ret;
+s->msi_vectors[vector].pdev = pdev;
+
+return 0;
+}
+
+static void setup_interrupt(IVShmemState *s, int vector)
+{
+EventNotifier *n = >peers[s->vm_id].eventfds[vector];
+bool with_irqfd = kvm_msi_via_irqfd_enabled() &&
+ivshmem_has_feature(s, IVSHMEM_MSI);
+PCIDevice *pdev = PCI_DEVICE(s);
+
+IVSHMEM_DPRINTF("setting up interrupt for vector: %d\n", vector);
+
+if (!with_irqfd) {
+IVSHMEM_DPRINTF("with eventfd");
+s->eventfd_chr[vector] = create_eventfd_chr_device(s, n, vector);
+} else if (msix_enabled(pdev)) {
+IVSHMEM_DPRINTF("with irqfd");
+if (ivshmem_add_kvm_msi_virq(s, vector) < 0) {
+return;
+}
+
+if (!msix_is_masked(pdev, vector)) {
+kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, n, NULL,
+   s->msi_vectors[vector].virq);
+}
+} else {
+/* it will be delayed until

[Qemu-devel] [PULL 09/48] ivshmem: more qdev conversion

2015-10-06 Thread marcandre . lureau

From: Marc-André Lureau 

Use the latest qemu device modeling API, in particular, convert to
realize to fix the error handling; right now a botched device_add
ivshmem command kills the VM.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Claudio Fontana 
---
 hw/misc/ivshmem.c | 119 +++---
 1 file changed, 68 insertions(+), 51 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index dea4096..62547c0 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -319,22 +319,23 @@ static CharDriverState* create_eventfd_chr_device(void * opaque, EventNotifier *
 
 }
 
-static int check_shm_size(IVShmemState *s, int fd) {
+static int check_shm_size(IVShmemState *s, int fd, Error **errp)
+{
 /* check that the guest isn't going to try and map more memory than the
  * the object has allocated return -1 to indicate error */
 
 struct stat buf;
 
 if (fstat(fd, ) < 0) {
-error_report("exiting: fstat on fd %d failed: %s",
- fd, strerror(errno));
+error_setg(errp, "exiting: fstat on fd %d failed: %s",
+   fd, strerror(errno));
 return -1;
 }
 
 if (s->ivshmem_size > buf.st_size) {
-error_report("Requested memory size greater"
- " than shared object size (%" PRIu64 " > %" PRIu64")",
- s->ivshmem_size, (uint64_t)buf.st_size);
+error_setg(errp, "Requested memory size greater"
+   " than shared object size (%" PRIu64 " > %" PRIu64")",
+   s->ivshmem_size, (uint64_t)buf.st_size);
 return -1;
 } else {
 return 0;
@@ -343,13 +344,18 @@ static int check_shm_size(IVShmemState *s, int fd) {
 
 /* create the shared memory BAR when we are not using the server, so we can
  * create the BAR and map the memory immediately */
-static void create_shared_memory_BAR(IVShmemState *s, int fd, uint8_t attr) {
-
+static int create_shared_memory_BAR(IVShmemState *s, int fd, uint8_t attr,
+Error **errp)
+{
 void * ptr;
 
-s->shm_fd = fd;
-
 ptr = mmap(0, s->ivshmem_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
+if (ptr == MAP_FAILED) {
+error_setg_errno(errp, errno, "Failed to mmap shared memory");
+return -1;
+}
+
+s->shm_fd = fd;
 
 memory_region_init_ram_ptr(>ivshmem, OBJECT(s), "ivshmem.bar2",
s->ivshmem_size, ptr);
@@ -358,6 +364,8 @@ static void create_shared_memory_BAR(IVShmemState *s, int fd, uint8_t attr) {
 
 /* region for shared memory */
 pci_register_bar(PCI_DEVICE(s), 2, attr, >bar);
+
+return 0;
 }
 
 static void ivshmem_add_eventfd(IVShmemState *s, int posn, int i)
@@ -481,6 +489,7 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, int size)
 int incoming_fd;
 int guest_max_eventfd;
 long incoming_posn;
+Error *err = NULL;
 
 if (!fifo_update_and_get(s, buf, size,
  _posn, sizeof(incoming_posn))) {
@@ -524,18 +533,24 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, int size)
 
 /* if the position is -1, then it's shared memory region fd */
 if (incoming_posn == -1) {
-
 void * map_ptr;
 
 s->max_peer = 0;
 
-if (check_shm_size(s, incoming_fd) == -1) {
-exit(1);
+if (check_shm_size(s, incoming_fd, ) == -1) {
+error_report_err(err);
+close(incoming_fd);
+return;
 }
 
 /* mmap the region and map into the BAR2 */
 map_ptr = mmap(0, s->ivshmem_size, PROT_READ|PROT_WRITE, MAP_SHARED,
 incoming_fd, 0);
+if (map_ptr == MAP_FAILED) {
+error_report("Failed to mmap shared memory %s", strerror(errno));
+close(incoming_fd);
+return;
+}
 memory_region_init_ram_ptr(>ivshmem, OBJECT(s),
"ivshmem.bar2", s->ivshmem_size, map_ptr);
 vmstate_register_ram(>ivshmem, DEVICE(s));
@@ -610,7 +625,7 @@ static void ivshmem_reset(DeviceState *d)
 ivshmem_use_msix(s);
 }
 
-static uint64_t ivshmem_get_size(IVShmemState * s) {
+static uint64_t ivshmem_get_size(IVShmemState * s, Error **errp) {
 
 uint64_t value;
 char *ptr;
@@ -624,24 +639,23 @@ static uint64_t ivshmem_get_size(IVShmemState * s) {
 value <<= 30;
 break;
 default:
-error_report("invalid ram size: %s", s->sizearg);
-exit(1);
+error_setg(errp, "invalid ram size: %s", s->sizearg);
+return 0;
 }
 
 /* BARs must be a power of 2 */
 if (!is_power_of_two(value)) {
-error_report("size must be power of 2");
-exit(1);
+error_setg(errp, "size must be power of 2");
+

Re: [Qemu-devel] [kvm-unit-tests PATCHv3 3/3] arm: pmu: Add CPI checking

2015-10-06 Thread Andrew Jones

On Tue, Oct 06, 2015 at 01:49:26PM -0400, Christopher Covington wrote:
> Check the numbers of cycles per instruction (CPI) implied by ARM PMU
> cycle counter values. Check that in -icount mode these strictly
> match the specified rate.
> 
> Signed-off-by: Christopher Covington 
> ---
>  arm/pmu.c | 72 
> ++-
>  arm/unittests.cfg | 13 ++
>  2 files changed, 84 insertions(+), 1 deletion(-)
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> index 589e605..0ad113d 100644
> --- a/arm/pmu.c
> +++ b/arm/pmu.c
> @@ -84,12 +84,82 @@ static bool check_cycles_increase(void)
>   return true;
>  }
>  
> -int main(void)
> +/* Execute a known number of guest instructions. Only odd instruction counts
> + * greater than or equal to 3 are supported by the in-line assembly code. The
> + * control register (PMCR_EL0) is initialized with the provided value (allowing
> + * for example for the cycle counter or event counters to be reset). At the end
> + * of the exact instruction loop, zero is written to PMCR_EL0 to disable
> + * leisure of the calling code.
> + */
> +static void measure_instrs(int num, struct pmu_data pmcr)
> +{
> + int i = (num - 1) / 2;
> +
> + if (num < 3 || ((num - 1) % 2))
> + abort();

assert(num >= 3 && ((num - 1) % 2) == 0);

> +
> + asm volatile(
> + "msr pmcr_el0, %[pmcr]\n"
^\t
> + "1: subs %[i], %[i], #1\n"
   ^\t  ^\t
> + "b.gt 1b\n"
 ^\t
> + "msr pmcr_el0, xzr"
^\t
> + : [i] "+r" (i) : [pmcr] "r" (pmcr) : "cc");
> +}
> +
> +/* Measure cycle counts for various known instruction counts. Ensure that the
> + * cycle counter progresses (similar to check_cycles_increase() but with more
> + * instructions and using reset and stop controls). If supplied a positive,
> + * nonzero CPI parameter, also strictly check that every measurement matches
> + * it. Strict CPI checking is used to test -icount mode.
> + */
> +static bool check_cpi(int cpi)
> +{
> + struct pmu_data pmcr;
> +
> + pmcr.cycle_counter_reset = 1;
> + pmcr.enable = 1;
> +
> + if (cpi > 0)
> + printf("Checking for CPI=%d.\n", cpi);
> + printf("instrs : cycles0 cycles1 ...\n");
> +
> + for (int i = 3; i < 300; i += 32) {
> + int avg, sum = 0;
> +
> + printf("%d :", i);
> + for (int j = 0; j < samples; j++) {
> + int cycles;
> +
> + measure_instrs(i, pmcr);
> + asm volatile("mrs %0, pmccntr_el0" : "=r" (cycles));
> + printf(" %d", cycles);
> +
> + if (!cycles || (cpi > 0 && cycles != i * cpi)) {
> + printf("\n");
> + return false;
> + }
> +
> + sum += cycles;
> + }
> + avg = sum / samples;
> + printf(" sum=%d avg=%d avg_ipc=%d avg_cpi=%d\n",
> + sum, avg, i / avg, avg / i);
> + }
> +
> + return true;
> +}
> +
> +int main(int argc, char *argv[])
>  {
>   report_prefix_push("pmu");
>  
>   report("Control register", check_pmcr());
>   report("Monotonically increasing cycle count", check_cycles_increase());
>  
> + int cpi = (argc == 1 ? atol(argv[0]) : 0);

I prefer variable declarations at the top of the function, and

  int cpi = 0;

  if (argc > 1)
cpi = atol(argv[0]);

looks a bit better to me.


> +
> + report("Cycle/instruction ratio", check_cpi(cpi));
> +
>   return report_summary();
>  }
> diff --git a/arm/unittests.cfg b/arm/unittests.cfg
> index fd94adb..333ee0d 100644
> --- a/arm/unittests.cfg
> +++ b/arm/unittests.cfg
> @@ -39,4 +39,17 @@ groups = selftest
>  # Test PMU support without -icount
>  [pmu]
>  file = pmu.flat
> +extra_params = -append '-1'

Why do we need this cpi == -1? Can't it just be zero?

> +groups = pmu
> +
> +# Test PMU support with -icount IPC=1
> +[pmu-icount-1]
> +file = pmu.flat
> +extra_params = -icount 0 -append '1'
> +groups = pmu
> +
> +# Test PMU support with -icount IPC=256
> +[pmu-icount-256]
> +file = pmu.flat
> +extra_params = -icount 8 -append '256'
>  groups = pmu

-icount is a tcg specific parameter. I have a patch[*] in my staging
branch which allows you to specify 'accel = tcg' in unittests.cfg for
this type of test. You'll need to use that for anything with -icount
on the extra_params list.
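
With that patch applied, one of the -icount entries would presumably end
up looking something like this (a sketch only; the 'accel' key comes from
the staging patch, so its exact name and placement are assumptions until
that lands):

```
# Test PMU support with -icount IPC=1
[pmu-icount-1]
file = pmu.flat
extra_params = -icount 0 -append '1'
groups = pmu
accel = tcg
```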

Thanks,
drew

[*] 
https://github.com/rhdrjones/kvm-unit-tests/commit/85e084cf263e76484f7d82cbc9add4e7602f80a4
