date:20161123

Re: [Qemu-devel] sane char device writes?

2016-11-23 Thread Thomas Huth

On 23.11.2016 19:09, Michal Suchánek wrote:
> Hello,
> 
> I have reported the issue with qemu aborting in spapr_vty.c because
> gtk.c submitted more data than can be sent to the emulated serial port.
> 
> While the abort has been resolved and spapr_vty.c should truncate the
> data now, getting the data through is still not possible.
> 
> Looking in the code I see that console.c has this code (which is the only
> piece of code in the UI corresponding to the gtk part that I found):
> 
> static void kbd_send_chars(void *opaque)
> {
> QemuConsole *s = opaque;
> int len;
> uint8_t buf[16];
> 
> len = qemu_chr_be_can_write(s->chr);
> if (len > s->out_fifo.count)
> len = s->out_fifo.count;
> if (len > 0) {
> if (len > sizeof(buf))
> len = sizeof(buf);
> qemu_fifo_read(&s->out_fifo, buf, len);
> qemu_chr_be_write(s->chr, buf, len);
> }
> /* characters are pending: we send them a bit later (XXX:
> horrible, should change char device API) */
> if (s->out_fifo.count > 0) {
> timer_mod(s->kbd_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME)
> + 1);
> }
> }
> 
> The corresponding piece of code in gtk.c is AFAICT
> 
> static gboolean gd_vc_in(VteTerminal *terminal, gchar *text, guint size,
>  gpointer user_data)
> {
> VirtualConsole *vc = user_data;
> 
> if (vc->vte.echo) {
> VteTerminal *term = VTE_TERMINAL(vc->vte.terminal);
> int i;
> for (i = 0; i < size; i++) {
> uint8_t c = text[i];
> if (c >= 128 || isprint(c)) {
> /* 8-bit characters are considered printable.  */
> vte_terminal_feed(term, &text[i], 1);
> } else if (c == '\r' || c == '\n') {
> vte_terminal_feed(term, "\r\n", 2);
> } else {
> char ctrl[2] = { '^', 0};
> ctrl[1] = text[i] ^ 64;
> vte_terminal_feed(term, ctrl, 2);
> }
> }
> }
> 
> qemu_chr_be_write(vc->vte.chr, (uint8_t *)text, (unsigned int)size);
> return TRUE;
> }
> 
> meaning there is no loop to split the submitted text buffer.
> 
> gd_vc_in is the VTE callback handling input, so I suspect it either handles
> the input or not, and it cannot say it handled only part of the "commit" event.
> 
> So for this to work an extra buffer would have to be stored in gtk.c
> somewhere, and possibly similar timer trick used as in console.c
> 
> Any ideas how to do this without introducing too much insanity?
>
> Presumably, if a GTK timer is used to repeat gd_vc_in, the handler would
> run in the same GTK UI thread as the "commit" signal handler and
> excessive locking would not be required.
> 
> The data passed to gd_vc_in is presumably freed when it ends so it
> would have to be copied somewhere. It's quite possible to create a
> static list in gd_vc_in or some extra field in VirtualConsole.

Not sure what the best solution should really look like, but Paolo
suggested something here:

http://lists.gnu.org/archive/html/qemu-devel/2016-11/msg0.html

... so I'm putting him on CC: ... maybe he's got some spare minutes to
elaborate on his idea.

 Thomas
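
For illustration, the "extra buffer plus timer" idea from the quoted mail could look roughly like the sketch below. This is a minimal sketch under stated assumptions, not the fix that went into gtk.c: PendingInput, pending_queue() and pending_take() are hypothetical names, and the can_accept argument stands in for whatever qemu_chr_be_can_write() reports at that moment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PENDING_MAX 4096

/* Hypothetical per-console state, e.g. an extra field in VirtualConsole. */
typedef struct PendingInput {
    uint8_t data[PENDING_MAX];
    size_t len;                 /* bytes queued but not yet delivered */
} PendingInput;

/*
 * Copy the text handed to the "commit" handler, since the caller's buffer
 * may be freed once the handler returns. Returns the number of bytes queued,
 * which is less than size if the buffer is full.
 */
static size_t pending_queue(PendingInput *p, const uint8_t *text, size_t size)
{
    assert(p->len <= PENDING_MAX);
    size_t room = PENDING_MAX - p->len;
    size_t n = size < room ? size : room;

    memcpy(p->data + p->len, text, n);
    p->len += n;
    return n;
}

/*
 * Move at most can_accept bytes into out and drop them from the queue.
 * The caller would pass out to the char backend and re-arm a timer
 * while p->len is still nonzero.
 */
static size_t pending_take(PendingInput *p, size_t can_accept, uint8_t *out)
{
    size_t n = can_accept < p->len ? can_accept : p->len;

    memcpy(out, p->data, n);
    memmove(p->data, p->data + n, p->len - n);
    p->len -= n;
    return n;
}
```

With a 16-byte limit per call, a 20-byte paste would be delivered as one chunk of 16 bytes followed by one chunk of 4 bytes on the next timer tick.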

Re: [Qemu-devel] dpdk/vpp and cross-version migration for vhost

2016-11-23 Thread Yuanhan Liu

On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
> > > You keep assuming that you have the VM started first and
> > > figure out things afterwards, but this does not work.
> > > 
> > > Think about a cluster of machines. You want to start a VM in
> > > a way that will ensure compatibility with all hosts
> > > in a cluster.
> > 
> > I see. I was thinking more about the case where the dst
> > host (including the qemu and dpdk combo) is given, and
> > we then determine whether the migration will be successful
> > or not.
> > 
> > And you are asking that we need to know which host could
> > be a good candidate before starting the migration. In such
> > case, we indeed need some inputs from both the qemu and
> > vhost-user backend.
> > 
> > For DPDK, I think it could be simple, just as you said, it
> > could be either a tiny script, or even a macro defined in
> > the source code file (we extend it every time we add a
> > new feature) to let libvirt read it. Or something
> > else.
> 
> There's the issue of APIs that tweak features as Maxime
> suggested.

Yes, it's a good point.

> Maybe the only thing to do is to deprecate it,

Looks like so.

> but I feel some way for application to pass info into
> guest might be beneficial.

The two APIs are just for tweaking the feature bits DPDK supports before
any device gets connected. It's another way to disable some features
(the other obvious way is through the QEMU command line).

IMO, it's only a bit handy in a case like this: we have a bunch of VMs.
Instead of disabling something through qemu one by one, we could disable
it once in DPDK.

But I doubt its usefulness. It's only used in DPDK's vhost example
after all. It is used neither in the vhost PMD nor in OVS.

> > > If you don't, guest visible interface will change
> > > and you won't be able to migrate.
> > > 
> > > It does not make sense to discuss feature bits specifically
> > > since that is not the only part of interface.
> > > For example, max ring size supported might change.
> > 
> > I don't quite understand why we have to consider the max ring
> > size here? Isn't it a virtio device attribute, that QEMU could
> > provide such compatibility information?
> >
> > I mean, DPDK is supposed to support varying vring sizes; it's up to
> > QEMU to give a specific value.
> 
> If backend supports s/g of any size up to 2^16, there's no issue.

I don't know about others, but I see no issues in DPDK.

> ATM some backends might be assuming up to 1K s/g since
> QEMU never supported bigger ones. We might classify this
> as a bug, or not and add a feature flag.
> 
> But it's just an example. There might be more values at issue
> in the future.

Yeah, maybe. But we could analyze them one by one.

> > > Let me describe how it works in qemu/libvirt.
> > > When you install a VM, you can specify compatibility
> > > level (aka "machine type"), and you can query the supported compatibility
> > > levels. Management uses that to find the supported compatibility
> > > and stores the compatibility in XML that is migrated with the VM.
> > > There's also a way to find the latest level which is the
> > > default unless overridden by user, again this level
> > > is recorded and then
> > > - management can make sure migration destination is compatible
> > > - management can avoid migration to hosts without that support
> > 
> > Thanks for the info, it helps.
> > 
> > ...
> > > > > >>As version here is an opaque string for libvirt and qemu,
> > > > > >>anything can be used - but I suggest either a list
> > > > > >>of values defining the interface, e.g.
> > > > > >>any_layout=on,max_ring=256
> > > > > >>or a version including the name and vendor of the backend,
> > > > > >>e.g. "org.dpdk.v4.5.6".
> > 
> > The version scheme may not be ideal here. Assume a QEMU is supposed
> > to work with a specific DPDK version; however, the user may disable some
> > newer features through the qemu command line, so that it could also work
> > with an older DPDK version. Using the version scheme would not allow us to
> > do such a migration to an older DPDK version. The MTU is a live example
> > here (when the MTU feature is provided by QEMU but is actually disabled
> > by the user, it could also work with an older DPDK without MTU support).
> > 
> > --yliu
> 
> OK, so does a list of values look better to you then?

Yes, if there is no better way.

And I think it may be better not to list all those features literally.
Instead, using a number should be better, say, features=0xdeadbeef.

Listing the feature names means we have to come to an agreement in all
components involved here (QEMU, libvirt, DPDK, VPP, and maybe more
backends), that we have to use the exact same feature names. Though it
may not be a big deal, it lacks some flexibility.

Feature bits will not have this issue.

--yliu
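
To make the "features=0xdeadbeef" idea concrete, here is a minimal sketch of how a management layer might compare such values. It assumes the string format mentioned above; parse_features() and backend_compatible() are hypothetical helpers, not existing QEMU, libvirt or DPDK APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse a string of the assumed form "features=0x..." into a 64-bit mask. */
static bool parse_features(const char *s, uint64_t *out)
{
    static const char prefix[] = "features=";
    char *end;

    assert(s != NULL && out != NULL);
    if (strncmp(s, prefix, sizeof(prefix) - 1) != 0) {
        return false;
    }
    s += sizeof(prefix) - 1;
    if (*s == '\0') {
        return false;
    }
    errno = 0;
    unsigned long long v = strtoull(s, &end, 16);   /* accepts an optional 0x */
    if (errno != 0 || *end != '\0') {
        return false;
    }
    *out = v;
    return true;
}

/*
 * A destination backend can host the VM if it offers every feature bit
 * the guest-visible interface was started with.
 */
static bool backend_compatible(uint64_t in_use, uint64_t dst_supported)
{
    return (in_use & ~dst_supported) == 0;
}
```

This also covers the MTU case from the quoted text: if the user disabled a feature on the command line, its bit is simply absent from in_use, so an older backend that lacks that bit still passes the check.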

> 
> 
> > > > > >>
> > > > > >>Note that typically the list of supported versions can only be
> > > > > >>extended, not shrunk. Also, if the host/guest interf

Re: [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions

2016-11-23 Thread Nikunj A Dadhania

David Gibson  writes:

> On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:
>> From: Avinesh Kumar 
>> 
>> vextublx:  Vector Extract Unsigned Byte Left
>> vextuhlx:  Vector Extract Unsigned Halfword Left
>> vextuwlx:  Vector Extract Unsigned Word Left
>> 
>> Signed-off-by: Avinesh Kumar 
>> Signed-off-by: Nikunj A Dadhania 
>
> So, when I suggested doing these without helpers before, I had
> forgotten that the non-byte versions can straddle the word boundary.
> Given that the offset is in a register, not the instruction that does
> make it complicated.
>
> But, this version also relies on working 128-bit arithmetic, AFAICT
> this will just fail to build if CONFIG_INT128 isn't defined.

It has both implementations; the defines might just have
confused you:

#if defined(HOST_WORDS_BIGENDIAN)

#  if defined(CONFIG_INT128)
#  else
#  endif

#else /* !defined (HOST_WORDS_BIGENDIAN) */

#  if defined(CONFIG_INT128)
#  else
#  endif

#endif 

> It really shouldn't be that hard to make a helper that works just in
> terms of 64-bit arithmetic - there are only 3 cases (all in the upper
> word, all in the lower, and straddling).

Currently, it's being done using a byte array.

 +{   \
 +target_ulong r = 0; \
 +int i;  \
 +int index = a & 0xf;\
 +for (i = 0; i < elem; i++) {\
 +r = r << 8; \
 +if (index + i <= 15) {  \
 +r = r | b->u8[index + i];   \
 +}   \
 +}   \
 +return r;   \
 +}

> I'd prefer to see it done that way, rather than increasing reliance on
> CONFIG_INT128.

Regards
Nikunj
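
As a rough illustration of the two approaches discussed above, the sketch below puts the byte-array loop next to a variant that uses only 64-bit arithmetic. It is a sketch under assumptions, not the helper from the patch: vextu_left() and vextu_left_64() are hypothetical names, uint64_t stands in for target_ulong, and byte 0 of the vector is taken to be the most significant byte of the upper doubleword.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-array version, mirroring the loop quoted above. */
static uint64_t vextu_left(uint64_t a, const uint8_t vec[16], int elem)
{
    uint64_t r = 0;
    int index = a & 0xf;

    assert(elem == 1 || elem == 2 || elem == 4);
    for (int i = 0; i < elem; i++) {
        r <<= 8;
        if (index + i <= 15) {          /* bytes past the end read as zero */
            r |= vec[index + i];
        }
    }
    return r;
}

/* Same result using only 64-bit shifts on the two halves of the vector. */
static uint64_t vextu_left_64(uint64_t a, uint64_t hi, uint64_t lo, int elem)
{
    int index = a & 0xf;
    int end = index + elem;             /* one past the last byte wanted */

    assert(elem == 1 || elem == 2 || elem == 4);
    uint64_t mask = (UINT64_C(1) << (8 * elem)) - 1;

    if (end <= 8) {                     /* entirely in the upper doubleword */
        return (hi >> (8 * (8 - end))) & mask;
    }
    if (index >= 8) {                   /* entirely in the lower doubleword */
        if (end <= 16) {
            return (lo >> (8 * (16 - end))) & mask;
        }
        return (lo << (8 * (end - 16))) & mask;   /* runs past byte 15 */
    }
    /* straddling the boundary between the two doublewords */
    return ((hi << (8 * (end - 8))) | (lo >> (8 * (16 - end)))) & mask;
}
```

For a vector holding the bytes 0x00 to 0x0f, both versions return 0x0e0f0000 for a word extract at index 14, since the two bytes past the end are zero-filled.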

Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support

2016-11-23 Thread Ketan Nilangekar






On 11/24/16, 4:07 AM, "Paolo Bonzini"  wrote:

>
>
>On 23/11/2016 23:09, ashish mittal wrote:
>> On the topic of protocol security -
>> 
>> Would it be enough for the first patch to implement only
>> authentication and not encryption?
>
>Yes, of course.  However, as we introduce more and more QEMU-specific
>characteristics to a protocol that is already QEMU-specific (it doesn't
>do failover, etc.), I am still not sure of the actual benefit of using
>libqnio versus having an NBD server or FUSE driver.
>
>You have already mentioned performance, but the design has changed so
>much that I think one of the two things has to change: either failover
>moves back to QEMU and there is no (closed source) translator running on
>the node, or the translator needs to speak a well-known and
>already-supported protocol.

IMO design has not changed. Implementation has changed significantly. I would 
propose that we keep resiliency/failover code out of QEMU driver and implement 
it entirely in libqnio as planned in a subsequent revision. The VxHS server 
does not need to understand/handle failover at all. 

Today libqnio gives us significantly better performance than any NBD/FUSE 
implementation. We know because we have prototyped with both. Significant 
improvements to libqnio are also in the pipeline which will use cross memory 
attach calls to further boost performance. Of course a big reason for the 
performance is also the HyperScale storage backend but we believe this method 
of IO tapping/redirecting can be leveraged by other solutions as well.

Ketan

>
>Paolo
>
>> On Wed, Nov 23, 2016 at 12:25 AM, Ketan Nilangekar
>>  wrote:
>>> +Nitin Jerath from Veritas.
>>>
>>>
>>>
>>>
>>> On 11/18/16, 7:06 PM, "Daniel P. Berrange"  wrote:
>>>
 On Fri, Nov 18, 2016 at 01:25:43PM +, Ketan Nilangekar wrote:
>
>
>> On Nov 18, 2016, at 5:25 PM, Daniel P. Berrange  
>> wrote:
>>
>>> On Fri, Nov 18, 2016 at 11:36:02AM +, Ketan Nilangekar wrote:
>>>
>>>
>>>
>>>
>>>
 On 11/18/16, 3:32 PM, "Stefan Hajnoczi"  wrote:

> On Fri, Nov 18, 2016 at 02:26:21AM -0500, Jeff Cody wrote:
> * Daniel pointed out that there is no authentication method for talking to a
>  remote server.  This seems a bit scary.  Maybe all that is needed here is
>  some clarification of the security scheme for authentication?  My
>  impression from above is that you are relying on the networks being
>  private to provide some sort of implicit authentication, though, and this
>  seems fragile (and doesn't protect against a compromised guest or other
>  process on the server, for one).

 Exactly, from the QEMU trust model you must assume that QEMU has been
 compromised by the guest.  The escaped guest can connect to the VxHS
 server since it controls the QEMU process.

 An escaped guest must not have access to other guests' volumes.
 Therefore authentication is necessary.
>>>
>>> Just so I am clear on this, how will such an escaped guest get to know
>>> the other guest vdisk IDs?
>>
>> There can be multiple approaches depending on the deployment scenario.
>> At the very simplest it could directly read the IDs out of the libvirt
>> XML files in /var/run/libvirt. Or it can run "ps" to list other running
>> QEMU processes and see the vdisk IDs in the command line args of those
>> processes. Or the mgmt app may be creating vdisk IDs based on some
>> particular scheme, and the attacker may have info about this which lets
>> them determine likely IDs.  Or the QEMU may have previously been
>> permitted to use the disk and remembered the ID for use later
>> after access to the disk has been removed.
>>
>
> Are we talking about a compromised guest here or compromised hypervisor?
> How will a compromised guest read the xml file or list running qemu
> processes?

 Compromised QEMU process, aka hypervisor userspace


 Regards,
 Daniel
 --
 |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
 |: http://libvirt.org  -o- http://virt-manager.org :|
 |: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|

Re: [Qemu-devel] [PATCH v2] pci: relax pci_msi_get_message()

2016-11-23 Thread Peter Xu

On Tue, Nov 22, 2016 at 04:08:50PM +0800, Peter Xu wrote:
> We have been very strict about getting MSIs since commit
> d1f6af6a1 ("kvm-irqchip: simplify kvm_irqchip_add_msi_route"), assuming
> that MSI is configured beforehand when fetching. When we see an
> unrecognized configuration, we panic the system. However, it looks like
> this is too strict to work on some platforms, and issues occurred.
> It was first found in a ppc case and fixed by David in:
> 
>   6d17a01 vfio/pci: Fix regression in MSI routing configuration
> 
> However, we have now encountered another case with the Windows virtio
> driver, reported here (and there are possibly more):
> 
>   http://bugs.debian.org/844361
> 
> To make every driver/hardware happy, let's loosen the rule and go back
> to the original behavior: instead of panicking the system when we try to
> fetch an MSI while MSI/MSI-X is not configured, we just provide an empty
> message to make drivers happy.
> 
> Reported-by: Maciej Kotliński 
> Signed-off-by: Peter Xu 

Sorry, I should have marked this as "for-2.8". Also cc'ing stable since
this bug has existed since 2.7.0.

Michael, do you think this could be material for 2.8 rc2?

Thanks,

-- peterx

Re: [Qemu-devel] [PATCH 1/2] virtio-net rsc: support coalescing ipv4 tcp traffic

2016-11-23 Thread Michael S. Tsirkin

On Thu, Nov 24, 2016 at 12:31:18PM +0800, Jason Wang wrote:
> 
> 
> On 2016年11月24日 12:26, Michael S. Tsirkin wrote:
> > On Thu, Nov 24, 2016 at 12:17:21PM +0800, Jason Wang wrote:
> > > > diff --git a/include/standard-headers/linux/virtio_net.h b/include/standard-headers/linux/virtio_net.h
> > > > index 30ff249..e67b36e 100644
> > > > --- a/include/standard-headers/linux/virtio_net.h
> > > > +++ b/include/standard-headers/linux/virtio_net.h
> > > > @@ -57,6 +57,9 @@
> > > >  * Steering */
> > > >  #define VIRTIO_NET_F_CTRL_MAC_ADDR 23 /* Set MAC address */
> > > > +/* Guest can handle coalesced ipv4-tcp packets */
> > > > +#define VIRTIO_NET_F_GUEST_RSC4 41
> > > Why not use 24?
> > I think we should use features >31 (virtio 1 only) for
> > nice-to-have features like RSC. Feature bits <31 are
> > easy to backport, so it makes more sense to use
> > them for fundamental things like the MTU
> > (which for some setups helps fix broken networking).
> 
> Ok, I believe we need to clarify this in the spec or somewhere else.

There is a design considerations chapter, it can go there.

-- 
MST
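
As a side note on using feature bits above 31: such bits no longer fit in a 32-bit feature word, so masks have to be built and stored as 64-bit values. The following is a minimal sketch, assuming the bit numbers quoted in this thread; feature_mask(), has_feature() and fits_legacy_word() are hypothetical helpers, not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_F_CTRL_MAC_ADDR 23   /* low bit, from the quoted header */
#define EXAMPLE_F_RSC           41   /* bit above 31, as in the quoted patch */

/* Build the mask in 64 bits; shifting a plain int by 41 would be undefined. */
static uint64_t feature_mask(unsigned bit)
{
    assert(bit < 64);
    return UINT64_C(1) << bit;
}

static bool has_feature(uint64_t features, unsigned bit)
{
    return (features & feature_mask(bit)) != 0;
}

/* Bits 0..31 fit the 32-bit feature word; higher bits need 64-bit storage. */
static bool fits_legacy_word(unsigned bit)
{
    return bit < 32;
}
```

Truncating such a feature set to uint32_t silently drops bit 41 while keeping bit 23, which is one reason for widening feature fields from 32 to 64 bits.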

Re: [Qemu-devel] [PATCH] memory: add section range info for IOMMU notifier

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 05:20:58PM +0800, Peter Xu wrote:
> In this patch, IOMMUNotifier.{start|end} are introduced to store section
> information for a specific notifier. When notification occurs, we not
> only check the notification type (MAP|UNMAP), but also check whether the
> notified iova is in the range of specific IOMMU notifier, and skip those
> notifiers if not in the listened range.
> 
> When removing a region, we need to make sure we removed the correct
> VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.
> 
> Suggested-by: David Gibson 
> Signed-off-by: Peter Xu 
> 
> ---
> This patch fixes the same issue as the following one:
> 
>   [PATCH] vfio: avoid adding same iommu mr for notify
> 
> Alex/David, would you please help provide some review comments on either
> of the two patches? When we can settle down the best way, then I'll drop
> the other one (I still prefer the other one...). Thanks,
> 
> Signed-off-by: Peter Xu 
> ---
>  hw/vfio/common.c  | 7 ++-
>  include/exec/memory.h | 3 +++
>  memory.c  | 4 +++-
>  3 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 801578b..c3db115 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -455,6 +455,10 @@ static void vfio_listener_region_add(MemoryListener *listener,
>  giommu->container = container;
>  giommu->n.notify = vfio_iommu_map_notify;
>  giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL;
> +giommu->n.start = section->offset_within_address_space;

I think this needs to be offset_within_region rather than
offset_within_address_space.  The IOVAs used in the IOMMUTLBEntry are
relative to the MR, not the enclosing AS (in fact there could be
several enclosing ASes with the right aliasing).  See for example
put_tce_emu() - the (ioba - tcet->bus_offset) expression is
effectively converting the AS relative ioba into an MR relative
address.

> +llend = int128_add(int128_make64(giommu->n.start), section->size);
> +llend = int128_sub(llend, int128_one());
> +giommu->n.end = int128_get64(llend);
>  QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
>  
>  memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
> @@ -525,7 +529,8 @@ static void vfio_listener_region_del(MemoryListener *listener,
>  VFIOGuestIOMMU *giommu;
>  
>  QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
> -if (giommu->iommu == section->mr) {
> +if (giommu->iommu == section->mr &&
> +giommu->n.start == section->offset_within_address_space) {

Same here.

>  memory_region_unregister_iommu_notifier(giommu->iommu,
>  &giommu->n);
>  QLIST_REMOVE(giommu, giommu_next);
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 9728a2f..87357ea 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -84,6 +84,9 @@ typedef enum {
>  struct IOMMUNotifier {
>  void (*notify)(struct IOMMUNotifier *notifier, IOMMUTLBEntry *data);
>  IOMMUNotifierFlag notifier_flags;
> +/* Notify for address space range start <= addr <= end */
> +hwaddr start;
> +hwaddr end;
>  QLIST_ENTRY(IOMMUNotifier) node;
>  };
>  typedef struct IOMMUNotifier IOMMUNotifier;
> diff --git a/memory.c b/memory.c
> index 33110e9..f89d047 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1662,7 +1662,9 @@ void memory_region_notify_iommu(MemoryRegion *mr,
>  }
>  
>  QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
> -if (iommu_notifier->notifier_flags & request_flags) {
> +if (iommu_notifier->notifier_flags & request_flags &&
> +iommu_notifier->start <= entry.iova &&
> +iommu_notifier->end >= entry.iova) {
>  iommu_notifier->notify(iommu_notifier, &entry);
>  }
>  }


Apart from that, I think it looks correct.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
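
To illustrate the review comment about MR-relative addresses, here is a minimal sketch of the range bookkeeping. It is not the patch itself: NotifierRange and the helper names are hypothetical, and plain 64-bit arithmetic is assumed instead of the Int128 helpers used in the diff, so a section covering the full 2^64 range is not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t hwaddr;

/* Hypothetical stand-in for the start/end fields added to IOMMUNotifier. */
typedef struct NotifierRange {
    hwaddr start;   /* first IOVA of interest, relative to the memory region */
    hwaddr end;     /* last IOVA of interest, inclusive */
} NotifierRange;

/* Build the inclusive range from a section's offset within the region. */
static NotifierRange notifier_range(hwaddr offset_within_region, uint64_t size)
{
    assert(size > 0);
    assert(offset_within_region <= UINT64_MAX - (size - 1));
    NotifierRange r = { offset_within_region, offset_within_region + size - 1 };
    return r;
}

/* The check applied before calling a notifier for a given IOTLB entry. */
static bool notifier_wants(const NotifierRange *r, hwaddr iova)
{
    return r->start <= iova && iova <= r->end;
}

/*
 * Convert an address-space-relative address to a region-relative one,
 * in the spirit of the (ioba - tcet->bus_offset) expression cited above.
 */
static hwaddr as_to_mr(hwaddr as_addr, hwaddr mr_base_in_as)
{
    assert(as_addr >= mr_base_in_as);
    return as_addr - mr_base_in_as;
}
```

With a section at region offset 0x1000 and size 0x2000, the notifier would fire for IOVAs 0x1000 through 0x2fff. Comparing against an address-space offset instead would only give the same answer when the region happens to be mapped at address 0.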



Re: [Qemu-devel] [PATCH] hw/pci: disable pci-bridge's shpc by default

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 01:08:46PM +0200, Marcel Apfelbaum wrote:
> On 11/22/2016 10:25 PM, Laurent Vivier wrote:
> > 
> > 
> > On 22/11/2016 18:26, Marcel Apfelbaum wrote:
> > > On 11/18/2016 05:52 PM, Andrew Jones wrote:
> > > > On Wed, Nov 16, 2016 at 07:05:25PM +0200, Marcel Apfelbaum wrote:
> > > > > On 11/16/2016 06:44 PM, Andrew Jones wrote:
> > > > > > On Sat, Nov 05, 2016 at 06:46:34PM +0200, Marcel Apfelbaum wrote:
> > > > > > > On 11/03/2016 09:40 PM, Michael S. Tsirkin wrote:
> > > > > > > > On Thu, Nov 03, 2016 at 01:05:44PM +0200, Marcel Apfelbaum wrote:
> > > > > > > > > On 11/03/2016 06:18 AM, Michael S. Tsirkin wrote:
> > > > > > > > > > On Wed, Nov 02, 2016 at 05:16:42PM +0200, Marcel Apfelbaum wrote:
> > > > > > > > > > > The shpc component is optional while  ACPI hotplug is used
> > > > > > > > > > > for hot-plugging PCI devices into a PCI-PCI bridge.
> > > > > > > > > > > Disabling the shpc by default will make slot 0 usable at boot time
> > > > > > > > > 
> > > > > > > > > Hi Michael
> > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > at the cost of breaking all hotplug for all non-acpi users.
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Do we have a non-acpi user that is able to use the shpc
> > > > > > > > > component as-is today?
> > > > > > > > 
> > > > > > > > power and some arm systems I guess?
> > > > > > > > 
> > > > > > > 
> > > > > > > Adding Andrew , maybe he can give us an answer.
> > > > > > 
> > > > > > Not really :-) My lack of PCI knowledge makes that difficult. I'd be
> > > > > > happy to help with an experiment though. Can you give me command line
> > > > > > arguments, qmp commands, etc. that I should use to try it out? I imagine
> > > > > > I should just boot an ARM guest using DT (instead of ACPI) and then
> > > > > > attempt to hotplug a PCI device. I'm not sure, however, what, if any,
> > > > > > special configuration I need in order to ensure I'm testing what you're
> > > > > > interested in.
> > > > > > 
> > > > > 
> > > > > Hi Drew,
> > > > > 
> > > > > 
> > > > > Just run QEMU with '-device pci-bridge,chassis_nr=1,id=bridge1
> > > > > -monitor stdio'
> > > > > with an ARM guest using DT and wait until the guest finishes booting.
> > > > > 
> > > > > Then run at hmp:
> > > > > device_add virtio-net-pci,bus=bridge1,id=net2
> > > > > 
> > > > > Next run lspci in the guest to see the new device.
> > > > 
> > > > Thanks for the instructions Marcel. Here are the results
> > > > 
> > > >  $QEMU -machine virt,accel=$ACCEL -cpu $CPU -nographic -m 4096 -smp 8 \
> > > >-bios /usr/share/AAVMF/AAVMF_CODE.fd \
> > > >-device pci-bridge,chassis_nr=1,id=bridge1 \
> > > >-drive file=$FEDORA_IMG,if=none,id=dr0,format=qcow2 \
> > > >-device virtio-blk-pci,bus=bridge1,addr=01,drive=dr0,id=disk0 \
> > > >-netdev user,id=hostnet0 \
> > > >-device virtio-net-pci,bus=bridge1,addr=02,netdev=hostnet0,id=net0
> > > > 
> > > >  # lspci
> > > >  00:00.0 Host bridge: Red Hat, Inc. Device 0008
> > > >  00:01.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
> > > >  01:01.0 SCSI storage controller: Red Hat, Inc Virtio block device
> > > >  01:02.0 Ethernet controller: Red Hat, Inc Virtio network device
> > > > 
> > > >  (qemu) device_add virtio-net-pci,bus=bridge1,id=net2
> > > >  Unsupported PCI slot 0 for standard hotplug controller. Valid slots are
> > > >  between 1 and 31.
> > > > 
> > > > (Tried again giving addr=03)
> > > > 
> > > >  (qemu) device_add virtio-net-pci,bus=bridge1,id=net2,addr=03
> > > > 
> > > > (Seemed to work, but...)
> > > > 
> > > >  # lspci
> > > >  00:00.0 Host bridge: Red Hat, Inc. Device 0008
> > > >  00:01.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
> > > >  01:01.0 SCSI storage controller: Red Hat, Inc Virtio block device
> > > >  01:02.0 Ethernet controller: Red Hat, Inc Virtio network device
> > > > 
> > > > (Doesn't show up in lspci. So I guess it doesn't work)
> > > > 
> > > 
> > > Hi Drew,
> > > Thanks for confirming that it doesn't work.
> > > 
> > > Michael asked if we can check the same for powerpc before
> > > disabling the shpc by default.
> > > 
> > > Adding David, Thomas and Laurent, maybe they have time
> > > to check it for powerpc.
> > 
> > With this patch:
> > 
> > # lspci
> > 00:00.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
> > 00:03.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon
> > [root@dhcp6-56 ~]# QEMU 2.7.90 monitor - type 'help' for more information
> > (qemu) device_add virtio-net-pci,bus=bridge1,id=net2
> > Bus 'bridge1' does not support hotplugging
> > (qemu) device_add virtio-net-pci,bus=bridge1,id=net2,addr=3
> > Bus 'bridge1' does not support hotplugging
> > 
> > Without this patch:
> > 
> > # lspci
> > 00:00.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
> > 00:03.0 Unclassified device [00ff]: Red Ha

Re: [Qemu-devel] [kvm-unit-tests PATCH v11 1/3] arm: Add PMU test

2016-11-23 Thread Wei Huang



On 11/23/2016 11:15 AM, Andrew Jones wrote:
> On Wed, Nov 23, 2016 at 01:16:08PM +, Andre Przywara wrote:
>> Hi,
>>
>> On 22/11/16 18:29, Wei Huang wrote:
>>> From: Christopher Covington 
>>>
>>> Beginning with a simple sanity check of the control register, add
>>> a unit test for the ARM Performance Monitors Unit (PMU).
>>
>> Mmh, the output of this is a bit confusing. How about to join some
>> information? I changed it to give me:
>> INFO: pmu: PMU implementer/ID code: "A"(0x41)/0x0
>> INFO: pmu: Event counters:  0
>> PASS: pmu: Control register
>>
>> ... by using the newly introduced report_info() to make it look nicer.
> 
> Agreed. That would look nicer and make good use of report_info. Let's
> do that.

I have adjusted v12 to use report_info(), with all PMU PMCR fields
printed on the same line. The implementer is printed in hex first,
then as its ASCII representation, to match the MIDR table in the ARM manual:

INFO: pmu: PMU implementer/ID code/counters: 0x41("A")/0x1/6
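
For reference, the line above can be produced from a raw PMCR value roughly as in the sketch below. It reuses the shift and mask values from the quoted patch, but format_pmcr() is a hypothetical helper rather than the v12 code, and printing "?" for a zero implementer byte is an assumption made here so that the string cannot contain an embedded NUL.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define PMU_PMCR_N_SHIFT   11
#define PMU_PMCR_N_MASK    0x1f
#define PMU_PMCR_ID_SHIFT  16
#define PMU_PMCR_ID_MASK   0xff
#define PMU_PMCR_IMP_SHIFT 24
#define PMU_PMCR_IMP_MASK  0xff

static unsigned pmcr_imp(uint32_t pmcr)
{
    return (pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK;
}

static unsigned pmcr_id(uint32_t pmcr)
{
    return (pmcr >> PMU_PMCR_ID_SHIFT) & PMU_PMCR_ID_MASK;
}

static unsigned pmcr_n(uint32_t pmcr)
{
    return (pmcr >> PMU_PMCR_N_SHIFT) & PMU_PMCR_N_MASK;
}

/* Format all three fields on one line; returns what snprintf() returns. */
static int format_pmcr(uint32_t pmcr, char *buf, size_t len)
{
    unsigned imp = pmcr_imp(pmcr);
    /* A zero implementer byte would terminate the string early if printed. */
    char c = imp ? (char)imp : '?';

    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "PMU implementer/ID code/counters: 0x%x(\"%c\")/0x%x/%u",
                    imp, c, pmcr_id(pmcr), pmcr_n(pmcr));
}
```

A PMCR value of 0x41013000 decodes to implementer 0x41 ("A"), ID code 0x1 and 6 event counters, matching the fields in the sample line.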


> 
>>
>>>
>>> Signed-off-by: Christopher Covington 
>>> Signed-off-by: Wei Huang 
>>> Reviewed-by: Andrew Jones 
>>> ---
>>>  arm/Makefile.common |  3 ++-
>>>  arm/pmu.c   | 74 
>>> +
>>>  arm/unittests.cfg   |  5 
>>>  3 files changed, 81 insertions(+), 1 deletion(-)
>>>  create mode 100644 arm/pmu.c
>>>
>>> diff --git a/arm/Makefile.common b/arm/Makefile.common
>>> index f37b5c2..5da2fdd 100644
>>> --- a/arm/Makefile.common
>>> +++ b/arm/Makefile.common
>>> @@ -12,7 +12,8 @@ endif
>>>  tests-common = \
>>> $(TEST_DIR)/selftest.flat \
>>> $(TEST_DIR)/spinlock-test.flat \
>>> -   $(TEST_DIR)/pci-test.flat
>>> +   $(TEST_DIR)/pci-test.flat \
>>> +   $(TEST_DIR)/pmu.flat
>>>  
>>>  all: test_cases
>>>  
>>> diff --git a/arm/pmu.c b/arm/pmu.c
>>> new file mode 100644
>>> index 000..9d9c53b
>>> --- /dev/null
>>> +++ b/arm/pmu.c
>>> @@ -0,0 +1,74 @@
>>> +/*
>>> + * Test the ARM Performance Monitors Unit (PMU).
>>> + *
>>> + * Copyright (c) 2015-2016, The Linux Foundation. All rights reserved.
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify it
>>> + * under the terms of the GNU Lesser General Public License version 2.1 and
>>> + * only version 2.1 as published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful, but WITHOUT
>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
>>> + * for more details.
>>> + */
>>> +#include "libcflat.h"
>>> +#include "asm/barrier.h"
>>> +
>>> +#define PMU_PMCR_N_SHIFT   11
>>> +#define PMU_PMCR_N_MASK0x1f
>>> +#define PMU_PMCR_ID_SHIFT  16
>>> +#define PMU_PMCR_ID_MASK   0xff
>>> +#define PMU_PMCR_IMP_SHIFT 24
>>> +#define PMU_PMCR_IMP_MASK  0xff
>>> +
>>> +#if defined(__arm__)
>>
>> I guess you should use the arch specific header files we have in place
>> for that (lib/arm{.64}/asm/processor.h). Also there are sysreg read
>> wrappers (at least for arm64) in there already, can't we base this
>> function on them: DEFINE_GET_SYSREG32(pmcr, el0)?
>> (Requires a small change to get rid of the forced "_el1" suffix)
>>
>> We should wait for the GIC series to be merged, as this contains some
>> changes in this area.
> 
> As this unit test is the only consumer of PMC registers so far, then
> I'd prefer the defines and accessors stay here for now. Once we see
> a use in other unit tests then we can move some of it out.

I left accessors in-place. We can always come back to refactor them later.

> 
>>
>>> +static inline uint32_t pmcr_read(void)
>>> +{
>>> +   uint32_t ret;
>>> +
>>> +   asm volatile("mrc p15, 0, %0, c9, c12, 0" : "=r" (ret));
>>> +   return ret;
>>> +}
>>> +#elif defined(__aarch64__)
>>> +static inline uint32_t pmcr_read(void)
>>> +{
>>> +   uint32_t ret;
>>> +
>>> +   asm volatile("mrs %0, pmcr_el0" : "=r" (ret));
>>> +   return ret;
>>> +}
>>> +#endif
>>> +
>>> +/*
>>> + * As a simple sanity check on the PMCR_EL0, ensure the implementer field isn't
>>> + * null. Also print out a couple other interesting fields for diagnostic
>>> + * purposes. For example, as of fall 2016, QEMU TCG mode doesn't implement
>>> + * event counters and therefore reports zero event counters, but hopefully
>>> + * support for at least the instructions event will be added in the future and
>>> + * the reported number of event counters will become nonzero.
>>> + */
>>> +static bool check_pmcr(void)
>>> +{
>>> +   uint32_t pmcr;
>>> +
>>> +   pmcr = pmcr_read();
>>> +
>>> +   printf("PMU implementer: %c\n",
>>> +  (pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK);
>>
>> If this register reads as zero, the output is mangled (since it cuts off
>> the string before the newline):
>> =
>> PMU implementer: Identification code: 0x0
>> =
>>
>> I guess you need something like:
>> (pmcr >> PMU_PMCR_IMP_SHIFT

Re: [Qemu-devel] [PATCH 1/2] virtio-net rsc: support coalescing ipv4 tcp traffic

2016-11-23 Thread Jason Wang




On 2016年11月24日 12:26, Michael S. Tsirkin wrote:

On Thu, Nov 24, 2016 at 12:17:21PM +0800, Jason Wang wrote:

diff --git a/include/standard-headers/linux/virtio_net.h b/include/standard-headers/linux/virtio_net.h
index 30ff249..e67b36e 100644
--- a/include/standard-headers/linux/virtio_net.h
+++ b/include/standard-headers/linux/virtio_net.h
@@ -57,6 +57,9 @@
 * Steering */
   #define VIRTIO_NET_F_CTRL_MAC_ADDR 23/* Set MAC address */
+/* Guest can handle coalesced ipv4-tcp packets */
+#define VIRTIO_NET_F_GUEST_RSC4 41

Why not use 24?

I think we should use features >31 (virtio 1 only) for
nice-to-have features like RSC. Feature bits <31 are
easy to backport, so it makes more sense to use
them for fundamental things like the MTU
(which for some setups helps fix broken networking).


Ok, I believe we need to clarify this in the spec or somewhere else.

Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI

2016-11-23 Thread Michael S. Tsirkin

On Wed, Nov 23, 2016 at 07:38:35PM -0500, Kevin O'Connor wrote:
> As a general comment - it does seem unfortunate that we keep building
> adhoc interfaces to communicate information from firmware to QEMU.  We
> have a generic mechanism (fw_cfg) for passing adhoc information from
> QEMU to the firmware, but the inverse seems to always involve magic
> pci registers, magic io space registers, specific init ordering, etc.

FWIW I posted a proposal
fw-cfg: support writeable blobs
a while ago to try to address that

-- 
MST

Re: [Qemu-devel] [ RFC Patch v7 0/2] Support Receive-Segment-Offload(RSC) for WHQL

2016-11-23 Thread Jason Wang




On 2016年11月01日 01:41, w...@redhat.com wrote:

From: Wei Xu 

This patch is to support the WHQL test for Windows guests; the
feature also benefits other guests, working as a kernel 'gro'-like
feature with a userspace implementation.

Feature information:
http://msdn.microsoft.com/en-us/library/windows/hardware/jj853324

v6->v7
- Change the drain timer from 'virtual' to 'host' since it is invisible
   to the guest.
- Move the buffer list empty check to virtio_net_rsc_do_coalesc().
- The header comparison is a bit odd for ipv4 in this patch; it
   would be simpler with an equality check, but this is also a helper for
   ipv6 in the next patch, and ipv6 uses different-size address fields, so I
   used an 'address + size' byte comparison for the addresses, and changed
   the tcp port comparison to an 'int' equality check.
- Add count for packets whose size less than a normal tcp packet in
   sanity check.
- Move constant value comparison to the right side of the equal symbol.
- Use the host header length instead of the guest header length to verify a
   packet in virtio_net_rsc_receive(), in case the header length differs
   between guest and host.
- Check whether the packet size is enough to hold a legal packet before
   extracting the ip unit.
- Bypass ip/tcp ECN packets.
- Expand the feature bit definition from 32 to 64 bits.

Other notes:
- About tcp window scaling: we don't have connection tracking for all
   tcp connections, so we don't know the exact window size in use. This
   feature may therefore interact badly with window scaling, and currently
   has to be turned off for such a use case.
- There are 2 new fields in the virtio net header; they are not in either
   the kernel tree or the maintainer's tree right now, I just put them
   directly here.
- The statistics are kept in this version since they are helpful for
   troubleshooting.


Please do not add more and more stuff to the same patch. Instead, 
you can add it in new patches on top. This can greatly simplify 
the reviewers' work. E.g. in this version, it looks like the virtio 
extension parts bring lots of trouble. So I suggest splitting the 
patch into several parts:


- helpers (e.g. a macro for the ECN bit)
- core coalescing logic, which has been reviewed for several versions; 
please do not add more functions to this part. This part could even be 
disabled in the code until the virtio part is introduced.
- virtio extension (e.g. virtio-net header extension and feature bits)
- stats

Thanks

[Qemu-devel] [kvm-unit-tests PATCH v12 2/3] arm: pmu: Check cycle count increases

2016-11-23 Thread Wei Huang

From: Christopher Covington 

Ensure that reads of the PMCCNTR_EL0 are monotonically increasing,
even for the smallest delta of two subsequent reads.

Signed-off-by: Christopher Covington 
Signed-off-by: Wei Huang 
Reviewed-by: Andrew Jones 
---
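For illustration only (not part of the patch): the back-to-back read check
described in the commit message can be modeled in portable C. The
read_counter_fn callback and the two stub counters below are hypothetical
stand-ins for pmccntr_read(); the real test reads PMCCNTR_EL0 directly.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_SAMPLES 10

/* Hypothetical stand-in for pmccntr_read(): any source of counter values. */
typedef uint64_t (*read_counter_fn)(void);

/* Return true if every pair of back-to-back reads strictly increases. */
static bool reads_strictly_increase(read_counter_fn read_counter)
{
	for (int i = 0; i < NR_SAMPLES; i++) {
		uint64_t a = read_counter();
		uint64_t b = read_counter();

		/* Even the smallest delta must be visible: equal reads fail. */
		if (a >= b)
			return false;
	}
	return true;
}

/* Stub counters, for demonstration only. */
static uint64_t fake_ticks;

static uint64_t ticking_counter(void)
{
	return ++fake_ticks;	/* advances on every read */
}

static uint64_t stuck_counter(void)
{
	return 42;		/* never advances */
}
```

With these stubs, a counter that advances on every read passes and a counter
that never advances fails, which is the property the patch checks on hardware.
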
 arm/pmu.c | 156 ++
 1 file changed, 156 insertions(+)

diff --git a/arm/pmu.c b/arm/pmu.c
index 98ebea4..3ae6545 100644
--- a/arm/pmu.c
+++ b/arm/pmu.c
@@ -15,6 +15,9 @@
 #include "libcflat.h"
 #include "asm/barrier.h"
 
+#define PMU_PMCR_E         (1 << 0)
+#define PMU_PMCR_C         (1 << 2)
+#define PMU_PMCR_LC        (1 << 6)
 #define PMU_PMCR_N_SHIFT   11
 #define PMU_PMCR_N_MASK    0x1f
 #define PMU_PMCR_ID_SHIFT  16
@@ -22,6 +25,14 @@
 #define PMU_PMCR_IMP_SHIFT 24
 #define PMU_PMCR_IMP_MASK  0xff
 
+#define ID_DFR0_PERFMON_SHIFT 24
+#define ID_DFR0_PERFMON_MASK  0xf
+
+#define PMU_CYCLE_IDX 31
+
+#define NR_SAMPLES 10
+
+static unsigned int pmu_version;
 #if defined(__arm__)
 static inline uint32_t pmcr_read(void)
 {
@@ -30,6 +41,69 @@ static inline uint32_t pmcr_read(void)
asm volatile("mrc p15, 0, %0, c9, c12, 0" : "=r" (ret));
return ret;
 }
+
+static inline void pmcr_write(uint32_t value)
+{
+   asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r" (value));
+   isb();
+}
+
+static inline void pmselr_write(uint32_t value)
+{
+   asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (value));
+   isb();
+}
+
+static inline void pmxevtyper_write(uint32_t value)
+{
+   asm volatile("mcr p15, 0, %0, c9, c13, 1" : : "r" (value));
+}
+
+static inline uint64_t pmccntr_read(void)
+{
+   uint32_t lo, hi = 0;
+
+   if (pmu_version == 0x3)
+   asm volatile("mrrc p15, 0, %0, %1, c9" : "=r" (lo), "=r" (hi));
+   else
+   asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (lo));
+
+   return ((uint64_t)hi << 32) | lo;
+}
+
+static inline void pmccntr_write(uint64_t value)
+{
+   uint32_t lo, hi;
+
+   lo = value & 0xffffffff;
+   hi = (value >> 32) & 0xffffffff;
+
+   if (pmu_version == 0x3)
+   asm volatile("mcrr p15, 0, %0, %1, c9" : : "r" (lo), "r" (hi));
+   else
+   asm volatile("mcr p15, 0, %0, c9, c13, 0" : : "r" (lo));
+}
+
+static inline void pmcntenset_write(uint32_t value)
+{
+   asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r" (value));
+}
+
+/* PMCCFILTR is an obsolete name for PMXEVTYPER31 in ARMv7 */
+static inline void pmccfiltr_write(uint32_t value)
+{
+   pmselr_write(PMU_CYCLE_IDX);
+   pmxevtyper_write(value);
+   isb();
+}
+
+static inline uint32_t id_dfr0_read(void)
+{
+   uint32_t val;
+
+   asm volatile("mrc p15, 0, %0, c0, c1, 2" : "=r" (val));
+   return val;
+}
 #elif defined(__aarch64__)
 static inline uint32_t pmcr_read(void)
 {
@@ -38,6 +112,44 @@ static inline uint32_t pmcr_read(void)
asm volatile("mrs %0, pmcr_el0" : "=r" (ret));
return ret;
 }
+
+static inline void pmcr_write(uint32_t value)
+{
+   asm volatile("msr pmcr_el0, %0" : : "r" (value));
+   isb();
+}
+
+static inline uint64_t pmccntr_read(void)
+{
+   uint64_t cycles;
+
+   asm volatile("mrs %0, pmccntr_el0" : "=r" (cycles));
+   return cycles;
+}
+
+static inline void pmccntr_write(uint64_t value)
+{
+   asm volatile("msr pmccntr_el0, %0" : : "r" (value));
+}
+
+static inline void pmcntenset_write(uint32_t value)
+{
+   asm volatile("msr pmcntenset_el0, %0" : : "r" (value));
+}
+
+static inline void pmccfiltr_write(uint32_t value)
+{
+   asm volatile("msr pmccfiltr_el0, %0" : : "r" (value));
+   isb();
+}
+
+static inline uint32_t id_dfr0_read(void)
+{
+   uint32_t id;
+
+   asm volatile("mrs %0, id_dfr0_el1" : "=r" (id));
+   return id;
+}
 #endif
 
 /*
@@ -63,11 +175,55 @@ static bool check_pmcr(void)
return ((pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK) != 0;
 }
 
+/*
+ * Ensure that the cycle counter progresses between back-to-back reads.
+ */
+static bool check_cycles_increase(void)
+{
+   bool success = true;
+
+   /* init before event access, this test only cares about cycle count */
+   pmcntenset_write(1 << PMU_CYCLE_IDX);
+   pmccfiltr_write(0); /* count cycles in EL0, EL1, but not EL2 */
+   pmccntr_write(0);
+
+   pmcr_write(pmcr_read() | PMU_PMCR_LC | PMU_PMCR_C | PMU_PMCR_E);
+
+   for (int i = 0; i < NR_SAMPLES; i++) {
+   uint64_t a, b;
+
+   a = pmccntr_read();
+   b = pmccntr_read();
+
+   if (a >= b) {
+   printf("Read %"PRId64" then %"PRId64".\n", a, b);
+   success = false;
+   break;
+   }
+   }
+
+   pmcr_write(pmcr_read() & ~PMU_PMCR_E);
+
+   return success;
+}
+
+void pmu_init(void)
+{
+   uint32_t dfr0;
+
+   /* probe pmu version */
+   dfr0 = id_dfr0_read();
+   pmu_version = (dfr0 >> ID_DFR0_P

Re: [Qemu-devel] [PATCH 1/2] virtio-net rsc: support coalescing ipv4 tcp traffic

2016-11-23 Thread Michael S. Tsirkin

On Thu, Nov 24, 2016 at 12:17:21PM +0800, Jason Wang wrote:
> > diff --git a/include/standard-headers/linux/virtio_net.h 
> > b/include/standard-headers/linux/virtio_net.h
> > index 30ff249..e67b36e 100644
> > --- a/include/standard-headers/linux/virtio_net.h
> > +++ b/include/standard-headers/linux/virtio_net.h
> > @@ -57,6 +57,9 @@
> >  * Steering */
> >   #define VIRTIO_NET_F_CTRL_MAC_ADDR 23 /* Set MAC address */
> > +/* Guest can handle coalesced ipv4-tcp packets */
> > +#define VIRTIO_NET_F_GUEST_RSC4 41
> 
> Why not use 24?

I think we should use features >31 (virtio 1 only) for
nice-to-have features like RSC. Feature bits <31 are
easy to backport, so it makes more sense to use
them for fundamental things like the MTU
(which for some setups help fix broken networking).
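
To illustrate the distinction drawn above, here is a minimal sketch of why a
feature bit above 31 needs a 64-bit feature word. The helper names are
hypothetical, and bit 41 is used only as an example of a high bit; the sketch
assumes that a transport limited to a 32-bit feature word simply truncates the
upper half.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_CTRL_MAC_ADDR 23	/* from the quoted header */
#define EXAMPLE_HIGH_FEATURE_BIT   41	/* example of a bit above 31 */

/* Set one feature bit in a 64-bit feature word. */
static uint64_t feature_set(uint64_t features, unsigned int bit)
{
	return features | (1ULL << bit);
}

/* Test one feature bit in a 64-bit feature word. */
static bool feature_test(uint64_t features, unsigned int bit)
{
	return (features >> bit) & 1;
}

/* Assumption: a 32-bit feature word keeps only bits 0..31. */
static bool feature_fits_32bit_word(unsigned int bit)
{
	return bit < 32;
}
```

Note that 1 << 41 on a 32-bit int would be undefined behavior, which is why
the mask is built with 1ULL.
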

[Qemu-devel] [kvm-unit-tests PATCH v12 3/3] arm: pmu: Add CPI checking

2016-11-23 Thread Wei Huang

From: Christopher Covington 

Calculate the number of cycles per instruction (CPI) implied by ARM
PMU cycle counter values. The code includes a strict checking facility
intended for the -icount option in TCG mode in the configuration file.

Signed-off-by: Christopher Covington 
Signed-off-by: Wei Huang 
Reviewed-by: Andrew Jones 
---
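For illustration only (not part of the patch): the arithmetic behind the
exact-instruction loop and the strict CPI check can be sketched in plain C.
The helper names are hypothetical; the formulas follow the comments in the
patch (total instructions = 2 + 2*loop, expected cycles = instructions * CPI).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Loop iterations needed so that the measured block executes exactly
 * num instructions: 2 fixed instructions plus 2 per iteration
 * (subs + conditional branch).
 */
static int loop_count_for(int num)
{
	/* Only even counts >= 4 can be expressed as 2 + 2*loop, loop >= 1. */
	assert(num >= 4 && (num - 2) % 2 == 0);
	return (num - 2) / 2;
}

/* Cycle count that strict checking expects for num instructions. */
static uint64_t expected_cycles(int num, int cpi)
{
	return (uint64_t)num * (uint64_t)cpi;
}
```

The outer loop in check_cpi() starts at 4 and steps by 32, so every
instruction count it generates (4, 36, ..., 292) is even and satisfies the
precondition.
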
 arm/pmu.c | 123 +-
 arm/unittests.cfg |  14 +++
 2 files changed, 136 insertions(+), 1 deletion(-)

diff --git a/arm/pmu.c b/arm/pmu.c
index 3ae6545..f05d00d 100644
--- a/arm/pmu.c
+++ b/arm/pmu.c
@@ -104,6 +104,27 @@ static inline uint32_t id_dfr0_read(void)
asm volatile("mrc p15, 0, %0, c0, c1, 2" : "=r" (val));
return val;
 }
+
+/*
+ * Extra instructions inserted by the compiler would be difficult to compensate
+ * for, so hand assemble everything between, and including, the PMCR accesses
+ * to start and stop counting. isb instructions were inserted to make sure
+ * pmccntr read after this function returns the exact instructions executed in
+ * the controlled block. Total instrs = isb + mcr + 2*loop = 2 + 2*loop.
+ */
+static inline void precise_instrs_loop(int loop, uint32_t pmcr)
+{
+   asm volatile(
+   "   mcr p15, 0, %[pmcr], c9, c12, 0\n"
+   "   isb\n"
+   "1: subs    %[loop], %[loop], #1\n"
+   "   bgt     1b\n"
+   "   mcr p15, 0, %[z], c9, c12, 0\n"
+   "   isb\n"
+   : [loop] "+r" (loop)
+   : [pmcr] "r" (pmcr), [z] "r" (0)
+   : "cc");
+}
 #elif defined(__aarch64__)
 static inline uint32_t pmcr_read(void)
 {
@@ -150,6 +171,27 @@ static inline uint32_t id_dfr0_read(void)
asm volatile("mrs %0, id_dfr0_el1" : "=r" (id));
return id;
 }
+
+/*
+ * Extra instructions inserted by the compiler would be difficult to compensate
+ * for, so hand assemble everything between, and including, the PMCR accesses
+ * to start and stop counting. isb instructions are inserted to make sure
+ * pmccntr read after this function returns the exact instructions executed
+ * in the controlled block. Total instrs = isb + msr + 2*loop = 2 + 2*loop.
+ */
+static inline void precise_instrs_loop(int loop, uint32_t pmcr)
+{
+   asm volatile(
+   "   msr pmcr_el0, %[pmcr]\n"
+   "   isb\n"
+   "1: subs    %[loop], %[loop], #1\n"
+   "   b.gt    1b\n"
+   "   msr pmcr_el0, xzr\n"
+   "   isb\n"
+   : [loop] "+r" (loop)
+   : [pmcr] "r" (pmcr)
+   : "cc");
+}
 #endif
 
 /*
@@ -207,6 +249,79 @@ static bool check_cycles_increase(void)
return success;
 }
 
+/*
+ * Execute a known number of guest instructions. Only even instruction counts
+ * greater than or equal to 4 are supported by the in-line assembly code. The
+ * control register (PMCR_EL0) is initialized with the provided value (allowing
+ * for example for the cycle counter or event counters to be reset). At the end
+ * of the exact instruction loop, zero is written to PMCR_EL0 to disable
+ * counting, allowing the cycle counter or event counters to be read at the
+ * leisure of the calling code.
+ */
+static void measure_instrs(int num, uint32_t pmcr)
+{
+   int loop = (num - 2) / 2;
+
+   assert(num >= 4 && ((num - 2) % 2 == 0));
+   precise_instrs_loop(loop, pmcr);
+}
+
+/*
+ * Measure cycle counts for various known instruction counts. Ensure that the
+ * cycle counter progresses (similar to check_cycles_increase() but with more
+ * instructions and using reset and stop controls). If supplied a positive,
+ * nonzero CPI parameter, also strictly check that every measurement matches
+ * it. Strict CPI checking is used to test -icount mode.
+ */
+static bool check_cpi(int cpi)
+{
+   uint32_t pmcr = pmcr_read() | PMU_PMCR_LC | PMU_PMCR_C | PMU_PMCR_E;
+
+   /* init before event access, this test only cares about cycle count */
+   pmcntenset_write(1 << PMU_CYCLE_IDX);
+   pmccfiltr_write(0); /* count cycles in EL0, EL1, but not EL2 */
+
+   if (cpi > 0)
+   printf("Checking for CPI=%d.\n", cpi);
+   printf("instrs : cycles0 cycles1 ...\n");
+
+   for (unsigned int i = 4; i < 300; i += 32) {
+   uint64_t avg, sum = 0;
+
+   printf("%d :", i);
+   for (int j = 0; j < NR_SAMPLES; j++) {
+   uint64_t cycles;
+
+   pmccntr_write(0);
+   measure_instrs(i, pmcr);
+   cycles = pmccntr_read();
+   printf(" %"PRId64"", cycles);
+
+   if (!cycles) {
+   printf("\ncycles not incrementing!\n");
+   return false;
+   } else if (cpi > 0 && cycles != i * cpi) {
+   printf("\nunexpected cycle count received!\n");
+   return false;
+   } else if ((cycles >> 32)

[Qemu-devel] [kvm-unit-tests PATCH v12 1/3] arm: Add PMU test

2016-11-23 Thread Wei Huang

From: Christopher Covington 

Beginning with a simple sanity check of the control register, add
a unit test for the ARM Performance Monitors Unit (PMU).

Signed-off-by: Christopher Covington 
Signed-off-by: Wei Huang 
---
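For illustration only (not part of the patch): the PMCR field extraction that
check_pmcr() performs can be sketched as a pure function. The shift and mask
values are the ones defined in the patch; struct pmcr_fields and the helper
names are hypothetical, and the register values used to exercise the sketch
are made up.

```c
#include <stdint.h>

#define PMU_PMCR_N_SHIFT   11
#define PMU_PMCR_N_MASK    0x1f
#define PMU_PMCR_ID_SHIFT  16
#define PMU_PMCR_ID_MASK   0xff
#define PMU_PMCR_IMP_SHIFT 24
#define PMU_PMCR_IMP_MASK  0xff

/* Hypothetical container for the decoded fields. */
struct pmcr_fields {
	uint32_t implementer;	/* bits [31:24] */
	uint32_t id_code;	/* bits [23:16] */
	uint32_t counters;	/* bits [15:11], number of event counters */
};

static struct pmcr_fields pmcr_decode(uint32_t pmcr)
{
	struct pmcr_fields f;

	f.implementer = (pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK;
	f.id_code = (pmcr >> PMU_PMCR_ID_SHIFT) & PMU_PMCR_ID_MASK;
	f.counters = (pmcr >> PMU_PMCR_N_SHIFT) & PMU_PMCR_N_MASK;
	return f;
}

/*
 * Printable form of the implementer code. A zero implementer would end a
 * "%c"-formatted string early, so substitute a space, as the patch does
 * with the "? :" operator.
 */
static char implementer_char(uint32_t implementer)
{
	return implementer ? (char)implementer : ' ';
}
```

A made-up register value of 0x41013000 decodes to implementer 0x41 ('A'),
ID code 0x01 and 6 event counters.
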
 arm/Makefile.common |  3 ++-
 arm/pmu.c   | 73 +
 arm/unittests.cfg   |  5 
 3 files changed, 80 insertions(+), 1 deletion(-)
 create mode 100644 arm/pmu.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index f37b5c2..5da2fdd 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -12,7 +12,8 @@ endif
 tests-common = \
$(TEST_DIR)/selftest.flat \
$(TEST_DIR)/spinlock-test.flat \
-   $(TEST_DIR)/pci-test.flat
+   $(TEST_DIR)/pci-test.flat \
+   $(TEST_DIR)/pmu.flat
 
 all: test_cases
 
diff --git a/arm/pmu.c b/arm/pmu.c
new file mode 100644
index 0000000..98ebea4
--- /dev/null
+++ b/arm/pmu.c
@@ -0,0 +1,73 @@
+/*
+ * Test the ARM Performance Monitors Unit (PMU).
+ *
+ * Copyright (c) 2015-2016, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU Lesser General Public License version 2.1 and
+ * only version 2.1 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
+ * for more details.
+ */
+#include "libcflat.h"
+#include "asm/barrier.h"
+
+#define PMU_PMCR_N_SHIFT   11
+#define PMU_PMCR_N_MASK    0x1f
+#define PMU_PMCR_ID_SHIFT  16
+#define PMU_PMCR_ID_MASK   0xff
+#define PMU_PMCR_IMP_SHIFT 24
+#define PMU_PMCR_IMP_MASK  0xff
+
+#if defined(__arm__)
+static inline uint32_t pmcr_read(void)
+{
+   uint32_t ret;
+
+   asm volatile("mrc p15, 0, %0, c9, c12, 0" : "=r" (ret));
+   return ret;
+}
+#elif defined(__aarch64__)
+static inline uint32_t pmcr_read(void)
+{
+   uint32_t ret;
+
+   asm volatile("mrs %0, pmcr_el0" : "=r" (ret));
+   return ret;
+}
+#endif
+
+/*
+ * As a simple sanity check on the PMCR_EL0, ensure the implementer field isn't
+ * null. Also print out a couple other interesting fields for diagnostic
+ * purposes. For example, as of fall 2016, QEMU TCG mode doesn't implement
+ * event counters and therefore reports zero event counters, but hopefully
+ * support for at least the instructions event will be added in the future and
+ * the reported number of event counters will become nonzero.
+ */
+static bool check_pmcr(void)
+{
+   uint32_t pmcr;
+
+   pmcr = pmcr_read();
+
+   report_info("PMU implementer/ID code/counters: 0x%x(\"%c\")/0x%x/%d",
+   (pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK,
+   ((pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK) ? : ' ',
+   (pmcr >> PMU_PMCR_ID_SHIFT) & PMU_PMCR_ID_MASK,
+   (pmcr >> PMU_PMCR_N_SHIFT) & PMU_PMCR_N_MASK);
+
+   return ((pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK) != 0;
+}
+
+int main(void)
+{
+   report_prefix_push("pmu");
+
+   report("Control register", check_pmcr());
+
+   return report_summary();
+}
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index ae32a42..816f494 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -58,3 +58,8 @@ groups = selftest
 [pci-test]
 file = pci-test.flat
 groups = pci
+
+# Test PMU support
+[pmu]
+file = pmu.flat
+groups = pmu
-- 
1.8.3.1

[Qemu-devel] [kvm-unit-tests PATCH v12 0/3] ARM PMU tests

2016-11-23 Thread Wei Huang

Changes from v11:
* Use report_info() to report PMU HW related info (implementer, id code, ...)
* Print PMU PMCR info in the same line

Note:
1) Current KVM code has bugs in handling PMCCFILTR write. A fix (see
below) is required for this unit testing code to work correctly under
KVM mode.
https://lists.cs.columbia.edu/pipermail/kvmarm/2016-November/022134.html.

Thanks,
-Wei

Christopher Covington (3):
  arm: Add PMU test
  arm: pmu: Check cycle count increases
  arm: pmu: Add CPI checking

 arm/Makefile.common |   3 +-
 arm/pmu.c   | 350 
 arm/unittests.cfg   |  19 +++
 3 files changed, 371 insertions(+), 1 deletion(-)
 create mode 100644 arm/pmu.c

-- 
1.8.3.1

Re: [Qemu-devel] [PATCH 1/2] virtio-net rsc: support coalescing ipv4 tcp traffic

2016-11-23 Thread Jason Wang




On 2016年11月01日 01:41, w...@redhat.com wrote:

From: Wei Xu 

All the data packets in a tcp connection are cached
in a single buffer during every receive interval and are
sent out via a timer. 'virtio_net_rsc_timeout' controls
the interval; this value may affect the performance and
response time of the tcp connection. 5 (50us) is an
empirical value that gives a performance improvement.
Since the whql test sends packets every 100us, '30' (300us)
passes the test case and is also the default value.
Tune it via the command line parameter 'rsc_interval'
of the 'virtio-net-pci' device; for example,
to launch a guest with the interval set to '50':

'virtio-net-pci,netdev=hostnet1,bus=pci.0,id=net1,mac=00,rsc_interval=50'

The timer will only be triggered if the packets pool is not empty,
and it'll drain off all the cached packets.

'NetRscChain' is used to save the segments of IPv4/6 in a
VirtIONet device.

A new segment becomes a 'Candidate' once it has passed the sanity check.
The main TCP handler covers TCP window updates, duplicate
ACK checks and the actual data coalescing.

A 'Candidate' segment means:
1. Segment is within current window and the sequence is the expected one.
2. 'ACK' of the segment is in the valid window.

Sanity check includes:
1. Incorrect version in IP header
2. IP options or an IP fragment
3. Not a TCP packet
4. Sanity size check to prevent buffer overflow attack.
5. An ECN packet
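
The checks listed above can be sketched roughly as follows. This is a
simplified illustration over a raw IPv4 header, not the code from the patch:
the function name and sample headers are hypothetical, the size check is
reduced to a minimum-length test, and only the IP-level ECN bits are examined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IP4_MIN_HEADER_LEN 20	/* header without options */
#define IP_PROTO_TCP       6

/* Return true if the packet may be considered for coalescing. */
static bool ip4_passes_sanity_check(const uint8_t *ip, size_t len)
{
	uint16_t frag;

	if (len < IP4_MIN_HEADER_LEN)		/* too short to parse safely */
		return false;
	if ((ip[0] >> 4) != 4)			/* wrong IP version */
		return false;
	if ((ip[0] & 0x0f) != 5)		/* IHL != 5: options present */
		return false;
	frag = (uint16_t)((ip[6] << 8) | ip[7]);
	if (frag & 0x3fff)			/* MF flag or fragment offset */
		return false;
	if (ip[9] != IP_PROTO_TCP)		/* not a TCP packet */
		return false;
	if (ip[1] & 0x03)			/* ECN bits set in the TOS byte */
		return false;
	return true;
}

/* Sample headers, for demonstration only (checksums left as zero). */
static const uint8_t sample_plain_tcp[20] = {
	0x45, 0x00, 0x00, 0x28,	/* version 4, IHL 5, TOS 0, length 40 */
	0x00, 0x01, 0x40, 0x00,	/* id 1, DF set, offset 0 */
	0x40, 0x06, 0x00, 0x00,	/* TTL 64, protocol TCP */
	0x0a, 0x00, 0x00, 0x01,	/* source 10.0.0.1 */
	0x0a, 0x00, 0x00, 0x02,	/* destination 10.0.0.2 */
};

static const uint8_t sample_fragment[20] = {
	0x45, 0x00, 0x00, 0x28,
	0x00, 0x01, 0x20, 0x00,	/* MF set: first fragment */
	0x40, 0x06, 0x00, 0x00,
	0x0a, 0x00, 0x00, 0x01,
	0x0a, 0x00, 0x00, 0x02,
};

static const uint8_t sample_ecn[20] = {
	0x45, 0x01, 0x00, 0x28,	/* TOS low bits 01: ECN-capable */
	0x00, 0x01, 0x40, 0x00,
	0x40, 0x06, 0x00, 0x00,
	0x0a, 0x00, 0x00, 0x01,
	0x0a, 0x00, 0x00, 0x02,
};
```

A packet with DF set and no fragment offset passes, while the fragment, the
ECN-marked packet and any truncated header are rejected.
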

There might be more cases that should be considered, such as the
ip identification field and other flags, but checking them breaks the
test because Windows sets the identification to the same value even
when the packet is not a fragment.

Normally there are 2 typical ways to handle a TCP control flag,
'bypass' and 'finalize'. 'bypass' means the packet should be sent out
directly, while 'finalize' means the packet should also be bypassed, but
only after searching the pool for packets of the same connection and
draining all of them, to avoid out-of-order fragments.

All 'SYN' packets are bypassed since they always begin a new
connection. Other flags such as 'URG/FIN/RST/CWR/ECE' trigger a
finalization, because they normally appear when a connection is about
to be closed; a 'URG' packet also finalizes the current coalescing unit.

Statistics can be used to monitor the basic coalescing status; the
'out of order' and 'out of window' counters show how many packets are
retransmitted, and thus describe the performance intuitively.

Signed-off-by: Wei Xu 
---
  hw/net/virtio-net.c | 602 ++--
  include/hw/virtio/virtio-net.h  |   5 +-
  include/hw/virtio/virtio.h  |  76 
  include/net/eth.h   |   2 +
  include/standard-headers/linux/virtio_net.h |  14 +
  net/tap.c   |   3 +-
  6 files changed, 670 insertions(+), 32 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 06bfe4b..d1824d9 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -15,10 +15,12 @@
  #include "qemu/iov.h"
  #include "hw/virtio/virtio.h"
  #include "net/net.h"
+#include "net/eth.h"
  #include "net/checksum.h"
  #include "net/tap.h"
  #include "qemu/error-report.h"
  #include "qemu/timer.h"
+#include "qemu/sockets.h"
  #include "hw/virtio/virtio-net.h"
  #include "net/vhost_net.h"
  #include "hw/virtio/virtio-bus.h"
@@ -43,6 +45,24 @@
  #define endof(container, field) \
  (offsetof(container, field) + sizeof(((container *)0)->field))
  
+#define VIRTIO_NET_IP4_ADDR_SIZE   8 /* ipv4 saddr + daddr */


This is only used once in the code; I don't see much value in this macro.


+
+#define VIRTIO_NET_TCP_FLAG 0x3F
+#define VIRTIO_NET_TCP_HDR_LENGTH   0xF000
+
+/* IPv4 max payload, 16 bits in the header */
+#define VIRTIO_NET_MAX_IP4_PAYLOAD (65535 - sizeof(struct ip_header))
+#define VIRTIO_NET_MAX_TCP_PAYLOAD 65535
+
+/* header length value in ip header without option */
+#define VIRTIO_NET_IP4_HEADER_LENGTH 5
+
+/* Purge coalesced packets timer interval, This value affects the performance
+   a lot, and should be tuned carefully, '30'(300us) is the recommended
+   value to pass the WHQL test, '5' can gain 2x netperf throughput with
+   tso/gso/gro 'off'. */
+#define VIRTIO_NET_RSC_INTERVAL  30


This should be a property of virtio-net, and the above comment can be 
the description of the property.



+
  typedef struct VirtIOFeature {
  uint32_t flags;
  size_t end;
@@ -589,7 +609,12 @@ static uint64_t virtio_net_guest_offloads_by_features(uint32_t features)
  (1ULL << VIRTIO_NET_F_GUEST_ECN)  |
  (1ULL << VIRTIO_NET_F_GUEST_UFO);
  
-return guest_offloads_mask & features;

+if (features & VIRTIO_NET_F_CTRL_GUEST_OFFLOADS) {
+return (guest_offloads_mask & features) |
+   (1ULL << VIRTIO_NET_F_GUEST_RSC4);


Why do we need to care about this? I believe RSC has nothing to do with 
the peer's offload setting.



+} else {
+return guest_offloads_mask & f

[Qemu-devel] [PATCH v2 3/4] spec/vhost-user: add the VHOST_USER_SET_PEER_CONNECTION message

2016-11-23 Thread Wei Wang

The VHOST_USER_SET_PEER_CONNECTION message is introduced to manage the
vhost-pci dataplane connection status. The slave can use the vhost-pci
dataplane to transmit/receive packets to/from the master when the
connection is turned ON, and stops using it when the connection is
turned OFF.

Signed-off-by: Wei Wang 
---
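For illustration only (not part of the patch): the ON/OFF handling described
in the commit message, together with the command values defined in the hunk
below, can be sketched as a small state update. The enum and function names
are hypothetical; only the four command values come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define VHOST_USER_SET_PEER_CONNECTION_F_OFF     0
#define VHOST_USER_SET_PEER_CONNECTION_F_ON      1
#define VHOST_USER_SET_PEER_CONNECTION_F_CREATE  2
#define VHOST_USER_SET_PEER_CONNECTION_F_DESTROY 3

/* Hypothetical representation of the connection status. */
enum peer_conn_status {
	PEER_CONN_INACTIVE,
	PEER_CONN_ACTIVE,
};

/* Update the status after a reply to an ON or OFF command arrives. */
static enum peer_conn_status
peer_conn_on_reply(enum peer_conn_status cur, uint64_t cmd, bool success)
{
	if (!success)
		return cur;	/* a failed request leaves the status as is */

	switch (cmd) {
	case VHOST_USER_SET_PEER_CONNECTION_F_ON:
		return PEER_CONN_ACTIVE;
	case VHOST_USER_SET_PEER_CONNECTION_F_OFF:
		return PEER_CONN_INACTIVE;
	default:
		return cur;	/* CREATE/DESTROY do not change the status */
	}
}

/* The slave may use the vhost-pci dataplane only while active. */
static bool peer_conn_may_use_dataplane(enum peer_conn_status status)
{
	return status == PEER_CONN_ACTIVE;
}

/* DESTROY should only be sent while the connection is inactive. */
static bool peer_conn_may_destroy(enum peer_conn_status status)
{
	return status == PEER_CONN_INACTIVE;
}
```
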
 docs/specs/vhost-user.txt | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 3bbe641..fdc99ea 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -130,12 +130,14 @@ is a list of the ones that do:
  * VHOST_USER_GET_PROTOCOL_FEATURES
  * VHOST_USER_GET_VRING_BASE
  * VHOST_USER_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+ * VHOST_USER_SET_PEER_CONNECTION
 
 [ Also see the section on REPLY_ACK protocol extension. ]
 
 Currently, the communication also supports the slave actively sending messages
 to the master. Here is a list of them:
  * VHOST_USER_SET_FEATURES
+ * VHOST_USER_SET_PEER_CONNECTION
 
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
@@ -479,6 +481,39 @@ Message types
   The first 6 bytes of the payload contain the mac address of the guest to
   allow the vhost user backend to construct and broadcast the fake RARP.
 
+ * VHOST_USER_SET_PEER_CONNECTION
+
+  Id: 20
+  Equivalent ioctl: N/A
+  Master payload: u64
+  Slave payload: u64
+
+  The slave device requests to connect or disconnect to the master device.
+  The master device may request to disconnect to the slave device.
+  This request should be sent only when VHOST_USER_PROTOCOL_F_VHOST_PCI has
+  been negotiated.
+
+  Connection command (ON): If the reply message indicates "success", the
+  connection status is "active", and the slave can start to transmit and
+  receive packets to the master through the vhost-pci dataplane. Replying of
+  this message is asynchronous, because the device needs to talk to the
+  driver first.
+  Disconnection command (OFF): If the reply message indicates "success", the
+  connection status is "inactive". The slave cannot use the vhost-pci
+  dataplane when the connection is "inactive". Replying of this message is
+  asynchronous, because the device needs to talk to the driver first.
+  Creation command (CREATE): Sent by the master to the slave to request for
+  the creation of a slave device. If the reply messages indicates "success",
+  it means that the slave is able to create a slave device for the master.
+  Destroy command (DESTROY): Sent by the master to the slave to request for
+  the destruction of the slave device. This command should only be sent when
+  the connection is "inactive". No reply is needed for this command.
+
+  #define VHOST_USER_SET_PEER_CONNECTION_F_OFF      0
+  #define VHOST_USER_SET_PEER_CONNECTION_F_ON       1
+  #define VHOST_USER_SET_PEER_CONNECTION_F_CREATE   2
+  #define VHOST_USER_SET_PEER_CONNECTION_F_DESTROY  3
+
 VHOST_USER_PROTOCOL_F_REPLY_ACK:
 ---
 The original vhost-user specification only demands replies for certain
-- 
2.7.4

[Qemu-devel] [PATCH v2 4/4] spec/vhost-user: add VHOST_USER_PROTOCOL_F_VERSATILE_SLAVE

2016-11-23 Thread Wei Wang

The VHOST_USER_PROTOCOL_F_VERSATILE_SLAVE protocol feature indicates
that the slave side implementation supports different types of devices.
The master tells the slave what type of device to create by sending
the VHOST_USER_SET_DEV_INFO message.

Signed-off-by: Wei Wang 
---
 docs/specs/vhost-user.txt | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index fdc99ea..da1314d 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -264,11 +264,12 @@ restarted.
 Protocol features
 -
 
-#define VHOST_USER_PROTOCOL_F_MQ 0
-#define VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
-#define VHOST_USER_PROTOCOL_F_RARP   2
-#define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
-#define VHOST_USER_PROTOCOL_F_VHOST_PCI  4
+#define VHOST_USER_PROTOCOL_F_MQ               0
+#define VHOST_USER_PROTOCOL_F_LOG_SHMFD        1
+#define VHOST_USER_PROTOCOL_F_RARP             2
+#define VHOST_USER_PROTOCOL_F_REPLY_ACK        3
+#define VHOST_USER_PROTOCOL_F_VHOST_PCI        4
+#define VHOST_USER_PROTOCOL_F_VERSATILE_SLAVE  5
 
 Message types
 -
@@ -514,6 +515,16 @@ Message types
   #define VHOST_USER_SET_PEER_CONNECTION_F_CREATE   2
   #define VHOST_USER_SET_PEER_CONNECTION_F_DESTROY  3
 
+ * VHOST_USER_SET_DEV_INFO
+
+  Id: 21
+  Equivalent ioctl: N/A
+  Master payload: u64
+
+  The master sends the device type info to the slave.
+  This request should be sent only when VHOST_USER_PROTOCOL_F_VERSATILE_SLAVE
+  has been negotiated.
+
 VHOST_USER_PROTOCOL_F_REPLY_ACK:
 ---
 The original vhost-user specification only demands replies for certain
-- 
2.7.4

[Qemu-devel] [PATCH v2 2/4] spec/vhost-user: extend vhost-user to support the vhost-pci based inter-vm communiaction

2016-11-23 Thread Wei Wang

The protocol feature, VHOST_USER_PROTOCOL_F_VHOST_PCI, indicates
support for vhost-pci. With the vhost-pci extension, the slave may
actively send messages to the master. VHOST_USER_SET_FEATURES is one
example.

To understand this example, let's first understand how
the device feature bits are negotiated between the slave
device/driver and the master device/driver:
1) the master device (e.g. virtio-net) GET_FEATURES from the slave
(assume the feature bits are "f1");
2) the master device negotiates the feature bits with its driver (assume
the device gets "f2" after the negotiation);
3) the master device SET_FEATURES("f2") to the slave;
4) the slave creates a slave device (e.g. vhost-pci-net) with "f2" and
the slave device negotiates "f2" with its driver (assume the device gets
"f3" after the negotiation);
5) the slave _actively_ SET_FEATURES("f3") to the master device;
6) if "f3" != "f2", the master device needs to perform a device reset
and re-negotiate the feature bits with its driver using "f3".

Signed-off-by: Wei Wang 
---
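For illustration only (not part of the patch): the six negotiation steps in
the commit message can be sketched with feature bitmasks. The function names
are hypothetical, and the sketch assumes that a device/driver negotiation
yields the bitwise AND of the offered and driver-supported features; the spec
text does not state how "f2" and "f3" are computed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: the result of a device/driver negotiation is the set of
 * offered feature bits that the driver also supports.
 */
static uint64_t negotiate(uint64_t offered, uint64_t driver_supported)
{
	return offered & driver_supported;
}

/*
 * Step 6: after the slave actively sends SET_FEATURES(f3), the master
 * device has to reset and re-negotiate if f3 differs from f2.
 */
static bool master_needs_reset(uint64_t f2, uint64_t f3)
{
	return f3 != f2;
}
```

For example, with f1 = 0xf and a master driver supporting 0x7, f2 is 0x7. If
the slave driver only supports 0x3, f3 becomes 0x3 and the master must reset;
if the slave driver supports every bit in f2, f3 equals f2 and no reset is
needed.
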
 docs/specs/vhost-user.txt | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index d70bd83..3bbe641 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -17,12 +17,15 @@ The protocol defines 2 sides of the communication, master and slave. Master is
 the application that shares its virtqueues, in our case QEMU. Slave is the
 consumer of the virtqueues.
 
-In the current implementation QEMU is the Master, and the Slave is intended to
+In the traditional implementation QEMU is the master, and the slave is intended to
 be a software Ethernet switch running in user space, such as Snabbswitch.
 
 Master and slave can be either a client (i.e. connecting) or server (listening)
 in the socket communication.
 
+The current vhost-user protocol is extended to support the vhost-pci based inter-VM
+communication. In this case, both the slave and master are QEMU instances.
+
 Message Specification
 -
 
@@ -36,7 +39,7 @@ consists of 3 header fields and a payload:
  * Request: 32-bit type of the request
  * Flags: 32-bit bit field:
- Lower 2 bits are the version (currently 0x01)
-   - Bit 2 is the reply flag - needs to be sent on each reply from the slave
+   - Bit 2 is the reply flag - needs to be sent on each reply
- Bit 3 is the need_reply flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK for
  details.
  * Size - 32-bit size of the payload
@@ -119,9 +122,9 @@ The protocol for vhost-user is based on the existing implementation of vhost
 for the Linux Kernel. Most messages that can be sent via the Unix domain socket
 implementing vhost-user have an equivalent ioctl to the kernel implementation.
 
-The communication consists of master sending message requests and slave sending
-message replies. Most of the requests don't require replies. Here is a list of
-the ones that do:
+Traditionally, the communication consists of master sending message requests and
+slave sending message replies. Most of the requests don't require replies. Here
+is a list of the ones that do:
 
  * VHOST_USER_GET_FEATURES
  * VHOST_USER_GET_PROTOCOL_FEATURES
@@ -130,6 +133,10 @@ the ones that do:
 
 [ Also see the section on REPLY_ACK protocol extension. ]
 
+Currently, the communication also supports the slave actively sending messages
+to the master. Here is a list of them:
+ * VHOST_USER_SET_FEATURES
+
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
@@ -259,6 +266,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
 #define VHOST_USER_PROTOCOL_F_RARP   2
 #define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
+#define VHOST_USER_PROTOCOL_F_VHOST_PCI  4
 
 Message types
 -
@@ -279,6 +287,7 @@ Message types
   Id: 2
   Ioctl: VHOST_SET_FEATURES
   Master payload: u64
+  Slave payload: u64
 
   Enable features in the underlying vhost implementation using a bitmask.
   Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for
-- 
2.7.4

[Qemu-devel] [PATCH v2 1/4] spec/vhost-user: fix the VHOST_USER prefix

2016-11-23 Thread Wei Wang

Signed-off-by: Wei Wang 
---
 docs/specs/vhost-user.txt | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 7890d71..d70bd83 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -123,22 +123,22 @@ The communication consists of master sending message requests and slave sending
 message replies. Most of the requests don't require replies. Here is a list of
 the ones that do:
 
- * VHOST_GET_FEATURES
- * VHOST_GET_PROTOCOL_FEATURES
- * VHOST_GET_VRING_BASE
- * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+ * VHOST_USER_GET_FEATURES
+ * VHOST_USER_GET_PROTOCOL_FEATURES
+ * VHOST_USER_GET_VRING_BASE
+ * VHOST_USER_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 
 [ Also see the section on REPLY_ACK protocol extension. ]
 
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
- * VHOST_SET_MEM_TABLE
- * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
- * VHOST_SET_LOG_FD
- * VHOST_SET_VRING_KICK
- * VHOST_SET_VRING_CALL
- * VHOST_SET_VRING_ERR
+ * VHOST_USER_SET_MEM_TABLE
+ * VHOST_USER_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+ * VHOST_USER_SET_LOG_FD
+ * VHOST_USER_SET_VRING_KICK
+ * VHOST_USER_SET_VRING_CALL
+ * VHOST_USER_SET_VRING_ERR
 
 If Master is unable to send the full message or receives a wrong reply it will
 close the connection. An optional reconnection mechanism can be implemented.
-- 
2.7.4

[Qemu-devel] [PATCH v2 0/4] * vhost-user spec extension for vhost-pci *

2016-11-23 Thread Wei Wang

This spec patch series extends the vhost-user protocol to support the vhost-pci
based inter-VM communication.

v1->v2 changes:
1) start from the simpler case - change "1-slave-N-master" to "1-slave-1-master"
configuration plane. Accordingly, the previous "uuid", "conn_id" are removed;
2) add the _CREATE_ and _DESTROY_ commands to the VHOST_USER_SET_PEER_CONNECTION
message; and
3) fix the VHOST_USER prefix.

Wei Wang (4):
  spec/vhost-user: fix the VHOST_USER prefix
  spec/vhost-user: extend vhost-user to support the vhost-pci based
inter-vm communiaction
  spec/vhost-user: add the VHOST_USER_SET_PEER_CONNECTION message
  spec/vhost-user: add VHOST_USER_PROTOCOL_F_VERSATILE_SLAVE

 docs/specs/vhost-user.txt | 93 +--
 1 file changed, 74 insertions(+), 19 deletions(-)

-- 
2.7.4

Re: [Qemu-devel] -nodefaults and available buses (was Re: [RFC 00/15] qmp: Report supported device types on 'query-machines')

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 03:10:47PM -0200, Eduardo Habkost wrote:
> (CCing the maintainers of the machines that crash when using
> -nodefaults)
> 
> On Tue, Nov 22, 2016 at 08:34:50PM -0200, Eduardo Habkost wrote:
> [...]
> > "default defaults" vs "-nodefault defaults"
> > ---
> > 
> > Two bad news:
> > 
> > 1) We need to differentiate buses created by the machine with
> >"-nodefaults" and buses that are created only without
> >"-nodefaults".
> > 
> > libvirt uses -nodefaults when starting QEMU, so knowing which
> > buses are available when using -nodefaults is more interesting
> > for them.
> > 
> > Other software, on the other hand, might be interested in the
> > results without -nodefaults.
> > 
> > We need to be able to model both cases in the new interface.
> > Suggestions are welcome.
> 
> The good news is that the list is short. The only[1] machines
> where the list of buses seem to change when using -nodefaults
> are:
> 
> * mpc8544ds
> * ppce500
> * mpc8544ds
> * ppce500
> * s390-ccw-virtio-*
> 
> On all cases above, the only difference is that a virtio bus is
> available if not using -nodefaults.

Hrm.. that's odd.  Well, it makes sense for the s390 which has special
virtio arrangements.  However, the others are all embedded ppc
machines, whose virtio should be bog-standard virtio-pci.  I'm
wondering if the addition of the virtio "bus" is a side-effect of the
NIC or storage device created without -nodefaults being virtio.

> Considering that the list is short, I plan to rename
> 'supported-device-types' to 'always-available-buses', and
> document that it will include only the buses that are not
> disabled by -nodefaults.
> 
> [1] I mean, the only ones from the set that don't crash with
> -nodefaults. The ones below could not be tested:
> 
> > 2) A lot of machine-types won't start if using
> >"-nodefaults -machine " without any extra devices or
> >drives.
> > 
> > Lots of machines require some drives or devices to be created
> > (especially ARM machines that require a SD drive to be
> > available).
> > 
> > Some machines will make QEMU exit, some of them simply segfault.
> > I am looking for ways to work around it so we can still validate
> > -nodefaults-based info on the test code.
> 
> The following machines won't work with -nodefaults:
> 
> These make QEMU segfault:
> * cubieboard
> * petalogix-ml605
> * or32-sim
> * virtex-ml507
> * Niagara
> 
> These exit with a "missing SecureDigital device" error:
> * akita
> * borzoi
> * cheetah
> * connex
> * mainstone
> * n800
> * n810
> * spitz
> * sx1
> * sx1-v1
> * terrier
> * tosa
> * verdex
> * z2
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 2/9] target-ppc: Fix xscmpodp and xscmpudp instructions

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 11:10:08AM +0530, Bharata B Rao wrote:
> On Wed, Nov 23, 2016 at 03:01:18PM +1100, David Gibson wrote:
> > On Tue, Nov 22, 2016 at 05:15:58PM +0530, Nikunj A Dadhania wrote:
> > > From: Bharata B Rao 
> > > 
> > > - xscmpodp & xscmpudp are missing flags reset.
> > > - In xscmpodp, VXCC should be set only if VE is 0 for signalling NaN case
> > >   and VXCC should be set by explicitly checking for quiet NaN case.
> > > - Comparison is being done only if the operands are not NaNs. However as
> > >   per ISA, it should be done even when operands are NaNs.
> > 
> > For my interest, can you explain the difference between ordered and
> > unordered comparisons?  I looked at the ISA and mostly just became
> > confused.
> 
> From another section of the same ISA doc, I see this description, which
> makes the distinction between ordered and unordered comparisons a bit
> more clear.
> 
> Unordered:
> 
> "If either of the operands is a NaN, either quiet or signaling,
> then CR field BF and the FPCC are set to reflect unordered. If
> either of the operands is a Signaling NaN, then VXSNAN is set."
> 
> Ordered:
> 
> "If either of the operands is a NaN, either quiet or signaling,
> then CR field BF and the FPCC are set to reflect unordered. If
> either of the operands is a Signaling NaN, then VXSNAN is set
> and, if Invalid Operation is disabled (VE=0), VXVC is set. If
> neither operand is a Signaling NaN but at least one operand is
> a Quiet NaN, then VXVC is set."

Ah, thanks.  So it's basically just the setting of VXVC which differs.
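
To make the difference concrete, here is a minimal sketch of the flag
logic described in the two ISA paragraphs quoted above.  It is not the
QEMU helper: the function name and the SKETCH_* flag values are made up
for illustration, and the NaN classification is assumed to have been
done by the caller.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not the QEMU FPSCR definitions. */
enum {
    SKETCH_VXSNAN = 1u << 0,
    SKETCH_VXVC   = 1u << 1,
};

/*
 * Return the invalid-operation flags raised by a compare with NaN operands.
 * ordered:  true for the ordered compare, false for the unordered one
 * any_snan: at least one operand is a signaling NaN
 * any_qnan: at least one operand is a quiet NaN
 * ve:       Invalid Operation exception enable bit
 */
static unsigned compare_nan_flags(bool ordered, bool any_snan,
                                  bool any_qnan, bool ve)
{
    unsigned flags = 0;

    if (any_snan) {
        flags |= SKETCH_VXSNAN;        /* set by both variants */
        if (ordered && !ve) {
            flags |= SKETCH_VXVC;      /* ordered only, and only if VE=0 */
        }
    } else if (any_qnan && ordered) {
        flags |= SKETCH_VXVC;          /* ordered only */
    }
    return flags;
}
```

With these assumptions the unordered variant never returns SKETCH_VXVC,
which is the only place the two results differ.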

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v2 2/4] target-ppc: Implement bcdctsq. instruction

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 02:21:43PM -0200, Jose Ricardo Ziviani wrote:
> bcdctsq.: Decimal convert to signed quadword. It is possible to
> convert packed decimal values to signed quadwords.
> 
> Signed-off-by: Jose Ricardo Ziviani 

Reviewed-by: David Gibson 

> ---
>  target-ppc/helper.h |  1 +
>  target-ppc/int_helper.c | 40 
> +
>  target-ppc/translate/vmx-impl.inc.c |  7 +++
>  3 files changed, 48 insertions(+)
> 
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 87f533c..503f257 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -383,6 +383,7 @@ DEF_HELPER_3(bcdctn, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdcfz, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdctz, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdcfsq, i32, avr, avr, i32)
> +DEF_HELPER_3(bcdctsq, i32, avr, avr, i32)
>  
>  DEF_HELPER_2(xsadddp, void, env, i32)
>  DEF_HELPER_2(xssubdp, void, env, i32)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index 751909c..ca0d0b8 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -2919,6 +2919,46 @@ uint32_t helper_bcdcfsq(ppc_avr_t *r, ppc_avr_t *b, 
> uint32_t ps)
>  return cr;
>  }
>  
> +uint32_t helper_bcdctsq(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
> +{
> +uint8_t i;
> +int cr;
> +uint64_t carry;
> +uint64_t unused;
> +uint64_t lo_value;
> +uint64_t hi_value = 0;
> +int sgnb = bcd_get_sgn(b);
> +int invalid = (sgnb == 0);
> +
> +lo_value = bcd_get_digit(b, 31, &invalid);
> +for (i = 30; i > 0; i--) {
> +mulu64(&lo_value, &carry, lo_value, 10ULL);
> +mulu64(&hi_value, &unused, hi_value, 10ULL);
> +lo_value += bcd_get_digit(b, i, &invalid);
> +hi_value += carry;
> +
> +if (unlikely(invalid)) {
> +break;
> +}
> +}
> +
> +if (sgnb == -1) {
> +r->s64[LO_IDX] = -lo_value;
> +r->s64[HI_IDX] = ~hi_value + !r->s64[LO_IDX];
> +} else {
> +r->s64[LO_IDX] = lo_value;
> +r->s64[HI_IDX] = hi_value;
> +}
> +
> +cr = bcd_cmp_zero(b);
> +
> +if (unlikely(invalid)) {
> +cr = 1 << CRF_SO;
> +}
> +
> +return cr;
> +}
> +
>  void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
>  {
>  int i;
> diff --git a/target-ppc/translate/vmx-impl.inc.c 
> b/target-ppc/translate/vmx-impl.inc.c
> index 36141e5..1579b58 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -990,10 +990,14 @@ GEN_BCD2(bcdctn)
>  GEN_BCD2(bcdcfz)
>  GEN_BCD2(bcdctz)
>  GEN_BCD2(bcdcfsq)
> +GEN_BCD2(bcdctsq)
>  
>  static void gen_xpnd04_1(DisasContext *ctx)
>  {
>  switch (opc4(ctx->opcode)) {
> +case 0:
> +gen_bcdctsq(ctx);
> +break;
>  case 2:
>  gen_bcdcfsq(ctx);
>  break;
> @@ -1018,6 +1022,9 @@ static void gen_xpnd04_1(DisasContext *ctx)
>  static void gen_xpnd04_2(DisasContext *ctx)
>  {
>  switch (opc4(ctx->opcode)) {
> +case 0:
> +gen_bcdctsq(ctx);
> +break;
>  case 2:
>  gen_bcdcfsq(ctx);
>  break;

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v2 4/4] target-ppc: Implement bcdsetsgn. instruction

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 02:21:45PM -0200, Jose Ricardo Ziviani wrote:
> bcdsetsgn.: Decimal set sign. This instruction copies the register
> value to the result register but adjusts the sign according to
> the preferred sign value.
> 
> Signed-off-by: Jose Ricardo Ziviani 

Reviewed-by: David Gibson 

> ---
>  target-ppc/helper.h |  1 +
>  target-ppc/int_helper.c | 19 +++
>  target-ppc/translate/vmx-impl.inc.c |  8 
>  3 files changed, 28 insertions(+)
> 
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index dada48e..cddac8e 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -385,6 +385,7 @@ DEF_HELPER_3(bcdctz, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdcfsq, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdctsq, i32, avr, avr, i32)
>  DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
> +DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32)
>  
>  DEF_HELPER_2(xsadddp, void, env, i32)
>  DEF_HELPER_2(xssubdp, void, env, i32)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index 833c9d2..db430ef 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -2982,6 +2982,25 @@ uint32_t helper_bcdcpsgn(ppc_avr_t *r, ppc_avr_t *a, 
> ppc_avr_t *b, uint32_t ps)
>  return bcd_cmp_zero(r);
>  }
>  
> +uint32_t helper_bcdsetsgn(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
> +{
> +int i;
> +int invalid = 0;
> +int sgnb = bcd_get_sgn(b);
> +
> +*r = *b;
> +bcd_put_digit(r, bcd_preferred_sgn(sgnb, ps), 0);
> +
> +for (i = 1; i < 32; i++) {
> +bcd_get_digit(b, i, &invalid);
> +if (unlikely(invalid)) {
> +return 1 << CRF_SO;
> +}
> +}
> +
> +return bcd_cmp_zero(r);
> +}
> +
>  void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
>  {
>  int i;
> diff --git a/target-ppc/translate/vmx-impl.inc.c 
> b/target-ppc/translate/vmx-impl.inc.c
> index c14b666..b188e60 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -991,6 +991,7 @@ GEN_BCD2(bcdcfz)
>  GEN_BCD2(bcdctz)
>  GEN_BCD2(bcdcfsq)
>  GEN_BCD2(bcdctsq)
> +GEN_BCD2(bcdsetsgn)
>  GEN_BCD(bcdcpsgn);
>  
>  static void gen_xpnd04_1(DisasContext *ctx)
> @@ -1014,6 +1015,9 @@ static void gen_xpnd04_1(DisasContext *ctx)
>  case 7:
>  gen_bcdcfn(ctx);
>  break;
> +case 31:
> +gen_bcdsetsgn(ctx);
> +break;
>  default:
>  gen_invalid(ctx);
>  break;
> @@ -1038,12 +1042,16 @@ static void gen_xpnd04_2(DisasContext *ctx)
>  case 7:
>  gen_bcdcfn(ctx);
>  break;
> +case 31:
> +gen_bcdsetsgn(ctx);
> +break;
>  default:
>  gen_invalid(ctx);
>  break;
>  }
>  }
>  
> +
>  GEN_VXFORM_DUAL(vsubcuw, PPC_ALTIVEC, PPC_NONE, \
>  xpnd04_1, PPC_NONE, PPC2_ISA300)
>  GEN_VXFORM_DUAL(vsubsws, PPC_ALTIVEC, PPC_NONE, \

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v2 3/4] target-ppc: Implement bcdcpsgn. instruction

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 02:21:44PM -0200, Jose Ricardo Ziviani wrote:
> bcdcpsgn.: Decimal copy sign. Given two registers vra and vrb, it
> copies the vra value with vrb sign to the result register vrt.
> 
> Signed-off-by: Jose Ricardo Ziviani 

Reviewed-by: David Gibson 

> ---
>  target-ppc/helper.h |  1 +
>  target-ppc/int_helper.c | 23 +++
>  target-ppc/translate/vmx-impl.inc.c |  3 +++
>  target-ppc/translate/vmx-ops.inc.c  |  2 +-
>  4 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index 503f257..dada48e 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -384,6 +384,7 @@ DEF_HELPER_3(bcdcfz, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdctz, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdcfsq, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdctsq, i32, avr, avr, i32)
> +DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32)
>  
>  DEF_HELPER_2(xsadddp, void, env, i32)
>  DEF_HELPER_2(xssubdp, void, env, i32)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index ca0d0b8..833c9d2 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -2959,6 +2959,29 @@ uint32_t helper_bcdctsq(ppc_avr_t *r, ppc_avr_t *b, 
> uint32_t ps)
>  return cr;
>  }
>  
> +uint32_t helper_bcdcpsgn(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t 
> ps)
> +{
> +int i;
> +int invalid = 0;
> +
> +if (bcd_get_sgn(a) == 0 || bcd_get_sgn(b) == 0) {
> +return 1 << CRF_SO;
> +}
> +
> +*r = *a;
> +bcd_put_digit(r, b->u8[BCD_DIG_BYTE(0)] & 0xF, 0);
> +
> +for (i = 1; i < 32; i++) {
> +bcd_get_digit(a, i, &invalid);
> +bcd_get_digit(b, i, &invalid);
> +if (unlikely(invalid)) {
> +return 1 << CRF_SO;
> +}
> +}
> +
> +return bcd_cmp_zero(r);
> +}
> +
>  void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
>  {
>  int i;
> diff --git a/target-ppc/translate/vmx-impl.inc.c 
> b/target-ppc/translate/vmx-impl.inc.c
> index 1579b58..c14b666 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -991,6 +991,7 @@ GEN_BCD2(bcdcfz)
>  GEN_BCD2(bcdctz)
>  GEN_BCD2(bcdcfsq)
>  GEN_BCD2(bcdctsq)
> +GEN_BCD(bcdcpsgn);
>  
>  static void gen_xpnd04_1(DisasContext *ctx)
>  {
> @@ -1056,6 +1057,8 @@ GEN_VXFORM_DUAL(vsubuhm, PPC_ALTIVEC, PPC_NONE, \
>  bcdsub, PPC_NONE, PPC2_ALTIVEC_207)
>  GEN_VXFORM_DUAL(vsubuhs, PPC_ALTIVEC, PPC_NONE, \
>  bcdsub, PPC_NONE, PPC2_ALTIVEC_207)
> +GEN_VXFORM_DUAL(vaddshs, PPC_ALTIVEC, PPC_NONE, \
> +bcdcpsgn, PPC_NONE, PPC2_ISA300)
>  
>  static void gen_vsbox(DisasContext *ctx)
>  {
> diff --git a/target-ppc/translate/vmx-ops.inc.c 
> b/target-ppc/translate/vmx-ops.inc.c
> index f02b3be..70d7d2b 100644
> --- a/target-ppc/translate/vmx-ops.inc.c
> +++ b/target-ppc/translate/vmx-ops.inc.c
> @@ -131,7 +131,7 @@ GEN_VXFORM_DUAL(vaddubs, vmul10uq, 0, 8, PPC_ALTIVEC, 
> PPC_NONE),
>  GEN_VXFORM_DUAL(vadduhs, vmul10euq, 0, 9, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM(vadduws, 0, 10),
>  GEN_VXFORM(vaddsbs, 0, 12),
> -GEN_VXFORM(vaddshs, 0, 13),
> +GEN_VXFORM_DUAL(vaddshs, bcdcpsgn, 0, 13, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM(vaddsws, 0, 14),
>  GEN_VXFORM_DUAL(vsububs, bcdadd, 0, 24, PPC_ALTIVEC, PPC_NONE),
>  GEN_VXFORM_DUAL(vsubuhs, bcdsub, 0, 25, PPC_ALTIVEC, PPC_NONE),

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v2 0/4] POWER9 TCG enablements - BCD functions part II

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 02:21:41PM -0200, Jose Ricardo Ziviani wrote:
> v2:
>  - use div128 and mul64 functions to make code easier to understand
>  - fixed int128 neg
>  - improved functions bcdcpsgn and bcdsetsgn to do less work
>than necessary
>  - rebased on ppc-for-2.9
> 
> This series contains 4 new instructions for POWER9 ISA3.0
> 
> bcdcfsq.: Convert signed quadword to packed BCD
> bcdctsq.: Convert packed BCD to signed quadword
> bcdcpsgn.: Copy the sign of a register to another
> bcdsetsgn.: Set the BCD sign according to a preferred sign

Patch 1/4 has some problems, see comments.

Patches 2..4/4 look ok - except that they'll need to be updated for
the recent change I merged from Nikunj (in ppc-for-2.9) which changes
the meaning of CRF_*.

> 
> Jose Ricardo Ziviani (4):
>   target-ppc: Implement bcdcfsq. instruction
>   target-ppc: Implement bcdctsq. instruction
>   target-ppc: Implement bcdcpsgn. instruction
>   target-ppc: Implement bcdsetsgn. instruction
> 
>  target-ppc/helper.h |   4 ++
>  target-ppc/int_helper.c | 127 
> 
>  target-ppc/translate/vmx-impl.inc.c |  25 +++
>  target-ppc/translate/vmx-ops.inc.c  |   2 +-
>  4 files changed, 157 insertions(+), 1 deletion(-)
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v1 ppc-for-2.9 00/10] POWER9 TCG enablements - part8

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 05:07:09PM +0530, Nikunj A Dadhania wrote:
> This series contains 18 new instructions for POWER9 ISA3.0
> Vector Extract Left/Right Indexed
> VSX Scalar Compare Exponents
> VSX Scalar Compare Quad-Precision
> Load/Store VSX Vector 
> Load/Store VSX Scalar
> 
> Changelog:
> v0:
> * Change dq/ds-form decoding for primary opcode 0x3D
> * Rename CR Field defines, as at every place it was
>   using bit shifts.
> * Use symbolic constants in xscmp*
> * Fix bug in exception handling for QNaN
> * Define EXTRACT128 within CONFIG_INT128

I've applied patches 1..8 to ppc-for-2.9.  Patches 9 & 10 can still do
with some improvement, I think.

> 
> Patches
> ===
> 01-03: Consolidation/Fixes
>04: 
>   xscmpexpdp: VSX Scalar Compare Exponents Double-Precision
>   xscmpexpqp: VSX Scalar Compare Exponents Quad-Precision
>05:
>   xscmpoqp: VSX Scalar Compare Ordered Quad-Precision
>   xscmpuqp: VSX Scalar Compare Unordered Quad-Precision
>06:
>   lxsd:  Load VSX Scalar Dword
>   lxssp: Load VSX Scalar Single Precision
>07:
>   stxsd:  Store VSX Scalar Dword
>   stxssp: Store VSX Scalar Single Precision
>08:
>   lxv:   Load VSX Vector
>   lxvx:  Load VSX Vector Indexed
>   stxv:  Store VSX Vector
>   stxvx: Store VSX Vector Indexed
>09: 
>   vextublx:  Vector Extract Unsigned Byte Left
>   vextuhlx:  Vector Extract Unsigned Halfword Left
>   vextuwlx:  Vector Extract Unsigned Word Left
>10: 
>   vextubrx: Vector Extract Unsigned Byte Right-Indexed
>   vextuhrx: Vector Extract Unsigned  Halfword Right-Indexed
>   vextuwrx: Vector Extract Unsigned Word Right-Indexed
> 
> Avinesh Kumar (1):
>   target-ppc: add vextu[bhw]lx instructions
> 
> Bharata B Rao (4):
>   target-ppc: Consolidate instruction decode helpers
>   target-ppc: Fix xscmpodp and xscmpudp instructions
>   target-ppc: Add xscmpexp[dp,qp] instructions
>   target-ppc: Add xscmpoqp and xscmpuqp instructions
> 
> Hariharan T.S (1):
>   target-ppc: add vextu[bhw]rx instructions
> 
> Nikunj A Dadhania (4):
>   target-ppc: rename CRF_* defines as CRF_*_BIT
>   target-ppc: implement lxsd and lxssp instructions
>   target-ppc: implement stxsd and stxssp
>   target-ppc: implement lxv/lxvx and stxv/stxvx
> 
>  target-ppc/cpu.h|  21 ++--
>  target-ppc/fpu_helper.c | 169 ++
>  target-ppc/helper.h |  10 ++
>  target-ppc/int_helper.c | 155 +---
>  target-ppc/internal.h   | 152 
>  target-ppc/translate.c  | 230 
> +++-
>  target-ppc/translate/fp-ops.inc.c   |   2 -
>  target-ppc/translate/vmx-impl.inc.c |  23 
>  target-ppc/translate/vmx-ops.inc.c  |   8 +-
>  target-ppc/translate/vsx-impl.inc.c |  96 +++
>  target-ppc/translate/vsx-ops.inc.c  |  10 ++
>  11 files changed, 666 insertions(+), 210 deletions(-)
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:
> From: Avinesh Kumar 
> 
> vextublx:  Vector Extract Unsigned Byte Left
> vextuhlx:  Vector Extract Unsigned Halfword Left
> vextuwlx:  Vector Extract Unsigned Word Left
> 
> Signed-off-by: Avinesh Kumar 
> Signed-off-by: Nikunj A Dadhania 

So, when I suggested doing these without helpers before, I had
forgotten that the non-byte versions can straddle the word boundary.
Given that the offset is in a register, not in the instruction, that
does make it complicated.

But this version also relies on working 128-bit arithmetic; AFAICT
this will just fail to build if CONFIG_INT128 isn't defined.  It
really shouldn't be that hard to make a helper that works just in
terms of 64-bit arithmetic - there are only 3 cases (all in the upper
word, all in the lower, and straddling).  I'd prefer to see it done
that way, rather than increasing reliance on CONFIG_INT128.
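
A rough sketch of what the three cases could look like, using only
64-bit arithmetic.  This is not the proposed helper: the function name,
the hi/lo argument split and the byte numbering (index 0 is the
leftmost byte of the 128-bit value) are assumptions made for this
example, and out-of-range indexes are simply rejected with an assert.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract `size` bytes (1, 2 or 4) starting `index` bytes from the left
 * of the 128-bit value hi:lo, returned right-justified.
 */
static uint64_t sketch_extract_left(uint64_t hi, uint64_t lo,
                                    unsigned index, unsigned size)
{
    uint64_t mask;

    assert(size == 1 || size == 2 || size == 4);
    assert(index + size <= 16);
    mask = (1ULL << (size * 8)) - 1;

    if (index + size <= 8) {
        /* Case 1: entirely within the upper doubleword. */
        return (hi >> ((8 - index - size) * 8)) & mask;
    } else if (index >= 8) {
        /* Case 2: entirely within the lower doubleword. */
        return (lo >> ((16 - index - size) * 8)) & mask;
    } else {
        /* Case 3: straddling; 1..3 bytes come from each doubleword. */
        unsigned lo_bytes = index + size - 8;
        unsigned hi_bytes = size - lo_bytes;
        uint64_t top = hi & ((1ULL << (hi_bytes * 8)) - 1);
        uint64_t bottom = lo >> ((8 - lo_bytes) * 8);

        return ((top << (lo_bytes * 8)) | bottom) & mask;
    }
}
```

None of the shifts can reach 64 bits, because size is at most 4 and the
straddling case takes at most 3 bytes from either doubleword.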

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


Re: [Qemu-devel] [PATCH v2 1/4] target-ppc: Implement bcdcfsq. instruction

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 02:21:42PM -0200, Jose Ricardo Ziviani wrote:
> bcdcfsq.: Decimal convert from signed quadword. It is not possible
> to convert values less than 10^31-1 or greater than -10^31-1 to be
> represented in packed decimal format.

You have your less than / greater than the wrong way around in the
above.

> 
> Signed-off-by: Jose Ricardo Ziviani 
> ---
>  target-ppc/helper.h |  1 +
>  target-ppc/int_helper.c | 45 
> +
>  target-ppc/translate/vmx-impl.inc.c |  7 ++
>  3 files changed, 53 insertions(+)
> 
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index da00f0a..87f533c 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -382,6 +382,7 @@ DEF_HELPER_3(bcdcfn, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdctn, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdcfz, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdctz, i32, avr, avr, i32)
> +DEF_HELPER_3(bcdcfsq, i32, avr, avr, i32)
>  
>  DEF_HELPER_2(xsadddp, void, env, i32)
>  DEF_HELPER_2(xssubdp, void, env, i32)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index 8886a72..751909c 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -2874,6 +2874,51 @@ uint32_t helper_bcdctz(ppc_avr_t *r, ppc_avr_t *b, 
> uint32_t ps)
>  return cr;
>  }
>  
> +uint32_t helper_bcdcfsq(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
> +{
> +int cr;
> +int i;
> +int ox_flag = 0;
> +uint64_t lo_value;
> +uint64_t hi_value;
> +uint64_t max = 0x38d7ea4c68000;

In this case it would be clearer what's going on if you gave this
constant in decimal. "max" is also not a great name - see below.

> +ppc_avr_t ret = { .u64 = { 0, 0 } };
> +
> +if (b->s64[HI_IDX] < 0) {
> +lo_value = -b->s64[LO_IDX];
> +hi_value = ~b->u64[HI_IDX] + !lo_value;
> +bcd_put_digit(&ret, 0xD, 0);
> +} else {
> +lo_value = b->u64[LO_IDX];
> +hi_value = b->u64[HI_IDX];
> +bcd_put_digit(&ret, bcd_preferred_sgn(0, ps), 0);
> +}
> +
> +if (divu128(&lo_value, &hi_value, max)) {
> +ox_flag = 1;
> +} else if (lo_value >= max && hi_value == 0) {

This isn't right.  max == 10^15, but in fact the dividend can safely
be up to 10^16 - 1.  I don't see what checking the remainder against 0
has to do with anything, either.  No overflow + (dividend < 10^16)
should be sufficient.

> +ox_flag = 1;
> +}
> +
> +for (i = 1; hi_value; hi_value /= 10, i++) {
> +bcd_put_digit(&ret, hi_value % 10, i);
> +}
> +
> +for (; lo_value; lo_value /= 10, i++) {
> +bcd_put_digit(&ret, lo_value % 10, i);
> +}
> +
> +cr = bcd_cmp_zero(&ret);
> +
> +if (unlikely(ox_flag)) {
> +cr |= 1 << CRF_SO;

Since you posted I've merged a patch from Nikunj which changes the
meaning of these CRF_* flags to be bit masks instead of shifts.  So
this will need to become just
cr |= CRF_SO;

> +}
> +
> +*r = ret;
> +
> +return cr;
> +}
> +
>  void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
>  {
>  int i;
> diff --git a/target-ppc/translate/vmx-impl.inc.c 
> b/target-ppc/translate/vmx-impl.inc.c
> index 7143eb3..36141e5 100644
> --- a/target-ppc/translate/vmx-impl.inc.c
> +++ b/target-ppc/translate/vmx-impl.inc.c
> @@ -989,10 +989,14 @@ GEN_BCD2(bcdcfn)
>  GEN_BCD2(bcdctn)
>  GEN_BCD2(bcdcfz)
>  GEN_BCD2(bcdctz)
> +GEN_BCD2(bcdcfsq)
>  
>  static void gen_xpnd04_1(DisasContext *ctx)
>  {
>  switch (opc4(ctx->opcode)) {
> +case 2:
> +gen_bcdcfsq(ctx);
> +break;
>  case 4:
>  gen_bcdctz(ctx);
>  break;
> @@ -1014,6 +1018,9 @@ static void gen_xpnd04_1(DisasContext *ctx)
>  static void gen_xpnd04_2(DisasContext *ctx)
>  {
>  switch (opc4(ctx->opcode)) {
> +case 2:
> +gen_bcdcfsq(ctx);
> +break;
>  case 4:
>  gen_bcdctz(ctx);
>  break;

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v2 1/4] target-ppc: Implement bcdcfsq. instruction

2016-11-23 Thread David Gibson

On Thu, Nov 24, 2016 at 01:43:18AM +0100, Richard Henderson wrote:
> On 11/23/2016 05:21 PM, Jose Ricardo Ziviani wrote:
> > bcdcfsq.: Decimal convert from signed quadword. It is not possible
> > to convert values less than 10^31-1 or greater than -10^31-1 to be
> > represented in packed decimal format.
> > 
> > Signed-off-by: Jose Ricardo Ziviani 
> > ---
> >  target-ppc/helper.h |  1 +
> >  target-ppc/int_helper.c | 45 
> > +
> >  target-ppc/translate/vmx-impl.inc.c |  7 ++
> >  3 files changed, 53 insertions(+)
> > 
> > diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> > index da00f0a..87f533c 100644
> > --- a/target-ppc/helper.h
> > +++ b/target-ppc/helper.h
> > @@ -382,6 +382,7 @@ DEF_HELPER_3(bcdcfn, i32, avr, avr, i32)
> >  DEF_HELPER_3(bcdctn, i32, avr, avr, i32)
> >  DEF_HELPER_3(bcdcfz, i32, avr, avr, i32)
> >  DEF_HELPER_3(bcdctz, i32, avr, avr, i32)
> > +DEF_HELPER_3(bcdcfsq, i32, avr, avr, i32)
> > 
> >  DEF_HELPER_2(xsadddp, void, env, i32)
> >  DEF_HELPER_2(xssubdp, void, env, i32)
> > diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> > index 8886a72..751909c 100644
> > --- a/target-ppc/int_helper.c
> > +++ b/target-ppc/int_helper.c
> > @@ -2874,6 +2874,51 @@ uint32_t helper_bcdctz(ppc_avr_t *r, ppc_avr_t *b, 
> > uint32_t ps)
> >  return cr;
> >  }
> > 
> > +uint32_t helper_bcdcfsq(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
> > +{
> > +int cr;
> > +int i;
> > +int ox_flag = 0;
> > +uint64_t lo_value;
> > +uint64_t hi_value;
> > +uint64_t max = 0x38d7ea4c68000;
> 
> This is at heart a decimal number, and should be written as such.
> Also, you need ULL for a 32-bit host compile.
> 
> > +if (divu128(&lo_value, &hi_value, max)) {
> > +ox_flag = 1;
> > +} else if (lo_value >= max && hi_value == 0) {
> > +ox_flag = 1;
> > +}
> 
> Dispense with ox_flag and set cr = CRF_SO now.
> 
> > +for (i = 1; hi_value; hi_value /= 10, i++) {
> > +bcd_put_digit(&ret, hi_value % 10, i);
> > +}
> > +
> > +for (; lo_value; lo_value /= 10, i++) {
> > +bcd_put_digit(&ret, lo_value % 10, i);
> > +}
> 
> How can this possibly work?  You know there are 15 digits between high and
> low, but you continue with i++?
>
> If hi_value == 1 && lo_value == 1, this should not produce 11, but
> 10001.

Ah, yes good catch, I missed that.  I think it will be clearer to use
fixed bounds on each loop, rather than testing the value of the
remaining result.
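
Something along these lines, as a sketch only.  It assumes the 128-bit
magnitude has already been split into value / 10^15 and value % 10^15;
the function name, the plain digits[] array (standing in for
bcd_put_digit()) and the two constants are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOW_DIGITS   15   /* digits carried by the remainder (value % 10^15) */
#define TOTAL_DIGITS 31   /* packed BCD holds 31 digits plus a sign nibble */

/*
 * digits[1..31] receive the decimal digits, least significant first;
 * digits[0] is left alone for the sign nibble.  Returns true when the
 * value does not fit in 31 digits.
 */
static bool sketch_to_digits(uint64_t quotient, uint64_t remainder,
                             uint8_t digits[32])
{
    int i;

    /* Always emit exactly 15 digits for the low part, zeros included. */
    for (i = 1; i <= LOW_DIGITS; i++) {
        digits[i] = remainder % 10;
        remainder /= 10;
    }
    /* The high part then always starts at digit 16. */
    for (; i <= TOTAL_DIGITS; i++) {
        digits[i] = quotient % 10;
        quotient /= 10;
    }
    /* Anything left over means the quotient was >= 10^16. */
    return quotient != 0 || remainder != 0;
}
```

Because both loops run a fixed number of times, a quotient of 1 and a
remainder of 1 land in digits 16 and 1 respectively instead of next to
each other, and the overflow test falls out of the second loop.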

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v1 02/10] target-ppc: rename CRF_* defines as CRF_*_BIT

2016-11-23 Thread David Gibson

On Wed, Nov 23, 2016 at 05:07:11PM +0530, Nikunj A Dadhania wrote:
> Add _BIT to CRF_[GT,LT,EQ_SO] and introduce CRF_[GT,LT,EQ,SO] for usage
> without shifts in the code. This would simplify the code.
> 
> Signed-off-by: Nikunj A Dadhania 

Nice!
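
For anyone rebasing on top of this, the shape of the change is roughly
the following.  This is a sketch: the bit positions shown are an
assumption for illustration and should be checked against
target-ppc/cpu.h, and sketch_set_overflow() is a made-up example
function.

```c
#include <stdint.h>

/* Assumed bit positions within a 4-bit CR field (illustrative only). */
#define CRF_LT_BIT 3
#define CRF_GT_BIT 2
#define CRF_EQ_BIT 1
#define CRF_SO_BIT 0

/* Masks derived from the bit positions, usable without shifts. */
#define CRF_LT (1 << CRF_LT_BIT)
#define CRF_GT (1 << CRF_GT_BIT)
#define CRF_EQ (1 << CRF_EQ_BIT)
#define CRF_SO (1 << CRF_SO_BIT)

/* Old style was "cr |= 1 << CRF_SO"; with masks it is a plain OR. */
static uint32_t sketch_set_overflow(uint32_t cr)
{
    return cr | CRF_SO;
}
```

Code that still shifts by the un-suffixed name would compile but set
the wrong bit, so in-flight patches need the explicit update.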

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH for-2.8] target-m68k: Fix cmpa operand size

2016-11-23 Thread Richard Henderson


On 11/23/2016 09:55 PM, Laurent Vivier wrote:
> "The size of the operation can be specified as word or long.
> Word length source operands are sign-extended to 32 bits for
> comparison."
>
> So comparison is always done using OS_LONG.
>
> Signed-off-by: Laurent Vivier 
> ---
>  target-m68k/translate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v2 1/4] target-ppc: Implement bcdcfsq. instruction

2016-11-23 Thread Richard Henderson


On 11/23/2016 05:21 PM, Jose Ricardo Ziviani wrote:
> bcdcfsq.: Decimal convert from signed quadword. It is not possible
> to convert values less than 10^31-1 or greater than -10^31-1 to be
> represented in packed decimal format.
>
> Signed-off-by: Jose Ricardo Ziviani 
> ---
>  target-ppc/helper.h |  1 +
>  target-ppc/int_helper.c | 45 +
>  target-ppc/translate/vmx-impl.inc.c |  7 ++
>  3 files changed, 53 insertions(+)
>
> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> index da00f0a..87f533c 100644
> --- a/target-ppc/helper.h
> +++ b/target-ppc/helper.h
> @@ -382,6 +382,7 @@ DEF_HELPER_3(bcdcfn, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdctn, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdcfz, i32, avr, avr, i32)
>  DEF_HELPER_3(bcdctz, i32, avr, avr, i32)
> +DEF_HELPER_3(bcdcfsq, i32, avr, avr, i32)
>
>  DEF_HELPER_2(xsadddp, void, env, i32)
>  DEF_HELPER_2(xssubdp, void, env, i32)
> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> index 8886a72..751909c 100644
> --- a/target-ppc/int_helper.c
> +++ b/target-ppc/int_helper.c
> @@ -2874,6 +2874,51 @@ uint32_t helper_bcdctz(ppc_avr_t *r, ppc_avr_t *b, 
> uint32_t ps)
>  return cr;
>  }
>
> +uint32_t helper_bcdcfsq(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
> +{
> +int cr;
> +int i;
> +int ox_flag = 0;
> +uint64_t lo_value;
> +uint64_t hi_value;
> +uint64_t max = 0x38d7ea4c68000;

This is at heart a decimal number, and should be written as such.
Also, you need ULL for a 32-bit host compile.

> +if (divu128(&lo_value, &hi_value, max)) {
> +ox_flag = 1;
> +} else if (lo_value >= max && hi_value == 0) {
> +ox_flag = 1;
> +}

Dispense with ox_flag and set cr = CRF_SO now.

> +for (i = 1; hi_value; hi_value /= 10, i++) {
> +bcd_put_digit(&ret, hi_value % 10, i);
> +}
> +
> +for (; lo_value; lo_value /= 10, i++) {
> +bcd_put_digit(&ret, lo_value % 10, i);
> +}

How can this possibly work?  You know there are 15 digits between high and low, 
but you continue with i++?

If hi_value == 1 && lo_value == 1, this should not produce 11, but 
10001.



r~

Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI

2016-11-23 Thread Kevin O'Connor

On Thu, Nov 24, 2016 at 01:01:58AM +0100, Laszlo Ersek wrote:
> CC Jordan & Mike
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
> > 
> > 
> > On 18/11/2016 11:36, Laszlo Ersek wrote:
> >> This is v3 of the series, with updates based on the v2 discussion:
> >> .
> >>
> >> I've added feature negotiation via the APM_STS ("scratchpad") register.
> >> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
> >>
> >> Tested with new OVMF patches (about to send out those as well).
> >> Regression tested with SeaBIOS (beyond simple functional tests with
> >> maximum SeaBIOS logging enabled, I used gdb to step through the new
> >> ich9_apm_status_changed() callback to see if it was behaving compatibly
> >> with SeaBIOS).
> >>
> >> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
> >> crashes very quickly for me when running OVMF:
> >>
> >>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
> >>
> >> It is my understanding that there are patches on the list for this:
> >>
> >>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
> >>
> >> Anyway, the series rebases to v2.8.0-rc0 without as much as context
> >> differences.
> > 
> > Hi Laszlo,
> > 
> > sorry for the slightly delayed reply.
> > 
> > First of all, I'm wondering if we would be better off adding a new port
> > 0xB1 that is QEMU-specific, instead of reusing 0xB3.
> 
> Sure, I can look into that, if we agree that's the best way to proceed,
> for now. (Although I'm not really happy about the new memory region
> stuff it would require. :()
> 
> I CC'd Kevin to learn if he foresaw other uses for the APM_STS register
> in SeaBIOS.

I don't foresee further use of APM_STS in SeaBIOS.  The SMM code in
SeaBIOS is specific to QEMU anyway.  Also, the current use of APM_STS
is so trivial, we could easily remove it from SeaBIOS in a future
release (were that desirable).

As a general comment - it does seem unfortunate that we keep building
ad hoc interfaces to communicate information from firmware to QEMU.  We
have a generic mechanism (fw_cfg) for passing ad hoc information from
QEMU to the firmware, but the inverse seems to always involve magic
PCI registers, magic I/O space registers, specific init ordering, etc.

That said, I don't object to your proposal.

-Kevin

Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI

2016-11-23 Thread Laszlo Ersek

On 11/24/16 01:01, Laszlo Ersek wrote:
> CC Jordan & Mike
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
>>
>>
>> On 18/11/2016 11:36, Laszlo Ersek wrote:
>>> This is v3 of the series, with updates based on the v2 discussion:
>>> .
>>>
>>> I've added feature negotiation via the APM_STS ("scratchpad") register.
>>> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
>>>
>>> Tested with new OVMF patches (about to send out those as well).
>>> Regression tested with SeaBIOS (beyond simple functional tests with
>>> maximum SeaBIOS logging enabled, I used gdb to step through the new
>>> ich9_apm_status_changed() callback to see if it was behaving compatibly
>>> with SeaBIOS).
>>>
>>> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
>>> crashes very quickly for me when running OVMF:
>>>
>>>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
>>>
>>> It is my understanding that there are patches on the list for this:
>>>
>>>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
>>>
>>> Anyway, the series rebases to v2.8.0-rc0 without as much as context
>>> differences.
>>
>> Hi Laszlo,
>>
>> sorry for the slightly delayed reply.
>>
>> First of all, I'm wondering if we would be better off adding a new port
>> 0xB1 that is QEMU-specific, instead of reusing 0xB3.
> 
> Sure, I can look into that, if we agree that's the best way to proceed,
> for now. (Although I'm not really happy about the new memory region
> stuff it would require. :()
> 
> I CC'd Kevin to learn if he foresaw other uses for the APM_STS register
> in SeaBIOS.
> 
> BTW, what happens in QEMU if the guest reads an unimplemented port?
> Hm... unassigned_io_write() seems to be a no-op, and
> unassigned_io_read() returns all-bits-one. This means that for a new
> port, the negotiation protocol / values have to be reworked.
> 
> Port 0xB1 is occupied by ICH9 according to the spec:
> 
>   Table 9-2. Fixed I/O Ranges Decoded by Intel ® ICH9 (Sheet 2 of 2)
> 
>   I/O
>   Address  Read Target           Write Target          Internal Unit
>   -------  --------------------  --------------------  -------------
>   B0h–B1h  Interrupt Controller  Interrupt Controller  Interrupt
> 
> I wonder if we care -- after all, APM_STS (0xB3) is documented not to
> have any hardware effects ("scratchpad register").
> 
>> Second, I now remembered the reason why I was against broadcast SMI.
>> The reason is that it breaks hot-plug.
> 
> How does it break hot-plug? After reading your explanation below: is it
> that the broadcast SMI (possibly raised by the OS directly) would get to
> the new VCPU before the firmware relocated its SMBASE?
> 
>>
>> On hot-plug, the firmware (if it wants to use SMI for anything secure)
>> must relocate the SMBASE of the newly-hotplugged CPU before the OS has
>> any chance to put its fangs on it.  This is because the OS can send
>> direct SMIs and is in control of the area at 0x3.  So OVMF is
>> already broken with respect to hotplug,
> 
> Yes. We theorized that there could be further edk2 core modules that
> hadn't been open sourced yet, necessary for handling VCPU hotplug.
> 
>> but I am not yet sure if these
>> patches would break it further.
> 
> Hard to say without seeing those modules.
> 
> I will speculate: assuming that the non-public modules fit together with
> the public modules in some way, I expect the broadcast SMI shouldn't
> break them. The reason is that the broadcast SMI / traditional delivery
> are the default method in UefiCpuPkg, and in practice they work better
> (more reliably) with the rest of the edk2 infrastructure, in my
> experience, than the relaxed sync method.
> 
> In his review today, Jordan wrote
> ,
> 
>> I'm glad we'll be using a mechanism that broadcasts to all the
>> processors like the real hardware. It is a bit unfortunate that it
>> doesn't go through the b2 port for it.
> 
> If broadcast is how real hardware does it (even by default!,
> apparently), I expect those as-yet unreleased, hotplug-handling modules
> in edk2 should be able to cope with broadcast.
> 
>> The full solution is to follow a protocol similar to what real hardware
>> does.
> 
> Real hardware seems to use broadcast, according to the above...
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
> 
>> On hot-plug, before the new CPU starts running, the DSDT should
>> generate an SMI (with a well-known value written to 0xB2).
> 
> I'm not sure I understand right. If it is the DSDT that writes to 0xB2,
> that's just another way to say, "the firmware vendor asks the operating
> system to write to 0xB2". If the malicious OS is smart enough, it can
> realize (from the hardware signal to run the ACPI GPE handler, IIRC)
> that a new VCPU is available, and simply not trigger the SMI.
> 
>> The handler
>> then:
>>
>> 1) waits for SMM rendezvous
>>
>> 2) unparks the hotplugged VCPU.  U

Re: [Qemu-devel] [PATCH 4/4] arm: Add an RX8900 RTC to the ASpeed board

2016-11-23 Thread Andrew Jeffery

On Wed, 2016-11-23 at 09:48 +0100, Cédric Le Goater wrote:
> On 11/23/2016 01:46 AM, Alastair D'Silva wrote:
> > On Tue, 2016-11-22 at 17:56 +0100, Cédric Le Goater wrote:
> > > On 11/17/2016 05:36 AM, Alastair D'Silva wrote:
> > > > 
> > > > From: Alastair D'Silva 
> > > > 
> > > > Connect an RX8900 RTC to i2c12 of the AST2500 SOC at address 0x32
> > > 
> > > If this is a board device, we should include it under a machine
> > > routine.
> > > 
> > > Is that for the palmetto ? The ast2500 does not have a RTC.
> > > 
> > > Thanks,
> > > 
> > > C. 
> > 
> > Ok
> 
>  
> I suppose we could change aspeed_board_init() to return 
> a AspeedSoCState* and use the soc object in the specific 
> _init routines to add devices. 
> 
> Andrew, what is your opinion on that ? 

I see the I2C bus configuration as a declarative problem. In a similar
vein we already have AspeedBoardConfig, so I think we should try to
describe the buses and attached devices there. That way we can have a
generic aspeed_i2c_bus_init() routine that we call inside
aspeed_board_init().

This would avoid encoding the buses and their slaves in the
board-specific init code.
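
As a rough illustration of the declarative idea (a sketch under
assumptions, not existing QEMU code): the struct names, the "rx8900" type
string and the attach callback below are all hypothetical. Only the bus
number 12 and the address 0x32 come from the patch description. The board
config lists its I2C devices as data, and one generic routine walks the
table:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one device attached to an I2C bus. */
typedef struct I2CDeviceDesc {
    int bus;            /* index of the I2C controller bus */
    uint8_t addr;       /* 7-bit slave address */
    const char *type;   /* device model name (assumed string) */
} I2CDeviceDesc;

/* Hypothetical per-board table, as it might sit next to AspeedBoardConfig. */
typedef struct BoardI2CConfig {
    const I2CDeviceDesc *devs;
    size_t num_devs;
} BoardI2CConfig;

static const I2CDeviceDesc example_devs[] = {
    { 12, 0x32, "rx8900" },   /* RTC on i2c12 at address 0x32 */
};

static const BoardI2CConfig example_board = {
    example_devs, sizeof(example_devs) / sizeof(example_devs[0]),
};

/* Stand-in for whatever actually creates and attaches the slave device. */
typedef void (*I2CAttachFn)(int bus, uint8_t addr, const char *type,
                            void *opaque);

/* Generic init: no board-specific knowledge, only the table is consulted. */
static size_t board_i2c_bus_init(const BoardI2CConfig *cfg,
                                 I2CAttachFn attach, void *opaque)
{
    size_t i;

    assert(cfg != NULL && attach != NULL);
    for (i = 0; i < cfg->num_devs; i++) {
        attach(cfg->devs[i].bus, cfg->devs[i].addr, cfg->devs[i].type,
               opaque);
    }
    return cfg->num_devs;
}

/* Example callback that only counts how many devices were attached. */
static void count_attach(int bus, uint8_t addr, const char *type,
                         void *opaque)
{
    (void)bus;
    (void)addr;
    (void)type;
    (*(int *)opaque)++;
}
```

A board that needs another device would then only add a table entry; the
init routine stays untouched.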

Is that a reasonable alternative? I agree that we need to use a
different approach to that which the current patch is using.

Andrew


Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI

2016-11-23 Thread Laszlo Ersek

CC Jordan & Mike

On 11/23/16 23:35, Paolo Bonzini wrote:
> 
> 
> On 18/11/2016 11:36, Laszlo Ersek wrote:
>> This is v3 of the series, with updates based on the v2 discussion:
>> .
>>
>> I've added feature negotiation via the APM_STS ("scratchpad") register.
>> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
>>
>> Tested with new OVMF patches (about to send out those as well).
>> Regression tested with SeaBIOS (beyond simple functional tests with
>> maximum SeaBIOS logging enabled, I used gdb to step through the new
>> ich9_apm_status_changed() callback to see if it was behaving compatibly
>> with SeaBIOS).
>>
>> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
>> crashes very quickly for me when running OVMF:
>>
>>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
>>
>> It is my understanding that there are patches on the list for this:
>>
>>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
>>
>> Anyway, the series rebases to v2.8.0-rc0 without as much as context
>> differences.
> 
> Hi Laszlo,
> 
> sorry for the slightly delayed reply.
> 
> First of all, I'm wondering if we would be better off adding a new port
> 0xB1 that is QEMU-specific, instead of reusing 0xB3.

Sure, I can look into that, if we agree that's the best way to proceed,
for now. (Although I'm not really happy about the new memory region
stuff it would require. :()

I CC'd Kevin to learn if he foresaw other uses for the APM_STS register
in SeaBIOS.

BTW, what happens in QEMU if the guest reads an unimplemented port?
Hm... unassigned_io_write() seems to be a no-op, and
unassigned_io_read() returns all-bits-one. This means that for a new
port, the negotiation protocol / values have to be reworked.
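
The consequence can be sketched as follows. This is only an illustration
under assumptions, not the protocol from the series: the helper name
parse_feature_readback and the choice to reserve the all-ones value are
hypothetical. Since a read from an unassigned port yields all-bits-one, a
guest cannot tell "every feature accepted" from "nobody is listening"
unless 0xFF is excluded from the valid responses:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value a guest reads back from a port that no device implements. */
#define UNASSIGNED_PORT_VALUE 0xFFu

/*
 * Hypothetical guest-side check. 'requested' is the feature mask the
 * guest wrote, 'readback' is what it read from the port afterwards.
 * Returns false if the readback is indistinguishable from an
 * unimplemented port; otherwise stores the accepted subset.
 */
static bool parse_feature_readback(uint8_t readback, uint8_t requested,
                                   uint8_t *accepted)
{
    assert(accepted != NULL);

    if (readback == UNASSIGNED_PORT_VALUE) {
        return false;
    }
    *accepted = (uint8_t)(readback & requested);
    return true;
}
```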

Port 0xB1 is occupied by ICH9 according to the spec:

  Table 9-2. Fixed I/O Ranges Decoded by Intel ® ICH9 (Sheet 2 of 2)

  I/O
  Address  Read Target           Write Target          Internal Unit
  -------  --------------------  --------------------  -------------
  B0h–B1h  Interrupt Controller  Interrupt Controller  Interrupt

I wonder if we care -- after all, APM_STS (0xB3) is documented not to
have any hardware effects ("scratchpad register").

> Second, I now remembered the reason why I was against broadcast SMI.
> The reason is that it breaks hot-plug.

How does it break hot-plug? After reading your explanation below: is it
that the broadcast SMI (possibly raised by the OS directly) would get to
the new VCPU before the firmware relocated its SMBASE?

> 
> On hot-plug, the firmware (if it wants to use SMI for anything secure)
> must relocate the SMBASE of the newly-hotplugged CPU before the OS has
> any chance to put its fangs on it.  This is because the OS can send
> direct SMIs and is in control of the area at 0x3.  So OVMF is
> already broken with respect to hotplug,

Yes. We theorized that there could be further edk2 core modules that
hadn't been open sourced yet, necessary for handling VCPU hotplug.

> but I am not yet sure if these
> patches would break it further.

Hard to say without seeing those modules.

I will speculate: assuming that the non-public modules fit together with
the public modules in some way, I expect the broadcast SMI shouldn't
break them. The reason is that the broadcast SMI / traditional delivery
are the default method in UefiCpuPkg, and in practice they work better
(more reliably) with the rest of the edk2 infrastructure, in my
experience, than the relaxed sync method.

In his review today, Jordan wrote
,

> I'm glad we'll be using a mechanism that broadcasts to all the
> processors like the real hardware. It is a bit unfortunate that it
> doesn't go through the b2 port for it.

If broadcast is how real hardware does it (even by default!,
apparently), I expect those as-yet unreleased, hotplug-handling modules
in edk2 should be able to cope with broadcast.

> The full solution is to follow a protocol similar to what real hardware
> does.

Real hardware seems to use broadcast, according to the above...

On 11/23/16 23:35, Paolo Bonzini wrote:

> On hot-plug, before the new CPU starts running, the DSDT should
> generate an SMI (with a well-known value written to 0xB2).

I'm not sure I understand right. If it is the DSDT that writes to 0xB2,
that's just another way to say, "the firmware vendor asks the operating
system to write to 0xB2". If the malicious OS is smart enough, it can
realize (from the hardware signal to run the ACPI GPE handler, IIRC)
that a new VCPU is available, and simply not trigger the SMI.

> The handler
> then:
> 
> 1) waits for SMM rendezvous
> 
> 2) unparks the hotplugged VCPU.  Until the VCPU is unparked, it doesn't
> react to either INITs or of course SIPIs (this would need to be
> implemented separately for both TCG and KVM! but only in QEMU, not in
> the kernel).

Okay, this does plug the hole

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-23 Thread Martin Pitt

Hello Rafael, or anyone else affected,

Accepted qemu into xenial-proposed. The package will build now and be
available at
https://launchpad.net/ubuntu/+source/qemu/1:2.5+dfsg-5ubuntu10.7 in a
few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag
from verification-needed to verification-done. If it does not fix the
bug for you, please add a comment stating that, and change the tag to
verification-failed.  In either case, details of your testing will help
us make a better decision.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance!

** Changed in: qemu (Ubuntu Xenial)
   Status: In Progress => Fix Committed

** Tags added: verification-needed

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  In Progress
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Fix Committed
Status in qemu source package in Yakkety:
  In Progress
Status in qemu source package in Zesty:
  Fix Released

Bug description:
  [Impact]

   * Updated QEMU (from UCA) live migration doesn't work with 3.13 kernels.
   * QEMU code checks if it can create /tmp/memfd-XXX files wrongly.
   * Apparmor will block access to /tmp/ and QEMU will fail migrating.

  [Test Case]

   * Install 2 Ubuntu Trusty (3.13) + UCA Mitaka + apparmor rules.
   * Try to live-migrate from one to another.
   * Apparmor will block creation of /tmp/memfd-XXX files.

  [Regression Potential]

   Pros:
   * Exhaustively tested this.
   * Worked with upstream on this fix. 
   * I'm implementing new vhost log mechanism for upstream.
   * One line change to a blocker that is already broken.

   Cons:
   * Could break live migration in other circumstances.

  [Other Info]

   * Christian Ehrhardt has been following this.

  ORIGINAL DESCRIPTION:

  When libvirt starts using apparmor and creates apparmor profiles for
  every virtual machine created on the compute nodes, Mitaka qemu (2.5,
  and upstream as well) uses a fallback mechanism for creating shared
  memory for live migrations. On 3.13 kernels, which don't have the
  memfd_create() system call, this fallback mechanism tries to create
  files in the /tmp/ directory and fails, so live migration does not work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  In qemu 2.5, the logic is in:

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int *fd)
  {
      if (memfd_create)... ### only works with HWE kernels

      else ### 3.13 kernels, gets blocked by apparmor
          tmpdir = g_get_tmp_dir
          ...
          mfd = mkstemp(fname)
  }

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 
audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkc

Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support

2016-11-23 Thread Paolo Bonzini



On 23/11/2016 23:09, ashish mittal wrote:
> On the topic of protocol security -
> 
> Would it be enough for the first patch to implement only
> authentication and not encryption?

Yes, of course.  However, as we introduce more and more QEMU-specific
characteristics to a protocol that is already QEMU-specific (it doesn't
do failover, etc.), I am still not sure of the actual benefit of using
libqnio versus having an NBD server or FUSE driver.

You have already mentioned performance, but the design has changed so
much that I think one of the two things has to change: either failover
moves back to QEMU and there is no (closed source) translator running on
the node, or the translator needs to speak a well-known and
already-supported protocol.

Paolo

> On Wed, Nov 23, 2016 at 12:25 AM, Ketan Nilangekar
>  wrote:
>> +Nitin Jerath from Veritas.
>>
>>
>>
>>
>> On 11/18/16, 7:06 PM, "Daniel P. Berrange"  wrote:
>>
>>> On Fri, Nov 18, 2016 at 01:25:43PM +, Ketan Nilangekar wrote:
>>>>
>>>>
>>>> > On Nov 18, 2016, at 5:25 PM, Daniel P. Berrange wrote:
>>>> >
>>>> >> On Fri, Nov 18, 2016 at 11:36:02AM +, Ketan Nilangekar wrote:
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>> On 11/18/16, 3:32 PM, "Stefan Hajnoczi"  wrote:
>>>> >>>
>>>> >>>> On Fri, Nov 18, 2016 at 02:26:21AM -0500, Jeff Cody wrote:
>>>> >>>> * Daniel pointed out that there is no authentication method for talking to a
>>>> >>>>   remote server.  This seems a bit scary.  Maybe all that is needed here is
>>>> >>>>   some clarification of the security scheme for authentication?  My
>>>> >>>>   impression from above is that you are relying on the networks being
>>>> >>>>   private to provide some sort of implicit authentication, though, and this
>>>> >>>>   seems fragile (and doesn't protect against a compromised guest or other
>>>> >>>>   process on the server, for one).
>>>> >>>
>>>> >>> Exactly, from the QEMU trust model you must assume that QEMU has been
>>>> >>> compromised by the guest.  The escaped guest can connect to the VxHS
>>>> >>> server since it controls the QEMU process.
>>>> >>>
>>>> >>> An escaped guest must not have access to other guests' volumes.
>>>> >>> Therefore authentication is necessary.
>>>> >>
>>>> >> Just so I am clear on this, how will such an escaped guest get to know
>>>> >> the other guest vdisk IDs?
>>>> >
>>>> > There can be multiple approaches depending on the deployment scenario.
>>>> > At the very simplest it could directly read the IDs out of the libvirt
>>>> > XML files in /var/run/libvirt. Or it can run "ps" to list other running
>>>> > QEMU processes and see the vdisk IDs in the command line args of those
>>>> > processes. Or the mgmt app may be creating vdisk IDs based on some
>>>> > particular scheme, and the attacker may have info about this which lets
>>>> > them determine likely IDs.  Or the QEMU may have previously been
>>>> > permitted to use the disk and remembered the ID for use later
>>>> > after access to the disk has been removed.
>>>> >
>>>>
>>>> Are we talking about a compromised guest here or compromised hypervisor?
>>>> How will a compromised guest read the xml file or list running qemu
>>>> processes?
>>>
>>> Compromised QEMU process, aka hypervisor userspace
>>>
>>>
>>> Regards,
>>> Daniel
>>> --
>>> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
>>> |: http://libvirt.org  -o- http://virt-manager.org :|
>>> |: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|

Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI

2016-11-23 Thread Paolo Bonzini

On 18/11/2016 11:36, Laszlo Ersek wrote:
> This is v3 of the series, with updates based on the v2 discussion:
> .
> 
> I've added feature negotiation via the APM_STS ("scratchpad") register.
> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
> 
> Tested with new OVMF patches (about to send out those as well).
> Regression tested with SeaBIOS (beyond simple functional tests with
> maximum SeaBIOS logging enabled, I used gdb to step through the new
> ich9_apm_status_changed() callback to see if it was behaving compatibly
> with SeaBIOS).
> 
> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
> crashes very quickly for me when running OVMF:
> 
>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
> 
> It is my understanding that there are patches on the list for this:
> 
>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
> 
> Anyway, the series rebases to v2.8.0-rc0 without as much as context
> differences.

Hi Laszlo,

sorry for the slightly delayed reply.

First of all, I'm wondering if we would be better off adding a new port
0xB1 that is QEMU-specific, instead of reusing 0xB3.

Second, I now remembered the reason why I was against broadcast SMI.
The reason is that it breaks hot-plug.

On hot-plug, the firmware (if it wants to use SMI for anything secure)
must relocate the SMBASE of the newly-hotplugged CPU before the OS has
any chance to put its fangs on it.  This is because the OS can send
direct SMIs and is in control of the area at 0x3.  So OVMF is
already broken with respect to hotplug, but I am not yet sure if these
patches would break it further.

The full solution is to follow a protocol similar to what real hardware
does.  On hot-plug, before the new CPU starts running, the DSDT should
generate an SMI (with a well-known value written to 0xB2).  The handler
then:

1) waits for SMM rendezvous

2) unparks the hotplugged VCPU.  Until the VCPU is unparked, it doesn't
react to either INITs or of course SIPIs (this would need to be
implemented separately for both TCG and KVM! but only in QEMU, not in
the kernel).

3) relocates SMBASE

4) records the presence of the new VCPU's APIC id for subsequent SMIs.
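
The ordering of steps 2 to 4 can be sketched with a toy model. This is an
illustration only, not QEMU or OVMF code: VCPUModel and every function
name below are hypothetical, step 1 (the SMM rendezvous) is assumed to
have completed already, and 0x30000 is the architectural default SMBASE.
The point it shows is that a parked VCPU ignores a SIPI from the OS, so
the OS cannot start the CPU before the handler has relocated SMBASE:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural default SMBASE on x86 before any relocation. */
#define DEFAULT_SMBASE 0x30000u

/* Toy model of one hot-plugged VCPU; all names are hypothetical. */
typedef struct VCPUModel {
    bool parked;         /* true until the SMI handler unparks it */
    bool started;        /* true once a SIPI has been accepted */
    bool known_to_smm;   /* true once its APIC id is recorded for later SMIs */
    uint32_t smbase;
} VCPUModel;

/* A freshly hot-plugged VCPU comes up parked, with the default SMBASE. */
static void vcpu_model_hotplug(VCPUModel *c)
{
    assert(c != NULL);
    c->parked = true;
    c->started = false;
    c->known_to_smm = false;
    c->smbase = DEFAULT_SMBASE;
}

/* SIPI sent by the OS: ignored while parked. Returns true if it took effect. */
static bool vcpu_model_sipi(VCPUModel *c)
{
    assert(c != NULL);
    if (c->parked) {
        return false;
    }
    c->started = true;
    return true;
}

/* Steps 2..4 of the hot-plug SMI handler, run after the rendezvous. */
static void smi_hotplug_handler(VCPUModel *c, uint32_t new_smbase)
{
    assert(c != NULL);
    c->parked = false;        /* 2) unpark the hot-plugged VCPU */
    c->smbase = new_smbase;   /* 3) relocate SMBASE */
    c->known_to_smm = true;   /* 4) record the VCPU for subsequent SMIs */
}
```

The model does not capture the rendezvous itself; in the real protocol it
is what keeps the OS from running between the unpark and the relocation.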

The other important things are:

* new CPU-hotplug controller (docs/specs/acpi_cpu_hotplug.txt)
interfaces.  I think this would only need a new bit in the write
register at 0x4 ("CPU device control fields").

The 0x0-0x3 write register, currently reserved for reading, might
become read/write for easier communication with the SMI handler.  The
SMI handler would write 1 to the new bit in order to unpark the VCPU.

* how to enable this.  I think it would need a new SMM feature, just
like those that you are adding here, in order to make it negotiable.  If
it is not negotiated, but broadcast SMI is negotiated, you'd need to do
something such as not allowing CPU-hotplug.  (This is the only part that
I think is required for 2.9).

In order to trigger the SMI, the

    ifctx = aml_if(aml_equal(ins_evt, one));
    {
        aml_append(ifctx,
            aml_call2(CPU_NOTIFY_METHOD, cpu_data, dev_chk));
        aml_append(ifctx, aml_store(one, ins_evt));
        aml_append(ifctx, aml_store(one, has_event));
    }

would be replaced by something like this pseudo-AML:

    Store(And(smm_features, 2), Local1)
    ...
    Store(next_cpu_cmd, cpu_cmd)
    If (Equal(ins_evt, One)) {
        If (Greater(Local1, Zero)) {
            Store(CPU_HP_APM_CMD, apm_cmd)
        }
        CPU_NOTIFY_METHOD(cpu_data, dev_chk)
        Store(One, ins_evt)
        Store(One, has_event)
    }

Of course all this is for OVMF only.  SeaBIOS doesn't need to do
anything of this, because it actually likes to have its SMIs only on the
current CPU (and it had better be the BSP, since SMBASE is not relocated
elsewhere!).

Igor, any thoughts?

I understand that this is quite huge in both OVMF and QEMU, but we've
only been delaying it and we knew about it. :(

Paolo

> Cc: "Kevin O'Connor" 
> Cc: "Michael S. Tsirkin" 
> Cc: Gerd Hoffmann 
> Cc: Paolo Bonzini 
> 
> Thanks
> Laszlo
> 
> Laszlo Ersek (3):
>   hw/isa/apm: introduce callback for APM_STS_IOPORT writes
>   hw/isa/lpc_ich9: add SMI feature negotiation via APM_STS
>   hw/isa/lpc_ich9: ICH9_APM_STS_F_BROADCAST_SMI: inject SMI on all VCPUs
> 
>  docs/specs/q35-apm-sts.txt | 80 
> ++
>  include/hw/i386/ich9.h |  9 ++
>  include/hw/isa/apm.h   |  9 +++---
>  hw/acpi/piix4.c|  2 +-
>  hw/isa/apm.c   | 15 ++---
>  hw/isa/lpc_ich9.c  | 64 +++--
>  hw/isa/vt82c686.c  |  2 +-
>  7 files change

Re: [Qemu-devel] [PATCH for-2.8 v3] xen_disk: split discard input to match internal representation

2016-11-23 Thread Olaf Hering

On 23 November 2016 at 21:44:50 CET, Olaf Hering wrote:

>Is this a can for 2.x?
 candidate 


Olaf

Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support

2016-11-23 Thread ashish mittal

On the topic of protocol security -

Would it be enough for the first patch to implement only
authentication and not encryption?

On Wed, Nov 23, 2016 at 12:25 AM, Ketan Nilangekar
 wrote:
> +Nitin Jerath from Veritas.
>
>
>
>
> On 11/18/16, 7:06 PM, "Daniel P. Berrange"  wrote:
>
>>On Fri, Nov 18, 2016 at 01:25:43PM +, Ketan Nilangekar wrote:
>>>
>>>
>>> > On Nov 18, 2016, at 5:25 PM, Daniel P. Berrange wrote:
>>> >
>>> >> On Fri, Nov 18, 2016 at 11:36:02AM +, Ketan Nilangekar wrote:
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>> On 11/18/16, 3:32 PM, "Stefan Hajnoczi"  wrote:
>>> >>>
>>> >>>> On Fri, Nov 18, 2016 at 02:26:21AM -0500, Jeff Cody wrote:
>>> >>>> * Daniel pointed out that there is no authentication method for talking to a
>>> >>>>   remote server.  This seems a bit scary.  Maybe all that is needed here is
>>> >>>>   some clarification of the security scheme for authentication?  My
>>> >>>>   impression from above is that you are relying on the networks being
>>> >>>>   private to provide some sort of implicit authentication, though, and this
>>> >>>>   seems fragile (and doesn't protect against a compromised guest or other
>>> >>>>   process on the server, for one).
>>> >>>
>>> >>> Exactly, from the QEMU trust model you must assume that QEMU has been
>>> >>> compromised by the guest.  The escaped guest can connect to the VxHS
>>> >>> server since it controls the QEMU process.
>>> >>>
>>> >>> An escaped guest must not have access to other guests' volumes.
>>> >>> Therefore authentication is necessary.
>>> >>
>>> >> Just so I am clear on this, how will such an escaped guest get to know
>>> >> the other guest vdisk IDs?
>>> >
>>> > There can be multiple approaches depending on the deployment scenario.
>>> > At the very simplest it could directly read the IDs out of the libvirt
>>> > XML files in /var/run/libvirt. Or it can run "ps" to list other running
>>> > QEMU processes and see the vdisk IDs in the command line args of those
>>> > processes. Or the mgmt app may be creating vdisk IDs based on some
>>> > particular scheme, and the attacker may have info about this which lets
>>> > them determine likely IDs.  Or the QEMU may have previously been
>>> > permitted to use the disk and remembered the ID for use later
>>> > after access to the disk has been removed.
>>> >
>>>
>>> Are we talking about a compromised guest here or compromised hypervisor?
>>> How will a compromised guest read the xml file or list running qemu
>>> processes?
>>
>>Compromised QEMU process, aka hypervisor userspace
>>
>>
>>Regards,
>>Daniel
>>--
>>|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
>>|: http://libvirt.org  -o- http://virt-manager.org :|
>>|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|

Re: [Qemu-devel] [PATCH v1 04/18] util/rbcache: range-based cache core

2016-11-23 Thread Kevin Wolf

On 15.11.2016 at 07:37, Pavel Butsykin wrote:
> RBCache provides functionality to cache the data from block devices
> (basically). The range here is used as the main key for searching and storing
> data. The cache is based on red-black trees, so the basic operations (search,
> insert, delete) are performed in O(log n).
> 
> It is important to note that QEMU usually does not require a data cache, but
> in reality, there are already some cases where a small cache can
> increase performance. Red-black trees were selected as the data structure
> because they are fairly simple and show high efficiency on a small
> number of elements. Therefore, when the minimum range is 512 bytes, the
> recommended size of the cache memory is no more than 8-16 MB.  Also note
> that this cache implementation allows storing ranges of different lengths
> without alignment.
> 
> The generic cache core can easily be used to implement different caching
> policies at the block level, such as read-ahead. It can also be used in some
> special cases, for example for caching data in qcow2 during sequential
> allocating writes to an image with a backing file.
> 
> Signed-off-by: Pavel Butsykin 
> ---
>  MAINTAINERS|   6 ++
>  include/qemu/rbcache.h | 105 +
>  util/Makefile.objs |   1 +
>  util/rbcache.c | 246 
> +
>  4 files changed, 358 insertions(+)
>  create mode 100644 include/qemu/rbcache.h
>  create mode 100644 util/rbcache.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index ddf797b..cb74802 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1365,6 +1365,12 @@ F: include/qemu/rbtree.h
>  F: include/qemu/rbtree_augmented.h
>  F: util/rbtree.c
>  
> +Range-Based Cache
> +M: Denis V. Lunev 
> +S: Supported
> +F: include/qemu/rbcache.h
> +F: util/rbcache.c
> +
>  UUID
>  M: Fam Zheng 
>  S: Supported
> diff --git a/include/qemu/rbcache.h b/include/qemu/rbcache.h
> new file mode 100644
> index 000..c8f0a9f
> --- /dev/null
> +++ b/include/qemu/rbcache.h
> @@ -0,0 +1,105 @@
> +/*
> + * QEMU Range-Based Cache core
> + *
> + * Copyright (C) 2015-2016 Parallels IP Holdings GmbH. All rights reserved.
> + *
> + * Author: Pavel Butsykin 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef RBCACHE_H
> +#define RBCACHE_H
> +
> +#include "qemu/rbtree.h"
> +#include "qemu/queue.h"
> +
> +typedef struct RBCacheNode {
> +struct RbNode rb_node;
> +uint64_t offset;
> +uint64_t bytes;
> +QTAILQ_ENTRY(RBCacheNode) entry;
> +} RBCacheNode;
> +
> +typedef struct RBCache RBCache;
> +
> +typedef RBCacheNode *RBNodeAlloc(uint64_t offset, uint64_t bytes, void *opaque);
> +typedef void RBNodeFree(RBCacheNode *node, void *opaque);

Maybe worth comments describing what these functions do apart from
g_new()/g_free()? I assume that offset and bytes must be initialised
from the parameters. Should rb_node and entry be zeroed?

> +
> +enum eviction_type {
> +RBCACHE_FIFO,
> +RBCACHE_LRU,
> +};
> +
> +/**
> + * rbcache_search:
> + * @rbcache: the cache object.
> + * @offset: the start of the range.
> + * @bytes: the size of the range.
> + *
> + * Returns the node corresponding to the range(offset, bytes),
> + * or NULL if the node was not found.
> + */
> +void *rbcache_search(RBCache *rbcache, uint64_t offset, uint64_t bytes);

What if the range covers multiple nodes? Is it defined which of the
nodes we return or do you just get any?

Why does this function (and the following ones) return void* rather than
RBCacheNode* if they are supposed to return a node?

> +/**
> + * rbcache_insert:
> + * @rbcache: the cache object.
> + * @node: a new node for the cache.
> + *
> + * Returns the new node, or old node if the node already exists.
> + */
> +void *rbcache_insert(RBCache *rbcache, RBCacheNode *node);

What does "if the node already exists" mean? If @node (the very same
object) is already stored in the cache object, or if a node describing
the same range already exists?

> +/**
> + * rbcache_search_and_insert:
> + * @rbcache: the cache object.
> + * @offset: the start of the range.
> + * @bytes: the size of the range.
> + *
> + * rbcache_search_and_insert() is like rbcache_insert(), except that a new
> + * node is allocated inside the function. Returns the new node, or old node
> + * if the node already exists.
> + */
> +void *rbcache_search_and_insert(RBCache *rbcache, uint64_t offset,
> +uint64_t byte);

What happens if a node exists, but only for part of the range?

> +/**
> + * rbcache_remove:
> + * @rbcache: the cache object.
> + * @node: the node to remove.
> + *
> + * Removes the cached range owned by the node.
> + */
> +void rbcache_remove(RBCache *rbcache, RBCacheNode *node);
> +
> +RBCacheNode *rbcache_node_alloc(RBCache *rbcache, uint64_t offset,
> +

[Qemu-devel] [PATCH for-2.8] target-m68k: Fix cmpa operand size

2016-11-23 Thread Laurent Vivier

"The size of the operation can be specified as word or long.
Word length source operands are sign-extended to 32 bits for
comparison."

So comparison is always done using OS_LONG.

Signed-off-by: Laurent Vivier 
---
 target-m68k/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
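
For illustration only (this is not part of the patch): a minimal standalone
sketch of the semantics quoted above, i.e. a word source is sign-extended and
the comparison itself is 32 bits wide. The CmpFlags struct and the helper
names are assumptions made for this example and do not exist in QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the condition codes set by a compare. */
typedef struct {
    bool n, z, v, c;
} CmpFlags;

/* Sign-extend a 16-bit word operand to 32 bits, as CMPA.W requires. */
static int32_t sext16(uint16_t w)
{
    return (int32_t)((w ^ 0x8000) - 0x8000);
}

/* Flags for (dst - src), computed on the full 32 bits. */
static CmpFlags cmp_long(uint32_t dst, uint32_t src)
{
    uint32_t res = dst - src;
    CmpFlags f;

    f.n = (res >> 31) & 1;
    f.z = (res == 0);
    /* Signed overflow: operand signs differ and the result sign flipped. */
    f.v = (((dst ^ src) & (dst ^ res)) >> 31) & 1;
    /* Borrow out of the unsigned subtraction. */
    f.c = src > dst;
    return f;
}

/* CMPA.W: extend the word source, then always compare as a long. */
static CmpFlags cmpa_word(uint32_t areg, uint16_t src)
{
    return cmp_long(areg, (uint32_t)sext16(src));
}
```

In this model cmpa_word(0x00010000, 0x0000) leaves Z clear, whereas a compare
restricted to the low 16 bits would see equal operands and set Z.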

diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 8e522db..d2d6816 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2170,7 +2170,7 @@ DISAS_INSN(cmpa)
 }
 SRC_EA(env, src, opsize, 1, NULL);
 reg = AREG(insn, 9);
-gen_update_cc_cmp(s, reg, src, opsize);
+gen_update_cc_cmp(s, reg, src, OS_LONG);
 }
 
 DISAS_INSN(eor)
-- 
2.7.4

Re: [Qemu-devel] [PATCH for-2.8 v3] xen_disk: split discard input to match internal representation

2016-11-23 Thread Olaf Hering

Am 23. November 2016 13:27:13 MEZ, schrieb Kevin Wolf :
>Am 23.11.2016 um 12:40 hat Eric Blake geschrieben:

>> Qualifies as a bug fix, so requesting 2.8 inclusion.
>> Reviewed-by: Eric Blake 

Is this a can for 2.x?

Olaf

[Qemu-devel] [PATCH 2/2] tpm: Add TPM I2C Atmel frontend

2016-11-23 Thread Fabio Urquiza

Add a new frontend to the TPM backend that emulates the Atmel I2C TPM
AT97SC3204T device and makes it available on ARM machines that have an
I2C bus configured.

Signed-off-by: Fabio Urquiza 
---
 default-configs/aarch64-softmmu.mak |   1 +
 default-configs/arm-softmmu.mak |   1 +
 hw/tpm/Makefile.objs|   1 +
 hw/tpm/tpm_i2c_atmel.c  | 361 
 4 files changed, 364 insertions(+)
 create mode 100644 hw/tpm/tpm_i2c_atmel.c

diff --git a/default-configs/aarch64-softmmu.mak b/default-configs/aarch64-softmmu.mak
index 2449483..232957e 100644
--- a/default-configs/aarch64-softmmu.mak
+++ b/default-configs/aarch64-softmmu.mak
@@ -7,3 +7,4 @@ CONFIG_AUX=y
 CONFIG_DDC=y
 CONFIG_DPCD=y
 CONFIG_XLNX_ZYNQMP=y
+CONFIG_TPM_I2C_ATMEL=$(CONFIG_TPM)
diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 6de3e16..ef3c8ac 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -115,3 +115,4 @@ CONFIG_ACPI=y
 CONFIG_SMBIOS=y
 CONFIG_ASPEED_SOC=y
 CONFIG_GPIO_KEY=y
+CONFIG_TPM_I2C_ATMEL=$(CONFIG_TPM)
diff --git a/hw/tpm/Makefile.objs b/hw/tpm/Makefile.objs
index 64cecc3..0d0f0b1 100644
--- a/hw/tpm/Makefile.objs
+++ b/hw/tpm/Makefile.objs
@@ -1,2 +1,3 @@
 common-obj-$(CONFIG_TPM_TIS) += tpm_tis.o
+common-obj-$(CONFIG_TPM_I2C_ATMEL) += tpm_i2c_atmel.o
 common-obj-$(CONFIG_TPM_PASSTHROUGH) += tpm_passthrough.o tpm_util.o
diff --git a/hw/tpm/tpm_i2c_atmel.c b/hw/tpm/tpm_i2c_atmel.c
new file mode 100644
index 000..07af79b
--- /dev/null
+++ b/hw/tpm/tpm_i2c_atmel.c
@@ -0,0 +1,361 @@
+/*
+ * tpm_i2c_atmel.c - QEMU's TPM I2C interface emulator
+ *
+ * Copyright (C) 2012, HPE Corporation
+ *
+ * Authors:
+ *  Fabio Urquiza 
+ *
+ * Based on tpm_tis.c:
+ *  Stefan Berger 
+ *  David Safford 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ * Implementation of the TIS I2C interface according to specs found at
+ * http://www.trustedcomputinggroup.org. This implementation currently
+ * supports version 1.2 Atmel AT97SC3204T CI, 10 December 2016
+ *
+ * TPM I2C for TPM 2 implementation following TCG TPM I2C Interface
+ * Specification TPM Profile (PTP) Specification, Family 2.0, Revision 1.0
+ */
+
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "qemu/main-loop.h"
+#include "hw/i2c/i2c.h"
+#include "qemu/bcd.h"
+#include "sysemu/tpm_backend.h"
+#include "tpm_int.h"
+#include "qapi/error.h"
+
+#define DEBUG_TIS 0
+
+#define DPRINTF(fmt, ...) do { \
+if (DEBUG_TIS) { \
+printf(fmt, ## __VA_ARGS__); \
+} \
+} while (0)
+
+/* vendor-specific registers */
+#define TPM_TIS_STS_TPM_FAMILY_MASK (0x3 << 26)/* TPM 2.0 */
+#define TPM_TIS_STS_TPM_FAMILY1_2   (0 << 26)  /* TPM 2.0 */
+#define TPM_TIS_STS_TPM_FAMILY2_0   (1 << 26)  /* TPM 2.0 */
+
+#define TPM_TIS_STS_VALID (1 << 7)
+#define TPM_TIS_STS_DATA_AVAILABLE(1 << 4)
+#define TPM_TIS_STS_SELFTEST_DONE (1 << 2)
+
+#define TPM_TIS_ACCESS_TPM_REG_VALID_STS  (1 << 7)
+
+#define TPM_TIS_IFACE_ID_INTERFACE_TIS1_3   (0xf) /* TPM 2.0 */
+#define TPM_TIS_IFACE_ID_INTERFACE_FIFO (0x0) /* TPM 2.0 */
+#define TPM_TIS_IFACE_ID_INTERFACE_VER_FIFO (0 << 4)  /* TPM 2.0 */
+#define TPM_TIS_IFACE_ID_CAP_5_LOCALITIES   (1 << 8)  /* TPM 2.0 */
+#define TPM_TIS_IFACE_ID_CAP_TIS_SUPPORTED  (1 << 13) /* TPM 2.0 */
+#define TPM_TIS_IFACE_ID_INT_SEL_LOCK   (1 << 19) /* TPM 2.0 */
+
+#define TPM_TIS_IFACE_ID_SUPPORTED_FLAGS1_3 \
+(TPM_TIS_IFACE_ID_INTERFACE_TIS1_3 | \
+ (~0u << 4)/* all of it is don't care */)
+
+/* if backend was a TPM 2.0: */
+#define TPM_TIS_IFACE_ID_SUPPORTED_FLAGS2_0 \
+(TPM_TIS_IFACE_ID_INTERFACE_FIFO | \
+ TPM_TIS_IFACE_ID_INTERFACE_VER_FIFO | \
+ TPM_TIS_IFACE_ID_CAP_5_LOCALITIES | \
+ TPM_TIS_IFACE_ID_CAP_TIS_SUPPORTED)
+
+#define TPM_TIS_NO_DATA_BYTE  0xff
+
+static const VMStateDescription vmstate_tpm_i2c_atmel = {
+.name = "tpm",
+.unmigratable = 1,
+};
+
+static uint32_t tpm_i2c_atmel_get_size_from_buffer(const TPMSizedBuffer *sb)
+{
+return be32_to_cpu(*(uint32_t *)&sb->buffer[2]);
+}
+
+static void tpm_i2c_atmel_show_buffer(const TPMSizedBuffer *sb, const char *string)
+{
+#ifdef DEBUG_TIS
+uint32_t len, i;
+
+len = tpm_i2c_atmel_get_size_from_buffer(sb);
+DPRINTF("tpm_tis: %s length = %d\n", string, len);
+for (i = 0; i < len; i++) {
+if (i && !(i % 16)) {
+DPRINTF("\n");
+}
+DPRINTF("%.2X ", sb->buffer[i]);
+}
+DPRINTF("\n");
+#endif
+}
+
+/*
+ * Set the given flags in the STS register by clearing the register but
+ * preserving the SELFTEST_DONE and TPM_FAMILY_MASK flags and then setting
+ * the new flags.
+ *
+ * The SELFTEST_DONE flag is acquired from the backend that determines it by
+ * peeking into TPM commands.
+ *
+ * A VM suspend/resume will preserve the flag

[Qemu-devel] [PATCH 1/2] i2c: Add flag to NACK I2C transfers when busy

2016-11-23 Thread Fabio Urquiza

Add a busy flag to the I2CSlave struct, which a device can set to NACK
I2C transfer requests during the execution of its event handling function.
If the busy flag is set, i2c_start_transfer() returns -1.

Signed-off-by: Fabio Urquiza 
---
 hw/i2c/core.c| 3 +++
 include/hw/i2c/i2c.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/hw/i2c/core.c b/hw/i2c/core.c
index abd4c4c..438233c 100644
--- a/hw/i2c/core.c
+++ b/hw/i2c/core.c
@@ -142,6 +142,9 @@ int i2c_start_transfer(I2CBus *bus, uint8_t address, int recv)
start condition.  */
 if (sc->event) {
 sc->event(node->elt, recv ? I2C_START_RECV : I2C_START_SEND);
+if (node->elt->busy) {
+return -1;
+}
 }
 }
 return 0;
diff --git a/include/hw/i2c/i2c.h b/include/hw/i2c/i2c.h
index c4085aa..9c6b1ce 100644
--- a/include/hw/i2c/i2c.h
+++ b/include/hw/i2c/i2c.h
@@ -48,6 +48,7 @@ struct I2CSlave
 
 /* Remaining fields for internal use by the I2C code.  */
 uint8_t address;
+uint8_t busy;
 };
 
 I2CBus *i2c_init_bus(DeviceState *parent, const char *name);
-- 
2.9.3 (Apple Git-75)

[Qemu-devel] [PATCH 0/2] Add Atmel I2C TPM AT97SC3204T emulated device

2016-11-23 Thread Fabio Urquiza

### Overview ###

The TPM passthrough feature allows a developer to test TPM functionality,
such as Measured Boot, without the need to tamper with critical parts of the
host machine, i.e. the bootloader. It has been implemented for the x86
architecture and exposes the same interface that is provided to PC machines:
TPM TIS.

With the growing use of ARM server machines comes the need to reuse the
same security features that are present in the PC server environment, and
hence the need to use TPM devices on the ARM architecture.

This patchset provides a new frontend to the QEMU TPM passthrough with an
interface suitable to communicate with an ARM based machine.

### Technical details ###

The TPM devices on the PC architecture are integrated in such a way that
they are accessed through an abstraction layer provided by the BIOS/UEFI
firmware. That interface is not available to ARM machines, therefore a new
QEMU front end with a more suitable interface needed to be developed. The
options based on the available TPM devices on the market were I2C and SPI.
To make the choice, we looked into the TPM drivers available for the
ARM architecture in the Linux kernel. One of the simplest drivers was the
ATMEL I2C TPM AT97SC3204T, so that was our target for the emulation.

We created a file called tpm_i2c_atmel.c based on the already available
tpm_tis.c, registering as an I2C_SLAVE_CLASS and calling the tpm_backend
functions.

One of the problems we had to address concerns the behavior of the
ATMEL I2C TPM AT97SC3204T Linux driver. After the driver sends a request
to the TPM, it keeps polling the device with I2C read requests. The real
AT97SC3204T hardware ignores those requests while the response is not ready,
simply by not ACKing the I2C read on its address. When the response is
ready, it ACKs the request and proceeds to write the response on the wire.

The QEMU I2C API does not provide a way to not ACK I2C requests when the
device is not ready to transmit. In fact, if the device has been configured
in the virtual machine, QEMU automatically ACKs every request without
asking the device for permission. Therefore we created a flag in
the I2CSlave struct that tells the I2C subsystem that the device is busy
and not ready to ACK an I2C transfer. We understand that this may not be
the best solution to the problem, but it appears to be the solution that
has the least impact on the code overall. Suggestions for a different
approach are welcome.
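
To make the intended interaction concrete, here is a small standalone model
of it, written only as a sketch: the ModelSlave struct, the event enum and
every function name below are hypothetical stand-ins and not the QEMU I2C
API. The device sets its busy flag from the event handler, the start-transfer
helper turns that into a NACK, and the driver side simply retries the read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the slave state and the start events. */
enum model_event { MODEL_START_RECV, MODEL_START_SEND };

typedef struct ModelSlave {
    uint8_t address;
    uint8_t busy;             /* set by the device to request a NACK */
    int polls_until_ready;    /* simulated delay until the response exists */
} ModelSlave;

/* Device event handler: stay busy while the response is not ready. */
static void model_event_handler(ModelSlave *s, enum model_event ev)
{
    if (ev == MODEL_START_RECV && s->polls_until_ready > 0) {
        s->polls_until_ready--;
        s->busy = 1;
    } else {
        s->busy = 0;
    }
}

/* Returns 0 on ACK, -1 on NACK (wrong address or busy device). */
static int model_start_transfer(ModelSlave *s, uint8_t address, bool recv)
{
    if (s->address != address) {
        return -1;
    }
    model_event_handler(s, recv ? MODEL_START_RECV : MODEL_START_SEND);
    return s->busy ? -1 : 0;
}

/*
 * Driver-side polling: retry the read until it is ACKed.
 * Returns the number of attempts used, or -1 if the device never answered.
 */
static int model_poll_for_response(ModelSlave *s, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (model_start_transfer(s, s->address, true) == 0) {
            return attempt;
        }
    }
    return -1;
}
```

In this model the busy flag is only consulted after the event handler has
run, which mirrors the idea of letting the device decide per transfer whether
it wants to ACK.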

### Testing ###

We tested the feature in the versatilepb machine running Linux with
TrouSerS and tpm_tools:

qemu-system-arm -M versatilepb -kernel output/images/zImage \
    -dtb output/images/versatile-pb.dtb \
    -drive file=output/images/rootfs.ext2,if=scsi,format=raw \
    -append "root=/dev/sda console=ttyAMA0,115200" \
    -net nic,model=rtl8139 -net user -nographic \
    -device tpm-tis,tpmdev=tpm-tpm0,address=0x20 \
    -tpmdev passthrough,id=tpm-tpm0,path=/dev/tpm0,cancel-path=/sys/devices/pnp0/00:08/tpm/tpm0/cancel

The following device needed to be included on the Device Tree:

i2c0: i2c@10002000 {
 #address-cells = <1>;
 #size-cells = <0>;
 compatible = "arm,versatile-i2c";
 reg = <0x10002000 0x1000>;
 
 rtc@68 {
 compatible = "dallas,ds1338";
 reg = <0x68>;
 };


 tpm@20 {
 compatible = "atmel,at97sc3204t";
 reg = <0x20>;
 };
};

The following config needed to be enabled in the Linux Kernel:

I2C_VERSATILE=y
TCG_TPM=y
TCG_TIS_I2C_ATMEL=y

Fabio Urquiza (2):
  i2c: Add flag to NACK I2C transfers when busy
  tpm: Add TPM I2C Atmel frontend

 default-configs/aarch64-softmmu.mak |   1 +
 default-configs/arm-softmmu.mak |   1 +
 hw/i2c/core.c   |   3 +
 hw/tpm/Makefile.objs|   1 +
 hw/tpm/tpm_i2c_atmel.c  | 361 
 include/hw/i2c/i2c.h|   1 +
 6 files changed, 368 insertions(+)
 create mode 100644 hw/tpm/tpm_i2c_atmel.c

-- 
2.9.3 (Apple Git-75)

[Qemu-devel] [PATCH v2 2/5] slirp: VMStatify sbuf

2016-11-23 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Convert the sbuf structure to a VMStateDescription.
Note this uses the VMSTATE_WITH_TMP mechanism to calculate
and reload the offsets based on the pointers.

Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: David Gibson 
Acked-by: Samuel Thibault 
---
 slirp/sbuf.h  |   4 +-
 slirp/slirp.c | 116 ++
 2 files changed, 78 insertions(+), 42 deletions(-)
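
As a reading aid (not part of the patch), the pointer/offset round trip
described above can be modelled in isolation. This is only a sketch under
assumptions: struct model_buf, struct model_offsets and both helpers are
hypothetical names for this example and are much simpler than struct sbuf and
the VMSTATE_WITH_TMP machinery.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down buffer with read/write pointers. */
struct model_buf {
    uint32_t datalen;
    char *data;
    char *wptr;
    char *rptr;
};

/* What actually travels in the migration stream instead of pointers. */
struct model_offsets {
    uint32_t roff, woff;
};

/* Before saving: express both pointers as offsets from the buffer start. */
static struct model_offsets model_pre_save(const struct model_buf *b)
{
    struct model_offsets o;

    o.woff = (uint32_t)(b->wptr - b->data);
    o.roff = (uint32_t)(b->rptr - b->data);
    return o;
}

/* After loading: reject out-of-range offsets, then rebuild the pointers. */
static int model_post_load(struct model_buf *b, struct model_offsets o)
{
    if (o.woff >= b->datalen || o.roff >= b->datalen) {
        return -EINVAL;
    }
    b->wptr = b->data + o.woff;
    b->rptr = b->data + o.roff;
    return 0;
}
```

The range check matters because the offsets come from the incoming stream;
without it a corrupt stream could place the pointers outside the buffer.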

diff --git a/slirp/sbuf.h b/slirp/sbuf.h
index efcec39..a722ecb 100644
--- a/slirp/sbuf.h
+++ b/slirp/sbuf.h
@@ -12,8 +12,8 @@
 #define sbspace(sb) ((sb)->sb_datalen - (sb)->sb_cc)
 
 struct sbuf {
-   u_int   sb_cc;  /* actual chars in buffer */
-   u_int   sb_datalen; /* Length of data  */
+   uint32_t sb_cc; /* actual chars in buffer */
+   uint32_t sb_datalen;/* Length of data  */
char*sb_wptr;   /* write pointer. points to where the next
 * bytes should be written in the sbuf */
char*sb_rptr;   /* read pointer. points to where the next
diff --git a/slirp/slirp.c b/slirp/slirp.c
index 6276315..2f7802e 100644
--- a/slirp/slirp.c
+++ b/slirp/slirp.c
@@ -1185,19 +1185,72 @@ static const VMStateDescription vmstate_slirp_tcp = {
 }
 };
 
-static void slirp_sbuf_save(QEMUFile *f, struct sbuf *sbuf)
+/* The sbuf has a pair of pointers that are migrated as offsets;
+ * we calculate the offsets and restore the pointers using
+ * pre_save/post_load on a tmp structure.
+ */
+struct sbuf_tmp {
+struct sbuf *parent;
+uint32_t roff, woff;
+};
+
+static void sbuf_tmp_pre_save(void *opaque)
+{
+struct sbuf_tmp *tmp = opaque;
+tmp->woff = tmp->parent->sb_wptr - tmp->parent->sb_data;
+tmp->roff = tmp->parent->sb_rptr - tmp->parent->sb_data;
+}
+
+static int sbuf_tmp_post_load(void *opaque, int version)
 {
-uint32_t off;
-
-qemu_put_be32(f, sbuf->sb_cc);
-qemu_put_be32(f, sbuf->sb_datalen);
-off = (uint32_t)(sbuf->sb_wptr - sbuf->sb_data);
-qemu_put_sbe32(f, off);
-off = (uint32_t)(sbuf->sb_rptr - sbuf->sb_data);
-qemu_put_sbe32(f, off);
-qemu_put_buffer(f, (unsigned char*)sbuf->sb_data, sbuf->sb_datalen);
+struct sbuf_tmp *tmp = opaque;
+uint32_t requested_len = tmp->parent->sb_datalen;
+
+/* Allocate the buffer space used by the field after the tmp */
+sbreserve(tmp->parent, tmp->parent->sb_datalen);
+
+if (tmp->parent->sb_datalen != requested_len) {
+return -ENOMEM;
+}
+if (tmp->woff >= requested_len ||
+tmp->roff >= requested_len) {
+error_report("invalid sbuf offsets r/w=%u/%u len=%u",
+ tmp->roff, tmp->woff, requested_len);
+return -EINVAL;
+}
+
+tmp->parent->sb_wptr = tmp->parent->sb_data + tmp->woff;
+tmp->parent->sb_rptr = tmp->parent->sb_data + tmp->roff;
+
+return 0;
 }
 
+
+static const VMStateDescription vmstate_slirp_sbuf_tmp = {
+.name = "slirp-sbuf-tmp",
+.post_load = sbuf_tmp_post_load,
+.pre_save  = sbuf_tmp_pre_save,
+.version_id = 0,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(woff, struct sbuf_tmp),
+VMSTATE_UINT32(roff, struct sbuf_tmp),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_slirp_sbuf = {
+.name = "slirp-sbuf",
+.version_id = 0,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(sb_cc, struct sbuf),
+VMSTATE_UINT32(sb_datalen, struct sbuf),
+VMSTATE_WITH_TMP(struct sbuf, struct sbuf_tmp, vmstate_slirp_sbuf_tmp),
+VMSTATE_VBUFFER_UINT32(sb_data, struct sbuf, 0, NULL, 0, sb_datalen),
+VMSTATE_END_OF_LIST()
+}
+};
+
+
 static void slirp_socket_save(QEMUFile *f, struct socket *so)
 {
 qemu_put_be32(f, so->so_urgc);
@@ -1225,8 +1278,9 @@ static void slirp_socket_save(QEMUFile *f, struct socket *so)
 qemu_put_byte(f, so->so_emu);
 qemu_put_byte(f, so->so_type);
 qemu_put_be32(f, so->so_state);
-slirp_sbuf_save(f, &so->so_rcv);
-slirp_sbuf_save(f, &so->so_snd);
+/* TODO: Build vmstate at this level */
+vmstate_save_state(f, &vmstate_slirp_sbuf, &so->so_rcv, 0);
+vmstate_save_state(f, &vmstate_slirp_sbuf, &so->so_snd, 0);
 vmstate_save_state(f, &vmstate_slirp_tcp, so->so_tcpcb, 0);
 }
 
@@ -1263,31 +1317,9 @@ static void slirp_state_save(QEMUFile *f, void *opaque)
 slirp_bootp_save(f, slirp);
 }
 
-static int slirp_sbuf_load(QEMUFile *f, struct sbuf *sbuf)
-{
-uint32_t off, sb_cc, sb_datalen;
-
-sb_cc = qemu_get_be32(f);
-sb_datalen = qemu_get_be32(f);
-
-sbreserve(sbuf, sb_datalen);
-
-if (sbuf->sb_datalen != sb_datalen)
-return -ENOMEM;
-
-sbuf->sb_cc = sb_cc;
-
-off = qemu_get_sbe32(f);
-sbuf->sb_wptr = sbuf->sb_data + off;
-off = qemu_get_sbe32(f);
-sbuf->sb_rptr = sbuf->sb_data + off;
-qemu_get_buffer(f, (unsigned char*)sbuf->sb_data, sbuf->sb_datalen);
-
-return 0;
-}
-
 static in

[Qemu-devel] [PATCH v2 5/5] slirp: VMStatify remaining except for loop

2016-11-23 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

This converts the remaining components, except for the top level
loop, to VMState.

Signed-off-by: Dr. David Alan Gilbert 
---
 slirp/slirp.c | 48 +++-
 1 file changed, 19 insertions(+), 29 deletions(-)

diff --git a/slirp/slirp.c b/slirp/slirp.c
index c631338..5f95d75 100644
--- a/slirp/slirp.c
+++ b/slirp/slirp.c
@@ -1319,15 +1319,25 @@ static const VMStateDescription vmstate_slirp_socket = {
 }
 };
 
-static void slirp_bootp_save(QEMUFile *f, Slirp *slirp)
-{
-int i;
+static const VMStateDescription vmstate_slirp_bootp_client = {
+.name = "slirp_bootpclient",
+.fields = (VMStateField[]) {
+VMSTATE_UINT16(allocated, BOOTPClient),
+VMSTATE_BUFFER(macaddr, BOOTPClient),
+VMSTATE_END_OF_LIST()
+}
+};
 
-for (i = 0; i < NB_BOOTP_CLIENTS; i++) {
-qemu_put_be16(f, slirp->bootp_clients[i].allocated);
-qemu_put_buffer(f, slirp->bootp_clients[i].macaddr, 6);
+static const VMStateDescription vmstate_slirp = {
+.name = "slirp",
+.version_id = 4,
+.fields = (VMStateField[]) {
+VMSTATE_UINT16_V(ip_id, Slirp, 2),
+VMSTATE_STRUCT_ARRAY(bootp_clients, Slirp, NB_BOOTP_CLIENTS, 3,
+ vmstate_slirp_bootp_client, BOOTPClient),
+VMSTATE_END_OF_LIST()
 }
-}
+};
 
 static void slirp_state_save(QEMUFile *f, void *opaque)
 {
@@ -1347,22 +1357,10 @@ static void slirp_state_save(QEMUFile *f, void *opaque)
 }
 qemu_put_byte(f, 0);
 
-qemu_put_be16(f, slirp->ip_id);
-
-slirp_bootp_save(f, slirp);
+vmstate_save_state(f, &vmstate_slirp, slirp, NULL);
 }
 
 
-static void slirp_bootp_load(QEMUFile *f, Slirp *slirp)
-{
-int i;
-
-for (i = 0; i < NB_BOOTP_CLIENTS; i++) {
-slirp->bootp_clients[i].allocated = qemu_get_be16(f);
-qemu_get_buffer(f, slirp->bootp_clients[i].macaddr, 6);
-}
-}
-
 static int slirp_state_load(QEMUFile *f, void *opaque, int version_id)
 {
 Slirp *slirp = opaque;
@@ -1397,13 +1395,5 @@ static int slirp_state_load(QEMUFile *f, void *opaque, int version_id)
 so->extra = (void *)ex_ptr->ex_exec;
 }
 
-if (version_id >= 2) {
-slirp->ip_id = qemu_get_be16(f);
-}
-
-if (version_id >= 3) {
-slirp_bootp_load(f, slirp);
-}
-
-return 0;
+return vmstate_load_state(f, &vmstate_slirp, slirp, version_id);
 }
-- 
2.9.3

[Qemu-devel] [PATCH v2 0/5] SLIRP VMStatification [ for 2.9 ]

2016-11-23 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Hi,
  This is an update to my previous series that included VMSTATE_WITH_TMP.
I've not bothered including the VMSTATE_WITH_TMP set this time; but it's
still dependent on it, so we'll wait until it's in first.

  My main change in this version is that I've commoned the fhost/lhost unions
in struct socket, and that's made the 'socket level' code a lot simpler
and should make it a lot easier to add IPv6 support to it.

Dave

Dr. David Alan Gilbert (10):
  slirp: VMState conversion; tcpcb
  slirp: VMStatify sbuf
  slirp: Common lhost/fhost union
  slirp: VMStatify socket level
  slirp: VMStatify remaining except for loop

 slirp/sbuf.h|4 
 slirp/slirp.c   |  459 +-
 slirp/socket.h  |   24 +-
 slirp/tcp_var.h |6 

 5 files changed, 231 insertions(+), 263 deletions(-)


-- 
2.9.3

[Qemu-devel] [PATCH v2 3/5] slirp: Common lhost/fhost union

2016-11-23 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

The socket structure has a pair of unions for lhost and fhost
addresses; the unions are identical so split them out into
a separate union declaration.

Signed-off-by: Dr. David Alan Gilbert 
---
 slirp/socket.h | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/slirp/socket.h b/slirp/socket.h
index 8feed2a..c1be77e 100644
--- a/slirp/socket.h
+++ b/slirp/socket.h
@@ -15,6 +15,12 @@
  * Our socket structure
  */
 
+union slirp_sockaddr {
+struct sockaddr_storage ss;
+struct sockaddr_in sin;
+struct sockaddr_in6 sin6;
+};
+
 struct socket {
   struct socket *so_next,*so_prev;  /* For a linked list of sockets */
 
@@ -31,22 +37,14 @@ struct socket {
   struct tcpiphdr *so_ti; /* Pointer to the original ti within
* so_mconn, for non-blocking connections */
   int so_urgc;
-  union {   /* foreign host */
-  struct sockaddr_storage ss;
-  struct sockaddr_in sin;
-  struct sockaddr_in6 sin6;
-  } fhost;
+  union slirp_sockaddr fhost;  /* Foreign host */
 #define so_faddr fhost.sin.sin_addr
 #define so_fport fhost.sin.sin_port
 #define so_faddr6 fhost.sin6.sin6_addr
 #define so_fport6 fhost.sin6.sin6_port
 #define so_ffamily fhost.ss.ss_family
 
-  union {   /* local host */
-  struct sockaddr_storage ss;
-  struct sockaddr_in sin;
-  struct sockaddr_in6 sin6;
-  } lhost;
+  union slirp_sockaddr lhost;  /* Local host */
 #define so_laddr lhost.sin.sin_addr
 #define so_lport lhost.sin.sin_port
 #define so_laddr6 lhost.sin6.sin6_addr
-- 
2.9.3

[Qemu-devel] [PATCH v2 4/5] slirp: VMStatify socket level

2016-11-23 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Working up the stack, this replaces the slirp_socket_load/save
with VMState definitions.

Signed-off-by: Dr. David Alan Gilbert 
---
 slirp/slirp.c  | 146 ++---
 slirp/socket.h |   6 +--
 2 files changed, 69 insertions(+), 83 deletions(-)

diff --git a/slirp/slirp.c b/slirp/slirp.c
index 2f7802e..c631338 100644
--- a/slirp/slirp.c
+++ b/slirp/slirp.c
@@ -1250,40 +1250,75 @@ static const VMStateDescription vmstate_slirp_sbuf = {
 }
 };
 
+static bool slirp_older_than_v4(void *opaque, int version_id)
+{
+return version_id < 4;
+}
 
-static void slirp_socket_save(QEMUFile *f, struct socket *so)
+static bool slirp_family_inet(void *opaque, int version_id)
 {
-qemu_put_be32(f, so->so_urgc);
-qemu_put_be16(f, so->so_ffamily);
-switch (so->so_ffamily) {
-case AF_INET:
-qemu_put_be32(f, so->so_faddr.s_addr);
-qemu_put_be16(f, so->so_fport);
-break;
-default:
-error_report("so_ffamily unknown, unable to save so_faddr and"
- " so_fport");
-}
-qemu_put_be16(f, so->so_lfamily);
-switch (so->so_lfamily) {
-case AF_INET:
-qemu_put_be32(f, so->so_laddr.s_addr);
-qemu_put_be16(f, so->so_lport);
-break;
-default:
-error_report("so_ffamily unknown, unable to save so_laddr and"
- " so_lport");
+union slirp_sockaddr *ssa = (union slirp_sockaddr *)opaque;
+return ssa->ss.ss_family == AF_INET;
+}
+
+static int slirp_socket_pre_load(void *opaque)
+{
+struct socket *so = opaque;
+if (tcp_attach(so) < 0) {
+return -ENOMEM;
 }
-qemu_put_byte(f, so->so_iptos);
-qemu_put_byte(f, so->so_emu);
-qemu_put_byte(f, so->so_type);
-qemu_put_be32(f, so->so_state);
-/* TODO: Build vmstate at this level */
-vmstate_save_state(f, &vmstate_slirp_sbuf, &so->so_rcv, 0);
-vmstate_save_state(f, &vmstate_slirp_sbuf, &so->so_snd, 0);
-vmstate_save_state(f, &vmstate_slirp_tcp, so->so_tcpcb, 0);
+/* Older versions don't load these fields */
+so->so_ffamily = AF_INET;
+so->so_lfamily = AF_INET;
+return 0;
 }
 
+static const VMStateDescription vmstate_slirp_socket_addr = {
+.name = "slirp-socket-addr",
+.version_id = 4,
+.fields = (VMStateField[]) {
+VMSTATE_UINT16(ss.ss_family, union slirp_sockaddr),
+VMSTATE_UINT32_TEST(sin.sin_addr.s_addr, union slirp_sockaddr,
+slirp_family_inet),
+VMSTATE_UINT16_TEST(sin.sin_port, union slirp_sockaddr,
+slirp_family_inet),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_slirp_socket = {
+.name = "slirp-socket",
+.version_id = 4,
+.pre_load = slirp_socket_pre_load,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(so_urgc, struct socket),
+/* Pre-v4 versions */
+VMSTATE_UINT32_TEST(so_faddr.s_addr, struct socket,
+slirp_older_than_v4),
+VMSTATE_UINT32_TEST(so_laddr.s_addr, struct socket,
+slirp_older_than_v4),
+VMSTATE_UINT16_TEST(so_fport, struct socket, slirp_older_than_v4),
+VMSTATE_UINT16_TEST(so_lport, struct socket, slirp_older_than_v4),
+/* v4 and newer */
+VMSTATE_STRUCT(fhost, struct socket, 4, vmstate_slirp_socket_addr,
+   union slirp_sockaddr),
+VMSTATE_STRUCT(lhost, struct socket, 4, vmstate_slirp_socket_addr,
+   union slirp_sockaddr),
+
+VMSTATE_UINT8(so_iptos, struct socket),
+VMSTATE_UINT8(so_emu, struct socket),
+VMSTATE_UINT8(so_type, struct socket),
+VMSTATE_INT32(so_state, struct socket),
+VMSTATE_STRUCT(so_rcv, struct socket, 0, vmstate_slirp_sbuf,
+   struct sbuf),
+VMSTATE_STRUCT(so_snd, struct socket, 0, vmstate_slirp_sbuf,
+   struct sbuf),
+VMSTATE_STRUCT_POINTER(so_tcpcb, struct socket, vmstate_slirp_tcp,
+   struct tcpcb),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static void slirp_bootp_save(QEMUFile *f, Slirp *slirp)
 {
 int i;
@@ -1308,7 +1343,7 @@ static void slirp_state_save(QEMUFile *f, void *opaque)
 continue;
 
 qemu_put_byte(f, 42);
-slirp_socket_save(f, so);
+vmstate_save_state(f, &vmstate_slirp_socket, so, NULL);
 }
 qemu_put_byte(f, 0);
 
@@ -1317,55 +1352,6 @@ static void slirp_state_save(QEMUFile *f, void *opaque)
 slirp_bootp_save(f, slirp);
 }
 
-static int slirp_socket_load(QEMUFile *f, struct socket *so, int version_id)
-{
-int ret = 0;
-if (tcp_attach(so) < 0)
-return -ENOMEM;
-
-so->so_urgc = qemu_get_be32(f);
-if (version_id <= 3) {
-so->so_ffamily = AF_INET;
-so->so_faddr.s_addr = qemu_get_be32(f);
-so->so_laddr.s_addr = qem

[Qemu-devel] [PATCH v2 1/5] slirp: VMState conversion; tcpcb

2016-11-23 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Convert the migration of the struct tcpcb to use a VMStateDescription,
the rest of it will come later.

Mostly mechanical, except for conversion of some 'char' to uint8_t
to ensure portability.

Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: Samuel Thibault 
---
 slirp/slirp.c   | 149 
 slirp/tcp_var.h |   6 +--
 2 files changed, 57 insertions(+), 98 deletions(-)

diff --git a/slirp/slirp.c b/slirp/slirp.c
index 6e2b4e5..6276315 100644
--- a/slirp/slirp.c
+++ b/slirp/slirp.c
@@ -1129,53 +1129,62 @@ void slirp_socket_recv(Slirp *slirp, struct in_addr guest_addr, int guest_port,
 tcp_output(sototcpcb(so));
 }
 
-static void slirp_tcp_save(QEMUFile *f, struct tcpcb *tp)
+static int slirp_tcp_post_load(void *opaque, int version)
 {
-int i;
+tcp_template((struct tcpcb *)opaque);
 
-qemu_put_sbe16(f, tp->t_state);
-for (i = 0; i < TCPT_NTIMERS; i++)
-qemu_put_sbe16(f, tp->t_timer[i]);
-qemu_put_sbe16(f, tp->t_rxtshift);
-qemu_put_sbe16(f, tp->t_rxtcur);
-qemu_put_sbe16(f, tp->t_dupacks);
-qemu_put_be16(f, tp->t_maxseg);
-qemu_put_sbyte(f, tp->t_force);
-qemu_put_be16(f, tp->t_flags);
-qemu_put_be32(f, tp->snd_una);
-qemu_put_be32(f, tp->snd_nxt);
-qemu_put_be32(f, tp->snd_up);
-qemu_put_be32(f, tp->snd_wl1);
-qemu_put_be32(f, tp->snd_wl2);
-qemu_put_be32(f, tp->iss);
-qemu_put_be32(f, tp->snd_wnd);
-qemu_put_be32(f, tp->rcv_wnd);
-qemu_put_be32(f, tp->rcv_nxt);
-qemu_put_be32(f, tp->rcv_up);
-qemu_put_be32(f, tp->irs);
-qemu_put_be32(f, tp->rcv_adv);
-qemu_put_be32(f, tp->snd_max);
-qemu_put_be32(f, tp->snd_cwnd);
-qemu_put_be32(f, tp->snd_ssthresh);
-qemu_put_sbe16(f, tp->t_idle);
-qemu_put_sbe16(f, tp->t_rtt);
-qemu_put_be32(f, tp->t_rtseq);
-qemu_put_sbe16(f, tp->t_srtt);
-qemu_put_sbe16(f, tp->t_rttvar);
-qemu_put_be16(f, tp->t_rttmin);
-qemu_put_be32(f, tp->max_sndwnd);
-qemu_put_byte(f, tp->t_oobflags);
-qemu_put_byte(f, tp->t_iobc);
-qemu_put_sbe16(f, tp->t_softerror);
-qemu_put_byte(f, tp->snd_scale);
-qemu_put_byte(f, tp->rcv_scale);
-qemu_put_byte(f, tp->request_r_scale);
-qemu_put_byte(f, tp->requested_s_scale);
-qemu_put_be32(f, tp->ts_recent);
-qemu_put_be32(f, tp->ts_recent_age);
-qemu_put_be32(f, tp->last_ack_sent);
+return 0;
 }
 
+static const VMStateDescription vmstate_slirp_tcp = {
+.name = "slirp-tcp",
+.version_id = 0,
+.post_load = slirp_tcp_post_load,
+.fields = (VMStateField[]) {
+VMSTATE_INT16(t_state, struct tcpcb),
+VMSTATE_INT16_ARRAY(t_timer, struct tcpcb, TCPT_NTIMERS),
+VMSTATE_INT16(t_rxtshift, struct tcpcb),
+VMSTATE_INT16(t_rxtcur, struct tcpcb),
+VMSTATE_INT16(t_dupacks, struct tcpcb),
+VMSTATE_UINT16(t_maxseg, struct tcpcb),
+VMSTATE_UINT8(t_force, struct tcpcb),
+VMSTATE_UINT16(t_flags, struct tcpcb),
+VMSTATE_UINT32(snd_una, struct tcpcb),
+VMSTATE_UINT32(snd_nxt, struct tcpcb),
+VMSTATE_UINT32(snd_up, struct tcpcb),
+VMSTATE_UINT32(snd_wl1, struct tcpcb),
+VMSTATE_UINT32(snd_wl2, struct tcpcb),
+VMSTATE_UINT32(iss, struct tcpcb),
+VMSTATE_UINT32(snd_wnd, struct tcpcb),
+VMSTATE_UINT32(rcv_wnd, struct tcpcb),
+VMSTATE_UINT32(rcv_nxt, struct tcpcb),
+VMSTATE_UINT32(rcv_up, struct tcpcb),
+VMSTATE_UINT32(irs, struct tcpcb),
+VMSTATE_UINT32(rcv_adv, struct tcpcb),
+VMSTATE_UINT32(snd_max, struct tcpcb),
+VMSTATE_UINT32(snd_cwnd, struct tcpcb),
+VMSTATE_UINT32(snd_ssthresh, struct tcpcb),
+VMSTATE_INT16(t_idle, struct tcpcb),
+VMSTATE_INT16(t_rtt, struct tcpcb),
+VMSTATE_UINT32(t_rtseq, struct tcpcb),
+VMSTATE_INT16(t_srtt, struct tcpcb),
+VMSTATE_INT16(t_rttvar, struct tcpcb),
+VMSTATE_UINT16(t_rttmin, struct tcpcb),
+VMSTATE_UINT32(max_sndwnd, struct tcpcb),
+VMSTATE_UINT8(t_oobflags, struct tcpcb),
+VMSTATE_UINT8(t_iobc, struct tcpcb),
+VMSTATE_INT16(t_softerror, struct tcpcb),
+VMSTATE_UINT8(snd_scale, struct tcpcb),
+VMSTATE_UINT8(rcv_scale, struct tcpcb),
+VMSTATE_UINT8(request_r_scale, struct tcpcb),
+VMSTATE_UINT8(requested_s_scale, struct tcpcb),
+VMSTATE_UINT32(ts_recent, struct tcpcb),
+VMSTATE_UINT32(ts_recent_age, struct tcpcb),
+VMSTATE_UINT32(last_ack_sent, struct tcpcb),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static void slirp_sbuf_save(QEMUFile *f, struct sbuf *sbuf)
 {
 uint32_t off;
@@ -1218,7 +1227,7 @@ static void slirp_socket_save(QEMUFile *f, struct socket *so)
 qemu_put_be32(f, so->so_state);
 slirp_sbuf_save(f, &so->so_rcv);
 slirp_sbuf_save(f, &so->so_snd);
-slirp_tcp_save(f, so->so_tcpcb);
+vmstate_save_state(f, &v

Re: [Qemu-devel] [PATCH] xen_disk: convert discard input to byte ranges

2016-11-23 Thread Stefano Stabellini

On Wed, 23 Nov 2016, Olaf Hering wrote:
> On Wed, Nov 23, Olaf Hering wrote:
> 
> > > > +if (!blk_split_discard(ioreq, req->sector_number, req->nr_sectors)) {
> > > > +goto err;
> > > How is error handling supposed to work here?
> 
> In the guest the cmd is stuck, instead of getting an IO error:
> 
> [   91.966404] mkfs.ext4   D  0  2878   2831 
> 0x
> [   91.966406]  88002204bc48 880030530480 88002fae5800 
> 88002204c000
> [   91.966407]   7fff 8000 
> 024000c0
> [   91.966409]  88002204bc60 815dd985 880038815c00 
> 88002204bd08
> [   91.966409] Call Trace:
> [   91.966413]  [] schedule+0x35/0x80
> [   91.966416]  [] schedule_timeout+0x237/0x2d0
> [   91.966419]  [] io_schedule_timeout+0xa6/0x110
> [   91.966421]  [] wait_for_completion_io+0xa3/0x110
> [   91.966425]  [] submit_bio_wait+0x50/0x60
> [   91.966430]  [] blkdev_issue_discard+0x78/0xb0
> [   91.966433]  [] blk_ioctl_discard+0x7b/0xa0
> [   91.966436]  [] blkdev_ioctl+0x730/0x920
> [   91.966440]  [] block_ioctl+0x3d/0x40
> [   91.966444]  [] do_vfs_ioctl+0x2cd/0x4a0
> [   91.966453]  [] SyS_ioctl+0x74/0x80
> [   91.966456]  [] entry_SYSCALL_64_fastpath+0x12/0x6d

The error should be sent back to the frontend via the status field. Not
sure why blkfront is not handling it correctly.

Re: [Qemu-devel] [PATCH for-2.8 v3] xen_disk: split discard input to match internal representation

2016-11-23 Thread Stefano Stabellini

On Wed, 23 Nov 2016, Kevin Wolf wrote:
> Am 23.11.2016 um 12:40 hat Eric Blake geschrieben:
> > On 11/23/2016 04:39 AM, Olaf Hering wrote:
> > > The guest sends discard requests as u64 sector/count pairs, but the
> > > block layer operates internally with s64/s32 pairs. The conversion
> > > leads to IO errors in the guest, the discard request is not processed.
> > > 
> > >   domU.cfg:
> > >   'vdev=xvda, format=qcow2, backendtype=qdisk, target=/x.qcow2'
> > >   domU:
> > >   mkfs.ext4 -F /dev/xvda
> > >   Discarding device blocks: failed - Input/output error
> > > 
> > > Fix this by splitting the request into chunks of BDRV_REQUEST_MAX_SECTORS.
> > > Add input range checking to avoid overflow.
> > > 
> > > Fixes f313520 ("xen_disk: add discard support")
> > > 
> > > Signed-off-by: Olaf Hering 
> > > ---
> > 
> > Qualifies as a bug fix, so requesting 2.8 inclusion.
> > Reviewed-by: Eric Blake 
> 
> Stefano, are you going to merge this or should I take a look?

I can merge it.

Cheers,

Stefano

Re: [Qemu-devel] [PATCH v3] xen_disk: split discard input to match internal representation

2016-11-23 Thread Stefano Stabellini

On Wed, 23 Nov 2016, Olaf Hering wrote:
> The guest sends discard requests as u64 sector/count pairs, but the
> block layer operates internally with s64/s32 pairs. The conversion
> leads to IO errors in the guest, the discard request is not processed.
> 
>   domU.cfg:
>   'vdev=xvda, format=qcow2, backendtype=qdisk, target=/x.qcow2'
>   domU:
>   mkfs.ext4 -F /dev/xvda
>   Discarding device blocks: failed - Input/output error
> 
> Fix this by splitting the request into chunks of BDRV_REQUEST_MAX_SECTORS.
> Add input range checking to avoid overflow.
> 
> Fixes f313520 ("xen_disk: add discard support")
> 
> Signed-off-by: Olaf Hering 

Reviewed-by: Stefano Stabellini 


> v3:
>  turn tab into spaces to fix checkpatch warning
> v2:
>  adjust overflow check
>  add Fixes revspec because the initial commit also failed to convert u64 to 
> s32
>  adjust summary
> 
>  hw/block/xen_disk.c | 42 --
>  1 file changed, 36 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 3a7dc19..456a2d5 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -660,6 +660,38 @@ static void qemu_aio_complete(void *opaque, int ret)
>  qemu_bh_schedule(ioreq->blkdev->bh);
>  }
>  
> +static bool blk_split_discard(struct ioreq *ioreq, blkif_sector_t 
> sector_number,
> +  uint64_t nr_sectors)
> +{
> +struct XenBlkDev *blkdev = ioreq->blkdev;
> +int64_t byte_offset;
> +int byte_chunk;
> +uint64_t byte_remaining, limit;
> +uint64_t sec_start = sector_number;
> +uint64_t sec_count = nr_sectors;
> +
> +/* Wrap around, or overflowing byte limit? */
> +if (sec_start + sec_count < sec_count ||
> +sec_start + sec_count > INT64_MAX >> BDRV_SECTOR_BITS) {
> +return false;
> +}
> +
> +limit = BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS;
> +byte_offset = sec_start << BDRV_SECTOR_BITS;
> +byte_remaining = sec_count << BDRV_SECTOR_BITS;
> +
> +do {
> +byte_chunk = byte_remaining > limit ? limit : byte_remaining;
> +ioreq->aio_inflight++;
> +blk_aio_pdiscard(blkdev->blk, byte_offset, byte_chunk,
> + qemu_aio_complete, ioreq);
> +byte_remaining -= byte_chunk;
> +byte_offset += byte_chunk;
> +} while (byte_remaining > 0);
> +
> +return true;
> +}
> +
>  static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>  {
>  struct XenBlkDev *blkdev = ioreq->blkdev;
> @@ -708,12 +740,10 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>  break;
>  case BLKIF_OP_DISCARD:
>  {
> -struct blkif_request_discard *discard_req = (void *)&ioreq->req;
> -ioreq->aio_inflight++;
> -blk_aio_pdiscard(blkdev->blk,
> - discard_req->sector_number << BDRV_SECTOR_BITS,
> - discard_req->nr_sectors << BDRV_SECTOR_BITS,
> - qemu_aio_complete, ioreq);
> +struct blkif_request_discard *req = (void *)&ioreq->req;
> +if (!blk_split_discard(ioreq, req->sector_number, req->nr_sectors)) {
> +goto err;
> +}
>  break;
>  }
>  default:
>
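
For readers following the thread, here is a minimal standalone sketch of the splitting and range check described in the commit message. It is not the patch itself: SECTOR_BITS, MAX_CHUNK_BYTES, discard_fn, split_discard and count_chunk are names assumed for illustration, and the callback merely stands in for the asynchronous discard call. Unlike the do/while loop in the patch, this sketch submits nothing for an empty request.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; QEMU defines its own constants. */
#define SECTOR_BITS 9
#define MAX_CHUNK_BYTES ((uint64_t)INT32_MAX & ~((1ULL << SECTOR_BITS) - 1))

/* Hypothetical callback standing in for the asynchronous discard call. */
typedef void (*discard_fn)(int64_t byte_offset, int byte_count, void *opaque);

/*
 * Split [sector, sector + count) into byte ranges no larger than
 * MAX_CHUNK_BYTES. Returns false if the range wraps around or its end
 * does not fit into a signed 64-bit byte offset.
 */
static bool split_discard(uint64_t sector, uint64_t count,
                          discard_fn submit, void *opaque)
{
    uint64_t end = sector + count;

    if (end < count || end > ((uint64_t)INT64_MAX >> SECTOR_BITS)) {
        return false;
    }

    uint64_t offset = sector << SECTOR_BITS;
    uint64_t remaining = count << SECTOR_BITS;

    while (remaining > 0) {
        uint64_t chunk = remaining > MAX_CHUNK_BYTES ? MAX_CHUNK_BYTES
                                                     : remaining;
        submit((int64_t)offset, (int)chunk, opaque);
        offset += chunk;
        remaining -= chunk;
    }
    return true;
}

/* Example callback: count the chunks and sum the bytes. */
struct discard_stats {
    unsigned chunks;
    uint64_t bytes;
};

static void count_chunk(int64_t byte_offset, int byte_count, void *opaque)
{
    struct discard_stats *st = opaque;
    (void)byte_offset;
    st->chunks++;
    st->bytes += (uint64_t)byte_count;
}
```

Each chunk fits into an int, so the u64 sector count from the guest never has to be narrowed in a single step.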

Re: [Qemu-devel] [PATCH 3/3] xen: ignore direction in bufioreq handling

2016-11-23 Thread Stefano Stabellini

On Wed, 23 Nov 2016, Paul Durrant wrote:
> > -Original Message-
> > From: Jan Beulich [mailto:jbeul...@suse.com]
> > Sent: 23 November 2016 09:25
> > To: qemu-devel@nongnu.org
> > Cc: Anthony Perard ; Paul Durrant
> > ; Stefano Stabellini ; xen-
> > devel 
> > Subject: [PATCH 3/3] xen: ignore direction in bufioreq handling
> > 
> > There's no way to communicate back read data, so only writes can ever
> > be usefully specified. Ignore the field, paving the road for eventually
> > re-using the bit for something else in a few (many?) years time.
> > 
> > Signed-off-by: Jan Beulich 
> 
> Reviewed-by: Paul Durrant 

Acked-by: Stefano Stabellini 


> > 
> > --- a/xen-hvm.c
> > +++ b/xen-hvm.c
> > @@ -997,6 +997,7 @@ static int handle_buffered_iopage(XenIOS
> >  memset(&req, 0x00, sizeof(req));
> >  req.state = STATE_IOREQ_READY;
> >  req.count = 1;
> > +req.dir = IOREQ_WRITE;
> > 
> >  for (;;) {
> >  uint32_t rdptr = buf_page->read_pointer, wrptr;
> > @@ -1014,7 +1015,6 @@ static int handle_buffered_iopage(XenIOS
> >  req.size = 1U << buf_req->size;
> >  req.addr = buf_req->addr;
> >  req.data = buf_req->data;
> > -req.dir = buf_req->dir;
> >  req.type = buf_req->type;
> >  xen_rmb();
> >  qw = (req.size == 8);
> > @@ -1031,10 +1031,12 @@ static int handle_buffered_iopage(XenIOS
> >  handle_ioreq(state, &req);
> > 
> >  /* Only req.data may get updated by handle_ioreq(), albeit even 
> > that
> > - * should not happen as such data would never make it to the guest.
> > + * should not happen as such data would never make it to the guest 
> > (we
> > + * can only usefully see writes here after all).
> >   */
> >  assert(req.state == STATE_IOREQ_READY);
> >  assert(req.count == 1);
> > +assert(req.dir == IOREQ_WRITE);
> >  assert(!req.data_is_ptr);
> > 
> >  atomic_add(&buf_page->read_pointer, qw + 1);
> > 
> > 
>

[Qemu-devel] [QEMU PATCH v14 2/4] migration: migrate QTAILQ

2016-11-23 Thread Jianjun Duan

Currently we cannot directly transfer a QTAILQ instance because of the
limitation in the migration code. Here we introduce an approach to
transfer such structures. We created VMStateInfo vmstate_info_qtailq
for QTAILQ. Similar VMStateInfo can be created for other data structures
such as list.

When a QTAILQ is migrated from source to target, it is appended to the
corresponding QTAILQ structure, which is assumed to have been properly
initialized.

This approach will be used to transfer pending_events and ccs_list in spapr
state.

We also create some macros in qemu/queue.h to access a QTAILQ using pointer
arithmetic. This ensures that we do not depend on the implementation
details about QTAILQ in the migration code.
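
To illustrate the idea of walking a list through pointer arithmetic only, here is a simplified standalone sketch. It assumes a plain singly linked layout (struct link, link_at and raw_list_length are hypothetical names); the real QTAILQ layout and the macros in this patch differ.

```c
#include <stddef.h>

/* Hypothetical link type; QEMU's QTAILQ_ENTRY has a different layout. */
struct link {
    void *next;              /* next element, or NULL at the end */
};

/* Return a pointer to the link embedded `offset` bytes into `elm`. */
static struct link *link_at(void *elm, size_t offset)
{
    return (struct link *)((char *)elm + offset);
}

/* Count elements without knowing the element type, only the link offset. */
static size_t raw_list_length(void *first, size_t link_offset)
{
    size_t n = 0;
    for (void *elm = first; elm != NULL;
         elm = link_at(elm, link_offset)->next) {
        n++;
    }
    return n;
}

/* Example element type; only its link offset is handed to generic code. */
struct event {
    int id;
    struct link node;
};
```

A caller would pass offsetof(struct event, node) as link_offset, which is the role the start field plays in the VMStateField description.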

Signed-off-by: Jianjun Duan 
---
 include/migration/vmstate.h | 20 +
 include/qemu/queue.h| 60 +++
 migration/trace-events  |  4 +++
 migration/vmstate.c | 69 +
 4 files changed, 153 insertions(+)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index eafc8f2..e47ad6e 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -253,6 +253,7 @@ extern const VMStateInfo vmstate_info_timer;
 extern const VMStateInfo vmstate_info_buffer;
 extern const VMStateInfo vmstate_info_unused_buffer;
 extern const VMStateInfo vmstate_info_bitmap;
+extern const VMStateInfo vmstate_info_qtailq;
 
 #define type_check_2darray(t1,t2,n,m) ((t1(*)[n][m])0 - (t2*)0)
 #define type_check_array(t1,t2,n) ((t1(*)[n])0 - (t2*)0)
@@ -664,6 +665,25 @@ extern const VMStateInfo vmstate_info_bitmap;
 .offset   = offsetof(_state, _field),\
 }
 
+/* For migrating a QTAILQ.
+ * Target QTAILQ needs to be properly initialized.
+ * _type: type of QTAILQ element
+ * _next: name of QTAILQ entry field in QTAILQ element
+ * _vmsd: VMSD for QTAILQ element
+ * size: size of QTAILQ element
+ * start: offset of QTAILQ entry in QTAILQ element
+ */
+#define VMSTATE_QTAILQ_V(_field, _state, _version, _vmsd, _type, _next)  \
+{\
+.name = (stringify(_field)), \
+.version_id   = (_version),  \
+.vmsd = &(_vmsd),\
+.size = sizeof(_type),   \
+.info = &vmstate_info_qtailq,\
+.offset   = offsetof(_state, _field),\
+.start= offsetof(_type, _next),  \
+}
+
 /* _f : field name
_f_n : num of elements field_name
_n : num of elements
diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index 342073f..35292c3 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -438,4 +438,64 @@ struct {   
 \
 #define QTAILQ_PREV(elm, headname, field) \
 (*(((struct headname *)((elm)->field.tqe_prev))->tqh_last))
 
+#define field_at_offset(base, offset, type)
\
+((type) (((char *) (base)) + (offset)))
+
+typedef struct DUMMY_Q_ENTRY DUMMY_Q_ENTRY;
+typedef struct DUMMY_Q DUMMY_Q;
+
+struct DUMMY_Q_ENTRY {
+QTAILQ_ENTRY(DUMMY_Q_ENTRY) next;
+};
+
+struct DUMMY_Q {
+QTAILQ_HEAD(DUMMY_Q_HEAD, DUMMY_Q_ENTRY) head;
+};
+
+#define dummy_q ((DUMMY_Q *) 0)
+#define dummy_qe ((DUMMY_Q_ENTRY *) 0)
+
+/*
+ * Offsets of layout of a tail queue head.
+ */
+#define QTAILQ_FIRST_OFFSET (offsetof(typeof(dummy_q->head), tqh_first))
+#define QTAILQ_LAST_OFFSET  (offsetof(typeof(dummy_q->head), tqh_last))
+/*
+ * Raw access of elements of a tail queue
+ */
+#define QTAILQ_RAW_FIRST(head) 
\
+(*field_at_offset(head, QTAILQ_FIRST_OFFSET, void **))
+#define QTAILQ_RAW_TQH_LAST(head)  
\
+(*field_at_offset(head, QTAILQ_LAST_OFFSET, void ***))
+
+/*
+ * Offsets of layout of a tail queue element.
+ */
+#define QTAILQ_NEXT_OFFSET (offsetof(typeof(dummy_qe->next), tqe_next))
+#define QTAILQ_PREV_OFFSET (offsetof(typeof(dummy_qe->next), tqe_prev))
+
+/*
+ * Raw access of elements of a tail entry
+ */
+#define QTAILQ_RAW_NEXT(elm, entry)
\
+(*field_at_offset(elm, entry + QTAILQ_NEXT_OFFSET, void **))
+#define QTAILQ_RAW_TQE_PREV(elm, entry)
\
+(*field_at_offset(elm, entry + QTAILQ_PREV_OFFSET, void ***))
+/*
+ * Tail queue traversal using pointer arithmetic.
+ */
+#define QTAILQ_RAW_FOREACH(elm, head, entry)   
\
+for ((elm) = QTAILQ_RAW_FIRST(head);   
\
+ (elm);

[Qemu-devel] [QEMU PATCH v14 0/4] migration: migrate QTAILQ

2016-11-23 Thread Jianjun Duan

Hi all,

I addressed some review comments. Comments are welcome. 

v14: - Fixed a return statement.

Previous versions are:

v13: - Changed some QTAILQ related macro names to match existing ones. 
(link: http://lists.nongnu.org/archive/html/qemu-ppc/2016-11/msg00226.html)

v12: - Fixed type for put_qtailq which caused build break.
(link: http://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01328.html)

v11: - Split error_report statements into a separate patch.
 - Changed the signature of put. It now returns int type.
 - Minor changes to QTAILQ macros. 
 
v10: - Fixed a typo.
(link: http://lists.nongnu.org/archive/html/qemu-ppc/2016-10/msg01206.html)

v9: - No more hard encoding of QTAILQ layout information
(link: http://lists.nongnu.org/archive/html/qemu-ppc/2016-10/msg01042.html)

v8: - Fixed a style issue. 
(link: http://lists.nongnu.org/archive/html/qemu-ppc/2016-10/msg00874.html)

v7: - Fixed merge errors.
- Simplified macro definitions related to pointer arithmetic based QTAILQ 
access.
- Added test case for QTAILQ migration in tests/test-vmstate.c.
(link: http://lists.nongnu.org/archive/html/qemu-ppc/2016-10/msg00711.html)


v6: - Split from Power specific patches. 
- Dropped VMS_LINKED flag.
- Rebased to master.
- Added comments to clarify about put/get in VMStateInfo.  
(link: http://lists.nongnu.org/archive/html/qemu-ppc/2016-10/msg00336.html)

v5: - Rebased to David's ppc-for-2.8. 
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg00270.html)

v4: - Introduce a way to set customized instance_id in SaveStateEntry. Use it
  to set instance_id for DRC using its unique index to address David 
  Gibson's concern.
- Rename VMS_CSTM to VMS_LINKED based on Paolo Bonzini's suggestions.
- Clean up qjson stuff in put_qtailq. 
- Add trace for put_qtailq and get_qtailq based on David Gilbert's 
  suggestion.
- Based on David's ppc-for-2.7. 
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg07720.html)

v3: - Simplify overall design following discussion with Paolo. No longer need
  metadata to migrate QTAILQ.
- Extend VMStateInfo instead of adding similar fields to VMStateField.
- Clean up macros in qemu/queue.h.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg05695.html)

v2: - Introduce a general approach to migrate QTAILQ in qemu/queue.h.
- Migrate signalled field in the DRC state.
- Put the newly added migrating fields in subsections so that backward 
  migration is not broken.  
- Set detach_cb field right after migration so that a migrated hot-unplug
  event could finish its course.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg04188.html)

v1: - Initial version.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-04/msg02601.html)


Jianjun Duan (4):
  migration: extend VMStateInfo
  migration: migrate QTAILQ
  tests/migration: Add test for QTAILQ migration
  migration: add error_report

 hw/display/virtio-gpu.c |   8 +-
 hw/intc/s390_flic_kvm.c |   8 +-
 hw/net/vmxnet3.c|  24 --
 hw/nvram/eeprom93xx.c   |   8 +-
 hw/nvram/fw_cfg.c   |   8 +-
 hw/pci/msix.c   |   8 +-
 hw/pci/pci.c|  16 +++-
 hw/pci/shpc.c   |   7 +-
 hw/scsi/scsi-bus.c  |   8 +-
 hw/timer/twl92230.c |   8 +-
 hw/usb/redirect.c   |  26 +--
 hw/virtio/virtio-pci.c  |   8 +-
 hw/virtio/virtio.c  |  15 +++-
 include/migration/vmstate.h |  39 --
 include/qemu/queue.h|  60 +++
 migration/savevm.c  |   7 +-
 migration/trace-events  |   4 +
 migration/vmstate.c | 184 +++-
 target-alpha/machine.c  |   6 +-
 target-arm/machine.c|  14 +++-
 target-i386/machine.c   |  26 +--
 target-mips/machine.c   |  14 +++-
 target-ppc/machine.c|  12 ++-
 target-sparc/machine.c  |   6 +-
 tests/test-vmstate.c| 160 ++
 25 files changed, 578 insertions(+), 106 deletions(-)

-- 
1.9.1

[Qemu-devel] [QEMU PATCH v14 3/4] tests/migration: Add test for QTAILQ migration

2016-11-23 Thread Jianjun Duan

Add a test for QTAILQ migration to tests/test-vmstate.c.

Signed-off-by: Jianjun Duan 
---
 tests/test-vmstate.c | 160 +++
 1 file changed, 160 insertions(+)

diff --git a/tests/test-vmstate.c b/tests/test-vmstate.c
index d2f529b..88aab8c 100644
--- a/tests/test-vmstate.c
+++ b/tests/test-vmstate.c
@@ -544,6 +544,163 @@ static void test_arr_ptr_str_no0_load(void)
 }
 }
 
+/* test QTAILQ migration */
+typedef struct TestQtailqElement TestQtailqElement;
+
+struct TestQtailqElement {
+bool b;
+uint8_t  u8;
+QTAILQ_ENTRY(TestQtailqElement) next;
+};
+
+typedef struct TestQtailq {
+int16_t  i16;
+QTAILQ_HEAD(TestQtailqHead, TestQtailqElement) q;
+int32_t  i32;
+} TestQtailq;
+
+static const VMStateDescription vmstate_q_element = {
+.name = "test/queue-element",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_BOOL(b, TestQtailqElement),
+VMSTATE_UINT8(u8, TestQtailqElement),
+VMSTATE_END_OF_LIST()
+},
+};
+
+static const VMStateDescription vmstate_q = {
+.name = "test/queue",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_INT16(i16, TestQtailq),
+VMSTATE_QTAILQ_V(q, TestQtailq, 1, vmstate_q_element, 
TestQtailqElement,
+ next),
+VMSTATE_INT32(i32, TestQtailq),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static void test_save_q(void)
+{
+TestQtailq obj_q = {
+.i16 = -512,
+.i32 = 70000,
+};
+
+TestQtailqElement obj_qe1 = {
+.b = true,
+.u8 = 130,
+};
+
+TestQtailqElement obj_qe2 = {
+.b = false,
+.u8 = 65,
+};
+
+uint8_t wire_q[] = {
+/* i16 */ 0xfe, 0x0,
+/* start of element 0 of q */ 0x01,
+/* .b  */ 0x01,
+/* .u8 */ 0x82,
+/* start of element 1 of q */ 0x01,
+/* b */   0x00,
+/* u8 */  0x41,
+/* end of q */0x00,
+/* i32 */ 0x00, 0x01, 0x11, 0x70,
+QEMU_VM_EOF, /* just to ensure we won't get EOF reported prematurely */
+};
+
+QTAILQ_INIT(&obj_q.q);
+QTAILQ_INSERT_TAIL(&obj_q.q, &obj_qe1, next);
+QTAILQ_INSERT_TAIL(&obj_q.q, &obj_qe2, next);
+
+save_vmstate(&vmstate_q, &obj_q);
+compare_vmstate(wire_q, sizeof(wire_q));
+}
+
+static void test_load_q(void)
+{
+TestQtailq obj_q = {
+.i16 = -512,
+.i32 = 70000,
+};
+
+TestQtailqElement obj_qe1 = {
+.b = true,
+.u8 = 130,
+};
+
+TestQtailqElement obj_qe2 = {
+.b = false,
+.u8 = 65,
+};
+
+uint8_t wire_q[] = {
+/* i16 */ 0xfe, 0x0,
+/* start of element 0 of q */ 0x01,
+/* .b  */ 0x01,
+/* .u8 */ 0x82,
+/* start of element 1 of q */ 0x01,
+/* b */   0x00,
+/* u8 */  0x41,
+/* end of q */0x00,
+/* i32 */ 0x00, 0x01, 0x11, 0x70,
+};
+
+QTAILQ_INIT(&obj_q.q);
+QTAILQ_INSERT_TAIL(&obj_q.q, &obj_qe1, next);
+QTAILQ_INSERT_TAIL(&obj_q.q, &obj_qe2, next);
+
+QEMUFile *fsave = open_test_file(true);
+
+qemu_put_buffer(fsave, wire_q, sizeof(wire_q));
+qemu_put_byte(fsave, QEMU_VM_EOF);
+g_assert(!qemu_file_get_error(fsave));
+qemu_fclose(fsave);
+
+QEMUFile *fload = open_test_file(false);
+TestQtailq tgt;
+
+QTAILQ_INIT(&tgt.q);
+vmstate_load_state(fload, &vmstate_q, &tgt, 1);
+char eof = qemu_get_byte(fload);
+g_assert(!qemu_file_get_error(fload));
+g_assert_cmpint(tgt.i16, ==, obj_q.i16);
+g_assert_cmpint(tgt.i32, ==, obj_q.i32);
+g_assert_cmpint(eof, ==, QEMU_VM_EOF);
+
+TestQtailqElement *qele_from = QTAILQ_FIRST(&obj_q.q);
+TestQtailqElement *qlast_from = QTAILQ_LAST(&obj_q.q, TestQtailqHead);
+TestQtailqElement *qele_to = QTAILQ_FIRST(&tgt.q);
+TestQtailqElement *qlast_to = QTAILQ_LAST(&tgt.q, TestQtailqHead);
+
+while (1) {
+g_assert_cmpint(qele_to->b, ==, qele_from->b);
+g_assert_cmpint(qele_to->u8, ==, qele_from->u8);
+if ((qele_from == qlast_from) || (qele_to == qlast_to)) {
+break;
+}
+qele_from = QTAILQ_NEXT(qele_from, next);
+qele_to = QTAILQ_NEXT(qele_to, next);
+}
+
+g_assert_cmpint((uint64_t) qele_from, ==, (uint64_t) qlast_from);
+g_assert_cmpint((uint64_t) qele_to, ==, (uint64_t) qlast_to);
+
+/* clean up */
+TestQtailqElement *qele;
+while (!QTAILQ_EMPTY(&tgt.q)) {
+qele = QTAILQ_LAST(&tgt.q, TestQtailqHead);
+QTAILQ_REMOVE(&tgt.q, qele, next);
+free(qele);
+qele = NULL;
+}
+qemu_fclose(fload);
+}
+
 int

[Qemu-devel] sane char device writes?

2016-11-23 Thread Michal Suchánek

Hello,

I have reported the issue with qemu aborting in spapr_vty.c because
gtk.c submitted more data than can be sent to the emulated serial port.

While the abort has been resolved and spapr_vty.c should now truncate the
data, getting the data through is still not possible.

Looking in the code I see that console.c has this code (which is the only
piece of code in UI corresponding to the gtk part I found):

static void kbd_send_chars(void *opaque)
{
    QemuConsole *s = opaque;
    int len;
    uint8_t buf[16];

    len = qemu_chr_be_can_write(s->chr);
    if (len > s->out_fifo.count)
        len = s->out_fifo.count;
    if (len > 0) {
        if (len > sizeof(buf))
            len = sizeof(buf);
        qemu_fifo_read(&s->out_fifo, buf, len);
        qemu_chr_be_write(s->chr, buf, len);
    }
    /* characters are pending: we send them a bit later (XXX:
       horrible, should change char device API) */
    if (s->out_fifo.count > 0) {
        timer_mod(s->kbd_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + 1);
    }
}

The corresponding piece of code in gtk.c is AFAICT

static gboolean gd_vc_in(VteTerminal *terminal, gchar *text, guint size,
                         gpointer user_data)
{
    VirtualConsole *vc = user_data;

    if (vc->vte.echo) {
        VteTerminal *term = VTE_TERMINAL(vc->vte.terminal);
        int i;
        for (i = 0; i < size; i++) {
            uint8_t c = text[i];
            if (c >= 128 || isprint(c)) {
                /* 8-bit characters are considered printable.  */
                vte_terminal_feed(term, &text[i], 1);
            } else if (c == '\r' || c == '\n') {
                vte_terminal_feed(term, "\r\n", 2);
            } else {
                char ctrl[2] = { '^', 0};
                ctrl[1] = text[i] ^ 64;
                vte_terminal_feed(term, ctrl, 2);
            }
        }
    }

    qemu_chr_be_write(vc->vte.chr, (uint8_t  *)text, (unsigned int)size);
    return TRUE;
}

meaning there is no loop to split the submitted text buffer.

gd_vc_in is the VTE callback handling input, so I suspect it either handles
it or not, and it cannot say it handled only part of the "commit" event.

So for this to work an extra buffer would have to be stored in gtk.c
somewhere, and possibly a similar timer trick used as in console.c.

Any ideas how to do this without introducing too much insanity?

Presumably, when using a GTK timer for repeating gd_vc_in, the handler would
run in the same GTK UI thread as the "commit" signal handler and
excessive locking would not be required.

The data passed to gd_vc_in is presumably freed when it ends so it
would have to be copied somewhere. It's quite possible to create a
static list in gd_vc_in or some extra field in VirtualConsole.
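
As a rough standalone sketch of what such an extra buffer could look like (struct pending, pending_push, pending_drain and the sink are hypothetical names, not existing gtk.c or chardev code): input is copied into a queue, and each drain hands over at most as many bytes as the backend reports it can take.

```c
#include <stddef.h>
#include <string.h>

#define PENDING_MAX 4096   /* arbitrary capacity chosen for this sketch */

/* Hypothetical per-console queue of bytes not yet accepted by the backend. */
struct pending {
    unsigned char data[PENDING_MAX];
    size_t len;
};

/* Copy input into the queue; returns the number of bytes actually queued. */
static size_t pending_push(struct pending *p, const void *buf, size_t size)
{
    size_t room = PENDING_MAX - p->len;
    size_t n = size < room ? size : room;
    memcpy(p->data + p->len, buf, n);
    p->len += n;
    return n;
}

/*
 * Hand at most `can_write` bytes to `write_fn` and keep the rest queued.
 * Returns the number of bytes still pending, so the caller knows whether
 * to re-arm its timer.
 */
static size_t pending_drain(struct pending *p, size_t can_write,
                            void (*write_fn)(const unsigned char *, size_t,
                                             void *),
                            void *opaque)
{
    size_t n = p->len < can_write ? p->len : can_write;
    if (n > 0) {
        write_fn(p->data, n, opaque);
        memmove(p->data, p->data + n, p->len - n);
        p->len -= n;
    }
    return p->len;
}

/* Example sink that appends what it receives to a flat buffer. */
struct sink {
    unsigned char out[64];
    size_t len;
};

static void sink_write(const unsigned char *buf, size_t size, void *opaque)
{
    struct sink *s = opaque;
    memcpy(s->out + s->len, buf, size);
    s->len += size;
}
```

In gtk.c the can_write value would presumably come from qemu_chr_be_can_write() and the write callback would wrap qemu_chr_be_write(), with the timer re-armed while the drain returns non-zero.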

Thanks

Michal

Re: [Qemu-devel] [PATCH 1/3] xen: fix quad word bufioreq handling

2016-11-23 Thread Stefano Stabellini

On Wed, 23 Nov 2016, Jan Beulich wrote:
> >>> On 23.11.16 at 11:45,  wrote:
> > No, if QEMU is using a default ioreq server (i.e. the legacy way of doing 
> > things) then it's vulnerable to the guest messing with the rings and I'd 
> > forgotten that migrated-in guests from old QEMUs also end up using the 
> > default 
> > server, so I guess this is a worthy check to make... although maybe it's 
> > best to just bail if the check fails, since it would indicate a malicious 
> > guest.
> 
> Okay, that's basically the TBD note I have in the patch; I'll wait for
> at least one of the qemu maintainers to voice their preference.
 
I think we should just print an error and destroy_hvm_domain(false) or
hw_error if the check fails.

Re: [Qemu-devel] [PATCH 2/3] xen: slightly simplify bufioreq handling

2016-11-23 Thread Stefano Stabellini

On Wed, 23 Nov 2016, Jan Beulich wrote:
> There's no point setting fields always receiving the same value on each
> iteration, as handle_ioreq() doesn't alter them anyway. Set state and
> count once ahead of the loop, drop the redundant clearing of
> data_is_ptr, and avoid the meaningless setting of df altogether.

Why is setting df meaningless?


> Also avoid doing an unsigned long calculation of size when the field to
> be initialized is only 32 bits wide (and the shift value in the range
> 0...3).
> 
> Signed-off-by: Jan Beulich 
> 
> --- a/xen-hvm.c
> +++ b/xen-hvm.c
> @@ -995,6 +995,8 @@ static int handle_buffered_iopage(XenIOS
>  }
>  
>  memset(&req, 0x00, sizeof(req));
> +req.state = STATE_IOREQ_READY;
> +req.count = 1;
>  
>  for (;;) {
>  uint32_t rdptr = buf_page->read_pointer, wrptr;
> @@ -1009,15 +1011,11 @@ static int handle_buffered_iopage(XenIOS
>  break;
>  }
>  buf_req = &buf_page->buf_ioreq[rdptr % IOREQ_BUFFER_SLOT_NUM];
> -req.size = 1UL << buf_req->size;
> -req.count = 1;
> +req.size = 1U << buf_req->size;
>  req.addr = buf_req->addr;
>  req.data = buf_req->data;
> -req.state = STATE_IOREQ_READY;
>  req.dir = buf_req->dir;
> -req.df = 1;
>  req.type = buf_req->type;
> -req.data_is_ptr = 0;
>  xen_rmb();
>  qw = (req.size == 8);
>  if (qw) {
> @@ -1032,6 +1030,13 @@ static int handle_buffered_iopage(XenIOS
>  
>  handle_ioreq(state, &req);
>  
> +/* Only req.data may get updated by handle_ioreq(), albeit even that
> + * should not happen as such data would never make it to the guest.
> + */
> +assert(req.state == STATE_IOREQ_READY);
> +assert(req.count == 1);
> +assert(!req.data_is_ptr);
> +
>  atomic_add(&buf_page->read_pointer, qw + 1);
>  }
>  
> 
> 
>

[Qemu-devel] [QEMU PATCH v14 4/4] migration: add error_report

2016-11-23 Thread Jianjun Duan

Added error_report where version_ids do not match in vmstate_load_state.

Signed-off-by: Jianjun Duan 
---
 migration/vmstate.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/migration/vmstate.c b/migration/vmstate.c
index 2f9d4ba..0e6fce4 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -85,6 +85,7 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription 
*vmsd,
 
 trace_vmstate_load_state(vmsd->name, version_id);
 if (version_id > vmsd->version_id) {
+error_report("%s %s",  vmsd->name, "too new");
 trace_vmstate_load_state_end(vmsd->name, "too new", -EINVAL);
 return -EINVAL;
 }
@@ -95,6 +96,7 @@ int vmstate_load_state(QEMUFile *f, const VMStateDescription 
*vmsd,
 trace_vmstate_load_state_end(vmsd->name, "old path", ret);
 return ret;
 }
+error_report("%s %s",  vmsd->name, "too old");
 trace_vmstate_load_state_end(vmsd->name, "too old", -EINVAL);
 return -EINVAL;
 }
-- 
1.9.1

[Qemu-devel] [PATCH v2 2/4] test-qga: Avoid qobject_from_jsonv("%"PRId64)

2016-11-23 Thread Eric Blake

The qobject_from_jsonv() function implements a pseudo-printf
language for creating a QObject; however, it is hard-coded to
only parse a subset of formats understood by -Wformat, and is
not a straight synonym to bare printf().  In particular, any
use of an int64_t integer works only if the system's
definition of PRId64 matches what the parser expects; which
works on glibc (%lld or %ld depending on 32- vs. 64-bit) and
mingw (%I64d), but not on Mac OS (%qd).  Rather than enhance
the parser, it is just as easy to use normal printf() for
this particular conversion, matching what is done elsewhere
in this file [1], which is safe in this instance because the
format does not contain any of the problematic differences
(bare '%' or the '%s' format).

The use of PRId64 for a variable named 'pid' is gross, but it
is a sad reality of the 64-bit mingw environment, which
mistakenly defines pid_t as a 64-bit type even though getpid()
returns 'int' on that platform [2].  Our definition of the
QGA GuestExec type defines 'pid' as a 64-bit entity, and we
can't tighten it to 'int32' unless the mingw header is fixed.
Using 'long long' instead of 'int64_t' just so that we can
stick with qobject_from_jsonv("%lld") instead of printf() is
not any prettier, since we may have later type churn anyways.

[1] see 'git grep -A2 strdup_printf tests/test-qga.c'
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1397787
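
A minimal sketch of the approach, assuming plain snprintf() instead of the g_strdup_printf() used in the patch so that it builds without GLib (format_status_cmd is a hypothetical helper, not part of test-qga.c): the int64_t is formatted by the C library, which understands the platform's PRId64, and only the finished string is handed on.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a guest-exec-status command for `pid` into `buf`.
 * Returns the snprintf() result: the length needed, excluding the NUL.
 */
static int format_status_cmd(char *buf, size_t size, int64_t pid)
{
    return snprintf(buf, size,
                    "{'execute': 'guest-exec-status',"
                    " 'arguments': { 'pid': %" PRId64 " } }", pid);
}
```

Because the resulting string contains no '%' at all, it does not matter which printf subset the JSON parser implements.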

Reported-by: G 3 
Signed-off-by: Eric Blake 

---
v2: improve commit message, hoist allocation out of loop
---
 tests/test-qga.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tests/test-qga.c b/tests/test-qga.c
index 40af649..868b02a 100644
--- a/tests/test-qga.c
+++ b/tests/test-qga.c
@@ -837,6 +837,7 @@ static void test_qga_guest_exec(gconstpointer fix)
 int64_t pid, now, exitcode;
 gsize len;
 bool exited;
+char *cmd;

 /* exec 'echo foo bar' */
 ret = qmp_fd(fixture->fd, "{'execute': 'guest-exec', 'arguments': {"
@@ -851,9 +852,10 @@ static void test_qga_guest_exec(gconstpointer fix)

 /* wait for completion */
 now = g_get_monotonic_time();
+cmd = g_strdup_printf("{'execute': 'guest-exec-status',"
+  " 'arguments': { 'pid': %" PRId64 " } }", pid);
 do {
-ret = qmp_fd(fixture->fd, "{'execute': 'guest-exec-status',"
- " 'arguments': { 'pid': %" PRId64 "  } }", pid);
+ret = qmp_fd(fixture->fd, cmd);
 g_assert_nonnull(ret);
 val = qdict_get_qdict(ret, "return");
 exited = qdict_get_bool(val, "exited");
@@ -863,6 +865,7 @@ static void test_qga_guest_exec(gconstpointer fix)
 } while (!exited &&
  g_get_monotonic_time() < now + 5 * G_TIME_SPAN_SECOND);
 g_assert(exited);
+g_free(cmd);

 /* check stdout */
 exitcode = qdict_get_int(val, "exitcode");
-- 
2.7.4

[Qemu-devel] [QEMU PATCH v14 1/4] migration: extend VMStateInfo

2016-11-23 Thread Jianjun Duan

Current migration code cannot handle some data structures such as
QTAILQ in qemu/queue.h. Here we extend the signatures of put/get
in VMStateInfo so that customized handling is supported. put now
will return int type.
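
As a rough illustration of why an int-returning put can be useful, here is a standalone sketch with made-up types (struct stream, struct field_info, put_bytes and save_fields are hypothetical; the real signatures take QEMUFile, VMStateField and QJSON arguments as shown in the diff below). It assumes the caller wants to stop at the first failing field, which this series does not necessarily do.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real types are QEMUFile, VMStateField, etc. */
struct stream {
    unsigned char buf[16];
    size_t len;
};

struct field_info {
    const char *name;
    /* put reports failure through its return value: 0 or a negative code */
    int (*put)(struct stream *s, const void *pv, size_t size);
};

/* Example put callback that fails when the stream has no room left. */
static int put_bytes(struct stream *s, const void *pv, size_t size)
{
    const unsigned char *p = pv;
    if (size > sizeof(s->buf) - s->len) {
        return -1;
    }
    for (size_t i = 0; i < size; i++) {
        s->buf[s->len++] = p[i];
    }
    return 0;
}

/* Save every field in order, stopping at the first error. */
static int save_fields(struct stream *s, const struct field_info *fields,
                       size_t nfields, const void *pv, size_t size)
{
    for (size_t i = 0; i < nfields; i++) {
        int ret = fields[i].put(s, pv, size);
        if (ret < 0) {
            return ret;
        }
    }
    return 0;
}
```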

Signed-off-by: Jianjun Duan 
---
 hw/display/virtio-gpu.c |   8 +++-
 hw/intc/s390_flic_kvm.c |   8 +++-
 hw/net/vmxnet3.c|  24 +++---
 hw/nvram/eeprom93xx.c   |   8 +++-
 hw/nvram/fw_cfg.c   |   8 +++-
 hw/pci/msix.c   |   8 +++-
 hw/pci/pci.c|  16 +--
 hw/pci/shpc.c   |   7 ++-
 hw/scsi/scsi-bus.c  |   8 +++-
 hw/timer/twl92230.c |   8 +++-
 hw/usb/redirect.c   |  26 +++---
 hw/virtio/virtio-pci.c  |   8 +++-
 hw/virtio/virtio.c  |  15 --
 include/migration/vmstate.h |  19 ++--
 migration/savevm.c  |   7 ++-
 migration/vmstate.c | 113 +---
 target-alpha/machine.c  |   6 ++-
 target-arm/machine.c|  14 --
 target-i386/machine.c   |  26 +++---
 target-mips/machine.c   |  14 --
 target-ppc/machine.c|  12 +++--
 target-sparc/machine.c  |   6 ++-
 22 files changed, 263 insertions(+), 106 deletions(-)

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 60bce94..c58fa1b 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -988,7 +988,8 @@ static const VMStateDescription vmstate_virtio_gpu_scanouts 
= {
 },
 };
 
-static void virtio_gpu_save(QEMUFile *f, void *opaque, size_t size)
+static int virtio_gpu_save(QEMUFile *f, void *opaque, size_t size,
+   VMStateField *field, QJSON *vmdesc)
 {
 VirtIOGPU *g = opaque;
 struct virtio_gpu_simple_resource *res;
@@ -1013,9 +1014,12 @@ static void virtio_gpu_save(QEMUFile *f, void *opaque, 
size_t size)
 qemu_put_be32(f, 0); /* end of list */
 
 vmstate_save_state(f, &vmstate_virtio_gpu_scanouts, g, NULL);
+
+return 0;
 }
 
-static int virtio_gpu_load(QEMUFile *f, void *opaque, size_t size)
+static int virtio_gpu_load(QEMUFile *f, void *opaque, size_t size,
+   VMStateField *field)
 {
 VirtIOGPU *g = opaque;
 struct virtio_gpu_simple_resource *res;
diff --git a/hw/intc/s390_flic_kvm.c b/hw/intc/s390_flic_kvm.c
index 21ac2e2..61f512f 100644
--- a/hw/intc/s390_flic_kvm.c
+++ b/hw/intc/s390_flic_kvm.c
@@ -286,7 +286,8 @@ static void kvm_s390_release_adapter_routes(S390FLICState 
*fs,
  * increase until buffer is sufficient or maxium size is
  * reached
  */
-static void kvm_flic_save(QEMUFile *f, void *opaque, size_t size)
+static int kvm_flic_save(QEMUFile *f, void *opaque, size_t size,
+ VMStateField *field, QJSON *vmdesc)
 {
 KVMS390FLICState *flic = opaque;
 int len = FLIC_SAVE_INITIAL_SIZE;
@@ -319,6 +320,8 @@ static void kvm_flic_save(QEMUFile *f, void *opaque, size_t 
size)
 count * sizeof(struct kvm_s390_irq));
 }
 g_free(buf);
+
+return 0;
 }
 
 /**
@@ -331,7 +334,8 @@ static void kvm_flic_save(QEMUFile *f, void *opaque, size_t 
size)
  * Note: Do nothing when no interrupts where stored
  * in QEMUFile
  */
-static int kvm_flic_load(QEMUFile *f, void *opaque, size_t size)
+static int kvm_flic_load(QEMUFile *f, void *opaque, size_t size,
+ VMStateField *field)
 {
 uint64_t len = 0;
 uint64_t count = 0;
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 92f6af9..4163ca8 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2451,7 +2451,8 @@ static void vmxnet3_put_tx_stats_to_file(QEMUFile *f,
 qemu_put_be64(f, tx_stat->pktsTxDiscard);
 }
 
-static int vmxnet3_get_txq_descr(QEMUFile *f, void *pv, size_t size)
+static int vmxnet3_get_txq_descr(QEMUFile *f, void *pv, size_t size,
+VMStateField *field)
 {
 Vmxnet3TxqDescr *r = pv;
 
@@ -2465,7 +2466,8 @@ static int vmxnet3_get_txq_descr(QEMUFile *f, void *pv, 
size_t size)
 return 0;
 }
 
-static void vmxnet3_put_txq_descr(QEMUFile *f, void *pv, size_t size)
+static int vmxnet3_put_txq_descr(QEMUFile *f, void *pv, size_t size,
+ VMStateField *field, QJSON *vmdesc)
 {
 Vmxnet3TxqDescr *r = pv;
 
@@ -2474,6 +2476,8 @@ static void vmxnet3_put_txq_descr(QEMUFile *f, void *pv, 
size_t size)
 qemu_put_byte(f, r->intr_idx);
 qemu_put_be64(f, r->tx_stats_pa);
 vmxnet3_put_tx_stats_to_file(f, &r->txq_stats);
+
+return 0;
 }
 
 static const VMStateInfo txq_descr_info = {
@@ -2512,7 +2516,8 @@ static void vmxnet3_put_rx_stats_to_file(QEMUFile *f,
 qemu_put_be64(f, rx_stat->pktsRxError);
 }
 
-static int vmxnet3_get_rxq_descr(QEMUFile *f, void *pv, size_t size)
+static int vmxnet3_get_rxq_descr(QEMUFile *f, void *pv, size_t size,
+VMStateField *field)
 {
 Vmxnet3RxqDescr *r = pv;
 int i;
@@ -2530,7 +2535,8 @@ static int vmxnet3_get_rxq_descr(QEMUFile *f, void *pv, 
size_t size)
 return 0;
 }

Re: [Qemu-devel] [PATCH for-2.8] target-m68k: fix EXG instruction

2016-11-23 Thread Richard Henderson


On 11/23/2016 05:37 PM, Laurent Vivier wrote:

opcodes of "EXG Ax,Ay" and "EXG Dx,Dy" have been swapped

Signed-off-by: Laurent Vivier 
---
 target-m68k/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


Whoops.  ;-)

Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH 3/3] qapi: Drop support for qobject_from_jsonf("%"PRId64)

2016-11-23 Thread Eric Blake

On 11/23/2016 10:56 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> On 11/23/2016 08:17 AM, Markus Armbruster wrote:
>>
>>>
>>> The first two patches are bug fixes, and as such they should be
>>> considered for 2.8.
>>>
>>> This patch doesn't fix anything, and it might conceivably break
>>> something.  Too late for 2.8.
>>
>> Ah, but it DOES fix check-qjson on Mac OS.
> 
> PATCH 1+2 do, don't they?

No, patch 1 fixes emission of QMP events, patch 2 fixes check-qga. Also
reported broken by G 3 was check-qjson, and my audit revealed that
test-qobject-input-visitor is also affected.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




[Qemu-devel] [PATCH v2 4/4] RFC: qapi: Drop support for qobject_from_jsonf("%"PRId64)

2016-11-23 Thread Eric Blake

The qobject_from_jsonf() function implements a pseudo-printf
language for creating a QObject; however, it is hard-coded to
only parse a subset of formats understood by printf().  In
particular, any use of an int64_t integer works only if the
system's definition of PRId64 matches what the parser expects;
which works on glibc (%lld or %ld depending on 32- vs. 64-bit)
and mingw (%I64d), but not on Mac OS (%qd).  Rather than
enhance the parser, we have already eliminated all clients
that were using int64_t, and therefore all uses of %I64d.  No
one should be using long (%ld), since the size of that type
differs between 32- and 64-bit.

Therefore, reducing the parser to accept ONLY %lld allows us to
still accept 64-bit integers (passed in as long long, not
int64_t), while removing %ld and %I64d support gives us a few
more platforms (more than just the infrequent Mac OS' %qd) that
will fail if a later use picks an unrecognized format string.
There are few enough uses of qobject_from_json[fv]() that it is
easy to audit that all callers in the main code body are
correct, and all remaining callers in the testsuite are covered
by a successful 'make check'.  This patch gives confidence that
the earlier patches are appropriate for 2.8 on Mac OS, but this
particular patch is not hard freeze material, and should be
deferred to 2.9 (or else dynamic JSON pseudo-printf ripped out
in its entirety, rather than just unused formats).

Reported by: G 3 
Signed-off-by: Eric Blake 
---
 qobject/json-lexer.c  | 17 -
 qobject/json-parser.c |  9 -
 2 files changed, 4 insertions(+), 22 deletions(-)

diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
index af4a75e..a5e6570 100644
--- a/qobject/json-lexer.c
+++ b/qobject/json-lexer.c
@@ -61,9 +61,6 @@ enum json_lexer_state {
 IN_ESCAPE,
 IN_ESCAPE_L,
 IN_ESCAPE_LL,
-IN_ESCAPE_I,
-IN_ESCAPE_I6,
-IN_ESCAPE_I64,
 IN_WHITESPACE,
 IN_START,
 };
@@ -230,22 +227,9 @@ static const uint8_t json_lexer[][256] =  {
 },

 [IN_ESCAPE_L] = {
-['d'] = JSON_ESCAPE,
 ['l'] = IN_ESCAPE_LL,
 },

-[IN_ESCAPE_I64] = {
-['d'] = JSON_ESCAPE,
-},
-
-[IN_ESCAPE_I6] = {
-['4'] = IN_ESCAPE_I64,
-},
-
-[IN_ESCAPE_I] = {
-['6'] = IN_ESCAPE_I6,
-},
-
 [IN_ESCAPE] = {
 ['d'] = JSON_ESCAPE,
 ['i'] = JSON_ESCAPE,
@@ -253,7 +237,6 @@ static const uint8_t json_lexer[][256] =  {
 ['s'] = JSON_ESCAPE,
 ['f'] = JSON_ESCAPE,
 ['l'] = IN_ESCAPE_L,
-['I'] = IN_ESCAPE_I,
 },

 /* top level rule */
diff --git a/qobject/json-parser.c b/qobject/json-parser.c
index c18e48a..86b9d7f 100644
--- a/qobject/json-parser.c
+++ b/qobject/json-parser.c
@@ -461,23 +461,22 @@ static QObject *parse_escape(JSONParserContext *ctxt, 
va_list *ap)
 token = parser_context_pop_token(ctxt);
 assert(token && token->type == JSON_ESCAPE);

+/* We only accept a fixed subset of printf. In particular, PRId64
+ * is not guaranteed to work; use long long instead of int64_t. */
 if (!strcmp(token->str, "%p")) {
 return va_arg(*ap, QObject *);
 } else if (!strcmp(token->str, "%i")) {
 return QOBJECT(qbool_from_bool(va_arg(*ap, int)));
 } else if (!strcmp(token->str, "%d")) {
 return QOBJECT(qint_from_int(va_arg(*ap, int)));
-} else if (!strcmp(token->str, "%ld")) {
-return QOBJECT(qint_from_int(va_arg(*ap, long)));
-} else if (!strcmp(token->str, "%lld") ||
-   !strcmp(token->str, "%I64d")) {
+} else if (!strcmp(token->str, "%lld")) {
 return QOBJECT(qint_from_int(va_arg(*ap, long long)));
 } else if (!strcmp(token->str, "%s")) {
 return QOBJECT(qstring_from_str(va_arg(*ap, const char *)));
 } else if (!strcmp(token->str, "%f")) {
 return QOBJECT(qfloat_from_double(va_arg(*ap, double)));
 }
-return NULL;
+assert(false);
 }

 static QObject *parse_literal(JSONParserContext *ctxt)
-- 
2.7.4
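
For readers following along, the effect of the parse_escape() change can be sketched outside of QEMU as a closed lookup over the accepted conversions. This is only an illustration: the enum and the classify_escape() helper are hypothetical names invented for this sketch, not part of json-parser.c, and where the patch asserts on an unknown token the sketch simply returns an "invalid" value.

```c
#include <string.h>

/* Hypothetical value kinds, named for this sketch only. */
enum escape_kind {
    ESC_INVALID = 0,
    ESC_OBJECT,     /* %p   */
    ESC_BOOL,       /* %i   */
    ESC_INT,        /* %d   */
    ESC_LONG_LONG,  /* %lld */
    ESC_STRING,     /* %s   */
    ESC_DOUBLE      /* %f   */
};

/* Map an escape token to the argument kind it consumes.  Only the
 * fixed subset listed in the table is accepted; %ld, %I64d and %qd
 * are deliberately absent, so they come back as ESC_INVALID. */
static enum escape_kind classify_escape(const char *tok)
{
    static const struct {
        const char *str;
        enum escape_kind kind;
    } table[] = {
        { "%p",   ESC_OBJECT },
        { "%i",   ESC_BOOL },
        { "%d",   ESC_INT },
        { "%lld", ESC_LONG_LONG },
        { "%s",   ESC_STRING },
        { "%f",   ESC_DOUBLE },
    };
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(tok, table[i].str) == 0) {
            return table[i].kind;
        }
    }
    return ESC_INVALID;
}
```

A caller would treat ESC_INVALID as a programming error, which is roughly what replacing "return NULL" with an assertion achieves in the patch.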

[Qemu-devel] [for-2.8 0/4] 9p patches for 2.8 20161123

2016-11-23 Thread Greg Kurz

The following changes since commit 00227fefd2059464cd2f59aed29944874c630e2f:

  Update version for v2.8.0-rc1 release (2016-11-22 22:29:08 +0000)

are available in the git repository at:

  https://github.com/gkurz/qemu.git tags/for-upstream

for you to fetch changes up to 898ae90a44551d25b8e956fd87372d303c82fe68:

  9pfs: add cleanup operation for proxy backend driver (2016-11-23 13:53:34 
+0100)


This pull request fixes some leaks (memory, fd) in the handle and proxy
backends.


Li Qiang (4):
  9pfs: adjust the order of resource cleanup in device unrealize
  9pfs: add cleanup operation in FileOperations
  9pfs: add cleanup operation for handle backend driver
  9pfs: add cleanup operation for proxy backend driver

 fsdev/file-op-9p.h  |  1 +
 hw/9pfs/9p-handle.c |  9 +
 hw/9pfs/9p-proxy.c  | 13 +
 hw/9pfs/9p.c| 10 --
 4 files changed, 31 insertions(+), 2 deletions(-)
-- 
2.7.4

Re: [Qemu-devel] [kvm-unit-tests PATCH v11 3/3] arm: pmu: Add CPI checking

2016-11-23 Thread Andrew Jones

On Tue, Nov 22, 2016 at 12:29:14PM -0600, Wei Huang wrote:
> From: Christopher Covington 
> 
> Calculate the numbers of cycles per instruction (CPI) implied by ARM
> PMU cycle counter values. The code includes a strict checking facility
> intended for the -icount option in TCG mode in the configuration file.
> 
> Signed-off-by: Christopher Covington 
> Signed-off-by: Wei Huang 
> ---
>  arm/pmu.c | 123 
> +-
>  arm/unittests.cfg |  14 +++
>  2 files changed, 136 insertions(+), 1 deletion(-)


Reviewed-by: Andrew Jones

Re: [Qemu-devel] [kvm-unit-tests PATCH v11 0/3] ARM PMU tests

2016-11-23 Thread Andrew Jones

On Tue, Nov 22, 2016 at 12:29:11PM -0600, Wei Huang wrote:
> Changes from v10:
> * Change the name of loop test function to precise_instrs_loop()
> * Minor comment fixes to measure_instrs() and to explain isb() in loop funcs 
> 
> Note:
> 1) Current KVM code has bugs in handling PMCCFILTR write. A fix (see
> below) is required for this unit testing code to work correctly under
> KVM mode.
> https://lists.cs.columbia.edu/pipermail/kvmarm/2016-November/022134.html.
> 
> Thanks,
> -Wei
> 
> Christopher Covington (3):
>   arm: Add PMU test
>   arm: pmu: Check cycle count increases
>   arm: pmu: Add CPI checking
> 
>  arm/Makefile.common |   3 +-
>  arm/pmu.c   | 351 
> 
>  arm/unittests.cfg   |  19 +++
>  3 files changed, 372 insertions(+), 1 deletion(-)
>  create mode 100644 arm/pmu.c
> 
> -- 
> 1.8.3.1
>

I'm pretty happy with this series. Andre has good suggestions though.
If you send a v12 soon, and nobody complains about my v7 gic series,
then I'll group this with the gic series into a single PULL request
for Radim and Paolo.

Thanks,
drew

[Qemu-devel] [PATCH v2 for-2.8 0/4] Fix MacOS runtime failure of qobject_from_jsonf()

2016-11-23 Thread Eric Blake

programmingk...@gmail.com[*] reported a runtime failure on a
32-bit Mac OS compilation, where "%"PRId64 expands to "%qd".
Fortunately, we had very few spots that were relying on our
pseudo-printf JSON parsing of int64_t numbers, so it was
easier to just convert callers to stick to safer %lld.

The remaining uses of pseudo-printf handling are more complex;
there are only 3 users in the released codebase, but LOTS of
users in the testsuite (via wrapper functions like qmp()); I
will be posting a followup series that rips out the remaining
uses of dynamic JSON, but it will be 2.9 material, while
these (first three) patches qualify for 2.8.  The fourth patch
is RFC; not intended to be applied now, but shows how I tested
patch 3/4; it will probably reappear in the later 2.9 series.

[*] git log shows the name John, but the particular email that
sparked this only stated the non-descript name 'G 3', which
makes it a bit hard for me to know which form is preferred
when lending credit.

Eric Blake (4):
  qmp-event: Avoid qobject_from_jsonf("%"PRId64)
  test-qga: Avoid qobject_from_jsonv("%"PRId64)
  tests: Avoid qobject_from_jsonf("%"PRId64)
  RFC: qapi: Drop support for qobject_from_jsonf("%"PRId64)

 qapi/qmp-event.c   | 17 -
 qobject/json-lexer.c   | 17 -
 qobject/json-parser.c  |  9 -
 tests/check-qjson.c|  4 ++--
 tests/test-qga.c   |  7 +--
 tests/test-qobject-input-visitor.c |  5 +++--
 6 files changed, 18 insertions(+), 41 deletions(-)

-- 
2.7.4
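
As a minimal sketch of the conversion the series applies to callers (using plain snprintf() as a stand-in, since the real consumer is the JSON pseudo-printf parser): instead of relying on whatever PRId64 expands to on the host, the value is cast to long long and formatted with %lld. The helper name format_i64() is made up for this example.

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Format a 64-bit value using only %lld.  The cast makes the argument
 * type match the conversion on every host, regardless of whether
 * PRId64 is "ld", "lld", "I64d" or "qd" there. */
static int format_i64(char *buf, size_t len, int64_t value)
{
    return snprintf(buf, len, "%lld", (long long)value);
}
```

The same idea applies to any variadic consumer that understands only a fixed subset of conversions: pick one conversion that is spelled identically everywhere and convert the argument to its type at the call site.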

[Qemu-devel] [for-2.8 1/4] 9pfs: adjust the order of resource cleanup in device unrealize

2016-11-23 Thread Greg Kurz

From: Li Qiang 

Unrealize should undo things that were set during realize in
reverse order. The error path in realize should do the same.

Signed-off-by: Li Qiang 
Reviewed-by: Greg Kurz 
Signed-off-by: Greg Kurz 
---
 hw/9pfs/9p.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index aea7e9d39206..087b5c98eec1 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -3521,8 +3521,8 @@ int v9fs_device_realize_common(V9fsState *s, Error **errp)
 rc = 0;
 out:
 if (rc) {
-g_free(s->ctx.fs_root);
 g_free(s->tag);
+g_free(s->ctx.fs_root);
 v9fs_path_free(&path);
 }
 return rc;
@@ -3530,8 +3530,8 @@ out:
 
 void v9fs_device_unrealize_common(V9fsState *s, Error **errp)
 {
-g_free(s->ctx.fs_root);
 g_free(s->tag);
+g_free(s->ctx.fs_root);
 }
 
 typedef struct VirtfsCoResetData {
-- 
2.7.4
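
The reverse-order rule from the commit message can be illustrated with a small standalone sketch. Everything here (struct dev_state, dup_str(), dev_realize(), dev_unrealize()) is hypothetical and only mirrors the shape of the 9pfs code; it assumes fs_root is acquired before tag, so tag is released first.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical device state with two resources acquired in order. */
struct dev_state {
    char *fs_root;   /* acquired first  */
    char *tag;       /* acquired second */
};

/* Portable string duplicate; returns NULL for NULL input or on OOM. */
static char *dup_str(const char *s)
{
    char *copy;

    if (!s) {
        return NULL;
    }
    copy = malloc(strlen(s) + 1);
    if (copy) {
        strcpy(copy, s);
    }
    return copy;
}

static int dev_realize(struct dev_state *s, const char *root, const char *tag)
{
    s->fs_root = NULL;
    s->tag = NULL;

    s->fs_root = dup_str(root);
    if (!s->fs_root) {
        goto err;
    }
    s->tag = dup_str(tag);
    if (!s->tag) {
        goto err_free_root;
    }
    return 0;

    /* Error path: undo only what was set, newest first. */
err_free_root:
    free(s->fs_root);
    s->fs_root = NULL;
err:
    return -1;
}

static void dev_unrealize(struct dev_state *s)
{
    /* Same order as the error path: last acquired, first released. */
    free(s->tag);
    s->tag = NULL;
    free(s->fs_root);
    s->fs_root = NULL;
}
```

For two independent g_free() calls the order does not change behavior, but keeping both paths in reverse acquisition order makes it harder to get wrong once a later resource depends on an earlier one.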

[Qemu-devel] [PATCH v2 3/4] tests: Avoid qobject_from_jsonf("%"PRId64)

2016-11-23 Thread Eric Blake

The qobject_from_jsonf() function implements a pseudo-printf
language for creating a QObject; however, it is hard-coded to
only parse a subset of formats understood by -Wformat, and is
not a straight synonym to bare printf().  In particular, any
use of an int64_t integer works only if the system's
definition of PRId64 matches what the parser expects; which
works on glibc (%lld or %ld depending on 32- vs. 64-bit) and
mingw (%I64d), but not on Mac OS (%qd).  Rather than enhance
the parser, it is just as easy to force the use of int (where
the value is small enough) or long long instead of int64_t,
which we know always works.

This should cover all remaining testsuite uses of
qobject_from_json[fv]() that were trying to rely on PRId64,
although my proof for that was done by adding in asserts and
checking that 'make check' still passed, where such asserts
are inappropriate during hard freeze.  A later series in 2.9
may remove all dynamic JSON parsing, but that's a bigger task.

Reported by: G 3 
Signed-off-by: Eric Blake 
---
 tests/check-qjson.c| 4 ++--
 tests/test-qobject-input-visitor.c | 5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tests/check-qjson.c b/tests/check-qjson.c
index 8595574..b03a2e1 100644
--- a/tests/check-qjson.c
+++ b/tests/check-qjson.c
@@ -964,7 +964,7 @@ static void vararg_number(void)
 QInt *qint;
 QFloat *qfloat;
 int value = 0x2342;
-int64_t value64 = 0x2342342343LL;
+long long value64 = 0x2342342343LL;
 double valuef = 2.323423423;

 obj = qobject_from_jsonf("%d", value);
@@ -976,7 +976,7 @@ static void vararg_number(void)

 QDECREF(qint);

-obj = qobject_from_jsonf("%" PRId64, value64);
+obj = qobject_from_jsonf("%lld", value64);
 g_assert(obj != NULL);
 g_assert(qobject_type(obj) == QTYPE_QINT);

diff --git a/tests/test-qobject-input-visitor.c 
b/tests/test-qobject-input-visitor.c
index 26c5012..945404a 100644
--- a/tests/test-qobject-input-visitor.c
+++ b/tests/test-qobject-input-visitor.c
@@ -83,10 +83,11 @@ static Visitor 
*visitor_input_test_init_raw(TestInputVisitorData *data,
 static void test_visitor_in_int(TestInputVisitorData *data,
 const void *unused)
 {
-int64_t res = 0, value = -42;
+int64_t res = 0;
+int value = -42;
 Visitor *v;

-v = visitor_input_test_init(data, "%" PRId64, value);
+v = visitor_input_test_init(data, "%d", value);

 visit_type_int(v, NULL, &res, &error_abort);
 g_assert_cmpint(res, ==, value);
-- 
2.7.4

[Qemu-devel] [kvm-unit-tests PATCH v7 05/11] arm/arm64: irq enable/disable

2016-11-23 Thread Andrew Jones

Reviewed-by: Alex Bennée 
Reviewed-by: Eric Auger 
Signed-off-by: Andrew Jones 
---
 lib/arm/asm/processor.h   | 10 ++
 lib/arm64/asm/processor.h | 10 ++
 2 files changed, 20 insertions(+)

diff --git a/lib/arm/asm/processor.h b/lib/arm/asm/processor.h
index bc46d1f980ee..959ecda5dced 100644
--- a/lib/arm/asm/processor.h
+++ b/lib/arm/asm/processor.h
@@ -35,6 +35,16 @@ static inline unsigned long current_cpsr(void)
 
 #define current_mode() (current_cpsr() & MODE_MASK)
 
+static inline void local_irq_enable(void)
+{
+   asm volatile("cpsie i" : : : "memory", "cc");
+}
+
+static inline void local_irq_disable(void)
+{
+   asm volatile("cpsid i" : : : "memory", "cc");
+}
+
 static inline unsigned long get_mpidr(void)
 {
unsigned long mpidr;
diff --git a/lib/arm64/asm/processor.h b/lib/arm64/asm/processor.h
index 94f7ce35b65c..d54a4ed1c187 100644
--- a/lib/arm64/asm/processor.h
+++ b/lib/arm64/asm/processor.h
@@ -68,6 +68,16 @@ static inline unsigned long current_level(void)
return el & 0xc;
 }
 
+static inline void local_irq_enable(void)
+{
+   asm volatile("msr daifclr, #2" : : : "memory");
+}
+
+static inline void local_irq_disable(void)
+{
+   asm volatile("msr daifset, #2" : : : "memory");
+}
+
 #define DEFINE_GET_SYSREG(reg, type)   \
 static inline type get_##reg(void) \
 {  \
-- 
2.9.3

[Qemu-devel] [kvm-unit-tests PATCH v7 04/11] arm/arm64: add some delay routines

2016-11-23 Thread Andrew Jones

Allow a thread to wait some specified amount of time. Can
specify in cycles, usecs, and msecs.

Reviewed-by: Alex Bennée 
Reviewed-by: Eric Auger 
Signed-off-by: Andrew Jones 
---
 lib/arm/asm/processor.h   | 19 +++
 lib/arm/processor.c   | 15 +++
 lib/arm64/asm/processor.h | 19 +++
 lib/arm64/processor.c | 15 +++
 4 files changed, 68 insertions(+)

diff --git a/lib/arm/asm/processor.h b/lib/arm/asm/processor.h
index ecf5bbe1824a..bc46d1f980ee 100644
--- a/lib/arm/asm/processor.h
+++ b/lib/arm/asm/processor.h
@@ -5,7 +5,9 @@
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
+#include 
 #include 
+#include 
 
 enum vector {
EXCPTN_RST,
@@ -51,4 +53,21 @@ extern int mpidr_to_cpu(unsigned long mpidr);
 extern void start_usr(void (*func)(void *arg), void *arg, unsigned long 
sp_usr);
 extern bool is_user(void);
 
+static inline u64 get_cntvct(void)
+{
+   u64 vct;
+   isb();
+   asm volatile("mrrc p15, 1, %Q0, %R0, c14" : "=r" (vct));
+   return vct;
+}
+
+extern void delay(u64 cycles);
+extern void udelay(unsigned long usecs);
+
+static inline void mdelay(unsigned long msecs)
+{
+   while (msecs--)
+   udelay(1000);
+}
+
 #endif /* _ASMARM_PROCESSOR_H_ */
diff --git a/lib/arm/processor.c b/lib/arm/processor.c
index 54fdb87ef019..c2ee360df688 100644
--- a/lib/arm/processor.c
+++ b/lib/arm/processor.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static const char *processor_modes[] = {
"USER_26", "FIQ_26" , "IRQ_26" , "SVC_26" ,
@@ -141,3 +142,17 @@ bool is_user(void)
 {
return current_thread_info()->flags & TIF_USER_MODE;
 }
+
+void delay(u64 cycles)
+{
+   u64 start = get_cntvct();
+   while ((get_cntvct() - start) < cycles)
+   cpu_relax();
+}
+
+void udelay(unsigned long usec)
+{
+   unsigned int frq;
+   asm volatile("mrc p15, 0, %0, c14, c0, 0" : "=r" (frq));
+   delay((u64)usec * frq / 1000000);
+}
diff --git a/lib/arm64/asm/processor.h b/lib/arm64/asm/processor.h
index 7e448dc81a6a..94f7ce35b65c 100644
--- a/lib/arm64/asm/processor.h
+++ b/lib/arm64/asm/processor.h
@@ -17,8 +17,10 @@
 #define SCTLR_EL1_M(1 << 0)
 
 #ifndef __ASSEMBLY__
+#include 
 #include 
 #include 
+#include 
 
 enum vector {
EL1T_SYNC,
@@ -89,5 +91,22 @@ extern int mpidr_to_cpu(unsigned long mpidr);
 extern void start_usr(void (*func)(void *arg), void *arg, unsigned long 
sp_usr);
 extern bool is_user(void);
 
+static inline u64 get_cntvct(void)
+{
+   u64 vct;
+   isb();
+   asm volatile("mrs %0, cntvct_el0" : "=r" (vct));
+   return vct;
+}
+
+extern void delay(u64 cycles);
+extern void udelay(unsigned long usecs);
+
+static inline void mdelay(unsigned long msecs)
+{
+   while (msecs--)
+   udelay(1000);
+}
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASMARM64_PROCESSOR_H_ */
diff --git a/lib/arm64/processor.c b/lib/arm64/processor.c
index deeab4ec9c8a..50fa835c6f1e 100644
--- a/lib/arm64/processor.c
+++ b/lib/arm64/processor.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static const char *vector_names[] = {
"el1t_sync",
@@ -253,3 +254,17 @@ bool is_user(void)
 {
return current_thread_info()->flags & TIF_USER_MODE;
 }
+
+void delay(u64 cycles)
+{
+   u64 start = get_cntvct();
+   while ((get_cntvct() - start) < cycles)
+   cpu_relax();
+}
+
+void udelay(unsigned long usec)
+{
+   unsigned int frq;
+   asm volatile("mrs %0, cntfrq_el0" : "=r" (frq));
+   delay((u64)usec * frq / 1000000);
+}
-- 
2.9.3
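
The arithmetic behind delay() and udelay() can be checked on any host with a sketch that takes the counter values as plain arguments instead of reading CNTVCT/CNTFRQ. The helper names are made up for this example, and it assumes the counter frequency is reported in Hz.

```c
#include <stdint.h>

/* Convert microseconds to counter ticks.  The multiplication is done
 * in 64 bits so that usecs * freq_hz cannot overflow a 32-bit long. */
static uint64_t usecs_to_ticks(unsigned long usecs, uint32_t freq_hz)
{
    return (uint64_t)usecs * freq_hz / 1000000;
}

/* Nonzero once at least 'ticks' counter ticks have elapsed since
 * 'start'.  Unsigned subtraction keeps the comparison correct even if
 * the counter wrapped around between 'start' and 'now'. */
static int delay_elapsed(uint64_t start, uint64_t now, uint64_t ticks)
{
    return (now - start) >= ticks;
}
```

The wait loop in the patch is the negation of delay_elapsed(): it keeps calling cpu_relax() while (get_cntvct() - start) < cycles.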

Re: [Qemu-devel] [RFC 00/15] qmp: Report supported device types on 'query-machines'

2016-11-23 Thread Eduardo Habkost

On Wed, Nov 23, 2016 at 06:43:16PM +0200, Marcel Apfelbaum wrote:
> On 11/22/2016 03:11 AM, Eduardo Habkost wrote:
> > The Problem
> > ===
> > 
> > Currently management software has no way to find out which device
> > types can be plugged in a machine, unless the machine is already
> > initialized.
> > 
> 
> Hi Eduardo,
> Thank you for this interesting series. I think this is a problem
> worth addressing.
> 
> > Even after the machine is initialized, there's no way to map
> > existing bus types to supported device types unless management
> > software hardcodes the mapping between bus types and device
> > types.
> > 
> 
> Here I am a little lost.
> 
> We are going for machine => supported devices or
> bus-type => supported devices?

On this series, we go for machine-type => supported-devices.

A bus-type => supported-devices map wouldn't work because
different PCIe bus instances might accept different types of
devices (so supported-devices depend on the specific bus
instance, not only on the bus-type).

v2 will probably be more detailed. I plan to change it to:

query-machine(machine-type) => list of BusInfo

BusInfo would contain:
 * bus-type
 * bus-path
 * accepted-device-types (list of type/interface names)
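
To make the shape of that BusInfo idea a bit more concrete, here is a rough C sketch of what a consumer might do with it. None of this is an existing QEMU or QAPI definition: the struct layout, the field names and bus_accepts() are assumptions for illustration, and the lookup only compares names, whereas a real client would still have to resolve type/interface inheritance through something like qom-list-types.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one BusInfo entry. */
struct bus_info {
    const char *bus_type;
    const char *bus_path;
    /* NULL-terminated list of accepted type/interface names */
    const char *const *accepted_device_types;
};

/* Exact-name check only; this sketch does not walk the QOM hierarchy. */
static bool bus_accepts(const struct bus_info *bus, const char *type)
{
    const char *const *t;

    for (t = bus->accepted_device_types; t && *t; t++) {
        if (strcmp(*t, type) == 0) {
            return true;
        }
    }
    return false;
}
```

Because the list hangs off a specific bus instance (identified by bus-path), two buses of the same bus-type can report different accepted-device-types, which is the point made above.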

> 
> > Example: floppy support on q35 vs i440fx
> > 
> > 
> > There's no way for libvirt to find out that there's no floppy
> > controller on pc-q35-* machine-types by default.
> > 
> 
> Again "by default". So do we want to query the init state of a machine?
> What devices are there? Or what devices *can be* there?

"by default" means what's present when using "-machine "
with no extra -device arguments.

We want to know what _buses_ are always there. Which in turn lets
management know which _device_ types _can_ be plugged.

> 
> > With this series, pc-i440fx-* will report "floppy" as a supported
> > device type, but pc-q35-* will not.
> > 
> > Example: Legacy PCI vs PCIe devices
> > --
> > 
> > Some devices require a PCIe bus to be available, others work on
> > both legacy PCI and PCIe, while others work only on a legacy PCI
> > bus.
> > 
> > Currently management software has no way to know which devices
> > can be added to a given machine, unless it hardcodes machine-type
> > names and device-types names.
> > 
> 
> Again it seems a double problem, machine => devices vs pci/pcie bus => 
> devices.
> The bus => devices match is not related to a machine type.

A bus-type => device-type match would not depend on the
machine-type, but it would not be useful: different bus instances
can accept different device-types (and the way the bus topology
is configured depends on the machine-type).


> 
> > The Proposed Interface
> > ==
> > 
> > This series adds a new field to the output of 'query-machines':
> > 'supported-device-types'. It will contain a list of QOM type
> > names, that can be used to find the list of device types that can
> > be plugged in the machine by default.
> 
> What do you mean "by default"? Without bridges or part of the machine itself?

I mean "when you just run -machine with no extra -device
arguments".

> 
>  The type names reported on
> > the new field can then be used as the 'implements' argument on
> > the 'qom-list-types' command, to find out which device types can
> > be plugged on the machine.
> > 
> > Example output
> > --
> > 
> >   (QEMU) query-machines
> >   {
> > "return": [
> > [...]
> > {
> > "supported-device-types": [
> > "sys-bus-device"
> 
> 
> I don't know how "sys-bus-device" can help us... :)

Yes, I added comments about it below. :)

> 
> > ],
> > "cpu-max": 1,
> > "hotpluggable-cpus": false,
> > "name": "none"
> > },
> > [...]
> > {
> > "supported-device-types": [
> > "sys-bus-device"
> > ],
> > "cpu-max": 1,
> > "hotpluggable-cpus": false,
> > "name": "xenpv"
> > },
> > [...]
> > {
> > "supported-device-types": [
> > "sys-bus-device",
> > "floppy",
> > "i2c-slave",
> > "pci-device",
> > "isa-device",
> > "ide-device"
> 
> I don't know if this high-level classification is useful.
> Here is an example:
> 
>    pci-device is supported => then we look for all pci devices?
> But what if some pci devices make sense on one machine type,
> but not on another?

If not all pci devices are supported, then the machine must not
return "pci-device" as supported. We need to define a new
type/interface name that would be implemented only by the
supported devices. e.g. "legacy-pci-device".

> 
> 
> 
> > ],
> > "name": "pc-i440fx-2.8",
> > "alias": "pc",
> > "is-default": true,
> > "c

[Qemu-devel] [PATCH v2 1/4] qmp-event: Avoid qobject_from_jsonf("%"PRId64)

2016-11-23 Thread Eric Blake

The qobject_from_jsonf() function implements a pseudo-printf
language for creating a QObject; however, it is hard-coded to
only parse a subset of formats understood by -Wformat, and is
not a straight synonym to bare printf().  In particular, any
use of an int64_t integer works only if the system's
definition of PRId64 matches what the parser expects; which
works on glibc (%lld or %ld depending on 32- vs. 64-bit) and
mingw (%I64d), but not on Mac OS (%qd).  Rather than enhance
the parser, it is just as easy to use 'long long', which we
know always works.  There are few enough callers of
qobject_from_json[fv]() that it is easy to audit that this is
the only non-testsuite caller that was actually relying on
this particular conversion.

Reported by: G 3 
Signed-off-by: Eric Blake 

---
v2: keep qobject_from_jsonf for now, but switch to %lld
---
 qapi/qmp-event.c | 17 -
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/qapi/qmp-event.c b/qapi/qmp-event.c
index 8bba165..e7c8755 100644
--- a/qapi/qmp-event.c
+++ b/qapi/qmp-event.c
@@ -35,21 +35,12 @@ static void timestamp_put(QDict *qdict)
 int err;
 QObject *obj;
 qemu_timeval tv;
-int64_t sec, usec;

 err = qemu_gettimeofday(&tv);
-if (err < 0) {
-/* Put -1 to indicate failure of getting host time */
-sec = -1;
-usec = -1;
-} else {
-sec = tv.tv_sec;
-usec = tv.tv_usec;
-}
-
-obj = qobject_from_jsonf("{ 'seconds': %" PRId64 ", "
- "'microseconds': %" PRId64 " }",
- sec, usec);
+/* Put -1 to indicate failure of getting host time */
+obj = qobject_from_jsonf("{ 'seconds': %lld, 'microseconds': %lld }",
+ err < 0 ? -1LL : tv.tv_sec,
+ err < 0 ? -1LL : tv.tv_usec);
 qdict_put_obj(qdict, "timestamp", obj);
 }

-- 
2.7.4

Re: [Qemu-devel] [kvm-unit-tests PATCH v11 1/3] arm: Add PMU test

2016-11-23 Thread Andrew Jones

On Wed, Nov 23, 2016 at 01:16:08PM +0000, Andre Przywara wrote:
> Hi,
> 
> On 22/11/16 18:29, Wei Huang wrote:
> > From: Christopher Covington 
> > 
> > Beginning with a simple sanity check of the control register, add
> > a unit test for the ARM Performance Monitors Unit (PMU).
> 
> Mmh, the output of this is a bit confusing. How about joining some of the
> information? I changed it to give me:
> INFO: pmu: PMU implementer/ID code: "A"(0x41)/0x0
> INFO: pmu: Event counters:  0
> PASS: pmu: Control register
> 
> ... by using the newly introduced report_info() to make it look nicer.

Agreed. That would look nicer and make good use of report_info. Let's
do that.

> 
> > 
> > Signed-off-by: Christopher Covington 
> > Signed-off-by: Wei Huang 
> > Reviewed-by: Andrew Jones 
> > ---
> >  arm/Makefile.common |  3 ++-
> >  arm/pmu.c   | 74 
> > +
> >  arm/unittests.cfg   |  5 
> >  3 files changed, 81 insertions(+), 1 deletion(-)
> >  create mode 100644 arm/pmu.c
> > 
> > diff --git a/arm/Makefile.common b/arm/Makefile.common
> > index f37b5c2..5da2fdd 100644
> > --- a/arm/Makefile.common
> > +++ b/arm/Makefile.common
> > @@ -12,7 +12,8 @@ endif
> >  tests-common = \
> > $(TEST_DIR)/selftest.flat \
> > $(TEST_DIR)/spinlock-test.flat \
> > -   $(TEST_DIR)/pci-test.flat
> > +   $(TEST_DIR)/pci-test.flat \
> > +   $(TEST_DIR)/pmu.flat
> >  
> >  all: test_cases
> >  
> > diff --git a/arm/pmu.c b/arm/pmu.c
> > new file mode 100644
> > index 0000000..9d9c53b
> > --- /dev/null
> > +++ b/arm/pmu.c
> > @@ -0,0 +1,74 @@
> > +/*
> > + * Test the ARM Performance Monitors Unit (PMU).
> > + *
> > + * Copyright (c) 2015-2016, The Linux Foundation. All rights reserved.
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms of the GNU Lesser General Public License version 2.1 and
> > + * only version 2.1 as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful, but 
> > WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public 
> > License
> > + * for more details.
> > + */
> > +#include "libcflat.h"
> > +#include "asm/barrier.h"
> > +
> > +#define PMU_PMCR_N_SHIFT   11
> > +#define PMU_PMCR_N_MASK0x1f
> > +#define PMU_PMCR_ID_SHIFT  16
> > +#define PMU_PMCR_ID_MASK   0xff
> > +#define PMU_PMCR_IMP_SHIFT 24
> > +#define PMU_PMCR_IMP_MASK  0xff
> > +
> > +#if defined(__arm__)
> 
> I guess you should use the arch specific header files we have in place
> for that (lib/arm{.64}/asm/processor.h). Also there are sysreg read
> wrappers (at least for arm64) in there already, can't we base this
> function on them: DEFINE_GET_SYSREG32(pmcr, el0)?
> (Requires a small change to get rid of the forced "_el1" suffix)
> 
> We should wait for the GIC series to be merged, as this contains some
> changes in this area.

As this unit test is the only consumer of the PMC registers so far,
I'd prefer the defines and accessors stay here for now. Once we see
a use in other unit tests we can move some of it out.

> 
> > +static inline uint32_t pmcr_read(void)
> > +{
> > +   uint32_t ret;
> > +
> > +   asm volatile("mrc p15, 0, %0, c9, c12, 0" : "=r" (ret));
> > +   return ret;
> > +}
> > +#elif defined(__aarch64__)
> > +static inline uint32_t pmcr_read(void)
> > +{
> > +   uint32_t ret;
> > +
> > +   asm volatile("mrs %0, pmcr_el0" : "=r" (ret));
> > +   return ret;
> > +}
> > +#endif
> > +
> > +/*
> > + * As a simple sanity check on the PMCR_EL0, ensure the implementer field 
> > isn't
> > + * null. Also print out a couple other interesting fields for diagnostic
> > + * purposes. For example, as of fall 2016, QEMU TCG mode doesn't implement
> > + * event counters and therefore reports zero event counters, but hopefully
> > + * support for at least the instructions event will be added in the future 
> > and
> > + * the reported number of event counters will become nonzero.
> > + */
> > +static bool check_pmcr(void)
> > +{
> > +   uint32_t pmcr;
> > +
> > +   pmcr = pmcr_read();
> > +
> > +   printf("PMU implementer: %c\n",
> > +  (pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK);
> 
> If this register reads as zero, the output is mangled (since it cuts off
> the string before the newline):
> =
> PMU implementer: Identification code: 0x0
> =
> 
> I guess you need something like:
> (pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK ?: ' '

Good idea.
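
Something along these lines, as a host-testable sketch that takes the PMCR value as an argument instead of reading the register. The shift/mask defines are the ones from the patch; the helper names and the exact message layout are only assumptions for illustration, with snprintf() standing in for report_info().

```c
#include <stdint.h>
#include <stdio.h>

#define PMU_PMCR_N_SHIFT   11
#define PMU_PMCR_N_MASK    0x1f
#define PMU_PMCR_ID_SHIFT  16
#define PMU_PMCR_ID_MASK   0xff
#define PMU_PMCR_IMP_SHIFT 24
#define PMU_PMCR_IMP_MASK  0xff

/* Implementer code as a printable character; a zero field becomes a
 * space so that %c does not emit a NUL and cut the message short. */
static char pmcr_implementer_char(uint32_t pmcr)
{
    unsigned int imp = (pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK;

    return imp ? (char)imp : ' ';
}

static unsigned int pmcr_id(uint32_t pmcr)
{
    return (pmcr >> PMU_PMCR_ID_SHIFT) & PMU_PMCR_ID_MASK;
}

static unsigned int pmcr_num_counters(uint32_t pmcr)
{
    return (pmcr >> PMU_PMCR_N_SHIFT) & PMU_PMCR_N_MASK;
}

/* Implementer and ID code joined on one line, as suggested above. */
static int format_pmcr(char *buf, size_t len, uint32_t pmcr)
{
    unsigned int imp = (pmcr >> PMU_PMCR_IMP_SHIFT) & PMU_PMCR_IMP_MASK;

    return snprintf(buf, len,
                    "PMU implementer/ID code: \"%c\"(0x%x)/0x%x",
                    pmcr_implementer_char(pmcr), imp, pmcr_id(pmcr));
}
```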

> 
> > +   printf("Identification code: 0x%x\n",
> > +  (pmcr >> PMU_PMCR_ID_SHIFT) & PMU_PMCR_ID_MASK);
> 
> As mentioned above this should use report_info() now, also it would be
> nice to merge this with the message above into one line of output.

Agreed.

Thanks,
drew

> 
> Cheers,
> Andre
> 
> > +   printf("Event counters:  %d\n",
> > +  (pmcr >> PMU_PMCR_N

[Qemu-devel] [for-2.8 3/4] 9pfs: add cleanup operation for handle backend driver

2016-11-23 Thread Greg Kurz

From: Li Qiang 

The init operation of the handle backend driver allocates a
handle_data struct and opens a mount file. We should free these
resources when the 9pfs device is unrealized. This is what this
patch does.

Signed-off-by: Li Qiang 
Reviewed-by: Greg Kurz 
Signed-off-by: Greg Kurz 
---
 hw/9pfs/9p-handle.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/9pfs/9p-handle.c b/hw/9pfs/9p-handle.c
index 3d77594f9245..1687661bc95a 100644
--- a/hw/9pfs/9p-handle.c
+++ b/hw/9pfs/9p-handle.c
@@ -649,6 +649,14 @@ out:
 return ret;
 }
 
+static void handle_cleanup(FsContext *ctx)
+{
+struct handle_data *data = ctx->private;
+
+close(data->mountfd);
+g_free(data);
+}
+
 static int handle_parse_opts(QemuOpts *opts, struct FsDriverEntry *fse)
 {
 const char *sec_model = qemu_opt_get(opts, "security_model");
@@ -671,6 +679,7 @@ static int handle_parse_opts(QemuOpts *opts, struct 
FsDriverEntry *fse)
 FileOperations handle_ops = {
 .parse_opts   = handle_parse_opts,
 .init = handle_init,
+.cleanup  = handle_cleanup,
 .lstat= handle_lstat,
 .readlink = handle_readlink,
 .close= handle_close,
-- 
2.7.4

[Qemu-devel] [for-2.8 2/4] 9pfs: add cleanup operation in FileOperations

2016-11-23 Thread Greg Kurz

From: Li Qiang 

Currently, the backend of VirtFS doesn't have a cleanup
function. This will lead to resource leaks if the backend
driver allocates resources. This patch addresses this issue.

Signed-off-by: Li Qiang 
Reviewed-by: Greg Kurz 
Signed-off-by: Greg Kurz 
---
 fsdev/file-op-9p.h | 1 +
 hw/9pfs/9p.c   | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
index 6db9feac8f1c..a56dc8488dfc 100644
--- a/fsdev/file-op-9p.h
+++ b/fsdev/file-op-9p.h
@@ -100,6 +100,7 @@ struct FileOperations
 {
 int (*parse_opts)(QemuOpts *, struct FsDriverEntry *);
 int (*init)(struct FsContext *);
+void (*cleanup)(struct FsContext *);
 int (*lstat)(FsContext *, V9fsPath *, struct stat *);
 ssize_t (*readlink)(FsContext *, V9fsPath *, char *, size_t);
 int (*chmod)(FsContext *, V9fsPath *, FsCred *);
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 087b5c98eec1..faebd91f5fab 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -3521,6 +3521,9 @@ int v9fs_device_realize_common(V9fsState *s, Error **errp)
 rc = 0;
 out:
 if (rc) {
+if (s->ops->cleanup && s->ctx.private) {
+s->ops->cleanup(&s->ctx);
+}
 g_free(s->tag);
 g_free(s->ctx.fs_root);
 v9fs_path_free(&path);
@@ -3530,6 +3533,9 @@ out:
 
 void v9fs_device_unrealize_common(V9fsState *s, Error **errp)
 {
+if (s->ops->cleanup) {
+s->ops->cleanup(&s->ctx);
+}
 g_free(s->tag);
 g_free(s->ctx.fs_root);
 }
-- 
2.7.4
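
For readers following the series, the optional-hook pattern the patch
introduces can be sketched outside of QEMU roughly as below. This is only
an illustration under stated assumptions: every demo_* name is
hypothetical and merely stands in for FsContext, FileOperations and the
realize/unrealize paths; none of it is QEMU code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for FsContext. */
struct demo_ctx {
    void *private;      /* backend-owned state, NULL until init succeeds */
};

/* Hypothetical stand-in for FileOperations. */
struct demo_ops {
    int  (*init)(struct demo_ctx *);
    void (*cleanup)(struct demo_ctx *);   /* optional, may be NULL */
};

static int demo_live_allocs;   /* outstanding backend allocations */

static int demo_init(struct demo_ctx *ctx)
{
    ctx->private = malloc(16);
    if (!ctx->private) {
        return -1;
    }
    demo_live_allocs++;
    return 0;
}

static void demo_cleanup(struct demo_ctx *ctx)
{
    assert(ctx->private != NULL);   /* callers guarantee init has run */
    free(ctx->private);
    ctx->private = NULL;
    demo_live_allocs--;
}

/* Error path of realize: init may not have run, so check private too. */
static void demo_realize_failed(const struct demo_ops *ops,
                                struct demo_ctx *ctx)
{
    if (ops->cleanup && ctx->private) {
        ops->cleanup(ctx);
    }
}

/* Unrealize: init is known to have succeeded, only the hook is optional. */
static void demo_unrealize(const struct demo_ops *ops, struct demo_ctx *ctx)
{
    if (ops->cleanup) {
        ops->cleanup(ctx);
    }
}

/* Returns the number of leaked allocations after one full cycle. */
static int demo_roundtrip(void)
{
    const struct demo_ops ops = { .init = demo_init, .cleanup = demo_cleanup };
    struct demo_ctx ctx = { .private = NULL };

    demo_realize_failed(&ops, &ctx);   /* no-op: nothing allocated yet */
    if (ops.init(&ctx) != 0) {
        return -1;
    }
    demo_unrealize(&ops, &ctx);
    return demo_live_allocs;
}
```

The extra check of the private pointer on the error path mirrors the
"s->ops->cleanup && s->ctx.private" test in the hunk above: a backend
whose init never ran has nothing to release.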

Re: [Qemu-devel] [PATCH 2/3] test-qga: Avoid qobject_from_jsonf("%"PRId64)

2016-11-23 Thread Markus Armbruster

Eric Blake  writes:

> On 11/23/2016 08:05 AM, Markus Armbruster wrote:
>
>> Same problem as in the previous patch, but here you replace it by
>> g_strdup_printf(), where the previous patch replaced it by manual
>> QObject construction,
>> 
>> Manual QObject construction tends to be less readable.
>
> Are there things we can do to make it more readable to the point where
> it would be tolerable in the situations where it is needed?
>
> One of the patches on my dynamic-JSON removal series adds a new:
>
> qdict_put_int(dict, "key", 1);
>
> which is a lot more legible than:
>
> qdict_put(dict, "key", qint_from_int(1));

It's more legible, but I wouldn't call it "a lot more legible".

>> g_strdup_printf() doesn't have that problem, but it has a more serious
>> one: escaping for JSON is no longer below the hood.
>> 
>> Since the string gets passed to qmp_fd(), we additionally need to escape
>> '%'.
>
> Worse, the escaping of %s differs between the two (in printf, %s just
> concatenates strings, in dynamic JSON, it adds outer "" and escapes
> inner " into \").

That's a feature.  It actually escapes much more than just '"'.  Have a
look at to_json() case QTYPE_QSTRING.

The important bit here is: _jsonf() is not printf()!  The part it shares
with printf() is the argument types associated with conversion
specifiers, and it shares them just because that way the compiler can
help us catch type errors.  What it does with the arguments is
*different*, because what it does is different.  It does *not* format a
string.  Not even conceptually.  It builds a QObject from a string
template.
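
To make the difference concrete, here is a rough, self-contained sketch.
It is not QEMU code: demo_json_quote(), demo_naive_format() and
demo_template() are hypothetical helpers, and the escaping is far less
complete than what to_json() does.  It only shows why plain string
formatting and template-based construction treat the same argument
differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write a JSON string literal for 'in' into 'out',
 * escaping '"', '\\' and ASCII control characters.
 * Returns 0 on success, -1 if 'out' is too small.
 */
static int demo_json_quote(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    if (outsz < 3) {
        return -1;
    }
    out[n++] = '"';
    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;
        char esc[8];
        size_t len;

        if (c == '"' || c == '\\') {
            esc[0] = '\\';
            esc[1] = (char)c;
            esc[2] = '\0';
        } else if (c < 0x20) {
            snprintf(esc, sizeof(esc), "\\u%04x", (unsigned)c);
        } else {
            esc[0] = (char)c;
            esc[1] = '\0';
        }
        len = strlen(esc);
        if (n + len + 2 > outsz) {   /* keep room for closing quote + NUL */
            return -1;
        }
        memcpy(out + n, esc, len);
        n += len;
    }
    out[n++] = '"';
    out[n] = '\0';
    return 0;
}

/* printf-style: the argument is pasted in verbatim, quotes and all. */
static int demo_naive_format(const char *name, char *out, size_t outsz)
{
    return snprintf(out, outsz, "{\"name\": \"%s\"}", name);
}

/* Template-style: the argument becomes a properly quoted JSON string. */
static int demo_template(const char *name, char *out, size_t outsz)
{
    char quoted[128];
    int n;

    if (demo_json_quote(name, quoted, sizeof(quoted)) != 0) {
        return -1;
    }
    n = snprintf(out, outsz, "{\"name\": %s}", quoted);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Feeding a name that contains a double quote to demo_naive_format() yields
text that is no longer valid JSON, while demo_template() keeps it well
formed.  The caller of the second helper never has to think about
escaping, which is the whole point.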

>> Interfaces that require callers to escape almost inevitably result in
>> bugs if experience is any guide.  Safer, less low level interfaces are
>> preferable.
>> 
>> Nothing actually needs escaping here, so your code isn't wrong.  It's
>> just a bad example.
>> 
>> You've pointed out that the file is chock-full of bad examples already,
>> so one more won't make a difference.  Point taken regarding the
>> immediate fix.  But I doubt it's a sane strategy for replacing _jsonf().
>
> Well, until I post my conversion series that eliminates _json[fv](), we
> don't have any hard numbers on how many bad examples remain, or whether
> the cleanup looks worth it.

Yes.  Without patches, the discussion is speculative.

Re: [Qemu-devel] [RFC v2 0/3] virtio-net: Add support to MTU feature

2016-11-23 Thread Michael S. Tsirkin

On Wed, Nov 23, 2016 at 09:02:53AM -0500, Aaron Conole wrote:
> "Michael S. Tsirkin"  writes:
> 
> > On Wed, Nov 23, 2016 at 11:42:52AM +0800, Jason Wang wrote:
> >> > > > > > > >  Seems to me like an easy way to get out of sync.
> >> > > > > >
> >> > > > > >If we send it to the backend, that has a chance to check
> >> > > > > >mtu and disconnect on error.
> >> > > >
> >> > > >For vhost-user backend, we can send it the MTU value with a
> >> > > >vhost-user protocol feature.
> >> > > >
> >> > > >For tun/macvtap, how do you do without adding a new ioctl ?
> >> > Have management configure same mtu on the backend and in qemu.
> >> > 
> >> > 
> >> 
> >> Then why not do same for vhost-user (instead of using two different
> >> methods)?
> >
> > That's what I'm saying. If backend supports that, we can also
> > check the mtu in some way to make sure it matches.
> 
> I'm not sure why we need a new ioctl (or an ioctl at all - netlink
> supports all of this)?
> 
> ex:
> 
> 08:58:34 aconole {fast-datapath-beta-rhel-7} ~/rhpkg/openvswitch$ sudo ip tuntap add dev tap0 mode tap
> [sudo] password for aconole: 
> 08:58:40 aconole {fast-datapath-beta-rhel-7} ~/rhpkg/openvswitch$ ip l
> 1: lo:  mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> ...
> 7: tap0:  mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
> link/ether 46:e0:fc:83:54:1c brd ff:ff:ff:ff:ff:ff
> 08:58:51 aconole {fast-datapath-beta-rhel-7} ~/rhpkg/openvswitch$ sudo ip l set tap0 mtu 8000
> 08:58:54 aconole {fast-datapath-beta-rhel-7} ~/rhpkg/openvswitch$ ip l
> 1: lo:  mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> ...
> 7: tap0:  mtu 8000 qdisc noop state DOWN mode DEFAULT group default qlen 1000
> link/ether 46:e0:fc:83:54:1c brd ff:ff:ff:ff:ff:ff
> 
> So, at least with iproute2, we can already read and write using the netlink
> interface for tuntap devices.  I haven't played with macvtap, but I
> think it's similar support - just do a netlink query, get the configured
> MTU, and advertise it.  I might be missing something though - I'm a
> simple guy with simple ideas.  Maybe there's a cross-platform issue or
> something?
> 
> -Aaron

qemu is generally not running with enough privileges to
allow access to netlink.

[Qemu-devel] -nodefaults and available buses (was Re: [RFC 00/15] qmp: Report supported device types on 'query-machines')

2016-11-23 Thread Eduardo Habkost

(CCing the maintainers of the machines that crash when using
-nodefaults)

On Tue, Nov 22, 2016 at 08:34:50PM -0200, Eduardo Habkost wrote:
[...]
> "default defaults" vs "-nodefault defaults"
> ---
> 
> Two pieces of bad news:
> 
> 1) We need to differentiate buses created by the machine with
>"-nodefaults" and buses that are created only without
>"-nodefaults".
> 
> libvirt uses -nodefaults when starting QEMU, so knowing which
> buses are available when using -nodefaults is more interesting
> for them.
> 
> Other software, on the other hand, might be interested in the
> results without -nodefaults.
> 
> We need to be able to model both cases in the new interface.
> Suggestions are welcome.

The good news is that the list is short. The only[1] machines
where the list of buses seems to change when using -nodefaults
are:

* mpc8544ds
* ppce500
* s390-ccw-virtio-*

In all cases above, the only difference is that a virtio bus is
available if not using -nodefaults.

Considering that the list is short, I plan to rename
'supported-device-types' to 'always-available-buses', and
document that it will include only the buses that are not
disabled by -nodefaults.

[1] I mean, the only ones from the set that don't crash with
-nodefaults. The ones below could not be tested:

> 2) A lot of machine-types won't start if using
>"-nodefaults -machine " without any extra devices or
>drives.
> 
> Lots of machines require some drives or devices to be created
> (especially ARM machines that require a SD drive to be
> available).
> 
> Some machines will make QEMU exit, some of them simply segfault.
> I am looking for ways to work around it so we can still validate
> -nodefaults-based info on the test code.

The following machines won't work with -nodefaults:

These make QEMU segfault:
* cubieboard
* petalogix-ml605
* or32-sim
* virtex-ml507
* Niagara

These exit with a "missing SecureDigital device" error:
* akita
* borzoi
* cheetah
* connex
* mainstone
* n800
* n810
* spitz
* sx1
* sx1-v1
* terrier
* tosa
* verdex
* z2

-- 
Eduardo

[Qemu-devel] [kvm-unit-tests PATCH v7 11/11] arm/arm64: gic: don't just use zero

2016-11-23 Thread Andrew Jones

Allow user to select who sends ipis and with which irq,
rather than just always sending irq=0 from cpu0.

Signed-off-by: Andrew Jones 

---
v7: cleanup cmdline parsing and add complain on bad args [Eric]
v6:
 - make sender/irq names more future-proof [drew]
 - sanity check inputs [drew]
 - introduce check_sender/irq and bad_sender/irq to more
   cleanly do checks [drew]
 - default sender and irq to 1, instead of still zero [drew]
v4: improve structure and make sure spurious checking is
done even when the sender isn't cpu0
v2: actually check that the irq received was the irq sent,
and (for gicv2) that the sender is the expected one.
---
 arm/gic.c | 143 --
 1 file changed, 120 insertions(+), 23 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 23c1860a49d9..88c5f49d807d 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -11,6 +11,7 @@
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -27,6 +28,8 @@ struct gic {
 
 static struct gic *gic;
 static int acked[NR_CPUS], spurious[NR_CPUS];
+static int bad_sender[NR_CPUS], bad_irq[NR_CPUS];
+static int cmdl_sender = 1, cmdl_irq = 1;
 static cpumask_t ready;
 
 static void nr_cpu_check(int nr)
@@ -42,10 +45,23 @@ static void wait_on_ready(void)
cpu_relax();
 }
 
+static void stats_reset(void)
+{
+   int i;
+
+   for (i = 0; i < nr_cpus; ++i) {
+   acked[i] = 0;
+   bad_sender[i] = -1;
+   bad_irq[i] = -1;
+   }
+   smp_wmb();
+}
+
 static void check_acked(cpumask_t *mask)
 {
int missing = 0, extra = 0, unexpected = 0;
int nr_pass, cpu, i;
+   bool bad = false;
 
/* Wait up to 5s for all interrupts to be delivered */
for (i = 0; i < 50; ++i) {
@@ -55,9 +71,21 @@ static void check_acked(cpumask_t *mask)
smp_rmb();
nr_pass += cpumask_test_cpu(cpu, mask) ?
acked[cpu] == 1 : acked[cpu] == 0;
+
+   if (bad_sender[cpu] != -1) {
+   printf("cpu%d received IPI from wrong sender %d\n",
+   cpu, bad_sender[cpu]);
+   bad = true;
+   }
+
+   if (bad_irq[cpu] != -1) {
+   printf("cpu%d received wrong irq %d\n",
+   cpu, bad_irq[cpu]);
+   bad = true;
+   }
}
if (nr_pass == nr_cpus) {
-   report("Completed in %d ms", true, ++i * 100);
+   report("Completed in %d ms", !bad, ++i * 100);
return;
}
}
@@ -90,6 +118,22 @@ static void check_spurious(void)
}
 }
 
+static void check_ipi_sender(u32 irqstat)
+{
+   if (gic_version() == 2) {
+   int src = (irqstat >> 10) & 7;
+
+   if (src != cmdl_sender)
+   bad_sender[smp_processor_id()] = src;
+   }
+}
+
+static void check_irqnr(u32 irqnr)
+{
+   if (irqnr != (u32)cmdl_irq)
+   bad_irq[smp_processor_id()] = irqnr;
+}
+
 static void ipi_handler(struct pt_regs *regs __unused)
 {
u32 irqstat = gic_read_iar();
@@ -97,8 +141,10 @@ static void ipi_handler(struct pt_regs *regs __unused)
 
if (irqnr != GICC_INT_SPURIOUS) {
gic_write_eoir(irqstat);
-   smp_rmb(); /* pairs with wmb in ipi_test functions */
+   smp_rmb(); /* pairs with wmb in stats_reset */
++acked[smp_processor_id()];
+   check_ipi_sender(irqstat);
+   check_irqnr(irqnr);
smp_wmb(); /* pairs with rmb in check_acked */
} else {
++spurious[smp_processor_id()];
@@ -108,22 +154,22 @@ static void ipi_handler(struct pt_regs *regs __unused)
 
 static void gicv2_ipi_send_self(void)
 {
-   writel(2 << 24, gicv2_dist_base() + GICD_SGIR);
+   writel(2 << 24 | cmdl_irq, gicv2_dist_base() + GICD_SGIR);
 }
 
 static void gicv2_ipi_send_broadcast(void)
 {
-   writel(1 << 24, gicv2_dist_base() + GICD_SGIR);
+   writel(1 << 24 | cmdl_irq, gicv2_dist_base() + GICD_SGIR);
 }
 
 static void gicv3_ipi_send_self(void)
 {
-   gic_ipi_send_single(0, smp_processor_id());
+   gic_ipi_send_single(cmdl_irq, smp_processor_id());
 }
 
 static void gicv3_ipi_send_broadcast(void)
 {
-   gicv3_write_sgi1r(1ULL << 40);
+   gicv3_write_sgi1r(1ULL << 40 | cmdl_irq << 24);
isb();
 }
 
@@ -132,10 +178,9 @@ static void ipi_test_self(void)
cpumask_t mask;
 
report_prefix_push("self");
-   memset(acked, 0, sizeof(acked));
-   smp_wmb();
+   stats_reset();
cpumask_clear(&mask);
-   cpumask_set_cpu(0, &mask);
+   cpumask_set_cpu(s

[Qemu-devel] [kvm-unit-tests PATCH v7 10/11] arm/arm64: gicv3: add an IPI test

2016-11-23 Thread Andrew Jones

Signed-off-by: Andrew Jones 

---
v7:
 - add common ipi_send_single/mask (replacing ipi_send).
   Note, the arg order irq,cpu got swapped. [Eric]
 - comment rewording [Eric]
 - make enable_defaults a common op [Eric]
 - gic_enable_defaults() will now invoke gic_init if
   necessary [drew]
 - split lib/arm/gic.c into gic-v2/3.c [Eric]
v6: move most gicv2/gicv3 wrappers to common code [Alex]
v5:
 - fix copy+paste error in gicv3_write_eoir [drew]
 - use modern register names [Andre]
v4:
 - heavily comment gicv3_ipi_send_tlist() [Eric]
 - changes needed for gicv2 iar/irqstat fix to other patch
v2:
 - use IRM for gicv3 broadcast
---
 arm/gic.c  | 83 
 arm/unittests.cfg  |  6 +++
 lib/arm/asm/arch_gicv3.h   | 23 
 lib/arm/asm/gic-v2.h   |  2 +
 lib/arm/asm/gic-v3.h   | 12 +-
 lib/arm/asm/gic.h  | 63 +++
 lib/arm/gic-v2.c   | 40 
 lib/arm/gic-v3.c   | 94 ++
 lib/arm/gic.c  |  9 -
 lib/arm64/asm/arch_gicv3.h | 22 +++
 10 files changed, 336 insertions(+), 18 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index b42c2b1ca1e1..23c1860a49d9 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -3,6 +3,8 @@
  *
  * GICv2
  *   + test sending/receiving IPIs
+ * GICv3
+ *   + test sending/receiving IPIs
  *
  * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
  *
@@ -16,7 +18,14 @@
 #include 
 #include 
 
-static int gic_version;
+struct gic {
+   struct {
+   void (*send_self)(void);
+   void (*send_broadcast)(void);
+   } ipi;
+};
+
+static struct gic *gic;
 static int acked[NR_CPUS], spurious[NR_CPUS];
 static cpumask_t ready;
 
@@ -83,11 +92,11 @@ static void check_spurious(void)
 
 static void ipi_handler(struct pt_regs *regs __unused)
 {
-   u32 irqstat = readl(gicv2_cpu_base() + GICC_IAR);
-   u32 irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+   u32 irqstat = gic_read_iar();
+   u32 irqnr = gic_iar_irqnr(irqstat);
 
if (irqnr != GICC_INT_SPURIOUS) {
-   writel(irqstat, gicv2_cpu_base() + GICC_EOIR);
+   gic_write_eoir(irqstat);
smp_rmb(); /* pairs with wmb in ipi_test functions */
++acked[smp_processor_id()];
smp_wmb(); /* pairs with rmb in check_acked */
@@ -97,6 +106,27 @@ static void ipi_handler(struct pt_regs *regs __unused)
}
 }
 
+static void gicv2_ipi_send_self(void)
+{
+   writel(2 << 24, gicv2_dist_base() + GICD_SGIR);
+}
+
+static void gicv2_ipi_send_broadcast(void)
+{
+   writel(1 << 24, gicv2_dist_base() + GICD_SGIR);
+}
+
+static void gicv3_ipi_send_self(void)
+{
+   gic_ipi_send_single(0, smp_processor_id());
+}
+
+static void gicv3_ipi_send_broadcast(void)
+{
+   gicv3_write_sgi1r(1ULL << 40);
+   isb();
+}
+
 static void ipi_test_self(void)
 {
cpumask_t mask;
@@ -106,7 +136,7 @@ static void ipi_test_self(void)
smp_wmb();
cpumask_clear(&mask);
cpumask_set_cpu(0, &mask);
-   writel(2 << 24, gicv2_dist_base() + GICD_SGIR);
+   gic->ipi.send_self();
check_acked(&mask);
report_prefix_pop();
 }
@@ -114,14 +144,15 @@ static void ipi_test_self(void)
 static void ipi_test_smp(void)
 {
cpumask_t mask;
-   unsigned long tlist;
+   int i;
 
report_prefix_push("target-list");
memset(acked, 0, sizeof(acked));
smp_wmb();
-   tlist = cpumask_bits(&cpu_present_mask)[0] & 0xaa;
-   cpumask_bits(&mask)[0] = tlist;
-   writel((u8)tlist << 16, gicv2_dist_base() + GICD_SGIR);
+   cpumask_copy(&mask, &cpu_present_mask);
+   for (i = 0; i < nr_cpus; i += 2)
+   cpumask_clear_cpu(i, &mask);
+   gic_ipi_send_mask(0, &mask);
check_acked(&mask);
report_prefix_pop();
 
@@ -130,14 +161,14 @@ static void ipi_test_smp(void)
smp_wmb();
cpumask_copy(&mask, &cpu_present_mask);
cpumask_clear_cpu(0, &mask);
-   writel(1 << 24, gicv2_dist_base() + GICD_SGIR);
+   gic->ipi.send_broadcast();
check_acked(&mask);
report_prefix_pop();
 }
 
 static void ipi_enable(void)
 {
-   gicv2_enable_defaults();
+   gic_enable_defaults();
 #ifdef __arm__
install_exception_handler(EXCPTN_IRQ, ipi_handler);
 #else
@@ -154,18 +185,40 @@ static void ipi_recv(void)
wfi();
 }
 
+static struct gic gicv2 = {
+   .ipi = {
+   .send_self = gicv2_ipi_send_self,
+   .send_broadcast = gicv2_ipi_send_broadcast,
+   },
+};
+
+static struct gic gicv3 = {
+   .ipi = {
+   .send_self = gicv3_ipi_send_self,
+   .send_broadcast = gicv3_ipi_send_broadcast,
+   },
+};
+
 int main(int argc, char **argv)
 {
char pfx[8];
int cpu;
 
-   gic_version = gic_init();
-   if (!gic_version)
-   report_ab

Re: [Qemu-devel] [PATCH 3/3] qapi: Drop support for qobject_from_jsonf("%"PRId64)

2016-11-23 Thread Markus Armbruster

Eric Blake  writes:

> On 11/23/2016 08:17 AM, Markus Armbruster wrote:
>
>> 
>> The first two patches are bug fixes, and as such they should be
>> considered for 2.8.
>> 
>> This patch doesn't fix anything, and it might conceivably break
>> something.  Too late for 2.8.
>
> Ah, but it DOES fix check-qjson on Mac OS.

PATCH 1+2 do, don't they?

> As mentioned to Paolo, I'm splitting this into two parts for the v2
> series (the first part to fix testsuite failures on Mac OS which is
> still 2.8 material, the second to rip out %I64d which becomes more of
> 2.9 material).
>
>>> My other argument is that I _do_ intend to rip out ALL of the dynamic
>>> JSON support, at which point we no longer have %d, let alone %lld.
>>> Until you see that followup series and decide whether it was too
>>> invasive for 2.9, it's hard to say that we are throwing out anything
>>> useful in this short-term fix for 2.8.  So I guess that gives me a
>>> reason to hurry up and finish my work on that series to post it today
>>> before I take a long holiday weekend.
>> 
>> If we rip out _jsonf() in 2.9, then ripping out currently unused parts
>> of it in 2.8 during hard freeze is needless churn at a rather
>> inconvenient time.
>> 
>> If we decide not to rip it out, it may well have to be reverted.
>> 
>> I don't think there's a need to hurry, as this patch isn't appropriate
>> for 2.8 anyway, so there's no reason to quickly decide what to do with
>> the followup series now.
>
> Fair enough.
>
> v2 coming soon.

Thanks!

[Qemu-devel] [kvm-unit-tests PATCH v7 08/11] libcflat: add IS_ALIGNED() macro, and page sizes

2016-11-23 Thread Andrew Jones

From: Peter Xu 

These macros will be useful to do page alignment checks.

Reviewed-by: Andre Przywara 
Reviewed-by: Eric Auger 
Signed-off-by: Peter Xu 
[drew: also added SZ_64K and changed to shifts]
Signed-off-by: Andrew Jones 

---
v6: change to shifts [Alex]
---
 lib/libcflat.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/libcflat.h b/lib/libcflat.h
index c3fa4f24c499..bdcc561ccafd 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -33,6 +33,12 @@
 #define __ALIGN_MASK(x, mask)  (((x) + (mask)) & ~(mask))
 #define __ALIGN(x, a)  __ALIGN_MASK(x, (typeof(x))(a) - 1)
 #define ALIGN(x, a)__ALIGN((x), (a))
+#define IS_ALIGNED(x, a)   (((x) & ((typeof(x))(a) - 1)) == 0)
+
+#define SZ_4K  (1 << 12)
+#define SZ_64K (1 << 16)
+#define SZ_2M  (1 << 21)
+#define SZ_1G  (1 << 30)
 
 typedef uint8_tu8;
 typedef int8_t s8;
-- 
2.9.3
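
As a usage illustration, a page alignment check built on these macros
might look like the sketch below. The demo_* helper names are
hypothetical, and the macro is restated here with __typeof__ (the
spelling GCC and Clang also accept in strict ISO modes) only so the
snippet compiles on its own. Note that the mask trick is only meaningful
when the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Restated from the patch so the sketch is self-contained. */
#define IS_ALIGNED(x, a)   (((x) & ((__typeof__(x))(a) - 1)) == 0)

#define SZ_4K   (1 << 12)
#define SZ_64K  (1 << 16)

/* Hypothetical helper: is 'addr' on a 4K page boundary? */
static int demo_is_4k_aligned(uintptr_t addr)
{
    return IS_ALIGNED(addr, SZ_4K);
}

/* Hypothetical helper: is 'addr' on a 64K boundary? */
static int demo_is_64k_aligned(uintptr_t addr)
{
    return IS_ALIGNED(addr, SZ_64K);
}

/* Hypothetical helper: do 'len' bytes starting at 'addr' cover whole 4K pages? */
static int demo_range_is_page_granular(uintptr_t addr, uintptr_t len)
{
    assert(len != 0);   /* an empty range is a caller bug in this sketch */
    return IS_ALIGNED(addr, SZ_4K) && IS_ALIGNED(len, SZ_4K);
}
```

The cast to the type of x inside the macro keeps the mask as wide as the
value being tested, so 64-bit addresses are not truncated by the int-sized
SZ_* constants.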

[Qemu-devel] [for-2.8 4/4] 9pfs: add cleanup operation for proxy backend driver

2016-11-23 Thread Greg Kurz

From: Li Qiang 

The init operation of the proxy backend driver allocates a
V9fsProxy struct and some other resources. We should free these
resources when the 9pfs device is unrealized. This is what this
patch does.

Signed-off-by: Li Qiang 
Reviewed-by: Greg Kurz 
Signed-off-by: Greg Kurz 
---
 hw/9pfs/9p-proxy.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/9pfs/9p-proxy.c b/hw/9pfs/9p-proxy.c
index f2417b7fd73d..f4aa7a9d70f8 100644
--- a/hw/9pfs/9p-proxy.c
+++ b/hw/9pfs/9p-proxy.c
@@ -1168,9 +1168,22 @@ static int proxy_init(FsContext *ctx)
 return 0;
 }
 
+static void proxy_cleanup(FsContext *ctx)
+{
+V9fsProxy *proxy = ctx->private;
+
+g_free(proxy->out_iovec.iov_base);
+g_free(proxy->in_iovec.iov_base);
+if (ctx->export_flags & V9FS_PROXY_SOCK_NAME) {
+close(proxy->sockfd);
+}
+g_free(proxy);
+}
+
 FileOperations proxy_ops = {
 .parse_opts   = proxy_parse_opts,
 .init = proxy_init,
+.cleanup  = proxy_cleanup,
 .lstat= proxy_lstat,
 .readlink = proxy_readlink,
 .close= proxy_close,
-- 
2.7.4

[Qemu-devel] [kvm-unit-tests PATCH v7 09/11] arm/arm64: add initial gicv3 support

2016-11-23 Thread Andrew Jones

Reviewed-by: Alex Bennée 
Reviewed-by: Eric Auger 
Signed-off-by: Andrew Jones 

---
v7: split lib/arm/gic.c into gic-v2/3.c [Eric]
v6:
 - added comments [Alex]
 - added stride parameter to gicv3_set_redist_base [Andre]
 - redist-wait s/rwp/uwp/ and comment [Andre]
 - removed unnecessary wait-for-rwps [Andre]
v5: use modern register names [Andre]
v4:
 - only take defines from kernel we need now [Andre]
 - simplify enable by not caring if we reinit the distributor [drew]
v2:
 - configure irqs as NS GRP1
---
 arm/Makefile.common|   2 +-
 lib/arm/asm/arch_gicv3.h   |  47 
 lib/arm/asm/gic-v3.h   | 104 +
 lib/arm/asm/gic.h  |   5 ++-
 lib/arm/gic-v2.c   |  27 
 lib/arm/gic-v3.c   |  61 ++
 lib/arm/gic.c  |  30 +
 lib/arm64/asm/arch_gicv3.h |  44 +++
 lib/arm64/asm/gic-v3.h |   1 +
 lib/arm64/asm/sysreg.h |  44 +++
 10 files changed, 343 insertions(+), 22 deletions(-)
 create mode 100644 lib/arm/asm/arch_gicv3.h
 create mode 100644 lib/arm/asm/gic-v3.h
 create mode 100644 lib/arm/gic-v2.c
 create mode 100644 lib/arm/gic-v3.c
 create mode 100644 lib/arm64/asm/arch_gicv3.h
 create mode 100644 lib/arm64/asm/gic-v3.h
 create mode 100644 lib/arm64/asm/sysreg.h

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 2fe7aeeca6d4..6c0898f28be1 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -46,7 +46,7 @@ cflatobjs += lib/arm/mmu.o
 cflatobjs += lib/arm/bitops.o
 cflatobjs += lib/arm/psci.o
 cflatobjs += lib/arm/smp.o
-cflatobjs += lib/arm/gic.o
+cflatobjs += lib/arm/gic.o lib/arm/gic-v2.o lib/arm/gic-v3.o
 
 libeabi = lib/arm/libeabi.a
 eabiobjs = lib/arm/eabi_compat.o
diff --git a/lib/arm/asm/arch_gicv3.h b/lib/arm/asm/arch_gicv3.h
new file mode 100644
index ..276577452a14
--- /dev/null
+++ b/lib/arm/asm/arch_gicv3.h
@@ -0,0 +1,47 @@
+/*
+ * All ripped off from arch/arm/include/asm/arch_gicv3.h
+ *
+ * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#ifndef _ASMARM_ARCH_GICV3_H_
+#define _ASMARM_ARCH_GICV3_H_
+
+#ifndef __ASSEMBLY__
+#include 
+#include 
+#include 
+
+#define __stringify xstr
+
+#define __ACCESS_CP15(CRn, Op1, CRm, Op2)  p15, Op1, %0, CRn, CRm, Op2
+
+#define ICC_PMR__ACCESS_CP15(c4, 0, c6, 0)
+#define ICC_IGRPEN1__ACCESS_CP15(c12, 0, c12, 7)
+
+static inline void gicv3_write_pmr(u32 val)
+{
+   asm volatile("mcr " __stringify(ICC_PMR) : : "r" (val));
+}
+
+static inline void gicv3_write_grpen1(u32 val)
+{
+   asm volatile("mcr " __stringify(ICC_IGRPEN1) : : "r" (val));
+   isb();
+}
+
+/*
+ * We may access GICR_TYPER and GITS_TYPER by reading both the TYPER
+ * offset and the following offset (+ 4) and then combining them to
+ * form a 64-bit address.
+ */
+static inline u64 gicv3_read_typer(const volatile void __iomem *addr)
+{
+   u64 val = readl(addr);
+   val |= (u64)readl(addr + 4) << 32;
+   return val;
+}
+
+#endif /* !__ASSEMBLY__ */
+#endif /* _ASMARM_ARCH_GICV3_H_ */
diff --git a/lib/arm/asm/gic-v3.h b/lib/arm/asm/gic-v3.h
new file mode 100644
index ..73ade4681d21
--- /dev/null
+++ b/lib/arm/asm/gic-v3.h
@@ -0,0 +1,104 @@
+/*
+ * All GIC* defines are lifted from include/linux/irqchip/arm-gic-v3.h
+ *
+ * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#ifndef _ASMARM_GIC_V3_H_
+#define _ASMARM_GIC_V3_H_
+
+#ifndef _ASMARM_GIC_H_
+#error Do not directly include . Include 
+#endif
+
+/*
+ * Distributor registers
+ *
+ * We expect to be run in Non-secure mode, thus we define the
+ * group1 enable bits with respect to that view.
+ */
+#define GICD_CTLR_RWP  (1U << 31)
+#define GICD_CTLR_ARE_NS   (1U << 4)
+#define GICD_CTLR_ENABLE_G1A   (1U << 1)
+#define GICD_CTLR_ENABLE_G1(1U << 0)
+
+/* Re-Distributor registers, offsets from RD_base */
+#define GICR_TYPER 0x0008
+
+#define GICR_TYPER_LAST(1U << 4)
+
+/* Re-Distributor registers, offsets from SGI_base */
+#define GICR_IGROUPR0  GICD_IGROUPR
+#define GICR_ISENABLER0GICD_ISENABLER
+#define GICR_IPRIORITYR0   GICD_IPRIORITYR
+
+#include 
+
+#ifndef __ASSEMBLY__
+#include 
+#include 
+#include 
+#include 
+
+struct gicv3_data {
+   void *dist_base;
+   void *redist_base[NR_CPUS];
+   unsigned int irq_nr;
+};
+extern struct gicv3_data gicv3_data;
+
+#define gicv3_dist_base()  (gicv3_data.dist_base)
+#define gicv3_redist_base()    (gicv3_data.redist_base[smp_processor_id()])
+#define gicv3_sgi_base()       (gicv3_data.redist_base[smp_processor_id()] + SZ_64K)
+
+extern int gicv3_init(void);

[Qemu-devel] [kvm-unit-tests PATCH v7 07/11] arm/arm64: gicv2: add an IPI test

2016-11-23 Thread Andrew Jones

Reviewed-by: Eric Auger 
Signed-off-by: Andrew Jones 
---
v6: move the spurious check to its own check_ function [drew]
v5: use modern registers [Andre]
v4: properly mask irqnr in ipi_handler
v2: add more details in the output if a test fails,
report spurious interrupts if we get them
---
 arm/Makefile.common  |   8 +--
 arm/gic.c| 199 +++
 arm/unittests.cfg|   8 +++
 lib/arm/asm/gic-v2.h |   2 +
 lib/arm/asm/gic.h|   4 ++
 5 files changed, 217 insertions(+), 4 deletions(-)
 create mode 100644 arm/gic.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 6f56015c43c4..2fe7aeeca6d4 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -9,10 +9,10 @@ ifeq ($(LOADADDR),)
LOADADDR = 0x4000
 endif
 
-tests-common = \
-   $(TEST_DIR)/selftest.flat \
-   $(TEST_DIR)/spinlock-test.flat \
-   $(TEST_DIR)/pci-test.flat
+tests-common  = $(TEST_DIR)/selftest.flat
+tests-common += $(TEST_DIR)/spinlock-test.flat
+tests-common += $(TEST_DIR)/pci-test.flat
+tests-common += $(TEST_DIR)/gic.flat
 
 all: test_cases
 
diff --git a/arm/gic.c b/arm/gic.c
new file mode 100644
index ..b42c2b1ca1e1
--- /dev/null
+++ b/arm/gic.c
@@ -0,0 +1,199 @@
+/*
+ * GIC tests
+ *
+ * GICv2
+ *   + test sending/receiving IPIs
+ *
+ * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int gic_version;
+static int acked[NR_CPUS], spurious[NR_CPUS];
+static cpumask_t ready;
+
+static void nr_cpu_check(int nr)
+{
+   if (nr_cpus < nr)
+   report_abort("At least %d cpus required", nr);
+}
+
+static void wait_on_ready(void)
+{
+   cpumask_set_cpu(smp_processor_id(), &ready);
+   while (!cpumask_full(&ready))
+   cpu_relax();
+}
+
+static void check_acked(cpumask_t *mask)
+{
+   int missing = 0, extra = 0, unexpected = 0;
+   int nr_pass, cpu, i;
+
+   /* Wait up to 5s for all interrupts to be delivered */
+   for (i = 0; i < 50; ++i) {
+   mdelay(100);
+   nr_pass = 0;
+   for_each_present_cpu(cpu) {
+   smp_rmb();
+   nr_pass += cpumask_test_cpu(cpu, mask) ?
+   acked[cpu] == 1 : acked[cpu] == 0;
+   }
+   if (nr_pass == nr_cpus) {
+   report("Completed in %d ms", true, ++i * 100);
+   return;
+   }
+   }
+
+   for_each_present_cpu(cpu) {
+   if (cpumask_test_cpu(cpu, mask)) {
+   if (!acked[cpu])
+   ++missing;
+   else if (acked[cpu] > 1)
+   ++extra;
+   } else {
+   if (acked[cpu])
+   ++unexpected;
+   }
+   }
+
+   report("Timed-out (5s). ACKS: missing=%d extra=%d unexpected=%d",
+  false, missing, extra, unexpected);
+}
+
+static void check_spurious(void)
+{
+   int cpu;
+
+   smp_rmb();
+   for_each_present_cpu(cpu) {
+   if (spurious[cpu])
+   printf("ipi: WARN: cpu%d got %d spurious interrupts\n",
+   cpu, spurious[cpu]);
+   }
+}
+
+static void ipi_handler(struct pt_regs *regs __unused)
+{
+   u32 irqstat = readl(gicv2_cpu_base() + GICC_IAR);
+   u32 irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+
+   if (irqnr != GICC_INT_SPURIOUS) {
+   writel(irqstat, gicv2_cpu_base() + GICC_EOIR);
+   smp_rmb(); /* pairs with wmb in ipi_test functions */
+   ++acked[smp_processor_id()];
+   smp_wmb(); /* pairs with rmb in check_acked */
+   } else {
+   ++spurious[smp_processor_id()];
+   smp_wmb();
+   }
+}
+
+static void ipi_test_self(void)
+{
+   cpumask_t mask;
+
+   report_prefix_push("self");
+   memset(acked, 0, sizeof(acked));
+   smp_wmb();
+   cpumask_clear(&mask);
+   cpumask_set_cpu(0, &mask);
+   writel(2 << 24, gicv2_dist_base() + GICD_SGIR);
+   check_acked(&mask);
+   report_prefix_pop();
+}
+
+static void ipi_test_smp(void)
+{
+   cpumask_t mask;
+   unsigned long tlist;
+
+   report_prefix_push("target-list");
+   memset(acked, 0, sizeof(acked));
+   smp_wmb();
+   tlist = cpumask_bits(&cpu_present_mask)[0] & 0xaa;
+   cpumask_bits(&mask)[0] = tlist;
+   writel((u8)tlist << 16, gicv2_dist_base() + GICD_SGIR);
+   check_acked(&mask);
+   report_prefix_pop();
+
+   report_prefix_push("broadcast");
+   memset(acked, 0, sizeof(acked));
+   smp_wmb();
+   cpumask_copy(&mask, &cpu_present_mask);
+   cpumask_clear_cpu(0, &mask);
+   writel(1 << 24, gicv2_

[Qemu-devel] [kvm-unit-tests PATCH v7 00/11] arm/arm64: add gic framework

2016-11-23 Thread Andrew Jones

v7:
 - biggest change is splitting lib/arm/gic.c into lib/arm/gic.c,
   lib/arm/gic-v2.c, lib/arm/gic-v3.c
 - second biggest change, which probably affects Alex, is that
   gic_ipi_send(cpu, irq) changed to gic_ipi_send_single(irq, cpu),
   note the swapping of cpu and irq!
 - other changes thanks to Eric are noted in individual patches
 - also rebased to latest master

v6:
 - rebased to latest master
 - several other changes thanks to Andre and Alex, changes in
   individual patch change logs
 - some code cleanups

v5:
 - fix arm32/gicv3 compile [drew]
 - use modern register names [Andre]
 - one Andre r-b

v4:
 - Eric's r-b's
 - Andre's suggestion to only take defines we need
 - several other changes listed in individual patches

v3:
 - Rebased on latest master
 - Added Alex's r-b's

v2:
 Rebased on latest master + my "populate argv[0]" series (will
 send a REPOST for that shortly). Additionally a few patches got
 fixes/features:
 07/10 got same fix as kernel 7c9b973061 "irqchip/gic-v3: Configure
   all interrupts as non-secure Group-1" in order to continue
   working over TCG, as the gicv3 code for TCG removed a hack
   it had there to make Linux happy.
 08/10 added more output for when things fail (if they fail)
 09/10 switched gicv3 broadcast implementation to using IRM. This
   found a bug in a recent (but not tip) kernel, which I was
   about to fix, but then I saw MarcZ beat me to it.
 10/10 actually check that the input irq is the received irq


Import defines, and steal enough helper functions, from Linux to
enable programming of the gic (v2 and v3). Then use the framework
to add an initial test (an ipi test; self, target-list, broadcast).
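
For orientation, the three gicv2 IPI flavours differ only in how the
GICD_SGIR value is composed, and the handler recovers the irq and sender
from GICC_IAR. A minimal sketch follows, assuming the GICv2 register
layout (target list filter at bits 25:24, CPU target list at bits 23:16,
SGI id in the low four bits; IAR interrupt id in bits 9:0, source CPU in
bits 12:10). The demo_* names are hypothetical and not part of the
series.

```c
#include <assert.h>
#include <stdint.h>

/* Target list filter values as used by the tests. */
enum demo_sgi_filter {
    DEMO_SGI_TARGET_LIST  = 0,   /* deliver to the CPUs in the target list */
    DEMO_SGI_ALL_BUT_SELF = 1,   /* broadcast */
    DEMO_SGI_SELF         = 2,   /* deliver to the requesting CPU only */
};

/* Compose a GICD_SGIR value; SGI ids are limited to 0..15. */
static uint32_t demo_sgir(enum demo_sgi_filter filter, uint8_t cpu_list,
                          unsigned int irq)
{
    assert(irq < 16);
    return (uint32_t)filter << 24 | (uint32_t)cpu_list << 16 | irq;
}

/* Interrupt id held in a GICC_IAR value. */
static unsigned int demo_iar_irqnr(uint32_t iar)
{
    return iar & 0x3ff;
}

/* CPU that sent the SGI, as reported by a GICC_IAR value. */
static unsigned int demo_iar_sender(uint32_t iar)
{
    return (iar >> 10) & 7;
}
```

With these, the "self" test corresponds to demo_sgir(DEMO_SGI_SELF, 0, irq),
"broadcast" to demo_sgir(DEMO_SGI_ALL_BUT_SELF, 0, irq), and "target-list"
to demo_sgir(DEMO_SGI_TARGET_LIST, mask, irq) with one bit per target cpu
in mask.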

It's my hope that this framework will be a suitable base on which
more tests may be easily added, particularly because we have
vgic-new and tcg gicv3 emulation getting close to merge. (v3 UPDATE:
vgic-new and tcg gicv3 are merged now)

To run it, along with other tests, just do

 ./configure [ --arch=[arm|arm64] --cross-prefix=$PREFIX ]
 make
 export QEMU=$PATH_TO_QEMU
 ./run_tests.sh

To run it separately do, e.g.

$QEMU -machine virt,accel=tcg -cpu cortex-a57 \
 -device virtio-serial-device \
 -device virtconsole,chardev=ctd -chardev testdev,id=ctd \
 -display none -serial stdio \
 -kernel arm/gic.flat \
 -smp 123 -machine gic-version=3 -append ipi
  ^^ note, we can go nuts with nr-cpus on TCG :-)

Or, a KVM example using a different "sender" cpu and irq (other than zero)

$QEMU -machine virt,accel=kvm -cpu host \
 -device virtio-serial-device \
 -device virtconsole,chardev=ctd -chardev testdev,id=ctd \
 -display none -serial stdio \
 -kernel arm/gic.flat \
 -smp 48 -machine gic-version=3 -append 'ipi sender=42 irq=1'


Patches:
01-05: fixes and functionality needed by the later gic patches
06-07: enable gicv2 and gicv2 IPI test
08-10: enable gicv3 and gicv3 IPI test
   11: extend the IPI tests to take variable sender and irq

Available here: https://github.com/rhdrjones/kvm-unit-tests/commits/arm/gic-v7


Andrew Jones (10):
  lib: xstr: allow multiple args
  arm64: fix get_"sysreg32" and make MPIDR 64bit
  arm/arm64: smp: support more than 8 cpus
  arm/arm64: add some delay routines
  arm/arm64: irq enable/disable
  arm/arm64: add initial gicv2 support
  arm/arm64: gicv2: add an IPI test
  arm/arm64: add initial gicv3 support
  arm/arm64: gicv3: add an IPI test
  arm/arm64: gic: don't just use zero

Peter Xu (1):
  libcflat: add IS_ALIGNED() macro, and page sizes

 arm/Makefile.common|   9 +-
 arm/gic.c  | 349 +
 arm/run|  19 ++-
 arm/selftest.c |   5 +-
 arm/unittests.cfg  |  14 ++
 lib/arm/asm/arch_gicv3.h   |  70 +
 lib/arm/asm/gic-v2.h   |  38 +
 lib/arm/asm/gic-v3.h   | 114 +++
 lib/arm/asm/gic.h  | 109 ++
 lib/arm/asm/processor.h|  42 +-
 lib/arm/asm/setup.h|   4 +-
 lib/arm/gic-v2.c   |  67 +
 lib/arm/gic-v3.c   | 155 
 lib/arm/gic.c  |  71 +
 lib/arm/processor.c|  15 ++
 lib/arm/setup.c|  10 ++
 lib/arm64/asm/arch_gicv3.h |  66 +
 lib/arm64/asm/gic-v2.h |   1 +
 lib/arm64/asm/gic-v3.h |   1 +
 lib/arm64/asm/gic.h|   1 +
 lib/arm64/asm/processor.h  |  53 +--
 lib/arm64/asm/sysreg.h |  44 ++
 lib/arm64/processor.c  |  15 ++
 lib/libcflat.h |  10 +-
 24 files changed, 1254 insertions(+), 28 deletions(-)
 create mode 100644 arm/gic.c
 create mode 100644 lib/arm/asm/arch_gicv3.h
 create mode 100644 lib/arm/asm/gic-v2.h
 create mode 100644 lib/arm/asm/gic-v3.h
 create mode 100644 lib/arm/asm/gic.h
 create mode 100644 lib/arm/gic-v2.c
 create mode 100644 lib/arm/gic-v3.c
 create mode 100644 lib/arm/gic.c
 create mode 100644 lib/arm64/asm/arch_gicv3.h
 create mode 100644 lib/arm64/asm/gic-v2.h
 create mode 100644 lib/arm64/asm/gi

[Qemu-devel] [kvm-unit-tests PATCH v7 03/11] arm/arm64: smp: support more than 8 cpus

2016-11-23 Thread Andrew Jones

By adding support for launching with gicv3 we can break the 8 vcpu
limit. This patch adds support to smp code and also selects the
vgic model corresponding to the host. The vgic model may also be
manually selected by adding e.g. -machine gic-version=3 to
extra_params.
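
As a rough illustration of the lookup this patch introduces, the sketch below can be built on any host. The macros mirror the ones added to lib/arm/asm/processor.h (with the arguments parenthesized); lookup_cpu is a hypothetical stand-alone variant of mpidr_to_cpu() that takes the ID table as a parameter instead of using the global cpus[] array.

```c
#include <stdint.h>

/*
 * Bit offset of each affinity field in MPIDR:
 * level 0 -> 0, level 1 -> 8, level 2 -> 16, level 3 -> 32.
 */
#define MPIDR_LEVEL_SHIFT(level) \
	(((1 << (level)) >> 1) << 3)
#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
	(((mpidr) >> MPIDR_LEVEL_SHIFT(level)) & 0xff)

#define MPIDR_HWID_BITMASK 0xff

/*
 * Hypothetical variant of mpidr_to_cpu(): return the index of the
 * entry in ids[] matching the masked MPIDR, or -1 if none matches.
 */
static inline int lookup_cpu(const uint32_t *ids, int nr, uint64_t mpidr)
{
	int i;

	for (i = 0; i < nr; ++i)
		if (ids[i] == (mpidr & MPIDR_HWID_BITMASK))
			return i;
	return -1;
}
```

Because the mask is 0xff, only Aff0 takes part in the comparison; the other MPIDR bits, including bit 31, are ignored by the lookup.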

Reviewed-by: Alex Bennée 
Reviewed-by: Andre Przywara 
Reviewed-by: Eric Auger 
Signed-off-by: Andrew Jones 

---
v5: left cpus a u32 for now. Changing to u64 requires a change to
devicetree. Will do it later. [Andre]
v4: improved commit message
---
 arm/run   | 19 ---
 arm/selftest.c|  5 -
 lib/arm/asm/processor.h   |  9 +++--
 lib/arm/asm/setup.h   |  4 ++--
 lib/arm/setup.c   | 10 ++
 lib/arm64/asm/processor.h |  9 +++--
 6 files changed, 42 insertions(+), 14 deletions(-)

diff --git a/arm/run b/arm/run
index f1b04af614dc..1c40ab02eb57 100755
--- a/arm/run
+++ b/arm/run
@@ -31,13 +31,6 @@ if [ -z "$ACCEL" ]; then
fi
 fi
 
-if [ "$HOST" = "aarch64" ] && [ "$ACCEL" = "kvm" ]; then
-   processor="host"
-   if [ "$ARCH" = "arm" ]; then
-   processor+=",aarch64=off"
-   fi
-fi
-
 qemu="${QEMU:-qemu-system-$ARCH_NAME}"
 qpath=$(which $qemu 2>/dev/null)
 
@@ -53,6 +46,18 @@ fi
 
 M='-machine virt'
 
+if [ "$ACCEL" = "kvm" ]; then
+   if $qemu $M,\? 2>&1 | grep gic-version > /dev/null; then
+   M+=',gic-version=host'
+   fi
+   if [ "$HOST" = "aarch64" ]; then
+   processor="host"
+   if [ "$ARCH" = "arm" ]; then
+   processor+=",aarch64=off"
+   fi
+   fi
+fi
+
 if ! $qemu $M -device '?' 2>&1 | grep virtconsole > /dev/null; then
echo "$qpath doesn't support virtio-console for chr-testdev. Exiting."
exit 2
diff --git a/arm/selftest.c b/arm/selftest.c
index 196164f5313d..2f117f795d2d 100644
--- a/arm/selftest.c
+++ b/arm/selftest.c
@@ -312,9 +312,10 @@ static bool psci_check(void)
 static cpumask_t smp_reported;
 static void cpu_report(void)
 {
+   unsigned long mpidr = get_mpidr();
int cpu = smp_processor_id();
 
-   report("CPU%d online", true, cpu);
+   report("CPU(%3d) mpidr=%lx", mpidr_to_cpu(mpidr) == cpu, cpu, mpidr);
cpumask_set_cpu(cpu, &smp_reported);
halt();
 }
@@ -343,6 +344,7 @@ int main(int argc, char **argv)
 
} else if (strcmp(argv[1], "smp") == 0) {
 
+   unsigned long mpidr = get_mpidr();
int cpu;
 
report("PSCI version", psci_check());
@@ -353,6 +355,7 @@ int main(int argc, char **argv)
smp_boot_secondary(cpu, cpu_report);
}
 
+   report("CPU(%3d) mpidr=%lx", mpidr_to_cpu(mpidr) == 0, 0, mpidr);
cpumask_set_cpu(0, &smp_reported);
while (!cpumask_full(&smp_reported))
cpu_relax();
diff --git a/lib/arm/asm/processor.h b/lib/arm/asm/processor.h
index 02f912f99974..ecf5bbe1824a 100644
--- a/lib/arm/asm/processor.h
+++ b/lib/arm/asm/processor.h
@@ -40,8 +40,13 @@ static inline unsigned long get_mpidr(void)
return mpidr;
 }
 
-/* Only support Aff0 for now, up to 4 cpus */
-#define mpidr_to_cpu(mpidr) ((int)((mpidr) & 0xff))
+#define MPIDR_HWID_BITMASK 0xff
+extern int mpidr_to_cpu(unsigned long mpidr);
+
+#define MPIDR_LEVEL_SHIFT(level) \
+   (((1 << level) >> 1) << 3)
+#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
+   ((mpidr >> MPIDR_LEVEL_SHIFT(level)) & 0xff)
 
 extern void start_usr(void (*func)(void *arg), void *arg, unsigned long sp_usr);
 extern bool is_user(void);
diff --git a/lib/arm/asm/setup.h b/lib/arm/asm/setup.h
index cb8fdbd38dd5..1de99dd184d1 100644
--- a/lib/arm/asm/setup.h
+++ b/lib/arm/asm/setup.h
@@ -10,8 +10,8 @@
 #include 
 #include 
 
-#define NR_CPUS 8
-extern u32 cpus[NR_CPUS];
+#define NR_CPUS 255
+extern u32 cpus[NR_CPUS];  /* per-cpu IDs (MPIDRs) */
 extern int nr_cpus;
 
 #define NR_MEM_REGIONS 8
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 7e7b39f11dde..241bf9410447 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -30,6 +30,16 @@ int nr_cpus;
 struct mem_region mem_regions[NR_MEM_REGIONS];
 phys_addr_t __phys_offset, __phys_end;
 
+int mpidr_to_cpu(unsigned long mpidr)
+{
+   int i;
+
+   for (i = 0; i < nr_cpus; ++i)
+   if (cpus[i] == (mpidr & MPIDR_HWID_BITMASK))
+   return i;
+   return -1;
+}
+
 static void cpu_set(int fdtnode __unused, u32 regval, void *info __unused)
 {
int cpu = nr_cpus++;
diff --git a/lib/arm64/asm/processor.h b/lib/arm64/asm/processor.h
index 9a208ff729b7..7e448dc81a6a 100644
--- a/lib/arm64/asm/processor.h
+++ b/lib/arm64/asm/processor.h
@@ -78,8 +78,13 @@ static inline type get_##reg(void) \
 
 DEFINE_GET_SYSREG64(mpidr)
 
-/* Only support Aff0 for now, gicv2 only */
-#define mpidr_to_cpu(mpidr) ((int)((mpid

[Qemu-devel] [kvm-unit-tests PATCH v7 02/11] arm64: fix get_"sysreg32" and make MPIDR 64bit

2016-11-23 Thread Andrew Jones

mrs is always 64bit, so we should always use a 64bit register.
Sometimes we'll only want to return the lower 32, but not for
MPIDR, as that does define fields in the upper 32.
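
To make the truncation concrete, here is a small host-side sketch. The helper names are hypothetical and the raw value is passed in rather than read with mrs; the only architectural fact assumed is that MPIDR_EL1 holds Aff3 in bits [39:32].

```c
#include <stdint.h>

/* Aff3 lives in MPIDR_EL1 bits [39:32], above the low 32 bits. */
#define MPIDR_AFF3(mpidr)	(((mpidr) >> 32) & 0xff)

/*
 * Hypothetical stand-ins for the two getter flavours: both start
 * from the full 64-bit value, as mrs always produces one.
 */
static inline uint32_t sysreg_as_u32(uint64_t raw)
{
	return (uint32_t)raw;	/* keeps only bits [31:0] */
}

static inline uint64_t sysreg_as_u64(uint64_t raw)
{
	return raw;		/* keeps every field, including Aff3 */
}
```

With a raw value whose Aff3 field is nonzero, MPIDR_AFF3() applied to the 64-bit result recovers it, while the 32-bit result always yields zero.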

Reviewed-by: Alex Bennée 
Reviewed-by: Eric Auger 
Signed-off-by: Andrew Jones 

---
v5: switch arm32's get_mpidr to 'unsigned long' too, to be
consistent with arm64 [Andre]
---
 lib/arm/asm/processor.h   |  4 ++--
 lib/arm64/asm/processor.h | 15 +--
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/lib/arm/asm/processor.h b/lib/arm/asm/processor.h
index f25e7eee3666..02f912f99974 100644
--- a/lib/arm/asm/processor.h
+++ b/lib/arm/asm/processor.h
@@ -33,9 +33,9 @@ static inline unsigned long current_cpsr(void)
 
 #define current_mode() (current_cpsr() & MODE_MASK)
 
-static inline unsigned int get_mpidr(void)
+static inline unsigned long get_mpidr(void)
 {
-   unsigned int mpidr;
+   unsigned long mpidr;
asm volatile("mrc p15, 0, %0, c0, c0, 5" : "=r" (mpidr));
return mpidr;
 }
diff --git a/lib/arm64/asm/processor.h b/lib/arm64/asm/processor.h
index 84d5c7ce752b..9a208ff729b7 100644
--- a/lib/arm64/asm/processor.h
+++ b/lib/arm64/asm/processor.h
@@ -66,14 +66,17 @@ static inline unsigned long current_level(void)
return el & 0xc;
 }
 
-#define DEFINE_GET_SYSREG32(reg)   \
-static inline unsigned int get_##reg(void) \
+#define DEFINE_GET_SYSREG(reg, type)   \
+static inline type get_##reg(void) \
 {  \
-   unsigned int reg;   \
-   asm volatile("mrs %0, " #reg "_el1" : "=r" (reg));  \
-   return reg; \
+   unsigned long r;\
+   asm volatile("mrs %0, " #reg "_el1" : "=r" (r));\
+   return (type)r; \
 }
-DEFINE_GET_SYSREG32(mpidr)
+#define DEFINE_GET_SYSREG32(reg) DEFINE_GET_SYSREG(reg, unsigned int)
+#define DEFINE_GET_SYSREG64(reg) DEFINE_GET_SYSREG(reg, unsigned long)
+
+DEFINE_GET_SYSREG64(mpidr)
 
 /* Only support Aff0 for now, gicv2 only */
 #define mpidr_to_cpu(mpidr) ((int)((mpidr) & 0xff))
-- 
2.9.3

[Qemu-devel] [kvm-unit-tests PATCH v7 06/11] arm/arm64: add initial gicv2 support

2016-11-23 Thread Andrew Jones

Add some gicv2 support. This just adds init and enable
functions, allowing unit tests to start messing with it.
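
As a worked example of the arithmetic behind the init code, the sketch below computes how many interrupt IDs a distributor reports and how many priority registers cover them. It assumes the GICv2 layout in which ITLinesNumber occupies GICD_TYPER bits [4:0] and each 32-bit GICD_IPRIORITYR register holds four 8-bit priority fields; the helper gicd_nr_priority_regs is hypothetical and not part of this patch.

```c
#include <stdint.h>

/* The GIC supports up to 32 * (ITLinesNumber + 1) interrupt IDs. */
#define GICD_TYPER_IRQS(typer)	((((typer) & 0x1f) + 1) * 32)

/*
 * Hypothetical helper: number of 32-bit GICD_IPRIORITYR registers
 * needed to set a default priority for every interrupt ID.
 */
static inline unsigned int gicd_nr_priority_regs(uint32_t typer)
{
	return GICD_TYPER_IRQS(typer) / 4;
}
```

For instance, a GICD_TYPER value with ITLinesNumber = 2 gives 96 interrupt IDs and therefore 24 priority registers.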

Reviewed-by: Andre Przywara 
Reviewed-by: Eric Auger 
Signed-off-by: Andrew Jones 

---
v6: added comments (register offset headers) [Alex]
v5: share/use only the modern register names [Andre]
v4:
 - only take defines from kernel we need now [Andre]
 - moved defines to asm/gic.h so they'll be shared with v3 [drew]
 - simplify enable by not caring if we reinit the distributor [drew]
 - init all GICD_INT_DEF_PRI_X4 registers [Eric]
---
 arm/Makefile.common|  1 +
 lib/arm/asm/gic-v2.h   | 34 ++
 lib/arm/asm/gic.h  | 39 ++
 lib/arm/gic.c  | 76 ++
 lib/arm64/asm/gic-v2.h |  1 +
 lib/arm64/asm/gic.h|  1 +
 6 files changed, 152 insertions(+)
 create mode 100644 lib/arm/asm/gic-v2.h
 create mode 100644 lib/arm/asm/gic.h
 create mode 100644 lib/arm/gic.c
 create mode 100644 lib/arm64/asm/gic-v2.h
 create mode 100644 lib/arm64/asm/gic.h

diff --git a/arm/Makefile.common b/arm/Makefile.common
index f37b5c2a3de4..6f56015c43c4 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -46,6 +46,7 @@ cflatobjs += lib/arm/mmu.o
 cflatobjs += lib/arm/bitops.o
 cflatobjs += lib/arm/psci.o
 cflatobjs += lib/arm/smp.o
+cflatobjs += lib/arm/gic.o
 
 libeabi = lib/arm/libeabi.a
 eabiobjs = lib/arm/eabi_compat.o
diff --git a/lib/arm/asm/gic-v2.h b/lib/arm/asm/gic-v2.h
new file mode 100644
index 000000000000..c2d5fecd4886
--- /dev/null
+++ b/lib/arm/asm/gic-v2.h
@@ -0,0 +1,34 @@
+/*
+ * All GIC* defines are lifted from include/linux/irqchip/arm-gic.h
+ *
+ * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#ifndef _ASMARM_GIC_V2_H_
+#define _ASMARM_GIC_V2_H_
+
+#ifndef _ASMARM_GIC_H_
+#error Do not directly include <asm/gic-v2.h>. Include <asm/gic.h>
+#endif
+
+#define GICD_ENABLE 0x1
+#define GICC_ENABLE 0x1
+
+#ifndef __ASSEMBLY__
+
+struct gicv2_data {
+   void *dist_base;
+   void *cpu_base;
+   unsigned int irq_nr;
+};
+extern struct gicv2_data gicv2_data;
+
+#define gicv2_dist_base()  (gicv2_data.dist_base)
+#define gicv2_cpu_base()   (gicv2_data.cpu_base)
+
+extern int gicv2_init(void);
+extern void gicv2_enable_defaults(void);
+
+#endif /* !__ASSEMBLY__ */
+#endif /* _ASMARM_GIC_V2_H_ */
diff --git a/lib/arm/asm/gic.h b/lib/arm/asm/gic.h
new file mode 100644
index 000000000000..e3580bd1d42d
--- /dev/null
+++ b/lib/arm/asm/gic.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#ifndef _ASMARM_GIC_H_
+#define _ASMARM_GIC_H_
+
+#include <asm/gic-v2.h>
+
+/* Distributor registers */
+#define GICD_CTLR  0x0000
+#define GICD_TYPER 0x0004
+#define GICD_ISENABLER 0x0100
+#define GICD_IPRIORITYR 0x0400
+
+#define GICD_TYPER_IRQS(typer) ((((typer) & 0x1f) + 1) * 32)
+#define GICD_INT_EN_SET_SGI 0x0000ffff
+#define GICD_INT_DEF_PRI_X4 0xa0a0a0a0
+
+/* CPU interface registers */
+#define GICC_CTLR  0x0000
+#define GICC_PMR   0x0004
+
+#define GICC_INT_PRI_THRESHOLD 0xf0
+
+#ifndef __ASSEMBLY__
+
+/*
+ * gic_init will try to find all known gics, and then
+ * initialize the gic data for the one found.
+ * returns
+ *  0   : no gic was found
+ *  > 0 : the gic version of the gic found
+ */
+extern int gic_init(void);
+
+#endif /* !__ASSEMBLY__ */
+#endif /* _ASMARM_GIC_H_ */
diff --git a/lib/arm/gic.c b/lib/arm/gic.c
new file mode 100644
index 000000000000..d655105e058b
--- /dev/null
+++ b/lib/arm/gic.c
@@ -0,0 +1,76 @@
+/*
+ * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#include 
+#include 
+#include 
+
+struct gicv2_data gicv2_data;
+
+/*
+ * Documentation/devicetree/bindings/interrupt-controller/arm,gic.txt
+ */
+static bool
+gic_get_dt_bases(const char *compatible, void **base1, void **base2)
+{
+   struct dt_pbus_reg reg;
+   struct dt_device gic;
+   struct dt_bus bus;
+   int node, ret;
+
+   dt_bus_init_defaults(&bus);
+   dt_device_init(&gic, &bus, NULL);
+
+   node = dt_device_find_compatible(&gic, compatible);
+   assert(node >= 0 || node == -FDT_ERR_NOTFOUND);
+
+   if (node == -FDT_ERR_NOTFOUND)
+   return false;
+
+   dt_device_bind_node(&gic, node);
+
+   ret = dt_pbus_translate(&gic, 0, &reg);
+   assert(ret == 0);
+   *base1 = ioremap(reg.addr, reg.size);
+
+   ret = dt_pbus_translate(&gic, 1, &reg);
+   assert(ret == 0);
+   *base2 = ioremap(reg.addr, reg.size);
+
+   return true;
+}
+
+int gicv2_init(void)
+{
+   return gic_get_dt_bases("arm,cortex-a15-gic",
+
