date:20170801

Re: [Qemu-devel] [RFC 16/29] qmp: hmp: add migrate "resume" option

2017-08-01 Thread Peter Xu

On Tue, Aug 01, 2017 at 12:03:48PM +0100, Daniel P. Berrange wrote:
> On Fri, Jul 28, 2017 at 04:06:25PM +0800, Peter Xu wrote:
> > It will be used when we want to resume one paused migration.
> > 
> > Signed-off-by: Peter Xu 
> > ---
> >  hmp-commands.hx   | 7 ---
> >  hmp.c | 4 +++-
> >  migration/migration.c | 2 +-
> >  qapi-schema.json  | 5 -
> >  4 files changed, 12 insertions(+), 6 deletions(-)
> 
> I'm not seeing explicit info about how we handle the original failure
> and how it relates to this resume command, but this feels like a
> potentially racy approach to me.
> 
> If we have a network problem between source & target, we could see
> two results. Either the TCP stream will simply hang (it'll still
> appear open to QEMU but no traffic will be flowing),

(let's say this is the "1st condition")

> or the connection
> may actually break such that we get EOF and end up closing the file
> descriptor.

(let's say this is the "2nd condition")

> 
> In the latter case, we're ok because the original channel is now
> gone and we can safely establish the new one by issuing the new
> 'migrate --resume URI' command.
> 
> In the former case, however, there is the possibility that the
> hang may come back to life at some point, concurrently with us
> trying to do 'migrate --resume URI' and I'm unclear on the
> semantics if that happens.
> 
> Should the original connection carry on, and thus cause the
> 'migrate --resume' command to fail, or will we forcably terminate
> the original connection no matter what and use the new "resumed"
> connection.

Hmm yes, this is a good question. Currently this series is only
handling the 2nd condition, say, when we can detect the error via
system calls (IIUC we can know nothing when the 1st condition is
encountered, we just e.g. block at the system calls as usual when
reading the file handle). And currently the "resume" command is only
allowed if the 2nd condition is detected (so it will never destroy an
existing channel).

If you see the next following patch, there is something like:

if (has_resume && resume) {
if (s->state != MIGRATION_STATUS_POSTCOPY_PAUSED) {
error_setg(errp, "Cannot resume if there is no "
   "paused migration");
return;
}
goto do_resume;
}

And here MIGRATION_STATUS_POSTCOPY_PAUSED will only be set when the
2nd condition is met.

> 
> There's also synchronization with the target host - at the time we
> want to recover, we need to be able to tell the target to accept
> new incoming clients again, but we don't want to do that if the
> original connection comes back to life.

Yeah, I hacked this part in this v1 series (as you may have seen) to
keep the ports open-forever. I am not sure whether that is acceptable,
but looks not. :)

How about this: when destination detected 2nd condition, it firstly
switch to "postcopy-pause" state, then re-opens the accept channels.
And it can turns the accept channels off when the state moves out of
"postcopy-pause".

> 
> It feels to me that if the mgmt app or admin believes the migration
> is in a stuck state, we should be able to explicitly terminate the
> existing connection via a monitor command. Then setup the target
> host to accept new client, and then issue this migrate resume on
> the source.

Totally agree. That should be the only way to handle 1st condition
well. However, would you mind if I postpone it a bit? IMHO as long as
we can solve the 2nd condition nicely (which is the goal of this
series), then it won't be too hard to continue support the 1st
condition.

Since we are at here discussing the usage model... maybe I can further
extend it a bit to gain more input.

IMHO in general there are two phases for the recovery (assume we are
always talking about postcopy):

  active --> paused --> recovery --> active
   [1] [2]

For [1]: the 1st condition we discussed above can be seen as "manual
pause" - user can provide a command to forcely discard existing
migration channel. While 2nd condition is the "automatic pause" (what
this series does): when qemu detected network problem, it
automatically switch to the paused state.

For [2]: we are always doing it in the "manual" way: we need a command
to trigger the recovery.

What I am thinking is whether it would make sense in the future to do
the "automatic" thing for [2] as well. In that sense, source
periodically detects connectability of existing migration channel
(which is broken), and it will auto-reconnect if it finds that the
network is recovered. We can add a new capability bit for it (e.g.,
"postcopy-auto-recovery"), showing whether we would like the
"automatic recovery" happen.

If we put these into a matrix:

|+---+|
| Pause mode | Recovery mode | Use case   |

Re: [Qemu-devel] [PULL 00/17] Misc changes for QEMU 2.10-rc1 (?)

2017-08-01 Thread Paolo Bonzini

> > 
> > * Xen fix (Anthony)
> > * chardev fixes (Anton, Marc-André)
> > * small dead code removal (Zhongyi)
> > * documentation (Dan)
> > * bugfixes (David)
> > * decrease migration downtime (Jay)
> > * improved error output (Laurent)
> > * RTC tests and bugfix (me)
> > * Bluetooth clang analyzer fix (me)
> > * KVM CPU hotplug race (Peng Hao)
> > * Two other patches from Philippe's clang analyzer series
> >
> 
> Applied, thanks, but can you check whatever it is in your
> workflow that's mangling non-ascii characters?

It's https://bugzilla.mozilla.org/show_bug.cgi?id=1017768.  Usually
I apply patches with patchew, which bypasses that bug.

Paolo

Re: [Qemu-devel] [RFC 12/29] migration: allow dst vm pause on postcopy

2017-08-01 Thread Peter Xu

On Tue, Aug 01, 2017 at 10:47:16AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:

[...]

> > +/* Return true if we should continue the migration, or false. */
> > +static bool postcopy_pause_incoming(MigrationIncomingState *mis)
> > +{
> > +trace_postcopy_pause_incoming();
> > +
> > +migrate_set_state(>state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
> > +  MIGRATION_STATUS_POSTCOPY_PAUSED);
> > +
> > +assert(mis->from_src_file);
> > +qemu_file_shutdown(mis->from_src_file);
> > +qemu_fclose(mis->from_src_file);
> > +mis->from_src_file = NULL;
> > +
> > +assert(mis->to_src_file);
> > +qemu_mutex_lock(>rp_mutex);
> > +qemu_file_shutdown(mis->to_src_file);
> > +qemu_fclose(mis->to_src_file);
> > +mis->to_src_file = NULL;
> > +qemu_mutex_unlock(>rp_mutex);
> 
> Hmm is that safe?  If we look at migrate_send_rp_message we have:
> 
> static void migrate_send_rp_message(MigrationIncomingState *mis,
> enum mig_rp_message_type message_type,
> uint16_t len, void *data)
> {
> trace_migrate_send_rp_message((int)message_type, len);
> qemu_mutex_lock(>rp_mutex);
> qemu_put_be16(mis->to_src_file, (unsigned int)message_type);
> qemu_put_be16(mis->to_src_file, len);
> qemu_put_buffer(mis->to_src_file, data, len);
> qemu_fflush(mis->to_src_file);
> qemu_mutex_unlock(>rp_mutex);
> }
> 
> If we came into postcopy_pause_incoming at about the same time
> migrate_send_rp_message was being called and pause_incoming took the
> lock first, then once it release the lock, send_rp_message carries on
> and uses mis->to_src_file that's now NULL.
> 
> One solution here is to just call qemu_file_shutdown() but leave the
> files open at this point, but clean the files up sometime later.

I see the commnent on patch 14 as well - yeah, we need patch 14 to
co-op here, and as long as we are with patch 14, we should be ok.

> 
> > +
> > +while (mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
> > +qemu_sem_wait(>postcopy_pause_sem_dst);
> > +}
> > +
> > +trace_postcopy_pause_incoming_continued();
> > +
> > +return true;
> > +}
> > +
> >  static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
> >  {
> >  uint8_t section_type;
> >  int ret = 0;
> >  
> > +retry:
> >  while (true) {
> >  section_type = qemu_get_byte(f);
> >  
> > @@ -2004,6 +2034,21 @@ static int qemu_loadvm_state_main(QEMUFile *f, 
> > MigrationIncomingState *mis)
> >  out:
> >  if (ret < 0) {
> >  qemu_file_set_error(f, ret);
> > +
> > +/*
> > + * Detect whether it is:
> > + *
> > + * 1. postcopy running
> > + * 2. network failure (-EIO)
> > + *
> > + * If so, we try to wait for a recovery.
> > + */
> > +if (mis->state == MIGRATION_STATUS_POSTCOPY_ACTIVE &&
> > +ret == -EIO && postcopy_pause_incoming(mis)) {
> > +/* Reset f to point to the newly created channel */
> > +f = mis->from_src_file;
> > +goto retry;
> > +}
> 
> I wonder if:
> 
>if (mis->state == MIGRATION_STATUS_POSTCOPY_ACTIVE &&
>ret == -EIO && postcopy_pause_incoming(mis)) {
>/* Try again after postcopy recovery */
>return qemu_loadvm_state_main(mis->from_src_file, mis);
>}
> would be nicer; it avoids the goto loop.

I agree we should avoid using goto loops. However I do see vast usages
of goto like this one when we want to redo part of the procedures of a
function (or, of course, when handling errors in "C-style").

Calling qemu_loadvm_state_main() inside itself is ok as well, but it
also has defect: stack usage would be out of control, or even, it can
be controled by malicious users. E.g., if someone used program to
periodically stop/start any network endpoint along the migration
network, QEMU may go into a paused -> recovery -> active -> paused ...
loop, and stack usage will just grow with time. I'd say it's an
extreme example though...

(Another way besides above two: maybe we can just return in
 qemu_loadvm_state_main with something like -EAGAIN, then the caller
 of qemu_loadvm_state_main can re-call it when necessary, though I
 would prefer "goto is okay here"... :)

-- 
Peter Xu

Re: [Qemu-devel] [RFC 10/29] migration: new property "x-postcopy-fast"

2017-08-01 Thread Peter Xu

On Tue, Aug 01, 2017 at 09:50:02AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > On Mon, Jul 31, 2017 at 07:52:24PM +0100, Dr. David Alan Gilbert wrote:
> > > * Peter Xu (pet...@redhat.com) wrote:
> > > > This provides a way to start postcopy ASAP when migration starts. To do
> > > > this, we need both:
> > > > 
> > > >   -global migration.x-postcopy-ram=on \
> > > >   -global migration.x-postcopy-fast=on
> > > 
> > > Can you explain why this is necessary?  Both sides already know
> > > they're doing a postcopy recovery don't they?
> > 
> > What I wanted to do here is to provide a way to start postcopy at the
> > very beginning (actually it'll possibly start postcopy at the first
> > loop in migration_thread), instead of start postcopy until we trigger
> > it using "migrate_start_postcopy" command.
> > 
> > I used it for easier debugging (so I don't need to type
> > "migrate_start_postcopy" every time when I trigger postcopy
> > migration), meanwhile I think it can also be used when someone really
> > want to start postcopy from the very beginning.
> > 
> > Would such a new parameter makes sense?
> 
> Other than debugging, I don't think there's a real use for it; the
> slight delay between starting migration and triggering postcopy has
> very little cost.

Then let me drop this patch in next version. I do think I should avoid
introducing too many things "for debugging only"...

-- 
Peter Xu

Re: [Qemu-devel] [RFC 04/29] bitmap: introduce bitmap_invert()

2017-08-01 Thread Peter Xu

On Tue, Aug 01, 2017 at 09:40:09AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > On Mon, Jul 31, 2017 at 06:11:56PM +0100, Dr. David Alan Gilbert wrote:
> > > * Peter Xu (pet...@redhat.com) wrote:
> > > > It is used to invert the whole bitmap.
> > > 
> > > Would it be easier to change bitmap_complement to use ^
> > > in it's macro and slow_bitmap_complement, and then you could call it
> > > with src==dst  to do the same thing with just that small change?
> > 
> > Or, I can directly use that and drop this patch. :-)
> 
> Yes, that's fine - note the only difference I see is what happens to the
> bits in the last word after the end of the count; your code leaves them
> as is, the complement code will zero them on the destination I think.

I see.  I believe both should work since bitmap users should not
use those bits after all (considering those bits are outside range of
valid bits when declaring the bitmap).  Thanks,

-- 
Peter Xu

[Qemu-devel] [PATCH 0/2] migration: fixes for 2.10

2017-08-01 Thread Peter Xu

Two patches isolated from the postcopy recovery series, which may be
good for 2.10.

Peter Xu (2):
  migration: fix comment disorder in RAMState
  io: fix qio_channel_socket_accept err handling

 io/channel-socket.c | 3 ++-
 migration/ram.c | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

-- 
2.7.4

[Qemu-devel] [PATCH 2/2] io: fix qio_channel_socket_accept err handling

2017-08-01 Thread Peter Xu

When accept failed, we should setup errp with the reason. More
importantly, the caller may assume errp be non-NULL when error happens,
and not setting the errp may crash QEMU.

At the same time, move the trace_qio_channel_socket_accept_fail() after
the if check on EINTR. Two reasons:

1. when EINTR happened, it's not really a fault (we should just try
   again), so we should not log with an "accept failure".

2. trace_*() functions may overwrite errno, then the old errno will be
   missing. We need to either check errno before trace_*() calls, or
   reserve the errno.

Signed-off-by: Peter Xu 
---
 io/channel-socket.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/io/channel-socket.c b/io/channel-socket.c
index 53386b7..442f230 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -340,10 +340,11 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
 cioc->fd = qemu_accept(ioc->fd, (struct sockaddr *)>remoteAddr,
>remoteAddrLen);
 if (cioc->fd < 0) {
-trace_qio_channel_socket_accept_fail(ioc);
 if (errno == EINTR) {
 goto retry;
 }
+trace_qio_channel_socket_accept_fail(ioc);
+error_setg_errno(errp, errno, "Unable to accept connection");
 goto error;
 }
 
-- 
2.7.4

[Qemu-devel] [PATCH 1/2] migration: fix comment disorder in RAMState

2017-08-01 Thread Peter Xu

Comments for "migration_dirty_pages" and "bitmap_mutex" are switched.
Fix it.

Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Peter Xu 
---
 migration/ram.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 1b08296..e18b3e2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -188,9 +188,9 @@ struct RAMState {
 uint64_t iterations_prev;
 /* Iterations since start */
 uint64_t iterations;
-/* protects modification of the bitmap */
-uint64_t migration_dirty_pages;
 /* number of dirty bits in the bitmap */
+uint64_t migration_dirty_pages;
+/* protects modification of the bitmap */
 QemuMutex bitmap_mutex;
 /* The RAMBlock used in the last src_page_requests */
 RAMBlock *last_req_rb;
-- 
2.7.4

Re: [Qemu-devel] [RFC 03/29] io: fix qio_channel_socket_accept err handling

2017-08-01 Thread Peter Xu

On Tue, Aug 01, 2017 at 09:55:08AM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > On Tue, Aug 01, 2017 at 10:25:19AM +0800, Peter Xu wrote:
> > > On Mon, Jul 31, 2017 at 05:53:39PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Peter Xu (pet...@redhat.com) wrote:
> > > > > When accept failed, we should setup errp with the reason. More
> > > > > importantly, the caller may assume errp be non-NULL when error 
> > > > > happens,
> > > > > and not setting the errp may crash QEMU.
> > > > > 
> > > > > Signed-off-by: Peter Xu 
> > > > > ---
> > > > >  io/channel-socket.c | 1 +
> > > > >  1 file changed, 1 insertion(+)
> > > > > 
> > > > > diff --git a/io/channel-socket.c b/io/channel-socket.c
> > > > > index 53386b7..7bc308e 100644
> > > > > --- a/io/channel-socket.c
> > > > > +++ b/io/channel-socket.c
> > > > > @@ -344,6 +344,7 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
> > > > >  if (errno == EINTR) {
> > > > >  goto retry;
> > > > >  }
> > > > > +error_setg_errno(errp, errno, "Unable to accept connection");
> > > > >  goto error;
> > > > 
> > > > OK, but this code actually has a bigger problem as well:
> > > > 
> > > > the original is:
> > > > 
> > > > cioc->fd = qemu_accept(ioc->fd, (struct sockaddr 
> > > > *)>remoteAddr,
> > > >>remoteAddrLen);
> > > > if (cioc->fd < 0) {
> > > > trace_qio_channel_socket_accept_fail(ioc);
> > > > if (errno == EINTR) {
> > > > goto retry;
> > > > }
> > > > goto error;
> > > > }
> > > > 
> > > > Stefan confirmed that trace_ doesn't preserve errno; so the if
> > > > following it is wrong.  It needs to preserve errno.
> > > 
> > > Ah... If so, not sure whether we can do the reservation in trace codes
> > > in general?
> > > 
> > > For this one, I can just move the trace_*() below the errno check.
> > > After all, if EINTR is got, it's not really a fail, so imho we should
> > > not trace it with "accept fail".
> > 
> > Agreed, we just need to move the trace below the if.
> 
> Peter: Can you split this as a separate patch and it seems OK to try and
> put this in 2.10 since it's a strict bug fix.

Sure!  Then I'll possibly include the comment fix patch as well.

-- 
Peter Xu

Re: [Qemu-devel] [for-2.11 PATCH 26/26] spapr: add hotplug hooks for PHB hotplug

2017-08-01 Thread David Gibson

On Tue, Aug 01, 2017 at 05:30:46PM +0200, Greg Kurz wrote:
> On Fri, 28 Jul 2017 14:24:03 +1000
> David Gibson  wrote:
> 
> > On Thu, Jul 27, 2017 at 07:09:55PM +0200, Greg Kurz wrote:
> > > On Thu, 27 Jul 2017 14:41:31 +1000
> > > Alexey Kardashevskiy  wrote:
> > >   
> > > > On 26/07/17 18:40, Greg Kurz wrote:  
> > > > > Hotplugging PHBs is a machine-level operation, but PHBs reside on the
> > > > > main system bus, so we register spapr machine as the handler for the
> > > > > main system bus.
> > > > > 
> > > > > Signed-off-by: Michael Roth 
> > > > > Signed-off-by: Greg Kurz 
> > > > > ---
> > > > > - rebased against ppc-for-2.10
> > > > > - converted to unplug_request
> > > > > - handle drc_id at pre-plug
> > > > > - reset hotplugged PHB at plug
> > > > > - compatibility with older machine types
> > > > > ---
> > > > >  hw/ppc/spapr.c  |  114 
> > > > > +++
> > > > >  hw/ppc/spapr_drc.c  |1 
> > > > >  hw/ppc/spapr_pci.c  |2 -
> > > > >  include/hw/pci-host/spapr.h |2 +
> > > > >  include/hw/ppc/spapr.h  |1 
> > > > >  5 files changed, 118 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > > index 90485054c2e7..589f76ef9fb8 100644
> > > > > --- a/hw/ppc/spapr.c
> > > > > +++ b/hw/ppc/spapr.c
> > > > > @@ -2540,6 +2540,10 @@ static void ppc_spapr_init(MachineState 
> > > > > *machine)
> > > > >  register_savevm_live(NULL, "spapr/htab", -1, 1,
> > > > >   _htab_handlers, spapr);
> > > > >  
> > > > > +if (spapr->dr_phb_enabled) {
> > > > > +qbus_set_hotplug_handler(sysbus_get_default(), 
> > > > > OBJECT(machine), NULL);
> > > > > +}
> > > > > +
> > > > >  qemu_register_boot_set(spapr_boot_set, spapr);
> > > > >  
> > > > >  if (kvm_enabled()) {
> > > > > @@ -3238,6 +3242,103 @@ out:
> > > > >  error_propagate(errp, local_err);
> > > > >  }
> > > > >  
> > > > > +static void spapr_phb_pre_plug(HotplugHandler *hotplug_dev, 
> > > > > DeviceState *dev,
> > > > > +   Error **errp)
> > > > > +{
> > > > > +sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> > > > > +
> > > > > +if (sphb->drc_id == (uint32_t)-1) {
> > > > > +sphb->drc_id = sphb->index;
> > > > > +}
> > > > > +
> > > > > +if (sphb->drc_id >= SPAPR_DRC_MAX_PHB) {
> > > > > +error_setg(errp, "PHB id %d out of range", sphb->drc_id);
> > > > > +}
> > > > 
> > > > 
> > > > sphb->index in considered 16bits in the existing code (even though it is
> > > > defined as 32bit) and SPAPR_DRC_MAX_PHB is just 256. I'd suggest using 
> > > > the
> > > > same limit for both, either 256 or 65536 is fine for me.
> > > > 
> > > > It is actually a bit weird - it is possible to completely configure few
> > > > PHBs in the command line so they will have index==-1 but PCI hotplug 
> > > > code -
> > > > spapr_phb_get_pci_func_drc() and spapr_phb_realize() - does not check 
> > > > for
> > > > this and just does (sphb->index << 16).  
> > > 
> > > You're right and this looks like a bug... I'll try to come up with a fix.
> > >   
> > > > May be just ditch drc_id, enforce index not to be -1 and use it as 
> > > > drc_id?
> > > >   
> > > 
> > > This was how Mike did it in the original patchset but David suggested
> > > to introduce drc_id (to preserve existing setups I guess):
> > > 
> > > http://patchwork.ozlabs.org/patch/466262/  
> > 
> > Huh.  So I did.  But.. sorry, I've changed my mind.
> > 
> > The fact that needing a DRC forces us to have a reasonable small id
> > for each PHB seems like a good excuse to make index mandatory - I'm
> > not convinced anyone was actually creating PHBs without index, and
> > this does allow us to simplify a bunch of things.
> > 
> > I'd like to see that done as a preliminary cleanup patch, though.
> > 
> 
> Just to be sure. I could verify that the weirdness reported by Alexey
> causes QEMU to misbehave. Only the first "index-less" PHB has realized
> DRCs:
> 
> => subsequent "index-less" PHBs silently ignore hotplugging of PCI devices
> 
> => QEMU won't even start with coldplugged devices in these "index-less"
>PHBs
> 
> This preliminary cleanup for hotpluggable PHBs is hence also a bug fix
> for current PHBs.

Ok.

> Do we want to fix this long-standing bug in 2.10 ?

No, not worth pushing in this late.

> Do we want to preserve the current buggy behavior for older machine
> types ?

No, I don't think so.  I think the reasonable course here is to push
the new behaviour out.  Only if someone complains that they were
actually relying on the old behaviour in the wild do we try to
preserve it for the older machine types.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_

Re: [Qemu-devel] [for-2.11 PATCH 24/26] spapr: allow guest to update the XICS phandle

2017-08-01 Thread David Gibson

On Tue, Aug 01, 2017 at 01:26:15PM +0200, Greg Kurz wrote:
> On Tue, 1 Aug 2017 12:20:56 +1000
> Alexey Kardashevskiy  wrote:
> 
> > On 31/07/17 14:58, David Gibson wrote:
> > > On Fri, Jul 28, 2017 at 08:20:57AM +0200, Thomas Huth wrote:  
> > >> On 28.07.2017 06:02, David Gibson wrote:  
> > >>> On Tue, Jul 25, 2017 at 08:03:06PM +0200, Greg Kurz wrote:  
> >  The "phandle" property of the XICS node is referenced by the 
> >  "interrupt-map"
> >  property of each PHB node. This is used by the guest OS to setup IRQs 
> >  for
> >  all PCI devices.
> > 
> >  QEMU uses an arbitrary value (0x) for this phandle, but SLOF 
> >  converts
> >  this value to a SLOF specific one, which is then presented to the 
> >  guest OS.
> > 
> >  This patches introduces the new KVMPPC_H_UPDATE_PHANDLE hcall, which 
> >  is used
> >  by SLOF to communicate the patched phandle value back to QEMU. This 
> >  value
> >  is then cached and preserved accross migration until machine reset.
> > 
> >  This is required to be able to support PHB hotplug.
> > 
> >  Note, that SLOF already has some code to call KVMPPC_H_RTAS_UPDATE, so 
> >  we
> >  have to introduce its number even if QEMU currently doesn't implement 
> >  it.
> > 
> >  Suggested-by: Thomas Huth 
> >  Signed-off-by: Greg Kurz   
> > >>>
> > >>> Ugh.  I really, really hope we can avoid this, though I don't
> > >>> immediately see how.  Having to have two way communication between
> > >>> qemu and SLOF about the device tree contents just seems like opening
> > >>> the door to endless complexities.
> > >>>
> > >>> This is basically a consequence of the fact that both qemu and partly
> > >>> responsible for constructing the device tree for the guest, and that's
> > >>> not easy to avoid.
> > >>>
> > >>> Hrm.. Thomas, I know it's not really the OF way, but would it be
> > >>> feasible to change SLOF to use the phandles as supplied by qemu rather
> > >>> than creating its own?  
> > >>
> > >> I don't see a way to do this in an easy, clean, reasonable way. SLOF
> > >> uses pointers to internal structures as phandles all over the place. You
> > >> likely can't replace that so easily without rewriting half of the whole
> > >> device tree related code in SLOF, I guess...  
> > > 
> > > Dang, that's what I suspected.
> > > 
> > > Just to be clear the phandles are used directly as raw pointers?
> > > There's not even some lookup macro we could change?
> > >   
> > 
> > We would need to rewrite
> > https://github.com/aik/SLOF/blob/master/slof/fs/node.fs
> > 
> > Since SLOF does not seem to use phandles as pointers directly anywhere but
> > in nofe.fs, this should be doable. Easier said than done though.
> > 
> 
> Yikes... If we go that way, I'll definitely need some assistance and time.
> 
> Cc'ing Nikunj for some extra advices.

Yeah, I hope we could get a feasibility idea from someone who knows
Forth - Thomas or Nikunj, maybe.

I'd be thinking of replacing the direct pointer dereferences for
phandles with a simple lookup table of phandle to pointer, populated
as we unflatten the tree.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC V2 0/4] vfio: Introduce Live migration capability to vfio_mdev device

2017-08-01 Thread Zhang, Yulei



> -Original Message-
> From: Tian, Kevin
> Sent: Monday, July 31, 2017 2:55 PM
> To: Zhang, Yulei ; qemu-devel@nongnu.org
> Cc: joonas.lahti...@linux.intel.com; zhen...@linux.intel.com; Zheng, Xiao
> ; Wang, Zhi A ;
> alex.william...@redhat.com
> Subject: RE: [Qemu-devel][RFC V2 0/4] vfio: Introduce Live migration
> capability to vfio_mdev device
> 
> > From: Zhang, Yulei
> > Sent: Tuesday, May 9, 2017 3:59 PM
> >
> > Summary
> >
> > This series RFC would like to introduce the live migration capability
> > to vfio_mdev device.
> >
> > As currently vfio_mdev device don't support migration, we introduce a
> > new vfio subtype region
> VFIO_REGION_SUBTYPE_INTEL_IGD_DEVICE_STATE
> > for Intel vGPU device, during the vfio device initialization, the mdev
> > device will be set to migratable if the new region exist.
> 
> Looking at your series, there is really nothing specific to vGPU or even Intel
> vGPU regarding to device state save/restore...
> 

You are right, we may define a new subtype for it as it is common region for 
vfio 
device state save/restore.

> >
> > The intention to add the new region is using it for vfio_mdev device
> > status save and restore during the migration. The access to this
> > region will be trapped and forward to the vfio_mdev device driver. And
> > we use the first byte in the new region to control the running state
> > of mdev device.
> >
> > Meanwhile we add one new ioctl VFIO_IOMMU_GET_DIRTY_BITMAP to
> help do
> > the mdev device dirty page synchronization.
> >
> > So the vfio_mdev device migration sequence would be Source VM side:
> > start migration
> > |
> > V
> >  get the cpu state change callback, write to the
> >  subregion's first byte to stop the mdev device
> > |
> > V
> >  quary the dirty page bitmap from iommu container
> >  and add into qemu dirty list for synchronization
> > |
> > V
> >  save the deivce status into Qemufile which is
> >  read from the vfio device subregion
> >
> > Target VM side:
> >restore the mdev device after get the
> >  saved status context from Qemufile
> > |
> > V
> >  get the cpu state change callback
> >  write to subregion's first byte to
> >   start the mdev device to put it in
> >   running status
> > |
> > V
> > finish migration
> >
> > V1->V2:
> > Per Alex's suggestion:
> > 1. use device subtype region instead of VFIO PCI fixed region.
> > 2. remove unnecessary ioctl, use the first byte of subregion to
> >control the running state of mdev device.
> > 3. for dirty page synchronization, implement the interface with
> >VFIOContainer instead of vfio pci device.
> >
> > Yulei Zhang (4):
> >   vfio: introduce a new VFIO sub region for mdev device migration
> > support
> >   vfio: Add vm status change callback to stop/restart the mdev device
> >   vfio: Add struct vfio_vmstate_info to introduce put/get callback
> > funtion for vfio device status save/restore
> >   vifo: introduce new VFIO ioctl VFIO_IOMMU_GET_DIRTY_BITMAP
> >
> >  hw/vfio/common.c  |  32 +
> >  hw/vfio/pci.c | 164
> > +-
> >  hw/vfio/pci.h |   1 +
> >  include/hw/vfio/vfio-common.h |   1 +
> >  linux-headers/linux/vfio.h|  26 ++-
> >  5 files changed, 220 insertions(+), 4 deletions(-)
> >
> > --
> > 2.7.4

Re: [Qemu-devel] [PATCHv2 02/04] colo-compare: Processpactkets inthe IOThreadofthe primary

2017-08-01 Thread wang.yong155

>On Tue, 08/01 12:25, Paolo Bonzini wrote:

>> On 28/07/2017 02:25, Fam Zheng wrote:

>> > On Thu, 07/27 15:47, Zhang Chen wrote:

>> >> CC. Fam and David.

>> >>

>> >> Any idea about it?

>> > 

>> > Is it possible to use g_main_context_{push,pop}_thread_default to "move" 
>> > chardev

>> > GSources to IOThread's context, then use g_main_context_query like main 
>> > thread

>> > and poll the fds in aio_poll?

>> 

>> How would you do that without making aio_poll performance worse for

>> everyone?

>

>I don't knoe. Maintain a flag and only do these slow stuff after at least one

>GSource has been moved around?




How about this, when Iothread's GMainContext is NULL, use io_poll, or else use 
g_main_context_* poll the fds ?




Iothread's GMainContext is NULL by default,so use aio_poll in iothread_run 
function.

In addition we can create a new GMainContext in chardev,use g_main_context_* in 
iothread_run.

Maybe this doesn't affect performance ?




Thanks

WangYong












原始邮件



发件人： 
收件人： 
抄送人： 王勇10170530 
王广10165992  
   
 
日 期 ：2017年08月01日 22:52
主 题 ：Re: [Qemu-devel] [PATCHv2 02/04] colo-compare: Processpactkets inthe 
IOThreadofthe primary





On Tue, 08/01 12:25, Paolo Bonzini wrote:
> On 28/07/2017 02:25, Fam Zheng wrote:
> > On Thu, 07/27 15:47, Zhang Chen wrote:
> >> CC. Fam and David.
> >>
> >> Any idea about it?
> > 
> > Is it possible to use g_main_context_{push,pop}_thread_default to "move" 
> > chardev
> > GSources to IOThread's context, then use g_main_context_query like main 
> > thread
> > and poll the fds in aio_poll?
> 
> How would you do that without making aio_poll performance worse for
> everyone?

I don't knoe. Maintain a flag and only do these slow stuff after at least one
GSource has been moved around?

Fam

Re: [Qemu-devel] [PATCH v2 2/2] s390x/css: generate solicited crw for rchp completion signaling

2017-08-01 Thread Dong Jia Shi

* Halil Pasic  [2017-08-01 17:16:37 +0200]:

> 
> 
> On 08/01/2017 09:57 AM, Dong Jia Shi wrote:
> [..]
> > --- a/hw/s390x/css.c
> > +++ b/hw/s390x/css.c
> > @@ -1745,10 +1745,10 @@ int css_do_rchp(uint8_t cssid, uint8_t chpid)
> >  }
> > 
> >  /* We don't really use a channel path, so we're done here. */
> > -css_queue_crw(CRW_RSC_CHP, CRW_ERC_INIT,
> > +css_queue_crw(CRW_RSC_CHP, CRW_ERC_INIT, 1,
> >channel_subsys.max_cssid > 0 ? 1 : 0, chpid);
> >  if (channel_subsys.max_cssid > 0) {
> > -css_queue_crw(CRW_RSC_CHP, CRW_ERC_INIT, 0, real_cssid << 8);
> > +css_queue_crw(CRW_RSC_CHP, CRW_ERC_INIT, 1, 0, real_cssid << 8);
> >  }
> >  return 0;
> >  }
> > @@ -2028,7 +2028,8 @@ void css_subch_assign(uint8_t cssid, uint8_t ssid, 
> > uint16_t schid,
> >  }
> >  }
> > 
> > -void css_queue_crw(uint8_t rsc, uint8_t erc, int chain, uint16_t rsid)
> > +void css_queue_crw(uint8_t rsc, uint8_t erc, int solicited,
> > +   int chain, uint16_t rsid)
> 
> I think you could make the parameters solicited and chain bool (AFAIU
> they are conceptually bool) for clearer semantic. If you go with that
> you could also get rid of the superfluous ternary operator ( we have
> stuff like some_cond ? 1 : 0 for the chain parameter in more than
> one place.
> 
> Btw. I find bool flags easy to mix up and difficult to read. I have no better
> idea how to write this (in C) though. I was considering throwing chain and
> solicited together into a single flags parameter, but looking at the client 
> code
> it does not look like a good idea.
I think I just need to get used to differet tastes. ;)

> 
> Besides the cosmetic considerations above LGTM
Thanks!

-- 
Dong Jia Shi

Re: [Qemu-devel] Does qemu guest agent support 'guest-exec'?

2017-08-01 Thread Hu, Robert



Best Regards,
Robert Hoo

> -Original Message-
> From: Michael Roth [mailto:mdr...@linux.vnet.ibm.com]
> Sent: Wednesday, August 2, 2017 0:44
> To: Hu, Robert ; qemu-devel@nongnu.org; qemu-
> disc...@nongnu.org
> Subject: Re: [Qemu-devel] Does qemu guest agent support 'guest-exec'?
> 
> Quoting Hu, Robert (2017-08-01 02:15:01)
> > Hi,
> >
> > qemu/scripts/qmp/qemu-ga-client seems only support "cat, fsfreeze, fstrim,
> halt, ifconfig, info, ping, powerdown, reboot, shutdown, suspend".
> >
> > But from qemu/qga/commands.c seems at least Linux guest should already
> support this. Despite qemu-ga-client, how can I talk to guest-agent in guest 
> to
> execute some program? any other utils?
> 
> qemu-ga-client is more of a helper script to make it easier to execute things
> from cmdline and hasn't been updated to support guest-exec. But the official
> API is documented in qga/qapi-schema.json and involves talking to qemu-ga
> directly via JSON commands. A simple example for guest-exec would be
> something like:
> 
> mdroth@loki:~$ sudo nc -U /tmp/vm3-qga.sock
> {'execute':'guest-exec','arguments':{'path':'ip','arg': ['addr', 'show',
> 'eth0'],'capture-output':true}}
> {"return": {"pid": 1462}}
> {'execute':'guest-exec-status','arguments':{'pid':1462}}
> {"return": {"exitcode": 0, "out-data":
> "MjogZXRoMDogPEJST0FEQ0FTVCxNVUxUSUNBU1QsVVAsTE9XRVJfVVA+IG10d
> SAxNTAwIHFkaXNjIHBmaWZvX2Zhc3Qgc3RhdGUgVVAgZ3JvdXAgZGVmYXVsdCB
> xbGVuIDEwMDAKICAgIGxpbmsvZXRoZXIgNTI6NTQ6MDA6MTI6MzQ6MDMgYnJk
> IGZmOmZmOmZmOmZmOmZmOmZmCiAgICBpbmV0IDE5Mi4xNjguMTIyLjEzLzI0
> IGJyZCAxOTIuMTY4LjEyMi4yNTUgc2NvcGUgZ2xvYmFsIGR5bmFtaWMgZXRoMA
> ogICAgICAgdmFsaWRfbGZ0IDMwNjRzZWMgcHJlZmVycmVkX2xmdCAzMDY0c2Vj
> CiAgICBpbmV0NiBmZTgwOjo1MDU0OmZmOmZlMTI6MzQwMy82NCBzY29wZSB
> saW5rIAogICAgICAgdmFsaWRfbGZ0IGZvcmV2ZXIgcHJlZmVycmVkX2xmdCBmb3J
> ldmVyCg==", "exited": true}} ^C mdroth@loki:~$ cat < 2: eth0:  mtu 1500 qdisc pfifo_fast
> state UP group default qlen 1000
> link/ether 52:54:00:12:34:03 brd ff:ff:ff:ff:ff:ff
> inet 192.168.122.13/24 brd 192.168.122.255 scope global dynamic eth0
>valid_lft 3064sec preferred_lft 3064sec
> inet6 fe80::5054:ff:fe12:3403/64 scope link
>valid_lft forever preferred_lft forever
> 
> 
> >
> > Best Regards,
> > Robert Hoo
> >
> >

Re: [Qemu-devel] [PATCH v2 1/2] s390x/css: use macro for event-information pending error recover code

2017-08-01 Thread Dong Jia Shi

* Halil Pasic  [2017-08-01 17:24:10 +0200]:

> 
> 
> On 08/01/2017 09:57 AM, Dong Jia Shi wrote:
> > Let's use a macro for the ERC (error recover code) when generating a
> > Channel Subsystem Event-information pending CRW (channel report word).
> > 
> > While we are at it, let's also add all other ERCs.
> > 
> > Signed-off-by: Dong Jia Shi 
> > ---
> >  hw/s390x/css.c|  2 +-
> >  include/hw/s390x/ioinst.h | 11 +--
> >  2 files changed, 10 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/s390x/css.c b/hw/s390x/css.c
> > index 6a42b95cee..5321ca016b 100644
> > --- a/hw/s390x/css.c
> > +++ b/hw/s390x/css.c
> > @@ -2103,7 +2103,7 @@ void css_generate_chp_crws(uint8_t cssid, uint8_t 
> > chpid)
> >  void css_generate_css_crws(uint8_t cssid)
> >  {
> >  if (!channel_subsys.sei_pending) {
> > -css_queue_crw(CRW_RSC_CSS, 0, 0, cssid);
> > +css_queue_crw(CRW_RSC_CSS, CRW_ERC_EVENT, 0, cssid);
> >  }
> >  channel_subsys.sei_pending = true;
> >  }
> > diff --git a/include/hw/s390x/ioinst.h b/include/hw/s390x/ioinst.h
> > index 92d15655e4..f89019f78f 100644
> > --- a/include/hw/s390x/ioinst.h
> > +++ b/include/hw/s390x/ioinst.h
> > @@ -201,8 +201,15 @@ typedef struct CRW {
> >  #define CRW_FLAGS_MASK_A 0x0080
> >  #define CRW_FLAGS_MASK_ERC 0x003f
> > 
> > -#define CRW_ERC_INIT 0x02
> > -#define CRW_ERC_IPI  0x04
> > +#define CRW_ERC_EVENT0x00 /* event information pending */
> > +#define CRW_ERC_AVAIL0x01 /* available */
> > +#define CRW_ERC_INIT 0x02 /* initialized */
> > +#define CRW_ERC_TERROR   0x03 /* temporary error */
> > +#define CRW_ERC_IPI  0x04 /* installed parm initialized */
> > +#define CRW_ERC_TERM 0x05 /* terminal */
> > +#define CRW_ERC_PERRN0x06 /* perm. error, facility not init */
> > +#define CRW_ERC_PERRI0x07 /* perm. error, facility init */
> > +#define CRW_ERC_PMOD 0x08 /* installed parameters modified */
> 
> You have missed installed parameters restored from the PoP (above
> you say add all other).
Ok. Will add:
#define CRW_ERC_IPR  0x0A /* installed parameters restored */

> 
> Other than that.
> 
> LGTM
> 
> > 
> >  #define CRW_RSC_SUBCH 0x3
> >  #define CRW_RSC_CHP   0x4
> > 

-- 
Dong Jia Shi

Re: [Qemu-devel] [RFC PATCH] booke206: fix MAS update on tlb miss

2017-08-01 Thread David Gibson

On Tue, Aug 01, 2017 at 10:44:57AM +0200, KONRAD Frederic wrote:
> When a tlb instruction miss happen, rw is set to 0 at the bottom
> of cpu_ppc_handle_mmu_fault which cause the MAS update function to miss
> the SAS and TS bit in MAS6, MAS1 in booke206_update_mas_tlb_miss.
> 
> Just calling booke206_update_mas_tlb_miss with rw = 2 solve the issue.
> 
> Signed-off-by: KONRAD Frederic 

Applied to ppc-for-2.10.

> ---
>  target/ppc/mmu_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
> index b7b9088..f06b938 100644
> --- a/target/ppc/mmu_helper.c
> +++ b/target/ppc/mmu_helper.c
> @@ -1551,7 +1551,7 @@ static int cpu_ppc_handle_mmu_fault(CPUPPCState *env, 
> target_ulong address,
>  env->spr[SPR_40x_ESR] = 0x;
>  break;
>  case POWERPC_MMU_BOOKE206:
> -booke206_update_mas_tlb_miss(env, address, rw);
> +booke206_update_mas_tlb_miss(env, address, 2);
>  /* fall through */
>  case POWERPC_MMU_BOOKE:
>  cs->exception_index = POWERPC_EXCP_ITLB;

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

[Qemu-devel] [PATCH] virtio: Mark virtio-device as non-user-creatable

2017-08-01 Thread Eduardo Habkost

TYPE_VIRTIO_DEVICE devices are already not usable with -device
and device_add, but they are reported as user-creatable on
"-device help" and through monitor interfaces.

Mark them as not user-creatable to avoid confusing users, and to
allow automated testing (e.g. scripts/device-crash-test) to skip
them.

Before this patch, device-crash-test will try to test
virtio-device devices with all machine-types:

  $ time ./scripts/device-crash-test -D virtio-device -v 
./x86_64-softmmu/qemu-system-x86_64
  [...]
  INFO: Total: 1088 test cases
  INFO: Skipped 408 test cases

  real0m49.775s

After this patch, the script won't try to test virtio-device
devices:

  $ time ./scripts/device-crash-test -D virtio-device -v 
./x86_64-softmmu/qemu-system-x86_64
  INFO: Total: 0 test cases

  real0m0.092s

Signed-off-by: Eduardo Habkost 
---
 hw/virtio/virtio.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 464947f..c4bdb94 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2653,6 +2653,17 @@ static void virtio_device_class_init(ObjectClass *klass, 
void *data)
 dc->unrealize = virtio_device_unrealize;
 dc->bus_type = TYPE_VIRTIO_BUS;
 dc->props = virtio_properties;
+/*
+ * Reason:
+ * - TYPE_VIRTIO_DEVICE devices are not visible to guests
+ *   unless they are created and controlled by transport-specific
+ *   devices (virtio-pci, virtio-mmio, and virtio-ccw).
+ * - A TYPE_VIRTIO_BUS bus is never available for plugging
+ *   using -device/device_add, as virtio-bus buses are
+ *   created on the fly and immediately populated by the
+ *   transport-specific devices' realize methods.
+ */
+dc->user_creatable = false;
 vdc->start_ioeventfd = virtio_device_start_ioeventfd_impl;
 vdc->stop_ioeventfd = virtio_device_stop_ioeventfd_impl;
 
-- 
2.9.4

Re: [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-01 Thread Laszlo Ersek

On 08/01/17 23:39, Michael S. Tsirkin wrote:
> On Wed, Aug 02, 2017 at 12:33:12AM +0300, Alexander Bezzubikov wrote:
>> 2017-08-01 23:31 GMT+03:00 Laszlo Ersek :
>>> (Whenever my comments conflict with Michael's or Marcel's, I defer to them.)
>>>
>>> On 07/29/17 01:37, Aleksandr Bezzubikov wrote:
 Signed-off-by: Aleksandr Bezzubikov 
 ---
  docs/pcie.txt|  46 ++
  docs/pcie_pci_bridge.txt | 121 
 +++
  2 files changed, 147 insertions(+), 20 deletions(-)
  create mode 100644 docs/pcie_pci_bridge.txt

 diff --git a/docs/pcie.txt b/docs/pcie.txt
 index 5bada24..338b50e 100644
 --- a/docs/pcie.txt
 +++ b/docs/pcie.txt
 @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on 
 the Root Complex:
  (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI 
 Express
  hierarchies.

 -(3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI
 +(3) PCIE-PCI Bridge (pcie-pci-bridge), for starting legacy PCI
  hierarchies.

  (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root 
 Buses
>>>
>>> When reviewing previous patches modifying / adding this file, I
>>> requested that we spell out "PCI Express" every single time. I'd like to
>>> see the same in this patch, if possible.
>>
>> OK, I didn't know it.
>>
>>>
 @@ -55,18 +55,18 @@ Place only the following kinds of devices directly on 
 the Root Complex:
 pcie.0 bus
 
 
  |||  |
 -   ---   --   --   --
 -   | PCI Dev |   | PCIe Root Port |   | DMI-PCI Bridge |   |  pxb-pcie  |
 -   ---   --   --   --
 +   ---   --   ---   --
 +   | PCI Dev |   | PCIe Root Port |   | PCIE-PCI Bridge |   |  pxb-pcie  |
 +   ---   --   ---   --

  2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint 
 use:
-device [,bus=pcie.0]
  2.1.2 To expose a new PCI Express Root Bus use:
-device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z]
 -  Only PCI Express Root Ports and DMI-PCI bridges can be connected
 +  Only PCI Express Root Ports, PCIE-PCI bridges and DMI-PCI bridges 
 can be connected
>>>
>>> It would be nice if we could keep the flowing text wrapped to 80 chars.
>>>
>>> Also, here you add the "PCI Express-PCI" bridge to the list of allowed
>>> controllers (and you keep DMI-PCI as permitted), but above DMI was
>>> replaced. I think these should be made consistent -- we should make up
>>> our minds if we continue to recommend the DMI-PCI bridge or not. If not,
>>> then we should eradicate all traces of it. If we want to keep it at
>>> least for compatibility, then it should remain as fully documented as it
>>> is now.
>>
>> Now I'm beginning to think that we shouldn't keep the DMI-PCI bridge
>> even for compatibility and may want to use a new PCIE-PCI bridge
>> everywhere (of course, except some cases when users are
>> sure they need exactly DMI-PCI bridge for some reason)
> 
> Can dmi-pci support shpc? why doesn't it? For compatibility?

I don't know why, but the fact that it doesn't is the reason libvirt
settled on auto-creating a dmi-pci bridge and a pci-pci bridge under
that for Q35. The reasoning was (IIRC Laine's words correctly) that the
dmi-pci bridge cannot receive hotplugged devices, while the pci-pci
bridge cannot be connected to the root complex. So both were needed.

Thanks
Laszlo

Re: [Qemu-devel] [PATCH 1/1] qemu-iotests/109: Fix lock race condition

2017-08-01 Thread John Snow



On 08/01/2017 05:31 PM, Cleber Rosa wrote:
> A race condition is currently present between the clean up attempt of
> the QEMU process and the execution of qemu-img.  The actual (bad)
> output is:
> 
>  -Warning: Image size mismatch!
>  -Images are identical.
>  +qemu-img: Could not open '/tests/qemu-iotests/scratch/t.raw': 
> Failed to get "consistent read" lock
>  +Is another process using the image?
> 
> A KILL signal is sent to the QEMU process, but qemu-img may begin to
> run before the QEMU process is really gone.  qemu-img will then
> attempt to open the TEST_IMG file before it can secure a lock on it.
> 
> This attempts a more graceful shutdown, and waits for the QEMU process
> to exit.
> 
> Signed-off-by: Cleber Rosa 

Reviewed-by: John Snow 

Thanks for this, it was driving me batty.

Re: [Qemu-devel] [Qemu-block] [PATCH 1/1] qemu-iotests/109: Fix lock race condition

2017-08-01 Thread Jeff Cody

On Tue, Aug 01, 2017 at 05:31:27PM -0400, Cleber Rosa wrote:
> A race condition is currently present between the clean up attempt of
> the QEMU process and the execution of qemu-img.  The actual (bad)
> output is:
> 
>  -Warning: Image size mismatch!
>  -Images are identical.
>  +qemu-img: Could not open '/tests/qemu-iotests/scratch/t.raw': 
> Failed to get "consistent read" lock
>  +Is another process using the image?
> 
> A KILL signal is sent to the QEMU process, but qemu-img may begin to
> run before the QEMU process is really gone.  qemu-img will then
> attempt to open the TEST_IMG file before it can secure a lock on it.
> 
> This attempts a more graceful shutdown, and waits for the QEMU process
> to exit.
> 
> Signed-off-by: Cleber Rosa 


Reviewed-by: Jeff Cody 


> ---
>  tests/qemu-iotests/109 |  3 ++-
>  tests/qemu-iotests/109.out | 56 
> ++
>  2 files changed, 58 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/109 b/tests/qemu-iotests/109
> index 3b496a3..d70b574 100755
> --- a/tests/qemu-iotests/109
> +++ b/tests/qemu-iotests/109
> @@ -67,7 +67,8 @@ function run_qemu()
>  _send_qemu_cmd $QEMU_HANDLE '' "BLOCK_JOB_COMPLETED"
>  fi
>  _send_qemu_cmd $QEMU_HANDLE '{"execute":"query-block-jobs"}' "return"
> -_cleanup_qemu
> +_send_qemu_cmd $QEMU_HANDLE '{"execute":"quit"}' "return"
> +wait=1 _cleanup_qemu
>  }
>  
>  for fmt in qcow qcow2 qed vdi vmdk vpc; do
> diff --git a/tests/qemu-iotests/109.out b/tests/qemu-iotests/109.out
> index dc02f9e..c189e28 100644
> --- a/tests/qemu-iotests/109.out
> +++ b/tests/qemu-iotests/109.out
> @@ -12,12 +12,17 @@ Specify the 'raw' format explicitly to remove the 
> restrictions.
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": 
> "report"}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": LEN, "offset": 0, 
> "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
>  {"return": []}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "SHUTDOWN", "data": {"guest": false}}
>  read 65536/65536 bytes at offset 0
>  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>  {"return": {}}
>  {"return": {}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "BLOCK_JOB_READY", "data": {"device": "src", "len": 1024, "offset": 1024, 
> "speed": 0, "type": "mirror"}}
>  {"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 1024, 
> "offset": 1024, "paused": false, "speed": 0, "ready": true, "type": 
> "mirror"}]}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "SHUTDOWN", "data": {"guest": false}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 1024, "offset": 1024, 
> "speed": 0, "type": "mirror"}}
>  Warning: Image size mismatch!
>  Images are identical.
>  
> @@ -33,12 +38,17 @@ Specify the 'raw' format explicitly to remove the 
> restrictions.
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": 
> "report"}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": LEN, "offset": 512, 
> "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
>  {"return": []}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "SHUTDOWN", "data": {"guest": false}}
>  read 65536/65536 bytes at offset 0
>  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>  {"return": {}}
>  {"return": {}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "BLOCK_JOB_READY", "data": {"device": "src", "len": 197120, "offset": 197120, 
> "speed": 0, "type": "mirror"}}
>  {"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 
> 197120, "offset": 197120, "paused": false, "speed": 0, "ready": true, "type": 
> "mirror"}]}
> +{"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "SHUTDOWN", "data": {"guest": false}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 197120, "offset": 
> 197120, "speed": 0, "type": "mirror"}}
>  Warning: Image size mismatch!
>  Images are identical.
>  
> @@ -54,12 +64,17 @@ Specify the 'raw' format explicitly to remove the 
> restrictions.
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write",

Re: [Qemu-devel] [PATCH for-2.10 1/1] qemu-iotests/109: Fix lock race condition

2017-08-01 Thread Eric Blake

On 08/01/2017 04:31 PM, Cleber Rosa wrote:
> A race condition is currently present between the clean up attempt of
> the QEMU process and the execution of qemu-img.  The actual (bad)
> output is:
> 
>  -Warning: Image size mismatch!
>  -Images are identical.
>  +qemu-img: Could not open '/tests/qemu-iotests/scratch/t.raw': 
> Failed to get "consistent read" lock
>  +Is another process using the image?
> 
> A KILL signal is sent to the QEMU process, but qemu-img may begin to
> run before the QEMU process is really gone.  qemu-img will then
> attempt to open the TEST_IMG file before it can secure a lock on it.
> 
> This attempts a more graceful shutdown, and waits for the QEMU process
> to exit.
> 
> Signed-off-by: Cleber Rosa 
> ---
>  tests/qemu-iotests/109 |  3 ++-
>  tests/qemu-iotests/109.out | 56 
> ++
>  2 files changed, 58 insertions(+), 1 deletion(-)

Reviewed-by: Eric Blake 

Probably missed -rc1, but still okay for -rc2

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PULL 06/10] accel: cleanup error output

2017-08-01 Thread Eric Blake

On 08/01/2017 04:37 PM, Michael S. Tsirkin wrote:
> From: Laurent Vivier 
> 
> Only emit "XXX accelerator not found", if there are not
> further accelerators listed. eg
> 
>accel=kvm:tcg
> 
> doesn't print a "KVM accelerator not found" warning
> when it falls back to tcg, but a
> 
>accel=kvm
> 
> prints a warning, since no fallback is given.

Already merged as d73f0247; but git should do the right thing as long as
your two pull requests don't introduce any conflicts.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH 1/1] qemu-iotests/109: Fix lock race condition

2017-08-01 Thread Cleber Rosa

A race condition is currently present between the clean up attempt of
the QEMU process and the execution of qemu-img.  The actual (bad)
output is:

 -Warning: Image size mismatch!
 -Images are identical.
 +qemu-img: Could not open '/tests/qemu-iotests/scratch/t.raw': 
Failed to get "consistent read" lock
 +Is another process using the image?

A KILL signal is sent to the QEMU process, but qemu-img may begin to
run before the QEMU process is really gone.  qemu-img will then
attempt to open the TEST_IMG file before it can secure a lock on it.

This attempts a more graceful shutdown, and waits for the QEMU process
to exit.

Signed-off-by: Cleber Rosa 
---
 tests/qemu-iotests/109 |  3 ++-
 tests/qemu-iotests/109.out | 56 ++
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/109 b/tests/qemu-iotests/109
index 3b496a3..d70b574 100755
--- a/tests/qemu-iotests/109
+++ b/tests/qemu-iotests/109
@@ -67,7 +67,8 @@ function run_qemu()
 _send_qemu_cmd $QEMU_HANDLE '' "BLOCK_JOB_COMPLETED"
 fi
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"query-block-jobs"}' "return"
-_cleanup_qemu
+_send_qemu_cmd $QEMU_HANDLE '{"execute":"quit"}' "return"
+wait=1 _cleanup_qemu
 }
 
 for fmt in qcow qcow2 qed vdi vmdk vpc; do
diff --git a/tests/qemu-iotests/109.out b/tests/qemu-iotests/109.out
index dc02f9e..c189e28 100644
--- a/tests/qemu-iotests/109.out
+++ b/tests/qemu-iotests/109.out
@@ -12,12 +12,17 @@ Specify the 'raw' format explicitly to remove the 
restrictions.
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": 
"report"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": LEN, "offset": 0, 
"speed": 0, "type": "mirror", "error": "Operation not permitted"}}
 {"return": []}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN", "data": {"guest": false}}
 read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_READY", "data": {"device": "src", "len": 1024, "offset": 1024, 
"speed": 0, "type": "mirror"}}
 {"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 1024, 
"offset": 1024, "paused": false, "speed": 0, "ready": true, "type": "mirror"}]}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN", "data": {"guest": false}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 1024, "offset": 1024, 
"speed": 0, "type": "mirror"}}
 Warning: Image size mismatch!
 Images are identical.
 
@@ -33,12 +38,17 @@ Specify the 'raw' format explicitly to remove the 
restrictions.
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": 
"report"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": LEN, "offset": 512, 
"speed": 0, "type": "mirror", "error": "Operation not permitted"}}
 {"return": []}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN", "data": {"guest": false}}
 read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_READY", "data": {"device": "src", "len": 197120, "offset": 197120, 
"speed": 0, "type": "mirror"}}
 {"return": [{"io-status": "ok", "device": "src", "busy": false, "len": 197120, 
"offset": 197120, "paused": false, "speed": 0, "ready": true, "type": 
"mirror"}]}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN", "data": {"guest": false}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 197120, "offset": 
197120, "speed": 0, "type": "mirror"}}
 Warning: Image size mismatch!
 Images are identical.
 
@@ -54,12 +64,17 @@ Specify the 'raw' format explicitly to remove the 
restrictions.
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_ERROR", "data": {"device": "src", "operation": "write", "action": 
"report"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": LEN, "offset": 262144, 
"speed": 0, "type": "mirror", "error": "Operation not permitted"}}
 {"return": []}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP,

Re: [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-01 Thread Michael S. Tsirkin

On Wed, Aug 02, 2017 at 12:33:12AM +0300, Alexander Bezzubikov wrote:
> 2017-08-01 23:31 GMT+03:00 Laszlo Ersek :
> > (Whenever my comments conflict with Michael's or Marcel's, I defer to them.)
> >
> > On 07/29/17 01:37, Aleksandr Bezzubikov wrote:
> >> Signed-off-by: Aleksandr Bezzubikov 
> >> ---
> >>  docs/pcie.txt|  46 ++
> >>  docs/pcie_pci_bridge.txt | 121 
> >> +++
> >>  2 files changed, 147 insertions(+), 20 deletions(-)
> >>  create mode 100644 docs/pcie_pci_bridge.txt
> >>
> >> diff --git a/docs/pcie.txt b/docs/pcie.txt
> >> index 5bada24..338b50e 100644
> >> --- a/docs/pcie.txt
> >> +++ b/docs/pcie.txt
> >> @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on 
> >> the Root Complex:
> >>  (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI 
> >> Express
> >>  hierarchies.
> >>
> >> -(3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI
> >> +(3) PCIE-PCI Bridge (pcie-pci-bridge), for starting legacy PCI
> >>  hierarchies.
> >>
> >>  (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root 
> >> Buses
> >
> > When reviewing previous patches modifying / adding this file, I
> > requested that we spell out "PCI Express" every single time. I'd like to
> > see the same in this patch, if possible.
> 
> OK, I didn't know it.
> 
> >
> >> @@ -55,18 +55,18 @@ Place only the following kinds of devices directly on 
> >> the Root Complex:
> >> pcie.0 bus
> >> 
> >> 
> >>  |||  |
> >> -   ---   --   --   --
> >> -   | PCI Dev |   | PCIe Root Port |   | DMI-PCI Bridge |   |  pxb-pcie  |
> >> -   ---   --   --   --
> >> +   ---   --   ---   --
> >> +   | PCI Dev |   | PCIe Root Port |   | PCIE-PCI Bridge |   |  pxb-pcie  |
> >> +   ---   --   ---   --
> >>
> >>  2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint 
> >> use:
> >>-device [,bus=pcie.0]
> >>  2.1.2 To expose a new PCI Express Root Bus use:
> >>-device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z]
> >> -  Only PCI Express Root Ports and DMI-PCI bridges can be connected
> >> +  Only PCI Express Root Ports, PCIE-PCI bridges and DMI-PCI bridges 
> >> can be connected
> >
> > It would be nice if we could keep the flowing text wrapped to 80 chars.
> >
> > Also, here you add the "PCI Express-PCI" bridge to the list of allowed
> > controllers (and you keep DMI-PCI as permitted), but above DMI was
> > replaced. I think these should be made consistent -- we should make up
> > our minds if we continue to recommend the DMI-PCI bridge or not. If not,
> > then we should eradicate all traces of it. If we want to keep it at
> > least for compatibility, then it should remain as fully documented as it
> > is now.
> 
> Now I'm beginning to think that we shouldn't keep the DMI-PCI bridge
> even for compatibility and may want to use a new PCIE-PCI bridge
> everywhere (of course, except some cases when users are
> sure they need exactly DMI-PCI bridge for some reason)

Can dmi-pci support shpc? why doesn't it? For compatibility?


> >
> >>to the pcie.1 bus:
> >>-device 
> >> ioh3420,id=root_port1[,bus=pcie.1][,chassis=x][,slot=y][,addr=z]   
> >>   \
> >> -  -device i82801b11-bridge,id=dmi_pci_bridge1,bus=pcie.1
> >> +  -device pcie-pci-bridge,id=pcie_pci_bridge1,bus=pcie.1
> >>
> >>
> >>  2.2 PCI Express only hierarchy
> >> @@ -130,21 +130,25 @@ Notes:
> >>  Legacy PCI devices can be plugged into pcie.0 as Integrated Endpoints,
> >>  but, as mentioned in section 5, doing so means the legacy PCI
> >>  device in question will be incapable of hot-unplugging.
> >> -Besides that use DMI-PCI Bridges (i82801b11-bridge) in combination
> >> +Besides that use PCIE-PCI Bridges (pcie-pci-bridge) in combination
> >>  with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies.
> >> +Instead of the PCIE-PCI Bridge DMI-PCI one can be used,
> >> +but it doens't support hot-plug, is not crossplatform and since that
> >
> > s/doens't/doesn't/
> >
> > s/since that/therefore it/
> >
> >> +is obsolete and deprecated. Use the PCIE-PCI Bridge if you're not
> >> +absolutely sure you need the DMI-PCI Bridge.
> >>
> >> -Prefer flat hierarchies. For most scenarios a single DMI-PCI Bridge
> >> +Prefer flat hierarchies. For most scenarios a single PCIE-PCI Bridge
> >>  (having 32 slots) and several PCI-PCI Bridges attached to it
> >>  (each supporting also 32 slots) will support hundreds of legacy devices.
> >> -The recommendation is to

[Qemu-devel] [PULL 09/10] pc: make 'pc.rom' readonly when machine has PCI enabled

2017-08-01 Thread Michael S. Tsirkin

From: Igor Mammedov 

looking at bios ROM mapping in QEMU it seems that only isapc
(i.e. not PCI enabled machine) requires ROM being mapped as
RW in other cases BIOS is mapped as RO. Do the same for option
ROM 'pc.rom' when machine has PCI enabled.

As useful side-effect pc.rom MemoryRegion stops being
put in vhost memory map (filtered out by vhost_section()),
which reduces number of entries by 1.

Coincidentally it fixes migration failure reported in

"[PATCH V2]  vhost: fix a migration failed because of vhost region merge"

where following destination CLI with 
/sys/module/vhost/parameters/max_mem_regions = 8

export DIMMSCOUNT=6
QEMU -enable-kvm \
 -netdev type=tap,id=guest0,vhost=on,script=no,vhostforce \
 -device virtio-net-pci,netdev=guest0 \
 -m 256,slots=256,maxmem=2G \
 `i=0; while [ $i -lt $DIMMSCOUNT ]; do echo \
 "-object memory-backend-ram,id=m$i,size=128M \
  -device pc-dimm,id=d$i,memdev=m$i"; i=$(($i + 1)); \
 done`

will fail to startup with error:

 "-device pc-dimm,id=d5,memdev=m5: a used vhost backend has no free memory 
slots left"

while it's possible to add the 6th DIMM during hotplug
on source.

Issue is caused by the fact that number of entries in vhost map
is bigger on 1 entry, when -device is processed, than
after guest boots up, and that offending entry belongs to
'pc.rom', it's not like vhost intends to do IO in ROM range
so making it RO hides region from vhost and makes number
of entries in vhost memory map at -device/machine_done time
match number of entries after guest boots.

Signed-off-by: Igor Mammedov 
Reported-by: Peng Hao 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/pc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 22e1603..5943539 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1443,6 +1443,9 @@ void pc_memory_init(PCMachineState *pcms,
 option_rom_mr = g_malloc(sizeof(*option_rom_mr));
 memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE,
_fatal);
+if (pcmc->pci_enabled) {
+memory_region_set_readonly(option_rom_mr, true);
+}
 memory_region_add_subregion_overlap(rom_memory,
 PC_ROM_MIN_VGA,
 option_rom_mr,
-- 
MST

[Qemu-devel] [PULL 08/10] vhost-user: fix watcher need be removed when vhost-user hotplug

2017-08-01 Thread Michael S. Tsirkin

From: Yunjian Wang 

"nc" is freed after hotplug vhost-user, but the watcher is not removed.
The QEMU crash when the watcher access the "nc" when socket disconnects.

Program received signal SIGSEGV, Segmentation fault.
#0  object_get_class (obj=obj@entry=0x2) at qom/object.c:750
#1  0x7f9bb4180da1 in qemu_chr_fe_disconnect (be=) at 
chardev/char-fe.c:372
#2  0x7f9bb40d1100 in net_vhost_user_watch (chan=, 
cond=, opaque=) at net/vhost-user.c:188
#3  0x7f9baf97f99a in g_main_context_dispatch () from 
/usr/lib64/libglib-2.0.so.0
#4  0x7f9bb41d7ebc in glib_pollfds_poll () at util/main-loop.c:213
#5  os_host_main_loop_wait (timeout=) at util/main-loop.c:261
#6  main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515
#7  0x7f9bb3e266a7 in main_loop () at vl.c:1917
#8  main (argc=, argv=, envp=) 
at vl.c:4786

Signed-off-by: Yunjian Wang 
Reviewed-by: Marc-André Lureau 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 net/vhost-user.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/net/vhost-user.c b/net/vhost-user.c
index 36f32a2..c23927c 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -151,6 +151,10 @@ static void vhost_user_cleanup(NetClientState *nc)
 s->vhost_net = NULL;
 }
 if (nc->queue_index == 0) {
+if (s->watch) {
+g_source_remove(s->watch);
+s->watch = 0;
+}
 qemu_chr_fe_deinit(>chr, true);
 }
 
-- 
MST

[Qemu-devel] [PULL 10/10] pc: acpi: force FADT rev1 for 440fx based machine types

2017-08-01 Thread Michael S. Tsirkin

From: Igor Mammedov 

w2k used to boot on QEMU until revision of FADT has
been bumped to rev3
(commit 77af8a2b hw/i386: Use Rev3 FADT (ACPI 2.0) instead of Rev1 to improve 
guest OS support.)

Keep PC machine at rev1 to remain compatible and Q35
at rev3 where w2k isn't supported anyway so OSX could
run as well.

Signed-off-by: Igor Mammedov 
Tested-by: John Arbuckle 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 6b7bade..b9c245c 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -90,6 +90,7 @@ typedef struct AcpiMcfgInfo {
 } AcpiMcfgInfo;
 
 typedef struct AcpiPmInfo {
+bool force_rev1_fadt;
 bool s3_disabled;
 bool s4_disabled;
 bool pcihp_bridge_en;
@@ -129,10 +130,13 @@ static void acpi_get_pm_info(AcpiPmInfo *pm)
 Object *obj = NULL;
 QObject *o;
 
+pm->force_rev1_fadt = false;
 pm->cpu_hp_io_base = 0;
 pm->pcihp_io_base = 0;
 pm->pcihp_io_len = 0;
 if (piix) {
+/* w2k requires FADT(rev1) or it won't boot, keep PC compatible */
+pm->force_rev1_fadt = true;
 obj = piix;
 pm->cpu_hp_io_base = PIIX4_CPU_HOTPLUG_IO_BASE;
 pm->pcihp_io_base =
@@ -304,6 +308,9 @@ static void fadt_setup(AcpiFadtDescriptorRev3 *fadt, 
AcpiPmInfo *pm)
 fadt->flags |= cpu_to_le32(1 << ACPI_FADT_F_FORCE_APIC_CLUSTER_MODEL);
 }
 fadt->century = RTC_CENTURY;
+if (pm->force_rev1_fadt) {
+return;
+}
 
 fadt->flags |= cpu_to_le32(1 << ACPI_FADT_F_RESET_REG_SUP);
 fadt->reset_value = 0xf;
@@ -342,6 +349,8 @@ build_fadt(GArray *table_data, BIOSLinker *linker, 
AcpiPmInfo *pm,
 unsigned fw_ctrl_offset = (char *)>firmware_ctrl - table_data->data;
 unsigned dsdt_entry_offset = (char *)>dsdt - table_data->data;
 unsigned xdsdt_entry_offset = (char *)>x_dsdt - table_data->data;
+int fadt_size = sizeof(*fadt);
+int rev = 3;
 
 /* FACS address to be filled by Guest linker */
 bios_linker_loader_add_pointer(linker,
@@ -353,12 +362,17 @@ build_fadt(GArray *table_data, BIOSLinker *linker, 
AcpiPmInfo *pm,
 bios_linker_loader_add_pointer(linker,
 ACPI_BUILD_TABLE_FILE, dsdt_entry_offset, sizeof(fadt->dsdt),
 ACPI_BUILD_TABLE_FILE, dsdt_tbl_offset);
-bios_linker_loader_add_pointer(linker,
-ACPI_BUILD_TABLE_FILE, xdsdt_entry_offset, sizeof(fadt->x_dsdt),
-ACPI_BUILD_TABLE_FILE, dsdt_tbl_offset);
+if (pm->force_rev1_fadt) {
+rev = 1;
+fadt_size = offsetof(typeof(*fadt), reset_register);
+} else {
+bios_linker_loader_add_pointer(linker,
+ACPI_BUILD_TABLE_FILE, xdsdt_entry_offset, sizeof(fadt->x_dsdt),
+ACPI_BUILD_TABLE_FILE, dsdt_tbl_offset);
+}
 
 build_header(linker, table_data,
- (void *)fadt, "FACP", sizeof(*fadt), 3, oem_id, oem_table_id);
+ (void *)fadt, "FACP", fadt_size, rev, oem_id, oem_table_id);
 }
 
 void pc_madt_cpu_entry(AcpiDeviceIf *adev, int uid,
-- 
MST

Re: [Qemu-devel] [PATCH v2 for-2.11 3/3] qemu-iotests: add option to save temp files on error

2017-08-01 Thread Jeff Cody

On Tue, Aug 01, 2017 at 08:34:01AM +0200, Markus Armbruster wrote:
> Jeff Cody  writes:
> 
> > Now that ./check takes care of cleaning up after each tests, it
> > can also selectively not clean up.  Add option to leave all output from
> > tests intact if that test encountered an error.
> >
> > Note: this currently only works for bash tests, as the python tests
> > still clean up after themselves manually.
> 
> Should we add a TODO comment for that?
> 

Couldn't hurt!

> Much appreciated work, by the way.  You might want to mention in one of
> your commit messages that this is also a step towards running iotests in
> parallel.
> 

Thanks.  I'll go ahead and spin a v3 to clean up commit messages and add a
TODO, since 2.11 won't open for a while.


> Another step towards sanity would be making $TEST_DIR instead of
> $source_iotests the current working directory for running tests.

Yep!

[Qemu-devel] [PULL 06/10] accel: cleanup error output

2017-08-01 Thread Michael S. Tsirkin

From: Laurent Vivier 

Only emit "XXX accelerator not found", if there are not
further accelerators listed. eg

   accel=kvm:tcg

doesn't print a "KVM accelerator not found" warning
when it falls back to tcg, but a

   accel=kvm

prints a warning, since no fallback is given.

Suggested-by: Daniel P. Berrange 
Suggested-by: Paolo Bonzini 
Signed-off-by: Laurent Vivier 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Eric Blake 
Tested-by: Thomas Huth 
---
 accel/accel.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/accel/accel.c b/accel/accel.c
index fa85844..8ae40e1 100644
--- a/accel/accel.c
+++ b/accel/accel.c
@@ -33,6 +33,7 @@
 #include "sysemu/qtest.h"
 #include "hw/xen/xen.h"
 #include "qom/object.h"
+#include "qemu/error-report.h"
 
 static const TypeInfo accel_type = {
 .name = TYPE_ACCEL,
@@ -69,19 +70,20 @@ static int accel_init_machine(AccelClass *acc, MachineState 
*ms)
 
 void configure_accelerator(MachineState *ms)
 {
-const char *p;
+const char *accel, *p;
 char buf[10];
 int ret;
 bool accel_initialised = false;
 bool init_failed = false;
 AccelClass *acc = NULL;
 
-p = qemu_opt_get(qemu_get_machine_opts(), "accel");
-if (p == NULL) {
+accel = qemu_opt_get(qemu_get_machine_opts(), "accel");
+if (accel == NULL) {
 /* Use the default "accelerator", tcg */
-p = "tcg";
+accel = "tcg";
 }
 
+p = accel;
 while (!accel_initialised && *p != '\0') {
 if (*p == ':') {
 p++;
@@ -89,7 +91,6 @@ void configure_accelerator(MachineState *ms)
 p = get_opt_name(buf, sizeof(buf), p, ':');
 acc = accel_find(buf);
 if (!acc) {
-fprintf(stderr, "\"%s\" accelerator not found.\n", buf);
 continue;
 }
 if (acc->available && !acc->available()) {
@@ -100,9 +101,8 @@ void configure_accelerator(MachineState *ms)
 ret = accel_init_machine(acc, ms);
 if (ret < 0) {
 init_failed = true;
-fprintf(stderr, "failed to initialize %s: %s\n",
-acc->name,
-strerror(-ret));
+error_report("failed to initialize %s: %s",
+ acc->name, strerror(-ret));
 } else {
 accel_initialised = true;
 }
@@ -110,13 +110,13 @@ void configure_accelerator(MachineState *ms)
 
 if (!accel_initialised) {
 if (!init_failed) {
-fprintf(stderr, "No accelerator found!\n");
+error_report("-machine accel=%s: No accelerator found", accel);
 }
 exit(1);
 }
 
 if (init_failed) {
-fprintf(stderr, "Back to %s accelerator.\n", acc->name);
+error_report("Back to %s accelerator", acc->name);
 }
 }
 
-- 
MST

[Qemu-devel] [PULL 03/10] vhost-user: fix legacy cross-endian configurations

2017-08-01 Thread Michael S. Tsirkin

From: Felipe Franciosi 

Currently, vhost-user does not implement any means for notifying the
backend about guest endianess. This commit introduces a new message
called VHOST_USER_SET_VRING_ENDIAN which is analogous to the ioctl()
called VHOST_SET_VRING_ENDIAN used for kernel vhost backends. Such
message is necessary for backends supporting legacy (pre-1.0) virtio
devices running in big-endian guests.

Signed-off-by: Felipe Franciosi 
Signed-off-by: Mike Cui 
---
 docs/interop/vhost-user.txt | 16 
 hw/virtio/vhost-user.c  | 23 +--
 2 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index 481ab56..954771d 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -326,6 +326,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
 #define VHOST_USER_PROTOCOL_F_MTU4
 #define VHOST_USER_PROTOCOL_F_SLAVE_REQ  5
+#define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN   6
 
 Master message types
 
@@ -580,6 +581,21 @@ Master message types
   This request should be send only when VIRTIO_F_IOMMU_PLATFORM feature
   has been successfully negotiated.
 
+ * VHOST_USER_SET_VRING_ENDIAN
+
+  Id: 23
+  Equivalent ioctl: VHOST_SET_VRING_ENDIAN
+  Master payload: vring state description
+
+  Set the endianess of a VQ for legacy devices. Little-endian is indicated
+  with state.num set to 0 and big-endian is indicated with state.num set
+  to 1. Other values are invalid.
+  This request should be sent only when VHOST_USER_PROTOCOL_F_CROSS_ENDIAN
+  has been negotiated.
+  Backends that negotiated this feature should handle both endianesses
+  and expect this message once (per VQ) during device configuration
+  (ie. before the master starts the VQ).
+
 Slave message types
 ---
 
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 2203011..093675e 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -33,6 +33,7 @@ enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
 VHOST_USER_PROTOCOL_F_NET_MTU = 4,
 VHOST_USER_PROTOCOL_F_SLAVE_REQ = 5,
+VHOST_USER_PROTOCOL_F_CROSS_ENDIAN = 6,
 
 VHOST_USER_PROTOCOL_F_MAX
 };
@@ -63,6 +64,7 @@ typedef enum VhostUserRequest {
 VHOST_USER_NET_SET_MTU = 20,
 VHOST_USER_SET_SLAVE_REQ_FD = 21,
 VHOST_USER_IOTLB_MSG = 22,
+VHOST_USER_SET_VRING_ENDIAN = 23,
 VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -367,8 +369,25 @@ static int vhost_user_set_vring_addr(struct vhost_dev *dev,
 static int vhost_user_set_vring_endian(struct vhost_dev *dev,
struct vhost_vring_state *ring)
 {
-error_report("vhost-user trying to send unhandled ioctl");
-return -1;
+bool cross_endian = virtio_has_feature(dev->protocol_features,
+   VHOST_USER_PROTOCOL_F_CROSS_ENDIAN);
+VhostUserMsg msg = {
+.request = VHOST_USER_SET_VRING_ENDIAN,
+.flags = VHOST_USER_VERSION,
+.payload.state = *ring,
+.size = sizeof(msg.payload.state),
+};
+
+if (!cross_endian) {
+error_report("vhost-user trying to send unhandled ioctl");
+return -1;
+}
+
+if (vhost_user_write(dev, , NULL, 0) < 0) {
+return -1;
+}
+
+return 0;
 }
 
 static int vhost_set_vring(struct vhost_dev *dev,
-- 
MST

[Qemu-devel] [PULL 07/10] tests/bios-tables-test: Compiler warning fix

2017-08-01 Thread Michael S. Tsirkin

From: "Dr. David Alan Gilbert" 

gcc 7.1.1 in fedora 26 moans about the:
   tables = g_new0(uint32_t, tables_nr)

because it can't convince itself that tables_nr is positive.
This is fallout from g_assert_cmpint no longer necessarily being
no-return;  replace it with a plain g_assert.

Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Marcel Apfelbaum 
---
 tests/bios-tables-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
index 63da978..564da45 100644
--- a/tests/bios-tables-test.c
+++ b/tests/bios-tables-test.c
@@ -108,7 +108,7 @@ static void test_acpi_rsdt_table(test_data *data)
 /* compute the table entries in rsdt */
 tables_nr = (rsdt_table->length - sizeof(AcpiRsdtDescriptorRev1)) /
 sizeof(uint32_t);
-g_assert_cmpint(tables_nr, >, 0);
+g_assert(tables_nr > 0);
 
 /* get the addresses of the tables pointed by rsdt */
 tables = g_new0(uint32_t, tables_nr);
-- 
MST

[Qemu-devel] [PULL 02/10] vhost: fix a memory leak

2017-08-01 Thread Michael S. Tsirkin

From: Peng Hao 

vhost exists a call for g_file_get_contents, but not call g_free.

Signed-off-by: Peng Hao
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Marc-André Lureau 
---
 hw/virtio/vhost-backend.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index cb055e8..7f09efa 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -52,11 +52,13 @@ static int vhost_kernel_memslots_limit(struct vhost_dev 
*dev)
 , NULL, NULL)) {
 uint64_t val = g_ascii_strtoull(s, NULL, 10);
 if (!((val == G_MAXUINT64 || !val) && errno)) {
+g_free(s);
 return val;
 }
 error_report("ignoring invalid max_mem_regions value in vhost module:"
  " %s", s);
 }
+g_free(s);
 return limit;
 }
 
-- 
MST

[Qemu-devel] [PULL 04/10] intel_iommu: fix iova for pt

2017-08-01 Thread Michael S. Tsirkin

From: Peter Xu 

IOMMUTLBEntry.iova is returned incorrectly on one PT path (though mostly
we cannot really trigger this path, even if we do, we are mostly
disgarding this value, so it didn't break anything). Fix it by
converting the VTD_PAGE_MASK into the correct definition
VTD_PAGE_MASK_4K, then remove VTD_PAGE_MASK.

Fixes: b93130 ("intel_iommu: cleanup vtd_{do_}iommu_translate()")
Signed-off-by: Peter Xu 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/intel_iommu_internal.h | 1 -
 hw/i386/intel_iommu.c  | 4 ++--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index f50ecd8..0e73a65 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -384,7 +384,6 @@ typedef struct VTDIOTLBPageInvInfo VTDIOTLBPageInvInfo;
 /* Pagesize of VTD paging structures, including root and context tables */
 #define VTD_PAGE_SHIFT  12
 #define VTD_PAGE_SIZE   (1ULL << VTD_PAGE_SHIFT)
-#define VTD_PAGE_MASK   (VTD_PAGE_SIZE - 1)
 
 #define VTD_PAGE_SHIFT_4K   12
 #define VTD_PAGE_MASK_4K(~((1ULL << VTD_PAGE_SHIFT_4K) - 1))
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index e398746..e0b0498 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1139,9 +1139,9 @@ static bool vtd_do_iommu_translate(VTDAddressSpace 
*vtd_as, PCIBus *bus,
  * Also, let's ignore IOTLB caching as well for PT devices.
  */
 if (vtd_ce_get_type() == VTD_CONTEXT_TT_PASS_THROUGH) {
-entry->iova = addr & VTD_PAGE_MASK;
+entry->iova = addr & VTD_PAGE_MASK_4K;
 entry->translated_addr = entry->iova;
-entry->addr_mask = VTD_PAGE_MASK;
+entry->addr_mask = ~VTD_PAGE_MASK_4K;
 entry->perm = IOMMU_RW;
 trace_vtd_translate_pt(source_id, entry->iova);
 
-- 
MST

[Qemu-devel] [PULL 05/10] intel_iommu: use access_flags for iotlb

2017-08-01 Thread Michael S. Tsirkin

From: Peter Xu 

It was cached by read/write separately. Let's merge them.

Signed-off-by: Peter Xu 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/i386/intel_iommu.h |  3 +--
 hw/i386/intel_iommu.c | 15 +++
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 08d8a26..ac15e6b 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -103,8 +103,7 @@ struct VTDIOTLBEntry {
 uint16_t domain_id;
 uint64_t slpte;
 uint64_t mask;
-bool read_flags;
-bool write_flags;
+uint8_t access_flags;
 };
 
 /* VT-d Source-ID Qualifier types */
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index e0b0498..a7bf87a 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -237,8 +237,7 @@ out:
 
 static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id,
  uint16_t domain_id, hwaddr addr, uint64_t slpte,
- bool read_flags, bool write_flags,
- uint32_t level)
+ uint8_t access_flags, uint32_t level)
 {
 VTDIOTLBEntry *entry = g_malloc(sizeof(*entry));
 uint64_t *key = g_malloc(sizeof(*key));
@@ -253,8 +252,7 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t 
source_id,
 entry->gfn = gfn;
 entry->domain_id = domain_id;
 entry->slpte = slpte;
-entry->read_flags = read_flags;
-entry->write_flags = write_flags;
+entry->access_flags = access_flags;
 entry->mask = vtd_slpt_level_page_mask(level);
 *key = vtd_get_iotlb_key(gfn, source_id, level);
 g_hash_table_replace(s->iotlb, key, entry);
@@ -1087,6 +1085,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace 
*vtd_as, PCIBus *bus,
 bool is_fpd_set = false;
 bool reads = true;
 bool writes = true;
+uint8_t access_flags;
 VTDIOTLBEntry *iotlb_entry;
 
 /*
@@ -1101,8 +1100,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace 
*vtd_as, PCIBus *bus,
 trace_vtd_iotlb_page_hit(source_id, addr, iotlb_entry->slpte,
  iotlb_entry->domain_id);
 slpte = iotlb_entry->slpte;
-reads = iotlb_entry->read_flags;
-writes = iotlb_entry->write_flags;
+access_flags = iotlb_entry->access_flags;
 page_mask = iotlb_entry->mask;
 goto out;
 }
@@ -1172,13 +1170,14 @@ static bool vtd_do_iommu_translate(VTDAddressSpace 
*vtd_as, PCIBus *bus,
 }
 
 page_mask = vtd_slpt_level_page_mask(level);
+access_flags = IOMMU_ACCESS_FLAG(reads, writes);
 vtd_update_iotlb(s, source_id, VTD_CONTEXT_ENTRY_DID(ce.hi), addr, slpte,
- reads, writes, level);
+ access_flags, level);
 out:
 entry->iova = addr & page_mask;
 entry->translated_addr = vtd_get_slpte_addr(slpte) & page_mask;
 entry->addr_mask = ~page_mask;
-entry->perm = IOMMU_ACCESS_FLAG(reads, writes);
+entry->perm = access_flags;
 return true;
 
 error:
-- 
MST

[Qemu-devel] [PULL 01/10] tests: switch pxe and vm gen id tests to use kvm

2017-08-01 Thread Michael S. Tsirkin

Speed up tests on host systems with kvm support.

Cc: Paolo Bonzini 
Cc: Thomas Huth 
Cc: Laurent Vivier 
Signed-off-by: Michael S. Tsirkin 
Acked-by: Paolo Bonzini 
---
 tests/pxe-test.c | 2 +-
 tests/vmgenid-test.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/pxe-test.c b/tests/pxe-test.c
index 34282d3..cf6e225 100644
--- a/tests/pxe-test.c
+++ b/tests/pxe-test.c
@@ -25,7 +25,7 @@ static void test_pxe_one(const char *params, bool ipv6)
 {
 char *args;
 
-args = g_strdup_printf("-machine accel=tcg -nodefaults -boot order=n "
+args = g_strdup_printf("-machine accel=kvm:tcg -nodefaults -boot order=n "
"-netdev user,id=" NETNAME ",tftp=./,bootfile=%s,"
"ipv4=%s,ipv6=%s %s", disk, ipv6 ? "off" : "on",
ipv6 ? "on" : "off", params);
diff --git a/tests/vmgenid-test.c b/tests/vmgenid-test.c
index a49de92..3d5c1c3 100644
--- a/tests/vmgenid-test.c
+++ b/tests/vmgenid-test.c
@@ -132,7 +132,7 @@ static char disk[] = "tests/vmgenid-test-disk-XX";
 
 static char *guid_cmd_strdup(const char *guid)
 {
-return g_strdup_printf("-machine accel=tcg "
+return g_strdup_printf("-machine accel=kvm:tcg "
"-device vmgenid,id=testvgid,guid=%s "
"-drive id=hd0,if=none,file=%s,format=raw "
"-device ide-hd,drive=hd0 ",
-- 
MST

[Qemu-devel] [PULL 00/10] pc, acpi, virtio: fixes, test speedup for rc1

2017-08-01 Thread Michael S. Tsirkin

The following changes since commit 7d48cf8102a10e4a54333811bafb5eb566509268:

  Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' 
into staging (2017-08-01 14:33:56 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to 3a3fcc75f92ab0d71ba0e7511af7dc575c4b4f6c:

  pc: acpi: force FADT rev1 for 440fx based machine types (2017-08-02 00:13:26 
+0300)


pc, acpi, virtio: fixes, test speedup for rc1

Some fixes all over the place. Notably vhost-user gained a new message
to set endian-ness. Borderline for 2.10 but seems to be the only way to
fix legacy guests.  Also pc tests are run on kvm now. Not a fix at all
but doesn't touch qemu itself, so I merged it since I had to run these a
lot and I just got tired of waiting for these to finish.

Signed-off-by: Michael S. Tsirkin 


Dr. David Alan Gilbert (1):
  tests/bios-tables-test: Compiler warning fix

Felipe Franciosi (1):
  vhost-user: fix legacy cross-endian configurations

Igor Mammedov (2):
  pc: make 'pc.rom' readonly when machine has PCI enabled
  pc: acpi: force FADT rev1 for 440fx based machine types

Laurent Vivier (1):
  accel: cleanup error output

Michael S. Tsirkin (1):
  tests: switch pxe and vm gen id tests to use kvm

Peng Hao (1):
  vhost: fix a memory leak

Peter Xu (2):
  intel_iommu: fix iova for pt
  intel_iommu: use access_flags for iotlb

Yunjian Wang (1):
  vhost-user: fix watcher need be removed when vhost-user hotplug

 docs/interop/vhost-user.txt| 16 
 hw/i386/intel_iommu_internal.h |  1 -
 include/hw/i386/intel_iommu.h  |  3 +--
 accel/accel.c  | 20 ++--
 hw/i386/acpi-build.c   | 22 ++
 hw/i386/intel_iommu.c  | 19 +--
 hw/i386/pc.c   |  3 +++
 hw/virtio/vhost-backend.c  |  2 ++
 hw/virtio/vhost-user.c | 23 +--
 net/vhost-user.c   |  4 
 tests/bios-tables-test.c   |  2 +-
 tests/pxe-test.c   |  2 +-
 tests/vmgenid-test.c   |  2 +-
 13 files changed, 87 insertions(+), 32 deletions(-)

[Qemu-devel] [PATCH 0/1] qemu-iotests/109: Fix lock race condition

2017-08-01 Thread Cleber Rosa

A race condition is currently present between the clean up attempt of
the QEMU process and the execution of qemu-img.

Cleber Rosa(1):
   qemu-iotests/109: Fix lock/race condition

 tests/qemu-iotests/109 |  3 ++-
 tests/qemu-iotests/109.out | 56 ++
 2 files changed, 58 insertions(+), 1 deletion(-)

Re: [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-01 Thread Alexander Bezzubikov

2017-08-01 23:31 GMT+03:00 Laszlo Ersek :
> (Whenever my comments conflict with Michael's or Marcel's, I defer to them.)
>
> On 07/29/17 01:37, Aleksandr Bezzubikov wrote:
>> Signed-off-by: Aleksandr Bezzubikov 
>> ---
>>  docs/pcie.txt|  46 ++
>>  docs/pcie_pci_bridge.txt | 121 
>> +++
>>  2 files changed, 147 insertions(+), 20 deletions(-)
>>  create mode 100644 docs/pcie_pci_bridge.txt
>>
>> diff --git a/docs/pcie.txt b/docs/pcie.txt
>> index 5bada24..338b50e 100644
>> --- a/docs/pcie.txt
>> +++ b/docs/pcie.txt
>> @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on the 
>> Root Complex:
>>  (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI 
>> Express
>>  hierarchies.
>>
>> -(3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI
>> +(3) PCIE-PCI Bridge (pcie-pci-bridge), for starting legacy PCI
>>  hierarchies.
>>
>>  (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root Buses
>
> When reviewing previous patches modifying / adding this file, I
> requested that we spell out "PCI Express" every single time. I'd like to
> see the same in this patch, if possible.

OK, I didn't know it.

>
>> @@ -55,18 +55,18 @@ Place only the following kinds of devices directly on 
>> the Root Complex:
>> pcie.0 bus
>> 
>> 
>>  |||  |
>> -   ---   --   --   --
>> -   | PCI Dev |   | PCIe Root Port |   | DMI-PCI Bridge |   |  pxb-pcie  |
>> -   ---   --   --   --
>> +   ---   --   ---   --
>> +   | PCI Dev |   | PCIe Root Port |   | PCIE-PCI Bridge |   |  pxb-pcie  |
>> +   ---   --   ---   --
>>
>>  2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint 
>> use:
>>-device [,bus=pcie.0]
>>  2.1.2 To expose a new PCI Express Root Bus use:
>>-device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z]
>> -  Only PCI Express Root Ports and DMI-PCI bridges can be connected
>> +  Only PCI Express Root Ports, PCIE-PCI bridges and DMI-PCI bridges can 
>> be connected
>
> It would be nice if we could keep the flowing text wrapped to 80 chars.
>
> Also, here you add the "PCI Express-PCI" bridge to the list of allowed
> controllers (and you keep DMI-PCI as permitted), but above DMI was
> replaced. I think these should be made consistent -- we should make up
> our minds if we continue to recommend the DMI-PCI bridge or not. If not,
> then we should eradicate all traces of it. If we want to keep it at
> least for compatibility, then it should remain as fully documented as it
> is now.

Now I'm beginning to think that we shouldn't keep the DMI-PCI bridge
even for compatibility and may want to use a new PCIE-PCI bridge
everywhere (of course, except some cases when users are
sure they need exactly DMI-PCI bridge for some reason)

>
>>to the pcie.1 bus:
>>-device 
>> ioh3420,id=root_port1[,bus=pcie.1][,chassis=x][,slot=y][,addr=z] 
>> \
>> -  -device i82801b11-bridge,id=dmi_pci_bridge1,bus=pcie.1
>> +  -device pcie-pci-bridge,id=pcie_pci_bridge1,bus=pcie.1
>>
>>
>>  2.2 PCI Express only hierarchy
>> @@ -130,21 +130,25 @@ Notes:
>>  Legacy PCI devices can be plugged into pcie.0 as Integrated Endpoints,
>>  but, as mentioned in section 5, doing so means the legacy PCI
>>  device in question will be incapable of hot-unplugging.
>> -Besides that use DMI-PCI Bridges (i82801b11-bridge) in combination
>> +Besides that use PCIE-PCI Bridges (pcie-pci-bridge) in combination
>>  with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies.
>> +Instead of the PCIE-PCI Bridge DMI-PCI one can be used,
>> +but it doens't support hot-plug, is not crossplatform and since that
>
> s/doens't/doesn't/
>
> s/since that/therefore it/
>
>> +is obsolete and deprecated. Use the PCIE-PCI Bridge if you're not
>> +absolutely sure you need the DMI-PCI Bridge.
>>
>> -Prefer flat hierarchies. For most scenarios a single DMI-PCI Bridge
>> +Prefer flat hierarchies. For most scenarios a single PCIE-PCI Bridge
>>  (having 32 slots) and several PCI-PCI Bridges attached to it
>>  (each supporting also 32 slots) will support hundreds of legacy devices.
>> -The recommendation is to populate one PCI-PCI Bridge under the DMI-PCI 
>> Bridge
>> +The recommendation is to populate one PCI-PCI Bridge under the PCIE-PCI 
>> Bridge
>>  until is full and then plug a new PCI-PCI Bridge...
>>
>> pcie.0 bus
>> --
>>  ||
>> -   ---

Re: [Qemu-devel] [PATCH] vfio/pci-quirks: Set non-zero GMS memory size for IGD

2017-08-01 Thread Alex Williamson

On Fri, 28 Jul 2017 08:22:45 +0300
Dmitry Fleytman  wrote:

> > On 28 Jul 2017, at 07:51 AM, Zhang, Xiong Y  wrote:
> >   
> >>> On 26 Jul 2017, at 08:22 AM, Zhang, Xiong Y   
> >> wrote:  
> >>> 
> >>> Sorry, we indeed found Intel windows guest graphic driver couldn't be 
> >>> bind  
> >> when GMS memory size is zero. And we have fixed it and the next intel
> >> windows driver release will contain this fix.  
> >>> So currently please use x-igd-gms in legacy mode to work around this.
> >>> 
> >>> BTW, how did you know window driver allocate extra ~4G memory when  
> >> GMS size set to 0 ?
> >> 
> >> We noticed that with new IGD driver memory usage on VMs raised by around
> >> 4G.
> >> 
> >> Then we examined memory allocations with poolmon
> >> (https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/poolm
> >> on)
> >> and saw that around 4G of memory is allocated with IGD driver pool tag.
> >> Additionally we noticed that on VMs with less than 5G of memory IGD driver
> >> does not load and system does fallback to standard VGA driver.
> >>   
> > [Zhang, Xiong Y]  I tried our internal driver and didn't found the above 
> > two issues, please wait for the next intel graphic driver release.  
> 
> Thanks!

So to close on this, we should just be recommending that users
encountering this problem add the x-igd-gms=1 option to the device
until the next Intel graphics release.  I won't plan to push this
change.  Thanks all,

Alex

Re: [Qemu-devel] [PATCH for-2.10 0/5] tests: acpi: make sure FADT is compared to reference table

2017-08-01 Thread Michael S. Tsirkin

On Mon, Jul 31, 2017 at 05:40:47PM +0200, Igor Mammedov wrote:
> While refactoring i386/FADT generation to build_append_int_noprefix() 
>
> and testing it, It turned out that FADT is only tested for valid  
>
> checksum but actual test for unintended changes isn't applied to it   
>
> even though we have reference tables in tree. 
>
> So here goes a couple of cleanups to reflect what fuctions do +   
>
> some comments and actual fix. 
>
>   
>
> Note to maintainer:   
>
>   FADT reference table is out of sync and should be updated along with
>
>   series applied. 
>
>   
>
> CC: "Michael S. Tsirkin" 
> 
> CC: Marcel Apfelbaum  

Absolutely good stuff, but not a bugfix as such (it's not that the
test is wrong, it's that we skip FADT for now)
so I don't think this is 2.10 material.

> Igor Mammedov (5):
>   tests: acpi: move tested tables array allocation outside of
> test_acpi_dsdt_table()
>   tests: acpi: init table descriptor in test_dst_table()
>   tests: acpi: rename test_acpi_tables()/test_dst_table() to reflect its
> usage
>   tests: acpi: add comments to fetch_rsdt_referenced_tables/data->tables
> usage
>   tests: acpi: fix FADT not being compared to reference table
> 
>  tests/bios-tables-test.c | 30 ++
>  1 file changed, 18 insertions(+), 12 deletions(-)
> 
> -- 
> 2.7.4

Re: [Qemu-devel] [PATCH v3 5/5] docs: update documentation considering PCIE-PCI bridge

2017-08-01 Thread Laszlo Ersek

(Whenever my comments conflict with Michael's or Marcel's, I defer to them.)

On 07/29/17 01:37, Aleksandr Bezzubikov wrote:
> Signed-off-by: Aleksandr Bezzubikov 
> ---
>  docs/pcie.txt|  46 ++
>  docs/pcie_pci_bridge.txt | 121 
> +++
>  2 files changed, 147 insertions(+), 20 deletions(-)
>  create mode 100644 docs/pcie_pci_bridge.txt
> 
> diff --git a/docs/pcie.txt b/docs/pcie.txt
> index 5bada24..338b50e 100644
> --- a/docs/pcie.txt
> +++ b/docs/pcie.txt
> @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on the 
> Root Complex:
>  (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI 
> Express
>  hierarchies.
>  
> -(3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI
> +(3) PCIE-PCI Bridge (pcie-pci-bridge), for starting legacy PCI
>  hierarchies.
>  
>  (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root Buses

When reviewing previous patches modifying / adding this file, I
requested that we spell out "PCI Express" every single time. I'd like to
see the same in this patch, if possible.

> @@ -55,18 +55,18 @@ Place only the following kinds of devices directly on the 
> Root Complex:
> pcie.0 bus
> 
> 
>  |||  |
> -   ---   --   --   --
> -   | PCI Dev |   | PCIe Root Port |   | DMI-PCI Bridge |   |  pxb-pcie  |
> -   ---   --   --   --
> +   ---   --   ---   --
> +   | PCI Dev |   | PCIe Root Port |   | PCIE-PCI Bridge |   |  pxb-pcie  |
> +   ---   --   ---   --
>  
>  2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint use:
>-device [,bus=pcie.0]
>  2.1.2 To expose a new PCI Express Root Bus use:
>-device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z]
> -  Only PCI Express Root Ports and DMI-PCI bridges can be connected
> +  Only PCI Express Root Ports, PCIE-PCI bridges and DMI-PCI bridges can 
> be connected

It would be nice if we could keep the flowing text wrapped to 80 chars.

Also, here you add the "PCI Express-PCI" bridge to the list of allowed
controllers (and you keep DMI-PCI as permitted), but above DMI was
replaced. I think these should be made consistent -- we should make up
our minds if we continue to recommend the DMI-PCI bridge or not. If not,
then we should eradicate all traces of it. If we want to keep it at
least for compatibility, then it should remain as fully documented as it
is now.

>to the pcie.1 bus:
>-device 
> ioh3420,id=root_port1[,bus=pcie.1][,chassis=x][,slot=y][,addr=z]  
>\
> -  -device i82801b11-bridge,id=dmi_pci_bridge1,bus=pcie.1
> +  -device pcie-pci-bridge,id=pcie_pci_bridge1,bus=pcie.1
>  
>  
>  2.2 PCI Express only hierarchy
> @@ -130,21 +130,25 @@ Notes:
>  Legacy PCI devices can be plugged into pcie.0 as Integrated Endpoints,
>  but, as mentioned in section 5, doing so means the legacy PCI
>  device in question will be incapable of hot-unplugging.
> -Besides that use DMI-PCI Bridges (i82801b11-bridge) in combination
> +Besides that use PCIE-PCI Bridges (pcie-pci-bridge) in combination
>  with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies.
> +Instead of the PCIE-PCI Bridge DMI-PCI one can be used,
> +but it doens't support hot-plug, is not crossplatform and since that

s/doens't/doesn't/

s/since that/therefore it/

> +is obsolete and deprecated. Use the PCIE-PCI Bridge if you're not 
> +absolutely sure you need the DMI-PCI Bridge.
>  
> -Prefer flat hierarchies. For most scenarios a single DMI-PCI Bridge
> +Prefer flat hierarchies. For most scenarios a single PCIE-PCI Bridge
>  (having 32 slots) and several PCI-PCI Bridges attached to it
>  (each supporting also 32 slots) will support hundreds of legacy devices.
> -The recommendation is to populate one PCI-PCI Bridge under the DMI-PCI Bridge
> +The recommendation is to populate one PCI-PCI Bridge under the PCIE-PCI 
> Bridge
>  until is full and then plug a new PCI-PCI Bridge...
>  
> pcie.0 bus
> --
>  ||
> -   ---   --
> -   | PCI Dev |   | DMI-PCI BRIDGE |
> -   ----
> +   ---   ---
> +   | PCI Dev |   | PCIE-PCI BRIDGE |
> +   -----
> ||
>----
>| PCI-PCI Bridge ||

[Qemu-devel] [Bug 1473451] Re: Please support the native bios format for dec alpha

2017-08-01 Thread Thomas Huth

** Changed in: qemu
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1473451

Title:
  Please support the native bios format for dec alpha

Status in QEMU:
  Won't Fix

Bug description:
  Currently qemu-system-alpha -bios parameter takes an ELF image.
  However HP maintains firmware updates for those systems.

  Some example rom files can be found here
  
ftp://ftp.hp.com/pub/alphaserver/firmware/current_platforms/v7.3_release/DS20_DS20e/

  It might allow things like using the SRM firmware.
  The ARC(nt) firmware would allow to build and test windows applications for 
that platforms without having the relevant hardware

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1473451/+subscriptions

Re: [Qemu-devel] [PULL 02/17] cpu_physical_memory_sync_dirty_bitmap: Fix alignment check

2017-08-01 Thread Dr. David Alan Gilbert

* Peter Maydell (peter.mayd...@linaro.org) wrote:
> On 1 August 2017 at 17:17, Paolo Bonzini  wrote:
> > From: "Dr. David Alan Gilbert" 
> >
> > This code has an optimised, word aligned version, and a boring
> > unaligned version.  Recently 084140bd498909 fixed a missing offset
> > addition from the core of both versions.  However, the offset isn't
> > necessarily aligned and thus the choice between the two versions
> > needs fixing up to also include the offset.
> >
> > Symptom:
> >   A few stuck unsent pages during migration; not normally noticed
> > unless under very low bandwidth in which case the migration may get
> > stuck never ending and never performing a 2nd sync; noticed by
> > a hanging postcopy-test on a very heavily loaded system.
> >
> > Fixes: 084140bd498909
> >
> > Signed-off-by: Dr. David Alan Gilbert 
> > Reported-by: Alex BenneÃ© 
> > Tested-by: Alex BenneÃ© 
> >
> > --
> > v2
> >   Move 'page' inside the if (Comment from Paolo)
> > Message-Id: <20170724165125.29887-1-dgilb...@redhat.com>
> > Signed-off-by: Paolo Bonzini 
> 
> Something somewhere along the line seems to have mangled the
> unicode characters in Alex's name :-(
> Also, Alex's email address is typoed, and what should be the
> '---' marker has been written as '--' so the below-the-fold waffle
> is still hanging around in the commit.
> 
> This doesn't seem worth rejecting the pull request on the
> eve of rc1 for, but it's still a bit sad :-(

Hmm yes, I think some of that's mine - the missing 'n' is certainly
my fault; and I may have got his two 'e's in the wrong order,
but I don't know what changed the encoding, that seems right on:
  https://lists.nongnu.org/archive/html/qemu-devel/2017-07/msg07417.html

Dave

> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PULL 00/17] Misc changes for QEMU 2.10-rc1 (?)

2017-08-01 Thread Peter Maydell

On 1 August 2017 at 17:17, Paolo Bonzini  wrote:
> The following changes since commit 7d48cf8102a10e4a54333811bafb5eb566509268:
>
>   Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' 
> into staging (2017-08-01 14:33:56 +0100)
>
> are available in the git repository at:
>
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 33f21e4f044ac1c37f60edc1f1aee628be8f463b:
>
>   mc146818rtc: implement UIP latching as intended (2017-08-01 17:27:34 +0200)
>
> 
> * Xen fix (Anthony)
> * chardev fixes (Anton, Marc-André)
> * small dead code removal (Zhongyi)
> * documentation (Dan)
> * bugfixes (David)
> * decrease migration downtime (Jay)
> * improved error output (Laurent)
> * RTC tests and bugfix (me)
> * Bluetooth clang analyzer fix (me)
> * KVM CPU hotplug race (Peng Hao)
> * Two other patches from Philippe's clang analyzer series
>

Applied, thanks, but can you check whatever it is in your
workflow that's mangling non-ascii characters?

thanks
-- PMM

Re: [Qemu-devel] [PULL 02/17] cpu_physical_memory_sync_dirty_bitmap: Fix alignment check

2017-08-01 Thread Peter Maydell

On 1 August 2017 at 17:17, Paolo Bonzini  wrote:
> From: "Dr. David Alan Gilbert" 
>
> This code has an optimised, word aligned version, and a boring
> unaligned version.  Recently 084140bd498909 fixed a missing offset
> addition from the core of both versions.  However, the offset isn't
> necessarily aligned and thus the choice between the two versions
> needs fixing up to also include the offset.
>
> Symptom:
>   A few stuck unsent pages during migration; not normally noticed
> unless under very low bandwidth in which case the migration may get
> stuck never ending and never performing a 2nd sync; noticed by
> a hanging postcopy-test on a very heavily loaded system.
>
> Fixes: 084140bd498909
>
> Signed-off-by: Dr. David Alan Gilbert 
> Reported-by: Alex BenneÃ© 
> Tested-by: Alex BenneÃ© 
>
> --
> v2
>   Move 'page' inside the if (Comment from Paolo)
> Message-Id: <20170724165125.29887-1-dgilb...@redhat.com>
> Signed-off-by: Paolo Bonzini 

Something somewhere along the line seems to have mangled the
unicode characters in Alex's name :-(
Also, Alex's email address is typoed, and what should be the
'---' marker has been written as '--' so the below-the-fold waffle
is still hanging around in the commit.

This doesn't seem worth rejecting the pull request on the
eve of rc1 for, but it's still a bit sad :-(

thanks
-- PMM

Re: [Qemu-devel] [seabios PATCH for qemu 2.10 0/2] seabios: build ACPI 1.0-compatible ACPI tables

2017-08-01 Thread Michael S. Tsirkin

On Wed, Jul 26, 2017 at 11:42:33AM +0200, Paolo Bonzini wrote:
> Old operating systems would like to have a rev1 (ACPI 1.0) FADT, but
> new operating systems would like to have rev3 (ACPI 2.0).
> Since old operating systems do not know about XSDTs, the
> solution is to point the RSDT to a v1 FADT and the XSDT to a
> rev3 FADT.

So I think for 2.10 Igor's patch is much smaller and only
touches QEMU, I'm inclined to merge just that one.
Let's discuss this after 2.10 is out.

> 
> Paolo Bonzini (2):
>   seabios: build RSDT from XSDT
>   seabios: create rev1 FADT in compatibility RSDT
> 
>  src/fw/paravirt.c | 68 
> ++-
>  src/std/acpi.h| 11 +
>  2 files changed, 78 insertions(+), 1 deletion(-)
> 
> -- 
> 2.13.3

Re: [Qemu-devel] [PATCH v3 2/3] pci: add QEMU-specific PCI capability structure

2017-08-01 Thread Alexander Bezzubikov

2017-08-01 16:38 GMT+03:00 Marcel Apfelbaum :
> On 31/07/2017 22:01, Alexander Bezzubikov wrote:
>>
>> 2017-07-31 21:57 GMT+03:00 Michael S. Tsirkin :
>>>
>>> On Mon, Jul 31, 2017 at 09:54:55PM +0300, Alexander Bezzubikov wrote:

 2017-07-31 17:09 GMT+03:00 Marcel Apfelbaum :
>
> On 31/07/2017 17:00, Michael S. Tsirkin wrote:
>>
>>
>> On Sat, Jul 29, 2017 at 02:34:31AM +0300, Aleksandr Bezzubikov wrote:
>>>
>>>
>>> On PCI init PCI bridge devices may need some
>>> extra info about bus number to reserve, IO, memory and
>>> prefetchable memory limits. QEMU can provide this
>>> with special vendor-specific PCI capability.
>>>
>>> This capability is intended to be used only
>>> for Red Hat PCI bridges, i.e. QEMU cooperation.
>>>
>>> Signed-off-by: Aleksandr Bezzubikov 
>>> ---
>>>src/fw/dev-pci.h | 62
>>> 
>>>1 file changed, 62 insertions(+)
>>>create mode 100644 src/fw/dev-pci.h
>>>
>>> diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h
>>> new file mode 100644
>>> index 000..fbd49ed
>>> --- /dev/null
>>> +++ b/src/fw/dev-pci.h
>>> @@ -0,0 +1,62 @@
>>> +#ifndef _PCI_CAP_H
>>> +#define _PCI_CAP_H
>>> +
>>> +#include "types.h"
>>> +
>>> +/*
>>> +
>>> +QEMU-specific vendor(Red Hat)-specific capability.
>>> +It's intended to provide some hints for firmware to init PCI
>>> devices.
>>> +
>>> +Its is shown below:
>>> +
>>> +Header:
>>> +
>>> +u8 id;   Standard PCI Capability Header field
>>> +u8 next; Standard PCI Capability Header field
>>> +u8 len;  Standard PCI Capability Header field
>>> +u8 type; Red Hat vendor-specific capability type:
>>> +   now only REDHAT_QEMU_CAP 1 exists
>>> +Data:
>>> +
>>> +u16 non_prefetchable_16; non-prefetchable memory limit
>>> +
>>> +u8 bus_res;  minimum bus number to reserve;
>>> + this is necessary for PCI Express Root Ports
>>> + to support PCIE-to-PCI bridge hotplug
>>> +
>>> +u8 io_8; IO limit in case of 8-bit limit value
>>> +u32 io_32;   IO limit in case of 16-bit limit value
>>> + io_8 and io_16 are mutually exclusive, in other words,
>>> + they can't be non-zero simultaneously
>>> +
>>> +u32 prefetchable_32; non-prefetchable memory limit
>>> + in case of 32-bit limit value
>>> +u64 prefetchable_64; non-prefetchable memory limit
>>> + in case of 64-bit limit value
>>> + prefetachable_32 and prefetchable_64
>>> are
>>> + mutually exclusive, in other words,
>>> + they can't be non-zero simultaneously
>>> +If any field in Data section is 0,
>>> +it means that such kind of reservation
>>> +is not needed.


 I really don't like this 'mutually exclusive' fields approach because
 IMHO it increases confusion level when undertanding this capability
 structure.
 But - if we came to consensus on that, then IO fields should be used
 in the same way,
 because as I understand, this 'mutual exclusivity' serves to distinguish
 cases
 when we employ only *_LIMIT register and both *_LIMIT an UPPER_*_LIMIT
 registers.
 And this is how both IO and PREFETCHABLE works, isn't it?
>>>
>>>
>>> I would just use simeple 64 bit registers. PCI spec has an ugly format
>>> with fields spread all over the place but that is because of
>>> compatibility concerns. It makes not sense to spend cycles just
>>> to be similarly messy.
>>
>>
>> Then I suggest to use exactly one field of a maximum possible size
>> for each reserving object, and get rid of mutually exclusive fields.
>> Then it can be something like that (order and names can be changed):
>> u8 bus;
>> u16 non_pref;
>> u32 io;
>> u64 pref;
>>
>
> I think Michael suggested:
>
> u64 bus_res;
> u64 mem_res;
> u64 io_res;
> u64 mem_pref_res;
>
> OR:
> u32 bus_res;
> u32 mem_res;
> u32 io_res;
> u64 mem_pref_res;
>
>
> We can use 0XFFF..F as "not-set" value "merging" Gerd's and Michael's
> requests.

Let's dwell on the second option (with -1 as 'ignore' sign), if no new
objections.

>
> Thanks,
> Marcel
>
>
>>
>>>
>>>
>>
>>
>
> Hi Michael,
>
>> We also want a way to say "no hint for this type".
>>
>> One way to achive this would be to have instead multiple
>> vendor specific capabilities, one for each of
>> bus#/io/mem/prefetch. 0 would mean do not reserve anything,
>> absence of capability would mean "no info, up to firmware".
>>
>
> First version of

Re: [Qemu-devel] [PULL 00/17] Misc changes for QEMU 2.10-rc1 (?)

2017-08-01 Thread Paolo Bonzini

On 01/08/2017 19:22, Peter Maydell wrote:
> On 1 August 2017 at 18:17, Paolo Bonzini  wrote:
>> Ok, for this patch I kept it consistent within the file, but for 2.11 we
>> can change everything to "while (0)".
> 
> Is this a proposal to bite the bullet and fix all our
> code style issues in existing code? :-)

Only "while (0)" :)

> It would be an interesting question to find out what % of
> QEMU's lines of code have not been touched in the six
> years since we added checkpatch, and how they're
> distributed in the codebase (are there any neat
> visualisations of this sort of thing?) Fix-when-touched
> works OK where we're actively working but can leave
> some big dead spots I suspect.

That's an interesting idea.  Since I've been volunteered to present
about QEMU at KVM Forum, I might give it a shot.

Paolo

Re: [Qemu-devel] [PATCH v2] pc: acpi: force FADT rev1 for 440fx based machine types

2017-08-01 Thread Michael S. Tsirkin

On Fri, Jul 28, 2017 at 02:28:57PM +0200, Igor Mammedov wrote:
> On Wed, 26 Jul 2017 23:18:01 +0300
> "Michael S. Tsirkin"  wrote:
> 
> > On Wed, Jul 26, 2017 at 04:09:37PM +0200, Igor Mammedov wrote:
> > > On Tue, 25 Jul 2017 16:36:06 +0300
> > > "Michael S. Tsirkin"  wrote:
> > >   
> > > > On Mon, Jul 24, 2017 at 03:50:20PM +0200, Igor Mammedov wrote:  
> > > > > w2k used to boot on QEMU until revision of FADT has
> > > > > been bumped to rev3
> > > > > (commit 77af8a2b hw/i386: Use Rev3 FADT (ACPI 2.0) instead of Rev1 to 
> > > > > improve guest OS support.)
> > > > > 
> > > > > Keep PC machine at rev1 to remain compatible and Q35
> > > > > at rev3 where w2k isn't supported anyway so OSX could
> > > > > run as well.
> > > > > 
> > > > > Signed-off-by: Igor Mammedov 
> > > > > ---
> > > > > v2:
> > > > >- make pc rev1 and q35 rev3.
> > > > >"Michael S. Tsirkin" 
> > > > > 
> > > > > PS:
> > > > > Only compile tested.
> > > > > 
> > > > > 
> > > > > CC: Programmingkid 
> > > > > CC: Phil Dennis-Jordan 
> > > > > CC: "Michael S. Tsirkin" 
> > > > > 
> > > > > ---
> > > > >  hw/i386/acpi-build.c | 22 ++
> > > > >  1 file changed, 18 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > > > index 6b7bade..b9c245c 100644
> > > > > --- a/hw/i386/acpi-build.c
> > > > > +++ b/hw/i386/acpi-build.c
> > > > > @@ -90,6 +90,7 @@ typedef struct AcpiMcfgInfo {
> > > > >  } AcpiMcfgInfo;
> > > > >  
> > > > >  typedef struct AcpiPmInfo {
> > > > > +bool force_rev1_fadt;
> > > > >  bool s3_disabled;
> > > > >  bool s4_disabled;
> > > > >  bool pcihp_bridge_en;
> > > > > @@ -129,10 +130,13 @@ static void acpi_get_pm_info(AcpiPmInfo *pm)
> > > > >  Object *obj = NULL;
> > > > >  QObject *o;
> > > > >  
> > > > > +pm->force_rev1_fadt = false;
> > > > >  pm->cpu_hp_io_base = 0;
> > > > >  pm->pcihp_io_base = 0;
> > > > >  pm->pcihp_io_len = 0;
> > > > >  if (piix) {
> > > > > +/* w2k requires FADT(rev1) or it won't boot, keep PC 
> > > > > compatible */
> > > > > +pm->force_rev1_fadt = true;
> > > > >  obj = piix;
> > > > >  pm->cpu_hp_io_base = PIIX4_CPU_HOTPLUG_IO_BASE;
> > > > >  pm->pcihp_io_base =
> > > > > @@ -304,6 +308,9 @@ static void fadt_setup(AcpiFadtDescriptorRev3 
> > > > > *fadt, AcpiPmInfo *pm)
> > > > >  fadt->flags |= cpu_to_le32(1 << 
> > > > > ACPI_FADT_F_FORCE_APIC_CLUSTER_MODEL);
> > > > >  }
> > > > >  fadt->century = RTC_CENTURY;
> > > > > +if (pm->force_rev1_fadt) {
> > > > > +return;
> > > > > +}
> > > > >  
> > > > >  fadt->flags |= cpu_to_le32(1 << ACPI_FADT_F_RESET_REG_SUP);
> > > > >  fadt->reset_value = 0xf;
> > > > > @@ -342,6 +349,8 @@ build_fadt(GArray *table_data, BIOSLinker 
> > > > > *linker, AcpiPmInfo *pm,
> > > > >  unsigned fw_ctrl_offset = (char *)>firmware_ctrl - 
> > > > > table_data->data;
> > > > >  unsigned dsdt_entry_offset = (char *)>dsdt - 
> > > > > table_data->data;
> > > > >  unsigned xdsdt_entry_offset = (char *)>x_dsdt - 
> > > > > table_data->data;
> > > > > +int fadt_size = sizeof(*fadt);
> > > > > +int rev = 3;
> > > > >  
> > > > >  /* FACS address to be filled by Guest linker */
> > > > >  bios_linker_loader_add_pointer(linker,
> > > > > @@ -353,12 +362,17 @@ build_fadt(GArray *table_data, BIOSLinker 
> > > > > *linker, AcpiPmInfo *pm,
> > > > >  bios_linker_loader_add_pointer(linker,
> > > > >  ACPI_BUILD_TABLE_FILE, dsdt_entry_offset, sizeof(fadt->dsdt),
> > > > >  ACPI_BUILD_TABLE_FILE, dsdt_tbl_offset);
> > > > > -bios_linker_loader_add_pointer(linker,
> > > > > -ACPI_BUILD_TABLE_FILE, xdsdt_entry_offset, 
> > > > > sizeof(fadt->x_dsdt),
> > > > > -ACPI_BUILD_TABLE_FILE, dsdt_tbl_offset);
> > > > > +if (pm->force_rev1_fadt) {
> > > > > +rev = 1;
> > > > > +fadt_size = offsetof(typeof(*fadt), reset_register);
> > > > 
> > > > I don't really like it that we are using structs to format fields - I
> > > > would prefer build_append_int_noprefix based APIs - but since we do,
> > > > let's add a proper structure for FADT rev 1 instead of this hack. 
>  
> Michael,
> 
> I've tried to switch FADT to build_append_int_noprefix APIs,
> but it turned out into quite a bit of re-factoring including
> clean-ups and adding new APIs to pack all the stuff in FADT.
> 
> It's too much for hard freeze, lets not rush it and go with
> this patch for 2.10 to put out fire.
> 
> When 2.11 merge window is open I'll rebase/post FADTv1/3
> conversion to build_append_int_noprefix API.

Makes sense.

Re: [Qemu-devel] [PULL 00/17] Misc changes for QEMU 2.10-rc1 (?)

2017-08-01 Thread Peter Maydell

On 1 August 2017 at 18:17, Paolo Bonzini  wrote:
> Ok, for this patch I kept it consistent within the file, but for 2.11 we
> can change everything to "while (0)".

Is this a proposal to bite the bullet and fix all our
code style issues in existing code? :-)

It would be an interesting question to find out what % of
QEMU's lines of code have not been touched in the six
years since we added checkpatch, and how they're
distributed in the codebase (are there any neat
visualisations of this sort of thing?) Fix-when-touched
works OK where we're actively working but can leave
some big dead spots I suspect.

thanks
-- PMM

Re: [Qemu-devel] [PULL 00/17] Misc changes for QEMU 2.10-rc1 (?)

2017-08-01 Thread Paolo Bonzini

On 01/08/2017 19:10, Peter Maydell wrote:
> On 1 August 2017 at 17:50, Paolo Bonzini  wrote:
>> On 01/08/2017 18:48, no-re...@patchew.org wrote:
>>> ERROR: space prohibited before that '++' (ctx:WxB)
>>> #78: FILE: hw/bt/sdp.c:741:
>>> +data[len ++] = attribute_id >> 8;
>>>   ^
>>>
>>> ERROR: space prohibited before that '++' (ctx:WxB)
>>> #79: FILE: hw/bt/sdp.c:742:
>>> +data[len ++] = attribute_id & 0xff;
>>
>> This is the preexisting Bluetooth code... I didn't change the space,
>> should I have done that?
> 
> Judgement call -- I usually fix up existing errors if I'm touching
> a bit of code anyway, unless it's a whitespace-only change or a
> pure code-motion patch.

Me too.  In this case however the code is pretty much untouched so it's
unlikely that it would become consistent one day (and I suspect no one
wants to get on git blame for bluetooth emulation :)).

>>> ERROR: space required before the open parenthesis '('
>>> #73: FILE: tests/rtc-test.c:344:
>>> +} while(0)
>>
>> This seems to be more common than "while (0)" inside macros, should we
>> allow it in checkpatch.pl?
> 
> Overall the space is much more common: 551 examples with the
> space vs 90 without; so I don't think a relaxation of checkpatch
> is particularly justified. I don't think macros need to be any
> different from the rest of our code on things like spacing.

Ok, for this patch I kept it consistent within the file, but for 2.11 we
can change everything to "while (0)".

Paolo

Re: [Qemu-devel] [PULL 00/17] Misc changes for QEMU 2.10-rc1 (?)

2017-08-01 Thread Peter Maydell

On 1 August 2017 at 17:50, Paolo Bonzini  wrote:
> On 01/08/2017 18:48, no-re...@patchew.org wrote:
>> ERROR: space prohibited before that '++' (ctx:WxB)
>> #78: FILE: hw/bt/sdp.c:741:
>> +data[len ++] = attribute_id >> 8;
>>   ^
>>
>> ERROR: space prohibited before that '++' (ctx:WxB)
>> #79: FILE: hw/bt/sdp.c:742:
>> +data[len ++] = attribute_id & 0xff;
>
> This is the preexisting Bluetooth code... I didn't change the space,
> should I have done that?

Judgement call -- I usually fix up existing errors if I'm touching
a bit of code anyway, unless it's a whitespace-only change or a
pure code-motion patch.

>> ERROR: space required before the open parenthesis '('
>> #73: FILE: tests/rtc-test.c:344:
>> +} while(0)
>
> This seems to be more common than "while (0)" inside macros, should we
> allow it in checkpatch.pl?

Overall the space is much more common: 551 examples with the
space vs 90 without; so I don't think a relaxation of checkpatch
is particularly justified. I don't think macros need to be any
different from the rest of our code on things like spacing.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v2 2/2] build-sys: do not compile net/vhost-user.c if vhost-user is disabled

2017-08-01 Thread Marc-André Lureau



- Original Message -
> On Tue, Aug 01, 2017 at 12:15:16PM -0400, Marc-André Lureau wrote:
> > Hi
> > 
> > - Original Message -
> > > On Fri, Jul 28, 2017 at 04:13:09PM +0200, Marc-André Lureau wrote:
> > > > This adds two extra #ifdef that have fairly limited conflict potential.
> > > > 
> > > > Signed-off-by: Marc-André Lureau 
> > > 
> > > OK but why can't we keep all ifdefs in net/ where all the
> > > rest of the features are ifdef'ed?
> > > 
> > 
> > We would need fallback code for vhost_user_get_* functions (see below).
> 
> Not if you merely add ifdefs around name arrays in net/net.c
> I think.

vhost_user_get_* are defined in net/vhost-user.c, linked from hw/net/vhost_net.c

> 
> > > My concern is no one is going to test this weird configuration
> > > so we might accidentally re-enable it without noticing.
> > > net/net.c is where all command line play happens so
> > > people know a bunch of configs need to be tested when
> > > changing it.
> > 
> > I am not sure I understand your argument there. You are afraid we
> > 'accidentaly re-enable' vhost-user... and forget to remove those #ifdef?
> > 
> > I can imagine people don't like having code clutter with #ifdef, but I
> > think when it doesn't change much the logic (as in here with clean
> > per-case switch/if), it is fine, ymmv.
> 
> Let's assume all this code is refactored, we'd have to test
> with ifdefs.

This is true whether there is one or ten #ifdef, isn't it?

It only gets more complicated the more #ifdef there is. But here it's fairly 
limited impact since it's just 2 more #ifdef for the proposed CONFIG_VHOST_USER 
from previous patch.

> 
> > > 
> > > 
> > > > ---
> > > >  hw/net/vhost_net.c | 4 
> > > >  net/Makefile.objs  | 2 +-
> > > >  2 files changed, 5 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> > > > index e037db63a3..96c4da49e4 100644
> > > > --- a/hw/net/vhost_net.c
> > > > +++ b/hw/net/vhost_net.c
> > > > @@ -193,6 +193,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions
> > > > *options)
> > > >  }
> > > >  }
> > > >  
> > > > +#ifdef CONFIG_VHOST_USER
> > > >  /* Set sane init value. Override when guest acks. */
> > > >  if (net->nc->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
> > > >  features = vhost_user_get_acked_features(net->nc);
> > > > @@ -203,6 +204,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions
> > > > *options)
> > > >  goto fail;
> > > >  }
> > > >  }
> > > > +#endif
> > > >  
> > > >  vhost_net_ack_features(net, features);
> > > >  
> > > > @@ -414,10 +416,12 @@ VHostNetState *get_vhost_net(NetClientState *nc)
> > > >  case NET_CLIENT_DRIVER_TAP:
> > > >  vhost_net = tap_get_vhost_net(nc);
> > > >  break;
> > > > +#ifdef CONFIG_VHOST_USER
> > > >  case NET_CLIENT_DRIVER_VHOST_USER:
> > > >  vhost_net = vhost_user_get_vhost_net(nc);
> > > >  assert(vhost_net);
> > > >  break;
> > > > +#endif
> > > >  default:
> > > >  break;
> > > >  }
> > > > diff --git a/net/Makefile.objs b/net/Makefile.objs
> > > > index 67ba5e26fb..7cac7ed1e4 100644
> > > > --- a/net/Makefile.objs
> > > > +++ b/net/Makefile.objs
> > > > @@ -3,7 +3,7 @@ common-obj-y += socket.o
> > > >  common-obj-y += dump.o
> > > >  common-obj-y += eth.o
> > > >  common-obj-$(CONFIG_L2TPV3) += l2tpv3.o
> > > > -common-obj-$(CONFIG_POSIX) += vhost-user.o
> > > > +common-obj-$(CONFIG_VHOST_USER) += vhost-user.o
> > 
> > So we could have something like:
> > 
> > +common-obj-$(not CONFIG_VHOST_USER) += vhost-user-stubs.o
> > 
> > ?
> > 
> > > >  common-obj-$(CONFIG_SLIRP) += slirp.o
> > > >  common-obj-$(CONFIG_VDE) += vde.o
> > > >  common-obj-$(CONFIG_NETMAP) += netmap.o
> > > > --
> > > > 2.14.0.rc0.1.g40ca67566
> > > 
>

Re: [Qemu-devel] [PULL v2 00/14] Block layer patches for 2.10.0-rc1

2017-08-01 Thread Peter Maydell

On 1 August 2017 at 17:12, Kevin Wolf  wrote:
> The following changes since commit 5619c179057e24195ff19c8fe6d6a6cbcb16ed28:
>
>   Merge remote-tracking branch 
> 'remotes/pmaydell/tags/pull-target-arm-20170731' into staging (2017-07-31 
> 14:45:42 +0100)
>
> are available in the git repository at:
>
>   git://repo.or.cz/qemu/kevin.git tags/for-upstream
>
> for you to fetch changes up to 8e8eb0a9035e5b6c6447c82138570e388282cfa2:
>
>   block/qapi: Remove redundant NULL check to silence Coverity (2017-08-01 
> 18:09:33 +0200)
>
> 
> Block layer patches for 2.10.0-rc1
>

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH v3 4/7] block: convert ThrottleGroup to object with QOM

2017-08-01 Thread Manos Pitsidianakis


On Tue, Aug 01, 2017 at 04:47:03PM +0100, Stefan Hajnoczi wrote:

On Mon, Jul 31, 2017 at 12:54:40PM +0300, Manos Pitsidianakis wrote:

ThrottleGroup is converted to an object. This will allow the future
throttle block filter drive easy creation and configuration of throttle
groups in QMP and cli.

A new QAPI struct, ThrottleLimits, is introduced to provide a shared
struct for all throttle configuration needs in QMP.

ThrottleGroups can be created via CLI as
-object throttle-group,id=foo,x-iops-total=100,x-..
where x-* are individual limit properties. Since we can't add non-scalar
properties in -object this interface must be used instead. However,
setting these properties must be disabled after initialization because
certain combinations of limits are forbidden and thus configuration
changes should be done in one transaction. The individual properties
will go away when support for non-scalar values in CLI is implemented
and thus are marked as experimental.

ThrottleGroup also has a `limits` property that uses the ThrottleLimits
struct.  It can be used to create ThrottleGroups or set the
configuration in existing groups as follows:

{ "execute": "object-add",
  "arguments": {
"qom-type": "throttle-group",
"id": "foo",
"props" : {
  "limits": {
  "iops-total": 100
  }
}
  }
}
{ "execute" : "qom-set",
"arguments" : {
"path" : "foo",
"property" : "limits",
"value" : {
"iops-total" : 99
}
}
}

This also means a group's configuration can be fetched with qom-get.

ThrottleGroups can be anonymous which means they can't get accessed by
other users ie they will always be units instead of group (Because they
have one ThrottleGroupMember).


blockdev.c automatically assigns -drive id= to the group name if
throttling.group= wasn't given.  So who will use anonymous single-drive
mode?


Manual filter nodes. It's possible to not pass a group name value and 
the resulting group will be anonymous. Are you suggesting to move this 
change to the throttle filter patch?



Signed-off-by: Manos Pitsidianakis 
---
Notes:
Note: I tested Markus Armbruster's patch in 
<87wp7fghi9@dusky.pond.sub.org>
on master and I can use this syntax successfuly:
-object '{ "qom-type" : "throttle-group", "id" : "foo", "props" : { 
"limits" \
: { "iops-total" : 1000 } } }'
If this gets merged using -object will be a little more verbose but at 
least we
won't have seperate properties, which is a good thing, so the x-* should be
dropped.

 block/throttle-groups.c | 402 
 include/block/throttle-groups.h |   3 +
 include/qemu/throttle-options.h |  59 --
 include/qemu/throttle.h |   3 +
 qapi/block-core.json|  48 +
 tests/test-throttle.c   |   1 +
 util/throttle.c | 151 +++
 7 files changed, 617 insertions(+), 50 deletions(-)

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index f711a3dc62..b9c5036e44 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -25,9 +25,17 @@
 #include "qemu/osdep.h"
 #include "sysemu/block-backend.h"
 #include "block/throttle-groups.h"
+#include "qemu/throttle-options.h"
 #include "qemu/queue.h"
 #include "qemu/thread.h"
 #include "sysemu/qtest.h"
+#include "qapi/error.h"
+#include "qapi-visit.h"
+#include "qom/object.h"
+#include "qom/object_interfaces.h"
+
+static void throttle_group_obj_init(Object *obj);
+static void throttle_group_obj_complete(UserCreatable *obj, Error **errp);

 /* The ThrottleGroup structure (with its ThrottleState) is shared
  * among different ThrottleGroupMembers and it's independent from
@@ -54,6 +62,10 @@
  * that ThrottleGroupMember has throttled requests in the queue.
  */
 typedef struct ThrottleGroup {
+Object parent_obj;
+
+/* refuse individual property change if initialization is complete */
+bool is_initialized;
 char *name; /* This is constant during the lifetime of the group */

 QemuMutex lock; /* This lock protects the following four fields */
@@ -63,8 +75,7 @@ typedef struct ThrottleGroup {
 bool any_timer_armed[2];
 QEMUClockType clock_type;

-/* These two are protected by the global throttle_groups_lock */
-unsigned refcount;
+/* This is protected by the global throttle_groups_lock */
 QTAILQ_ENTRY(ThrottleGroup) list;
 } ThrottleGroup;

@@ -77,7 +88,8 @@ static QTAILQ_HEAD(, ThrottleGroup) throttle_groups =
  * If no ThrottleGroup is found with the given name a new one is
  * created.
  *
- * @name: the name of the ThrottleGroup
+ * @name: the name of the ThrottleGroup, NULL means a new anonymous group will
+ *be created.
  * @ret:  the ThrottleState member of the ThrottleGroup
  */
 ThrottleState *throttle_group_incref(const char *name)
@@ -87,32 +99,30 @@ ThrottleState *throttle_group_incref(const char *name)

 qemu_mutex_lock(_groups_lock);

Re: [Qemu-devel] [PULL 00/17] Misc changes for QEMU 2.10-rc1 (?)

2017-08-01 Thread Paolo Bonzini

On 01/08/2017 18:48, no-re...@patchew.org wrote:
> === OUTPUT BEGIN ===
> Checking PATCH 1/17: vl.c/exit: pause cpus before closing block devices...
> Checking PATCH 2/17: cpu_physical_memory_sync_dirty_bitmap: Fix alignment 
> check...
> Checking PATCH 3/17: accel: cleanup error output...
> Checking PATCH 4/17: char-fd: remove useless chr pointer...
> Checking PATCH 5/17: char: don't exit on hmp 'chardev-add help'...
> Checking PATCH 6/17: docs: document deprecation policy & deprecated features 
> in appendix...
> Checking PATCH 7/17: target-i386: kvm_get/put_vcpu_events don't handle 
> sipi_vector...
> Checking PATCH 8/17: exec: Add lock parameter to qemu_ram_ptr_length...
> Checking PATCH 9/17: bt: stop the sdp memory allocation craziness...
> ERROR: space prohibited before that '++' (ctx:WxB)
> #78: FILE: hw/bt/sdp.c:741:
> +data[len ++] = attribute_id >> 8;
>   ^
> 
> ERROR: space prohibited before that '++' (ctx:WxB)
> #79: FILE: hw/bt/sdp.c:742:
> +data[len ++] = attribute_id & 0xff;

This is the preexisting Bluetooth code... I didn't change the space,
should I have done that?

> Your patch has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> Checking PATCH 10/17: qemu-options: document existance of versioned machine 
> types...
> Checking PATCH 11/17: migration: optimize the downtime...
> Checking PATCH 12/17: hw/scsi/vmw_pvscsi: Remove the dead error handling...
> Checking PATCH 13/17: hw/scsi/vmw_pvscsi: Convert to realize...
> Checking PATCH 14/17: rtc-test: cleanup register_b_set_flag test...
> ERROR: space required before the open parenthesis '('
> #73: FILE: tests/rtc-test.c:344:
> +} while(0)

This seems to be more common than "while (0)" inside macros, should we
allow it in checkpatch.pl?

Paolo

> total: 1 errors, 0 warnings, 115 lines checked
> 
> Your patch has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> Checking PATCH 15/17: rtc-test: introduce more update tests...
> Checking PATCH 16/17: mc146818rtc: simplify check_update_timer...
> Checking PATCH 17/17: mc146818rtc: implement UIP latching as intended...
> === OUTPUT END ===
> 
> Test command exited with code: 1
> 
> 
> ---
> Email generated automatically by Patchew [http://patchew.org/].
> Please send your feedback to patchew-de...@freelists.org
>

Re: [Qemu-devel] [PULL 00/17] Misc changes for QEMU 2.10-rc1 (?)

2017-08-01 Thread no-reply

Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PULL 00/17] Misc changes for QEMU 2.10-rc1 (?)
Message-id: 1501604245-33460-1-git-send-email-pbonz...@redhat.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
3489547b43 mc146818rtc: implement UIP latching as intended
ed88cc342f mc146818rtc: simplify check_update_timer
818434f924 rtc-test: introduce more update tests
d2208f1403 rtc-test: cleanup register_b_set_flag test
acd549e69e hw/scsi/vmw_pvscsi: Convert to realize
07345966fe hw/scsi/vmw_pvscsi: Remove the dead error handling
7651c26752 migration: optimize the downtime
c24276b69b qemu-options: document existance of versioned machine types
6240b43885 bt: stop the sdp memory allocation craziness
8e3ab7f33e exec: Add lock parameter to qemu_ram_ptr_length
c2d7f542e9 target-i386: kvm_get/put_vcpu_events don't handle sipi_vector
caaa83b53e docs: document deprecation policy & deprecated features in appendix
e085726add char: don't exit on hmp 'chardev-add help'
b9ba7ecb28 char-fd: remove useless chr pointer
ee56cb9f35 accel: cleanup error output
235bccb31d cpu_physical_memory_sync_dirty_bitmap: Fix alignment check
99135ab7e2 vl.c/exit: pause cpus before closing block devices

=== OUTPUT BEGIN ===
Checking PATCH 1/17: vl.c/exit: pause cpus before closing block devices...
Checking PATCH 2/17: cpu_physical_memory_sync_dirty_bitmap: Fix alignment 
check...
Checking PATCH 3/17: accel: cleanup error output...
Checking PATCH 4/17: char-fd: remove useless chr pointer...
Checking PATCH 5/17: char: don't exit on hmp 'chardev-add help'...
Checking PATCH 6/17: docs: document deprecation policy & deprecated features in 
appendix...
Checking PATCH 7/17: target-i386: kvm_get/put_vcpu_events don't handle 
sipi_vector...
Checking PATCH 8/17: exec: Add lock parameter to qemu_ram_ptr_length...
Checking PATCH 9/17: bt: stop the sdp memory allocation craziness...
ERROR: space prohibited before that '++' (ctx:WxB)
#78: FILE: hw/bt/sdp.c:741:
+data[len ++] = attribute_id >> 8;
  ^

ERROR: space prohibited before that '++' (ctx:WxB)
#79: FILE: hw/bt/sdp.c:742:
+data[len ++] = attribute_id & 0xff;
  ^

total: 2 errors, 0 warnings, 48 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 10/17: qemu-options: document existance of versioned machine 
types...
Checking PATCH 11/17: migration: optimize the downtime...
Checking PATCH 12/17: hw/scsi/vmw_pvscsi: Remove the dead error handling...
Checking PATCH 13/17: hw/scsi/vmw_pvscsi: Convert to realize...
Checking PATCH 14/17: rtc-test: cleanup register_b_set_flag test...
ERROR: space required before the open parenthesis '('
#73: FILE: tests/rtc-test.c:344:
+} while(0)

total: 1 errors, 0 warnings, 115 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 15/17: rtc-test: introduce more update tests...
Checking PATCH 16/17: mc146818rtc: simplify check_update_timer...
Checking PATCH 17/17: mc146818rtc: implement UIP latching as intended...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] Does qemu guest agent support 'guest-exec'?

2017-08-01 Thread Michael Roth

Quoting Hu, Robert (2017-08-01 02:15:01)
> Hi,
> 
> qemu/scripts/qmp/qemu-ga-client seems only support "cat, fsfreeze, fstrim, 
> halt, ifconfig, info, ping, powerdown, reboot, shutdown, suspend".
> 
> But from qemu/qga/commands.c seems at least Linux guest should already 
> support this. Despite qemu-ga-client, how can I talk to guest-agent in guest 
> to execute some program? any other utils?

qemu-ga-client is more of a helper script to make it easier to execute
things from cmdline and hasn't been updated to support guest-exec. But
the official API is documented in qga/qapi-schema.json and involves
talking to qemu-ga directly via JSON commands. A simple example for
guest-exec would be something like:

mdroth@loki:~$ sudo nc -U /tmp/vm3-qga.sock
{'execute':'guest-exec','arguments':{'path':'ip','arg': ['addr', 'show', 
'eth0'],'capture-output':true}}
{"return": {"pid": 1462}}
{'execute':'guest-exec-status','arguments':{'pid':1462}}
{"return": {"exitcode": 0, "out-data": 
"MjogZXRoMDogPEJST0FEQ0FTVCxNVUxUSUNBU1QsVVAsTE9XRVJfVVA+IG10dSAxNTAwIHFkaXNjIHBmaWZvX2Zhc3Qgc3RhdGUgVVAgZ3JvdXAgZGVmYXVsdCBxbGVuIDEwMDAKICAgIGxpbmsvZXRoZXIgNTI6NTQ6MDA6MTI6MzQ6MDMgYnJkIGZmOmZmOmZmOmZmOmZmOmZmCiAgICBpbmV0IDE5Mi4xNjguMTIyLjEzLzI0IGJyZCAxOTIuMTY4LjEyMi4yNTUgc2NvcGUgZ2xvYmFsIGR5bmFtaWMgZXRoMAogICAgICAgdmFsaWRfbGZ0IDMwNjRzZWMgcHJlZmVycmVkX2xmdCAzMDY0c2VjCiAgICBpbmV0NiBmZTgwOjo1MDU0OmZmOmZlMTI6MzQwMy82NCBzY29wZSBsaW5rIAogICAgICAgdmFsaWRfbGZ0IGZvcmV2ZXIgcHJlZmVycmVkX2xmdCBmb3JldmVyCg==",
 "exited": true}}
^C
mdroth@loki:~$ cat < mtu 1500 qdisc pfifo_fast state UP 
group default qlen 1000
link/ether 52:54:00:12:34:03 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.13/24 brd 192.168.122.255 scope global dynamic eth0
   valid_lft 3064sec preferred_lft 3064sec
inet6 fe80::5054:ff:fe12:3403/64 scope link 
   valid_lft forever preferred_lft forever


> 
> Best Regards,
> Robert Hoo
> 
>

Re: [Qemu-devel] [PATCH v2 2/2] build-sys: do not compile net/vhost-user.c if vhost-user is disabled

2017-08-01 Thread Michael S. Tsirkin

On Tue, Aug 01, 2017 at 12:15:16PM -0400, Marc-André Lureau wrote:
> Hi
> 
> - Original Message -
> > On Fri, Jul 28, 2017 at 04:13:09PM +0200, Marc-André Lureau wrote:
> > > This adds two extra #ifdef that have fairly limited conflict potential.
> > > 
> > > Signed-off-by: Marc-André Lureau 
> > 
> > OK but why can't we keep all ifdefs in net/ where all the
> > rest of the features are ifdef'ed?
> > 
> 
> We would need fallback code for vhost_user_get_* functions (see below).

Not if you merely add ifdefs around name arrays in net/net.c
I think.

> > My concern is no one is going to test this weird configuration
> > so we might accidentally re-enable it without noticing.
> > net/net.c is where all command line play happens so
> > people know a bunch of configs need to be tested when
> > changing it.
> 
> I am not sure I understand your argument there. You are afraid we 
> 'accidentaly re-enable' vhost-user... and forget to remove those #ifdef?
> 
> I can imagine people don't like having code clutter with #ifdef, but I think 
> when it doesn't change much the logic (as in here with clean per-case 
> switch/if), it is fine, ymmv.

Let's assume all this code is refactored, we'd have to test
with ifdefs.


> > 
> > 
> > > ---
> > >  hw/net/vhost_net.c | 4 
> > >  net/Makefile.objs  | 2 +-
> > >  2 files changed, 5 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> > > index e037db63a3..96c4da49e4 100644
> > > --- a/hw/net/vhost_net.c
> > > +++ b/hw/net/vhost_net.c
> > > @@ -193,6 +193,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions
> > > *options)
> > >  }
> > >  }
> > >  
> > > +#ifdef CONFIG_VHOST_USER
> > >  /* Set sane init value. Override when guest acks. */
> > >  if (net->nc->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
> > >  features = vhost_user_get_acked_features(net->nc);
> > > @@ -203,6 +204,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions
> > > *options)
> > >  goto fail;
> > >  }
> > >  }
> > > +#endif
> > >  
> > >  vhost_net_ack_features(net, features);
> > >  
> > > @@ -414,10 +416,12 @@ VHostNetState *get_vhost_net(NetClientState *nc)
> > >  case NET_CLIENT_DRIVER_TAP:
> > >  vhost_net = tap_get_vhost_net(nc);
> > >  break;
> > > +#ifdef CONFIG_VHOST_USER
> > >  case NET_CLIENT_DRIVER_VHOST_USER:
> > >  vhost_net = vhost_user_get_vhost_net(nc);
> > >  assert(vhost_net);
> > >  break;
> > > +#endif
> > >  default:
> > >  break;
> > >  }
> > > diff --git a/net/Makefile.objs b/net/Makefile.objs
> > > index 67ba5e26fb..7cac7ed1e4 100644
> > > --- a/net/Makefile.objs
> > > +++ b/net/Makefile.objs
> > > @@ -3,7 +3,7 @@ common-obj-y += socket.o
> > >  common-obj-y += dump.o
> > >  common-obj-y += eth.o
> > >  common-obj-$(CONFIG_L2TPV3) += l2tpv3.o
> > > -common-obj-$(CONFIG_POSIX) += vhost-user.o
> > > +common-obj-$(CONFIG_VHOST_USER) += vhost-user.o
> 
> So we could have something like:
> 
> +common-obj-$(not CONFIG_VHOST_USER) += vhost-user-stubs.o
> 
> ?
> 
> > >  common-obj-$(CONFIG_SLIRP) += slirp.o
> > >  common-obj-$(CONFIG_VDE) += vde.o
> > >  common-obj-$(CONFIG_NETMAP) += netmap.o
> > > --
> > > 2.14.0.rc0.1.g40ca67566
> >

[Qemu-devel] [PULL 17/17] mc146818rtc: implement UIP latching as intended

2017-08-01 Thread Paolo Bonzini

In some cases, the guest can observe the wrong ordering of UIP and
interrupts.  This can happen if the VCPU exit is timed like this:

   iothread VCPU
  ... wait for interrupt ...
t-100ns   read register A
t  wake up, take BQL
t+100ns update_in_progress
  return false
return UIP=0
   trigger interrupt

The interrupt is late; the VCPU expected the falling edge of UIP to
happen after the interrupt.  update_in_progress is already trying to
cover this case by latching UIP if the timer is going to fire soon,
and the fix is documented in the commit message for commit 56038ef623
("RTC: Update the RTC clock only when reading it", 2012-09-10).  It
cannot be tested with qtest, because its timing of interrupts vs. reads
is exact.

However, the implementation was incorrect because UIP cmos_ioport_read
cleared register A instead of leaving that to rtc_update_timer.  Fixing
the implementation of cmos_ioport_read to match the commit message,
however, breaks the "uip-stuck" test case from the previous patch.
To fix it, skip update timer optimizations if UIP has been latched.

Signed-off-by: Paolo Bonzini 
---
 hw/timer/mc146818rtc.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
index ffb2c6a..82843ed 100644
--- a/hw/timer/mc146818rtc.c
+++ b/hw/timer/mc146818rtc.c
@@ -294,6 +294,7 @@ static void check_update_timer(RTCState *s)
  * them to occur.
  */
 if ((s->cmos_data[RTC_REG_A] & 0x60) == 0x60) {
+assert((s->cmos_data[RTC_REG_A] & REG_A_UIP) == 0);
 timer_del(s->update_timer);
 return;
 }
@@ -309,8 +310,12 @@ static void check_update_timer(RTCState *s)
 s->next_alarm_time = next_update_time +
  (next_alarm_sec - 1) * NANOSECONDS_PER_SECOND;
 
-/* If UF is already set, we might be able to optimize. */
-if (s->cmos_data[RTC_REG_C] & REG_C_UF) {
+/* If update_in_progress latched the UIP bit, we must keep the timer
+ * programmed to the next second, so that UIP is cleared.  Otherwise,
+ * if UF is already set, we might be able to optimize.
+ */
+if (!(s->cmos_data[RTC_REG_A] & REG_A_UIP) &&
+(s->cmos_data[RTC_REG_C] & REG_C_UF)) {
 /* If AF cannot change (i.e. either it is set already, or
  * SET=1 and then the time is not updated), nothing to do.
  */
@@ -725,12 +730,10 @@ static uint64_t cmos_ioport_read(void *opaque, hwaddr 
addr,
 ret = s->cmos_data[s->cmos_index];
 break;
 case RTC_REG_A:
+ret = s->cmos_data[s->cmos_index];
 if (update_in_progress(s)) {
-s->cmos_data[s->cmos_index] |= REG_A_UIP;
-} else {
-s->cmos_data[s->cmos_index] &= ~REG_A_UIP;
+ret |= REG_A_UIP;
 }
-ret = s->cmos_data[s->cmos_index];
 break;
 case RTC_REG_C:
 ret = s->cmos_data[s->cmos_index];
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH v3 4/7] block: convert ThrottleGroup to object with QOM

2017-08-01 Thread Stefan Hajnoczi

On Mon, Jul 31, 2017 at 12:54:40PM +0300, Manos Pitsidianakis wrote:
> ThrottleGroup is converted to an object. This will allow the future
> throttle block filter drive easy creation and configuration of throttle
> groups in QMP and cli.
> 
> A new QAPI struct, ThrottleLimits, is introduced to provide a shared
> struct for all throttle configuration needs in QMP.
> 
> ThrottleGroups can be created via CLI as
> -object throttle-group,id=foo,x-iops-total=100,x-..
> where x-* are individual limit properties. Since we can't add non-scalar
> properties in -object this interface must be used instead. However,
> setting these properties must be disabled after initialization because
> certain combinations of limits are forbidden and thus configuration
> changes should be done in one transaction. The individual properties
> will go away when support for non-scalar values in CLI is implemented
> and thus are marked as experimental.
> 
> ThrottleGroup also has a `limits` property that uses the ThrottleLimits
> struct.  It can be used to create ThrottleGroups or set the
> configuration in existing groups as follows:
> 
> { "execute": "object-add",
>   "arguments": {
> "qom-type": "throttle-group",
> "id": "foo",
> "props" : {
>   "limits": {
>   "iops-total": 100
>   }
> }
>   }
> }
> { "execute" : "qom-set",
> "arguments" : {
> "path" : "foo",
> "property" : "limits",
> "value" : {
> "iops-total" : 99
> }
> }
> }
> 
> This also means a group's configuration can be fetched with qom-get.
> 
> ThrottleGroups can be anonymous which means they can't get accessed by
> other users ie they will always be units instead of group (Because they
> have one ThrottleGroupMember).

blockdev.c automatically assigns -drive id= to the group name if
throttling.group= wasn't given.  So who will use anonymous single-drive
mode?

> Signed-off-by: Manos Pitsidianakis 
> ---
> Notes:
> Note: I tested Markus Armbruster's patch in 
> <87wp7fghi9@dusky.pond.sub.org>
> on master and I can use this syntax successfuly:
> -object '{ "qom-type" : "throttle-group", "id" : "foo", "props" : { 
> "limits" \
> : { "iops-total" : 1000 } } }'
> If this gets merged using -object will be a little more verbose but at 
> least we
> won't have seperate properties, which is a good thing, so the x-* should 
> be
> dropped.
> 
>  block/throttle-groups.c | 402 
> 
>  include/block/throttle-groups.h |   3 +
>  include/qemu/throttle-options.h |  59 --
>  include/qemu/throttle.h |   3 +
>  qapi/block-core.json|  48 +
>  tests/test-throttle.c   |   1 +
>  util/throttle.c | 151 +++
>  7 files changed, 617 insertions(+), 50 deletions(-)
> 
> diff --git a/block/throttle-groups.c b/block/throttle-groups.c
> index f711a3dc62..b9c5036e44 100644
> --- a/block/throttle-groups.c
> +++ b/block/throttle-groups.c
> @@ -25,9 +25,17 @@
>  #include "qemu/osdep.h"
>  #include "sysemu/block-backend.h"
>  #include "block/throttle-groups.h"
> +#include "qemu/throttle-options.h"
>  #include "qemu/queue.h"
>  #include "qemu/thread.h"
>  #include "sysemu/qtest.h"
> +#include "qapi/error.h"
> +#include "qapi-visit.h"
> +#include "qom/object.h"
> +#include "qom/object_interfaces.h"
> +
> +static void throttle_group_obj_init(Object *obj);
> +static void throttle_group_obj_complete(UserCreatable *obj, Error **errp);
>  
>  /* The ThrottleGroup structure (with its ThrottleState) is shared
>   * among different ThrottleGroupMembers and it's independent from
> @@ -54,6 +62,10 @@
>   * that ThrottleGroupMember has throttled requests in the queue.
>   */
>  typedef struct ThrottleGroup {
> +Object parent_obj;
> +
> +/* refuse individual property change if initialization is complete */
> +bool is_initialized;
>  char *name; /* This is constant during the lifetime of the group */
>  
>  QemuMutex lock; /* This lock protects the following four fields */
> @@ -63,8 +75,7 @@ typedef struct ThrottleGroup {
>  bool any_timer_armed[2];
>  QEMUClockType clock_type;
>  
> -/* These two are protected by the global throttle_groups_lock */
> -unsigned refcount;
> +/* This is protected by the global throttle_groups_lock */
>  QTAILQ_ENTRY(ThrottleGroup) list;
>  } ThrottleGroup;
>  
> @@ -77,7 +88,8 @@ static QTAILQ_HEAD(, ThrottleGroup) throttle_groups =
>   * If no ThrottleGroup is found with the given name a new one is
>   * created.
>   *
> - * @name: the name of the ThrottleGroup
> + * @name: the name of the ThrottleGroup, NULL means a new anonymous group 
> will
> + *be created.
>   * @ret:  the ThrottleState member of the ThrottleGroup
>   */
>  ThrottleState *throttle_group_incref(const char *name)
> @@ -87,32 +99,30 @@ ThrottleState *throttle_group_incref(const char *name)
>  
>

[Qemu-devel] [PULL 12/17] hw/scsi/vmw_pvscsi: Remove the dead error handling

2017-08-01 Thread Paolo Bonzini

From: Mao Zhongyi 

qemu_bh_new() is a wrapper around aio_bh_new(), which returns
null only when g_new() does. It doesn't. So remove the dead
error handling.

Reviewed-by: Dmitry Fleytman 
Cc: Paolo Bonzini 
Cc: Markus Armbruster 
Signed-off-by: Mao Zhongyi 
Message-Id: <20170726084153.10121-1-maozy.f...@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/vmw_pvscsi.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
index 4a106da..d92973e 100644
--- a/hw/scsi/vmw_pvscsi.c
+++ b/hw/scsi/vmw_pvscsi.c
@@ -1138,10 +1138,6 @@ pvscsi_init(PCIDevice *pci_dev)
 }
 
 s->completion_worker = qemu_bh_new(pvscsi_process_completion_queue, s);
-if (!s->completion_worker) {
-pvscsi_cleanup_msi(s);
-return -ENOMEM;
-}
 
 scsi_bus_new(>bus, sizeof(s->bus), DEVICE(pci_dev),
  _scsi_info, NULL);
-- 
1.8.3.1

[Qemu-devel] [PULL 16/17] mc146818rtc: simplify check_update_timer

2017-08-01 Thread Paolo Bonzini

Move all the optimized cases together, since they all have UF=1 in
common.

Signed-off-by: Paolo Bonzini 
---
 hw/timer/mc146818rtc.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
index 1b8d3d7..ffb2c6a 100644
--- a/hw/timer/mc146818rtc.c
+++ b/hw/timer/mc146818rtc.c
@@ -291,26 +291,14 @@ static void check_update_timer(RTCState *s)
 
 /* From the data sheet: "Holding the dividers in reset prevents
  * interrupts from operating, while setting the SET bit allows"
- * them to occur.  However, it will prevent an alarm interrupt
- * from occurring, because the time of day is not updated.
+ * them to occur.
  */
 if ((s->cmos_data[RTC_REG_A] & 0x60) == 0x60) {
 timer_del(s->update_timer);
 return;
 }
-if ((s->cmos_data[RTC_REG_C] & REG_C_UF) &&
-(s->cmos_data[RTC_REG_B] & REG_B_SET)) {
-timer_del(s->update_timer);
-return;
-}
-if ((s->cmos_data[RTC_REG_C] & REG_C_UF) &&
-(s->cmos_data[RTC_REG_C] & REG_C_AF)) {
-timer_del(s->update_timer);
-return;
-}
 
 guest_nsec = get_guest_rtc_ns(s) % NANOSECONDS_PER_SECOND;
-/* if UF is clear, reprogram to next second */
 next_update_time = qemu_clock_get_ns(rtc_clock)
 + NANOSECONDS_PER_SECOND - guest_nsec;
 
@@ -321,7 +309,17 @@ static void check_update_timer(RTCState *s)
 s->next_alarm_time = next_update_time +
  (next_alarm_sec - 1) * NANOSECONDS_PER_SECOND;
 
+/* If UF is already set, we might be able to optimize. */
 if (s->cmos_data[RTC_REG_C] & REG_C_UF) {
+/* If AF cannot change (i.e. either it is set already, or
+ * SET=1 and then the time is not updated), nothing to do.
+ */
+if ((s->cmos_data[RTC_REG_B] & REG_B_SET) ||
+(s->cmos_data[RTC_REG_C] & REG_C_AF)) {
+timer_del(s->update_timer);
+return;
+}
+
 /* UF is set, but AF is clear.  Program the timer to target
  * the alarm time.  */
 next_update_time = s->next_alarm_time;
-- 
1.8.3.1

[Qemu-devel] [PULL 07/17] target-i386: kvm_get/put_vcpu_events don't handle sipi_vector

2017-08-01 Thread Paolo Bonzini

From: Peng Hao 

qemu call kvm_get_vcpu_events, and kernel return sipi_vector always
0, never valid when reporting to user space. But when qemu calls
kvm_put_vcpu_events will make sipi_vector in kernel be 0. This will
accidently modify sipi_vector when sipi_vector in kernel is not 0.

Signed-off-by: Peng Hao 
Reviewed-by: Liu Yi 
Message-Id: <1500047256-8911-1-git-send-email-peng.h...@zte.com.cn>
Signed-off-by: Paolo Bonzini 
---
 target/i386/kvm.c | 35 ---
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index a6613e1..6db7783 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -2444,8 +2444,10 @@ static int kvm_put_vcpu_events(X86CPU *cpu, int level)
 }
 
 if (level >= KVM_PUT_RESET_STATE) {
-events.flags |=
-KVM_VCPUEVENT_VALID_NMI_PENDING | KVM_VCPUEVENT_VALID_SIPI_VECTOR;
+events.flags |= KVM_VCPUEVENT_VALID_NMI_PENDING;
+if (env->mp_state == KVM_MP_STATE_SIPI_RECEIVED) {
+events.flags |= KVM_VCPUEVENT_VALID_SIPI_VECTOR;
+}
 }
 
 return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, );
@@ -2633,6 +2635,10 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 if (ret < 0) {
 return ret;
 }
+ret = kvm_put_vcpu_events(x86_cpu, level);
+if (ret < 0) {
+return ret;
+}
 if (level >= KVM_PUT_RESET_STATE) {
 ret = kvm_put_mp_state(x86_cpu);
 if (ret < 0) {
@@ -2644,11 +2650,6 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 if (ret < 0) {
 return ret;
 }
-
-ret = kvm_put_vcpu_events(x86_cpu, level);
-if (ret < 0) {
-return ret;
-}
 ret = kvm_put_debugregs(x86_cpu);
 if (ret < 0) {
 return ret;
@@ -2668,35 +2669,39 @@ int kvm_arch_get_registers(CPUState *cs)
 
 assert(cpu_is_stopped(cs) || qemu_cpu_is_self(cs));
 
-ret = kvm_getput_regs(cpu, 0);
+ret = kvm_get_vcpu_events(cpu);
 if (ret < 0) {
 goto out;
 }
-ret = kvm_get_xsave(cpu);
+/*
+ * KVM_GET_MPSTATE can modify CS and RIP, call it before
+ * KVM_GET_REGS and KVM_GET_SREGS.
+ */
+ret = kvm_get_mp_state(cpu);
 if (ret < 0) {
 goto out;
 }
-ret = kvm_get_xcrs(cpu);
+ret = kvm_getput_regs(cpu, 0);
 if (ret < 0) {
 goto out;
 }
-ret = kvm_get_sregs(cpu);
+ret = kvm_get_xsave(cpu);
 if (ret < 0) {
 goto out;
 }
-ret = kvm_get_msrs(cpu);
+ret = kvm_get_xcrs(cpu);
 if (ret < 0) {
 goto out;
 }
-ret = kvm_get_mp_state(cpu);
+ret = kvm_get_sregs(cpu);
 if (ret < 0) {
 goto out;
 }
-ret = kvm_get_apic(cpu);
+ret = kvm_get_msrs(cpu);
 if (ret < 0) {
 goto out;
 }
-ret = kvm_get_vcpu_events(cpu);
+ret = kvm_get_apic(cpu);
 if (ret < 0) {
 goto out;
 }
-- 
1.8.3.1

[Qemu-devel] [PULL 09/17] bt: stop the sdp memory allocation craziness

2017-08-01 Thread Paolo Bonzini

Clang static analyzer reports a memory leak.  Actually, the allocated
memory escapes here:

record->attribute_list[record->attributes].pair = data;

but clang is correct that the memory might leak if len is zero.  We
know it isn't; assert that it is the case.

The craziness doesn't end there.  The memory is freed by
bt_l2cap_sdp_close_ch:

   g_free(sdp->service_list[i].attribute_list->pair);

which actually should have been written like this:

   g_free(sdp->service_list[i].attribute_list[0].pair);

The attribute_list is sorted with qsort; but indeed the first
entry of attribute_list should point to "data" even after the qsort,
because the first record has id SDP_ATTR_RECORD_HANDLE, whose
numeric value is zero.

But hang on.  The qsort function is

static int sdp_attributeid_compare(
const struct sdp_service_attribute_s *a,
const struct sdp_service_attribute_s *b)
{
return (int) b->attribute_id - a->attribute_id;
}

but no one ever writes attribute_id.  So it only works if qsort is
stable, and who knows what else is broken, but we can fix it by
setting attribute_id in the while loop.

Signed-off-by: Paolo Bonzini 
---
 hw/bt/sdp.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/hw/bt/sdp.c b/hw/bt/sdp.c
index f67b3b8..3cb60b9 100644
--- a/hw/bt/sdp.c
+++ b/hw/bt/sdp.c
@@ -580,7 +580,7 @@ static void bt_l2cap_sdp_close_ch(void *opaque)
 int i;
 
 for (i = 0; i < sdp->services; i ++) {
-g_free(sdp->service_list[i].attribute_list->pair);
+g_free(sdp->service_list[i].attribute_list[0].pair);
 g_free(sdp->service_list[i].attribute_list);
 g_free(sdp->service_list[i].uuid);
 }
@@ -720,6 +720,8 @@ static void sdp_service_record_build(struct 
sdp_service_record_s *record,
 len += sdp_attr_max_size(>attributes[record->attributes ++].data,
 >uuids);
 }
+
+assert(len > 0);
 record->uuids = pow2ceil(record->uuids);
 record->attribute_list =
 g_malloc0(record->attributes * sizeof(*record->attribute_list));
@@ -730,12 +732,14 @@ static void sdp_service_record_build(struct 
sdp_service_record_s *record,
 record->attributes = 0;
 uuid = record->uuid;
 while (def->attributes[record->attributes].data.type) {
+int attribute_id = def->attributes[record->attributes].id;
 record->attribute_list[record->attributes].pair = data;
+record->attribute_list[record->attributes].attribute_id = attribute_id;
 
 len = 0;
 data[len ++] = SDP_DTYPE_UINT | SDP_DSIZE_2;
-data[len ++] = def->attributes[record->attributes].id >> 8;
-data[len ++] = def->attributes[record->attributes].id & 0xff;
+data[len ++] = attribute_id >> 8;
+data[len ++] = attribute_id & 0xff;
 len += sdp_attr_write(data + len,
 >attributes[record->attributes].data, );
 
@@ -749,10 +753,15 @@ static void sdp_service_record_build(struct 
sdp_service_record_s *record,
 data += len;
 }
 
-/* Sort the attribute list by the AttributeID */
+/* Sort the attribute list by the AttributeID.  The first must be
+ * SDP_ATTR_RECORD_HANDLE so that bt_l2cap_sdp_close_ch can free
+ * the buffer.
+ */
 qsort(record->attribute_list, record->attributes,
 sizeof(*record->attribute_list),
 (void *) sdp_attributeid_compare);
+assert(record->attribute_list[0].pair == data);
+
 /* Sort the searchable UUIDs list for bisection */
 qsort(record->uuid, record->uuids,
 sizeof(*record->uuid),
-- 
1.8.3.1

[Qemu-devel] [PULL 08/17] exec: Add lock parameter to qemu_ram_ptr_length

2017-08-01 Thread Paolo Bonzini

From: Anthony PERARD 

Commit 04bf2526ce87f21b32c9acba1c5518708c243ad0 (exec: use
qemu_ram_ptr_length to access guest ram) start using qemu_ram_ptr_length
instead of qemu_map_ram_ptr, but when used with Xen, the behavior of
both function is different. They both call xen_map_cache, but one with
"lock", meaning the mapping of guest memory is never released
implicitly, and the second one without, which means, mapping can be
release later, when needed.

In the context of address_space_{read,write}_continue, the ptr to those
mapping should not be locked because it is used immediatly and never
used again.

The lock parameter make it explicit in which context qemu_ram_ptr_length
is called.

Signed-off-by: Anthony PERARD 
Message-Id: <20170726165326.10327-1-anthony.per...@citrix.com>
Reviewed-by: Stefano Stabellini 
Signed-off-by: Paolo Bonzini 
---
 exec.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index 01ac21e..d20c34c 100644
--- a/exec.c
+++ b/exec.c
@@ -2203,7 +2203,7 @@ void *qemu_map_ram_ptr(RAMBlock *ram_block, ram_addr_t 
addr)
  * Called within RCU critical section.
  */
 static void *qemu_ram_ptr_length(RAMBlock *ram_block, ram_addr_t addr,
- hwaddr *size)
+ hwaddr *size, bool lock)
 {
 RAMBlock *block = ram_block;
 if (*size == 0) {
@@ -,10 +,10 @@ static void *qemu_ram_ptr_length(RAMBlock *ram_block, 
ram_addr_t addr,
  * In that case just map the requested area.
  */
 if (block->offset == 0) {
-return xen_map_cache(addr, *size, 1, true);
+return xen_map_cache(addr, *size, lock, lock);
 }
 
-block->host = xen_map_cache(block->offset, block->max_length, 1, true);
+block->host = xen_map_cache(block->offset, block->max_length, 1, lock);
 }
 
 return ramblock_ptr(block, addr);
@@ -2947,7 +2947,7 @@ static MemTxResult 
address_space_write_continue(AddressSpace *as, hwaddr addr,
 }
 } else {
 /* RAM case */
-ptr = qemu_ram_ptr_length(mr->ram_block, addr1, );
+ptr = qemu_ram_ptr_length(mr->ram_block, addr1, , false);
 memcpy(ptr, buf, l);
 invalidate_and_set_dirty(mr, addr1, l);
 }
@@ -3038,7 +3038,7 @@ MemTxResult address_space_read_continue(AddressSpace *as, 
hwaddr addr,
 }
 } else {
 /* RAM case */
-ptr = qemu_ram_ptr_length(mr->ram_block, addr1, );
+ptr = qemu_ram_ptr_length(mr->ram_block, addr1, , false);
 memcpy(buf, ptr, l);
 }
 
@@ -3349,7 +3349,7 @@ void *address_space_map(AddressSpace *as,
 
 memory_region_ref(mr);
 *plen = address_space_extend_translation(as, addr, len, mr, xlat, l, 
is_write);
-ptr = qemu_ram_ptr_length(mr->ram_block, xlat, plen);
+ptr = qemu_ram_ptr_length(mr->ram_block, xlat, plen, true);
 rcu_read_unlock();
 
 return ptr;
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH v3 5/7] block: add throttle block filter driver

2017-08-01 Thread Stefan Hajnoczi

On Mon, Jul 31, 2017 at 12:54:41PM +0300, Manos Pitsidianakis wrote:
> +static int throttle_configure_tgm(BlockDriverState *bs,
> +  ThrottleGroupMember *tgm,
> +  QDict *options, Error **errp)
> +{
> +int ret;
> +ThrottleConfig cfg;
> +const char *group_name = NULL;
> +Error *local_err = NULL;
> +QemuOpts *opts = qemu_opts_create(_opts, NULL, 0, _err);

If there is no scenario where this can fail please use error_abort.


signature.asc
Description: PGP signature

[Qemu-devel] [PULL 15/17] rtc-test: introduce more update tests

2017-08-01 Thread Paolo Bonzini

Test divider reset and UIP behavior.

Signed-off-by: Paolo Bonzini 
---
 tests/rtc-test.c | 82 +++-
 1 file changed, 81 insertions(+), 1 deletion(-)

diff --git a/tests/rtc-test.c b/tests/rtc-test.c
index 798cf5e..d7a96cb 100644
--- a/tests/rtc-test.c
+++ b/tests/rtc-test.c
@@ -325,6 +325,30 @@ static void set_datetime_bcd(int h, int min, int s, int d, 
int m, int y)
 cmos_write(RTC_DAY_OF_MONTH, d);
 }
 
+static void set_datetime_dec(int h, int min, int s, int d, int m, int y)
+{
+cmos_write(RTC_HOURS, h);
+cmos_write(RTC_MINUTES, min);
+cmos_write(RTC_SECONDS, s);
+cmos_write(RTC_YEAR, y % 100);
+cmos_write(RTC_CENTURY, y / 100);
+cmos_write(RTC_MONTH, m);
+cmos_write(RTC_DAY_OF_MONTH, d);
+}
+
+static void set_datetime(int mode, int h, int min, int s, int d, int m, int y)
+{
+cmos_write(RTC_REG_B, mode);
+
+cmos_write(RTC_REG_A, 0x76);
+if (mode & REG_B_DM) {
+set_datetime_dec(h, min, s, d, m, y);
+} else {
+set_datetime_bcd(h, min, s, d, m, y);
+}
+cmos_write(RTC_REG_A, 0x26);
+}
+
 #define assert_time(h, m, s) \
 do { \
 g_assert_cmpint(cmos_read(RTC_HOURS), ==, h); \
@@ -559,6 +583,60 @@ static void register_b_set_flag(void)
 assert_datetime_bcd(0x02, 0x04, 0x59, 0x02, 0x02, 0x2011);
 }
 
+static void divider_reset(void)
+{
+/* Enable binary-coded decimal (BCD) mode in Register B*/
+cmos_write(RTC_REG_B, REG_B_24H);
+
+/* Enter divider reset */
+cmos_write(RTC_REG_A, 0x76);
+set_datetime_bcd(0x02, 0x04, 0x58, 0x02, 0x02, 0x2011);
+
+assert_datetime_bcd(0x02, 0x04, 0x58, 0x02, 0x02, 0x2011);
+
+/* Since divider reset flag is still enabled, these are equality checks. */
+clock_step(10LL);
+assert_datetime_bcd(0x02, 0x04, 0x58, 0x02, 0x02, 0x2011);
+
+/* The first update ends 500 ms after divider reset */
+cmos_write(RTC_REG_A, 0x26);
+clock_step(5LL - UIP_HOLD_LENGTH - 1);
+g_assert_cmpint(cmos_read(RTC_REG_A) & REG_A_UIP, ==, 0);
+assert_datetime_bcd(0x02, 0x04, 0x58, 0x02, 0x02, 0x2011);
+
+clock_step(1);
+g_assert_cmpint(cmos_read(RTC_REG_A) & REG_A_UIP, !=, 0);
+clock_step(UIP_HOLD_LENGTH);
+g_assert_cmpint(cmos_read(RTC_REG_A) & REG_A_UIP, ==, 0);
+
+assert_datetime_bcd(0x02, 0x04, 0x59, 0x02, 0x02, 0x2011);
+}
+
+static void uip_stuck(void)
+{
+set_datetime(REG_B_24H, 0x02, 0x04, 0x58, 0x02, 0x02, 0x2011);
+
+/* The first update ends 500 ms after divider reset */
+(void)cmos_read(RTC_REG_C);
+clock_step(5LL);
+g_assert_cmpint(cmos_read(RTC_REG_A) & REG_A_UIP, ==, 0);
+assert_datetime_bcd(0x02, 0x04, 0x59, 0x02, 0x02, 0x2011);
+
+/* UF is now set.  */
+cmos_write(RTC_HOURS_ALARM, 0x02);
+cmos_write(RTC_MINUTES_ALARM, 0xC0);
+cmos_write(RTC_SECONDS_ALARM, 0xC0);
+
+/* Because the alarm will fire soon, reading register A will latch UIP.  */
+clock_step(10LL - UIP_HOLD_LENGTH / 2);
+g_assert_cmpint(cmos_read(RTC_REG_A) & REG_A_UIP, !=, 0);
+
+/* Move the alarm far away.  This must not cause UIP to remain stuck!  */
+cmos_write(RTC_HOURS_ALARM, 0x03);
+clock_step(UIP_HOLD_LENGTH);
+g_assert_cmpint(cmos_read(RTC_REG_A) & REG_A_UIP, ==, 0);
+}
+
 #define RTC_PERIOD_CODE113   /* 8 Hz */
 #define RTC_PERIOD_CODE215   /* 2 Hz */
 
@@ -625,7 +703,9 @@ int main(int argc, char **argv)
 qtest_add_func("/rtc/basic/bcd-12h", basic_12h_bcd);
 qtest_add_func("/rtc/set-year/20xx", set_year_20xx);
 qtest_add_func("/rtc/set-year/1980", set_year_1980);
-qtest_add_func("/rtc/misc/register_b_set_flag", register_b_set_flag);
+qtest_add_func("/rtc/update/register_b_set_flag", register_b_set_flag);
+qtest_add_func("/rtc/update/divider-reset", divider_reset);
+qtest_add_func("/rtc/update/uip-stuck", uip_stuck);
 qtest_add_func("/rtc/misc/fuzz-registers", fuzz_registers);
 qtest_add_func("/rtc/periodic/interrupt", periodic_timer);
 
-- 
1.8.3.1

[Qemu-devel] [PULL 11/17] migration: optimize the downtime

2017-08-01 Thread Paolo Bonzini

From: Jay Zhou 

Qemu_savevm_state_cleanup takes about 300ms in my ram migration tests
with a 8U24G vm(20G is really occupied), the main cost comes from
KVM_SET_USER_MEMORY_REGION ioctl when mem.memory_size = 0 in
kvm_set_user_memory_region. In kmod, the main cost is
kvm_zap_obsolete_pages, which traverses the active_mmu_pages list to
zap the unsync sptes.

It can be optimized by delaying memory_global_dirty_log_stop to the next
vm_start.

Changes v2->v3:
 - NULL VMChangeStateHandler if it is deleted and protect the scenario
   of nested invocations of memory_global_dirty_log_start/stop [Paolo]

Changes v1->v2:
 - create a VMChangeStateHandler in memory.c to reduce the coupling [Paolo]

Signed-off-by: Jay Zhou 
Message-Id: <1501237733-2736-1-git-send-email-jianjay.z...@huawei.com>
Signed-off-by: Paolo Bonzini 
---
 memory.c | 36 +++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/memory.c b/memory.c
index a7bc70a..c0adc35 100644
--- a/memory.c
+++ b/memory.c
@@ -2357,8 +2357,15 @@ void memory_global_dirty_log_sync(void)
 }
 }
 
+static VMChangeStateEntry *vmstate_change;
+
 void memory_global_dirty_log_start(void)
 {
+if (vmstate_change) {
+qemu_del_vm_change_state_handler(vmstate_change);
+vmstate_change = NULL;
+}
+
 global_dirty_log = true;
 
 MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
@@ -2369,7 +2376,7 @@ void memory_global_dirty_log_start(void)
 memory_region_transaction_commit();
 }
 
-void memory_global_dirty_log_stop(void)
+static void memory_global_dirty_log_do_stop(void)
 {
 global_dirty_log = false;
 
@@ -2381,6 +2388,33 @@ void memory_global_dirty_log_stop(void)
 MEMORY_LISTENER_CALL_GLOBAL(log_global_stop, Reverse);
 }
 
+static void memory_vm_change_state_handler(void *opaque, int running,
+   RunState state)
+{
+if (running) {
+memory_global_dirty_log_do_stop();
+
+if (vmstate_change) {
+qemu_del_vm_change_state_handler(vmstate_change);
+vmstate_change = NULL;
+}
+}
+}
+
+void memory_global_dirty_log_stop(void)
+{
+if (!runstate_is_running()) {
+if (vmstate_change) {
+return;
+}
+vmstate_change = qemu_add_vm_change_state_handler(
+memory_vm_change_state_handler, NULL);
+return;
+}
+
+memory_global_dirty_log_do_stop();
+}
+
 static void listener_add_address_space(MemoryListener *listener,
AddressSpace *as)
 {
-- 
1.8.3.1

[Qemu-devel] [PULL 10/17] qemu-options: document existance of versioned machine types

2017-08-01 Thread Paolo Bonzini

From: "Daniel P. Berrange" 

The -machine docs did not explain what the versioned machine
types are for, nor that they'll be maintained across
releases.

Signed-off-by: Daniel P. Berrange 
Message-Id: <20170725141041.1195-1-berra...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 qemu-options.hx | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 746b5fa..9f6e2ad 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -49,7 +49,20 @@ STEXI
 @item -machine [type=]@var{name}[,prop=@var{value}[,...]]
 @findex -machine
 Select the emulated machine by @var{name}. Use @code{-machine help} to list
-available machines. Supported machine properties are:
+available machines.
+
+For architectures which aim to support live migration compatibility
+across releases, each release will introduce a new versioned machine
+type. For example, the 2.8.0 release introduced machine types
+``pc-i440fx-2.8'' and ``pc-q35-2.8'' for the x86_64/i686 architectures.
+
+To allow live migration of guests from QEMU version 2.8.0, to QEMU
+version 2.9.0, the 2.9.0 version must support the ``pc-i440fx-2.8''
+and ``pc-q35-2.8'' machines too. To allow users live migrating VMs
+to skip multiple intermediate releases when upgrading, new releases
+of QEMU will support machine types from many previous versions.
+
+Supported machine properties are:
 @table @option
 @item accel=@var{accels1}[:@var{accels2}[:...]]
 This is used to enable an accelerator. Depending on the target architecture,
-- 
1.8.3.1

[Qemu-devel] [PULL 05/17] char: don't exit on hmp 'chardev-add help'

2017-08-01 Thread Paolo Bonzini

From: Anton Nefedov 

qemu_chr_new_from_opts() is used from both vl.c and hmp,
and it is quite confusing to see qemu suddenly exit after receiving a help
option in hmp.

Do exit(0) from vl.c instead.

Signed-off-by: Anton Nefedov 
Message-Id: <1500977081-120929-1-git-send-email-anton.nefe...@virtuozzo.com>
Signed-off-by: Paolo Bonzini 
---
 chardev/char.c |  2 +-
 include/chardev/char.h |  4 +++-
 vl.c   | 10 ++
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/chardev/char.c b/chardev/char.c
index c34b44a..5d283b9 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -620,7 +620,7 @@ Chardev *qemu_chr_new_from_opts(QemuOpts *opts, Error 
**errp)
 
 error_report("Available chardev backend types: %s", str->str);
 g_string_free(str, true);
-exit(0);
+return NULL;
 }
 
 if (id == NULL) {
diff --git a/include/chardev/char.h b/include/chardev/char.h
index 1604ea9..66dde46 100644
--- a/include/chardev/char.h
+++ b/include/chardev/char.h
@@ -65,7 +65,9 @@ struct Chardev {
  *
  * @opts see qemu-config.c for a list of valid options
  *
- * Returns: a new character backend
+ * Returns: on success: a new character backend
+ *  otherwise:  NULL; @errp specifies the error
+ *or left untouched in case of help option
  */
 Chardev *qemu_chr_new_from_opts(QemuOpts *opts,
 Error **errp);
diff --git a/vl.c b/vl.c
index dd803fc..99fcfa0 100644
--- a/vl.c
+++ b/vl.c
@@ -2344,10 +2344,12 @@ static int chardev_init_func(void *opaque, QemuOpts 
*opts, Error **errp)
 {
 Error *local_err = NULL;
 
-qemu_chr_new_from_opts(opts, _err);
-if (local_err) {
-error_report_err(local_err);
-return -1;
+if (!qemu_chr_new_from_opts(opts, _err)) {
+if (local_err) {
+error_report_err(local_err);
+return -1;
+}
+exit(0);
 }
 return 0;
 }
-- 
1.8.3.1

[Qemu-devel] [PULL 13/17] hw/scsi/vmw_pvscsi: Convert to realize

2017-08-01 Thread Paolo Bonzini

From: Mao Zhongyi 

Convert a device model where initialization obviously
can't fail, make it implement realize() rather than init().

Reviewed-by: Dmitry Fleytman 
Cc: Paolo Bonzini 
Cc: Markus Armbruster 
Signed-off-by: Mao Zhongyi 
Message-Id: <20170726084153.10121-2-maozy.f...@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/vmw_pvscsi.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
index d92973e..77d8b6f 100644
--- a/hw/scsi/vmw_pvscsi.c
+++ b/hw/scsi/vmw_pvscsi.c
@@ -1103,8 +1103,8 @@ static const struct SCSIBusInfo pvscsi_scsi_info = {
 .cancel = pvscsi_request_cancelled,
 };
 
-static int
-pvscsi_init(PCIDevice *pci_dev)
+static void
+pvscsi_realizefn(PCIDevice *pci_dev, Error **errp)
 {
 PVSCSIState *s = PVSCSI(pci_dev);
 
@@ -1144,8 +1144,6 @@ pvscsi_init(PCIDevice *pci_dev)
 /* override default SCSI bus hotplug-handler, with pvscsi's one */
 qbus_set_hotplug_handler(BUS(>bus), DEVICE(s), _abort);
 pvscsi_reset_state(s);
-
-return 0;
 }
 
 static void
@@ -1278,7 +1276,7 @@ static void pvscsi_class_init(ObjectClass *klass, void 
*data)
 PVSCSIClass *pvs_k = PVSCSI_DEVICE_CLASS(klass);
 HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(klass);
 
-k->init = pvscsi_init;
+k->realize = pvscsi_realizefn;
 k->exit = pvscsi_uninit;
 k->vendor_id = PCI_VENDOR_ID_VMWARE;
 k->device_id = PCI_DEVICE_ID_VMWARE_PVSCSI;
-- 
1.8.3.1

[Qemu-devel] [PULL 03/17] accel: cleanup error output

2017-08-01 Thread Paolo Bonzini

From: Laurent Vivier 

Only emit "XXX accelerator not found", if there are not
further accelerators listed. eg

   accel=kvm:tcg

doesn't print a "KVM accelerator not found" warning
when it falls back to tcg, but a

   accel=kvm

prints a warning, since no fallback is given.

Suggested-by: Daniel P. Berrange 
Suggested-by: Paolo Bonzini 
Signed-off-by: Laurent Vivier 
Message-Id: <20170717144527.24534-1-lviv...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 accel/accel.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/accel/accel.c b/accel/accel.c
index fa85844..8ae40e1 100644
--- a/accel/accel.c
+++ b/accel/accel.c
@@ -33,6 +33,7 @@
 #include "sysemu/qtest.h"
 #include "hw/xen/xen.h"
 #include "qom/object.h"
+#include "qemu/error-report.h"
 
 static const TypeInfo accel_type = {
 .name = TYPE_ACCEL,
@@ -69,19 +70,20 @@ static int accel_init_machine(AccelClass *acc, MachineState 
*ms)
 
 void configure_accelerator(MachineState *ms)
 {
-const char *p;
+const char *accel, *p;
 char buf[10];
 int ret;
 bool accel_initialised = false;
 bool init_failed = false;
 AccelClass *acc = NULL;
 
-p = qemu_opt_get(qemu_get_machine_opts(), "accel");
-if (p == NULL) {
+accel = qemu_opt_get(qemu_get_machine_opts(), "accel");
+if (accel == NULL) {
 /* Use the default "accelerator", tcg */
-p = "tcg";
+accel = "tcg";
 }
 
+p = accel;
 while (!accel_initialised && *p != '\0') {
 if (*p == ':') {
 p++;
@@ -89,7 +91,6 @@ void configure_accelerator(MachineState *ms)
 p = get_opt_name(buf, sizeof(buf), p, ':');
 acc = accel_find(buf);
 if (!acc) {
-fprintf(stderr, "\"%s\" accelerator not found.\n", buf);
 continue;
 }
 if (acc->available && !acc->available()) {
@@ -100,9 +101,8 @@ void configure_accelerator(MachineState *ms)
 ret = accel_init_machine(acc, ms);
 if (ret < 0) {
 init_failed = true;
-fprintf(stderr, "failed to initialize %s: %s\n",
-acc->name,
-strerror(-ret));
+error_report("failed to initialize %s: %s",
+ acc->name, strerror(-ret));
 } else {
 accel_initialised = true;
 }
@@ -110,13 +110,13 @@ void configure_accelerator(MachineState *ms)
 
 if (!accel_initialised) {
 if (!init_failed) {
-fprintf(stderr, "No accelerator found!\n");
+error_report("-machine accel=%s: No accelerator found", accel);
 }
 exit(1);
 }
 
 if (init_failed) {
-fprintf(stderr, "Back to %s accelerator.\n", acc->name);
+error_report("Back to %s accelerator", acc->name);
 }
 }
 
-- 
1.8.3.1

[Qemu-devel] [PULL 04/17] char-fd: remove useless chr pointer

2017-08-01 Thread Paolo Bonzini

From: Marc-André Lureau 

Apparently unused since it was introduced in commit
a29753f8aa79a34a324afebe340182a51a5aef11. Now, it can be trivially
accessed by CHARDEV() of self.

Signed-off-by: Marc-André Lureau 
Message-Id: <20170720100046.4424-1-marcandre.lur...@redhat.com>
Reviewed-by: Eric Blake 
Signed-off-by: Paolo Bonzini 
---
 chardev/char-fd.c | 1 -
 include/chardev/char-fd.h | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/chardev/char-fd.c b/chardev/char-fd.c
index 1584a3d..6a62a54 100644
--- a/chardev/char-fd.c
+++ b/chardev/char-fd.c
@@ -141,7 +141,6 @@ void qemu_chr_open_fd(Chardev *chr,
 qio_channel_set_name(QIO_CHANNEL(s->ioc_out), name);
 g_free(name);
 qemu_set_nonblock(fd_out);
-s->chr = chr;
 }
 
 static void char_fd_class_init(ObjectClass *oc, void *data)
diff --git a/include/chardev/char-fd.h b/include/chardev/char-fd.h
index 55ae5b4..e7c2b17 100644
--- a/include/chardev/char-fd.h
+++ b/include/chardev/char-fd.h
@@ -29,7 +29,7 @@
 
 typedef struct FDChardev {
 Chardev parent;
-Chardev *chr;
+
 QIOChannel *ioc_in, *ioc_out;
 int max_size;
 } FDChardev;
-- 
1.8.3.1

[Qemu-devel] [PULL 06/17] docs: document deprecation policy & deprecated features in appendix

2017-08-01 Thread Paolo Bonzini

From: "Daniel P. Berrange" 

The deprecation of features in QEMU is totally adhoc currently,
with no way for the user to get a list of what is deprecated
in each release. This adds an appendix to the doc that records
when each deprecation was made and provides text explaining
what to use instead, if anything.

Since there has been no formal policy around removal of deprecated
features in the past, any deprecations prior to 2.10.0 are to be
treated as if they had been made at the 2.10.0 release. Thus the
earliest that existing deprecations will be deleted is the start
of the 2.12.0 cycle.

Signed-off-by: Daniel P. Berrange 
Message-Id: <20170725113638.7019-1-berra...@redhat.com>
Reviewed-by: Thomas Huth 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
---
 qemu-doc.texi | 175 ++
 1 file changed, 175 insertions(+)

diff --git a/qemu-doc.texi b/qemu-doc.texi
index 48af515..aeb7bc5 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -38,6 +38,7 @@
 * QEMU Guest Agent::
 * QEMU User space emulator::
 * Implementation notes::
+* Deprecated features::
 * License::
 * Index::
 @end menu
@@ -3128,6 +3129,180 @@ Run the emulation in single step mode.
 
 @include qemu-tech.texi
 
+@node Deprecated features
+@appendix Deprecated features
+
+In general features are intended to be supported indefinitely once
+introduced into QEMU. In the event that a feature needs to be removed,
+it will be listed in this appendix. The feature will remain functional
+for 2 releases prior to actual removal. Deprecated features may also
+generate warnings on the console when QEMU starts up, or if activated
+via a monitor command, however, this is not a mandatory requirement.
+
+Prior to the 2.10.0 release there was no official policy on how
+long features would be deprecated prior to their removal, nor
+any documented list of which features were deprecated. Thus
+any features deprecated prior to 2.10.0 will be treated as if
+they were first deprecated in the 2.10.0 release.
+
+What follows is a list of all features currently marked as
+deprecated.
+
+@section System emulator command line arguments
+
+@subsection -drive boot=on|off (since 1.3.0)
+
+The ``boot=on|off'' option to the ``-drive'' argument is
+ignored. Applications should use the ``bootindex=N'' parameter
+to set an absolute ordering between devices instead.
+
+@subsection -tdf (since 1.3.0)
+
+The ``-tdf'' argument is ignored. The behaviour implemented
+by this argument is now the default when using the KVM PIT,
+but can be requested explicitly using
+``-global kvm-pit.lost_tick_policy=slew''.
+
+@subsection -no-kvm-pit-reinjection (since 1.3.0)
+
+The ``-no-kvm-pit-reinjection'' argument is now a
+synonym for setting ``-global kvm-pit.lost_tick_policy=discard''.
+
+@subsection -no-kvm-irqchip (since 1.3.0)
+
+The ``-no-kvm-irqchip'' argument is now a synonym for
+setting ``-machine kernel_irqchip=off''.
+
+@subsection -no-kvm-pit (since 1.3.0)
+
+The ``-no-kvm-pit'' argument is ignored. It is no longer
+possible to disable the KVM PIT directly.
+
+@subsection -no-kvm (since 1.3.0)
+
+The ``-no-kvm'' argument is now a synonym for setting
+``-machine accel=tcg''.
+
+@subsection -mon default=on (since 2.4.0)
+
+The ``default'' option to the ``-mon'' argument is
+now ignored. When multiple monitors were enabled, it
+indicated which monitor would receive log messages
+from the various subsystems. This feature is no longer
+required as messages are now only sent to the monitor
+in response to explicitly monitor commands.
+
+@subsection -vnc tls (since 2.5.0)
+
+The ``-vnc tls'' argument is now a synonym for setting
+``-object tls-creds-anon,id=tls0'' combined with
+``-vnc tls-creds=tls0'
+
+@subsection -vnc x509 (since 2.5.0)
+
+The ``-vnc x509=/path/to/certs'' argument is now a
+synonym for setting
+``-object tls-creds-x509,dir=/path/to/certs,id=tls0,verify-peer=no''
+combined with ``-vnc tls-creds=tls0'
+
+@subsection -vnc x509verify (since 2.5.0)
+
+The ``-vnc x509verify=/path/to/certs'' argument is now a
+synonym for setting
+``-object tls-creds-x509,dir=/path/to/certs,id=tls0,verify-peer=yes''
+combined with ``-vnc tls-creds=tls0'
+
+@subsection -tftp (since 2.6.0)
+
+The ``-tftp /some/dir'' argument is now a synonym for setting
+the ``-netdev user,tftp=/some/dir' argument. The new syntax
+allows different settings to be provided per NIC.
+
+@subsection -bootp (since 2.6.0)
+
+The ``-bootp /some/file'' argument is now a synonym for setting
+the ``-netdev user,bootp=/some/file' argument. The new syntax
+allows different settings to be provided per NIC.
+
+@subsection -redir (since 2.6.0)
+
+The ``-redir ARGS'' argument is now a synonym for setting
+the ``-netdev user,hostfwd=ARGS'' argument instead. The new
+syntax allows different settings to be provided per NIC.
+
+@subsection -smb (since 2.6.0)
+
+The ``-smb

[Qemu-devel] [PULL 14/17] rtc-test: cleanup register_b_set_flag test

2017-08-01 Thread Paolo Bonzini

Introduce set_datetime_bcd/assert_datetime_bcd, and handle
UIP correctly.

Signed-off-by: Paolo Bonzini 
---
 tests/rtc-test.c | 76 ++--
 1 file changed, 46 insertions(+), 30 deletions(-)

diff --git a/tests/rtc-test.c b/tests/rtc-test.c
index e78f701..798cf5e 100644
--- a/tests/rtc-test.c
+++ b/tests/rtc-test.c
@@ -17,6 +17,8 @@
 #include "qemu/timer.h"
 #include "hw/timer/mc146818rtc_regs.h"
 
+#define UIP_HOLD_LENGTH   (8 * NANOSECONDS_PER_SECOND / 32768)
+
 static uint8_t base = 0x70;
 
 static int bcd2dec(int value)
@@ -297,16 +299,30 @@ static void alarm_time(void)
 g_assert(cmos_read(RTC_REG_C) == 0);
 }
 
+static void set_time_regs(int h, int m, int s)
+{
+cmos_write(RTC_HOURS, h);
+cmos_write(RTC_MINUTES, m);
+cmos_write(RTC_SECONDS, s);
+}
+
 static void set_time(int mode, int h, int m, int s)
 {
-/* set BCD 12 hour mode */
 cmos_write(RTC_REG_B, mode);
-
 cmos_write(RTC_REG_A, 0x76);
+set_time_regs(h, m, s);
+cmos_write(RTC_REG_A, 0x26);
+}
+
+static void set_datetime_bcd(int h, int min, int s, int d, int m, int y)
+{
 cmos_write(RTC_HOURS, h);
-cmos_write(RTC_MINUTES, m);
+cmos_write(RTC_MINUTES, min);
 cmos_write(RTC_SECONDS, s);
-cmos_write(RTC_REG_A, 0x26);
+cmos_write(RTC_YEAR, y & 0xFF);
+cmos_write(RTC_CENTURY, y >> 8);
+cmos_write(RTC_MONTH, m);
+cmos_write(RTC_DAY_OF_MONTH, d);
 }
 
 #define assert_time(h, m, s) \
@@ -316,6 +332,17 @@ static void set_time(int mode, int h, int m, int s)
 g_assert_cmpint(cmos_read(RTC_SECONDS), ==, s); \
 } while(0)
 
+#define assert_datetime_bcd(h, min, s, d, m, y) \
+do { \
+g_assert_cmpint(cmos_read(RTC_HOURS), ==, h); \
+g_assert_cmpint(cmos_read(RTC_MINUTES), ==, min); \
+g_assert_cmpint(cmos_read(RTC_SECONDS), ==, s); \
+g_assert_cmpint(cmos_read(RTC_DAY_OF_MONTH), ==, d); \
+g_assert_cmpint(cmos_read(RTC_MONTH), ==, m); \
+g_assert_cmpint(cmos_read(RTC_YEAR), ==, (y & 0xFF)); \
+g_assert_cmpint(cmos_read(RTC_CENTURY), ==, (y >> 8)); \
+} while(0)
+
 static void basic_12h_bcd(void)
 {
 /* set BCD 12 hour mode */
@@ -506,41 +533,30 @@ static void fuzz_registers(void)
 
 static void register_b_set_flag(void)
 {
+if (cmos_read(RTC_REG_A) & REG_A_UIP) {
+clock_step(UIP_HOLD_LENGTH + NANOSECONDS_PER_SECOND / 5);
+}
+g_assert_cmpint(cmos_read(RTC_REG_A) & REG_A_UIP, ==, 0);
+
 /* Enable binary-coded decimal (BCD) mode and SET flag in Register B*/
 cmos_write(RTC_REG_B, REG_B_24H | REG_B_SET);
 
-cmos_write(RTC_REG_A, 0x76);
-cmos_write(RTC_YEAR, 0x11);
-cmos_write(RTC_CENTURY, 0x20);
-cmos_write(RTC_MONTH, 0x02);
-cmos_write(RTC_DAY_OF_MONTH, 0x02);
-cmos_write(RTC_HOURS, 0x02);
-cmos_write(RTC_MINUTES, 0x04);
-cmos_write(RTC_SECONDS, 0x58);
-cmos_write(RTC_REG_A, 0x26);
+set_datetime_bcd(0x02, 0x04, 0x58, 0x02, 0x02, 0x2011);
 
-/* Since SET flag is still enabled, these are equality checks. */
-g_assert_cmpint(cmos_read(RTC_HOURS), ==, 0x02);
-g_assert_cmpint(cmos_read(RTC_MINUTES), ==, 0x04);
-g_assert_cmpint(cmos_read(RTC_SECONDS), ==, 0x58);
-g_assert_cmpint(cmos_read(RTC_DAY_OF_MONTH), ==, 0x02);
-g_assert_cmpint(cmos_read(RTC_MONTH), ==, 0x02);
-g_assert_cmpint(cmos_read(RTC_YEAR), ==, 0x11);
-g_assert_cmpint(cmos_read(RTC_CENTURY), ==, 0x20);
+assert_datetime_bcd(0x02, 0x04, 0x58, 0x02, 0x02, 0x2011);
+
+/* Since SET flag is still enabled, time does not advance. */
+clock_step(10LL);
+assert_datetime_bcd(0x02, 0x04, 0x58, 0x02, 0x02, 0x2011);
 
 /* Disable SET flag in Register B */
 cmos_write(RTC_REG_B, cmos_read(RTC_REG_B) & ~REG_B_SET);
 
-g_assert_cmpint(cmos_read(RTC_HOURS), ==, 0x02);
-g_assert_cmpint(cmos_read(RTC_MINUTES), ==, 0x04);
+assert_datetime_bcd(0x02, 0x04, 0x58, 0x02, 0x02, 0x2011);
 
-/* Since SET flag is disabled, this is an inequality check.
- * We (reasonably) assume that no (sexagesimal) overflow occurs. */
-g_assert_cmpint(cmos_read(RTC_SECONDS), >=, 0x58);
-g_assert_cmpint(cmos_read(RTC_DAY_OF_MONTH), ==, 0x02);
-g_assert_cmpint(cmos_read(RTC_MONTH), ==, 0x02);
-g_assert_cmpint(cmos_read(RTC_YEAR), ==, 0x11);
-g_assert_cmpint(cmos_read(RTC_CENTURY), ==, 0x20);
+/* Since SET flag is disabled, the clock now advances.  */
+clock_step(10LL);
+assert_datetime_bcd(0x02, 0x04, 0x59, 0x02, 0x02, 0x2011);
 }
 
 #define RTC_PERIOD_CODE113   /* 8 Hz */
-- 
1.8.3.1

[Qemu-devel] [PULL 01/17] vl.c/exit: pause cpus before closing block devices

2017-08-01 Thread Paolo Bonzini

From: "Dr. David Alan Gilbert" 

There's a rare exit seg if the guest is accessing
IO during exit.
It's always hitting the atomic_inc(>in_flight) with a NULL
bs. This was added recently in 99723548  but I don't see it
as the cause.

Flip vl.c around so we pause the cpus before closing the block devices,
that way we shouldn't have anything trying to access them when
they're gone.

This was originally Red Hat bz 
https://bugzilla.redhat.com/show_bug.cgi?id=1451015

Signed-off-by: Dr. David Alan Gilbert 
Reported-by: Cong Li 

--
This is a very rare race, I'll leave it running in a loop to see if
we hit anything else and to check this really fixes it.

I do worry if there are other cases that can trigger this - e.g.
hot-unplug or ejecting a CD.

Message-Id: <20170713190116.21608-1-dgilb...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 vl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vl.c b/vl.c
index fb6b2ef..dd803fc 100644
--- a/vl.c
+++ b/vl.c
@@ -4787,8 +4787,8 @@ int main(int argc, char **argv, char **envp)
 replay_disable_events();
 iothread_stop_all();
 
-bdrv_close_all();
 pause_all_vcpus();
+bdrv_close_all();
 res_free();
 
 /* vhost-user must be cleaned up before chardevs.  */
-- 
1.8.3.1

[Qemu-devel] [PULL 00/17] Misc changes for QEMU 2.10-rc1 (?)

2017-08-01 Thread Paolo Bonzini

The following changes since commit 7d48cf8102a10e4a54333811bafb5eb566509268:

  Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' 
into staging (2017-08-01 14:33:56 +0100)

are available in the git repository at:


  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 33f21e4f044ac1c37f60edc1f1aee628be8f463b:

  mc146818rtc: implement UIP latching as intended (2017-08-01 17:27:34 +0200)


* Xen fix (Anthony)
* chardev fixes (Anton, Marc-André)
* small dead code removal (Zhongyi)
* documentation (Dan)
* bugfixes (David)
* decrease migration downtime (Jay)
* improved error output (Laurent)
* RTC tests and bugfix (me)
* Bluetooth clang analyzer fix (me)
* KVM CPU hotplug race (Peng Hao)
* Two other patches from Philippe's clang analyzer series


Anthony PERARD (1):
  exec: Add lock parameter to qemu_ram_ptr_length

Anton Nefedov (1):
  char: don't exit on hmp 'chardev-add help'

Daniel P. Berrange (2):
  docs: document deprecation policy & deprecated features in appendix
  qemu-options: document existance of versioned machine types

Dr. David Alan Gilbert (2):
  vl.c/exit: pause cpus before closing block devices
  cpu_physical_memory_sync_dirty_bitmap: Fix alignment check

Jay Zhou (1):
  migration: optimize the downtime

Laurent Vivier (1):
  accel: cleanup error output

Mao Zhongyi (2):
  hw/scsi/vmw_pvscsi: Remove the dead error handling
  hw/scsi/vmw_pvscsi: Convert to realize

Marc-André Lureau (1):
  char-fd: remove useless chr pointer

Paolo Bonzini (5):
  bt: stop the sdp memory allocation craziness
  rtc-test: cleanup register_b_set_flag test
  rtc-test: introduce more update tests
  mc146818rtc: simplify check_update_timer
  mc146818rtc: implement UIP latching as intended

Peng Hao (1):
  target-i386: kvm_get/put_vcpu_events don't handle sipi_vector

 accel/accel.c |  20 +++---
 chardev/char-fd.c |   1 -
 chardev/char.c|   2 +-
 exec.c|  12 ++--
 hw/bt/sdp.c   |  17 +++--
 hw/scsi/vmw_pvscsi.c  |  12 +---
 hw/timer/mc146818rtc.c|  37 +-
 include/chardev/char-fd.h |   2 +-
 include/chardev/char.h|   4 +-
 include/exec/ram_addr.h   |   7 +-
 memory.c  |  36 +-
 qemu-doc.texi | 175 ++
 qemu-options.hx   |  15 +++-
 target/i386/kvm.c |  35 ++
 tests/rtc-test.c  | 156 +
 vl.c  |  12 ++--
 16 files changed, 437 insertions(+), 106 deletions(-)
-- 
1.8.3.1

[Qemu-devel] [PULL 02/17] cpu_physical_memory_sync_dirty_bitmap: Fix alignment check

2017-08-01 Thread Paolo Bonzini

From: "Dr. David Alan Gilbert" 

This code has an optimised, word aligned version, and a boring
unaligned version.  Recently 084140bd498909 fixed a missing offset
addition from the core of both versions.  However, the offset isn't
necessarily aligned and thus the choice between the two versions
needs fixing up to also include the offset.

Symptom:
  A few stuck unsent pages during migration; not normally noticed
unless under very low bandwidth in which case the migration may get
stuck never ending and never performing a 2nd sync; noticed by
a hanging postcopy-test on a very heavily loaded system.

Fixes: 084140bd498909

Signed-off-by: Dr. David Alan Gilbert 
Reported-by: Alex BenneÃ© 
Tested-by: Alex BenneÃ© 

--
v2
  Move 'page' inside the if (Comment from Paolo)
Message-Id: <20170724165125.29887-1-dgilb...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 include/exec/ram_addr.h | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index c04f4f6..d017639 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -377,19 +377,20 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock 
*rb,
uint64_t *real_dirty_pages)
 {
 ram_addr_t addr;
-unsigned long page = BIT_WORD(start >> TARGET_PAGE_BITS);
+unsigned long word = BIT_WORD((start + rb->offset) >> TARGET_PAGE_BITS);
 uint64_t num_dirty = 0;
 unsigned long *dest = rb->bmap;
 
 /* start address is aligned at the start of a word? */
-if (((page * BITS_PER_LONG) << TARGET_PAGE_BITS) == start) {
+if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) ==
+ (start + rb->offset)) {
 int k;
 int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
 unsigned long * const *src;
-unsigned long word = BIT_WORD((start + rb->offset) >> 
TARGET_PAGE_BITS);
 unsigned long idx = (word * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
 unsigned long offset = BIT_WORD((word * BITS_PER_LONG) %
 DIRTY_MEMORY_BLOCK_SIZE);
+unsigned long page = BIT_WORD(start >> TARGET_PAGE_BITS);
 
 rcu_read_lock();
 
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH v3 6/7] block: add BlockDevOptionsThrottle to QAPI

2017-08-01 Thread Stefan Hajnoczi

On Mon, Jul 31, 2017 at 12:54:42PM +0300, Manos Pitsidianakis wrote:
> @@ -3095,6 +3096,22 @@
>  '*tls-creds': 'str' } }
>  
>  ##
> +# @BlockdevOptionsThrottle:
> +#
> +# Driver specific block device options for the throttle driver
> +#
> +# @throttle-group:   the name of the throttle-group object to use. It will be
> +#created if it doesn't already exist
> +# @file: reference to or definition of the data source block 
> device
> +# @limits:   ThrottleLimits options
> +# Since: 2.11
> +##
> +{ 'struct': 'BlockdevOptionsThrottle',
> +  'data': { '*throttle-group': 'str',
> +'file' : 'BlockdevRef',
> +'*limits' : 'ThrottleLimits'
> + } }

What happens when throttle-group isn't given?  Please document it.


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v2 2/2] build-sys: do not compile net/vhost-user.c if vhost-user is disabled

2017-08-01 Thread Marc-André Lureau

Hi

- Original Message -
> On Fri, Jul 28, 2017 at 04:13:09PM +0200, Marc-André Lureau wrote:
> > This adds two extra #ifdef that have fairly limited conflict potential.
> > 
> > Signed-off-by: Marc-André Lureau 
> 
> OK but why can't we keep all ifdefs in net/ where all the
> rest of the features are ifdef'ed?
> 

We would need fallback code for vhost_user_get_* functions (see below).

> My concern is no one is going to test this weird configuration
> so we might accidentally re-enable it without noticing.
> net/net.c is where all command line play happens so
> people know a bunch of configs need to be tested when
> changing it.

I am not sure I understand your argument there. You are afraid we 'accidentaly 
re-enable' vhost-user... and forget to remove those #ifdef?

I can imagine people don't like having code clutter with #ifdef, but I think 
when it doesn't change much the logic (as in here with clean per-case 
switch/if), it is fine, ymmv.

> 
> 
> > ---
> >  hw/net/vhost_net.c | 4 
> >  net/Makefile.objs  | 2 +-
> >  2 files changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> > index e037db63a3..96c4da49e4 100644
> > --- a/hw/net/vhost_net.c
> > +++ b/hw/net/vhost_net.c
> > @@ -193,6 +193,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions
> > *options)
> >  }
> >  }
> >  
> > +#ifdef CONFIG_VHOST_USER
> >  /* Set sane init value. Override when guest acks. */
> >  if (net->nc->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
> >  features = vhost_user_get_acked_features(net->nc);
> > @@ -203,6 +204,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions
> > *options)
> >  goto fail;
> >  }
> >  }
> > +#endif
> >  
> >  vhost_net_ack_features(net, features);
> >  
> > @@ -414,10 +416,12 @@ VHostNetState *get_vhost_net(NetClientState *nc)
> >  case NET_CLIENT_DRIVER_TAP:
> >  vhost_net = tap_get_vhost_net(nc);
> >  break;
> > +#ifdef CONFIG_VHOST_USER
> >  case NET_CLIENT_DRIVER_VHOST_USER:
> >  vhost_net = vhost_user_get_vhost_net(nc);
> >  assert(vhost_net);
> >  break;
> > +#endif
> >  default:
> >  break;
> >  }
> > diff --git a/net/Makefile.objs b/net/Makefile.objs
> > index 67ba5e26fb..7cac7ed1e4 100644
> > --- a/net/Makefile.objs
> > +++ b/net/Makefile.objs
> > @@ -3,7 +3,7 @@ common-obj-y += socket.o
> >  common-obj-y += dump.o
> >  common-obj-y += eth.o
> >  common-obj-$(CONFIG_L2TPV3) += l2tpv3.o
> > -common-obj-$(CONFIG_POSIX) += vhost-user.o
> > +common-obj-$(CONFIG_VHOST_USER) += vhost-user.o

So we could have something like:

+common-obj-$(not CONFIG_VHOST_USER) += vhost-user-stubs.o

?

> >  common-obj-$(CONFIG_SLIRP) += slirp.o
> >  common-obj-$(CONFIG_VDE) += vde.o
> >  common-obj-$(CONFIG_NETMAP) += netmap.o
> > --
> > 2.14.0.rc0.1.g40ca67566
>

[Qemu-devel] [PATCH] migration: fix small leaks

2017-08-01 Thread Marc-André Lureau

Spotted thanks to valgrind and tests/device-introspect-test:

==11711== 1 bytes in 1 blocks are definitely lost in loss record 6 of 14,537
==11711==at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
==11711==by 0x1E0CDBD8: g_malloc (gmem.c:94)
==11711==by 0x1E0E696E: g_strdup (gstrfuncs.c:363)
==11711==by 0x695693: migration_instance_init (migration.c:2226)
==11711==by 0x717C4B: object_init_with_type (object.c:344)
==11711==by 0x717E80: object_initialize_with_type (object.c:375)
==11711==by 0x7182EB: object_new_with_type (object.c:483)
==11711==by 0x718328: object_new (object.c:493)
==11711==by 0x4B8A29: qmp_device_list_properties (qmp.c:542)
==11711==by 0x4A9561: qmp_marshal_device_list_properties 
(qmp-marshal.c:1425)
==11711==by 0x819D4A: do_qmp_dispatch (qmp-dispatch.c:104)
==11711==by 0x819E82: qmp_dispatch (qmp-dispatch.c:131)

Signed-off-by: Marc-André Lureau 
---
 migration/migration.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index 085c32c994..c3fe0ed9ca 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2214,6 +2214,15 @@ static void migration_class_init(ObjectClass *klass, 
void *data)
 dc->props = migration_properties;
 }
 
+static void migration_instance_finalize(Object *obj)
+{
+MigrationState *ms = MIGRATION_OBJ(obj);
+MigrationParameters *params = >parameters;
+
+g_free(params->tls_hostname);
+g_free(params->tls_creds);
+}
+
 static void migration_instance_init(Object *obj)
 {
 MigrationState *ms = MIGRATION_OBJ(obj);
@@ -2282,6 +2291,7 @@ static const TypeInfo migration_type = {
 .class_size = sizeof(MigrationClass),
 .instance_size = sizeof(MigrationState),
 .instance_init = migration_instance_init,
+.instance_finalize = migration_instance_finalize,
 };
 
 static void register_migration_types(void)
-- 
2.14.0.rc0.1.g40ca67566

[Qemu-devel] [PULL v2 00/14] Block layer patches for 2.10.0-rc1

2017-08-01 Thread Kevin Wolf

The following changes since commit 5619c179057e24195ff19c8fe6d6a6cbcb16ed28:

  Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170731' 
into staging (2017-07-31 14:45:42 +0100)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to 8e8eb0a9035e5b6c6447c82138570e388282cfa2:

  block/qapi: Remove redundant NULL check to silence Coverity (2017-08-01 
18:09:33 +0200)


Block layer patches for 2.10.0-rc1


Eric Blake (2):
  iotests: Check dirty bitmap statistics in 124
  iotests: Add test of recent fix to 'qemu-img measure'

Kevin Wolf (8):
  qemu-iotests/041: Fix leaked scratch images
  qemu-iotests: Remove blkdebug.conf after tests
  qemu-iotests/141: Fix image cleanup
  qemu-iotests/153: Fix leaked scratch images
  qemu-iotests/162: Fix leaked temporary files
  qemu-iotests/063: Fix leaked image
  qemu-iotests/059: Fix leaked image files
  block/qapi: Remove redundant NULL check to silence Coverity

Manos Pitsidianakis (2):
  block: fix dangling bs->explicit_options in block.c
  block: fix leaks in bdrv_open_driver()

Max Reitz (2):
  iotests: Fix test 156
  iotests: Redirect stderr to stdout in 186

 block.c  | 24 +-
 block/qapi.c |  3 ++-
 tests/qemu-iotests/041   |  4 ++-
 tests/qemu-iotests/059   | 11 -
 tests/qemu-iotests/059.out   | 22 -
 tests/qemu-iotests/063   |  4 +--
 tests/qemu-iotests/074   |  1 +
 tests/qemu-iotests/124   |  7 +-
 tests/qemu-iotests/141   |  2 +-
 tests/qemu-iotests/153   |  1 +
 tests/qemu-iotests/156   |  4 +--
 tests/qemu-iotests/162   |  7 ++
 tests/qemu-iotests/179   |  1 +
 tests/qemu-iotests/186   |  2 +-
 tests/qemu-iotests/186.out   | 12 -
 tests/qemu-iotests/190   | 59 
 tests/qemu-iotests/190.out   | 11 +
 tests/qemu-iotests/common.rc |  3 +++
 tests/qemu-iotests/group |  1 +
 19 files changed, 140 insertions(+), 39 deletions(-)
 create mode 100755 tests/qemu-iotests/190
 create mode 100644 tests/qemu-iotests/190.out

Re: [Qemu-devel] [PULL 00/15] Block layer patches for 2.10.0-rc1

2017-08-01 Thread Kevin Wolf

Am 01.08.2017 um 18:01 hat Peter Maydell geschrieben:
> On 1 August 2017 at 15:46, Kevin Wolf  wrote:
> > The following changes since commit 5619c179057e24195ff19c8fe6d6a6cbcb16ed28:
> >
> >   Merge remote-tracking branch 
> > 'remotes/pmaydell/tags/pull-target-arm-20170731' into staging (2017-07-31 
> > 14:45:42 +0100)
> >
> > are available in the git repository at:
> >
> >   git://repo.or.cz/qemu/kevin.git tags/for-upstream
> >
> > for you to fetch changes up to 9ae03670701fea9124f4f1a6d4b6d1badbf9e485:
> >
> >   block/qapi: Remove redundant NULL check to silence Coverity (2017-08-01 
> > 15:26:04 +0200)
> >
> > 
> > Block layer patches for 2.10.0-rc1
> >
> > 
> 
> Hi; I'm afraid the docs changes in this pull upset some of
> my test machines:
> 
> NetBSD, OpenBSD, OSX:
> 
>   GEN docs/qemu-block-drivers.html
> /root/qemu/docs/qemu-block-drivers.texi:800: Unmatched `@end'.
> makeinfo: Removing output file `docs/qemu-block-drivers.html' due to
> errors; use --force to preserve.
> Makefile:692: recipe for target 'docs/qemu-block-drivers.html' failed
> 
> (and same error again trying to build the .txt file)
> 
> S390x, PPC64 Linux give a warning:
>   GEN docs/qemu-block-drivers.html
> /home/linux1/qemu/docs/qemu-block-drivers.texi: warning: must specify
> a title with a title command or @top
> 
> (Other hosts still running the build job.)

I'm dropping the patch and sending v2.

Kevin

Re: [Qemu-devel] [PATCH v2 2/2] build-sys: do not compile net/vhost-user.c if vhost-user is disabled

2017-08-01 Thread Michael S. Tsirkin

On Fri, Jul 28, 2017 at 04:13:09PM +0200, Marc-André Lureau wrote:
> This adds two extra #ifdef that have fairly limited conflict potential.
> 
> Signed-off-by: Marc-André Lureau 

OK but why can't we keep all ifdefs in net/ where all the
rest of the features are ifdef'ed?

My concern is no one is going to test this weird configuration
so we might accidentally re-enable it without noticing.
net/net.c is where all command line play happens so
people know a bunch of configs need to be tested when
changing it.


> ---
>  hw/net/vhost_net.c | 4 
>  net/Makefile.objs  | 2 +-
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index e037db63a3..96c4da49e4 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -193,6 +193,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
>  }
>  }
>  
> +#ifdef CONFIG_VHOST_USER
>  /* Set sane init value. Override when guest acks. */
>  if (net->nc->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
>  features = vhost_user_get_acked_features(net->nc);
> @@ -203,6 +204,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
>  goto fail;
>  }
>  }
> +#endif
>  
>  vhost_net_ack_features(net, features);
>  
> @@ -414,10 +416,12 @@ VHostNetState *get_vhost_net(NetClientState *nc)
>  case NET_CLIENT_DRIVER_TAP:
>  vhost_net = tap_get_vhost_net(nc);
>  break;
> +#ifdef CONFIG_VHOST_USER
>  case NET_CLIENT_DRIVER_VHOST_USER:
>  vhost_net = vhost_user_get_vhost_net(nc);
>  assert(vhost_net);
>  break;
> +#endif
>  default:
>  break;
>  }
> diff --git a/net/Makefile.objs b/net/Makefile.objs
> index 67ba5e26fb..7cac7ed1e4 100644
> --- a/net/Makefile.objs
> +++ b/net/Makefile.objs
> @@ -3,7 +3,7 @@ common-obj-y += socket.o
>  common-obj-y += dump.o
>  common-obj-y += eth.o
>  common-obj-$(CONFIG_L2TPV3) += l2tpv3.o
> -common-obj-$(CONFIG_POSIX) += vhost-user.o
> +common-obj-$(CONFIG_VHOST_USER) += vhost-user.o
>  common-obj-$(CONFIG_SLIRP) += slirp.o
>  common-obj-$(CONFIG_VDE) += vde.o
>  common-obj-$(CONFIG_NETMAP) += netmap.o
> -- 
> 2.14.0.rc0.1.g40ca67566

Re: [Qemu-devel] [PULL 00/15] Block layer patches for 2.10.0-rc1

2017-08-01 Thread Peter Maydell

On 1 August 2017 at 15:46, Kevin Wolf  wrote:
> The following changes since commit 5619c179057e24195ff19c8fe6d6a6cbcb16ed28:
>
>   Merge remote-tracking branch 
> 'remotes/pmaydell/tags/pull-target-arm-20170731' into staging (2017-07-31 
> 14:45:42 +0100)
>
> are available in the git repository at:
>
>   git://repo.or.cz/qemu/kevin.git tags/for-upstream
>
> for you to fetch changes up to 9ae03670701fea9124f4f1a6d4b6d1badbf9e485:
>
>   block/qapi: Remove redundant NULL check to silence Coverity (2017-08-01 
> 15:26:04 +0200)
>
> 
> Block layer patches for 2.10.0-rc1
>
> 

Hi; I'm afraid the docs changes in this pull upset some of
my test machines:

NetBSD, OpenBSD, OSX:

  GEN docs/qemu-block-drivers.html
/root/qemu/docs/qemu-block-drivers.texi:800: Unmatched `@end'.
makeinfo: Removing output file `docs/qemu-block-drivers.html' due to
errors; use --force to preserve.
Makefile:692: recipe for target 'docs/qemu-block-drivers.html' failed

(and same error again trying to build the .txt file)

S390x, PPC64 Linux give a warning:
  GEN docs/qemu-block-drivers.html
/home/linux1/qemu/docs/qemu-block-drivers.texi: warning: must specify
a title with a title command or @top

(Other hosts still running the build job.)

thanks
-- PMM

Re: [Qemu-devel] [PATCH v3 2/5] hw/pci: introduce pcie-pci-bridge device

2017-08-01 Thread Marcel Apfelbaum


On 01/08/2017 18:51, Michael S. Tsirkin wrote:

On Tue, Aug 01, 2017 at 06:45:13PM +0300, Marcel Apfelbaum wrote:

On 01/08/2017 18:32, Michael S. Tsirkin wrote:

On Mon, Jul 31, 2017 at 09:40:41PM +0300, Alexander Bezzubikov wrote:

+typedef struct PCIEPCIBridge {
+/*< private >*/
+PCIBridge parent_obj;
+
+bool msi_enable;



Please rename the msi_enable property to "msi" in order
to be aligned with the existent PCIBridgeDev and
consider making it OnOffAuto for the same reason.
(I am not sure about the last part though, we have
   no meaning for "auto" here)



Agreed about "msi", but OnOffAuto looks weird to me
as we always want MSI to be enabled.




Hi Michael,


Why even have a property then? Can't you enable it unconditionally?



Because of a current bug in Linux kernel:
https://www.spinics.net/lists/linux-pci/msg63052.html
msi will not work until the patch is merged. Even when
it will be merged, not all linux kernels will contain the patch.


You should Cc stable to make sure they all gain it eventually.



Right! thanks, we missed cc-ing stable.
Added stable to the mail thread.
Marcel



Disabling msi is a workaround for the above case.

Thanks,
Marcel


Really enabling MSI without bus master is a bug that I'm not 100% sure
it even worth working around. But I guess it's not too bad to have a
work-around given it's this simple.

Re: [Qemu-devel] [PATCH 07/18] nbd: Minimal structured read for client

2017-08-01 Thread Vladimir Sementsov-Ogievskiy


01.08.2017 18:41, Vladimir Sementsov-Ogievskiy wrote:

07.02.2017 23:14, Eric Blake wrote:

On 02/03/2017 09:47 AM, Vladimir Sementsov-Ogievskiy wrote:
Minimal implementation: always send DF flag, to not deal with 
fragmented

replies.

This works well with your minimal server implementation, but I worry
that it will cause us to fall over when talking to a fully-compliant
server that chooses to send EOVERFLOW errors for any request larger than
64k when DF is set; it also makes it impossible to benefit from sparse
reads.  I guess that means we need to start thinking about followup
patches to flush out our implementation.  But maybe I can live with this
patch as is, since the goal of your series was not so much the full
power of structured reads, but getting to a point where we could use
structured reply for block status, even if it means your client can only
communicate with qemu-nbd as server for now, as long as we do get to the
rest of the patches for a full-blown structured read.


Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd-client.c  |  47 +++
  block/nbd-client.h  |   2 +
  include/block/nbd.h |  15 +++--
  nbd/client.c| 170 
++--

  qemu-nbd.c  |   2 +-
  5 files changed, 203 insertions(+), 33 deletions(-)

Hmm - no change to the testsuite. Structured reads seems like the sort
of thing that it would be nice to test with some canned server replies,
particularly with server behavior that is permitted by the NBD protocol
but does not happen by default in qemu-nbd.


diff --git a/block/nbd-client.c b/block/nbd-client.c
index 3779c6c999..ff96bd1635 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -180,13 +180,20 @@ static void 
nbd_co_receive_reply(NBDClientSession *s,

  *reply = s->reply;
  if (reply->handle != request->handle ||
  !s->ioc) {
+reply->simple = true;
  reply->error = EIO;

I don't think this is quite right - by setting reply->simple to true,
you are forcing the caller to treat this as the final packet related to
this request->handle, even though that might not be the case.

As it is, I wonder if this code is correct, even before your patch - the
server is allowed to give responses out-of-order (if we request multiple
reads without waiting for the first response) - I don't see how setting
reply->error to EIO if the request->handle indicates that we are
receiving an out-of-order response to some other packet, but that our
request is still awaiting traffic.


Hmm, looks like it should initiate disconnect instead of just 
reporting error to io operation caller.





Also, nbd_co_send_request errors are not handled but just returned to 
the caller. Shouldn't the first


error on socket io initiate disconnect? I think, only the io errors, 
transferred from nbd-export block device


should be just returned to bdrv_{io} caller. NBD-protocol related errors 
(invalid handles, etc.) should initiate


disconnect, so that current and all future bdrv_{io}'s from client 
return -EIO.



--
Best regards,
Vladimir

Re: [Qemu-devel] [PATCH v3 2/5] hw/pci: introduce pcie-pci-bridge device

2017-08-01 Thread Michael S. Tsirkin

On Tue, Aug 01, 2017 at 06:45:13PM +0300, Marcel Apfelbaum wrote:
> On 01/08/2017 18:32, Michael S. Tsirkin wrote:
> > On Mon, Jul 31, 2017 at 09:40:41PM +0300, Alexander Bezzubikov wrote:
> > > > > +typedef struct PCIEPCIBridge {
> > > > > +/*< private >*/
> > > > > +PCIBridge parent_obj;
> > > > > +
> > > > > +bool msi_enable;
> > > > 
> > > > 
> > > > Please rename the msi_enable property to "msi" in order
> > > > to be aligned with the existent PCIBridgeDev and
> > > > consider making it OnOffAuto for the same reason.
> > > > (I am not sure about the last part though, we have
> > > >   no meaning for "auto" here)
> > > > 
> > > 
> > > Agreed about "msi", but OnOffAuto looks weird to me
> > > as we always want MSI to be enabled.
> > 
> 
> Hi Michael,
> 
> > Why even have a property then? Can't you enable it unconditionally?
> > 
> 
> Because of a current bug in Linux kernel:
>https://www.spinics.net/lists/linux-pci/msg63052.html
> msi will not work until the patch is merged. Even when
> it will be merged, not all linux kernels will contain the patch.

You should Cc stable to make sure they all gain it eventually.

> Disabling msi is a workaround for the above case.
> 
> Thanks,
> Marcel

Really enabling MSI without bus master is a bug that I'm not 100% sure
it even worth working around. But I guess it's not too bad to have a
work-around given it's this simple.

-- 
MST

Re: [Qemu-devel] [PATCH v3 2/5] hw/pci: introduce pcie-pci-bridge device

2017-08-01 Thread Marcel Apfelbaum


On 01/08/2017 18:32, Michael S. Tsirkin wrote:

On Mon, Jul 31, 2017 at 09:40:41PM +0300, Alexander Bezzubikov wrote:

+typedef struct PCIEPCIBridge {
+/*< private >*/
+PCIBridge parent_obj;
+
+bool msi_enable;



Please rename the msi_enable property to "msi" in order
to be aligned with the existent PCIBridgeDev and
consider making it OnOffAuto for the same reason.
(I am not sure about the last part though, we have
  no meaning for "auto" here)



Agreed about "msi", but OnOffAuto looks weird to me
as we always want MSI to be enabled.




Hi Michael,


Why even have a property then? Can't you enable it unconditionally?



Because of a current bug in Linux kernel:
   https://www.spinics.net/lists/linux-pci/msg63052.html
msi will not work until the patch is merged. Even when
it will be merged, not all linux kernels will contain the patch.

Disabling msi is a workaround for the above case.

Thanks,
Marcel

Re: [Qemu-devel] [PATCH 07/18] nbd: Minimal structured read for client

2017-08-01 Thread Vladimir Sementsov-Ogievskiy


07.02.2017 23:14, Eric Blake wrote:

On 02/03/2017 09:47 AM, Vladimir Sementsov-Ogievskiy wrote:

Minimal implementation: always send DF flag, to not deal with fragmented
replies.

This works well with your minimal server implementation, but I worry
that it will cause us to fall over when talking to a fully-compliant
server that chooses to send EOVERFLOW errors for any request larger than
64k when DF is set; it also makes it impossible to benefit from sparse
reads.  I guess that means we need to start thinking about followup
patches to flush out our implementation.  But maybe I can live with this
patch as is, since the goal of your series was not so much the full
power of structured reads, but getting to a point where we could use
structured reply for block status, even if it means your client can only
communicate with qemu-nbd as server for now, as long as we do get to the
rest of the patches for a full-blown structured read.


Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd-client.c  |  47 +++
  block/nbd-client.h  |   2 +
  include/block/nbd.h |  15 +++--
  nbd/client.c| 170 ++--
  qemu-nbd.c  |   2 +-
  5 files changed, 203 insertions(+), 33 deletions(-)

Hmm - no change to the testsuite. Structured reads seems like the sort
of thing that it would be nice to test with some canned server replies,
particularly with server behavior that is permitted by the NBD protocol
but does not happen by default in qemu-nbd.


diff --git a/block/nbd-client.c b/block/nbd-client.c
index 3779c6c999..ff96bd1635 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -180,13 +180,20 @@ static void nbd_co_receive_reply(NBDClientSession *s,
  *reply = s->reply;
  if (reply->handle != request->handle ||
  !s->ioc) {
+reply->simple = true;
  reply->error = EIO;

I don't think this is quite right - by setting reply->simple to true,
you are forcing the caller to treat this as the final packet related to
this request->handle, even though that might not be the case.

As it is, I wonder if this code is correct, even before your patch - the
server is allowed to give responses out-of-order (if we request multiple
reads without waiting for the first response) - I don't see how setting
reply->error to EIO if the request->handle indicates that we are
receiving an out-of-order response to some other packet, but that our
request is still awaiting traffic.


Hmm, looks like it should initiate disconnect instead of just reporting 
error to io operation caller.



--
Best regards,
Vladimir

Re: [Qemu-devel] [PATCH v3 2/5] hw/pci: introduce pcie-pci-bridge device

2017-08-01 Thread Michael S. Tsirkin

On Mon, Jul 31, 2017 at 09:40:41PM +0300, Alexander Bezzubikov wrote:
> >> +typedef struct PCIEPCIBridge {
> >> +/*< private >*/
> >> +PCIBridge parent_obj;
> >> +
> >> +bool msi_enable;
> >
> >
> > Please rename the msi_enable property to "msi" in order
> > to be aligned with the existent PCIBridgeDev and
> > consider making it OnOffAuto for the same reason.
> > (I am not sure about the last part though, we have
> >  no meaning for "auto" here)
> >
> 
> Agreed about "msi", but OnOffAuto looks weird to me
> as we always want MSI to be enabled.

Why even have a property then? Can't you enable it unconditionally?

-- 
MST

Re: [Qemu-devel] [for-2.11 PATCH 26/26] spapr: add hotplug hooks for PHB hotplug

2017-08-01 Thread Greg Kurz

On Fri, 28 Jul 2017 14:24:03 +1000
David Gibson  wrote:

> On Thu, Jul 27, 2017 at 07:09:55PM +0200, Greg Kurz wrote:
> > On Thu, 27 Jul 2017 14:41:31 +1000
> > Alexey Kardashevskiy  wrote:
> >   
> > > On 26/07/17 18:40, Greg Kurz wrote:  
> > > > Hotplugging PHBs is a machine-level operation, but PHBs reside on the
> > > > main system bus, so we register spapr machine as the handler for the
> > > > main system bus.
> > > > 
> > > > Signed-off-by: Michael Roth 
> > > > Signed-off-by: Greg Kurz 
> > > > ---
> > > > - rebased against ppc-for-2.10
> > > > - converted to unplug_request
> > > > - handle drc_id at pre-plug
> > > > - reset hotplugged PHB at plug
> > > > - compatibility with older machine types
> > > > ---
> > > >  hw/ppc/spapr.c  |  114 
> > > > +++
> > > >  hw/ppc/spapr_drc.c  |1 
> > > >  hw/ppc/spapr_pci.c  |2 -
> > > >  include/hw/pci-host/spapr.h |2 +
> > > >  include/hw/ppc/spapr.h  |1 
> > > >  5 files changed, 118 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > index 90485054c2e7..589f76ef9fb8 100644
> > > > --- a/hw/ppc/spapr.c
> > > > +++ b/hw/ppc/spapr.c
> > > > @@ -2540,6 +2540,10 @@ static void ppc_spapr_init(MachineState *machine)
> > > >  register_savevm_live(NULL, "spapr/htab", -1, 1,
> > > >   _htab_handlers, spapr);
> > > >  
> > > > +if (spapr->dr_phb_enabled) {
> > > > +qbus_set_hotplug_handler(sysbus_get_default(), 
> > > > OBJECT(machine), NULL);
> > > > +}
> > > > +
> > > >  qemu_register_boot_set(spapr_boot_set, spapr);
> > > >  
> > > >  if (kvm_enabled()) {
> > > > @@ -3238,6 +3242,103 @@ out:
> > > >  error_propagate(errp, local_err);
> > > >  }
> > > >  
> > > > +static void spapr_phb_pre_plug(HotplugHandler *hotplug_dev, 
> > > > DeviceState *dev,
> > > > +   Error **errp)
> > > > +{
> > > > +sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> > > > +
> > > > +if (sphb->drc_id == (uint32_t)-1) {
> > > > +sphb->drc_id = sphb->index;
> > > > +}
> > > > +
> > > > +if (sphb->drc_id >= SPAPR_DRC_MAX_PHB) {
> > > > +error_setg(errp, "PHB id %d out of range", sphb->drc_id);
> > > > +}
> > > 
> > > 
> > > sphb->index in considered 16bits in the existing code (even though it is
> > > defined as 32bit) and SPAPR_DRC_MAX_PHB is just 256. I'd suggest using the
> > > same limit for both, either 256 or 65536 is fine for me.
> > > 
> > > It is actually a bit weird - it is possible to completely configure few
> > > PHBs in the command line so they will have index==-1 but PCI hotplug code 
> > > -
> > > spapr_phb_get_pci_func_drc() and spapr_phb_realize() - does not check for
> > > this and just does (sphb->index << 16).  
> > 
> > You're right and this looks like a bug... I'll try to come up with a fix.
> >   
> > > May be just ditch drc_id, enforce index not to be -1 and use it as drc_id?
> > >   
> > 
> > This was how Mike did it in the original patchset but David suggested
> > to introduce drc_id (to preserve existing setups I guess):
> > 
> > http://patchwork.ozlabs.org/patch/466262/  
> 
> Huh.  So I did.  But.. sorry, I've changed my mind.
> 
> The fact that needing a DRC forces us to have a reasonable small id
> for each PHB seems like a good excuse to make index mandatory - I'm
> not convinced anyone was actually creating PHBs without index, and
> this does allow us to simplify a bunch of things.
> 
> I'd like to see that done as a preliminary cleanup patch, though.
> 

Just to be sure. I could verify that the weirdness reported by Alexey
causes QEMU to misbehave. Only the first "index-less" PHB has realized
DRCs:

=> subsequent "index-less" PHBs silently ignore hotplugging of PCI devices

=> QEMU won't even start with coldplugged devices in these "index-less"
   PHBs

This preliminary cleanup for hotpluggable PHBs is hence also a bug fix
for current PHBs.

Do we want to fix this long-standing bug in 2.10 ?

Do we want to preserve the current buggy behavior for older machine types ?


pgpq4epA0NOS8.pgp
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 1/2] s390x/css: use macro for event-information pending error recover code

2017-08-01 Thread Halil Pasic



On 08/01/2017 09:57 AM, Dong Jia Shi wrote:
> Let's use a macro for the ERC (error recover code) when generating a
> Channel Subsystem Event-information pending CRW (channel report word).
> 
> While we are at it, let's also add all other ERCs.
> 
> Signed-off-by: Dong Jia Shi 
> ---
>  hw/s390x/css.c|  2 +-
>  include/hw/s390x/ioinst.h | 11 +--
>  2 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/s390x/css.c b/hw/s390x/css.c
> index 6a42b95cee..5321ca016b 100644
> --- a/hw/s390x/css.c
> +++ b/hw/s390x/css.c
> @@ -2103,7 +2103,7 @@ void css_generate_chp_crws(uint8_t cssid, uint8_t chpid)
>  void css_generate_css_crws(uint8_t cssid)
>  {
>  if (!channel_subsys.sei_pending) {
> -css_queue_crw(CRW_RSC_CSS, 0, 0, cssid);
> +css_queue_crw(CRW_RSC_CSS, CRW_ERC_EVENT, 0, cssid);
>  }
>  channel_subsys.sei_pending = true;
>  }
> diff --git a/include/hw/s390x/ioinst.h b/include/hw/s390x/ioinst.h
> index 92d15655e4..f89019f78f 100644
> --- a/include/hw/s390x/ioinst.h
> +++ b/include/hw/s390x/ioinst.h
> @@ -201,8 +201,15 @@ typedef struct CRW {
>  #define CRW_FLAGS_MASK_A 0x0080
>  #define CRW_FLAGS_MASK_ERC 0x003f
> 
> -#define CRW_ERC_INIT 0x02
> -#define CRW_ERC_IPI  0x04
> +#define CRW_ERC_EVENT0x00 /* event information pending */
> +#define CRW_ERC_AVAIL0x01 /* available */
> +#define CRW_ERC_INIT 0x02 /* initialized */
> +#define CRW_ERC_TERROR   0x03 /* temporary error */
> +#define CRW_ERC_IPI  0x04 /* installed parm initialized */
> +#define CRW_ERC_TERM 0x05 /* terminal */
> +#define CRW_ERC_PERRN0x06 /* perm. error, facility not init */
> +#define CRW_ERC_PERRI0x07 /* perm. error, facility init */
> +#define CRW_ERC_PMOD 0x08 /* installed parameters modified */

You have missed installed parameters restored from the PoP (above
you say add all other).

Other than that.

LGTM

> 
>  #define CRW_RSC_SUBCH 0x3
>  #define CRW_RSC_CHP   0x4
>

Re: [Qemu-devel] [PATCH v2 2/2] s390x/css: generate solicited crw for rchp completion signaling

2017-08-01 Thread Halil Pasic

On 08/01/2017 09:57 AM, Dong Jia Shi wrote:
[..]
> --- a/hw/s390x/css.c
> +++ b/hw/s390x/css.c
> @@ -1745,10 +1745,10 @@ int css_do_rchp(uint8_t cssid, uint8_t chpid)
>  }
> 
>  /* We don't really use a channel path, so we're done here. */
> -css_queue_crw(CRW_RSC_CHP, CRW_ERC_INIT,
> +css_queue_crw(CRW_RSC_CHP, CRW_ERC_INIT, 1,
>channel_subsys.max_cssid > 0 ? 1 : 0, chpid);
>  if (channel_subsys.max_cssid > 0) {
> -css_queue_crw(CRW_RSC_CHP, CRW_ERC_INIT, 0, real_cssid << 8);
> +css_queue_crw(CRW_RSC_CHP, CRW_ERC_INIT, 1, 0, real_cssid << 8);
>  }
>  return 0;
>  }
> @@ -2028,7 +2028,8 @@ void css_subch_assign(uint8_t cssid, uint8_t ssid, 
> uint16_t schid,
>  }
>  }
> 
> -void css_queue_crw(uint8_t rsc, uint8_t erc, int chain, uint16_t rsid)
> +void css_queue_crw(uint8_t rsc, uint8_t erc, int solicited,
> +   int chain, uint16_t rsid)

I think you could make the parameters solicited and chain bool (AFAIU
they are conceptually bool) for clearer semantic. If you go with that
you could also get rid of the superfluous ternary operator ( we have
stuff like some_cond ? 1 : 0 for the chain parameter in more than
one place.

Btw. I find bool flags easy to mix up and difficult to read. I have no better
idea how to write this (in C) though. I was considering throwing chain and
solicited together into a single flags parameter, but looking at the client code
it does not look like a good idea.

Besides the cosmetic considerations above LGTM

Re: [Qemu-devel] About virtio device hotplug in Q35! 【外域邮件.谨慎查阅】

2017-08-01 Thread Alex Williamson

On Tue, 1 Aug 2017 17:35:40 +0800
Bob Chen  wrote:

> 2017-08-01 13:46 GMT+08:00 Alex Williamson :
> 
> > On Tue, 1 Aug 2017 13:04:46 +0800
> > Bob Chen  wrote:
> >  
> > > Hi,
> > >
> > > This is a sketch of my hardware topology.
> > >
> > >   CPU0 <- QPI ->CPU1
> > >| |
> > > Root Port(at PCIe.0)Root Port(at PCIe.1)
> > >/\   /   \  
> >
> > Are each of these lines above separate root ports?  ie. each root
> > complex hosts two root ports, each with a two-port switch downstream of
> > it?
> >  
> 
> Not quite sure if root complex is a concept or a real physical device ...
> 
> But according to my observation by `lspci -vt`, there are indeed 4 Root
> Ports in the system. So the sketch might need a tiny update.
> 
> 
>   CPU0 <- QPI ->CPU1
> 
>| |
> 
>   Root Complex(device?)  Root Complex(device?)
> 
>  /\   /\
> 
> Root Port  Root Port Root Port  Root Port
> 
>/\   /\
> 
> SwitchSwitch SwitchSwitch
> 
>  /   \  /  \  /   \ /   \
> 
>GPU   GPU  GPU  GPU  GPU   GPU  GPU   GPU


Yes, that's what I expected.  So the numbers make sense, the immediate
sibling GPU would share bandwidth between the root port and upstream
switch port, any other GPU should not double-up on any single link.

> > > SwitchSwitch SwitchSwitch
> > >  /   \  /  \  /   \/\
> > >GPU   GPU  GPU  GPU  GPU   GPU GPU   GPU
> > >
> > >
> > > And below are the p2p bandwidth test results.
> > >
> > > Host：
> > >D\D 0  1  2  3  4  5  6  7
> > >  0 426.91  25.32  19.72  19.72  19.69  19.68  19.75  19.66
> > >  1  25.31 427.61  19.74  19.72  19.66  19.68  19.74  19.73
> > >  2  19.73  19.73 429.49  25.33  19.66  19.74  19.73  19.74
> > >  3  19.72  19.71  25.36 426.68  19.70  19.71  19.77  19.74
> > >  4  19.72  19.72  19.73  19.75 425.75  25.33  19.72  19.71
> > >  5  19.71  19.75  19.76  19.75  25.35 428.11  19.69  19.70
> > >  6  19.76  19.72  19.79  19.78  19.73  19.74 425.75  25.35
> > >  7  19.69  19.75  19.79  19.75  19.72  19.72  25.39 427.15
> > >
> > > VM：
> > >D\D 0  1  2  3  4  5  6  7
> > >  0 427.38  10.52  18.99  19.11  19.75  19.62  19.75  19.71
> > >  1  10.53 426.68  19.28  19.19  19.73  19.71  19.72  19.73
> > >  2  18.88  19.30 426.92  10.48  19.66  19.71  19.67  19.68
> > >  3  18.93  19.18  10.45 426.94  19.69  19.72  19.67  19.72
> > >  4  19.60  19.66  19.69  19.70 428.13  10.49  19.40  19.57
> > >  5  19.52  19.74  19.72  19.69  10.44 426.45  19.68  19.61
> > >  6  19.63  19.50  19.72  19.64  19.59  19.66 426.91  10.47
> > >  7  19.69  19.75  19.70  19.69  19.66  19.74  10.45 426.23  
> >
> > Interesting test, how do you get these numbers?  What are the units,
> > GB/s?
> >  
> 
> 
> 
> A p2pBandwidthLatencyTest from Nvidia CUDA sample code. Units are
> GB/s. Asynchronous read and write. Bidirectional.
> 
> However, the Unidirectional test had shown a different result. Didn't fall
> down to a half.
> 
> VM:
> Unidirectional P2P=Enabled Bandwidth Matrix (GB/s)
>D\D 0  1  2  3  4  5  6  7
>  0 424.07  10.02  11.33  11.30  11.09  11.05  11.06  11.10
>  1  10.05 425.98  11.40  11.33  11.08  11.10  11.13  11.09
>  2  11.31  11.28 423.67  10.10  11.14  11.13  11.13  11.11
>  3  11.30  11.31  10.08 425.05  11.10  11.07  11.09  11.06
>  4  11.16  11.17  11.21  11.17 423.67  10.08  11.25  11.28
>  5  10.97  11.01  11.07  11.02  10.09 425.52  11.23  11.27
>  6  11.09  11.13  11.16  11.10  11.28  11.33 422.71  10.10
>  7  11.13  11.09  11.15  11.11  11.36  11.33  10.02 422.75
> 
> Host:
> Unidirectional P2P=Enabled Bandwidth Matrix (GB/s)
>D\D 0  1  2  3  4  5  6  7
>  0 424.13  13.38  10.17  10.17  11.23  11.21  10.94  11.22
>  1  13.38 424.06  10.18  10.19  11.20  11.19  11.19  11.14
>  2  10.18  10.18 422.75  13.38  11.19  11.19  11.17  11.17
>  3  10.18  10.18  13.38 425.05  11.05  11.08  11.08  11.06
>  4  11.01  11.06  11.06  11.03 423.21  13.38  10.17  10.17
>  5  10.91  10.91  10.89  10.92  13.38 425.52  10.18  10.18
>  6  11.28  11.30  11.32  11.31  10.19  10.18 424.59  13.37
>  7  11.18  11.20  11.16  11.21  10.17  10.19  13.38 424.13

Looks right, a unidirectional test would create bidirectional data
flows on the root port to upstream switch link and should be able to
saturate that link.  With the bidirectional test, that link becomes a
bottleneck.
 
> > > In the VM, the bandwidth between two GPUs

Re: [Qemu-devel] [PATCHv2 02/04] colo-compare: Processpactkets in the IOThreadofthe primary

2017-08-01 Thread Fam Zheng

On Tue, 08/01 12:25, Paolo Bonzini wrote:
> On 28/07/2017 02:25, Fam Zheng wrote:
> > On Thu, 07/27 15:47, Zhang Chen wrote:
> >> CC. Fam and David.
> >>
> >> Any idea about it?
> > 
> > Is it possible to use g_main_context_{push,pop}_thread_default to "move" 
> > chardev
> > GSources to IOThread's context, then use g_main_context_query like main 
> > thread
> > and poll the fds in aio_poll?
> 
> How would you do that without making aio_poll performance worse for
> everyone?

I don't knoe. Maintain a flag and only do these slow stuff after at least one
GSource has been moved around?

Fam

Re: [Qemu-devel] [for-2.11 PATCH 26/26] spapr: add hotplug hooks for PHB hotplug

2017-08-01 Thread Greg Kurz

On Thu, 27 Jul 2017 13:37:24 -0500
Michael Roth  wrote:

> Quoting Greg Kurz (2017-07-27 12:09:55)
> > On Thu, 27 Jul 2017 14:41:31 +1000
> > Alexey Kardashevskiy  wrote:
> >   
> > > On 26/07/17 18:40, Greg Kurz wrote:  
> > > > Hotplugging PHBs is a machine-level operation, but PHBs reside on the
> > > > main system bus, so we register spapr machine as the handler for the
> > > > main system bus.
> > > > 
> > > > Signed-off-by: Michael Roth 
> > > > Signed-off-by: Greg Kurz 
> > > > ---
> > > > - rebased against ppc-for-2.10
> > > > - converted to unplug_request
> > > > - handle drc_id at pre-plug
> > > > - reset hotplugged PHB at plug
> > > > - compatibility with older machine types
> > > > ---
> > > >  hw/ppc/spapr.c  |  114 
> > > > +++
> > > >  hw/ppc/spapr_drc.c  |1 
> > > >  hw/ppc/spapr_pci.c  |2 -
> > > >  include/hw/pci-host/spapr.h |2 +
> > > >  include/hw/ppc/spapr.h  |1 
> > > >  5 files changed, 118 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > index 90485054c2e7..589f76ef9fb8 100644
> > > > --- a/hw/ppc/spapr.c
> > > > +++ b/hw/ppc/spapr.c
> > > > @@ -2540,6 +2540,10 @@ static void ppc_spapr_init(MachineState *machine)
> > > >  register_savevm_live(NULL, "spapr/htab", -1, 1,
> > > >   _htab_handlers, spapr);
> > > >  
> > > > +if (spapr->dr_phb_enabled) {
> > > > +qbus_set_hotplug_handler(sysbus_get_default(), 
> > > > OBJECT(machine), NULL);
> > > > +}
> > > > +
> > > >  qemu_register_boot_set(spapr_boot_set, spapr);
> > > >  
> > > >  if (kvm_enabled()) {
> > > > @@ -3238,6 +3242,103 @@ out:
> > > >  error_propagate(errp, local_err);
> > > >  }
> > > >  
> > > > +static void spapr_phb_pre_plug(HotplugHandler *hotplug_dev, 
> > > > DeviceState *dev,
> > > > +   Error **errp)
> > > > +{
> > > > +sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(dev);
> > > > +
> > > > +if (sphb->drc_id == (uint32_t)-1) {
> > > > +sphb->drc_id = sphb->index;
> > > > +}
> > > > +
> > > > +if (sphb->drc_id >= SPAPR_DRC_MAX_PHB) {
> > > > +error_setg(errp, "PHB id %d out of range", sphb->drc_id);
> > > > +}
> > > 
> > > 
> > > sphb->index in considered 16bits in the existing code (even though it is
> > > defined as 32bit) and SPAPR_DRC_MAX_PHB is just 256. I'd suggest using the
> > > same limit for both, either 256 or 65536 is fine for me.
> > > 
> > > It is actually a bit weird - it is possible to completely configure few
> > > PHBs in the command line so they will have index==-1 but PCI hotplug code 
> > > -
> > > spapr_phb_get_pci_func_drc() and spapr_phb_realize() - does not check for
> > > this and just does (sphb->index << 16).  
> > 
> > You're right and this looks like a bug... I'll try to come up with a fix.  
> 
> Yup, that's a bug, and we can trigger it currently by adding 2
> additional PHBs that don't have an index specified. QOM catches
> and reports them as "attempt to add duplicate property", but it's
> just reported by spapr_dr_connector_new() and doesn't seem to be
> treated as fatal (and probably should be).
> 

Yeah, spapr_dr_connector_new() doesn't care for errors at all and
happily returns a pointer to an unrealized DRC. This causes weird
behavior. For example, when using mandatory props instead of index:

-device spapr-pci-host-bridge,id=pci1,\
  buid=0x8002001,\
  liobn=0x8100,\
  liobn64=0x8101,\
  mem_win_addr=0x2001,\
  mem64_win_addr=0x2200,\
  io_win_addr=0x2001 \
-device spapr-pci-host-bridge,id=pci2,\
  buid=0x8002002,\
  liobn=0x8200,\
  liobn64=0x8201,\
  mem_win_addr=0x20018000,\
  mem64_win_addr=0x2300,\
  io_win_addr=0x2002 \
-device virtio-net,id=hp1,bus=pci1.0 \
-device virtio-net,id=hp2,bus=pci2.0 

QEMU complains and exits:

qemu-system-ppc64: -device virtio-net,id=hp2,bus=pci2.0: an attached device
 is still awaiting release

> Might also see this more in practice now with the multi-phb support
> in libvirt, though I'd imagine those would tend to rely on phb->index
> being set.
> 

I had confirmation from Shiva that libvirt passes the following to QEMU:

spapr-pci-host-bridge,index=X,id=pci.X<,numa_node=Z>

> Now that phb->drc_id is available though we can just use that instead.
> I agree it should be limited to 16-bit or smaller to avoid any
> possibility of overlap.
> 

Well, David changed his mind and now suggests we rather

[Qemu-devel] [PULL 15/15] block/qapi: Remove redundant NULL check to silence Coverity

2017-08-01 Thread Kevin Wolf

When skipping implicit nodes in bdrv_block_device_info(), we know that
bs0 is always non-NULL; initially, because it's taken from a BdrvChild
and a BdrvChild never has a NULL bs, and after the first iteration
because implicit nodes always have a backing file.

Remove the NULL check and add an assertion that the implicit node does
indeed have a backing file.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Jeff Cody 
---
 block/qapi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/qapi.c b/block/qapi.c
index d2b18ee9df..5f1a71f5d2 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -145,8 +145,9 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
 
 /* Skip automatically inserted nodes that the user isn't aware of for
  * query-block (blk != NULL), but not for query-named-block-nodes */
-while (blk && bs0 && bs0->drv && bs0->implicit) {
+while (blk && bs0->drv && bs0->implicit) {
 bs0 = backing_bs(bs0);
+assert(bs0);
 }
 }
 
-- 
2.13.3

1 2 3 >

1 - 100 of 235 matches

Mail list logo