date:20161013

Re: [Qemu-devel] [PATCH 0/7] blockjobs: preliminary refactoring work, Pt 1

2016-10-13 Thread no-reply

Hi,

Your series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 1476399422-8028-1-git-send-email-js...@redhat.com
Subject: [Qemu-devel] [PATCH 0/7] blockjobs: preliminary refactoring work, Pt 1

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git show --no-patch --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
42e8049 blockjobs: fix documentation
545d6e6 blockjobs: split interface into public/private, Part 1
6b894fe Blockjobs: Internalize user_pause logic
0863a3c blockjob: centralize QMP event emissions
1b493ef Replication/Blockjobs: Create replication jobs as internal
ba80fb1 blockjobs: Allow creating internal jobs
76110a3 blockjobs: hide internal jobs from management API

=== OUTPUT BEGIN ===
Checking PATCH 1/7: blockjobs: hide internal jobs from management API...
Checking PATCH 2/7: blockjobs: Allow creating internal jobs...
Checking PATCH 3/7: Replication/Blockjobs: Create replication jobs as internal...
Checking PATCH 4/7: blockjob: centralize QMP event emissions...
Checking PATCH 5/7: Blockjobs: Internalize user_pause logic...
Checking PATCH 6/7: blockjobs: split interface into public/private, Part 1...
ERROR: struct BlockJobDriver should normally be const
#182: FILE: include/block/blockjob.h:31:
+typedef struct BlockJobDriver BlockJobDriver;

ERROR: struct BlockJobDriver should normally be const
#415: FILE: include/block/blockjob_int.h:37:
+struct BlockJobDriver {

ERROR: space prohibited between function name and open parenthesis '('
#459: FILE: include/block/blockjob_int.h:81:
+void coroutine_fn (*pause)(BlockJob *job);

ERROR: space prohibited between function name and open parenthesis '('
#466: FILE: include/block/blockjob_int.h:88:
+void coroutine_fn (*resume)(BlockJob *job);

total: 4 errors, 0 warnings, 558 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 7/7: blockjobs: fix documentation...
=== OUTPUT END ===

Test command exited with code: 1
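
Editorial note: the two "space prohibited" errors above look like false positives, since the flagged lines declare function pointer members rather than calling a function. The sketch below is a reduced illustration, not the real QEMU headers. It assumes coroutine_fn is an annotation macro that expands to nothing, and the counter field, demo_pause, demo_driver and demo_run_pause are hypothetical names.

```c
#include <stddef.h>

/* Assumption: coroutine_fn is an annotation macro that expands to nothing. */
#define coroutine_fn

typedef struct BlockJob BlockJob;

struct BlockJob {
    int pause_calls;    /* hypothetical counter, for illustration only */
};

/* Reduced stand-in for the driver struct flagged by checkpatch. */
typedef struct BlockJobDriver {
    /* Function pointer members: the space precedes "(*pause)", not a call. */
    void coroutine_fn (*pause)(BlockJob *job);
    void coroutine_fn (*resume)(BlockJob *job);
} BlockJobDriver;

static void coroutine_fn demo_pause(BlockJob *job)
{
    job->pause_calls++;
}

/* A driver instance can still be const even though the type itself is not. */
static const BlockJobDriver demo_driver = {
    .pause = demo_pause,
    .resume = NULL,
};

/* Calls the pause hook once and reports how often it ran. */
int demo_run_pause(void)
{
    BlockJob job = { 0 };

    if (demo_driver.pause) {
        demo_driver.pause(&job);
    }
    return job.pause_calls;
}
```

The "should normally be const" errors point at the typedef and the struct definition, where a const qualifier would not apply; it is instances such as demo_driver that are usually declared const.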


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [PATCH 07/11] spapr: add hotplug interrupt machine options

2016-10-13 Thread David Gibson

On Wed, Oct 12, 2016 at 06:13:55PM -0500, Michael Roth wrote:
> This adds machine options of the form:
> 
>   -machine pseries,legacy-hotplug-events=true
>   -machine pseries,legacy-hotplug-events=false
> 
> to denote whether or not we wish to force the use of "legacy" style
> hotplug events, which are surfaced through EPOW interrupts instead of
> a dedicated interrupt source, and lack certain features necessary,
> mainly, for memory unplug support.
> 
> If false, QEMU will default to "legacy" style unless the guest
> advertises support for the newer events via
> ibm,client-architecture-support hcall during early boot.
> 
> For pseries-2.7 and earlier we default to true, for newer machine
> types we default to false.
> 
> Signed-off-by: Michael Roth 

Hrm.. I think it would be a little clearer if you could find a wording
such that both the internal variable and the external property have
the same sense - i.e. get rid of the ! in the property getters /
setters.
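
Editorial note: a minimal sketch of what "same sense" could look like. DemoMachineState and the demo_* functions are hypothetical and much reduced; they are not the QOM property callbacks from the patch.

```c
#include <stdbool.h>

/* Hypothetical, reduced machine state; not the real sPAPRMachineState. */
typedef struct DemoMachineState {
    bool legacy_hotplug_events;   /* same sense as the user-visible option */
} DemoMachineState;

/* Getter returns the field directly, with no negation. */
bool demo_get_legacy_hotplug_events(const DemoMachineState *s)
{
    return s->legacy_hotplug_events;
}

/* Setter stores the value directly, with no negation. */
void demo_set_legacy_hotplug_events(DemoMachineState *s, bool value)
{
    s->legacy_hotplug_events = value;
}

/* Consumers that want the opposite sense negate once, at the point of use. */
bool demo_use_dedicated_event_source(const DemoMachineState *s)
{
    return !s->legacy_hotplug_events;
}
```

With this layout the only negation lives in the helper that asks whether the dedicated source should be used, so the property name and the stored field can be read the same way.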

> ---
>  hw/ppc/spapr.c  | 31 +++
>  include/hw/ppc/spapr.h  |  1 +
>  include/hw/ppc/spapr_ovec.h |  1 +
>  3 files changed, 33 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index f8cde92..d80a6fa 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1816,6 +1816,11 @@ static void ppc_spapr_init(MachineState *machine)
>  
>      spapr_ovec_set(spapr->ov5, OV5_FORM1_AFFINITY);
>  
> +    /* use dedicated HP event source if guest supports it */
> +    if (spapr->use_hotplug_event_source) {
> +        spapr_ovec_set(spapr->ov5, OV5_HP_EVT);
> +    }
> +
>      /* init CPUs */
>      if (machine->cpu_model == NULL) {
>          machine->cpu_model = kvm_enabled() ? "host" : smc->tcg_default_cpu;
> @@ -2172,16 +2177,39 @@ static void spapr_set_kvm_type(Object *obj, const char *value, Error **errp)
>      spapr->kvm_type = g_strdup(value);
>  }
>  
> +static bool spapr_get_legacy_hotplug_events(Object *obj, Error **errp)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> +
> +    return !spapr->use_hotplug_event_source;
> +}
> +
> +static void spapr_set_legacy_hotplug_events(Object *obj, bool value,
> +                                            Error **errp)
> +{
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
> +
> +    spapr->use_hotplug_event_source = !value;
> +}
> +
>  static void spapr_machine_initfn(Object *obj)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
>  
>      spapr->htab_fd = -1;
> +    spapr->use_hotplug_event_source = true;
>      object_property_add_str(obj, "kvm-type",
>                              spapr_get_kvm_type, spapr_set_kvm_type, NULL);
>      object_property_set_description(obj, "kvm-type",
>                                      "Specifies the KVM virtualization mode (HV, PR)",
>                                      NULL);
> +    object_property_add_bool(obj, "legacy-hotplug-events",
> +                             spapr_get_legacy_hotplug_events,
> +                             spapr_set_legacy_hotplug_events,
> +                             NULL);
> +    object_property_set_description(obj, "legacy-hotplug-events",
> +                                    "Use deprecated EPOW mechanism for hotplug events",
> +                                    NULL);
>  }
>  
>  static void spapr_machine_finalizefn(Object *obj)
> @@ -2518,6 +2546,9 @@ DEFINE_SPAPR_MACHINE(2_8, "2.8", true);
>  
>  static void spapr_machine_2_7_instance_options(MachineState *machine)
>  {
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> +
> +    spapr->use_hotplug_event_source = false;
>  }
>  
>  static void spapr_machine_2_7_class_options(MachineClass *mc)
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 27a3328..d1a4a14 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -74,6 +74,7 @@ struct sPAPRMachineState {
>      uint32_t check_exception_irq;
>      Notifier epow_notifier;
>      QTAILQ_HEAD(, sPAPREventLogEntry) pending_events;
> +    bool use_hotplug_event_source;
>  
>      /* Migration state */
>      int htab_save_index;
> diff --git a/include/hw/ppc/spapr_ovec.h b/include/hw/ppc/spapr_ovec.h
> index 47fa04c..92167c6 100644
> --- a/include/hw/ppc/spapr_ovec.h
> +++ b/include/hw/ppc/spapr_ovec.h
> @@ -45,6 +45,7 @@ typedef struct sPAPROptionVector sPAPROptionVector;
>  /* option vector 5 */
>  #define OV5_DRCONF_MEMORY   OV_BIT(2, 2)
>  #define OV5_FORM1_AFFINITY  OV_BIT(5, 0)
> +#define OV5_HP_EVT          OV_BIT(6, 5)
>  
>  /* interfaces */
>  sPAPROptionVector *spapr_ovec_new(void);

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 06/11] spapr: update spapr hotplug documentation

2016-10-13 Thread David Gibson

On Wed, Oct 12, 2016 at 06:13:54PM -0500, Michael Roth wrote:
> This updates the existing documentation to reflect recent updates to
> the hotplug event structure, which are in draft form but slated
> for inclusion in PAPR/LoPAPR.
> 
> Signed-off-by: Michael Roth 

Reviewed-by: David Gibson 

> ---
>  docs/specs/ppc-spapr-hotplug.txt | 55 +---
>  1 file changed, 46 insertions(+), 9 deletions(-)
> 
> diff --git a/docs/specs/ppc-spapr-hotplug.txt b/docs/specs/ppc-spapr-hotplug.txt
> index 631b0ca..f57e2a0 100644
> --- a/docs/specs/ppc-spapr-hotplug.txt
> +++ b/docs/specs/ppc-spapr-hotplug.txt
> @@ -233,12 +233,27 @@ tools by host-level management such as an HMC. This level of management is not
>  applicable to PowerKVM, hence the reason for extending the notification
>  framework to support hotplug events.
>  
> -Note that these events are not yet formally part of the PAPR+ specification,
> -but support for this format has already been implemented in DR-related
> -guest tools such as powerpc-utils/librtas, as well as kernel patches that have
> -been submitted to handle in-kernel processing of memory/cpu-related hotplug
> -events[1], and is planned for formal inclusion is PAPR+ specification. The
> -hotplug-specific payload is QEMU implemented as follows (with all values
> +The format for these EPOW-signalled events is described below under
> +"hotplug/unplug event structure". Note that these events are not
> +formally part of the PAPR+ specification, and have been superseded by a
> +newer format, also described below under "hotplug/unplug event structure",
> +and so are now deemed a "legacy" format. The formats are similar, but the
> +"modern" format contains additional fields/flags, which are denoted for the
> +purposes of this documentation with "#ifdef GUEST_SUPPORTS_MODERN" guards.
> +
> +QEMU should assume support only for "legacy" fields/flags unless the guest
> +advertises support for the "modern" format via ibm,client-architecture-support
> +hcall by setting byte 5, bit 6 of its ibm,architecture-vec-5 option vector
> +structure (as described by LoPAPR v11, B.6.2.3). As with "legacy" format events,
> +"modern" format events are surfaced to the guest via check-exception RTAS calls,
> +but use a dedicated event source to signal the guest. This event source is
> +advertised to the guest by the addition of a "hot-plug-events" node under
> +"/event-sources" node of the guest's device tree using the standard format
> +described in LoPAPR v11, B.6.12.1.
> +
> +== hotplug/unplug event structure ==
> +
> +The hotplug-specific payload in QEMU is implemented as follows (with all values
>  encoded in big-endian format):
>  
>  struct rtas_event_log_v6_hp {
> @@ -263,14 +278,23 @@ struct rtas_event_log_v6_hp {
>  #define RTAS_LOG_V6_HP_ACTION_ADD   1
>  #define RTAS_LOG_V6_HP_ACTION_REMOVE2
>  uint8_t hotplug_action; /* action (add/remove) */
> -#define RTAS_LOG_V6_HP_ID_DRC_NAME  1
> -#define RTAS_LOG_V6_HP_ID_DRC_INDEX 2
> -#define RTAS_LOG_V6_HP_ID_DRC_COUNT 3
> +#define RTAS_LOG_V6_HP_ID_DRC_NAME  1
> +#define RTAS_LOG_V6_HP_ID_DRC_INDEX 2
> +#define RTAS_LOG_V6_HP_ID_DRC_COUNT 3
> +#ifdef GUEST_SUPPORTS_MODERN
> +#define RTAS_LOG_V6_HP_ID_DRC_COUNT_INDEXED 4
> +#endif
>  uint8_t hotplug_identifier; /* type of the resource identifier,
>   * which serves as the discriminator
>   * for the 'drc' union field below
>   */
> +#ifdef GUEST_SUPPORTS_MODERN
> +uint8_t capabilities;   /* capability flags, currently unused
> + * by QEMU
> + */
> +#else
>  uint8_t reserved;
> +#endif
>  union {
>  uint32_t index; /* DRC index of resource to take action
>   * on
> @@ -278,6 +302,19 @@ struct rtas_event_log_v6_hp {
>  uint32_t count; /* number of DR resources to take
>   * action on (guest chooses which)
>   */
> +#ifdef GUEST_SUPPORTS_MODERN
> +struct {
> +uint32_t count; /* number of DR resources to take
> + * action on
> + */
> +uint32_t index; /* DRC index of first resource to take
> + * action on. guest will take action
> + * on DRC index  through
> + * DRC index  in
> + * sequential order
> + */
> +}

Re: [Qemu-devel] [PATCH 02/11] spapr_hcall: use spapr_ovec_* interfaces for CAS options

2016-10-13 Thread David Gibson

On Fri, Oct 14, 2016 at 02:02:31PM +1100, David Gibson wrote:
> On Wed, Oct 12, 2016 at 06:13:50PM -0500, Michael Roth wrote:
> > Currently we access individual bytes of an option vector via
> > ldub_phys() to test for the presence of a particular capability
> > within that byte. Currently this is only done for the "dynamic
> > reconfiguration memory" capability bit. If that bit is present,
> > we pass a boolean value to spapr_h_cas_compose_response()
> > to generate a modified device tree segment with the additional
> > properties required to enable this functionality.
> > 
> > As more capability bits are added, we would need to modify the
> > code to add additional option vector accesses and extend the
> > param list for spapr_h_cas_compose_response() to include similar
> > boolean values for these parameters.
> > 
> > Avoid this by switching to spapr_ovec_* helpers so we can do all
> > the parsing in one shot and then test for these additional bits
> > within spapr_h_cas_compose_response() directly.
> > 
> > Cc: Bharata B Rao 
> > Signed-off-by: Michael Roth 
> 
> Reviewed-by: David Gibson 

That said.. some comments explaining the overall scheme here might be
helpful.

Specifically..

[snip]
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 39dadaa..6c20d28 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -6,6 +6,7 @@
> >  #include "hw/ppc/xics.h"
> >  #include "hw/ppc/spapr_drc.h"
> >  #include "hw/mem/pc-dimm.h"
> > +#include "hw/ppc/spapr_ovec.h"
> >  
> >  struct VIOsPAPRBus;
> >  struct sPAPRPHBState;
> > @@ -66,6 +67,8 @@ struct sPAPRMachineState {
> >  uint64_t rtc_offset; /* Now used only during incoming migration */
> >  struct PPCTimebase tb;
> >  bool has_graphics;
> > +sPAPROptionVector *ov5;
> > +sPAPROptionVector *ov5_cas;

IIUC, the ov5 represents all the features qemu is capable of
supporting, and ov5_cas records the ones that were actually negotiated
during CAS.  Some descriptions here could make that much easier to follow.
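
Editorial note: a rough sketch of the relationship described above, using plain bitmasks in place of sPAPROptionVector. Treating negotiation as the intersection of the platform's set with the guest's request is an assumption made for illustration; DemoCaps, the DEMO_CAP_* bits and both functions are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical capability bits, for illustration only. */
enum {
    DEMO_CAP_DRCONF_MEMORY = 1u << 0,
    DEMO_CAP_HP_EVT        = 1u << 1,
};

typedef struct DemoCaps {
    uint32_t ov5;      /* everything the platform is able to offer */
    uint32_t ov5_cas;  /* subset actually negotiated with the guest */
} DemoCaps;

/* Assumed negotiation rule: keep only what both sides support. */
void demo_negotiate(DemoCaps *caps, uint32_t guest_requested)
{
    caps->ov5_cas = caps->ov5 & guest_requested;
}

/* True if the given capability survived negotiation. */
bool demo_negotiated(const DemoCaps *caps, uint32_t cap)
{
    return (caps->ov5_cas & cap) != 0;
}
```

A field comment along the lines of the two comments in DemoCaps is the kind of description being asked for.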


> >  uint32_t check_exception_irq;
> >  Notifier epow_notifier;
> > @@ -577,7 +580,7 @@ void spapr_events_init(sPAPRMachineState *sm);
> >  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
> >  int spapr_h_cas_compose_response(sPAPRMachineState *sm,
> >   target_ulong addr, target_ulong size,
> > - bool cpu_update, bool memory_update);
> > + bool cpu_update);
> >  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn);
> >  void spapr_tce_table_enable(sPAPRTCETable *tcet,
> >  uint32_t page_shift, uint64_t bus_offset,
> > diff --git a/include/hw/ppc/spapr_ovec.h b/include/hw/ppc/spapr_ovec.h
> > index fba2d98..09afd59 100644
> > --- a/include/hw/ppc/spapr_ovec.h
> > +++ b/include/hw/ppc/spapr_ovec.h
> > @@ -42,6 +42,9 @@ typedef struct sPAPROptionVector sPAPROptionVector;
> >  
> >  #define OV_BIT(byte, bit) ((byte - 1) * BITS_PER_BYTE + bit)
> >  
> > +/* option vector 5 */
> > +#define OV5_DRCONF_MEMORY   OV_BIT(2, 2)
> > +
> >  /* interfaces */
> >  sPAPROptionVector *spapr_ovec_new(void);
> >  sPAPROptionVector *spapr_ovec_clone(sPAPROptionVector *ov_orig);
> 
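
Editorial note: the OV_BIT() macro quoted above turns a 1-based byte number and a bit number within that byte into a flat bit index. The sketch below copies the macro as quoted and assumes BITS_PER_BYTE is 8; demo_drconf_memory_bit is a hypothetical helper.

```c
/* Assumption: BITS_PER_BYTE is 8, as defined elsewhere in the tree. */
#define BITS_PER_BYTE 8

/* Copied from the quoted patch: 1-based byte, bit within the byte. */
#define OV_BIT(byte, bit) ((byte - 1) * BITS_PER_BYTE + bit)

/* option vector 5 */
#define OV5_DRCONF_MEMORY   OV_BIT(2, 2)

/* Flat index for byte 2, bit 2: (2 - 1) * 8 + 2 = 10. */
int demo_drconf_memory_bit(void)
{
    return OV5_DRCONF_MEMORY;
}
```

By the same arithmetic, OV_BIT(5, 0) is 32 and OV_BIT(6, 5) is 45.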



-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 03/11] spapr: add option vector handling in CAS-generated resets

2016-10-13 Thread David Gibson

On Wed, Oct 12, 2016 at 06:13:51PM -0500, Michael Roth wrote:
> In some cases, ibm,client-architecture-support calls can fail. This
> could happen in the current code for situations where the modified
> device tree segment exceeds the buffer size provided by the guest
> via the call parameters. In these cases, QEMU will reset, allowing
> an opportunity to regenerate the device tree from scratch via
> boot-time handling. There are potentially other scenarios as well,
> not currently reachable in the current code, but possible in theory,
> such as cases where device-tree properties or nodes need to be removed.
> 
> We currently don't handle either of these properly for option vector
> capabilities however. Instead of carrying the negotiated capability
> beyond the reset and creating the boot-time device tree accordingly,
> we start from scratch, generating the same boot-time device tree as we
> did prior to the CAS-generated reset and the same device tree updates as we
> did before. This could (in theory) cause us to get stuck in a reset
> loop. This hasn't been observed, but depending on the extensiveness
> of CAS-induced device tree updates in the future, could eventually
> become an issue.
> 
> Address this by pulling capability-related device tree
> updates resulting from CAS calls into a common routine,
> spapr_populate_cas_updates(), and adding an sPAPROptionVector*
> parameter that allows us to test for newly-negotiated capabilities.
> We invoke it as follows:
> 
> 1) When ibm,client-architecture-support gets called, we
>    call spapr_populate_cas_updates() with the set of capabilities
>    added since the previous call to ibm,client-architecture-support.
>    For the initial boot, or a system reset generated by something
>    other than the CAS call itself, this set will consist of *all*
>    options supported by both the platform and the guest. For calls
>    to ibm,client-architecture-support immediately after a CAS-induced
>    reset, we call spapr_populate_cas_updates() with only the set
>    of capabilities added since the previous call, since the other
>    capabilities will have already been addressed by the boot-time
>    device-tree this time around. In the unlikely event that
>    capabilities are *removed* since the previous CAS, we will
>    generate a CAS-induced reset. In the unlikely event that we
>    cannot fit the device-tree updates into the buffer provided
>    by the guest, we'll generate a CAS-induced reset.
> 
> 2) When a CAS update results in the need to reset the machine and
>    include the updates in the boot-time device tree, we call
>    spapr_populate_cas_updates() using the full set of negotiated
>    capabilities as part of the reset path. At initial boot, or after
>    a reset generated by something other than the CAS call itself,
>    this set will be empty, resulting in what should be the same
>    boot-time device-tree as we generated prior to this patch. For
>    CAS-induced reset, this routine will be called with the full set of
>    capabilities negotiated by the platform/guest in the previous
>    CAS call, which should result in CAS updates from previous call
>    being accounted for in the initial boot-time device tree.
> 
> Signed-off-by: Michael Roth 

Reviewed-by: David Gibson 

I suspect HPT resizing is also going to need actual CAS reboots
(rather than just adjusting the DT), so it's handy you've implemented
that here.
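
Editorial note: the "added since the previous call" and "removed since the previous CAS" sets from the commit message can be sketched with plain bitmasks. This is an illustration under the assumption that capabilities are represented as bits in a uint32_t; DemoCasDiff, demo_cas_diff and demo_cas_needs_reset are hypothetical and do not reflect the sPAPROptionVector API.

```c
#include <stdint.h>
#include <stdbool.h>

/* Result of comparing two capability sets. */
typedef struct DemoCasDiff {
    uint32_t added;    /* negotiated now, but not at the previous CAS */
    uint32_t removed;  /* negotiated at the previous CAS, but not now */
} DemoCasDiff;

/* Compare the previously negotiated set with the newly negotiated one. */
DemoCasDiff demo_cas_diff(uint32_t previous, uint32_t negotiated)
{
    DemoCasDiff d;

    d.added = negotiated & ~previous;
    d.removed = previous & ~negotiated;
    return d;
}

/* Per the commit message, any removed capability forces a reset. */
bool demo_cas_needs_reset(uint32_t previous, uint32_t negotiated)
{
    return demo_cas_diff(previous, negotiated).removed != 0;
}
```

At initial boot the previous set is empty, so "added" equals the full negotiated set, which matches case 1 above. The buffer-overflow reset case is not modelled here.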

> ---
>  hw/ppc/spapr.c | 43 ++-
>  hw/ppc/spapr_hcall.c   | 22 ++
>  include/hw/ppc/spapr.h |  4 +++-
>  3 files changed, 55 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 934d6b2..460c7a8 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -854,13 +854,28 @@ out:
>  return ret;
>  }
>  
> +static int spapr_populate_cas_updates(sPAPRMachineState *spapr, void *fdt,
> +  sPAPROptionVector *ov5_updates)
> +{
> +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> +int ret = 0;
> +
> +/* Generate ibm,dynamic-reconfiguration-memory node if required */
> +if (spapr_ovec_test(ov5_updates, OV5_DRCONF_MEMORY)) {
> +g_assert(smc->dr_lmb_enabled);
> +ret = spapr_populate_drconf_memory(spapr, fdt);
> +}
> +
> +return ret;
> +}
> +
>  int spapr_h_cas_compose_response(sPAPRMachineState *spapr,
>   target_ulong addr, target_ulong size,
> - bool cpu_update)
> + bool cpu_update,
> + sPAPROptionVector *ov5_updates)
>  {
>  void *fdt, *fdt_skel;
>  sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
> -sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
>  
>  size -= sizeof(hdr);
>  
> @@ -879,11 +894,7 @@ int

Re: [Qemu-devel] [PATCH 08/11] spapr_events: add support for dedicated hotplug event source

2016-10-13 Thread David Gibson

On Wed, Oct 12, 2016 at 06:13:56PM -0500, Michael Roth wrote:
> Hotplug events were previously delivered using an EPOW interrupt
> and were queued by linux guests into a circular buffer. For traditional
> EPOW events like shutdown/resets, this isn't an issue, but for hotplug
> events there are cases where this buffer can be exhausted, resulting
> in the loss of hotplug events, resets, etc.
> 
> Newer-style hotplug events are delivered using a dedicated event source.
> We enable this in supported guests by adding an additional standard
> event source in the guest device-tree via /event-sources, and, if
> the guest advertises support for the newer-style hotplug events,
> using the corresponding interrupt to signal the availability of
> hotplug/unplug events.
> 
> Signed-off-by: Michael Roth 

So.. are you saying that as well as allowing new event types, the new
special hotplug event source effectively allows for a bigger queue?

Does that mean that we didn't even necessarily need the base+length
unplug events, because we could now have sent the many single-LMB
unplug requests that were necessary?  Or does it not increase the
effective queue enough for that?
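
Editorial note: the quoted diff replaces fixed event mask constants with a class index plus an EVENT_CLASS_MASK(index) macro. The sketch below mirrors that mapping under DEMO_-prefixed, hypothetical names. It uses an unsigned 32-bit literal and parenthesizes the macro argument, which differs from the macro as quoted; that change is an assumption made here to avoid shifting a signed 1 into the sign bit.

```c
#include <stdint.h>

/* Mirrors the class indexes in the quoted patch, under hypothetical names. */
typedef enum DemoEventClassIndex {
    DEMO_EVENT_CLASS_INTERNAL_ERRORS = 0,
    DEMO_EVENT_CLASS_EPOW            = 1,
    DEMO_EVENT_CLASS_RESERVED        = 2,
    DEMO_EVENT_CLASS_HOT_PLUG        = 3,
    DEMO_EVENT_CLASS_IO              = 4,
} DemoEventClassIndex;

/* Class 0 maps to the most significant bit of a 32-bit mask. */
#define DEMO_EVENT_CLASS_MASK(index) (UINT32_C(1) << (31 - (index)))

/* Function wrapper so the mapping can be checked at run time. */
uint32_t demo_event_class_mask(DemoEventClassIndex index)
{
    return DEMO_EVENT_CLASS_MASK(index);
}
```

Under this mapping the hot-plug class (index 3) yields 0x10000000 and the EPOW class (index 1) yields 0x40000000.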

> ---
>  hw/ppc/spapr.c |  10 ++--
>  hw/ppc/spapr_events.c  | 148 ++---
>  include/hw/ppc/spapr.h |   3 +-
>  3 files changed, 120 insertions(+), 41 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d80a6fa..2037222 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -275,8 +275,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
> hwaddr initrd_size,
> hwaddr kernel_size,
> bool little_endian,
> -   const char *kernel_cmdline,
> -   uint32_t epow_irq)
> +   const char *kernel_cmdline)
>  {
>  void *fdt;
>  uint32_t start_prop = cpu_to_be32(initrd_base);
> @@ -437,7 +436,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>  _FDT((fdt_end_node(fdt)));
>  
>  /* event-sources */
> -spapr_events_fdt_skel(fdt, epow_irq);
> +spapr_events_fdt_skel(fdt);
>  
>  /* /hypervisor node */
>  if (kvm_enabled()) {
> @@ -1944,7 +1943,7 @@ static void ppc_spapr_init(MachineState *machine)
>  }
>  g_free(filename);
>  
> -/* Set up EPOW events infrastructure */
> +/* Set up RTAS event infrastructure */
>  spapr_events_init(spapr);
>  
>  /* Set up the RTC RTAS interfaces */
> @@ -2076,8 +2075,7 @@ static void ppc_spapr_init(MachineState *machine)
>  /* Prepare the device tree */
>  spapr->fdt_skel = spapr_create_fdt_skel(initrd_base, initrd_size,
>  kernel_size, kernel_le,
> -kernel_cmdline,
> -spapr->check_exception_irq);
> +kernel_cmdline);
>  assert(spapr->fdt_skel != NULL);
>  
>  /* used by RTAS */
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 4c7b6ae..f8bbec6 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -40,6 +40,7 @@
>  #include "hw/ppc/spapr_drc.h"
>  #include "qemu/help_option.h"
>  #include "qemu/bcd.h"
> +#include "hw/ppc/spapr_ovec.h"
>  #include 
>  
>  struct rtas_error_log {
> @@ -206,28 +207,104 @@ struct hp_log_full {
>  struct rtas_event_log_v6_hp hp;
>  } QEMU_PACKED;
>  
> -#define EVENT_MASK_INTERNAL_ERRORS   0x8000
> -#define EVENT_MASK_EPOW  0x4000
> -#define EVENT_MASK_HOTPLUG   0x1000
> -#define EVENT_MASK_IO0x0800
> +typedef enum EventClassIndex {
> +EVENT_CLASS_INTERNAL_ERRORS = 0,
> +EVENT_CLASS_EPOW= 1,
> +EVENT_CLASS_RESERVED= 2,
> +EVENT_CLASS_HOT_PLUG= 3,
> +EVENT_CLASS_IO  = 4,
> +EVENT_CLASS_MAX
> +} EventClassIndex;
> +
> +#define EVENT_CLASS_MASK(index) (1 << (31 - index))
> +
> +typedef struct EventSource {
> +const char *name;
> +int irq;
> +uint32_t mask;
> +bool enabled;
> +} EventSource;
> +
> +static EventSource event_source[EVENT_CLASS_MAX] = {
> +[EVENT_CLASS_INTERNAL_ERRORS]   = { .name = "internal-errors", },
> +[EVENT_CLASS_EPOW]  = { .name = "epow-events", },
> +[EVENT_CLASS_HOT_PLUG]  = { .name = "hot-plug-events", },
> +[EVENT_CLASS_IO]= { .name = "ibm,io-events", },
> +};
> +
> +static void rtas_event_source_register(EventClassIndex index, int irq)
> +{
> +/* we only support 1 irq per event class at the moment */
> +g_assert(!event_source[index].enabled);
> +event_source[index].irq = irq;
> +event_source[index].mask =

Re: [Qemu-devel] [PATCH 05/11] spapr: fix inheritance chain for default machine options

2016-10-13 Thread David Gibson

On Wed, Oct 12, 2016 at 06:13:53PM -0500, Michael Roth wrote:
> Rather than machine instances having backward-compatible option
> defaults that need to be repeatedly re-enabled for every new machine
> type we introduce, we set the defaults appropriate for newer machine
> types, then add code to explicitly disable instance options as needed
> to maintain compatibility with older machine types.
> 
> Currently pseries-2.5 does not inherit from pseries-2.6 in this
> fashion, which is okay at the moment since we do not have any
> instance compatibility options for pseries-2.6+ currently.
> 
> We will make use of this in future patches though, so fix it here.
> 
> Signed-off-by: Michael Roth 

This patch stands on its own, so I've applied it to ppc-for-2.8 (and
also extended it to make 2_7 inherit from 2_8).
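
Editorial note: a reduced sketch of the inheritance chain described in the commit message. DemoMachine and its two fields are invented for illustration; only the chaining pattern (each older machine type calls the next newer one's instance options first) is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical, reduced machine state; the field names are invented. */
typedef struct DemoMachine {
    bool feature_new_in_2_8;
    bool feature_new_in_2_7;
} DemoMachine;

/* Instance init sets the defaults wanted by the newest machine type. */
void demo_machine_init(DemoMachine *m)
{
    m->feature_new_in_2_8 = true;
    m->feature_new_in_2_7 = true;
}

/* 2.7 only has to switch off what arrived after it. */
void demo_2_7_instance_options(DemoMachine *m)
{
    m->feature_new_in_2_8 = false;
}

/* 2.6 inherits the 2.7 compat settings, then adds its own. */
void demo_2_6_instance_options(DemoMachine *m)
{
    demo_2_7_instance_options(m);
    m->feature_new_in_2_7 = false;
}

/* 2.5 has nothing of its own here, but must still chain to 2.6. */
void demo_2_5_instance_options(DemoMachine *m)
{
    demo_2_6_instance_options(m);
}
```

Without the chained calls, a compat setting added for 2.7 would silently be skipped for 2.6 and 2.5, which is the gap the patch closes.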

> ---
>  hw/ppc/spapr.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 3b2a459..f8cde92 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2544,6 +2544,7 @@ DEFINE_SPAPR_MACHINE(2_7, "2.7", false);
>  
>  static void spapr_machine_2_6_instance_options(MachineState *machine)
>  {
> +spapr_machine_2_7_instance_options(machine);
>  }
>  
>  static void spapr_machine_2_6_class_options(MachineClass *mc)
> @@ -2568,6 +2569,7 @@ DEFINE_SPAPR_MACHINE(2_6, "2.6", false);
>  
>  static void spapr_machine_2_5_instance_options(MachineState *machine)
>  {
> +spapr_machine_2_6_instance_options(machine);
>  }
>  
>  static void spapr_machine_2_5_class_options(MachineClass *mc)

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 09/11] spapr: Add DRC count indexed hotplug identifier type

2016-10-13 Thread David Gibson

On Wed, Oct 12, 2016 at 06:13:57PM -0500, Michael Roth wrote:
> From: Bharata B Rao 
> 
> Add support for DRC count indexed hotplug ID type which is primarily
> needed for memory hot unplug. This type allows for specifying the
> number of DRs that should be plugged/unplugged starting from a given
> DRC index.
> 
> Signed-off-by: Bharata B Rao 
> * updated rtas_event_log_v6_hp to reflect count/index field ordering
>   used in PAPR hotplug ACR
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr_events.c  | 74 --
>  include/hw/ppc/spapr.h |  4 +++
>  2 files changed, 63 insertions(+), 15 deletions(-)
> 
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index f8bbec6..eeca800 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -175,6 +175,16 @@ struct epow_log_full {
>  struct rtas_event_log_v6_epow epow;
>  } QEMU_PACKED;
>  
> +union drc_identifier {
> +uint32_t index;
> +uint32_t count;
> +struct {
> +uint32_t count;
> +uint32_t index;
> +} count_indexed;
> +char name[1];
> +} QEMU_PACKED;
> +
>  struct rtas_event_log_v6_hp {
>  #define RTAS_LOG_V6_SECTION_ID_HOTPLUG  0x4850 /* HP */
>  struct rtas_event_log_v6_section_header hdr;
> @@ -191,12 +201,9 @@ struct rtas_event_log_v6_hp {
>  #define RTAS_LOG_V6_HP_ID_DRC_NAME   1
>  #define RTAS_LOG_V6_HP_ID_DRC_INDEX  2
>  #define RTAS_LOG_V6_HP_ID_DRC_COUNT  3
> +#define RTAS_LOG_V6_HP_ID_DRC_COUNT_INDEXED  4
>  uint8_t reserved;
> -union {
> -uint32_t index;
> -uint32_t count;
> -char name[1];
> -} drc;
> +union drc_identifier drc_id;
>  } QEMU_PACKED;
>  
>  struct hp_log_full {
> @@ -457,7 +464,7 @@ static void spapr_hotplug_set_signalled(uint32_t drc_index)
>  
>  static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
>  sPAPRDRConnectorType drc_type,
> -uint32_t drc)
> +union drc_identifier *drc_id)
>  {
>  sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>  struct hp_log_full *new_hp;
> @@ -502,7 +509,7 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
>  case SPAPR_DR_CONNECTOR_TYPE_PCI:
>  hp->hotplug_type = RTAS_LOG_V6_HP_TYPE_PCI;
>  if (hp->hotplug_action == RTAS_LOG_V6_HP_ACTION_ADD) {
> -spapr_hotplug_set_signalled(drc);
> +spapr_hotplug_set_signalled(drc_id->index);
>  }
>  break;
>  case SPAPR_DR_CONNECTOR_TYPE_LMB:
> @@ -520,9 +527,16 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
>  }
>  
>  if (hp_id == RTAS_LOG_V6_HP_ID_DRC_COUNT) {
> -hp->drc.count = cpu_to_be32(drc);
> +hp->drc_id.count = cpu_to_be32(drc_id->count);
>  } else if (hp_id == RTAS_LOG_V6_HP_ID_DRC_INDEX) {
> -hp->drc.index = cpu_to_be32(drc);
> +hp->drc_id.index = cpu_to_be32(drc_id->index);
> +} else if (hp_id == RTAS_LOG_V6_HP_ID_DRC_COUNT_INDEXED) {
> +/* we should not be using count_indexed value unless the guest
> + * supports dedicated hotplug event source
> + */
> +g_assert(spapr_ovec_test(spapr->ov5_cas, OV5_HP_EVT));
> +hp->drc_id.count_indexed.count = cpu_to_be32(drc_id->count_indexed.count);
> +hp->drc_id.count_indexed.index = cpu_to_be32(drc_id->count_indexed.index);
>  }
>  
>  rtas_event_log_queue(RTAS_LOG_TYPE_HOTPLUG, new_hp, true);
> @@ -535,34 +549,64 @@ void spapr_hotplug_req_add_by_index(sPAPRDRConnector *drc)
>  {
>  sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>  sPAPRDRConnectorType drc_type = drck->get_type(drc);
> -uint32_t index = drck->get_index(drc);
> +union drc_identifier drc_id;
>  
> +drc_id.index = drck->get_index(drc);
>  spapr_hotplug_req_event(RTAS_LOG_V6_HP_ID_DRC_INDEX,
> -RTAS_LOG_V6_HP_ACTION_ADD, drc_type, index);
> +RTAS_LOG_V6_HP_ACTION_ADD, drc_type, &drc_id);
>  }
>  
>  void spapr_hotplug_req_remove_by_index(sPAPRDRConnector *drc)
>  {
>  sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>  sPAPRDRConnectorType drc_type = drck->get_type(drc);
> -uint32_t index = drck->get_index(drc);
> +union drc_identifier drc_id;
>  
> +drc_id.index = drck->get_index(drc);
>  spapr_hotplug_req_event(RTAS_LOG_V6_HP_ID_DRC_INDEX,
> -RTAS_LOG_V6_HP_ACTION_REMOVE, drc_type, index);
> +RTAS_LOG_V6_HP_ACTION_REMOVE, drc_type, &drc_id);
>  }
>  
>  void spapr_hotplug_req_add_by_count(sPAPRDRConnectorType drc_type,
> uint32_t

Re: [Qemu-devel] [PATCH v6 00/15] nbd: efficient write zeroes

2016-10-13 Thread no-reply

Hi,

Your series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH v6 00/15] nbd: efficient write zeroes
Type: series
Message-id: 1476392335-9256-1-git-send-email-ebl...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git show --no-patch --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
3e3cab6 nbd: Implement NBD_CMD_WRITE_ZEROES on client
6127973 nbd: Implement NBD_CMD_WRITE_ZEROES on server
dbe675e nbd: Improve server handling of shutdown requests
d86be2e nbd: Support shorter handshake
598c0b0 nbd: Less allocation during NBD_OPT_LIST
71d9fb4 nbd: Let client skip portions of server reply
4554cca nbd: Let server know when client gives up negotiation
c33b361 nbd: Share common option-sending code in client
9e216fe nbd: Send message along with server NBD_REP_ERR errors
761e798 nbd: Share common reply-sending code in server
dd5e949 nbd: Rename struct nbd_request and nbd_reply
45a155f nbd: Rename NbdClientSession to NBDClientSession
00f1816 nbd: Rename NBDRequest to NBDRequestData
bcd4778 nbd: Treat flags vs. command type as separate fields
fde7920 nbd: Add qemu-nbd -D for human-readable description

=== OUTPUT BEGIN ===
Checking PATCH 1/15: nbd: Add qemu-nbd -D for human-readable description...
Checking PATCH 2/15: nbd: Treat flags vs. command type as separate fields...
Checking PATCH 3/15: nbd: Rename NBDRequest to NBDRequestData...
Checking PATCH 4/15: nbd: Rename NbdClientSession to NBDClientSession...
Checking PATCH 5/15: nbd: Rename struct nbd_request and nbd_reply...
Checking PATCH 6/15: nbd: Share common reply-sending code in server...
Checking PATCH 7/15: nbd: Send message along with server NBD_REP_ERR errors...
Checking PATCH 8/15: nbd: Share common option-sending code in client...
Checking PATCH 9/15: nbd: Let server know when client gives up negotiation...
Checking PATCH 10/15: nbd: Let client skip portions of server reply...
Checking PATCH 11/15: nbd: Less allocation during NBD_OPT_LIST...
Checking PATCH 12/15: nbd: Support shorter handshake...
Checking PATCH 13/15: nbd: Improve server handling of shutdown requests...
ERROR: return of an errno should typically be -ve (return -ESHUTDOWN)
#63: FILE: nbd/client.c:38:
+return ESHUTDOWN;

total: 1 errors, 0 warnings, 95 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 14/15: nbd: Implement NBD_CMD_WRITE_ZEROES on server...
Checking PATCH 15/15: nbd: Implement NBD_CMD_WRITE_ZEROES on client...
=== OUTPUT END ===

Test command exited with code: 1
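
For readers unfamiliar with the checkpatch error on PATCH 13/15: it refers to the common convention that a function signals failure by returning a negated errno value, so callers can test "ret < 0". The following is only a minimal sketch of that convention. The names demo_send() and demo_describe() are hypothetical and are not taken from nbd/client.c.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper, not from the nbd series: returns 0 on success or a
 * negative errno value on failure. */
static int demo_send(int shut_down)
{
    if (shut_down) {
        return -ESHUTDOWN;   /* negative, so callers can test "ret < 0" */
    }
    return 0;
}

/* Callers negate the value again before passing it to strerror(). */
static const char *demo_describe(int ret)
{
    return ret < 0 ? strerror(-ret) : "ok";
}
```

Returning the positive constant instead would make a failure indistinguishable from a positive "success" value in callers that only check the sign.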


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices

2016-10-13 Thread Alex Williamson

On Fri, 14 Oct 2016 09:01:01 +0530
Kirti Wankhede  wrote:

> On 10/13/2016 8:06 PM, Alex Williamson wrote:
> > On Thu, 13 Oct 2016 14:52:09 +0530
> > Kirti Wankhede  wrote:
> >   
> >> On 10/13/2016 3:14 AM, Alex Williamson wrote:  
> >>> Under the device we have "mtty2", shouldn't that be
> >>> "mdev_supported_type", which then links to mtty2?  Otherwise a user
> >>> needs to decode from the link what this attribute is.
> >>> 
> >>
> >> I thought it should show type, so that by looking at 'ls' output user
> >> should be able to find type_id.  
> > 
> > The type_id should be shown by actually reading the link, not by the
> > link name itself, the same way that the iommu_group link for a device
> > isn't the group number, it links to the group number but uses a
> > standard link name.
> >   
> 
> Ok. I'll rename the link name to 'mdev_supported_type'

Hmm, if we have a device, then clearly it's a supported type, we can
probably reduce this to 'mdev_type'.  Sorry for not catching that.

BTW, please include the linux-kernel mailing list on the CC in your next
posting.  Thanks,

Alex

Re: [Qemu-devel] [RFC PATCH 00/11] spapr: option vector re-work and memory unplug support

2016-10-13 Thread no-reply

Hi,

Your series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [RFC PATCH 00/11] spapr: option vector re-work and memory unplug support
Type: series
Message-id: 1476314039-9520-1-git-send-email-mdr...@linux.vnet.ibm.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git show --no-patch --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
b6c6ecd spapr: Memory hot-unplug support
74753f0 spapr: use count+index for memory hotplug
2de1399 spapr: Add DRC count indexed hotplug identifier type
860a9e5 spapr_events: add support for dedicated hotplug event source
895d1aa spapr: add hotplug interrupt machine options
fffa858 spapr: update spapr hotplug documentation
e9df226 spapr: fix inheritance chain for default machine options
dc9b8b1 spapr: improve ibm, architecture-vec-5 property handling
be26f44 spapr: add option vector handling in CAS-generated resets
cc5d859 spapr_hcall: use spapr_ovec_* interfaces for CAS options
90daf38 spapr_ovec: initial implementation of option vector helpers

=== OUTPUT BEGIN ===
Checking PATCH 1/11: spapr_ovec: initial implementation of option vector helpers...
WARNING: architecture specific defines should be avoided
#338: FILE: include/hw/ppc/spapr_ovec.h:36:
+#if !defined(__HW_SPAPR_OPTION_VECTORS_H__)

total: 0 errors, 1 warnings, 314 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 2/11: spapr_hcall: use spapr_ovec_* interfaces for CAS options...
Checking PATCH 3/11: spapr: add option vector handling in CAS-generated resets...
Checking PATCH 4/11: spapr: improve ibm, architecture-vec-5 property handling...
Checking PATCH 5/11: spapr: fix inheritance chain for default machine options...
Checking PATCH 6/11: spapr: update spapr hotplug documentation...
Checking PATCH 7/11: spapr: add hotplug interrupt machine options...
Checking PATCH 8/11: spapr_events: add support for dedicated hotplug event source...
ERROR: switch and case should be at the same indent
#164: FILE: hw/ppc/spapr_events.c:283:
+switch (log_type) {
+case RTAS_LOG_TYPE_HOTPLUG:
[...]
+case RTAS_LOG_TYPE_EPOW:
[...]
+default:

total: 1 errors, 0 warnings, 272 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 9/11: spapr: Add DRC count indexed hotplug identifier type...
WARNING: line over 80 characters
#85: FILE: hw/ppc/spapr_events.c:538:
+hp->drc_id.count_indexed.count = cpu_to_be32(drc_id->count_indexed.count);

WARNING: line over 80 characters
#86: FILE: hw/ppc/spapr_events.c:539:
+hp->drc_id.count_indexed.index = cpu_to_be32(drc_id->count_indexed.index);

total: 0 errors, 2 warnings, 144 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 10/11: spapr: use count+index for memory hotplug...
WARNING: line over 80 characters
#58: FILE: hw/ppc/spapr.c:2265:
+   addr_start / SPAPR_MEMORY_BLOCK_SIZE);

WARNING: line over 80 characters
#64: FILE: hw/ppc/spapr.c:2271:
+spapr_hotplug_req_add_by_count(SPAPR_DR_CONNECTOR_TYPE_LMB, nr_lmbs);

total: 0 errors, 2 warnings, 45 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 11/11: spapr: Memory hot-unplug support...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [PATCHv3 0/7] Improve PCI IO window orgnaization for pseries

2016-10-13 Thread no-reply

Hi,

Your series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 1476316647-9433-1-git-send-email-da...@gibson.dropbear.id.au
Subject: [Qemu-devel] [PATCHv3 0/7] Improve PCI IO window orgnaization for pseries

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git show --no-patch --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
8fa0718 spapr: Improved placement of PCI host bridges in guest memory map
6416268 spapr_pci: Add a 64-bit MMIO window
94e8f4c spapr: Adjust placement of PCI host bridge to allow > 1TiB RAM
692718a spapr_pci: Delegate placement of PCI host bridges to machine type
2b765c9 libqos: Limit spapr-pci to 32-bit MMIO for now
4543692 libqos: Correct error in PCI hole sizing for spapr
2c35727 libqos: Isolate knowledge of spapr memory map to qpci_init_spapr()

=== OUTPUT BEGIN ===
Checking PATCH 1/7: libqos: Isolate knowledge of spapr memory map to qpci_init_spapr()...
Checking PATCH 2/7: libqos: Correct error in PCI hole sizing for spapr...
Checking PATCH 3/7: libqos: Limit spapr-pci to 32-bit MMIO for now...
Checking PATCH 4/7: spapr_pci: Delegate placement of PCI host bridges to machine type...
Checking PATCH 5/7: spapr: Adjust placement of PCI host bridge to allow > 1TiB RAM...
Checking PATCH 6/7: spapr_pci: Add a 64-bit MMIO window...
ERROR: trailing whitespace
#237: FILE: include/hw/ppc/spapr.h:44:
+  uint64_t *buid, hwaddr *pio, $

total: 1 errors, 0 warnings, 170 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 7/7: spapr: Improved placement of PCI host bridges in guest memory map...
ERROR: Macros with multiple statements should be enclosed in a do - while loop
#141: FILE: hw/ppc/spapr.c:2528:
+#define SPAPR_COMPAT_2_7\
+HW_COMPAT_2_7   \
+{   \
+.driver   = TYPE_SPAPR_PCI_HOST_BRIDGE, \
+.property = "mem_win_size", \
+.value= stringify(SPAPR_PCI_2_7_MMIO_WIN_SIZE),\
+},  \
+{   \
+.driver   = TYPE_SPAPR_PCI_HOST_BRIDGE, \
+.property = "mem64_win_size",   \
+.value= "0",\
+},

total: 1 errors, 0 warnings, 225 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1
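
As background on the PATCH 7/7 error text: checkpatch asks for the do - while wrapper because a macro made of several statements otherwise breaks when used as the body of an if/else. The sketch below shows that idiom with hypothetical names (SWAP_INT, sort_pair). It does not apply to macros that expand to a list of initializers rather than statements; whether the flagged macro is such a case is for the author and maintainers to judge.

```c
/* Hypothetical multi-statement macro, wrapped so that it expands to exactly
 * one statement and still requires a trailing semicolon at the call site. */
#define SWAP_INT(a, b)      \
    do {                    \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)

/* Orders the pair ascending; returns 1 if a swap was needed, 0 otherwise.
 * Without the do - while wrapper the "else" below would not compile. */
static int sort_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);
    else
        return 0;
    return 1;
}
```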


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [PATCH v4 00/12] qapi: Allow blockdev-add for NBD

2016-10-13 Thread no-reply

Hi,

Your series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Subject: [Qemu-devel] [PATCH v4 00/12] qapi: Allow blockdev-add for NBD
Type: series
Message-id: 20160928205602.17275-1-mre...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=16
make docker-test-quick@centos6
make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
db8b8fd iotests: Add test for NBD's blockdev-add interface
bc67989 socket_scm_helper: Accept fd directly
f2a2af7 iotests.py: Allow concurrent qemu instances
e86dbb6 iotests.py: Add qemu_nbd function
b5f7f81 qapi: Allow blockdev-add for NBD
1dd5e1d block/nbd: Use SocketAddress options
bbfdd26 block/nbd: Accept SocketAddress
0c908ca block/nbd: Add nbd_has_filename_options_conflict()
a534bca block/nbd: Use qdict_put()
134116f block/nbd: Default port in nbd_refresh_filename()
dfba813 block/nbd: Reject port parameter without host
28956bb block/nbd: Drop trailing "." in error messages

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD   centos6
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPY    RUNNER
  RUN     test-quick in centos6
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel glib2-devel SDL-devel pixman-devel epel-release
HOSTNAME=c2cf8161a74c
TERM=xterm
MAKEFLAGS= -j16
HISTSIZE=1000
J=16
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/var/tmp/qemu-build/install
No C++ compiler available; disabling C++ specific optional code
Install prefix    /var/tmp/qemu-build/install
BIOS directory    /var/tmp/qemu-build/install/share/qemu
binary directory  /var/tmp/qemu-build/install/bin
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib/qemu
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install/etc
local state directory   /var/tmp/qemu-build/install/var
Manual directory  /var/tmp/qemu-build/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path       /tmp/qemu-test/src
C compiler        cc
Host C compiler   cc
C++ compiler      
Objective-C compiler cc
ARFLAGS           rv
CFLAGS            -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g
QEMU_CFLAGS       -I/usr/include/pixman-1 -pthread -I/usr/include/glib-2.0
-I/usr/lib64/glib-2.0/include   -fPIE -DPIE -m64 -D_GNU_SOURCE
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes
-Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes
-fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels -Wmissing-include-dirs
-Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self
-Wignored-qualifiers -Wold-style-declaration -Wold-style-definition
-Wtype-limits -fstack-protector-all
LDFLAGS           -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g
make              make
install           install
python            python -B
smbd              /usr/sbin/smbd
module support    no
host CPU          x86_64
host big endian   no
target list       x86_64-softmmu aarch64-softmmu
tcg debug enabled no
gprof enabled     no
sparse enabled    no
strip binaries    yes
profiler          no
static build      no
pixman            system
SDL support       yes (1.2.14)
GTK support       no
GTK GL support    no
VTE support       no
TLS priority      NORMAL
GNUTLS support    no
GNUTLS rnd        no
libgcrypt         no
libgcrypt kdf     no
nettle            no
nettle kdf        no
libtasn1          no
curses support    no
virgl support     no
curl support      no
mingw32 support   no
Audio drivers     oss
Block whitelist (rw) 
Block whitelist (ro) 
VirtFS support    no
VNC support       yes
VNC SASL support  no
VNC JPEG support  no
VNC PNG support   no
xen support

Re: [Qemu-devel] [PATCH v2 03/20] target-ppc: move back cpu_exec_init() to init

2016-10-13 Thread David Gibson

On Thu, Oct 13, 2016 at 06:24:45PM +0200, Laurent Vivier wrote:
> We have now the cpu_exec_realize() in realize,
> so the init part must be in init.
> 
> As cpu_exec_unrealize() is called from cpu_common_finalize(),
> remove the call from ppc_cpu_unrealizefn().
> 
> CC: Bharata B Rao 
> CC: Alexander Graf 
> CC: qemu-...@nongnu.org
> Signed-off-by: Laurent Vivier 
> ---
>  target-ppc/translate_init.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index 094f28a..bbca8b5 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -9678,7 +9678,6 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
>  }
>  #endif
>  
> -cpu_exec_init(cs);
>  cpu_exec_realize(cs, &local_err);
>  if (local_err != NULL) {
>  error_propagate(errp, local_err);
> @@ -9911,8 +9910,6 @@ static void ppc_cpu_unrealizefn(DeviceState *dev, Error **errp)
>  opc_handler_t **table, **table_2;
>  int i, j, k;
>  
> -cpu_exec_unrealize(CPU(dev));
> -

This doesn't seem right.  As you said in 0/20, cpu_exec_unrealize() is
called from cpu_common_finalize().  But finalize should mirror init,
not unrealize().  So it seems that unrealize() really should belong
here, not in finalize.

>  for (i = 0; i < PPC_CPU_OPCODES_LEN; i++) {
>  if (env->opcodes[i] == &invalid_handler) {
>  continue;
> @@ -10435,6 +10432,7 @@ static void ppc_cpu_initfn(Object *obj)
>  CPUPPCState *env = &cpu->env;
>  
>  cs->env_ptr = env;
> +cpu_exec_init(cs);
>  
>  env->msr_mask = pcc->msr_mask;
>  env->mmu_model = pcc->mmu_model;

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
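
The pairing argued for in the review above (finalize mirrors init, unrealize mirrors realize) can be pictured with a small C sketch. DemoCpu and the demo_* functions are hypothetical and are not QEMU's QOM API; they only model which step is expected to undo which.

```c
#include <stdbool.h>

/* Hypothetical object with the two lifecycle stages discussed above. */
typedef struct DemoCpu {
    bool initialized;   /* set by demo_init(), cleared by demo_finalize() */
    bool realized;      /* set by demo_realize(), cleared by demo_unrealize() */
} DemoCpu;

static void demo_init(DemoCpu *c)
{
    c->initialized = true;
    c->realized = false;
}

static void demo_realize(DemoCpu *c)
{
    c->realized = true;
}

/* Undoes demo_realize() only. */
static void demo_unrealize(DemoCpu *c)
{
    c->realized = false;
}

/* Undoes demo_init() only; returns -1 if the object is still realized,
 * which would mean an unrealize step was skipped. */
static int demo_finalize(DemoCpu *c)
{
    if (c->realized) {
        return -1;
    }
    c->initialized = false;
    return 0;
}
```

In this model, moving the realize-time cleanup into the finalize step would be the asymmetry the review questions.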



Re: [Qemu-devel] [PATCH v2 01/20] exec: split cpu_exec_init()

2016-10-13 Thread David Gibson

On Thu, Oct 13, 2016 at 06:24:43PM +0200, Laurent Vivier wrote:
> Extract the realize part to cpu_exec_realize(), update all
> calls to cpu_exec_init() to add cpu_exec_realize() to
> have no functional change.
> 
> Put in cpu_exec_init() what initializes the CPU,
> in cpu_exec_realize() what adds it to the environment.
> 
> Remove error parameter from cpu_exec_init() as it can't fail.
> 
> Rename cpu_exec_exit() with cpu_exec_unrealize():
> cpu_exec_exit() is undoing what it has been done by cpu_exec_realize(), so
> call it cpu_exec_unrealize().
> 
> CC: Paolo Bonzini 
> Signed-off-by: Laurent Vivier 

Reviewed-by: David Gibson 

> ---
>  exec.c  | 12 +++-
>  include/exec/exec-all.h |  3 ++-
>  include/qom/cpu.h   |  2 +-
>  qom/cpu.c   |  2 +-
>  target-alpha/cpu.c  |  3 ++-
>  target-arm/cpu.c|  3 ++-
>  target-cris/cpu.c   |  3 ++-
>  target-i386/cpu.c   |  3 ++-
>  target-lm32/cpu.c   |  3 ++-
>  target-m68k/cpu.c   |  3 ++-
>  target-microblaze/cpu.c |  3 ++-
>  target-mips/cpu.c   |  3 ++-
>  target-moxie/cpu.c  |  3 ++-
>  target-openrisc/cpu.c   |  3 ++-
>  target-ppc/translate_init.c |  5 +++--
>  target-s390x/cpu.c  |  3 ++-
>  target-sh4/cpu.c|  3 ++-
>  target-sparc/cpu.c  |  3 ++-
>  target-tilegx/cpu.c |  3 ++-
>  target-tricore/cpu.c|  3 ++-
>  target-unicore32/cpu.c  |  3 ++-
>  target-xtensa/cpu.c |  3 ++-
>  22 files changed, 48 insertions(+), 27 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 374c364..885dc79 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -596,7 +596,7 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
>  }
>  #endif
>  
> -void cpu_exec_exit(CPUState *cpu)
> +void cpu_exec_unrealize(CPUState *cpu)
>  {
>  CPUClass *cc = CPU_GET_CLASS(cpu);
>  
> @@ -610,11 +610,8 @@ void cpu_exec_exit(CPUState *cpu)
>  }
>  }
>  
> -void cpu_exec_init(CPUState *cpu, Error **errp)
> +void cpu_exec_init(CPUState *cpu)
>  {
> -CPUClass *cc ATTRIBUTE_UNUSED = CPU_GET_CLASS(cpu);
> -Error *local_err ATTRIBUTE_UNUSED = NULL;
> -
>  cpu->as = NULL;
>  cpu->num_ases = 0;
>  
> @@ -635,6 +632,11 @@ void cpu_exec_init(CPUState *cpu, Error **errp)
>  cpu->memory = system_memory;
>  object_ref(OBJECT(cpu->memory));
>  #endif
> +}
> +
> +void cpu_exec_realize(CPUState *cpu, Error **errp)
> +{
> +CPUClass *cc ATTRIBUTE_UNUSED = CPU_GET_CLASS(cpu);
>  
>  cpu_list_add(cpu);
>  
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 336a57c..b42533e 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -57,7 +57,8 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>uint32_t flags,
>int cflags);
>  
> -void cpu_exec_init(CPUState *cpu, Error **errp);
> +void cpu_exec_init(CPUState *cpu);
> +void cpu_exec_realize(CPUState *cpu, Error **errp);
>  void QEMU_NORETURN cpu_loop_exit(CPUState *cpu);
>  void QEMU_NORETURN cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc);
>  
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 6d481a1..4962980 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -946,7 +946,7 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx);
>  
>  void QEMU_NORETURN cpu_abort(CPUState *cpu, const char *fmt, ...)
>  GCC_FMT_ATTR(2, 3);
> -void cpu_exec_exit(CPUState *cpu);
> +void cpu_exec_unrealize(CPUState *cpu);
>  
>  #ifdef CONFIG_SOFTMMU
>  extern const struct VMStateDescription vmstate_cpu_common;
> diff --git a/qom/cpu.c b/qom/cpu.c
> index c40f774..39590e1 100644
> --- a/qom/cpu.c
> +++ b/qom/cpu.c
> @@ -367,7 +367,7 @@ static void cpu_common_initfn(Object *obj)
>  static void cpu_common_finalize(Object *obj)
>  {
>  CPUState *cpu = CPU(obj);
> -cpu_exec_exit(cpu);
> +cpu_exec_unrealize(CPU(obj));
>  g_free(cpu->trace_dstate);
>  }
>  
> diff --git a/target-alpha/cpu.c b/target-alpha/cpu.c
> index 6d01d7f..98761d7 100644
> --- a/target-alpha/cpu.c
> +++ b/target-alpha/cpu.c
> @@ -266,7 +266,8 @@ static void alpha_cpu_initfn(Object *obj)
>  CPUAlphaState *env = &cpu->env;
>  
>  cs->env_ptr = env;
> -cpu_exec_init(cs, &error_abort);
> +cpu_exec_init(cs);
> +cpu_exec_realize(cs, &error_abort);
>  tlb_flush(cs, 1);
>  
>  alpha_translate_init();
> diff --git a/target-arm/cpu.c b/target-arm/cpu.c
> index 1b9540e..7e58134 100644
> --- a/target-arm/cpu.c
> +++ b/target-arm/cpu.c
> @@ -444,7 +444,8 @@ static void arm_cpu_initfn(Object *obj)
>  uint32_t Aff1, Aff0;
>  
>  cs->env_ptr = &cpu->env;
> -cpu_exec_init(cs, &error_abort);
> +cpu_exec_init(cs);
> +cpu_exec_realize(cs, &error_abort);
>  cpu->cp_regs = g_hash_table_new_full(g_int_hash, g_int_equal,
>

Re: [Qemu-devel] [PATCH v3 00/13] pc: q35: x2APIC support in kvm_apic mode

2016-10-13 Thread no-reply

Hi,

Your series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Type: series
Message-id: 1476352367-69400-1-git-send-email-imamm...@redhat.com
Subject: [Qemu-devel] [PATCH v3 00/13] pc: q35: x2APIC support in kvm_apic mode

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=16
make docker-test-quick@centos6
make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
bca661e pc: q35: bump max_cpus to 288
17f6729 pc: require IRQ remapping and EIM if there could be x2APIC CPUs
8fc45a4 pc: add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs
a7d6417 increase MAX_CPUMASK_BITS from 255 to 288
b03673c pc: clarify FW_CFG_MAX_CPUS usage comment
f71d194 pc: kvm_apic: pass APIC ID depending on xAPIC/x2APIC mode
87d9001 pc: apic_common: reset APIC ID to initial ID when switching into x2APIC mode
426abcf pc: apic_common: restore APIC ID to initial ID on reset
9cb56de pc: apic_common: extend APIC ID property to 32bit
489db69 pc: leave max apic_id_limit only in legacy cpu hotplug code
a503ed4 acpi: cphp: force switch to modern cpu hotplug if APIC ID > 254
4899cb1 acpi: cphp: support x2APIC entry in cpu._MAT
b9fe39d pc: acpi: x2APIC support for SRAT table
4ffb548 pc: acpi: x2APIC support for MADT table

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD   centos6
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPY    RUNNER
  RUN     test-quick in centos6
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel glib2-devel SDL-devel pixman-devel epel-release
HOSTNAME=b7cb6513802f
TERM=xterm
MAKEFLAGS= -j16
HISTSIZE=1000
J=16
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/var/tmp/qemu-build/install
No C++ compiler available; disabling C++ specific optional code
Install prefix    /var/tmp/qemu-build/install
BIOS directory    /var/tmp/qemu-build/install/share/qemu
binary directory  /var/tmp/qemu-build/install/bin
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib/qemu
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install/etc
local state directory   /var/tmp/qemu-build/install/var
Manual directory  /var/tmp/qemu-build/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path       /tmp/qemu-test/src
C compiler        cc
Host C compiler   cc
C++ compiler      
Objective-C compiler cc
ARFLAGS           rv
CFLAGS            -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g
QEMU_CFLAGS       -I/usr/include/pixman-1 -pthread -I/usr/include/glib-2.0
-I/usr/lib64/glib-2.0/include   -fPIE -DPIE -m64 -D_GNU_SOURCE
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes
-Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes
-fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels -Wmissing-include-dirs
-Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self
-Wignored-qualifiers -Wold-style-declaration -Wold-style-definition
-Wtype-limits -fstack-protector-all
LDFLAGS           -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g
make              make
install           install
python            python -B
smbd              /usr/sbin/smbd
module support    no
host CPU          x86_64
host big endian   no
target list       x86_64-softmmu aarch64-softmmu
tcg debug enabled no
gprof enabled     no
sparse enabled    no
strip binaries    yes
profiler          no
static build      no
pixman            system
SDL support       yes (1.2.14)
GTK support       no
GTK GL support    no
VTE support       no
TLS priority      NORMAL
GNUTLS support    no
GNUTLS rnd        no
libgcrypt         no
libgcrypt kdf     no
nettle            no
nettle kdf        no

Re: [Qemu-devel] [PATCH v5] timer: a9gtimer: remove loop to auto-increment comparator

2016-10-13 Thread P J P

+-- On Fri, 14 Oct 2016, P J P wrote --+
| +if (gtb->control & R_CONTROL_AUTO_INCREMENT && gtb->inc) {
| +inc = update.new - gtb->compare;
| +inc = MAX(QEMU_ALIGN_DOWN(inc, gtb->inc), gtb->inc);
| +DB_PRINT("Auto incrementing timer compare by %"
| +PRId64 "\n", inc);
| +gtb->compare += inc;

  Please consider [PATCH v6]. I should've slept over this, instead of sending 
it at 02:00 hrs. at night.

Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F

[Qemu-devel] [PATCH v6] timer: a9gtimer: remove loop to auto-increment comparator

2016-10-13 Thread P J P

From: Prasad J Pandit 

ARM A9MP processor has a peripheral timer with an auto-increment
register, which holds an increment step value. A user could set
this value to zero. When auto-increment control bit is enabled,
it leads to an infinite loop in 'a9_gtimer_update' while
updating comparator value. Remove this loop incrementing the
comparator value.

Reported-by: Li Qiang 
Signed-off-by: Prasad J Pandit 
---
 hw/timer/a9gtimer.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

Update per
  -> https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg02891.html

diff --git a/hw/timer/a9gtimer.c b/hw/timer/a9gtimer.c
index 772f85f..03dfaf2 100644
--- a/hw/timer/a9gtimer.c
+++ b/hw/timer/a9gtimer.c
@@ -73,6 +73,7 @@ static void a9_gtimer_update(A9GTimerState *s, bool sync)
 
 A9GTimerUpdate update = a9_gtimer_get_update(s);
 int i;
+uint64_t inc;
 int64_t next_cdiff = 0;
 
 for (i = 0; i < s->num_cpu; ++i) {
@@ -82,15 +83,15 @@ static void a9_gtimer_update(A9GTimerState *s, bool sync)
 if ((s->control & R_CONTROL_TIMER_ENABLE) &&
 (gtb->control & R_CONTROL_COMP_ENABLE)) {
 /* R2p0+, where the compare function is >= */
-while (gtb->compare < update.new) {
+if (gtb->compare < update.new) {
 DB_PRINT("Compare event happened for CPU %d\n", i);
 gtb->status = 1;
-if (gtb->control & R_CONTROL_AUTO_INCREMENT) {
-DB_PRINT("Auto incrementing timer compare by %" PRId32 "\n",
- gtb->inc);
-gtb->compare += gtb->inc;
-} else {
-break;
+if (gtb->control & R_CONTROL_AUTO_INCREMENT && gtb->inc) {
+inc = update.new + gtb->inc - gtb->compare - 1;
+inc = QEMU_ALIGN_DOWN(inc, gtb->inc);
+DB_PRINT("Auto incrementing timer compare by %"
+PRId64 "\n", inc);
+gtb->compare += inc;
 }
 }
 cdiff = (int64_t)gtb->compare - (int64_t)update.new + 1;
-- 
2.5.5
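
The arithmetic in the hunk above replaces the removed while loop with a single rounded increment. The sketch below restates both forms side by side so they can be compared; it is an illustration, not the driver code. The function names are hypothetical, and align_down() is an assumption standing in for QEMU_ALIGN_DOWN, taken here to round its first argument down to a multiple of the second.

```c
#include <stdint.h>

/* Assumed behaviour of QEMU_ALIGN_DOWN: round n down to a multiple of m.
 * m must be non-zero. */
static uint64_t align_down(uint64_t n, uint64_t m)
{
    return n / m * m;
}

/* Closed form as in the v6 hunk: advance compare by the smallest multiple
 * of step that makes it >= now. A zero step leaves compare unchanged, so
 * there is nothing left to loop on. */
static uint64_t advance_closed_form(uint64_t compare, uint64_t now,
                                    uint32_t step)
{
    if (compare < now && step != 0) {
        uint64_t inc = now + step - compare - 1;
        compare += align_down(inc, step);
    }
    return compare;
}

/* The loop form that the patch removes. It never terminates for
 * step == 0 and compare < now, which is the reported problem. */
static uint64_t advance_loop(uint64_t compare, uint64_t now, uint32_t step)
{
    while (compare < now) {
        compare += step;
    }
    return compare;
}
```

For example, with compare = 10, now = 25 and step = 4, both forms yield 26; with step = 0 only the closed form returns.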

Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices

2016-10-13 Thread Kirti Wankhede



On 10/13/2016 8:06 PM, Alex Williamson wrote:
> On Thu, 13 Oct 2016 14:52:09 +0530
> Kirti Wankhede  wrote:
> 
>> On 10/13/2016 3:14 AM, Alex Williamson wrote:
>>> On Thu, 13 Oct 2016 00:32:48 +0530
>>> Kirti Wankhede  wrote:
>>>   
 On 10/12/2016 9:29 PM, Alex Williamson wrote:  
> On Wed, 12 Oct 2016 20:43:48 +0530
> Kirti Wankhede  wrote:
> 
>> On 10/12/2016 7:22 AM, Tian, Kevin wrote:
 From: Kirti Wankhede [mailto:kwankh...@nvidia.com]
 Sent: Wednesday, October 12, 2016 4:45 AM  
>> +* mdev_supported_types:
>> +List of current supported mediated device types and its details 
>> are added
>> +in this directory in following format:
>> +
>> +|- 
>> +|--- Vendor-specific-attributes [optional]
>> +|--- mdev_supported_types
>> +| |--- 
>> +| |   |--- create
>> +| |   |--- name
>> +| |   |--- available_instances
>> +| |   |--- description /class
>> +| |   |--- [devices]
>> +| |--- 
>> +| |   |--- create
>> +| |   |--- name
>> +| |   |--- available_instances
>> +| |   |--- description /class
>> +| |   |--- [devices]
>> +| |--- 
>> +|  |--- create
>> +|  |--- name
>> +|  |--- available_instances
>> +|  |--- description /class
>> +|  |--- [devices]
>> +
>> +[TBD : description or class is yet to be decided. This will 
>> change.]  
>
> I thought that in previous discussions we had agreed to drop
> the  concept and use the name as the unique identifier.
> When reporting these types in libvirt we won't want to report
> the type id values - we'll want the name strings to be unique.
>  

 The 'name' might not be unique but type_id will be. For example that 
 Neo
 pointed out in earlier discussion, virtual devices can come from two
 different physical devices, end user would be presented with what they
 had selected but there will be internal implementation differences. In
 that case 'type_id' will be unique.
  
>>>
>>> Hi, Kirti, my understanding is that Neo agreed to use an unique type
>>> string (if you still called it ), and then no need of 
>>> additional
>>> 'name' field which can be put inside 'description' field. See below 
>>> quote:
>>>   
>>
>> We had internal discussions about this within NVIDIA and found that
>> 'name' might not be unique where as 'type_id' would be unique. I'm
>> referring to Neo's mail after that, where Neo did point that out.
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07714.html
>
> Everyone not privy to those internal discussions, including me, seems to
> think we dropped type_id and that if a vendor does not have a stable
> name, they can compose some sort of stable type description based on the
> name+id, or even vendor+id, ex. NVIDIA-11.  So please share why we
> haven't managed to kill off type_id yet.  No matter what internal
> representation each vendor driver has of "type_id" it seems possible
> for it to come up with stable string to define a given configuration.


 The 'type_id' is unique and the 'name' is not; the name is just a
 virtual device name / human-readable name. Because at this moment Intel
 can't define a proper GPU class, we have to add a 'description' field
 there as well to represent the features of this virtual device. Once we
 have all agreed on the GPU class and its mandatory attributes, the
 'description' field can be removed. Here is an example:
 type_id/type_name = NVIDIA_11,
 name=M60-M0Q,
 description="2560x1600, 2 displays, 512MB"

 Neo's previous comment only applies to the situation where we will have
 the GPU class or optional attributes defined and recognized by libvirt,
 since that is not going to happen any time soon, we will have to have
 the new 'description' field, and we don't want to have it mixed up with
 'name' field.

 We can definitely have something like name+id as Alex recommended to
 remove the 'name' field, but it will just require libvirt to have more
 logic to parse that string.  
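
 (Illustration only, not from the thread: a minimal sketch of the
 name+id idea, assuming a stable type string of the form "vendor-id"
 such as the "NVIDIA-11" example mentioned above. The helper names
 compose_type() and parse_type() are hypothetical; they show the extra
 parsing logic a consumer such as libvirt would need if the separate
 'name' field were dropped.)

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "<vendor>-<id>", e.g. "NVIDIA-11".
 * Returns 0 on success, -1 if the result does not fit in out[n]. */
static int compose_type(char *out, size_t n, const char *vendor, unsigned id)
{
    int r;

    assert(out && vendor);
    r = snprintf(out, n, "%s-%u", vendor, id);
    return (r < 0 || (size_t)r >= n) ? -1 : 0;
}

/* Hypothetical helper: split a type string at its last '-' into a
 * vendor part and a decimal id. Returns 0 on success, -1 on bad input. */
static int parse_type(const char *type, char *vendor, size_t n, unsigned *id)
{
    const char *dash;
    char *end;
    unsigned long v;
    size_t len;

    assert(type && vendor && id);
    dash = strrchr(type, '-');
    if (!dash || dash == type || dash[1] == '\0') {
        return -1;              /* no separator, or an empty half */
    }
    len = (size_t)(dash - type);
    if (len >= n) {
        return -1;              /* vendor part does not fit */
    }
    v = strtoul(dash + 1, &end, 10);
    if (*end != '\0') {
        return -1;              /* trailing non-digit characters */
    }
    memcpy(vendor, type, len);
    vendor[len] = '\0';
    *id = (unsigned)v;
    return 0;
}
```

 Splitting at the last '-' is a design choice of this sketch; it lets
 the vendor part itself contain dashes.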
>>>
>>> Let's use the mtty example driver provided in patch 5 so we can all
>>> more clearly see how the interfaces work.  I'll start from the
>>> beginning of my experience and work my way to the type/name thing.
>>>   
>>
>> Thanks for looking into it and getting a feel for it. I hope this helps
>> to show that 'name' and 'type_id' are different.
>>
>>
>>> (please add a modules_install

Re: [Qemu-devel] [PATCH 01/11] spapr_ovec: initial implementation of option vector helpers

2016-10-13 Thread David Gibson

On Wed, Oct 12, 2016 at 06:13:49PM -0500, Michael Roth wrote:
> PAPR guests advertise their capabilities to the platform by passing
> an ibm,architecture-vec structure via an
> ibm,client-architecture-support hcall as described by LoPAPR v11,
> B.6.2.3. during early boot.
> 
> Using this information, the platform enables the capabilities it
> supports, then encodes a subset of those enabled capabilities (the
> 5th option vector of the ibm,architecture-vec structure passed to
> ibm,client-architecture-support) into the guest device tree via
> "/chosen/ibm,architecture-vec-5".
> 
> The logical format of these option vectors is a bit-vector,
> where individual bits are addressed/documented based on the byte-wise
> offset from the beginning of the bit-vector, followed by the bit-wise
> index starting from the byte-wise offset. Thus the bits of each of
> these bytes are stored in reverse order. Additionally, the first
> byte of each option vector encodes the length of the option vector,
> so byte offsets begin at 1, and bit offsets at 0.

Heh.. pity qemu doesn't use the ccan bitmap module
(http://ccodearchive.net/info/bitmap.html).  By design it always
stores the bitmaps in IBM bit number ordering, because that's most
obvious to a human reading a memory dump (for the purpose of bit
vectors - in most situations the IBM numbering is dumb).
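
(Illustration only, not part of the patch: a minimal sketch of the
addressing scheme described in the commit message, assuming the raw
vector is a byte array whose element 0 is the length byte and that bit 0
of each byte is the most significant bit, i.e. IBM numbering. The helper
names ov_raw_test() and ov_raw_set() are hypothetical.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* vec[0] is the length byte (not interpreted here), so documented byte
 * offsets start at 1. Bit 0 is the MSB of the addressed byte. */
static bool ov_raw_test(const uint8_t *vec, size_t byte, unsigned bit)
{
    assert(vec && byte >= 1 && bit < 8);
    return (vec[byte] >> (7 - bit)) & 1;
}

/* Set the bit documented as "byte <byte>, bit <bit>". */
static void ov_raw_set(uint8_t *vec, size_t byte, unsigned bit)
{
    assert(vec && byte >= 1 && bit < 8);
    vec[byte] |= (uint8_t)(0x80u >> bit);
}
```

Under these assumptions, "byte 2, bit 2" corresponds to the mask 0x20
within the third element of the array.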

> This is not very intuitive for the purposes of mapping these bits to
> a particular documented capability, so this patch introduces a set
> of abstractions that encapsulate the work of parsing/encoding these
> options vectors and testing for individual capabilities.
> 
> Cc: Bharata B Rao 
> Signed-off-by: Michael Roth 

A handful of small nits.

> ---
>  hw/ppc/Makefile.objs|   2 +-
>  hw/ppc/spapr_ovec.c | 244 
> 
>  include/hw/ppc/spapr_ovec.h |  62 +++
>  3 files changed, 307 insertions(+), 1 deletion(-)
>  create mode 100644 hw/ppc/spapr_ovec.c
>  create mode 100644 include/hw/ppc/spapr_ovec.h
> 
> diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> index 99a0d4e..2e0b0c9 100644
> --- a/hw/ppc/Makefile.objs
> +++ b/hw/ppc/Makefile.objs
> @@ -4,7 +4,7 @@ obj-y += ppc.o ppc_booke.o fdt.o
>  obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o
>  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
>  obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
> -obj-$(CONFIG_PSERIES) += spapr_cpu_core.o
> +obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o
>  ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
>  obj-y += spapr_pci_vfio.o
>  endif
> diff --git a/hw/ppc/spapr_ovec.c b/hw/ppc/spapr_ovec.c
> new file mode 100644
> index 000..ddc19f5
> --- /dev/null
> +++ b/hw/ppc/spapr_ovec.c
> @@ -0,0 +1,244 @@
> +/*
> + * QEMU SPAPR Architecture Option Vector Helper Functions
> + *
> + * Copyright IBM Corp. 2016
> + *
> + * Authors:
> + *  Bharata B Rao 
> + *  Michael Roth  
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/ppc/spapr_ovec.h"
> +#include "qemu/bitmap.h"
> +#include "exec/address-spaces.h"
> +#include "qemu/error-report.h"
> +#include 
> +
> +/* #define DEBUG_SPAPR_OVEC */
> +
> +#ifdef DEBUG_SPAPR_OVEC
> +#define DPRINTFN(fmt, ...) \
> +do { fprintf(stderr, fmt "\n", ## __VA_ARGS__); } while (0)
> +#else
> +#define DPRINTFN(fmt, ...) \
> +do { } while (0)
> +#endif
> +
> +#define OV_MAXBYTES 256 /* not including length byte */
> +#define OV_MAXBITS (OV_MAXBYTES * BITS_PER_BYTE)
> +
> +/* we *could* work with bitmaps directly, but handling the bitmap privately
> + * allows us to more safely make assumptions about the bitmap size and
> + * simplify the calling code somewhat
> + */
> +struct sPAPROptionVector {
> +unsigned long *bitmap;
> +};
> +
> +static sPAPROptionVector *spapr_ovec_from_bitmap(unsigned long *bitmap)
> +{
> +sPAPROptionVector *ov;
> +
> +g_assert(bitmap);
> +
> +ov = g_new0(sPAPROptionVector, 1);
> +ov->bitmap = bitmap;
> +
> +return ov;
> +}
> +
> +sPAPROptionVector *spapr_ovec_new(void)
> +{
> +return spapr_ovec_from_bitmap(bitmap_new(OV_MAXBITS));
> +}
> +
> +sPAPROptionVector *spapr_ovec_clone(sPAPROptionVector *ov_orig)
> +{
> +sPAPROptionVector *ov;
> +
> +g_assert(ov_orig);
> +
> +ov = spapr_ovec_new();
> +bitmap_copy(ov->bitmap, ov_orig->bitmap, OV_MAXBITS);
> +
> +return ov;
> +}
> +
> +void spapr_ovec_intersect(sPAPROptionVector *ov,
> +  sPAPROptionVector *ov1,
> +  sPAPROptionVector *ov2)
> +{
> +g_assert(ov);
> +g_assert(ov1);
> +g_assert(ov2);
> +
> +bitmap_and(ov->bitmap, ov1->bitmap, ov2->bitmap, OV_MAXBITS);
> +}
> +
>

Re: [Qemu-devel] [PATCH 02/11] spapr_hcall: use spapr_ovec_* interfaces for CAS options

2016-10-13 Thread David Gibson

On Wed, Oct 12, 2016 at 06:13:50PM -0500, Michael Roth wrote:
> Currently we access individual bytes of an option vector via
> ldub_phys() to test for the presence of a particular capability
> within that byte. Currently this is only done for the "dynamic
> reconfiguration memory" capability bit. If that bit is present,
> we pass a boolean value to spapr_h_cas_compose_response()
> to generate a modified device tree segment with the additional
> properties required to enable this functionality.
> 
> As more capability bits are added, we would need to modify the
> code to add additional option vector accesses and extend the
> param list for spapr_h_cas_compose_response() to include similar
> boolean values for these parameters.
> 
> Avoid this by switching to spapr_ovec_* helpers so we can do all
> the parsing in one shot and then test for these additional bits
> within spapr_h_cas_compose_response() directly.
> 
> Cc: Bharata B Rao 
> Signed-off-by: Michael Roth 

Reviewed-by: David Gibson 
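
(Illustration only, not the patch's code: a minimal sketch of why a
parsed flag set scales better than one boolean parameter per capability.
All Sketch*/sketch_* names are hypothetical, and placing the DRCONF
memory capability at byte 2, bit 2 is an assumption of this sketch. The
consumer tests the negotiated set directly, so adding a capability adds
one index constant rather than a new function argument.)

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_OV_MAXBITS (256 * CHAR_BIT)
/* Map a documented (byte, bit) pair, with bytes counted from 1, to a
 * linear index into the flag set. */
#define SKETCH_OV_BIT(byte, bit) (((byte) - 1) * CHAR_BIT + (bit))
/* Assumed position, for illustration only. */
#define SKETCH_OV5_DRCONF_MEMORY SKETCH_OV_BIT(2, 2)

typedef struct SketchOvec {
    unsigned char bits[SKETCH_OV_MAXBITS / CHAR_BIT];
} SketchOvec;

static void sketch_ovec_set(SketchOvec *ov, unsigned idx)
{
    assert(ov && idx < SKETCH_OV_MAXBITS);
    ov->bits[idx / CHAR_BIT] |= (unsigned char)(1u << (idx % CHAR_BIT));
}

static bool sketch_ovec_test(const SketchOvec *ov, unsigned idx)
{
    assert(ov && idx < SKETCH_OV_MAXBITS);
    return (ov->bits[idx / CHAR_BIT] >> (idx % CHAR_BIT)) & 1;
}

/* Consumer side: no per-capability boolean parameter is needed. */
static bool sketch_needs_drconf_memory(const SketchOvec *negotiated)
{
    return sketch_ovec_test(negotiated, SKETCH_OV5_DRCONF_MEMORY);
}
```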

> ---
>  hw/ppc/spapr.c  | 10 ++--
>  hw/ppc/spapr_hcall.c| 56 
> -
>  include/hw/ppc/spapr.h  |  5 +++-
>  include/hw/ppc/spapr_ovec.h |  3 +++
>  4 files changed, 30 insertions(+), 44 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 03e3803..934d6b2 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -856,7 +856,7 @@ out:
>  
>  int spapr_h_cas_compose_response(sPAPRMachineState *spapr,
>   target_ulong addr, target_ulong size,
> - bool cpu_update, bool memory_update)
> + bool cpu_update)
>  {
>  void *fdt, *fdt_skel;
>  sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
> @@ -880,7 +880,8 @@ int spapr_h_cas_compose_response(sPAPRMachineState *spapr,
>  }
>  
>  /* Generate ibm,dynamic-reconfiguration-memory node if required */
> -if (memory_update && smc->dr_lmb_enabled) {
> +if (spapr_ovec_test(spapr->ov5_cas, OV5_DRCONF_MEMORY)) {
> +g_assert(smc->dr_lmb_enabled);
>  _FDT((spapr_populate_drconf_memory(spapr, fdt)));
>  }
>  
> @@ -1769,7 +1770,12 @@ static void ppc_spapr_init(MachineState *machine)
> DIV_ROUND_UP(max_cpus * smt, smp_threads),
> XICS_IRQS_SPAPR, _fatal);
>  
> +/* Set up containers for ibm,client-set-architecture negotiated options 
> */
> +spapr->ov5 = spapr_ovec_new();
> +spapr->ov5_cas = spapr_ovec_new();
> +
>  if (smc->dr_lmb_enabled) {
> +spapr_ovec_set(spapr->ov5, OV5_DRCONF_MEMORY);
>  spapr_validate_node_memory(machine, _fatal);
>  }
>  
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index c5e7e8c..f1d081b 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -11,6 +11,7 @@
>  #include "trace.h"
>  #include "sysemu/kvm.h"
>  #include "kvm_ppc.h"
> +#include "hw/ppc/spapr_ovec.h"
>  
>  struct SPRSyncState {
>  int spr;
> @@ -880,32 +881,6 @@ static target_ulong h_set_mode(PowerPCCPU *cpu, 
> sPAPRMachineState *spapr,
>  return ret;
>  }
>  
> -/*
> - * Return the offset to the requested option vector @vector in the
> - * option vector table @table.
> - */
> -static target_ulong cas_get_option_vector(int vector, target_ulong table)
> -{
> -int i;
> -char nr_vectors, nr_entries;
> -
> -if (!table) {
> -return 0;
> -}
> -
> -nr_vectors = (ldl_phys(_space_memory, table) >> 24) + 1;
> -if (!vector || vector > nr_vectors) {
> -return 0;
> -}
> -table++; /* skip nr option vectors */
> -
> -for (i = 0; i < vector - 1; i++) {
> -nr_entries = ldl_phys(_space_memory, table) >> 24;
> -table += nr_entries + 2;
> -}
> -return table;
> -}
> -
>  typedef struct {
>  uint32_t cpu_version;
>  Error *err;
> @@ -961,23 +936,21 @@ static void cas_handle_compat_cpu(PowerPCCPUClass *pcc, 
> uint32_t pvr,
>  }
>  }
>  
> -#define OV5_DRCONF_MEMORY 0x20
> -
>  static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
>sPAPRMachineState *spapr,
>target_ulong opcode,
>target_ulong *args)
>  {
>  target_ulong list = ppc64_phys_to_real(args[0]);
> -target_ulong ov_table, ov5;
> +target_ulong ov_table;
>  PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu_);
>  CPUState *cs;
> -bool cpu_match = false, cpu_update = true, memory_update = false;
> +bool cpu_match = false, cpu_update = true;
>  unsigned old_cpu_version = cpu_->cpu_version;
>  unsigned compat_lvl = 0, cpu_version = 0;
>  unsigned max_lvl = get_compat_level(cpu_->max_compat);
>  int counter;
> -

Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices

2016-10-13 Thread Kirti Wankhede

On 10/14/2016 7:52 AM, Jike Song wrote:
> On 10/11/2016 04:28 AM, Kirti Wankhede wrote:
>> +
>> +Under per-physical device sysfs:
>> +
>> +
>> +* mdev_supported_types:
>> +List of currently supported mediated device types and their details are
>> +added in this directory in the following format:
>> +
>> +|- 
>> +|--- Vendor-specific-attributes [optional]
>> +|--- mdev_supported_types
>> +| |--- 
>> +| |   |--- create
>> +| |   |--- name
>> +| |   |--- available_instances
>> +| |   |--- description /class
>> +| |   |--- [devices]
>> +| |--- 
>> +| |   |--- create
>> +| |   |--- name
>> +| |   |--- available_instances
>> +| |   |--- description /class
>> +| |   |--- [devices]
>> +| |--- 
>> +|  |--- create
>> +|  |--- name
>> +|  |--- available_instances
>> +|  |--- description /class
>> +|  |--- [devices]
>> +
>> +[TBD : description or class is yet to be decided. This will change.]
>> +
>> +Under per mdev device:
>> +--
>> +
>> +|- 
>> +|--- $MDEV_UUID
>> + |--- remove
>> + |--- {link to its type}
>> + |--- vendor-specific-attributes [optional]
>> +
> 
> All mdev directories are placed under physical device directly.
> 
> Looking at the sysfs directory of physical device, you get:
> 
> 
> |--- mdev_supported_types/
> ||--- type1/
> ||--- type2/
> ||--- type3/
> |--- mdev1/
> |--- mdev2/
> 
> 
> 
> With an independent device between physical and mdev, and names
> simplified, you will get:
> 
> 
> |--- mdev/
> ||--- supported_type1/
> ||--- supported_type2/
> ||--- supported_type3/
> ||--- mdev1/
> ||--- mdev2/
> 
> i.e. everything related to mdev is placed under one single directory -
> the same as SR-IOV.  I'm not sure if it is possible without
> introducing an independent device (which you apparently dislike), but
> placing so many mdev directories under physical doesn't seem clean.
> 
> 

I'm repeating the same example that I gave in reply to Alex's question:
the parent-child relationship between devices is reflected in sysfs.
There are cases where there are multiple children and all are placed in
the same parent directory:

80:01.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express
Root Port 1a (rev 07)
80:02.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express
Root Port 2a (rev 07)
80:03.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express
Root Port 3a in PCI Express Mode (rev 07)
80:04.0 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
0 (rev 07)
80:04.1 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
1 (rev 07)
80:04.2 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
2 (rev 07)
80:04.3 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
3 (rev 07)
80:04.4 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
4 (rev 07)
80:04.5 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
5 (rev 07)
80:04.6 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
6 (rev 07)
80:04.7 System peripheral: Intel Corporation Xeon E5/Core i7 DMA Channel
7 (rev 07)
80:05.0 System peripheral: Intel Corporation Xeon E5/Core i7 Address
Map, VTd_Misc, System Management (rev 07)
80:05.2 System peripheral: Intel Corporation Xeon E5/Core i7 Control
Status and Global Errors (rev 07)
80:05.4 PIC: Intel Corporation Xeon E5/Core i7 I/O APIC (rev 07)
81:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network
Connection (rev 01)
81:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network
Connection (rev 01)
83:00.0 PCI bridge: PLX Technology, Inc. PEX 8747 48-Lane, 5-Port PCI
Express Gen 3 (8.0 GT/s) Switch (rev ca)
84:08.0 PCI bridge: PLX Technology, Inc. PEX 8747 48-Lane, 5-Port PCI
Express Gen 3 (8.0 GT/s) Switch (rev ca)
84:10.0 PCI bridge: PLX Technology, Inc. PEX 8747 48-Lane, 5-Port PCI
Express Gen 3 (8.0 GT/s) Switch (rev ca)
85:00.0 VGA compatible controller: NVIDIA Corporation Device 13f2 (rev a1)
86:00.0 VGA compatible controller: NVIDIA Corporation Device 13f2 (rev a1)

In sysfs, those are all in the same parent folder, that of their parent root port:

# ls /sys/devices/pci\:80/ -l
total 0
drwxr-xr-x 8 root root 0 Oct 13 13:30 :80:01.0
drwxr-xr-x 7 root root 0 Oct 13 13:30 :80:02.0
drwxr-xr-x 6 root root 0 Oct 13 13:30 :80:03.0
drwxr-xr-x 6 root root 0 Oct 13 13:30 :80:04.0
drwxr-xr-x 6 root root 0 Oct 13 13:30 :80:04.1
drwxr-xr-x 6 root root 0 Oct 13 13:30 :80:04.2
drwxr-xr-x 6 root root 0 Oct 13 13:30 :80:04.3
drwxr-xr-x 6 root root 0 Oct 13 13:30 :80:04.4
drwxr-xr-x 6 root root 0 Oct 13 13:30 :80:04.5
drwxr-xr-x 6 root root 0 Oct 13 13:30 :80:04.6
drwxr-xr-x 6 root root 0 Oct 13

Re: [Qemu-devel] [PATCH 2/3] tests/boot-sector: Use mkstemp() to create a unique file name

2016-10-13 Thread Fam Zheng

On Tue, 10/11 17:19, Thomas Huth wrote:
> The pxe-test is run for three different targets now (x86_64, i386
> and ppc64), and the bios-tables-test is run for two targets (x86_64
> and i386). But each of the tests is using an invariant name for the
> disk image with the boot sector code - so if the tests are running in
> parallel, there is a race condition that they destroy the disk image
> of a parallel test program. Let's use mkstemp() to create unique
> temporary files here instead - and since mkstemp() is returning an
> integer file descriptor instead of a FILE pointer, we also switch
> the fwrite() and fclose() to write() and close() instead.
> 
> Reported-by: Sascha Silbe 
> Signed-off-by: Thomas Huth 

Tested-by: Fam Zheng
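
(Illustration only, not the patch itself: a minimal sketch of the
mkstemp() pattern described above, assuming a POSIX host. The helper
name write_unique_image() and the template path used in the note below
are hypothetical. Because mkstemp() returns a file descriptor, the data
is written with write() and the file closed with close().)

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create a uniquely named file from path_template
 * (a writable string ending in "XXXXXX", which mkstemp() rewrites in
 * place) and store len bytes of buf in it.
 * Returns 0 on success, -1 on failure. */
static int write_unique_image(char *path_template, const void *buf, size_t len)
{
    int fd;
    ssize_t ret;

    assert(path_template && buf);
    assert(strlen(path_template) >= 6);

    fd = mkstemp(path_template);    /* creates and opens the file atomically */
    if (fd < 0) {
        return -1;
    }
    ret = write(fd, buf, len);
    if (close(fd) < 0 || ret != (ssize_t)len) {
        unlink(path_template);      /* do not leave a partial image behind */
        return -1;
    }
    return 0;
}
```

Two tests calling this with the same template, for example
"/tmp/sketch-boot-XXXXXX", end up with different file names, which is
what avoids the race between parallel test programs.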

[Qemu-devel] [PATCH v2] target-mips: Fix Loongson pandn instruction.

2016-10-13 Thread Heiher

From: Heiher 

pandn FD, FS, FT
Operation: FD = ((NOT FS) AND FT)

Signed-off-by: Heiher 
Signed-off-by: Fuxin Zhang 
---
 target-mips/translate.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/target-mips/translate.c b/target-mips/translate.c
index 55c2ca0..1412347 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -3945,9 +3945,12 @@ static void gen_loongson_multimedia(DisasContext *ctx, int rd, int rs, int rt)
     LMI_DIRECT(XOR_CP2, xor, xor);
     LMI_DIRECT(NOR_CP2, nor, nor);
     LMI_DIRECT(AND_CP2, and, and);
-    LMI_DIRECT(PANDN, pandn, andc);
     LMI_DIRECT(OR, or, or);
 
+    case OPC_PANDN:
+        tcg_gen_andc_i64(t0, t1, t0);
+        break;
+
     case OPC_PINSRH_0:
         tcg_gen_deposit_i64(t0, t0, t1, 0, 16);
         break;
-- 
2.10.0
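
(Illustration only, not part of the patch: a minimal host-side sketch of
the operand order at issue. It assumes that an "andc" primitive computes
a AND (NOT b), which is how the patch uses tcg_gen_andc_i64; the names
sketch_andc64() and sketch_pandn() are hypothetical. With FD = ((NOT FS)
AND FT), FS must be the complemented operand, so it goes in the second
position.)

```c
#include <assert.h>
#include <stdint.h>

/* "and with complement": a AND (NOT b). */
static uint64_t sketch_andc64(uint64_t a, uint64_t b)
{
    return a & ~b;
}

/* pandn FD, FS, FT: FD = ((NOT FS) AND FT). */
static uint64_t sketch_pandn(uint64_t fs, uint64_t ft)
{
    return sketch_andc64(ft, fs);   /* FT first, FS is the complemented one */
}
```

Passing the operands in the other order, sketch_andc64(fs, ft), would
compute FS AND (NOT FT) instead.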

Re: [Qemu-devel] [PATCH 2/2] target-mips: Fix Loongson multimedia instructions.

2016-10-13 Thread Heiher

On Fri, Oct 14, 2016 at 12:44 AM, Yongbok Kim  wrote:
>
>
> On 13/10/2016 08:10, Heiher wrote:
>> From: Heiher 
>>
>> An FPU exception needs to be emitted when Loongson multimedia
>> instructions are executed while Status:CU1 is clear, or FPR changes
>> may be missed on Linux.
>>
>> Signed-off-by: Heiher 
>> Signed-off-by: Fuxin Zhang 
>> ---
>>  target-mips/translate.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/target-mips/translate.c b/target-mips/translate.c
>> index 139f249..b87a09b 100644
>> --- a/target-mips/translate.c
>> +++ b/target-mips/translate.c
>> @@ -3871,6 +3871,7 @@ static void gen_loongson_multimedia(DisasContext *ctx, 
>> int rd, int rs, int rt)
>>  break;
>>  }
>>
>> +check_cp1_enabled(ctx);
>
> Isn't it also required to check the Status.CU2 bit? I guess the Loongson
> Multimedia instructions were implemented as Cop2? Please correct me if I
> am wrong about this.

I don't think the Loongson CPUs have a real coprocessor 2, although these
multimedia instructions are encoded as Cp2. Similar to MSA, these
instructions reuse the 64-bit FPRs. In fact, the older Loongson CPUs
emit a Cp2 unusable exception, so a cp2 hook is needed to combine FPU
and Cp2 in the kernel. In newer revisions, a Cp1 exception is emitted.

>
>>  gen_load_fpr64(ctx, t0, rs);
>>  gen_load_fpr64(ctx, t1, rt);
>>
>>
>
> Regards,
> Yongbok



-- 
Best regards!
Heiher
http://hev.cc

Re: [Qemu-devel] [PATCH v8 4/6] docs: Add Documentation for Mediated devices

2016-10-13 Thread Jike Song

On 10/11/2016 04:28 AM, Kirti Wankhede wrote:
> +
> +Under per-physical device sysfs:
> +
> +
> +* mdev_supported_types:
> +List of currently supported mediated device types and their details are added
> +in this directory in the following format:
> +
> +|- 
> +|--- Vendor-specific-attributes [optional]
> +|--- mdev_supported_types
> +| |--- 
> +| |   |--- create
> +| |   |--- name
> +| |   |--- available_instances
> +| |   |--- description /class
> +| |   |--- [devices]
> +| |--- 
> +| |   |--- create
> +| |   |--- name
> +| |   |--- available_instances
> +| |   |--- description /class
> +| |   |--- [devices]
> +| |--- 
> +|  |--- create
> +|  |--- name
> +|  |--- available_instances
> +|  |--- description /class
> +|  |--- [devices]
> +
> +[TBD : description or class is yet to be decided. This will change.]
> +
> +Under per mdev device:
> +--
> +
> +|- 
> +|--- $MDEV_UUID
> + |--- remove
> + |--- {link to its type}
> + |--- vendor-specific-attributes [optional]
> +

All mdev directories are placed under physical device directly.

Looking at the sysfs directory of physical device, you get:


|--- mdev_supported_types/
||--- type1/
||--- type2/
||--- type3/
|--- mdev1/
|--- mdev2/



With an independent device between physical and mdev, and names
simplified, you will get:


|--- mdev/
||--- supported_type1/
||--- supported_type2/
||--- supported_type3/
||--- mdev1/
||--- mdev2/

i.e. everything related to mdev is placed under one single directory -
the same as SR-IOV.  I'm not sure if it is possible without
introducing an independent device (which you apparently dislike), but
placing so many mdev directories under physical doesn't seem clean.



--
Thanks,
Jike

Re: [Qemu-devel] [PATCH v6 00/15] nbd: efficient write zeroes

2016-10-13 Thread Eric Blake

On 10/13/2016 07:00 PM,
no-re...@ec2-52-6-146-230.compute-1.amazonaws.com wrote:
> Hi,
> 
> Your series failed automatic build test. Please find the testing commands and
> their output below. If you have docker installed, you can probably reproduce 
> it
> locally.
> 

> /tmp/qemu-test/src/nbd/client.c: In function 'nbd_errno_to_system_errno':
> /tmp/qemu-test/src/nbd/client.c:38:16: error: 'ESHUTDOWN' undeclared (first 
> use in this function)
>  return ESHUTDOWN;
> ^

Oh fun, I get to work around a missing errno value on mingw.  I'll come
up with something to squash in to 13/15.
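
(Illustration only, and not necessarily what the squashed fix will do:
one common way to cope with an errno constant that a platform's headers
lack is a guarded fallback definition. The fallback value 4099 and the
names SKETCH_WIRE_ESHUTDOWN and sketch_wire_to_errno() are assumptions
of this sketch, chosen for illustration.)

```c
#include <assert.h>
#include <errno.h>

/* Only takes effect where <errno.h> does not provide ESHUTDOWN.
 * The value is an arbitrary placeholder for this sketch. */
#ifndef ESHUTDOWN
#define ESHUTDOWN 4099
#endif

/* Hypothetical wire-level error code, for illustration only. */
enum { SKETCH_WIRE_ESHUTDOWN = 108 };

/* Map a wire error code to a host errno value. */
static int sketch_wire_to_errno(int wire)
{
    assert(wire >= 0);
    switch (wire) {
    case 0:
        return 0;
    case SKETCH_WIRE_ESHUTDOWN:
        return ESHUTDOWN;
    default:
        return EINVAL;
    }
}
```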

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [Qemu-devel] [QEMU PATCH v6 0/2] migration: migrate QTAILQ

2016-10-13 Thread no-reply

Hi,

Your series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Message-id: 1476394254-7987-1-git-send-email-du...@linux.vnet.ibm.com
Type: series
Subject: [Qemu-devel] [QEMU PATCH v6 0/2] migration: migrate QTAILQ

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=16
make docker-test-quick@centos6
make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
1315f2a migration: migrate QTAILQ
48aed08 migration: extend VMStateInfo

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD   centos6
=== OUTPUT END ===

Abort: command timeout (>3600 seconds)


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [PATCH 0/4] machine, hostmem, pc: Register properties as class properties

2016-10-13 Thread no-reply

Hi,

Your series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Message-id: 1476394002-8392-1-git-send-email-ehabk...@redhat.com
Type: series
Subject: [Qemu-devel] [PATCH 0/4] machine, hostmem, pc: Register properties as 
class properties

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=16
make docker-test-quick@centos6
make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
0e0a60a hostmem-file: Register TYPE_MEMORY_BACKEND_FILE properties as class 
properties
2bb6ac4 hostmem: Register TYPE_MEMORY_BACKEND properties as class properties
dd4ca16 pc: Register TYPE_PC_MACHINE properties as class properties
ce6f883 machine: Register TYPE_MACHINE properties as class properties

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD   centos6
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPY    RUNNER
  RUN test-quick in centos6
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel 
glib2-devel SDL-devel pixman-devel epel-release
HOSTNAME=3ba34d679a5b
TERM=xterm
MAKEFLAGS= -j16
HISTSIZE=1000
J=16
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu 
--prefix=/var/tmp/qemu-build/install
No C++ compiler available; disabling C++ specific optional code
Install prefix    /var/tmp/qemu-build/install
BIOS directory    /var/tmp/qemu-build/install/share/qemu
binary directory  /var/tmp/qemu-build/install/bin
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib/qemu
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install/etc
local state directory   /var/tmp/qemu-build/install/var
Manual directory  /var/tmp/qemu-build/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path       /tmp/qemu-test/src
C compiler        cc
Host C compiler   cc
C++ compiler
Objective-C compiler cc
ARFLAGS           rv
CFLAGS            -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g
QEMU_CFLAGS       -I/usr/include/pixman-1 -pthread -I/usr/include/glib-2.0
-I/usr/lib64/glib-2.0/include   -fPIE -DPIE -m64 -D_GNU_SOURCE
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes
-Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes
-fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels -Wmissing-include-dirs
-Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self
-Wignored-qualifiers -Wold-style-declaration -Wold-style-definition
-Wtype-limits -fstack-protector-all
LDFLAGS           -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g
make              make
install           install
python            python -B
smbd              /usr/sbin/smbd
module support    no
host CPU          x86_64
host big endian   no
target list       x86_64-softmmu aarch64-softmmu
tcg debug enabled no
gprof enabled     no
sparse enabled    no
strip binaries    yes
profiler          no
static build      no
pixman            system
SDL support       yes (1.2.14)
GTK support       no
GTK GL support    no
VTE support       no
TLS priority      NORMAL
GNUTLS support    no
GNUTLS rnd        no
libgcrypt         no
libgcrypt kdf     no
nettle            no
nettle kdf        no
libtasn1          no
curses support    no
virgl support     no
curl support      no
mingw32 support   no
Audio drivers     oss
Block whitelist (rw)
Block whitelist (ro)
VirtFS support    no
VNC support       yes
VNC SASL support  no
VNC JPEG support  no
VNC PNG support   no
xen support       no
brlapi support    no
bluez  support    no
Documentation     no
PIE               yes
vde support       no
netmap support    no
Linux AIO support no
ATTR/XATTR support yes
Install blobs     yes
KVM support       yes
RDMA support      no

Re: [Qemu-devel] [PATCH v2 2/2] xen_platform: SUSE xenlinux unplug for emulated PCI

2016-10-13 Thread Stefano Stabellini

On Fri, 2 Sep 2016, Olaf Hering wrote:
> Implement SUSE specific unplug protocol for emulated PCI devices
> in PVonHVM guests. It's a simple 'outl(1, (ioaddr + 4));'.
> This protocol was implemented and used since Xen 3.0.4.
> It is used in all SUSE/SLES/openSUSE releases up to SLES11SP3 and
> openSUSE 12.3.
> 
> Signed-off-by: Olaf Hering 
> ---
>  hw/i386/xen/xen_platform.c | 31 ++-
>  1 file changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c
> index 53be3c7..6faee4c 100644
> --- a/hw/i386/xen/xen_platform.c
> +++ b/hw/i386/xen/xen_platform.c
> @@ -313,13 +313,42 @@ static void xen_platform_ioport_writeb(void *opaque, 
> hwaddr addr,
> uint64_t val, unsigned int size)
>  {
>  PCIXenPlatformState *s = opaque;
> +PCIDevice *pci_dev = PCI_DEVICE(s);
>  
>  switch (addr) {
>  case 0: /* Platform flags */
>  platform_fixed_ioport_writeb(opaque, 0, (uint32_t)val);
>  break;
> +case 4:
> +if (val == 1) {
> +/*
> + * SUSE unplug for Xenlinux
> + * xen-kmp used this since xen-3.0.4, instead the official 
> protocol
> + * from xen-3.3+ It did an unconditional "outl(1, (ioaddr + 4));"
> + * Pre VMDP 1.7 used 4 and 8 depending on how VMDP was 
> configured.
> + * If VMDP was to control both disk and LAN it would use 4.
> + * If it controlled just disk or just LAN, it would use 8 below.
> + */
> +blk_drain_all();
> +blk_flush_all();

I was about to send a pull request for this series but blk_flush_all
causes a build failure:

/local/qemu-upstream/hw/i386/xen/xen_platform.c: In function 
'xen_platform_ioport_writeb':
/local/qemu-upstream/hw/i386/xen/xen_platform.c:331:13: error: implicit 
declaration of function 'blk_flush_all' [-Werror=implicit-function-declaration]
/local/qemu-upstream/hw/i386/xen/xen_platform.c:331:13: error: nested extern 
declaration of 'blk_flush_all' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make[1]: *** [hw/i386/xen/xen_platform.o] Error 1
make[1]: *** Waiting for unfinished jobs
make: *** [subdir-i386-softmmu] Error 2



> +pci_unplug_disks(pci_dev->bus);
> +pci_unplug_nics(pci_dev->bus);
> +}
> +break;
>  case 8:
> -log_writeb(s, (uint32_t)val);
> +switch (val) {
> +case 1:
> +blk_drain_all();
> +blk_flush_all();
> +pci_unplug_disks(pci_dev->bus);
> +break;
> +case 2:
> +pci_unplug_nics(pci_dev->bus);
> +break;
> +default:
> +log_writeb(s, (uint32_t)val);
> +break;
> +}
>  break;
>  default:
>  break;
>
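
(Illustration only, not the patch's code: a minimal standalone sketch of
the port-write decoding that the hunk above implements, so the protocol
can be read without the surrounding device model. The SKETCH_* constants
and sketch_unplug_mask() are hypothetical names; the sketch only reports
which device classes a write would unplug. Writes that the patch handles
in other ways, such as the platform flags at offset 0 or the logging
fallback at offset 8, are reported here as "nothing to unplug".)

```c
#include <assert.h>
#include <stdint.h>

enum {
    SKETCH_UNPLUG_NONE  = 0,
    SKETCH_UNPLUG_DISKS = 1 << 0,
    SKETCH_UNPLUG_NICS  = 1 << 1,
};

/* Decode a write of val to offset addr within the platform I/O range. */
static unsigned sketch_unplug_mask(uint64_t addr, uint64_t val)
{
    switch (addr) {
    case 4:
        /* SUSE xenlinux: a single write of 1 unplugs disks and NICs. */
        return val == 1 ? (SKETCH_UNPLUG_DISKS | SKETCH_UNPLUG_NICS)
                        : SKETCH_UNPLUG_NONE;
    case 8:
        /* Offset 8: 1 selects disks only, 2 selects NICs only. */
        switch (val) {
        case 1:
            return SKETCH_UNPLUG_DISKS;
        case 2:
            return SKETCH_UNPLUG_NICS;
        default:
            return SKETCH_UNPLUG_NONE;
        }
    default:
        return SKETCH_UNPLUG_NONE;
    }
}
```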

Re: [Qemu-devel] [PATCH v6 00/15] nbd: efficient write zeroes

2016-10-13 Thread no-reply

Hi,

Your series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Message-id: 1476392335-9256-1-git-send-email-ebl...@redhat.com
Type: series
Subject: [Qemu-devel] [PATCH v6 00/15] nbd: efficient write zeroes

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=16
make docker-test-quick@centos6
make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/1476399422-8028-1-git-send-email-js...@redhat.com 
-> patchew/1476399422-8028-1-git-send-email-js...@redhat.com
Switched to a new branch 'test'
3e3cab6 nbd: Implement NBD_CMD_WRITE_ZEROES on client
6127973 nbd: Implement NBD_CMD_WRITE_ZEROES on server
dbe675e nbd: Improve server handling of shutdown requests
d86be2e nbd: Support shorter handshake
598c0b0 nbd: Less allocation during NBD_OPT_LIST
71d9fb4 nbd: Let client skip portions of server reply
4554cca nbd: Let server know when client gives up negotiation
c33b361 nbd: Share common option-sending code in client
9e216fe nbd: Send message along with server NBD_REP_ERR errors
761e798 nbd: Share common reply-sending code in server
dd5e949 nbd: Rename struct nbd_request and nbd_reply
45a155f nbd: Rename NbdClientSession to NBDClientSession
00f1816 nbd: Rename NBDRequest to NBDRequestData
bcd4778 nbd: Treat flags vs. command type as separate fields
fde7920 nbd: Add qemu-nbd -D for human-readable description

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD   centos6
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPY    RUNNER
  RUN test-quick in centos6
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel 
glib2-devel SDL-devel pixman-devel epel-release
HOSTNAME=29f210180aca
TERM=xterm
MAKEFLAGS= -j16
HISTSIZE=1000
J=16
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu 
--prefix=/var/tmp/qemu-build/install
No C++ compiler available; disabling C++ specific optional code
Install prefix    /var/tmp/qemu-build/install
BIOS directory    /var/tmp/qemu-build/install/share/qemu
binary directory  /var/tmp/qemu-build/install/bin
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib/qemu
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install/etc
local state directory   /var/tmp/qemu-build/install/var
Manual directory  /var/tmp/qemu-build/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path       /tmp/qemu-test/src
C compiler        cc
Host C compiler   cc
C++ compiler
Objective-C compiler cc
ARFLAGS           rv
CFLAGS            -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g
QEMU_CFLAGS       -I/usr/include/pixman-1 -pthread -I/usr/include/glib-2.0
-I/usr/lib64/glib-2.0/include   -fPIE -DPIE -m64 -D_GNU_SOURCE
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes
-Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes
-fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels -Wmissing-include-dirs
-Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self
-Wignored-qualifiers -Wold-style-declaration -Wold-style-definition
-Wtype-limits -fstack-protector-all
LDFLAGS           -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g
make              make
install           install
python            python -B
smbd              /usr/sbin/smbd
module support    no
host CPU          x86_64
host big endian   no
target list       x86_64-softmmu aarch64-softmmu
tcg debug enabled no
gprof enabled     no
sparse enabled    no
strip binaries    yes
profiler          no
static build      no
pixman            system
SDL support       yes (1.2.14)
GTK support       no
GTK GL support    no
VTE support       no
TLS priority

[Qemu-devel] [PATCH 5/7] Blockjobs: Internalize user_pause logic

2016-10-13 Thread John Snow

BlockJobs will begin hiding their state in preparation for some
refactorings anyway, so let's internalize the user_pause mechanism
instead of leaving it to callers to correctly manage.

Signed-off-by: John Snow 
---
 blockdev.c   | 12 +---
 blockjob.c   | 22 --
 include/block/blockjob.h | 26 ++
 3 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 22a1280..1661d08 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3579,7 +3579,7 @@ void qmp_block_job_cancel(const char *device,
 force = false;
 }
 
-if (job->user_paused && !force) {
+if (block_job_user_paused(job) && !force) {
 error_setg(errp, "The block job for device '%s' is currently paused",
device);
 goto out;
@@ -3596,13 +3596,12 @@ void qmp_block_job_pause(const char *device, Error **errp)
 AioContext *aio_context;
 BlockJob *job = find_block_job(device, &aio_context, errp);
 
-if (!job || job->user_paused) {
+if (!job || block_job_user_paused(job)) {
 return;
 }
 
-job->user_paused = true;
 trace_qmp_block_job_pause(job);
-block_job_pause(job);
+block_job_user_pause(job);
 aio_context_release(aio_context);
 }
 
@@ -3611,14 +3610,13 @@ void qmp_block_job_resume(const char *device, Error **errp)
 AioContext *aio_context;
 BlockJob *job = find_block_job(device, &aio_context, errp);
 
-if (!job || !job->user_paused) {
+if (!job || !block_job_user_paused(job)) {
 return;
 }
 
-job->user_paused = false;
 trace_qmp_block_job_resume(job);
 block_job_iostatus_reset(job);
-block_job_resume(job);
+block_job_user_resume(job);
 aio_context_release(aio_context);
 }
 
diff --git a/blockjob.c b/blockjob.c
index e32cb78..d118a1f 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -362,11 +362,22 @@ void block_job_pause(BlockJob *job)
 job->pause_count++;
 }
 
+void block_job_user_pause(BlockJob *job)
+{
+job->user_paused = true;
+block_job_pause(job);
+}
+
 static bool block_job_should_pause(BlockJob *job)
 {
 return job->pause_count > 0;
 }
 
+bool block_job_user_paused(BlockJob *job)
+{
+return job ? job->user_paused : 0;
+}
+
 void coroutine_fn block_job_pause_point(BlockJob *job)
 {
 if (!block_job_should_pause(job)) {
@@ -403,6 +414,14 @@ void block_job_resume(BlockJob *job)
 block_job_enter(job);
 }
 
+void block_job_user_resume(BlockJob *job)
+{
+if (job && job->user_paused && job->pause_count > 0) {
+job->user_paused = false;
+block_job_resume(job);
+}
+}
+
 void block_job_enter(BlockJob *job)
 {
 if (job->co && !job->busy) {
@@ -626,8 +645,7 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockdevOnError on_err,
 }
 if (action == BLOCK_ERROR_ACTION_STOP) {
 /* make the pause user visible, which will be resumed from QMP. */
-job->user_paused = true;
-block_job_pause(job);
+block_job_user_pause(job);
 block_job_iostatus_set_err(job, error);
 }
 return action;
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 928f0b8..5b61140 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -358,6 +358,23 @@ void coroutine_fn block_job_pause_point(BlockJob *job);
 void block_job_pause(BlockJob *job);
 
 /**
+ * block_job_user_pause:
+ * @job: The job to be paused.
+ *
+ * Asynchronously pause the specified job.
+ * Do not allow a resume until a matching call to block_job_user_resume.
+ */
+void block_job_user_pause(BlockJob *job);
+
+/**
+ * block_job_user_paused:
+ * @job: The job to query.
+ *
+ * Returns true if the job is user-paused.
+ */
+bool block_job_user_paused(BlockJob *job);
+
+/**
  * block_job_resume:
  * @job: The job to be resumed.
  *
@@ -366,6 +383,15 @@ void block_job_pause(BlockJob *job);
 void block_job_resume(BlockJob *job);
 
 /**
+ * block_job_user_resume:
+ * @job: The job to be resumed.
+ *
+ * Resume the specified job.
+ * Must be paired with a preceding block_job_user_pause.
+ */
+void block_job_user_resume(BlockJob *job);
+
+/**
  * block_job_enter:
  * @job: The job to enter.
  *
-- 
2.7.4
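
For readers following the pause logic in this patch, here is a minimal
standalone sketch of how the user-pause flag sits on top of the pause
counter. PauseJob and the demo_* functions are hypothetical stand-ins, not
QEMU code; only the two fields touched by the patch are modelled, and the
coroutine re-entry done by the real block_job_resume() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BlockJob: only the two pause-related fields. */
typedef struct PauseJob {
    int pause_count;   /* nested pause requests from any source */
    bool user_paused;  /* set only by an explicit user request */
} PauseJob;

static void demo_job_pause(PauseJob *job)
{
    assert(job != NULL);
    job->pause_count++;
}

static void demo_job_resume(PauseJob *job)
{
    assert(job != NULL && job->pause_count > 0);
    job->pause_count--;
}

/* Mirrors block_job_user_pause(): set the flag, then take a pause reference. */
static void demo_job_user_pause(PauseJob *job)
{
    assert(job != NULL);
    job->user_paused = true;
    demo_job_pause(job);
}

/* Mirrors block_job_user_paused(): a NULL job is reported as not paused. */
static bool demo_job_user_paused(const PauseJob *job)
{
    return job ? job->user_paused : false;
}

/* Mirrors block_job_user_resume(): a no-op unless a user pause is pending. */
static void demo_job_user_resume(PauseJob *job)
{
    if (job && job->user_paused && job->pause_count > 0) {
        job->user_paused = false;
        demo_job_resume(job);
    }
}
```

The point of the wrapper is that a user resume can only drop the reference
taken by a user pause; a pause taken internally keeps the job paused.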

Re: [Qemu-devel] [PATCHv3 7/7] spapr: Improved placement of PCI host bridges in guest memory map

2016-10-13 Thread David Gibson

On Thu, Oct 13, 2016 at 10:40:32AM +0200, Laurent Vivier wrote:
> 
> 
> On 13/10/2016 01:57, David Gibson wrote:
> > Currently, the MMIO space for accessing PCI on pseries guests begins at
> > 1 TiB in guest address space.  Each PCI host bridge (PHB) has a 64 GiB
> > chunk of address space in which it places its outbound PIO and 32-bit and
> > 64-bit MMIO windows.
> > 
> > This scheme has several problems:
> >   - It limits guest RAM to 1 TiB (though we have a limited fix for this
> > now)
> >   - It limits the total MMIO window to 64 GiB.  This is not always enough
> > for some of the large nVidia GPGPU cards
> >   - Putting all the windows into a single 64 GiB area means that naturally
> > aligning things within there will waste more address space.
> > In addition there was a miscalculation in some of the defaults, which meant
> > that the MMIO windows for each PHB actually slightly overran the 64 GiB
> > region for that PHB.  We got away without nasty consequences because
> > the overrun fit within an unused area at the beginning of the next PHB's
> > region, but it's not pretty.
> > 
> > This patch implements a new scheme which addresses those problems, and is
> > also closer to what bare metal hardware and pHyp guests generally use.
> > 
> > Because some guest versions (including most current distro kernels) can't
> > access PCI MMIO above 64 TiB, we put all the PCI windows between 32 TiB and
> > 64 TiB.  This is broken into 1 TiB chunks.  The first 1 TiB chunk contains the PIO
> > (64 kiB) and 32-bit MMIO (2 GiB) windows for all of the PHBs.  Each
> > subsequent TiB chunk contains a naturally aligned 64-bit MMIO window for
> > one PHB each.
> > 
> > This reduces the number of allowed PHBs (without full manual configuration
> > of all the windows) from 256 to 31, but this should still be plenty in
> > practice.
> > 
> > We also change some of the default window sizes for manually configured
> > PHBs to saner values.
> > 
> > Finally we adjust some tests and libqos so that it correctly uses the new
> > default locations.  Ideally it would parse the device tree given to the
> > guest, but that's a more complex problem for another time.
> > 
> > Signed-off-by: David Gibson 
> > ---
> >  hw/ppc/spapr.c  | 126 
> > +---
> >  hw/ppc/spapr_pci.c  |   5 +-
> >  include/hw/pci-host/spapr.h |   8 ++-
> >  tests/endianness-test.c |   3 +-
> >  tests/libqos/pci-spapr.c|   9 ++--
> >  tests/spapr-phb-test.c  |   2 +-
> >  6 files changed, 113 insertions(+), 40 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 8db3657..2d952a8 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2375,31 +2375,42 @@ static void spapr_phb_placement(sPAPRMachineState 
> > *spapr, uint32_t index,
> >  hwaddr *mmio32, hwaddr *mmio64,
> >  unsigned n_dma, uint32_t *liobns, Error 
> > **errp)
> >  {
> > +/*
> > + * New-style PHB window placement.
> > + *
> > + * Goals: Gives large (1TiB), naturally aligned 64-bit MMIO window
> > + * for each PHB, in addition to 2GiB 32-bit MMIO and 64kiB PIO
> > + * windows.
> > + *
> > + * Some guest kernels can't work with MMIO windows above 1<<46
> > + * (64TiB), so we place up to 31 PHBs in the area 32TiB..64TiB
> > + *
> > + * 32TiB..33TiB contains the PIO and 32-bit MMIO windows for all
> > + * PHBs.  33..34TiB has the 64-bit MMIO window for PHB0, 34..35
> > + * has the 64-bit window for PHB1 and so forth.
> > + */
> >  const uint64_t base_buid = 0x8002000ULL;
> > -const hwaddr phb_spacing = 0x1000000000ULL; /* 64 GiB */
> > -const hwaddr mmio_offset = 0xa0000000; /* 2 GiB + 512 MiB */
> > -const hwaddr pio_offset = 0x80000000; /* 2 GiB */
> > -const uint32_t max_index = 255;
> > -const hwaddr phb0_alignment = 0x10000000000ULL; /* 1 TiB */
> >  
> > -uint64_t ram_top = MACHINE(spapr)->ram_size;
> > -hwaddr phb0_base, phb_base;
> > +int max_phbs =
> > +(SPAPR_PCI_LIMIT - SPAPR_PCI_BASE) / SPAPR_PCI_MEM64_WIN_SIZE - 1;
> > +hwaddr mmio32_base = SPAPR_PCI_BASE + SPAPR_PCI_MEM32_WIN_SIZE;
> 
> The result is the same but I would add SPAPR_PCI_MEM_WIN_BUS_OFFSET
> instead of SPAPR_PCI_MEM32_WIN_SIZE.
> 
> As SPAPR_PCI_MEM32_WIN_SIZE is defined as "((1ULL << 32) -
> SPAPR_PCI_MEM_WIN_BUS_OFFSET)", I guess 0..SPAPR_PCI_MEM_WIN_BUS_OFFSET
> is for PIO and SPAPR_PCI_MEM_WIN_BUS_OFFSET..(1<<32) is for MMIO.

No, not quite.

> This is what we have below with:
> 
> *pio = SPAPR_PCI_BASE + index * SPAPR_PCI_IO_WIN_SIZE;
> *mmio32 = mmio32_base + index * SPAPR_PCI_MEM32_WIN_SIZE;
> 
> Perhaps we can see *mmio32 as
> 
> SPAPR_PCI_BASE + (index + 1) * SPAPR_PCI_MEM32_WIN_SIZE

That's the intended effect.  Basically we take 32..64TiB and divide it
into 1TiB regions.  Regions 1..31 are mmio64
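
The placement arithmetic discussed above can be sketched in standalone C.
This is an illustration only: the DEMO_* constants are assumptions taken
from the sizes quoted in the commit message (32 TiB base, 64 TiB limit,
1 TiB 64-bit windows, 2 GiB 32-bit windows, 64 kiB PIO windows), the mmio64
formula is inferred from the "33..34TiB has the 64-bit MMIO window for PHB0"
comment, and none of the names are the real SPAPR_PCI_* macros.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, taken from the sizes quoted in the commit message. */
#define DEMO_TIB            (1ULL << 40)
#define DEMO_PCI_BASE       (32 * DEMO_TIB)  /* start of the PCI area */
#define DEMO_PCI_LIMIT      (64 * DEMO_TIB)  /* some guests cannot go higher */
#define DEMO_MEM64_WIN_SIZE DEMO_TIB         /* one 64-bit window per PHB */
#define DEMO_MEM32_WIN_SIZE (2ULL << 30)     /* 2 GiB */
#define DEMO_IO_WIN_SIZE    (64ULL << 10)    /* 64 kiB */

typedef struct DemoPhbWindows {
    uint64_t pio;
    uint64_t mmio32;
    uint64_t mmio64;
} DemoPhbWindows;

/* The first 1 TiB region is shared, so one region fewer is left for PHBs. */
static unsigned demo_max_phbs(void)
{
    return (unsigned)((DEMO_PCI_LIMIT - DEMO_PCI_BASE)
                      / DEMO_MEM64_WIN_SIZE - 1);
}

static DemoPhbWindows demo_phb_placement(unsigned index)
{
    DemoPhbWindows w;

    assert(index < demo_max_phbs());
    /* PIO windows are packed from the very start of the shared region. */
    w.pio = DEMO_PCI_BASE + index * DEMO_IO_WIN_SIZE;
    /* 32-bit windows start one window size in, leaving room for the PIO. */
    w.mmio32 = DEMO_PCI_BASE + (index + 1) * DEMO_MEM32_WIN_SIZE;
    /* Region 0 is shared; PHB n owns region n + 1 for its 64-bit window. */
    w.mmio64 = DEMO_PCI_BASE + (index + 1) * DEMO_MEM64_WIN_SIZE;
    return w;
}
```

With these values the last PHB (index 30) gets its 64-bit window at
63..64 TiB, and all 32-bit windows still end below 33 TiB.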

[Qemu-devel] [PATCH 0/7] blockjobs: preliminary refactoring work, Pt 1

2016-10-13 Thread John Snow

This is a follow-up to patches 1-6 of:
[PATCH v2 00/11] blockjobs: Fix transactional race condition

That series started trying to refactor blockjobs with the goal of
internalizing BlockJob state as a side effect of having gone through
the effort of figuring out which commands were "safe" to call on
a Job that has no coroutine object.

I've split out the less contentious bits so I can move forward with my
original work of focusing on the transactional race condition in a
different series.

Functionally the biggest difference here is the presence of "internal"
block jobs, which do not emit QMP events or show up in block query
requests. This is done for the sake of replication jobs, which should
not be interfering with the public jobs namespace.
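
As a rough illustration of what "internal" means here, the sketch below
models a job that carries a creation flag and is skipped both by a query and
by event emission. FlagJob, DEMO_JOB_DEFAULT, DEMO_JOB_INTERNAL and the
demo_* helpers are hypothetical names for this example; how the series
actually marks a job as internal is not shown in this cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical creation flags, loosely modelled on creation_flags. */
enum {
    DEMO_JOB_DEFAULT  = 0x00,
    DEMO_JOB_INTERNAL = 0x01,
};

typedef struct FlagJob {
    const char *name;
    int creation_flags;
} FlagJob;

static bool demo_job_is_internal(const FlagJob *job)
{
    assert(job != NULL);
    return (job->creation_flags & DEMO_JOB_INTERNAL) != 0;
}

/* Count the jobs a management query would list; internal ones are hidden. */
static size_t demo_count_public_jobs(const FlagJob *jobs, size_t n)
{
    size_t visible = 0;

    for (size_t i = 0; i < n; i++) {
        if (!demo_job_is_internal(&jobs[i])) {
            visible++;
        }
    }
    return visible;
}

/* Events are only emitted for jobs that are part of the public namespace. */
static bool demo_job_should_emit_event(const FlagJob *job)
{
    return !demo_job_is_internal(job);
}
```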



For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch job-refactor-pt1
https://github.com/jnsnow/qemu/tree/job-refactor-pt1

This version is tagged job-refactor-pt1-v1:
https://github.com/jnsnow/qemu/releases/tag/job-refactor-pt1-v1

John Snow (7):
  blockjobs: hide internal jobs from management API
  blockjobs: Allow creating internal jobs
  Replication/Blockjobs: Create replication jobs as internal
  blockjob: centralize QMP event emissions
  Blockjobs: Internalize user_pause logic
  blockjobs: split interface into public/private, Part 1
  blockjobs: fix documentation

 block/backup.c   |   5 +-
 block/commit.c   |  10 +-
 block/mirror.c   |  28 +++--
 block/replication.c  |  14 +--
 block/stream.c   |   9 +-
 block/trace-events   |   5 +-
 blockdev.c   |  74 +
 blockjob.c   | 109 ++
 include/block/block.h|   3 +-
 include/block/block_int.h|  26 ++---
 include/block/blockjob.h | 257 +++
 include/block/blockjob_int.h | 232 ++
 qemu-img.c   |   5 +-
 tests/test-blockjob-txn.c|   5 +-
 tests/test-blockjob.c|   4 +-
 15 files changed, 443 insertions(+), 343 deletions(-)
 create mode 100644 include/block/blockjob_int.h

-- 
2.7.4

Re: [Qemu-devel] [PATCH v4 07/20] ppc/pnv: add XSCOM handlers to PnvCore

2016-10-13 Thread David Gibson

On Thu, Oct 13, 2016 at 08:50:41AM +0200, Cédric Le Goater wrote:
> On 10/13/2016 02:51 AM, David Gibson wrote:
> > On Mon, Oct 03, 2016 at 09:24:43AM +0200, Cédric Le Goater wrote:
> >> Now that we are using real HW ids for the cores in PowerNV chips, we
> >> can route the XSCOM accesses to them. We just need to attach a
> >> specific XSCOM memory region to each core in the appropriate window
> >> for the core number.
> >>
> >> To start with, let's install the DTS (Digital Thermal Sensor) handlers
> >> which should return 38°C for each core.
> >>
> >> Signed-off-by: Cédric Le Goater 
> >> ---
> >>
> >>  Changes since v3:
> >>
> >>  - moved to new XSCOM model
> >>  - kept the write op on the XSCOM memory region for later use
> >>
> >>  Changes since v2:
> >>
> >>  - added a XSCOM memory region to handle access to the EX core
> >>registers   
> >>  - extended the PnvCore object with a XSCOM_INTERFACE so that we can
> >>use pnv_xscom_pcba() and pnv_xscom_addr() to handle XSCOM address
> >>translation.
> >>
> >>  hw/ppc/pnv.c   |  4 
> >>  hw/ppc/pnv_core.c  | 50 
> >> ++
> >>  include/hw/ppc/pnv_core.h  |  2 ++
> >>  include/hw/ppc/pnv_xscom.h | 19 ++
> >>  4 files changed, 75 insertions(+)
> >>
> >> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> >> index 5e19b6880387..ffe245fe59d2 100644
> >> --- a/hw/ppc/pnv.c
> >> +++ b/hw/ppc/pnv.c
> >> @@ -620,6 +620,10 @@ static void pnv_chip_realize(DeviceState *dev, Error 
> >> **errp)
> >>   _fatal);
> >>  object_unref(OBJECT(pnv_core));
> >>  i++;
> >> +
> >> +memory_region_add_subregion(>xscom,
> >> + PNV_XSCOM_EX_CORE_BASE(core_hwid) << 3,
> >> + _CORE(pnv_core)->xscom_regs);
> > 
> > Might be worth adding some convenience functions for doing the various
> > bits of xscom MR juggling, otherwise this looks fine.
> 
> To do the 8 byte shifting ? or something like this :
>  
>   void pnv_xscom_add_subregion(PnvChip *chip, uint32_t pcba,
>MemoryRegion *subregion);

Yes, something like that, which incorporates the << 3.  And maybe
another one to construct the MR for use on the scom device side.
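
A minimal sketch of the address conversion such a helper would hide,
assuming only what the patch shows: XSCOM registers are 8 bytes wide, so a
register number is shifted left by 3 to get a byte offset, and the read
handler undoes that with addr >> 3. The demo_* names are made up for this
example and are not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Register number (PCB address) to byte offset inside the XSCOM region. */
static uint64_t demo_xscom_pcba_to_offset(uint32_t pcba)
{
    return (uint64_t)pcba << 3;
}

/* Byte offset back to register number, as done in the read handler. */
static uint32_t demo_xscom_offset_to_pcba(uint64_t offset)
{
    assert((offset & 7) == 0);  /* accesses are always 8 bytes wide */
    return (uint32_t)(offset >> 3);
}
```

A convenience wrapper around memory_region_add_subregion() would take the
register number and apply the first conversion itself, so callers never
write the shift by hand.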

> or even :
> 
>   void pnv_xscom_add_subregion(PnvChip *chip, PnvXScomInterface *obj)
> 
> but that would require some more handlers under  PnvXScomInterface.
> 
> Thanks,
> 
> C.
> 
> 
> >>  }
> >>  g_free(typename);
> >>  }
> >> diff --git a/hw/ppc/pnv_core.c b/hw/ppc/pnv_core.c
> >> index d37788f142f4..a1c8a14f06b6 100644
> >> --- a/hw/ppc/pnv_core.c
> >> +++ b/hw/ppc/pnv_core.c
> >> @@ -19,6 +19,7 @@
> >>  #include "qemu/osdep.h"
> >>  #include "sysemu/sysemu.h"
> >>  #include "qapi/error.h"
> >> +#include "qemu/log.h"
> >>  #include "target-ppc/cpu.h"
> >>  #include "hw/ppc/ppc.h"
> >>  #include "hw/ppc/pnv.h"
> >> @@ -64,6 +65,51 @@ static void powernv_cpu_init(PowerPCCPU *cpu, Error 
> >> **errp)
> >>  powernv_cpu_reset(cpu);
> >>  }
> >>  
> >> +/*
> >> + * These values are read by the PowerNV HW monitors under Linux
> >> + */
> >> +#define PNV_XSCOM_EX_DTS_RESULT0 0x5
> >> +#define PNV_XSCOM_EX_DTS_RESULT1 0x50001
> >> +
> >> +static uint64_t pnv_core_xscom_read(void *opaque, hwaddr addr,
> >> +unsigned int width)
> >> +{
> >> +uint32_t offset = addr >> 3;
> >> +uint64_t val = 0;
> >> +
> >> +/* The result should be 38 C */
> >> +switch (offset) {
> >> +case PNV_XSCOM_EX_DTS_RESULT0:
> >> +val = 0x26f024f023full;
> >> +break;
> >> +case PNV_XSCOM_EX_DTS_RESULT1:
> >> +val = 0x24full;
> >> +break;
> >> +default:
> >> +qemu_log_mask(LOG_UNIMP, "Warning: reading reg=0x%" HWADDR_PRIx,
> >> +  addr);
> >> +}
> >> +
> >> +return val;
> >> +}
> >> +
> >> +static void pnv_core_xscom_write(void *opaque, hwaddr addr, uint64_t val,
> >> + unsigned int width)
> >> +{
> >> +qemu_log_mask(LOG_UNIMP, "Warning: writing to reg=0x%" HWADDR_PRIx,
> >> +  addr);
> >> +}
> >> +
> >> +static const MemoryRegionOps pnv_core_xscom_ops = {
> >> +.read = pnv_core_xscom_read,
> >> +.write = pnv_core_xscom_write,
> >> +.valid.min_access_size = 8,
> >> +.valid.max_access_size = 8,
> >> +.impl.min_access_size = 8,
> >> +.impl.max_access_size = 8,
> >> +.endianness = DEVICE_BIG_ENDIAN,
> >> +};
> >> +
> >>  static void pnv_core_realize_child(Object *child, Error **errp)
> >>  {
> >>  Error *local_err = NULL;
> >> @@ -119,6 +165,10 @@ static void pnv_core_realize(DeviceState *dev, Error 
> >> **errp)
> >>  goto err;
> >>  }
> >>  }
> >> +
> >> +snprintf(name, sizeof(name), "xscom-core.%d", cc->core_id);
> >> +memory_region_init_io(>xscom_regs, OBJECT(dev), 
> >>

Re: [Qemu-devel] [PATCH v1 3/3] target-ppc: implement xxbr[qdwh] instruction

2016-10-13 Thread David Gibson

On Thu, Oct 13, 2016 at 01:14:35PM -0500, Richard Henderson wrote:
> On 10/12/2016 07:21 PM, David Gibson wrote:
> > > +static void gen_bswap32x4(TCGv_i64 outh, TCGv_i64 outl,
> > > +  TCGv_i64 inh, TCGv_i64 inl)
> > > +{
> > > +TCGv_i64 hi = tcg_temp_new_i64();
> > > +TCGv_i64 lo = tcg_temp_new_i64();
> > > +
> > > +tcg_gen_bswap64_i64(hi, inh);
> > > +tcg_gen_bswap64_i64(lo, inl);
> > > +tcg_gen_shri_i64(outh, hi, 32);
> > > +tcg_gen_deposit_i64(outh, outh, hi, 32, 32);
> > > +tcg_gen_shri_i64(outl, lo, 32);
> > > +tcg_gen_deposit_i64(outl, outl, lo, 32, 32);
> > > +
> > > +tcg_temp_free_i64(hi);
> > > +tcg_temp_free_i64(lo);
> > > +}
> > 
> > Is there actually any advantage to having this 128-bit operation
> > working on two 64-bit "register"s, as opposed to having a bswap32x2
> > > that operates on a single 64-bit register and calling it twice?
> 
> For this one, no particular advantage.
> 
> > > +gen_bswap16x8(xth, xtl, xbh, xbl);
> > 
> > Likewise for the 16x8 version, I guess, although that would mean
> > changing the existing users.
> 
> For this one, we have to build a 64-bit constant, 0x00ff00ff00ff00ff.  On
> some hosts that's up to 6 insns.  Being able to reuse that for both swaps
> is useful.

Ah, good point.
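
For reference, the mask-based halfword swap being discussed looks roughly
like this in plain C. It is a sketch of the technique only, not the TCG code
from the patch; the demo_* names are made up. The 128-bit variant takes both
halves so that the one mask constant serves both swaps.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Swap the two bytes inside each 16-bit lane, given a prepared mask. */
static uint64_t demo_bswap16x4_masked(uint64_t x, uint64_t mask)
{
    return ((x & mask) << 8) | ((x >> 8) & mask);
}

/* 128-bit version: the constant is materialised once and used twice. */
static void demo_bswap16x8(uint64_t *outh, uint64_t *outl,
                           uint64_t inh, uint64_t inl)
{
    const uint64_t mask = 0x00ff00ff00ff00ffULL;

    assert(outh != NULL && outl != NULL);
    *outh = demo_bswap16x4_masked(inh, mask);
    *outl = demo_bswap16x4_masked(inl, mask);
}
```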

> > > +tcg_gen_bswap64_i64(t0, xbl);
> > > +tcg_gen_bswap64_i64(xtl, xbh);
> > > +tcg_gen_bswap64_i64(xth, t0);
> > 
> > This looks wrong.  You swap xbl as you move it to t0, then swap it
> > again as you put it back into xth.  So it looks like you'll translate
> >0011223344556677 8899AABBCCDDEEFF
> > to
> >8899AABBCCDDEEFF 7766554433221100
> > whereas it should become
> > >FFEEDDCCBBAA9988 7766554433221100
> 
> Indeed, the third line should be a move, not a swap.
> 
> 
> r~
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
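
The two points from this review can be checked with a small standalone C
sketch. It is not the TCG code; the demo_* names are made up for the
example. demo_bswap32x2() shows the single-register variant (a full 64-bit
swap followed by exchanging the two halves), and demo_xxbrq() shows the
corrected 128-bit reversal where the third step is a move, not a swap.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Portable 64-bit byte reversal. */
static uint64_t demo_bswap64(uint64_t x)
{
    x = ((x & 0x00ff00ff00ff00ffULL) << 8)
      | ((x >> 8) & 0x00ff00ff00ff00ffULL);
    x = ((x & 0x0000ffff0000ffffULL) << 16)
      | ((x >> 16) & 0x0000ffff0000ffffULL);
    return (x << 32) | (x >> 32);
}

/* Byte-swap each 32-bit word: full swap, then exchange the two halves. */
static uint64_t demo_bswap32x2(uint64_t x)
{
    uint64_t t = demo_bswap64(x);

    return (t >> 32) | (t << 32);
}

/* 128-bit byte reversal: swap each half, then exchange the halves. */
static void demo_xxbrq(uint64_t *xth, uint64_t *xtl,
                       uint64_t xbh, uint64_t xbl)
{
    uint64_t t0;

    assert(xth != NULL && xtl != NULL);
    t0 = demo_bswap64(xbl);
    *xtl = demo_bswap64(xbh);
    *xth = t0;  /* plain move; swapping again would undo the reversal */
}
```

Using the temporary also keeps the result correct when the target and
source registers are the same.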



Re: [Qemu-devel] [PULL 0/4] ppc patches for qemu-2.7 stable branch

2016-10-13 Thread David Gibson

On Thu, Oct 13, 2016 at 12:57:19PM +0100, Peter Maydell wrote:
> On 13 October 2016 at 12:54, Peter Maydell <peter.mayd...@linaro.org> wrote:
> > On 13 October 2016 at 06:15, David Gibson <da...@gibson.dropbear.id.au> 
> > wrote:
> >> The following changes since commit 
> >> 1dc33ed90bf1fe1c2014dffa0d9e863c520d953a:
> >>
> >>   Update version for v2.7.0 release (2016-09-02 13:44:11 +0100)
> >>
> >> are available in the git repository at:
> >>
> >>   git://github.com/dgibson/qemu.git tags/ppc-for-2.7-20161013
> >>
> >> for you to fetch changes up to 2e68f28854f0120c9a938a61b64aaf1eaecb162b:
> >>
> >>   ppc: Check the availability of transactional memory (2016-10-13 12:58:06 
> >> +1100)
> >
> > Applied to master, thanks. (I hope that was what you had in mind;
> > if not we'll have to unwind stuff somehow...)
> 
> Looking more closely, there's no actual diff between HEAD
> now and what it was, so the merge commit is a no-op of sorts.
> Hopefully it doesn't cause any problems.

Right.. these were all (clean) cherry picks from the master branch
anyway, so I'd expect it to be a no-op.

> More generally, we need to come up with something for distinguishing
> PULL requests not for master, because my current workflow basically
> says "anything that says 'for you to fetch changes up to' will get
> merged into master"...

Um.. yes.. this was intended for merge to the 2.7 branch, not master.
Any ideas how I should express that?

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[Qemu-devel] [PATCH 6/7] blockjobs: split interface into public/private, Part 1

2016-10-13 Thread John Snow

To make it a little more obvious which functions are intended to be
public interface and which are intended to be for use only by jobs
themselves, split the interface into "public" and "private" files.

Convert blockjobs (e.g. block/backup) to using the private interface.
Leave blockdev and others on the public interface.

There are remaining uses of private state by qemu-img, and several
cases in blockdev.c and block/io.c where we grab job->blk for the
purposes of acquiring an AIOContext.

These will be corrected in future patches.

Signed-off-by: John Snow 
---
 block/backup.c   |   2 +-
 block/commit.c   |   2 +-
 block/mirror.c   |   2 +-
 block/stream.c   |   2 +-
 blockjob.c   |   2 +-
 include/block/block.h|   3 +-
 include/block/blockjob.h | 205 +-
 include/block/blockjob_int.h | 232 +++
 tests/test-blockjob-txn.c|   2 +-
 tests/test-blockjob.c|   2 +-
 10 files changed, 244 insertions(+), 210 deletions(-)
 create mode 100644 include/block/blockjob_int.h

diff --git a/block/backup.c b/block/backup.c
index 6a60ca8..6d12100 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -16,7 +16,7 @@
 #include "trace.h"
 #include "block/block.h"
 #include "block/block_int.h"
-#include "block/blockjob.h"
+#include "block/blockjob_int.h"
 #include "block/block_backup.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
diff --git a/block/commit.c b/block/commit.c
index 475a375..d555600 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -15,7 +15,7 @@
 #include "qemu/osdep.h"
 #include "trace.h"
 #include "block/block_int.h"
-#include "block/blockjob.h"
+#include "block/blockjob_int.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/ratelimit.h"
diff --git a/block/mirror.c b/block/mirror.c
index 4374fb4..c81b5e0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -13,7 +13,7 @@
 
 #include "qemu/osdep.h"
 #include "trace.h"
-#include "block/blockjob.h"
+#include "block/blockjob_int.h"
 #include "block/block_int.h"
 #include "sysemu/block-backend.h"
 #include "qapi/error.h"
diff --git a/block/stream.c b/block/stream.c
index 7d6877d..906f7f3 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -14,7 +14,7 @@
 #include "qemu/osdep.h"
 #include "trace.h"
 #include "block/block_int.h"
-#include "block/blockjob.h"
+#include "block/blockjob_int.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/ratelimit.h"
diff --git a/blockjob.c b/blockjob.c
index d118a1f..e6f0d97 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -27,7 +27,7 @@
 #include "qemu-common.h"
 #include "trace.h"
 #include "block/block.h"
-#include "block/blockjob.h"
+#include "block/blockjob_int.h"
 #include "block/block_int.h"
 #include "sysemu/block-backend.h"
 #include "qapi/qmp/qerror.h"
diff --git a/include/block/block.h b/include/block/block.h
index 107c603..89b5feb 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -7,16 +7,15 @@
 #include "qemu/coroutine.h"
 #include "block/accounting.h"
 #include "block/dirty-bitmap.h"
+#include "block/blockjob.h"
 #include "qapi/qmp/qobject.h"
 #include "qapi-types.h"
 #include "qemu/hbitmap.h"
 
 /* block.c */
 typedef struct BlockDriver BlockDriver;
-typedef struct BlockJob BlockJob;
 typedef struct BdrvChild BdrvChild;
 typedef struct BdrvChildRole BdrvChildRole;
-typedef struct BlockJobTxn BlockJobTxn;
 
 typedef struct BlockDriverInfo {
 /* in bytes, 0 if irrelevant */
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 5b61140..bfc8233 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -28,78 +28,15 @@
 
 #include "block/block.h"
 
-/**
- * BlockJobDriver:
- *
- * A class type for block job driver.
- */
-typedef struct BlockJobDriver {
-/** Derived BlockJob struct size */
-size_t instance_size;
-
-/** String describing the operation, part of query-block-jobs QMP API */
-BlockJobType job_type;
-
-/** Optional callback for job types that support setting a speed limit */
-void (*set_speed)(BlockJob *job, int64_t speed, Error **errp);
-
-/** Optional callback for job types that need to forward I/O status reset */
-void (*iostatus_reset)(BlockJob *job);
-
-/**
- * Optional callback for job types whose completion must be triggered
- * manually.
- */
-void (*complete)(BlockJob *job, Error **errp);
-
-/**
- * If the callback is not NULL, it will be invoked when all the jobs
- * belonging to the same transaction complete; or upon this job's
- * completion if it is not in a transaction. Skipped if NULL.
- *
- * All jobs will complete with a call to either .commit() or .abort() but
- * never both.
- */
-void (*commit)(BlockJob *job);
-
-/**
- * If the callback is not NULL, it will be invoked when any job in the
- * same transaction fails;

[Qemu-devel] [PATCH 4/7] blockjob: centralize QMP event emissions

2016-10-13 Thread John Snow

There's no reason to leave this to blockdev; we can do it in blockjobs
directly and get rid of an extra callback for most users.

All non-internal jobs, even those created outside of QMP, will
consistently emit events.

Signed-off-by: John Snow 
---
 block/commit.c|  8 
 block/mirror.c|  6 ++
 block/stream.c|  7 +++
 block/trace-events|  5 ++---
 blockdev.c| 42 --
 blockjob.c| 23 +++
 include/block/block_int.h | 17 -
 include/block/blockjob.h  | 17 -
 8 files changed, 42 insertions(+), 83 deletions(-)

diff --git a/block/commit.c b/block/commit.c
index f29e341..475a375 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -209,8 +209,8 @@ static const BlockJobDriver commit_job_driver = {
 
 void commit_start(const char *job_id, BlockDriverState *bs,
   BlockDriverState *base, BlockDriverState *top, int64_t speed,
-  BlockdevOnError on_error, BlockCompletionFunc *cb,
-  void *opaque, const char *backing_file_str, Error **errp)
+  BlockdevOnError on_error, const char *backing_file_str,
+  Error **errp)
 {
 CommitBlockJob *s;
 BlockReopenQueue *reopen_queue = NULL;
@@ -233,7 +233,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 }
 
 s = block_job_create(job_id, &commit_job_driver, bs, speed,
- BLOCK_JOB_DEFAULT, cb, opaque, errp);
+ BLOCK_JOB_DEFAULT, NULL, NULL, errp);
 if (!s) {
 return;
 }
@@ -276,7 +276,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 s->on_error = on_error;
 s->common.co = qemu_coroutine_create(commit_run, s);
 
-trace_commit_start(bs, base, top, s, s->common.co, opaque);
+trace_commit_start(bs, base, top, s, s->common.co);
 qemu_coroutine_enter(s->common.co);
 }
 
diff --git a/block/mirror.c b/block/mirror.c
index 15d2d10..4374fb4 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -979,9 +979,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
   MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
   BlockdevOnError on_source_error,
   BlockdevOnError on_target_error,
-  bool unmap,
-  BlockCompletionFunc *cb,
-  void *opaque, Error **errp)
+  bool unmap, Error **errp)
 {
 bool is_none_mode;
 BlockDriverState *base;
@@ -994,7 +992,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
 base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
 mirror_start_job(job_id, bs, BLOCK_JOB_DEFAULT, target, replaces,
  speed, granularity, buf_size, backing_mode,
- on_source_error, on_target_error, unmap, cb, opaque, errp,
+ on_source_error, on_target_error, unmap, NULL, NULL, errp,
  _job_driver, is_none_mode, base, false);
 }
 
diff --git a/block/stream.c b/block/stream.c
index eeb6f52..7d6877d 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -216,13 +216,12 @@ static const BlockJobDriver stream_job_driver = {
 
 void stream_start(const char *job_id, BlockDriverState *bs,
   BlockDriverState *base, const char *backing_file_str,
-  int64_t speed, BlockdevOnError on_error,
-  BlockCompletionFunc *cb, void *opaque, Error **errp)
+  int64_t speed, BlockdevOnError on_error, Error **errp)
 {
 StreamBlockJob *s;
 
 s = block_job_create(job_id, &stream_job_driver, bs, speed,
- BLOCK_JOB_DEFAULT, cb, opaque, errp);
+ BLOCK_JOB_DEFAULT, NULL, NULL, errp);
 if (!s) {
 return;
 }
@@ -232,6 +231,6 @@ void stream_start(const char *job_id, BlockDriverState *bs,
 
 s->on_error = on_error;
 s->common.co = qemu_coroutine_create(stream_run, s);
-trace_stream_start(bs, base, s, s->common.co, opaque);
+trace_stream_start(bs, base, s, s->common.co);
 qemu_coroutine_enter(s->common.co);
 }
diff --git a/block/trace-events b/block/trace-events
index 05fa13c..c12f91b 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -20,11 +20,11 @@ bdrv_co_do_copy_on_readv(void *bs, int64_t offset, unsigned int bytes, int64_t c
 
 # block/stream.c
 stream_one_iteration(void *s, int64_t sector_num, int nb_sectors, int is_allocated) "s %p sector_num %"PRId64" nb_sectors %d is_allocated %d"
-stream_start(void *bs, void *base, void *s, void *co, void *opaque) "bs %p base %p s %p co %p opaque %p"
+stream_start(void *bs, void *base, void *s, void *co) "bs %p base %p s %p co %p"
 
 # block/commit.c
 commit_one_iteration(void *s, int64_t sector_num, int nb_sectors, int is_allocated) "s %p sector_num %"PRId64" nb_sectors %d is_allocated %d"

[Qemu-devel] [PATCH 3/7] Replication/Blockjobs: Create replication jobs as internal

2016-10-13 Thread John Snow

Bubble up the internal interface to commit and backup jobs, then switch
replication tasks over to using this methodology.

Signed-off-by: John Snow 
---
 block/backup.c|  3 ++-
 block/mirror.c| 21 ++---
 block/replication.c   | 14 +++---
 blockdev.c| 11 +++
 include/block/block_int.h |  9 +++--
 qemu-img.c|  5 +++--
 6 files changed, 36 insertions(+), 27 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 5acb5c4..6a60ca8 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -527,6 +527,7 @@ void backup_start(const char *job_id, BlockDriverState *bs,
   bool compress,
   BlockdevOnError on_source_error,
   BlockdevOnError on_target_error,
+  int creation_flags,
   BlockCompletionFunc *cb, void *opaque,
   BlockJobTxn *txn, Error **errp)
 {
@@ -596,7 +597,7 @@ void backup_start(const char *job_id, BlockDriverState *bs,
 }
 
 job = block_job_create(job_id, _job_driver, bs, speed,
-   BLOCK_JOB_DEFAULT, cb, opaque, errp);
+   creation_flags, cb, opaque, errp);
 if (!job) {
 goto error;
 }
diff --git a/block/mirror.c b/block/mirror.c
index 74c03ae..15d2d10 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -906,9 +906,9 @@ static const BlockJobDriver commit_active_job_driver = {
 };
 
 static void mirror_start_job(const char *job_id, BlockDriverState *bs,
- BlockDriverState *target, const char *replaces,
- int64_t speed, uint32_t granularity,
- int64_t buf_size,
+ int creation_flags, BlockDriverState *target,
+ const char *replaces, int64_t speed,
+ uint32_t granularity, int64_t buf_size,
  BlockMirrorBackingMode backing_mode,
  BlockdevOnError on_source_error,
  BlockdevOnError on_target_error,
@@ -936,8 +936,8 @@ static void mirror_start_job(const char *job_id, 
BlockDriverState *bs,
 buf_size = DEFAULT_MIRROR_BUF_SIZE;
 }
 
-s = block_job_create(job_id, driver, bs, speed,
- BLOCK_JOB_DEFAULT, cb, opaque, errp);
+s = block_job_create(job_id, driver, bs, speed, creation_flags,
+ cb, opaque, errp);
 if (!s) {
 return;
 }
@@ -992,17 +992,16 @@ void mirror_start(const char *job_id, BlockDriverState 
*bs,
 }
 is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
 base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
-mirror_start_job(job_id, bs, target, replaces,
+mirror_start_job(job_id, bs, BLOCK_JOB_DEFAULT, target, replaces,
  speed, granularity, buf_size, backing_mode,
  on_source_error, on_target_error, unmap, cb, opaque, errp,
  &mirror_job_driver, is_none_mode, base, false);
 }
 
 void commit_active_start(const char *job_id, BlockDriverState *bs,
- BlockDriverState *base, int64_t speed,
- BlockdevOnError on_error,
- BlockCompletionFunc *cb,
- void *opaque, Error **errp,
+ BlockDriverState *base, int creation_flags,
+ int64_t speed, BlockdevOnError on_error,
+ BlockCompletionFunc *cb, void *opaque, Error **errp,
  bool auto_complete)
 {
 int64_t length, base_length;
@@ -1041,7 +1040,7 @@ void commit_active_start(const char *job_id, 
BlockDriverState *bs,
 }
 }
 
-mirror_start_job(job_id, bs, base, NULL, speed, 0, 0,
+mirror_start_job(job_id, bs, creation_flags, base, NULL, speed, 0, 0,
  MIRROR_LEAVE_BACKING_CHAIN,
  on_error, on_error, false, cb, opaque, &local_err,
  &commit_active_job_driver, false, base, auto_complete);
diff --git a/block/replication.c b/block/replication.c
index 3bd1cf1..d4f4a7b 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -496,10 +496,11 @@ static void replication_start(ReplicationState *rs, 
ReplicationMode mode,
 bdrv_op_block_all(top_bs, s->blocker);
 bdrv_op_unblock(top_bs, BLOCK_OP_TYPE_DATAPLANE, s->blocker);
 
-backup_start("replication-backup", s->secondary_disk->bs,
- s->hidden_disk->bs, 0, MIRROR_SYNC_MODE_NONE, NULL, false,
+backup_start(NULL, s->secondary_disk->bs, s->hidden_disk->bs, 0,
+ MIRROR_SYNC_MODE_NONE, NULL, false,
  BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
- backup_job_completed, s, NULL, &local_err);
+ BLOCK_JOB_INTERNAL, backup_job_completed, s,
+

[Qemu-devel] [PATCH 1/7] blockjobs: hide internal jobs from management API

2016-10-13 Thread John Snow

If jobs are not created directly by the user, do not allow them to be
seen by the user/management utility. At the moment, 'internal' jobs are
those that do not have an ID. As of this patch it is impossible to
create such jobs.
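
For readers following along, the hiding rule described above can be sketched
in isolation. This is only an illustration under stated assumptions: DemoJob,
demo_job_is_internal() and demo_count_visible_jobs() are hypothetical
stand-ins, not QEMU's BlockJob or its API. The only thing taken from the patch
is the rule that a job without an ID is internal and is skipped when jobs are
listed for the user.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a block job (not QEMU's BlockJob). */
typedef struct DemoJob {
    const char *id;          /* NULL marks an internal job */
    struct DemoJob *next;    /* next job in a singly linked list */
} DemoJob;

/* Same rule as the patch: a job with no ID is internal. */
static bool demo_job_is_internal(const DemoJob *job)
{
    return job->id == NULL;
}

/* Count the jobs a management client would be allowed to see. */
static size_t demo_count_visible_jobs(const DemoJob *head)
{
    const DemoJob *job;
    size_t visible = 0;

    for (job = head; job != NULL; job = job->next) {
        if (demo_job_is_internal(job)) {
            continue;        /* internal jobs are skipped, never reported */
        }
        visible++;
    }
    return visible;
}
```

A list holding two named jobs and one job with a NULL id would report two
visible jobs.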

Signed-off-by: John Snow 
---
 blockdev.c   | 17 +
 blockjob.c   | 37 +++--
 include/block/blockjob.h | 12 ++--
 3 files changed, 54 insertions(+), 12 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 07ec733..5904edb 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3915,13 +3915,22 @@ BlockJobInfoList *qmp_query_block_jobs(Error **errp)
 BlockJob *job;
 
 for (job = block_job_next(NULL); job; job = block_job_next(job)) {
-BlockJobInfoList *elem = g_new0(BlockJobInfoList, 1);
-AioContext *aio_context = blk_get_aio_context(job->blk);
+BlockJobInfoList *elem;
+AioContext *aio_context;
 
+if (block_job_is_internal(job)) {
+continue;
+}
+elem = g_new0(BlockJobInfoList, 1);
+aio_context = blk_get_aio_context(job->blk);
 aio_context_acquire(aio_context);
-elem->value = block_job_query(job);
+elem->value = block_job_query(job, errp);
 aio_context_release(aio_context);
-
+if (!elem->value) {
+g_free(elem);
+qapi_free_BlockJobInfoList(head);
+return NULL;
+}
 *p_next = elem;
 p_next = &elem->next;
 }
diff --git a/blockjob.c b/blockjob.c
index 43fecbe..e78ad94 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -185,6 +185,11 @@ void *block_job_create(const char *job_id, const 
BlockJobDriver *driver,
 return job;
 }
 
+bool block_job_is_internal(BlockJob *job)
+{
+return (job->id == NULL);
+}
+
 void block_job_ref(BlockJob *job)
 {
 ++job->refcnt;
@@ -494,9 +499,15 @@ void block_job_yield(BlockJob *job)
 block_job_pause_point(job);
 }
 
-BlockJobInfo *block_job_query(BlockJob *job)
+BlockJobInfo *block_job_query(BlockJob *job, Error **errp)
 {
-BlockJobInfo *info = g_new0(BlockJobInfo, 1);
+BlockJobInfo *info;
+
+if (block_job_is_internal(job)) {
+error_setg(errp, "Cannot query QEMU internal Jobs");
+return NULL;
+}
+info = g_new0(BlockJobInfo, 1);
 info->type  = g_strdup(BlockJobType_lookup[job->driver->job_type]);
 info->device= g_strdup(job->id);
 info->len   = job->len;
@@ -519,6 +530,10 @@ static void block_job_iostatus_set_err(BlockJob *job, int 
error)
 
 void block_job_event_cancelled(BlockJob *job)
 {
+if (block_job_is_internal(job)) {
+return;
+}
+
 qapi_event_send_block_job_cancelled(job->driver->job_type,
 job->id,
 job->len,
@@ -529,6 +544,10 @@ void block_job_event_cancelled(BlockJob *job)
 
 void block_job_event_completed(BlockJob *job, const char *msg)
 {
+if (block_job_is_internal(job)) {
+return;
+}
+
 qapi_event_send_block_job_completed(job->driver->job_type,
 job->id,
 job->len,
@@ -543,6 +562,10 @@ void block_job_event_ready(BlockJob *job)
 {
 job->ready = true;
 
+if (block_job_is_internal(job)) {
+return;
+}
+
 qapi_event_send_block_job_ready(job->driver->job_type,
 job->id,
 job->len,
@@ -573,10 +596,12 @@ BlockErrorAction block_job_error_action(BlockJob *job, 
BlockdevOnError on_err,
 default:
 abort();
 }
-qapi_event_send_block_job_error(job->id,
-is_read ? IO_OPERATION_TYPE_READ :
-IO_OPERATION_TYPE_WRITE,
-action, &error_abort);
+if (!block_job_is_internal(job)) {
+qapi_event_send_block_job_error(job->id,
+is_read ? IO_OPERATION_TYPE_READ :
+IO_OPERATION_TYPE_WRITE,
+action, &error_abort);
+}
 if (action == BLOCK_ERROR_ACTION_STOP) {
 /* make the pause user visible, which will be resumed from QMP. */
 job->user_paused = true;
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 4ddb4ae..6ecfa2e 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -107,7 +107,7 @@ struct BlockJob {
 BlockBackend *blk;
 
 /**
- * The ID of the block job.
+ * The ID of the block job. May be NULL for internal jobs.
  */
 char *id;
 
@@ -333,7 +333,7 @@ bool block_job_is_cancelled(BlockJob *job);
  *
  * Return information about a job.
  */
-BlockJobInfo *block_job_query(BlockJob *job);
+BlockJobInfo *block_job_query(BlockJob *job, Error **errp);
 
 /**
  * block_job_pause_point:
@@ -504,4 +504,12 @@ void

[Qemu-devel] [PATCH 2/7] blockjobs: Allow creating internal jobs

2016-10-13 Thread John Snow

Add the ability to create jobs without an ID.
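
As a rough sketch of the ID rules this patch introduces, the stand-alone
function below mirrors the order of the checks in block_job_create(). The
names DEMO_JOB_DEFAULT, DEMO_JOB_INTERNAL and demo_check_job_id() are
hypothetical, the device name is passed in directly instead of being looked up
from a node, and the well-formedness and uniqueness checks are left out for
brevity.

```c
#include <stddef.h>

/* Hypothetical mirror of the creation flags added by the patch. */
enum {
    DEMO_JOB_DEFAULT  = 0x00,
    DEMO_JOB_INTERNAL = 0x01
};

/*
 * Returns NULL if the ID/flags combination is acceptable, otherwise an
 * error message. device_name must not be NULL (it may be empty).
 */
static const char *demo_check_job_id(const char *job_id,
                                     const char *device_name, int flags)
{
    /* User-visible jobs fall back to the device name when no ID is given. */
    if (job_id == NULL && !(flags & DEMO_JOB_INTERNAL)) {
        job_id = device_name;
        if (!*job_id) {
            return "An explicit job ID is required for this node";
        }
    }

    /* Internal jobs must stay anonymous. */
    if (job_id != NULL && (flags & DEMO_JOB_INTERNAL)) {
        return "Cannot specify job ID for internal block job";
    }

    return NULL;
}
```

The net effect is that an internal job always ends up with a NULL ID, while a
user-visible job always ends up with a non-empty one.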

Signed-off-by: John Snow 
---
 block/backup.c|  2 +-
 block/commit.c|  2 +-
 block/mirror.c|  3 ++-
 block/stream.c|  2 +-
 blockjob.c| 25 -
 include/block/blockjob.h  |  7 ++-
 tests/test-blockjob-txn.c |  3 ++-
 tests/test-blockjob.c |  2 +-
 8 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 582bd0f..5acb5c4 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -596,7 +596,7 @@ void backup_start(const char *job_id, BlockDriverState *bs,
 }
 
 job = block_job_create(job_id, &backup_job_driver, bs, speed,
-   cb, opaque, errp);
+   BLOCK_JOB_DEFAULT, cb, opaque, errp);
 if (!job) {
 goto error;
 }
diff --git a/block/commit.c b/block/commit.c
index 9f67a8b..f29e341 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -233,7 +233,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 }
 
 s = block_job_create(job_id, &commit_job_driver, bs, speed,
- cb, opaque, errp);
+ BLOCK_JOB_DEFAULT, cb, opaque, errp);
 if (!s) {
 return;
 }
diff --git a/block/mirror.c b/block/mirror.c
index f9d1fec..74c03ae 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -936,7 +936,8 @@ static void mirror_start_job(const char *job_id, 
BlockDriverState *bs,
 buf_size = DEFAULT_MIRROR_BUF_SIZE;
 }
 
-s = block_job_create(job_id, driver, bs, speed, cb, opaque, errp);
+s = block_job_create(job_id, driver, bs, speed,
+ BLOCK_JOB_DEFAULT, cb, opaque, errp);
 if (!s) {
 return;
 }
diff --git a/block/stream.c b/block/stream.c
index 3187481..eeb6f52 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -222,7 +222,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
 StreamBlockJob *s;
 
 s = block_job_create(job_id, &stream_job_driver, bs, speed,
- cb, opaque, errp);
+ BLOCK_JOB_DEFAULT, cb, opaque, errp);
 if (!s) {
 return;
 }
diff --git a/blockjob.c b/blockjob.c
index e78ad94..017905a 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -118,7 +118,7 @@ static void block_job_detach_aio_context(void *opaque)
 }
 
 void *block_job_create(const char *job_id, const BlockJobDriver *driver,
-   BlockDriverState *bs, int64_t speed,
+   BlockDriverState *bs, int64_t speed, int flags,
BlockCompletionFunc *cb, void *opaque, Error **errp)
 {
 BlockBackend *blk;
@@ -130,7 +130,7 @@ void *block_job_create(const char *job_id, const 
BlockJobDriver *driver,
 return NULL;
 }
 
-if (job_id == NULL) {
+if (job_id == NULL && !(flags & BLOCK_JOB_INTERNAL)) {
 job_id = bdrv_get_device_name(bs);
 if (!*job_id) {
 error_setg(errp, "An explicit job ID is required for this node");
@@ -138,14 +138,21 @@ void *block_job_create(const char *job_id, const 
BlockJobDriver *driver,
 }
 }
 
-if (!id_wellformed(job_id)) {
-error_setg(errp, "Invalid job ID '%s'", job_id);
-return NULL;
-}
+if (job_id) {
+if (flags & BLOCK_JOB_INTERNAL) {
+error_setg(errp, "Cannot specify job ID for internal block job");
+return NULL;
+}
 
-if (block_job_get(job_id)) {
-error_setg(errp, "Job ID '%s' already in use", job_id);
-return NULL;
+if (!id_wellformed(job_id)) {
+error_setg(errp, "Invalid job ID '%s'", job_id);
+return NULL;
+}
+
+if (block_job_get(job_id)) {
+error_setg(errp, "Job ID '%s' already in use", job_id);
+return NULL;
+}
 }
 
 blk = blk_new();
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 6ecfa2e..fdb31e0 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -200,6 +200,11 @@ struct BlockJob {
 QLIST_ENTRY(BlockJob) txn_list;
 };
 
+typedef enum BlockJobCreateFlags {
+BLOCK_JOB_DEFAULT = 0x00,
+BLOCK_JOB_INTERNAL = 0x01,
+} BlockJobCreateFlags;
+
 /**
  * block_job_next:
  * @job: A block job, or %NULL.
@@ -242,7 +247,7 @@ BlockJob *block_job_get(const char *id);
  * called from a wrapper that is specific to the job type.
  */
 void *block_job_create(const char *job_id, const BlockJobDriver *driver,
-   BlockDriverState *bs, int64_t speed,
+   BlockDriverState *bs, int64_t speed, int flags,
BlockCompletionFunc *cb, void *opaque, Error **errp);
 
 /**
diff --git a/tests/test-blockjob-txn.c b/tests/test-blockjob-txn.c
index d049cba..b79e0c6 100644
--- a/tests/test-blockjob-txn.c
+++ b/tests/test-blockjob-txn.c
@@ -98,7 +98,8 @@ static BlockJob *test_block_job_start(unsigned int

[Qemu-devel] [PATCH 7/7] blockjobs: fix documentation

2016-10-13 Thread John Snow

(Trivial)

Fix wrong function names in documentation.

Signed-off-by: John Snow 
---
 include/block/blockjob_int.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index 8eced19..10ebb38 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -191,8 +191,8 @@ void coroutine_fn block_job_pause_point(BlockJob *job);
 void block_job_enter(BlockJob *job);
 
 /**
- * block_job_ready:
- * @job: The job which is now ready to complete.
+ * block_job_event_ready:
+ * @job: The job which is now ready to be completed.
  *
  * Send a BLOCK_JOB_READY event for the specified job.
  */
-- 
2.7.4

[Qemu-devel] [PATCH v10 07/10] hbitmap: serialization

2016-10-13 Thread John Snow

From: Vladimir Sementsov-Ogievskiy 

Add functions to serialize and deserialize (restore) an HBitmap. The HBitmap
should be saved as a linear sequence of bits, independent of endianness and
of the size of the bitmap array elements (unsigned long). Little-endian is
therefore chosen.

These functions are suitable for dirty bitmap migration; the bitmap can be
restored in several steps. For performance, each step writes only the last
level of the bitmap. All other levels are restored by
hbitmap_deserialize_finish() as the final step of restoring, so the HBitmap
is inconsistent while it is being restored.
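
To make the endianness point concrete, here is a minimal sketch of writing
64-bit words out as little-endian bytes and reading them back. It is not the
implementation in this patch; demo_serialize_le64() and
demo_deserialize_le64() are hypothetical helpers that use plain shifts, so the
byte stream comes out the same on little-endian and big-endian hosts.

```c
#include <stddef.h>
#include <stdint.h>

/* Store n 64-bit words into buf (8 * n bytes), least significant byte first. */
static void demo_serialize_le64(const uint64_t *words, size_t n, uint8_t *buf)
{
    size_t i;
    unsigned b;

    for (i = 0; i < n; i++) {
        for (b = 0; b < 8; b++) {
            buf[i * 8 + b] = (uint8_t)(words[i] >> (8 * b));
        }
    }
}

/* Rebuild n 64-bit words from a little-endian byte stream. */
static void demo_deserialize_le64(const uint8_t *buf, size_t n, uint64_t *words)
{
    size_t i;
    unsigned b;

    for (i = 0; i < n; i++) {
        words[i] = 0;
        for (b = 0; b < 8; b++) {
            words[i] |= (uint64_t)buf[i * 8 + b] << (8 * b);
        }
    }
}
```

Because the byte order is fixed by the shifts rather than by the host's
memory layout, a buffer written on one host can be read on another with a
different word size or endianness.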

Signed-off-by: Vladimir Sementsov-Ogievskiy 
[Fix left shift operand to 1UL; add "finish" parameter. - Fam]
Signed-off-by: Fam Zheng 

Signed-off-by: John Snow 
---
 include/qemu/hbitmap.h |  79 
 util/hbitmap.c | 137 +
 2 files changed, 216 insertions(+)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 1725919..eb46475 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -146,6 +146,85 @@ void hbitmap_reset_all(HBitmap *hb);
 bool hbitmap_get(const HBitmap *hb, uint64_t item);
 
 /**
+ * hbitmap_serialization_granularity:
+ * @hb: HBitmap to operate on.
+ *
+ * Granularity of serialization chunks, used by other serialization functions.
+ * For every chunk:
+ * 1. Chunk start should be aligned to this granularity.
+ * 2. Chunk size should be aligned too, except for last chunk (for which
+ *  start + count == hb->size)
+ */
+uint64_t hbitmap_serialization_granularity(const HBitmap *hb);
+
+/**
+ * hbitmap_serialization_size:
+ * @hb: HBitmap to operate on.
+ * @start: Starting bit
+ * @count: Number of bits
+ *
+ * Return number of bytes hbitmap_(de)serialize_part needs
+ */
+uint64_t hbitmap_serialization_size(const HBitmap *hb,
+uint64_t start, uint64_t count);
+
+/**
+ * hbitmap_serialize_part
+ * @hb: HBitmap to operate on.
+ * @buf: Buffer to store serialized bitmap.
+ * @start: First bit to store.
+ * @count: Number of bits to store.
+ *
+ * Stores HBitmap data corresponding to given region. The format of saved data
+ * is linear sequence of bits, so it can be used by hbitmap_deserialize_part
+ * independently of endianness and size of HBitmap level array elements
+ */
+void hbitmap_serialize_part(const HBitmap *hb, uint8_t *buf,
+uint64_t start, uint64_t count);
+
+/**
+ * hbitmap_deserialize_part
+ * @hb: HBitmap to operate on.
+ * @buf: Buffer to restore bitmap data from.
+ * @start: First bit to restore.
+ * @count: Number of bits to restore.
+ * @finish: Whether to call hbitmap_deserialize_finish automatically.
+ *
+ * Restores HBitmap data corresponding to given region. The format is the same
+ * as for hbitmap_serialize_part.
+ *
+ * If @finish is false, caller must call hbitmap_deserialize_finish before
+ * using the bitmap.
+ */
+void hbitmap_deserialize_part(HBitmap *hb, uint8_t *buf,
+  uint64_t start, uint64_t count,
+  bool finish);
+
+/**
+ * hbitmap_deserialize_zeroes
+ * @hb: HBitmap to operate on.
+ * @start: First bit to restore.
+ * @count: Number of bits to restore.
+ * @finish: Whether to call hbitmap_deserialize_finish automatically.
+ *
+ * Fills the bitmap with zeroes.
+ *
+ * If @finish is false, caller must call hbitmap_deserialize_finish before
+ * using the bitmap.
+ */
+void hbitmap_deserialize_zeroes(HBitmap *hb, uint64_t start, uint64_t count,
+bool finish);
+
+/**
+ * hbitmap_deserialize_finish
+ * @hb: HBitmap to operate on.
+ *
+ * Repair HBitmap after calling hbitmap_deserialize_part or
+ * hbitmap_deserialize_zeroes. All HBitmap levels except the last are restored
+ * here.
+ */
+void hbitmap_deserialize_finish(HBitmap *hb);
+
+/**
  * hbitmap_free:
  * @hb: HBitmap to operate on.
  *
diff --git a/util/hbitmap.c b/util/hbitmap.c
index f303975..5d1a21c 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -397,6 +397,143 @@ bool hbitmap_get(const HBitmap *hb, uint64_t item)
 return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] & bit) != 0;
 }
 
+uint64_t hbitmap_serialization_granularity(const HBitmap *hb)
+{
+/* Require at least 64 bit granularity to be safe on both 64 bit and 32 bit
+ * hosts. */
+return 64 << hb->granularity;
+}
+
+/* Start should be aligned to serialization granularity, chunk size should be
+ * aligned to serialization granularity too, except for last chunk.
+ */
+static void serialization_chunk(const HBitmap *hb,
+uint64_t start, uint64_t count,
+unsigned long **first_el, uint64_t *el_count)
+{
+uint64_t last = start + count - 1;
+uint64_t gran = hbitmap_serialization_granularity(hb);
+
+assert((start & (gran - 1)) == 0);
+assert((last >> hb->granularity) <

[Qemu-devel] [PATCH v10 08/10] block: BdrvDirtyBitmap serialization interface

2016-10-13 Thread John Snow

From: Vladimir Sementsov-Ogievskiy 

Add several functions that give block-migration.c the access to
BdrvDirtyBitmap that it needs.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
[Add the "finish" parameters. - Fam]
Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
Reviewed-by: Max Reitz 

Signed-off-by: John Snow 
---
 block/dirty-bitmap.c | 37 +
 include/block/dirty-bitmap.h | 14 ++
 2 files changed, 51 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 31d5296..384146b 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -453,6 +453,43 @@ void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, 
HBitmap *in)
 hbitmap_free(tmp);
 }
 
+uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
+  uint64_t start, uint64_t count)
+{
+return hbitmap_serialization_size(bitmap->bitmap, start, count);
+}
+
+uint64_t bdrv_dirty_bitmap_serialization_align(const BdrvDirtyBitmap *bitmap)
+{
+return hbitmap_serialization_granularity(bitmap->bitmap);
+}
+
+void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
+  uint8_t *buf, uint64_t start,
+  uint64_t count)
+{
+hbitmap_serialize_part(bitmap->bitmap, buf, start, count);
+}
+
+void bdrv_dirty_bitmap_deserialize_part(BdrvDirtyBitmap *bitmap,
+uint8_t *buf, uint64_t start,
+uint64_t count, bool finish)
+{
+hbitmap_deserialize_part(bitmap->bitmap, buf, start, count, finish);
+}
+
+void bdrv_dirty_bitmap_deserialize_zeroes(BdrvDirtyBitmap *bitmap,
+  uint64_t start, uint64_t count,
+  bool finish)
+{
+hbitmap_deserialize_zeroes(bitmap->bitmap, start, count, finish);
+}
+
+void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap)
+{
+hbitmap_deserialize_finish(bitmap->bitmap);
+}
+
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 int64_t nr_sectors)
 {
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index c4e7858..efc2965 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -55,4 +55,18 @@ void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *hbi, int64_t 
sector_num);
 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
 
+uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
+  uint64_t start, uint64_t count);
+uint64_t bdrv_dirty_bitmap_serialization_align(const BdrvDirtyBitmap *bitmap);
+void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
+  uint8_t *buf, uint64_t start,
+  uint64_t count);
+void bdrv_dirty_bitmap_deserialize_part(BdrvDirtyBitmap *bitmap,
+uint8_t *buf, uint64_t start,
+uint64_t count, bool finish);
+void bdrv_dirty_bitmap_deserialize_zeroes(BdrvDirtyBitmap *bitmap,
+  uint64_t start, uint64_t count,
+  bool finish);
+void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap);
+
 #endif
-- 
2.7.4

[Qemu-devel] [PATCH v10 04/10] block: Support meta dirty bitmap

2016-10-13 Thread John Snow

From: Fam Zheng 

The added group of operations enables tracking of the changed bits in
the dirty bitmap.
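
The idea of a meta bitmap can be sketched with a toy structure: every time a
bit in the primary bitmap actually changes value, the meta bit covering its
chunk is marked. The sketch below is an assumption-laden illustration, not the
HBitmap code. DemoBitmap, DEMO_BITS, DEMO_CHUNK and the demo_* functions are
hypothetical, and the chunk size is given in bits rather than bytes.

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_BITS  64   /* size of the toy primary bitmap, in bits */
#define DEMO_CHUNK 16   /* primary bits covered by one meta bit */

typedef struct DemoBitmap {
    bool bits[DEMO_BITS];               /* primary dirty bits */
    bool meta[DEMO_BITS / DEMO_CHUNK];  /* "this chunk changed" bits */
} DemoBitmap;

/* Set or clear a primary bit; mark the meta bit only on a real change. */
static void demo_set(DemoBitmap *bm, unsigned i, bool value)
{
    if (bm->bits[i] != value) {
        bm->bits[i] = value;
        bm->meta[i / DEMO_CHUNK] = true;
    }
}

/* Has anything in [start, start + count) changed? count must be > 0. */
static bool demo_meta_get(const DemoBitmap *bm, unsigned start, unsigned count)
{
    unsigned chunk;
    unsigned first = start / DEMO_CHUNK;
    unsigned last = (start + count - 1) / DEMO_CHUNK;

    for (chunk = first; chunk <= last; chunk++) {
        if (bm->meta[chunk]) {
            return true;
        }
    }
    return false;
}

/* Forget all recorded changes, e.g. after the chunks have been sent. */
static void demo_meta_reset(DemoBitmap *bm)
{
    memset(bm->meta, 0, sizeof(bm->meta));
}
```

Note that setting a bit to the value it already has leaves the meta bitmap
untouched, which is what lets a consumer resend only chunks that really
changed.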

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
Signed-off-by: John Snow 
---
 block/dirty-bitmap.c | 52 
 include/block/dirty-bitmap.h |  9 
 2 files changed, 61 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index c572dfa..9c6febb 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -38,6 +38,7 @@
  */
 struct BdrvDirtyBitmap {
 HBitmap *bitmap;/* Dirty sector bitmap implementation */
+HBitmap *meta;  /* Meta dirty bitmap */
 BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
 char *name; /* Optional non-empty unique ID */
 int64_t size;   /* Size of the bitmap (Number of sectors) */
@@ -103,6 +104,56 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState 
*bs,
 return bitmap;
 }
 
+/* bdrv_create_meta_dirty_bitmap
+ *
+ * Create a meta dirty bitmap that tracks the changes of bits in @bitmap. I.e.
+ * when a dirty status bit in @bitmap is changed (either from reset to set or
+ * the other way around), its respective meta dirty bitmap bit will be marked
+ * dirty as well.
+ *
+ * @bitmap: the block dirty bitmap for which to create a meta dirty bitmap.
+ * @chunk_size: how many bytes of bitmap data does each bit in the meta bitmap
+ * track.
+ */
+void bdrv_create_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap,
+   int chunk_size)
+{
+assert(!bitmap->meta);
+bitmap->meta = hbitmap_create_meta(bitmap->bitmap,
+   chunk_size * BITS_PER_BYTE);
+}
+
+void bdrv_release_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+assert(bitmap->meta);
+hbitmap_free_meta(bitmap->bitmap);
+bitmap->meta = NULL;
+}
+
+int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
+   BdrvDirtyBitmap *bitmap, int64_t sector,
+   int nb_sectors)
+{
+uint64_t i;
+int sectors_per_bit = 1 << hbitmap_granularity(bitmap->meta);
+
+/* To optimize: we can make hbitmap to internally check the range in a
+ * coarse level, or at least do it word by word. */
+for (i = sector; i < sector + nb_sectors; i += sectors_per_bit) {
+if (hbitmap_get(bitmap->meta, i)) {
+return true;
+}
+}
+return false;
+}
+
+void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
+  BdrvDirtyBitmap *bitmap, int64_t sector,
+  int nb_sectors)
+{
+hbitmap_reset(bitmap->meta, sector, nb_sectors);
+}
+
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
 {
 return bitmap->successor;
@@ -233,6 +284,7 @@ static void 
bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
 if ((!bitmap || bm == bitmap) && (!only_named || bm->name)) {
 assert(!bm->active_iterators);
 assert(!bdrv_dirty_bitmap_frozen(bm));
+assert(!bm->meta);
 QLIST_REMOVE(bm, list);
 hbitmap_free(bm->bitmap);
 g_free(bm->name);
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 0ef927d..69c500b 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -8,6 +8,9 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
   uint32_t granularity,
   const char *name,
   Error **errp);
+void bdrv_create_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap,
+   int chunk_size);
+void bdrv_release_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
Error **errp);
@@ -36,6 +39,12 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int64_t nr_sectors);
 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
  int64_t cur_sector, int64_t nr_sectors);
+int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
+   BdrvDirtyBitmap *bitmap, int64_t sector,
+   int nb_sectors);
+void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
+  BdrvDirtyBitmap *bitmap, int64_t sector,
+  int nb_sectors);
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
  uint64_t first_sector);
 void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter);
-- 
2.7.4

[Qemu-devel] [PATCH v10 01/10] block: Hide HBitmap in block dirty bitmap interface

2016-10-13 Thread John Snow

From: Fam Zheng 

HBitmap is an implementation detail of block dirty bitmap that should be hidden
from users. Introduce a BdrvDirtyBitmapIter to encapsulate the underlying
HBitmapIter.

One small difference in the interface: previously, an HBitmapIter was
initialized in place; now the new BdrvDirtyBitmapIter must be dynamically
allocated, because the structure definition is private to block/dirty-bitmap.c.
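
The allocate/iterate/free pattern described above can be sketched as follows.
This is a simplified illustration only: DemoBitmap, DemoIter and the
demo_iter_* functions are hypothetical, the bitmap is stored as one byte per
bit, and plain malloc()/free() stand in for whatever allocator the real code
uses. The iterator counter is modelled on the active_iterators field that this
patch adds.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy bitmap: one byte per bit, nonzero means dirty. */
typedef struct DemoBitmap {
    const uint8_t *bits;
    int64_t size;
    int active_iterators;   /* how many iterators currently exist */
} DemoBitmap;

/* In a real header this would only be forward-declared (opaque to callers). */
typedef struct DemoIter {
    DemoBitmap *bitmap;
    int64_t pos;
} DemoIter;

/* Heap-allocate an iterator starting at bit 'first'; NULL on failure. */
static DemoIter *demo_iter_new(DemoBitmap *bm, int64_t first)
{
    DemoIter *it = malloc(sizeof(*it));

    if (it == NULL) {
        return NULL;
    }
    it->bitmap = bm;
    it->pos = first;
    bm->active_iterators++;
    return it;
}

/* Return the index of the next dirty bit, or -1 when there are none left. */
static int64_t demo_iter_next(DemoIter *it)
{
    while (it->pos < it->bitmap->size) {
        int64_t i = it->pos++;

        if (it->bitmap->bits[i]) {
            return i;
        }
    }
    return -1;
}

/* Release the iterator; accepts NULL like free(). */
static void demo_iter_free(DemoIter *it)
{
    if (it == NULL) {
        return;
    }
    it->bitmap->active_iterators--;
    free(it);
}
```

Since the iterator now lives on the heap, every early return in a caller
needs a matching free, which is why the backup.c hunk below replaces
"return ret" with "goto out".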

Two current users are converted too.

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
Signed-off-by: John Snow 
---
 block/backup.c   | 14 --
 block/dirty-bitmap.c | 39 +--
 block/mirror.c   | 24 +---
 include/block/dirty-bitmap.h |  7 +--
 include/qemu/typedefs.h  |  1 +
 5 files changed, 60 insertions(+), 25 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 582bd0f..02dbe48 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -372,14 +372,14 @@ static int coroutine_fn 
backup_run_incremental(BackupBlockJob *job)
 int64_t end;
 int64_t last_cluster = -1;
 int64_t sectors_per_cluster = cluster_size_sectors(job);
-HBitmapIter hbi;
+BdrvDirtyBitmapIter *dbi;
 
 granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
 clusters_per_iter = MAX((granularity / job->cluster_size), 1);
-bdrv_dirty_iter_init(job->sync_bitmap, &hbi);
+dbi = bdrv_dirty_iter_new(job->sync_bitmap, 0);
 
 /* Find the next dirty sector(s) */
-while ((sector = hbitmap_iter_next(&hbi)) != -1) {
+while ((sector = bdrv_dirty_iter_next(dbi)) != -1) {
 cluster = sector / sectors_per_cluster;
 
 /* Fake progress updates for any clusters we skipped */
@@ -391,7 +391,7 @@ static int coroutine_fn 
backup_run_incremental(BackupBlockJob *job)
 for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
 do {
 if (yield_and_check(job)) {
-return ret;
+goto out;
 }
 ret = backup_do_cow(job, cluster * sectors_per_cluster,
 sectors_per_cluster, &error_is_read,
@@ -399,7 +399,7 @@ static int coroutine_fn 
backup_run_incremental(BackupBlockJob *job)
 if ((ret < 0) &&
 backup_error_action(job, error_is_read, -ret) ==
 BLOCK_ERROR_ACTION_REPORT) {
-return ret;
+goto out;
 }
 } while (ret < 0);
 }
@@ -407,7 +407,7 @@ static int coroutine_fn 
backup_run_incremental(BackupBlockJob *job)
 /* If the bitmap granularity is smaller than the backup granularity,
  * we need to advance the iterator pointer to the next cluster. */
 if (granularity < job->cluster_size) {
-bdrv_set_dirty_iter(&hbi, cluster * sectors_per_cluster);
+bdrv_set_dirty_iter(dbi, cluster * sectors_per_cluster);
 }
 
 last_cluster = cluster - 1;
@@ -419,6 +419,8 @@ static int coroutine_fn 
backup_run_incremental(BackupBlockJob *job)
 job->common.offset += ((end - last_cluster - 1) * job->cluster_size);
 }
 
+out:
+bdrv_dirty_iter_free(dbi);
 return ret;
 }
 
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index f2bfdcf..c572dfa 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -42,9 +42,15 @@ struct BdrvDirtyBitmap {
 char *name; /* Optional non-empty unique ID */
 int64_t size;   /* Size of the bitmap (Number of sectors) */
 bool disabled;  /* Bitmap is read-only */
+int active_iterators;   /* How many iterators are active */
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
+struct BdrvDirtyBitmapIter {
+HBitmapIter hbi;
+BdrvDirtyBitmap *bitmap;
+};
+
 BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
 {
 BdrvDirtyBitmap *bm;
@@ -212,6 +218,7 @@ void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
 
 QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
 assert(!bdrv_dirty_bitmap_frozen(bitmap));
+assert(!bitmap->active_iterators);
 hbitmap_truncate(bitmap->bitmap, size);
 bitmap->size = size;
 }
@@ -224,6 +231,7 @@ static void 
bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
 BdrvDirtyBitmap *bm, *next;
 QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
 if ((!bitmap || bm == bitmap) && (!only_named || bm->name)) {
+assert(!bm->active_iterators);
 assert(!bdrv_dirty_bitmap_frozen(bm));
 QLIST_REMOVE(bm, list);
 hbitmap_free(bm->bitmap);
@@ -320,9 +328,29 @@ uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap 
*bitmap)
 return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }
 
-void bdrv_dirty_iter_init(BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)

[Qemu-devel] [PATCH v10 00/10] Dirty bitmap changes for migration/persistence work

2016-10-13 Thread John Snow

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/10:[----] [--] 'block: Hide HBitmap in block dirty bitmap interface'
002/10:[----] [--] 'HBitmap: Introduce "meta" bitmap to track bit changes'
003/10:[----] [--] 'tests: Add test code for meta bitmap'
004/10:[----] [--] 'block: Support meta dirty bitmap'
005/10:[----] [--] 'block: Add two dirty bitmap getters'
006/10:[----] [--] 'block: Assert that bdrv_release_dirty_bitmap succeeded'
007/10:[----] [--] 'hbitmap: serialization'
008/10:[----] [--] 'block: BdrvDirtyBitmap serialization interface'
009/10:[0005] [FC] 'tests: Add test code for hbitmap serialization'
010/10:[----] [--] 'block: More operations for meta dirty bitmap'

===
v10: Now with less bits
===

09: Fixed tests to work correctly on 32bit machines, Thanks Max.

===
v9: Wow, it's been a while.
===

07: Replaced size_t by uint64_t (Fixes 32bit build)
09: Fixed serialization tests for BE machines

===
v8: Hello, is it v8 you're looking for?
===

01: Rebase conflict over int64_t header for bdrv_reset_dirty_bitmap.
04: Revised sector math to make literally any sense.
08: Rebase conflict over int64_t header for bdrv_reset_dirty_bitmap.

===
v7:
===

02: Fix rebase mishap.
04: Slight loop adjustment.
09: Fix constant on 32bit machines.

===
v6: Rebase. Stole series from Fam.
===

02: Added documentation changes as suggested by Max.

===
v5: Rebase: first 5 patches from last revision are already merged.
===

Addressed Max's comments:

01: - "block.c" -> "block/dirty-bitmap.c" in commit message.
- "an BdrvDirtyBitmapIter" -> "a BdrvDirtyBitmapIter" in code comment.
- hbitmap_next => next_dirty as variable name.
- bdrv_dirty_iter_free()/bdrv_dirty_iter_new() pairs =>
  bdrv_set_dirty_iter.

02: Move the assert fix into 01.

04: Truncate the meta bitmap (done by hbitmap_truncate).

06: Add Max's r-b.

07: I left the memcpy vs cpu_to_le32/64w as is to pick up Max's r-b. That
could be improved on top if wanted.

10: Add Max's r-b.



For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch meta-bitmap
https://github.com/jnsnow/qemu/tree/meta-bitmap

This version is tagged meta-bitmap-v10:
https://github.com/jnsnow/qemu/releases/tag/meta-bitmap-v10

Fam Zheng (8):
  block: Hide HBitmap in block dirty bitmap interface
  HBitmap: Introduce "meta" bitmap to track bit changes
  tests: Add test code for meta bitmap
  block: Support meta dirty bitmap
  block: Add two dirty bitmap getters
  block: Assert that bdrv_release_dirty_bitmap succeeded
  tests: Add test code for hbitmap serialization
  block: More operations for meta dirty bitmap

Vladimir Sementsov-Ogievskiy (2):
  hbitmap: serialization
  block: BdrvDirtyBitmap serialization interface

 block/backup.c   |  14 ++-
 block/dirty-bitmap.c | 160 -
 block/mirror.c   |  24 ++--
 include/block/dirty-bitmap.h |  35 +-
 include/qemu/hbitmap.h   | 100 
 include/qemu/typedefs.h  |   1 +
 tests/test-hbitmap.c | 272 +++
 util/hbitmap.c   | 206 +---
 8 files changed, 772 insertions(+), 40 deletions(-)

-- 
2.7.4

[Qemu-devel] [PATCH v10 10/10] block: More operations for meta dirty bitmap

2016-10-13 Thread John Snow

From: Fam Zheng 

Callers can create an iterator of meta bitmap with
bdrv_dirty_meta_iter_new(), then use the bdrv_dirty_iter_* operations on
it. Meta iterators are also counted by bitmap->active_iterators.

Also add a couple of functions to retrieve granularity and count.

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
Signed-off-by: John Snow 
---
 block/dirty-bitmap.c | 19 +++
 include/block/dirty-bitmap.h |  3 +++
 2 files changed, 22 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 384146b..519737c 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -393,6 +393,11 @@ uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap 
*bitmap)
 return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }
 
+uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap)
+{
+return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->meta);
+}
+
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
  uint64_t first_sector)
 {
@@ -403,6 +408,15 @@ BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
 return iter;
 }
 
+BdrvDirtyBitmapIter *bdrv_dirty_meta_iter_new(BdrvDirtyBitmap *bitmap)
+{
+BdrvDirtyBitmapIter *iter = g_new(BdrvDirtyBitmapIter, 1);
+hbitmap_iter_init(&iter->hbi, bitmap->meta, 0);
+iter->bitmap = bitmap;
+bitmap->active_iterators++;
+return iter;
+}
+
 void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter)
 {
 if (!iter) {
@@ -514,3 +528,8 @@ int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
 {
 return hbitmap_count(bitmap->bitmap);
 }
+
+int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap)
+{
+return hbitmap_count(bitmap->meta);
+}
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index efc2965..9dea14b 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -30,6 +30,7 @@ void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
 uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
+uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
 const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
@@ -47,12 +48,14 @@ int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
 void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap, int64_t sector,
   int nb_sectors);
+BdrvDirtyBitmapIter *bdrv_dirty_meta_iter_new(BdrvDirtyBitmap *bitmap);
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
  uint64_t first_sector);
 void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter);
 int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter);
 void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *hbi, int64_t sector_num);
 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
+int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
 
 uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
-- 
2.7.4
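
Both granularity getters in this patch return a byte count computed as BDRV_SECTOR_SIZE shifted left by the HBitmap granularity. A minimal sketch of that arithmetic, assuming 512-byte sectors; the name granularity_bytes and the SECTOR_SIZE macro are hypothetical stand-ins and not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u  /* assumed value of BDRV_SECTOR_SIZE */

/* Bytes covered by one bitmap bit when each bit tracks
 * 2^log2_sectors sectors. */
static uint32_t granularity_bytes(int log2_sectors)
{
    /* Keep the shifted result within 32 bits. */
    assert(log2_sectors >= 0 && log2_sectors < 23);
    return SECTOR_SIZE << log2_sectors;
}
```

With a granularity of 0 each bit covers one sector; each increment doubles the number of bytes covered by one bit.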

[Qemu-devel] [PATCH v10 02/10] HBitmap: Introduce "meta" bitmap to track bit changes

2016-10-13 Thread John Snow

From: Fam Zheng 

Upon each bit toggle, the corresponding bit in the meta bitmap will be
set.

Signed-off-by: Fam Zheng 
[Amended text inline. --js]
Reviewed-by: Max Reitz 

Signed-off-by: John Snow 
---
 include/qemu/hbitmap.h | 21 +++
 util/hbitmap.c | 69 +++---
 2 files changed, 75 insertions(+), 15 deletions(-)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 8ab721e..1725919 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -178,6 +178,27 @@ void hbitmap_iter_init(HBitmapIter *hbi, const HBitmap *hb, uint64_t first);
  */
 unsigned long hbitmap_iter_skip_words(HBitmapIter *hbi);
 
+/* hbitmap_create_meta:
+ * Create a "meta" hbitmap to track dirtiness of the bits in this HBitmap.
+ * The caller owns the created bitmap and must call hbitmap_free_meta(hb) to
+ * free it.
+ *
+ * Currently, we only guarantee that if a bit in the hbitmap is changed it
+ * will be reflected in the meta bitmap, but we do not yet guarantee the
+ * opposite.
+ *
+ * @hb: The HBitmap to operate on.
+ * @chunk_size: How many bits in @hb does one bit in the meta track.
+ */
+HBitmap *hbitmap_create_meta(HBitmap *hb, int chunk_size);
+
+/* hbitmap_free_meta:
+ * Free the meta bitmap of @hb.
+ *
+ * @hb: The HBitmap whose meta bitmap should be freed.
+ */
+void hbitmap_free_meta(HBitmap *hb);
+
 /**
  * hbitmap_iter_next:
  * @hbi: HBitmapIter to operate on.
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 99fd2ba..f303975 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -78,6 +78,9 @@ struct HBitmap {
  */
 int granularity;
 
+/* A meta dirty bitmap to track the dirtiness of bits in this HBitmap. */
+HBitmap *meta;
+
 /* A number of progressively less coarse bitmaps (i.e. level 0 is the
  * coarsest).  Each bit in level N represents a word in level N+1 that
  * has a set bit, except the last level where each bit represents the
@@ -209,25 +212,27 @@ static uint64_t hb_count_between(HBitmap *hb, uint64_t start, uint64_t last)
 }
 
 /* Setting starts at the last layer and propagates up if an element
- * changes from zero to non-zero.
+ * changes.
  */
 static inline bool hb_set_elem(unsigned long *elem, uint64_t start, uint64_t last)
 {
 unsigned long mask;
-bool changed;
+unsigned long old;
 
 assert((last >> BITS_PER_LEVEL) == (start >> BITS_PER_LEVEL));
 assert(start <= last);
 
 mask = 2UL << (last & (BITS_PER_LONG - 1));
 mask -= 1UL << (start & (BITS_PER_LONG - 1));
-changed = (*elem == 0);
+old = *elem;
 *elem |= mask;
-return changed;
+return old != *elem;
 }
 
-/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)... */
-static void hb_set_between(HBitmap *hb, int level, uint64_t start, uint64_t last)
+/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)...
+ * Returns true if at least one bit is changed. */
+static bool hb_set_between(HBitmap *hb, int level, uint64_t start,
+   uint64_t last)
 {
 size_t pos = start >> BITS_PER_LEVEL;
 size_t lastpos = last >> BITS_PER_LEVEL;
@@ -256,23 +261,28 @@ static void hb_set_between(HBitmap *hb, int level, uint64_t start, uint64_t last
 if (level > 0 && changed) {
 hb_set_between(hb, level - 1, pos, lastpos);
 }
+return changed;
 }
 
 void hbitmap_set(HBitmap *hb, uint64_t start, uint64_t count)
 {
 /* Compute range in the last layer.  */
+uint64_t first, n;
 uint64_t last = start + count - 1;
 
 trace_hbitmap_set(hb, start, count,
   start >> hb->granularity, last >> hb->granularity);
 
-start >>= hb->granularity;
+first = start >> hb->granularity;
 last >>= hb->granularity;
-count = last - start + 1;
 assert(last < hb->size);
+n = last - first + 1;
 
-hb->count += count - hb_count_between(hb, start, last);
-hb_set_between(hb, HBITMAP_LEVELS - 1, start, last);
+hb->count += n - hb_count_between(hb, first, last);
+if (hb_set_between(hb, HBITMAP_LEVELS - 1, first, last) &&
+hb->meta) {
+hbitmap_set(hb->meta, start, count);
+}
 }
 
 /* Resetting works the other way round: propagate up if the new
@@ -293,8 +303,10 @@ static inline bool hb_reset_elem(unsigned long *elem, uint64_t start, uint64_t l
 return blanked;
 }
 
-/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)... */
-static void hb_reset_between(HBitmap *hb, int level, uint64_t start, uint64_t last)
+/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)...
+ * Returns true if at least one bit is changed. */
+static bool hb_reset_between(HBitmap *hb, int level, uint64_t start,
+ uint64_t last)
 {
 size_t pos = start >> BITS_PER_LEVEL;
 size_t lastpos = last >> BITS_PER_LEVEL;
@@ -337,22 +349,29 @@ static void
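
The change to hb_set_elem() replaces the "was the word zero" test with "did the word change", which is what the meta bitmap needs. A standalone sketch of the same mask arithmetic, assuming start and last are already bit positions inside one word; the name set_elem and the WORD_BITS macro are hypothetical and not taken from util/hbitmap.c:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Set bits start..last (inclusive) of *elem.
 * Returns true only if the word actually changed. */
static bool set_elem(unsigned long *elem, unsigned start, unsigned last)
{
    unsigned long mask, old;

    assert(start <= last && last < WORD_BITS);

    mask = 2UL << last;     /* wraps to 0 when last is the top bit */
    mask -= 1UL << start;   /* leaves ones in positions start..last */
    old = *elem;
    *elem |= mask;
    return old != *elem;
}
```

Setting bits in a word that already has other bits set reports a change here, whereas a "word was zero" test would not.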

[Qemu-devel] [PATCH v10 06/10] block: Assert that bdrv_release_dirty_bitmap succeeded

2016-10-13 Thread John Snow

From: Fam Zheng 

We use a loop over bs->dirty_bitmaps to make sure the caller is
only releasing a bitmap owned by bs. Let's also assert that in this case
the caller is releasing a bitmap that does exist.

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
Signed-off-by: John Snow 
---
 block/dirty-bitmap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 860acc9..31d5296 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -305,6 +305,9 @@ static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
 }
 }
 }
+if (bitmap) {
+abort();
+}
 }
 
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
-- 
2.7.4

[Qemu-devel] [PATCH v10 09/10] tests: Add test code for hbitmap serialization

2016-10-13 Thread John Snow

From: Fam Zheng 

Signed-off-by: Fam Zheng 
[Fixed minor constant issue. --js]
Signed-off-by: John Snow 
---
 tests/test-hbitmap.c | 156 +++
 1 file changed, 156 insertions(+)

diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
index e3abde1..9b7495c 100644
--- a/tests/test-hbitmap.c
+++ b/tests/test-hbitmap.c
@@ -11,6 +11,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/hbitmap.h"
+#include "qemu/bitmap.h"
 #include "block/block.h"
 
 #define LOG_BITS_PER_LONG  (BITS_PER_LONG == 32 ? 5 : 6)
@@ -737,6 +738,16 @@ static void test_hbitmap_meta_one(TestHBitmapData *data, const void *unused)
 }
 }
 
+static void test_hbitmap_serialize_granularity(TestHBitmapData *data,
+   const void *unused)
+{
+int r;
+
+hbitmap_test_init(data, L3 * 2, 3);
+r = hbitmap_serialization_granularity(data->hb);
+g_assert_cmpint(r, ==, 64 << 3);
+}
+
 static void test_hbitmap_meta_zero(TestHBitmapData *data, const void *unused)
 {
 hbitmap_test_init_meta(data, 0, 0, 1);
@@ -744,6 +755,142 @@ static void test_hbitmap_meta_zero(TestHBitmapData *data, const void *unused)
 hbitmap_check_meta(data, 0, 0);
 }
 
+static void hbitmap_test_serialize_range(TestHBitmapData *data,
+ uint8_t *buf, size_t buf_size,
+ uint64_t pos, uint64_t count)
+{
+size_t i;
+unsigned long *el = (unsigned long *)buf;
+
+assert(hbitmap_granularity(data->hb) == 0);
+hbitmap_reset_all(data->hb);
+memset(buf, 0, buf_size);
+if (count) {
+hbitmap_set(data->hb, pos, count);
+}
+hbitmap_serialize_part(data->hb, buf, 0, data->size);
+
+/* Serialized buffer is inherently LE, convert it back manually to test */
+for (i = 0; i < buf_size / sizeof(unsigned long); i++) {
+el[i] = (BITS_PER_LONG == 32 ? le32_to_cpu(el[i]) : le64_to_cpu(el[i]));
+}
+
+for (i = 0; i < data->size; i++) {
+int is_set = test_bit(i, (unsigned long *)buf);
+if (i >= pos && i < pos + count) {
+g_assert(is_set);
+} else {
+g_assert(!is_set);
+}
+}
+
+/* Re-serialize for deserialization testing */
+memset(buf, 0, buf_size);
+hbitmap_serialize_part(data->hb, buf, 0, data->size);
+hbitmap_reset_all(data->hb);
+hbitmap_deserialize_part(data->hb, buf, 0, data->size, true);
+
+for (i = 0; i < data->size; i++) {
+int is_set = hbitmap_get(data->hb, i);
+if (i >= pos && i < pos + count) {
+g_assert(is_set);
+} else {
+g_assert(!is_set);
+}
+}
+}
+
+static void test_hbitmap_serialize_basic(TestHBitmapData *data,
+ const void *unused)
+{
+int i, j;
+size_t buf_size;
+uint8_t *buf;
+uint64_t positions[] = { 0, 1, L1 - 1, L1, L2 - 1, L2, L2 + 1, L3 - 1 };
+int num_positions = sizeof(positions) / sizeof(positions[0]);
+
+hbitmap_test_init(data, L3, 0);
+buf_size = hbitmap_serialization_size(data->hb, 0, data->size);
+buf = g_malloc0(buf_size);
+
+for (i = 0; i < num_positions; i++) {
+for (j = 0; j < num_positions; j++) {
+hbitmap_test_serialize_range(data, buf, buf_size,
+ positions[i],
+ MIN(positions[j], L3 - positions[i]));
+}
+}
+
+g_free(buf);
+}
+
+static void test_hbitmap_serialize_part(TestHBitmapData *data,
+const void *unused)
+{
+int i, j, k;
+size_t buf_size;
+uint8_t *buf;
+uint64_t positions[] = { 0, 1, L1 - 1, L1, L2 - 1, L2, L2 + 1, L3 - 1 };
+int num_positions = sizeof(positions) / sizeof(positions[0]);
+
+hbitmap_test_init(data, L3, 0);
+buf_size = L2;
+buf = g_malloc0(buf_size);
+
+for (i = 0; i < num_positions; i++) {
+hbitmap_set(data->hb, positions[i], 1);
+}
+
+for (i = 0; i < data->size; i += buf_size) {
+unsigned long *el = (unsigned long *)buf;
+hbitmap_serialize_part(data->hb, buf, i, buf_size);
+for (j = 0; j < buf_size / sizeof(unsigned long); j++) {
+el[j] = (BITS_PER_LONG == 32 ? le32_to_cpu(el[j]) : le64_to_cpu(el[j]));
+}
+
+for (j = 0; j < buf_size; j++) {
+bool should_set = false;
+for (k = 0; k < num_positions; k++) {
+if (positions[k] == j + i) {
+should_set = true;
+break;
+}
+}
+g_assert_cmpint(should_set, ==, test_bit(j, (unsigned long *)buf));
+}
+}
+
+g_free(buf);
+}
+
+static void test_hbitmap_serialize_zeroes(TestHBitmapData *data,
+  const void *unused)
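
The tests above rely on the serialized buffer being little-endian regardless of host byte order. A minimal, host-independent sketch of that idea for 64-bit words; words_to_le and words_from_le are hypothetical helpers for illustration, not the hbitmap serialization functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store each 64-bit word as 8 bytes, least significant byte first. */
static void words_to_le(const uint64_t *words, size_t n, uint8_t *out)
{
    size_t i, b;

    assert(words && out);
    for (i = 0; i < n; i++) {
        for (b = 0; b < 8; b++) {
            out[i * 8 + b] = (uint8_t)(words[i] >> (8 * b));
        }
    }
}

/* Rebuild the words from the little-endian byte stream. */
static void words_from_le(const uint8_t *in, size_t n, uint64_t *words)
{
    size_t i, b;

    assert(in && words);
    for (i = 0; i < n; i++) {
        words[i] = 0;
        for (b = 0; b < 8; b++) {
            words[i] |= (uint64_t)in[i * 8 + b] << (8 * b);
        }
    }
}
```

Because the conversion shifts bytes explicitly, the byte stream is the same on big-endian and little-endian hosts.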

[Qemu-devel] [PATCH v10 05/10] block: Add two dirty bitmap getters

2016-10-13 Thread John Snow

From: Fam Zheng 

For dirty bitmap users to get the size and the name of a
BdrvDirtyBitmap.

Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
Reviewed-by: Max Reitz 
Signed-off-by: John Snow 
---
 block/dirty-bitmap.c | 10 ++
 include/block/dirty-bitmap.h |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 9c6febb..860acc9 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -154,6 +154,16 @@ void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
 hbitmap_reset(bitmap->meta, sector, nb_sectors);
 }
 
+int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap)
+{
+return bitmap->size;
+}
+
+const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap)
+{
+return bitmap->name;
+}
+
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
 {
 return bitmap->successor;
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 69c500b..c4e7858 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -32,6 +32,8 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
 uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
+const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
+int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap);
 DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap);
 int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t sector);
-- 
2.7.4

[Qemu-devel] [PATCH v10 03/10] tests: Add test code for meta bitmap

2016-10-13 Thread John Snow

From: Fam Zheng 

Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
Reviewed-by: Max Reitz 
Signed-off-by: John Snow 
---
 tests/test-hbitmap.c | 116 +++
 1 file changed, 116 insertions(+)

diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
index c0e9895..e3abde1 100644
--- a/tests/test-hbitmap.c
+++ b/tests/test-hbitmap.c
@@ -11,6 +11,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/hbitmap.h"
+#include "block/block.h"
 
 #define LOG_BITS_PER_LONG  (BITS_PER_LONG == 32 ? 5 : 6)
 
@@ -20,6 +21,7 @@
 
 typedef struct TestHBitmapData {
 HBitmap   *hb;
+HBitmap   *meta;
 unsigned long *bits;
 size_t size;
 size_t old_size;
@@ -91,6 +93,14 @@ static void hbitmap_test_init(TestHBitmapData *data,
 }
 }
 
+static void hbitmap_test_init_meta(TestHBitmapData *data,
+   uint64_t size, int granularity,
+   int meta_chunk)
+{
+hbitmap_test_init(data, size, granularity);
+data->meta = hbitmap_create_meta(data->hb, meta_chunk);
+}
+
 static inline size_t hbitmap_test_array_size(size_t bits)
 {
 size_t n = DIV_ROUND_UP(bits, BITS_PER_LONG);
@@ -133,6 +143,9 @@ static void hbitmap_test_teardown(TestHBitmapData *data,
   const void *unused)
 {
 if (data->hb) {
+if (data->meta) {
+hbitmap_free_meta(data->hb);
+}
 hbitmap_free(data->hb);
 data->hb = NULL;
 }
@@ -634,6 +647,103 @@ static void test_hbitmap_truncate_shrink_large(TestHBitmapData *data,
 hbitmap_test_truncate(data, size, -diff, 0);
 }
 
+static void hbitmap_check_meta(TestHBitmapData *data,
+   int64_t start, int count)
+{
+int64_t i;
+
+for (i = 0; i < data->size; i++) {
+if (i >= start && i < start + count) {
+g_assert(hbitmap_get(data->meta, i));
+} else {
+g_assert(!hbitmap_get(data->meta, i));
+}
+}
+}
+
+static void hbitmap_test_meta(TestHBitmapData *data,
+  int64_t start, int count,
+  int64_t check_start, int check_count)
+{
+hbitmap_reset_all(data->hb);
+hbitmap_reset_all(data->meta);
+
+/* Test "unset" -> "unset" will not update meta. */
+hbitmap_reset(data->hb, start, count);
+hbitmap_check_meta(data, 0, 0);
+
+/* Test "unset" -> "set" will update meta */
+hbitmap_set(data->hb, start, count);
+hbitmap_check_meta(data, check_start, check_count);
+
+/* Test "set" -> "set" will not update meta */
+hbitmap_reset_all(data->meta);
+hbitmap_set(data->hb, start, count);
+hbitmap_check_meta(data, 0, 0);
+
+/* Test "set" -> "unset" will update meta */
+hbitmap_reset_all(data->meta);
+hbitmap_reset(data->hb, start, count);
+hbitmap_check_meta(data, check_start, check_count);
+}
+
+static void hbitmap_test_meta_do(TestHBitmapData *data, int chunk_size)
+{
+uint64_t size = chunk_size * 100;
+hbitmap_test_init_meta(data, size, 0, chunk_size);
+
+hbitmap_test_meta(data, 0, 1, 0, chunk_size);
+hbitmap_test_meta(data, 0, chunk_size, 0, chunk_size);
+hbitmap_test_meta(data, chunk_size - 1, 1, 0, chunk_size);
+hbitmap_test_meta(data, chunk_size - 1, 2, 0, chunk_size * 2);
+hbitmap_test_meta(data, chunk_size - 1, chunk_size + 1, 0, chunk_size * 2);
+hbitmap_test_meta(data, chunk_size - 1, chunk_size + 2, 0, chunk_size * 3);
+hbitmap_test_meta(data, 7 * chunk_size - 1, chunk_size + 2,
+  6 * chunk_size, chunk_size * 3);
+hbitmap_test_meta(data, size - 1, 1, size - chunk_size, chunk_size);
+hbitmap_test_meta(data, 0, size, 0, size);
+}
+
+static void test_hbitmap_meta_byte(TestHBitmapData *data, const void *unused)
+{
+hbitmap_test_meta_do(data, BITS_PER_BYTE);
+}
+
+static void test_hbitmap_meta_word(TestHBitmapData *data, const void *unused)
+{
+hbitmap_test_meta_do(data, BITS_PER_LONG);
+}
+
+static void test_hbitmap_meta_sector(TestHBitmapData *data, const void *unused)
+{
+hbitmap_test_meta_do(data, BDRV_SECTOR_SIZE * BITS_PER_BYTE);
+}
+
+/**
+ * Create an HBitmap and test set/unset.
+ */
+static void test_hbitmap_meta_one(TestHBitmapData *data, const void *unused)
+{
+int i;
+int64_t offsets[] = {
+0, 1, L1 - 1, L1, L1 + 1, L2 - 1, L2, L2 + 1, L3 - 1, L3, L3 + 1
+};
+
+hbitmap_test_init_meta(data, L3 * 2, 0, 1);
+for (i = 0; i < ARRAY_SIZE(offsets); i++) {
+hbitmap_test_meta(data, offsets[i], 1, offsets[i], 1);
+hbitmap_test_meta(data, offsets[i], L1, offsets[i], L1);
+hbitmap_test_meta(data, offsets[i], L2, offsets[i], L2);
+}
+}
+
+static void test_hbitmap_meta_zero(TestHBitmapData *data, const void *unused)
+{
+hbitmap_test_init_meta(data, 0, 0, 1);
+
+
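
The four transitions exercised by hbitmap_test_meta() can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a flat array of bools instead of a multi-level HBitmap, a fixed size, and hypothetical names (ToyBitmap, toy_init, toy_update) that do not exist in QEMU:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NBITS 64   /* size of the toy tracked bitmap */

typedef struct ToyBitmap {
    bool bits[NBITS];   /* the tracked bitmap, one bool per bit */
    bool meta[NBITS];   /* meta[i] is set when chunk i has changed */
    size_t chunk;       /* tracked bits covered by one meta bit */
} ToyBitmap;

static void toy_init(ToyBitmap *b, size_t chunk)
{
    assert(chunk > 0);
    memset(b, 0, sizeof(*b));
    b->chunk = chunk;
}

/* Write value into bits [start, start + count); mark the meta chunks
 * covering the range only if at least one bit flipped. */
static void toy_update(ToyBitmap *b, size_t start, size_t count, bool value)
{
    bool changed = false;
    size_t i;

    assert(start + count <= NBITS);
    for (i = start; i < start + count; i++) {
        changed |= (b->bits[i] != value);
        b->bits[i] = value;
    }
    if (changed) {
        for (i = start / b->chunk; i <= (start + count - 1) / b->chunk; i++) {
            b->meta[i] = true;
        }
    }
}
```

As in the tests, writing a value that is already present leaves the meta bits untouched, while a range that straddles a chunk boundary marks both chunks.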

[Qemu-devel] [QEMU PATCH v6 2/2] migration: migrate QTAILQ

2016-10-13 Thread Jianjun Duan

Currently we cannot directly transfer a QTAILQ instance because of
limitations in the migration code. Here we introduce an approach to
transfer such structures. We created VMStateInfo vmstate_info_qtailq
for QTAILQ. A similar VMStateInfo can be created for other data structures
such as lists.

This approach will be used to transfer pending_events and ccs_list in spapr
state.

We also create some macros in qemu/queue.h to access a QTAILQ using pointer
arithmetic. This ensures that we do not depend on the implementation
details of QTAILQ in the migration code.

Signed-off-by: Jianjun Duan 
---
 include/migration/vmstate.h | 20 +++
 include/qemu/queue.h| 32 
 migration/trace-events  |  4 +++
 migration/vmstate.c | 59 +
 4 files changed, 115 insertions(+)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index d0e37b5..4dd0aed 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -251,6 +251,7 @@ extern const VMStateInfo vmstate_info_timer;
 extern const VMStateInfo vmstate_info_buffer;
 extern const VMStateInfo vmstate_info_unused_buffer;
 extern const VMStateInfo vmstate_info_bitmap;
+extern const VMStateInfo vmstate_info_qtailq;
 
 #define type_check_2darray(t1,t2,n,m) ((t1(*)[n][m])0 - (t2*)0)
 #define type_check_array(t1,t2,n) ((t1(*)[n])0 - (t2*)0)
@@ -662,6 +663,25 @@ extern const VMStateInfo vmstate_info_bitmap;
 .offset   = offsetof(_state, _field),\
 }
 
+/* For QTAILQ that need customized handling
+ * _type: type of QTAILQ element
+ * _next: name of QTAILQ entry field in QTAILQ element
+ * _vmsd: VMSD for QTAILQ element
+ * size: size of QTAILQ element
+ * start: offset of QTAILQ entry in QTAILQ element
+ */
+#define VMSTATE_QTAILQ_V(_field, _state, _version, _vmsd, _type, _next)  \
+{\
+.name = (stringify(_field)), \
+.version_id   = (_version),  \
+.vmsd = &(_vmsd),\
+.size = sizeof(_type),   \
+.info = &vmstate_info_qtailq,\
+.flags= VMS_LINKED,  \
+.offset   = offsetof(_state, _field),\
+.start= offsetof(_type, _next),  \
+}
+
 /* _f : field name
_f_n : num of elements field_name
_n : num of elements
diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index 342073f..d672ae0 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -438,4 +438,36 @@ struct { \
 #define QTAILQ_PREV(elm, headname, field) \
 (*(((struct headname *)((elm)->field.tqe_prev))->tqh_last))
 
+/*
+ * Offsets of layout of a tail queue head.
+ */
+#define QTAILQ_FIRST_OFFSET 0
+#define QTAILQ_LAST_OFFSET (sizeof(void *))
+
+/*
+ * Offsets of layout of a tail queue element.
+ */
+#define QTAILQ_NEXT_OFFSET 0
+#define QTAILQ_PREV_OFFSET (sizeof(void *))
+
+/*
+ * Tail queue traversal using pointer arithmetic.
+ */
+#define QTAILQ_RAW_FOREACH(elm, head, entry) \
+for ((elm) = *((void **) ((char *) (head) + QTAILQ_FIRST_OFFSET)); \
+ (elm); \
+ (elm) = \
+ *((void **) ((char *) (elm) + (entry) + QTAILQ_NEXT_OFFSET)))
+/*
+ * Tail queue insertion using pointer arithmetic.
+ */
+#define QTAILQ_RAW_INSERT_TAIL(head, elm, entry) do { \
+*((void **) ((char *) (elm) + (entry) + QTAILQ_NEXT_OFFSET)) = NULL; \
+*((void **) ((char *) (elm) + (entry) + QTAILQ_PREV_OFFSET)) = \
+*((void **) ((char *) (head) + QTAILQ_LAST_OFFSET)); \
+**((void ***)((char *) (head) + QTAILQ_LAST_OFFSET)) = (elm); \
+*((void **) ((char *) (head) + QTAILQ_LAST_OFFSET)) = \
+(void *) ((char *) (elm) + (entry) + QTAILQ_NEXT_OFFSET); \
+} while (/*CONSTCOND*/0)
+
 #endif /* QEMU_SYS_QUEUE_H */
diff --git a/migration/trace-events b/migration/trace-events
index dfee75a..9a6ec59 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -52,6 +52,10 @@ vmstate_n_elems(const char *name, int n_elems) "%s: %d"
 vmstate_subsection_load(const char *parent) "%s"
 vmstate_subsection_load_bad(const char *parent,  const char *sub, const char *sub2) "%s: %s/%s"
 vmstate_subsection_load_good(const char *parent) "%s"
+get_qtailq(const char *name, int version_id) "%s v%d"
+get_qtailq_end(const char *name,
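
The raw macros above walk and extend a tail queue knowing only the byte offset of the entry field inside each element. A self-contained sketch of the same pointer arithmetic follows. It assumes the head and entry are each laid out as two consecutive pointers, as the offsets in the patch imply; raw_head, raw_entry, Item and the raw_* functions are hypothetical stand-ins, not the QTAILQ definitions from qemu/queue.h:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for a tail queue head and entry: two consecutive pointers. */
struct raw_head  { void *first; void *last; };
struct raw_entry { void *next;  void *prev; };

#define RAW_FIRST_OFFSET 0
#define RAW_LAST_OFFSET  (sizeof(void *))
#define RAW_NEXT_OFFSET  0
#define RAW_PREV_OFFSET  (sizeof(void *))

typedef struct Item {
    int value;
    struct raw_entry link;
} Item;

static void raw_init(struct raw_head *head)
{
    head->first = NULL;
    head->last = &head->first;   /* "last" points at the slot to fill next */
}

/* Append elm, given only the offset of its entry field. */
static void raw_insert_tail(void *head, void *elm, size_t entry)
{
    char *h, *e;
    void **lastp;

    assert(head && elm);
    h = (char *)head;
    e = (char *)elm + entry;
    *(void **)(e + RAW_NEXT_OFFSET) = NULL;
    *(void **)(e + RAW_PREV_OFFSET) = *(void **)(h + RAW_LAST_OFFSET);
    lastp = *(void **)(h + RAW_LAST_OFFSET);
    *lastp = elm;                               /* link from previous tail */
    *(void **)(h + RAW_LAST_OFFSET) = e + RAW_NEXT_OFFSET;
}

/* Walk the queue through the raw "next" pointers and sum the values. */
static int raw_sum(void *head, size_t entry)
{
    int sum = 0;
    void *elm = *(void **)((char *)head + RAW_FIRST_OFFSET);

    while (elm) {
        sum += ((Item *)elm)->value;
        elm = *(void **)((char *)elm + entry + RAW_NEXT_OFFSET);
    }
    return sum;
}
```

The caller passes offsetof(Item, link) as the entry offset, which corresponds to the .start value that VMSTATE_QTAILQ_V computes with offsetof(_type, _next).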

[Qemu-devel] [QEMU PATCH v6 0/2] migration: migrate QTAILQ

2016-10-13 Thread Jianjun Duan

v6: - Split from Power specific patches. 
- Dropped VMS_LINKED flag.
- Rebased to master.
- Added comments to clarify about put/get in VMStateInfo.  

Previous versions are:

v5: - Rebased to David's ppc-for-2.8. 
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg00270.html)

v4: - Introduce a way to set customized instance_id in SaveStateEntry. Use it
  to set instance_id for DRC using its unique index to address David 
  Gibson's concern.
- Rename VMS_CSTM to VMS_LINKED based on Paolo Bonzini's suggestions.
- Clean up qjson stuff in put_qtailq. 
- Add trace for put_qtailq and get_qtailq based on David Gilbert's 
  suggestion.
- Based on David's ppc-for-2.7. 
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg07720.html)

v3: - Simplify overall design following discussion with Paolo. No longer need
  metadata to migrate QTAILQ.
- Extend VMStateInfo instead of adding similar fields to VMStateField.
- Clean up macros in qemu/queue.h.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg05695.html)

v2: - Introduce a general approach to migrate QTAILQ in qemu/queue.h.
- Migrate signalled field in the DRC state.
- Put the newly added migrating fields in subsections so that backward 
  migration is not broken.  
- Set detach_cb field right after migration so that a migrated hot-unplug
  event could finish its course.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg04188.html)

v1: - Initial version.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-04/msg02601.html)


Jianjun Duan (2):
  migration: extend VMStateInfo
  migration: migrate QTAILQ

 hw/display/virtio-gpu.c |   6 +-
 hw/net/vmxnet3.c|  18 +++--
 hw/nvram/eeprom93xx.c   |   6 +-
 hw/nvram/fw_cfg.c   |   6 +-
 hw/pci/msix.c   |   6 +-
 hw/pci/pci.c|  12 ++--
 hw/pci/shpc.c   |   5 +-
 hw/scsi/scsi-bus.c  |   6 +-
 hw/timer/twl92230.c |   6 +-
 hw/usb/redirect.c   |  18 +++--
 hw/virtio/virtio-pci.c  |   6 +-
 hw/virtio/virtio.c  |  12 ++--
 include/migration/vmstate.h |  35 --
 include/qemu/queue.h|  32 +
 migration/savevm.c  |   5 +-
 migration/trace-events  |   4 ++
 migration/vmstate.c | 163 ++--
 target-alpha/machine.c  |   5 +-
 target-arm/machine.c|  12 ++--
 target-i386/machine.c   |  21 --
 target-mips/machine.c   |  10 +--
 target-ppc/machine.c|  10 +--
 target-sparc/machine.c  |   5 +-
 23 files changed, 307 insertions(+), 102 deletions(-)

-- 
1.9.1

[Qemu-devel] [QEMU PATCH v6 1/2] migration: extend VMStateInfo

2016-10-13 Thread Jianjun Duan

The current migration code cannot handle some data structures, such as
QTAILQ in qemu/queue.h. Here we extend the signatures of put/get
in VMStateInfo so that customized handling is supported.

Signed-off-by: Jianjun Duan 
---
 hw/display/virtio-gpu.c |   6 ++-
 hw/net/vmxnet3.c|  18 +---
 hw/nvram/eeprom93xx.c   |   6 ++-
 hw/nvram/fw_cfg.c   |   6 ++-
 hw/pci/msix.c   |   6 ++-
 hw/pci/pci.c|  12 +++--
 hw/pci/shpc.c   |   5 ++-
 hw/scsi/scsi-bus.c  |   6 ++-
 hw/timer/twl92230.c |   6 ++-
 hw/usb/redirect.c   |  18 +---
 hw/virtio/virtio-pci.c  |   6 ++-
 hw/virtio/virtio.c  |  12 +++--
 include/migration/vmstate.h |  15 +--
 migration/savevm.c  |   5 ++-
 migration/vmstate.c | 104 
 target-alpha/machine.c  |   5 ++-
 target-arm/machine.c|  12 +++--
 target-i386/machine.c   |  21 ++---
 target-mips/machine.c   |  10 +++--
 target-ppc/machine.c|  10 +++--
 target-sparc/machine.c  |   5 ++-
 21 files changed, 192 insertions(+), 102 deletions(-)

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index fa6fd0e..2a21150 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -987,7 +987,8 @@ static const VMStateDescription vmstate_virtio_gpu_scanouts = {
 },
 };
 
-static void virtio_gpu_save(QEMUFile *f, void *opaque, size_t size)
+static void virtio_gpu_save(QEMUFile *f, void *opaque, size_t size,
+VMStateField *field, QJSON *vmdesc)
 {
 VirtIOGPU *g = opaque;
 struct virtio_gpu_simple_resource *res;
@@ -1014,7 +1015,8 @@ static void virtio_gpu_save(QEMUFile *f, void *opaque, size_t size)
 vmstate_save_state(f, &vmstate_virtio_gpu_scanouts, g, NULL);
 }
 
-static int virtio_gpu_load(QEMUFile *f, void *opaque, size_t size)
+static int virtio_gpu_load(QEMUFile *f, void *opaque, size_t size,
+   VMStateField *field)
 {
 VirtIOGPU *g = opaque;
 struct virtio_gpu_simple_resource *res;
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 90f6943..943a960 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2450,7 +2450,8 @@ static void vmxnet3_put_tx_stats_to_file(QEMUFile *f,
 qemu_put_be64(f, tx_stat->pktsTxDiscard);
 }
 
-static int vmxnet3_get_txq_descr(QEMUFile *f, void *pv, size_t size)
+static int vmxnet3_get_txq_descr(QEMUFile *f, void *pv, size_t size,
+VMStateField *field)
 {
 Vmxnet3TxqDescr *r = pv;
 
@@ -2464,7 +2465,8 @@ static int vmxnet3_get_txq_descr(QEMUFile *f, void *pv, size_t size)
 return 0;
 }
 
-static void vmxnet3_put_txq_descr(QEMUFile *f, void *pv, size_t size)
+static void vmxnet3_put_txq_descr(QEMUFile *f, void *pv, size_t size,
+VMStateField *field, QJSON *vmdesc)
 {
 Vmxnet3TxqDescr *r = pv;
 
@@ -2511,7 +2513,8 @@ static void vmxnet3_put_rx_stats_to_file(QEMUFile *f,
 qemu_put_be64(f, rx_stat->pktsRxError);
 }
 
-static int vmxnet3_get_rxq_descr(QEMUFile *f, void *pv, size_t size)
+static int vmxnet3_get_rxq_descr(QEMUFile *f, void *pv, size_t size,
+VMStateField *field)
 {
 Vmxnet3RxqDescr *r = pv;
 int i;
@@ -2529,7 +2532,8 @@ static int vmxnet3_get_rxq_descr(QEMUFile *f, void *pv, size_t size)
 return 0;
 }
 
-static void vmxnet3_put_rxq_descr(QEMUFile *f, void *pv, size_t size)
+static void vmxnet3_put_rxq_descr(QEMUFile *f, void *pv, size_t size,
+VMStateField *field, QJSON *vmdesc)
 {
 Vmxnet3RxqDescr *r = pv;
 int i;
@@ -2574,7 +2578,8 @@ static const VMStateInfo rxq_descr_info = {
 .put = vmxnet3_put_rxq_descr
 };
 
-static int vmxnet3_get_int_state(QEMUFile *f, void *pv, size_t size)
+static int vmxnet3_get_int_state(QEMUFile *f, void *pv, size_t size,
+VMStateField *field)
 {
 Vmxnet3IntState *r = pv;
 
@@ -2585,7 +2590,8 @@ static int vmxnet3_get_int_state(QEMUFile *f, void *pv, size_t size)
 return 0;
 }
 
-static void vmxnet3_put_int_state(QEMUFile *f, void *pv, size_t size)
+static void vmxnet3_put_int_state(QEMUFile *f, void *pv, size_t size,
+VMStateField *field, QJSON *vmdesc)
 {
 Vmxnet3IntState *r = pv;
 
diff --git a/hw/nvram/eeprom93xx.c b/hw/nvram/eeprom93xx.c
index 2c16fc2..76d5f41 100644
--- a/hw/nvram/eeprom93xx.c
+++ b/hw/nvram/eeprom93xx.c
@@ -94,14 +94,16 @@ struct _eeprom_t {
This is a Big hack, but it is how the old state did it.
  */
 
-static int get_uint16_from_uint8(QEMUFile *f, void *pv, size_t size)
+static int get_uint16_from_uint8(QEMUFile *f, void *pv, size_t size,
+ VMStateField *field)
 {
 uint16_t *v = pv;
 *v = qemu_get_ubyte(f);
 return 0;
 }
 
-static void put_unused(QEMUFile *f, void *pv, size_t size)
+static void put_unused(QEMUFile *f, void *pv, size_t size, VMStateField *field,
+   QJSON *vmdesc)
 {
 fprintf(stderr, "uint16_from_uint8 is used only
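
The extended callback shape can be sketched with stand-in types. Only the parameter order is taken from the diff above (get gains a field argument, put gains field and vmdesc). Everything else is a hypothetical placeholder: File, Field, Json and Info stand in for QEMUFile, VMStateField, QJSON and VMStateInfo and are not QEMU's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct File  { uint8_t buf[16]; size_t pos; } File;  /* stand-in for QEMUFile */
typedef struct Field { const char *name; } Field;            /* stand-in for VMStateField */
typedef struct Json  { int unused; } Json;                   /* stand-in for QJSON */

/* Stand-in for VMStateInfo with the extended get/put signatures. */
typedef struct Info {
    const char *name;
    int  (*get)(File *f, void *pv, size_t size, Field *field);
    void (*put)(File *f, void *pv, size_t size, Field *field, Json *vmdesc);
} Info;

/* A simple handler ignores the extra arguments it does not need. */
static void put_u8(File *f, void *pv, size_t size, Field *field, Json *vmdesc)
{
    (void)size;
    (void)field;
    (void)vmdesc;
    assert(f->pos < sizeof(f->buf));
    f->buf[f->pos++] = *(uint8_t *)pv;
}

static int get_u8(File *f, void *pv, size_t size, Field *field)
{
    (void)size;
    (void)field;
    assert(f->pos < sizeof(f->buf));
    *(uint8_t *)pv = f->buf[f->pos++];
    return 0;
}

static const Info info_u8 = { "u8", get_u8, put_u8 };
```

Existing handlers such as the ones touched in this patch only need their signatures widened, while a handler for a linked structure could use the field argument to find the element size and entry offset.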

[Qemu-devel] [PATCH 2/4] pc: Register TYPE_PC_MACHINE properties as class properties

2016-10-13 Thread Eduardo Habkost

Signed-off-by: Eduardo Habkost 
---
 hw/i386/pc.c | 56 ++--
 1 file changed, 26 insertions(+), 30 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 93ff49c..f4b0cda 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2134,41 +2134,11 @@ static void pc_machine_initfn(Object *obj)
 {
 PCMachineState *pcms = PC_MACHINE(obj);
 
-object_property_add(obj, PC_MACHINE_MEMHP_REGION_SIZE, "int",
-pc_machine_get_hotplug_memory_region_size,
-NULL, NULL, NULL, &error_abort);
-
 pcms->max_ram_below_4g = 0; /* use default */
-object_property_add(obj, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
-pc_machine_get_max_ram_below_4g,
-pc_machine_set_max_ram_below_4g,
-NULL, NULL, &error_abort);
-object_property_set_description(obj, PC_MACHINE_MAX_RAM_BELOW_4G,
-"Maximum ram below the 4G boundary (32bit boundary)",
-&error_abort);
-
 pcms->smm = ON_OFF_AUTO_AUTO;
-object_property_add(obj, PC_MACHINE_SMM, "OnOffAuto",
-pc_machine_get_smm,
-pc_machine_set_smm,
-NULL, NULL, &error_abort);
-object_property_set_description(obj, PC_MACHINE_SMM,
-"Enable SMM (pc & q35)",
-&error_abort);
-
 pcms->vmport = ON_OFF_AUTO_AUTO;
-object_property_add(obj, PC_MACHINE_VMPORT, "OnOffAuto",
-pc_machine_get_vmport,
-pc_machine_set_vmport,
-NULL, NULL, &error_abort);
-object_property_set_description(obj, PC_MACHINE_VMPORT,
-"Enable vmport (pc & q35)",
-&error_abort);
-
 /* nvdimm is disabled on default. */
 pcms->acpi_nvdimm_state.is_enabled = false;
-object_property_add_bool(obj, PC_MACHINE_NVDIMM, pc_machine_get_nvdimm,
- pc_machine_set_nvdimm, &error_abort);
 }
 
 static void pc_machine_reset(void)
@@ -2303,6 +2273,32 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
 hc->unplug_request = pc_machine_device_unplug_request_cb;
 hc->unplug = pc_machine_device_unplug_cb;
 nc->nmi_monitor_handler = x86_nmi;
+
+object_class_property_add(oc, PC_MACHINE_MEMHP_REGION_SIZE, "int",
+pc_machine_get_hotplug_memory_region_size, NULL,
+NULL, NULL, &error_abort);
+
+object_class_property_add(oc, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
+pc_machine_get_max_ram_below_4g, pc_machine_set_max_ram_below_4g,
+NULL, NULL, &error_abort);
+
+object_class_property_set_description(oc, PC_MACHINE_MAX_RAM_BELOW_4G,
+"Maximum ram below the 4G boundary (32bit boundary)", &error_abort);
+
+object_class_property_add(oc, PC_MACHINE_SMM, "OnOffAuto",
+pc_machine_get_smm, pc_machine_set_smm,
+NULL, NULL, &error_abort);
+object_class_property_set_description(oc, PC_MACHINE_SMM,
+"Enable SMM (pc & q35)", &error_abort);
+
+object_class_property_add(oc, PC_MACHINE_VMPORT, "OnOffAuto",
+pc_machine_get_vmport, pc_machine_set_vmport,
+NULL, NULL, &error_abort);
+object_class_property_set_description(oc, PC_MACHINE_VMPORT,
+"Enable vmport (pc & q35)", &error_abort);
+
+object_class_property_add_bool(oc, PC_MACHINE_NVDIMM,
+pc_machine_get_nvdimm, pc_machine_set_nvdimm, _abort);
 }
 
 static const TypeInfo pc_machine_info = {
-- 
2.7.4

[Qemu-devel] [PATCH 3/4] hostmem: Register TYPE_MEMORY_BACKEND properties as class properties

2016-10-13 Thread Eduardo Habkost

The NULL errp arguments on the property registration calls were
changed to &error_abort.

Signed-off-by: Eduardo Habkost 
---
 backends/hostmem.c | 42 ++
 1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index b7a208d..4256d24 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -241,26 +241,6 @@ static void host_memory_backend_init(Object *obj)
 backend->merge = machine_mem_merge(machine);
 backend->dump = machine_dump_guest_core(machine);
 backend->prealloc = mem_prealloc;
-
-object_property_add_bool(obj, "merge",
-host_memory_backend_get_merge,
-host_memory_backend_set_merge, NULL);
-object_property_add_bool(obj, "dump",
-host_memory_backend_get_dump,
-host_memory_backend_set_dump, NULL);
-object_property_add_bool(obj, "prealloc",
-host_memory_backend_get_prealloc,
-host_memory_backend_set_prealloc, NULL);
-object_property_add(obj, "size", "int",
-host_memory_backend_get_size,
-host_memory_backend_set_size, NULL, NULL, NULL);
-object_property_add(obj, "host-nodes", "int",
-host_memory_backend_get_host_nodes,
-host_memory_backend_set_host_nodes, NULL, NULL, NULL);
-object_property_add_enum(obj, "policy", "HostMemPolicy",
- HostMemPolicy_lookup,
- host_memory_backend_get_policy,
- host_memory_backend_set_policy, NULL);
 }
 
 MemoryRegion *
@@ -375,6 +355,28 @@ host_memory_backend_class_init(ObjectClass *oc, void *data)
 
 ucc->complete = host_memory_backend_memory_complete;
 ucc->can_be_deleted = host_memory_backend_can_be_deleted;
+
+object_class_property_add_bool(oc, "merge",
+host_memory_backend_get_merge,
+host_memory_backend_set_merge, &error_abort);
+object_class_property_add_bool(oc, "dump",
+host_memory_backend_get_dump,
+host_memory_backend_set_dump, &error_abort);
+object_class_property_add_bool(oc, "prealloc",
+host_memory_backend_get_prealloc,
+host_memory_backend_set_prealloc, &error_abort);
+object_class_property_add(oc, "size", "int",
+host_memory_backend_get_size,
+host_memory_backend_set_size,
+NULL, NULL, &error_abort);
+object_class_property_add(oc, "host-nodes", "int",
+host_memory_backend_get_host_nodes,
+host_memory_backend_set_host_nodes,
+NULL, NULL, &error_abort);
+object_class_property_add_enum(oc, "policy", "HostMemPolicy",
+HostMemPolicy_lookup,
+host_memory_backend_get_policy,
+host_memory_backend_set_policy, &error_abort);
 }
 
 static const TypeInfo host_memory_backend_info = {
-- 
2.7.4

[Qemu-devel] [PATCH 4/4] hostmem-file: Register TYPE_MEMORY_BACKEND_FILE properties as class properties

2016-10-13 Thread Eduardo Habkost

To do the conversion, the file_backend_class_init() was moved
after the getter/setter functions. The old
file_backend_instance_init() function was removed because it is
not needed anymore.

The NULL errp arguments on the property registration calls were
changed to &error_abort.

Signed-off-by: Eduardo Habkost 
---
 backends/hostmem-file.c | 26 +++---
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 5c4b808..42efb2f 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -64,14 +64,6 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
 #endif
 }
 
-static void
-file_backend_class_init(ObjectClass *oc, void *data)
-{
-HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
-
-bc->alloc = file_backend_memory_alloc;
-}
-
 static char *get_mem_path(Object *o, Error **errp)
 {
 HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
@@ -112,13 +104,18 @@ static void file_memory_backend_set_share(Object *o, bool value, Error **errp)
 }
 
 static void
-file_backend_instance_init(Object *o)
+file_backend_class_init(ObjectClass *oc, void *data)
 {
-object_property_add_bool(o, "share",
-file_memory_backend_get_share,
-file_memory_backend_set_share, NULL);
-object_property_add_str(o, "mem-path", get_mem_path,
-set_mem_path, NULL);
+HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
+
+bc->alloc = file_backend_memory_alloc;
+
+object_class_property_add_bool(oc, "share",
+file_memory_backend_get_share, file_memory_backend_set_share,
+&error_abort);
+object_class_property_add_str(oc, "mem-path",
+get_mem_path, set_mem_path,
+&error_abort);
 }
 
 static void file_backend_instance_finalize(Object *o)
@@ -132,7 +129,6 @@ static const TypeInfo file_backend_info = {
 .name = TYPE_MEMORY_BACKEND_FILE,
 .parent = TYPE_MEMORY_BACKEND,
 .class_init = file_backend_class_init,
-.instance_init = file_backend_instance_init,
 .instance_finalize = file_backend_instance_finalize,
 .instance_size = sizeof(HostMemoryBackendFile),
 };
-- 
2.7.4

[Qemu-devel] [PATCH 0/4] machine, hostmem, pc: Register properties as class properties

2016-10-13 Thread Eduardo Habkost

This series changes the existing machine, pc, and hostmem code to
register their QOM properties as class properties on class_init
instead of instance properties on instance_init.
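
For readers unfamiliar with the distinction, here is a toy model of it.
This is only a sketch: every type and helper below (ToyClass, ToyObject,
toy_*) is hypothetical and is not the QOM API. The assumption it
illustrates is that a class property is registered once and is visible on
the class before any object exists, while an instance property only
exists on the object it was added to.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical types, NOT the QEMU QOM API. */
#define TOY_MAX_PROPS 8

typedef struct ToyClass {
    const char *props[TOY_MAX_PROPS];   /* registered once per class */
    size_t nprops;
} ToyClass;

typedef struct ToyObject {
    const ToyClass *klass;
    const char *props[TOY_MAX_PROPS];   /* registered per instance */
    size_t nprops;
} ToyObject;

/* Linear search of a property name list. */
static bool toy_list_has(const char *const *props, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/* Analogue of registering in class_init: done once, shared by all objects. */
static void toy_class_property_add(ToyClass *k, const char *name)
{
    if (k->nprops < TOY_MAX_PROPS) {
        k->props[k->nprops++] = name;
    }
}

/* Analogue of registering in instance_init: repeated for every object. */
static void toy_object_property_add(ToyObject *o, const char *name)
{
    if (o->nprops < TOY_MAX_PROPS) {
        o->props[o->nprops++] = name;
    }
}

static bool toy_class_has_property(const ToyClass *k, const char *name)
{
    return toy_list_has(k->props, k->nprops, name);
}

/* An object sees its own properties plus those of its class. */
static bool toy_object_has_property(const ToyObject *o, const char *name)
{
    return toy_list_has(o->props, o->nprops, name) ||
           (o->klass && toy_class_has_property(o->klass, name));
}
```

In this model a property added with toy_class_property_add() can be
queried without creating any ToyObject, which is not possible for one
added with toy_object_property_add().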

Eduardo Habkost (4):
  machine: Register TYPE_MACHINE properties as class properties
  pc: Register TYPE_PC_MACHINE properties as class properties
  hostmem: Register TYPE_MEMORY_BACKEND properties as class properties
  hostmem-file: Register TYPE_MEMORY_BACKEND_FILE properties as class
properties

 backends/hostmem-file.c |  26 +++---
 backends/hostmem.c  |  42 +-
 hw/core/machine.c   | 206 +++-
 hw/i386/pc.c|  56 ++---
 4 files changed, 157 insertions(+), 173 deletions(-)

-- 
2.7.4

[Qemu-devel] [PATCH 1/4] machine: Register TYPE_MACHINE properties as class properties

2016-10-13 Thread Eduardo Habkost

When doing the conversion, the NULL errp arguments on the
property registration calls were changed to &error_abort.

Signed-off-by: Eduardo Habkost 
---
 hw/core/machine.c | 206 ++
 1 file changed, 98 insertions(+), 108 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index afd84ac..b0fd91f 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -364,6 +364,104 @@ static void machine_class_init(ObjectClass *oc, void *data)
 /* Default 128 MB as guest ram size */
 mc->default_ram_size = 128 * M_BYTE;
 mc->rom_file_has_mr = true;
+
+object_class_property_add_str(oc, "accel",
+machine_get_accel, machine_set_accel, &error_abort);
+object_class_property_set_description(oc, "accel",
+"Accelerator list", &error_abort);
+
+object_class_property_add(oc, "kernel-irqchip", "OnOffSplit",
+NULL, machine_set_kernel_irqchip,
+NULL, NULL, &error_abort);
+object_class_property_set_description(oc, "kernel-irqchip",
+"Configure KVM in-kernel irqchip", &error_abort);
+
+object_class_property_add(oc, "kvm-shadow-mem", "int",
+machine_get_kvm_shadow_mem, machine_set_kvm_shadow_mem,
+NULL, NULL, &error_abort);
+object_class_property_set_description(oc, "kvm-shadow-mem",
+"KVM shadow MMU size", &error_abort);
+
+object_class_property_add_str(oc, "kernel",
+machine_get_kernel, machine_set_kernel, &error_abort);
+object_class_property_set_description(oc, "kernel",
+"Linux kernel image file", &error_abort);
+
+object_class_property_add_str(oc, "initrd",
+machine_get_initrd, machine_set_initrd, &error_abort);
+object_class_property_set_description(oc, "initrd",
+"Linux initial ramdisk file", &error_abort);
+
+object_class_property_add_str(oc, "append",
+machine_get_append, machine_set_append, &error_abort);
+object_class_property_set_description(oc, "append",
+"Linux kernel command line", &error_abort);
+
+object_class_property_add_str(oc, "dtb",
+machine_get_dtb, machine_set_dtb, &error_abort);
+object_class_property_set_description(oc, "dtb",
+"Linux kernel device tree file", &error_abort);
+
+object_class_property_add_str(oc, "dumpdtb",
+machine_get_dumpdtb, machine_set_dumpdtb, &error_abort);
+object_class_property_set_description(oc, "dumpdtb",
+"Dump current dtb to a file and quit", &error_abort);
+
+object_class_property_add(oc, "phandle-start", "int",
+machine_get_phandle_start, machine_set_phandle_start,
+NULL, NULL, &error_abort);
+object_class_property_set_description(oc, "phandle-start",
+"The first phandle ID we may generate dynamically", &error_abort);
+
+object_class_property_add_str(oc, "dt-compatible",
+machine_get_dt_compatible, machine_set_dt_compatible, &error_abort);
+object_class_property_set_description(oc, "dt-compatible",
+"Overrides the \"compatible\" property of the dt root node",
+&error_abort);
+
+object_class_property_add_bool(oc, "dump-guest-core",
+machine_get_dump_guest_core, machine_set_dump_guest_core, &error_abort);
+object_class_property_set_description(oc, "dump-guest-core",
+"Include guest memory in  a core dump", &error_abort);
+
+object_class_property_add_bool(oc, "mem-merge",
+machine_get_mem_merge, machine_set_mem_merge, &error_abort);
+object_class_property_set_description(oc, "mem-merge",
+"Enable/disable memory merge support", &error_abort);
+
+object_class_property_add_bool(oc, "usb",
+machine_get_usb, machine_set_usb, &error_abort);
+object_class_property_set_description(oc, "usb",
+"Set on/off to enable/disable usb", &error_abort);
+
+object_class_property_add_bool(oc, "graphics",
+machine_get_graphics, machine_set_graphics, &error_abort);
+object_class_property_set_description(oc, "graphics",
+"Set on/off to enable/disable graphics emulation", &error_abort);
+
+object_class_property_add_bool(oc, "igd-passthru",
+machine_get_igd_gfx_passthru, machine_set_igd_gfx_passthru,
+&error_abort);
+object_class_property_set_description(oc, "igd-passthru",
+"Set on/off to enable/disable igd passthrou", &error_abort);
+
+object_class_property_add_str(oc, "firmware",
+machine_get_firmware, machine_set_firmware,
+&error_abort);
+object_class_property_set_description(oc, "firmware",
+"Firmware image", &error_abort);
+
+object_class_property_add_bool(oc, "suppress-vmdesc",
+machine_get_suppress_vmdesc, machine_set_suppress_vmdesc,
+&error_abort);
+object_class_property_set_description(oc, "suppress-vmdesc",
+"Set on to disable self-describing migration", &error_abort);
+
+object_class_property_add_bool(oc, "enforce-config-section",
+machine_get_enforce_config_section, machine_set_enforce_config_section,
+&error_abort);
+object_class_property_set_description(oc, "enforce-config-section",
+"Set on to enforce

[Qemu-devel] [PATCH v6 15/15] nbd: Implement NBD_CMD_WRITE_ZEROES on client

2016-10-13 Thread Eric Blake

Upstream NBD protocol recently added the ability to efficiently
write zeroes without having to send the zeroes over the wire,
along with a flag to control whether the client wants a hole.

The generic block code takes care of falling back to the obvious
write of lots of zeroes if we return -ENOTSUP because the server
does not have WRITE_ZEROES.

Ideally, since NBD_CMD_WRITE_ZEROES does not involve any data
over the wire, we want to support transactions that are much
larger than the normal 32M limit imposed on NBD_CMD_WRITE.  But
the server may still have a limit smaller than UINT_MAX, so
until experimental NBD protocol additions for advertising various
command sizes is finalized (see [1], [2]), for now we just stick to
the same limits as normal writes.

[1] https://github.com/yoe/nbd/blob/extension-info/doc/proto.md
[2] https://sourceforge.net/p/nbd/mailman/message/35081223/
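
As a rough illustration of the flag handling described above, the sketch
below maps block-layer request flags onto NBD command flags. It is only a
sketch under stated assumptions: the SKETCH_* names are hypothetical, the
request-flag values are arbitrary stand-ins rather than the real
BDRV_REQ_* values, and the server/command bit positions are my reading
of the NBD protocol document. Unlike the patch, which asserts that the
server advertised FUA, the sketch returns -EINVAL in that case.

```c
#include <errno.h>
#include <stdint.h>

/* Server transmission flags (bit positions assumed from the NBD spec). */
enum {
    SKETCH_SERVER_SEND_FUA          = 1 << 3,
    SKETCH_SERVER_SEND_WRITE_ZEROES = 1 << 6,
};

/* Block-layer request flags: arbitrary stand-ins for BDRV_REQ_*. */
enum {
    SKETCH_REQ_MAY_UNMAP = 1 << 0,
    SKETCH_REQ_FUA       = 1 << 1,
};

/* NBD per-command flags (bit positions assumed from the NBD spec). */
enum {
    SKETCH_CMD_FLAG_FUA     = 1 << 0,
    SKETCH_CMD_FLAG_NO_HOLE = 1 << 1,
};

/*
 * Compute the command flags for a write-zeroes request.
 * Returns 0 on success, -ENOTSUP if the server did not advertise
 * WRITE_ZEROES (the caller is then expected to fall back to writing
 * explicit zeroes), or -EINVAL if FUA is requested but unsupported.
 */
static int sketch_write_zeroes_flags(uint16_t server_flags,
                                     unsigned req_flags,
                                     uint16_t *cmd_flags)
{
    if (!(server_flags & SKETCH_SERVER_SEND_WRITE_ZEROES)) {
        return -ENOTSUP;
    }

    *cmd_flags = 0;
    if (req_flags & SKETCH_REQ_FUA) {
        if (!(server_flags & SKETCH_SERVER_SEND_FUA)) {
            return -EINVAL;
        }
        *cmd_flags |= SKETCH_CMD_FLAG_FUA;
    }
    /* Without permission to unmap, ask the server not to punch a hole. */
    if (!(req_flags & SKETCH_REQ_MAY_UNMAP)) {
        *cmd_flags |= SKETCH_CMD_FLAG_NO_HOLE;
    }
    return 0;
}
```

Note the inverted sense: the block layer grants permission to unmap,
whereas the wire flag forbids creating a hole, so NO_HOLE is set exactly
when MAY_UNMAP is absent.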

Signed-off-by: Eric Blake 

---
v6: rebase
v5: enhance commit message
v4: rebase to byte-based limits
v3: rebase, tell block layer about our support
---
 block/nbd-client.h |  2 ++
 block/nbd-client.c | 35 +++
 block/nbd.c|  4 
 3 files changed, 41 insertions(+)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index 78e8e57..e51df22 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -48,6 +48,8 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count);
 int nbd_client_co_flush(BlockDriverState *bs);
 int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
   uint64_t bytes, QEMUIOVector *qiov, int flags);
+int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
+int count, BdrvRequestFlags flags);
 int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
  uint64_t bytes, QEMUIOVector *qiov, int flags);

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 8e89add..31db557 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -275,6 +275,41 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
 return -reply.error;
 }

+int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
+int count, BdrvRequestFlags flags)
+{
+ssize_t ret;
+NBDClientSession *client = nbd_get_client_session(bs);
+NBDRequest request = {
+.type = NBD_CMD_WRITE_ZEROES,
+.from = offset,
+.len = count,
+};
+NBDReply reply;
+
+if (!(client->nbdflags & NBD_FLAG_SEND_WRITE_ZEROES)) {
+return -ENOTSUP;
+}
+
+if (flags & BDRV_REQ_FUA) {
+assert(client->nbdflags & NBD_FLAG_SEND_FUA);
+request.flags |= NBD_CMD_FLAG_FUA;
+}
+if (!(flags & BDRV_REQ_MAY_UNMAP)) {
+request.flags |= NBD_CMD_FLAG_NO_HOLE;
+}
+
+nbd_coroutine_start(client, &request);
+ret = nbd_co_send_request(bs, &request, NULL);
+if (ret < 0) {
+reply.error = -ret;
+} else {
+nbd_co_receive_reply(client, &request, &reply, NULL);
+}
+nbd_coroutine_end(client, &request);
+return -reply.error;
+}
+
 int nbd_client_co_flush(BlockDriverState *bs)
 {
 NBDClientSession *client = nbd_get_client_session(bs);
diff --git a/block/nbd.c b/block/nbd.c
index e227490..6c7bbc8 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -403,6 +403,7 @@ static int nbd_co_flush(BlockDriverState *bs)
 static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
 {
 bs->bl.max_pdiscard = NBD_MAX_BUFFER_SIZE;
+bs->bl.max_pwrite_zeroes = NBD_MAX_BUFFER_SIZE;
 bs->bl.max_transfer = NBD_MAX_BUFFER_SIZE;
 }

@@ -491,6 +492,7 @@ static BlockDriver bdrv_nbd = {
 .bdrv_file_open = nbd_open,
 .bdrv_co_preadv = nbd_client_co_preadv,
 .bdrv_co_pwritev= nbd_client_co_pwritev,
+.bdrv_co_pwrite_zeroes  = nbd_client_co_pwrite_zeroes,
 .bdrv_close = nbd_close,
 .bdrv_co_flush_to_os= nbd_co_flush,
 .bdrv_co_pdiscard   = nbd_client_co_pdiscard,
@@ -509,6 +511,7 @@ static BlockDriver bdrv_nbd_tcp = {
 .bdrv_file_open = nbd_open,
 .bdrv_co_preadv = nbd_client_co_preadv,
 .bdrv_co_pwritev= nbd_client_co_pwritev,
+.bdrv_co_pwrite_zeroes  = nbd_client_co_pwrite_zeroes,
 .bdrv_close = nbd_close,
 .bdrv_co_flush_to_os= nbd_co_flush,
 .bdrv_co_pdiscard   = nbd_client_co_pdiscard,
@@ -527,6 +530,7 @@ static BlockDriver bdrv_nbd_unix = {
 .bdrv_file_open = nbd_open,
 .bdrv_co_preadv = nbd_client_co_preadv,
 .bdrv_co_pwritev= nbd_client_co_pwritev,
+.bdrv_co_pwrite_zeroes  = nbd_client_co_pwrite_zeroes,
 .bdrv_close = nbd_close,
 .bdrv_co_flush_to_os= nbd_co_flush,
 .bdrv_co_pdiscard   = nbd_client_co_pdiscard,
-- 
2.7.4

[Qemu-devel] [PATCH v6 13/15] nbd: Improve server handling of shutdown requests

2016-10-13 Thread Eric Blake

NBD commit 6d34500b clarified how clients and servers are supposed
to behave before closing a connection. It added NBD_REP_ERR_SHUTDOWN
(for the server to announce it is about to go away during option
haggling, so the client should quit sending NBD_OPT_* other than
NBD_OPT_ABORT) and ESHUTDOWN (for the server to announce it is about
to go away during transmission, so the client should quit sending
NBD_CMD_* other than NBD_CMD_DISC).  It also clarified that
NBD_OPT_ABORT gets a reply, while NBD_CMD_DISC does not.

This patch merely adds the missing reply to NBD_OPT_ABORT and teaches
the client to recognize server errors.  Actually teaching the server
to send NBD_REP_ERR_SHUTDOWN or ESHUTDOWN would require knowing that
the server has been requested to shut down soon (maybe we could do
that by installing a SIGINT handler in qemu-nbd, which transitions
from RUNNING to a new state that waits for the client to react,
rather than just out-right quitting - but that's a bigger task for
another day).
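
For reference, the option-reply error encoding that this patch factors
into NBD_REP_ERR() can be sketched as follows. The macro body mirrors the
one added to include/block/nbd.h below; the SKETCH_* names and the two
helper functions are hypothetical additions for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Error replies have bit 31 set; the low bits carry the error number. */
#define SKETCH_REP_ERR(value)   ((UINT32_C(1) << 31) | (value))
#define SKETCH_REP_ACK          (1)
#define SKETCH_REP_ERR_SHUTDOWN SKETCH_REP_ERR(7)

/* True if an option reply type denotes an error. */
static bool sketch_reply_is_error(uint32_t type)
{
    return (type & (UINT32_C(1) << 31)) != 0;
}

/* Strip the error bit to recover the small error number. */
static uint32_t sketch_reply_error_code(uint32_t type)
{
    return type & ~(UINT32_C(1) << 31);
}
```

With this encoding NBD_REP_ERR_SHUTDOWN is 0x80000007 on the wire, and a
client can tell an error reply from NBD_REP_ACK by testing the top bit.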

Signed-off-by: Eric Blake 

---
v6: rebase
v5: no change
v4: new patch
---
 include/block/nbd.h | 13 +
 nbd/nbd-internal.h  |  1 +
 nbd/client.c| 16 
 nbd/server.c| 10 ++
 4 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index d326308..eea7ef0 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -83,12 +83,17 @@ typedef struct NBDReply NBDReply;
 #define NBD_FLAG_C_NO_ZEROES  (1 << 1) /* End handshake without zeroes. */

 /* Reply types. */
+#define NBD_REP_ERR(value) ((UINT32_C(1) << 31) | (value))
+
 #define NBD_REP_ACK (1) /* Data sending finished. */
 #define NBD_REP_SERVER  (2) /* Export description. */
-#define NBD_REP_ERR_UNSUP   ((UINT32_C(1) << 31) | 1) /* Unknown option. */
-#define NBD_REP_ERR_POLICY  ((UINT32_C(1) << 31) | 2) /* Server denied */
-#define NBD_REP_ERR_INVALID ((UINT32_C(1) << 31) | 3) /* Invalid length. */
-#define NBD_REP_ERR_TLS_REQD((UINT32_C(1) << 31) | 5) /* TLS required */
+
+#define NBD_REP_ERR_UNSUP   NBD_REP_ERR(1)  /* Unknown option */
+#define NBD_REP_ERR_POLICY  NBD_REP_ERR(2)  /* Server denied */
+#define NBD_REP_ERR_INVALID NBD_REP_ERR(3)  /* Invalid length */
+#define NBD_REP_ERR_PLATFORMNBD_REP_ERR(4)  /* Not compiled in */
+#define NBD_REP_ERR_TLS_REQDNBD_REP_ERR(5)  /* TLS required */
+#define NBD_REP_ERR_SHUTDOWNNBD_REP_ERR(7)  /* Server shutting down */

 /* Request flags, sent from client to server during transmission phase */
 #define NBD_CMD_FLAG_FUA(1 << 0)
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index dd57e18..eee20ab 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -92,6 +92,7 @@
 #define NBD_ENOMEM 12
 #define NBD_EINVAL 22
 #define NBD_ENOSPC 28
+#define NBD_ESHUTDOWN  108

 static inline ssize_t read_sync(QIOChannel *ioc, void *buffer, size_t size)
 {
diff --git a/nbd/client.c b/nbd/client.c
index f5e4c74..b33ae46 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -34,6 +34,8 @@ static int nbd_errno_to_system_errno(int err)
 return ENOMEM;
 case NBD_ENOSPC:
 return ENOSPC;
+case NBD_ESHUTDOWN:
+return ESHUTDOWN;
 default:
 TRACE("Squashing unexpected error %d to EINVAL", err);
 /* fallthrough */
@@ -231,11 +233,21 @@ static int nbd_handle_reply_err(QIOChannel *ioc, nbd_opt_reply *reply,
reply->option);
 break;

+case NBD_REP_ERR_PLATFORM:
+error_setg(errp, "Server lacks support for option %" PRIx32,
+   reply->option);
+break;
+
 case NBD_REP_ERR_TLS_REQD:
 error_setg(errp, "TLS negotiation required before option %" PRIx32,
reply->option);
 break;

+case NBD_REP_ERR_SHUTDOWN:
+error_setg(errp, "Server shutting down before option %" PRIx32,
+   reply->option);
+break;
+
 default:
 error_setg(errp, "Unknown error code when asking for option %" PRIx32,
reply->option);
@@ -784,6 +796,10 @@ ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply *reply)
 LOG("invalid magic (got 0x%" PRIx32 ")", magic);
 return -EINVAL;
 }
+if (reply->error == ESHUTDOWN) {
+LOG("server shutting down");
+return -EINVAL;
+}
 return 0;
 }

diff --git a/nbd/server.c b/nbd/server.c
index 20f1086..a7aa2ba 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -39,6 +39,8 @@ static int system_errno_to_nbd_errno(int err)
 case EFBIG:
 case ENOSPC:
 return NBD_ENOSPC;
+case ESHUTDOWN:
+return NBD_ESHUTDOWN;
 case EINVAL:
 default:
 return NBD_EINVAL;
@@ -526,6 +528,10 @@ static int nbd_negotiate_options(NBDClient *client)
 if (ret < 0) {
 return ret;
 }
+/* Let the

[Qemu-devel] [PATCH v6 12/15] nbd: Support shorter handshake

2016-10-13 Thread Eric Blake

The NBD Protocol allows the server and client to mutually agree
on a shorter handshake (omit the 124 bytes of reserved 0), via
the server advertising NBD_FLAG_NO_ZEROES and the client
acknowledging with NBD_FLAG_C_NO_ZEROES (only possible in
newstyle, whether or not it is fixed newstyle).  It doesn't
shave much off the wire, but we might as well implement it.
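
A minimal sketch of the agreement rule, assuming the flag bit positions
shown in the header hunk below. The SKETCH_* names and both helpers are
hypothetical; they only model when the 124 reserved bytes are present at
the end of the handshake.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FLAG_NO_ZEROES    (1 << 1)  /* server global flag */
#define SKETCH_FLAG_C_NO_ZEROES  (1 << 1)  /* client flag */
#define SKETCH_RESERVED_LEN      124

/* The client only acknowledges what the server advertised. */
static uint32_t sketch_client_flags(uint16_t globalflags)
{
    uint32_t clientflags = 0;

    if (globalflags & SKETCH_FLAG_NO_ZEROES) {
        clientflags |= SKETCH_FLAG_C_NO_ZEROES;
    }
    return clientflags;
}

/* Number of reserved zero bytes that end the handshake. */
static size_t sketch_reserved_bytes(uint16_t globalflags, uint32_t clientflags)
{
    bool agreed = (globalflags & SKETCH_FLAG_NO_ZEROES) &&
                  (clientflags & SKETCH_FLAG_C_NO_ZEROES);

    return agreed ? 0 : SKETCH_RESERVED_LEN;
}
```

Both sides have to opt in: if either flag is missing, the full 124 bytes
are still sent and must still be consumed.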

Signed-off-by: Eric Blake 
Reviewed-by: Alex Bligh 

---
v6: rebase
v5: no change
v4: rebase
v3: rebase
---
 include/block/nbd.h |  6 --
 nbd/client.c|  8 +++-
 nbd/server.c| 15 +++
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index b69bf1d..d326308 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -74,11 +74,13 @@ typedef struct NBDReply NBDReply;

 /* New-style handshake (global) flags, sent from server to client, and
control what will happen during handshake phase. */
-#define NBD_FLAG_FIXED_NEWSTYLE (1 << 0)/* Fixed newstyle protocol. */
+#define NBD_FLAG_FIXED_NEWSTYLE   (1 << 0) /* Fixed newstyle protocol. */
+#define NBD_FLAG_NO_ZEROES(1 << 1) /* End handshake without zeroes. */

 /* New-style client flags, sent from client to server to control what happens
during handshake phase. */
-#define NBD_FLAG_C_FIXED_NEWSTYLE   (1 << 0)/* Fixed newstyle protocol. */
+#define NBD_FLAG_C_FIXED_NEWSTYLE (1 << 0) /* Fixed newstyle protocol. */
+#define NBD_FLAG_C_NO_ZEROES  (1 << 1) /* End handshake without zeroes. */

 /* Reply types. */
 #define NBD_REP_ACK (1) /* Data sending finished. */
diff --git a/nbd/client.c b/nbd/client.c
index f6468db..f5e4c74 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -439,6 +439,7 @@ int nbd_receive_negotiate(QIOChannel *ioc, const char *name, uint16_t *flags,
 char buf[256];
 uint64_t magic, s;
 int rc;
+bool zeroes = true;

 TRACE("Receiving negotiation tlscreds=%p hostname=%s.",
   tlscreds, hostname ? hostname : "");
@@ -503,6 +504,11 @@ int nbd_receive_negotiate(QIOChannel *ioc, const char *name, uint16_t *flags,
 TRACE("Server supports fixed new style");
 clientflags |= NBD_FLAG_C_FIXED_NEWSTYLE;
 }
+if (globalflags & NBD_FLAG_NO_ZEROES) {
+zeroes = false;
+TRACE("Server supports no zeroes");
+clientflags |= NBD_FLAG_C_NO_ZEROES;
+}
 /* client requested flags */
 clientflags = cpu_to_be32(clientflags);
 if (write_sync(ioc, &clientflags, sizeof(clientflags)) !=
@@ -590,7 +596,7 @@ int nbd_receive_negotiate(QIOChannel *ioc, const char *name, uint16_t *flags,
 }

 TRACE("Size is %" PRIu64 ", export flags %" PRIx16, *size, *flags);
-if (drop_sync(ioc, 124) != 124) {
+if (zeroes && drop_sync(ioc, 124) != 124) {
 error_setg(errp, "Failed to read reserved block");
 goto fail;
 }
diff --git a/nbd/server.c b/nbd/server.c
index 3d39292..20f1086 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -81,6 +81,7 @@ struct NBDClient {
 int refcount;
 void (*close)(NBDClient *client);

+bool no_zeroes;
 NBDExport *exp;
 QCryptoTLSCreds *tlscreds;
 char *tlsaclname;
@@ -449,6 +450,11 @@ static int nbd_negotiate_options(NBDClient *client)
 fixedNewstyle = true;
 flags &= ~NBD_FLAG_C_FIXED_NEWSTYLE;
 }
+if (flags & NBD_FLAG_C_NO_ZEROES) {
+TRACE("Client supports no zeroes at handshake end");
+client->no_zeroes = true;
+flags &= ~NBD_FLAG_C_NO_ZEROES;
+}
 if (flags != 0) {
 TRACE("Unknown client flags 0x%" PRIx32 " received", flags);
 return -EIO;
@@ -601,6 +607,7 @@ static coroutine_fn int nbd_negotiate(NBDClientNewData *data)
 const uint16_t myflags = (NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_TRIM |
   NBD_FLAG_SEND_FLUSH | NBD_FLAG_SEND_FUA);
 bool oldStyle;
+size_t len;

 /* Old style negotiation header without options
 [ 0 ..   7]   passwd   ("NBDMAGIC")
@@ -617,7 +624,7 @@ static coroutine_fn int nbd_negotiate(NBDClientNewData *data)
 options sent
 [18 ..  25]   size
 [26 ..  27]   export flags
-[28 .. 151]   reserved (0)
+[28 .. 151]   reserved (0, omit if no_zeroes)
  */

 qio_channel_set_blocking(client->ioc, false, NULL);
@@ -636,7 +643,7 @@ static coroutine_fn int nbd_negotiate(NBDClientNewData *data)
 stw_be_p(buf + 26, client->exp->nbdflags | myflags);
 } else {
 stq_be_p(buf + 8, NBD_OPTS_MAGIC);
-stw_be_p(buf + 16, NBD_FLAG_FIXED_NEWSTYLE);
+stw_be_p(buf + 16, NBD_FLAG_FIXED_NEWSTYLE | NBD_FLAG_NO_ZEROES);
 }

 if (oldStyle) {
@@ -663,8 +670,8 @@ static coroutine_fn int nbd_negotiate(NBDClientNewData *data)
   client->exp->size, client->exp->nbdflags | myflags);
 stq_be_p(buf + 18,

Re: [Qemu-devel] [PATCH v3 0/4] target-arm: Handle tagged addresses when loading PC

2016-10-13 Thread Peter Maydell

On 13 October 2016 at 20:09, Tom Hanson  wrote:
> Looking at arm_cpu_do_interrupt_aarch64() and the ARM spec, the
> new PC value is always an offset from the appropriate VBAR. The
> only place I can find the VBAR being set is at boot time
> (i.e. UEFI).

Any guest system software can set the VBAR any time it likes.
In practice it gets set once at bootup and then left that way
because there's no good reason to move it around.

> Can the boot code use a tagged pointer to specify the VBAR?

Yes, exactly, you can have a tagged pointer in the VBAR.
The point is that the spec says that when the value is read
out of the VBAR the tag bits must be handled appropriately:
check the pseudocode AArch64.TakeException(), which calls
BranchTo(VBAR[] + vect_offset, ...)
and BranchTo() handles the tag bits (in the same way as
any other 'branch to arbitrary new PC value').
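
To make "handles the tag bits" concrete, here is a simplified sketch of
what I understand the branch-address handling to do when top-byte-ignore
is enabled. It is an approximation of the architecture pseudocode, not
QEMU code: sketch_branch_addr() and its tbi/two_ranges parameters are
hypothetical, and the real rules depend on the exception level and
translation regime.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Remove the tag byte from a branch target.
 *   tbi:        top-byte-ignore is enabled for this address
 *   two_ranges: the regime has two VA ranges (e.g. EL0/EL1), so the
 *               top byte is rebuilt by sign-extending from bit 55;
 *               otherwise it is simply zeroed.
 */
static uint64_t sketch_branch_addr(uint64_t addr, bool tbi, bool two_ranges)
{
    const uint64_t top_byte = UINT64_C(0xFF00000000000000);

    if (!tbi) {
        return addr;                /* no tagging: use address as-is */
    }
    if (two_ranges && (addr & (UINT64_C(1) << 55))) {
        return addr | top_byte;     /* upper range: tag becomes 0xFF */
    }
    return addr & ~top_byte;        /* lower range: tag becomes 0x00 */
}
```

So a VBAR value carrying a tag such as 0xAB in bits [63:56] would yield
an untagged PC once the vector offset has been added and the result has
been passed through this kind of helper.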

thanks
-- PMM

[Qemu-devel] [PATCH v6 11/15] nbd: Less allocation during NBD_OPT_LIST

2016-10-13 Thread Eric Blake

Since we know that the maximum name we are willing to accept
is small enough to stack-allocate, rework the iteration over
NBD_OPT_LIST responses to reuse a stack buffer rather than
allocating every time.  Furthermore, we don't even have to
allocate if we know the server's length doesn't match what
we are searching for.
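
The idea can be sketched outside of QEMU as follows. Everything here is
hypothetical scaffolding (SketchStream, sketch_read, sketch_skip,
sketch_entry_matches, and a 255-byte name limit assumed for the sketch);
it only models one NBD_REP_SERVER payload laid out as a 4-byte big-endian
name length, the name, then an optional description. Unlike
nbd_receive_list() in the patch, the helper returns 1 on a match, 0 on no
match and -1 on a malformed entry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_NAME 255   /* assumed limit, small enough for the stack */

/* In-memory stand-in for the I/O channel. */
typedef struct {
    const uint8_t *data;
    size_t len;
    size_t pos;
} SketchStream;

static bool sketch_read(SketchStream *s, void *buf, size_t n)
{
    if (s->len - s->pos < n) {
        return false;
    }
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return true;
}

static bool sketch_skip(SketchStream *s, size_t n)
{
    if (s->len - s->pos < n) {
        return false;
    }
    s->pos += n;
    return true;
}

/* Check one entry of payload length len against the wanted name. */
static int sketch_entry_matches(SketchStream *s, uint32_t len,
                                const char *want)
{
    uint8_t be[4];
    char name[SKETCH_MAX_NAME + 1];   /* reused stack buffer, no malloc */
    uint32_t namelen;

    if (len < sizeof(be) || !sketch_read(s, be, sizeof(be))) {
        return -1;
    }
    namelen = ((uint32_t)be[0] << 24) | ((uint32_t)be[1] << 16) |
              ((uint32_t)be[2] << 8) | (uint32_t)be[3];
    len -= sizeof(be);
    if (len < namelen) {
        return -1;
    }
    /* Wrong length can never match: skip name and description unread. */
    if (namelen != strlen(want) || namelen > SKETCH_MAX_NAME) {
        return sketch_skip(s, len) ? 0 : -1;
    }
    if (!sketch_read(s, name, namelen)) {
        return -1;
    }
    name[namelen] = '\0';
    len -= namelen;
    if (!sketch_skip(s, len)) {       /* drop the description */
        return -1;
    }
    return memcmp(name, want, namelen) == 0 ? 1 : 0;
}
```

The length comparison comes first so that non-matching entries are
consumed with a skip instead of being copied anywhere.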

Signed-off-by: Eric Blake 

---
v6: rebase
v5: alter signature of nbd_receive_list for simpler logic
v4: rebase
v3: tweak commit message
---
 nbd/client.c | 145 +--
 1 file changed, 70 insertions(+), 75 deletions(-)

diff --git a/nbd/client.c b/nbd/client.c
index df7eb9c..f6468db 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -254,19 +254,28 @@ static int nbd_handle_reply_err(QIOChannel *ioc, nbd_opt_reply *reply,
 return result;
 }

-static int nbd_receive_list(QIOChannel *ioc, char **name, Error **errp)
+/* Process another portion of the NBD_OPT_LIST reply.  Set *@match if
+ * the current reply matches @want or if the server does not support
+ * NBD_OPT_LIST, otherwise leave @match alone.  Return 0 if iteration
+ * is complete, positive if more replies are expected, or negative
+ * with @errp set if an unrecoverable error occurred. */
+static int nbd_receive_list(QIOChannel *ioc, const char *want, bool *match,
+Error **errp)
 {
 nbd_opt_reply reply;
 uint32_t len;
 uint32_t namelen;
+char name[NBD_MAX_NAME_SIZE + 1];
 int error;

-*name = NULL;
 if (nbd_receive_option_reply(ioc, NBD_OPT_LIST, &reply, errp) < 0) {
 return -1;
 }
 error = nbd_handle_reply_err(ioc, &reply, errp);
 if (error <= 0) {
+/* The server did not support NBD_OPT_LIST, so set *match on
+ * the assumption that any name will be accepted.  */
+*match = true;
 return error;
 }
 len = reply.length;
@@ -277,105 +286,91 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, Error **errp)
 nbd_send_opt_abort(ioc);
 return -1;
 }
-} else if (reply.type == NBD_REP_SERVER) {
-if (len < sizeof(namelen) || len > NBD_MAX_BUFFER_SIZE) {
-error_setg(errp, "incorrect option length %" PRIu32, len);
-nbd_send_opt_abort(ioc);
-return -1;
-}
-if (read_sync(ioc, &namelen, sizeof(namelen)) != sizeof(namelen)) {
-error_setg(errp, "failed to read option name length");
-nbd_send_opt_abort(ioc);
-return -1;
-}
-namelen = be32_to_cpu(namelen);
-len -= sizeof(namelen);
-if (len < namelen) {
-error_setg(errp, "incorrect option name length");
-nbd_send_opt_abort(ioc);
-return -1;
-}
-if (namelen > NBD_MAX_NAME_SIZE) {
-error_setg(errp, "export name length too long %" PRIu32, namelen);
-nbd_send_opt_abort(ioc);
-return -1;
-}
-
-*name = g_new0(char, namelen + 1);
-if (read_sync(ioc, *name, namelen) != namelen) {
-error_setg(errp, "failed to read export name");
-g_free(*name);
-*name = NULL;
-nbd_send_opt_abort(ioc);
-return -1;
-}
-(*name)[namelen] = '\0';
-len -= namelen;
-if (drop_sync(ioc, len) != len) {
-error_setg(errp, "failed to read export description");
-g_free(*name);
-*name = NULL;
-nbd_send_opt_abort(ioc);
-return -1;
-}
-} else {
+return 0;
+} else if (reply.type != NBD_REP_SERVER) {
 error_setg(errp, "Unexpected reply type %" PRIx32 " expected %x",
reply.type, NBD_REP_SERVER);
 nbd_send_opt_abort(ioc);
 return -1;
 }
+
+if (len < sizeof(namelen) || len > NBD_MAX_BUFFER_SIZE) {
+error_setg(errp, "incorrect option length %" PRIu32, len);
+nbd_send_opt_abort(ioc);
+return -1;
+}
+if (read_sync(ioc, &namelen, sizeof(namelen)) != sizeof(namelen)) {
+error_setg(errp, "failed to read option name length");
+nbd_send_opt_abort(ioc);
+return -1;
+}
+namelen = be32_to_cpu(namelen);
+len -= sizeof(namelen);
+if (len < namelen) {
+error_setg(errp, "incorrect option name length");
+nbd_send_opt_abort(ioc);
+return -1;
+}
+if (namelen != strlen(want)) {
+if (drop_sync(ioc, len) != len) {
+error_setg(errp, "failed to skip export name with wrong length");
+nbd_send_opt_abort(ioc);
+return -1;
+}
+return 1;
+}
+
+assert(namelen < sizeof(name));
+if (read_sync(ioc, name, namelen) != namelen) {
+error_setg(errp, "failed to read export name");
+nbd_send_opt_abort(ioc);
+return -1;
+}
+name[namelen] = '\0';
+len -= namelen;
+if (drop_sync(ioc, len) != len) {
+

[Qemu-devel] [PATCH v6 00/15] nbd: efficient write zeroes

2016-10-13 Thread Eric Blake

v5 was here, but missed 2.7 freeze:
https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg04053.html

Since then, I've rebased the series, and the bulk of the changes
were to use consistent NBDFoo CamelCase naming, as well as to
improve the commit messages for questions raised on v5.

Also available as a tag at:
git fetch git://repo.or.cz/qemu/ericb.git nbd-zero-v6

001/15:[----] [-C] 'nbd: Add qemu-nbd -D for human-readable description'
002/15:[----] [--] 'nbd: Treat flags vs. command type as separate fields'
003/15:[down] 'nbd: Rename NBDRequest to NBDRequestData'
004/15:[down] 'nbd: Rename NbdClientSession to NBDClientSession'
005/15:[down] 'nbd: Rename struct nbd_request and nbd_reply'
006/15:[0012] [FC] 'nbd: Share common reply-sending code in server'
007/15:[0006] [FC] 'nbd: Send message along with server NBD_REP_ERR errors'
008/15:[0015] [FC] 'nbd: Share common option-sending code in client'
009/15:[----] [-C] 'nbd: Let server know when client gives up negotiation'
010/15:[----] [-C] 'nbd: Let client skip portions of server reply'
011/15:[0004] [FC] 'nbd: Less allocation during NBD_OPT_LIST'
012/15:[----] [-C] 'nbd: Support shorter handshake'
013/15:[----] [-C] 'nbd: Improve server handling of shutdown requests'
014/15:[----] [-C] 'nbd: Implement NBD_CMD_WRITE_ZEROES on server'
015/15:[0006] [FC] 'nbd: Implement NBD_CMD_WRITE_ZEROES on client'

Eric Blake (15):
  nbd: Add qemu-nbd -D for human-readable description
  nbd: Treat flags vs. command type as separate fields
  nbd: Rename NBDRequest to NBDRequestData
  nbd: Rename NbdClientSession to NBDClientSession
  nbd: Rename struct nbd_request and nbd_reply
  nbd: Share common reply-sending code in server
  nbd: Send message along with server NBD_REP_ERR errors
  nbd: Share common option-sending code in client
  nbd: Let server know when client gives up negotiation
  nbd: Let client skip portions of server reply
  nbd: Less allocation during NBD_OPT_LIST
  nbd: Support shorter handshake
  nbd: Improve server handling of shutdown requests
  nbd: Implement NBD_CMD_WRITE_ZEROES on server
  nbd: Implement NBD_CMD_WRITE_ZEROES on client

 block/nbd-client.h  |  10 +-
 include/block/nbd.h |  73 ++--
 nbd/nbd-internal.h  |  12 +-
 block/nbd-client.c  |  96 +++
 block/nbd.c |   8 +-
 nbd/client.c| 482 
 nbd/server.c| 294 ++--
 qemu-nbd.c  |  12 +-
 qemu-nbd.texi   |   5 +-
 9 files changed, 614 insertions(+), 378 deletions(-)

-- 
2.7.4

[Qemu-devel] [PATCH v6 05/15] nbd: Rename struct nbd_request and nbd_reply

2016-10-13 Thread Eric Blake

Our coding convention prefers CamelCase names, and we already
have other existing structs with NBDFoo naming.  Let's be
consistent, before later patches add even more structs.

Signed-off-by: Eric Blake 
---
v6: new patch
---
 block/nbd-client.h  |  2 +-
 include/block/nbd.h | 10 ++
 block/nbd-client.c  | 28 ++--
 nbd/client.c|  4 ++--
 nbd/server.c| 12 ++--
 5 files changed, 29 insertions(+), 27 deletions(-)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index a84a478..78e8e57 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -29,7 +29,7 @@ typedef struct NBDClientSession {
 int in_flight;

 Coroutine *recv_coroutine[MAX_NBD_REQUESTS];
-struct nbd_reply reply;
+NBDReply reply;

 bool is_unix;
 } NBDClientSession;
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 5fe2670..a33581b 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -29,18 +29,20 @@
 /* Note: these are _NOT_ the same as the network representation of an NBD
  * request and reply!
  */
-struct nbd_request {
+struct NBDRequest {
 uint64_t handle;
 uint64_t from;
 uint32_t len;
 uint16_t flags;
 uint16_t type;
 };
+typedef struct NBDRequest NBDRequest;

-struct nbd_reply {
+struct NBDReply {
 uint64_t handle;
 uint32_t error;
 };
+typedef struct NBDReply NBDReply;

 /* Transmission (export) flags: sent from server to client during handshake,
but describe what will happen during transmission */
@@ -101,8 +103,8 @@ int nbd_receive_negotiate(QIOChannel *ioc, const char *name, uint16_t *flags,
   QIOChannel **outioc,
   off_t *size, Error **errp);
 int nbd_init(int fd, QIOChannelSocket *sioc, uint16_t flags, off_t size);
-ssize_t nbd_send_request(QIOChannel *ioc, struct nbd_request *request);
-ssize_t nbd_receive_reply(QIOChannel *ioc, struct nbd_reply *reply);
+ssize_t nbd_send_request(QIOChannel *ioc, NBDRequest *request);
+ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply *reply);
 int nbd_client(int fd);
 int nbd_disconnect(int fd);

diff --git a/block/nbd-client.c b/block/nbd-client.c
index c94608a..8e89add 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -116,7 +116,7 @@ static void nbd_restart_write(void *opaque)
 }

 static int nbd_co_send_request(BlockDriverState *bs,
-   struct nbd_request *request,
+   NBDRequest *request,
QEMUIOVector *qiov)
 {
 NBDClientSession *s = nbd_get_client_session(bs);
@@ -168,8 +168,8 @@ static int nbd_co_send_request(BlockDriverState *bs,
 }

 static void nbd_co_receive_reply(NBDClientSession *s,
- struct nbd_request *request,
- struct nbd_reply *reply,
+ NBDRequest *request,
+ NBDReply *reply,
  QEMUIOVector *qiov)
 {
 int ret;
@@ -196,7 +196,7 @@ static void nbd_co_receive_reply(NBDClientSession *s,
 }

 static void nbd_coroutine_start(NBDClientSession *s,
-   struct nbd_request *request)
+NBDRequest *request)
 {
 /* Poor man semaphore.  The free_sema is locked when no other request
  * can be accepted, and unlocked after receiving one reply.  */
@@ -210,7 +210,7 @@ static void nbd_coroutine_start(NBDClientSession *s,
 }

 static void nbd_coroutine_end(NBDClientSession *s,
-struct nbd_request *request)
+  NBDRequest *request)
 {
 int i = HANDLE_TO_INDEX(s, request->handle);
 s->recv_coroutine[i] = NULL;
@@ -223,12 +223,12 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
  uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
 NBDClientSession *client = nbd_get_client_session(bs);
-struct nbd_request request = {
+NBDRequest request = {
 .type = NBD_CMD_READ,
 .from = offset,
 .len = bytes,
 };
-struct nbd_reply reply;
+NBDReply reply;
 ssize_t ret;

 assert(bytes <= NBD_MAX_BUFFER_SIZE);
@@ -249,12 +249,12 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
   uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
 NBDClientSession *client = nbd_get_client_session(bs);
-struct nbd_request request = {
+NBDRequest request = {
 .type = NBD_CMD_WRITE,
 .from = offset,
 .len = bytes,
 };
-struct nbd_reply reply;
+NBDReply reply;
 ssize_t ret;

 if (flags & BDRV_REQ_FUA) {
@@ -278,8 +278,8 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
 int nbd_client_co_flush(BlockDriverState *bs)
 {
 NBDClientSession *client = nbd_get_client_session(bs);
-struct nbd_request request = { .type = NBD_CMD_FLUSH };
-struct nbd_reply reply;
+NBDRequest request =

[Qemu-devel] [PATCH v6 10/15] nbd: Let client skip portions of server reply

2016-10-13 Thread Eric Blake

The server has a nice helper function nbd_negotiate_drop_sync()
which lets it easily ignore fluff from the client (such as the
payload to an unknown option request).  We can't quite make it
common, since it depends on nbd_negotiate_read() which handles
coroutine magic, but we can copy the idea into the client where
we have places where we want to ignore data (such as the
description tacked on the end of NBD_REP_SERVER).
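
As a rough illustration of the idea (not the patch's code), discarding a known number of bytes from a stream can be sketched against a plain POSIX file descriptor instead of a QIOChannel. The name drop_bytes and the DROP_CHUNK constant are hypothetical; the 1024-byte stack buffer and 64k read cap are borrowed from the diff below. The stack buffer is used only when the whole request fits in it.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define DROP_CHUNK 65536    /* hypothetical cap on a single read */

/* Read and discard exactly `size` bytes from `fd`.
 * Returns the number of bytes dropped, or -errno on failure
 * (-EIO if the stream ends before `size` bytes were seen). */
static ssize_t drop_bytes(int fd, size_t size)
{
    char small[1024];
    size_t chunk = size < DROP_CHUNK ? size : DROP_CHUNK;
    /* Use the stack buffer only when one chunk fits inside it. */
    char *buffer = chunk <= sizeof(small) ? small : malloc(chunk);
    size_t remaining = size;
    ssize_t ret = (ssize_t)size;

    if (!buffer) {
        return -ENOMEM;
    }
    while (remaining > 0) {
        size_t want = remaining < chunk ? remaining : chunk;
        ssize_t n = read(fd, buffer, want);

        if (n < 0) {
            if (errno == EINTR) {
                continue;           /* interrupted, simply retry */
            }
            ret = -errno;
            break;
        }
        if (n == 0) {               /* peer closed early */
            ret = -EIO;
            break;
        }
        assert((size_t)n <= remaining);
        remaining -= (size_t)n;
    }
    if (buffer != small) {
        free(buffer);
    }
    return ret;
}
```

A caller would compare the return value against the requested size, much as the patch does with drop_sync(ioc, len) != len.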

Signed-off-by: Eric Blake 

---
v6: rebase
v5: no change
v4: rebase
v3: rebase
---
 nbd/client.c | 47 +--
 1 file changed, 33 insertions(+), 14 deletions(-)

diff --git a/nbd/client.c b/nbd/client.c
index a3e1e7a..df7eb9c 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -75,6 +75,32 @@ static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);

 */

+/* Discard length bytes from channel.  Return -errno on failure, or
+ * the amount of bytes consumed. */
+static ssize_t drop_sync(QIOChannel *ioc, size_t size)
+{
+ssize_t ret, dropped = size;
+char small[1024];
+char *buffer;
+
+buffer = sizeof(small) >= size ? small : g_malloc(MIN(65536, size));
+while (size > 0) {
+ret = read_sync(ioc, buffer, MIN(65536, size));
+if (ret < 0) {
+goto cleanup;
+}
+assert(ret <= size);
+size -= ret;
+}
+ret = dropped;
+
+ cleanup:
+if (buffer != small) {
+g_free(buffer);
+}
+return ret;
+}
+
 /* Send an option request.
  *
  * The request is for option @opt, with @data containing @len bytes of
@@ -285,19 +311,12 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, Error **errp)
 }
 (*name)[namelen] = '\0';
 len -= namelen;
-if (len) {
-char *buf = g_malloc(len + 1);
-if (read_sync(ioc, buf, len) != len) {
-error_setg(errp, "failed to read export description");
-g_free(*name);
-g_free(buf);
-*name = NULL;
-nbd_send_opt_abort(ioc);
-return -1;
-}
-buf[len] = '\0';
-TRACE("Ignoring export description: %s", buf);
-g_free(buf);
+if (drop_sync(ioc, len) != len) {
+error_setg(errp, "failed to read export description");
+g_free(*name);
+*name = NULL;
+nbd_send_opt_abort(ioc);
+return -1;
 }
 } else {
 error_setg(errp, "Unexpected reply type %" PRIx32 " expected %x",
@@ -576,7 +595,7 @@ int nbd_receive_negotiate(QIOChannel *ioc, const char *name, uint16_t *flags,
 }

 TRACE("Size is %" PRIu64 ", export flags %" PRIx16, *size, *flags);
-if (read_sync(ioc, , 124) != 124) {
+if (drop_sync(ioc, 124) != 124) {
 error_setg(errp, "Failed to read reserved block");
 goto fail;
 }
-- 
2.7.4

[Qemu-devel] [PATCH v6 08/15] nbd: Share common option-sending code in client

2016-10-13 Thread Eric Blake

Rather than open-coding each option request, it's easier to
have common helper functions do the work.  That in turn requires
having convenient packed types for handling option requests
and replies.
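
To make the "packed types" point concrete, here is a minimal sketch of how a 16-byte option request header can be laid out and serialized in network byte order. It assumes a GCC/Clang-style packed attribute and C11 _Static_assert; the names opt_request_hdr, put_be and encode_opt_request are made up for illustration, and only the NBD_OPTS_MAGIC value is taken from nbd-internal.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBD_OPTS_MAGIC 0x49484156454F5054ULL   /* ASCII "IHAVEOPT" */

/* Wire layout of an option request header (mirrors struct nbd_option). */
struct opt_request_hdr {
    uint64_t magic;
    uint32_t option;
    uint32_t length;
} __attribute__((packed));

/* Compile-time check that the layout is 8 + 4 + 4 bytes with no padding. */
_Static_assert(sizeof(struct opt_request_hdr) == 16, "header must be 16 bytes");

/* Store the low `nbytes` bytes of `v` at `p`, most significant byte first. */
static void put_be(uint8_t *p, uint64_t v, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++) {
        p[i] = (uint8_t)(v >> (8 * (nbytes - 1 - i)));
    }
}

/* Serialize a header for option `opt` announcing `len` payload bytes.
 * `out` must have room for 16 bytes; returns the number of bytes written. */
static size_t encode_opt_request(uint8_t *out, uint32_t opt, uint32_t len)
{
    put_be(out + offsetof(struct opt_request_hdr, magic), NBD_OPTS_MAGIC, 8);
    put_be(out + offsetof(struct opt_request_hdr, option), opt, 4);
    put_be(out + offsetof(struct opt_request_hdr, length), len, 4);
    return sizeof(struct opt_request_hdr);
}
```

Writing through byte offsets rather than through the struct members avoids taking the address of a packed field, which some compilers warn about.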

Signed-off-by: Eric Blake 

---
v6: comment and formatting tweaks
v5: no change
v4: rebase
v3: rebase, tweak a debug message
---
 include/block/nbd.h |  25 +-
 nbd/nbd-internal.h  |   2 +-
 nbd/client.c| 255 ++--
 3 files changed, 131 insertions(+), 151 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index a33581b..b69bf1d 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -26,15 +26,34 @@
 #include "io/channel-socket.h"
 #include "crypto/tlscreds.h"

-/* Note: these are _NOT_ the same as the network representation of an NBD
+/* Handshake phase structs - this struct is passed on the wire */
+
+struct nbd_option {
+uint64_t magic; /* NBD_OPTS_MAGIC */
+uint32_t option; /* NBD_OPT_* */
+uint32_t length;
+} QEMU_PACKED;
+typedef struct nbd_option nbd_option;
+
+struct nbd_opt_reply {
+uint64_t magic; /* NBD_REP_MAGIC */
+uint32_t option; /* NBD_OPT_* */
+uint32_t type; /* NBD_REP_* */
+uint32_t length;
+} QEMU_PACKED;
+typedef struct nbd_opt_reply nbd_opt_reply;
+
+/* Transmission phase structs
+ *
+ * Note: these are _NOT_ the same as the network representation of an NBD
  * request and reply!
  */
 struct NBDRequest {
 uint64_t handle;
 uint64_t from;
 uint32_t len;
-uint16_t flags;
-uint16_t type;
+uint16_t flags; /* NBD_CMD_FLAG_* */
+uint16_t type; /* NBD_CMD_* */
 };
 typedef struct NBDRequest NBDRequest;

diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 99e5157..dd57e18 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -62,7 +62,7 @@
 #define NBD_REPLY_MAGIC 0x67446698
 #define NBD_OPTS_MAGIC  0x49484156454F5054LL
 #define NBD_CLIENT_MAGIC 0x420281861253LL
-#define NBD_REP_MAGIC   0x3e889045565a9LL
+#define NBD_REP_MAGIC   0x0003e889045565a9LL

 #define NBD_SET_SOCK _IO(0xab, 0)
 #define NBD_SET_BLKSIZE _IO(0xab, 1)
diff --git a/nbd/client.c b/nbd/client.c
index 86e29dc..f7a2d6e 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -75,64 +75,128 @@ static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);

 */

+/* Send an option request.
+ *
+ * The request is for option @opt, with @data containing @len bytes of
+ * additional payload for the request (@len may be -1 to treat @data as
+ * a C string; and @data may be NULL if @len is 0).
+ * Return 0 if successful, -1 with errp set if it is impossible to
+ * continue. */
+static int nbd_send_option_request(QIOChannel *ioc, uint32_t opt,
+   uint32_t len, const char *data,
+   Error **errp)
+{
+nbd_option req;
+QEMU_BUILD_BUG_ON(sizeof(req) != 16);

-/* If type represents success, return 1 without further action.
- * If type represents an error reply, consume the rest of the packet on ioc.
- * Then return 0 for unsupported (so the client can fall back to
- * other approaches), or -1 with errp set for other errors.
+if (len == -1) {
+req.length = len = strlen(data);
+}
+TRACE("Sending option request %" PRIu32", len %" PRIu32, opt, len);
+
+stq_be_p(&req.magic, NBD_OPTS_MAGIC);
+stl_be_p(&req.option, opt);
+stl_be_p(&req.length, len);
+
+if (write_sync(ioc, &req, sizeof(req)) != sizeof(req)) {
+error_setg(errp, "Failed to send option request header");
+return -1;
+}
+
+if (len && write_sync(ioc, (char *) data, len) != len) {
+error_setg(errp, "Failed to send option request data");
+return -1;
+}
+
+return 0;
+}
+
+/* Receive the header of an option reply, which should match the given
+ * opt.  Read through the length field, but NOT the length bytes of
+ * payload. Return 0 if successful, -1 with errp set if it is
+ * impossible to continue. */
+static int nbd_receive_option_reply(QIOChannel *ioc, uint32_t opt,
+nbd_opt_reply *reply, Error **errp)
+{
+QEMU_BUILD_BUG_ON(sizeof(*reply) != 20);
+if (read_sync(ioc, reply, sizeof(*reply)) != sizeof(*reply)) {
+error_setg(errp, "failed to read option reply");
+return -1;
+}
+be64_to_cpus(&reply->magic);
+be32_to_cpus(&reply->option);
+be32_to_cpus(&reply->type);
+be32_to_cpus(&reply->length);
+
+TRACE("Received option reply %" PRIx32", type %" PRIx32", len %" PRIu32,
+  reply->option, reply->type, reply->length);
+
+if (reply->magic != NBD_REP_MAGIC) {
+error_setg(errp, "Unexpected option reply magic");
+return -1;
+}
+if (reply->option != opt) {
+error_setg(errp, "Unexpected option type %x expected %x",
+   reply->option, opt);
+return -1;
+}
+return 0;
+}
+
+/* If reply represents success, return 1 without

[Qemu-devel] [PATCH v6 01/15] nbd: Add qemu-nbd -D for human-readable description

2016-10-13 Thread Eric Blake

The NBD protocol allows servers to advertise a human-readable
description alongside an export name during NBD_OPT_LIST.  Add
an option to pass through the user's string to the NBD client.

Doing this also makes it easier to test commit 200650d4, which
is the client counterpart of receiving the description.
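
For readers unfamiliar with the wire format, the payload of one NBD_REP_SERVER reply as produced by the server.c hunk below is a 32-bit big-endian name length, the name bytes, and then the description filling the remainder. The following is only a sketch of that layout; build_list_payload is a hypothetical helper, not a function in the series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Build the payload of one NBD_REP_SERVER reply:
 *   4 bytes  big-endian length of the export name
 *   N bytes  export name (no terminating NUL)
 *   rest     human-readable description (no terminating NUL)
 * Returns a malloc'd buffer and stores its size in *out_len,
 * or returns NULL if allocation fails. NULL strings count as "". */
static uint8_t *build_list_payload(const char *name, const char *desc,
                                   uint32_t *out_len)
{
    size_t name_len, desc_len;
    uint8_t *buf;

    name = name ? name : "";
    desc = desc ? desc : "";
    name_len = strlen(name);
    desc_len = strlen(desc);

    buf = malloc(4 + name_len + desc_len);
    if (!buf) {
        return NULL;
    }
    buf[0] = (uint8_t)(name_len >> 24);
    buf[1] = (uint8_t)(name_len >> 16);
    buf[2] = (uint8_t)(name_len >> 8);
    buf[3] = (uint8_t)name_len;
    memcpy(buf + 4, name, name_len);
    memcpy(buf + 4 + name_len, desc, desc_len);
    *out_len = (uint32_t)(4 + name_len + desc_len);
    return buf;
}
```

A client recovers the description by subtracting the name length and the 4-byte length field from the reply length announced in the header.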

Signed-off-by: Eric Blake 

---
v6: rebase to latest
v5: rebase to latest
v4: rebase to latest
---
 include/block/nbd.h |  1 +
 nbd/nbd-internal.h  |  5 +++--
 nbd/server.c| 34 ++
 qemu-nbd.c  | 12 +++-
 qemu-nbd.texi   |  5 -
 5 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 80610ff..fd58390 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -115,6 +115,7 @@ BlockBackend *nbd_export_get_blockdev(NBDExport *exp);

 NBDExport *nbd_export_find(const char *name);
 void nbd_export_set_name(NBDExport *exp, const char *name);
+void nbd_export_set_description(NBDExport *exp, const char *description);
 void nbd_export_close_all(void);

 void nbd_client_new(NBDExport *exp,
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 93a6ca8..7e78064 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -104,9 +104,10 @@ static inline ssize_t read_sync(QIOChannel *ioc, void *buffer, size_t size)
 return nbd_wr_syncv(ioc, &iov, 1, size, true);
 }

-static inline ssize_t write_sync(QIOChannel *ioc, void *buffer, size_t size)
+static inline ssize_t write_sync(QIOChannel *ioc, const void *buffer,
+ size_t size)
 {
-struct iovec iov = { .iov_base = buffer, .iov_len = size };
+struct iovec iov = { .iov_base = (void *) buffer, .iov_len = size };

 return nbd_wr_syncv(ioc, &iov, 1, size, false);
 }
diff --git a/nbd/server.c b/nbd/server.c
index 472f584..319827b 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -61,6 +61,7 @@ struct NBDExport {

 BlockBackend *blk;
 char *name;
+char *description;
 off_t dev_offset;
 off_t size;
 uint16_t nbdflags;
@@ -129,7 +130,8 @@ static ssize_t nbd_negotiate_read(QIOChannel *ioc, void *buffer, size_t size)

 }

-static ssize_t nbd_negotiate_write(QIOChannel *ioc, void *buffer, size_t size)
+static ssize_t nbd_negotiate_write(QIOChannel *ioc, const void *buffer,
+   size_t size)
 {
 ssize_t ret;
 guint watch;
@@ -225,11 +227,15 @@ static int nbd_negotiate_send_rep(QIOChannel *ioc, uint32_t type, uint32_t opt)

 static int nbd_negotiate_send_rep_list(QIOChannel *ioc, NBDExport *exp)
 {
-uint64_t magic, name_len;
+uint64_t magic;
+size_t name_len, desc_len;
 uint32_t opt, type, len;
+const char *name = exp->name ? exp->name : "";
+const char *desc = exp->description ? exp->description : "";

-TRACE("Advertising export name '%s'", exp->name ? exp->name : "");
-name_len = strlen(exp->name);
+TRACE("Advertising export name '%s' description '%s'", name, desc);
+name_len = strlen(name);
+desc_len = strlen(desc);
 magic = cpu_to_be64(NBD_REP_MAGIC);
 if (nbd_negotiate_write(ioc, &magic, sizeof(magic)) != sizeof(magic)) {
 LOG("write failed (magic)");
@@ -245,18 +251,22 @@ static int nbd_negotiate_send_rep_list(QIOChannel *ioc, NBDExport *exp)
 LOG("write failed (reply type)");
 return -EINVAL;
 }
-len = cpu_to_be32(name_len + sizeof(len));
+len = cpu_to_be32(name_len + desc_len + sizeof(len));
 if (nbd_negotiate_write(ioc, &len, sizeof(len)) != sizeof(len)) {
 LOG("write failed (length)");
 return -EINVAL;
 }
 len = cpu_to_be32(name_len);
 if (nbd_negotiate_write(ioc, &len, sizeof(len)) != sizeof(len)) {
-LOG("write failed (length)");
+LOG("write failed (name length)");
 return -EINVAL;
 }
-if (nbd_negotiate_write(ioc, exp->name, name_len) != name_len) {
-LOG("write failed (buffer)");
+if (nbd_negotiate_write(ioc, name, name_len) != name_len) {
+LOG("write failed (name buffer)");
+return -EINVAL;
+}
+if (nbd_negotiate_write(ioc, desc, desc_len) != desc_len) {
+LOG("write failed (description buffer)");
 return -EINVAL;
 }
 return 0;
@@ -893,6 +903,12 @@ void nbd_export_set_name(NBDExport *exp, const char *name)
 nbd_export_put(exp);
 }

+void nbd_export_set_description(NBDExport *exp, const char *description)
+{
+g_free(exp->description);
+exp->description = g_strdup(description);
+}
+
 void nbd_export_close(NBDExport *exp)
 {
 NBDClient *client, *next;
@@ -902,6 +918,7 @@ void nbd_export_close(NBDExport *exp)
 client_close(client);
 }
 nbd_export_set_name(exp, NULL);
+nbd_export_set_description(exp, NULL);
 nbd_export_put(exp);
 }

@@ -920,6 +937,7 @@ void nbd_export_put(NBDExport *exp)

 if (--exp->refcount == 0) {
 assert(exp->name == NULL);
+assert(exp->description ==

[Qemu-devel] [PATCH v6 14/15] nbd: Implement NBD_CMD_WRITE_ZEROES on server

2016-10-13 Thread Eric Blake

Upstream NBD protocol recently added the ability to efficiently
write zeroes without having to send the zeroes over the wire,
along with a flag to control whether the client wants to allow
a hole.

Note that when it comes to requiring full allocation, vs.
permitting optimizations, the NBD spec intentionally picked a
different sense for the flag; the rules in qemu are:
MAY_UNMAP == 0: must write zeroes
MAY_UNMAP == 1: may use holes if reads will see zeroes

while in NBD, the rules are:
FLAG_NO_HOLE == 1: must write zeroes
FLAG_NO_HOLE == 0: may use holes if reads will see zeroes

In all cases, the 'may use holes' scenario is optional (the
server need not use a hole, and must not use a hole if
subsequent reads would not see zeroes).
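
The inverted sense of the two flags is easy to get backwards, so here is a small sketch of the translation the server has to perform. The NBD_CMD_FLAG_* values are the ones added in this patch; REQ_FUA and REQ_MAY_UNMAP are stand-ins for qemu's block-layer request flags, with numeric values chosen only for illustration, and nbd_to_block_flags is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* NBD request flags, as defined in this patch. */
#define NBD_CMD_FLAG_FUA      (1 << 0)
#define NBD_CMD_FLAG_NO_HOLE  (1 << 1)

/* Stand-ins for the block layer's request flags (illustrative values). */
#define REQ_FUA        (1 << 0)
#define REQ_MAY_UNMAP  (1 << 1)

/* Translate the flags of an NBD_CMD_WRITE_ZEROES request into
 * block-layer flags. Note the inversion: NBD asks for "no hole",
 * while the block layer is told when a hole is permitted. */
static int nbd_to_block_flags(uint16_t nbd_flags)
{
    int flags = 0;

    if (nbd_flags & NBD_CMD_FLAG_FUA) {
        flags |= REQ_FUA;
    }
    if (!(nbd_flags & NBD_CMD_FLAG_NO_HOLE)) {
        flags |= REQ_MAY_UNMAP;
    }
    return flags;
}
```

With no flags set, the result permits unmapping; setting NBD_CMD_FLAG_NO_HOLE clears that permission and forces allocation.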

Signed-off-by: Eric Blake 

---
v6: rebase, improve commit message
v5: no change
v4: rebase, fix value for constant
v3: abandon NBD_CMD_CLOSE extension, rebase to use blk_pwrite_zeroes
---
 include/block/nbd.h |  8 ++--
 nbd/server.c| 42 --
 2 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index eea7ef0..3e373f0 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -71,6 +71,7 @@ typedef struct NBDReply NBDReply;
 #define NBD_FLAG_SEND_FUA   (1 << 3)    /* Send FUA (Force Unit Access) */
 #define NBD_FLAG_ROTATIONAL (1 << 4)    /* Use elevator algorithm - rotational media */
 #define NBD_FLAG_SEND_TRIM  (1 << 5)    /* Send TRIM (discard) */
+#define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) /* Send WRITE_ZEROES */

 /* New-style handshake (global) flags, sent from server to client, and
control what will happen during handshake phase. */
@@ -96,7 +97,8 @@ typedef struct NBDReply NBDReply;
 #define NBD_REP_ERR_SHUTDOWN NBD_REP_ERR(7)  /* Server shutting down */

 /* Request flags, sent from client to server during transmission phase */
-#define NBD_CMD_FLAG_FUA (1 << 0)
+#define NBD_CMD_FLAG_FUA (1 << 0) /* 'force unit access' during write */
+#define NBD_CMD_FLAG_NO_HOLE (1 << 1) /* don't punch hole on zero run */

 /* Supported request types */
 enum {
@@ -104,7 +106,9 @@ enum {
 NBD_CMD_WRITE = 1,
 NBD_CMD_DISC = 2,
 NBD_CMD_FLUSH = 3,
-NBD_CMD_TRIM = 4
+NBD_CMD_TRIM = 4,
+/* 5 reserved for failed experiment NBD_CMD_CACHE */
+NBD_CMD_WRITE_ZEROES = 6,
 };

 #define NBD_DEFAULT_PORT   10809
diff --git a/nbd/server.c b/nbd/server.c
index a7aa2ba..b189619 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -615,7 +615,8 @@ static coroutine_fn int nbd_negotiate(NBDClientNewData *data)
 char buf[8 + 8 + 8 + 128];
 int rc;
 const uint16_t myflags = (NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_TRIM |
-  NBD_FLAG_SEND_FLUSH | NBD_FLAG_SEND_FUA);
+  NBD_FLAG_SEND_FLUSH | NBD_FLAG_SEND_FUA |
+  NBD_FLAG_SEND_WRITE_ZEROES);
 bool oldStyle;
 size_t len;

@@ -1145,11 +1146,17 @@ static ssize_t nbd_co_receive_request(NBDRequestData *req,
 rc = request->type == NBD_CMD_WRITE ? -ENOSPC : -EINVAL;
 goto out;
 }
-if (request->flags & ~NBD_CMD_FLAG_FUA) {
+if (request->flags & ~(NBD_CMD_FLAG_FUA | NBD_CMD_FLAG_NO_HOLE)) {
 LOG("unsupported flags (got 0x%x)", request->flags);
 rc = -EINVAL;
 goto out;
 }
+if (request->type != NBD_CMD_WRITE_ZEROES &&
+(request->flags & NBD_CMD_FLAG_NO_HOLE)) {
+LOG("unexpected flags (got 0x%x)", request->flags);
+rc = -EINVAL;
+goto out;
+}

 rc = 0;

@@ -1254,6 +1261,37 @@ static void nbd_trip(void *opaque)
 }
 break;

+case NBD_CMD_WRITE_ZEROES:
+TRACE("Request type is WRITE_ZEROES");
+
+if (exp->nbdflags & NBD_FLAG_READ_ONLY) {
+TRACE("Server is read-only, return error");
+reply.error = EROFS;
+goto error_reply;
+}
+
+TRACE("Writing to device");
+
+flags = 0;
+if (request.flags & NBD_CMD_FLAG_FUA) {
+flags |= BDRV_REQ_FUA;
+}
+if (!(request.flags & NBD_CMD_FLAG_NO_HOLE)) {
+flags |= BDRV_REQ_MAY_UNMAP;
+}
+ret = blk_pwrite_zeroes(exp->blk, request.from + exp->dev_offset,
+request.len, flags);
+if (ret < 0) {
+LOG("writing to file failed");
+reply.error = -ret;
+goto error_reply;
+}
+
+if (nbd_co_send_reply(req, &reply, 0) < 0) {
+goto out;
+}
+break;
+
 case NBD_CMD_DISC:
 /* unreachable, thanks to special case in nbd_co_receive_request() */
 abort();
-- 
2.7.4

[Qemu-devel] [PATCH v6 04/15] nbd: Rename NbdClientSession to NBDClientSession

2016-10-13 Thread Eric Blake

It's better to use consistent capitalization of the namespace
used for NBD functions; we have more instances of NBD* than
Nbd*.

Signed-off-by: Eric Blake 

---
v6: new patch
---
 block/nbd-client.h |  6 +++---
 block/nbd-client.c | 26 +-
 block/nbd.c|  4 ++--
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/block/nbd-client.h b/block/nbd-client.h
index 044aca4..a84a478 100644
--- a/block/nbd-client.h
+++ b/block/nbd-client.h
@@ -17,7 +17,7 @@

 #define MAX_NBD_REQUESTS 16

-typedef struct NbdClientSession {
+typedef struct NBDClientSession {
 QIOChannelSocket *sioc; /* The master data channel */
 QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */
 uint16_t nbdflags;
@@ -32,9 +32,9 @@ typedef struct NbdClientSession {
 struct nbd_reply reply;

 bool is_unix;
-} NbdClientSession;
+} NBDClientSession;

-NbdClientSession *nbd_get_client_session(BlockDriverState *bs);
+NBDClientSession *nbd_get_client_session(BlockDriverState *bs);

 int nbd_client_init(BlockDriverState *bs,
 QIOChannelSocket *sock,
diff --git a/block/nbd-client.c b/block/nbd-client.c
index 7e9c3ec..c94608a 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -33,7 +33,7 @@
 #define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
 #define INDEX_TO_HANDLE(bs, index)  ((index)  ^ ((uint64_t)(intptr_t)bs))

-static void nbd_recv_coroutines_enter_all(NbdClientSession *s)
+static void nbd_recv_coroutines_enter_all(NBDClientSession *s)
 {
 int i;

@@ -46,7 +46,7 @@ static void nbd_recv_coroutines_enter_all(NbdClientSession *s)

 static void nbd_teardown_connection(BlockDriverState *bs)
 {
-NbdClientSession *client = nbd_get_client_session(bs);
+NBDClientSession *client = nbd_get_client_session(bs);

 if (!client->ioc) { /* Already closed */
 return;
@@ -68,7 +68,7 @@ static void nbd_teardown_connection(BlockDriverState *bs)
 static void nbd_reply_ready(void *opaque)
 {
 BlockDriverState *bs = opaque;
-NbdClientSession *s = nbd_get_client_session(bs);
+NBDClientSession *s = nbd_get_client_session(bs);
 uint64_t i;
 int ret;

@@ -119,7 +119,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
struct nbd_request *request,
QEMUIOVector *qiov)
 {
-NbdClientSession *s = nbd_get_client_session(bs);
+NBDClientSession *s = nbd_get_client_session(bs);
 AioContext *aio_context;
 int rc, ret, i;

@@ -167,7 +167,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
 return rc;
 }

-static void nbd_co_receive_reply(NbdClientSession *s,
+static void nbd_co_receive_reply(NBDClientSession *s,
  struct nbd_request *request,
  struct nbd_reply *reply,
  QEMUIOVector *qiov)
@@ -195,7 +195,7 @@ static void nbd_co_receive_reply(NbdClientSession *s,
 }
 }

-static void nbd_coroutine_start(NbdClientSession *s,
+static void nbd_coroutine_start(NBDClientSession *s,
struct nbd_request *request)
 {
 /* Poor man semaphore.  The free_sema is locked when no other request
@@ -209,7 +209,7 @@ static void nbd_coroutine_start(NbdClientSession *s,
 /* s->recv_coroutine[i] is set as soon as we get the send_lock.  */
 }

-static void nbd_coroutine_end(NbdClientSession *s,
+static void nbd_coroutine_end(NBDClientSession *s,
 struct nbd_request *request)
 {
 int i = HANDLE_TO_INDEX(s, request->handle);
@@ -222,7 +222,7 @@ static void nbd_coroutine_end(NbdClientSession *s,
 int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
  uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
-NbdClientSession *client = nbd_get_client_session(bs);
+NBDClientSession *client = nbd_get_client_session(bs);
 struct nbd_request request = {
 .type = NBD_CMD_READ,
 .from = offset,
@@ -248,7 +248,7 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
 int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
   uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
-NbdClientSession *client = nbd_get_client_session(bs);
+NBDClientSession *client = nbd_get_client_session(bs);
 struct nbd_request request = {
 .type = NBD_CMD_WRITE,
 .from = offset,
@@ -277,7 +277,7 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,

 int nbd_client_co_flush(BlockDriverState *bs)
 {
-NbdClientSession *client = nbd_get_client_session(bs);
+NBDClientSession *client = nbd_get_client_session(bs);
 struct nbd_request request = { .type = NBD_CMD_FLUSH };
 struct nbd_reply reply;
 ssize_t ret;
@@ -302,7 +302,7 @@ int nbd_client_co_flush(BlockDriverState *bs)

 int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
 {
-NbdClientSession

[Qemu-devel] [PATCH v6 02/15] nbd: Treat flags vs. command type as separate fields

2016-10-13 Thread Eric Blake

Current upstream NBD documents that requests have a 16-bit flags
field followed by a 16-bit type integer, although older versions
mentioned only a 32-bit field with masking to find flags.  Since the protocol
is in network order (big-endian over the wire), the ABI is unchanged;
but dealing with the flags as a separate field rather than masking
will make it easier to add support for upcoming NBD extensions that
increase the number of both flags and commands.
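
The claim that the wire format is unchanged can be checked with a few lines of C. This is only a sketch with made-up helper names (encode_old, encode_new): it encodes a write request with FUA set once as a single 32-bit big-endian field, with the flag at bit 16 as in the old NBD_CMD_FLAG_FUA definition, and once as a 16-bit flags field followed by a 16-bit type field.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Old view: one 32-bit field, command in the low half, flags above it. */
static void encode_old(uint8_t out[4], uint32_t type_and_flags)
{
    put_be32(out, type_and_flags);
}

/* New view: 16 bits of flags followed by 16 bits of command type. */
static void encode_new(uint8_t out[4], uint16_t flags, uint16_t type)
{
    put_be16(out, flags);
    put_be16(out + 2, type);
}
```

Because the field is big-endian, the upper 16 bits of the old 32-bit value are exactly the first two bytes on the wire, so both encoders emit identical bytes.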

Improve some comments in nbd.h based on the current upstream
NBD protocol (https://github.com/yoe/nbd/blob/master/doc/proto.md),
and touch some nearby code to keep checkpatch.pl happy.

Signed-off-by: Eric Blake 

---
v6: no change
v5: no change
v4: rebase to earlier changes
v3: rebase to other changes earlier in series
---
 include/block/nbd.h | 18 --
 nbd/nbd-internal.h  |  4 ++--
 block/nbd-client.c  |  9 +++--
 nbd/client.c|  9 ++---
 nbd/server.c| 39 +++
 5 files changed, 42 insertions(+), 37 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index fd58390..5fe2670 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -1,4 +1,5 @@
 /*
+ *  Copyright (C) 2016 Red Hat, Inc.
  *  Copyright (C) 2005  Anthony Liguori 
  *
  *  Network Block Device
@@ -32,7 +33,8 @@ struct nbd_request {
 uint64_t handle;
 uint64_t from;
 uint32_t len;
-uint32_t type;
+uint16_t flags;
+uint16_t type;
 };

 struct nbd_reply {
@@ -40,6 +42,8 @@ struct nbd_reply {
 uint32_t error;
 };

+/* Transmission (export) flags: sent from server to client during handshake,
+   but describe what will happen during transmission */
 #define NBD_FLAG_HAS_FLAGS  (1 << 0)    /* Flags are there */
 #define NBD_FLAG_READ_ONLY  (1 << 1)    /* Device is read-only */
 #define NBD_FLAG_SEND_FLUSH (1 << 2)    /* Send FLUSH */
@@ -47,10 +51,12 @@ struct nbd_reply {
 #define NBD_FLAG_ROTATIONAL (1 << 4)    /* Use elevator algorithm - rotational media */
 #define NBD_FLAG_SEND_TRIM  (1 << 5)    /* Send TRIM (discard) */

-/* New-style global flags. */
+/* New-style handshake (global) flags, sent from server to client, and
+   control what will happen during handshake phase. */
 #define NBD_FLAG_FIXED_NEWSTYLE (1 << 0)    /* Fixed newstyle protocol. */

-/* New-style client flags. */
+/* New-style client flags, sent from client to server to control what happens
+   during handshake phase. */
 #define NBD_FLAG_C_FIXED_NEWSTYLE   (1 << 0)    /* Fixed newstyle protocol. */

 /* Reply types. */
@@ -61,10 +67,10 @@ struct nbd_reply {
 #define NBD_REP_ERR_INVALID ((UINT32_C(1) << 31) | 3) /* Invalid length. */
 #define NBD_REP_ERR_TLS_REQD((UINT32_C(1) << 31) | 5) /* TLS required */

+/* Request flags, sent from client to server during transmission phase */
+#define NBD_CMD_FLAG_FUA (1 << 0)

-#define NBD_CMD_MASK_COMMAND   0x
-#define NBD_CMD_FLAG_FUA   (1 << 16)
-
+/* Supported request types */
 enum {
 NBD_CMD_READ = 0,
 NBD_CMD_WRITE = 1,
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 7e78064..99e5157 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -53,10 +53,10 @@
 /* This is all part of the "official" NBD API.
  *
  * The most up-to-date documentation is available at:
- * https://github.com/yoe/nbd/blob/master/doc/proto.txt
+ * https://github.com/yoe/nbd/blob/master/doc/proto.md
  */

-#define NBD_REQUEST_SIZE (4 + 4 + 8 + 8 + 4)
+#define NBD_REQUEST_SIZE (4 + 2 + 2 + 8 + 8 + 4)
 #define NBD_REPLY_SIZE  (4 + 4 + 8)
 #define NBD_REQUEST_MAGIC   0x25609513
 #define NBD_REPLY_MAGIC 0x67446698
diff --git a/block/nbd-client.c b/block/nbd-client.c
index 2cf3237..7e9c3ec 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -1,6 +1,7 @@
 /*
  * QEMU Block driver for  NBD
  *
+ * Copyright (C) 2016 Red Hat, Inc.
  * Copyright (C) 2008 Bull S.A.S.
  * Author: Laurent Vivier 
  *
@@ -258,7 +259,7 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,

 if (flags & BDRV_REQ_FUA) {
 assert(client->nbdflags & NBD_FLAG_SEND_FUA);
-request.type |= NBD_CMD_FLAG_FUA;
+request.flags |= NBD_CMD_FLAG_FUA;
 }

 assert(bytes <= NBD_MAX_BUFFER_SIZE);
@@ -343,11 +344,7 @@ void nbd_client_attach_aio_context(BlockDriverState *bs,
 void nbd_client_close(BlockDriverState *bs)
 {
 NbdClientSession *client = nbd_get_client_session(bs);
-struct nbd_request request = {
-.type = NBD_CMD_DISC,
-.from = 0,
-.len = 0
-};
+struct nbd_request request = { .type = NBD_CMD_DISC };

 if (client->ioc == NULL) {
 return;
diff --git a/nbd/client.c b/nbd/client.c
index a92f1e2..7c172ed 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -1,4 +1,5 @@
 /*
+ *  Copyright (C) 2016 Red Hat, Inc.
  *  Copyright (C) 2005

[Qemu-devel] [PATCH v6 06/15] nbd: Share common reply-sending code in server

2016-10-13 Thread Eric Blake

Rather than open-coding NBD_REP_SERVER, reuse the code we
already have by adding a length parameter.  Additionally,
the refactoring will make adding NBD_OPT_GO in a later patch
easier.
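
The shape of this refactoring, one header routine that takes a length and a thin wrapper that passes zero, can be sketched independently of qemu's channel code. The functions encode_rep_len and encode_rep below are hypothetical and write into a byte buffer rather than a QIOChannel; the 20-byte layout (magic, option, reply type, length) follows the nbd_opt_reply struct used elsewhere in the series, and the magic value is taken from nbd-internal.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBD_REP_MAGIC 0x0003e889045565a9ULL

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Encode an option reply header that announces `len` bytes of payload.
 * `out` must have room for 20 bytes; returns the header size. */
static size_t encode_rep_len(uint8_t *out, uint32_t type, uint32_t opt,
                             uint32_t len)
{
    put_be32(out, (uint32_t)(NBD_REP_MAGIC >> 32));   /* magic, high half */
    put_be32(out + 4, (uint32_t)NBD_REP_MAGIC);       /* magic, low half */
    put_be32(out + 8, opt);                           /* option replied to */
    put_be32(out + 12, type);                         /* reply type */
    put_be32(out + 16, len);                          /* payload length */
    return 20;
}

/* Common case: a reply that carries no payload at all. */
static size_t encode_rep(uint8_t *out, uint32_t type, uint32_t opt)
{
    return encode_rep_len(out, type, opt, 0);
}
```

Any reply that does carry data then only needs to call the length-aware variant and append its payload, which is the reuse the commit message describes.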

Signed-off-by: Eric Blake 

---
v6: improve (and add) function comments
v5: no change
v4: no change
v3: rebase to changes earlier in series
---
 nbd/server.c | 52 +++-
 1 file changed, 27 insertions(+), 25 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 8e0ad78..14a5bd5 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -196,12 +196,15 @@ static ssize_t nbd_negotiate_drop_sync(QIOChannel *ioc, size_t size)

 */

-static int nbd_negotiate_send_rep(QIOChannel *ioc, uint32_t type, uint32_t opt)
+/* Send a reply header, including length, but no payload.
+ * Return -errno on error, 0 on success. */
+static int nbd_negotiate_send_rep_len(QIOChannel *ioc, uint32_t type,
+  uint32_t opt, uint32_t len)
 {
 uint64_t magic;
-uint32_t len;

-TRACE("Reply opt=%" PRIx32 " type=%" PRIx32, type, opt);
+TRACE("Reply opt=%" PRIx32 " type=%" PRIx32 " len=%" PRIu32,
+  opt, type, len);

 magic = cpu_to_be64(NBD_REP_MAGIC);
 if (nbd_negotiate_write(ioc, &magic, sizeof(magic)) != sizeof(magic)) {
@@ -218,7 +221,7 @@ static int nbd_negotiate_send_rep(QIOChannel *ioc, uint32_t type, uint32_t opt)
 LOG("write failed (rep type)");
 return -EINVAL;
 }
-len = cpu_to_be32(0);
+len = cpu_to_be32(len);
 if (nbd_negotiate_write(ioc, &len, sizeof(len)) != sizeof(len)) {
 LOG("write failed (rep data length)");
 return -EINVAL;
@@ -226,37 +229,32 @@ static int nbd_negotiate_send_rep(QIOChannel *ioc, uint32_t type, uint32_t opt)
 return 0;
 }

+/* Send a reply header with default 0 length.
+ * Return -errno on error, 0 on success. */
+static int nbd_negotiate_send_rep(QIOChannel *ioc, uint32_t type, uint32_t opt)
+{
+return nbd_negotiate_send_rep_len(ioc, type, opt, 0);
+}
+
+/* Send a single NBD_REP_SERVER reply to NBD_OPT_LIST, including payload.
+ * Return -errno on error, 0 on success. */
 static int nbd_negotiate_send_rep_list(QIOChannel *ioc, NBDExport *exp)
 {
-uint64_t magic;
 size_t name_len, desc_len;
-uint32_t opt, type, len;
+uint32_t len;
 const char *name = exp->name ? exp->name : "";
 const char *desc = exp->description ? exp->description : "";
+int rc;

 TRACE("Advertising export name '%s' description '%s'", name, desc);
 name_len = strlen(name);
 desc_len = strlen(desc);
-magic = cpu_to_be64(NBD_REP_MAGIC);
-if (nbd_negotiate_write(ioc, &magic, sizeof(magic)) != sizeof(magic)) {
-LOG("write failed (magic)");
-return -EINVAL;
- }
-opt = cpu_to_be32(NBD_OPT_LIST);
-if (nbd_negotiate_write(ioc, &opt, sizeof(opt)) != sizeof(opt)) {
-LOG("write failed (opt)");
-return -EINVAL;
-}
-type = cpu_to_be32(NBD_REP_SERVER);
-if (nbd_negotiate_write(ioc, &type, sizeof(type)) != sizeof(type)) {
-LOG("write failed (reply type)");
-return -EINVAL;
-}
-len = cpu_to_be32(name_len + desc_len + sizeof(len));
-if (nbd_negotiate_write(ioc, &len, sizeof(len)) != sizeof(len)) {
-LOG("write failed (length)");
-return -EINVAL;
+len = name_len + desc_len + sizeof(len);
+rc = nbd_negotiate_send_rep_len(ioc, NBD_REP_SERVER, NBD_OPT_LIST, len);
+if (rc < 0) {
+return rc;
 }
+
 len = cpu_to_be32(name_len);
 if (nbd_negotiate_write(ioc, &len, sizeof(len)) != sizeof(len)) {
 LOG("write failed (name length)");
@@ -273,6 +271,8 @@ static int nbd_negotiate_send_rep_list(QIOChannel *ioc, NBDExport *exp)
 return 0;
 }

+/* Process the NBD_OPT_LIST command, with a potential series of replies.
+ * Return -errno on error, 0 on success. */
 static int nbd_negotiate_handle_list(NBDClient *client, uint32_t length)
 {
 NBDExport *exp;
@@ -381,6 +381,8 @@ static QIOChannel *nbd_negotiate_handle_starttls(NBDClient *client,
 }


+/* Process all NBD_OPT_* client option commands.
+ * Return -errno on error, 0 on success. */
 static int nbd_negotiate_options(NBDClient *client)
 {
 uint32_t flags;
-- 
2.7.4

[Qemu-devel] [PATCH v6 09/15] nbd: Let server know when client gives up negotiation

2016-10-13 Thread Eric Blake

The NBD spec says that a client should send NBD_OPT_ABORT
rather than just dropping the connection, if the client doesn't
like something the server sent during option negotiation.  This
is a best-effort attempt only, and can only be done in places
where we know the server is still in sync with what we've sent,
whether or not we've read everything the server has sent.
Technically, the server then has to reply with NBD_REP_ACK, but
it's not worth complicating the client to wait around for that
reply.

Signed-off-by: Eric Blake 

---
v6: rebase
v5: no change
v4: new patch
---
 nbd/client.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/nbd/client.c b/nbd/client.c
index f7a2d6e..a3e1e7a 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -111,6 +111,19 @@ static int nbd_send_option_request(QIOChannel *ioc, uint32_t opt,
 return 0;
 }

+/* Send NBD_OPT_ABORT as a courtesy to let the server know that we are
+ * not going to attempt further negotiation. */
+static void nbd_send_opt_abort(QIOChannel *ioc)
+{
+/* Technically, a compliant server is supposed to reply to us; but
+ * older servers disconnected instead. At any rate, we're allowed
+ * to disconnect without waiting for the server reply, so we don't
+ * even care if the request makes it to the server, let alone
+ * waiting around for whether the server replies. */
+nbd_send_option_request(ioc, NBD_OPT_ABORT, 0, NULL, NULL);
+}
+
+
 /* Receive the header of an option reply, which should match the given
  * opt.  Read through the length field, but NOT the length bytes of
  * payload. Return 0 if successful, -1 with errp set if it is
@@ -121,6 +134,7 @@ static int nbd_receive_option_reply(QIOChannel *ioc, uint32_t opt,
 QEMU_BUILD_BUG_ON(sizeof(*reply) != 20);
 if (read_sync(ioc, reply, sizeof(*reply)) != sizeof(*reply)) {
 error_setg(errp, "failed to read option reply");
+nbd_send_opt_abort(ioc);
 return -1;
 }
 be64_to_cpus(&reply->magic);
@@ -133,11 +147,13 @@ static int nbd_receive_option_reply(QIOChannel *ioc, uint32_t opt,

 if (reply->magic != NBD_REP_MAGIC) {
 error_setg(errp, "Unexpected option reply magic");
+nbd_send_opt_abort(ioc);
 return -1;
 }
 if (reply->option != opt) {
 error_setg(errp, "Unexpected option type %x expected %x",
reply->option, opt);
+nbd_send_opt_abort(ioc);
 return -1;
 }
 return 0;
@@ -206,6 +222,9 @@ static int nbd_handle_reply_err(QIOChannel *ioc, nbd_opt_reply *reply,

  cleanup:
 g_free(msg);
+if (result < 0) {
+nbd_send_opt_abort(ioc);
+}
 return result;
 }

@@ -229,25 +248,30 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, Error **errp)
 if (reply.type == NBD_REP_ACK) {
 if (len != 0) {
 error_setg(errp, "length too long for option end");
+nbd_send_opt_abort(ioc);
 return -1;
 }
 } else if (reply.type == NBD_REP_SERVER) {
 if (len < sizeof(namelen) || len > NBD_MAX_BUFFER_SIZE) {
 error_setg(errp, "incorrect option length %" PRIu32, len);
+nbd_send_opt_abort(ioc);
 return -1;
 }
 if (read_sync(ioc, &namelen, sizeof(namelen)) != sizeof(namelen)) {
 error_setg(errp, "failed to read option name length");
+nbd_send_opt_abort(ioc);
 return -1;
 }
 namelen = be32_to_cpu(namelen);
 len -= sizeof(namelen);
 if (len < namelen) {
 error_setg(errp, "incorrect option name length");
+nbd_send_opt_abort(ioc);
 return -1;
 }
 if (namelen > NBD_MAX_NAME_SIZE) {
 error_setg(errp, "export name length too long %" PRIu32, namelen);
+nbd_send_opt_abort(ioc);
 return -1;
 }

@@ -256,6 +280,7 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, Error **errp)
 error_setg(errp, "failed to read export name");
 g_free(*name);
 *name = NULL;
+nbd_send_opt_abort(ioc);
 return -1;
 }
 (*name)[namelen] = '\0';
@@ -267,6 +292,7 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, Error **errp)
 g_free(*name);
 g_free(buf);
 *name = NULL;
+nbd_send_opt_abort(ioc);
 return -1;
 }
 buf[len] = '\0';
@@ -276,6 +302,7 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, Error **errp)
 } else {
 error_setg(errp, "Unexpected reply type %" PRIx32 " expected %x",
reply.type, NBD_REP_SERVER);
+nbd_send_opt_abort(ioc);
 return -1;
 }
 return 1;
@@ -325,6 +352,7 @@ static int nbd_receive_query_exports(QIOChannel *ioc,

 if (!foundExport) {
 error_setg(errp, "No export with name '%s'

[Qemu-devel] [PATCH v6 03/15] nbd: Rename NBDRequest to NBDRequestData

2016-10-13 Thread Eric Blake

We have both 'struct NBDRequest' and 'struct nbd_request'; making
it confusing to see which does what.  Furthermore, we want to
rename nbd_request to align with our normal CamelCase naming
conventions.  So, rename the struct which is used to associate
the data received during request callbacks, while leaving the
shorter name for the description of the request sent over the
wire in the NBD protocol.

Signed-off-by: Eric Blake 

---
v6: new patch
---
 nbd/server.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 2e84d51..78c0419 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -47,10 +47,10 @@ static int system_errno_to_nbd_errno(int err)

 /* Definitions for opaque data types */

-typedef struct NBDRequest NBDRequest;
+typedef struct NBDRequestData NBDRequestData;

-struct NBDRequest {
-QSIMPLEQ_ENTRY(NBDRequest) entry;
+struct NBDRequestData {
+QSIMPLEQ_ENTRY(NBDRequestData) entry;
 NBDClient *client;
 uint8_t *data;
 bool complete;
@@ -759,21 +759,21 @@ static void client_close(NBDClient *client)
 }
 }

-static NBDRequest *nbd_request_get(NBDClient *client)
+static NBDRequestData *nbd_request_get(NBDClient *client)
 {
-NBDRequest *req;
+NBDRequestData *req;

 assert(client->nb_requests <= MAX_NBD_REQUESTS - 1);
 client->nb_requests++;
 nbd_update_can_read(client);

-req = g_new0(NBDRequest, 1);
+req = g_new0(NBDRequestData, 1);
 nbd_client_get(client);
 req->client = client;
 return req;
 }

-static void nbd_request_put(NBDRequest *req)
+static void nbd_request_put(NBDRequestData *req)
 {
 NBDClient *client = req->client;

@@ -975,7 +975,7 @@ void nbd_export_close_all(void)
 }
 }

-static ssize_t nbd_co_send_reply(NBDRequest *req, struct nbd_reply *reply,
+static ssize_t nbd_co_send_reply(NBDRequestData *req, struct nbd_reply *reply,
  int len)
 {
 NBDClient *client = req->client;
@@ -1011,7 +1011,7 @@ static ssize_t nbd_co_send_reply(NBDRequest *req, struct nbd_reply *reply,
  * and any other negative value to report an error to the client
  * (although the caller may still need to disconnect after reporting
  * the error).  */
-static ssize_t nbd_co_receive_request(NBDRequest *req,
+static ssize_t nbd_co_receive_request(NBDRequestData *req,
   struct nbd_request *request)
 {
 NBDClient *client = req->client;
@@ -1105,7 +1105,7 @@ static void nbd_trip(void *opaque)
 {
 NBDClient *client = opaque;
 NBDExport *exp = client->exp;
-NBDRequest *req;
+NBDRequestData *req;
 struct nbd_request request;
 struct nbd_reply reply;
 ssize_t ret;
-- 
2.7.4

[Qemu-devel] [PATCH v6 07/15] nbd: Send message along with server NBD_REP_ERR errors

2016-10-13 Thread Eric Blake

The NBD Protocol allows us to send human-readable messages
along with any NBD_REP_ERR error during option negotiation;
make use of this fact for clients that know what to do with
our message.
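
For readers unfamiliar with the wire format, here is a minimal sketch of how
such an error reply could be laid out in a buffer: the 20-byte option reply
header (magic, option, reply type, length, all big-endian) followed by the
message bytes without a trailing NUL. The sketch_* names and the buffer-based
interface are hypothetical and not part of this patch, which writes each field
to a QIOChannel instead. The magic value is the one given in the NBD protocol
document; the 4096-byte cap mirrors the assert in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Option reply magic, as listed in the NBD protocol document. */
#define SKETCH_NBD_REP_MAGIC 0x3e889045565a9ULL
#define SKETCH_REP_HDR_LEN   20u    /* 8 (magic) + 4 (opt) + 4 (type) + 4 (len) */
#define SKETCH_MAX_MSG_LEN   4096u  /* same bound the patch asserts on */

/* Store a 32-bit value in big-endian (network) byte order. */
static void sketch_put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Store a 64-bit value in big-endian byte order. */
static void sketch_put_be64(uint8_t *p, uint64_t v)
{
    sketch_put_be32(p, (uint32_t)(v >> 32));
    sketch_put_be32(p + 4, (uint32_t)v);
}

/* Hypothetical helper: encode header plus message into buf.
 * Returns the number of bytes written, or 0 if the message is too long
 * or the buffer is too small. */
static size_t sketch_encode_rep_err(uint8_t *buf, size_t cap, uint32_t opt,
                                    uint32_t type, const char *msg)
{
    size_t len;

    assert(buf != NULL && msg != NULL);
    len = strlen(msg);
    if (len >= SKETCH_MAX_MSG_LEN || cap < SKETCH_REP_HDR_LEN + len) {
        return 0;
    }
    sketch_put_be64(buf, SKETCH_NBD_REP_MAGIC);
    sketch_put_be32(buf + 8, opt);
    sketch_put_be32(buf + 12, type);
    sketch_put_be32(buf + 16, (uint32_t)len);   /* payload length only */
    memcpy(buf + SKETCH_REP_HDR_LEN, msg, len); /* message, no NUL */
    return SKETCH_REP_HDR_LEN + len;
}
```

A client that does not care about the text can still skip it by reading the
length field and dropping that many bytes.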

Signed-off-by: Eric Blake 

---
v6: tweak comments, fix indentation
v5: don't leak 'msg'
v4: new patch
---
 nbd/server.c | 78 +---
 1 file changed, 59 insertions(+), 19 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 14a5bd5..3d39292 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -236,6 +236,38 @@ static int nbd_negotiate_send_rep(QIOChannel *ioc, uint32_t type, uint32_t opt)
 return nbd_negotiate_send_rep_len(ioc, type, opt, 0);
 }

+/* Send an error reply.
+ * Return -errno on error, 0 on success. */
+static int GCC_FMT_ATTR(4, 5)
+nbd_negotiate_send_rep_err(QIOChannel *ioc, uint32_t type,
+   uint32_t opt, const char *fmt, ...)
+{
+va_list va;
+char *msg;
+int ret;
+size_t len;
+
+va_start(va, fmt);
+msg = g_strdup_vprintf(fmt, va);
+va_end(va);
+len = strlen(msg);
+assert(len < 4096);
+TRACE("sending error message \"%s\"", msg);
+ret = nbd_negotiate_send_rep_len(ioc, type, opt, len);
+if (ret < 0) {
+goto out;
+}
+if (nbd_negotiate_write(ioc, msg, len) != len) {
+LOG("write failed (error message)");
+ret = -EIO;
+} else {
+ret = 0;
+}
+out:
+g_free(msg);
+return ret;
+}
+
 /* Send a single NBD_REP_SERVER reply to NBD_OPT_LIST, including payload.
  * Return -errno on error, 0 on success. */
 static int nbd_negotiate_send_rep_list(QIOChannel *ioc, NBDExport *exp)
@@ -281,8 +313,9 @@ static int nbd_negotiate_handle_list(NBDClient *client, uint32_t length)
 if (nbd_negotiate_drop_sync(client->ioc, length) != length) {
 return -EIO;
 }
-return nbd_negotiate_send_rep(client->ioc,
-  NBD_REP_ERR_INVALID, NBD_OPT_LIST);
+return nbd_negotiate_send_rep_err(client->ioc,
+  NBD_REP_ERR_INVALID, NBD_OPT_LIST,
+  "OPT_LIST should not have length");
 }

 /* For each export, send a NBD_REP_SERVER reply. */
@@ -329,7 +362,8 @@ fail:
 return rc;
 }

-
+/* Handle NBD_OPT_STARTTLS. Return NULL to drop connection, or else the
+ * new channel for all further (now-encrypted) communication. */
 static QIOChannel *nbd_negotiate_handle_starttls(NBDClient *client,
  uint32_t length)
 {
@@ -343,7 +377,8 @@ static QIOChannel *nbd_negotiate_handle_starttls(NBDClient *client,
 if (nbd_negotiate_drop_sync(ioc, length) != length) {
 return NULL;
 }
-nbd_negotiate_send_rep(ioc, NBD_REP_ERR_INVALID, NBD_OPT_STARTTLS);
+nbd_negotiate_send_rep_err(ioc, NBD_REP_ERR_INVALID, NBD_OPT_STARTTLS,
+   "OPT_STARTTLS should not have length");
 return NULL;
 }

@@ -473,13 +508,15 @@ static int nbd_negotiate_options(NBDClient *client)
 return -EINVAL;

 default:
-TRACE("Option 0x%" PRIx32 " not permitted before TLS",
-  clientflags);
 if (nbd_negotiate_drop_sync(client->ioc, length) != length) {
 return -EIO;
 }
-ret = nbd_negotiate_send_rep(client->ioc, NBD_REP_ERR_TLS_REQD,
- clientflags);
+ret = nbd_negotiate_send_rep_err(client->ioc,
+ NBD_REP_ERR_TLS_REQD,
+ clientflags,
+ "Option 0x%" PRIx32
+ " not permitted before TLS",
+ clientflags);
 if (ret < 0) {
 return ret;
 }
@@ -505,27 +542,30 @@ static int nbd_negotiate_options(NBDClient *client)
 return -EIO;
 }
 if (client->tlscreds) {
-TRACE("TLS already enabled");
-ret = nbd_negotiate_send_rep(client->ioc,
- NBD_REP_ERR_INVALID,
- clientflags);
+ret = nbd_negotiate_send_rep_err(client->ioc,
+ NBD_REP_ERR_INVALID,
+ clientflags,
+ "TLS already enabled");
 } else {
-TRACE("TLS not configured");
-ret = nbd_negotiate_send_rep(client->ioc,
-

Re: [Qemu-devel] [PATCH 2/5] cpus: use atomic_read to read seqlock-protected variables

2016-10-13 Thread Paolo Bonzini

> Is tsan happy with the way seqlocks are written right now?

I honestly don't know.  But if there are tsan bugs there's
not much we can do.  The alternative below has overhead on
ARM and PPC and does not quite fit in atomic.h.

In any case, a bigger issue is that this patch breaks on
32-bit because it does 64-bit atomic_read.  We might have
to fall back to volatile when not running on tsan.

Paolo

> Dmitry Vyukov wrote:
> > 1. Tsan is bad at handling stand-alone memory barriers.
> > And here is a way to express seqlock that is both correct, is
> > understood by tsan and is no overhead on x86:
> > 
> > // writer
> > atomic_store(, seq+1, memory_order_relaxed);
> > atomic_store([0], ..., memory_order_release);
> > ...
> > atomic_store([N], ..., memory_order_release);
> > atomic_store(, seq+1, memory_order_release);
> > 
> > // reader
> > atomic_load(, memory_order_acquire);
> > d0 = atomic_load([0], memory_order_acquire);
> > ...
> > dN = atomic_load([N], memory_order_acquire);
> > atomic_load(, memory_order_relaxed);
> 
> Source: https://groups.google.com/forum/#!topic/thread-sanitizer/B4i9EMQ4BQE
> 
> Thanks,
> 
>   Emilio
>

Re: [Qemu-devel] [PATCH 3/4] sockets: add AF_VSOCK support

2016-10-13 Thread Michael Roth

Quoting Stefan Hajnoczi (2016-10-12 10:06:10)
> On Fri, Oct 07, 2016 at 11:42:35AM -0500, Michael Roth wrote:
> > Quoting Stefan Hajnoczi (2016-10-06 11:40:17)
> > > Add the AF_VSOCK address family so that qemu-ga will be able to use
> > > virtio-vsock.
> > > 
> > > The AF_VSOCK address family uses <cid, port> address tuples.  The cid is
> > > the unique identifier comparable to an IP address.  AF_VSOCK does not
> > > use name resolution so it's easy to convert between struct sockaddr_vm
> > > and strings.
> > > 
> > > This patch defines a VsockSocketAddress instead of trying to piggy-back
> > > on InetSocketAddress.  This is cleaner in the long run since it avoids
> > > lots of IPv4 vs IPv6 vs vsock special casing.
> > > 
> > > Signed-off-by: Stefan Hajnoczi 
> > > ---
> > >  qapi-schema.json|  23 +-
> > >  util/qemu-sockets.c | 222 
> > > 
> > >  2 files changed, 244 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/qapi-schema.json b/qapi-schema.json
> > > index c3dcf11..8864a96 100644
> > > --- a/qapi-schema.json
> > > +++ b/qapi-schema.json
> > > @@ -987,12 +987,14 @@
> > >  #
> > >  # @unix: unix socket
> > >  #
> > > +# @vsock: vsock family (since 2.8)
> > > +#
> > >  # @unknown: otherwise
> > >  #
> > >  # Since: 2.1
> > >  ##
> > >  { 'enum': 'NetworkAddressFamily',
> > > -  'data': [ 'ipv4', 'ipv6', 'unix', 'unknown' ] }
> > > +  'data': [ 'ipv4', 'ipv6', 'unix', 'vsock', 'unknown' ] }
> > > 
> > >  ##
> > >  # @VncBasicInfo
> > > @@ -3017,6 +3019,24 @@
> > >  'path': 'str' } }
> > > 
> > >  ##
> > > +# @VsockSocketAddress
> > > +#
> > > +# Captures a socket address in the vsock namespace.
> > > +#
> > > +# @cid: unique host identifier
> > > +# @port: port
> > > +#
> > > +# Note that string types are used to allow for possible future hostname or
> > > +# service resolution support.
> > > +#
> > > +# Since 2.8
> > > +##
> > > +{ 'struct': 'VsockSocketAddress',
> > > +  'data': {
> > > +'cid': 'str',
> > > +'port': 'str' } }
> > 
> > Is there any reason to not define these as uint32_t? Not sure if there
> > are other reasons for this, but if it's just for consistency with how
> > Inet is handled, the code seems to do straight atoi()<->printf("%d") to
> > covert between numerical and string representation so it doesn't seem
> > like we need to account for any differences between command-line and
> > internal representation in sockaddr_vm.
> 
> Just in case AF_VSOCK ever supports name and service resolution like
> TCP/IP.  In that case cid could be a host name and port could be a
> service name.
> 
> (I mentioned this in the doc comment.)

Ahh, sorry I missed that completely. Makes perfect sense then.
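
To make the trade-off concrete, here is a minimal sketch of the numeric
conversion being discussed: turning a "cid:port" string such as the "3:1234"
used with qemu-ga into two 32-bit values, with range checking. The sketch_*
helpers are hypothetical and are not the parsing code from patch 3; keeping
the QAPI fields as strings simply leaves room to replace this step with a
name or service lookup later.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse an unsigned decimal number that must be
 * terminated by the character `stop`. Returns 0 on success, -1 on error. */
static int sketch_parse_u32(const char *s, char stop, const char **next,
                            uint32_t *out)
{
    char *end;
    unsigned long long v;

    if (*s < '0' || *s > '9') {
        return -1;              /* reject empty input, signs, whitespace */
    }
    errno = 0;
    v = strtoull(s, &end, 10);
    if (errno != 0 || v > UINT32_MAX || *end != stop) {
        return -1;              /* overflow or unexpected trailing text */
    }
    *out = (uint32_t)v;
    *next = end;
    return 0;
}

/* Hypothetical helper: split "cid:port" into its two numeric parts. */
static int sketch_parse_vsock_addr(const char *str, uint32_t *cid,
                                   uint32_t *port)
{
    const char *p;

    assert(str != NULL && cid != NULL && port != NULL);
    if (sketch_parse_u32(str, ':', &p, cid) < 0) {
        return -1;
    }
    return sketch_parse_u32(p + 1, '\0', &p, port);
}
```

With this shape, "3:1234" yields cid 3 and port 1234, while inputs such as
"host:1234" are rejected, which is exactly the case string-typed fields would
allow a future resolver to handle.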

> 
> Stefan

Re: [Qemu-devel] [PATCH 4/4] qga: add vsock-listen method

2016-10-13 Thread Michael Roth

Quoting Stefan Hajnoczi (2016-10-12 10:07:33)
> On Fri, Oct 07, 2016 at 12:07:41PM -0500, Michael Roth wrote:
> > Quoting Stefan Hajnoczi (2016-10-06 11:40:18)
> > > Add AF_VSOCK (virtio-vsock) support as an alternative to virtio-serial.
> > > 
> > >   $ qemu-system-x86_64 -device vhost-vsock-pci,guest-cid=3 ...
> > >   (guest)# qemu-ga -m vsock-listen -p 3:1234
> > > 
> > > Signed-off-by: Stefan Hajnoczi 
> > 
> > Reviewed-by: Michael Roth 
> > 
> > I still need to get a vsock environment set up to test with, but looks
> > good other than minor comments in patch 3.
> 
> Linux 4.8 has the guest and vhost drivers:
> 
> CONFIG_VSOCKETS=m
> CONFIG_VIRTIO_VSOCKETS=m
> CONFIG_VIRTIO_VSOCKETS_COMMON=m
> CONFIG_VHOST_VSOCK=m

I still need to do some work to get this fully integrated into my
test set up, but I ran these patches through some basic testing
with a 4.8.0 host and 4.9 guest and everything seems to be in
working order. I think Eric had some comments related to parameter
parsing in patch 3, but otherwise looks good.

> 
> Stefan

Re: [Qemu-devel] [PATCH] tests: add mac99 and g3beige in boot-serial-test

2016-10-13 Thread Thomas Huth

On 13.10.2016 21:53, Laurent Vivier wrote:
> g3beige (pmac_oldworld) and mac99 (pmac_newworld) are missing in
> boot-serial-test.
> 
> Perhaps because serial output of OpenBIOS is only enabled with
> '-nographic'

IIRC, I've left them out because they are basically already
tested with the prom-env test. I was a little bit afraid that the
testing time would become too long, but since this test is quite fast,
and it makes sense to check whether the serial output is working, too, I
think it's also OK if you add them here.

Two thoughts though:

1) I think you do *not* need the "-nographic" here, because the test is
using the "-serial" parameter to get the output of the serial console.

2) While you're at it, you could enable the test for a sparc and a
sparc64 machine, too (preferably one that is not tested by the prom-env
test yet)

 Thomas

> Signed-off-by: Laurent Vivier 
> ---
>  tests/boot-serial-test.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
> index d98c564..d477a6a 100644
> --- a/tests/boot-serial-test.c
> +++ b/tests/boot-serial-test.c
> @@ -26,8 +26,12 @@ static testdef_t tests[] = {
>  { "alpha", "clipper", "", "PCI:" },
>  { "ppc", "ppce500", "", "U-Boot" },
>  { "ppc", "prep", "", "Open Hack'Ware BIOS" },
> +{ "ppc", "mac99", "-nographic", "OpenBIOS" },
> +{ "ppc", "g3beige", "-nographic", "OpenBIOS" },
>  { "ppc64", "ppce500", "", "U-Boot" },
>  { "ppc64", "prep", "", "Open Hack'Ware BIOS" },
> +{ "ppc64", "mac99", "-nographic", "OpenBIOS" },
> +{ "ppc64", "g3beige", "-nographic", "OpenBIOS" },
>  { "ppc64", "pseries", "", "Open Firmware" },
>  { "i386", "isapc", "-cpu qemu32 -device sga", "SGABIOS" },
>  { "i386", "pc", "-device sga", "SGABIOS" },
>

[Qemu-devel] [PATCH v3] usb: Change *_exitfn return type from int to void

2016-10-13 Thread Akanksha Srivastava

The *_exitfn functions cannot fail and should not be
returning int.
This also removes the passthru_exitfn since this callback
does nothing as of now.
This was suggested as a Bite-sized task for code cleanup.

Signed-off-by: Akanksha Srivastava
---
 hw/usb/ccid-card-emulated.c   |  3 +--
 hw/usb/ccid-card-passthru.c   |  6 --
 hw/usb/ccid.h |  2 +-
 hw/usb/dev-smartcard-reader.c | 11 +--
 4 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/hw/usb/ccid-card-emulated.c b/hw/usb/ccid-card-emulated.c
index 3213f9f..eceb5f3 100644
--- a/hw/usb/ccid-card-emulated.c
+++ b/hw/usb/ccid-card-emulated.c
@@ -547,7 +547,7 @@ static int emulated_initfn(CCIDCardState *base)
 return 0;
 }

-static int emulated_exitfn(CCIDCardState *base)
+static void emulated_exitfn(CCIDCardState *base)
 {
 EmulatedState *card = EMULATED_CCID_CARD(base);
 VEvent *vevent = vevent_new(VEVENT_LAST, NULL, NULL);
@@ -564,7 +564,6 @@ static int emulated_exitfn(CCIDCardState *base)
 qemu_mutex_destroy(&card->handle_apdu_mutex);
 qemu_mutex_destroy(&card->vreader_mutex);
 qemu_mutex_destroy(&card->event_list_mutex);
-return 0;
 }

 static Property emulated_card_properties[] = {
diff --git a/hw/usb/ccid-card-passthru.c b/hw/usb/ccid-card-passthru.c
index 2eacea7..7209f73 100644
--- a/hw/usb/ccid-card-passthru.c
+++ b/hw/usb/ccid-card-passthru.c
@@ -364,11 +364,6 @@ static int passthru_initfn(CCIDCardState *base)
 return 0;
 }

-static int passthru_exitfn(CCIDCardState *base)
-{
-return 0;
-}
-
 static VMStateDescription passthru_vmstate = {
 .name = "ccid-card-passthru",
 .version_id = 1,
@@ -395,7 +390,6 @@ static void passthru_class_initfn(ObjectClass *klass, void *data)
 CCIDCardClass *cc = CCID_CARD_CLASS(klass);

 cc->initfn = passthru_initfn;
-cc->exitfn = passthru_exitfn;
 cc->get_atr = passthru_get_atr;
 cc->apdu_from_guest = passthru_apdu_from_guest;
 set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
diff --git a/hw/usb/ccid.h b/hw/usb/ccid.h
index 9334da8..1f07011 100644
--- a/hw/usb/ccid.h
+++ b/hw/usb/ccid.h
@@ -33,7 +33,7 @@ typedef struct CCIDCardClass {
 void (*apdu_from_guest)(CCIDCardState *card,
 const uint8_t *apdu,
 uint32_t len);
-int (*exitfn)(CCIDCardState *card);
+void (*exitfn)(CCIDCardState *card);
 int (*initfn)(CCIDCardState *card);
 } CCIDCardClass;

diff --git a/hw/usb/dev-smartcard-reader.c b/hw/usb/dev-smartcard-reader.c
index af4b851..e1940ad 100644
--- a/hw/usb/dev-smartcard-reader.c
+++ b/hw/usb/dev-smartcard-reader.c
@@ -508,14 +508,14 @@ static void ccid_card_apdu_from_guest(CCIDCardState *card,
 }
 }

-static int ccid_card_exitfn(CCIDCardState *card)
+static void ccid_card_exitfn(CCIDCardState *card)
 {
 CCIDCardClass *cc = CCID_CARD_GET_CLASS(card);

 if (cc->exitfn) {
-return cc->exitfn(card);
+cc->exitfn(card);
 }
-return 0;
+
 }

 static int ccid_card_initfn(CCIDCardState *card)
@@ -1279,7 +1279,6 @@ void ccid_card_card_inserted(CCIDCardState *card)

 static int ccid_card_exit(DeviceState *qdev)
 {
-int ret = 0;
 CCIDCardState *card = CCID_CARD(qdev);
 USBDevice *dev = USB_DEVICE(qdev->parent_bus->parent);
 USBCCIDState *s = USB_CCID_DEV(dev);
@@ -1287,9 +1286,9 @@ static int ccid_card_exit(DeviceState *qdev)
 if (ccid_card_inserted(s)) {
 ccid_card_card_removed(card);
 }
-ret = ccid_card_exitfn(card);
+ccid_card_exitfn(card);
 s->card = NULL;
-return ret;
+return 0;
 }

 static int ccid_card_init(DeviceState *qdev)
--
1.9.1

Re: [Qemu-devel] [PATCH v4] timer: a9gtimer: remove loop to auto-increment comparator

2016-10-13 Thread P J P

+-- On Thu, 13 Oct 2016, Peter Maydell wrote --+
| I suggest you try putting in some sample values for the
| various variables to confirm that your new code produces the
| same answers that the old code did.

Yep, sent a revised patch v5. Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F

[Qemu-devel] [PATCH v5] timer: a9gtimer: remove loop to auto-increment comparator

2016-10-13 Thread P J P

From: Prasad J Pandit 

ARM A9MP processor has a peripheral timer with an auto-increment
register, which holds an increment step value. A user could set
this value to zero. When auto-increment control bit is enabled,
it leads to an infinite loop in 'a9_gtimer_update' while
updating comparator value. Remove this loop incrementing the
comparator value.

Reported-by: Li Qiang 
Signed-off-by: Prasad J Pandit 
---
 hw/timer/a9gtimer.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

Update per
  -> https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg02891.html

diff --git a/hw/timer/a9gtimer.c b/hw/timer/a9gtimer.c
index 772f85f..012f10c 100644
--- a/hw/timer/a9gtimer.c
+++ b/hw/timer/a9gtimer.c
@@ -73,6 +73,7 @@ static void a9_gtimer_update(A9GTimerState *s, bool sync)
 
 A9GTimerUpdate update = a9_gtimer_get_update(s);
 int i;
+uint64_t inc;
 int64_t next_cdiff = 0;
 
 for (i = 0; i < s->num_cpu; ++i) {
@@ -82,15 +83,15 @@ static void a9_gtimer_update(A9GTimerState *s, bool sync)
 if ((s->control & R_CONTROL_TIMER_ENABLE) &&
 (gtb->control & R_CONTROL_COMP_ENABLE)) {
 /* R2p0+, where the compare function is >= */
-while (gtb->compare < update.new) {
+if (gtb->compare < update.new) {
 DB_PRINT("Compare event happened for CPU %d\n", i);
 gtb->status = 1;
-if (gtb->control & R_CONTROL_AUTO_INCREMENT) {
-DB_PRINT("Auto incrementing timer compare by %" PRId32 "\n",
- gtb->inc);
-gtb->compare += gtb->inc;
-} else {
-break;
+if (gtb->control & R_CONTROL_AUTO_INCREMENT && gtb->inc) {
+inc = update.new - gtb->compare;
+inc = MAX(QEMU_ALIGN_DOWN(inc, gtb->inc), gtb->inc);
+DB_PRINT("Auto incrementing timer compare by %"
+PRIu64 "\n", inc);
+gtb->compare += inc;
 }
 }
 cdiff = (int64_t)gtb->compare - (int64_t)update.new + 1;
-- 
2.5.5
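
Along the lines of the review suggestion to try sample values, here is a
small standalone sketch that compares the removed loop with a closed-form
replacement. The function names are hypothetical and the closed form here
rounds the distance *up* to a multiple of the increment, which is an
assumption about the intended semantics (it reproduces the loop's result).
By my reading, the MAX(QEMU_ALIGN_DOWN(...)) expression in the patch rounds
down instead, so for compare=10, new=25, inc=4 it would give 22 where the old
loop gives 26; that case seems worth double-checking.

```c
#include <assert.h>
#include <stdint.h>

/* Reference behaviour of the removed loop. Requires inc != 0, otherwise
 * it never terminates (the bug being fixed). */
static uint64_t sketch_compare_loop(uint64_t compare, uint64_t now,
                                    uint32_t inc)
{
    assert(inc != 0);
    while (compare < now) {
        compare += inc;
    }
    return compare;
}

/* Loop-free variant: advance compare by the smallest multiple of inc that
 * brings it to at least `now`. A zero increment leaves compare unchanged. */
static uint64_t sketch_compare_closed(uint64_t compare, uint64_t now,
                                      uint32_t inc)
{
    uint64_t diff, steps;

    if (inc == 0 || compare >= now) {
        return compare;
    }
    diff = now - compare;
    steps = diff / inc + (diff % inc != 0);   /* ceil(diff / inc) */
    return compare + steps * inc;
}
```

Both functions agree on inputs such as (10, 25, 4) and (10, 22, 4), and only
the closed form is safe to call with a zero increment.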

[Qemu-devel] [PATCH] tests: add mac99 and g3beige in boot-serial-test

2016-10-13 Thread Laurent Vivier

g3beige (pmac_oldworld) and mac99 (pmac_newworld) are missing in
boot-serial-test.

Perhaps because serial output of OpenBIOS is only enabled with
'-nographic'

Signed-off-by: Laurent Vivier 
---
 tests/boot-serial-test.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
index d98c564..d477a6a 100644
--- a/tests/boot-serial-test.c
+++ b/tests/boot-serial-test.c
@@ -26,8 +26,12 @@ static testdef_t tests[] = {
 { "alpha", "clipper", "", "PCI:" },
 { "ppc", "ppce500", "", "U-Boot" },
 { "ppc", "prep", "", "Open Hack'Ware BIOS" },
+{ "ppc", "mac99", "-nographic", "OpenBIOS" },
+{ "ppc", "g3beige", "-nographic", "OpenBIOS" },
 { "ppc64", "ppce500", "", "U-Boot" },
 { "ppc64", "prep", "", "Open Hack'Ware BIOS" },
+{ "ppc64", "mac99", "-nographic", "OpenBIOS" },
+{ "ppc64", "g3beige", "-nographic", "OpenBIOS" },
 { "ppc64", "pseries", "", "Open Firmware" },
 { "i386", "isapc", "-cpu qemu32 -device sga", "SGABIOS" },
 { "i386", "pc", "-device sga", "SGABIOS" },
-- 
2.7.4

Re: [Qemu-devel] [PATCH 0/5] More thread sanitizer fixes and atomic.h improvements

2016-10-13 Thread Emilio G. Cota

On Thu, Oct 13, 2016 at 03:39:55 -0400, Paolo Bonzini wrote:
> > On Mon, Oct 10, 2016 at 15:59:02 +0200, Paolo Bonzini wrote:
> > > See each patch.  My attempt at fixing whatever I did when I obviously
> > > didn't know enough^W about the C11 memory model, and at setting a
> > > better example for future generations...
> > 
> > Just for context. Building on this patchset, is it now time to
> > phase out smp_(rw)mb in favour of C11's acq/rel, as you laid
> > out in your KVM Forum talk [*]?
> 
> Yes, this would be the start of it.  However I'm a bit undecided
> because ARMv8 doesn't have acq/rel memory barriers, and its STLR
> opcode is stronger than a store release.
> 
> > What is the plan with smp_mb_(sg)et? It's not clear to me from
> > the slides, but given patch 5 I don't see a reason to keep them.
> 
> No plan for now.  It makes sense to phase out at least atomic_mb_read.
> atomic_mb_set is more efficient on x86 than store+mfence, so there's
> that too.

I see, thanks.

On a related note: can we squeeze the appended patch into this patch set? If
we keep the atomic_mb's, at least there should be a good reason for their
use--this is not the case below.

Emilio

commit cffdc51df4a6346f2b38425f1f1251aa12866fa8
Author: Emilio G. Cota 
Date:   Thu Oct 13 15:06:07 2016 -0400

qht-bench: relax test_start/stop atomic accesses

test_start/stop are used only as flags to loop on. Barriers are unnecessary,
since no dependent data is transferred among threads apart from the flags
themselves.

This commit relaxes the three accesses to test_start/stop that were
not yet relaxed.

Signed-off-by: Emilio G. Cota 

diff --git a/tests/qht-bench.c b/tests/qht-bench.c
index 76360a0..2afa09d 100644
--- a/tests/qht-bench.c
+++ b/tests/qht-bench.c
@@ -193,7 +193,7 @@ static void *thread_func(void *p)
 rcu_register_thread();
 
 atomic_inc(&n_ready_threads);
-while (!atomic_mb_read(&test_start)) {
+while (!atomic_read(&test_start)) {
 cpu_relax();
 }
 
@@ -393,11 +393,11 @@ static void run_test(void)
 while (atomic_read(&n_ready_threads) != n_rw_threads + n_rz_threads) {
 cpu_relax();
 }
-atomic_mb_set(&test_start, true);
+atomic_set(&test_start, true);
 do {
 remaining = sleep(duration);
 } while (remaining);
-atomic_mb_set(&test_stop, true);
+atomic_set(&test_stop, true);
 
 for (i = 0; i < n_rw_threads; i++) {
 qemu_thread_join(&rw_threads[i]);

Re: [Qemu-devel] [PATCH 2/5] cpus: use atomic_read to read seqlock-protected variables

2016-10-13 Thread Emilio G. Cota

On Mon, Oct 10, 2016 at 15:59:04 +0200, Paolo Bonzini wrote:
> There is a data race if the variable is written concurrently to the
> read.  In C11 this has undefined behavior.  Use atomic_read.  The
> write side does not need atomic_set, because it is protected by a
> mutex.

Is tsan happy with the way seqlocks are written right now?

According to this message I just found by Dmitry Vyukov, tsan
shouldn't be. Note however that the message is from April'15,
so it might be outdated:

Dmitry Vyukov wrote:
> 1. Tsan is bad at handling stand-alone memory barriers. 

> And here is a way to express seqlock that is both correct, is 
> understood by tsan and is no overhead on x86: 
> 
> // writer 
> atomic_store(, seq+1, memory_order_relaxed); 
> atomic_store([0], ..., memory_order_release); 
> ... 
> atomic_store([N], ..., memory_order_release); 
> atomic_store(, seq+1, memory_order_release); 
> 
> // reader 
> atomic_load(, memory_order_acquire); 
> d0 = atomic_load([0], memory_order_acquire); 
> ... 
> dN = atomic_load([N], memory_order_acquire); 
> atomic_load(, memory_order_relaxed); 

Source: https://groups.google.com/forum/#!topic/thread-sanitizer/B4i9EMQ4BQE
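
For reference, a minimal C11 sketch of the barrier-free pattern quoted above,
written with <stdatomic.h>. The struct layout, the sketch_* names, the field
names seq and data, and the retry loop in the reader are my assumptions; the
quoted message only shows the sequence of loads and stores. Writers are
assumed to be serialized externally (for example by a mutex), as in the patch
under discussion. Note that _Atomic uint64_t may not be lock-free on 32-bit
hosts.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_N 2   /* number of protected words; arbitrary for the sketch */

struct sketch_seqlock {
    atomic_uint seq;                  /* odd while a write is in progress */
    _Atomic uint64_t data[SKETCH_N];  /* the protected payload */
};

static void sketch_init(struct sketch_seqlock *sl)
{
    atomic_init(&sl->seq, 0);
    for (int i = 0; i < SKETCH_N; i++) {
        atomic_init(&sl->data[i], 0);
    }
}

/* Writer side: relaxed bump, release stores of the data, release bump. */
static void sketch_write(struct sketch_seqlock *sl, const uint64_t *src)
{
    unsigned s = atomic_load_explicit(&sl->seq, memory_order_relaxed);

    atomic_store_explicit(&sl->seq, s + 1, memory_order_relaxed);
    for (int i = 0; i < SKETCH_N; i++) {
        atomic_store_explicit(&sl->data[i], src[i], memory_order_release);
    }
    atomic_store_explicit(&sl->seq, s + 2, memory_order_release);
}

/* Reader side: acquire load of seq, acquire loads of the data, relaxed
 * re-check of seq. Retries while a write is in progress or has intervened. */
static void sketch_read(struct sketch_seqlock *sl, uint64_t *dst)
{
    unsigned s1, s2;

    do {
        s1 = atomic_load_explicit(&sl->seq, memory_order_acquire);
        for (int i = 0; i < SKETCH_N; i++) {
            dst[i] = atomic_load_explicit(&sl->data[i], memory_order_acquire);
        }
        s2 = atomic_load_explicit(&sl->seq, memory_order_relaxed);
    } while ((s1 & 1) || s1 != s2);
}
```

Because every access is an atomic operation with an explicit ordering, there
are no stand-alone fences for a race detector to misinterpret.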

Thanks,

Emilio

Re: [Qemu-devel] [PATCH v3 0/4] target-arm: Handle tagged addresses when loading PC

2016-10-13 Thread Tom Hanson

On 10/12/2016 01:50 PM, Thomas Hanson wrote:
...
> 
>   Still looking into handling of tagged addresses for exceptions and
>   exception returns.  Will handle that as a separate patch set.

Peter,

Looking at arm_cpu_do_interrupt_aarch64() and the ARM spec, the new PC value is 
always an offset from the appropriate VBAR. The only place I can find the
VBAR being set is at boot time (i.e. UEFI).

Can the boot code use a tagged pointer to specify the VBAR?

Is there some other place/time when the VBAR can be modified post-boot?

Thanks,
Tom

Re: [Qemu-devel] [PATCH] Fix build for less common build directories names

2016-10-13 Thread Peter Maydell

On 13 October 2016 at 19:29, Stefan Weil  wrote:
> scripts/tracetool generates a C preprocessor macro from the name of the
> build directory. Any characters which are possible in a directory name
> but not allowed in a macro name must be substituted, otherwise builds
> will fail.
>
> Signed-off-by: Stefan Weil 
> ---
>
> I had problems with a build directory of the form "host,variant".
> Is this fix needed for stable, too?

Why does it need to care about the build directory name at all?
Ideally builds should be entirely deterministically reproducible
whatever the path to the source or build directory names is...

thanks
-- PMM

Re: [Qemu-devel] [PATCH 2/4] block/ssh: Add InetSocketAddress and accept it

2016-10-13 Thread Ashijeet Acharya

On Tue, Oct 11, 2016 at 1:07 PM, Ashijeet Acharya
 wrote:
> Add InetSocketAddress compatibility to SSH driver.
>
> Add a new option "server" to the SSH block driver which then accepts
> an InetSocketAddress.
>
> "host" and "port" are supported as legacy options and are mapped to
> their InetSocketAddress representation.
>
> Signed-off-by: Ashijeet Acharya 
> ---
>  block/ssh.c | 87 
> ++---
>  1 file changed, 78 insertions(+), 9 deletions(-)
>
> diff --git a/block/ssh.c b/block/ssh.c
> index 75cb7bc..702871a 100644
> --- a/block/ssh.c
> +++ b/block/ssh.c
> @@ -32,8 +32,11 @@
>  #include "qemu/error-report.h"
>  #include "qemu/sockets.h"
>  #include "qemu/uri.h"
> +#include "qapi-visit.h"
>  #include "qapi/qmp/qint.h"
>  #include "qapi/qmp/qstring.h"
> +#include "qapi/qmp-input-visitor.h"
> +#include "qapi/qmp-output-visitor.h"
>
>  /* DEBUG_SSH=1 enables the DPRINTF (debugging printf) statements in
>   * this block driver code.
> @@ -74,6 +77,8 @@ typedef struct BDRVSSHState {
>   */
>  LIBSSH2_SFTP_ATTRIBUTES attrs;
>
> +InetSocketAddress *inet;
> +
>  /* Used to warn if 'flush' is not supported. */
>  char *hostport;
>  bool unsafe_flush_warning;
> @@ -263,7 +268,9 @@ static bool ssh_has_filename_options_conflict(QDict *options, Error **errp)
>  !strcmp(qe->key, "port") ||
>  !strcmp(qe->key, "path") ||
>  !strcmp(qe->key, "user") ||
> -!strcmp(qe->key, "host_key_check"))
> +!strcmp(qe->key, "host_key_check") ||
> +!strcmp(qe->key, "server") ||
> +!strncmp(qe->key, "server.", 7))
>  {
>  error_setg(errp, "Option '%s' cannot be used with a file name",
> qe->key);
> @@ -555,13 +562,71 @@ static QemuOptsList ssh_runtime_opts = {
>  },
>  };
>
> +static bool ssh_process_legacy_socket_options(QDict *output_opts,
> +  QemuOpts *legacy_opts,
> +  Error **errp)
> +{
> +const char *host = qemu_opt_get(legacy_opts, "host");
> +const char *port = qemu_opt_get(legacy_opts, "port");
> +
> +if (!host && port) {
> +error_setg(errp, "port may not be used without host");
> +return false;
> +}
> +
> +if (!host) {
> +error_setg(errp, "No hostname was specified");
> +return false;

I was debugging this part with gdb while making the changes for v2,
and I hit something very strange.
The code always fails with the error "No hostname was specified". To be
clear, when I reverted to the original driver state, the problem did
not appear with the same qemu command line, and I can't find the bug.

Command I used:

$ ./bin/qemu-system-x86_64 -m 512 -drive file=ssh://ashijeet@127.0.0.1/home/ashijeet/qemu_build/test.qcow2,if=virtio

Is there something wrong with the command line?

Ashijeet

[Qemu-devel] [PATCH] Fix build for less common build directories names

2016-10-13 Thread Stefan Weil

scripts/tracetool generates a C preprocessor macro from the name of the
build directory. Any characters which are possible in a directory name
but not allowed in a macro name must be substituted, otherwise builds
will fail.

Signed-off-by: Stefan Weil 
---

I had problems with a build directory of the form "host,variant".
Is this fix needed for stable, too?

Regards
Stefan W.

 scripts/tracetool.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/tracetool.py b/scripts/tracetool.py
index 629b259..fe9c9e9 100755
--- a/scripts/tracetool.py
+++ b/scripts/tracetool.py
@@ -70,7 +70,7 @@ def make_group_name(filename):
 
     if dirname == "":
         return "common"
-    return re.sub(r"/|-", "_", dirname)
+    return re.sub(r"[^A-Za-z0-9]", "_", dirname)
 
 def main(args):
     global _SCRIPT
-- 
2.1.4
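
The effect of the changed pattern can be checked in isolation. The following is only a sketch: group_name() and is_macro_safe() are hypothetical helpers that mimic the substitution step of make_group_name(), and the directory name is an illustrative value modelled on the "host,variant" case mentioned above.

```python
import re

# Patterns before and after the patch, copied from the diff.
OLD_PATTERN = r"/|-"
NEW_PATTERN = r"[^A-Za-z0-9]"

# Characters that may appear inside a C preprocessor macro name.
MACRO_CHARS = re.compile(r"[A-Za-z0-9_]+")


def group_name(dirname, pattern):
    """Mimic the substitution step of make_group_name() for a given pattern."""
    if dirname == "":
        return "common"
    return re.sub(pattern, "_", dirname)


def is_macro_safe(name):
    """Return True if every character of name is allowed in a macro name."""
    return MACRO_CHARS.fullmatch(name) is not None


dirname = "host,variant/hw-block"  # illustrative directory name
old = group_name(dirname, OLD_PATTERN)
new = group_name(dirname, NEW_PATTERN)
print(old, is_macro_safe(old))
print(new, is_macro_safe(new))
```

With the old pattern the comma survives into the generated name; with the new pattern every character outside A-Z, a-z and 0-9 becomes an underscore.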

Re: [Qemu-devel] [PATCH v1 3/3] target-ppc: implement xxbr[qdwh] instruction

2016-10-13 Thread Richard Henderson


On 10/12/2016 07:21 PM, David Gibson wrote:

+static void gen_bswap32x4(TCGv_i64 outh, TCGv_i64 outl,
+                          TCGv_i64 inh, TCGv_i64 inl)
+{
+    TCGv_i64 hi = tcg_temp_new_i64();
+    TCGv_i64 lo = tcg_temp_new_i64();
+
+    tcg_gen_bswap64_i64(hi, inh);
+    tcg_gen_bswap64_i64(lo, inl);
+    tcg_gen_shri_i64(outh, hi, 32);
+    tcg_gen_deposit_i64(outh, outh, hi, 32, 32);
+    tcg_gen_shri_i64(outl, lo, 32);
+    tcg_gen_deposit_i64(outl, outl, lo, 32, 32);
+
+    tcg_temp_free_i64(hi);
+    tcg_temp_free_i64(lo);
+}


Is there actually any advantage to having this 128-bit operation
working on two 64-bit "register"s, as opposed to having a bswap32x2
that operates on a single 64-bit register and calling it twice?


For this one, no particular advantage.


+gen_bswap16x8(xth, xtl, xbh, xbl);


Likewise for the 16x8 version, I guess, although that would mean
changing the existing users.


For this one, we have to build a 64-bit constant, 0x00ff00ff00ff00ff.  On some 
hosts that's up to 6 insns.  Being able to reuse that for both swaps is useful.



+tcg_gen_bswap64_i64(t0, xbl);
+tcg_gen_bswap64_i64(xtl, xbh);
+tcg_gen_bswap64_i64(xth, t0);


This looks wrong.  You swap xbl as you move it to t0, then swap it
again as you put it back into xth.  So it looks like you'll translate
   0011223344556677 8899AABBCCDDEEFF
to
   8899AABBCCDDEEFF 7766554433221100
whereas it should become
   FFEEDDCCBBAA9988 7766554433221100


Indeed, the third line should be a move, not a swap.


r~
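
The swap sequences discussed in this thread can be modelled with plain integers. This is a sketch only: the Python function names are made up for illustration, and each function just mirrors the arithmetic of the TCG ops quoted above (bswap64, shri, deposit), not the generated host code.

```python
MASK64 = (1 << 64) - 1


def bswap64(x):
    """Reverse the eight bytes of a 64-bit value (models tcg_gen_bswap64_i64)."""
    return int.from_bytes(x.to_bytes(8, "big"), "little")


def deposit64(base, value, ofs, length):
    """Insert the low `length` bits of value into base at bit offset ofs."""
    field = (1 << length) - 1
    return (base & ~(field << ofs) & MASK64) | ((value & field) << ofs)


def bswap32x2(x):
    """Byte-swap each 32-bit word in place: bswap64, shift right, deposit."""
    t = bswap64(x)
    out = t >> 32
    return deposit64(out, t, 32, 32)


def bswap16x4(x, mask=0x00FF00FF00FF00FF):
    """Byte-swap each 16-bit halfword using the shared 64-bit constant."""
    return ((x >> 8) & mask) | ((x & mask) << 8)


def xxbrq_as_posted(xbh, xbl):
    """Three swaps: the low input is swapped twice and comes back unchanged."""
    t0 = bswap64(xbl)
    xtl = bswap64(xbh)
    xth = bswap64(t0)
    return xth, xtl


def xxbrq_fixed(xbh, xbl):
    """The third step is a plain move, giving a full 128-bit byte reversal."""
    t0 = bswap64(xbl)
    xtl = bswap64(xbh)
    xth = t0
    return xth, xtl


hi, lo = 0x0011223344556677, 0x8899AABBCCDDEEFF
print("%016X %016X" % xxbrq_as_posted(hi, lo))
print("%016X %016X" % xxbrq_fixed(hi, lo))
```

The first call leaves the original low doubleword untouched in the high half of the result; the second reverses all sixteen bytes.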

Re: [Qemu-devel] [PATCH v7 00/22] qcow2: persistent dirty bitmaps

2016-10-13 Thread John Snow




On 10/01/2016 09:37 AM, Max Reitz wrote:

On 30.09.2016 12:53, Vladimir Sementsov-Ogievskiy wrote:

v7:
https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=refs%2Ftags%2Fqcow2-bitmap-v7
based on block-next (https://github.com/XanClic/qemu/commits/block-next)


It should be noted that (at least my) block-next is only valid during
freeze, after that it becomes stale. I assume the patches you require
from my block-next branch are the ones from the "Dirty bitmap changes
for migration/persistence work" series, which I had to drop from my pull
requests, however, due to some issues on Big Endian machines.

The only reason they are still in my block-next branch is, as I said,
that that branch is just stale. I won't overwrite it for the time being,
though, so this series can still be applied on top.

Max



Lemme fix up the 32bit problems and resend that out right now.
(At least it's not lock correctness.)

--js

Re: [Qemu-devel] [PATCH] machine: Fix replacement of '_' by '-' in machine property names

2016-10-13 Thread Eduardo Habkost

On Thu, Oct 13, 2016 at 06:44:14PM +0200, Markus Armbruster wrote:
> machine_set_property() replaces '_' by '-' in the property name.
> Except it fails to replace an initial '_'.  Screwed up in commit
> b0ddb8b.  Reproducer: "-M pc,__foo_bar=true" produces "Property
> '._-foo-bar' not found".
> 
> Error messages using a mangled name rather than the name the user
> actually wrote is user-hostile, but that's a different topic.
> 
> Signed-off-by: Markus Armbruster 

Reviewed-by: Eduardo Habkost 

I suggest we follow the same approach we used in the x86 CPU
code: instead of requiring a special parser that magically
translates strings, just add property aliases for the old names
that contained "_". It would fix the user-hostile error messages
as well.


> ---
>  vl.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/vl.c b/vl.c
> index c657acd..1c0b0ba 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2804,17 +2804,16 @@ static int machine_set_property(void *opaque,
>  {
>      Object *obj = OBJECT(opaque);
>      Error *local_err = NULL;
> -    char *c, *qom_name;
> +    char *p, *qom_name;
>  
>      if (strcmp(name, "type") == 0) {
>          return 0;
>      }
>  
>      qom_name = g_strdup(name);
> -    c = qom_name;
> -    while (*c++) {
> -        if (*c == '_') {
> -            *c = '-';
> +    for (p = qom_name; *p; p++) {
> +        if (*p == '_') {
> +            *p = '-';
>          }
>      }
>  
>  
> -- 
> 2.5.5
> 

-- 
Eduardo
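
Why the old loop misses an initial '_' can be seen by modelling the two loops from the quoted patch. This is a sketch, not QEMU code: mangle_as_before() and mangle_fixed() are hypothetical names, and the C string is modelled as a list of characters with an explicit terminating NUL.

```python
def mangle_as_before(name):
    """Model of `c = qom_name; while (*c++) { if (*c == '_') *c = '-'; }`."""
    buf = list(name) + ["\0"]  # C string with its terminating NUL
    i = 0
    while True:
        current = buf[i]       # value tested by `*c++`
        i += 1                 # the pointer advances before the body runs
        if current == "\0":
            break
        if buf[i] == "_":      # so the body only ever sees the next character
            buf[i] = "-"
    return "".join(buf[:-1])


def mangle_fixed(name):
    """Model of `for (p = qom_name; *p; p++)`: every character is examined."""
    return "".join("-" if ch == "_" else ch for ch in name)


print(mangle_as_before("__foo_bar"))
print(mangle_fixed("__foo_bar"))
```

The first character is tested by the loop condition but never by the body, which is why "__foo_bar" ends up as "_-foo-bar" in the reproducer.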

Re: [Qemu-devel] [PATCH] machine: Fix replacement of '_' by '-' in machine property names

2016-10-13 Thread Eduardo Habkost

On Thu, Oct 13, 2016 at 06:44:14PM +0200, Markus Armbruster wrote:
> machine_set_property() replaces '_' by '-' in the property name.
> Except it fails to replace an initial '_'.  Screwed up in commit
> b0ddb8b.  Reproducer: "-M pc,__foo_bar=true" produces "Property
> '._-foo-bar' not found".
> 
> Error messages using a mangled name rather than the name the user
> actually wrote is user-hostile, but that's a different topic.
> 
> Signed-off-by: Markus Armbruster 

Applied to machine-next. Thanks.

-- 
Eduardo

[Qemu-devel] [PATCH 16/18] iothread: release AioContext around aio_poll

2016-10-13 Thread Paolo Bonzini

This is the first step towards having fine-grained critical sections in
dataplane threads, which will resolve lock ordering problems between
address_space_* functions (which need the BQL when doing MMIO, even
after we complete RCU-based dispatch) and the AioContext.

Because AioContext does not use contention callbacks anymore, the
unit test has to be changed.

Previously applied as a0710f7995f914e3044e5899bd8ff6c43c62f916 and
then reverted.

Reviewed-by: Fam Zheng 
Signed-off-by: Paolo Bonzini 
---
 async.c | 22 +++---
 docs/multiple-iothreads.txt | 40 +++-
 include/block/aio.h |  3 ---
 iothread.c  | 11 ++-
 tests/test-aio.c| 22 ++
 5 files changed, 42 insertions(+), 56 deletions(-)

diff --git a/async.c b/async.c
index fb37b03..27db772 100644
--- a/async.c
+++ b/async.c
@@ -107,8 +107,8 @@ int aio_bh_poll(AioContext *ctx)
  * aio_notify again if necessary.
  */
 if (atomic_xchg(&bh->scheduled, 0)) {
-/* Idle BHs and the notify BH don't count as progress */
-if (!bh->idle && bh != ctx->notify_dummy_bh) {
+/* Idle BHs don't count as progress */
+if (!bh->idle) {
 ret = 1;
 }
 bh->idle = 0;
@@ -260,7 +260,6 @@ aio_ctx_finalize(GSource *source)
 {
 AioContext *ctx = (AioContext *) source;
 
-qemu_bh_delete(ctx->notify_dummy_bh);
 thread_pool_free(ctx->thread_pool);
 
 #ifdef CONFIG_LINUX_AIO
@@ -346,19 +345,6 @@ static void aio_timerlist_notify(void *opaque)
 aio_notify(opaque);
 }
 
-static void aio_rfifolock_cb(void *opaque)
-{
-AioContext *ctx = opaque;
-
-/* Kick owner thread in case they are blocked in aio_poll() */
-qemu_bh_schedule(ctx->notify_dummy_bh);
-}
-
-static void notify_dummy_bh(void *opaque)
-{
-/* Do nothing, we were invoked just to force the event loop to iterate */
-}
-
 static void event_notifier_dummy_cb(EventNotifier *e)
 {
 }
@@ -386,11 +372,9 @@ AioContext *aio_context_new(Error **errp)
 #endif
 ctx->thread_pool = NULL;
 qemu_mutex_init(&ctx->bh_lock);
-rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
+rfifolock_init(&ctx->lock, NULL, NULL);
 timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
 
-ctx->notify_dummy_bh = aio_bh_new(ctx, notify_dummy_bh, NULL);
-
 return ctx;
 fail:
 g_source_destroy(&ctx->source);
diff --git a/docs/multiple-iothreads.txt b/docs/multiple-iothreads.txt
index 40b8419..0e7cdb2 100644
--- a/docs/multiple-iothreads.txt
+++ b/docs/multiple-iothreads.txt
@@ -105,13 +105,10 @@ a BH in the target AioContext beforehand and then call qemu_bh_schedule().  No
 acquire/release or locking is needed for the qemu_bh_schedule() call.  But be
 sure to acquire the AioContext for aio_bh_new() if necessary.
 
-The relationship between AioContext and the block layer

-The AioContext originates from the QEMU block layer because it provides a
-scoped way of running event loop iterations until all work is done.  This
-feature is used to complete all in-flight block I/O requests (see
-bdrv_drain_all()).  Nowadays AioContext is a generic event loop that can be
-used by any QEMU subsystem.
+AioContext and the block layer
+--
+The AioContext originates from the QEMU block layer, even though nowadays
+AioContext is a generic event loop that can be used by any QEMU subsystem.
 
 The block layer has support for AioContext integrated.  Each BlockDriverState
 is associated with an AioContext using bdrv_set_aio_context() and
@@ -122,13 +119,22 @@ Block layer code must therefore expect to run in an IOThread and avoid using
 old APIs that implicitly use the main loop.  See the "How to program for
 IOThreads" above for information on how to do that.
 
-If main loop code such as a QMP function wishes to access a BlockDriverState it
-must first call aio_context_acquire(bdrv_get_aio_context(bs)) to ensure the
-IOThread does not run in parallel.
-
-Long-running jobs (usually in the form of coroutines) are best scheduled in the
-BlockDriverState's AioContext to avoid the need to acquire/release around each
-bdrv_*() call.  Be aware that there is currently no mechanism to get notified
-when bdrv_set_aio_context() moves this BlockDriverState to a different
-AioContext (see bdrv_detach_aio_context()/bdrv_attach_aio_context()), so you
-may need to add this if you want to support long-running jobs.
+If main loop code such as a QMP function wishes to access a BlockDriverState
+it must first call aio_context_acquire(bdrv_get_aio_context(bs)) to ensure
+that callbacks in the IOThread do not run in parallel.
+
+Code running in the monitor typically needs to ensure that past
+requests from the guest are completed.  When a block device is running
+in an IOThread, the IOThread can also process

[Qemu-devel] dpdk/vpp and cross-version migration for vhost

2016-10-13 Thread Michael S. Tsirkin

Hi!
So it looks like we face a problem with cross-version
migration when using vhost. It's not new but became more
acute with the advent of vhost user.

For users to be able to migrate between different versions
of the hypervisor the interface exposed to guests
by hypervisor must stay unchanged.

The problem is that a qemu device is connected
to a backend in another process, so the interface
exposed to guests depends on the capabilities of that
process.

Specifically, for vhost user interface based on virtio, this includes
the "host features" bitmap that defines the interface, as well as more
host values such as the max ring size.  Adding new features/changing
values to this interface is required to make progress, but on the other
hand we need the ability to get the old host features to be compatible.

To solve this problem within qemu, qemu has a versioning system based on
a machine type concept which fundamentally is a version string, by
specifying that string one can get hardware compatible with a previous
qemu version. QEMU also reports the latest version and list of versions
supported so libvirt records the version at VM creation and then is
careful to use this machine version whenever it migrates a VM.

One might wonder how this is solved with a kernel vhost backend. The
answer is that it mostly isn't - instead an assumption is made, that
qemu versions are deployed together with the kernel - this is generally
true for downstreams.  Thus whenever qemu gains a new feature, it is
already supported by the kernel as well.  However, if one attempts
migration with a new qemu from a system with a new kernel to one with
an old kernel, one would get a failure.

In the world where we have multiple userspace backends, with some of
these supplied by ISVs, this seems unrealistic.

IMO we need to support vhost backend versioning, ideally
in a way that will also work for vhost kernel backends.

So I'd like to get some input from both backend and management
developers on what a good solution would look like.

If we want to emulate the qemu solution, this involves adding the
concept of interface versions to dpdk.  For example, dpdk could supply a
file (or utility printing?) with a list of versions: latest and versions
supported. libvirt could read that and
- store latest version at vm creation
- pass it around with the vm
- pass it to qemu

From here, qemu could pass this over the vhost-user channel,
thus making sure it's initialized with the correct
compatible interface.

As version here is an opaque string for libvirt and qemu,
anything can be used - but I suggest either a list
of values defining the interface, e.g.
any_layout=on,max_ring=256
or a version including the name and vendor of the backend,
e.g. "org.dpdk.v4.5.6".

Note that typically the list of supported versions can only be
extended, not shrunk. Also, if the host/guest interface
does not change, don't change the current version as
this just creates work for everyone.

Thoughts? Would this work well for management? dpdk? vpp?

Thanks!

-- 
MST
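
The flow proposed above could look roughly like the following sketch. Everything here is hypothetical: the Backend class, the function names, and the second version string "org.dpdk.v4.6.0" are made up for illustration; only the two example version formats come from the mail.

```python
def parse_interface(version):
    """Split a value-list version such as 'any_layout=on,max_ring=256'."""
    return dict(item.split("=", 1) for item in version.split(","))


class Backend:
    """Hypothetical data a backend could publish: latest and supported versions."""

    def __init__(self, latest, supported):
        self.latest = latest
        self.supported = set(supported) | {latest}


def version_for_new_vm(backend):
    """Management stores the backend's latest version at VM creation."""
    return backend.latest


def can_migrate(vm_version, destination):
    """The version is an opaque string: the destination either lists it or not."""
    return vm_version in destination.supported


old_host = Backend("org.dpdk.v4.5.6", [])
new_host = Backend("org.dpdk.v4.6.0", ["org.dpdk.v4.5.6"])

vm_on_old = version_for_new_vm(old_host)
vm_on_new = version_for_new_vm(new_host)
print(can_migrate(vm_on_old, new_host))  # VM created on the old host
print(can_migrate(vm_on_new, old_host))  # VM created on the new host
print(parse_interface("any_layout=on,max_ring=256"))
```

Because the list of supported versions only grows, a VM created against an older backend stays migratable to newer ones, while the reverse direction is refused up front instead of failing mid-migration.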

Re: [Qemu-devel] [PATCH] machine: Fix replacement of '_' by '-' in machine property names

2016-10-13 Thread Eric Blake

On 10/13/2016 11:44 AM, Markus Armbruster wrote:
> machine_set_property() replaces '_' by '-' in the property name.
> Except it fails to replace an initial '_'.  Screwed up in commit
> b0ddb8b.  Reproducer: "-M pc,__foo_bar=true" produces "Property
> '._-foo-bar' not found".
> 
> Error messages using a mangled name rather than the name the user
> actually wrote is user-hostile, but that's a different topic.
> 
> Signed-off-by: Markus Armbruster 
> ---
>  vl.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



