Re: [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes

2019-08-13 Thread Tao Xu

On 8/14/2019 10:39 AM, Dan Williams wrote:

On Tue, Aug 13, 2019 at 8:00 AM Igor Mammedov  wrote:


On Fri,  9 Aug 2019 14:57:25 +0800
Tao  wrote:


From: Tao Xu 

In ACPI 6.3 chapter 5.2.27 Heterogeneous Memory Attribute Table (HMAT),
the initiator represents a processor that accesses memory. In 5.2.27.3
Memory Proximity Domain Attributes Structure, the attached initiator is
defined as the place where the memory controller responsible for a memory
proximity domain resides. With attached initiator information, the topology
of heterogeneous memory can be described.

Extend the "-numa node" CLI option to indicate the initiator NUMA node-id.
In the Linux kernel, the code in drivers/acpi/hmat/hmat.c parses and reports
the platform's HMAT tables.

Reviewed-by: Jingqi Liu 
Suggested-by: Dan Williams 
Signed-off-by: Tao Xu 
---

No changes in v9
---
  hw/core/machine.c | 24 
  hw/core/numa.c| 13 +
  include/sysemu/numa.h |  3 +++
  qapi/machine.json |  6 +-
  qemu-options.hx   | 27 +++
  5 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 3c55470103..113184a9df 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -640,6 +640,7 @@ void machine_set_cpu_numa_node(MachineState *machine,
 const CpuInstanceProperties *props, Error 
**errp)
  {
  MachineClass *mc = MACHINE_GET_CLASS(machine);
+NodeInfo *numa_info = machine->numa_state->nodes;
  bool match = false;
  int i;

@@ -709,6 +710,16 @@ void machine_set_cpu_numa_node(MachineState *machine,
  match = true;
  slot->props.node_id = props->node_id;
  slot->props.has_node_id = props->has_node_id;
+
+if (numa_info[props->node_id].initiator_valid &&
+(props->node_id != numa_info[props->node_id].initiator)) {
+error_setg(errp, "The initiator of CPU NUMA node %" PRId64
+   " should be itself.", props->node_id);
+return;
+}
+numa_info[props->node_id].initiator_valid = true;
+numa_info[props->node_id].has_cpu = true;
+numa_info[props->node_id].initiator = props->node_id;
  }

  if (!match) {
@@ -1050,6 +1061,7 @@ static void machine_numa_finish_cpu_init(MachineState 
*machine)
  GString *s = g_string_new(NULL);
  MachineClass *mc = MACHINE_GET_CLASS(machine);
  const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(machine);
+NodeInfo *numa_info = machine->numa_state->nodes;

  assert(machine->numa_state->num_nodes);
  for (i = 0; i < possible_cpus->len; i++) {
@@ -1083,6 +1095,18 @@ static void machine_numa_finish_cpu_init(MachineState 
*machine)
  machine_set_cpu_numa_node(machine, &props, &error_fatal);
  }
  }
+
+for (i = 0; i < machine->numa_state->num_nodes; i++) {
+if (numa_info[i].initiator_valid &&
+!numa_info[numa_info[i].initiator].has_cpu) {

   ^^ possible out of bounds read, 
see below


+error_report("The initiator-id %"PRIu16 " of NUMA node %d"
+ " does not exist.", numa_info[i].initiator, i);
+error_printf("\n");
+
+exit(1);
+}

it takes care only about nodes that have cpus or memory-only ones that have
initiator explicitly provided on CLI. And leaves possibility to have
memory-only nodes without initiator mixed with nodes that have initiator.
Is it valid to have mixed configuration?
Should we forbid it?


The spec talks about the "Proximity Domain for the Attached Initiator"
field only being valid if the memory controller for the memory can be
identified by an initiator id in the SRAT. So I expect the only way to
define a memory proximity domain without this local initiator is to
allow specifying a node-id that does not have an entry in the SRAT.


Hi Dan,

So there may be situations in which the Attached Initiator field is not
valid? If so, I would let the user specify that the initiator is invalid.


That would be a useful feature for testing OS HMAT parsing behavior,
and may match platforms that exist in practice.




+}
+
  if (s->len && !qtest_enabled()) {
  warn_report("CPU(s) not present in any NUMA nodes: %s",
  s->str);
diff --git a/hw/core/numa.c b/hw/core/numa.c
index 8fcbba05d6..cfb6339810 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -128,6 +128,19 @@ static void parse_numa_node(MachineState *ms, 
NumaNodeOptions *node,
  numa_info[nodenr].node_mem = object_property_get_uint(o, "size", 
NULL);
  numa_info[nodenr].node_memdev = MEMORY_BACKEND(o);
  }
+
+if (node->has_initiator) {
+if (numa_info[nodenr].initiator_valid &&
+(node->initiator != numa_info[nodenr].initiator)) {
+error_setg(errp, "The initiator of NUMA node %" PRIu16 " has been "
+   "set to node %" 

Re: [Qemu-devel] [Qemu-ppc] [GIT PULL for qemu-pseries REPOST] pseries: Update SLOF firmware image

2019-08-13 Thread Aravinda Prasad



On Tuesday 13 August 2019 07:47 PM, David Gibson wrote:
> On Tue, Aug 13, 2019 at 01:00:24PM +0530, Aravinda Prasad wrote:
>>
>>
>> On Monday 12 August 2019 03:38 PM, David Gibson wrote:
>>> On Mon, Aug 05, 2019 at 02:14:39PM +0530, Aravinda Prasad wrote:
>>>> Alexey/David,
>>>>
>>>> With the SLOF changes, QEMU cannot resize the RTAS blob. Resizing is
>>>> required for FWNMI support which extends the RTAS blob to include an
>>>> error log upon a machine check.
>>>>
>>>> The check that validates the RTAS buffer fails in the guest because the
>>>> rtas-size updated in QEMU is not reflected in the guest.
>>>>
>>>> Any workaround for this?
>>>
>>> Well, we should still be able to do it, it just means fwnmi would need
>>> a SLOF change.  It's an inconvenience, but not really a big deal.
>>
>> Yes. Alexey and I were discussing the following change to SLOF:
>>
>> diff --git a/lib/libhvcall/hvcall.S b/lib/libhvcall/hvcall.S
>> index b19f6dbeff2c..880d29a29122 100644
>> --- a/lib/libhvcall/hvcall.S
>> +++ b/lib/libhvcall/hvcall.S
>> @@ -134,6 +134,7 @@ ENTRY(hv_rtas)
>> ori r3,r3,KVMPPC_H_RTAS@l
>> HVCALL
>> blr
>> +.space 2048
>> .globl hv_rtas_size
>>  hv_rtas_size:
>> .long . - hv_rtas;
>>
>>
>> But this will statically reserve space for RTAS even when
>> SPAPR_CAP_FWNMI_MCE is OFF.
> 
> Sure.  We could flag that in the DT somehow, and have SLOF reserve the
> space conditionally.
> 
> Or we could just ignore it. 2 KiB is minuscule compared to our minimum
> guest size, and our current RTAS is microscopic compared to PowerVM.

I agree; 2 KiB is minuscule, so we can allocate it statically.

Alexey,

Can you please include the above one-line fix in SLOF?

> 
> 

-- 
Regards,
Aravinda



Re: [Qemu-devel] [PATCH] spapr/xive: Mask the EAS when allocating an IRQ

2019-08-13 Thread David Gibson
On Tue, Aug 13, 2019 at 05:46:04PM +0100, Peter Maydell wrote:
> On Tue, 13 Aug 2019 at 17:44, Cédric Le Goater  wrote:
> >
> > If an IRQ is allocated and not configured, such as an MSI requested by
> > a PCI driver, it can be saved in its default state and possibly later
> > on restored using the same state. If not initially MASKED, KVM will
> > try to find a matching priority/target tuple for the interrupt and
> > fail to restore the VM because 0/0 is not a valid target.
> >
> > When allocating an IRQ number, the EAS should be set to a sane default:
> > VALID and MASKED.
> >
> > Reported-by: Satheesh Rajendran 
> > Signed-off-by: Cédric Le Goater 
> > ---
> >
> >  David, this fixes a "virsh save/restore" issue in certain configurations
> >  of CPU topology which never showed up before :/
> >
> >  Peter, I was busy on a KVM/passthru issue and lacked the time to
> >  investigate all ... you decide.
> 
> rc5 has been tagged so this is definitely too late for 4.1.

Understood.  It's unfortunate, but I've merged this for 4.2, and I'll
look into stable branch and downstream backports.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory latency and bandwidth information

2019-08-13 Thread Tao Xu

On 8/13/2019 11:11 PM, Eric Blake wrote:

On 8/9/19 1:57 AM, Tao wrote:

From: Liu Jingqi 

Add -numa hmat-lb option to provide System Locality Latency and
Bandwidth Information. These memory attributes help to build
System Locality Latency and Bandwidth Information Structure(s)
in ACPI Heterogeneous Memory Attribute Table (HMAT).

Signed-off-by: Liu Jingqi 
Signed-off-by: Tao Xu 
---

Changes in v9:
 - change the CLI input format to make it more user-friendly (Daniel Black):
   use latency=NUM[p|n|u]s and bandwidth=NUM[M|G|P](B/s) as input and drop
   the base-lat and base-bw input.


Why are you hand-rolling yet another scaling parser instead of reusing
one that's already in-tree?


Because there is no time-scaling parser in-tree, and the QMP 'size' type
uses kb as its default unit. It is a tricky issue because an entry in HMAT
is small (max 0x) and we need to store the unit in HMAT.


But as you mentioned below, 'str' is not a good choice for QMP.
Therefore, what about this solution:

For bandwidth, reuse qemu_strtosz_MiB() (because the smallest unit is
MB/s). For latency, write time-scaling parsers named
"qemu_strtotime_ps()" and "qemu_strtotime_ns()" in util/cutils.c, and
then use them to pre-convert the values into a single scale that the QMP
interface can use.


Finally, in HMAT, we store the data automatically: split it into a common
base unit and per-entry values, and report an error on overflow. That way
the HMAT can support as large a range as possible.


I am wondering if this solution is OK.
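
As a rough illustration of the latency half of this proposal, a time-scaling
parser could look something like the sketch below. The name
parse_latency_ps() and its error convention are hypothetical (the helpers
named above do not exist yet), and an unsuffixed value is treated as
picoseconds, as in the v9 code quoted further down.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PS_PER_NS UINT64_C(1000)
#define PS_PER_US UINT64_C(1000000)

/*
 * Hypothetical helper (not an existing QEMU function): parse "NUM",
 * "NUMps", "NUMns" or "NUMus" into picoseconds.
 * Returns 0 on success, -EINVAL on malformed input, -ERANGE on overflow.
 */
static int parse_latency_ps(const char *str, uint64_t *result)
{
    char *end;
    unsigned long long val;
    uint64_t scale;

    /* Require a leading digit so signs, whitespace and "" are rejected. */
    if (!str || *str < '0' || *str > '9') {
        return -EINVAL;
    }

    errno = 0;
    val = strtoull(str, &end, 10);
    if (errno == ERANGE) {
        return -ERANGE;
    }

    if (*end == '\0' || strcmp(end, "ps") == 0) {
        scale = 1;
    } else if (strcmp(end, "ns") == 0) {
        scale = PS_PER_NS;
    } else if (strcmp(end, "us") == 0) {
        scale = PS_PER_US;
    } else {
        return -EINVAL;
    }

    /* Refuse values that would not fit once converted to picoseconds. */
    if (val > UINT64_MAX / scale) {
        return -ERANGE;
    }

    *result = val * scale;
    return 0;
}
```

With this sketch, "5ns" is stored as 5000 and an unsupported suffix such as
"3ms" is rejected, so the QMP side would only ever see a single scale.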



+++ b/hw/core/numa.c



+void parse_numa_hmat_lb(MachineState *ms, NumaHmatLBOptions *node,
+Error **errp)
+{



+if (node->has_latency) {
+hmat_lb = ms->numa_state->hmat_lb[node->hierarchy][node->data_type];
+
+if (!hmat_lb) {
+hmat_lb = g_malloc0(sizeof(*hmat_lb));
+ms->numa_state->hmat_lb[node->hierarchy][node->data_type] = 
hmat_lb;
+} else if (hmat_lb->latency[node->initiator][node->target]) {
+error_setg(errp, "Duplicate configuration of the latency for "
+   "initiator=%" PRIu16 " and target=%" PRIu16 ".",
+   node->initiator, node->target);
+return;
+}
+
+ret = qemu_strtoui(node->latency, , 10, );
+if (ret < 0) {
+error_setg(errp, "Invalid latency %s", node->latency);
+return;
+}
+
+if (*endptr == '\0') {
+base_lat = 1;
+} else if (*(endptr + 1) == 's') {
+switch (*endptr) {
+case 'p':
+base_lat = 1;
+break;
+case 'n':
+base_lat = PICO_PER_NSEC;
+break;
+case 'u':
+base_lat = PICO_PER_USEC;
+break;


Hmm - this is a different scaling than any of our existing parsers
(which assume multiples k/M/G..., not subdivisions u/n/s)



+if (node->has_bandwidth) {
+hmat_lb = ms->numa_state->hmat_lb[node->hierarchy][node->data_type];
+
+if (!hmat_lb) {
+hmat_lb = g_malloc0(sizeof(*hmat_lb));
+ms->numa_state->hmat_lb[node->hierarchy][node->data_type] = 
hmat_lb;
+} else if (hmat_lb->bandwidth[node->initiator][node->target]) {
+error_setg(errp, "Duplicate configuration of the bandwidth for "
+   "initiator=%" PRIu16 " and target=%" PRIu16 ".",
+   node->initiator, node->target);
+return;
+}
+
+ret = qemu_strtoui(node->bandwidth, , 10, );
+if (ret < 0) {
+error_setg(errp, "Invalid bandwidth %s", node->bandwidth);
+return;
+}
+
+switch (toupper(*endptr)) {
+case '\0':
+case 'M':
+base_bw = 1;
+break;
+case 'G':
+base_bw = UINT64_C(1) << 10;
+break;
+case 'P':
+base_bw = UINT64_C(1) << 20;
+break;


But this one, in addition to being wrong (P is 1<<30, not 1<<20), should
definitely be reusing qemu_strtosz_metric() or similar (look in
util/cutils.c).
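
For reference, the binary scaling that the comment above implies, with MB/s
as the base unit, can be sketched as follows. bandwidth_scale_mb() is a
hypothetical helper written only for illustration; it also includes a 'T'
step, which the v9 option syntax does not accept, to show why 'P' lands on
1 << 30. Reusing an in-tree parser, as suggested, would make a hand-written
table like this unnecessary.

```c
#include <ctype.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the multiplier, in units of MB/s, for a
 * bandwidth suffix. Each step up is a factor of 1024. Returns 0 for an
 * unknown suffix.
 */
static uint64_t bandwidth_scale_mb(char suffix)
{
    switch (toupper((unsigned char)suffix)) {
    case '\0':                      /* no suffix: already in MB/s */
    case 'M':
        return UINT64_C(1);
    case 'G':
        return UINT64_C(1) << 10;
    case 'T':
        return UINT64_C(1) << 20;
    case 'P':
        return UINT64_C(1) << 30;
    default:
        return 0;
    }
}
```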



+++ b/qapi/machine.json
@@ -377,10 +377,12 @@
  #
  # @cpu: property based CPU(s) to node mapping (Since: 2.10)
  #
+# @hmat-lb: memory latency and bandwidth information (Since: 4.2)
+#
  # Since: 2.1
  ##
  { 'enum': 'NumaOptionsType',
-  'data': [ 'node', 'dist', 'cpu' ] }
+  'data': [ 'node', 'dist', 'cpu', 'hmat-lb' ] }
  



+##
+# @HmatLBDataType:
+#
+# Data type in the System Locality Latency
+# and Bandwidth Information Structure of HMAT (Heterogeneous
+# Memory Attribute Table)
+#
+# For more information of @HmatLBDataType see
+# the chapter 5.2.27.4: Table 5-142:  Field "Data Type" of ACPI 6.3 spec.
+#
+# @access-latency: access latency (picoseconds)
+#
+# @read-latency: read latency (picoseconds)
+#
+# @write-latency: write latency (picoseconds)
+#
+# @access-bandwidth: access bandwidth (MB/s)
+#
+# @read-bandwidth: read bandwidth (MB/s)
+#
+# 

Re: [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes

2019-08-13 Thread Dan Williams
On Tue, Aug 13, 2019 at 8:00 AM Igor Mammedov  wrote:
>
> On Fri,  9 Aug 2019 14:57:25 +0800
> Tao  wrote:
>
> > From: Tao Xu 
> >
> > In ACPI 6.3 chapter 5.2.27 Heterogeneous Memory Attribute Table (HMAT),
> > the initiator represents a processor that accesses memory. In 5.2.27.3
> > Memory Proximity Domain Attributes Structure, the attached initiator is
> > defined as the place where the memory controller responsible for a memory
> > proximity domain resides. With attached initiator information, the topology
> > of heterogeneous memory can be described.
> >
> > Extend the "-numa node" CLI option to indicate the initiator NUMA node-id.
> > In the Linux kernel, the code in drivers/acpi/hmat/hmat.c parses and reports
> > the platform's HMAT tables.
> >
> > Reviewed-by: Jingqi Liu 
> > Suggested-by: Dan Williams 
> > Signed-off-by: Tao Xu 
> > ---
> >
> > No changes in v9
> > ---
> >  hw/core/machine.c | 24 
> >  hw/core/numa.c| 13 +
> >  include/sysemu/numa.h |  3 +++
> >  qapi/machine.json |  6 +-
> >  qemu-options.hx   | 27 +++
> >  5 files changed, 68 insertions(+), 5 deletions(-)
> >
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index 3c55470103..113184a9df 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -640,6 +640,7 @@ void machine_set_cpu_numa_node(MachineState *machine,
> > const CpuInstanceProperties *props, Error 
> > **errp)
> >  {
> >  MachineClass *mc = MACHINE_GET_CLASS(machine);
> > +NodeInfo *numa_info = machine->numa_state->nodes;
> >  bool match = false;
> >  int i;
> >
> > @@ -709,6 +710,16 @@ void machine_set_cpu_numa_node(MachineState *machine,
> >  match = true;
> >  slot->props.node_id = props->node_id;
> >  slot->props.has_node_id = props->has_node_id;
> > +
> > +if (numa_info[props->node_id].initiator_valid &&
> > +(props->node_id != numa_info[props->node_id].initiator)) {
> > +error_setg(errp, "The initiator of CPU NUMA node %" PRId64
> > +   " should be itself.", props->node_id);
> > +return;
> > +}
> > +numa_info[props->node_id].initiator_valid = true;
> > +numa_info[props->node_id].has_cpu = true;
> > +numa_info[props->node_id].initiator = props->node_id;
> >  }
> >
> >  if (!match) {
> > @@ -1050,6 +1061,7 @@ static void machine_numa_finish_cpu_init(MachineState 
> > *machine)
> >  GString *s = g_string_new(NULL);
> >  MachineClass *mc = MACHINE_GET_CLASS(machine);
> >  const CPUArchIdList *possible_cpus = 
> > mc->possible_cpu_arch_ids(machine);
> > +NodeInfo *numa_info = machine->numa_state->nodes;
> >
> >  assert(machine->numa_state->num_nodes);
> >  for (i = 0; i < possible_cpus->len; i++) {
> > @@ -1083,6 +1095,18 @@ static void 
> > machine_numa_finish_cpu_init(MachineState *machine)
> >  machine_set_cpu_numa_node(machine, &props, &error_fatal);
> >  }
> >  }
> > +
> > +for (i = 0; i < machine->numa_state->num_nodes; i++) {
> > +if (numa_info[i].initiator_valid &&
> > +!numa_info[numa_info[i].initiator].has_cpu) {
>   ^^ possible out of bounds read, 
> see below
>
> > +error_report("The initiator-id %"PRIu16 " of NUMA node %d"
> > + " does not exist.", numa_info[i].initiator, i);
> > +error_printf("\n");
> > +
> > +exit(1);
> > +}
> it takes care only about nodes that have cpus or memory-only ones that have
> initiator explicitly provided on CLI. And leaves possibility to have
> memory-only nodes without initiator mixed with nodes that have initiator.
> Is it valid to have mixed configuration?
> Should we forbid it?

The spec talks about the "Proximity Domain for the Attached Initiator"
field only being valid if the memory controller for the memory can be
identified by an initiator id in the SRAT. So I expect the only way to
define a memory proximity domain without this local initiator is to
allow specifying a node-id that does not have an entry in the SRAT.

That would be a useful feature for testing OS HMAT parsing behavior,
and may match platforms that exist in practice.

>
> > +}
> > +
> >  if (s->len && !qtest_enabled()) {
> >  warn_report("CPU(s) not present in any NUMA nodes: %s",
> >  s->str);
> > diff --git a/hw/core/numa.c b/hw/core/numa.c
> > index 8fcbba05d6..cfb6339810 100644
> > --- a/hw/core/numa.c
> > +++ b/hw/core/numa.c
> > @@ -128,6 +128,19 @@ static void parse_numa_node(MachineState *ms, 
> > NumaNodeOptions *node,
> >  numa_info[nodenr].node_mem = object_property_get_uint(o, "size", 
> > NULL);
> >  numa_info[nodenr].node_memdev = MEMORY_BACKEND(o);
> >  }
> > +
> > +if (node->has_initiator) {
> > +if (numa_info[nodenr].initiator_valid &&
> 

Re: [Qemu-devel] [PATCH] Fix Guest VM crash due to iSCSI Sense Key error

2019-08-13 Thread Shaju Abraham
I do not have a test case to reproduce this issue. It is seen rarely. The fix 
looks good to me, will confirm if I am able to reproduce the error scenario.

Regards
Shaju

On 8/14/19, 4:21 AM, "John Snow"  wrote:



On 7/7/19 10:55 PM, shaju.abra...@nutanix.com wrote:
> From: Shaju Abraham 
> 
> During an IDE DMA transfer for an iSCSI target, when libiscsi encounters
> a SENSE KEY error, it sets task->sense to the value "COMMAND ABORTED".
> The function iscsi_translate_sense() later translates this error to
> -ECANCELED, and this value is passed to the callback function. In the
> case of an IDE DMA read or write, the callback function returns
> immediately if the value of the ret argument is -ECANCELED.
> Later, when ide_cancel_dma_sync() is invoked, the assertion
> "s->bus->dma->aiocb == ((void *)0)" fails and the qemu process is
> terminated.
> Fix the issue by setting s->bus->dma->aiocb = NULL when
> -ECANCELED is passed to the callback.
> 
> Signed-off-by: Shaju Abraham 
> ---
>  hw/ide/core.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index 6afadf8..78ea357 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -841,6 +841,7 @@ static void ide_dma_cb(void *opaque, int ret)
>  bool stay_active = false;
>  
>  if (ret == -ECANCELED) {
> +s->bus->dma->aiocb = NULL;
>  return;
>  }
>  
> 
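
The effect of the one-line change above can be shown with a stripped-down
model. DmaSketch, dma_cb_sketch() and cancel_precondition_holds() are
hypothetical stand-ins, not the real hw/ide/core.c types or functions; the
sketch only shows that clearing the handle on the -ECANCELED path keeps the
later "aiocb == NULL" precondition true.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the DMA state; only the relevant fields. */
typedef struct {
    void *aiocb;     /* non-NULL while an asynchronous request is in flight */
    int completed;   /* set when a request finishes normally */
} DmaSketch;

/* Simplified completion callback modelled on the patched ide_dma_cb(). */
static void dma_cb_sketch(DmaSketch *dma, int ret)
{
    if (ret == -ECANCELED) {
        dma->aiocb = NULL;   /* the fix: drop the stale request handle */
        return;
    }
    /* Normal completion handling would continue here (omitted). */
    dma->completed = 1;
}

/* Models the assertion checked later on the cancel path. */
static int cancel_precondition_holds(const DmaSketch *dma)
{
    return dma->aiocb == NULL;
}
```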

Hopefully just as adequately addressed by the patches in


https://github.com/jnsnow/qemu/commits/ide
 

but if you wanted to give it a test and confirm for me, I wouldn't be
upset by that.

--js




Re: [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes

2019-08-13 Thread Tao Xu

On 8/13/2019 11:00 PM, Igor Mammedov wrote:

On Fri,  9 Aug 2019 14:57:25 +0800
Tao  wrote:


From: Tao Xu 

In ACPI 6.3 chapter 5.2.27 Heterogeneous Memory Attribute Table (HMAT),
the initiator represents a processor that accesses memory. In 5.2.27.3
Memory Proximity Domain Attributes Structure, the attached initiator is
defined as the place where the memory controller responsible for a memory
proximity domain resides. With attached initiator information, the topology
of heterogeneous memory can be described.

Extend the "-numa node" CLI option to indicate the initiator NUMA node-id.
In the Linux kernel, the code in drivers/acpi/hmat/hmat.c parses and reports
the platform's HMAT tables.

Reviewed-by: Jingqi Liu 
Suggested-by: Dan Williams 
Signed-off-by: Tao Xu 
---

No changes in v9
---

[...]

+
+for (i = 0; i < machine->numa_state->num_nodes; i++) {
+if (numa_info[i].initiator_valid &&
+!numa_info[numa_info[i].initiator].has_cpu) {

   ^^ possible out of bounds read, 
see below


I will add an error check, "if (numa_info[i].initiator >= MAX_NODES)", when parsing the input.

+error_report("The initiator-id %"PRIu16 " of NUMA node %d"
+ " does not exist.", numa_info[i].initiator, i);
+error_printf("\n");
+
+exit(1);
+}

it takes care only about nodes that have cpus or memory-only ones that have
initiator explicitly provided on CLI. And leaves possibility to have
memory-only nodes without initiator mixed with nodes that have initiator.
Is it valid to have mixed configuration?
Should we forbid it?

A mixed configuration may indeed trigger bugs in the future, because in
this series we generate the HMAT by default. But a mixed configuration,
or one with no initiator set at all, leaves the "Flags" field of a
memory-only node at 0, so the Proximity Domain for the Attached Initiator
field is not valid.

There are three situations:

1) full configuration, just like
-object memory-backend-ram,size=1G,id=m0 \
-object memory-backend-ram,size=1G,id=m1 \
-object memory-backend-ram,size=1G,id=m2 \
-numa node,nodeid=0,memdev=m0 \
-numa node,nodeid=1,memdev=m1,initiator=0 \
-numa node,nodeid=2,memdev=m2,initiator=0

2) mixed configuration, just like
-object memory-backend-ram,size=1G,id=m0 \
-object memory-backend-ram,size=1G,id=m1 \
-object memory-backend-ram,size=1G,id=m2 \
-numa node,nodeid=0,memdev=m0 \
-numa node,nodeid=1,memdev=m1,initiator=0 \
-numa node,nodeid=2,memdev=m2

3) no configuration, just like
-object memory-backend-ram,size=1G,id=m0 \
-object memory-backend-ram,size=1G,id=m1 \
-object memory-backend-ram,size=1G,id=m2 \
-numa node,nodeid=0,memdev=m0 \
-numa node,nodeid=1,memdev=m1 \
-numa node,nodeid=2,memdev=m2

I have 3 ideas:

1. HMAT option. Add a machine option like "-machine,hmat=yes"; when it is
set, QEMU generates the HMAT.


2. Default setting. A NUMA node without an initiator defaults to the NUMA
node that contains CPU 0 as its initiator.


3. Auto setting. Intelligent auto-configuration, like
numa_default_auto_assign_ram: automatically distribute the initiators of
the memory-only nodes evenly.


Therefore, there are 2 different solutions:

1) HMAT option + Default setting

2) HMAT option + Auto setting


+}
+
  if (s->len && !qtest_enabled()) {
  warn_report("CPU(s) not present in any NUMA nodes: %s",
  s->str);
diff --git a/hw/core/numa.c b/hw/core/numa.c
index 8fcbba05d6..cfb6339810 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -128,6 +128,19 @@ static void parse_numa_node(MachineState *ms, 
NumaNodeOptions *node,
  numa_info[nodenr].node_mem = object_property_get_uint(o, "size", 
NULL);
  numa_info[nodenr].node_memdev = MEMORY_BACKEND(o);
  }
+
+if (node->has_initiator) {
+if (numa_info[nodenr].initiator_valid &&
+(node->initiator != numa_info[nodenr].initiator)) {
+error_setg(errp, "The initiator of NUMA node %" PRIu16 " has been "
+   "set to node %" PRIu16, nodenr,
+   numa_info[nodenr].initiator);
+return;
+}
+
+numa_info[nodenr].initiator_valid = true;
+numa_info[nodenr].initiator = node->initiator;

  ^^^
unvalidated user input? (which could lead to a read beyond the numa_info[]
boundaries in the previous hunk).
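
A sketch of the kind of range check being asked for, done before the
initiator is ever used as an array index. NodeInitiator, initiator_in_range()
and find_bad_initiator() are hypothetical names, MAX_NODES = 128 is an
assumption, and the nodes array is assumed to hold MAX_NODES entries, as
numa_info[] does.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 128   /* assumption for this sketch */

/* Hypothetical, reduced version of NodeInfo. */
typedef struct {
    bool has_cpu;
    bool initiator_valid;
    uint16_t initiator;
} NodeInitiator;

/* True if the value can safely index an array of MAX_NODES entries. */
static bool initiator_in_range(uint16_t initiator)
{
    return initiator < MAX_NODES;
}

/*
 * Return the index of the first node whose initiator is out of range or
 * refers to a node without CPUs, or -1 if all initiators are acceptable.
 * The range test comes first, so nodes[] is never read out of bounds.
 */
static int find_bad_initiator(const NodeInitiator *nodes, int num_nodes)
{
    for (int i = 0; i < num_nodes; i++) {
        if (!nodes[i].initiator_valid) {
            continue;
        }
        if (!initiator_in_range(nodes[i].initiator) ||
            !nodes[nodes[i].initiator].has_cpu) {
            return i;
        }
    }
    return -1;
}
```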


+}
  numa_info[nodenr].present = true;
  max_numa_nodeid = MAX(max_numa_nodeid, nodenr + 1);
  ms->numa_state->num_nodes++;
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 76da3016db..46ad06e000 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -10,6 +10,9 @@ struct NodeInfo {
  uint64_t node_mem;
  struct HostMemoryBackend *node_memdev;
  bool present;
+bool has_cpu;
+bool initiator_valid;
+uint16_t initiator;
  uint8_t distance[MAX_NODES];
  };
  
diff --git a/qapi/machine.json b/qapi/machine.json

index 6db8a7e2ec..05e367d26a 100644
--- 

[Qemu-devel] [PATCH 5/6] migration: add some multifd traces

2019-08-13 Thread Juan Quintela
Signed-off-by: Juan Quintela 
---
 migration/ram.c| 3 +++
 migration/trace-events | 4 
 2 files changed, 7 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index f1aec95f83..25a211c3fb 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1173,6 +1173,7 @@ static void *multifd_send_thread(void *opaque)
 
 out:
 if (local_err) {
+trace_multifd_send_error(p->id);
 multifd_send_terminate_threads(local_err);
 }
 
@@ -1203,6 +1204,7 @@ static void multifd_new_send_channel_async(QIOTask *task, 
gpointer opaque)
 QIOChannel *sioc = QIO_CHANNEL(qio_task_get_source(task));
 Error *local_err = NULL;
 
+trace_multifd_new_send_channel_async(p->id);
 if (qio_task_propagate_error(task, &local_err)) {
 migrate_set_error(migrate_get_current(), local_err);
 multifd_save_cleanup();
@@ -1496,6 +1498,7 @@ bool multifd_recv_new_channel(QIOChannel *ioc, Error 
**errp)
 atomic_read(&multifd_recv_state->count));
 return false;
 }
+trace_multifd_recv_new_channel(id);
 
 p = &multifd_recv_state->params[id];
 if (p->c != NULL) {
diff --git a/migration/trace-events b/migration/trace-events
index 9fbef614ab..5d85f8bf83 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -81,7 +81,9 @@ migration_bitmap_sync_start(void) ""
 migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64
 migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, 
unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx"
 migration_throttle(void) ""
+multifd_new_send_channel_async(uint8_t id) "channel %d"
 multifd_recv(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags, 
uint32_t next_packet_size) "channel %d packet_num %" PRIu64 " pages %d flags 
0x%x next packet size %d"
+multifd_recv_new_channel(uint8_t id) "channel %d"
 multifd_recv_sync_main(long packet_num) "packet num %ld"
 multifd_recv_sync_main_signal(uint8_t id) "channel %d"
 multifd_recv_sync_main_wait(uint8_t id) "channel %d"
@@ -89,7 +91,9 @@ multifd_recv_terminate_threads(bool error) "error %d"
 multifd_recv_thread_can_start(uint8_t id) "channel %d"
 multifd_recv_thread_end(uint8_t id, uint64_t packets, uint64_t pages) "channel 
%d packets %" PRIu64 " pages %" PRIu64
 multifd_recv_thread_start(uint8_t id) "%d"
+multifd_save_setup_wait(uint8_t id) "%d"
 multifd_send(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags, 
uint32_t next_packet_size) "channel %d packet_num %" PRIu64 " pages %d flags 
0x%x next packet size %d"
+multifd_send_error(uint8_t id) "channel %d"
 multifd_send_sync_main(long packet_num) "packet num %ld"
 multifd_send_sync_main_signal(uint8_t id) "channel %d"
 multifd_send_sync_main_wait(uint8_t id) "channel %d"
-- 
2.21.0




[Qemu-devel] [PATCH 6/6] RFH: We lost "connect" events

2019-08-13 Thread Juan Quintela
When we have lots of channels, sometimes multifd migration fails
with the following error:

(qemu) migrate -d tcp:0:
(qemu) qemu-system-x86_64: multifd_send_pages: channel 17 has already quit!
qemu-system-x86_64: multifd_send_pages: channel 17 has already quit!
qemu-system-x86_64: multifd_send_sync_main: multifd_send_pages fail
qemu-system-x86_64: Unable to write to socket: Connection reset by peer
info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
clear-bitmap-shift: 18
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off 
compress: off events: off postcopy-ram: off x-colo: off release-ram: off block: 
off return-path: off pause-before-switchover: off multifd: on dirty-bitmaps: 
off postcopy-blocktime: off late-block-activate: off x-ignore-shared: off
Migration status: failed (Unable to write to socket: Connection reset by peer)
total time: 0 milliseconds

In this particular example I am using 100 channels.  The larger the
number of channels, the easier it is to reproduce.  That doesn't
mean that it is a good idea to use so many channels.

With the previous patches in this series, I can run "reliably" on my
hardware with up to 10 channels.  Most of the time.  Until it fails.
With 100 channels, it fails almost always.

I thought that the problem was on the send side, so I tried to debug
there.  As you can see from the delay, if you add any
printf()/error_report/trace, the error can go away; it
is very timing sensitive.  With a delay of 1 microsecond, it only
works sometimes.

What have I discovered so far:

- send side calls qemu_socket() on all the channels.  So it appears
  that they get created correctly.
- on the destination side, it appears that "somehow" some of the
  connections are lost by the listener.  This error happens when the
  destination side socket hasn't been "accepted", and it is not
  properly created.  As far as I can see, we have several options:

  1- I don't know how to use qio properly asynchronously
     (this is one big possibility).

  2- glib has an error in this case?  Or in how the qio listener is
     implemented on top of glib.  I put in lots of printf() and other
     instrumentation, and it appears that the listener io_func is not
     called at all for the connections that are missing.

  3- it is always possible that we are missing some g_main_loop_run()
     somewhere.  Notice how test/test-io-channel-socket.c calls it
     "creatively".

  4- It is entirely possible that I should be using the sockets as
     blocking instead of non-blocking.  But I am not sure about that
     one yet.

- on the sending side, what happens is:

  eventually it calls socket_connect() after all the async dance with
  thread creation, etc, etc. The source side creates all the channels; it
  is the destination side which is missing some of them.

  the sending side sends the first packet through that channel, it
  "succeeds" and doesn't give any error.

  after some time, the sending side decides to send another packet through
  that channel, and it is then that we get the above error.

Any good ideas?

Later, Juan.

PS: The command line used is attached:

Important bits:
- multifd is set
- multifd_channels is set to 100

/scratch/qemu/fail/x64/x86_64-softmmu/qemu-system-x86_64 -M
pc-i440fx-3.1,accel=kvm,usb=off,vmport=off,nvdimm -L
/mnt/code/qemu/check/pc-bios/ -smp 2 -name t1,debug-threads=on -m 3G
-uuid 113100f9-6c99-4a7a-9b78-eb1c088d1087 -monitor stdio -boot
strict=on -drive
file=/mnt/images/test.img,format=qcow2,if=none,id=disk0 -device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=disk0,id=virtio-disk0,bootindex=1
-netdev tap,id=hostnet0,script=/etc/kvm-ifup,downscript= -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:9d:10:51,bus=pci.0,addr=0x3
-serial pty -parallel none -usb -device usb-tablet -k es -vga cirrus
--global migration.x-multifd=on --global
migration.multifd-channels=100 -trace events=/home/quintela/tmp/events

CC: Daniel P. Berrangé 

Signed-off-by: Juan Quintela 
---
 migration/ram.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/ram.c b/migration/ram.c
index 25a211c3fb..50586304a0 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1248,6 +1248,7 @@ int multifd_save_setup(void)
 p->packet = g_malloc0(p->packet_len);
 p->name = g_strdup_printf("multifdsend_%d", i);
 socket_send_channel_create(multifd_new_send_channel_async, p);
+usleep(10);
 }
 return 0;
 }
-- 
2.21.0




[Qemu-devel] [PATCH 4/6] migration: Make multifd threads wait until all have been created

2019-08-13 Thread Juan Quintela
This makes it clear that no thread handles any incoming message until
all threads have been created.
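
The gating idea can be illustrated outside QEMU with plain POSIX threads and
semaphores. This is only a sketch under that assumption: Channel,
channel_thread() and run_gate_demo() are hypothetical names, and the real
patch uses QemuSemaphore and releases the threads from
multifd_recv_new_channel() once the last channel arrives.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

#define MAX_CHANNELS 16

/* Hypothetical per-channel state, loosely modelled on MultiFDRecvParams. */
typedef struct {
    pthread_t thread;
    sem_t can_start;   /* posted once every channel exists */
    int seen;          /* channels the worker saw when it was released */
} Channel;

static Channel channels[MAX_CHANNELS];
static atomic_int created;

static void *channel_thread(void *opaque)
{
    Channel *c = opaque;

    /* Block until the creator says all channels are up. */
    while (sem_wait(&c->can_start) == -1 && errno == EINTR) {
    }
    c->seen = atomic_load(&created);
    return NULL;
}

/* Returns 1 if every worker started only after all n channels existed. */
static int run_gate_demo(int n)
{
    int started = 0;
    int ok = 1;

    if (n < 1 || n > MAX_CHANNELS) {
        return 0;
    }
    atomic_store(&created, 0);

    for (int i = 0; i < n; i++) {
        Channel *c = &channels[i];

        c->seen = -1;
        sem_init(&c->can_start, 0, 0);
        if (pthread_create(&c->thread, NULL, channel_thread, c) != 0) {
            sem_destroy(&c->can_start);
            ok = 0;
            break;
        }
        started++;
        atomic_fetch_add(&created, 1);
    }

    /* Release the workers only now that creation has finished. */
    for (int i = 0; i < started; i++) {
        sem_post(&channels[i].can_start);
    }
    for (int i = 0; i < started; i++) {
        pthread_join(channels[i].thread, NULL);
        sem_destroy(&channels[i].can_start);
        if (channels[i].seen != n) {
            ok = 0;
        }
    }
    return ok;
}
```

Because each worker blocks on its own semaphore, none of them can observe a
partially created set of channels, which is the property the commit message
describes.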

Signed-off-by: Juan Quintela 
---
 migration/ram.c| 24 ++--
 migration/trace-events |  1 +
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 4a6ae677a9..f1aec95f83 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -702,6 +702,8 @@ typedef struct {
 uint64_t num_pages;
 /* syncs main thread and channels */
 QemuSemaphore sem_sync;
+/* thread can continue */
+QemuSemaphore can_start;
 } MultiFDRecvParams;
 
 static int multifd_send_initial_packet(MultiFDSendParams *p, Error **errp)
@@ -1313,6 +1315,7 @@ int multifd_load_cleanup(Error **errp)
 p->c = NULL;
 qemu_mutex_destroy(&p->mutex);
 qemu_sem_destroy(&p->sem_sync);
+qemu_sem_destroy(&p->can_start);
 g_free(p->name);
 p->name = NULL;
 multifd_pages_clear(p->pages);
@@ -1366,6 +1369,9 @@ static void *multifd_recv_thread(void *opaque)
 trace_multifd_recv_thread_start(p->id);
 rcu_register_thread();
 
+qemu_sem_wait(&p->can_start);
+trace_multifd_recv_thread_can_start(p->id);
+
 while (true) {
 uint32_t used;
 uint32_t flags;
@@ -1445,6 +1451,7 @@ int multifd_load_setup(void)
 
 qemu_mutex_init(&p->mutex);
 qemu_sem_init(&p->sem_sync, 0);
+qemu_sem_init(&p->can_start, 0);
 p->quit = false;
 p->id = i;
 p->pages = multifd_pages_init(page_count);
@@ -1477,6 +1484,7 @@ bool multifd_recv_new_channel(QIOChannel *ioc, Error 
**errp)
 {
 MultiFDRecvParams *p;
 Error *local_err = NULL;
+bool last_one;
 int id;
 
 id = multifd_recv_initial_packet(ioc, &local_err);
@@ -1506,8 +1514,20 @@ bool multifd_recv_new_channel(QIOChannel *ioc, Error 
**errp)
 qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
QEMU_THREAD_JOINABLE);
 atomic_inc(&multifd_recv_state->count);
-return atomic_read(&multifd_recv_state->count) ==
-   migrate_multifd_channels();
+
+last_one =  atomic_read(&multifd_recv_state->count)
+== migrate_multifd_channels();
+
+if (last_one) {
+int i;
+
+for (i = 0; i < migrate_multifd_channels(); i++) {
+MultiFDRecvParams *p = &multifd_recv_state->params[i];
+
+qemu_sem_post(&p->can_start);
+}
+}
+return last_one;
 }
 
 /**
diff --git a/migration/trace-events b/migration/trace-events
index dd13a5c4b1..9fbef614ab 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -86,6 +86,7 @@ multifd_recv_sync_main(long packet_num) "packet num %ld"
 multifd_recv_sync_main_signal(uint8_t id) "channel %d"
 multifd_recv_sync_main_wait(uint8_t id) "channel %d"
 multifd_recv_terminate_threads(bool error) "error %d"
+multifd_recv_thread_can_start(uint8_t id) "channel %d"
 multifd_recv_thread_end(uint8_t id, uint64_t packets, uint64_t pages) "channel 
%d packets %" PRIu64 " pages %" PRIu64
 multifd_recv_thread_start(uint8_t id) "%d"
 multifd_send(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags, 
uint32_t next_packet_size) "channel %d packet_num %" PRIu64 " pages %d flags 
0x%x next packet size %d"
-- 
2.21.0
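
The start gate described in this patch can be sketched outside QEMU as follows. This is only an illustration under stated assumptions: it uses POSIX threads and unnamed POSIX semaphores instead of QEMU's QemuSemaphore wrappers, and NUM_CHANNELS, RecvParams, recv_thread, recv_new_channel and run_gate_demo are hypothetical names, not QEMU code.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

#define NUM_CHANNELS 4              /* hypothetical channel count */

typedef struct {
    int id;
    sem_t can_start;                /* posted once every channel exists */
    pthread_t thread;
} RecvParams;

static RecvParams params[NUM_CHANNELS];
static atomic_int created;          /* channels created so far */
static atomic_int early_starts;     /* workers that ran before the gate opened */

static void *recv_thread(void *opaque)
{
    RecvParams *p = opaque;

    /* Block here until the creator of the last channel opens the gate. */
    sem_wait(&p->can_start);
    if (atomic_load(&created) != NUM_CHANNELS) {
        atomic_fetch_add(&early_starts, 1);
    }
    return NULL;
}

/* Creates one channel; returns 1 when this call created the last one. */
static int recv_new_channel(int id)
{
    RecvParams *p = &params[id];
    int last_one;

    p->id = id;
    sem_init(&p->can_start, 0, 0);
    pthread_create(&p->thread, NULL, recv_thread, p);

    last_one = atomic_fetch_add(&created, 1) + 1 == NUM_CHANNELS;
    if (last_one) {
        /* Every semaphore is initialized by now, so release all workers. */
        for (int i = 0; i < NUM_CHANNELS; i++) {
            sem_post(&params[i].can_start);
        }
    }
    return last_one;
}

/* Creates all channels, joins them, and returns how many started early. */
static int run_gate_demo(void)
{
    int last = 0;

    atomic_store(&created, 0);
    atomic_store(&early_starts, 0);
    for (int i = 0; i < NUM_CHANNELS; i++) {
        last = recv_new_channel(i);
    }
    for (int i = 0; i < NUM_CHANNELS; i++) {
        pthread_join(params[i].thread, NULL);
        sem_destroy(&params[i].can_start);
    }
    return last ? atomic_load(&early_starts) : -1;
}
```

Calling run_gate_demo() should return 0, since no worker gets past its semaphore before the last channel has been counted. Build with -pthread where the toolchain requires it.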




[Qemu-devel] [PATCH 3/6] migration: Make sure that all multifd channels have been created

2019-08-13 Thread Juan Quintela
If we start the migration before all channels have been created, we have
to handle the case where a channel doesn't exist yet.  This way it is
easier.

Signed-off-by: Juan Quintela 
---
 migration/ram.c| 14 ++
 migration/trace-events |  1 +
 2 files changed, 15 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 4bdd201a4e..4a6ae677a9 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -663,6 +663,8 @@ typedef struct {
 uint64_t num_pages;
 /* syncs main thread and channels */
 QemuSemaphore sem_sync;
+/* thread has started and setup is done */
+QemuSemaphore started;
 }  MultiFDSendParams;
 
 typedef struct {
@@ -1039,6 +1041,7 @@ void multifd_save_cleanup(void)
 qemu_mutex_destroy(&p->mutex);
 qemu_sem_destroy(&p->sem);
 qemu_sem_destroy(&p->sem_sync);
+qemu_sem_destroy(&p->started);
 g_free(p->name);
 p->name = NULL;
 multifd_pages_clear(p->pages);
@@ -1113,6 +1116,8 @@ static void *multifd_send_thread(void *opaque)
 /* initial packet */
 p->num_packets = 1;
 
+qemu_sem_post(&p->started);
+
 while (true) {
 qemu_sem_wait(&p->sem);
 qemu_mutex_lock(&p->mutex);
@@ -1229,6 +1234,7 @@ int multifd_save_setup(void)
 qemu_mutex_init(&p->mutex);
 qemu_sem_init(&p->sem, 0);
 qemu_sem_init(&p->sem_sync, 0);
+qemu_sem_init(&p->started, 0);
 p->quit = false;
 p->pending_job = 0;
 p->id = i;
@@ -3486,6 +3492,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 ram_control_before_iterate(f, RAM_CONTROL_SETUP);
 ram_control_after_iterate(f, RAM_CONTROL_SETUP);
 
+/* We want to wait for all threads to have started before doing
+ * anything else */
+for (int i = 0; i < migrate_multifd_channels(); i++) {
+MultiFDSendParams *p = &multifd_send_state->params[i];
+
+qemu_sem_wait(&p->started);
+trace_multifd_send_thread_started(p->id);
+}
 multifd_send_sync_main();
 qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
 qemu_fflush(f);
diff --git a/migration/trace-events b/migration/trace-events
index 886ce70ca0..dd13a5c4b1 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -95,6 +95,7 @@ multifd_send_sync_main_wait(uint8_t id) "channel %d"
 multifd_send_terminate_threads(bool error) "error %d"
 multifd_send_thread_end(uint8_t id, uint64_t packets, uint64_t pages) "channel 
%d packets %" PRIu64 " pages %"  PRIu64
 multifd_send_thread_start(uint8_t id) "%d"
+multifd_send_thread_started(uint8_t id) "channel %d"
 ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: 
%" PRIx64 " %zx"
 ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: 
addr: 0x%" PRIx64 " flags: 0x%x host: %p"
 ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x"
-- 
2.21.0
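
As a rough illustration of the handshake in this patch, the sketch below has each worker post its own "started" semaphore once its setup is done, while the main thread waits on them one by one. It assumes POSIX threads and semaphores rather than QEMU's wrappers; NUM_SENDERS, SendParams, send_thread and wait_for_all_started are hypothetical names.

```c
#include <pthread.h>
#include <semaphore.h>

#define NUM_SENDERS 3               /* hypothetical number of send channels */

typedef struct {
    int id;
    int setup_done;                 /* written by the worker before posting */
    sem_t started;                  /* posted once thread setup is complete */
    pthread_t thread;
} SendParams;

static void *send_thread(void *opaque)
{
    SendParams *p = opaque;

    p->setup_done = 1;              /* stands in for the real setup work */
    sem_post(&p->started);
    return NULL;
}

/* Starts all workers and returns how many were fully set up when seen. */
static int wait_for_all_started(void)
{
    SendParams params[NUM_SENDERS];
    int ready = 0;

    for (int i = 0; i < NUM_SENDERS; i++) {
        params[i].id = i;
        params[i].setup_done = 0;
        sem_init(&params[i].started, 0, 0);
        pthread_create(&params[i].thread, NULL, send_thread, &params[i]);
    }

    /*
     * One semaphore per channel: if this loop hangs, the index tells
     * which worker never reported in.
     */
    for (int i = 0; i < NUM_SENDERS; i++) {
        sem_wait(&params[i].started);
        ready += params[i].setup_done;
    }

    for (int i = 0; i < NUM_SENDERS; i++) {
        pthread_join(params[i].thread, NULL);
        sem_destroy(&params[i].started);
    }
    return ready;
}
```

wait_for_all_started() should return NUM_SENDERS, because sem_wait() only returns for a channel after that channel's worker has finished its setup and posted.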




[Qemu-devel] [PATCH 1/6] migration: Add traces for multifd terminate threads

2019-08-13 Thread Juan Quintela
Signed-off-by: Juan Quintela 
---
 migration/ram.c| 4 
 migration/trace-events | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 889148dd84..ca11d43e30 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -996,6 +996,8 @@ static void multifd_send_terminate_threads(Error *err)
 {
 int i;
 
+trace_multifd_send_terminate_threads(err != NULL);
+
 if (err) {
 MigrationState *s = migrate_get_current();
 migrate_set_error(s, err);
@@ -1254,6 +1256,8 @@ static void multifd_recv_terminate_threads(Error *err)
 {
 int i;
 
+trace_multifd_recv_terminate_threads(err != NULL);
+
 if (err) {
 MigrationState *s = migrate_get_current();
 migrate_set_error(s, err);
diff --git a/migration/trace-events b/migration/trace-events
index d8e54c367a..886ce70ca0 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -85,12 +85,14 @@ multifd_recv(uint8_t id, uint64_t packet_num, uint32_t 
used, uint32_t flags, uin
 multifd_recv_sync_main(long packet_num) "packet num %ld"
 multifd_recv_sync_main_signal(uint8_t id) "channel %d"
 multifd_recv_sync_main_wait(uint8_t id) "channel %d"
+multifd_recv_terminate_threads(bool error) "error %d"
 multifd_recv_thread_end(uint8_t id, uint64_t packets, uint64_t pages) "channel 
%d packets %" PRIu64 " pages %" PRIu64
 multifd_recv_thread_start(uint8_t id) "%d"
 multifd_send(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags, 
uint32_t next_packet_size) "channel %d packet_num %" PRIu64 " pages %d flags 
0x%x next packet size %d"
 multifd_send_sync_main(long packet_num) "packet num %ld"
 multifd_send_sync_main_signal(uint8_t id) "channel %d"
 multifd_send_sync_main_wait(uint8_t id) "channel %d"
+multifd_send_terminate_threads(bool error) "error %d"
 multifd_send_thread_end(uint8_t id, uint64_t packets, uint64_t pages) "channel 
%d packets %" PRIu64 " pages %"  PRIu64
 multifd_send_thread_start(uint8_t id) "%d"
 ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: 
%" PRIx64 " %zx"
-- 
2.21.0




[Qemu-devel] [PATCH 0/6] Fix multifd with big number of channels

2019-08-13 Thread Juan Quintela
Hi

When we have many more channels than CPUs, we end up having failures when
writing to sockets. This series:
- adds some traces
- fixes some of the trouble with serialization of creating the
  threads/channels in proper order.
- asks for help with the last patch.  See documentation there.

Please, review.

Juan Quintela (6):
  migration: Add traces for multifd terminate threads
  migration: Make global sem_sync semaphore by channel
  migration: Make sure that all multifd channels have been created
  migration: Make multifd threads wait until all have been created
  migration: add some multifd traces
  RFH: We lost "connect" events

 migration/ram.c| 60 +++---
 migration/trace-events |  8 ++
 2 files changed, 59 insertions(+), 9 deletions(-)

-- 
2.21.0




[Qemu-devel] [PATCH 2/6] migration: Make global sem_sync semaphore by channel

2019-08-13 Thread Juan Quintela
This makes things easier to debug because when you wait for all threads
to arrive at that semaphore, you know which one you are waiting for.

Signed-off-by: Juan Quintela 
---
 migration/ram.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index ca11d43e30..4bdd201a4e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -661,6 +661,8 @@ typedef struct {
 uint64_t num_packets;
 /* pages sent through this channel */
 uint64_t num_pages;
+/* syncs main thread and channels */
+QemuSemaphore sem_sync;
 }  MultiFDSendParams;
 
 typedef struct {
@@ -896,8 +898,6 @@ struct {
 MultiFDSendParams *params;
 /* array of pages to sent */
 MultiFDPages_t *pages;
-/* syncs main thread and channels */
-QemuSemaphore sem_sync;
 /* global number of generated multifd packets */
 uint64_t packet_num;
 /* send channels ready */
@@ -1038,6 +1038,7 @@ void multifd_save_cleanup(void)
 p->c = NULL;
 qemu_mutex_destroy(&p->mutex);
 qemu_sem_destroy(&p->sem);
+qemu_sem_destroy(&p->sem_sync);
 g_free(p->name);
 p->name = NULL;
 multifd_pages_clear(p->pages);
@@ -1047,7 +1048,6 @@ void multifd_save_cleanup(void)
 p->packet = NULL;
 }
 qemu_sem_destroy(&multifd_send_state->channels_ready);
-qemu_sem_destroy(&multifd_send_state->sem_sync);
 g_free(multifd_send_state->params);
 multifd_send_state->params = NULL;
 multifd_pages_clear(multifd_send_state->pages);
@@ -1092,7 +1092,7 @@ static void multifd_send_sync_main(void)
 MultiFDSendParams *p = &multifd_send_state->params[i];
 
 trace_multifd_send_sync_main_wait(p->id);
-qemu_sem_wait(&multifd_send_state->sem_sync);
+qemu_sem_wait(&p->sem_sync);
 }
 trace_multifd_send_sync_main(multifd_send_state->packet_num);
 }
@@ -1152,7 +1152,7 @@ static void *multifd_send_thread(void *opaque)
 qemu_mutex_unlock(&p->mutex);
 
 if (flags & MULTIFD_FLAG_SYNC) {
-qemu_sem_post(&multifd_send_state->sem_sync);
+qemu_sem_post(&p->sem_sync);
 }
 qemu_sem_post(&multifd_send_state->channels_ready);
 } else if (p->quit) {
@@ -1175,7 +1175,7 @@ out:
  */
 if (ret != 0) {
 if (flags & MULTIFD_FLAG_SYNC) {
-qemu_sem_post(&multifd_send_state->sem_sync);
+qemu_sem_post(&p->sem_sync);
 }
 qemu_sem_post(&multifd_send_state->channels_ready);
 }
@@ -1221,7 +1221,6 @@ int multifd_save_setup(void)
 multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
 multifd_send_state->params = g_new0(MultiFDSendParams, thread_count);
 multifd_send_state->pages = multifd_pages_init(page_count);
-qemu_sem_init(&multifd_send_state->sem_sync, 0);
 qemu_sem_init(&multifd_send_state->channels_ready, 0);
 
 for (i = 0; i < thread_count; i++) {
@@ -1229,6 +1228,7 @@ int multifd_save_setup(void)
 
 qemu_mutex_init(&p->mutex);
 qemu_sem_init(&p->sem, 0);
+qemu_sem_init(&p->sem_sync, 0);
 p->quit = false;
 p->pending_job = 0;
 p->id = i;
-- 
2.21.0




Re: [Qemu-devel] [PATCH] riscv: hmp: Add a command to show virtual memory mappings

2019-08-13 Thread Bin Meng
Hi Palmer,

On Tue, Aug 13, 2019 at 11:18 PM Palmer Dabbelt  wrote:
>
> On Wed, 31 Jul 2019 05:49:15 PDT (-0700), bmeng...@gmail.com wrote:
> > This adds an 'info mem' command for RISC-V to show virtual memory
> > mappings, which aids debugging.
> >
> > Rather than showing every valid PTE, the command compacts the
> > output by merging all contiguous physical address mappings into
> > one block and only shows the merged block mapping details.
> >
> > Signed-off-by: Bin Meng 
> > ---
> >
> >  hmp-commands-info.hx   |   2 +-
> >  target/riscv/Makefile.objs |   4 +
> >  target/riscv/monitor.c | 227 
> > +
> >  3 files changed, 232 insertions(+), 1 deletion(-)
> >  create mode 100644 target/riscv/monitor.c
> >
> > diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
> > index c59444c..257ee7d 100644
> > --- a/hmp-commands-info.hx
> > +++ b/hmp-commands-info.hx
> > @@ -249,7 +249,7 @@ STEXI
> >  Show virtual to physical memory mappings.
> >  ETEXI
> >
> > -#if defined(TARGET_I386)
> > +#if defined(TARGET_I386) || defined(TARGET_RISCV)
> >  {
> >  .name   = "mem",
> >  .args_type  = "",
> > diff --git a/target/riscv/Makefile.objs b/target/riscv/Makefile.objs
> > index b1c79bc..a8ceccd 100644
> > --- a/target/riscv/Makefile.objs
> > +++ b/target/riscv/Makefile.objs
> > @@ -1,5 +1,9 @@
> >  obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o 
> > gdbstub.o pmp.o
> >
> > +ifeq ($(CONFIG_SOFTMMU),y)
> > +obj-y += monitor.o
> > +endif
> > +
> >  DECODETREE = $(SRC_PATH)/scripts/decodetree.py
> >
> >  decode32-y = $(SRC_PATH)/target/riscv/insn32.decode
> > diff --git a/target/riscv/monitor.c b/target/riscv/monitor.c
> > new file mode 100644
> > index 000..30560ff
> > --- /dev/null
> > +++ b/target/riscv/monitor.c
> > @@ -0,0 +1,227 @@
> > +/*
> > + * QEMU monitor for RISC-V
> > + *
> > + * Copyright (c) 2019 Bin Meng 
> > + *
> > + * RISC-V specific monitor commands implementation
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2 or later, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License 
> > for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along 
> > with
> > + * this program.  If not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "cpu.h"
> > +#include "cpu_bits.h"
> > +#include "monitor/monitor.h"
> > +#include "monitor/hmp-target.h"
> > +
> > +#ifdef TARGET_RISCV64
> > +#define PTE_HEADER_FIELDS   "vaddrpaddr"\
> > +"size attr\n"
> > +#define PTE_HEADER_DELIMITER"  "\
> > +" ---\n"
> > +#else
> > +#define PTE_HEADER_FIELDS   "vaddrpaddrsize attr\n"
> > +#define PTE_HEADER_DELIMITER"   
> > ---\n"
> > +#endif
> > +
> > +/* Perform linear address sign extension */
> > +static target_ulong addr_canonical(int va_bits, target_ulong addr)
> > +{
> > +#ifdef TARGET_RISCV64
> > +if (addr & (1UL << (va_bits - 1))) {
> > +addr |= (hwaddr)-(1L << va_bits);
> > +}
> > +#endif
> > +
> > +return addr;
> > +}
> > +
> > +static void print_pte_header(Monitor *mon)
> > +{
> > +monitor_printf(mon, PTE_HEADER_FIELDS);
> > +monitor_printf(mon, PTE_HEADER_DELIMITER);
> > +}
> > +
> > +static void print_pte(Monitor *mon, int va_bits, target_ulong vaddr,
> > +  hwaddr paddr, target_ulong size, int attr)
> > +{
> > +/* sanity check on vaddr */
> > +if (vaddr >= (1UL << va_bits)) {
> > +return;
> > +}
> > +
> > +if (!size) {
> > +return;
> > +}
> > +
> > +monitor_printf(mon, TARGET_FMT_lx " " TARGET_FMT_plx " " TARGET_FMT_lx
> > +   " %c%c%c%c%c%c%c\n",
> > +   addr_canonical(va_bits, vaddr),
> > +   paddr, size,
> > +   attr & PTE_R ? 'r' : '-',
> > +   attr & PTE_W ? 'w' : '-',
> > +   attr & PTE_X ? 'x' : '-',
> > +   attr & PTE_U ? 'u' : '-',
> > +   attr & PTE_G ? 'g' : '-',
> > +   attr & PTE_A ? 'a' : '-',
> > +   attr & PTE_D ? 'd' : '-');
> > +}
> > +
> > +static void walk_pte(Monitor *mon, hwaddr base, target_ulong start,
> > + int level, int ptidxbits, int ptesize, int va_bits,
> > + hwaddr *vbase, hwaddr *pbase, hwaddr *last_paddr,
> > + 

Re: [Qemu-devel] [PATCH v9 01/11] hw/arm: simplify arm_load_dtb

2019-08-13 Thread Andrew Jeffery



On Wed, 14 Aug 2019, at 07:30, Alistair Francis wrote:
> On Fri, Aug 9, 2019 at 12:01 AM Tao  wrote:
> >
> > From: Tao Xu 
> >
> > In struct arm_boot_info, kernel_filename, initrd_filename and
> > kernel_cmdline are copied from MachineState. This patch adds
> > MachineState as a parameter to arm_load_dtb() and moves the copy chunk
> > of kernel_filename, initrd_filename and kernel_cmdline into
> > arm_load_kernel().
> >
> > Reviewed-by: Igor Mammedov 
> > Reviewed-by: Liu Jingqi 
> > Suggested-by: Igor Mammedov 
> > Signed-off-by: Tao Xu 
> 
> Reviewed-by: Alistair Francis 
> 
> Alistair
> 
> > ---
> >
> > No changes in v9
> > ---
> >  hw/arm/aspeed.c   |  5 +

For the ASPEED machines:

Acked-by: Andrew Jeffery 



[Qemu-devel] [PATCH] test-bitmap: test set 1 bit case for bitmap_set

2019-08-13 Thread Wei Yang
All current bitmap_set test cases set a range that crosses a word
boundary, while a range within one word is handled differently.

Add a case that sets 1 bit to represent setting a range within one word.

Signed-off-by: Wei Yang 

---
Thanks to Paolo for finding this.

---
 tests/test-bitmap.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/tests/test-bitmap.c b/tests/test-bitmap.c
index 18aa584591..087e02a26c 100644
--- a/tests/test-bitmap.c
+++ b/tests/test-bitmap.c
@@ -67,6 +67,18 @@ static void bitmap_set_case(bmap_set_func set_func)
 
 bmap = bitmap_new(BMAP_SIZE);
 
+/* Set one bit at offset in second word */
+for (offset = 0; offset <= BITS_PER_LONG; offset++) {
+bitmap_clear(bmap, 0, BMAP_SIZE);
+set_func(bmap, BITS_PER_LONG + offset, 1);
+g_assert_cmpint(find_first_bit(bmap, 2 * BITS_PER_LONG),
+==, BITS_PER_LONG + offset);
+g_assert_cmpint(find_next_zero_bit(bmap,
+   3 * BITS_PER_LONG,
+   BITS_PER_LONG + offset),
+==, BITS_PER_LONG + offset + 1);
+}
+
 /* Both Aligned, set bits [BITS_PER_LONG, 3*BITS_PER_LONG] */
 set_func(bmap, BITS_PER_LONG, 2 * BITS_PER_LONG);
 g_assert_cmpuint(bmap[1], ==, -1ul);
-- 
2.17.1
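
To show why a range inside a single word is its own case, here is a minimal standalone sketch. It is not QEMU's bitmap_set(); sketch_bitmap_set, count_single_bit_failures and MAP_WORDS are hypothetical names, and the loop covers each bit position of the second word.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_LONG ((size_t)(sizeof(unsigned long) * CHAR_BIT))
#define MAP_WORDS 4                 /* hypothetical bitmap size in words */

/* Set bits [start, start + nr) using word masks instead of a bit loop. */
static void sketch_bitmap_set(unsigned long *map, size_t start, size_t nr)
{
    size_t end, first, last;
    unsigned long head, tail;

    if (nr == 0) {
        return;
    }
    end = start + nr;                       /* one past the last bit */
    first = start / BITS_PER_LONG;
    last = (end - 1) / BITS_PER_LONG;
    /* Bits from 'start' up to the top of the first word. */
    head = ~0UL << (start % BITS_PER_LONG);
    /* Bits from the bottom of the last word up to 'end' (exclusive). */
    tail = ~0UL >> ((BITS_PER_LONG - end % BITS_PER_LONG) % BITS_PER_LONG);

    if (first == last) {
        /* Range within one word: both masks apply to the same word. */
        map[first] |= head & tail;
        return;
    }
    map[first] |= head;
    for (size_t i = first + 1; i < last; i++) {
        map[i] = ~0UL;
    }
    map[last] |= tail;
}

/* Sets one bit at every offset of the second word; counts wrong bits. */
static int count_single_bit_failures(void)
{
    unsigned long map[MAP_WORDS];
    int failures = 0;

    for (size_t offset = 0; offset < BITS_PER_LONG; offset++) {
        size_t bit = BITS_PER_LONG + offset;

        memset(map, 0, sizeof(map));
        sketch_bitmap_set(map, bit, 1);
        for (size_t i = 0; i < MAP_WORDS * BITS_PER_LONG; i++) {
            int is_set = (map[i / BITS_PER_LONG] >> (i % BITS_PER_LONG)) & 1UL;

            if (is_set != (i == bit)) {
                failures++;
            }
        }
    }
    return failures;
}
```

The single-word branch is only reached when the first and last word indexes are equal, which is why tests that only use ranges crossing a word boundary never exercise it.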




Re: [Qemu-devel] [RFC] dirty-bitmaps: add block-dirty-bitmap-persist command

2019-08-13 Thread Eric Blake
On 8/13/19 5:44 PM, John Snow wrote:
> This is for the purpose of toggling on/off persistence on a bitmap.
> This enables you to save a bitmap that was not persistent, but may
> have already accumulated valuable data.
> 
> This is simply a QOL enhancement:
> - Allows user to "upgrade" an existing bitmap to persistent
> - Allows user to "downgrade" an existing bitmap to transient,
>   removing it from storage without deleting the bitmap.
> 

In the meantime, a workaround is:

create tmp bitmap (non-persistent is fine)
merge existing bitmap into tmp bitmap
delete existing bitmap
recreate original bitmap with desired change in persistence
merge tmp bitmap into re-created original bitmap
delete tmp bitmap

(I'm not sure how much, if any of that, has to be done with a
transaction; ideally none, since merging two bitmaps that are both
enabled is not going to lose any bits.  And since one of the two ends of
the transaction has a non-persistent bitmap, qemu failing in the narrow
window where the original bitmap does not exist at all is not that much
different from failing while the bitmap is transient. If losing data due
to qemu failure was important, the bitmap should never have been
transient in the first place)

> Signed-off-by: John Snow 
> ---
> 
> This is just an RFC because I'm not sure if I really want to pursue
> adding this, but it was raised in a discussion I had recently that it
> was a little annoying as an API design that persistence couldn't be
> changed after addition, so I wanted to see how much code it would take
> to address that.
> 
> (So this patch isn't really tested; just: "Hey, look!")
> 
> I don't like this patch because it exacerbates my perceived problems
> with the "check if I can make it persistent, then toggle the flag"
> model, where I prefer the "Just try to set it persistent and let it fail
> if it cannot" model, but there were some issues with that patchset that
> I want to revisit.

The idea itself makes sense. I don't know if libvirt would ever use it,
but it does seem like it could make hand-management of bitmaps easier to
reason about.

> +++ b/qapi/block-core.json
> @@ -2001,6 +2001,19 @@
>'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32',
>  '*persistent': 'bool', '*autoload': 'bool', '*disabled': 'bool' 
> } }
>  
> +##
> +# @BlockDirtyBitmapPersist:

The QAPI additions look fine to me, regardless of whether you respin the
code based on review there.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [FOR 4.1 PATCH] riscv: roms: Fix make rules for building sifive_u bios

2019-08-13 Thread Palmer Dabbelt

On Tue, 13 Aug 2019 09:52:13 PDT (-0700), alistai...@gmail.com wrote:

On Tue, Aug 13, 2019 at 6:00 AM Peter Maydell  wrote:


On Mon, 12 Aug 2019 at 09:38, Peter Maydell  wrote:
>
> On Sun, 11 Aug 2019 at 08:17, Bin Meng  wrote:
> >
> > Hi Palmer,
> >
> > On Tue, Aug 6, 2019 at 1:04 AM Alistair Francis  
wrote:
> > >
> > > On Fri, Aug 2, 2019 at 11:08 PM Bin Meng  wrote:
> > > >
> > > > Currently the make rules are wrongly using qemu/virt opensbi image
> > > > for sifive_u machine. Correct it.
> > > >
> > > > Signed-off-by: Bin Meng 
> > >
> > > Good catch.
> > >
> > > @Palmer Dabbelt can you take this for 4.1?
> > >
> >
> > Is this patch merged for 4.1? Thanks!
>
> Sorry, it doesn't look like it is, and it's now missed the
> deadline for 4.1 (only critical showstopper bugs and security
> issues would go in at this point).

Since a very late ppc pullreq turned up which needed to also go into
rc5 and meant we couldn't just have a single-change rc, I figured this
was safe enough to also apply for rc5, so I've put it in.


Thanks Peter!


Ya, that's great -- this will save us some headaches.



Re: [Qemu-devel] [PATCH v2 6/7] target/riscv: rationalise softfloat includes

2019-08-13 Thread Palmer Dabbelt

On Fri, 09 Aug 2019 18:55:42 PDT (-0700), alistai...@gmail.com wrote:

On Fri, Aug 9, 2019 at 2:22 AM Alex Bennée  wrote:


We should avoid including the whole of softfloat headers in cpu.h and
explicitly include it only where we will be calling softfloat
functions. We can use the -types.h and -helpers.h in cpu.h for the few
bits that are global.

Signed-off-by: Alex Bennée 
Reviewed-by: Richard Henderson 


I just reviewed v1, but this also applies to v2:

Reviewed-by: Alistair Francis 


Acked-by: Palmer Dabbelt 

I'm assuming this is going in through another tree, along with the rest of the
patch set.




Alistair


---
 target/riscv/cpu.c| 1 +
 target/riscv/cpu.h| 2 +-
 target/riscv/fpu_helper.c | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index f8d07bd20ad..6d52f97d7c3 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -27,6 +27,7 @@
 #include "qemu/error-report.h"
 #include "hw/qdev-properties.h"
 #include "migration/vmstate.h"
+#include "fpu/softfloat-helpers.h"

 /* RISC-V CPU definitions */

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 0adb307f329..240b31e2ebb 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -22,7 +22,7 @@

 #include "qom/cpu.h"
 #include "exec/cpu-defs.h"
-#include "fpu/softfloat.h"
+#include "fpu/softfloat-types.h"

 #define TCG_GUEST_DEFAULT_MO 0

diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index b4f818a6465..0b79562a690 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -21,6 +21,7 @@
 #include "qemu/host-utils.h"
 #include "exec/exec-all.h"
 #include "exec/helper-proto.h"
+#include "fpu/softfloat.h"

 target_ulong riscv_cpu_get_fflags(CPURISCVState *env)
 {
--
2.20.1




Re: [Qemu-devel] [PATCH v2] RISC-V: Ignore the S and U letters when formatting ISA strings

2019-08-13 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20190813225307.5792-1-pal...@sifive.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH v2] RISC-V: Ignore the S and U letters when 
formatting ISA strings
Message-id: 20190813225307.5792-1-pal...@sifive.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20190813225307.5792-1-pal...@sifive.com -> 
patchew/20190813225307.5792-1-pal...@sifive.com
Submodule 'capstone' (https://git.qemu.org/git/capstone.git) registered for 
path 'capstone'
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (https://git.qemu.org/git/QemuMacDrivers.git) 
registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (https://git.qemu.org/git/SLOF.git) registered for path 
'roms/SLOF'
Submodule 'roms/edk2' (https://git.qemu.org/git/edk2.git) registered for path 
'roms/edk2'
Submodule 'roms/ipxe' (https://git.qemu.org/git/ipxe.git) registered for path 
'roms/ipxe'
Submodule 'roms/openbios' (https://git.qemu.org/git/openbios.git) registered 
for path 'roms/openbios'
Submodule 'roms/openhackware' (https://git.qemu.org/git/openhackware.git) 
registered for path 'roms/openhackware'
Submodule 'roms/opensbi' (https://git.qemu.org/git/opensbi.git) registered for 
path 'roms/opensbi'
Submodule 'roms/qemu-palcode' (https://git.qemu.org/git/qemu-palcode.git) 
registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (https://git.qemu.org/git/seabios.git/) registered for 
path 'roms/seabios'
Submodule 'roms/seabios-hppa' (https://git.qemu.org/git/seabios-hppa.git) 
registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (https://git.qemu.org/git/sgabios.git) registered for 
path 'roms/sgabios'
Submodule 'roms/skiboot' (https://git.qemu.org/git/skiboot.git) registered for 
path 'roms/skiboot'
Submodule 'roms/u-boot' (https://git.qemu.org/git/u-boot.git) registered for 
path 'roms/u-boot'
Submodule 'roms/u-boot-sam460ex' (https://git.qemu.org/git/u-boot-sam460ex.git) 
registered for path 'roms/u-boot-sam460ex'
Submodule 'slirp' (https://git.qemu.org/git/libslirp.git) registered for path 
'slirp'
Submodule 'tests/fp/berkeley-softfloat-3' 
(https://git.qemu.org/git/berkeley-softfloat-3.git) registered for path 
'tests/fp/berkeley-softfloat-3'
Submodule 'tests/fp/berkeley-testfloat-3' 
(https://git.qemu.org/git/berkeley-testfloat-3.git) registered for path 
'tests/fp/berkeley-testfloat-3'
Submodule 'ui/keycodemapdb' (https://git.qemu.org/git/keycodemapdb.git) 
registered for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out 
'22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '88f18909db731a627456f26d779445f84e449536'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out 
'90c488d5f4a407342247b9ea869df1c2d9c8e266'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 
'ba1ab360eebe6338bb8d7d83a9220ccf7e213af3'
Cloning into 'roms/edk2'...
Submodule path 'roms/edk2': checked out 
'20d2e5a125e34fc8501026613a71549b2a1a3e54'
Submodule 'SoftFloat' (https://github.com/ucb-bar/berkeley-softfloat-3.git) 
registered for path 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Submodule 'CryptoPkg/Library/OpensslLib/openssl' 
(https://github.com/openssl/openssl) registered for path 
'CryptoPkg/Library/OpensslLib/openssl'
Cloning into 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'...
Submodule path 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3': 
checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'CryptoPkg/Library/OpensslLib/openssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl': checked out 
'50eaac9f3337667259de725451f201e784599687'
Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl) registered 
for path 'boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path 'krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git) 
registered for path 'pyca-cryptography'
Cloning into 'boringssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl': 
checked out '2070f8ad9151dc8f3a73bffaa146b5e6937a583f'
Cloning into 'krb5'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5': checked 
out 'b9ad6c49505c96a088326b62a52568e3484f2168'
Cloning into 'pyca-cryptography'...
Submodule path 
'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/pyca-cryptography': checked out 
'09403100de2f6f1cdd0d484dcb8e620f1c335c8f'
Cloning into 

[Qemu-devel] [PATCH v2] RISC-V: Ignore the S and U letters when formatting ISA strings

2019-08-13 Thread Palmer Dabbelt
The ISA strings we're providing from QEMU aren't actually legal RISC-V
ISA strings, as both S and U cannot exist as single-letter extensions
and must instead be multi-letter strings.  We're still using the ISA
strings inside QEMU to track the available extensions, so this patch
just strips out the S and U extensions when formatting ISA strings.

This boots Linux on top of 4.1-rc3, which no longer has the U extension
in /proc/cpuinfo.

Signed-off-by: Palmer Dabbelt 
---
 target/riscv/cpu.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index f8d07bd20ad7..a67c54c738ba 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -501,7 +501,23 @@ char *riscv_isa_string(RISCVCPU *cpu)
 char *p = isa_str + snprintf(isa_str, maxlen, "rv%d", TARGET_LONG_BITS);
 for (i = 0; i < sizeof(riscv_exts); i++) {
 if (cpu->env.misa & RV(riscv_exts[i])) {
-*p++ = qemu_tolower(riscv_exts[i]);
+char lower = qemu_tolower(riscv_exts[i]);
+switch (lower) {
+case 's':
+case 'u':
+/*
+* The 's' and 'u' letters shouldn't show up in ISA strings as
+* they're not extensions, but they should show up in MISA.
+* Since we use these letters internally as a pseudo ISA string
+* to set MISA it's easier to just strip them out when
+* formatting the ISA string.
+*/
+break;
+
+default:
+*p++ = lower;
+break;
+}
 }
 }
 *p = '\0';
-- 
2.21.0
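
A standalone sketch of the same filtering, for readers without the QEMU tree at hand. The extension order in sketch_exts and the names MISA_BIT, sketch_isa_string and sketch_isa_demo are assumptions made for illustration; they are not QEMU's riscv_exts[] table or its RV() macro.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical extension order; not necessarily QEMU's riscv_exts[]. */
static const char sketch_exts[] = "iemafdqcsu";

/* One misa bit per letter, with bit 0 standing for 'a'. */
#define MISA_BIT(letter) (UINT32_C(1) << ((letter) - 'a'))

/* Formats "rv<xlen><letters>", leaving out 's' and 'u'; 0 on success. */
static int sketch_isa_string(uint32_t misa, int xlen, char *buf, size_t len)
{
    int n = snprintf(buf, len, "rv%d", xlen);
    size_t pos;

    if (n < 0 || (size_t)n >= len) {
        return -1;                  /* prefix did not fit */
    }
    pos = (size_t)n;

    for (size_t i = 0; i < strlen(sketch_exts); i++) {
        char c = sketch_exts[i];

        if (!(misa & MISA_BIT(c))) {
            continue;               /* letter not set in misa */
        }
        if (c == 's' || c == 'u') {
            continue;               /* privilege modes, not extensions */
        }
        if (pos + 1 >= len) {
            return -1;              /* no room for letter plus NUL */
        }
        buf[pos++] = c;
    }
    buf[pos] = '\0';
    return 0;
}

/* Formats a sample misa value that has I, M, A, F, D, C, S and U set. */
static const char *sketch_isa_demo(void)
{
    static char buf[32];
    uint32_t misa = MISA_BIT('i') | MISA_BIT('m') | MISA_BIT('a') |
                    MISA_BIT('f') | MISA_BIT('d') | MISA_BIT('c') |
                    MISA_BIT('s') | MISA_BIT('u');

    if (sketch_isa_string(misa, 64, buf, sizeof(buf)) != 0) {
        return "";
    }
    return buf;
}
```

With the sample value, sketch_isa_demo() yields "rv64imafdc": the S and U bits stay in the misa value but are not printed.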




Re: [Qemu-devel] [PATCH for 4.1] RISC-V: Ignore the S and U extensions when formatting ISA strings

2019-08-13 Thread Palmer Dabbelt

On Wed, 07 Aug 2019 10:54:52 PDT (-0700), alistai...@gmail.com wrote:

On Wed, Aug 7, 2019 at 8:00 AM Palmer Dabbelt  wrote:


The ISA strings we're providing from QEMU aren't actually legal RISC-V
ISA strings, as both the S and U extensions cannot exist as
single-letter extensions and must instead be multi-letter strings.
We're still using the ISA strings inside QEMU to track the availiable


s/availiable/available/g


extensions, so this patch just strips out the S and U extensions when
formatting ISA strings.


Atish and I were talking about this and we concluded that S and U
aren't extensions, but should be reported in the misa CSR.


Andrew agrees.





This boots Linux on top of 4.1-rc3, which no longer has the U extension
in /proc/cpuinfo.

Signed-off-by: Palmer Dabbelt 
---
This is another late one, but I'd like to target it for 4.1 as we're
providing illegal ISA strings and I don't want to bake that into a bunch
of other code.
---
 target/riscv/cpu.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index f8d07bd20ad7..4df14433d789 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -501,7 +501,22 @@ char *riscv_isa_string(RISCVCPU *cpu)
 char *p = isa_str + snprintf(isa_str, maxlen, "rv%d", TARGET_LONG_BITS);
 for (i = 0; i < sizeof(riscv_exts); i++) {
 if (cpu->env.misa & RV(riscv_exts[i])) {
-*p++ = qemu_tolower(riscv_exts[i]);
+char lower = qemu_tolower(riscv_exts[i]);
+switch (lower) {
+case 's':
+case 'u':
+/*
+ * The 's' and 'u' extensions shouldn't be passed in the device
+ * tree, but we still use them internally to track extension
+ * sets.  Here we just explicitly remove them when formatting
+ * an ISA string.


This should be updated to not mention 's' and 'u' as extensions, but
clarify that they are correctly included in the misa CSR.


I'll send a v2 that cleans up the wording on the comment and commit message.



Alistair


+ */
+break;
+
+default:
+*p++ = qemu_tolower(riscv_exts[i]);
+break;
+}
 }
 }
 *p = '\0';
--
2.21.0






Re: [Qemu-devel] [PATCH] Fix Guest VM crash due to iSCSI Sense Key error

2019-08-13 Thread John Snow



On 7/7/19 10:55 PM, shaju.abra...@nutanix.com wrote:
> From: Shaju Abraham 
> 
> During an IDE DMA transfer for an iSCSI target, when libiscsi encounters
> a SENSE KEY error, it sets task->sense to the value "COMMAND ABORTED".
> The function iscsi_translate_sense() later translates this error to
> -ECANCELED and this value is passed to the callback function. In the
> case of an IDE DMA read or write, the callback function returns
> immediately if the value of the ret argument is -ECANCELED.
> Later, when the ide_cancel_dma_sync() function is invoked, the assertion
> "s->bus->dma->aiocb == ((void *)0)" fails and the qemu process gets
> terminated.
> Fix the issue by making the value of s->bus->dma->aiocb = NULL when
> -ECANCELED is passed to the callback.
> 
> Signed-off-by: Shaju Abraham 
> ---
>  hw/ide/core.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index 6afadf8..78ea357 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -841,6 +841,7 @@ static void ide_dma_cb(void *opaque, int ret)
>  bool stay_active = false;
>  
>  if (ret == -ECANCELED) {
> +s->bus->dma->aiocb = NULL;
>  return;
>  }
>  
> 

Hopefully just as adequately addressed by the patches in

https://github.com/jnsnow/qemu/commits/ide

but if you wanted to give it a test and confirm for me, I wouldn't be
upset by that.

--js
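
To make the failure mode in the report concrete, here is a toy model of it,
not the hw/ide code. The struct and function names are hypothetical. It only
shows why a callback that returns early on -ECANCELED has to retire the
in-flight request before a later synchronous cancel checks for it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the DMA bookkeeping; names are hypothetical. */
struct toy_dma {
    void *aiocb;    /* non-NULL while a request is considered in flight */
};

/* Completion callback with the one-line fix from the patch applied. */
static void toy_dma_cb(struct toy_dma *dma, int ret)
{
    if (ret == -ECANCELED) {
        dma->aiocb = NULL;  /* the added line: forget the dead request */
        return;
    }
    dma->aiocb = NULL;      /* normal completion also retires the request */
}

/*
 * Models the invariant behind the failing assertion: by the time a
 * synchronous cancel finishes, nothing may still be marked in flight.
 */
static bool toy_cancel_sync_ok(const struct toy_dma *dma)
{
    return dma->aiocb == NULL;
}
```

Without the assignment in the -ECANCELED branch, toy_cancel_sync_ok() would
report false after a cancelled request, which corresponds to the assertion
quoted above.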



Re: [Qemu-devel] [PATCH] dma-helpers: ensure AIO callback is invoked after cancellation

2019-08-13 Thread John Snow



On 8/13/19 6:40 PM, John Snow wrote:
> 
> 
> On 7/29/19 5:34 PM, Paolo Bonzini wrote:
>> dma_aio_cancel unschedules the BH if there is one, which corresponds
>> to the reschedule_dma case of dma_blk_cb.  This can stall the DMA
>> permanently, because dma_complete will never get invoked and therefore
>> nobody will ever invoke the original AIO callback in dbs->common.cb.
>>
>> Fix this by invoking the callback (which is ensured to happen after
>> a bdrv_aio_cancel_async, or done manually in the dbs->bh case), and
>> add assertions to check that the DMA state machine is indeed waiting
>> for dma_complete or reschedule_dma, but never both.
>>
>> Reported-by: John Snow 
>> Signed-off-by: Paolo Bonzini 
> 
> No maintainer here, I guess; Paolo will you be pulling this or should I
> do it as part of the other IDE fixes I need to make?
> 

Nevermind, I made a decision.

--js
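
A rough standalone sketch of the rule Paolo's commit message describes:
a request waits either for an I/O completion or for a reschedule, and
cancelling it must always end with the original callback being invoked.
Everything below is a toy with hypothetical names; in particular, the real
code relies on the block layer to deliver the callback for the pending-I/O
case, which this sketch simply calls directly.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef void toy_completion_fn(void *opaque, int ret);

/* Toy request state; field names are hypothetical. */
struct toy_dma_req {
    bool io_pending;        /* waiting for an I/O completion */
    bool bh_scheduled;      /* waiting to be rescheduled */
    toy_completion_fn *cb;  /* the caller's original callback */
    void *opaque;
};

/* Example callback that records the result it was given. */
static void toy_record_ret(void *opaque, int ret)
{
    *(int *)opaque = ret;
}

static void toy_dma_cancel(struct toy_dma_req *req)
{
    /*
     * The state machine waits for one thing or the other, never both.
     * This sketch additionally assumes cancel is only called while
     * something is pending.
     */
    assert(req->io_pending != req->bh_scheduled);

    if (req->bh_scheduled) {
        /* Unscheduling alone would stall: nobody else will call cb. */
        req->bh_scheduled = false;
    } else {
        /* Stand-in for the asynchronous I/O cancel path. */
        req->io_pending = false;
    }
    req->cb(req->opaque, -ECANCELED);
}
```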



Re: [Qemu-devel] [PATCH] Revert "ide/ahci: Check for -ECANCELED in aio callbacks"

2019-08-13 Thread John Snow



On 7/29/19 6:36 PM, John Snow wrote:
> This reverts commit 0d910cfeaf2076b116b4517166d5deb0fea76394.
> 
> It's not correct to just ignore an error code in a callback; we need to
> handle that error and possibly report failure to the guest so that they
> don't wait indefinitely for an operation that will now never finish.
> 
> This ought to help cases reported by Nutanix where iSCSI returns a
> legitimate -ECANCELED for certain operations which should be propagated
> normally.
> 
> Reported-by: Shaju Abraham 
> Signed-off-by: John Snow 

Nobody's yelling, so this is getting staged on my IDE branch, alongside
Paolo's dma-helpers fix.

Thanks, applied to my IDE tree:

https://github.com/jnsnow/qemu/commits/ide
https://github.com/jnsnow/qemu.git

--js



[Qemu-devel] [RFC] dirty-bitmaps: add block-dirty-bitmap-persist command

2019-08-13 Thread John Snow
This is for the purpose of toggling on/off persistence on a bitmap.
This enables you to save a bitmap that was not persistent, but may
have already accumulated valuable data.

This is simply a QOL enhancement:
- Allows user to "upgrade" an existing bitmap to persistent
- Allows user to "downgrade" an existing bitmap to transient,
  removing it from storage without deleting the bitmap.

Signed-off-by: John Snow 
---

This is just an RFC because I'm not sure if I really want to pursue
adding this, but it was raised in a discussion I had recently that it
was a little annoying as an API design that persistence couldn't be
changed after addition, so I wanted to see how much code it would take
to address that.

(So this patch isn't really tested; just: "Hey, look!")

I don't like this patch because it exacerbates my perceived problems
with the "check if I can make it persistent, then toggle the flag"
model, where I prefer the "Just try to set it persistent and let it fail
if it cannot" model, but there were some issues with that patchset that
I want to revisit.

---

 blockdev.c   | 49 
 qapi/block-core.json | 34 ++
 2 files changed, 83 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 2d7e7be538..230442e921 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3095,6 +3095,55 @@ void qmp_block_dirty_bitmap_merge(const char *node, 
const char *target,
 do_block_dirty_bitmap_merge(node, target, bitmaps, NULL, errp);
 }
 
+void qmp_block_dirty_bitmap_persist(const char *node, const char *name,
+bool persist, Error **errp)
+{
+BdrvDirtyBitmap *bitmap;
+BlockDriverState *bs;
+AioContext *aio_context = NULL;
+Error *local_err = NULL;
+bool persistent;
+
+bitmap = block_dirty_bitmap_lookup(node, name, &bs, errp);
+if (!bitmap || !bs) {
+return;
+}
+
+if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_DEFAULT, errp)) {
+return;
+}
+
+persistent = bdrv_dirty_bitmap_get_persistence(bitmap);
+
+if (persist != persistent) {
+aio_context = bdrv_get_aio_context(bs);
+aio_context_acquire(aio_context);
+}
+
+if (!persist && persistent) {
+bdrv_remove_persistent_dirty_bitmap(bs, name, &local_err);
+if (local_err != NULL) {
+error_propagate(errp, local_err);
+goto out;
+}
+}
+
+if (persist && !persistent) {
+uint32_t granularity = bdrv_dirty_bitmap_granularity(bitmap);
+if (!bdrv_can_store_new_dirty_bitmap(bs, name, granularity, errp)) {
+goto out;
+}
+}
+
+bdrv_dirty_bitmap_set_persistence(bitmap, persist);
+
+ out:
+if (aio_context) {
+aio_context_release(aio_context);
+}
+return;
+}
+
 BlockDirtyBitmapSha256 *qmp_x_debug_block_dirty_bitmap_sha256(const char *node,
   const char *name,
   Error **errp)
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3dbf23d874..9c0957f528 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2001,6 +2001,19 @@
   'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32',
 '*persistent': 'bool', '*autoload': 'bool', '*disabled': 'bool' } }
 
+##
+# @BlockDirtyBitmapPersist:
+#
+# @persist: True sets the specified bitmap as persistent.
+#   False will remove it from storage and mark it transient.
+#
+# Since: 4.2
+##
+{ 'struct': 'BlockDirtyBitmapPersist',
+  'base': 'BlockDirtyBitmap',
+  'data': { 'persist': 'bool' }
+}
+
 ##
 # @BlockDirtyBitmapMergeSource:
 #
@@ -2173,6 +2186,27 @@
   { 'command': 'block-dirty-bitmap-merge',
 'data': 'BlockDirtyBitmapMerge' }
 
+##
+# @block-dirty-bitmap-persist:
+#
+# Mark a dirty bitmap as either persistent or transient.
+#
+# Returns: nothing on success; including for no-op.
+#  GenericError with explanation if the operation did not succeed.
+#
+# Example:
+#
+# -> { "execute": "block-dirty-bitmap-persist",
+#  "arguments": { "node": "drive0",
# "name": "bitmap0",
+# "persist": true } }
+# <- { "return": {} }
+#
+# Since: 4.2
+##
+{ 'command': 'block-dirty-bitmap-persist',
+  'data': 'BlockDirtyBitmapPersist' }
+
 ##
 # @BlockDirtyBitmapSha256:
 #
-- 
2.21.0
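
The control flow of the proposed command can be sketched outside QEMU as
follows. This is a toy under stated assumptions: the structs and the
toy_bitmap_set_persist() helper are hypothetical, storage is reduced to two
booleans, and the error value is arbitrary. It follows the "check if it can
be stored, then toggle the flag" model the RFC text describes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins; none of these names exist in QEMU. */
struct toy_node {
    bool can_store_bitmaps;     /* e.g. the image format supports it */
};

struct toy_bitmap {
    bool persistent;            /* flag toggled by the command */
    bool in_storage;            /* a copy currently exists in the image */
};

/* Returns 0 on success (including a no-op), negative errno on failure. */
static int toy_bitmap_set_persist(struct toy_node *node,
                                  struct toy_bitmap *bm, bool persist)
{
    if (persist == bm->persistent) {
        return 0;               /* already in the requested state */
    }

    if (persist) {
        /* "Upgrade": check first, then flip the flag. */
        if (!node->can_store_bitmaps) {
            return -ENOTSUP;
        }
    } else {
        /* "Downgrade": drop the stored copy, keep the in-memory bitmap. */
        bm->in_storage = false;
    }

    bm->persistent = persist;   /* the requested value, not the old one */
    return 0;
}
```

The alternative model mentioned in the RFC notes would drop the explicit
can-store check and let the attempt to set persistence fail on its own.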




Re: [Qemu-devel] [PULL 04/32] target/riscv: Implement riscv_cpu_unassigned_access

2019-08-13 Thread Palmer Dabbelt

On Thu, 01 Aug 2019 08:39:17 PDT (-0700), Peter Maydell wrote:

On Wed, 3 Jul 2019 at 09:41, Palmer Dabbelt  wrote:


From: Michael Clark 

This patch adds support for the riscv_cpu_unassigned_access call
and will raise a load or store access fault.

Signed-off-by: Michael Clark 
[Changes by AF:
 - Squash two patches and rewrite commit message
 - Set baddr to the access address
]
Signed-off-by: Alistair Francis 
Reviewed-by: Palmer Dabbelt 
Signed-off-by: Palmer Dabbelt 


Oops, I missed seeing this go by. The do_unassigned_access
hook is deprecated and you should drop this and use
the do_transaction_failed hook instead.

The distinction between the two is that do_unassigned_access
will end up being called for any failing access, including
not just "normal" guest accesses but also for bad accesses
that happen during page table walks (which often want to
be reported to the guest differently) and also accesses
by random devices like DMA controllers (where throwing a
cpu exception is always a bug).

Changing the hook implementation itself should be straightforward;
commit 6ad4d7eed05a1e23537f is an example of doing that on Alpha.
You also want to check all the places in your target code that
do physical memory accesses, determine what the right behaviour
if they get a bus fault is, and implement that (or at least put
in TODO comments).


Sorry, updating that has been on my TODO list for a while now.  I figured it 
was better to have the deprecated version in there than nothing at all.  I've 
written some patches to fix this, but I want to give them another look before 
sending them out.
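
As a reading aid for the distinction Peter draws, here is a toy classifier
that picks a different reaction depending on where a failing access came
from. The enums and the policy are assumptions made for illustration. They
are not QEMU's hook signatures, and what a failed page-table walk should
report is target-specific.

```c
#include <assert.h>

/* Where the failing physical access originated (hypothetical names). */
enum toy_access_origin {
    TOY_ORIGIN_CPU_LOAD_STORE,  /* a "normal" guest access */
    TOY_ORIGIN_PAGE_TABLE_WALK, /* the MMU reading page tables */
    TOY_ORIGIN_DEVICE_DMA       /* some device model, e.g. a DMA engine */
};

/* What the emulator should do about it (hypothetical names). */
enum toy_action {
    TOY_RAISE_ACCESS_FAULT,     /* load/store access fault for the guest */
    TOY_REPORT_WALK_FAULT,      /* target-specific page-walk handling */
    TOY_RETURN_ERROR_TO_DEVICE  /* never a CPU exception */
};

static enum toy_action toy_on_failed_access(enum toy_access_origin origin)
{
    switch (origin) {
    case TOY_ORIGIN_CPU_LOAD_STORE:
        return TOY_RAISE_ACCESS_FAULT;
    case TOY_ORIGIN_PAGE_TABLE_WALK:
        return TOY_REPORT_WALK_FAULT;
    case TOY_ORIGIN_DEVICE_DMA:
    default:
        /* Throwing a CPU exception for a device's access would be a bug. */
        return TOY_RETURN_ERROR_TO_DEVICE;
    }
}
```

A single catch-all hook corresponds to returning TOY_RAISE_ACCESS_FAULT for
all three origins, which is the behaviour the review asks to move away from.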




Re: [Qemu-devel] [PATCH] dma-helpers: ensure AIO callback is invoked after cancellation

2019-08-13 Thread John Snow



On 7/29/19 5:34 PM, Paolo Bonzini wrote:
> dma_aio_cancel unschedules the BH if there is one, which corresponds
> to the reschedule_dma case of dma_blk_cb.  This can stall the DMA
> permanently, because dma_complete will never get invoked and therefore
> nobody will ever invoke the original AIO callback in dbs->common.cb.
> 
> Fix this by invoking the callback (which is ensured to happen after
> a bdrv_aio_cancel_async, or done manually in the dbs->bh case), and
> add assertions to check that the DMA state machine is indeed waiting
> for dma_complete or reschedule_dma, but never both.
> 
> Reported-by: John Snow 
> Signed-off-by: Paolo Bonzini 

No maintainer here, I guess; Paolo will you be pulling this or should I
do it as part of the other IDE fixes I need to make?

--js



Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] block: Make various formats' block_status recurse again

2019-08-13 Thread John Snow



On 8/13/19 10:48 AM, Max Reitz wrote:
> On 12.08.19 23:45, John Snow wrote:
>>
>>
>> On 8/12/19 3:11 PM, Max Reitz wrote:
>>> On 12.08.19 20:39, John Snow wrote:
>>>>
>>>>
>>>> On 7/25/19 11:55 AM, Max Reitz wrote:
>>>>> Hi,
>>>>>
>>>>> 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it
>>>>> would only go down to the protocol layer if the format layer returned
>>>>> BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient
>>>>> information whether a given range in the image is zero or not.
>>>>> Generally, this is because the image is preallocated and thus all ranges
>>>>> appear as zeroes.
>>>>>
>>>>> However, it only implemented this preallocation detection for qcow2.
>>>>> There are more formats that support preallocation, though: vdi, vhdx,
>>>>> vmdk, vpc.  (Funny how they all start with “v”.)
>>>>>
>>>>> For vdi, vmdk, and vpc, the fix is rather simple, because they really
>>>>> have different subformats depending on whether an image is preallocated
>>>>> or not.  This makes the check very simple.
>>>>>
>>>>> vhdx is more like qcow2, where after the image has been created, it
>>>>> isn’t clear whether it’s been preallocated or everything is allocated
>>>>> because everything was already written to.  69f47505ee added a heuristic
>>>>> to qcow2 to get around this, but I think that’s too much for vhdx.  I
>>>>> just left it unfixed, because I don’t care that much, honestly (and I
>>>>> don’t think anyone else does).
>>>>>
>>>>
>>>> What's the practical outcome of that, and is the limitation documented
>>>> somewhere?
>>>
>>> The outcome is that it if you preallocate a vhdx image
>>> (subformat=fixed), you’ll see that all sectors contain data, even if
>>> they may be zero sectors on the filesystem level.
>>>
>>> I don’t think it’s user-visible whatsoever.
>>>
>>
>> But it might mean that doing things with sync=top might over-allocate
>> data depending on the destination, wouldn't it?
>>
>> That's not crucial, but it's possibly visible, no?
> 
> I don’t think it has anything to do with sync=top because whether a
> block is zero on the protocol level has nothing to do with whether it is
> allocated on the format level.
> 
> It may make a difference for convert which uses block_status to inquire
> the zero status.  However, it also does zero-detection, so...
> 

Oh, okay then. Probably... fine, but I have a nagging doubt relating to
some of the fallbacks in e.g. qcow2 that tend to inflate zeroes in some
cases (or used to. Maybe it's been fixed since.)

...but I can't point to anything, so it's fine, and I'm just drawing
things out for no reason.

Reviewed-by: John Snow 

>>>> (I'm fine with not fixing it, I just want it documented somehow.)
>>>
>>> I am really not inclined to start any documentation on the
>>> particularities with which qemu handles vhdx images.
>>>
>>> (Especially so considering we don’t even have any documentation on the
>>> qcow2 case.  The stress in my paragraph was “heuristic”.  If you
>>> preallocate a qcow2 image, but then discard enough sectors that the
>>> heuristic thinks you didn’t, you’ll have the same effect.  Or if you
>>> grow a preallocated image without preallocating the new area.)
>>>
>>> Max
>>>
>>
>> "But our qcow2 docs are also bad" is the kind of argument I can't
>> *really* disagree with, but...
> 
> My main argument is that nobody would read the vhdx docs anyway.
> 
> Max
> 

That's the sort of thing I'd like to change, but I guess I haven't
really made good on that desire in any way, so what good is that?

--js
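
For anyone who wants the block_status behaviour from the cover letter in
code form, here is a toy model. The TOY_* flag names, their values and the
layer callbacks are assumptions of this sketch and only mirror the idea: the
format layer's answer is final unless it asks for recursion, in which case
the protocol layer may add the zero information.

```c
#include <assert.h>

/* Toy status flags; values are arbitrary and not QEMU's. */
#define TOY_BLOCK_DATA    0x1u
#define TOY_BLOCK_ZERO    0x2u
#define TOY_BLOCK_RECURSE 0x4u

typedef unsigned (*toy_status_fn)(long long offset);

/* Format layer of a preallocated image: everything looks like data. */
static unsigned toy_fmt_preallocated(long long offset)
{
    (void)offset;
    return TOY_BLOCK_DATA | TOY_BLOCK_RECURSE;
}

/* Format layer that is sure about its answer and does not recurse. */
static unsigned toy_fmt_plain(long long offset)
{
    (void)offset;
    return TOY_BLOCK_DATA;
}

/* Protocol layer on a sparse file: the range is a hole, so it reads as zero. */
static unsigned toy_proto_sparse(long long offset)
{
    (void)offset;
    return TOY_BLOCK_ZERO;
}

static unsigned toy_block_status(toy_status_fn format_layer,
                                 toy_status_fn protocol_layer,
                                 long long offset)
{
    unsigned ret = format_layer(offset);

    /* Only descend when the format layer says it cannot tell. */
    if ((ret & TOY_BLOCK_RECURSE) && !(ret & TOY_BLOCK_ZERO)) {
        if (protocol_layer(offset) & TOY_BLOCK_ZERO) {
            ret |= TOY_BLOCK_ZERO;
        }
    }
    return ret;
}
```

A format driver that never sets the recurse flag for its preallocated
subformat behaves like toy_fmt_plain here, which is the vhdx situation
discussed above: the range is reported as data even if the file has a hole.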



[Qemu-devel] [ANNOUNCE] QEMU 4.1.0-rc5 is now available

2019-08-13 Thread Michael Roth
Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
sixth release candidate for the QEMU 4.1 release.  This release is meant
for testing purposes and should not be used in a production environment.

  http://download.qemu-project.org/qemu-4.1.0-rc5.tar.xz
  http://download.qemu-project.org/qemu-4.1.0-rc5.tar.xz.sig

A note from the maintainer:

  A late-breaking security issue meant we needed to add an rc5
  to the 4.1.0 release process. The only changes since rc4 are
  the fix for a security issue in the bochs-display device, two
  bugfixes that only affect the PPC spapr machine, and a trivial
  makefile fix that only matters if you're building the risc-v
  BIOS images from source. We will release the final 4.1.0 on
  Thursday 15th August.

You can help improve the quality of the QEMU 4.1 release by testing this
release and reporting bugs on Launchpad:

  https://bugs.launchpad.net/qemu/

The release plan, as well as documented known issues for release
candidates, are available at:

  http://wiki.qemu.org/Planning/4.1

Please add entries to the ChangeLog for the 4.1 release below:

  http://wiki.qemu.org/ChangeLog/4.1

Thank you to everyone involved!

Changes since rc4:

f28ed74fd1: Update version for v4.1.0-rc5 release (Peter Maydell)
02db1be1d0: riscv: roms: Fix make rules for building sifive_u bios (Bin Meng)
310cda5b5e: spapr/xive: Fix migration of hot-plugged CPUs (Cédric Le Goater)
25c9780d38: spapr: Reset CAS & IRQ subsystem after devices (David Gibson)
5e7bcdcfe6: display/bochs: fix pcie support (Gerd Hoffmann)




Re: [Qemu-devel] [Qemu-block] [PATCH 7/7] iotests: Disable 126 for some vmdk subformats

2019-08-13 Thread John Snow



On 8/13/19 10:00 AM, Max Reitz wrote:
> On 12.08.19 23:33, John Snow wrote:
>>
>>
>> On 7/25/19 11:57 AM, Max Reitz wrote:
>>> Several vmdk subformats do not work with iotest 126, so disable them.
>>>
>>> (twoGbMaxExtentSparse actually should work, but fixing that is a bit
>>> difficult.  The problem is that the vmdk descriptor file will contain a
>>> referenc to "image:base.vmdk", which the block layer cannot open because
>>
>> reference
>>
>>> it does not know the protocol "image".  This is not trivial to solve,
>>> because I suppose real protocols like "http://" should be supported.
>>> Making vmdk treat all paths with a potential protocol prefix that the
>>> block layer does not recognize as plain files seems a bit weird,
>>> though.  Ignoring this problem does not seem too bad.)
>>>
>>> Signed-off-by: Max Reitz 
>>> ---
>>>  tests/qemu-iotests/126 | 6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/tests/qemu-iotests/126 b/tests/qemu-iotests/126
>>> index 9b0dcf9255..8e55d7c843 100755
>>> --- a/tests/qemu-iotests/126
>>> +++ b/tests/qemu-iotests/126
>>> @@ -33,6 +33,12 @@ status=1 # failure is the default!
>>>  
>>>  # Needs backing file support
>>>  _supported_fmt qcow qcow2 qed vmdk
>>> +# (1) Flat vmdk images do not support backing files
>>> +# (2) Split vmdk images simply fail this test right now.  Fixing that
>>> +# is left for another day.
>>
>> Which one? :)
> 
> H?  Fixing refers to #2.  #1 is not a bug or missing feature, it’s
> just how it is.  (This test needs backing files, so...)
> 
> If you mean “which are which“, then the ones with *Flat are flat images
> (:-)), and the ones with twoGbMaxExtent* are split.
> 

"Which day" ;)

>>> +_unsupported_imgopts "subformat=monolithicFlat" \
>>> + "subformat=twoGbMaxExtentFlat" \
>>> + "subformat=twoGbMaxExtentSparse"
>>>  # This is the default protocol (and we want to test the difference between
>>>  # colons which separate a protocol prefix from the rest and colons which 
>>> are
>>>  # just part of the filename, so we cannot test protocols which require a 
>>> prefix)
>>>
>>
>> What exactly fails?
> 
> Interestingly I only now noticed that the test passes with “vmdk: Use
> bdrv_dirname() for relative extent paths” (patch 2) reverted...
> 
>> Does the VMDK driver see `image:` and think it's a
>> special filename it needs to handle and fails to do so?
> No.  Whenever the block layer sees a parsee filename[1] with a colon
> before a slash, it thinks everything before the colon is a protocol
> prefix.  For example:
> 

Actually, I think we're on the same page here. I maybe meant to type
"block layer" instead of "VMDK driver", but it does look like it does
special processing on this sort of filename that breaks in this case.

> $ qemu-img info foo:bar
> qemu-img: Could not open 'foo:bar': Unknown protocol 'foo'
> 
> This test is precisely for this.  How can you specify an image filename
> that has a colon in it (without using -blockdev)?  One way is to prepend
> it with “./”, the other is “file:”.
> 
> Now with split VMDKs, we must write something in the header file to
> reference the extents.  What vmdk does for an image like
> “image:foo.vmdk” is it writes “image:foo-s001.vmdk” there.
> 
> When it tries to open that extent, what happens depends on whether
> “vmdk: Use bdrv_dirname() for relative extent paths” (patch 2) is applied:
> 
> --- Before that patch ---
> 
> vmdk takes the descriptor filename, which, thanks to some magic in the
> block layer, is always “./image:foo.vmdk”, even when you gave it as
> “file:image:foo.vmdk” (the “file:” is stripped because it does nothing,
> generally, and the “./” is then prepended because of the false protocol
> prefix “image:”).
> 
> It then invokes path_combine() with that path and the path given in the
> descriptor file (“image:foo-s001.vmdk”).  This yields
> “./image:foo-s001.vmdk”, which actually works.
> 
> --- After that patch ---
> 
> OK, what I messed up is that I just took the extent path to be an
> absolute path if it has a protocol prefix.  (Because that’s how we
> usually do it.)  Turns out that vmdk never did that, and path_combine()
> actually completely ignores protocol prefixes in the relative filename.
> 
> I suppose I could do the same and just drop the path_has_protocol() from
> patch 2.  But that’d be a bit broken, as I wrote in the commit
> message...  If the descriptor file refers to an extent on
> “http://example.com/extent.vmdk”, I suppose that should not be
> interpreted as a relative path, but actually work...
> 
> But anyway, I guess if it’s a bit broken already, I might just keep it
> that way.
> 
> 
> tl;dr: Turns out patch 2 broke this test, because it (accidentally)
> tried to fix something that I consider broken.  If I just keep it broken
> (I didn’t know it was), this test will continue to work and probably
> nobody will care because, well, it already is broken and nobody cares.
> 

So 
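
The "colon before a slash" rule Max describes can be sketched in a few lines
of standalone C. The helper names are hypothetical and this leaves out any
platform-specific special cases (such as drive letters); it only shows why
"image:foo.vmdk" looks like it has a protocol and "./image:foo.vmdk" does
not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * A path is treated as "protocol:rest" when a ':' appears before the
 * first '/'. Hypothetical helper, written for this example.
 */
static bool toy_has_protocol_prefix(const char *path)
{
    const char *p = path + strcspn(path, ":/");
    return *p == ':';
}

/*
 * Prepend "./" when needed so that a leading "name:" is read as part of
 * the filename. Returns 0 on success, -1 if out is too small.
 */
static int toy_escape_filename(const char *path, char *out, size_t len)
{
    const char *prefix = toy_has_protocol_prefix(path) ? "./" : "";
    int n = snprintf(out, len, "%s%s", prefix, path);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

With this rule, "foo:bar" and "http://example.com/extent.vmdk" both count as
having a protocol prefix, which is the ambiguity the extent-path handling
runs into.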

Re: [Qemu-devel] [PATCH v9 01/11] hw/arm: simplify arm_load_dtb

2019-08-13 Thread Alistair Francis
On Fri, Aug 9, 2019 at 12:01 AM Tao  wrote:
>
> From: Tao Xu 
>
> In struct arm_boot_info, kernel_filename, initrd_filename and
> kernel_cmdline are copied from MachineState. This patch adds
> MachineState as a parameter to arm_load_dtb() and moves the copy
> of kernel_filename, initrd_filename and kernel_cmdline into
> arm_load_kernel().
>
> Reviewed-by: Igor Mammedov 
> Reviewed-by: Liu Jingqi 
> Suggested-by: Igor Mammedov 
> Signed-off-by: Tao Xu 

Reviewed-by: Alistair Francis 

Alistair

> ---
>
> No changes in v9
> ---
>  hw/arm/aspeed.c   |  5 +
>  hw/arm/boot.c | 14 --
>  hw/arm/collie.c   |  8 +---
>  hw/arm/cubieboard.c   |  5 +
>  hw/arm/exynos4_boards.c   |  7 ++-
>  hw/arm/highbank.c |  8 +---
>  hw/arm/imx25_pdk.c|  5 +
>  hw/arm/integratorcp.c |  8 +---
>  hw/arm/kzm.c  |  5 +
>  hw/arm/mainstone.c|  5 +
>  hw/arm/mcimx6ul-evk.c |  5 +
>  hw/arm/mcimx7d-sabre.c|  5 +
>  hw/arm/musicpal.c |  8 +---
>  hw/arm/nseries.c  |  5 +
>  hw/arm/omap_sx1.c |  5 +
>  hw/arm/palm.c | 10 ++
>  hw/arm/raspi.c|  6 +-
>  hw/arm/realview.c |  5 +
>  hw/arm/sabrelite.c|  5 +
>  hw/arm/sbsa-ref.c |  3 +--
>  hw/arm/spitz.c|  5 +
>  hw/arm/tosa.c |  8 +---
>  hw/arm/versatilepb.c  |  5 +
>  hw/arm/vexpress.c |  5 +
>  hw/arm/virt.c |  8 +++-
>  hw/arm/xilinx_zynq.c  |  8 +---
>  hw/arm/xlnx-versal-virt.c |  7 ++-
>  hw/arm/xlnx-zcu102.c  |  5 +
>  hw/arm/z2.c   |  8 +---
>  include/hw/arm/boot.h |  4 ++--
>  30 files changed, 43 insertions(+), 147 deletions(-)
>
> diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
> index 843b708247..f8733b86b9 100644
> --- a/hw/arm/aspeed.c
> +++ b/hw/arm/aspeed.c
> @@ -241,9 +241,6 @@ static void aspeed_board_init(MachineState *machine,
>  write_boot_rom(drive0, FIRMWARE_ADDR, fl->size, &error_abort);
>  }
>
> -aspeed_board_binfo.kernel_filename = machine->kernel_filename;
> -aspeed_board_binfo.initrd_filename = machine->initrd_filename;
> -aspeed_board_binfo.kernel_cmdline = machine->kernel_cmdline;
>  aspeed_board_binfo.ram_size = ram_size;
>  aspeed_board_binfo.loader_start = sc->info->memmap[ASPEED_SDRAM];
>  aspeed_board_binfo.nb_cpus = bmc->soc.num_cpus;
> @@ -252,7 +249,7 @@ static void aspeed_board_init(MachineState *machine,
>  cfg->i2c_init(bmc);
>  }
>
> -arm_load_kernel(ARM_CPU(first_cpu), &aspeed_board_binfo);
> +arm_load_kernel(ARM_CPU(first_cpu), machine, &aspeed_board_binfo);
>  }
>
>  static void palmetto_bmc_i2c_init(AspeedBoardState *bmc)
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index c2b89b3bb9..ba604f8277 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -524,7 +524,7 @@ static void fdt_add_psci_node(void *fdt)
>  }
>
>  int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
> - hwaddr addr_limit, AddressSpace *as)
> + hwaddr addr_limit, AddressSpace *as, MachineState *ms)
>  {
>  void *fdt = NULL;
>  int size, rc, n = 0;
> @@ -627,9 +627,9 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
> *binfo,
>  qemu_fdt_add_subnode(fdt, "/chosen");
>  }
>
> -if (binfo->kernel_cmdline && *binfo->kernel_cmdline) {
> +if (ms->kernel_cmdline && *ms->kernel_cmdline) {
>  rc = qemu_fdt_setprop_string(fdt, "/chosen", "bootargs",
> - binfo->kernel_cmdline);
> + ms->kernel_cmdline);
>  if (rc < 0) {
>  fprintf(stderr, "couldn't set /chosen/bootargs\n");
>  goto fail;
> @@ -1261,7 +1261,7 @@ static void arm_setup_firmware_boot(ARMCPU *cpu, struct 
> arm_boot_info *info)
>   */
>  }
>
> -void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info)
> +void arm_load_kernel(ARMCPU *cpu, MachineState *ms, struct arm_boot_info 
> *info)
>  {
>  CPUState *cs;
>  AddressSpace *as = arm_boot_address_space(cpu, info);
> @@ -1282,7 +1282,9 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
> *info)
>   * doesn't support secure.
>   */
>  assert(!(info->secure_board_setup && kvm_enabled()));
> -
> +info->kernel_filename = ms->kernel_filename;
> +info->kernel_cmdline = ms->kernel_cmdline;
> +info->initrd_filename = ms->initrd_filename;
>  info->dtb_filename = qemu_opt_get(qemu_get_machine_opts(), "dtb");
>  info->dtb_limit = 0;
>
> @@ -1294,7 +1296,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
> *info)
>  }
>
>  if (!info->skip_dtb_autoload && have_dtb(info)) {
> -if (arm_load_dtb(info->dtb_start, info, info->dtb_limit, as) < 0) {
> +if (arm_load_dtb(info->dtb_start, info, info->dtb_limit, as, ms) < 
> 0) {
>   

Re: [Qemu-devel] [PATCH v9 01/11] hw/arm: simplify arm_load_dtb

2019-08-13 Thread Eduardo Habkost


CCing ARM maintainers.  I'd like to at least get one Acked-by from
them before queueing this on machine-next.


On Fri, Aug 09, 2019 at 02:57:21PM +0800, Tao wrote:
> From: Tao Xu 
> 
> In struct arm_boot_info, kernel_filename, initrd_filename and
> kernel_cmdline are copied from MachineState. This patch adds
> MachineState as a parameter to arm_load_dtb() and moves the copy
> of kernel_filename, initrd_filename and kernel_cmdline into
> arm_load_kernel().
> 
> Reviewed-by: Igor Mammedov 
> Reviewed-by: Liu Jingqi 
> Suggested-by: Igor Mammedov 
> Signed-off-by: Tao Xu 
> ---
> 
> No changes in v9
> ---
>  hw/arm/aspeed.c   |  5 +
>  hw/arm/boot.c | 14 --
>  hw/arm/collie.c   |  8 +---
>  hw/arm/cubieboard.c   |  5 +
>  hw/arm/exynos4_boards.c   |  7 ++-
>  hw/arm/highbank.c |  8 +---
>  hw/arm/imx25_pdk.c|  5 +
>  hw/arm/integratorcp.c |  8 +---
>  hw/arm/kzm.c  |  5 +
>  hw/arm/mainstone.c|  5 +
>  hw/arm/mcimx6ul-evk.c |  5 +
>  hw/arm/mcimx7d-sabre.c|  5 +
>  hw/arm/musicpal.c |  8 +---
>  hw/arm/nseries.c  |  5 +
>  hw/arm/omap_sx1.c |  5 +
>  hw/arm/palm.c | 10 ++
>  hw/arm/raspi.c|  6 +-
>  hw/arm/realview.c |  5 +
>  hw/arm/sabrelite.c|  5 +
>  hw/arm/sbsa-ref.c |  3 +--
>  hw/arm/spitz.c|  5 +
>  hw/arm/tosa.c |  8 +---
>  hw/arm/versatilepb.c  |  5 +
>  hw/arm/vexpress.c |  5 +
>  hw/arm/virt.c |  8 +++-
>  hw/arm/xilinx_zynq.c  |  8 +---
>  hw/arm/xlnx-versal-virt.c |  7 ++-
>  hw/arm/xlnx-zcu102.c  |  5 +
>  hw/arm/z2.c   |  8 +---
>  include/hw/arm/boot.h |  4 ++--
>  30 files changed, 43 insertions(+), 147 deletions(-)
> 
> diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
> index 843b708247..f8733b86b9 100644
> --- a/hw/arm/aspeed.c
> +++ b/hw/arm/aspeed.c
> @@ -241,9 +241,6 @@ static void aspeed_board_init(MachineState *machine,
>  write_boot_rom(drive0, FIRMWARE_ADDR, fl->size, &error_abort);
>  }
>  
> -aspeed_board_binfo.kernel_filename = machine->kernel_filename;
> -aspeed_board_binfo.initrd_filename = machine->initrd_filename;
> -aspeed_board_binfo.kernel_cmdline = machine->kernel_cmdline;
>  aspeed_board_binfo.ram_size = ram_size;
>  aspeed_board_binfo.loader_start = sc->info->memmap[ASPEED_SDRAM];
>  aspeed_board_binfo.nb_cpus = bmc->soc.num_cpus;
> @@ -252,7 +249,7 @@ static void aspeed_board_init(MachineState *machine,
>  cfg->i2c_init(bmc);
>  }
>  
> -arm_load_kernel(ARM_CPU(first_cpu), &aspeed_board_binfo);
> +arm_load_kernel(ARM_CPU(first_cpu), machine, &aspeed_board_binfo);
>  }
>  
>  static void palmetto_bmc_i2c_init(AspeedBoardState *bmc)
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index c2b89b3bb9..ba604f8277 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -524,7 +524,7 @@ static void fdt_add_psci_node(void *fdt)
>  }
>  
>  int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
> - hwaddr addr_limit, AddressSpace *as)
> + hwaddr addr_limit, AddressSpace *as, MachineState *ms)
>  {
>  void *fdt = NULL;
>  int size, rc, n = 0;
> @@ -627,9 +627,9 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
> *binfo,
>  qemu_fdt_add_subnode(fdt, "/chosen");
>  }
>  
> -if (binfo->kernel_cmdline && *binfo->kernel_cmdline) {
> +if (ms->kernel_cmdline && *ms->kernel_cmdline) {
>  rc = qemu_fdt_setprop_string(fdt, "/chosen", "bootargs",
> - binfo->kernel_cmdline);
> + ms->kernel_cmdline);
>  if (rc < 0) {
>  fprintf(stderr, "couldn't set /chosen/bootargs\n");
>  goto fail;
> @@ -1261,7 +1261,7 @@ static void arm_setup_firmware_boot(ARMCPU *cpu, struct 
> arm_boot_info *info)
>   */
>  }
>  
> -void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info)
> +void arm_load_kernel(ARMCPU *cpu, MachineState *ms, struct arm_boot_info 
> *info)
>  {
>  CPUState *cs;
>  AddressSpace *as = arm_boot_address_space(cpu, info);
> @@ -1282,7 +1282,9 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
> *info)
>   * doesn't support secure.
>   */
>  assert(!(info->secure_board_setup && kvm_enabled()));
> -
> +info->kernel_filename = ms->kernel_filename;
> +info->kernel_cmdline = ms->kernel_cmdline;
> +info->initrd_filename = ms->initrd_filename;
>  info->dtb_filename = qemu_opt_get(qemu_get_machine_opts(), "dtb");
>  info->dtb_limit = 0;
>  
> @@ -1294,7 +1296,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
> *info)
>  }
>  
>  if (!info->skip_dtb_autoload && have_dtb(info)) {
> -if (arm_load_dtb(info->dtb_start, info, info->dtb_limit, 

Re: [Qemu-devel] [PATCH v2 2/4] block/qcow2: refactor qcow2_co_preadv_part

2019-08-13 Thread Max Reitz
On 30.07.19 16:18, Vladimir Sementsov-Ogievskiy wrote:
> Further patch will run partial requests of iterations of
> qcow2_co_preadv in parallel for performance reasons. To prepare for
> this, separate part which may be parallelized into separate function
> (qcow2_co_preadv_task).
> 
> While being here, also separate encrypted clusters reading to own
> function, like it is done for compressed reading.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  qapi/block-core.json |   2 +-
>  block/qcow2.c| 206 +++
>  2 files changed, 112 insertions(+), 96 deletions(-)

Looks good to me overall, just wondering about some details, as always.

> diff --git a/block/qcow2.c b/block/qcow2.c
> index 93ab7edcea..7fa71968b2 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -1967,17 +1967,115 @@ out:
>  return ret;
>  }
>  
> +static coroutine_fn int
> +qcow2_co_preadv_encrypted(BlockDriverState *bs,
> +   uint64_t file_cluster_offset,
> +   uint64_t offset,
> +   uint64_t bytes,
> +   QEMUIOVector *qiov,
> +   uint64_t qiov_offset)
> +{
> +int ret;
> +BDRVQcow2State *s = bs->opaque;
> +uint8_t *buf;
> +
> +assert(bs->encrypted && s->crypto);
> +assert(bytes <= QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
> +
> +/*
> + * For encrypted images, read everything into a temporary
> + * contiguous buffer on which the AES functions can work.
> + * Note, that we can implement enctyption, working on qiov,

-, and s/enctyption/encryption/

> + * but we must not do decryption in guest buffers for security
> + * reasons.

"for security reasons" is a bit handwave-y, no?

[...]

> +static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
> + QCow2ClusterType cluster_type,
> + uint64_t file_cluster_offset,
> + uint64_t offset, uint64_t bytes,
> + QEMUIOVector *qiov,
> + size_t qiov_offset)
> +{
> +BDRVQcow2State *s = bs->opaque;
> +int offset_in_cluster = offset_into_cluster(s, offset);
> +
> +switch (cluster_type) {

[...]

> +default:
> +g_assert_not_reached();
> +/*
> + * QCOW2_CLUSTER_ZERO_PLAIN and QCOW2_CLUSTER_ZERO_ALLOC handled
> + * in qcow2_co_preadv_part

Hmm, I’d still add them explicitly as cases and put their own
g_assert_not_reached() there.

> + */
> +}
> +
> +g_assert_not_reached();
> +
> +return -EIO;

Maybe abort()ing instead of g_assert_not_reached() would save you from
having to return here?

Max
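
On the bounce-buffer comment in the quoted hunk: a minimal sketch of the
pattern, with a placeholder XOR "cipher" standing in for real decryption.
The function names are hypothetical. The point is only the data flow:
ciphertext is read into a private buffer, decrypted there, and only
plaintext is copied into the buffer the guest can see. Presumably that is
what the "security reasons" remark is about, though the patch does not
spell it out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder "cipher": XOR with a one-byte key. Not real cryptography. */
static void toy_decrypt(uint8_t *buf, size_t len, uint8_t key)
{
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key;
    }
}

/*
 * Read len encrypted bytes from disk, decrypt them in a temporary
 * contiguous buffer, then copy the plaintext to guest_buf.
 * Returns 0 on success or -ENOMEM.
 */
static int toy_read_encrypted(const uint8_t *disk, size_t len, uint8_t key,
                              uint8_t *guest_buf)
{
    uint8_t *bounce = malloc(len ? len : 1);
    if (!bounce) {
        return -ENOMEM;
    }

    memcpy(bounce, disk, len);      /* stand-in for the actual read */
    toy_decrypt(bounce, len, key);  /* guest memory never holds ciphertext */
    memcpy(guest_buf, bounce, len); /* stand-in for copying into the qiov */

    free(bounce);
    return 0;
}
```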



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH-for-4.2 v9 11/12] tests: add dummy ACPI tables for arm/virt board

2019-08-13 Thread Shameer Kolothum
This patch is in preparation for adding numamem and memhp tests
to arm/virt board so that 'make check' is happy. This may not
be required once the scripts are run and new tables are
generated with ".numamem" and ".memhp" extensions.

Signed-off-by: Shameer Kolothum 
---
I am not sure this is the right way to do this. But without this, when
the numamem and memhp tests are added, you will get,

Looking for expected file 'tests/data/acpi/virt/SRAT.numamem'
Looking for expected file 'tests/data/acpi/virt/SRAT'
**
ERROR:tests/bios-tables-test.c:327:load_expected_aml: assertion failed: (exp_sdt.aml_file)

---
 tests/data/acpi/virt/SLIT | Bin 0 -> 48 bytes
 tests/data/acpi/virt/SRAT | Bin 0 -> 224 bytes
 2 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 tests/data/acpi/virt/SLIT
 create mode 100644 tests/data/acpi/virt/SRAT

diff --git a/tests/data/acpi/virt/SLIT b/tests/data/acpi/virt/SLIT
new file mode 100644
index 
..74ec3b4b461ffecca36d8537975c202a5f011185
GIT binary patch
literal 48
scmWIc@eDCwU|?X>aq@Te2v%^42yhMtiZKGkKx`1r1jHb~B`V4V0NaKK0RR91

literal 0
HcmV?d1

diff --git a/tests/data/acpi/virt/SRAT b/tests/data/acpi/virt/SRAT
new file mode 100644
index 
..119922f4973f621602047d1dc160519f810922a3
GIT binary patch
literal 224
zcmWFzatwLEz`(%x)yd!4BUr>jB1F=Cg2*ZH@DxXmUMHZ-x3$7Gd2B8jU
X02q8=hbcr=2NT6lGiu

[Qemu-devel] [PATCH-for-4.2 v9 07/12] hw/arm/virt-acpi-build: Add PC-DIMM in SRAT

2019-08-13 Thread Shameer Kolothum
Generate Memory Affinity Structures for PC-DIMM ranges.

Also, Linux and Windows need the ACPI SRAT table for memory hotplug to
work properly, but QEMU currently doesn't create an SRAT table if no
NUMA options are present on the CLI. Hence add support (>= 4.2) to
create a NUMA node automatically (auto_enable_numa_with_memhp) when
QEMU is started with memory hotplug enabled but without '-numa'
options on the CLI.

Signed-off-by: Shameer Kolothum 
Signed-off-by: Eric Auger 
Reviewed-by: Igor Mammedov 
---
v8 --> v9
 - Added auto_enable_numa_with_memhp support.

---
 hw/arm/virt-acpi-build.c | 9 +
 hw/arm/virt.c| 2 ++
 2 files changed, 11 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 63fa845076..6d697af2df 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -518,6 +518,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 int i, srat_start;
 uint64_t mem_base;
 MachineClass *mc = MACHINE_GET_CLASS(vms);
+MachineState *ms = MACHINE(vms);
 const CPUArchIdList *cpu_list = mc->possible_cpu_arch_ids(MACHINE(vms));
 
 srat_start = table_data->len;
@@ -543,6 +544,14 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 }
 }
 
+if (ms->device_memory) {
+numamem = acpi_data_push(table_data, sizeof *numamem);
+build_srat_memory(numamem, ms->device_memory->base,
+  memory_region_size(&ms->device_memory->mr),
+  nb_numa_nodes - 1,
+  MEM_AFFINITY_HOTPLUGGABLE | MEM_AFFINITY_ENABLED);
+}
+
 build_header(linker, table_data, (void *)(table_data->data + srat_start),
  "SRAT", table_data->len - srat_start, 3, NULL, NULL);
 }
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 0949a227a9..56d64fc0a9 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2049,6 +2049,7 @@ static void virt_machine_class_init(ObjectClass *oc, void 
*data)
 hc->plug = virt_machine_device_plug_cb;
 hc->unplug_request = virt_machine_device_unplug_request_cb;
 mc->numa_mem_supported = true;
+mc->auto_enable_numa_with_memhp = true;
 }
 
 static void virt_instance_init(Object *obj)
@@ -2154,6 +2155,7 @@ static void virt_machine_4_1_options(MachineClass *mc)
 virt_machine_4_2_options(mc);
 compat_props_add(mc->compat_props, hw_compat_4_1, hw_compat_4_1_len);
 vmc->no_ged = true;
+mc->auto_enable_numa_with_memhp = false;
 }
 DEFINE_VIRT_MACHINE(4, 1)
 
-- 
2.17.1
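
The patch above describes the device-memory window with one SRAT Memory
Affinity structure, assigned to the last NUMA node and flagged both enabled
and hot-pluggable. As a rough illustration of what such an entry looks like
in the table, here is a sketch that serializes one by hand. The field
offsets follow my reading of the ACPI specification's Memory Affinity
Structure; sketch_srat_memory() and the SKETCH_* names are hypothetical and
are not QEMU's build_srat_memory().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Flag bits of the SRAT Memory Affinity structure (assumed from the spec). */
#define SKETCH_MEM_AFFINITY_ENABLED      (1u << 0)
#define SKETCH_MEM_AFFINITY_HOTPLUGGABLE (1u << 1)

#define SKETCH_SRAT_MEM_LEN 40  /* structure length in bytes */

/* Store a 32-bit value in little-endian byte order, as ACPI tables use. */
static void sketch_put_le32(uint8_t *p, uint32_t v)
{
    p[0] = v & 0xff;
    p[1] = (v >> 8) & 0xff;
    p[2] = (v >> 16) & 0xff;
    p[3] = (v >> 24) & 0xff;
}

/* Fill one Memory Affinity entry; reserved fields stay zero. */
static void sketch_srat_memory(uint8_t out[SKETCH_SRAT_MEM_LEN],
                               uint64_t base, uint64_t len,
                               uint32_t node, uint32_t flags)
{
    memset(out, 0, SKETCH_SRAT_MEM_LEN);
    out[0] = 1;                                         /* type */
    out[1] = SKETCH_SRAT_MEM_LEN;                       /* length */
    sketch_put_le32(out + 2, node);                     /* proximity domain */
    sketch_put_le32(out + 8, (uint32_t)base);           /* base, low half */
    sketch_put_le32(out + 12, (uint32_t)(base >> 32));  /* base, high half */
    sketch_put_le32(out + 16, (uint32_t)len);           /* length, low half */
    sketch_put_le32(out + 20, (uint32_t)(len >> 32));   /* length, high half */
    sketch_put_le32(out + 28, flags);                   /* flags */
}
```

A caller would pass the device-memory base and size, the index of the last
NUMA node, and both flags ORed together, mirroring the arguments in the
hunk above.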





[Qemu-devel] [PATCH-for-4.2 v9 09/12] hw/arm: Use GED for system_powerdown event

2019-08-13 Thread Shameer Kolothum
For machine types 4.2 or higher with ACPI boot, use GED for the
system_powerdown event instead of GPIO. Guest boot with DT still uses GPIO.

Signed-off-by: Shameer Kolothum 
---
v8 --> v9
 -Re-arranged patches 8 & 9 from v8 based on Igor's comments.

v7 --> v8
 -Retained gpio based system_powerdown support for machines < 4.2.
 -Reuse of virt_powerdown_req() for ACPI GED use.
 -Dropped Eric's R-by for now because of above.

---
 hw/acpi/generic_event_device.c |  8 
 hw/arm/virt-acpi-build.c   |  6 +++---
 hw/arm/virt.c  | 16 +++-
 include/hw/acpi/acpi_dev_interface.h   |  1 +
 include/hw/acpi/generic_event_device.h |  3 +++
 5 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index f4c23470c2..d6d7b28cfd 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -19,6 +19,7 @@
 
 static const uint32_t ged_supported_events[] = {
 ACPI_GED_MEM_HOTPLUG_EVT,
+ACPI_GED_PWR_DOWN_EVT,
 };
 
 /*
@@ -103,6 +104,11 @@ void build_ged_aml(Aml *table, const char *name, 
HotplugHandler *hotplug_dev,
 aml_append(if_ctx, aml_call0(MEMORY_DEVICES_CONTAINER "."
  MEMORY_SLOT_SCAN_METHOD));
 break;
+case ACPI_GED_PWR_DOWN_EVT:
+aml_append(if_ctx,
+   aml_notify(aml_name(ACPI_POWER_BUTTON_DEVICE),
+  aml_int(0x80)));
+break;
 default:
 /*
  * Please make sure all the events in ged_supported_events[]
@@ -189,6 +195,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, 
AcpiEventStatusBits ev)
 
 if (ev & ACPI_MEMORY_HOTPLUG_STATUS) {
 sel = ACPI_GED_MEM_HOTPLUG_EVT;
+} else if (ev & ACPI_POWER_DOWN_STATUS) {
+sel = ACPI_GED_PWR_DOWN_EVT;
 } else {
 /* Unknown event. Return without generating interrupt. */
 warn_report("GED: Unsupported event %d. No irq injected", ev);
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 6d697af2df..61b399dc58 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -49,7 +49,6 @@
 #include "kvm_arm.h"
 
 #define ARM_SPI_BASE 32
-#define ACPI_POWER_BUTTON_DEVICE "PWRB"
 
 static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
 {
@@ -739,13 +738,14 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 (irqmap[VIRT_MMIO] + ARM_SPI_BASE), NUM_VIRTIO_TRANSPORTS);
 acpi_dsdt_add_pci(scope, memmap, (irqmap[VIRT_PCIE] + ARM_SPI_BASE),
   vms->highmem, vms->highmem_ecam);
-acpi_dsdt_add_gpio(scope, &memmap[VIRT_GPIO],
-   (irqmap[VIRT_GPIO] + ARM_SPI_BASE));
 if (vms->acpi_dev) {
 build_ged_aml(scope, "\\_SB."GED_DEVICE,
   HOTPLUG_HANDLER(vms->acpi_dev),
   irqmap[VIRT_ACPI_GED] + ARM_SPI_BASE, AML_SYSTEM_MEMORY,
   memmap[VIRT_ACPI_GED].base);
+} else {
+acpi_dsdt_add_gpio(scope, &memmap[VIRT_GPIO],
+   (irqmap[VIRT_GPIO] + ARM_SPI_BASE));
 }
 
 if (vms->acpi_dev && ms->ram_slots) {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 0e75213b44..d49e1a583c 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -528,7 +528,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState 
*vms, qemu_irq *pic)
 {
 DeviceState *dev;
 int irq = vms->irqmap[VIRT_ACPI_GED];
-uint32_t event = ACPI_GED_MEM_HOTPLUG_EVT;
+uint32_t event = ACPI_GED_MEM_HOTPLUG_EVT | ACPI_GED_PWR_DOWN_EVT;
 
 dev = qdev_create(NULL, TYPE_ACPI_GED);
 qdev_prop_set_uint32(dev, "ged-event", event);
@@ -783,8 +783,14 @@ static void create_rtc(const VirtMachineState *vms, 
qemu_irq *pic)
 static DeviceState *gpio_key_dev;
 static void virt_powerdown_req(Notifier *n, void *opaque)
 {
-/* use gpio Pin 3 for power button event */
-qemu_set_irq(qdev_get_gpio_in(gpio_key_dev, 0), 1);
+VirtMachineState *s = container_of(n, VirtMachineState, powerdown_notifier);
+
+if (s->acpi_dev) {
+acpi_send_event(s->acpi_dev, ACPI_POWER_DOWN_STATUS);
+} else {
+/* use gpio Pin 3 for power button event */
+qemu_set_irq(qdev_get_gpio_in(gpio_key_dev, 0), 1);
+}
 }
 
 static void create_gpio(const VirtMachineState *vms, qemu_irq *pic)
@@ -1712,10 +1718,10 @@ static void machvirt_init(MachineState *machine)
 
 create_pcie(vms, pic);
 
-create_gpio(vms, pic);
-
 if (has_ged && aarch64 && firmware_loaded && acpi_enabled) {
 vms->acpi_dev = create_acpi_ged(vms, pic);
+} else {
+create_gpio(vms, pic);
 }
 
  /* connect powerdown request */
diff --git a/include/hw/acpi/acpi_dev_interface.h 
b/include/hw/acpi/acpi_dev_interface.h
index 43ff119179..adcb3a816c 100644
--- a/include/hw/acpi/acpi_dev_interface.h

[Qemu-devel] [PATCH-for-4.2 v9 12/12] tests: Add bios tests to arm/virt

2019-08-13 Thread Shameer Kolothum
This adds numamem and memhp tests for arm/virt platform

Signed-off-by: Shameer Kolothum 
---
 tests/bios-tables-test-allowed-diff.h |  1 +
 tests/bios-tables-test.c  | 49 +++
 2 files changed, 50 insertions(+)

diff --git a/tests/bios-tables-test-allowed-diff.h 
b/tests/bios-tables-test-allowed-diff.h
index 7b4adbc822..d181a4da4a 100644
--- a/tests/bios-tables-test-allowed-diff.h
+++ b/tests/bios-tables-test-allowed-diff.h
@@ -1,2 +1,3 @@
 /* List of comma-separated changed AML files to ignore */
 "tests/data/acpi/virt/DSDT",
+"tests/data/acpi/virt/SRAT",
diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
index a356ac3489..1d6f330d53 100644
--- a/tests/bios-tables-test.c
+++ b/tests/bios-tables-test.c
@@ -871,6 +871,53 @@ static void test_acpi_piix4_tcg_dimm_pxm(void)
 test_acpi_tcg_dimm_pxm(MACHINE_PC);
 }
 
+static void test_acpi_virt_tcg_memhp(void)
+{
+test_data data = {
+.machine = "virt",
+.accel = "tcg",
+.uefi_fl1 = "pc-bios/edk2-aarch64-code.fd",
+.uefi_fl2 = "pc-bios/edk2-arm-vars.fd",
+.cd = "tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2",
+.ram_start = 0x40000000ULL,
+.scan_len = 256ULL * 1024 * 1024,
+};
+
+data.variant = ".memhp";
+test_acpi_one(" -cpu cortex-a57"
+  " -m 256M,slots=3,maxmem=1G"
+  " -object memory-backend-ram,id=ram0,size=128M"
+  " -object memory-backend-ram,id=ram1,size=128M"
+  " -numa node,memdev=ram0 -numa node,memdev=ram1"
+  " -numa dist,src=0,dst=1,val=21",
+  &data);
+
+free_test_data(&data);
+
+}
+
+static void test_acpi_virt_tcg_numamem(void)
+{
+test_data data = {
+.machine = "virt",
+.accel = "tcg",
+.uefi_fl1 = "pc-bios/edk2-aarch64-code.fd",
+.uefi_fl2 = "pc-bios/edk2-arm-vars.fd",
+.cd = "tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2",
+.ram_start = 0x40000000ULL,
+.scan_len = 128ULL * 1024 * 1024,
+};
+
+data.variant = ".numamem";
+test_acpi_one(" -cpu cortex-a57"
+  " -object memory-backend-ram,id=ram0,size=128M"
+  " -numa node,memdev=ram0",
+  &data);
+
+free_test_data(&data);
+
+}
+
 static void test_acpi_virt_tcg(void)
 {
 test_data data = {
@@ -917,6 +964,8 @@ int main(int argc, char *argv[])
 qtest_add_func("acpi/q35/dimmpxm", test_acpi_q35_tcg_dimm_pxm);
 } else if (strcmp(arch, "aarch64") == 0) {
 qtest_add_func("acpi/virt", test_acpi_virt_tcg);
+qtest_add_func("acpi/virt/numamem", test_acpi_virt_tcg_numamem);
+qtest_add_func("acpi/virt/memhp", test_acpi_virt_tcg_memhp);
 }
 ret = g_test_run();
 boot_sector_cleanup(disk);
-- 
2.17.1





[Qemu-devel] [PATCH-for-4.2 v9 05/12] hw/arm/virt: Add 4.2 machine type

2019-08-13 Thread Shameer Kolothum
This is in preparation for creating the ACPI GED device, as we
need to disable it for machine types < 4.2 for migration to work.

Signed-off-by: Shameer Kolothum 
Reviewed-by: Igor Mammedov 
---
 hw/arm/virt.c   | 9 -
 hw/core/machine.c   | 3 +++
 include/hw/boards.h | 3 +++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 72cde9deba..ef65e721d2 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2097,10 +2097,17 @@ static void machvirt_machine_init(void)
 }
 type_init(machvirt_machine_init);
 
+static void virt_machine_4_2_options(MachineClass *mc)
+{
+}
+DEFINE_VIRT_MACHINE_AS_LATEST(4, 2)
+
 static void virt_machine_4_1_options(MachineClass *mc)
 {
+virt_machine_4_2_options(mc);
+compat_props_add(mc->compat_props, hw_compat_4_1, hw_compat_4_1_len);
 }
-DEFINE_VIRT_MACHINE_AS_LATEST(4, 1)
+DEFINE_VIRT_MACHINE(4, 1)
 
 static void virt_machine_4_0_options(MachineClass *mc)
 {
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 32d1ca9abc..83cd1bfeec 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -27,6 +27,9 @@
 #include "hw/pci/pci.h"
 #include "hw/mem/nvdimm.h"
 
+GlobalProperty hw_compat_4_1[] = {};
+const size_t hw_compat_4_1_len = G_N_ELEMENTS(hw_compat_4_1);
+
 GlobalProperty hw_compat_4_0[] = {
 { "VGA","edid", "false" },
 { "secondary-vga",  "edid", "false" },
diff --git a/include/hw/boards.h b/include/hw/boards.h
index a71d1a53a5..d9ec37d807 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -317,6 +317,9 @@ struct MachineState {
 } \
 type_init(machine_initfn##_register_types)
 
+extern GlobalProperty hw_compat_4_1[];
+extern const size_t hw_compat_4_1_len;
+
 extern GlobalProperty hw_compat_4_0[];
 extern const size_t hw_compat_4_0_len;
 
-- 
2.17.1
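
The versioned option functions chain from newest to oldest: an older machine
type first applies the newer type's options and then reverts whatever must
stay unchanged for compatibility. A minimal sketch of that chaining, using
the two knobs that other patches in this series toggle for 4.1; the
SketchMachineClass struct is hypothetical and merges fields that live in
different classes in QEMU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the class fields toggled per machine version. */
typedef struct {
    bool auto_enable_numa_with_memhp;
    bool no_ged;
} SketchMachineClass;

/* Defaults that the latest machine type inherits unchanged. */
static void sketch_class_init(SketchMachineClass *mc)
{
    mc->auto_enable_numa_with_memhp = true;
    mc->no_ged = false;
}

/* Latest version: nothing to override. */
static void sketch_4_2_options(SketchMachineClass *mc)
{
    (void)mc;
}

/* Older version: start from the newer options, then revert new behaviour. */
static void sketch_4_1_options(SketchMachineClass *mc)
{
    sketch_4_2_options(mc);
    mc->no_ged = true;
    mc->auto_enable_numa_with_memhp = false;
}
```

The point of the chain is that a guest started with the 4.1 machine type
keeps seeing 4.1 behaviour, which is what migration between QEMU versions
relies on.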





[Qemu-devel] [PATCH-for-4.2 v9 00/12] ARM virt: ACPI memory hotplug support

2019-08-13 Thread Shameer Kolothum
This series is an attempt to provide device memory hotplug support
on the ARM virt platform. It is based on Eric's recent work here[1]
and carries some of the pc-dimm related patches dropped from his
series.

Kernel support for arm64 memory hot add was added recently by
Robin, hence the guest kernel should be >= 5.0-rc1.

NVDIMM support is not included currently as we still have an unresolved
issue while hot adding NVDIMM[2]. NVDIMM cold plug patches could be
included, but are left out for now to keep things simple.

This makes use of the GED device to send hotplug ACPI events to the
Guest. The GED code is based on Nemu. Thanks to the efforts of Samuel and
Sebastien to add the hardware-reduced support to Nemu using the GED
device[3]. (Please shout if I got the author/signed-off wrong for
those patches or missed any names).

This is sanity tested on a HiSilicon ARM64 platform; any further
testing is appreciated.

Note:
Attempted adding a dimm_pxm test case to bios-tables-test for arm/virt,
but noticed the issue described here[5]. This is under investigation
now.

Thanks,
Shameer

[1] https://patchwork.kernel.org/cover/10837565/
[2] https://patchwork.kernel.org/cover/10783589/
[3] https://github.com/intel/nemu/blob/topic/virt-x86/hw/acpi/ged.c
[4] http://lists.infradead.org/pipermail/linux-arm-kernel/2019-May/651763.html
[5] https://www.mail-archive.com/qemu-devel@nongnu.org/msg632651.html

v8 --> v9
 -Changes related to GED being a TYPE_SYS_BUS_DEVICE now.
 -Re-arranged patches 8 and 9.
 -Added GED ABI documentation(patch #10).
 -Added numamem and memhp tests to arm/virt(#11 and #12)
 -Dropped few R-by tags as code has changed a bit.
 -Please see Individual patch history for details.
 
v7 --> v8
 -Addressed comments from Igor. Please see individual patches.
 -Updated bios-tables-test-allowed-diff.h to avoid "make check"
  failure (patch #6) and dropped patch #10
 -Added Igor's R-by to patches 4 & 5.
 -Dropped Eric's R-by from patch #9 for now.

v6 --> v7
- Added 4.2 machine support and restricted GED creation for < 4.2
  This is to address the migration test fail reported by Eric.
- Included "tests: Update DSDT ACPI table.." patch(#10) from Eric
  to fix the "make check" bios-tables-test failure.
  
v5 --> v6

-Addressed comments from Eric.
-Added R-by from Eric and Igor.

v4 --> v5
-Removed gsi/ged-irq routing in virt.
-Added Migration support.
-Dropped support for DT coldplug case based on the discussions
 here[4]
-Added system_powerdown support through GED.

v3 --> v4
Addressed comments from Igor and Eric,
-Renamed "virt-acpi" to "acpi-ged".
-Changed ged device parent to TYPE_DEVICE.
-Introduced DT memory node property "hotpluggable" to resolve device
 memory being treated as early boot memory issue(patch #7).
-Combined patches #3 and #9 from v3 into #3.

v2 --> v3

Addressed comments from Igor and Eric,
-Made virt acpi device platform independent and moved
 to hw/acpi/generic_event_device.c
-Moved ged specific code into hw/acpi/generic_event_device.c
-Introduced an opt-in feature "fdt" to resolve device-memory being
 treated as early boot memory.
-Dropped patch #1 from v2.

RFC --> v2

-Use GED device instead of GPIO for ACPI hotplug events.
-Removed NVDIMM support for now.
-Includes dropped patches from Eric's v9 series.

Eric Auger (1):
  hw/arm/virt: Add memory hotplug framework

Samuel Ortiz (2):
  hw/acpi: Do not create memory hotplug method when handler is not
defined
  hw/acpi: Add ACPI Generic Event Device Support

Shameer Kolothum (9):
  hw/acpi: Make ACPI IO address space configurable
  hw/arm/virt: Add 4.2 machine type
  hw/arm/virt: Enable device memory cold/hot plug with ACPI boot
  hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
  hw/arm: Factor out powerdown notifier from GPIO
  hw/arm: Use GED for system_powerdown event
  docs/specs: Add ACPI GED documentation
  tests: add dummy ACPI tables for arm/virt board
  tests: Add bios tests to arm/virt

 docs/specs/acpi_hw_reduced_hotplug.txt |  60 +
 hw/acpi/Kconfig|   4 +
 hw/acpi/Makefile.objs  |   1 +
 hw/acpi/generic_event_device.c | 324 +
 hw/acpi/memory_hotplug.c   |  39 +--
 hw/arm/Kconfig |   4 +
 hw/arm/virt-acpi-build.c   |  31 ++-
 hw/arm/virt.c  | 136 ++-
 hw/core/machine.c  |   3 +
 hw/i386/acpi-build.c   |   4 +-
 hw/i386/pc.c   |   3 +
 include/hw/acpi/acpi_dev_interface.h   |   1 +
 include/hw/acpi/generic_event_device.h | 103 
 include/hw/acpi/memory_hotplug.h   |   9 +-
 include/hw/arm/virt.h  |   5 +
 include/hw/boards.h|   3 +
 include/hw/i386/pc.h   |   3 +
 tests/bios-tables-test-allowed-diff.h  |   2 +
 tests/bios-tables-test.c   |  49 
 tests/data/acpi/virt/SLIT  | Bin 0 -> 48 bytes
 tests/data/acpi/virt/SRAT  | 

[Qemu-devel] [PATCH-for-4.2 v9 10/12] docs/specs: Add ACPI GED documentation

2019-08-13 Thread Shameer Kolothum
Document the basic concepts of the ACPI Generic Event Device (GED)
and the interface between QEMU and the ACPI BIOS.

Signed-off-by: Shameer Kolothum 
---
 docs/specs/acpi_hw_reduced_hotplug.txt | 60 ++
 1 file changed, 60 insertions(+)
 create mode 100644 docs/specs/acpi_hw_reduced_hotplug.txt

diff --git a/docs/specs/acpi_hw_reduced_hotplug.txt 
b/docs/specs/acpi_hw_reduced_hotplug.txt
new file mode 100644
index 00..46839be5ff
--- /dev/null
+++ b/docs/specs/acpi_hw_reduced_hotplug.txt
@@ -0,0 +1,60 @@
+QEMU<->ACPI BIOS Generic Event Device interface
+
+The ACPI Generic Event Device (GED) is a HW reduced platform
+specific device introduced in ACPI v6.1 that handles all platform
+events, including the hotplug ones. GED is modelled as a device
+in the namespace with a _HID defined to be ACPI0013. This document
+describes the interface between QEMU and the ACPI BIOS.
+
+GED allows HW reduced platforms to handle interrupts in ACPI ASL
+statements. It follows a very similar approach like the _EVT method
+from GPIO events. All interrupts are listed in  _CRS and the handler
+is written in _EVT method. However, Qemu implementation uses a single
+interrupt for the GED device, relying on IO memory region to communicate
+the type of device affected by the interrupt. This way, we can support
+up to 32 events with a unique interrupt.
+
+Here is an example.
+
+Device (\_SB.GED)
+{
+Name (_HID, "ACPI0013")
+Name (_UID, Zero)
+Name (_CRS, ResourceTemplate ()
+{
+Interrupt (ResourceConsumer, Edge, ActiveHigh, Exclusive, ,, )
+{
+0x00000029,
+}
+})
+OperationRegion (EREG, SystemMemory, 0x09080000, 0x04)
+Field (EREG, DWordAcc, NoLock, WriteAsZeros)
+{
+ESEL,   32
+}
+Method (_EVT, 1, Serialized)
+{
+Local0 = ESEL // ESEL = IO memory region which specifies the
+  // device type.
+If (((Local0 & One) == One))
+{
+MethodEvent1()
+}
+If ((Local0 & 0x2) == 0x2)
+{
+MethodEvent2()
+}
+...
+}
+}
+
+GED IO interface (4 byte access):
+read access:
+[0x0-0x3] Event selector bit field(32 bit) set by Qemu.
+bits:
+1:  Memory hotplug event
+2:  System power down event
+ 3-31:  Reserved
+
+write_access:
+Nothing is expected to be written into GED IO memory
-- 
2.17.1
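
The _EVT method in the ASL example reads the selector once and then tests
one bit per event. The same control flow can be sketched in C as follows.
The bit values 0x1 and 0x2 mirror the "Local0 & One" and "Local0 & 0x2"
tests in the example; the SKETCH_* names, the counters and sketch_evt() are
hypothetical and only stand in for the AML methods that a real guest would
call.

```c
#include <assert.h>
#include <stdint.h>

/* Selector bit values, assumed from the ASL example above. */
#define SKETCH_GED_MEM_HOTPLUG_EVT 0x1u
#define SKETCH_GED_PWR_DOWN_EVT    0x2u

/* Counts how often each (hypothetical) event method would have run. */
typedef struct {
    unsigned mem_scans;
    unsigned power_notifies;
} SketchGuestCounters;

/* Mirror of the _EVT body: sample the selector once, then test each bit. */
static void sketch_evt(uint32_t esel, SketchGuestCounters *c)
{
    uint32_t local0 = esel;

    if ((local0 & SKETCH_GED_MEM_HOTPLUG_EVT) == SKETCH_GED_MEM_HOTPLUG_EVT) {
        c->mem_scans++;
    }
    if ((local0 & SKETCH_GED_PWR_DOWN_EVT) == SKETCH_GED_PWR_DOWN_EVT) {
        c->power_notifies++;
    }
}
```

Because every bit is tested independently, a single interrupt can deliver
several pending events at once.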





[Qemu-devel] [PATCH-for-4.2 v9 08/12] hw/arm: Factor out powerdown notifier from GPIO

2019-08-13 Thread Shameer Kolothum
This is in preparation for using the GED device for the
system_powerdown event. Make the powerdown notifier
registration independent of create_gpio().

Signed-off-by: Shameer Kolothum 
---
 hw/arm/virt.c | 12 
 include/hw/arm/virt.h |  1 +
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 56d64fc0a9..0e75213b44 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -787,10 +787,6 @@ static void virt_powerdown_req(Notifier *n, void *opaque)
 qemu_set_irq(qdev_get_gpio_in(gpio_key_dev, 0), 1);
 }
 
-static Notifier virt_system_powerdown_notifier = {
-.notify = virt_powerdown_req
-};
-
 static void create_gpio(const VirtMachineState *vms, qemu_irq *pic)
 {
 char *nodename;
@@ -831,10 +827,6 @@ static void create_gpio(const VirtMachineState *vms, 
qemu_irq *pic)
   KEY_POWER);
 qemu_fdt_setprop_cells(vms->fdt, "/gpio-keys/poweroff",
"gpios", phandle, 3, 0);
-
-/* connect powerdown request */
-qemu_register_powerdown_notifier(&virt_system_powerdown_notifier);
-
 g_free(nodename);
 }
 
@@ -1726,6 +1718,10 @@ static void machvirt_init(MachineState *machine)
 vms->acpi_dev = create_acpi_ged(vms, pic);
 }
 
+ /* connect powerdown request */
+ vms->powerdown_notifier.notify = virt_powerdown_req;
+ qemu_register_powerdown_notifier(&vms->powerdown_notifier);
+
 /* Create mmio transports, so the user can create virtio backends
  * (which will be automatically plugged in to the transports). If
  * no backend is created the transport will just sit harmlessly idle.
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 577ee49b4b..0b41083e9d 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -137,6 +137,7 @@ typedef struct {
 int psci_conduit;
 hwaddr highest_gpa;
 DeviceState *acpi_dev;
+Notifier powerdown_notifier;
 } VirtMachineState;
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
-- 
2.17.1
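
Moving the notifier from a file-scope static into VirtMachineState means a
callback can recover the owning machine state from the notifier pointer it
is handed. A self-contained sketch of that pattern follows; the Sketch*
types and sketch_container_of() are hypothetical miniatures written for
illustration, not QEMU's Notifier or container_of definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a notifier with a single callback. */
typedef struct SketchNotifier SketchNotifier;
struct SketchNotifier {
    void (*notify)(SketchNotifier *n, void *data);
};

/* Hypothetical machine state that embeds its own notifier. */
typedef struct {
    int powerdown_requests;
    SketchNotifier powerdown_notifier;
} SketchMachineState;

/* Recover the enclosing structure from a pointer to one of its members. */
#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Callback: find the owning state and record the request there. */
static void sketch_powerdown_req(SketchNotifier *n, void *data)
{
    SketchMachineState *s =
        sketch_container_of(n, SketchMachineState, powerdown_notifier);

    (void)data;
    s->powerdown_requests++;
}
```

With the notifier embedded, the handler needs no global state, and two
machine instances would each see only their own requests.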





[Qemu-devel] [PATCH-for-4.2 v9 03/12] hw/acpi: Add ACPI Generic Event Device Support

2019-08-13 Thread Shameer Kolothum
From: Samuel Ortiz 

The ACPI Generic Event Device (GED) is a hardware-reduced specific
device[ACPI v6.1 Section 5.6.9] that handles all platform events,
including the hotplug ones. This patch generates the AML code that
defines GEDs.

Platforms need to specify their own GED Event bitmap to describe
what kind of events they want to support through GED. Also, this
uses a single interrupt for the GED device, relying on an IO
memory region to communicate the type of device affected by the
interrupt. This way, we can support up to 32 events with a single
interrupt.

This supports only memory hotplug for now.
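
On the device side, the scheme described above can be modelled as a 32-bit
selector that latches pending events, plus one shared interrupt. The sketch
below is written under assumptions: the read-to-clear behaviour of the
selector and all sketch_* names are mine for illustration and are not taken
from the patch text quoted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the GED device state. */
typedef struct {
    uint32_t supported;   /* platform's GED event bitmap */
    uint32_t sel;         /* pending events, one bit each (32 at most) */
    unsigned irq_pulses;  /* how many times the single interrupt fired */
} SketchGed;

/* Latch an event and pulse the shared interrupt; reject unknown events. */
static bool sketch_ged_send_event(SketchGed *g, uint32_t evt)
{
    if (!(g->supported & evt)) {
        return false;
    }
    g->sel |= evt;
    g->irq_pulses++;
    return true;
}

/* Guest read of the selector; assumed to clear the pending bits. */
static uint32_t sketch_ged_read_sel(SketchGed *g)
{
    uint32_t val = g->sel;

    g->sel = 0;
    return val;
}
```

A platform that only enables memory hotplug would set just that bit in
'supported'; any other event is then dropped without raising the interrupt.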

Signed-off-by: Samuel Ortiz 
Signed-off-by: Sebastien Boeuf 
Signed-off-by: Shameer Kolothum 
---
v8 --> v9
 -Changes related to GED being a TYPE_SYS_BUS_DEVICE now.
 -Removed Eric's R-by tag for now.

v7 --> v8.
 -Removed qemu_mutex_lock() across the ged state selector access.
 -Rephrased comments section in acpi_ged_send_event().
 -Moved acpi_ged_event() code into acpi_ged_send_event().
 -Added check for memhp_base and ged_base in realize().
---
 hw/acpi/Kconfig|   4 +
 hw/acpi/Makefile.objs  |   1 +
 hw/acpi/generic_event_device.c | 316 +
 include/hw/acpi/generic_event_device.h | 100 
 4 files changed, 421 insertions(+)
 create mode 100644 hw/acpi/generic_event_device.c
 create mode 100644 include/hw/acpi/generic_event_device.h

diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 7c59cf900b..12e3f1e86e 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -31,3 +31,7 @@ config ACPI_VMGENID
 bool
 default y
 depends on PC
+
+config ACPI_HW_REDUCED
+bool
+depends on ACPI
diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
index 9bb2101e3b..655a9c1973 100644
--- a/hw/acpi/Makefile.objs
+++ b/hw/acpi/Makefile.objs
@@ -6,6 +6,7 @@ common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
 common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
 common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
 common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
+common-obj-$(CONFIG_ACPI_HW_REDUCED) += generic_event_device.o
 common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
 
 common-obj-y += acpi_interface.o
diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
new file mode 100644
index 00..f4c23470c2
--- /dev/null
+++ b/hw/acpi/generic_event_device.c
@@ -0,0 +1,316 @@
+/*
+ *
+ * Copyright (c) 2018 Intel Corporation
+ * Copyright (c) 2019 Huawei Technologies R & D (UK) Ltd
+ * Written by Samuel Ortiz, Shameer Kolothum
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "exec/address-spaces.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/generic_event_device.h"
+#include "hw/mem/pc-dimm.h"
+#include "qemu/error-report.h"
+
+static const uint32_t ged_supported_events[] = {
+ACPI_GED_MEM_HOTPLUG_EVT,
+};
+
+/*
+ * The ACPI Generic Event Device (GED) is a hardware-reduced specific
+ * device[ACPI v6.1 Section 5.6.9] that handles all platform events,
+ * including the hotplug ones. Platforms need to specify their own
+ * GED Event bitmap to describe what kind of events they want to support
+ * through GED. This routine uses a single interrupt for the GED device,
+ * relying on IO memory region to communicate the type of device
+ * affected by the interrupt. This way, we can support up to 32 events
+ * with a unique interrupt.
+ */
+void build_ged_aml(Aml *table, const char *name, HotplugHandler *hotplug_dev,
+   uint32_t ged_irq, AmlRegionSpace rs, hwaddr ged_base)
+{
+AcpiGedState *s = ACPI_GED(hotplug_dev);
+Aml *crs = aml_resource_template();
+Aml *evt, *field;
+Aml *dev = aml_device("%s", name);
+Aml *evt_sel = aml_local(0);
+Aml *esel = aml_name(AML_GED_EVT_SEL);
+
+assert(ged_base);
+
+/* _CRS interrupt */
+aml_append(crs, aml_interrupt(AML_CONSUMER, AML_EDGE, AML_ACTIVE_HIGH,
+  AML_EXCLUSIVE, &ged_irq, 1));
+
+aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0013")));
+aml_append(dev, aml_name_decl("_UID", aml_string(GED_DEVICE)));
+aml_append(dev, aml_name_decl("_CRS", crs));
+
+/* Append IO region */
+aml_append(dev, aml_operation_region(AML_GED_EVT_REG, rs,
+   aml_int(ged_base + ACPI_GED_EVT_SEL_OFFSET),
+   ACPI_GED_EVT_SEL_LEN));
+field = aml_field(AML_GED_EVT_REG, AML_DWORD_ACC, AML_NOLOCK,
+  AML_WRITE_AS_ZEROS);
+aml_append(field, aml_named_field(AML_GED_EVT_SEL,
+  ACPI_GED_EVT_SEL_LEN * BITS_PER_BYTE));
+aml_append(dev, field);
+
+/*
+ * For each GED event we:
+ * - Add a conditional block for each event, inside a loop.
+ * - Call a 

[Qemu-devel] [PATCH-for-4.2 v9 06/12] hw/arm/virt: Enable device memory cold/hot plug with ACPI boot

2019-08-13 Thread Shameer Kolothum
This initializes the GED device with base memory and irq, configures
the GED memory hotplug event and builds the corresponding AML code. With
this, both hot and cold plug of device memory are now enabled for a Guest
with ACPI boot.

Memory cold plug with Guest DT boot is not yet supported.

Signed-off-by: Shameer Kolothum 
---
v8 --> v9
 -Changes related to GED being a TYPE_SYS_BUS_DEVICE now.
 -Error propagation to _plug() handler.
 -Removed R-by by Eric for now.

v7 --> v8
 -Changed no_acpi_dev to no_ged.
 -Fixed 'dev' reference leak by object_new().
 -Updated bios-tables-test-allowed-diff.h to avoid "make check"
  failure.

---
 hw/arm/Kconfig|  2 +
 hw/arm/virt-acpi-build.c  | 16 +++
 hw/arm/virt.c | 62 ---
 include/hw/arm/virt.h |  4 ++
 tests/bios-tables-test-allowed-diff.h |  1 +
 5 files changed, 78 insertions(+), 7 deletions(-)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 84961c17ab..ad7f7c089b 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -22,6 +22,8 @@ config ARM_VIRT
 select ACPI_PCI
 select MEM_DEVICE
 select DIMM
+select ACPI_MEMORY_HOTPLUG
+select ACPI_HW_REDUCED
 
 config CHEETAH
 bool
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 0afb372769..63fa845076 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -40,6 +40,8 @@
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/utils.h"
 #include "hw/acpi/pci.h"
+#include "hw/acpi/memory_hotplug.h"
+#include "hw/acpi/generic_event_device.h"
 #include "hw/pci/pcie_host.h"
 #include "hw/pci/pci.h"
 #include "hw/arm/virt.h"
@@ -705,6 +707,7 @@ static void
 build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
 Aml *scope, *dsdt;
+MachineState *ms = MACHINE(vms);
 const MemMapEntry *memmap = vms->memmap;
 const int *irqmap = vms->irqmap;
 
@@ -729,6 +732,19 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
   vms->highmem, vms->highmem_ecam);
 acpi_dsdt_add_gpio(scope, &memmap[VIRT_GPIO],
(irqmap[VIRT_GPIO] + ARM_SPI_BASE));
+if (vms->acpi_dev) {
+build_ged_aml(scope, "\\_SB."GED_DEVICE,
+  HOTPLUG_HANDLER(vms->acpi_dev),
+  irqmap[VIRT_ACPI_GED] + ARM_SPI_BASE, AML_SYSTEM_MEMORY,
+  memmap[VIRT_ACPI_GED].base);
+}
+
+if (vms->acpi_dev && ms->ram_slots) {
+build_memory_hotplug_aml(scope, ms->ram_slots, "\\_SB", NULL,
+ AML_SYSTEM_MEMORY,
+ memmap[VIRT_PCDIMM_ACPI].base);
+}
+
 acpi_dsdt_add_power_button(scope);
 
 aml_append(dsdt, scope);
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ef65e721d2..0949a227a9 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -66,6 +66,7 @@
 #include "target/arm/internals.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/acpi/generic_event_device.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
 static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -136,6 +137,8 @@ static const MemMapEntry base_memmap[] = {
 [VIRT_GPIO] =   { 0x09030000, 0x00001000 },
 [VIRT_SECURE_UART] ={ 0x09040000, 0x00001000 },
 [VIRT_SMMU] =   { 0x09050000, 0x00020000 },
+[VIRT_PCDIMM_ACPI] ={ 0x09070000, MEMORY_HOTPLUG_IO_LEN },
+[VIRT_ACPI_GED] =   { 0x09080000, ACPI_GED_EVT_SEL_LEN },
 [VIRT_MMIO] =   { 0x0a000000, 0x00000200 },
 /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
 [VIRT_PLATFORM_BUS] =   { 0x0c000000, 0x02000000 },
@@ -171,6 +174,7 @@ static const int a15irqmap[] = {
 [VIRT_PCIE] = 3, /* ... to 6 */
 [VIRT_GPIO] = 7,
 [VIRT_SECURE_UART] = 8,
+[VIRT_ACPI_GED] = 9,
 [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
 [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
 [VIRT_SMMU] = 74,/* ...to 74 + NUM_SMMU_IRQS - 1 */
@@ -520,6 +524,26 @@ static void fdt_add_pmu_nodes(const VirtMachineState *vms)
 }
 }
 
+static inline DeviceState *create_acpi_ged(VirtMachineState *vms, qemu_irq *pic)
+{
+DeviceState *dev;
+int irq = vms->irqmap[VIRT_ACPI_GED];
+uint32_t event = ACPI_GED_MEM_HOTPLUG_EVT;
+
+dev = qdev_create(NULL, TYPE_ACPI_GED);
+qdev_prop_set_uint32(dev, "ged-event", event);
+object_property_add_child(qdev_get_machine(), "acpi-ged",
+  OBJECT(dev), NULL);
+qdev_init_nofail(dev);
+
+sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, vms->memmap[VIRT_ACPI_GED].base);
+sysbus_mmio_map(SYS_BUS_DEVICE(dev), 1, vms->memmap[VIRT_PCDIMM_ACPI].base);
+
+sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[irq]);
+
+return dev;
+}
+
 static void create_its(VirtMachineState *vms, DeviceState *gicdev)
 {

[Qemu-devel] [PATCH-for-4.2 v9 04/12] hw/arm/virt: Add memory hotplug framework

2019-08-13 Thread Shameer Kolothum
From: Eric Auger 

This patch adds the memory hot-plug/hot-unplug infrastructure
in machvirt. The device memory is not yet exposed to the Guest
either through DT or ACPI, hence both cold and hot plug of memory
are explicitly disabled for now.
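
The pre-plug hook rejects DIMMs by reporting an error through an out
parameter, and the plug hook collects errors locally before handing them to
the caller. A stripped-down sketch of that flow in plain C follows;
SketchError, sketch_error_propagate() and sketch_pre_plug() are hypothetical
stand-ins written for illustration and are far simpler than QEMU's Error
API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-simplified error object: just a message pointer. */
typedef struct {
    const char *msg;
} SketchError;

/* Hand a locally collected error to the caller if the caller wants one. */
static void sketch_error_propagate(SketchError **dst, SketchError *local)
{
    if (local && dst && !*dst) {
        *dst = local;
    }
}

static SketchError sketch_unsupported = {
    "memory cold/hot plug is not yet supported"
};

/* Pre-plug hook: DIMMs are rejected for now; other devices pass through. */
static void sketch_pre_plug(int is_dimm, SketchError **errp)
{
    SketchError *local_err = NULL;

    if (is_dimm) {
        local_err = &sketch_unsupported;
    }
    sketch_error_propagate(errp, local_err);
}
```

Collecting into a local variable first lets the hook tolerate callers that
pass no error pointer at all, while still reporting the failure to callers
that do.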

Signed-off-by: Eric Auger 
Signed-off-by: Kwangwoo Lee 
Signed-off-by: Shameer Kolothum 
Reviewed-by: Peter Maydell 
Reviewed-by: Igor Mammedov 
---
v8 --> v9
 -Added error propagation.
---
 hw/arm/Kconfig |  2 ++
 hw/arm/virt.c  | 53 +-
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index ab65ecd216..84961c17ab 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -20,6 +20,8 @@ config ARM_VIRT
 select SMBIOS
 select VIRTIO_MMIO
 select ACPI_PCI
+select MEM_DEVICE
+select DIMM
 
 config CHEETAH
 bool
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index d9496c9363..72cde9deba 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -64,6 +64,8 @@
 #include "hw/arm/smmuv3.h"
 #include "hw/acpi/acpi.h"
 #include "target/arm/internals.h"
+#include "hw/mem/pc-dimm.h"
+#include "hw/mem/nvdimm.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
 static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -1871,6 +1873,42 @@ static const CPUArchIdList 
*virt_possible_cpu_arch_ids(MachineState *ms)
 return ms->possible_cpus;
 }
 
+static void virt_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
+ Error **errp)
+{
+
+/*
+ * The device memory is not yet exposed to the Guest either through
+ * DT or ACPI and hence both cold/hot plug of memory is explicitly
+ * disabled for now.
+ */
+if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+error_setg(errp, "memory cold/hot plug is not yet supported");
+return;
+}
+
+pc_dimm_pre_plug(PC_DIMM(dev), MACHINE(hotplug_dev), NULL, errp);
+}
+
+static void virt_memory_plug(HotplugHandler *hotplug_dev,
+ DeviceState *dev, Error **errp)
+{
+VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
+Error *local_err = NULL;
+
+pc_dimm_plug(PC_DIMM(dev), MACHINE(vms), &local_err);
+
+error_propagate(errp, local_err);
+}
+
+static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
+DeviceState *dev, Error **errp)
+{
+if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+virt_memory_pre_plug(hotplug_dev, dev, errp);
+}
+}
+
 static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
 DeviceState *dev, Error **errp)
 {
@@ -1882,12 +1920,23 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
  SYS_BUS_DEVICE(dev));
 }
 }
+if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+virt_memory_plug(hotplug_dev, dev, errp);
+}
+}
+
+static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
+  DeviceState *dev, Error **errp)
+{
+error_setg(errp, "device unplug request for unsupported device"
+   " type: %s", object_get_typename(OBJECT(dev)));
 }
 
 static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
 DeviceState *dev)
 {
-if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE)) {
+if (object_dynamic_cast(OBJECT(dev), TYPE_SYS_BUS_DEVICE) ||
+   (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM))) {
 return HOTPLUG_HANDLER(machine);
 }
 
@@ -1951,7 +2000,9 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
 mc->kvm_type = virt_kvm_type;
 assert(!mc->get_hotplug_handler);
 mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
+hc->pre_plug = virt_machine_device_pre_plug_cb;
 hc->plug = virt_machine_device_plug_cb;
+hc->unplug_request = virt_machine_device_unplug_request_cb;
 mc->numa_mem_supported = true;
 }
 
-- 
2.17.1





[Qemu-devel] [PATCH-for-4.2 v9 02/12] hw/acpi: Do not create memory hotplug method when handler is not defined

2019-08-13 Thread Shameer Kolothum
From: Samuel Ortiz 

With Hardware-reduced ACPI, the GED device will manage ACPI
hotplug entirely. As a consequence, make the memory specific
events AML generation optional. The code will only be added
when the method name is not NULL.

Signed-off-by: Samuel Ortiz 
Signed-off-by: Shameer Kolothum 
Reviewed-by: Eric Auger 
Reviewed-by: Igor Mammedov 
---
 hw/acpi/memory_hotplug.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index 1734d4b44f..552f60a716 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -715,10 +715,12 @@ void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
 }
 aml_append(table, dev_container);
 
-method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
-aml_append(method,
-aml_call0(MEMORY_DEVICES_CONTAINER "." MEMORY_SLOT_SCAN_METHOD));
-aml_append(table, method);
+if (event_handler_method) {
+method = aml_method(event_handler_method, 0, AML_NOTSERIALIZED);
+aml_append(method, aml_call0(MEMORY_DEVICES_CONTAINER "."
+ MEMORY_SLOT_SCAN_METHOD));
+aml_append(table, method);
+}
 
 g_free(mhp_res_path);
 }
-- 
2.17.1





[Qemu-devel] [PATCH-for-4.2 v9 01/12] hw/acpi: Make ACPI IO address space configurable

2019-08-13 Thread Shameer Kolothum
This is in preparation for adding support for ARM64 platforms,
which do not use port-mapped IO for the ACPI IO space. We are
making changes so that an MMIO region can be accommodated
and the board can pass the base address into the AML build function.

Also move a few MEMORY_* definitions to the header so that other memory
hotplug event signalling mechanisms (e.g. Generic Event Device on
HW-reduced ACPI platforms) can use the same from their respective
event handler code.
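
As a rough illustration of the idea described above (the caller picks the region space and passes the base address in), here is a minimal standalone sketch. It does not use the QEMU AML API; enum region_space, describe_hotplug_resource() and the base addresses in the usage note are hypothetical stand-ins, and only the length of 24 is taken from the MEMORY_HOTPLUG_IO_LEN definition in this patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the region-space value passed by the board. */
enum region_space {
    REGION_SYSTEM_IO,      /* port-mapped I/O, as on x86 */
    REGION_SYSTEM_MEMORY   /* memory-mapped I/O, as on ARM64 */
};

#define HOTPLUG_IO_LEN 24  /* length of the hotplug register block */

/*
 * Write a textual description of the resource a builder would emit.
 * Returns the number of characters written, as snprintf() does.
 */
static int describe_hotplug_resource(char *buf, size_t len,
                                     enum region_space rs, uint64_t base)
{
    if (rs == REGION_SYSTEM_IO) {
        /* A 16-bit decoded I/O port range must fit below 64 KiB. */
        assert(base + HOTPLUG_IO_LEN <= 0x10000);
        return snprintf(buf, len, "io16 base=0x%04" PRIx64 " len=%d",
                        base, HOTPLUG_IO_LEN);
    }
    /* Otherwise describe a fixed 32-bit memory range. */
    assert(base + HOTPLUG_IO_LEN <= UINT64_C(0x100000000));
    return snprintf(buf, len, "mem32 base=0x%08" PRIx64 " len=%d",
                    base, HOTPLUG_IO_LEN);
}
```

Calling describe_hotplug_resource(buf, sizeof(buf), REGION_SYSTEM_IO, 0x0a00) fills buf with an I/O port description, while REGION_SYSTEM_MEMORY with a base such as 0x10000000 gives a memory range. The point is only that the builder no longer depends on a file-local base address.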

Signed-off-by: Shameer Kolothum 
---
v8 --> v9
  -base address is an input into build_memory_hotplug_aml()
  -Removed R-by tags from Igor and Eric for now.
---
 hw/acpi/memory_hotplug.c | 29 ++---
 hw/i386/acpi-build.c |  4 +++-
 hw/i386/pc.c |  3 +++
 include/hw/acpi/memory_hotplug.h |  9 +++--
 include/hw/i386/pc.h |  3 +++
 5 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index 297812d5f7..1734d4b44f 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -29,12 +29,7 @@
 #define MEMORY_SLOT_PROXIMITY_METHOD "MPXM"
 #define MEMORY_SLOT_EJECT_METHOD "MEJ0"
 #define MEMORY_SLOT_NOTIFY_METHOD"MTFY"
-#define MEMORY_SLOT_SCAN_METHOD  "MSCN"
 #define MEMORY_HOTPLUG_DEVICE"MHPD"
-#define MEMORY_HOTPLUG_IO_LEN 24
-#define MEMORY_DEVICES_CONTAINER "\\_SB.MHPC"
-
-static uint16_t memhp_io_base;
 
 static ACPIOSTInfo *acpi_memory_device_status(int slot, MemStatus *mdev)
 {
@@ -209,7 +204,7 @@ static const MemoryRegionOps acpi_memory_hotplug_ops = {
 };
 
 void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner,
-  MemHotplugState *state, uint16_t io_base)
+  MemHotplugState *state, hwaddr io_base)
 {
 MachineState *machine = MACHINE(qdev_get_machine());
 
@@ -218,12 +213,10 @@ void acpi_memory_hotplug_init(MemoryRegion *as, Object *owner,
 return;
 }
 
-assert(!memhp_io_base);
-memhp_io_base = io_base;
 state->devs = g_malloc0(sizeof(*state->devs) * state->dev_count);
 memory_region_init_io(&state->io, owner, &acpi_memory_hotplug_ops, state,
   "acpi-mem-hotplug", MEMORY_HOTPLUG_IO_LEN);
-memory_region_add_subregion(as, memhp_io_base, &state->io);
+memory_region_add_subregion(as, io_base, &state->io);
 }
 
 /**
@@ -342,7 +335,8 @@ const VMStateDescription vmstate_memory_hotplug = {
 
 void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
   const char *res_root,
-  const char *event_handler_method)
+  const char *event_handler_method,
+  AmlRegionSpace rs, hwaddr memhp_io_base)
 {
 int i;
 Aml *ifctx;
@@ -365,14 +359,19 @@ void build_memory_hotplug_aml(Aml *table, uint32_t nr_mem,
 aml_name_decl("_UID", aml_string("Memory hotplug resources")));
 
 crs = aml_resource_template();
-aml_append(crs,
-aml_io(AML_DECODE16, memhp_io_base, memhp_io_base, 0,
-   MEMORY_HOTPLUG_IO_LEN)
-);
+if (rs == AML_SYSTEM_IO) {
+aml_append(crs,
+aml_io(AML_DECODE16, memhp_io_base, memhp_io_base, 0,
+   MEMORY_HOTPLUG_IO_LEN)
+);
+} else {
+aml_append(crs, aml_memory32_fixed(memhp_io_base,
+MEMORY_HOTPLUG_IO_LEN, AML_READ_WRITE));
+}
 aml_append(mem_ctrl_dev, aml_name_decl("_CRS", crs));
 
 aml_append(mem_ctrl_dev, aml_operation_region(
-MEMORY_HOTPLUG_IO_REGION, AML_SYSTEM_IO,
+MEMORY_HOTPLUG_IO_REGION, rs,
 aml_int(memhp_io_base), MEMORY_HOTPLUG_IO_LEN)
 );
 
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index f3fdfefcd5..e76d6631ea 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1871,7 +1871,9 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 build_cpus_aml(dsdt, machine, opts, pm->cpu_hp_io_base,
"\\_SB.PCI0", "\\_GPE._E02");
 }
-build_memory_hotplug_aml(dsdt, nr_mem, "\\_SB.PCI0", "\\_GPE._E03");
+build_memory_hotplug_aml(dsdt, nr_mem, "\\_SB.PCI0",
+ "\\_GPE._E03", AML_SYSTEM_IO,
+ pcms->memhp_io_base);
 
 scope =  aml_scope("_GPE");
 {
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 549c437050..be973cea99 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1936,6 +1936,9 @@ void pc_memory_init(PCMachineState *pcms,
 
 /* Init default IOAPIC address space */
 pcms->ioapic_as = &address_space_memory;
+
+/* Init ACPI memory hotplug IO base address */
+pcms->memhp_io_base = ACPI_MEMORY_HOTPLUG_BASE;
 }
 
 /*
diff --git a/include/hw/acpi/memory_hotplug.h b/include/hw/acpi/memory_hotplug.h
index 77c65765d6..dfe9cf3fde 100644
--- a/include/hw/acpi/memory_hotplug.h
+++ 

Re: [Qemu-devel] [PATCH v2 1/4] block: introduce aio task pool

2019-08-13 Thread Max Reitz
On 30.07.19 16:18, Vladimir Sementsov-Ogievskiy wrote:
> Common interface for aio task loops. To be used for improving
> performance of synchronous io loops in qcow2, block-stream,
> copy-on-read, and maybe other places.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---

Looks good to me overall.

>  block/aio_task.h|  52 +++

I’d move this to include/block/.

>  block/aio_task.c| 119 
>  block/Makefile.objs |   2 +
>  3 files changed, 173 insertions(+)
>  create mode 100644 block/aio_task.h
>  create mode 100644 block/aio_task.c
> 
> diff --git a/block/aio_task.h b/block/aio_task.h
> new file mode 100644
> index 00..933af1d8e7
> --- /dev/null
> +++ b/block/aio_task.h

[...]

> +typedef struct AioTaskPool AioTaskPool;
> +typedef struct AioTask AioTask;
> +typedef int (*AioTaskFunc)(AioTask *task);

+coroutine_fn

> +struct AioTask {
> +AioTaskPool *pool;
> +AioTaskFunc func;
> +int ret;
> +};
> +
> +/*
> + * aio_task_pool_new
> + *
> + * The caller is responsible to g_free AioTaskPool pointer after use.

s/to g_free/for g_freeing/ or something similar.

Or you’d just add aio_task_pool_free().

> + */
> +AioTaskPool *aio_task_pool_new(int max_busy_tasks);
> +int aio_task_pool_status(AioTaskPool *pool);

A comment wouldn’t hurt.  It wasn’t immediately clear to me that status
refers to the error code of a failing task (or 0), although it wasn’t
too much of a surprise either.
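
To make the semantics discussed here concrete, a minimal sketch of a pool that remembers a task error could look like the following. It is not the code from the patch: there are no coroutines, struct task_pool and pool_task_done() are hypothetical names, and keeping the first error (rather than the last) is an assumption on my part.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, coroutine-free stand-in for the pool state. */
struct task_pool {
    int status;      /* 0, or the error code of the first failed task */
    int busy_tasks;  /* number of tasks still in flight */
};

/* Called when a task finishes with return value ret (negative errno on error). */
static void pool_task_done(struct task_pool *pool, int ret)
{
    assert(pool->busy_tasks > 0);
    pool->busy_tasks--;

    /* Assumption: only the first failure is recorded; later ones are dropped. */
    if (ret < 0 && pool->status == 0) {
        pool->status = ret;
    }
}

/* Returns 0 if no task has failed so far, otherwise the recorded error. */
static int pool_status(const struct task_pool *pool)
{
    return pool->status;
}
```

With that reading, a caller would check pool_status() after waiting for all tasks and treat any non-zero value as the result of the whole operation.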

> +bool aio_task_pool_empty(AioTaskPool *pool);
> +void aio_task_pool_start_task(AioTaskPool *pool, AioTask *task);

Maybe make a note that task->pool will be set automatically?

> +void aio_task_pool_wait_slot(AioTaskPool *pool);
> +void aio_task_pool_wait_one(AioTaskPool *pool);
> +void aio_task_pool_wait_all(AioTaskPool *pool);

Shouldn’t all of these but aio_task_pool_empty() and
aio_task_pool_status() be coroutine_fns?

> +#endif /* BLOCK_AIO_TASK_H */
> diff --git a/block/aio_task.c b/block/aio_task.c
> new file mode 100644
> index 00..807be8deb5
> --- /dev/null
> +++ b/block/aio_task.c

[...]

> +static void aio_task_co(void *opaque)

+coroutine_fn

[...]

> +void aio_task_pool_wait_one(AioTaskPool *pool)
> +{
> +assert(pool->busy_tasks > 0);
> +assert(qemu_coroutine_self() == pool->main_co);
> +
> +pool->wait_done = true;

Hmmm, but the wait actually isn’t done yet. :-)

Maybe s/wait_done/waiting/?

Max

> +qemu_coroutine_yield();
> +
> +assert(!pool->wait_done);
> +assert(pool->busy_tasks < pool->max_busy_tasks);
> +}





Re: [Qemu-devel] [QEMU] [PATCH v5 0/8] Add Qemu to SeaBIOS LCHS interface

2019-08-13 Thread Max Reitz
On 26.06.19 14:39, Sam Eiderman wrote:
> v1:
> 
> Non-standard logical geometries break under QEMU.
> 
> A virtual disk which contains an operating system which depends on
> logical geometries (consistent values being reported from BIOS INT13
> AH=08) will most likely break under QEMU/SeaBIOS if it has non-standard
> logical geometries - for example 56 SPT (sectors per track).
> No matter what QEMU will guess - SeaBIOS, for large enough disks - will
> use LBA translation, which will report 63 SPT instead.
> 
> In addition we cannot force SeaBIOS to rely on physical geometries at
> all. A virtio-blk-pci virtual disk with 255 physical heads cannot
> report more than 16 physical heads when moved to an IDE controller, the
> ATA spec allows a maximum of 16 heads - this is an artifact of
> virtualization.
> 
> By supplying the logical geometries directly we are able to support such
> "exotic" disks.
> 
> We will use fw_cfg to do just that.

(From a block perspective,) I didn’t find anything too bad, so:

Acked-by: Max Reitz 
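
For readers wondering why a guest breaks when the BIOS reports 63 SPT for a disk that was laid out with 56 SPT, the standard CHS-to-LBA formula shows it directly. This is a minimal sketch of that textbook formula, not code from the series; chs_to_lba() is a hypothetical helper name and the 16-head geometry in the usage note is just an example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Standard CHS to LBA conversion:
 *   LBA = (cylinder * heads + head) * sectors_per_track + (sector - 1)
 * Sectors are numbered from 1, cylinders and heads from 0.
 */
static uint64_t chs_to_lba(uint32_t c, uint32_t h, uint32_t s,
                           uint32_t heads, uint32_t spt)
{
    assert(heads > 0 && spt > 0);
    assert(h < heads && s >= 1 && s <= spt);
    return ((uint64_t)c * heads + h) * spt + (s - 1);
}
```

The same tuple (cylinder 1, head 0, sector 1) with 16 heads resolves to LBA 896 under 56 SPT but to LBA 1008 under 63 SPT, so software that addresses the disk through CHS reads the wrong sectors once the reported geometry changes.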





Re: [Qemu-devel] [QEMU] [PATCH v5 4/8] scsi: Propagate unrealize() callback to scsi-hd

2019-08-13 Thread Max Reitz
On 26.06.19 14:39, Sam Eiderman wrote:
> We will need to add LCHS removal logic to scsi-hd's unrealize() in the
> next commit.
> 
> Reviewed-by: Karl Heubaum 
> Reviewed-by: Arbel Moshe 
> Signed-off-by: Sam Eiderman 
> ---
>  hw/scsi/scsi-bus.c | 15 +++
>  include/hw/scsi/scsi.h |  1 +
>  2 files changed, 16 insertions(+)
> 
> diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
> index c480553083..f6fe497a1a 100644
> --- a/hw/scsi/scsi-bus.c
> +++ b/hw/scsi/scsi-bus.c

[...]

> @@ -213,11 +221,18 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp)
>  static void scsi_qdev_unrealize(DeviceState *qdev, Error **errp)
>  {
>  SCSIDevice *dev = SCSI_DEVICE(qdev);
> +Error *local_err = NULL;
>  
>  if (dev->vmsentry) {
>  qemu_del_vm_change_state_handler(dev->vmsentry);
>  }
>  
> +scsi_device_unrealize(dev, &local_err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +return;
> +}
> +
>  scsi_device_purge_requests(dev, SENSE_CODE(NO_SENSE));

(I see this code for the first time, but) I suppose I’d put the
scsi_device_unrealize() after scsi_device_purge_requests().

Max

>  blockdev_mark_auto_del(dev->conf.blk);
>  }





Re: [Qemu-devel] [QEMU] [PATCH v5 5/8] bootdevice: Gather LCHS from all relevant devices

2019-08-13 Thread Max Reitz
On 26.06.19 14:39, Sam Eiderman wrote:
> Relevant devices are:
> * ide-hd (and ide-cd, ide-drive)
> * scsi-hd (and scsi-cd, scsi-disk, scsi-block)
> * virtio-blk-pci
> 
> We do not call del_boot_device_lchs() for ide-* since we don't need to -
> IDE block devices do not support unplugging.
> 
> Reviewed-by: Karl Heubaum 
> Reviewed-by: Arbel Moshe 
> Signed-off-by: Sam Eiderman 
> ---
>  hw/block/virtio-blk.c |  6 ++
>  hw/ide/qdev.c |  5 +
>  hw/scsi/scsi-disk.c   | 14 ++
>  3 files changed, 25 insertions(+)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 06e57a4d39..787bbd768a 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -1182,6 +1182,11 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
>  blk_set_guest_block_size(s->blk, s->conf.conf.logical_block_size);
>  
>  blk_iostatus_enable(s->blk);
> +
> +add_boot_device_lchs(dev, "/disk@0,0",
> + (&conf->conf)->lcyls,
> + (&conf->conf)->lheads,
> + (&conf->conf)->lsecs);

...why not simply “conf->conf.lcyls” and so on?

[...]

> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index 7b89ac798b..3451aefdea 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c

[...]

> @@ -2988,6 +2998,7 @@ static void scsi_hd_class_initfn(ObjectClass *klass, void *data)
>  SCSIDeviceClass *sc = SCSI_DEVICE_CLASS(klass);
>  
>  sc->realize  = scsi_hd_realize;
> +sc->unrealize= scsi_unrealize;
>  sc->alloc_req= scsi_new_request;
>  sc->unit_attention_reported = scsi_disk_unit_attention_reported;
>  dc->desc = "virtual SCSI disk";
> @@ -3019,6 +3030,7 @@ static void scsi_cd_class_initfn(ObjectClass *klass, void *data)
>  SCSIDeviceClass *sc = SCSI_DEVICE_CLASS(klass);
>  
>  sc->realize  = scsi_cd_realize;
> +sc->unrealize= scsi_unrealize;
>  sc->alloc_req= scsi_new_request;
>  sc->unit_attention_reported = scsi_disk_unit_attention_reported;
>  dc->desc = "virtual SCSI CD-ROM";
> @@ -3054,6 +3066,7 @@ static void scsi_block_class_initfn(ObjectClass *klass, void *data)
>  SCSIDiskClass *sdc = SCSI_DISK_BASE_CLASS(klass);
>  
>  sc->realize  = scsi_block_realize;
> +sc->unrealize= scsi_unrealize;
>  sc->alloc_req= scsi_block_new_request;
>  sc->parse_cdb= scsi_block_parse_cdb;
>  sdc->dma_readv   = scsi_block_dma_readv;
> @@ -3095,6 +3108,7 @@ static void scsi_disk_class_initfn(ObjectClass *klass, void *data)
>  SCSIDeviceClass *sc = SCSI_DEVICE_CLASS(klass);
>  
>  sc->realize  = scsi_disk_realize;
> +sc->unrealize= scsi_unrealize;
>  sc->alloc_req= scsi_new_request;
>  sc->unit_attention_reported = scsi_disk_unit_attention_reported;
>  dc->fw_name = "disk";

Only scsi-hd has the lchs properties, though, so what’s the purpose of
defining the unrealize function for all other classes?

Max





Re: [Qemu-devel] [PATCH v3 11/14] migration: add support to migrate page encryption bitmap

2019-08-13 Thread Dr. David Alan Gilbert
* Singh, Brijesh (brijesh.si...@amd.com) wrote:
> When memory encryption is enabled, the hypervisor maintains a page
> encryption bitmap which the hypervisor consults during migration to check
> if a page is private or shared. The bitmap is built during the VM bootup and
> must be migrated to the target host so that the hypervisor on the target host
> can use it for future migration. The KVM_{SET,GET}_PAGE_ENC_BITMAP can be used
> to get and set the bitmap for a given gfn range.
> 
> Signed-off-by: Brijesh Singh 
> ---
>  accel/kvm/kvm-all.c  | 27 
>  accel/kvm/sev-stub.c | 11 +
>  include/sysemu/sev.h |  6 +++
>  target/i386/sev.c| 93 
>  target/i386/trace-events |  2 +
>  5 files changed, 139 insertions(+)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index ba0e7fa2be..f4d136b022 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -185,10 +185,37 @@ static int kvm_memcrypt_load_incoming_page(QEMUFile *f, uint8_t *ptr)
>  return sev_load_incoming_page(kvm_state->memcrypt_handle, f, ptr);
>  }
>  
> +static int kvm_memcrypt_save_outgoing_bitmap(QEMUFile *f)
> +{
> +KVMMemoryListener *kml = &kvm_state->memory_listener;
> +KVMState *s = kvm_state;
> +int ret = 1, i;
> +
> +/* iterate through all the registered slots and send the bitmap */
> +for (i = 0; i < s->nr_slots; i++) {
> +KVMSlot *mem = &kml->slots[i];
> +ret = sev_save_outgoing_bitmap(s->memcrypt_handle, f, mem->start_addr,
> +   mem->memory_size,
> +   (i + 1) == s->nr_slots);
> +if (ret) {
> +return 1;
> +}
> +}
> +
> +return ret;
> +}
> +
> +static int kvm_memcrypt_load_incoming_bitmap(QEMUFile *f)
> +{
> +return sev_load_incoming_bitmap(kvm_state->memcrypt_handle, f);
> +}
> +
>  static struct MachineMemoryEncryptionOps sev_memory_encryption_ops = {
>  .save_setup = kvm_memcrypt_save_setup,
>  .save_outgoing_page = kvm_memcrypt_save_outgoing_page,
>  .load_incoming_page = kvm_memcrypt_load_incoming_page,
> +.save_outgoing_bitmap = kvm_memcrypt_save_outgoing_bitmap,
> +.load_incoming_bitmap = kvm_memcrypt_load_incoming_bitmap,
>  };
>  
>  int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len)
> diff --git a/accel/kvm/sev-stub.c b/accel/kvm/sev-stub.c
> index 1b6773ef72..fa96225abc 100644
> --- a/accel/kvm/sev-stub.c
> +++ b/accel/kvm/sev-stub.c
> @@ -41,3 +41,14 @@ int sev_load_incoming_page(void *handle, QEMUFile *f, uint8_t *ptr)
>  {
>  return 1;
>  }
> +
> +int sev_save_outgoing_bitmap(void *handle, QEMUFile *f,
> + unsigned long start, uint64_t length, bool last)
> +{
> +return 1;
> +}
> +
> +int sev_load_incoming_bitmap(void *handle, QEMUFile *f)
> +{
> +return 1;
> +}
> diff --git a/include/sysemu/sev.h b/include/sysemu/sev.h
> index e9371bd2dd..f777083c94 100644
> --- a/include/sysemu/sev.h
> +++ b/include/sysemu/sev.h
> @@ -16,6 +16,9 @@
>  
>  #include "sysemu/kvm.h"
>  
> +#define RAM_SAVE_ENCRYPTED_PAGE0x1
> +#define RAM_SAVE_ENCRYPTED_BITMAP  0x2
> +
>  void *sev_guest_init(const char *id);
>  int sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len);
>  int sev_save_setup(void *handle, const char *pdh, const char *plat_cert,
> @@ -23,4 +26,7 @@ int sev_save_setup(void *handle, const char *pdh, const char *plat_cert,
>  int sev_save_outgoing_page(void *handle, QEMUFile *f, uint8_t *ptr,
> uint32_t size, uint64_t *bytes_sent);
>  int sev_load_incoming_page(void *handle, QEMUFile *f, uint8_t *ptr);
> +int sev_load_incoming_bitmap(void *handle, QEMUFile *f);
> +int sev_save_outgoing_bitmap(void *handle, QEMUFile *f, unsigned long start,
> + uint64_t length, bool last);
>  #endif
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index a689011991..9d643e720c 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -65,6 +65,8 @@ static const char *const sev_fw_errlist[] = {
>  #define SEV_FW_MAX_ERROR  ARRAY_SIZE(sev_fw_errlist)
>  
>  #define SEV_FW_BLOB_MAX_SIZE0x4000  /* 16KB */
> +#define ENCRYPTED_BITMAP_CONTINUE   0x1
> +#define ENCRYPTED_BITMAP_END0x2
>  
>  static int
>  sev_ioctl(int fd, int cmd, void *data, int *error)
> @@ -1232,6 +1234,97 @@ int sev_load_incoming_page(void *handle, QEMUFile *f, uint8_t *ptr)
>  return sev_receive_update_data(f, ptr);
>  }
>  
> +#define ALIGN(x, y)  (((x) + (y) - 1) & ~((y) - 1))
> +
> +int sev_load_incoming_bitmap(void *handle, QEMUFile *f)
> +{
> +void *bmap;
> +unsigned long bmap_size, base_gpa;
> +unsigned long npages, expected_size, length;
> +struct kvm_page_enc_bitmap e = {};
> +int status;
> +
> +status = qemu_get_be32(f);
> +
> +while (status != ENCRYPTED_BITMAP_END) {

It would be good to be more defensive - I 

Re: [Qemu-devel] [RFC PATCH v2 04/17] fuzz: Skip modules that were already initialized

2019-08-13 Thread Oleinik, Alexander
On Fri, 2019-08-09 at 10:04 +0100, Stefan Hajnoczi wrote:
> On Mon, Aug 05, 2019 at 07:11:05AM +, Oleinik, Alexander wrote:
> > Signed-off-by: Alexander Oleinik 
> > ---
> >  util/module.c | 7 +++
> >  1 file changed, 7 insertions(+)
> 
> Why is this necessary?  Existing callers only invoke this function
> once
> for each type.
This was suggested by Paolo in Message-ID:
fad9d12a-39df-e2fa-064b-5132add9d...@redhat.com

I need to initialize the QOS module in the fuzzer main to identify the
qemu arguments, prior to running vl.c:main.
> Please include justification in the commit description.
Will do
> Stefan



Re: [Qemu-devel] [QEMU] [PATCH v5 3/8] bootdevice: Add interface to gather LCHS

2019-08-13 Thread Max Reitz
On 26.06.19 14:39, Sam Eiderman wrote:
> Add an interface to provide direct logical CHS values for boot devices.
> We will use this interface in the next commits.
> 
> Reviewed-by: Karl Heubaum 
> Reviewed-by: Arbel Moshe 
> Signed-off-by: Sam Eiderman 
> ---
>  bootdevice.c| 55 
> +
>  include/sysemu/sysemu.h |  3 +++
>  2 files changed, 58 insertions(+)

I’ve got a couple of “undelivered mail returned to sender” mails for Sam
recently, but anyway...

> diff --git a/bootdevice.c b/bootdevice.c
> index 1d225202f9..bc5e1c2de4 100644
> --- a/bootdevice.c
> +++ b/bootdevice.c
> @@ -343,3 +343,58 @@ void device_add_bootindex_property(Object *obj, int32_t *bootindex,
>  /* initialize devices' bootindex property to -1 */
>  object_property_set_int(obj, -1, name, NULL);
>  }
> +
> +typedef struct FWLCHSEntry FWLCHSEntry;
> +
> +struct FWLCHSEntry {
> +QTAILQ_ENTRY(FWLCHSEntry) link;
> +DeviceState *dev;
> +char *suffix;
> +uint32_t lcyls;
> +uint32_t lheads;
> +uint32_t lsecs;
> +};
> +
> +static QTAILQ_HEAD(, FWLCHSEntry) fw_lchs =
> +QTAILQ_HEAD_INITIALIZER(fw_lchs);
> +
> +void add_boot_device_lchs(DeviceState *dev, const char *suffix,
> +  uint32_t lcyls, uint32_t lheads, uint32_t lsecs)
> +{
> +FWLCHSEntry *node;
> +
> +if (!lcyls && !lheads && !lsecs) {
> +return;
> +}
> +
> +assert(dev != NULL || suffix != NULL);

It doesn’t look like any caller actually passes a NULL @dev, so why not
drop the @suffix part?

> +node = g_malloc0(sizeof(FWLCHSEntry));
> +node->suffix = g_strdup(suffix);
> +node->dev = dev;
> +node->lcyls = lcyls;
> +node->lheads = lheads;
> +node->lsecs = lsecs;
> +
> +QTAILQ_INSERT_TAIL(&fw_lchs, node, link);
> +}
> +
> +void del_boot_device_lchs(DeviceState *dev, const char *suffix)
> +{
> +FWLCHSEntry *i;
> +
> +if (dev == NULL) {
> +return;
> +}
> +
> +QTAILQ_FOREACH(i, &fw_lchs, link) {
> +if ((!suffix || !g_strcmp0(i->suffix, suffix)) &&
> + i->dev == dev) {

(Furthermore, it’d be impossible to remove an FWLCHSEntry with .dev ==
NULL.)

Max

> +QTAILQ_REMOVE(&fw_lchs, i, link);
> +g_free(i->suffix);
> +g_free(i);
> +
> +break;
> +}
> +}
> +}
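
The asymmetry pointed out in the review comments can be reproduced with a small standalone model of the two functions. This is a simplified sketch, not the patch code: it uses a plain singly linked list instead of QTAILQ, a void pointer instead of DeviceState, and the names lchs_add(), lchs_del() and lchs_count() are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct lchs_entry {
    struct lchs_entry *next;
    const void *dev;   /* stand-in for DeviceState * */
    char *suffix;      /* owned copy, may be NULL */
};

static struct lchs_entry *lchs_head;

static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, s, n);
    return copy;
}

/* Mirrors the add path: either dev or suffix has to be given. */
static void lchs_add(const void *dev, const char *suffix)
{
    struct lchs_entry *e;

    assert(dev != NULL || suffix != NULL);
    e = calloc(1, sizeof(*e));
    assert(e != NULL);
    e->dev = dev;
    e->suffix = suffix ? dup_str(suffix) : NULL;
    e->next = lchs_head;   /* head insertion; ordering is irrelevant here */
    lchs_head = e;
}

/* Mirrors the delete path: returns 1 if an entry was removed, else 0. */
static int lchs_del(const void *dev, const char *suffix)
{
    struct lchs_entry **pp;

    if (dev == NULL) {
        return 0;   /* entries added with dev == NULL can never match */
    }
    for (pp = &lchs_head; *pp; pp = &(*pp)->next) {
        struct lchs_entry *e = *pp;
        int suffix_ok = !suffix ||
                        (e->suffix && strcmp(e->suffix, suffix) == 0);

        if (suffix_ok && e->dev == dev) {
            *pp = e->next;
            free(e->suffix);
            free(e);
            return 1;
        }
    }
    return 0;
}

static size_t lchs_count(void)
{
    size_t n = 0;
    const struct lchs_entry *e;

    for (e = lchs_head; e; e = e->next) {
        n++;
    }
    return n;
}
```

An entry added with a NULL device and only a suffix stays in the list forever, because the delete path bails out early for dev == NULL. Requiring a non-NULL device on the add side would remove that case.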





Re: [Qemu-devel] [RFC PATCH v2 02/17] fuzz: Add fuzzer configure options

2019-08-13 Thread Oleinik, Alexander
On Mon, 2019-08-12 at 18:39 -0400, Bandan Das wrote:
> "Oleinik, Alexander"  writes:
> ...
> >  if test "$supported_cpu" = "no"; then
> >  echo
> > @@ -7306,6 +7310,17 @@ fi
> >  if test "$sheepdog" = "yes" ; then
> >echo "CONFIG_SHEEPDOG=y" >> $config_host_mak
> >  fi
> > +if test "$fuzzing" = "yes" ; then
> > +  QEMU_CFLAGS="$QEMU_CFLAGS -fsanitize=fuzzer,address  -fprofile-
> > instr-generate"
> > +  QEMU_CFLAGS="$QEMU_CFLAGS -fprofile-instr-generate -fcoverage-
> > mapping"
> 
> What is the purpose of -fprofile-instr-generate ? Coverage info ?
> (Listed twice above)
Yes, it's for coverage info. I'll fix it so it is only listed once.

> Bandan
> 
> > +  QEMU_LDFLAGS="$LDFLAGS -fsanitize=fuzzer,address"
> > +
> > +  # Add tests/ to include path, since this is done in
> > tests/Makefile.include,
> > +  # and required for QOS objects to build. This can be removed
> > if/when the
> > +  # fuzzer is compiled using rules in tests/Makefile.include
> > +  QEMU_INCLUDES="-iquote \$(SRC_PATH)/tests $QEMU_INCLUDES"
> > +  echo "CONFIG_FUZZ=y" >> $config_host_mak
> > +fi
> >  
> >  if test "$tcg_interpreter" = "yes"; then
> >QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"



Re: [Qemu-devel] [PATCH v2] Add git-publish profile for security bugs

2019-08-13 Thread John Snow



On 8/13/19 5:07 AM, Gerd Hoffmann wrote:
> Simplifies sending security patches to all people listed in
> https://wiki.qemu.org/SecurityProcess.  Should also make it
> harder to send a copy to the mailing list by accident.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  .gitpublish | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/.gitpublish b/.gitpublish
> index a13f8c7c0ecd..01f8279fa840 100644
> --- a/.gitpublish
> +++ b/.gitpublish
> @@ -49,3 +49,15 @@ base = master
>  to = qemu-devel@nongnu.org
>  cc = qemu-...@nongnu.org
>  cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null
> +
> +# https://wiki.qemu.org/SecurityProcess
> +[gitpublishprofile "security"]
> +base = master
> +to = m...@redhat.com
> +to = pmato...@redhat.com
> +to = sstabell...@kernel.org
> +to = secal...@redhat.com
> +to = mdr...@linux.vnet.ibm.com
> +to = p...@redhat.com
> +suppresscc = all
> +inspect-emails = true
> 

OK, but it still would be nice to have the MAINTAINERS file and this
config item cross-reference each other, especially because any changes
to our security policy should not leave hanging loose ends.



[Qemu-devel] [PATCH v3] block: posix: Handle undetectable alignment

2019-08-13 Thread Nir Soffer
In some cases buf_align or request_alignment cannot be detected:

1. With Gluster, buf_align cannot be detected since the actual I/O is
   done on Gluster server, and qemu buffer alignment does not matter.
   Since we don't have alignment requirement, buf_align=1 is the best
   value.

2. With local XFS filesystem, buf_align cannot be detected if reading
   from unallocated area. In this case we must align the buffer, but we
   don't know what is the correct size. Using the wrong alignment results
   in I/O error.

3. With Gluster backed by XFS, request_alignment cannot be detected if
   reading from unallocated area. In this case we need to use the
   correct alignment, and failing to do so results in I/O errors.

4. With NFS, the server does not use direct I/O, so both buf_align and
   request_alignment cannot be detected. In this case we don't need any
   alignment so we can use buf_align=1 and request_alignment=1.

These cases seem to work when storage sector size is 512 bytes, because
the current code starts checking align=512. If the check succeeds
because alignment cannot be detected we use 512. But this does not work
for storage with 4k sector size.

To determine if we can detect the alignment, we probe first with
align=1. If probing succeeds, maybe there is no alignment requirement
(cases 1, 4) or we are probing unallocated area (cases 2, 3). Since we
don't have any way to tell, we treat this as undetectable alignment. If
probing with align=1 fails with EINVAL, but probing with one of the
expected alignments succeeds, we know that we found a working alignment.

Practically the alignment requirements are the same for buffer
alignment, buffer length, and offset in file. So in case we cannot
detect buf_align, we can use request alignment. If we cannot detect
request alignment, we can fallback to a safe value. To use this logic,
we probe first request alignment instead of buf_align.
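
The probing and fallback logic described above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the code in block/file-posix.c: the probe is abstracted into a hypothetical callback instead of a real direct I/O read, detect_alignment() and choose_alignment() are made-up names, and the safe fallback value is left as a parameter (the result table in this message shows 4096 for the undetectable cases).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe: returns true if I/O with this alignment succeeds. */
typedef bool (*probe_fn)(size_t align, void *opaque);

/* Returns the detected alignment, or 0 if it cannot be detected. */
static size_t detect_alignment(probe_fn probe, void *opaque)
{
    static const size_t alignments[] = {1, 512, 1024, 2048, 4096};
    size_t i;

    for (i = 0; i < sizeof(alignments) / sizeof(alignments[0]); i++) {
        if (probe(alignments[i], opaque)) {
            /*
             * Success with align=1 means either no requirement or an
             * unallocated area; both are treated as undetectable.
             */
            return i == 0 ? 0 : alignments[i];
        }
    }
    return 0;
}

struct alignment {
    size_t request;  /* offset and length alignment */
    size_t buf;      /* memory buffer alignment */
};

/*
 * Fallback chain: an undetected request alignment falls back to a safe
 * value, an undetected buffer alignment falls back to the request alignment.
 */
static struct alignment choose_alignment(size_t detected_request,
                                         size_t detected_buf,
                                         size_t safe_value)
{
    struct alignment a;

    a.request = detected_request ? detected_request : safe_value;
    a.buf = detected_buf ? detected_buf : a.request;
    return a;
}

/* Example probe simulating storage that needs *opaque bytes of alignment. */
static bool probe_requires(size_t align, void *opaque)
{
    size_t required = *(size_t *)opaque;

    return align % required == 0;
}
```

With a simulated requirement of 4096 bytes the probe fails for 1, 512, 1024 and 2048 and succeeds at 4096, so 4096 is reported as detected. With a requirement of 1 the very first probe succeeds and the result is treated as undetectable, which is where the fallback chain takes over.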

Here is a table showing the behaviour with current code (the value in
parenthesis is the optimal value).

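Case    Sector    buf_align (opt)   request_alignment (opt)   result
====================================================================
1       512       512   (1)         512   (512)               OK
1       4096      512   (1)         4096  (4096)              FAIL
--------------------------------------------------------------------
2       512       512   (512)       512   (512)               OK
2       4096      512   (4096)      4096  (4096)              FAIL
--------------------------------------------------------------------
3       512       512   (1)         512   (512)               OK
3       4096      512   (1)         512   (4096)              FAIL
--------------------------------------------------------------------
4       512       512   (1)         512   (1)                 OK
4       4096      512   (1)         512   (1)                 OK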

Same cases with this change:

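Case    Sector    buf_align (opt)   request_alignment (opt)   result
====================================================================
1       512       512   (1)         512   (512)               OK
1       4096      4096  (1)         4096  (4096)              OK
--------------------------------------------------------------------
2       512       512   (512)       512   (512)               OK
2       4096      4096  (4096)      4096  (4096)              OK
--------------------------------------------------------------------
3       512       4096  (1)         4096  (512)               OK
3       4096      4096  (1)         4096  (4096)              OK
--------------------------------------------------------------------
4       512       4096  (1)         4096  (1)                 OK
4       4096      4096  (1)         4096  (1)                 OK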

I tested that provisioning VMs and copying disks on local XFS and
Gluster with 4k bytes sector size work now, resolving bugs [1],[2].
I tested also on XFS, NFS, Gluster with 512 bytes sector size.

[1] https://bugzilla.redhat.com/1737256
[2] https://bugzilla.redhat.com/1738657

Signed-off-by: Nir Soffer 
---

Changes since v2
- Improve the commit message (Kevin)
- Remove unneeded 2-level ternary (Kevin)

v2 was here:
https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00426.html

 block/file-posix.c | 36 +---
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index f33b542b33..9baade65f4 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -323,6 +323,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
 BDRVRawState *s = bs->opaque;
 char *buf;
 size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
+size_t alignments[] = {1, 512, 1024, 2048, 4096};
 
 /* For SCSI generic devices the alignment is not really used.
With buffered I/O, we don't have any restrictions. */
@@ -349,25 +350,38 

Re: [Qemu-devel] [PATCH] usb: reword -usb command-line option and mention xHCI

2019-08-13 Thread Thomas Huth
On 8/13/19 3:30 PM, Stefan Hajnoczi wrote:
> The -usb section of the man page is not very clear on what exactly -usb
> does and fails to mention xHCI as a modern alternative (-device
> nec-usb-xhci).
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  qemu-options.hx | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 9621e934c0..7d11c016d1 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1436,12 +1436,15 @@ STEXI
>  ETEXI
>  
>  DEF("usb", 0, QEMU_OPTION_usb,
> -"-usb            enable the USB driver (if it is not used by default yet)\n",
> +"-usb            enable on-board USB host controller (if not enabled by default)\n",
>  QEMU_ARCH_ALL)
>  STEXI
>  @item -usb
>  @findex -usb
> -Enable the USB driver (if it is not used by default yet).
> +Enable USB emulation on machine types with an on-board USB host controller (if
> +not enabled by default).  Note that on-board USB host controllers may not
> +support USB 3.0.  In this case -device nec-usb-xhci can be used instead on

Should we maybe rather recommend qemu-xhci instead?
And please put the @option{} around the "-device *-xhci" here.

With @option:

Reviewed-by: Thomas Huth 



Re: [Qemu-devel] [PATCH 3/6] tests/libqtest: Remove unused function hmp()

2019-08-13 Thread Thomas Huth
On 8/13/19 5:20 PM, Eric Blake wrote:
> On 8/13/19 4:30 AM, Thomas Huth wrote:
>> No test is using hmp() anymore, and since this function uses the disliked
>> global_qtest variable, we should also make sure that nobody adds new code
>> with this function again. qtest_hmp() should be used instead.
>>
>> Signed-off-by: Thomas Huth 
>> ---
>>  tests/libqtest.c | 11 ---
>>  tests/libqtest.h | 10 --
>>  2 files changed, 21 deletions(-)
> 
> Yay.
> 
> We could, at a later time, introduce a patch to do s/qtest_hmp/hmp/ if
> it was deemed worthwhile, but I'm not sure it's worth the churn.

Actually, I like the qtest_* prefix for the libqtest functions - it makes
it clear at first sight whether a function is part of libqtest or
rather of the test itself.

> Reviewed-by: Eric Blake 

Thanks a lot!

 Thomas







Re: [Qemu-devel] [PATCH v3 10/14] target/i386: sev: add support to load incoming encrypted page

2019-08-13 Thread Dr. David Alan Gilbert
* Singh, Brijesh (brijesh.si...@amd.com) wrote:
> sev_load_incoming_page() provides the implementation to read the
> incoming guest private pages from the socket and load them into guest
> memory. The routine uses the RECEIVE_START command to create the
> incoming encryption context on the first call, then uses the
> RECEIVE_UPDATE_DATA command to load the encrypted pages into guest
> memory. After migration is completed, we issue the RECEIVE_FINISH command
> to transition the SEV guest to the runnable state so that it can be
> executed.
> 
> Signed-off-by: Brijesh Singh 

OK, some comments about the return values of the functions would help,
but other than that.


Reviewed-by: Dr. David Alan Gilbert 
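
The receive flow described in the commit message, together with the kind of return-value comments asked for above, can be illustrated with a small stand-alone state machine. This is only a sketch: RxState, RxContext, rx_load_page() and rx_finish() are hypothetical names, not part of the patch or of QEMU, and the "0 on success, nonzero on failure" convention is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only; not QEMU's SEV code. */
typedef enum {
    RX_IDLE,     /* no incoming encryption context yet */
    RX_UPDATE,   /* context created, pages are being loaded */
    RX_RUNNING   /* finish step done, guest is runnable */
} RxState;

typedef struct {
    RxState state;
    unsigned start_calls;   /* times the RECEIVE_START step was taken */
    unsigned pages_loaded;  /* times the RECEIVE_UPDATE_DATA step was taken */
} RxContext;

/*
 * Load one incoming page.
 * Returns 0 on success, 1 on failure (assumed convention).
 */
int rx_load_page(RxContext *ctx)
{
    assert(ctx != NULL);
    if (ctx->state == RX_RUNNING) {
        return 1;               /* guest already finalized */
    }
    if (ctx->state == RX_IDLE) {
        ctx->start_calls++;     /* stands in for RECEIVE_START, first call only */
        ctx->state = RX_UPDATE;
    }
    ctx->pages_loaded++;        /* stands in for RECEIVE_UPDATE_DATA */
    return 0;
}

/*
 * Finish the migration.
 * Returns 0 on success, 1 if no receive is in progress.
 */
int rx_finish(RxContext *ctx)
{
    assert(ctx != NULL);
    if (ctx->state != RX_UPDATE) {
        return 1;
    }
    ctx->state = RX_RUNNING;    /* stands in for RECEIVE_FINISH */
    return 0;
}
```

The point of the sketch is that the start step happens exactly once, on the first page, and that the finish step is only valid while pages are being received.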

> ---
>  accel/kvm/kvm-all.c  |   6 ++
>  accel/kvm/sev-stub.c |   5 ++
>  include/sysemu/sev.h |   1 +
>  target/i386/sev.c| 137 ++-
>  target/i386/trace-events |   3 +
>  5 files changed, 151 insertions(+), 1 deletion(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index a5b0ae9363..ba0e7fa2be 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -180,9 +180,15 @@ static int kvm_memcrypt_save_outgoing_page(QEMUFile *f, uint8_t *ptr,
>bytes_sent);
>  }
>  
> +static int kvm_memcrypt_load_incoming_page(QEMUFile *f, uint8_t *ptr)
> +{
> +return sev_load_incoming_page(kvm_state->memcrypt_handle, f, ptr);
> +}
> +
>  static struct MachineMemoryEncryptionOps sev_memory_encryption_ops = {
>  .save_setup = kvm_memcrypt_save_setup,
>  .save_outgoing_page = kvm_memcrypt_save_outgoing_page,
> +.load_incoming_page = kvm_memcrypt_load_incoming_page,
>  };
>  
>  int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len)
> diff --git a/accel/kvm/sev-stub.c b/accel/kvm/sev-stub.c
> index 51b17b8141..1b6773ef72 100644
> --- a/accel/kvm/sev-stub.c
> +++ b/accel/kvm/sev-stub.c
> @@ -36,3 +36,8 @@ int sev_save_outgoing_page(void *handle, QEMUFile *f, uint8_t *ptr,
>  {
>  return 1;
>  }
> +
> +int sev_load_incoming_page(void *handle, QEMUFile *f, uint8_t *ptr)
> +{
> +return 1;
> +}
> diff --git a/include/sysemu/sev.h b/include/sysemu/sev.h
> index f06fd203cd..e9371bd2dd 100644
> --- a/include/sysemu/sev.h
> +++ b/include/sysemu/sev.h
> @@ -22,4 +22,5 @@ int sev_save_setup(void *handle, const char *pdh, const char *plat_cert,
> const char *amd_cert);
>  int sev_save_outgoing_page(void *handle, QEMUFile *f, uint8_t *ptr,
> uint32_t size, uint64_t *bytes_sent);
> +int sev_load_incoming_page(void *handle, QEMUFile *f, uint8_t *ptr);
>  #endif
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index 1820c62a71..a689011991 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -721,13 +721,34 @@ sev_launch_finish(SEVState *s)
>  }
>  }
>  
> +static int
> +sev_receive_finish(SEVState *s)
> +{
> +int error, ret = 1;
> +
> +trace_kvm_sev_receive_finish();
> +ret = sev_ioctl(s->sev_fd, KVM_SEV_RECEIVE_FINISH, 0, &error);
> +if (ret) {
> +error_report("%s: RECEIVE_FINISH ret=%d fw_error=%d '%s'",
> +__func__, ret, error, fw_error_to_str(error));
> +goto err;
> +}
> +
> +sev_set_guest_state(SEV_STATE_RUNNING);
> +err:
> +return ret;
> +}
> +
> +
>  static void
>  sev_vm_state_change(void *opaque, int running, RunState state)
>  {
>  SEVState *s = opaque;
>  
>  if (running) {
> -if (!sev_check_state(SEV_STATE_RUNNING)) {
> +if (sev_check_state(SEV_STATE_RECEIVE_UPDATE)) {
> +sev_receive_finish(s);
> +} else if (!sev_check_state(SEV_STATE_RUNNING)) {
>  sev_launch_finish(s);
>  }
>  }
> @@ -1097,6 +1118,120 @@ int sev_save_outgoing_page(void *handle, QEMUFile *f, uint8_t *ptr,
>  return sev_send_update_data(s, f, ptr, sz, bytes_sent);
>  }
>  
> +static int
> +sev_receive_start(QSevGuestInfo *sev, QEMUFile *f)
> +{
> +int ret = 1;
> +int fw_error;
> +struct kvm_sev_receive_start start = { };
> +gchar *session = NULL, *pdh_cert = NULL;
> +
> +/* get SEV guest handle */
> +start.handle = object_property_get_int(OBJECT(sev), "handle",
> +   &error_abort);
> +
> +/* get the source policy */
> +start.policy = qemu_get_be32(f);
> +
> +/* get source PDH key */
> +start.pdh_len = qemu_get_be32(f);
> +if (!check_blob_length(start.pdh_len)) {
> +return 1;
> +}
> +
> +pdh_cert = g_new(gchar, start.pdh_len);
> +qemu_get_buffer(f, (uint8_t *)pdh_cert, start.pdh_len);
> +start.pdh_uaddr = (uintptr_t)pdh_cert;
> +
> +/* get source session data */
> +start.session_len = qemu_get_be32(f);
> +if (!check_blob_length(start.session_len)) {
> +return 1;
> +}
> +session = g_new(gchar, start.session_len);
> +qemu_get_buffer(f, (uint8_t *)session, start.session_len);
> +

Re: [Qemu-devel] [PATCH v2] ppc: Add support for 'mffsl' instruction

2019-08-13 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/1565712926-21194-1-git-send-email...@us.ibm.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH v2] ppc: Add support for 'mffsl' instruction
Message-id: 1565712926-21194-1-git-send-email...@us.ibm.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/1565712926-21194-1-git-send-email...@us.ibm.com -> 
patchew/1565712926-21194-1-git-send-email...@us.ibm.com
 * [new tag] patchew/20190813164420.9829-1-...@kaod.org -> 
patchew/20190813164420.9829-1-...@kaod.org
Submodule 'capstone' (https://git.qemu.org/git/capstone.git) registered for 
path 'capstone'
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (https://git.qemu.org/git/QemuMacDrivers.git) 
registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (https://git.qemu.org/git/SLOF.git) registered for path 
'roms/SLOF'
Submodule 'roms/edk2' (https://git.qemu.org/git/edk2.git) registered for path 
'roms/edk2'
Submodule 'roms/ipxe' (https://git.qemu.org/git/ipxe.git) registered for path 
'roms/ipxe'
Submodule 'roms/openbios' (https://git.qemu.org/git/openbios.git) registered 
for path 'roms/openbios'
Submodule 'roms/openhackware' (https://git.qemu.org/git/openhackware.git) 
registered for path 'roms/openhackware'
Submodule 'roms/opensbi' (https://git.qemu.org/git/opensbi.git) registered for 
path 'roms/opensbi'
Submodule 'roms/qemu-palcode' (https://git.qemu.org/git/qemu-palcode.git) 
registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (https://git.qemu.org/git/seabios.git/) registered for 
path 'roms/seabios'
Submodule 'roms/seabios-hppa' (https://git.qemu.org/git/seabios-hppa.git) 
registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (https://git.qemu.org/git/sgabios.git) registered for 
path 'roms/sgabios'
Submodule 'roms/skiboot' (https://git.qemu.org/git/skiboot.git) registered for 
path 'roms/skiboot'
Submodule 'roms/u-boot' (https://git.qemu.org/git/u-boot.git) registered for 
path 'roms/u-boot'
Submodule 'roms/u-boot-sam460ex' (https://git.qemu.org/git/u-boot-sam460ex.git) 
registered for path 'roms/u-boot-sam460ex'
Submodule 'slirp' (https://git.qemu.org/git/libslirp.git) registered for path 
'slirp'
Submodule 'tests/fp/berkeley-softfloat-3' 
(https://git.qemu.org/git/berkeley-softfloat-3.git) registered for path 
'tests/fp/berkeley-softfloat-3'
Submodule 'tests/fp/berkeley-testfloat-3' 
(https://git.qemu.org/git/berkeley-testfloat-3.git) registered for path 
'tests/fp/berkeley-testfloat-3'
Submodule 'ui/keycodemapdb' (https://git.qemu.org/git/keycodemapdb.git) 
registered for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out 
'22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '88f18909db731a627456f26d779445f84e449536'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out 
'90c488d5f4a407342247b9ea869df1c2d9c8e266'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 
'ba1ab360eebe6338bb8d7d83a9220ccf7e213af3'
Cloning into 'roms/edk2'...
Submodule path 'roms/edk2': checked out 
'20d2e5a125e34fc8501026613a71549b2a1a3e54'
Submodule 'SoftFloat' (https://github.com/ucb-bar/berkeley-softfloat-3.git) 
registered for path 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Submodule 'CryptoPkg/Library/OpensslLib/openssl' 
(https://github.com/openssl/openssl) registered for path 
'CryptoPkg/Library/OpensslLib/openssl'
Cloning into 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'...
Submodule path 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3': 
checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'CryptoPkg/Library/OpensslLib/openssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl': checked out 
'50eaac9f3337667259de725451f201e784599687'
Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl) registered 
for path 'boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path 'krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git) 
registered for path 'pyca-cryptography'
Cloning into 'boringssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl': 
checked out '2070f8ad9151dc8f3a73bffaa146b5e6937a583f'
Cloning into 'krb5'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5': checked 
out 'b9ad6c49505c96a088326b62a52568e3484f2168'
Cloning into 'pyca-cryptography'...
Submodule path 

Re: [Qemu-devel] [PATCH-4.2 v2 5/5] target/riscv: Fix Floating Point register names

2019-08-13 Thread Alistair Francis
On Mon, Aug 12, 2019 at 4:08 PM Palmer Dabbelt  wrote:
>
> On Tue, 30 Jul 2019 16:35:34 PDT (-0700), Alistair Francis wrote:
> > From: Atish Patra 
> >
> > As per the RISC-V spec, Floating Point registers are named as f0..f31
> > so lets fix the register names accordingly.
> >
> > Signed-off-by: Atish Patra 
> > Signed-off-by: Alistair Francis 
> > ---
> >  target/riscv/cpu.c | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> > index f8d07bd20a..af1e9b7690 100644
> > --- a/target/riscv/cpu.c
> > +++ b/target/riscv/cpu.c
> > @@ -40,10 +40,10 @@ const char * const riscv_int_regnames[] = {
> >  };
> >
> >  const char * const riscv_fpr_regnames[] = {
> > -  "ft0", "ft1", "ft2",  "ft3",  "ft4", "ft5", "ft6",  "ft7",
> > -  "fs0", "fs1", "fa0",  "fa1",  "fa2", "fa3", "fa4",  "fa5",
> > -  "fa6", "fa7", "fs2",  "fs3",  "fs4", "fs5", "fs6",  "fs7",
> > -  "fs8", "fs9", "fs10", "fs11", "ft8", "ft9", "ft10", "ft11"
> > +  "f0", "f1", "f2",  "f3",  "f4", "f5", "f6", "f7",
> > +  "f8", "f9", "f10",  "f11",  "f12", "f13", "f14", "f15",
> > +  "f16", "f17", "f18",  "f19",  "f20", "f21", "f22", "f23",
> > +  "f24", "f25", "f26", "f27", "f28", "f29", "f30", "f31"
> >  };
> >
> >  const char * const riscv_excp_names[] = {
>
> I actually don't think this one is right: riscv_int_regnames uses the ABI
> names, so this should match.  I'd be OK switching both of them, but not just
> one.

I like that the int registers use the ABI names though, as I find that useful.

What if we change the register names to use both? Something like
x0/zero for all registers?

The disadvantage is that it's a little longer, but it seems the most useful.

Alistair

>
> I've queued the other four patches.
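
The combined naming suggested above ("x0/zero") could be generated from the existing ABI name tables rather than written out by hand. The following is a rough sketch under that assumption; combined_regname() and the table names are hypothetical and not part of target/riscv/cpu.c. The floating-point ABI names are the ones removed by the patch, and the integer ABI names follow the standard RISC-V calling convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* ABI names indexed by architectural register number (x0..x31). */
const char * const int_abi_names[32] = {
    "zero", "ra", "sp",  "gp",  "tp", "t0", "t1", "t2",
    "s0",   "s1", "a0",  "a1",  "a2", "a3", "a4", "a5",
    "a6",   "a7", "s2",  "s3",  "s4", "s5", "s6", "s7",
    "s8",   "s9", "s10", "s11", "t3", "t4", "t5", "t6"
};

/* ABI names indexed by architectural register number (f0..f31). */
const char * const fpr_abi_names[32] = {
    "ft0", "ft1", "ft2",  "ft3",  "ft4", "ft5", "ft6",  "ft7",
    "fs0", "fs1", "fa0",  "fa1",  "fa2", "fa3", "fa4",  "fa5",
    "fa6", "fa7", "fs2",  "fs3",  "fs4", "fs5", "fs6",  "fs7",
    "fs8", "fs9", "fs10", "fs11", "ft8", "ft9", "ft10", "ft11"
};

/*
 * Hypothetical helper: write "<prefix><idx>/<abi name>" into buf,
 * e.g. "x0/zero" or "f10/fa0". Returns the snprintf() result, i.e. the
 * length the full string needs (excluding the terminating NUL).
 */
int combined_regname(char *buf, size_t len, char prefix,
                     const char * const *abi, unsigned idx)
{
    assert(buf != NULL && abi != NULL && idx < 32);
    return snprintf(buf, len, "%c%u/%s", prefix, idx, abi[idx]);
}
```

The longest result is 8 characters ("f26/fs10", "f30/ft10"), so the cost in display width is modest.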



Re: [Qemu-devel] [PATCH] ppc: Add support for 'mffsl' instruction

2019-08-13 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/1565710319-1026-1-git-send-email...@us.ibm.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH] ppc: Add support for 'mffsl' instruction
Message-id: 1565710319-1026-1-git-send-email...@us.ibm.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
   02db1be..f28ed74  master -> master
 * [new tag] patchew/1565710319-1026-1-git-send-email...@us.ibm.com -> 
patchew/1565710319-1026-1-git-send-email...@us.ibm.com
Submodule 'capstone' (https://git.qemu.org/git/capstone.git) registered for 
path 'capstone'
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (https://git.qemu.org/git/QemuMacDrivers.git) 
registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (https://git.qemu.org/git/SLOF.git) registered for path 
'roms/SLOF'
Submodule 'roms/edk2' (https://git.qemu.org/git/edk2.git) registered for path 
'roms/edk2'
Submodule 'roms/ipxe' (https://git.qemu.org/git/ipxe.git) registered for path 
'roms/ipxe'
Submodule 'roms/openbios' (https://git.qemu.org/git/openbios.git) registered 
for path 'roms/openbios'
Submodule 'roms/openhackware' (https://git.qemu.org/git/openhackware.git) 
registered for path 'roms/openhackware'
Submodule 'roms/opensbi' (https://git.qemu.org/git/opensbi.git) registered for 
path 'roms/opensbi'
Submodule 'roms/qemu-palcode' (https://git.qemu.org/git/qemu-palcode.git) 
registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (https://git.qemu.org/git/seabios.git/) registered for 
path 'roms/seabios'
Submodule 'roms/seabios-hppa' (https://git.qemu.org/git/seabios-hppa.git) 
registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (https://git.qemu.org/git/sgabios.git) registered for 
path 'roms/sgabios'
Submodule 'roms/skiboot' (https://git.qemu.org/git/skiboot.git) registered for 
path 'roms/skiboot'
Submodule 'roms/u-boot' (https://git.qemu.org/git/u-boot.git) registered for 
path 'roms/u-boot'
Submodule 'roms/u-boot-sam460ex' (https://git.qemu.org/git/u-boot-sam460ex.git) 
registered for path 'roms/u-boot-sam460ex'
Submodule 'slirp' (https://git.qemu.org/git/libslirp.git) registered for path 
'slirp'
Submodule 'tests/fp/berkeley-softfloat-3' 
(https://git.qemu.org/git/berkeley-softfloat-3.git) registered for path 
'tests/fp/berkeley-softfloat-3'
Submodule 'tests/fp/berkeley-testfloat-3' 
(https://git.qemu.org/git/berkeley-testfloat-3.git) registered for path 
'tests/fp/berkeley-testfloat-3'
Submodule 'ui/keycodemapdb' (https://git.qemu.org/git/keycodemapdb.git) 
registered for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out 
'22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '88f18909db731a627456f26d779445f84e449536'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out 
'90c488d5f4a407342247b9ea869df1c2d9c8e266'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 
'ba1ab360eebe6338bb8d7d83a9220ccf7e213af3'
Cloning into 'roms/edk2'...
Submodule path 'roms/edk2': checked out 
'20d2e5a125e34fc8501026613a71549b2a1a3e54'
Submodule 'SoftFloat' (https://github.com/ucb-bar/berkeley-softfloat-3.git) 
registered for path 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Submodule 'CryptoPkg/Library/OpensslLib/openssl' 
(https://github.com/openssl/openssl) registered for path 
'CryptoPkg/Library/OpensslLib/openssl'
Cloning into 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'...
Submodule path 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3': 
checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'CryptoPkg/Library/OpensslLib/openssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl': checked out 
'50eaac9f3337667259de725451f201e784599687'
Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl) registered 
for path 'boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path 'krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git) 
registered for path 'pyca-cryptography'
Cloning into 'boringssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl': 
checked out '2070f8ad9151dc8f3a73bffaa146b5e6937a583f'
Cloning into 'krb5'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5': checked 
out 'b9ad6c49505c96a088326b62a52568e3484f2168'
Cloning into 'pyca-cryptography'...
Submodule path 
'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/pyca-cryptography': checked out 

Re: [Qemu-devel] [FOR 4.1 PATCH] riscv: roms: Fix make rules for building sifive_u bios

2019-08-13 Thread Alistair Francis
On Tue, Aug 13, 2019 at 6:00 AM Peter Maydell  wrote:
>
> On Mon, 12 Aug 2019 at 09:38, Peter Maydell  wrote:
> >
> > On Sun, 11 Aug 2019 at 08:17, Bin Meng  wrote:
> > >
> > > Hi Palmer,
> > >
> > > On Tue, Aug 6, 2019 at 1:04 AM Alistair Francis  
> > > wrote:
> > > >
> > > > On Fri, Aug 2, 2019 at 11:08 PM Bin Meng  wrote:
> > > > >
> > > > > Currently the make rules are wrongly using qemu/virt opensbi image
> > > > > for sifive_u machine. Correct it.
> > > > >
> > > > > Signed-off-by: Bin Meng 
> > > >
> > > > Good catch.
> > > >
> > > > @Palmer Dabbelt can you take this for 4.1?
> > > >
> > >
> > > Is this patch merged for 4.1? Thanks!
> >
> > Sorry, it doesn't look like it is, and it's now missed the
> > deadline for 4.1 (only critical showstopper bugs and security
> > issues would go in at this point).
>
> Since a very late ppc pullreq turned up which needed to also go into
> rc5 and meant we couldn't just have a single-change rc, I figured this
> was safe enough to also apply for rc5, so I've put it in.

Thanks Peter!

Alistair

>
> thanks
> -- PMM



Re: [Qemu-devel] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once

2019-08-13 Thread Max Reitz
On 13.08.19 18:45, Vladimir Sementsov-Ogievskiy wrote:
> 13.08.2019 19:30, Max Reitz wrote:
>> On 13.08.19 17:32, Vladimir Sementsov-Ogievskiy wrote:
>>> 13.08.2019 18:02, Max Reitz wrote:
>>>> On 13.08.19 17:00, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 13.08.2019 17:57, Max Reitz wrote:
>>>>>> On 13.08.19 16:39, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>> 13.08.2019 17:23, Max Reitz wrote:
>>>>>>>> On 13.08.19 16:14, Vladimir Sementsov-Ogievskiy wrote:
>>
>> [...]
>>
>>>>>>>>> But still..
>>>>>>>>>
>>>>>>>>> Synchronous mirror allocates full-request buffers on guest write. Is
>>>>>>>>> it correct?
>>>>>>>>>
>>>>>>>>> If we assume that it is correct to double memory usage of guest
>>>>>>>>> operations, than for backup
>>>>>>>>> the problem is only in write_zero and discard where guest-assumed
>>>>>>>>> memory usage should be zero.
>>>>>>>>
>>>>>>>> Well, but that is the problem.  I didn’t say anything in v2, because I
>>>>>>>> only thought of normal writes and I found it fine to double the memory
>>>>>>>> usage there (a guest won’t issue huge write requests in parallel).  But
>>>>>>>> discard/write-zeroes are a different matter.
>>>>>>>>
>>>>>>>>> And if we should distinguish writes from write_zeroes and discard,
>>>>>>>>> it's better to postpone this
>>>>>>>>> improvement to be after backup-top filter merged.
>>>>>>>>
>>>>>>>> But do you need to distinguish it?  Why not just keep track of memory
>>>>>>>> usage and put the current I/O coroutine to sleep in a CoQueue or
>>>>>>>> something, and wake that up at the end of backup_do_cow()?
>>>>>>>>
>>>>>>>
>>>>>>> 1. Because if we _can_ allow doubling of memory, it's more effective to
>>>>>>> not restrict allocations on
>>>>>>> guest writes. It's just seems to be more effective technique.
>>>>>>
>>>>>> But the problem with backup and zero writes/discards is that the memory
>>>>>> is not doubled.  The request doesn’t need any memory, but the CBW
>>>>>> operation does, and maybe lots of it.
>>>>>>
>>>>>> So the guest may issue many zero writes/discards in parallel and thus
>>>>>> exhaust memory on the host.
>>>>>
>>>>> So this is the reason to separate writes from write-zeros/discrads. So at
>>>>> least write will be happy. And I
>>>>> think that write is more often request than write-zero/discard
>>>>
>>>> But that makes it complicated for no practical gain whatsoever.
>>>>
>>>>>>
>>>>>>> 2. Anyway, I'd allow some always-available size to allocate - let it be
>>>>>>> one cluster, which will correspond
>>>>>>> to current behavior and prevent guest io hang in worst case.
>>>>>>
>>>>>> The guest would only hang if it we have to copy more than e.g. 64 MB at
>>>>>> a time.  At which point I think it’s not unreasonable to sequentialize
>>>>>> requests.
>>>>
>>>> Because of this.  How is it bad to start sequentializing writes when the
>>>> data exceeds 64 MB?
>>>>
>>>
>>> So you want total memory limit of 64 MB? (with possible parameter like in 
>>> mirror)
>>>
>>> And allocation algorithm to copy count bytes:
>>>
>>> if free_mem >= count: allocate count bytes
>>> else if free_mem >= cluster: allocate cluster and copy in a loop
>>> else wait in co-queue until some memory available and retry
>>>
>>> Is it OK for you?
>>
>> Sounds good to me, although I don’t know whether the second branch is
>> necessary.  As I’ve said, the total limit is just an insurance against a
>> guest that does some crazy stuff.
>>
> 
> I'm afraid that if there would be one big request it may wait forever while 
> smaller
> requests will eat most of available memory. So it would be unfair queue: 
> smaller
> requests will have higher priority in low memory case. With [2] it becomes 
> more fair.

OK.  Sounds reasonable.

Max
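
The three-branch allocation rule agreed on in this thread can be written down as a small decision function. This is a minimal sketch, not the backup job's code: CopyPlan and plan_copy() are hypothetical names, and the actual waiting and waking of coroutines is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: what a copy-before-write request should do. */
typedef enum {
    COPY_WHOLE,       /* allocate `count` bytes and copy in one go */
    COPY_BY_CLUSTER,  /* allocate one cluster and copy in a loop */
    COPY_WAIT         /* wait in the queue until memory is freed, then retry */
} CopyPlan;

/*
 * Decide how to copy `count` bytes when `free_mem` bytes of the total
 * budget are still available and the job works in units of `cluster` bytes.
 */
CopyPlan plan_copy(uint64_t free_mem, uint64_t count, uint64_t cluster)
{
    assert(cluster > 0);
    if (free_mem >= count) {
        return COPY_WHOLE;
    }
    if (free_mem >= cluster) {
        /* Keeps a large request moving instead of starving behind small ones. */
        return COPY_BY_CLUSTER;
    }
    return COPY_WAIT;
}
```

With a 64 MB budget, a large request that does not fit still makes progress one cluster at a time as long as a single cluster is free, which is the fairness argument made for the second branch.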





Re: [Qemu-devel] [PATCH] spapr/xive: Mask the EAS when allocating an IRQ

2019-08-13 Thread Cédric Le Goater
On 13/08/2019 18:46, Peter Maydell wrote:
> On Tue, 13 Aug 2019 at 17:44, Cédric Le Goater  wrote:
>>
>> If an IRQ is allocated and not configured, such as a MSI requested by
>> a PCI driver, it can be saved in its default state and possibly later
>> on restored using the same state. If not initially MASKED, KVM will
>> try to find a matching priority/target tuple for the interrupt and
>> fail to restore the VM because 0/0 is not a valid target.
>>
>> When allocating a IRQ number, the EAS should be set to a sane default :
>> VALID and MASKED.
>>
>> Reported-by: Satheesh Rajendran 
>> Signed-off-by: Cédric Le Goater 
>> ---
>>
>>  David, this fixes a "virsh save/restore" issue in certain configurations
>>  of CPU topology which never showed up before :/
>>
>>  Peter, I was busy on a KVM/passthru issue and lacked the time to
>>  investigate all ... you decide.
> 
> rc5 has been tagged so this is definitely too late for 4.1.

This is nothing too invasive, so it should not be difficult to backport.

Thanks,

C. 



Re: [Qemu-devel] [PATCH] spapr/xive: Mask the EAS when allocating an IRQ

2019-08-13 Thread Peter Maydell
On Tue, 13 Aug 2019 at 17:44, Cédric Le Goater  wrote:
>
> If an IRQ is allocated and not configured, such as a MSI requested by
> a PCI driver, it can be saved in its default state and possibly later
> on restored using the same state. If not initially MASKED, KVM will
> try to find a matching priority/target tuple for the interrupt and
> fail to restore the VM because 0/0 is not a valid target.
>
> When allocating a IRQ number, the EAS should be set to a sane default :
> VALID and MASKED.
>
> Reported-by: Satheesh Rajendran 
> Signed-off-by: Cédric Le Goater 
> ---
>
>  David, this fixes a "virsh save/restore" issue in certain configurations
>  of CPU topology which never showed up before :/
>
>  Peter, I was busy on a KVM/passthru issue and lacked the time to
>  investigate all ... you decide.

rc5 has been tagged so this is definitely too late for 4.1.

thanks
-- PMM



Re: [Qemu-devel] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once

2019-08-13 Thread Vladimir Sementsov-Ogievskiy
13.08.2019 19:30, Max Reitz wrote:
> On 13.08.19 17:32, Vladimir Sementsov-Ogievskiy wrote:
>> 13.08.2019 18:02, Max Reitz wrote:
>>> On 13.08.19 17:00, Vladimir Sementsov-Ogievskiy wrote:
>>>> 13.08.2019 17:57, Max Reitz wrote:
>>>>> On 13.08.19 16:39, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> 13.08.2019 17:23, Max Reitz wrote:
>>>>>>> On 13.08.19 16:14, Vladimir Sementsov-Ogievskiy wrote:
>
> [...]
>
>>>>>>>> But still..
>>>>>>>>
>>>>>>>> Synchronous mirror allocates full-request buffers on guest write. Is
>>>>>>>> it correct?
>>>>>>>>
>>>>>>>> If we assume that it is correct to double memory usage of guest
>>>>>>>> operations, than for backup
>>>>>>>> the problem is only in write_zero and discard where guest-assumed
>>>>>>>> memory usage should be zero.
>>>>>>>
>>>>>>> Well, but that is the problem.  I didn’t say anything in v2, because I
>>>>>>> only thought of normal writes and I found it fine to double the memory
>>>>>>> usage there (a guest won’t issue huge write requests in parallel).  But
>>>>>>> discard/write-zeroes are a different matter.
>>>>>>>
>>>>>>>> And if we should distinguish writes from write_zeroes and discard,
>>>>>>>> it's better to postpone this
>>>>>>>> improvement to be after backup-top filter merged.
>>>>>>>
>>>>>>> But do you need to distinguish it?  Why not just keep track of memory
>>>>>>> usage and put the current I/O coroutine to sleep in a CoQueue or
>>>>>>> something, and wake that up at the end of backup_do_cow()?
>>>>>>>
>>>>>>
>>>>>> 1. Because if we _can_ allow doubling of memory, it's more effective to
>>>>>> not restrict allocations on
>>>>>> guest writes. It's just seems to be more effective technique.
>>>>>
>>>>> But the problem with backup and zero writes/discards is that the memory
>>>>> is not doubled.  The request doesn’t need any memory, but the CBW
>>>>> operation does, and maybe lots of it.
>>>>>
>>>>> So the guest may issue many zero writes/discards in parallel and thus
>>>>> exhaust memory on the host.
>>>>
>>>> So this is the reason to separate writes from write-zeros/discrads. So at
>>>> least write will be happy. And I
>>>> think that write is more often request than write-zero/discard
>>>
>>> But that makes it complicated for no practical gain whatsoever.
>>>
>>>>>
>>>>>> 2. Anyway, I'd allow some always-available size to allocate - let it be
>>>>>> one cluster, which will correspond
>>>>>> to current behavior and prevent guest io hang in worst case.
>>>>>
>>>>> The guest would only hang if it we have to copy more than e.g. 64 MB at
>>>>> a time.  At which point I think it’s not unreasonable to sequentialize
>>>>> requests.
>>>
>>> Because of this.  How is it bad to start sequentializing writes when the
>>> data exceeds 64 MB?
>>>
>>
>> So you want total memory limit of 64 MB? (with possible parameter like in 
>> mirror)
>>
>> And allocation algorithm to copy count bytes:
>>
>> if free_mem >= count: allocate count bytes
>> else if free_mem >= cluster: allocate cluster and copy in a loop
>> else wait in co-queue until some memory available and retry
>>
>> Is it OK for you?
> 
> Sounds good to me, although I don’t know whether the second branch is
> necessary.  As I’ve said, the total limit is just an insurance against a
> guest that does some crazy stuff.
> 

I'm afraid that if there is one big request, it may wait forever while smaller
requests eat most of the available memory. So it would be an unfair queue:
smaller requests would have higher priority in the low-memory case. With [2] it
becomes fairer.



-- 
Best regards,
Vladimir


[Qemu-devel] [PATCH] spapr/xive: Mask the EAS when allocating an IRQ

2019-08-13 Thread Cédric Le Goater
If an IRQ is allocated and not configured, such as an MSI requested by
a PCI driver, it can be saved in its default state and possibly later
on restored using the same state. If not initially MASKED, KVM will
try to find a matching priority/target tuple for the interrupt and
fail to restore the VM because 0/0 is not a valid target.

When allocating an IRQ number, the EAS should be set to a sane default:
VALID and MASKED.

Reported-by: Satheesh Rajendran 
Signed-off-by: Cédric Le Goater 
---

 David, this fixes a "virsh save/restore" issue in certain configurations
 of CPU topology which never showed up before :/

 Peter, I was busy on a KVM/passthru issue and lacked the time to
 investigate all ... you decide.

 hw/intc/spapr_xive.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index 3ae311d9ff7f..1f9c624df13d 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -534,7 +534,10 @@ bool spapr_xive_irq_claim(SpaprXive *xive, uint32_t lisn, bool lsi)
 return false;
 }
 
-xive->eat[lisn].w |= cpu_to_be64(EAS_VALID);
+/*
+ * Set default values when allocating an IRQ number
+ */
+xive->eat[lisn].w |= cpu_to_be64(EAS_VALID | EAS_MASKED);
 if (lsi) {
 xive_source_irq_set_lsi(xsrc, lisn);
 }
-- 
2.21.0




[Qemu-devel] [PATCH v2] ppc: Add support for 'mffsl' instruction

2019-08-13 Thread Paul A. Clarke
From: "Paul A. Clarke" 

ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR)
instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl.
This patch adds support for 'mffsl'.

'mffsl' is identical to 'mffs', except it only returns mode, status, and enable
bits from the FPSCR.

On CPUs without support for 'mffsl' (below ISA 3.0), the 'mffsl' instruction
will execute identically to 'mffs'.
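
To make the "mode, status, and enable bits" wording concrete, here is a small stand-alone sketch of the masking. It assumes the value returned is the FPSCR ANDed with the three mask groups this patch adds to cpu.h (FP_MODE, FP_STATUS, FP_ENABLES); the translation code that applies the mask is not shown in this excerpt, so that is an assumption. The bit positions for RN0, RN1 and XE are taken from the cpu.h hunk below; the remaining positions are assumed and should be checked against cpu.h. mffsl_view() and fpscr_rounding_mode() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* FPSCR bit positions; only RN0, RN1 and XE are confirmed by the hunk. */
#define FPSCR_RN0  0
#define FPSCR_RN1  1
#define FPSCR_XE   3
#define FPSCR_ZE   4   /* assumed */
#define FPSCR_UE   5   /* assumed */
#define FPSCR_OE   6   /* assumed */
#define FPSCR_VE   7   /* assumed */
#define FPSCR_FU  12   /* assumed */
#define FPSCR_FE  13   /* assumed */
#define FPSCR_FG  14   /* assumed */
#define FPSCR_FL  15   /* assumed */
#define FPSCR_C   16   /* assumed */
#define FPSCR_FI  17   /* assumed */
#define FPSCR_FR  18   /* assumed */

#define FP_BIT(n)   (1ull << (n))

/* Mask groups, following the definitions the patch adds. */
#define FP_RN       (FP_BIT(FPSCR_RN1) | FP_BIT(FPSCR_RN0))
#define FP_MODE     FP_RN
#define FP_ENABLES  (FP_BIT(FPSCR_VE) | FP_BIT(FPSCR_OE) | FP_BIT(FPSCR_UE) | \
                     FP_BIT(FPSCR_ZE) | FP_BIT(FPSCR_XE))
#define FP_FPCC     (FP_BIT(FPSCR_FL) | FP_BIT(FPSCR_FG) | \
                     FP_BIT(FPSCR_FE) | FP_BIT(FPSCR_FU))
#define FP_FPRF     (FP_BIT(FPSCR_C) | FP_FPCC)
#define FP_STATUS   (FP_BIT(FPSCR_FR) | FP_BIT(FPSCR_FI) | FP_FPRF)

/* The three groups must not overlap. */
static_assert((FP_MODE & FP_ENABLES) == 0, "mode and enable bits overlap");
static_assert((FP_STATUS & (FP_MODE | FP_ENABLES)) == 0, "status bits overlap");

/* Hypothetical helper: the subset of the FPSCR this sketch lets through. */
uint64_t mffsl_view(uint64_t fpscr)
{
    return fpscr & (FP_MODE | FP_STATUS | FP_ENABLES);
}

/* Both rounding-mode bits as one 2-bit field, which is what FP_RN is for. */
unsigned fpscr_rounding_mode(uint64_t fpscr)
{
    return (unsigned)((fpscr & FP_RN) >> FPSCR_RN0);
}
```

In this sketch the non-IEEE mode bit (bit 2) and the exception bits above bit 18 are filtered out, while a plain 'mffs' would return the whole register.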

Note: I renamed FPSCR_RN to FPSCR_RN0 so I could create an FPSCR_RN mask which
is both bits of the FPSCR rounding mode, as defined in the ISA.

I also fixed a typo in the definition of FPSCR_FR.

Signed-off-by: Paul A. Clarke 

v2: (Sorry for the quick v2!)
- I found that I copied too much of the 'mffs' implementation.
  The 'Rc' condition code bits are not needed for 'mffsl'.  Removed.
- I now free the (renamed) 'tmask' temporary.
- I now bail early for older ISA to the original 'mffs' implementation.

---
 disas/ppc.c                        |  5 +
 target/ppc/cpu.h                   | 15 ++-
 target/ppc/fpu_helper.c            |  4 ++--
 target/ppc/translate/fp-impl.inc.c | 23 +++
 target/ppc/translate/fp-ops.inc.c  |  3 ++-
 5 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/disas/ppc.c b/disas/ppc.c
index a545437..12b6a14 100644
--- a/disas/ppc.c
+++ b/disas/ppc.c
@@ -1765,6 +1765,9 @@ extract_tbr (unsigned long insn,
 /* An X_MASK with the RA and RB fields fixed.  */
 #define XRARB_MASK (X_MASK | RA_MASK | RB_MASK)
 
+/* An X form instruction with the RA field fixed.  */
+#define XRA(op, xop, ra) (X ((op), (xop)) | (((ra) << 16) & XRA_MASK))
+
 /* An XRARB_MASK, but with the L bit clear.  */
 #define XRLARB_MASK (XRARB_MASK & ~((unsigned long) 1 << 16))
 
@@ -4998,6 +5001,8 @@ const struct powerpc_opcode powerpc_opcodes[] = {
 { "ddivq",   XRC(63,546,0), X_MASK, POWER6, { FRT, FRA, FRB } },
 { "ddivq.",  XRC(63,546,1), X_MASK, POWER6, { FRT, FRA, FRB } },
 
+{ "mffsl",   XRA(63,583,12), XRARB_MASK,   POWER9, { FRT } },
+
 { "mffs",    XRC(63,583,0), XRARB_MASK, COM, { FRT } },
 { "mffs.",   XRC(63,583,1), XRARB_MASK, COM, { FRT } },
 
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index c9beba2..74e8da4 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -591,7 +591,7 @@ enum {
 #define FPSCR_XE 3  /* Floating-point inexact exception enable   */
 #define FPSCR_NI 2  /* Floating-point non-IEEE mode  */
 #define FPSCR_RN1 1
-#define FPSCR_RN 0  /* Floating-point rounding control   */
+#define FPSCR_RN0 0  /* Floating-point rounding control   */
 #define fpscr_fex (((env->fpscr) >> FPSCR_FEX) & 0x1)
 #define fpscr_vx (((env->fpscr) >> FPSCR_VX) & 0x1)
 #define fpscr_ox (((env->fpscr) >> FPSCR_OX) & 0x1)
@@ -614,7 +614,7 @@ enum {
 #define fpscr_ze (((env->fpscr) >> FPSCR_ZE) & 0x1)
 #define fpscr_xe (((env->fpscr) >> FPSCR_XE) & 0x1)
 #define fpscr_ni (((env->fpscr) >> FPSCR_NI) & 0x1)
-#define fpscr_rn (((env->fpscr) >> FPSCR_RN) & 0x3)
+#define fpscr_rn (((env->fpscr) >> FPSCR_RN0) & 0x3)
 /* Invalid operation exception summary */
 #define fpscr_ix ((env->fpscr) & ((1 << FPSCR_VXSNAN) | (1 << FPSCR_VXISI)  | \
   (1 << FPSCR_VXIDI)  | (1 << FPSCR_VXZDZ)  | \
@@ -640,7 +640,7 @@ enum {
 #define FP_VXZDZ (1ull << FPSCR_VXZDZ)
 #define FP_VXIMZ (1ull << FPSCR_VXIMZ)
 #define FP_VXVC (1ull << FPSCR_VXVC)
-#define FP_FR   (1ull << FSPCR_FR)
+#define FP_FR   (1ull << FPSCR_FR)
 #define FP_FI   (1ull << FPSCR_FI)
 #define FP_C (1ull << FPSCR_C)
 #define FP_FL   (1ull << FPSCR_FL)
@@ -648,7 +648,7 @@ enum {
 #define FP_FE   (1ull << FPSCR_FE)
 #define FP_FU   (1ull << FPSCR_FU)
 #define FP_FPCC (FP_FL | FP_FG | FP_FE | FP_FU)
-#define FP_FPRF (FP_C  | FP_FL | FP_FG | FP_FE | FP_FU)
+#define FP_FPRF (FP_C | FP_FPCC)
 #define FP_VXSOFT   (1ull << FPSCR_VXSOFT)
 #define FP_VXSQRT   (1ull << FPSCR_VXSQRT)
 #define FP_VXCVI (1ull << FPSCR_VXCVI)
@@ -659,7 +659,12 @@ enum {
 #define FP_XE   (1ull << FPSCR_XE)
 #define FP_NI   (1ull << FPSCR_NI)
 #define FP_RN1  (1ull << FPSCR_RN1)
-#define FP_RN   (1ull << FPSCR_RN)
+#define FP_RN0  (1ull << FPSCR_RN0)
+#define FP_RN   (FP_RN1 | FP_RN0)
+
+#define FP_MODE FP_RN
+#define FP_ENABLES  (FP_VE | FP_OE | FP_UE | FP_ZE | FP_XE)
+#define FP_STATUS   (FP_FR | FP_FI | FP_FPRF)
 
 /* the exception bits which can be cleared by mcrfs - includes FX */
 #define FP_EX_CLEAR_BITS (FP_FX | FP_OX | FP_UX | FP_ZX | \
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index f437c88..5611cf0 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -403,7 +403,7 @@ void 

[Qemu-devel] [PATCH] ppc: Add support for 'mffsl' instruction

2019-08-13 Thread Paul A. Clarke
From: "Paul A. Clarke" 

ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR)
instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl.
This patch adds support for 'mffsl'.

'mffsl' is identical to 'mffs', except it only returns mode, status, and enable
bits from the FPSCR.

On CPUs without support for 'mffsl' (below ISA 3.0), the 'mffsl' instruction
will execute identically to 'mffs'.

Note: I renamed FPSCR_RN to FPSCR_RN0 so I could create an FPSCR_RN mask which
is both bits of the FPSCR rounding mode, as defined in the ISA.

I also fixed a typo in the definition of FPSCR_FR.
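
As an illustration of the description above, here is a minimal sketch of
which FPSCR bits 'mffsl' would return if the instruction simply ANDs the
register with the mode, status, and enable masks. The FP_* names mirror the
macros touched in target/ppc/cpu.h; only the RN0, RN1, and XE bit positions
appear in this mail, the others are assumptions recalled from that header.
mffsl_value() is a hypothetical helper, not a function from the patch.

```c
#include <stdint.h>

/* Bit positions. RN0, RN1 and XE are shown in this patch; the rest are
 * assumed to match target/ppc/cpu.h. */
#define FPSCR_FR  18
#define FPSCR_FI  17
#define FPSCR_C   16
#define FPSCR_FL  15
#define FPSCR_FG  14
#define FPSCR_FE  13
#define FPSCR_FU  12
#define FPSCR_VE  7
#define FPSCR_OE  6
#define FPSCR_UE  5
#define FPSCR_ZE  4
#define FPSCR_XE  3
#define FPSCR_RN1 1
#define FPSCR_RN0 0

#define FP_BIT(n)  (1ull << (n))

/* Composite masks, built the same way as in the patch. */
#define FP_FPCC    (FP_BIT(FPSCR_FL) | FP_BIT(FPSCR_FG) | \
                    FP_BIT(FPSCR_FE) | FP_BIT(FPSCR_FU))
#define FP_FPRF    (FP_BIT(FPSCR_C) | FP_FPCC)
#define FP_RN      (FP_BIT(FPSCR_RN1) | FP_BIT(FPSCR_RN0))
#define FP_MODE    FP_RN
#define FP_ENABLES (FP_BIT(FPSCR_VE) | FP_BIT(FPSCR_OE) | FP_BIT(FPSCR_UE) | \
                    FP_BIT(FPSCR_ZE) | FP_BIT(FPSCR_XE))
#define FP_STATUS  (FP_BIT(FPSCR_FR) | FP_BIT(FPSCR_FI) | FP_FPRF)

/* Hypothetical helper: keep only the mode, status and enable bits. */
static inline uint64_t mffsl_value(uint64_t fpscr)
{
    return fpscr & (FP_MODE | FP_STATUS | FP_ENABLES);
}
```

Exception bits such as FX (assumed to be bit 31) fall outside all three
masks, so they read back as zero in this sketch.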

Signed-off-by: Paul A. Clarke 
---
 disas/ppc.c|  5 +
 target/ppc/cpu.h   | 15 ++-
 target/ppc/fpu_helper.c|  4 ++--
 target/ppc/translate/fp-impl.inc.c | 23 +++
 target/ppc/translate/fp-ops.inc.c  |  3 ++-
 5 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/disas/ppc.c b/disas/ppc.c
index a545437..12b6a14 100644
--- a/disas/ppc.c
+++ b/disas/ppc.c
@@ -1765,6 +1765,9 @@ extract_tbr (unsigned long insn,
 /* An X_MASK with the RA and RB fields fixed.  */
 #define XRARB_MASK (X_MASK | RA_MASK | RB_MASK)
 
+/* An X form instruction with the RA field fixed.  */
+#define XRA(op, xop, ra) (X ((op), (xop)) | (((ra) << 16) & XRA_MASK))
+
 /* An XRARB_MASK, but with the L bit clear.  */
 #define XRLARB_MASK (XRARB_MASK & ~((unsigned long) 1 << 16))
 
@@ -4998,6 +5001,8 @@ const struct powerpc_opcode powerpc_opcodes[] = {
 { "ddivq",   XRC(63,546,0), X_MASK,POWER6, { FRT, FRA, FRB } },
 { "ddivq.",  XRC(63,546,1), X_MASK,POWER6, { FRT, FRA, FRB } },
 
+{ "mffsl",   XRA(63,583,12), XRARB_MASK,   POWER9, { FRT } },
+
 { "mffs",XRC(63,583,0), XRARB_MASK,COM,{ FRT } },
 { "mffs.",   XRC(63,583,1), XRARB_MASK,COM,{ FRT } },
 
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index c9beba2..74e8da4 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -591,7 +591,7 @@ enum {
 #define FPSCR_XE 3  /* Floating-point inexact exception enable   */
 #define FPSCR_NI 2  /* Floating-point non-IEEE mode  */
 #define FPSCR_RN1 1
-#define FPSCR_RN 0  /* Floating-point rounding control   */
+#define FPSCR_RN0 0  /* Floating-point rounding control   */
 #define fpscr_fex (((env->fpscr) >> FPSCR_FEX) & 0x1)
 #define fpscr_vx (((env->fpscr) >> FPSCR_VX) & 0x1)
 #define fpscr_ox (((env->fpscr) >> FPSCR_OX) & 0x1)
@@ -614,7 +614,7 @@ enum {
 #define fpscr_ze (((env->fpscr) >> FPSCR_ZE) & 0x1)
 #define fpscr_xe (((env->fpscr) >> FPSCR_XE) & 0x1)
 #define fpscr_ni (((env->fpscr) >> FPSCR_NI) & 0x1)
-#define fpscr_rn (((env->fpscr) >> FPSCR_RN) & 0x3)
+#define fpscr_rn (((env->fpscr) >> FPSCR_RN0) & 0x3)
 /* Invalid operation exception summary */
 #define fpscr_ix ((env->fpscr) & ((1 << FPSCR_VXSNAN) | (1 << FPSCR_VXISI)  | \
   (1 << FPSCR_VXIDI)  | (1 << FPSCR_VXZDZ)  | \
@@ -640,7 +640,7 @@ enum {
 #define FP_VXZDZ (1ull << FPSCR_VXZDZ)
 #define FP_VXIMZ (1ull << FPSCR_VXIMZ)
 #define FP_VXVC (1ull << FPSCR_VXVC)
-#define FP_FR   (1ull << FSPCR_FR)
+#define FP_FR   (1ull << FPSCR_FR)
 #define FP_FI   (1ull << FPSCR_FI)
 #define FP_C (1ull << FPSCR_C)
 #define FP_FL   (1ull << FPSCR_FL)
@@ -648,7 +648,7 @@ enum {
 #define FP_FE   (1ull << FPSCR_FE)
 #define FP_FU   (1ull << FPSCR_FU)
 #define FP_FPCC (FP_FL | FP_FG | FP_FE | FP_FU)
-#define FP_FPRF (FP_C  | FP_FL | FP_FG | FP_FE | FP_FU)
+#define FP_FPRF (FP_C | FP_FPCC)
 #define FP_VXSOFT   (1ull << FPSCR_VXSOFT)
 #define FP_VXSQRT   (1ull << FPSCR_VXSQRT)
 #define FP_VXCVI (1ull << FPSCR_VXCVI)
@@ -659,7 +659,12 @@ enum {
 #define FP_XE   (1ull << FPSCR_XE)
 #define FP_NI   (1ull << FPSCR_NI)
 #define FP_RN1  (1ull << FPSCR_RN1)
-#define FP_RN   (1ull << FPSCR_RN)
+#define FP_RN0  (1ull << FPSCR_RN0)
+#define FP_RN   (FP_RN1 | FP_RN0)
+
+#define FP_MODE FP_RN
+#define FP_ENABLES  (FP_VE | FP_OE | FP_UE | FP_ZE | FP_XE)
+#define FP_STATUS   (FP_FR | FP_FI | FP_FPRF)
 
 /* the exception bits which can be cleared by mcrfs - includes FX */
 #define FP_EX_CLEAR_BITS (FP_FX | FP_OX | FP_UX | FP_ZX | \
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index f437c88..5611cf0 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -403,7 +403,7 @@ void helper_fpscr_clrbit(CPUPPCState *env, uint32_t bit)
     if (prev == 1) {
         switch (bit) {
         case FPSCR_RN1:
-        case FPSCR_RN:
+        case FPSCR_RN0:
             fpscr_set_rounding_mode(env);
             break;
         case FPSCR_VXSNAN:
@@ -557,7 +557,7 

Re: [Qemu-devel] [PATCH 1/2] block/raw-format: switch to BDRV_BLOCK_DATA with BDRV_BLOCK_RECURSE

2019-08-13 Thread Vladimir Sementsov-Ogievskiy
13.08.2019 19:08, Kevin Wolf wrote:
> Am 13.08.2019 um 17:54 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 13.08.2019 18:41, Kevin Wolf wrote:
>>> Am 13.08.2019 um 16:43 hat Max Reitz geschrieben:
>>>> On 13.08.19 13:04, Kevin Wolf wrote:
>>>>> Am 12.08.2019 um 20:11 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>>> BDRV_BLOCK_RAW makes generic bdrv_co_block_status to fallthrough to
>>>>>> returned file. But is it correct behavior at all? If returned file
>>>>>> itself has a backing file, we may report as totally unallocated an
>>>>>> area which actually has data in bottom backing file.
>>>>>>
>>>>>> So, mirroring of qcow2 under raw-format is broken. Which is illustrated
>>>>>> by following commit with a test. Let's make raw-format behave more
>>>>>> correctly returning BDRV_BLOCK_DATA.
>>>>>>
>>>>>> Suggested-by: Max Reitz 
>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>>>>
>>>>> After some reading, I think I came to the conclusion that RAW is the
>>>>> correct thing to do. There is indeed a problem, but this patch is trying
>>>>> to fix it in the wrong place.
>>>>>
>>>>> In the case where the backing file contains some data, and we have a
>>>>> 'raw' node above the qcow2 overlay node, the content of the respective
>>>>> block is not defined by the queried backing file layer, so it is
>>>>> completely correct that bdrv_is_allocated() returns false, like it would
>>>>> if you queried the qcow2 layer directly.
>>>>
>>>> I disagree.  The queried backing file layer is the raw node.  As I said,
>>>> in my opinion raw nodes are not filter nodes, neither in behavior (they
>>>> have an offset option), nor in how they are generally used (as a format).
>>>>
>>>> The raw format does not support backing files.  Therefore, everything on
>>>> a raw node is allocated.
>>>>
>>>> (That is, like, my opinion.)
>>>>
>>>>>   If it returned true, we would
>>>>> copy everything, which isn't right either (the test cases should maybe add
>>>>> the qemu-img map output of the target so this becomes visible).
>>>>
>>>> It is right.
>>>
>>> So we don't even agree what mirroring the raw node should even mean.
>>>
>>> I can see your point when you say that the raw node has no backing
>>> file, so everything should be copied. But I can also see the point that
>>> the raw node can really just be used as a filter that limits the data
>>> exposed from the qcow2 layer, and you want to keep the copy a COW
>>> overlay over the same backing file.
>>>
>>> Both are valid use cases in principle and there is no single right or
>>> wrong.
>>>
>>> We don't currently support the latter use case because we have only
>>> sync=full or sync=top, but if you could specify a base node instead, we
>>> could probably support the case without all of the special-casing filter
>>> nodes and backing file children.
>>>
>>> You would call bdrv_co_block_status_above() with the right base node and
>>> it would just recurse wherever the data is stored, be it bs->backing,
>>> bs->file or even driver-specific children. This would allow you to find
>>> out whether some block in the top node came from the base node that
>>> we're going to keep. If yes, skip it; if no, copy it.
>>>
>>> This way we wouldn't have to decide whether raw is a filter or not,
>>> because it wouldn't make a difference. The behaviour would only depend
>>> on the base node given by the user. If you specified the top-level qcow2
>>> file as the base, you get your full copy;
>>
>> ahm, full-copy = base is NULL..
> 
> Oops, yes, of course. Using the top-level node would create an empty
> "copy".
> 
>>> if you specified the backing
>>> qcow2, you get the partial copy where the target still uses the same
>>> backing file.
>>>
>>> (Hm... It would only actually work if the offsets stay the same in the
>>> chain, which is true for backing file children, but not necessarily for
>>> other children.
>>
>> Don't follow, what you mean by offsets stay the same and what is wrong
>> with it?
> 
> Say we have this graph:
> 
>     raw,offset=65536
>            |
>            v
>          qcow2 -+
>            |    |
>            v    v
>          file  base
> 
> Now you can't just mirror the raw node into a target.qcow2 that shares
> base as the backing file, because the offsets will be wrong. In order to
> use such a copy correctly, you'd have to use a raw node again in the
> backing chain:
> 
>     target.qcow2 -+
>          |        |
>          v        v
>        file   raw,offset=65536
>                   |
>                   v
>                 base
> 
> So the case where offsets differ between the top and the base node isn't
> trivial.

Understood, but to me this doesn't look like something that behaves in a way
the user wouldn't expect. On the contrary, it seems obvious that it will not
work, since the user understands what a backing file is (offsets are backed by
corresponding offsets).

> 
> (If this case isn't complicated enough yet, imagine passing file as the
> base node instead... It just can't work.)
> 

Re: [Qemu-devel] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once

2019-08-13 Thread Max Reitz
On 13.08.19 17:32, Vladimir Sementsov-Ogievskiy wrote:
> 13.08.2019 18:02, Max Reitz wrote:
>> On 13.08.19 17:00, Vladimir Sementsov-Ogievskiy wrote:
>>> 13.08.2019 17:57, Max Reitz wrote:
>>>> On 13.08.19 16:39, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 13.08.2019 17:23, Max Reitz wrote:
>>>>>> On 13.08.19 16:14, Vladimir Sementsov-Ogievskiy wrote:

[...]

>>>>>>> But still..
>>>>>>>
>>>>>>> Synchronous mirror allocates full-request buffers on guest write. Is it
>>>>>>> correct?
>>>>>>>
>>>>>>> If we assume that it is correct to double memory usage of guest
>>>>>>> operations, then for backup the problem is only in write_zero and
>>>>>>> discard where guest-assumed memory usage should be zero.
>>>>>>
>>>>>> Well, but that is the problem.  I didn’t say anything in v2, because I
>>>>>> only thought of normal writes and I found it fine to double the memory
>>>>>> usage there (a guest won’t issue huge write requests in parallel).  But
>>>>>> discard/write-zeroes are a different matter.
>>>>>>
>>>>>>> And if we should distinguish writes from write_zeroes and discard, it's
>>>>>>> better to postpone this improvement until after the backup-top filter
>>>>>>> is merged.
>>>>>>
>>>>>> But do you need to distinguish it?  Why not just keep track of memory
>>>>>> usage and put the current I/O coroutine to sleep in a CoQueue or
>>>>>> something, and wake that up at the end of backup_do_cow()?
>>>>>>
>>>>>
>>>>> 1. Because if we _can_ allow doubling of memory, it's more effective to
>>>>> not restrict allocations on guest writes. It just seems to be a more
>>>>> effective technique.
>>>>
>>>> But the problem with backup and zero writes/discards is that the memory
>>>> is not doubled.  The request doesn’t need any memory, but the CBW
>>>> operation does, and maybe lots of it.
>>>>
>>>> So the guest may issue many zero writes/discards in parallel and thus
>>>> exhaust memory on the host.
>>>
>>> So this is the reason to separate writes from write-zeroes/discards. So at
>>> least write will be happy. And I think that write is a more frequent
>>> request than write-zero/discard
>>
>> But that makes it complicated for no practical gain whatsoever.
>>
>>>>
>>>>> 2. Anyway, I'd allow some always-available size to allocate - let it be
>>>>> one cluster, which will correspond to current behavior and prevent guest
>>>>> io hang in worst case.
>>>>
>>>> The guest would only hang if we have to copy more than e.g. 64 MB at
>>>> a time.  At which point I think it’s not unreasonable to sequentialize
>>>> requests.
>>
>> Because of this.  How is it bad to start sequentializing writes when the
>> data exceeds 64 MB?
>>
> 
> So you want total memory limit of 64 MB? (with possible parameter like in 
> mirror)
> 
> And allocation algorithm to copy count bytes:
> 
> if free_mem >= count: allocate count bytes
> else if free_mem >= cluster: allocate cluster and copy in a loop
> else wait in co-queue until some memory available and retry
> 
> Is it OK for you?

Sounds good to me, although I don’t know whether the second branch is
necessary.  As I’ve said, the total limit is just an insurance against a
guest that does some crazy stuff.
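
For reference, the three-branch allocation rule quoted above can be sketched
as a small pure function. This is only an illustration under stated
assumptions: backup_alloc_size() and its parameters are hypothetical names,
count is assumed to be non-zero, and a return value of 0 stands for "wait in
the CoQueue and retry". It is not code from the series.

```c
#include <stdint.h>

/*
 * Decide how many bytes to allocate for the next copy step.
 *
 * free_mem: bytes still available under the total limit (e.g. 64 MB)
 * count:    bytes the request wants to copy (assumed > 0)
 * cluster:  backup cluster size
 *
 * Returns count if everything fits, one cluster if only that fits
 * (the caller then copies in a loop), or 0 if the caller has to wait
 * until memory is released and then retry.
 */
static inline uint64_t backup_alloc_size(uint64_t free_mem, uint64_t count,
                                         uint64_t cluster)
{
    if (free_mem >= count) {
        return count;
    }
    if (free_mem >= cluster) {
        return cluster;
    }
    return 0;
}
```

Dropping the second branch, as suggested in the reply, would leave only the
"all or wait" decision.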

Max





Re: [Qemu-devel] [PATCH 1/2] block/raw-format: switch to BDRV_BLOCK_DATA with BDRV_BLOCK_RECURSE

2019-08-13 Thread Max Reitz
On 13.08.19 17:41, Kevin Wolf wrote:
> Am 13.08.2019 um 16:43 hat Max Reitz geschrieben:
>> On 13.08.19 13:04, Kevin Wolf wrote:
>>> Am 12.08.2019 um 20:11 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> BDRV_BLOCK_RAW makes generic bdrv_co_block_status to fallthrough to
>>>> returned file. But is it correct behavior at all? If returned file
>>>> itself has a backing file, we may report as totally unallocated and
>>>> area which actually has data in bottom backing file.
>>>>
>>>> So, mirroring of qcow2 under raw-format is broken. Which is illustrated
>>>> by following commit with a test. Let's make raw-format behave more
>>>> correctly returning BDRV_BLOCK_DATA.
>>>>
>>>> Suggested-by: Max Reitz 
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>>
>>> After some reading, I think I came to the conclusion that RAW is the
>>> correct thing to do. There is indeed a problem, but this patch is trying
>>> to fix it in the wrong place.
>>>
>>> In the case where the backing file contains some data, and we have a
>>> 'raw' node above the qcow2 overlay node, the content of the respective
>>> block is not defined by the queried backing file layer, so it is
>>> completely correct that bdrv_is_allocated() returns false, like it would
>>> if you queried the qcow2 layer directly.
>>
>> I disagree.  The queried backing file layer is the raw node.  As I said,
>> in my opinion raw nodes are not filter nodes, neither in behavior (they
>> have an offset option), nor in how they are generally used (as a format).
>>
>> The raw format does not support backing files.  Therefore, everything on
>> a raw node is allocated.
>>
>> (That is, like, my opinion.)
>>
>>>  If it returned true, we would
>>> copy everything, which isn't right either (the test cases should maybe add
>>> the qemu-img map output of the target so this becomes visible).
>>
>> It is right.
> 
> So we don't even agree what mirroring the raw node should even mean.
> 
> I can see your point when you say that the raw node has no backing
> file, so everything should be copied. But I can also see the point that
> the raw node can really just be used as a filter that limits the data
> exposed from the qcow2 layer, and you want to keep the copy a COW
> overlay over the same backing file.

But it is not a filter.  If you limit the data by using an offset, that
precisely makes it not a filter.

> Both are valid use cases in principle and there is no single right or
> wrong.

I don’t get what you mean because raw is not a filter.

If it were a filter, it’d be clear: Regarding allocation information, we
skip it.  But it isn’t.  Hence we cannot skip it.

I don’t see why you’re trying to make the point with raw, when it simply
is not a filter.


> We don't currently support the latter use case because we have only
> sync=full or sync=top, but if you could specify a base node instead, we
> could probably support the case without all of the special-casing filter
> nodes and backing file children.

This sounds like it’d be difficult to special-case filter nodes.  It isn’t.

Whenever the user specifies some node that should refer to a backing
chain element, all you do is call bdrv_skip_rw_filters() and you land at
the corresponding COW node.

> You would call bdrv_co_block_status_above() with the right base node and
> it would just recurse wherever the data is stored, be it bs->backing,
> bs->file or even driver-specific children. This would allow you to find
> out whether some block in the top node came from the base node that
> we're going to keep. If yes, skip it; if no, copy it.
> 
> This way we wouldn't have to decide whether raw is a filter or not,

Well, it isn’t.

> because it wouldn't make a difference. The behaviour would only depend
> on the base node given by the user. If you specified the top-level qcow2
> file as the base, you get your full copy; if you specified the backing
> qcow2, you get the partial copy where the target still uses the same
> backing file.

I simply do not see your point.

For two reasons:

(1) I don’t think anyone should use is_allocated on nodes that are not
COW nodes.  They are likely to be doing something wrong then.

You don’t seem to agree, because you say this makes the callers
complicated.  What I think is that we don’t have that many callers, and
it’s probably a good thing when they have to think about the backing
chain structure.

But anyway.

(2) What I understand is that you propose adding some way to find out
the next element in the COW chain if is_allocated returns false.  Well,
my “Deal with filters” series adds two ways to do that:

- bdrv_filtered_cow_bs(bdrv_skip_rw_filters(bs)) will return the first
child behind the COW node, but that may be another filter.

- bdrv_backing_chain_next(bs) will return “the first non-filter backing
image of the first non-filter image.”.  It’s just
bdrv_skip_rw_filters(bdrv_filtered_cow_bs(bdrv_skip_rw_filters(bs))).
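
To make the composition above concrete, here is a toy model of the two
lookups. It is a sketch only: struct toy_node and the toy_* functions are
hypothetical stand-ins, not QEMU's BlockDriverState or the helpers from the
series, and the model assumes a filter has exactly one filtered child while
a COW node has at most one backing child.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block node; not QEMU's BlockDriverState. */
struct toy_node {
    bool is_filter;
    struct toy_node *filtered; /* the child a filter node passes through to */
    struct toy_node *backing;  /* the COW child of a non-filter node */
};

/* Walk down through filter nodes until a non-filter node is reached. */
static inline struct toy_node *toy_skip_rw_filters(struct toy_node *n)
{
    while (n && n->is_filter) {
        n = n->filtered;
    }
    return n;
}

/* First child behind a COW node; this may itself be a filter. */
static inline struct toy_node *toy_filtered_cow_bs(struct toy_node *n)
{
    return n ? n->backing : NULL;
}

/* First non-filter backing image of the first non-filter image. */
static inline struct toy_node *toy_backing_chain_next(struct toy_node *n)
{
    return toy_skip_rw_filters(toy_filtered_cow_bs(toy_skip_rw_filters(n)));
}
```

With a chain of filter -> overlay -> filter -> base, the first lookup stops
at the inner filter and the second one lands on base.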

So the thing is, we don’t need to modify 

Re: [Qemu-devel] CPU hotplug using SMM with QEMU+OVMF

2019-08-13 Thread Laszlo Ersek
On 08/13/19 18:09, Laszlo Ersek wrote:
> On 08/13/19 16:16, Laszlo Ersek wrote:

>> (06) Host CPU: (SMM) Save 38000, Update 38000 -- fill simple SMM
>>  rebase code.
>>
>> (07) Host CPU: (SMM) Send message to New CPU to Enable SMI.
> 
> Aha, so this is the SMM-only register you mention in step (03). Is the
> register specified in the Intel SDM?
> 
> 
>> (08) New CPU: (Flash) Get message - Enable SMI.
>>
>> (09) Host CPU: (SMM) Send SMI to the new CPU only.
>>
>> (10) New CPU: (SMM) Response first SMI at 38000, and rebase SMBASE to
>>  TSEG.
> 
> What code does the new CPU execute after it completes step (10)? Does it
> halt?
> 
> 
>> (11) Host CPU: (SMM) Restore 38000.
> 
> These steps (i.e., (06) through (11)) don't appear RAS-specific. The
> only platform-specific feature seems to be SMI masking register, which
> could be extracted into a new SmmCpuFeaturesLib API.
> 
> Thus, would you please consider open sourcing firmware code for steps
> (06) through (11)?
> 
> 
> Alternatively -- and in particular because the stack for step (01)
> concerns me --, we could approach this from a high-level, functional
> perspective. The states that really matter are the relocated SMBASE for
> the new CPU, and the state of the full system, right at the end of step
> (11).
> 
> When the SMM setup quiesces during normal firmware boot, OVMF could use
> existent (finalized) SMBASE information to *pre-program* some virtual
> QEMU hardware, with such state that would be expected, as "final" state,
> of any new hotplugged CPU. Afterwards, if / when the hotplug actually
> happens, QEMU could blanket-apply this state to the new CPU, and
> broadcast a hardware SMI to all CPUs except the new one.
> 
> The hardware SMI should tell the firmware that the rest of the process
> -- step (12) below, and onward -- is being requested.
> 
> If I understand right, this approach would produce a firmware & system
> state that's identical to what's expected right after step (11):
> 
> - all SMBASEs relocated
> - all preexistent CPUs in SMM
> - new CPU halted / blocked from launch
> - DRAM at 0x30000 / 0x38000 contains OS-owned data
> 
> Is my understanding correct that this is the expected state after step
> (11)?

Revisiting some of my notes from earlier, such as
 -- apologies,
private BZ... --, we discussed some of this stuff with Mike on the phone
in April.

And, it looked like generating a hardware SMI in QEMU, in association
with the hotplug action that was being requested through the QEMU
monitor, would be the right approach.

By now I have forgotten about that discussion -- hence "revisiting my
notes"--, but luckily, it seems consistent with what I've proposed
above, under "alternatively".

Thanks,
Laszlo



Re: [Qemu-devel] CPU hotplug using SMM with QEMU+OVMF

2019-08-13 Thread Laszlo Ersek
On 08/13/19 16:16, Laszlo Ersek wrote:

> Yingwen and Jiewen suggested the following process.
>
> Legend:
>
> - "New CPU":  CPU being hot-added
> - "Host CPU": existing CPU
> - (Flash):    code running from flash
> - (SMM):      code running from SMRAM
>
> Steps:
>
> (01) New CPU: (Flash) enter reset vector, Global SMI disabled by
>  default.

- What does "Global SMI disabled by default" mean? In particular, what
  is "global" here?

  Do you mean that the CPU being hot-plugged should mask (by default)
  broadcast SMIs? What about directed SMIs? (An attacker could try that
  too.)

  And what about other processors? (I'd assume step (01) is not
  relevant for other processors, but "global" is quite confusing here.)

- Does this part require a new branch somewhere in the OVMF SEC code?
  How do we determine whether the CPU executing SEC is BSP or
  hot-plugged AP?

- How do we tell the hot-plugged AP where to start execution? (I.e. that
  it should execute code at a particular pflash location.)

  For example, in MpInitLib, we start a specific AP with INIT-SIPI-SIPI,
  where "SIPI" stores the startup address in the "Interrupt Command
  Register" (which is memory-mapped in xAPIC mode, and an MSR in x2APIC
  mode, apparently). That doesn't apply here -- should QEMU auto-start
  the new CPU?

- What memory is used as stack by the new CPU, when it runs code from
  flash?

  QEMU does not emulate CAR (Cache As RAM). The new CPU doesn't have
  access to SMRAM. And we cannot use AcpiNVS or Reserved memory, because
  a malicious OS could use other CPUs -- or PCI device DMA -- to attack
  the stack (unless QEMU forcibly paused other CPUs upon hotplug; I'm
  not sure).

- If an attempt is made to hotplug multiple CPUs in quick succession,
  does something serialize those attempts?

  Again, stack usage could be a concern, even with Cache-As-RAM --
  HyperThreads (logical processors) on a single core don't have
  dedicated cache.

  Does CPU hotplug apply only at the socket level? If the CPU is
  multi-core, what is responsible for hot-plugging all cores present in
  the socket?
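
As background for the INIT-SIPI-SIPI question above: to my understanding of
the x86 architecture, the 8-bit SIPI vector selects a 4 KiB-aligned start
address below 1 MiB, and the SIPI itself is an ICR write with delivery mode
"Start-Up" (110b). The sketch below only computes those two values; the
function names are hypothetical and nothing here is taken from MpInitLib.

```c
#include <stdint.h>

/* Physical address at which the AP starts executing after a SIPI:
 * the vector is the page number of the 4 KiB-aligned entry point. */
static inline uint32_t sipi_start_address(uint8_t vector)
{
    return (uint32_t)vector << 12;
}

/* Low 32 bits of the ICR for a directed SIPI:
 * bits 7:0 vector, bits 10:8 delivery mode 110b (Start-Up),
 * bit 14 level = assert. The destination goes in the high half (xAPIC). */
static inline uint32_t sipi_icr_low(uint8_t vector)
{
    return (1u << 14) | (6u << 8) | vector;
}
```

So a vector of 0x9F would start the AP at 0x9F000, which shows why this
mechanism cannot point a CPU directly at a pflash location near 4 GB.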


> (02) New CPU: (Flash) configure memory control to let it access global
>  host memory.

In QEMU/KVM guests, we don't have to enable memory explicitly, it just
exists and works.

In OVMF X64 SEC, we can't access RAM above 4GB, but that shouldn't be an
issue per se.


> (03) New CPU: (Flash) send board message to tell host CPU (GPIO->SCI)
>  -- I am waiting for hot-add message.

Maybe we can simplify this in QEMU by broadcasting an SMI to existent
processors immediately upon plugging the new CPU.


>(NOTE: Host CPU can only send
>  instruction in SMM mode. -- The register is SMM only)

Sorry, I don't follow -- what register are we talking about here, and
why is the BSP needed to send anything at all? What "instruction" do you
have in mind?


> (04) Host CPU: (OS) get message from board that a new CPU is added.
>  (GPIO -> SCI)
>
> (05) Host CPU: (OS) All CPUs enter SMM (SCI->SWSMI) (NOTE: New CPU
>  will not enter CPU because SMI is disabled)

I don't understand the OS involvement here. But, again, perhaps QEMU can
force all existent CPUs into SMM immediately upon adding the new CPU.


> (06) Host CPU: (SMM) Save 38000, Update 38000 -- fill simple SMM
>  rebase code.
>
> (07) Host CPU: (SMM) Send message to New CPU to Enable SMI.

Aha, so this is the SMM-only register you mention in step (03). Is the
register specified in the Intel SDM?


> (08) New CPU: (Flash) Get message - Enable SMI.
>
> (09) Host CPU: (SMM) Send SMI to the new CPU only.
>
> (10) New CPU: (SMM) Response first SMI at 38000, and rebase SMBASE to
>  TSEG.

What code does the new CPU execute after it completes step (10)? Does it
halt?


> (11) Host CPU: (SMM) Restore 38000.

These steps (i.e., (06) through (11)) don't appear RAS-specific. The
only platform-specific feature seems to be SMI masking register, which
could be extracted into a new SmmCpuFeaturesLib API.

Thus, would you please consider open sourcing firmware code for steps
(06) through (11)?


Alternatively -- and in particular because the stack for step (01)
concerns me --, we could approach this from a high-level, functional
perspective. The states that really matter are the relocated SMBASE for
the new CPU, and the state of the full system, right at the end of step
(11).

When the SMM setup quiesces during normal firmware boot, OVMF could use
existent (finalized) SMBASE information to *pre-program* some virtual
QEMU hardware, with such state that would be expected, as "final" state,
of any new hotplugged CPU. Afterwards, if / when the hotplug actually
happens, QEMU could blanket-apply this state to the new CPU, and
broadcast a hardware SMI to all CPUs except the new one.

The hardware SMI should tell the firmware that the rest of the process
-- step (12) below, and onward -- is being requested.

If I understand right, this approach would 

Re: [Qemu-devel] [PATCH 1/2] block/raw-format: switch to BDRV_BLOCK_DATA with BDRV_BLOCK_RECURSE

2019-08-13 Thread Kevin Wolf
Am 13.08.2019 um 17:54 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 13.08.2019 18:41, Kevin Wolf wrote:
> > Am 13.08.2019 um 16:43 hat Max Reitz geschrieben:
> >> On 13.08.19 13:04, Kevin Wolf wrote:
> >>> Am 12.08.2019 um 20:11 hat Vladimir Sementsov-Ogievskiy geschrieben:
> >>>> BDRV_BLOCK_RAW makes generic bdrv_co_block_status to fallthrough to
> >>>> returned file. But is it correct behavior at all? If returned file
> >>>> itself has a backing file, we may report as totally unallocated and
> >>>> area which actually has data in bottom backing file.
> >>>>
> >>>> So, mirroring of qcow2 under raw-format is broken. Which is illustrated
> >>>> by following commit with a test. Let's make raw-format behave more
> >>>> correctly returning BDRV_BLOCK_DATA.
> >>>>
> >>>> Suggested-by: Max Reitz 
> >>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> >>>
> >>> After some reading, I think I came to the conclusion that RAW is the
> >>> correct thing to do. There is indeed a problem, but this patch is trying
> >>> to fix it in the wrong place.
> >>>
> >>> In the case where the backing file contains some data, and we have a
> >>> 'raw' node above the qcow2 overlay node, the content of the respective
> >>> block is not defined by the queried backing file layer, so it is
> >>> completely correct that bdrv_is_allocated() returns false, like it would
> >>> if you queried the qcow2 layer directly.
> >>
> >> I disagree.  The queried backing file layer is the raw node.  As I said,
> >> in my opinion raw nodes are not filter nodes, neither in behavior (they
> >> have an offset option), nor in how they are generally used (as a format).
> >>
> >> The raw format does not support backing files.  Therefore, everything on
> >> a raw node is allocated.
> >>
> >> (That is, like, my opinion.)
> >>
> >>>   If it returned true, we would
> >>> copy everything, which isn't right either (the test cases should maybe add
> >>> the qemu-img map output of the target so this becomes visible).
> >>
> >> It is right.
> > 
> > So we don't even agree what mirroring the raw node should even mean.
> > 
> > I can see your point when you say that the raw node has no backing
> > file, so everything should be copied. But I can also see the point that
> > the raw node can really just be used as a filter that limits the data
> > exposed from the qcow2 layer, and you want to keep the copy a COW
> > overlay over the same backing file.
> > 
> > Both are valid use cases in principle and there is no single right or
> > wrong.
> > 
> > We don't currently support the latter use case because we have only
> > sync=full or sync=top, but if you could specify a base node instead, we
> > could probably support the case without all of the special-casing filter
> > nodes and backing file children.
> > 
> > You would call bdrv_co_block_status_above() with the right base node and
> > it would just recurse wherever the data is stored, be it bs->backing,
> > bs->file or even driver-specific children. This would allow you to find
> > out whether some block in the top node came from the base node that
> > we're going to keep. If yes, skip it; if no, copy it.
> > 
> > This way we wouldn't have to decide whether raw is a filter or not,
> > because it wouldn't make a difference. The behaviour would only depend
> > on the base node given by the user. If you specified the top-level qcow2
> > file as the base, you get your full copy;
> 
> ahm, full-copy = base is NULL..

Oops, yes, of course. Using the top-level node would create an empty
"copy".

> > if you specified the backing
> > qcow2, you get the partial copy where the target still uses the same
> > backing file.
> > 
> > (Hm... It would only actually work if the offsets stay the same in the
> > chain, which is true for backing file children, but not necessarily for
> > other children.
> 
> Don't follow, what you mean by offsets stay the same and what is wrong
> with it?

Say we have this graph:

    raw,offset=65536
           |
           v
         qcow2 -+
           |    |
           v    v
         file  base

Now you can't just mirror the raw node into a target.qcow2 that shares
base as the backing file, because the offsets will be wrong. In order to
use such a copy correctly, you'd have to use a raw node again in the
backing chain:

    target.qcow2 -+
         |        |
         v        v
       file   raw,offset=65536
                  |
                  v
                base

So the case where offsets differ between the top and the base node isn't
trivial.

(If this case isn't complicated enough yet, imagine passing file as the
base node instead... It just can't work.)
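
A tiny model of why the offsets go wrong, for a block that is not allocated
in the qcow2 layer and therefore comes from base. This is a sketch under
stated assumptions: base is modelled as a flat byte array, both functions
are hypothetical, and the raw node is assumed to do nothing but add its
offset.

```c
#include <stddef.h>
#include <stdint.h>

#define RAW_OFFSET 65536u  /* the offset=65536 option from the graph */

/* What the guest reads at guest_off through raw -> qcow2 -> base. */
static inline uint8_t read_through_raw(const uint8_t *base, size_t guest_off)
{
    return base[guest_off + RAW_OFFSET];
}

/* What a target.qcow2 backed directly by base would return instead. */
static inline uint8_t read_through_plain_backing(const uint8_t *base,
                                                 size_t guest_off)
{
    return base[guest_off];
}
```

The two functions agree only when the offset is zero, which is the case
that holds for ordinary backing file children.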

Kevin



Re: [Qemu-devel] [PATCH-for-4.2 v1 9/9] s390x/cpumodel: Add new TCG features to QEMU cpu model

2019-08-13 Thread Cornelia Huck
On Mon,  5 Aug 2019 17:29:47 +0200
David Hildenbrand  wrote:

> We now implement a bunch of new facilities we can properly indicate.
> 
> ESOP-1/ESOP-2 handling is discussed in the PoP Chapter 3-15
> ("Suppression on Protection"). The "Basic suppression-on-protection (SOP)
> facility" is a core part of z/Architecture without a facility
> indication. ESOP-2 is indicated by ESOP-1 + Side-effect facility
> ("ESOP-2"). Besides ESOP-2, the side-effect facility is only relevant for
> the guarded-storage facility (we don't implement).
> 
> S390_ESOP:
> - We indicate DAT exceptions by setting bit 61 of the TEID (TEC) to 1 and
>   bit 60 to zero. We don't trigger ALCP exceptions yet. Also, we set
>   bit 0-51 and bit 62/63 to the right values.
> S390_ACCESS_EXCEPTION_FS_INDICATION:
> - The TEID (TEC) properly indicates in bit 52/53 on any access if it was
>   a fetch or a store
> S390_SIDE_EFFECT_ACCESS_ESOP2:
> - We have no side-effect accesses (esp., we don't implement the
>   guarded-storage facility), we correctly set bit 64 of the TEID (TEC) to
>   0 (no side-effect).
> - ESOP2: We properly set bit 56, 60, 61 in the TEID (TEC) to indicate the
>   type of protection. We don't trigger KCP/ALCP exceptions yet.
> S390_INSTRUCTION_EXEC_PROT:
> - The MMU properly detects and indicates the exception on instruction fetches
> - Protected TLB entries will never get PAGE_EXEC set.
> 
> There is no need to fake the absence of any of the facilities - without
> the facilities, some bits of the TEID (TEC) are simply unpredictable.
> 
> As IEP was added with z14 and we currently implement a z13, add it to
> the MAX model instead.

Looks sane, once we get those features supported.

> 
> Signed-off-by: David Hildenbrand 
> ---
>  target/s390x/gen-features.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
> index 7e82f2f004..6e78d40d9a 100644
> --- a/target/s390x/gen-features.c
> +++ b/target/s390x/gen-features.c
> @@ -704,12 +704,16 @@ static uint16_t qemu_V4_1[] = {
>  };
>  
>  static uint16_t qemu_LATEST[] = {
> +S390_FEAT_ACCESS_EXCEPTION_FS_INDICATION,
> +S390_FEAT_SIDE_EFFECT_ACCESS_ESOP2,
> +S390_FEAT_ESOP,
>  };
>  
>  /* add all new definitions before this point */
>  static uint16_t qemu_MAX[] = {
>  /* generates a dependency warning, leave it out for now */
>  S390_FEAT_MSA_EXT_5,
> +S390_FEAT_INSTRUCTION_EXEC_PROT,

This 'dependency warning' only refers to msa_ext_5, no? Can we make
that more obvious?

>  };
>  
>  /** END FEATURE DEFS **/




Re: [Qemu-devel] [PATCH 1/2] block/raw-format: switch to BDRV_BLOCK_DATA with BDRV_BLOCK_RECURSE

2019-08-13 Thread Max Reitz
On 13.08.19 17:22, Vladimir Sementsov-Ogievskiy wrote:
> 13.08.2019 18:03, Max Reitz wrote:
>> On 13.08.19 16:56, Vladimir Sementsov-Ogievskiy wrote:
>>> 13.08.2019 17:43, Max Reitz wrote:
>>>> On 13.08.19 13:04, Kevin Wolf wrote:
>>>>> Am 12.08.2019 um 20:11 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>>> BDRV_BLOCK_RAW makes generic bdrv_co_block_status to fallthrough to
>>>>>> returned file. But is it correct behavior at all? If returned file
>>>>>> itself has a backing file, we may report as totally unallocated and
>>>>>> area which actually has data in bottom backing file.
>>>>>>
>>>>>> So, mirroring of qcow2 under raw-format is broken. Which is illustrated
>>>>>> by following commit with a test. Let's make raw-format behave more
>>>>>> correctly returning BDRV_BLOCK_DATA.
>>>>>>
>>>>>> Suggested-by: Max Reitz
>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy
>>>>>
>>>>> After some reading, I think I came to the conclusion that RAW is the
>>>>> correct thing to do. There is indeed a problem, but this patch is trying
>>>>> to fix it in the wrong place.
>>>>>
>>>>> In the case where the backing file contains some data, and we have a
>>>>> 'raw' node above the qcow2 overlay node, the content of the respective
>>>>> block is not defined by the queried backing file layer, so it is
>>>>> completely correct that bdrv_is_allocated() returns false,like it would
>>>>> if you queried the qcow2 layer directly.
>>>>
>>>> I disagree.  The queried backing file layer is the raw node.  As I said,
>>>> in my opinion raw nodes are not filter nodes, neither in behavior (they
>>>> have an offset option), nor in how they are generally used (as a format).
>>>>
>>>> The raw format does not support backing files.  Therefore, everything on
>>>> a raw node is allocated.
>>>>
>>>
>>> Could you tell me, at least, what "allocated" means?
>>>
>>> It's a term that describes a region somehow.. But how? Allocated where?
>>> In the raw node, in its child, or both? Am I right that if a region is
>>> allocated in one of the non-COW children, it is assumed to be allocated in
>>> the parent too? Or what?
>>>
>>> And it's unrelated to real disk allocation, which (IMHO) directly shows that
>>> this is a bad term.
>>
>> It’s a term for COW backing chains.  If something is allocated on a
>> given node in a COW backing chain, it means it is either present in
>> exactly that node or in one of its storage children (in case the node is
>> a format node).  If it is not allocated, it is not, and read accesses
>> will be forwarded to the COW backing child.
>>
> 
> And this definition leads exactly to the bug in this series:
> 
> 
> [raw]
>|
>|file
>V   file
> [qcow2]->[file]
>|
>|backing
>V
> [base]
> 
> 
> Assume something is actually allocated in [base] but not in [qcow2].
> So, [qcow2] node reports it as unallocated. So nobody of [raw]'s storage
> children contains this as allocated, so it's unallocated in [raw].

Well, I would say it is in raw’s storage child, because the definition
of backing chain allocation only recurses down the COW link.

If it is *anywhere* in the storage subtree, it is to be considered
allocated on this level.  If it is in the backing COW subtree, it is not.

And if it is in neither, it doesn’t matter whether we say it’s allocated
or not:

For example (1), for sparse files, the raw driver may not have the data
in the storage subtree.  But it sure as hell won’t have it in its
backing COW subtree, because it doesn’t have one.  So for simplicity’s
sake, it may just return everything as allocated.

For example (2), for sparse qcow2 files, the qcow2 driver may report
some ranges as unallocated even when it doesn’t have a backing file.  We
don’t need to change it to report such cases as allocated.
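
The definition above can be written down as a small sketch in C. This is a toy model under stated assumptions, not the block layer's implementation: `Node`, `present_in_subtree()` and `is_allocated()` are hypothetical names, each node has at most one storage child, and only a single block is tracked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical graph node for one block of data, not a QEMU structure. */
typedef struct Node {
    bool has_data;              /* this node itself holds the block */
    const struct Node *storage; /* storage ("file") child, or NULL */
    const struct Node *backing; /* COW backing child, or NULL */
} Node;

/* True if the block is found in n or anywhere below it, following both
 * storage and backing edges. */
static bool present_in_subtree(const Node *n)
{
    if (n == NULL) {
        return false;
    }
    return n->has_data
        || present_in_subtree(n->storage)
        || present_in_subtree(n->backing);
}

/* "Allocated on n": present in n itself or anywhere in its storage
 * subtree. Data reachable only through n's own backing edge does not
 * count. */
static bool is_allocated(const Node *n)
{
    return n->has_data || present_in_subtree(n->storage);
}
```

For the graph in the quoted mail, with the data present only in [base], this model reports the block as unallocated on [qcow2] (it is reachable only through the backing edge) but as allocated on [raw] (it lies somewhere in raw's storage subtree).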

Max





Re: [Qemu-devel] [PATCH-for-4.2 v1 8/9] s390x/cpumodel: Prepare for changes of QEMU model

2019-08-13 Thread Cornelia Huck
On Mon,  5 Aug 2019 17:29:46 +0200
David Hildenbrand  wrote:

> Setup the 4.1 compatibility model so we can add new features to the
> LATEST model.

Basically a nop for now from an outside view.

> 
> Signed-off-by: David Hildenbrand 
> ---
>  hw/s390x/s390-virtio-ccw.c  | 2 ++
>  target/s390x/gen-features.c | 6 +-
>  2 files changed, 7 insertions(+), 1 deletion(-)

Reviewed-by: Cornelia Huck 



[Qemu-devel] [PULL 12/29] Include hw/irq.h a lot less

2019-08-13 Thread Markus Armbruster
In my "build everything" tree, changing hw/irq.h triggers a recompile
of some 5400 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).

hw/hw.h supposedly includes it for convenience.  Several other headers
include it just to get qemu_irq and/or qemu_irq_handler.

Move the qemu_irq and qemu_irq_handler typedefs from hw/irq.h to
qemu/typedefs.h, and then include hw/irq.h only where it's still
needed.  Touching it now recompiles only some 500 objects.

Signed-off-by: Markus Armbruster 
Reviewed-by: Alistair Francis 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Message-Id: <20190812052359.30071-13-arm...@redhat.com>
---
 hw/alpha/alpha_sys.h | 1 -
 hw/hppa/hppa_sys.h   | 1 -
 include/hw/acpi/acpi.h   | 1 -
 include/hw/arm/boot.h| 1 -
 include/hw/arm/omap.h| 1 -
 include/hw/arm/soc_dma.h | 1 -
 include/hw/block/fdc.h   | 1 -
 include/hw/bt.h  | 1 -
 include/hw/core/split-irq.h  | 1 -
 include/hw/cris/etraxfs_dma.h| 1 -
 include/hw/display/blizzard.h| 1 -
 include/hw/display/tc6393xb.h| 1 -
 include/hw/hw.h  | 1 -
 include/hw/ide/internal.h| 1 +
 include/hw/input/gamepad.h   | 1 -
 include/hw/input/tsc2xxx.h   | 1 -
 include/hw/irq.h | 4 
 include/hw/isa/vt82c686.h| 1 -
 include/hw/mips/mips.h   | 1 -
 include/hw/misc/cbus.h   | 1 -
 include/hw/net/lan9118.h | 1 -
 include/hw/net/smc91c111.h   | 1 -
 include/hw/or-irq.h  | 1 -
 include/hw/ppc/spapr_irq.h   | 1 -
 include/hw/qdev-core.h   | 1 -
 include/hw/sh4/sh_intc.h | 1 -
 include/hw/timer/m48t59.h| 1 -
 include/hw/tricore/tricore.h | 1 -
 include/hw/vfio/vfio-platform.h  | 1 -
 include/hw/xen/xen.h | 1 -
 include/hw/xtensa/mx_pic.h   | 1 -
 include/qemu/typedefs.h  | 9 +
 include/sysemu/kvm.h | 1 -
 hw/acpi/core.c   | 1 +
 hw/acpi/piix4.c  | 1 +
 hw/alpha/typhoon.c   | 1 +
 hw/arm/armsse.c  | 1 +
 hw/arm/exynos4210.c  | 1 +
 hw/arm/exynos4_boards.c  | 1 +
 hw/arm/integratorcp.c| 1 +
 hw/arm/msf2-soc.c| 1 +
 hw/arm/musicpal.c| 1 +
 hw/arm/omap1.c   | 1 +
 hw/arm/omap2.c   | 1 +
 hw/arm/palm.c| 2 ++
 hw/arm/pxa2xx.c  | 1 +
 hw/arm/pxa2xx_gpio.c | 1 +
 hw/arm/realview.c| 1 +
 hw/arm/smmuv3.c  | 1 +
 hw/arm/spitz.c   | 1 +
 hw/arm/stellaris.c   | 1 +
 hw/arm/strongarm.c   | 1 +
 hw/arm/tosa.c| 1 +
 hw/arm/versatilepb.c | 1 +
 hw/arm/virt.c| 1 +
 hw/arm/z2.c  | 1 +
 hw/audio/cs4231a.c   | 1 +
 hw/audio/gus.c   | 1 +
 hw/audio/marvell_88w8618.c   | 1 +
 hw/audio/milkymist-ac97.c| 1 +
 hw/audio/pl041.c | 1 +
 hw/audio/sb16.c  | 1 +
 hw/block/fdc.c   | 1 +
 hw/char/bcm2835_aux.c| 1 +
 hw/char/cadence_uart.c   | 1 +
 hw/char/cmsdk-apb-uart.c | 1 +
 hw/char/escc.c   | 1 +
 hw/char/etraxfs_ser.c| 1 +
 hw/char/exynos4210_uart.c| 1 +
 hw/char/grlib_apbuart.c  | 1 +
 hw/char/imx_serial.c | 1 +
 hw/char/ipoctal232.c | 1 +
 hw/char/lm32_uart.c  | 1 +
 hw/char/mcf_uart.c   | 1 +
 hw/char/milkymist-uart.c | 1 +
 hw/char/nrf51_uart.c | 1 +
 hw/char/parallel.c   | 1 +
 hw/char/pl011.c  | 1 +
 hw/char/serial-pci-multi.c   | 1 +
 hw/char/serial-pci.c | 1 +
 hw/char/serial.c | 1 +
 hw/char/sh_serial.c  | 2 ++
 hw/char/spapr_vty.c  | 1 +
 hw/char/stm32f2xx_usart.c| 1 +
 hw/char/xilinx_uartlite.c| 1 +
 hw/core/or-irq.c | 1 +
 hw/core/qdev.c   | 1 +
 hw/core/split-irq.c  | 1 +
 hw/cpu/a15mpcore.c   | 1 +
 hw/cpu/a9mpcore.c| 1 +
 hw/cpu/arm11mpcore.c | 1 +
 hw/cpu/realview_mpcore.c | 1 +
 hw/display/ads7846.c | 1 +
 hw/display/bcm2835_fb.c  | 1 +
 hw/display/cg3.c | 1 +
 hw/display/exynos4210_fimd.c | 1 +
 hw/display/g364fb.c  | 1 +
 hw/display/milkymist-tmu2.c  | 1 +
 hw/display/omap_dss.c| 2 ++
 hw/display/omap_lcdc.c   | 2 ++
 hw/display/pl110.c   | 1 +
 hw/display/pxa2xx_lcd.c  | 1 +
 hw/display/tc6393xb.c| 2 ++
 hw/display/xlnx_dp.c | 1 +
 hw/dma/bcm2835_dma.c | 1 +
 hw/dma/etraxfs_dma.c | 2 ++
 hw/dma/pl080.c   | 1 +
 

[Qemu-devel] [PULL 27/29] Include sysemu/sysemu.h a lot less

2019-08-13 Thread Markus Armbruster
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 5400 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

hw/qdev-core.h includes sysemu/sysemu.h since recent commit e965ffa70a
"qdev: add qdev_add_vm_change_state_handler()".  This is a bad idea:
hw/qdev-core.h is widely included.

Move the declaration of qdev_add_vm_change_state_handler() to
sysemu/sysemu.h, and drop the problematic include from hw/qdev-core.h.

Touching sysemu/sysemu.h now recompiles some 1800 objects.
qemu/uuid.h also drops from 5400 to 1800.  A few more headers show
smaller improvement: qemu/notify.h drops from 5600 to 5200,
qemu/timer.h from 5600 to 4500, and qapi/qapi-types-run-state.h from
5500 to 5000.

Cc: Stefan Hajnoczi 
Signed-off-by: Markus Armbruster 
Reviewed-by: Alistair Francis 
Reviewed-by: Stefan Hajnoczi 
Message-Id: <20190812052359.30071-28-arm...@redhat.com>
Reviewed-by: Alex Bennée 
---
 hw/usb/hcd-ehci.h | 1 +
 include/hw/qdev-core.h| 5 -
 include/sysemu/sysemu.h   | 3 +++
 accel/kvm/kvm-all.c   | 1 +
 backends/hostmem.c| 1 +
 cpus.c| 1 +
 hw/arm/allwinner-a10.c| 1 +
 hw/arm/aspeed_soc.c   | 1 +
 hw/arm/kzm.c  | 1 +
 hw/arm/msf2-soc.c | 1 +
 hw/arm/stm32f205_soc.c| 1 +
 hw/char/serial-isa.c  | 1 +
 hw/char/xen_console.c | 1 +
 hw/core/numa.c| 1 +
 hw/core/vm-change-state-handler.c | 1 +
 hw/display/qxl-render.c   | 1 +
 hw/i386/xen/xen-hvm.c | 1 +
 hw/i386/xen/xen-mapcache.c| 1 +
 hw/intc/ioapic.c  | 1 +
 hw/pci/pci.c  | 1 +
 hw/riscv/sifive_e.c   | 1 +
 hw/riscv/sifive_u.c   | 1 +
 hw/riscv/spike.c  | 1 +
 hw/riscv/virt.c   | 1 +
 hw/sparc64/niagara.c  | 2 +-
 hw/xen/xen-common.c   | 1 +
 hw/xen/xen_devconfig.c| 1 +
 hw/xenpv/xen_machine_pv.c | 1 +
 migration/global_state.c  | 1 +
 migration/migration.c | 1 +
 migration/savevm.c| 1 +
 31 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/hw/usb/hcd-ehci.h b/hw/usb/hcd-ehci.h
index 0298238f0b..fdbcfdcbeb 100644
--- a/hw/usb/hcd-ehci.h
+++ b/hw/usb/hcd-ehci.h
@@ -21,6 +21,7 @@
 #include "qemu/timer.h"
 #include "hw/usb.h"
 #include "sysemu/dma.h"
+#include "sysemu/sysemu.h"
 #include "hw/pci/pci.h"
 #include "hw/sysbus.h"
 
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index e5b62dd2fc..de70b7a19a 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -5,7 +5,6 @@
 #include "qemu/bitmap.h"
 #include "qom/object.h"
 #include "hw/hotplug.h"
-#include "sysemu/sysemu.h"
 
 enum {
 DEV_NVECTORS_UNSPECIFIED = -1,
@@ -451,8 +450,4 @@ static inline bool qbus_is_hotpluggable(BusState *bus)
 void device_listener_register(DeviceListener *listener);
 void device_listener_unregister(DeviceListener *listener);
 
-VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
- VMChangeStateHandler *cb,
- void *opaque);
-
 #endif
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 227202999d..908f158677 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -29,6 +29,9 @@ VMChangeStateEntry *qemu_add_vm_change_state_handler(VMChangeStateHandler *cb,
  void *opaque);
 VMChangeStateEntry *qemu_add_vm_change_state_handler_prio(
 VMChangeStateHandler *cb, void *opaque, int priority);
+VMChangeStateEntry *qdev_add_vm_change_state_handler(DeviceState *dev,
+ VMChangeStateHandler *cb,
+ void *opaque);
 void qemu_del_vm_change_state_handler(VMChangeStateEntry *e);
 void vm_state_notify(int running, RunState state);
 
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index e1a44eccf5..fc38d0b9e3 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -29,6 +29,7 @@
 #include "exec/gdbstub.h"
 #include "sysemu/kvm_int.h"
 #include "sysemu/cpus.h"
+#include "sysemu/sysemu.h"
 #include "qemu/bswap.h"
 #include "exec/memory.h"
 #include "exec/ram_addr.h"
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 463102aa15..6d333dc23c 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -12,6 +12,7 @@
 
 #include "qemu/osdep.h"
 #include "sysemu/hostmem.h"
+#include "sysemu/sysemu.h"
 #include "hw/boards.h"
 #include "qapi/error.h"
 #include "qapi/qapi-builtin-visit.h"
diff --git a/cpus.c b/cpus.c
index e70cc58e31..a20a9a29c1 100644
--- a/cpus.c
+++ b/cpus.c
@@ -41,6 +41,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/hax.h"
 #include "sysemu/hvf.h"

[Qemu-devel] [PULL 10/29] ide: Include hw/ide/internal a bit less outside hw/ide/

2019-08-13 Thread Markus Armbruster
According to hw/ide/internal's file comment, only files in hw/ide/ are
supposed to include it.  Drag reality slightly closer to supposition.

Three includes outside hw/ide remain: hw/arm/sbsa-ref.c,
include/hw/ide/pci.h, and include/hw/misc/macio/macio.h.  Turns out
board code needs ide-internal.h to wire up IDE stuff.  More cleanup is
needed.  Left for another day.

Cc: John Snow 
Signed-off-by: Markus Armbruster 
Reviewed-by: John Snow 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Message-Id: <20190812052359.30071-11-arm...@redhat.com>
---
 hw/ide/ahci_internal.h | 1 +
 hw/ppc/mac.h   | 1 -
 include/hw/arm/allwinner-a10.h | 1 -
 include/hw/arm/xlnx-zynqmp.h   | 1 -
 include/hw/misc/mos6522.h  | 1 -
 hw/arm/allwinner-a10.c | 1 +
 hw/arm/cubieboard.c| 1 +
 hw/arm/xlnx-zynqmp.c   | 1 +
 8 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/ide/ahci_internal.h b/hw/ide/ahci_internal.h
index 95ecddcd3c..73424516da 100644
--- a/hw/ide/ahci_internal.h
+++ b/hw/ide/ahci_internal.h
@@ -25,6 +25,7 @@
 #define HW_IDE_AHCI_INTERNAL_H
 
 #include "hw/ide/ahci.h"
+#include "hw/ide/internal.h"
 #include "hw/sysbus.h"
 
 #define AHCI_MEM_BAR_SIZE 0x1000
diff --git a/hw/ppc/mac.h b/hw/ppc/mac.h
index a741300ac9..6af87d1fa0 100644
--- a/hw/ppc/mac.h
+++ b/hw/ppc/mac.h
@@ -30,7 +30,6 @@
 #include "exec/memory.h"
 #include "hw/boards.h"
 #include "hw/sysbus.h"
-#include "hw/ide/internal.h"
 #include "hw/input/adb.h"
 #include "hw/misc/mos6522.h"
 #include "hw/pci/pci_host.h"
diff --git a/include/hw/arm/allwinner-a10.h b/include/hw/arm/allwinner-a10.h
index 7182ce5c4b..101b72a71d 100644
--- a/include/hw/arm/allwinner-a10.h
+++ b/include/hw/arm/allwinner-a10.h
@@ -7,7 +7,6 @@
 #include "hw/timer/allwinner-a10-pit.h"
 #include "hw/intc/allwinner-a10-pic.h"
 #include "hw/net/allwinner_emac.h"
-#include "hw/ide/pci.h"
 #include "hw/ide/ahci.h"
 
 #include "sysemu/sysemu.h"
diff --git a/include/hw/arm/xlnx-zynqmp.h b/include/hw/arm/xlnx-zynqmp.h
index 6cb65e7537..d7483c3b42 100644
--- a/include/hw/arm/xlnx-zynqmp.h
+++ b/include/hw/arm/xlnx-zynqmp.h
@@ -22,7 +22,6 @@
 #include "hw/intc/arm_gic.h"
 #include "hw/net/cadence_gem.h"
 #include "hw/char/cadence_uart.h"
-#include "hw/ide/pci.h"
 #include "hw/ide/ahci.h"
 #include "hw/sd/sdhci.h"
 #include "hw/ssi/xilinx_spips.h"
diff --git a/include/hw/misc/mos6522.h b/include/hw/misc/mos6522.h
index 03d9f0c059..493c907537 100644
--- a/include/hw/misc/mos6522.h
+++ b/include/hw/misc/mos6522.h
@@ -29,7 +29,6 @@
 
 #include "exec/memory.h"
 #include "hw/sysbus.h"
-#include "hw/ide/internal.h"
 #include "hw/input/adb.h"
 
 /* Bits in ACR */
diff --git a/hw/arm/allwinner-a10.c b/hw/arm/allwinner-a10.c
index 35e906ca54..3b0d3eccdd 100644
--- a/hw/arm/allwinner-a10.c
+++ b/hw/arm/allwinner-a10.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "exec/address-spaces.h"
 #include "qapi/error.h"
 #include "qemu/module.h"
 #include "cpu.h"
diff --git a/hw/arm/cubieboard.c b/hw/arm/cubieboard.c
index f7c8a5985a..38e0ca0f53 100644
--- a/hw/arm/cubieboard.c
+++ b/hw/arm/cubieboard.c
@@ -16,6 +16,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "exec/address-spaces.h"
 #include "qapi/error.h"
 #include "cpu.h"
 #include "hw/sysbus.h"
diff --git a/hw/arm/xlnx-zynqmp.c b/hw/arm/xlnx-zynqmp.c
index a60830d37a..0f587e63d3 100644
--- a/hw/arm/xlnx-zynqmp.c
+++ b/hw/arm/xlnx-zynqmp.c
@@ -24,6 +24,7 @@
 #include "hw/boards.h"
 #include "exec/address-spaces.h"
 #include "sysemu/kvm.h"
+#include "sysemu/sysemu.h"
 #include "kvm_arm.h"
 
 #define GIC_NUM_SPI_INTR 160
-- 
2.21.0




[Qemu-devel] [PULL 29/29] sysemu: Split sysemu/runstate.h off sysemu/sysemu.h

2019-08-13 Thread Markus Armbruster
sysemu/sysemu.h is a rather unfocused dumping ground for stuff related
to the system-emulator.  Evidence:

* It's included widely: in my "build everything" tree, changing
  sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600
  objects (not counting tests and objects that don't depend on
  qemu/osdep.h, down from 5400 due to the previous two commits).

* It pulls in more than a dozen additional headers.

Split stuff related to run state management into its own header
sysemu/runstate.h.

Touching sysemu/sysemu.h now recompiles some 850 objects.  qemu/uuid.h
also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400
to 4200.  Touching new sysemu/runstate.h recompiles some 500 objects.

Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also
add qemu/main-loop.h.

Suggested-by: Paolo Bonzini 
Signed-off-by: Markus Armbruster 
Message-Id: <20190812052359.30071-30-arm...@redhat.com>
Reviewed-by: Alex Bennée 
---
 include/hw/ppc/spapr_drc.h|  2 +-
 include/sysemu/runstate.h | 68 +++
 include/sysemu/sysemu.h   | 61 ---
 accel/kvm/kvm-all.c   |  1 +
 audio/audio.c |  2 +-
 block/block-backend.c |  2 +-
 blockdev.c|  1 +
 cpus.c|  2 +-
 dump/dump.c   |  2 +-
 gdbstub.c |  1 +
 hw/acpi/core.c|  2 +-
 hw/acpi/ich9.c|  2 +-
 hw/acpi/piix4.c   |  1 +
 hw/arm/highbank.c |  1 +
 hw/arm/integratorcp.c |  1 +
 hw/arm/msf2-soc.c |  1 +
 hw/arm/musicpal.c |  1 +
 hw/arm/nseries.c  |  1 +
 hw/arm/omap1.c|  1 +
 hw/arm/omap2.c|  1 +
 hw/arm/sbsa-ref.c |  1 +
 hw/arm/spitz.c|  1 +
 hw/arm/stellaris.c|  1 +
 hw/arm/tosa.c |  2 +-
 hw/arm/virt.c |  1 +
 hw/block/pflash_cfi01.c   |  2 +-
 hw/block/vhost-user-blk.c |  1 +
 hw/block/virtio-blk.c |  1 +
 hw/char/serial.c  |  2 +-
 hw/core/machine-qmp-cmds.c|  1 +
 hw/core/vm-change-state-handler.c |  2 +-
 hw/display/qxl-render.c   |  2 +-
 hw/display/qxl.c  |  2 +-
 hw/dma/etraxfs_dma.c  |  2 +-
 hw/i386/kvm/clock.c   |  2 +-
 hw/i386/kvm/i8254.c   |  2 +-
 hw/i386/kvmvapic.c|  1 +
 hw/i386/pc.c  |  1 +
 hw/i386/xen/xen-hvm.c |  1 +
 hw/i386/xen/xen-mapcache.c|  2 +-
 hw/ide/core.c |  2 +-
 hw/ide/qdev.c |  1 +
 hw/input/pckbd.c  |  2 +-
 hw/input/ps2.c|  2 +-
 hw/intc/arm_gicv3_its_kvm.c   |  2 +-
 hw/intc/arm_gicv3_kvm.c   |  2 +-
 hw/intc/spapr_xive_kvm.c  |  1 +
 hw/ipmi/ipmi.c|  2 +-
 hw/isa/lpc_ich9.c |  1 +
 hw/mips/boston.c  |  1 +
 hw/mips/mips_malta.c  |  1 +
 hw/mips/mips_r4k.c|  1 +
 hw/misc/arm_sysctl.c  |  2 +-
 hw/misc/cbus.c|  2 +-
 hw/misc/exynos4210_pmu.c  |  2 +-
 hw/misc/imx7_snvs.c   |  2 +-
 hw/misc/iotkit-sysctl.c   |  2 +-
 hw/misc/macio/cuda.c  |  2 +-
 hw/misc/macio/pmu.c   |  2 +-
 hw/misc/pvpanic.c |  2 +-
 hw/misc/slavio_misc.c |  2 +-
 hw/misc/zynq_slcr.c   |  2 +-
 hw/net/e1000e_core.c  |  2 +-
 hw/nvram/spapr_nvram.c|  2 +
 hw/pci-host/bonito.c  |  2 +-
 hw/pci-host/piix.c|  2 +-
 hw/pci-host/sabre.c   |  2 +-
 hw/ppc/e500.c |  1 +
 hw/ppc/mpc8544_guts.c |  2 +-
 hw/ppc/pnv.c  |  1 +
 hw/ppc/ppc.c  |  2 +-
 hw/ppc/ppc_booke.c|  2 +-
 hw/ppc/prep_systemio.c|  2 +-
 hw/ppc/spapr.c|  1 +
 hw/ppc/spapr_events.c |  2 +-
 hw/ppc/spapr_hcall.c  |  2 +-
 hw/ppc/spapr_rtas.c   |  2 +
 hw/rdma/vmw/pvrdma_main.c |  2 +-
 hw/s390x/ipl.c|  1 +
 hw/s390x/sclpquiesce.c|  2 +-
 hw/s390x/tod-kvm.c|  2 +-
 hw/scsi/scsi-bus.c|  1 +
 hw/sh4/r2d.c  |  1 +
 hw/sparc/sun4m.c  |  1 +
 hw/sparc64/sun4u.c|  1 +
 hw/timer/etraxfs_timer.c  |  2 +-
 hw/timer/m48t59.c |  1 +
 hw/timer/mc146818rtc.c|  1 +
 hw/timer/milkymist-sysctl.c   |  2 +-
 hw/timer/pxa2xx_timer.c   |  2 +-
 hw/usb/hcd-ehci.c |  2 +-
 hw/usb/host-libusb.c  |  1 +
 hw/usb/redirect.c |  1 +
 hw/vfio/pci.c

Re: [Qemu-devel] [PATCH 1/2] block/raw-format: switch to BDRV_BLOCK_DATA with BDRV_BLOCK_RECURSE

2019-08-13 Thread Vladimir Sementsov-Ogievskiy
13.08.2019 18:41, Kevin Wolf wrote:
> Am 13.08.2019 um 16:43 hat Max Reitz geschrieben:
>> On 13.08.19 13:04, Kevin Wolf wrote:
>>> Am 12.08.2019 um 20:11 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> BDRV_BLOCK_RAW makes generic bdrv_co_block_status to fallthrough to
>>>> returned file. But is it correct behavior at all? If returned file
>>>> itself has a backing file, we may report as totally unallocated and
>>>> area which actually has data in bottom backing file.
>>>>
>>>> So, mirroring of qcow2 under raw-format is broken. Which is illustrated
>>>> by following commit with a test. Let's make raw-format behave more
>>>> correctly returning BDRV_BLOCK_DATA.
>>>>
>>>> Suggested-by: Max Reitz
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy
>>>
>>> After some reading, I think I came to the conclusion that RAW is the
>>> correct thing to do. There is indeed a problem, but this patch is trying
>>> to fix it in the wrong place.
>>>
>>> In the case where the backing file contains some data, and we have a
>>> 'raw' node above the qcow2 overlay node, the content of the respective
>>> block is not defined by the queried backing file layer, so it is
>>> completely correct that bdrv_is_allocated() returns false,like it would
>>> if you queried the qcow2 layer directly.
>>
>> I disagree.  The queried backing file layer is the raw node.  As I said,
>> in my opinion raw nodes are not filter nodes, neither in behavior (they
>> have an offset option), nor in how they are generally used (as a format).
>>
>> The raw format does not support backing files.  Therefore, everything on
>> a raw node is allocated.
>>
>> (That is, like, my opinion.)
>>
>>>   If it returned true, we would
>>> copy everything, which isn't right either (the test cases should may add
>>> the qemu-img map output of the target so this becomes visible).
>>
>> It is right.
> 
> So we don't even agree what mirroring the raw node should even mean.
> 
> I can see your point when you say that the raw node has no backing
> file, so everything should be copied. But I can also see the point that
> the raw node can really just be used as a filter that limits the data
> exposed from the qcow2 layer, and you want to keep the copy a COW
> overlay over the same backing file.
> 
> Both are valid use cases in principle and there is no single right or
> wrong.
> 
> We don't currently support the latter use case because we have only
> sync=full or sync=top, but if you could specify a base node instead, we
> could probably support the case without all of the special-casing filter
> nodes and backing file children.
> 
> You would call bdrv_co_block_status_above() with the right base node and
> it would just recurse wherever the data is stored, be it bs->backing,
> bs->file or even driver-specific children. This would allow you to find
> out whether some block in the top node came from the base node that
> we're going to keep. If yes, skip it; if no, copy it.
> 
> This way we wouldn't have to decide whether raw is a filter or not,
> because it wouldn't make a difference. The behaviour would only depend
> on the base node given by the user. If you specified the top-level qcow2
> file as the base, you get your full copy;

ahm, full-copy = base is NULL..

> if you specified the backing
> qcow2, you get the partial copy where the target still uses the same
> backing file.
> 
> (Hm... It would only actually work if the offsets stay the same in the
> chain, which is true for backing file children, but not necessarily for
> other children.

Don't follow, what you mean by offsets stay the same and what is wrong with it?

> Anyway, even if we don't gain much functionality, I
> really want a more generic model that avoids different types of nodes
> and edges as much as possible.)
> 
> Kevin
> 


-- 
Best regards,
Vladimir



[Qemu-devel] [PULL 18/29] Include hw/hw.h exactly where needed

2019-08-13 Thread Markus Armbruster
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).

The previous commits have left only the declaration of hw_error() in
hw/hw.h.  This permits dropping most of its inclusions.  Touching it
now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster 
Reviewed-by: Alistair Francis 
Message-Id: <20190812052359.30071-19-arm...@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
---
 hw/display/qxl.h | 1 -
 hw/i386/amd_iommu.h  | 1 -
 hw/microblaze/boot.h | 1 -
 hw/net/ne2000.h  | 1 -
 hw/nios2/boot.h  | 1 -
 hw/usb/hcd-ehci.h| 1 -
 include/hw/audio/pcspk.h | 1 -
 include/hw/audio/wm8750.h| 1 -
 include/hw/char/serial.h | 1 -
 include/hw/char/stm32f2xx_usart.h| 1 -
 include/hw/dma/i8257.h   | 1 -
 include/hw/hw.h  | 1 -
 include/hw/i386/ich9.h   | 1 -
 include/hw/i386/ioapic_internal.h| 1 -
 include/hw/input/i8042.h | 1 -
 include/hw/isa/apm.h | 1 -
 include/hw/isa/i8259_internal.h  | 1 -
 include/hw/misc/stm32f2xx_syscfg.h   | 1 -
 include/hw/net/ne2000-isa.h  | 1 -
 include/hw/pci-host/designware.h | 1 -
 include/hw/pci-host/gpex.h   | 1 -
 include/hw/pci-host/q35.h| 1 -
 include/hw/pci-host/uninorth.h   | 1 -
 include/hw/pci-host/xilinx-pcie.h| 1 -
 include/hw/pci/pcie.h| 1 -
 include/hw/pci/pcie_aer.h| 1 -
 include/hw/qdev.h| 1 -
 include/hw/riscv/riscv_htif.h| 1 -
 include/hw/ssi/stm32f2xx_spi.h   | 1 -
 include/hw/timer/aspeed_rtc.h| 1 -
 include/hw/timer/i8254.h | 1 -
 include/hw/timer/i8254_internal.h| 1 -
 include/hw/virtio/vhost.h| 1 -
 include/hw/virtio/virtio.h   | 1 -
 include/hw/xen/xen_common.h  | 1 -
 include/sysemu/dma.h | 1 -
 include/sysemu/hax.h | 1 -
 include/sysemu/hvf.h | 1 -
 accel/kvm/kvm-all.c  | 1 -
 audio/audio.c| 1 -
 audio/spiceaudio.c   | 1 -
 audio/wavcapture.c   | 1 -
 cpus.c   | 1 +
 device-hotplug.c | 1 -
 exec.c   | 1 -
 hw/9pfs/xen-9p-backend.c | 1 -
 hw/acpi/core.c   | 1 -
 hw/acpi/cpu_hotplug.c| 1 -
 hw/acpi/ich9.c   | 1 -
 hw/acpi/pcihp.c  | 1 -
 hw/acpi/piix4.c  | 1 -
 hw/adc/stm32f2xx_adc.c   | 1 -
 hw/alpha/dp264.c | 1 -
 hw/alpha/typhoon.c   | 1 -
 hw/arm/boot.c| 1 -
 hw/arm/collie.c  | 1 -
 hw/arm/gumstix.c | 1 -
 hw/arm/integratorcp.c| 1 +
 hw/arm/mainstone.c   | 1 -
 hw/arm/musicpal.c| 1 +
 hw/arm/omap2.c   | 1 -
 hw/arm/omap_sx1.c| 1 -
 hw/arm/palm.c| 1 -
 hw/arm/pxa2xx_pic.c  | 1 -
 hw/arm/spitz.c   | 1 -
 hw/arm/tosa.c| 1 -
 hw/arm/virt-acpi-build.c | 1 -
 hw/arm/z2.c  | 1 -
 hw/audio/ac97.c  | 1 -
 hw/audio/adlib.c | 1 -
 hw/audio/cs4231a.c   | 1 -
 hw/audio/es1370.c| 1 -
 hw/audio/gus.c   | 1 -
 hw/audio/hda-codec.c | 1 -
 hw/audio/intel-hda.c | 1 -
 hw/audio/marvell_88w8618.c   | 1 -
 hw/audio/milkymist-ac97.c| 1 -
 hw/audio/pcspk.c | 1 -
 hw/audio/sb16.c  | 1 -
 hw/block/dataplane/xen-block.c   | 1 -
 hw/block/ecc.c   | 1 -
 hw/block/fdc.c   | 1 -
 hw/block/m25p80.c| 1 -
 hw/block/nvme.c  | 1 -
 hw/block/pflash_cfi01.c  | 1 -
 hw/block/pflash_cfi02.c  | 1 -
 hw/block/tc58128.c   | 1 -
 hw/block/xen-block.c | 1 -
 hw/char/debugcon.c   | 1 -
 hw/char/digic-uart.c | 1 -
 hw/char/escc.c   | 1 -
 hw/char/lm32_juart.c | 1 -
 hw/char/lm32_uart.c  

[Qemu-devel] [PULL 24/29] Include sysemu/hostmem.h less

2019-08-13 Thread Markus Armbruster
Move the HostMemoryBackend typedef from sysemu/hostmem.h to
qemu/typedefs.h.  This renders a few inclusions of sysemu/hostmem.h
superfluous; drop them.

Cc: Eduardo Habkost 
Cc: Igor Mammedov 
Signed-off-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Eduardo Habkost 
Reviewed-by: Igor Mammedov 
Tested-by: Philippe Mathieu-Daudé 
Message-Id: <20190812052359.30071-25-arm...@redhat.com>
---
 include/hw/mem/pc-dimm.h| 1 -
 include/hw/virtio/virtio-pmem.h | 1 -
 include/qemu/typedefs.h | 1 +
 include/sysemu/hostmem.h| 1 -
 hw/mem/nvdimm.c | 1 +
 hw/virtio/virtio-pmem.c | 1 +
 6 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
index 47b246f95c..289edc0f3d 100644
--- a/include/hw/mem/pc-dimm.h
+++ b/include/hw/mem/pc-dimm.h
@@ -17,7 +17,6 @@
 #define QEMU_PC_DIMM_H
 
 #include "exec/memory.h"
-#include "sysemu/hostmem.h"
 #include "hw/qdev-core.h"
 
 #define TYPE_PC_DIMM "pc-dimm"
diff --git a/include/hw/virtio/virtio-pmem.h b/include/hw/virtio/virtio-pmem.h
index 8bf2ae780f..33f1999320 100644
--- a/include/hw/virtio/virtio-pmem.h
+++ b/include/hw/virtio/virtio-pmem.h
@@ -16,7 +16,6 @@
 
 #include "hw/virtio/virtio.h"
 #include "qapi/qapi-types-misc.h"
-#include "sysemu/hostmem.h"
 
 #define TYPE_VIRTIO_PMEM "virtio-pmem"
 
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index 9e1283aacf..f569f5f270 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -33,6 +33,7 @@ typedef struct FWCfgEntry FWCfgEntry;
 typedef struct FWCfgIoState FWCfgIoState;
 typedef struct FWCfgMemState FWCfgMemState;
 typedef struct FWCfgState FWCfgState;
+typedef struct HostMemoryBackend HostMemoryBackend;
 typedef struct HVFX86EmulatorState HVFX86EmulatorState;
 typedef struct I2CBus I2CBus;
 typedef struct I2SCodec I2SCodec;
diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
index 92fa0e458c..afeb5db1b1 100644
--- a/include/sysemu/hostmem.h
+++ b/include/sysemu/hostmem.h
@@ -27,7 +27,6 @@
 #define MEMORY_BACKEND_CLASS(klass) \
 OBJECT_CLASS_CHECK(HostMemoryBackendClass, (klass), TYPE_MEMORY_BACKEND)
 
-typedef struct HostMemoryBackend HostMemoryBackend;
 typedef struct HostMemoryBackendClass HostMemoryBackendClass;
 
 /**
diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
index 6fefd65092..375f9a588a 100644
--- a/hw/mem/nvdimm.c
+++ b/hw/mem/nvdimm.c
@@ -30,6 +30,7 @@
 #include "hw/mem/nvdimm.h"
 #include "hw/qdev-properties.h"
 #include "hw/mem/memory-device.h"
+#include "sysemu/hostmem.h"
 
 static void nvdimm_get_label_size(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
diff --git a/hw/virtio/virtio-pmem.c b/hw/virtio/virtio-pmem.c
index ff1a2ddb36..c0c9395e55 100644
--- a/hw/virtio/virtio-pmem.c
+++ b/hw/virtio/virtio-pmem.c
@@ -21,6 +21,7 @@
 #include "hw/virtio/virtio-access.h"
 #include "standard-headers/linux/virtio_ids.h"
 #include "standard-headers/linux/virtio_pmem.h"
+#include "sysemu/hostmem.h"
 #include "block/aio.h"
 #include "block/thread-pool.h"
 
-- 
2.21.0




[Qemu-devel] [PULL 23/29] numa: Don't include hw/boards.h into sysemu/numa.h

2019-08-13 Thread Markus Armbruster
sysemu/numa.h includes hw/boards.h just for the CPUArchId typedef, at
the cost of pulling in more than two dozen extra headers indirectly.

I could move the typedef from hw/boards.h to qemu/typedefs.h.  But
it's used in just two headers: boards.h and numa.h.

I could move it to another header both its users include.
exec/cpu-common.h seems to be the least bad fit.

But I'm keeping this simple & stupid: declare the struct tag in
numa.h.

Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Signed-off-by: Markus Armbruster 
Reviewed-by: Eduardo Habkost 
Message-Id: <20190812052359.30071-24-arm...@redhat.com>
---
 include/hw/boards.h   | 2 +-
 include/sysemu/numa.h | 9 +++--
 hw/mem/pc-dimm.c  | 1 +
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 67e551636a..739d109fe1 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -86,7 +86,7 @@ void machine_class_allow_dynamic_sysbus_dev(MachineClass *mc, const char *type);
  * @props - CPU object properties, initialized by board
  * #vcpus_count - number of threads provided by @cpu object
  */
-typedef struct {
+typedef struct CPUArchId {
 uint64_t arch_id;
 int64_t vcpus_count;
 CpuInstanceProperties props;
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 01a263eba2..4c4c1dee9b 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -4,7 +4,10 @@
 #include "qemu/bitmap.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/hostmem.h"
-#include "hw/boards.h"
+#include "qapi/qapi-types-machine.h"
+#include "exec/cpu-common.h"
+
+struct CPUArchId;
 
 extern int nb_numa_nodes;   /* Number of NUMA nodes */
 extern bool have_numa_distance;
@@ -32,5 +35,7 @@ void numa_legacy_auto_assign_ram(MachineClass *mc, NodeInfo *nodes,
  int nb_nodes, ram_addr_t size);
 void numa_default_auto_assign_ram(MachineClass *mc, NodeInfo *nodes,
   int nb_nodes, ram_addr_t size);
-void numa_cpu_pre_plug(const CPUArchId *slot, DeviceState *dev, Error **errp);
+void numa_cpu_pre_plug(const struct CPUArchId *slot, DeviceState *dev,
+   Error **errp);
+
 #endif
diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index 1f3e676066..dea48f9163 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -19,6 +19,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "hw/boards.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/qdev-properties.h"
 #include "migration/vmstate.h"
-- 
2.21.0
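
Editorial sketch of the technique used in PULL 23/29 above: a header that only passes pointers to a struct can declare the struct tag instead of including the defining header. This works only if the struct has a tag, which is why the patch changes the anonymous "typedef struct {" in hw/boards.h to "typedef struct CPUArchId {". The code below is a simplified stand-alone illustration, not QEMU code: slot_arch_id() is a hypothetical helper, and the struct is cut down to two members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Part 1: what a lightweight header (think sysemu/numa.h) needs.
 * Declaring the tag creates an incomplete type; no definition required.
 */
struct CPUArchId;

/* Hypothetical helper: a prototype may use a pointer to an incomplete type. */
uint64_t slot_arch_id(const struct CPUArchId *slot);

/*
 * Part 2: what the defining header (think hw/boards.h) provides.
 * The tag "CPUArchId" is what makes the forward declaration above possible;
 * an anonymous "typedef struct { ... } CPUArchId;" could not be named early.
 * Simplified: the real struct has more members.
 */
typedef struct CPUArchId {
    uint64_t arch_id;
    int64_t vcpus_count;
} CPUArchId;

/* Part 3: only code that dereferences the pointer needs the full definition. */
uint64_t slot_arch_id(const struct CPUArchId *slot)
{
    assert(slot != NULL);
    return slot->arch_id;
}
```

A caller that has the full definition in scope can use either spelling, "CPUArchId" or "struct CPUArchId"; both name the same type.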




[Qemu-devel] [PULL 25/29] numa: Move remaining NUMA declarations from sysemu.h to numa.h

2019-08-13 Thread Markus Armbruster
Commit e35704ba9c "numa: Move NUMA declarations from sysemu.h to
numa.h" left a few NUMA-related macros behind.  Move them now.

Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Signed-off-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Eduardo Habkost 
Message-Id: <20190812052359.30071-26-arm...@redhat.com>
---
 include/sysemu/hostmem.h | 2 +-
 include/sysemu/numa.h| 9 +++--
 include/sysemu/sysemu.h  | 7 ---
 exec.c   | 2 +-
 hw/core/numa.c   | 1 +
 hw/mem/pc-dimm.c | 1 +
 hw/pci/pci.c | 2 +-
 hw/ppc/spapr.c   | 1 +
 8 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
index afeb5db1b1..4dbdadd39e 100644
--- a/include/sysemu/hostmem.h
+++ b/include/sysemu/hostmem.h
@@ -13,7 +13,7 @@
 #ifndef SYSEMU_HOSTMEM_H
 #define SYSEMU_HOSTMEM_H
 
-#include "sysemu/sysemu.h" /* for MAX_NODES */
+#include "sysemu/numa.h"
 #include "qapi/qapi-types-machine.h"
 #include "qom/object.h"
 #include "exec/memory.h"
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 4c4c1dee9b..7a4ce89765 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -2,13 +2,18 @@
 #define SYSEMU_NUMA_H
 
 #include "qemu/bitmap.h"
-#include "sysemu/sysemu.h"
-#include "sysemu/hostmem.h"
 #include "qapi/qapi-types-machine.h"
 #include "exec/cpu-common.h"
 
 struct CPUArchId;
 
+#define MAX_NODES 128
+#define NUMA_NODE_UNASSIGNED MAX_NODES
+#define NUMA_DISTANCE_MIN 10
+#define NUMA_DISTANCE_DEFAULT 20
+#define NUMA_DISTANCE_MAX 254
+#define NUMA_DISTANCE_UNREACHABLE 255
+
 extern int nb_numa_nodes;   /* Number of NUMA nodes */
 extern bool have_numa_distance;
 
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index ac18a1184a..227202999d 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -117,13 +117,6 @@ extern QEMUClockType rtc_clock;
 extern const char *mem_path;
 extern int mem_prealloc;
 
-#define MAX_NODES 128
-#define NUMA_NODE_UNASSIGNED MAX_NODES
-#define NUMA_DISTANCE_MIN 10
-#define NUMA_DISTANCE_DEFAULT 20
-#define NUMA_DISTANCE_MAX 254
-#define NUMA_DISTANCE_UNREACHABLE 255
-
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
 const char *name;
diff --git a/exec.c b/exec.c
index 78f849de99..4aaa14b075 100644
--- a/exec.c
+++ b/exec.c
@@ -45,7 +45,7 @@
 #include "exec/memory.h"
 #include "exec/ioport.h"
 #include "sysemu/dma.h"
-#include "sysemu/numa.h"
+#include "sysemu/hostmem.h"
 #include "sysemu/hw_accel.h"
 #include "exec/address-spaces.h"
 #include "sysemu/xen-mapcache.h"
diff --git a/hw/core/numa.c b/hw/core/numa.c
index d817f06ead..450c522dd8 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -23,6 +23,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "sysemu/hostmem.h"
 #include "sysemu/numa.h"
 #include "exec/cpu-common.h"
 #include "exec/ramlist.h"
diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index dea48f9163..7c324a1329 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -28,6 +28,7 @@
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "qemu/module.h"
+#include "sysemu/hostmem.h"
 #include "sysemu/numa.h"
 #include "trace.h"
 
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 9001b81daa..4b6ffab13d 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -34,7 +34,7 @@
 #include "migration/vmstate.h"
 #include "monitor/monitor.h"
 #include "net/net.h"
-#include "sysemu/sysemu.h"
+#include "sysemu/numa.h"
 #include "hw/loader.h"
 #include "qemu/error-report.h"
 #include "qemu/range.h"
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 06d23a5004..4044e61a0c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -29,6 +29,7 @@
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/hostmem.h"
 #include "sysemu/numa.h"
 #include "sysemu/qtest.h"
 #include "sysemu/reset.h"
-- 
2.21.0




[Qemu-devel] [PULL 20/29] Include qemu/main-loop.h less

2019-08-13 Thread Markus Armbruster
In my "build everything" tree, changing qemu/main-loop.h triggers a
recompile of some 5600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).  It includes block/aio.h,
which in turn includes qemu/event_notifier.h, qemu/notify.h,
qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h,
qemu/thread.h, qemu/timer.h, and a few more.

Include qemu/main-loop.h only where it's needed.  Touching it now
recompiles only some 1700 objects.  For block/aio.h and
qemu/event_notifier.h, these numbers drop from 5600 to 2800.  For the
others, they shrink only slightly.

Signed-off-by: Markus Armbruster 
Message-Id: <20190812052359.30071-21-arm...@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
---
 fsdev/qemu-fsdev-throttle.h | 1 -
 hw/9pfs/coth.h  | 1 -
 include/block/block_int.h   | 1 -
 include/chardev/char-fe.h   | 1 +
 include/chardev/char-io.h   | 1 +
 include/chardev/char.h  | 2 +-
 include/hw/scsi/scsi.h  | 1 +
 include/sysemu/sysemu.h | 1 -
 nbd/nbd-internal.h  | 1 -
 ui/vnc-auth-sasl.h  | 1 -
 accel/kvm/kvm-all.c | 1 +
 block.c | 1 +
 block/block-backend.c   | 1 +
 block/create.c  | 1 +
 block/io.c  | 1 +
 block/nbd.c | 1 +
 block/nfs.c | 1 +
 block/nvme.c| 1 +
 block/qcow2.c   | 1 +
 block/qed.c | 1 +
 block/sheepdog.c| 1 +
 block/throttle-groups.c | 1 +
 blockdev.c  | 1 +
 blockjob.c  | 1 +
 chardev/baum.c  | 1 +
 chardev/char-pipe.c | 1 +
 chardev/char-win-stdio.c| 1 +
 chardev/char-win.c  | 1 +
 dump/dump.c | 1 +
 fsdev/qemu-fsdev-throttle.c | 1 +
 hw/9pfs/9p.c| 1 +
 hw/9pfs/codir.c | 1 +
 hw/9pfs/cofile.c| 1 +
 hw/9pfs/cofs.c  | 1 +
 hw/9pfs/coth.c  | 1 +
 hw/9pfs/coxattr.c   | 1 +
 hw/9pfs/xen-9p-backend.c| 1 +
 hw/arm/omap1.c  | 1 +
 hw/block/dataplane/virtio-blk.c | 1 +
 hw/block/dataplane/xen-block.c  | 1 +
 hw/block/fdc.c  | 1 +
 hw/block/xen-block.c| 1 +
 hw/char/virtio-serial-bus.c | 1 +
 hw/core/machine-qmp-cmds.c  | 1 +
 hw/display/qxl.c| 1 +
 hw/dma/etraxfs_dma.c| 1 +
 hw/i386/intel_iommu.c   | 1 +
 hw/i386/xen/xen-hvm.c   | 1 +
 hw/ide/ahci.c   | 1 +
 hw/ide/core.c   | 1 +
 hw/ide/qdev.c   | 1 +
 hw/intc/s390_flic.c | 1 +
 hw/m68k/mcf5206.c   | 1 +
 hw/m68k/mcf5208.c   | 1 +
 hw/misc/imx6_src.c  | 1 +
 hw/net/fsl_etsec/etsec.c| 1 +
 hw/net/lan9118.c| 1 +
 hw/net/vhost_net.c  | 2 +-
 hw/net/virtio-net.c | 1 +
 hw/ppc/ppc.c| 1 +
 hw/ppc/ppc440_uc.c  | 1 +
 hw/ppc/spapr_hcall.c| 1 +
 hw/ppc/spapr_rng.c  | 1 +
 hw/scsi/mptsas.c| 1 +
 hw/scsi/scsi-disk.c | 1 +
 hw/scsi/vmw_pvscsi.c| 1 +
 hw/timer/allwinner-a10-pit.c| 1 +
 hw/timer/altera_timer.c | 1 +
 hw/timer/etraxfs_timer.c| 1 +
 hw/timer/exynos4210_rtc.c   | 1 +
 hw/timer/milkymist-sysctl.c | 1 +
 hw/usb/dev-uas.c| 1 +
 hw/usb/hcd-ehci.c   | 1 +
 hw/usb/host-libusb.c| 1 +
 hw/usb/xen-usb.c| 1 +
 hw/vfio/ccw.c   | 1 +
 hw/vfio/common.c| 1 +
 hw/vfio/pci.c   | 1 +
 hw/vfio/platform.c  | 1 +
 hw/virtio/vhost-backend.c   | 1 +
 hw/virtio/vhost-user.c  | 1 +
 hw/virtio/virtio-crypto.c   | 1 +
 hw/virtio/virtio-pmem.c | 1 +
 hw/virtio/virtio.c  | 1 +
 hw/xen/xen-legacy-backend.c | 1 +
 hw/xen/xen_pvdev.c  | 1 +
 memory.c| 1 +
 migration/block.c   | 1 +
 migration/colo.c| 1 +
 migration/migration.c   | 1 +
 migration/savevm.c  | 1 +
 net/can/can_socketcan.c | 1 +
 net/netmap.c| 1 +
 net/tap-win32.c | 1 +
 net/tap.c   | 1 +
 qemu-img.c  | 1 +
 qom/cpu.c   | 1 +
 replay/replay-internal.c| 1 +
 target/arm/helper-a64.c | 1 +
 target/arm/helper.c | 2 ++
 target/arm/kvm.c| 1 +
 target/arm/kvm64.c  | 1 +
 target/arm/m_helper.c   | 2 ++
 target/arm/psci.c   | 2 ++
 target/i386/kvm.c   | 1 +
 target/lm32/op_helper.c | 1 +
 target/mips/kvm.c   | 1 +
 target/ppc/int_helper.c | 2 ++
 

[Qemu-devel] [PULL 28/29] sysemu: Move the VMChangeStateEntry typedef to qemu/typedefs.h

2019-08-13 Thread Markus Armbruster
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 1800 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h, down from 5400 due to the
previous commit).

Several headers include sysemu/sysemu.h just to get typedef
VMChangeStateEntry.  Move it from sysemu/sysemu.h to qemu/typedefs.h.
Spell its structure tag the same while there.  Drop the now
superfluous includes of sysemu/sysemu.h from headers.

Touching sysemu/sysemu.h now recompiles some 1100 objects.
qemu/uuid.h also drops from 1800 to 1100, and
qapi/qapi-types-run-state.h from 5000 to 4400.

Signed-off-by: Markus Armbruster 
Message-Id: <20190812052359.30071-29-arm...@redhat.com>
Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
---
 hw/usb/hcd-ehci.h   | 1 -
 include/hw/ide/internal.h   | 3 ++-
 include/hw/ppc/spapr_xive.h | 1 -
 include/hw/scsi/scsi.h  | 1 -
 include/hw/virtio/virtio.h  | 1 -
 include/qemu/typedefs.h | 1 +
 include/sysemu/sysemu.h | 1 -
 hw/block/vhost-user-blk.c   | 1 +
 hw/block/virtio-blk.c   | 1 +
 hw/display/virtio-gpu.c | 1 +
 hw/misc/macio/macio.c   | 1 +
 hw/net/virtio-net.c | 1 +
 hw/s390x/s390-ccw.c | 1 +
 hw/s390x/s390-virtio-ccw.c  | 1 +
 hw/scsi/scsi-bus.c  | 1 +
 hw/scsi/vhost-scsi.c| 1 +
 hw/scsi/vhost-user-scsi.c   | 1 +
 hw/usb/hcd-ehci.c   | 1 +
 hw/virtio/virtio-rng.c  | 1 +
 hw/virtio/virtio.c  | 1 +
 vl.c| 6 +++---
 21 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/hw/usb/hcd-ehci.h b/hw/usb/hcd-ehci.h
index fdbcfdcbeb..0298238f0b 100644
--- a/hw/usb/hcd-ehci.h
+++ b/hw/usb/hcd-ehci.h
@@ -21,7 +21,6 @@
 #include "qemu/timer.h"
 #include "hw/usb.h"
 #include "sysemu/dma.h"
-#include "sysemu/sysemu.h"
 #include "hw/pci/pci.h"
 #include "hw/sysbus.h"
 
diff --git a/include/hw/ide/internal.h b/include/hw/ide/internal.h
index c6954c1d56..52ec197da0 100644
--- a/include/hw/ide/internal.h
+++ b/include/hw/ide/internal.h
@@ -6,11 +6,12 @@
  * only files in hw/ide/ are supposed to include this file.
  * non-internal declarations are in hw/ide.h
  */
+
+#include "qapi/qapi-types-run-state.h"
 #include "hw/ide.h"
 #include "hw/irq.h"
 #include "hw/isa/isa.h"
 #include "sysemu/dma.h"
-#include "sysemu/sysemu.h"
 #include "hw/block/block.h"
 #include "scsi/constants.h"
 
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index a39e672f27..bfd40f01d8 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -12,7 +12,6 @@
 
 #include "hw/ppc/spapr_irq.h"
 #include "hw/ppc/xive.h"
-#include "sysemu/sysemu.h"
 
 #define TYPE_SPAPR_XIVE "spapr-xive"
 #define SPAPR_XIVE(obj) OBJECT_CHECK(SpaprXive, (obj), TYPE_SPAPR_XIVE)
diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h
index 2bfaad0fe9..d77a92361b 100644
--- a/include/hw/scsi/scsi.h
+++ b/include/hw/scsi/scsi.h
@@ -4,7 +4,6 @@
 #include "block/aio.h"
 #include "hw/block/block.h"
 #include "hw/qdev-core.h"
-#include "sysemu/sysemu.h"
 #include "scsi/utils.h"
 #include "qemu/notify.h"
 
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index df40a46d60..48e8d04ff6 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -17,7 +17,6 @@
 #include "exec/memory.h"
 #include "hw/qdev-core.h"
 #include "net/net.h"
-#include "sysemu/sysemu.h"
 #include "migration/vmstate.h"
 #include "qemu/event_notifier.h"
 #include "standard-headers/linux/virtio_config.h"
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index f569f5f270..3fcdde8bfc 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -102,6 +102,7 @@ typedef struct SHPCDevice SHPCDevice;
 typedef struct SSIBus SSIBus;
 typedef struct VirtIODevice VirtIODevice;
 typedef struct Visitor Visitor;
+typedef struct VMChangeStateEntry VMChangeStateEntry;
 typedef struct VMStateDescription VMStateDescription;
 
 /*
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 908f158677..7606eaaf2a 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -22,7 +22,6 @@ void runstate_set(RunState new_state);
 int runstate_is_running(void);
 bool runstate_needs_reset(void);
 bool runstate_store(char *str, size_t size);
-typedef struct vm_change_state_entry VMChangeStateEntry;
 typedef void VMChangeStateHandler(void *opaque, int running, RunState state);
 
 VMChangeStateEntry *qemu_add_vm_change_state_handler(VMChangeStateHandler *cb,
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 7b44cca6d9..6b6cd07362 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -28,6 +28,7 @@
 #include "hw/virtio/virtio.h"
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
+#include "sysemu/sysemu.h"
 
 static const int user_feature_bits[] = {
 VIRTIO_BLK_F_SIZE_MAX,
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c

[Qemu-devel] [PULL 00/29] Header cleanup patches for 2019-08-13

2019-08-13 Thread Markus Armbruster
The following changes since commit 864ab314f1d924129d06ac7b571f105a2b76a4b2:

  Update version for v4.1.0-rc4 release (2019-08-06 17:05:21 +0100)

are available in the Git repository at:

  git://repo.or.cz/qemu/armbru.git tags/pull-include-2019-08-13

for you to fetch changes up to 8d111fd683b678d3826e192bc07ffcc349a118b5:

  sysemu: Split sysemu/runstate.h off sysemu/sysemu.h (2019-08-13 13:16:20 +0200)


Header cleanup patches for 2019-08-13


These patches are rather bothersome to rebase, so I'd like to get them
into 4.2 early.

Markus Armbruster (29):
  include: Make headers more self-contained
  Include generated QAPI headers less
  qapi: Split error.json off common.json
  memory: Fix type of IOMMUMemoryRegionClass member @parent_class
  queue: Drop superfluous #include qemu/atomic.h
  trace: Eliminate use of TARGET_FMT_plx
  trace: Do not include qom/cpu.h into generated trace.h
  Include sysemu/reset.h a lot less
  Include migration/qemu-file-types.h a lot less
  ide: Include hw/ide/internal a bit less outside hw/ide/
  typedefs: Separate incomplete types and function types
  Include hw/irq.h a lot less
  Clean up inclusion of exec/cpu-common.h
  migration: Move the VMStateDescription typedef to typedefs.h
  Include migration/vmstate.h less
  Include exec/memory.h slightly less
  Include qom/object.h slightly less
  Include hw/hw.h exactly where needed
  Include qemu/queue.h slightly less
  Include qemu/main-loop.h less
  Include hw/qdev-properties.h less
  Include hw/boards.h a bit less
  numa: Don't include hw/boards.h into sysemu/numa.h
  Include sysemu/hostmem.h less
  numa: Move remaining NUMA declarations from sysemu.h to numa.h
  Clean up inclusion of sysemu/sysemu.h
  Include sysemu/sysemu.h a lot less
  sysemu: Move the VMChangeStateEntry typedef to qemu/typedefs.h
  sysemu: Split sysemu/runstate.h off sysemu/sysemu.h

 qapi/common.json | 24 ---
 qapi/error.json  | 29 ++
 qapi/qapi-schema.json|  1 +
 fsdev/qemu-fsdev-throttle.h  |  1 -
 hw/9pfs/coth.h   |  1 -
 hw/alpha/alpha_sys.h |  1 -
 hw/audio/intel-hda.h |  2 +-
 hw/audio/lm4549.h|  1 +
 hw/display/qxl.h |  1 -
 hw/hppa/hppa_sys.h   |  1 -
 hw/i386/amd_iommu.h  |  1 -
 hw/ide/ahci_internal.h   |  1 +
 hw/lm32/lm32.h   |  1 +
 hw/lm32/milkymist-hw.h   |  2 +-
 hw/microblaze/boot.h |  1 -
 hw/net/can/can_sja1000.h |  1 +
 hw/net/fsl_etsec/etsec.h |  1 -
 hw/net/ne2000.h  |  1 -
 hw/net/pcnet.h   |  1 +
 hw/nios2/boot.h  |  1 -
 hw/ppc/mac.h |  1 -
 hw/s390x/ipl.h   |  2 +-
 hw/usb/ccid.h|  2 +-
 hw/usb/hcd-ehci.h|  2 -
 hw/xtensa/xtensa_memory.h|  1 -
 include/authz/listfile.h |  1 -
 include/block/block.h|  1 -
 include/block/block_int.h|  1 -
 include/block/raw-aio.h  |  2 +
 include/block/write-threshold.h  |  2 +
 include/chardev/char-fe.h|  1 +
 include/chardev/char-io.h|  1 +
 include/chardev/char.h   |  2 +-
 include/disas/disas.h|  1 +
 include/exec/cpu-defs.h  |  1 -
 include/exec/cputlb.h|  3 ++
 include/exec/exec-all.h  |  1 +
 include/exec/ioport.h|  2 +
 include/exec/memory-internal.h   |  2 +
 include/exec/memory.h| 10 -
 include/exec/ram_addr.h  |  1 +
 include/exec/softmmu-semi.h  |  2 +
 include/exec/tb-hash.h   |  2 +
 include/exec/user/thunk.h|  2 +
 include/fpu/softfloat-macros.h   |  2 +
 include/hw/acpi/acpi.h   |  1 -
 include/hw/acpi/acpi_dev_interface.h |  2 +
 include/hw/acpi/pci.h|  3 ++
 include/hw/acpi/tco.h|  2 +
 include/hw/acpi/vmgenid.h|  2 +-
 include/hw/adc/stm32f2xx_adc.h   |  2 +
 include/hw/arm/allwinner-a10.h   |  3 +-
 include/hw/arm/aspeed_soc.h  |  1 +
 include/hw/arm/bcm2836.h |  1 +
 include/hw/arm/boot.h|  2 -
 include/hw/arm/exynos4210.h  |  3 +-
 include/hw/arm/fsl-imx25.h   |  1 +
 include/hw/arm/fsl-imx31.h   |  1 +
 

[Qemu-devel] [PULL 26/29] Clean up inclusion of sysemu/sysemu.h

2019-08-13 Thread Markus Armbruster
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 5400 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

Almost a third of its inclusions are actually superfluous.  Delete
them.  Downgrade two more to qapi/qapi-types-run-state.h, and move one
from char/serial.h to char/serial.c.

hw/semihosting/config.c, monitor/monitor.c, qdev-monitor.c, and
stubs/semihost.c define variables declared in sysemu/sysemu.h without
including it.  The compiler is cool with that, but include it anyway.

This doesn't reduce actual use much, as it's still included into
widely included headers.  The next commit will tackle that.

Signed-off-by: Markus Armbruster 
Reviewed-by: Alistair Francis 
Message-Id: <20190812052359.30071-27-arm...@redhat.com>
Reviewed-by: Alex Bennée 
---
 hw/usb/hcd-ehci.h   | 1 -
 include/hw/arm/allwinner-a10.h  | 1 -
 include/hw/char/serial.h| 1 -
 include/hw/i386/pc.h| 1 -
 include/hw/riscv/riscv_htif.h   | 1 -
 include/hw/timer/stm32f2xx_timer.h  | 1 -
 include/hw/virtio/virtio-bus.h  | 1 -
 include/hw/xen/xen-legacy-backend.h | 1 -
 include/migration/global_state.h| 2 +-
 include/sysemu/kvm_int.h| 1 -
 include/sysemu/replay.h | 2 +-
 include/ui/spice-display.h  | 1 -
 accel/tcg/tcg-all.c | 1 -
 block/nfs.c | 1 -
 blockdev-nbd.c  | 1 -
 dump/win_dump.c | 1 -
 hw/acpi/pcihp.c | 1 -
 hw/acpi/vmgenid.c   | 1 -
 hw/alpha/pci.c  | 1 -
 hw/alpha/typhoon.c  | 1 -
 hw/arm/nrf51_soc.c  | 1 -
 hw/arm/smmu-common.c| 1 -
 hw/arm/smmuv3.c | 1 -
 hw/arm/sysbus-fdt.c | 1 -
 hw/arm/z2.c | 1 -
 hw/char/exynos4210_uart.c   | 1 -
 hw/char/imx_serial.c| 1 -
 hw/char/serial.c| 1 +
 hw/core/loader-fit.c| 1 -
 hw/core/platform-bus.c  | 1 -
 hw/core/qdev.c  | 1 -
 hw/display/ramfb-standalone.c   | 1 -
 hw/display/ramfb.c  | 1 -
 hw/dma/xlnx-zynq-devcfg.c   | 1 -
 hw/hppa/dino.c  | 1 -
 hw/hppa/pci.c   | 1 -
 hw/i2c/smbus_ich9.c | 1 -
 hw/ide/cmd646.c | 1 -
 hw/ide/ioport.c | 1 -
 hw/ide/piix.c   | 1 -
 hw/ide/via.c| 1 -
 hw/input/adb-kbd.c  | 1 -
 hw/intc/allwinner-a10-pic.c | 1 -
 hw/intc/mips_gic.c  | 1 -
 hw/intc/xics_pnv.c  | 1 -
 hw/ipmi/ipmi_bmc_extern.c   | 1 -
 hw/isa/vt82c686.c   | 1 -
 hw/misc/armsse-cpuid.c  | 1 -
 hw/misc/armsse-mhu.c| 1 -
 hw/misc/imx6_src.c  | 1 -
 hw/misc/imx7_gpr.c  | 1 -
 hw/misc/iotkit-sysinfo.c| 1 -
 hw/misc/mips_cmgcr.c| 1 -
 hw/misc/mos6522.c   | 1 -
 hw/misc/sga.c   | 1 -
 hw/misc/zynq-xadc.c | 1 -
 hw/net/fsl_etsec/etsec.c| 1 -
 hw/net/lan9118.c| 1 -
 hw/net/ne2000.c | 1 -
 hw/net/opencores_eth.c  | 1 -
 hw/net/pcnet.c  | 1 -
 hw/nios2/generic_nommu.c| 1 -
 hw/pci-host/pam.c   | 1 -
 hw/ppc/pnv_bmc.c| 1 -
 hw/ppc/pnv_core.c   | 1 -
 hw/ppc/pnv_lpc.c| 1 -
 hw/ppc/pnv_occ.c| 1 -
 hw/ppc/ppce500_spin.c   | 1 -
 hw/ppc/spapr_rng.c  | 1 -
 hw/ppc/spapr_vio.c  | 1 -
 hw/s390x/event-facility.c   | 1 -
 hw/s390x/sclpcpu.c  | 1 -
 hw/s390x/virtio-ccw.c   | 1 -
 hw/scsi/scsi-disk.c | 1 -
 hw/sd/milkymist-memcard.c   | 1 -
 hw/semihosting/config.c | 1 +
 hw/ssi/aspeed_smc.c | 1 -
 hw/ssi/imx_spi.c| 1 -
 hw/ssi/xilinx_spi.c | 1 -
 hw/ssi/xilinx_spips.c   | 1 -
 hw/timer/allwinner-a10-pit.c| 1 -
 hw/timer/altera_timer.c | 1 -
 hw/timer/exynos4210_rtc.c   | 1 -
 hw/tricore/tricore_testboard.c  | 1 -
 hw/vfio/ap.c| 1 -
 hw/vfio/platform.c  | 1 -
 hw/xen/xen_pt_load_rom.c| 1 -
 hw/xtensa/xtensa_memory.c   | 1 -
 monitor/monitor.c   | 1 +
 net/tap-bsd.c   | 1 -
 net/tap-linux.c | 1 -
 net/tap-solaris.c   | 1 -
 net/tap-win32.c | 1 -
 qdev-monitor.c  | 1 +
 qemu-img.c  | 1 -
 qom/cpu.c   | 1 -
 replay/replay-audio.c  

[Qemu-devel] [PULL 16/29] Include exec/memory.h slightly less

2019-08-13 Thread Markus Armbruster
Drop unnecessary inclusions from headers.  Downgrade a few more to
exec/hwaddr.h.

Signed-off-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Message-Id: <20190812052359.30071-17-arm...@redhat.com>
---
 hw/audio/lm4549.h   | 1 +
 hw/net/can/can_sja1000.h| 1 +
 hw/xtensa/xtensa_memory.h   | 1 -
 include/hw/arm/boot.h   | 1 -
 include/hw/arm/fsl-imx7.h   | 1 -
 include/hw/arm/soc_dma.h| 2 +-
 include/hw/block/flash.h| 2 +-
 include/hw/boards.h | 1 +
 include/hw/char/parallel.h  | 1 -
 include/hw/display/milkymist_tmu2.h | 1 +
 include/hw/display/tc6393xb.h   | 2 --
 include/hw/display/vga.h| 2 +-
 include/hw/hw.h | 1 -
 include/hw/i2c/pm_smbus.h   | 1 +
 include/hw/i2c/smbus_eeprom.h   | 1 +
 include/hw/misc/auxbus.h| 1 +
 include/hw/ppc/xics.h   | 1 +
 include/hw/usb.h| 1 +
 include/hw/virtio/virtio.h  | 1 +
 migration/migration.h   | 1 +
 hw/display/edid-region.c| 1 +
 hw/display/tc6393xb.c   | 1 +
 hw/net/ne2000.c | 1 +
 migration/colo.c| 1 +
 migration/postcopy-ram.c| 1 +
 migration/rdma.c| 1 +
 26 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/hw/audio/lm4549.h b/hw/audio/lm4549.h
index 74c3ee8934..aba9bb5b07 100644
--- a/hw/audio/lm4549.h
+++ b/hw/audio/lm4549.h
@@ -13,6 +13,7 @@
 #define HW_LM4549_H
 
 #include "audio/audio.h"
+#include "exec/hwaddr.h"
 
 typedef void (*lm4549_callback)(void *opaque);
 
diff --git a/hw/net/can/can_sja1000.h b/hw/net/can/can_sja1000.h
index 4731cbbd2a..220a622087 100644
--- a/hw/net/can/can_sja1000.h
+++ b/hw/net/can/can_sja1000.h
@@ -27,6 +27,7 @@
 #ifndef HW_CAN_SJA1000_H
 #define HW_CAN_SJA1000_H
 
+#include "exec/hwaddr.h"
 #include "net/can_emu.h"
 
 #define CAN_SJA_MEM_SIZE  128
diff --git a/hw/xtensa/xtensa_memory.h b/hw/xtensa/xtensa_memory.h
index d50a30..af7e8025e3 100644
--- a/hw/xtensa/xtensa_memory.h
+++ b/hw/xtensa/xtensa_memory.h
@@ -29,7 +29,6 @@
 #define XTENSA_MEMORY_H
 
 #include "cpu.h"
-#include "exec/memory.h"
 
 void xtensa_create_memory_regions(const XtensaMemory *memory,
   const char *name,
diff --git a/include/hw/arm/boot.h b/include/hw/arm/boot.h
index 350d4b0498..5714dea1a2 100644
--- a/include/hw/arm/boot.h
+++ b/include/hw/arm/boot.h
@@ -11,7 +11,6 @@
 #ifndef HW_ARM_BOOT_H
 #define HW_ARM_BOOT_H
 
-#include "exec/memory.h"
 #include "target/arm/cpu-qom.h"
 #include "qemu/notify.h"
 
diff --git a/include/hw/arm/fsl-imx7.h b/include/hw/arm/fsl-imx7.h
index 8003d45d1e..706aef2e7e 100644
--- a/include/hw/arm/fsl-imx7.h
+++ b/include/hw/arm/fsl-imx7.h
@@ -38,7 +38,6 @@
 #include "hw/net/imx_fec.h"
 #include "hw/pci-host/designware.h"
 #include "hw/usb/chipidea.h"
-#include "exec/memory.h"
 #include "cpu.h"
 
 #define TYPE_FSL_IMX7 "fsl,imx7"
diff --git a/include/hw/arm/soc_dma.h b/include/hw/arm/soc_dma.h
index 7886291d54..e93a7499a8 100644
--- a/include/hw/arm/soc_dma.h
+++ b/include/hw/arm/soc_dma.h
@@ -21,7 +21,7 @@
 #ifndef HW_SOC_DMA_H
 #define HW_SOC_DMA_H
 
-#include "exec/memory.h"
+#include "exec/hwaddr.h"
 
 struct soc_dma_s;
 struct soc_dma_ch_s;
diff --git a/include/hw/block/flash.h b/include/hw/block/flash.h
index 1acaf7de80..2136a2d5e4 100644
--- a/include/hw/block/flash.h
+++ b/include/hw/block/flash.h
@@ -3,7 +3,7 @@
 
 /* NOR flash devices */
 
-#include "exec/memory.h"
+#include "exec/hwaddr.h"
 
 /* pflash_cfi01.c */
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index a71d1a53a5..3a0be3131a 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -3,6 +3,7 @@
 #ifndef HW_BOARDS_H
 #define HW_BOARDS_H
 
+#include "exec/memory.h"
 #include "sysemu/blockdev.h"
 #include "sysemu/accel.h"
 #include "hw/qdev.h"
diff --git a/include/hw/char/parallel.h b/include/hw/char/parallel.h
index d6dd62fb9f..0a23c0f57e 100644
--- a/include/hw/char/parallel.h
+++ b/include/hw/char/parallel.h
@@ -1,7 +1,6 @@
 #ifndef HW_PARALLEL_H
 #define HW_PARALLEL_H
 
-#include "exec/memory.h"
 #include "hw/isa/isa.h"
 #include "chardev/char.h"
 
diff --git a/include/hw/display/milkymist_tmu2.h b/include/hw/display/milkymist_tmu2.h
index 148a119a1d..1fd978dcc5 100644
--- a/include/hw/display/milkymist_tmu2.h
+++ b/include/hw/display/milkymist_tmu2.h
@@ -27,6 +27,7 @@
 #ifndef HW_DISPLAY_MILKYMIST_TMU2_H
 #define HW_DISPLAY_MILKYMIST_TMU2_H
 
+#include "exec/hwaddr.h"
 #include "hw/qdev.h"
 
 #if defined(CONFIG_X11) && defined(CONFIG_OPENGL)
diff --git a/include/hw/display/tc6393xb.h b/include/hw/display/tc6393xb.h
index c653ef717b..f9263bf98a 100644
--- a/include/hw/display/tc6393xb.h
+++ b/include/hw/display/tc6393xb.h
@@ -12,8 +12,6 @@
 #ifndef HW_DISPLAY_TC6393XB_H
 #define HW_DISPLAY_TC6393XB_H
 
-#include "exec/memory.h"
-
 

[Qemu-devel] [PULL 22/29] Include hw/boards.h a bit less

2019-08-13 Thread Markus Armbruster
hw/boards.h pulls in almost 60 headers.  The less we include it into
headers, the better.  As a first step, drop superfluous inclusions,
and downgrade some more to what's actually needed.  Gets rid of just
one inclusion into a header.

Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Signed-off-by: Markus Armbruster 
Reviewed-by: Alistair Francis 
Message-Id: <20190812052359.30071-23-arm...@redhat.com>
---
 include/hw/mem/pc-dimm.h| 1 -
 backends/cryptodev-builtin.c| 1 -
 backends/cryptodev-vhost-user.c | 1 -
 backends/cryptodev.c| 1 -
 hw/acpi/ich9.c  | 1 +
 hw/alpha/dp264.c| 1 -
 hw/alpha/typhoon.c  | 1 +
 hw/arm/boot.c   | 1 -
 hw/arm/exynos4210.c | 2 +-
 hw/arm/fsl-imx25.c  | 1 -
 hw/arm/fsl-imx31.c  | 1 -
 hw/arm/msf2-soc.c   | 1 -
 hw/arm/nrf51_soc.c  | 1 -
 hw/arm/omap1.c  | 1 +
 hw/arm/omap2.c  | 1 +
 hw/arm/smmuv3.c | 1 -
 hw/arm/virt.c   | 1 +
 hw/core/numa.c  | 2 ++
 hw/i386/pc_piix.c   | 1 -
 hw/i386/pc_q35.c| 1 -
 hw/i386/pc_sysfw.c  | 1 -
 hw/ppc/e500plat.c   | 1 -
 hw/ppc/mpc8544ds.c  | 1 -
 hw/ppc/pnv.c| 1 +
 hw/ppc/ppc405_uc.c  | 1 -
 hw/ppc/spapr_cpu_core.c | 1 -
 hw/ppc/spapr_vio.c  | 1 -
 hw/riscv/boot.c | 2 +-
 hw/s390x/s390-stattrib.c| 1 -
 hw/xtensa/xtensa_memory.c   | 1 -
 monitor/qmp-cmds.c  | 1 -
 target/alpha/machine.c  | 1 -
 target/arm/machine.c| 1 -
 target/arm/monitor.c| 1 -
 target/hppa/machine.c   | 1 -
 target/i386/hvf/hvf.c   | 1 -
 target/i386/hvf/x86_task.c  | 1 -
 target/i386/machine.c   | 1 -
 target/i386/whpx-all.c  | 1 -
 target/lm32/machine.c   | 1 -
 target/moxie/machine.c  | 1 -
 target/openrisc/machine.c   | 1 -
 target/ppc/machine.c| 1 -
 target/sparc/machine.c  | 1 -
 44 files changed, 10 insertions(+), 37 deletions(-)

diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
index 66dee284ac..47b246f95c 100644
--- a/include/hw/mem/pc-dimm.h
+++ b/include/hw/mem/pc-dimm.h
@@ -19,7 +19,6 @@
 #include "exec/memory.h"
 #include "sysemu/hostmem.h"
 #include "hw/qdev-core.h"
-#include "hw/boards.h"
 
 #define TYPE_PC_DIMM "pc-dimm"
 #define PC_DIMM(obj) \
diff --git a/backends/cryptodev-builtin.c b/backends/cryptodev-builtin.c
index 9fb0bd57a6..c8ae3b9742 100644
--- a/backends/cryptodev-builtin.c
+++ b/backends/cryptodev-builtin.c
@@ -23,7 +23,6 @@
 
 #include "qemu/osdep.h"
 #include "sysemu/cryptodev.h"
-#include "hw/boards.h"
 #include "qapi/error.h"
 #include "standard-headers/linux/virtio_crypto.h"
 #include "crypto/cipher.h"
diff --git a/backends/cryptodev-vhost-user.c b/backends/cryptodev-vhost-user.c
index 1052a5d0e9..b344283940 100644
--- a/backends/cryptodev-vhost-user.c
+++ b/backends/cryptodev-vhost-user.c
@@ -22,7 +22,6 @@
  */
 
 #include "qemu/osdep.h"
-#include "hw/boards.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/error-report.h"
diff --git a/backends/cryptodev.c b/backends/cryptodev.c
index f35be377ef..3c071eab95 100644
--- a/backends/cryptodev.c
+++ b/backends/cryptodev.c
@@ -23,7 +23,6 @@
 
 #include "qemu/osdep.h"
 #include "sysemu/cryptodev.h"
-#include "hw/boards.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "qemu/config-file.h"
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 39649cbe6a..c1aaa07d43 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -31,6 +31,7 @@
 #include "hw/pci/pci.h"
 #include "migration/vmstate.h"
 #include "qemu/timer.h"
+#include "qom/cpu.h"
 #include "sysemu/reset.h"
 #include "sysemu/sysemu.h"
 #include "hw/acpi/acpi.h"
diff --git a/hw/alpha/dp264.c b/hw/alpha/dp264.c
index 546b89bbcc..51feee8558 100644
--- a/hw/alpha/dp264.c
+++ b/hw/alpha/dp264.c
@@ -11,7 +11,6 @@
 #include "cpu.h"
 #include "elf.h"
 #include "hw/loader.h"
-#include "hw/boards.h"
 #include "alpha_sys.h"
 #include "qemu/error-report.h"
 #include "sysemu/sysemu.h"
diff --git a/hw/alpha/typhoon.c b/hw/alpha/typhoon.c
index 5d7f8f3342..1c0565acc1 100644
--- a/hw/alpha/typhoon.c
+++ b/hw/alpha/typhoon.c
@@ -11,6 +11,7 @@
 #include "qemu/units.h"
 #include "qapi/error.h"
 #include "cpu.h"
+#include "hw/boards.h"
 #include "hw/irq.h"
 #include "sysemu/sysemu.h"
 #include "alpha_sys.h"
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 8563672942..eff89ab80e 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -18,7 +18,6 @@
 #include "sysemu/sysemu.h"
 #include "sysemu/numa.h"
 #include "sysemu/reset.h"
-#include "hw/boards.h"
 #include "hw/loader.h"
 #include "elf.h"
 #include "sysemu/device_tree.h"
diff --git a/hw/arm/exynos4210.c b/hw/arm/exynos4210.c
index 0e403f3e78..a9f8a5c868 100644
--- a/hw/arm/exynos4210.c
+++ 

[Qemu-devel] [PULL 11/29] typedefs: Separate incomplete types and function types

2019-08-13 Thread Markus Armbruster

While there, drop the obsolete file comment.

Signed-off-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Message-Id: <20190812052359.30071-12-arm...@redhat.com>
Reviewed-by: Alex Bennée 
---
 include/qemu/typedefs.h | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index fcdaae58c4..29346648d4 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -1,10 +1,10 @@
 #ifndef QEMU_TYPEDEFS_H
 #define QEMU_TYPEDEFS_H
 
-/* A load of opaque types so that device init declarations don't have to
-   pull in all the real definitions.  */
-
-/* Please keep this list in case-insensitive alphabetical order */
+/*
+ * Incomplete struct types
+ * Please keep this list in case-insensitive alphabetical order.
+ */
 typedef struct AdapterInfo AdapterInfo;
 typedef struct AddressSpace AddressSpace;
 typedef struct AioContext AioContext;
@@ -101,6 +101,10 @@ typedef struct SHPCDevice SHPCDevice;
 typedef struct SSIBus SSIBus;
 typedef struct VirtIODevice VirtIODevice;
 typedef struct Visitor Visitor;
+
+/*
+ * Function types
+ */
 typedef void SaveStateHandler(QEMUFile *f, void *opaque);
 typedef int LoadStateHandler(QEMUFile *f, void *opaque, int version_id);
 
-- 
2.21.0
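
The split between incomplete struct types and function types in the patch above can be illustrated with a small standalone sketch. This is not QEMU code: Widget, WidgetHandler, widget_add and widget_apply are hypothetical names chosen for illustration, and the layout only assumes standard C.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Incomplete struct type (hypothetical name): pointers to Widget can be
 * declared and passed around before the struct definition is visible.
 */
typedef struct Widget Widget;

/*
 * Function type (not a pointer-to-function type), in the same style as
 * SaveStateHandler and LoadStateHandler.
 */
typedef int WidgetHandler(Widget *w, int delta);

/* Prototypes can use both typedefs without the full struct definition. */
int widget_apply(Widget *w, WidgetHandler *handler, int delta);

/* The full definition would normally live in its own header. */
struct Widget {
    int value;
};

/* A function type typedef can declare a function, but cannot define one. */
static WidgetHandler widget_add;

static int widget_add(Widget *w, int delta)
{
    w->value += delta;
    return w->value;
}

int widget_apply(Widget *w, WidgetHandler *handler, int delta)
{
    assert(handler != NULL);
    return handler(w, delta);
}
```

Code that only passes Widget pointers through needs nothing beyond the typedef, so in a layout like this it would not have to include the header that defines the struct.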




[Qemu-devel] [PULL 08/29] Include sysemu/reset.h a lot less

2019-08-13 Thread Markus Armbruster

In my "build everything" tree, changing sysemu/reset.h triggers a
recompile of some 2600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

The main culprit is hw/hw.h, which supposedly includes it for
convenience.

Include sysemu/reset.h only where it's needed.  Touching it now
recompiles less than 200 objects.

Signed-off-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Alistair Francis 
Tested-by: Philippe Mathieu-Daudé 
Message-Id: <20190812052359.30071-9-arm...@redhat.com>
---
 include/hw/hw.h| 1 -
 hw/acpi/ich9.c | 2 ++
 hw/acpi/piix4.c| 2 ++
 hw/acpi/vmgenid.c  | 1 +
 hw/arm/armv7m.c| 1 +
 hw/arm/boot.c  | 1 +
 hw/arm/nseries.c   | 1 +
 hw/arm/omap1.c | 1 +
 hw/arm/omap2.c | 1 +
 hw/arm/virt-acpi-build.c   | 1 +
 hw/char/parallel.c | 1 +
 hw/char/serial.c   | 1 +
 hw/core/generic-loader.c   | 1 +
 hw/core/loader.c   | 1 +
 hw/cris/boot.c | 1 +
 hw/display/cirrus_vga.c| 1 +
 hw/display/ramfb.c | 2 ++
 hw/display/vga.c   | 2 ++
 hw/hppa/machine.c  | 1 +
 hw/i386/acpi-build.c   | 1 +
 hw/i386/pc.c   | 1 +
 hw/ide/cmd646.c| 1 +
 hw/ide/piix.c  | 1 +
 hw/ide/sii3112.c   | 1 +
 hw/ide/via.c   | 1 +
 hw/input/lm832x.c  | 1 +
 hw/input/pckbd.c   | 2 ++
 hw/input/ps2.c | 2 ++
 hw/input/tsc2005.c | 1 +
 hw/input/tsc210x.c | 1 +
 hw/intc/mips_gic.c | 1 +
 hw/intc/pnv_xive.c | 1 +
 hw/intc/spapr_xive.c   | 1 +
 hw/intc/xics.c | 1 +
 hw/intc/xive.c | 1 +
 hw/isa/piix4.c | 1 +
 hw/isa/vt82c686.c  | 1 +
 hw/lm32/lm32_boards.c  | 1 +
 hw/lm32/milkymist.c| 1 +
 hw/microblaze/boot.c   | 1 +
 hw/mips/cps.c  | 1 +
 hw/mips/mips_fulong2e.c| 1 +
 hw/mips/mips_jazz.c| 1 +
 hw/mips/mips_malta.c   | 1 +
 hw/mips/mips_mipssim.c | 2 ++
 hw/mips/mips_r4k.c | 2 ++
 hw/misc/vmcoreinfo.c   | 1 +
 hw/moxie/moxiesim.c| 2 ++
 hw/net/eepro100.c  | 1 +
 hw/nios2/boot.c| 1 +
 hw/nvram/fw_cfg.c  | 1 +
 hw/openrisc/openrisc_sim.c | 1 +
 hw/pci-host/bonito.c   | 1 +
 hw/pci-host/piix.c | 1 +
 hw/ppc/e500.c  | 1 +
 hw/ppc/mac_newworld.c  | 1 +
 hw/ppc/mac_oldworld.c  | 1 +
 hw/ppc/pnv.c   | 1 +
 hw/ppc/pnv_core.c  | 1 +
 hw/ppc/pnv_psi.c   | 1 +
 hw/ppc/ppc405_boards.c | 2 ++
 hw/ppc/ppc405_uc.c | 2 ++
 hw/ppc/ppc440_bamboo.c | 1 +
 hw/ppc/ppc440_uc.c | 1 +
 hw/ppc/ppc4xx_devs.c   | 2 ++
 hw/ppc/ppc4xx_pci.c| 1 +
 hw/ppc/ppc_booke.c | 2 ++
 hw/ppc/prep.c  | 2 ++
 hw/ppc/sam460ex.c  | 1 +
 hw/ppc/spapr.c | 1 +
 hw/ppc/spapr_cpu_core.c| 2 ++
 hw/ppc/spapr_drc.c | 1 +
 hw/ppc/virtex_ml507.c  | 1 +
 hw/riscv/riscv_hart.c  | 1 +
 hw/s390x/ipl.c | 1 +
 hw/s390x/s390-virtio-ccw.c | 1 +
 hw/sh4/r2d.c   | 1 +
 hw/sparc/leon3.c   | 2 ++
 hw/sparc/sun4m.c   | 2 ++
 hw/sparc64/sparc64.c   | 1 +
 hw/timer/etraxfs_timer.c   | 1 +
 hw/timer/mc146818rtc.c | 1 +
 hw/tpm/tpm_ppi.c   | 1 -
 hw/vfio/common.c   | 1 +
 hw/watchdog/wdt_diag288.c  | 1 +
 hw/xtensa/sim.c| 1 +
 hw/xtensa/xtfpga.c | 1 +
 target/i386/cpu.c  | 1 +
 target/i386/hax-all.c  | 1 +
 target/i386/kvm.c  | 1 +
 target/s390x/cpu.c | 1 +
 vl.c   | 1 +
 92 files changed, 107 insertions(+), 2 deletions(-)

diff --git a/include/hw/hw.h b/include/hw/hw.h
index b1b79964b5..a4fb2390e8 100644
--- a/include/hw/hw.h
+++ b/include/hw/hw.h
@@ -12,7 +12,6 @@
 #include "hw/irq.h"
 #include "migration/vmstate.h"
 #include "migration/qemu-file-types.h"
-#include "sysemu/reset.h"
 
 void QEMU_NORETURN hw_error(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
 
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index e53dfe1ee3..b4d987c811 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -23,6 +23,7 @@
  * Contributions after 2012-01-13 are licensed under the terms of the
  * GNU GPL, version 2 or (at your option) any later version.
  */
+
 #include "qemu/osdep.h"
 #include "hw/hw.h"
 #include "qapi/error.h"
@@ -30,6 +31,7 @@
 #include "hw/i386/pc.h"
 #include "hw/pci/pci.h"
 #include "qemu/timer.h"
+#include "sysemu/reset.h"
 #include "sysemu/sysemu.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/tco.h"
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index ec4e186cec..a59e58d937 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -18,6 +18,7 @@
  * Contributions after 2012-01-13 are licensed under the terms of the
  * GNU GPL, version 2 or (at your option) any later version.
  */
+
 #include "qemu/osdep.h"
 #include "hw/hw.h"
 #include "hw/i386/pc.h"
@@ 

[Qemu-devel] [PULL 19/29] Include qemu/queue.h slightly less

2019-08-13 Thread Markus Armbruster

Signed-off-by: Markus Armbruster 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Message-Id: <20190812052359.30071-20-arm...@redhat.com>
---
 include/exec/cpu-defs.h | 1 -
 include/hw/xen/xen_common.h | 1 -
 include/net/can_emu.h   | 1 +
 include/net/filter.h| 1 +
 include/qemu/range.h| 2 --
 include/qom/object.h| 1 -
 include/sysemu/cryptodev.h  | 1 +
 include/sysemu/rng.h| 1 +
 include/sysemu/sysemu.h | 1 -
 linux-user/qemu.h   | 1 -
 nbd/nbd-internal.h  | 1 -
 hw/scsi/vhost-scsi.c| 1 -
 hw/vfio/ap.c| 1 -
 linux-user/elfload.c| 1 +
 linux-user/main.c   | 1 +
 linux-user/syscall.c| 1 +
 nbd/client.c| 1 +
 nbd/server.c| 1 +
 qapi/qapi-dealloc-visitor.c | 1 -
 target/i386/whpx-all.c  | 1 -
 ui/kbd-state.c  | 1 -
 util/vfio-helpers.c | 1 -
 22 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 9bc713a70b..57a9a4ffd9 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -25,7 +25,6 @@
 
 #include "qemu/host-utils.h"
 #include "qemu/thread.h"
-#include "qemu/queue.h"
 #ifdef CONFIG_TCG
 #include "tcg-target.h"
 #endif
diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
index 1e3ec4e16e..82e56339dd 100644
--- a/include/hw/xen/xen_common.h
+++ b/include/hw/xen/xen_common.h
@@ -16,7 +16,6 @@
 
 #include "hw/xen/xen.h"
 #include "hw/pci/pci.h"
-#include "qemu/queue.h"
 #include "hw/xen/trace.h"
 
 extern xc_interface *xen_xc;
diff --git a/include/net/can_emu.h b/include/net/can_emu.h
index 1da4d01b95..d4fc51b57d 100644
--- a/include/net/can_emu.h
+++ b/include/net/can_emu.h
@@ -28,6 +28,7 @@
 #ifndef NET_CAN_EMU_H
 #define NET_CAN_EMU_H
 
+#include "qemu/queue.h"
 #include "qom/object.h"
 
 /* NOTE: the following two structures is copied from . */
diff --git a/include/net/filter.h b/include/net/filter.h
index 9bc6fa3cc6..e8fb6259db 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -10,6 +10,7 @@
 #define QEMU_NET_FILTER_H
 
 #include "qapi/qapi-types-net.h"
+#include "qemu/queue.h"
 #include "qom/object.h"
 #include "net/queue.h"
 
diff --git a/include/qemu/range.h b/include/qemu/range.h
index 71b8b215c6..f62b363e0d 100644
--- a/include/qemu/range.h
+++ b/include/qemu/range.h
@@ -20,8 +20,6 @@
 #ifndef QEMU_RANGE_H
 #define QEMU_RANGE_H
 
-#include "qemu/queue.h"
-
 /*
  * Operations on 64 bit address ranges.
  * Notes:
diff --git a/include/qom/object.h b/include/qom/object.h
index 7bb82a7f56..128d00c77f 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -15,7 +15,6 @@
 #define QEMU_OBJECT_H
 
 #include "qapi/qapi-builtin-types.h"
-#include "qemu/queue.h"
 #include "qemu/module.h"
 
 struct TypeImpl;
diff --git a/include/sysemu/cryptodev.h b/include/sysemu/cryptodev.h
index 92bbb79131..a9afb7e5b5 100644
--- a/include/sysemu/cryptodev.h
+++ b/include/sysemu/cryptodev.h
@@ -23,6 +23,7 @@
 #ifndef CRYPTODEV_H
 #define CRYPTODEV_H
 
+#include "qemu/queue.h"
 #include "qom/object.h"
 
 /**
diff --git a/include/sysemu/rng.h b/include/sysemu/rng.h
index 2a02f47771..9b22c156f8 100644
--- a/include/sysemu/rng.h
+++ b/include/sysemu/rng.h
@@ -13,6 +13,7 @@
 #ifndef QEMU_RNG_H
 #define QEMU_RNG_H
 
+#include "qemu/queue.h"
 #include "qom/object.h"
 
 #define TYPE_RNG_BACKEND "rng-backend"
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 984c439ac9..77f5df59b0 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -3,7 +3,6 @@
 /* Misc. things related to the system emulator.  */
 
 #include "qapi/qapi-types-run-state.h"
-#include "qemu/queue.h"
 #include "qemu/timer.h"
 #include "qemu/notify.h"
 #include "qemu/main-loop.h"
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index aac0334627..f6f5fe5fbb 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -16,7 +16,6 @@
 #include "syscall_defs.h"
 #include "target_syscall.h"
 #include "exec/gdbstub.h"
-#include "qemu/queue.h"
 
 /* This is the size of the host kernel's sigset_t, needed where we make
  * direct system calls that take a sigset_t pointer and a size.
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 049f83df77..ec3d2e2ebc 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -28,7 +28,6 @@
 #endif
 
 #include "qemu/bswap.h"
-#include "qemu/queue.h"
 #include "qemu/main-loop.h"
 
 /* This is all part of the "official" NBD API.
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 343ca8be7a..83c9d83459 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -20,7 +20,6 @@
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
-#include "qemu/queue.h"
 #include "monitor/monitor.h"
 #include "migration/blocker.h"
 #include "hw/virtio/vhost-scsi.h"
diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index d1c86abb76..2bfc402037 100644
--- a/hw/vfio/ap.c
+++ 

[Qemu-devel] [PULL 09/29] Include migration/qemu-file-types.h a lot less

2019-08-13 Thread Markus Armbruster

In my "build everything" tree, changing migration/qemu-file-types.h
triggers a recompile of some 2600 out of 6600 objects (not counting
tests and objects that don't depend on qemu/osdep.h).

The culprit is again hw/hw.h, which supposedly includes it for
convenience.

Include migration/qemu-file-types.h only where it's needed.  Touching
it now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster 
Message-Id: <20190812052359.30071-10-arm...@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
---
 include/hw/hw.h | 1 -
 include/migration/cpu.h | 1 +
 hw/acpi/piix4.c | 1 +
 hw/block/virtio-blk.c   | 1 +
 hw/char/virtio-serial-bus.c | 1 +
 hw/display/virtio-gpu.c | 1 +
 hw/intc/apic_common.c   | 1 +
 hw/intc/s390_flic_kvm.c | 1 +
 hw/nvram/eeprom93xx.c   | 1 +
 hw/nvram/fw_cfg.c   | 1 +
 hw/pci-host/piix.c  | 1 +
 hw/pci/msix.c   | 1 +
 hw/pci/pci.c| 1 +
 hw/pci/shpc.c   | 1 +
 hw/ppc/spapr.c  | 1 +
 hw/s390x/s390-skeys.c   | 1 +
 hw/s390x/tod.c  | 1 +
 hw/s390x/virtio-ccw.c   | 1 +
 hw/scsi/mptsas.c| 1 +
 hw/scsi/scsi-bus.c  | 1 +
 hw/scsi/scsi-disk.c | 1 +
 hw/scsi/scsi-generic.c  | 1 +
 hw/scsi/virtio-scsi.c   | 1 +
 hw/timer/i8254_common.c | 1 +
 hw/timer/twl92230.c | 1 +
 hw/usb/redirect.c   | 1 +
 hw/virtio/vhost.c   | 1 +
 hw/virtio/virtio-mmio.c | 1 +
 hw/virtio/virtio-pci.c  | 1 +
 hw/virtio/virtio.c  | 1 +
 target/ppc/kvm.c| 1 +
 31 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/include/hw/hw.h b/include/hw/hw.h
index a4fb2390e8..b399627cbe 100644
--- a/include/hw/hw.h
+++ b/include/hw/hw.h
@@ -11,7 +11,6 @@
 #include "exec/memory.h"
 #include "hw/irq.h"
 #include "migration/vmstate.h"
-#include "migration/qemu-file-types.h"
 
 void QEMU_NORETURN hw_error(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
 
diff --git a/include/migration/cpu.h b/include/migration/cpu.h
index da1618d620..2a22470d0d 100644
--- a/include/migration/cpu.h
+++ b/include/migration/cpu.h
@@ -4,6 +4,7 @@
 #define MIGRATION_CPU_H
 
 #include "exec/cpu-defs.h"
+#include "migration/qemu-file-types.h"
 
 #if TARGET_LONG_BITS == 64
 #define qemu_put_betl qemu_put_be64
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index a59e58d937..0d8c821f37 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -40,6 +40,7 @@
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/acpi_dev_interface.h"
 #include "hw/xen/xen.h"
+#include "migration/qemu-file-types.h"
 #include "qom/cpu.h"
 #include "trace.h"
 
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index cbb3729158..1f40834d27 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -26,6 +26,7 @@
 # include 
 #endif
 #include "hw/virtio/virtio-bus.h"
+#include "migration/qemu-file-types.h"
 #include "hw/virtio/virtio-access.h"
 
 /* Config size before the discard support (hide associated config fields) */
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index f7a54f261b..b868e54d72 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -22,6 +22,7 @@
 #include "qapi/error.h"
 #include "qemu/iov.h"
 #include "qemu/module.h"
+#include "migration/qemu-file-types.h"
 #include "monitor/monitor.h"
 #include "qemu/error-report.h"
 #include "qemu/queue.h"
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 25d9e327fc..ed92071963 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -18,6 +18,7 @@
 #include "trace.h"
 #include "sysemu/dma.h"
 #include "hw/virtio/virtio.h"
+#include "migration/qemu-file-types.h"
 #include "hw/virtio/virtio-gpu.h"
 #include "hw/virtio/virtio-gpu-bswap.h"
 #include "hw/virtio/virtio-gpu-pixman.h"
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index e764a2bb03..7045761281 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -31,6 +31,7 @@
 #include "sysemu/kvm.h"
 #include "hw/qdev.h"
 #include "hw/sysbus.h"
+#include "migration/qemu-file-types.h"
 
 static int apic_irq_delivered;
 bool apic_report_tpr_access;
diff --git a/hw/intc/s390_flic_kvm.c b/hw/intc/s390_flic_kvm.c
index ff45b4ab0b..819aa5e198 100644
--- a/hw/intc/s390_flic_kvm.c
+++ b/hw/intc/s390_flic_kvm.c
@@ -22,6 +22,7 @@
 #include "hw/s390x/s390_flic.h"
 #include "hw/s390x/adapter.h"
 #include "hw/s390x/css.h"
+#include "migration/qemu-file-types.h"
 #include "trace.h"
 
 #define FLIC_SAVE_INITIAL_SIZE getpagesize()
diff --git a/hw/nvram/eeprom93xx.c b/hw/nvram/eeprom93xx.c
index 2db3d7cce6..5fc23df1d4 100644
--- a/hw/nvram/eeprom93xx.c
+++ b/hw/nvram/eeprom93xx.c
@@ -38,6 +38,7 @@
 #include "qemu/osdep.h"
 #include "hw/hw.h"
 #include "hw/nvram/eeprom93xx.h"
+#include "migration/qemu-file-types.h"
 
 /* Debug EEPROM emulation. */
 //~ #define DEBUG_EEPROM
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
