Re: [PATCH] docs: Add a QEMU Code of Conduct and Conflict Resolution Policy document

2021-03-30 Thread Thomas Huth

On 29/03/2021 22.59, Paolo Bonzini wrote:



On Mon, Mar 29, 2021 at 20:33, Daniel P. Berrangé wrote:


The obvious alternative is to import the contributor covenant

https://www.contributor-covenant.org/



The Contributor Covenant 1.x and 2.x are very different in that 2.x also 
includes conflict resolution. Unlike the code of conduct, the consequences 
of bad behavior are hard to generalize across multiple projects, so I would 
prefer the 1.x version anyway. The differences with the Django CoC aren't 
substantial.


Right. I also think we should use a code of conduct that allows us to keep 
the conflict resolution in a separate document.


Contributor Covenant 1.x is certainly an option, too, but IMHO it already 
uses quite rigorous language ("Project maintainers have the [...] 
responsibility to remove, edit, or reject comments, commits, code, wiki 
edits ...", "Project maintainers who do not [...] enforce the Code of 
Conduct may be permanently removed from the project team."), which could 
either scare people away from taking on maintainer responsibilities, or 
could be used to fire up arguments ("you are a maintainer, so according to 
the CoC you have to do this and that..."), which I'd rather avoid.
(Well, as you know, I'm not a native English speaker, so I might have 
gotten the tone wrong, but that's the impression I had after reading the 
text as a non-native speaker.)


That's why I'd prefer the Django CoC instead.

However this does mean being more careful about the language in the "custom" 
documents such as the conflict resolution policy.



Second, it isn't a static document. It evolves over
time, with new versions issued as the understanding of problematic
situations improves. We can choose to periodically update it to stay
current with broadly accepted norms.


This, however, has the same issues as the "or later" clause of the GPL (see 
the above example of 1.x vs 2.x for the Contributor Covenant). I don't think 
upgrading the CoC should be automatic, since there are no "compatibility" 
issues.


Agreed. We shouldn't auto-upgrade to a newer version of a CoC without 
reviewing the new clauses.



 > +If you are experiencing conflict, you should first address the perceived
 > +conflict directly with other involved parties, preferably through a
 > +real-time medium such as IRC. If this fails,


I agree with Daniel that this part should only be advisory. For example:

If you are experiencing conflict, please consider first addressing the 
perceived  conflict directly with other involved parties, preferably through 
a real-time medium such as IRC. If this fails or if you do not feel 
comfortable proceeding this way,...


Also this document doesn't mention anything about ensuring the
confidentiality/privacy for any complaints reported, which I
think is important to state explicitly.


Agreed, and also the part about keeping a record should be removed from the 
consequences part because it's a privacy regulation minefield.


Ok, thanks for the feedback, I'll try to incorporate it and send a v2.

 Thomas




Re: [PATCH v2 0/6] esp: fix asserts/segfaults discovered by fuzzer

2021-03-30 Thread Mark Cave-Ayland

On 18/03/2021 18:13, Paolo Bonzini wrote:


On 18/03/21 00:02, Mark Cave-Ayland wrote:

Recently there have been a number of issues raised on Launchpad as a result of
fuzzing the am53c974 (ESP) device. I spent some time over the past couple of
days checking to see if anything had improved since my last patchset: from
what I can tell the issues are still present, but the cmdfifo related failures
now assert rather than corrupting memory.

This patchset applied to master passes my local tests using the qtest fuzz test
cases added by Alexander for the following Launchpad bugs:

   https://bugs.launchpad.net/qemu/+bug/1919035
   https://bugs.launchpad.net/qemu/+bug/1919036
   https://bugs.launchpad.net/qemu/+bug/1910723
   https://bugs.launchpad.net/qemu/+bug/1909247
I'm posting this now, just before soft freeze, since I see that some of the issues
have recently been allocated CVEs, so it could be argued that, even though
they have existed for some time, they are worth fixing for 6.0.

Signed-off-by: Mark Cave-Ayland 

v2:
- Add Alexander's R-B tag for patch 2 and Phil's R-B for patch 3
- Add patch 4 for additional testcase provided in Alexander's patch 1 comment
- Move current_req NULL checks forward in DMA functions (fixes ASAN bug reported
   at https://bugs.launchpad.net/qemu/+bug/1909247/comments/6) in patch 3
- Add qtest for am53c974 containing a basic set of regression tests using the
   automatic test cases generated by the fuzzer as requested by Paolo


Mark Cave-Ayland (6):
   esp: don't underflow cmdfifo if no message out/command data is present
   esp: don't overflow cmdfifo if TC is larger than the cmdfifo size
   esp: ensure cmdfifo is not empty and current_dev is non-NULL
   esp: don't underflow fifo when writing to the device
   esp: always check current_req is not NULL before use in DMA callbacks
   tests/qtest: add tests for am53c974 device

  hw/scsi/esp.c   |  73 +
  tests/qtest/am53c974-test.c | 122 
  tests/qtest/meson.build |   1 +
  3 files changed, 171 insertions(+), 25 deletions(-)
  create mode 100644 tests/qtest/am53c974-test.c



Queued, thanks.

Paolo


Hi Paolo,

I had a quick look at Alex's updated test cases, and most of them are based on an 
incorrect assumption I made about the behaviour of fifo8_pop_buf(). Can you drop 
these for now? I will submit a v3 shortly, once I've given it a full run through 
my test images.



ATB,

Mark.



Re: [PATCH] ppc/spapr: Add support for H_SCM_HEALTH

2021-03-30 Thread Shivaprasad G Bhat

Hi Vaibhav,

Some comments inline..

On 3/29/21 9:52 PM, Vaibhav Jain wrote:

Add support for the H_SCM_HEALTH hcall described at [1] for spapr
nvdimms. This enables the guest to detect the 'unarmed' status of a
specific spapr nvdimm, identified by its DRC, and if it is unarmed, mark
the region backed by the nvdimm as read-only.

The patch adds h_scm_health() to handle the H_SCM_HEALTH hcall which
returns two 64-bit bitmaps (health bitmap, health bitmap mask) derived
from 'struct nvdimm->unarmed' member.

Linux kernel side changes to enable handling of 'unarmed' nvdimms for
ppc64 are proposed at [2].

References:
[1] "Hypercall Op-codes (hcalls)"
 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/powerpc/papr_hcalls.rst

[2] "powerpc/papr_scm: Mark nvdimm as unarmed if needed during probe"
 
https://lore.kernel.org/linux-nvdimm/20210329113103.476760-1-vaib...@linux.ibm.com/

Signed-off-by: Vaibhav Jain 
---
  hw/ppc/spapr_nvdimm.c  | 30 ++
  include/hw/ppc/spapr.h |  4 ++--
  2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c
index b46c36917c..e38740036d 100644
--- a/hw/ppc/spapr_nvdimm.c
+++ b/hw/ppc/spapr_nvdimm.c
@@ -31,6 +31,13 @@
  #include "qemu/range.h"
  #include "hw/ppc/spapr_numa.h"
  
+/* DIMM health bitmap indicators */

+/* SCM device is unable to persist memory contents */
+#define PAPR_PMEM_UNARMED (1ULL << (63 - 0))
+
+/* Bit status indicators for the health bitmap, indicating an unarmed dimm */
+#define PAPR_PMEM_UNARMED_MASK (PAPR_PMEM_UNARMED)
+
  bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, NVDIMMDevice *nvdimm,
 uint64_t size, Error **errp)
  {
@@ -467,6 +474,28 @@ static target_ulong h_scm_unbind_all(PowerPCCPU *cpu, 
SpaprMachineState *spapr,
  return H_SUCCESS;
  }
  
+static target_ulong h_scm_health(PowerPCCPU *cpu, SpaprMachineState *spapr,

+ target_ulong opcode, target_ulong *args)
+{
+uint32_t drc_index = args[0];
+SpaprDrc *drc = spapr_drc_by_index(drc_index);
+NVDIMMDevice *nvdimm;
+
+if (drc && spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PMEM) {
+return H_PARAMETER;
+}
+



Please check if drc->dev is not NULL too. DRCs are created in advance

and drc->dev may not be assigned if the device is not plugged yet.



+nvdimm = NVDIMM(drc->dev);
+
+/* Check if the nvdimm is unarmed and send its status via health bitmaps */
+args[0] = nvdimm->unarmed ? PAPR_PMEM_UNARMED_MASK : 0;



Please use object_property_get_bool to fetch the unarmed value.



+
+/* health bitmap mask same as the health bitmap */
+args[1] = args[0];
+
+return H_SUCCESS;
+}
+
  static void spapr_scm_register_types(void)
  {


...


Thanks,

Shivaprasad




[PING] [PATCH] [NFC] Mark locally used symbols as static.

2021-03-30 Thread Yuri Gribov
Hi all,

This patch makes locally used symbols static to enable more compiler
optimizations on them. Some of the symbols turned out not to be used
at all, so I marked them with ATTRIBUTE_UNUSED (as I wasn't sure whether
they were OK to delete).

The symbols have been identified with a pet project of mine:
https://github.com/yugr/Localizer

Link to patch: 
https://patchew.org/QEMU/cajotw+5ddmsr8qjqxaa1oht79rpmjcrwkybuartynr_ngux...@mail.gmail.com/

From 4e790fd06becfbbf6fb106ac52ae1e4515f1ac73 Mon Sep 17 00:00:00 2001
From: Yury Gribov 
Date: Sat, 20 Mar 2021 23:39:15 +0300
Subject: [PATCH] Mark locally used symbols as static.

Signed-off-by: Yury Gribov 
Acked-by: Max Filippov  (xtensa)
Acked-by: David Gibson  (ppc)
Reviewed-by: Stefan Hajnoczi  (tracetool)
Reviewed-by: Taylor Simpson  (hexagon)
---
 disas/alpha.c | 16 ++--
 disas/m68k.c  | 78 -
 disas/mips.c  | 14 ++--
 disas/nios2.c | 84 +--
 disas/ppc.c   | 26 +++---
 disas/riscv.c |  2 +-
 pc-bios/optionrom/linuxboot_dma.c |  4 +-
 scripts/tracetool/format/c.py |  2 +-
 target/hexagon/gen_dectree_import.c   |  2 +-
 target/hexagon/opcodes.c  |  2 +-
 target/i386/cpu.c |  2 +-
 target/s390x/cpu_models.c |  2 +-
 .../xtensa/core-dc232b/xtensa-modules.c.inc   |  2 +-
 .../xtensa/core-dc233c/xtensa-modules.c.inc   |  2 +-
 target/xtensa/core-de212/xtensa-modules.c.inc |  2 +-
 .../core-de233_fpu/xtensa-modules.c.inc   |  2 +-
 .../xtensa/core-dsp3400/xtensa-modules.c.inc  |  2 +-
 target/xtensa/core-fsf/xtensa-modules.c.inc   |  2 +-
 .../xtensa-modules.c.inc  |  2 +-
 .../core-test_kc705_be/xtensa-modules.c.inc   |  2 +-
 .../core-test_mmuhifi_c3/xtensa-modules.c.inc |  2 +-
 21 files changed, 125 insertions(+), 127 deletions(-)

diff --git a/disas/alpha.c b/disas/alpha.c
index 3db90fa665..361a4ed101 100644
--- a/disas/alpha.c
+++ b/disas/alpha.c
@@ -56,8 +56,8 @@ struct alpha_opcode
 /* The table itself is sorted by major opcode number, and is otherwise
in the order in which the disassembler should consider
instructions.  */
-extern const struct alpha_opcode alpha_opcodes[];
-extern const unsigned alpha_num_opcodes;
+static const struct alpha_opcode alpha_opcodes[];
+static const unsigned alpha_num_opcodes;

 /* Values defined for the flags field of a struct alpha_opcode.  */

@@ -137,8 +137,8 @@ struct alpha_operand
 /* Elements in the table are retrieved by indexing with values from
the operands field of the alpha_opcodes table.  */

-extern const struct alpha_operand alpha_operands[];
-extern const unsigned alpha_num_operands;
+static const struct alpha_operand alpha_operands[];
+static const unsigned alpha_num_operands;

 /* Values defined for the flags field of a struct alpha_operand.  */

@@ -293,7 +293,7 @@ static int extract_ev6hwjhint (unsigned, int *);
 
 /* The operands table  */

-const struct alpha_operand alpha_operands[] =
+static const struct alpha_operand alpha_operands[] =
 {
   /* The fields are bits, shift, insert, extract, flags */
   /* The zero index is used to indicate end-of-list */
@@ -424,7 +424,7 @@ const struct alpha_operand alpha_operands[] =
 insert_ev6hwjhint, extract_ev6hwjhint }
 };

-const unsigned alpha_num_operands = sizeof(alpha_operands)/sizeof(*alpha_operands);
+static ATTRIBUTE_UNUSED const unsigned alpha_num_operands = sizeof(alpha_operands)/sizeof(*alpha_operands);

 /* The RB field when it is the same as the RA field in the same insn.
This operand is marked fake.  The insertion function just copies
@@ -706,7 +706,7 @@ extract_ev6hwjhint(unsigned insn, int *invalid ATTRIBUTE_UNUSED)
that were not assigned to a particular extension.
 */

-const struct alpha_opcode alpha_opcodes[] = {
+static const struct alpha_opcode alpha_opcodes[] = {
   { "halt",SPCD(0x00,0x), BASE, ARG_NONE },
   { "draina",  SPCD(0x00,0x0002), BASE, ARG_NONE },
   { "bpt", SPCD(0x00,0x0080), BASE, ARG_NONE },
@@ -1732,7 +1732,7 @@ const struct alpha_opcode alpha_opcodes[] = {
   { "bgt", BRA(0x3F), BASE, ARG_BRA },
 };

-const unsigned alpha_num_opcodes = sizeof(alpha_opcodes)/sizeof(*alpha_opcodes);
+static ATTRIBUTE_UNUSED const unsigned alpha_num_opcodes = sizeof(alpha_opcodes)/sizeof(*alpha_opcodes);

 /* OSF register names.  */

diff --git a/disas/m68k.c b/disas/m68k.c
index aefaecfbd6..903d5cfec4 100644
--- a/disas/m68k.c
+++ b/disas/m68k.c
@@ -95,29 +95,29 @@ struct floatformat

 /* floatformats for IEEE single and double, big and little endian.  */

-extern const struct floatformat floatformat_ieee_single_big;
-extern const struct floatformat floatformat_ieee_single_little;
-extern const struct floatformat 

Re: [PATCH for-6.0 1/7] hw/block/nvme: fix pi constraint check

2021-03-30 Thread Klaus Jensen
On Mar 29 19:52, Gollu Appalanaidu wrote:
> On Wed, Mar 24, 2021 at 09:09:01PM +0100, Klaus Jensen wrote:
> > From: Klaus Jensen 
> > 
> > Protection Information can only be enabled if there is at least 8 bytes
> > of metadata.
> > 
> > Signed-off-by: Klaus Jensen 
> > ---
> > hw/block/nvme-ns.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
> > index 7f8d139a8663..ca04ee1bacfb 100644
> > --- a/hw/block/nvme-ns.c
> > +++ b/hw/block/nvme-ns.c
> > @@ -394,7 +394,7 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, 
> > Error **errp)
> > return -1;
> > }
> > 
> > -if (ns->params.pi && !ns->params.ms) {
> > +if (ns->params.pi && ns->params.ms < 8) {
> Also, would it be good to check whether the "metadata size" is a power of 2?
> 

While I don't expect a lot of real-world devices to have metadata sizes
that are not powers of two, there is no requirement in the spec for
that.

And the implementation here also does not require it :)


signature.asc
Description: PGP signature


Re: [PATCH] ppc/spapr: Add support for H_SCM_HEALTH

2021-03-30 Thread Vaibhav Jain
Hi Shiva,

Thanks for reviewing this patch. My responses are inline below:


Shivaprasad G Bhat  writes:



>>   
>> +static target_ulong h_scm_health(PowerPCCPU *cpu, SpaprMachineState *spapr,
>> + target_ulong opcode, target_ulong *args)
>> +{
>> +uint32_t drc_index = args[0];
>> +SpaprDrc *drc = spapr_drc_by_index(drc_index);
>> +NVDIMMDevice *nvdimm;
>> +
>> +if (drc && spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PMEM) {
>> +return H_PARAMETER;
>> +}
>> +
>
>
> Please check if drc->dev is not NULL too. DRCs are created in advance
>
> and drc->dev may not be assigned if the device is not plugged yet.
>
>
Sure, will address that in v2

>> +nvdimm = NVDIMM(drc->dev);
>> +
>> +/* Check if the nvdimm is unarmed and send its status via health 
>> bitmaps */
>> +args[0] = nvdimm->unarmed ? PAPR_PMEM_UNARMED_MASK : 0;
>
>
> Please use object_property_get_bool to fetch the unarmed value.
>
>
Sure, I will switch to object_property_get_bool in v2. However, I see
nvdimm->unarmed being accessed in a similar manner in
nvdimm_build_structure_memdev(), which probably needs an update too.



-- 
Cheers
~ Vaibhav



[PATCH v2] docs: Add a QEMU Code of Conduct and Conflict Resolution Policy document

2021-03-30 Thread Thomas Huth
In an ideal world, we would all get along together very well, always be
polite and never end up in huge conflicts. And even if there were conflicts,
we would always treat each other fairly and respectfully. Unfortunately,
this is not an ideal world, and sometimes people forget how to interact with
each other in a professional and respectful way. Fortunately, this rarely
happens in the QEMU community, but such rare cases do occur, and then it
would be good to have a basic code of conduct document available that can
be shown to people who are misbehaving. And if that does not help, we
should also have a conflict resolution policy ready that can be applied in
the worst case.

The Code of Conduct document is based on the Django Code of Conduct
(https://www.djangoproject.com/conduct/) and the conflict resolution
has been assembled by Paolo, based on the Drupal Conflict Resolution Policy
(https://www.drupal.org/conflict-resolution) and the Mozilla Consequence Ladder
(https://github.com/mozilla/diversity/blob/master/code-of-conduct-enforcement/consequence-ladder.md)

Signed-off-by: Thomas Huth 
---
 I've picked the Django Code of Conduct as a base, since it sounds rather
 friendly and still welcoming to me, but I'm open to other suggestions, too
 (though we should maybe pick one where the conflict resolution policy is
 separated from the CoC itself, so that it can be better tailored to the
 requirements of the QEMU project).

 v2: Adjusted the wording in the conflict resolution document according to
 the suggestions from Daniel and Paolo

 docs/devel/code-of-conduct.rst | 85 ++
 docs/devel/conflict-resolution.rst | 78 +++
 docs/devel/index.rst   |  2 +
 3 files changed, 165 insertions(+)
 create mode 100644 docs/devel/code-of-conduct.rst
 create mode 100644 docs/devel/conflict-resolution.rst

diff --git a/docs/devel/code-of-conduct.rst b/docs/devel/code-of-conduct.rst
new file mode 100644
index 00..050dbd9e16
--- /dev/null
+++ b/docs/devel/code-of-conduct.rst
@@ -0,0 +1,85 @@
+Code of Conduct
+===
+
+Like the technical community as a whole, the QEMU community is made up of a
+mixture of professionals and volunteers from all over the world.
+Diversity is one of our huge strengths, but it can also lead to communication
+issues and unhappiness. To that end, we have a few ground rules that we ask
+people to adhere to. This code applies equally to founders, maintainers,
+contributors, mentors and those seeking help and guidance.
+
+This isn't an exhaustive list of things that you can't do. Rather, take it in
+the spirit in which it's intended - a guide to make it easier to enrich all of
+us and the technical communities in which we participate:
+
+* Be friendly and patient.
+
+* Be welcoming. We strive to be a community that welcomes and supports people
+  of all backgrounds and identities. This includes, but is not limited to
+  members of any race, ethnicity, culture, national origin, colour, immigration
+  status, social and economic class, educational level, sex, sexual 
orientation,
+  gender identity and expression, age, size, family status, political belief,
+  religion, and mental and physical ability.
+
+* Be considerate. Your work will be used by other people, and you in turn will
+  depend on the work of others. Any decision you take will affect users and
+  colleagues, and you should take those consequences into account when making
+  decisions. Remember that we're a world-wide community, so you might not be
+  communicating in someone else's primary language.
+
+* Be respectful. Not all of us will agree all the time, but disagreement is no
+  excuse for poor behavior and poor manners. We might all experience some
+  frustration now and then, but we cannot allow that frustration to turn into
+  a personal attack. It's important to remember that a community where people
+  feel uncomfortable or threatened is not a productive one. Members of the QEMU
+  community should be respectful when dealing with other members as well as
+  with people outside the QEMU community.
+
+* Be careful in the words that you choose. We are a community of professionals,
+  and we conduct ourselves professionally. Be kind to others. Do not insult or
+  put down other participants. Harassment and other exclusionary behavior
+  aren't acceptable. This includes, but is not limited to:
+
+  * Violent threats or language directed against another person.
+
+  * Discriminatory jokes and language.
+
+  * Posting sexually explicit or violent material.
+
+  * Posting (or threatening to post) other people's personally identifying
+information ("doxing").
+
+  * Personal insults, especially those using racist or sexist terms.
+
+  * Unwelcome sexual attention.
+
+  * Advocating for, or encouraging, any of the above behavior.
+
+  * Repeated harassment of others. In general, if someone asks you to stop,
+then stop.
+
+* When we disagree, 

Re: [PATCH] docs: Add a QEMU Code of Conduct and Conflict Resolution Policy document

2021-03-30 Thread Daniel P. Berrangé
On Mon, Mar 29, 2021 at 10:59:23PM +0200, Paolo Bonzini wrote:
> On Mon, Mar 29, 2021 at 20:33, Daniel P. Berrangé wrote:
> 
> > The obvious alternative is to import the contributor covenant
> >
> >   https://www.contributor-covenant.org/
> 
> 
> The Contributor Covenant 1.x and 2.x are very different in that 2.x also
> includes conflict resolution. Unlike the code of conduct, the consequences
> of bad behavior are hard to generalize across multiple projects, so I would
> prefer anyway the 1.x version. The differences with the Django CoC aren't
> substantial.
> 
> However this does mean being more careful about the language in the
> "custom" documents such as the conflict resolution policy.
> 
> 
> > Second, it isn't a static document: it evolves over
> > time, with new versions issued as the understanding of problematic
> > situations improves. We can choose to periodically update to stay
> > current with broadly accepted norms.
> >
> 
> This, however, has the same issues as the "or later" clause of the GPL (see
> the above example of 1.x vs 2.x for the Contributor Covenant). I don't
> think upgrading the CoC should be automatic, since there are no
> "compatibility" issues.

Note, I didn't say we should automatically upgrade - I said we can
choose to upgrade. 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v5 5/5] virtiofsd: Switch creds, drop FSETID for system.posix_acl_access xattr

2021-03-30 Thread Luis Henriques
On Mon, Mar 29, 2021 at 03:51:51PM -0400, Vivek Goyal wrote:
> On Mon, Mar 29, 2021 at 04:35:57PM +0100, Luis Henriques wrote:
> > On Thu, Mar 25, 2021 at 11:38:52AM -0400, Vivek Goyal wrote:
> > > When posix access acls are set on a file, it can lead to adjusting file
> > > permissions (mode) as well. If caller does not have CAP_FSETID and it
> > > also does not have membership of owner group, this will lead to clearing
> > > SGID bit in mode.
> > > 
> > > Current fuse code is written in such a way that it expects the file server
> > > to take care of changing the file mode (permissions), if there is a need.
> > > Right now, the host kernel does not clear the SGID bit because virtiofsd is
> > > running as root and has CAP_FSETID. For the host kernel to clear SGID,
> > > virtiofsd needs to switch to the gid of the caller in the guest and also drop
> > > CAP_FSETID (if the caller did not have it to begin with).
> > > 
> > > If SGID needs to be cleared, client will set the flag
> > > FUSE_SETXATTR_ACL_KILL_SGID in setxattr request. In that case server
> > > should kill sgid.
> > > 
> > > Currently just switch to uid/gid of the caller and drop CAP_FSETID
> > > and that should do it.
> > > 
> > > This should fix the xfstest generic/375 test case.
> > > 
> > > We don't have to switch uid for this to work. One optimization could be
> > > to pass a parameter to lo_change_cred() to only switch gid and not uid.
> > > 
> > > Also this will not work whenever (if ever) we support idmapped mounts. In
> > > that case it is possible that uid/gid in request are 0/0 but still we
> > > need to clear SGID. So we will have to pick a non-root sgid and switch
> > > to that instead. That's a TODO item for the future, when idmapped mount
> > > support is introduced.
> > > 
> > > Reported-by: Luis Henriques 
> > > Signed-off-by: Vivek Goyal 
> > > ---
> > >  include/standard-headers/linux/fuse.h |  7 +
> > >  tools/virtiofsd/passthrough_ll.c  | 42 +--
> > >  2 files changed, 47 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/include/standard-headers/linux/fuse.h 
> > > b/include/standard-headers/linux/fuse.h
> > > index cc87ff27d0..4eb79399d4 100644
> > > --- a/include/standard-headers/linux/fuse.h
> > > +++ b/include/standard-headers/linux/fuse.h
> > > @@ -180,6 +180,7 @@
> > >   *  - add FUSE_HANDLE_KILLPRIV_V2, FUSE_WRITE_KILL_SUIDGID, 
> > > FATTR_KILL_SUIDGID
> > >   *  - add FUSE_OPEN_KILL_SUIDGID
> > >   *  - add FUSE_SETXATTR_V2
> > > + *  - add FUSE_SETXATTR_ACL_KILL_SGID
> > >   */
> > >  
> > >  #ifndef _LINUX_FUSE_H
> > > @@ -450,6 +451,12 @@ struct fuse_file_lock {
> > >   */
> > >  #define FUSE_OPEN_KILL_SUIDGID   (1 << 0)
> > >  
> > > +/**
> > > + * setxattr flags
> > > + * FUSE_SETXATTR_ACL_KILL_SGID: Clear SGID when system.posix_acl_access 
> > > is set
> > > + */
> > > +#define FUSE_SETXATTR_ACL_KILL_SGID(1 << 0)
> > > +
> > >  enum fuse_opcode {
> > >   FUSE_LOOKUP = 1,
> > >   FUSE_FORGET = 2,  /* no reply */
> > > diff --git a/tools/virtiofsd/passthrough_ll.c 
> > > b/tools/virtiofsd/passthrough_ll.c
> > > index 3f5c267604..8a48071d0b 100644
> > > --- a/tools/virtiofsd/passthrough_ll.c
> > > +++ b/tools/virtiofsd/passthrough_ll.c
> > > @@ -175,7 +175,7 @@ struct lo_data {
> > >  int user_killpriv_v2, killpriv_v2;
> > >  /* If set, virtiofsd is responsible for setting umask during 
> > > creation */
> > >  bool change_umask;
> > > -int user_posix_acl;
> > > +int user_posix_acl, posix_acl;
> > >  };
> > >  
> > >  static const struct fuse_opt lo_opts[] = {
> > > @@ -716,8 +716,10 @@ static void lo_init(void *userdata, struct 
> > > fuse_conn_info *conn)
> > >   * in fuse_lowlevel.c
> > >   */
> > >  fuse_log(FUSE_LOG_DEBUG, "lo_init: enabling posix acl\n");
> > > -conn->want |= FUSE_CAP_POSIX_ACL | FUSE_CAP_DONT_MASK;
> > > +conn->want |= FUSE_CAP_POSIX_ACL | FUSE_CAP_DONT_MASK |
> > > +  FUSE_CAP_SETXATTR_V2;
> > 
> > An annoying thing with this is that if we're using a kernel without
> > _V2 support the mount will still succeed.  But we'll see:
> > 
> > ls: cannot access '/mnt': Connection refused
> > 
> > and in the userspace:
> > 
> > fuse: error: filesystem requested capabilities 0x2000 that are not 
> > supported by kernel, aborting.
> > 
> > Maybe it would be worth automatically disabling acl support if this
> > happens (with an error message) but still allowing the filesystem to be
> > used.
> 
> If the user specified "-o posix_acl", then it is better to fail explicitly
> if posix_acl can't be enabled. If the user did not specify anything, then
> it makes sense to automatically disable posix acl and continue.
> 
> > Or, which is probably better, to handle the EPROTO error in the
> > kernel during mount.
> 
> That would have been a good idea, but in fuse, INIT processing happens
> asynchronously. That is, mount returns to user space while the INIT
> command might complete at a later point in time. So 

[PATCH] target/xtensa: fix core import to meson.build

2021-03-30 Thread Max Filippov
import_core.sh was not updated to change meson.build when a new xtensa
core is imported. Fix that.

Cc: qemu-sta...@nongnu.org # v5.2.0
Signed-off-by: Max Filippov 
---
 target/xtensa/import_core.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/xtensa/import_core.sh b/target/xtensa/import_core.sh
index c8626a8c02eb..f3404039cc20 100755
--- a/target/xtensa/import_core.sh
+++ b/target/xtensa/import_core.sh
@@ -66,5 +66,5 @@ static XtensaConfig $NAME __attribute__((unused)) = {
 REGISTER_CORE($NAME)
 EOF
 
-grep -q core-${NAME}.o "$BASE"/Makefile.objs || \
-echo "obj-y += core-${NAME}.o" >> "$BASE"/Makefile.objs
+grep -q core-${NAME}.c "$BASE"/meson.build || \
+echo "xtensa_ss.add(files('core-${NAME}.c'))" >> "$BASE"/meson.build
-- 
2.20.1




[PATCH] target/xtensa: make xtensa_modules static on import

2021-03-30 Thread Max Filippov
The xtensa_modules variable defined in each xtensa-modules.c.inc is only
used locally by the including file. Make it static.

Signed-off-by: Max Filippov 
---
 target/xtensa/import_core.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/xtensa/import_core.sh b/target/xtensa/import_core.sh
index f3404039cc20..53d3c4d099bb 100755
--- a/target/xtensa/import_core.sh
+++ b/target/xtensa/import_core.sh
@@ -35,6 +35,7 @@ tar -xf "$OVERLAY" -O binutils/xtensa-modules.c | \
 -e '/^#include "ansidecl.h"/d' \
-e '/^Slot_[a-zA-Z0-9_]\+_decode (const xtensa_insnbuf insn)/,/^}/s/^  return 0;$/  return XTENSA_UNDEFINED;/' \
 -e 's/#include /#include "xtensa-isa.h"/' \
+-e 's/^\(xtensa_isa_internal xtensa_modules\)/static \1/' \
 > "$TARGET"/xtensa-modules.c.inc
 
 cat < "${TARGET}.c"
-- 
2.20.1
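(The new sed rule can be exercised in isolation against a sample declaration line. The sample input below is hypothetical; the real input comes from binutils' xtensa-modules.c in the overlay tarball:)

```shell
# Feed one representative line through the substitution added by the patch.
line='xtensa_isa_internal xtensa_modules = {'
out=$(printf '%s\n' "$line" | \
      sed -e 's/^\(xtensa_isa_internal xtensa_modules\)/static \1/')
echo "$out"
# prints: static xtensa_isa_internal xtensa_modules = {
```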




Re: [PATCH v7 4/4] tests: Add tests for yank with the chardev-change case

2021-03-30 Thread Marc-André Lureau
Hi Lukas,

On Mon, Mar 29, 2021 at 10:55 PM Lukas Straub  wrote:

> Add tests for yank with the chardev-change case.
>
> Signed-off-by: Lukas Straub 
> Reviewed-by: Marc-André Lureau 
> Tested-by: Li Zhang 
> ---
>  MAINTAINERS|   1 +
>  tests/unit/meson.build |   3 +-
>  tests/unit/test-yank.c | 227 +
>  3 files changed, 230 insertions(+), 1 deletion(-)
>  create mode 100644 tests/unit/test-yank.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 77259c031d..accb683a55 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2821,6 +2821,7 @@ M: Lukas Straub 
>  S: Odd fixes
>  F: util/yank.c
>  F: migration/yank_functions*
> +F: tests/unit/test-yank.c
>  F: include/qemu/yank.h
>  F: qapi/yank.json
>
> diff --git a/tests/unit/meson.build b/tests/unit/meson.build
> index 4bfe4627ba..b3bc2109da 100644
> --- a/tests/unit/meson.build
> +++ b/tests/unit/meson.build
> @@ -123,7 +123,8 @@ if have_system
>  'test-util-sockets': ['socket-helpers.c'],
>  'test-base64': [],
>  'test-bufferiszero': [],
> -'test-vmstate': [migration, io]
> +'test-vmstate': [migration, io],
> +'test-yank': ['socket-helpers.c', qom, io, chardev]
>}
>if 'CONFIG_INOTIFY1' in config_host
>  tests += {'test-util-filemonitor': []}
> diff --git a/tests/unit/test-yank.c b/tests/unit/test-yank.c
> new file mode 100644
> index 00..c46946b642
> --- /dev/null
> +++ b/tests/unit/test-yank.c
> @@ -0,0 +1,227 @@
> +/*
> + * Tests for QEMU yank feature
> + *
> + * Copyright (c) Lukas Straub 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include 
> +
> +#include "qemu/config-file.h"
> +#include "qemu/module.h"
> +#include "qemu/option.h"
> +#include "chardev/char-fe.h"
> +#include "sysemu/sysemu.h"
> +#include "qapi/error.h"
> +#include "qapi/qapi-commands-char.h"
> +#include "qapi/qapi-types-char.h"
> +#include "qapi/qapi-commands-yank.h"
> +#include "qapi/qapi-types-yank.h"
> +#include "io/channel-socket.h"
> +#include "io/net-listener.h"
> +#include "socket-helpers.h"
> +
> +typedef struct {
> +SocketAddress *addr;
> +bool old_yank;
> +bool new_yank;
> +bool fail;
> +} CharChangeTestConfig;
> +
> +static int chardev_change(void *opaque)
> +{
> +return 0;
> +}
> +
> +static bool is_yank_instance_registered(void)
> +{
> +YankInstanceList *list;
> +bool ret;
> +
> +list = qmp_query_yank(&error_abort);
> +
> +ret = !!list;
> +
> +qapi_free_YankInstanceList(list);
> +
> +return ret;
> +}
> +
> +static void char_change_test(gconstpointer opaque)
> +{
> +CharChangeTestConfig *conf = (gpointer) opaque;
> +SocketAddress *addr;
> +Chardev *chr;
> +CharBackend be;
> +ChardevReturn *ret;
> +QIOChannelSocket *ioc;
> +QIONetListener *listener;
> +
> +/*
> + * Setup a listener socket and determine its address
> + * so we know the TCP port for the client later
> + */
> +ioc = qio_channel_socket_new();
> +g_assert_nonnull(ioc);
> +qio_channel_socket_listen_sync(ioc, conf->addr, 1, &error_abort);
> +addr = qio_channel_socket_get_local_address(ioc, &error_abort);
> +g_assert_nonnull(addr);
> +listener = qio_net_listener_new();
> +g_assert_nonnull(listener);
> +qio_net_listener_add(listener, ioc);
>

The listener doesn't work, as there is no main loop running. The following works for me.
Please update the patch & resend. Thanks

diff --git a/tests/unit/test-yank.c b/tests/unit/test-yank.c
index 1596a3b98e..6e28648750 100644
--- a/tests/unit/test-yank.c
+++ b/tests/unit/test-yank.c
@@ -49,6 +49,16 @@ static bool is_yank_instance_registered(void)
 return ret;
 }

+static gpointer
+accept_thread(gpointer data)
+{
+QIOChannelSocket *ioc = data;
+
+qio_channel_socket_accept(ioc, &error_abort);
+
+return NULL;
+}
+
 static void char_change_test(gconstpointer opaque)
 {
 CharChangeTestConfig *conf = (gpointer) opaque;
@@ -57,6 +67,7 @@ static void char_change_test(gconstpointer opaque)
 CharBackend be;
 ChardevReturn *ret;
 QIOChannelSocket *ioc;
+QemuThread thread;

 /*
  * Setup a listener socket and determine its address
@@ -115,6 +126,11 @@ static void char_change_test(gconstpointer opaque)

 g_assert(!is_yank_instance_registered());

+if (conf->old_yank) {
+qemu_thread_create(&thread, "accept", accept_thread,
+   ioc, QEMU_THREAD_JOINABLE);
+}
+
 ret = qmp_chardev_add("chardev", [conf->old_yank],
&error_abort);
 qapi_free_ChardevReturn(ret);
 chr = qemu_chr_find("chardev");
@@ -123,6 +139,10 @@ static void char_change_test(gconstpointer opaque)
 g_assert(is_yank_instance_registered() == conf->old_yank);

 qemu_chr_wait_connected(chr, &error_abort);
+if (conf->old_yank) {
+qemu_thread_join(&thread);
+}
+
 qemu_chr_fe_init(&be, chr, &error_abort);
 /* allow 

[Bug 1090604] Re: RFE: Implement support for SMBIOS Type 41 structures

2021-03-30 Thread Vincent Bernat
I have sent a first patch around this:
https://lists.nongnu.org/archive/html/qemu-devel/2021-03/msg09391.html

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1090604

Title:
  RFE: Implement support for SMBIOS Type 41 structures

Status in QEMU:
  In Progress

Bug description:
  This was originally filed in Fedora bugzilla:
  https://bugzilla.redhat.com/show_bug.cgi?id=669955

  """
  Please extend the existing support for SMBIOS in qemu to add a capability to 
provide "Onboard Devices Extended Information" (Type 41). Not only is this 
replacing one of the existing types, but it also provides a mapping between 
devices and physical system chassis locations. But there is no physical 
chassis! Right. However, this doesn't mean you don't want to tell the guest OS 
which virtual (e.g. network) interface is which. You can do that, if you 
implement this extension that is already going into real hardware, and likely 
other VMs too.

  See also page 117 of the v2.7 of the SMBIOS spec.

  FWIW, VMware ESX and Workstation expose their PCI NICs in the PCI IRQ Routing 
Table.  Kind of odd the first time you see it with biosdevname, as your NIC 
becomes pci3#1, but that's "correct" from a BIOS perspective. :-)
  """

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1090604/+subscriptions



Re: [PATCH 0/6] Add debug interface to kick/call on purpose

2021-03-30 Thread Dongli Zhang



On 3/28/21 8:56 PM, Jason Wang wrote:
> 
> On 2021/3/27 5:16 AM, Dongli Zhang wrote:
>> Hi Jason,
>>
>> On 3/26/21 12:24 AM, Jason Wang wrote:
>>> On 2021/3/26 1:44 PM, Dongli Zhang wrote:
 The virtio device/driver (e.g., vhost-scsi or vhost-net) may hang due to
 the loss of doorbell kick, e.g.,

 https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg01711.html


 ... or due to the loss of IRQ, e.g., as fixed by linux kernel commit
 fe200ae48ef5 ("genirq: Mark polled irqs and defer the real handler").

 This patch introduces a new debug interface 'DeviceEvent' to DeviceClass
 to help narrow down if the issue is due to loss of irq/kick. So far the new
 interface handles only two events: 'call' and 'kick'. Any device (e.g.,
 virtio/vhost or VFIO) may implement the interface (e.g., via eventfd, MSI-X
 or legacy IRQ).

 The 'call' is to inject irq on purpose by admin for a specific device 
 (e.g.,
 vhost-scsi) from QEMU/host to VM, while the 'kick' is to kick the doorbell
 on purpose by admin at QEMU/host side for a specific device.


 This device can be used as a workaround if call/kick is lost due to
 virtualization software (e.g., kernel or QEMU) issue.

 We may also implement the interface for VFIO PCI, e.g., to write to
 VFIOPCIDevice->msi_vectors[i].interrupt will be able to inject IRQ to VM
 on purpose. This is considered future work once the virtio part is done.


 Below is from live crash analysis. Initially, the queue=2 has count=15 for
 'kick' eventfd_ctx. Suppose there is data in vring avail while there is no
 used available. We suspect this is because vhost-scsi was not notified by
 VM. In order to narrow down and analyze the issue, we use live crash to
 dump the current counter of eventfd for queue=2.

 crash> eventfd_ctx 8f67f6bbe700
 struct eventfd_ctx {
     kref = {
   refcount = {
     refs = {
   counter = 4
     }
   }
     },
     wqh = {
   lock = {
     {
   rlock = {
     raw_lock = {
   val = {
     counter = 0
   }
     }
   }
     }
   },
   head = {
     next = 0x8f841dc08e18,
     prev = 0x8f841dc08e18
   }
     },
     count = 15, ---> eventfd is 15 !!!
     flags = 526336
 }

 Now we kick the doorbell for vhost-scsi queue=2 on purpose for diagnostic
 with this interface.

 { "execute": "x-debug-device-event",
     "arguments": { "dev": "/machine/peripheral/vscsi0",
    "event": "kick", "queue": 2 } }

 The counter is increased to 16. Suppose the hang issue is resolved, it
 indicates something bad is in software that the 'kick' is lost.
>>> What do you mean by "software" here? And it looks to me you're testing 
>>> whether
>>> event_notifier_set() is called by virtio_queue_notify() here. If so, I'm not
>>> sure how much value we could gain from a dedicated debug interface like this,
>>> considering there are a lot of existing general-purpose debugging methods like
>>> tracing or gdb. I'd say the path from virtio_queue_notify() to
>>> event_notifier_set() is only a very small fraction of the process of 
>>> virtqueue
>>> kick which is unlikely to be buggy. Considering the ioeventfd is usually
>>> offloaded to KVM, it's more likely that something is wrong in setting up the
>>> ioeventfd instead of here. Irq is even more complicated.
>> Thank you very much!
>>
>> I am not testing whether event_notifier_set() is called by 
>> virtio_queue_notify().
>>
>> The 'software' indicates the data processing and event notification mechanism
>> involved with virtio/vhost PV driver frontend. E.g., while VM is waiting for 
>> an
>> extra IRQ, vhost side did not trigger IRQ, suppose vring_need_event()
>> erroneously returns false due to corrupted ring buffer status.
> 
> 
> So there could be several factors that may block the notification:
> 
> 1) eventfd bug (ioeventfd vs irqfd)
> 2) wrong virtqueue state (either driver or device)
> 3) missing barriers (either driver or device)
> 4) Qemu bug (irqchip and routing)
> ...

This is not only about whether notification is blocked.

It can also be used to help narrow down and understand whether there is any
suspicious issue in blk-mq/scsi/netdev/napi code. The PV drivers are not just
drivers following the virtio spec; they are closely related to many other kernel
components.

Suppose IO recovers after we inject an IRQ into vhost-scsi on purpose; we will
then be able to analyze what may happen along the IO completion path starting
from when/where the IRQ is injected ... perhaps the root cause is not with
virtio but 

Re: [PATCH] replay: fix recursive checkpoints

2021-03-30 Thread Pavel Dovgalyuk

On 29.03.2021 14:25, Alex Bennée wrote:


Pavel Dovgalyuk  writes:


Record/replay uses checkpoints to synchronize the execution
of the threads and timers. Hardware events such as BH are
processed at the checkpoints too.
Event processing can refresh the virtual timers
and call the icount-related functions, which also use checkpoints.
This patch prevents recursive processing of such checkpoints,
because they have their own records in the log and should be
processed later.

Signed-off-by: Pavel Dovgalyuk 
---
  replay/replay.c |   11 ++-
  1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/replay/replay.c b/replay/replay.c
index c806fec69a..6df2abc18c 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -180,12 +180,13 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint)
  }
  
  if (in_checkpoint) {

-/* If we are already in checkpoint, then there is no need
-   for additional synchronization.
+/*
 Recursion occurs when HW event modifies timers.
-   Timer modification may invoke the checkpoint and
-   proceed to recursion. */
-return true;
+   Prevent performing icount warp in this case and
+   wait for another invocation of the checkpoint.
+*/


nit: as you are updating the comment you might as well fix the style. It
would probably help with the diff as well.


+g_assert(replay_mode == REPLAY_MODE_PLAY);
+return false;
  }
  in_checkpoint = true;


The accompanying comments in replay.h are also confusing

 Returns 0 in PLAY mode if checkpoint was not found.
 Returns 1 in all other cases.

Which translated to actual bool results:

 Returns false in PLAY mode if checkpoint was not found
 Returns true in all other cases

Which implies the checkpoint is always found (or created?) which I'm not
even sure of while following the rest of the replay_checkpoint code
which has exit cases of:

 bool res = false; (default)
 replay_state.data_kind != EVENT_ASYNC;
 res = true; (when recording)

So is the following more correct?

/**
  * replay_checkpoint(checkpoint): save (in RECORD) or consume (in PLAY) 
checkpoint
  * @checkpoint: the checkpoint event
  *
  * In SAVE mode stores the checkpoint in the record and potentially
  * saves a number of events.
  *
  * In PLAY mode consumes checkpoint and any following EVENT_ASYNC events.
  *
  * Results: in SAVE mode always True
  *  in PLAY mode True unless checkpoint not found or recursively 
called.
  */



Almost true.
In PLAY mode it returns True only if the checkpoint was found and all following
async events matched and were processed.
Otherwise it returns False and unprocessed events are postponed to be
consumed later.
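The recursion guard discussed in this thread can be sketched in isolation. This is a hypothetical reduction (names and structure simplified, not the actual replay.c code): a checkpoint that is re-entered while already processing hardware events must refuse to recurse and leave the nested record to be consumed later.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for replay state (hypothetical names). */
static bool in_checkpoint;
static int events_processed;

static bool checkpoint(void);

/*
 * A hardware event (e.g. a BH) may refresh virtual timers, which
 * invokes the checkpoint again from inside the checkpoint.
 */
static void process_hw_events(void)
{
    events_processed++;
    if (events_processed == 1) {
        /* The recursive invocation must be rejected, not processed. */
        assert(!checkpoint());
    }
}

static bool checkpoint(void)
{
    if (in_checkpoint) {
        /* Postponed: the event keeps its own record in the log. */
        return false;
    }
    in_checkpoint = true;
    process_hw_events();
    in_checkpoint = false;
    return true;
}
```

The outer call succeeds exactly once while the nested call bails out, mirroring the "return false and wait for another invocation" behavior the patch introduces.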


Pavel Dovgalyuk



Re: [PULL 00/10] For 6.0 patches

2021-03-30 Thread Marc-André Lureau
Hi

On Mon, Mar 29, 2021 at 9:54 PM Peter Maydell 
wrote:

> On Mon, 29 Mar 2021 at 17:30, Marc-André Lureau
>  wrote:
> >
> > Hi
> >
> > On Mon, Mar 29, 2021 at 7:56 PM Peter Maydell 
> wrote:
> >>
> >> On Mon, 29 Mar 2021 at 15:17, Marc-André Lureau
> >>  wrote:
> >> > ../docs/meson.build:30: WARNING: /usr/bin/sphinx-build-3:
> >> > Configuration error:
> >> > The Sphinx 'sphinx_rtd_theme' HTML theme was not found.
> >> >
> >> > ../docs/meson.build:32:6: ERROR: Problem encountered: Install a
> Python 3 version of python-sphinx and the readthedoc theme
> >>
> >>
> >> So why do you get that message, and I see the above? Older
> >> sphinx-build ?
> >
> >
> >
> > It's strange, it's like ModuleNotFoundError was not caught by the
> "except ImportError".
> >
> > What's the version of python?
>
> It's whatever's in the BSD VMs. I also saw the same error on the
>

I built successfully with vm-build-openbsd, vm-build-freebsd, and
vm-build-netbsd. None has sphinx installed, so they simply print:
Program sphinx-build-3 sphinx-build found: NO

Am I missing something?

aarch64 CI machine, which has python 3.8.5 and sphinx-build 1.8.5.
> My guess is that it might be the sphinx-build version here. I vaguely
> recall that Sphinx is kind of picky about exceptions within the conf
> file but that there was a change in what it allowed at some point.
> It's possible we just can't do much with the old versions.
>

How do you run the build? Running make from an existing configured or build
state? If so, I have seen sphinx errors that don't stop the build (and
actually building the docs without sphinx-rtd). I don't know why this
happens, "regenerate"/reconfigure errors should stop the build.

It seems like a minor issue to me. A clean build will error correctly.


> I'm inclined to suggest we should postpone switching to the rtd theme
> until after the 6.0 release -- there isn't a strong need to get it
> in this release, is there ?
>
>
There is no hurry, but let's try to make some progress. If it's ready, I'll
let you decide if this is acceptable during freeze period or not.

Now I am not sure what should be fixed... I will try to find the cause of
the non-fatal error on incremental build.

thanks

-- 
Marc-André Lureau


[Bug 1862874] Re: java may stuck for a long time in system mode with "-cpu max"

2021-03-30 Thread David Hildenbrand
** Changed in: qemu
   Status: New => Confirmed

https://bugs.launchpad.net/bugs/1862874

Title:
  java may stuck for a long time in system mode with "-cpu max"

Status in QEMU:
  Confirmed

Bug description:
  Bug Description:
  Run "java -version" in guest VM, java may stuck for a long time (several 
hours) and then recover.

  Steps to reproduce:
  1. Launch VM by attached simple script: launch.sh
  2. Execute "java -version" and then print "date" in a loop
  while :
  do
/home/bot/jdk/bin/java -version
date
  done
  3. A long time gap will be observed: may > 24 hours.

  Technical details:
  * host: x86_64 Linux 4.15.0-70-generic
  * qemu v4.2.0
  * java: tried two versions: openjdk-11-jre-headless or compiled java-13 
  * command-line: (See details in launch.sh)
  /home/bot/qemu/qemu-build/qemu-4.2.0/binaries/bin/qemu-system-x86_64 \
-drive "file=${img},format=qcow2" \
-drive "file=${user_data},format=raw" \
-cpu max \
-m 24G \
-serial mon:stdio \
-smp 8 \
-nographic \
  ;

  * Observed in a java core dump generated by "kill -SIGSEGV" while java was stuck:
  Different pthreads are blocked on their own condition variables:

Id   Target Id Frame
1Thread 0x7f48a041a080 (LWP 22470) __GI_raise (sig=sig@entry=6)
  at ../sysdeps/unix/sysv/linux/raise.c:51
2Thread 0x7f487197d700 (LWP 22473) 0x7f489f5c49f3 in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x7f48980197c0)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:88
3Thread 0x7f4861b89700 (LWP 22483) 0x7f489f5c4ed9 in 
futex_reltimed_wait_cancelable (private=, 
reltime=0x7f4861b88960, expected=0,
  futex_word=0x7f489801b084)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:142
4Thread 0x7f4861e8c700 (LWP 22480) 0x7f489f5c76d6 in 
futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, 
futex_word=0x7f48980107c0)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:205
5Thread 0x7f4861c8a700 (LWP 22482) 0x7f489f5c4ed9 in 
futex_reltimed_wait_cancelable (private=, 
reltime=0x7f4861c89800, expected=0,
  futex_word=0x7f489801ed44)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:142
6Thread 0x7f48a0418700 (LWP 22471) 0x7f4880b13200 in ?? ()
7Thread 0x7f48703ea700 (LWP 22478) 0x7f489f5c49f3 in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x7f489801dfc0)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:88
8Thread 0x7f48702e9700 (LWP 22479) 0x7f489f5c49f3 in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x7f489838cd84)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:88
9Thread 0x7f4870f71700 (LWP 22475) 0x7f489f5c49f3 in 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x7f489801a300)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:88
10   Thread 0x7f487187b700 (LWP 22474) 0x7f489f5c76d6 in 
futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, 
futex_word=0x7f48980cf770)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:205
11   Thread 0x7f4871a7f700 (LWP 22472) 0x7f489f5c76d6 in 
futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, 
futex_word=0x7f489809ba30)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:205
12   Thread 0x7f4861d8b700 (LWP 22481) 0x7f489f5c4ed9 in 
futex_reltimed_wait_cancelable (private=, 
reltime=0x7f4861d8a680, expected=0,
  futex_word=0x7f489801ed44)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:142
13   Thread 0x7f48704ec700 (LWP 22477) 0x7f489f5c4ed9 in 
futex_reltimed_wait_cancelable (private=, 
reltime=0x7f48704eb910, expected=0,
  futex_word=0x7f489801d120)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:142
14   Thread 0x7f4870e6f700 (LWP 22476) 0x7f489f5c4ed9 in 
futex_reltimed_wait_cancelable (private=, 
reltime=0x7f4870e6eb20, expected=0,
  futex_word=0x7f489828abd0)
  at ../sysdeps/unix/sysv/linux/futex-internal.h:142

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1862874/+subscriptions



[Bug 1920913] Re: Openjdk11+ fails to install on s390x

2021-03-30 Thread David Hildenbrand
Same BUG as https://bugs.launchpad.net/qemu/+bug/1862874

https://bugs.launchpad.net/bugs/1920913

Title:
  Openjdk11+ fails to install on s390x

Status in QEMU:
  New

Bug description:
  While installing openjdk11 or higher from repo, it crashes while configuring 
ca-certificates-java.
  Although `java -version` passes, `jar -version` crashes. Detailed logs 
attached to this issue.

  ```
  # A fatal error has been detected by the Java Runtime Environment:
  #
  #  SIGILL (0x4) at pc=0x0040126f9980, pid=8425, tid=8430
  #
  # JRE version: OpenJDK Runtime Environment (11.0.10+9) (build 
11.0.10+9-Ubuntu-0ubuntu1.20.04)
  # Java VM: OpenJDK 64-Bit Server VM (11.0.10+9-Ubuntu-0ubuntu1.20.04, mixed 
mode, tiered, compressed oops, g1 gc, linux-s390x)
  # Problematic frame:
  # J 4 c1 java.lang.StringLatin1.hashCode([B)I java.base@11.0.10 (42 bytes) @ 
0x0040126f9980 [0x0040126f9980+0x]
  #
  # Core dump will be written. Default location: Core dumps may be processed 
with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to //core.8425)
  #
  # An error report file with more information is saved as:
  # //hs_err_pid8425.log
  sed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to 
/root/core.10740)
  #
  # An error report file with more information is saved as:
  # /root/hs_err_pid10740.log
  ```

  Observed this on s390x/ubuntu as well as s390x/alpine when run on amd64 host.
  Please note, on native s390x, the installation is successful. Also this crash 
is not observed while installing openjdk-8-jdk.

  Qemu version: 5.2.0

  Please let me know if any more details are needed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1920913/+subscriptions



Re: [RFC 1/8] memory: Allow eventfd add/del without starting a transaction

2021-03-30 Thread Greg Kurz
On Mon, 29 Mar 2021 18:03:49 +0100
Stefan Hajnoczi  wrote:

> On Thu, Mar 25, 2021 at 04:07:28PM +0100, Greg Kurz wrote:
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index 5728a681b27d..98ed552e001c 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -1848,13 +1848,25 @@ void 
> > memory_region_clear_flush_coalesced(MemoryRegion *mr);
> >   * @match_data: whether to match against @data, instead of just @addr
> >   * @data: the data to match against the guest write
> >   * @e: event notifier to be triggered when @addr, @size, and @data all 
> > match.
> > + * @transaction: whether to start a transaction for the change
> 
> "start" is unclear. Does it begin a transaction and return with the
> transaction unfinished? I think instead the function performs the
> eventfd addition within a transaction. It would be nice to clarify this.
> 

What about: 

 * @transaction: if true, the eventfd is added within a nested transaction,
 *   if false, it is up to the caller to ensure this is called
 *   within a transaction.

> >   **/
> > -void memory_region_add_eventfd(MemoryRegion *mr,
> > -   hwaddr addr,
> > -   unsigned size,
> > -   bool match_data,
> > -   uint64_t data,
> > -   EventNotifier *e);
> > +void memory_region_add_eventfd_full(MemoryRegion *mr,
> > +hwaddr addr,
> > +unsigned size,
> > +bool match_data,
> > +uint64_t data,
> > +EventNotifier *e,
> > +bool transaction);
> > +
> > +static inline void memory_region_add_eventfd(MemoryRegion *mr,
> > + hwaddr addr,
> > + unsigned size,
> > + bool match_data,
> > + uint64_t data,
> > + EventNotifier *e)
> > +{
> > +memory_region_add_eventfd_full(mr, addr, size, match_data, data, e, 
> > true);
> > +}
> >  
> >  /**
> >   * memory_region_del_eventfd: Cancel an eventfd.
> > @@ -1868,13 +1880,25 @@ void memory_region_add_eventfd(MemoryRegion *mr,
> >   * @match_data: whether to match against @data, instead of just @addr
> >   * @data: the data to match against the guest write
> >   * @e: event notifier to be triggered when @addr, @size, and @data all 
> > match.
> > + * @transaction: whether to start a transaction for the change
> 
> Same here.

and:

 * @transaction: if true, the eventfd is cancelled within a nested transaction,
 *   if false, it is up to the caller to ensure this is called
 *   within a transaction.

?
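The pattern under discussion — an eventfd update that either opens its own nested transaction or relies on the caller already holding one — can be sketched with mock transaction functions. These are hypothetical stand-ins (real QEMU uses memory_region_transaction_begin()/commit(), and the actual eventfd bookkeeping is omitted); the point is that a caller batching many changes pays for a single commit:

```c
#include <assert.h>
#include <stdbool.h>

/* Mock transaction machinery (hypothetical, mimicking the memory API). */
static int transaction_depth;
static int commits;

static void transaction_begin(void)
{
    transaction_depth++;
}

static void transaction_commit(void)
{
    if (--transaction_depth == 0) {
        commits++;  /* side effects are applied once, at the outermost commit */
    }
}

/* If @transaction is false, the caller must already be inside a transaction. */
static void add_eventfd_full(bool transaction)
{
    if (transaction) {
        transaction_begin();
    }
    /* ... update the ioeventfd list here ... */
    if (transaction) {
        transaction_commit();
    }
}

/* A caller batching several changes under one transaction, as in the series. */
static void add_many(int n)
{
    transaction_begin();
    for (int i = 0; i < n; i++) {
        add_eventfd_full(false);  /* caller already holds the transaction */
    }
    transaction_commit();
}
```

Batching n additions triggers one commit instead of n, which is the motivation for exposing the `transaction` flag at all.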




Re: [PATCH v2] qapi: introduce 'query-cpu-model-cpuid' action

2021-03-30 Thread Valeriy Vdovin
On Tue, Mar 30, 2021 at 02:15:10AM +0200, Igor Mammedov wrote:
> On Thu, 25 Mar 2021 19:57:05 +0300
> Valeriy Vdovin  wrote:
> 
> > Introducing new qapi method 'query-cpu-model-cpuid'. This method can be 
> > used to
> > get virtualized cpu model info generated by QEMU during VM initialization in
> > the form of cpuid representation.
> > 
> > Diving into more details about virtual cpu generation: QEMU first parses the
> > '-cpu' command line option. From there it takes the name of the model as the
> > basis for the feature set of the new virtual cpu. After that it uses trailing
> > '-cpu' options that state whether additional cpu features should be present
> > on the virtual cpu or excluded from it (tokens '+'/'-' or '=on'/'=off').
> > After that QEMU checks if the host's cpu can actually support the derived
> > feature set and applies host limitations to it.
> > After this initialization procedure, the virtual cpu has its model and
> > vendor names and a working feature set, and is ready for identification
> > instructions such as CPUID.
> > 
> > Currently full output for this method is only supported for x86 cpus.
> > 
> > To learn exactly how virtual cpu is presented to the guest machine via CPUID
> > instruction, new qapi method can be used. By calling 'query-cpu-model-cpuid'
> > method, one can get a full listing of all CPUID leafs with subleafs which 
> > are
> > supported by the initialized virtual cpu.
> > 
> > Other than debug, the method is useful in cases when we would like to
> > utilize QEMU's virtual cpu initialization routines and put the retrieved
> > values into kernel CPUID overriding mechanics for more precise control
> > over how various processes perceive its underlying hardware with
> > container processes as a good example.
> 
> 
> existing 'query-cpu-definitions' does return feature bits that are actually
> supported by qemu/host combination, why do we need a second very simillar 
> interface?
> 
We've examined 'query-cpu-definitions' as well as 'query-cpu-model-expansion',
which is an even better fit for the job. But both methods just provide a list of
cpu features, while leaving CPUID generation out of their scope.
Here is an example output from 'query-cpu-model-expansion':

{
"return": {
  "model": {
 "name": "max",
 "props": {
   "vmx-entry-load-rtit-ctl": false,
   "phys-bits": 0,
   "core-id": -1,
   "svme-addr-chk": false,
   "xlevel": 2147483656,
   "cmov": true,
   "ia64": false,
   "ssb-no": false,
   "aes": false,
   "vmx-apicv-xapic": true,
   ...

However, having this information we are only half-way there. We still need to
somehow convert all of it into CPUID leaves that we can hand out to callers of
the 'cpuid' instruction. As we can see in the above listing, the output is not
even a uniform list of cpu features. It's unordered information, each piece of
which matters for virtual cpu presentation in its own way.

To construct CPUID leaves from that, the application would need ALL the
knowledge about each entry in the above property list. This is the kind of code
we naturally want to avoid writing, knowing that there is already a function
that does exactly that.
I'm talking about 'cpu_x86_cpuid' in the QEMU sources. It already does the
whole CPUID response construction. Just looking at its listing, the function
seems to be pretty complex, so writing the same logic elsewhere would repeat
the same complexity and all the risks. Also, it's public code, so it's
guaranteed to be revisited, improved and bug-fixed often.
Utilizing this function is therefore an easy choice, in fact almost no choice
at all. All we need is an API that can fetch results from this function, which
is exactly what our new QMP method does. The method is pretty straightforward,
so there will not be much to maintain, compared to the effort that would be
needed to support future CPU features.

> > 
> > Output format:
> > The core part of the returned JSON object can be described as a list of 
> > lists
> > with top level list contains leaf-level elements and the bottom level
> > containing subleafs, where 'leaf' is CPUID argument passed in EAX register 
> > and
> > 'subleaf' is a value passed to CPUID in ECX register for some specific
> > leafs, that support that. Each most basic CPUID result is passed in a
> > maximum of 4 registers EAX, EBX, ECX and EDX, with most leafs not utilizing
> > all 4 registers at once.
> > Also note that 'subleaf' is a kind of extension, used by only a couple of
> > leafs, while most of the leafs don't have this. Nevertheless, the output
> > data structure presents ALL leafs as having at least a single 'subleaf'.
> > This is done for data structure uniformity, so that it could be
> > processed in a more straightforward manner, in this case no one suffers
> > from such simplification.
> > 
> > Use example:
> > virsh qemu-monitor-command VM --pretty '{ "execute": 
> > "query-cpu-model-cpuid" }'
> > 
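As a rough illustration of the leaf/subleaf shape described above (hypothetical C types, not the actual QAPI schema or QEMU code): the returned structure is essentially a table keyed by the (leaf, subleaf) pair of CPUID inputs, with the four result registers per entry, and leaves without subleaves carrying a single subleaf 0.

```c
#include <stdint.h>

/* Hypothetical mirror of one entry in the described JSON output. */
typedef struct {
    uint32_t leaf;     /* CPUID input passed in EAX */
    uint32_t subleaf;  /* CPUID input passed in ECX (0 when unused) */
    uint32_t eax, ebx, ecx, edx;  /* the four result registers */
} CpuidEntry;

/* Look up one (leaf, subleaf) pair in a flat table of n entries. */
static const CpuidEntry *find_entry(const CpuidEntry *t, int n,
                                    uint32_t leaf, uint32_t subleaf)
{
    for (int i = 0; i < n; i++) {
        if (t[i].leaf == leaf && t[i].subleaf == subleaf) {
            return &t[i];
        }
    }
    return 0;  /* leaf/subleaf not reported by the virtual cpu */
}
```

A consumer (e.g. a kernel CPUID-override mechanism) would iterate such a table rather than re-deriving register values from the feature-property list.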

[PATCH v3 1/3] Linux headers: update from 5.12-rc3

2021-03-30 Thread Ravi Bangoria
Update against Linux 5.12-rc3

Signed-off-by: Ravi Bangoria 
---
 include/standard-headers/drm/drm_fourcc.h | 23 -
 include/standard-headers/linux/input.h|  2 +-
 .../standard-headers/rdma/vmw_pvrdma-abi.h|  7 ++
 linux-headers/asm-generic/unistd.h|  4 +-
 linux-headers/asm-mips/unistd_n32.h   |  1 +
 linux-headers/asm-mips/unistd_n64.h   |  1 +
 linux-headers/asm-mips/unistd_o32.h   |  1 +
 linux-headers/asm-powerpc/kvm.h   |  2 +
 linux-headers/asm-powerpc/unistd_32.h |  1 +
 linux-headers/asm-powerpc/unistd_64.h |  1 +
 linux-headers/asm-s390/unistd_32.h|  1 +
 linux-headers/asm-s390/unistd_64.h|  1 +
 linux-headers/asm-x86/kvm.h   |  1 +
 linux-headers/asm-x86/unistd_32.h |  1 +
 linux-headers/asm-x86/unistd_64.h |  1 +
 linux-headers/asm-x86/unistd_x32.h|  1 +
 linux-headers/linux/kvm.h | 89 +++
 linux-headers/linux/vfio.h| 27 ++
 18 files changed, 161 insertions(+), 4 deletions(-)

diff --git a/include/standard-headers/drm/drm_fourcc.h 
b/include/standard-headers/drm/drm_fourcc.h
index c47e19810c..a61ae520c2 100644
--- a/include/standard-headers/drm/drm_fourcc.h
+++ b/include/standard-headers/drm/drm_fourcc.h
@@ -526,6 +526,25 @@ extern "C" {
  */
 #define I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS fourcc_mod_code(INTEL, 7)
 
+/*
+ * Intel Color Control Surface with Clear Color (CCS) for Gen-12 render
+ * compression.
+ *
+ * The main surface is Y-tiled and is at plane index 0 whereas CCS is linear
+ * and at index 1. The clear color is stored at index 2, and the pitch should
+ * be ignored. The clear color structure is 256 bits. The first 128 bits
+ * represents Raw Clear Color Red, Green, Blue and Alpha color each represented
+ * by 32 bits. The raw clear color is consumed by the 3d engine and generates
+ * the converted clear color of size 64 bits. The first 32 bits store the Lower
+ * Converted Clear Color value and the next 32 bits store the Higher Converted
+ * Clear Color value when applicable. The Converted Clear Color values are
+ * consumed by the DE. The last 64 bits are used to store Color Discard Enable
+ * and Depth Clear Value Valid which are ignored by the DE. A CCS cache line
+ * corresponds to an area of 4x1 tiles in the main surface. The main surface
+ * pitch is required to be a multiple of 4 tile widths.
+ */
+#define I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS_CC fourcc_mod_code(INTEL, 8)
+
 /*
  * Tiled, NV12MT, grouped in 64 (pixels) x 32 (lines) -sized macroblocks
  *
@@ -1035,9 +1054,9 @@ drm_fourcc_canonicalize_nvidia_format_mod(uint64_t 
modifier)
  * Not all combinations are valid, and different SoCs may support different
  * combinations of layout and options.
  */
-#define __fourcc_mod_amlogic_layout_mask 0xf
+#define __fourcc_mod_amlogic_layout_mask 0xff
 #define __fourcc_mod_amlogic_options_shift 8
-#define __fourcc_mod_amlogic_options_mask 0xf
+#define __fourcc_mod_amlogic_options_mask 0xff
 
 #define DRM_FORMAT_MOD_AMLOGIC_FBC(__layout, __options) \
fourcc_mod_code(AMLOGIC, \
diff --git a/include/standard-headers/linux/input.h 
b/include/standard-headers/linux/input.h
index f89c986190..7822c24178 100644
--- a/include/standard-headers/linux/input.h
+++ b/include/standard-headers/linux/input.h
@@ -81,7 +81,7 @@ struct input_id {
  * in units per radian.
  * When INPUT_PROP_ACCELEROMETER is set the resolution changes.
  * The main axes (ABS_X, ABS_Y, ABS_Z) are then reported in
- * in units per g (units/g) and in units per degree per second
+ * units per g (units/g) and in units per degree per second
  * (units/deg/s) for rotational axes (ABS_RX, ABS_RY, ABS_RZ).
  */
 struct input_absinfo {
diff --git a/include/standard-headers/rdma/vmw_pvrdma-abi.h 
b/include/standard-headers/rdma/vmw_pvrdma-abi.h
index 0989426a3f..c30182a7ae 100644
--- a/include/standard-headers/rdma/vmw_pvrdma-abi.h
+++ b/include/standard-headers/rdma/vmw_pvrdma-abi.h
@@ -133,6 +133,13 @@ enum pvrdma_wc_flags {
PVRDMA_WC_FLAGS_MAX = PVRDMA_WC_WITH_NETWORK_HDR_TYPE,
 };
 
+enum pvrdma_network_type {
+   PVRDMA_NETWORK_IB,
+   PVRDMA_NETWORK_ROCE_V1 = PVRDMA_NETWORK_IB,
+   PVRDMA_NETWORK_IPV4,
+   PVRDMA_NETWORK_IPV6
+};
+
 struct pvrdma_alloc_ucontext_resp {
uint32_t qp_tab_size;
uint32_t reserved;
diff --git a/linux-headers/asm-generic/unistd.h 
b/linux-headers/asm-generic/unistd.h
index 7287529177..ce58cff99b 100644
--- a/linux-headers/asm-generic/unistd.h
+++ b/linux-headers/asm-generic/unistd.h
@@ -861,9 +861,11 @@ __SYSCALL(__NR_faccessat2, sys_faccessat2)
 __SYSCALL(__NR_process_madvise, sys_process_madvise)
 #define __NR_epoll_pwait2 441
 __SC_COMP(__NR_epoll_pwait2, sys_epoll_pwait2, compat_sys_epoll_pwait2)
+#define __NR_mount_setattr 442
+__SYSCALL(__NR_mount_setattr, sys_mount_setattr)
 
 #undef __NR_syscalls
-#define 

Re: [PATCH v4 for-6.0? 0/3] qcow2: fix parallel rewrite and discard (rw-lock)

2021-03-30 Thread Vladimir Sementsov-Ogievskiy

30.03.2021 12:49, Max Reitz wrote:

On 25.03.21 20:12, Vladimir Sementsov-Ogievskiy wrote:

ping. Do we want it for 6.0?


I’d rather wait.  I think the conclusion was that guests shouldn’t hit this 
because they serialize discards?


I think that we never had bug reports about this, so we can of course wait.



There’s also something Kevin wrote on IRC a couple of weeks ago, for which I
had hoped he’d send an email, but I don’t think he did, so I’ll try to remember
and paraphrase as well as I can...

He basically asked whether it wouldn’t be conceptually simpler to take a 
reference to some cluster in get_cluster_offset() and later release it with a 
to-be-added put_cluster_offset().

He also noted that reading is problematic, too, because if you read a discarded 
and reused cluster, this might result in an information leak (some guest 
application might be able to read data it isn’t allowed to read); that’s why 
making get_cluster_offset() the point of locking clusters against discarding 
would be better.


Yes, I thought about read too, (RFCed in cover letter of [PATCH v5 0/6] qcow2: 
fix parallel rewrite and discard (lockless))



This would probably work with both of your solutions.  For the in-memory 
solutions, you’d take a refcount to an actual cluster; in the CoRwLock 
solution, you’d take that lock.

What do you think?



Hmm. What do you mean? Just rename my qcow2_inflight_writes_inc() and 
qcow2_inflight_writes_dec() to get_cluster_offset()/put_cluster_offset(), to 
make it more natural to use for read operations as well?

Or to update every kind of "getting cluster offset" in the whole qcow2 driver to take a kind of 
"dynamic reference count" via get_cluster_offset() and then call a corresponding put() somewhere? In that 
case I'm afraid it's a lot more work. There would also be the problem that a lot of paths in qcow2 are not in 
coroutine context and don't even take s->lock when they actually should. It would also duplicate the job that 
normal qcow2 refcounts already do: there is no sense in keeping an additional "dynamic refcount" for an L2 
table cluster while reading it, as we already have a non-zero normal qcow2 refcount for it.


--
Best regards,
Vladimir



Re: Serious doubts about Gitlab CI

2021-03-30 Thread Thomas Huth

On 30/03/2021 13.19, Daniel P. Berrangé wrote:

On Mon, Mar 29, 2021 at 03:10:36PM +0100, Stefan Hajnoczi wrote:

Hi,
I wanted to follow up with a summary of the CI jobs:

1. Containers & Containers Layer2 - ~3 minutes/job x 39 jobs
2. Builds - ~50 minutes/job x 61 jobs
3. Tests - ~12 minutes/job x 20 jobs
4. Deploy - 52 minutes x 1 job


I hope that 52 was just a typo ... ?
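Taking the quoted per-stage averages at face value, the cost of one full pipeline adds up quickly; a rough back-of-the-envelope calculation (assuming those averages hold):

```python
# Rough total runner-minutes for one full pipeline, from the averages above.
stages = {
    "containers": 3 * 39,   # ~3 min/job x 39 jobs
    "builds":     50 * 61,  # ~50 min/job x 61 jobs
    "tests":      12 * 20,  # ~12 min/job x 20 jobs
    "deploy":     52 * 1,   # 52 min x 1 job
}
total_minutes = sum(stages.values())
print(total_minutes)  # 3459 minutes, i.e. roughly 58 hours of runner time
```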


I think a challenge we have with our incremental approach is that
we're not really taking into account the relative importance of the
different build scenarios, and often don't look at the big picture
of what a new job adds in terms of quality compared to existing
jobs.

eg Consider we have

   build-system-alpine:
   build-system-ubuntu:
   build-system-debian:
   build-system-fedora:
   build-system-centos:
   build-system-opensuse:


I guess we could go through that list of jobs and remove the duplicated 
target CPUs, e.g. it should be enough to test x86_64-softmmu only once.



   build-trace-multi-user:
   build-trace-ftrace-system:
   build-trace-ust-system:

I'd question whether we really need any of those 'build-trace'
jobs. Instead, we could have build-system-ubuntu pass
--enable-trace-backends=log,simple,syslog, build-system-debian
pass --enable-trace-backends=ust and build-system-fedora
pass --enable-trace-backends=ftrace, etc.


I recently had the very same idea already:

 https://gitlab.com/qemu-project/qemu/-/commit/65aff82076a9bbfdf7

:-)


Another example, is that we test builds on centos7 with
three different combos of crypto backend settings. This was
to exercise bugs we've seen in old crypto packages in RHEL-7
but in reality, it is probably overkill, because downstream
RHEL-7 only cares about one specific combination.


Care to send a patch? Or shall we just wait one more month and then remove 
these jobs (since we won't support RHEL7 after QEMU 6.0 anymore)?



We don't really have a clearly defined plan to identify what
the most important things are in our testing coverage, so we
tend to accept anything without questioning its value add.
This really feeds back into the idea I've brought up many
times in the past, that we need to better define what we aim
to support in QEMU and its quality level, which will influence
what are the scenarios we care about testing.


But code that we have in the repository should get at least some basic test 
coverage, otherwise it bitrots soon ... so maybe it's rather the other old 
problem that we struggle with: we should deprecate more code and remove it 
if nobody cares about it...



Traditionally ccache (https://ccache.dev/) was used to detect
recompilation of the same compiler input files. This is trickier to do
in GitLab CI since it would be necessary to share and update a cache,
potentially between untrusted users. Unfortunately this shifts the
bottleneck from CPU to network in a CI-as-a-Service environment since
the cached build output needs to be accessed by the linker on the CI
runner but is stored remotely.


Our docker containers install ccache already and I could have sworn
that we use that in gitlab, but now I'm not so sure. We're only
saving the "build/" directory as an artifact between jobs, and I'm
not sure that directory holds the ccache cache.


AFAIK we never really enabled ccache in the gitlab-CI, only in Travis.


This is as far as I've gotten with thinking about CI efficiency. Do you
think these optimizations are worth investigating or should we keep it
simple and just disable many builds by default?


ccache is a no-brainer and assuming it isn't already working with
our gitlab jobs, we must fix that asap.


I've found some nice instructions here:

https://gould.cx/ted/blog/2017/06/10/ccache-for-Gitlab-CI/

... and just kicked off a build with these modifications, let's see how it 
goes...
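For reference, the approach in that blog post boils down to caching ccache's directory between pipeline runs. A minimal sketch of what that looks like in .gitlab-ci.yml (the job name, cache key, and the /usr/lib/ccache compiler-wrapper path are illustrative assumptions, not the actual QEMU CI configuration):

```yaml
# Hypothetical sketch, not the real QEMU CI config: persist ccache's
# directory between runs and put the ccache compiler wrappers on PATH.
variables:
  CCACHE_DIR: "$CI_PROJECT_DIR/ccache"
  CCACHE_MAXSIZE: "500M"

build-example:
  cache:
    key: "$CI_JOB_NAME"        # one cache per job, so hits stay relevant
    paths:
      - ccache/
  before_script:
    - export PATH="/usr/lib/ccache:$PATH"
  script:
    - ccache --zero-stats
    - ./configure && make -j"$(nproc)"
    - ccache --show-stats      # verify the cache is actually being hit
```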



Aside from optimizing CI, we should consider whether there's more we
can do to optimize build process itself. We've done alot of work, but
there's still plenty of stuff we build multiple times, once for each
target. Perhaps there's scope for cutting this down in some manner ?


Right, I think we should also work more towards consolidating the QEMU 
binaries, to avoid having to build sooo many target binaries 
again and again. E.g.:


- Do we still need to support 32-bit hosts? If not we could
  finally get rid of qemu-system-i386, qemu-system-ppc,
  qemu-system-arm, etc. and just provide the 64-bit variants

- Could we maybe somehow unify the targets that have both, big
  and little endian versions? Then we could merge e.g.
  qemu-system-microblaze and qemu-system-microblazeel etc.

- Or could we maybe even build a unified qemu-system binary that
  contains all target CPUs? ... that would also allow e.g.
  machines with a x86 main CPU and an ARM-based board management
  controller...


I'm unclear how many jobs in CI build submodules, but if there's
more scope for using pre-built distro packages, that's going to
be beneficial for build time.



Re: Serious doubts about Gitlab CI

2021-03-30 Thread Philippe Mathieu-Daudé
On 3/30/21 1:55 PM, Thomas Huth wrote:
> On 30/03/2021 13.19, Daniel P. Berrangé wrote:
>> On Mon, Mar 29, 2021 at 03:10:36PM +0100, Stefan Hajnoczi wrote:

>>> Traditionally ccache (https://ccache.dev/) was used to detect
>>> recompilation of the same compiler input files. This is trickier to do
>>> in GitLab CI since it would be necessary to share and update a cache,
>>> potentially between untrusted users. Unfortunately this shifts the
>>> bottleneck from CPU to network in a CI-as-a-Service environment since
>>> the cached build output needs to be accessed by the linker on the CI
>>> runner but is stored remotely.
>>
>> Our docker containers install ccache already and I could have sworn
>> that we use that in gitlab, but now I'm not so sure. We're only
>> saving the "build/" directory as an artifact between jobs, and I'm
>> not sure that directory holds the ccache cache.
> 
> AFAIK we never really enabled ccache in the gitlab-CI, only in Travis.

Back then the Travis setup was simpler, and it took me 2 to 3 weeks
to get it right (probably spending 3 to 4h a day on it).

>>> This is as far as I've gotten with thinking about CI efficiency. Do you
>>> think these optimizations are worth investigating or should we keep it
>>> simple and just disable many builds by default?
>>
>> ccache is a no-brainer and assuming it isn't already working with
>> our gitlab jobs, we must fix that asap.
> 
> I've found some nice instructions here:
> 
> https://gould.cx/ted/blog/2017/06/10/ccache-for-Gitlab-CI/
> 
> ... and just kicked off a build with these modifications, let's see how
> it goes...

But we cross-build in Docker containers, so you need to mount the
cache dir in the container and set the CCACHE_DIR env var, don't you?

Watch out for the custom runners, though. If we do too many changes on the
free-tier runners, we'll never get the custom runner series integrated.

My 2 cents.

Regards,

Phil.



Re: [PATCH v4 for-6.0? 0/3] qcow2: fix parallel rewrite and discard (rw-lock)

2021-03-30 Thread Max Reitz

On 30.03.21 12:51, Vladimir Sementsov-Ogievskiy wrote:

30.03.2021 12:49, Max Reitz wrote:

On 25.03.21 20:12, Vladimir Sementsov-Ogievskiy wrote:

ping. Do we want it for 6.0?


I’d rather wait.  I think the conclusion was that guests shouldn’t hit 
this because they serialize discards?


I think we never had bugs from this, so we can of course wait.



There’s also something Kevin wrote on IRC a couple of weeks ago, for 
which I had hoped he’d sent an email but I don’t think he did, so I’ll 
try to remember and paraphrase as well as I can...


He basically asked whether it wouldn’t be conceptually simpler to take 
a reference to some cluster in get_cluster_offset() and later release 
it with a to-be-added put_cluster_offset().


He also noted that reading is problematic, too, because if you read a 
discarded and reused cluster, this might result in an information leak 
(some guest application might be able to read data it isn’t allowed to 
read); that’s why making get_cluster_offset() the point of locking 
clusters against discarding would be better.


Yes, I thought about read too, (RFCed in cover letter of [PATCH v5 0/6] 
qcow2: fix parallel rewrite and discard (lockless))




This would probably work with both of your solutions.  For the 
in-memory solutions, you’d take a refcount to an actual cluster; in 
the CoRwLock solution, you’d take that lock.


What do you think?



Hmm. What do you mean? Just rename my qcow2_inflight_writes_inc() and 
qcow2_inflight_writes_dec() to 
get_cluster_offset()/put_cluster_offset(), to make it more native to use 
for read operations as well?


Hm.  Our discussion wasn’t so detailed.

I interpreted it to mean all qcow2 functions that find an offset to a 
qcow2 cluster, namely qcow2_get_host_offset(), 
qcow2_alloc_host_offset(), and qcow2_alloc_compressed_cluster_offset().


When those functions return an offset (in)to some cluster, that cluster 
(or the image as a whole) should be locked against discards.  Every 
offset received this way would require an accompanying 
qcow2_put_host_offset().


Or to update any kind of "getting cluster offset" in the whole qcow2 
driver to take a kind of "dynamic reference count" by 
get_cluster_offset() and then call a corresponding put() somewhere? In 
this case I'm afraid it's a lot more work.


Hm, really?  I would have assumed we need to do some locking in all 
functions that get a cluster offset this way, so it should be less work 
to take the lock in the functions they invoke to get the offset.


There would also be the problem 
that a lot of paths in qcow2 are not in coroutine context and don't even 
take s->lock when they actually should.


I’m not sure what you mean here, because all functions that invoke any 
of the three functions I listed above are coroutine_fns (or, well, I 
didn’t look it up, but they all have *_co_* in their name).


This would also mean that we do the same 
job as normal qcow2 refcounts already do: no sense in keeping an additional 
"dynamic refcount" for an L2 table cluster while reading it, as we already 
have a non-zero normal qcow2 refcount for it.


I’m afraid I don’t understand how normal refcounts relate to this.  For 
example, qcow2_get_host_offset() doesn’t touch refcounts at all.


Max
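Kevin's get/put idea, as paraphrased in this thread, can be sketched abstractly. The following is a hypothetical Python illustration of the "dynamic refcount" scheme under discussion, not the actual qcow2 code (the class and method names are invented for this sketch):

```python
# Hypothetical sketch: every code path that obtains a host cluster offset
# (for reading or writing) takes a "dynamic" reference, and discards must
# wait until the in-flight count for that cluster drops to zero.
class InflightClusterTracker:
    def __init__(self):
        self._inflight = {}  # host cluster offset -> in-flight users

    def get_cluster_offset(self, cluster):
        # Pins the cluster: it may not be discarded or reused until the
        # matching put_cluster_offset() call.
        self._inflight[cluster] = self._inflight.get(cluster, 0) + 1
        return cluster

    def put_cluster_offset(self, cluster):
        assert self._inflight.get(cluster, 0) > 0
        self._inflight[cluster] -= 1
        if self._inflight[cluster] == 0:
            del self._inflight[cluster]

    def can_discard(self, cluster):
        # Blocking discards while reads are still in flight closes the
        # information leak from reading a discarded-and-reused cluster.
        return cluster not in self._inflight
```

In the CoRwLock variant, the same role would be played by taking the lock shared in get_cluster_offset() and releasing it in put_cluster_offset().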




[PULL 8/9] qcow2: Force preallocation with data-file-raw

2021-03-30 Thread Max Reitz
Setting the qcow2 data-file-raw bit means that you can ignore the
qcow2 metadata when reading from the external data file.  It does not
mean that you have to ignore it, though.  Therefore, the data read must
be the same regardless of whether you interpret the metadata or whether
you ignore it, and thus the L1/L2 tables must all be present and give a
1:1 mapping.
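The invariant can be stated concretely with a toy model (illustrative Python only, not qcow2 code): when the L1/L2 tables give a 1:1 mapping, reading through the metadata and reading the raw data file must agree at every offset.

```python
# Toy model of the data-file-raw invariant: guest offset -> host offset
# via a fully allocated L2 map that is the identity (1:1) mapping.
CLUSTER = 64 * 1024  # 64 KiB clusters, the qcow2 default

def read_via_metadata(l2_map, data, offset):
    cluster, within = divmod(offset, CLUSTER)
    host = l2_map.get(cluster)
    assert host is not None, "data-file-raw requires full preallocation"
    return data[host * CLUSTER + within]

def read_raw(data, offset):
    return data[offset]

data = bytes(range(256)) * 1024                         # 256 KiB raw data
identity = {c: c for c in range(len(data) // CLUSTER)}  # 1:1 mapping
assert all(read_via_metadata(identity, data, off) == read_raw(data, off)
           for off in (0, 1, CLUSTER, len(data) - 1))
```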

This patch changes 244's output: First, the qcow2 file is larger right
after creation, because of metadata preallocation.  Second, the qemu-img
map output changes: Everything that was not explicitly discarded or
zeroed is now a data area.

Signed-off-by: Max Reitz 
Message-Id: <20210326145509.163455-2-mre...@redhat.com>
Reviewed-by: Eric Blake 
---
 block/qcow2.c  | 34 ++
 tests/qemu-iotests/244.out |  9 -
 2 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 2fb43c6f7e..9727ae8fe3 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3503,6 +3503,28 @@ qcow2_co_create(BlockdevCreateOptions *create_options, 
Error **errp)
 ret = -EINVAL;
 goto out;
 }
+if (qcow2_opts->data_file_raw &&
+qcow2_opts->preallocation == PREALLOC_MODE_OFF)
+{
+/*
+ * data-file-raw means that "the external data file can be
+ * read as a consistent standalone raw image without looking
+ * at the qcow2 metadata."  It does not say that the metadata
+ * must be ignored, though (and the qcow2 driver in fact does
+ * not ignore it), so the L1/L2 tables must be present and
+ * give a 1:1 mapping, so you get the same result regardless
+ * of whether you look at the metadata or whether you ignore
+ * it.
+ */
+qcow2_opts->preallocation = PREALLOC_MODE_METADATA;
+
+/*
+ * Cannot use preallocation with backing files, but giving a
+ * backing file when specifying data_file_raw is an error
+ * anyway.
+ */
+assert(!qcow2_opts->has_backing_file);
+}
 
 if (qcow2_opts->data_file) {
 if (version < 3) {
@@ -4238,6 +4260,18 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 error_setg_errno(errp, -ret, "Failed to grow the L1 table");
 goto fail;
 }
+
+if (data_file_is_raw(bs) && prealloc == PREALLOC_MODE_OFF) {
+/*
+ * When creating a qcow2 image with data-file-raw, we enforce
+ * at least prealloc=metadata, so that the L1/L2 tables are
+ * fully allocated and reading from the data file will return
+ * the same data as reading from the qcow2 image.  When the
+ * image is grown, we must consequently preallocate the
+ * metadata structures to cover the added area.
+ */
+prealloc = PREALLOC_MODE_METADATA;
+}
 }
 
 switch (prealloc) {
diff --git a/tests/qemu-iotests/244.out b/tests/qemu-iotests/244.out
index 7269b4295a..1a3ae31dde 100644
--- a/tests/qemu-iotests/244.out
+++ b/tests/qemu-iotests/244.out
@@ -83,7 +83,7 @@ qcow2 file size after I/O: 327680
 === Standalone image with external data file (valid raw) ===
 
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
data_file=TEST_DIR/t.IMGFMT.data data_file_raw=on
-qcow2 file size before I/O: 196616
+qcow2 file size before I/O: 327680
 
 wrote 4194304/4194304 bytes at offset 1048576
 4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -93,11 +93,10 @@ wrote 3145728/3145728 bytes at offset 3145728
 3 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 No errors were found on the image.
 
-[{ "start": 0, "length": 1048576, "depth": 0, "zero": true, "data": false},
-{ "start": 1048576, "length": 1048576, "depth": 0, "zero": false, "data": 
true, "offset": 1048576},
+[{ "start": 0, "length": 2097152, "depth": 0, "zero": false, "data": true, 
"offset": 0},
 { "start": 2097152, "length": 2097152, "depth": 0, "zero": true, "data": 
false},
-{ "start": 4194304, "length": 1048576, "depth": 0, "zero": true, "data": 
false, "offset": 4194304},
-{ "start": 5242880, "length": 61865984, "depth": 0, "zero": true, "data": 
false}]
+{ "start": 4194304, "length": 2097152, "depth": 0, "zero": true, "data": 
false, "offset": 4194304},
+{ "start": 6291456, "length": 60817408, "depth": 0, "zero": false, "data": 
true, "offset": 6291456}]
 
 read 1048576/1048576 bytes at offset 0
 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-- 
2.29.2




[PULL 9/9] iotests/244: Test preallocation for data-file-raw

2021-03-30 Thread Max Reitz
Three test cases:
(1) Adding a qcow2 (metadata) file to an existing data file, see whether
we can read the existing data through the qcow2 image.
(2) Append data to the data file, grow the qcow2 image accordingly, see
whether we can read the new data through the qcow2 image.
(3) At runtime, add a backing image to a freshly created qcow2 image
with an external data file (with data-file-raw).  Reading data from
the qcow2 image must return the same result as reading data from the
data file, so everything in the backing image must be ignored.
(This did not use to be the case, because without the L2 tables
preallocated, all clusters would appear as unallocated, and so the
qcow2 driver would fall through to the backing file.)

Signed-off-by: Max Reitz 
Message-Id: <20210326145509.163455-3-mre...@redhat.com>
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/244 | 104 +
 tests/qemu-iotests/244.out |  59 +
 2 files changed, 163 insertions(+)

diff --git a/tests/qemu-iotests/244 b/tests/qemu-iotests/244
index a46b441627..3e61fa25bb 100755
--- a/tests/qemu-iotests/244
+++ b/tests/qemu-iotests/244
@@ -38,6 +38,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 # get standard environment, filters and checks
 . ./common.rc
 . ./common.filter
+. ./common.qemu
 
 _supported_fmt qcow2
 _supported_proto file
@@ -267,6 +268,109 @@ case $result in
 ;;
 esac
 
+echo
+echo '=== Preallocation with data-file-raw ==='
+
+echo
+echo '--- Using a non-zeroed data file ---'
+
+# Using data-file-raw must enforce at least metadata preallocation so
+# that it does not matter whether one reads the raw file or the qcow2
+# file
+
+# Pre-create the data file, write some data.  Real-world use cases for
+# this are adding a qcow2 metadata file to a block device (i.e., using
+# the device as the data file) or adding qcow2 features to pre-existing
+# raw images (e.g. because the user now wants persistent dirty bitmaps).
+truncate -s 1M "$TEST_IMG.data"
+$QEMU_IO -f raw -c 'write -P 42 0 1M' "$TEST_IMG.data" | _filter_qemu_io
+
+# We cannot use qemu-img to create the qcow2 image, because it would
+# clear the data file.  Use the blockdev-create job instead, which will
+# only format the qcow2 image file.
+touch "$TEST_IMG"
+_launch_qemu \
+-blockdev file,node-name=data,filename="$TEST_IMG.data" \
+-blockdev file,node-name=meta,filename="$TEST_IMG"
+
+_send_qemu_cmd $QEMU_HANDLE '{ "execute": "qmp_capabilities" }' 'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+'{ "execute": "blockdev-create",
+   "arguments": {
+   "job-id": "create",
+   "options": {
+   "driver": "qcow2",
+   "size": '"$((1 * 1024 * 1024))"',
+   "file": "meta",
+   "data-file": "data",
+   "data-file-raw": true
+   } } }' \
+'"status": "concluded"'
+
+_send_qemu_cmd $QEMU_HANDLE \
+'{ "execute": "job-dismiss", "arguments": { "id": "create" } }' \
+'return'
+
+_cleanup_qemu
+
+echo
+echo 'Comparing pattern:'
+
+# Reading from either the qcow2 file or the data file should return
+# the same result:
+$QEMU_IO -f raw -c 'read -P 42 0 1M' "$TEST_IMG.data" | _filter_qemu_io
+$QEMU_IO -f $IMGFMT -c 'read -P 42 0 1M' "$TEST_IMG" | _filter_qemu_io
+
+# For good measure
+$QEMU_IMG compare -f raw "$TEST_IMG.data" "$TEST_IMG"
+
+echo
+echo '--- Truncation (growing) ---'
+
+# Append some new data to the raw file, then resize the qcow2 image
+# accordingly and see whether the new data is visible.  Technically
+# that is not allowed, but it is reasonable behavior, so test it.
+truncate -s 2M "$TEST_IMG.data"
+$QEMU_IO -f raw -c 'write -P 84 1M 1M' "$TEST_IMG.data" | _filter_qemu_io
+
+$QEMU_IMG resize "$TEST_IMG" 2M
+
+echo
+echo 'Comparing pattern:'
+
+$QEMU_IO -f raw -c 'read -P 42 0 1M' -c 'read -P 84 1M 1M' "$TEST_IMG.data" \
+| _filter_qemu_io
+$QEMU_IO -f $IMGFMT -c 'read -P 42 0 1M' -c 'read -P 84 1M 1M' "$TEST_IMG" \
+| _filter_qemu_io
+
+$QEMU_IMG compare -f raw "$TEST_IMG.data" "$TEST_IMG"
+
+echo
+echo '--- Giving a backing file at runtime ---'
+
+# qcow2 files with data-file-raw cannot have backing files given by
+# their image header, but qemu will allow you to set a backing node at
+# runtime -- it should not have any effect, though (because reading
+# from the qcow2 node should return the same data as reading from the
+# raw node).
+
+_make_test_img -o "data_file=$TEST_IMG.data,data_file_raw=on" 1M
+TEST_IMG="$TEST_IMG.base" _make_test_img 1M
+
+# Write something that is not zero into the base image
+$QEMU_IO -c 'write -P 42 0 1M' "$TEST_IMG.base" | _filter_qemu_io
+
+echo
+echo 'Comparing qcow2 image and raw data file:'
+
+# $TEST_IMG and $TEST_IMG.data must show the same data at all times;
+# that is, the qcow2 node must not fall through to the backing image
+# at any point
+$QEMU_IMG compare --image-opts \
+"driver=raw,file.filename=$TEST_IMG.data"  \
+

Re: [PATCH for-6.0 1/7] hw/block/nvme: fix pi constraint check

2021-03-30 Thread Gollu Appalanaidu

On Tue, Mar 30, 2021 at 09:24:59AM +0200, Klaus Jensen wrote:

On Mar 29 19:52, Gollu Appalanaidu wrote:

On Wed, Mar 24, 2021 at 09:09:01PM +0100, Klaus Jensen wrote:
> From: Klaus Jensen 
>
> Protection Information can only be enabled if there is at least 8 bytes
> of metadata.
>
> Signed-off-by: Klaus Jensen 
> ---
> hw/block/nvme-ns.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
> index 7f8d139a8663..ca04ee1bacfb 100644
> --- a/hw/block/nvme-ns.c
> +++ b/hw/block/nvme-ns.c
> @@ -394,7 +394,7 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, 
Error **errp)
> return -1;
> }
>
> -if (ns->params.pi && !ns->params.ms) {
> +if (ns->params.pi && ns->params.ms < 8) {
Also, would it be good to check whether the "metadata size" is a power of 2 or not?



While I don't expect a lot of real-world devices to have metadata sizes
that are not powers of two, there is no such requirement in the spec.

And the implementation here also does not require it :)


Reviewed-by: Gollu Appalanaidu 


[PULL 5/5] hw/timer/renesas_tmr: Add default-case asserts in read_tcnt()

2021-03-30 Thread Peter Maydell
In commit 81b3ddaf8772ec we fixed a use of uninitialized data
in read_tcnt(). However this change wasn't enough to placate
Coverity, which is not smart enough to see that if we read a
2 bit field and then handle cases 0, 1, 2 and 3 then there cannot
be a flow of execution through the switch default. Add explicit
default cases which assert that they can't be reached, which
should help silence Coverity.

Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Message-id: 20210319162458.13760-1-peter.mayd...@linaro.org
---
 hw/timer/renesas_tmr.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/timer/renesas_tmr.c b/hw/timer/renesas_tmr.c
index eed39917fec..d96002e1ee6 100644
--- a/hw/timer/renesas_tmr.c
+++ b/hw/timer/renesas_tmr.c
@@ -146,6 +146,8 @@ static uint16_t read_tcnt(RTMRState *tmr, unsigned size, 
int ch)
 case CSS_CASCADING:
 tcnt[1] = tmr->tcnt[1];
 break;
+default:
+g_assert_not_reached();
 }
 switch (FIELD_EX8(tmr->tccr[0], TCCR, CSS)) {
 case CSS_INTERNAL:
@@ -159,6 +161,8 @@ static uint16_t read_tcnt(RTMRState *tmr, unsigned size, 
int ch)
 case CSS_EXTERNAL: /* QEMU doesn't implement this */
 tcnt[0] = tmr->tcnt[0];
 break;
+default:
+g_assert_not_reached();
 }
 } else {
 tcnt[0] = tmr->tcnt[0];
-- 
2.20.1




[PULL 3/5] hw/arm/smmuv3: Drop unused CDM_VALID() and is_cd_valid()

2021-03-30 Thread Peter Maydell
From: Zenghui Yu 

They were introduced in commit 9bde7f0674fe ("hw/arm/smmuv3: Implement
translate callback") but never actually used. Drop them.

Signed-off-by: Zenghui Yu 
Acked-by: Eric Auger 
Message-id: 20210325142702.790-1-yuzeng...@huawei.com
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 hw/arm/smmuv3-internal.h | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index b6f7e53b7c7..3dac5766ca3 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -595,13 +595,6 @@ static inline int pa_range(STE *ste)
 #define CD_A(x)  extract32((x)->word[1], 14, 1)
 #define CD_AARCH64(x)extract32((x)->word[1], 9 , 1)
 
-#define CDM_VALID(x)((x)->word[0] & 0x1)
-
-static inline int is_cd_valid(SMMUv3State *s, STE *ste, CD *cd)
-{
-return CD_VALID(cd);
-}
-
 /**
  * tg2granule - Decodes the CD translation granule size field according
  * to the ttbr in use
-- 
2.20.1




Re: [PATCH v4 for-6.0? 0/3] qcow2: fix parallel rewrite and discard (rw-lock)

2021-03-30 Thread Max Reitz

On 25.03.21 20:12, Vladimir Sementsov-Ogievskiy wrote:

ping. Do we want it for 6.0?


I’d rather wait.  I think the conclusion was that guests shouldn’t hit 
this because they serialize discards?


There’s also something Kevin wrote on IRC a couple of weeks ago, for 
which I had hoped he’d sent an email but I don’t think he did, so I’ll 
try to remember and paraphrase as well as I can...


He basically asked whether it wouldn’t be conceptually simpler to take a 
reference to some cluster in get_cluster_offset() and later release it 
with a to-be-added put_cluster_offset().


He also noted that reading is problematic, too, because if you read a 
discarded and reused cluster, this might result in an information leak 
(some guest application might be able to read data it isn’t allowed to 
read); that’s why making get_cluster_offset() the point of locking 
clusters against discarding would be better.


This would probably work with both of your solutions.  For the in-memory 
solutions, you’d take a refcount to an actual cluster; in the CoRwLock 
solution, you’d take that lock.


What do you think?

Max




Re: [PATCH v2 0/6] esp: fix asserts/segfaults discovered by fuzzer

2021-03-30 Thread Paolo Bonzini

On 30/03/21 09:34, Mark Cave-Ayland wrote:

Hi Paolo,

I had a quick look at Alex's updated test cases and most of them are 
based on an incorrect assumption I made around the behaviour of 
fifo8_pop_buf(). Can you drop these for now, and I will submit a v3 
shortly once I've given it a full run through my test images?


Hi,

I also had some failures of the tests on CI, which is why I hadn't 
incorporated these changes yet.  Thanks for the advance warning, I'll 
wait for your v3.


Paolo




Re: [RFC v1] hw/smbios: support for type 41 (onboard devices extended information)

2021-03-30 Thread Daniel P . Berrangé
On Sun, Mar 28, 2021 at 10:57:26PM +0200, Vincent Bernat wrote:
> Type 41 defines the attributes of devices that are onboard. The
> original intent was to imply the BIOS had some level of control over
> the enablement of the associated devices.
> 
> If network devices are present in this table, by default, udev will
> name the corresponding interfaces enoX, X being the instance number.
> Without such information, udev will fall back to using the PCI ID and
> this usually gives ens3 or ens4. This can be a bit annoying as the
> name of the network card may depend on the order of options and may
> change if a new PCI device is added earlier on the command line.
> Being able to provide an SMBIOS type 41 entry ensures the name of the
> interface won't change and helps the user guess the right name without
> booting a first time.
> 
> This can be invoked with:
> 
> $QEMU -netdev user,id=internet
>   -device virtio-net-pci,mac=50:54:00:00:00:42,netdev=internet \
>   -smbios type=41,designation=Onboard 
> LAN,instance=1,kind=ethernet,pci=:00:09.0
> 
> Which results in the guest seeing dmidecode data and the interface
> exposed as "eno1":
> 
> $ dmidecode -t 41
> # dmidecode 3.3
> Getting SMBIOS data from sysfs.
> SMBIOS 2.8 present.
> 
> Handle 0x2900, DMI type 41, 11 bytes
> Onboard Device
> Reference Designation: Onboard LAN
> Type: Ethernet
> Status: Enabled
> Type Instance: 1
> Bus Address: :00:09.0
> $ udevadm info -p /sys/class/net/eno1 | grep ONBOARD
> E: ID_NET_NAME_ONBOARD=eno1
> E: ID_NET_LABEL_ONBOARD=Onboard LAN
> 
> The original plan was to directly provide a device and populate "kind"
> and "pci" from the device. However, since the SMBIOS tables are built
> during argument evaluation, the information is not yet available.
> I would welcome some guidance on how to implement this.

I'm not sure I see the problem you're describing here, could
you elaborate?

I see the SMBIOS tables are built by the smbios_get_tables() method.
This is called from qemu_init(), after all arguments have been
processed and devices have been created.

It seems like this should allow SMBIOS tables to be auto-populated
from the NICs listed in -device args previously.


Note, if we're going to auto-populate the SMBIOS type 41 tables
from -device args, then we'll need to make this behaviour
configurable via a property, so that we can ensure this only
applies to new machine types.

Regards,
Daniel
-- 
|: https://berrange.com       -o-   https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org        -o-   https://fstop138.berrange.com :|
|: https://entangle-photo.org -o-   https://www.instagram.com/dberrange :|




Re: [PATCH v3 4/5] qemu-iotests: let "check" spawn an arbitrary test command

2021-03-30 Thread Max Reitz

On 26.03.21 16:05, Max Reitz wrote:

On 26.03.21 15:23, Paolo Bonzini wrote:

Right now there is no easy way for "check" to print a reproducer command.
Because such a reproducer command line would be huge, we can instead 
teach

check to start a command of our choice.  This can be for example a Python
unit test with arguments to only run a specific subtest.

Move the trailing empty line to print_env(), since it always looks better
and one caller was not adding it.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Tested-by: Emanuele Giuseppe Esposito 
Message-Id: <20210323181928.311862-5-pbonz...@redhat.com>
---
  tests/qemu-iotests/check | 18 +-
  tests/qemu-iotests/testenv.py    |  3 ++-
  tests/qemu-iotests/testrunner.py |  1 -
  3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index d1c87ceaf1..df9fd733ff 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -19,6 +19,9 @@
  import os
  import sys
  import argparse
+import shutil
+from pathlib import Path
+
  from findtests import TestFinder
  from testenv import TestEnv
  from testrunner import TestRunner
@@ -101,7 +104,7 @@ def make_argparser() -> argparse.ArgumentParser:
 'rerun failed ./check command, starting from 
the '

 'middle of the process.')
  g_sel.add_argument('tests', metavar='TEST_FILES', nargs='*',
-   help='tests to run')
+   help='tests to run, or "--" followed by a 
command')

  return p
@@ -114,6 +117,19 @@ if __name__ == '__main__':
    imgopts=args.imgopts, misalign=args.misalign,
    debug=args.debug, valgrind=args.valgrind)
+    if len(sys.argv) > 1 and sys.argv[-len(args.tests)-1] == '--':
+    if not args.tests:
+    sys.exit("missing command after '--'")
+    cmd = args.tests
+    env.print_env()
+    exec_path = Path(shutil.which(cmd[0]))


297 says:

check:125: error: Argument 1 to "Path" has incompatible type 
"Optional[str]"; expected "Union[str, _PathLike[str]]"

Found 1 error in 1 file (checked 1 source file)

Normally I’d assert this away, but actually I think the returned value 
should be checked and we should print an error if it’s None.  (Seems 
like shutil.which() doesn’t raise an exception if there is no such 
command, it just returns None.)


Max


+    if exec_path is None:
+    sys.exit('command not found: ' + cmd[0])


Oh, I see, the intent to print an error is actually there.  The problem 
is just that Path(None) throws an exception, so we must check 
shutil.which()’s return value.


I’ll squash this in if you don’t mind:

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index df9fd733ff..e2230f5612 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -122,9 +122,10 @@ if __name__ == '__main__':
 sys.exit("missing command after '--'")
 cmd = args.tests
 env.print_env()
-exec_path = Path(shutil.which(cmd[0]))
-if exec_path is None:
+exec_pathstr = shutil.which(cmd[0])
+if exec_pathstr is None:
 sys.exit('command not found: ' + cmd[0])
+exec_path = Path(exec_pathstr)
 cmd[0] = exec_path.resolve()
 full_env = env.prepare_subprocess(cmd)
 os.chdir(Path(exec_path).parent)


+    cmd[0] = exec_path.resolve()
+    full_env = env.prepare_subprocess(cmd)
+    os.chdir(Path(exec_path).parent)
+    os.execve(cmd[0], cmd, full_env)
+
  testfinder = TestFinder(test_dir=env.source_iotests)
  groups = args.groups.split(',') if args.groups else None
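The corrected flow being assembled in this thread — look the command up with shutil.which(), bail out on None, resolve the path, then exec — can be sketched as a standalone snippet (the function name and the env default here are illustrative, not the final patch):

```python
import os
import shutil
import sys
from pathlib import Path

def exec_command(cmd, env=None):
    """Replace the current process with cmd, resolved via PATH.

    shutil.which() returns None (it does not raise) when the command
    does not exist, so the None check must happen *before* the result
    is wrapped in Path() -- Path(None) raises TypeError.
    """
    exec_pathstr = shutil.which(cmd[0])
    if exec_pathstr is None:
        sys.exit('command not found: ' + cmd[0])
    exec_path = Path(exec_pathstr).resolve()  # eliminate symlinks and '..'
    cmd[0] = str(exec_path)                   # keep argv entries as strings
    os.chdir(exec_path.parent)
    os.execve(cmd[0], cmd, env if env is not None else dict(os.environ))
```

Since sys.exit() raises SystemExit, the not-found path is easy to exercise without actually exec'ing anything.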
diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index fca3a609e0..cd0e39b789 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -284,7 +284,8 @@ def print_env(self) -> None:
  PLATFORM  -- {platform}
  TEST_DIR  -- {TEST_DIR}
  SOCK_DIR  -- {SOCK_DIR}
-SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}"""
+SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}
+"""
  args = collections.defaultdict(str, self.get_env())
diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index 519924dc81..2f56ac545d 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -316,7 +316,6 @@ def run_tests(self, tests: List[str]) -> bool:
  if not self.makecheck:
  self.env.print_env()
-    print()
  test_field_width = max(len(os.path.basename(t)) for t in 
tests) + 2









Re: [PATCH v2 0/2] qcow2: Force preallocation with data-file-raw

2021-03-30 Thread Max Reitz

On 26.03.21 15:55, Max Reitz wrote:

v1: https://lists.nongnu.org/archive/html/qemu-block/2020-06/msg00992.html


Hi,

I think that qcow2 images with data-file-raw should always have
preallocated 1:1 L1/L2 tables, so that the image always looks the same
whether you respect or ignore the qcow2 metadata.  The easiest way to
achieve that is to enforce at least metadata preallocation whenever
data-file-raw is given.


Thanks for the review, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Max




Re: [PATCH v5 4/6] qcow2: introduce inflight-write-counters

2021-03-30 Thread Vladimir Sementsov-Ogievskiy

26.03.2021 23:00, Vladimir Sementsov-Ogievskiy wrote:

We have a bug in qcow2: assume we've started a data write into host
cluster A while s->lock is unlocked. During the write, the refcount of
cluster A may become zero, the cluster may be reallocated for other
needs, and our in-flight write becomes a use-after-free. More details
will be in the later commit that actually fixes the bug.

For now, let's prepare the infrastructure for the following fix. We are
going to track these in-flight data writes, so we create a hash map

   cluster_index -> Qcow2InFlightWriteCounter

For now, add only the basic structure and simple counting logic. No
guest write is actually counted yet; we only add the infrastructure.
Qcow2InFlightWriteCounter will be expanded in the following commit,
which is why we need a structure.
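The counting scheme described above — a map that holds an entry for a cluster only while that cluster has in-flight writes — can be sketched in Python. The cluster size and function names are simplified analogues of the C code, not the actual patch:

```python
CLUSTER_BITS = 16  # illustrative 64 KiB clusters

# cluster_index -> number of in-flight data writes; entries exist only
# while the count is > 0, mirroring s->inflight_writes_counters
inflight_writes = {}

def clusters(offset, length):
    """All cluster indices touched by a write of `length` bytes at `offset`."""
    start = offset >> CLUSTER_BITS
    last = (offset + length - 1) >> CLUSTER_BITS
    return range(start, last + 1)

def inflight_inc(offset, length):
    for idx in clusters(offset, length):
        inflight_writes[idx] = inflight_writes.get(idx, 0) + 1

def inflight_dec(offset, length):
    for idx in clusters(offset, length):
        assert inflight_writes[idx] >= 1
        inflight_writes[idx] -= 1
        if inflight_writes[idx] == 0:
            del inflight_writes[idx]  # drop the entry when the count hits 0
```

As in the C code, the invariant is that any stored counter is >= 1; a cluster with no in-flight writes simply has no entry.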

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/qcow2.h  | 16 
  block/qcow2-refcount.c | 86 ++
  block/qcow2.c  |  5 +++
  3 files changed, 107 insertions(+)

diff --git a/block/qcow2.h b/block/qcow2.h
index 0fe5f74ed3..b25ef06111 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -420,6 +420,17 @@ typedef struct BDRVQcow2State {
   * is to convert the image with the desired compression type set.
   */
  Qcow2CompressionType compression_type;
+
+/*
+ * inflight_writes_counters:
+ *   Map cluster index (int64_t) -> Qcow2InFlightWriteCounter
+ *
+ * The map contains entries only for clusters that have in-flight data
+ * (non-metadata) writes. So Qcow2InFlightWriteCounter::inflight_writes_cnt
+ * is always (except for when being removed in update_inflight_write_cnt())
+ * >= 1 for stored elements.
+ */
+GHashTable *inflight_writes_counters;
  } BDRVQcow2State;
  
  typedef struct Qcow2COWRegion {

@@ -896,6 +907,11 @@ int qcow2_shrink_reftable(BlockDriverState *bs);
  int64_t qcow2_get_last_cluster(BlockDriverState *bs, int64_t size);
  int qcow2_detect_metadata_preallocation(BlockDriverState *bs);
  
+void qcow2_inflight_writes_inc(BlockDriverState *bs, int64_t offset,

+   int64_t length);
+void qcow2_inflight_writes_dec(BlockDriverState *bs, int64_t offset,
+   int64_t length);
+
  /* qcow2-cluster.c functions */
  int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
  bool exact_size);
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 1369724b41..eedc83ea4a 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -799,6 +799,92 @@ found:
  }
  }
  
+typedef struct Qcow2InFlightWriteCounter {

+/*
+ * Number of in-flight writes to the cluster, always > 0, as when it becomes
+ * 0 the entry is removed from s->inflight_writes_counters.
+ */
+uint64_t inflight_writes_cnt;
+} Qcow2InFlightWriteCounter;
+
+/* Find Qcow2InFlightWriteCounter corresponding to @cluster_index */
+static Qcow2InFlightWriteCounter *find_infl_wr(BDRVQcow2State *s,
+   int64_t cluster_index)
+{
+Qcow2InFlightWriteCounter *infl;
+
+if (!s->inflight_writes_counters) {
+return NULL;
+}
+
+infl = g_hash_table_lookup(s->inflight_writes_counters, &cluster_index);
+
+if (infl) {
+assert(infl->inflight_writes_cnt > 0);
+}
+
+return infl;
+}
+
+/*
+ * The function is intended to be called with decrease=false before writing
+ * guest data and with decrease=true after the write finishes.
+ */
+static void coroutine_fn
+update_inflight_write_cnt(BlockDriverState *bs, int64_t offset, int64_t length,
+  bool decrease)
+{
+BDRVQcow2State *s = bs->opaque;
+int64_t start, last, cluster_index;
+
+start = start_of_cluster(s, offset) >> s->cluster_bits;
+last = start_of_cluster(s, offset + length - 1) >> s->cluster_bits;
+for (cluster_index = start; cluster_index <= last; cluster_index++) {
+Qcow2InFlightWriteCounter *infl = find_infl_wr(s, cluster_index);
+
+if (!decrease) {
+if (!infl) {
+infl = g_new0(Qcow2InFlightWriteCounter, 1);
+g_hash_table_insert(s->inflight_writes_counters,
+g_memdup(&cluster_index,
+ sizeof(cluster_index)), infl);
+}
+infl->inflight_writes_cnt++;
+continue;
+}
+
+/* decrease */
+assert(infl);
+assert(infl->inflight_writes_cnt >= 1);
+
+infl->inflight_writes_cnt--;
+
+if (infl->inflight_writes_cnt == 0) {
+g_hash_table_remove(s->inflight_writes_counters, &cluster_index);
+}
+}
+}
+
+/*
+ * Works with s->lock both locked and unlocked. In contrast to
+ * qcow2_inflight_writes_dec(), it does not touch s->lock.
+ */
+void qcow2_inflight_writes_inc(BlockDriverState *bs, int64_t offset,
+   int64_t length)
+{
+

[PATCH] i386: Make 'hv-reenlightenment' require explicit 'tsc-frequency' setting

2021-03-30 Thread Vitaly Kuznetsov
Commit 561dbb41b1d7 "i386: Make migration fail when Hyper-V reenlightenment
was enabled but 'user_tsc_khz' is unset" forbade migration when the guest
has opted for reenlightenment notifications but 'tsc-frequency' wasn't set
explicitly on the command line. This works, but the migration fails late
and this may come as an unpleasant surprise. To make things more explicit,
require 'tsc-frequency=' on the command line when 'hv-reenlightenment' is
enabled. Make the change affect 6.0+ machine types only to keep
previously-valid configurations valid.

Signed-off-by: Vitaly Kuznetsov 
---
 docs/hyperv.txt   |  1 +
 hw/i386/pc.c  |  1 +
 target/i386/cpu.c | 23 +--
 target/i386/cpu.h |  1 +
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/docs/hyperv.txt b/docs/hyperv.txt
index e53c581f4586..5b02d341ab25 100644
--- a/docs/hyperv.txt
+++ b/docs/hyperv.txt
@@ -165,6 +165,7 @@ emulate TSC accesses after migration so 'tsc-frequency=' CPU option also has to
 be specified to make migration succeed. The destination host has to either have
 the same TSC frequency or support TSC scaling CPU feature.
 
+Requires: tsc-frequency
 Recommended: hv-frequencies
 
 3.16. hv-evmcs
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 8a84b25a031e..47b79e949ad7 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -98,6 +98,7 @@
 
 GlobalProperty pc_compat_5_2[] = {
 { "ICH9-LPC", "x-smi-cpu-hotunplug", "off" },
+{ TYPE_X86_CPU, "x-hv-reenlightenment-requires-tscfreq", "off" },
 };
 const size_t pc_compat_5_2_len = G_N_ELEMENTS(pc_compat_5_2);
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6b3e9467f177..751636bafac5 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6647,10 +6647,23 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
 }
 }
 
-static void x86_cpu_hyperv_realize(X86CPU *cpu)
+static void x86_cpu_hyperv_realize(X86CPU *cpu, Error **errp)
 {
+CPUX86State *env = &cpu->env;
 size_t len;
 
+/*
+ * Reenlightenment requires explicit 'tsc-frequency' setting for successful
+ * migration (see hyperv_reenlightenment_post_load()). As 'hv-passthrough'
+ * mode is not migratable, we can loosen the restriction.
+ */
+if (hyperv_feat_enabled(cpu, HYPERV_FEAT_REENLIGHTENMENT) &&
+!cpu->hyperv_passthrough && !env->user_tsc_khz &&
+cpu->hyperv_reenlightenment_requires_tscfreq) {
+error_setg(errp,
+   "'hv-reenlightenment' requires 'tsc-frequency=' to be set");
+return;
+}
+
 /* Hyper-V vendor id */
 if (!cpu->hyperv_vendor) {
 memcpy(cpu->hyperv_vendor_id, "Microsoft Hv", 12);
@@ -6846,7 +6859,11 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
 }
 
 /* Process Hyper-V enlightenments */
-x86_cpu_hyperv_realize(cpu);
+x86_cpu_hyperv_realize(cpu, &local_err);
+if (local_err != NULL) {
+error_propagate(errp, local_err);
+return;
+}
 
 cpu_exec_realizefn(cs, &local_err);
 if (local_err != NULL) {
@@ -7374,6 +7391,8 @@ static Property x86_cpu_properties[] = {
 DEFINE_PROP_INT32("x-hv-max-vps", X86CPU, hv_max_vps, -1),
 DEFINE_PROP_BOOL("x-hv-synic-kvm-only", X86CPU, hyperv_synic_kvm_only,
  false),
+DEFINE_PROP_BOOL("x-hv-reenlightenment-requires-tscfreq", X86CPU,
+ hyperv_reenlightenment_requires_tscfreq, true),
 DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
  true),
 DEFINE_PROP_END_OF_LIST()
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 570f916878f9..0196a300f018 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1677,6 +1677,7 @@ struct X86CPU {
 uint32_t hyperv_spinlock_attempts;
 char *hyperv_vendor;
 bool hyperv_synic_kvm_only;
+bool hyperv_reenlightenment_requires_tscfreq;
 uint64_t hyperv_features;
 bool hyperv_passthrough;
 OnOffAuto hyperv_no_nonarch_cs;
-- 
2.30.2




Re: [RFC v1] hw/smbios: support for type 41 (onboard devices extended information)

2021-03-30 Thread Vincent Bernat
 ❦ 30 March 2021 11:35 +01, Daniel P. Berrangé:

>> If network devices are present in this table, by default, udev will
>> name the corresponding interfaces enoX, X being the instance number.
>> Without such information, udev will fallback to using the PCI ID and
>> this usually gives ens3 or ens4. This can be a bit annoying as the
>> name of the network card may depend on the order of options and may
>> change if a new PCI device is added earlier on the commande line.
>> Being able to provide SMBIOS type 41 entry ensure the name of the
>> interface won't change and helps the user guess the right name without
>> booting a first time.
>> 
>> This can be invoked with:
>> 
>> $QEMU -netdev user,id=internet
>>   -device virtio-net-pci,mac=50:54:00:00:00:42,netdev=internet \
>>   -smbios type=41,designation=Onboard 
>> LAN,instance=1,kind=ethernet,pci=:00:09.0
>> 
>> Which results in the guest seeing dmidecode data and the interface
>> exposed as "eno1":
>> 
>> $ dmidecode -t 41
>> # dmidecode 3.3
>> Getting SMBIOS data from sysfs.
>> SMBIOS 2.8 present.
>> Handle 0x2900, DMI type 41, 11 bytes
>> Onboard Device
>> Reference Designation: Onboard LAN
>> Type: Ethernet
>> Status: Enabled
>> Type Instance: 1
>> Bus Address: :00:09.0
>> $ udevadm info -p /sys/class/net/eno1 | grep ONBOARD
>> E: ID_NET_NAME_ONBOARD=eno1
>> E: ID_NET_LABEL_ONBOARD=Onboard LAN
>> 
>> The original plan was to directly provide a device and populate "kind"
>> and "pci" from the device. However, since the SMBIOS tables are built
>> during argument evaluation, the information is not yet available.
>> I would welcome some guidance on how to implement this.
>
> I'm not sure I see the problem you're describing here, could
> you elaborate ?
>
> I see SMBIOS tables are built by  smbios_get_tables() method.
> This is called from qemu_init(), after all arguents have been
> processed and devices have been created.

OK, I was mistaken. I'll try to retrieve the information here then.

> It seems like this should allow SMBIOS tables to be auto-populated
> from the NICs listed in -device args previously.
>
>
> Note, if we're going to auto-populate the SMBIOS type 41 tables
> from -device args, then we'll need to make this behaviour
> configurable via a property, so that we can ensure this only
> applies to new machine types.

I didn't plan for something automatic, just being able to specify a PCI
device in the -smbios arguments and have the PCI location automatically
filled from that.
-- 
Keep it simple to make it faster.
- The Elements of Programming Style (Kernighan & Plauger)



[PULL 1/5] net/npcm7xx_emc.c: Fix handling of receiving packets when RSDR not set

2021-03-30 Thread Peter Maydell
From: Doug Evans 

Turning on REG_MCMDR_RXON is enough to start receiving packets.

Signed-off-by: Doug Evans 
Message-id: 20210319195044.741821-1-...@google.com
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 hw/net/npcm7xx_emc.c   |  4 +++-
 tests/qtest/npcm7xx_emc-test.c | 30 +-
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/hw/net/npcm7xx_emc.c b/hw/net/npcm7xx_emc.c
index 714a742ba7a..7c892f820fb 100644
--- a/hw/net/npcm7xx_emc.c
+++ b/hw/net/npcm7xx_emc.c
@@ -702,7 +702,9 @@ static void npcm7xx_emc_write(void *opaque, hwaddr offset,
!(value & REG_MCMDR_RXON)) {
 emc->regs[REG_MGSTA] |= REG_MGSTA_RXHA;
 }
-if (!(value & REG_MCMDR_RXON)) {
+if (value & REG_MCMDR_RXON) {
+emc->rx_active = true;
+} else {
 emc_halt_rx(emc, 0);
 }
 break;
diff --git a/tests/qtest/npcm7xx_emc-test.c b/tests/qtest/npcm7xx_emc-test.c
index 7a281731950..9eec71d87c1 100644
--- a/tests/qtest/npcm7xx_emc-test.c
+++ b/tests/qtest/npcm7xx_emc-test.c
@@ -492,9 +492,6 @@ static void enable_tx(QTestState *qts, const EMCModule *mod,
 mcmdr |= REG_MCMDR_TXON;
 emc_write(qts, mod, REG_MCMDR, mcmdr);
 }
-
-/* Prod the device to send the packet. */
-emc_write(qts, mod, REG_TSDR, 1);
 }
 
 static void emc_send_verify1(QTestState *qts, const EMCModule *mod, int fd,
@@ -558,6 +555,9 @@ static void emc_send_verify(QTestState *qts, const EMCModule *mod, int fd,
 enable_tx(qts, mod, &desc[0], NUM_TX_DESCRIPTORS, desc_addr,
   with_irq ? REG_MIEN_ENTXINTR : 0);
 
+/* Prod the device to send the packet. */
+emc_write(qts, mod, REG_TSDR, 1);
+
 /*
  * It's problematic to observe the interrupt for each packet.
  * Instead just wait until all the packets go out.
@@ -643,13 +643,10 @@ static void enable_rx(QTestState *qts, const EMCModule *mod,
 mcmdr |= REG_MCMDR_RXON | mcmdr_flags;
 emc_write(qts, mod, REG_MCMDR, mcmdr);
 }
-
-/* Prod the device to accept a packet. */
-emc_write(qts, mod, REG_RSDR, 1);
 }
 
 static void emc_recv_verify(QTestState *qts, const EMCModule *mod, int fd,
-bool with_irq)
+bool with_irq, bool pump_rsdr)
 {
 NPCM7xxEMCRxDesc desc[NUM_RX_DESCRIPTORS];
 uint32_t desc_addr = DESC_ADDR;
@@ -679,6 +676,15 @@ static void emc_recv_verify(QTestState *qts, const EMCModule *mod, int fd,
 enable_rx(qts, mod, &desc[0], NUM_RX_DESCRIPTORS, desc_addr,
   with_irq ? REG_MIEN_ENRXINTR : 0, 0);
 
+/*
+ * If requested, prod the device to accept a packet.
+ * This isn't necessary; the Linux driver doesn't do this.
+ * Test doing/not-doing this for robustness.
+ */
+if (pump_rsdr) {
+emc_write(qts, mod, REG_RSDR, 1);
+}
+
 /* Send test packet to device's socket. */
 ret = iov_send(fd, iov, 2, 0, sizeof(len) + sizeof(test));
 g_assert_cmpint(ret, ==, sizeof(test) + sizeof(len));
@@ -826,8 +832,14 @@ static void test_rx(gconstpointer test_data)
 
 qtest_irq_intercept_in(qts, "/machine/soc/a9mpcore/gic");
 
-emc_recv_verify(qts, td->module, test_sockets[0], /*with_irq=*/false);
-emc_recv_verify(qts, td->module, test_sockets[0], /*with_irq=*/true);
+emc_recv_verify(qts, td->module, test_sockets[0], /*with_irq=*/false,
+/*pump_rsdr=*/false);
+emc_recv_verify(qts, td->module, test_sockets[0], /*with_irq=*/false,
+/*pump_rsdr=*/true);
+emc_recv_verify(qts, td->module, test_sockets[0], /*with_irq=*/true,
+/*pump_rsdr=*/false);
+emc_recv_verify(qts, td->module, test_sockets[0], /*with_irq=*/true,
+/*pump_rsdr=*/true);
 emc_test_ptle(qts, td->module, test_sockets[0]);
 
 qtest_quit(qts);
-- 
2.20.1




Re: [PATCH v3 4/5] qemu-iotests: let "check" spawn an arbitrary test command

2021-03-30 Thread Max Reitz

On 30.03.21 12:44, Max Reitz wrote:

On 30.03.21 12:38, Max Reitz wrote:

On 26.03.21 16:05, Max Reitz wrote:

On 26.03.21 15:23, Paolo Bonzini wrote:
Right now there is no easy way for "check" to print a reproducer command.
Because such a reproducer command line would be huge, we can instead teach
check to start a command of our choice.  This can be for example a Python
unit test with arguments to only run a specific subtest.

Move the trailing empty line to print_env(), since it always looks better
and one caller was not adding it.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Tested-by: Emanuele Giuseppe Esposito 
Message-Id: <20210323181928.311862-5-pbonz...@redhat.com>
---
  tests/qemu-iotests/check | 18 +-
  tests/qemu-iotests/testenv.py    |  3 ++-
  tests/qemu-iotests/testrunner.py |  1 -
  3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index d1c87ceaf1..df9fd733ff 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -19,6 +19,9 @@
  import os
  import sys
  import argparse
+import shutil
+from pathlib import Path
+
  from findtests import TestFinder
  from testenv import TestEnv
  from testrunner import TestRunner
@@ -101,7 +104,7 @@ def make_argparser() -> argparse.ArgumentParser:
 'rerun failed ./check command, starting from the '
 'middle of the process.')
  g_sel.add_argument('tests', metavar='TEST_FILES', nargs='*',
-   help='tests to run')
+   help='tests to run, or "--" followed by a command')

  return p
@@ -114,6 +117,19 @@ if __name__ == '__main__':
    imgopts=args.imgopts, misalign=args.misalign,
    debug=args.debug, valgrind=args.valgrind)
+    if len(sys.argv) > 1 and sys.argv[-len(args.tests)-1] == '--':
+    if not args.tests:
+    sys.exit("missing command after '--'")
+    cmd = args.tests
+    env.print_env()
+    exec_path = Path(shutil.which(cmd[0]))


297 says:

check:125: error: Argument 1 to "Path" has incompatible type 
"Optional[str]"; expected "Union[str, _PathLike[str]]"

Found 1 error in 1 file (checked 1 source file)

Normally I’d assert this away, but actually I think the returned 
value should be checked and we should print an error if it’s None.  
(Seems like shutil.which() doesn’t raise an exception if there is no 
such command, it just returns None.)


Max


+    if exec_path is None:
+    sys.exit('command not found: ' + cmd[0])


Oh, I see, the intent to print an error is actually there.  The 
problem is just that Path(None) throws an exception, so we must check 
shutil.which()’s return value.


I’ll squash this in if you don’t mind:

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index df9fd733ff..e2230f5612 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -122,9 +122,10 @@ if __name__ == '__main__':
  sys.exit("missing command after '--'")
  cmd = args.tests
  env.print_env()
-    exec_path = Path(shutil.which(cmd[0]))
-    if exec_path is None:
+    exec_pathstr = shutil.which(cmd[0])
+    if exec_pathstr is None:
  sys.exit('command not found: ' + cmd[0])
+    exec_path = Path(exec_pathstr)
  cmd[0] = exec_path.resolve()
  full_env = env.prepare_subprocess(cmd)
  os.chdir(Path(exec_path).parent)


+    cmd[0] = exec_path.resolve()
+    full_env = env.prepare_subprocess(cmd)
+    os.chdir(Path(exec_path).parent)


Oh, and this Path() does nothing, I presume, so I’m going to replace it 
with just “exec_path”.


On third thought, the pathlib doc says:

> If you want to walk an arbitrary filesystem path upwards, it is
> recommended to first call Path.resolve() so as to resolve symlinks and
> eliminate “..” components.

So I guess the best would be to make it “exec_path = 
Path(exec_pathstr).resolve()”.
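The pathlib recommendation quoted above is easy to demonstrate: without resolve(), .parent of a symlink refers to the directory containing the link, not the directory containing the real file. A POSIX-style sketch (the temporary layout below is purely illustrative):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as d:
    bindir = Path(d, 'bin')
    bindir.mkdir()
    tool = bindir / 'tool'
    tool.write_text('#!/bin/sh\n')
    link = Path(d, 'link-to-tool')
    link.symlink_to(tool)               # needs symlink support (POSIX)

    unresolved = link.parent            # directory containing the symlink
    resolved = link.resolve().parent    # real directory containing 'tool'
    print(unresolved == resolved)       # False: the two differ
```

This is exactly why resolving before taking .parent (and before os.chdir()) gives the directory of the real executable rather than of the symlink.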


I’d also prefer if cmd[0] was a string and not a Path object 
(Path.resolve returns a Path).  os.execve() can work with Path objects 
as of 3.6 (which is the minimum version we require), but 
prepare_subprocess() expects a list of strings.  (I don’t know why mypy 
doesn’t complain.  I presume it just can’t resolve cmd's type.)


So here’s the full diff:

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index df9fd733ff..7c9d3a0852 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -122,12 +122,13 @@ if __name__ == '__main__':
 sys.exit("missing command after '--'")
 cmd = args.tests
 env.print_env()
-exec_path = Path(shutil.which(cmd[0]))
-if exec_path is None:
+exec_pathstr = shutil.which(cmd[0])
+if exec_pathstr is None:
 sys.exit('command not found: ' + cmd[0])
-cmd[0] = exec_path.resolve()
+

Re: [PATCH] hw/block/nvme: remove description for zoned.append_size_limit

2021-03-30 Thread Klaus Jensen
On Mar 30 10:19, Niklas Cassel wrote:
> On Tue, Mar 23, 2021 at 12:20:32PM +0100, Klaus Jensen wrote:
> > On Mar 23 11:18, Niklas Cassel wrote:
> > > From: Niklas Cassel 
> > > 
> > > The description was originally removed in commit 578d914b263c
> > > ("hw/block/nvme: align zoned.zasl with mdts") together with the removal
> > > of the zoned.append_size_limit parameter itself.
> > > 
> > > However, it was (most likely accidentally), re-added in commit
> > > f7dcd31885cb ("hw/block/nvme: add non-mdts command size limit for 
> > > verify").
> > > 
> > > Remove the description again, since the parameter it describes,
> > > zoned.append_size_limit, no longer exists.
> > > 
> > > Signed-off-by: Niklas Cassel 
> > > ---
> > >  hw/block/nvme.c | 8 
> > >  1 file changed, 8 deletions(-)
> > > 
> > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > > index 6842b01ab5..205d3ec944 100644
> > > --- a/hw/block/nvme.c
> > > +++ b/hw/block/nvme.c
> > > @@ -91,14 +91,6 @@
> > >   *   the minimum memory page size (CAP.MPSMIN). The default value is 0 
> > > (i.e.
> > >   *   defaulting to the value of `mdts`).
> > >   *
> > > - * - `zoned.append_size_limit`
> > > - *   The maximum I/O size in bytes that is allowed in Zone Append 
> > > command.
> > > - *   The default is 128KiB. Since internally this value is
> > > maintained as
> > > - *   ZASL = log2(<maximum append size> / <page size>), some values
> > > assigned
> > > - *   to this property may be rounded down and result in a lower maximum 
> > > ZA
> > > - *   data size being in effect. By setting this property to 0, users can 
> > > make
> > > - *   ZASL to be equal to MDTS. This property only affects zoned 
> > > namespaces.
> > > - *
> > >   * nvme namespace device parameters
> > >   * 
> > >   * - `subsys`
> > > -- 
> > > 2.30.2
> > 
> > Argh. Thanks Niklas, queing it up for fixes.
> > 
> > Reviewed-by: Klaus Jensen 
> 
> I don't see it in nvme-fixes yet.
> 
> Did it get stuck in purgatory? ;)
> 
> 

I could have included it for the PULL from yesterday, but I kinda forgot
and only added the coverity fixes. That's pulled now, so I've queued it
up for the next round of fixes! :)

Thanks for following up on it!


signature.asc
Description: PGP signature


Re: Serious doubts about Gitlab CI

2021-03-30 Thread Daniel P . Berrangé
On Tue, Mar 30, 2021 at 02:09:38PM +0200, Paolo Bonzini wrote:
> On 30/03/21 13:55, Thomas Huth wrote:
> > 
> > Since the build system has been converted to meson, I think the
> > configure script prefers to use the submodules instead of the distro
> > packages. I've tried to remedy this a little bit here:
> > 
> > https://gitlab.com/qemu-project/qemu/-/commit/db0108d5d846e9a8
> > 
> > ... but new jobs of course will use the submodules again if the author
> > is not careful.
> 
> Hmm... it should be the same (or if not it's a bug).
> 
> > Also I wonder whether we could maybe even get rid of the capstone and slirp 
> > submodules in QEMU now
> 
> At least for slirp, we probably want to stay more on the bleeding edge which
> implies having to keep the submodule.  Capstone and libfdt probably can go.

I don't think we need to stay on the bleeding edge per se in terms of
what we build against.

We have a declared minimum version of libslirp that we absolutely must
have in order to get the API we need for core featureset. If new APIs
are introduced, it is quite reasonable for us to make their usage in
QEMU conditional, just as we would for any other 3rd party library we
use.

The reason to have slirp as a submodule is just to avoid a functional
regression on distros which don't have slirp available at all and which
we don't expect will introduce it.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: Serious doubts about Gitlab CI

2021-03-30 Thread Daniel P . Berrangé
On Tue, Mar 30, 2021 at 03:19:49PM +0200, Paolo Bonzini wrote:
> On 30/03/21 15:12, Daniel P. Berrangé wrote:
> > > Now, but that may change already in 6.1 in order to add CFI support.
> > We can bundle a newer version, but we don't need to require a newer
> > version. Simply conditional compile for the bits we need. If distro
> > slirp is too old, then sorry, you can't enable CFI + slirp at the
> > same time. If the distro really wants that combination we don't have
> > to own the solution - the distro should update their slirp.
> > 
> > Or to put it another way, QEMU doesn't need to go out of its way to
> > enable new features on old distros. We merely need to not regress
> > in the features we previously offered.  We bundled slirp as a submodule
so that old distros didn't lose slirp entirely. We don't need to
> > offer CFI on those distros.
> 
> This is true, on the other hand only having to support one API version has
> its benefits.  The complication in the build system is minimal once slirp is
> made into a subproject; therefore it is appealing to keep the QEMU code
> simple.

I don't think slirp is special in this regard. The benefit you're promoting
here applies to any dependency we have, but I think the benefit is not big
enough to justify it.

The use of submodules has imposed significant pain on QEMU developers over
the years, and as such I think our general goal should be to have zero git
submodules over the long term. Usage of submodules ought to be considered
a short term workaround only, with a clear criteria for removal. We should
continually introduce dependancies on newer & newer versions, as that means
we'll never have any opportunity to remove them and reduce the cost on
QEMU.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH v3 3/3] ppc: Enable 2nd DAWR support on p10

2021-03-30 Thread Ravi Bangoria
As per the PAPR, bit 0 of byte 64 in the pa-features property indicates
the availability of the 2nd DAWR register: if this bit is set, the 2nd
DAWR is present, otherwise not. Use the KVM_CAP_PPC_DAWR1 capability to
find out whether KVM supports the 2nd DAWR. If it is supported, allow the
user to set the pa-features bit in the guest DT using the cap-dawr1
machine capability. Watchpoints are not supported for powerpc TCG guests,
so the 2nd DAWR is not enabled in TCG mode.
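The byte-to-index arithmetic used by this patch can be double-checked with a short sketch. In spapr_dt_pa_features() the pa-features blob starts with a two-byte length descriptor, so PAPR "byte 64" lands at array index 66, and "bit 0" is the most-significant bit (0x80) in PAPR's big-endian bit numbering. The helper name below is illustrative, not QEMU code:

```python
PA_FEATURES_HEADER_LEN = 2  # the blob begins with a 2-byte length descriptor

def set_pa_feature_bit(pa_features, byte_nr, bit_nr):
    """Set PAPR pa-features 'bit N of byte M' in the flat blob."""
    pa_features[PA_FEATURES_HEADER_LEN + byte_nr] |= 0x80 >> bit_nr

blob = bytearray(2 + 66)           # header plus bytes 0..65
set_pa_feature_bit(blob, 64, 0)    # advertise DAWR1: byte 64, bit 0
print(hex(blob[66]))               # -> 0x80, i.e. pa_features[66] |= 0x80
```

The same mapping explains the existing `pa_features[40 + 2] &= ~0x80` line in the patch context: PAPR byte 40 sits at array index 42.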

Signed-off-by: Ravi Bangoria 
---
 hw/ppc/spapr.c  | 11 ++-
 hw/ppc/spapr_caps.c | 32 
 include/hw/ppc/spapr.h  |  6 +-
 target/ppc/cpu.h|  2 ++
 target/ppc/kvm.c| 12 
 target/ppc/kvm_ppc.h|  7 +++
 target/ppc/translate_init.c.inc | 15 +++
 7 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index d56418ca29..4660ff9e6b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -238,7 +238,7 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
 /* 54: DecFP, 56: DecI, 58: SHA */
 0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
-/* 60: NM atomic, 62: RNG */
+/* 60: NM atomic, 62: RNG, 64: DAWR1 (ISA 3.1) */
 0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
 };
 uint8_t *pa_features = NULL;
@@ -256,6 +256,10 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
 pa_features = pa_features_300;
 pa_size = sizeof(pa_features_300);
 }
+if (ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_3_10, 0, cpu->compat_pvr)) {
+pa_features = pa_features_300;
+pa_size = sizeof(pa_features_300);
+}
 if (!pa_features) {
 return;
 }
@@ -279,6 +283,9 @@ static void spapr_dt_pa_features(SpaprMachineState *spapr,
  * in pa-features. So hide it from them. */
 pa_features[40 + 2] &= ~0x80; /* Radix MMU */
 }
+if (spapr_get_cap(spapr, SPAPR_CAP_DAWR1)) {
+pa_features[66] |= 0x80;
+}
 
 _FDT((fdt_setprop(fdt, offset, "ibm,pa-features", pa_features, pa_size)));
 }
@@ -2003,6 +2010,7 @@ static const VMStateDescription vmstate_spapr = {
  &vmstate_spapr_cap_ccf_assist,
  &vmstate_spapr_cap_fwnmi,
  &vmstate_spapr_fwnmi,
+&vmstate_spapr_cap_dawr1,
 NULL
 }
 };
@@ -4539,6 +4547,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
 smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_ON;
 smc->default_caps.caps[SPAPR_CAP_CCF_ASSIST] = SPAPR_CAP_ON;
 smc->default_caps.caps[SPAPR_CAP_FWNMI] = SPAPR_CAP_ON;
+smc->default_caps.caps[SPAPR_CAP_DAWR1] = SPAPR_CAP_OFF;
 spapr_caps_add_properties(smc);
 smc->irq = _irq_dual;
 smc->dr_phb_enabled = true;
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 9ea7ddd1e9..9c39a211fd 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -523,6 +523,27 @@ static void cap_fwnmi_apply(SpaprMachineState *spapr, uint8_t val,
 }
 }
 
+static void cap_dawr1_apply(SpaprMachineState *spapr, uint8_t val,
+   Error **errp)
+{
+if (!val) {
+return; /* Disable by default */
+}
+
+if (tcg_enabled()) {
+error_setg(errp,
+"DAWR1 not supported in TCG. Try appending -machine cap-dawr1=off");
+} else if (kvm_enabled()) {
+if (!kvmppc_has_cap_dawr1()) {
+error_setg(errp,
+"DAWR1 not supported by KVM. Try appending -machine cap-dawr1=off");
+} else if (kvmppc_set_cap_dawr1(val) < 0) {
+error_setg(errp,
+"DAWR1 not supported by KVM. Try appending -machine cap-dawr1=off");
+}
+}
+}
+
 SpaprCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
 [SPAPR_CAP_HTM] = {
 .name = "htm",
@@ -631,6 +652,16 @@ SpaprCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
 .type = "bool",
 .apply = cap_fwnmi_apply,
 },
+[SPAPR_CAP_DAWR1] = {
+.name = "dawr1",
+.description = "Allow DAWR1",
+.index = SPAPR_CAP_DAWR1,
+.get = spapr_cap_get_bool,
+.set = spapr_cap_set_bool,
+.type = "bool",
+.apply = cap_dawr1_apply,
+},
+
 };
 
 static SpaprCapabilities default_caps_with_cpu(SpaprMachineState *spapr,
@@ -771,6 +802,7 @@ SPAPR_CAP_MIG_STATE(nested_kvm_hv, SPAPR_CAP_NESTED_KVM_HV);
 SPAPR_CAP_MIG_STATE(large_decr, SPAPR_CAP_LARGE_DECREMENTER);
 SPAPR_CAP_MIG_STATE(ccf_assist, SPAPR_CAP_CCF_ASSIST);
 SPAPR_CAP_MIG_STATE(fwnmi, SPAPR_CAP_FWNMI);
+SPAPR_CAP_MIG_STATE(dawr1, SPAPR_CAP_DAWR1);
 
 void spapr_caps_init(SpaprMachineState *spapr)
 {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index b8985fab5b..00c8341acf 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -74,8 +74,10 @@ typedef enum {
 #define SPAPR_CAP_CCF_ASSIST   

[PATCH v3 0/3] ppc: Enable 2nd DAWR support on Power10

2021-03-30 Thread Ravi Bangoria
This series enables 2nd DAWR support for p10 QEMU guests. The 2nd
DAWR is a new watchpoint register added in the Power10 processor.
The kernel/KVM patches are already merged [1]. Watchpoints are not
supported for powerpc TCG guests, so the 2nd DAWR is not enabled in
TCG mode.

Patches apply fine on qemu/master branch (9e2e9fe3df9f).

v2: 
https://lore.kernel.org/r/20210329041906.213991-1-ravi.bango...@linux.ibm.com
v2->v3:
  - Don't introduce pa_features_310[]; instead, reuse pa_features_300[]
for 3.1 guests, as there is currently no difference between their
initial values.
  - Call gen_spr_book3s_310_dbg() from init_proc_POWER10() instead of
init_proc_POWER8(). Also, don't call gen_spr_book3s_207_dbg() from
gen_spr_book3s_310_dbg(), as init_proc_POWER10() already calls it.

v1: 
https://lore.kernel.org/r/20200723104220.314671-1-ravi.bango...@linux.ibm.com
[Apologies for long gap]
v1->v2:
  - Introduce machine capability cap-dawr1 to enable/disable
the feature. By default, 2nd DAWR is OFF for guests even
when host kvm supports it. User has to manually enable it
with -machine cap-dawr1=on if he wishes to use it.
  - Split the header file changes into separate patch. (Sync
headers from v5.12-rc3)

[1] https://git.kernel.org/torvalds/c/bd1de1a0e6eff

Ravi Bangoria (3):
  Linux headers: update from 5.12-rc3
  ppc: Rename current DAWR macros and variables
  ppc: Enable 2nd DAWR support on p10

 hw/ppc/spapr.c| 11 ++-
 hw/ppc/spapr_caps.c   | 32 +++
 include/hw/ppc/spapr.h|  8 +-
 include/standard-headers/drm/drm_fourcc.h | 23 -
 include/standard-headers/linux/input.h|  2 +-
 .../standard-headers/rdma/vmw_pvrdma-abi.h|  7 ++
 linux-headers/asm-generic/unistd.h|  4 +-
 linux-headers/asm-mips/unistd_n32.h   |  1 +
 linux-headers/asm-mips/unistd_n64.h   |  1 +
 linux-headers/asm-mips/unistd_o32.h   |  1 +
 linux-headers/asm-powerpc/kvm.h   |  2 +
 linux-headers/asm-powerpc/unistd_32.h |  1 +
 linux-headers/asm-powerpc/unistd_64.h |  1 +
 linux-headers/asm-s390/unistd_32.h|  1 +
 linux-headers/asm-s390/unistd_64.h|  1 +
 linux-headers/asm-x86/kvm.h   |  1 +
 linux-headers/asm-x86/unistd_32.h |  1 +
 linux-headers/asm-x86/unistd_64.h |  1 +
 linux-headers/asm-x86/unistd_x32.h|  1 +
 linux-headers/linux/kvm.h | 89 +++
 linux-headers/linux/vfio.h| 27 ++
 target/ppc/cpu.h  |  6 +-
 target/ppc/kvm.c  | 12 +++
 target/ppc/kvm_ppc.h  |  7 ++
 target/ppc/translate_init.c.inc   | 19 +++-
 25 files changed, 249 insertions(+), 11 deletions(-)

-- 
2.17.1




[PATCH v3 2/3] ppc: Rename current DAWR macros and variables

2021-03-30 Thread Ravi Bangoria
Power10 introduces a second DAWR. Use the real register names (with
suffix 0) from the ISA for the current macros and variables used by QEMU.

One exception to this is KVM_REG_PPC_DAWR[X]. These come from the kernel
uapi headers and are thus left unchanged in the kernel as well as in QEMU.

Signed-off-by: Ravi Bangoria 
---
 include/hw/ppc/spapr.h  | 2 +-
 target/ppc/cpu.h| 4 ++--
 target/ppc/translate_init.c.inc | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 47cebaf3ac..b8985fab5b 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -363,7 +363,7 @@ struct SpaprMachineState {
 
 /* Values for 2nd argument to H_SET_MODE */
 #define H_SET_MODE_RESOURCE_SET_CIABR   1
-#define H_SET_MODE_RESOURCE_SET_DAWR2
+#define H_SET_MODE_RESOURCE_SET_DAWR0   2
 #define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE 3
 #define H_SET_MODE_RESOURCE_LE  4
 
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index e73416da68..cd02d65303 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1459,10 +1459,10 @@ typedef PowerPCCPU ArchCPU;
 #define SPR_MPC_BAR   (0x09F)
 #define SPR_PSPB  (0x09F)
 #define SPR_DPDES (0x0B0)
-#define SPR_DAWR  (0x0B4)
+#define SPR_DAWR0 (0x0B4)
 #define SPR_RPR   (0x0BA)
 #define SPR_CIABR (0x0BB)
-#define SPR_DAWRX (0x0BC)
+#define SPR_DAWRX0(0x0BC)
 #define SPR_HFSCR (0x0BE)
 #define SPR_VRSAVE(0x100)
 #define SPR_USPRG0(0x100)
diff --git a/target/ppc/translate_init.c.inc b/target/ppc/translate_init.c.inc
index c03a7c4f52..879e6df217 100644
--- a/target/ppc/translate_init.c.inc
+++ b/target/ppc/translate_init.c.inc
@@ -7748,12 +7748,12 @@ static void gen_spr_book3s_dbg(CPUPPCState *env)
 
 static void gen_spr_book3s_207_dbg(CPUPPCState *env)
 {
-spr_register_kvm_hv(env, SPR_DAWR, "DAWR",
+spr_register_kvm_hv(env, SPR_DAWR0, "DAWR0",
 SPR_NOACCESS, SPR_NOACCESS,
 SPR_NOACCESS, SPR_NOACCESS,
 &spr_read_generic, &spr_write_generic,
 KVM_REG_PPC_DAWR, 0x);
-spr_register_kvm_hv(env, SPR_DAWRX, "DAWRX",
+spr_register_kvm_hv(env, SPR_DAWRX0, "DAWRX0",
 SPR_NOACCESS, SPR_NOACCESS,
 SPR_NOACCESS, SPR_NOACCESS,
 &spr_read_generic, &spr_write_generic,
-- 
2.17.1




Re: [PATCH v3 4/5] qemu-iotests: let "check" spawn an arbitrary test command

2021-03-30 Thread Paolo Bonzini

On 30/03/21 12:57, Max Reitz wrote:


297 says:

check:125: error: Argument 1 to "Path" has incompatible type "Optional[str]"; expected "Union[str, _PathLike[str]]"
Found 1 error in 1 file (checked 1 source file)


Weird, I had tested this and I cannot reproduce it.



diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index df9fd733ff..7c9d3a0852 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -122,12 +122,13 @@ if __name__ == '__main__':
  sys.exit("missing command after '--'")
  cmd = args.tests
  env.print_env()
-    exec_path = Path(shutil.which(cmd[0]))
-    if exec_path is None:
+    exec_pathstr = shutil.which(cmd[0])
+    if exec_pathstr is None:
  sys.exit('command not found: ' + cmd[0])
-    cmd[0] = exec_path.resolve()
+    exec_path = Path(exec_pathstr).resolve()
+    cmd[0] = str(exec_path)
  full_env = env.prepare_subprocess(cmd)
-    os.chdir(Path(exec_path).parent)
+    os.chdir(exec_path.parent)
  os.execve(cmd[0], cmd, full_env)

  testfinder = TestFinder(test_dir=env.source_iotests)
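
The mypy error comes from `shutil.which()` returning `Optional[str]`: the original code wrapped that result in `Path()` before checking for `None`. A minimal sketch of the corrected pattern (the helper name `locate` is illustrative, not part of the patch):

```python
import shutil
from pathlib import Path
from typing import Optional

def locate(cmd: str) -> Optional[Path]:
    """Resolve a command name to an absolute Path, or None if not found.

    shutil.which() returns Optional[str], so the None check has to happen
    *before* the result is wrapped in Path() -- handing a potential None
    straight to Path() is exactly what mypy flagged in iotest 297.
    """
    exec_pathstr = shutil.which(cmd)
    if exec_pathstr is None:
        return None
    return Path(exec_pathstr).resolve()
```

This mirrors the squashed fix above: check the string first, then build the `Path` and call `.resolve()` on it once.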


But now these are so many changes that I feel uncomfortable making this 
change myself.  This series only affects the iotests, so AFAIU we are in 
no hurry to get this into rc1, and it can still go into rc2.


Go ahead and squash it.

Technically I think resolve() is not needed because we're basically just 
doing "dirname" and not going upwards in the directory tree.  That would 
leave the smaller change in message id 
51523e26-a184-9434-cb60-277c7b3c6...@redhat.com.  However, it doesn't 
hurt either and others may have the same doubt as you.


Thanks Max!

Paolo




Re: [PATCH v3 0/5] qemu-iotests: quality of life improvements

2021-03-30 Thread Max Reitz

On 30.03.21 13:32, Max Reitz wrote:

On 26.03.21 15:23, Paolo Bonzini wrote:

This series adds a few usability improvements to qemu-iotests, in
particular:

- arguments can be passed to Python unittests scripts, for example
   to run only a subset of the test cases (patches 1-2)

- it is possible to do "./check -- ../../../tests/qemu-iotests/055 args..."
   and specify arbitrary arguments to be passed to a single test script.
   This allows taking advantage of the previous feature and eases debugging
   of Python tests.

Paolo

Thanks, I’ve amended patch 4 and applied the series to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block


I’m sorry but I’ll have to drop it again.  At least iotests 245 and 295
fail; I assume it has something to do with `iotests.activate_logging()`.


I don’t think that’s something that we’ll fix today, so I think we 
should postpone this series to rc2 after all.


Max




Re: [PATCH v2] i386: Make migration fail when Hyper-V reenlightenment was enabled but 'user_tsc_khz' is unset

2021-03-30 Thread Vitaly Kuznetsov
"Dr. David Alan Gilbert"  writes:

> * Vitaly Kuznetsov (vkuzn...@redhat.com) wrote:
>> "Dr. David Alan Gilbert"  writes:
>> 
>> > * Vitaly Kuznetsov (vkuzn...@redhat.com) wrote:
>> >> KVM doesn't fully support Hyper-V reenlightenment notifications on
>> >> migration. In particular, it doesn't support emulating TSC frequency
>> >> of the source host by trapping all TSC accesses so unless TSC scaling
>> >> is supported on the destination host and KVM_SET_TSC_KHZ succeeds, it
>> >> is unsafe to proceed with migration.
>> >> 
>> >> KVM_SET_TSC_KHZ is called from two sites: kvm_arch_init_vcpu() and
>> >> kvm_arch_put_registers(). The later (intentionally) doesn't propagate
>> >> errors allowing migrations to succeed even when TSC scaling is not
>> >> supported on the destination. This doesn't suit 're-enlightenment'
>> >> use-case as we have to guarantee that TSC frequency stays constant.
>> >> 
>> >> Require 'tsc-frequency=' command line option to be specified for successful
>> >> migration when re-enlightenment was enabled by the guest.
>> >> 
>> >> Signed-off-by: Vitaly Kuznetsov 
>> >> ---
>> >> This patch is a successor of "[PATCH 3/3] i386: Make sure
>> >> kvm_arch_set_tsc_khz() succeeds on migration when 'hv-reenlightenment'
>> >> was exposed" taking a different approach suggested by Paolo.
>> >> ---
>> >>  docs/hyperv.txt|  5 +
>> >>  target/i386/kvm/hyperv-proto.h |  1 +
>> >>  target/i386/machine.c  | 20 
>> >>  3 files changed, 26 insertions(+)
>> >> 
>> >> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
>> >> index 5df00da54fc4..e53c581f4586 100644
>> >> --- a/docs/hyperv.txt
>> >> +++ b/docs/hyperv.txt
>> >> @@ -160,6 +160,11 @@ the hypervisor) until it is ready to switch to the 
>> >> new one. This, in conjunction
>> >>  with hv-frequencies, allows Hyper-V on KVM to pass stable clocksource 
>> >> (Reference
>> >>  TSC page) to its own guests.
>> >>  
>> >> +Note, KVM doesn't fully support re-enlightenment notifications and doesn't
>> >> +emulate TSC accesses after migration so 'tsc-frequency=' CPU option also has to
>> >> +be specified to make migration succeed. The destination host has to either have
>> >> +the same TSC frequency or support TSC scaling CPU feature.
>> >> +
>> >>  Recommended: hv-frequencies
>> >>  
>> >>  3.16. hv-evmcs
>> >> diff --git a/target/i386/kvm/hyperv-proto.h 
>> >> b/target/i386/kvm/hyperv-proto.h
>> >> index 056a305be38c..e30d64b4ade4 100644
>> >> --- a/target/i386/kvm/hyperv-proto.h
>> >> +++ b/target/i386/kvm/hyperv-proto.h
>> >> @@ -139,6 +139,7 @@
>> >>   * Reenlightenment notification MSRs
>> >>   */
>> >>  #define HV_X64_MSR_REENLIGHTENMENT_CONTROL  0x4106
>> >> +#define HV_REENLIGHTENMENT_ENABLE_BIT   (1u << 16)
>> >>  #define HV_X64_MSR_TSC_EMULATION_CONTROL0x4107
>> >>  #define HV_X64_MSR_TSC_EMULATION_STATUS 0x4108
>> >>  
>> >> diff --git a/target/i386/machine.c b/target/i386/machine.c
>> >> index 7259fe6868c6..137604ddb898 100644
>> >> --- a/target/i386/machine.c
>> >> +++ b/target/i386/machine.c
>> >> @@ -883,11 +883,31 @@ static bool 
>> >> hyperv_reenlightenment_enable_needed(void *opaque)
>> >>  env->msr_hv_tsc_emulation_status != 0;
>> >>  }
>> >>  
>> >> +static int hyperv_reenlightenment_post_load(void *opaque, int version_id)
>> >> +{
>> >> +X86CPU *cpu = opaque;
>> >> +CPUX86State *env = &cpu->env;
>> >> +
>> >> +/*
>> >> + * KVM doesn't fully support re-enlightenment notifications so we 
>> >> need to
>> >> + * make sure TSC frequency doesn't change upon migration.
>> >> + */
>> >> +if ((env->msr_hv_reenlightenment_control & HV_REENLIGHTENMENT_ENABLE_BIT) &&
>> >> +!env->user_tsc_khz) {
>> >> +error_report("Guest enabled re-enlightenment notifications, "
>> >> + "'tsc-frequency=' has to be specified");
>> >
>> > It's unusual to fail on the destination for a valid configuration but
>> > guest state;  wouldn't it be better to always insist on tsc-frequency if
>> > that hv feature is exposed; failing early before receiving the state?
>> >
>> 
>> Doing so would make a number of currently existing configurations
>> invalid, even when re-enlightenment is not to be used by the
>> guest. AFAIR Windows without Hyper-V doesn't enable it. Generally, we
>> just advise people to 'enable all currently supported hyper-v
>> enlightenments' to make things easier so reenlightenment may end up
>> being added for no particular reason.
>
> Ouch, that's difficult - the problem with testing this late is that the
> migration fails right at the end so it's an unpleasent surprise.
>
> Could you disallow re-enlightenment without tsc-frequency on new machine
> types?
>

Will do. I'm not exactly sure if I should target 6.0 or 6.1 atm, let's
try the former first.

-- 
Vitaly
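
The post_load check being discussed boils down to a single predicate, sketched here in Python (the function name is illustrative; the MSR bit value matches the `HV_REENLIGHTENMENT_ENABLE_BIT` definition in the patch):

```python
HV_REENLIGHTENMENT_ENABLE_BIT = 1 << 16  # bit 16 of HV_X64_MSR_REENLIGHTENMENT_CONTROL

def reenlightenment_blocks_migration(control_msr: int, user_tsc_khz: int) -> bool:
    """The condition hyperv_reenlightenment_post_load() fails on: the guest
    enabled re-enlightenment, but no fixed 'tsc-frequency=' was configured
    (user_tsc_khz == 0)."""
    return bool(control_msr & HV_REENLIGHTENMENT_ENABLE_BIT) and user_tsc_khz == 0
```

Moving the same predicate to machine-type validation, as suggested above, would just evaluate it before the device state is received rather than after.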




Re: [PATCH] target/xtensa: make xtensa_modules static on import

2021-03-30 Thread Philippe Mathieu-Daudé
On 3/30/21 9:30 AM, Max Filippov wrote:
> xtensa_modules variable defined in each xtensa-modules.c.inc is only
> used locally by the including file. Make it static.
> 

Reported-by: Yury Gribov 

> Signed-off-by: Max Filippov 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  target/xtensa/import_core.sh | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/target/xtensa/import_core.sh b/target/xtensa/import_core.sh
> index f3404039cc20..53d3c4d099bb 100755
> --- a/target/xtensa/import_core.sh
> +++ b/target/xtensa/import_core.sh
> @@ -35,6 +35,7 @@ tar -xf "$OVERLAY" -O binutils/xtensa-modules.c | \
>  -e '/^#include "ansidecl.h"/d' \
>  -e '/^Slot_[a-zA-Z0-9_]\+_decode (const xtensa_insnbuf insn)/,/^}/s/^  return 0;$/  return XTENSA_UNDEFINED;/' \
> -e 's/#include <xtensa-isa.h>/#include "xtensa-isa.h"/' \
> +-e 's/^\(xtensa_isa_internal xtensa_modules\)/static \1/' \
>  > "$TARGET"/xtensa-modules.c.inc
>  
>  cat < "${TARGET}.c"
> 
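
The effect of the added sed rule can be illustrated with an equivalent Python substitution (the sample input is an assumed miniature of the generated binutils file, not its real contents):

```python
import re

# A miniature stand-in for the generated xtensa-modules.c (shape assumed).
src = (
    '#include "xtensa-isa.h"\n'
    "xtensa_isa_internal xtensa_modules = { 0 };\n"
)

# Same effect as the sed rule in the patch: prefix the definition with
# 'static' so each per-core xtensa-modules.c.inc keeps the symbol file-local.
out = re.sub(r"^(xtensa_isa_internal xtensa_modules)", r"static \1",
             src, flags=re.MULTILINE)
```

The `^` anchor with multiline matching corresponds to sed operating line by line, so only the definition at the start of a line is touched.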




Re: Serious doubts about Gitlab CI

2021-03-30 Thread Peter Maydell
On Tue, 30 Mar 2021 at 12:56, Thomas Huth  wrote:
> Right, I think we should also work more towards consolidating the QEMU
> binaries, to avoid that we have to always build sooo many target binaries
> again and again. E.g.:
>
> - Do we still need to support 32-bit hosts? If not we could
>finally get rid of qemu-system-i386, qemu-system-ppc,
>qemu-system-arm, etc. and just provide the 64-bit variants

We could drop qemu-system-i386  without dropping 32-bit host
support (except for the special case of wanting to use KVM):
32-bit host TCG happily runs the qemu-system-foo64 binary.
This does depend on the target arch having been set up so that
the 64-bit version works exactly like the 32-bit one for 32-bit
guest boards, though -- arm does this. I think x86 mostly does
except for differences like the default guest CPU type. riscv
used to have a "32 bit cpus only in the qemu-system-foo64 binary"
setup but I think that is either fixed or being fixed. There's
also the issue that it breaks existing working user commandlines,
of course.

> - Could we maybe somehow unify the targets that have both, big
>and little endian versions? Then we could merge e.g.
>qemu-system-microblaze and qemu-system-microblazeel etc.
>
> - Or could we maybe even build a unified qemu-system binary that
>contains all target CPUs? ... that would also allow e.g.
>machines with a x86 main CPU and an ARM-based board management
>controller...

I would like to see this one day, but it's a pretty non-trivial
amount of engineering work to identify all the places where we
currently hard-code a compile-time setting about the target
architecture and make them runtime instead (in a way that doesn't
torpedo performance). "is the target CPU big-endian" is one of those...

> Also I wonder whether we could maybe even get rid of the capstone and slirp
> submodules in QEMU now ... these libraries should be available in most
> distros by now, and otherwise people could also install them manually instead?

I suspect that's rather overoptimistic, but how widely available they
are is a question of fact that we can check.

thanks
-- PMM



[PULL 7/9] qsd: Document FUSE exports

2021-03-30 Thread Max Reitz
Implementing FUSE exports required no changes to the storage daemon, so
we forgot to document them there.  Considering that both NBD and
vhost-user-blk exports are documented in its man page (and NBD exports
in its --help text), we should probably do the same for FUSE.

Signed-off-by: Max Reitz 
Message-Id: <20210217115844.62661-1-mre...@redhat.com>
Reviewed-by: Eric Blake 
---
 docs/tools/qemu-storage-daemon.rst   | 19 +++
 storage-daemon/qemu-storage-daemon.c |  4 
 2 files changed, 23 insertions(+)

diff --git a/docs/tools/qemu-storage-daemon.rst 
b/docs/tools/qemu-storage-daemon.rst
index 086493ebb3..3ec4bdd914 100644
--- a/docs/tools/qemu-storage-daemon.rst
+++ b/docs/tools/qemu-storage-daemon.rst
@@ -74,6 +74,7 @@ Standard options:
.. option:: --export [type=]nbd,id=,node-name=[,name=][,writable=on|off][,bitmap=]
  --export [type=]vhost-user-blk,id=,node-name=,addr.type=unix,addr.path=[,writable=on|off][,logical-block-size=][,num-queues=]
  --export [type=]vhost-user-blk,id=,node-name=,addr.type=fd,addr.str=[,writable=on|off][,logical-block-size=][,num-queues=]
+  --export [type=]fuse,id=,node-name=,mountpoint=[,growable=on|off][,writable=on|off]
 
   is a block export definition. ``node-name`` is the block node that should be
   exported. ``writable`` determines whether or not the export allows write
@@ -92,6 +93,16 @@ Standard options:
   ``logical-block-size`` sets the logical block size in bytes (the default is
   512). ``num-queues`` sets the number of virtqueues (the default is 1).
 
+  The ``fuse`` export type takes a mount point, which must be a regular file,
+  on which to export the given block node. That file will not be changed, it
+  will just appear to have the block node's content while the export is active
+  (very much like mounting a filesystem on a directory does not change what the
+  directory contains, it only shows a different content while the filesystem is
+  mounted). Consequently, applications that have opened the given file before
+  the export became active will continue to see its original content. If
+  ``growable`` is set, writes after the end of the exported file will grow the
+  block node to fit.
+
 .. option:: --monitor MONITORDEF
 
   is a QMP monitor definition. See the :manpage:`qemu(1)` manual page for
@@ -196,6 +207,14 @@ domain socket ``vhost-user-blk.sock``::
   --blockdev driver=qcow2,node-name=qcow2,file=file \
  --export type=vhost-user-blk,id=export,addr.type=unix,addr.path=vhost-user-blk.sock,node-name=qcow2
 
+Export a qcow2 image file ``disk.qcow2`` via FUSE on itself, so the disk image
+file will then appear as a raw image::
+
+  $ qemu-storage-daemon \
+  --blockdev driver=file,node-name=file,filename=disk.qcow2 \
+  --blockdev driver=qcow2,node-name=qcow2,file=file \
+  --export type=fuse,id=export,node-name=qcow2,mountpoint=disk.qcow2,writable=on
+
 See also
 
 
diff --git a/storage-daemon/qemu-storage-daemon.c 
b/storage-daemon/qemu-storage-daemon.c
index 72900dc2ec..fc8b150629 100644
--- a/storage-daemon/qemu-storage-daemon.c
+++ b/storage-daemon/qemu-storage-daemon.c
@@ -98,6 +98,10 @@ static void help(void)
 " export the specified block node over NBD\n"
 " (requires --nbd-server)\n"
 "\n"
+"  --export [type=]fuse,id=,node-name=,mountpoint=\n"
+"   [,growable=on|off][,writable=on|off]\n"
+" export the specified block node over FUSE\n"
+"\n"
 "  --monitor [chardev=]name[,mode=control][,pretty[=on|off]]\n"
 " configure a QMP monitor\n"
 "\n"
-- 
2.29.2




Re: An error due to installation that require binutils package

2021-03-30 Thread Stefano Garzarella

Hi John,

On Mon, Mar 29, 2021 at 09:46:49PM +0300, John Simpson wrote:

Hello,

Kindly ask you to have a look at this bug.
Thank you for your replies.


It's already fixed in QEMU upstream and the fix will be released with 
the 6.0 version next month (the rc0 is already available):

https://gitlab.com/qemu-project/qemu/-/commit/bbd2d5a8120771ec59b86a80a1f51884e0a26e53

I guess xen-4.14.1 is using an older version, so if you want you can 
backport that patch in your version, the change should be simple.


Thanks,
Stefano



On Mon, Mar 29, 2021 at 7:07 PM George Dunlap 
wrote:


John,

Thanks for your report.  Can you post your bug report to
xen-de...@lists.xenproject.org ?

The bug is in the compilation of QEMU, which is an external project; so
it’s possible that we’ll end up having to raise this with that community as
well.

Thanks,
 -George Dunlap

> On Mar 28, 2021, at 2:26 PM, John Simpson  wrote:
>
> Hello,
>
> Just forwarding this message to you. Can you give some thoughs about
this? Thanks a lot.
>
>
> -- Forwarded message -
> From: Alan Modra 
> Date: Sun, Mar 28, 2021 at 2:21 PM
> Subject: Re: An error due to installation that require binutils package.
> To: John Simpson 
> Cc: 
>
>
> On Sun, Mar 28, 2021 at 12:55:23PM +0300, John Simpson via Binutils
wrote:
> >   BUILD   pc-bios/optionrom/kvmvapic.img
> > ld: Error: unable to disambiguate: -no-pie (did you mean --no-pie ?)
>
> -no-pie is a gcc option.  Neither -no-pie nor --no-pie is a valid ld
> option.  The fault lies with whatever passed -no-pie to ld.
>
> --
> Alan Modra
> Australia Development Lab, IBM
>
>
>
> -- Forwarded message -
> From: Andreas Schwab 
> Date: Sun, Mar 28, 2021 at 2:17 PM
> Subject: Re: An error due to installation that require binutils 
> package.

> To: John Simpson via Binutils 
> Cc: John Simpson 
>
>
> Please report that to the xen project.  ld -no-pie doesn't have a useful
> meaning.  It used to mean the same as ld -n -o-pie, which sets "-pie" as
> the output file name.
>
> Andreas.
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."
>
>
>
> -- Forwarded message -
> From: John Simpson 
> Date: Sun, Mar 28, 2021 at 12:55 PM
> Subject: An error due to installation that require binutils package.
> To: 
>
>
> Hello,
>
> Recently I got a following error due to installation xen on
5.11.6-1-MANJARO kernel:
>
>   GEN target/riscv/trace.c
>   GEN target/s390x/trace.c
>   GEN target/sparc/trace.c
>   GEN util/trace.c
>   GEN config-all-devices.mak
> make[1]: Entering directory
'/home/username/xen/src/xen-4.14.1/tools/qemu-xen/slirp'
> make[1]: Nothing to be done for 'all'.
> make[1]: Leaving directory
'/home/username/xen/src/xen-4.14.1/tools/qemu-xen/slirp'
>   BUILD   pc-bios/optionrom/multiboot.img
>   BUILD   pc-bios/optionrom/linuxboot.img
>   BUILD   pc-bios/optionrom/linuxboot_dma.img
>   BUILD   pc-bios/optionrom/kvmvapic.img
> ld: Error: unable to disambiguate: -no-pie (did you mean --no-pie ?)
> make[1]: *** [Makefile:53: multiboot.img] Error 1
> make[1]: *** Waiting for unfinished jobs
> ld: Error: unable to disambiguate: -no-pie (did you mean --no-pie ?)
> make[1]: *** [Makefile:53: linuxboot_dma.img] Error 1
>   BUILD   pc-bios/optionrom/pvh.img
> ld: Error: unable to disambiguate: -no-pie (did you mean --no-pie ?)
> make[1]: *** [Makefile:53: linuxboot.img] Error 1
> ld: Error: unable to disambiguate: -no-pie (did you mean --no-pie ?)
> make[1]: *** [Makefile:53: kvmvapic.img] Error 1
> ld: Error: unable to disambiguate: -no-pie (did you mean --no-pie ?)
> make[1]: *** [Makefile:50: pvh.img] Error 1
> make: *** [Makefile:581: pc-bios/optionrom/all] Error 2
> make: Leaving directory
'/home/username/xen/src/xen-4.14.1/tools/qemu-xen-build'
> make[3]: *** [Makefile:218: subdir-all-qemu-xen-dir] Error 2
> make[3]: Leaving directory '/home/username/xen/src/xen-4.14.1/tools'
> make[2]: ***
[/home/username/xen/src/xen-4.14.1/tools/../tools/Rules.mk:235:
subdirs-install] Error 2
> make[2]: Leaving directory '/home/username/xen/src/xen-4.14.1/tools'
> make[1]: *** [Makefile:72: install] Error 2
> make[1]: Leaving directory '/home/username/xen/src/xen-4.14.1/tools'
> make: *** [Makefile:134: install-tools] Error 2
> ==> ERROR: A failure occurred in build().
> Aborting...
>
> Currently I have fresh binutils 2.36.1-2 and it seems to me that the
issue is related to this part of code:
>
> https://github.com/bminor/binutils-gdb/blob/master/ld/lexsup.c#L451
>
> It seems to me that this could impact far more users than just me.
>







Re: [PATCH for-6.0 6/7] hw/block/nvme: update dmsrl limit on namespace detachment

2021-03-30 Thread Gollu Appalanaidu

On Wed, Mar 24, 2021 at 09:09:06PM +0100, Klaus Jensen wrote:

From: Klaus Jensen 

The Non-MDTS DMSRL limit must be recomputed when namespaces are
detached.

Fixes: 645ce1a70cb6 ("hw/block/nvme: support namespace attachment command")
Signed-off-by: Klaus Jensen 
---
hw/block/nvme.c | 17 +
1 file changed, 17 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 403c8381a498..e84e43b2692d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -4876,6 +4876,21 @@ static uint16_t nvme_aer(NvmeCtrl *n, NvmeRequest *req)
return NVME_NO_COMPLETE;
}

+static void __nvme_update_dmrsl(NvmeCtrl *n)
+{
+int nsid;
+
+for (nsid = 1; nsid <= NVME_MAX_NAMESPACES; nsid++) {
+NvmeNamespace *ns = nvme_ns(n, nsid);
+if (!ns) {
+continue;
+}
+
+n->dmrsl = MIN_NON_ZERO(n->dmrsl,
+BDRV_REQUEST_MAX_BYTES / nvme_l2b(ns, 1));
+}
+}
+


Looks good to me!


static void __nvme_select_ns_iocs(NvmeCtrl *n, NvmeNamespace *ns);
static uint16_t nvme_ns_attachment(NvmeCtrl *n, NvmeRequest *req)
{
@@ -4925,6 +4940,8 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, 
NvmeRequest *req)
}

nvme_ns_detach(ctrl, ns);
+
+__nvme_update_dmrsl(ctrl);
}

/*
--
2.31.0




Reviewed-by: Gollu Appalanaidu 
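
The recomputation loop in `__nvme_update_dmrsl()` above can be modeled compactly in Python. Note the `BDRV_REQUEST_MAX_BYTES` value and the reset-to-zero starting point are assumptions for illustration; the real helper folds into the controller's existing `n->dmrsl` field:

```python
# Illustrative value: QEMU caps single requests near INT_MAX, rounded down
# to a 512-byte boundary (assumption; see the real definition in the tree).
BDRV_REQUEST_MAX_BYTES = 0x7FFFFE00

def min_non_zero(a: int, b: int) -> int:
    """Mirror of QEMU's MIN_NON_ZERO(): zero means 'no limit set yet'."""
    if a == 0:
        return b
    if b == 0:
        return a
    return min(a, b)

def update_dmrsl(lba_sizes):
    """Recompute the DMRSL limit over the namespaces still attached,
    as the patch does after a namespace detach."""
    dmrsl = 0
    for lba_size in lba_sizes:
        dmrsl = min_non_zero(dmrsl, BDRV_REQUEST_MAX_BYTES // lba_size)
    return dmrsl
```

The limit ends up governed by the attached namespace with the largest LBA size, which is why it must be recomputed once a namespace is detached.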


Re: [PATCH for-6.0 5/7] hw/block/nvme: fix warning about legacy namespace configuration

2021-03-30 Thread Gollu Appalanaidu

On Wed, Mar 24, 2021 at 09:09:05PM +0100, Klaus Jensen wrote:

From: Klaus Jensen 

Remove the unused BlockConf from the controller structure and fix the
constraint checking to actually check the right BlockConf and issue the
warning.

Signed-off-by: Klaus Jensen 
---
hw/block/nvme.h | 1 -
hw/block/nvme.c | 2 +-
2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index c610ab30dc5c..1570f65989a7 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -166,7 +166,6 @@ typedef struct NvmeCtrl {
NvmeBar  bar;
NvmeParams   params;
NvmeBus  bus;
-BlockConfconf;

uint16_tcntlid;
boolqs_created;
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 7a7e793c6c26..403c8381a498 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -5807,7 +5807,7 @@ static void nvme_check_constraints(NvmeCtrl *n, Error 
**errp)
params->max_ioqpairs = params->num_queues - 1;
}

-if (n->conf.blk) {
+if (n->namespace.blkconf.blk) {
warn_report("drive property is deprecated; "
"please use an nvme-ns device instead");
}
--
2.31.0




Reviewed-by: Gollu Appalanaidu 


Re: [PATCH v2] docs: Add a QEMU Code of Conduct and Conflict Resolution Policy document

2021-03-30 Thread Daniel P . Berrangé
On Tue, Mar 30, 2021 at 12:53:04PM +0200, Paolo Bonzini wrote:
> On 30/03/21 11:08, Thomas Huth wrote:
> >   I've picked the Django Code of Conduct as a base, since it sounds rather
> >   friendly and still welcoming to me, but I'm open for other suggestions, 
> > too
> >   (but we should maybe pick one where the conflict resolution policy is
> >   separated from the CoC itself so that it can be better taylored to the
> >   requirements of the QEMU project)
> 
> It turns out that the Django CoC is ultimately based on the Fedora CoC,
> so I tried using https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> as an inspiration for what can be cut. Here is the outcome:
> 
> -
> The QEMU community is made up of a mixture of professionals and
> volunteers from all over the world. Diversity is one of our strengths,
> but it can also lead to communication issues and unhappiness.
> To that end, we have a few ground rules that we ask people to adhere to.
> 
> * Be welcoming. We are committed to making participation in this project
>   a harassment-free experience for everyone, regardless of level of
>   experience, gender, gender identity and expression, sexual orientation,
>   disability, personal appearance, body size, race, ethnicity, age, religion,
>   or nationality.
> 
> * Be respectful. Not all of us will agree all the time.  Disagreements, both
>   social and technical, happen all the time and the QEMU community is no
>   exception. When we disagree, we try to understand why.  It is important that
>   we resolve disagreements and differing views constructively.  Members of the
>   QEMU community should be respectful when dealing with other contributors as
>   well as with people outside the QEMU community and with users of QEMU.
> 
> Harassment and other exclusionary behavior are not acceptable. A community
> where people feel uncomfortable or threatened is neither welcoming nor
> respectful.  Examples of unacceptable behavior by participants include:
> 
> * The use of sexualized language or imagery
> 
> * Personal attacks
> 
> * Trolling or insulting/derogatory comments
> 
> * Public or private harassment
> 
> * Publishing other's private information, such as physical or electronic
> addresses, without explicit permission
> 
> This isn't an exhaustive list of things that you can't do. Rather, take
> it in the spirit in which it's intended—a guide to make it easier to
> be excellent to each other.
> 
> This code of conduct applies to all spaces managed by the QEMU project.
> This includes IRC, the mailing lists, the issue tracker, community
> events, and any other forums created by the project team which the
> community uses for communication. This code of conduct also applies
> outside these spaces, when an individual acts as a representative or a
> member of the project or its community.

I really don't like this last sentence. The qualifier

  ', when an individual acts as a representative or member...'

is opening up a clear loophole to escape consequences under the
QEMU CoC.

Consider someone is kicked out from another project for violation
of that project's CoC, that would also be considered a violation
under QEMU's CoC. This qualifier is explicitly stating that the CoC
violation in the other project has no bearing on whether that
person can now start participating in QEMU. I think that's a bad
mixed message we're sending there. It is especially poor if the
victim from the other project is also a QEMU contributor.

The wording Thomas' draft has

  In addition, violations of this code outside these spaces may
  affect a person's ability to participate within them.

doesn't require QEMU to take action. It just set a statement
of intent that gives QEMU the freedom to evaluate whether it is
reasonable to take action to protect its contributors, should a
contributor wish to raise an issue that occurred outside QEMU.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v4 for-6.0? 0/3] qcow2: fix parallel rewrite and discard (rw-lock)

2021-03-30 Thread Vladimir Sementsov-Ogievskiy

30.03.2021 15:51, Max Reitz wrote:

On 30.03.21 12:51, Vladimir Sementsov-Ogievskiy wrote:

30.03.2021 12:49, Max Reitz wrote:

On 25.03.21 20:12, Vladimir Sementsov-Ogievskiy wrote:

ping. Do we want it for 6.0?


I’d rather wait.  I think the conclusion was that guests shouldn’t hit this 
because they serialize discards?


I think, that we never had bugs, so we of course can wait.



There’s also something Kevin wrote on IRC a couple of weeks ago, for which I 
had hoped he’d sent an email but I don’t think he did, so I’ll try to remember 
and paraphrase as well as I can...

He basically asked whether it wouldn’t be conceptually simpler to take a 
reference to some cluster in get_cluster_offset() and later release it with a 
to-be-added put_cluster_offset().

He also noted that reading is problematic, too, because if you read a discarded 
and reused cluster, this might result in an information leak (some guest 
application might be able to read data it isn’t allowed to read); that’s why 
making get_cluster_offset() the point of locking clusters against discarding 
would be better.
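
The get/put pairing described above amounts to an in-memory reference count that blocks discards while any I/O still holds a cluster. A toy model, with purely illustrative names (not qcow2's actual API):

```python
from collections import defaultdict

class ClusterGuard:
    """Toy model of the discussed idea: a cluster may only be discarded
    once every outstanding reference taken at lookup time has been put
    back. Real qcow2 would take the reference in qcow2_get_host_offset()
    and friends, and release it in a to-be-added put counterpart."""

    def __init__(self):
        self.inflight = defaultdict(int)

    def get_cluster_offset(self, offset: int) -> int:
        self.inflight[offset] += 1   # reader or writer pins the cluster
        return offset

    def put_cluster_offset(self, offset: int) -> None:
        assert self.inflight[offset] > 0
        self.inflight[offset] -= 1

    def try_discard(self, offset: int) -> bool:
        # Discard must wait (or fail) while any I/O still holds the cluster.
        return self.inflight[offset] == 0
```

Pinning at lookup time also covers the read-side information-leak case mentioned above, since a discarded-and-reused cluster can never be handed out while a reader still holds it.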


Yes, I thought about read too, (RFCed in cover letter of [PATCH v5 0/6] qcow2: 
fix parallel rewrite and discard (lockless))



This would probably work with both of your solutions.  For the in-memory 
solutions, you’d take a refcount to an actual cluster; in the CoRwLock 
solution, you’d take that lock.

What do you think?



Hmm. What do you mean? Just rename my qcow2_inflight_writes_inc() and 
qcow2_inflight_writes_dec() to get_cluster_offset()/put_cluster_offset(), to 
make it more native to use for read operations as well?


Hm.  Our discussion wasn’t so detailed.

I interpreted it to mean all qcow2 functions that find an offset to a qcow2 
cluster, namely qcow2_get_host_offset(), qcow2_alloc_host_offset(), and 
qcow2_alloc_compressed_cluster_offset().


What about qcow2_alloc_clusters() ?



When those functions return an offset (in)to some cluster, that cluster (or the 
image as a whole) should be locked against discards.  Every offset received 
this way would require an accompanying qcow2_put_host_offset().
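The pairing being discussed can be sketched as a toy model (the struct and function names here are hypothetical illustrations of the idea, not the actual qcow2 API; the real code would track this state per host cluster, e.g. in a hash table keyed by offset):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-cluster state for the get/put pairing. */
typedef struct Cluster {
    int inflight;        /* readers/writers currently holding the offset */
    bool discard_wanted; /* a discard arrived while inflight > 0 */
    bool discarded;
} Cluster;

/* Returning an offset also takes a reference, locking the
 * cluster against discard (and against reuse-after-discard reads). */
static void get_cluster_offset(Cluster *c)
{
    c->inflight++;
}

/* Dropping the last reference performs any discard that was
 * deferred while I/O was in flight. */
static void put_cluster_offset(Cluster *c)
{
    if (--c->inflight == 0 && c->discard_wanted) {
        c->discard_wanted = false;
        c->discarded = true;
    }
}

/* A discard only takes effect once nobody holds the offset. */
static void discard_cluster(Cluster *c)
{
    if (c->inflight > 0) {
        c->discard_wanted = true;  /* defer until the last put */
    } else {
        c->discarded = true;
    }
}
```

In this model the information-leak scenario disappears: a read that took the offset keeps the cluster from being discarded and reused underneath it.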


Or do you mean updating every kind of "get cluster offset" operation in the whole qcow2 driver to take a 
kind of "dynamic reference count" via get_cluster_offset() and then to call the corresponding 
put() somewhere? In that case I'm afraid it's a lot more work.


Hm, really?  I would have assumed we need to do some locking in all functions 
that get a cluster offset this way, so it should be less work to take the lock 
in the functions they invoke to get the offset.


There is also the problem that a lot of paths in qcow2 do not run in coroutine context and 
don't even take s->lock when they actually should.


I’m not sure what you mean here, because all functions that invoke any of the 
three functions I listed above are coroutine_fns (or, well, I didn’t look it 
up, but they all have *_co_* in their name).


qcow2_alloc_clusters() has a lot more callers..




This would also mean that we do the same job as the normal qcow2 refcounts already do: there is no 
sense in keeping an additional "dynamic refcount" for an L2 table cluster while reading it, as 
we already have a non-zero normal qcow2 refcount for it.


I’m afraid I don’t understand how normal refcounts relate to this.  For 
example, qcow2_get_host_offset() doesn’t touch refcounts at all.



I mean the following: remember our discussion about what a free cluster is. If we add a 
"dynamic refcount" (or "inflight-write-counter") thing only to count in-flight data writes 
(or, as discussed, data reads as well), then the "full reference count" of a cluster is 
inflight-write-count + qcow2-metadata-refcount.

But if we add a kind of "dynamic refcount" for any use of a host cluster, for example reading of an L2 table, then 
we duplicate the reference that already exists in the qcow2 metadata for this L2 table (represented as a refcount) with our "dynamic 
refcount", and we no longer have a concept of "full reference count" as the sum above. We should still 
treat a cluster as free when both the "dynamic refcount" and the qcow2-metadata refcount are zero, but their sum 
no longer means much. Maybe not a problem, but it looks like a complication with no benefit.
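The first variant composes cleanly; a toy illustration (not actual qcow2 code, and the field names are invented for this sketch):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical view of a host cluster's two reference sources. */
typedef struct {
    unsigned meta_refcount;   /* persistent refcount from the qcow2 metadata */
    unsigned inflight_count;  /* in-memory count of in-flight guest-data I/O */
} ClusterRefs;

/* The "full reference count" discussed above: only meaningful when the
 * dynamic count covers guest-data I/O, not arbitrary metadata accesses. */
static unsigned full_refcount(const ClusterRefs *r)
{
    return r->meta_refcount + r->inflight_count;
}

/* A cluster may be reallocated only when nobody references it at all. */
static bool cluster_is_free(const ClusterRefs *r)
{
    return full_refcount(r) == 0;
}
```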


==

OK, I now think that you didn't mean qcow2_alloc_clusters(). So we are talking only about 
functions returning an offset to a cluster with "guest data", not to any kind of 
host cluster. Then what you propose looks like this to me:

 - take my v5
 - rename qcow2_inflight_writes_dec() to put_cluster_offset()
 - call qcow2_inflight_writes_inc() from the three functions you mention

That makes sense to me. Still, the put_cluster_offset() name doesn't make it obvious that it's 
only for clusters with "guest data" and that we shouldn't call it when working with 
metadata clusters.

--
Best regards,
Vladimir



[PULL 0/5] target-arm queue

2021-03-30 Thread Peter Maydell
The following changes since commit 7993b0f83fe5c3f8555e79781d5d098f99751a94:

  Merge remote-tracking branch 
'remotes/nvme/tags/nvme-fixes-for-6.0-pull-request' into staging (2021-03-29 
18:45:12 +0100)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git pull-target-arm-20210330

for you to fetch changes up to b9e3f1579a4b06fc63dfa8cdb68df1c58eeb0cf1:

  hw/timer/renesas_tmr: Add default-case asserts in read_tcnt() (2021-03-30 
14:05:34 +0100)


 * net/npcm7xx_emc.c: Fix handling of receiving packets when RSDR not set
 * hw/display/xlnx_dp: Free FIFOs adding xlnx_dp_finalize()
 * hw/arm/smmuv3: Drop unused CDM_VALID() and is_cd_valid()
 * target/arm: Make number of counters in PMCR follow the CPU
 * hw/timer/renesas_tmr: Add default-case asserts in read_tcnt()


Doug Evans (1):
  net/npcm7xx_emc.c: Fix handling of receiving packets when RSDR not set

Peter Maydell (2):
  target/arm: Make number of counters in PMCR follow the CPU
  hw/timer/renesas_tmr: Add default-case asserts in read_tcnt()

Philippe Mathieu-Daudé (1):
  hw/display/xlnx_dp: Free FIFOs adding xlnx_dp_finalize()

Zenghui Yu (1):
  hw/arm/smmuv3: Drop unused CDM_VALID() and is_cd_valid()

 hw/arm/smmuv3-internal.h   |  7 ---
 target/arm/cpu.h   |  1 +
 hw/display/xlnx_dp.c   |  9 +
 hw/net/npcm7xx_emc.c   |  4 +++-
 hw/timer/renesas_tmr.c |  4 
 target/arm/cpu64.c |  3 +++
 target/arm/cpu_tcg.c   |  5 +
 target/arm/helper.c| 29 +
 target/arm/kvm64.c |  2 ++
 tests/qtest/npcm7xx_emc-test.c | 30 +-
 10 files changed, 65 insertions(+), 29 deletions(-)



[PULL 2/5] hw/display/xlnx_dp: Free FIFOs adding xlnx_dp_finalize()

2021-03-30 Thread Peter Maydell
From: Philippe Mathieu-Daudé 

When building with --enable-sanitizers we get:

  Direct leak of 16 byte(s) in 1 object(s) allocated from:
  #0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf)
  #1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958)
  #2 0x561847c2dcc9 in xlnx_dp_init hw/display/xlnx_dp.c:1259:5
  #3 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9
  #4 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5
  #5 0x56184a5a24d5 in object_initialize qom/object.c:536:5
  #6 0x56184a5a2f6c in object_initialize_child_with_propsv 
qom/object.c:566:5
  #7 0x56184a5a2e60 in object_initialize_child_with_props 
qom/object.c:549:10
  #8 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5
  #9 0x5618495aa431 in xlnx_zynqmp_init hw/arm/xlnx-zynqmp.c:273:5

The RX/TX FIFOs are created in xlnx_dp_init(), add xlnx_dp_finalize()
to destroy them.

Fixes: 58ac482a66d ("introduce xlnx-dp")
Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Alistair Francis 
Message-id: 20210323182958.277654-1-f4...@amsat.org
Signed-off-by: Peter Maydell 
---
 hw/display/xlnx_dp.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/display/xlnx_dp.c b/hw/display/xlnx_dp.c
index c56e6ec5936..4fd6aeb18b5 100644
--- a/hw/display/xlnx_dp.c
+++ b/hw/display/xlnx_dp.c
@@ -1260,6 +1260,14 @@ static void xlnx_dp_init(Object *obj)
 fifo8_create(&s->tx_fifo, 16);
 }
 
+static void xlnx_dp_finalize(Object *obj)
+{
+    XlnxDPState *s = XLNX_DP(obj);
+
+    fifo8_destroy(&s->tx_fifo);
+    fifo8_destroy(&s->rx_fifo);
+}
+
 static void xlnx_dp_realize(DeviceState *dev, Error **errp)
 {
 XlnxDPState *s = XLNX_DP(dev);
@@ -1359,6 +1367,7 @@ static const TypeInfo xlnx_dp_info = {
 .parent= TYPE_SYS_BUS_DEVICE,
 .instance_size = sizeof(XlnxDPState),
 .instance_init = xlnx_dp_init,
+.instance_finalize = xlnx_dp_finalize,
 .class_init= xlnx_dp_class_init,
 };
 
-- 
2.20.1




[PULL 4/5] target/arm: Make number of counters in PMCR follow the CPU

2021-03-30 Thread Peter Maydell
Currently we give all the v7-and-up CPUs a PMU with 4 counters.  This
means that we don't provide the 6 counters that are required by the
Arm BSA (Base System Architecture) specification if the CPU supports
the Virtualization extensions.

Instead of having a single PMCR_NUM_COUNTERS, make each CPU type
specify the PMCR reset value (obtained from the appropriate TRM), and
use the 'N' field of that value to define the number of counters
provided.

This means that we now supply 6 counters for Cortex-A53, A57, A72,
A15 and A9 as well as '-cpu max'; Cortex-A7 and A8 stay at 4; and
Cortex-R5 goes down to 3.

Note that because we now use the PMCR reset value of the specific
implementation, we no longer set the LC bit out of reset.  This has
an UNKNOWN value out of reset for all cores with any AArch32 support,
so guest software should be setting it anyway if it wants it.
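The 'N' field sits in bits [15:11] of PMCR; applying that extraction to the reset values in this patch reproduces the counter numbers listed above. A standalone arithmetic check (mirroring, not reproducing, the helper in target/arm/helper.c):

```c
#include <assert.h>
#include <stdint.h>

#define PMCRN_SHIFT 11
#define PMCRN_MASK  (0x1f << PMCRN_SHIFT)

/* Number of PMU counters implied by a PMCR reset value's N field. */
static unsigned pmu_num_counters(uint32_t reset_pmcr)
{
    return (reset_pmcr & PMCRN_MASK) >> PMCRN_SHIFT;
}
```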

Signed-off-by: Peter Maydell 
Tested-by: Marcin Juszkiewicz 
Message-id: 20210311165947.27470-1-peter.mayd...@linaro.org
Reviewed-by: Richard Henderson 
---
 target/arm/cpu.h |  1 +
 target/arm/cpu64.c   |  3 +++
 target/arm/cpu_tcg.c |  5 +
 target/arm/helper.c  | 29 +
 target/arm/kvm64.c   |  2 ++
 5 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 193a49ec7fa..fe68f464b3a 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -942,6 +942,7 @@ struct ARMCPU {
 uint64_t id_aa64mmfr2;
 uint64_t id_aa64dfr0;
 uint64_t id_aa64dfr1;
+uint64_t reset_pmcr_el0;
 } isar;
 uint64_t midr;
 uint32_t revidr;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index f0a9e968c9c..5d9d56a33c3 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -141,6 +141,7 @@ static void aarch64_a57_initfn(Object *obj)
 cpu->gic_num_lrs = 4;
 cpu->gic_vpribits = 5;
 cpu->gic_vprebits = 5;
+cpu->isar.reset_pmcr_el0 = 0x41013000;
 define_arm_cp_regs(cpu, cortex_a72_a57_a53_cp_reginfo);
 }
 
@@ -194,6 +195,7 @@ static void aarch64_a53_initfn(Object *obj)
 cpu->gic_num_lrs = 4;
 cpu->gic_vpribits = 5;
 cpu->gic_vprebits = 5;
+cpu->isar.reset_pmcr_el0 = 0x41033000;
 define_arm_cp_regs(cpu, cortex_a72_a57_a53_cp_reginfo);
 }
 
@@ -245,6 +247,7 @@ static void aarch64_a72_initfn(Object *obj)
 cpu->gic_num_lrs = 4;
 cpu->gic_vpribits = 5;
 cpu->gic_vprebits = 5;
+cpu->isar.reset_pmcr_el0 = 0x41023000;
 define_arm_cp_regs(cpu, cortex_a72_a57_a53_cp_reginfo);
 }
 
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index 046e476f65f..8252fd29f90 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -301,6 +301,7 @@ static void cortex_a8_initfn(Object *obj)
 cpu->ccsidr[1] = 0x2007e01a; /* 16k L1 icache. */
 cpu->ccsidr[2] = 0xf000; /* No L2 icache. */
 cpu->reset_auxcr = 2;
+cpu->isar.reset_pmcr_el0 = 0x41002000;
 define_arm_cp_regs(cpu, cortexa8_cp_reginfo);
 }
 
@@ -373,6 +374,7 @@ static void cortex_a9_initfn(Object *obj)
 cpu->clidr = (1 << 27) | (1 << 24) | 3;
 cpu->ccsidr[0] = 0xe00fe019; /* 16k L1 dcache. */
 cpu->ccsidr[1] = 0x200fe019; /* 16k L1 icache. */
+cpu->isar.reset_pmcr_el0 = 0x41093000;
 define_arm_cp_regs(cpu, cortexa9_cp_reginfo);
 }
 
@@ -443,6 +445,7 @@ static void cortex_a7_initfn(Object *obj)
 cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
 cpu->ccsidr[1] = 0x201fe00a; /* 32K L1 icache */
 cpu->ccsidr[2] = 0x711fe07a; /* 4096K L2 unified cache */
+cpu->isar.reset_pmcr_el0 = 0x41072000;
 define_arm_cp_regs(cpu, cortexa15_cp_reginfo); /* Same as A15 */
 }
 
@@ -485,6 +488,7 @@ static void cortex_a15_initfn(Object *obj)
 cpu->ccsidr[0] = 0x701fe00a; /* 32K L1 dcache */
 cpu->ccsidr[1] = 0x201fe00a; /* 32K L1 icache */
 cpu->ccsidr[2] = 0x711fe07a; /* 4096K L2 unified cache */
+cpu->isar.reset_pmcr_el0 = 0x410F3000;
 define_arm_cp_regs(cpu, cortexa15_cp_reginfo);
 }
 
@@ -717,6 +721,7 @@ static void cortex_r5_initfn(Object *obj)
 cpu->isar.id_isar6 = 0x0;
 cpu->mp_is_up = true;
 cpu->pmsav7_dregion = 16;
+cpu->isar.reset_pmcr_el0 = 0x41151800;
 define_arm_cp_regs(cpu, cortexr5_cp_reginfo);
 }
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index d9220be7c5a..8fb6cc96e4d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -38,7 +38,6 @@
 #endif
 
 #define ARM_CPU_FREQ 1000000000 /* FIXME: 1 GHz, should be configurable */
-#define PMCR_NUM_COUNTERS 4 /* QEMU IMPDEF choice */
 
 #ifndef CONFIG_USER_ONLY
 
@@ -1149,7 +1148,9 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
 
 static inline uint32_t pmu_num_counters(CPUARMState *env)
 {
-  return (env->cp15.c9_pmcr & PMCRN_MASK) >> PMCRN_SHIFT;
+ARMCPU *cpu = env_archcpu(env);
+
+return (cpu->isar.reset_pmcr_el0 & PMCRN_MASK) >> PMCRN_SHIFT;
 }
 
 /* Bits allowed to be set/cleared for PMCNTEN* and PMINTEN* */
@@ -5753,13 +5754,6 @@ static const ARMCPRegInfo el2_cp_reginfo[] 

Re: [RFC 1/8] memory: Allow eventfd add/del without starting a transaction

2021-03-30 Thread Stefan Hajnoczi
On Tue, Mar 30, 2021 at 09:47:49AM +0200, Greg Kurz wrote:
> On Mon, 29 Mar 2021 18:03:49 +0100
> Stefan Hajnoczi  wrote:
> 
> > On Thu, Mar 25, 2021 at 04:07:28PM +0100, Greg Kurz wrote:
> > > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > > index 5728a681b27d..98ed552e001c 100644
> > > --- a/include/exec/memory.h
> > > +++ b/include/exec/memory.h
> > > @@ -1848,13 +1848,25 @@ void 
> > > memory_region_clear_flush_coalesced(MemoryRegion *mr);
> > >   * @match_data: whether to match against @data, instead of just @addr
> > >   * @data: the data to match against the guest write
> > >   * @e: event notifier to be triggered when @addr, @size, and @data all 
> > > match.
> > > + * @transaction: whether to start a transaction for the change
> > 
> > "start" is unclear. Does it begin a transaction and return with the
> > transaction unfinished? I think instead the function performs the
> > eventfd addition within a transaction. It would be nice to clarify this.
> > 
> 
> What about: 
> 
>  * @transaction: if true, the eventfd is added within a nested transaction,
>  *   if false, it is up to the caller to ensure this is called
>  *   within a transaction.

Sounds good, thanks!

Stefan




Re: [PATCH] hw/block/nvme: remove description for zoned.append_size_limit

2021-03-30 Thread Niklas Cassel
On Tue, Mar 23, 2021 at 12:20:32PM +0100, Klaus Jensen wrote:
> On Mar 23 11:18, Niklas Cassel wrote:
> > From: Niklas Cassel 
> > 
> > The description was originally removed in commit 578d914b263c
> > ("hw/block/nvme: align zoned.zasl with mdts") together with the removal
> > of the zoned.append_size_limit parameter itself.
> > 
> > However, it was (most likely accidentally), re-added in commit
> > f7dcd31885cb ("hw/block/nvme: add non-mdts command size limit for verify").
> > 
> > Remove the description again, since the parameter it describes,
> > zoned.append_size_limit, no longer exists.
> > 
> > Signed-off-by: Niklas Cassel 
> > ---
> >  hw/block/nvme.c | 8 
> >  1 file changed, 8 deletions(-)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index 6842b01ab5..205d3ec944 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -91,14 +91,6 @@
> >   *   the minimum memory page size (CAP.MPSMIN). The default value is 0 
> > (i.e.
> >   *   defaulting to the value of `mdts`).
> >   *
> > - * - `zoned.append_size_limit`
> > - *   The maximum I/O size in bytes that is allowed in Zone Append command.
> > - *   The default is 128KiB. Since internally this value is maintained 
> > as
> > - *   ZASL = log2( / ), some values assigned
> > - *   to this property may be rounded down and result in a lower maximum ZA
> > - *   data size being in effect. By setting this property to 0, users can 
> > make
> > - *   ZASL to be equal to MDTS. This property only affects zoned namespaces.
> > - *
> >   * nvme namespace device parameters
> >   * 
> >   * - `subsys`
> > -- 
> > 2.30.2
> 
> Argh. Thanks Niklas, queing it up for fixes.
> 
> Reviewed-by: Klaus Jensen 

I don't see it in nvme-fixes yet.

Did it get stuck in purgatory? ;)


Kind regards,
Niklas


Re: [PATCH v10 2/6] arm64: kvm: Introduce MTE VM feature

2021-03-30 Thread Catalin Marinas
On Mon, Mar 29, 2021 at 05:06:51PM +0100, Steven Price wrote:
> On 28/03/2021 13:21, Catalin Marinas wrote:
> > On Sat, Mar 27, 2021 at 03:23:24PM +, Catalin Marinas wrote:
> > > On Fri, Mar 12, 2021 at 03:18:58PM +, Steven Price wrote:
> > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > > > index 77cb2d28f2a4..b31b7a821f90 100644
> > > > --- a/arch/arm64/kvm/mmu.c
> > > > +++ b/arch/arm64/kvm/mmu.c
> > > > @@ -879,6 +879,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> > > > phys_addr_t fault_ipa,
> > > > if (vma_pagesize == PAGE_SIZE && !force_pte)
> > > > vma_pagesize = transparent_hugepage_adjust(memslot, hva,
> > > >&pfn, &fault_ipa);
> > > > +
> > > > +   if (fault_status != FSC_PERM && kvm_has_mte(kvm) && 
> > > > pfn_valid(pfn)) {
> > > > +   /*
> > > > +* VM will be able to see the page's tags, so we must 
> > > > ensure
> > > > +* they have been initialised. if PG_mte_tagged is set, 
> > > > tags
> > > > +* have already been initialised.
> > > > +*/
> > > > +   struct page *page = pfn_to_page(pfn);
> > > > +   unsigned long i, nr_pages = vma_pagesize >> PAGE_SHIFT;
> > > > +
> > > > +   for (i = 0; i < nr_pages; i++, page++) {
> > > > +   if (!test_and_set_bit(PG_mte_tagged, 
> > > > &page->flags))
> > > > +   mte_clear_page_tags(page_address(page));
> > > > +   }
> > > > +   }
> > > 
> > > This pfn_valid() check may be problematic. Following commit eeb0753ba27b
> > > ("arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory"), it returns
> > > true for ZONE_DEVICE memory but such memory is allowed not to support
> > > MTE.
> > 
> > Some more thinking, this should be safe as any ZONE_DEVICE would be
> > mapped as untagged memory in the kernel linear map. It could be slightly
> > inefficient if it unnecessarily tries to clear tags in ZONE_DEVICE,
> > untagged memory. Another overhead is pfn_valid() which will likely end
> > up calling memblock_is_map_memory().
> > 
> > However, the bigger issue is that Stage 2 cannot disable tagging for
> > Stage 1 unless the memory is Non-cacheable or Device at S2. Is there a
> > way to detect what gets mapped in the guest as Normal Cacheable memory
> > and make sure it's only early memory or hotplug but no ZONE_DEVICE (or
> > something else like on-chip memory)?  If we can't guarantee that all
> > Cacheable memory given to a guest supports tags, we should disable the
> > feature altogether.
> 
> In stage 2 I believe we only have two types of mapping - 'normal' or
> DEVICE_nGnRE (see stage2_map_set_prot_attr()). Filtering out the latter is a
> case of checking the 'device' variable, and makes sense to avoid the
> overhead you describe.
> 
> This should also guarantee that all stage-2 cacheable memory supports tags,
> as kvm_is_device_pfn() is simply !pfn_valid(), and pfn_valid() should only
> be true for memory that Linux considers "normal".

That's the problem. With Anshuman's commit I mentioned above,
pfn_valid() returns true for ZONE_DEVICE mappings (e.g. persistent
memory, not talking about some I/O mapping that requires Device_nGnRE).
So kvm_is_device_pfn() is false for such memory and it may be mapped as
Normal but it is not guaranteed to support tagging.

For user MTE, we get away with this as the MAP_ANONYMOUS requirement
would filter it out while arch_add_memory() will ensure it's mapped as
untagged in the linear map. See another recent fix for hotplugged
memory: d15dfd31384b ("arm64: mte: Map hotplugged memory as Normal
Tagged"). We needed to ensure that ZONE_DEVICE doesn't end up as tagged,
only hoplugged memory. Both handled via arch_add_memory() in the arch
code with ZONE_DEVICE starting at devm_memremap_pages().
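The filtering condition being debated can be modelled in miniature (this is a simplified illustration with invented names, not the kernel's actual pfn_valid()/kvm_is_device_pfn() implementation):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified memory kinds for illustration only. */
typedef enum { MEM_NORMAL, MEM_ZONE_DEVICE, MEM_MMIO } MemKind;

/* Models the post-eeb0753ba27b behaviour: pfn_valid() is also true
 * for ZONE_DEVICE memory -- the crux of the discussion above. */
static bool model_pfn_valid(MemKind k)
{
    return k != MEM_MMIO;
}

/* Tags may only be initialised on memory guaranteed to support MTE:
 * "normal" struct-page-backed RAM. ZONE_DEVICE may be mapped Normal
 * Cacheable at Stage 2 yet still not support tagging. */
static bool can_sanitise_tags(MemKind k)
{
    return model_pfn_valid(k) && k != MEM_ZONE_DEVICE;
}
```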

> > > I now wonder if we can get a MAP_ANONYMOUS mapping of ZONE_DEVICE pfn
> > > even without virtualisation.
> > 
> > I haven't checked all the code paths but I don't think we can get a
> > MAP_ANONYMOUS mapping of ZONE_DEVICE memory as we normally need a file
> > descriptor.
> 
> I certainly hope this is the case - it's the weird corner cases of device
> drivers that worry me. E.g. I know i915 has a "hidden" mmap behind an ioctl
> (see i915_gem_mmap_ioctl(), although this case is fine - it's MAP_SHARED).
> Mali's kbase did something similar in the past.

I think this should be fine since it's not a MAP_ANONYMOUS (we do allow
MAP_SHARED to be tagged).

-- 
Catalin



Re: [RFC 4/8] virtio-pci: Batch add/del ioeventfds in a single MR transaction

2021-03-30 Thread Greg Kurz
On Mon, 29 Mar 2021 18:24:40 +0100
Stefan Hajnoczi  wrote:

> On Thu, Mar 25, 2021 at 04:07:31PM +0100, Greg Kurz wrote:
> > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > index 1b1942d521cc..0279e5671bcb 100644
> > --- a/softmmu/memory.c
> > +++ b/softmmu/memory.c
> > @@ -2368,7 +2368,7 @@ void memory_region_add_eventfd_full(MemoryRegion *mr,
> >  if (size) {
> >  adjust_endianness(mr, &data, size_memop(size) | MO_TE);
> >  }
> > -if (transaction) {
> > +if (!transaction) {
> >  memory_region_transaction_begin();
> >  }
> >  for (i = 0; i < mr->ioeventfd_nb; ++i) {
> > @@ -2383,7 +2383,7 @@ void memory_region_add_eventfd_full(MemoryRegion *mr,
> >  sizeof(*mr->ioeventfds) * (mr->ioeventfd_nb-1 - i));
> >  mr->ioeventfds[i] = mrfd;
> >  ioeventfd_update_pending |= mr->enabled;
> > -if (transaction) {
> > +if (!transaction) {
> >  memory_region_transaction_commit();
> >  }
> 
> Looks like these two hunks belong in a previous patch.

And they are actually wrong... we *do* want a nested
transaction if 'transaction' is true :) This is a
leftover I thought I had removed but obviously not...
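So the intended behaviour keeps the original `if (transaction)` condition. A behavioural sketch with a toy depth counter (illustrative only; not the real memory API):

```c
#include <assert.h>
#include <stdbool.h>

static int depth;    /* transaction nesting level */
static int commits;  /* number of times pending updates were flushed */

static void transaction_begin(void)  { depth++; }
static void transaction_commit(void) { if (--depth == 0) commits++; }

/* If 'transaction' is true, wrap the change in a nested transaction so it
 * is flushed; if false, the caller has already opened one and we just
 * piggy-back on it, batching many changes into a single flush. */
static void add_eventfd(bool transaction)
{
    if (transaction) {
        transaction_begin();
    }
    /* ... insert the ioeventfd into mr->ioeventfds here ... */
    if (transaction) {
        transaction_commit();
    }
}
```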




Re: Serious doubts about Gitlab CI

2021-03-30 Thread Daniel P . Berrangé
On Mon, Mar 29, 2021 at 03:10:36PM +0100, Stefan Hajnoczi wrote:
> Hi,
> I wanted to follow up with a summary of the CI jobs:
> 
> 1. Containers & Containers Layer2 - ~3 minutes/job x 39 jobs
> 2. Builds - ~50 minutes/job x 61 jobs
> 3. Tests - ~12 minutes/job x 20 jobs
> 4. Deploy - 52 minutes x 1 job
> 
> The Builds phase consumes the most CI minutes. If we can optimize this
> phase then we'll achieve the biggest impact.
> 
> In the short term builds could be disabled. However, in the long term I
> think full build coverage is desirable to prevent merging code that
> breaks certain host OSes/architectures (e.g. stable Linux distros,
> macOS, etc).

The notion of "full build coverage" doesn't really exist in reality.
The number of platforms that QEMU is targeting, combined with the
number of features that can be turned on/off in QEMU configure
means that the matrix for "full build coverage" is too huge to ever
contemplate.

So far we've been adding new jobs whenever we hit some situation
where we found a build problem that wasn't previously detected by
CI. In theory this is a more reasonable strategy than striving
for full build coverage, as it targets only places where we've hit
real world problems. I think we're seeing though, that even the
incremental new coverage approach is not sustainable in the real
world. Or rather it is only sustainable if CI resources are
essentially free.


Traditionally the biggest amount of testing would be done in a
freeze period leading up to a release. With GitLab CI we've tried
to move to a model where testing is continuous, such that we
have git master in a so called "always ready" state. This is
very good in general, but it comes with significant hardware
resource costs. We've relied on free services for this, and that
is becoming less viable.



I think a challenge we have with our incremental approach is that
we're not really taking into account relative importance of the
different build scenarios, and often don't look at the big picture
of what the new job adds in terms of quality, compared to existing
jobs.

eg Consider we have

  build-system-alpine:
  build-system-ubuntu:
  build-system-debian:
  build-system-fedora:
  build-system-centos:
  build-system-opensuse:

  build-trace-multi-user:
  build-trace-ftrace-system:
  build-trace-ust-system:

I'd question whether we really need any of those 'build-trace'
jobs. Instead, we could have build-system-ubuntu pass
--enable-trace-backends=log,simple,syslog, build-system-debian
pass --enable-trace-backends=ust and build-system-fedora
pass --enable-trace-backends=ftrace, etc. 

Another example, is that we test builds on centos7 with
three different combos of crypto backend settings. This was
to exercise bugs we've seen in old crypto packages in RHEL-7
but in reality, it is probably overkill, because downstream
RHEL-7 only cares about one specific combination.

We don't really have a clearly defined plan to identify what
the most important things are in our testing coverage, so we
tend to accept anything without questioning its value add.
This really feeds back into the idea I've brought up many
times in the past, that we need to better define what we aim
to support in QEMU and its quality level, which will influence
what are the scenarios we care about testing.


> Traditionally ccache (https://ccache.dev/) was used to detect
> recompilation of the same compiler input files. This is trickier to do
> in GitLab CI since it would be necessary to share and update a cache,
> potentially between untrusted users. Unfortunately this shifts the
> bottleneck from CPU to network in a CI-as-a-Service environment since
> the cached build output needs to be accessed by the linker on the CI
> runner but is stored remotely.

Our docker containers install ccache already and I could have sworn
that we use that in gitlab, but now I'm not so sure. We're only
saving the "build/" directory as an artifact between jobs, and I'm
not sure that directory holds the ccache cache.

> A complementary approach is avoiding compilation altogether when code
> changes do not affect a build target. For example, a change to
> qemu-storage-daemon.c does not require rebuilding the system emulator
> targets. Either the compiler or the build system could produce a
> manifest of source files that went into a build target, and that
> information is what's needed to avoid compiling unchanged targets.

I think we want to be pretty wary of making the CI jobs too complex
in what they do. We want them to accurately reflect the way that our
developers and end users build the system in general. Trying to add
clever logic to the CI system to skip building certain pieces will
make the CI system more complex and fragile which will increase the
burden of keeping CI working reliably.

> Ideally the CI would look at the code changes and only launch jobs that
> were affected. Those jobs would use a C compiler cache to avoid
> rebuilding compiler input that has not changed. 

Re: [PATCH v3 0/5] qemu-iotests: quality of life improvements

2021-03-30 Thread Max Reitz

On 26.03.21 15:23, Paolo Bonzini wrote:

This series adds a few usability improvements to qemu-iotests, in
particular:

- arguments can be passed to Python unittests scripts, for example
   to run only a subset of the test cases (patches 1-2)

- it is possible to do "./check -- ../../../tests/qemu-iotests/055 args..."
   and specify arbitrary arguments to be passed to a single test script.
   This makes it possible to take advantage of the previous feature and eases
   debugging of Python tests.

Paolo

Thanks, I’ve amended patch 4 and applied the series to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Max




Re: [PATCH v3 0/5] qemu-iotests: quality of life improvements

2021-03-30 Thread Paolo Bonzini

On 30/03/21 13:44, Max Reitz wrote:

On 30.03.21 13:32, Max Reitz wrote:

On 26.03.21 15:23, Paolo Bonzini wrote:

This series adds a few usability improvements to qemu-iotests, in
particular:

- arguments can be passed to Python unittests scripts, for example
   to run only a subset of the test cases (patches 1-2)

- it is possible to do "./check -- ../../../tests/qemu-iotests/055 
args..."

   and specify arbitrary arguments to be passed to a single test script.
   This makes it possible to take advantage of the previous feature and eases
   debugging of Python tests.

Paolo

Thanks, I’ve amended patch 4 and applied the series to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block


I’m sorry, but I’ll have to drop it again.  At least iotests 245 and 295 
fail; I assume it has something to do with `iotests.activate_logging()`.


Ok, will look into it.  Can you give me the exact set of ./check 
invocations that you use?


Paolo




Re: [PULL 00/10] For 6.0 patches

2021-03-30 Thread Peter Maydell
On Tue, 30 Mar 2021 at 09:29, Marc-André Lureau
 wrote:
>
> Hi
>
> On Mon, Mar 29, 2021 at 9:54 PM Peter Maydell  
> wrote:
>> aarch64 CI machine, which has python 3.8.5 and sphinx-build 1.8.5.
>> My guess is that it might be the sphinx-build version here. I vaguely
>> recall that Sphinx is kind of picky about exceptions within the conf
>> file but that there was a change in what it allowed at some point.
>> It's possible we just can't do much with the old versions.
>
>
How do you run the build? Running make from an existing configured or built 
state? If so, I have seen sphinx errors that don't stop the build (and that 
actually build the docs without sphinx-rtd). I don't know why this happens; 
"regenerate"/reconfigure errors should stop the build.

On that machine, yes, it's an incremental build.

thanks
-- PMM



[PULL 3/9] iotests/116: Fix reference output

2021-03-30 Thread Max Reitz
15ce94a68ca ("block/qed: bdrv_qed_do_open: deal with errp") has improved
the qed driver's error reporting, though sadly did not add a test for
it.
The good news is: there already is such a test, namely 116.
The bad news is: its reference output was not adjusted, and so now it
fails.

Let's fix the reference output, which has the nice side effect of
demonstrating 15ce94a68ca's improvements.

Fixes: 15ce94a68ca6730466c565c3d29971aab3087bf1
   ("block/qed: bdrv_qed_do_open: deal with errp")
Signed-off-by: Max Reitz 
Message-Id: <20210326141419.156831-1-mre...@redhat.com>
---
 tests/qemu-iotests/116.out | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/116.out b/tests/qemu-iotests/116.out
index 49f9a261a0..5f6c6fffca 100644
--- a/tests/qemu-iotests/116.out
+++ b/tests/qemu-iotests/116.out
@@ -2,7 +2,7 @@ QA output created by 116
 
 == truncated header cluster ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
-qemu-io: can't open device TEST_DIR/t.qed: Could not open 'TEST_DIR/t.qed': 
Invalid argument
+qemu-io: can't open device TEST_DIR/t.qed: QED table offset is invalid
 
 == invalid header magic ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
@@ -10,21 +10,21 @@ qemu-io: can't open device TEST_DIR/t.qed: Image not in QED 
format
 
 == invalid cluster size ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
-qemu-io: can't open device TEST_DIR/t.qed: Could not open 'TEST_DIR/t.qed': 
Invalid argument
+qemu-io: can't open device TEST_DIR/t.qed: QED cluster size is invalid
 
 == invalid table size ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
-qemu-io: can't open device TEST_DIR/t.qed: Could not open 'TEST_DIR/t.qed': 
Invalid argument
+qemu-io: can't open device TEST_DIR/t.qed: QED table size is invalid
 
 == invalid header size ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
-qemu-io: can't open device TEST_DIR/t.qed: Could not open 'TEST_DIR/t.qed': 
Invalid argument
+qemu-io: can't open device TEST_DIR/t.qed: QED table offset is invalid
 
 == invalid L1 table offset ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
-qemu-io: can't open device TEST_DIR/t.qed: Could not open 'TEST_DIR/t.qed': 
Invalid argument
+qemu-io: can't open device TEST_DIR/t.qed: QED table offset is invalid
 
 == invalid image size ==
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
-qemu-io: can't open device TEST_DIR/t.qed: Could not open 'TEST_DIR/t.qed': 
Invalid argument
+qemu-io: can't open device TEST_DIR/t.qed: QED image size is invalid
 *** done
-- 
2.29.2




Re: Serious doubts about Gitlab CI

2021-03-30 Thread Paolo Bonzini

On 30/03/21 14:23, Philippe Mathieu-Daudé wrote:

On 3/30/21 2:09 PM, Paolo Bonzini wrote:

On 30/03/21 13:55, Thomas Huth wrote:


Also I wonder whether we could maybe even get rid of the capstone and
slirp submodules in QEMU now


At least for slirp, we probably want to stay more on the bleeding edge
which implies having to keep the submodule.


FYI QEMU libSLiRP submodule doesn't point to bleeding edge branch but to
the stable branch (which should be what distributions package).


Now, but that may change already in 6.1 in order to add CFI support.

Paolo




Re: [RFC 0/8] virtio: Improve boot time of virtio-scsi-pci and virtio-blk-pci

2021-03-30 Thread Greg Kurz
On Mon, 29 Mar 2021 18:35:16 +0100
Stefan Hajnoczi  wrote:

> On Thu, Mar 25, 2021 at 04:07:27PM +0100, Greg Kurz wrote:
> > Now that virtio-scsi-pci and virtio-blk-pci map 1 virtqueue per vCPU,
> > a serious slow down may be observed on setups with a big enough number
> > of vCPUs.
> > 
> > Example with a pseries guest on a bi-POWER9 socket system (128 HW threads):
> > 
> > 1   0m20.922s   0m21.346s
> > 2   0m21.230s   0m20.350s
> > 4   0m21.761s   0m20.997s
> > 8   0m22.770s   0m20.051s
> > 16  0m22.038s   0m19.994s
> > 32  0m22.928s   0m20.803s
> > 64  0m26.583s   0m22.953s
> > 128 0m41.273s   0m32.333s
> > 256 2m4.727s1m16.924s
> > 384 6m5.563s3m26.186s
> > 
> > Both perf and gprof indicate that QEMU is hogging CPUs when setting up
> > the ioeventfds:
> > 
> >  67.88%  swapper [kernel.kallsyms]  [k] power_pmu_enable
> >   9.47%  qemu-kvm[kernel.kallsyms]  [k] smp_call_function_single
> >   8.64%  qemu-kvm[kernel.kallsyms]  [k] power_pmu_enable
> > =>2.79%  qemu-kvm  qemu-kvm   [.] memory_region_ioeventfd_before
> > =>2.12%  qemu-kvm  qemu-kvm   [.] address_space_update_ioeventfds
> >   0.56%  kworker/8:0-mm  [kernel.kallsyms]  [k] smp_call_function_single
> > 
> > address_space_update_ioeventfds() is called when committing an MR
> > transaction, i.e. for each ioeventfd with the current code base,
> > and it internally loops on all ioeventfds:
> > 
> > static void address_space_update_ioeventfds(AddressSpace *as)
> > {
> > [...]
> > FOR_EACH_FLAT_RANGE(fr, view) {
> > for (i = 0; i < fr->mr->ioeventfd_nb; ++i) {
> > 
> > This means that the setup of ioeventfds for these devices has
> > quadratic time complexity.
> > 
> > This series introduces generic APIs to allow batch creation and deletion
> > of ioeventfds, and converts virtio-blk and virtio-scsi to use them. This
> > greatly improves the numbers:
> > 
> > 1   0m21.271s   0m22.076s
> > 2   0m20.912s   0m19.716s
> > 4   0m20.508s   0m19.310s
> > 8   0m21.374s   0m20.273s
> > 16  0m21.559s   0m21.374s
> > 32  0m22.532s   0m21.271s
> > 64  0m26.550s   0m22.007s
> > 128 0m29.115s   0m27.446s
> > 256 0m44.752s   0m41.004s
> > 384 1m2.884s0m58.023s
> 
> Excellent numbers!
> 
> I wonder if the code can be simplified since
> memory_region_transaction_begin/end() supports nesting. Why not call
> them directly from the device model instead of introducing callbacks in
> core virtio and virtio-pci code?
> 

It seems a bit awkward that the device model should assume a memory
transaction is needed to set up host notifiers, which are ioeventfds
under the hood, but the device doesn't know that.
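
The quadratic behaviour described in the quoted cover letter can be sketched with a toy model (plain Python, illustration only — not QEMU code): a per-ioeventfd commit rescans all ioeventfds registered so far, while a single batched commit scans once.

```python
# Toy model of the ioeventfd setup cost (illustration only, not QEMU code).
def setup_per_item(n):
    # Committing a transaction per ioeventfd: each commit rescans every
    # ioeventfd registered so far, analogous to calling
    # address_space_update_ioeventfds() once per registration.
    scans = 0
    fds = []
    for i in range(n):
        fds.append(i)
        scans += len(fds)
    return scans          # 1 + 2 + ... + n = n*(n+1)/2, i.e. O(n^2)

def setup_batched(n):
    # One enclosing transaction: register everything, then commit once.
    fds = list(range(n))
    return len(fds)       # single O(n) scan at transaction end

print(setup_per_item(384), setup_batched(384))  # 73920 384
```

For 384 virtqueues the per-item model does 73920 scans versus 384 for the batched one, which matches the shape of the slowdown in the timing tables above.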

> Also, do you think there are other opportunities to have a long
> transaction to batch up machine init, device hotplug, etc? It's not
> clear to me when transactions must be ended. Clearly it's necessary to

The transaction *must* be ended before calling
virtio_bus_cleanup_host_notifier() because
address_space_add_del_ioeventfds(), called when
finishing the transaction, needs the "to-be-closed"
eventfds to be still open, otherwise the KVM_IOEVENTFD 
ioctl() might fail with EBADF.

See this change in patch 3:

@@ -315,6 +338,10 @@ static void virtio_bus_unset_and_cleanup_host_notifiers(VirtioBusState *bus,
 
 for (i = 0; i < nvqs; i++) {
 virtio_bus_set_host_notifier(bus, i + n_offset, false);
+}
+/* Let address_space_update_ioeventfds() run before closing ioeventfds */
+virtio_bus_set_host_notifier_commit(bus);
+for (i = 0; i < nvqs; i++) {
 virtio_bus_cleanup_host_notifier(bus, i + n_offset);
 }
 }

Maybe I should provide more details on why we're doing that?

> end the transaction if we need to do something that depends on the
> MemoryRegion, eventfd, etc being updated. But most of the time there is
> no immediate need to end the transaction and more code could share the
> same transaction before we go back to the event loop or vcpu thread.
> 

I can't tell for all scenarios that involve memory transactions but
it seems this is definitely not the case for ioeventfds : the rest
of the code expects the transaction to be complete.

> Stefan

Thanks for the review!

Cheers,

--
Greg




Re: Serious doubts about Gitlab CI

2021-03-30 Thread Daniel P . Berrangé
On Tue, Mar 30, 2021 at 01:55:48PM +0200, Thomas Huth wrote:
> On 30/03/2021 13.19, Daniel P. Berrangé wrote:

> > Another example, is that we test builds on centos7 with
> > three different combos of crypto backend settings. This was
> > to exercise bugs we've seen in old crypto packages in RHEL-7
> > but in reality, it is probably overkill, because downstream
> > RHEL-7 only cares about one specific combination.
> 
> Care to send a patch? Or shall we just wait one more month and then remove
> these jobs (since we won't support RHEL7 after QEMU 6.0 anymore)?

Yeah, we'll be able to cull this entirely very soon, including
both the C backcompat code and CI jobs at the same time, so I'll
just wait.


> > Our docker containers install ccache already and I could have sworn
> > that we use that in gitlab, but now I'm not so sure. We're only
> > saving the "build/" directory as an artifact between jobs, and I'm
> > not sure that directory holds the ccache cache.
> 
> AFAIK we never really enabled ccache in the gitlab-CI, only in Travis.
> 
> > > This is as far as I've gotten with thinking about CI efficiency. Do you
> > > think these optimizations are worth investigating or should we keep it
> > > simple and just disable many builds by default?
> > 
> > ccache is a no-brainer and assuming it isn't already working with
> > our gitlab jobs, we must fix that asap.
> 
> I've found some nice instructions here:
> 
> https://gould.cx/ted/blog/2017/06/10/ccache-for-Gitlab-CI/
> 
> ... and just kicked off a build with these modifications, let's see how it
> goes...

Yep, that looks similar to what we do in libvirt, though we don't override
the compiler at the job level. Instead we just ensure the dir containing
ccache symlinks appears first in $PATH.

So in containers we have this:

https://gitlab.com/libvirt/libvirt/-/blob/master/ci/containers/centos-8.Dockerfile

and in gitlab-ci.yml we have env vars set

  export CCACHE_BASEDIR="$(pwd)"
  export CCACHE_DIR="$CCACHE_BASEDIR/ccache"
  export CCACHE_MAXSIZE="500M"
  export PATH="$CCACHE_WRAPPERSDIR:$PATH"

And per-job caches:

  cache:
paths:
  - ccache/
key: "$CI_JOB_NAME"

note the "key" is important to avoid clashing caches from different
envs.
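
Putting the quoted pieces together, a minimal gitlab-ci fragment along these lines (illustrative only, not the actual libvirt or QEMU configuration) would be:

```yaml
# Illustrative .gitlab-ci.yml fragment (assumed job name and paths,
# not the actual libvirt/QEMU CI config).
build-job:
  before_script:
    - export CCACHE_BASEDIR="$(pwd)"
    - export CCACHE_DIR="$CCACHE_BASEDIR/ccache"
    - export CCACHE_MAXSIZE="500M"
    - export PATH="$CCACHE_WRAPPERSDIR:$PATH"
  cache:
    key: "$CI_JOB_NAME"   # per-job key avoids clashing caches across envs
    paths:
      - ccache/
```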

Regards,
Daniel
-- 
|: https://berrange.com      -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org       -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|




Re: Serious doubts about Gitlab CI

2021-03-30 Thread Paolo Bonzini

On 30/03/21 15:12, Daniel P. Berrangé wrote:

Now, but that may change already in 6.1 in order to add CFI support.

We can bundle a newer version, but we don't need to require a newer
version. Simply conditional compile for the bits we need. If distro
slirp is too old, then sorry, you can't enable CFI + slirp at the
same time. If the distro really wants that combination we don't have
to own the solution - the distro should update their slirp.

Or to put it another way, QEMU doesn't need to go out of its way to
enable new features on old distros. We merely need to not regress
in the features we previously offered.  We bundled slirp as a submodule
so that old distros didn't lose slirp entirely. We don't need to
offer CFI on those distros.


This is true, on the other hand only having to support one API version 
has its benefits.  The complication in the build system is minimal once 
slirp is made into a subproject; therefore it is appealing to keep the 
QEMU code simple.


Paolo




Re: [PATCH v10 1/6] arm64: mte: Sync tags for pages where PTE is untagged

2021-03-30 Thread Catalin Marinas
On Mon, Mar 29, 2021 at 04:55:29PM +0100, Steven Price wrote:
> On 26/03/2021 18:56, Catalin Marinas wrote:
> > On Fri, Mar 12, 2021 at 03:18:57PM +, Steven Price wrote:
> > > A KVM guest could store tags in a page even if the VMM hasn't mapped
> > > the page with PROT_MTE. So when restoring pages from swap we will
> > > need to check to see if there are any saved tags even if !pte_tagged().
> > > 
> > > However don't check pages which are !pte_valid_user() as these will
> > > not have been swapped out.
> > > 
> > > Signed-off-by: Steven Price 
> > > ---
> > >   arch/arm64/include/asm/pgtable.h |  2 +-
> > >   arch/arm64/kernel/mte.c  | 16 
> > >   2 files changed, 13 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/arch/arm64/include/asm/pgtable.h 
> > > b/arch/arm64/include/asm/pgtable.h
> > > index e17b96d0e4b5..84166625c989 100644
> > > --- a/arch/arm64/include/asm/pgtable.h
> > > +++ b/arch/arm64/include/asm/pgtable.h
> > > @@ -312,7 +312,7 @@ static inline void set_pte_at(struct mm_struct *mm, 
> > > unsigned long addr,
> > >   __sync_icache_dcache(pte);
> > >   if (system_supports_mte() &&
> > > - pte_present(pte) && pte_tagged(pte) && !pte_special(pte))
> > > + pte_present(pte) && pte_valid_user(pte) && !pte_special(pte))
> > >   mte_sync_tags(ptep, pte);
> > 
> > With the EPAN patches queued in for-next/epan, pte_valid_user()
> > disappeared as its semantics weren't very clear.
> 
> Thanks for pointing that out.
> 
> > So this relies on the set_pte_at() being done on the VMM address space.
> > I wonder, if the VMM did an mprotect(PROT_NONE), can the VM still access
> > it via stage 2? If yes, the pte_valid_user() test wouldn't work. We need
> > something like pte_present() && addr <= user_addr_max().
> 
> AFAIUI the stage 2 matches the VMM's address space (for the subset that has
> memslots). So mprotect(PROT_NONE) would cause the stage 2 mapping to be
> invalidated and a subsequent fault would exit to the VMM to sort out. This
> sort of thing is done for the lazy migration use case (i.e. pages are
> fetched as the VM tries to access them).

There's also the protected KVM case which IIUC wouldn't provide any
mapping of the guest memory to the host (or maybe the host still thinks
it's there but cannot access it without a Stage 2 fault). At least in
this case it wouldn't swap pages out and it would be the responsibility
of the EL2 code to clear the tags when giving pages to the guest
(user_mem_abort() must not touch the page).

So basically we either have a valid, accessible mapping in the VMM and
we can handle the tags via set_pte_at() or we leave it to whatever is
running at EL2 in the pKVM case.

I don't remember whether we had a clear conclusion in the past: have we
ruled out requiring the VMM to map the guest memory with PROT_MTE
entirely? IIRC a potential problem was the VMM using MTE itself and
having to disable it when accessing the guest memory.

Another potential issue (I haven't got my head around it yet) is a race
in mte_sync_tags() as we now defer the PG_mte_tagged bit setting until
after the tags had been restored. Can we have the same page mapped by
two ptes, each attempting to restore it from swap and one gets it first
and starts modifying it? Given that we set the actual pte after setting
PG_mte_tagged, it's probably alright but I think we miss some barriers.

Also, if a page is not a swap one, we currently clear the tags if mapped
as pte_tagged() (prior to this patch). We'd need something similar when
mapping it in the guest so that we don't leak tags but to avoid any page
ending up with PG_mte_tagged, I think you moved the tag clearing to
user_mem_abort() in the KVM code. I presume set_pte_at() in the VMM
would be called first and then set in Stage 2.

> > BTW, ignoring virtualisation, can we ever bring a page in from swap on a
> > PROT_NONE mapping (say fault-around)? It's not too bad if we keep the
> > metadata around for when the pte becomes accessible but I suspect we
> > remove it if the page is removed from swap.
> 
> There are two stages of bringing data from swap. First is populating the
> swap cache by doing the physical read from swap. The second is actually
> restoring the page table entries.

When is the page metadata removed? I want to make sure we don't drop it
for some pte attributes.

-- 
Catalin



Re: [PATCH v3 4/5] qemu-iotests: let "check" spawn an arbitrary test command

2021-03-30 Thread Max Reitz

On 30.03.21 12:38, Max Reitz wrote:

On 26.03.21 16:05, Max Reitz wrote:

On 26.03.21 15:23, Paolo Bonzini wrote:
Right now there is no easy way for "check" to print a reproducer command.
Because such a reproducer command line would be huge, we can instead teach
check to start a command of our choice.  This can be for example a Python
unit test with arguments to only run a specific subtest.

Move the trailing empty line to print_env(), since it always looks better
and one caller was not adding it.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Tested-by: Emanuele Giuseppe Esposito 
Message-Id: <20210323181928.311862-5-pbonz...@redhat.com>
---
  tests/qemu-iotests/check | 18 +-
  tests/qemu-iotests/testenv.py    |  3 ++-
  tests/qemu-iotests/testrunner.py |  1 -
  3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index d1c87ceaf1..df9fd733ff 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -19,6 +19,9 @@
  import os
  import sys
  import argparse
+import shutil
+from pathlib import Path
+
  from findtests import TestFinder
  from testenv import TestEnv
  from testrunner import TestRunner
@@ -101,7 +104,7 @@ def make_argparser() -> argparse.ArgumentParser:
                           'rerun failed ./check command, starting from the '
                           'middle of the process.')
  g_sel.add_argument('tests', metavar='TEST_FILES', nargs='*',
-   help='tests to run')
+   help='tests to run, or "--" followed by a command')

  return p
@@ -114,6 +117,19 @@ if __name__ == '__main__':
    imgopts=args.imgopts, misalign=args.misalign,
    debug=args.debug, valgrind=args.valgrind)
+    if len(sys.argv) > 1 and sys.argv[-len(args.tests)-1] == '--':
+    if not args.tests:
+    sys.exit("missing command after '--'")
+    cmd = args.tests
+    env.print_env()
+    exec_path = Path(shutil.which(cmd[0]))


297 says:

check:125: error: Argument 1 to "Path" has incompatible type "Optional[str]"; expected "Union[str, _PathLike[str]]"

Found 1 error in 1 file (checked 1 source file)

Normally I’d assert this away, but actually I think the returned value 
should be checked and we should print an error if it’s None.  (Seems 
like shutil.which() doesn’t raise an exception if there is no such 
command, it just returns None.)


Max


+    if exec_path is None:
+    sys.exit('command not found: ' + cmd[0])


Oh, I see, the intent to print an error is actually there.  The problem 
is just that Path(None) throws an exception, so we must check 
shutil.which()’s return value.


I’ll squash this in if you don’t mind:

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index df9fd733ff..e2230f5612 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -122,9 +122,10 @@ if __name__ == '__main__':
  sys.exit("missing command after '--'")
  cmd = args.tests
  env.print_env()
-    exec_path = Path(shutil.which(cmd[0]))
-    if exec_path is None:
+    exec_pathstr = shutil.which(cmd[0])
+    if exec_pathstr is None:
  sys.exit('command not found: ' + cmd[0])
+    exec_path = Path(exec_pathstr)
  cmd[0] = exec_path.resolve()
  full_env = env.prepare_subprocess(cmd)
  os.chdir(Path(exec_path).parent)


+    cmd[0] = exec_path.resolve()
+    full_env = env.prepare_subprocess(cmd)
+    os.chdir(Path(exec_path).parent)


Oh, and this Path() does nothing, I presume, so I’m going to replace it 
with just “exec_path”.
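
Taken together, the None check and the resolve step look like this in isolation (a standalone sketch of the pattern, not the actual check script; the helper name is made up):

```python
import shutil
import sys
from pathlib import Path

def resolve_command(cmd0: str) -> Path:
    # shutil.which() does not raise when the command is missing;
    # it returns None, so check before constructing a Path.
    exec_pathstr = shutil.which(cmd0)
    if exec_pathstr is None:
        sys.exit('command not found: ' + cmd0)
    return Path(exec_pathstr).resolve()

print(resolve_command('sh'))
```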


Max


+    os.execve(cmd[0], cmd, full_env)
+
  testfinder = TestFinder(test_dir=env.source_iotests)
  groups = args.groups.split(',') if args.groups else None
diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index fca3a609e0..cd0e39b789 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -284,7 +284,8 @@ def print_env(self) -> None:
  PLATFORM  -- {platform}
  TEST_DIR  -- {TEST_DIR}
  SOCK_DIR  -- {SOCK_DIR}
-SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}"""
+SOCKET_SCM_HELPER -- {SOCKET_SCM_HELPER}
+"""
  args = collections.defaultdict(str, self.get_env())
diff --git a/tests/qemu-iotests/testrunner.py b/tests/qemu-iotests/testrunner.py
index 519924dc81..2f56ac545d 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -316,7 +316,6 @@ def run_tests(self, tests: List[str]) -> bool:
  if not self.makecheck:
  self.env.print_env()
-    print()
  test_field_width = max(len(os.path.basename(t)) for t in 
tests) + 2











Re: Ways to deal with broken machine types

2021-03-30 Thread David Edmondson
On Tuesday, 2021-03-23 at 15:40:24 -04, Michael S. Tsirkin wrote:

> On Tue, Mar 23, 2021 at 05:40:36PM +, Daniel P. Berrangé wrote:
>> On Tue, Mar 23, 2021 at 05:54:47PM +0100, Igor Mammedov wrote:
>> > Let me hijack this thread for beyond this case scope.
>> > 
>> > I agree that for this particular bug we've done all we could, but
>> > there is a broader issue to discuss here.
>> > 
>> > We have machine versions to deal with hw compatibility issues and that 
>> > covers most of the cases,
>> > but occasionally we notice a problem well after release(s),
>> > so users may be stuck with a broken VM and need to manually fix
>> > the configuration (and/or VM).
>> > Figuring out what's wrong and how to fix it is far from trivial. So let's
>> > discuss whether we can help to ease this pain; yes, it will be late for
>> > the first victims, but it's still better than never.
>> 
>> To summarize the problem situation
>> 
>>  - We rely on a machine type version to encode a precise guest ABI.
>>  - Due a bug, we are in a situation where the same machine type
>>encodes two distinct guest ABIs due to a mistake introduced
>>between QEMU N-2 and N-1
>>  - We want to fix the bug in QEMU N
>>  - For incoming migration there is no way to distinguish between
>>the ABIs used in N-2 and N-1, to pick the right one
>
>
> Not just incoming migration. Same applies to a guest restart.
>
>
>> So we're left with an unwinnable problem:
>> 
>>   - Not fixing the bug =>
>> 
>>a) user migrating N-2 to N-1 have ABI change
>>b) user migrating N-2 to N have ABI change
>>c) user migrating N-1 to N are fine
>> 
>> No mitigation for (a) or (b)
>> 
>>   - Fixing the bug =>
>> 
>>a) user migrating N-2 to N-1 have ABI change.
>>b) user migrating N-2 to N are fine
>>c) user migrating N-1 to N have ABI change
>> 
>> Bad situations (a) and (c) are mitigated by
>> backporting fix to N-1-stable too.
>> 
>> Generally we have preferred to fix the bug, because we have
>> usually identified them fairly quickly after release, and
>> backporting the fix to stable has been sufficient mitigation
>> against ill effects. Basically the people left broken are a
>> relatively small set out of the total userbase.
>> 
>> The real challenge arises when we are slow to identify the
>> problem, such that we have a large number of people impacted.
>> 
>> 
>> > I'll try to sum up the idea Michael suggested (here comes my unorganized
>> > brain-dump):
>> > 
>> > 1. We can keep in the VM's config the QEMU version it was created on,
>> >and at minimum warn the user with a pointer to known issues if the version
>> >in the config mismatches the version of the actually used QEMU, with a
>> >knob to silence it for a particular mismatch.
>> > 
>> > When an issue becomes know and resolved we know for sure how and what
>> > changed and embed instructions on what options to use for fixing up VM's
>> > config to preserve old HW config depending on QEMU version VM was 
>> > installed on.
>> 
>> > some more ideas:
>> >2. let mgmt layer to keep fixup list and apply them to config if 
>> > available
>> >(user would need to upgrade mgmt or update fixup list somehow)
>> >3. let mgmt layer to pass VM's QEMU version to currently used QEMU, so
>> >   that QEMU could maintain and apply fixups based on QEMU version + 
>> > machine type.
>> >   The user will have to upgrade to newer QEMU to get/use new fixups.
>> 
>> The nice thing about machine type versioning is that we are treating the
>> versions as opaque strings which represent a specific ABI, regardless of
>> the QEMU version. This means that even if distros backport fixes for bugs
>> or even new features, the machine type compatibility check remains a
>> simple equality comparison.
>> 
>> As soon as you introduce the QEMU version though, we have created a
>> large matrix for compatibility.
>
>
> Yes, but if we explicitly handle them all the same, then
> mechanically testing them all is overkill.
> We just need to test the ones that have bugs which we
> care about fixing.
>
>
>> This matrix is expanded if a distro
>> chooses to backport fixes for any of the machine type bugs to their
>> stable streams. This can get particularly expensive when there are
>> multiple streams a distro is maintaining.
>> 
>> *IF* the original N-1 qemu has a property that could be queried by
>> the mgmt app to identify a machine type bug, then we could potentially
>> apply a fixup automatically.
>> 
>> eg query-machines command in QEMU version N could report against
>> "pc-i440fx-5.0", that there was a regression fix that has to be
>> applied if property "foo" had value "bar".
>> 
>> Now, the mgmt app wants to migrate from QEMU N-2 or N-1 to QEMU N.
>> It can query the value of "foo" on the source QEMU with qom-get.
>> It now knows whether it has to override this property "foo" when
>> spawning QEMU N on the target host.
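
The flow proposed in the quote above can be simulated with hypothetical data (this is purely illustrative — `query-machines` does not report such fixups today, and the property names are made up):

```python
# Hypothetical fixup database as QEMU N might report it per machine type:
# if property "foo" had value "bar" on the source, override it on the target.
fixup_db = {
    "pc-i440fx-5.0": [
        {"property": "foo", "buggy_value": "bar", "fixed_value": "baz"},
    ],
}

def plan_overrides(machine_type, source_props):
    """Return the property overrides the mgmt app would apply when
    spawning the target QEMU, given values queried from the source."""
    overrides = {}
    for fix in fixup_db.get(machine_type, []):
        if source_props.get(fix["property"]) == fix["buggy_value"]:
            overrides[fix["property"]] = fix["fixed_value"]
    return overrides

print(plan_overrides("pc-i440fx-5.0", {"foo": "bar"}))  # {'foo': 'baz'}
```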
>> 
>> Of course this doesn't help us if neither N-1 or N-2 QEMU had a

Re: Ways to deal with broken machine types

2021-03-30 Thread Michael S. Tsirkin
On Tue, Mar 30, 2021 at 12:21:37PM +0100, David Edmondson wrote:
> > Unfortunately I do not think this is practical :(.
> >
> > All examples of breakage I am aware of, we did not
> > realise some part of interface was part of guest ABI
> > and unsafe to change. We simply would not know to write a
> > test for it.
> 
> While agreeing that it would not be possible to cover all aspects of the
> ABI immediately, does that mean that some level of coverage would not be
> useful?

Our testing already warns about ACPI table changes (which is what
happened here). We just verified them manually and thought they were
fine.

-- 
MST




Re: Serious doubts about Gitlab CI

2021-03-30 Thread Paolo Bonzini

On 30/03/21 13:55, Thomas Huth wrote:


Since the build system has been converted to meson, I think the 
configure script prefers to use the submodules instead of the distro 
packages. I've tried to remedy this a little bit here:


https://gitlab.com/qemu-project/qemu/-/commit/db0108d5d846e9a8

... but new jobs of course will use the submodules again if the author 
is not careful.


Hmm... it should be the same (or if not it's a bug).


Also I wonder whether we could maybe even get rid of the capstone and slirp 
submodules in QEMU now


At least for slirp, we probably want to stay more on the bleeding edge 
which implies having to keep the submodule.  Capstone and libfdt 
probably can go, though at least libfdt may be more useful on Windows.


Paolo







Re: Serious doubts about Gitlab CI

2021-03-30 Thread Philippe Mathieu-Daudé
On 3/30/21 2:09 PM, Paolo Bonzini wrote:
> On 30/03/21 13:55, Thomas Huth wrote:
>>
>> Also I wonder whether we could maybe even get rid of the capstone and
>> slirp submodules in QEMU now
> 
> At least for slirp, we probably want to stay more on the bleeding edge
> which implies having to keep the submodule.

FYI QEMU libSLiRP submodule doesn't point to bleeding edge branch but to
the stable branch (which should be what distributions package).



[PULL 0/9] Block patches for 6.0-rc1

2021-03-30 Thread Max Reitz
The following changes since commit ec2e6e016d24bd429792d08cf607e4c5350dcdaa:

  Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-6.0-pull-request' into staging (2021-03-28 19:49:57 +0100)

are available in the Git repository at:

  https://github.com/XanClic/qemu.git tags/pull-block-2021-03-30

for you to fetch changes up to 2ec7e8a94668efccf7f45634584cfa19a83fc553:

  iotests/244: Test preallocation for data-file-raw (2021-03-30 13:02:11 +0200)


Block patches for 6.0-rc1:
- Mark the qcow2 cache clean timer as external to fix record/replay
- Fix the mirror filter node's permissions so that an external process
  cannot grab an image while it is used as the mirror source
- Add documentation about FUSE exports to the storage daemon
- When creating a qcow2 image with the data-file-raw option, all
  metadata structures should be preallocated
- iotest fixes


Connor Kuehl (1):
  iotests: fix 051.out expected output after error text touchups

Max Reitz (6):
  iotests/116: Fix reference output
  iotests/046: Filter request length
  block/mirror: Fix mirror_top's permissions
  qsd: Document FUSE exports
  qcow2: Force preallocation with data-file-raw
  iotests/244: Test preallocation for data-file-raw

Pavel Dovgalyuk (1):
  qcow2: use external virtual timers

Tao Xu (1):
  iotests: Fix typo in iotest 051

 docs/tools/qemu-storage-daemon.rst   |  19 +
 block/mirror.c   |  32 +++--
 block/qcow2.c|  41 ++-
 storage-daemon/qemu-storage-daemon.c |   4 ++
 tests/qemu-iotests/046   |   3 +-
 tests/qemu-iotests/046.out   | 104 +--
 tests/qemu-iotests/051   |   2 +-
 tests/qemu-iotests/051.out   |   6 +-
 tests/qemu-iotests/051.pc.out|   4 +-
 tests/qemu-iotests/116.out   |  12 ++--
 tests/qemu-iotests/244   | 104 +++
 tests/qemu-iotests/244.out   |  68 --
 12 files changed, 319 insertions(+), 80 deletions(-)

-- 
2.29.2




[PULL 2/9] iotests: fix 051.out expected output after error text touchups

2021-03-30 Thread Max Reitz
From: Connor Kuehl 

A patch was recently applied that touched up some error messages that
pertained to key names like 'node-name'. The trouble is it only updated
tests/qemu-iotests/051.pc.out and not tests/qemu-iotests/051.out as
well.

Do that now.

Fixes: 785ec4b1b9 ("block: Clarify error messages pertaining to 'node-name'")
Signed-off-by: Connor Kuehl 
Message-Id: <20210318200949.1387703-2-cku...@redhat.com>
Tested-by: Christian Borntraeger 
Reviewed-by: John Snow 
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/051.out | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out
index 437053c839..441f83e41a 100644
--- a/tests/qemu-iotests/051.out
+++ b/tests/qemu-iotests/051.out
@@ -61,13 +61,13 @@ QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) quit
 
 Testing: -drive file=TEST_DIR/t.qcow2,node-name=123foo
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,node-name=123foo: Invalid node name
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,node-name=123foo: Invalid node-name: '123foo'
 
 Testing: -drive file=TEST_DIR/t.qcow2,node-name=_foo
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,node-name=_foo: Invalid node name
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,node-name=_foo: Invalid node-name: '_foo'
 
 Testing: -drive file=TEST_DIR/t.qcow2,node-name=foo#12
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,node-name=foo#12: Invalid node name
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,node-name=foo#12: Invalid node-name: 'foo#12'
 
 
 === Device without drive ===
-- 
2.29.2




[PULL 5/9] iotests/046: Filter request length

2021-03-30 Thread Max Reitz
For its concurrent requests, 046 has always filtered the offset,
probably because concurrent requests may settle in any order.  However,
it did not filter the request length, and so if requests with different
lengths settle in an unexpected order (notably the longer request before
the shorter request), the test fails (for no good reason).

Filter the length, too.

Signed-off-by: Max Reitz 
Message-Id: <20200918153323.108932-1-mre...@redhat.com>
---
 tests/qemu-iotests/046 |   3 +-
 tests/qemu-iotests/046.out | 104 ++---
 2 files changed, 54 insertions(+), 53 deletions(-)

diff --git a/tests/qemu-iotests/046 b/tests/qemu-iotests/046
index 50b0678f60..517b162508 100755
--- a/tests/qemu-iotests/046
+++ b/tests/qemu-iotests/046
@@ -187,7 +187,8 @@ EOF
 }
 
 overlay_io | $QEMU_IO blkdebug::"$TEST_IMG" | _filter_qemu_io |\
-   sed -e 's/bytes at offset [0-9]*/bytes at offset XXX/g'
+sed -e 's/[0-9]*\/[0-9]* bytes at offset [0-9]*/XXX\/XXX bytes at offset XXX/g' \
+-e 's/^[0-9]* KiB/XXX KiB/g'
 
 echo
 echo "== Verify image content =="
diff --git a/tests/qemu-iotests/046.out b/tests/qemu-iotests/046.out
index 66ad987ab3..b1a03f4041 100644
--- a/tests/qemu-iotests/046.out
+++ b/tests/qemu-iotests/046.out
@@ -71,74 +71,74 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=6442450944 backing_file=TEST_DIR
 == Some concurrent requests touching the same cluster ==
 blkdebug: Suspended request 'A'
 blkdebug: Resuming request 'A'
-wrote 8192/8192 bytes at offset XXX
-8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 8192/8192 bytes at offset XXX
-8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 8192/8192 bytes at offset XXX
-8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 blkdebug: Suspended request 'A'
 blkdebug: Resuming request 'A'
-wrote 8192/8192 bytes at offset XXX
-8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 65536/65536 bytes at offset XXX
-64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 blkdebug: Suspended request 'A'
 blkdebug: Resuming request 'A'
-wrote 8192/8192 bytes at offset XXX
-8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 65536/65536 bytes at offset XXX
-64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 32768/32768 bytes at offset XXX
-32 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 blkdebug: Suspended request 'A'
 blkdebug: Resuming request 'A'
-wrote 8192/8192 bytes at offset XXX
-8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 57344/57344 bytes at offset XXX
-56 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 4096/4096 bytes at offset XXX
-4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 32768/32768 bytes at offset XXX
-32 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-discard 65536/65536 bytes at offset XXX
-64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discard XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 blkdebug: Suspended request 'A'
 blkdebug: Resuming request 'A'
-wrote 8192/8192 bytes at offset XXX
-8 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 57344/57344 bytes at offset XXX
-56 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 4096/4096 bytes at offset XXX
-4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-wrote 65536/65536 bytes at offset XXX
-64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-discard 65536/65536 bytes at offset XXX
-64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote XXX/XXX bytes at offset XXX
+XXX KiB, X 

Re: [PATCH] docs: Add a QEMU Code of Conduct and Conflict Resolution Policy document

2021-03-30 Thread Paolo Bonzini

On 30/03/21 09:13, Thomas Huth wrote:
Contributor Covenant 1.x is certainly an option, too, but it has IMHO 
already quite rigorous language ("Project maintainers have the [...] 
responsibility to remove, edit, or reject comments, commits, code, wiki 
edits ...", "Project maintainers who do not [...] enforce the Code of 
Conduct may be permanently removed from the project team."), which could 
either scare away people from taking maintainers responsibility or also 
could be used to fire up arguments ("you are a maintainer, now according to 
the CoC you have to do this and that..."), which I'd rather like to avoid.
(well, as you know, I'm not a native English speaker, so I might also 
have gotten that tone wrong, but that's the impression that I had after 
reading that text as non-native speaker).


I see your point.  We also have the issue that mailing list archives are 
basically immutable and maintained on Savannah.  It would be hard for 
anyone to remove problematic language in many cases.


My first review last night focused on the conflict resolution policy 
because I was obviously more familiar with it.  I have now reread the 
code of conduct more closely and I like it, both the original and the 
small changes you made to the Django code of conduct.


I do have a couple of remarks:

* like its ancestor, it is still erring on the opposite side by not 
identifying who is responsible for having a welcoming community, which 
goes beyond remediation.  Maintainers do have _some_ responsibility in 
that respect, and it should be mentioned somewhere.


* this sentence could be seen as making QEMU responsible for acting 
based on what people say on Facebook or Twitter:



In addition, violations of this code outside these spaces may
+affect a person's ability to participate within them.


I don't want to open that can of worms; not now at least.  The conflict 
resolution policy already calls out specific exceptions as a consequence 
of CoC violations, and I think that's enough.


As you're the one doing the work I don't want to impose my view, but I'd 
like to ask you to consider at least the following two changes:


* replace the above sentence with "This code of conduct also applies 
outside these spaces, when an individual acts as a representative or a 
member of the project or its community".


* in the paragraph after it ("If you believe someone is violating the 
code of conduct...") prepend the following text from the Contributor 
Covenant: "By adopting this Code of Conduct, project maintainers commit 
themselves to fairly and consistently applying these principles to every 
aspect of managing this project".


(On top of this the "When we disagree, try to understand why" bullet is 
somewhat redundant with both the conflict resolution policy and other 
parts of the code of conduct, and I like such documents to be as short 
as possible.  But that's more cosmetic than normative, so it's not a big 
deal).


What do you think?

Thanks,

Paolo




[Bug 1920913] Re: Openjdk11+ fails to install on s390x

2021-03-30 Thread Namrata Bhave
Tried building jdk 11 from source; the generated executable still
crashes (fastdebug as well as release mode):

```
root@24d396a17e00:~/jdk# build/linux-s390x-normal-server-release/jdk/bin/java 
-version
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x00400b234440, pid=18175, tid=18178
#
# JRE version: OpenJDK Runtime Environment (11.0) (build 
11-internal+0-adhoc..jdk)
# Java VM: OpenJDK 64-Bit Server VM (11-internal+0-adhoc..jdk, mixed mode, 
tiered, compressed oops, g1 gc, linux-s390x)
# Problematic frame:
# J 78 c1 java.util.HashMap.afterNodeInsertion(Z)V java.base (1 bytes) @ 
0x00400b234440 [0x00400b234400+0x0040]
#
# Core dump will be written. Default location: Core dumps may be processed with 
"/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to 
/root/jdk/core.18175)
#
# An error report file with more information is saved as:
# /root/jdk/hs_err_pid18175.log
Compiled method (c1)1795   78   3   
java.util.HashMap::afterNodeInsertion (1 bytes)
 total in heap  [0x00400b234210,0x00400b2345b0] = 928
 relocation [0x00400b234378,0x00400b2343a0] = 40
 constants  [0x00400b2343c0,0x00400b234400] = 64
 main code  [0x00400b234400,0x00400b234500] = 256
 stub code  [0x00400b234500,0x00400b234558] = 88
 metadata   [0x00400b234558,0x00400b234568] = 16
 scopes data[0x00400b234568,0x00400b234578] = 16
 scopes pcs [0x00400b234578,0x00400b2345a8] = 48
 dependencies   [0x00400b2345a8,0x00400b2345b0] = 8
Compiled method (c1)1806   74   3   java.util.HashMap::putVal (300 
bytes)
 total in heap  [0x00400b230210,0x00400b231f20] = 7440
 relocation [0x00400b230378,0x00400b230690] = 792
 constants  [0x00400b2306c0,0x00400b230a00] = 832
 main code  [0x00400b230a00,0x00400b231980] = 3968
 stub code  [0x00400b231980,0x00400b231a68] = 232
 metadata   [0x00400b231a68,0x00400b231ad0] = 104
 scopes data[0x00400b231ad0,0x00400b231ce8] = 536
 scopes pcs [0x00400b231ce8,0x00400b231eb8] = 464
 dependencies   [0x00400b231eb8,0x00400b231ec0] = 8
 nul chk table  [0x00400b231ec0,0x00400b231f20] = 96
Could not load hsdis-s390x.so; library not loadable; PrintAssembly is disabled
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)
root@24d396a17e00:~/jdk#
```

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1920913

Title:
  Openjdk11+ fails to install on s390x

Status in QEMU:
  New

Bug description:
  While installing openjdk11 or higher from repo, it crashes while configuring 
ca-certificates-java.
  Although `java -version` passes, `jar -version` crashes. Detailed logs 
attached to this issue.

  ```
  # A fatal error has been detected by the Java Runtime Environment:
  #
  #  SIGILL (0x4) at pc=0x0040126f9980, pid=8425, tid=8430
  #
  # JRE version: OpenJDK Runtime Environment (11.0.10+9) (build 
11.0.10+9-Ubuntu-0ubuntu1.20.04)
  # Java VM: OpenJDK 64-Bit Server VM (11.0.10+9-Ubuntu-0ubuntu1.20.04, mixed 
mode, tiered, compressed oops, g1 gc, linux-s390x)
  # Problematic frame:
  # J 4 c1 java.lang.StringLatin1.hashCode([B)I java.base@11.0.10 (42 bytes) @ 
0x0040126f9980 [0x0040126f9980+0x]
  #
  # Core dump will be written. Default location: Core dumps may be processed 
with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to //core.8425)
  #
  # An error report file with more information is saved as:
  # //hs_err_pid8425.log
  sed with "/usr/share/apport/apport %p %s %c %d %P %E" (or dumping to 
/root/core.10740)
  #
  # An error report file with more information is saved as:
  # /root/hs_err_pid10740.log
  ```

  Observed this on s390x/ubuntu as well as s390x/alpine when run on amd64 host.
  Please note, on native s390x, the installation is successful. Also this crash 
is not observed while installing openjdk-8-jdk.

  Qemu version: 5.2.0

  Please let me know if any more details are needed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1920913/+subscriptions



Re: [RFC 3/8] virtio: Add API to batch set host notifiers

2021-03-30 Thread Greg Kurz
On Mon, 29 Mar 2021 18:10:57 +0100
Stefan Hajnoczi  wrote:

> On Thu, Mar 25, 2021 at 04:07:30PM +0100, Greg Kurz wrote:
> > Introduce VirtioBusClass methods to begin and commit a transaction
> > of setting/unsetting host notifiers. These handlers will be implemented
> > by virtio-pci to batch addition and deletion of ioeventfds for multiqueue
> > devices like virtio-scsi-pci or virtio-blk-pci.
> > 
> > Convert virtio_bus_set_host_notifiers() to use these handlers. Note that
> > virtio_bus_cleanup_host_notifier() closes eventfds, which could still be
> > passed to the KVM_IOEVENTFD ioctl() when the transaction ends and fail
> > with EBADF. The cleanup of the host notifiers is thus pushed to a
> > separate loop in virtio_bus_unset_and_cleanup_host_notifiers(), after
> > transaction commit.
> > 
> > Signed-off-by: Greg Kurz 
> > ---
> >  include/hw/virtio/virtio-bus.h |  4 
> >  hw/virtio/virtio-bus.c | 34 ++
> >  2 files changed, 38 insertions(+)
> > 
> > diff --git a/include/hw/virtio/virtio-bus.h b/include/hw/virtio/virtio-bus.h
> > index 6d1e4ee3e886..99704b2c090a 100644
> > --- a/include/hw/virtio/virtio-bus.h
> > +++ b/include/hw/virtio/virtio-bus.h
> > @@ -82,6 +82,10 @@ struct VirtioBusClass {
> >   */
> >  int (*ioeventfd_assign)(DeviceState *d, EventNotifier *notifier,
> >  int n, bool assign);
> > +
> > +void (*ioeventfd_assign_begin)(DeviceState *d);
> > +void (*ioeventfd_assign_commit)(DeviceState *d);
> 
> Please add doc comments for these new functions.
> 

Will do.

> > +
> >  /*
> >   * Whether queue number n is enabled.
> >   */
> > diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> > index c9e7cdb5c161..156484c4ca14 100644
> > --- a/hw/virtio/virtio-bus.c
> > +++ b/hw/virtio/virtio-bus.c
> > @@ -295,6 +295,28 @@ int virtio_bus_set_host_notifier(VirtioBusState *bus, 
> > int n, bool assign)
> >  return r;
> >  }
> >  
> > +static void virtio_bus_set_host_notifier_begin(VirtioBusState *bus)
> > +{
> > +VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(bus);
> > +DeviceState *proxy = DEVICE(BUS(bus)->parent);
> > +
> > +if (k->ioeventfd_assign_begin) {
> > +assert(k->ioeventfd_assign_commit);
> > +k->ioeventfd_assign_begin(proxy);
> > +}
> > +}
> > +
> > +static void virtio_bus_set_host_notifier_commit(VirtioBusState *bus)
> > +{
> > +VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(bus);
> > +DeviceState *proxy = DEVICE(BUS(bus)->parent);
> > +
> > +if (k->ioeventfd_assign_commit) {
> > +assert(k->ioeventfd_assign_begin);
> > +k->ioeventfd_assign_commit(proxy);
> > +}
> > +}
> > +
> >  void virtio_bus_cleanup_host_notifier(VirtioBusState *bus, int n)
> >  {
> >  VirtIODevice *vdev = virtio_bus_get_device(bus);
> > @@ -308,6 +330,7 @@ void virtio_bus_cleanup_host_notifier(VirtioBusState 
> > *bus, int n)
> >  event_notifier_cleanup(notifier);
> >  }
> >  
> > +/* virtio_bus_set_host_notifier_begin() must have been called */
> >  static void virtio_bus_unset_and_cleanup_host_notifiers(VirtioBusState 
> > *bus,
> >  int nvqs, int 
> > n_offset)
> >  {
> > @@ -315,6 +338,10 @@ static void 
> > virtio_bus_unset_and_cleanup_host_notifiers(VirtioBusState *bus,
> >  
> >  for (i = 0; i < nvqs; i++) {
> >  virtio_bus_set_host_notifier(bus, i + n_offset, false);
> > +}
> > +/* Let address_space_update_ioeventfds() run before closing ioeventfds 
> > */
> 
> assert(memory_region_transaction_depth == 0)?
> 

Hmm... apart from the fact that memory_region_transaction_depth is
a memory-internal thing that shouldn't be exposed here, it seems to
me that memory_region_transaction_depth can be != 0 when, e.g.,
batching is used... or am I missing something?

I was actually thinking of adding some asserts for that in the
memory_region_*_eventfd_full() functions introduced by patch 1.

if (!transaction) {
memory_region_transaction_begin();
}
assert(memory_region_transaction_depth != 0);

> > +virtio_bus_set_host_notifier_commit(bus);
> > +for (i = 0; i < nvqs; i++) {
> >  virtio_bus_cleanup_host_notifier(bus, i + n_offset);
> >  }
> >  }
> > @@ -327,17 +354,24 @@ int virtio_bus_set_host_notifiers(VirtioBusState 
> > *bus, int nvqs, int n_offset,
> >  int rc;
> >  
> >  if (assign) {
> > +virtio_bus_set_host_notifier_begin(bus);
> > +
> >  for (i = 0; i < nvqs; i++) {
> >  rc = virtio_bus_set_host_notifier(bus, i + n_offset, true);
> >  if (rc != 0) {
> >  warn_report_once("%s: Failed to set host notifier (%s).\n",
> >   vdev->name, strerror(-rc));
> >  
> > +/* This also calls virtio_bus_set_host_notifier_commit() */
> >  virtio_bus_unset_and_cleanup_host_notifiers(bus, i, 
> > n_offset);
> >  

Re: [PATCH v2] docs: Add a QEMU Code of Conduct and Conflict Resolution Policy document

2021-03-30 Thread Paolo Bonzini

On 30/03/21 11:08, Thomas Huth wrote:

  I've picked the Django Code of Conduct as a base, since it sounds rather
  friendly and still welcoming to me, but I'm open to other suggestions, too
  (but we should maybe pick one where the conflict resolution policy is
  separated from the CoC itself, so that it can be better tailored to the
  requirements of the QEMU project)


It turns out that the Django CoC is ultimately based on the Fedora CoC,
so I tried using https://docs.fedoraproject.org/en-US/project/code-of-conduct/
as an inspiration for what can be cut. Here is the outcome:

-
The QEMU community is made up of a mixture of professionals and
volunteers from all over the world. Diversity is one of our strengths,
but it can also lead to communication issues and unhappiness.
To that end, we have a few ground rules that we ask people to adhere to.

* Be welcoming. We are committed to making participation in this project
  a harassment-free experience for everyone, regardless of level of
  experience, gender, gender identity and expression, sexual orientation,
  disability, personal appearance, body size, race, ethnicity, age, religion,
  or nationality.

* Be respectful. Not all of us will agree all the time.  Disagreements, both
  social and technical, happen all the time and the QEMU community is no
  exception. When we disagree, we try to understand why.  It is important that
  we resolve disagreements and differing views constructively.  Members of the
  QEMU community should be respectful when dealing with other contributors as
  well as with people outside the QEMU community and with users of QEMU.

Harassment and other exclusionary behavior are not acceptable. A community
where people feel uncomfortable or threatened is neither welcoming nor
respectful.  Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery

* Personal attacks

* Trolling or insulting/derogatory comments

* Public or private harassment

* Publishing other's private information, such as physical or electronic
addresses, without explicit permission

This isn't an exhaustive list of things that you can't do. Rather, take
it in the spirit in which it's intended—a guide to make it easier to
be excellent to each other.

This code of conduct applies to all spaces managed by the QEMU project.
This includes IRC, the mailing lists, the issue tracker, community
events, and any other forums created by the project team which the
community uses for communication. This code of conduct also applies
outside these spaces, when an individual acts as a representative or a
member of the project or its community.

By adopting this code of conduct, project maintainers commit themselves
to fairly and consistently applying these principles to every aspect of
managing this project.  If you believe someone is violating the code of
conduct, please read the :ref:`conflict-resolution` document for
information about how to proceed.

This document is based on the `Fedora Code of Conduct
`__ and the
`Contributor Covenant version 1.3.0
`__.


As a comparison:
* Contributor Covenant 1.3.0: 308 words
* text above: 386 words
* Fedora Code of Conduct: 429 words
* Contributor Covenant 1.4: 442 words
* Django Code of Conduct: 663 words


Thanks,

Paolo




Re: [PULL for-6.0 0/2] emulated nvme fixes

2021-03-30 Thread Peter Maydell
On Mon, 29 Mar 2021 at 18:04, Klaus Jensen  wrote:
>
> From: Klaus Jensen 
>
> Hi Peter,
>
> The following changes since commit ec2e6e016d24bd429792d08cf607e4c5350dcdaa:
>
>   Merge remote-tracking branch 
> 'remotes/vivier2/tags/linux-user-for-6.0-pull-request' into staging 
> (2021-03-28 19:49:57 +0100)
>
> are available in the Git repository at:
>
>   git://git.infradead.org/qemu-nvme.git tags/nvme-fixes-for-6.0-pull-request
>
> for you to fetch changes up to 3a69cadbef7af23a566dbe2400043c247c3d50ca:
>
>   hw/block/nvme: fix ref counting in nvme_format_ns (2021-03-29 18:46:57 
> +0200)
>
> 
> emulated nvme fixes
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/6.0
for any user-visible changes.

-- PMM



[PULL 1/9] iotests: Fix typo in iotest 051

2021-03-30 Thread Max Reitz
From: Tao Xu 

There is a typo in iotest 051; correct it.

Signed-off-by: Tao Xu 
Message-Id: <20210324084321.90952-1-tao3...@intel.com>
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/051| 2 +-
 tests/qemu-iotests/051.pc.out | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index 333cc81818..7bf29343d7 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -199,7 +199,7 @@ case "$QEMU_DEFAULT_MACHINE" in
 # virtio-blk enables the iothread only when the driver initialises the
 # device, so a second virtio-blk device can't be added even with the
 # same iothread. virtio-scsi allows this.
-run_qemu $iothread -device 
virtio-blk-pci,drive=disk,iohtread=iothread0,share-rw=on
+run_qemu $iothread -device 
virtio-blk-pci,drive=disk,iothread=iothread0,share-rw=on
 run_qemu $iothread -device 
virtio-scsi,id=virtio-scsi1,iothread=thread0 -device 
scsi-hd,bus=virtio-scsi1.0,drive=disk,share-rw=on
 ;;
  *)
diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
index e95bd42b8d..afe7632964 100644
--- a/tests/qemu-iotests/051.pc.out
+++ b/tests/qemu-iotests/051.pc.out
@@ -183,9 +183,9 @@ Testing: -drive 
file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) QEMU_PROG: -device scsi-hd,bus=virtio-scsi1.0,drive=disk,share-rw=on: 
Cannot change iothread of active block backend
 
-Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object 
iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 
-device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device 
virtio-blk-pci,drive=disk,iohtread=iothread0,share-rw=on
+Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object 
iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 
-device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device 
virtio-blk-pci,drive=disk,iothread=iothread0,share-rw=on
 QEMU X.Y.Z monitor - type 'help' for more information
-(qemu) QEMU_PROG: -device 
virtio-blk-pci,drive=disk,iohtread=iothread0,share-rw=on: Cannot change 
iothread of active block backend
+(qemu) QEMU_PROG: -device 
virtio-blk-pci,drive=disk,iothread=iothread0,share-rw=on: Cannot change 
iothread of active block backend
 
 Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object 
iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 
-device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device 
virtio-scsi,id=virtio-scsi1,iothread=thread0 -device 
scsi-hd,bus=virtio-scsi1.0,drive=disk,share-rw=on
 QEMU X.Y.Z monitor - type 'help' for more information
-- 
2.29.2




[PULL 6/9] block/mirror: Fix mirror_top's permissions

2021-03-30 Thread Max Reitz
mirror_top currently shares all permissions, and takes only the WRITE
permission (if some parent has taken that permission, too).

That is wrong, though; mirror_top is a filter, so it should take
permissions like any other filter does.  For example, if the parent
needs CONSISTENT_READ, we need to take that, too, and if it cannot share
the WRITE permission, we cannot share it either.

The exception is when mirror_top is used for active commit, where we
cannot take CONSISTENT_READ (because it is deliberately unshared above
the base node) and where we must share WRITE (so that it is shared for
all images in the backing chain, so the mirror job can take it for the
target BB).

Signed-off-by: Max Reitz 
Message-Id: <20210211172242.146671-2-mre...@redhat.com>
Reviewed-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/mirror.c | 32 +---
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 6af02a57c4..d7e54c0ff7 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -89,6 +89,7 @@ typedef struct MirrorBlockJob {
 typedef struct MirrorBDSOpaque {
 MirrorBlockJob *job;
 bool stop;
+bool is_commit;
 } MirrorBDSOpaque;
 
 struct MirrorOp {
@@ -1522,13 +1523,27 @@ static void bdrv_mirror_top_child_perm(BlockDriverState 
*bs, BdrvChild *c,
 return;
 }
 
-/* Must be able to forward guest writes to the real image */
-*nperm = 0;
-if (perm & BLK_PERM_WRITE) {
-*nperm |= BLK_PERM_WRITE;
-}
+bdrv_default_perms(bs, c, role, reopen_queue,
+   perm, shared, nperm, nshared);
 
-*nshared = BLK_PERM_ALL;
+if (s->is_commit) {
+/*
+ * For commit jobs, we cannot take CONSISTENT_READ, because
+ * that permission is unshared for everything above the base
+ * node (except for filters on the base node).
+ * We also have to force-share the WRITE permission, or
+ * otherwise we would block ourselves at the base node (if
+ * writes are blocked for a node, they are also blocked for
+ * its backing file).
+ * (We could also share RESIZE, because it may be needed for
+ * the target if its size is less than the top node's; but
+ * bdrv_default_perms_for_cow() automatically shares RESIZE
+ * for backing nodes if WRITE is shared, so there is no need
+ * to do it here.)
+ */
+*nperm &= ~BLK_PERM_CONSISTENT_READ;
+*nshared |= BLK_PERM_WRITE;
+}
 }
 
 /* Dummy node that provides consistent read to its users without requiring it
@@ -1591,6 +1606,8 @@ static BlockJob *mirror_start_job(
 return NULL;
 }
 
+target_is_backing = bdrv_chain_contains(bs, target);
+
 /* In the case of active commit, add dummy driver to provide consistent
  * reads on the top, while disabling it in the intermediate nodes, and make
  * the backing chain writable. */
@@ -1613,6 +1630,8 @@ static BlockJob *mirror_start_job(
 bs_opaque = g_new0(MirrorBDSOpaque, 1);
 mirror_top_bs->opaque = bs_opaque;
 
+bs_opaque->is_commit = target_is_backing;
+
 /* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
  * it alive until block_job_create() succeeds even if bs has no parent. */
 bdrv_ref(mirror_top_bs);
@@ -1653,7 +1672,6 @@ static BlockJob *mirror_start_job(
 target_perms = BLK_PERM_WRITE;
 target_shared_perms = BLK_PERM_WRITE_UNCHANGED;
 
-target_is_backing = bdrv_chain_contains(bs, target);
 if (target_is_backing) {
 int64_t bs_size, target_size;
 bs_size = bdrv_getlength(bs);
-- 
2.29.2




[PULL 4/9] qcow2: use external virtual timers

2021-03-30 Thread Max Reitz
From: Pavel Dovgalyuk 

Regular virtual timers are used to emulate timings
related to vCPU and peripheral states. QCOW2 uses timers
to clean the cache. These timers should have the external
flag; otherwise they affect the execution, which then cannot
be recorded and replayed.
This patch adds the external flag to the timer for qcow2
cache cleaning.

Signed-off-by: Pavel Dovgalyuk 
Reviewed-by: Paolo Bonzini 
Message-Id: <161700516327.1141158.8366564693714562536.stgit@pasha-ThinkPad-X280>
Signed-off-by: Max Reitz 
---
 block/qcow2.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 0db1227ac9..2fb43c6f7e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -840,9 +840,10 @@ static void cache_clean_timer_init(BlockDriverState *bs, 
AioContext *context)
 {
 BDRVQcow2State *s = bs->opaque;
 if (s->cache_clean_interval > 0) {
-s->cache_clean_timer = aio_timer_new(context, QEMU_CLOCK_VIRTUAL,
- SCALE_MS, cache_clean_timer_cb,
- bs);
+s->cache_clean_timer =
+aio_timer_new_with_attrs(context, QEMU_CLOCK_VIRTUAL,
+ SCALE_MS, QEMU_TIMER_ATTR_EXTERNAL,
+ cache_clean_timer_cb, bs);
 timer_mod(s->cache_clean_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
   (int64_t) s->cache_clean_interval * 1000);
 }
-- 
2.29.2




Re: Serious doubts about Gitlab CI

2021-03-30 Thread Daniel P . Berrangé
On Tue, Mar 30, 2021 at 02:45:43PM +0200, Paolo Bonzini wrote:
> On 30/03/21 14:23, Philippe Mathieu-Daudé wrote:
> > On 3/30/21 2:09 PM, Paolo Bonzini wrote:
> > > On 30/03/21 13:55, Thomas Huth wrote:
> > > > 
> > > > Also I wonder whether we could maybe even get rid of the capstone and
> > > > slirp submodules in QEMU now
> > > 
> > > At least for slirp, we probably want to stay more on the bleeding edge
> > > which implies having to keep the submodule.
> > 
> > FYI QEMU libSLiRP submodule doesn't point to bleeding edge branch but to
> > the stable branch (which should be what distributions package).
> 
> Now, but that may change already in 6.1 in order to add CFI support.

We can bundle a newer version, but we don't need to require a newer
version. Simply compile conditionally for the bits we need. If the distro
slirp is too old, then sorry, you can't enable CFI + slirp at the
same time. If the distro really wants that combination, we don't have
to own the solution - the distro should update its slirp.

Or to put it another way, QEMU doesn't need to go out of its way to
enable new features on old distros. We merely need to not regress
on the features we previously offered.  We bundled slirp as a submodule
so that old distros didn't lose slirp entirely. We don't need to
offer CFI on those distros.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [RFC 3/8] virtio: Add API to batch set host notifiers

2021-03-30 Thread Stefan Hajnoczi
On Tue, Mar 30, 2021 at 12:17:40PM +0200, Greg Kurz wrote:
> On Mon, 29 Mar 2021 18:10:57 +0100
> Stefan Hajnoczi  wrote:
> > On Thu, Mar 25, 2021 at 04:07:30PM +0100, Greg Kurz wrote:
> > > @@ -315,6 +338,10 @@ static void 
> > > virtio_bus_unset_and_cleanup_host_notifiers(VirtioBusState *bus,
> > >  
> > >  for (i = 0; i < nvqs; i++) {
> > >  virtio_bus_set_host_notifier(bus, i + n_offset, false);
> > > +}
> > > +/* Let address_space_update_ioeventfds() run before closing 
> > > ioeventfds */
> > 
> > assert(memory_region_transaction_depth == 0)?
> > 
> 
> Hmm... apart from the fact that memory_region_transaction_depth is
> a memory-internal thing that shouldn't be exposed here, it seems to
> me that memory_region_transaction_depth can be != 0 when, e.g.,
> batching is used... or am I missing something?
> 
> I was actually thinking of adding some asserts for that in the
> memory_region_*_eventfd_full() functions introduced by patch 1.
> 
> if (!transaction) {
> memory_region_transaction_begin();
> }
> assert(memory_region_transaction_depth != 0);

In that case is it safe to call virtio_bus_cleanup_host_notifier()
below? I thought it depends on the transaction committing first.

> 
> > > +virtio_bus_set_host_notifier_commit(bus);
> > > +for (i = 0; i < nvqs; i++) {
> > >  virtio_bus_cleanup_host_notifier(bus, i + n_offset);
> > >  }
> > >  }




[PATCH v3 04/12] target/hexagon: make slot number an unsigned

2021-03-30 Thread Alessandro Di Federico via
From: Paolo Montesel 

Signed-off-by: Alessandro Di Federico 
Signed-off-by: Paolo Montesel 
---
 target/hexagon/genptr.c | 6 --
 target/hexagon/macros.h | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 7481f4c1dd..fd18aabe8d 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -33,7 +33,8 @@ static inline TCGv gen_read_preg(TCGv pred, uint8_t num)
 return pred;
 }
 
-static inline void gen_log_predicated_reg_write(int rnum, TCGv val, int slot)
+static inline void gen_log_predicated_reg_write(int rnum, TCGv val,
+unsigned slot)
 {
 TCGv one = tcg_const_tl(1);
 TCGv zero = tcg_const_tl(0);
@@ -62,7 +63,8 @@ static inline void gen_log_reg_write(int rnum, TCGv val)
 #endif
 }
 
-static void gen_log_predicated_reg_write_pair(int rnum, TCGv_i64 val, int slot)
+static void gen_log_predicated_reg_write_pair(int rnum, TCGv_i64 val,
+  unsigned slot)
 {
 TCGv val32 = tcg_temp_new();
 TCGv one = tcg_const_tl(1);
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index cfcb8173ba..d9473c8823 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -154,7 +154,7 @@
 #define LOAD_CANCEL(EA) do { CANCEL; } while (0)
 
 #ifdef QEMU_GENERATE
-static inline void gen_pred_cancel(TCGv pred, int slot_num)
+static inline void gen_pred_cancel(TCGv pred, unsigned slot_num)
  {
 TCGv slot_mask = tcg_const_tl(1 << slot_num);
 TCGv tmp = tcg_temp_new();
-- 
2.31.1




[PATCH v3 09/12] target/hexagon: import lexer for idef-parser

2021-03-30 Thread Alessandro Di Federico via
From: Paolo Montesel 

Signed-off-by: Alessandro Di Federico 
Signed-off-by: Paolo Montesel 
---
 target/hexagon/idef-parser/idef-parser.h  | 240 +++
 target/hexagon/idef-parser/idef-parser.lex| 611 ++
 target/hexagon/meson.build|   4 +
 tests/docker/dockerfiles/alpine.docker|   1 +
 tests/docker/dockerfiles/centos7.docker   |   1 +
 tests/docker/dockerfiles/centos8.docker   |   1 +
 tests/docker/dockerfiles/debian10.docker  |   1 +
 .../dockerfiles/fedora-i386-cross.docker  |   1 +
 .../dockerfiles/fedora-win32-cross.docker |   1 +
 .../dockerfiles/fedora-win64-cross.docker |   1 +
 tests/docker/dockerfiles/fedora.docker|   1 +
 tests/docker/dockerfiles/opensuse-leap.docker |   1 +
 tests/docker/dockerfiles/ubuntu.docker|   1 +
 tests/docker/dockerfiles/ubuntu1804.docker|   1 +
 tests/docker/dockerfiles/ubuntu2004.docker|   3 +-
 15 files changed, 868 insertions(+), 1 deletion(-)
 create mode 100644 target/hexagon/idef-parser/idef-parser.h
 create mode 100644 target/hexagon/idef-parser/idef-parser.lex

diff --git a/target/hexagon/idef-parser/idef-parser.h 
b/target/hexagon/idef-parser/idef-parser.h
new file mode 100644
index 00..ecfa0174e2
--- /dev/null
+++ b/target/hexagon/idef-parser/idef-parser.h
@@ -0,0 +1,240 @@
+/*
+ * Copyright(c) 2019-2021 rev.ng Srls. All Rights Reserved.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#ifndef IDEF_PARSER_H
+#define IDEF_PARSER_H
+
+#include 
+#include 
+#include 
+#include 
+
+#define TCGV_NAME_SIZE 7
+#define MAX_WRITTEN_REGS 32
+#define OFFSET_STR_LEN 32
+#define ALLOC_LIST_LEN 32
+#define ALLOC_NAME_SIZE 32
+#define INIT_LIST_LEN 32
+#define OUT_BUF_LEN (1024 * 1024)
+#define SIGNATURE_BUF_LEN (128 * 1024)
+#define HEADER_BUF_LEN (128 * 1024)
+
+/* Variadic macros to wrap the buffer printing functions */
+#define EMIT(c, ...) \
+do { \
+g_string_append_printf((c)->out_str, __VA_ARGS__);   \
+} while (0)
+
+#define EMIT_SIG(c, ...) \
+do { \
+g_string_append_printf((c)->signature_str, __VA_ARGS__);   \
+} while (0)
+
+#define EMIT_HEAD(c, ...) \
+do { \
+g_string_append_printf((c)->header_str, __VA_ARGS__);  \
+} while (0)
+
+/**
+ * Type of register, assigned to the HexReg.type field
+ */
+typedef enum {GENERAL_PURPOSE, CONTROL, MODIFIER, DOTNEW} RegType;
+
+/**
+ * Types of control registers, assigned to the HexReg.id field
+ */
+typedef enum {SP, FP, LR, GP, LC0, LC1, SA0, SA1} CregType;
+
+/**
+ * Identifier string of the control registers, indexed by the CregType enum
+ */
+extern const char *creg_str[];
+
+/**
+ * Semantic record of the REG tokens, identifying registers
+ */
+typedef struct HexReg {
+CregType id;/**< Identifier of the register  */
+RegType type;   /**< Type of the register*/
+unsigned bit_width; /**< Bit width of the reg, 32 or 64 bits */
+} HexReg;
+
+/**
+ * Data structure, identifying a TCGv temporary value
+ */
+typedef struct HexTmp {
+int index;  /**< Index of the TCGv temporary value*/
+} HexTmp;
+
+/**
+ * Enum of the possible immediate types; an immediate is a value which is
+ * known at tinycode generation time, e.g. an integer value, not a TCGv
+ */
+enum ImmUnionTag {I, VARIABLE, VALUE, QEMU_TMP, IMM_PC, IMM_CONSTEXT};
+
+/**
+ * Semantic record of the IMM token, identifying an immediate constant
+ */
+typedef struct HexImm {
+union {
+char id;/**< Identifier of the immediate */
+uint64_t value; /**< Immediate value (for VALUE type immediates) */
+uint64_t index; /**< Index of the immediate (for int temp vars)  */
+};
+enum ImmUnionTag type;  /**< Type of the immediate  */
+} HexImm;
+
+/**
+ * Semantic record of the PRE token, identifying a predicate
+ */
+typedef 

[PATCH v3 10/12] target/hexagon: import parser for idef-parser

2021-03-30 Thread Alessandro Di Federico via
From: Paolo Montesel 

Signed-off-by: Alessandro Di Federico 
Signed-off-by: Paolo Montesel 
---
 target/hexagon/idef-parser/idef-parser.y  |  940 +++
 target/hexagon/idef-parser/parser-helpers.c   | 2230 +
 target/hexagon/idef-parser/parser-helpers.h   |  344 +++
 target/hexagon/meson.build|   26 +-
 tests/docker/dockerfiles/alpine.docker|1 +
 tests/docker/dockerfiles/centos7.docker   |1 +
 tests/docker/dockerfiles/centos8.docker   |1 +
 tests/docker/dockerfiles/debian10.docker  |2 +
 .../dockerfiles/fedora-i386-cross.docker  |2 +
 .../dockerfiles/fedora-win32-cross.docker |2 +
 .../dockerfiles/fedora-win64-cross.docker |2 +
 tests/docker/dockerfiles/fedora.docker|1 +
 tests/docker/dockerfiles/opensuse-leap.docker |1 +
 tests/docker/dockerfiles/ubuntu.docker|2 +
 tests/docker/dockerfiles/ubuntu1804.docker|2 +
 tests/docker/dockerfiles/ubuntu2004.docker|4 +-
 16 files changed, 3559 insertions(+), 2 deletions(-)
 create mode 100644 target/hexagon/idef-parser/idef-parser.y
 create mode 100644 target/hexagon/idef-parser/parser-helpers.c
 create mode 100644 target/hexagon/idef-parser/parser-helpers.h

diff --git a/target/hexagon/idef-parser/idef-parser.y b/target/hexagon/idef-parser/idef-parser.y
new file mode 100644
index 00..5d96c9262a
--- /dev/null
+++ b/target/hexagon/idef-parser/idef-parser.y
@@ -0,0 +1,940 @@
+%{
+/*
+ * Copyright(c) 2019-2021 rev.ng Srls. All Rights Reserved.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "idef-parser.h"
+#include "parser-helpers.h"
+#include "idef-parser.tab.h"
+#include "idef-parser.yy.h"
+
+/* Uncomment this to disable yyasserts */
+/* #define NDEBUG */
+
+#define ERR_LINE_CONTEXT 40
+
+%}
+
+%lex-param {void *scanner}
+%parse-param {void *scanner}
+%parse-param {Context *c}
+
+%define parse.error verbose
+%define parse.lac full
+%define api.pure full
+
+%locations
+
+%union {
+GString *string;
+HexValue rvalue;
+HexSat sat;
+HexCast cast;
+HexExtract extract;
+HexMpy mpy;
+bool is_unsigned;
+int index;
+}
+
+/* Tokens */
+%start input
+
+%expect 1
+
+%token INAME DREG DIMM DPRE DEA RREG WREG FREG FIMM RPRE WPRE FPRE FWRAP FEA VAR
+%token POW ABS CROUND ROUND CIRCADD COUNTONES INC DEC ANDA ORA XORA PLUSPLUS ASL
+%token ASR LSR EQ NEQ LTE GTE MIN MAX ANDL ORL FOR ICIRC IF MUN FSCR FCHK SXT
+%token ZXT CONSTEXT LOCNT BREV SIGN LOAD STORE CONSTLL CONSTULL PC NPC LPCFG
+%token CANCEL IDENTITY PART1 BREV_4 BREV_8 ROTL INSBITS SETBITS EXTBITS EXTRANGE
+%token CAST4_8U SETOVF FAIL DEINTERLEAVE INTERLEAVE CARRY_FROM_ADD
+
+%token  REG IMM PRE
+%token  ELSE
+%token  MPY
+%token  SAT
+%token  CAST DEPOSIT SETHALF
+%token  EXTRACT
+%type  INAME
+%type  rvalue lvalue VAR assign_statement var
+%type  DREG DIMM DPRE RREG RPRE FAIL
+%type  if_stmt IF ':' '?'
+%type  SIGN
+
+/* Operator Precedences */
+%left MIN MAX
+%left '('
+%left ','
+%left '='
+%right CIRCADD
+%right INC DEC ANDA ORA XORA
+%left '?' ':'
+%left ORL
+%left ANDL
+%left '|'
+%left '^' ANDOR
+%left '&'
+%left EQ NEQ
+%left '<' '>' LTE GTE
+%left ASL ASR LSR
+%right ABS
+%left '-' '+'
+%left POW
+%left '*' '/' '%' MPY
+%right '~' '!'
+%left '['
+%right CAST
+%right LOCNT BREV
+
+/* Bison Grammar */
+%%
+
+/* Input file containing the description of each hexagon instruction */
+input : instructions
+  {
+  YYACCEPT;
+  }
+  ;
+
+instructions : instruction instructions
+ | %empty
+ ;
+
+instruction : INAME
+  {
+  gen_inst(c, $1);
+  }
+  arguments
+  {
+  gen_inst_args(c, &@1);
+  }
+  code
+  {
+  gen_inst_code(c, &@1);
+  }
+| error /* Recover gracefully after instruction compilation error */
+  {
+  free_instruction(c);
+  }
+;
+
+arguments : '(' ')'
+  | '(' argument_list ')';
+
+argument_list : decl ',' argument_list
+  | decl
+  ;
+
+var : VAR
+  {
+  track_string(c, $1.var.name);
+  $$ = $1;
+  }
+;
+
+/* Return the modified registers list */
+code : '{' statements '}'
+  

Re: [PATCH v6 3/6] coroutine-lock: Store the coroutine in the CoWaitRecord only once

2021-03-30 Thread Stefan Hajnoczi
On Thu, Mar 25, 2021 at 12:29:38PM +0100, Paolo Bonzini wrote:
> From: David Edmondson 
> 
> When taking the slow path for mutex acquisition, set the coroutine
> value in the CoWaitRecord in push_waiter(), rather than both there and
> in the caller.
> 
> Reviewed-by: Paolo Bonzini 
> Reviewed-by: Philippe Mathieu-Daudé 
> Signed-off-by: David Edmondson 
> Message-Id: <20210309144015.557477-4-david.edmond...@oracle.com>
> Signed-off-by: Paolo Bonzini 
> ---
>  util/qemu-coroutine-lock.c | 1 -
>  1 file changed, 1 deletion(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [PATCH v3 00/12] target/hexagon: introduce idef-parser

2021-03-30 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20210330143750.3037824-1-ale.q...@rev.ng/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210330143750.3037824-1-ale.q...@rev.ng
Subject: [PATCH v3 00/12] target/hexagon: introduce idef-parser

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
   7993b0f..4a0ba67  master -> master
 - [tag update]  patchew/20210325112941.365238-1-pbonz...@redhat.com -> 
patchew/20210325112941.365238-1-pbonz...@redhat.com
 - [tag update]  patchew/20210329110303.15235-1-alex.ben...@linaro.org -> 
patchew/20210329110303.15235-1-alex.ben...@linaro.org
 * [new tag] patchew/20210330143750.3037824-1-ale.q...@rev.ng -> 
patchew/20210330143750.3037824-1-ale.q...@rev.ng
Switched to a new branch 'test'
0f8591b target/hexagon: import additional tests
937c6cf target/hexagon: call idef-parser functions
a86f74c target/hexagon: import parser for idef-parser
d1c0eea target/hexagon: import lexer for idef-parser
83ab86b target/hexagon: prepare input for the idef-parser
83b3488 target/hexagon: expose next PC in DisasContext
6ec869c target/hexagon: introduce new helper functions
c9b6a53 target/hexagon: make helper functions non-static
4fc7d90 target/hexagon: make slot number an unsigned
f62b717 target/hexagon: import README for idef-parser
d75567e target/hexagon: update MAINTAINERS for idef-parser
131fca8 tcg: expose TCGCond manipulation routines

=== OUTPUT BEGIN ===
1/12 Checking commit 131fca89f291 (tcg: expose TCGCond manipulation routines)
Use of uninitialized value $acpi_testexpected in string eq at 
./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#19: 
new file mode 100644

total: 0 errors, 1 warnings, 183 lines checked

Patch 1/12 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
2/12 Checking commit d75567e8a276 (target/hexagon: update MAINTAINERS for 
idef-parser)
3/12 Checking commit f62b717c38c9 (target/hexagon: import README for 
idef-parser)
Use of uninitialized value $acpi_testexpected in string eq at 
./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#38: 
new file mode 100644

total: 0 errors, 1 warnings, 464 lines checked

Patch 3/12 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
4/12 Checking commit 4fc7d902ffc6 (target/hexagon: make slot number an unsigned)
5/12 Checking commit c9b6a53db620 (target/hexagon: make helper functions 
non-static)
6/12 Checking commit 6ec869cf1437 (target/hexagon: introduce new helper 
functions)
7/12 Checking commit 83b3488172da (target/hexagon: expose next PC in 
DisasContext)
8/12 Checking commit 83ab86b500b4 (target/hexagon: prepare input for the 
idef-parser)
Use of uninitialized value $acpi_testexpected in string eq at 
./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#20: 
new file mode 100644

total: 0 errors, 1 warnings, 316 lines checked

Patch 8/12 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
9/12 Checking commit d1c0eea89a02 (target/hexagon: import lexer for idef-parser)
Use of uninitialized value $acpi_testexpected in string eq at 
./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#29: 
new file mode 100644

total: 0 errors, 1 warnings, 946 lines checked

Patch 9/12 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
10/12 Checking commit a86f74c8704d (target/hexagon: import parser for 
idef-parser)
Use of uninitialized value $acpi_testexpected in string eq at 
./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#30: 
new file mode 100644

ERROR: suspicious ; after while (0)
#3314: FILE: target/hexagon/idef-parser/parser-helpers.h:98:
+} while (0);

total: 1 errors, 1 warnings, 3681 lines checked

Patch 10/12 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

11/12 Checking commit 937c6cfff6d1 (target/hexagon: call idef-parser functions)
12/12 Checking commit 0f8591bf9641 (target/hexagon: import additional tests)
Use of uninitialized value $acpi_testexpected in string eq at 

Re: [PATCH] i386: Make 'hv-reenlightenment' require explicit 'tsc-frequency' setting

2021-03-30 Thread Dr. David Alan Gilbert
* Vitaly Kuznetsov (vkuzn...@redhat.com) wrote:
> Commit 561dbb41b1d7 "i386: Make migration fail when Hyper-V reenlightenment
> was enabled but 'user_tsc_khz' is unset" forbade migrations when the guest
> has opted for reenlightenment notifications but 'tsc-frequency' wasn't set
> explicitly on the command line. This works, but the migration fails late and
> this may come as an unpleasant surprise. To make things more explicit,
> require 'tsc-frequency=' on the command line when 'hv-reenlightenment' is
> enabled. Make the change affect 6.0+ machine types only, to keep
> previously-valid configurations valid.
> 
> Signed-off-by: Vitaly Kuznetsov 

That looks better for me from a migration point of view:


Acked-by: Dr. David Alan Gilbert 

> ---
>  docs/hyperv.txt   |  1 +
>  hw/i386/pc.c  |  1 +
>  target/i386/cpu.c | 23 +--
>  target/i386/cpu.h |  1 +
>  4 files changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> index e53c581f4586..5b02d341ab25 100644
> --- a/docs/hyperv.txt
> +++ b/docs/hyperv.txt
> @@ -165,6 +165,7 @@ emulate TSC accesses after migration so 'tsc-frequency=' 
> CPU option also has to
>  be specified to make migration succeed. The destination host has to either 
> have
>  the same TSC frequency or support TSC scaling CPU feature.
>  
> +Requires: tsc-frequency
>  Recommended: hv-frequencies
>  
>  3.16. hv-evmcs
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 8a84b25a031e..47b79e949ad7 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -98,6 +98,7 @@
>  
>  GlobalProperty pc_compat_5_2[] = {
>  { "ICH9-LPC", "x-smi-cpu-hotunplug", "off" },
> +{ TYPE_X86_CPU, "x-hv-reenlightenment-requires-tscfreq", "off"},
>  };
>  const size_t pc_compat_5_2_len = G_N_ELEMENTS(pc_compat_5_2);
>  
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 6b3e9467f177..751636bafac5 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6647,10 +6647,23 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
> verbose)
>  }
>  }
>  
> -static void x86_cpu_hyperv_realize(X86CPU *cpu)
> +static void x86_cpu_hyperv_realize(X86CPU *cpu, Error **errp)
>  {
> +CPUX86State *env = &cpu->env;
>  size_t len;
>  
> +/*
> + * Reenlightenment requires explicit 'tsc-frequency' setting for 
> successful
> + * migration (see hyperv_reenlightenment_post_load()). As 'hv-passthrough'
> + * mode is not migratable, we can loosen the restriction.
> + */
> +if (hyperv_feat_enabled(cpu, HYPERV_FEAT_REENLIGHTENMENT) &&
> +!cpu->hyperv_passthrough && !env->user_tsc_khz &&
> +cpu->hyperv_reenlightenment_requires_tscfreq) {
> +error_setg(errp, "'hv-reenlightenment' requires 'tsc-frequency=' to be set");
> +return;
> +}
> +
>  /* Hyper-V vendor id */
>  if (!cpu->hyperv_vendor) {
>  memcpy(cpu->hyperv_vendor_id, "Microsoft Hv", 12);
> @@ -6846,7 +6859,11 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>  }
>  
>  /* Process Hyper-V enlightenments */
> -x86_cpu_hyperv_realize(cpu);
> +x86_cpu_hyperv_realize(cpu, &local_err);
> +if (local_err != NULL) {
> +error_propagate(errp, local_err);
> +return;
> +}
>  
>  cpu_exec_realizefn(cs, &local_err);
>  if (local_err != NULL) {
> @@ -7374,6 +7391,8 @@ static Property x86_cpu_properties[] = {
>  DEFINE_PROP_INT32("x-hv-max-vps", X86CPU, hv_max_vps, -1),
>  DEFINE_PROP_BOOL("x-hv-synic-kvm-only", X86CPU, hyperv_synic_kvm_only,
>   false),
> +DEFINE_PROP_BOOL("x-hv-reenlightenment-requires-tscfreq", X86CPU,
> + hyperv_reenlightenment_requires_tscfreq, true),
>  DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
>   true),
>  DEFINE_PROP_END_OF_LIST()
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 570f916878f9..0196a300f018 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1677,6 +1677,7 @@ struct X86CPU {
>  uint32_t hyperv_spinlock_attempts;
>  char *hyperv_vendor;
>  bool hyperv_synic_kvm_only;
> +bool hyperv_reenlightenment_requires_tscfreq;
>  uint64_t hyperv_features;
>  bool hyperv_passthrough;
>  OnOffAuto hyperv_no_nonarch_cs;
> -- 
> 2.30.2
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




[PATCH v6 2/5] virtiofsd: Add capability to change/restore umask

2021-03-30 Thread Vivek Goyal
When a parent directory has a default ACL and a file is created in that
directory, the umask is ignored and the final file permissions are
determined using the default ACL instead (man 2 umask).

Currently, fuse applies the umask and sends the modified mode in the create
request accordingly. A fuse server can set FUSE_DONT_MASK to tell the
fuse client not to apply the umask, and the fuse server will take care of
it as needed.

With posix acls enabled, the requirement is that the umask should determine
the final file mode if the parent directory does not have a default ACL.

So if posix acls are enabled, opt in for FUSE_DONT_MASK. virtiofsd
will set the umask of the thread doing file creation, and the host kernel
will use that umask if the parent directory does not have default
acls; otherwise the umask does not take effect.

Miklos mentioned that we already call unshare(CLONE_FS) for
every thread. That means the umask has become a per-thread property
and it should be ok to manipulate it in the file creation path.

This patch only adds the capability to change the umask and restore it. It
does not enable it yet. The next patch will add the capability to enable it
based on whether the user enabled posix_acl or not.

This should fix fstest generic/099.

Reported-by: Luis Henriques 
Signed-off-by: Vivek Goyal 
Reviewed-by: Stefan Hajnoczi 
---
 tools/virtiofsd/passthrough_ll.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index b144320e48..e6ae3d38d7 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -122,6 +122,7 @@ struct lo_inode {
 struct lo_cred {
 uid_t euid;
 gid_t egid;
+mode_t umask;
 };
 
 enum {
@@ -172,6 +173,8 @@ struct lo_data {
 /* An O_PATH file descriptor to /proc/self/fd/ */
 int proc_self_fd;
 int user_killpriv_v2, killpriv_v2;
+/* If set, virtiofsd is responsible for setting umask during creation */
+bool change_umask;
 };
 
 static const struct fuse_opt lo_opts[] = {
@@ -1134,7 +1137,8 @@ static void lo_lookup(fuse_req_t req, fuse_ino_t parent, 
const char *name)
  * ownership of caller.
  * TODO: What about selinux context?
  */
-static int lo_change_cred(fuse_req_t req, struct lo_cred *old)
+static int lo_change_cred(fuse_req_t req, struct lo_cred *old,
+  bool change_umask)
 {
 int res;
 
@@ -1154,11 +1158,14 @@ static int lo_change_cred(fuse_req_t req, struct 
lo_cred *old)
 return errno_save;
 }
 
+if (change_umask) {
+old->umask = umask(req->ctx.umask);
+}
 return 0;
 }
 
 /* Regain Privileges */
-static void lo_restore_cred(struct lo_cred *old)
+static void lo_restore_cred(struct lo_cred *old, bool restore_umask)
 {
 int res;
 
@@ -1173,6 +1180,9 @@ static void lo_restore_cred(struct lo_cred *old)
 fuse_log(FUSE_LOG_ERR, "setegid(%u): %m\n", old->egid);
 exit(1);
 }
+
+if (restore_umask) {
+umask(old->umask);
+}
 }
 
 static void lo_mknod_symlink(fuse_req_t req, fuse_ino_t parent,
@@ -1202,7 +1212,7 @@ static void lo_mknod_symlink(fuse_req_t req, fuse_ino_t 
parent,
 return;
 }
 
-saverr = lo_change_cred(req, &old);
+saverr = lo_change_cred(req, &old, lo->change_umask && !S_ISLNK(mode));
 if (saverr) {
 goto out;
 }
@@ -1211,7 +1221,7 @@ static void lo_mknod_symlink(fuse_req_t req, fuse_ino_t 
parent,
 
 saverr = errno;
 
-lo_restore_cred(&old);
+lo_restore_cred(&old, lo->change_umask && !S_ISLNK(mode));
 
 if (res == -1) {
 goto out;
@@ -1918,7 +1928,7 @@ static void lo_create(fuse_req_t req, fuse_ino_t parent, 
const char *name,
 return;
 }
 
-err = lo_change_cred(req, &old);
+err = lo_change_cred(req, &old, lo->change_umask);
 if (err) {
 goto out;
 }
@@ -1929,7 +1939,7 @@ static void lo_create(fuse_req_t req, fuse_ino_t parent, 
const char *name,
 fd = openat(parent_inode->fd, name, fi->flags | O_CREAT | O_EXCL, mode);
 err = fd == -1 ? errno : 0;
 
-lo_restore_cred();
+lo_restore_cred(, lo->change_umask);
 
 /* Ignore the error if file exists and O_EXCL was not given */
 if (err && (err != EEXIST || (fi->flags & O_EXCL))) {
-- 
2.25.4



