Re: [Qemu-devel] kvm / virsh snapshot management

2019-06-10 Thread Gary Dale

On 2019-06-10 8:10 p.m., Eric Blake wrote:

On 6/10/19 6:00 PM, Gary Dale wrote:


Any ideas on what I'm doing wrong?

Do you know for sure whether you have internal or external snapshots?
And at this point, your questions are starting to wander more into
libvirt territory.


Yes. I'm using internal snapshots. From your other e-mail, I gather that
the (only) benefit to blockcommit with internal snapshots would be to
reduce the size of the various tables recording changed blocks. Without
a blockcommit, the L1 tables get progressively larger over time since
they record all changes to the base file. Eventually the snapshots could
become larger than the base image if I don't do a blockcommit.

Not quite. Blockcommit requires external images. It says to take this
image chain:

base <- active

and change it into this shorter chain:

base

by moving the cluster from active into base.  There is no such thing as
blockcommit on internal snapshots, because you don't have any backing
file to push into.

With internal snapshots, the longer an L1 table is active, the more
clusters you have to change compared to what was the case before the
snapshot was created - every time you change an existing cluster, the
refcount on the old cluster decreases and the change gets written into a
new cluster with refcount 1.  Yes, you can reach the point where there
are more clusters with refcount 1 associated with your current L1 table
than there are clusters with refcount > 1 that are shared with one or
more previous internal snapshots. But they are not recording a change to
the base file, rather, they are recording the current state of the file
where an internal snapshot says to not forget the old state of the file.
  And yes, a qcow2 file with internal snapshots can require more disk
space than the amount of space exposed to the guest.  But that's true
too with external snapshots (the sum of the space required by all images
in the chain may be larger than the space visible to the guest).



OK. I think I'm getting it now. Thanks for your help. I just wish there
were some consistent documentation explaining all this. The Red Hat
material seems to assume you already know it only applies to external
snapshots, and the virsh man page seems to do the same.
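
For reference, day-to-day management of internal snapshots goes through the
snapshot-* commands rather than blockcommit/blockpull. A minimal illustrative
session, using the domain from this thread (the snapshot name is only an
example, and qemu-img should only touch the file while the guest is shut off):

    virsh snapshot-create-as stretch pre-upgrade    # new internal snapshot
    virsh snapshot-list stretch                     # what libvirt knows about
    virsh snapshot-revert stretch pre-upgrade       # roll back to that state
    virsh snapshot-delete stretch pre-upgrade       # drop it (refcounts only)

    qemu-img snapshot -l /home/secure/virtual/stretch.qcow2   # offline view of the same table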





Re: [Qemu-devel] [PULL 23/44] target/ppc: Use vector variable shifts for VSL, VSR, VSRA

2019-06-10 Thread David Gibson
On Fri, Jun 07, 2019 at 09:28:49AM -0500, Richard Henderson wrote:
> On 6/7/19 9:09 AM, Laurent Vivier wrote:
> > On 07/06/2019 11:29, Laurent Vivier wrote:
> >> On 29/05/2019 08:49, David Gibson wrote:
> >>> From: Richard Henderson 
> >>>
> >>> The gvec expanders take care of masking the shift amount
> >>> against the element width.
> >>>
> >>> Signed-off-by: Richard Henderson 
> >>> Message-Id: <20190518191430.21686-2-richard.hender...@linaro.org>
> >>> Signed-off-by: David Gibson 
> >>> ---
> >>>  target/ppc/helper.h | 12 --
> >>>  target/ppc/int_helper.c | 37 -
> >>>  target/ppc/translate/vmx-impl.inc.c | 24 +--
> >>>  3 files changed, 12 insertions(+), 61 deletions(-)
> >>
> >> This patch introduces a regression with Fedora 29 guest:
> >>
> >> - during kernel boot:
> >>
> >> [   40.397876] crypto_register_alg 'aes' = 0
> >> [   40.577517] crypto_register_alg 'cbc(aes)' = 0
> >> [   40.743576] crypto_register_alg 'ctr(aes)' = 0
> >> [   41.061379] alg: skcipher: Test 1 failed (invalid result) on encryption 
> >> for p8_aes_xts
> >> [   41.062054] : 91 7c f6 9e bd 68 b2 ec 9b 9f e9 a3 ea dd a6 92
> >> [   41.062163] 0010: 98 10 35 57 5e dc 36 1e 9a f7 bc ba 39 f2 5c eb
> >> [   41.062834] crypto_register_alg 'xts(aes)' = 0
> >> [   41.077358] alg: hash: Test 2 failed for p8_ghash
> >> [   41.077553] : 5f 89 ab f7 20 57 20 57 20 57 20 57 20 57 20 57
> >>
> >> - with libssl:
> >>
> >> # curl -o /dev/null https://www.google.com
> >>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> >> Current
> >>  Dload  Upload   Total   SpentLeft  
> >> Speed
> >>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:--   
> >>   0
> >> curl: (35) error:1408F119:SSL routines:ssl3_get_record:decryption failed 
> >> or bad record mac
> >>
> >> [before, this one fails with:
> >> curl: (35) error:04091068:rsa routines:int_rsa_verify:bad signature ]
> >>
> >> If I revert this patch on top of 0d74f3b427 + "target/ppc: Fix lxvw4x, 
> >> lxvh8x and lxvb16x", all works fine.
> >>
> >> Thanks,
> >> Laurent
> >>
> > 
> > This seems to fix the problem:
> > 
> > diff --git a/accel/tcg/tcg-runtime-gvec.c b/accel/tcg/tcg-runtime-gvec.c
> > index 3b6052fe97..6f0709b307 100644
> > --- a/accel/tcg/tcg-runtime-gvec.c
> > +++ b/accel/tcg/tcg-runtime-gvec.c
> > @@ -874,7 +874,7 @@ void HELPER(gvec_sar8v)(void *d, void *a, void *b,
> > uint32_t desc)
> >  intptr_t oprsz = simd_oprsz(desc);
> >  intptr_t i;
> > 
> > -for (i = 0; i < oprsz; i += sizeof(vec8)) {
> > +for (i = 0; i < oprsz; i += sizeof(uint8_t)) {
> >  uint8_t sh = *(uint8_t *)(b + i) & 7;
> >  *(int8_t *)(d + i) = *(int8_t *)(a + i) >> sh;
> >  }
> 
> Grr.  I really really need to come up with a solution for testing that allows
> me to test paths that the host cpu would not ordinarily take.  This bug is
> hidden on a host with AVX2.
> 
> Thanks for the digging.

Can one of you send this fix formally with a S-o-b and so forth?
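
A quick pure-Python model of the loop in question (just an illustration, not
the real TCG code) shows why the stride matters: stepping by the 16-byte
vector size leaves 15 of every 16 elements untouched. The helper is only
reached when the host TCG backend does not expand the operation inline,
which is why an AVX2 host hides the bug, as noted above:

    import struct

    def sar8v(dst, a, b, step):
        # per-byte arithmetic shift right, walking the buffers with the given stride
        for i in range(0, len(a), step):
            sh = b[i] & 7
            signed = struct.unpack('b', bytes([a[i]]))[0]
            dst[i] = (signed >> sh) & 0xff

    a = bytearray([0x80] * 16)      # -128 in every byte
    b = bytearray([1] * 16)         # shift each byte right by one
    buggy, fixed = bytearray(16), bytearray(16)
    sar8v(buggy, a, b, 16)          # stride sizeof(vec8): only byte 0 is written
    sar8v(fixed, a, b, 1)           # stride sizeof(uint8_t): every byte is written
    print(buggy.hex(), fixed.hex())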

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [Qemu-block] [PATCH] file-posix: unlock qemu_global_mutex before pread when attach disk

2019-06-10 Thread l00284672

-- Would the "open" hang as well in that case?
   The "open" doesn't hang in that case.

Do you have any better solutions to this problem in that case?
  



On 2019/6/11 0:03, Paolo Bonzini wrote:

On 10/06/19 16:51, l00284672 wrote:

The pread will hang when attaching a disk if the backend storage network
is disconnected.

Would the "open" hang as well in that case?

I think the locking range of qemu_global_mutex is too large when doing a
QMP operation. What does the qemu_global_mutex really protect?

Basically everything.


what is the risk of unlocking qemu_global_mutex in qmp?

It's not QMP; it's specifically the code that calls raw_probe_alignment.

Paolo


[Qemu-devel] [Bug 1832281] [NEW] tcg bug master / 4.0.0 v8 operation >>> and |=

2019-06-10 Thread manuel baesler
Public bug reported:

vm guest is linux, executed with tcg
running this Node.js snippet leads to

$ node
> a = undefined
undefined
> a >>> 0
4294967295

host node
$ node
> a = undefined
undefined
> a >>> 0
0

same with |=

node
Welcome to Node.js v12.4.0.
Type ".help" for more information.
> let buffer
undefined
> buffer |= 0
0

vm with tcg:

$ ./out/Release/node --version
v12.4.0
./out/Release/node -e "let buffer; buffer |= 0; console.log(buffer);"
-1

vm guest is debian x86_64 latest release
vm guest is started with ./x86_64-softmmu/qemu-system-x86_64 -vnc :0 -cdrom 
debian-9.9.0-amd64-netinst.iso -m 4G -smp cores=6,threads=1,sockets=1 -nic 
user,hostfwd=tcp:ipv4addr:2233-:22 -cpu qemu64 debian.img

git tag v4.0.0 and master, commit
a578cdfbdd8f9beff5ced52b7826ddb1669abbbf, for building qemu-system-
x86_64 was used.

Node.js is compiled on the vm guest (v12.4.0 / master)

see also
https://github.com/nodejs/node/issues/19348#issuecomment-500465502

I need further assistance to track down the cause of the bug.

Kind regards
Manuel

** Affects: qemu
 Importance: Undecided
 Status: New


-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1832281

Title:
  tcg bug master / 4.0.0 v8 operation >>> and |=

Status in QEMU:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1832281/+subscriptions



Re: [Qemu-devel] kvm / virsh snapshot management

2019-06-10 Thread Eric Blake
On 6/10/19 6:00 PM, Gary Dale wrote:

>>> Any ideas on what I'm doing wrong?
>> Do you know for sure whether you have internal or external snapshots?
>> And at this point, your questions are starting to wander more into
>> libvirt territory.
>>
> Yes. I'm using internal snapshots. From your other e-mail, I gather that
> the (only) benefit to blockcommit with internal snapshots would be to
> reduce the size of the various tables recording changed blocks. Without
> a blockcommit, the L1 tables get progressively larger over time since
> they record all changes to the base file. Eventually the snapshots could
> become larger than the base image if I don't do a blockcommit.

Not quite. Blockcommit requires external images. It says to take this
image chain:

base <- active

and change it into this shorter chain:

base

by moving the cluster from active into base.  There is no such thing as
blockcommit on internal snapshots, because you don't have any backing
file to push into.

With internal snapshots, the longer an L1 table is active, the more
clusters you have to change compared to what was the case before the
snapshot was created - every time you change an existing cluster, the
refcount on the old cluster decreases and the change gets written into a
new cluster with refcount 1.  Yes, you can reach the point where there
are more clusters with refcount 1 associated with your current L1 table
than there are clusters with refcount > 1 that are shared with one or
more previous internal snapshots. But they are not recording a change to
the base file, rather, they are recording the current state of the file
where an internal snapshot says to not forget the old state of the file.
 And yes, a qcow2 file with internal snapshots can require more disk
space than the amount of space exposed to the guest.  But that's true
too with external snapshots (the sum of the space required by all images
in the chain may be larger than the space visible to the guest).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] tests/acceptance: Improve failure reporting in linux_ssh_mips_malta.py

2019-06-10 Thread Cleber Rosa
On Mon, Jun 10, 2019 at 09:49:10PM +0200, Aleksandar Markovic wrote:
> From: Aleksandar Markovic 
> 
> Rather than outputting a cryptic message:
> 
> FAIL: True not found in [False],
> 
> the following will be reported too, if the command output does not meet
> specified expectations:
> 
> 'lspci -d 11ab:4620' output doesn't contain the word 'GT-64120'
> 
> Signed-off-by: Aleksandar Markovic 
> ---
>  tests/acceptance/linux_ssh_mips_malta.py | 21 ++---
>  1 file changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/tests/acceptance/linux_ssh_mips_malta.py 
> b/tests/acceptance/linux_ssh_mips_malta.py
> index aafb0c3..cbf1b34 100644
> --- a/tests/acceptance/linux_ssh_mips_malta.py
> +++ b/tests/acceptance/linux_ssh_mips_malta.py
> @@ -147,20 +147,27 @@ class LinuxSSH(Test):
>  
>  def run_common_commands(self):
>  stdout, stderr = self.ssh_command('lspci -d 11ab:4620')
> -self.assertIn(True, ["GT-64120" in line for line in stdout])
> +self.assertIn(True, ["GT-64120a" in line for line in stdout],

Looks like there's an extra, unintended, "a" in the expected output, that is,
s/GT-64120a/GT-64120/.

> +  "'lspci -d 11ab:4620' output doesn't contain "
> +  "the word 'GT-64120'")
>
>  stdout, stderr = self.ssh_command('cat 
> /sys/bus/i2c/devices/i2c-0/name')
> -self.assertIn(True, ["SMBus PIIX4 adapter" in line
> - for line in stdout])
> +self.assertIn(True, ["SMBus PIIX4 adaptera" in line

Here too (s/adaptera/adapter/).

> + for line in stdout],
> +  "cat /sys/bus/i2c/devices/i2c-0/name' doesn't contain "
> +  "the words 'SMBus PIIX4 adapter'")
>  
>  stdout, stderr = self.ssh_command('cat /proc/mtd')
> -self.assertIn(True, ["YAMON" in line
> - for line in stdout])
> +self.assertIn(True, ["YAMONa" in line

Also here (s/YAMONa/YAMON/).

> + for line in stdout],
> +  "'cat /proc/mtd' doesn't contain the word 'YAMON'")
>  
>  # Empty 'Board Config'
>  stdout, stderr = self.ssh_command('md5sum /dev/mtd2ro')
> -self.assertIn(True, ["0dfbe8aa4c20b52e1b8bf3cb6cbdf193" in line
> - for line in stdout])
> +self.assertIn(True, ["0dfbe8aa4c20b52e1b8bf3cb6cbdf193a" in line
> + for line in stdout],

And finally in the hash
(s/0dfbe8aa4c20b52e1b8bf3cb6cbdf193a/0dfbe8aa4c20b52e1b8bf3cb6cbdf193/).

> +  "'md5sum /dev/mtd2ro' doesn't contain "
> +  "the word '0dfbe8aa4c20b52e1b8bf3cb6cbdf193'")
>  
>  def do_test_mips_malta(self, endianess, kernel_path, uname_m):
>  self.boot_debian_wheezy_image_and_ssh_login(endianess, kernel_path)
> -- 
> 2.7.4
> 
> 

With those changes, the tests pass for me.  I'd recommend though:

1) Not related to your patch, but it's good practice to name unused
   variables "_", that is:

   stdout, _ = self.ssh_command('lspci -d 11ab:4620')

2) Avoid repeating the expected content (which led to the trailing
   "a"s in this patch).  Something like:

   cmd = 'lspci -d 11ab:4620'
   stdout, _ = self.ssh_command(cmd)
   exp = "GT-64120"
   self.assertIn(True, [exp in line for line in stdout],
 '"%s" output does not contain "%s"' % (cmd, exp))

3) Optionally, create a utility function that would make the check
   more obvious and avoid looping through all lines of the output
   (and building an intermediate list, which the list comprehension does).  Example:

   def ssh_command_output_contains(self, cmd, exp):
   stdout, _ = self.ssh_command(cmd)
   for line in stdout:
   if exp in line:
   break
   else:
   self.fail('"%s" output does not contain "%s"' % (cmd, exp))
   
def run_common_commands(self):
self.ssh_command_output_contains('lspci -d 11ab:4620', 'GT-64120')
self.ssh_command_output_contains('cat /sys/bus/i2c/devices/i2c-0/name',
 'SMBus PIIX4 adapter')
self.ssh_command_output_contains('cat /proc/mtd', 'YAMON')
# Empty 'Board Config'
self.ssh_command_output_contains('md5sum /dev/mtd2ro',
 '0dfbe8aa4c20b52e1b8bf3cb6cbdf193')

Cheers,
- Cleber.



Re: [Qemu-devel] kvm / virsh snapshot management

2019-06-10 Thread Gary Dale

On 2019-06-10 6:07 p.m., Eric Blake wrote:

On 6/10/19 4:27 PM, Gary Dale wrote:


Trying this against a test VM, I ran into a roadblock. My command line
and the results are:

# virsh blockcommit stretch "/home/secure/virtual/stretch.qcow2" --top
stretchS3 --delete --wait
error: unsupported flags (0x2) in function qemuDomainBlockCommit

I get the same thing when the path to the qcow2 file isn't quoted.

That's a libvirt limitation - the --delete flag is documented from the
generic API standpoint, but not (yet) implemented for the qemu driver
within libvirt. For now, you have to omit --delete from your virsh
command line, and then manually 'rm' the unused external file after the
fact.

Which is not possible since I'm using internal snapshots.



I noted in
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/virtualization_administration_guide/sub-sect-domain_commands-using_blockcommit_to_shorten_a_backing_chain
that the options use a single "-".

Sounds like a bug in that documentation.


Yes, and the man page also seems to be wrong. The section on blockcommit 
begins:


blockcommit domain path [bandwidth] [--bytes] [base] [--shallow] [top] [--delete]
   [--keep-relative] [--wait [--async] [--verbose]] [--timeout seconds] [--active]
   [{--pivot | --keep-overlay}]
       Reduce the length of a backing image chain, by committing changes at the
       top of the chain (snapshot or delta files) into backing images. By default,
       this command attempts to flatten the entire chain.

In addition to "[base]" actually being "[--base base]" and "[top]" being 
"[--top top]", the description of what it does only applies to external 
snapshots. Similar things are wrong in the blockpull section.
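
For comparison, a sketch of the double-dash form that the qemu driver does
accept, which only applies to external snapshot chains (the paths below are
purely illustrative, and --delete has to be left off since the driver does
not implement it, as noted above):

    # commit an intermediate overlay into its backing file
    virsh blockcommit stretch vda --base /home/secure/virtual/base.qcow2 \
          --top /home/secure/virtual/overlay.qcow2 --wait --verbose

    # or commit the active (topmost) image and pivot onto the shorter chain
    virsh blockcommit stretch vda --active --pivot --wait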





However the results for that were:
# virsh blockcommit stretch /home/secure/virtual/stretch.qcow2 -top
stretchS3 -delete -wait
error: Scaled numeric value '-top' for <--bandwidth> option is malformed
or out of range

which looks like virsh doesn't like the single dashes and is trying to
interpret them as positional options.

I also did a

# virsh domblklist stretch
Target Source

vda    /home/secure/virtual/stretch.qcow2
hda    -

and tried using vda instead of the full path in the blockcommit but got
the same error.

Any ideas on what I'm doing wrong?

Do you know for sure whether you have internal or external snapshots?
And at this point, your questions are starting to wander more into
libvirt territory.

Yes. I'm using internal snapshots. From your other e-mail, I gather that 
the (only) benefit to blockcommit with internal snapshots would be to 
reduce the size of the various tables recording changed blocks. Without 
a blockcommit, the L1 tables get progressively larger over time since 
they record all changes to the base file. Eventually the snapshots could 
become larger than the base image if I don't do a blockcommit.




Re: [Qemu-devel] kvm / virsh snapshot management

2019-06-10 Thread Eric Blake
On 6/10/19 5:47 PM, Gary Dale wrote:

>>>
>>> Since blockcommit would make it impossible for me to revert to an
>>> earlier state (because I'm committing the oldest snapshot, if it screws
>>> up, I can't undo within virsh), I need to make sure this command is
>>> correct.
>>>
>>>
> Interesting. Your comments are quite different from what the Redhat

It's "Red Hat", two words :)

> online documentation suggests. It spends some time talking about
> flattening the chains (e.g.
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/virtualization_administration_guide/sub-sect-domain_commands-using_blockcommit_to_shorten_a_backing_chain)

That is all about external snapshot file chains (Red Hat specifically
discourages the use of internal snapshots).

> while you are saying the chains don't exist. I gather this is because
> Redhat doesn't like internal snapshots, so they focus purely on
> documenting external ones.
> 
> It does strike me as a little bizarre to handle internal and external
> snapshots differently since the essential difference only seems to be
> where the data is stored. Using chains for one and reference counts for
> the other sounds like a recipe for for things not working right.

If nothing else, it's a reason WHY Red Hat discourages the use of
internal snapshots.

> 
> Anyway, if I understand what you are saying, with internal snapshots, i
> can simply delete old ones and create new ones without worrying about
> there being any performance penalty. All internal snapshots are one hop
> away from the base image.

Still not quite right. All internal snapshots ARE a complete base image,
they do not track a delta from any other point in time, but rather the
complete disk contents of the point in time in question.

Yes, you can delete internal snapshots at will, because nothing else
depends on them. We don't yet have good code for compacting unused
portions of a qcow2 image, though, so your file size may still appear
larger than necessary (hopefully it's sparse, though, so not actually
consuming extra storage).

Also, don't try to mix-and-match internal and external snapshots on a
single guest image - once you've used one style, trying to switch to the
other can cause data loss if you aren't precise about which files
require which clusters to stick around.
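
As a rough illustration of those two points (the file name is an example):
sparseness is easy to check with du, and one common way to compact a qcow2
that has accumulated freed clusters is simply to copy it, keeping in mind
that the copy carries only the active state, so any internal snapshots are
lost in the process:

    du -h --apparent-size guest.qcow2    # size the file claims to be
    du -h guest.qcow2                    # blocks actually allocated on the host

    # copy out the active state only; internal snapshots are NOT carried over
    qemu-img convert -O qcow2 guest.qcow2 guest-compacted.qcow2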

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] kvm / virsh snapshot management

2019-06-10 Thread Gary Dale

On 2019-06-10 6:04 p.m., Eric Blake wrote:

On 6/10/19 10:54 AM, Gary Dale wrote:


One explanation I've seen of the process is if I delete a snapshot, the
changes it contains are merged with its immediate child.

Nope.  Deleting a snapshot decrements the reference count on all its
data clusters.  If a data cluster's reference count reaches zero it will
be freed.  That's all, there is no additional data movement or
reorganization aside from this.

Perhaps not physically but logically it would appear that the data
clusters were merged.

No.

If I have an image that starts out as all blanks, then write to part of
it (top line showing cluster number, bottom line showing representative
data):

012345
AA

then take internal snapshot S1, then write more:

ABB---

then take another internal snapshot S2, then write even more:

ABCC--

the single qcow2 image will have something like:

L1 table for S1 => {
   guest cluster 0 => host cluster 5 refcount 3 content A
   guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for S2 => {
   guest cluster 0 => host cluster 5 refcount 3 content A
   guest cluster 1 => host cluster 7 refcount 2 content B
   guest cluster 2 => host cluster 8 refcount 1 content B
}
L1 table for active image => {
   guest cluster 0 => host cluster 5 refcount 3 content A
   guest cluster 1 => host cluster 7 refcount 2 content B
   guest cluster 2 => host cluster 9 refcount 1 content C
   guest cluster 3 => host cluster 10 refcount 1 content C
}


If I then delete S2, I'm left with:

L1 table for S1 => {
   guest cluster 0 => host cluster 5 refcount 2 content A
   guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for active image => {
   guest cluster 0 => host cluster 5 refcount 2 content A
   guest cluster 1 => host cluster 7 refcount 1 content B
   guest cluster 2 => host cluster 9 refcount 1 content C
   guest cluster 3 => host cluster 10 refcount 1 content C
}

and host cluster 8 is no longer in use.

Or, if I instead use external snapshots, I have a chain of images:

base <- mid <- active

L1 table for image base => {
   guest cluster 0 => host cluster 5 refcount 1 content A
   guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for image mid => {
   guest cluster 1 => host cluster 5 refcount 1 content B
   guest cluster 2 => host cluster 6 refcount 1 content B
}
L1 table for image active => {
   guest cluster 2 => host cluster 5 refcount 1 content C
   guest cluster 3 => host cluster 6 refcount 1 content C
}

If I then delete image mid, I can do so in one of two ways:

blockcommit mid into base:
base <- active
L1 table for image base => {
   guest cluster 0 => host cluster 5 refcount 1 content A
   guest cluster 1 => host cluster 6 refcount 1 content B
   guest cluster 2 => host cluster 7 refcount 1 content B
}
L1 table for image active => {
   guest cluster 2 => host cluster 5 refcount 1 content C
   guest cluster 3 => host cluster 6 refcount 1 content C
}


blockpull mid into active:
base <- active
L1 table for image base => {
   guest cluster 0 => host cluster 5 refcount 1 content A
   guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for image active => {
   guest cluster 1 => host cluster 7 refcount 1 content B
   guest cluster 2 => host cluster 5 refcount 1 content C
   guest cluster 3 => host cluster 6 refcount 1 content C
}



Can some provide a little clarity on this? Thanks!

If you want an analogy then git(1) is a pretty good one.  qcow2 internal
snapshots are like git tags.  Unlike branches, tags are immutable.  In
qcow2 you only have a master branch (the current disk state) from which
you can create a new tag or you can use git-checkout(1) to apply a
snapshot (discarding whatever your current disk state is).

Stefan

That's just making things less clear - I've never tried to understand
git either. Thanks for the attempt though.

If I've gotten things correct, once the base image is established, there
is a current disk state that points to a table containing all the writes
since the base image. Creating a snapshot essentially takes that pointer
and gives it the snapshot name, while creating a new current disk state
pointer and data table where subsequent writes are recorded.

Not quite. Rather, for internal snapshots, there is a table pointing to
ALL the contents that should be visible to the guest at that point in
time (one table for each snapshot, which is effectively read-only, and
one table for the active image, which is updated dynamically as guest
writes happen).  But the table does NOT track provenance of a cluster,
only a refcount.


Deleting snapshots removes your ability to refer to a data table by
name, but the table itself still exists anonymously as part of a chain
of data tables between the base image and the current state.

Wrong for internal snapshots. There is no chain of data tables, and if a
cluster's refcount goes to 0, you no longer have access to the
information that the guest saw at the time 

Re: [Qemu-devel] kvm / virsh snapshot management

2019-06-10 Thread Eric Blake
On 6/10/19 4:27 PM, Gary Dale wrote:

>>
> Trying this against a test VM, I ran into a roadblock. My command line
> and the results are:
> 
> # virsh blockcommit stretch "/home/secure/virtual/stretch.qcow2" --top
> stretchS3 --delete --wait
> error: unsupported flags (0x2) in function qemuDomainBlockCommit
> 
> I get the same thing when the path to the qcow2 file isn't quoted.

That's a libvirt limitation - the --delete flag is documented from the
generic API standpoint, but not (yet) implemented for the qemu driver
within libvirt. For now, you have to omit --delete from your virsh
command line, and then manually 'rm' the unused external file after the
fact.

> 
> I noted in
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/virtualization_administration_guide/sub-sect-domain_commands-using_blockcommit_to_shorten_a_backing_chain
> that the options use a single "-".

Sounds like a bug in that documentation.

> However the results for that were:
> # virsh blockcommit stretch /home/secure/virtual/stretch.qcow2 -top
> stretchS3 -delete -wait
> error: Scaled numeric value '-top' for <--bandwidth> option is malformed
> or out of range
> 
> which looks like virsh doesn't like the single dashes and is trying to
> interpret them as positional options.
> 
> I also did a
> 
> # virsh domblklist stretch
> Target Source
> 
> vda    /home/secure/virtual/stretch.qcow2
> hda    -
> 
> and tried using vda instead of the full path in the blockcommit but got
> the same error.
> 
> Any ideas on what I'm doing wrong?

Do you know for sure whether you have internal or external snapshots?
And at this point, your questions are starting to wander more into
libvirt territory.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] kvm / virsh snapshot management

2019-06-10 Thread Eric Blake
On 6/10/19 10:54 AM, Gary Dale wrote:

>>> One explanation I've seen of the process is if I delete a snapshot, the
>>> changes it contains are merged with its immediate child.
>> Nope.  Deleting a snapshot decrements the reference count on all its
>> data clusters.  If a data cluster's reference count reaches zero it will
>> be freed.  That's all, there is no additional data movement or
>> reorganization aside from this.
> Perhaps not physically but logically it would appear that the data
> clusters were merged.

No.

If I have an image that starts out as all blanks, then write to part of
it (top line showing cluster number, bottom line showing representative
data):

012345
AA

then take internal snapshot S1, then write more:

ABB---

then take another internal snapshot S2, then write even more:

ABCC--

the single qcow2 image will have something like:

L1 table for S1 => {
  guest cluster 0 => host cluster 5 refcount 3 content A
  guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for S2 => {
  guest cluster 0 => host cluster 5 refcount 3 content A
  guest cluster 1 => host cluster 7 refcount 2 content B
  guest cluster 2 => host cluster 8 refcount 1 content B
}
L1 table for active image => {
  guest cluster 0 => host cluster 5 refcount 3 content A
  guest cluster 1 => host cluster 7 refcount 2 content B
  guest cluster 2 => host cluster 9 refcount 1 content C
  guest cluster 3 => host cluster 10 refcount 1 content C
}


If I then delete S2, I'm left with:

L1 table for S1 => {
  guest cluster 0 => host cluster 5 refcount 2 content A
  guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for active image => {
  guest cluster 0 => host cluster 5 refcount 2 content A
  guest cluster 1 => host cluster 7 refcount 1 content B
  guest cluster 2 => host cluster 9 refcount 1 content C
  guest cluster 3 => host cluster 10 refcount 1 content C
}

and host cluster 8 is no longer in use.

Or, if I instead use external snapshots, I have a chain of images:

base <- mid <- active

L1 table for image base => {
  guest cluster 0 => host cluster 5 refcount 1 content A
  guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for image mid => {
  guest cluster 1 => host cluster 5 refcount 1 content B
  guest cluster 2 => host cluster 6 refcount 1 content B
}
L1 table for image active => {
  guest cluster 2 => host cluster 5 refcount 1 content C
  guest cluster 3 => host cluster 6 refcount 1 content C
}

If I then delete image mid, I can do so in one of two ways:

blockcommit mid into base:
base <- active
L1 table for image base => {
  guest cluster 0 => host cluster 5 refcount 1 content A
  guest cluster 1 => host cluster 6 refcount 1 content B
  guest cluster 2 => host cluster 7 refcount 1 content B
}
L1 table for image active => {
  guest cluster 2 => host cluster 5 refcount 1 content C
  guest cluster 3 => host cluster 6 refcount 1 content C
}


blockpull mid into active:
base <- active
L1 table for image base => {
  guest cluster 0 => host cluster 5 refcount 1 content A
  guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for image active => {
  guest cluster 1 => host cluster 7 refcount 1 content B
  guest cluster 2 => host cluster 5 refcount 1 content C
  guest cluster 3 => host cluster 6 refcount 1 content C
}


>>> Can some provide a little clarity on this? Thanks!
>> If you want an analogy then git(1) is a pretty good one.  qcow2 internal
>> snapshots are like git tags.  Unlike branches, tags are immutable.  In
>> qcow2 you only have a master branch (the current disk state) from which
>> you can create a new tag or you can use git-checkout(1) to apply a
>> snapshot (discarding whatever your current disk state is).
>>
>> Stefan
> 
> That's just making things less clear - I've never tried to understand
> git either. Thanks for the attempt though.
> 
> If I've gotten things correct, once the base image is established, there
> is a current disk state that points to a table containing all the writes
> since the base image. Creating a snapshot essentially takes that pointer
> and gives it the snapshot name, while creating a new current disk state
> pointer and data table where subsequent writes are recorded.

Not quite. Rather, for internal snapshots, there is a table pointing to
ALL the contents that should be visible to the guest at that point in
time (one table for each snapshot, which is effectively read-only, and
one table for the active image, which is updated dynamically as guest
writes happen).  But the table does NOT track provenance of a cluster,
only a refcount.
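
The same point can be made with a toy model (plain Python, nothing like
qcow2's real on-disk format; the host cluster names are invented) that
reproduces the refcounts from the tables above:

    refcount = {}                      # host cluster -> reference count

    def write(table, guest_cluster, host_cluster):
        """A guest write allocates a fresh host cluster; the old one loses a ref."""
        old = table.get(guest_cluster)
        if old is not None:
            refcount[old] -= 1
        refcount[host_cluster] = refcount.get(host_cluster, 0) + 1
        table[guest_cluster] = host_cluster

    def snapshot(table):
        """An internal snapshot copies the active table and bumps every refcount."""
        for c in table.values():
            refcount[c] += 1
        return dict(table)

    def delete_snapshot(table):
        """Deleting a snapshot only decrements refcounts; clusters at 0 are free."""
        for c in table.values():
            refcount[c] -= 1

    active = {}
    write(active, 0, 'h5'); write(active, 1, 'h6')    # guest sees A A - -
    s1 = snapshot(active)
    write(active, 1, 'h7'); write(active, 2, 'h8')    # guest sees A B B -
    s2 = snapshot(active)
    write(active, 2, 'h9'); write(active, 3, 'h10')   # guest sees A B C C

    delete_snapshot(s2)
    print(refcount)   # h8 drops to 0 (freed); nothing is merged or chained anywhere

Deleting s1 instead would free h6 and drop h5 to 2; the active table is
untouched either way.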

> 
> Deleting snapshots removes your ability to refer to a data table by
> name, but the table itself still exists anonymously as part of a chain
> of data tables between the base image and the current state.

Wrong for internal snapshots. There is no chain of data tables, and if a
cluster's refcount goes to 0, you no longer have access to the
information that the guest saw at the time that 

Re: [Qemu-devel] [PATCH 1/2] docs/specs/index.rst: Fix minor syntax issues

2019-06-10 Thread Aleksandar Markovic
On Jun 10, 2019 5:25 PM, "Peter Maydell"  wrote:
>
> The docs/specs/index.rst has a couple of minor issues which
> we didn't notice because we weren't building the manual:
>  * the ToC entry for the new PPC XIVE docs points to
>a nonexistent file
>  * the initial comment needs to be marked by '..', not '.',
>or it will appear in the output
> >  * the title doesn't match the capitalization used by
>the existing interop or devel manuals, and uses
>'full-system emulation' rather than the 'system emulation'
>that the interop manual title uses
>
> Fix these minor issues before we start trying to build the manual.
>
> Signed-off-by: Peter Maydell 
> ---

Acked-by: Aleksandar Markovic 

>  docs/specs/index.rst | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/docs/specs/index.rst b/docs/specs/index.rst
> index 2e927519c2e..40adb97c5eb 100644
> --- a/docs/specs/index.rst
> +++ b/docs/specs/index.rst
> @@ -1,8 +1,8 @@
> -. This is the top level page for the 'specs' manual
> +.. This is the top level page for the 'specs' manual
>
>
> -QEMU full-system emulation guest hardware specifications
> -========================================================
> +QEMU System Emulation Guest Hardware Specifications
> +===================================================
>
>
>  Contents:
> @@ -10,4 +10,5 @@ Contents:
>  .. toctree::
> :maxdepth: 2
>
> -   xive
> +   ppc-xive
> +   ppc-spapr-xive
> --
> 2.20.1
>
>


Re: [Qemu-devel] [PATCH 2/2] docs: Build and install specs manual

2019-06-10 Thread Aleksandar Markovic
On Jun 10, 2019 5:25 PM, "Peter Maydell"  wrote:
>
> Now we have some rST format docs in the docs/specs/ manual, we should
> actually build and install it.
>
> Signed-off-by: Peter Maydell 
> ---

Acked-by: Aleksandar Markovic 

>  Makefile   |  7 ++-
>  docs/specs/conf.py | 16 
>  2 files changed, 22 insertions(+), 1 deletion(-)
>  create mode 100644 docs/specs/conf.py
>
> diff --git a/Makefile b/Makefile
> index 8e2fc6624c3..cfb18f15254 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -731,6 +731,7 @@ distclean: clean
> rm -rf .doctrees
> $(call clean-manual,devel)
> $(call clean-manual,interop)
> +   $(call clean-manual,specs)
> for d in $(TARGET_DIRS); do \
> rm -rf $$d || exit 1 ; \
>  done
> @@ -781,6 +782,7 @@ endef
>  .PHONY: install-sphinxdocs
>  install-sphinxdocs: sphinxdocs
> $(call install-manual,interop)
> +   $(call install-manual,specs)
>
>  install-doc: $(DOCS) install-sphinxdocs
> $(INSTALL_DIR) "$(DESTDIR)$(qemu_docdir)"
> @@ -962,7 +964,7 @@ docs/version.texi: $(SRC_PATH)/VERSION config-host.mak
>  # and handles "don't rebuild things unless necessary" itself.
>  # The '.doctrees' files are cached information to speed this up.
>  .PHONY: sphinxdocs
> -sphinxdocs: $(MANUAL_BUILDDIR)/devel/index.html $(MANUAL_BUILDDIR)/interop/index.html
> +sphinxdocs: $(MANUAL_BUILDDIR)/devel/index.html $(MANUAL_BUILDDIR)/interop/index.html $(MANUAL_BUILDDIR)/specs/index.html
>
>  # Canned command to build a single manual
>  build-manual = $(call quiet-command,sphinx-build $(if $(V),,-q) -W -n -b html -D version=$(VERSION) -D release="$(FULL_VERSION)" -d .doctrees/$1 $(SRC_PATH)/docs/$1 $(MANUAL_BUILDDIR)/$1 ,"SPHINX","$(MANUAL_BUILDDIR)/$1")
> @@ -975,6 +977,9 @@ $(MANUAL_BUILDDIR)/devel/index.html: $(call manual-deps,devel)
>  $(MANUAL_BUILDDIR)/interop/index.html: $(call manual-deps,interop)
> $(call build-manual,interop)
>
> +$(MANUAL_BUILDDIR)/specs/index.html: $(call manual-deps,specs)
> +   $(call build-manual,specs)
> +
>  qemu-options.texi: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
> $(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"GEN","$@")
>
> diff --git a/docs/specs/conf.py b/docs/specs/conf.py
> new file mode 100644
> index 000..4d56f3ae13c
> --- /dev/null
> +++ b/docs/specs/conf.py
> @@ -0,0 +1,16 @@
> +# -*- coding: utf-8 -*-
> +#
> +# QEMU documentation build configuration file for the 'specs' manual.
> +#
> +# This includes the top level conf file and then makes any necessary tweaks.
> +import sys
> +import os
> +
> +qemu_docdir = os.path.abspath("..")
> +parent_config = os.path.join(qemu_docdir, "conf.py")
> +exec(compile(open(parent_config, "rb").read(), parent_config, 'exec'))
> +
> +# This slightly misuses the 'description', but is the best way to get
> +# the manual title to appear in the sidebar.
> +html_theme_options['description'] = \
> +u'System Emulation Guest Hardware Specifications'
> --
> 2.20.1
>
>


Re: [Qemu-devel] [PATCH v21 4/7] target/avr: Add instruction translation

2019-06-10 Thread Richard Henderson
On 6/6/19 12:30 PM, Michael Rolnik wrote:
> +if (ctx.check_skip > 0) {
> +TCGLabel *skip = gen_new_label();
> +TCGLabel *done = gen_new_label();
> +
> +tcg_gen_brcondi_tl(TCG_COND_NE, cpu_skip, 0, skip);
> +translate();
> +tcg_gen_br(done);
> +gen_set_label(skip);
> +tcg_gen_movi_tl(cpu_skip, 0);
> +tcg_gen_movi_tl(cpu_pc, ctx.npc);
> +gen_set_label(done);
> +ctx.check_skip--;
> +} else {
> +translate();
> +}

In future, do not indent code like this.

I had been thinking of a slightly more complex solution that does not require
every TB to begin with a conditional branch testing cpu_skip.  This also has
the property that we almost never write to cpu_skip at all -- the condition is
consumed immediately within host registers without being written back to ENV.
The only time we do write to cpu_skip is for debugging, single-stepping, or
when we are forced to break the TB for other unusual reasons.

The following implements this solution.  It's based on some other cleanups that
I have made, and commented upon here.  The full tree can be found at

  https://github.com/rth7680/qemu/commits/review/avr-21


r~
From 893b021bc042c5cb5e7d9cf7d6ff3d9e2cfc25de Mon Sep 17 00:00:00 2001
From: Richard Henderson 
Date: Sun, 9 Jun 2019 21:38:38 -0700
Subject: [PATCH] !fixup target/avr: Add instruction translation

Reorganize instruction skipping.

Put the known state of env->skip into TBFLAGS, which allows the
first instruction to not test cpu_skip directly.  It either knows
that we need not skip (almost always), or that we must skip (rarely).

Put the in-flight skipping state into tcg variables.  This allows
us to consume the skip state without writing it out to cpu_skip.
---
 target/avr/cpu.h   |   4 +
 target/avr/translate.c | 248 +++--
 2 files changed, 167 insertions(+), 85 deletions(-)

diff --git a/target/avr/cpu.h b/target/avr/cpu.h
index a086aca30c..a0de72d75b 100644
--- a/target/avr/cpu.h
+++ b/target/avr/cpu.h
@@ -187,6 +187,7 @@ int avr_cpu_memory_rw_debug(CPUState *cs, vaddr address, uint8_t *buf,
 
 enum {
 TB_FLAGS_FULL_ACCESS = 1,
+TB_FLAGS_SKIP = 2,
 };
 
 static inline void cpu_get_tb_cpu_state(CPUAVRState *env, target_ulong *pc,
@@ -200,6 +201,9 @@ static inline void cpu_get_tb_cpu_state(CPUAVRState *env, target_ulong *pc,
 if (env->fullacc) {
 flags |= TB_FLAGS_FULL_ACCESS;
 }
+if (env->skip) {
+flags |= TB_FLAGS_SKIP;
+}
 
 *pflags = flags;
 }
diff --git a/target/avr/translate.c b/target/avr/translate.c
index dd22abf93a..4d9e2afa26 100644
--- a/target/avr/translate.c
+++ b/target/avr/translate.c
@@ -78,7 +78,11 @@ struct DisasContext {
 int memidx;
 int bstate;
 int singlestep;
-int check_skip;
+
+TCGv skip_var0;
+TCGv skip_var1;
+TCGCond skip_cond;
+bool free_skip_var0;
 };
 
 static int to_A(DisasContext *ctx, int indx) { return 16 + (indx % 16); }
@@ -583,38 +587,45 @@ static bool trans_BLD(DisasContext *ctx, arg_BLD *a)
  */
 static bool trans_BRBC(DisasContext *ctx, arg_BRBC *a)
 {
-TCGLabel *taken = gen_new_label();
+TCGLabel *not_taken = gen_new_label();
+TCGCond cond = TCG_COND_EQ;
+TCGv var;
 
 switch (a->bit) {
 case 0x00:
-tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_Cf, 0, taken);
+var = cpu_Cf;
 break;
 case 0x01:
-tcg_gen_brcondi_i32(TCG_COND_NE, cpu_Zf, 0, taken);
+cond = TCG_COND_NE;
+var = cpu_Zf;
 break;
 case 0x02:
-tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_Nf, 0, taken);
+var = cpu_Nf;
 break;
 case 0x03:
-tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_Vf, 0, taken);
+var = cpu_Vf;
 break;
 case 0x04:
-tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_Sf, 0, taken);
+var = cpu_Sf;
 break;
 case 0x05:
-tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_Hf, 0, taken);
+var = cpu_Hf;
 break;
 case 0x06:
-tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_Tf, 0, taken);
+var = cpu_Tf;
 break;
 case 0x07:
-tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_If, 0, taken);
+var = cpu_If;
 break;
+default:
+g_assert_not_reached();
 }
 
-gen_goto_tb(ctx, 1, ctx->npc);
-gen_set_label(taken);
-gen_goto_tb(ctx, 0, ctx->npc + a->imm);
+tcg_gen_brcondi_i32(tcg_invert_cond(cond), var, 0, not_taken);
+gen_goto_tb(ctx, 1, ctx->npc + a->imm);
+gen_set_label(not_taken);
+
+ctx->bstate = DISAS_CHAIN;
 return true;
 }
 
@@ -626,38 +637,45 @@ static bool trans_BRBC(DisasContext *ctx, arg_BRBC *a)
  */
 static bool trans_BRBS(DisasContext *ctx, arg_BRBS *a)
 {
-TCGLabel *taken = gen_new_label();
+TCGLabel *not_taken = gen_new_label();
+TCGCond cond = TCG_COND_NE;
+TCGv var;
 
 switch 

Re: [Qemu-devel] kvm / virsh snapshot management

2019-06-10 Thread Gary Dale

On 2019-06-10 11:54 a.m., Gary Dale wrote:

On 2019-06-10 8:19 a.m., Stefan Hajnoczi wrote:

On Sat, Jun 01, 2019 at 08:12:01PM -0400, Gary Dale wrote:

A while back I converted a raw disk image to qcow2 to be able to use
snapshots. However I realize that I may not really understand 
exactly how
snapshots work. In this particular case, I'm only talking about 
internal
snapshots currently as there seems to be some differences of opinion 
as to
whether internal or external are safer/more reliable. I'm also only 
talking

about shutdown state snapshots, so it should just be the disk that is
snapshotted.

As I understand it, the first snapshot freezes the base image and 
subsequent
changes in the virtual machine's disk are stored elsewhere in the 
qcow2 file

(remember, only internal snapshots). If I take a second snapshot, that
freezes the first one, and subsequent changes are now in third 
location.
Each new snapshot is incremental to the one that preceded it rather 
than
differential to the base image. Each new snapshot is a child of the 
previous

one.

Internal snapshots are not incremental or differential at the qcow2
level, they are simply a separate L1/L2 table pointing to data clusters.
In other words, they are an independent set of metadata showing the full
state of the image at the point of the snapshot.  qcow2 does not track
relationships between snapshots and parents/children.
Which sounds to me like they are incremental. Each snapshot starts a 
new L1/L2 table so that the state of the previous one is preserved.



One explanation I've seen of the process is if I delete a snapshot, the
changes it contains are merged with its immediate child.

Nope.  Deleting a snapshot decrements the reference count on all its
data clusters.  If a data cluster's reference count reaches zero it will
be freed.  That's all, there is no additional data movement or
reorganization aside from this.
Perhaps not physically but logically it would appear that the data 
clusters were merged.



So if I deleted the
first snapshot, the base image stays the same but any data that has 
changed
since the base image is now in the second snapshot's location. The 
merge
with children explanation also implies that the base image is never 
touched

even if the first snapshot is deleted.

But if I delete a snapshot that has no children, is that essentially 
the
same as reverting to the point that snapshot was created and all 
subsequent
disk changes are lost? Or does it merge down to the parent snapshot? 
If I

delete all snapshots, would that revert to the base image?

No.  qcow2 has the concept of the current disk state of the running VM -
what you get when you boot the guest - and the snapshots - they are
read-only.

When you delete snapshots the current disk state (running VM) is
unaffected.

When you apply a snapshot this throws away the current disk state and
uses the snapshot as the new current disk state.  The read-only snapshot
itself is not modified in any way and you can apply the same snapshot
again as many times as you wish later.
So in essence the current state is a pointer to the latest data 
cluster, which is the only data cluster that can be modified.



I've seen it explained that a snapshot is very much like a timestamp so
deleting a timestamp removes the dividing line between writes that 
occurred
before and after that time, so that data is really only removed if I 
revert

to some time stamp - all writes after that point are discarded. In this
explanation, deleting the oldest timestamp is essentially updating 
the base

image. Deleting all snapshots would leave me with the base image fully
updated.

Frankly, the second explanation sounds more reasonable to me, 
without having
to figure out how copy-on-write works,  But I'm dealing with 
important data

here and I don't want to mess it up by mishandling the snapshots.

Can some provide a little clarity on this? Thanks!

If you want an analogy then git(1) is a pretty good one.  qcow2 internal
snapshots are like git tags.  Unlike branches, tags are immutable.  In
qcow2 you only have a master branch (the current disk state) from which
you can create a new tag or you can use git-checkout(1) to apply a
snapshot (discarding whatever your current disk state is).

Stefan


That's just making things less clear - I've never tried to understand 
git either. Thanks for the attempt though.


If I've gotten things correct, once the base image is established, 
there is a current disk state that points to a table containing all 
the writes since the base image. Creating a snapshot essentially takes 
that pointer and gives it the snapshot name, while creating a new 
current disk state pointer and data table where subsequent writes are 
recorded.


Deleting snapshots removes your ability to refer to a data table by 
name, but the table itself still exists anonymously as part of a chain 
of data tables between the base image and the current state.


This leaves a problem. The chain 

Re: [Qemu-devel] [Qemu-arm] [PATCH 28/42] target/arm: Convert VMOV (imm) to decodetree

2019-06-10 Thread Ali Mezgani


Re: [Qemu-devel] [PULL 8/8] travis: Make check-acceptance job more verbose

2019-06-10 Thread Wainer dos Santos Moschetta

Hi Eduardo,

On 06/07/2019 06:15 PM, Eduardo Habkost wrote:

It will help us debug issues when tests fail.

Signed-off-by: Eduardo Habkost 
---
  .travis.yml | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index 75e017a5cf..82c74673e1 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -225,7 +225,7 @@ matrix:
  # Acceptance (Functional) tests
  - env:
  - CONFIG="--python=/usr/bin/python3 
--target-list=x86_64-softmmu,mips-softmmu,mips64el-softmmu,aarch64-softmmu,arm-softmmu,s390x-softmmu,alpha-softmmu"
-- TEST_CMD="make check-acceptance"
+- TEST_CMD="make AVOCADO_SHOW=test check-acceptance"


Please see Cleber's "[PATCH 1/8] Travis: print acceptance tests logs in
case of job failure". His patch prints the job.log content if and only if
some test failed, which IMO is preferable to making avocado more verbose
as you proposed.


Thanks!

- Wainer


addons:
  apt:
packages:





[Qemu-devel] [PATCH qemu v2] aspeed: Add support for the swift-bmc board

2019-06-10 Thread Adriana Kobylak
From: Adriana Kobylak 

The Swift board is an OpenPOWER system hosting POWER processors.
Add support for their BMC including the I2C devices as found on HW.

Signed-off-by: Adriana Kobylak 
Reviewed-by: Cédric Le Goater 
---
 hw/arm/aspeed.c | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index 33070a6..7fd0e0c 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -73,6 +73,17 @@ struct AspeedBoardState {
 SCU_AST2500_HW_STRAP_ACPI_ENABLE |  \
 SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER))
 
+/* Swift hardware value: 0xF11AD206 */
+#define SWIFT_BMC_HW_STRAP1 (   \
+AST2500_HW_STRAP1_DEFAULTS |\
+SCU_AST2500_HW_STRAP_SPI_AUTOFETCH_ENABLE | \
+SCU_AST2500_HW_STRAP_GPIO_STRAP_ENABLE |\
+SCU_AST2500_HW_STRAP_UART_DEBUG |   \
+SCU_AST2500_HW_STRAP_DDR4_ENABLE |  \
+SCU_H_PLL_BYPASS_EN |   \
+SCU_AST2500_HW_STRAP_ACPI_ENABLE |  \
+SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_MASTER))
+
 /* Witherspoon hardware value: 0xF10AD216 (but use romulus definition) */
 #define WITHERSPOON_BMC_HW_STRAP1 ROMULUS_BMC_HW_STRAP1
 
@@ -287,6 +298,35 @@ static void romulus_bmc_i2c_init(AspeedBoardState *bmc)
     i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 11), "ds1338", 0x32);
 }
 
+static void swift_bmc_i2c_init(AspeedBoardState *bmc)
+{
+    AspeedSoCState *soc = &bmc->soc;
+
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 3), "pca9552", 0x60);
+
+    /* The swift board expects a TMP275 but a TMP105 is compatible */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 7), "tmp105", 0x48);
+    /* The swift board expects a pca9551 but a pca9552 is compatible */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 7), "pca9552", 0x60);
+
+    /* The swift board expects an Epson RX8900 RTC but a ds1338 is compatible */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "ds1338", 0x32);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 8), "pca9552", 0x60);
+
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 9), "tmp423", 0x4c);
+    /* The swift board expects a pca9539 but a pca9552 is compatible */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 9), "pca9552", 0x74);
+
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 10), "tmp423", 0x4c);
+    /* The swift board expects a pca9539 but a pca9552 is compatible */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 10), "pca9552",
+                     0x74);
+
+    /* The swift board expects a TMP275 but a TMP105 is compatible */
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 12), "tmp105", 0x48);
+    i2c_create_slave(aspeed_i2c_get_bus(DEVICE(&soc->i2c), 12), "tmp105", 0x4a);
+}
+
 static void witherspoon_bmc_i2c_init(AspeedBoardState *bmc)
 {
     AspeedSoCState *soc = &bmc->soc;
@@ -378,6 +418,16 @@ static const AspeedBoardConfig aspeed_boards[] = {
 .i2c_init  = romulus_bmc_i2c_init,
 .ram   = 512 * MiB,
 }, {
+.name  = MACHINE_TYPE_NAME("swift-bmc"),
+.desc  = "OpenPOWER Swift BMC (ARM1176)",
+.soc_name  = "ast2500-a1",
+.hw_strap1 = SWIFT_BMC_HW_STRAP1,
+.fmc_model = "mx66l1g45g",
+.spi_model = "mx66l1g45g",
+.num_cs= 2,
+.i2c_init  = swift_bmc_i2c_init,
+.ram   = 512 * MiB,
+}, {
 .name  = MACHINE_TYPE_NAME("witherspoon-bmc"),
 .desc  = "OpenPOWER Witherspoon BMC (ARM1176)",
 .soc_name  = "ast2500-a1",
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v21 4/7] target/avr: Add instruction translation

2019-06-10 Thread Richard Henderson
On 6/6/19 12:30 PM, Michael Rolnik wrote:
> +if ((ctx.cpc & (TARGET_PAGE_SIZE - 1)) == 0) {
> +break; /* page boundary */
> +}

This test isn't right, because this ended the TB if the *first* instruction was
located on the page boundary.  It also fails to allow for a 32-bit opcode to
span 0xff - 0x101, since we did not see an instruction boundary at 0x100.

Since there is no MMU, we do not have to protect against permissions changes.
All we have to worry about is QEMU's code that detects self-modifying code.
This requires the code for each TranslationBlocks be contained within no more
than two pages.  Therefore,

/* Do not allow the next insn to cross into a third code page.  */
if ((ctx.npc - pc_start) * 2 >= TARGET_PAGE_SIZE - 4) {
break;
}

is the better condition.  Subtracting 4 allows for the next instruction to be a
32-bit opcode.  Limiting to TARGET_PAGE_SIZE allows pc_start to be located
anywhere in the first code page without allowing npc to cross into a third code
page.


r~



Re: [Qemu-devel] [PATCH 7/8] VNC Acceptance test: check protocol version

2019-06-10 Thread Wainer dos Santos Moschetta



On 06/07/2019 12:22 PM, Cleber Rosa wrote:

This goes a bit further than the other tests, and does a basic (read
only) interaction with the VNC protocol.

This is not enough to perform a full handshake, but it is enough to make
sure that the socket is operational, that the server performs the expected
initial step of the handshake, and that the version matches.

Signed-off-by: Cleber Rosa 
---
  tests/acceptance/vnc.py | 12 
  1 file changed, 12 insertions(+)

diff --git a/tests/acceptance/vnc.py b/tests/acceptance/vnc.py
index d32ae46685..b000446d7c 100644
--- a/tests/acceptance/vnc.py
+++ b/tests/acceptance/vnc.py
@@ -11,6 +11,7 @@
  import os
  import tempfile
  import shutil
+import socket
  
  from avocado_qemu import Test
  
@@ -71,5 +72,16 @@ class VncUnixSocket(Test):

  arg='new_password')
  self.assertEqual(set_password_response['return'], {})
  
+def test_protocol_version(self):

+self.vm.add_args('-nodefaults', '-S',
+ '-vnc', 'unix:%s' % self.socket_path)
+self.vm.launch()
+self.assertTrue(self.vm.qmp('query-vnc')['return']['enabled'])


I will advocate for the use of QEMUMachine.command() instead of qmp(). The
former does some checks on the QMP response and raises a more helpful
exception if something goes wrong. This patch looks good to me though.
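
Assuming QEMUMachine.command() behaves as in the tree at the time (it returns
the 'return' payload directly and raises if QEMU answers with an error), the
check above would collapse to something like:

    self.assertTrue(self.vm.command('query-vnc')['enabled'])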


Reviewed-by: Wainer dos Santos Moschetta 


+client = socket.socket(socket.AF_UNIX)
+client.connect(self.socket_path)
+expected = b'RFB 003.008'
+actual = client.recv(len(expected))
+self.assertEqual(expected, actual, "Wrong protocol version")
+
  def tearDown(self):
  shutil.rmtree(self.socket_dir)





Re: [Qemu-devel] [PATCH v21 4/7] target/avr: Add instruction translation

2019-06-10 Thread Richard Henderson
On 6/6/19 12:30 PM, Michael Rolnik wrote:
> +enum {
> +BS_NONE = 0, /* Nothing special (none of the below) */
> +BS_STOP = 1, /* We want to stop translation for any reason */
> +BS_BRANCH = 2, /* A branch condition is reached */
> +BS_EXCP = 3, /* An exception condition is reached */
> +};

This set of exit conditions is confused and incomplete.

(1) BS_BRANCH and BS_EXCP do the same thing, namely they
equate exactly to DISAS_NORETURN.

(2) BS_NONE equates exactly to DISAS_NEXT.

(3) BS_STOP is probably better named DISAS_EXIT, just so
it matches the other naming above, and that it will
result in a call to tcg_gen_exit_tb.

(4) You're missing a case that applies to indirect branches
which should use tcg_gen_lookup_and_goto_tb().
I suggest this be called DISAS_LOOKUP.

> +static void gen_goto_tb(DisasContext *ctx, int n, target_ulong dest)
> +{
> +TranslationBlock *tb = ctx->tb;
> +
> +if (ctx->singlestep == 0) {
> +tcg_gen_goto_tb(n);
> +tcg_gen_movi_i32(cpu_pc, dest);
> +tcg_gen_exit_tb(tb, n);
> +} else {
> +tcg_gen_movi_i32(cpu_pc, dest);
> +gen_helper_debug(cpu_env);
> +tcg_gen_exit_tb(NULL, 0);
> +}
> +}

Should set ctx->bstate = DISAS_NORETURN here, and not to BS_BRANCH in the 
callers.

> +if (avr_feature(ctx->env, AVR_FEATURE_ADIW_SBIW) == false) {
> +gen_helper_unsupported(cpu_env);
> +
> +ctx->bstate = BS_EXCP;
> +return true;
> +}

It would be good to define a helper function

static bool have_feature(DisasContext *ctx, int feature)
{
if (!avr_feature(ctx->env, feature)) {
gen_helper_unsupported(cpu_env);
ctx->bstate = DISAS_NORETURN;
return false;
}
return true;
}

so that this pattern becomes

if (!have_feature(ctx, AVR_FEATURE_FOO)) {
return true;
}
// Lots of code
return true;

or

if (have_feature(ctx, AVR_FEATURE_FOO)) {
// Just a few lines
}
return true;

> +/*
> + *  Returns from subroutine. The return address is loaded from the STACK.
> + *  The Stack Pointer uses a preincrement scheme during RET.
> + */
> +static bool trans_RET(DisasContext *ctx, arg_RET *a)
> +{
> +gen_pop_ret(ctx, cpu_pc);
> +
> +tcg_gen_exit_tb(NULL, 0);
> +
> +ctx->bstate = BS_BRANCH;
> +return true;
> +}

With very few exceptions, the lone use of tcg_gen_exit_tb is a bug, because you
have failed to take ctx->singlestep into account.

In this case, this is one of the indirect branch places which you should be
using DISAS_LOOKUP.

> +static bool trans_RETI(DisasContext *ctx, arg_RETI *a)
> +{
> +gen_pop_ret(ctx, cpu_pc);
> +
> +tcg_gen_movi_tl(cpu_If, 1);
> +
> +tcg_gen_exit_tb(NULL, 0);
> +
> +ctx->bstate = BS_BRANCH;
> +return true;
> +}

Here you need to use DISAS_EXIT, because instructions that enable interrupts
also need to exit to the main loop so that we re-evaluate whether interrupts
need to be delivered.
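
Putting the two remarks together, trans_RETI might end up looking something
like this (a sketch under those assumptions, not the patch itself):

static bool trans_RETI(DisasContext *ctx, arg_RETI *a)
{
    gen_pop_ret(ctx, cpu_pc);
    tcg_gen_movi_tl(cpu_If, 1);

    /* Interrupts were just enabled: exit to the main loop so pending
     * interrupts get re-evaluated.  The end-of-TB code takes care of the
     * actual tcg_gen_exit_tb() and of the singlestep case. */
    ctx->bstate = DISAS_EXIT;
    return true;
}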

> +if (ctx.singlestep) {
> +if (ctx.bstate == BS_STOP || ctx.bstate == BS_NONE) {
> +tcg_gen_movi_tl(cpu_pc, ctx.npc);
> +}
> +gen_helper_debug(cpu_env);
> +tcg_gen_exit_tb(NULL, 0);
> +} else {
> +switch (ctx.bstate) {
> +case BS_STOP:
> +case BS_NONE:
> +gen_goto_tb(&ctx, 0, ctx.npc);
> +break;
> +case BS_EXCP:
> +case BS_BRANCH:
> +tcg_gen_exit_tb(NULL, 0);
> +break;
> +default:
> +break;
> +}
> +}

Better written as

switch (ctx.bstate) {
case DISAS_NORETURN:
break;
case DISAS_NEXT:
case DISAS_CHAIN:
/* Note gen_goto_tb checks singlestep. */
gen_goto_tb(&ctx, 0, ctx.npc);
break;
case DISAS_LOOKUP:
if (!ctx.singlestep) {
tcg_gen_lookup_and_goto_ptr();
break;
}
/* fall through */
case DISAS_EXIT:
if (ctx.singlestep) {
gen_helper_debug(cpu_env);
} else {
tcg_gen_exit_tb(NULL, 0);
}
break;
default:
g_assert_not_reached();
}


r~



Re: [Qemu-devel] [PATCH 6/8] VNC Acceptance test: simplify test names

2019-06-10 Thread Wainer dos Santos Moschetta

On 06/07/2019 12:22 PM, Cleber Rosa wrote:

The test name is composed of the class name and method name, so it
looks like there's some redundancy here that we can eliminate.

Signed-off-by: Cleber Rosa 
---
  tests/acceptance/vnc.py | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Wainer dos Santos Moschetta 



diff --git a/tests/acceptance/vnc.py b/tests/acceptance/vnc.py
index 675fd507ed..d32ae46685 100644
--- a/tests/acceptance/vnc.py
+++ b/tests/acceptance/vnc.py
@@ -45,7 +45,7 @@ class VncUnixSocket(Test):
  self.socket_dir = tempfile.mkdtemp()
  self.socket_path = os.path.join(self.socket_dir, 'vnc-socket')
  
-def test_vnc_change_password_requires_a_password(self):

+def test_change_password_requires_a_password(self):
  self.vm.add_args('-nodefaults', '-S',
   '-vnc', 'unix:%s' % self.socket_path)
  self.vm.launch()
@@ -60,7 +60,7 @@ class VncUnixSocket(Test):
  self.assertEqual(set_password_response['error']['desc'],
   'Could not set password')
  
-def test_vnc_change_password(self):

+def test_change_password(self):
  self.vm.add_args('-nodefaults', '-S',
   '-vnc', 'unix:%s,password' % self.socket_path)
  self.vm.launch()





Re: [Qemu-devel] [PATCH 5/8] VNC Acceptance test: use UNIX domain sockets to avoid port collisions

2019-06-10 Thread Wainer dos Santos Moschetta



On 06/07/2019 12:22 PM, Cleber Rosa wrote:

While running in parallel, the VNC tests that use a TCP port easily
collide.  There's a number of possibilities to reduce the probability
of collisions, but none that completely prevents it from happening.

So, to avoid those collisions, and given that the scope of the tests
are really not related to nature of the socket type, let's switch to
UNIX domain sockets created in temporary directories.

Note: the amount of boilerplate code is far from ideal, but it's
related to the fact that a test "workdir"[1] attribute cannot be used
here, because of the 108-byte limit on the UNIX socket path (see
ad9579aaa16). There's a fair assumption here that the temporary
directory returned by Python's tempfile.mkdtemp() won't be anywhere
close to 100 bytes.

[1] 
https://avocado-framework.readthedocs.io/en/68.0/api/test/avocado.html#avocado.Test.workdir

Signed-off-by: Cleber Rosa 
---
  tests/acceptance/vnc.py | 20 ++--
  1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/tests/acceptance/vnc.py b/tests/acceptance/vnc.py
index 064ceabcc1..675fd507ed 100644
--- a/tests/acceptance/vnc.py
+++ b/tests/acceptance/vnc.py
@@ -8,6 +8,10 @@
  # This work is licensed under the terms of the GNU GPL, version 2 or
  # later.  See the COPYING file in the top-level directory.
  
+import os

+import tempfile
+import shutil
+
  from avocado_qemu import Test
  
  
@@ -34,8 +38,16 @@ class Vnc(Test):

  self.assertEqual(set_password_response['error']['desc'],
   'Could not set password')
  
+class VncUnixSocket(Test):

+
+def setUp(self):
+super(VncUnixSocket, self).setUp()
+self.socket_dir = tempfile.mkdtemp()
+self.socket_path = os.path.join(self.socket_dir, 'vnc-socket')
+
  def test_vnc_change_password_requires_a_password(self):
-self.vm.add_args('-nodefaults', '-S', '-vnc', ':0')
+self.vm.add_args('-nodefaults', '-S',
+ '-vnc', 'unix:%s' % self.socket_path)
  self.vm.launch()
  self.assertTrue(self.vm.qmp('query-vnc')['return']['enabled'])
  set_password_response = self.vm.qmp('change',
@@ -49,7 +61,8 @@ class Vnc(Test):
   'Could not set password')
  
  def test_vnc_change_password(self):

-self.vm.add_args('-nodefaults', '-S', '-vnc', ':0,password')
+self.vm.add_args('-nodefaults', '-S',
+ '-vnc', 'unix:%s,password' % self.socket_path)
  self.vm.launch()
  self.assertTrue(self.vm.qmp('query-vnc')['return']['enabled'])
  set_password_response = self.vm.qmp('change',
@@ -57,3 +70,6 @@ class Vnc(Test):
  target='password',
  arg='new_password')
  self.assertEqual(set_password_response['return'], {})
+
+def tearDown(self):
+shutil.rmtree(self.socket_dir)


You missed calling super's tearDown, which would gently shut down all 
VMs created by the tests. Other than that, it looks good to me.


- Wainer





Re: [Qemu-devel] [PATCH v21 4/7] target/avr: Add instruction translation

2019-06-10 Thread Richard Henderson
On 6/6/19 12:30 PM, Michael Rolnik wrote:
> +target_long cpc;
> +target_long npc;

CPC gets copied back and forth to NPC, but is now otherwise unused.
You can drop that.

> +static void translate(DisasContext *ctx)
> +{
> +uint32_t opcode;
> +int res;
> +/* PC points to words.  */
> +opcode = cpu_ldl_code(ctx->env, ctx->cpc * 2) & 0x;
> +
> +ctx->npc = ctx->cpc + 1;

Which allows this to become

uint32_t opcode = next_word(ctx);


r~



Re: [Qemu-devel] [PATCH 4/8] Boot Linux Console Test: add a test for ppc64 + pseries

2019-06-10 Thread Wainer dos Santos Moschetta

Hi Cleber,

On 06/07/2019 12:22 PM, Cleber Rosa wrote:

Just like the previous tests, boots a Linux kernel on a ppc64 target
using the pseries machine.

Signed-off-by: Cleber Rosa 
Reviewed-by: Caio Carrara 
CC: Daniel P. Berrangé 
---
  .travis.yml|  2 +-
  tests/acceptance/boot_linux_console.py | 19 +++
  2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index 9f8e73f276..5592e458ab 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -224,7 +224,7 @@ matrix:
  
  # Acceptance (Functional) tests

  - env:
-- CONFIG="--python=/usr/bin/python3 
--target-list=x86_64-softmmu,mips-softmmu,mips64el-softmmu,aarch64-softmmu,arm-softmmu,s390x-softmmu,alpha-softmmu"
+- CONFIG="--python=/usr/bin/python3 
--target-list=x86_64-softmmu,mips-softmmu,mips64el-softmmu,aarch64-softmmu,arm-softmmu,s390x-softmmu,alpha-softmmu,ppc64-softmmu"
  - TEST_CMD="make check-acceptance"
after_failure:
  - cat tests/results/latest/job.log
diff --git a/tests/acceptance/boot_linux_console.py 
b/tests/acceptance/boot_linux_console.py
index d5c500ea30..a196636367 100644
--- a/tests/acceptance/boot_linux_console.py
+++ b/tests/acceptance/boot_linux_console.py
@@ -217,3 +217,22 @@ class BootLinuxConsole(Test):
  self.vm.launch()
  console_pattern = 'Kernel command line: %s' % kernel_command_line
  self.wait_for_console_pattern(console_pattern)
+
+def test_ppc64_pseries(self):
+"""
+:avocado: tags=arch:ppc64
+:avocado: tags=machine:pseries
+"""
+kernel_url = 
('https://download.fedoraproject.org/pub/fedora-secondary/'
+  'releases/29/Everything/ppc64le/os/ppc/ppc64/vmlinuz')
+kernel_hash = '3fe04abfc852b66653b8c3c897a59a689270bc77'
+kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
+
+self.vm.set_machine('pseries')
+self.vm.set_console()
+kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=hvc0'
+self.vm.add_args('-kernel', kernel_path,
+ '-append', kernel_command_line)
+self.vm.launch()
+console_pattern = 'Kernel command line: %s' % kernel_command_line
+self.wait_for_console_pattern(console_pattern)


I'm refactoring the acceptance tests which boot a Linux Kernel so that 
they can share common functions. My latest implementation [1] was 
re-based on this patch, and it would help manage my contribution if we 
could merge this very soon.


Anyway, tested this patch on my Fedora 30 x86_64 machine. Passed.

Reviewed-by: Wainer dos Santos Moschetta 

[1]  https://github.com/wainersm/qemu/tree/acceptance_boot_linux



Re: [Qemu-devel] [PATCH v3 3/4] net/announce: Add optional ID

2019-06-10 Thread Eric Blake
On 6/10/19 1:44 PM, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Previously there was a single instance of the timer used by
> monitor triggered announces, that's OK, but when combined with the
> previous change that lets you have announces for subsets of interfaces
> it's a bit restrictive if you want to do different things to different
> interfaces.
> 
> Add an 'id' field to the announce, and maintain a list of the
> timers based on id.
> 
> This allows you to for example:
> a) Start an announce going on interface eth0 for a long time
> b) Start an announce going on interface eth1 for a long time
> c) Kill the announce on eth0 while leaving eth1 going.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> ---

> +++ b/include/net/announce.h
> @@ -23,8 +23,12 @@ struct AnnounceTimer {
>  /* Returns: update the timer to the next time point */
>  int64_t qemu_announce_timer_step(AnnounceTimer *timer);
>  
> -/* Delete the underlying timer and other data */
> -void qemu_announce_timer_del(AnnounceTimer *timer);
> +/*
> + * Delete the underlying timer and other datas

'data' is already plural, 'datas' is not a word.

> + * If 'free_named' true and the timer is a named timer, then remove
> + * it from the list of named timers and free the AnnounceTimer itself.
> + */
> +void qemu_announce_timer_del(AnnounceTimer *timer, bool free_named);
>  

> +++ b/qapi/net.json
> @@ -702,6 +702,10 @@
>  # @interfaces: An optional list of interface names, which restrict the
>  #announcment to the listed interfaces. (Since 4.1)
>  #
> +# @id: A name to be used to identify an instance of announce-timers
> +#and to allow it to modified later.  Not for use as
> +#part of the migration paramters. (Since 4.1)

parameters

> +#
>  # Since: 4.0
>  ##
>  
> @@ -710,7 +714,8 @@
>  'max': 'int',
>  'rounds': 'int',
>  'step': 'int',
> -'*interfaces': ['str'] } }
> +'*interfaces': ['str'],
> +'*id' : 'str' } }
>  
>  ##
>  # @announce-self:
> @@ -725,7 +730,7 @@
>  # -> { "execute": "announce-self",
>  #  "arguments": {
>  #  "initial": 50, "max": 550, "rounds": 10, "step": 50,
> -#  "interfaces": ["vn2","vn3"] } }
> +#  "interfaces": ["vn2","vn3"], "id": "bob" } }
>  # <- { "return": {} }
>  #

Worth an example of deleting a timer by id?

>  # Since: 4.0
> diff --git a/tests/virtio-net-test.c b/tests/virtio-net-test.c
> index 163126cf07..7184e2bff4 100644
> --- a/tests/virtio-net-test.c
> +++ b/tests/virtio-net-test.c
> @@ -186,7 +186,7 @@ static void announce_self(void *obj, void *data, 
> QGuestAllocator *t_alloc)
>  rsp = qmp("{ 'execute' : 'announce-self', "
>" 'arguments': {"
>" 'initial': 50, 'max': 550,"
> -  " 'rounds': 10, 'step': 50 } }");
> +  " 'rounds': 10, 'step': 50, 'id': 'bob' } }");

And here, is it worth testing that you can delete by id, rather than
just create with an id?

>  assert(!qdict_haskey(rsp, "error"));
>  qobject_unref(rsp);
>  
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 3/8] Acceptance tests: drop left over usage of ":avocado: enable"

2019-06-10 Thread Wainer dos Santos Moschetta



On 06/07/2019 12:22 PM, Cleber Rosa wrote:

Commit 9531d26c10 removed all of ":avocado: enable" tags, but then
a new entry was added with the introduction of migration.py.

Let's remove it for consistency.

Signed-off-by: Cleber Rosa 
---
  tests/acceptance/migration.py | 3 ---
  1 file changed, 3 deletions(-)


I was about to send a patch to remove it as well. :)

Reviewed-by: Wainer dos Santos Moschetta 



diff --git a/tests/acceptance/migration.py b/tests/acceptance/migration.py
index 6115cf6c24..a44c1ae58f 100644
--- a/tests/acceptance/migration.py
+++ b/tests/acceptance/migration.py
@@ -17,9 +17,6 @@ from avocado.utils import wait
  
  
  class Migration(Test):

-"""
-:avocado: enable
-"""
  
  timeout = 10
  





Re: [Qemu-devel] [PATCH 2/8] tests/requirements.txt: pin paramiko version requirement

2019-06-10 Thread Wainer dos Santos Moschetta



On 06/07/2019 12:22 PM, Cleber Rosa wrote:

It's a good practice (I'd really say a must) to pin as much as
possible of the software versions used during test, so let's apply
that to paramiko.

According to https://pypi.org/project/paramiko/, 2.4.2 is the latest
released version.  It's also easily obtainable on systems such as
Fedora 30.

Signed-off-by: Cleber Rosa 
---
  tests/requirements.txt | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Wainer dos Santos Moschetta 



diff --git a/tests/requirements.txt b/tests/requirements.txt
index 3ae0e29ad7..bd1f7590ed 100644
--- a/tests/requirements.txt
+++ b/tests/requirements.txt
@@ -2,4 +2,4 @@
  # in the tests/venv Python virtual environment. For more info,
  # refer to: https://pip.pypa.io/en/stable/user_guide/#id1
  avocado-framework==68.0
-paramiko
+paramiko==2.4.2





[Qemu-devel] [PATCH] tests/acceptance: Improve failure reporting in linux_ssh_mips_malta.py

2019-06-10 Thread Aleksandar Markovic
From: Aleksandar Markovic 

Rather than outputting a cryptic message:

FAIL: True not found in [False],

the following will also be reported if the command output does not meet
the specified expectations:

'lspci -d 11ab:4620' output doesn't contain the word 'GT-64120'

Signed-off-by: Aleksandar Markovic 
---
 tests/acceptance/linux_ssh_mips_malta.py | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/tests/acceptance/linux_ssh_mips_malta.py 
b/tests/acceptance/linux_ssh_mips_malta.py
index aafb0c3..cbf1b34 100644
--- a/tests/acceptance/linux_ssh_mips_malta.py
+++ b/tests/acceptance/linux_ssh_mips_malta.py
@@ -147,20 +147,27 @@ class LinuxSSH(Test):
 
 def run_common_commands(self):
 stdout, stderr = self.ssh_command('lspci -d 11ab:4620')
-self.assertIn(True, ["GT-64120" in line for line in stdout])
+self.assertIn(True, ["GT-64120a" in line for line in stdout],
+  "'lspci -d 11ab:4620' output doesn't contain "
+  "the word 'GT-64120'")
 
 stdout, stderr = self.ssh_command('cat 
/sys/bus/i2c/devices/i2c-0/name')
-self.assertIn(True, ["SMBus PIIX4 adapter" in line
- for line in stdout])
+self.assertIn(True, ["SMBus PIIX4 adaptera" in line
+ for line in stdout],
+  "cat /sys/bus/i2c/devices/i2c-0/name' doesn't contain "
+  "the words 'SMBus PIIX4 adapter'")
 
 stdout, stderr = self.ssh_command('cat /proc/mtd')
-self.assertIn(True, ["YAMON" in line
- for line in stdout])
+self.assertIn(True, ["YAMONa" in line
+ for line in stdout],
+  "'cat /proc/mtd' doesn't contain the word 'YAMON'")
 
 # Empty 'Board Config'
 stdout, stderr = self.ssh_command('md5sum /dev/mtd2ro')
-self.assertIn(True, ["0dfbe8aa4c20b52e1b8bf3cb6cbdf193" in line
- for line in stdout])
+self.assertIn(True, ["0dfbe8aa4c20b52e1b8bf3cb6cbdf193a" in line
+ for line in stdout],
+  "'md5sum /dev/mtd2ro' doesn't contain "
+  "the word '0dfbe8aa4c20b52e1b8bf3cb6cbdf193'")
 
 def do_test_mips_malta(self, endianess, kernel_path, uname_m):
 self.boot_debian_wheezy_image_and_ssh_login(endianess, kernel_path)
-- 
2.7.4




Re: [Qemu-devel] [PATCH v3 1/4] net/announce: Allow optional list of interfaces

2019-06-10 Thread Eric Blake
On 6/10/19 1:43 PM, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Allow the caller to restrict the set of interfaces that announces are
> sent on.  The default is still to send on all interfaces.
> 
> e.g.
> 
>   { "execute": "announce-self", "arguments": { "initial": 50, "max": 550, 
> "rounds": 5, "step": 50, "interfaces": ["vn2","vn1"] } }
> 
> This doesn't affect the behaviour of migration announcements.
> 
> Note: There's still only one timer for the qmp command, so that
> performing an 'announce-self' on one list of interfaces followed
> by another 'announce-self' on another list will stop the announces
> on the existing set.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> ---

> +++ b/qapi/net.json
> @@ -699,6 +699,9 @@
>  #
>  # @step: Delay increase (in ms) after each self-announcement attempt
>  #
> +# @interfaces: An optional list of interface names, which restrict the

restricts

> +#announcment to the listed interfaces. (Since 4.1)

announcement

> +#
>  # Since: 4.0
>  ##
>  
> @@ -706,7 +709,8 @@
>'data': { 'initial': 'int',
>  'max': 'int',
>  'rounds': 'int',
> -'step': 'int' } }
> +'step': 'int',
> +'*interfaces': ['str'] } }
>  
>  ##
>  # @announce-self:
> @@ -718,9 +722,10 @@
>  #
>  # Example:
>  #
> -# -> { "execute": "announce-self"
> +# -> { "execute": "announce-self",

Embarrassing that we didn't notice that one earlier.

>  #  "arguments": {
> -#  "initial": 50, "max": 550, "rounds": 10, "step": 50 } }
> +#  "initial": 50, "max": 550, "rounds": 10, "step": 50,
> +#  "interfaces": ["vn2","vn3"] } }

Worth a space after the comma? Not required, but I think it looks nicer.

As I only focused on doc issues, I'll leave the full review to others.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 1/8] Travis: print acceptance tests logs in case of job failure

2019-06-10 Thread Wainer dos Santos Moschetta



On 06/07/2019 12:22 PM, Cleber Rosa wrote:

Because Travis doesn't allow us to keep files produced during tests
(such as log files), let's print the complete job log to the "console"
in case of job failures.

This is a debugging aid; given that there have been some timeouts
happening on some tests, we absolutely need the logs in order to take
proper action.

Signed-off-by: Cleber Rosa 
---
  .travis.yml | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index b053a836a3..9f8e73f276 100644
--- a/.travis.yml
+++ b/.travis.yml


It's handy. Unfortunately you cannot archive the log files in Travis, 
otherwise that would be a nice option.


Reviewed-by: Wainer dos Santos Moschetta 


@@ -226,6 +226,8 @@ matrix:
  - env:
  - CONFIG="--python=/usr/bin/python3 
--target-list=x86_64-softmmu,mips-softmmu,mips64el-softmmu,aarch64-softmmu,arm-softmmu,s390x-softmmu,alpha-softmmu"
  - TEST_CMD="make check-acceptance"
+  after_failure:
+- cat tests/results/latest/job.log
addons:
  apt:
packages:





Re: [Qemu-devel] [PATCH v11 4/7] dm: enable synchronous dax

2019-06-10 Thread Mike Snitzer
On Mon, Jun 10 2019 at  5:07am -0400,
Pankaj Gupta  wrote:

>  This patch sets dax device 'DAXDEV_SYNC' flag if all the target
>  devices of device mapper support synchronous DAX. If device
>  mapper consists of both synchronous and asynchronous dax devices,
>  we don't set 'DAXDEV_SYNC' flag.
> 
> Signed-off-by: Pankaj Gupta 
> ---
>  drivers/md/dm-table.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index 350cf0451456..c5160d846fe6 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -890,10 +890,17 @@ static int device_supports_dax(struct dm_target *ti, 
> struct dm_dev *dev,
>   start, len);
>  }
>  
> +static int device_synchronous(struct dm_target *ti, struct dm_dev *dev,
> +sector_t start, sector_t len, void *data)
> +{
> + return dax_synchronous(dev->dax_dev);
> +}
> +
>  bool dm_table_supports_dax(struct dm_table *t, int blocksize)
>  {
>   struct dm_target *ti;
>   unsigned i;
> + bool dax_sync = true;
>  
>   /* Ensure that all targets support DAX. */
>   for (i = 0; i < dm_table_get_num_targets(t); i++) {
> @@ -906,7 +913,14 @@ bool dm_table_supports_dax(struct dm_table *t, int 
> blocksize)
>   !ti->type->iterate_devices(ti, device_supports_dax,
>   &blocksize))
>   return false;
> +
> + /* Check devices support synchronous DAX */
> + if (dax_sync &&
> + !ti->type->iterate_devices(ti, device_synchronous, NULL))
> + dax_sync = false;
>   }
> + if (dax_sync)
> + set_dax_synchronous(t->md->dax_dev);
>  
>   return true;
>  }
> -- 
> 2.20.1
> 

dm_table_supports_dax() is called multiple times (from
dm_table_set_restrictions and dm_table_determine_type).  It is strange
to have a getter have a side-effect of being a setter too.  Overloading
like this could get you in trouble in the future.

Are you certain this is what you want?

Or would it be better to refactor dm_table_supports_dax() to take an
iterate_devices_fn arg and have callers pass the appropriate function?
Then have dm_table_set_restrictions() caller do:

 if (dm_table_supports_dax(t, device_synchronous, NULL))
  set_dax_synchronous(t->md->dax_dev);

(NULL arg implies dm_table_supports_dax() refactoring would take a int
*data pointer rather than int type).

Mike



[Qemu-devel] [PATCH v3 4/4] net/announce: Add HMP optional ID

2019-06-10 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Add the optional ID to the HMP command.

e.g.
   # start an announce for a long time on eth1
   migrate_set_parameter announce-rounds 1000
   announce_self "eth1" e1

   # start an announce on eth2
   announce_self "eth2" e2

   # Change e1 to be announcing on eth1 and eth3
   announce_self "eth1,eth3" e1

   # Cancel e1
   migrate_set_parameter announce-rounds 0
   announce_self "" e1

Signed-off-by: Dr. David Alan Gilbert 
---
 hmp-commands.hx | 7 ---
 hmp.c   | 3 +++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 1b63372713..7ba8543cc3 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -955,8 +955,8 @@ ETEXI
 
 {
 .name   = "announce_self",
-.args_type  = "interfaces:s?",
-.params = "[interfaces]",
+.args_type  = "interfaces:s?,id:s?",
+.params = "[interfaces] [id]",
 .help   = "Trigger GARP/RARP announcements",
 .cmd= hmp_announce_self,
 },
@@ -968,7 +968,8 @@ Trigger a round of GARP/RARP broadcasts; this is useful for 
explicitly updating
 network infrastructure after a reconfiguration or some forms of migration.
 The timings of the round are set by the migration announce parameters.
 An optional comma separated @var{interfaces} list restricts the announce to the
-named set of interfaces.
+named set of interfaces. An optional @var{id} can be used to start a separate 
announce
+timer and to change the parameters of it later.
 ETEXI
 
 {
diff --git a/hmp.c b/hmp.c
index 52efb4a4fa..fd498ca0a8 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1669,12 +1669,15 @@ void hmp_info_snapshots(Monitor *mon, const QDict 
*qdict)
 void hmp_announce_self(Monitor *mon, const QDict *qdict)
 {
 const char *interfaces_str = qdict_get_try_str(qdict, "interfaces");
+const char *id = qdict_get_try_str(qdict, "id");
 AnnounceParameters *params = QAPI_CLONE(AnnounceParameters,
 migrate_announce_params());
 
 qapi_free_strList(params->interfaces);
 params->interfaces = strList_from_comma_list(interfaces_str);
 params->has_interfaces = params->interfaces != NULL;
+params->id = g_strdup(id);
+params->has_id = !!params->id;
 qmp_announce_self(params, NULL);
 qapi_free_AnnounceParameters(params);
 }
-- 
2.21.0




[Qemu-devel] [PATCH v3 3/4] net/announce: Add optional ID

2019-06-10 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Previously there was a single instance of the timer used by
monitor triggered announces, that's OK, but when combined with the
previous change that lets you have announces for subsets of interfaces
it's a bit restrictive if you want to do different things to different
interfaces.

Add an 'id' field to the announce, and maintain a list of the
timers based on id.

This allows you to for example:
a) Start an announce going on interface eth0 for a long time
b) Start an announce going on interface eth1 for a long time
c) Kill the announce on eth0 while leaving eth1 going.

Signed-off-by: Dr. David Alan Gilbert 
---
 hw/net/virtio-net.c |  4 ++--
 include/net/announce.h  |  8 +--
 net/announce.c  | 46 ++---
 net/trace-events|  3 ++-
 qapi/net.json   |  9 ++--
 tests/virtio-net-test.c |  2 +-
 6 files changed, 57 insertions(+), 15 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index ffe0872fff..120248bbf6 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2359,7 +2359,7 @@ static int virtio_net_post_load_device(void *opaque, int 
version_id)
 timer_mod(n->announce_timer.tm,
   qemu_clock_get_ms(n->announce_timer.type));
 } else {
-qemu_announce_timer_del(&n->announce_timer);
+qemu_announce_timer_del(&n->announce_timer, false);
 }
 }
 
@@ -2783,7 +2783,7 @@ static void virtio_net_device_unrealize(DeviceState *dev, 
Error **errp)
 virtio_net_del_queue(n, i);
 }
 
-qemu_announce_timer_del(&n->announce_timer);
+qemu_announce_timer_del(&n->announce_timer, false);
 g_free(n->vqs);
 qemu_del_nic(n->nic);
 virtio_net_rsc_cleanup(n);
diff --git a/include/net/announce.h b/include/net/announce.h
index 3ebffe517e..9cf7a4f835 100644
--- a/include/net/announce.h
+++ b/include/net/announce.h
@@ -23,8 +23,12 @@ struct AnnounceTimer {
 /* Returns: update the timer to the next time point */
 int64_t qemu_announce_timer_step(AnnounceTimer *timer);
 
-/* Delete the underlying timer and other data */
-void qemu_announce_timer_del(AnnounceTimer *timer);
+/*
+ * Delete the underlying timer and other datas
+ * If 'free_named' true and the timer is a named timer, then remove
+ * it from the list of named timers and free the AnnounceTimer itself.
+ */
+void qemu_announce_timer_del(AnnounceTimer *timer, bool free_named);
 
 /*
  * Under BQL/main thread
diff --git a/net/announce.c b/net/announce.c
index 1ce42b571d..4d4e2c22a1 100644
--- a/net/announce.c
+++ b/net/announce.c
@@ -15,6 +15,8 @@
 #include "qapi/qapi-commands-net.h"
 #include "trace.h"
 
+static GData *named_timers;
+
 int64_t qemu_announce_timer_step(AnnounceTimer *timer)
 {
 int64_t step;
@@ -31,8 +33,13 @@ int64_t qemu_announce_timer_step(AnnounceTimer *timer)
 return step;
 }
 
-void qemu_announce_timer_del(AnnounceTimer *timer)
+/*
+ * If 'free_named' is true, then remove the timer from the list
+ * and free the timer itself.
+ */
+void qemu_announce_timer_del(AnnounceTimer *timer, bool free_named)
 {
+bool free_timer = false;
 if (timer->tm) {
 timer_del(timer->tm);
 timer_free(timer->tm);
@@ -40,6 +47,18 @@ void qemu_announce_timer_del(AnnounceTimer *timer)
 }
 qapi_free_strList(timer->params.interfaces);
 timer->params.interfaces = NULL;
+if (free_named && timer->params.has_id) {
+free_timer = timer ==
+ g_datalist_get_data(&named_timers, timer->params.id);
+g_datalist_remove_data(&named_timers, timer->params.id);
+}
+trace_qemu_announce_timer_del(free_named, free_timer, timer->params.id);
+g_free(timer->params.id);
+timer->params.id = NULL;
+
+if (free_timer) {
+g_free(timer);
+}
 }
 
 /*
@@ -56,7 +75,7 @@ void qemu_announce_timer_reset(AnnounceTimer *timer,
  * We're under the BQL, so the current timer can't
  * be firing, so we should be able to delete it.
  */
-qemu_announce_timer_del(timer);
+qemu_announce_timer_del(timer, false);
 
 QAPI_CLONE_MEMBERS(AnnounceParameters, >params, params);
 timer->round = params->rounds;
@@ -120,7 +139,8 @@ static void qemu_announce_self_iter(NICState *nic, void 
*opaque)
 skip = false;
 }
 
-trace_qemu_announce_self_iter(nic->ncs->name,
+trace_qemu_announce_self_iter(timer->params.has_id ? timer->params.id : 
"_",
+  nic->ncs->name,
  qemu_ether_ntoa(&nic->conf->macaddr), skip);
 
 if (!skip) {
@@ -143,7 +163,7 @@ static void qemu_announce_self_once(void *opaque)
 if (--timer->round) {
 qemu_announce_timer_step(timer);
 } else {
-qemu_announce_timer_del(timer);
+qemu_announce_timer_del(timer, true);
 }
 }
 
@@ -154,12 +174,24 @@ void qemu_announce_self(AnnounceTimer *timer, 
AnnounceParameters *params)
 if (params->rounds) {
 

[Qemu-devel] [PATCH v3 0/4] network announce; interface selection & IDs

2019-06-10 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Laine asked for some extra features on the network announce support;

The first allows the announce timer to announce on a subset of the
interfaces.

The second allows there to be multiple timers, each with their own
parameters (including the interface list).

Signed-off-by: Dr. David Alan Gilbert 

v3
  Support for multiple timers.

Dr. David Alan Gilbert (4):
  net/announce: Allow optional list of interfaces
  net/announce: Add HMP optional interface list
  net/announce: Add optional ID
  net/announce: Add HMP optional ID

 hmp-commands.hx |  7 +++-
 hmp.c   | 41 +++-
 hw/net/virtio-net.c |  4 +-
 include/net/announce.h  |  8 +++-
 net/announce.c  | 83 ++---
 net/trace-events|  3 +-
 qapi/net.json   | 16 ++--
 tests/virtio-net-test.c |  2 +-
 8 files changed, 139 insertions(+), 25 deletions(-)

-- 
2.21.0




[Qemu-devel] [PATCH v3 1/4] net/announce: Allow optional list of interfaces

2019-06-10 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Allow the caller to restrict the set of interfaces that announces are
sent on.  The default is still to send on all interfaces.

e.g.

  { "execute": "announce-self", "arguments": { "initial": 50, "max": 550, 
"rounds": 5, "step": 50, "interfaces": ["vn2","vn1"] } }

This doesn't affect the behaviour of migration announcements.

Note: There's still only one timer for the qmp command, so that
performing an 'announce-self' on one list of interfaces followed
by another 'announce-self' on another list will stop the announces
on the existing set.

Signed-off-by: Dr. David Alan Gilbert 
---
 include/net/announce.h |  2 +-
 net/announce.c | 39 ---
 net/trace-events   |  2 +-
 qapi/net.json  | 11 ---
 4 files changed, 42 insertions(+), 12 deletions(-)

diff --git a/include/net/announce.h b/include/net/announce.h
index 892d302b65..3ebffe517e 100644
--- a/include/net/announce.h
+++ b/include/net/announce.h
@@ -23,7 +23,7 @@ struct AnnounceTimer {
 /* Returns: update the timer to the next time point */
 int64_t qemu_announce_timer_step(AnnounceTimer *timer);
 
-/* Delete the underlying timer */
+/* Delete the underlying timer and other data */
 void qemu_announce_timer_del(AnnounceTimer *timer);
 
 /*
diff --git a/net/announce.c b/net/announce.c
index 91e9a6e267..1ce42b571d 100644
--- a/net/announce.c
+++ b/net/announce.c
@@ -38,6 +38,8 @@ void qemu_announce_timer_del(AnnounceTimer *timer)
 timer_free(timer->tm);
 timer->tm = NULL;
 }
+qapi_free_strList(timer->params.interfaces);
+timer->params.interfaces = NULL;
 }
 
 /*
@@ -96,24 +98,47 @@ static int announce_self_create(uint8_t *buf,
 
 static void qemu_announce_self_iter(NICState *nic, void *opaque)
 {
+AnnounceTimer *timer = opaque;
 uint8_t buf[60];
 int len;
+bool skip;
+
+if (timer->params.has_interfaces) {
+strList *entry = timer->params.interfaces;
+/* Skip unless we find our name in the requested list */
+skip = true;
+
+while (entry) {
+if (!strcmp(entry->value, nic->ncs->name)) {
+/* Found us */
+skip = false;
+break;
+}
+entry = entry->next;
+}
+} else {
+skip = false;
+}
+
+trace_qemu_announce_self_iter(nic->ncs->name,
+  qemu_ether_ntoa(&nic->conf->macaddr), skip);
 
-trace_qemu_announce_self_iter(qemu_ether_ntoa(&nic->conf->macaddr));
-len = announce_self_create(buf, nic->conf->macaddr.a);
+if (!skip) {
+len = announce_self_create(buf, nic->conf->macaddr.a);
 
-qemu_send_packet_raw(qemu_get_queue(nic), buf, len);
+qemu_send_packet_raw(qemu_get_queue(nic), buf, len);
 
-/* if the NIC provides it's own announcement support, use it as well */
-if (nic->ncs->info->announce) {
-nic->ncs->info->announce(nic->ncs);
+/* if the NIC provides it's own announcement support, use it as well */
+if (nic->ncs->info->announce) {
+nic->ncs->info->announce(nic->ncs);
+}
 }
 }
 static void qemu_announce_self_once(void *opaque)
 {
 AnnounceTimer *timer = (AnnounceTimer *)opaque;
 
-qemu_foreach_nic(qemu_announce_self_iter, NULL);
+qemu_foreach_nic(qemu_announce_self_iter, timer);
 
 if (--timer->round) {
 qemu_announce_timer_step(timer);
diff --git a/net/trace-events b/net/trace-events
index a7937f3f3a..875ef2a0f3 100644
--- a/net/trace-events
+++ b/net/trace-events
@@ -1,7 +1,7 @@
 # See docs/devel/tracing.txt for syntax documentation.
 
 # announce.c
-qemu_announce_self_iter(const char *mac) "%s"
+qemu_announce_self_iter(const char *name, const char *mac, int skip) "%s:%s 
skip: %d"
 
 # vhost-user.c
 vhost_user_event(const char *chr, int event) "chr: %s got event: %d"
diff --git a/qapi/net.json b/qapi/net.json
index 5f7bff1637..ee9bf2c5f5 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -699,6 +699,9 @@
 #
 # @step: Delay increase (in ms) after each self-announcement attempt
 #
+# @interfaces: An optional list of interface names, which restrict the
+#announcment to the listed interfaces. (Since 4.1)
+#
 # Since: 4.0
 ##
 
@@ -706,7 +709,8 @@
   'data': { 'initial': 'int',
 'max': 'int',
 'rounds': 'int',
-'step': 'int' } }
+'step': 'int',
+'*interfaces': ['str'] } }
 
 ##
 # @announce-self:
@@ -718,9 +722,10 @@
 #
 # Example:
 #
-# -> { "execute": "announce-self"
+# -> { "execute": "announce-self",
 #  "arguments": {
-#  "initial": 50, "max": 550, "rounds": 10, "step": 50 } }
+#  "initial": 50, "max": 550, "rounds": 10, "step": 50,
+#  "interfaces": ["vn2","vn3"] } }
 # <- { "return": {} }
 #
 # Since: 4.0
-- 
2.21.0




[Qemu-devel] [PATCH v3 2/4] net/announce: Add HMP optional interface list

2019-06-10 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Add the optional interface list to the HMP command.

i.e.

   All interfaces
announce_self

   Just the named interfaces:
announce_self vn1,vn2

Signed-off-by: Dr. David Alan Gilbert 
---
 hmp-commands.hx |  6 --
 hmp.c   | 38 +-
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 3dc421cb6a..1b63372713 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -955,8 +955,8 @@ ETEXI
 
 {
 .name   = "announce_self",
-.args_type  = "",
-.params = "",
+.args_type  = "interfaces:s?",
+.params = "[interfaces]",
 .help   = "Trigger GARP/RARP announcements",
 .cmd= hmp_announce_self,
 },
@@ -967,6 +967,8 @@ STEXI
 Trigger a round of GARP/RARP broadcasts; this is useful for explicitly 
updating the
 network infrastructure after a reconfiguration or some forms of migration.
 The timings of the round are set by the migration announce parameters.
+An optional comma separated @var{interfaces} list restricts the announce to the
+named set of interfaces.
 ETEXI
 
 {
diff --git a/hmp.c b/hmp.c
index d4460992b6..52efb4a4fa 100644
--- a/hmp.c
+++ b/hmp.c
@@ -27,6 +27,7 @@
 #include "monitor/monitor.h"
 #include "monitor/qdev.h"
 #include "qapi/error.h"
+#include "qapi/clone-visitor.h"
 #include "qapi/opts-visitor.h"
 #include "qapi/qapi-builtin-visit.h"
 #include "qapi/qapi-commands-block.h"
@@ -38,6 +39,7 @@
 #include "qapi/qapi-commands-run-state.h"
 #include "qapi/qapi-commands-tpm.h"
 #include "qapi/qapi-commands-ui.h"
+#include "qapi/qapi-visit-net.h"
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qerror.h"
 #include "qapi/string-input-visitor.h"
@@ -67,6 +69,32 @@ static void hmp_handle_error(Monitor *mon, Error **errp)
 }
 }
 
+/*
+ * Produce a strList from a comma separated list.
+ * A NULL or empty input string return NULL.
+ */
+static strList *strList_from_comma_list(const char *in)
+{
+strList *res = NULL;
+strList **hook = &res;
+
+while (in && in[0]) {
+char *comma = strchr(in, ',');
+*hook = g_new0(strList, 1);
+
+if (comma) {
+(*hook)->value = g_strndup(in, comma - in);
+in = comma + 1; /* skip the , */
+} else {
+(*hook)->value = g_strdup(in);
+in = NULL;
+}
+hook = &(*hook)->next;
+}
+
+return res;
+}
+
 void hmp_info_name(Monitor *mon, const QDict *qdict)
 {
 NameInfo *info;
@@ -1640,7 +1668,15 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 
 void hmp_announce_self(Monitor *mon, const QDict *qdict)
 {
-qmp_announce_self(migrate_announce_params(), NULL);
+const char *interfaces_str = qdict_get_try_str(qdict, "interfaces");
+AnnounceParameters *params = QAPI_CLONE(AnnounceParameters,
+migrate_announce_params());
+
+qapi_free_strList(params->interfaces);
+params->interfaces = strList_from_comma_list(interfaces_str);
+params->has_interfaces = params->interfaces != NULL;
+qmp_announce_self(params, NULL);
+qapi_free_AnnounceParameters(params);
 }
 
 void hmp_migrate_cancel(Monitor *mon, const QDict *qdict)
-- 
2.21.0




Re: [Qemu-devel] [PATCH 28/42] target/arm: Convert VMOV (imm) to decodetree

2019-06-10 Thread Richard Henderson
On 6/10/19 10:12 AM, Peter Maydell wrote:
> On Sat, 8 Jun 2019 at 20:29, Richard Henderson
>  wrote:
>>
>> On 6/6/19 12:45 PM, Peter Maydell wrote:
>>> +n = (a->imm4h << 28) & 0x8000;
>>> +i = ((a->imm4h << 4) & 0x70) | a->imm4l;
>>> +if (i & 0x40) {
>>> +i |= 0x780;
>>> +} else {
>>> +i |= 0x800;
>>> +}
>>> +n |= i << 19;
>>
>> Can we reuse vfp_expand_imm here?  Given that you don't have pure code motion
>> from the old code (due to field extraction) it doesn't feel wrong to go ahead
>> and fix this wart now.
> 
> I dunno, I'd kind of prefer to do it later. We're already
> drifting away from the old code as you say, and going
> straight to vfp_expand_imm() makes it even less clear that
> we're doing the same thing the old code was...

Fair enough.
Reviewed-by: Richard Henderson 


r~



[Qemu-devel] [Bug 1832250] [NEW] arm32v6/golang:1.10-alpine is broken for qemu 2.8 on MacOS cross-compilation

2019-06-10 Thread Troy Fine
Public bug reported:

FROM arm32v6/golang:1.10-alpine

docker build -t openhorizon/ibm.gps_arm:2.0.7 -f ./Dockerfile.arm .
Sending build context to Docker daemon  110.6kB
Step 1/12 : FROM arm32v6/golang:1.10-alpine
1.10-alpine: Pulling from arm32v6/golang
05276f4299f2: Pull complete 
5657e63df536: Pull complete 
febca98d0249: Pull complete 
5053a7aa5dea: Pull complete 
d048463a3701: Pull complete 
b628c679d668: Pull complete 
Digest: sha256:94c5fd97b17d0e9fe89e011446bedda4784cb0af7a60494989e2a21c0dcba92f
Status: Downloaded newer image for arm32v6/golang:1.10-alpine
 ---> 3110964e8c9a
Step 2/12 : RUN apk --no-cache update && apk add git
 ---> Running in 14ffb11506bb
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/armhf/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/armhf/APKINDEX.tar.gz
v3.9.4-24-g4e2ff29bbe [http://dl-cdn.alpinelinux.org/alpine/v3.9/main]
v3.9.4-25-g65097c9cdc [http://dl-cdn.alpinelinux.org/alpine/v3.9/community]
OK: 9547 distinct packages available
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/armhf/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/armhf/APKINDEX.tar.gz
(1/7) Installing nghttp2-libs (1.35.1-r0)
(2/7) Installing libssh2 (1.8.2-r0)
(3/7) Installing libcurl (7.64.0-r2)
(4/7) Installing libgcc (8.3.0-r0)
(5/7) Installing expat (2.2.6-r0)
(6/7) Installing pcre2 (10.32-r1)
(7/7) Installing git (2.20.1-r0)
Executing busybox-1.29.3-r10.trigger
OK: 18 MiB in 22 packages
Removing intermediate container 14ffb11506bb
 ---> 6890ea7ed09b
Step 3/12 : RUN mkdir -p /build/bin
 ---> Running in 44e52d78d7b4
Removing intermediate container 44e52d78d7b4
 ---> 0763afda41d1
Step 4/12 : COPY src /build/src
 ---> 05bab9a72a34
Step 5/12 : WORKDIR /build
 ---> Running in 5a663caff249
Removing intermediate container 5a663caff249
 ---> 5a6ca53c00de
Step 6/12 : RUN env GOPATH=/build GOOPTIONS_ARM='CGO_ENABLED=0 GOOS=linux 
GOARCH=arm GOARM=6' go get github.com/kellydunn/golang-geo
 ---> Running in 05b09ee0c206
Removing intermediate container 05b09ee0c206
 ---> e68c6e222e51
Step 7/12 : RUN env GOPATH=/build GOOPTIONS_ARM='CGO_ENABLED=0 GOOS=linux 
GOARCH=arm GOARM=6' go build -o /build/bin/armv6_gps /build/src/main.go
 ---> Running in ea6d2707e35f
qemu-arm: /build/qemu-rwi8RH/qemu-2.8+dfsg/translate-all.c:175: tb_lock: 
Assertion `!have_tb_lock' failed.
qemu-arm: /build/qemu-rwi8RH/qemu-2.8+dfsg/translate-all.c:175: tb_lock: 
Assertion `!have_tb_lock' failed.
The command '/bin/sh -c env GOPATH=/build GOOPTIONS_ARM='CGO_ENABLED=0 
GOOS=linux GOARCH=arm GOARM=6' go build -o /build/bin/armv6_gps 
/build/src/main.go' returned a non-zero code: 139
make: *** [build] Error 139

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1832250

Title:
  arm32v6/golang:1.10-alpine is broken for qemu 2.8 on MacOS cross-
  compilation

Status in QEMU:
  New

Bug description:
  FROM arm32v6/golang:1.10-alpine

  docker build -t openhorizon/ibm.gps_arm:2.0.7 -f ./Dockerfile.arm .
  Sending build context to Docker daemon  110.6kB
  Step 1/12 : FROM arm32v6/golang:1.10-alpine
  1.10-alpine: Pulling from arm32v6/golang
  05276f4299f2: Pull complete 
  5657e63df536: Pull complete 
  febca98d0249: Pull complete 
  5053a7aa5dea: Pull complete 
  d048463a3701: Pull complete 
  b628c679d668: Pull complete 
  Digest: 
sha256:94c5fd97b17d0e9fe89e011446bedda4784cb0af7a60494989e2a21c0dcba92f
  Status: Downloaded newer image for arm32v6/golang:1.10-alpine
   ---> 3110964e8c9a
  Step 2/12 : RUN apk --no-cache update && apk add git
   ---> Running in 14ffb11506bb
  fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/armhf/APKINDEX.tar.gz
  fetch 
http://dl-cdn.alpinelinux.org/alpine/v3.9/community/armhf/APKINDEX.tar.gz
  v3.9.4-24-g4e2ff29bbe [http://dl-cdn.alpinelinux.org/alpine/v3.9/main]
  v3.9.4-25-g65097c9cdc [http://dl-cdn.alpinelinux.org/alpine/v3.9/community]
  OK: 9547 distinct packages available
  fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/armhf/APKINDEX.tar.gz
  fetch 
http://dl-cdn.alpinelinux.org/alpine/v3.9/community/armhf/APKINDEX.tar.gz
  (1/7) Installing nghttp2-libs (1.35.1-r0)
  (2/7) Installing libssh2 (1.8.2-r0)
  (3/7) Installing libcurl (7.64.0-r2)
  (4/7) Installing libgcc (8.3.0-r0)
  (5/7) Installing expat (2.2.6-r0)
  (6/7) Installing pcre2 (10.32-r1)
  (7/7) Installing git (2.20.1-r0)
  Executing busybox-1.29.3-r10.trigger
  OK: 18 MiB in 22 packages
  Removing intermediate container 14ffb11506bb
   ---> 6890ea7ed09b
  Step 3/12 : RUN mkdir -p /build/bin
   ---> Running in 44e52d78d7b4
  Removing intermediate container 44e52d78d7b4
   ---> 0763afda41d1
  Step 4/12 : COPY src /build/src
   ---> 05bab9a72a34
  Step 5/12 : WORKDIR /build
   ---> Running in 5a663caff249
  Removing intermediate container 5a663caff249
   ---> 5a6ca53c00de
  Step 6/12 : RUN env GOPATH=/build 

Re: [Qemu-devel] [PATCH] hw: misc: Add Aspeed XDMA device

2019-06-10 Thread Eddie James



On 6/6/19 6:34 PM, Philippe Mathieu-Daudé wrote:

Hi Eddie,

On 6/4/19 12:09 AM, Eddie James wrote:

The XDMA engine embedded in the Aspeed SOCs performs PCI DMA operations
between the SOC (acting as a BMC) and a host processor in a server.

If I got your model correctly, it does no DMA operation but simply
answers correctly to the BMC, and sets the 'operation done' IRQ with no
delay. So this is a dummy device. Then it would be more useful to have
the ignored DMA ops traced with trace-events.



That's correct. Good idea, I will add some tracing.





The XDMA engine exists on the AST2400, AST2500, and AST2600 SOCs, so
enable it for all of those.

Signed-off-by: Eddie James 
---
  hw/arm/aspeed_soc.c   |  14 
  hw/misc/Makefile.objs |   2 +-
  hw/misc/aspeed_xdma.c | 156 ++
  include/hw/arm/aspeed_soc.h   |   2 +
  include/hw/misc/aspeed_xdma.h |  31 +
  5 files changed, 204 insertions(+), 1 deletion(-)
  create mode 100644 hw/misc/aspeed_xdma.c
  create mode 100644 include/hw/misc/aspeed_xdma.h

diff --git a/hw/arm/aspeed_soc.c b/hw/arm/aspeed_soc.c
index faff42b..b25bb18 100644
--- a/hw/arm/aspeed_soc.c
+++ b/hw/arm/aspeed_soc.c
@@ -31,6 +31,7 @@
  #define ASPEED_SOC_VIC_BASE 0x1E6C
  #define ASPEED_SOC_SDMC_BASE0x1E6E
  #define ASPEED_SOC_SCU_BASE 0x1E6E2000
+#define ASPEED_SOC_XDMA_BASE0x1E6E7000
  #define ASPEED_SOC_SRAM_BASE0x1E72
  #define ASPEED_SOC_TIMER_BASE   0x1E782000
  #define ASPEED_SOC_WDT_BASE 0x1E785000
@@ -159,6 +160,9 @@ static void aspeed_soc_init(Object *obj)
  
  sysbus_init_child_obj(obj, "ftgmac100", OBJECT(&s->ftgmac100),

sizeof(s->ftgmac100), TYPE_FTGMAC100);
+
+sysbus_init_child_obj(obj, "xdma", OBJECT(&s->xdma), sizeof(s->xdma),
+  TYPE_ASPEED_XDMA);
  }
  
  static void aspeed_soc_realize(DeviceState *dev, Error **errp)

@@ -298,6 +302,16 @@ static void aspeed_soc_realize(DeviceState *dev, Error 
**errp)
  sysbus_mmio_map(SYS_BUS_DEVICE(&s->ftgmac100), 0, ASPEED_SOC_ETH1_BASE);
  sysbus_connect_irq(SYS_BUS_DEVICE(&s->ftgmac100), 0,
 qdev_get_gpio_in(DEVICE(&s->vic), 2));
+
+/* XDMA */
+object_property_set_bool(OBJECT(&s->xdma), true, "realized", &err);
+if (err) {
+error_propagate(errp, err);
+return;
+}
+sysbus_mmio_map(SYS_BUS_DEVICE(&s->xdma), 0, ASPEED_SOC_XDMA_BASE);
+sysbus_connect_irq(SYS_BUS_DEVICE(&s->xdma), 0,
+   qdev_get_gpio_in(DEVICE(&s->vic), 6));
  }
  
  static void aspeed_soc_class_init(ObjectClass *oc, void *data)

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 77b9df9..a4483af 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -74,7 +74,7 @@ obj-$(CONFIG_ARMSSE_MHU) += armsse-mhu.o
  
  obj-$(CONFIG_PVPANIC) += pvpanic.o

  obj-$(CONFIG_AUX) += auxbus.o
-obj-$(CONFIG_ASPEED_SOC) += aspeed_scu.o aspeed_sdmc.o
+obj-$(CONFIG_ASPEED_SOC) += aspeed_scu.o aspeed_sdmc.o aspeed_xdma.o
  obj-$(CONFIG_MSF2) += msf2-sysreg.o
  obj-$(CONFIG_NRF51_SOC) += nrf51_rng.o
  
diff --git a/hw/misc/aspeed_xdma.c b/hw/misc/aspeed_xdma.c

new file mode 100644
index 000..fe3a32e
--- /dev/null
+++ b/hw/misc/aspeed_xdma.c
@@ -0,0 +1,156 @@
+/*
+ * ASPEED XDMA Controller
+ * Eddie James 
+ *
+ * Copyright (C) 2019 IBM Corp
+ * SPDX-License-Identifer: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/error-report.h"
+#include "hw/misc/aspeed_xdma.h"
+#include "qapi/error.h"
+
+#define XDMA_BMC_CMDQ_ADDR 0x10
+#define XDMA_BMC_CMDQ_ENDP 0x14
+#define XDMA_BMC_CMDQ_WRP  0x18
+#define  XDMA_BMC_CMDQ_W_MASK  0x0003
+#define XDMA_BMC_CMDQ_RDP  0x1C
+#define  XDMA_BMC_CMDQ_RDP_MAGIC   0xEE882266
+#define XDMA_IRQ_ENG_CTRL  0x20
+#define  XDMA_IRQ_ENG_CTRL_US_COMP BIT(4)
+#define  XDMA_IRQ_ENG_CTRL_DS_COMP BIT(5)
+#define  XDMA_IRQ_ENG_CTRL_W_MASK  0xBFEFF07F
+#define XDMA_IRQ_ENG_STAT  0x24
+#define  XDMA_IRQ_ENG_STAT_US_COMP BIT(4)
+#define  XDMA_IRQ_ENG_STAT_DS_COMP BIT(5)
+#define  XDMA_IRQ_ENG_STAT_RESET   0xF800
+
+#define TO_REG(addr) ((addr) / sizeof(uint32_t))
+
+static uint64_t aspeed_xdma_read(void *opaque, hwaddr addr, unsigned int size)
+{
+uint32_t val = 0;
+AspeedXDMAState *xdma = opaque;
+
+if (addr < ASPEED_XDMA_REG_SIZE) {
+val = xdma->regs[TO_REG(addr)];
+}
+
+return (uint64_t)val;
+}
+
+static void aspeed_xdma_write(void *opaque, hwaddr addr, uint64_t val,
+  unsigned int size)
+{
+unsigned int idx;
+uint32_t val32 = (uint32_t)val;
+AspeedXDMAState *xdma = opaque;
+
+if (addr >= ASPEED_XDMA_REG_SIZE) {
+return;
+}
+
+switch (addr) {
+case XDMA_BMC_CMDQ_ENDP:
+xdma->regs[TO_REG(addr)] = val32 & XDMA_BMC_CMDQ_W_MASK;
+break;
+case XDMA_BMC_CMDQ_WRP:
+idx = TO_REG(addr);
+

Re: [Qemu-devel] kvm / virsh snapshot management

2019-06-10 Thread Gary Dale

On 2019-06-10 8:19 a.m., Stefan Hajnoczi wrote:

On Sat, Jun 01, 2019 at 08:12:01PM -0400, Gary Dale wrote:

A while back I converted a raw disk image to qcow2 to be able to use
snapshots. However I realize that I may not really understand exactly how
snapshots work. In this particular case, I'm only talking about internal
snapshots currently as there seems to be some differences of opinion as to
whether internal or external are safer/more reliable. I'm also only talking
about shutdown state snapshots, so it should just be the disk that is
snapshotted.

As I understand it, the first snapshot freezes the base image and subsequent
changes in the virtual machine's disk are stored elsewhere in the qcow2 file
(remember, only internal snapshots). If I take a second snapshot, that
freezes the first one, and subsequent changes are now in third location.
Each new snapshot is incremental to the one that preceded it rather than
differential to the base image. Each new snapshot is a child of the previous
one.

Internal snapshots are not incremental or differential at the qcow2
level, they are simply a separate L1/L2 table pointing to data clusters.
In other words, they are an independent set of metadata showing the full
state of the image at the point of the snapshot.  qcow2 does not track
relationships between snapshots and parents/children.
Which sounds to me like they are incremental. Each snapshot starts a new 
L1/L2 table so that the state of the previous one is preserved.



One explanation I've seen of the process is if I delete a snapshot, the
changes it contains are merged with its immediate child.

Nope.  Deleting a snapshot decrements the reference count on all its
data clusters.  If a data cluster's reference count reaches zero it will
be freed.  That's all, there is no additional data movement or
reorganization aside from this.
Perhaps not physically but logically it would appear that the data 
clusters were merged.



So if I deleted the
first snapshot, the base image stays the same but any data that has changed
since the base image is now in the second snapshot's location. The merge
with children explanation also implies that the base image is never touched
even if the first snapshot is deleted.

But if I delete a snapshot that has no children, is that essentially the
same as reverting to the point that snapshot was created and all subsequent
disk changes are lost? Or does it merge down to the parent snapshot? If I
delete all snapshots, would that revert to the base image?

No.  qcow2 has the concept of the current disk state of the running VM -
what you get when you boot the guest - and the snapshots - they are
read-only.

When you delete snapshots the current disk state (running VM) is
unaffected.

When you apply a snapshot this throws away the current disk state and
uses the snapshot as the new current disk state.  The read-only snapshot
itself is not modified in any way and you can apply the same snapshot
again as many times as you wish later.
So in essence the current state is a pointer to the latest data cluster, 
which is the only data cluster that can be modified.



I've seen it explained that a snapshot is very much like a timestamp so
deleting a timestamp removes the dividing line between writes that occurred
before and after that time, so that data is really only removed if I revert
to some time stamp - all writes after that point are discarded. In this
explanation, deleting the oldest timestamp is essentially updating the base
image. Deleting all snapshots would leave me with the base image fully
updated.

Frankly, the second explanation sounds more reasonable to me, without having
to figure out how copy-on-write works,  But I'm dealing with important data
here and I don't want to mess it up by mishandling the snapshots.

Can some provide a little clarity on this? Thanks!

If you want an analogy then git(1) is a pretty good one.  qcow2 internal
snapshots are like git tags.  Unlike branches, tags are immutable.  In
qcow2 you only have a master branch (the current disk state) from which
you can create a new tag or you can use git-checkout(1) to apply a
snapshot (discarding whatever your current disk state is).

Stefan


That's just making things less clear - I've never tried to understand 
git either. Thanks for the attempt though.


If I've gotten things correct, once the base image is established, there 
is a current disk state that points to a table containing all the writes 
since the base image. Creating a snapshot essentially takes that pointer 
and gives it the snapshot name, while creating a new current disk state 
pointer and data table where subsequent writes are recorded.


Deleting snapshots removes your ability to refer to a data table by 
name, but the table itself still exists anonymously as part of a chain 
of data tables between the base image and the current state.


This leaves a problem. The chain will very quickly get quite long which 
will impact performance. To combat 

Re: [Qemu-devel] [PULL 0/8] Python queue, 2019-06-07

2019-06-10 Thread Daniel P . Berrangé
On Mon, Jun 10, 2019 at 06:38:05PM +0100, Peter Maydell wrote:
> On Mon, 10 Jun 2019 at 18:30, Daniel P. Berrangé  wrote:
> > Do you have access to any machine in the compile farm or is access
> > granted on a per-machine basis ?   If i'm reading this page right:
> >
> >   https://cfarm.tetaneutral.net/machines/list/
> >
> > there is now one aarch64 machine (gcc117) that is running
> > Debian 9 / Stretch and another (gcc118) with OpenSUSE Leap 15.
> > In terms of OS version at least, either of those could be viable
> > for QEMU, if it is possible for you to access them.
> 
> They're all available, but those machines have less RAM and
> fewer CPUs; I think the A1100 is also not as powerful as the X-Gene.
> I'm pretty sure Linaro has access to something better than that,
> but it will take me a bit to find out and set up to use it.
> (I do actually have a Mustang board myself but it's running
> the aarch32 builds and I'd prefer not to make it also do the
> aarch64 builds at the same time.)

Ok, thanks for the explanation.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PULL 0/8] Python queue, 2019-06-07

2019-06-10 Thread Peter Maydell
On Mon, 10 Jun 2019 at 18:30, Daniel P. Berrangé  wrote:
> Do you have access to any machine in the compile farm or is access
> granted on a per-machine basis ?   If i'm reading this page right:
>
>   https://cfarm.tetaneutral.net/machines/list/
>
> there is now one aarch64 machine (gcc117) that is running
> Debian 9 / Stretch and another (gcc118) with OpenSUSE Leap 15.
> In terms of OS version at least, either of those could be viable
> for QEMU, if it is possible for you to access them.

They're all available, but those machines have less RAM and
fewer CPUs; I think the A1100 is also not as powerful as the X-Gene.
I'm pretty sure Linaro has access to something better than that,
but it will take me a bit to find out and set up to use it.
(I do actually have a Mustang board myself but it's running
the aarch32 builds and I'd prefer not to make it also do the
aarch64 builds at the same time.)

thanks
-- PMM



Re: [Qemu-devel] [PULL 0/8] Python queue, 2019-06-07

2019-06-10 Thread Daniel P . Berrangé
On Mon, Jun 10, 2019 at 06:15:46PM +0100, Peter Maydell wrote:
> On Mon, 10 Jun 2019 at 18:12, Daniel P. Berrangé  wrote:
> > There's been two new LTS releases of Ubuntu since then
> > with Xenial and Bionic, so IMHO, it is pretty reasonable
> > to want to drop Trusty rather than continuing to spend time
> > on software versions from 2014 and before. Both our Travis
> > and Docker setups use Xenial as minimum and the number of
> > our developers stuck using Trusty is likely negligible.
> >
> > What gap is your gcc compile farm filling & can we find an
> > alternate way to address that gap that's viable ?
> 
> In this case, it's the aarch64 host. I can probably
> find something else to run this on, but it's not going
> to happen immediately.

Do you have access to any machine in the compile farm or is access
granted on a per-machine basis ?   If I'm reading this page right:

  https://cfarm.tetaneutral.net/machines/list/

there is now one aarch64 machine (gcc117) that is running
Debian 9 / Stretch and another (gcc118) with OpenSUSE Leap 15.
In terms of OS version at least, either of those could be viable
for QEMU, if it is possible for you to access them.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PULL 0/8] Python queue, 2019-06-07

2019-06-10 Thread Peter Maydell
On Mon, 10 Jun 2019 at 18:12, Daniel P. Berrangé  wrote:
> There's been two new LTS releases of Ubuntu since then
> with Xenial and Bionic, so IMHO, it is pretty reasonable
> to want to drop Trusty rather than continuing to spend time
> on software versions from 2014 and before. Both our Travis
> and Docker setups use Xenial as minimum and the number of
> our developers stuck using Trusty is likely negligible.
>
> What gap is your gcc compile farm filling & can we find an
> alternate way to address that gap that's viable ?

In this case, it's the aarch64 host. I can probably
find something else to run this on, but it's not going
to happen immediately.

> docker containers for non-x86_64 arches that cross compile.

You need to actually run the tests, so merely cross
compiling doesn't suffice. (Running 'make check' catches
a fair number of bugs.)

I've asked the gcc farm admins if they have plans for an
OS upgrade on those boxes.

thanks
-- PMM



Re: [Qemu-devel] [PULL 0/8] Python queue, 2019-06-07

2019-06-10 Thread Daniel P . Berrangé
On Mon, Jun 10, 2019 at 02:21:57PM +0100, Peter Maydell wrote:
> On Mon, 10 Jun 2019 at 14:11, Eduardo Habkost  wrote:
> >
> > On Mon, Jun 10, 2019 at 01:58:50PM +0100, Peter Maydell wrote:
> > > Hi. This fails to build on one of my buildtest machines:
> > >
> > > ERROR: Cannot use 'python3', Python 2 >= 2.7 or Python 3 >= 3.5 is 
> > > required.
> > >Use --python=/path/to/python to specify a supported Python.
> > >
> > > The machine has python 2.7.6 and 3.4.3. (It's an Ubuntu trusty
> > > box; it's one of the gcc compile farm machines so upgrades to its
> > > OS are not really under my control.)
> >
> > Ubuntu 16.04 LTS (Xenial) was released in April 2016.  Doesn't
> > that mean Trusty is not a supported build platform since April
> > 2018?
> 
> Possibly, but as I say the gcc compile farm is what it is.
> Regardless of the strict letter of the deprecation policy,
> when we start running into issues with the set of build test
> machines I tend to feel we may be being a bit over-hasty in
> deprecating things.

There's been two new LTS releases of Ubuntu since then
with Xenial and Bionic, so IMHO, it is pretty reasonable
to want to drop Trusty rather than continuing to spend time
on software versions from 2014 and before. Both our Travis
and Docker setups use Xenial as minimum and the number of
our developers stuck using Trusty is likely negligible.

What gap is your gcc compile farm filling & can we find an
alternate way to address that gap that's viable ?

Does the gcc compile farm include Docker to let us run the
build in a container from the compile farm. Or can we use
a Docker container on a modern x86_64 host to test the same
thing. We have docker containers for all Linux OS we need
to target, and docker containers for non-x86_64 arches that
cross compile.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PATCH 28/42] target/arm: Convert VMOV (imm) to decodetree

2019-06-10 Thread Peter Maydell
On Sat, 8 Jun 2019 at 20:29, Richard Henderson
 wrote:
>
> On 6/6/19 12:45 PM, Peter Maydell wrote:
> > +n = (a->imm4h << 28) & 0x80000000;
> > +i = ((a->imm4h << 4) & 0x70) | a->imm4l;
> > +if (i & 0x40) {
> > +i |= 0x780;
> > +} else {
> > +i |= 0x800;
> > +}
> > +n |= i << 19;
>
> Can we reuse vfp_expand_imm here?  Given that you don't have pure code motion
> from the old code (due to field extraction) it doesn't feel wrong to go ahead
> and fix this wart now.

I dunno, I'd kind of prefer to do it later. We're already
drifting away from the old code as you say, and going
straight to vfp_expand_imm() makes it even less clear that
we're doing the same thing the old code was...

thanks
-- PMM



[Qemu-devel] Ubuntu Trusty as supported build platform (was Re: [PULL 0/8] Python queue, 2019-06-07)

2019-06-10 Thread Eduardo Habkost
On Mon, Jun 10, 2019 at 02:21:57PM +0100, Peter Maydell wrote:
> On Mon, 10 Jun 2019 at 14:11, Eduardo Habkost  wrote:
> >
> > On Mon, Jun 10, 2019 at 01:58:50PM +0100, Peter Maydell wrote:
> > > Hi. This fails to build on one of my buildtest machines:
> > >
> > > ERROR: Cannot use 'python3', Python 2 >= 2.7 or Python 3 >= 3.5 is 
> > > required.
> > >Use --python=/path/to/python to specify a supported Python.
> > >
> > > The machine has python 2.7.6 and 3.4.3. (It's an Ubuntu trusty
> > > box; it's one of the gcc compile farm machines so upgrades to its
> > > OS are not really under my control.)
> >
> > Ubuntu 16.04 LTS (Xenial) was released in April 2016.  Doesn't
> > that mean Trusty is not a supported build platform since April
> > 2018?
> 
> Possibly, but as I say the gcc compile farm is what it is.
> Regardless of the strict letter of the deprecation policy,
> when we start running into issues with the set of build test
> machines I tend to feel we may be being a bit over-hasty in
> deprecating things.

I understand this as a request to make Trusty a supported build
platform.  Can we please update the documentation to reflect
that, then?

-- 
Eduardo



Re: [Qemu-devel] [PATCH 15/18] Boot Linux Console Test: add a test for aarch64 + virt

2019-06-10 Thread Cleber Rosa
On Mon, Jun 10, 2019 at 09:53:03AM +0100, Daniel P. Berrangé wrote:
> On Fri, Jun 07, 2019 at 03:58:57PM -0300, Eduardo Habkost wrote:
> > CCing Daniel, who wrote commit 6ab3fc32ea64.
> > 
> > On Fri, Jun 07, 2019 at 11:44:32AM -0400, Cleber Rosa wrote:
> > > On Fri, Jun 07, 2019 at 12:42:14AM -0300, Eduardo Habkost wrote:
> > > > On Fri, Jun 07, 2019 at 12:26:48AM -0300, Eduardo Habkost wrote:
> > > > > On Fri, Feb 01, 2019 at 11:10:31AM -0500, Cleber Rosa wrote:
> > > > > > 
> > > > > > 
> > > > > > On 1/31/19 4:26 PM, Cleber Rosa wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 1/31/19 3:21 PM, Cleber Rosa wrote:
> > > > > > >>
> > > > > > >>
> > > > > > >> On 1/31/19 3:02 PM, Wainer dos Santos Moschetta wrote:
> > > > > > >>>
> > > > > > >>> On 01/17/2019 04:56 PM, Cleber Rosa wrote:
> > > > > >  Just like the previous tests, boots a Linux kernel on a 
> > > > > >  aarch64 target
> > > > > >  using the virt machine.
> > > > > > 
> > > > > >  One special option added is the CPU type, given that the kernel
> > > > > >  selected fails to boot on the virt machine's default CPU 
> > > > > >  (cortex-a15).
> > > > > > 
> > > > > >  Signed-off-by: Cleber Rosa 
> > > > > >  ---
> > > > > >    .travis.yml    |  2 +-
> > > > > >    tests/acceptance/boot_linux_console.py | 20 
> > > > > >  
> > > > > >    2 files changed, 21 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > >  diff --git a/.travis.yml b/.travis.yml
> > > > > >  index 54100eea5a..595e8c0b6c 100644
> > > > > >  --- a/.travis.yml
> > > > > >  +++ b/.travis.yml
> > > > > >  @@ -187,7 +187,7 @@ matrix:
> > > > > >      # Acceptance (Functional) tests
> > > > > >    - env:
> > > > > >  -    - CONFIG="--python=/usr/bin/python3
> > > > > >  --target-list=x86_64-softmmu,mips-softmmu,mips64el-softmmu,ppc64-softmmu"
> > > > > >  +    - CONFIG="--python=/usr/bin/python3
> > > > > >  --target-list=x86_64-softmmu,mips-softmmu,mips64el-softmmu,ppc64-softmmu,aarch64-softmmu"
> > > > > > 
> > > > > >    - TEST_CMD="make check-acceptance"
> > > > > >      addons:
> > > > > >    apt:
> > > > > >  diff --git a/tests/acceptance/boot_linux_console.py
> > > > > >  b/tests/acceptance/boot_linux_console.py
> > > > > >  index f3ccd23a7a..107700b517 100644
> > > > > >  --- a/tests/acceptance/boot_linux_console.py
> > > > > >  +++ b/tests/acceptance/boot_linux_console.py
> > > > > >  @@ -138,3 +138,23 @@ class BootLinuxConsole(Test):
> > > > > >    self.vm.launch()
> > > > > >    console_pattern = 'Kernel command line: %s' %
> > > > > >  kernel_command_line
> > > > > >    self.wait_for_console_pattern(console_pattern)
> > > > > >  +
> > > > > >  +    def test_aarch64_virt(self):
> > > > > > >>>
> > > > > > >>> That test case fails on my system (Fedora 29 x86_64). Avocado 
> > > > > > >>> seems
> > > > > > >>> unable to kill the VM so it  reaches the timeout.
> > > > > > >>>
> > > > > > >>> I compiled QEMU with default configuration:
> > > > > > >>>
> > > > > > >>> $ configure --python=/usr/bin/python3 
> > > > > > >>> --target-list=x86_64-softmmu
> > > > > > >>> --target-list=x86_64-softmmu,mips-softmmu,mips64el-softmmu,ppc64-softmmu,aarch64-softmmu)
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Follows a snippet of the Avocado's job.log file:
> > > > > > >>> 
> > > > > > >>> 2019-01-31 14:41:34,912 test L0602 INFO | START
> > > > > > >>> 07-/root/src/qemu/tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_aarch64_virt
> > > > > > >>>
> > > > > > >>> 2019-01-31 14:41:34,912 test L0298 DEBUG| DATA
> > > > > > >>> (filename=output.expected) => NOT FOUND (data sources: variant, 
> > > > > > >>> test, file)
> > > > > > >>> 2019-01-31 14:41:34,913 parameters   L0146 DEBUG| PARAMS 
> > > > > > >>> (key=arch,
> > > > > > >>> path=*, default=aarch64) => 'aarch64'
> > > > > > >>> 2019-01-31 14:41:34,913 parameters   L0146 DEBUG| PARAMS
> > > > > > >>> (key=qemu_bin, path=*, 
> > > > > > >>> default=aarch64-softmmu/qemu-system-aarch64) =>
> > > > > > >>> 'aarch64-softmmu/qemu-system-aarch64'
> > > > > > >>> 2019-01-31 14:41:34,915 download L0070 INFO | Fetching
> > > > > > >>> https://sjc.edge.kernel.org/fedora-buffet/fedora/linux/releases/29/Server/aarch64/os/images/pxeboot/vmlinuz
> > > > > > >>> -> /var/lib/avocado/data/cache/by_name/vmlinuz.3upct2pr
> > > > > > >>> 2019-01-31 14:41:35,490 download L0054 DEBUG| Retrieved 
> > > > > > >>> URL
> > > > > > >>> "https://sjc.edge.kernel.org/fedora-buffet/fedora/linux/releases/29/Server/aarch64/os/images/pxeboot/vmlinuz":
> > > > > > >>> content-length 8623423, date: "Thu, 31 Jan 2019 19:41:35 GMT",
> > > > > > >>> last-modified: "Sun, 21 Oct 2018 00:43:09 GMT"
> > > > > > >>> 2019-01-31 

Re: [Qemu-devel] [PATCH 1/2] docs/specs/index.rst: Fix minor syntax issues

2019-06-10 Thread Cédric Le Goater
On 10/06/2019 17:24, Peter Maydell wrote:
> The docs/specs/index.rst has a couple of minor issues which
> we didn't notice because we weren't building the manual:
>  * the ToC entry for the new PPC XIVE docs points to
>a nonexistent file
>  * the initial comment needs to be marked by '..', not '.',
>or it will appear in the output
>  * the title doesn't match the capitalization used by
>the existing interop or devel manuals, and uses
>'full-system emulation' rather than the 'system emulation'
>that the interop manual title uses
> 
> Fix these minor issues before we start trying to build the manual.

I only tested the pdf generation of each file with rst2pdf :/
 
> Signed-off-by: Peter Maydell 

Reviewed-by: Cédric Le Goater 

Thanks,

C.

> ---
>  docs/specs/index.rst | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/docs/specs/index.rst b/docs/specs/index.rst
> index 2e927519c2e..40adb97c5eb 100644
> --- a/docs/specs/index.rst
> +++ b/docs/specs/index.rst
> @@ -1,8 +1,8 @@
> -. This is the top level page for the 'specs' manual
> +.. This is the top level page for the 'specs' manual
>  
>  
> -QEMU full-system emulation guest hardware specifications
> -
> +QEMU System Emulation Guest Hardware Specifications
> +===
>  
>  
>  Contents:
> @@ -10,4 +10,5 @@ Contents:
>  .. toctree::
> :maxdepth: 2
>  
> -   xive
> +   ppc-xive
> +   ppc-spapr-xive
> 




Re: [Qemu-devel] qgraph

2019-06-10 Thread Paolo Bonzini
On 10/06/19 18:12, Andreas Färber wrote:
> On 10.06.19 at 15:52, Paolo Bonzini wrote:
>> On 10/06/19 15:28, Andreas Färber wrote:
>>> So if we want a new QMP operation, what would probably make the most sense is
>>> where-can-I-attach-type(foo) returning a list of QOM paths, showing only
>>> the first free slot per bus. That would allow a more efficient lookup
>>> implementation inside QEMU than needing to check each slot[n] property
>>> via qom-get after discovering it with qom-list.
>>
>> Note that what Natalia is seeking is an introspection mechanism to be
>> used _before_ creating a virtual machine though.
> 
> QMP implied creating a virtual machine though.

Yes, but you can start QEMU with -M none and just invoke QOM
introspection commands.

> This then goes back to the long-discussed topic of not doing recursive
> realized=true when starting halted with -s but deferring that to the
> cont operation. I doubt that's been implemented in the meantime?

Nope.

Paolo



Re: [Qemu-devel] qgraph

2019-06-10 Thread Andreas Färber
On 10.06.19 at 15:52, Paolo Bonzini wrote:
> On 10/06/19 15:28, Andreas Färber wrote:
>> So if we want a new QMP operation, what would probably make the most sense is
>> where-can-I-attach-type(foo) returning a list of QOM paths, showing only
>> the first free slot per bus. That would allow a more efficient lookup
>> implementation inside QEMU than needing to check each slot[n] property
>> via qom-get after discovering it with qom-list.
> 
> Note that what Natalia is seeking is an introspection mechanism to be
> used _before_ creating a virtual machine though.

QMP implied creating a virtual machine though.

This then goes back to the long-discussed topic of not doing recursive
realized=true when starting halted with -s but deferring that to the
cont operation. I doubt that's been implemented in the meantime?

Cheers,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)



[Qemu-devel] [PATCH v2] qapi: InitSocketAddress: add keepalive option

2019-06-10 Thread Vladimir Sementsov-Ogievskiy
It's needed to provide keepalive for nbd client to track server
availability.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

v2: [by Markus's comments]
 - improve commit message
 - s/keepalive/keep-alive
 - update inet_parse()


 qapi/sockets.json   |  5 -
 util/qemu-sockets.c | 22 ++
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/qapi/sockets.json b/qapi/sockets.json
index fc81d8d5e8..13a2627e1d 100644
--- a/qapi/sockets.json
+++ b/qapi/sockets.json
@@ -53,6 +53,8 @@
 #
 # @ipv6: whether to accept IPv6 addresses, default try both IPv4 and IPv6
 #
+# @keep-alive: enable keep-alive when connecting to this socket (Since 4.1)
+#
 # Since: 1.3
 ##
 { 'struct': 'InetSocketAddress',
@@ -61,7 +63,8 @@
 '*numeric':  'bool',
 '*to': 'uint16',
 '*ipv4': 'bool',
-'*ipv6': 'bool' } }
+'*ipv6': 'bool',
+'*keep-alive': 'bool' } }
 
 ##
 # @UnixSocketAddress:
diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 8850a280a8..9c842c4a93 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -457,6 +457,19 @@ int inet_connect_saddr(InetSocketAddress *saddr, Error 
**errp)
 }
 
 freeaddrinfo(res);
+
+if (saddr->keep_alive) {
+int val = 1;
+int ret = qemu_setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE,
+  &val, sizeof(val));
+
+if (ret < 0) {
+error_setg_errno(errp, errno, "Unable to set KEEPALIVE");
+close(sock);
+return -1;
+}
+}
+
 return sock;
 }
 
@@ -652,6 +665,15 @@ int inet_parse(InetSocketAddress *addr, const char *str, 
Error **errp)
 }
 addr->has_ipv6 = true;
 }
+begin = strstr(optstr, ",keep-alive");
+if (begin) {
+if (inet_parse_flag("keep-alive", begin + strlen("keep-alive="),
+&addr->keep_alive, errp) < 0)
+{
+return -1;
+}
+addr->has_keep_alive = true;
+}
 return 0;
 }
 
-- 
2.18.0
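
For illustration only (not part of the patch): with the hunk above applied,
the new flag would be consumed roughly like this, assuming QEMU-internal code
(qemu/sockets.h and qapi/error.h); "example.org:10809" is a made-up address.

InetSocketAddress addr = { 0 };
Error *err = NULL;

if (inet_parse(&addr, "example.org:10809,keep-alive=on", &err) == 0) {
    /* With the patch, addr.has_keep_alive and addr.keep_alive are now true
     * and inet_connect_saddr() will set SO_KEEPALIVE on the socket. */
}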




Re: [Qemu-devel] [Qemu-block] [PATCH] file-posix: unlock qemu_global_mutex before pread when attach disk

2019-06-10 Thread Paolo Bonzini
On 10/06/19 16:51, l00284672 wrote:
> The pread will hang when attaching a disk just when the backend storage
> network disconnects.

Would the "open" hang as well in that case?
> I think the locking range of qemu_global_mutex is too large when doing a QMP
> operation. What does the qemu_global_mutex really protect?

Basically everything.

> What is the risk of unlocking qemu_global_mutex in QMP?

It's not QMP; it's specifically the code that calls raw_probe_alignment.

Paolo



Re: [Qemu-devel] [PATCH v2] Incorrect Stack Pointer shadow register support on some m68k CPUs

2019-06-10 Thread Laurent Vivier
On 09/06/2019 at 18:43, Lucien Murray-Pitts wrote:
> Brief overview;
> - Added "CPU class" m68k_feature to each CPU init
>   so MOVEC can detect wrong CR (Control Register) access
> - Added cascaded "inheritance" of m68k_features by calling m680xx_cpu_initfn()
>   of previous CPU so that 68060 inherits 68040, and so on
> - Added comments above m680xx_cpu_initfn to identify additional supported
>   features for that CPU class
> - Added more detailed comments, including CPU classes supported,
>   to enum m68k_features
> - Added more detailed comments to each case of m68k_move_to/from helpers
>   to list the supported CPUs for that CR
> - Added CPU class detection for each CR type, exits switch if unsupported
> - Added ILLEGAL INSTRUCTION exception condition when the helper fails to
>   decode the CR
> - Moved abort only to handle unimplemented control registers,
>   all other unknown CR will cause ILLEGAL instruction
> - Fixed m68k_switch_sp so it switches only if MSP feature is implemented
> - Changed the MOVEC instruction in translate to be 68010 not 68000
> - Added missing BUSCR/PCR CR defines, and decodes for helpers for the 68060


For readability, could you split this patch in several ones?

For instance, one patch by item of the list above?
Globally, the approach looks good, but I'm not convinced we need one
feature bit for each CPU.

Thanks,
Laurent

> Long overview;
> The 68000 does not support the MOVEC instruction, it was added with the 68010.
> 
> Further, on the 68010 and CPU32 the ISP doesn't exist.
> These CPUs only have SSP/USP.
> 
> On supporting CPUs the SR register also implements a single bit,  the "M"
> (master-mode) bit that determines which of the ISP/MSP is active at the time.
> 
> When not supported by the CPU the MOVEC instruction when accessing these
> shadow registers should issue an ILLEGAL INSTRUCTION exception.
> 
> Further, this patch adds classes for each CPU family 680[012346] so that
> illegal access to specific control registers can be checked.
> 
> Additional comments added to the features set to clarify
> exactly what differentiates each class.  (m68k_features)
> 
> The helpers m68k_movec_to, and m68k_movec_from have been updated to support
> the exception ILLEGAL INSTRUCTION for all control registers that
> are illegal per CPU class, and for any unknown control register.
> 
> Added defines for BUS, and Processor Configuration Register (PCR) for MC68060,
> and case statements in the helper for the missing Cache Address Register 
> (CAAR),
> and the new BUS, and PCR which results in a cpu abort (unimplemented).
> 
> All other cases will result in an ILLEGAL INSTRUCTION exception as per the
> manual.
> 
> Because the MSP support results in an illegal instruction exception
> if the wrong Control Register is accessed then it was necessary to
> know the CPU class in the MOVEC instruction.
> 
> To do this a sizable overhaul of the CPU initialize functions was needed
> to add a feature showing the CPU class.
> 
> So in the CPU classes the m680XX_cpu_initfn functions have been rearranged
> to cascade starting from the base 68000, so that the 68010 then inherits
> from this, and so on until the 68060.
> 
> Because each cpu class inherits the previous CPU class, then for example
> the 68020 also has the feature 68010, and 68000 and so on upto the 68060.
> 
> Added some more detailed comments to each cpu initialization function
> to make it clear the instructions added/changed for that CPU to make
> future debugging easier, and the reason for the feature flags more clear.
> 
> These comments could go deeper into explaining supported/enhanced modes,
> but this wasn't done in this patch.
> 
> There are comments in the existing code referring to the EC/L/and-so-on
> classes, however no code has been implemented to handle these specific
> varitations of each CPU class, and so no m68k_feature was mde to
> distinguish them that way.
> 
> Signed-off-by: Lucien Murray-Pitts 
> ---
> 
> Notes:
> v1->v2
>   - Submitted previous patch to fix existing non-compliant comment style 
>   - Added a comment about sp in CPUM68KState structure
>   - updated movec in the same patch to issue exception
>   - Reworked code in m68k_movec_from()/m68k_movec_to() because 
> as they trigger a cpu_abort() with unknown code, 
>   - Above rework then required some additions for CPU class and so on.
>   - Maybe this is becoming more of an RFC? / should be split for the 
> rework?
> 
> Based-on: 20190606234125.GA4830@localhost.localdomain
> ([PATCH v2] m68k comments break patch submission due to being incorrectly 
> formatted)
> 
>  target/m68k/cpu.c   | 112 ++
>  target/m68k/cpu.h   |  56 ++---
>  target/m68k/helper.c| 247 +++-
>  target/m68k/translate.c |   4 +-
>  4 files changed, 326 insertions(+), 93 deletions(-)
> 
> diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
> index c144278661..09f3514715 100644
> --- 

Re: [Qemu-devel] [PATCH] qapi: InitSocketAddress: add keepalive option

2019-06-10 Thread Vladimir Sementsov-Ogievskiy
07.06.2019 20:22, Markus Armbruster wrote:
> Vladimir Sementsov-Ogievskiy  writes:
> 
>> 06.06.2019 14:17, Daniel P. Berrangé wrote:
>>> On Thu, Jun 06, 2019 at 01:15:33PM +0300, Vladimir Sementsov-Ogievskiy 
>>> wrote:
 Signed-off-by: Vladimir Sementsov-Ogievskiy 
 ---

 Hi all!

 This is a continuation of "[PATCH v2 0/2] nbd: enable keepalive", but
 it's a try from another side, so almost nothing common with v2.
> 
> Please explain intended use of your new option in your commit message.

Will do. Actual reason is keepalive for nbd-client.

> 
qapi/sockets.json   |  5 -
util/qemu-sockets.c | 13 +
2 files changed, 17 insertions(+), 1 deletion(-)

 diff --git a/qapi/sockets.json b/qapi/sockets.json
 index fc81d8d5e8..aefa024051 100644
 --- a/qapi/sockets.json
 +++ b/qapi/sockets.json
 @@ -53,6 +53,8 @@
#
# @ipv6: whether to accept IPv6 addresses, default try both IPv4 and 
 IPv6
#
 +# @keepalive: enable keepalive when connecting to this socket (Since 4.1)
 +#
# Since: 1.3
##
{ 'struct': 'InetSocketAddress',
 @@ -61,7 +63,8 @@
'*numeric':  'bool',
'*to': 'uint16',
'*ipv4': 'bool',
 -'*ipv6': 'bool' } }
 +'*ipv6': 'bool',
 +'*keepalive': 'bool' } }
> 
> I know the C identifier is SO_KEEPALIVE, but let's stick to proper
> English words in QMP: keep-alive.

Ok

> 

##
# @UnixSocketAddress:
 diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
 index 8850a280a8..d2cd2a9d4f 100644
 --- a/util/qemu-sockets.c
 +++ b/util/qemu-sockets.c
 @@ -457,6 +457,19 @@ int inet_connect_saddr(InetSocketAddress *saddr, 
 Error **errp)
}

freeaddrinfo(res);
 +
 +if (saddr->keepalive) {
>>>
>>> IIUC, best practice is to use 'saddr->has_keepalive && saddr->keepalive'
>>
>> As I remember, now all optional fields are zeroed. But I'm not against 
>> stricter condition.
> 
> As far as I'm concerned, relying on zero-initialization of optional
> members is fine.
> 
>>
>>>
 +int val = 1;
 +int ret = qemu_setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE,
 +  &val, sizeof(val));
 +
 +if (ret < 0) {
 +error_setg_errno(errp, errno, "Unable to set KEEPALIVE");
 +close(sock);
 +return -1;
 +}
 +}
 +
return sock;
}
> 
> Possibly ignorant question: why obey the keep-alive option for active
> connections (made with inet_connect_saddr()), but not passive ones (made
> with inet_listen_saddr() + accept())?

Hmm. It's a bit trickier: it seems I can't do it at the socket level, since I
get the keep-alive parameter for the listen part, but keep-alive should be
enabled on the socket returned from accept. So this should be implemented at
the qio_channel level. I'd prefer not to implement it now; we are currently
only interested in keep-alive for the client, and keep-alive generally seems
more useful on the client side.
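
For illustration, a minimal sketch of what the passive side would need (plain
POSIX calls, not the actual qio_channel code; accept_with_keepalive is a
made-up helper):

#include <sys/socket.h>
#include <unistd.h>

static int accept_with_keepalive(int listen_fd)
{
    /* Apply SO_KEEPALIVE to the connected fd returned by accept(). */
    int fd = accept(listen_fd, NULL, NULL);

    if (fd >= 0) {
        int val = 1;

        if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, sizeof(val)) < 0) {
            close(fd);
            return -1;
        }
    }
    return fd;
}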

> 
> Do you need to update inet_parse()?

will do, thank you for reviewing!



-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PULL v2 00/39] tcg: Move softmmu tlb into CPUNegativeOffsetState

2019-06-10 Thread Peter Maydell
On Mon, 10 Jun 2019 at 15:59, Richard Henderson
 wrote:
>
> V2 should fix a typo affecting OpenBSD.
>
>
> r~
>
> The following changes since commit 19735c837ae2056b4651720290eda59498eca65a:
>
>   Merge remote-tracking branch 
> 'remotes/kraxel/tags/usb-20190607-pull-request' into staging (2019-06-10 
> 11:53:19 +0100)
>
> are available in the Git repository at:
>
>   https://github.com/rth7680/qemu.git tags/pull-tcg-20190610
>
> for you to fetch changes up to 43b3952dea0f763ceeaa2f119c473b5cc6d29c90:
>
>   tcg/arm: Remove mostly unreachable tlb special case (2019-06-10 07:03:42 
> -0700)
>
> 
> Move softmmu tlb into CPUNegativeOffsetState
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.1
for any user-visible changes.

-- PMM



Re: [Qemu-devel] [QUESTION] How to reduce network latency to improve netperf TCP_RR drastically?

2019-06-10 Thread Michael S. Tsirkin
On Tue, Jun 04, 2019 at 03:10:43PM +0800, Like Xu wrote:
> Hi Michael,
> 
> At https://www.linux-kvm.org/page/NetworkingTodo, there is an entry for
> network latency saying:
> 
> ---
> reduce networking latency:
>  allow handling short packets from softirq or VCPU context
>  Plan:
>We are going through the scheduler 3 times
>(could be up to 5 if softirqd is involved)
>Consider RX: host irq -> io thread -> VCPU thread ->
>guest irq -> guest thread.
>This adds a lot of latency.
>We can cut it by some 1.5x if we do a bit of work
>either in the VCPU or softirq context.
>  Testing: netperf TCP RR - should be improved drastically
>   netperf TCP STREAM guest to host - no regression
>  Contact: MST
> ---
> 
> I am trying to make some contributions to improving netperf TCP_RR.
> Could you please share more ideas or plans or implemental details to make it
> happen?
> 
> Thanks,
> Like Xu


So some of this did happen. netif_receive_skb is now called
directly from tun_get_user.

Question is about the rx/tun_put_user path now.

If the vhost thread is idle and there's a single packet
outstanding, then maybe we can forward the packet to userspace
directly from the BH without waking up the thread.

For this to work we need to map some userspace memory into the kernel
ahead of time. For example, maybe it can happen when
guest adds RX buffers? Copying Jason who's looking into
memory mapping matters.

-- 
MST



Re: [Qemu-devel] [PATCH 1/7] iotests: allow Valgrind checking all QEMU processes

2019-06-10 Thread Eric Blake
On 6/10/19 10:02 AM, Andrey Shinkevich wrote:
> 
> 
> On 10/06/2019 17:24, Eric Blake wrote:
>> On 6/9/19 1:35 PM, Andrey Shinkevich wrote:
>>> With the '-valgrind' option, let all the QEMU processes be run under
>>> the Valgrind tool. The Valgrind own parameters may be set with its
>>> environment variable VALGRIND_OPTS, e.g.
>>> VALGRIND_OPTS="--leak-check=yes" ./check -qcow2 -valgrind 
>>
>> Let's spell this --valgrind; long options should prefer the use of --
>> (as in getopt_long), whether or not we also have a reason to support
>> -valgrind (as in getopt_long_only). Yes, qemu is an oddball in this
>> regards, but no need to make it worse.
>>
> 
> Thank you, Eric. That sounds good, but the short option '-valgrind' is
> preexisting in QEMU. Should I create a new patch for the long option?
> If so, will we have both options supported by QEMU?

Oh, you're talking about qemu-iotests/check already supporting merely
'-valgrind', not 'qemu-kvm' or '*/qemu-system-*'.  ./check is already an
oddball for not permitting double dash, but at this point, normalizing
it is a lot of churn. So it becomes a tradeoff of how much grunt work you
really want to do.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





[Qemu-devel] [PATCH 2/2] docs: Build and install specs manual

2019-06-10 Thread Peter Maydell
Now we have some rST format docs in the docs/specs/ manual, we should
actually build and install it.

Signed-off-by: Peter Maydell 
---
 Makefile   |  7 ++-
 docs/specs/conf.py | 16 
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 docs/specs/conf.py

diff --git a/Makefile b/Makefile
index 8e2fc6624c3..cfb18f15254 100644
--- a/Makefile
+++ b/Makefile
@@ -731,6 +731,7 @@ distclean: clean
rm -rf .doctrees
$(call clean-manual,devel)
$(call clean-manual,interop)
+   $(call clean-manual,specs)
for d in $(TARGET_DIRS); do \
rm -rf $$d || exit 1 ; \
 done
@@ -781,6 +782,7 @@ endef
 .PHONY: install-sphinxdocs
 install-sphinxdocs: sphinxdocs
$(call install-manual,interop)
+   $(call install-manual,specs)
 
 install-doc: $(DOCS) install-sphinxdocs
$(INSTALL_DIR) "$(DESTDIR)$(qemu_docdir)"
@@ -962,7 +964,7 @@ docs/version.texi: $(SRC_PATH)/VERSION config-host.mak
 # and handles "don't rebuild things unless necessary" itself.
 # The '.doctrees' files are cached information to speed this up.
 .PHONY: sphinxdocs
-sphinxdocs: $(MANUAL_BUILDDIR)/devel/index.html 
$(MANUAL_BUILDDIR)/interop/index.html
+sphinxdocs: $(MANUAL_BUILDDIR)/devel/index.html 
$(MANUAL_BUILDDIR)/interop/index.html $(MANUAL_BUILDDIR)/specs/index.html
 
 # Canned command to build a single manual
 build-manual = $(call quiet-command,sphinx-build $(if $(V),,-q) -W -n -b html 
-D version=$(VERSION) -D release="$(FULL_VERSION)" -d .doctrees/$1 
$(SRC_PATH)/docs/$1 $(MANUAL_BUILDDIR)/$1 ,"SPHINX","$(MANUAL_BUILDDIR)/$1")
@@ -975,6 +977,9 @@ $(MANUAL_BUILDDIR)/devel/index.html: $(call 
manual-deps,devel)
 $(MANUAL_BUILDDIR)/interop/index.html: $(call manual-deps,interop)
$(call build-manual,interop)
 
+$(MANUAL_BUILDDIR)/specs/index.html: $(call manual-deps,specs)
+   $(call build-manual,specs)
+
 qemu-options.texi: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > 
$@,"GEN","$@")
 
diff --git a/docs/specs/conf.py b/docs/specs/conf.py
new file mode 100644
index 000..4d56f3ae13c
--- /dev/null
+++ b/docs/specs/conf.py
@@ -0,0 +1,16 @@
+# -*- coding: utf-8 -*-
+#
+# QEMU documentation build configuration file for the 'specs' manual.
+#
+# This includes the top level conf file and then makes any necessary tweaks.
+import sys
+import os
+
+qemu_docdir = os.path.abspath("..")
+parent_config = os.path.join(qemu_docdir, "conf.py")
+exec(compile(open(parent_config, "rb").read(), parent_config, 'exec'))
+
+# This slightly misuses the 'description', but is the best way to get
+# the manual title to appear in the sidebar.
+html_theme_options['description'] = \
+u'System Emulation Guest Hardware Specifications'
-- 
2.20.1




[Qemu-devel] [PATCH 0/2] docs: build and install the 'specs' manual

2019-06-10 Thread Peter Maydell
(Sorry for the initial send where I sent only the cover
letter and forgot to tell git send-email to send the
patches as well...)

With the recent addition of the XIVE documentation, we
now have some actual .rST format documentation in
docs/specs, so we should start building and installing
this manual.

Patch 1 in this series fixes up some minor problems with
docs/specs/index.rst which we didn't notice because we
weren't building the manual. Patch 2 adds the makefile
and config runes to do the build and install.

NB: there's a trivial textual conflict in Makefile
with the 'convert qemu-ga' patch I sent earlier today;
this applies direct on top of master.

thanks
-- PMM


Peter Maydell (2):
  docs/specs/index.rst: Fix minor syntax issues
  docs: Build and install specs manual

 Makefile |  7 ++-
 docs/specs/conf.py   | 16 
 docs/specs/index.rst |  9 +
 3 files changed, 27 insertions(+), 5 deletions(-)
 create mode 100644 docs/specs/conf.py

-- 
2.20.1




[Qemu-devel] [PATCH 1/2] docs/specs/index.rst: Fix minor syntax issues

2019-06-10 Thread Peter Maydell
The docs/specs/index.rst has a couple of minor issues which
we didn't notice because we weren't building the manual:
 * the ToC entry for the new PPC XIVE docs points to
   a nonexistent file
 * the initial comment needs to be marked by '..', not '.',
   or it will appear in the output
 * the title doesn't match the capitalization used by
   the existing interop or devel manuals, and uses
   'full-system emulation' rather than the 'system emulation'
   that the interop manual title uses

Fix these minor issues before we start trying to build the manual.

Signed-off-by: Peter Maydell 
---
 docs/specs/index.rst | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/docs/specs/index.rst b/docs/specs/index.rst
index 2e927519c2e..40adb97c5eb 100644
--- a/docs/specs/index.rst
+++ b/docs/specs/index.rst
@@ -1,8 +1,8 @@
-. This is the top level page for the 'specs' manual
+.. This is the top level page for the 'specs' manual
 
 
-QEMU full-system emulation guest hardware specifications
-
+QEMU System Emulation Guest Hardware Specifications
+===
 
 
 Contents:
@@ -10,4 +10,5 @@ Contents:
 .. toctree::
:maxdepth: 2
 
-   xive
+   ppc-xive
+   ppc-spapr-xive
-- 
2.20.1




[Qemu-devel] [PATCH 0/2] docs: build and install the 'specs' manual

2019-06-10 Thread Peter Maydell
With the recent addition of the XIVE documentation, we
now have some actual .rST format documentation in
docs/specs, so we should start building and installing
this manual.

Patch 1 in this series fixes up some minor problems with
docs/specs/index.rst which we didn't notice because we
weren't building the manual. Patch 2 adds the makefile
and config runes to do the build and install.

NB: there's a trivial textual conflict in Makefile
with the 'convert qemu-ga' patch I sent earlier today;
this applies direct on top of master.

thanks
-- PMM


Peter Maydell (2):
  docs/specs/index.rst: Fix minor syntax issues
  docs: Build and install specs manual

 Makefile |  7 ++-
 docs/specs/conf.py   | 16 
 docs/specs/index.rst |  9 +
 3 files changed, 27 insertions(+), 5 deletions(-)
 create mode 100644 docs/specs/conf.py

-- 
2.20.1




Re: [Qemu-devel] [PATCH v21 4/7] target/avr: Add instruction translation

2019-06-10 Thread Richard Henderson
On 6/6/19 12:30 PM, Michael Rolnik wrote:
> +void avr_cpu_tcg_init(void)
> +{
> +int i;
> +
> +#define AVR_REG_OFFS(x) offsetof(CPUAVRState, x)
> +cpu_pc = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(pc_w), "pc");
> +cpu_Cf = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(sregC), "Cf");
> +cpu_Zf = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(sregZ), "Zf");
> +cpu_Nf = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(sregN), "Nf");
> +cpu_Vf = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(sregV), "Vf");
> +cpu_Sf = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(sregS), "Sf");
> +cpu_Hf = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(sregH), "Hf");
> +cpu_Tf = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(sregT), "Tf");
> +cpu_If = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(sregI), "If");
> +cpu_rampD = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(rampD), 
> "rampD");
> +cpu_rampX = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(rampX), 
> "rampX");
> +cpu_rampY = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(rampY), 
> "rampY");
> +cpu_rampZ = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(rampZ), 
> "rampZ");
> +cpu_eind = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(eind), "eind");
> +cpu_sp = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(sp), "sp");
> +cpu_skip = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(skip), "skip");
> +
> +for (i = 0; i < 32; i++) {
> +char name[16];
> +
> +sprintf(name, "r[%d]", i);
> +
> +cpu_r[i] = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(r[i]), name);
> +}
> +}

These register names need to be permanently allocated.
I suggest

static const char reg_names[32][8] = {
"r[0]", "r[1]" ...
};
cpu_r[i] = tcg_global_mem_new_i32(cpu_env, AVR_REG_OFFS(r[i]),
  reg_names[i]);


r~



Re: [Qemu-devel] [PATCH v21 3/7] target/avr: Add instruction decoding

2019-06-10 Thread Richard Henderson
On 6/6/19 12:30 PM, Michael Rolnik wrote:
> +STS 1001 001 .  @ldst_s
> +
> -- 

checkpatch error for extra blank line at end of file.


r~



Re: [Qemu-devel] [PATCH 1/7] iotests: allow Valgrind checking all QEMU processes

2019-06-10 Thread Andrey Shinkevich


On 10/06/2019 17:24, Eric Blake wrote:
> On 6/9/19 1:35 PM, Andrey Shinkevich wrote:
>> With the '-valgrind' option, let all the QEMU processes be run under
>> the Valgrind tool. The Valgrind own parameters may be set with its
>> environment variable VALGRIND_OPTS, e.g.
>> VALGRIND_OPTS="--leak-check=yes" ./check -qcow2 -valgrind 
> 
> Let's spell this --valgrind; long options should prefer the use of --
> (as in getopt_long), whether or not we also have a reason to support
> -valgrind (as in getopt_long_only). Yes, qemu is an oddball in this
> regards, but no need to make it worse.
> 

Thank you, Eric. That sounds good, but the short option '-valgrind' is
preexisting in QEMU. Should I create a new patch for the long option?
If so, will we have both options supported by QEMU?
Andrey

>>
>> Signed-off-by: Andrey Shinkevich 
>> ---
>>   tests/qemu-iotests/common.rc | 65 
>> 
>>   1 file changed, 48 insertions(+), 17 deletions(-)
>>
>> diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
>> index 93f8738..3caaca4 100644
> 
> 

-- 
With the best regards,
Andrey Shinkevich


[Qemu-devel] [PULL v2 00/39] tcg: Move softmmu tlb into CPUNegativeOffsetState

2019-06-10 Thread Richard Henderson
V2 should fix a typo affecting OpenBSD.


r~

The following changes since commit 19735c837ae2056b4651720290eda59498eca65a:

  Merge remote-tracking branch 'remotes/kraxel/tags/usb-20190607-pull-request' 
into staging (2019-06-10 11:53:19 +0100)

are available in the Git repository at:

  https://github.com/rth7680/qemu.git tags/pull-tcg-20190610

for you to fetch changes up to 43b3952dea0f763ceeaa2f119c473b5cc6d29c90:

  tcg/arm: Remove mostly unreachable tlb special case (2019-06-10 07:03:42 
-0700)


Move softmmu tlb into CPUNegativeOffsetState


Richard Henderson (39):
  tcg: Fold CPUTLBWindow into CPUTLBDesc
  tcg: Split out target/arch/cpu-param.h
  tcg: Create struct CPUTLB
  cpu: Define CPUArchState with typedef
  cpu: Define ArchCPU
  cpu: Replace ENV_GET_CPU with env_cpu
  cpu: Introduce env_archcpu
  target/alpha: Use env_cpu, env_archcpu
  target/arm: Use env_cpu, env_archcpu
  target/cris: Reindent mmu.c
  target/cris: Reindent op_helper.c
  target/cris: Use env_cpu, env_archcpu
  target/hppa: Use env_cpu, env_archcpu
  target/i386: Use env_cpu, env_archcpu
  target/lm32: Use env_cpu, env_archcpu
  target/m68k: Use env_cpu
  target/microblaze: Use env_cpu, env_archcpu
  target/mips: Use env_cpu, env_archcpu
  target/moxie: Use env_cpu, env_archcpu
  target/nios2: Use env_cpu, env_archcpu
  target/openrisc: Use env_cpu, env_archcpu
  target/ppc: Use env_cpu, env_archcpu
  target/riscv: Use env_cpu, env_archcpu
  target/s390x: Use env_cpu, env_archcpu
  target/sh4: Use env_cpu, env_archcpu
  target/sparc: Use env_cpu, env_archcpu
  target/tilegx: Use env_cpu
  target/tricore: Use env_cpu
  target/unicore32: Use env_cpu, env_archcpu
  target/xtensa: Use env_cpu, env_archcpu
  cpu: Move ENV_OFFSET to exec/gen-icount.h
  cpu: Introduce cpu_set_cpustate_pointers
  cpu: Introduce CPUNegativeOffsetState
  cpu: Move icount_decr to CPUNegativeOffsetState
  cpu: Move the softmmu tlb to CPUNegativeOffsetState
  cpu: Remove CPU_COMMON
  tcg/aarch64: Use LDP to load tlb mask+table
  tcg/arm: Use LDRD to load tlb mask+table
  tcg/arm: Remove mostly unreachable tlb special case

 accel/tcg/atomic_template.h   |   8 +-
 include/exec/cpu-all.h|  69 +++
 include/exec/cpu-defs.h   | 111 ++--
 include/exec/cpu_ldst.h   |   6 +-
 include/exec/cpu_ldst_template.h  |   6 +-
 include/exec/cpu_ldst_useronly_template.h |   6 +-
 include/exec/gen-icount.h |  14 +-
 include/exec/softmmu-semi.h   |  16 +-
 include/qom/cpu.h |  40 +-
 linux-user/cpu_loop-common.h  |   2 +-
 linux-user/m68k/target_cpu.h  |   2 +-
 target/alpha/cpu-param.h  |  31 ++
 target/alpha/cpu.h|  40 +-
 target/arm/cpu-param.h|  34 ++
 target/arm/cpu.h  |  52 +-
 target/cris/cpu-param.h   |  17 +
 target/cris/cpu.h |  25 +-
 target/hppa/cpu-param.h   |  34 ++
 target/hppa/cpu.h |  38 +-
 target/i386/cpu-param.h   |  28 +
 target/i386/cpu.h |  40 +-
 target/lm32/cpu-param.h   |  17 +
 target/lm32/cpu.h |  25 +-
 target/m68k/cpu-param.h   |  22 +
 target/m68k/cpu.h |  28 +-
 target/microblaze/cpu-param.h |  18 +
 target/microblaze/cpu.h   |  63 +--
 target/mips/cpu-param.h   |  29 ++
 target/mips/cpu.h |  21 +-
 target/mips/mips-defs.h   |  15 -
 target/moxie/cpu-param.h  |  17 +
 target/moxie/cpu.h|  29 +-
 target/nios2/cpu-param.h  |  21 +
 target/nios2/cpu.h|  33 +-
 target/openrisc/cpu-param.h   |  17 +
 target/openrisc/cpu.h |  31 +-
 target/ppc/cpu-param.h|  37 ++
 target/ppc/cpu.h  |  61 +--
 target/ppc/helper_regs.h  |   4 +-
 target/riscv/cpu-param.h  |  23 +
 target/riscv/cpu.h|  34 +-
 target/s390x/cpu-param.h  |  17 +
 target/s390x/cpu.h|  31 +-
 target/sh4/cpu-param.h|  21 +
 target/sh4/cpu.h  |  30 +-
 target/sparc/cpu-param.h  |  28 +
 target/sparc/cpu.h|  36 +-
 target/tilegx/cpu-param.h |  17 +
 target/tilegx/cpu.h   |  23 +-
 target/tricore/cpu-param.h|  17 +
 target

[Qemu-devel] [PATCH 14/39] target/i386: Use env_cpu, env_archcpu

2019-06-10 Thread Richard Henderson
Cleanup in the boilerplate that each target must define.
Replace x86_env_get_cpu with env_archcpu.  The combination
CPU(x86_env_get_cpu) should have used ENV_GET_CPU to begin;
use env_cpu now.

Reviewed-by: Alistair Francis 
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/i386/cpu.h  |  5 -
 bsd-user/main.c|  3 +--
 hw/i386/kvmvapic.c |  4 ++--
 hw/i386/pc.c   |  2 +-
 linux-user/i386/cpu_loop.c |  2 +-
 linux-user/i386/signal.c   |  2 +-
 linux-user/vm86.c  | 18 +-
 target/i386/bpt_helper.c   |  4 ++--
 target/i386/cpu.c  |  4 ++--
 target/i386/excp_helper.c  |  2 +-
 target/i386/fpu_helper.c   |  2 +-
 target/i386/helper.c   | 16 ++--
 target/i386/misc_helper.c  | 24 +++-
 target/i386/seg_helper.c   | 14 +++---
 target/i386/smm_helper.c   |  4 ++--
 target/i386/svm_helper.c   | 22 +++---
 16 files changed, 58 insertions(+), 70 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 103fd709b0..709d88cfcf 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1480,11 +1480,6 @@ struct X86CPU {
 int32_t hv_max_vps;
 };
 
-static inline X86CPU *x86_env_get_cpu(CPUX86State *env)
-{
-return container_of(env, X86CPU, env);
-}
-
 #define ENV_OFFSET offsetof(X86CPU, env)
 
 #ifndef CONFIG_USER_ONLY
diff --git a/bsd-user/main.c b/bsd-user/main.c
index 6192e9d91e..53e1f42408 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -140,8 +140,7 @@ static void set_idt(int n, unsigned int dpl)
 
 void cpu_loop(CPUX86State *env)
 {
-X86CPU *cpu = x86_env_get_cpu(env);
-CPUState *cs = CPU(cpu);
+CPUState *cs = env_cpu(env);
 int trapnr;
 abi_ulong pc;
 //target_siginfo_t info;
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 70f6f26a94..fe5b12ef6e 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -152,7 +152,7 @@ static void update_guest_rom_state(VAPICROMState *s)
 
 static int find_real_tpr_addr(VAPICROMState *s, CPUX86State *env)
 {
-CPUState *cs = CPU(x86_env_get_cpu(env));
+CPUState *cs = env_cpu(env);
 hwaddr paddr;
 target_ulong addr;
 
@@ -279,7 +279,7 @@ instruction_ok:
 
 static int update_rom_mapping(VAPICROMState *s, CPUX86State *env, target_ulong 
ip)
 {
-CPUState *cs = CPU(x86_env_get_cpu(env));
+CPUState *cs = env_cpu(env);
 hwaddr paddr;
 uint32_t rom_state_vaddr;
 uint32_t pos, patch, offset;
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index edc240bcbf..1b08b56362 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -406,7 +406,7 @@ uint64_t cpu_get_tsc(CPUX86State *env)
 /* IRQ handling */
 int cpu_get_pic_interrupt(CPUX86State *env)
 {
-X86CPU *cpu = x86_env_get_cpu(env);
+X86CPU *cpu = env_archcpu(env);
 int intno;
 
 if (!kvm_irqchip_in_kernel()) {
diff --git a/linux-user/i386/cpu_loop.c b/linux-user/i386/cpu_loop.c
index 51cfa006c9..71da24384f 100644
--- a/linux-user/i386/cpu_loop.c
+++ b/linux-user/i386/cpu_loop.c
@@ -82,7 +82,7 @@ static void set_idt(int n, unsigned int dpl)
 
 void cpu_loop(CPUX86State *env)
 {
-CPUState *cs = CPU(x86_env_get_cpu(env));
+CPUState *cs = env_cpu(env);
 int trapnr;
 abi_ulong pc;
 abi_ulong ret;
diff --git a/linux-user/i386/signal.c b/linux-user/i386/signal.c
index fecb4c99c3..97a39204cc 100644
--- a/linux-user/i386/signal.c
+++ b/linux-user/i386/signal.c
@@ -198,7 +198,7 @@ static void setup_sigcontext(struct target_sigcontext *sc,
 struct target_fpstate *fpstate, CPUX86State *env, abi_ulong mask,
 abi_ulong fpstate_addr)
 {
-CPUState *cs = CPU(x86_env_get_cpu(env));
+CPUState *cs = env_cpu(env);
 #ifndef TARGET_X86_64
 uint16_t magic;
 
diff --git a/linux-user/vm86.c b/linux-user/vm86.c
index 9c393df424..2fa7a89edc 100644
--- a/linux-user/vm86.c
+++ b/linux-user/vm86.c
@@ -72,7 +72,7 @@ static inline unsigned int vm_getl(CPUX86State *env,
 
 void save_v86_state(CPUX86State *env)
 {
-CPUState *cs = CPU(x86_env_get_cpu(env));
+CPUState *cs = env_cpu(env);
 TaskState *ts = cs->opaque;
 struct target_vm86plus_struct * target_v86;
 
@@ -132,7 +132,7 @@ static inline void return_to_32bit(CPUX86State *env, int 
retval)
 
 static inline int set_IF(CPUX86State *env)
 {
-CPUState *cs = CPU(x86_env_get_cpu(env));
+CPUState *cs = env_cpu(env);
 TaskState *ts = cs->opaque;
 
 ts->v86flags |= VIF_MASK;
@@ -145,7 +145,7 @@ static inline int set_IF(CPUX86State *env)
 
 static inline void clear_IF(CPUX86State *env)
 {
-CPUState *cs = CPU(x86_env_get_cpu(env));
+CPUState *cs = env_cpu(env);
 TaskState *ts = cs->opaque;
 
 ts->v86flags &= ~VIF_MASK;
@@ -163,7 +163,7 @@ static inline void clear_AC(CPUX86State *env)
 
 static inline int set_vflags_long(unsigned long eflags, CPUX86State *env)
 {
-CPUState *cs = CPU(x86_env_get_cpu(env));
+CPUState *cs = env_cpu(env);
 TaskState *ts = cs->opaque;
 
 

Re: [Qemu-devel] [PATCH] qemu-ga: Convert invocation documentation to rST

2019-06-10 Thread Peter Maydell
On Mon, 10 Jun 2019 at 14:45, Peter Maydell  wrote:
>
> The qemu-ga documentation is currently in qemu-ga.texi in
> Texinfo format, which we present to the user as:
>  * a qemu-ga manpage
>  * a section of the main qemu-doc HTML documentation
>
> Convert the documentation to rST format, and present it to
> the user as:
>  * a qemu-ga manpage
>  * part of the interop/ Sphinx manual
>
> Signed-off-by: Peter Maydell 
> ---
> This is part of my general proposals about how we should
> transition away from texinfo to sphinx (written up at
> https://wiki.qemu.org/Features/Documentation). It's the
> first part of step 3 (convert standalone manpages), so it's
> interesting as a demonstration of Sphinx generating manpages
> as well as HTML. The format of the output manpage is broadly
> the same, though there are a few minor differences because
> rST doesn't support quite the same kinds of output. (It also
> fixes a bug where the current manpage renders some text intended
> to be in bold as literally "B".)
>
> Having the infrastructure for creating a simple manpage via
> Sphinx should be a useful assist for the work in step 1 of the
> transition plan that involves conversion of the auto-generated
> qemu-ga-ref and qemu-qmp-ref (which need to produce manpages).
>
> The non-manpage HTML version of the doc moves from qemu-doc.html
> into docs/interop/ (its final location in the new 5-manual setup).
>
>  Makefile |  13 ++--
>  MAINTAINERS  |   2 +-
>  docs/conf.py |  18 ++---
>  docs/interop/conf.py |   7 ++
>  docs/interop/index.rst   |   1 +
>  docs/interop/qemu-ga.rst | 133 +
>  qemu-doc.texi|   5 --
>  qemu-ga.texi | 137 ---
>  8 files changed, 161 insertions(+), 155 deletions(-)
>  create mode 100644 docs/interop/qemu-ga.rst
>  delete mode 100644 qemu-ga.texi

I just realised that I forgot to update the Makefile 'install-manual'
macro to filter out the generated docs/interop/qemu-ga.8 file,
so at the moment 'make install' will put a copy into
/usr/local/share/doc/qemu/interop/ as well as into
/usr/local/share/man/man8/. I'll fix that in v2.

thanks
-- PMM



Re: [Qemu-devel] [Qemu-block] [PATCH] file-posix: unlock qemu_global_mutex before pread when attach disk

2019-06-10 Thread l00284672
The pread will hang when attaching a disk just when the backend storage
network disconnects.

I think the locking range of qemu_global_mutex is too large when doing a
QMP operation. What does the qemu_global_mutex really protect? What is
the risk of unlocking qemu_global_mutex in QMP?


On 2019/6/10 21:51, Paolo Bonzini wrote:

On 10/06/19 15:34, Zhengui li wrote:

When doing QMP such as drive_add, the qemu main thread locks the
qemu_global_mutex and does pread in raw_probe_alignment. Pread is a
synchronous operation. If the backend storage network has a large delay
or IO pressure is too high, the pread operation will not return for
a long time, which means the vcpu thread can't acquire qemu_global_mutex
and is unable to be scheduled for a long time. So a virtual machine CPU
soft lockup happens.

The qemu main thread should not hold qemu_global_mutex for a long time
when doing QMP that involves a synchronous IO operation such as pread,
ioctl, etc. So this patch unlocks qemu_global_mutex before synchronous
IO operations such as pread.

These preads are for 512-4096 bytes, can they really last much longer
than the "open" that precedes them?  If pread of 4K can trigger a soft
lockup, things are really screwed up---and it's hard to be sure that all
callers of raw_probe_alignment are okay with releasing the global mutex.

Paolo

.




Re: [Qemu-devel] Need help generating instruction traces from the emulator.

2019-06-10 Thread Peter Maydell
On Mon, 10 Jun 2019 at 15:24, Nisarg Ujjainkar
 wrote:
>
> Hello,
>
> I am using qemu based aosp (emu-master-dev branch of the aosp) to generate
> the instruction traces for android os running on the ARM architecture.
>
> I am able to generate the CPU instruction using the qemu invocation
>  flags. For the
> purpose of my study, I need all the memory requests from all the IPs and so
> far I only have the memory requests from the CPU.

This isn't supported by QEMU's logging infrastructure.
You might be able to find places to add suitable logging,
but you'd be looking at modifying the source code of the
relevant devices to do that.
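
For example, a rough sketch of what that can look like in a device model
("mydev" is made up; the MemoryRegionOps callback shape and qemu_log() are
the real hooks you would use):

#include "qemu/osdep.h"
#include "qemu/log.h"
#include "exec/memory.h"

static uint64_t mydev_read(void *opaque, hwaddr addr, unsigned size)
{
    /* Log every guest read of this device's MMIO region. */
    qemu_log("mydev: read  addr=0x%" HWADDR_PRIx " size=%u\n", addr, size);
    return 0;
}

static void mydev_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
{
    /* Log every guest write, including the value written. */
    qemu_log("mydev: write addr=0x%" HWADDR_PRIx " val=0x%" PRIx64 " size=%u\n",
             addr, val, size);
}

static const MemoryRegionOps mydev_ops = {
    .read = mydev_read,
    .write = mydev_write,
    .endianness = DEVICE_NATIVE_ENDIAN,
};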

thanks
-- PMM



Re: [Qemu-devel] [PATCH] RFC: qio: Improve corking of TLS sessions

2019-06-10 Thread Eric Blake
On 6/10/19 9:02 AM, Eric Blake wrote:

> 
> send(MSG_MORE)
> send()
> 
> is ideal; under the hood, we can translate it to:
> 
> send(MSG_MORE)
>   gnutls_record_cork()
>   gnutls_record_send()
> send()
>   if (size > threshold) {
> gnutls_record_uncork()
> gnutls_record_send()
>   } else {
> gnutls_record_send()
> gnutls_record_uncork()
>   }
> 
> So we really need a way to plumb a MSG_MORE flag for senders to use,
> when they KNOW they will be sending back-to-back pieces and where the
> first piece is short, but it is not yet obvious whether the second piece
> is short or long.

This is what I meant to say,

> 
> MSG_MORE was lon the next message to go through the stack, if the
> previous message next paccork for

this was an editing accident on incomplete thoughts.  But I wanted to add:

Setting up the ability to pass MSG_MORE through the qio stack will
require either an update to ALL callers of qio_write to pass a flags
argument (usually 0), or to add a set of new entry points to qio for the
few callers that want to pass a non-zero flags argument (for now, nbd
and sheepdog).
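
For illustration, a minimal sketch of that threshold logic (assumes a
gnutls_session_t and a hypothetical THRESHOLD; not the actual qio code):

#include <gnutls/gnutls.h>
#include <stdbool.h>
#include <sys/types.h>

#define THRESHOLD 1024  /* hypothetical cut-off for a "short" write */

static ssize_t tls_send(gnutls_session_t s, const void *buf, size_t len,
                        bool more)
{
    ssize_t ret;

    if (more) {
        /* Caller promised more data: queue this piece without flushing. */
        gnutls_record_cork(s);
        return gnutls_record_send(s, buf, len);
    }
    if (len > THRESHOLD) {
        /* Long tail: flush whatever was queued, then send the big piece. */
        gnutls_record_uncork(s, 0);
        return gnutls_record_send(s, buf, len);
    }
    /* Short tail: batch it with the queued data, then flush everything. */
    ret = gnutls_record_send(s, buf, len);
    gnutls_record_uncork(s, 0);
    return ret;
}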

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [PATCH 1/7] iotests: allow Valgrind checking all QEMU processes

2019-06-10 Thread Eric Blake
On 6/9/19 1:35 PM, Andrey Shinkevich wrote:
> With the '-valgrind' option, let all the QEMU processes be run under
> the Valgrind tool. The Valgrind own parameters may be set with its
> environment variable VALGRIND_OPTS, e.g.
> VALGRIND_OPTS="--leak-check=yes" ./check -qcow2 -valgrind 

Let's spell this --valgrind; long options should prefer the use of --
(as in getopt_long), whether or not we also have a reason to support
-valgrind (as in getopt_long_only). Yes, qemu is an oddball in this
regards, but no need to make it worse.

> 
> Signed-off-by: Andrey Shinkevich 
> ---
>  tests/qemu-iotests/common.rc | 65 
> 
>  1 file changed, 48 insertions(+), 17 deletions(-)
> 
> diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
> index 93f8738..3caaca4 100644


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [PATCH v3 0/8] WIP: Multifd compression support

2019-06-10 Thread Eric Blake
On 5/20/19 1:35 AM, Wei Yang wrote:
> On Wed, May 15, 2019 at 02:15:36PM +0200, Juan Quintela wrote:
>> v3:
>> - improve the code
>> - address David and Markus comments
>> - make compression code into methods
>>  so we can add any other method ading just three functions
>>
>> Please review, as far as I know everything is ok now.
>>
>> Todo: Add zstd support
> 
> Confusion here. It is zstd or sztd?
> 
> BTW, I am not sure what it is :-)
> 
>>
>> v2:
>> - improve the code left and right
>> - Split better the zlib code
>> - rename everything to v4.1
>> - Add tests for multifd-compress zlib
>> - Parameter is now an enum (soon will see sztd)
>^^^

zstd is the name of the new compression algorithm.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [PATCH v6 6/7] block/nbd-client: nbd reconnect

2019-06-10 Thread Eric Blake
On 6/10/19 8:29 AM, Vladimir Sementsov-Ogievskiy wrote:

>> Hmm, and then, include it into BDRVNBDState? I don't remember why didn't do
>> it, and now I don't see any reason for it. We store this information anyway
>> for the whole life of the driver..
>>
>> So, if I'm going to refactor it, I have a question: is there a reason for
>> layered BDRVNBDState:
>>
>> typedef struct BDRVNBDState {
>>      NBDClientSession client;
>>
>>      /* For nbd_refresh_filename() */
>>      SocketAddress *saddr;
>>      char *export, *tlscredsid;
>> } BDRVNBDState;
>>
>> and use nbd_get_client_session everywhere instead of simple converting void 
>> *opaque
>> to State pointer like in qcow2 for example?
>>
>> The only reason I can imagine is not to share the whole BDRVNBDState, and 
>> keep
>> things we are using only in nbd.c to be available only in nbd.c.. But I've 
>> put
>> saddr and export to NBDConnection to be used in nbd-client.c anyway. So, I 
>> think
>> it's a good moment to flatten and share BDRVNBDState and drop 
>> nbd_get_client_session.
>>
> 
> And if we are here, what is the exact purpose of splitting nbd-client.c from
> nbd.c?

I see no strong reasons to keep the separation. If a single large file
is easier to maintain than an arbitrary split between two files, I'm
open to the idea of a patch that undoes the split.


> or need of defining driver callbacks in header file, instead of keeping them 
> together
> with driver struct definition and static (like other block drivers)...
> 
> 
> And names of these two files always confused me.. nbd.c is nbd client block 
> driver, and
> driver is defined here.. But we have nbd-client.c which defines nbd client 
> driver too.
> And we also have nbd/client.c (OK, this split I understand of course, but it 
> increases
> confusion anyway).

I'm also toying with the idea of writing block/nbd.c to utilize the
relatively-new libnbd library, to see if there are any differences in
behavior by offloading NBD access to a 3rd-party library.  But that's
further down the road.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





[Qemu-devel] Need help generating instruction traces from the emulator.

2019-06-10 Thread Nisarg Ujjainkar
Hello,

I am using qemu based aosp (emu-master-dev branch of the aosp) to generate
the instruction traces for android os running on the ARM architecture.

I am able to generate the CPU instruction trace using the qemu invocation
flags. For the
purpose of my study, I need all the memory requests from all the IPs and so
far I only have the memory requests from the CPU.

Can you tell me how to generate the traces from all the other IPs (GPU, GSM
chip, etc.)? Memory requests from other IPs account for more than 30% of all
the memory requests from the SoC (source unconfirmed), so capturing the
memory requests from these IPs is crucial for my research.

Thanks and regards
Nisarg Ujjainkar

-- 
Nisarg Ujjainkar
Junior Undergraduate Student
Department of Computer Science and Engineering
IIT Gandhinagar
+91 926488 \ 9425665211


Re: [Qemu-devel] [PATCH] file-posix: unlock qemu_global_mutex before pread when attach disk

2019-06-10 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/1560173684-6264-1-git-send-email-lizhen...@huawei.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH] file-posix: unlock qemu_global_mutex before pread 
when attach disk
Type: series
Message-id: 1560173684-6264-1-git-send-email-lizhen...@huawei.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
 * [new tag]   
patchew/1560173684-6264-1-git-send-email-lizhen...@huawei.com -> 
patchew/1560173684-6264-1-git-send-email-lizhen...@huawei.com
Switched to a new branch 'test'
a5549055a3 file-posix: unlock qemu_global_mutex before pread when attach disk

=== OUTPUT BEGIN ===
ERROR: Missing Signed-off-by: line(s)

total: 1 errors, 0 warnings, 9 lines checked

Commit a5549055a30d (file-posix: unlock qemu_global_mutex before pread when 
attach disk) has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/1560173684-6264-1-git-send-email-lizhen...@huawei.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] [PATCH] qemu-ga: Convert invocation documentation to rST

2019-06-10 Thread Peter Maydell
On Mon, 10 Jun 2019 at 14:56, Paolo Bonzini  wrote:
>
> On 10/06/19 15:45, Peter Maydell wrote:
> > +# Canned command to build manpages from a single manual
> > +build-manpages = $(call quiet-command,CONFDIR="$(qemu_confdir)" 
> > sphinx-build $(if $(V),,-q) -W -n -b man -D version=$(VERSION) -D 
> > release="$(FULL_VERSION)" -d .doctrees/$1 $(SRC_PATH)/docs/$1 
> > $(MANUAL_BUILDDIR)/$1 ,"SPHINX","$(MANUAL_BUILDDIR)/$1")
>
>
> ...
>
> > +# The rst_epilog fragment is effectively included in every rST file.
> > +# We use it to define substitutions based on build config that
> > +# can then be used in the documentation. The fallback if the
> > +# environment variable is not set is for the benefit of readthedocs
> > +# style document building; our Makefile always sets the variable.
> > +confdir = os.getenv('CONFDIR', "/usr/local/etc")
>
> This should be /usr/local/etc/qemu if you want $(qemu_confdir) above and
> not $(sysconfdir).
>
> But since we always set the variable when building from the QEMU build
> system, perhaps "/etc" or "/etc/qemu" is a more useful default when
> building the manual outside QEMU?

Yeah, I wasn't sure what to set this to either...

thanks
-- PMM



Re: [Qemu-devel] [PATCH] RFC: qio: Improve corking of TLS sessions

2019-06-10 Thread Eric Blake
On 6/10/19 4:08 AM, Daniel P. Berrangé wrote:
> On Fri, Jun 07, 2019 at 05:14:14PM -0500, Eric Blake wrote:
>> Our current implementation of qio_channel_set_cork() is pointless for
>> TLS sessions: we block the underlying channel, but still hand things
>> piecemeal to gnutls which then produces multiple encryption packets.
>> Better is to directly use gnutls corking, which collects multiple
>> inputs into a single encryption packet.
>>
>> Signed-off-by: Eric Blake 
>>
>> ---
>>
>> RFC because unfortunately, I'm not sure I like the results.  My test
>> case (using latest nbdkit.git) involves sending 10G of random data
>> over NBD using parallel writes (and doing nothing on the server end;
>> this is all about timing the data transfer):
>>
>> $ dd if=/dev/urandom of=rand bs=1M count=10k
>> $ time nbdkit -p 10810 --tls=require --tls-verify-peer \
>>--tls-psk=/tmp/keys.psk --filter=stats null 10g statsfile=/dev/stderr \
>>--run '~/qemu/qemu-img convert -f raw -W -n --target-image-opts \
>>  --object tls-creds-psk,id=tls0,endpoint=client,dir=/tmp,username=eblake 
>> \
>>  rand 
>> driver=nbd,server.type=inet,server.host=localhost,server.port=10810,tls-creds=tls0'
>>
>> Pre-patch, I measured:
>> real 0m34.536s
>> user 0m29.264s
>> sys  0m4.014s
>>
>> while post-patch, it changed to:
>> real 0m36.055s
>> user 0m27.685s
>> sys  0m10.138s
>>
>> Less time spent in user space, but for this particular qemu-img
>> behavior (writing 2M chunks at a time), gnutls is now uncorking huge
>> packets and the kernel is doing enough extra work that the overall
>> program actually takes longer. :(
> 
> I wonder if the problem is that we're hitting a tradeoff between
> time spent in encryption vs time spent in network I/O.
> 
> When we don't cork, the kernel has already sent the first packet
> during the time gnutls is burning CPU encrypting the second packet.
> 
> When we do cork gnutls has to encrypt both packets before the kernel
> can send anything.

Exactly. That's why my gut feel is that having only a binary cork is too
coarse, and we instead want something more like MSG_MORE. If the flag is
set, AND the current size is small, then cork; then on the next message,
we see that we are still corked, and decide to either uncork immediately
(current message is large, corking buys us nothing useful as we're going
to split the current message anyway) or uncork after appending the
current message (current message is small enough that keeping things
corked may still be a win).  Visually,

cork
send()
send()
uncork

is too late - you can't tell whether the second send would have
benefitted from staying corked after the first.  But:

send(MSG_MORE)
send()

is ideal; under the hood, we can translate it to:

send(MSG_MORE)
  gnutls_record_cork()
  gnutls_record_send()
send()
  if (size > threshold) {
gnutls_record_uncork()
gnutls_record_send()
  } else {
gnutls_record_send()
gnutls_record_uncork()
  }

So we really need a way to plumb a MSG_MORE flag for senders to use,
when they KNOW they will be sending back-to-back pieces and where the
first piece is short, but it is not yet obvious whether the second piece
is short or long.
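
One way this could be wired up on top of gnutls, purely as a sketch (the
helper name, the threshold, and the static corked flag are placeholders, not
the proposed QIOChannel API):

    #include <gnutls/gnutls.h>
    #include <stdbool.h>

    #define SEND_MORE_THRESHOLD 1024      /* assumed cutoff for "small" writes */

    static bool corked;                   /* would live in the channel state */

    static ssize_t tls_send_more(gnutls_session_t s, const void *buf,
                                 size_t len, bool more)
    {
        ssize_t ret;

        if (more && !corked) {
            gnutls_record_cork(s);        /* start coalescing small pieces */
            corked = true;
            return gnutls_record_send(s, buf, len);
        }

        if (corked && len >= SEND_MORE_THRESHOLD) {
            /* Large payload: flush what was buffered, then send it on its
             * own, since coalescing a big write gains nothing. */
            gnutls_record_uncork(s, GNUTLS_RECORD_WAIT);
            corked = false;
            return gnutls_record_send(s, buf, len);
        }

        ret = gnutls_record_send(s, buf, len);
        if (corked && !more) {            /* small final piece: append, flush */
            gnutls_record_uncork(s, GNUTLS_RECORD_WAIT);
            corked = false;
        }
        return ret;
    }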

In other words, MSG_MORE attaches the hint to the current message about the
next message to go through the stack, so the decision whether to keep things
corked for the next packet is made up front rather than after the fact.

> 
>> For smaller I/O patterns, the effects of corking are more likely to be
>> beneficial, but I don't have a ready test case to produce that pattern
>> (short of creating a guest running fio on a device backed by nbd).
> 
> I think it would probably be useful to see guest kernels with fio
> and definitely see results for something closer to sector sized
> I/O.
> 
> 2 MB chunks is pretty unrealistic for a guest workload where there
> will be lots of sector sized I/O. Sure QEMU / guest OS can merge
> sectors, but it still feels like most I/O is going to be much smaller
> than 2 MB.

I'll have to play with other benchmarking setups, and see if I can come
up with a reliable fio test.

> 
> 
>> diff --git a/io/channel.c b/io/channel.c
>> index 2a26c2a2c0b9..3510912465ac 100644
>> --- a/io/channel.c
>> +++ b/io/channel.c
>> @@ -379,7 +379,16 @@ void qio_channel_set_cork(QIOChannel *ioc,
>>  QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
>>
>>  if (klass->io_set_cork) {
>> -klass->io_set_cork(ioc, enabled);
>> +int r = klass->io_set_cork(ioc, enabled);
>> +
>> +while (r == QIO_CHANNEL_ERR_BLOCK) {
>> +if (qemu_in_coroutine()) {
>> +qio_channel_yield(ioc, G_IO_OUT);
>> +} else {
>> +qio_channel_wait(ioc, G_IO_OUT);
>> +}
>> +r = klass->io_set_cork(ioc, enabled);
>> +}
>>  }
> 
> Interesting - did you actually hit this EAGAIN behaviour ? I wouldn't
> have expected anything to be pending in gnutls that needs retry. 

Yes. When gnutls_record_cork() is called, NOTHING goes over the wire,
but instead everything gets appended to a realloc'd buffer; when you
finally 

Re: [Qemu-devel] qgraph

2019-06-10 Thread Paolo Bonzini
On 10/06/19 15:28, Andreas Färber wrote:
> So if we want a new QMP operation, the most sense would probably make
> where-can-I-attach-type(foo) returning a list of QOM paths, showing only
> the first free slot per bus. That would allow a more efficient lookup
> implementation inside QEMU than needing to check each slot[n] property
> via qom-get after discovering it with qom-list.

Note that what Natalia is seeking is an introspection mechanism to be
used _before_ creating a virtual machine though.

Paolo



Re: [Qemu-devel] [PATCH] qemu-ga: Convert invocation documentation to rST

2019-06-10 Thread Paolo Bonzini
On 10/06/19 15:45, Peter Maydell wrote:
> +# Canned command to build manpages from a single manual
> +build-manpages = $(call quiet-command,CONFDIR="$(qemu_confdir)" sphinx-build 
> $(if $(V),,-q) -W -n -b man -D version=$(VERSION) -D 
> release="$(FULL_VERSION)" -d .doctrees/$1 $(SRC_PATH)/docs/$1 
> $(MANUAL_BUILDDIR)/$1 ,"SPHINX","$(MANUAL_BUILDDIR)/$1")


...

> +# The rst_epilog fragment is effectively included in every rST file.
> +# We use it to define substitutions based on build config that
> +# can then be used in the documentation. The fallback if the
> +# environment variable is not set is for the benefit of readthedocs
> +# style document building; our Makefile always sets the variable.
> +confdir = os.getenv('CONFDIR', "/usr/local/etc")

This should be /usr/local/etc/qemu if you want $(qemu_confdir) above and
not $(sysconfdir).

But since we always set the variable when building from the QEMU build
system, perhaps "/etc" or "/etc/qemu" is a more useful default when
building the manual outside QEMU?

Paolo

> +rst_epilog = ".. |CONFDIR| replace:: ``" + confdir + "``\n"
> +




Re: [Qemu-devel] [Qemu-block] [PATCH] file-posix: unlock qemu_global_mutex before pread when attach disk

2019-06-10 Thread Paolo Bonzini
On 10/06/19 15:34, Zhengui li wrote:
> 
> when doing a QMP command such as drive_add, the qemu main thread locks
> qemu_global_mutex and does a pread in raw_probe_alignment. pread is a
> synchronous operation. If the backend storage network has a large delay
> or the IO pressure is too high, the pread operation will not return for
> a long time, which means the vcpu thread cannot acquire qemu_global_mutex
> and cannot be scheduled for a long time. As a result, a virtual machine
> cpu soft lockup happens.
> 
> The qemu main thread should not hold qemu_global_mutex for a long time
> when handling QMP commands that involve synchronous IO operations such
> as pread, ioctl, etc. So this patch unlocks qemu_global_mutex before
> synchronous IO operations such as pread.

These preads are for 512-4096 bytes, can they really last much longer
than the "open" that precedes them?  If pread of 4K can trigger a soft
lockup, things are really screwed up---and it's hard to be sure that all
callers of raw_probe_alignment are okay with releasing the global mutex.

Paolo
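
For reference, the kind of change being discussed boils down to something
like the following sketch (helper name invented; whether every caller of
raw_probe_alignment tolerates dropping the lock here is exactly the open
question):

    #include "qemu/osdep.h"
    #include "qemu/main-loop.h"   /* qemu_mutex_lock_iothread()/unlock() */

    /* hypothetical wrapper, not the actual patch */
    static ssize_t pread_without_bql(int fd, void *buf, size_t len, off_t offset)
    {
        ssize_t ret;

        qemu_mutex_unlock_iothread();      /* release qemu_global_mutex */
        ret = pread(fd, buf, len, offset); /* slow, synchronous read */
        qemu_mutex_lock_iothread();        /* reacquire before shared state */

        return ret;
    }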



Re: [Qemu-devel] [PATCH] pc: fix possible NULL pointer dereference in pc_machine_get_device_memory_region_size()

2019-06-10 Thread Paolo Bonzini
On 10/06/19 15:50, Igor Mammedov wrote:
> QEMU will crash when device-memory-region-size property is read if 
> ms->device_memory
> wasn't initialized yet.
> 
> Crash can be reproduced with:
>  $QEMU -preconfig -qmp unix:qmp_socket,server,nowait &
>  ./scripts/qmp/qom-get -s qmp_socket /machine.device-memory-region-size
> 
> Instead of crashing return 0 if ms->device_memory hasn't been initialized.
> 
> Signed-off-by: Igor Mammedov 
> ---
> v2:
>   add reproducer to commit message
>(Markus Armbruster )
> 
>  hw/i386/pc.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index edc240b..1b7ead9 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2459,7 +2459,11 @@ pc_machine_get_device_memory_region_size(Object *obj, 
> Visitor *v,
>   Error **errp)
>  {
>  MachineState *ms = MACHINE(obj);
> -int64_t value = memory_region_size(&ms->device_memory->mr);
> +int64_t value = 0;
> +
> +if (ms->device_memory) {
> +value = memory_region_size(&ms->device_memory->mr);
> +}
>  
>  visit_type_int(v, name, &value, errp);
>  }
> 

Queued, thanks.

Paolo



[Qemu-devel] [PATCH v5 11/12] qemu-io: adds support for io_uring

2019-06-10 Thread Aarushi Mehta
Signed-off-by: Aarushi Mehta 
---
 qemu-io.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/qemu-io.c b/qemu-io.c
index 8d5d5911cb..54b82151c4 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -129,6 +129,7 @@ static void open_help(void)
 " -n, -- disable host cache, short for -t none\n"
 " -U, -- force shared permissions\n"
 " -k, -- use kernel AIO implementation (on Linux only)\n"
+" -i  -- use kernel io_uring (Linux 5.1+)\n"
 " -t, -- use the given cache mode for the image\n"
 " -d, -- use the given discard mode for the image\n"
 " -o, -- options to be given to the block driver"
@@ -188,6 +189,11 @@ static int open_f(BlockBackend *blk, int argc, char **argv)
 case 'k':
 flags |= BDRV_O_NATIVE_AIO;
 break;
+#ifdef CONFIG_LINUX_IO_URING
+case 'i':
+flags |= BDRV_O_IO_URING;
+break;
+#endif
 case 't':
 if (bdrv_parse_cache_mode(optarg, &flags, &writethrough) < 0) {
 error_report("Invalid cache option: %s", optarg);
@@ -290,6 +296,7 @@ static void usage(const char *name)
 "  -C, --copy-on-read   enable copy-on-read\n"
 "  -m, --misalign   misalign allocations for O_DIRECT\n"
 "  -k, --native-aio use kernel AIO implementation (on Linux only)\n"
+"  -i  --io_uring   use kernel io_uring (Linux 5.1+)\n"
 "  -t, --cache=MODE use the given cache mode for the image\n"
 "  -d, --discard=MODE   use the given discard mode for the image\n"
 "  -T, --trace [[enable=]][,events=][,file=]\n"
@@ -499,6 +506,7 @@ int main(int argc, char **argv)
 { "copy-on-read", no_argument, NULL, 'C' },
 { "misalign", no_argument, NULL, 'm' },
 { "native-aio", no_argument, NULL, 'k' },
+{ "io_uring", no_argument, NULL, 'i' },
 { "discard", required_argument, NULL, 'd' },
 { "cache", required_argument, NULL, 't' },
 { "trace", required_argument, NULL, 'T' },
@@ -566,6 +574,11 @@ int main(int argc, char **argv)
 case 'k':
 flags |= BDRV_O_NATIVE_AIO;
 break;
+#ifdef CONFIG_LINUX_IO_URING
+case 'i':
+flags |= BDRV_O_IO_URING;
+break;
+#endif
 case 't':
 if (bdrv_parse_cache_mode(optarg, &flags, &writethrough) < 0) {
 error_report("Invalid cache option: %s", optarg);
-- 
2.17.1
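
With this series applied and a liburing-enabled build, the new flag can be
exercised along these lines (image path is a placeholder):

    qemu-io -i -f raw -c 'read 0 4k' /path/to/test.img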




[Qemu-devel] [PATCH v5 10/12] block/io_uring: adds userspace completion polling

2019-06-10 Thread Aarushi Mehta
Signed-off-by: Aarushi Mehta 
---
 block/io_uring.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/block/io_uring.c b/block/io_uring.c
index 47e027364a..acfaa48151 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -142,6 +142,21 @@ static void qemu_luring_completion_cb(void *opaque)
 qemu_luring_process_completions_and_submit(s);
 }
 
+static bool qemu_luring_poll_cb(void *opaque)
+{
+LuringState *s = opaque;
+struct io_uring_cqe *cqes;
+
+if (io_uring_peek_cqe(&s->ring, &cqes) == 0) {
+if (!cqes) {
+qemu_luring_process_completions_and_submit(s);
+return true;
+}
+}
+
+return false;
+}
+
 static void ioq_init(LuringQueue *io_q)
 {
 QSIMPLEQ_INIT(&io_q->sq_overflow);
@@ -294,7 +309,7 @@ void luring_attach_aio_context(LuringState *s, AioContext 
*new_context)
 s->aio_context = new_context;
 s->completion_bh = aio_bh_new(new_context, qemu_luring_completion_bh, s);
 aio_set_fd_handler(s->aio_context, s->ring.ring_fd, false,
-   qemu_luring_completion_cb, NULL, NULL, s);
+   qemu_luring_completion_cb, NULL, qemu_luring_poll_cb, 
s);
 }
 
 LuringState *luring_init(Error **errp)
-- 
2.17.1




[Qemu-devel] [PATCH v5 12/12] qemu-iotests/087: checks for io_uring

2019-06-10 Thread Aarushi Mehta
Signed-off-by: Aarushi Mehta 
---
 tests/qemu-iotests/087 | 26 ++
 tests/qemu-iotests/087.out | 10 ++
 2 files changed, 36 insertions(+)

diff --git a/tests/qemu-iotests/087 b/tests/qemu-iotests/087
index d6c8613419..0cc7283ad8 100755
--- a/tests/qemu-iotests/087
+++ b/tests/qemu-iotests/087
@@ -124,6 +124,32 @@ run_qemu_filter_aio <

[Qemu-devel] [PATCH v5 07/12] blockdev: accept io_uring as option

2019-06-10 Thread Aarushi Mehta
Signed-off-by: Aarushi Mehta 
Reviewed-by: Stefan Hajnoczi 
---
 blockdev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 3f44b891eb..a2a5b32604 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -386,6 +386,8 @@ static void extract_common_blockdev_options(QemuOpts *opts, 
int *bdrv_flags,
 if ((aio = qemu_opt_get(opts, "aio")) != NULL) {
 if (!strcmp(aio, "native")) {
 *bdrv_flags |= BDRV_O_NATIVE_AIO;
+} else if (!strcmp(aio, "io_uring")) {
+*bdrv_flags |= BDRV_O_IO_URING;
 } else if (!strcmp(aio, "threads")) {
 /* this is the default */
 } else {
@@ -4579,7 +4581,7 @@ QemuOptsList qemu_common_drive_opts = {
 },{
 .name = "aio",
 .type = QEMU_OPT_STRING,
-.help = "host AIO implementation (threads, native)",
+.help = "host AIO implementation (threads, native, io_uring)",
 },{
 .name = BDRV_OPT_CACHE_WB,
 .type = QEMU_OPT_BOOL,
-- 
2.17.1




[Qemu-devel] [PATCH v5 08/12] block/file-posix.c: extend to use io_uring

2019-06-10 Thread Aarushi Mehta
Signed-off-by: Aarushi Mehta 
---
 block/file-posix.c | 85 +-
 1 file changed, 69 insertions(+), 16 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index d018429672..211dfe5337 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -154,6 +154,7 @@ typedef struct BDRVRawState {
 bool has_write_zeroes:1;
 bool discard_zeroes:1;
 bool use_linux_aio:1;
+bool use_linux_io_uring:1;
 bool page_cache_inconsistent:1;
 bool has_fallocate;
 bool needs_alignment;
@@ -423,7 +424,7 @@ static QemuOptsList raw_runtime_opts = {
 {
 .name = "aio",
 .type = QEMU_OPT_STRING,
-.help = "host AIO implementation (threads, native)",
+.help = "host AIO implementation (threads, native, io_uring)",
 },
 {
 .name = "locking",
@@ -482,9 +483,15 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 goto fail;
 }
 
-aio_default = (bdrv_flags & BDRV_O_NATIVE_AIO)
-  ? BLOCKDEV_AIO_OPTIONS_NATIVE
-  : BLOCKDEV_AIO_OPTIONS_THREADS;
+if (bdrv_flags & BDRV_O_NATIVE_AIO) {
+aio_default = BLOCKDEV_AIO_OPTIONS_NATIVE;
+#ifdef CONFIG_LINUX_IO_URING
+} else if (bdrv_flags & BDRV_O_IO_URING) {
+aio_default = BLOCKDEV_AIO_OPTIONS_IO_URING;
+#endif
+} else {
+aio_default = BLOCKDEV_AIO_OPTIONS_THREADS;
+}
 aio = qapi_enum_parse(&BlockdevAioOptions_lookup,
   qemu_opt_get(opts, "aio"),
   aio_default, &local_err);
@@ -493,7 +500,11 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 ret = -EINVAL;
 goto fail;
 }
+
 s->use_linux_aio = (aio == BLOCKDEV_AIO_OPTIONS_NATIVE);
+#ifdef CONFIG_LINUX_IO_URING
+s->use_linux_io_uring = (aio == BLOCKDEV_AIO_OPTIONS_IO_URING);
+#endif
 
 locking = qapi_enum_parse(&OnOffAuto_lookup,
   qemu_opt_get(opts, "locking"),
@@ -557,7 +568,7 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 s->shared_perm = BLK_PERM_ALL;
 
 #ifdef CONFIG_LINUX_AIO
- /* Currently Linux does AIO only for files opened with O_DIRECT */
+/* Currently Linux does AIO only for files opened with O_DIRECT */
 if (s->use_linux_aio) {
 if (!(s->open_flags & O_DIRECT)) {
 error_setg(errp, "aio=native was specified, but it requires "
@@ -579,6 +590,22 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 }
 #endif /* !defined(CONFIG_LINUX_AIO) */
 
+#ifdef CONFIG_LINUX_IO_URING
+if (s->use_linux_io_uring) {
+if (!aio_setup_linux_io_uring(bdrv_get_aio_context(bs), errp)) {
+error_prepend(errp, "Unable to use io_uring: ");
+goto fail;
+}
+}
+#else
+if (s->use_linux_io_uring) {
+error_setg(errp, "aio=io_uring was specified, but is not supported "
+ "in this build.");
+ret = -EINVAL;
+goto fail;
+}
+#endif /* !defined(CONFIG_LINUX_IO_URING) */
+
 s->has_discard = true;
 s->has_write_zeroes = true;
 if ((bs->open_flags & BDRV_O_NOCACHE) != 0) {
@@ -1875,16 +1902,20 @@ static int coroutine_fn raw_co_prw(BlockDriverState 
*bs, uint64_t offset,
  * If this is the case tell the low-level driver that it needs
  * to copy the buffer.
  */
-if (s->needs_alignment) {
-if (!bdrv_qiov_is_aligned(bs, qiov)) {
-type |= QEMU_AIO_MISALIGNED;
+if (s->needs_alignment && !bdrv_qiov_is_aligned(bs, qiov)) {
+type |= QEMU_AIO_MISALIGNED;
+#ifdef CONFIG_LINUX_IO_URING
+} else if (s->use_linux_io_uring) {
+LuringState *aio = aio_get_linux_io_uring(bdrv_get_aio_context(bs));
+assert(qiov->size == bytes);
+return luring_co_submit(bs, aio, s->fd, offset, qiov, type);
+#endif
 #ifdef CONFIG_LINUX_AIO
-} else if (s->use_linux_aio) {
-LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
-assert(qiov->size == bytes);
-return laio_co_submit(bs, aio, s->fd, offset, qiov, type);
+} else if (s->use_linux_aio && s->needs_alignment) {
+LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
+assert(qiov->size == bytes);
+return laio_co_submit(bs, aio, s->fd, offset, qiov, type);
 #endif
-}
 }
 
 acb = (RawPosixAIOData) {
@@ -1920,24 +1951,36 @@ static int coroutine_fn raw_co_pwritev(BlockDriverState 
*bs, uint64_t offset,
 
 static void raw_aio_plug(BlockDriverState *bs)
 {
+BDRVRawState __attribute__((unused)) *s = bs->opaque;
 #ifdef CONFIG_LINUX_AIO
-BDRVRawState *s = bs->opaque;
 if (s->use_linux_aio) {
 LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
 laio_io_plug(bs, aio);
 }
 #endif
+#ifdef CONFIG_LINUX_IO_URING
+if (s->use_linux_io_uring) {
+LuringState *aio = 

[Qemu-devel] [PATCH v5 06/12] util/async: add aio interfaces for io_uring

2019-06-10 Thread Aarushi Mehta
Signed-off-by: Aarushi Mehta 
Reviewed-by: Stefan Hajnoczi 
---
 util/async.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/util/async.c b/util/async.c
index c10642a385..2709f0edc3 100644
--- a/util/async.c
+++ b/util/async.c
@@ -277,6 +277,14 @@ aio_ctx_finalize(GSource *source)
 }
 #endif
 
+#ifdef CONFIG_LINUX_IO_URING
+if (ctx->linux_io_uring) {
+luring_detach_aio_context(ctx->linux_io_uring, ctx);
+luring_cleanup(ctx->linux_io_uring);
+ctx->linux_io_uring = NULL;
+}
+#endif
+
 assert(QSLIST_EMPTY(&ctx->scheduled_coroutines));
 qemu_bh_delete(ctx->co_schedule_bh);
 
@@ -341,6 +349,29 @@ LinuxAioState *aio_get_linux_aio(AioContext *ctx)
 }
 #endif
 
+#ifdef CONFIG_LINUX_IO_URING
+LuringState *aio_setup_linux_io_uring(AioContext *ctx, Error **errp)
+{
+if (ctx->linux_io_uring) {
+return ctx->linux_io_uring;
+}
+
+ctx->linux_io_uring = luring_init(errp);
+if (!ctx->linux_io_uring) {
+return NULL;
+}
+
+luring_attach_aio_context(ctx->linux_io_uring, ctx);
+return ctx->linux_io_uring;
+}
+
+LuringState *aio_get_linux_io_uring(AioContext *ctx)
+{
+assert(ctx->linux_io_uring);
+return ctx->linux_io_uring;
+}
+#endif
+
 void aio_notify(AioContext *ctx)
 {
 /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
@@ -432,6 +463,11 @@ AioContext *aio_context_new(Error **errp)
 #ifdef CONFIG_LINUX_AIO
 ctx->linux_aio = NULL;
 #endif
+
+#ifdef CONFIG_LINUX_IO_URING
+ctx->linux_io_uring = NULL;
+#endif
+
 ctx->thread_pool = NULL;
 qemu_rec_mutex_init(&ctx->lock);
 timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
-- 
2.17.1




[Qemu-devel] [PATCH v5 09/12] block: add trace events for io_uring

2019-06-10 Thread Aarushi Mehta
Signed-off-by: Aarushi Mehta 
---
 block/io_uring.c   | 14 --
 block/trace-events |  8 
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/block/io_uring.c b/block/io_uring.c
index f327c7ef96..47e027364a 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -17,6 +17,7 @@
 #include "block/raw-aio.h"
 #include "qemu/coroutine.h"
 #include "qapi/error.h"
+#include "trace.h"
 
 #define MAX_EVENTS 128
 
@@ -191,12 +192,15 @@ static int ioq_submit(LuringState *s)
 
 void luring_io_plug(BlockDriverState *bs, LuringState *s)
 {
+trace_luring_io_plug();
 s->io_q.plugged++;
 }
 
 void luring_io_unplug(BlockDriverState *bs, LuringState *s)
 {
 assert(s->io_q.plugged);
+trace_luring_io_unplug(s->io_q.blocked, s->io_q.plugged,
+ s->io_q.in_queue, s->io_q.in_flight);
 if (--s->io_q.plugged == 0 &&
 !s->io_q.blocked && s->io_q.in_queue > 0) {
 ioq_submit(s);
@@ -217,6 +221,7 @@ void luring_io_unplug(BlockDriverState *bs, LuringState *s)
 static int luring_do_submit(int fd, LuringAIOCB *luringcb, LuringState *s,
 uint64_t offset, int type)
 {
+int ret;
 struct io_uring_sqe *sqes = io_uring_get_sqe(&s->ring);
 if (!sqes) {
 sqes = &luringcb->sqeq;
@@ -242,11 +247,14 @@ static int luring_do_submit(int fd, LuringAIOCB 
*luringcb, LuringState *s,
 }
 io_uring_sqe_set_data(sqes, luringcb);
 s->io_q.in_queue++;
-
+trace_luring_do_submit(s->io_q.blocked, s->io_q.plugged,
+   s->io_q.in_queue, s->io_q.in_flight);
 if (!s->io_q.blocked &&
 (!s->io_q.plugged ||
  s->io_q.in_flight + s->io_q.in_queue >= MAX_EVENTS)) {
-return ioq_submit(s);
+ret = ioq_submit(s);
+trace_luring_do_submit_done(ret);
+return ret;
 }
 return 0;
 }
@@ -294,6 +302,7 @@ LuringState *luring_init(Error **errp)
 int rc;
 LuringState *s;
 s = g_malloc0(sizeof(*s));
+trace_luring_init_state((void *)s, sizeof(*s));
 struct io_uring *ring = &s->ring;
 rc =  io_uring_queue_init(MAX_EVENTS, ring, 0);
 if (rc < 0) {
@@ -311,4 +320,5 @@ void luring_cleanup(LuringState *s)
 {
 io_uring_queue_exit(&s->ring);
 g_free(s);
+trace_luring_cleanup_state();
 }
diff --git a/block/trace-events b/block/trace-events
index eab51497fc..c4564dcd96 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -60,6 +60,14 @@ qmp_block_stream(void *bs, void *job) "bs %p job %p"
 file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) 
"acb %p opaque %p offset %"PRId64" count %d type %d"
 file_copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t 
dst_off, int64_t bytes, int flags, int64_t ret) "bs %p src_fd %d offset 
%"PRIu64" dst_fd %d offset %"PRIu64" bytes %"PRIu64" flags %d ret %"PRId64
 
+#io_uring.c
+luring_init_state(void *s, size_t size) "s %p size %zu"
+luring_cleanup_state(void) "s freed"
+disable luring_io_plug(void) "plug"
+disable luring_io_unplug(int blocked, int plugged, int queued, int inflight) 
"blocked %d plugged %d queued %d inflight %d"
+disable luring_do_submit(int blocked, int plugged, int queued, int inflight) 
"blocked %d plugged %d queued %d inflight %d"
+disable luring_do_submit_done(int ret) "submitted to kernel %d"
+
 # qcow2.c
 qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%" 
PRIx64 " bytes %d"
 qcow2_writev_done_req(void *co, int ret) "co %p ret %d"
-- 
2.17.1




[Qemu-devel] [PATCH v5 03/12] block/block: add BDRV flag for io_uring

2019-06-10 Thread Aarushi Mehta
Signed-off-by: Aarushi Mehta 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Maxim Levitsky 
---
 include/block/block.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/block/block.h b/include/block/block.h
index f9415ed740..5e08df716f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -121,6 +121,7 @@ typedef struct HDGeometry {
   ignoring the format layer */
 #define BDRV_O_NO_IO   0x1 /* don't initialize for I/O */
 #define BDRV_O_AUTO_RDONLY 0x2 /* degrade to read-only if opening 
read-write fails */
+#define BDRV_O_IO_URING0x4 /* use io_uring instead of the thread pool 
*/
 
 #define BDRV_O_CACHE_MASK  (BDRV_O_NOCACHE | BDRV_O_NO_FLUSH)
 
-- 
2.17.1




[Qemu-devel] [PATCH v5 05/12] stubs: add stubs for io_uring interface

2019-06-10 Thread Aarushi Mehta
Signed-off-by: Aarushi Mehta 
Reviewed-by: Stefan Hajnoczi 
---
 MAINTAINERS |  1 +
 stubs/Makefile.objs |  1 +
 stubs/io_uring.c| 32 
 3 files changed, 34 insertions(+)
 create mode 100644 stubs/io_uring.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 49f896796e..bc38175124 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2522,6 +2522,7 @@ R: Stefan Hajnoczi 
 L: qemu-bl...@nongnu.org
 S: Maintained
 F: block/io_uring.c
+F: stubs/io_uring.c
 
 qcow2
 M: Kevin Wolf 
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index 9c7393b08c..5cf160a9c8 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -13,6 +13,7 @@ stub-obj-y += iothread.o
 stub-obj-y += iothread-lock.o
 stub-obj-y += is-daemonized.o
 stub-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
+stub-obj-$(CONFIG_LINUX_IO_URING) += io_uring.o
 stub-obj-y += machine-init-done.o
 stub-obj-y += migr-blocker.o
 stub-obj-y += change-state-handler.o
diff --git a/stubs/io_uring.c b/stubs/io_uring.c
new file mode 100644
index 00..622d1e4648
--- /dev/null
+++ b/stubs/io_uring.c
@@ -0,0 +1,32 @@
+/*
+ * Linux io_uring support.
+ *
+ * Copyright (C) 2009 IBM, Corp.
+ * Copyright (C) 2009 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include "block/aio.h"
+#include "block/raw-aio.h"
+
+void luring_detach_aio_context(LuringState *s, AioContext *old_context)
+{
+abort();
+}
+
+void luring_attach_aio_context(LuringState *s, AioContext *new_context)
+{
+abort();
+}
+
+LuringState *luring_init(Error **errp)
+{
+abort();
+}
+
+void luring_cleanup(LuringState *s)
+{
+abort();
+}
-- 
2.17.1




[Qemu-devel] [PATCH v5 04/12] block/io_uring: implements interfaces for io_uring

2019-06-10 Thread Aarushi Mehta
Aborts when an sqe fails to be set, as sqes cannot be returned to the ring.

Signed-off-by: Aarushi Mehta 
---
 MAINTAINERS |   7 +
 block/Makefile.objs |   3 +
 block/io_uring.c| 314 
 include/block/aio.h |  16 +-
 include/block/raw-aio.h |  12 ++
 5 files changed, 351 insertions(+), 1 deletion(-)
 create mode 100644 block/io_uring.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 7be1225415..49f896796e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2516,6 +2516,13 @@ F: block/file-posix.c
 F: block/file-win32.c
 F: block/win32-aio.c
 
+Linux io_uring
+M: Aarushi Mehta 
+R: Stefan Hajnoczi 
+L: qemu-bl...@nongnu.org
+S: Maintained
+F: block/io_uring.c
+
 qcow2
 M: Kevin Wolf 
 M: Max Reitz 
diff --git a/block/Makefile.objs b/block/Makefile.objs
index ae11605c9f..8fde7a23a5 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -18,6 +18,7 @@ block-obj-y += block-backend.o snapshot.o qapi.o
 block-obj-$(CONFIG_WIN32) += file-win32.o win32-aio.o
 block-obj-$(CONFIG_POSIX) += file-posix.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
+block-obj-$(CONFIG_LINUX_IO_URING) += io_uring.o
 block-obj-y += null.o mirror.o commit.o io.o create.o
 block-obj-y += throttle-groups.o
 block-obj-$(CONFIG_LINUX) += nvme.o
@@ -61,5 +62,7 @@ block-obj-$(if $(CONFIG_LZFSE),m,n) += dmg-lzfse.o
 dmg-lzfse.o-libs   := $(LZFSE_LIBS)
 qcow.o-libs:= -lz
 linux-aio.o-libs   := -laio
+io_uring.o-cflags  := $(LINUX_IO_URING_CFLAGS)
+io_uring.o-libs:= $(LINUX_IO_URING_LIBS)
 parallels.o-cflags := $(LIBXML2_CFLAGS)
 parallels.o-libs   := $(LIBXML2_LIBS)
diff --git a/block/io_uring.c b/block/io_uring.c
new file mode 100644
index 00..f327c7ef96
--- /dev/null
+++ b/block/io_uring.c
@@ -0,0 +1,314 @@
+/*
+ * Linux io_uring support.
+ *
+ * Copyright (C) 2009 IBM, Corp.
+ * Copyright (C) 2009 Red Hat, Inc.
+ * Copyright (C) 2019 Aarushi Mehta
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include 
+#include "qemu-common.h"
+#include "block/aio.h"
+#include "qemu/queue.h"
+#include "block/block.h"
+#include "block/raw-aio.h"
+#include "qemu/coroutine.h"
+#include "qapi/error.h"
+
+#define MAX_EVENTS 128
+
+typedef struct LuringAIOCB {
+Coroutine *co;
+struct io_uring_sqe sqeq;
+ssize_t ret;
+QEMUIOVector *qiov;
+bool is_read;
+QSIMPLEQ_ENTRY(LuringAIOCB) next;
+} LuringAIOCB;
+
+typedef struct LuringQueue {
+int plugged;
+unsigned int in_queue;
+unsigned int in_flight;
+bool blocked;
+QSIMPLEQ_HEAD(, LuringAIOCB) sq_overflow;
+} LuringQueue;
+
+typedef struct LuringState {
+AioContext *aio_context;
+
+struct io_uring ring;
+
+/* io queue for submit at batch.  Protected by AioContext lock. */
+LuringQueue io_q;
+
+/* I/O completion processing.  Only runs in I/O thread.  */
+QEMUBH *completion_bh;
+} LuringState;
+
+/**
+ * ioq_submit:
+ * @s: AIO state
+ *
+ * Queues pending sqes and submits them
+ *
+ */
+static int ioq_submit(LuringState *s);
+
+/**
+ * qemu_luring_process_completions:
+ * @s: AIO state
+ *
+ * Fetches completed I/O requests, consumes cqes and invokes their callbacks.
+ *
+ */
+static void qemu_luring_process_completions(LuringState *s)
+{
+struct io_uring_cqe *cqes;
+int ret;
+
+/*
+ * Request completion callbacks can run the nested event loop.
+ * Schedule ourselves so the nested event loop will "see" remaining
+ * completed requests and process them.  Without this, completion
+ * callbacks that wait for other requests using a nested event loop
+ * would hang forever.
+ */
+qemu_bh_schedule(s->completion_bh);
+
+while (io_uring_peek_cqe(&s->ring, &cqes) == 0) {
+if (!cqes) {
+break;
+}
+LuringAIOCB *luringcb = io_uring_cqe_get_data(cqes);
+ret = cqes->res;
+
+if (ret == luringcb->qiov->size) {
+ret = 0;
+} else if (ret >= 0) {
+/* Short Read/Write */
+if (luringcb->is_read) {
+/* Read, pad with zeroes */
+qemu_iovec_memset(luringcb->qiov, ret, 0,
+luringcb->qiov->size - ret);
+} else {
+ret = -ENOSPC;
+}
+}
+luringcb->ret = ret;
+
+io_uring_cqe_seen(>ring, cqes);
+cqes = NULL;
+/* Change counters one-by-one because we can be nested. */
+s->io_q.in_flight--;
+
+/*
+ * If the coroutine is already entered it must be in ioq_submit()
+ * and will notice luringcb->ret has been filled in when it
+ * eventually runs later. Coroutines cannot be entered recursively
+ * so avoid doing that!
+ */
+if (!qemu_coroutine_entered(luringcb->co)) {
+aio_co_wake(luringcb->co);
+}
+}
+qemu_bh_cancel(s->completion_bh);
+}
+

[Qemu-devel] [PATCH] pc: fix possible NULL pointer dereference in pc_machine_get_device_memory_region_size()

2019-06-10 Thread Igor Mammedov
QEMU will crash when device-memory-region-size property is read if 
ms->device_memory
wasn't initialized yet.

Crash can be reproduced with:
 $QEMU -preconfig -qmp unix:qmp_socket,server,nowait &
 ./scripts/qmp/qom-get -s qmp_socket /machine.device-memory-region-size

Instead of crashing return 0 if ms->device_memory hasn't been initialized.

Signed-off-by: Igor Mammedov 
---
v2:
  add reproducer to commit message
   (Markus Armbruster )

 hw/i386/pc.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index edc240b..1b7ead9 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2459,7 +2459,11 @@ pc_machine_get_device_memory_region_size(Object *obj, 
Visitor *v,
  Error **errp)
 {
 MachineState *ms = MACHINE(obj);
-int64_t value = memory_region_size(&ms->device_memory->mr);
+int64_t value = 0;
+
+if (ms->device_memory) {
+value = memory_region_size(&ms->device_memory->mr);
+}
 
 visit_type_int(v, name, &value, errp);
 }
-- 
2.7.4




[Qemu-devel] [PATCH v5 01/12] configure: permit use of io_uring

2019-06-10 Thread Aarushi Mehta
Signed-off-by: Aarushi Mehta 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Maxim Levitsky 
---
 configure | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/configure b/configure
index b091b82cb3..7aa18d308d 100755
--- a/configure
+++ b/configure
@@ -365,6 +365,7 @@ xen=""
 xen_ctrl_version=""
 xen_pci_passthrough=""
 linux_aio=""
+linux_io_uring=""
 cap_ng=""
 attr=""
 libattr=""
@@ -1266,6 +1267,10 @@ for opt do
   ;;
   --enable-linux-aio) linux_aio="yes"
   ;;
+  --disable-linux-io-uring) linux_io_uring="no"
+  ;;
+  --enable-linux-io-uring) linux_io_uring="yes"
+  ;;
   --disable-attr) attr="no"
   ;;
   --enable-attr) attr="yes"
@@ -1784,6 +1789,7 @@ disabled with --disable-FEATURE, default is enabled if 
available:
   vde support for vde network
   netmap  support for netmap network
   linux-aio   Linux AIO support
+  linux-io-uring  Linux io_uring support
   cap-ng  libcap-ng support
   attrattr and xattr support
   vhost-net   vhost-net kernel acceleration support
@@ -3973,6 +3979,21 @@ EOF
 linux_aio=no
   fi
 fi
+##
+# linux-io-uring probe
+
+if test "$linux_io_uring" != "no" ; then
+  if $pkg_config liburing; then
+linux_io_uring_cflags=$($pkg_config --cflags liburing)
+linux_io_uring_libs=$($pkg_config --libs liburing)
+linux_io_uring=yes
+  else
+if test "$linux_io_uring" = "yes" ; then
+  feature_not_found "linux io_uring" "Install liburing devel"
+fi
+linux_io_uring=no
+  fi
+fi
 
 ##
 # TPM emulation is only on POSIX
@@ -6396,6 +6417,7 @@ echo "PIE   $pie"
 echo "vde support   $vde"
 echo "netmap support$netmap"
 echo "Linux AIO support $linux_aio"
+echo "Linux io_uring support $linux_io_uring"
 echo "ATTR/XATTR support $attr"
 echo "Install blobs $blobs"
 echo "KVM support   $kvm"
@@ -6874,6 +6896,11 @@ fi
 if test "$linux_aio" = "yes" ; then
   echo "CONFIG_LINUX_AIO=y" >> $config_host_mak
 fi
+if test "$linux_io_uring" = "yes" ; then
+  echo "CONFIG_LINUX_IO_URING=y" >> $config_host_mak
+  echo "LINUX_IO_URING_CFLAGS=$linux_io_uring_cflags" >> $config_host_mak
+  echo "LINUX_IO_URING_LIBS=$linux_io_uring_libs" >> $config_host_mak
+fi
 if test "$attr" = "yes" ; then
   echo "CONFIG_ATTR=y" >> $config_host_mak
 fi
-- 
2.17.1




[Qemu-devel] [PATCH v5 02/12] qapi/block-core: add option for io_uring

2019-06-10 Thread Aarushi Mehta
The option is only enumerated on hosts that support it.

Signed-off-by: Aarushi Mehta 
Reviewed-by: Stefan Hajnoczi 
---
 qapi/block-core.json | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 1defcde048..db7eedd058 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2792,11 +2792,13 @@
 #
 # @threads: Use qemu's thread pool
 # @native:  Use native AIO backend (only Linux and Windows)
+# @io_uring:Use linux io_uring (since 4.1)
 #
 # Since: 2.9
 ##
 { 'enum': 'BlockdevAioOptions',
-  'data': [ 'threads', 'native' ] }
+  'data': [ 'threads', 'native',
+{ 'name': 'io_uring', 'if': 'defined(CONFIG_LINUX_IO_URING)' } ] }
 
 ##
 # @BlockdevCacheOptions:
-- 
2.17.1




[Qemu-devel] [PATCH v5 00/12] Add support for io_uring

2019-06-10 Thread Aarushi Mehta
This patch series adds support for the newly developed io_uring Linux AIO
interface. Linux io_uring is faster than Linux's AIO asynchronous I/O code,
offers efficient buffered asynchronous I/O support, the ability to do I/O
without performing a system call via polled I/O, and other efficiency 
enhancements.

Testing it requires a host kernel (5.1+) and the liburing library.
Use the option -drive aio=io_uring to enable it.
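
For example (disk image path is a placeholder):

    qemu-system-x86_64 ... -drive file=/path/to/disk.img,format=raw,aio=io_uring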

v5:
- Adds completion polling
- Extends qemu-io
- Adds qemu-iotest

v4:
- Add error handling
- Add trace events
- Remove aio submission based code

v3:
- Fix major errors in io_uring (sorry)
- Option now enumerates for CONFIG_LINUX_IO_URING
- pkg config support added

Aarushi Mehta (12):
  configure: permit use of io_uring
  qapi/block-core: add option for io_uring Only enumerates option for
devices that support it
  block/block: add BDRV flag for io_uring
  block/io_uring: implements interfaces for io_uring Aborts when sqe
fails to be set as sqes cannot be returned to the ring.
  stubs: add stubs for io_uring interface
  util/async: add aio interfaces for io_uring
  blockdev: accept io_uring as option
  block/file-posix.c: extend to use io_uring
  block: add trace events for io_uring
  block/io_uring: adds userspace completion polling
  qemu-io: adds support for io_uring
  qemu-iotests/087: checks for io_uring

 MAINTAINERS|   8 +
 block/Makefile.objs|   3 +
 block/file-posix.c |  85 --
 block/io_uring.c   | 339 +
 block/trace-events |   8 +
 blockdev.c |   4 +-
 configure  |  27 +++
 include/block/aio.h|  16 +-
 include/block/block.h  |   1 +
 include/block/raw-aio.h|  12 ++
 qapi/block-core.json   |   4 +-
 qemu-io.c  |  13 ++
 stubs/Makefile.objs|   1 +
 stubs/io_uring.c   |  32 
 tests/qemu-iotests/087 |  26 +++
 tests/qemu-iotests/087.out |  10 ++
 util/async.c   |  36 
 17 files changed, 606 insertions(+), 19 deletions(-)
 create mode 100644 block/io_uring.c
 create mode 100644 stubs/io_uring.c

-- 
2.17.1




Re: [Qemu-devel] [PULL 00/39] tcg: Move the softmmu tlb to CPUNegativeOffsetState

2019-06-10 Thread Peter Maydell
On Mon, 10 Jun 2019 at 03:02, Richard Henderson
 wrote:
>
> The following changes since commit 185b7ccc11354cbd69b6d53bf8d831dd964f6c88:
>
>   Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20190607-2' into 
> staging (2019-06-07 15:24:13 +0100)
>
> are available in the Git repository at:
>
>   https://github.com/rth7680/qemu.git tags/pull-tcg-20190609
>
> for you to fetch changes up to e20774aed18cc5e0113e6a6c502ece2fc1c41931:
>
>   tcg/arm: Remove mostly unreachable tlb special case (2019-06-09 18:55:23 
> -0700)
>
> 
> Move softmmu tlb into CPUNegativeOffsetState
>
> 

Hi; this failed to build on OpenBSD:
/tmp/qemu-test.RzUFLe/bsd-user/main.c: In function 'cpu_loop':
/tmp/qemu-test.RzUFLe/bsd-user/main.c:143:28: error: 'cpu' undeclared
(first use in this function)
 CPUState *cs = env_cpu(cpu);
^
/tmp/qemu-test.RzUFLe/bsd-user/main.c:143:28: note: each undeclared
identifier is reported only once for each function it appears in

(freebsd and netbsd were ok)

thanks
-- PMM



[Qemu-devel] [PATCH] qemu-ga: Convert invocation documentation to rST

2019-06-10 Thread Peter Maydell
The qemu-ga documentation is currently in qemu-ga.texi in
Texinfo format, which we present to the user as:
 * a qemu-ga manpage
 * a section of the main qemu-doc HTML documentation

Convert the documentation to rST format, and present it to
the user as:
 * a qemu-ga manpage
 * part of the interop/ Sphinx manual

Signed-off-by: Peter Maydell 
---
This is part of my general proposals about how we should
transition away from texinfo to sphinx (written up at
https://wiki.qemu.org/Features/Documentation). It's the
first part of step 3 (convert standalone manpages), so it's
interesting as a demonstration of Sphinx generating manpages
as well as HTML. The format of the output manpage is broadly
the same, though there are a few minor differences because
rST doesn't support quite the same kinds of output. (It also
fixes a bug where the current manpage renders some text intended
to be in bold as literally "B".)

Having the infrastructure for creating a simple manpage via
Sphinx should be a useful assist for the work in step 1 of the
transition plan that involves conversion of the auto-generated
qemu-ga-ref and qemu-qmp-ref (which need to produce manpages).

The non-manpage HTML version of the doc moves from qemu-doc.html
into docs/interop/ (its final location in the new 5-manual setup).

 Makefile |  13 ++--
 MAINTAINERS  |   2 +-
 docs/conf.py |  18 ++---
 docs/interop/conf.py |   7 ++
 docs/interop/index.rst   |   1 +
 docs/interop/qemu-ga.rst | 133 +
 qemu-doc.texi|   5 --
 qemu-ga.texi | 137 ---
 8 files changed, 161 insertions(+), 155 deletions(-)
 create mode 100644 docs/interop/qemu-ga.rst
 delete mode 100644 qemu-ga.texi

diff --git a/Makefile b/Makefile
index 8e2fc6624c3..cdada210746 100644
--- a/Makefile
+++ b/Makefile
@@ -329,7 +329,7 @@ endif
 endif
 
 ifdef BUILD_DOCS
-DOCS=qemu-doc.html qemu-doc.txt qemu.1 qemu-img.1 qemu-nbd.8 qemu-ga.8
+DOCS=qemu-doc.html qemu-doc.txt qemu.1 qemu-img.1 qemu-nbd.8 
docs/interop/qemu-ga.8
 DOCS+=docs/interop/qemu-qmp-ref.html docs/interop/qemu-qmp-ref.txt 
docs/interop/qemu-qmp-ref.7
 DOCS+=docs/interop/qemu-ga-ref.html docs/interop/qemu-ga-ref.txt 
docs/interop/qemu-ga-ref.7
 DOCS+=docs/qemu-block-drivers.7
@@ -804,7 +804,7 @@ ifdef CONFIG_TRACE_SYSTEMTAP
$(INSTALL_DATA) scripts/qemu-trace-stap.1 "$(DESTDIR)$(mandir)/man1"
 endif
 ifneq (,$(findstring qemu-ga,$(TOOLS)))
-   $(INSTALL_DATA) qemu-ga.8 "$(DESTDIR)$(mandir)/man8"
+   $(INSTALL_DATA) docs/interop/qemu-ga.8 "$(DESTDIR)$(mandir)/man8"
$(INSTALL_DATA) docs/interop/qemu-ga-ref.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/interop/qemu-ga-ref.txt "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/interop/qemu-ga-ref.7 "$(DESTDIR)$(mandir)/man7"
@@ -969,12 +969,18 @@ build-manual = $(call quiet-command,sphinx-build $(if 
$(V),,-q) -W -n -b html -D
 # We assume all RST files in the manual's directory are used in it
 manual-deps = $(wildcard $(SRC_PATH)/docs/$1/*.rst) 
$(SRC_PATH)/docs/$1/conf.py $(SRC_PATH)/docs/conf.py
 
+# Canned command to build manpages from a single manual
+build-manpages = $(call quiet-command,CONFDIR="$(qemu_confdir)" sphinx-build 
$(if $(V),,-q) -W -n -b man -D version=$(VERSION) -D release="$(FULL_VERSION)" 
-d .doctrees/$1 $(SRC_PATH)/docs/$1 $(MANUAL_BUILDDIR)/$1 
,"SPHINX","$(MANUAL_BUILDDIR)/$1")
+
 $(MANUAL_BUILDDIR)/devel/index.html: $(call manual-deps,devel)
$(call build-manual,devel)
 
 $(MANUAL_BUILDDIR)/interop/index.html: $(call manual-deps,interop)
$(call build-manual,interop)
 
+$(MANUAL_BUILDDIR)/interop/qemu-ga.8: $(call manual-deps,interop)
+   $(call build-manpages,interop)
+
 qemu-options.texi: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > 
$@,"GEN","$@")
 
@@ -998,7 +1004,6 @@ qemu.1: qemu-option-trace.texi
 qemu-img.1: qemu-img.texi qemu-option-trace.texi qemu-img-cmds.texi
 fsdev/virtfs-proxy-helper.1: fsdev/virtfs-proxy-helper.texi
 qemu-nbd.8: qemu-nbd.texi qemu-option-trace.texi
-qemu-ga.8: qemu-ga.texi
 docs/qemu-block-drivers.7: docs/qemu-block-drivers.texi
 docs/qemu-cpu-models.7: docs/qemu-cpu-models.texi
 scripts/qemu-trace-stap.1: scripts/qemu-trace-stap.texi
@@ -1010,7 +1015,7 @@ txt: qemu-doc.txt docs/interop/qemu-qmp-ref.txt 
docs/interop/qemu-ga-ref.txt
 
 qemu-doc.html qemu-doc.info qemu-doc.pdf qemu-doc.txt: \
qemu-img.texi qemu-nbd.texi qemu-options.texi qemu-option-trace.texi \
-   qemu-deprecated.texi qemu-monitor.texi qemu-img-cmds.texi qemu-ga.texi \
+   qemu-deprecated.texi qemu-monitor.texi qemu-img-cmds.texi \
qemu-monitor-info.texi docs/qemu-block-drivers.texi \
docs/qemu-cpu-models.texi docs/security.texi
 
diff --git a/MAINTAINERS b/MAINTAINERS
index a96829ea83a..27a6ffa7431 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2024,7 

Re: [Qemu-devel] [PATCH v2 1/5] target/i386: Add cpu die-level topology support for X86CPU

2019-06-10 Thread Like Xu

On 2019/6/6 11:32, Eduardo Habkost wrote:

On Tue, May 21, 2019 at 12:50:52AM +0800, Like Xu wrote:

The die-level, as the first PC-specific cpu topology level, is added to the
legacy cpu topology model, which only covers sockets/cores/threads.

In the new model with die-level support, the total number of logical
processors (including offline) on board will be calculated as:

  #cpus = #sockets * #dies * #cores * #threads

and considering compatibility, the default value for #dies is 1.
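
Spelled out as code, the counting rule above is simply (names here are
illustrative, not the actual QEMU fields):

    /* #cpus = #sockets * #dies * #cores * #threads; dies defaults to 1 */
    static unsigned x86_total_logical_cpus(unsigned sockets, unsigned dies,
                                           unsigned cores, unsigned threads)
    {
        return sockets * dies * cores * threads;
    }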

A new set of die-related variables are added in smp context and the
CPUX86State.nr_dies is assigned in x86_cpu_initfn() from PCMachineState.

Signed-off-by: Like Xu 
---
  hw/i386/pc.c   | 3 +++
  include/hw/i386/pc.h   | 2 ++
  include/hw/i386/topology.h | 2 ++
  qapi/misc.json | 6 --
  target/i386/cpu.c  | 9 +
  target/i386/cpu.h  | 3 +++
  6 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 896c22e32e..83ab53c814 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2341,6 +2341,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
  
  topo.pkg_id = cpu->socket_id;

  topo.core_id = cpu->core_id;
+topo.die_id = cpu->die_id;
  topo.smt_id = cpu->thread_id;
 cpu->apic_id = apicid_from_topo_ids(smp_cores, smp_threads, &topo);
  }
@@ -2692,6 +2693,8 @@ static const CPUArchIdList 
*pc_possible_cpu_arch_ids(MachineState *ms)
    ms->smp.cores, ms->smp.threads, &topo);
  ms->possible_cpus->cpus[i].props.has_socket_id = true;
  ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
+ms->possible_cpus->cpus[i].props.has_die_id = true;
+ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
  ms->possible_cpus->cpus[i].props.has_core_id = true;
  ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
  ms->possible_cpus->cpus[i].props.has_thread_id = true;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index ce3c22951e..b5faf2ede9 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -24,6 +24,7 @@
   * PCMachineState:
   * @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling
   * @boot_cpus: number of present VCPUs
+ * @smp_dies: number of dies per one package
   */
  struct PCMachineState {
  /*< private >*/
@@ -59,6 +60,7 @@ struct PCMachineState {
  bool apic_xrupt_override;
  unsigned apic_id_limit;
  uint16_t boot_cpus;
+unsigned smp_dies;
  
  /* NUMA information: */

  uint64_t numa_nodes;
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 1ebaee0f76..7f80498eb3 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -47,6 +47,7 @@ typedef uint32_t apic_id_t;
  
  typedef struct X86CPUTopoInfo {

  unsigned pkg_id;
+unsigned die_id;


Isn't it better to add this field only on patch 4/5?


  unsigned core_id;
  unsigned smt_id;
  } X86CPUTopoInfo;
@@ -130,6 +131,7 @@ static inline void x86_topo_ids_from_apicid(apic_id_t 
apicid,
  topo->core_id = (apicid >> apicid_core_offset(nr_cores, nr_threads)) &
    ~(0xFFFFFFFFUL << apicid_core_width(nr_cores, nr_threads));
  topo->pkg_id = apicid >> apicid_pkg_offset(nr_cores, nr_threads);
+topo->die_id = -1;


Why are you setting die_id = -1 here?


Hi Eduardo,thanks for your comments and support.

Would it be better to introduce all the die-related variables, including 
has_die_id/nr_dies/cpu->die_id/topo.die_id/smp_dies, in one patch, for 
easier consistency checking and backporting?


In this case the default value for topo->die_id would be 0 (for sure, 
one die per package) with has_die_id = false. Is that acceptable to you?




If die_id isn't valid yet, isn't it better to keep has_die_id =
false at pc_possible_cpu_arch_ids() above, and set has_die_id =
true only on patch 4/5?


  }
  
  /* Make APIC ID for the CPU 'cpu_index'

diff --git a/qapi/misc.json b/qapi/misc.json
index 8b3ca4fdd3..cd236c89b3 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -2924,10 +2924,11 @@
  #
  # @node-id: NUMA node ID the CPU belongs to
  # @socket-id: socket number within node/board the CPU belongs to
-# @core-id: core number within socket the CPU belongs to
+# @die-id: die number within node/board the CPU belongs to (Since 4.1)
+# @core-id: core number within die the CPU belongs to
  # @thread-id: thread number within core the CPU belongs to
  #
-# Note: currently there are 4 properties that could be present
+# Note: currently there are 5 properties that could be present
  # but management should be prepared to pass through other
  # properties with device_add command to allow for future
  # interface extension. This also requires the filed names to be kept in
@@ -2938,6 +2939,7 @@
  { 'struct': 'CpuInstanceProperties',
'data': { '*node-id': 'int',
  '*socket-id': 'int',
+'*die-id': 'int',
