Re: [PATCH] net: check payload length limit for all frames

2020-07-16 Thread Jason Wang



On 2020/7/17 1:06 PM, P J P wrote:

   Hello Jason, all

+-- On Fri, 17 Jul 2020, Jason Wang wrote --+
| On 2020/7/17 9:21 AM, Alexander Bulekov wrote:
| > On 200717 0853, Li Qiang wrote:
| >> Which issue are you trying to solve, any reference linking?
| >> I also send a patch related this part and also a UAF.
| >
| > I reported a UAF privately to QEMU-Security in May. I believe the one Li
| > is referring to is this one https://bugs.launchpad.net/qemu/+bug/1886362
| >
| > When I saw Prasad's email, I was worried that I reported the same bug
| > twice, but I can still reproduce LP#1886362 with Prasad's patch.
| >
| > On the other hand, I cannot reproduce either issue with Li's patch:
| > Message-Id: <20200716161453.61295-1-liq...@163.com>
| >
| > Based on this, I think there were two distinct issues.

   Yes, LP#1886362 looks different from the one fixed here.

| Could you describe the issue you saw in details? (E.g the calltrace?) The
| commit log does not explain where we can get OOB or UAF.

I should've included the backtrace in the commit message. It crossed my mind
after I sent the patch. Sorry.



Thanks, but I don't see a direct relation between the 64K limit and this 
calltrace.


Maybe you can elaborate more on this?

Thanks




===
==1040323==ERROR: AddressSanitizer: heap-use-after-free on address 0x606344d8 
at pc 0x5571b8fb9ce7 bp 0x7ffede5a2290 sp 0x7ffede5a2280
READ of size 8 at 0x606344d8 thread T0
 #0 0x5571b8fb9ce6 in e1000e_write_packet_to_guest hw/net/e1000e_core.c:1587
 #1 0x5571b8fba8fc in e1000e_receive_iov hw/net/e1000e_core.c:1709
 #2 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213
 #3 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544
 #4 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620
 #5 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633
 #6 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664
 #7 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743
 #8 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934
 #9 0x5571b8fbe2ad in e1000e_set_tdt hw/net/e1000e_core.c:2451
 #10 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265
 #11 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109
 ...
0x606344d8 is located 24 bytes inside of 64-byte region 
[0x606344c0,0x60634500)
freed by thread T0 here:
 #0 0x7f89e3b87307 in __interceptor_free (/lib64/libasan.so.6+0xb0307)
 #1 0x7f89e37c7a6c in g_free (/lib64/libglib-2.0.so.0+0x58a6c)
 #2 0x5571b8f95fc7 in net_rx_pkt_pull_data hw/net/net_rx_pkt.c:103
 #3 0x5571b8f969b3 in net_rx_pkt_attach_iovec_ex hw/net/net_rx_pkt.c:158
 #4 0x5571b8fba6bc in e1000e_receive_iov hw/net/e1000e_core.c:1695
 #5 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213
 #6 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544
 #7 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620
 #8 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633
 #9 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664
 #10 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743
 #11 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934
 #12 0x5571b8fbe011 in e1000e_set_tctl hw/net/e1000e_core.c:2431
 #13 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265
 #14 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109
 ...
previously allocated by thread T0 here:
 #0 0x7f89e3b87667 in __interceptor_malloc (/lib64/libasan.so.6+0xb0667)
 #1 0x7f89e37c7978 in g_malloc (/lib64/libglib-2.0.so.0+0x58978)
 #2 0x5571b8f95fc7 in net_rx_pkt_pull_data hw/net/net_rx_pkt.c:103
 #3 0x5571b8f969b3 in net_rx_pkt_attach_iovec_ex hw/net/net_rx_pkt.c:158
 #4 0x5571b8fba6bc in e1000e_receive_iov hw/net/e1000e_core.c:1695
 #5 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213
 #6 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544
 #7 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620
 #8 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633
 #9 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664
 #10 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743
 #11 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934
 #12 0x5571b8fbe2ad in e1000e_set_tdt hw/net/e1000e_core.c:2451
 #13 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265
 #14 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109
===
  
| >>> --- a/hw/net/net_tx_pkt.c

| >>> +++ b/hw/net/net_tx_pkt.c
| >>> @@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt,
| >>> NetClientState *nc)
| >>>* Since underlying infrastructure does not support IP datagrams
| >>>longer
| >>>* than 64K we should drop such packets and don't even try to send
| >>>*/
| >>> -if (VIRTIO_NET_HDR_GSO_NONE != 

Re: [PATCH] gitlab-ci.yml: Add oss-fuzz build tests

2020-07-16 Thread Thomas Huth
On 16/07/2020 18.33, Alexander Bulekov wrote:
> This tries to build and run the fuzzers with the same build-script used
> by oss-fuzz. This doesn't guarantee that the builds on oss-fuzz will
> also succeed, since oss-fuzz provides its own compiler and fuzzer vars,
> but it can catch changes that are not compatible with the
> ./scripts/oss-fuzz/build.sh script.
> The strange way of finding fuzzer binaries stems from the method used by
> oss-fuzz:
> https://github.com/google/oss-fuzz/blob/master/infra/base-images/base-runner/targets_list
> 
> Signed-off-by: Alexander Bulekov 
> ---
> 
> Similar to Thomas' patch:
> 
>> Note: This patch needs two other patches merged first to work correctly:
> 
>> - 'fuzz: Expect the cmdline in a freeable GString' from Alexander
> 
>> - 'qom: Plug memory leak in "info qom-tree"' from Markus
> 
> Otherwise the test will fail due to detected memory leaks.
> 
> Fair warning: I haven't been able to trigger this new job yet. I tried
> to run the pipeline with these changes on my forked repo on gitlab, but
> did not reach the build-oss-fuzz. I think this is due to some failures
> in the Containers-layer-2 stage:
> 
> ...
> Error response from daemon: manifest for
> registry.gitlab.com/a1xndr/qemu/qemu/debian-all-test-cross:latest not
> found: manifest unknown: manifest unknown
> #2 [internal] load .dockerignore
> #2 transferring context:
> #2 transferring context: 2B 0.1s done
> #2 DONE 0.1s
> #1 [internal] load build definition from tmpg8j4xoop.docker
> #1 transferring dockerfile: 2.21kB 0.1s done
> #1 DONE 0.2s
> #3 [internal] load metadata for docker.io/qemu/debian10:latest
> #3 ERROR: pull access denied, repository does not exist or may require
> authorization: server message: insufficient_scope: authorization failed

These look like the problems that we've seen with the main repo until
two days ago, too, e.g.:

 https://gitlab.com/qemu-project/qemu/-/jobs/640410842

Maybe Alex (Bennée) can comment on how to resolve them?

> 
>  .gitlab-ci.yml | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> index e96f8794b9..a50df420c9 100644
> --- a/.gitlab-ci.yml
> +++ b/.gitlab-ci.yml
> @@ -182,6 +182,20 @@ build-fuzzer:
>  || exit 1 ;
>done

As mentioned in my other mail, I think you can replace my build-fuzzer
job once this is working.

> +build-oss-fuzz:
> +  <<: *native_build_job_definition
> +  variables:
> +IMAGE: fedora
> +  script:
> +- OUT_DIR="./build" CC=clang-9 CXX=clang++-9 CFLAGS="-fsanitize=address"
> +  LIB_FUZZING_ENGINE="-fsanitize=fuzzer" CFL

That "CFL" at the end seems to be a typo (leftover from "CFLAGS")?

Also the fedora container does not have clang-9 :

 https://gitlab.com/huth/qemu/-/jobs/643383032#L28

I think it is at clang 10 already, so maybe just use CC=clang (without
version number)?

> +  ./scripts/oss-fuzz/build.sh
> +- for fuzzer in $(find ./build-oss-fuzz/DEST_DIR/ -executable -type f); 
> do
> +grep "LLVMFuzzerTestOneInput" ${fuzzer} > /dev/null 2>&1 || continue 
> ;
> +echo Testing ${fuzzer} ... ;
> +"${fuzzer}" -runs=1000 || exit 1 ;
> +  done

Should we exclude the virtio-net tests, since they could leak network
traffic to the host?

 Thomas




Re: [PATCH] e1000e: using bottom half to send packets

2020-07-16 Thread Jason Wang



On 2020/7/17 12:46 PM, Li Qiang wrote:

Jason Wang  wrote on Fri, 17 Jul 2020 at 11:10 AM:


On 2020/7/17 12:14 AM, Li Qiang wrote:

Alexander Bulekov reported a UAF bug related to e1000e packet send.

-->https://bugs.launchpad.net/qemu/+bug/1886362

This is because the guest triggers an e1000e packet send and sets the
data's address to e1000e's MMIO address. So when the e1000e does DMA
it will write the MMIO again and trigger re-entrancy, which finally
causes this UAF.

Paolo suggested using a bottom half whenever MMIO is doing complicated
things in here:
-->https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg03342.html

Reference here:
'The easiest solution is to delay processing of descriptors to a bottom
half whenever MMIO is doing something complicated.  This is also better
for latency because it will free the vCPU thread more quickly and leave
the work to the I/O thread.'


I think several things were missed in this patch (take virtio-net as a
reference), do we need the following things:


Thanks Jason,
In fact I know this; I'm scared of touching this but I want to try.
Thanks for your advice.


- Cancel the bh when VM is stopped.

OK. I think adding a vm state change notifier for e1000e can address this.


- A throttle to prevent bh from executing too much timer?

OK, I think adding a config timeout and a timer in e1000e can address this.



Sorry, a typo. I meant we probably need a tx_burst like what virtio-net does.
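
For illustration, a rough sketch of what a burst-limited bottom half could look
like in e1000e. The tx_burst field and the e1000e_start_xmit_burst() helper
below are made up for this example (they exist neither in the patch nor in
virtio-net), so treat this as a sketch of the idea only:

static void e1000e_core_tx_bh(void *opaque)
{
    struct e1000e_tx *tx = opaque;
    E1000ECore *core = tx->core;
    E1000E_TxRing txr;
    int budget = core->tx_burst;          /* hypothetical per-run budget */

    e1000e_tx_ring_init(core, &txr, tx - &core->tx[0]);

    /* Process at most 'budget' descriptors, then yield back to the loop. */
    if (!e1000e_start_xmit_burst(core, &txr, budget)) {
        /* Ring not empty yet: reschedule ourselves instead of looping. */
        qemu_bh_schedule(tx->tx_bh);
    }
}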





- A flag to record whether or not there is a pending tx (and migrate it?)

Is just a flag enough? Could you explain more about the idea behind
processing the virtio-net/e1000e using bh like this?



Virtio-net uses a tx_waiting variable to record whether or not there's a 
pending bh. (E.g. if the bh is cancelled due to vmstop, we need to reschedule 
it after vmresume.) Maybe we can do something simpler by just scheduling the 
bh unconditionally during vm resume.
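
To make that concrete, here is a rough sketch of what the e1000e vm state
change handler could do with the bh fields added by this patch. This is only
an illustration (it ignores everything the real e1000e_vm_state_change()
already does, and it is not the virtio-net code):

static void e1000e_vm_state_change(void *opaque, int running, RunState state)
{
    E1000ECore *core = opaque;
    int i;

    if (running) {
        /* Simplest variant: reschedule the tx bottom halves on resume. */
        for (i = 0; i < E1000E_NUM_QUEUES; i++) {
            qemu_bh_schedule(core->tx[i].tx_bh);
        }
    } else {
        /* VM stopped: make sure no tx bh keeps running. */
        for (i = 0; i < E1000E_NUM_QUEUES; i++) {
            qemu_bh_cancel(core->tx[i].tx_bh);
        }
    }
}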




For example, if the guest triggers a lot of packet sends and the bh
is scheduled in the I/O thread, will we lose packets?



We don't since we don't populate virtqueue which means packets are 
queued there.


Thanks



How do we avoid this in virtio-net?

Thanks,
Li Qiang




Thanks



This patch fixes this UAF.

Signed-off-by: Li Qiang 
---
   hw/net/e1000e_core.c | 25 +
   hw/net/e1000e_core.h |  2 ++
   2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index bcd186cac5..6165b04b68 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -2423,32 +2423,27 @@ e1000e_set_dbal(E1000ECore *core, int index, uint32_t 
val)
   static void
   e1000e_set_tctl(E1000ECore *core, int index, uint32_t val)
   {
-E1000E_TxRing txr;
   core->mac[index] = val;

   if (core->mac[TARC0] & E1000_TARC_ENABLE) {
-e1000e_tx_ring_init(core, &txr, 0);
-e1000e_start_xmit(core, &txr);
+qemu_bh_schedule(core->tx[0].tx_bh);
   }

   if (core->mac[TARC1] & E1000_TARC_ENABLE) {
-e1000e_tx_ring_init(core, &txr, 1);
-e1000e_start_xmit(core, &txr);
+qemu_bh_schedule(core->tx[1].tx_bh);
   }
   }

   static void
   e1000e_set_tdt(E1000ECore *core, int index, uint32_t val)
   {
-E1000E_TxRing txr;
   int qidx = e1000e_mq_queue_idx(TDT, index);
   uint32_t tarc_reg = (qidx == 0) ? TARC0 : TARC1;

    core->mac[index] = val & 0xffff;

   if (core->mac[tarc_reg] & E1000_TARC_ENABLE) {
-e1000e_tx_ring_init(core, &txr, qidx);
-e1000e_start_xmit(core, &txr);
+qemu_bh_schedule(core->tx[qidx].tx_bh);
   }
   }

@@ -3322,6 +3317,16 @@ e1000e_vm_state_change(void *opaque, int running, 
RunState state)
   }
   }

+static void e1000e_core_tx_bh(void *opaque)
+{
+struct e1000e_tx *tx = opaque;
+E1000ECore *core = tx->core;
+E1000E_TxRing txr;
+
+e1000e_tx_ring_init(core, &txr, tx - &core->tx[0]);
+e1000e_start_xmit(core, &txr);
+}
+
   void
   e1000e_core_pci_realize(E1000ECore *core,
   const uint16_t *eeprom_templ,
@@ -3340,6 +3345,8 @@ e1000e_core_pci_realize(E1000ECore *core,
   for (i = 0; i < E1000E_NUM_QUEUES; i++) {
    net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner,
   E1000E_MAX_TX_FRAGS, core->has_vnet);
+core->tx[i].core = core;
+core->tx[i].tx_bh = qemu_bh_new(e1000e_core_tx_bh, &core->tx[i]);
   }

    net_rx_pkt_init(&core->rx_pkt, core->has_vnet);
@@ -3367,6 +3374,8 @@ e1000e_core_pci_uninit(E1000ECore *core)
   for (i = 0; i < E1000E_NUM_QUEUES; i++) {
   net_tx_pkt_reset(core->tx[i].tx_pkt);
   net_tx_pkt_uninit(core->tx[i].tx_pkt);
+qemu_bh_delete(core->tx[i].tx_bh);
+core->tx[i].tx_bh = NULL;
   }

   net_rx_pkt_uninit(core->rx_pkt);
diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h
index aee32f7e48..94ddc6afc2 100644
--- a/hw/net/e1000e_core.h
+++ b/hw/net/e1000e_core.h
@@ -77,6 +77,8 @@ struct E1000Core {
   unsigned char sum_needed;
   bool cptse;
   

Re: [PATCH] gitlab-ci.yml: Add fuzzer tests

2020-07-16 Thread Thomas Huth
On 16/07/2020 18.46, Alexander Bulekov wrote:
> On 200716 1209, Thomas Huth wrote:
>> So far we neither compile-tested nor run any of the new fuzzers in our CI,
>> which led to some build failures of the fuzzer code in the past weeks.
>> To avoid this problem, add a job to compile the fuzzer code and run some
>> loops (which likely don't find any new bugs via fuzzing, but at least we
>> know that the code can still be run).
>>
>> A nice side-effect of this test is that the leak tests are enabled here,
>> so we should now notice some of the memory leaks in our code base earlier.
>>
>> Signed-off-by: Thomas Huth 
> 
> Thank you for this, Thomas. I just sent a patch to add a job that runs
> similar tests with the build-script that we use on oss-fuzz
> Patch: <20200716163330.29141-1-alx...@bu.edu>

Good idea! ... but it does not work quite yet, I gave it a try and ended
up here:

 https://gitlab.com/huth/qemu/-/jobs/643353500

> I know that these jobs have a lot of overlap, but there are enough
> quirks in the oss-fuzz build-script that I, personally, don't think
> they are redundant.

I think we should finally go with your approach and compile the fuzzing
test via the script. But since that seems to need some more work first,
let's go with my patch now, so that we have something for 5.1-rc1, and
then when your patch is ready, replace my "build-fuzzer" job with yours, ok?

>> +build-fuzzer:
>> +  <<: *native_build_job_definition
>> +  variables:
>> +IMAGE: fedora
>> +  script:
>> +- mkdir build
>> +- cd build
>> +- ../configure --cc=clang --cxx=clang++ --enable-fuzzing
>> +   --target-list=x86_64-softmmu
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg02310.html
> When/if this gets merged, enable-fuzzing won't build with
> AddressSanitizer, by default, so I would add --enable-sanitizers, just
> to be safe. 

Ok, thanks, I'll add that.

>> +- make -j"$JOBS" all check-build x86_64-softmmu/fuzz
>> +- make check
>> +- for fuzzer in i440fx-qos-fork-fuzz i440fx-qos-noreset-fuzz
>> +i440fx-qtest-reboot-fuzz virtio-scsi-flags-fuzz virtio-scsi-fuzz ; 
>> do
> 
> Any reason for these particular fuzzers? I know the virtio-net ones find
> crashes pretty quickly, but I don't think that causes a non-zero exit.

I did not include the virtio-net fuzzers because I read that warning
"May result in network traffic emitted from the  process. Run in an
isolated network environment." in the help text ... so I was not sure
whether they are really suitable for the CI on the gitlab machines?

 Thomas




Re: [RFC PATCH-for-5.2 v2 2/2] block/block-backend: Let blk_attach_dev() provide helpful error

2020-07-16 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Thu, Jul 16, 2020 at 02:37:04PM +0200, Philippe Mathieu-Daudé wrote:
>> Let blk_attach_dev() take an Error* object to return helpful
>> information. Adapt the callers.
>> 
>>   $ qemu-system-arm -M n800
>>   qemu-system-arm: sd_init failed: cannot attach blk 'sd0' to device 
>> 'sd-card'
>>   blk 'sd0' is already attached by device 'omap2-mmc'
>>   Drive 'sd0' is already in use because it has been automatically connected 
>> to another device
>>   (do you need 'if=none' in the drive options?)
>> 
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>> v2: Rebased after 668f62ec62 ("error: Eliminate error_propagate()")
>> ---
>>  include/sysemu/block-backend.h   |  2 +-
>>  block/block-backend.c| 11 ++-
>>  hw/block/fdc.c   |  4 +---
>>  hw/block/swim.c  |  4 +---
>>  hw/block/xen-block.c |  5 +++--
>>  hw/core/qdev-properties-system.c | 16 +---
>>  hw/ide/qdev.c|  4 +---
>>  hw/scsi/scsi-disk.c  |  4 +---
>>  8 files changed, 27 insertions(+), 23 deletions(-)
>> 
>> diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
>> index 8203d7f6f9..118fbad0b4 100644
>> --- a/include/sysemu/block-backend.h
>> +++ b/include/sysemu/block-backend.h
>> @@ -113,7 +113,7 @@ BlockDeviceIoStatus blk_iostatus(const BlockBackend 
>> *blk);
>>  void blk_iostatus_disable(BlockBackend *blk);
>>  void blk_iostatus_reset(BlockBackend *blk);
>>  void blk_iostatus_set_err(BlockBackend *blk, int error);
>> -int blk_attach_dev(BlockBackend *blk, DeviceState *dev);
>> +int blk_attach_dev(BlockBackend *blk, DeviceState *dev, Error **errp);
>>  void blk_detach_dev(BlockBackend *blk, DeviceState *dev);
>>  DeviceState *blk_get_attached_dev(BlockBackend *blk);
>>  char *blk_get_attached_dev_id(BlockBackend *blk);
>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index 63ff940ef9..b7be0a4619 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -884,12 +884,21 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, 
>> uint64_t *shared_perm)
>>  
>>  /*
>>   * Attach device model @dev to @blk.
>> + *
>> + * @blk: Block backend
>> + * @dev: Device to attach the block backend to
>> + * @errp: pointer to NULL initialized error object
>> + *
>>   * Return 0 on success, -EBUSY when a device model is attached already.
>>   */
>> -int blk_attach_dev(BlockBackend *blk, DeviceState *dev)
>> +int blk_attach_dev(BlockBackend *blk, DeviceState *dev, Error **errp)
>>  {
>>  trace_blk_attach_dev(blk_name(blk), object_get_typename(OBJECT(dev)));
>>  if (blk->dev) {
>> +error_setg(errp, "cannot attach blk '%s' to device '%s'",
>> +   blk_name(blk), object_get_typename(OBJECT(dev)));
>> +error_append_hint(errp, "blk '%s' is already attached by device 
>> '%s'\n",
>> +  blk_name(blk), 
>> object_get_typename(OBJECT(blk->dev)));
>
> I would have a preference for expanding the main error message and not
> using a hint.  Any hint is completely thrown away when using QMP :-(

Hints work best in cases like

error message
hint suggesting things to try to fix it, in CLI syntax

error message rejecting a configuration value
hint listing possible values, in CLI syntax

Why "in CLI syntax"?  Well, we need to use *some* syntax to be useful.
Hints have always been phrased for the CLI, simply because the hint
feature grew out of my disgust over the loss of lovingly written CLI
hints in the conversion to Error.

Hints in CLI syntax would be misleading QMP.  We never extended QMP to
transport hints.

Hints may tempt you in a case like

error message that is painfully long, because it really tries hard to 
explain, which is laudable in theory, but sadly illegible in practice; also, 
interminable sentences, meh, this is an error message, not a Joyce novel

to get something like

terse error message
Explanation that is rather long, because it really tries hard to
explain.  It can be multiple sentences, and lines are wrapped at a
reasonable length.

Comes out okay in the CLI, but the explanation is lost in QMP.

I don't have a good solution to offer for errors that genuinely need
explaining.
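
For the blk_attach_dev() case above, Daniel's suggestion of folding the detail
into the main message (instead of a hint) would look roughly like this. This is
just a sketch for comparison, not the posted patch:

    if (blk->dev) {
        error_setg(errp, "cannot attach blk '%s' to device '%s': "
                   "it is already attached to device '%s'",
                   blk_name(blk), object_get_typename(OBJECT(dev)),
                   object_get_typename(OBJECT(blk->dev)));
        return -EBUSY;
    }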




Re: sysbus_create_simple Vs qdev_create

2020-07-16 Thread Markus Armbruster
Eduardo Habkost  writes:

> I'd also note that the use of "parent" in the code is also
> ambiguous.  It can mean:
>
> * QOM parent type, i.e. TypeInfo.parent.  Related fields:
>   * parent_class members of class structs
>   * parent_obj members of object structs

I hate the use of "parent" and "child" for a super- / subtype relation.

Correcting the terminology there would be short term pain for long term
gain.  Worthwhile?

> * QOM composition tree parent object, i.e. Object::parent
> * qdev device parent bus, i.e. DeviceState::parent_bus
> * parent device of qdev bus, i.e. BusState::parent

These are tree relations.  Use of "parent" and "child" is perfectly
fine.




Re: [PATCH] net: check payload length limit for all frames

2020-07-16 Thread P J P
  Hello Jason, all

+-- On Fri, 17 Jul 2020, Jason Wang wrote --+
| On 2020/7/17 9:21 AM, Alexander Bulekov wrote:
| > On 200717 0853, Li Qiang wrote:
| >> Which issue are you trying to solve, any reference linking?
| >> I also send a patch related this part and also a UAF.
| >
| > I reported a UAF privately to QEMU-Security in May. I believe the one Li
| > is referring to is this one https://bugs.launchpad.net/qemu/+bug/1886362
| >
| > When I saw Prasad's email, I was worried that I reported the same bug
| > twice, but I can still reproduce LP#1886362 with Prasad's patch.
| >
| > On the other hand, I cannot reproduce either issue with Li's patch:
| > Message-Id: <20200716161453.61295-1-liq...@163.com>
| >
| > Based on this, I think there were two distinct issues.

  Yes, LP#1886362 looks different from the one fixed here.

| Could you describe the issue you saw in details? (E.g the calltrace?) The
| commit log does not explain where we can get OOB or UAF.

I should've included the backtrace in the commit message. It crossed my mind 
after I sent the patch. Sorry.

===
==1040323==ERROR: AddressSanitizer: heap-use-after-free on address 0x606344d8 
at pc 0x5571b8fb9ce7 bp 0x7ffede5a2290 sp 0x7ffede5a2280
READ of size 8 at 0x606344d8 thread T0
#0 0x5571b8fb9ce6 in e1000e_write_packet_to_guest hw/net/e1000e_core.c:1587
#1 0x5571b8fba8fc in e1000e_receive_iov hw/net/e1000e_core.c:1709
#2 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213
#3 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544
#4 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620
#5 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633
#6 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664
#7 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743
#8 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934
#9 0x5571b8fbe2ad in e1000e_set_tdt hw/net/e1000e_core.c:2451
#10 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265
#11 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109
...
0x606344d8 is located 24 bytes inside of 64-byte region 
[0x606344c0,0x60634500)
freed by thread T0 here:
#0 0x7f89e3b87307 in __interceptor_free (/lib64/libasan.so.6+0xb0307)
#1 0x7f89e37c7a6c in g_free (/lib64/libglib-2.0.so.0+0x58a6c)
#2 0x5571b8f95fc7 in net_rx_pkt_pull_data hw/net/net_rx_pkt.c:103
#3 0x5571b8f969b3 in net_rx_pkt_attach_iovec_ex hw/net/net_rx_pkt.c:158
#4 0x5571b8fba6bc in e1000e_receive_iov hw/net/e1000e_core.c:1695
#5 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213
#6 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544
#7 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620
#8 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633
#9 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664
#10 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743
#11 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934
#12 0x5571b8fbe011 in e1000e_set_tctl hw/net/e1000e_core.c:2431
#13 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265
#14 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109
...
previously allocated by thread T0 here:
#0 0x7f89e3b87667 in __interceptor_malloc (/lib64/libasan.so.6+0xb0667)
#1 0x7f89e37c7978 in g_malloc (/lib64/libglib-2.0.so.0+0x58978)
#2 0x5571b8f95fc7 in net_rx_pkt_pull_data hw/net/net_rx_pkt.c:103
#3 0x5571b8f969b3 in net_rx_pkt_attach_iovec_ex hw/net/net_rx_pkt.c:158
#4 0x5571b8fba6bc in e1000e_receive_iov hw/net/e1000e_core.c:1695
#5 0x5571b8f9cb08 in e1000e_nc_receive_iov hw/net/e1000e.c:213
#6 0x5571b8f90285 in net_tx_pkt_sendv hw/net/net_tx_pkt.c:544
#7 0x5571b8f90aaa in net_tx_pkt_send hw/net/net_tx_pkt.c:620
#8 0x5571b8f90b2e in net_tx_pkt_send_loopback hw/net/net_tx_pkt.c:633
#9 0x5571b8fb3c3b in e1000e_tx_pkt_send hw/net/e1000e_core.c:664
#10 0x5571b8fb43dd in e1000e_process_tx_desc hw/net/e1000e_core.c:743
#11 0x5571b8fb5b4f in e1000e_start_xmit hw/net/e1000e_core.c:934
#12 0x5571b8fbe2ad in e1000e_set_tdt hw/net/e1000e_core.c:2451
#13 0x5571b8fc01e0 in e1000e_core_write hw/net/e1000e_core.c:3265
#14 0x5571b8f9c387 in e1000e_mmio_write hw/net/e1000e.c:109
===
 
| >>> --- a/hw/net/net_tx_pkt.c
| >>> +++ b/hw/net/net_tx_pkt.c
| >>> @@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt,
| >>> NetClientState *nc)
| >>>* Since underlying infrastructure does not support IP datagrams
| >>>longer
| >>>* than 64K we should drop such packets and don't even try to send
| >>>*/
| >>> -if (VIRTIO_NET_HDR_GSO_NONE != pkt->virt_hdr.gso_type) {
| >>> -if (pkt->payload_len >
| >>> -ETH_MAX_IP_DGRAM_LEN -
| >>> -pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) {
| >>> -return false;
| >>> -}
| >>> +if 

Re: [PATCH] e1000e: using bottom half to send packets

2020-07-16 Thread Li Qiang
Jason Wang  wrote on Fri, 17 Jul 2020 at 11:10 AM:
>
>
> > On 2020/7/17 12:14 AM, Li Qiang wrote:
> > Alexander Bulekov reported a UAF bug related to e1000e packet send.
> >
> > -->https://bugs.launchpad.net/qemu/+bug/1886362
> >
> > This is because the guest triggers an e1000e packet send and sets the
> > data's address to e1000e's MMIO address. So when the e1000e does DMA
> > it will write the MMIO again and trigger re-entrancy, which finally
> > causes this UAF.
> >
> > Paolo suggested using a bottom half whenever MMIO is doing complicated
> > things in here:
> > -->https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg03342.html
> >
> > Reference here:
> > 'The easiest solution is to delay processing of descriptors to a bottom
> > half whenever MMIO is doing something complicated.  This is also better
> > for latency because it will free the vCPU thread more quickly and leave
> > the work to the I/O thread.'
>
>
> I think several things were missed in this patch (take virtio-net as a
> reference), do we need the following things:
>

Thanks Jason,
In fact I know this; I'm scared of touching this but I want to try.
Thanks for your advice.

> - Cancel the bh when VM is stopped.

OK. I think adding a vm state change notifier for e1000e can address this.

> - A throttle to prevent bh from executing too much timer?

OK, I think adding a config timeout and a timer in e1000e can address this.

> - A flag to record whether or not there is a pending tx (and migrate it?)

Is just a flag enough? Could you explain more about the idea behind
processing the virtio-net/e1000e using bh like this?
For example, if the guest triggers a lot of packet sends and the bh
is scheduled in the I/O thread, will we lose packets?
How do we avoid this in virtio-net?

Thanks,
Li Qiang



>
> Thanks
>
>
> >
> > This patch fixes this UAF.
> >
> > Signed-off-by: Li Qiang 
> > ---
> >   hw/net/e1000e_core.c | 25 +
> >   hw/net/e1000e_core.h |  2 ++
> >   2 files changed, 19 insertions(+), 8 deletions(-)
> >
> > diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
> > index bcd186cac5..6165b04b68 100644
> > --- a/hw/net/e1000e_core.c
> > +++ b/hw/net/e1000e_core.c
> > @@ -2423,32 +2423,27 @@ e1000e_set_dbal(E1000ECore *core, int index, 
> > uint32_t val)
> >   static void
> >   e1000e_set_tctl(E1000ECore *core, int index, uint32_t val)
> >   {
> > -E1000E_TxRing txr;
> >   core->mac[index] = val;
> >
> >   if (core->mac[TARC0] & E1000_TARC_ENABLE) {
> > -e1000e_tx_ring_init(core, &txr, 0);
> > -e1000e_start_xmit(core, &txr);
> > +qemu_bh_schedule(core->tx[0].tx_bh);
> >   }
> >
> >   if (core->mac[TARC1] & E1000_TARC_ENABLE) {
> > -e1000e_tx_ring_init(core, &txr, 1);
> > -e1000e_start_xmit(core, &txr);
> > +qemu_bh_schedule(core->tx[1].tx_bh);
> >   }
> >   }
> >
> >   static void
> >   e1000e_set_tdt(E1000ECore *core, int index, uint32_t val)
> >   {
> > -E1000E_TxRing txr;
> >   int qidx = e1000e_mq_queue_idx(TDT, index);
> >   uint32_t tarc_reg = (qidx == 0) ? TARC0 : TARC1;
> >
> >   core->mac[index] = val & 0xffff;
> >
> >   if (core->mac[tarc_reg] & E1000_TARC_ENABLE) {
> > -e1000e_tx_ring_init(core, &txr, qidx);
> > -e1000e_start_xmit(core, &txr);
> > +qemu_bh_schedule(core->tx[qidx].tx_bh);
> >   }
> >   }
> >
> > @@ -3322,6 +3317,16 @@ e1000e_vm_state_change(void *opaque, int running, 
> > RunState state)
> >   }
> >   }
> >
> > +static void e1000e_core_tx_bh(void *opaque)
> > +{
> > +struct e1000e_tx *tx = opaque;
> > +E1000ECore *core = tx->core;
> > +E1000E_TxRing txr;
> > +
> > +e1000e_tx_ring_init(core, &txr, tx - &core->tx[0]);
> > +e1000e_start_xmit(core, &txr);
> > +}
> > +
> >   void
> >   e1000e_core_pci_realize(E1000ECore *core,
> >   const uint16_t *eeprom_templ,
> > @@ -3340,6 +3345,8 @@ e1000e_core_pci_realize(E1000ECore *core,
> >   for (i = 0; i < E1000E_NUM_QUEUES; i++) {
> >   net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner,
> >   E1000E_MAX_TX_FRAGS, core->has_vnet);
> > +core->tx[i].core = core;
> > +core->tx[i].tx_bh = qemu_bh_new(e1000e_core_tx_bh, &core->tx[i]);
> >   }
> >
> >   net_rx_pkt_init(&core->rx_pkt, core->has_vnet);
> > @@ -3367,6 +3374,8 @@ e1000e_core_pci_uninit(E1000ECore *core)
> >   for (i = 0; i < E1000E_NUM_QUEUES; i++) {
> >   net_tx_pkt_reset(core->tx[i].tx_pkt);
> >   net_tx_pkt_uninit(core->tx[i].tx_pkt);
> > +qemu_bh_delete(core->tx[i].tx_bh);
> > +core->tx[i].tx_bh = NULL;
> >   }
> >
> >   net_rx_pkt_uninit(core->rx_pkt);
> > diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h
> > index aee32f7e48..94ddc6afc2 100644
> > --- a/hw/net/e1000e_core.h
> > +++ b/hw/net/e1000e_core.h
> > @@ -77,6 +77,8 @@ struct E1000Core {
> >   unsigned char sum_needed;
> >   bool cptse;
> >   struct NetTxPkt *tx_pkt;
> > +QEMUBH *tx_bh;
> > +  

[PATCH] Fix vhost-user buffer over-read on ram hot-unplug

2020-07-16 Thread Raphael Norwitz
The VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS vhost-user protocol
feature introduced a shadow-table, used by the backend to dynamically
determine how a vdev's memory regions have changed since the last
vhost_user_set_mem_table() call. On hot-remove, a memmove() operation
is used to overwrite the removed shadow region descriptor(s). The size
parameter of this memmove was off by 1 such that if a VM with a backend
supporting the VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS filled it's
shadow-table (by performing the maximum number of supported hot-add
operatons) and attempted to remove the last region, Qemu would read an
out of bounds value and potentially crash.

This change fixes the memmove() bounds such that this erroneous read can
never happen.

Signed-off-by: Peter Turschmid 
Signed-off-by: Raphael Norwitz 
---
 hw/virtio/vhost-user.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 3123121..d7e2423 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -672,7 +672,7 @@ static int send_remove_regions(struct vhost_dev *dev,
 memmove(&u->shadow_regions[shadow_reg_idx],
 &u->shadow_regions[shadow_reg_idx + 1],
 sizeof(struct vhost_memory_region) *
-(u->num_shadow_regions - shadow_reg_idx));
+(u->num_shadow_regions - shadow_reg_idx - 1));
 u->num_shadow_regions--;
 }
 
-- 
1.8.3.1
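
To make the arithmetic explicit, the general removal pattern looks like this
(a standalone sketch reusing the names from the patch, not code taken from
vhost-user.c):

/* Removing element 'idx' from an array of 'n' elements must move the
 * (n - idx - 1) elements that follow it, not (n - idx).  With
 * idx == n - 1 the old bound copied one element from past the end of
 * the table, which is exactly the out-of-bounds read described above. */
static void remove_shadow_region(struct vhost_memory_region *regs,
                                 unsigned int *n, unsigned int idx)
{
    memmove(&regs[idx], &regs[idx + 1],
            sizeof(regs[0]) * (*n - idx - 1));
    (*n)--;
}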




Re: [virtio-dev] [RFC for Linux v4 0/2] virtio_balloon: Add VIRTIO_BALLOON_F_CONT_PAGES to report continuous pages

2020-07-16 Thread teawater



> On 16 Jul 2020, at 18:45, Michael S. Tsirkin  wrote:
> 
> On Thu, Jul 16, 2020 at 03:01:18PM +0800, teawater wrote:
>> 
>> 
>>> On 16 Jul 2020, at 14:38, Michael S. Tsirkin  wrote:
>>> 
>>> On Thu, Jul 16, 2020 at 10:41:50AM +0800, Hui Zhu wrote:
 The first, second and third version are in [1], [2] and [3].
 Code of current version for Linux and qemu is available in [4] and [5].
 Update of this version:
 1. Report continuous pages will increase the speed.  So added deflate
  continuous pages.
 2. According to the comments from David in [6], added 2 new vqs 
 inflate_cont_vq
  and deflate_cont_vq to report continuous pages with format 32 bits pfn 
 and 32
  bits size.
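
(As an aside, going only by the "32 bits pfn and 32 bits size" wording above,
one entry on those new queues would presumably look something like the struct
below; the real layout is defined by the Linux and QEMU patches, not here.)

struct virtio_balloon_cont_page {
    uint32_t pfn;    /* first page frame number of the continuous run */
    uint32_t size;   /* length of the run, in the unit the patch defines */
};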
 Following is the introduction of the function.
 These patches add VIRTIO_BALLOON_F_CONT_PAGES to virtio_balloon. With this
 flag, the balloon tries to use continuous pages to inflate and deflate.
 Enabling this flag can bring two benefits:
 1. Reporting continuous pages increases the amount of memory reported by
  each tell_host call.  This increases the speed of balloon inflate and
  deflate.
 2. Host THPs are split when QEMU releases the pages of a balloon inflate.
  Inflating the balloon with continuous pages lets QEMU release pages of
  the same THPs.  That helps decrease the number of split THPs in
  the host.
  Following is an example in a VM with 1G memory and 1 CPU.  This test sets
  up an environment that has a lot of fragmentation pages.  Then inflating
  the balloon will split the THPs.
>> 
>> 
 // This is the THP number before VM execution in the host.
 // None use THP.
 cat /proc/meminfo | grep AnonHugePages:
 AnonHugePages: 0 kB
>> These lines are from host.
>> 
 // After VM start, use usemem
 // (https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git)
 // punch-holes function generates 400m fragmentation pages in the guest
 // kernel.
 usemem --punch-holes -s -1 800m &
>> These lines are from guest.  They set up the environment that has a lot of 
>> fragmentation pages.
>> 
 // This is the THP number after this command in the host.
 // Some THP is used by VM because usemem will access 800M memory
 // in the guest.
 cat /proc/meminfo | grep AnonHugePages:
 AnonHugePages:911360 kB
>> These lines are from host.
>> 
 // Connect to the QEMU monitor, setup balloon, and set it size to 600M.
 (qemu) device_add virtio-balloon-pci,id=balloon1
 (qemu) info balloon
 balloon: actual=1024
 (qemu) balloon 600
 (qemu) info balloon
 balloon: actual=600
>> These lines are from host.
>> 
 // This is the THP number after inflate the balloon in the host.
 cat /proc/meminfo | grep AnonHugePages:
 AnonHugePages: 88064 kB
>> These lines are from host.
>> 
 // Set the size back to 1024M in the QEMU monitor.
 (qemu) balloon 1024
 (qemu) info balloon
 balloon: actual=1024
>> These lines are from host.
>> 
 // Use usemem to increase the memory usage of QEMU.
 killall usemem
 usemem 800m
>> These lines are from guest.
>> 
 // This is the THP number after this operation.
 cat /proc/meminfo | grep AnonHugePages:
 AnonHugePages: 65536 kB
>> These lines are from host.
>> 
>> 
>> 
 
 The following example is changed to use the continuous pages balloon.  The
 number of split THPs is decreased.
 // This is the THP number before VM execution in the host.
 // None use THP.
 cat /proc/meminfo | grep AnonHugePages:
 AnonHugePages: 0 kB
>> These lines are from host.
>> 
 // After VM start, use usemem punch-holes function generates 400M
 // fragmentation pages in the guest kernel.
 usemem --punch-holes -s -1 800m &
>> These lines are from guest.  They set up the environment that has a lot of 
>> fragmentation pages.
>> 
 // This is the THP number after this command in the host.
 // Some THP is used by VM because usemem will access 800M memory
 // in the guest.
 cat /proc/meminfo | grep AnonHugePages:
 AnonHugePages:911360 kB
>> These lines are from host.
>> 
 // Connect to the QEMU monitor, setup balloon, and set it size to 600M.
 (qemu) device_add virtio-balloon-pci,id=balloon1,cont-pages=on
 (qemu) info balloon
 balloon: actual=1024
 (qemu) balloon 600
 (qemu) info balloon
 balloon: actual=600
>> These lines are from host.
>> 
 // This is the THP number after inflate the balloon in the host.
 cat /proc/meminfo | grep AnonHugePages:
 AnonHugePages:616448 kB
 // Set the size back to 1024M in the QEMU monitor.
 (qemu) balloon 1024
 (qemu) info balloon
 balloon: actual=1024
>> These lines are from host.
>> 
 // Use usemem to increase the memory usage of QEMU.
 killall usemem
 usemem 800m
>> These lines are from guest.
>> 
 // This is the THP number after this operation.
 cat 

[PULL SUBSYSTEM qemu-pseries] pseries: Update SLOF firmware image

2020-07-16 Thread Alexey Kardashevskiy
The following changes since commit 1038a309ec829f05a3a3e52a9951cfdb24dfd02c:

  spapr: Add a new level of NUMA for GPUs (2020-07-17 10:36:28 +1000)

are available in the Git repository at:

  g...@github.com:aik/qemu.git tags/qemu-slof-20200717

for you to fetch changes up to 7f5258dd8327d574de455a2271788474fa25548d:

  pseries: Update SLOF firmware image (2020-07-17 13:23:00 +1000)


Alexey Kardashevskiy (1):
  pseries: Update SLOF firmware image

 pc-bios/README   |   2 +-
 pc-bios/slof.bin | Bin 965112 -> 968368 bytes
 roms/SLOF|   2 +-
 3 files changed, 2 insertions(+), 2 deletions(-)


*** Note: this is not for master, this is for pseries

This adds tcgbios (this was posted earlier [1] but got lost)
and fixes FDT update at ibm,client-architecture-support
for huge guests.

The full list of changes:

Alexey Kardashevskiy (4):
  make: Define default rule for .c when V=1 or V=2
  version: update to 20200513
  fdt: Avoid recursion when traversing tree
  version: update to 20200717

Gustavo Romero (1):
  board-qemu: Fix comment about SLOF start address

Stefan Berger (6):
  tcgbios: Only write logs for PCRs that are allocated
  tcgbios: Fix the vendorInfoSize to be of type uint8_t
  tcgbios: Add support for SHA3 type of algorithms
  elf: Implement elf_get_file_size to determine size of an ELF image
  tcgbios: Implement tpm_hash_log_extend_event_buffer
  tcgbios: Measure the bootloader file read from disk

[1] 
https://patchwork.ozlabs.org/project/qemu-devel/patch/20200513024355.121476-1-...@ozlabs.ru/




Re: [PATCH] net: check payload length limit for all frames

2020-07-16 Thread Jason Wang



On 2020/7/17 9:21 AM, Alexander Bulekov wrote:

On 200717 0853, Li Qiang wrote:

P J P  wrote on Fri, 17 Jul 2020 at 3:26 AM:

From: Prasad J Pandit 

While sending packets, the check that the packet 'payload_len'
is within the 64kB limit seems to happen only for GSO frames.
It may lead to use-after-free or out-of-bounds access issues
when sending non-GSO frames. Check the 'payload_len'
limit for all packets, irrespective of the gso type.


Hello Prasad,
Which issue are you trying to solve, any reference linking?

I also send a patch related this part and also a UAF.

Thanks,
Li Qiang

Hi Li, Prasad,
I reported a UAF privately to QEMU-Security in May. I believe the one Li
is referring to is this one https://bugs.launchpad.net/qemu/+bug/1886362

When I saw Prasad's email, I was worried that I reported the same bug
twice, but I can still reproduce LP#1886362 with Prasad's patch.

On the other hand, I cannot reproduce either issue with Li's patch:
Message-Id: <20200716161453.61295-1-liq...@163.com>

Based on this, I think there were two distinct issues. Both of the
crashes rely on e1000e tx loopback into e1000e MMIO. Since Li's
patch adds a TX bh, it seems to mitigate such types of issues.

Sorry about any confusion.
-Alex



Could you describe the issue you saw in details? (E.g the calltrace?) 
The commit log does not explain where we can get OOB or UAF.


Thanks





Reported-by: Alexander Bulekov 
Signed-off-by: Prasad J Pandit 
---
  hw/net/net_tx_pkt.c | 10 --
  1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 162f802dd7..e66998a8f9 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState 
*nc)
   * Since underlying infrastructure does not support IP datagrams longer
   * than 64K we should drop such packets and don't even try to send
   */
-if (VIRTIO_NET_HDR_GSO_NONE != pkt->virt_hdr.gso_type) {
-if (pkt->payload_len >
-ETH_MAX_IP_DGRAM_LEN -
-pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) {
-return false;
-}
+if (pkt->payload_len >
+ETH_MAX_IP_DGRAM_LEN -
+pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) {
+return false;
  }

  if (pkt->has_virt_hdr ||
--
2.26.2







Re: [PATCH] e1000e: using bottom half to send packets

2020-07-16 Thread Jason Wang



On 2020/7/17 12:14 AM, Li Qiang wrote:

Alexander Bulekov reported a UAF bug related to e1000e packet send.

-->https://bugs.launchpad.net/qemu/+bug/1886362

This is because the guest triggers an e1000e packet send and sets the
data's address to e1000e's MMIO address. So when the e1000e does DMA
it will write the MMIO again and trigger re-entrancy, which finally
causes this UAF.

Paolo suggested using a bottom half whenever MMIO is doing complicated
things in here:
-->https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg03342.html

Reference here:
'The easiest solution is to delay processing of descriptors to a bottom
half whenever MMIO is doing something complicated.  This is also better
for latency because it will free the vCPU thread more quickly and leave
the work to the I/O thread.'



I think several things were missed in this patch (take virtio-net as a 
reference), do we need the following things:


- Cancel the bh when VM is stopped.
- A throttle to prevent bh from executing too much timer?
- A flag to record whether or not there is a pending tx (and migrate it?)

Thanks




This patch fixes this UAF.

Signed-off-by: Li Qiang 
---
  hw/net/e1000e_core.c | 25 +
  hw/net/e1000e_core.h |  2 ++
  2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index bcd186cac5..6165b04b68 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -2423,32 +2423,27 @@ e1000e_set_dbal(E1000ECore *core, int index, uint32_t 
val)
  static void
  e1000e_set_tctl(E1000ECore *core, int index, uint32_t val)
  {
-E1000E_TxRing txr;
  core->mac[index] = val;
  
  if (core->mac[TARC0] & E1000_TARC_ENABLE) {

-e1000e_tx_ring_init(core, &txr, 0);
-e1000e_start_xmit(core, &txr);
+qemu_bh_schedule(core->tx[0].tx_bh);
  }
  
  if (core->mac[TARC1] & E1000_TARC_ENABLE) {

-e1000e_tx_ring_init(core, &txr, 1);
-e1000e_start_xmit(core, &txr);
+qemu_bh_schedule(core->tx[1].tx_bh);
  }
  }
  
  static void

  e1000e_set_tdt(E1000ECore *core, int index, uint32_t val)
  {
-E1000E_TxRing txr;
  int qidx = e1000e_mq_queue_idx(TDT, index);
  uint32_t tarc_reg = (qidx == 0) ? TARC0 : TARC1;
  
  core->mac[index] = val & 0xffff;
  
  if (core->mac[tarc_reg] & E1000_TARC_ENABLE) {

-e1000e_tx_ring_init(core, &txr, qidx);
-e1000e_start_xmit(core, &txr);
+qemu_bh_schedule(core->tx[qidx].tx_bh);
  }
  }
  
@@ -3322,6 +3317,16 @@ e1000e_vm_state_change(void *opaque, int running, RunState state)

  }
  }
  
+static void e1000e_core_tx_bh(void *opaque)

+{
+struct e1000e_tx *tx = opaque;
+E1000ECore *core = tx->core;
+E1000E_TxRing txr;
+
+e1000e_tx_ring_init(core, &txr, tx - &core->tx[0]);
+e1000e_start_xmit(core, &txr);
+}
+
  void
  e1000e_core_pci_realize(E1000ECore *core,
  const uint16_t *eeprom_templ,
@@ -3340,6 +3345,8 @@ e1000e_core_pci_realize(E1000ECore *core,
  for (i = 0; i < E1000E_NUM_QUEUES; i++) {
  net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner,
  E1000E_MAX_TX_FRAGS, core->has_vnet);
+core->tx[i].core = core;
+core->tx[i].tx_bh = qemu_bh_new(e1000e_core_tx_bh, &core->tx[i]);
  }
  
  net_rx_pkt_init(&core->rx_pkt, core->has_vnet);

@@ -3367,6 +3374,8 @@ e1000e_core_pci_uninit(E1000ECore *core)
  for (i = 0; i < E1000E_NUM_QUEUES; i++) {
  net_tx_pkt_reset(core->tx[i].tx_pkt);
  net_tx_pkt_uninit(core->tx[i].tx_pkt);
+qemu_bh_delete(core->tx[i].tx_bh);
+core->tx[i].tx_bh = NULL;
  }
  
  net_rx_pkt_uninit(core->rx_pkt);

diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h
index aee32f7e48..94ddc6afc2 100644
--- a/hw/net/e1000e_core.h
+++ b/hw/net/e1000e_core.h
@@ -77,6 +77,8 @@ struct E1000Core {
  unsigned char sum_needed;
  bool cptse;
  struct NetTxPkt *tx_pkt;
+QEMUBH *tx_bh;
+E1000ECore *core;
  } tx[E1000E_NUM_QUEUES];
  
  struct NetRxPkt *rx_pkt;





Re: [PATCH] net: check payload length limit for all frames

2020-07-16 Thread Alexander Bulekov
On 200717 0853, Li Qiang wrote:
> P J P  wrote on Fri, 17 Jul 2020 at 3:26 AM:
> >
> > From: Prasad J Pandit 
> >
> > While sending packets, the check that the packet 'payload_len'
> > is within the 64kB limit seems to happen only for GSO frames.
> > It may lead to use-after-free or out-of-bounds access issues
> > when sending non-GSO frames. Check the 'payload_len'
> > limit for all packets, irrespective of the gso type.
> >
> 
> Hello Prasad,
> Which issue are you trying to solve, any reference linking?
> 
> I also send a patch related this part and also a UAF.
> 
> Thanks,
> Li Qiang

Hi Li, Prasad,
I reported a UAF privately to QEMU-Security in May. I believe the one Li
is referring to is this one https://bugs.launchpad.net/qemu/+bug/1886362

When I saw Prasad's email, I was worried that I reported the same bug
twice, but I can still reproduce LP#1886362 with Prasad's patch.

On the other hand, I cannot reproduce either issue with Li's patch:
Message-Id: <20200716161453.61295-1-liq...@163.com>

Based on this, I think there were two distinct issues. Both of the
crashes rely on e1000e tx loopback into e1000e MMIO. Since Li's
patch adds a TX bh, it seems to mitigate such types of issues.

Sorry about any confusion.
-Alex

> > Reported-by: Alexander Bulekov 
> > Signed-off-by: Prasad J Pandit 
> > ---
> >  hw/net/net_tx_pkt.c | 10 --
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
> > index 162f802dd7..e66998a8f9 100644
> > --- a/hw/net/net_tx_pkt.c
> > +++ b/hw/net/net_tx_pkt.c
> > @@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, 
> > NetClientState *nc)
> >   * Since underlying infrastructure does not support IP datagrams longer
> >   * than 64K we should drop such packets and don't even try to send
> >   */
> > -if (VIRTIO_NET_HDR_GSO_NONE != pkt->virt_hdr.gso_type) {
> > -if (pkt->payload_len >
> > -ETH_MAX_IP_DGRAM_LEN -
> > -pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) {
> > -return false;
> > -}
> > +if (pkt->payload_len >
> > +ETH_MAX_IP_DGRAM_LEN -
> > +pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) {
> > +return false;
> >  }
> >
> >  if (pkt->has_virt_hdr ||
> > --
> > 2.26.2
> >
> >



[PATCH] usb: only build hcd-dwc2 host controller for RASPI target

2020-07-16 Thread Paul Zimmerman
The hcd-dwc2 host controller is currently built for all targets.
Since for now hcd-dwc2 is only implemented on RASPI, restrict its
build to that target only.

Signed-off-by: Paul Zimmerman 
---

Hi Gerd,

Do we want to apply this before the 5.1.0 release? It seems a waste
to build this code for every target when it's only used on one.
Sorry I didn't realize this earlier.

Thanks,
Paul

 hw/arm/Kconfig | 1 +
 hw/usb/Kconfig | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 4a224a6351..bc3a423940 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -315,6 +315,7 @@ config RASPI
 select FRAMEBUFFER
 select PL011 # UART
 select SDHCI
+select USB_DWC2
 
 config STM32F205_SOC
 bool
diff --git a/hw/usb/Kconfig b/hw/usb/Kconfig
index d4d8c37c28..5e63dc75f8 100644
--- a/hw/usb/Kconfig
+++ b/hw/usb/Kconfig
@@ -48,7 +48,6 @@ config USB_MUSB
 
 config USB_DWC2
 bool
-default y
 select USB
 
 config TUSB6010
-- 
2.17.1




Re: [PATCH] net: check payload length limit for all frames

2020-07-16 Thread Li Qiang
P J P  wrote on Fri, 17 Jul 2020 at 3:26 AM:
>
> From: Prasad J Pandit 
>
> While sending packets, the check that the packet 'payload_len'
> is within the 64kB limit seems to happen only for GSO frames.
> It may lead to use-after-free or out-of-bounds access issues
> when sending non-GSO frames. Check the 'payload_len'
> limit for all packets, irrespective of the gso type.
>

Hello Prasad,
Which issue are you trying to solve, any reference linking?

I also send a patch related this part and also a UAF.

Thanks,
Li Qiang

> Reported-by: Alexander Bulekov 
> Signed-off-by: Prasad J Pandit 
> ---
>  hw/net/net_tx_pkt.c | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
> index 162f802dd7..e66998a8f9 100644
> --- a/hw/net/net_tx_pkt.c
> +++ b/hw/net/net_tx_pkt.c
> @@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, 
> NetClientState *nc)
>   * Since underlying infrastructure does not support IP datagrams longer
>   * than 64K we should drop such packets and don't even try to send
>   */
> -if (VIRTIO_NET_HDR_GSO_NONE != pkt->virt_hdr.gso_type) {
> -if (pkt->payload_len >
> -ETH_MAX_IP_DGRAM_LEN -
> -pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) {
> -return false;
> -}
> +if (pkt->payload_len >
> +ETH_MAX_IP_DGRAM_LEN -
> +pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) {
> +return false;
>  }
>
>  if (pkt->has_virt_hdr ||
> --
> 2.26.2
>
>



Re: [PATCH] spapr_pci: Robustify support of PCI bridges

2020-07-16 Thread David Gibson
On Thu, Jul 16, 2020 at 04:57:54PM +0200, Greg Kurz wrote:
> On Thu, 16 Jul 2020 16:23:52 +0200
> Markus Armbruster  wrote:
> 
> > David Gibson  writes:
> > 
> > > On Thu, Jul 16, 2020 at 12:32:44PM +0200, Greg Kurz wrote:
> > >> On Thu, 16 Jul 2020 14:45:40 +1000
> > >> David Gibson  wrote:
> > >> 
> > >> > On Thu, Jul 09, 2020 at 07:12:47PM +0200, Greg Kurz wrote:
> > >> > > Some recent error handling cleanups unveiled issues with our support 
> > >> > > of
> > >> > > PCI bridges:
> > >> > > 
> > >> > > 1) QEMU aborts when using non-standard PCI bridge types,
> > >> > >unveiled by commit 7ef1553dac "spapr_pci: Drop some dead error 
> > >> > > handling"
> > >> > > 
> > >> > > $ qemu-system-ppc64 -M pseries -device pcie-pci-bridge
> > >> > > Unexpected error in object_property_find() at qom/object.c:1240:
> > >> > > qemu-system-ppc64: -device pcie-pci-bridge: Property '.chassis_nr' 
> > >> > > not found
> > >> > > Aborted (core dumped)
> > >> > 
> > >> > Oops, I thought we had a check that we actually had a "pci-bridge"
> > >> > device before continuing with the hotplug, but I guess not.
> > >> 
> > >> Ah... are you suggesting we should explicitly check the actual type
> > >> of the bridge rather than looking for the "chassis_nr" property ?
> > >
> > > Uh.. I thought about it, but I don't think it matters much which way
> > > we do it.
> > 
> > Would it make sense to add the "chassis_nr" property to *all* PCI
> > bridge devices?
> > 
> 
> I see that the "PCI Express to PCI/PCI-X Bridge Specification" mentions
> a "Chassis Number Register" which looks very similar to the what exists
> in standard PCI-to-PCI brdiges. This doesn't seem to be implemented in
> our "pcie-pci-bridge" device model though, but of course I have no idea
> why :)

We could consider it, but I don't think there's a lot to be gained by
it at this stage.  I don't think there's really any reason to want to
use bridges other than plain "pci-bridge" on the pseries machine.

PCI is a bit weird on pseries, since it's explicitly paravirt.
Although you can use extended config space, and thereby PCI-E devices
on it, the topology really looks pretty much identical to vanilla
PCI.  So, I don't think there's any reason to use PCI-E bridges on
pseries.

Other than PCI-E bridges of various sorts, a quick scan suggests all
the other bridge types in qemu are weird variants that are mostly
specific to some particular platform.  I don't see any reason we'd
want those on pseries either.

> Maybe Michael or Marcel (cc'd) can share some thoughts about that ?

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH] spapr_pci: Robustify support of PCI bridges

2020-07-16 Thread David Gibson
On Thu, Jul 16, 2020 at 04:42:00PM +0200, Greg Kurz wrote:
> On Thu, 16 Jul 2020 16:01:18 +0200
> Markus Armbruster  wrote:
> 
> > David Gibson  writes:
> > 
> > > On Thu, Jul 09, 2020 at 07:12:47PM +0200, Greg Kurz wrote:
> > >> Some recent error handling cleanups unveiled issues with our support of
> > >> PCI bridges:
> > >> 
> > >> 1) QEMU aborts when using non-standard PCI bridge types,
> > >>unveiled by commit 7ef1553dac "spapr_pci: Drop some dead error 
> > >> handling"
> > >> 
> > >> $ qemu-system-ppc64 -M pseries -device pcie-pci-bridge
> > >> Unexpected error in object_property_find() at qom/object.c:1240:
> > >> qemu-system-ppc64: -device pcie-pci-bridge: Property '.chassis_nr' not 
> > >> found
> > >> Aborted (core dumped)
> > >
> > > Oops, I thought we had a check that we actually had a "pci-bridge"
> > > device before continuing with the hotplug, but I guess not.
> > >
> > >> This happens because we assume all PCI bridge types to have a 
> > >> "chassis_nr"
> > >> property. This property only exists with the standard PCI bridge type
> > >> "pci-bridge" actually. We could possibly revert 7ef1553dac but it seems
> > >> much simpler to check the presence of "chassis_nr" earlier.
> > >
> > > Hrm, right, 7ef1553dac was not really correct since add_drcs() really
> > > can fail.
> > 
> > Right.  I failed to see that we can run into a bridge without a
> > "chassis_nr" here.

And I missed it on review, as well.

> > >> 2) QEMU abort if same "chassis_nr" value is used several times,
> > >>unveiled by commit d2623129a7de "qom: Drop parameter @errp of
> > >>object_property_add() & friends"
> > >> 
> > >> $ qemu-system-ppc64 -M pseries -device pci-bridge,chassis_nr=1 \
> > >> -device pci-bridge,chassis_nr=1
> > >> Unexpected error in object_property_try_add() at qom/object.c:1167:
> > >> qemu-system-ppc64: -device pci-bridge,chassis_nr=1: attempt to add 
> > >> duplicate property '4100' to object (type 'container')
> > >> Aborted (core dumped)
> > 
> > Before d2623129a7de, the error got *ignored* in
> > spapr_dr_connector_new():
> > 
> > SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type,
> >  uint32_t id)
> > {
> > SpaprDrc *drc = SPAPR_DR_CONNECTOR(object_new(type));
> > char *prop_name;
> > 
> > drc->id = id;
> > drc->owner = owner;
> > prop_name = g_strdup_printf("dr-connector[%"PRIu32"]",
> > spapr_drc_index(drc));
> > object_property_add_child(owner, prop_name, OBJECT(drc),
> > &error_abort);
> > object_unref(OBJECT(drc));
> > --->object_property_set_bool(OBJECT(drc), true, "realized", NULL);
> > g_free(prop_name);
> > 
> > return drc;
> > }
> > 
> > I doubt that's healthy.

Indeed.

> This isn't. The object_property_set_bool() was later converted to
> qdev_realize() (thanks again for the cleanups!) but the problem
> remains. Realize can fail and I see no reason we don't do proper
> error handling when it comes to the DRCs.
> 
> I'll look into fixing that.
> 
> > >> This happens because we assume that "chassis_nr" values are unique, but
> > >> nobody enforces that and we end up generating duplicate DRC ids. The PCI
> > >> code doesn't really care for duplicate "chassis_nr" properties since it
> > >> is only used to initialize the "Chassis Number Register" of the bridge,
> > >> with no functional impact on QEMU. So, even if passing the same value
> > >> several times might look weird, it never broke anything before, so
> > >> I guess we don't necessarily want to enforce strict checking in the PCI
> > >> code now.
> > >
> > > Yeah, I guess.  I'm pretty sure that the chassis number of bridges is
> > > supposed to be system-unique (well, unique within the PCI domain at
> > > least, I guess) as part of the hardware spec.  So specifying multiple
> > > chassis ids the same is a user error, but we need a better failure
> > > mode.
> > >
> > >> Workaround both issues in the PAPR code: check that the bridge has a
> > >> unique and non null "chassis_nr" when plugging it into its parent bus.
> > >>
> > >> Fixes: 05929a6c5dfe ("spapr: Don't use bus number for building DRC ids")
> > >
> > > Arguably, it's really fixing 7ef1553dac.
> > 
> > I agree 7ef1553dac broke the "use a bridge that doesn't have property
> > 'chassis_nr' case.
> > 
> > I suspect the "duplicate chassis_nr" case has always been broken, and
> > d2623129a7de merely uncovered it.
> 
> Yes.

I agree.

> > If we can trigger the abort with hot-plug, then d2623129a7de made things
> > materially worse (new way to accidentally kill your guest and maybe lose
> > data), and I'd add a Fixes: blaming it.
> > 
> 
> Yes it does.
> 
> David,
> 
> Maybe consider folding a third Fixes: tag into this patch ?

Done.

> > >> Reported-by: Thomas Huth 
> > >> Signed-off-by: Greg Kurz 
> > >
> > > I had a few misgivings about the details of this, but I 

Re: [PATCH v4] spapr: Add a new level of NUMA for GPUs

2020-07-16 Thread David Gibson
On Thu, Jul 16, 2020 at 05:56:55PM -0500, Reza Arbab wrote:
> NUMA nodes corresponding to GPU memory currently have the same
> affinity/distance as normal memory nodes. Add a third NUMA associativity
> reference point enabling us to give GPU nodes more distance.
> 
> This is guest visible information, which shouldn't change under a
> running guest across migration between different qemu versions, so make
> the change effective only in new (pseries > 5.0) machine types.
> 
> Before, `numactl -H` output in a guest with 4 GPUs (nodes 2-5):
> 
> node distances:
> node   0   1   2   3   4   5
>   0:  10  40  40  40  40  40
>   1:  40  10  40  40  40  40
>   2:  40  40  10  40  40  40
>   3:  40  40  40  10  40  40
>   4:  40  40  40  40  10  40
>   5:  40  40  40  40  40  10
> 
> After:
> 
> node distances:
> node   0   1   2   3   4   5
>   0:  10  40  80  80  80  80
>   1:  40  10  80  80  80  80
>   2:  80  80  10  80  80  80
>   3:  80  80  80  10  80  80
>   4:  80  80  80  80  10  80
>   5:  80  80  80  80  80  10
> 
> These are the same distances as on the host, mirroring the change made
> to host firmware in skiboot commit f845a648b8cb ("numa/associativity:
> Add a new level of NUMA for GPU's").

Applied to ppc-for-5.1.

> 
> Signed-off-by: Reza Arbab 
> ---
> v4:
> * Use nvslot->numa_id for distinction at all levels of ibm,associativity
> * Use ARRAY_SIZE(refpoints)
> * Rebase
> 
> v3:
> * Squash into one patch
> * Add PHB compat property
> ---
>  hw/ppc/spapr.c  | 21 +++--
>  hw/ppc/spapr_pci.c  |  2 ++
>  hw/ppc/spapr_pci_nvlink2.c  | 13 ++---
>  include/hw/pci-host/spapr.h |  1 +
>  include/hw/ppc/spapr.h  |  1 +
>  5 files changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 299908cc7396..0ae293ec9431 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -890,10 +890,16 @@ static int spapr_dt_rng(void *fdt)
>  static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt)
>  {
>  MachineState *ms = MACHINE(spapr);
> +SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(ms);
>  int rtas;
>  GString *hypertas = g_string_sized_new(256);
>  GString *qemu_hypertas = g_string_sized_new(256);
> -uint32_t refpoints[] = { cpu_to_be32(0x4), cpu_to_be32(0x4) };
> +uint32_t refpoints[] = {
> +cpu_to_be32(0x4),
> +cpu_to_be32(0x4),
> +cpu_to_be32(0x2),
> +};
> +uint32_t nr_refpoints = ARRAY_SIZE(refpoints);
>  uint64_t max_device_addr = MACHINE(spapr)->device_memory->base +
>  memory_region_size((spapr)->device_memory->mr);
>  uint32_t lrdr_capacity[] = {
> @@ -945,8 +951,12 @@ static void spapr_dt_rtas(SpaprMachineState *spapr, void 
> *fdt)
>   qemu_hypertas->str, qemu_hypertas->len));
>  g_string_free(qemu_hypertas, TRUE);
>  
> +if (smc->pre_5_1_assoc_refpoints) {
> +nr_refpoints = 2;
> +}
> +
>  _FDT(fdt_setprop(fdt, rtas, "ibm,associativity-reference-points",
> - refpoints, sizeof(refpoints)));
> + refpoints, nr_refpoints * sizeof(refpoints[0])));
>  
>  _FDT(fdt_setprop(fdt, rtas, "ibm,max-associativity-domains",
>   maxdomains, sizeof(maxdomains)));
> @@ -4584,9 +4594,16 @@ DEFINE_SPAPR_MACHINE(5_1, "5.1", true);
>   */
>  static void spapr_machine_5_0_class_options(MachineClass *mc)
>  {
> +SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> +static GlobalProperty compat[] = {
> +{ TYPE_SPAPR_PCI_HOST_BRIDGE, "pre-5.1-associativity", "on" },
> +};
> +
>  spapr_machine_5_1_class_options(mc);
>  compat_props_add(mc->compat_props, hw_compat_5_0, hw_compat_5_0_len);
> +compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
>  mc->numa_mem_supported = true;
> +smc->pre_5_1_assoc_refpoints = true;
>  }
>  
>  DEFINE_SPAPR_MACHINE(5_0, "5.0", false);
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 2a6a48744aaa..16739334e35f 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -2035,6 +2035,8 @@ static Property spapr_phb_properties[] = {
>   pcie_ecs, true),
>  DEFINE_PROP_UINT64("gpa", SpaprPhbState, nv2_gpa_win_addr, 0),
>  DEFINE_PROP_UINT64("atsd", SpaprPhbState, nv2_atsd_win_addr, 0),
> +DEFINE_PROP_BOOL("pre-5.1-associativity", SpaprPhbState,
> + pre_5_1_assoc, false),
>  DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/hw/ppc/spapr_pci_nvlink2.c b/hw/ppc/spapr_pci_nvlink2.c
> index dd8cd6db9654..76ae77ebc851 100644
> --- a/hw/ppc/spapr_pci_nvlink2.c
> +++ b/hw/ppc/spapr_pci_nvlink2.c
> @@ -362,9 +362,9 @@ void spapr_phb_nvgpu_ram_populate_dt(SpaprPhbState *sphb, 
> void *fdt)
>  &error_abort);
>  uint32_t associativity[] = {
>  cpu_to_be32(0x4),
> -SPAPR_GPU_NUMA_ID,
> -SPAPR_GPU_NUMA_ID,
> -

Re: [PATCH qemu v9] spapr: Implement Open Firmware client interface

2020-07-16 Thread David Gibson
On Thu, Jul 16, 2020 at 07:04:56PM +1000, Alexey Kardashevskiy wrote:
> Ping? I kinda realize it is not going to replace SLOF any time soon but
> still...

Yeah, I know.   I just haven't had time to consider it.  Priority
starvation.

> On 07/07/2020 10:34, Alexey Kardashevskiy wrote:
> > Ping?
> > 
> > 
> > On 24/06/2020 10:28, Alexey Kardashevskiy wrote:
> >> Ping?
> >>
> >> On 02/06/2020 21:40, Alexey Kardashevskiy wrote:
> >>> Ping?
> >>>
> >>> On 13/05/2020 13:58, Alexey Kardashevskiy wrote:
>  The PAPR platform describes an OS environment that's presented by
>  a combination of a hypervisor and firmware. The features it specifies
>  require collaboration between the firmware and the hypervisor.
> 
>  Since the beginning, the runtime component of the firmware (RTAS) has
>  been implemented as a 20-byte shim which simply forwards calls to
>  a hypercall implemented in qemu. The boot time firmware component is
>  SLOF - but a build that's specific to qemu, and has always needed to be
>  updated in sync with it. Even though we've managed to limit the amount
>  of runtime communication we need between qemu and SLOF, there's some,
>  and it has become increasingly awkward to handle as we've implemented
>  new features.
> 
>  This implements a boot time OF client interface (CI) which is
>  enabled by a new "x-vof" pseries machine option (stands for "Virtual Open
>  Firmware). When enabled, QEMU implements the custom H_OF_CLIENT hcall
>  which implements Open Firmware Client Interface (OF CI). This allows
>  using a smaller stateless firmware which does not have to manage
>  the device tree.
> 
>  The new "vof.bin" firmware image is included with source code under
>  pc-bios/. It also includes RTAS blob.
> 
>  This implements a handful of CI methods just to get -kernel/-initrd
>  working. In particular, this implements the device tree fetching and
>  simple memory allocator - "claim" (an OF CI memory allocator) and updates
>  "/memory@0/available" to report the client about available memory.
> 
>  This implements changing some device tree properties which we know how
>  to deal with; the rest is ignored. To allow changes, this skips
>  fdt_pack() when x-vof=on as not packing the blob leaves some room for
>  appending.
> 
>  In absence of SLOF, this assigns phandles to device tree nodes to make
>  device tree traversing work.
> 
>  When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.
> 
>  This adds basic instances support which are managed by a hash map
>  ihandle -> [phandle].
> 
>  Before the guest started, the used memory is:
>  0..4000 - the initial firmware
>  1..18 - stack
> 
>  This OF CI does not implement "interpret".
> 
>  With this basic support, this can only boot into kernel directly.
>  However this is just enough for the petitboot kernel and initramdisk to
>  boot from any possible source. Note this requires reasonably recent guest
>  kernel with:
>  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735
> 
>  Signed-off-by: Alexey Kardashevskiy 
>  ---
> 
> 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


[PATCH v4] spapr: Add a new level of NUMA for GPUs

2020-07-16 Thread Reza Arbab
NUMA nodes corresponding to GPU memory currently have the same
affinity/distance as normal memory nodes. Add a third NUMA associativity
reference point enabling us to give GPU nodes more distance.

This is guest visible information, which shouldn't change under a
running guest across migration between different qemu versions, so make
the change effective only in new (pseries > 5.0) machine types.

Before, `numactl -H` output in a guest with 4 GPUs (nodes 2-5):

node distances:
node   0   1   2   3   4   5
  0:  10  40  40  40  40  40
  1:  40  10  40  40  40  40
  2:  40  40  10  40  40  40
  3:  40  40  40  10  40  40
  4:  40  40  40  40  10  40
  5:  40  40  40  40  40  10

After:

node distances:
node   0   1   2   3   4   5
  0:  10  40  80  80  80  80
  1:  40  10  80  80  80  80
  2:  80  80  10  80  80  80
  3:  80  80  80  10  80  80
  4:  80  80  80  80  10  80
  5:  80  80  80  80  80  10

These are the same distances as on the host, mirroring the change made
to host firmware in skiboot commit f845a648b8cb ("numa/associativity:
Add a new level of NUMA for GPU's").

Signed-off-by: Reza Arbab 
---
v4:
* Use nvslot->numa_id for distinction at all levels of ibm,associativity
* Use ARRAY_SIZE(refpoints)
* Rebase

v3:
* Squash into one patch
* Add PHB compat property
---
 hw/ppc/spapr.c  | 21 +++--
 hw/ppc/spapr_pci.c  |  2 ++
 hw/ppc/spapr_pci_nvlink2.c  | 13 ++---
 include/hw/pci-host/spapr.h |  1 +
 include/hw/ppc/spapr.h  |  1 +
 5 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 299908cc7396..0ae293ec9431 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -890,10 +890,16 @@ static int spapr_dt_rng(void *fdt)
 static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt)
 {
 MachineState *ms = MACHINE(spapr);
+SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(ms);
 int rtas;
 GString *hypertas = g_string_sized_new(256);
 GString *qemu_hypertas = g_string_sized_new(256);
-uint32_t refpoints[] = { cpu_to_be32(0x4), cpu_to_be32(0x4) };
+uint32_t refpoints[] = {
+cpu_to_be32(0x4),
+cpu_to_be32(0x4),
+cpu_to_be32(0x2),
+};
+uint32_t nr_refpoints = ARRAY_SIZE(refpoints);
 uint64_t max_device_addr = MACHINE(spapr)->device_memory->base +
 memory_region_size((spapr)->device_memory->mr);
 uint32_t lrdr_capacity[] = {
@@ -945,8 +951,12 @@ static void spapr_dt_rtas(SpaprMachineState *spapr, void 
*fdt)
  qemu_hypertas->str, qemu_hypertas->len));
 g_string_free(qemu_hypertas, TRUE);
 
+if (smc->pre_5_1_assoc_refpoints) {
+nr_refpoints = 2;
+}
+
 _FDT(fdt_setprop(fdt, rtas, "ibm,associativity-reference-points",
- refpoints, sizeof(refpoints)));
+ refpoints, nr_refpoints * sizeof(refpoints[0])));
 
 _FDT(fdt_setprop(fdt, rtas, "ibm,max-associativity-domains",
  maxdomains, sizeof(maxdomains)));
@@ -4584,9 +4594,16 @@ DEFINE_SPAPR_MACHINE(5_1, "5.1", true);
  */
 static void spapr_machine_5_0_class_options(MachineClass *mc)
 {
+SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+static GlobalProperty compat[] = {
+{ TYPE_SPAPR_PCI_HOST_BRIDGE, "pre-5.1-associativity", "on" },
+};
+
 spapr_machine_5_1_class_options(mc);
 compat_props_add(mc->compat_props, hw_compat_5_0, hw_compat_5_0_len);
+compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
 mc->numa_mem_supported = true;
+smc->pre_5_1_assoc_refpoints = true;
 }
 
 DEFINE_SPAPR_MACHINE(5_0, "5.0", false);
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 2a6a48744aaa..16739334e35f 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -2035,6 +2035,8 @@ static Property spapr_phb_properties[] = {
  pcie_ecs, true),
 DEFINE_PROP_UINT64("gpa", SpaprPhbState, nv2_gpa_win_addr, 0),
 DEFINE_PROP_UINT64("atsd", SpaprPhbState, nv2_atsd_win_addr, 0),
+DEFINE_PROP_BOOL("pre-5.1-associativity", SpaprPhbState,
+ pre_5_1_assoc, false),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/ppc/spapr_pci_nvlink2.c b/hw/ppc/spapr_pci_nvlink2.c
index dd8cd6db9654..76ae77ebc851 100644
--- a/hw/ppc/spapr_pci_nvlink2.c
+++ b/hw/ppc/spapr_pci_nvlink2.c
@@ -362,9 +362,9 @@ void spapr_phb_nvgpu_ram_populate_dt(SpaprPhbState *sphb, 
void *fdt)
 &error_abort);
 uint32_t associativity[] = {
 cpu_to_be32(0x4),
-SPAPR_GPU_NUMA_ID,
-SPAPR_GPU_NUMA_ID,
-SPAPR_GPU_NUMA_ID,
+cpu_to_be32(nvslot->numa_id),
+cpu_to_be32(nvslot->numa_id),
+cpu_to_be32(nvslot->numa_id),
 cpu_to_be32(nvslot->numa_id)
 };
 uint64_t size = object_property_get_uint(nv_mrobj, "size", NULL);
@@ -375,6 +375,13 @@ void spapr_phb_nvgpu_ram_populate_dt(SpaprPhbState 

Re: TB Cache size grows out of control with qemu 5.0

2020-07-16 Thread BALATON Zoltan

On Thu, 16 Jul 2020, Alex Bennée wrote:

Christian Ehrhardt  writes:

On Wed, Jul 15, 2020 at 5:58 PM BALATON Zoltan  wrote:

See commit 47a2def4533a2807e48954abd50b32ecb1aaf29a and the next two
following it.


Thank you Zoltan for pointing out this commit; I agree that this seems to be
the trigger for the issues I'm seeing. Unfortunately, the common CI host size
is 1-2G - for example, 1.5G on Ubuntu Autopkgtests.
Those of them that run guests do so with 0.5-1G guest sizes in TCG mode
(as they often can't rely on having KVM available).

The 1G TB buffer + 0.5G actual guest size + lack of dynamic downsizing
under memory pressure (which never existed) makes these systems OOM-kill
the qemu process.


Ooops. I admit the assumption was that most people running system
emulation would be doing it on beefier machines.


The patches indicated that the TB flushes on a full guest boot are a
good indicator of the TB size efficiency. From my old checks I had:

- Qemu 4.2 512M guest with 32M default overwritten by ram-size/4
TB flush count  14, 14, 16
- Qemu 5.0 512M guest with 1G default
TB flush count  1, 1, 1

I agree that ram/4 seems odd; especially on huge guests that is potentially
a lot of wasted memory. And while most environments have a bit of breathing
room, 1G is too big on small host systems, and the common CI system falls
into this category. So I tuned it down to 256M for a test.

- Qemu 4.2 512M guest with tb-size 256M
TB flush count  5, 5, 5
- Qemu 5.0 512M guest with tb-size 256M
TB flush count  5, 5, 5
- Qemu 5.0 512M guest with 256M default in code
TB flush count  5, 5, 5

So performance-wise the results are as much in between as you'd expect from a
TB size in between. And the memory consumption, which (for me) is the actual
current issue to fix, would be back in line again as expected.


So I'm glad you have the workaround.



So on one hand I'm suggesting something like:
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -944,7 +944,7 @@ static void page_lock_pair(PageDesc **re
  * Users running large scale system emulation may want to tweak their
  * runtime setup via the tb-size control on the command line.
  */
-#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (256 * MiB)


The problem we have is that any number we pick here is arbitrary. And while
it did regress your use case, changing it again just pushes a performance
regression onto someone else. Most (*) 64-bit desktop PCs have 16GB of RAM,
and almost all have more than 8GB. And there is a workaround.

* a rough number from Steam's HW survey.


 #endif
 #endif

OTOH I understand someone else might want the speedier 1G, especially for
large guests. If someone used to run a 4G guest in TCG, the TB size was 1G
all along.
How about picking the smaller of (1G || ram-size/4) as the default?

This might then look like:
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -956,7 +956,12 @@ static inline size_t size_code_gen_buffe
 {
 /* Size the buffer.  */
 if (tb_size == 0) {
-tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
+unsigned long max_default = (unsigned long)(ram_size / 4);
+if (max_default < DEFAULT_CODE_GEN_BUFFER_SIZE) {
+tb_size = max_default;
+} else {
+   tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
+}
 }
 if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
 tb_size = MIN_CODE_GEN_BUFFER_SIZE;

This is a bit more tricky than it seems, as ram_size is no longer present in
that context, but it is enough to discuss the idea.
Wouldn't that serve all cases - small and large - better than a pure
static default of 1G or always ram/4?


I'm definitely against re-introducing ram_size into the mix. The
original commit (a1b18df9a4) that broke this introduced an ordering
dependency which we don't want to bring back.

I'd be more amenable to something that took into account host memory and
limited the default if it was smaller than a threshold. Is there a way
to probe that that doesn't involve slurping /proc/meminfo?


I agree that this should depend on the host memory size, not the guest
ram_size, but it might be tricky to get that value because different host
OSes would need different ways. Maybe a new qemu_host_mem_size portability
function will be needed that implements this for different host OSes.
POSIX may or may not have sysconf(_SC_PHYS_PAGES) and sysconf(_SC_AVPHYS_PAGES),
and Linux has sysinfo(), but I don't know how reliable these are.
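
For illustration, a minimal sketch of such a probe for POSIX-like hosts. The
helper name follows the qemu_host_mem_size suggestion above; the 1/8-of-host-RAM
cap is purely a made-up example, and other host OSes would still need their own
fallbacks:

#include <stdint.h>
#include <stddef.h>
#include <unistd.h>

/* Best-effort guess of the host RAM size; returns 0 if unknown. */
static uint64_t qemu_host_mem_size(void)
{
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
    long pages = sysconf(_SC_PHYS_PAGES);
    long pagesize = sysconf(_SC_PAGESIZE);

    if (pages > 0 && pagesize > 0) {
        return (uint64_t)pages * (uint64_t)pagesize;
    }
#endif
    return 0;   /* unknown: the caller keeps the static default */
}

/* Possible use when sizing the TB cache: keep the static default, but cap
 * it to a fraction of host RAM so small CI hosts are not pushed into OOM. */
static size_t default_tb_cache_size(size_t static_default)
{
    uint64_t host_mem = qemu_host_mem_size();

    if (host_mem != 0 && host_mem / 8 < static_default) {
        return host_mem / 8;
    }
    return static_default;
}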


Regards,
BALATON Zoltan

Re: [GIT PULL] I2C updates

2020-07-16 Thread Corey Minyard
On Thu, Jul 16, 2020 at 09:45:41PM +0100, Peter Maydell wrote:
> On Thu, 16 Jul 2020 at 18:49, Corey Minyard  wrote:
> >
> > The following changes since commit 45db94cc90c286a9965a285ba19450f448760a09:
> >
> >   Merge remote-tracking branch 
> > 'remotes/mcayland/tags/qemu-openbios-20200707' into staging (2020-07-10 
> > 16:43:40 +0100)
> >
> > are available in the Git repository at:
> >
> >   https://github.com/cminyard/qemu.git tags/for-qemu-i2c-5
> >
> > for you to fetch changes up to 73d5f22ecbb76dfc785876779d47787084ff0f42:
> >
> >   hw/i2c: Document the I2C qdev helpers (2020-07-16 12:30:54 -0500)
> >
> > 
> > Minor changes to:
> >
> > Add an SMBus config entry
> >
> > Cleanup/simplify/document some I2C interfaces
> >
> > 
> > Philippe Mathieu-Daudé (6):
> >   hw/i2c/Kconfig: Add an entry for the SMBus
> >   hw/i2c/aspeed_i2c: Simplify aspeed_i2c_get_bus()
> >   hw/i2c: Rename i2c_try_create_slave() as i2c_slave_new()
> >   hw/i2c: Rename i2c_realize_and_unref() as 
> > i2c_slave_realize_and_unref()
> >   hw/i2c: Rename i2c_create_slave() as i2c_slave_create_simple()
> >   hw/i2c: Document the I2C qdev helpers
> 
> Hi; this failed to build on x86-64 Linux (incremental build):

Hmm, I did test this, and I just rebuilt, then rebased on the end of
master and rebuilt, without issue.

It looks like the smbus code is not being included, but I don't see how
that can be.

-corey

> 
>   LINKi386-softmmu/qemu-system-i386
> ../hw/i2c/smbus_eeprom.o: In function `smbus_eeprom_vmstate_needed':
> /home/petmay01/linaro/qemu-for-merges/hw/i2c/smbus_eeprom.c:94:
> undefined reference to `smbus_vmstate_needed'
> ../hw/i2c/smbus_eeprom.o:(.data.rel+0x50): undefined reference to
> `vmstate_smbus_device'
> ../hw/i2c/pm_smbus.o: In function `smb_transaction':
> /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:93: undefined
> reference to `smbus_quick_command'
> /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:97: undefined
> reference to `smbus_receive_byte'
> /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:100: undefined
> reference to `smbus_send_byte'
> /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:105: undefined
> reference to `smbus_read_byte'
> /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:108: undefined
> reference to `smbus_write_byte'
> /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:114: undefined
> reference to `smbus_read_word'
> /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:117: undefined
> reference to `smbus_write_word'
> /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:149: undefined
> reference to `smbus_read_block'
> /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:174: undefined
> reference to `smbus_write_block'
> ../hw/i2c/pm_smbus.o: In function `smb_ioport_writeb':
> /home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:290: undefined
> reference to `smbus_write_block'
> ../hw/ipmi/smbus_ipmi.o:(.data.rel+0x50): undefined reference to
> `vmstate_smbus_device'
> collect2: error: ld returned 1 exit status
> 
> (similarly for other qemu-system-* binary links)
> 
> thanks
> -- PMM



Re: sysbus_create_simple Vs qdev_create

2020-07-16 Thread Eduardo Habkost
On Wed, Jul 15, 2020 at 04:37:18PM +0200, Markus Armbruster wrote:
> Pratik Parvati  writes:
> 
> > Hi Markus and Philippe,
> >
> > Thanks for your reply. Now I am pretty clear about Qdev and sysbus helper
> > function.
> >
> > Can you please explain to me in brief on buses and device hierarchies (i.e.
> > BusState and DeviceState) and how they are related to each other? As I can
> > see, the DeviceState class inherits the BusState
> >
> > struct DeviceState {
> > /*< private >*/
> > Object parent_obj;
> > /*< public >*/
> >
> > const char *id;
> > char *canonical_path;
> > bool realized;
> > bool pending_deleted_event;
> > QemuOpts *opts;
> > int hotplugged;
> > bool allow_unplug_during_migration;
> > BusState *parent_bus; \\ BusState is inherited here
> > QLIST_HEAD(, NamedGPIOList) gpios;
> > QLIST_HEAD(, BusState) child_bus;
> > int num_child_bus;
> > int instance_id_alias;
> > int alias_required_for_version;
> > ResettableState reset;
> > };
> >
> > and BusState, in turn, inherits the DeviceState as
> >
> > /**
> >  * BusState:
> >  * @hotplug_handler: link to a hotplug handler associated with bus.
> >  * @reset: ResettableState for the bus; handled by Resettable interface.
> >  */struct BusState {
> > Object obj;
> > DeviceState *parent; \\ DeviceState is inherited here
> > char *name;
> > HotplugHandler *hotplug_handler;
> > int max_index;
> > bool realized;
> > int num_children;
> > QTAILQ_HEAD(, BusChild) children;
> > QLIST_ENTRY(BusState) sibling;
> > ResettableState reset;
> > };
> >
> > I am a bit confused. Can you brief me this relation!
> 
> We sorely lack introductory documentation on both qdev and QOM.  I
> believe most developers are more or less confused about them most of the
> time.  I've done a bit of work on both, so I'm probably less confused
> than average.  I'm cc'ing maintainers in the hope of reducing average
> confusion among participants in this thread.
> 
> DeviceState does not inherit from BusState, and BusState does not
> inherit from DeviceState.  The relation you marked in the code is
> actually "has a".
> 
> A DeviceState may have a BusState, namely the bus the device is plugged
> into.  "May", because some devices are bus-less (their
> DEVICE_GET_CLASS(dev)->bus_type is null), and the others get plugged
> into their bus only at realize time.
> 
> Example: PCI device "pci-serial" plugs into a PCI bus.
> 
> Example: device "serial" does not plug into a bus (its used as component
> device of "pci-serial" and other devices).
> 
> Example: device "pc-dimm" does not plug into a bus.
> 
> A bus has a DeviceState, namely the device providing this bus.
> 
> Example: device "i440FX-pcihost" provides PCI bus "pci.0".
> 
> Both DeviceState and BusState are QOM subtypes of Object.  I prefer to
> avoid use of "inherit" for that, because it can mean different things to
> different people.

I'd also note that the use of "parent" in the code is also
ambiguous.  It can mean:

* QOM parent type, i.e. TypeInfo.parent.  Related fields:
  * parent_class members of class structs
  * parent_obj members of object structs
* QOM composition tree parent object, i.e. Object::parent
* qdev device parent bus, i.e. DeviceState::parent_bus
* parent device of a qdev bus, i.e. BusState::parent
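
For illustration, a minimal sketch of a hypothetical device showing where each
of these "parents" lives (the type name x-my-device is made up; the usual
QOM/qdev headers are assumed):

#include "qom/object.h"
#include "hw/qdev-core.h"

/* Hypothetical leaf device, for illustration only. */
typedef struct MyDeviceState {
    DeviceState parent_obj;        /* parent_obj member embeds the QOM parent type */
    uint32_t reg;
} MyDeviceState;

static const TypeInfo my_device_info = {
    .name          = "x-my-device",
    .parent        = TYPE_DEVICE,  /* TypeInfo.parent: the QOM parent type */
    .instance_size = sizeof(MyDeviceState),
};

/*
 * For a realized instance 'dev' of this type:
 *   - OBJECT(dev)->parent     is the QOM composition tree parent object,
 *   - dev->parent_bus         is the qdev bus the device is plugged into
 *                             (NULL for bus-less devices),
 *   - dev->parent_bus->parent is the device providing that bus.
 */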

-- 
Eduardo




Re: [PATCH 2/2] configure: add support for Control-Flow Integrity

2020-07-16 Thread Daniele Buono

On 7/2/2020 5:52 AM, Daniel P. Berrangé wrote:

The need to maintain this list of functions makes me feel very
uneasy.

How can we have any confidence that this list of functions is
accurate ? How will maintainers ensure that they correctly update
it as they are writing/changing code, and how will they test the
result ?


Hi Daniel,

I gave it some thought and studied more of clang's options. It is 
possible to disable cfi on specific functions by using an __attribute__ 
keyword, instead of providing a list in an external file.
In terms of maintenance, this is much better since we remove the
need to update the list. I would suggest defining a macro,
__disable_cfi__, that can be prepended to a function. The macro would
expand to nothing if CFI is disabled, or to the proper attribute if it
is enabled. Here's an example code snippet:


/* Disable CFI checks.
 * The callback function has been loaded from an external library so we
 * do not have type information */
__disable_cfi__
void qemu_plugin_vcpu_syscall_ret(CPUState *cpu, int64_t num, int64_t ret)
{
...
}
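
For reference, one possible definition of such a macro (a sketch only; it
assumes a hypothetical CONFIG_CFI define set by configure when CFI is enabled,
and clang's no_sanitize attribute):

/* Expands to clang's attribute when building with CFI, and to nothing
 * otherwise, so non-clang builds are unaffected. */
#ifdef CONFIG_CFI
#define __disable_cfi__ __attribute__((no_sanitize("cfi")))
#else
#define __disable_cfi__
#endif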

This would take care of renaming and removal of functions that need CFI disabled.
It would also probably be beneficial to the developers since they can 
see immediately that the function they are working with needs to have 
CFI checks disabled, and why.


If you think this is a better approach, I'll submit v2 with this 
approach instead of the external function list.



For new code, however, the best thing is proper education and testing.
I'll work on a document for docs/devel to detail what it is and how to 
make code cfi-safe.
A good approach should be to test code changes with CFI enabled. CFI is, 
after all, a sanitizer and therefore it makes sense to use it also 
during development, together with ASan, UBSan, and the like.
Unfortunately, since it works only with clang, I don't think this can 
ever be a hard requirement.


Daniele



hw-display-qxl.so: undefined symbol: qemu_qxl_io_log_semaphore

2020-07-16 Thread Cole Robinson
Hi Gerd,

I'm trying to build qemu 5.1.0-rc0 in Fedora. I'm hitting some issues.

Using this configure line:

./configure --prefix=/usr --libdir=/usr/lib64 --sysconfdir=/etc
--localstatedir=/var --libexecdir=/usr/libexec
--interp-prefix=/usr/qemu-%M --with-pkgversion=qemu-5.1.0-0.1.rc0.fc33
'--extra-ldflags=-Wl,--build-id -Wl,-z,relro -Wl,-z,now'
'--extra-cflags=-O2 -g -pipe -Wall -Werror=format-security
-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions
-fstack-protector-strong -grecord-gcc-switches
-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
-specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic
-fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection'
--enable-trace-backend=dtrace --audio-drv-list=pa,sdl,alsa,oss
--enable-kvm --target-list=x86_64-softmmu --enable-pie --enable-modules
--enable-spice

Build and then run:

$ ./x86_64-softmmu/qemu-system-x86_64 -device \? | grep qxl
Failed to open module:
/home/crobinso/src/qemu/x86_64-softmmu/../hw-display-qxl.so: undefined
symbol: qemu_qxl_io_log_semaphore


That error breaks iotests 127:

--- /home/crobinso/src/qemu/tests/qemu-iotests/127.out  2020-07-15
04:00:10.589138586 -0400
+++ /home/crobinso/src/qemu/tests/qemu-iotests/127.out.bad  2020-07-16
16:44:37.717248172 -0400
@@ -1,4 +1,5 @@
 QA output created by 127
+Failed to open module:
/home/crobinso/src/qemu/x86_64-softmmu/../hw-display-qxl.so: undefined
symbol: qemu_qxl_io_log_semaphore
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65536
 Formatting 'TEST_DIR/t.IMGFMT.overlay0', fmt=IMGFMT size=65536
backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
 Formatting 'TEST_DIR/t.IMGFMT.overlay1', fmt=IMGFMT size=65536
backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT


Doing a build with every target and running 'make check' will show
undefined symbol errors for other targets with hw-display-qxl too. Most
reference the qxl_io_log call but some are like:

Failed to open module:
/home/crobinso/src/qemu/microblaze-softmmu/../hw-display-qxl.so:
undefined symbol: vga_ioport_read


Also, as a side note (though I think it's pre-existing): running the test
suite with --enable-modules while there are host-installed modules is
very noisy, with lots of repetitive warnings like:

Failed to initialize module: /usr/lib64/qemu/audio-oss.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib64/qemu/audio-pa.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib64/qemu/audio-sdl.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib64/qemu/ui-curses.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib64/qemu/ui-gtk.so
Note: only modules from the same build can be loaded.

It would be nice if those could be avoided somehow. Maybe
QEMU_MODULE_DIR can help?

Thanks,
Cole




Re: [RFC PATCH] tcg/cpu-exec: precise single-stepping after an exception

2020-07-16 Thread Richard Henderson
On 7/16/20 1:12 PM, Peter Maydell wrote:
> On Thu, 16 Jul 2020 at 11:08, Luc Michel  wrote:
>>
>> When single-stepping with a debugger attached to QEMU, and when an
>> exception is raised, the debugger misses the first instruction after the
>> exception:
> 
> This is a long-standing bug; thanks for looking at it.
> (https://bugs.launchpad.net/qemu/+bug/757702)
> 
> 
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index d95c4848a4..e85fab5d40 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -502,10 +502,21 @@ static inline bool cpu_handle_exception(CPUState *cpu, 
>> int *ret)
>>  CPUClass *cc = CPU_GET_CLASS(cpu);
>>  qemu_mutex_lock_iothread();
>>  cc->do_interrupt(cpu);
>>  qemu_mutex_unlock_iothread();
>>  cpu->exception_index = -1;
>> +
>> +if (unlikely(cpu->singlestep_enabled)) {
>> +/*
>> + * After processing the exception, ensure an EXCP_DEBUG is
>> + * raised when single-stepping so that GDB doesn't miss the
>> + * next instruction.
>> + */
>> +cpu->exception_index = EXCP_DEBUG;
>> +return cpu_handle_exception(cpu, ret);
>> +}
> 
> I like the idea of being able to do this generically in
> the main loop.
> 
> How about interrupts? If we are single-stepping and we
> take an interrupt I guess we want to stop before the first
> insn of the interrupt handler rather than after it, which
> would imply a similar change to cpu_handle_interrupt().

Fair.  I think something like this:

if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
replay_interrupt();
-   cpu->exception_index = -1;
+   cpu->exception_index =
+   (cpu->singlestep_enabled ? EXCP_DEBUG : -1);
*last_tb = NULL;
}

I'm not quite sure how to test this though...

Probably best to keep this a separate patch anyway.


r~



Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??

2020-07-16 Thread Philippe Mathieu-Daudé
On 7/16/20 10:15 PM, Peter Maydell wrote:
> On Thu, 16 Jul 2020 at 20:52, Michael Roth  wrote:
>> But is it intermittent, environment-dependent? I'm trying to understand how 
>> to
>> replicate Peter's result since it seems like it would be straightforward
>> reproducer.
> 
> I blew away all my build trees and recreated them from
> scratch, and the issue went away. I'm suspicious that the
> complete lack of .d files was induced by a failed earlier
> pullreq attempt and left the build tree in a messed up state
> where it wouldn't notice that it needed to rebuild files.

If it happens again, can you try to revert aaa1b70a0b ("Makefile:
simplify MINIKCONF rules") on top of the tag you are testing, and
re-run the testing?




Re: [PATCH v5 10/11] hw/arm: Wire up BMC boot flash for npcm750-evb and quanta-gsj

2020-07-16 Thread Havard Skinnemoen
On Wed, Jul 15, 2020 at 1:54 PM Havard Skinnemoen
 wrote:
>
> On Wed, Jul 15, 2020 at 3:57 AM Philippe Mathieu-Daudé  
> wrote:
> >
> > On 7/15/20 11:00 AM, Markus Armbruster wrote:
> > > Now my point.  Why first make up user configuration, then use that to
> > > create a BlockBackend, when you could just go ahead and create the
> > > BlockBackend?
> >
> > CLI issue mostly.
> >
> > We can solve it similarly to the recent "sdcard: Do not allow invalid SD
> > card sizes" patch:
> >
> >  if (!dinfo) {
> >  error_setg(errp, "Missing SPI flash drive");
> >  error_append_hint(errp, "You can use a dummy drive using:\n");
> >  error_append_hint(errp, "-drive if=mtd,driver=null-co,"
> >  "read-zeroes=on,size=64M\n");
> >  return;
> >  }
> >
> > having npcm7xx_connect_flash() take an Error* argument,
> > and MachineClass::init() call it with &error_fatal.
>
> Erroring out if the user specifies a configuration that can't possibly
> boot sounds good to me. Better than trying to come up with defaults
> that are still not going to result in a bootable system.
>
> For testing recovery paths, I think it makes sense to explicitly
> specify a null device as you suggest.

Hmm, one problem. qom-test fails with

qemu-system-aarch64: Missing SPI flash drive
You can add a dummy drive using:
-drive if=mtd,driver=null-co,read-zeroes=on,size=32M
Broken pipe
/usr/local/google/home/hskinnemoen/qemu/for-upstream/tests/qtest/libqtest.c:166:
kill_qemu() tried to terminate QEMU process but encountered exit
status 1 (expected 0)
ERROR qom-test - too few tests run (expected 68, got 7)

So it looks like we might need a different solution to this, unless we
want to make generic tests more machine-aware...



Re: [GIT PULL] I2C updates

2020-07-16 Thread Peter Maydell
On Thu, 16 Jul 2020 at 18:49, Corey Minyard  wrote:
>
> The following changes since commit 45db94cc90c286a9965a285ba19450f448760a09:
>
>   Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-20200707' 
> into staging (2020-07-10 16:43:40 +0100)
>
> are available in the Git repository at:
>
>   https://github.com/cminyard/qemu.git tags/for-qemu-i2c-5
>
> for you to fetch changes up to 73d5f22ecbb76dfc785876779d47787084ff0f42:
>
>   hw/i2c: Document the I2C qdev helpers (2020-07-16 12:30:54 -0500)
>
> 
> Minor changes to:
>
> Add an SMBus config entry
>
> Cleanup/simplify/document some I2C interfaces
>
> 
> Philippe Mathieu-Daudé (6):
>   hw/i2c/Kconfig: Add an entry for the SMBus
>   hw/i2c/aspeed_i2c: Simplify aspeed_i2c_get_bus()
>   hw/i2c: Rename i2c_try_create_slave() as i2c_slave_new()
>   hw/i2c: Rename i2c_realize_and_unref() as i2c_slave_realize_and_unref()
>   hw/i2c: Rename i2c_create_slave() as i2c_slave_create_simple()
>   hw/i2c: Document the I2C qdev helpers

Hi; this failed to build on x86-64 Linux (incremental build):

  LINKi386-softmmu/qemu-system-i386
../hw/i2c/smbus_eeprom.o: In function `smbus_eeprom_vmstate_needed':
/home/petmay01/linaro/qemu-for-merges/hw/i2c/smbus_eeprom.c:94:
undefined reference to `smbus_vmstate_needed'
../hw/i2c/smbus_eeprom.o:(.data.rel+0x50): undefined reference to
`vmstate_smbus_device'
../hw/i2c/pm_smbus.o: In function `smb_transaction':
/home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:93: undefined
reference to `smbus_quick_command'
/home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:97: undefined
reference to `smbus_receive_byte'
/home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:100: undefined
reference to `smbus_send_byte'
/home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:105: undefined
reference to `smbus_read_byte'
/home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:108: undefined
reference to `smbus_write_byte'
/home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:114: undefined
reference to `smbus_read_word'
/home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:117: undefined
reference to `smbus_write_word'
/home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:149: undefined
reference to `smbus_read_block'
/home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:174: undefined
reference to `smbus_write_block'
../hw/i2c/pm_smbus.o: In function `smb_ioport_writeb':
/home/petmay01/linaro/qemu-for-merges/hw/i2c/pm_smbus.c:290: undefined
reference to `smbus_write_block'
../hw/ipmi/smbus_ipmi.o:(.data.rel+0x50): undefined reference to
`vmstate_smbus_device'
collect2: error: ld returned 1 exit status

(similarly for other qemu-system-* binary links)

thanks
-- PMM



[Bug 1887854] Re: Spurious Data Abort on qemu-system-aarch64

2020-07-16 Thread Peter Maydell
Writing to SCTLR can cause QEMU to flush its TLB (as an internal
implementation detail), so if adding SCTLR writes is sufficient to cause
the problem to go away, I would be suspicious that your guest code is
missing necessary TLB maintenance instructions.
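
For illustration, this is roughly the kind of maintenance meant here on
AArch64, after the guest updates its EL1 translation tables or translation
controls (a bare-metal sketch using inline assembly; the exact barriers and
TLBI scope depend on what was changed):

/* Invalidate all stage 1 EL1 TLB entries and resynchronize. */
static inline void guest_flush_tlb_el1(void)
{
    __asm__ volatile(
        "dsb ishst\n\t"      /* make preceding table/register updates visible */
        "tlbi vmalle1\n\t"   /* invalidate all EL1&0 TLB entries, current VMID */
        "dsb ish\n\t"        /* wait for the invalidation to complete */
        "isb\n\t"            /* resynchronize the instruction stream */
        ::: "memory");
}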

QEMU 3.1 and 4.1 are quite old -- can you reproduce with 5.0 or
(ideally) head-of-git ?

-- 
https://bugs.launchpad.net/bugs/1887854

Title:
  Spurious Data Abort on qemu-system-aarch64

Status in QEMU:
  New

Bug description:
  When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is 
not yet publically available), the test generates a spurious data abort (the 
MMU and alignment checks should be disabled according to bits 1, 0 of 
SCTLR_EL1). The abort information is as follows:
  Taking exception 4 [Data Abort]
  ...from EL1 to EL1
  ...with ESR 0x25/0x9610
  ...with FAR 0x104010ca28
  ...with ELR 0x400195d8
  ...to EL1 PC 0x40018200 PSTATE 0x3c5

  The ESR indicates that a synchronous external abort has occurred.
  ESR EC field: 0b100101

  From the ARMv8 technical manual: Data Abort taken without a change in
  Exception level. Used for MMU faults generated by data accesses,
  alignment faults other than those caused by Stack Pointer
  misalignment, and synchronous External aborts, including synchronous
  parity or ECC errors. Not used for debug related exceptions.

  ESR ISS field: 0b1

  From the ARMv8 technical manual: Synchronous External abort, not on
  translation table walk or hardware update of translation table.

  The following command line is used to invoke qemu:
  qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot 
-nographic -serial mon:stdio -kernel 
build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d 
in_asm,int,cpu_reset,unimp,guest_errors

  This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as
  built by the RTEMS source builder (4.1+minor patches).

  Edit: This bug can be worked around by getting and setting SCTLR
  without changing its value before each data abort would occur. This
  test needs 6 of these workarounds to operate successfully.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1887854/+subscriptions



Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??

2020-07-16 Thread Peter Maydell
On Thu, 16 Jul 2020 at 20:52, Michael Roth  wrote:
> But is it intermittent, environment-dependent? I'm trying to understand how to
> replicate Peter's result since it seems like it would be straightforward
> reproducer.

I blew away all my build trees and recreated them from
scratch, and the issue went away. I'm suspicious that the
complete lack of .d files was induced by a failed earlier
pullreq attempt and left the build tree in a messed up state
where it wouldn't notice that it needed to rebuild files.

-- PMM



Re: [PULL 0/2] Fixes 20200716 patches

2020-07-16 Thread Peter Maydell
On Thu, 16 Jul 2020 at 10:34, Gerd Hoffmann  wrote:
>
> The following changes since commit 8746309137ba470d1b2e8f5ce86ac228625db940:
>
>   Update version for v5.1.0-rc0 release (2020-07-15 19:08:07 +0100)
>
> are available in the Git repository at:
>
>   git://git.kraxel.org/qemu tags/fixes-20200716-pull-request
>
> for you to fetch changes up to 4084e35068772cf4f81bbae5174019f277c61084:
>
>   usb: fix storage regression (2020-07-16 10:20:27 +0200)
>
> 
> fixes: usb storage regression, vfio display ramfb bug
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.1
for any user-visible changes.

-- PMM



Re: [RFC PATCH] tcg/cpu-exec: precise single-stepping after an exception

2020-07-16 Thread Peter Maydell
On Thu, 16 Jul 2020 at 11:08, Luc Michel  wrote:
>
> When single-stepping with a debugger attached to QEMU, and when an
> exception is raised, the debugger misses the first instruction after the
> exception:

This is a long-standing bug; thanks for looking at it.
(https://bugs.launchpad.net/qemu/+bug/757702)


> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index d95c4848a4..e85fab5d40 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -502,10 +502,21 @@ static inline bool cpu_handle_exception(CPUState *cpu, 
> int *ret)
>  CPUClass *cc = CPU_GET_CLASS(cpu);
>  qemu_mutex_lock_iothread();
>  cc->do_interrupt(cpu);
>  qemu_mutex_unlock_iothread();
>  cpu->exception_index = -1;
> +
> +if (unlikely(cpu->singlestep_enabled)) {
> +/*
> + * After processing the exception, ensure an EXCP_DEBUG is
> + * raised when single-stepping so that GDB doesn't miss the
> + * next instruction.
> + */
> +cpu->exception_index = EXCP_DEBUG;
> +return cpu_handle_exception(cpu, ret);
> +}

I like the idea of being able to do this generically in
the main loop.

How about interrupts? If we are single-stepping and we
take an interrupt I guess we want to stop before the first
insn of the interrupt handler rather than after it, which
would imply a similar change to cpu_handle_interrupt().

thanks
-- PMM



[Bug 1887854] Re: Spurious Data Abort on qemu-system-aarch64

2020-07-16 Thread K
** Description changed:

  When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is 
not yet publically available), the test generates a spurious data abort (the 
MMU and alignment checks should be disabled according to bits 1, 0 of 
SCTLR_EL1). The abort information is as follows:
  Taking exception 4 [Data Abort]
  ...from EL1 to EL1
  ...with ESR 0x25/0x9610
  ...with FAR 0x104010ca28
  ...with ELR 0x400195d8
  ...to EL1 PC 0x40018200 PSTATE 0x3c5
  
  The ESR indicates that a synchronous external abort has occurred.
  ESR EC field: 0b100101
  
  From the ARMv8 technical manual: Data Abort taken without a change in
  Exception level. Used for MMU faults generated by data accesses,
  alignment faults other than those caused by Stack Pointer misalignment,
  and synchronous External aborts, including synchronous parity or ECC
  errors. Not used for debug related exceptions.
  
  ESR ISS field: 0b1
  
  From the ARMv8 technical manual: Synchronous External abort, not on
  translation table walk or hardware update of translation table.
  
  The following command line is used to invoke qemu:
  qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot 
-nographic -serial mon:stdio -kernel 
build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d 
in_asm,int,cpu_reset,unimp,guest_errors
  
  This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as
  built by the RTEMS source builder (4.1+minor patches).
  
  Edit: This bug can be worked around by getting and setting SCTLR without
- changing its value.
+ changing its value before each data abort would occur. This test needs 6
+ of these workarounds to operate successfully.

-- 
https://bugs.launchpad.net/bugs/1887854

Title:
  Spurious Data Abort on qemu-system-aarch64

Status in QEMU:
  New

Bug description:
  When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is 
not yet publically available), the test generates a spurious data abort (the 
MMU and alignment checks should be disabled according to bits 1, 0 of 
SCTLR_EL1). The abort information is as follows:
  Taking exception 4 [Data Abort]
  ...from EL1 to EL1
  ...with ESR 0x25/0x9610
  ...with FAR 0x104010ca28
  ...with ELR 0x400195d8
  ...to EL1 PC 0x40018200 PSTATE 0x3c5

  The ESR indicates that a synchronous external abort has occurred.
  ESR EC field: 0b100101

  From the ARMv8 technical manual: Data Abort taken without a change in
  Exception level. Used for MMU faults generated by data accesses,
  alignment faults other than those caused by Stack Pointer
  misalignment, and synchronous External aborts, including synchronous
  parity or ECC errors. Not used for debug related exceptions.

  ESR ISS field: 0b1

  From the ARMv8 technical manual: Synchronous External abort, not on
  translation table walk or hardware update of translation table.

  The following command line is used to invoke qemu:
  qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot 
-nographic -serial mon:stdio -kernel 
build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d 
in_asm,int,cpu_reset,unimp,guest_errors

  This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as
  built by the RTEMS source builder (4.1+minor patches).

  Edit: This bug can be worked around by getting and setting SCTLR
  without changing its value before each data abort would occur. This
  test needs 6 of these workarounds to operate successfully.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1887854/+subscriptions



Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??

2020-07-16 Thread Philippe Mathieu-Daudé
On 7/16/20 9:52 PM, Michael Roth wrote:
> Quoting Philippe Mathieu-Daudé (2020-07-16 12:59:28)
>> On 7/16/20 7:55 PM, Michael Roth wrote:
>>> Quoting Peter Maydell (2020-07-16 05:53:17)
 The first merge I tried to process after bumping VERSION for rc0
 failed on test-qga like this:

 MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
 tests/test-qga -m=quick -k --tap < /dev/null | ./scripts/tap-driv
 er.pl --test-name="test-qga"
 PASS 1 test-qga /qga/sync-delimited
 PASS 2 test-qga /qga/sync
 PASS 3 test-qga /qga/ping
 **
 ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info:
 assertion failed (version == QEMU_VERSION): ("5.0.9
 0" == "5.0.50")
 ERROR test-qga - Bail out!
 ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info:
 assertion failed (versio
 n == QEMU_VERSION): ("5.0.90" == "5.0.50")
 Aborted (core dumped)
 /home/petmay01/linaro/qemu-for-merges/tests/Makefile.include:659:
 recipe for target 'check-unit' failed

 Looking at timestamps on files, tests/test-qga.o never got rebuilt,
 even though config-host.h has been updated (and so has the new
 QEMU_VERSION). Any idea what's gone wrong here?

 Also weird: this build tree has no .d files in it.
>>>
>>> I've been trying to reproduce with:
>>>
>>> make
>>> make check-unit
>>> *bump VERSION
>>> make check-unit
>>>
>>> but test-qga.o gets rebuilt as expected and the test passed.
>>>
>>> This is with ubuntu 18.04, x86, with out-of-tree build directory. Are you 
>>> aware
>>> of any other factors that might be needed to reproduce this?
>>
>> The problem is not for qga, it affects all QEMU objects.
> 
> But is it intermittent, environment-dependent? I'm trying to understand how to
> replicate Peter's result since it seems like it would be straightforward
> reproducer.

How to reproduce:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg723531.html

> 
>>
>>>

 thanks
 -- PMM
>>>
>>
> 




Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??

2020-07-16 Thread Michael Roth
Quoting Philippe Mathieu-Daudé (2020-07-16 12:59:28)
> On 7/16/20 7:55 PM, Michael Roth wrote:
> > Quoting Peter Maydell (2020-07-16 05:53:17)
> >> The first merge I tried to process after bumping VERSION for rc0
> >> failed on test-qga like this:
> >>
> >> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> >> tests/test-qga -m=quick -k --tap < /dev/null | ./scripts/tap-driv
> >> er.pl --test-name="test-qga"
> >> PASS 1 test-qga /qga/sync-delimited
> >> PASS 2 test-qga /qga/sync
> >> PASS 3 test-qga /qga/ping
> >> **
> >> ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info:
> >> assertion failed (version == QEMU_VERSION): ("5.0.9
> >> 0" == "5.0.50")
> >> ERROR test-qga - Bail out!
> >> ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info:
> >> assertion failed (versio
> >> n == QEMU_VERSION): ("5.0.90" == "5.0.50")
> >> Aborted (core dumped)
> >> /home/petmay01/linaro/qemu-for-merges/tests/Makefile.include:659:
> >> recipe for target 'check-unit' failed
> >>
> >> Looking at timestamps on files, tests/test-qga.o never got rebuilt,
> >> even though config-host.h has been updated (and so has the new
> >> QEMU_VERSION). Any idea what's gone wrong here?
> >>
> >> Also weird: this build tree has no .d files in it.
> > 
> > I've been trying to reproduce with:
> > 
> > make
> > make check-unit
> > *bump VERSION
> > make check-unit
> > 
> > but test-qga.o gets rebuilt as expected and the test passed.
> > 
> > This is with ubuntu 18.04, x86, with out-of-tree build directory. Are you 
> > aware
> > of any other factors that might be needed to reproduce this?
> 
> The problem is not for qga, it affects all QEMU objects.

But is it intermittent, environment-dependent? I'm trying to understand how to
replicate Peter's result since it seems like it would be straightforward
reproducer.

> 
> > 
> >>
> >> thanks
> >> -- PMM
> > 
> 



[Bug 1887854] [NEW] Spurious Data Abort on qemu-system-aarch64

2020-07-16 Thread K
Public bug reported:

When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is not 
yet publically available), the test generates a spurious data abort (the MMU 
and alignment checks should be disabled according to bits 1, 0 of SCTLR_EL1). 
The abort information is as follows:
Taking exception 4 [Data Abort]
...from EL1 to EL1
...with ESR 0x25/0x9610
...with FAR 0x104010ca28
...with ELR 0x400195d8
...to EL1 PC 0x40018200 PSTATE 0x3c5

The ESR indicates that a synchronous external abort has occurred.
ESR EC field: 0b100101

From the ARMv8 technical manual: Data Abort taken without a change in
Exception level. Used for MMU faults generated by data accesses,
alignment faults other than those caused by Stack Pointer misalignment,
and synchronous External aborts, including synchronous parity or ECC
errors. Not used for debug related exceptions.

ESR ISS field: 0b1

From the ARMv8 technical manual: Synchronous External abort, not on
translation table walk or hardware update of translation table.

The following command line is used to invoke qemu:
qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot -nographic 
-serial mon:stdio -kernel 
build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d 
in_asm,int,cpu_reset,unimp,guest_errors

This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as
built by the RTEMS source builder (4.1+minor patches).

Edit: This bug can be worked around by getting and setting SCTLR without
changing its value.

** Affects: qemu
 Importance: Undecided
 Status: New

** Attachment added: "Qemu execution log and binary"
   
https://bugs.launchpad.net/bugs/1887854/+attachment/5393233/+files/data_abort.tar.gz

** Description changed:

  When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is 
not yet publically available), the test generates a spurious data abort (the 
MMU and alignment checks should be disabled according to bits 1, 0 of 
SCTLR_EL1). The abort information is as follows:
  Taking exception 4 [Data Abort]
  ...from EL1 to EL1
  ...with ESR 0x25/0x9610
  ...with FAR 0x104010ca28
  ...with ELR 0x400195d8
  ...to EL1 PC 0x40018200 PSTATE 0x3c5
  
  The ESR indicates that a synchronous external abort has occurred.
  ESR EC field: 0b100101
  
  From the ARMv8 technical manual: Data Abort taken without a change in
  Exception level. Used for MMU faults generated by data accesses,
  alignment faults other than those caused by Stack Pointer misalignment,
  and synchronous External aborts, including synchronous parity or ECC
  errors. Not used for debug related exceptions.
  
  ESR ISS field: 0b1
  
  From the ARMv8 technical manual: Synchronous External abort, not on
  translation table walk or hardware update of translation table.
  
  The following command line is used to invoke qemu:
  qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot 
-nographic -serial mon:stdio -kernel 
build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d 
in_asm,int,cpu_reset,unimp,guest_errors
  
  This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as
  built by the RTEMS source builder (4.1+minor patches).
+ 
+ Edit: This bug can be worked around by getting and setting SCTLR without
+ changing its value.

-- 
https://bugs.launchpad.net/bugs/1887854

Title:
  Spurious Data Abort on qemu-system-aarch64

Status in QEMU:
  New

Bug description:
  When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this code is 
not yet publically available), the test generates a spurious data abort (the 
MMU and alignment checks should be disabled according to bits 1, 0 of 
SCTLR_EL1). The abort information is as follows:
  Taking exception 4 [Data Abort]
  ...from EL1 to EL1
  ...with ESR 0x25/0x9610
  ...with FAR 0x104010ca28
  ...with ELR 0x400195d8
  ...to EL1 PC 0x40018200 PSTATE 0x3c5

  The ESR indicates that a synchronous external abort has occurred.
  ESR EC field: 0b100101

  From the ARMv8 technical manual: Data Abort taken without a change in
  Exception level. Used for MMU faults generated by data accesses,
  alignment faults other than those caused by Stack Pointer
  misalignment, and synchronous External aborts, including synchronous
  parity or ECC errors. Not used for debug related exceptions.

  ESR ISS field: 0b1

  From the ARMv8 technical manual: Synchronous External abort, not on
  translation table walk or hardware update of translation table.

  The following command line is used to invoke qemu:
  qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot 
-nographic -serial mon:stdio -kernel 
build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe -D qemu.log -d 
in_asm,int,cpu_reset,unimp,guest_errors

  This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as
  built by 

[PATCH v2] tcg/cpu-exec: precise single-stepping after an exception

2020-07-16 Thread Luc Michel
When single-stepping with a debugger attached to QEMU, and when an
exception is raised, the debugger misses the first instruction after the
exception:

$ qemu-system-aarch64 -M virt -display none -cpu cortex-a53 -s -S

$ aarch64-linux-gnu-gdb
GNU gdb (GDB) 9.2
[...]
(gdb) tar rem :1234
Remote debugging using :1234
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
0x in ?? ()
(gdb) # writing nop insns to 0x200 and 0x204
(gdb) set *0x200 = 0xd503201f
(gdb) set *0x204 = 0xd503201f
(gdb) # 0x0 address contains 0 which is an invalid opcode.
(gdb) # The CPU should raise an exception and jump to 0x200
(gdb) si
0x0204 in ?? ()

With this commit, the same run steps correctly on the first instruction
of the exception vector:

(gdb) si
0x0200 in ?? ()

Signed-off-by: Luc Michel 
---
v2:
  - remove RFC tag
  - inline the recursive call to cpu_handle_exception [Richard]
---
 accel/tcg/cpu-exec.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index d95c4848a4..59b1b5fe76 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -502,10 +502,22 @@ static inline bool cpu_handle_exception(CPUState *cpu, 
int *ret)
 CPUClass *cc = CPU_GET_CLASS(cpu);
 qemu_mutex_lock_iothread();
 cc->do_interrupt(cpu);
 qemu_mutex_unlock_iothread();
 cpu->exception_index = -1;
+
+if (unlikely(cpu->singlestep_enabled)) {
+/*
+ * After processing the exception, ensure an EXCP_DEBUG is
+ * raised when single-stepping so that GDB doesn't miss the
+ * next instruction.
+ */
+*ret = EXCP_DEBUG;
+cpu_handle_debug_exception(cpu);
+return true;
+}
+
 } else if (!replay_has_interrupt()) {
 /* give a chance to iothread in replay mode */
 *ret = EXCP_INTERRUPT;
 return true;
 }
-- 
2.27.0




[PATCH] net: check payload length limit for all frames

2020-07-16 Thread P J P
From: Prasad J Pandit 

While sending packets, the check that the packet 'payload_len'
is within the 64kB limit seems to happen only for GSO frames.
This may lead to use-after-free or out-of-bounds access issues
when sending non-GSO frames. Check the 'payload_len' limit for
all packets, irrespective of the GSO type.

Reported-by: Alexander Bulekov 
Signed-off-by: Prasad J Pandit 
---
 hw/net/net_tx_pkt.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 162f802dd7..e66998a8f9 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -607,12 +607,10 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState 
*nc)
  * Since underlying infrastructure does not support IP datagrams longer
  * than 64K we should drop such packets and don't even try to send
  */
-if (VIRTIO_NET_HDR_GSO_NONE != pkt->virt_hdr.gso_type) {
-if (pkt->payload_len >
-ETH_MAX_IP_DGRAM_LEN -
-pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) {
-return false;
-}
+if (pkt->payload_len >
+ETH_MAX_IP_DGRAM_LEN -
+pkt->vec[NET_TX_PKT_L3HDR_FRAG].iov_len) {
+return false;
 }
 
 if (pkt->has_virt_hdr ||
-- 
2.26.2




Re: [PATCH for-5.1] i386: hvf: Explicitly set CR4 guest/host mask

2020-07-16 Thread Roman Bolshakov
On Thu, Jul 16, 2020 at 02:14:57PM -0400, Eduardo Habkost wrote:
> On Tue, Jul 14, 2020 at 08:20:04PM +0200, Paolo Bonzini wrote:
> > Hi Roman, please ask Peter to apply it directly because I won't be able to
> > send a pull request in the next couple of weeks.
> > 
> > Paolo
> > 
> > Il mar 14 lug 2020, 12:39 Roman Bolshakov  ha
> > scritto:
> > 
> > > On Tue, Jul 14, 2020 at 12:07:27PM +0300, Roman Bolshakov wrote:
> > > > Removal of register reset omitted initialization of CR4 guest/host mask.
> > > > x86_64 guests aren't booting without it.
> > > >
> > > > Fixes: 5009ef22c6bb2 ("i386: hvf: Don't duplicate register reset")
> > > > Signed-off-by: Roman Bolshakov 
> > > >
> > >
> > > If one has a chance to test or review it, it'd be very helpful. That'd
> > > allow to include it in RC0.
> > >
> 
> I'll queue it for my -rc1 pull request.
> 

Thanks!

-Roman



[PULL 4/6] target/i386: fix model number and add missing features for Icelake-Server CPU model

2020-07-16 Thread Eduardo Habkost
From: Chenyi Qiang 

Add the missing features (sha_ni, avx512ifma, rdpid, fsrm,
vmx-rdseed-exit, vmx-pml, vmx-eptp-switching) and change the model
number to 106 in the Icelake-Server-v4 CPU model.

Signed-off-by: Chenyi Qiang 
Message-Id: <20200714084148.26690-3-chenyi.qi...@intel.com>
Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3885826bc4..132ef90421 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3512,6 +3512,20 @@ static X86CPUDefinition builtin_x86_defs[] = {
 { /* end of list */ }
 },
 },
+{
+.version = 4,
+.props = (PropValue[]) {
+{ "sha-ni", "on" },
+{ "avx512ifma", "on" },
+{ "rdpid", "on" },
+{ "fsrm", "on" },
+{ "vmx-rdseed-exit", "on" },
+{ "vmx-pml", "on" },
+{ "vmx-eptp-switching", "on" },
+{ "model", "106" },
+{ /* end of list */ }
+},
+},
 { /* end of list */ }
 }
 },
-- 
2.26.2




[PULL 1/6] i368/cpu: Clear env->user_features after loading versioned CPU model

2020-07-16 Thread Eduardo Habkost
From: Xiaoyao Li 

Features defined in a versioned CPU model are recorded in env->user_features
since they are updated as properties. This is unwanted because they are not
user specified.

Simply clear env->user_features as a fix. It won't clear user specified
features because user specified features are filled to
env->user_features later in x86_cpu_expand_features().

Cc: Chenyi Qiang 
Suggested-by: Eduardo Habkost 
Signed-off-by: Xiaoyao Li 
Message-Id: <20200713174436.41070-2-xiaoyao...@intel.com>
[ehabkost: fix coding style]
Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1e5123251d..caf0334f3a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5159,6 +5159,13 @@ static void x86_cpu_load_model(X86CPU *cpu, X86CPUModel 
*model)
 object_property_set_str(OBJECT(cpu), "vendor", vendor, &error_abort);
 
 x86_cpu_apply_version_props(cpu, model);
+
+/*
+ * Properties in versioned CPU model are not user specified features.
+ * We can simply clear env->user_features here since it will be filled 
later
+ * in x86_cpu_expand_features() based on plus_features and minus_features.
+ */
+memset(&env->user_features, 0, sizeof(env->user_features));
 }
 
 #ifndef CONFIG_USER_ONLY
-- 
2.26.2




[PULL 6/6] i386: hvf: Explicitly set CR4 guest/host mask

2020-07-16 Thread Eduardo Habkost
From: Roman Bolshakov 

Removal of register reset omitted initialization of CR4 guest/host mask.
x86_64 guests aren't booting without it.

Fixes: 5009ef22c6bb2 ("i386: hvf: Don't duplicate register reset")
Signed-off-by: Roman Bolshakov 
Message-Id: <20200714090726.41082-1-r.bolsha...@yadro.com>
Signed-off-by: Eduardo Habkost 
---
 target/i386/hvf/vmx.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/i386/hvf/vmx.h b/target/i386/hvf/vmx.h
index 75ba1e2a5f..587b1b8375 100644
--- a/target/i386/hvf/vmx.h
+++ b/target/i386/hvf/vmx.h
@@ -166,6 +166,7 @@ static inline void macvm_set_cr4(hv_vcpuid_t vcpu, uint64_t 
cr4)
 
 wvmcs(vcpu, VMCS_GUEST_CR4, guest_cr4);
 wvmcs(vcpu, VMCS_CR4_SHADOW, cr4);
+wvmcs(vcpu, VMCS_CR4_MASK, CR4_VMXE);
 
 hv_vcpu_invalidate_tlb(vcpu);
 hv_vcpu_flush(vcpu);
-- 
2.26.2




[PULL 0/6] x86 fixes for -rc1

2020-07-16 Thread Eduardo Habkost
The following changes since commit ee5128bb00f90dd301991d80d1db5224ce924c84:

  Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into 
staging (2020-07-16 13:12:05 +0100)

are available in the Git repository at:

  git://github.com/ehabkost/qemu.git tags/x86-next-pull-request

for you to fetch changes up to 818b9f111d64b40661d08f5e23236ac1ca5df505:

  i386: hvf: Explicitly set CR4 guest/host mask (2020-07-16 14:15:13 -0400)


x86 fixes for -rc1

Fixes for x86 that missed hard freeze:
* Don't trigger warnings for features set by
  CPU model versions (Xiaoyao Li)
* Missing features in Icelake-Server, Skylake-Server,
  Cascadelake-Server CPU models (Chenyi Qiang)
* Fix hvf x86_64 guest boot crash (Roman Bolshakov)



Chenyi Qiang (3):
  target/i386: add fast short REP MOV support
  target/i386: fix model number and add missing features for
Icelake-Server CPU model
  target/i386: add the missing vmx features for Skylake-Server and
Cascadelake-Server CPU models

Roman Bolshakov (1):
  i386: hvf: Explicitly set CR4 guest/host mask

Xiaoyao Li (2):
  i368/cpu: Clear env->user_features after loading versioned CPU model
  i386/cpu: Don't add unavailable_features to env->user_features

 target/i386/cpu.h |  2 ++
 target/i386/hvf/vmx.h |  1 +
 target/i386/cpu.c | 38 --
 3 files changed, 39 insertions(+), 2 deletions(-)

-- 
2.26.2





[PULL 5/6] target/i386: add the missing vmx features for Skylake-Server and Cascadelake-Server CPU models

2020-07-16 Thread Eduardo Habkost
From: Chenyi Qiang 

Add the missing vmx features in Skylake-Server and Cascadelake-Server
CPU models based on the output of Paolo's script.

Signed-off-by: Chenyi Qiang 
Message-Id: <20200714084148.26690-4-chenyi.qi...@intel.com>
Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 132ef90421..588f32e136 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3034,6 +3034,13 @@ static X86CPUDefinition builtin_x86_defs[] = {
 { /* end of list */ }
 }
 },
+{
+.version = 4,
+.props = (PropValue[]) {
+{ "vmx-eptp-switching", "on" },
+{ /* end of list */ }
+}
+},
 { /* end of list */ }
 }
 },
@@ -3158,6 +3165,13 @@ static X86CPUDefinition builtin_x86_defs[] = {
   { /* end of list */ }
   },
 },
+{ .version = 4,
+  .note = "ARCH_CAPABILITIES, no TSX",
+  .props = (PropValue[]) {
+  { "vmx-eptp-switching", "on" },
+  { /* end of list */ }
+  },
+},
 { /* end of list */ }
 }
 },
-- 
2.26.2




[PULL 3/6] target/i386: add fast short REP MOV support

2020-07-16 Thread Eduardo Habkost
From: Chenyi Qiang 

For CPUs that support fast short REP MOV [CPUID.(EAX=7,ECX=0):EDX(bit4)], e.g.
Icelake and Tigerlake, expose it to the guest VM.

Reviewed-by: Eduardo Habkost 
Signed-off-by: Chenyi Qiang 
Message-Id: <20200714084148.26690-2-chenyi.qi...@intel.com>
Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.h | 2 ++
 target/i386/cpu.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 37fffa5cac..e1a5c174dc 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -775,6 +775,8 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_7_0_EDX_AVX512_4VNNIW (1U << 2)
 /* AVX512 Multiply Accumulation Single Precision */
 #define CPUID_7_0_EDX_AVX512_4FMAPS (1U << 3)
+/* Fast Short Rep Mov */
+#define CPUID_7_0_EDX_FSRM  (1U << 4)
 /* AVX512 Vector Pair Intersection to a Pair of Mask Registers */
 #define CPUID_7_0_EDX_AVX512_VP2INTERSECT (1U << 8)
 /* SERIALIZE instruction */
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 93b62b2eca..3885826bc4 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -984,7 +984,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
 .type = CPUID_FEATURE_WORD,
 .feat_names = {
 NULL, NULL, "avx512-4vnniw", "avx512-4fmaps",
-NULL, NULL, NULL, NULL,
+"fsrm", NULL, NULL, NULL,
 "avx512-vp2intersect", NULL, "md-clear", NULL,
 NULL, NULL, "serialize", NULL,
 "tsx-ldtrk", NULL, NULL /* pconfig */, NULL,
-- 
2.26.2




[PULL 2/6] i386/cpu: Don't add unavailable_features to env->user_features

2020-07-16 Thread Eduardo Habkost
From: Xiaoyao Li 

Features unavailable due to the absence of their dependent features should
not be added to env->user_features. env->user_features only contains the
features explicitly specified with -feature/+feature by the user.

Fixes: 99e24dbdaa68 ("target/i386: introduce generic feature dependency 
mechanism")
Signed-off-by: Xiaoyao Li 
Message-Id: <20200713174436.41070-3-xiaoyao...@intel.com>
Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index caf0334f3a..93b62b2eca 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6371,7 +6371,6 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error 
**errp)
   unavailable_features & 
env->user_features[d->to.index],
   "This feature depends on other features 
that were not requested");
 
-env->user_features[d->to.index] |= unavailable_features;
 env->features[d->to.index] &= ~unavailable_features;
 }
 }
-- 
2.26.2




Re: [PATCH for-5.1] i386: hvf: Explicitly set CR4 guest/host mask

2020-07-16 Thread Eduardo Habkost
On Tue, Jul 14, 2020 at 08:20:04PM +0200, Paolo Bonzini wrote:
> Hi Roman, please ask Peter to apply it directly because I won't be able to
> send a pull request in the next couple of weeks.
> 
> Paolo
> 
> Il mar 14 lug 2020, 12:39 Roman Bolshakov  ha
> scritto:
> 
> > On Tue, Jul 14, 2020 at 12:07:27PM +0300, Roman Bolshakov wrote:
> > > Removal of register reset omitted initialization of CR4 guest/host mask.
> > > x86_64 guests aren't booting without it.
> > >
> > > Fixes: 5009ef22c6bb2 ("i386: hvf: Don't duplicate register reset")
> > > Signed-off-by: Roman Bolshakov 
> > >
> >
> > If one has a chance to test or review it, it'd be very helpful. That'd
> > allow to include it in RC0.
> >

I'll queue it for my -rc1 pull request.

-- 
Eduardo




Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??

2020-07-16 Thread Philippe Mathieu-Daudé
On 7/16/20 7:55 PM, Michael Roth wrote:
> Quoting Peter Maydell (2020-07-16 05:53:17)
>> The first merge I tried to process after bumping VERSION for rc0
>> failed on test-qga like this:
>>
>> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
>> tests/test-qga -m=quick -k --tap < /dev/null | ./scripts/tap-driv
>> er.pl --test-name="test-qga"
>> PASS 1 test-qga /qga/sync-delimited
>> PASS 2 test-qga /qga/sync
>> PASS 3 test-qga /qga/ping
>> **
>> ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info:
>> assertion failed (version == QEMU_VERSION): ("5.0.9
>> 0" == "5.0.50")
>> ERROR test-qga - Bail out!
>> ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info:
>> assertion failed (versio
>> n == QEMU_VERSION): ("5.0.90" == "5.0.50")
>> Aborted (core dumped)
>> /home/petmay01/linaro/qemu-for-merges/tests/Makefile.include:659:
>> recipe for target 'check-unit' failed
>>
>> Looking at timestamps on files, tests/test-qga.o never got rebuilt,
>> even though config-host.h has been updated (and so has the new
>> QEMU_VERSION). Any idea what's gone wrong here?
>>
>> Also weird: this build tree has no .d files in it.
> 
> I've been trying to reproduce with:
> 
> make
> make check-unit
> *bump VERSION
> make check-unit
> 
> but test-qga.o gets rebuilt as expected and the test passed.
> 
> This is with ubuntu 18.04, x86, with out-of-tree build directory. Are you 
> aware
> of any other factors that might be needed to reproduce this?

The problem is not specific to qga; it affects all QEMU objects.

> 
>>
>> thanks
>> -- PMM
> 




Re: [RFC PATCH] tcg/cpu-exec: precise single-stepping after an exception

2020-07-16 Thread Richard Henderson
On 7/16/20 3:04 AM, Luc Michel wrote:
> When single-stepping with a debugger attached to QEMU, and when an
> exception is raised, the debugger misses the first instruction after the
> exception:
> 
> $ qemu-system-aarch64 -M virt -display none -cpu cortex-a53 -s -S
> 
> $ aarch64-linux-gnu-gdb
> GNU gdb (GDB) 9.2
> [...]
> (gdb) tar rem :1234
> Remote debugging using :1234
> warning: No executable has been specified and target does not support
> determining executable automatically.  Try using the "file" command.
> 0x0000000000000000 in ?? ()
> (gdb) # writing nop insns to 0x200 and 0x204
> (gdb) set *0x200 = 0xd503201f
> (gdb) set *0x204 = 0xd503201f
> (gdb) # 0x0 address contains 0 which is an invalid opcode.
> (gdb) # The CPU should raise an exception and jump to 0x200
> (gdb) si
> 0x0000000000000204 in ?? ()
> 
> With this commit, the same run steps correctly on the first instruction
> of the exception vector:
> 
> (gdb) si
> 0x0000000000000200 in ?? ()
> 
> Signed-off-by: Luc Michel 
> ---
> 
> RFC because I'm really not sure this is the good place to do that since
> EXCP_DEBUG are usually raised in each target translate.c. It could also
> have implications with record/replay I'm not aware of.
> 
> ---
>  accel/tcg/cpu-exec.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index d95c4848a4..e85fab5d40 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -502,10 +502,21 @@ static inline bool cpu_handle_exception(CPUState *cpu, 
> int *ret)
>  CPUClass *cc = CPU_GET_CLASS(cpu);
>  qemu_mutex_lock_iothread();
>  cc->do_interrupt(cpu);
>  qemu_mutex_unlock_iothread();
>  cpu->exception_index = -1;
> +
> +if (unlikely(cpu->singlestep_enabled)) {
> +/*
> + * After processing the exception, ensure an EXCP_DEBUG is
> + * raised when single-stepping so that GDB doesn't miss the
> + * next instruction.
> + */
> +cpu->exception_index = EXCP_DEBUG;
> +return cpu_handle_exception(cpu, ret);

Plausible.  Although recursion on an inline function is going to defeat the
inline, in general.

We could expand the recursion by hand with

if (unlikely(cpu->singlestep_enabled)) {
*ret = EXCP_DEBUG;
cpu_handle_debug_exception(cpu);
return true;
}

which might even be clearer.


r~

> +}
> +
>  } else if (!replay_has_interrupt()) {
>  /* give a chance to iothread in replay mode */
>  *ret = EXCP_INTERRUPT;
>  return true;
>  }
> 




Re: qemu test-qga failure on mergebuild after VERSION file change: dependency issues??

2020-07-16 Thread Michael Roth
Quoting Peter Maydell (2020-07-16 05:53:17)
> The first merge I tried to process after bumping VERSION for rc0
> failed on test-qga like this:
> 
> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> tests/test-qga -m=quick -k --tap < /dev/null | ./scripts/tap-driv
> er.pl --test-name="test-qga"
> PASS 1 test-qga /qga/sync-delimited
> PASS 2 test-qga /qga/sync
> PASS 3 test-qga /qga/ping
> **
> ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info:
> assertion failed (version == QEMU_VERSION): ("5.0.9
> 0" == "5.0.50")
> ERROR test-qga - Bail out!
> ERROR:/home/petmay01/linaro/qemu-for-merges/tests/test-qga.c:303:test_qga_info:
> assertion failed (versio
> n == QEMU_VERSION): ("5.0.90" == "5.0.50")
> Aborted (core dumped)
> /home/petmay01/linaro/qemu-for-merges/tests/Makefile.include:659:
> recipe for target 'check-unit' failed
> 
> Looking at timestamps on files, tests/test-qga.o never got rebuilt,
> even though config-host.h has been updated (and so has the new
> QEMU_VERSION). Any idea what's gone wrong here?
> 
> Also weird: this build tree has no .d files in it.

I've been trying to reproduce with:

make
make check-unit
*bump VERSION
make check-unit

but test-qga.o gets rebuilt as expected and the test passed.

This is with ubuntu 18.04, x86, with out-of-tree build directory. Are you aware
of any other factors that might be needed to reproduce this?

> 
> thanks
> -- PMM



Re: [PULL v1 0/2] Merge tpm 2020/07/15 v1

2020-07-16 Thread Peter Maydell
On Wed, 15 Jul 2020 at 20:23, Stefan Berger  wrote:
>
> Hello!
>
> This series fixes a couple of minor issues with the PPC64 TPM SPAPR interface
> and a test case.
>
>Stefan
>
> The following changes since commit 8746309137ba470d1b2e8f5ce86ac228625db940:
>
>   Update version for v5.1.0-rc0 release (2020-07-15 19:08:07 +0100)
>
> are available in the Git repository at:
>
>   git://github.com/stefanberger/qemu-tpm.git tags/pull-tpm-2020-07-15-1
>
> for you to fetch changes up to df8a7568932e4c3c930fdfeb228dd72b4bb71a1f:
>
>   tests: tpm: Skip over pcrUpdateCounter byte in result comparison 
> (2020-07-15 14:57:33 -0400)
>
> ---
> Stefan Berger (2):
>   tpm: tpm_spapr: Exit on TPM backend failures
>   tests: tpm: Skip over pcrUpdateCounter byte in result comparison


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.1
for any user-visible changes.

-- PMM



[GIT PULL] I2C updates

2020-07-16 Thread Corey Minyard
The following changes since commit 45db94cc90c286a9965a285ba19450f448760a09:

  Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-20200707' 
into staging (2020-07-10 16:43:40 +0100)

are available in the Git repository at:

  https://github.com/cminyard/qemu.git tags/for-qemu-i2c-5

for you to fetch changes up to 73d5f22ecbb76dfc785876779d47787084ff0f42:

  hw/i2c: Document the I2C qdev helpers (2020-07-16 12:30:54 -0500)


Minor changes to:

Add an SMBus config entry

Cleanup/simplify/document some I2C interfaces


Philippe Mathieu-Daudé (6):
  hw/i2c/Kconfig: Add an entry for the SMBus
  hw/i2c/aspeed_i2c: Simplify aspeed_i2c_get_bus()
  hw/i2c: Rename i2c_try_create_slave() as i2c_slave_new()
  hw/i2c: Rename i2c_realize_and_unref() as i2c_slave_realize_and_unref()
  hw/i2c: Rename i2c_create_slave() as i2c_slave_create_simple()
  hw/i2c: Document the I2C qdev helpers

 hw/arm/aspeed.c | 82 +++--
 hw/arm/musicpal.c   |  4 +--
 hw/arm/nseries.c|  8 ++---
 hw/arm/pxa2xx.c |  5 +--
 hw/arm/realview.c   |  2 +-
 hw/arm/spitz.c  |  4 +--
 hw/arm/stellaris.c  |  2 +-
 hw/arm/tosa.c   |  2 +-
 hw/arm/versatilepb.c|  2 +-
 hw/arm/vexpress.c   |  2 +-
 hw/arm/z2.c |  4 +--
 hw/display/sii9022.c|  2 +-
 hw/i2c/Kconfig  |  8 +++--
 hw/i2c/Makefile.objs|  3 +-
 hw/i2c/aspeed_i2c.c |  3 +-
 hw/i2c/core.c   | 15 -
 hw/ppc/e500.c   |  2 +-
 hw/ppc/sam460ex.c   |  2 +-
 include/hw/i2c/aspeed_i2c.h |  2 +-
 include/hw/i2c/i2c.h| 54 +++--
 20 files changed, 131 insertions(+), 77 deletions(-)




Re: [PATCH v3 3/9] vfio: add quirk device write method

2020-07-16 Thread Peter Maydell
On Tue, 30 Jun 2020 at 13:30, P J P  wrote:
>
> From: Prasad J Pandit 
>
> Add vfio quirk device mmio write method to avoid NULL pointer
> dereference issue.
>
> Reported-by: Lei Sun 
> Reviewed-by: Li Qiang 
> Signed-off-by: Prasad J Pandit 
> ---
>  hw/vfio/pci-quirks.c | 8 
>  1 file changed, 8 insertions(+)
>
> Update v3: Add Reviewed-by: ...
>   -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg09406.html
>
> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> index d304c81148..cc6d5dbc23 100644
> --- a/hw/vfio/pci-quirks.c
> +++ b/hw/vfio/pci-quirks.c
> @@ -14,6 +14,7 @@
>  #include "config-devices.h"
>  #include "exec/memop.h"
>  #include "qemu/units.h"
> +#include "qemu/log.h"
>  #include "qemu/error-report.h"
>  #include "qemu/main-loop.h"
>  #include "qemu/module.h"
> @@ -264,8 +265,15 @@ static uint64_t vfio_ati_3c3_quirk_read(void *opaque,
>  return data;
>  }
>
> +static void vfio_ati_3c3_quirk_write(void *opaque, hwaddr addr,
> +uint64_t data, unsigned size)
> +{
> +qemu_log_mask(LOG_GUEST_ERROR, "%s not implemented\n", __func__);
> +}
> +
>  static const MemoryRegionOps vfio_ati_3c3_quirk = {
>  .read = vfio_ati_3c3_quirk_read,
> +.write = vfio_ati_3c3_quirk_write,
>  .endianness = DEVICE_LITTLE_ENDIAN,
>  };


Alex (Williamson) -- as the vfio maintainer, do you have a view
on whether we should be logging write accesses to port 0x3c3
here as guest-errors or unimplemented-QEMU-functionality?

Guest-error seems plausible to me, anyway.

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v3 1/9] hw/pci-host: add pci-intack write method

2020-07-16 Thread Peter Maydell
On Tue, 30 Jun 2020 at 13:29, P J P  wrote:
>
> From: Prasad J Pandit 
>
> Add pci-intack mmio write method to avoid NULL pointer dereference
> issue.
>
> Reported-by: Lei Sun 
> Reviewed-by: Li Qiang 
> Signed-off-by: Prasad J Pandit 
> ---
>  hw/pci-host/prep.c | 8 
>  1 file changed, 8 insertions(+)
>
> Update v3: Add Reviewed-by: ...
>   -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg09395.html
>
> diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
> index 367e408b91..3c8ff6af03 100644
> --- a/hw/pci-host/prep.c
> +++ b/hw/pci-host/prep.c
> @@ -26,6 +26,7 @@
>  #include "qemu/osdep.h"
>  #include "qemu-common.h"
>  #include "qemu/units.h"
> +#include "qemu/log.h"
>  #include "qapi/error.h"
>  #include "hw/pci/pci.h"
>  #include "hw/pci/pci_bus.h"
> @@ -119,8 +120,15 @@ static uint64_t raven_intack_read(void *opaque, hwaddr 
> addr,
>  return pic_read_irq(isa_pic);
>  }
>
> +static void raven_intack_write(void *opaque, hwaddr addr,
> +uint64_t data, unsigned size)
> +{
> +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__);
> +}
> +
>  static const MemoryRegionOps raven_intack_ops = {
>  .read = raven_intack_read,
> +.write = raven_intack_write,
>  .valid = {
>  .max_access_size = 1,
>  },

I suspect this may be a read-only register (and so a guest error
rather than unimp) but I'm not sure I've found the correct
Raven PCI controller datasheet, and if I have then there's a
lot of unimplemented functionality in our model. So UNIMP
is fine. This controller is only used by the ppc '40p' machine.

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v3 9/9] memory: assert MemoryRegionOps callbacks are defined

2020-07-16 Thread Peter Maydell
On Tue, 30 Jun 2020 at 13:31, P J P  wrote:
>
> From: Prasad J Pandit 
>
> When registering a MemoryRegionOps object, assert that its
> read/write callback methods are defined. This avoids potential
> guest crash via a NULL pointer dereference.
>
> Suggested-by: Peter Maydell 
> Reviewed-by: Li Qiang 
> Signed-off-by: Prasad J Pandit 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM



[PATCH v4 for-5.2 1/2] spapr: Use error_append_hint() in spapr_caps.c

2020-07-16 Thread Greg Kurz
We have a dedicated error API for hints. Use it instead of embedding
the hint in the error message, as recommended in the "qapi/error.h"
header file.

Since spapr_caps_apply() passes &error_fatal, all functions must
also call the ERRP_GUARD() macro for error_append_hint() to be
functional.

While here, have cap_fwnmi_apply(), which already uses
error_append_hint(), call ERRP_GUARD() as well.
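
A rough sketch of the resulting pattern, using made-up function and
capability names rather than anything from this patch (includes
abbreviated):

    #include "qemu/osdep.h"
    #include "qapi/error.h"
    #include "hw/ppc/spapr.h"

    static void cap_frob_apply(SpaprMachineState *spapr, uint8_t val,
                               Error **errp)
    {
        /* Required so error_append_hint() still works when the caller
         * (here spapr_caps_apply()) passes &error_fatal. */
        ERRP_GUARD();

        if (val && !frob_supported()) {   /* frob_supported() is made up */
            error_setg(errp, "Frobnication not available");
            error_append_hint(errp, "Try appending -machine cap-frob=off\n");
        }
    }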

Signed-off-by: Greg Kurz 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Laurent Vivier 
---
 hw/ppc/spapr_caps.c |   89 +--
 1 file changed, 50 insertions(+), 39 deletions(-)

diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 3225fc5a2edc..275f5bd0342c 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -180,24 +180,24 @@ static void spapr_cap_set_pagesize(Object *obj, Visitor 
*v, const char *name,
 
 static void cap_htm_apply(SpaprMachineState *spapr, uint8_t val, Error **errp)
 {
+ERRP_GUARD();
 if (!val) {
 /* TODO: We don't support disabling htm yet */
 return;
 }
 if (tcg_enabled()) {
-error_setg(errp,
-   "No Transactional Memory support in TCG,"
-   " try appending -machine cap-htm=off");
+error_setg(errp, "No Transactional Memory support in TCG");
+error_append_hint(errp, "Try appending -machine cap-htm=off\n");
 } else if (kvm_enabled() && !kvmppc_has_cap_htm()) {
 error_setg(errp,
-"KVM implementation does not support Transactional Memory,"
-   " try appending -machine cap-htm=off"
-);
+   "KVM implementation does not support Transactional Memory");
+error_append_hint(errp, "Try appending -machine cap-htm=off\n");
 }
 }
 
 static void cap_vsx_apply(SpaprMachineState *spapr, uint8_t val, Error **errp)
 {
+ERRP_GUARD();
 PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
 CPUPPCState *env = &cpu->env;
 
@@ -209,13 +209,14 @@ static void cap_vsx_apply(SpaprMachineState *spapr, 
uint8_t val, Error **errp)
  * rid of anything that doesn't do VMX */
 g_assert(env->insns_flags & PPC_ALTIVEC);
 if (!(env->insns_flags2 & PPC2_VSX)) {
-error_setg(errp, "VSX support not available,"
-   " try appending -machine cap-vsx=off");
+error_setg(errp, "VSX support not available");
+error_append_hint(errp, "Try appending -machine cap-vsx=off\n");
 }
 }
 
 static void cap_dfp_apply(SpaprMachineState *spapr, uint8_t val, Error **errp)
 {
+ERRP_GUARD();
 PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
 CPUPPCState *env = &cpu->env;
 
@@ -224,8 +225,8 @@ static void cap_dfp_apply(SpaprMachineState *spapr, uint8_t 
val, Error **errp)
 return;
 }
 if (!(env->insns_flags2 & PPC2_DFP)) {
-error_setg(errp, "DFP support not available,"
-   " try appending -machine cap-dfp=off");
+error_setg(errp, "DFP support not available");
+error_append_hint(errp, "Try appending -machine cap-dfp=off\n");
 }
 }
 
@@ -239,6 +240,7 @@ SpaprCapPossible cap_cfpc_possible = {
 static void cap_safe_cache_apply(SpaprMachineState *spapr, uint8_t val,
  Error **errp)
 {
+ERRP_GUARD();
 uint8_t kvm_val =  kvmppc_get_cap_safe_cache();
 
 if (tcg_enabled() && val) {
@@ -247,9 +249,9 @@ static void cap_safe_cache_apply(SpaprMachineState *spapr, 
uint8_t val,
 cap_cfpc_possible.vals[val]);
 } else if (kvm_enabled() && (val > kvm_val)) {
 error_setg(errp,
-   "Requested safe cache capability level not supported by 
kvm,"
-   " try appending -machine cap-cfpc=%s",
-   cap_cfpc_possible.vals[kvm_val]);
+   "Requested safe cache capability level not supported by 
KVM");
+error_append_hint(errp, "Try appending -machine cap-cfpc=%s\n",
+  cap_cfpc_possible.vals[kvm_val]);
 }
 }
 
@@ -263,6 +265,7 @@ SpaprCapPossible cap_sbbc_possible = {
 static void cap_safe_bounds_check_apply(SpaprMachineState *spapr, uint8_t val,
 Error **errp)
 {
+ERRP_GUARD();
 uint8_t kvm_val =  kvmppc_get_cap_safe_bounds_check();
 
 if (tcg_enabled() && val) {
@@ -271,9 +274,9 @@ static void cap_safe_bounds_check_apply(SpaprMachineState 
*spapr, uint8_t val,
 cap_sbbc_possible.vals[val]);
 } else if (kvm_enabled() && (val > kvm_val)) {
 error_setg(errp,
-"Requested safe bounds check capability level not supported by kvm,"
-   " try appending -machine cap-sbbc=%s",
-   cap_sbbc_possible.vals[kvm_val]);
+"Requested safe bounds check capability level not supported by KVM");
+error_append_hint(errp, "Try appending -machine cap-sbbc=%s\n",
+  cap_sbbc_possible.vals[kvm_val]);
 }
 }
 
@@ -290,6 +293,7 @@ SpaprCapPossible cap_ibs_possible = {

[PATCH v4 for-5.2 2/2] spapr: Forbid nested KVM-HV in pre-power9 compat mode

2020-07-16 Thread Greg Kurz
Nested KVM HV only works if the kernel is using the radix MMU mode, ie.
the CPU is POWER9 and it is not running in some pre-power9 compat mode.
Otherwise, the KVM HV module fails to load in the guest with -ENODEV.
It might be painful for a user to discover this late that nested cannot
work with their setup. Erroring out at machine init instead seems to be
the best we can do.

Signed-off-by: Greg Kurz 
Reviewed-by: Laurent Vivier 
---
 hw/ppc/spapr_caps.c |   10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 275f5bd0342c..10a80a8159f4 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -382,6 +382,8 @@ static void cap_nested_kvm_hv_apply(SpaprMachineState 
*spapr,
 uint8_t val, Error **errp)
 {
 ERRP_GUARD();
+PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
+
 if (!val) {
 /* capability disabled by default */
 return;
@@ -391,6 +393,14 @@ static void cap_nested_kvm_hv_apply(SpaprMachineState 
*spapr,
 error_setg(errp, "No Nested KVM-HV support in TCG");
 error_append_hint(errp, "Try appending -machine cap-nested-hv=off\n");
 } else if (kvm_enabled()) {
+if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_3_00, 0,
+  spapr->max_compat_pvr)) {
+error_setg(errp, "Nested KVM-HV only supported on POWER9");
+error_append_hint(errp,
+  "Try appending -machine 
max-cpu-compat=power9\n");
+return;
+}
+
 if (!kvmppc_has_cap_nested_kvm_hv()) {
 error_setg(errp,
"KVM implementation does not support Nested KVM-HV");





[PATCH v4 for-5.2 0/2] spapr: Improve error reporting in spapr_caps.c

2020-07-16 Thread Greg Kurz
Nested KVM HV only works if the kernel is using the radix MMU mode, ie.
the CPU is POWER9 and it is not running in some pre-power9 compat mode.
Otherwise, the KVM HV module fails to load in the guest with -ENODEV.
It might be painful for a user to discover this late that nested cannot
work with their setup. It seems a better fit for QEMU to do a sanity
check when applying the nested-hv sPAPR capability and print out an
informative error message.

sPAPR capabilities are checked at machine init. If a capability cannot
be used, an error message is printed and QEMU exits. In most places,
the error message also contains a hint for the user. But we should
use error_append_hint() for that, as explained in the "qapi/error.h"
header.

So this series first converts spapr_caps.c to using error_append_hint().
This requires adding some ERRP_GUARD() because spapr_caps_apply() passes
&error_fatal. Then it adds a sanity check for the nested-hv case with
an error message and hint.

v4: - Same as v3 but rebased on ppc-for-5.2, updated changelogs
  and cover
v3: - Add preliminary patch to use warn_report() instead of
  a convoluted error_setg()+warn_report_err() sequence
v2: - Fix indentation and add some missing \n in patch 2
- Add ERRP_AUTO_PROPAGATE() to cap_nested_kvm_hv_apply() in
  patch 2 instead of patch 3

---

Greg Kurz (2):
  spapr: Use error_append_hint() in spapr_caps.c
  spapr: Forbid nested KVM-HV in pre-power9 compat mode


 hw/ppc/spapr_caps.c |   99 +++
 1 file changed, 60 insertions(+), 39 deletions(-)

--
Greg




Re: [PATCH 2/2] python/qemu: Change ConsoleSocket to optionally drain socket.

2020-07-16 Thread Robert Foley
On Thu, 16 Jul 2020 at 09:42, Alex Bennée  wrote:
>

> > +self._drain_thread = None
> > +socket.socket.__init__(self, socket.AF_UNIX, socket.SOCK_STREAM)
> > +self.connect(address)
> > +self._drain = drain
>
> We end up with two variables that represent the fact we have draining
> happening. Could we rationalise it into:
>
>   if drain:
>  self._drain_thread = self._thread_start()
>   else
>  self._drain_thread = None # if this is needed
>
> And then tests for:
>
>   if not self._drain:
>
> become
>
>   if self._drain_thread is None:

Good point, this is simpler.  Will update.


> > +if self._drain and self._drain_thread is not None:
> > +thread, self._drain_thread = self._drain_thread, None
> Would self._drain ever not have self._drain_thread set?

No, I believe that if drain is set, it results in _drain_thread also being set.
This will be cleaned up once we drop the _drain.

>
> >  thread.join()
> > +socket.socket.close(self)
> 
> > diff --git a/python/qemu/machine.py b/python/qemu/machine.py
> > index 6769359766..62709d86e4 100644
> > --- a/python/qemu/machine.py
> > +++ b/python/qemu/machine.py
> > @@ -22,7 +22,6 @@ import logging
> >  import os
> >  import subprocess
> >  import shutil
> > -import socket
>
> FYI minor patch conflict here with master

OK, will rebase and fix this conflict.

Thanks & Regards,
-Rob
>
> >  import tempfile
> >  from typing import Optional, Type
> >  from types import TracebackType
> > @@ -591,12 +590,8 @@ class QEMUMachine:
> >  Returns a socket connected to the console
> >  """
> >  if self._console_socket is None:
> > -if self._drain_console:
> > -self._console_socket = console_socket.ConsoleSocket(
> > -self._console_address,
> > -file=self._console_log_path)
> > -else:
> > -self._console_socket = socket.socket(socket.AF_UNIX,
> > - socket.SOCK_STREAM)
> > -self._console_socket.connect(self._console_address)
> > +self._console_socket = console_socket.ConsoleSocket(
> > +self._console_address,
> > +file=self._console_log_path,
> > +drain=self._drain_console)
> >  return self._console_socket
>
>
> --
> Alex Bennée



Re: [PATCH v3 8/9] imx7-ccm: add digprog mmio write method

2020-07-16 Thread Peter Maydell
On Thu, 16 Jul 2020 at 17:55, P J P  wrote:
>
> +-- On Thu, 16 Jul 2020, Peter Maydell wrote --+
> | > +static void imx7_digprog_write(void *opaque, hwaddr addr,
> | > +uint64_t data, unsigned size)
> | > +{
> | > +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__);
> | > +}
> |
> | This covers a single register (DIGPROG) which is read-only (it returns a
> | silicon revision number). So this is not a LOG_UNIMP, but a LOG_GUEST_ERROR:
> |
> |  qemu_log_mask(LOG_GUEST_ERROR, "Guest write to read-only
> | ANALOG_DIGPROG register\n");
>
> Should this be g_assert_not_reached() in that case?

No, because a malicious guest can write to the register
(and cause the function to be called), it is merely that
it is a bug in guest code for it to do that.

-- PMM



Re: [PATCH v3 8/9] imx7-ccm: add digprog mmio write method

2020-07-16 Thread P J P
+-- On Thu, 16 Jul 2020, Peter Maydell wrote --+
| > +static void imx7_digprog_write(void *opaque, hwaddr addr,
| > +uint64_t data, unsigned size)
| > +{
| > +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__);
| > +}
| 
| This covers a single register (DIGPROG) which is read-only (it returns a 
| silicon revision number). So this is not a LOG_UNIMP, but a LOG_GUEST_ERROR:
| 
|  qemu_log_mask(LOG_GUEST_ERROR, "Guest write to read-only
| ANALOG_DIGPROG register\n");

Should this be g_assert_not_reached() in that case?

Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D




Re: [PATCH] introduce VFIO-over-socket protocol specificaion

2020-07-16 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/1594913503-52271-1-git-send-email-thanos.maka...@nutanix.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 1594913503-52271-1-git-send-email-thanos.maka...@nutanix.com
Subject: [PATCH] introduce VFIO-over-socket protocol specificaion

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
aa6155e introduce VFIO-over-socket protocol specificaion

=== OUTPUT BEGIN ===
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#20: 
new file mode 100644

ERROR: trailing whitespace
#167: FILE: docs/devel/vfio-over-socket.rst:143:
+configuration space. $

ERROR: trailing whitespace
#360: FILE: docs/devel/vfio-over-socket.rst:336:
+|| +-++ | $

ERROR: trailing whitespace
#572: FILE: docs/devel/vfio-over-socket.rst:548:
+be supported in future versions. $

ERROR: trailing whitespace
#587: FILE: docs/devel/vfio-over-socket.rst:563:
+| Command  | 5| $

ERROR: trailing whitespace
#874: FILE: docs/devel/vfio-over-socket.rst:850:
+the interrupt. $

ERROR: trailing whitespace
#878: FILE: docs/devel/vfio-over-socket.rst:854:
+guest unmasks the interrupt. $

ERROR: trailing whitespace
#908: FILE: docs/devel/vfio-over-socket.rst:884:
+appears after the 16 byte message header. $

total: 7 errors, 1 warnings, 1135 lines checked

Commit aa6155e147a9 (introduce VFIO-over-socket protocol specificaion) has 
style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/1594913503-52271-1-git-send-email-thanos.maka...@nutanix.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH] introduce VFIO-over-socket protocol specificaion

2020-07-16 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/1594913503-52271-1-git-send-email-thanos.maka...@nutanix.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  block/qapi.o
  CC  block/file-win32.o

Warning, treated as error:
/tmp/qemu-test/src/docs/devel/vfio-over-socket.rst:281:Malformed table.

+--+-+---+
---
  CC  crypto/hmac.o
  CC  crypto/hmac-nettle.o
  CC  crypto/desrfb.o
make: *** [Makefile:1090: docs/devel/index.html] Error 2
make: *** Waiting for unfinished jobs
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 708, in 
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=739a80f85e9146d59e3826f0d63daa6c', '-u', 
'1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-gwtqm3ly/src/docker-src.2020-07-16-12.51.55.8263:/var/tmp/qemu:z,ro',
 'qemu/fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=739a80f85e9146d59e3826f0d63daa6c
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-gwtqm3ly/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real3m4.939s
user0m8.716s


The full log is available at
http://patchew.org/logs/1594913503-52271-1-git-send-email-thanos.maka...@nutanix.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH v3 2/9] pci-host: add pcie-msi read method

2020-07-16 Thread Peter Maydell
On Tue, 30 Jun 2020 at 13:30, P J P  wrote:
>
> From: Prasad J Pandit 
>
> Add pcie-msi mmio read method to avoid NULL pointer dereference
> issue.

This change is specific to the designware pci host controller;
it would be nice to have "designware" in the commit subject.


> Reported-by: Lei Sun 
> Reviewed-by: Li Qiang 
> Signed-off-by: Prasad J Pandit 
> ---
>  hw/pci-host/designware.c | 9 +
>  1 file changed, 9 insertions(+)
>
> Update v3: Add Reviewed-by: ...
>   -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg09400.html

> +static uint64_t designware_pcie_root_msi_read(void *opaque, hwaddr addr,
> +  unsigned size)
> +{
> +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__);
> +return 0;

This is the right change, but is missing an explanation
of why it's right:

Looking at the data sheet, in the real hardware MSI interrupts
are handled by looking at writes to see whether they match the
configured address; if so then the write gets quashed and never
gets put out onto the AXI bus (to the CPU, memory, etc). This only
happens for writes, so reads from the magic address are just
allowed to pass through and will read from the system address
space like any other read from any other address. That's not trivial
to implement, though, and well-behaved guests won't care, so
we can just explain why we're not doing anything with a suitable
comment:

/*
 * Attempts to read from the MSI address are undefined in
 * the PCI specifications. For this hardware, the datasheet
 * specifies that a read from the magic address is simply not
 * intercepted by the MSI controller, and will go out to the
 * AHB/AXI bus like any other PCI-device-initiated DMA read.
 * This is not trivial to implement in QEMU, so since
 * well-behaved guests won't ever ask a PCI device to DMA from
 * this address we just log the missing functionality.
 */
qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__);
return 0;

> +}
> +
>  static void designware_pcie_root_msi_write(void *opaque, hwaddr addr,
> uint64_t val, unsigned len)
>  {
> @@ -77,6 +85,7 @@ static void designware_pcie_root_msi_write(void *opaque, 
> hwaddr addr,
>  }
>
>  static const MemoryRegionOps designware_pci_host_msi_ops = {
> +.read = designware_pcie_root_msi_read,
>  .write = designware_pcie_root_msi_write,
>  .endianness = DEVICE_LITTLE_ENDIAN,
>  .valid = {

With that comment,
Reviewed-by: Peter Maydell 

thanks
-- PMM



[PATCH] gitlab-ci.yml: Add oss-fuzz build tests

2020-07-16 Thread Alexander Bulekov
This tries to build and run the fuzzers with the same build-script used
by oss-fuzz. This doesn't guarantee that the builds on oss-fuzz will
also succeed, since oss-fuzz provides its own compiler and fuzzer vars,
but it can catch changes that are not compatible with the
./scripts/oss-fuzz/build.sh script.
The strange way of finding fuzzer binaries stems from the method used by
oss-fuzz:
https://github.com/google/oss-fuzz/blob/master/infra/base-images/base-runner/targets_list

Signed-off-by: Alexander Bulekov 
---

Similar to Thomas' patch:

> Note: This patch needs two other patches merged first to work correctly:

> - 'fuzz: Expect the cmdline in a freeable GString' from Alexander

> - 'qom: Plug memory leak in "info qom-tree"' from Markus

Otherwise the test will fail due to detected memory leaks.

Fair warning: I haven't been able to trigger this new job yet. I tried
to run the pipeline with these changes on my forked repo on gitlab, but
did not reach the build-oss-fuzz job. I think this is due to some failures
in the Containers-layer-2 stage:

...
Error response from daemon: manifest for
registry.gitlab.com/a1xndr/qemu/qemu/debian-all-test-cross:latest not
found: manifest unknown: manifest unknown
#2 [internal] load .dockerignore
#2 transferring context:
#2 transferring context: 2B 0.1s done
#2 DONE 0.1s
#1 [internal] load build definition from tmpg8j4xoop.docker
#1 transferring dockerfile: 2.21kB 0.1s done
#1 DONE 0.2s
#3 [internal] load metadata for docker.io/qemu/debian10:latest
#3 ERROR: pull access denied, repository does not exist or may require
authorization: server message: insufficient_scope: authorization failed
#4 [1/3] FROM docker.io/qemu/debian10:latest
#4 resolve docker.io/qemu/debian10:latest 0.1s done
#4 ERROR: pull access denied, repository does not exist or may require
authorization: server message: insufficient_scope: authorization failed
--
 > [internal] load metadata for docker.io/qemu/debian10:latest:
--
--
 > [1/3] FROM docker.io/qemu/debian10:latest:
--
failed to solve with frontend dockerfile.v0: failed to build LLB: failed
to load cache key: pull access denied, repository does not exist or may
require authorization: server message: insufficient_scope: authorization
failed
...

 .gitlab-ci.yml | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index e96f8794b9..a50df420c9 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -182,6 +182,20 @@ build-fuzzer:
 || exit 1 ;
   done
 
+build-oss-fuzz:
+  <<: *native_build_job_definition
+  variables:
+IMAGE: fedora
+  script:
+- OUT_DIR="./build" CC=clang-9 CXX=clang++-9 CFLAGS="-fsanitize=address"
+  LIB_FUZZING_ENGINE="-fsanitize=fuzzer" CFL
+  ./scripts/oss-fuzz/build.sh
+- for fuzzer in $(find ./build-oss-fuzz/DEST_DIR/ -executable -type f); do
+grep "LLVMFuzzerTestOneInput" ${fuzzer} > /dev/null 2>&1 || continue ;
+echo Testing ${fuzzer} ... ;
+"${fuzzer}" -runs=1000 || exit 1 ;
+  done
+
 build-tci:
   <<: *native_build_job_definition
   variables:
-- 
2.26.2




Re: [PATCH] gitlab-ci.yml: Add fuzzer tests

2020-07-16 Thread Alexander Bulekov
On 200716 1209, Thomas Huth wrote:
> So far we neither compile-tested nor ran any of the new fuzzers in our CI,
> which led to some build failures of the fuzzer code in the past weeks.
> To avoid this problem, add a job to compile the fuzzer code and run some
> loops (which likely don't find any new bugs via fuzzing, but at least we
> know that the code can still be run).
> 
> A nice side-effect of this test is that the leak tests are enabled here,
> so we should now notice some of the memory leaks in our code base earlier.
> 
> Signed-off-by: Thomas Huth 

Thank you for this, Thomas. I just sent a patch to add a job that runs
similar tests with the build-script that we use on oss-fuzz
Patch: <20200716163330.29141-1-alx...@bu.edu>

I know that these jobs have a lot of overlap, but there are enough
quirks in the oss-fuzz build-script that I, personally, don't think
they are redundant.

A couple notes below, and I haven't been able to test on my own fork of
qemu on gitlab, yet due to some pipeline errors, but otherwise

Reviewed-by: Alexander Bulekov 

> ---
>  .gitlab-ci.yml | 20 +++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> index 5eeba2791b..e96f8794b9 100644
> --- a/.gitlab-ci.yml
> +++ b/.gitlab-ci.yml
> @@ -161,9 +161,27 @@ build-clang:
>  IMAGE: fedora
>  CONFIGURE_ARGS: --cc=clang --cxx=clang++
>  TARGETS: alpha-softmmu arm-softmmu m68k-softmmu mips64-softmmu
> -  ppc-softmmu s390x-softmmu x86_64-softmmu arm-linux-user
> +  ppc-softmmu s390x-softmmu arm-linux-user
>  MAKE_CHECK_ARGS: check
>  
> +build-fuzzer:
> +  <<: *native_build_job_definition
> +  variables:
> +IMAGE: fedora
> +  script:
> +- mkdir build
> +- cd build
> +- ../configure --cc=clang --cxx=clang++ --enable-fuzzing
> +   --target-list=x86_64-softmmu

https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg02310.html
When/if this gets merged, enable-fuzzing won't build with
AddressSanitizer, by default, so I would add --enable-sanitizers, just
to be safe. 

> +- make -j"$JOBS" all check-build x86_64-softmmu/fuzz
> +- make check
> +- for fuzzer in i440fx-qos-fork-fuzz i440fx-qos-noreset-fuzz
> +i440fx-qtest-reboot-fuzz virtio-scsi-flags-fuzz virtio-scsi-fuzz ; do

Any reason for these particular fuzzers? I know the virtio-net ones find
crashes pretty quickly, but I don't think that causes a non-zero exit.

> +  echo Testing ${fuzzer} ... ;
> +  x86_64-softmmu/qemu-fuzz-x86_64 --fuzz-target=${fuzzer} -runs=1000
> +|| exit 1 ;
> +  done
> +
>  build-tci:
><<: *native_build_job_definition
>variables:
> -- 
> 2.18.1
> 



Re: [PATCH v3] spapr: Add a new level of NUMA for GPUs

2020-07-16 Thread Daniel Henrique Barboza




On 7/16/20 1:00 PM, Reza Arbab wrote:

On Thu, Jul 16, 2020 at 06:42:11AM -0300, Daniel Henrique Barboza wrote:

Which would translate here to:

   uint32_t associativity[] = {
   cpu_to_be32(0x4),
   cpu_to_be32(nvslot->numa_id),
   cpu_to_be32(nvslot->numa_id),
   cpu_to_be32(nvslot->numa_id),
   cpu_to_be32(nvslot->numa_id),
   };


Sure, that's how it originally was in v1 of the patch.


Yeah, Greg commented on this chunk in v2:


This is a guest visible change. It should theoretically be controlled
with a compat property of the PHB (look for "static GlobalProperty" in
spapr.c). But since this code is only used for GPU passthrough and we
don't support migration of such devices, I guess it's okay. Maybe just
mention it in the changelog.
--

By all means you can mention in changelog/code comment why the associativity
of the GPU is nvslot->numa_id 4 times in a row, but I believe this
format is still clearer for us to understand. It also makes it equal
to skiboot.

And it deprecates the SPAPR_GPU_NUMA_ID macro, allowing us to use its value
(1) for other internal purposes regarding NUMA without collision with the
GPU semantics.



Thanks,


DHB




I'll send a v4 today. It's been a while so I need to rebase anyway.





Re: [PATCH v3 6/9] spapr_pci: add spapr msi read method

2020-07-16 Thread Peter Maydell
On Tue, 30 Jun 2020 at 13:31, P J P  wrote:
>
> From: Prasad J Pandit 
>
> Add spapr msi mmio read method to avoid NULL pointer dereference
> issue.
>
> Reported-by: Lei Sun 
> Acked-by: David Gibson 
> Reviewed-by: Li Qiang 
> Signed-off-by: Prasad J Pandit 
> ---
>  hw/ppc/spapr_pci.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> Update v3: Add Acked-by: ...
>   -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg08054.html
>
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 329002ac04..7033352834 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -52,6 +52,7 @@
>  #include "sysemu/kvm.h"
>  #include "sysemu/hostmem.h"
>  #include "sysemu/numa.h"
> +#include "qemu/log.h"
>
>  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
>  #define RTAS_QUERY_FN   0
> @@ -738,6 +739,12 @@ static PCIINTxRoute spapr_route_intx_pin_to_irq(void 
> *opaque, int pin)
>  return route;
>  }
>
> +static uint64_t spapr_msi_read(void *opaque, hwaddr addr, unsigned size)
> +{
> +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__);
> +return 0;
> +}
> +
>  /*
>   * MSI/MSIX memory region implementation.
>   * The handler handles both MSI and MSIX.
> @@ -755,8 +762,10 @@ static void spapr_msi_write(void *opaque, hwaddr addr,
>  }
>
>  static const MemoryRegionOps spapr_msi_ops = {
> -/* There is no .read as the read result is undefined by PCI spec */
> -.read = NULL,
> +/* .read result is undefined by PCI spec

QEMU multiline comments should have the '/*' on a line of its own.

> + * define .read method to avoid assert failure in memory_region_init_io
> + */

If this is undefined behaviour per the PCI spec then LOG_UNIMP
is the wrong thing -- this should either be LOG_GUEST_ERROR
(if the guest can do this or program the h/w to do this)
or assert() (if the only way this could happen would be a bug
in a QEMU model of a PCI device).

> +.read = spapr_msi_read,
>  .write = spapr_msi_write,
>  .endianness = DEVICE_LITTLE_ENDIAN
>  };
> --
> 2.26.2

thanks
-- PMM



Re: [PATCH] target/i386: floatx80: avoid compound literals in static initializers

2020-07-16 Thread Alex Bennée


Laszlo Ersek  writes:

> Quoting ISO C99 6.7.8p4, "All the expressions in an initializer for an
> object that has static storage duration shall be constant expressions or
> string literals".
>
> The compound literal produced by the make_floatx80() macro is not such a
> constant expression, per 6.6p7-9. (An implementation may accept it,
> according to 6.6p10, but is not required to.)
>
> Therefore using "floatx80_zero" and make_floatx80() for initializing
> "f2xm1_table" and "fpatan_table" is not portable. And gcc-4.8 in RHEL-7.6
> actually chokes on them:
>
>> target/i386/fpu_helper.c:871:5: error: initializer element is not constant
>>  { make_floatx80(0xbfff, 0x8000000000000000ULL),
>>  ^
>
> We've had the make_floatx80_init() macro for this purpose since commit
> 3bf7e40ab914 ("softfloat: fix for C99", 2012-03-17), so let's use that
> macro again.
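
A minimal standalone illustration of the constraint being cited, using a
made-up struct rather than the actual QEMU types:

    /* Strict ISO C99 compilers (such as the gcc-4.8 quoted above) reject
     * the "bad" line with "initializer element is not constant"; newer
     * GCC accepts it only as an extension. */
    typedef struct { unsigned short high; unsigned long long low; } fx80;

    #define make_fx80(h, l)      ((fx80){ .high = (h), .low = (l) }) /* compound literal */
    #define make_fx80_init(h, l) { .high = (h), .low = (l) }         /* plain initializer */

    static const fx80 bad  = make_fx80(0x3fff, 0x8000000000000000ULL);      /* not portable */
    static const fx80 good = make_fx80_init(0x3fff, 0x8000000000000000ULL); /* fine per 6.7.8p4 */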
>
> Fixes: eca30647fc07
> Fixes: ff57bb7b6326
> Link: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg06566.html
> Link: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg04714.html
> Cc: Alex Bennée 
> Cc: Aurelien Jarno 
> Cc: Eduardo Habkost 
> Cc: Joseph Myers 
> Cc: Paolo Bonzini 
> Cc: Peter Maydell 
> Cc: Richard Henderson 
> Signed-off-by: Laszlo Ersek 
> ---
>
> Notes:
> I can see that there are test cases under "tests/tcg/i386", but I don't
> know how to run them.

You can run the TCG tests with:

  make check-tcg

or more specifically:

  make run-tcg-tests-i386-linux-user

there is also a:

  make check-softfloat

although in this case nothing is affected.

softfloat bits:

Acked-by: Alex Bennée 

>
>  include/fpu/softfloat.h  |   1 +
>  target/i386/fpu_helper.c | 426 +++
>  2 files changed, 214 insertions(+), 213 deletions(-)
>
> diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
> index f1a19df066b7..659218b5c787 100644
> --- a/include/fpu/softfloat.h
> +++ b/include/fpu/softfloat.h
> @@ -822,6 +822,7 @@ static inline bool floatx80_invalid_encoding(floatx80 a)
>  }
>  
>  #define floatx80_zero make_floatx80(0x0000, 0x0000000000000000LL)
> +#define floatx80_zero_init make_floatx80_init(0x0000, 0x0000000000000000LL)
>  #define floatx80_one make_floatx80(0x3fff, 0x8000000000000000LL)
>  #define floatx80_ln2 make_floatx80(0x3ffe, 0xb17217f7d1cf79acLL)
>  #define floatx80_pi make_floatx80(0x4000, 0xc90fdaa22168c235LL)
> diff --git a/target/i386/fpu_helper.c b/target/i386/fpu_helper.c
> index f5e6c4b88d4e..4ea73874d836 100644
> --- a/target/i386/fpu_helper.c
> +++ b/target/i386/fpu_helper.c
> @@ -868,201 +868,201 @@ struct f2xm1_data {
>  };
>  
>  static const struct f2xm1_data f2xm1_table[65] = {
> -{ make_floatx80(0xbfff, 0x8000000000000000ULL),
> -  make_floatx80(0x3ffe, 0x8000000000000000ULL),
> -  make_floatx80(0xbffe, 0x8000000000000000ULL) },
> -{ make_floatx80(0xbffe, 0xf8002e7eULL),
> -  make_floatx80(0x3ffe, 0x82cd8698ac2b9160ULL),
> -  make_floatx80(0xbffd, 0xfa64f2cea7a8dd40ULL) },
> -{ make_floatx80(0xbffe, 0xefffe960ULL),
> -  make_floatx80(0x3ffe, 0x85aac367cc488345ULL),
> -  make_floatx80(0xbffd, 0xf4aa7930676ef976ULL) },
> -{ make_floatx80(0xbffe, 0xe8006f10ULL),
> -  make_floatx80(0x3ffe, 0x88980e8092da5c14ULL),
> -  make_floatx80(0xbffd, 0xeecfe2feda4b47d8ULL) },
> -{ make_floatx80(0xbffe, 0xe0008a45ULL),
> -  make_floatx80(0x3ffe, 0x8b95c1e3ea8ba2a5ULL),
> -  make_floatx80(0xbffd, 0xe8d47c382ae8bab6ULL) },
> -{ make_floatx80(0xbffe, 0xd7ff8a9eULL),
> -  make_floatx80(0x3ffe, 0x8ea4398b45cd8116ULL),
> -  make_floatx80(0xbffd, 0xe2b78ce97464fdd4ULL) },
> -{ make_floatx80(0xbffe, 0xd00019a0ULL),
> -  make_floatx80(0x3ffe, 0x91c3d373ab11b919ULL),
> -  make_floatx80(0xbffd, 0xdc785918a9dc8dceULL) },
> -{ make_floatx80(0xbffe, 0xc7ff14dfULL),
> -  make_floatx80(0x3ffe, 0x94f4efa8fef76836ULL),
> -  make_floatx80(0xbffd, 0xd61620ae02112f94ULL) },
> -{ make_floatx80(0xbffe, 0xc0006530ULL),
> -  make_floatx80(0x3ffe, 0x9837f0518db87fbbULL),
> -  make_floatx80(0xbffd, 0xcf901f5ce48f008aULL) },
> -{ make_floatx80(0xbffe, 0xb7ff1723ULL),
> -  make_floatx80(0x3ffe, 0x9b8d39b9d54eb74cULL),
> -  make_floatx80(0xbffd, 0xc8e58c8c55629168ULL) },
> -{ make_floatx80(0xbffe, 0xb000b5e1ULL),
> -  make_floatx80(0x3ffe, 0x9ef5326091a0c366ULL),
> -  make_floatx80(0xbffd, 0xc2159b3edcbe7934ULL) },
> -{ make_floatx80(0xbffe, 0xa8006f8aULL),
> -  make_floatx80(0x3ffe, 0xa27043030c49370aULL),
> -  make_floatx80(0xbffd, 0xbb1f79f9e76d91ecULL) },
> -{ make_floatx80(0xbffe, 0x9fff816aULL),
> -  make_floatx80(0x3ffe, 0xa5fed6a9b15171cfULL),
> -  make_floatx80(0xbffd, 0xb40252ac9d5d1c62ULL) },
> -{ make_floatx80(0xbffe, 0x97ffb621ULL),
> -  make_floatx80(0x3ffe, 0xa9a15ab4ea7c30e6ULL),
> -  make_floatx80(0xbffd, 

Re: [PATCH v3 5/9] nvram: add nrf51_soc flash read method

2020-07-16 Thread Peter Maydell
On Tue, 30 Jun 2020 at 13:31, P J P  wrote:
>
> From: Prasad J Pandit 
>
> Add nrf51_soc mmio read method to avoid NULL pointer dereference
> issue.
>
> Reported-by: Lei Sun 
> Signed-off-by: Prasad J Pandit 
> ---
>  hw/nvram/nrf51_nvm.c | 5 +
>  1 file changed, 5 insertions(+)
>
> Update v3: use g_assert_not_reached()
>   -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg09560.html
>
> diff --git a/hw/nvram/nrf51_nvm.c b/hw/nvram/nrf51_nvm.c
> index f2283c1a8d..82e89d965b 100644
> --- a/hw/nvram/nrf51_nvm.c
> +++ b/hw/nvram/nrf51_nvm.c
> @@ -273,6 +273,10 @@ static const MemoryRegionOps io_ops = {
>  .endianness = DEVICE_LITTLE_ENDIAN,
>  };
>
> +static uint64_t flash_read(void *opaque, hwaddr offset, unsigned size)
> +{

Could use a comment about why this is unreachable, since it's
not totally obvious:
/*
 * This is a rom_device MemoryRegion which is always in
 * romd_mode (we never put it in MMIO mode), so reads always
 * go directly to RAM and never come here.
 */
> +g_assert_not_reached();
> +}
>

Otherwise
Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v2 5/9] nvram: add nrf51_soc flash read method

2020-07-16 Thread Peter Maydell
On Mon, 29 Jun 2020 at 12:18, Li Qiang  wrote:
>
> P J P  于2020年6月25日周四 上午3:01写道:
> >
> > From: Prasad J Pandit 
> >
> > Add nrf51_soc mmio read method to avoid NULL pointer dereference
> > issue.
> >
> > Reported-by: Lei Sun 
> > Signed-off-by: Prasad J Pandit 
> > ---
> >  hw/nvram/nrf51_nvm.c | 8 
> >  1 file changed, 8 insertions(+)
> >
> > Update v2: return ldl_le_p()
> >   -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg04972.html
> >
> > diff --git a/hw/nvram/nrf51_nvm.c b/hw/nvram/nrf51_nvm.c
> > index f2283c1a8d..8000ed530a 100644
> > --- a/hw/nvram/nrf51_nvm.c
> > +++ b/hw/nvram/nrf51_nvm.c
> > @@ -273,6 +273,13 @@ static const MemoryRegionOps io_ops = {
> >  .endianness = DEVICE_LITTLE_ENDIAN,
> >  };
> >
> > +static uint64_t flash_read(void *opaque, hwaddr offset, unsigned size)
> > +{
> > +NRF51NVMState *s = NRF51_NVM(opaque);
> > +
> > +assert(offset + size <= s->flash_size);
> > +return ldl_le_p(s->storage + offset);
> > +}
>
> The 'flash_ops' is for ROM, though I don't see where it calls
> 'memory_region_rom_device_set_romd' to put it into ROMD mode,
> so this MR is in MMIO mode and it needs a read callback.

I think that 'romd mode' (ie reads-go-directly-to-RAM) is
the default: memory_region_initfn() sets romd_mode to true.
So unless the device actively calls memory_region_rom_device_set_romd(mr, false)
then the read callback can't be reached.

thanks
-- PMM



Re: TB Cache size grows out of control with qemu 5.0

2020-07-16 Thread Alex Bennée


Christian Ehrhardt  writes:

> On Wed, Jul 15, 2020 at 5:58 PM BALATON Zoltan  wrote:
>
>> See commit 47a2def4533a2807e48954abd50b32ecb1aaf29a and the next two
>> following it.
>>
>
> Thank you Zoltan for pointing out this commit, I agree that this seems to be
> the trigger for the issues I'm seeing. Unfortunately the common CI host size
> is 1-2G. For example on Ubuntu Autopkgtests 1.5G.
> Those of them running guests do so in 0.5-1G size in TCG mode
> (as they often can't rely on having KVM available).
>
> The 1G TB buffer + 0.5G actual guest size + lack of dynamic downsizing
> on memory pressure (never existed) makes these systems go OOM-Killing
> the qemu process.

Ooops. I admit the assumption was that most people running system
emulation would be doing it on beefier machines.

> The patches indicated that the TB flushes on a full guest boot are a
> good indicator of the TB size efficiency. From my old checks I had:
>
> - Qemu 4.2 512M guest with 32M default overwritten by ram-size/4
> TB flush count  14, 14, 16
> - Qemu 5.0 512M guest with 1G default
> TB flush count  1, 1, 1
>
> I agree that ram/4 seems odd; especially on huge guests that is potentially
> a lot of waste. And while most environments have a bit of breathing
> room, 1G is too big on small host systems, and the common CI system falls
> into this category. So I tuned it down to 256M for a test.
>
> - Qemu 4.2 512M guest with tb-size 256M
> TB flush count  5, 5, 5
> - Qemu 5.0 512M guest with tb-size 256M
> TB flush count  5, 5, 5
> - Qemu 5.0 512M guest with 256M default in code
> TB flush count  5, 5, 5
>
> So performance-wise the results are as much in-between as you'd expect from a
> TB size in between. And the memory consumption, which (for me) is the actual
> current issue to fix, would be back in line again as expected.

So I'm glad you have the workaround. 

>
> So on one hand I'm suggesting something like:
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -944,7 +944,7 @@ static void page_lock_pair(PageDesc **re
>   * Users running large scale system emulation may want to tweak their
>   * runtime setup via the tb-size control on the command line.
>   */
> -#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
> +#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (256 * MiB)

The problem we have is that any number we pick here is arbitrary. And while
it did regress your use-case, changing it again just pushes a performance
regression onto someone else. Most (*) 64-bit desktop PCs have 16GB
of RAM, and almost all have more than 8GB. And there is a workaround.

* random number from Steam's HW survey.

>  #endif
>  #endif
>
> OTOH I understand someone else might want to get the more speedy 1G
> especially for large guests. If someone used to run a 4G guest in TCG the
> TB Size was 1G all along.
> How about picking the smaller of (1G || ram-size/4) as default?
>
> This might then look like:
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -956,7 +956,12 @@ static inline size_t size_code_gen_buffe
>  {
>      /* Size the buffer.  */
>      if (tb_size == 0) {
> -        tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +        unsigned long max_default = (unsigned long)(ram_size / 4);
> +        if (max_default < DEFAULT_CODE_GEN_BUFFER_SIZE) {
> +            tb_size = max_default;
> +        } else {
> +            tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +        }
>      }
>      if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
>          tb_size = MIN_CODE_GEN_BUFFER_SIZE;
>
> This is a bit more tricky than it seems, as ram_size is no longer
> present in that context, but it is enough to discuss the idea.
> Wouldn't that serve all cases - small and large - better than a pure
> static default of 1G or always ram/4?

I'm definitely against re-introducing ram_size into the mix. The
original commit (a1b18df9a4) that broke this introduced an ordering
dependency which we don't want to bring back.

I'd be more amenable to something that took into account host memory and
limited the default if it was smaller than a threshold. Is there a way
to probe that that doesn't involve slurping /proc/meminfo?
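
A rough, untested sketch of one option (the helper name below is only
illustrative): probe host RAM via sysconf() instead of /proc/meminfo.
_SC_PHYS_PAGES is not required by POSIX, but it is available on Linux and
the BSDs:

  #include <unistd.h>

  /* Cap a wanted default code_gen_buffer size at a quarter of host RAM. */
  static size_t cap_default_to_host_ram(size_t wanted)
  {
      long pages = sysconf(_SC_PHYS_PAGES);
      long pagesize = sysconf(_SC_PAGE_SIZE);

      if (pages > 0 && pagesize > 0) {
          size_t host_ram = (size_t)pages * (size_t)pagesize;
          if (wanted > host_ram / 4) {
              wanted = host_ram / 4;
          }
      }
      return wanted;
  }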

>
> P.S. I added Alex as the author of the offending patch, and Richard/Paolo
> as they are listed in the MAINTAINERS file for TCG.


-- 
Alex Bennée



Re: [PATCH v3 8/9] imx7-ccm: add digprog mmio write method

2020-07-16 Thread Peter Maydell
On Tue, 30 Jun 2020 at 13:31, P J P  wrote:
>
> From: Prasad J Pandit 
>
> Add digprog mmio write method to avoid assert failure during
> initialisation.
>
> Reviewed-by: Li Qiang 
> Signed-off-by: Prasad J Pandit 
> ---
>  hw/misc/imx7_ccm.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> Update v3: Add Reviewed-by: ...
>   -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg09452.html
>
> diff --git a/hw/misc/imx7_ccm.c b/hw/misc/imx7_ccm.c
> index 02fc1ae8d0..5ac5ecf74c 100644
> --- a/hw/misc/imx7_ccm.c
> +++ b/hw/misc/imx7_ccm.c
> @@ -131,8 +131,15 @@ static const struct MemoryRegionOps imx7_set_clr_tog_ops 
> = {
>  },
>  };
>
> +static void imx7_digprog_write(void *opaque, hwaddr addr,
> +uint64_t data, unsigned size)
> +{
> +qemu_log_mask(LOG_UNIMP, "%s not implemented\n", __func__);
> +}

This covers a single register (DIGPROG) which is read-only
(it returns a silicon revision number). So this is not a
LOG_UNIMP, but a LOG_GUEST_ERROR:

    qemu_log_mask(LOG_GUEST_ERROR,
                  "Guest write to read-only ANALOG_DIGPROG register\n");

thanks
-- PMM



Re: [PATCH for-5.1] qapi: Fix visit_type_STRUCT() not to fail for null object

2020-07-16 Thread Li Qiang
Markus Armbruster  于2020年7月16日周四 下午11:07写道:
>
> To make deallocating partially constructed objects work, the
> visit_type_STRUCT() need to succeed without doing anything when passed
> a null object.
>
> Commit cdd2b228b9 "qapi: Smooth visitor error checking in generated
> code" broke that.  To reproduce, run tests/test-qobject-input-visitor
> with AddressSanitizer:
>
> ==4353==ERROR: LeakSanitizer: detected memory leaks
>
> Direct leak of 16 byte(s) in 1 object(s) allocated from:
> #0 0x7f192d0c5d28 in __interceptor_calloc 
> (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
> #1 0x7f192cd21b10 in g_malloc0 
> (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x51b10)
> #2 0x556725f6bbee in visit_next_list qapi/qapi-visit-core.c:86
> #3 0x556725f49e15 in visit_type_UserDefOneList 
> tests/test-qapi-visit.c:474
> #4 0x556725f4489b in test_visitor_in_fail_struct_in_list 
> tests/test-qobject-input-visitor.c:1086
> #5 0x7f192cd42f29  
> (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72f29)
>
> SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).
>
> Test case /visitor/input/fail/struct-in-list feeds a list with a bad
> element to the QObject input visitor.  Visiting that element duly
> fails, and aborts the visit with the list only partially constructed:
> the faulty object is null.  Cleaning up the partially constructed list
> visits that null object, fails, and aborts the visit before the list
> node gets freed.
>
> Fix the generated visit_type_STRUCT() to succeed for null objects.
>
> Fixes: cdd2b228b973d2a29edf7696ef6e8b08ec329019
> Reported-by: Li Qiang 
> Signed-off-by: Markus Armbruster 

Oh, I sent a patch for this too.
No matter, just ignore my patch.

Tested-by: Li Qiang 
Reviewed-by: Li Qiang 


> ---
>  scripts/qapi/visit.py | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/scripts/qapi/visit.py b/scripts/qapi/visit.py
> index 3fb2f30510..cdabc5fa28 100644
> --- a/scripts/qapi/visit.py
> +++ b/scripts/qapi/visit.py
> @@ -249,6 +249,7 @@ bool visit_type_%(c_name)s(Visitor *v, const char *name, 
> %(c_name)s **obj, Error
>  if (!*obj) {
>  /* incomplete */
>  assert(visit_is_dealloc(v));
> +ok = true;
>  goto out_obj;
>  }
>  if (!visit_type_%(c_name)s_members(v, *obj, errp)) {
> --
> 2.26.2
>
>



[PATCH] e1000e: using bottom half to send packets

2020-07-16 Thread Li Qiang
Alexander Bulekov reported a UAF bug related to the e1000e packet send path.

-->https://bugs.launchpad.net/qemu/+bug/1886362

This is because the guest triggers an e1000e packet send and sets the
data's address to e1000e's MMIO address. So when the e1000e does DMA
it will write the MMIO again, trigger re-entrancy and finally
cause this UAF.

Paolo suggested using a bottom half whenever MMIO is doing complicated
things, here:
-->https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg03342.html

Reference here:
'The easiest solution is to delay processing of descriptors to a bottom
half whenever MMIO is doing something complicated.  This is also better
for latency because it will free the vCPU thread more quickly and leave
the work to the I/O thread.'

This patch fixes this UAF.

Signed-off-by: Li Qiang 
---
 hw/net/e1000e_core.c | 25 +
 hw/net/e1000e_core.h |  2 ++
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index bcd186cac5..6165b04b68 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -2423,32 +2423,27 @@ e1000e_set_dbal(E1000ECore *core, int index, uint32_t val)
 static void
 e1000e_set_tctl(E1000ECore *core, int index, uint32_t val)
 {
-    E1000E_TxRing txr;
     core->mac[index] = val;
 
     if (core->mac[TARC0] & E1000_TARC_ENABLE) {
-        e1000e_tx_ring_init(core, &txr, 0);
-        e1000e_start_xmit(core, &txr);
+        qemu_bh_schedule(core->tx[0].tx_bh);
     }
 
     if (core->mac[TARC1] & E1000_TARC_ENABLE) {
-        e1000e_tx_ring_init(core, &txr, 1);
-        e1000e_start_xmit(core, &txr);
+        qemu_bh_schedule(core->tx[1].tx_bh);
     }
 }
 
 static void
 e1000e_set_tdt(E1000ECore *core, int index, uint32_t val)
 {
-    E1000E_TxRing txr;
     int qidx = e1000e_mq_queue_idx(TDT, index);
     uint32_t tarc_reg = (qidx == 0) ? TARC0 : TARC1;
 
     core->mac[index] = val & 0xffff;
 
     if (core->mac[tarc_reg] & E1000_TARC_ENABLE) {
-        e1000e_tx_ring_init(core, &txr, qidx);
-        e1000e_start_xmit(core, &txr);
+        qemu_bh_schedule(core->tx[qidx].tx_bh);
     }
 }
 
@@ -3322,6 +3317,16 @@ e1000e_vm_state_change(void *opaque, int running, RunState state)
     }
 }
 
+static void e1000e_core_tx_bh(void *opaque)
+{
+    struct e1000e_tx *tx = opaque;
+    E1000ECore *core = tx->core;
+    E1000E_TxRing txr;
+
+    e1000e_tx_ring_init(core, &txr, tx - &core->tx[0]);
+    e1000e_start_xmit(core, &txr);
+}
+
 void
 e1000e_core_pci_realize(E1000ECore *core,
 const uint16_t *eeprom_templ,
@@ -3340,6 +3345,8 @@ e1000e_core_pci_realize(E1000ECore *core,
     for (i = 0; i < E1000E_NUM_QUEUES; i++) {
         net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner,
                         E1000E_MAX_TX_FRAGS, core->has_vnet);
+        core->tx[i].core = core;
+        core->tx[i].tx_bh = qemu_bh_new(e1000e_core_tx_bh, &core->tx[i]);
     }
 
     net_rx_pkt_init(&core->rx_pkt, core->has_vnet);
@@ -3367,6 +3374,8 @@ e1000e_core_pci_uninit(E1000ECore *core)
     for (i = 0; i < E1000E_NUM_QUEUES; i++) {
         net_tx_pkt_reset(core->tx[i].tx_pkt);
         net_tx_pkt_uninit(core->tx[i].tx_pkt);
+        qemu_bh_delete(core->tx[i].tx_bh);
+        core->tx[i].tx_bh = NULL;
     }
 
     net_rx_pkt_uninit(core->rx_pkt);
diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h
index aee32f7e48..94ddc6afc2 100644
--- a/hw/net/e1000e_core.h
+++ b/hw/net/e1000e_core.h
@@ -77,6 +77,8 @@ struct E1000Core {
 unsigned char sum_needed;
 bool cptse;
 struct NetTxPkt *tx_pkt;
+QEMUBH *tx_bh;
+E1000ECore *core;
 } tx[E1000E_NUM_QUEUES];
 
 struct NetRxPkt *rx_pkt;
-- 
2.17.1




Re: [PATCH v3 7/9] tz-ppc: add dummy read/write methods

2020-07-16 Thread Peter Maydell
On Tue, 30 Jun 2020 at 13:31, P J P  wrote:
>
> From: Prasad J Pandit 
>
> Add tz-ppc-dummy mmio read/write methods to avoid assert failure
> during initialisation.
>
> Signed-off-by: Prasad J Pandit 
> --

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v10 09/10] qcow2_format.py: collect fields to dump in JSON format

2020-07-16 Thread Vladimir Sementsov-Ogievskiy

16.07.2020 18:52, Andrey Shinkevich wrote:

On 16.07.2020 18:40, Vladimir Sementsov-Ogievskiy wrote:

16.07.2020 18:34, Andrey Shinkevich wrote:

On 16.07.2020 13:24, Vladimir Sementsov-Ogievskiy wrote:

14.07.2020 00:36, Andrey Shinkevich wrote:

As __dict__ is being extended with class members we do not want to
print, make a light copy of the initial __dict__ and extend the copy
by adding lists we have to print in the JSON output.

Signed-off-by: Andrey Shinkevich 
---
  tests/qemu-iotests/qcow2_format.py | 4 
  1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/qcow2_format.py 
b/tests/qemu-iotests/qcow2_format.py
index e0e14b5..83c3482 100644
--- a/tests/qemu-iotests/qcow2_format.py
+++ b/tests/qemu-iotests/qcow2_format.py
@@ -109,6 +109,8 @@ class Qcow2Struct(metaclass=Qcow2StructMeta):
  self.__dict__ = dict((field[2], values[i])
   for i, field in enumerate(self.fields))
  +    self.fields_dict = self.__dict__.copy()


No, I don't like that. Keeping two copies of all the data is bad idea. If you 
want to select some fields, add a method (dump_json() ?) Which will collect the 
fields you want in a dict and return that new dict. But don't store object 
attributes twice.



That is what I did in the versions before but it looks clumsy to me. Every 
single class lists almost all the items of the __dict__ again in the additional 
method.



Most of them should be listed automatically by base class method, which will 
iterate through .fields







Not really. It makes a light copy that stores only references to the desired 
fields.



I'm not about memory consumption, I'm about the design. Keeping two 
representations of same thing is always painful to maintain.




+
  def dump(self, dump_json=None):
  for f in self.fields:
  value = self.__dict__[f[2]]
@@ -144,6 +146,7 @@ class Qcow2BitmapExt(Qcow2Struct):
  self.bitmap_directory = \
  [Qcow2BitmapDirEntry(fd, cluster_size=self.cluster_size)
   for _ in range(self.nb_bitmaps)]
+ self.fields_dict.update(bitmap_directory=self.bitmap_directory)
    def dump(self, dump_json=None):
  super().dump(dump_json)
@@ -189,6 +192,7 @@ class Qcow2BitmapDirEntry(Qcow2Struct):
  table = [e[0] for e in struct.iter_unpack('>Q', fd.read(table_size))]
  self.bitmap_table = Qcow2BitmapTable(raw_table=table,
cluster_size=self.cluster_size)
+ self.fields_dict.update(bitmap_table=self.bitmap_table)
    def dump(self, dump_json=None):
  print(f'{"Bitmap name":<25} {self.name}')










--
Best regards,
Vladimir



Re: [PATCH v3] spapr: Add a new level of NUMA for GPUs

2020-07-16 Thread Reza Arbab

On Thu, Jul 16, 2020 at 06:42:11AM -0300, Daniel Henrique Barboza wrote:

Which would translate here to:

   uint32_t associativity[] = {
   cpu_to_be32(0x4),
   cpu_to_be32(nvslot->numa_id),
   cpu_to_be32(nvslot->numa_id),
   cpu_to_be32(nvslot->numa_id),
   cpu_to_be32(nvslot->numa_id),
   };


Sure, that's how it originally was in v1 of the patch.

I'll send a v4 today. It's been a while so I need to rebase anyway.

--
Reza Arbab



Re: [PATCH for-5.1] qapi: Fix visit_type_STRUCT() not to fail for null object

2020-07-16 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20200716150617.4027356-1-arm...@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  TESTiotest-qcow2: 020
  TESTcheck-unit: tests/test-char
**
ERROR:/tmp/qemu-test/src/tests/test-char.c:1204:char_serial_test: 'chr' should 
not be NULL
ERROR test-char - Bail out! 
ERROR:/tmp/qemu-test/src/tests/test-char.c:1204:char_serial_test: 'chr' should 
not be NULL
  TESTiotest-qcow2: 021
make: *** [check-unit] Error 1
make: *** Waiting for unfinished jobs
  TESTiotest-qcow2: 022
  TESTiotest-qcow2: 024
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=df5ad74d767b4fb79503019b6f0a4007', '-u', 
'1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-x2oom0v2/src/docker-src.2020-07-16-11.47.51.27809:/var/tmp/qemu:z,ro',
 'qemu/centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=df5ad74d767b4fb79503019b6f0a4007
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-x2oom0v2/src'
make: *** [docker-run-test-quick@centos7] Error 2

real16m21.927s
user0m9.780s


The full log is available at
http://patchew.org/logs/20200716150617.4027356-1-arm...@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[Bug 1880822] Re: CVE-2020-13253 QEMU: sd: OOB access could crash the guest resulting in DoS

2020-07-16 Thread Philippe Mathieu-Daudé
Fixed in commit 790762e54871143415bffcec4cb3c022c3cd.

** Changed in: qemu
   Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1880822

Title:
  CVE-2020-13253 QEMU: sd: OOB access could crash the guest resulting in
  DoS

Status in QEMU:
  Fix Committed

Bug description:
  An out-of-bounds read access issue was found in the SD Memory Card
  emulator of the QEMU. It occurs while performing block write commands
  via sdhci_write(), if a guest user has sent 'address' which is OOB of
  's->wp_groups'. A guest user/process may use this flaw to crash the
  QEMU process resulting in DoS.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1880822/+subscriptions



Re: [PATCH v10 09/10] qcow2_format.py: collect fields to dump in JSON format

2020-07-16 Thread Andrey Shinkevich

On 16.07.2020 18:40, Vladimir Sementsov-Ogievskiy wrote:

16.07.2020 18:34, Andrey Shinkevich wrote:

On 16.07.2020 13:24, Vladimir Sementsov-Ogievskiy wrote:

14.07.2020 00:36, Andrey Shinkevich wrote:

As __dict__ is being extended with class members we do not want to
print, make a light copy of the initial __dict__ and extend the copy
by adding lists we have to print in the JSON output.

Signed-off-by: Andrey Shinkevich 
---
  tests/qemu-iotests/qcow2_format.py | 4 
  1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/qcow2_format.py 
b/tests/qemu-iotests/qcow2_format.py

index e0e14b5..83c3482 100644
--- a/tests/qemu-iotests/qcow2_format.py
+++ b/tests/qemu-iotests/qcow2_format.py
@@ -109,6 +109,8 @@ class Qcow2Struct(metaclass=Qcow2StructMeta):
  self.__dict__ = dict((field[2], values[i])
   for i, field in enumerate(self.fields))
  +    self.fields_dict = self.__dict__.copy()


No, I don't like that. Keeping two copies of all the data is bad 
idea. If you want to select some fields, add a method (dump_json() 
?) Which will collect the fields you want in a dict and return that 
new dict. But don't store object attributes twice.



That is what I did in the versions before but it looks clumsy to me. 
Every single class lists almost all the items of the __dict__ again in 
the additional method.


Andrey






Not really. It makes a light copy that stores only references to the 
desired fields.




I'm not about memory consumption, I'm about the design. Keeping two 
representations of same thing is always painful to maintain.





+
  def dump(self, dump_json=None):
  for f in self.fields:
  value = self.__dict__[f[2]]
@@ -144,6 +146,7 @@ class Qcow2BitmapExt(Qcow2Struct):
  self.bitmap_directory = \
  [Qcow2BitmapDirEntry(fd, cluster_size=self.cluster_size)
   for _ in range(self.nb_bitmaps)]
+ self.fields_dict.update(bitmap_directory=self.bitmap_directory)
    def dump(self, dump_json=None):
  super().dump(dump_json)
@@ -189,6 +192,7 @@ class Qcow2BitmapDirEntry(Qcow2Struct):
  table = [e[0] for e in struct.iter_unpack('>Q', 
fd.read(table_size))]

  self.bitmap_table = Qcow2BitmapTable(raw_table=table,
cluster_size=self.cluster_size)
+ self.fields_dict.update(bitmap_table=self.bitmap_table)
    def dump(self, dump_json=None):
  print(f'{"Bitmap name":<25} {self.name}')











[PATCH] introduce VFIO-over-socket protocol specificaion

2020-07-16 Thread Thanos Makatos
This patch introduces the VFIO-over-socket protocol specification, which
is designed to allow devices to be emulated outside QEMU, in a separate
process. VFIO-over-socket reuses the existing VFIO defines, structs and
concepts.

It has been earlier discussed as an RFC in:
"RFC: use VFIO over a UNIX domain socket to implement device offloading"

Signed-off-by: John G Johnson 
Signed-off-by: Thanos Makatos 
---
 docs/devel/vfio-over-socket.rst | 1135 +++
 1 files changed, 1135 insertions(+), 0 deletions(-)
 create mode 100644 docs/devel/vfio-over-socket.rst

diff --git a/docs/devel/vfio-over-socket.rst b/docs/devel/vfio-over-socket.rst
new file mode 100644
index 000..723b944
--- /dev/null
+++ b/docs/devel/vfio-over-socket.rst
@@ -0,0 +1,1135 @@
+***************************************
+VFIO-over-socket Protocol Specification
+***************************************
+
+Version 0.1
+
+Introduction
+============
+VFIO-over-socket, also known as vfio-user, is a protocol that allows a device
+to be virtualized in a separate process outside of QEMU. VFIO-over-socket
+devices consist of a generic VFIO device type, living inside QEMU, which we
+call the client, and the core device implementation, living outside QEMU, which
+we call the server. VFIO-over-socket can be the main transport mechanism for
+multi-process QEMU, however it can be used by other applications offering
+device virtualization. Explaining the advantages of a
+disaggregated/multi-process QEMU, and device virtualization outside QEMU in
+general, is beyond the scope of this document.
+
+This document focuses on specifying the VFIO-over-socket protocol. VFIO has
+been chosen for the following reasons:
+
+1) It is a mature and stable API, backed by an extensively used framework.
+2) The existing VFIO client implementation (qemu/hw/vfio/) can be largely
+   reused.
+
+In a proof of concept implementation it has been demonstrated that using VFIO
+over a UNIX domain socket is a viable option. VFIO-over-socket is designed with
+QEMU in mind, however it could be used by other client applications. The
+VFIO-over-socket protocol does not require that QEMU's VFIO client
+implementation is used in QEMU. None of the VFIO kernel modules are required
+for supporting the protocol, neither in the client nor the server, only the
+source header files are used.
+
+The main idea is to allow a virtual device to function in a separate process in
+the same host over a UNIX domain socket. A UNIX domain socket (AF_UNIX) is
+chosen because we can trivially send file descriptors over it, which in turn
+allows:
+
+* Sharing of guest memory for DMA with the virtual device process.
+* Sharing of virtual device memory with the guest for fast MMIO.
+* Efficient sharing of eventfd's for triggering interrupts.
+
+However, other socket types could be used which allows the virtual device
+process to run in a separate guest in the same host (AF_VSOCK) or remotely
+(AF_INET). Theoretically the underlying transport doesn't necessarily have to
+be a socket, however we don't examine such alternatives. In this document we
+focus on using a UNIX domain socket and introduce basic support for the other
+two types of sockets without considering performance implications.
+
+This document does not yet describe any internal details of the server-side
+implementation, however QEMU's VFIO client implementation will have to be
+adapted according to this protocol in order to support VFIO-over-socket virtual
+devices.
+
+VFIO
+====
+VFIO is a framework that allows a physical device to be securely passed through
+to a user space process; the kernel does not drive the device at all.
+Typically, the user space process is a VM and the device is passed through to
+it in order to achieve high performance. VFIO provides an API and the required
+functionality in the kernel. QEMU has adopted VFIO to allow a guest virtual
+machine to directly access physical devices, instead of emulating them in
+software.
+
+VFIO-over-socket reuses the core VFIO concepts defined in its API, but
+implements them as messages to be sent over a UNIX-domain socket. It does not
+change the kernel-based VFIO in any way, in fact none of the VFIO kernel
+modules need to be loaded to use VFIO-over-socket. It is also possible for QEMU
+to concurrently use the current kernel-based VFIO for one guest device, and use
+VFIO-over-socket for another device in the same guest.
+
+VFIO Device Model
+-----------------
+A device under VFIO presents a standard VFIO model to the user process. Many
+of the VFIO operations in the existing kernel model use the ioctl() system
+call, and references to the existing model are called the ioctl()
+implementation in this document.
+
+The following sections describe the set of messages that implement the VFIO
+device model over a UNIX domain socket. In many cases, the messages are direct
+translations of data structures used in the ioctl() implementation. Messages
+derived 
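
As background to the sharing list above: the fd passing the specification
relies on is plain SCM_RIGHTS ancillary data over the AF_UNIX socket.  A
minimal sketch of sending one descriptor over a connected socket
(illustrative only, not part of the specification; names are placeholders):

  #include <string.h>
  #include <sys/socket.h>
  #include <sys/uio.h>

  /* Send one file descriptor over a connected AF_UNIX socket. */
  static int send_fd(int sock, int fd)
  {
      char byte = 0;
      struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
      union {
          struct cmsghdr hdr;
          char buf[CMSG_SPACE(sizeof(int))];
      } control;
      struct msghdr msg;
      struct cmsghdr *cmsg;

      memset(&control, 0, sizeof(control));
      memset(&msg, 0, sizeof(msg));
      msg.msg_iov = &iov;
      msg.msg_iovlen = 1;
      msg.msg_control = control.buf;
      msg.msg_controllen = sizeof(control.buf);

      cmsg = CMSG_FIRSTHDR(&msg);
      cmsg->cmsg_level = SOL_SOCKET;
      cmsg->cmsg_type = SCM_RIGHTS;
      cmsg->cmsg_len = CMSG_LEN(sizeof(int));
      memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

      return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
  }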

[PATCH] osdep.h: Add doc comment for qemu_get_thread_id()

2020-07-16 Thread Peter Maydell
Add a documentation comment for qemu_get_thread_id(): since this
is rather host-OS-specific it's useful if people writing the
implementation and people thinking of using the function know
what the purpose and limitations are.

Signed-off-by: Peter Maydell 
---
Based on conversation with Dan on IRC, and prompted by the recent
patch to add OpenBSD support.

Q: should we document exactly what the thread-id value is for
each host platform in the QMP documentation ? Somebody writing
a management layer app should ideally not have to grovel through
the application to figure out what they should do with the
integer value they get back from query-cpus...

 include/qemu/osdep.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 4841b5c6b5f..8279f72e5ed 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -515,6 +515,20 @@ bool qemu_has_ofd_lock(void);
 
 bool qemu_write_pidfile(const char *pidfile, Error **errp);
 
+/**
+ * qemu_get_thread_id: Return OS-specific ID of current thread
+ *
+ * This function returns an OS-specific identifier of the
+ * current thread. This will be used for the "thread-id" field in
+ * the response to the QMP query-cpus and query-iothreads commands.
+ * The intention is that a VM management layer application can then
+ * use it to tie specific QEMU vCPU and IO threads to specific host
+ * CPUs using whatever the host OS's CPU affinity setting API is.
+ * New implementations of this function for new host OSes should
+ * return the most sensible integer ID that works for that purpose.
+ *
+ * This function should not be used for anything else inside QEMU.
+ */
 int qemu_get_thread_id(void);
 
 #ifndef CONFIG_IOVEC
-- 
2.20.1
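
As an illustration of the intended use (not part of the patch, and only
valid on Linux, where the reported thread-id is the kernel TID), a
management application could pin a vCPU thread roughly like this:

  #define _GNU_SOURCE
  #include <sched.h>

  /* Pin the thread whose TID came back as "thread-id" from query-cpus
   * to the given host CPU.  Returns 0 on success, -1 on error. */
  static int pin_vcpu_thread(pid_t thread_id, int host_cpu)
  {
      cpu_set_t set;

      CPU_ZERO(&set);
      CPU_SET(host_cpu, &set);
      /* sched_setaffinity() accepts a thread ID in place of a PID. */
      return sched_setaffinity(thread_id, sizeof(set), &set);
  }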




Re: [PATCH v10 09/10] qcow2_format.py: collect fields to dump in JSON format

2020-07-16 Thread Vladimir Sementsov-Ogievskiy

16.07.2020 18:34, Andrey Shinkevich wrote:

On 16.07.2020 13:24, Vladimir Sementsov-Ogievskiy wrote:

14.07.2020 00:36, Andrey Shinkevich wrote:

As __dict__ is being extended with class members we do not want to
print, make a light copy of the initial __dict__ and extend the copy
by adding lists we have to print in the JSON output.

Signed-off-by: Andrey Shinkevich 
---
  tests/qemu-iotests/qcow2_format.py | 4 
  1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/qcow2_format.py 
b/tests/qemu-iotests/qcow2_format.py
index e0e14b5..83c3482 100644
--- a/tests/qemu-iotests/qcow2_format.py
+++ b/tests/qemu-iotests/qcow2_format.py
@@ -109,6 +109,8 @@ class Qcow2Struct(metaclass=Qcow2StructMeta):
  self.__dict__ = dict((field[2], values[i])
   for i, field in enumerate(self.fields))
  +    self.fields_dict = self.__dict__.copy()


No, I don't like that. Keeping two copies of all the data is bad idea. If you 
want to select some fields, add a method (dump_json() ?) Which will collect the 
fields you want in a dict and return that new dict. But don't store object 
attributes twice.



Not really. It makes a light copy that stores only references to the desired 
fields.



I'm not about memory consumption, I'm about the design. Keeping two 
representations of same thing is always painful to maintain.




+
  def dump(self, dump_json=None):
  for f in self.fields:
  value = self.__dict__[f[2]]
@@ -144,6 +146,7 @@ class Qcow2BitmapExt(Qcow2Struct):
  self.bitmap_directory = \
  [Qcow2BitmapDirEntry(fd, cluster_size=self.cluster_size)
   for _ in range(self.nb_bitmaps)]
+ self.fields_dict.update(bitmap_directory=self.bitmap_directory)
    def dump(self, dump_json=None):
  super().dump(dump_json)
@@ -189,6 +192,7 @@ class Qcow2BitmapDirEntry(Qcow2Struct):
  table = [e[0] for e in struct.iter_unpack('>Q', fd.read(table_size))]
  self.bitmap_table = Qcow2BitmapTable(raw_table=table,
cluster_size=self.cluster_size)
+    self.fields_dict.update(bitmap_table=self.bitmap_table)
    def dump(self, dump_json=None):
  print(f'{"Bitmap name":<25} {self.name}')







--
Best regards,
Vladimir



Re: [PATCH v10 09/10] qcow2_format.py: collect fields to dump in JSON format

2020-07-16 Thread Andrey Shinkevich

On 16.07.2020 13:24, Vladimir Sementsov-Ogievskiy wrote:

14.07.2020 00:36, Andrey Shinkevich wrote:

As __dict__ is being extended with class members we do not want to
print, make a light copy of the initial __dict__ and extend the copy
by adding lists we have to print in the JSON output.

Signed-off-by: Andrey Shinkevich 
---
  tests/qemu-iotests/qcow2_format.py | 4 
  1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/qcow2_format.py 
b/tests/qemu-iotests/qcow2_format.py

index e0e14b5..83c3482 100644
--- a/tests/qemu-iotests/qcow2_format.py
+++ b/tests/qemu-iotests/qcow2_format.py
@@ -109,6 +109,8 @@ class Qcow2Struct(metaclass=Qcow2StructMeta):
  self.__dict__ = dict((field[2], values[i])
   for i, field in enumerate(self.fields))
  +    self.fields_dict = self.__dict__.copy()


No, I don't like that. Keeping two copies of all the data is bad idea. 
If you want to select some fields, add a method (dump_json() ?) Which 
will collect the fields you want in a dict and return that new dict. 
But don't store object attributes twice.




Not really. It makes a light copy that stores only references to the 
desired fields.


Andrey



+
  def dump(self, dump_json=None):
  for f in self.fields:
  value = self.__dict__[f[2]]
@@ -144,6 +146,7 @@ class Qcow2BitmapExt(Qcow2Struct):
  self.bitmap_directory = \
  [Qcow2BitmapDirEntry(fd, cluster_size=self.cluster_size)
   for _ in range(self.nb_bitmaps)]
+ self.fields_dict.update(bitmap_directory=self.bitmap_directory)
    def dump(self, dump_json=None):
  super().dump(dump_json)
@@ -189,6 +192,7 @@ class Qcow2BitmapDirEntry(Qcow2Struct):
  table = [e[0] for e in struct.iter_unpack('>Q', 
fd.read(table_size))]

  self.bitmap_table = Qcow2BitmapTable(raw_table=table,
cluster_size=self.cluster_size)
+    self.fields_dict.update(bitmap_table=self.bitmap_table)
    def dump(self, dump_json=None):
  print(f'{"Bitmap name":<25} {self.name}')








Re: [PATCH 2/2] i386/cpu: Mask off unsupported XSAVE components

2020-07-16 Thread Xiaoyao Li

On 7/16/2020 11:14 PM, Eduardo Habkost wrote:

On Thu, Jul 16, 2020 at 04:20:19PM +0800, Xiaoyao Li wrote:

When setting up XSAVE components, it needs to mask off those unsupported
by KVM.

Signed-off-by: Xiaoyao Li 


We must never disable CPUID features silently based on host
capabilities, otherwise we can't guarantee guest ABI stability
when migrating to another host.  Filtering of features should
involve a call to mark_unavailable_features() (or some equivalent
mechanism) so we can report the missing features properly through
QMP.

Could you explain what's the bug you are trying to fix?  The loop
at x86_cpu_filter_features() is already supposed to disable
features unsupported by the host.


Sorry, I totally forgot about x86_cpu_filter_features() when inspecting the code.


---
  target/i386/cpu.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f5f11603e805..efc92334b7b1 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6274,8 +6274,10 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
  }
  }
  
-env->features[FEAT_XSAVE_COMP_LO] = mask;

-env->features[FEAT_XSAVE_COMP_HI] = mask >> 32;
+env->features[FEAT_XSAVE_COMP_LO] = mask &
+x86_cpu_get_supported_feature_word(FEAT_XSAVE_COMP_LO, 
cpu->migratable);
+env->features[FEAT_XSAVE_COMP_HI] = (mask >> 32) &
+x86_cpu_get_supported_feature_word(FEAT_XSAVE_COMP_HI, 
cpu->migratable);
  }
  
  /* Steps involved on loading and filtering CPUID data

--
2.18.4








Re: [PATCH v7 02/47] block: Add chain helper functions

2020-07-16 Thread Vladimir Sementsov-Ogievskiy

16.07.2020 17:50, Max Reitz wrote:

On 13.07.20 12:18, Vladimir Sementsov-Ogievskiy wrote:

25.06.2020 18:21, Max Reitz wrote:

Add some helper functions for skipping filters in a chain of block
nodes.

Signed-off-by: Max Reitz 
---
   include/block/block_int.h |  3 +++
   block.c   | 55 +++
   2 files changed, 58 insertions(+)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index bb3457c5e8..5da793bfc3 100644



This patch raises two questions:

1. How to treat filters at the end of the backing chain?


It was my understanding that this configuration would be impossible.


OK for me, but I'd prefer to have a comment and assertions.




- child-access functions will return no filter child for such nodes, which is
correct of course
- filter skipping functions will return this filter.. How correct that is -
I don't know.


Consider a chain

top --- backing ---> filter-with-no-child


How would it be possible to have filter without a child?


if bdrv_backing_chain_next(top) returns NULL, it's incorrect, because
top actually has a backing, and on read it will read from it for
unallocated clusters (and this should crash). So, probably, returning
filter as a backing-chain-next is a valid thing to do. Or we should
assert that we are not in such situation (which may crash more often
than trying to really read from nonexistent child).

so, returning NULL, may even less correct than returning a filter..


2. How to treat nodes with drv=NULL, but with filter child (with
BDRV_CHILD_FILTERED role).


drv=NULL is a bug on its own that should be fixed...  (The idea we had
at some point was to introduce a special driver that just always returns
-EIO on everything, and to replace corrupt qcow2 nodes by that.  Or,
well, we might just return -EIO from the qcow2 driver, actually, if we
never use drv=NULL anywhere else.)

In any case, drv=NULL is an edge case that I think never has anything to
do with filters.


So, again some good comment and assertion won't hurt.




- child-access functions returns no filtered child for such nodes
- filter skipping functions will stop on it..

===

Isn't it better to drop drv->is_filter at all? And call filter nodes
with a bs->file or bs->backing
child in BDRV_CHILD_FILTERED role? This automatically closes the two
questions:

- node without a child in BDRV_CHILD_FILTERED is automatically
non-filter. So, filter driver is responsible for having such child.
- node without a drv may still be a filter if it has
BDRV_CHILD_FILTERED.. Still, not very useful.

Anyway, is_filter and BDRV_CHILD_FILTERED are in contradiction, and it
seems good to get rid of is_filter. But I may miss something.

[..]


+
+static BlockDriverState *bdrv_do_skip_filters(BlockDriverState *bs,
+  bool
stop_on_explicit_filter)
+{
+    BdrvChild *c;
+
+    if (!bs) {
+    return NULL;
+    }
+
+    while (!(stop_on_explicit_filter && !bs->implicit)) {
+    c = bdrv_filter_child(bs);
+    if (!c) {
+    break;
+    }
+    bs = c->bs;
+    }
+    /*
+ * Note that this treats nodes with bs->drv == NULL as not being
+ * filters (bs->drv == NULL should be replaced by something else
+ * anyway).
+ * The advantage of this behavior is that this function will thus
+ * always return a non-NULL value (given a non-NULL @bs).


I don't see how it follows from the first sentence. We can skip nodes
with a child of BDRV_CHILD_FILTERED and drv=NULL as well, and still return
non-NULL bs at the end...


My idea was that nodes with bs->drv == NULL might not even have
children.  If we treated them like filters and skipped through them, we
would have to return NULL if there is no child.


Didn't you mean "treat nodes without filter child as not being filters,
even if they have drv->is_filter == true"? This is a real reason for the
second sentence.


Hm.  I implicitly always assume that filters always have a filter child,
so I tend to not even question that part.



Hmm. Still, the relationship in the comment seems unclear to me, the code 
itself is simpler :)

Okay, I'm actually OK with this all. I'd prefer to have assertions and comments 
on corner-cases I mentioned, but patch is OK as is:

Reviewed-by: Vladimir Sementsov-Ogievskiy 



--
Best regards,
Vladimir



Re: [PATCH 1/2] i386/cpu: Clear FEAT_XSAVE_COMP_{LO, HI} when XSAVE is not available

2020-07-16 Thread Xiaoyao Li

On 7/16/2020 11:15 PM, Eduardo Habkost wrote:

On Thu, Jul 16, 2020 at 04:20:18PM +0800, Xiaoyao Li wrote:

Per Intel SDM vol 1, 13.2, if CPUID.1:ECX.XSAVE[bit 26] is 0, the
processor provides no further enumeration through CPUID function 0DH.


Can you explain what's the bug you are trying to fix?
env->features[FEAT_XSAVE_COMP_*] is already initialized as zero.


When "-cpu host", it's not zero I think.





Signed-off-by: Xiaoyao Li 
---
  target/i386/cpu.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1e5123251d74..f5f11603e805 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6261,6 +6261,8 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
  uint64_t mask;
  
  if (!(env->features[FEAT_1_ECX] & CPUID_EXT_XSAVE)) {

+env->features[FEAT_XSAVE_COMP_LO] = 0;
+env->features[FEAT_XSAVE_COMP_HI] = 0;
  return;
  }
  
--

2.18.4








Re: [PATCH v7 21/47] block: Use CAFs in bdrv_refresh_filename()

2020-07-16 Thread Max Reitz
On 15.07.20 14:52, Andrey Shinkevich wrote:
> On 25.06.2020 18:21, Max Reitz wrote:
>> bdrv_refresh_filename() and the kind of related bdrv_dirname() should
>> look to the primary child when they wish to copy the underlying file's
>> filename.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   block.c | 29 +
>>   1 file changed, 21 insertions(+), 8 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 8131d0b5eb..7c827fefa0 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -6797,6 +6797,7 @@ void bdrv_refresh_filename(BlockDriverState *bs)
>>   {
>>   BlockDriver *drv = bs->drv;
>>   BdrvChild *child;
>> +    BlockDriverState *primary_child_bs;
>>   QDict *opts;
>>   bool backing_overridden;
>>   bool generate_json_filename; /* Whether our default
>> implementation should
>> @@ -6866,20 +6867,30 @@ void bdrv_refresh_filename(BlockDriverState *bs)
>>   qobject_unref(bs->full_open_options);
>>   bs->full_open_options = opts;
>>   +    primary_child_bs = bdrv_primary_bs(bs);
>> +
>>   if (drv->bdrv_refresh_filename) {
>>   /* Obsolete information is of no use here, so drop the old
>> file name
>>    * information before refreshing it */
>>   bs->exact_filename[0] = '\0';
>>     drv->bdrv_refresh_filename(bs);
>> -    } else if (bs->file) {
>> -    /* Try to reconstruct valid information from the underlying
>> file */
>> +    } else if (primary_child_bs) {
>> +    /*
>> + * Try to reconstruct valid information from the underlying
>> + * file -- this only works for format nodes (filter nodes
>> + * cannot be probed and as such must be selected by the user
>> + * either through an options dict, or through a special
>> + * filename which the filter driver must construct in its
>> + * .bdrv_refresh_filename() implementation).
>> + */
> 
> 
> The caller may not be aware of a filter node and intend to refresh the
> name of underlying format node.
> 
> In that case, the filter driver should redirect the call to the format
> node.

It shouldn’t.  We can only return a plain filename if passing that
filename to qemu (e.g. to -drive) will result in the same block graph
configuration.

This is what the comment means by “filter nodes cannot be probed”: If
there is a filter node, we must generate a json:{} filename, because
otherwise reopening the block device with -drive by passing the filename
generated here would result in a configuration where the filter is missing.

> What are situations the name of the filter itself should be refreshed in?

Hypothetically, if a filename could specify a filter.  For example, say
the filename “filter[copy-on-read]:foo.qcow2” would result in qemu
creating a COR filter on top of a qcow2 node, then we could generate
such a filename.

In practice, filters cannot be configured through plain filenames (apart
from blkverify, but that’s a debugging feature, so it doesn’t really
matter), so there is no such situation.  All filter nodes should have an
empty exact_filename and thus get a json:{} pseudo-filename.

> If there are any, should we do both actions or choose either?
> 
> Andrey
> 
> 
>>     bs->exact_filename[0] = '\0';
>>     /*
>>    * We can use the underlying file's filename if:
>>    * - it has a filename,
>> + * - the current BDS is not a filter,
> 
> 
> Should we check the function input parameter for being a filter's BS
> here, in this function, and handle the case here or let the filter
> driver function do that or else the caller should check it?

bdrv_refresh_filename() is called whenever some node in the block graph
has changed, just to refresh its filename (after that change).  The
caller generally doesn’t really care about the result, so it doesn’t
matter whether the node is a filter or not (i.e., whether it gets a
plain filename or not).

I don’t think the caller should check it, and in this implementation we
simply have to handle filter nodes correctly: That is, not give them a
plain filename.

Max



signature.asc
Description: OpenPGP digital signature


[Bug 1887820] [NEW] TCG test targets missing from 'make check-help'

2020-07-16 Thread Philippe Mathieu-Daudé
Public bug reported:

We can run the TCG tests using:

$ make run-tcg-tests-$TARGET-softmmu

This is not listed in 'make check-help'.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1887820

Title:
  TCG test targets missing from 'make check-help'

Status in QEMU:
  New

Bug description:
  We can run the TCG tests using:

  $ make run-tcg-tests-$TARGET-softmmu

  This is not listed in 'make check-help'.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1887820/+subscriptions



Re: [PATCH 1/2] i386/cpu: Clear FEAT_XSAVE_COMP_{LO,HI} when XSAVE is not available

2020-07-16 Thread Eduardo Habkost
On Thu, Jul 16, 2020 at 04:20:18PM +0800, Xiaoyao Li wrote:
> Per Intel SDM vol 1, 13.2, if CPUID.1:ECX.XSAVE[bit 26] is 0, the
> processor provides no further enumeration through CPUID function 0DH.

Can you explain what's the bug you are trying to fix?
env->features[FEAT_XSAVE_COMP_*] is already initialized as zero.

> 
> Signed-off-by: Xiaoyao Li 
> ---
>  target/i386/cpu.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 1e5123251d74..f5f11603e805 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6261,6 +6261,8 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
>  uint64_t mask;
>  
>  if (!(env->features[FEAT_1_ECX] & CPUID_EXT_XSAVE)) {
> +env->features[FEAT_XSAVE_COMP_LO] = 0;
> +env->features[FEAT_XSAVE_COMP_HI] = 0;
>  return;
>  }
>  
> -- 
> 2.18.4
> 

-- 
Eduardo




Re: [PATCH v7 20/47] block: Iterate over children in refresh_limits

2020-07-16 Thread Max Reitz
On 14.07.20 20:37, Andrey Shinkevich wrote:
> On 25.06.2020 18:21, Max Reitz wrote:
>> Instead of looking at just bs->file and bs->backing, we should look at
>> all children that could end up receiving forwarded requests.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   block/io.c | 32 
>>   1 file changed, 16 insertions(+), 16 deletions(-)
>>
>> diff --git a/block/io.c b/block/io.c
>> index c2af7711d6..37057f13e0 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -135,6 +135,8 @@ static void bdrv_merge_limits(BlockLimits *dst,
>> const BlockLimits *src)
>>   void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
>>   {
>>   BlockDriver *drv = bs->drv;
>> +    BdrvChild *c;
>> +    bool have_limits;
>>   Error *local_err = NULL;
>>     memset(>bl, 0, sizeof(bs->bl));
>> @@ -149,14 +151,21 @@ void bdrv_refresh_limits(BlockDriverState *bs,
>> Error **errp)
>>   drv->bdrv_co_preadv_part) ? 1 : 512;
>>     /* Take some limits from the children as a default */
>> -    if (bs->file) {
>> -    bdrv_refresh_limits(bs->file->bs, _err);
>> -    if (local_err) {
>> -    error_propagate(errp, local_err);
>> -    return;
>> +    have_limits = false;
>> +    QLIST_FOREACH(c, >children, next) {
>> +    if (c->role & (BDRV_CHILD_DATA | BDRV_CHILD_FILTERED |
>> BDRV_CHILD_COW))
>> +    {
>> +    bdrv_refresh_limits(c->bs, _err);
>> +    if (local_err) {
>> +    error_propagate(errp, local_err);
>> +    return;
>> +    }
>> +    bdrv_merge_limits(>bl, >bs->bl);
>> +    have_limits = true;
>>   }
>> -    bdrv_merge_limits(>bl, >file->bs->bl);
>> -    } else {
>> +    }
>> +
>> +    if (!have_limits) {
> 
> 
> This conditioned piece of code worked with (bs->file == NULL) only.
> 
> Now, it works only if there are neither bs->file, nor bs->backing, nor
> else filtered children.
> 
> Is it OK and doesn't break the logic for all cases?

Hm.  Good question.

I think the answer is it’s OK.

For DATA and FILTERED, it makes absolute sense to just use their
alignments.  For COW, maybe not so much?  But if there’s a COW child,
there has to be a DATA child as well (in practice).  So we’ll always
consider its alignment, too.

(And hypothetically speaking, if there was a COW child but no DATA
child, then the only alignment we need to observe is in fact the one of
the COW child.)

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 2/2] i386/cpu: Mask off unsupported XSAVE components

2020-07-16 Thread Eduardo Habkost
On Thu, Jul 16, 2020 at 04:20:19PM +0800, Xiaoyao Li wrote:
> When setting up XSAVE components, it needs to mask off those unsupported
> by KVM.
> 
> Signed-off-by: Xiaoyao Li 

We must never disable CPUID features silently based on host
capabilities, otherwise we can't guarantee guest ABI stability
when migrating to another host.  Filtering of features should
involve a call to mark_unavailable_features() (or some equivalent
mechanism) so we can report the missing features properly through
QMP.

Could you explain what's the bug you are trying to fix?  The loop
at x86_cpu_filter_features() is already supposed to disable
features unsupported by the host.

> ---
>  target/i386/cpu.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index f5f11603e805..efc92334b7b1 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6274,8 +6274,10 @@ static void x86_cpu_enable_xsave_components(X86CPU 
> *cpu)
>  }
>  }
>  
> -env->features[FEAT_XSAVE_COMP_LO] = mask;
> -env->features[FEAT_XSAVE_COMP_HI] = mask >> 32;
> +env->features[FEAT_XSAVE_COMP_LO] = mask &
> +x86_cpu_get_supported_feature_word(FEAT_XSAVE_COMP_LO, 
> cpu->migratable);
> +env->features[FEAT_XSAVE_COMP_HI] = (mask >> 32) &
> +x86_cpu_get_supported_feature_word(FEAT_XSAVE_COMP_HI, 
> cpu->migratable);
>  }
>  
>  /* Steps involved on loading and filtering CPUID data
> -- 
> 2.18.4
> 

-- 
Eduardo




Re: [PATCH v7 19/47] vmdk: Drop vmdk_co_flush()

2020-07-16 Thread Max Reitz
On 14.07.20 16:52, Andrey Shinkevich wrote:
> On 25.06.2020 18:21, Max Reitz wrote:
>> Before HEAD^, we needed this because bdrv_co_flush() by itself would
>> only flush bs->file.  With HEAD^, bdrv_co_flush() will flush all
>> children on which a WRITE or WRITE_UNCHANGED permission has been taken.
>> Thus, vmdk no longer needs to do it itself.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   block/vmdk.c | 16 
>>   1 file changed, 16 deletions(-)
>>
>> diff --git a/block/vmdk.c b/block/vmdk.c
>> index 62da465126..a23890e6ec 100644
>> --- a/block/vmdk.c
>> +++ b/block/vmdk.c
>> @@ -2802,21 +2802,6 @@ static void vmdk_close(BlockDriverState *bs)
>>   error_free(s->migration_blocker);
>>   }
>>   -static coroutine_fn int vmdk_co_flush(BlockDriverState *bs)
>> -{
>> -    BDRVVmdkState *s = bs->opaque;
>> -    int i, err;
>> -    int ret = 0;
>> -
>> -    for (i = 0; i < s->num_extents; i++) {
>> -    err = bdrv_co_flush(s->extents[i].file->bs);
>> -    if (err < 0) {
>> -    ret = err;
>> -    }
>> -    }
>> -    return ret;
>> -}
>> -
>>   static int64_t vmdk_get_allocated_file_size(BlockDriverState *bs)
>>   {
>>   int i;
>> @@ -3075,7 +3060,6 @@ static BlockDriver bdrv_vmdk = {
>>   .bdrv_close   = vmdk_close,
>>   .bdrv_co_create_opts  = vmdk_co_create_opts,
>>   .bdrv_co_create   = vmdk_co_create,
>> -    .bdrv_co_flush_to_disk    = vmdk_co_flush,
> 
> 
> After HEAD^ applied, wouldn't we get an endless recursion in
> bdrv_co_flush() if the HEAD (this patch) had not been merged into HEAD^?

Hm, how so?  HEAD^ just flushes all children, just like vmdk_co_flush()
does.  So it seems to me just like double the work.  (Which is
unfortunate but shouldn’t be a problem.)

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] target/i386: floatx80: avoid compound literals in static initializers

2020-07-16 Thread Philippe Mathieu-Daudé
On 7/16/20 4:42 PM, Laszlo Ersek wrote:
> Quoting ISO C99 6.7.8p4, "All the expressions in an initializer for an
> object that has static storage duration shall be constant expressions or
> string literals".
> 
> The compound literal produced by the make_floatx80() macro is not such a
> constant expression, per 6.6p7-9. (An implementation may accept it,
> according to 6.6p10, but is not required to.)
> 
> Therefore using "floatx80_zero" and make_floatx80() for initializing
> "f2xm1_table" and "fpatan_table" is not portable. And gcc-4.8 in RHEL-7.6
> actually chokes on them:
> 
>> target/i386/fpu_helper.c:871:5: error: initializer element is not constant
>>  { make_floatx80(0xbfff, 0x8000000000000000ULL),
>>  ^

This reminds me of:

commit 6fa9ba09dbf4eb8b52bcb47d6820957f1b77ee0b
Author: Kamil Rytarowski 
Date:   Mon Sep 4 23:23:06 2017 +0200

target/m68k: Switch fpu_rom from make_floatx80() to make_floatx80_init()

GCC 4.7.2 on SunOS reports that the values assigned to array members
are not
real constants:

target/m68k/fpu_helper.c:32:5: error: initializer element is not
constant
target/m68k/fpu_helper.c:32:5: error: (near initialization for
'fpu_rom[0]')
rules.mak:66: recipe for target 'target/m68k/fpu_helper.o' failed

Convert the array to make_floatx80_init() to fix it.
Replace floatx80_pi-like constants with make_floatx80_init() as they are
defined as make_floatx80().

Reviewed-by: Philippe Mathieu-Daudé 

> 
> We've had the make_floatx80_init() macro for this purpose since commit
> 3bf7e40ab914 ("softfloat: fix for C99", 2012-03-17), so let's use that
> macro again.
> 
> Fixes: eca30647fc07
> Fixes: ff57bb7b6326
> Link: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg06566.html
> Link: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg04714.html
> Cc: Alex Bennée 
> Cc: Aurelien Jarno 
> Cc: Eduardo Habkost 
> Cc: Joseph Myers 
> Cc: Paolo Bonzini 
> Cc: Peter Maydell 
> Cc: Richard Henderson 
> Signed-off-by: Laszlo Ersek 
> ---
> 
> Notes:
> I can see that there are test cases under "tests/tcg/i386", but I don't
> know how to run them.

Yeah it is not easy to figure...

Try 'make run-tcg-tests-i386-softmmu'
but you need docker :^)

(There is also 'make check-softfloat', listed in 'make check-help')

> 
>  include/fpu/softfloat.h  |   1 +
>  target/i386/fpu_helper.c | 426 +++
>  2 files changed, 214 insertions(+), 213 deletions(-)
> 
> diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
> index f1a19df066b7..659218b5c787 100644
> --- a/include/fpu/softfloat.h
> +++ b/include/fpu/softfloat.h
> @@ -822,6 +822,7 @@ static inline bool floatx80_invalid_encoding(floatx80 a)
>  }
>  
>  #define floatx80_zero make_floatx80(0x0000, 0x0000000000000000LL)
> +#define floatx80_zero_init make_floatx80_init(0x0000, 0x0000000000000000LL)
>  #define floatx80_one make_floatx80(0x3fff, 0x8000000000000000LL)
>  #define floatx80_ln2 make_floatx80(0x3ffe, 0xb17217f7d1cf79acLL)
>  #define floatx80_pi make_floatx80(0x4000, 0xc90fdaa22168c235LL)
> diff --git a/target/i386/fpu_helper.c b/target/i386/fpu_helper.c
> index f5e6c4b88d4e..4ea73874d836 100644
> --- a/target/i386/fpu_helper.c
> +++ b/target/i386/fpu_helper.c
> @@ -868,201 +868,201 @@ struct f2xm1_data {
>  };
...




Re: [PATCH v6 0/3] modify CPU model info

2020-07-16 Thread Eduardo Habkost
On Tue, Jul 14, 2020 at 04:41:45PM +0800, Chenyi Qiang wrote:
> Add the missing VMX features in Skylake-Server, Cascadelake-Server and
> Icelake-Server CPU models. In Icelake-Server CPU model, it lacks sha_ni,
> avx512ifma, rdpid and fsrm. The model number of Icelake-Server also needs
> to be fixed.
> 
> To apply this patchset, a bug related to env->user_features need to be
> fixed first. The patch is available at
> https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg04399.html
> 

Queued for 5.1, thanks!

-- 
Eduardo




[PATCH for-5.1] qapi: Fix visit_type_STRUCT() not to fail for null object

2020-07-16 Thread Markus Armbruster
To make deallocating partially constructed objects work, the
visit_type_STRUCT() need to succeed without doing anything when passed
a null object.

Commit cdd2b228b9 "qapi: Smooth visitor error checking in generated
code" broke that.  To reproduce, run tests/test-qobject-input-visitor
with AddressSanitizer:

==4353==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x7f192d0c5d28 in __interceptor_calloc 
(/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
#1 0x7f192cd21b10 in g_malloc0 
(/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x51b10)
#2 0x556725f6bbee in visit_next_list qapi/qapi-visit-core.c:86
#3 0x556725f49e15 in visit_type_UserDefOneList 
tests/test-qapi-visit.c:474
#4 0x556725f4489b in test_visitor_in_fail_struct_in_list 
tests/test-qobject-input-visitor.c:1086
#5 0x7f192cd42f29  (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x72f29)

SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).

Test case /visitor/input/fail/struct-in-list feeds a list with a bad
element to the QObject input visitor.  Visiting that element duly
fails, and aborts the visit with the list only partially constructed:
the faulty object is null.  Cleaning up the partially constructed list
visits that null object, fails, and aborts the visit before the list
node gets freed.

Fix the generated visit_type_STRUCT() to succeed for null objects.

Fixes: cdd2b228b973d2a29edf7696ef6e8b08ec329019
Reported-by: Li Qiang 
Signed-off-by: Markus Armbruster 
---
 scripts/qapi/visit.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/qapi/visit.py b/scripts/qapi/visit.py
index 3fb2f30510..cdabc5fa28 100644
--- a/scripts/qapi/visit.py
+++ b/scripts/qapi/visit.py
@@ -249,6 +249,7 @@ bool visit_type_%(c_name)s(Visitor *v, const char *name, 
%(c_name)s **obj, Error
 if (!*obj) {
 /* incomplete */
 assert(visit_is_dealloc(v));
+ok = true;
 goto out_obj;
 }
 if (!visit_type_%(c_name)s_members(v, *obj, errp)) {
-- 
2.26.2




Re: [PATCH] spapr_pci: Robustify support of PCI bridges

2020-07-16 Thread Greg Kurz
On Thu, 16 Jul 2020 16:23:52 +0200
Markus Armbruster  wrote:

> David Gibson  writes:
> 
> > On Thu, Jul 16, 2020 at 12:32:44PM +0200, Greg Kurz wrote:
> >> On Thu, 16 Jul 2020 14:45:40 +1000
> >> David Gibson  wrote:
> >> 
> >> > On Thu, Jul 09, 2020 at 07:12:47PM +0200, Greg Kurz wrote:
> >> > > Some recent error handling cleanups unveiled issues with our support of
> >> > > PCI bridges:
> >> > > 
> >> > > 1) QEMU aborts when using non-standard PCI bridge types,
> >> > >unveiled by commit 7ef1553dac "spapr_pci: Drop some dead error 
> >> > > handling"
> >> > > 
> >> > > $ qemu-system-ppc64 -M pseries -device pcie-pci-bridge
> >> > > Unexpected error in object_property_find() at qom/object.c:1240:
> >> > > qemu-system-ppc64: -device pcie-pci-bridge: Property '.chassis_nr' not 
> >> > > found
> >> > > Aborted (core dumped)
> >> > 
> >> > Oops, I thought we had a check that we actually had a "pci-bridge"
> >> > device before continuing with the hotplug, but I guess not.
> >> 
> >> Ah... are you suggesting we should explicitly check the actual type
> >> of the bridge rather than looking for the "chassis_nr" property ?
> >
> > Uh.. I thought about it, but I don't think it matters much which way
> > we do it.
> 
> Would it make sense to add the "chassis_nr" property to *all* PCI
> bridge devices?
> 

I see that the "PCI Express to PCI/PCI-X Bridge Specification" mentions
a "Chassis Number Register" which looks very similar to the what exists
in standard PCI-to-PCI brdiges. This doesn't seem to be implemented in
our "pcie-pci-bridge" device model though, but of course I have no idea
why :)

Maybe Michael or Marcel (cc'd) can share some thoughts about that ?

> [...]
> 




  1   2   3   >