date:20210426

Re: [PATCH 0/9] hw/block: m25p80: Fix the mess of dummy bytes needed for fast read commands

2021-04-26 Thread Alistair Francis

On Fri, Apr 23, 2021 at 4:46 PM Bin Meng  wrote:
>
> On Mon, Feb 8, 2021 at 10:41 PM Bin Meng  wrote:
> >
> > On Thu, Jan 21, 2021 at 10:18 PM Francisco Iglesias
> >  wrote:
> > >
> > > Hi Bin,
> > >
> > > On [2021 Jan 21] Thu 16:59:51, Bin Meng wrote:
> > > > Hi Francisco,
> > > >
> > > > On Thu, Jan 21, 2021 at 4:50 PM Francisco Iglesias
> > > >  wrote:
> > > > >
> > > > > Dear Bin,
> > > > >
> > > > > On [2021 Jan 20] Wed 22:20:25, Bin Meng wrote:
> > > > > > Hi Francisco,
> > > > > >
> > > > > > On Tue, Jan 19, 2021 at 9:01 PM Francisco Iglesias
> > > > > >  wrote:
> > > > > > >
> > > > > > > Hi Bin,
> > > > > > >
> > > > > > > On [2021 Jan 18] Mon 20:32:19, Bin Meng wrote:
> > > > > > > > Hi Francisco,
> > > > > > > >
> > > > > > > > On Mon, Jan 18, 2021 at 6:06 PM Francisco Iglesias
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > Hi Bin,
> > > > > > > > >
> > > > > > > > > On [2021 Jan 15] Fri 22:38:18, Bin Meng wrote:
> > > > > > > > > > Hi Francisco,
> > > > > > > > > >
> > > > > > > > > > On Fri, Jan 15, 2021 at 8:26 PM Francisco Iglesias
> > > > > > > > > >  wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi Bin,
> > > > > > > > > > >
> > > > > > > > > > > On [2021 Jan 15] Fri 10:07:52, Bin Meng wrote:
> > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Jan 15, 2021 at 2:13 AM Francisco Iglesias
> > > > > > > > > > > >  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi Bin,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On [2021 Jan 14] Thu 23:08:53, Bin Meng wrote:
> > > > > > > > > > > > > > From: Bin Meng 
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The m25p80 model uses s->needed_bytes to indicate 
> > > > > > > > > > > > > > how many follow-up
> > > > > > > > > > > > > > bytes are expected to be received after it receives 
> > > > > > > > > > > > > > a command. For
> > > > > > > > > > > > > > example, depending on the address mode, either 
> > > > > > > > > > > > > > 3-byte address or
> > > > > > > > > > > > > > 4-byte address is needed.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > For fast read family commands, some dummy cycles 
> > > > > > > > > > > > > > are required after
> > > > > > > > > > > > > > sending the address bytes, and the dummy cycles 
> > > > > > > > > > > > > > need to be counted
> > > > > > > > > > > > > > in s->needed_bytes. This is where the mess began.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > As the variable name (needed_bytes) indicates, the 
> > > > > > > > > > > > > > unit is in byte.
> > > > > > > > > > > > > > It is not in bit, or cycle. However for some reason 
> > > > > > > > > > > > > > the model has
> > > > > > > > > > > > > > been using the number of dummy cycles for 
> > > > > > > > > > > > > > s->needed_bytes. The right
> > > > > > > > > > > > > > approach is to convert the number of dummy cycles 
> > > > > > > > > > > > > > to bytes based on
> > > > > > > > > > > > > > the SPI protocol, for example, 6 dummy cycles for 
> > > > > > > > > > > > > > the Fast Read Quad
> > > > > > > > > > > > > > I/O (EBh) should be converted to 3 bytes per the 
> > > > > > > > > > > > > > formula (6 * 4 / 8).
> > > > > > > > > > > > >
> > > > > > > > > > > > > > While not being the original implementor I must 
> > > > > > > > > > > > > > assume that the above solution was
> > > > > > > > > > > > > > considered but not chosen by the developers due to 
> > > > > > > > > > > > > > its inaccuracy (it
> > > > > > > > > > > > > > wouldn't be possible to model exactly 6 dummy cycles, 
> > > > > > > > > > > > > > only a multiple of 8,
> > > > > > > > > > > > > > meaning that if the controller is wrongly programmed 
> > > > > > > > > > > > > > to generate 7 the error
> > > > > > > > > > > > > > wouldn't be caught and the controller will still be 
> > > > > > > > > > > > > > considered "correct"). Now
> > > > > > > > > > > > > > that we have this detail in the implementation I'm in 
> > > > > > > > > > > > > > favor of keeping it, also
> > > > > > > > > > > > > > because the detail is already in use for 
> > > > > > > > > > > > > > catching exactly the above error.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > I found no clue in the commit message that my 
> > > > > > > > > > > > > proposed solution here
> > > > > > > > > > > > > was ever considered, otherwise all SPI controller 
> > > > > > > > > > > > > models supporting
> > > > > > > > > > > > > software generation should have been found to be 
> > > > > > > > > > > > > seriously broken a long
> > > > > > > > > > > > > time ago!
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > The controllers you are referring to might lack support 
> > > > > > > > > > > for commands requiring
> > > > > > > > > > > dummy clock cycles but I really hope they work with the 
> > > > > > > > > > > other commands? If so I
> > > > > > > > > >
> > > > > > > > > > I am not sure why you view dummy clock cycles as something 
> > > > > > > > > >
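
As a rough illustration of the conversion argued about in this thread (a
sketch only, not code from the series or from m25p80.c), the helper below
turns dummy cycles into bytes using the quoted formula, cycles * lines / 8.
The function name and its refusal to accept counts that are not a whole
number of bytes are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Convert dummy clock cycles into whole bytes on the SPI bus.
 * io_lines is the number of data lines in use (1, 2 or 4).
 * Returns true and stores the byte count only when the cycles add up to
 * a whole number of bytes. Hypothetical helper, not part of QEMU.
 */
bool dummy_cycles_to_bytes(unsigned cycles, unsigned io_lines,
                           unsigned *bytes)
{
    unsigned bits = cycles * io_lines;

    if (io_lines == 0 || bits % 8 != 0) {
        return false;   /* not expressible as whole bytes */
    }
    *bytes = bits / 8;
    return true;
}
```

With 6 cycles on 4 lines this yields 24 bits, i.e. the 3 bytes from the
commit message. With 7 cycles on 4 lines the result is 28 bits, which is
not a whole number of bytes; that is the kind of misprogrammed controller
the reply says a byte-based count cannot detect.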

Re: [PATCH v6 1/9] hw: Add check for queue number

2021-04-26 Thread Jason Wang




On 2021/4/27 11:39 AM, Cindy Lu wrote:

To support the configure interrupt, queue number -1 is used to
represent the configure interrupt.
Since none of these devices support the configure interrupt,
add a check here: if the idx is -1, the function
returns early.



The title is confusing since the change is specific to the guest notifiers.

A better one would be "virtio: guest notifier support for config interrupt"




Signed-off-by: Cindy Lu 
---
  hw/display/vhost-user-gpu.c|  8 ++--
  hw/net/virtio-net.c| 10 +++---
  hw/virtio/vhost-user-fs.c  | 11 +++
  hw/virtio/vhost-vsock-common.c |  8 ++--
  hw/virtio/virtio-crypto.c  |  8 ++--
  5 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index 51f1747c4a..d8e26cedf1 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -490,7 +490,9 @@ static bool
  vhost_user_gpu_guest_notifier_pending(VirtIODevice *vdev, int idx)
  {
  VhostUserGPU *g = VHOST_USER_GPU(vdev);
-
+if (idx == -1) {



Let's introduce a macro for this instead of the magic number.

Thanks



+return false;
+}
  return vhost_virtqueue_pending(&g->vhost->dev, idx);
  }
  
@@ -498,7 +500,9 @@ static void

  vhost_user_gpu_guest_notifier_mask(VirtIODevice *vdev, int idx, bool mask)
  {
  VhostUserGPU *g = VHOST_USER_GPU(vdev);
-
+if (idx == -1) {
+return;
+}
  vhost_virtqueue_mask(&g->vhost->dev, vdev, idx, mask);
  }
  
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c

index 9179013ac4..78ccaa228c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3060,7 +3060,10 @@ static bool 
virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
  VirtIONet *n = VIRTIO_NET(vdev);
  NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
  assert(n->vhost_started);
-return vhost_net_virtqueue_pending(get_vhost_net(nc->peer), idx);
+if (idx != -1) {
+return vhost_net_virtqueue_pending(get_vhost_net(nc->peer), idx);
+}
+return false;
  }
  
  static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,

@@ -3069,8 +3072,9 @@ static void virtio_net_guest_notifier_mask(VirtIODevice 
*vdev, int idx,
  VirtIONet *n = VIRTIO_NET(vdev);
  NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
  assert(n->vhost_started);
-vhost_net_virtqueue_mask(get_vhost_net(nc->peer),
- vdev, idx, mask);
+if (idx != -1) {
+vhost_net_virtqueue_mask(get_vhost_net(nc->peer), vdev, idx, mask);
+ }
  }
  
  static void virtio_net_set_config_size(VirtIONet *n, uint64_t host_features)

diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index 1bc5d03a00..37424c2193 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -142,18 +142,21 @@ static void vuf_handle_output(VirtIODevice *vdev, 
VirtQueue *vq)
   */
  }
  
-static void vuf_guest_notifier_mask(VirtIODevice *vdev, int idx,

-bool mask)
+static void vuf_guest_notifier_mask(VirtIODevice *vdev, int idx, bool mask)
  {
  VHostUserFS *fs = VHOST_USER_FS(vdev);
-
+if (idx == -1) {
+return;
+}
  vhost_virtqueue_mask(&fs->vhost_dev, vdev, idx, mask);
  }
  
  static bool vuf_guest_notifier_pending(VirtIODevice *vdev, int idx)

  {
  VHostUserFS *fs = VHOST_USER_FS(vdev);
-
+if (idx == -1) {
+return false;
+}
  return vhost_virtqueue_pending(&fs->vhost_dev, idx);
  }
  
diff --git a/hw/virtio/vhost-vsock-common.c b/hw/virtio/vhost-vsock-common.c

index 5b2ebf3496..0adf823d37 100644
--- a/hw/virtio/vhost-vsock-common.c
+++ b/hw/virtio/vhost-vsock-common.c
@@ -100,7 +100,9 @@ static void 
vhost_vsock_common_guest_notifier_mask(VirtIODevice *vdev, int idx,
  bool mask)
  {
  VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
-
+if (idx == -1) {
+return;
+}
  vhost_virtqueue_mask(&vvc->vhost_dev, vdev, idx, mask);
  }
  
@@ -108,7 +110,9 @@ static bool vhost_vsock_common_guest_notifier_pending(VirtIODevice *vdev,

 int idx)
  {
  VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
-
+if (idx == -1) {
+return false;
+}
  return vhost_virtqueue_pending(&vvc->vhost_dev, idx);
  }
  
diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c

index 54f9bbb789..c47f4ffb24 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -947,7 +947,9 @@ static void virtio_crypto_guest_notifier_mask(VirtIODevice 
*vdev, int idx,
  int queue = virtio_crypto_vq2q(idx);
  
  assert(vcrypto->vhost_started);

-
+if (idx == -1) {
+return;
+}
  cryptodev_vhost_virtqueue_mask(vdev, queue, idx, mask);
  }
  
@@ -957,7 +959,9 @@ static bool virtio_crypto_guest_notifier_pending(VirtIODevice *vdev,
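
The review comment above asks for a macro instead of the bare -1. A minimal
sketch of what that could look like follows; the macro name CONFIG_IRQ_IDX
and the mock_virtqueue_pending() stand-in are hypothetical and chosen only
to keep the example self-contained, they are not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical name for the sentinel index; the patch uses a bare -1. */
#define CONFIG_IRQ_IDX (-1)

/* Stand-in for vhost_virtqueue_pending(); always reports "pending". */
bool mock_virtqueue_pending(int idx)
{
    (void)idx;
    return true;
}

/* Same shape as the *_guest_notifier_pending() hunks in the patch. */
bool guest_notifier_pending(int idx)
{
    if (idx == CONFIG_IRQ_IDX) {
        /* Config interrupt is not handled by this device. */
        return false;
    }
    return mock_virtqueue_pending(idx);
}
```

Naming the sentinel in one place keeps every device's check readable and
makes it possible to change the value later without touching each caller.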

[Bug 1926246] [NEW] chrome based apps can not be run under qemu user mode

2021-04-26 Thread Wind Li

Public bug reported:

Chrome uses /proc/self/exe to fork its renderer process.
Here is a simple program that reproduces the issue. It should print parent then child, but 
under qemu it fails with: qemu: unknown option 'type=renderer'.

Maybe we can modify the exec syscall handling to replace /proc/self/exe with the real
path.

//gcc -o self self.c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main(int argc, char** argv) {
  if (argc == 1) {
    printf("parent\n");
    if (fork() == 0) {
      /* re-exec ourselves through the /proc/self/exe symlink */
      return execl("/proc/self/exe", "/proc/self/exe", "--type=renderer", NULL);
    }
  } else {
    printf("child\n");
  }
  return 0;
}

similar reports:
https://github.com/AppImage/AppImageKit/issues/965  
https://github.com/golang/go/issues/42080  

Workaround:
compile Chrome or your Chrome-based app with a patch to 
content/common/child_process_host_impl.cc:GetChildPath that resolves the real path of 
/proc/self/exe:  

diff --git a/content/common/child_process_host_impl.cc 
b/content/common/child_process_host_impl.cc
index bc78aba80ac8..9fab74d3bae8 100644
--- a/content/common/child_process_host_impl.cc
+++ b/content/common/child_process_host_impl.cc
@@ -60,8 +60,12 @@ base::FilePath ChildProcessHost::GetChildPath(int flags) {
 #if defined(OS_LINUX)
   // Use /proc/self/exe rather than our known binary path so updates
   // can't swap out the binary from underneath us.
-  if (child_path.empty() && flags & CHILD_ALLOW_SELF)
-child_path = base::FilePath(base::kProcSelfExe);
+  if (child_path.empty() && flags & CHILD_ALLOW_SELF) {
+if (!ReadSymbolicLink(base::FilePath(base::kProcSelfExe), &child_path)) {
+  NOTREACHED() << "Unable to resolve " << base::kProcSelfExe << ".";
+  child_path = base::FilePath(base::kProcSelfExe);
+}
+  }
 #endif

   // On most platforms, the child executable is the same as the current

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1926246

Title:
  chrome based apps can not be run under qemu user mode

Status in QEMU:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1926246/+subscriptions

[Bug 1895305] Re: pthread_cancel fails with "RT33" with musl libc

2021-04-26 Thread Thomas Huth

Ok, thanks, since this was a regression in Alpine, I'm marking the bug as
closed here.

** Changed in: qemu
   Status: New => Invalid

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1895305

Title:
  pthread_cancel fails with "RT33" with musl libc

Status in QEMU:
  Invalid

Bug description:
  From my testing it seems that QEMU built against musl libc crashes on
  pthread_cancel calls if the binary is also built with musl
  libc.

  Minimal sample:

  #include <pthread.h>
  #include <stdio.h>
  #include <unistd.h>
  void* threadfunc(void* ignored) {
    while (1) {
      pause();
    }
    return NULL;
  }
  int main() {
    pthread_t thread;
    pthread_create(&thread, NULL, &threadfunc, NULL);
    sleep(1);
    pthread_cancel(thread);
    printf("OK, alive\n");
  }

  In an Alpine Linux aarch64 chroot (on an x86_64 host) the binary will
  just output RT33 and has exit code 161.

  Using qemu-aarch64 on an x86_64 host results in the output (fish shell)
fish: “qemu-aarch64-static ./musl-stat…” terminated by signal Unknown 
(Unknown)
  or (bash)
Real-time signal 2

  and exit code 164.

  It doesn't matter whether the binary is linked dynamically or static.
  You can see my test results in the following table:

  |                      | QEMU glibc | QEMU musl |
  |----------------------|------------|-----------|
  | binary glibc dynamic | ✓          | ✓         |
  | binary glibc static  | ✓          | ✓         |
  | binary musl dynamic  | ✓          | ✗         |
  | binary musl static   | ✓          | ✗         |

  Both QEMU builds are v5.1.0 (glibc v2.32 / musl v1.2.1)

  I've uploaded all my compile and test commands (plus a script to
  conveniently run them all) to
  https://github.com/z3ntu/qemu-pthread_cancel . It also includes the
  built binaries if needed. The test script output can be found at
  https://github.com/z3ntu/qemu-pthread_cancel/blob/master/results.txt

  Further links:
  - https://gitlab.com/postmarketOS/pmaports/-/issues/190#note_141902075
  - https://gitlab.com/postmarketOS/pmbootstrap/-/issues/1970
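
The exit codes in the description can be read with the usual shell
convention that a process killed by signal N is reported as status
128 + N. The helper below is only a sketch of that arithmetic; its name is
made up for the example.

```c
#include <assert.h>

/*
 * Shells conventionally report "killed by signal N" as exit status
 * 128 + N. Returns the signal number, or 0 if the status does not
 * encode a signal. Hypothetical helper for illustration.
 */
int signal_from_exit_status(int status)
{
    return (status > 128 && status < 256) ? status - 128 : 0;
}
```

Exit code 161 decodes to signal 33, which lines up with the "RT33" output.
Exit code 164 decodes to signal 36; assuming the host's glibc reports
SIGRTMIN as 34, that is SIGRTMIN+2, consistent with bash printing
"Real-time signal 2".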

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1895305/+subscriptions

[PATCH] hw/input/hid: Add support for keys of jp106 keyboard.

2021-04-26 Thread Katsuhiro Ueno

Add support for the following keys: KATAKANAHIRAGANA, HENKAN, MUHENKAN,
RO, and YEN.  Before this commit, these keys did not work as expected
when a jp106 keyboard was connected to the guest as a usb-kbd device.

Signed-off-by: Katsuhiro Ueno 
---
 hw/input/hid.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/input/hid.c b/hw/input/hid.c
index e1d2e46083..8aab0521f4 100644
--- a/hw/input/hid.c
+++ b/hw/input/hid.c
@@ -51,8 +51,8 @@ static const uint8_t hid_usage_keys[0x100] = {
 0x45, 0x68, 0x69, 0x6a, 0x6b, 0x6c, 0x6d, 0x6e,
 0xe8, 0xe9, 0x71, 0x72, 0x73, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x85, 0x00, 0x00, 0x00, 0x00,
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-0x00, 0x00, 0x00, 0x00, 0x00, 0xe3, 0xe7, 0x65,
+0x88, 0x00, 0x00, 0x87, 0x00, 0x00, 0x00, 0x00,
+0x00, 0x8a, 0x00, 0x8b, 0x00, 0x89, 0xe7, 0x65,

 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-- 
2.24.3 (Apple Git-128)
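
For readers matching the five new table values to the keys named in the
commit message, the sketch below lists them as a small lookup. The pairing
of key names with usage IDs is my reading of the Keyboard/Keypad page of
the USB HID Usage Tables (International1 to International5), not something
stated in the patch; the struct and function names are made up for the
example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct jp106_key {
    const char *name;
    uint8_t hid_usage;
};

/* The five usage IDs added by the patch, by assumed key name. */
static const struct jp106_key jp106_keys[] = {
    { "RO",               0x87 }, /* Keyboard International1 */
    { "KATAKANAHIRAGANA", 0x88 }, /* Keyboard International2 */
    { "YEN",              0x89 }, /* Keyboard International3 */
    { "HENKAN",           0x8a }, /* Keyboard International4 */
    { "MUHENKAN",         0x8b }, /* Keyboard International5 */
};

/* Return the HID usage ID for a key name, or 0x00 if it is unknown. */
uint8_t jp106_hid_usage(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(jp106_keys) / sizeof(jp106_keys[0]); i++) {
        if (strcmp(jp106_keys[i].name, name) == 0) {
            return jp106_keys[i].hid_usage;
        }
    }
    return 0x00;
}
```

A value of 0x00 in hid_usage_keys means no usage is reported for that
slot, which is why these keys did nothing before the patch filled in the
entries.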

Re: [PATCH v6 0/9] vhost-vdpa: add support for configure interrupt

2021-04-26 Thread no-reply

Patchew URL: https://patchew.org/QEMU/20210427033951.29805-1-l...@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210427033951.29805-1-l...@redhat.com
Subject: [PATCH v6 0/9] vhost-vdpa: add support for configure interrupt

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]  patchew/20210424162229.3312116-1-f4...@amsat.org -> 
patchew/20210424162229.3312116-1-f4...@amsat.org
 * [new tag] patchew/20210427033951.29805-1-l...@redhat.com -> 
patchew/20210427033951.29805-1-l...@redhat.com
Switched to a new branch 'test'
75b5f19 virtio-net: add peer_deleted check in virtio_net_handle_rx
e6c2a2d virtio: decouple virtqueue from set notifier fd handler
d7e3243 virtio-pci: add support for configure interrupt
55946c5 virtio-mmio: add support for configure interrupt
0e8ea81 vhost:add support for configure interrupt
665e11d vhost-vdpa: add support for config interrupt call back
e83f793 vhost: add new call back function for config interrupt
3ce0821 virtio-pci:decouple virtqueue from interrupt setting process
d4debe8 hw: Add check for queue number

=== OUTPUT BEGIN ===
1/9 Checking commit d4debe818e21 (hw: Add check for queue number)
2/9 Checking commit 3ce082180cca (virtio-pci:decouple virtqueue from interrupt 
setting process)
3/9 Checking commit e83f7938542d (vhost: add new call back function for config 
interrupt)
4/9 Checking commit 665e11d3c98f (vhost-vdpa: add support for config interrupt 
call back)
5/9 Checking commit 0e8ea81cf5a0 (vhost:add support for configure interrupt)
6/9 Checking commit 55946c55ee11 (virtio-mmio: add support for configure 
interrupt)
7/9 Checking commit d7e3243f82ec (virtio-pci: add support for configure 
interrupt)
ERROR: Missing Signed-off-by: line(s)

total: 1 errors, 0 warnings, 266 lines checked

Patch 7/9 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

8/9 Checking commit e6c2a2d36bce (virtio: decouple virtqueue from set notifier 
fd handler)
9/9 Checking commit 75b5f193ac32 (virtio-net: add peer_deleted check in 
virtio_net_handle_rx)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20210427033951.29805-1-l...@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[PATCH v6 7/9] virtio-pci: add support for configure interrupt

2021-04-26 Thread Cindy Lu

Add support for the configure interrupt: use kvm_irqfd_assign and set the
gsi in the kernel. When the configure notifier is signalled (eventfd_signal)
by the host kernel, this finally injects an MSI-X interrupt into the guest.
---
 hw/virtio/virtio-pci.c | 186 ++---
 1 file changed, 120 insertions(+), 66 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 2b7e6cc0d9..07d28dd367 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -664,12 +664,10 @@ static uint32_t virtio_read_config(PCIDevice *pci_dev,
 }
 
 static int kvm_virtio_pci_vq_vector_use(VirtIOPCIProxy *proxy,
-unsigned int queue_no,
 unsigned int vector)
 {
 VirtIOIRQFD *irqfd = &proxy->vector_irqfd[vector];
 int ret;
-
 if (irqfd->users == 0) {
 ret = kvm_irqchip_add_msi_route(kvm_state, vector, &proxy->pci_dev);
 if (ret < 0) {
@@ -708,93 +706,120 @@ static void kvm_virtio_pci_irqfd_release(VirtIOPCIProxy 
*proxy,
 ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n, irqfd->virq);
 assert(ret == 0);
 }
-
-static int kvm_virtio_pci_vector_use(VirtIOPCIProxy *proxy, int nvqs)
+ static int virtio_pci_get_notifier(VirtIOPCIProxy *proxy, int queue_no,
+  EventNotifier **n, unsigned int *vector)
 {
 PCIDevice *dev = &proxy->pci_dev;
 VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
-VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
-unsigned int vector;
-int ret, queue_no;
 VirtQueue *vq;
-EventNotifier *n;
-for (queue_no = 0; queue_no < nvqs; queue_no++) {
+
+if (queue_no == -1) {
+*n = virtio_get_config_notifier(vdev);
+*vector = vdev->config_vector;
+} else {
 if (!virtio_queue_get_num(vdev, queue_no)) {
-break;
-}
-vector = virtio_queue_vector(vdev, queue_no);
-if (vector >= msix_nr_vectors_allocated(dev)) {
-continue;
-}
-ret = kvm_virtio_pci_vq_vector_use(proxy, queue_no, vector);
-if (ret < 0) {
-goto undo;
-}
-/* If guest supports masking, set up irqfd now.
- * Otherwise, delay until unmasked in the frontend.
- */
-if (vdev->use_guest_notifier_mask && k->guest_notifier_mask) {
-vq = virtio_get_queue(vdev, queue_no);
-n = virtio_queue_get_guest_notifier(vq);
-ret = kvm_virtio_pci_irqfd_use(proxy, n, vector);
-if (ret < 0) {
-kvm_virtio_pci_vq_vector_release(proxy, vector);
-goto undo;
-}
+return -1;
 }
+*vector = virtio_queue_vector(vdev, queue_no);
+vq = virtio_get_queue(vdev, queue_no);
+*n = virtio_queue_get_guest_notifier(vq);
+}
+if (*vector >= msix_nr_vectors_allocated(dev)) {
+return -1;
 }
 return 0;
+}
 
+static int kvm_virtio_pci_vector_use_one(VirtIOPCIProxy *proxy, int queue_no)
+{
+unsigned int vector;
+int ret;
+EventNotifier *n;
+ret = virtio_pci_get_notifier(proxy, queue_no, &n, &vector);
+if (ret < 0) {
+return ret;
+}
+ret = kvm_virtio_pci_vq_vector_use(proxy, vector);
+if (ret < 0) {
+goto undo;
+}
+ret = kvm_virtio_pci_irqfd_use(proxy,  n, vector);
+if (ret < 0) {
+goto undo;
+}
+return 0;
 undo:
-while (--queue_no >= 0) {
-vector = virtio_queue_vector(vdev, queue_no);
-if (vector >= msix_nr_vectors_allocated(dev)) {
-continue;
-}
-if (vdev->use_guest_notifier_mask && k->guest_notifier_mask) {
-vq = virtio_get_queue(vdev, queue_no);
-n = virtio_queue_get_guest_notifier(vq);
-kvm_virtio_pci_irqfd_release(proxy, n, vector);
-}
-kvm_virtio_pci_vq_vector_release(proxy, vector);
+kvm_virtio_pci_irqfd_release(proxy, n, vector);
+return ret;
+}
+static int kvm_virtio_pci_vector_use(VirtIOPCIProxy *proxy, int nvqs)
+{
+int queue_no;
+int ret = 0;
+for (queue_no = 0; queue_no < nvqs; queue_no++) {
+ret = kvm_virtio_pci_vector_use_one(proxy, queue_no);
 }
 return ret;
 }
 
-static void kvm_virtio_pci_vector_release(VirtIOPCIProxy *proxy, int nvqs)
+static int kvm_virtio_pci_vector_config_use(VirtIOPCIProxy *proxy)
+{
+return kvm_virtio_pci_vector_use_one(proxy, -1);
+ }
+
+static void kvm_virtio_pci_vector_release_one(VirtIOPCIProxy *proxy,
+int queue_no)
 {
-PCIDevice *dev = &proxy->pci_dev;
 VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
 unsigned int vector;
-int queue_no;
-VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
-VirtQueue *vq;
 EventNotifier *n;
+int ret;
+VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
+ret = virtio_pci_get_notifier(proxy, queue_no, &n, &vector);
+if (ret < 0) {
+return;
+}
+
+if

Re: [RFC PATCH 2/4] hw/ppc: Add kvm-only file spapr_hcall_tcg_stub.c

2021-04-26 Thread David Gibson

On Fri, Apr 23, 2021 at 07:06:15PM -0300, Fabiano Rosas wrote:
> "Lucas Mateus Castro (alqotel)"  writes:
> 
> > This file should be used instead of spapr_hcall.c when compiling
> > without tcg (--disable-tcg) as it does not call tcg-only functions and
> > trips fatal error when invalid functions are called
> 
> Not calling any TCG-specific function is not an indication of the code
> being "kvm only" in this case. So I think this patch is backwards, we
> should instead aim to remove tcg-only code from spapr_hcall.c.

Right.

> > As of right now some functions are repeated here and in spapr_hcall.c,
> > as they are static, is some other method to deal with this
> > recommended?
> 
> Yeah, you should not be repeating the functions. From previous
> discussions on this topic I understood that we'd have another
> hypercall_register_types for TCG. So we could have a spapr_hcall_tcg.c
> that contains tcg-only functions. And they would only be used in that
> file so they would continue being static.

> > Also some functions should only cause a fatal error as KVM should
> > intercept and handle their call, but as I'm not sure which ones I just
> > did this to functions that called tcg-only code.
> >
> > Signed-off-by: Lucas Mateus Castro (alqotel) 
> > ---
> >  hw/ppc/spapr_hcall_tcg_stub.c | 1824 +
> >  1 file changed, 1824 insertions(+)
> >  create mode 100644 hw/ppc/spapr_hcall_tcg_stub.c
> >
> > diff --git a/hw/ppc/spapr_hcall_tcg_stub.c
> > b/hw/ppc/spapr_hcall_tcg_stub.c
> 
> Your usage of stub here is a bit confusing. Take a look at
> target/ppc/kvm-stub.c and accel/stubs/kvm-stub.c. These are files that
> are only included in the build to satisfy any references to the symbols
> they contain. The implementation is just an empty body or an error
> return. So if the feature is included, the actual foo.c will be present
> with the proper implementation; if not, we get the empty stub.
> 
> Also, look at target/ppc/kvm.h under #ifndef CONFIG_USER_ONLY. There's
> some similar ideas there that could be of help.
> 
> So my suggestion for this patch is take a step back and move first all
> of the TCG-only functions that are certainly not needed. We can then
> figure out what patterns we are going to use to stub them in the
> KVM-only build. After that we take a look at what's left and go from
> there.

Right.  You should be able to stub these much more simply than this.
Just a single say "h_tcg_only()" stub, then in the !TCG case you
register that instead of the real implementation with
spapr_register_hypercall().
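
The single-stub idea can be sketched with a mock dispatch table. Everything
below is a simplified stand-in: the types, the table, the opcode numbers
and the return values are assumptions for illustration and do not match
the real spapr_register_hypercall() signature or what an actual stub
should do when it is reached.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_H_SUCCESS   0L
#define MOCK_H_FUNCTION  (-2L)   /* illustrative "not supported" result */
#define MOCK_MAX_HCALL   8

typedef long (*mock_hcall_fn)(unsigned long opcode, unsigned long *args);

static mock_hcall_fn mock_hcall_table[MOCK_MAX_HCALL];

/* Simplified stand-in for a hypercall registration function. */
void mock_register_hypercall(unsigned long opcode, mock_hcall_fn fn)
{
    assert(opcode < MOCK_MAX_HCALL);
    mock_hcall_table[opcode] = fn;
}

/* One shared stub for every hypercall that only makes sense with TCG. */
long h_tcg_only(unsigned long opcode, unsigned long *args)
{
    (void)opcode;
    (void)args;
    return MOCK_H_FUNCTION;
}

/* Placeholder for a real TCG-side implementation. */
long h_real_impl(unsigned long opcode, unsigned long *args)
{
    (void)opcode;
    (void)args;
    return MOCK_H_SUCCESS;
}

/* Register either the real handlers or the shared stub. */
void mock_register_types(int tcg_enabled)
{
    mock_register_hypercall(1, tcg_enabled ? h_real_impl : h_tcg_only);
    mock_register_hypercall(2, tcg_enabled ? h_real_impl : h_tcg_only);
}

/* Dispatch a hypercall through the table. */
long mock_hypercall(unsigned long opcode, unsigned long *args)
{
    if (opcode >= MOCK_MAX_HCALL || mock_hcall_table[opcode] == NULL) {
        return MOCK_H_FUNCTION;
    }
    return mock_hcall_table[opcode](opcode, args);
}
```

The point of the pattern is that the KVM-only build needs one small
function rather than a 1824-line copy of the file: the registration code
decides at one place which handler each opcode gets.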

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[PATCH v6 6/9] virtio-mmio: add support for configure interrupt

2021-04-26 Thread Cindy Lu

Add configure interrupt support for the virtio-mmio bus. This
interrupt works when the backend is vhost-vdpa.

Signed-off-by: Cindy Lu 
---
 hw/virtio/virtio-mmio.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index e1b5c3b81e..d8cb368728 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -632,7 +632,26 @@ static int virtio_mmio_set_guest_notifier(DeviceState *d, 
int n, bool assign,
 
 return 0;
 }
+static int virtio_mmio_set_config_notifier(DeviceState *d,  bool assign)
+{
+VirtIOMMIOProxy *proxy = VIRTIO_MMIO(d);
+VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
 
+EventNotifier *notifier = virtio_get_config_notifier(vdev);
+int r = 0;
+if (assign) {
+r = event_notifier_init(notifier, 0);
+virtio_set_config_notifier_fd_handler(vdev, true, false);
+} else {
+virtio_set_config_notifier_fd_handler(vdev, false, false);
+event_notifier_cleanup(notifier);
+}
+if (vdc->guest_notifier_mask && vdev->use_guest_notifier_mask) {
+vdc->guest_notifier_mask(vdev, -1, !assign);
+}
+return r;
+}
 static int virtio_mmio_set_guest_notifiers(DeviceState *d, int nvqs,
bool assign)
 {
@@ -654,8 +673,15 @@ static int virtio_mmio_set_guest_notifiers(DeviceState *d, 
int nvqs,
 goto assign_error;
 }
 }
+   r = virtio_mmio_set_config_notifier(d, assign);
+   if (r < 0) {
+goto config_assign_error;
+   }
 
 return 0;
+config_assign_error:
+assert(assign);
+r = virtio_mmio_set_config_notifier(d, false);
 
 assign_error:
 /* We get here on assignment failure. Recover by undoing for VQs 0 .. n. */
-- 
2.21.3

[PATCH v6 5/9] vhost:add support for configure interrupt

2021-04-26 Thread Cindy Lu

Add configure notifier support in vhost and the related drivers.
When the backend supports VIRTIO_NET_F_STATUS, set up the configure
interrupt function in vhost_dev_start and release the related
resources in vhost_dev_stop.

Signed-off-by: Cindy Lu 
---
 hw/net/vhost_net.c |  9 +
 hw/net/virtio-net.c|  6 
 hw/virtio/vhost.c  | 70 --
 hw/virtio/virtio.c | 22 
 include/hw/virtio/vhost.h  |  3 ++
 include/hw/virtio/virtio.h |  4 +++
 include/net/vhost_net.h|  3 ++
 7 files changed, 115 insertions(+), 2 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 24d555e764..12e30dc25e 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -426,6 +426,15 @@ void vhost_net_virtqueue_mask(VHostNetState *net, 
VirtIODevice *dev,
 vhost_virtqueue_mask(&net->dev, dev, idx, mask);
 }
 
+bool vhost_net_config_pending(VHostNetState *net, int idx)
+{
+return vhost_config_pending(&net->dev, idx);
+}
+void vhost_net_config_mask(VHostNetState *net, VirtIODevice *dev,
+  bool mask)
+{
+vhost_config_mask(&net->dev, dev,  mask);
+}
 VHostNetState *get_vhost_net(NetClientState *nc)
 {
 VHostNetState *vhost_net = 0;
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 78ccaa228c..43b912453a 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3063,6 +3063,9 @@ static bool 
virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
 if (idx != -1) {
 return vhost_net_virtqueue_pending(get_vhost_net(nc->peer), idx);
 }
+if (idx == -1) {
+return vhost_net_config_pending(get_vhost_net(nc->peer), idx);
+   }
 return false;
 }
 
@@ -3075,6 +3078,9 @@ static void virtio_net_guest_notifier_mask(VirtIODevice 
*vdev, int idx,
 if (idx != -1) {
 vhost_net_virtqueue_mask(get_vhost_net(nc->peer), vdev, idx, mask);
  }
+if (idx == -1) {
+vhost_net_config_mask(get_vhost_net(nc->peer), vdev, mask);
+ }
 }
 
 static void virtio_net_set_config_size(VirtIONet *n, uint64_t host_features)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 614ccc2bcb..162a5dd90c 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -21,6 +21,7 @@
 #include "qemu/error-report.h"
 #include "qemu/memfd.h"
 #include "standard-headers/linux/vhost_types.h"
+#include "standard-headers/linux/virtio_net.h"
 #include "exec/address-spaces.h"
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
@@ -1313,6 +1314,10 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
 goto fail;
 }
 }
+r = event_notifier_init(&hdev->masked_config_notifier, 0);
+if (r < 0) {
+return r;
+}
 
 if (busyloop_timeout) {
 for (i = 0; i < hdev->nvqs; ++i) {
@@ -1405,6 +1410,7 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
 for (i = 0; i < hdev->nvqs; ++i) {
 vhost_virtqueue_cleanup(hdev->vqs + i);
 }
+event_notifier_cleanup(&hdev->masked_config_notifier);
 if (hdev->mem) {
 /* those are only safe after successful init */
 memory_listener_unregister(&hdev->memory_listener);
@@ -1498,6 +1504,16 @@ bool vhost_virtqueue_pending(struct vhost_dev *hdev, int 
n)
 return event_notifier_test_and_clear(&vq->masked_notifier);
 }
 
+bool vhost_config_pending(struct vhost_dev *hdev, int n)
+{
+assert(hdev->vhost_ops);
+
+if ((hdev->started == false) ||
+(hdev->vhost_ops->vhost_set_config_call == NULL)) {
+return false;
+}
+return event_notifier_test_and_clear(&hdev->masked_config_notifier);
+}
 /* Mask/unmask events from this vq. */
 void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n,
  bool mask)
@@ -1522,6 +1538,30 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, 
VirtIODevice *vdev, int n,
 VHOST_OPS_DEBUG("vhost_set_vring_call failed");
 }
 }
+void vhost_config_mask(struct vhost_dev *hdev, VirtIODevice *vdev,
+ bool mask)
+{
+   int fd;
+   int r;
+   EventNotifier *masked_config_notifier = &hdev->masked_config_notifier;
+   EventNotifier *config_notifier = >config_notifier;
+   assert(hdev->vhost_ops);
+
+   if ((hdev->started == false) ||
+(hdev->vhost_ops->vhost_set_config_call == NULL)) {
+return ;
+}
+if (mask) {
+assert(vdev->use_guest_notifier_mask);
+fd = event_notifier_get_fd(masked_config_notifier);
+} else {
+fd = event_notifier_get_fd(config_notifier);
+}
+   r = hdev->vhost_ops->vhost_set_config_call(hdev, &fd);
+   if (r < 0) {
+error_report("vhost_set_config_call failed");
+}
+}
 
 uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
 uint64_t features)
@@ -1701,6 +1741,7 @@ int vhost_dev_get_inflight(struct vhost_dev *dev, 
uint16_t queue_size,
 int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
 {
 int i, r;
+int fd = 0;

[PATCH v6 4/9] vhost-vdpa: add support for config interrupt call back

2021-04-26 Thread Cindy Lu

Add a new callback function to vhost-vdpa. This callback is only
supported by the vhost-vdpa backend.

Signed-off-by: Cindy Lu 
---
 hw/virtio/trace-events | 2 ++
 hw/virtio/vhost-vdpa.c | 7 +++
 2 files changed, 9 insertions(+)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 2060a144a2..6710835b46 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -52,6 +52,8 @@ vhost_vdpa_set_vring_call(void *dev, unsigned int index, int 
fd) "dev: %p index:
 vhost_vdpa_get_features(void *dev, uint64_t features) "dev: %p features: 
0x%"PRIx64
 vhost_vdpa_set_owner(void *dev) "dev: %p"
 vhost_vdpa_vq_get_addr(void *dev, void *vq, uint64_t desc_user_addr, uint64_t 
avail_user_addr, uint64_t used_user_addr) "dev: %p vq: %p desc_user_addr: 
0x%"PRIx64" avail_user_addr: 0x%"PRIx64" used_user_addr: 0x%"PRIx64
+vhost_vdpa_set_config_call(void *dev, int *fd)"dev: %p fd: %p"
+
 
 # virtio.c
 virtqueue_alloc_element(void *elem, size_t sz, unsigned in_num, unsigned 
out_num) "elem %p size %zd in_num %u out_num %u"
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 01d2101d09..9ba2a2bed4 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -545,6 +545,12 @@ static int vhost_vdpa_set_vring_call(struct vhost_dev *dev,
 trace_vhost_vdpa_set_vring_call(dev, file->index, file->fd);
 return vhost_vdpa_call(dev, VHOST_SET_VRING_CALL, file);
 }
+static int vhost_vdpa_set_config_call(struct vhost_dev *dev,
+   int *fd)
+{
+trace_vhost_vdpa_set_config_call(dev, fd);
+return vhost_vdpa_call(dev, VHOST_VDPA_SET_CONFIG_CALL, fd);
+}
 
 static int vhost_vdpa_get_features(struct vhost_dev *dev,
  uint64_t *features)
@@ -611,4 +617,5 @@ const VhostOps vdpa_ops = {
 .vhost_get_device_id = vhost_vdpa_get_device_id,
 .vhost_vq_get_addr = vhost_vdpa_vq_get_addr,
 .vhost_force_iommu = vhost_vdpa_force_iommu,
+.vhost_set_config_call = vhost_vdpa_set_config_call,
 };
-- 
2.21.3
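
Editorial note (illustrative, not part of the patch): the optional-callback pattern that patches 3/9 and 4/9 rely on can be sketched in isolation. This is a minimal sketch under stated assumptions: every `demo_*` name below is hypothetical and merely stands in for the `VhostOps` table, the vhost-vdpa backend and the masking helper; none of it is QEMU's real API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are not QEMU's real types. */
struct demo_dev;

typedef int (*demo_set_config_call_op)(struct demo_dev *dev, int *fd);

struct demo_ops {
    /* Optional: left NULL by backends without config interrupt support. */
    demo_set_config_call_op set_config_call;
};

struct demo_dev {
    const struct demo_ops *ops;
    bool started;
    int last_config_fd;   /* records what the backend was told (demo only) */
};

/* Backend implementation, playing the role of the vhost-vdpa callback. */
static int demo_vdpa_set_config_call(struct demo_dev *dev, int *fd)
{
    assert(fd != NULL);
    dev->last_config_fd = *fd;
    return 0;
}

/* Only this backend fills in the optional callback. */
static const struct demo_ops demo_vdpa_ops = {
    .set_config_call = demo_vdpa_set_config_call,
};

/* A backend that does not implement the callback. */
static const struct demo_ops demo_other_ops = {
    .set_config_call = NULL,
};

/*
 * Route the config interrupt to the masked fd or to the normal fd.
 * Returns the backend's result, or 1 if the request was skipped because
 * the device is not started or the backend lacks the callback.
 */
static int demo_config_mask(struct demo_dev *dev, bool mask,
                            int masked_fd, int normal_fd)
{
    int fd = mask ? masked_fd : normal_fd;

    assert(dev->ops != NULL);
    if (!dev->started || dev->ops->set_config_call == NULL) {
        return 1;
    }
    return dev->ops->set_config_call(dev, &fd);
}
```

Callers never need to know which backend they are talking to: a NULL entry in the table simply turns the request into a no-op, which appears to be how the series keeps the other vhost backends unaffected.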

[PATCH v6 8/9] virtio: decouple virtqueue from set notifier fd handler

2021-04-26 Thread Cindy Lu

This patch decouples the virtqueue from
virtio_queue_set_guest_notifier_fd_handler.
Here queue number -1 means the configure interrupt, and the function
sets config_notify_read as the fd handler.

Signed-off-by: Cindy Lu 
---
 hw/s390x/virtio-ccw.c  |  6 +++---
 hw/virtio/virtio-mmio.c|  8 
 hw/virtio/virtio-pci.c |  9 +
 hw/virtio/virtio.c | 35 +--
 include/hw/virtio/virtio.h |  4 +---
 5 files changed, 30 insertions(+), 32 deletions(-)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index 4582e94ae7..5d73c99d30 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -989,11 +989,11 @@ static int virtio_ccw_set_guest_notifier(VirtioCcwDevice 
*dev, int n,
 if (r < 0) {
 return r;
 }
-virtio_queue_set_guest_notifier_fd_handler(vq, true, with_irqfd);
+virtio_set_notifier_fd_handler(vdev, n, true, with_irqfd);
 if (with_irqfd) {
 r = virtio_ccw_add_irqfd(dev, n);
 if (r) {
-virtio_queue_set_guest_notifier_fd_handler(vq, false,
+virtio_set_notifier_fd_handler(vdev, n, false,
with_irqfd);
 return r;
 }
@@ -1017,7 +1017,7 @@ static int virtio_ccw_set_guest_notifier(VirtioCcwDevice 
*dev, int n,
 if (with_irqfd) {
 virtio_ccw_remove_irqfd(dev, n);
 }
-virtio_queue_set_guest_notifier_fd_handler(vq, false, with_irqfd);
+virtio_set_notifier_fd_handler(vdev, n, false, with_irqfd);
 event_notifier_cleanup(notifier);
 }
 return 0;
diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index d8cb368728..4ea55001be 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -620,9 +620,9 @@ static int virtio_mmio_set_guest_notifier(DeviceState *d, 
int n, bool assign,
 if (r < 0) {
 return r;
 }
-virtio_queue_set_guest_notifier_fd_handler(vq, true, with_irqfd);
+virtio_set_notifier_fd_handler(vdev, n, true, with_irqfd);
 } else {
-virtio_queue_set_guest_notifier_fd_handler(vq, false, with_irqfd);
+virtio_set_notifier_fd_handler(vdev, n, false, with_irqfd);
 event_notifier_cleanup(notifier);
 }
 
@@ -642,9 +642,9 @@ static int virtio_mmio_set_config_notifier(DeviceState *d,  
bool assign)
 int r = 0;
 if (assign) {
 r = event_notifier_init(notifier, 0);
-virtio_set_config_notifier_fd_handler(vdev, true, false);
+virtio_set_notifier_fd_handler(vdev, -1, true, false);
 } else {
-virtio_set_config_notifier_fd_handler(vdev, false, false);
+virtio_set_notifier_fd_handler(vdev, -1, false, false);
 event_notifier_cleanup(notifier);
 }
 if (vdc->guest_notifier_mask && vdev->use_guest_notifier_mask) {
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 07d28dd367..5033b3db4f 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -806,10 +806,10 @@ static int virtio_pci_set_config_notifier(DeviceState *d, 
 bool assign)
 int r = 0;
 if (assign) {
 r = event_notifier_init(notifier, 0);
-virtio_set_config_notifier_fd_handler(vdev, true, true);
+virtio_set_notifier_fd_handler(vdev, -1, true, true);
 kvm_virtio_pci_vector_config_use(proxy);
 } else {
-virtio_set_config_notifier_fd_handler(vdev, false, true);
+virtio_set_notifier_fd_handler(vdev, -1, false, true);
 kvm_virtio_pci_vector_config_release(proxy);
 event_notifier_cleanup(notifier);
 }
@@ -1005,9 +1005,9 @@ static int virtio_pci_set_guest_notifier(DeviceState *d, 
int n, bool assign,
 if (r < 0) {
 return r;
 }
-virtio_queue_set_guest_notifier_fd_handler(vq, true, with_irqfd);
+virtio_set_notifier_fd_handler(vdev, n, true, with_irqfd);
 } else {
-virtio_queue_set_guest_notifier_fd_handler(vq, false, with_irqfd);
+virtio_set_notifier_fd_handler(vdev, n, false, with_irqfd);
 event_notifier_cleanup(notifier);
 }
 
@@ -1049,6 +1049,7 @@ static int virtio_pci_set_guest_notifiers(DeviceState *d, 
int nvqs, bool assign)
 msix_unset_vector_notifiers(&proxy->pci_dev);
 if (proxy->vector_irqfd) {
 kvm_virtio_pci_vector_release(proxy, nvqs);
+kvm_virtio_pci_vector_config_release(proxy);
 g_free(proxy->vector_irqfd);
 proxy->vector_irqfd = NULL;
 }
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 5dff29c981..8f0087deac 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3510,32 +3510,31 @@ static void virtio_config_read(EventNotifier *n)
 virtio_notify_config(vdev);
 }
 }
-void virtio_queue_set_guest_notifier_fd_handler(VirtQueue *vq, bool assign,
-bool with_irqfd)

[PATCH v6 3/9] vhost: add new call back function for config interrupt

2021-04-26 Thread Cindy Lu

To support the configure interrupt, we need to
add a new callback function for it.

Signed-off-by: Cindy Lu 
---
 include/hw/virtio/vhost-backend.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/hw/virtio/vhost-backend.h 
b/include/hw/virtio/vhost-backend.h
index 8a6f8e2a7a..adaf6982d2 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -125,6 +125,8 @@ typedef int (*vhost_get_device_id_op)(struct vhost_dev 
*dev, uint32_t *dev_id);
 
 typedef bool (*vhost_force_iommu_op)(struct vhost_dev *dev);
 
+typedef int (*vhost_set_config_call_op)(struct vhost_dev *dev,
+   int *fd);
 typedef struct VhostOps {
 VhostBackendType backend_type;
 vhost_backend_init vhost_backend_init;
@@ -170,6 +172,7 @@ typedef struct VhostOps {
 vhost_vq_get_addr_op  vhost_vq_get_addr;
 vhost_get_device_id_op vhost_get_device_id;
 vhost_force_iommu_op vhost_force_iommu;
+vhost_set_config_call_op vhost_set_config_call;
 } VhostOps;
 
 extern const VhostOps user_ops;
-- 
2.21.3

[PATCH v6 2/9] virtio-pci:decouple virtqueue from interrupt setting process

2021-04-26 Thread Cindy Lu

The interrupt/vector code is currently coupled
to the vq number; this patch decouples the virtqueue
numbers from these functions.

Signed-off-by: Cindy Lu 
---
 hw/virtio/virtio-pci.c | 51 --
 1 file changed, 29 insertions(+), 22 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 36524a5728..2b7e6cc0d9 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -691,23 +691,17 @@ static void 
kvm_virtio_pci_vq_vector_release(VirtIOPCIProxy *proxy,
 }
 
 static int kvm_virtio_pci_irqfd_use(VirtIOPCIProxy *proxy,
- unsigned int queue_no,
+ EventNotifier *n,
  unsigned int vector)
 {
 VirtIOIRQFD *irqfd = &proxy->vector_irqfd[vector];
-VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
-VirtQueue *vq = virtio_get_queue(vdev, queue_no);
-EventNotifier *n = virtio_queue_get_guest_notifier(vq);
 return kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, n, NULL, irqfd->virq);
 }
 
 static void kvm_virtio_pci_irqfd_release(VirtIOPCIProxy *proxy,
-  unsigned int queue_no,
+  EventNotifier *n ,
   unsigned int vector)
 {
-VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
-VirtQueue *vq = virtio_get_queue(vdev, queue_no);
-EventNotifier *n = virtio_queue_get_guest_notifier(vq);
 VirtIOIRQFD *irqfd = &proxy->vector_irqfd[vector];
 int ret;
 
@@ -722,7 +716,8 @@ static int kvm_virtio_pci_vector_use(VirtIOPCIProxy *proxy, 
int nvqs)
 VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
 unsigned int vector;
 int ret, queue_no;
-
+VirtQueue *vq;
+EventNotifier *n;
 for (queue_no = 0; queue_no < nvqs; queue_no++) {
 if (!virtio_queue_get_num(vdev, queue_no)) {
 break;
@@ -739,7 +734,9 @@ static int kvm_virtio_pci_vector_use(VirtIOPCIProxy *proxy, 
int nvqs)
  * Otherwise, delay until unmasked in the frontend.
  */
 if (vdev->use_guest_notifier_mask && k->guest_notifier_mask) {
-ret = kvm_virtio_pci_irqfd_use(proxy, queue_no, vector);
+vq = virtio_get_queue(vdev, queue_no);
+n = virtio_queue_get_guest_notifier(vq);
+ret = kvm_virtio_pci_irqfd_use(proxy, n, vector);
 if (ret < 0) {
 kvm_virtio_pci_vq_vector_release(proxy, vector);
 goto undo;
@@ -755,7 +752,9 @@ undo:
 continue;
 }
 if (vdev->use_guest_notifier_mask && k->guest_notifier_mask) {
-kvm_virtio_pci_irqfd_release(proxy, queue_no, vector);
+vq = virtio_get_queue(vdev, queue_no);
+n = virtio_queue_get_guest_notifier(vq);
+kvm_virtio_pci_irqfd_release(proxy, n, vector);
 }
 kvm_virtio_pci_vq_vector_release(proxy, vector);
 }
@@ -769,7 +768,8 @@ static void kvm_virtio_pci_vector_release(VirtIOPCIProxy 
*proxy, int nvqs)
 unsigned int vector;
 int queue_no;
 VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
-
+VirtQueue *vq;
+EventNotifier *n;
 for (queue_no = 0; queue_no < nvqs; queue_no++) {
 if (!virtio_queue_get_num(vdev, queue_no)) {
 break;
@@ -782,7 +782,9 @@ static void kvm_virtio_pci_vector_release(VirtIOPCIProxy 
*proxy, int nvqs)
  * Otherwise, it was cleaned when masked in the frontend.
  */
 if (vdev->use_guest_notifier_mask && k->guest_notifier_mask) {
-kvm_virtio_pci_irqfd_release(proxy, queue_no, vector);
+vq = virtio_get_queue(vdev, queue_no);
+n = virtio_queue_get_guest_notifier(vq);
+kvm_virtio_pci_irqfd_release(proxy, n, vector);
 }
 kvm_virtio_pci_vq_vector_release(proxy, vector);
 }
@@ -791,12 +793,11 @@ static void kvm_virtio_pci_vector_release(VirtIOPCIProxy 
*proxy, int nvqs)
 static int virtio_pci_vq_vector_unmask(VirtIOPCIProxy *proxy,
unsigned int queue_no,
unsigned int vector,
-   MSIMessage msg)
+   MSIMessage msg,
+EventNotifier *n)
 {
 VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
 VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
-VirtQueue *vq = virtio_get_queue(vdev, queue_no);
-EventNotifier *n = virtio_queue_get_guest_notifier(vq);
 VirtIOIRQFD *irqfd;
 int ret = 0;
 
@@ -823,14 +824,15 @@ static int virtio_pci_vq_vector_unmask(VirtIOPCIProxy 
*proxy,
 event_notifier_set(n);
 }
 } else {
-ret = kvm_virtio_pci_irqfd_use(proxy, queue_no, vector);
+ret = kvm_virtio_pci_irqfd_use(proxy, n, vector);
 }
 return ret;
 }
 
 static void virtio_pci_vq_vector_mask(VirtIOPCIProxy

[PATCH v6 9/9] virtio-net: add peer_deleted check in virtio_net_handle_rx

2021-04-26 Thread Cindy Lu

During testing, we found that this function keeps running
after the peer is deleted, which causes a crash. Add a
check for this.

Signed-off-by: Cindy Lu 
---
 hw/net/virtio-net.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 43b912453a..1be3f8e76f 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1403,7 +1403,9 @@ static void virtio_net_handle_rx(VirtIODevice *vdev, 
VirtQueue *vq)
 {
 VirtIONet *n = VIRTIO_NET(vdev);
 int queue_index = vq2q(virtio_get_queue_index(vq));
-
+if (n->nic->peer_deleted) {
+return;
+}
 qemu_flush_queued_packets(qemu_get_subqueue(n->nic, queue_index));
 }
 
-- 
2.21.3
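
Editorial note (illustrative, not part of the patch): the early-return guard added above can be shown on its own. This is a minimal sketch, assuming a hypothetical `demo_nic` structure with a `peer_deleted` flag; the names are invented for the example and do not mirror QEMU's NIC state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the NIC state; not QEMU's real structure. */
struct demo_nic {
    bool peer_deleted;   /* set once the peer backend has gone away */
    int flushed;         /* counts flush calls (demo only) */
};

/* Stand-in for the work that must not run without a peer. */
static void demo_flush_queued_packets(struct demo_nic *nic)
{
    nic->flushed++;
}

/* RX handler: bail out before touching the peer once it is deleted. */
static void demo_handle_rx(struct demo_nic *nic)
{
    assert(nic != NULL);
    if (nic->peer_deleted) {
        return;
    }
    demo_flush_queued_packets(nic);
}
```

The guard sits at the top of the handler so that nothing downstream can dereference state belonging to the deleted peer.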

Re: [RFC PATCH 4/4] target/ppc: isolated cpu init from translation logic

2021-04-26 Thread David Gibson

On Fri, Apr 23, 2021 at 04:18:07PM -0300, Bruno Larsen (billionai) wrote:
> finished isolation of CPU initialization logic from
> translation logic. CPU initialization now only has common code
> and may or may not call accelerator-specific code, as the
> build options require.
> 
> Signed-off-by: Bruno Larsen (billionai) 
> ---
>  target/ppc/{translate_init.c.inc => cpu_init.c} | 11 ++-
>  target/ppc/meson.build  |  1 +
>  target/ppc/translate.c  |  4 +++-
>  3 files changed, 14 insertions(+), 2 deletions(-)
>  rename target/ppc/{translate_init.c.inc => cpu_init.c} (99%)
> 
> diff --git a/target/ppc/translate_init.c.inc b/target/ppc/cpu_init.c
> similarity index 99%
> rename from target/ppc/translate_init.c.inc
> rename to target/ppc/cpu_init.c
> index 33e44f1363..38e4c87aa5 100644
> --- a/target/ppc/translate_init.c.inc
> +++ b/target/ppc/cpu_init.c
> @@ -18,6 +18,7 @@
>   * License along with this library; if not, see 
> .
>   */
>  
> +#include "qemu/osdep.h"
>  #include "disas/dis-asm.h"
>  #include "exec/gdbstub.h"
>  #include "kvm_ppc.h"
> @@ -42,6 +43,9 @@
>  #include "fpu/softfloat.h"
>  #include "qapi/qapi-commands-machine-target.h"
>  
> +#include "helper_regs.h"
> +#include "internal.h"
> +
>  /* #define PPC_DUMP_CPU */
>  /* #define PPC_DEBUG_SPR */
>  /* #define PPC_DUMP_SPR_ACCESSES */
> @@ -51,7 +55,12 @@ static inline void vscr_init(CPUPPCState *env, uint32_t 
> val)
>  {
>  /* Altivec always uses round-to-nearest */
>  set_float_rounding_mode(float_round_nearest_even, &env->vec_status);
> -helper_mtvscr(env, val);
> +/*
> + * This comment is here just so the project will build.
> + * The current solution is in another patch and will be
> + * added when we figure out an internal fork of qemu
> + */
> +/* helper_mtvscr(env, val); */

Ugh.  Yeah, this doesn't belong here at all.  This looks like what
should be reset time initialization of the VSCR, which isn't actually
an SPR, though it's somewhat similar.  It really belongs in the reset
path for the relevant CPUs, not with the construction of the CPU
registers itself.
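
Editorial note (illustrative): the distinction drawn above, between constructing the CPU's registers and initializing state at reset time, can be sketched roughly as follows. This is a minimal sketch under stated assumptions: `demo_cpu`, both functions and the `DEMO_VSCR_RESET` value are hypothetical placeholders, not the layout or values QEMU uses.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder reset value, chosen only for this demo. */
#define DEMO_VSCR_RESET UINT32_C(0x00010000)

struct demo_cpu {
    int nr_sprs;      /* filled once, when the CPU object is constructed */
    uint32_t vscr;    /* run-time state, re-initialized on every reset */
};

/* Construction-time work: describe which registers exist. */
static void demo_cpu_register_sprs(struct demo_cpu *cpu)
{
    cpu->nr_sprs = 3;
}

/* Reset-time work: put run-time state back to its reset value. */
static void demo_cpu_reset(struct demo_cpu *cpu)
{
    assert(cpu->nr_sprs > 0);   /* construction must have run first */
    cpu->vscr = DEMO_VSCR_RESET;
}
```

With this split, the construction code has no dependency on helpers that only exist when the translator is built, because the state initialization lives in the reset path instead.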

>  }
>  
>  /*
> diff --git a/target/ppc/meson.build b/target/ppc/meson.build
> index aaee5e7c0c..14f0ba5d48 100644
> --- a/target/ppc/meson.build
> +++ b/target/ppc/meson.build
> @@ -2,6 +2,7 @@ ppc_ss = ss.source_set()
>  ppc_ss.add(files(
>'cpu-models.c',
>'cpu.c',
> +  'cpu_init.c',
>'dfp_helper.c',
>'excp_helper.c',
>'fpu_helper.c',
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index bb893be928..a4d9fb8d54 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -37,6 +37,9 @@
>  #include "exec/log.h"
>  #include "qemu/atomic128.h"
>  
> +#include "qemu/qemu-print.h"
> +#include "qapi/error.h"
> +#include "internal.h"
>  
>  #define CPU_SINGLE_STEP 0x1
>  #define CPU_BRANCH_STEP 0x2
> @@ -7593,7 +7596,6 @@ GEN_HANDLER2_E(trechkpt, "trechkpt", 0x1F, 0x0E, 0x1F, 
> 0x03FFF800, \
>  };
>  
>  #include "helper_regs.h"
> -#include "translate_init.c.inc"
>  
>  
> /*/
>  /* Misc PowerPC helpers */

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [RFC PATCH 3/4] target/ppc: Move SPR generation to separate file

2021-04-26 Thread David Gibson

On Mon, Apr 26, 2021 at 06:08:53PM -0300, Fabiano Rosas wrote:
> "Bruno Larsen (billionai)"  writes:
> 
> > This move is required to enable building without TCG.
> > All the logic related to registering SPRs specific to
> > some architectures or machines has been hidden in this
> > new file.
> 
> Hm... I thought we ended up deciding to keep the gen_spr_
> functions in translate_init.c.inc (cpu_init.c). I don't strongly favour
> one way or the other, but one alternative would be to just rename the
> gen_spr_ functions to add_sprs_ or even
> register__sprs and leave them where they are.

Right.  I think renaming these away from gen_*() is a good idea, to
avoid confusion with the other gen_*() functions which, well, generate
TCG code.

I don't think there's a lot of value in moving them out from
translate_init.  Honestly the more useful way to break up
translate_init would be by CPU family, rather than by SPR vs. non-SPR
setup
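
Editorial note (illustrative): the naming point made above can be seen in a small sketch. Functions of this kind only add entries to a table; they emit no TCG code, so a `register_*_sprs` style name describes them more accurately than `gen_*`. Everything here (`demo_env`, `demo_spr_register`, the table size) is a hypothetical stand-in rather than QEMU's `spr_register` machinery; only the SPR numbers for XER, LR and CTR are the architectural ones.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_SPRS 16

/* Hypothetical, simplified SPR table entry. */
struct demo_spr {
    int num;
    const char *name;
};

struct demo_env {
    struct demo_spr sprs[DEMO_MAX_SPRS];
    size_t nr_sprs;
};

/* Append one SPR description to the table. */
static void demo_spr_register(struct demo_env *env, int num, const char *name)
{
    assert(env->nr_sprs < DEMO_MAX_SPRS);
    env->sprs[env->nr_sprs].num = num;
    env->sprs[env->nr_sprs].name = name;
    env->nr_sprs++;
}

/* Named after what it does: it registers SPRs, it generates nothing. */
static void register_generic_sprs(struct demo_env *env)
{
    demo_spr_register(env, 1, "XER");
    demo_spr_register(env, 8, "LR");
    demo_spr_register(env, 9, "CTR");
}

/* Look up a registered SPR by number; returns NULL if it is absent. */
static const char *demo_spr_name(const struct demo_env *env, int num)
{
    for (size_t i = 0; i < env->nr_sprs; i++) {
        if (env->sprs[i].num == num) {
            return env->sprs[i].name;
        }
    }
    return NULL;
}
```

A per-family split, as suggested above, would then amount to one such `register_*_sprs` function per CPU family, each calling the shared helper.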

> 
> > The idea of this final patch is to hide all SPR generation from
> > translate_init, but in an effort to simplify the RFC the 4
> > functions for not_implemented SPRs were created. They'll be
> > substituted by gen_spr__misc in reusable ways for the
> > final patch.
> 
> I'd expect this patch to be just a big removal of gen_spr* from
> translate_init.c.inc and their addition into spr_common.c. So any other
> prep work should come in separate patches earlier in the
> series. Specifically, at least one patch for the macro work and another
> for the refactoring of open coded spr_register calls into gen_spr_*
> functions.

Seconded.

> 
> > another issue we ran into was vscr_init using static functions
> > means it has to be static, so we had to remove them from 
> > gen_spr_74xx and gen_spr_book3s_altivec, and have them in
> > the init_procs instead.
> 
> Looks like moving vscr_init out, along with a more detailed explanation
> of the issue could be in another preliminary change.
> 
> >
> > Finally, SPR_NOACCESS had to be defined in internal.h, as it
> > is used by spr_common, translate_init and translate. If there
> > is a better solution, I'll be happy to implement it.
> >
> > As for the redundant code complaint this patch will get, it has only
> > been moved, so I don't know if I can remove that code
> >
> > Signed-off-by: Bruno Larsen (billionai) 
> > ---
> >  target/ppc/internal.h   |  108 +
> >  target/ppc/meson.build  |1 +
> >  target/ppc/spr_common.c | 2943 ++
> >  target/ppc/translate_init.c.inc | 4031 ++-
> >  4 files changed, 3314 insertions(+), 3769 deletions(-)
> >  create mode 100644 target/ppc/spr_common.c
> >
> > diff --git a/target/ppc/internal.h b/target/ppc/internal.h
> > index de78c23717..25df546eae 100644
> > --- a/target/ppc/internal.h
> > +++ b/target/ppc/internal.h
> > @@ -226,4 +226,112 @@ void destroy_ppc_opcodes(PowerPCCPU *cpu);
> >  void ppc_gdb_init(CPUState *cs, PowerPCCPUClass *ppc);
> >  gchar *ppc_gdb_arch_name(CPUState *cs);
> >  
> > +/* spr-common.c */
> > +#include "cpu.h"
> > +void gen_spr_generic(CPUPPCState *env);
> 
> The fact that these are called gen_* is confusing since they don't
> really generate anything. They mostly just add SPRs to the list and
> register the SPR rw callbacks for TCG. Maybe we could rename them at the
> end of the series to something more clear.
> 
> > +void gen_spr_ne_601(CPUPPCState *env);
> > +void gen_spr_sdr1(CPUPPCState *env);
> > +void gen_low_BATs(CPUPPCState *env);
> > +void gen_high_BATs(CPUPPCState *env);
> > +void gen_tbl(CPUPPCState *env);
> > +void gen_6xx_7xx_soft_tlb(CPUPPCState *env, int nb_tlbs, int nb_ways);
> > +void gen_spr_G2_755(CPUPPCState *env);
> > +void gen_spr_7xx(CPUPPCState *env);
> > +#ifdef TARGET_PPC64
> > +void gen_spr_amr(CPUPPCState *env);
> > +void gen_spr_iamr(CPUPPCState *env);
> > +#endif /* TARGET_PPC64 */
> > +void gen_spr_thrm(CPUPPCState *env);
> > +void gen_spr_604(CPUPPCState *env);
> > +void gen_spr_603(CPUPPCState *env);
> > +void gen_spr_G2(CPUPPCState *env);
> > +void gen_spr_602(CPUPPCState *env);
> > +void gen_spr_601(CPUPPCState *env);
> > +void gen_spr_74xx(CPUPPCState *env);
> > +void gen_l3_ctrl(CPUPPCState *env);
> > +void gen_74xx_soft_tlb(CPUPPCState *env, int nb_tlbs, int nb_ways);
> > +void gen_spr_not_implemented(CPUPPCState *env,
> > + int num, const char *name);
> > +void gen_spr_not_implemented_ureg(CPUPPCState *env,
> > +  int num, const char *name);
> > +void gen_spr_not_implemented_no_write(CPUPPCState *env,
> > +  int num, const char *name);
> > +void gen_spr_not_implemented_write_nop(CPUPPCState *env,
> > +   int num, const char *name);
> > +void gen_spr_PSSCR(CPUPPCState *env);
> > +void gen_spr_TIDR(CPUPPCState *env);
> > +void gen_spr_pvr(CPUPPCState *env, PowerPCCPUClass *pcc);
> > +void gen_spr_svr(CPUPPCState *env, PowerPCCPUClass *pcc);

[PATCH v6 1/9] hw: Add check for queue number

2021-04-26 Thread Cindy Lu

To support the configure interrupt, we will use queue number -1
to identify it.
None of these devices support the configure interrupt,
so add a check here: if the idx is -1, the function
returns immediately.

Signed-off-by: Cindy Lu 
---
 hw/display/vhost-user-gpu.c|  8 ++--
 hw/net/virtio-net.c| 10 +++---
 hw/virtio/vhost-user-fs.c  | 11 +++
 hw/virtio/vhost-vsock-common.c |  8 ++--
 hw/virtio/virtio-crypto.c  |  8 ++--
 5 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index 51f1747c4a..d8e26cedf1 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -490,7 +490,9 @@ static bool
 vhost_user_gpu_guest_notifier_pending(VirtIODevice *vdev, int idx)
 {
 VhostUserGPU *g = VHOST_USER_GPU(vdev);
-
+if (idx == -1) {
+return false;
+}
 return vhost_virtqueue_pending(&g->vhost->dev, idx);
 }
 
@@ -498,7 +500,9 @@ static void
 vhost_user_gpu_guest_notifier_mask(VirtIODevice *vdev, int idx, bool mask)
 {
 VhostUserGPU *g = VHOST_USER_GPU(vdev);
-
+if (idx == -1) {
+return;
+}
 vhost_virtqueue_mask(&g->vhost->dev, vdev, idx, mask);
 }
 
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 9179013ac4..78ccaa228c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3060,7 +3060,10 @@ static bool 
virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
 VirtIONet *n = VIRTIO_NET(vdev);
 NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
 assert(n->vhost_started);
-return vhost_net_virtqueue_pending(get_vhost_net(nc->peer), idx);
+if (idx != -1) {
+return vhost_net_virtqueue_pending(get_vhost_net(nc->peer), idx);
+}
+return false;
 }
 
 static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx,
@@ -3069,8 +3072,9 @@ static void virtio_net_guest_notifier_mask(VirtIODevice 
*vdev, int idx,
 VirtIONet *n = VIRTIO_NET(vdev);
 NetClientState *nc = qemu_get_subqueue(n->nic, vq2q(idx));
 assert(n->vhost_started);
-vhost_net_virtqueue_mask(get_vhost_net(nc->peer),
- vdev, idx, mask);
+if (idx != -1) {
+vhost_net_virtqueue_mask(get_vhost_net(nc->peer), vdev, idx, mask);
+ }
 }
 
 static void virtio_net_set_config_size(VirtIONet *n, uint64_t host_features)
diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index 1bc5d03a00..37424c2193 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -142,18 +142,21 @@ static void vuf_handle_output(VirtIODevice *vdev, 
VirtQueue *vq)
  */
 }
 
-static void vuf_guest_notifier_mask(VirtIODevice *vdev, int idx,
-bool mask)
+static void vuf_guest_notifier_mask(VirtIODevice *vdev, int idx, bool mask)
 {
 VHostUserFS *fs = VHOST_USER_FS(vdev);
-
+if (idx == -1) {
+return;
+}
 vhost_virtqueue_mask(&fs->vhost_dev, vdev, idx, mask);
 }
 
 static bool vuf_guest_notifier_pending(VirtIODevice *vdev, int idx)
 {
 VHostUserFS *fs = VHOST_USER_FS(vdev);
-
+if (idx == -1) {
+return false;
+}
 return vhost_virtqueue_pending(&fs->vhost_dev, idx);
 }
 
diff --git a/hw/virtio/vhost-vsock-common.c b/hw/virtio/vhost-vsock-common.c
index 5b2ebf3496..0adf823d37 100644
--- a/hw/virtio/vhost-vsock-common.c
+++ b/hw/virtio/vhost-vsock-common.c
@@ -100,7 +100,9 @@ static void 
vhost_vsock_common_guest_notifier_mask(VirtIODevice *vdev, int idx,
 bool mask)
 {
 VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
-
+if (idx == -1) {
+return;
+}
 vhost_virtqueue_mask(&vvc->vhost_dev, vdev, idx, mask);
 }
 
@@ -108,7 +110,9 @@ static bool 
vhost_vsock_common_guest_notifier_pending(VirtIODevice *vdev,
int idx)
 {
 VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
-
+if (idx == -1) {
+return false;
+}
 return vhost_virtqueue_pending(&vvc->vhost_dev, idx);
 }
 
diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index 54f9bbb789..c47f4ffb24 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -947,7 +947,9 @@ static void virtio_crypto_guest_notifier_mask(VirtIODevice 
*vdev, int idx,
 int queue = virtio_crypto_vq2q(idx);
 
 assert(vcrypto->vhost_started);
-
+if (idx == -1) {
+return;
+}
 cryptodev_vhost_virtqueue_mask(vdev, queue, idx, mask);
 }
 
@@ -957,7 +959,9 @@ static bool 
virtio_crypto_guest_notifier_pending(VirtIODevice *vdev, int idx)
 int queue = virtio_crypto_vq2q(idx);
 
 assert(vcrypto->vhost_started);
-
+if (idx == -1) {
+return false;
+}
 return cryptodev_vhost_virtqueue_pending(vdev, queue, idx);
 }
 
-- 
2.21.3
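
Editorial note (illustrative, not part of the patch): the guard this patch adds to each device can be sketched as below. This is a minimal sketch, assuming a hypothetical device with a fixed number of queues; `DEMO_CONFIG_IRQ_IDX` and the `demo_*` functions are invented names, not QEMU symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel: index -1 denotes the configure interrupt. */
#define DEMO_CONFIG_IRQ_IDX (-1)
#define DEMO_NUM_QUEUES 4

struct demo_vdev {
    bool pending[DEMO_NUM_QUEUES];
};

/* Per-queue helper; only valid for real queue indexes. */
static bool demo_virtqueue_pending(struct demo_vdev *d, int idx)
{
    bool was_pending;

    assert(idx >= 0 && idx < DEMO_NUM_QUEUES);
    was_pending = d->pending[idx];
    d->pending[idx] = false;   /* test and clear */
    return was_pending;
}

/*
 * Device callback for a device WITHOUT configure interrupt support:
 * the sentinel index is filtered out before idx is used as an array index.
 */
static bool demo_guest_notifier_pending(struct demo_vdev *d, int idx)
{
    if (idx == DEMO_CONFIG_IRQ_IDX) {
        return false;
    }
    return demo_virtqueue_pending(d, idx);
}
```

The point of the sketch is the ordering: the sentinel check has to come before any code that treats `idx` as a queue number.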

[PATCH v6 0/9] vhost-vdpa: add support for configure interrupt

2021-04-26 Thread Cindy Lu

These patches add support for the configure interrupt.

The code has been tested with vp-vdpa (which supports the configure interrupt)
and vdpa_sim (which does not),

on both the virtio-pci bus and the virtio-mmio bus.

Changes in v2:
Add support for the virtio-mmio bus
Activate the notifier when the backend supports the configure interrupt
Misc fixes from v1

Changes in v3:
Fix the coding style problems

Changes in v4:
Misc fixes from v3
Merge set_config_notifier into set_guest_notifier
When vdpa starts, check the feature via VIRTIO_NET_F_STATUS

Changes in v5:
Misc fixes from v4
Split the code that introduces the configure interrupt type and callback function
Init the configure interrupt on all virtio-pci and virtio-mmio buses, but
only activate it when using the vhost-vdpa driver

Changes in v6:
Misc fixes from v5
Decouple virtqueue from interrupt setting and misc process
Fix the bug in virtio_net_handle_rx
Use -1 as the queue number to identify the configure interrupt

Cindy Lu (9):
  hw: Add check for queue number
  virtio-pci:decouple virtqueue from interrupt setting process
  vhost: add new call back function for config interrupt
  vhost-vdpa: add support for config interrupt call back
  vhost:add support for configure interrupt
  virtio-mmio: add support for configure interrupt
  virtio-pci: add support for configure interrupt
  virtio: decouple virtqueue from set notifier fd handler
  virtio-net: add peer_deleted check in virtio_net_handle_rx

 hw/display/vhost-user-gpu.c   |   8 +-
 hw/net/vhost_net.c|   9 ++
 hw/net/virtio-net.c   |  20 ++-
 hw/s390x/virtio-ccw.c |   6 +-
 hw/virtio/trace-events|   2 +
 hw/virtio/vhost-user-fs.c |  11 +-
 hw/virtio/vhost-vdpa.c|   7 +
 hw/virtio/vhost-vsock-common.c|   8 +-
 hw/virtio/vhost.c |  70 +-
 hw/virtio/virtio-crypto.c |   8 +-
 hw/virtio/virtio-mmio.c   |  30 -
 hw/virtio/virtio-pci.c| 212 +++---
 hw/virtio/virtio.c|  37 --
 include/hw/virtio/vhost-backend.h |   3 +
 include/hw/virtio/vhost.h |   3 +
 include/hw/virtio/virtio.h|   4 +-
 include/net/vhost_net.h   |   3 +
 17 files changed, 336 insertions(+), 105 deletions(-)

-- 
2.21.3
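
Editorial note (illustrative, not part of the series): the central convention of v6, a single entry point keyed by queue number with -1 selecting the configure interrupt, can be sketched as follows. This is a minimal sketch under stated assumptions: `demo_device`, `demo_notifier` and both functions are hypothetical and do not reproduce `virtio_set_notifier_fd_handler` or QEMU's `EventNotifier`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NUM_QUEUES 2

/* Hypothetical notifier: only tracks whether a handler is installed. */
struct demo_notifier {
    bool handler_set;
};

struct demo_device {
    struct demo_notifier config_notifier;
    struct demo_notifier queue_notifier[DEMO_NUM_QUEUES];
};

/* Map an index to its notifier: -1 is the configure interrupt. */
static struct demo_notifier *demo_get_notifier(struct demo_device *dev, int n)
{
    assert(dev != NULL);
    if (n == -1) {
        return &dev->config_notifier;
    }
    if (n < 0 || n >= DEMO_NUM_QUEUES) {
        return NULL;
    }
    return &dev->queue_notifier[n];
}

/* One entry point for both virtqueue and configure notifiers. */
static int demo_set_notifier_fd_handler(struct demo_device *dev, int n,
                                        bool assign)
{
    struct demo_notifier *notifier = demo_get_notifier(dev, n);

    if (notifier == NULL) {
        return -1;   /* unknown index */
    }
    notifier->handler_set = assign;
    return 0;
}
```

Bus code can then pass `n` straight through for virtqueues and pass -1 for the configure interrupt, instead of calling two differently named functions.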

Re: [RFC PATCH 1/4] target/ppc: move opcode table logic to translate.c

2021-04-26 Thread da...@gibson.dropbear.id.au

On Mon, Apr 26, 2021 at 07:29:54PM +, Bruno Piazera Larsen wrote:
> > > code motion to remove opcode callback table from
> > > translate_init.c.inc to translate.c in preparation
> > > to remove #include  from
> > > translate.c
> >
> > I'd mention the creation of destroy_ppc_opcodes since this patch is not
> > strictly just moving code.
> 
> Sure, will do for v2.
> 
> > > +#if defined(PPC_DUMP_CPU)
> >
> > The commented out define for this was left behind.
> 
> Good catch! The define is going to still be used by a couple of things in 
> cpu_init, though.
> I'm guessing moving to internal.h is the best solution, but correct
> me if I'm wrong

Generally LGTM, excepting the things Fabiano pointed out.

> 
> 
> Bruno Piazera Larsen
> Instituto de Pesquisas ELDORADO
> Embedded Computing Department
> Trainee Software Analyst

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [RFC PATCH 2/4] target/ppc: isolated SPR read/write callbacks

2021-04-26 Thread David Gibson

On Fri, Apr 23, 2021 at 04:18:05PM -0300, Bruno Larsen (billionai) wrote:
> Moved all functions related to SPR read/write callbacks into a new file
> specific for holding these. This is setting up a better separation of
> SPR registration, which is required to be able to build disabling
> TCG.
> 
> The solution to move it to spr_tcg.c.inc and including it in translate.c
> is a work in progress, any better solutions are very much appreciated.
> Also, making the R/W functions not static is required for the next
> commit.

[snip]
> diff --git a/target/ppc/spr_tcg.h b/target/ppc/spr_tcg.h
> new file mode 100644
> index 00..1e09d001a9
> --- /dev/null
> +++ b/target/ppc/spr_tcg.h
> @@ -0,0 +1,132 @@
> +#ifndef SPR_TCG_H
> +#define SPR_TCG_H
> +
> +#include "qemu/osdep.h"
> +#include "cpu.h"
> +#include "exec/translator.h"
> +#include "tcg/tcg.h"
> +
> +/* prototypes for readers and writers for SPRs */

The 2 fscr functions below aren't readers and writers for the FSCR.
Instead they're used by instructions related to facilities the FSCR
can enable and disable - this generates the code to check the FSCR and
generate an exception if the units are disabled.
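
Editorial note (illustrative): the behaviour described above, checking an enable bit and raising an exception when the facility is disabled, can be sketched as a plain run-time check. This is a deliberate simplification and an assumption of the sketch: the real functions emit TCG code that performs the check later, whereas the hypothetical `demo_fscr_facility_check` below performs it directly on an invented `demo_env` structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU state holding only what the demo needs. */
struct demo_env {
    uint64_t fscr;          /* facility enable bits */
    bool exception_raised;  /* set when a disabled facility is used */
    int exception_cause;    /* which facility triggered the exception */
};

/*
 * Return true if the facility guarded by 'bit' is enabled. Otherwise
 * record an exception with the given cause and return false.
 */
static bool demo_fscr_facility_check(struct demo_env *env, int bit, int cause)
{
    assert(bit >= 0 && bit < 64);
    if (env->fscr & (UINT64_C(1) << bit)) {
        return true;
    }
    env->exception_raised = true;
    env->exception_cause = cause;
    return false;
}
```

Seen this way, the check is a gate in front of an instruction or register access rather than an accessor for the FSCR itself, which is the distinction the review comment draws.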

That doesn't mean they don't belong here, but it does mean 



> +
> +#ifdef TARGET_PPC64
> +void gen_fscr_facility_check(DisasContext *ctx, int facility_sprn,
> +int bit, int sprn, int cause);
> +void gen_msr_facility_check(DisasContext *ctx, int facility_sprn,
> +   int bit, int sprn, int cause);
> +#endif

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



RE: [PATCH] cutils: Fix memleak in get_relocated_path()

2021-04-26 Thread Duan, Zhenzhong


> -Original Message-
> From: Philippe Mathieu-Daudé 
> Sent: Monday, April 26, 2021 10:06 PM
> To: Duan, Zhenzhong ; qemu-
> de...@nongnu.org
> Cc: qemu-triv...@nongnu.org; pbonz...@redhat.com; Stefano Garzarella
> 
> Subject: Re: [PATCH] cutils: Fix memleak in get_relocated_path()
> 
> Hi,
> 
> On 4/27/21 12:30 AM, Zhenzhong Duan wrote:
> > Valgrind reports a "definitely lost" leak in get_relocated_path(), because
> > the GString is leaked when the function returns only its gchar *.
> > Use g_string_free(result, FALSE) to free the GString while preserving the gchar *.
> >
> > Signed-off-by: Zhenzhong Duan 
> > ---
> >  util/cutils.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/util/cutils.c b/util/cutils.c
> > index ee908486da..f58c2157d2 100644
> > --- a/util/cutils.c
> > +++ b/util/cutils.c
> > @@ -1055,5 +1055,5 @@ char *get_relocated_path(const char *dir)
> >  assert(G_IS_DIR_SEPARATOR(dir[-1]));
> >  g_string_append(result, dir - 1);
> >  }
> > -return result->str;
> > +return g_string_free(result, FALSE);
> >  }
> >
> 
> Thanks for your patch, but Stefano sent the same fix 2 weeks ago:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg798279.html
> 
> It should be merged once the development tree opens again (we are
> currently frozen ahead of the v6.0.0 release).

I see, thanks for your quick response.

Zhenzhong
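
The fix quoted above relies on g_string_free(result, FALSE) freeing the GString wrapper while handing the character data back to the caller. A minimal sketch of that ownership pattern follows, written in plain C so it does not need GLib. StrBuf, strbuf_new(), strbuf_append(), strbuf_free() and build_path() are hypothetical stand-ins invented for illustration; they are not GLib or QEMU functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for GString: a heap-allocated wrapper around a buffer. */
typedef struct StrBuf {
    char *str;
    size_t len;
} StrBuf;

static StrBuf *strbuf_new(const char *init)
{
    StrBuf *sb = malloc(sizeof(*sb));
    assert(sb != NULL);
    sb->len = strlen(init);
    sb->str = malloc(sb->len + 1);
    assert(sb->str != NULL);
    memcpy(sb->str, init, sb->len + 1);
    return sb;
}

static void strbuf_append(StrBuf *sb, const char *s)
{
    size_t add = strlen(s);
    char *p = realloc(sb->str, sb->len + add + 1);
    assert(p != NULL);
    memcpy(p + sb->len, s, add + 1);
    sb->str = p;
    sb->len += add;
}

/*
 * Same shape as g_string_free(): always free the wrapper; either free the
 * character data too (returning NULL) or hand it to the caller.
 */
static char *strbuf_free(StrBuf *sb, int free_segment)
{
    char *data = sb->str;

    if (free_segment) {
        free(data);
        data = NULL;
    }
    free(sb);
    return data;
}

/* Returning result->str here instead would leak the wrapper itself. */
static char *build_path(const char *prefix, const char *dir)
{
    StrBuf *result = strbuf_new(prefix);

    strbuf_append(result, dir);
    return strbuf_free(result, 0);
}
```

The caller owns the returned buffer and releases it with free(); in the GLib case the equivalent would be g_free().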

Re: [PATCH v4] target/ppc: code motion from translate_init.c.inc to gdbstub.c

2021-04-26 Thread David Gibson

On Mon, Apr 26, 2021 at 03:47:06PM -0300, Bruno Larsen (billionai) wrote:
> All the code related to gdb has been moved from translate_init.c.inc
> file to the gdbstub.c file, where it makes more sense.
> 
> Version 4 fixes the omission of internal.h in gdbstub, mentioned in
> <87sg3d2gf5@linux.ibm.com>, and the extra blank line.
> 
> Signed-off-by: Bruno Larsen (billionai) 
> Suggested-by: Fabiano Rosas 

Applied to ppc-for-6.1, thanks.

> ---
>  target/ppc/gdbstub.c| 258 
>  target/ppc/internal.h   |   5 +
>  target/ppc/translate_init.c.inc | 254 +--
>  3 files changed, 264 insertions(+), 253 deletions(-)
> 
> diff --git a/target/ppc/gdbstub.c b/target/ppc/gdbstub.c
> index c28319fb97..94a7273ee0 100644
> --- a/target/ppc/gdbstub.c
> +++ b/target/ppc/gdbstub.c
> @@ -20,6 +20,8 @@
>  #include "qemu/osdep.h"
>  #include "cpu.h"
>  #include "exec/gdbstub.h"
> +#include "exec/helper-proto.h"
> +#include "internal.h"
>  
>  static int ppc_gdb_register_len_apple(int n)
>  {
> @@ -387,3 +389,259 @@ const char *ppc_gdb_get_dynamic_xml(CPUState *cs, const 
> char *xml_name)
>  return NULL;
>  }
>  #endif
> +
> +static bool avr_need_swap(CPUPPCState *env)
> +{
> +#ifdef HOST_WORDS_BIGENDIAN
> +return msr_le;
> +#else
> +return !msr_le;
> +#endif
> +}
> +
> +#if !defined(CONFIG_USER_ONLY)
> +static int gdb_find_spr_idx(CPUPPCState *env, int n)
> +{
> +int i;
> +
> +for (i = 0; i < ARRAY_SIZE(env->spr_cb); i++) {
> +ppc_spr_t *spr = &env->spr_cb[i];
> +
> +if (spr->name && spr->gdb_id == n) {
> +return i;
> +}
> +}
> +return -1;
> +}
> +
> +static int gdb_get_spr_reg(CPUPPCState *env, GByteArray *buf, int n)
> +{
> +int reg;
> +int len;
> +
> +reg = gdb_find_spr_idx(env, n);
> +if (reg < 0) {
> +return 0;
> +}
> +
> +len = TARGET_LONG_SIZE;
> +gdb_get_regl(buf, env->spr[reg]);
> +ppc_maybe_bswap_register(env, gdb_get_reg_ptr(buf, len), len);
> +return len;
> +}
> +
> +static int gdb_set_spr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
> +{
> +int reg;
> +int len;
> +
> +reg = gdb_find_spr_idx(env, n);
> +if (reg < 0) {
> +return 0;
> +}
> +
> +len = TARGET_LONG_SIZE;
> +ppc_maybe_bswap_register(env, mem_buf, len);
> +env->spr[reg] = ldn_p(mem_buf, len);
> +
> +return len;
> +}
> +#endif
> +
> +static int gdb_get_float_reg(CPUPPCState *env, GByteArray *buf, int n)
> +{
> +uint8_t *mem_buf;
> +if (n < 32) {
> +gdb_get_reg64(buf, *cpu_fpr_ptr(env, n));
> +mem_buf = gdb_get_reg_ptr(buf, 8);
> +ppc_maybe_bswap_register(env, mem_buf, 8);
> +return 8;
> +}
> +if (n == 32) {
> +gdb_get_reg32(buf, env->fpscr);
> +mem_buf = gdb_get_reg_ptr(buf, 4);
> +ppc_maybe_bswap_register(env, mem_buf, 4);
> +return 4;
> +}
> +return 0;
> +}
> +
> +static int gdb_set_float_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
> +{
> +if (n < 32) {
> +ppc_maybe_bswap_register(env, mem_buf, 8);
> +*cpu_fpr_ptr(env, n) = ldq_p(mem_buf);
> +return 8;
> +}
> +if (n == 32) {
> +ppc_maybe_bswap_register(env, mem_buf, 4);
> +store_fpscr(env, ldl_p(mem_buf), 0x);
> +return 4;
> +}
> +return 0;
> +}
> +
> +static int gdb_get_avr_reg(CPUPPCState *env, GByteArray *buf, int n)
> +{
> +uint8_t *mem_buf;
> +
> +if (n < 32) {
> +ppc_avr_t *avr = cpu_avr_ptr(env, n);
> +if (!avr_need_swap(env)) {
> +gdb_get_reg128(buf, avr->u64[0] , avr->u64[1]);
> +} else {
> +gdb_get_reg128(buf, avr->u64[1] , avr->u64[0]);
> +}
> +mem_buf = gdb_get_reg_ptr(buf, 16);
> +ppc_maybe_bswap_register(env, mem_buf, 8);
> +ppc_maybe_bswap_register(env, mem_buf + 8, 8);
> +return 16;
> +}
> +if (n == 32) {
> +gdb_get_reg32(buf, helper_mfvscr(env));
> +mem_buf = gdb_get_reg_ptr(buf, 4);
> +ppc_maybe_bswap_register(env, mem_buf, 4);
> +return 4;
> +}
> +if (n == 33) {
> +gdb_get_reg32(buf, (uint32_t)env->spr[SPR_VRSAVE]);
> +mem_buf = gdb_get_reg_ptr(buf, 4);
> +ppc_maybe_bswap_register(env, mem_buf, 4);
> +return 4;
> +}
> +return 0;
> +}
> +
> +static int gdb_set_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
> +{
> +if (n < 32) {
> +ppc_avr_t *avr = cpu_avr_ptr(env, n);
> +ppc_maybe_bswap_register(env, mem_buf, 8);
> +ppc_maybe_bswap_register(env, mem_buf + 8, 8);
> +if (!avr_need_swap(env)) {
> +avr->u64[0] = ldq_p(mem_buf);
> +avr->u64[1] = ldq_p(mem_buf + 8);
> +} else {
> +avr->u64[1] = ldq_p(mem_buf);
> +avr->u64[0] = ldq_p(mem_buf + 8);
> +}
> +return 16;
> +}
> +if (n == 32) {
> +

RE: [PATCH v6 09/10] Add the function of colo_bitmap_clear_dirty

2021-04-26 Thread Rao, Lei

Hi, Dave

I think this set of patches is beneficial to upstream. Please take a look at the
performance data. If you have any other ideas, please let me know.

Thanks
Lei.

-----Original Message-----
From: Rao, Lei 
Sent: Friday, April 16, 2021 3:57 PM
To: dgilb...@redhat.com
Cc: qemu-devel@nongnu.org; Zhang, Chen ; 
lizhij...@cn.fujitsu.com; jasow...@redhat.com; quint...@redhat.com; 
pbonz...@redhat.com; lukasstra...@web.de
Subject: RE: [PATCH v6 09/10] Add the function of colo_bitmap_clear_dirty

Hi, Dave

The performance data has been added to the commit messages.
Do you have any other suggestions?

Thanks
Lei.

-----Original Message-----
From: Rao, Lei 
Sent: Friday, April 9, 2021 11:21 AM
To: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; quint...@redhat.com; dgilb...@redhat.com; 
pbonz...@redhat.com; lukasstra...@web.de
Cc: qemu-devel@nongnu.org; Rao, Lei 
Subject: [PATCH v6 09/10] Add the function of colo_bitmap_clear_dirty

From: "Rao, Lei" 

When we use continuous dirty memory copy for flushing the ram cache on the
secondary VM, we can also clean up the bitmap of contiguous dirty page memory.
This can also reduce the VM stop time during checkpoint.

The performance test for COLO is as follows:

Server configuration:
CPU: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
MEM: 251G (type: DDR4, Speed: 2666 MT/s)
SSD: Intel 730 and DC S35x0/3610/3700 Series SSDs

dirty pages:3189376  migration_bitmap_clear_dirty time consuming(ns):105194000
dirty pages:3189784  migration_bitmap_clear_dirty time consuming(ns):105297000
dirty pages:3190501  migration_bitmap_clear_dirty time consuming(ns):10541
dirty pages:3188734  migration_bitmap_clear_dirty time consuming(ns):105138000
dirty pages:3189464  migration_bitmap_clear_dirty time consuming(ns):111736000
dirty pages:3188558  migration_bitmap_clear_dirty time consuming(ns):105079000
dirty pages:3239489  migration_bitmap_clear_dirty time consuming(ns):106761000

dirty pages:3190240  colo_bitmap_clear_dirty time consuming(ns):8369000
dirty pages:3189293  colo_bitmap_clear_dirty time consuming(ns):8388000
dirty pages:3189171  colo_bitmap_clear_dirty time consuming(ns):8641000
dirty pages:3189099  colo_bitmap_clear_dirty time consuming(ns):828
dirty pages:3189974  colo_bitmap_clear_dirty time consuming(ns):8352000
dirty pages:3189471  colo_bitmap_clear_dirty time consuming(ns):8348000
dirty pages:3189681  colo_bitmap_clear_dirty time consuming(ns):8426000

It can be seen from the data that colo_bitmap_clear_dirty is more efficient.
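
The idea described above, clearing a whole contiguous range of dirty bits in one call instead of one call per page, can be sketched in standalone C as follows. This is only an illustration under stated assumptions: test_and_clear(), set_bit_nr() and clear_dirty_range() are hypothetical helpers, not the QEMU bitmap API, and locking is left out (the patch takes the bitmap mutex once for the whole range). Unlike the patch, the sketch returns the number of bits cleared, so a zero-length range yields a defined result.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Clear one bit and report whether it was previously set (not atomic). */
static bool test_and_clear(unsigned long *map, size_t nr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    unsigned long *word = &map[nr / BITS_PER_WORD];
    bool was_set = (*word & mask) != 0;

    *word &= ~mask;
    return was_set;
}

/* Mark one page as dirty. */
static void set_bit_nr(unsigned long *map, size_t nr)
{
    map[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

/*
 * Clear the bits for pages [start, start + num) in a single call and
 * subtract the number of previously dirty pages from *dirty_pages.
 * Returns how many bits were actually cleared.
 */
static size_t clear_dirty_range(unsigned long *map, size_t start, size_t num,
                                size_t *dirty_pages)
{
    size_t cleared = 0;

    for (size_t i = 0; i < num; i++) {
        if (test_and_clear(map, start + i)) {
            cleared++;
        }
    }
    *dirty_pages -= cleared;
    return cleared;
}
```

A caller that previously looped over single pages would pass the start offset and page count once, which is the same change the patch makes in colo_flush_ram_cache().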

Signed-off-by: Lei Rao 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 migration/ram.c | 36 +++-
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 8661d82..11275cd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -857,6 +857,36 @@ unsigned long colo_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
 return first;
 }
 
+/**
+ * colo_bitmap_clear_dirty:when we flush ram cache to ram, we will use
+ * continuous memory copy, so we can also clean up the bitmap of contiguous
+ * dirty memory.
+ */
+static inline bool colo_bitmap_clear_dirty(RAMState *rs,
+   RAMBlock *rb,
+   unsigned long start,
+   unsigned long num) {
+bool ret;
+unsigned long i = 0;
+
+/*
+ * Since flush ram cache to ram can only happen on Secondary VM.
+ * and the clear bitmap always is NULL on destination side.
+ * Therefore, there is unnecessary to judge whether the
+ * clear_bitmap needs clear.
+ */
+QEMU_LOCK_GUARD(&rs->bitmap_mutex);
+for (i = 0; i < num; i++) {
+ret = test_and_clear_bit(start + i, rb->bmap);
+if (ret) {
+rs->migration_dirty_pages--;
+}
+}
+
+return ret;
+}
+
 static inline bool migration_bitmap_clear_dirty(RAMState *rs,
 RAMBlock *rb,
 unsigned long page)
@@ -3774,11 +3804,7 @@ void colo_flush_ram_cache(void)
 num = 0;
 block = QLIST_NEXT_RCU(block, next);
 } else {
-unsigned long i = 0;
-
-for (i = 0; i < num; i++) {
-migration_bitmap_clear_dirty(ram_state, block, offset + i);
-}
+colo_bitmap_clear_dirty(ram_state, block, offset, num);
 dst_host = block->host
  + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
 src_host = block->colo_cache
--
1.8.3.1

Re: [PATCH v2] hw/i386: Expand the range of CPU topologies between smp and maxcpus

2021-04-26 Thread Like Xu


On 2021/4/26 21:30, Daniel P. Berrangé wrote:

On Mon, Apr 26, 2021 at 10:08:52AM +0800, caodon...@kingsoft.com wrote:

Change the criteria for the initial CPU topology and maxcpus, user can
have more settings


Can you provide a better explanation of why this is needed. What
valid usage scenario is blocked by the current check ?

AFAICT, it partially reverts an intentional change made several
years ago in:


   commit bc1fb850a31468ac4976f3895f01a6d981e06d0a
   Author: Igor Mammedov 
   Date:   Thu Sep 13 13:06:01 2018 +0200

 vl.c deprecate incorrect CPUs topology
 
 -smp [cpus],sockets/cores/threads[,maxcpus] should describe topology

 so that total number of logical CPUs [sockets * cores * threads]
 would be equal to [maxcpus], however historically we didn't have
 such check in QEMU and it is possible to start VM with an invalid
 topology.
 Deprecate invalid options combination so we can make sure that
 the topology VM started with is always correct in the future.
 Users with an invalid sockets/cores/threads/maxcpus values should
 fix their CLI to make sure that
[sockets * cores * threads] == [maxcpus]




Another helpful commit would be:

commit c4332cd1dcf2964c23893ab4c0bf8d774e42a3cf
Author: Igor Mammedov 
Date:   Fri Sep 11 09:32:02 2020 -0400

smp: drop support for deprecated (invalid topologies)

it's was deprecated since 3.1

Support for invalid topologies is removed, the user must ensure
that topologies described with -smp include all possible cpus,
i.e. (sockets * cores * threads) == maxcpus or QEMU will
exit with error.


So is the following statement correct:

When we explicitly set the topology, we must ensure that the combination
(sockets/dies/cores/threads/maxcpus) is always valid. Since 3.1, if we need
hot-plug testing, we can only use something like "-smp 1,maxcpus=4".


?
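
The rule from the two commits quoted above, that the product of the topology levels must equal maxcpus, can be sketched as a small standalone check. This is an illustration only: Topology and topology_matches_maxcpus() are hypothetical names, and the real check lives in pc_smp_parse() as shown in the diff below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the values parsed from -smp. */
typedef struct Topology {
    unsigned sockets;
    unsigned dies;
    unsigned cores;
    unsigned threads;
} Topology;

/*
 * A topology is accepted only when it describes every possible CPU,
 * i.e. sockets * dies * cores * threads == maxcpus.
 */
static bool topology_matches_maxcpus(const Topology *t, unsigned maxcpus)
{
    unsigned long long total =
        (unsigned long long)t->sockets * t->dies * t->cores * t->threads;

    return total == maxcpus;
}
```

With this check, sockets=2,dies=1,cores=2,threads=1 is valid for maxcpus=4 and rejected for maxcpus=8. The patch under discussion would replace the equality with a "greater than" test, so a product smaller than maxcpus would be accepted again.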






Signed-off-by: Dongli Cao 
---
hw/i386/pc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 8a84b25..ef2e819 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -751,7 +751,7 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
  exit(1);
  }

-if (sockets * dies * cores * threads != ms->smp.max_cpus) {
+if (sockets * dies * cores * threads > ms->smp.max_cpus) {
  error_report("Invalid CPU topology deprecated: "
   "sockets (%u) * dies (%u) * cores (%u) * threads (%u) 
"
   "!= maxcpus (%u)",


This is


--
1.8.3.1









caodon...@kingsoft.com




Regards,
Daniel

RE: [PATCH v3 09/12] target/hexagon: import lexer for idef-parser

2021-04-26 Thread Taylor Simpson




> -----Original Message-----
> From: Alessandro Di Federico 
> Sent: Tuesday, March 30, 2021 9:38 AM
> To: qemu-devel@nongnu.org
> Cc: Taylor Simpson ; Brian Cain
> ; bab...@rev.ng; ni...@rev.ng; phi...@redhat.com;
> richard.hender...@linaro.org; Alessandro Di Federico 
> Subject: [PATCH v3 09/12] target/hexagon: import lexer for idef-parser
> 

> +"fLSBNEW(P"{LOWER_PRE}"N)" { yylval->rvalue.type = PREDICATE;
> +   yylval->rvalue.pre.id = yytext[9];
> +   yylval->rvalue.bit_width = 32;
> +   yylval->rvalue.is_dotnew = true;
> +   return PRE; }
> +"fLSBNEW0"   { yylval->rvalue.type = PREDICATE;
> +   yylval->rvalue.pre.id = '0';
> +   yylval->rvalue.bit_width = 32;
> +   yylval->rvalue.is_dotnew = true;
> +   return PRE; }
> +"fLSBNEW1"   { yylval->rvalue.type = PREDICATE;
> +   yylval->rvalue.pre.id = '1';
> +   yylval->rvalue.bit_width = 32;
> +   yylval->rvalue.is_dotnew = true;
> +   return PRE; }
> +"fLSBNEW1NOT"{ yylval->rvalue.type = PREDICATE;
> +   yylval->rvalue.pre.id = '1';
> +   yylval->rvalue.bit_width = 32;
> +   yylval->rvalue.is_dotnew = true;
> +   return PRE; }

These represent the least significant bit of the operand.  Perhaps you should 
set the bit_width to 1?  Or do tcg_gen_andi_tl(..., 1)?
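
As a host-side illustration of the second suggestion, masking with 1 keeps only the least significant bit of a 32-bit value. lsb_of() is a hypothetical helper written for this note; it is not part of the idef-parser or of TCG.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only bit 0 of the operand, the plain-C analogue of an "and with 1". */
static uint32_t lsb_of(uint32_t value)
{
    return value & 1u;
}
```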

Thanks,
Taylor

RE: [PATCH v4 09/12] target/hexagon: import lexer for idef-parser

2021-04-26 Thread Taylor Simpson




> -----Original Message-----
> From: Alessandro Di Federico 
> Sent: Thursday, April 15, 2021 11:35 AM
> To: qemu-devel@nongnu.org
> Cc: Taylor Simpson ; Brian Cain
> ; bab...@rev.ng; ni...@rev.ng; phi...@redhat.com;
> richard.hender...@linaro.org; Alessandro Di Federico 
> Subject: [PATCH v4 09/12] target/hexagon: import lexer for idef-parser

> +/**
> + * Semantic record of the IMM token, identifying an immediate constant
> + */
> +typedef struct HexImm {
> +union {
> +char id;        /**< Identifier of the immediate */
> +uint64_t value; /**< Immediate value (for VALUE type immediates) */

Most immediates are 32 bits.  Since you treat them as 64 bits, you end up with 
unnecessary extends and truncates in the TCG.

Here's an example from idef-generated-emitter.c
void emit_J2_jump(DisasContext *ctx, Insn *insn, Packet *pkt, int riV)
/* fIMMEXT(riV); (riV = riV & ~3); (PC = fREAD_PC()+riV);} */
{
int64_t qemu_tmp_0 = ~((int64_t)3ULL);
int32_t qemu_tmp_1 = riV & qemu_tmp_0;
riV = qemu_tmp_1;
TCGv_i32 tmp_0 = tcg_temp_local_new_i32();
tcg_gen_movi_i32(tmp_0, ctx->base.pc_next);
TCGv_i64 tmp_1 = tcg_temp_local_new_i64();
tcg_gen_ext_i32_i64(tmp_1, tmp_0);  <- Don't need this extension
tcg_temp_free_i32(tmp_0);
TCGv_i64 tmp_2 = tcg_temp_local_new_i64();
tcg_gen_addi_i64(tmp_2, tmp_1, (int64_t)riV);  <- This should be 32 bits
tcg_temp_free_i64(tmp_1);
TCGv_i32 tmp_3 = tcg_temp_local_new_i32();
tcg_gen_trunc_i64_tl(tmp_3, tmp_2);  <- Don't need this truncation
tcg_temp_free_i64(tmp_2);
gen_write_new_pc(tmp_3);
tcg_temp_free_i32(tmp_3);
}
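
To show why the extension and truncation are redundant, here is a small plain-C sketch comparing the two routes. It assumes only that the final result is a 32-bit value, as in the example above; next_pc_via_64() and next_pc_via_32() are hypothetical names used for illustration and do not exist in the generated code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * 64-bit route: widen, add, truncate. The low 32 bits of the sum do not
 * depend on how the upper bits were filled when widening.
 */
static uint32_t next_pc_via_64(uint32_t pc, int32_t riV)
{
    int64_t wide = (int64_t)pc;

    wide += (int64_t)riV;
    return (uint32_t)wide;
}

/* 32-bit route: a single add that wraps modulo 2^32. */
static uint32_t next_pc_via_32(uint32_t pc, int32_t riV)
{
    return pc + (uint32_t)riV;
}
```

Both functions return the same value for every input, so a 32-bit add is enough when the immediate is known to be 32 bits wide.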

> +uint64_t index; /**< Index of the immediate (for int temp vars) */
> +};

Re: [PATCH 5/7] hw: Have machines Kconfig-select FW_CFG

2021-04-26 Thread David Gibson

On Tue, Apr 27, 2021 at 12:03:42AM +0200, BALATON Zoltan wrote:
> On Mon, 26 Apr 2021, Philippe Mathieu-Daudé wrote:
> > Beside the loongson3-virt machine (MIPS), the following machines
> > also use the fw_cfg device:
> > 
> > - ARM: virt & sbsa-ref
> > - HPPA: generic machine
> > - X86: ACPI based (pc & microvm)
> > - PPC64: various
> > - SPARC: sun4m & sun4u
> > 
> > Add their FW_CFG Kconfig dependency.
> > 
> > Signed-off-by: Philippe Mathieu-Daudé 
> > ---
> > hw/arm/Kconfig | 2 ++
> > hw/hppa/Kconfig| 1 +
> > hw/i386/Kconfig| 2 ++
> > hw/ppc/Kconfig | 1 +
> > hw/sparc/Kconfig   | 1 +
> > hw/sparc64/Kconfig | 1 +
> > 6 files changed, 8 insertions(+)
> > 
> > diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
> > index 8c37cf00da7..3b2641e39dc 100644
> > --- a/hw/arm/Kconfig
> > +++ b/hw/arm/Kconfig
> > @@ -8,6 +8,7 @@ config ARM_VIRT
> > imply TPM_TIS_SYSBUS
> > select ARM_GIC
> > select ACPI
> > +select FW_CFG
> > select ARM_SMMUV3
> > select GPIO_KEY
> > select FW_CFG_DMA
> > @@ -216,6 +217,7 @@ config SBSA_REF
> > select PL061 # GPIO
> > select USB_EHCI_SYSBUS
> > select WDT_SBSA
> > +select FW_CFG
> > 
> > config SABRELITE
> > bool
> > diff --git a/hw/hppa/Kconfig b/hw/hppa/Kconfig
> > index 22948db0256..45f40e09224 100644
> > --- a/hw/hppa/Kconfig
> > +++ b/hw/hppa/Kconfig
> > @@ -14,3 +14,4 @@ config DINO
> > select LASIPS2
> > select PARALLEL
> > select ARTIST
> > +select FW_CFG
> > diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
> > index 7f91f30877f..9e4039a2dce 100644
> > --- a/hw/i386/Kconfig
> > +++ b/hw/i386/Kconfig
> > @@ -52,6 +52,7 @@ config PC_ACPI
> > select SMBUS_EEPROM
> > select PFLASH_CFI01
> > depends on ACPI_SMBUS
> > +select FW_CFG
> > 
> > config I440FX
> > bool
> > @@ -106,6 +107,7 @@ config MICROVM
> > select ACPI_HW_REDUCED
> > select PCI_EXPRESS_GENERIC_BRIDGE
> > select USB_XHCI_SYSBUS
> > +select FW_CFG
> > 
> > config X86_IOMMU
> > bool
> > diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
> > index d11dc30509d..a7ba8283bf1 100644
> > --- a/hw/ppc/Kconfig
> > +++ b/hw/ppc/Kconfig
> > @@ -131,6 +131,7 @@ config VIRTEX
> > # Only used by 64-bit targets
> > config FW_CFG_PPC
> > bool
> > +select FW_CFG
> 
> Why do we need a separate config option here if all it does is select FW_CFG
> and also in meson.build it only seems to add fw_cfg.c? (Unlike FW_CFG_DMA
> which seems to add other files so another option makes sense for that case).
> Could we just use FW_CFG directly and drop the PPC specific option like you
> did for MIPS?
> 
> Also the comment saying "Only used by 64-bit targets" seems to be wrong as
> it is also selected by MAC_OLDWORLD that's definitely a 32-bit machine (and
> MAC_NEWWORLD that can be both 32 or 64 bit) so maybe this option used to do
> something previously but now seems to be equivalent to just FW_CFG. So could
> it be dropped and use FW_CFG instead to simplify this or what's the reason
> to keep a PPC specific option for it?

Actually... good point.  I don't see any reason for this config option either.

> 
> Regards,
> BALATON Zoltan
> 
> > 
> > config FDT_PPC
> > bool
> > diff --git a/hw/sparc/Kconfig b/hw/sparc/Kconfig
> > index 8dcb10086fd..267bf45fa21 100644
> > --- a/hw/sparc/Kconfig
> > +++ b/hw/sparc/Kconfig
> > @@ -15,6 +15,7 @@ config SUN4M
> > select STP2000
> > select CHRP_NVRAM
> > select OR_IRQ
> > +select FW_CFG
> > 
> > config LEON3
> > bool
> > diff --git a/hw/sparc64/Kconfig b/hw/sparc64/Kconfig
> > index 980a201bb73..c17b34b9d5b 100644
> > --- a/hw/sparc64/Kconfig
> > +++ b/hw/sparc64/Kconfig
> > @@ -13,6 +13,7 @@ config SUN4U
> > select PCKBD
> > select SIMBA
> > select CHRP_NVRAM
> > +select FW_CFG
> > 
> > config NIAGARA
> > bool
> > 


-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH 4/5] hw/pci-host/raven: Manually reset the OR_IRQ device

2021-04-26 Thread David Gibson

On Sat, Apr 24, 2021 at 06:22:28PM +0200, Philippe Mathieu-Daudé wrote:
> The OR_IRQ device is bus-less, thus isn't reset automatically.
> Add the raven_pcihost_reset() handler to manually reset the OR IRQ.
> 
> Fixes: f40b83a4e31 ("40p: use OR gate to wire up raven PCI interrupts")
> Signed-off-by: Philippe Mathieu-Daudé 

Acked-by: David Gibson 

> ---
>  hw/pci-host/prep.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
> index 0a9162fba97..275379e4c78 100644
> --- a/hw/pci-host/prep.c
> +++ b/hw/pci-host/prep.c
> @@ -230,6 +230,15 @@ static void raven_change_gpio(void *opaque, int n, int 
> level)
>  s->contiguous_map = level;
>  }
>  
> +static void raven_pcihost_reset(DeviceState *dev)
> +{
> +PREPPCIState *s = RAVEN_PCI_HOST_BRIDGE(dev);
> +
> +if (!s->is_legacy_prep) {
> +device_legacy_reset(DEVICE(&s->or_irq));
> +}
> +}
> +
>  static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
>  {
>  SysBusDevice *dev = SYS_BUS_DEVICE(d);
> @@ -422,6 +431,7 @@ static void raven_pcihost_class_init(ObjectClass *klass, 
> void *data)
>  
>  set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
>  dc->realize = raven_pcihost_realizefn;
> +dc->reset = raven_pcihost_reset;
>  device_class_set_props(dc, raven_pcihost_properties);
>  dc->fw_name = "pci";
>  }

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH v3] target/ppc: code motion from translate_init.c.inc to gdbstub.c

2021-04-26 Thread David Gibson

On Mon, Apr 26, 2021 at 12:50:54PM -0300, Fabiano Rosas wrote:
> David Gibson  writes:
> 
> > On Wed, Apr 14, 2021 at 01:09:19PM -0700, Richard Henderson wrote:
> >> On 4/14/21 7:59 AM, Bruno Larsen (billionai) wrote:
> >> > All the code related to gdb has been moved from translate_init.c.inc
> >> > file to the gdbstub.c file, where it makes more sense.
> >> > 
> >> > This new version puts the prototypes in internal.h, to not expose
> >> > them unnecessarily.
> >> > 
> >> > Signed-off-by: Bruno Larsen (billionai) 
> >> > Suggested-by: Fabiano Rosas 
> >> > ---
> >> >   target/ppc/gdbstub.c| 258 
> >> >   target/ppc/internal.h   |   5 +
> >> >   target/ppc/translate_init.c.inc | 254 +--
> >> >   3 files changed, 264 insertions(+), 253 deletions(-)
> >> 
> >> Reviewed-by: Richard Henderson 
> >
> > Applied to ppc-for-6.1, thanks.
> 
> 
> The prototypes moved to internal.h in v3 so gdbstub.c needs to include
> it now. The linux-user build is breaking with:

Thanks for the report.  I've pulled this from ppc-for-6.1 and I'll
await the next spin.

> 
> $ ../configure --target-list=ppc64le-linux-user
> $ make -j$(nproc)
> (...)
> [316/959] Compiling C object 
> libqemu-ppc64le-linux-user.fa.p/target_ppc_gdbstub.c.o   
> FAILED: libqemu-ppc64le-linux-user.fa.p/target_ppc_gdbstub.c.o
> 
> cc -Ilibqemu-ppc64le-linux-user.fa.p -I. -I.. -Itarget/ppc -I../target/ppc 
> -I../linux-user/host/x86_64 -Ilinux-user -I../linux-user -Ilinux-user/ppc 
> -I../linux-user/ppc -I../capstone/include/capstone -Itrace -Iqap
> i -Iui/shader -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include 
> -fdiagnostics-color=auto -pipe -Wall -Winvalid-pch -Werror -std=gnu99 -O2 -g 
> -isystem /home/fabiano/kvm/qemu-patch-testing/linux-headers -isystem
>  linux-headers -iquote . -iquote /home/fabiano/kvm/qemu-patch-testing -iquote 
> /home/fabiano/kvm/qemu-patch-testing/include -iquote 
> /home/fabiano/kvm/qemu-patch-testing/disas/libvixl -iquote 
> /home/fabiano/kvm/qemu-
> patch-testing/tcg/i386 -iquote /home/fabiano/kvm/qemu-patch-testing/accel/tcg 
> -pthread -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -m64 -mcx16 -D_GNU_SOURCE 
> -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes
>  -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes 
> -fno-strict-aliasing -fno-common -fwrapv -Wold-style-declaration 
> -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k 
> -Winit-self -Wig
> nored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels 
> -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wno-missing-include-dirs 
> -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIC -isystem.
> ./linux-headers -isystemlinux-headers -DNEED_CPU_H 
> '-DCONFIG_TARGET="ppc64le-linux-user-config-target.h"' 
> '-DCONFIG_DEVICES="ppc64le-linux-user-config-devices.h"' -MD -MQ 
> libqemu-ppc64le-linux-user.fa.p/target_ppc
> _gdbstub.c.o -MF libqemu-ppc64le-linux-user.fa.p/target_ppc_gdbstub.c.o.d -o 
> libqemu-ppc64le-linux-user.fa.p/target_ppc_gdbstub.c.o -c 
> ../target/ppc/gdbstub.c   
> ../target/ppc/gdbstub.c:615:8: error: no previous prototype for ‘ppc_gdb_arch_name’ [-Werror=missing-prototypes]
>   615 | gchar *ppc_gdb_arch_name(CPUState *cs)
>       |        ^
> ../target/ppc/gdbstub.c:624:6: error: no previous prototype for ‘ppc_gdb_init’ [-Werror=missing-prototypes]
>   624 | void ppc_gdb_init(CPUState *cs, PowerPCCPUClass *pcc)
>       |      ^~~~
> cc1: all warnings being treated as errors
> 
> >> > +void ppc_gdb_init(CPUState *cs, PowerPCCPUClass *pcc)
> >> > +{
> >> > +
> >> > +if (pcc->insns_flags & PPC_FLOAT) {
> >> 
> >> Watch the extra blank lines.
> >
> > Fixed in my tree.
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH 1/5] hw/ppc/spapr_iommu: Register machine reset handler

2021-04-26 Thread David Gibson

On Sat, Apr 24, 2021 at 06:22:25PM +0200, Philippe Mathieu-Daudé wrote:
> The TYPE_SPAPR_TCE_TABLE device is bus-less, thus isn't reset
> automatically.  Register a reset handler to get reset with the
> machine.
> 
> This doesn't seem to have caused problems, since it has been this way
> since the device was QDev'ified 8 years ago, in commit a83000f5e3f
> ("spapr-tce: make sPAPRTCETable a proper device").
> Still, correct it so that the API is used properly.

So, the reason this works now is that we explicitly call
device_reset() on the TCE table from the TCE table's "owner", either a
PHB (spapr_phb_reset()) or a VIO device (spapr_vio_quiesce_one()).

I think we want either that, or the register_reset(), not both.
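
A toy model of the concern, assuming a very simple reset registry: if the owner already resets the table and the table also registers its own handler, a single machine reset resets the table twice. Dev, Owner, register_reset(), machine_reset(), owner_reset() and table_reset() are hypothetical names for this sketch, not the QEMU reset API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device that just counts how often it has been reset. */
typedef struct Dev {
    int reset_count;
} Dev;

static void dev_reset(Dev *d)
{
    d->reset_count++;
}

/* Minimal reset registry: handlers run in registration order. */
typedef void (*ResetFn)(void *opaque);
#define MAX_HANDLERS 8
static struct {
    ResetFn fn;
    void *opaque;
} handlers[MAX_HANDLERS];
static size_t n_handlers;

static void register_reset(ResetFn fn, void *opaque)
{
    assert(n_handlers < MAX_HANDLERS);
    handlers[n_handlers].fn = fn;
    handlers[n_handlers].opaque = opaque;
    n_handlers++;
}

static void machine_reset(void)
{
    for (size_t i = 0; i < n_handlers; i++) {
        handlers[i].fn(handlers[i].opaque);
    }
}

/* The owner (think PHB or VIO device) resets the table it owns. */
typedef struct Owner {
    Dev *table;
} Owner;

static void owner_reset(void *opaque)
{
    Owner *o = opaque;

    dev_reset(o->table);
}

/* A handler registered by the table itself, as the patch proposes. */
static void table_reset(void *opaque)
{
    dev_reset(opaque);
}
```

With only owner_reset() registered, one machine_reset() resets the table once; registering table_reset() as well makes each machine reset hit the table twice, which is why one mechanism or the other is preferable.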

> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/ppc/spapr_iommu.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> index 24537ffcbd3..f7dad1dc0fe 100644
> --- a/hw/ppc/spapr_iommu.c
> +++ b/hw/ppc/spapr_iommu.c
> @@ -24,6 +24,7 @@
>  #include "sysemu/kvm.h"
>  #include "kvm_ppc.h"
>  #include "migration/vmstate.h"
> +#include "sysemu/reset.h"
>  #include "sysemu/dma.h"
>  #include "exec/address-spaces.h"
>  #include "trace.h"
> @@ -302,6 +303,11 @@ static const VMStateDescription vmstate_spapr_tce_table 
> = {
>  }
>  };
>  
> +static void spapr_tce_reset_handler(void *dev)
> +{
> +device_legacy_reset(DEVICE(dev));
> +}
> +
>  static void spapr_tce_table_realize(DeviceState *dev, Error **errp)
>  {
>  SpaprTceTable *tcet = SPAPR_TCE_TABLE(dev);
> @@ -324,6 +330,8 @@ static void spapr_tce_table_realize(DeviceState *dev, 
> Error **errp)
>  
>  vmstate_register(VMSTATE_IF(tcet), tcet->liobn, &vmstate_spapr_tce_table,
>   tcet);
> +
> +qemu_register_reset(spapr_tce_reset_handler, dev);
>  }
>  
>  void spapr_tce_set_need_vfio(SpaprTceTable *tcet, bool need_vfio)
> @@ -425,6 +433,8 @@ static void spapr_tce_table_unrealize(DeviceState *dev)
>  {
>  SpaprTceTable *tcet = SPAPR_TCE_TABLE(dev);
>  
> +qemu_unregister_reset(spapr_tce_reset_handler, dev);
> +
>  vmstate_unregister(VMSTATE_IF(tcet), &vmstate_spapr_tce_table, tcet);
>  
>  QLIST_REMOVE(tcet, list);

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH 5/7] hw: Have machines Kconfig-select FW_CFG

2021-04-26 Thread David Gibson

On Mon, Apr 26, 2021 at 09:35:18PM +0200, Philippe Mathieu-Daudé wrote:
> Beside the loongson3-virt machine (MIPS), the following machines
> also use the fw_cfg device:
> 
> - ARM: virt & sbsa-ref
> - HPPA: generic machine
> - X86: ACPI based (pc & microvm)
> - PPC64: various
> - SPARC: sun4m & sun4u
> 
> Add their FW_CFG Kconfig dependency.
> 
> Signed-off-by: Philippe Mathieu-Daudé 

ppc parts
Acked-by: David Gibson 

> ---
>  hw/arm/Kconfig | 2 ++
>  hw/hppa/Kconfig| 1 +
>  hw/i386/Kconfig| 2 ++
>  hw/ppc/Kconfig | 1 +
>  hw/sparc/Kconfig   | 1 +
>  hw/sparc64/Kconfig | 1 +
>  6 files changed, 8 insertions(+)
> 
> diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
> index 8c37cf00da7..3b2641e39dc 100644
> --- a/hw/arm/Kconfig
> +++ b/hw/arm/Kconfig
> @@ -8,6 +8,7 @@ config ARM_VIRT
>  imply TPM_TIS_SYSBUS
>  select ARM_GIC
>  select ACPI
> +select FW_CFG
>  select ARM_SMMUV3
>  select GPIO_KEY
>  select FW_CFG_DMA
> @@ -216,6 +217,7 @@ config SBSA_REF
>  select PL061 # GPIO
>  select USB_EHCI_SYSBUS
>  select WDT_SBSA
> +select FW_CFG
>  
>  config SABRELITE
>  bool
> diff --git a/hw/hppa/Kconfig b/hw/hppa/Kconfig
> index 22948db0256..45f40e09224 100644
> --- a/hw/hppa/Kconfig
> +++ b/hw/hppa/Kconfig
> @@ -14,3 +14,4 @@ config DINO
>  select LASIPS2
>  select PARALLEL
>  select ARTIST
> +select FW_CFG
> diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
> index 7f91f30877f..9e4039a2dce 100644
> --- a/hw/i386/Kconfig
> +++ b/hw/i386/Kconfig
> @@ -52,6 +52,7 @@ config PC_ACPI
>  select SMBUS_EEPROM
>  select PFLASH_CFI01
>  depends on ACPI_SMBUS
> +select FW_CFG
>  
>  config I440FX
>  bool
> @@ -106,6 +107,7 @@ config MICROVM
>  select ACPI_HW_REDUCED
>  select PCI_EXPRESS_GENERIC_BRIDGE
>  select USB_XHCI_SYSBUS
> +select FW_CFG
>  
>  config X86_IOMMU
>  bool
> diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
> index d11dc30509d..a7ba8283bf1 100644
> --- a/hw/ppc/Kconfig
> +++ b/hw/ppc/Kconfig
> @@ -131,6 +131,7 @@ config VIRTEX
>  # Only used by 64-bit targets
>  config FW_CFG_PPC
>  bool
> +select FW_CFG
>  
>  config FDT_PPC
>  bool
> diff --git a/hw/sparc/Kconfig b/hw/sparc/Kconfig
> index 8dcb10086fd..267bf45fa21 100644
> --- a/hw/sparc/Kconfig
> +++ b/hw/sparc/Kconfig
> @@ -15,6 +15,7 @@ config SUN4M
>  select STP2000
>  select CHRP_NVRAM
>  select OR_IRQ
> +select FW_CFG
>  
>  config LEON3
>  bool
> diff --git a/hw/sparc64/Kconfig b/hw/sparc64/Kconfig
> index 980a201bb73..c17b34b9d5b 100644
> --- a/hw/sparc64/Kconfig
> +++ b/hw/sparc64/Kconfig
> @@ -13,6 +13,7 @@ config SUN4U
>  select PCKBD
>  select SIMBA
>  select CHRP_NVRAM
> +select FW_CFG
>  
>  config NIAGARA
>  bool

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH 1/4] target/ppc: Code motion required to build disabling tcg

2021-04-26 Thread David Gibson

On Fri, Apr 23, 2021 at 10:28:14AM -0300, Fabiano Rosas wrote:
> David Gibson  writes:
> 
> > On Thu, Apr 22, 2021 at 04:35:34PM -0300, Fabiano Rosas wrote:
> >> Bruno Piazera Larsen  writes:
> >> 
> >> >> > You are correct! I've just tweaked the code that defines spr_register
> >> >> > and it should be working now. I'm still working on splitting the SPR
> >> >> > functions from translate_init, since I think it would make it easier to
> >> >> > prepare the !TCG case and to add new architectures in the future, and
> >> >> > I found a few more problems:
> >> >>
> >> >> Actually looking at the stuff below, I suspect that separating our
> >> >> "spr" logic specifically might be a bad idea.  At least some of the
> >> >> SPRs control pretty fundamental things about how the processor
> >> >> operates, and I suspect separating it from the main translation logic
> >> >> may be more trouble than it's worth.
> >> 
> >> I disagree with the code proximity argument. Having TCG code clearly
> >> separate from common code seems more important to me than having the SPR
> >> callbacks close to the init_proc functions.
> >
> > Hmm.. I may be misinterpreting what you're intending here.  I
> > certainly agree that separating TCG only code from common code is a
> > good idea.  My point, though, is that the vast majority of the SPR
> > code *is* TCG specific - there are just a relatively few cases where
> > SPRs have a common path.  That basically only happens when a) the SPR
> > can be affected by means other than the guest executing instructions
> > specifically to do that (i.e. usually by hypercalls) and b) accessing
> > the SPR has some side effects that need to be handled in both TCG and
> > KVM cases
> 
> The SPR code in translate_init.c.inc currently consists of:
> 
> 1) the gen_spr* functions that are called during init_proc for each
> processor type;

Ah... that's one part of the confusion.  I forgot about these
functions.  These should indeed be common, despite sharing the gen_*()
prefix with mostly things that are explicitly TCG only.

> 2) the spr_register macros and _spr_register function that adds the SPRs
> to env->spr, called from (1);
> 
> 3) the TCG-specific SPR read|write callbacks, registered by (2);
> 
> 4) the KVM specific attribute one_reg_id, registered by (2).
> 
> The intention is to have one .c file (cpu_init.c) that deals with
> processor initialization, which is mostly setting PowerPCCPUClass
> attributes and registering the appropriate SPRs for each processor
> family (1,2). We're considering that to be shared between KVM and TCG
> for now.

Yes, that's what I'd expect.

> What is going into a separate file are the read and write SPR callbacks,
> which are TCG specific (3). They are still referenced from the other
> file when registering the SPRs, but are ignored when building for
> KVM-only. These are kept in a TCG-only compilation unit.

Ah, right, I'd forgotten that many of the callbacks are in
translate_init.c not translate.c.  Indeed, those will have to move.

> There's still a
> decision to be made whether we should have a separate spr_tcg file for
> them, or move them into translate.c along with the rest of TCG code.

Ah, I see.  Ok, yes, in that case moving them to a new TCG only spr
file makes more sense to me.  translate.c is already enormous.

> 
> The one_reg_id is just one attribute so that does not change.
> 
> > From the descriptions it sounded like you were trying to separate
> > *all* SPR code, not just these specific cases from the translation
> > core, and that's what I'm saying is a bad idea.
> 
> So, if anything, the SPR callbacks are moving _closer_ to the
> translation core.

Right.  Sorry for the misunderstanding.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[Bug 1926231] [NEW] SCSI passthrough of SATA cdrom -> errors & performance issues

2021-04-26 Thread Michael Slade

Public bug reported:

qemu 5.0, compiled from git

I am passing through a SATA cdrom via SCSI passthrough, with this
libvirt XML:


  


  
  
  


It seems to mostly work, I have written discs with it, except I am
getting errors that cause reads to take about 5x as long as they should,
under certain circumstances.  It appears to be based on the guest's read
block size.

I found that if on the guest I run, say `dd if=$some_large_file
bs=262144|pv > /dev/null`, `iostat` and `pv` disagree about how much is
being read by a factor of about 2.  Also many kernel messages like this
happen on the guest:

[  190.919684] sr 0:0:0:0: [sr0] tag#160 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
[  190.919687] sr 0:0:0:0: [sr0] tag#160 Sense Key : Aborted Command [current]
[  190.919689] sr 0:0:0:0: [sr0] tag#160 Add. Sense: I/O process terminated
[  190.919691] sr 0:0:0:0: [sr0] tag#160 CDB: Read(10) 28 00 00 18 a5 5a 00 00 80 00
[  190.919694] blk_update_request: I/O error, dev sr0, sector 6460776 op 0x0:(READ) flags 0x80700 phys_seg 5 prio class 0

If I change to bs=131072 the errors stop and performance is normal.

(262144 happens to be the block size ultimately used by md5sum, which is
how I got here)

I also ran strace on the qemu process while it was happening, and
noticed SG_IO calls like this:

21748 10:06:29.330910 ioctl(22, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, cmd_len=10, cmdp="\x28\x00\x00\x12\x95\x5a\x00\x00\x80\x00", mx_sb_len=252, iovec_count=0, dxfer_len=262144, timeout=4294967295, flags=SG_FLAG_DIRECT_IO
21751 10:06:29.330976 ioctl(22, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, cmd_len=10, cmdp="\x28\x00\x00\x12\x94\xda\x00\x00\x02\x00", mx_sb_len=252, iovec_count=0, dxfer_len=4096, timeout=4294967295, flags=SG_FLAG_DIRECT_IO
21749 10:06:29.331586 ioctl(22, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, cmd_len=10, cmdp="\x28\x00\x00\x12\x94\xdc\x00\x00\x02\x00", mx_sb_len=252, iovec_count=0, dxfer_len=4096, timeout=4294967295, flags=SG_FLAG_DIRECT_IO
[etc]

I suspect qemu is the culprit because I have tried a 4.19 guest kernel
as well as a 5.9 one, with the same result.
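
For anyone cross-checking the numbers, the Read(10) CDB from the kernel log above can be decoded by hand. The following is only a minimal sketch: the helper name `decode_read10` is made up for illustration, and the 2048-byte logical block size is an assumption (it is the usual size for data CDs, and it matches dxfer_len=262144 for a transfer length of 0x80 blocks in the strace output).

```python
def decode_read10(cdb: bytes, block_size: int = 2048) -> dict:
    """Decode a SCSI READ(10) CDB (opcode 0x28).

    block_size is an assumption: 2048 bytes per logical block.
    """
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7-8: transfer length in blocks
    return {
        "lba": lba,
        "blocks": blocks,
        "bytes": blocks * block_size,
        # The block layer reports positions in 512-byte sectors.
        "sector_512": lba * block_size // 512,
    }


# CDB taken from the guest kernel message quoted above.
failed = decode_read10(bytes.fromhex("28 00 00 18 a5 5a 00 00 80 00"))
print(failed)
```

Under that assumption the failing command asks for 128 blocks (262144 bytes) starting at LBA 1615194, which is 512-byte sector 6460776, the sector named in the blk_update_request line.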

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1926231

Title:
  SCSI passthrough of SATA cdrom -> errors & performance issues

Status in QEMU:
  New


Re: [PATCH] skip virtio fs cache section to enable NIC pass through

2021-04-26 Thread Dev Audsin

virtio-fs with DAX is currently not compatible with NIC passthrough. The VM
fails to boot when the DAX cache is enabled and an SR-IOV VF is attached.
This patch solves the problem, so that the DAX cache and an SR-IOV VF can be
used together.

When an SR-IOV VF attaches to a qemu process, vfio tries to pin the
entire DAX window, but the window is empty when the guest boots, so the
pinning fails.
One way to make VFIO and DAX work together is to make vfio skip the DAX
cache.
Currently the DAX cache needs to be set to 0 for the SR-IOV VF to be attached
to Kata containers.
Enabling SR-IOV VF and DAX together will potentially improve
performance for workloads which are I/O and network intensive.

On Mon, Apr 26, 2021 at 9:24 PM Dev Audsin  wrote:

> Signed-off-by: Dev Audsin 
> ---
>  hw/vfio/common.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 6ff1daa763..3af70238bd 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -541,7 +541,8 @@ static int vfio_host_win_del(VFIOContainer *container,
> hwaddr min_iova,
>
>  static bool vfio_listener_skipped_section(MemoryRegionSection *section)
>  {
> -return (!memory_region_is_ram(section->mr) &&
> +return (!strcmp(memory_region_name(section->mr), "virtio-fs-cache"))
> ||
> +  (!memory_region_is_ram(section->mr) &&
>  !memory_region_is_iommu(section->mr)) ||
> /*
>  * Sizing an enabled 64-bit BAR can cause spurious mappings to
> --
> 2.25.1
>

Re: [PATCH] skip virtio fs cache section to enable NIC passthrough

2021-04-26 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20210426200553.145976-2-dev.devaq...@gmail.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210426200553.145976-2-dev.devaq...@gmail.com
Subject: [PATCH] skip virtio fs cache section to enable NIC passthrough

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20210426200553.145976-2-dev.devaq...@gmail.com -> 
patchew/20210426200553.145976-2-dev.devaq...@gmail.com
Switched to a new branch 'test'
120a1b0 skip virtio fs cache section to enable NIC passthrough

=== OUTPUT BEGIN ===
ERROR: code indent should never use tabs
#23: FILE: hw/vfio/common.c:540:
+^I   (!memory_region_is_ram(section->mr) &&$

total: 1 errors, 0 warnings, 9 lines checked

Commit 120a1b0ec9de (skip virtio fs cache section to enable NIC passthrough) 
has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20210426200553.145976-2-dev.devaq...@gmail.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[PATCH] skip virtio fs cache section to enable NIC pass through

2021-04-26 Thread Dev Audsin

 Signed-off-by: Dev Audsin 
---
 hw/vfio/common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6ff1daa763..3af70238bd 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -541,7 +541,8 @@ static int vfio_host_win_del(VFIOContainer *container,
hwaddr min_iova,

 static bool vfio_listener_skipped_section(MemoryRegionSection *section)
 {
-return (!memory_region_is_ram(section->mr) &&
+return (!strcmp(memory_region_name(section->mr), "virtio-fs-cache")) ||
+  (!memory_region_is_ram(section->mr) &&
 !memory_region_is_iommu(section->mr)) ||
/*
 * Sizing an enabled 64-bit BAR can cause spurious mappings to
-- 
2.25.1

[no subject]

2021-04-26 Thread Dev Audsin



virtio-fs with DAX is currently not compatible with NIC passthrough.
When an SR-IOV VF attaches to a qemu process, vfio tries to pin the entire
DAX window, but the window is empty when the guest boots, so the pinning fails.
One way to make VFIO and DAX work together is to make vfio skip the DAX cache.
Currently the DAX cache needs to be set to 0 for the SR-IOV VF to be attached to
Kata containers.
Enabling SR-IOV VF and DAX together will potentially improve
performance for workloads which are I/O and network intensive.

[PATCH] skip virtio fs cache section to enable NIC passthrough

2021-04-26 Thread Dev Audsin

Signed-off-by: Dev Audsin 
---
 hw/vfio/common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6ff1daa763..3af70238bd 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -541,7 +541,8 @@ static int vfio_host_win_del(VFIOContainer *container, 
hwaddr min_iova,
 
 static bool vfio_listener_skipped_section(MemoryRegionSection *section)
 {
-return (!memory_region_is_ram(section->mr) &&
+return (!strcmp(memory_region_name(section->mr), "virtio-fs-cache")) ||
+  (!memory_region_is_ram(section->mr) &&
 !memory_region_is_iommu(section->mr)) ||
/*
 * Sizing an enabled 64-bit BAR can cause spurious mappings to
-- 
2.25.1

Re: [PATCH] make vfio and DAX cache work together

2021-04-26 Thread Dev Audsin

Hi Alex and David

@Alex:

Justification for why this region cannot be a DMA target for the device:

virtio-fs with DAX is currently not compatible with NIC passthrough.
When an SR-IOV VF attaches to a qemu process, vfio tries to pin the
entire DAX window, but the window is empty when the guest boots, so the
pinning fails. One way to make VFIO and DAX work together is to make
vfio skip the DAX cache.

Currently the DAX cache needs to be set to 0 for the SR-IOV VF to be
attached to Kata containers. Enabling SR-IOV VF and DAX together
will potentially improve performance for workloads which are
I/O and network intensive.

@David

1. If DAX mode of virtiofs isn't yet in qemu, which project would be the
best place to discuss this?
2a. Regarding your comment on hard-coding the name: I am referring to
the region by its name, which is initialised in
hw/virtio/vhost-user-fs.c. I downloaded the source code of the qemu tree
with virtiofs support that the Kata Containers project references and
analysed it. I see the following code snippet in
hw/virtio/vhost-user-fs.c:

    if (fs->conf.cache_size) {
        /* Anonymous, private memory is not counted as overcommit */
        cache_ptr = mmap(NULL, fs->conf.cache_size, DAX_WINDOW_PROT,
                         MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
        if (cache_ptr == MAP_FAILED) {
            error_setg(errp, "Unable to mmap blank cache");
            return;
        }

        memory_region_init_ram_ptr(&fs->cache, OBJECT(vdev),
                                   "virtio-fs-cache",
                                   fs->conf.cache_size, cache_ptr);
    }

In the above code snippet, the memory region is initialised with the
name "virtio-fs-cache", which is the name I refer to in my patch.

2b. Regarding the need for a way for the cache to declare that it wants
to be omitted: I am not sure that's what is needed. virtio-fs with
DAX is currently not compatible with vfio. I want to overcome this
problem by making vfio not use the cache.
What I want is for the cache to be used for purposes other than the VFIO
device. For example, in my deployment scenario, I want the DAX cache not
to be used by the SR-IOV VF (which is a VFIO device) but to be used by
the rest of the system.

3. Moved to vfio_listener_skipped_section() and patch resubmitted.

Best
Dev

Re: [PATCH 12/22] qapi/parser: add type hint annotations

2021-04-26 Thread John Snow


On 4/25/21 8:34 AM, Markus Armbruster wrote:

value: object isn't wrong, but why not _ExprValue?



Updated excuse:

because all the way back outside in _parse, we know that:

1. expr is a dict (because of get_expr(False))
2. expr['pragma'] is also a dict, because we explicitly check it there.
3. We iterate over the keys; all we know so far is that the values are 
... something.

4. _pragma()'s job is to validate the type(s) anyway.

More or less, the _ExprValue type union isn't remembered here -- even 
though it was once upon a time something returned by get_expr, it 
happened in a nested call that is now opaque to mypy in this context.


So, it's some combination of "That's all we know about it" and "It 
happens to be exactly sufficient for this function to operate."


--js

Re: [PATCH 11/22] qapi/parser: Rework _check_pragma_list_of_str as a TypeGuard

2021-04-26 Thread John Snow


On 4/25/21 8:32 AM, Markus Armbruster wrote:

John Snow  writes:


TypeGuards won't exist in Python proper until 3.10. Ah well. We can hack
up our own by declaring this function to return the type we claim it
checks for and using this to safely downcast object -> List[str].

In so doing, I bring this function in-line under _pragma so it can use
the 'info' object in its closure. Having done this, _pragma also now
no longer needs to take a 'self' parameter, so drop it.

Rename it to just _check(), to help us out with the line-length -- and
now that it's contained within _pragma, it is contextually easier to see
how it's used anyway -- especially with types.
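
The downcast idiom described above can be sketched outside the QAPI parser as follows. This is an illustrative sketch only: `PragmaError` and `check_list_of_str` are hypothetical stand-ins for QAPISemError and the nested _check() helper, and no 'info' closure is modelled.

```python
from typing import List


class PragmaError(Exception):
    """Hypothetical stand-in for QAPISemError."""


def check_list_of_str(name: str, value: object) -> List[str]:
    """Validate at runtime, then return the value under a narrower type.

    The declared return type is what lets the caller treat the result
    as List[str] instead of a plain object.
    """
    if (not isinstance(value, list)
            or any(not isinstance(elt, str) for elt in value)):
        raise PragmaError("pragma %s must be a list of strings" % name)
    return value


# The caller assigns the checked result rather than the unchecked input.
exceptions = check_list_of_str("command-name-exceptions", ["query-foo"])

try:
    check_list_of_str("member-name-exceptions", ["ok", 42])
    rejected = False
except PragmaError:
    rejected = True
```

Because the assignment uses the function's return value, a type checker should see a List[str] on the left-hand side without any cast at the call site.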

Signed-off-by: John Snow 

---

I left (name, value) as args to avoid creating a fully magic "macro",
though, I thought this was too weird:

 info.pragma.foobar = _check()

and it looked more reasonable as:

 info.pragma.foobar = _check(name, value)

Signed-off-by: John Snow 
---
  scripts/qapi/parser.py | 26 +-
  1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
index 16fd36f8391..d02a134aae9 100644
--- a/scripts/qapi/parser.py
+++ b/scripts/qapi/parser.py
@@ -17,6 +17,7 @@
  from collections import OrderedDict
  import os
  import re
+from typing import List
  
  from .common import match_nofail

  from .error import QAPISemError, QAPISourceError
@@ -151,28 +152,27 @@ def _include(include, info, incl_fname, 
previously_included):
  ) from err
  
  @staticmethod

-def _check_pragma_list_of_str(name, value, info):
-if (not isinstance(value, list)
-or any([not isinstance(elt, str) for elt in value])):
-raise QAPISemError(
-info,
-"pragma %s must be a list of strings" % name)
+def _pragma(name, value, info):
+
+def _check(name, value) -> List[str]:
+if (not isinstance(value, list) or
+any([not isinstance(elt, str) for elt in value])):
+raise QAPISemError(
+info,
+"pragma %s must be a list of strings" % name)
+return value
  
-def _pragma(self, name, value, info):

  if name == 'doc-required':
  if not isinstance(value, bool):
  raise QAPISemError(info,
 "pragma 'doc-required' must be boolean")
  info.pragma.doc_required = value
  elif name == 'command-name-exceptions':
-self._check_pragma_list_of_str(name, value, info)
-info.pragma.command_name_exceptions = value
+info.pragma.command_name_exceptions = _check(name, value)
  elif name == 'command-returns-exceptions':
-self._check_pragma_list_of_str(name, value, info)
-info.pragma.command_returns_exceptions = value
+info.pragma.command_returns_exceptions = _check(name, value)
  elif name == 'member-name-exceptions':
-self._check_pragma_list_of_str(name, value, info)
-info.pragma.member_name_exceptions = value
+info.pragma.member_name_exceptions = _check(name, value)
  else:
  raise QAPISemError(info, "unknown pragma '%s'" % name)


While I appreciate the terseness, I'm not sure I like the generic name
_check() for checking one of two special cases, namely "list of string".
The other case being "boolean".  We could acquire more cases later.



Yeah, sorry, just trying to make the line fit ...

The important thing is that we need to make sure this routine returns 
some known type. It's just that the block down here has very long lines.


Recommendations?

Re: [PATCH 1/1] amd_iommu: fix wrong MMIO operations

2021-04-26 Thread Michael S. Tsirkin

On Mon, Apr 26, 2021 at 10:21:54AM +0200, Roman Kapl wrote:
> Address was swapped with value when writing MMIO registers, so the user
> saw garbage in a lot of cases. The interrupt status was not correctly set.
> 
> Signed-off-by: Roman Kapl 

Ouch. This API is just inconsistent, everyone else
uses addr, value in this order. How about fixing the
function signature instead?


> ---
>  hw/i386/amd_iommu.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
> index 74a93a5d93..bb5ce8c04d 100644
> --- a/hw/i386/amd_iommu.c
> +++ b/hw/i386/amd_iommu.c
> @@ -141,13 +141,13 @@ static bool amdvi_test_mask(AMDVIState *s, hwaddr addr, 
> uint64_t val)
>  /* OR a 64-bit register with a 64-bit value storing result in the register */
>  static void amdvi_assign_orq(AMDVIState *s, hwaddr addr, uint64_t val)
>  {
> -amdvi_writeq_raw(s, addr, amdvi_readq(s, addr) | val);
> +amdvi_writeq_raw(s, amdvi_readq(s, addr) | val, addr);
>  }
>  
>  /* AND a 64-bit register with a 64-bit value storing result in the register 
> */
>  static void amdvi_assign_andq(AMDVIState *s, hwaddr addr, uint64_t val)
>  {
> -   amdvi_writeq_raw(s, addr, amdvi_readq(s, addr) & val);
> +   amdvi_writeq_raw(s, amdvi_readq(s, addr) & val, addr);
>  }
>  
>  static void amdvi_generate_msi_interrupt(AMDVIState *s)
> @@ -382,7 +382,7 @@ static void amdvi_completion_wait(AMDVIState *s, uint64_t 
> *cmd)
>  }
>  /* set completion interrupt */
>  if (extract64(cmd[0], 1, 1)) {
> -amdvi_test_mask(s, AMDVI_MMIO_STATUS, AMDVI_MMIO_STATUS_COMP_INT);
> +amdvi_assign_orq(s, AMDVI_MMIO_STATUS, AMDVI_MMIO_STATUS_COMP_INT);
>  /* generate interrupt */
>  amdvi_generate_msi_interrupt(s);
>  }
> -- 
> 2.20.1
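
The argument-order hazard discussed in this thread can be reproduced with a toy model. The sketch below is not the QEMU code: the `RegisterFile` class, the offset 0x20 and the bit value are hypothetical, and only the (value, address) parameter order of the raw write helper is taken from the patch.

```python
class RegisterFile:
    """Toy little-endian 64-bit register file (illustration only)."""

    def __init__(self, size: int) -> None:
        self.mem = bytearray(size)

    def readq(self, addr: int) -> int:
        return int.from_bytes(self.mem[addr:addr + 8], "little")

    def writeq_raw(self, val: int, addr: int) -> None:
        # Value first, address second: the order the patch has to match.
        self.mem[addr:addr + 8] = val.to_bytes(8, "little")

    def assign_orq_buggy(self, addr: int, val: int) -> None:
        # Arguments swapped: the address is stored as data at offset (old | val).
        self.writeq_raw(addr, self.readq(addr) | val)

    def assign_orq_fixed(self, addr: int, val: int) -> None:
        self.writeq_raw(self.readq(addr) | val, addr)


STATUS = 0x20       # hypothetical register offset
COMP_INT = 1 << 2   # hypothetical status bit

regs = RegisterFile(0x100)
regs.assign_orq_buggy(STATUS, COMP_INT)
buggy_status = regs.readq(STATUS)     # status register is left untouched
stray_write = regs.readq(COMP_INT)    # the address value landed at offset 4

regs = RegisterFile(0x100)
regs.assign_orq_fixed(STATUS, COMP_INT)
fixed_status = regs.readq(STATUS)
```

Both parameters are plain integers, so nothing flags the swap at the call site. Reordering the helper's signature to (addr, val), as suggested in the reply, would make it consistent with the other helpers and harder to misuse.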

Re: [PATCH 03/22] qapi/source: Remove line number from QAPISourceInfo initializer

2021-04-26 Thread John Snow


On 4/24/21 2:38 AM, Markus Armbruster wrote:

Mixing f-string and % interpolation.  I doubt we'd write it this way
from scratch.  I recommend to either stick to % for now (leave
conversion to f-strings for later), or convert the column formatting,
too, even though it's not related to the patch's purpose.


True. Two thoughts:

1. I don't like using % formatting because it behaves differently from 
.format() and f-strings. My overwhelming desire is to never use it for 
this reason.


Example: {foo} will call foo's __format__ method, whereas "%s" % foo 
will simply add str(foo). They are not always the same, not even for 
built-in Python objects.
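
A small sketch of that difference, using a made-up `Column` class whose __str__ and __format__ deliberately disagree, plus a built-in tuple:

```python
class Column:
    """Hypothetical class whose __str__ and __format__ differ on purpose."""

    def __init__(self, number: int) -> None:
        self.number = number

    def __str__(self) -> str:
        return "column %d" % self.number

    def __format__(self, spec: str) -> str:
        return "col#%d" % self.number


col = Column(7)
percent_style = "%s" % col        # goes through str(col)
format_style = "{}".format(col)   # goes through col.__format__("")
fstring_style = f"{col}"          # goes through col.__format__("")

# Built-in example: % treats a tuple as its argument list.
pair = (1,)
tuple_percent = "%s" % pair
tuple_fstring = f"{pair}"

print(percent_style, format_style, fstring_style)
print(tuple_percent, tuple_fstring)
```

Here the % form yields "column 7" while the other two yield "col#7", and the one-element tuple renders as "1" with % but as "(1,)" in an f-string.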



2. Cleaning up the formatting here without cleaning it up everywhere is 
a great way to get the patch NACKed. You have in the past been fairly 
reluctant to "While we're here" cleanups, so I am trying to cut back on 
them.



This is why my habit for f-strings keeps trickling in: whenever I have 
to rewrite any interpolation, I reach for the one that behaves most 
idiomatically for Python 3. I am trying to balance that against churn 
that's not in the stated goals of the patch.


In this case: I'll clean the rest of the method to match; and add a note 
to the commit message that explains why. I will get around to removing 
all of the f-strings, but I want to hit the clean linter baseline first 
to help guide the testing for such a series. I regret the awkward 
transitional period.


--js

Re: [PATCH 5/7] hw: Have machines Kconfig-select FW_CFG

2021-04-26 Thread BALATON Zoltan


On Mon, 26 Apr 2021, Philippe Mathieu-Daudé wrote:

Besides the loongson3-virt machine (MIPS), the following machines
also use the fw_cfg device:

- ARM: virt & sbsa-ref
- HPPA: generic machine
- X86: ACPI based (pc & microvm)
- PPC64: various
- SPARC: sun4m & sun4u

Add their FW_CFG Kconfig dependency.

Signed-off-by: Philippe Mathieu-Daudé 
---
hw/arm/Kconfig | 2 ++
hw/hppa/Kconfig| 1 +
hw/i386/Kconfig| 2 ++
hw/ppc/Kconfig | 1 +
hw/sparc/Kconfig   | 1 +
hw/sparc64/Kconfig | 1 +
6 files changed, 8 insertions(+)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 8c37cf00da7..3b2641e39dc 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -8,6 +8,7 @@ config ARM_VIRT
imply TPM_TIS_SYSBUS
select ARM_GIC
select ACPI
+select FW_CFG
select ARM_SMMUV3
select GPIO_KEY
select FW_CFG_DMA
@@ -216,6 +217,7 @@ config SBSA_REF
select PL061 # GPIO
select USB_EHCI_SYSBUS
select WDT_SBSA
+select FW_CFG

config SABRELITE
bool
diff --git a/hw/hppa/Kconfig b/hw/hppa/Kconfig
index 22948db0256..45f40e09224 100644
--- a/hw/hppa/Kconfig
+++ b/hw/hppa/Kconfig
@@ -14,3 +14,4 @@ config DINO
select LASIPS2
select PARALLEL
select ARTIST
+select FW_CFG
diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index 7f91f30877f..9e4039a2dce 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -52,6 +52,7 @@ config PC_ACPI
select SMBUS_EEPROM
select PFLASH_CFI01
depends on ACPI_SMBUS
+select FW_CFG

config I440FX
bool
@@ -106,6 +107,7 @@ config MICROVM
select ACPI_HW_REDUCED
select PCI_EXPRESS_GENERIC_BRIDGE
select USB_XHCI_SYSBUS
+select FW_CFG

config X86_IOMMU
bool
diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
index d11dc30509d..a7ba8283bf1 100644
--- a/hw/ppc/Kconfig
+++ b/hw/ppc/Kconfig
@@ -131,6 +131,7 @@ config VIRTEX
# Only used by 64-bit targets
config FW_CFG_PPC
bool
+select FW_CFG


Why do we need a separate config option here if all it does is select 
FW_CFG and also in meson.build it only seems to add fw_cfg.c? (Unlike 
FW_CFG_DMA which seems to add other files so another option makes sense 
for that case). Could we just use FW_CFG directly and drop the PPC 
specific option like you did for MIPS?


Also the comment saying "Only used by 64-bit targets" seems to be wrong as 
it is also selected by MAC_OLDWORLD that's definitely a 32-bit machine 
(and MAC_NEWWORLD that can be both 32 or 64 bit) so maybe this option used 
to do something previously but now seems to be equivalent to just FW_CFG. 
So could it be dropped and use FW_CFG instead to simplify this or what's 
the reason to keep a PPC specific option for it?


Regards,
BALATON Zoltan



config FDT_PPC
bool
diff --git a/hw/sparc/Kconfig b/hw/sparc/Kconfig
index 8dcb10086fd..267bf45fa21 100644
--- a/hw/sparc/Kconfig
+++ b/hw/sparc/Kconfig
@@ -15,6 +15,7 @@ config SUN4M
select STP2000
select CHRP_NVRAM
select OR_IRQ
+select FW_CFG

config LEON3
bool
diff --git a/hw/sparc64/Kconfig b/hw/sparc64/Kconfig
index 980a201bb73..c17b34b9d5b 100644
--- a/hw/sparc64/Kconfig
+++ b/hw/sparc64/Kconfig
@@ -13,6 +13,7 @@ config SUN4U
select PCKBD
select SIMBA
select CHRP_NVRAM
+select FW_CFG

config NIAGARA
bool

[Bug 1926202] Re: qemu-user can't run some ppc binaries

2021-04-26 Thread Laurent Vivier

This is not a regression (reproduced with 5.2 and 5.1)

  IN: strlen
  0x1000d780:  7d2a03f8  cmpb r10, r9, r0

  OP:
   ld_i32 tmp0,env,$0xfff0
   brcond_i32 tmp0,$0x0,lt,$L0

    1000d780
   mov_i32 nip,$0x1000d780
   mov_i32 tmp0,$0x60
   mov_i32 tmp4,$0x21
   call raise_exception_err,$0x2,$0,env,tmp0,tmp4
   exit_tb $0x0
   set_label $L0
   exit_tb $0x7efd50022283

"cmpb" is defined in ISA 2.05, but qemu-ppc (32bit) defaults to a
PowerPC 750, which does not implement ISA 2.05.

It doesn't seem QEMU supports ISA 2.05 for any 32bit PowerPC (only
POWER7 and above, that are 64bit processors).
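
The instruction word from the trace above can be checked by splitting it into its X-form fields. This is a minimal sketch; the helper name `decode_x_form` is made up for illustration, and the field layout is the standard 32-bit Power X-form.

```python
def decode_x_form(insn: int) -> dict:
    """Split a 32-bit Power instruction word into X-form fields."""
    return {
        "opcode": (insn >> 26) & 0x3F,   # primary opcode, 6 bits
        "rs": (insn >> 21) & 0x1F,       # source register
        "ra": (insn >> 16) & 0x1F,       # destination register for cmpb
        "rb": (insn >> 11) & 0x1F,       # second source register
        "xo": (insn >> 1) & 0x3FF,       # extended opcode, 10 bits
        "rc": insn & 0x1,                # record bit
    }


# Instruction word at 0x1000d780 in the "IN: strlen" dump.
fields = decode_x_form(0x7D2A03F8)
print(fields)
```

The result is primary opcode 31 with extended opcode 508, RA=10, RS=9, RB=0, which corresponds to the "cmpb r10, r9, r0" shown in the disassembly.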

** Tags removed: linux-user

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1926202

Title:
  qemu-user can't run some ppc binaries

Status in QEMU:
  New

Bug description:
  qemu-user v6.0.0-rc5, built in static mode, will crash for certain ppc
  binaries.  It seems to have something to do with glibc for some Centos
  versions.  The problem is easiest to see with statically-linked
  binaries.

  The attached Dockerfile shows how to produce a ppc binary that will
  crash qemu-user.  Here is how to reproduce the problem:

  $ uname -m
  x86_64

  $ docker run --rm --privileged multiarch/qemu-user-static --reset -p
  yes

  $ docker build -t qemu-bug:centos -f Dockerfile.centos .

  $ docker run --rm -it -v$PWD:$PWD -w$PWD qemu-bug:centos cp
  /helloworld-centos.static.ppc .

  $ qemu-ppc-static --version
  qemu-ppc version 5.2.95 (v6.0.0-rc5)
  Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers

  $ qemu-ppc-static ./helloworld-centos.static.ppc
  emu: uncaught target signal 4 (Illegal instruction) - core dumped
  [1]16678 illegal hardware instruction (core dumped)  qemu-ppc-static 
./helloworld-centos.static.ppc

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1926202/+subscriptions

Re: [PATCH] make vfio and DAX cache work together

2021-04-26 Thread Alex Williamson

On Mon, 26 Apr 2021 21:50:38 +0100
Dev Audsin  wrote:

> Hi Alex and David
> 
> @Alex:
> 
> Justification on why this region cannot be a DMA target for the device,
> 
> virtio-fs with DAX is currently not compatible with NIC Pass through.
> When a SR-IOV VF attaches to a qemu process, vfio will try to pin the
> entire DAX Window but it is empty when the guest boots and will fail.
> A method to make VFIO and DAX to work together is to make vfio skip
> DAX cache.
> 
> Currently DAX cache need to be set to 0, for the SR-IOV VF to be
> attached to Kata containers. Enabling both SR-IOV VF and DAX work
> together will potentially improve performance for workloads which are
> I/O and network intensive.

Sorry, there's no actual justification described here.  You're enabling
a VM with both features, virtio-fs DAX and VFIO, but there's no
evidence that they "work together" or that your use case is simply
avoiding a scenario where the device might attempt to DMA into the area
with this designation.  With this change, if the device were to attempt
to DMA into this region, it would be blocked by the IOMMU, which might
result in a data loss within the VM.  Justification of this change
needs to prove that this region can never be a DMA target for the
device, not simply that both features can be enabled and we hope that
they don't interact.  Thanks,

Alex

Re: [RFC PATCH 3/4] target/ppc: Move SPR generation to separate file

2021-04-26 Thread Fabiano Rosas

"Bruno Larsen (billionai)"  writes:

> This move is required to enable building without TCG.
> All the logic related to registering SPRs specific to
> some architectures or machines has been hidden in this
> new file.

Hm... I thought we ended up deciding to keep the gen_spr_
functions in translate_init.c.inc (cpu_init.c). I don't strongly favour
one way or the other, but one alternative would be to just rename the
gen_spr_ functions to add_sprs_ or even
register__sprs and leave them where they are.

> The idea of this final patch is to hide all SPR generation from
> translate_init, but in an effort to simplify the RFC the 4
> functions for not_implemented SPRs were created. They'll be
> substituted by gen_spr__misc in reusable ways for the
> final patch.

I'd expect this patch to be just a big removal of gen_spr* from
translate_init.c.inc and their addition into spr_common.c. So any other
prep work should come in separate patches earlier in the
series. Specifically, at least one patch for the macro work and another
for the refactoring of open coded spr_register calls into gen_spr_*
functions.

> another issue we ran into was vscr_init using static functions
> means it has to be static, so we had to remove them from 
> gen_spr_74xx and gen_spr_book3s_altivec, and have them in
> the init_procs instead.

Looks like moving vscr_init out, along with a more detailed explanation
of the issue could be in another preliminary change.

>
> Finally, SPR_NOACCESS had to be defined in internal.h, as it
> is used by spr_common, translate_init and translate. If there
> is a better solution, I'll be happy to implement it.
>
> As for the redundant code complaint this patch will get, it has only
> been moved, so I don't know if I can remove that code
>
> Signed-off-by: Bruno Larsen (billionai) 
> ---
>  target/ppc/internal.h   |  108 +
>  target/ppc/meson.build  |1 +
>  target/ppc/spr_common.c | 2943 ++
>  target/ppc/translate_init.c.inc | 4031 ++-
>  4 files changed, 3314 insertions(+), 3769 deletions(-)
>  create mode 100644 target/ppc/spr_common.c
>
> diff --git a/target/ppc/internal.h b/target/ppc/internal.h
> index de78c23717..25df546eae 100644
> --- a/target/ppc/internal.h
> +++ b/target/ppc/internal.h
> @@ -226,4 +226,112 @@ void destroy_ppc_opcodes(PowerPCCPU *cpu);
>  void ppc_gdb_init(CPUState *cs, PowerPCCPUClass *ppc);
>  gchar *ppc_gdb_arch_name(CPUState *cs);
>  
> +/* spr-common.c */
> +#include "cpu.h"
> +void gen_spr_generic(CPUPPCState *env);

The fact that these are called gen_* is confusing since they don't
really generate anything. They mostly just add SPRs to the list and
register the SPR rw callbacks for TCG. Maybe we could rename them at the
end of the series to something clearer.

> +void gen_spr_ne_601(CPUPPCState *env);
> +void gen_spr_sdr1(CPUPPCState *env);
> +void gen_low_BATs(CPUPPCState *env);
> +void gen_high_BATs(CPUPPCState *env);
> +void gen_tbl(CPUPPCState *env);
> +void gen_6xx_7xx_soft_tlb(CPUPPCState *env, int nb_tlbs, int nb_ways);
> +void gen_spr_G2_755(CPUPPCState *env);
> +void gen_spr_7xx(CPUPPCState *env);
> +#ifdef TARGET_PPC64
> +void gen_spr_amr(CPUPPCState *env);
> +void gen_spr_iamr(CPUPPCState *env);
> +#endif /* TARGET_PPC64 */
> +void gen_spr_thrm(CPUPPCState *env);
> +void gen_spr_604(CPUPPCState *env);
> +void gen_spr_603(CPUPPCState *env);
> +void gen_spr_G2(CPUPPCState *env);
> +void gen_spr_602(CPUPPCState *env);
> +void gen_spr_601(CPUPPCState *env);
> +void gen_spr_74xx(CPUPPCState *env);
> +void gen_l3_ctrl(CPUPPCState *env);
> +void gen_74xx_soft_tlb(CPUPPCState *env, int nb_tlbs, int nb_ways);
> +void gen_spr_not_implemented(CPUPPCState *env,
> + int num, const char *name);
> +void gen_spr_not_implemented_ureg(CPUPPCState *env,
> +  int num, const char *name);
> +void gen_spr_not_implemented_no_write(CPUPPCState *env,
> +  int num, const char *name);
> +void gen_spr_not_implemented_write_nop(CPUPPCState *env,
> +   int num, const char *name);
> +void gen_spr_PSSCR(CPUPPCState *env);
> +void gen_spr_TIDR(CPUPPCState *env);
> +void gen_spr_pvr(CPUPPCState *env, PowerPCCPUClass *pcc);
> +void gen_spr_svr(CPUPPCState *env, PowerPCCPUClass *pcc);
> +void gen_spr_pir(CPUPPCState *env);
> +void gen_spr_spefscr(CPUPPCState *env);
> +void gen_spr_l1fgc(CPUPPCState *env, int num, int initial_value);
> +void gen_spr_hid0(CPUPPCState *env);
> +void gen_spr_mas73(CPUPPCState *env);
> +void gen_spr_mmucsr0(CPUPPCState *env);
> +void gen_spr_l1csr0(CPUPPCState *env);
> +void gen_spr_l1csr1(CPUPPCState *env);
> +void gen_spr_l2csr0(CPUPPCState *env);
> +void gen_spr_usprg3(CPUPPCState *env);
> +void gen_spr_usprgh(CPUPPCState *env);
> +void gen_spr_BookE(CPUPPCState *env, uint64_t ivor_mask);
> +uint32_t gen_tlbncfg(uint32_t assoc, uint32_t minsize,
> +

Re: [PATCH] skip virtio fs cache section to enable NIC pass through

2021-04-26 Thread Alex Williamson

On Mon, 26 Apr 2021 21:27:52 +0100
Dev Audsin  wrote:

>  virtio-fs with DAX is currently not compatible with NIC pass-through. The VM
> fails to boot when the DAX cache is enabled and an SR-IOV VF is attached.
> This patch solves the problem. Hence, the DAX cache and an SR-IOV VF can be
> attached together.
> 
> When an SR-IOV VF attaches to a qemu process, vfio will try to pin the
> entire DAX window, but it is empty when the guest boots, so pinning fails.
> One way to make VFIO and DAX work together is to make vfio skip the DAX
> cache.
> Currently the DAX cache needs to be set to 0 for the SR-IOV VF to be attached
> to Kata containers.
> Enabling SR-IOV VF and DAX to work together will potentially improve
> performance for workloads which are I/O and network intensive.

Please work on your patch email tooling, this is not how to provide a
commit log.

Also, this is not a qemu-trivial candidate imo.  A qemu-trivial patch
should be obviously correct, not just simple in mechanics.  It's not
obvious to me that simply skipping a region by name to avoid an
incompatibility is correct.

> On Mon, Apr 26, 2021 at 9:24 PM Dev Audsin  wrote:
> 
> > Signed-off-by: Dev Audsin 
> > ---
> >  hw/vfio/common.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index 6ff1daa763..3af70238bd 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -541,7 +541,8 @@ static int vfio_host_win_del(VFIOContainer *container, hwaddr min_iova,
> >
> >  static bool vfio_listener_skipped_section(MemoryRegionSection *section)
> >  {
> > -return (!memory_region_is_ram(section->mr) &&
> > +return (!strcmp(memory_region_name(section->mr), "virtio-fs-cache")) ||
> > +  (!memory_region_is_ram(section->mr) &&
> >  !memory_region_is_iommu(section->mr)) ||
> > /*
> >  * Sizing an enabled 64-bit BAR can cause spurious mappings to
> > --
> > 2.25.1
> >  

Dave Gilbert already commented that a hard coded name comparison is not
a good solution here.  There needs to be more analysis of the issue
beyond simply making the VM with this combination boot.  If there's a
valid reason this particular region cannot be a device DMA target, then
advertise that reason and make vfio skip all regions with that
property.  It's clear that we already skip non-ram and non-iommu
sections, why is this region considered both "ram" and not a DMA
target?  The fact that it's not populated at guest boot does not
provide any support that it couldn't later be populated and become a DMA
target for the assigned device.  Thanks,

Alex
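
To illustrate the review point above, here is a minimal, self-contained C
sketch contrasting a name-based skip with a property-based one. It is not
QEMU code: `struct region`, its `no_dma` field, and both helper functions are
hypothetical stand-ins invented for this example, and no such property is
claimed to exist in QEMU's memory API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a memory region (not QEMU's type). */
struct region {
    const char *name;
    bool is_ram;
    bool is_iommu;
    bool no_dma;    /* hypothetical: region can never be a device DMA target */
};

/* Name-based skip, as in the proposed patch: breaks if the name changes
 * and silently misses any other region with the same limitation. */
static bool skip_by_name(const struct region *r)
{
    assert(r != NULL && r->name != NULL);
    return strcmp(r->name, "virtio-fs-cache") == 0 ||
           (!r->is_ram && !r->is_iommu);
}

/* Property-based skip: the region advertises why it must be skipped,
 * and every region with that property is treated the same way. */
static bool skip_by_property(const struct region *r)
{
    assert(r != NULL);
    return r->no_dma || (!r->is_ram && !r->is_iommu);
}
```

With the property-based variant, a second region that has the same
limitation but a different name is skipped without touching the listener.
Whether such a property is justified for the DAX window is exactly the open
question raised in the review.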

[Bug 1926202] Re: qemu-user can't run some ppc binaries

2021-04-26 Thread Laurent Vivier

Thank you. I can reproduce the problem.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1926202

Title:
  qemu-user can't run some ppc binaries

Status in QEMU:
  New

Bug description:
  qemu-user v6.0.0-rc5, built in static mode, will crash for certain ppc
  binaries.  It seems to have something to do with glibc for some Centos
  versions.  The problem is easiest to see with statically-linked
  binaries.

  The attached Dockerfile shows how to produce a ppc binary that will
  crash qemu-user.  Here is how to reproduce the problem:

  $ uname -m
  x86_64

  $ docker run --rm --privileged multiarch/qemu-user-static --reset -p
  yes

  $ docker build -t qemu-bug:centos -f Dockerfile.centos .

  $ docker run --rm -it -v$PWD:$PWD -w$PWD qemu-bug:centos cp
  /helloworld-centos.static.ppc .

  $ qemu-ppc-static --version
  qemu-ppc version 5.2.95 (v6.0.0-rc5)
  Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers

  $ qemu-ppc-static ./helloworld-centos.static.ppc
  emu: uncaught target signal 4 (Illegal instruction) - core dumped
  [1]16678 illegal hardware instruction (core dumped)  qemu-ppc-static ./helloworld-centos.static.ppc

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1926202/+subscriptions

[ANNOUNCE] QEMU 6.0.0-rc5 is now available

2021-04-26 Thread Michael Roth

Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
sixth release candidate for the QEMU 6.0 release. This release is meant
for testing purposes and should not be used in a production environment.

  http://download.qemu-project.org/qemu-6.0.0-rc5.tar.xz
  http://download.qemu-project.org/qemu-6.0.0-rc5.tar.xz.sig

A note from the maintainer:

  Unfortunately we found a couple of late-breaking bugs that we felt
  needed fixing before the release, so we're putting out an rc5 with
  those fixes today. We plan to make the final 6.0 release (with no
  further changes) on Thursday.

You can help improve the quality of the QEMU 6.0 release by testing this
release and reporting bugs on Launchpad:

  https://bugs.launchpad.net/qemu/

The release plan, as well as documented known issues for release
candidates, are available at:

  http://wiki.qemu.org/Planning/6.0

Please add entries to the ChangeLog for the 6.0 release below:

  http://wiki.qemu.org/ChangeLog/6.0

Thank you to everyone involved!

Changes since rc4:

0cef06d187: Update version for v6.0.0-rc5 release (Peter Maydell)
5351fb7cb2: hw/block/nvme: fix invalid msix exclusive uninit (Klaus Jensen)
ffa090bc56: target/s390x: fix s390_probe_access to check PAGE_WRITE_ORG for writeability (Alex Bennée)
bc38e31b4e: net: check the existence of peer before trying to pad (Jason Wang)

[Bug 1926202] Re: qemu-user can't run some ppc binaries

2021-04-26 Thread Aaron Simmons

helloworld-centos.static.ppc is attached as part of comment #2


RE: [RFC PATCH 2/4] target/ppc: isolated SPR read/write callbacks

2021-04-26 Thread Bruno Piazera Larsen

> > The solution to move it to spr_tcg.c.inc and including it in translate.c
> > is a work in progress, any better solutions are very much appreciated.
> > Also, making the R/W functions not static is required for the next
> > commit.
>
> Looks like this could be done in the next commit then.

Sure, I can separate it like this. It was just easier to make the commit like
this, since I wouldn't need to undo all the static removals and redo them for
the next commit (I tend to work through all the code and then separate changes
into commits, maybe that's a bad habit I should drop), and also I thought it
would make reviewing the next patch easier.

> > +void spr_load_dump_spr(int sprn)
> > +{
> > +#ifdef PPC_DUMP_SPR_ACCESSES
>
> The define needs to come along.

Oops, another one that I forgot. This has the same issue as the one before, so
I'm guessing the same solution applies: move the define to internal.h, so both
cpu_init.c and translate.c can access it.

> > +/* I really see no reason to keep these gen_*_xer */
> > +/* instead of just leaving the code in the spr_*_xer */
> > +void gen_read_xer(DisasContext *ctx, TCGv dst)
> > +{
> > +TCGv t0 = tcg_temp_new();
> > +TCGv t1 = tcg_temp_new();
> > +TCGv t2 = tcg_temp_new();
> > +tcg_gen_mov_tl(dst, cpu_xer);
> > +tcg_gen_shli_tl(t0, cpu_so, XER_SO);
> > +tcg_gen_shli_tl(t1, cpu_ov, XER_OV);
> > +tcg_gen_shli_tl(t2, cpu_ca, XER_CA);
> > +tcg_gen_or_tl(t0, t0, t1);
> > +tcg_gen_or_tl(dst, dst, t2);
> > +tcg_gen_or_tl(dst, dst, t0);
> > +if (is_isa300(ctx)) {
> > +tcg_gen_shli_tl(t0, cpu_ov32, XER_OV32);
> > +tcg_gen_or_tl(dst, dst, t0);
> > +tcg_gen_shli_tl(t0, cpu_ca32, XER_CA32);
> > +tcg_gen_or_tl(dst, dst, t0);
> > +}
> > +tcg_temp_free(t0);
> > +tcg_temp_free(t1);
> > +tcg_temp_free(t2);
> > +}
> > +
> > +void gen_write_xer(TCGv src)
> > +{
> > +/* Write all flags, while reading back check for isa300 */
> > +tcg_gen_andi_tl(cpu_xer, src,
> > +~((1u << XER_SO) |
> > +  (1u << XER_OV) | (1u << XER_OV32) |
> > +  (1u << XER_CA) | (1u << XER_CA32)));
> > +tcg_gen_extract_tl(cpu_ov32, src, XER_OV32, 1);
> > +tcg_gen_extract_tl(cpu_ca32, src, XER_CA32, 1);
> > +tcg_gen_extract_tl(cpu_so, src, XER_SO, 1);
> > +tcg_gen_extract_tl(cpu_ov, src, XER_OV, 1);
> > +tcg_gen_extract_tl(cpu_ca, src, XER_CA, 1);
> > +}
>
> These two can continue being static.

Good catch again. But my question (in the comment at the beginning) remains:
is there a good reason to keep them separate from spr_(read|write)_xer, since
they are only used by those functions and aren't much different from other
read|write functions?
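
For readers following the XER discussion, here is a plain-C sketch of what
the quoted gen_read_xer/gen_write_xer TCG sequences compute: the flag bits
are kept in separate variables and are merged back into the register value on
read. This is an illustration only. `struct xer_state`, `xer_read` and
`xer_write` are hypothetical names, and the bit positions are assumed to match
the XER_* definitions in target/ppc/cpu.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (believed to match target/ppc/cpu.h). */
enum { XER_SO = 31, XER_OV = 30, XER_CA = 29, XER_OV32 = 19, XER_CA32 = 18 };

#define XER_FLAG_MASK ((1u << XER_SO) | (1u << XER_OV) | (1u << XER_OV32) | \
                       (1u << XER_CA) | (1u << XER_CA32))

/* Hypothetical stand-in for the cpu_xer, cpu_so, ... TCG globals. */
struct xer_state {
    uint32_t xer;                     /* XER with all flag bits cleared */
    uint32_t so, ov, ca, ov32, ca32;  /* each flag held as 0 or 1 */
};

/* Mirrors gen_write_xer: always split out every flag. */
static void xer_write(struct xer_state *s, uint32_t src)
{
    s->xer  = src & ~XER_FLAG_MASK;
    s->ov32 = (src >> XER_OV32) & 1;
    s->ca32 = (src >> XER_CA32) & 1;
    s->so   = (src >> XER_SO) & 1;
    s->ov   = (src >> XER_OV) & 1;
    s->ca   = (src >> XER_CA) & 1;
}

/* Mirrors gen_read_xer: OV32/CA32 are only merged back on ISA v3.00. */
static uint32_t xer_read(const struct xer_state *s, bool isa300)
{
    assert(s->so <= 1 && s->ov <= 1 && s->ca <= 1 &&
           s->ov32 <= 1 && s->ca32 <= 1);
    uint32_t dst = s->xer;
    dst |= (s->so << XER_SO) | (s->ov << XER_OV) | (s->ca << XER_CA);
    if (isa300) {
        dst |= (s->ov32 << XER_OV32) | (s->ca32 << XER_CA32);
    }
    return dst;
}
```

The asymmetry (write stores all flags, read filters by ISA level) is the
behaviour described by the "Write all flags, while reading back check for
isa300" comment in the patch.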

> Moving a big amount of code like this to another file *and* rearranging
> the code within the file at the same time makes it harder to review and
> is error prone. I'd move the code in one patch and rearrange things in a
> separate patch if needed.

Yeah... I didn't know about the automated sed command until earlier today,
so it didn't occur to me that it could be a problem. The rearranging was either
an accident or a way to reduce ifdefs, but that can be a trivial patch later on.

> > +/* prototypes for readers and writers for SPRs */
> > +
> > +#ifdef TARGET_PPC64
> > +void gen_fscr_facility_check(DisasContext *ctx, int facility_sprn,
> > +int bit, int sprn, int cause);
> > +void gen_msr_facility_check(DisasContext *ctx, int facility_sprn,
> > +   int bit, int sprn, int cause);
> > +#endif
>
> The gen_* functions are only called from within the spr.c.inc file. You
> shouldn't need them here.
>
> > +
> > +void spr_load_dump_spr(int sprn);
> > +void spr_read_generic(DisasContext *ctx, int gprn, int sprn);
> > +void spr_store_dump_spr(int sprn);
> > +void spr_write_generic(DisasContext *ctx, int sprn, int gprn);
> > +void gen_read_xer(DisasContext *ctx, TCGv dst);
> > +void gen_write_xer(TCGv src);
>
> Same here.

Relics of a different solution. Will remove for v2

> > -static void spr_noaccess(DisasContext *ctx, int gprn, int sprn)
> > -{
> > -#if 0
> > -sprn = ((sprn >> 5) & 0x1F) | ((sprn & 0x1F) << 5);
> > -printf("ERROR: try to access SPR %d !\n", sprn);
> > -#endif
> > -}
> > -#define SPR_NOACCESS (&spr_noaccess)
>
> What happens to code in translate.c that checks this? I don't think you
> can remove it.

The define was moved to internal.h (I forgot to include it in this patch; it's
in the next one, will fix). I don't know if that's a good solution, but it's
working for now.



> >  #if defined(TARGET_PPC64)
> > -#if defined(CONFIG_USER_ONLY)
> > -#define POWERPC970_HID5_INIT 0x0080
> > -#else
> > -#define POWERPC970_HID5_INIT 0x
> > -#endif
>
> Where did these go?

Same as before:  They were added to internal.h and I forgot.



Bruno Piazera Larsen

Instituto de Pesquisas

[Bug 1926202] Re: qemu-user can't run some ppc binaries

2021-04-26 Thread Laurent Vivier

Could you provide the binary to test (helloworld-centos.static.ppc)
directly?


[Bug 1926202] Re: qemu-user can't run some ppc binaries

2021-04-26 Thread Laurent Vivier

** Tags added: linux-user


Re: constant_tsc support for SVM guest

2021-04-26 Thread Marcelo Tosatti

On Sun, Apr 25, 2021 at 12:19:11AM -0500, Wei Huang wrote:
> 
> 
> On 4/23/21 4:27 PM, Eduardo Habkost wrote:
> > On Fri, Apr 23, 2021 at 12:32:00AM -0500, Wei Huang wrote:
> > > There was a customer request for const_tsc support on AMD guests. Right now
> > > this feature is turned off by default for QEMU x86 CPU types (in
> > > CPUID_Fn8000_0007_EDX[8]). However we are seeing a discrepancy in guest VM
> > > behavior between Intel and AMD.
> > > 
> > > In Linux kernel, Intel x86 code enables X86_FEATURE_CONSTANT_TSC based on
> > > vCPU's family & model. So it ignores CPUID_Fn8000_0007_EDX[8] and guest VMs
> > > have const_tsc enabled. On AMD, however, the kernel checks
> > > CPUID_Fn8000_0007_EDX[8]. So const_tsc is disabled on AMD by default.
> > 
> > Oh.  This seems to defeat the purpose of the invtsc migration
> > blocker we have.
> > 
> > Do we know when this behavior was introduced in Linux?
> 
> This code has existed in the kernel for a long time:
> 
>   commit 2b16a2353814a513cdb5c5c739b76a19d7ea39ce
>   Author: Andi Kleen 
>   Date:   Wed Jan 30 13:32:40 2008 +0100
> 
>  x86: move X86_FEATURE_CONSTANT_TSC into early cpu feature detection
> 
> There was another related commit which might explain the reasoning of
> turning on CONSTANT_TSC based on CPU family on Intel:
> 
>   commit 40fb17152c50a69dc304dd632131c2f41281ce44
>   Author: Venki Pallipadi 
>   Date:   Mon Nov 17 16:11:37 2008 -0800
> 
>  x86: support always running TSC on Intel CPUs
> 
> According to the commit above, there are two kernel features: CONSTANT_TSC
> and NONSTOP_TSC:
> 
>   * CONSTANT_TSC: TSC runs at constant rate
>   * NONSTOP_TSC: TSC does not stop in deep C-states
> 
> If CPUID_Fn8000_0007_EDX[8] == 1, both CONSTANT_TSC and NONSTOP_TSC are
> turned on. This applies to all x86 vendors. For Intel CPU with certain CPU
> families (i.e. c->x86 == 0x6 && c->x86_model >= 0x0e), it will turn on
> CONSTANT_TSC (but not NONSTOP_TSC) with CPUID_Fn8000_0007_EDX[8]=0.
> 
> I believe the migration blocker was created for the CONSTANT_TSC case: if
> vCPU claims to have a constant TSC rate, we have to make sure src/dest are
> matched with each other (having the same CPU frequency or have tsc_ratio).
> NONSTOP_TSC doesn't matter in this scope.
>
> > > I am thinking of turning on invtsc for EPYC CPU types (see example below).
> > > Most AMD server CPUs have supported invariant TSC for a long time. So this
> > > change is compatible with the hardware behavior. The only problem is live
> > > migration support, which will be blocked because of invtsc.

It should be blocked, if performed to a host with a different frequency
or without TscRateMsr, if one desires the "constant TSC rate" meaning
to be maintained.

> > > However this problem
> > > should be considered very minor because most server CPUs support TscRateMsr
> > > (see CPUID_Fn8000_000A_EDX[4]), allowing VMs to migrate among CPUs with
> > > different TSC rates. This live migration restriction can be lifted as long
> > > as the destination supports TscRateMsr or has the same frequency as the
> > > source (QEMU/libvirt do it).
> > > 
> > > [BTW I believe this migration limitation might be unnecessary because it is
> > > apparently OK for Intel guests to ignore invtsc while claiming const_tsc.
> > > Has anyone reported issues?]

Not as far as I know.

Fact is that libvirt will set the TSC_KHZ (from the value of
KVM_GET_TSC_KHZ ioctl).

That could be done inside QEMU itself, maybe by specifying -cpu
AAA,cpu-freq=auto ?

https://www.spinics.net/linux/fedora/libvir/msg141570.html
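
As a summary of the rules discussed in this thread, here is a rough C sketch
of (a) how the guest kernel is described as deriving CONSTANT_TSC and
NONSTOP_TSC from the invariant-TSC CPUID bit and the Intel family/model
check, and (b) the condition under which migration keeps the "constant TSC
rate" promise. It is a simplification written for this note, not kernel or
QEMU code: every type and function name is hypothetical, and the Intel check
only covers the single family/model condition quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD };

struct tsc_features {
    bool constant_tsc;  /* TSC runs at a constant rate */
    bool nonstop_tsc;   /* TSC does not stop in deep C-states */
};

/* invtsc_bit is the invariant-TSC CPUID bit as seen by the guest. */
static struct tsc_features detect_tsc_features(enum cpu_vendor vendor,
                                               unsigned family,
                                               unsigned model,
                                               bool invtsc_bit)
{
    struct tsc_features f = { false, false };

    assert(model <= 0xff);  /* display model fits in 8 bits */

    if (invtsc_bit) {
        /* Applies to all vendors. */
        f.constant_tsc = true;
        f.nonstop_tsc = true;
    }
    if (vendor == VENDOR_INTEL && family == 0x6 && model >= 0x0e) {
        /* Intel-only shortcut: CONSTANT_TSC without the CPUID bit. */
        f.constant_tsc = true;
    }
    return f;
}

/* Migration preserves a constant guest TSC rate if the destination runs at
 * the same frequency or can scale the TSC (TscRateMsr). */
static bool invtsc_migration_allowed(uint64_t src_tsc_khz,
                                     uint64_t dst_tsc_khz,
                                     bool dst_has_tsc_rate_msr)
{
    return src_tsc_khz == dst_tsc_khz || dst_has_tsc_rate_msr;
}
```

Under these assumptions, an AMD guest without the CPUID bit gets neither
feature, while an Intel guest with family 6 and model >= 0x0e still gets
CONSTANT_TSC, which is the discrepancy the thread is about.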

Re: [RFC PATCH 2/4] target/ppc: isolated SPR read/write callbacks

2021-04-26 Thread Fabiano Rosas

"Bruno Larsen (billionai)"  writes:

> Moved all functions related to SPR read/write callbacks into a new file
> specific for holding these. This is setting up a better separation of
> SPR registration, which is required to be able to build disabling
> TCG.
>
> The solution to move it to spr_tcg.c.inc and including it in translate.c
> is a work in progress, any better solutions are very much appreciated.
> Also, making the R/W functions not static is required for the next
> commit.

Looks like this could be done in the next commit then.

>
> Signed-off-by: Bruno Larsen (billionai) 
> ---
>  target/ppc/spr_tcg.c.inc| 1002 +++
>  target/ppc/spr_tcg.h|  132 
>  target/ppc/translate.c  |   48 +-
>  target/ppc/translate_init.c.inc |  986 --
>  4 files changed, 1136 insertions(+), 1032 deletions(-)
>  create mode 100644 target/ppc/spr_tcg.c.inc
>  create mode 100644 target/ppc/spr_tcg.h
>
> diff --git a/target/ppc/spr_tcg.c.inc b/target/ppc/spr_tcg.c.inc
> new file mode 100644
> index 00..a0e62b3816
> --- /dev/null
> +++ b/target/ppc/spr_tcg.c.inc
> @@ -0,0 +1,1002 @@
> +#include "exec/translator.h"
> +#include "spr_tcg.h"
> +
> +#ifdef TARGET_PPC64
> +void gen_fscr_facility_check(DisasContext *ctx, int facility_sprn,
> +int bit, int sprn, int cause)
> +{
> +TCGv_i32 t1 = tcg_const_i32(bit);
> +TCGv_i32 t2 = tcg_const_i32(sprn);
> +TCGv_i32 t3 = tcg_const_i32(cause);
> +
> +gen_helper_fscr_facility_check(cpu_env, t1, t2, t3);
> +
> +tcg_temp_free_i32(t3);
> +tcg_temp_free_i32(t2);
> +tcg_temp_free_i32(t1);
> +}
> +
> +void gen_msr_facility_check(DisasContext *ctx, int facility_sprn,
> +   int bit, int sprn, int cause)
> +{
> +TCGv_i32 t1 = tcg_const_i32(bit);
> +TCGv_i32 t2 = tcg_const_i32(sprn);
> +TCGv_i32 t3 = tcg_const_i32(cause);
> +
> +gen_helper_msr_facility_check(cpu_env, t1, t2, t3);
> +
> +tcg_temp_free_i32(t3);
> +tcg_temp_free_i32(t2);
> +tcg_temp_free_i32(t1);
> +}
> +#endif
> +/*/
> +/* Reader and writer functions for SPRs */
> +
> +void spr_noaccess(DisasContext *ctx, int gprn, int sprn)
> +{
> +#if 0
> +sprn = ((sprn >> 5) & 0x1F) | ((sprn & 0x1F) << 5);
> +printf("ERROR: try to access SPR %d !\n", sprn);
> +#endif
> +}
> +
> +/*
> + * Generic callbacks:
> + * do nothing but store/retrieve spr value
> + */
> +void spr_load_dump_spr(int sprn)
> +{
> +#ifdef PPC_DUMP_SPR_ACCESSES

The define needs to come along.

> +TCGv_i32 t0 = tcg_const_i32(sprn);
> +gen_helper_load_dump_spr(cpu_env, t0);
> +tcg_temp_free_i32(t0);
> +#endif
> +}
> +
> +void spr_read_generic(DisasContext *ctx, int gprn, int sprn)
> +{
> +gen_load_spr(cpu_gpr[gprn], sprn);
> +spr_load_dump_spr(sprn);
> +}
> +
> +void spr_store_dump_spr(int sprn)
> +{
> +#ifdef PPC_DUMP_SPR_ACCESSES
> +TCGv_i32 t0 = tcg_const_i32(sprn);
> +gen_helper_store_dump_spr(cpu_env, t0);
> +tcg_temp_free_i32(t0);
> +#endif
> +}
> +
> +void spr_write_generic(DisasContext *ctx, int sprn, int gprn)
> +{
> +gen_store_spr(sprn, cpu_gpr[gprn]);
> +spr_store_dump_spr(sprn);
> +}
> +
> +/* SPR common to all PowerPC */
> +/* XER */
> +
> +/* I really see no reason to keep these gen_*_xer */
> +/* instead of just leaving the code in the spr_*_xer */
> +void gen_read_xer(DisasContext *ctx, TCGv dst)
> +{
> +TCGv t0 = tcg_temp_new();
> +TCGv t1 = tcg_temp_new();
> +TCGv t2 = tcg_temp_new();
> +tcg_gen_mov_tl(dst, cpu_xer);
> +tcg_gen_shli_tl(t0, cpu_so, XER_SO);
> +tcg_gen_shli_tl(t1, cpu_ov, XER_OV);
> +tcg_gen_shli_tl(t2, cpu_ca, XER_CA);
> +tcg_gen_or_tl(t0, t0, t1);
> +tcg_gen_or_tl(dst, dst, t2);
> +tcg_gen_or_tl(dst, dst, t0);
> +if (is_isa300(ctx)) {
> +tcg_gen_shli_tl(t0, cpu_ov32, XER_OV32);
> +tcg_gen_or_tl(dst, dst, t0);
> +tcg_gen_shli_tl(t0, cpu_ca32, XER_CA32);
> +tcg_gen_or_tl(dst, dst, t0);
> +}
> +tcg_temp_free(t0);
> +tcg_temp_free(t1);
> +tcg_temp_free(t2);
> +}
> +
> +void gen_write_xer(TCGv src)
> +{
> +/* Write all flags, while reading back check for isa300 */
> +tcg_gen_andi_tl(cpu_xer, src,
> +~((1u << XER_SO) |
> +  (1u << XER_OV) | (1u << XER_OV32) |
> +  (1u << XER_CA) | (1u << XER_CA32)));
> +tcg_gen_extract_tl(cpu_ov32, src, XER_OV32, 1);
> +tcg_gen_extract_tl(cpu_ca32, src, XER_CA32, 1);
> +tcg_gen_extract_tl(cpu_so, src, XER_SO, 1);
> +tcg_gen_extract_tl(cpu_ov, src, XER_OV, 1);
> +tcg_gen_extract_tl(cpu_ca, src, XER_CA, 1);
> +}

These two can continue being static.

> +
> +void spr_read_xer(DisasContext *ctx, int gprn, int sprn)
> +{
> +gen_read_xer(ctx, cpu_gpr[gprn]);
> +}
> +
> +void spr_write_xer(DisasContext

[Bug 1926202] Re: qemu-user can't run some ppc binaries

2021-04-26 Thread Aaron Simmons

** Attachment added: "ppc binary that crashes qemu-user"
   
https://bugs.launchpad.net/qemu/+bug/1926202/+attachment/5492563/+files/helloworld-centos.static.ppc

** Description changed:

  qemu-user v6.0.0-rc5, built in static mode, will crash for certain ppc
  binaries.  It seems to have something to do with glibc for some Centos
  versions.  The problem is easiest to see with statically-linked
  binaries.
  
  The attached Dockerfile shows how to produce a ppc binary that will
  crash qemu-user.  Here is how to reproduce the problem:
  
  $ uname -m
  x86_64
  $ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
  $ docker build -t qemu-bug:centos -f Dockerfile.centos .
  $ docker run --rm -it -v$PWD:$PWD -w$PWD qemu-bug:centos cp 
/helloworld-centos.static.ppc .
  $ qemu-ppc version 5.2.95 (v6.0.0-rc5)
  Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
  $ qemu-ppc-static ./helloworld-centos.static.ppc
  emu: uncaught target signal 4 (Illegal instruction) - core dumped
  [1]16678 illegal hardware instruction (core dumped)  qemu-ppc-static 
./helloworld-centos.static.ppc
- 
- I can also provide the binary if necessary.

** Description changed:

  qemu-user v6.0.0-rc5, built in static mode, will crash for certain ppc
  binaries.  It seems to have something to do with glibc for some Centos
  versions.  The problem is easiest to see with statically-linked
  binaries.
  
  The attached Dockerfile shows how to produce a ppc binary that will
  crash qemu-user.  Here is how to reproduce the problem:
  
  $ uname -m
  x86_64
  $ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
  $ docker build -t qemu-bug:centos -f Dockerfile.centos .
  $ docker run --rm -it -v$PWD:$PWD -w$PWD qemu-bug:centos cp 
/helloworld-centos.static.ppc .
- $ qemu-ppc version 5.2.95 (v6.0.0-rc5)
+ $ qemu-qemu-ppc-static --version
+ qemu-ppc version 5.2.95 (v6.0.0-rc5)
  Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
  $ qemu-ppc-static ./helloworld-centos.static.ppc
  emu: uncaught target signal 4 (Illegal instruction) - core dumped
  [1]16678 illegal hardware instruction (core dumped)  qemu-ppc-static 
./helloworld-centos.static.ppc

** Description changed:

  qemu-user v6.0.0-rc5, built in static mode, will crash for certain ppc
  binaries.  It seems to have something to do with glibc for some Centos
  versions.  The problem is easiest to see with statically-linked
  binaries.
  
  The attached Dockerfile shows how to produce a ppc binary that will
  crash qemu-user.  Here is how to reproduce the problem:
  
  $ uname -m
  x86_64
+ 
  $ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
+ 
  $ docker build -t qemu-bug:centos -f Dockerfile.centos .
- $ docker run --rm -it -v$PWD:$PWD -w$PWD qemu-bug:centos cp 
/helloworld-centos.static.ppc .
+ 
+ $ docker run --rm -it -v$PWD:$PWD -w$PWD qemu-bug:centos cp /helloworld-
+ centos.static.ppc .
+ 
  $ qemu-qemu-ppc-static --version
  qemu-ppc version 5.2.95 (v6.0.0-rc5)
  Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
+ 
  $ qemu-ppc-static ./helloworld-centos.static.ppc
  emu: uncaught target signal 4 (Illegal instruction) - core dumped
  [1]16678 illegal hardware instruction (core dumped)  qemu-ppc-static 
./helloworld-centos.static.ppc

** Description changed:

  qemu-user v6.0.0-rc5, built in static mode, will crash for certain ppc
  binaries.  It seems to have something to do with glibc for some Centos
  versions.  The problem is easiest to see with statically-linked
  binaries.
  
  The attached Dockerfile shows how to produce a ppc binary that will
  crash qemu-user.  Here is how to reproduce the problem:
  
  $ uname -m
  x86_64
  
  $ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
  
  $ docker build -t qemu-bug:centos -f Dockerfile.centos .
  
  $ docker run --rm -it -v$PWD:$PWD -w$PWD qemu-bug:centos cp /helloworld-
  centos.static.ppc .
  
- $ qemu-qemu-ppc-static --version
+ $ qemu-ppc-static --version
  qemu-ppc version 5.2.95 (v6.0.0-rc5)
  Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
  
  $ qemu-ppc-static ./helloworld-centos.static.ppc
  emu: uncaught target signal 4 (Illegal instruction) - core dumped
  [1]16678 illegal hardware instruction (core dumped)  qemu-ppc-static 
./helloworld-centos.static.ppc


[Bug 1926202] [NEW] qemu-user can't run some ppc binaries

2021-04-26 Thread Aaron Simmons

Public bug reported:

qemu-user v6.0.0-rc5, built in static mode, will crash for certain ppc
binaries.  It seems to have something to do with glibc for some Centos
versions.  The problem is easiest to see with statically-linked
binaries.

The attached Dockerfile shows how to produce a ppc binary that will
crash qemu-user.  Here is how to reproduce the problem:

$ uname -m
x86_64

$ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

$ docker build -t qemu-bug:centos -f Dockerfile.centos .

$ docker run --rm -it -v$PWD:$PWD -w$PWD qemu-bug:centos cp /helloworld-
centos.static.ppc .

$ qemu-ppc-static --version
qemu-ppc version 5.2.95 (v6.0.0-rc5)
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers

$ qemu-ppc-static ./helloworld-centos.static.ppc
emu: uncaught target signal 4 (Illegal instruction) - core dumped
[1]16678 illegal hardware instruction (core dumped)  qemu-ppc-static 
./helloworld-centos.static.ppc

** Affects: qemu
 Importance: Undecided
 Status: New


** Tags: ppc

** Attachment added: "Dockerfile.centos"
   
https://bugs.launchpad.net/bugs/1926202/+attachment/5492562/+files/Dockerfile.centos

** Summary changed:

- qemu-user can't run ppc binaries
+ qemu-user can't run some ppc binaries

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1926202

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1926202/+subscriptions

[PATCH 6/7] hw/{arm,hppa,riscv}: Add fw_cfg arch-specific stub

2021-04-26 Thread Philippe Mathieu-Daudé

The ARM, HPPA and RISC-V architectures don't declare any fw_cfg
specific keys. To simplify the build system machinery and allow building
QEMU without the fw_cfg device (in the next commit), first add an
empty per-architecture stub defining fw_cfg_arch_key_name().

Update the MAINTAINERS section to cover the various target-specific
fw_cfg.c files.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/arm/fw_cfg.c  | 19 +++
 hw/hppa/fw_cfg.c | 19 +++
 hw/riscv/fw_cfg.c| 19 +++
 MAINTAINERS  |  2 +-
 hw/arm/meson.build   |  1 +
 hw/hppa/meson.build  |  1 +
 hw/riscv/meson.build |  1 +
 7 files changed, 61 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/fw_cfg.c
 create mode 100644 hw/hppa/fw_cfg.c
 create mode 100644 hw/riscv/fw_cfg.c

diff --git a/hw/arm/fw_cfg.c b/hw/arm/fw_cfg.c
new file mode 100644
index 000..de2bca9c76c
--- /dev/null
+++ b/hw/arm/fw_cfg.c
@@ -0,0 +1,19 @@
+/*
+ * QEMU fw_cfg helpers (ARM specific)
+ *
+ * Copyright (c) 2021 Red Hat, Inc.
+ *
+ * Author:
+ *   Philippe Mathieu-Daudé 
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "hw/mips/fw_cfg.h"
+#include "hw/nvram/fw_cfg.h"
+
+const char *fw_cfg_arch_key_name(uint16_t key)
+{
+return NULL;
+}
diff --git a/hw/hppa/fw_cfg.c b/hw/hppa/fw_cfg.c
new file mode 100644
index 000..322b03068c7
--- /dev/null
+++ b/hw/hppa/fw_cfg.c
@@ -0,0 +1,19 @@
+/*
+ * QEMU fw_cfg helpers (HPPA specific)
+ *
+ * Copyright (c) 2021 Red Hat, Inc.
+ *
+ * Author:
+ *   Philippe Mathieu-Daudé 
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "hw/mips/fw_cfg.h"
+#include "hw/nvram/fw_cfg.h"
+
+const char *fw_cfg_arch_key_name(uint16_t key)
+{
+return NULL;
+}
diff --git a/hw/riscv/fw_cfg.c b/hw/riscv/fw_cfg.c
new file mode 100644
index 000..8e3d2a8bdea
--- /dev/null
+++ b/hw/riscv/fw_cfg.c
@@ -0,0 +1,19 @@
+/*
+ * QEMU fw_cfg helpers (RISC-V specific)
+ *
+ * Copyright (c) 2021 Red Hat, Inc.
+ *
+ * Author:
+ *   Philippe Mathieu-Daudé 
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "hw/mips/fw_cfg.h"
+#include "hw/nvram/fw_cfg.h"
+
+const char *fw_cfg_arch_key_name(uint16_t key)
+{
+return NULL;
+}
diff --git a/MAINTAINERS b/MAINTAINERS
index 36055f14c59..ab8f030d4c0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2162,7 +2162,7 @@ R: Laszlo Ersek 
 R: Gerd Hoffmann 
 S: Supported
 F: docs/specs/fw_cfg.txt
-F: hw/nvram/fw_cfg*.c
+F: hw/*/fw_cfg*.c
 F: stubs/fw_cfg.c
 F: include/hw/nvram/fw_cfg.h
 F: include/standard-headers/linux/qemu_fw_cfg.h
diff --git a/hw/arm/meson.build b/hw/arm/meson.build
index be39117b9b6..fd278de916f 100644
--- a/hw/arm/meson.build
+++ b/hw/arm/meson.build
@@ -1,6 +1,7 @@
 arm_ss = ss.source_set()
 arm_ss.add(files('boot.c'), fdt)
 arm_ss.add(when: 'CONFIG_PLATFORM_BUS', if_true: files('sysbus-fdt.c'))
+arm_ss.add(when: 'CONFIG_FW_CFG', if_true: files('fw_cfg.c'))
 arm_ss.add(when: 'CONFIG_ARM_VIRT', if_true: files('virt.c'))
 arm_ss.add(when: 'CONFIG_ACPI', if_true: files('virt-acpi-build.c'))
 arm_ss.add(when: 'CONFIG_DIGIC', if_true: files('digic_boards.c'))
diff --git a/hw/hppa/meson.build b/hw/hppa/meson.build
index 1deae83aee8..10494cc24b7 100644
--- a/hw/hppa/meson.build
+++ b/hw/hppa/meson.build
@@ -1,4 +1,5 @@
 hppa_ss = ss.source_set()
 hppa_ss.add(when: 'CONFIG_DINO', if_true: files('pci.c', 'machine.c', 'dino.c', 'lasi.c'))
+hppa_ss.add(when: 'CONFIG_FW_CFG', if_true: files('fw_cfg.c'))
 
 hw_arch += {'hppa': hppa_ss}
diff --git a/hw/riscv/meson.build b/hw/riscv/meson.build
index 275c0f7eb7c..ab4d3adb924 100644
--- a/hw/riscv/meson.build
+++ b/hw/riscv/meson.build
@@ -8,5 +8,6 @@
 riscv_ss.add(when: 'CONFIG_SIFIVE_U', if_true: files('sifive_u.c'))
 riscv_ss.add(when: 'CONFIG_SPIKE', if_true: files('spike.c'))
 riscv_ss.add(when: 'CONFIG_MICROCHIP_PFSOC', if_true: files('microchip_pfsoc.c'))
+riscv_ss.add(when: 'CONFIG_FW_CFG', if_true: files('fw_cfg.c'))
 
 hw_arch += {'riscv': riscv_ss}
-- 
2.26.3
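
The stub pattern used in this patch can be illustrated with a small standalone model. This is only a sketch, not QEMU code: `lookup_key_name`, `ARCH_LOCAL_BIT` and the name table are hypothetical and chosen for illustration, and the real lookup in hw/nvram/fw_cfg.c may well differ. Only `fw_cfg_arch_key_name()` and its signature are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for architecture-local keys (assumption, not QEMU's value). */
#define ARCH_LOCAL_BIT 0x8000

/*
 * Per-architecture hook, as in the patch: an architecture without any
 * specific keys returns NULL for every key.
 */
const char *fw_cfg_arch_key_name(uint16_t key)
{
    (void)key;
    return NULL;
}

/* Hypothetical generic lookup that defers arch-local keys to the hook. */
const char *lookup_key_name(uint16_t key)
{
    static const char *const generic_names[] = {
        "signature", "id", "uuid",
    };
    size_t count = sizeof(generic_names) / sizeof(generic_names[0]);

    if (key & ARCH_LOCAL_BIT) {
        return fw_cfg_arch_key_name(key);
    }
    if (key < count) {
        return generic_names[key];
    }
    return NULL;
}
```

Because every target provides the hook, the generic code in this model can call it unconditionally and treat NULL as "no name known".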

[PATCH 3/7] hw/nvram: Declare FW_CFG_DMA Kconfig symbol in hw/nvram/

2021-04-26 Thread Philippe Mathieu-Daudé

fw_cfg related files are maintained in hw/nvram/, so it makes
sense to declare the FW_CFG_DMA Kconfig symbol there, along
with the FW_CFG one.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/display/Kconfig | 3 ---
 hw/nvram/Kconfig   | 4 
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/display/Kconfig b/hw/display/Kconfig
index ca46b5830e7..0e4bb596c43 100644
--- a/hw/display/Kconfig
+++ b/hw/display/Kconfig
@@ -6,9 +6,6 @@ config DDC
 config EDID
 bool
 
-config FW_CFG_DMA
-bool
-
 config VGA_CIRRUS
 bool
 default y if PCI_DEVICES
diff --git a/hw/nvram/Kconfig b/hw/nvram/Kconfig
index cab1070375f..59fac45c315 100644
--- a/hw/nvram/Kconfig
+++ b/hw/nvram/Kconfig
@@ -1,6 +1,10 @@
 config FW_CFG
 bool
 
+config FW_CFG_DMA
+bool
+select FW_CFG
+
 config DS1225Y
 bool
 
-- 
2.26.3

[PATCH 4/7] hw/acpi/vmgenid: Make ACPI_VMGENID depends on FW_CFG Kconfig

2021-04-26 Thread Philippe Mathieu-Daudé

The TYPE_VMGENID device depends on fw_cfg:

  $ git grep \ fw_cfg hw/acpi/vmgenid.c
  hw/acpi/vmgenid.c:128:fw_cfg_add_file(s, VMGENID_GUID_FW_CFG_FILE, guid->data,
  hw/acpi/vmgenid.c:131:fw_cfg_add_file_callback(s, VMGENID_ADDR_FW_CFG_FILE, NULL, NULL, NULL,

Add the proper Kconfig dependency.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/acpi/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 1932f66af8d..b9dc932d2a7 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -40,5 +40,6 @@ config ACPI_VMGENID
 bool
 default y
 depends on PC
+select FW_CFG
 
 config ACPI_HW_REDUCED
-- 
2.26.3

[PATCH 7/7] hw/nvram: Do not build FW_CFG if not required

2021-04-26 Thread Philippe Mathieu-Daudé

If the Kconfig 'FW_CFG' symbol is not selected, it is pointless
to build the fw_cfg device. Update the stubs.

Signed-off-by: Philippe Mathieu-Daudé 
---
 stubs/fw_cfg.c   | 49 ++--
 hw/nvram/meson.build |  2 +-
 2 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/stubs/fw_cfg.c b/stubs/fw_cfg.c
index bb1e3c8aa95..ac1e539c93f 100644
--- a/stubs/fw_cfg.c
+++ b/stubs/fw_cfg.c
@@ -1,7 +1,7 @@
 /*
  * fw_cfg stubs
  *
- * Copyright (c) 2019 Red Hat, Inc.
+ * Copyright (c) 2019,2021 Red Hat, Inc.
  *
  * Author:
  *   Philippe Mathieu-Daudé 
@@ -13,9 +13,54 @@
  */
 
 #include "qemu/osdep.h"
+#include "qapi/error.h"
 #include "hw/nvram/fw_cfg.h"
 
-const char *fw_cfg_arch_key_name(uint16_t key)
+FWCfgState *fw_cfg_find(void)
 {
 return NULL;
 }
+
+bool fw_cfg_add_from_generator(FWCfgState *s, const char *filename,
+   const char *gen_id, Error **errp)
+{
+error_setg(errp, "fw-cfg device not built in");
+
+return false;
+}
+
+void fw_cfg_add_file(FWCfgState *s,  const char *filename,
+ void *data, size_t len)
+{
+g_assert_not_reached();
+}
+
+void fw_cfg_add_file_callback(FWCfgState *s,  const char *filename,
+  FWCfgCallback select_cb,
+  FWCfgWriteCallback write_cb,
+  void *callback_opaque,
+  void *data, size_t len, bool read_only)
+{
+g_assert_not_reached();
+}
+
+void *fw_cfg_modify_file(FWCfgState *s, const char *filename,
+void *data, size_t len)
+{
+g_assert_not_reached();
+}
+
+void fw_cfg_set_order_override(FWCfgState *s, int order)
+{
+g_assert_not_reached();
+}
+
+void fw_cfg_reset_order_override(FWCfgState *s)
+{
+g_assert_not_reached();
+}
+
+bool fw_cfg_dma_enabled(void *opaque)
+{
+g_assert_not_reached();
+}
diff --git a/hw/nvram/meson.build b/hw/nvram/meson.build
index fd2951a860f..99e12224483 100644
--- a/hw/nvram/meson.build
+++ b/hw/nvram/meson.build
@@ -1,7 +1,7 @@
 # QOM interfaces must be available anytime QOM is used.
 qom_ss.add(files('fw_cfg-interface.c'))
 
-softmmu_ss.add(files('fw_cfg.c'))
+softmmu_ss.add(when: 'CONFIG_FW_CFG', if_true: files('fw_cfg.c'))
 softmmu_ss.add(when: 'CONFIG_CHRP_NVRAM', if_true: files('chrp_nvram.c'))
 softmmu_ss.add(when: 'CONFIG_DS1225Y', if_true: files('ds1225y.c'))
 softmmu_ss.add(when: 'CONFIG_NMC93XX_EEPROM', if_true: files('eeprom93xx.c'))
-- 
2.26.3
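
A stub that has to report "device not built in" can be sketched as follows. This is a minimal illustration under stated assumptions: `Err`, `demo_set_error` and `demo_add_from_generator` are hypothetical stand-ins rather than QEMU's `Error` API, and the sketch assumes the convention that a bool-returning function returns false whenever it sets an error.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an error object (assumption, not QEMU's Error). */
typedef struct Err {
    const char *msg;
} Err;

/* Record an error message if the caller asked for one. */
static void demo_set_error(Err *err, const char *msg)
{
    if (err != NULL) {
        err->msg = msg;
    }
}

/*
 * Stub used when the device is not built in: it cannot do the work,
 * so it records why and signals failure to the caller.
 */
bool demo_add_from_generator(const char *filename, const char *gen_id,
                             Err *err)
{
    (void)filename;
    (void)gen_id;
    demo_set_error(err, "fw-cfg device not built in");
    return false;
}
```

Under that convention a caller can test the return value alone, without inspecting the error object.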

[PATCH 5/7] hw: Have machines Kconfig-select FW_CFG

2021-04-26 Thread Philippe Mathieu-Daudé

Besides the loongson3-virt machine (MIPS), the following machines
also use the fw_cfg device:

- ARM: virt & sbsa-ref
- HPPA: generic machine
- X86: ACPI based (pc & microvm)
- PPC64: various
- SPARC: sun4m & sun4u

Add their FW_CFG Kconfig dependency.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/arm/Kconfig | 2 ++
 hw/hppa/Kconfig| 1 +
 hw/i386/Kconfig| 2 ++
 hw/ppc/Kconfig | 1 +
 hw/sparc/Kconfig   | 1 +
 hw/sparc64/Kconfig | 1 +
 6 files changed, 8 insertions(+)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 8c37cf00da7..3b2641e39dc 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -8,6 +8,7 @@ config ARM_VIRT
 imply TPM_TIS_SYSBUS
 select ARM_GIC
 select ACPI
+select FW_CFG
 select ARM_SMMUV3
 select GPIO_KEY
 select FW_CFG_DMA
@@ -216,6 +217,7 @@ config SBSA_REF
 select PL061 # GPIO
 select USB_EHCI_SYSBUS
 select WDT_SBSA
+select FW_CFG
 
 config SABRELITE
 bool
diff --git a/hw/hppa/Kconfig b/hw/hppa/Kconfig
index 22948db0256..45f40e09224 100644
--- a/hw/hppa/Kconfig
+++ b/hw/hppa/Kconfig
@@ -14,3 +14,4 @@ config DINO
 select LASIPS2
 select PARALLEL
 select ARTIST
+select FW_CFG
diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index 7f91f30877f..9e4039a2dce 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -52,6 +52,7 @@ config PC_ACPI
 select SMBUS_EEPROM
 select PFLASH_CFI01
 depends on ACPI_SMBUS
+select FW_CFG
 
 config I440FX
 bool
@@ -106,6 +107,7 @@ config MICROVM
 select ACPI_HW_REDUCED
 select PCI_EXPRESS_GENERIC_BRIDGE
 select USB_XHCI_SYSBUS
+select FW_CFG
 
 config X86_IOMMU
 bool
diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
index d11dc30509d..a7ba8283bf1 100644
--- a/hw/ppc/Kconfig
+++ b/hw/ppc/Kconfig
@@ -131,6 +131,7 @@ config VIRTEX
 # Only used by 64-bit targets
 config FW_CFG_PPC
 bool
+select FW_CFG
 
 config FDT_PPC
 bool
diff --git a/hw/sparc/Kconfig b/hw/sparc/Kconfig
index 8dcb10086fd..267bf45fa21 100644
--- a/hw/sparc/Kconfig
+++ b/hw/sparc/Kconfig
@@ -15,6 +15,7 @@ config SUN4M
 select STP2000
 select CHRP_NVRAM
 select OR_IRQ
+select FW_CFG
 
 config LEON3
 bool
diff --git a/hw/sparc64/Kconfig b/hw/sparc64/Kconfig
index 980a201bb73..c17b34b9d5b 100644
--- a/hw/sparc64/Kconfig
+++ b/hw/sparc64/Kconfig
@@ -13,6 +13,7 @@ config SUN4U
 select PCKBD
 select SIMBA
 select CHRP_NVRAM
+select FW_CFG
 
 config NIAGARA
 bool
-- 
2.26.3

[PATCH 1/7] stubs: Restrict fw_cfg stubs to sysemu

2021-04-26 Thread Philippe Mathieu-Daudé

User emulation or tools don't use / require the fw_cfg device.

Signed-off-by: Philippe Mathieu-Daudé 
---
 stubs/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/stubs/meson.build b/stubs/meson.build
index be6f6d609e5..4ff36401cf9 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -12,7 +12,6 @@
 stub_ss.add(files('dump.c'))
 stub_ss.add(files('error-printf.c'))
 stub_ss.add(files('fdset.c'))
-stub_ss.add(files('fw_cfg.c'))
 stub_ss.add(files('gdbstub.c'))
 stub_ss.add(files('get-vm-name.c'))
 stub_ss.add(when: 'CONFIG_LINUX_IO_URING', if_true: files('io_uring.c'))
@@ -49,6 +48,7 @@
   stub_ss.add(files('replay-tools.c'))
 endif
 if have_system
+  stub_ss.add(files('fw_cfg.c'))
   stub_ss.add(files('semihost.c'))
   stub_ss.add(files('xen-hw-stub.c'))
 else
-- 
2.26.3

[PATCH 2/7] hw/nvram: Rename FW_CFG_MIPS as generic FW_CFG Kconfig symbol

2021-04-26 Thread Philippe Mathieu-Daudé

Targets using the fw_cfg device might have architecture specific
keys. If so, they define the fw_cfg_arch_key_name() function.

The use of FW_CFG_MIPS is not MIPS-specific, it is simply the
architectural implementation. Rename it to the generic 'FW_CFG'
and move the Kconfig declaration to hw/nvram/, where the fw_cfg code
is maintained.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/mips/Kconfig | 5 +
 hw/mips/meson.build | 2 +-
 hw/nvram/Kconfig| 3 +++
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/mips/Kconfig b/hw/mips/Kconfig
index aadd436bf4e..bbc6b9c1d11 100644
--- a/hw/mips/Kconfig
+++ b/hw/mips/Kconfig
@@ -42,7 +42,7 @@ config LOONGSON3V
 select PCI_DEVICES
 select PCI_EXPRESS_GENERIC_BRIDGE
 select MSI_NONBROKEN
-select FW_CFG_MIPS
+select FW_CFG
 
 config MIPS_CPS
 bool
@@ -50,6 +50,3 @@ config MIPS_CPS
 
 config MIPS_BOSTON
 bool
-
-config FW_CFG_MIPS
-bool
diff --git a/hw/mips/meson.build b/hw/mips/meson.build
index 1195716dc73..893e56f7453 100644
--- a/hw/mips/meson.build
+++ b/hw/mips/meson.build
@@ -1,6 +1,6 @@
 mips_ss = ss.source_set()
 mips_ss.add(files('bootloader.c', 'mips_int.c'))
-mips_ss.add(when: 'CONFIG_FW_CFG_MIPS', if_true: files('fw_cfg.c'))
+mips_ss.add(when: 'CONFIG_FW_CFG', if_true: files('fw_cfg.c'))
 mips_ss.add(when: 'CONFIG_FULOONG', if_true: files('fuloong2e.c'))
 mips_ss.add(when: 'CONFIG_LOONGSON3V', if_true: files('loongson3_bootp.c', 'loongson3_virt.c'))
 mips_ss.add(when: 'CONFIG_JAZZ', if_true: files('jazz.c'))
diff --git a/hw/nvram/Kconfig b/hw/nvram/Kconfig
index e872fcb1941..cab1070375f 100644
--- a/hw/nvram/Kconfig
+++ b/hw/nvram/Kconfig
@@ -1,3 +1,6 @@
+config FW_CFG
+bool
+
 config DS1225Y
 bool
 
-- 
2.26.3

[PATCH 0/7] hw/nvram/fw_cfg: Do not build device if not needed (Spring cleanup)

2021-04-26 Thread Philippe Mathieu-Daudé

Hi,

Quite a trivial series around fw_cfg:
- enforce the FW_CFG Kconfig symbol,
- add missing Kconfig dependencies,
- make explicit which machines use the fw_cfg device,
- allow targets not using the device to build without it.

Please review,

Phil.

Philippe Mathieu-Daudé (7):
  stubs: Restrict fw_cfg stubs to sysemu
  hw/nvram: Rename FW_CFG_MIPS as generic FW_CFG Kconfig symbol
  hw/nvram: Declare FW_CFG_DMA Kconfig symbol in hw/nvram/
  hw/acpi/vmgenid: Make ACPI_VMGENID depends on FW_CFG Kconfig
  hw: Have machines Kconfig-select FW_CFG
  hw/{arm,hppa,riscv}: Add fw_cfg arch-specific stub
  hw/nvram: Do not build FW_CFG if not required

 hw/arm/fw_cfg.c  | 19 +
 hw/hppa/fw_cfg.c | 19 +
 hw/riscv/fw_cfg.c| 19 +
 stubs/fw_cfg.c   | 49 ++--
 MAINTAINERS  |  2 +-
 hw/acpi/Kconfig  |  1 +
 hw/arm/Kconfig   |  2 ++
 hw/arm/meson.build   |  1 +
 hw/display/Kconfig   |  3 ---
 hw/hppa/Kconfig  |  1 +
 hw/hppa/meson.build  |  1 +
 hw/i386/Kconfig  |  2 ++
 hw/mips/Kconfig  |  5 +
 hw/mips/meson.build  |  2 +-
 hw/nvram/Kconfig |  7 +++
 hw/nvram/meson.build |  2 +-
 hw/ppc/Kconfig   |  1 +
 hw/riscv/meson.build |  1 +
 hw/sparc/Kconfig |  1 +
 hw/sparc64/Kconfig   |  1 +
 stubs/meson.build|  2 +-
 21 files changed, 128 insertions(+), 13 deletions(-)
 create mode 100644 hw/arm/fw_cfg.c
 create mode 100644 hw/hppa/fw_cfg.c
 create mode 100644 hw/riscv/fw_cfg.c

-- 
2.26.3

RE: [RFC PATCH 1/4] target/ppc: move opcode table logic to translate.c

2021-04-26 Thread Bruno Piazera Larsen

> > code motion to remove opcode callback table from
> > translate_init.c.inc to translate.c in preparation
> > to remove #include  from
> > translate.c
>
> I'd mention the creation of destroy_ppc_opcodes since this patch is not
> strictly just moving code.

Sure, will do for v2.

> > +#if defined(PPC_DUMP_CPU)
>
> The commented out define for this was left behind.

Good catch! The define is still going to be used by a couple of things in
cpu_init, though. I'm guessing moving it to internal.h is the best solution,
but correct me if I'm wrong.


Bruno Piazera Larsen

Instituto de Pesquisas 
ELDORADO

Departamento Computação Embarcada

Analista de Software Trainee


Re: [RFC v9 15/29] vfio: Set up nested stage mappings

2021-04-26 Thread Auger Eric

Hi Kunkun,

On 4/15/21 4:03 AM, Kunkun Jiang wrote:
> Hi Eric,
> 
> On 2021/4/14 16:05, Auger Eric wrote:
>> Hi Kunkun,
>>
>> On 4/14/21 3:45 AM, Kunkun Jiang wrote:
>>> On 2021/4/13 20:57, Auger Eric wrote:
 Hi Kunkun,

 On 4/13/21 2:10 PM, Kunkun Jiang wrote:
> Hi Eric,
>
> On 2021/4/11 20:08, Eric Auger wrote:
>> In nested mode, legacy vfio_iommu_map_notify cannot be used as
>> there is no "caching" mode and we do not trap on map.
>>
>> On Intel, vfio_iommu_map_notify was used to DMA map the RAM
>> through the host single stage.
>>
>> With nested mode, we need to setup the stage 2 and the stage 1
>> separately. This patch introduces a prereg_listener to setup
>> the stage 2 mapping.
>>
>> The stage 1 mapping, owned by the guest, is passed to the host
>> when the guest invalidates the stage 1 configuration, through
>> a dedicated PCIPASIDOps callback. Guest IOTLB invalidations
>> are cascaded downto the host through another IOMMU MR UNMAP
>> notifier.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> v7 -> v8:
>> - properly handle new IOMMUTLBEntry fields and especially
>>  propagate DOMAIN and PASID based invalidations
>>
>> v6 -> v7:
>> - remove PASID based invalidation
>>
>> v5 -> v6:
>> - add error_report_err()
>> - remove the abort in case of nested stage case
>>
>> v4 -> v5:
>> - use VFIO_IOMMU_SET_PASID_TABLE
>> - use PCIPASIDOps for config notification
>>
>> v3 -> v4:
>> - use iommu_inv_pasid_info for ASID invalidation
>>
>> v2 -> v3:
>> - use VFIO_IOMMU_ATTACH_PASID_TABLE
>> - new user API
>> - handle leaf
>>
>> v1 -> v2:
>> - adapt to uapi changes
>> - pass the asid
>> - pass IOMMU_NOTIFIER_S1_CFG when initializing the config notifier
>> ---
>>     hw/vfio/common.c | 139
>> +--
>>     hw/vfio/pci.c    |  21 +++
>>     hw/vfio/trace-events |   2 +
>>     3 files changed, 157 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 0cd7ef2139..e369d451e7 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -595,6 +595,73 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry
>> *iotlb, void **vaddr,
>>     return true;
>>     }
>>     +/* Propagate a guest IOTLB invalidation to the host (nested
>> mode) */
>> +static void vfio_iommu_unmap_notify(IOMMUNotifier *n, IOMMUTLBEntry
>> *iotlb)
>> +{
>> +    VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
>> +    struct vfio_iommu_type1_cache_invalidate ustruct = {};
>> +    VFIOContainer *container = giommu->container;
>> +    int ret;
>> +
>> +    assert(iotlb->perm == IOMMU_NONE);
>> +
>> +    ustruct.argsz = sizeof(ustruct);
>> +    ustruct.flags = 0;
>> +    ustruct.info.argsz = sizeof(struct iommu_cache_invalidate_info);
>> +    ustruct.info.version = IOMMU_CACHE_INVALIDATE_INFO_VERSION_1;
>> +    ustruct.info.cache = IOMMU_CACHE_INV_TYPE_IOTLB;
>> +
>> +    switch (iotlb->granularity) {
>> +    case IOMMU_INV_GRAN_DOMAIN:
>> +    ustruct.info.granularity = IOMMU_INV_GRANU_DOMAIN;
>> +    break;
>> +    case IOMMU_INV_GRAN_PASID:
>> +    {
>> +    struct iommu_inv_pasid_info *pasid_info;
>> +    int archid = -1;
>> +
>> +    pasid_info = _info;
>> +    ustruct.info.granularity = IOMMU_INV_GRANU_PASID;
>> +    if (iotlb->flags & IOMMU_INV_FLAGS_ARCHID) {
>> +    pasid_info->flags |= IOMMU_INV_ADDR_FLAGS_ARCHID;
>> +    archid = iotlb->arch_id;
>> +    }
>> +    pasid_info->archid = archid;
>> +    trace_vfio_iommu_asid_inv_iotlb(archid);
>> +    break;
>> +    }
>> +    case IOMMU_INV_GRAN_ADDR:
>> +    {
>> +    hwaddr start = iotlb->iova + giommu->iommu_offset;
>> +    struct iommu_inv_addr_info *addr_info;
>> +    size_t size = iotlb->addr_mask + 1;
>> +    int archid = -1;
>> +
>> +    addr_info = _info;
>> +    ustruct.info.granularity = IOMMU_INV_GRANU_ADDR;
>> +    if (iotlb->leaf) {
>> +    addr_info->flags |= IOMMU_INV_ADDR_FLAGS_LEAF;
>> +    }
>> +    if (iotlb->flags & IOMMU_INV_FLAGS_ARCHID) {
>> +    addr_info->flags |= IOMMU_INV_ADDR_FLAGS_ARCHID;
>> +    archid = iotlb->arch_id;
>> +    }
>> +    addr_info->archid = archid;
>> +    addr_info->addr = start;
>> +    addr_info->granule_size = size;
>> +    addr_info->nb_granules = 1;
>> +    trace_vfio_iommu_addr_inv_iotlb(archid, start, size,
>> +    1, iotlb->leaf);
>> +    break;
>>

Re: [RFC PATCH 1/4] target/ppc: move opcode table logic to translate.c

2021-04-26 Thread Fabiano Rosas

"Bruno Larsen (billionai)"  writes:

> code motion to remove opcode callback table from
> translate_init.c.inc to translate.c in preparation
> to remove #include  from
> translate.c

I'd mention the creation of destroy_ppc_opcodes since this patch is not
strictly just moving code.

>
> Signed-off-by: Bruno Larsen (billionai) 
> ---
>  target/ppc/internal.h   |   6 +
>  target/ppc/translate.c  | 394 
>  target/ppc/translate_init.c.inc | 390 +--
>  3 files changed, 401 insertions(+), 389 deletions(-)



> +void destroy_ppc_opcodes(PowerPCCPU *cpu)
> +{
> +opc_handler_t **table, **table_2;
> +int i, j, k;
> +
> +for (i = 0; i < PPC_CPU_OPCODES_LEN; i++) {
> +if (cpu->opcodes[i] == _handler) {
> +continue;
> +}
> +if (is_indirect_opcode(cpu->opcodes[i])) {
> +table = ind_table(cpu->opcodes[i]);
> +for (j = 0; j < PPC_CPU_INDIRECT_OPCODES_LEN; j++) {
> +if (table[j] == _handler) {
> +continue;
> +}
> +if (is_indirect_opcode(table[j])) {
> +table_2 = ind_table(table[j]);
> +for (k = 0; k < PPC_CPU_INDIRECT_OPCODES_LEN; k++) {
> +if (table_2[k] != _handler &&
> +is_indirect_opcode(table_2[k])) {
> +g_free((opc_handler_t *)((uintptr_t)table_2[k] &
> + ~PPC_INDIRECT));
> +}
> +}
> +g_free((opc_handler_t *)((uintptr_t)table[j] &
> + ~PPC_INDIRECT));
> +}
> +}
> +g_free((opc_handler_t *)((uintptr_t)cpu->opcodes[i] &
> +~PPC_INDIRECT));
> +}
> +}
> +}
> +
> +#if defined(PPC_DUMP_CPU)

The commented out define for this was left behind.

> +static void dump_ppc_insns(CPUPPCState *env)
> +{
> +opc_handler_t **table, *handler;
> +const char *p, *q;
> +uint8_t opc1, opc2, opc3, opc4;
> +
> +printf("Instructions set:\n");
> +/* opc1 is 6 bits long */
> +for (opc1 = 0x00; opc1 < PPC_CPU_OPCODES_LEN; opc1++) {
> +table = env->opcodes;
> +handler = table[opc1];
> +if (is_indirect_opcode(handler)) {
> +/* opc2 is 5 bits long */
> +for (opc2 = 0; opc2 < PPC_CPU_INDIRECT_OPCODES_LEN; opc2++) {
> +table = env->opcodes;
> +handler = env->opcodes[opc1];
> +table = ind_table(handler);
> +handler = table[opc2];
> +if (is_indirect_opcode(handler)) {
> +table = ind_table(handler);
> +/* opc3 is 5 bits long */
> +for (opc3 = 0; opc3 < PPC_CPU_INDIRECT_OPCODES_LEN;
> +opc3++) {
> +handler = table[opc3];
> +if (is_indirect_opcode(handler)) {
> +table = ind_table(handler);
> +/* opc4 is 5 bits long */
> +for (opc4 = 0; opc4 < 
> PPC_CPU_INDIRECT_OPCODES_LEN;
> + opc4++) {
> +handler = table[opc4];
> +if (handler->handler != _invalid) {
> +printf("INSN: %02x %02x %02x %02x -- "
> +   "(%02d %04d %02d) : %s\n",
> +   opc1, opc2, opc3, opc4,
> +   opc1, (opc3 << 5) | opc2, opc4,
> +   handler->oname);
> +}
> +}
> +} else {
> +if (handler->handler != _invalid) {
> +/* Special hack to properly dump SPE insns */
> +p = strchr(handler->oname, '_');
> +if (p == NULL) {
> +printf("INSN: %02x %02x %02x (%02d %04d) 
> : "
> +   "%s\n",
> +   opc1, opc2, opc3, opc1,
> +   (opc3 << 5) | opc2,
> +   handler->oname);
> +} else {
> +q = "speundef";
> +if ((p - handler->oname) != strlen(q)
> +|| (memcmp(handler->oname, q, 
> strlen(q))
> +!= 0)) {
> +/* First instruction */
> +printf("INSN:

Re: [PATCH] qapi: deprecate drive-backup

2021-04-26 Thread Vladimir Sementsov-Ogievskiy


26.04.2021 21:30, John Snow wrote:

On 4/26/21 2:05 PM, Daniel P. Berrangé wrote:

On Mon, Apr 26, 2021 at 09:00:36PM +0300, Vladimir Sementsov-Ogievskiy wrote:

26.04.2021 20:34, John Snow wrote:

On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:

The modern way is to use blockdev-add + blockdev-backup, which provides a
lot more control over how the target is opened.

As an example of the problems with drive-backup, consider the following:

A user of drive-backup expects the target to be opened in the same
cache and aio mode as the source. The corresponding logic is in
drive_backup_prepare(), where we take bs->open_flags of the source.

This works rather badly if the source was added by blockdev-add. Assume
the source is a qcow2 image. On blockdev-add we should specify the aio and
cache options for the file child of the qcow2 node. What happens next:

drive_backup_prepare() looks at bs->open_flags of the qcow2 source node.
But neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO is there: BDRV_O_NOCACHE
is placed in bs->file->bs->open_flags, and BDRV_O_NATIVE_AIO is nowhere,
as file-posix parses the options and simply sets s->use_linux_aio.
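
The mismatch described above can be modelled in a few lines. This is a hedged sketch with hypothetical names (`Node`, `O_NOCACHE_FLAG`, and both helpers), not the QEMU block layer: it assumes a format node whose cache flag was set only on its file child, which is the blockdev-add situation the commit message describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for a cache-related open flag. */
#define O_NOCACHE_FLAG 0x1u

/* Hypothetical block node: a format node points at its file child. */
typedef struct Node {
    unsigned open_flags;
    struct Node *file;
} Node;

/* What a caller sees if it inspects only the top (format) node. */
bool top_node_has_nocache(const Node *top)
{
    return (top->open_flags & O_NOCACHE_FLAG) != 0;
}

/* What is actually in effect once the whole chain is considered. */
bool chain_has_nocache(const Node *top)
{
    for (const Node *n = top; n != NULL; n = n->file) {
        if (n->open_flags & O_NOCACHE_FLAG) {
            return true;
        }
    }
    return false;
}
```

With the flag set only on the file child, the first helper reports false while the second reports true, which mirrors why copying the flags of the source's top node gives the backup target different cache settings than the source really uses.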



No complaints from me, especially if Virtuozzo is on board. I would like to see 
some documentation changes alongside this deprecation, though.


Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi all! I remember suggesting that we deprecate drive-backup some time ago,
and nobody complained. But that old patch was part of a series with
other, more questionable deprecations, and it never landed.

Let's finally deprecate what should have been deprecated long ago.

We have now hit a problem in our downstream, described in the commit message.
Downstream, I fixed it by simply enabling O_DIRECT and linux_aio
unconditionally for the drive_backup target. But this really just shows
that using drive-backup in the blockdev era is a bad idea. So let's motivate
everyone (including Virtuozzo, of course) to move to the new interfaces and
avoid problems with all that outdated option inheritance.

   docs/system/deprecated.rst | 5 +
   qapi/block-core.json   | 5 -
   2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 80cae86252..b6f5766e17 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -186,6 +186,11 @@ Use the more generic commands ``block-export-add`` and ``block-export-del``
   instead.  As part of this deprecation, where ``nbd-server-add`` used a
   single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.
+``drive-backup`` (since 6.0)
+''''''''''''''''''''''''''''
+
+Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
+


1) Let's add a sphinx reference to 
https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup


2) Just a thought, not a request: We also may wish to update 
https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the new, 
preferred method. However, this doc is a bit old and is in need of an overhaul 
anyway (Especially to add the NBD pull workflow.) Since the doc is in need of 
an overhaul anyway, can we ask Kashyap to help us here, if he has time?


3) Let's add a small explanation here that outlines the differences in using 
these two commands. Here's a suggestion:

This change primarily separates the creation/opening process of the backup target with explicit, separate steps. 
BlockdevBackup uses mostly the same arguments as DriveBackup, except the "format" and "mode" 
options are removed in favor of using explicit "blockdev-create" and "blockdev-add" calls.

The "target" argument changes semantics. It no longer accepts filenames, and 
will now additionally accept arbitrary node names in addition to device names.


4) Also not a request: If we want to go above and beyond, it might be nice to 
spell out the exact steps required to transition from the old interface to the 
new one. Here's a (hasty) suggestion for how that might look:

- The MODE argument is deprecated.
   - "existing" is replaced by using "blockdev-add" commands.
   - "absolute-paths" is replaced by using "blockdev-add" and
     "blockdev-create" commands.

- The FORMAT argument is deprecated.
   - Format information is given to "blockdev-add"/"blockdev-create".

- The TARGET argument has new semantics:
   - Filenames are no longer supported, use blockdev-add/blockdev-create
     as necessary instead.
   - Device targets remain supported.


Example:

drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME becomes:

(taking some liberties with syntax to just illustrate the idea ...)

blockdev-create options={
     "driver": "file",
     "filename": $FILENAME,
     "size": 0,
}

blockdev-add arguments={
     "driver": "file",
     "filename": $FILENAME,
     "node-name": "Example_Filenode0"
}

blockdev-create options={
     "driver": $FORMAT,
     "file": "Example_Filenode0",
     "size": $SIZE,
}

blockdev-add arguments={
     "driver": $FORMAT,
     "file":

[PATCH v4] target/ppc: code motion from translate_init.c.inc to gdbstub.c

2021-04-26 Thread Bruno Larsen (billionai)

All the code related to gdb has been moved from translate_init.c.inc
file to the gdbstub.c file, where it makes more sense.

Version 4 fixes the omission of internal.h in gdbstub, mentioned in
<87sg3d2gf5@linux.ibm.com>, and the extra blank line.

Signed-off-by: Bruno Larsen (billionai) 
Suggested-by: Fabiano Rosas 
---
 target/ppc/gdbstub.c| 258 
 target/ppc/internal.h   |   5 +
 target/ppc/translate_init.c.inc | 254 +--
 3 files changed, 264 insertions(+), 253 deletions(-)

diff --git a/target/ppc/gdbstub.c b/target/ppc/gdbstub.c
index c28319fb97..94a7273ee0 100644
--- a/target/ppc/gdbstub.c
+++ b/target/ppc/gdbstub.c
@@ -20,6 +20,8 @@
 #include "qemu/osdep.h"
 #include "cpu.h"
 #include "exec/gdbstub.h"
+#include "exec/helper-proto.h"
+#include "internal.h"
 
 static int ppc_gdb_register_len_apple(int n)
 {
@@ -387,3 +389,259 @@ const char *ppc_gdb_get_dynamic_xml(CPUState *cs, const char *xml_name)
 return NULL;
 }
 #endif
+
+static bool avr_need_swap(CPUPPCState *env)
+{
+#ifdef HOST_WORDS_BIGENDIAN
+return msr_le;
+#else
+return !msr_le;
+#endif
+}
+
+#if !defined(CONFIG_USER_ONLY)
+static int gdb_find_spr_idx(CPUPPCState *env, int n)
+{
+int i;
+
+for (i = 0; i < ARRAY_SIZE(env->spr_cb); i++) {
+ppc_spr_t *spr = >spr_cb[i];
+
+if (spr->name && spr->gdb_id == n) {
+return i;
+}
+}
+return -1;
+}
+
+static int gdb_get_spr_reg(CPUPPCState *env, GByteArray *buf, int n)
+{
+    int reg;
+    int len;
+
+    reg = gdb_find_spr_idx(env, n);
+    if (reg < 0) {
+        return 0;
+    }
+
+    len = TARGET_LONG_SIZE;
+    gdb_get_regl(buf, env->spr[reg]);
+    ppc_maybe_bswap_register(env, gdb_get_reg_ptr(buf, len), len);
+    return len;
+}
+
+static int gdb_set_spr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
+{
+    int reg;
+    int len;
+
+    reg = gdb_find_spr_idx(env, n);
+    if (reg < 0) {
+        return 0;
+    }
+
+    len = TARGET_LONG_SIZE;
+    ppc_maybe_bswap_register(env, mem_buf, len);
+    env->spr[reg] = ldn_p(mem_buf, len);
+
+    return len;
+}
+#endif
+
+static int gdb_get_float_reg(CPUPPCState *env, GByteArray *buf, int n)
+{
+    uint8_t *mem_buf;
+    if (n < 32) {
+        gdb_get_reg64(buf, *cpu_fpr_ptr(env, n));
+        mem_buf = gdb_get_reg_ptr(buf, 8);
+        ppc_maybe_bswap_register(env, mem_buf, 8);
+        return 8;
+    }
+    if (n == 32) {
+        gdb_get_reg32(buf, env->fpscr);
+        mem_buf = gdb_get_reg_ptr(buf, 4);
+        ppc_maybe_bswap_register(env, mem_buf, 4);
+        return 4;
+    }
+    return 0;
+}
+
+static int gdb_set_float_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
+{
+    if (n < 32) {
+        ppc_maybe_bswap_register(env, mem_buf, 8);
+        *cpu_fpr_ptr(env, n) = ldq_p(mem_buf);
+        return 8;
+    }
+    if (n == 32) {
+        ppc_maybe_bswap_register(env, mem_buf, 4);
+        store_fpscr(env, ldl_p(mem_buf), 0x);
+        return 4;
+    }
+    return 0;
+}
+
+static int gdb_get_avr_reg(CPUPPCState *env, GByteArray *buf, int n)
+{
+    uint8_t *mem_buf;
+
+    if (n < 32) {
+        ppc_avr_t *avr = cpu_avr_ptr(env, n);
+        if (!avr_need_swap(env)) {
+            gdb_get_reg128(buf, avr->u64[0] , avr->u64[1]);
+        } else {
+            gdb_get_reg128(buf, avr->u64[1] , avr->u64[0]);
+        }
+        mem_buf = gdb_get_reg_ptr(buf, 16);
+        ppc_maybe_bswap_register(env, mem_buf, 8);
+        ppc_maybe_bswap_register(env, mem_buf + 8, 8);
+        return 16;
+    }
+    if (n == 32) {
+        gdb_get_reg32(buf, helper_mfvscr(env));
+        mem_buf = gdb_get_reg_ptr(buf, 4);
+        ppc_maybe_bswap_register(env, mem_buf, 4);
+        return 4;
+    }
+    if (n == 33) {
+        gdb_get_reg32(buf, (uint32_t)env->spr[SPR_VRSAVE]);
+        mem_buf = gdb_get_reg_ptr(buf, 4);
+        ppc_maybe_bswap_register(env, mem_buf, 4);
+        return 4;
+    }
+    return 0;
+}
+
+static int gdb_set_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
+{
+    if (n < 32) {
+        ppc_avr_t *avr = cpu_avr_ptr(env, n);
+        ppc_maybe_bswap_register(env, mem_buf, 8);
+        ppc_maybe_bswap_register(env, mem_buf + 8, 8);
+        if (!avr_need_swap(env)) {
+            avr->u64[0] = ldq_p(mem_buf);
+            avr->u64[1] = ldq_p(mem_buf + 8);
+        } else {
+            avr->u64[1] = ldq_p(mem_buf);
+            avr->u64[0] = ldq_p(mem_buf + 8);
+        }
+        return 16;
+    }
+    if (n == 32) {
+        ppc_maybe_bswap_register(env, mem_buf, 4);
+        helper_mtvscr(env, ldl_p(mem_buf));
+        return 4;
+    }
+    if (n == 33) {
+        ppc_maybe_bswap_register(env, mem_buf, 4);
+        env->spr[SPR_VRSAVE] = (target_ulong)ldl_p(mem_buf);
+        return 4;
+    }
+    return 0;
+}
+
+static int gdb_get_spe_reg(CPUPPCState *env, GByteArray *buf, int n)
+{
+    if (n < 32) {
+#if defined(TARGET_PPC64)
+        gdb_get_reg32(buf, env->gpr[n] >>

Re: [PATCH] qapi: deprecate drive-backup

2021-04-26 Thread John Snow


On 4/26/21 2:41 PM, Vladimir Sementsov-Ogievskiy wrote:

26.04.2021 21:30, John Snow wrote:

On 4/26/21 2:05 PM, Daniel P. Berrangé wrote:
On Mon, Apr 26, 2021 at 09:00:36PM +0300, Vladimir Sementsov-Ogievskiy wrote:

26.04.2021 20:34, John Snow wrote:

On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:

Modern way is using blockdev-add + blockdev-backup, which provides a
lot more control on how target is opened.

As example of drive-backup problems consider the following:

User of drive-backup expects that target will be opened in the same
cache and aio mode as source. Corresponding logic is in
drive_backup_prepare(), where we take bs->open_flags of source.

It works rather badly if the source was added by blockdev-add. Assume the
source is a qcow2 image. On blockdev-add we should specify aio and cache
options for the file child of the qcow2 node. What happens next:

drive_backup_prepare() looks at bs->open_flags of the qcow2 source node.
But there is neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO there:
BDRV_O_NOCACHE is placed in bs->file->bs->open_flags, and BDRV_O_NATIVE_AIO
is nowhere, as file-posix parses the options and simply sets
s->use_linux_aio.
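
The lookup problem described above can be illustrated with a toy model. This is only a sketch: the `Node` class, the flag values, and `flags_seen_by_drive_backup()` are hypothetical stand-ins, not QEMU's `BlockDriverState` or the real `BDRV_O_*` constants. It assumes, as the commit message says, that only the top-level source node's `open_flags` are consulted.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative flag values only; the real BDRV_O_* constants are defined in
# QEMU's C headers and are not reproduced here.
BDRV_O_NOCACHE = 0x1
BDRV_O_NATIVE_AIO = 0x2


@dataclass
class Node:
    """Toy stand-in for a block node (hypothetical, not BlockDriverState)."""
    name: str
    open_flags: int = 0
    use_linux_aio: bool = False      # models the file-posix private setting
    file: Optional["Node"] = None    # models bs->file->bs


def flags_seen_by_drive_backup(source: Node) -> int:
    """Mimic the described behavior: only the top node's flags are read."""
    return source.open_flags


# blockdev-add style setup: cache and aio options were given for the file child.
file_node = Node("file0", open_flags=BDRV_O_NOCACHE, use_linux_aio=True)
qcow2_node = Node("fmt0", file=file_node)

inherited = flags_seen_by_drive_backup(qcow2_node)
print(bool(inherited & BDRV_O_NOCACHE))     # False
print(bool(inherited & BDRV_O_NATIVE_AIO))  # False
```

In this model the cache setting lives one level down and the aio setting is not a flag at all, so a target opened with `inherited` gets neither.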



No complaints from me, especially if Virtuozzo is on board. I would like to
see some documentation changes alongside this deprecation, though.


Signed-off-by: Vladimir Sementsov-Ogievskiy 


---

Hi all! I remember, I suggested to deprecate drive-backup some time ago,
and nobody complained. But that old patch was inside a series with
other more questionable deprecations and it did not land.

Let's finally deprecate what should have been deprecated long ago.

We now faced a problem in our downstream, described in the commit message.
In downstream I've fixed it by simply enabling O_DIRECT and linux_aio
unconditionally for the drive_backup target. But actually this just shows
that using drive-backup in the blockdev era is a bad idea. So let's motivate
everyone (including Virtuozzo of course) to move to the new interfaces and
avoid problems with all that outdated option inheritance.

   docs/system/deprecated.rst | 5 +
   qapi/block-core.json   | 5 -
   2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 80cae86252..b6f5766e17 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -186,6 +186,11 @@ Use the more generic commands ``block-export-add`` and ``block-export-del``
   instead.  As part of this deprecation, where ``nbd-server-add`` used a
   single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.

+``drive-backup`` (since 6.0)
+
+
+Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
+


1) Let's add a sphinx reference to 
https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup 




2) Just a thought, not a request: We also may wish to update 
https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the 
new, preferred method. However, this doc is a bit old and is in 
need of an overhaul anyway (Especially to add the NBD pull 
workflow.) Since the doc is in need of an overhaul anyway, can we 
ask Kashyap to help us here, if he has time?



3) Let's add a small explanation here that outlines the differences 
in using these two commands. Here's a suggestion:


This change primarily separates the creation/opening process of the 
backup target with explicit, separate steps. BlockdevBackup uses 
mostly the same arguments as DriveBackup, except the "format" and 
"mode" options are removed in favor of using explicit 
"blockdev-create" and "blockdev-add" calls.




(Here, I accidentally used the names of the argument objects instead of 
the names of the commands. It's likely better to spell out the names of 
the commands instead.)


The "target" argument changes semantics. It no longer accepts 
filenames, and will now additionally accept arbitrary node names in 
addition to device names.



4) Also not a request: If we want to go above and beyond, it might 
be nice to spell out the exact steps required to transition from 
the old interface to the new one. Here's a (hasty) suggestion for 
how that might look:


- The MODE argument is deprecated.
   - "existing" is replaced by using "blockdev-add" commands.
   - "absolute-paths" is replaced by using "blockdev-add" and
     "blockdev-create" commands.

- The FORMAT argument is deprecated.
   - Format information is given to "blockdev-add"/"blockdev-create".

- The TARGET argument has new semantics:
   - Filenames are no longer supported, use blockdev-add/blockdev-create
     as necessary instead.
   - Device targets remain supported.


Example:

drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME becomes:

(taking some liberties with syntax to just illustrate the idea ...)

blockdev-create options={
     "driver": "file",
     "filename": $FILENAME,
     "size": 0,
}

blockdev-add arguments={

Re: [PATCH] qapi: deprecate drive-backup

2021-04-26 Thread John Snow


On 4/26/21 2:05 PM, Daniel P. Berrangé wrote:

On Mon, Apr 26, 2021 at 09:00:36PM +0300, Vladimir Sementsov-Ogievskiy wrote:

26.04.2021 20:34, John Snow wrote:

On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:

Modern way is using blockdev-add + blockdev-backup, which provides a
lot more control on how target is opened.

As example of drive-backup problems consider the following:

User of drive-backup expects that target will be opened in the same
cache and aio mode as source. Corresponding logic is in
drive_backup_prepare(), where we take bs->open_flags of source.

It works rather badly if the source was added by blockdev-add. Assume the
source is a qcow2 image. On blockdev-add we should specify aio and cache
options for the file child of the qcow2 node. What happens next:

drive_backup_prepare() looks at bs->open_flags of the qcow2 source node.
But there is neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO there:
BDRV_O_NOCACHE is placed in bs->file->bs->open_flags, and BDRV_O_NATIVE_AIO
is nowhere, as file-posix parses the options and simply sets
s->use_linux_aio.



No complaints from me, especially if Virtuozzo is on board. I would like to see 
some documentation changes alongside this deprecation, though.


Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi all! I remember, I suggested to deprecate drive-backup some time ago,
and nobody complained. But that old patch was inside a series with
other more questionable deprecations and it did not land.

Let's finally deprecate what should have been deprecated long ago.

We now faced a problem in our downstream, described in the commit message.
In downstream I've fixed it by simply enabling O_DIRECT and linux_aio
unconditionally for the drive_backup target. But actually this just shows
that using drive-backup in the blockdev era is a bad idea. So let's motivate
everyone (including Virtuozzo of course) to move to the new interfaces and
avoid problems with all that outdated option inheritance.

   docs/system/deprecated.rst | 5 +
   qapi/block-core.json   | 5 -
   2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 80cae86252..b6f5766e17 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -186,6 +186,11 @@ Use the more generic commands ``block-export-add`` and ``block-export-del``
   instead.  As part of this deprecation, where ``nbd-server-add`` used a
   single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.
+``drive-backup`` (since 6.0)
+
+
+Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
+


1) Let's add a sphinx reference to 
https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup


2) Just a thought, not a request: We also may wish to update 
https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the new, 
preferred method. However, this doc is a bit old and is in need of an overhaul 
anyway (Especially to add the NBD pull workflow.) Since the doc is in need of 
an overhaul anyway, can we ask Kashyap to help us here, if he has time?


3) Let's add a small explanation here that outlines the differences in using 
these two commands. Here's a suggestion:

This change primarily separates the creation/opening process of the backup target with explicit, separate steps. 
BlockdevBackup uses mostly the same arguments as DriveBackup, except the "format" and "mode" 
options are removed in favor of using explicit "blockdev-create" and "blockdev-add" calls.

The "target" argument changes semantics. It no longer accepts filenames, and 
will now additionally accept arbitrary node names in addition to device names.


4) Also not a request: If we want to go above and beyond, it might be nice to 
spell out the exact steps required to transition from the old interface to the 
new one. Here's a (hasty) suggestion for how that might look:

- The MODE argument is deprecated.
   - "existing" is replaced by using "blockdev-add" commands.
   - "absolute-paths" is replaced by using "blockdev-add" and
     "blockdev-create" commands.

- The FORMAT argument is deprecated.
   - Format information is given to "blockdev-add"/"blockdev-create".

- The TARGET argument has new semantics:
   - Filenames are no longer supported, use blockdev-add/blockdev-create
     as necessary instead.
   - Device targets remain supported.


Example:

drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME becomes:

(taking some liberties with syntax to just illustrate the idea ...)

blockdev-create options={
     "driver": "file",
     "filename": $FILENAME,
     "size": 0,
}

blockdev-add arguments={
     "driver": "file",
     "filename": $FILENAME,
     "node-name": "Example_Filenode0"
}

blockdev-create options={
     "driver": $FORMAT,
     "file": "Example_Filenode0",
     "size": $SIZE,
}

blockdev-add arguments={
     "driver": $FORMAT,
     "file": "Example_Filenode0",
     "node-name":

Re: [PATCH 16/22] qapi/parser: add docstrings

2021-04-26 Thread John Snow


On 4/25/21 9:27 AM, Markus Armbruster wrote:

John Snow  writes:


Signed-off-by: John Snow 

---

My hubris is infinite.


Score one of the three principal virtues of a programmer ;)



It was written before the prior review, but I promise I am slowing down 
on adding these. I just genuinely left them to help remind myself how 
these modules are actually structured and work so that I will be able to 
"pop in" quickly in the future and make a tactical, informed edit.



OK, I only added a few -- to help me remember how the parser works at a glance.

Signed-off-by: John Snow 
---
  scripts/qapi/parser.py | 66 ++
  1 file changed, 66 insertions(+)

diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
index dbbd0fcbc2f..8fc77808ace 100644
--- a/scripts/qapi/parser.py
+++ b/scripts/qapi/parser.py
@@ -51,7 +51,24 @@ def __init__(self, parser: 'QAPISchemaParser', msg: str):
  
  
  class QAPISchemaParser:

+"""
+Performs parsing of a QAPI schema source file.


Actually, this parses one of two layers, see qapi-code-gen.txt section
"Schema syntax".  Pointing there might help.



It sort of parses one-and-a-half layers, but yes ... I know the 
distinction you're drawing here. This is *mostly* the JSON/AST level.


(With some upper-level or mid-level parsing for Pragmas and Includes.)

  
+:param fname: Path to the source file


Either "Source file name" or "Source pathname", please.  I prefer "file
name" for additional distance to "path" in the sense of a search path,
i.e. a list of directory names.



OK, I am not sure I have any ... prejudice about when to use which kind 
of description for these sorts of things. I'm happy to defer to you, but 
if there's some kind of existing standard vocabulary I'm trampling all 
over, feel free to point me to your preferred hacker dictionary.


Anyway, happy to adopt your phrasing here.


+:param previously_included:
+The absolute paths of previously included source files.


Either "absolute file name" or "absolute pathname".



OK.


+Only used by recursive calls to avoid re-parsing files.


Feels like detail, not sure it's needed here.



You're probably right, but I suppose I wanted to hint/suggest that it 
was not necessary to feed it this argument for the root schema, but it 
was crucial for the recursive calls.


(Earlier I mentioned possibly just passing the parent parser in: that 
helps eliminate some of this ambiguity, too.)



+:param incl_info:
+   `QAPISourceInfo` for the parent document.
+   This may be None if this is the root schema document.


Recommend s/This maybe //.

qapi-code-gen.txt calls a QAPI schema that uses include directives
"modular", and the included files "sub-modules".  s/root schema
document/root module/?



Sure. All in favor of phrasing consistency.

(By the way: I did write up a draft for converting qapi-code-gen.txt to 
ReST format, and if I had finished that, it might be nice to hotlink to 
it here. I stopped for now because I wanted to solidify some conventions 
on how to markup certain constructs first, and wanted ... not to 
overwhelm you with more doc-wrangling.)



+
+:ivar exprs: Resulting parsed expressions.
+:ivar docs: Resulting parsed documentation blocks.


Uh, why are these here?  A doc string is interface documentation...



These *are* interface. It is how callers are expected to get the results 
of parsing.


We could change that, of course, but that is absolutely how this class 
works today.



+
+:raise OSError: For problems opening the root schema document.
+:raise QAPIParseError: For JSON or QAPIDoc syntax problems.
+:raise QAPISemError: For various semantic issues with the schema.


Should callers care for the difference between QAPIParseError and
QAPISemError?



That's up to the caller, I suppose. I just dutifully reported the truth 
of the matter here.


(That's a real non-answer, I know.)

I could always document QAPISourceError instead, with a note about the 
subclasses used for completeness.


(The intent is that QAPIError is always assumed/implied to be sufficient 
for capturing absolutely everything raised directly by this package, if 
you want to ignore the meanings behind them.)
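
A caller's view of that hierarchy might look like the sketch below. The classes here are local stand-ins that only mirror the names and the parent/child relationships mentioned in this thread; the real definitions in scripts/qapi/ take different constructor arguments, and `load()` is a hypothetical helper.

```python
class QAPIError(Exception):
    """Stand-in base class: assumed to cover everything the package raises."""


class QAPISourceError(QAPIError):
    """Stand-in for errors tied to a schema source location."""


class QAPIParseError(QAPISourceError):
    """Stand-in for JSON or QAPIDoc syntax problems."""


class QAPISemError(QAPISourceError):
    """Stand-in for semantic problems with the schema."""


def load(kind: str) -> str:
    """Hypothetical loader that fails in one of two ways."""
    if kind == "syntax":
        raise QAPIParseError("expected '{'")
    if kind == "semantic":
        raise QAPISemError("unknown key")
    return "ok"


def describe(kind: str) -> str:
    try:
        return load(kind)
    except QAPIParseError as err:
        # A caller that cares about the distinction catches the subclass first.
        return f"syntax: {err}"
    except QAPIError as err:
        # A caller that does not care can rely on the base class alone.
        return f"other: {err}"


print(describe("syntax"))    # syntax: expected '{'
print(describe("semantic"))  # other: unknown key
print(describe("fine"))      # ok
```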



+"""
  def __init__(self,
   fname: str,
   previously_included: Optional[Set[str]] = None,
@@ -77,6 +94,11 @@ def __init__(self,
  self._parse()
  
  def _parse(self) -> None:

+"""
+Parse the QAPI schema document.
+
+:return: None; results are stored in ``exprs`` and ``docs``.


Another ignorant doc string markup question...  how am I supposed to see
that exprs and docs are attributes, and not global variables?



I don't know, it's an unsolved mystery for me too. I need more time in 
the Sphinx dungeon to figure out how this stuff is supposed to work. 
You're right to wonder.



+"""
  cur_doc = None
  
  with

Re: [PATCH] qapi: deprecate drive-backup

2021-04-26 Thread Daniel P . Berrangé

On Mon, Apr 26, 2021 at 09:00:36PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 26.04.2021 20:34, John Snow wrote:
> > On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:
> > > Modern way is using blockdev-add + blockdev-backup, which provides a
> > > lot more control on how target is opened.
> > > 
> > > As example of drive-backup problems consider the following:
> > > 
> > > User of drive-backup expects that target will be opened in the same
> > > cache and aio mode as source. Corresponding logic is in
> > > drive_backup_prepare(), where we take bs->open_flags of source.
> > > 
> > > It works rather badly if the source was added by blockdev-add. Assume the
> > > source is a qcow2 image. On blockdev-add we should specify aio and cache
> > > options for the file child of the qcow2 node. What happens next:
> > > 
> > > drive_backup_prepare() looks at bs->open_flags of the qcow2 source node.
> > > But there is neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO there:
> > > BDRV_O_NOCACHE is placed in bs->file->bs->open_flags, and BDRV_O_NATIVE_AIO
> > > is nowhere, as file-posix parses the options and simply sets
> > > s->use_linux_aio.
> > > 
> > 
> > No complaints from me, especially if Virtuozzo is on board. I would like to 
> > see some documentation changes alongside this deprecation, though.
> > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > > ---
> > > 
> > > Hi all! I remember, I suggested to deprecate drive-backup some time ago,
> > > and nobody complained. But that old patch was inside a series with
> > > other more questionable deprecations and it did not land.
> > > 
> > > Let's finally deprecate what should have been deprecated long ago.
> > > 
> > > We now faced a problem in our downstream, described in the commit message.
> > > In downstream I've fixed it by simply enabling O_DIRECT and linux_aio
> > > unconditionally for the drive_backup target. But actually this just shows
> > > that using drive-backup in the blockdev era is a bad idea. So let's motivate
> > > everyone (including Virtuozzo of course) to move to the new interfaces and
> > > avoid problems with all that outdated option inheritance.
> > > 
> > >   docs/system/deprecated.rst | 5 +
> > >   qapi/block-core.json   | 5 -
> > >   2 files changed, 9 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> > > index 80cae86252..b6f5766e17 100644
> > > --- a/docs/system/deprecated.rst
> > > +++ b/docs/system/deprecated.rst
> > > @@ -186,6 +186,11 @@ Use the more generic commands ``block-export-add`` and ``block-export-del``
> > >   instead.  As part of this deprecation, where ``nbd-server-add`` used a
> > >   single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.
> > > +``drive-backup`` (since 6.0)
> > > +
> > > +
> > > +Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
> > > +
> > 
> > 1) Let's add a sphinx reference to 
> > https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup
> > 
> > 
> > 2) Just a thought, not a request: We also may wish to update 
> > https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the new, 
> > preferred method. However, this doc is a bit old and is in need of an 
> > overhaul anyway (Especially to add the NBD pull workflow.) Since the doc is 
> > in need of an overhaul anyway, can we ask Kashyap to help us here, if he 
> > has time?
> > 
> > 
> > 3) Let's add a small explanation here that outlines the differences in 
> > using these two commands. Here's a suggestion:
> > 
> > This change primarily separates the creation/opening process of the backup 
> > target with explicit, separate steps. BlockdevBackup uses mostly the same 
> > arguments as DriveBackup, except the "format" and "mode" options are 
> > removed in favor of using explicit "blockdev-create" and "blockdev-add" 
> > calls.
> > 
> > The "target" argument changes semantics. It no longer accepts filenames, 
> > and will now additionally accept arbitrary node names in addition to device 
> > names.
> > 
> > 
> > 4) Also not a request: If we want to go above and beyond, it might be nice 
> > to spell out the exact steps required to transition from the old interface 
> > to the new one. Here's a (hasty) suggestion for how that might look:
> > 
> > - The MODE argument is deprecated.
> >   - "existing" is replaced by using "blockdev-add" commands.
> >   - "absolute-paths" is replaced by using "blockdev-add" and
> >     "blockdev-create" commands.
> > 
> > - The FORMAT argument is deprecated.
> >   - Format information is given to "blockdev-add"/"blockdev-create".
> > 
> > - The TARGET argument has new semantics:
> >   - Filenames are no longer supported, use blockdev-add/blockdev-create
> >     as necessary instead.
> >   - Device targets remain supported.
> > 
> > 
> > Example:
> > 
> > drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME becomes:
> > 
> > (taking some liberties

Re: [PATCH] qapi: deprecate drive-backup

2021-04-26 Thread Vladimir Sementsov-Ogievskiy


26.04.2021 20:34, John Snow wrote:

On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:

Modern way is using blockdev-add + blockdev-backup, which provides a
lot more control on how target is opened.

As example of drive-backup problems consider the following:

User of drive-backup expects that target will be opened in the same
cache and aio mode as source. Corresponding logic is in
drive_backup_prepare(), where we take bs->open_flags of source.

It works rather badly if the source was added by blockdev-add. Assume the
source is a qcow2 image. On blockdev-add we should specify aio and cache
options for the file child of the qcow2 node. What happens next:

drive_backup_prepare() looks at bs->open_flags of the qcow2 source node.
But there is neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO there:
BDRV_O_NOCACHE is placed in bs->file->bs->open_flags, and BDRV_O_NATIVE_AIO
is nowhere, as file-posix parses the options and simply sets
s->use_linux_aio.



No complaints from me, especially if Virtuozzo is on board. I would like to see 
some documentation changes alongside this deprecation, though.


Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi all! I remember, I suggested to deprecate drive-backup some time ago,
and nobody complained. But that old patch was inside a series with
other more questionable deprecations and it did not land.

Let's finally deprecate what should have been deprecated long ago.

We now faced a problem in our downstream, described in the commit message.
In downstream I've fixed it by simply enabling O_DIRECT and linux_aio
unconditionally for the drive_backup target. But actually this just shows
that using drive-backup in the blockdev era is a bad idea. So let's motivate
everyone (including Virtuozzo of course) to move to the new interfaces and
avoid problems with all that outdated option inheritance.

  docs/system/deprecated.rst | 5 +
  qapi/block-core.json   | 5 -
  2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 80cae86252..b6f5766e17 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -186,6 +186,11 @@ Use the more generic commands ``block-export-add`` and ``block-export-del``
  instead.  As part of this deprecation, where ``nbd-server-add`` used a
  single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.
+``drive-backup`` (since 6.0)
+
+
+Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
+


1) Let's add a sphinx reference to 
https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup


2) Just a thought, not a request: We also may wish to update 
https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the new, 
preferred method. However, this doc is a bit old and is in need of an overhaul 
anyway (Especially to add the NBD pull workflow.) Since the doc is in need of 
an overhaul anyway, can we ask Kashyap to help us here, if he has time?


3) Let's add a small explanation here that outlines the differences in using 
these two commands. Here's a suggestion:

This change primarily separates the creation/opening process of the backup target with explicit, separate steps. 
BlockdevBackup uses mostly the same arguments as DriveBackup, except the "format" and "mode" 
options are removed in favor of using explicit "blockdev-create" and "blockdev-add" calls.

The "target" argument changes semantics. It no longer accepts filenames, and 
will now additionally accept arbitrary node names in addition to device names.


4) Also not a request: If we want to go above and beyond, it might be nice to 
spell out the exact steps required to transition from the old interface to the 
new one. Here's a (hasty) suggestion for how that might look:

- The MODE argument is deprecated.
  - "existing" is replaced by using "blockdev-add" commands.
  - "absolute-paths" is replaced by using "blockdev-add" and
    "blockdev-create" commands.

- The FORMAT argument is deprecated.
  - Format information is given to "blockdev-add"/"blockdev-create".

- The TARGET argument has new semantics:
  - Filenames are no longer supported, use blockdev-add/blockdev-create
    as necessary instead.
  - Device targets remain supported.


Example:

drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME becomes:

(taking some liberties with syntax to just illustrate the idea ...)

blockdev-create options={
    "driver": "file",
    "filename": $FILENAME,
    "size": 0,
}

blockdev-add arguments={
    "driver": "file",
    "filename": $FILENAME,
    "node-name": "Example_Filenode0"
}

blockdev-create options={
    "driver": $FORMAT,
    "file": "Example_Filenode0",
    "size": $SIZE,
}

blockdev-add arguments={
    "driver": $FORMAT,
    "file": "Example_Filenode0",
    "node-name": "Example_Formatnode0",
}

blockdev-backup arguments={
    $ARGS ...,
    "target": "Example_Formatnode0",
}
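
The five steps above could be written out as QMP-style JSON messages roughly as follows. This is a sketch under assumptions: `backup_commands()` is a hypothetical helper, the `job-id` values and the `sync` mode are made up for illustration, and the argument names should be checked against qapi/block-core.json. It only builds the messages; it does not talk to a QEMU instance.

```python
import json
from typing import Any, Dict, List


def backup_commands(filename: str, fmt: str, size: int, source: str,
                    sync: str = "full") -> List[Dict[str, Any]]:
    """Build the command sequence that would replace one drive-backup call."""
    file_node = "Example_Filenode0"
    fmt_node = "Example_Formatnode0"
    return [
        # 1. Create the (empty) protocol-level file.
        {"execute": "blockdev-create",
         "arguments": {"job-id": "create-file0",   # hypothetical job name
                       "options": {"driver": "file",
                                   "filename": filename,
                                   "size": 0}}},
        # 2. Open it as a node.
        {"execute": "blockdev-add",
         "arguments": {"driver": "file",
                       "filename": filename,
                       "node-name": file_node}},
        # 3. Format it.
        {"execute": "blockdev-create",
         "arguments": {"job-id": "create-fmt0",    # hypothetical job name
                       "options": {"driver": fmt,
                                   "file": file_node,
                                   "size": size}}},
        # 4. Open the formatted image as a node.
        {"execute": "blockdev-add",
         "arguments": {"driver": fmt,
                       "file": file_node,
                       "node-name": fmt_node}},
        # 5. Back up to that node instead of to a filename.
        {"execute": "blockdev-backup",
         "arguments": {"device": source,
                       "target": fmt_node,
                       "sync": sync}},
    ]


cmds = backup_commands("/tmp/backup.qcow2", "qcow2", 1 << 30, "drive0")
for cmd in cmds:
    print(json.dumps(cmd))
```

Each blockdev-create starts a job, so a real client would have to wait for that job to conclude before issuing the blockdev-add that follows it; the sketch leaves that out.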



Good ideas. Hmm. Do you think that the whole

Re: [PATCH 12/22] qapi/parser: add type hint annotations

2021-04-26 Thread John Snow


On 4/25/21 8:34 AM, Markus Armbruster wrote:

John Snow  writes:


Annotations do not change runtime behavior.
This commit *only* adds annotations.

(Annotations for QAPIDoc are in a later commit.)

Signed-off-by: John Snow 
---
  scripts/qapi/parser.py | 61 --
  1 file changed, 41 insertions(+), 20 deletions(-)

diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
index d02a134aae9..f2b57d5642a 100644
--- a/scripts/qapi/parser.py
+++ b/scripts/qapi/parser.py
@@ -17,16 +17,29 @@
  from collections import OrderedDict
  import os
  import re
-from typing import List
+from typing import (
+Dict,
+List,
+Optional,
+Set,
+Union,
+)
  
  from .common import match_nofail

  from .error import QAPISemError, QAPISourceError
  from .source import QAPISourceInfo
  
  
+#: Represents a parsed JSON object; semantically: one QAPI schema expression.

+Expression = Dict[str, object]


I believe you use this for what qapi-code-gen.txt calls a top-level
expression.  TopLevelExpression is rather long, but it's used just once,
and once more in RFC PATCH 13.  What do you think?



Yeah, I left a comment on gitlab about this -- Sorry for splitting the 
stream, I didn't expect you to reply on-list without at least clicking 
the link ;)


You're right, this is TOP LEVEL EXPR. I actually do mean to start using 
it in expr.py as well, in what will become (I think) pt5c: 
non-immediately-necessary parser cleanups.


I can use TopLevelExpression as the type name if you'd like, but I am 
open to suggestions for something shorter if "Expression" is way too 
overloaded/confusing.



+
+# Return value alias for get_expr().
+_ExprValue = Union[List[object], Dict[str, object], str, bool]


This is essentially a node in our pidgin-JSON parser's abstract syntax
tree.  Tree roots use the Dict branch of this Union.

See also my review of PATCH 06.



OK, I skimmed that one for now but I'll get back to it.


+
+
  class QAPIParseError(QAPISourceError):
  """Error class for all QAPI schema parsing errors."""
-def __init__(self, parser, msg):
+def __init__(self, parser: 'QAPISchemaParser', msg: str):


Forward reference needs quotes.  Can't be helped.
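
A minimal sketch of the situation, with hypothetical `ParseError` and `Parser` classes standing in for the real ones: the error class is defined before the parser class it refers to, so the annotation is written as a string.

```python
class ParseError(Exception):
    # Parser is defined further down. Written as a string, the annotation is
    # not looked up when this def is evaluated; a bare name may fail with
    # NameError at definition time, depending on the Python version and on
    # whether `from __future__ import annotations` is in effect.
    def __init__(self, parser: 'Parser', msg: str):
        super().__init__(f"{parser.fname}: {msg}")


class Parser:
    def __init__(self, fname: str):
        self.fname = fname


err = ParseError(Parser("schema.json"), "unexpected token")
print(err)  # schema.json: unexpected token
```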


  col = 1
  for ch in parser.src[parser.line_pos:parser.pos]:
  if ch == '\t':
@@ -38,7 +51,10 @@ def __init__(self, parser, msg):
  
  class QAPISchemaParser:
  
-def __init__(self, fname, previously_included=None, incl_info=None):

+def __init__(self,
+ fname: str,
+ previously_included: Optional[Set[str]] = None,


This needs to be Optional[] because using the empty set as default
parameter value would be a dangerous trap.  Python's choice to evaluate
the default parameter value just once has always been iffy.  Stirring
static typing into the language makes it iffier.  Can't be helped.
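
The trap can be shown in a few lines. This is a sketch with hypothetical function names; `seen or set()` mirrors the `previously_included or set()` idiom used in the patch.

```python
from typing import Optional, Set


def include_bad(fname: str, seen: Set[str] = set()) -> Set[str]:
    # The default set is created once, when the def is evaluated, and is
    # then shared by every call that omits `seen`.
    seen.add(fname)
    return seen


def include_good(fname: str, seen: Optional[Set[str]] = None) -> Set[str]:
    # None as the default forces a fresh set on each call that omits `seen`.
    seen = seen or set()
    seen.add(fname)
    return seen


first = include_bad("a.json")
second = include_bad("b.json")
print(first is second)                 # True
print(sorted(second))                  # ['a.json', 'b.json']
print(sorted(include_good("b.json")))  # ['b.json']
```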



We could force it to accept a tuple and convert it into a set 
internally. It's just that we seem to use it for sets now.


Or ... in pt5c I float the idea of just passing the parent parser in, 
and I reach up and grab the previously-included stuff directly.



+ incl_info: Optional[QAPISourceInfo] = None):
  self._fname = fname
  self._included = previously_included or set()
  self._included.add(os.path.abspath(self._fname))
@@ -46,20 +62,20 @@ def __init__(self, fname, previously_included=None, incl_info=None):
  
  # Lexer state (see `accept` for details):

  self.info = QAPISourceInfo(self._fname, incl_info)
-self.tok = None
+self.tok: Optional[str] = None


Would

self.tok: str

work?



Not without modifications, because the Token being None is used to 
represent EOF.



  self.pos = 0
  self.cursor = 0
-self.val = None
+self.val: Optional[Union[bool, str]] = None
  self.line_pos = 0
  
  # Parser output:

-self.exprs = []
-self.docs = []
+self.exprs: List[Expression] = []
+self.docs: List[QAPIDoc] = []
  
  # Showtime!

  self._parse()
  
-def _parse(self):

+def _parse(self) -> None:
  cur_doc = None
  
  with open(self._fname, 'r', encoding='utf-8') as fp:

@@ -122,7 +138,7 @@ def _parse(self):
  self.reject_expr_doc(cur_doc)
  
  @staticmethod

-def reject_expr_doc(doc):
+def reject_expr_doc(doc: Optional['QAPIDoc']) -> None:
  if doc and doc.symbol:
  raise QAPISemError(
  doc.info,
@@ -130,10 +146,14 @@ def reject_expr_doc(doc):
  % doc.symbol)
  
  @staticmethod

-def _include(include, info, incl_fname, previously_included):
+def _include(include: str,
+ info: QAPISourceInfo,
+ incl_fname: str,
+ previously_included: Set[str]
+ ) -> Optional['QAPISchemaParser']:

Re: [PATCH 10/22] qapi/parser: Fix typing of token membership tests

2021-04-26 Thread John Snow


On 4/25/21 3:59 AM, Markus Armbruster wrote:

John Snow  writes:


When the token can be None, we can't use 'x in "abc"' style membership
tests to group types of tokens together, because 'None in "abc"' is a
TypeError.

Easy enough to fix, if not a little ugly.

Signed-off-by: John Snow 
---
  scripts/qapi/parser.py | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
index 7f3c009f64b..16fd36f8391 100644
--- a/scripts/qapi/parser.py
+++ b/scripts/qapi/parser.py
@@ -272,7 +272,7 @@ def get_values(self):
  if self.tok == ']':
  self.accept()
  return expr
-if self.tok not in "{['tf":
+if self.tok is None or self.tok not in "{['tf":
  raise QAPIParseError(
  self, "expected '{', '[', ']', string, or boolean")
  while True:
@@ -294,7 +294,8 @@ def get_expr(self, nested):
  elif self.tok == '[':
  self.accept()
  expr = self.get_values()
-elif self.tok in "'tf":
+elif self.tok and self.tok in "'tf":
+assert isinstance(self.val, (str, bool))
  expr = self.val
  self.accept()
  else:


How can self.tok be None?

I suspect this is an artifact of PATCH 04.  Before, self.tok is
initialized to the first token, then set to subsequent tokens (all str)
in turn.  After, it's initialized to None, then set to tokens in turn.



Actually, it's set to None to represent EOF. See here:

elif self.tok == '\n':
if self.cursor == len(self.src):
self.tok = None
return
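
To make the failure mode concrete, here is a minimal standalone sketch. The helper name is_value_start is hypothetical and not part of parser.py; it only mirrors the guard used in the patch.

```python
from typing import Optional


def is_value_start(tok: Optional[str]) -> bool:
    """Return True if tok starts a string or boolean value."""
    # Guard first: a bare `None in "'tf"` raises TypeError at runtime,
    # and mypy rejects it for an Optional[str] operand.
    return tok is not None and tok in "'tf"


# Demonstrate the unguarded membership test failing on None.
try:
    None in "'tf"  # type: ignore
    raised = False
except TypeError:
    raised = True
```

With the guard in place, a None token (EOF) simply falls through to the error branch instead of crashing the membership test.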

A more pythonic idiom would be to create a lexer class that behaves as 
an iterator, yielding Token class objects, and eventually, raising 
StopIteration.


(Not suggesting I do that now. I have thought about it though, yes.)
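
For the record, a rough sketch of what that iterator-style lexer could look like. This is not the QAPI lexer: the Token class, the lex() function and the tiny token subset are all assumptions made for illustration, and there is no escape or comment handling.

```python
from typing import Iterator, NamedTuple, Union


class Token(NamedTuple):
    kind: str                              # '{', '}', ':', ',', "'", 't', 'f', ...
    value: Union[str, bool, None] = None   # payload for strings and booleans


def lex(src: str) -> Iterator[Token]:
    """Yield tokens for a tiny subset of the schema syntax."""
    pos = 0
    while pos < len(src):
        ch = src[pos]
        if ch.isspace():
            pos += 1
        elif ch in "{}[]:,":
            yield Token(ch)
            pos += 1
        elif ch == "'":
            end = src.index("'", pos + 1)   # no escape handling in this sketch
            yield Token("'", src[pos + 1:end])
            pos = end + 1
        elif src.startswith("true", pos):
            yield Token('t', True)
            pos += 4
        elif src.startswith("false", pos):
            yield Token('f', False)
            pos += 5
        else:
            raise ValueError(f"stray {ch!r} at offset {pos}")
    # Falling off the end ends the generator (StopIteration for the
    # caller), so EOF needs no None sentinel.


tokens = list(lex("{ 'key': true }"))
```

Because EOF is signalled by the iterator ending, the token type never needs to be Optional, which is what would make the membership guards unnecessary.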

--js

Re: [RFC] tcg plugin: Additional plugin interface

2021-04-26 Thread Alex Bennée



Min-Yih Hsu  writes:

> Hi Alex,
>
>> On Apr 23, 2021, at 8:44 AM, Alex Bennée  wrote:
>> 
>> 
>> Min-Yih Hsu  writes:
>> 
>>> Hi Alex and QEMU developers,
>>> 
>>> Recently I was working with the TCG plugin. I found that 
>>> `qemu_plugin_cb_flags` seems to reserve the functionality to
>>> read / write CPU register state, I'm wondering if you can share some
>>> roadmap or thoughts on this feature?
>> 
>> I think reading the CPU register state is certainly on the roadmap,
>> writing registers presents a more philosophical question of if it opens
>> the way to people attempting a GPL bypass via plugins. However reading
>> registers would certainly be a worthwhile addition to the API.
>
> Interesting… I’ve never thought about this problem before.
>
>> 
>>> Personally I see reading the CPU register state as (kind of) low-hanging 
>>> fruit. The most straightforward way to implement
>>> it will be adding another function that can be called by insn_exec 
>>> callbacks to read (v)CPU register values. What do you
>>> think about this?
>> 
>> It depends on your definition of low hanging fruit ;-)
>> 
>> Yes the implementation would be a simple helper which could be called
>> from a callback - I don't think we need to limit it to just insn_exec. I
>> think the challenge is providing a non-ugly API that works cleanly across
>> all the architectures. I'm not keen on exposing arbitrary gdb register
>> IDs to the plugins.
>> 
>> There has been some discussion previously on the list which is probably
>> worth reviewing:
>> 
>>  Date: Mon, 7 Dec 2020 16:03:24 -0500
>>  From: Aaron Lindsay 
>>  Subject: Plugin Register Accesses
>>  Message-ID: 
>> 
>> But in short I think we need a new subsystem in QEMU where frontends can
>> register registers (sic) and then provide a common API for various
>> users. This common subsystem would then be the source of data for:
>> 
>>  - plugins
>>  - gdbstub
>>  - monitor (info registers)
>>  - -d LOG_CPU logging
>> 
>> If you are interested in tackling such a project I'm certainly happy to
>> provide pointers and review.
>
> Thank you! Yeah I’m definitely going to scratch a prototype for this
> register reading plugin interface. I’ll take a look at related email
> discussions.

Awesome - please CC me on any patches you come up with (as well as
qemu-devel of course ;-).

>
> Best,
> -Min
>
>> 
>>> 
>>> Thank you
>>> -Min
>> 
>> 
>> -- 
>> Alex Bennée


-- 
Alex Bennée

Re: [PATCH 09/22] qapi: add match_nofail helper

2021-04-26 Thread John Snow


On 4/25/21 3:54 AM, Markus Armbruster wrote:

John Snow  writes:


Mypy cannot generally understand that these regex functions cannot
possibly fail. Add a _nofail helper that clarifies this for mypy.


Convention wants a blank line here.



Tooling failure.

stg pop -a
while stg push; and stg edit --sign; done

(Will fix, but not so sure about fixing the tool...)


Signed-off-by: John Snow 
---
  scripts/qapi/common.py |  8 +++-
  scripts/qapi/main.py   |  6 ++
  scripts/qapi/parser.py | 13 +++--
  3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/scripts/qapi/common.py b/scripts/qapi/common.py
index cbd3fd81d36..d38c1746767 100644
--- a/scripts/qapi/common.py
+++ b/scripts/qapi/common.py
@@ -12,7 +12,7 @@
  # See the COPYING file in the top-level directory.
  
  import re

-from typing import Optional, Sequence
+from typing import Match, Optional, Sequence
  
  
  #: Magic string that gets removed along with all space to its right.

@@ -210,3 +210,9 @@ def gen_endif(ifcond: Sequence[str]) -> str:
  #endif /* %(cond)s */
  ''', cond=ifc)
  return ret
+
+
+def match_nofail(pattern: str, string: str) -> Match[str]:
+match = re.match(pattern, string)
+assert match is not None
+return match


Name it must_match()?  You choose.



If you think it reads genuinely better, sure.


I wish we could have more static typing with less notational overhead,
but no free lunch...
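
A small usage sketch of the helper under the suggested name must_match(), assuming the body stays as in the patch. The pattern and input string are made up for illustration.

```python
import re
from typing import Match


def must_match(pattern: str, string: str) -> Match[str]:
    """re.match() for patterns that cannot fail to match."""
    match = re.match(pattern, string)
    assert match is not None   # narrows Optional[Match[str]] for mypy
    return match


# r'\w*' also matches the empty string, so re.match() never returns None
# here; the helper lets callers use .group() without an Optional check.
leading_word = must_match(r'\w*', 'foo-bar').group(0)
```

The assertion costs nothing in the success path and turns an impossible None into an immediate, obvious failure if a pattern is ever changed so that it can fail.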

[...]

Re: [PATCH 07/22] qapi/parser: assert object keys are strings

2021-04-26 Thread John Snow


On 4/25/21 3:27 AM, Markus Armbruster wrote:

John Snow  writes:


The single quote token implies the value is a string. Assert this to be
the case.

Signed-off-by: John Snow 
---
  scripts/qapi/parser.py | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
index 6b443b1247e..8d1fe0ddda5 100644
--- a/scripts/qapi/parser.py
+++ b/scripts/qapi/parser.py
@@ -246,6 +246,8 @@ def get_members(self):
  raise QAPIParseError(self, "expected string or '}'")
  while True:
  key = self.val
+assert isinstance(key, str)  # Guaranteed by tok == "'"
+
  self.accept()
  if self.tok != ':':
  raise QAPIParseError(self, "expected ':'")


The assertion is correct, but I wonder why mypy needs it.  Can you help?



The lexer value can also be True/False (Maybe None? I forget) based on 
the Token returned. Here, since the token was the single quote, we know 
that value must be a string.


Mypy has no insight into the correlation between the Token itself and 
the token value, because that relationship is not expressed via the type 
system.
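
A reduced illustration of that point, as a sketch only: key_from_token is a hypothetical stand-in for the relevant part of get_members(), with the token and value passed in explicitly.

```python
from typing import Union


def key_from_token(tok: str, val: Union[str, bool, None]) -> str:
    """Return an object key; only a "'" token carries a string value."""
    if tok != "'":
        raise ValueError("expected string")
    # Checking tok tells mypy nothing about val: it is still
    # Union[str, bool, None] here, so the declared return type would fail.
    assert isinstance(val, str)   # narrows val to str
    return val
```

The isinstance() assertion is the only place where the "token kind implies value type" relationship becomes visible to the type checker.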


--js

Re: [PATCH 05/22] qapi/parser: Assert lexer value is a string

2021-04-26 Thread John Snow


On 4/24/21 4:33 AM, Markus Armbruster wrote:

The second operand of assert provides no additional information.  Please
drop it.


I don't agree with "no additional information", strictly.

I left you a comment on gitlab before you started reviewing on-list. 
What I wrote there:


"Markus: I know you're not a fan of these, but I wanted a suggestion on 
how to explain why this must be true in case it wasn't obvious to 
someone else in the future."


--js

Re: [PATCH 03/22] qapi/source: Remove line number from QAPISourceInfo initializer

2021-04-26 Thread John Snow


On 4/24/21 2:38 AM, Markus Armbruster wrote:

Not mentioned in the commit message: you add a default parameter value.
It's not used; there's just one caller, and it passes a value.
Intentional?



No. Leftover from an earlier version where it was used. It can be made 
to always be an explicit parameter now instead.

[PATCH v5 cxl2.0-v3-doe 6/6] test/cdat: CXL CDAT test data

2021-04-26 Thread Chris Browy

From: hchkuo 

Pre-built CDAT table for testing, contains one CDAT header and six
CDAT entries: DSMAS, DSLBIS, DSMSCIS, DSIS, DSEMTS, and SSLBIS
respectively.

Signed-off-by: Chris Browy 
---
 tests/data/cdat/cdat.dat | Bin 0 -> 148 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 tests/data/cdat/cdat.dat

diff --git a/tests/data/cdat/cdat.dat b/tests/data/cdat/cdat.dat
new file mode 100644
index 
..b66c5d5836bcce7490e698f9ab5071c623425c48
GIT binary patch
literal 148
ycmbQjz`($`14zJu1e^tBD1c~21`KhqG!ugem_{a;892aP794t585EF}W3U1CI069x

literal 0
HcmV?d1

-- 
2.17.1

Re: [PATCH] qapi: deprecate drive-backup

2021-04-26 Thread John Snow


On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:

The modern way is to use blockdev-add + blockdev-backup, which provides
a lot more control over how the target is opened.

As an example of drive-backup problems, consider the following:

A user of drive-backup expects the target to be opened in the same
cache and aio mode as the source. The corresponding logic is in
drive_backup_prepare(), where we take bs->open_flags of the source.

This works rather badly if the source was added by blockdev-add. Assume
the source is a qcow2 image. On blockdev-add we should specify the aio
and cache options for the file child of the qcow2 node. What happens next:

drive_backup_prepare() looks at bs->open_flags of the qcow2 source node.
But neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO is set there:
BDRV_O_NOCACHE is placed in bs->file->bs->open_flags, and
BDRV_O_NATIVE_AIO is nowhere, as file-posix parses the options and
simply sets s->use_linux_aio.



No complaints from me, especially if Virtuozzo is on board. I would like 
to see some documentation changes alongside this deprecation, though.



Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi all! I remember suggesting to deprecate drive-backup some time ago,
and nobody complained. But that old patch was part of a series with
other, more questionable deprecations, and it never landed.

Let's finally deprecate what should have been deprecated long ago.

We have now hit a problem in our downstream, described in the commit
message. Downstream, I've fixed it by simply enabling O_DIRECT and
linux_aio unconditionally for the drive_backup target. But really this
just shows that using drive-backup in the blockdev era is a bad idea.
So let's motivate everyone (including Virtuozzo, of course) to move to
the new interfaces and avoid problems with all that outdated option
inheritance.

  docs/system/deprecated.rst | 5 +
  qapi/block-core.json   | 5 -
  2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 80cae86252..b6f5766e17 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -186,6 +186,11 @@ Use the more generic commands ``block-export-add`` and 
``block-export-del``
  instead.  As part of this deprecation, where ``nbd-server-add`` used a
  single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.
  
+``drive-backup`` (since 6.0)
+''''''''''''''''''''''''''''
+
+Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
+


1) Let's add a sphinx reference to 
https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup



2) Just a thought, not a request: We also may wish to update 
https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the new, 
preferred method. However, this doc is a bit old and needs an overhaul 
anyway (especially to add the NBD pull workflow), so can we ask Kashyap 
to help us here, if he has time?



3) Let's add a small explanation here that outlines the differences in 
using these two commands. Here's a suggestion:


This change primarily separates the creation/opening process of the 
backup target with explicit, separate steps. BlockdevBackup uses mostly 
the same arguments as DriveBackup, except the "format" and "mode" 
options are removed in favor of using explicit "blockdev-create" and 
"blockdev-add" calls.


The "target" argument changes semantics. It no longer accepts filenames, 
and now accepts arbitrary node names in addition to device names.



4) Also not a request: If we want to go above and beyond, it might be 
nice to spell out the exact steps required to transition from the old 
interface to the new one. Here's a (hasty) suggestion for how that might 
look:


- The MODE argument is deprecated.
  - "existing" is replaced by using "blockdev-add" commands.
  - "absolute-paths" is replaced by using "blockdev-add" and
"blockdev-create" commands.

- The FORMAT argument is deprecated.
  - Format information is given to "blockdev-add"/"blockdev-create".

- The TARGET argument has new semantics:
  - Filenames are no longer supported, use blockdev-add/blockdev-create
as necessary instead.
  - Device targets remain supported.


Example:

drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME becomes:

(taking some liberties with syntax to just illustrate the idea ...)

blockdev-create options={
"driver": "file",
"filename": $FILENAME,
"size": 0,
}

blockdev-add arguments={
"driver": "file",
"filename": $FILENAME,
"node-name": "Example_Filenode0"
}

blockdev-create options={
"driver": $FORMAT,
"file": "Example_Filenode0",
"size": $SIZE,
}

blockdev-add arguments={
"driver": $FORMAT,
"file": "Example_Filenode0",
"node-name": "Example_Formatnode0",
}

blockdev-backup arguments={
$ARGS ...,
"target": "Example_Formatnode0",
}
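
The same sequence written out as QMP messages, as a sketch rather than tested documentation. The node names follow the example above; the job IDs, the "sync" mode and the helper name backup_commands are assumptions for illustration. Each blockdev-create is a job that has to conclude (and be dismissed) before the next step, which this sketch does not model.

```python
import json
from typing import Any, Dict, List


def backup_commands(device: str, filename: str, fmt: str,
                    size: int) -> List[Dict[str, Any]]:
    """Build the QMP messages that replace one drive-backup call."""
    file_node = "Example_Filenode0"
    fmt_node = "Example_Formatnode0"
    return [
        # 1. Create the target file on the host.
        {"execute": "blockdev-create",
         "arguments": {"job-id": "create-file",      # hypothetical job ID
                       "options": {"driver": "file",
                                   "filename": filename,
                                   "size": 0}}},
        # 2. Open it as a protocol node.
        {"execute": "blockdev-add",
         "arguments": {"driver": "file",
                       "filename": filename,
                       "node-name": file_node}},
        # 3. Format it (replaces the old "format" argument).
        {"execute": "blockdev-create",
         "arguments": {"job-id": "create-format",    # hypothetical job ID
                       "options": {"driver": fmt,
                                   "file": file_node,
                                   "size": size}}},
        # 4. Open the format node.
        {"execute": "blockdev-add",
         "arguments": {"driver": fmt,
                       "file": file_node,
                       "node-name": fmt_node}},
        # 5. Run the backup against the node name, not a filename.
        {"execute": "blockdev-backup",
         "arguments": {"job-id": "backup0",          # hypothetical job ID
                       "device": device,
                       "target": fmt_node,
                       "sync": "full"}},
    ]


commands = backup_commands("drive0", "/tmp/backup.qcow2", "qcow2", 1 << 30)
wire = [json.dumps(cmd) for cmd in commands]
```

Spelling it out this way shows where the cache and aio options would go: on the blockdev-add of the file node, under the user's control rather than inherited from the source.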



  System accelerators
  ---
  
diff --git

[PATCH v5 cxl2.0-v3-doe 5/6] cxl/cdat: CXL CDAT Data Object Exchange implementation

2021-04-26 Thread Chris Browy

From: hchkuo 

The Data Object Exchange implementation of the CXL Coherent Device
Attribute Table (CDAT). This implementation refers to "Coherent Device
Attribute Table Specification, Rev. 1.02, Oct. 2020" and "Compute
Express Link Specification, Rev. 2.0, Oct. 2020".

The CDAT can be specified in two ways. One is to add ",cdat="
to the "-device cxl-type3" command-line option. The file must provide
the whole CDAT table in binary form. The other is to use the default
CDAT value created by build_cdat_table in hw/cxl/cxl-cdat.c.

A DOE capability for CDAT is added to hw/mem/cxl_type3.c at capability
offset 0x190. The OS can issue config reads/writes to this capability
range to request the CDAT data.

Signed-off-by: Chris Browy 
---
 hw/cxl/cxl-cdat.c  | 228 +
 hw/cxl/meson.build |   1 +
 hw/mem/cxl_type3.c |  57 -
 include/hw/cxl/cxl_cdat.h  | 149 +
 include/hw/cxl/cxl_component.h |   4 +
 include/hw/cxl/cxl_device.h|   1 +
 include/hw/cxl/cxl_pci.h   |   1 +
 7 files changed, 440 insertions(+), 1 deletion(-)
 create mode 100644 hw/cxl/cxl-cdat.c
 create mode 100644 include/hw/cxl/cxl_cdat.h

diff --git a/hw/cxl/cxl-cdat.c b/hw/cxl/cxl-cdat.c
new file mode 100644
index 00..3b86ecaddf
--- /dev/null
+++ b/hw/cxl/cxl-cdat.c
@@ -0,0 +1,228 @@
+/*
+ * CXL CDAT Structure
+ *
+ * Copyright (C) 2021 Avery Design Systems, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/pci/pci.h"
+#include "hw/cxl/cxl.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+
+static void build_cdat_table(void ***cdat_table, int *len) {
+struct cdat_dsmas *dsmas = g_malloc(sizeof(struct cdat_dsmas));
+struct cdat_dslbis *dslbis = g_malloc(sizeof(struct cdat_dslbis));
+struct cdat_dsmscis *dsmscis = g_malloc(sizeof(struct cdat_dsmscis));
+struct cdat_dsis *dsis = g_malloc(sizeof(struct cdat_dsis));
+struct cdat_dsemts *dsemts = g_malloc(sizeof(struct cdat_dsemts));
+struct cdat_sslbis {
+struct cdat_sslbis_header sslbis_header;
+struct cdat_sslbe sslbe[2];
+};
+struct cdat_sslbis *sslbis = g_malloc(sizeof(struct cdat_sslbis));
+void *__cdat_table[] = {
+dsmas,
+dslbis,
+dsmscis,
+dsis,
+dsemts,
+sslbis,
+};
+
+*dsmas = (struct cdat_dsmas){
+.header = {
+.type = CDAT_TYPE_DSMAS,
+.length = sizeof(struct cdat_dsmas),
+},
+.DSMADhandle = 0,
+.flags = 0,
+.DPA_base = 0,
+.DPA_length = 0,
+};
+*dslbis = (struct cdat_dslbis){
+.header = {
+.type = CDAT_TYPE_DSLBIS,
+.length = sizeof(struct cdat_dslbis),
+},
+.handle = 0,
+.flags = 0,
+.data_type = 0,
+.entry_base_unit = 0,
+};
+*dsmscis = (struct cdat_dsmscis){
+.header = {
+.type = CDAT_TYPE_DSMSCIS,
+.length = sizeof(struct cdat_dsmscis),
+},
+.DSMAS_handle = 0,
+.memory_side_cache_size = 0,
+.cache_attributes = 0,
+};
+*dsis = (struct cdat_dsis){
+.header = {
+.type = CDAT_TYPE_DSIS,
+.length = sizeof(struct cdat_dsis),
+},
+.flags = 0,
+.handle = 0,
+};
+*dsemts = (struct cdat_dsemts){
+.header = {
+.type = CDAT_TYPE_DSEMTS,
+.length = sizeof(struct cdat_dsemts),
+},
+.DSMAS_handle = 0,
+.EFI_memory_type_attr = 0,
+.DPA_offset = 0,
+.DPA_length = 0,
+};
+*sslbis = (struct cdat_sslbis){
+.sslbis_header = {
+.header = {
+.type = CDAT_TYPE_SSLBIS,
+.length = sizeof(sslbis->sslbis_header) +
+  sizeof(struct cdat_sslbe) * 2,
+},
+.data_type = 0,
+.entry_base_unit = 0,
+},
+.sslbe[0] = {
+.port_x_id = 0,
+.port_y_id = 0,
+.latency_bandwidth = 0,
+},
+.sslbe[1] = {
+.port_x_id = 0,
+.port_y_id = 0,
+.latency_bandwidth = 0,
+},
+};
+
+*len = ARRAY_SIZE(__cdat_table);
+*cdat_table = g_malloc0((*len) * sizeof(void *));
+memcpy(*cdat_table, __cdat_table, (*len) * sizeof(void *));
+}
+
+static void cdat_len_check(struct cdat_sub_header *hdr, Error **errp)
+{
+assert(hdr->length);
+assert(hdr->reserved == 0);
+
+switch (hdr->type) {
+case CDAT_TYPE_DSMAS:
+assert(hdr->length == sizeof(struct cdat_dsmas));
+break;
+case CDAT_TYPE_DSLBIS:
+assert(hdr->length == sizeof(struct cdat_dslbis));
+break;
+case CDAT_TYPE_DSMSCIS:
+assert(hdr->length == sizeof(struct

[PATCH v5 cxl2.0-v3-doe 4/6] cxl/compliance: CXL Compliance Data Object Exchange implementation

2021-04-26 Thread Chris Browy

From: hchkuo 

The Data Object Exchange implementation of CXL Compliance Mode refers
to "Compute Express Link (CXL) Specification, Rev. 2.0, Oct. 2020".

The data structures of the CXL compliance requests and responses are
added to the header. Due to the scope limitations of QEMU, most
compliance responses only return the corresponding length.

A DOE capability for CXL Compliance is added to hw/mem/cxl_type3.c at
capability offset 0x160. The OS can issue config reads/writes to this
capability range to request the Compliance info.

Signed-off-by: Chris Browy 
---
 hw/mem/cxl_type3.c  | 147 
 include/hw/cxl/cxl_compliance.h | 293 
 include/hw/cxl/cxl_component.h  |   3 +
 include/hw/cxl/cxl_device.h |   3 +
 include/hw/cxl/cxl_pci.h|   1 +
 5 files changed, 447 insertions(+)
 create mode 100644 include/hw/cxl/cxl_compliance.h

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index bf33ddb915..569872eb36 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -13,6 +13,134 @@
 #include "qemu/rcu.h"
 #include "sysemu/hostmem.h"
 #include "hw/cxl/cxl.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
+
+#define DWORD_BYTE 4
+
+bool cxl_doe_compliance_rsp(DOECap *doe_cap)
+{
+CompRsp *rsp = (doe_cap->pdev)->cxl_cstate.compliance.response;
+struct compliance_req_header *req = pcie_doe_get_write_mbox_ptr(doe_cap);
+uint32_t type, req_len = 0, rsp_len = 0;
+
+type = req->req_code;
+
+switch (type) {
+case CXL_COMP_MODE_CAP:
+req_len = sizeof(struct cxl_compliance_cap_req);
+rsp_len = sizeof(struct cxl_compliance_cap_rsp);
+rsp->cap_rsp.status = 0x0;
+rsp->cap_rsp.available_cap_bitmask = 0;
+rsp->cap_rsp.enabled_cap_bitmask = 0;
+break;
+case CXL_COMP_MODE_STATUS:
+req_len = sizeof(struct cxl_compliance_status_req);
+rsp_len = sizeof(struct cxl_compliance_status_rsp);
+rsp->status_rsp.cap_bitfield = 0;
+rsp->status_rsp.cache_size = 0;
+rsp->status_rsp.cache_size_units = 0;
+break;
+case CXL_COMP_MODE_HALT:
+req_len = sizeof(struct cxl_compliance_halt_req);
+rsp_len = sizeof(struct cxl_compliance_halt_rsp);
+break;
+case CXL_COMP_MODE_MULT_WR_STREAM:
+req_len = sizeof(struct cxl_compliance_multi_write_streaming_req);
+rsp_len = sizeof(struct cxl_compliance_multi_write_streaming_rsp);
+break;
+case CXL_COMP_MODE_PRO_CON:
+req_len = sizeof(struct cxl_compliance_producer_consumer_req);
+rsp_len = sizeof(struct cxl_compliance_producer_consumer_rsp);
+break;
+case CXL_COMP_MODE_BOGUS:
+req_len = sizeof(struct cxl_compliance_bogus_writes_req);
+rsp_len = sizeof(struct cxl_compliance_bogus_writes_rsp);
+break;
+case CXL_COMP_MODE_INJ_POISON:
+req_len = sizeof(struct cxl_compliance_inject_poison_req);
+rsp_len = sizeof(struct cxl_compliance_inject_poison_rsp);
+break;
+case CXL_COMP_MODE_INJ_CRC:
+req_len = sizeof(struct cxl_compliance_inject_crc_req);
+rsp_len = sizeof(struct cxl_compliance_inject_crc_rsp);
+break;
+case CXL_COMP_MODE_INJ_FC:
+req_len = sizeof(struct cxl_compliance_inject_flow_ctrl_req);
+rsp_len = sizeof(struct cxl_compliance_inject_flow_ctrl_rsp);
+break;
+case CXL_COMP_MODE_TOGGLE_CACHE:
+req_len = sizeof(struct cxl_compliance_toggle_cache_flush_req);
+rsp_len = sizeof(struct cxl_compliance_toggle_cache_flush_rsp);
+break;
+case CXL_COMP_MODE_INJ_MAC:
+req_len = sizeof(struct cxl_compliance_inject_mac_delay_req);
+rsp_len = sizeof(struct cxl_compliance_inject_mac_delay_rsp);
+break;
+case CXL_COMP_MODE_INS_UNEXP_MAC:
+req_len = sizeof(struct cxl_compliance_insert_unexp_mac_req);
+rsp_len = sizeof(struct cxl_compliance_insert_unexp_mac_rsp);
+break;
+case CXL_COMP_MODE_INJ_VIRAL:
+req_len = sizeof(struct cxl_compliance_inject_viral_req);
+rsp_len = sizeof(struct cxl_compliance_inject_viral_rsp);
+break;
+case CXL_COMP_MODE_INJ_ALMP:
+req_len = sizeof(struct cxl_compliance_inject_almp_req);
+rsp_len = sizeof(struct cxl_compliance_inject_almp_rsp);
+break;
+case CXL_COMP_MODE_IGN_ALMP:
+req_len = sizeof(struct cxl_compliance_ignore_almp_req);
+rsp_len = sizeof(struct cxl_compliance_ignore_almp_rsp);
+break;
+case CXL_COMP_MODE_INJ_BIT_ERR:
+req_len = sizeof(struct cxl_compliance_inject_bit_err_in_flit_req);
+rsp_len = sizeof(struct cxl_compliance_inject_bit_err_in_flit_rsp);
+break;
+default:
+break;
+}
+
+/* Discard if request length mismatched */
+if (pcie_doe_get_obj_len(req) < DIV_ROUND_UP(req_len, DWORD_BYTE)) {
+return false;
+}
+
+/* Common fields

Re: constant_tsc support for SVM guest

2021-04-26 Thread Marcelo Tosatti



Hi Wei, Eduardo,

On Fri, Apr 23, 2021 at 05:27:44PM -0400, Eduardo Habkost wrote:
> On Fri, Apr 23, 2021 at 12:32:00AM -0500, Wei Huang wrote:
> > There was a customer request for const_tsc support on AMD guests. Right now
> > this feature is turned off by default for QEMU x86 CPU types (in
> > CPUID_Fn80000007_EDX[8]). However we are seeing a discrepancy in guest VM
> > behavior between Intel and AMD.
> > 
> > In Linux kernel, Intel x86 code enables X86_FEATURE_CONSTANT_TSC based on
> > vCPU's family & model. So it ignores CPUID_Fn80000007_EDX[8] and guest VMs
> > have const_tsc enabled. On AMD, however, the kernel checks
> > CPUID_Fn80000007_EDX[8]. So const_tsc is disabled on AMD by default.

EAX=80000007h: Advanced Power Management Information
This function provides advanced power management feature identifiers. 
EDX bit 8 indicates support for invariant TSC. 

Intel documentation states:

"The time stamp counter in newer processors may support an enhancement,
referred to as invariant TSC. Processor's support for invariant TSC
is indicated by CPUID.80000007H:EDX[8]. The invariant TSC will run
at a constant rate in all ACPI P-, C-. and T-states. This is the
architectural behavior moving forward. On processors with invariant TSC
support, the OS may use the TSC for wall clock timer services (instead
of ACPI or HPET timers). TSC reads are much more efficient and do not
incur the overhead associated with a ring transition or access to a
platform resource."

X86_FEATURE_NONSTOP_TSC is enabled (on both Intel and AMD) by checking
the CPUID_Fn80000007_EDX[8] bit.
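
As a sketch of what "checking the bit" amounts to: the EDX value would come from executing CPUID with EAX=80000007h, which plain Python cannot do, so the hypothetical helper below takes the register value as an integer.

```python
CPUID_APM_LEAF = 0x80000007   # Advanced Power Management Information
INVTSC_BIT = 8                # EDX[8]: invariant TSC


def has_invariant_tsc(edx: int) -> bool:
    """Test EDX[8] of CPUID leaf 80000007h."""
    return bool((edx >> INVTSC_BIT) & 1)
```

A guest sees whatever QEMU puts in that leaf, which is why the "invtsc" CPU property decides the outcome on AMD.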

> Oh.  This seems to defeat the purpose of the invtsc migration
> blocker we have.
> 
> Do we know when this behavior was introduced in Linux?
> 
> > 
> > I am thinking of turning on invtsc for EPYC CPU types (see example below). Most
> > AMD server CPUs have supported invariant TSC for a long time. So this change
> > is compatible with the hardware behavior. The only problem is live migration
> > support, which will be blocked because of invtsc. However this problem
> > should be considered very minor because most server CPUs support TscRateMsr
> > (see CPUID_Fn8000000A_EDX[4]), allowing VMs to migrate among CPUs with
> > different TSC rates. This live migration restriction can be lifted as long
> > as the destination supports TscRateMsr or has the same frequency as the
> > source (QEMU/libvirt do it).

Yes.

> > [BTW I believe this migration limitation might be unnecessary because it is
> > apparently OK for Intel guests to ignore invtsc while claiming const_tsc.
> > Has anyone reported issues?]
> 
> CCing Marcelo, who originally added the migration blocker in QEMU.

The reasoning behind the migration blocker was to ensure that the
meaning of invariant TSC, as defined:

"The invariant TSC will run at a constant rate in all ACPI P-, C-. and T-states"

would be maintained across migration.

> > 
> > Do I miss anything here? Any comments about the proposal?
> > 
> > Thanks,
> > -Wei
> > 
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index ad99cad0e7..3c48266884 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -4077,6 +4076,21 @@ static X86CPUDefinition builtin_x86_defs[] = {
> >  { /* end of list */ }
> >  }
> >  },
> > +{
> > +.version = 4,
> > +.alias = "EPYC-IBPB",
> > +.props = (PropValue[]) {
> > +{ "ibpb", "on" },
> > +{ "perfctr-core", "on" },
> > +{ "clzero", "on" },
> > +{ "xsaveerptr", "on" },
> > +{ "xsaves", "on" },
> 
> You don't need to copy the properties from the previous version.
> The properties of version N are applied on top of the properties
> of version (N-1).
> 
> > +{ "invtsc", "on" },
> > +{ "model-id",
> > +  "AMD EPYC Processor" },
> > +{ /* end of list */ }
> > +}
> > +},
> >  { /* end of list */ }
> >  }
> >  },
> > @@ -4189,6 +4203,15 @@ static X86CPUDefinition builtin_x86_defs[] = {
> >  { /* end of list */ }
> >  }
> >  },
> > +{
> > +.version = 3,
> > +.props = (PropValue[]) {
> > +{ "ibrs", "on" },
> > +{ "amd-ssbd", "on" },
> > +{ "invtsc", "on" },
> > +{ /* end of list */ }
> > +}
> > +},
> >  { /* end of list */ }
> >  }
> >  },
> > @@ -4246,6 +4269,17 @@ static X86CPUDefinition builtin_x86_defs[] = {
> >  .xlevel = 0x8000001E,
> >  .model_id = "AMD EPYC-Milan Processor",
> >  .cache_info = &epyc_milan_cache_info,
> > +.versions = (X86CPUVersionDefinition[]) {
> > +{ .version = 1 },
> > +

[PATCH v5 cxl2.0-v3-doe 3/6] hw/pci: PCIe Data Object Exchange implementation

2021-04-26 Thread Chris Browy

From: hchkuo 

PCIe Data Object Exchange (DOE) implementation for QEMU, based on
"PCIe Data Object Exchange ECN, March 12, 2020".

The patch supports multiple DOE capabilities for a single PCIe device in
QEMU. For each capability, a static array of DOEProtocol should be
passed to pcie_doe_init(). The protocols in that array will be
registered under the DOE capability structure. For each protocol, the
vendor ID, type, and corresponding callback function (handle_request())
should be provided. This callback function defines how a DOE request
for the corresponding protocol is handled.

pcie_doe_{read/write}_config() must be appended to the corresponding PCI
device's config_read/write() handler to enable DOE access.
pcie_doe_read_config() returns false if the pci_config_read() offset is
not within the DOE capability range. pcie_doe_write_config() returns
early if the offset is not within the related DOE range.
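
For readers who want the discovery lookup in a nutshell, here is a small model of it. It is only a sketch: it assumes protocol_num counts the built-in Discovery entry plus the registered protocols, that PCI_SIG_DOE_DISCOVERY is 0x00, and that 0xFFFF/0xFF mark an out-of-range index. The example vendor ID is arbitrary.

```python
from typing import List, NamedTuple, Tuple

PCI_VENDOR_ID_PCI_SIG = 0x0001
PCI_SIG_DOE_DISCOVERY = 0x00   # assumed value of the discovery object type


class DOEProtocol(NamedTuple):
    vendor_id: int
    data_obj_type: int


def discovery_response(index: int,
                       protocols: List[DOEProtocol]) -> Tuple[int, int, int]:
    """Return (vendor_id, data_obj_type, next_index) for a request index."""
    protocol_num = len(protocols) + 1   # assumption: Discovery is entry 0
    if index == 0:
        vendor_id, obj_type = PCI_VENDOR_ID_PCI_SIG, PCI_SIG_DOE_DISCOVERY
    elif index < protocol_num:
        prot = protocols[index - 1]     # registered protocols start at 1
        vendor_id, obj_type = prot.vendor_id, prot.data_obj_type
    else:
        vendor_id, obj_type = 0xFFFF, 0xFF   # assumed "no such entry" markers
    # The last valid entry points back to 0 to end the enumeration.
    next_index = 0 if index + 1 == protocol_num else index + 1
    return vendor_id, obj_type, next_index


example = [DOEProtocol(vendor_id=0x1234, data_obj_type=0x00)]
first = discovery_response(0, example)
second = discovery_response(1, example)
```

A requester would start at index 0 and keep following next_index until it reads 0 again.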

Signed-off-by: Chris Browy 
---
 MAINTAINERS   |   7 +
 hw/pci/meson.build|   1 +
 hw/pci/pcie_doe.c | 374 ++
 include/hw/pci/pcie.h |   1 +
 include/hw/pci/pcie_doe.h | 123 +
 5 files changed, 506 insertions(+)
 create mode 100644 hw/pci/pcie_doe.c
 create mode 100644 include/hw/pci/pcie_doe.h

diff --git a/MAINTAINERS b/MAINTAINERS
index f9097ed9e7..e77e9892e3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1681,6 +1681,13 @@ F: docs/pci*
 F: docs/specs/*pci*
 F: default-configs/pci.mak
 
+PCIE DOE
+M: Huai-Cheng Kuo 
+M: Chris Browy 
+S: Supported
+F: include/hw/pci/pcie_doe.h
+F: hw/pci/pcie_doe.c
+
 ACPI/SMBIOS
 M: Michael S. Tsirkin 
 M: Igor Mammedov 
diff --git a/hw/pci/meson.build b/hw/pci/meson.build
index 5c4bbac817..115e50222f 100644
--- a/hw/pci/meson.build
+++ b/hw/pci/meson.build
@@ -12,6 +12,7 @@ pci_ss.add(files(
 # allow plugging PCIe devices into PCI buses, include them even if
 # CONFIG_PCI_EXPRESS=n.
 pci_ss.add(files('pcie.c', 'pcie_aer.c'))
+pci_ss.add(files('pcie_doe.c'))
 softmmu_ss.add(when: 'CONFIG_PCI_EXPRESS', if_true: files('pcie_port.c', 
'pcie_host.c'))
 softmmu_ss.add_all(when: 'CONFIG_PCI', if_true: pci_ss)
 
diff --git a/hw/pci/pcie_doe.c b/hw/pci/pcie_doe.c
new file mode 100644
index 00..b2a933
--- /dev/null
+++ b/hw/pci/pcie_doe.c
@@ -0,0 +1,374 @@
+/*
+ * PCIe Data Object Exchange
+ *
+ * Copyright (C) 2021 Avery Design Systems, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "qemu/range.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/pcie.h"
+#include "hw/pci/pcie_doe.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
+
+#define DWORD_BYTE 4
+#define BYTE_LSHIFT(b, pos) (b << ((pos) * 8))
+#define BYTE_RSHIFT(b, pos) (b >> ((pos) * 8))
+
+struct doe_discovery_req {
+DOEHeader header;
+uint8_t index;
+uint8_t reserved[3];
+} QEMU_PACKED;
+
+struct doe_discovery_rsp {
+DOEHeader header;
+uint16_t vendor_id;
+uint8_t data_obj_type;
+uint8_t next_index;
+} QEMU_PACKED;
+
+static bool pcie_doe_discovery(DOECap *doe_cap)
+{
+struct doe_discovery_req *req = pcie_doe_get_write_mbox_ptr(doe_cap);
+struct doe_discovery_rsp rsp;
+uint8_t index = req->index;
+DOEProtocol *prot;
+
+/* Discard request if length does not match doe_discovery */
+if (pcie_doe_get_obj_len(req) <
+DIV_ROUND_UP(sizeof(struct doe_discovery_req), DWORD_BYTE)) {
+return false;
+}
+
+rsp.header = (DOEHeader) {
+.vendor_id = PCI_VENDOR_ID_PCI_SIG,
+.data_obj_type = PCI_SIG_DOE_DISCOVERY,
+.length = DIV_ROUND_UP(sizeof(struct doe_discovery_rsp), DWORD_BYTE),
+};
+
+/* Point to the requested protocol, index 0 must be Discovery */
+if (index == 0) {
+rsp.vendor_id = PCI_VENDOR_ID_PCI_SIG;
+rsp.data_obj_type = PCI_SIG_DOE_DISCOVERY;
+} else {
+if (index < doe_cap->protocol_num) {
+prot = &doe_cap->protocols[index - 1];
+rsp.vendor_id = prot->vendor_id;
+rsp.data_obj_type = prot->data_obj_type;
+} else {
+rsp.vendor_id = 0xFFFF;
+rsp.data_obj_type = 0xFF;
+}
+}
+
+if (index + 1 == doe_cap->protocol_num) {
+rsp.next_index = 0;
+} else {
+rsp.next_index = index + 1;
+}
+
+pcie_doe_set_rsp(doe_cap, &rsp);
+
+return true;
+}
+
+static void pcie_doe_reset_mbox(DOECap *st)
+{
+st->read_mbox_idx = 0;
+st->read_mbox_len = 0;
+st->write_mbox_len = 0;
+
+memset(st->read_mbox, 0, PCI_DOE_DW_SIZE_MAX * DWORD_BYTE);
+memset(st->write_mbox, 0, PCI_DOE_DW_SIZE_MAX * DWORD_BYTE);
+}
+
+void pcie_doe_init(PCIDevice *dev, DOECap *doe_cap, uint16_t offset,
+   DOEProtocol *protocols, bool intr, uint16_t vec)
+{
+

Re: [PATCH v3 22/36] block: add bdrv_remove_filter_or_cow transaction action

2021-04-26 Thread Vladimir Sementsov-Ogievskiy


26.04.2021 19:26, Kevin Wolf wrote:

Am 17.03.2021 um 15:35 hat Vladimir Sementsov-Ogievskiy geschrieben:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block.c | 78 +++--
  1 file changed, 76 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 11f7ad0818..2fca1f2ad5 100644
--- a/block.c
+++ b/block.c
@@ -2929,12 +2929,19 @@ static void bdrv_replace_child(BdrvChild *child, 
BlockDriverState *new_bs)
  }
  }
  
+static void bdrv_child_free(void *opaque)

+{
+BdrvChild *c = opaque;
+
+g_free(c->name);
+g_free(c);
+}
+
  static void bdrv_remove_empty_child(BdrvChild *child)
  {
  assert(!child->bs);
  QLIST_SAFE_REMOVE(child, next);
-g_free(child->name);
-g_free(child);
+bdrv_child_free(child);
  }
  
  typedef struct BdrvAttachChildCommonState {

@@ -4956,6 +4963,73 @@ static bool should_update_child(BdrvChild *c, 
BlockDriverState *to)
  return ret;
  }
  
+typedef struct BdrvRemoveFilterOrCowChild {

+BdrvChild *child;
+bool is_backing;
+} BdrvRemoveFilterOrCowChild;
+
+/* this doesn't restore original child bs, only the child itself */


Hm, this comment tells me that it's intentional, but why is it correct?


That's because bdrv_remove_filter_or_cow_child_abort() aborts only part of 
bdrv_remove_filter_or_cow_child().

Look: bdrv_remove_filter_or_cow_child() first does 
bdrv_replace_child_safe(child, NULL, tran);, so the bs would be restored by 
the .abort() of the bdrv_replace_child_safe() action.


So, an improved comment may look like:

This doesn't restore the original child bs, only the child itself. The bs is 
restored by the .abort() of the bdrv_replace_child_safe() sub-action of the 
bdrv_remove_filter_or_cow_child() action.

Probably it would be more correct to rename

BdrvRemoveFilterOrCowChild -> BdrvRemoveFilterOrCowChildNoBs
bdrv_remove_filter_or_cow_child_abort -> 
bdrv_remove_filter_or_cow_child_no_bs_abort
bdrv_remove_filter_or_cow_child_commit -> 
bdrv_remove_filter_or_cow_child_no_bs_commit

and assert on .abort() and .commit() that s->child->bs is NULL.
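
To make the ordering concrete, here is a rough, self-contained sketch of the idea. It is not QEMU's Transaction API: Tran, tran_add(), tran_abort() and the Child/Parent types are hypothetical stand-ins, and the sketch assumes abort handlers run in reverse order of registration. The outer abort only relinks the child into the parent, while the earlier-registered sub-action abort restores the bs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature transaction; NOT QEMU's real Transaction API. */
typedef struct Action {
    void (*abort)(void *opaque);
    void *opaque;
} Action;

enum { MAX_ACTIONS = 8 };

typedef struct Tran {
    Action actions[MAX_ACTIONS];
    size_t n;
} Tran;

static void tran_add(Tran *t, void (*abort_fn)(void *), void *opaque)
{
    assert(t->n < MAX_ACTIONS);
    t->actions[t->n++] = (Action){ abort_fn, opaque };
}

/* Assumption: aborts run in reverse order of registration (LIFO). */
static void tran_abort(Tran *t)
{
    while (t->n > 0) {
        Action *a = &t->actions[--t->n];
        a->abort(a->opaque);
    }
}

/* Stand-ins for BdrvChild / parent BlockDriverState; bs == 0 means "no bs". */
typedef struct Child { int bs; } Child;
typedef struct Parent { Child *backing; } Parent;

typedef struct ReplaceState { Child *child; int old_bs; } ReplaceState;
typedef struct RemoveState { Parent *parent; Child *child; } RemoveState;

/* Sub-action abort: restores the bs that was detached from the child. */
static void replace_abort(void *opaque)
{
    ReplaceState *s = opaque;
    s->child->bs = s->old_bs;
}

/* Outer abort: restores only the child link, never the bs. */
static void remove_abort(void *opaque)
{
    RemoveState *s = opaque;
    assert(s->child->bs == 0); /* the suggested assertion */
    s->parent->backing = s->child;
}

/* Mirrors the two steps: detach the bs first, then unlink the child. */
static void remove_child(Parent *p, ReplaceState *rs, RemoveState *ms, Tran *t)
{
    Child *c = p->backing;

    rs->child = c;
    rs->old_bs = c->bs;
    c->bs = 0;
    tran_add(t, replace_abort, rs);

    ms->parent = p;
    ms->child = c;
    p->backing = NULL;
    tran_add(t, remove_abort, ms);
}
```

Calling remove_child() and then tran_abort() first relinks the (still empty) child and only afterwards gives it its bs back, which is why the outer abort can assert that the child has no bs.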




+static void bdrv_remove_filter_or_cow_child_abort(void *opaque)
+{
+BdrvRemoveFilterOrCowChild *s = opaque;
+BlockDriverState *parent_bs = s->child->opaque;
+
+QLIST_INSERT_HEAD(&parent_bs->children, s->child, next);
+if (s->is_backing) {
+parent_bs->backing = s->child;
+} else {
+parent_bs->file = s->child;
+}
+}


Kevin




--
Best regards,
Vladimir

[PATCH v5 cxl2.0-v3-doe 2/6] include/hw/pci: headers for PCIe DOE

2021-04-26 Thread Chris Browy

From: hchkuo 

Macros for the vendor ID of PCI-SIG and the size of PCIe Data Object
Exchange.

Signed-off-by: Chris Browy 
---
 include/hw/pci/pci_ids.h   | 2 ++
 include/hw/pci/pcie_regs.h | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
index 95f92d98e9..471c915395 100644
--- a/include/hw/pci/pci_ids.h
+++ b/include/hw/pci/pci_ids.h
@@ -157,6 +157,8 @@
 
 /* Vendors and devices.  Sort key: vendor first, device next. */
 
+#define PCI_VENDOR_ID_PCI_SIG    0x0001
+
 #define PCI_VENDOR_ID_LSI_LOGIC  0x1000
 #define PCI_DEVICE_ID_LSI_53C810 0x0001
 #define PCI_DEVICE_ID_LSI_53C895A 0x0012
diff --git a/include/hw/pci/pcie_regs.h b/include/hw/pci/pcie_regs.h
index 1db86b0ec4..5ec7014211 100644
--- a/include/hw/pci/pcie_regs.h
+++ b/include/hw/pci/pcie_regs.h
@@ -179,4 +179,7 @@ typedef enum PCIExpLinkWidth {
 #define PCI_ACS_VER 0x1
 #define PCI_ACS_SIZEOF  8
 
+/* DOE Capability Register Fields */
+#define PCI_DOE_SIZEOF  24
+
 #endif /* QEMU_PCIE_REGS_H */
-- 
2.17.1

Re: [PATCH v2 0/2] plugins: Freeing allocated values in hash tables.

2021-04-26 Thread Alex Bennée



Mahmoud Mandour  writes:

> A hash table made using ``g_hash_table_new`` requires manually
> freeing any dynamically allocated keys/values. The two patches
> in this series fix this issue in the hotblocks and hotpages plugins.
>
> v1 -> v2: Added a freeing function to hotpages instead of freeing
> the sorted list. That's probably better because the sorted list
> is only made on having ``counts != NULL`` and ``counts`` has a next
> pointer so essentially at least 2 elements in the list.

Queued to plugins/next, thanks.

-- 
Alex Bennée

Re: [PATCH v3 18/36] block: add bdrv_attach_child_common() transaction action

2021-04-26 Thread Vladimir Sementsov-Ogievskiy


26.04.2021 19:14, Kevin Wolf wrote:

Am 17.03.2021 um 15:35 hat Vladimir Sementsov-Ogievskiy geschrieben:

Split out no-perm part of bdrv_root_attach_child() into separate
transaction action. bdrv_root_attach_child() now moves to new
permission update paradigm: first update graph relations then update
permissions.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block.c | 189 
  1 file changed, 135 insertions(+), 54 deletions(-)

diff --git a/block.c b/block.c
index 98ff44dbf7..b6bdc534d2 100644
--- a/block.c
+++ b/block.c
@@ -2921,37 +2921,73 @@ static void bdrv_replace_child(BdrvChild *child, 
BlockDriverState *new_bs)
  }
  }
  
-/*

- * This function steals the reference to child_bs from the caller.
- * That reference is later dropped by bdrv_root_unref_child().
- *
- * On failure NULL is returned, errp is set and the reference to
- * child_bs is also dropped.
- *
- * The caller must hold the AioContext lock @child_bs, but not that of @ctx
- * (unless @child_bs is already in @ctx).
- */
-BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
-  const char *child_name,
-  const BdrvChildClass *child_class,
-  BdrvChildRole child_role,
-  uint64_t perm, uint64_t shared_perm,
-  void *opaque, Error **errp)
+static void bdrv_remove_empty_child(BdrvChild *child)
  {
-BdrvChild *child;
-Error *local_err = NULL;
-int ret;
-AioContext *ctx;
+assert(!child->bs);
+QLIST_SAFE_REMOVE(child, next);
+g_free(child->name);
+g_free(child);
+}
  
-ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, errp);

-if (ret < 0) {
-bdrv_abort_perm_update(child_bs);
-bdrv_unref(child_bs);
-return NULL;
+typedef struct BdrvAttachChildCommonState {
+BdrvChild **child;
+AioContext *old_parent_ctx;
+AioContext *old_child_ctx;
+} BdrvAttachChildCommonState;
+
+static void bdrv_attach_child_common_abort(void *opaque)
+{
+BdrvAttachChildCommonState *s = opaque;
+BdrvChild *child = *s->child;
+BlockDriverState *bs = child->bs;
+
+bdrv_replace_child_noperm(child, NULL);
+
+if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
+bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);
  }
  
-child = g_new(BdrvChild, 1);

-*child = (BdrvChild) {
+if (bdrv_child_get_parent_aio_context(child) != s->old_parent_ctx) {
+GSList *ignore = g_slist_prepend(NULL, child);
+
+child->klass->can_set_aio_ctx(child, s->old_parent_ctx, &ignore,
+  &error_abort);
+g_slist_free(ignore);
+ignore = g_slist_prepend(NULL, child);
+child->klass->set_aio_ctx(child, s->old_parent_ctx, &ignore);
+
+g_slist_free(ignore);
+}
+
+bdrv_unref(bs);
+bdrv_remove_empty_child(child);
+*s->child = NULL;
+}
+
+static TransactionActionDrv bdrv_attach_child_common_drv = {
+.abort = bdrv_attach_child_common_abort,
+};
+
+/*
+ * Common part of attaching bdrv child to bs or to blk or to job
+ */
+static int bdrv_attach_child_common(BlockDriverState *child_bs,
+const char *child_name,
+const BdrvChildClass *child_class,
+BdrvChildRole child_role,
+uint64_t perm, uint64_t shared_perm,
+void *opaque, BdrvChild **child,
+Transaction *tran, Error **errp)
+{
+BdrvChild *new_child;
+AioContext *parent_ctx;
+AioContext *child_ctx = bdrv_get_aio_context(child_bs);
+
+assert(child);
+assert(*child == NULL);
+
+new_child = g_new(BdrvChild, 1);
+*new_child = (BdrvChild) {
  .bs = NULL,
  .name   = g_strdup(child_name),
  .klass  = child_class,
@@ -2961,37 +2997,92 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
  .opaque = opaque,
  };
  
-ctx = bdrv_child_get_parent_aio_context(child);

-
-/* If the AioContexts don't match, first try to move the subtree of
+/*
+ * If the AioContexts don't match, first try to move the subtree of
   * child_bs into the AioContext of the new parent. If this doesn't work,
- * try moving the parent into the AioContext of child_bs instead. */
-if (bdrv_get_aio_context(child_bs) != ctx) {
-ret = bdrv_try_set_aio_context(child_bs, ctx, &local_err);
+ * try moving the parent into the AioContext of child_bs instead.
+ */
+parent_ctx = bdrv_child_get_parent_aio_context(new_child);
+if (child_ctx != parent_ctx) {
+Error *local_err = NULL;
+int ret = bdrv_try_set_aio_context(child_bs, parent_ctx, &local_err);
+
  if (ret < 0 &&

Re: [PATCH 0/2] plugins: Freeing allocated values in hash tables.

2021-04-26 Thread Alex Bennée



Alex Bennée  writes:

> Mahmoud Mandour  writes:
>
>> A hash table made using ``g_hash_table_new`` requires manually
>> freeing any dynamically allocated keys/values. The two patches
>> in this series fix this issue in the hotblocks and hotpages plugins.
>
> Queued to plugins/next, thanks.

Oops, dequeuing and applying v2 ;-)

-- 
Alex Bennée

Re: [PATCH 0/2] plugins: Freeing allocated values in hash tables.

2021-04-26 Thread Alex Bennée



Mahmoud Mandour  writes:

> A hash table made using ``g_hash_table_new`` requires manually
> freeing any dynamically allocated keys/values. The two patches
> in this series fix this issue in the hotblocks and hotpages plugins.

Queued to plugins/next, thanks.

-- 
Alex Bennée

[PATCH v5 cxl2.0-v3-doe 1/6] standard-headers/linux/pci_regs: PCI header from Linux kernel

2021-04-26 Thread Chris Browy

From: hchkuo 

Linux standard header for the registers of PCI Data Object Exchange
(DOE). This header might be generated via script. The DOE feature
should be added in a future Linux release, so this patch can be
removed then.

Signed-off-by: Chris Browy 
---
 include/standard-headers/linux/pci_regs.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/standard-headers/linux/pci_regs.h 
b/include/standard-headers/linux/pci_regs.h
index e709ae8235..2a8df63e11 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -730,7 +730,8 @@
 #define PCI_EXT_CAP_ID_DVSEC   0x23 /* Designated Vendor-Specific */
 #define PCI_EXT_CAP_ID_DLF 0x25 /* Data Link Feature */
 #define PCI_EXT_CAP_ID_PL_16GT 0x26 /* Physical Layer 16.0 GT/s */
-#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_16GT
+#define PCI_EXT_CAP_ID_DOE 0x2E /* Data Object Exchange */
+#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_DOE
 
 #define PCI_EXT_CAP_DSN_SIZEOF 12
 #define PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF 40
-- 
2.17.1

Re: [PATCH v6 06/18] cpu: Assert DeviceClass::vmsd is NULL on user emulation

2021-04-26 Thread Philippe Mathieu-Daudé

On 4/26/21 6:15 PM, Dr. David Alan Gilbert wrote:
> * Philippe Mathieu-Daudé (f4...@amsat.org) wrote:
>> Migration is specific to system emulation.
>>
>> Restrict current DeviceClass::vmsd to sysemu using #ifdef'ry,
>> and assert in cpu_exec_realizefn() that dc->vmsd not set under
>> user emulation.
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>  cpu.c  | 1 +
>>  target/sh4/cpu.c   | 5 +++--
>>  target/unicore32/cpu.c | 4 
>>  target/xtensa/cpu.c| 4 +++-
>>  4 files changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/cpu.c b/cpu.c
>> index bfbe5a66f95..4fed04219df 100644
>> --- a/cpu.c
>> +++ b/cpu.c
>> @@ -138,6 +138,7 @@ void cpu_exec_realizefn(CPUState *cpu, Error **errp)
>>  #endif /* CONFIG_TCG */
>>  
>>  #ifdef CONFIG_USER_ONLY
>> +assert(qdev_get_vmsd(DEVICE(cpu)) == NULL);
> 
> Why not make that:
>assert(qdev_get_vmsd(DEVICE(cpu)) == NULL ||
>   qdev_get_vmsd(DEVICE(cpu))->unmigratable);
> 
> then you don't have to worry about the changes below.

Thanks for the tip! In my defense, the VMStateDescription
fields aren't documented at all ;)

> 
> Dave
> 
>>  assert(cc->vmsd == NULL);
>>  #else
>>  if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {

Ethernet-over-usb with linux guest using USB Device Controller ?

2021-04-26 Thread Doug Evans

Hi.

I'm working on a project where I want to have the linux qemu guest
communicate with another linux system via ethernet-over-usb (as far as the
guest is concerned, as it will be using a usb network gadget).
In this case the linux guest will be using a USB Device Controller (UDC) to
drive its side of the connection, and the protocol will be, IIUC, CDC-ECM.

The modeling would basically look like:

linux-guest <--> UDC-model <--> ?#1 <--> ?#2 <--> linux-host
|< QEMU -->|

UDC-model will be working with CDC-ECM, but is there a use-case where we'd
want "?#1" to be libslirp and "?#2" to be the host's IP network? Another
use case is propagating CDC-ECM (or the USB packets in general) outside of
QEMU such that it can be fed directly into the USB of the host (or remote
host).

Questions: Is this supported in QEMU, and if so, are there any pointers to
source for existing examples?
If not, is there any guidance on how to proceed?

Of course we'd want this to not be a one-off. E.g., the code would be
partitioned such that the UDC-model-independent-support would be available
to other UDC models to use. Thus perhaps this falls under the scope of
things like this?
https://yhbt.net/lore/all/YFDo%2FoHikOEcXFcg@work-vm/
I'm new to all of this side of USB btw ...

[PATCH v5 cxl2.0-v3-doe 0/6] QEMU PCIe DOE for PCIe 4.0/5.0 and CXL 2.0

2021-04-26 Thread Chris Browy

This patch implements the PCIe Data Object Exchange (DOE) for PCIe 4.0/5.0
and later and CXL 2.0 "type-3" memory devices supporting the following 
protocols:
 1: PCIe DOE Discovery protocol
 2: CXL DOE Compliance Mode protocol
 3: CXL DOE CDAT protocol
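
For the Discovery protocol, the handler in patch 3/6 reports the index of the next supported protocol and returns 0 after the last one, so a requester can walk the list until it reads 0 again. A minimal sketch of that rule follows; the helper name is hypothetical and not part of the series, and protocol_num stands in for doe_cap->protocol_num.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): compute the "next index" field of
 * a DOE Discovery response for a valid protocol index.
 */
static uint8_t doe_discovery_next_index(uint8_t index, uint8_t protocol_num)
{
    /* index + 1 is promoted to int, so the comparison cannot wrap at 255. */
    if (index + 1 == protocol_num) {
        return 0; /* last entry: tell the requester the list is complete */
    }
    return (uint8_t)(index + 1);
}
```

With three registered protocols, querying index 0, 1 and 2 would yield next indexes 1, 2 and 0.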

Implementation is based on QEMU version which added CXL 2.0 "type-3" support
https://gitlab.com/bwidawsk/qemu/-/tree/cxl-2.0v4
6882c0453eea74d639ac75ec0f362d0cf9f1c744

The PCIe Data Object Exchange (DOE) implementation for QEMU follows
"Data Object Exchange ECN, March 12, 2020" [1].

The Data Object Exchange implementation of CXL Compliance Mode follows
"Compute Express Link (CXL) Specification, Rev. 2.0, Oct. 2020" [2].

The Data Object Exchange implementation of the CXL Coherent Device Attribute
Table (CDAT) follows "Coherent Device Attribute Table Specification,
Rev. 1.02, Oct. 2020" [3] and "Compute Express Link Specification,
Rev. 2.0, Oct. 2020" [2].

The CDAT can be specified in two ways. One is to add ",cdat="
to the "-device cxl-type3" command-line option. The file must provide
the whole CDAT table in binary form. The other is to use the default
CDAT value created by build_cdat_table in hw/cxl/cxl-cdat.c.

The pre-built CDAT table for testing contains one CDAT header and six
CDAT entries: DSMAS, DSLBIS, DSMSCIS, DSIS, DSEMTS, and SSLBIS.

Changes since PATCH v4:
1-3: PCIe DOE linux header and macros and PCIe Discovery protocol
4:   Clean up CXL compliance mode DOE protocol including default responses
5-6: Clean up CXL CDAT DOE protocol including testing built-in and external
CDAT tables

[1]: https://members.pcisig.com/wg/PCI-SIG/document/14143
[2]: https://www.computeexpresslink.org/
[3]: 
https://uefi.org/sites/default/files/resources/Coherent%20Device%20Attribute%20Table_1.02.pdf

hchkuo (6):
  standard-headers/linux/pci_regs: PCI header from Linux kernel
  include/hw/pci: headers for PCIe DOE
  hw/pci: PCIe Data Object Exchange implementation
  cxl/compliance: CXL Compliance Data Object Exchange implementation
  cxl/cdat: CXL CDAT Data Object Exchange implementation
  test/cdat: CXL CDAT test data

 MAINTAINERS   |   7 +
 hw/cxl/cxl-cdat.c | 228 +
 hw/cxl/meson.build|   1 +
 hw/mem/cxl_type3.c| 202 
 hw/pci/meson.build|   1 +
 hw/pci/pcie_doe.c | 374 ++
 include/hw/cxl/cxl_cdat.h | 149 +
 include/hw/cxl/cxl_compliance.h   | 293 +
 include/hw/cxl/cxl_component.h|   7 +
 include/hw/cxl/cxl_device.h   |   4 +
 include/hw/cxl/cxl_pci.h  |   2 +
 include/hw/pci/pci_ids.h  |   2 +
 include/hw/pci/pcie.h |   1 +
 include/hw/pci/pcie_doe.h | 123 +++
 include/hw/pci/pcie_regs.h|   3 +
 include/standard-headers/linux/pci_regs.h |   3 +-
 tests/data/cdat/cdat.dat  | Bin 0 -> 148 bytes
 17 files changed, 1399 insertions(+), 1 deletion(-)
 create mode 100644 hw/cxl/cxl-cdat.c
 create mode 100644 hw/pci/pcie_doe.c
 create mode 100644 include/hw/cxl/cxl_cdat.h
 create mode 100644 include/hw/cxl/cxl_compliance.h
 create mode 100644 include/hw/pci/pcie_doe.h
 create mode 100644 tests/data/cdat/cdat.dat

-- 
2.17.1
