date:20240108

Re: [PATCH 3/3] tests/qtest: Re-enable multifd cancel test

2024-01-08 Thread Peter Xu

Hi, Thomas,

On Tue, Jan 09, 2024 at 08:21:53AM +0100, Thomas Huth wrote:
> Sorry for that :-(

Not at all!  I actually appreciate more people looking after it.

> Maybe it's better if we remove the migration-test from
> the qtest section in MAINTAINERS? Since the migration test is very well
> maintained already, there's IMHO no need for picking up the patches via the
> qtest tree, so something like this should prevent these problems:
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3269,6 +3269,7 @@ F: tests/qtest/
>  F: docs/devel/qgraph.rst
>  F: docs/devel/qtest.rst
>  X: tests/qtest/bios-tables-test*
> +X: tests/qtest/migration-*
> 
>  Device Fuzzing
>  M: Alexander Bulekov 
> 
> (as you can see, we're doing it in a similar way for the bios tables test
> already)
> 
> If you agree, I can send out a proper patch for this later today.

Currently the file is covered by both groups of people, which to me is the
ideal situation:

$ ./scripts/get_maintainer.pl -f tests/qtest/migration-test.c 
Peter Xu  (maintainer:Migration)
Fabiano Rosas  (maintainer:Migration)
Thomas Huth  (maintainer:qtest)
Laurent Vivier  (maintainer:qtest)
Paolo Bonzini  (reviewer:qtest)
qemu-devel@nongnu.org (open list:All patches CC here)
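
As an illustration of why the file currently matches the qtest section, and
what the proposed X: entry would change, here is a minimal sketch.  It is a
simplified model built on Python's fnmatch, not the actual matching logic of
scripts/get_maintainer.pl; the helper name covers() and the trimmed-down
section dictionaries are assumptions made for the example.

```python
from fnmatch import fnmatch


def covers(section, path):
    """Return True if a MAINTAINERS-style section covers path (toy model)."""
    def match(pattern):
        # A pattern ending in '/' covers everything below that directory.
        if pattern.endswith("/"):
            return path.startswith(pattern)
        return fnmatch(path, pattern)

    included = any(match(p) for p in section.get("F", []))
    excluded = any(match(p) for p in section.get("X", []))
    return included and not excluded


# qtest section as it is today, reduced to the entries quoted in this thread.
qtest_now = {"F": ["tests/qtest/"], "X": ["tests/qtest/bios-tables-test*"]}
# The same section with the proposed extra exclusion.
qtest_proposed = {
    "F": ["tests/qtest/"],
    "X": ["tests/qtest/bios-tables-test*", "tests/qtest/migration-*"],
}

path = "tests/qtest/migration-test.c"
print(covers(qtest_now, path))       # True: qtest maintainers are listed
print(covers(qtest_proposed, path))  # False: only Migration would be listed
```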

This makes sense to me: e.g. when qtest reworks the framework and we'd like
migration-test.c to be covered by that same rework series and
reviewed/pulled together, those patches can go via qtest's tree directly.

If the patch submitter follows the MAINTAINERS file, all of us will be in
the loop, and that's the perfect condition, IMHO.  It's just that this
patch didn't have any migration people copied, which caused some slight
confusion.

It would be great if, in that case, the qtest maintainers could help
submitters copy us when the submitters forget to do so.  I think we should
do the same when there are major changes to the qtest framework for a new
migration test.  Would that work best for us?

Thanks,

-- 
Peter Xu

Re: [PATCH v8 06/10] hw/fsi: Aspeed APB2OPB interface

2024-01-08 Thread Cédric Le Goater


Hello Ninad,


+static void fsi_aspeed_apb2opb_realize(DeviceState *dev, Error **errp)
+{
+    SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+    AspeedAPB2OPBState *s = ASPEED_APB2OPB(dev);
+    int i;
+
+    sysbus_init_irq(sbd, &s->irq);
+
+    memory_region_init_io(&s->iomem, OBJECT(s), _apb2opb_ops, s,
+                          TYPE_ASPEED_APB2OPB, 0x1000);
+    sysbus_init_mmio(sbd, &s->iomem);
+
+    for (i = 0; i < ASPEED_FSI_NUM; i++) {
+        if (!qdev_realize_and_unref(DEVICE(&s->fsi[i]), BUS(&s->opb[i]),



s->fsi[i] is not allocated. We should use qdev_realize instead.


I am not sure I understood this. FSIMasterState fsi[ASPEED_FSI_NUM]; is 
inside structure AspeedAPB2OPBState so it must be allocated, right?


See the documentation :

  https://www.qemu.org/docs/master/devel/qdev-api.html#c.qdev_realize_and_unref
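
The distinction the linked documentation draws can be illustrated with a toy
reference-count model.  This is a sketch in Python, not QEMU code: the class
ToyDevice, the helpers toy_realize() and toy_realize_and_unref(), and the
assumption that plugging into a bus adds exactly one reference are all
simplifications made for the example.

```python
class ToyDevice:
    """Toy stand-in for a reference-counted device object."""

    def __init__(self, first_holder):
        self.holders = [first_holder]  # parties entitled to a reference
        self.refs = 1                  # actual reference count

    def balanced(self):
        # The count is consistent when every holder has exactly one reference.
        return self.refs == len(self.holders)


def toy_realize(dev):
    """Model of qdev_realize(): the bus takes its own reference."""
    dev.holders.append("bus")
    dev.refs += 1


def toy_realize_and_unref(dev):
    """Model of qdev_realize_and_unref(): also drop the caller's reference."""
    toy_realize(dev)
    dev.refs -= 1
    if "caller" in dev.holders:
        dev.holders.remove("caller")


heap_dev = ToyDevice("caller")      # like a device obtained from qdev_new()
toy_realize_and_unref(heap_dev)

embedded_dev = ToyDevice("parent")  # like a member embedded in the parent state
toy_realize_and_unref(embedded_dev)

embedded_ok = ToyDevice("parent")
toy_realize(embedded_ok)

print(heap_dev.balanced(), embedded_dev.balanced(), embedded_ok.balanced())
# True False True
```

In this model the embedded device ends up with one reference fewer than it
has holders, which is the situation qdev_realize() avoids.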

Thanks,

C.

Re: [PATCH 3/3] tests/qtest: Re-enable multifd cancel test

2024-01-08 Thread Thomas Huth


On 09/01/2024 03.12, Peter Xu wrote:

On Mon, Jan 08, 2024 at 11:26:04AM -0300, Fabiano Rosas wrote:

Peter Xu  writes:


On Wed, Jun 07, 2023 at 10:27:15AM +0200, Juan Quintela wrote:

Fabiano Rosas  wrote:

We've found the source of flakiness in this test, so re-enable it.

Signed-off-by: Fabiano Rosas 
---
  tests/qtest/migration-test.c | 10 ++--------
  1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index b0c355bbd9..800ad23b75 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2778,14 +2778,8 @@ int main(int argc, char **argv)
      }
      qtest_add_func("/migration/multifd/tcp/plain/none",
                     test_multifd_tcp_none);
-    /*
-     * This test is flaky and sometimes fails in CI and otherwise:
-     * don't run unless user opts in via environment variable.
-     */
-    if (getenv("QEMU_TEST_FLAKY_TESTS")) {
-        qtest_add_func("/migration/multifd/tcp/plain/cancel",
-                       test_multifd_tcp_cancel);
-    }
+    qtest_add_func("/migration/multifd/tcp/plain/cancel",
+                   test_multifd_tcp_cancel);
      qtest_add_func("/migration/multifd/tcp/plain/zlib",
                     test_multifd_tcp_zlib);
  #ifdef CONFIG_ZSTD


Reviewed-by: Juan Quintela 


There was another failure with migration test that I will post during
the rest of the day.  It needs both to get it right.


This one didn't yet land upstream.  I'm not sure, but maybe Juan was
referring to this change:

 commit d2026ee117147893f8d80f060cede6d872ecbd7f
 Author: Juan Quintela 
 Date:   Wed Apr 26 12:20:36 2023 +0200

 multifd: Fix the number of channels ready


That's not it. It was something in the test itself around the fact that
we use two sets of: from/to. There was supposed to be a situation where
we'd start 'to2' while 'to' was still running and that would cause
issues (possibly with sockets).

I think what might have happened is that someone merged a fix through
another tree and Juan didn't notice. I think this is the one:

   commit f2d063e61ee2026700ab44bef967f663e976bec8
   Author: Xuzhou Cheng 
   Date:   Fri Oct 28 12:57:32 2022 +0800
   
   tests/qtest: migration-test: Make sure QEMU process "to" exited after migration is canceled
   
   Make sure QEMU process "to" exited before launching another target
   for migration in the test_multifd_tcp_cancel case.
   
   Signed-off-by: Xuzhou Cheng
   Signed-off-by: Bin Meng
   Reviewed-by: Marc-André Lureau 
   Message-Id: <20221028045736.679903-8-bin.m...@windriver.com>
   Signed-off-by: Thomas Huth 


Hmm, I see.


Sorry for that :-( Maybe it's better if we remove the migration-test from 
the qtest section in MAINTAINERS? Since the migration test is very well 
maintained already, there's IMHO no need for picking up the patches via the 
qtest tree, so something like this should prevent these problems:


diff --git a/MAINTAINERS b/MAINTAINERS
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3269,6 +3269,7 @@ F: tests/qtest/
 F: docs/devel/qgraph.rst
 F: docs/devel/qtest.rst
 X: tests/qtest/bios-tables-test*
+X: tests/qtest/migration-*

 Device Fuzzing
 M: Alexander Bulekov 

(as you can see, we're doing it in a similar way for the bios tables test 
already)


If you agree, I can send out a proper patch for this later today.

 Thomas

Re: [PATCH 10/10] docs/migration: Further move virtio to be feature of migration

2024-01-08 Thread Cédric Le Goater


On 1/9/24 07:46, pet...@redhat.com wrote:

From: Peter Xu 

Move it one layer down, so taking Virtio-migration as a feature for
migration.

Cc: Michael S. Tsirkin 
Cc: Jason Wang 
Signed-off-by: Peter Xu 



Reviewed-by: Cédric Le Goater 

Thanks,

C.




---
  docs/devel/migration/features.rst | 1 +
  docs/devel/migration/index.rst| 1 -
  2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
index dea016f707..a9acaf618e 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -9,3 +9,4 @@ Migration has plenty of features to support different use cases.
    postcopy
    dirty-limit
    vfio
+   virtio
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 2479e8ecb7..7b7a706e35 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -10,5 +10,4 @@ QEMU live migration works.
    main
    features
    compatibility
-   virtio
    best-practises

Re: [PATCH 09/10] docs/migration: Further move vfio to be feature of migration

2024-01-08 Thread Cédric Le Goater


On 1/9/24 07:46, pet...@redhat.com wrote:

From: Peter Xu 

Move it one layer down, so taking VFIO-migration as a feature for
migration.

Cc: Alex Williamson 
Cc: Cédric Le Goater 
Signed-off-by: Peter Xu 



Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  docs/devel/migration/features.rst | 1 +
  docs/devel/migration/index.rst| 1 -
  2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
index e257d0d100..dea016f707 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -8,3 +8,4 @@ Migration has plenty of features to support different use cases.
 
    postcopy
    dirty-limit
+   vfio
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 7cf62541b9..2479e8ecb7 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -10,6 +10,5 @@ QEMU live migration works.
    main
    features
    compatibility
-   vfio
    virtio
    best-practises

Re: [PATCH 08/10] docs/migration: Organize "Postcopy" page

2024-01-08 Thread Cédric Le Goater


On 1/9/24 07:46, pet...@redhat.com wrote:

From: Peter Xu 

Reorganize the page, moving things around, and add a few
headlines ("Postcopy internals", "Postcopy features") to cover sub-areas.

Signed-off-by: Peter Xu 
---
  docs/devel/migration/postcopy.rst | 159 --
  1 file changed, 84 insertions(+), 75 deletions(-)

diff --git a/docs/devel/migration/postcopy.rst b/docs/devel/migration/postcopy.rst
index d60eec06ab..6c51e96d79 100644
--- a/docs/devel/migration/postcopy.rst
+++ b/docs/devel/migration/postcopy.rst
@@ -1,6 +1,9 @@
+========
  Postcopy
  ========
  
+.. contents::
+
  'Postcopy' migration is a way to deal with migrations that refuse to converge


The quote character is used in a few places to emphasize words
which should be reworked. The rest looks good, so


Reviewed-by: Cédric Le Goater 

Thanks,

C.




  (or take too long to converge) its plus side is that there is an upper bound on
  the amount of migration traffic and time it takes, the down side is that during
@@ -14,7 +17,7 @@ Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
  doesn't finish in a given time the switch is made to postcopy.
  
  Enabling postcopy
------------------
+=================
  
  To enable postcopy, issue this command on the monitor (both source and
  destination) prior to the start of migration:
@@ -49,8 +52,71 @@ time per vCPU.
``migrate_set_parameter`` is ignored (to avoid delaying requested pages that
the destination is waiting for).
  
-Postcopy device transfer
-------------------------
+Postcopy internals
+==================
+
+State machine
+-------------
+
+Postcopy moves through a series of states (see postcopy_state) from
+ADVISE->DISCARD->LISTEN->RUNNING->END
+
+ - Advise
+
+Set at the start of migration if postcopy is enabled, even
+if it hasn't had the start command; here the destination
+checks that its OS has the support needed for postcopy, and performs
+setup to ensure the RAM mappings are suitable for later postcopy.
+The destination will fail early in migration at this point if the
+required OS support is not present.
+(Triggered by reception of POSTCOPY_ADVISE command)
+
+ - Discard
+
+Entered on receipt of the first 'discard' command; prior to
+the first Discard being performed, hugepages are switched off
+(using madvise) to ensure that no new huge pages are created
+during the postcopy phase, and to cause any huge pages that
+have discards on them to be broken.
+
+ - Listen
+
+The first command in the package, POSTCOPY_LISTEN, switches
+the destination state to Listen, and starts a new thread
+(the 'listen thread') which takes over the job of receiving
+pages off the migration stream, while the main thread carries
+on processing the blob.  With this thread able to process page
+reception, the destination now 'sensitises' the RAM to detect
+any access to missing pages (on Linux using the 'userfault'
+system).
+
+ - Running
+
+POSTCOPY_RUN causes the destination to synchronise all
+state and start the CPUs and IO devices running.  The main
+thread now finishes processing the migration package and
+now carries on as it would for normal precopy migration
+(although it can't do the cleanup it would do as it
+finishes a normal migration).
+
+ - Paused
+
+Postcopy can run into a paused state (normally on both sides when
+happens), where all threads will be temporarily halted mostly due to
+network errors.  When reaching paused state, migration will make sure
+the qemu binary on both sides maintain the data without corrupting
+the VM.  To continue the migration, the admin needs to fix the
+migration channel using the QMP command 'migrate-recover' on the
+destination node, then resume the migration using QMP command 'migrate'
+again on source node, with resume=true flag set.
+
+ - End
+
+The listen thread can now quit, and perform the cleanup of migration
+state, the migration is now complete.
+
+Device transfer
+---------------
  
  Loading of device data may cause the device emulation to access guest RAM
  that may trigger faults that have to be resolved by the source, as such
@@ -130,7 +196,20 @@ processing.
 is no longer used by migration, while the listen thread carries on servicing
 page data until the end of migration.
  
-Postcopy Recovery
+Source side page bitmap
+-----------------------
+
+The 'migration bitmap' in postcopy is basically the same as in the precopy,
+where each of the bit to indicate that page is 'dirty' - i.e. needs
+sending.  During the precopy phase this is updated as the CPU dirties
+pages, however during postcopy the CPUs are stopped and nothing should
+dirty anything any more. Instead, dirty bits are cleared when the relevant
+pages are sent during postcopy.
+
+Postcopy features
+=================
+
+Postcopy recovery
  -
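
The state sequence described in the "State machine" section of the patch
above can be sketched as a small transition table.  This is only an
illustration of the linear path ADVISE->DISCARD->LISTEN->RUNNING->END as
quoted in the text; the names PostcopyState, NEXT_STATE, advance() and
full_path() are assumptions for the example, and the Paused state and its
recovery path are deliberately left out.

```python
from enum import Enum


class PostcopyState(Enum):
    ADVISE = "advise"
    DISCARD = "discard"
    LISTEN = "listen"
    RUNNING = "running"
    END = "end"


# Linear path quoted in the patch; Paused and recovery are not modelled.
NEXT_STATE = {
    PostcopyState.ADVISE: PostcopyState.DISCARD,
    PostcopyState.DISCARD: PostcopyState.LISTEN,
    PostcopyState.LISTEN: PostcopyState.RUNNING,
    PostcopyState.RUNNING: PostcopyState.END,
}


def advance(state):
    """Return the state that follows `state`, or raise if it is terminal."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.name} is a terminal state")
    return NEXT_STATE[state]


def full_path(start=PostcopyState.ADVISE):
    """Walk the table from `start` and return the visited state names."""
    path = [start]
    while path[-1] in NEXT_STATE:
        path.append(advance(path[-1]))
    return [state.name for state in path]


print("->".join(full_path()))  # ADVISE->DISCARD->LISTEN->RUNNING->END
```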

Re: [PATCH v3 3/4] ci: Add a migration compatibility test job

2024-01-08 Thread Peter Xu

On Fri, Jan 05, 2024 at 03:04:48PM -0300, Fabiano Rosas wrote:
> The migration tests have support for being passed two QEMU binaries to
> test migration compatibility.
> 
> Add a CI job that builds the latest release of QEMU and another job
> that uses that version plus an already present build of the current
> version and run the migration tests with the two, both as source and
> destination. I.e.:
> 
>  old QEMU (n-1) -> current QEMU (development tree)
>  current QEMU (development tree) -> old QEMU (n-1)
> 
> The purpose of this CI job is to ensure the code we're about to merge
> will not cause a migration compatibility problem when migrating the
> next release (which will contain that code) to/from the previous
> release.
> 
> I'm leaving the jobs as manual for now because using an older QEMU in
> tests could hit bugs that were already fixed in the current
> development tree and we need to handle those case-by-case.

Can we opt out of those broken tests using either your "since:" thing or
anything similar?

I hope we can start to run something by default in the CI in 9.0 to cover
n-1 -> n, even if starting with a subset of tests.  Is it possible?
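
One way to picture such an opt-out is a version filter over the test list.
The sketch below is purely hypothetical: it does not describe how the
"since:" mechanism is implemented, and the function name runnable_tests(),
the version tuples and the second test name are made up for illustration.

```python
def runnable_tests(tests, old_version):
    """Keep tests whose "since" version is not newer than the old binary."""
    return sorted(
        name
        for name, since in tests.items()
        if since is None or old_version >= since
    )


tests = {
    "/migration/multifd/tcp/plain/cancel": None,  # no version restriction
    "/migration/hypothetical/new-test": (9, 0),   # made-up test name
}

# With an 8.2 binary on one side, only the unrestricted test is kept.
print(runnable_tests(tests, (8, 2)))
```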

Thanks,

-- 
Peter Xu

Re: [PATCH 07/10] docs/migration: Split "dirty limit"

2024-01-08 Thread Cédric Le Goater


On 1/9/24 07:46, pet...@redhat.com wrote:

From: Peter Xu 

Split that into a separate file, put under "features".

Cc: Yong Huang 
Signed-off-by: Peter Xu 




Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  docs/devel/migration/dirty-limit.rst | 71 
  docs/devel/migration/features.rst|  1 +
  docs/devel/migration/main.rst| 71 
  3 files changed, 72 insertions(+), 71 deletions(-)
  create mode 100644 docs/devel/migration/dirty-limit.rst

diff --git a/docs/devel/migration/dirty-limit.rst b/docs/devel/migration/dirty-limit.rst
new file mode 100644
index 0000000000..8f32329d5f
--- /dev/null
+++ b/docs/devel/migration/dirty-limit.rst
@@ -0,0 +1,71 @@
+Dirty limit
+===========
+
+The dirty limit, short for dirty page rate upper limit, is a new capability
+introduced in the 8.1 QEMU release that uses a new algorithm based on the KVM
+dirty ring to throttle down the guest during live migration.
+
+The algorithm framework is as follows:
+
+::
+
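+  -------------------------------------------------------------------------
+  main   --------------> throttle thread ------------> PREPARE(1) <--------
+  thread  \                                                 |             |
+           \                                                |             |
+            \                                               V             |
+             -\                                       CALCULATE(2)        |
+               \                                            |             |
+                \                                           |             |
+                 \                                          V             |
+                  \                                  SET PENALTY(3) -------
+                   -\                                       |
+                     \                                      |
+                      \                                     V
+                       -> virtual CPU thread -----> ACCEPT PENALTY(4)
+  -------------------------------------------------------------------------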
+
+When the qmp command qmp_set_vcpu_dirty_limit is called for the first time,
+the QEMU main thread starts the throttle thread. The throttle thread, once
+launched, executes the loop, which consists of three steps:
+
+  - PREPARE (1)
+
+ The entire work of PREPARE (1) is preparation for the second stage,
+ CALCULATE(2), as the name implies. It involves preparing the dirty
+ page rate value and the corresponding upper limit of the VM:
+ The dirty page rate is calculated via the KVM dirty ring mechanism,
+ which tells QEMU how many dirty pages a virtual CPU has had since the
+ last KVM_EXIT_DIRTY_RING_FULL exception; The dirty page rate upper
+ limit is specified by caller, therefore fetch it directly.
+
+  - CALCULATE (2)
+
+ Calculate a suitable sleep period for each virtual CPU, which will be
+ used to determine the penalty for the target virtual CPU. The
+ computation must be done carefully in order to reduce the dirty page
+ rate progressively down to the upper limit without oscillation. To
+ achieve this, two strategies are provided: the first is to add or
+ subtract sleep time based on the ratio of the current dirty page rate
+ to the limit, which is used when the current dirty page rate is far
+ from the limit; the second is to add or subtract a fixed time when
+ the current dirty page rate is close to the limit.
+
+  - SET PENALTY (3)
+
+ Set the sleep time for each virtual CPU that should be penalized based
+ on the results of the calculation supplied by step CALCULATE (2).
+
+After completing the three above stages, the throttle thread loops back
+to step PREPARE (1) until the dirty limit is reached.
+
+On the other hand, each virtual CPU thread reads the sleep duration and
+sleeps in the path of the KVM_EXIT_DIRTY_RING_FULL exception handler, that
+is ACCEPT PENALTY (4). Virtual CPUs tied with writing processes will
+obviously exit to the path and get penalized, whereas virtual CPUs involved
+with read processes will not.
+
+In summary, thanks to the KVM dirty ring technology, the dirty limit
+algorithm will restrict virtual CPUs as needed to keep their dirty page
+rate inside the limit. This leads to more steady reading performance during
+live migration and can aid in improving large guest responsiveness.
diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
index 0054e0c900..e257d0d100 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -7,3 +7,4 @@ Migration has plenty of features to support different use cases.
    :maxdepth: 2
 
    postcopy
+   dirty-limit
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 051ea43f0e..00b9c3d32f 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@
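
The CALCULATE (2) step described in the patch above can be sketched as
follows.  This is a minimal illustration of the two strategies named in the
text (a ratio-based adjustment when the dirty page rate is far from the
limit, a fixed step when it is close), not QEMU's implementation; the
function name next_sleep_us(), the far_ratio threshold and the fixed_step_us
value are hypothetical.

```python
def next_sleep_us(sleep_us, rate, limit, far_ratio=0.5, fixed_step_us=100):
    """Return the next per-vCPU sleep time after one CALCULATE step."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if rate == limit:
        return sleep_us
    # Relative distance between the current dirty page rate and the limit.
    gap = abs(rate - limit) / limit
    if gap > far_ratio:
        # Far from the limit: scale the adjustment with the ratio.
        step = max(fixed_step_us, int(sleep_us * gap))
    else:
        # Close to the limit: nudge by a fixed amount to avoid oscillation.
        step = fixed_step_us
    if rate > limit:
        return sleep_us + step
    return max(0, sleep_us - step)


print(next_sleep_us(1000, 300, 100))  # 3000: far above the limit
print(next_sleep_us(1000, 110, 100))  # 1100: slightly above the limit
print(next_sleep_us(1000, 90, 100))   # 900: slightly below the limit
```

The SET PENALTY (3) step would then store the returned value for the vCPU,
and the loop would return to PREPARE (1).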

Re: [PATCH 06/10] docs/migration: Split "Postcopy"

2024-01-08 Thread Cédric Le Goater


On 1/9/24 07:46, pet...@redhat.com wrote:

From: Peter Xu 

Split postcopy into a separate file.  Introduce a head page "features.rst"
to keep all the features on top of migration framework.

Signed-off-by: Peter Xu 



Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  docs/devel/migration/features.rst |   9 +
  docs/devel/migration/index.rst|   1 +
  docs/devel/migration/main.rst | 305 --
  docs/devel/migration/postcopy.rst | 304 +
  4 files changed, 314 insertions(+), 305 deletions(-)
  create mode 100644 docs/devel/migration/features.rst
  create mode 100644 docs/devel/migration/postcopy.rst

diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
new file mode 100644
index 0000000000..0054e0c900
--- /dev/null
+++ b/docs/devel/migration/features.rst
@@ -0,0 +1,9 @@
+Migration features
+==================
+
+Migration has plenty of features to support different use cases.
+
+.. toctree::
+   :maxdepth: 2
+
+   postcopy
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index c09623b38f..7cf62541b9 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -8,6 +8,7 @@ QEMU live migration works.
    :maxdepth: 2
 
    main
+   features
    compatibility
    vfio
    virtio
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 97811ce371..051ea43f0e 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -644,308 +644,3 @@ algorithm will restrict virtual CPUs as needed to keep their dirty page
  rate inside the limit. This leads to more steady reading performance during
  live migration and can aid in improving large guest responsiveness.
  
-Postcopy
-========
-
-'Postcopy' migration is a way to deal with migrations that refuse to converge
-(or take too long to converge) its plus side is that there is an upper bound on
-the amount of migration traffic and time it takes, the down side is that during
-the postcopy phase, a failure of *either* side causes the guest to be lost.
-
-In postcopy the destination CPUs are started before all the memory has been
-transferred, and accesses to pages that are yet to be transferred cause
-a fault that's translated by QEMU into a request to the source QEMU.
-
-Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
-doesn't finish in a given time the switch is made to postcopy.
-
-Enabling postcopy
------------------
-
-To enable postcopy, issue this command on the monitor (both source and
-destination) prior to the start of migration:
-
-``migrate_set_capability postcopy-ram on``
-
-The normal commands are then used to start a migration, which is still
-started in precopy mode.  Issuing:
-
-``migrate_start_postcopy``
-
-will now cause the transition from precopy to postcopy.
-It can be issued immediately after migration is started or any
-time later on.  Issuing it after the end of a migration is harmless.
-
-Blocktime is a postcopy live migration metric, intended to show how
-long the vCPU was in state of interruptible sleep due to pagefault.
-That metric is calculated both for all vCPUs as overlapped value, and
-separately for each vCPU. These values are calculated on destination
-side.  To enable postcopy blocktime calculation, enter following
-command on destination monitor:
-
-``migrate_set_capability postcopy-blocktime on``
-
-Postcopy blocktime can be retrieved by query-migrate qmp command.
-postcopy-blocktime value of qmp command will show overlapped blocking
-time for all vCPU, postcopy-vcpu-blocktime will show list of blocking
-time per vCPU.
-
-.. note::
-  During the postcopy phase, the bandwidth limits set using
-  ``migrate_set_parameter`` is ignored (to avoid delaying requested pages that
-  the destination is waiting for).
-
-Postcopy device transfer
-------------------------
-
-Loading of device data may cause the device emulation to access guest RAM
-that may trigger faults that have to be resolved by the source, as such
-the migration stream has to be able to respond with page data *during* the
-device load, and hence the device data has to be read from the stream completely
-before the device load begins to free the stream up.  This is achieved by
-'packaging' the device data into a blob that's read in one go.
-
-Source behaviour
-----------------
-
-Until postcopy is entered the migration stream is identical to normal
-precopy, except for the addition of a 'postcopy advise' command at
-the beginning, to tell the destination that postcopy might happen.
-When postcopy starts the source sends the page discard data and then
-forms the 'package' containing:
-
-   - Command: 'postcopy listen'
-   - The device state
-
- A series of sections, identical to the precopy streams device state stream
- containing everything except postcopiable devices (i.e. RAM)
-   - Command: 'postcopy run'
-
-The 'package' is sent as the

Re: [PATCH 05/10] docs/migration: Split "Debugging" and "Firmware"

2024-01-08 Thread Cédric Le Goater


On 1/9/24 07:46, pet...@redhat.com wrote:

From: Peter Xu 

Move the two sections into a separate file called "best-practises.rst".
Add the entry into index.

Signed-off-by: Peter Xu 



Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  docs/devel/migration/best-practises.rst | 48 +
  docs/devel/migration/index.rst  |  1 +
  docs/devel/migration/main.rst   | 44 ---
  3 files changed, 49 insertions(+), 44 deletions(-)
  create mode 100644 docs/devel/migration/best-practises.rst

diff --git a/docs/devel/migration/best-practises.rst b/docs/devel/migration/best-practises.rst
new file mode 100644
index 0000000000..ba122ae417
--- /dev/null
+++ b/docs/devel/migration/best-practises.rst
@@ -0,0 +1,48 @@
+==============
+Best practises
+==============
+
+Debugging
+=========
+
+The migration stream can be analyzed thanks to ``scripts/analyze-migration.py``.
+
+Example usage:
+
+.. code-block:: shell
+
+  $ qemu-system-x86_64 -display none -monitor stdio
+  (qemu) migrate "exec:cat > mig"
+  (qemu) q
+  $ ./scripts/analyze-migration.py -f mig
+  {
+"ram (3)": {
+"section sizes": {
+"pc.ram": "0x0800",
+  ...
+
+See also ``analyze-migration.py -h`` help for more options.
+
+Firmware
+========
+
+Migration migrates the copies of RAM and ROM, and thus when running
+on the destination it includes the firmware from the source. Even after
+resetting a VM, the old firmware is used.  Only once QEMU has been restarted
+is the new firmware in use.
+
+- Changes in firmware size can cause changes in the required RAMBlock size
+  to hold the firmware and thus migration can fail.  In practice it's best
+  to pad firmware images to convenient powers of 2 with plenty of space
+  for growth.
+
+- Care should be taken with device emulation code so that newer
+  emulation code can work with older firmware to allow forward migration.
+
+- Care should be taken with newer firmware so that backward migration
+  to older systems with older device emulation code will work.
+
+In some cases it may be best to tie specific firmware versions to specific
+versioned machine types to cut down on the combinations that will need
+support.  This is also useful when newer versions of firmware outgrow
+the padding.
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 7fc02b9520..c09623b38f 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -11,3 +11,4 @@ QEMU live migration works.
    compatibility
    vfio
    virtio
+   best-practises
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index b3e31bb52f..97811ce371 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -52,27 +52,6 @@ All these migration protocols use the same infrastructure to
  save/restore state devices.  This infrastructure is shared with the
  savevm/loadvm functionality.
  
-Debugging
-=========
-
-The migration stream can be analyzed thanks to ``scripts/analyze-migration.py``.
-
-Example usage:
-
-.. code-block:: shell
-
-  $ qemu-system-x86_64 -display none -monitor stdio
-  (qemu) migrate "exec:cat > mig"
-  (qemu) q
-  $ ./scripts/analyze-migration.py -f mig
-  {
-"ram (3)": {
-"section sizes": {
-"pc.ram": "0x0800",
-  ...
-
-See also ``analyze-migration.py -h`` help for more options.
-
  Common infrastructure
  =====================
  
@@ -970,26 +949,3 @@ the background migration channel.  Anyone who cares about latencies of page
  faults during a postcopy migration should enable this feature.  By default,
  it's not enabled.
  
-Firmware
-========
-
-Migration migrates the copies of RAM and ROM, and thus when running
-on the destination it includes the firmware from the source. Even after
-resetting a VM, the old firmware is used.  Only once QEMU has been restarted
-is the new firmware in use.
-
-- Changes in firmware size can cause changes in the required RAMBlock size
-  to hold the firmware and thus migration can fail.  In practice it's best
-  to pad firmware images to convenient powers of 2 with plenty of space
-  for growth.
-
-- Care should be taken with device emulation code so that newer
-  emulation code can work with older firmware to allow forward migration.
-
-- Care should be taken with newer firmware so that backward migration
-  to older systems with older device emulation code will work.
-
-In some cases it may be best to tie specific firmware versions to specific
-versioned machine types to cut down on the combinations that will need
-support.  This is also useful when newer versions of firmware outgrow
-the padding.
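
The padding advice in the "Firmware" section above (pad images to a
convenient power of 2 with room for growth) can be sketched as a small
helper.  This is an illustration only: the function name padded_size() and
the growth_factor parameter are assumptions, not part of QEMU or of any
firmware build tooling.

```python
def padded_size(image_size, growth_factor=2):
    """Return a power-of-two size that leaves headroom for the image to grow.

    image_size    -- current firmware image size in bytes
    growth_factor -- integer multiple of the current size to reserve
    """
    if image_size <= 0 or growth_factor < 1:
        raise ValueError("image_size must be > 0 and growth_factor >= 1")
    target = image_size * growth_factor
    # Smallest power of two that is >= target.
    return 1 << (target - 1).bit_length()


# A 1.5 MB image with 2x headroom is padded to 4 MiB.
print(padded_size(1_500_000))  # 4194304
```

As long as later firmware builds stay below the padded size, the RAMBlock
size needed to hold them does not change.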

Re: [PATCH 04/10] docs/migration: Split "Backwards compatibility" separately

2024-01-08 Thread Cédric Le Goater


On 1/9/24 07:46, pet...@redhat.com wrote:

From: Peter Xu 

Split the section from main.rst into a separate file.  Reference it in the
index.rst.

Signed-off-by: Peter Xu 



Reviewed-by: Cédric Le Goater 

Thanks,

C.

Re: [PATCH 03/10] docs/migration: Convert virtio.txt into rST

2024-01-08 Thread Cédric Le Goater


On 1/9/24 07:46, pet...@redhat.com wrote:

From: Peter Xu 

Convert the plain old .txt into .rst, add it into migration/index.rst.

Signed-off-by: Peter Xu 



Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  docs/devel/migration/index.rst  |   1 +
  docs/devel/migration/virtio.rst | 115 
  docs/devel/migration/virtio.txt | 108 --
  3 files changed, 116 insertions(+), 108 deletions(-)
  create mode 100644 docs/devel/migration/virtio.rst
  delete mode 100644 docs/devel/migration/virtio.txt

diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 02cfdcc969..2cb701c77c 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -9,3 +9,4 @@ QEMU live migration works.
 
    main
    vfio
+   virtio
diff --git a/docs/devel/migration/virtio.rst b/docs/devel/migration/virtio.rst
new file mode 100644
index 0000000000..611a18b821
--- /dev/null
+++ b/docs/devel/migration/virtio.rst
@@ -0,0 +1,115 @@
+=======================
+Virtio device migration
+=======================
+
+Copyright 2015 IBM Corp.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.  See
+the COPYING file in the top-level directory.
+
+Saving and restoring the state of virtio devices is a bit of a twisty maze,
+for several reasons:
+
+- state is distributed between several parts:
+
+  - virtio core, for common fields like features, number of queues, ...
+
+  - virtio transport (pci, ccw, ...), for the different proxy devices and
+transport specific state (msix vectors, indicators, ...)
+
+  - virtio device (net, blk, ...), for the different device types and their
+state (mac address, request queue, ...)
+
+- most fields are saved via the stream interface; subsequently, subsections
+  have been added to make cross-version migration possible
+
+This file attempts to document the current procedure and point out some
+caveats.
+
+Save state procedure
+====================
+
+::
+
+  virtio core               virtio transport          virtio device
+  -----------               ----------------          -------------
+
+                                                      save() function registered
+                                                      via VMState wrapper on
+                                                      device class
+  virtio_save()                                       <--
+               -->          save_config()
+                            - save proxy device
+                            - save transport-specific
+                              device fields
+  - save common device
+    fields
+  - save common virtqueue
+    fields
+               -->          save_queue()
+                            - save transport-specific
+                              virtqueue fields
+               -->                                    save_device()
+                                                      - save device-specific
+                                                        fields
+  - save subsections
+    - device endianness,
+      if changed from
+      default endianness
+    - 64 bit features, if
+      any high feature bit
+      is set
+    - virtio-1 virtqueue
+      fields, if VERSION_1
+      is set
+
+Load state procedure
+====================
+
+::
+
+  virtio core               virtio transport          virtio device
+  -----------               ----------------          -------------
+
+                                                      load() function registered
+                                                      via VMState wrapper on
+                                                      device class
+  virtio_load()                                       <--
+               -->          load_config()
+                            - load proxy device
+                            - load transport-specific
+                              device fields
+  - load common device
+    fields
+  - load common virtqueue
+    fields
+               -->          load_queue()
+                            - load transport-specific
+                              virtqueue fields
+  - notify guest
+               -->                                    load_device()
+                                                      - load device-specific
+                                                        fields
+  - load subsections
+    - device endianness
+    - 64 bit features
+    - virtio-1 virtqueue
+      fields
+  - sanitize endianness
+  - sanitize features
+  - virtqueue index sanity
+    check
+                                                      - feature-dependent setup
+
+Implications of this setup
+==========================
+
+Devices need to be careful in their state processing during load: The
+load_device() procedure is invoked by the core before subsections have
+been loaded. Any code that depends on information transmitted in subsections

Re: [PATCH v6 1/2] qom: new object to associate device to numa node

2024-01-08 Thread Markus Armbruster

Ankit Agrawal  writes:

>>> +##
>>> +# @AcpiGenericInitiatorProperties:
>>> +#
>>> +# Properties for acpi-generic-initiator objects.
>>> +#
>>> +# @pci-dev: PCI device ID to be associated with the node
>>> +#
>>> +# @host-nodes: numa node list associated with the PCI device.
>>
>> NUMA
>>
>> Suggest "list of NUMA nodes associated with ..."
>
> Ack, will make the change.
>
>>> @@ -981,6 +997,7 @@
>>>  'id': 'str' },
>>>    'discriminator': 'qom-type',
>>>    'data': {
>>> +  'acpi-generic-initiator': 'AcpiGenericInitiatorProperties',
>>>    'authz-list': 'AuthZListProperties',
>>>    'authz-listfile': 'AuthZListFileProperties',
>>>    'authz-pam':  'AuthZPAMProperties',
>>
>> I'm holding my Acked-by until the interface design issues raised by
>> Jason have been resolved.
>
> I suppose you meant Jonathan here?

Yes.  Going too fast.  My apologies!

Re: [PATCH 02/10] docs/migration: Create index page

2024-01-08 Thread Cédric Le Goater


On 1/9/24 07:46, pet...@redhat.com wrote:

From: Peter Xu 

Create an index page for migration module.  Move VFIO migration there too.
A trivial touch-up on the title to use lower case there.

Since then we'll have "migration" as the top title, make the main doc file
renamed to "migration framework".

Cc: Alex Williamson 
Cc: Cédric Le Goater 
Signed-off-by: Peter Xu 



Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  docs/devel/index-internals.rst |  3 +--
  docs/devel/migration/index.rst | 11 +++
  docs/devel/migration/main.rst  |  6 +++---
  docs/devel/migration/vfio.rst  |  2 +-
  4 files changed, 16 insertions(+), 6 deletions(-)
  create mode 100644 docs/devel/migration/index.rst

diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index a41d62c1eb..5636e9cf1d 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -11,13 +11,12 @@ Details about QEMU's various subsystems including how to add features to them.
 block-coroutine-wrapper
 clocks
 ebpf_rss
-   migration/main
+   migration/index
 multi-process
 reset
 s390-cpu-topology
 s390-dasd-ipl
 tracing
-   vfio-migration
 vfio-iommufd
 writing-monitor-commands
 virtio-backends
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
new file mode 100644
index 0000000000..02cfdcc969
--- /dev/null
+++ b/docs/devel/migration/index.rst
@@ -0,0 +1,11 @@
+Migration
+=========
+
+This is the main entry for QEMU migration documentations.  It explains how
+QEMU live migration works.
+
+.. toctree::
+   :maxdepth: 2
+
+   main
+   vfio
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 95351ba51f..62bf027fb4 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -1,6 +1,6 @@
-=========
-Migration
-=========
+===================
+Migration framework
+===================
  
  QEMU has code to load/save the state of the guest that it is running.

  These are two complementary operations.  Saving the state just does
diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst
index 605fe60e96..c49482eab6 100644
--- a/docs/devel/migration/vfio.rst
+++ b/docs/devel/migration/vfio.rst
@@ -1,5 +1,5 @@
  =====================
-VFIO device Migration
+VFIO device migration
  =====================
  
  Migration of virtual machine involves saving the state for each device that

Re: [PATCH 01/10] docs/migration: Create migration/ directory

2024-01-08 Thread Cédric Le Goater


On 1/9/24 07:46, pet...@redhat.com wrote:

From: Peter Xu 

Migration documentation is growing into a single file too large.  Create a
sub-directory for it for a split.

We also already have separate vfio/virtio documentations, move it all over
into the directory.

Note that the virtio one is still not yet converted to rST.  That is a job
for later.

Cc: Michael S. Tsirkin 
Cc: Jason Wang 
Cc: Alex Williamson 
Cc: Cédric Le Goater 
Signed-off-by: Peter Xu 



Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  docs/devel/index-internals.rst| 2 +-
  docs/devel/{migration.rst => migration/main.rst}  | 0
  docs/devel/{vfio-migration.rst => migration/vfio.rst} | 0
  docs/devel/{virtio-migration.txt => migration/virtio.txt} | 0
  4 files changed, 1 insertion(+), 1 deletion(-)
  rename docs/devel/{migration.rst => migration/main.rst} (100%)
  rename docs/devel/{vfio-migration.rst => migration/vfio.rst} (100%)
  rename docs/devel/{virtio-migration.txt => migration/virtio.txt} (100%)

diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index 3def4a138b..a41d62c1eb 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -11,7 +11,7 @@ Details about QEMU's various subsystems including how to add features to them.
 block-coroutine-wrapper
 clocks
 ebpf_rss
-   migration
+   migration/main
 multi-process
 reset
 s390-cpu-topology
diff --git a/docs/devel/migration.rst b/docs/devel/migration/main.rst
similarity index 100%
rename from docs/devel/migration.rst
rename to docs/devel/migration/main.rst
diff --git a/docs/devel/vfio-migration.rst b/docs/devel/migration/vfio.rst
similarity index 100%
rename from docs/devel/vfio-migration.rst
rename to docs/devel/migration/vfio.rst
diff --git a/docs/devel/virtio-migration.txt b/docs/devel/migration/virtio.txt
similarity index 100%
rename from docs/devel/virtio-migration.txt
rename to docs/devel/migration/virtio.txt

[PATCH 10/10] docs/migration: Further move virtio to be feature of migration

2024-01-08 Thread peterx

From: Peter Xu 

Move it one layer down, so taking Virtio-migration as a feature for
migration.

Cc: Michael S. Tsirkin 
Cc: Jason Wang 
Signed-off-by: Peter Xu 
---
 docs/devel/migration/features.rst | 1 +
 docs/devel/migration/index.rst| 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
index dea016f707..a9acaf618e 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -9,3 +9,4 @@ Migration has plenty of features to support different use cases.
    postcopy
    dirty-limit
    vfio
+   virtio
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 2479e8ecb7..7b7a706e35 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -10,5 +10,4 @@ QEMU live migration works.
    main
    features
    compatibility
-   virtio
    best-practises
-- 
2.41.0

[PATCH 06/10] docs/migration: Split "Postcopy"

2024-01-08 Thread peterx

From: Peter Xu 

Split postcopy into a separate file.  Introduce a head page "features.rst"
to keep all the features on top of migration framework.

Signed-off-by: Peter Xu 
---
 docs/devel/migration/features.rst |   9 +
 docs/devel/migration/index.rst|   1 +
 docs/devel/migration/main.rst | 305 --
 docs/devel/migration/postcopy.rst | 304 +
 4 files changed, 314 insertions(+), 305 deletions(-)
 create mode 100644 docs/devel/migration/features.rst
 create mode 100644 docs/devel/migration/postcopy.rst

diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
new file mode 100644
index 0000000000..0054e0c900
--- /dev/null
+++ b/docs/devel/migration/features.rst
@@ -0,0 +1,9 @@
+Migration features
+==================
+
+Migration has plenty of features to support different use cases.
+
+.. toctree::
+   :maxdepth: 2
+
+   postcopy
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index c09623b38f..7cf62541b9 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -8,6 +8,7 @@ QEMU live migration works.
    :maxdepth: 2
 
    main
+   features
    compatibility
    vfio
    virtio
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 97811ce371..051ea43f0e 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -644,308 +644,3 @@ algorithm will restrict virtual CPUs as needed to keep their dirty page
 rate inside the limit. This leads to more steady reading performance during
 live migration and can aid in improving large guest responsiveness.
 
-Postcopy
-========
-
-'Postcopy' migration is a way to deal with migrations that refuse to converge
-(or take too long to converge) its plus side is that there is an upper bound on
-the amount of migration traffic and time it takes, the down side is that during
-the postcopy phase, a failure of *either* side causes the guest to be lost.
-
-In postcopy the destination CPUs are started before all the memory has been
-transferred, and accesses to pages that are yet to be transferred cause
-a fault that's translated by QEMU into a request to the source QEMU.
-
-Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
-doesn't finish in a given time the switch is made to postcopy.
-
-Enabling postcopy
------------------
-
-To enable postcopy, issue this command on the monitor (both source and
-destination) prior to the start of migration:
-
-``migrate_set_capability postcopy-ram on``
-
-The normal commands are then used to start a migration, which is still
-started in precopy mode.  Issuing:
-
-``migrate_start_postcopy``
-
-will now cause the transition from precopy to postcopy.
-It can be issued immediately after migration is started or any
-time later on.  Issuing it after the end of a migration is harmless.
-
-Blocktime is a postcopy live migration metric, intended to show how
-long the vCPU was in state of interruptible sleep due to pagefault.
-That metric is calculated both for all vCPUs as overlapped value, and
-separately for each vCPU. These values are calculated on destination
-side.  To enable postcopy blocktime calculation, enter following
-command on destination monitor:
-
-``migrate_set_capability postcopy-blocktime on``
-
-Postcopy blocktime can be retrieved by query-migrate qmp command.
-postcopy-blocktime value of qmp command will show overlapped blocking
-time for all vCPU, postcopy-vcpu-blocktime will show list of blocking
-time per vCPU.
-
-.. note::
-  During the postcopy phase, the bandwidth limits set using
-  ``migrate_set_parameter`` is ignored (to avoid delaying requested pages that
-  the destination is waiting for).
-
-Postcopy device transfer
-------------------------
-
-Loading of device data may cause the device emulation to access guest RAM
-that may trigger faults that have to be resolved by the source, as such
-the migration stream has to be able to respond with page data *during* the
-device load, and hence the device data has to be read from the stream completely
-before the device load begins to free the stream up.  This is achieved by
-'packaging' the device data into a blob that's read in one go.
-
-Source behaviour
------------------
-
-Until postcopy is entered the migration stream is identical to normal
-precopy, except for the addition of a 'postcopy advise' command at
-the beginning, to tell the destination that postcopy might happen.
-When postcopy starts the source sends the page discard data and then
-forms the 'package' containing:
-
-   - Command: 'postcopy listen'
-   - The device state
-
-     A series of sections, identical to the precopy streams device state stream
-     containing everything except postcopiable devices (i.e. RAM)
-   - Command: 'postcopy run'
-
-The 'package' is sent as the data part of a Command: ``CMD_PACKAGED``, and the
-contents are formatted in the same way as the main migration

[PATCH 02/10] docs/migration: Create index page

2024-01-08 Thread peterx

From: Peter Xu 

Create an index page for migration module.  Move VFIO migration there too.
A trivial touch-up on the title to use lower case there.

Since then we'll have "migration" as the top title, make the main doc file
renamed to "migration framework".

Cc: Alex Williamson 
Cc: Cédric Le Goater 
Signed-off-by: Peter Xu 
---
 docs/devel/index-internals.rst |  3 +--
 docs/devel/migration/index.rst | 11 +++
 docs/devel/migration/main.rst  |  6 +++---
 docs/devel/migration/vfio.rst  |  2 +-
 4 files changed, 16 insertions(+), 6 deletions(-)
 create mode 100644 docs/devel/migration/index.rst

diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index a41d62c1eb..5636e9cf1d 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -11,13 +11,12 @@ Details about QEMU's various subsystems including how to add features to them.
    block-coroutine-wrapper
    clocks
    ebpf_rss
-   migration/main
+   migration/index
    multi-process
    reset
    s390-cpu-topology
    s390-dasd-ipl
    tracing
-   vfio-migration
    vfio-iommufd
    writing-monitor-commands
    virtio-backends
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
new file mode 100644
index 0000000000..02cfdcc969
--- /dev/null
+++ b/docs/devel/migration/index.rst
@@ -0,0 +1,11 @@
+Migration
+=========
+
+This is the main entry for QEMU migration documentations.  It explains how
+QEMU live migration works.
+
+.. toctree::
+   :maxdepth: 2
+
+   main
+   vfio
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 95351ba51f..62bf027fb4 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -1,6 +1,6 @@
-=========
-Migration
-=========
+===================
+Migration framework
+===================
 
 QEMU has code to load/save the state of the guest that it is running.
 These are two complementary operations.  Saving the state just does
diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst
index 605fe60e96..c49482eab6 100644
--- a/docs/devel/migration/vfio.rst
+++ b/docs/devel/migration/vfio.rst
@@ -1,5 +1,5 @@
 =====================
-VFIO device Migration
+VFIO device migration
 =====================
 
 Migration of virtual machine involves saving the state for each device that
-- 
2.41.0

[PATCH 09/10] docs/migration: Further move vfio to be feature of migration

2024-01-08 Thread peterx

From: Peter Xu 

Move it one layer down, so taking VFIO-migration as a feature for
migration.

Cc: Alex Williamson 
Cc: Cédric Le Goater 
Signed-off-by: Peter Xu 
---
 docs/devel/migration/features.rst | 1 +
 docs/devel/migration/index.rst| 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
index e257d0d100..dea016f707 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -8,3 +8,4 @@ Migration has plenty of features to support different use cases.
 
    postcopy
    dirty-limit
+   vfio
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 7cf62541b9..2479e8ecb7 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -10,6 +10,5 @@ QEMU live migration works.
    main
    features
    compatibility
-   vfio
    virtio
    best-practises
-- 
2.41.0

[PATCH 08/10] docs/migration: Organize "Postcopy" page

2024-01-08 Thread peterx

From: Peter Xu 

Reorganize the page, moving things around, and add a few
headlines ("Postcopy internals", "Postcopy features") to cover sub-areas.

Signed-off-by: Peter Xu 
---
 docs/devel/migration/postcopy.rst | 159 --
 1 file changed, 84 insertions(+), 75 deletions(-)

diff --git a/docs/devel/migration/postcopy.rst b/docs/devel/migration/postcopy.rst
index d60eec06ab..6c51e96d79 100644
--- a/docs/devel/migration/postcopy.rst
+++ b/docs/devel/migration/postcopy.rst
@@ -1,6 +1,9 @@
+========
 Postcopy
 ========
 
+.. contents::
+
 'Postcopy' migration is a way to deal with migrations that refuse to converge
 (or take too long to converge) its plus side is that there is an upper bound on
 the amount of migration traffic and time it takes, the down side is that during
@@ -14,7 +17,7 @@ Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
 doesn't finish in a given time the switch is made to postcopy.
 
 Enabling postcopy
------------------
+=================
 
 To enable postcopy, issue this command on the monitor (both source and
 destination) prior to the start of migration:
@@ -49,8 +52,71 @@ time per vCPU.
   ``migrate_set_parameter`` is ignored (to avoid delaying requested pages that
   the destination is waiting for).
 
-Postcopy device transfer
-------------------------
+Postcopy internals
+==================
+
+State machine
+-------------
+
+Postcopy moves through a series of states (see postcopy_state) from
+ADVISE->DISCARD->LISTEN->RUNNING->END
+
+ - Advise
+
+    Set at the start of migration if postcopy is enabled, even
+    if it hasn't had the start command; here the destination
+    checks that its OS has the support needed for postcopy, and performs
+    setup to ensure the RAM mappings are suitable for later postcopy.
+    The destination will fail early in migration at this point if the
+    required OS support is not present.
+    (Triggered by reception of POSTCOPY_ADVISE command)
+
+ - Discard
+
+    Entered on receipt of the first 'discard' command; prior to
+    the first Discard being performed, hugepages are switched off
+    (using madvise) to ensure that no new huge pages are created
+    during the postcopy phase, and to cause any huge pages that
+    have discards on them to be broken.
+
+ - Listen
+
+    The first command in the package, POSTCOPY_LISTEN, switches
+    the destination state to Listen, and starts a new thread
+    (the 'listen thread') which takes over the job of receiving
+    pages off the migration stream, while the main thread carries
+    on processing the blob.  With this thread able to process page
+    reception, the destination now 'sensitises' the RAM to detect
+    any access to missing pages (on Linux using the 'userfault'
+    system).
+
+ - Running
+
+    POSTCOPY_RUN causes the destination to synchronise all
+    state and start the CPUs and IO devices running.  The main
+    thread now finishes processing the migration package and
+    now carries on as it would for normal precopy migration
+    (although it can't do the cleanup it would do as it
+    finishes a normal migration).
+
+ - Paused
+
+    Postcopy can run into a paused state (normally on both sides when
+    happens), where all threads will be temporarily halted mostly due to
+    network errors.  When reaching paused state, migration will make sure
+    the qemu binary on both sides maintain the data without corrupting
+    the VM.  To continue the migration, the admin needs to fix the
+    migration channel using the QMP command 'migrate-recover' on the
+    destination node, then resume the migration using QMP command 'migrate'
+    again on source node, with resume=true flag set.
+
+ - End
+
+    The listen thread can now quit, and perform the cleanup of migration
+    state, the migration is now complete.
+
+Device transfer
+---------------
 
 Loading of device data may cause the device emulation to access guest RAM
 that may trigger faults that have to be resolved by the source, as such
@@ -130,7 +196,20 @@ processing.
is no longer used by migration, while the listen thread carries on servicing
page data until the end of migration.
 
-Postcopy Recovery
+Source side page bitmap
+-----------------------
+
+The 'migration bitmap' in postcopy is basically the same as in the precopy,
+where each of the bit to indicate that page is 'dirty' - i.e. needs
+sending.  During the precopy phase this is updated as the CPU dirties
+pages, however during postcopy the CPUs are stopped and nothing should
+dirty anything any more. Instead, dirty bits are cleared when the relevant
+pages are sent during postcopy.
+
+Postcopy features
+=================
+
+Postcopy recovery
 -----------------
 
 Comparing to precopy, postcopy is special on error handlings.  When any
@@ -166,76 +245,6 @@ configurations of the guest.  For example, when with async page fault
 enabled, logically the guest can proactively schedule out the threads
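
As a reading aid for the "State machine" text in the patch above, the forward path ADVISE->DISCARD->LISTEN->RUNNING->END can be modelled as a tiny ordered enum.  This is only a sketch and not QEMU code: `PostcopyState`, `advance()` and `walk()` are hypothetical names made up for the illustration, and the Paused state (entered on network errors and left again through recovery) is deliberately left out.

```python
from enum import Enum


class PostcopyState(Enum):
    """Hypothetical mirror of the states named in the document."""
    ADVISE = 1
    DISCARD = 2
    LISTEN = 3
    RUNNING = 4
    END = 5


# Forward order as listed in the text: ADVISE->DISCARD->LISTEN->RUNNING->END.
_ORDER = list(PostcopyState)


def advance(state):
    """Return the state that follows `state`, or raise once END is reached."""
    idx = _ORDER.index(state)
    if idx + 1 >= len(_ORDER):
        raise ValueError("END is the final postcopy state")
    return _ORDER[idx + 1]


def walk(start=PostcopyState.ADVISE):
    """List every state visited from `start` through END."""
    path = [start]
    while path[-1] is not PostcopyState.END:
        path.append(advance(path[-1]))
    return path
```

Calling `walk()` lists the states in the order given in the text; a model that also covers Paused would need an extra edge out of and back into the running phase.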

[PATCH 07/10] docs/migration: Split "dirty limit"

2024-01-08 Thread peterx

From: Peter Xu 

Split that into a separate file, put under "features".

Cc: Yong Huang 
Signed-off-by: Peter Xu 
---
 docs/devel/migration/dirty-limit.rst | 71 
 docs/devel/migration/features.rst|  1 +
 docs/devel/migration/main.rst| 71 
 3 files changed, 72 insertions(+), 71 deletions(-)
 create mode 100644 docs/devel/migration/dirty-limit.rst

diff --git a/docs/devel/migration/dirty-limit.rst b/docs/devel/migration/dirty-limit.rst
new file mode 100644
index 0000000000..8f32329d5f
--- /dev/null
+++ b/docs/devel/migration/dirty-limit.rst
@@ -0,0 +1,71 @@
+Dirty limit
+===========
+
+The dirty limit, short for dirty page rate upper limit, is a new capability
+introduced in the 8.1 QEMU release that uses a new algorithm based on the KVM
+dirty ring to throttle down the guest during live migration.
+
+The algorithm framework is as follows:
+
+::
+
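+  ------------------------------------------------------------------------------
+  main   --------------> throttle thread ------------> PREPARE(1) <--------
+  thread  \                                                |              |
+           \                                               |              |
+            \                                              V              |
+             -\                                        CALCULATE(2)       |
+               \                                           |              |
+                \                                          |              |
+                 \                                         V              |
+                  \                                    SET PENALTY(3) -----
+                   -\                                      |
+                     \                                     |
+                      \                                    V
+                       -> virtual CPU thread -------> ACCEPT PENALTY(4)
+  ------------------------------------------------------------------------------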
+
+When the qmp command qmp_set_vcpu_dirty_limit is called for the first time,
+the QEMU main thread starts the throttle thread. The throttle thread, once
+launched, executes the loop, which consists of three steps:
+
+  - PREPARE (1)
+
+     The entire work of PREPARE (1) is preparation for the second stage,
+     CALCULATE(2), as the name implies. It involves preparing the dirty
+     page rate value and the corresponding upper limit of the VM:
+     The dirty page rate is calculated via the KVM dirty ring mechanism,
+     which tells QEMU how many dirty pages a virtual CPU has had since the
+     last KVM_EXIT_DIRTY_RING_FULL exception; The dirty page rate upper
+     limit is specified by caller, therefore fetch it directly.
+
+  - CALCULATE (2)
+
+     Calculate a suitable sleep period for each virtual CPU, which will be
+     used to determine the penalty for the target virtual CPU. The
+     computation must be done carefully in order to reduce the dirty page
+     rate progressively down to the upper limit without oscillation. To
+     achieve this, two strategies are provided: the first is to add or
+     subtract sleep time based on the ratio of the current dirty page rate
+     to the limit, which is used when the current dirty page rate is far
+     from the limit; the second is to add or subtract a fixed time when
+     the current dirty page rate is close to the limit.
+
+  - SET PENALTY (3)
+
+     Set the sleep time for each virtual CPU that should be penalized based
+     on the results of the calculation supplied by step CALCULATE (2).
+
+After completing the three above stages, the throttle thread loops back
+to step PREPARE (1) until the dirty limit is reached.
+
+On the other hand, each virtual CPU thread reads the sleep duration and
+sleeps in the path of the KVM_EXIT_DIRTY_RING_FULL exception handler, that
+is ACCEPT PENALTY (4). Virtual CPUs tied with writing processes will
+obviously exit to the path and get penalized, whereas virtual CPUs involved
+with read processes will not.
+
+In summary, thanks to the KVM dirty ring technology, the dirty limit
+algorithm will restrict virtual CPUs as needed to keep their dirty page
+rate inside the limit. This leads to more steady reading performance during
+live migration and can aid in improving large guest responsiveness.
diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
index 0054e0c900..e257d0d100 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -7,3 +7,4 @@ Migration has plenty of features to support different use cases.
    :maxdepth: 2
 
    postcopy
+   dirty-limit
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 051ea43f0e..00b9c3d32f 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -573,74 +573,3 @@ path.
  Return path  - opened by main thread, written by main thread AND postcopy
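
The two strategies described under CALCULATE (2) in the patch above (a ratio-based adjustment when the dirty page rate is far from the limit, a fixed step when it is close) could look roughly like the sketch below.  It is not the QEMU implementation: the function name, the `near_pct` threshold and the `step_us` default are assumptions chosen only to make the idea concrete.

```python
def next_sleep_us(sleep_us, rate, limit, near_pct=10, step_us=1000):
    """One CALCULATE(2)-style update of a vCPU's sleep time.

    All parameter names and defaults are illustrative assumptions.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Distance between the current rate and the limit, in percent of the limit.
    gap_pct = abs(rate - limit) * 100 // limit
    if gap_pct > near_pct:
        # Far from the limit: scale the adjustment with the size of the gap.
        delta = max(step_us, sleep_us * gap_pct // 100)
    else:
        # Close to the limit: move by a fixed step to avoid oscillation.
        delta = step_us
    if rate > limit:
        return sleep_us + delta
    if rate < limit:
        return max(0, sleep_us - delta)
    return sleep_us
```

With these assumed defaults, a vCPU dirtying pages at twice the limit has its sleep time doubled in one step, while a vCPU within 10% of the limit is only nudged by `step_us`.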

[PATCH 03/10] docs/migration: Convert virtio.txt into rST

2024-01-08 Thread peterx

From: Peter Xu 

Convert the plain old .txt into .rst, add it into migration/index.rst.

Signed-off-by: Peter Xu 
---
 docs/devel/migration/index.rst  |   1 +
 docs/devel/migration/virtio.rst | 115 
 docs/devel/migration/virtio.txt | 108 --
 3 files changed, 116 insertions(+), 108 deletions(-)
 create mode 100644 docs/devel/migration/virtio.rst
 delete mode 100644 docs/devel/migration/virtio.txt

diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 02cfdcc969..2cb701c77c 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -9,3 +9,4 @@ QEMU live migration works.
 
    main
    vfio
+   virtio
diff --git a/docs/devel/migration/virtio.rst b/docs/devel/migration/virtio.rst
new file mode 100644
index 0000000000..611a18b821
--- /dev/null
+++ b/docs/devel/migration/virtio.rst
@@ -0,0 +1,115 @@
+=======================
+Virtio device migration
+=======================
+
+Copyright 2015 IBM Corp.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.  See
+the COPYING file in the top-level directory.
+
+Saving and restoring the state of virtio devices is a bit of a twisty maze,
+for several reasons:
+
+- state is distributed between several parts:
+
+  - virtio core, for common fields like features, number of queues, ...
+
+  - virtio transport (pci, ccw, ...), for the different proxy devices and
+    transport specific state (msix vectors, indicators, ...)
+
+  - virtio device (net, blk, ...), for the different device types and their
+    state (mac address, request queue, ...)
+
+- most fields are saved via the stream interface; subsequently, subsections
+  have been added to make cross-version migration possible
+
+This file attempts to document the current procedure and point out some
+caveats.
+
+Save state procedure
+====================
+
+::
+
+  virtio core               virtio transport          virtio device
+  -----------               ----------------          -------------
+
+                                                      save() function registered
+                                                      via VMState wrapper on
+                                                      device class
+  virtio_save()                                       <----------
+               ------>      save_config()
+                            - save proxy device
+                            - save transport-specific
+                              device fields
+  - save common device
+    fields
+  - save common virtqueue
+    fields
+               ------>      save_queue()
+                            - save transport-specific
+                              virtqueue fields
+               ------>                               save_device()
+                                                     - save device-specific
+                                                       fields
+  - save subsections
+    - device endianness,
+      if changed from
+      default endianness
+    - 64 bit features, if
+      any high feature bit
+      is set
+    - virtio-1 virtqueue
+      fields, if VERSION_1
+      is set
+
+Load state procedure
+====================
+
+::
+
+  virtio core               virtio transport          virtio device
+  -----------               ----------------          -------------
+
+                                                      load() function registered
+                                                      via VMState wrapper on
+                                                      device class
+  virtio_load()                                       <----------
+               ------>      load_config()
+                            - load proxy device
+                            - load transport-specific
+                              device fields
+  - load common device
+    fields
+  - load common virtqueue
+    fields
+               ------>      load_queue()
+                            - load transport-specific
+                              virtqueue fields
+  - notify guest
+               ------>                               load_device()
+                                                     - load device-specific
+                                                       fields
+  - load subsections
+    - device endianness
+    - 64 bit features
+    - virtio-1 virtqueue
+      fields
+  - sanitize endianness
+  - sanitize features
+  - virtqueue index sanity
+    check
+                                                     - feature-dependent setup
+
+Implications of this setup
+==========================
+
+Devices need to be careful in their state processing during load: The
+load_device() procedure is invoked by the core before subsections have
+been loaded. Any code that depends on information transmitted in subsections
+therefore has to be invoked in the device's load() function _after_
+virtio_load() returned (like e.g.
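
The ordering caveat in the "Implications of this setup" text above can be shown with a toy model.  Nothing below is QEMU API: `ToyVirtioDevice`, `toy_virtio_load()` and `toy_device_load()` are hypothetical stand-ins, and the `features64` key merely plays the role of a field carried in a subsection.  The only point is that the device callback runs before subsections are read, so dependent work belongs after the core loader returns.

```python
class ToyVirtioDevice:
    """Toy stand-in for a virtio device; not a QEMU type."""

    def __init__(self):
        self.fields = {}
        self.log = []

    def load_device(self, stream):
        # Called by the core *before* subsections have been loaded.
        self.fields["device"] = stream["device"]
        self.log.append(("load_device", "features64" in self.fields))

    def post_subsection_setup(self):
        # Safe place for work that needs subsection data.
        self.log.append(("post_setup", "features64" in self.fields))


def toy_virtio_load(dev, stream):
    """Toy core loader following the order shown in the load diagram."""
    dev.fields["common"] = stream["common"]
    dev.load_device(stream)
    # Subsections are read last.
    dev.fields.update(stream.get("subsections", {}))


def toy_device_load(dev, stream):
    """Device-level load(): core load first, dependent setup afterwards."""
    toy_virtio_load(dev, stream)
    dev.post_subsection_setup()
    return dev.log
```

The log records whether the subsection field was visible at each step: it is absent inside `load_device()` and present in the setup that runs after the core loader has returned.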

[PATCH 05/10] docs/migration: Split "Debugging" and "Firmware"

2024-01-08 Thread peterx

From: Peter Xu 

Move the two sections into a separate file called "best-practises.rst".
Add the entry into index.

Signed-off-by: Peter Xu 
---
 docs/devel/migration/best-practises.rst | 48 +
 docs/devel/migration/index.rst  |  1 +
 docs/devel/migration/main.rst   | 44 ---
 3 files changed, 49 insertions(+), 44 deletions(-)
 create mode 100644 docs/devel/migration/best-practises.rst

diff --git a/docs/devel/migration/best-practises.rst b/docs/devel/migration/best-practises.rst
new file mode 100644
index 0000000000..ba122ae417
--- /dev/null
+++ b/docs/devel/migration/best-practises.rst
@@ -0,0 +1,48 @@
+==============
+Best practises
+==============
+
+Debugging
+=========
+
+The migration stream can be analyzed thanks to ``scripts/analyze-migration.py``.
+
+Example usage:
+
+.. code-block:: shell
+
+  $ qemu-system-x86_64 -display none -monitor stdio
+  (qemu) migrate "exec:cat > mig"
+  (qemu) q
+  $ ./scripts/analyze-migration.py -f mig
+  {
+    "ram (3)": {
+        "section sizes": {
+            "pc.ram": "0x0000000008000000",
+  ...
+
+See also ``analyze-migration.py -h`` help for more options.
+
+Firmware
+========
+
+Migration migrates the copies of RAM and ROM, and thus when running
+on the destination it includes the firmware from the source. Even after
+resetting a VM, the old firmware is used.  Only once QEMU has been restarted
+is the new firmware in use.
+
+- Changes in firmware size can cause changes in the required RAMBlock size
+  to hold the firmware and thus migration can fail.  In practice it's best
+  to pad firmware images to convenient powers of 2 with plenty of space
+  for growth.
+
+- Care should be taken with device emulation code so that newer
+  emulation code can work with older firmware to allow forward migration.
+
+- Care should be taken with newer firmware so that backward migration
+  to older systems with older device emulation code will work.
+
+In some cases it may be best to tie specific firmware versions to specific
+versioned machine types to cut down on the combinations that will need
+support.  This is also useful when newer versions of firmware outgrow
+the padding.
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 7fc02b9520..c09623b38f 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -11,3 +11,4 @@ QEMU live migration works.
    compatibility
    vfio
    virtio
+   best-practises
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index b3e31bb52f..97811ce371 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -52,27 +52,6 @@ All these migration protocols use the same infrastructure to
 save/restore state devices.  This infrastructure is shared with the
 savevm/loadvm functionality.
 
-Debugging
-=========
-
-The migration stream can be analyzed thanks to ``scripts/analyze-migration.py``.
-
-Example usage:
-
-.. code-block:: shell
-
-  $ qemu-system-x86_64 -display none -monitor stdio
-  (qemu) migrate "exec:cat > mig"
-  (qemu) q
-  $ ./scripts/analyze-migration.py -f mig
-  {
-    "ram (3)": {
-        "section sizes": {
-            "pc.ram": "0x0000000008000000",
-  ...
-
-See also ``analyze-migration.py -h`` help for more options.
-
 Common infrastructure
 =====================
 
@@ -970,26 +949,3 @@ the background migration channel.  Anyone who cares about latencies of page
 faults during a postcopy migration should enable this feature.  By default,
 it's not enabled.
 
-Firmware
-========
-
-Migration migrates the copies of RAM and ROM, and thus when running
-on the destination it includes the firmware from the source. Even after
-resetting a VM, the old firmware is used.  Only once QEMU has been restarted
-is the new firmware in use.
-
-- Changes in firmware size can cause changes in the required RAMBlock size
-  to hold the firmware and thus migration can fail.  In practice it's best
-  to pad firmware images to convenient powers of 2 with plenty of space
-  for growth.
-
-- Care should be taken with device emulation code so that newer
-  emulation code can work with older firmware to allow forward migration.
-
-- Care should be taken with newer firmware so that backward migration
-  to older systems with older device emulation code will work.
-
-In some cases it may be best to tie specific firmware versions to specific
-versioned machine types to cut down on the combinations that will need
-support.  This is also useful when newer versions of firmware outgrow
-the padding.
-- 
2.41.0
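
The "Firmware" advice in the patch above (pad images to a convenient power of 2, leaving room for growth) can be sketched as follows.  This is an illustration only: the helper names, the `headroom` factor and the 0xff fill byte are assumptions, not something the patch or QEMU prescribes.

```python
def padded_firmware_size(image_size, headroom=2):
    """Smallest power of two that holds `image_size * headroom` bytes.

    `headroom` is an assumed integer growth factor, not a QEMU parameter.
    """
    if image_size <= 0 or headroom < 1:
        raise ValueError("image_size must be > 0 and headroom >= 1")
    needed = image_size * headroom
    size = 1
    while size < needed:
        size *= 2
    return size


def pad_image(data, fill=b"\xff", headroom=2):
    """Return `data` padded with `fill` bytes up to the chosen size."""
    target = padded_firmware_size(len(data), headroom)
    return data + fill * (target - len(data))
```

Keeping the padded size stable across firmware updates is what keeps the RAMBlock size unchanged between source and destination, which is the failure mode the text warns about.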

[PATCH 01/10] docs/migration: Create migration/ directory

2024-01-08 Thread peterx

From: Peter Xu 

Migration documentation is growing into a single file too large.  Create a
sub-directory for it for a split.

We also already have separate vfio/virtio documentations, move it all over
into the directory.

Note that the virtio one is still not yet converted to rST.  That is a job
for later.

Cc: Michael S. Tsirkin 
Cc: Jason Wang 
Cc: Alex Williamson 
Cc: Cédric Le Goater 
Signed-off-by: Peter Xu 
---
 docs/devel/index-internals.rst| 2 +-
 docs/devel/{migration.rst => migration/main.rst}  | 0
 docs/devel/{vfio-migration.rst => migration/vfio.rst} | 0
 docs/devel/{virtio-migration.txt => migration/virtio.txt} | 0
 4 files changed, 1 insertion(+), 1 deletion(-)
 rename docs/devel/{migration.rst => migration/main.rst} (100%)
 rename docs/devel/{vfio-migration.rst => migration/vfio.rst} (100%)
 rename docs/devel/{virtio-migration.txt => migration/virtio.txt} (100%)

diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index 3def4a138b..a41d62c1eb 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -11,7 +11,7 @@ Details about QEMU's various subsystems including how to add features to them.
block-coroutine-wrapper
clocks
ebpf_rss
-   migration
+   migration/main
multi-process
reset
s390-cpu-topology
diff --git a/docs/devel/migration.rst b/docs/devel/migration/main.rst
similarity index 100%
rename from docs/devel/migration.rst
rename to docs/devel/migration/main.rst
diff --git a/docs/devel/vfio-migration.rst b/docs/devel/migration/vfio.rst
similarity index 100%
rename from docs/devel/vfio-migration.rst
rename to docs/devel/migration/vfio.rst
diff --git a/docs/devel/virtio-migration.txt b/docs/devel/migration/virtio.txt
similarity index 100%
rename from docs/devel/virtio-migration.txt
rename to docs/devel/migration/virtio.txt
-- 
2.41.0

[PATCH 00/10] docs/migration: Reorganize migration documentations

2024-01-08 Thread peterx

From: Peter Xu 

Migration docs grow larger and larger.  There are plenty of things we can
do here in the future, but to start that we'd better reorganize the current
bloated doc files first and properly organize them into separate files.
This series kicks that off.

This series mostly does the movement only, so please don't be scared of the
slightly large diff.  I did touch up things here and there, but I haven't
started writing much yet.  One thing I did is convert virtio.txt to rST,
but that's trivial and I touched no real content.

I am copying both virtio and vfio people because I'm merging the two
separate files into the new docs/devel/migration/ folder.

Comments welcomed.  Thanks,

Peter Xu (10):
  docs/migration: Create migration/ directory
  docs/migration: Create index page
  docs/migration: Convert virtio.txt into rST
  docs/migration: Split "Backwards compatibility" separately
  docs/migration: Split "Debugging" and "Firmware"
  docs/migration: Split "Postcopy"
  docs/migration: Split "dirty limit"
  docs/migration: Organize "Postcopy" page
  docs/migration: Further move vfio to be feature of migration
  docs/migration: Further move virtio to be feature of migration

 docs/devel/index-internals.rst|3 +-
 docs/devel/migration.rst  | 1514 -
 docs/devel/migration/best-practises.rst   |   48 +
 docs/devel/migration/compatibility.rst|  517 ++
 docs/devel/migration/dirty-limit.rst  |   71 +
 docs/devel/migration/features.rst |   12 +
 docs/devel/migration/index.rst|   13 +
 docs/devel/migration/main.rst |  575 +++
 docs/devel/migration/postcopy.rst |  313 
 .../vfio.rst} |2 +-
 docs/devel/migration/virtio.rst   |  115 ++
 docs/devel/virtio-migration.txt   |  108 --
 12 files changed, 1666 insertions(+), 1625 deletions(-)
 delete mode 100644 docs/devel/migration.rst
 create mode 100644 docs/devel/migration/best-practises.rst
 create mode 100644 docs/devel/migration/compatibility.rst
 create mode 100644 docs/devel/migration/dirty-limit.rst
 create mode 100644 docs/devel/migration/features.rst
 create mode 100644 docs/devel/migration/index.rst
 create mode 100644 docs/devel/migration/main.rst
 create mode 100644 docs/devel/migration/postcopy.rst
 rename docs/devel/{vfio-migration.rst => migration/vfio.rst} (99%)
 create mode 100644 docs/devel/migration/virtio.rst
 delete mode 100644 docs/devel/virtio-migration.txt

-- 
2.41.0

[PATCH 04/10] docs/migration: Split "Backwards compatibility" separately

2024-01-08 Thread peterx

From: Peter Xu 

Split the section from main.rst into a separate file.  Reference it in the
index.rst.

Signed-off-by: Peter Xu 
---
 docs/devel/migration/compatibility.rst | 517 
 docs/devel/migration/index.rst |   1 +
 docs/devel/migration/main.rst  | 519 -
 3 files changed, 518 insertions(+), 519 deletions(-)
 create mode 100644 docs/devel/migration/compatibility.rst

diff --git a/docs/devel/migration/compatibility.rst b/docs/devel/migration/compatibility.rst
new file mode 100644
index 0000000000..5a5417ef06
--- /dev/null
+++ b/docs/devel/migration/compatibility.rst
@@ -0,0 +1,517 @@
+Backwards compatibility
+=======================
+
+How backwards compatibility works
+---------------------------------
+
+When we do migration, we have two QEMU processes: the source and the
+target.  There are two cases, they are the same version or they are
+different versions.  The easy case is when they are the same version.
+The difficult one is when they are different versions.
+
+There are two things that are different, but they have very similar
+names and sometimes get confused:
+
+- QEMU version
+- machine type version
+
+Let's start with a practical example, we start with:
+
+- qemu-system-x86_64 (v5.2), from now on qemu-5.2.
+- qemu-system-x86_64 (v5.1), from now on qemu-5.1.
+
+Related to this are the "latest" machine types defined on each of
+them:
+
+- pc-q35-5.2 (newer one in qemu-5.2) from now on pc-5.2
+- pc-q35-5.1 (newer one in qemu-5.1) from now on pc-5.1
+
+First of all, migration is only supposed to work if you use the same
+machine type in both source and destination. The QEMU hardware
+configuration needs to be the same also on source and destination.
+Most aspects of the backend configuration can be changed at will,
+except for a few cases where the backend features influence frontend
+device feature exposure.  But that is not relevant for this section.
+
+I am going to list the number of combinations that we can have.  Let's
+start with the trivial ones, QEMU is the same on source and
+destination:
+
+1 - qemu-5.2 -M pc-5.2  -> migrates to -> qemu-5.2 -M pc-5.2
+
+  This is the latest QEMU with the latest machine type.
+  This has to work, and if it doesn't work it is a bug.
+
+2 - qemu-5.1 -M pc-5.1  -> migrates to -> qemu-5.1 -M pc-5.1
+
+  Exactly the same case as the previous one, but for 5.1.
+  Nothing to see here either.
+
+These are the easiest ones; we will not talk more about them in this
+section.
+
+Now we start with the more interesting cases.  Consider the case where
+we have the same QEMU version on both sides (qemu-5.2) but we are not
+using the latest machine type for that version (pc-5.2) but one of an
+older QEMU version, in this case pc-5.1.
+
+3 - qemu-5.2 -M pc-5.1  -> migrates to -> qemu-5.2 -M pc-5.1
+
+  It needs to use the definition of pc-5.1 and the devices as they
+  were configured on 5.1, but this should be easy in the sense that
+  both sides are the same QEMU and both sides have exactly the same
+  idea of what the pc-5.1 machine is.
+
+4 - qemu-5.1 -M pc-5.2  -> migrates to -> qemu-5.1 -M pc-5.2
+
+  This combination is not possible as qemu-5.1 doesn't understand the
+  pc-5.2 machine type.  So there is nothing to worry about here.
+
+Now come the interesting ones, when both QEMU processes are
+different.  Notice also that the machine type needs to be pc-5.1,
+because we have the limitation that qemu-5.1 doesn't know pc-5.2.  So
+the possible cases are:
+
+5 - qemu-5.2 -M pc-5.1  -> migrates to -> qemu-5.1 -M pc-5.1
+
+  This migration is known as newer to older.  When developing 5.2, we
+  need to take care not to break migration to qemu-5.1.  Notice that
+  we can't update qemu-5.1 to understand whatever qemu-5.2 decides to
+  change, so it is up to the qemu-5.2 side to make the relevant
+  changes.
+
+6 - qemu-5.1 -M pc-5.1  -> migrates to -> qemu-5.2 -M pc-5.1
+
+  This migration is known as older to newer.  We need to make sure
+  that we are able to receive migrations from qemu-5.1. The problem is
+  similar to the previous one.
+
+If qemu-5.1 and qemu-5.2 were the same, there would not be any
+compatibility problems.  But the reason that we create qemu-5.2 is to
+get new features, devices, defaults, etc.
+
+If we get a device that has a new feature, or change a default value,
+we have a problem when we try to migrate between different QEMU
+versions.
+
+So we need a way to tell qemu-5.2 that when we are using machine type
+pc-5.1, it needs to **not** use the feature, to be able to migrate to
+real qemu-5.1.
+
+And the equivalent part when migrating from qemu-5.1 to qemu-5.2.
+qemu-5.2 has to expect that it is not going to get data for the new
+feature, because qemu-5.1 doesn't know about it.
+
+How do we tell QEMU about these device feature changes?  In
+hw/core/machine.c:hw_compat_X_Y arrays.
+
+If we change a default value, we need to put back the old value on
+that
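
Illustration (not part of the patch): the hw_compat_X_Y mechanism that the text above refers to can be pictured as a per-machine-type table of property overrides. The sketch below is a simplified stand-in written for this note. The struct `CompatProp`, the device name `demo-device`, its property `new-feature`, and the helper `effective_value()` are all hypothetical and are not taken from hw/core/machine.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified compat entry: "for this driver, force this property". */
typedef struct {
    const char *driver;
    const char *property;
    const char *value;
} CompatProp;

/* Made-up override list applied when a pc-5.1 machine runs on a newer QEMU. */
static const CompatProp demo_compat_5_1[] = {
    { "demo-device", "new-feature", "off" },
};

/*
 * Return the value that driver.property should take for the given machine.
 * The older machine type gets the old value back; anything else keeps the
 * new default.
 */
static const char *effective_value(const char *machine, const char *driver,
                                   const char *property,
                                   const char *new_default)
{
    assert(machine && driver && property && new_default);
    if (strcmp(machine, "pc-5.1") == 0) {
        size_t n = sizeof(demo_compat_5_1) / sizeof(demo_compat_5_1[0]);
        for (size_t i = 0; i < n; i++) {
            if (strcmp(demo_compat_5_1[i].driver, driver) == 0 &&
                strcmp(demo_compat_5_1[i].property, property) == 0) {
                return demo_compat_5_1[i].value;
            }
        }
    }
    return new_default;
}
```

With this table, `effective_value("pc-5.1", "demo-device", "new-feature", "on")` yields "off", which corresponds to cases 3, 5 and 6 above: a newer QEMU running the older machine type must not use the new feature.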

Re: [PATCH v3 06/70] kvm: Introduce support for memory_attributes

2024-01-08 Thread Xiaoyao Li


On 12/21/2023 9:47 PM, Wang, Wei W wrote:

On Thursday, December 21, 2023 7:54 PM, Li, Xiaoyao wrote:

On 12/21/2023 6:36 PM, Wang, Wei W wrote:

No need to specifically check for KVM_MEMORY_ATTRIBUTE_PRIVATE there.
I'm suggesting below:

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index
2d9a2455de..63ba74b221 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1375,6 +1375,11 @@ static int kvm_set_memory_attributes(hwaddr

start, hwaddr size, uint64_t attr)

   struct kvm_memory_attributes attrs;
   int r;

+if ((attr & kvm_supported_memory_attributes) != attr) {
+error_report("KVM doesn't support memory attr %" PRIx64, attr);
+return -EINVAL;
+}


If a range of memory is set to shared while KVM doesn't support private
memory, the above check doesn't work and the following ioctl fails.


SHARED attribute uses the value 0, which indicates it's always supported, no?
As for the implementation, can you point to where on the KVM side the
ioctl would fail in that case?


I'm worried about a future case where KVM supports memory attributes
other than shared/private. For example, KVM might support the RWX bits
(bits 0 - 2) but not the shared/private bit.


This patch designs kvm_set_memory_attributes() to be common for all the 
bits (and for future bits), thus it leaves the support check to each 
caller function separately.


If you think it's unnecessary, I can change the name of 
kvm_set_memory_attributes() to kvm_set_memory_shared_private() to make 
it only for shared/private bit, then the check can be moved to it.
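
The mask check under discussion can be tried in isolation. The following is a minimal sketch, assuming the shared attribute is encoded as 0 (as stated in the thread); `DEMO_ATTR_PRIVATE`, `attrs_supported()` and `demo_attr_cases()` are hypothetical names, and the bit position is only an illustrative value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit value only; the real definition lives in the KVM uapi headers. */
#define DEMO_ATTR_PRIVATE (UINT64_C(1) << 3)

/* True when every bit requested in attr is present in the supported mask. */
static bool attrs_supported(uint64_t attr, uint64_t supported)
{
    return (attr & supported) == attr;
}

/* The cases raised in the thread. */
static void demo_attr_cases(void)
{
    /* Shared (0) passes even when nothing at all is supported. */
    assert(attrs_supported(0, 0));
    /* Private is rejected when the supported mask lacks the bit. */
    assert(!attrs_supported(DEMO_ATTR_PRIVATE, 0));
    /* Private is accepted once the bit is advertised. */
    assert(attrs_supported(DEMO_ATTR_PRIVATE, DEMO_ATTR_PRIVATE));
}
```

Because a request of 0 has no bits set, the check accepts it for any supported mask, including one that carries only hypothetical future bits such as RWX.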



static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
struct kvm_memory_attributes *attrs)
{
 gfn_t start, end;

 /* flags is currently not used. */
 if (attrs->flags)
 return -EINVAL;
 if (attrs->attributes & ~kvm_supported_mem_attributes(kvm)) ==> 0 here
 return -EINVAL;
 if (attrs->size == 0 || attrs->address + attrs->size < attrs->address)
 return -EINVAL;
 if (!PAGE_ALIGNED(attrs->address) || !PAGE_ALIGNED(attrs->size))
 return -EINVAL;

Re: [PATCH trivial] colo: examples: remove mentions of script= and (wrong) downscript=

2024-01-08 Thread Michael Tokarev

09.01.2024 05:08, Zhang, Chen :

-Original Message-
From: Michael Tokarev 
Sent: Sunday, January 7, 2024 7:25 PM
To: qemu-devel@nongnu.org
Cc: Michael Tokarev ; qemu-triv...@nongnu.org; Zhang,
Chen ; Li Zhijian 
Subject: [PATCH trivial] colo: examples: remove mentions of script= and
(wrong) downscript=

There's no need to repeat script=/etc/qemu-ifup in examples, as it is already
in there.  Moreover, all examples use the incorrect "down script=" (which should be
"downscript=").

Yes, good catch.
Reviewed-by: Zhang Chen 

---
I'm not sure we need so many identical examples, and why it uses vnet=off, -
it looks like vnet= should also be dropped.

Do you mean the "vnet_hdr_support" in docs?

Nope, it was a thinko on my part, I meant the vhost=off parameter, which is
right next to script=.
Why is vhost explicitly disabled here, when it isn't even enabled by default?

And do we really need that many examples like this, maybe it's a good idea to
remove half of them and refer to the other place instead?

/mjt

Re: [PATCH 1/2] target/sh4: Deprecate the shix machine

2024-01-08 Thread Yoshinori Sato

On Tue, 09 Jan 2024 02:15:21 +0900,
Samuel Tardieu wrote:
> 
> The shix machine has been designed and used at Télécom Paris from 2003
> to 2010. It had been added to QEMU in 2005 and has not been maintained
> since. Since nobody is using the physical board anymore nor interested
> in maintaining the QEMU port, it is time to deprecate it.
> 
> Signed-off-by: Samuel Tardieu 
> ---
>  docs/about/deprecated.rst | 5 +
>  hw/sh4/shix.c | 1 +
>  2 files changed, 6 insertions(+)
> 
> diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
> index 2e15040246..e6a12c9077 100644
> --- a/docs/about/deprecated.rst
> +++ b/docs/about/deprecated.rst
> @@ -269,6 +269,11 @@ Nios II ``10m50-ghrd`` and ``nios2-generic-nommu`` machines (since 8.2)
>  
>  The Nios II architecture is orphan.
>  
> +``shix`` (since 9.0)
> +
> +
> +The machine is no longer in existence and has been long unmaintained
> +in QEMU.
>  
>  Backend options
>  ---
> diff --git a/hw/sh4/shix.c b/hw/sh4/shix.c
> index aa812512f0..58530b8ede 100644
> --- a/hw/sh4/shix.c
> +++ b/hw/sh4/shix.c
> @@ -80,6 +80,7 @@ static void shix_machine_init(MachineClass *mc)
>  mc->init = shix_init;
>  mc->is_default = true;
>  mc->default_cpu_type = TYPE_SH7750R_CPU;
> +mc->deprecation_reason = "old and unmaintained - use a newer machine instead";
>  }
>  
>  DEFINE_MACHINE("shix", shix_machine_init)
> -- 
> 2.42.0
> 

I can't maintain this either.
Reviewed-by: Yoshinori Sato 

-- 
Yosinori Sato

Re: [PATCH v3 52/70] i386/tdx: handle TDG.VP.VMCALL

2024-01-08 Thread Xiaoyao Li


On 1/8/2024 10:44 PM, Daniel P. Berrangé wrote:

On Fri, Dec 29, 2023 at 10:30:15AM +0800, Xiaoyao Li wrote:

On 11/16/2023 1:58 AM, Daniel P. Berrangé wrote:

On Wed, Nov 15, 2023 at 02:15:01AM -0500, Xiaoyao Li wrote:

From: Isaku Yamahata 

For GetQuote, delegate a request to Quote Generation Service.
Add property "quote-generation-socket" to tdx-guest, which is a property
of type SocketAddress to specify Quote Generation Service(QGS).

On request, connect to the QGS, read request buffer from shared guest
memory, send the request buffer to the server and store the response
into shared guest memory and notify TD guest by interrupt.

command line example:
qemu-system-x86_64 \
  -object '{"qom-type":"tdx-guest","id":"tdx0","quote-generation-socket":{"type": "vsock", "cid":"2","port":"1234"}}' \
  -machine confidential-guest-support=tdx0

Signed-off-by: Isaku Yamahata 
Codeveloped-by: Chenyi Qiang 
Signed-off-by: Chenyi Qiang 
Signed-off-by: Xiaoyao Li 
---
Changes in v3:
- rename property "quote-generation-service" to "quote-generation-socket";
- change the type of "quote-generation-socket" from str to
SocketAddress;
- squash next patch into this one;
---
   qapi/qom.json |   5 +-
   target/i386/kvm/tdx.c | 430 ++
   target/i386/kvm/tdx.h |   6 +
   3 files changed, 440 insertions(+), 1 deletion(-)

+static void tdx_handle_get_quote_connected(QIOTask *task, gpointer opaque)
+{
+struct tdx_get_quote_task *t = opaque;
+Error *err = NULL;
+char *in_data = NULL;
+MachineState *ms;
+TdxGuest *tdx;
+
+t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_ERROR);
+if (qio_task_propagate_error(task, NULL)) {
+t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
+goto error;
+}
+
+in_data = g_malloc(le32_to_cpu(t->hdr.in_len));
+if (!in_data) {
+goto error;
+}
+
+if (address_space_read(&address_space_memory, t->gpa + sizeof(t->hdr),
+   MEMTXATTRS_UNSPECIFIED, in_data,
+   le32_to_cpu(t->hdr.in_len)) != MEMTX_OK) {
+goto error;
+}
+
+qio_channel_set_blocking(QIO_CHANNEL(t->ioc), false, NULL);


You've set the channel to non-blocking, but


+
+if (qio_channel_write_all(QIO_CHANNEL(t->ioc), in_data,
+  le32_to_cpu(t->hdr.in_len), &err) ||
+err) {


...this method will block execution of this thread, by either
sleeping in poll() or doing a coroutine yield.

I don't think this is in coroutine context, so presumably this
is just blocking.  So what was the point in marking the channel
non-blocking ?


Hi Daniel,

First of all, I'm not well versed in sockets or QIO channels. Please correct
me and teach me when I'm wrong.

I'm not the author of this patch. My understanding is that setting it to
non-blocking is meant to let qio_channel_write_all() proceed immediately?


The '_all' suffixed methods are implemented such that they will
sleep in poll(), or a coroutine yield when seeing EAGAIN.


If set non-blocking is not needed, I can remove it.


You are setting up a background watch to wait for the reply
so we don't block this thread, so you seem to want non-blocking
behaviour.


Both sending and receiving are in a new thread created by
qio_channel_socket_connect_async(). So I think both of them can be blocking
and don't need to be in another background thread.

what's your suggestion on it? Make both sending and receiving blocking or
non-blocking?


I think the code /should/ be non-blocking, which would mean
using   qio_channel_write, instead of qio_channel_write_all,
and using a .


I see. will implement in the next version.


With regards,
Daniel
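
The distinction drawn above, between a helper that loops until everything is written and a plain write that may return early, can be sketched outside QEMU with POSIX calls. This is only an analogy under stated assumptions: it uses `write()` and `fcntl()` rather than the QIOChannel API, and `SendStatus`, `set_nonblocking()` and `send_some()` are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

typedef enum { SEND_DONE, SEND_AGAIN, SEND_ERROR } SendStatus;

/* Put a file descriptor into non-blocking mode; returns 0 on success. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Write as much of buf as the descriptor accepts without blocking.
 * *off records progress, so the caller can resume later from a
 * writability callback instead of sleeping in this thread.
 */
static SendStatus send_some(int fd, const char *buf, size_t len, size_t *off)
{
    assert(buf && off && *off <= len);
    while (*off < len) {
        ssize_t n = write(fd, buf + *off, len - *off);
        if (n > 0) {
            *off += (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR) {
            continue;               /* interrupted, simply retry */
        }
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            return SEND_AGAIN;      /* would block: hand control back */
        }
        return SEND_ERROR;
    }
    return SEND_DONE;
}
```

A caller that receives `SEND_AGAIN` would register a writability watch on the descriptor and call `send_some()` again from that callback, which is the behaviour an `_all` style helper hides by waiting internally.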

hw: nvme: Separate 'serial' property for VFs

2024-01-08 Thread Minwoo Im

Currently, when a VF is created, it uses the 'params' object of the PF
as it is. In other words, the 'params.serial' string memory area is
also shared. In this situation, if the VF is removed from the system,
the PF's 'params.serial' object is released with object_finalize()
followed by object_property_del_all() which releases the memory for
'serial' property. If that happens, the next VF created will inherit
a serial from a corrupted memory area.

If this happens, an error will occur when comparing subsys->serial and
n->params.serial in the nvme_subsys_register_ctrl() function.

Cc: qemu-sta...@nongnu.org
Fixes: 44c2c09488db ("hw/nvme: Add support for SR-IOV")
Signed-off-by: Minwoo Im 
---
 hw/nvme/ctrl.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f026245d1e..a0ba3529cd 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -8309,9 +8309,15 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
 if (pci_is_vf(pci_dev)) {
 /*
  * VFs derive settings from the parent. PF's lifespan exceeds
- * that of VF's, so it's safe to share params.serial.
+ * that of VF's.
  */
 memcpy(&n->params, &pn->params, sizeof(NvmeParams));
+
+/*
+ * Give the VF its own copy of the serial string so that removing a VF
+ * does not release the memory of the PF's 'serial' property.
+ */
+n->params.serial = g_strdup(pn->params.serial);
 n->subsys = pn->subsys;
 }
 
-- 
2.34.1
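
The ownership problem described in the commit message above can be reproduced in isolation. The sketch below is a minimal stand-alone analogue, not the nvme code: `DemoParams`, `dup_string()` and `inherit_params()` are hypothetical names, and plain `malloc()` stands in for the GLib allocation used in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a parameter block that mixes by-value fields with an owned string. */
typedef struct {
    char *serial;            /* heap-allocated, freed by whoever owns the struct */
    unsigned max_ioqpairs;   /* example of a field that is safe to copy by value */
} DemoParams;

/* Portable string duplicate; returns NULL if allocation fails. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Copy the parent's settings, then give the child a private serial string. */
static void inherit_params(DemoParams *child, const DemoParams *parent)
{
    assert(child && parent && parent->serial);
    memcpy(child, parent, sizeof(*child));      /* child->serial aliases the parent's here */
    child->serial = dup_string(parent->serial); /* break the aliasing */
}
```

After `inherit_params()`, freeing the child's `serial` leaves the parent's string intact, which is the property the patch restores for PFs when a VF is removed.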

Re: [PATCH v6 1/2] qom: new object to associate device to numa node

2024-01-08 Thread Ankit Agrawal


>> > However, I'll leave it up to those more familiar with the QEMU numa
>> > control interface design to comment on whether this approach is preferable
>> > to making the gi part of the numa node entry or doing it like hmat.
>>
>> > -numa srat-gi,node-id=10,gi-pci-dev=dev1
>>
>> The current way of acpi-generic-initiator object usage came out of the 
>> discussion
>> on v1 to essentially link all the device NUMA nodes to the device.
>> (https://lore.kernel.org/all/20230926131427.1e441670.alex.william...@redhat.com/)
>>
>> Can Alex or David comment on which is preferable (the current mechanism vs 
>> 1:1
>> mapping per object as suggested by Jonathan)?
>
> I imagine there are ways that either could work, but specifying a
> gi-pci-dev in the numa node declaration appears to get a bit messy if we
> have multiple gi-pci-dev devices to associate to the node whereas
> creating an acpi-generic-initiator object per individual device:node
> relationship feels a bit easier to iterate.
>
> Also if we do extend the ACPI spec to more explicitly allow a device to
> associate to multiple nodes, we could re-instate the list behavior of
> the acpi-generic-initiator whereas I don't see a representation of the
> association at the numa object that makes sense.  Thanks,

Ack, making the change to create an individual acpi-generic-initiator object
per device:node.

Alex

Re: [PATCH v6 1/2] qom: new object to associate device to numa node

2024-01-08 Thread Ankit Agrawal


>> +##
>> +# @AcpiGenericInitiatorProperties:
>> +#
>> +# Properties for acpi-generic-initiator objects.
>> +#
>> +# @pci-dev: PCI device ID to be associated with the node
>> +#
>> +# @host-nodes: numa node list associated with the PCI device.
>
> NUMA
>
> Suggest "list of NUMA nodes associated with ..."

Ack, will make the change.

>> @@ -981,6 +997,7 @@
>>  'id': 'str' },
>>    'discriminator': 'qom-type',
>>    'data': {
>> +  'acpi-generic-initiator': 'AcpiGenericInitiatorProperties',
>>    'authz-list': 'AuthZListProperties',
>>    'authz-listfile': 'AuthZListFileProperties',
>>    'authz-pam':  'AuthZPAMProperties',
>
> I'm holding my Acked-by until the interface design issues raised by
> Jason have been resolved.

I suppose you meant Jonathan here?

Re: [PATCH v3 4/4] [NOT FOR MERGE] tests/qtest/migration: Adapt tests to use older QEMUs

2024-01-08 Thread Peter Xu

On Mon, Jan 08, 2024 at 12:37:46PM -0300, Fabiano Rosas wrote:
> Peter Xu  writes:
> 
> > On Fri, Jan 05, 2024 at 03:04:49PM -0300, Fabiano Rosas wrote:
> >> [This patch is not necessary anymore after 8.2 has been released]
> >> 
> >> Add the 'since' annotations to recently added tests and adapt the
> >> postcopy test to use the older "uri" API when needed.
> >> 
> >> Signed-off-by: Fabiano Rosas 
> >
> > You marked this as not-for-merge.  Would something like this still be
> > useful in the future?  IIUC it's a matter of whether we'd still want to
> > test those old binaries.
> >
> 
> Technically yes, but I fail to see what benefit testing old binaries
> would bring us. I'm thinking maybe it could be useful for bisecting
> compatibility issues, but I can't think of a scenario where we'd like to
> change the older QEMU instead of the newer.
> 
> I'm of course open to suggestions if you or anyone else has an use case
> that you'd like to keep viable.
> 
> So far, my idea is that once a new QEMU is released, all the "since:"
> annotations become obsolete. We could even remove them. This series is
> just infrastructure to make our life easier if a change is ever
> introduced that is incompatible with the n-1 QEMU. IMO we cannot have
> compatibility testing if a random change might break a test and make it
> more difficult to run the remaining tests. So we'd use 'since' or the
> vercmp function to skip/adapt the offending tests until the next QEMU is
> released.
> 
> I'm basing myself on this loosely worded support statement from our
> docs:
> 
>   "In general QEMU tries to maintain forward migration compatibility
>   (i.e. migrating from QEMU n->n+1) and there are users who benefit from
>   backward compatibility as well."

I think we could still have users migrating from e.g. 8.0 -> 9.0 as long as
with the same machine type, especially when upgrading upper level stack
(e.g. an openstack cluster upgrade), where IIUC can jump a few qemu major
versions.  That does sound like a common use case, and I suspect the doc
was only taking one example on why compatibility needs to be maintained,
rather than emphasizing "+1 only".

However then the question is whether those old binaries need to be
covered.

Then I noticed that taking all these "since: XXX" and cmdline changes along
with migration-test may be yet another burden even if we want to cover old
binaries for whatever reason.  I am now more convinced myself that we
should try to get rid of as much burden as we can for migration, because we
already have enough, and it's not ideal to keep growing that unnecessarily.

One good thing with CI in this case (I still don't have enough knowledge on
CI, so I am hoping some CI people can review that patch, though) is that if
we can always guarantee n-1 -> n works for the test cases we enabled, it
most probably means when n boosts again to n+1, we keep making sure n ->
n+1 works perfectly, then n-1 -> n+1 should not fail either, considering
that we're testing the stream protocol matching each other.  There might be
outliers (especially if not described with VMSDs) but should be corner
cases.

So I tend to agree with you on that we drop this patch, keep it simple
until we're much more clear what we can get from that.

But then if so - do we need "since" at all to be expressed in versions?

Basically we keep qtest always be valid only on the latest qemu binary as
before (which actually works the same as Linux v.s. kselftests, which makes
sense), there's one exception now with "n-1" due to the CI we plan to add.
Dropping this patch means we don't yet plan to support n-2.  Then maybe
instead of a "since" we only need a boolean showing "whether one test needs
to be covered by a cross-binary test"?  Then we set it in incompatible
binaries (skip all cross-binary tests directly, rather than relying on any
qemu versions, no compare needed), and can also drop that when a new
release starts.

Thanks,

-- 
Peter Xu

Re: [PATCH v2 4/4] hw/intc/loongarch_extioi: Add vmstate post_load support

2024-01-08 Thread gaosong


在 2023/12/15 下午6:03, Bibo Mao 写道:

There are elements sw_ipmap and sw_coremap, which are used to speed
up the irq injection flow. They are saved and restored in vmstate during
migration, although in fact they can be calculated from hw registers. Here
post_load is added to derive sw_ipmap and sw_coremap from the extioi hw
state.

Signed-off-by: Bibo Mao 
---
  hw/intc/loongarch_extioi.c | 120 +++--
  1 file changed, 76 insertions(+), 44 deletions(-)


Reviewed-by: Song Gao 

Thanks.
Song Gao

diff --git a/hw/intc/loongarch_extioi.c b/hw/intc/loongarch_extioi.c
index d9d5066c3f..e0fd57f962 100644
--- a/hw/intc/loongarch_extioi.c
+++ b/hw/intc/loongarch_extioi.c
@@ -130,12 +130,66 @@ static inline void extioi_enable_irq(LoongArchExtIOI *s, int index,\
  }
  }
  
+static inline void extioi_update_sw_coremap(LoongArchExtIOI *s, int irq,

+uint64_t val, bool notify)
+{
+int i, cpu;
+
+/*
+ * loongarch only support little endian,
+ * so we paresd the value with little endian.
+ */
+val = cpu_to_le64(val);
+
+for (i = 0; i < 4; i++) {
+cpu = val & 0xff;
+cpu = ctz32(cpu);
+cpu = (cpu >= 4) ? 0 : cpu;
+val = val >> 8;
+
+if (s->sw_coremap[irq + i] == cpu) {
+continue;
+}
+
+if (notify && test_bit(irq, (unsigned long *)s->isr)) {
+/*
+ * lower irq at old cpu and raise irq at new cpu
+ */
+extioi_update_irq(s, irq + i, 0);
+s->sw_coremap[irq + i] = cpu;
+extioi_update_irq(s, irq + i, 1);
+} else {
+s->sw_coremap[irq + i] = cpu;
+}
+}
+}
+
+static inline void extioi_update_sw_ipmap(LoongArchExtIOI *s, int index,
+  uint64_t val)
+{
+int i;
+uint8_t ipnum;
+
+/*
+ * loongarch only support little endian,
+ * so we paresd the value with little endian.
+ */
+val = cpu_to_le64(val);
+for (i = 0; i < 4; i++) {
+ipnum = val & 0xff;
+ipnum = ctz32(ipnum);
+ipnum = (ipnum >= 4) ? 0 : ipnum;
+s->sw_ipmap[index * 4 + i] = ipnum;
+val = val >> 8;
+}
+}
+
  static MemTxResult extioi_writew(void *opaque, hwaddr addr,
uint64_t val, unsigned size,
MemTxAttrs attrs)
  {
  LoongArchExtIOI *s = LOONGARCH_EXTIOI(opaque);
-int i, cpu, index, old_data, irq;
+int cpu, index, old_data, irq;
  uint32_t offset;
  
  trace_loongarch_extioi_writew(addr, val);

@@ -153,20 +207,7 @@ static MemTxResult extioi_writew(void *opaque, hwaddr addr,
   */
  index = (offset - EXTIOI_IPMAP_START) >> 2;
  s->ipmap[index] = val;
-/*
- * loongarch only support little endian,
- * so we paresd the value with little endian.
- */
-val = cpu_to_le64(val);
-for (i = 0; i < 4; i++) {
-uint8_t ipnum;
-ipnum = val & 0xff;
-ipnum = ctz32(ipnum);
-ipnum = (ipnum >= 4) ? 0 : ipnum;
-s->sw_ipmap[index * 4 + i] = ipnum;
-val = val >> 8;
-}
-
+extioi_update_sw_ipmap(s, index, val);
  break;
  case EXTIOI_ENABLE_START ... EXTIOI_ENABLE_END - 1:
  index = (offset - EXTIOI_ENABLE_START) >> 2;
@@ -205,33 +246,8 @@ static MemTxResult extioi_writew(void *opaque, hwaddr addr,
  irq = offset - EXTIOI_COREMAP_START;
  index = irq / 4;
  s->coremap[index] = val;
-/*
- * loongarch only support little endian,
- * so we paresd the value with little endian.
- */
-val = cpu_to_le64(val);
-
-for (i = 0; i < 4; i++) {
-cpu = val & 0xff;
-cpu = ctz32(cpu);
-cpu = (cpu >= 4) ? 0 : cpu;
-val = val >> 8;
-
-if (s->sw_coremap[irq + i] == cpu) {
-continue;
-}
-
-if (test_bit(irq, (unsigned long *)s->isr)) {
-/*
- * lower irq at old cpu and raise irq at new cpu
- */
-extioi_update_irq(s, irq + i, 0);
-s->sw_coremap[irq + i] = cpu;
-extioi_update_irq(s, irq + i, 1);
-} else {
-s->sw_coremap[irq + i] = cpu;
-}
-}
+
+extioi_update_sw_coremap(s, irq, val, true);
  break;
  default:
  break;
@@ -288,6 +304,23 @@ static void loongarch_extioi_finalize(Object *obj)
  g_free(s->cpu);
  }
  
+static int vmstate_extioi_post_load(void *opaque, int version_id)

+{
+LoongArchExtIOI *s = LOONGARCH_EXTIOI(opaque);
+int i, start_irq;
+
+for (i = 0; i < (EXTIOI_IRQS / 4); i++) {
+start_irq = i * 4;
+extioi_update_sw_coremap(s, start_irq, s->coremap[i], false);
+}
+
+for (i = 0; i < (EXTIOI_IRQS_IPMAP_SIZE / 4); i++) {
+

Re: [PATCH v2 3/4] hw/intc/loongarch_extioi: Add dynamic cpu number support

2024-01-08 Thread gaosong


在 2023/12/15 下午6:03, Bibo Mao 写道:

On a LoongArch physical machine, one extioi interrupt controller only
supports 4 cpus. On processors with more than 4 cpus, there are multiple
extioi interrupt controllers; if interrupts need to be routed to
other cpus, they are forwarded from extioi node0 to other extioi nodes.

On the virt machine model, there is a simple extioi interrupt device model.
All cpus can access the registers of the extioi interrupt controller; however,
interrupts can only be routed to 4 vcpus, for compatibility with old kernels.

This patch adds dynamic cpu number support for the extioi interrupt controller.
With an old kernel the legacy extioi model is used; however, the kernel can
detect and choose the new routing method in future, so that interrupts can be
routed to all vcpus.

Signed-off-by: Bibo Mao 
---
  hw/intc/loongarch_extioi.c | 107 +++--
  hw/loongarch/virt.c|   3 +-
  include/hw/intc/loongarch_extioi.h |  11 ++-
  3 files changed, 81 insertions(+), 40 deletions(-)


Reviewed-by: Song Gao 

Thanks. Song Gao

diff --git a/hw/intc/loongarch_extioi.c b/hw/intc/loongarch_extioi.c
index 77b4776958..d9d5066c3f 100644
--- a/hw/intc/loongarch_extioi.c
+++ b/hw/intc/loongarch_extioi.c
@@ -8,6 +8,7 @@
  #include "qemu/osdep.h"
  #include "qemu/module.h"
  #include "qemu/log.h"
+#include "qapi/error.h"
  #include "hw/irq.h"
  #include "hw/sysbus.h"
  #include "hw/loongarch/virt.h"
@@ -32,23 +33,23 @@ static void extioi_update_irq(LoongArchExtIOI *s, int irq, int level)
  if (((s->enable[irq_index]) & irq_mask) == 0) {
  return;
  }
-s->coreisr[cpu][irq_index] |= irq_mask;
-found = find_first_bit(s->sw_isr[cpu][ipnum], EXTIOI_IRQS);
-set_bit(irq, s->sw_isr[cpu][ipnum]);
+s->cpu[cpu].coreisr[irq_index] |= irq_mask;
+found = find_first_bit(s->cpu[cpu].sw_isr[ipnum], EXTIOI_IRQS);
+set_bit(irq, s->cpu[cpu].sw_isr[ipnum]);
  if (found < EXTIOI_IRQS) {
  /* other irq is handling, need not update parent irq level */
  return;
  }
  } else {
-s->coreisr[cpu][irq_index] &= ~irq_mask;
-clear_bit(irq, s->sw_isr[cpu][ipnum]);
-found = find_first_bit(s->sw_isr[cpu][ipnum], EXTIOI_IRQS);
+s->cpu[cpu].coreisr[irq_index] &= ~irq_mask;
+clear_bit(irq, s->cpu[cpu].sw_isr[ipnum]);
+found = find_first_bit(s->cpu[cpu].sw_isr[ipnum], EXTIOI_IRQS);
  if (found < EXTIOI_IRQS) {
  /* other irq is handling, need not update parent irq level */
  return;
  }
  }
-qemu_set_irq(s->parent_irq[cpu][ipnum], level);
+qemu_set_irq(s->cpu[cpu].parent_irq[ipnum], level);
  }
  
  static void extioi_setirq(void *opaque, int irq, int level)

@@ -96,7 +97,7 @@ static MemTxResult extioi_readw(void *opaque, hwaddr addr, uint64_t *data,
  index = (offset - EXTIOI_COREISR_START) >> 2;
  /* using attrs to get current cpu index */
  cpu = attrs.requester_id;
-*data = s->coreisr[cpu][index];
+*data = s->cpu[cpu].coreisr[index];
  break;
  case EXTIOI_COREMAP_START ... EXTIOI_COREMAP_END - 1:
  index = (offset - EXTIOI_COREMAP_START) >> 2;
@@ -189,8 +190,8 @@ static MemTxResult extioi_writew(void *opaque, hwaddr addr,
  index = (offset - EXTIOI_COREISR_START) >> 2;
  /* using attrs to get current cpu index */
  cpu = attrs.requester_id;
-old_data = s->coreisr[cpu][index];
-s->coreisr[cpu][index] = old_data & ~val;
+old_data = s->cpu[cpu].coreisr[index];
+s->cpu[cpu].coreisr[index] = old_data & ~val;
  /* write 1 to clear interrupt */
  old_data &= val;
  irq = ctz32(old_data);
@@ -248,14 +249,61 @@ static const MemoryRegionOps extioi_ops = {
  .endianness = DEVICE_LITTLE_ENDIAN,
  };
  
-static const VMStateDescription vmstate_loongarch_extioi = {

-.name = TYPE_LOONGARCH_EXTIOI,
+static void loongarch_extioi_realize(DeviceState *dev, Error **errp)
+{
+LoongArchExtIOI *s = LOONGARCH_EXTIOI(dev);
+SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+int i, pin;
+
+if (s->num_cpu == 0) {
+error_setg(errp, "num-cpu must be at least 1");
+return;
+}
+
+for (i = 0; i < EXTIOI_IRQS; i++) {
+sysbus_init_irq(sbd, &s->irq[i]);
+}
+
+qdev_init_gpio_in(dev, extioi_setirq, EXTIOI_IRQS);
+memory_region_init_io(&s->extioi_system_mem, OBJECT(s), &extioi_ops,
+  s, "extioi_system_mem", 0x900);
+sysbus_init_mmio(sbd, &s->extioi_system_mem);
+s->cpu = g_new0(ExtIOICore, s->num_cpu);
+if (s->cpu == NULL) {
+error_setg(errp, "Memory allocation for ExtIOICore faile");
+return;
+}
+
+for (i = 0; i < s->num_cpu; i++) {
+for (pin = 0; pin < LS3A_INTC_IP; pin++) {
+qdev_init_gpio_out(dev, &s->cpu[i].parent_irq[pin], 1);
+}
+}
+}
+
+static void loongarch_extioi_finalize(Object *obj)
+{
+

Re: [PATCH v3 2/4] tests/qtest/migration: Add infrastructure to skip tests on older QEMUs

2024-01-08 Thread Peter Xu

On Mon, Jan 08, 2024 at 11:49:45AM -0300, Fabiano Rosas wrote:
> >> +
> >> +if (major > tgt_major) {
> >> +return -1;
> >
> > This means the QEMU version is newer, the function will return negative.
> > Is this what we want?  It seems it's inverted.
> 
> The return "points" to which one is the more recent:
> 
> QEMU version | since: version
> -1   0 1

If it returns -1 here, will [1] below then skip the test?

> 
> > In all cases, document this function with retval would be helpful too.
> >
> 
> Ok.
> 
> >> +}
> >> +if (major < tgt_major) {
> >> +return 1;
> >> +}
> >
> > Instead of all these, I'm wondering whether we can allow "since" to be an
> > array of integers, like [8, 2, 0].  Would that be much easier?
> 
> I don't see why push the complexity towards the person writing the
> tests. The string is much more natural to specify.

To me, QEMU_VER(8,2,0) is just as easy to write and read.  What Dan proposed
in the other thread also looks good.

I don't really have a strong opinion here, especially for a test case. But
IMHO it would still be nice to avoid string <-> int conversion if the string
is not required.
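
A minimal sketch of the integer-based alternative, for illustration only. The
QEMU_VER() packing, vercmp() and should_skip() below are hypothetical names,
not code from the patch. The sketch assumes each version component fits in 8
bits and uses strcmp()-like semantics, so a negative result means the running
QEMU is older than the required version:

```c
#include <stdint.h>

/* Hypothetical helper: pack major.minor.micro into one comparable integer.
 * Assumes minor and micro each fit in 8 bits. */
#define QEMU_VER(major, minor, micro) \
    (((uint32_t)(major) << 16) | ((uint32_t)(minor) << 8) | (uint32_t)(micro))

/*
 * Hypothetical comparison with strcmp()-like semantics:
 *   < 0   running version is older than 'since'
 *   == 0  versions are equal
 *   > 0   running version is newer than 'since'
 */
static int vercmp(uint32_t running, uint32_t since)
{
    if (running < since) {
        return -1;
    }
    return running > since;
}

/* Skip only when a minimum version is set and the running QEMU is older. */
static int should_skip(uint32_t running, uint32_t since)
{
    return since != 0 && vercmp(running, since) < 0;
}
```

With this convention, a check of the form "vercmp(...) < 0" at [1] skips the
test only on QEMU binaries older than the requested version.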

[...]

> >> @@ -850,6 +856,17 @@ static int test_migrate_start(QTestState **from, 
> >> QTestState **to,
> >>  qtest_qmp_set_event_callback(*from,
> >>   migrate_watch_for_stop,
> >>   &got_src_stop);
> >> +
> >> +if (args->since && migration_vercmp(*from, args->since) < 0) {

[1]

> >> +g_autofree char *msg = NULL;
> >> +
> >> +msg = g_strdup_printf("Test requires at least QEMU version 
> >> %s",
> >> +  args->since);
> >> +g_test_skip(msg);
> >> +qtest_quit(*from);
> >> +
> >> +return -1;
> >> +}

-- 
Peter Xu

Re: [PATCH v2 2/4] hw/loongarch/virt: Set iocsr address space per-board rather than percpu

2024-01-08 Thread gaosong


On 2023/12/15 at 6:03 PM, Bibo Mao wrote:

LoongArch systems have an iocsr address space. Most iocsr registers are
per-board, but some iocsr register spaces are banked per cpu, such
as the ipi mailbox and extioi interrupt status. For banked iocsr space,
each cpu sees the same iocsr layout, but separate data.

This patch makes the iocsr address space per-board rather than percpu.
For iocsr registers specific to a cpu, MemTxAttrs.requester_id
identifies the cpu. With this patch, the set of address spaces
on the board becomes simple: one iocsr address space plus system memory,
rather than one iocsr address space per cpu plus system memory.

Signed-off-by: Bibo Mao 
---
  hw/intc/loongarch_extioi.c |  3 -
  hw/intc/loongarch_ipi.c| 61 +++-
  hw/loongarch/virt.c| 91 ++
  include/hw/intc/loongarch_extioi.h |  1 -
  include/hw/intc/loongarch_ipi.h|  3 +-
  include/hw/loongarch/virt.h|  3 +
  target/loongarch/cpu.c | 48 
  target/loongarch/cpu.h |  4 +-
  target/loongarch/iocsr_helper.c| 16 +++---
  9 files changed, 127 insertions(+), 103 deletions(-)

Reviewed-by: Song Gao 

Thanks.
Song Gao
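
The banking scheme described in the commit message above can be sketched as
follows. This is a self-contained illustration, not the QEMU implementation:
the Attrs and BoardIocsr types, the register counts and the helper names are
all assumptions made for the example. The only idea taken from the patch is
that a single per-board space picks per-cpu data from the requester id:

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_CPUS     4   /* hypothetical board size */
#define SHARED_REGS  8   /* hypothetical count of per-board registers */
#define BANKED_REGS  8   /* hypothetical count of banked registers */

/* Stand-in for MemTxAttrs: only the field the commit message mentions. */
typedef struct {
    unsigned requester_id;
} Attrs;

/* One address space for the whole board. */
typedef struct {
    uint32_t shared[SHARED_REGS];           /* same data for every cpu */
    uint32_t banked[NUM_CPUS][BANKED_REGS]; /* same offsets, per-cpu data */
} BoardIocsr;

/* Resolve a register index to its storage; NULL if out of range. */
static uint32_t *iocsr_slot(BoardIocsr *b, unsigned reg, Attrs attrs)
{
    if (reg < SHARED_REGS) {
        return &b->shared[reg];
    }
    reg -= SHARED_REGS;
    if (reg < BANKED_REGS && attrs.requester_id < NUM_CPUS) {
        return &b->banked[attrs.requester_id][reg];
    }
    return NULL;
}

static int iocsr_write(BoardIocsr *b, unsigned reg, uint32_t val, Attrs attrs)
{
    uint32_t *slot = iocsr_slot(b, reg, attrs);

    if (slot == NULL) {
        return -1;
    }
    *slot = val;
    return 0;
}

static int iocsr_read(BoardIocsr *b, unsigned reg, uint32_t *val, Attrs attrs)
{
    uint32_t *slot = iocsr_slot(b, reg, attrs);

    if (slot == NULL) {
        return -1;
    }
    *val = *slot;
    return 0;
}
```

A write to a banked register by cpu 1 is invisible to cpu 2 at the same
offset, while a write to a shared register is visible to both.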

diff --git a/hw/intc/loongarch_extioi.c b/hw/intc/loongarch_extioi.c
index 24fb3af8cc..77b4776958 100644
--- a/hw/intc/loongarch_extioi.c
+++ b/hw/intc/loongarch_extioi.c
@@ -282,9 +282,6 @@ static void loongarch_extioi_instance_init(Object *obj)
  qdev_init_gpio_in(DEVICE(obj), extioi_setirq, EXTIOI_IRQS);
  
  for (cpu = 0; cpu < EXTIOI_CPUS; cpu++) {

-memory_region_init_io(&s->extioi_iocsr_mem[cpu], OBJECT(s), &extioi_ops,
-  s, "extioi_iocsr", 0x900);
-sysbus_init_mmio(dev, &s->extioi_iocsr_mem[cpu]);
  for (pin = 0; pin < LS3A_INTC_IP; pin++) {
  qdev_init_gpio_out(DEVICE(obj), &s->parent_irq[cpu][pin], 1);
  }
diff --git a/hw/intc/loongarch_ipi.c b/hw/intc/loongarch_ipi.c
index 1d3449e77d..bca01c88f6 100644
--- a/hw/intc/loongarch_ipi.c
+++ b/hw/intc/loongarch_ipi.c
@@ -9,6 +9,7 @@
  #include "hw/sysbus.h"
  #include "hw/intc/loongarch_ipi.h"
  #include "hw/irq.h"
+#include "hw/qdev-properties.h"
  #include "qapi/error.h"
  #include "qemu/log.h"
  #include "exec/address-spaces.h"
@@ -26,7 +27,7 @@ static MemTxResult loongarch_ipi_readl(void *opaque, hwaddr 
addr,
  uint64_t ret = 0;
  int index = 0;
  
-s = &ipi->ipi_core;
+s = &ipi->cpu[attrs.requester_id];
  addr &= 0xff;
  switch (addr) {
  case CORE_STATUS_OFF:
@@ -65,7 +66,7 @@ static void send_ipi_data(CPULoongArchState *env, uint64_t 
val, hwaddr addr,
   * if the mask is 0, we need not to do anything.
   */
  if ((val >> 27) & 0xf) {
-data = address_space_ldl(&env->address_space_iocsr, addr,
+data = address_space_ldl(env->address_space_iocsr, addr,
   attrs, NULL);
  for (i = 0; i < 4; i++) {
  /* get mask for byte writing */
@@ -77,7 +78,7 @@ static void send_ipi_data(CPULoongArchState *env, uint64_t 
val, hwaddr addr,
  
  data &= mask;

  data |= (val >> 32) & ~mask;
-address_space_stl(&env->address_space_iocsr, addr,
+address_space_stl(env->address_space_iocsr, addr,
data, attrs, NULL);
  }
  
@@ -172,7 +173,7 @@ static MemTxResult loongarch_ipi_writel(void *opaque, hwaddr addr, uint64_t val,

  uint8_t vector;
  CPUState *cs;
  
-s = &ipi->ipi_core;
+s = &ipi->cpu[attrs.requester_id];
  addr &= 0xff;
  trace_loongarch_ipi_write(size, (uint64_t)addr, val);
  switch (addr) {
@@ -214,7 +215,6 @@ static MemTxResult loongarch_ipi_writel(void *opaque, 
hwaddr addr, uint64_t val,
  
  /* override requester_id */

  attrs.requester_id = cs->cpu_index;
-ipi = LOONGARCH_IPI(LOONGARCH_CPU(cs)->env.ipistate);
  loongarch_ipi_writel(ipi, CORE_SET_OFF, BIT(vector), 4, attrs);
  break;
  default:
@@ -265,12 +265,18 @@ static const MemoryRegionOps loongarch_ipi64_ops = {
  .endianness = DEVICE_LITTLE_ENDIAN,
  };
  
-static void loongarch_ipi_init(Object *obj)

+static void loongarch_ipi_realize(DeviceState *dev, Error **errp)
  {
-LoongArchIPI *s = LOONGARCH_IPI(obj);
-SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
+LoongArchIPI *s = LOONGARCH_IPI(dev);
+SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+int i;
+
+if (s->num_cpu == 0) {
+error_setg(errp, "num-cpu must be at least 1");
+return;
+}
  
-memory_region_init_io(&s->ipi_iocsr_mem, obj, &loongarch_ipi_ops,
+memory_region_init_io(&s->ipi_iocsr_mem, OBJECT(dev), &loongarch_ipi_ops,
s, "loongarch_ipi_iocsr", 0x48);
  
  /* loongarch_ipi_iocsr performs re-entrant IO through ipi_send */

@@ -278,10 +284,20 @@ static void loongarch_ipi_init(Object *obj)
  
  sysbus_init_mmio(sbd, &s->ipi_iocsr_mem);
  
-memory_region_init_io(&s->ipi64_iocsr_mem, obj, &loongarch_ipi64_ops,

+

RE: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method

2024-01-08 Thread Liu, Yuan1

> -Original Message-
> From: Fabiano Rosas 
> Sent: Tuesday, January 9, 2024 4:28 AM
> To: Liu, Yuan1 ; Hao Xiang 
> Cc: Bryan Zhang ; qemu-devel@nongnu.org;
> marcandre.lur...@redhat.com; pet...@redhat.com; quint...@redhat.com;
> peter.mayd...@linaro.org; berra...@redhat.com
> Subject: RE: [External] Re: [PATCH 3/5] migration: Introduce unimplemented
> 'qatzip' compression method
> 
> "Liu, Yuan1"  writes:
> 
> >> -Original Message-
> >> From: Hao Xiang 
> >> Sent: Saturday, January 6, 2024 7:53 AM
> >> To: Fabiano Rosas 
> >> Cc: Bryan Zhang ; qemu-devel@nongnu.org;
> >> marcandre.lur...@redhat.com; pet...@redhat.com; quint...@redhat.com;
> >> peter.mayd...@linaro.org; Liu, Yuan1 ;
> >> berra...@redhat.com
> >> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce
> >> unimplemented 'qatzip' compression method
> >>
> >> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas  wrote:
> >> >
> >> > Bryan Zhang  writes:
> >> >
> >> > +cc Yuan Liu, Daniel Berrangé
> >> >
> >> > > Adds support for 'qatzip' as an option for the multifd
> >> > > compression method parameter, but copy-pastes the no-op logic to
> >> > > leave the actual methods effectively unimplemented. This is in
> >> > > preparation of a subsequent commit that will implement actually
> >> > > using QAT for compression and decompression.
> >> > >
> >> > > Signed-off-by: Bryan Zhang 
> >> > > Signed-off-by: Hao Xiang 
> >> > > ---
> >> > >  hw/core/qdev-properties-system.c |  6 ++-
> >> > >  migration/meson.build|  1 +
> >> > >  migration/multifd-qatzip.c   | 81
> >> 
> >> > >  migration/multifd.h  |  1 +
> >> > >  qapi/migration.json  |  5 +-
> >> > >  5 files changed, 92 insertions(+), 2 deletions(-)  create mode
> >> > > 100644 migration/multifd-qatzip.c
> >> > >
> >> > > diff --git a/hw/core/qdev-properties-system.c
> >> > > b/hw/core/qdev-properties-system.c
> >> > > index 1a396521d5..d8e48dcb0e 100644
> >> > > --- a/hw/core/qdev-properties-system.c
> >> > > +++ b/hw/core/qdev-properties-system.c
> >> > > @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type
> >> > > = { const PropertyInfo qdev_prop_multifd_compression = {
> >> > >  .name = "MultiFDCompression",
> >> > >  .description = "multifd_compression values, "
> >> > > -   "none/zlib/zstd",
> >> > > +   "none/zlib/zstd"
> >> > > +#ifdef CONFIG_QATZIP
> >> > > +   "/qatzip"
> >> > > +#endif
> >> > > +   ,
> >> > >  .enum_table = &MultiFDCompression_lookup,
> >> > >  .get = qdev_propinfo_get_enum,
> >> > >  .set = qdev_propinfo_set_enum, diff --git
> >> > > a/migration/meson.build b/migration/meson.build index
> >> > > 92b1cc4297..e20f318379 100644
> >> > > --- a/migration/meson.build
> >> > > +++ b/migration/meson.build
> >> > > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed()
> >> > >system_ss.add(files('block.c'))  endif
> >> > >  system_ss.add(when: zstd, if_true: files('multifd-zstd.c'))
> >> > > +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c'))
> >> > >
> >> > >  specific_ss.add(when: 'CONFIG_SYSTEM_ONLY',
> >> > >  if_true: files('ram.c', diff --git
> >> > > a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c new file
> >> > > mode 100644 index 00..1733bbddb7
> >> > > --- /dev/null
> >> > > +++ b/migration/multifd-qatzip.c
> >> > > @@ -0,0 +1,81 @@
> >> > > +/*
> >> > > + * Multifd QATzip compression implementation
> >> > > + *
> >> > > + * Copyright (c) Bytedance
> >> > > + *
> >> > > + * Authors:
> >> > > + *  Bryan Zhang 
> >> > > + *  Hao Xiang   
> >> > > + *
> >> > > + * This work is licensed under the terms of the GNU GPL, version 2
> or
> >> later.
> >> > > + * See the COPYING file in the top-level directory.
> >> > > + */
> >> > > +
> >> > > +#include "qemu/osdep.h"
> >> > > +#include "exec/ramblock.h"
> >> > > +#include "exec/target_page.h"
> >> > > +#include "qapi/error.h"
> >> > > +#include "migration.h"
> >> > > +#include "options.h"
> >> > > +#include "multifd.h"
> >> > > +
> >> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp) {
> >> > > +return 0;
> >> > > +}
> >> > > +
> >> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error
> **errp)
> >> > > +{};
> >> > > +
> >> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp)
> >> > > +{
> >> > > +MultiFDPages_t *pages = p->pages;
> >> > > +
> >> > > +for (int i = 0; i < p->normal_num; i++) {
> >> > > +p->iov[p->iovs_num].iov_base = pages->block->host + p-
> >> >normal[i];
> >> > > +p->iov[p->iovs_num].iov_len = p->page_size;
> >> > > +p->iovs_num++;
> >> > > +}
> >> > > +
> >> > > +p->next_packet_size = p->normal_num * p->page_size;
> >> > > +p->flags |= MULTIFD_FLAG_NOCOMP;
> >> > > +return 0;
> >> > > +}
> >> > > +
> >> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp) {
>

Re: [PATCH 3/3] tests/qtest: Re-enable multifd cancel test

2024-01-08 Thread Peter Xu

On Mon, Jan 08, 2024 at 11:26:04AM -0300, Fabiano Rosas wrote:
> Peter Xu  writes:
> 
> > On Wed, Jun 07, 2023 at 10:27:15AM +0200, Juan Quintela wrote:
> >> Fabiano Rosas  wrote:
> >> > We've found the source of flakiness in this test, so re-enable it.
> >> >
> >> > Signed-off-by: Fabiano Rosas 
> >> > ---
> >> >  tests/qtest/migration-test.c | 10 ++
> >> >  1 file changed, 2 insertions(+), 8 deletions(-)
> >> >
> >> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> >> > index b0c355bbd9..800ad23b75 100644
> >> > --- a/tests/qtest/migration-test.c
> >> > +++ b/tests/qtest/migration-test.c
> >> > @@ -2778,14 +2778,8 @@ int main(int argc, char **argv)
> >> >  }
> >> >  qtest_add_func("/migration/multifd/tcp/plain/none",
> >> > test_multifd_tcp_none);
> >> > -/*
> >> > - * This test is flaky and sometimes fails in CI and otherwise:
> >> > - * don't run unless user opts in via environment variable.
> >> > - */
> >> > -if (getenv("QEMU_TEST_FLAKY_TESTS")) {
> >> > -qtest_add_func("/migration/multifd/tcp/plain/cancel",
> >> > -   test_multifd_tcp_cancel);
> >> > -}
> >> > +qtest_add_func("/migration/multifd/tcp/plain/cancel",
> >> > +   test_multifd_tcp_cancel);
> >> >  qtest_add_func("/migration/multifd/tcp/plain/zlib",
> >> > test_multifd_tcp_zlib);
> >> >  #ifdef CONFIG_ZSTD
> >> 
> >> Reviewed-by: Juan Quintela 
> >> 
> >> 
> >> There was another failure with migration test that I will post during
> >> the rest of the day.  It needs both to get it right.
> >
> > This one didn't yet land upstream.  I'm not sure, but maybe Juan was saying
> > about this change:
> >
> > commit d2026ee117147893f8d80f060cede6d872ecbd7f
> > Author: Juan Quintela 
> > Date:   Wed Apr 26 12:20:36 2023 +0200
> >
> > multifd: Fix the number of channels ready
> 
> That's not it. It was something in the test itself around the fact that
> we use two sets of: from/to. There was supposed to be a situation where
> we'd start 'to2' while 'to' was still running and that would cause
> issues (possibly with sockets).
> 
> I think what might have happened is that someone merged a fix through
> another tree and Juan didn't notice. I think this is the one:
> 
>   commit f2d063e61ee2026700ab44bef967f663e976bec8
>   Author: Xuzhou Cheng 
>   Date:   Fri Oct 28 12:57:32 2022 +0800
>   
>   tests/qtest: migration-test: Make sure QEMU process "to" exited after 
> migration is canceled
>   
>   Make sure QEMU process "to" exited before launching another target
>   for migration in the test_multifd_tcp_cancel case.
>   
>   Signed-off-by: Xuzhou Cheng 
>   Signed-off-by: Bin Meng 
>   Reviewed-by: Marc-André Lureau 
>   Message-Id: <20221028045736.679903-8-bin.m...@windriver.com>
>   Signed-off-by: Thomas Huth 

Hmm, I see.

> 
> > Fabiano, did you try to reproduce multifd-cancel with current master?  I'm
> > wondering whether this test has already been completely fixed, then maybe
> > we can pick up this patch now.
> 
> Yes, let's merge it. I have kept it enabled during testing of all of the
> recent race conditions we've debugged and haven't seen it fail. Current
> master also looks fine.

It needed a trivial touchup, but I've queued it.

Thanks,

-- 
Peter Xu

RE: [PATCH trivial] colo: examples: remove mentions of script= and (wrong) downscript=

2024-01-08 Thread Zhang, Chen




> -Original Message-
> From: Michael Tokarev 
> Sent: Sunday, January 7, 2024 7:25 PM
> To: qemu-devel@nongnu.org
> Cc: Michael Tokarev ; qemu-triv...@nongnu.org; Zhang,
> Chen ; Li Zhijian 
> Subject: [PATCH trivial] colo: examples: remove mentions of script= and
> (wrong) downscript=
> 
> There's no need to repeat script=/etc/qemu-ifup in examples, as it is already
> in there.  Moreover, all examples use the incorrect "down script=" (which
> should be "downscript=").

Yes, good catch.
Reviewed-by: Zhang Chen 

> ---
> I'm not sure we need so many identical examples, and why it uses vnet=off, -
> it looks like vnet= should also be dropped.

Do you mean "vnet_hdr_support" in the docs?
If so, we can't drop it, because the filters use this tag to communicate with
an independent vnet_header.
When a filter with the vnet_hdr_support tag (like filter-mirror) connects to
another filter without the tag (like filter-redirector),
they cannot correctly parse the data sent to each other.

Thanks
Chen

> 
>  docs/colo-proxy.txt | 6 +++---
>  qemu-options.hx | 8 
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/docs/colo-proxy.txt b/docs/colo-proxy.txt index
> 1fc38aed1b..e712c883db 100644
> --- a/docs/colo-proxy.txt
> +++ b/docs/colo-proxy.txt
> @@ -162,7 +162,7 @@ Here is an example using demonstration IP and port
> addresses to more  clearly describe the usage.
> 
>  Primary(ip:3.3.3.3):
> --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-
> ifdown
> +-netdev tap,id=hn0,vhost=off
>  -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>  -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off
>  -chardev socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off
> @@ -177,7 +177,7 @@ Primary(ip:3.3.3.3):
>  -object colo-compare,id=comp0,primary_in=compare0-
> 0,secondary_in=compare1,outdev=compare_out0,iothread=iothread1
> 
>  Secondary(ip:3.3.3.8):
> --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down script=/etc/qemu-
> ifdown
> +-netdev tap,id=hn0,vhost=off
>  -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>  -chardev socket,id=red0,host=3.3.3.3,port=9003
>  -chardev socket,id=red1,host=3.3.3.3,port=9004
> @@ -202,7 +202,7 @@ Primary(ip:3.3.3.3):
>  -object colo-compare,id=comp0,primary_in=compare0-
> 0,secondary_in=compare1,outdev=compare_out0,vnet_hdr_support
> 
>  Secondary(ip:3.3.3.8):
> --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down script=/etc/qemu-
> ifdown
> +-netdev tap,id=hn0,vhost=off
>  -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>  -chardev socket,id=red0,host=3.3.3.3,port=9003
>  -chardev socket,id=red1,host=3.3.3.3,port=9004
> diff --git a/qemu-options.hx b/qemu-options.hx index
> b66570ae00..d667bfa0c2 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -5500,7 +5500,7 @@ SRST
>  KVM COLO
> 
>  primary:
> --netdev tap,id=hn0,vhost=off,script=/etc/qemu-
> ifup,downscript=/etc/qemu-ifdown
> +-netdev tap,id=hn0,vhost=off
>  -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>  -chardev 
> socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off
>  -chardev
> socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off
> @@ -5515,7 +5515,7 @@ SRST
>  -object colo-compare,id=comp0,primary_in=compare0-
> 0,secondary_in=compare1,outdev=compare_out0,iothread=iothread1
> 
>  secondary:
> --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down
> script=/etc/qemu-ifdown
> +-netdev tap,id=hn0,vhost=off
>  -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>  -chardev socket,id=red0,host=3.3.3.3,port=9003
>  -chardev socket,id=red1,host=3.3.3.3,port=9004
> @@ -5526,7 +5526,7 @@ SRST
>  Xen COLO
> 
>  primary:
> --netdev tap,id=hn0,vhost=off,script=/etc/qemu-
> ifup,downscript=/etc/qemu-ifdown
> +-netdev tap,id=hn0,vhost=off
>  -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>  -chardev 
> socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off
>  -chardev
> socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off
> @@ -5542,7 +5542,7 @@ SRST
>  -object colo-compare,id=comp0,primary_in=compare0-
> 0,secondary_in=compare1,outdev=compare_out0,notify_dev=nofity_way,ioth
> read=iothread1
> 
>  secondary:
> --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down
> script=/etc/qemu-ifdown
> +-netdev tap,id=hn0,vhost=off
>  -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>  -chardev socket,id=red0,host=3.3.3.3,port=9003
>  -chardev socket,id=red1,host=3.3.3.3,port=9004
> --
> 2.39.2

Re: [PATCH v2 1/4] hw/intc/loongarch_ipi: Use MemTxAttrs interface for ipi ops

2024-01-08 Thread gaosong


On 2023/12/15 at 6:03 PM, Bibo Mao wrote:

There are two interface pairs for MemoryRegionOps: read/write and
read_with_attrs/write_with_attrs. The latter is better for ipi device
emulation since the initiating cpu can be obtained from attrs.requester_id.

The requester_id can also be overridden in the IOCSR_IPI_SEND and mail_send
functions when a message is forwarded to another vcpu.

Signed-off-by: Bibo Mao 
---
  hw/intc/loongarch_ipi.c | 136 +++-
  1 file changed, 77 insertions(+), 59 deletions(-)

Reviewed-by: Song Gao 

Thanks.
Song Gao

diff --git a/hw/intc/loongarch_ipi.c b/hw/intc/loongarch_ipi.c
index 67858b521c..1d3449e77d 100644
--- a/hw/intc/loongarch_ipi.c
+++ b/hw/intc/loongarch_ipi.c
@@ -17,14 +17,16 @@
  #include "target/loongarch/internals.h"
  #include "trace.h"
  
-static void loongarch_ipi_writel(void *, hwaddr, uint64_t, unsigned);

-
-static uint64_t loongarch_ipi_readl(void *opaque, hwaddr addr, unsigned size)
+static MemTxResult loongarch_ipi_readl(void *opaque, hwaddr addr,
+   uint64_t *data,
+   unsigned size, MemTxAttrs attrs)
  {
-IPICore *s = opaque;
+IPICore *s;
+LoongArchIPI *ipi = opaque;
  uint64_t ret = 0;
  int index = 0;
  
+s = &ipi->ipi_core;

  addr &= 0xff;
  switch (addr) {
  case CORE_STATUS_OFF:
@@ -49,10 +51,12 @@ static uint64_t loongarch_ipi_readl(void *opaque, hwaddr 
addr, unsigned size)
  }
  
  trace_loongarch_ipi_read(size, (uint64_t)addr, ret);

-return ret;
+*data = ret;
+return MEMTX_OK;
  }
  
-static void send_ipi_data(CPULoongArchState *env, uint64_t val, hwaddr addr)

+static void send_ipi_data(CPULoongArchState *env, uint64_t val, hwaddr addr,
+  MemTxAttrs attrs)
  {
  int i, mask = 0, data = 0;
  
@@ -62,7 +66,7 @@ static void send_ipi_data(CPULoongArchState *env, uint64_t val, hwaddr addr)

   */
  if ((val >> 27) & 0xf) {
  data = address_space_ldl(&env->address_space_iocsr, addr,
- MEMTXATTRS_UNSPECIFIED, NULL);
+ attrs, NULL);
  for (i = 0; i < 4; i++) {
  /* get mask for byte writing */
  if (val & (0x1 << (27 + i))) {
@@ -74,7 +78,7 @@ static void send_ipi_data(CPULoongArchState *env, uint64_t 
val, hwaddr addr)
  data &= mask;
  data |= (val >> 32) & ~mask;
  address_space_stl(&env->address_space_iocsr, addr,
-  data, MEMTXATTRS_UNSPECIFIED, NULL);
+  data, attrs, NULL);
  }
  
  static int archid_cmp(const void *a, const void *b)

@@ -103,80 +107,72 @@ static CPUState *ipi_getcpu(int arch_id)
  CPUArchId *archid;
  
  archid = find_cpu_by_archid(machine, arch_id);

-return CPU(archid->cpu);
-}
-
-static void ipi_send(uint64_t val)
-{
-uint32_t cpuid;
-uint8_t vector;
-CPUState *cs;
-LoongArchCPU *cpu;
-LoongArchIPI *s;
-
-cpuid = extract32(val, 16, 10);
-if (cpuid >= LOONGARCH_MAX_CPUS) {
-trace_loongarch_ipi_unsupported_cpuid("IOCSR_IPI_SEND", cpuid);
-return;
+if (archid) {
+return CPU(archid->cpu);
  }
  
-/* IPI status vector */

-vector = extract8(val, 0, 5);
-
-cs = ipi_getcpu(cpuid);
-cpu = LOONGARCH_CPU(cs);
-s = LOONGARCH_IPI(cpu->env.ipistate);
-loongarch_ipi_writel(&s->ipi_core, CORE_SET_OFF, BIT(vector), 4);
+return NULL;
  }
  
-static void mail_send(uint64_t val)

+static MemTxResult mail_send(uint64_t val, MemTxAttrs attrs)
  {
  uint32_t cpuid;
  hwaddr addr;
-CPULoongArchState *env;
  CPUState *cs;
-LoongArchCPU *cpu;
  
  cpuid = extract32(val, 16, 10);

  if (cpuid >= LOONGARCH_MAX_CPUS) {
  trace_loongarch_ipi_unsupported_cpuid("IOCSR_MAIL_SEND", cpuid);
-return;
+return MEMTX_DECODE_ERROR;
  }
  
-addr = 0x1020 + (val & 0x1c);

  cs = ipi_getcpu(cpuid);
-cpu = LOONGARCH_CPU(cs);
-env = &cpu->env;
-send_ipi_data(env, val, addr);
+if (cs == NULL) {
+return MEMTX_DECODE_ERROR;
+}
+
+/* override requester_id */
+addr = SMP_IPI_MAILBOX + CORE_BUF_20 + (val & 0x1c);
+attrs.requester_id = cs->cpu_index;
+send_ipi_data(&LOONGARCH_CPU(cs)->env, val, addr, attrs);
+return MEMTX_OK;
  }
  
-static void any_send(uint64_t val)

+static MemTxResult any_send(uint64_t val, MemTxAttrs attrs)
  {
  uint32_t cpuid;
  hwaddr addr;
-CPULoongArchState *env;
  CPUState *cs;
-LoongArchCPU *cpu;
  
  cpuid = extract32(val, 16, 10);

  if (cpuid >= LOONGARCH_MAX_CPUS) {
  trace_loongarch_ipi_unsupported_cpuid("IOCSR_ANY_SEND", cpuid);
-return;
+return MEMTX_DECODE_ERROR;
  }
  
-addr = val & 0xffff;

  cs = ipi_getcpu(cpuid);
-cpu = LOONGARCH_CPU(cs);
-env = &cpu->env;
-send_ipi_data(env, val, addr);
+if (cs == NULL) {
+return MEMTX_DECODE_ERROR;
+}
+
+

Re: [PATCH v3 11/46] hw/loongarch: use pci_init_nic_devices()

2024-01-08 Thread gaosong


On 2024/1/9 at 4:26 AM, David Woodhouse wrote:

From: David Woodhouse 

Signed-off-by: David Woodhouse 
---
  hw/loongarch/virt.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

Reviewed-by: Song Gao 

Thanks.
Song Gao

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 4b7dc67a2d..c48804ac38 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -504,9 +504,7 @@ static void loongarch_devices_init(DeviceState *pch_pic, 
LoongArchMachineState *
  fdt_add_uart_node(lams);
  
  /* Network init */

-for (i = 0; i < nb_nics; i++) {
-pci_nic_init_nofail(&nd_table[i], pci_bus, mc->default_nic, NULL);
-}
+pci_init_nic_devices(pci_bus, mc->default_nic);
  
  /*

   * There are some invalid guest memory access.

Re: [PATCH v7 00/16] Support smp.clusters for x86 in QEMU

2024-01-08 Thread Zhao Liu

Hi Babu,

On Mon, Jan 08, 2024 at 11:46:50AM -0600, Moger, Babu wrote:
> Date: Mon, 8 Jan 2024 11:46:50 -0600
> From: "Moger, Babu" 
> Subject: Re: [PATCH v7 00/16] Support smp.clusters for x86 in QEMU
> 
> Hi  Zhao,
> 
> Ran few basic tests on AMD systems. Changes look good.
> 
> Thanks
> Babu
> 
> 
> Tested-by: Babu Moger 
> 

Thanks very much for testing!

Regards,
Zhao

Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'

2024-01-08 Thread Gregory Price

On Mon, Jan 08, 2024 at 05:05:38PM -0800, Hao Xiang wrote:
> On Mon, Jan 8, 2024 at 2:47 PM Hao Xiang  wrote:
> >
> > On Mon, Jan 8, 2024 at 9:15 AM Gregory Price  
> > wrote:
> > >
> > > On Fri, Jan 05, 2024 at 09:59:19PM -0800, Hao Xiang wrote:
> > > > On Wed, Jan 3, 2024 at 1:56 PM Gregory Price 
> > > >  wrote:
> > > > >
> > > > > For a variety of performance reasons, this will not work the way you
> > > > > want it to.  You are essentially telling QEMU to map the vmem0 into a
> > > > > virtual cxl device, and now any memory accesses to that memory region
> > > > > will end up going through the cxl-type3 device logic - which is an IO
> > > > > path from the perspective of QEMU.
> > > >
> > > > I didn't understand exactly how the virtual cxl-type3 device works. I
> > > > thought it would go with the same "guest virtual address ->  guest
> > > > physical address -> host physical address" translation totally done by
> > > > CPU. But if it is going through an emulation path handled by virtual
> > > > cxl-type3, I agree the performance would be bad. Do you know why
> > > > accessing memory on a virtual cxl-type3 device can't go with the
> > > > nested page table translation?
> > > >
> > >
> > > Because a byte-access on CXL memory can have checks on it that must be
> > > emulated by the virtual device, and because there are caching
> > > implications that have to be emulated as well.
> >
> > Interesting. Now that I see the cxl_type3_read/cxl_type3_write. If the
> > CXL memory data path goes through them, the performance would be
> > pretty problematic. We have actually run Intel's Memory Latency
> > Checker benchmark from inside a guest VM with both system-DRAM and
> > virtual CXL-type3 configured. The idle latency on the virtual CXL
> > memory is 2X of system DRAM, which is on-par with the benchmark
> > running from a physical host. I need to debug this more to understand
> > why the latency is actually much better than I would expect now.
> 
> So we double checked on benchmark testing. What we see is that running
> Intel Memory Latency Checker from a guest VM with virtual CXL memory
> VS from a physical host with CXL1.1 memory expander has the same
> latency.
> 
> From guest VM: local socket system-DRAM latency is 117.0ns, local
> socket CXL-DRAM latency is 269.4ns
> From physical host: local socket system-DRAM latency is 113.6ns ,
> local socket CXL-DRAM latency is 267.5ns
> 
> I also set debugger breakpoints on cxl_type3_read/cxl_type3_write
> while running the benchmark testing but those two functions are not
> ever hit. We used the virtual CXL configuration while launching QEMU
> but the CXL memory is present as a separate NUMA node and we are not
> creating devdax devices. Does that make any difference?
> 

Could you possibly share your full QEMU configuration and what OS/kernel
you are running inside the guest?

The only thing I'm surprised by is that the numa node appears without
requiring the driver to generate the NUMA node.  It's possible I missed
a QEMU update that allows this.

~Gregory

Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'

2024-01-08 Thread Hao Xiang

On Mon, Jan 8, 2024 at 2:47 PM Hao Xiang  wrote:
>
> On Mon, Jan 8, 2024 at 9:15 AM Gregory Price  
> wrote:
> >
> > On Fri, Jan 05, 2024 at 09:59:19PM -0800, Hao Xiang wrote:
> > > On Wed, Jan 3, 2024 at 1:56 PM Gregory Price  
> > > wrote:
> > > >
> > > > For a variety of performance reasons, this will not work the way you
> > > > want it to.  You are essentially telling QEMU to map the vmem0 into a
> > > > virtual cxl device, and now any memory accesses to that memory region
> > > > will end up going through the cxl-type3 device logic - which is an IO
> > > > path from the perspective of QEMU.
> > >
> > > I didn't understand exactly how the virtual cxl-type3 device works. I
> > > thought it would go with the same "guest virtual address ->  guest
> > > physical address -> host physical address" translation totally done by
> > > CPU. But if it is going through an emulation path handled by virtual
> > > cxl-type3, I agree the performance would be bad. Do you know why
> > > accessing memory on a virtual cxl-type3 device can't go with the
> > > nested page table translation?
> > >
> >
> > Because a byte-access on CXL memory can have checks on it that must be
> > emulated by the virtual device, and because there are caching
> > implications that have to be emulated as well.
>
> Interesting. Now that I see the cxl_type3_read/cxl_type3_write. If the
> CXL memory data path goes through them, the performance would be
> pretty problematic. We have actually run Intel's Memory Latency
> Checker benchmark from inside a guest VM with both system-DRAM and
> virtual CXL-type3 configured. The idle latency on the virtual CXL
> memory is 2X of system DRAM, which is on-par with the benchmark
> running from a physical host. I need to debug this more to understand
> why the latency is actually much better than I would expect now.

So we double checked on benchmark testing. What we see is that running
Intel Memory Latency Checker from a guest VM with virtual CXL memory
VS from a physical host with CXL1.1 memory expander has the same
latency.

From guest VM: local socket system-DRAM latency is 117.0ns, local
socket CXL-DRAM latency is 269.4ns
From physical host: local socket system-DRAM latency is 113.6ns,
local socket CXL-DRAM latency is 267.5ns

I also set debugger breakpoints on cxl_type3_read/cxl_type3_write
while running the benchmark, but those two functions were never hit.
We used the virtual CXL configuration when launching QEMU, but the CXL
memory is presented as a separate NUMA node and we are not creating
devdax devices. Does that make any difference?

>
> >
> > The cxl device you are using is an emulated CXL device - not a
> > virtualization interface.  Nuanced difference:  the emulated device has
> > to emulate *everything* that CXL device does.
> >
> > What you want is passthrough / managed access to a real device -
> > virtualization.  This is not the way to accomplish that.  A better way
> > to accomplish that is to simply pass the memory through as a static numa
> > node as I described.
>
> That would work, too. But I think a kernel change is required to
> establish the correct memory tiering if we go this route.
>
> >
> > >
> > > When we had a discussion with Intel, they told us to not use the KVM
> > > option in QEMU while using virtual cxl type3 device. That's probably
> > > related to the issue you described here? We enabled KVM though but
> > > haven't seen the crash yet.
> > >
> >
> > The crash really only happens, IIRC, if code ends up hosted in that
> > memory.  I forget the exact scenario, but the working theory is it has
> > to do with the way instruction caches are managed with KVM and this
> > device.
> >
> > > >
> > > > You're better off just using the `host-nodes` field of host-memory
> > > > and passing bandwidth/latency attributes via `-numa hmat-lb`
> > >
> > > We tried this but it doesn't work from end to end right now. I
> > > described the issue in another fork of this thread.
> > >
> > > >
> > > > In that scenario, the guest software doesn't even need to know CXL
> > > > exists at all, it can just read the attributes of the numa node
> > > > that QEMU created for it.
> > >
> > > We thought about this before. But the current kernel implementation
> > > requires a devdax device to be probed and recognized as a slow tier
> > > (by reading the memory attributes). I don't think this can be done via
> > > the path you described. Have you tried this before?
> > >
> >
> > Right, because the memory tiering component lumps the nodes together.
> >
> > Better idea:  Fix the memory tiering component
> >
> > I cc'd you on another patch line that is discussing something relevant
> > to this.
> >
> > https://lore.kernel.org/linux-mm/87fs00njft@yhuang6-desk2.ccr.corp.intel.com/T/#m32d58f8cc607aec942995994a41b17ff711519c8
> >
> > The point is: There's no need for this to be a dax device at all, there
> > is no need for the guest to even know what is providing the memory, or
> > for

RE: [PATCH v2] target/riscv: Implement optional CSR mcontext of debug Sdtrig extension

2024-01-08 Thread 張哲嘉

Ping for review, thanks!!

> -----Original Message-----
> From: Alvin Che-Chia Chang(張哲嘉) 
> Sent: Tuesday, December 19, 2023 8:33 PM
> To: qemu-ri...@nongnu.org; qemu-devel@nongnu.org
> Cc: alistair.fran...@wdc.com; bin.m...@windriver.com;
> liwei1...@gmail.com; dbarb...@ventanamicro.com;
> zhiwei_...@linux.alibaba.com; Alvin Che-Chia Chang(張哲嘉)
> 
> Subject: [PATCH v2] target/riscv: Implement optional CSR mcontext of debug
> Sdtrig extension
> 
> The debug Sdtrig extension defines a CSR "mcontext". This commit
> implements its predicate and read/write operations in the CSR table.
> Its value is reset to 0 when the trigger module is reset.
> 
> Signed-off-by: Alvin Chang 
> ---
> Changes from v1: Remove dedicated cfg, always implement mcontext.
> 
>  target/riscv/cpu.h  |  1 +
>  target/riscv/cpu_bits.h |  7 +++
>  target/riscv/csr.c  | 36 +++-
>  target/riscv/debug.c|  2 ++
>  4 files changed, 41 insertions(+), 5 deletions(-)
> 
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index d74b361..e117641
> 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -345,6 +345,7 @@ struct CPUArchState {
>  target_ulong tdata1[RV_MAX_TRIGGERS];
>  target_ulong tdata2[RV_MAX_TRIGGERS];
>  target_ulong tdata3[RV_MAX_TRIGGERS];
> +target_ulong mcontext;
>  struct CPUBreakpoint *cpu_breakpoint[RV_MAX_TRIGGERS];
>  struct CPUWatchpoint *cpu_watchpoint[RV_MAX_TRIGGERS];
>  QEMUTimer *itrigger_timer[RV_MAX_TRIGGERS]; diff --git
> a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h index ebd7917..3296648
> 100644
> --- a/target/riscv/cpu_bits.h
> +++ b/target/riscv/cpu_bits.h
> @@ -361,6 +361,7 @@
>  #define CSR_TDATA2  0x7a2
>  #define CSR_TDATA3  0x7a3
>  #define CSR_TINFO   0x7a4
> +#define CSR_MCONTEXT0x7a8
> 
>  /* Debug Mode Registers */
>  #define CSR_DCSR0x7b0
> @@ -905,4 +906,10 @@ typedef enum RISCVException {
>  /* JVT CSR bits */
>  #define JVT_MODE   0x3F
>  #define JVT_BASE   (~0x3F)
> +
> +/* Debug Sdtrig CSR masks */
> +#define MCONTEXT32 0x003F
> +#define MCONTEXT64
> 0x1FFFULL
> +#define MCONTEXT32_HCONTEXT0x007F
> +#define MCONTEXT64_HCONTEXT
> 0x3FFFULL
>  #endif
> diff --git a/target/riscv/csr.c b/target/riscv/csr.c index fde7ce1..ff1e128 
> 100644
> --- a/target/riscv/csr.c
> +++ b/target/riscv/csr.c
> @@ -3900,6 +3900,31 @@ static RISCVException read_tinfo(CPURISCVState
> *env, int csrno,
>  return RISCV_EXCP_NONE;
>  }
> 
> +static RISCVException read_mcontext(CPURISCVState *env, int csrno,
> +target_ulong *val) {
> +*val = env->mcontext;
> +return RISCV_EXCP_NONE;
> +}
> +
> +static RISCVException write_mcontext(CPURISCVState *env, int csrno,
> + target_ulong val) {
> +bool rv32 = riscv_cpu_mxl(env) == MXL_RV32 ? true : false;
> +int32_t mask;
> +
> +if (riscv_has_ext(env, RVH)) {
> +/* Spec suggest 7-bit for RV32 and 14-bit for RV64 w/ H extension
> */
> +mask = rv32 ? MCONTEXT32_HCONTEXT :
> MCONTEXT64_HCONTEXT;
> +} else {
> +/* Spec suggest 6-bit for RV32 and 13-bit for RV64 w/o H extension
> */
> +mask = rv32 ? MCONTEXT32 : MCONTEXT64;
> +}
> +
> +env->mcontext = val & mask;
> +return RISCV_EXCP_NONE;
> +}
> +
>  /*
>   * Functions to access Pointer Masking feature registers
>   * We have to check if current priv lvl could modify @@ -4794,11 +4819,12
> @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
>  [CSR_PMPADDR15] =  { "pmpaddr15", pmp, read_pmpaddr,
> write_pmpaddr },
> 
>  /* Debug CSRs */
> -[CSR_TSELECT]   =  { "tselect", debug, read_tselect, write_tselect },
> -[CSR_TDATA1]=  { "tdata1",  debug, read_tdata,
> write_tdata   },
> -[CSR_TDATA2]=  { "tdata2",  debug, read_tdata,
> write_tdata   },
> -[CSR_TDATA3]=  { "tdata3",  debug, read_tdata,
> write_tdata   },
> -[CSR_TINFO] =  { "tinfo",   debug, read_tinfo,
> write_ignore  },
> +[CSR_TSELECT]   =  { "tselect",  debug, read_tselect,
> write_tselect  },
> +[CSR_TDATA1]=  { "tdata1",   debug, read_tdata,
> write_tdata},
> +[CSR_TDATA2]=  { "tdata2",   debug, read_tdata,
> write_tdata},
> +[CSR_TDATA3]=  { "tdata3",   debug, read_tdata,
> write_tdata},
> +[CSR_TINFO] =  { "tinfo",debug, read_tinfo,
> write_ignore   },
> +[CSR_MCONTEXT]  =  { "mcontext", debug, read_mcontext,
> + write_mcontext },
> 
>  /* User Pointer Masking */
>  [CSR_UMTE]={ "umte",pointer_masking, read_umte,
> write_umte },
> diff --git a/target/riscv/debug.c b/target/riscv/debug.c index 
> 4945d1a..e30d99c
> 100644
> --- a/target/riscv/debug.c
> +++ b/target/riscv/debug.c
> @@ -940,4 +940,6 @@ void riscv_trigger_reset_hold(CPURISCVState *env)
>
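
The mask selection in write_mcontext() above can be tried outside QEMU.
What follows is a minimal standalone sketch, assuming only the field
widths named in the patch comments (6/13 bits without the H extension,
7/14 bits with it). mcontext_mask() and mcontext_write() are hypothetical
names used for illustration, not QEMU functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Writable-bit masks, derived from the widths quoted in the patch comments. */
#define MCONTEXT32           0x3FULL    /* 6 bits:  RV32 without H */
#define MCONTEXT64           0x1FFFULL  /* 13 bits: RV64 without H */
#define MCONTEXT32_HCONTEXT  0x7FULL    /* 7 bits:  RV32 with H */
#define MCONTEXT64_HCONTEXT  0x3FFFULL  /* 14 bits: RV64 with H */

/* Hypothetical helper: pick the writable mask for a given configuration. */
static uint64_t mcontext_mask(bool rv32, bool has_h)
{
    if (has_h) {
        return rv32 ? MCONTEXT32_HCONTEXT : MCONTEXT64_HCONTEXT;
    }
    return rv32 ? MCONTEXT32 : MCONTEXT64;
}

/* Hypothetical helper: the value kept after a guest writes 'val'. */
static uint64_t mcontext_write(uint64_t val, bool rv32, bool has_h)
{
    return val & mcontext_mask(rv32, has_h);
}
```

Writing all-ones on RV64 without H should therefore leave only the low
13 bits set, and every bit above the selected width reads back as zero.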

Re: [PATCH 0/3] target/riscv: A few bug fixes and Coverity fix

2024-01-08 Thread Alistair Francis

On Mon, Jan 8, 2024 at 10:13 AM Alistair Francis  wrote:
>
> A few bug fixes for some Gitlab issues and a Coverity fix
>
> Alistair Francis (3):
>   target/riscv: Assert that the CSR numbers will be correct
>   target/riscv: Don't adjust vscause for exceptions
>   target/riscv: Ensure mideleg is set correctly on reset

Thanks!

Applied to riscv-to-apply.next

Alistair

>
>  target/riscv/cpu.c| 8 
>  target/riscv/cpu_helper.c | 4 ++--
>  target/riscv/csr.c| 5 -
>  3 files changed, 14 insertions(+), 3 deletions(-)
>
> --
> 2.43.0
>

Re: [PATCH v3 2/5] target/riscv: Add cycle & instret privilege mode filtering properties

2024-01-08 Thread Atish Kumar Patra

On Mon, Jan 8, 2024 at 10:10 AM Daniel Henrique Barboza
 wrote:
>
>
>
> On 1/5/24 19:13, Atish Patra wrote:
> > From: Kaiwen Xue 
> >
> > This adds the properties for ISA extension smcntrpmf. Patches
> > implementing it will follow.
> >
> > Signed-off-by: Atish Patra 
> > Signed-off-by: Kaiwen Xue 
> > ---
> >   target/riscv/cpu.c | 2 ++
> >   target/riscv/cpu_cfg.h | 1 +
> >   2 files changed, 3 insertions(+)
> >
> > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> > index 83c7c0cf07be..ea34ff2ae983 100644
> > --- a/target/riscv/cpu.c
> > +++ b/target/riscv/cpu.c
> > @@ -148,6 +148,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
> >   ISA_EXT_DATA_ENTRY(smstateen, PRIV_VERSION_1_12_0, ext_smstateen),
> >   ISA_EXT_DATA_ENTRY(ssaia, PRIV_VERSION_1_12_0, ext_ssaia),
> >   ISA_EXT_DATA_ENTRY(sscofpmf, PRIV_VERSION_1_12_0, ext_sscofpmf),
> > +ISA_EXT_DATA_ENTRY(smcntrpmf, PRIV_VERSION_1_12_0, ext_smcntrpmf),
> >   ISA_EXT_DATA_ENTRY(sstc, PRIV_VERSION_1_12_0, ext_sstc),
> >   ISA_EXT_DATA_ENTRY(svadu, PRIV_VERSION_1_12_0, ext_svadu),
> >   ISA_EXT_DATA_ENTRY(svinval, PRIV_VERSION_1_12_0, ext_svinval),
>
> Sorry for not noticing this in the previous version. I believe we want the 
> "smcntrpmf"
> entry to be right after "smaia" because the isa_edata_arr[] ordering matters 
> when
> building the riscv,isa string in riscv_isa_string_ext().
>

Oops. Thanks for catching that. Fixed in v4.

>
> Thanks,
>
> Daniel
>
> > @@ -1296,6 +1297,7 @@ const char *riscv_get_misa_ext_description(uint32_t 
> > bit)
> >   const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
> >   /* Defaults for standard extensions */
> >   MULTI_EXT_CFG_BOOL("sscofpmf", ext_sscofpmf, false),
> > +MULTI_EXT_CFG_BOOL("smcntrpmf", ext_smcntrpmf, false),
> >   MULTI_EXT_CFG_BOOL("zifencei", ext_zifencei, true),
> >   MULTI_EXT_CFG_BOOL("zicsr", ext_zicsr, true),
> >   MULTI_EXT_CFG_BOOL("zihintntl", ext_zihintntl, true),
> > diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
> > index f4605fb190b9..00c34fdd3209 100644
> > --- a/target/riscv/cpu_cfg.h
> > +++ b/target/riscv/cpu_cfg.h
> > @@ -72,6 +72,7 @@ struct RISCVCPUConfig {
> >   bool ext_zihpm;
> >   bool ext_smstateen;
> >   bool ext_sstc;
> > +bool ext_smcntrpmf;
> >   bool ext_svadu;
> >   bool ext_svinval;
> >   bool ext_svnapot;
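
Daniel's point about isa_edata_arr[] ordering can be illustrated with a
small standalone sketch. It assumes only what the review says: the
extension part of the riscv,isa string is produced by walking the table
in order, so the table order is the string order. struct ext_entry and
build_isa_string() are hypothetical stand-ins, not the actual
riscv_isa_string_ext() code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one isa_edata_arr[] entry. */
struct ext_entry {
    const char *name;
    bool enabled;
};

/*
 * Append "_<name>" for every enabled entry, in array order.
 * Returns the resulting string length, or -1 if 'buf' is too small.
 */
static int build_isa_string(char *buf, size_t size, const char *base,
                            const struct ext_entry *tbl, size_t n)
{
    int len = snprintf(buf, size, "%s", base);

    if (len < 0 || (size_t)len >= size) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        if (!tbl[i].enabled) {
            continue;
        }
        int w = snprintf(buf + len, size - (size_t)len, "_%s", tbl[i].name);
        if (w < 0 || (size_t)w >= size - (size_t)len) {
            return -1;
        }
        len += w;
    }
    return len;
}
```

With a table ordered smaia, smcntrpmf, smepmp, sscofpmf, the enabled
names come out in exactly that sequence, which is why a misplaced table
entry shows up directly in the generated string.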

[PATCH v4 4/5] target/riscv: Add cycle & instret privilege mode filtering support

2024-01-08 Thread Atish Patra

From: Kaiwen Xue 

QEMU only calculates dummy cycles and instructions, so there is no
actual means to stop the icount in QEMU. Hence this patch merely adds
the functionality of accessing the cfg registers, and has no actual
effect on the counting of the cycle and instret counters.

Signed-off-by: Atish Patra 
Reviewed-by: Daniel Henrique Barboza 
Signed-off-by: Kaiwen Xue 
---
 target/riscv/csr.c | 80 ++
 1 file changed, 80 insertions(+)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 283468bbc652..3bd4aa22374f 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -233,6 +233,24 @@ static RISCVException sscofpmf_32(CPURISCVState *env, int 
csrno)
 return sscofpmf(env, csrno);
 }
 
+static RISCVException smcntrpmf(CPURISCVState *env, int csrno)
+{
+if (!riscv_cpu_cfg(env)->ext_smcntrpmf) {
+return RISCV_EXCP_ILLEGAL_INST;
+}
+
+return RISCV_EXCP_NONE;
+}
+
+static RISCVException smcntrpmf_32(CPURISCVState *env, int csrno)
+{
+if (riscv_cpu_mxl(env) != MXL_RV32) {
+return RISCV_EXCP_ILLEGAL_INST;
+}
+
+return smcntrpmf(env, csrno);
+}
+
 static RISCVException any(CPURISCVState *env, int csrno)
 {
 return RISCV_EXCP_NONE;
@@ -818,6 +836,54 @@ static int read_hpmcounterh(CPURISCVState *env, int csrno, 
target_ulong *val)
 
 #else /* CONFIG_USER_ONLY */
 
+static int read_mcyclecfg(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->mcyclecfg;
+return RISCV_EXCP_NONE;
+}
+
+static int write_mcyclecfg(CPURISCVState *env, int csrno, target_ulong val)
+{
+env->mcyclecfg = val;
+return RISCV_EXCP_NONE;
+}
+
+static int read_mcyclecfgh(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->mcyclecfgh;
+return RISCV_EXCP_NONE;
+}
+
+static int write_mcyclecfgh(CPURISCVState *env, int csrno, target_ulong val)
+{
+env->mcyclecfgh = val;
+return RISCV_EXCP_NONE;
+}
+
+static int read_minstretcfg(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->minstretcfg;
+return RISCV_EXCP_NONE;
+}
+
+static int write_minstretcfg(CPURISCVState *env, int csrno, target_ulong val)
+{
+env->minstretcfg = val;
+return RISCV_EXCP_NONE;
+}
+
+static int read_minstretcfgh(CPURISCVState *env, int csrno, target_ulong *val)
+{
+*val = env->minstretcfgh;
+return RISCV_EXCP_NONE;
+}
+
+static int write_minstretcfgh(CPURISCVState *env, int csrno, target_ulong val)
+{
+env->minstretcfgh = val;
+return RISCV_EXCP_NONE;
+}
+
 static int read_mhpmevent(CPURISCVState *env, int csrno, target_ulong *val)
 {
 int evt_index = csrno - CSR_MCOUNTINHIBIT;
@@ -4922,6 +4988,13 @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
  write_mcountinhibit,
  .min_priv_ver = PRIV_VERSION_1_11_0   },
 
+[CSR_MCYCLECFG]  = { "mcyclecfg",   smcntrpmf, read_mcyclecfg,
+ write_mcyclecfg,
+ .min_priv_ver = PRIV_VERSION_1_12_0   },
+[CSR_MINSTRETCFG]= { "minstretcfg", smcntrpmf, read_minstretcfg,
+ write_minstretcfg,
+ .min_priv_ver = PRIV_VERSION_1_12_0   },
+
 [CSR_MHPMEVENT3] = { "mhpmevent3", any,read_mhpmevent,
  write_mhpmevent   },
 [CSR_MHPMEVENT4] = { "mhpmevent4", any,read_mhpmevent,
@@ -4981,6 +5054,13 @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
 [CSR_MHPMEVENT31]= { "mhpmevent31",any,read_mhpmevent,
  write_mhpmevent   },
 
+[CSR_MCYCLECFGH] = { "mcyclecfgh",   smcntrpmf_32, read_mcyclecfgh,
+ write_mcyclecfgh,
+ .min_priv_ver = PRIV_VERSION_1_12_0},
+[CSR_MINSTRETCFGH]   = { "minstretcfgh", smcntrpmf_32, read_minstretcfgh,
+ write_minstretcfgh,
+ .min_priv_ver = PRIV_VERSION_1_12_0},
+
 [CSR_MHPMEVENT3H]= { "mhpmevent3h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-- 
2.34.1
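
The predicate layering used by smcntrpmf() and smcntrpmf_32() above can
be sketched in isolation. The types and exception values below are
simplified, hypothetical stand-ins for CPURISCVState and RISCVException.
Only the order of the checks follows the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the numeric values are illustrative only. */
enum excp { EXCP_NONE = 0, EXCP_ILLEGAL_INST = 1 };
enum mxl { MXL_RV32 = 1, MXL_RV64 = 2 };

struct cpu_env {
    enum mxl mxl;
    bool ext_smcntrpmf;
};

/* Base predicate: the CSR is accessible only if the extension is enabled. */
static enum excp pred_smcntrpmf(const struct cpu_env *env)
{
    return env->ext_smcntrpmf ? EXCP_NONE : EXCP_ILLEGAL_INST;
}

/* Predicate for the "h" CSRs: reject non-RV32 first, then defer to the base. */
static enum excp pred_smcntrpmf_32(const struct cpu_env *env)
{
    if (env->mxl != MXL_RV32) {
        return EXCP_ILLEGAL_INST;
    }
    return pred_smcntrpmf(env);
}
```

In this sketch an RV64 CPU gets an illegal-instruction result for the
high-half CSRs even when the extension is enabled, because the width
check runs before the extension check.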

[PATCH v4 5/5] target/riscv: Implement privilege mode filtering for cycle/instret

2024-01-08 Thread Atish Patra

Privilege mode filtering can also be emulated for cycle/instret by
tracking host_ticks/icount during each privilege mode switch. This
patch implements that for both cycle/instret and mhpmcounters. The
first one requires Smcntrpmf while the other one requires Sscofpmf
to be enabled.

The cycle/instret are still computed using host ticks when icount
is not enabled. Otherwise, they are computed using raw icount which
is more accurate in icount mode.

Reviewed-by: Daniel Henrique Barboza 
Signed-off-by: Atish Patra 
---
 target/riscv/cpu.h| 11 +
 target/riscv/cpu_helper.c |  9 +++-
 target/riscv/csr.c| 95 ++-
 target/riscv/pmu.c| 43 ++
 target/riscv/pmu.h|  2 +
 5 files changed, 136 insertions(+), 24 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 34617c4c4bab..40d10726155b 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -136,6 +136,15 @@ typedef struct PMUCTRState {
 target_ulong irq_overflow_left;
 } PMUCTRState;
 
+typedef struct PMUFixedCtrState {
+/* Track cycle and icount for each privilege mode */
+uint64_t counter[4];
+uint64_t counter_prev[4];
+/* Track cycle and icount for each privilege mode when V = 1*/
+uint64_t counter_virt[2];
+uint64_t counter_virt_prev[2];
+} PMUFixedCtrState;
+
 struct CPUArchState {
 target_ulong gpr[32];
 target_ulong gprh[32]; /* 64 top bits of the 128-bit registers */
@@ -334,6 +343,8 @@ struct CPUArchState {
 /* PMU event selector configured values for RV32 */
 target_ulong mhpmeventh_val[RV_MAX_MHPMEVENTS];
 
+PMUFixedCtrState pmu_fixed_ctrs[2];
+
 target_ulong sscratch;
 target_ulong mscratch;
 
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index e7e23b34f455..3dddb1b433e8 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -715,8 +715,13 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong 
newpriv)
 {
 g_assert(newpriv <= PRV_M && newpriv != PRV_RESERVED);
 
-if (icount_enabled() && newpriv != env->priv) {
-riscv_itrigger_update_priv(env);
+if (newpriv != env->priv) {
+if (icount_enabled()) {
+riscv_itrigger_update_priv(env);
+riscv_pmu_icount_update_priv(env, newpriv);
+} else {
+riscv_pmu_cycle_update_priv(env, newpriv);
+}
 }
 /* tlb_flush is unnecessary as mode is contained in mmu_idx */
 env->priv = newpriv;
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 3bd4aa22374f..307d052021c5 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -782,32 +782,16 @@ static int write_vcsr(CPURISCVState *env, int csrno, 
target_ulong val)
 return RISCV_EXCP_NONE;
 }
 
+#if defined(CONFIG_USER_ONLY)
 /* User Timers and Counters */
 static target_ulong get_ticks(bool shift)
 {
-int64_t val;
-target_ulong result;
-
-#if !defined(CONFIG_USER_ONLY)
-if (icount_enabled()) {
-val = icount_get();
-} else {
-val = cpu_get_host_ticks();
-}
-#else
-val = cpu_get_host_ticks();
-#endif
-
-if (shift) {
-result = val >> 32;
-} else {
-result = val;
-}
+int64_t val = cpu_get_host_ticks();
+target_ulong result = shift ? val >> 32 : val;
 
 return result;
 }
 
-#if defined(CONFIG_USER_ONLY)
 static RISCVException read_time(CPURISCVState *env, int csrno,
 target_ulong *val)
 {
@@ -932,6 +916,70 @@ static int write_mhpmeventh(CPURISCVState *env, int csrno, 
target_ulong val)
 return RISCV_EXCP_NONE;
 }
 
+static target_ulong riscv_pmu_ctr_get_fixed_counters_val(CPURISCVState *env,
+ int counter_idx,
+ bool upper_half)
+{
+uint64_t curr_val = 0;
+target_ulong result = 0;
+uint64_t *counter_arr = icount_enabled() ? env->pmu_fixed_ctrs[1].counter :
+env->pmu_fixed_ctrs[0].counter;
+uint64_t *counter_arr_virt = icount_enabled() ?
+ env->pmu_fixed_ctrs[1].counter_virt :
+ env->pmu_fixed_ctrs[0].counter_virt;
+uint64_t cfg_val = 0;
+
+if (counter_idx == 0) {
+cfg_val = upper_half ? ((uint64_t)env->mcyclecfgh << 32) :
+  env->mcyclecfg;
+} else if (counter_idx == 2) {
+cfg_val = upper_half ? ((uint64_t)env->minstretcfgh << 32) :
+  env->minstretcfg;
+} else {
+cfg_val = upper_half ?
+  ((uint64_t)env->mhpmeventh_val[counter_idx] << 32) :
+  env->mhpmevent_val[counter_idx];
+}
+
+if (!cfg_val) {
+if (icount_enabled()) {
+curr_val = icount_get_raw();
+} else {
+curr_val = cpu_get_host_ticks();
+}
+goto done;
+}
+
+if (!(cfg_val &

[PATCH v4 2/5] target/riscv: Add cycle & instret privilege mode filtering properties

2024-01-08 Thread Atish Patra

From: Kaiwen Xue 

This adds the properties for ISA extension smcntrpmf. Patches
implementing it will follow.

Signed-off-by: Atish Patra 
Signed-off-by: Kaiwen Xue 
---
 target/riscv/cpu.c | 2 ++
 target/riscv/cpu_cfg.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 83c7c0cf07be..501ae560ec29 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -144,6 +144,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx),
 ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin),
 ISA_EXT_DATA_ENTRY(smaia, PRIV_VERSION_1_12_0, ext_smaia),
+ISA_EXT_DATA_ENTRY(smcntrpmf, PRIV_VERSION_1_12_0, ext_smcntrpmf),
 ISA_EXT_DATA_ENTRY(smepmp, PRIV_VERSION_1_12_0, ext_smepmp),
 ISA_EXT_DATA_ENTRY(smstateen, PRIV_VERSION_1_12_0, ext_smstateen),
 ISA_EXT_DATA_ENTRY(ssaia, PRIV_VERSION_1_12_0, ext_ssaia),
@@ -1296,6 +1297,7 @@ const char *riscv_get_misa_ext_description(uint32_t bit)
 const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
 /* Defaults for standard extensions */
 MULTI_EXT_CFG_BOOL("sscofpmf", ext_sscofpmf, false),
+MULTI_EXT_CFG_BOOL("smcntrpmf", ext_smcntrpmf, false),
 MULTI_EXT_CFG_BOOL("zifencei", ext_zifencei, true),
 MULTI_EXT_CFG_BOOL("zicsr", ext_zicsr, true),
 MULTI_EXT_CFG_BOOL("zihintntl", ext_zihintntl, true),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index f4605fb190b9..00c34fdd3209 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -72,6 +72,7 @@ struct RISCVCPUConfig {
 bool ext_zihpm;
 bool ext_smstateen;
 bool ext_sstc;
+bool ext_smcntrpmf;
 bool ext_svadu;
 bool ext_svinval;
 bool ext_svnapot;
-- 
2.34.1

[PATCH v4 3/5] target/riscv: Add cycle & instret privilege mode filtering definitions

2024-01-08 Thread Atish Patra

From: Kaiwen Xue 

This adds the definitions for ISA extension smcntrpmf.

Signed-off-by: Kaiwen Xue 
Reviewed-by: Daniel Henrique Barboza 
Signed-off-by: Atish Patra 
---
 target/riscv/cpu.h  |  6 ++
 target/riscv/cpu_bits.h | 29 +
 2 files changed, 35 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index d74b361be641..34617c4c4bab 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -319,6 +319,12 @@ struct CPUArchState {
 
 target_ulong mcountinhibit;
 
+/* PMU cycle & instret privilege mode filtering */
+target_ulong mcyclecfg;
+target_ulong mcyclecfgh;
+target_ulong minstretcfg;
+target_ulong minstretcfgh;
+
 /* PMU counter state */
 PMUCTRState pmu_ctrs[RV_MAX_MHPMCOUNTERS];
 
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index ebd7917d490a..0ee91e502e8f 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -401,6 +401,10 @@
 /* Machine counter-inhibit register */
 #define CSR_MCOUNTINHIBIT   0x320
 
+/* Machine counter configuration registers */
+#define CSR_MCYCLECFG   0x321
+#define CSR_MINSTRETCFG 0x322
+
 #define CSR_MHPMEVENT3  0x323
 #define CSR_MHPMEVENT4  0x324
 #define CSR_MHPMEVENT5  0x325
@@ -431,6 +435,9 @@
 #define CSR_MHPMEVENT30 0x33e
 #define CSR_MHPMEVENT31 0x33f
 
+#define CSR_MCYCLECFGH  0x721
+#define CSR_MINSTRETCFGH0x722
+
 #define CSR_MHPMEVENT3H 0x723
 #define CSR_MHPMEVENT4H 0x724
 #define CSR_MHPMEVENT5H 0x725
@@ -885,6 +892,28 @@ typedef enum RISCVException {
 /* PMU related bits */
 #define MIE_LCOFIE (1 << IRQ_PMU_OVF)
 
+#define MCYCLECFG_BIT_MINH BIT_ULL(62)
+#define MCYCLECFGH_BIT_MINHBIT(30)
+#define MCYCLECFG_BIT_SINH BIT_ULL(61)
+#define MCYCLECFGH_BIT_SINHBIT(29)
+#define MCYCLECFG_BIT_UINH BIT_ULL(60)
+#define MCYCLECFGH_BIT_UINHBIT(28)
+#define MCYCLECFG_BIT_VSINHBIT_ULL(59)
+#define MCYCLECFGH_BIT_VSINH   BIT(27)
+#define MCYCLECFG_BIT_VUINHBIT_ULL(58)
+#define MCYCLECFGH_BIT_VUINH   BIT(26)
+
+#define MINSTRETCFG_BIT_MINH   BIT_ULL(62)
+#define MINSTRETCFGH_BIT_MINH  BIT(30)
+#define MINSTRETCFG_BIT_SINH   BIT_ULL(61)
+#define MINSTRETCFGH_BIT_SINH  BIT(29)
+#define MINSTRETCFG_BIT_UINH   BIT_ULL(60)
+#define MINSTRETCFGH_BIT_UINH  BIT(28)
+#define MINSTRETCFG_BIT_VSINH  BIT_ULL(59)
+#define MINSTRETCFGH_BIT_VSINH BIT(27)
+#define MINSTRETCFG_BIT_VUINH  BIT_ULL(58)
+#define MINSTRETCFGH_BIT_VUINH BIT(26)
+
 #define MHPMEVENT_BIT_OF   BIT_ULL(63)
 #define MHPMEVENTH_BIT_OF  BIT(31)
 #define MHPMEVENT_BIT_MINH BIT_ULL(62)
-- 
2.34.1
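
The paired definitions above follow one pattern: each *CFGH bit is the
matching 64-bit *CFG bit shifted down by 32, since the "h" CSR holds the
upper half on RV32. A small standalone check of that relationship is
sketched below. BIT32()/BIT64() are local equivalents of BIT()/BIT_ULL()
and cfg_combine() is a hypothetical helper, both added here only so the
sketch compiles on its own.

```c
#include <stdint.h>

/* Local equivalents of the BIT()/BIT_ULL() helpers, for a standalone build. */
#define BIT32(n)  (UINT32_C(1) << (n))
#define BIT64(n)  (UINT64_C(1) << (n))

#define MCYCLECFG_BIT_MINH    BIT64(62)
#define MCYCLECFGH_BIT_MINH   BIT32(30)
#define MCYCLECFG_BIT_VUINH   BIT64(58)
#define MCYCLECFGH_BIT_VUINH  BIT32(26)

/* Hypothetical helper: combine the RV32 low/high halves into a 64-bit view. */
static uint64_t cfg_combine(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}
```

Combining a high half that has MCYCLECFGH_BIT_MINH set yields a 64-bit
value with MCYCLECFG_BIT_MINH set, so RV32 and RV64 code can share one
set of 64-bit filter checks.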

[PATCH v4 1/5] target/riscv: Fix the predicate functions for mhpmeventhX CSRs

2024-01-08 Thread Atish Patra

mhpmeventhX CSRs are available for RV32. The predicate function
should check that first before checking sscofpmf extension.

Fixes: 14664483457b ("target/riscv: Add sscofpmf extension support")
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Alistair Francis 
Signed-off-by: Atish Patra 
---
 target/riscv/csr.c | 67 ++
 1 file changed, 38 insertions(+), 29 deletions(-)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index fde7ce1a5336..283468bbc652 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -224,6 +224,15 @@ static RISCVException sscofpmf(CPURISCVState *env, int 
csrno)
 return RISCV_EXCP_NONE;
 }
 
+static RISCVException sscofpmf_32(CPURISCVState *env, int csrno)
+{
+if (riscv_cpu_mxl(env) != MXL_RV32) {
+return RISCV_EXCP_ILLEGAL_INST;
+}
+
+return sscofpmf(env, csrno);
+}
+
 static RISCVException any(CPURISCVState *env, int csrno)
 {
 return RISCV_EXCP_NONE;
@@ -4972,91 +4981,91 @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
 [CSR_MHPMEVENT31]= { "mhpmevent31",any,read_mhpmevent,
  write_mhpmevent   },
 
-[CSR_MHPMEVENT3H]= { "mhpmevent3h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT3H]= { "mhpmevent3h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT4H]= { "mhpmevent4h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT4H]= { "mhpmevent4h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT5H]= { "mhpmevent5h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT5H]= { "mhpmevent5h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT6H]= { "mhpmevent6h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT6H]= { "mhpmevent6h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT7H]= { "mhpmevent7h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT7H]= { "mhpmevent7h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT8H]= { "mhpmevent8h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT8H]= { "mhpmevent8h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT9H]= { "mhpmevent9h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT9H]= { "mhpmevent9h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT10H]   = { "mhpmevent10h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT10H]   = { "mhpmevent10h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT11H]   = { "mhpmevent11h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT11H]   = { "mhpmevent11h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT12H]   = { "mhpmevent12h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT12H]   = { "mhpmevent12h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT13H]   = { "mhpmevent13h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT13H]   = { "mhpmevent13h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT14H]   = { "mhpmevent14h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT14H]   = { "mhpmevent14h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT15H]   = { "mhpmevent15h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT15H]   = { "mhpmevent15h",sscofpmf_32,  read_mhpmeventh,
  write_mhpmeventh,
  .min_priv_ver = PRIV_VERSION_1_12_0},
-[CSR_MHPMEVENT16H]   = { "mhpmevent16h",sscofpmf,  read_mhpmeventh,
+[CSR_MHPMEVENT16H]   = { "mhpmevent16h",sscofpmf_32,  read_mhpmeventh,

[PATCH v4 0/5] Add ISA extension smcntrpmf support

2024-01-08 Thread Atish Patra

This patch series adds support for the RISC-V ISA extension smcntrpmf (cycle and
instret privilege mode filtering) [1]. It is based on Kevin's earlier work but
improves on it by actually implementing privilege mode filtering, tracking the
privilege mode switches. This enables privilege mode filtering for the
mhpmcounters as well. However, Smcntrpmf/Sscofpmf must be enabled to leverage
this. The series also now reports the raw instruction count, instead of the
virtual CPU time derived from the instruction count, when icount is enabled.
The former seems to be the preferred approach for instruction count on other
architectures as well.

Please let me know if anybody thinks that's incorrect.

The series is also available at [2].

Changes from v3->v4:
1. Fixed the ordering of the ISA extension names in isa_edata_arr.
2. Added RB tags.

Changes from v2->v3:
1. Fixed the rebasing error in PATCH2.
2. Added RB tags.
3. Addressed other review comments. 

Changes from v1->v2:
1. Implemented actual mode filtering for both icount and host ticks mode.
2. Addressed comments in v1.
3. Added Kevin's personal email address.

[1] https://github.com/riscv/riscv-smcntrpmf
[2] https://github.com/atishp04/qemu/tree/smcntrpmf_v3

Atish Patra (2):
target/riscv: Fix the predicate functions for mhpmeventhX CSRs
target/riscv: Implement privilege mode filtering for cycle/instret

Kaiwen Xue (3):
target/riscv: Add cycle & instret privilege mode filtering properties
target/riscv: Add cycle & instret privilege mode filtering definitions
target/riscv: Add cycle & instret privilege mode filtering support

target/riscv/cpu.c|   2 +
target/riscv/cpu.h|  17 +++
target/riscv/cpu_bits.h   |  29 +
target/riscv/cpu_cfg.h|   1 +
target/riscv/cpu_helper.c |   9 +-
target/riscv/csr.c| 242 ++
target/riscv/pmu.c|  43 +++
target/riscv/pmu.h|   2 +
8 files changed, 292 insertions(+), 53 deletions(-)

--
2.34.1

[PATCH v11 08/10] hw/net: GMAC Rx Implementation

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

- Implementation of Receive function for packets
- Implementation for reading and writing from and to descriptors in
  memory for Rx

When RX starts, we need to flush the queued packets so that they
can be received by the GMAC device. Without this it won't work
with TAP NIC device.

When RX descriptor list is full, it returns a DMA_STATUS for
software to handle it. But there's no way to indicate the software has
handled all RX descriptors and the whole pipeline stalls.

We do something similar to NPCM7XX EMC to handle this case.

1. Return packet size when RX descriptor is full, effectively dropping
these packets in such a case.
2. When software clears RX descriptor full bit, continue receiving
further packets by flushing QEMU packet queue.

Added relevant trace-events

Change-Id: I132aa254a94cda1a586aba2ea33bbfc74ecdb831
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 hw/net/npcm_gmac.c  | 479 +++-
 hw/net/trace-events |   5 +
 2 files changed, 482 insertions(+), 2 deletions(-)

diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c
index 44c4ffaff4..c107e835b1 100644
--- a/hw/net/npcm_gmac.c
+++ b/hw/net/npcm_gmac.c
@@ -24,6 +24,10 @@
 #include "hw/net/mii.h"
 #include "hw/net/npcm_gmac.h"
 #include "migration/vmstate.h"
+#include "net/checksum.h"
+#include "net/eth.h"
+#include "net/net.h"
+#include "qemu/cutils.h"
 #include "qemu/log.h"
 #include "qemu/units.h"
 #include "sysemu/dma.h"
@@ -146,6 +150,17 @@ static void gmac_phy_set_link(NPCMGMACState *gmac, bool 
active)
 
 static bool gmac_can_receive(NetClientState *nc)
 {
+NPCMGMACState *gmac = NPCM_GMAC(qemu_get_nic_opaque(nc));
+
+/* If GMAC receive is disabled. */
+if (!(gmac->regs[R_NPCM_GMAC_MAC_CONFIG] & NPCM_GMAC_MAC_CONFIG_RX_EN)) {
+return false;
+}
+
+/* If GMAC DMA RX is stopped. */
+if (!(gmac->regs[R_NPCM_DMA_CONTROL] & NPCM_DMA_CONTROL_START_STOP_RX)) {
+return false;
+}
 return true;
 }
 
@@ -189,12 +204,438 @@ static void gmac_update_irq(NPCMGMACState *gmac)
 qemu_set_irq(gmac->irq, level);
 }
 
-static ssize_t gmac_receive(NetClientState *nc, const uint8_t *buf, size_t len)
+static int gmac_read_rx_desc(dma_addr_t addr, struct NPCMGMACRxDesc *desc)
 {
-/* Placeholder. Function will be filled in following patches */
+if (dma_memory_read(&address_space_memory, addr, desc,
+sizeof(*desc), MEMTXATTRS_UNSPECIFIED)) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to read descriptor @ 0x%"
+  HWADDR_PRIx "\n", __func__, addr);
+return -1;
+}
+desc->rdes0 = le32_to_cpu(desc->rdes0);
+desc->rdes1 = le32_to_cpu(desc->rdes1);
+desc->rdes2 = le32_to_cpu(desc->rdes2);
+desc->rdes3 = le32_to_cpu(desc->rdes3);
 return 0;
 }
 
+static int gmac_write_rx_desc(dma_addr_t addr, struct NPCMGMACRxDesc *desc)
+{
+struct NPCMGMACRxDesc le_desc;
+le_desc.rdes0 = cpu_to_le32(desc->rdes0);
+le_desc.rdes1 = cpu_to_le32(desc->rdes1);
+le_desc.rdes2 = cpu_to_le32(desc->rdes2);
+le_desc.rdes3 = cpu_to_le32(desc->rdes3);
+if (dma_memory_write(&address_space_memory, addr, &le_desc,
+sizeof(le_desc), MEMTXATTRS_UNSPECIFIED)) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to write descriptor @ 0x%"
+  HWADDR_PRIx "\n", __func__, addr);
+return -1;
+}
+return 0;
+}
+
+static int gmac_read_tx_desc(dma_addr_t addr, struct NPCMGMACTxDesc *desc)
+{
+if (dma_memory_read(&address_space_memory, addr, desc,
+sizeof(*desc), MEMTXATTRS_UNSPECIFIED)) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to read descriptor @ 0x%"
+  HWADDR_PRIx "\n", __func__, addr);
+return -1;
+}
+desc->tdes0 = le32_to_cpu(desc->tdes0);
+desc->tdes1 = le32_to_cpu(desc->tdes1);
+desc->tdes2 = le32_to_cpu(desc->tdes2);
+desc->tdes3 = le32_to_cpu(desc->tdes3);
+return 0;
+}
+
+static int gmac_write_tx_desc(dma_addr_t addr, struct NPCMGMACTxDesc *desc)
+{
+struct NPCMGMACTxDesc le_desc;
+le_desc.tdes0 = cpu_to_le32(desc->tdes0);
+le_desc.tdes1 = cpu_to_le32(desc->tdes1);
+le_desc.tdes2 = cpu_to_le32(desc->tdes2);
+le_desc.tdes3 = cpu_to_le32(desc->tdes3);
+if (dma_memory_write(&address_space_memory, addr, &le_desc,
+sizeof(le_desc), MEMTXATTRS_UNSPECIFIED)) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to write descriptor @ 0x%"
+  HWADDR_PRIx "\n", __func__, addr);
+return -1;
+}
+return 0;
+}
+
+static int gmac_rx_transfer_frame_to_buffer(uint32_t rx_buf_len,
+uint32_t *left_frame,
+uint32_t rx_buf_addr,
+bool *eof_transferred,
+const uint8_t **frame_ptr,
+

[PATCH v11 01/10] hw/misc: Add Nuvoton's PCI Mailbox Module

2024-01-08 Thread Nabih Estefan

From: Hao Wu 

The PCI Mailbox Module is a high-bandwidth communication module
between a Nuvoton BMC and CPU. It features 16KB of RAM that is
accessible by both the BMC and the core CPU, and supports interrupts
for both sides.

This patch implements the BMC side of the PCI mailbox module.
Communication with the core CPU is emulated via a chardev and
will be in a follow-up patch.

Change-Id: Iaca22f81c4526927d437aa367079ed038faf43f2
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 hw/arm/npcm7xx.c   |  15 +-
 hw/misc/meson.build|   1 +
 hw/misc/npcm7xx_pci_mbox.c | 324 +
 hw/misc/trace-events   |   5 +
 include/hw/arm/npcm7xx.h   |   1 +
 include/hw/misc/npcm7xx_pci_mbox.h |  81 
 6 files changed, 426 insertions(+), 1 deletion(-)
 create mode 100644 hw/misc/npcm7xx_pci_mbox.c
 create mode 100644 include/hw/misc/npcm7xx_pci_mbox.h

diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index 15ff21d047..1c3634ff45 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -53,6 +53,9 @@
 /* ADC Module */
 #define NPCM7XX_ADC_BA  (0xf000c000)
 
+/* PCI Mailbox Module */
+#define NPCM7XX_PCI_MBOX_BA (0xf0848000)
+
 /* Internal AHB SRAM */
 #define NPCM7XX_RAM3_BA (0xc0008000)
 #define NPCM7XX_RAM3_SZ (4 * KiB)
@@ -83,6 +86,9 @@ enum NPCM7xxInterrupt {
 NPCM7XX_UART1_IRQ,
 NPCM7XX_UART2_IRQ,
 NPCM7XX_UART3_IRQ,
+NPCM7XX_PCI_MBOX_IRQ= 8,
+NPCM7XX_KCS_HIB_IRQ = 9,
+NPCM7XX_GMAC1_IRQ   = 14,
 NPCM7XX_EMC1RX_IRQ  = 15,
 NPCM7XX_EMC1TX_IRQ,
 NPCM7XX_MMC_IRQ = 26,
@@ -706,6 +712,14 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
 }
 }
 
+/* PCI Mailbox. Cannot fail */
+sysbus_realize(SYS_BUS_DEVICE(&s->pci_mbox), &error_abort);
+sysbus_mmio_map(SYS_BUS_DEVICE(&s->pci_mbox), 0, NPCM7XX_PCI_MBOX_BA);
+sysbus_mmio_map(SYS_BUS_DEVICE(&s->pci_mbox), 1,
+NPCM7XX_PCI_MBOX_BA + NPCM7XX_PCI_MBOX_RAM_SIZE);
+sysbus_connect_irq(SYS_BUS_DEVICE(&s->pci_mbox), 0,
+   npcm7xx_irq(s, NPCM7XX_PCI_MBOX_IRQ));
+
 /* RAM2 (SRAM) */
 memory_region_init_ram(&s->sram, OBJECT(dev), "ram2",
 NPCM7XX_RAM2_SZ, &error_abort);
@@ -765,7 +779,6 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
 create_unimplemented_device("npcm7xx.usbd[8]",  0xf0838000,   4 * KiB);
 create_unimplemented_device("npcm7xx.usbd[9]",  0xf0839000,   4 * KiB);
 create_unimplemented_device("npcm7xx.sd",   0xf084,   8 * KiB);
-create_unimplemented_device("npcm7xx.pcimbx",   0xf0848000, 512 * KiB);
 create_unimplemented_device("npcm7xx.aes",  0xf0858000,   4 * KiB);
 create_unimplemented_device("npcm7xx.des",  0xf0859000,   4 * KiB);
 create_unimplemented_device("npcm7xx.sha",  0xf085a000,   4 * KiB);
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index 36c20d5637..0ead2e9ede 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -73,6 +73,7 @@ system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files(
   'npcm7xx_clk.c',
   'npcm7xx_gcr.c',
   'npcm7xx_mft.c',
+  'npcm7xx_pci_mbox.c',
   'npcm7xx_pwm.c',
   'npcm7xx_rng.c',
 ))
diff --git a/hw/misc/npcm7xx_pci_mbox.c b/hw/misc/npcm7xx_pci_mbox.c
new file mode 100644
index 00..c770ad6fcf
--- /dev/null
+++ b/hw/misc/npcm7xx_pci_mbox.c
@@ -0,0 +1,324 @@
+/*
+ * Nuvoton NPCM7xx PCI Mailbox Module
+ *
+ * Copyright 2021 Google LLC
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * for more details.
+ */
+
+#include "qemu/osdep.h"
+#include "chardev/char-fe.h"
+#include "hw/irq.h"
+#include "hw/qdev-clock.h"
+#include "hw/qdev-properties-system.h"
+#include "hw/misc/npcm7xx_pci_mbox.h"
+#include "hw/registerfields.h"
+#include "migration/vmstate.h"
+#include "qapi/error.h"
+#include "qapi/visitor.h"
+#include "qemu/bitops.h"
+#include "qemu/error-report.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "qemu/timer.h"
+#include "qemu/units.h"
+#include "trace.h"
+
+REG32(NPCM7XX_PCI_MBOX_BMBXSTAT, 0x00);
+REG32(NPCM7XX_PCI_MBOX_BMBXCTL, 0x04);
+REG32(NPCM7XX_PCI_MBOX_BMBXCMD, 0x08);
+
+enum NPCM7xxPCIMBoxOperation {
+NPCM7XX_PCI_MBOX_OP_READ = 1,
+NPCM7XX_PCI_MBOX_OP_WRITE,
+};
+
+#define NPCM7XX_PCI_MBOX_OFFSET_BYTES 8
+
+/* Response code */
+#define NPCM7XX_PCI_MBOX_OK 0
+#define NPCM7XX_PCI_MBOX_INVALID_OP 0xa0
+#define NPCM7XX_PCI_MBOX_INVALID_SIZE 0xa1
+#define

[PATCH v11 05/10] hw/arm: Add GMAC devices to NPCM7XX SoC

2024-01-08 Thread Nabih Estefan

From: Hao Wu 

Change-Id: Id8a3461fb5042adc4c3fd6f4fbd1ca0d33e22565
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 hw/arm/npcm7xx.c | 36 ++--
 include/hw/arm/npcm7xx.h |  2 ++
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index c9e87162cb..12e11250e1 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -91,6 +91,7 @@ enum NPCM7xxInterrupt {
 NPCM7XX_GMAC1_IRQ   = 14,
 NPCM7XX_EMC1RX_IRQ  = 15,
 NPCM7XX_EMC1TX_IRQ,
+NPCM7XX_GMAC2_IRQ,
 NPCM7XX_MMC_IRQ = 26,
 NPCM7XX_PSPI2_IRQ   = 28,
 NPCM7XX_PSPI1_IRQ   = 31,
@@ -234,6 +235,12 @@ static const hwaddr npcm7xx_pspi_addr[] = {
 0xf0201000,
 };
 
+/* Register base address for each GMAC Module */
+static const hwaddr npcm7xx_gmac_addr[] = {
+0xf0802000,
+0xf0804000,
+};
+
 static const struct {
 hwaddr regs_addr;
 uint32_t unconnected_pins;
@@ -462,6 +469,10 @@ static void npcm7xx_init(Object *obj)
 object_initialize_child(obj, "pspi[*]", &s->pspi[i], TYPE_NPCM_PSPI);
 }
 
+for (i = 0; i < ARRAY_SIZE(s->gmac); i++) {
+object_initialize_child(obj, "gmac[*]", &s->gmac[i], TYPE_NPCM_GMAC);
+}
+
 object_initialize_child(obj, "pci-mbox", &s->pci_mbox,
 TYPE_NPCM7XX_PCI_MBOX);
 object_initialize_child(obj, "mmc", &s->mmc, TYPE_NPCM7XX_SDHCI);
@@ -695,6 +706,29 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
 sysbus_connect_irq(sbd, 1, npcm7xx_irq(s, rx_irq));
 }
 
+/*
+ * GMAC Modules. Cannot fail.
+ */
+QEMU_BUILD_BUG_ON(ARRAY_SIZE(npcm7xx_gmac_addr) != ARRAY_SIZE(s->gmac));
+QEMU_BUILD_BUG_ON(ARRAY_SIZE(s->gmac) != 2);
+for (i = 0; i < ARRAY_SIZE(s->gmac); i++) {
+SysBusDevice *sbd = SYS_BUS_DEVICE(&s->gmac[i]);
+
+/*
+ * The device exists regardless of whether it's connected to a QEMU
+ * netdev backend. So always instantiate it even if there is no
+ * backend.
+ */
+sysbus_realize(sbd, &error_abort);
+sysbus_mmio_map(sbd, 0, npcm7xx_gmac_addr[i]);
+int irq = i == 0 ? NPCM7XX_GMAC1_IRQ : NPCM7XX_GMAC2_IRQ;
+/*
+ * N.B. The values for the second argument sysbus_connect_irq are
+ * chosen to match the registration order in npcm7xx_emc_realize.
+ */
+sysbus_connect_irq(sbd, 0, npcm7xx_irq(s, irq));
+}
+
 /*
  * Flash Interface Unit (FIU). Can fail if incorrect number of chip selects
  * specified, but this is a programming error.
@@ -765,8 +799,6 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
 create_unimplemented_device("npcm7xx.siox[2]",  0xf0102000,   4 * KiB);
 create_unimplemented_device("npcm7xx.ahbpci",   0xf040,   1 * MiB);
 create_unimplemented_device("npcm7xx.mcphy",0xf05f,  64 * KiB);
-create_unimplemented_device("npcm7xx.gmac1",0xf0802000,   8 * KiB);
-create_unimplemented_device("npcm7xx.gmac2",0xf0804000,   8 * KiB);
 create_unimplemented_device("npcm7xx.vcd",  0xf081,  64 * KiB);
 create_unimplemented_device("npcm7xx.ece",  0xf082,   8 * KiB);
 create_unimplemented_device("npcm7xx.vdma", 0xf0822000,   8 * KiB);
diff --git a/include/hw/arm/npcm7xx.h b/include/hw/arm/npcm7xx.h
index cec3792a2e..9e5cf639a2 100644
--- a/include/hw/arm/npcm7xx.h
+++ b/include/hw/arm/npcm7xx.h
@@ -30,6 +30,7 @@
 #include "hw/misc/npcm7xx_pwm.h"
 #include "hw/misc/npcm7xx_rng.h"
 #include "hw/net/npcm7xx_emc.h"
+#include "hw/net/npcm_gmac.h"
 #include "hw/nvram/npcm7xx_otp.h"
 #include "hw/timer/npcm7xx_timer.h"
 #include "hw/ssi/npcm7xx_fiu.h"
@@ -105,6 +106,7 @@ struct NPCM7xxState {
 OHCISysBusState ohci;
 NPCM7xxFIUState fiu[2];
 NPCM7xxEMCState emc[2];
+NPCMGMACState   gmac[2];
 NPCM7xxPCIMBoxState pci_mbox;
 NPCM7xxSDHCIState   mmc;
 NPCMPSPIState   pspi[2];
-- 
2.43.0.472.g3155946c3a-goog

[PATCH v11 04/10] hw/net: Add NPCMXXX GMAC device

2024-01-08 Thread Nabih Estefan

From: Hao Wu 

This patch implements the basic registers of the GMAC device and sets
up the registers needed for networking functionality.

Tested:
The following message shows up with the change:
Broadcom BCM54612E stmmac-0:00: attached PHY driver [Broadcom BCM54612E] (mii_bus:phy_addr=stmmac-0:00, irq=POLL)
stmmaceth f0802000.eth eth0: Link is Up - 1Gbps/Full - flow control rx/tx

Change-Id: If71c6d486b95edcccba109ba454870714d7e0940
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan Diaz 
Reviewed-by: Tyrone Ting 
---
 hw/net/meson.build |   2 +-
 hw/net/npcm_gmac.c | 424 +
 hw/net/trace-events|  11 +
 include/hw/net/npcm_gmac.h | 340 +
 4 files changed, 776 insertions(+), 1 deletion(-)
 create mode 100644 hw/net/npcm_gmac.c
 create mode 100644 include/hw/net/npcm_gmac.h

diff --git a/hw/net/meson.build b/hw/net/meson.build
index f64651c467..db6509f504 100644
--- a/hw/net/meson.build
+++ b/hw/net/meson.build
@@ -38,7 +38,7 @@ system_ss.add(when: 'CONFIG_I82596_COMMON', if_true: files('i82596.c'))
 system_ss.add(when: 'CONFIG_SUNHME', if_true: files('sunhme.c'))
 system_ss.add(when: 'CONFIG_FTGMAC100', if_true: files('ftgmac100.c'))
 system_ss.add(when: 'CONFIG_SUNGEM', if_true: files('sungem.c'))
-system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_emc.c'))
+system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_emc.c', 'npcm_gmac.c'))
 
 system_ss.add(when: 'CONFIG_ETRAXFS', if_true: files('etraxfs_eth.c'))
 system_ss.add(when: 'CONFIG_COLDFIRE', if_true: files('mcf_fec.c'))
diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c
new file mode 100644
index 00..98b3c33c94
--- /dev/null
+++ b/hw/net/npcm_gmac.c
@@ -0,0 +1,424 @@
+/*
+ * Nuvoton NPCM7xx/8xx GMAC Module
+ *
+ * Copyright 2022 Google LLC
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * for more details.
+ *
+ * Unsupported/unimplemented features:
+ * - MII is not implemented, MII_ADDR.BUSY and MII_DATA always return zero
+ * - Precision timestamp (PTP) is not implemented.
+ */
+
+#include "qemu/osdep.h"
+
+#include "hw/registerfields.h"
+#include "hw/net/mii.h"
+#include "hw/net/npcm_gmac.h"
+#include "migration/vmstate.h"
+#include "qemu/log.h"
+#include "qemu/units.h"
+#include "sysemu/dma.h"
+#include "trace.h"
+
+REG32(NPCM_DMA_BUS_MODE, 0x1000)
+REG32(NPCM_DMA_XMT_POLL_DEMAND, 0x1004)
+REG32(NPCM_DMA_RCV_POLL_DEMAND, 0x1008)
+REG32(NPCM_DMA_RX_BASE_ADDR, 0x100c)
+REG32(NPCM_DMA_TX_BASE_ADDR, 0x1010)
+REG32(NPCM_DMA_STATUS, 0x1014)
+REG32(NPCM_DMA_CONTROL, 0x1018)
+REG32(NPCM_DMA_INTR_ENA, 0x101c)
+REG32(NPCM_DMA_MISSED_FRAME_CTR, 0x1020)
+REG32(NPCM_DMA_HOST_TX_DESC, 0x1048)
+REG32(NPCM_DMA_HOST_RX_DESC, 0x104c)
+REG32(NPCM_DMA_CUR_TX_BUF_ADDR, 0x1050)
+REG32(NPCM_DMA_CUR_RX_BUF_ADDR, 0x1054)
+REG32(NPCM_DMA_HW_FEATURE, 0x1058)
+
+REG32(NPCM_GMAC_MAC_CONFIG, 0x0)
+REG32(NPCM_GMAC_FRAME_FILTER, 0x4)
+REG32(NPCM_GMAC_HASH_HIGH, 0x8)
+REG32(NPCM_GMAC_HASH_LOW, 0xc)
+REG32(NPCM_GMAC_MII_ADDR, 0x10)
+REG32(NPCM_GMAC_MII_DATA, 0x14)
+REG32(NPCM_GMAC_FLOW_CTRL, 0x18)
+REG32(NPCM_GMAC_VLAN_FLAG, 0x1c)
+REG32(NPCM_GMAC_VERSION, 0x20)
+REG32(NPCM_GMAC_WAKEUP_FILTER, 0x28)
+REG32(NPCM_GMAC_PMT, 0x2c)
+REG32(NPCM_GMAC_LPI_CTRL, 0x30)
+REG32(NPCM_GMAC_TIMER_CTRL, 0x34)
+REG32(NPCM_GMAC_INT_STATUS, 0x38)
+REG32(NPCM_GMAC_INT_MASK, 0x3c)
+REG32(NPCM_GMAC_MAC0_ADDR_HI, 0x40)
+REG32(NPCM_GMAC_MAC0_ADDR_LO, 0x44)
+REG32(NPCM_GMAC_MAC1_ADDR_HI, 0x48)
+REG32(NPCM_GMAC_MAC1_ADDR_LO, 0x4c)
+REG32(NPCM_GMAC_MAC2_ADDR_HI, 0x50)
+REG32(NPCM_GMAC_MAC2_ADDR_LO, 0x54)
+REG32(NPCM_GMAC_MAC3_ADDR_HI, 0x58)
+REG32(NPCM_GMAC_MAC3_ADDR_LO, 0x5c)
+REG32(NPCM_GMAC_RGMII_STATUS, 0xd8)
+REG32(NPCM_GMAC_WATCHDOG, 0xdc)
+REG32(NPCM_GMAC_PTP_TCR, 0x700)
+REG32(NPCM_GMAC_PTP_SSIR, 0x704)
+REG32(NPCM_GMAC_PTP_STSR, 0x708)
+REG32(NPCM_GMAC_PTP_STNSR, 0x70c)
+REG32(NPCM_GMAC_PTP_STSUR, 0x710)
+REG32(NPCM_GMAC_PTP_STNSUR, 0x714)
+REG32(NPCM_GMAC_PTP_TAR, 0x718)
+REG32(NPCM_GMAC_PTP_TTSR, 0x71c)
+
+/* Register Fields */
+#define NPCM_GMAC_MII_ADDR_BUSY BIT(0)
+#define NPCM_GMAC_MII_ADDR_WRITE BIT(1)
+#define NPCM_GMAC_MII_ADDR_GR(rv)   extract16((rv), 6, 5)
+#define NPCM_GMAC_MII_ADDR_PA(rv)   extract16((rv), 11, 5)
+
+#define NPCM_GMAC_INT_MASK_LPIIM BIT(10)
+#define NPCM_GMAC_INT_MASK_PMTM BIT(3)
+#define NPCM_GMAC_INT_MASK_RGIM BIT(0)
+
+#define NPCM_DMA_BUS_MODE_SWR   BIT(0)
+
+static const uint32_t npcm_gmac_cold_reset_values[NPCM_GMAC_NR_REGS] = {
+/*

[PATCH v11 06/10] tests/qtest: Creating qtest for GMAC Module

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

 - Create a qtest that checks the reset values of the GMAC module registers.
 - Add the test to the qtest build file.

Change-Id: I8b2fe152d3987a7eec4cf6a1d25ba92e75a5391d
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 tests/qtest/meson.build  |   1 +
 tests/qtest/npcm_gmac-test.c | 209 +++
 2 files changed, 210 insertions(+)
 create mode 100644 tests/qtest/npcm_gmac-test.c

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 2ac79925f9..aed8924be9 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -221,6 +221,7 @@ qtests_aarch64 = \
   (config_all_devices.has_key('CONFIG_RASPI') ? ['bcm2835-dma-test'] : []) +  \
   (config_all.has_key('CONFIG_TCG') and \
    config_all_devices.has_key('CONFIG_TPM_TIS_I2C') ? ['tpm-tis-i2c-test'] : []) + \
+  (config_all_devices.has_key('CONFIG_NPCM7XX') ? qtests_npcm7xx : []) + \
   ['arm-cpu-features',
'numa-test',
'boot-serial-test',
diff --git a/tests/qtest/npcm_gmac-test.c b/tests/qtest/npcm_gmac-test.c
new file mode 100644
index 00..130a1599a8
--- /dev/null
+++ b/tests/qtest/npcm_gmac-test.c
@@ -0,0 +1,209 @@
+/*
+ * QTests for Nuvoton NPCM7xx/8xx GMAC Modules.
+ *
+ * Copyright 2023 Google LLC
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * for more details.
+ */
+
+#include "qemu/osdep.h"
+#include "libqos/libqos.h"
+
+/* Name of the GMAC Device */
+#define TYPE_NPCM_GMAC "npcm-gmac"
+
+typedef struct GMACModule {
+int irq;
+uint64_t base_addr;
+} GMACModule;
+
+typedef struct TestData {
+const GMACModule *module;
+} TestData;
+
+/* Values extracted from hw/arm/npcm8xx.c */
+static const GMACModule gmac_module_list[] = {
+{
+.irq= 14,
+.base_addr  = 0xf0802000
+},
+{
+.irq= 15,
+.base_addr  = 0xf0804000
+},
+{
+.irq= 16,
+.base_addr  = 0xf0806000
+},
+{
+.irq= 17,
+.base_addr  = 0xf0808000
+}
+};
+
+/* Returns the index of the GMAC module. */
+static int gmac_module_index(const GMACModule *mod)
+{
+ptrdiff_t diff = mod - gmac_module_list;
+
+g_assert_true(diff >= 0 && diff < ARRAY_SIZE(gmac_module_list));
+
+return diff;
+}
+
+/* 32-bit register indices. Taken from npcm_gmac.c */
+typedef enum NPCMRegister {
+/* DMA Registers */
+NPCM_DMA_BUS_MODE = 0x1000,
+NPCM_DMA_XMT_POLL_DEMAND = 0x1004,
+NPCM_DMA_RCV_POLL_DEMAND = 0x1008,
+NPCM_DMA_RCV_BASE_ADDR = 0x100c,
+NPCM_DMA_TX_BASE_ADDR = 0x1010,
+NPCM_DMA_STATUS = 0x1014,
+NPCM_DMA_CONTROL = 0x1018,
+NPCM_DMA_INTR_ENA = 0x101c,
+NPCM_DMA_MISSED_FRAME_CTR = 0x1020,
+NPCM_DMA_HOST_TX_DESC = 0x1048,
+NPCM_DMA_HOST_RX_DESC = 0x104c,
+NPCM_DMA_CUR_TX_BUF_ADDR = 0x1050,
+NPCM_DMA_CUR_RX_BUF_ADDR = 0x1054,
+NPCM_DMA_HW_FEATURE = 0x1058,
+
+/* GMAC Registers */
+NPCM_GMAC_MAC_CONFIG = 0x0,
+NPCM_GMAC_FRAME_FILTER = 0x4,
+NPCM_GMAC_HASH_HIGH = 0x8,
+NPCM_GMAC_HASH_LOW = 0xc,
+NPCM_GMAC_MII_ADDR = 0x10,
+NPCM_GMAC_MII_DATA = 0x14,
+NPCM_GMAC_FLOW_CTRL = 0x18,
+NPCM_GMAC_VLAN_FLAG = 0x1c,
+NPCM_GMAC_VERSION = 0x20,
+NPCM_GMAC_WAKEUP_FILTER = 0x28,
+NPCM_GMAC_PMT = 0x2c,
+NPCM_GMAC_LPI_CTRL = 0x30,
+NPCM_GMAC_TIMER_CTRL = 0x34,
+NPCM_GMAC_INT_STATUS = 0x38,
+NPCM_GMAC_INT_MASK = 0x3c,
+NPCM_GMAC_MAC0_ADDR_HI = 0x40,
+NPCM_GMAC_MAC0_ADDR_LO = 0x44,
+NPCM_GMAC_MAC1_ADDR_HI = 0x48,
+NPCM_GMAC_MAC1_ADDR_LO = 0x4c,
+NPCM_GMAC_MAC2_ADDR_HI = 0x50,
+NPCM_GMAC_MAC2_ADDR_LO = 0x54,
+NPCM_GMAC_MAC3_ADDR_HI = 0x58,
+NPCM_GMAC_MAC3_ADDR_LO = 0x5c,
+NPCM_GMAC_RGMII_STATUS = 0xd8,
+NPCM_GMAC_WATCHDOG = 0xdc,
+NPCM_GMAC_PTP_TCR = 0x700,
+NPCM_GMAC_PTP_SSIR = 0x704,
+NPCM_GMAC_PTP_STSR = 0x708,
+NPCM_GMAC_PTP_STNSR = 0x70c,
+NPCM_GMAC_PTP_STSUR = 0x710,
+NPCM_GMAC_PTP_STNSUR = 0x714,
+NPCM_GMAC_PTP_TAR = 0x718,
+NPCM_GMAC_PTP_TTSR = 0x71c,
+} NPCMRegister;
+
+static uint32_t gmac_read(QTestState *qts, const GMACModule *mod,
+  NPCMRegister regno)
+{
+return qtest_readl(qts, mod->base_addr + regno);
+}
+
+/* Check that GMAC registers are reset to default value */
+static void test_init(gconstpointer test_data)
+{
+const TestData *td = test_data;
+const GMACModule *mod = td->module;
+QTestState *qts = qtest_init("-machine npcm845-evb");
+
+#define CHECK_REG32(regno,

[PATCH v11 09/10] hw/net: GMAC Tx Implementation

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

- Implementation of Transmit function for packets
- Implementation for reading and writing from and to descriptors in
  memory for Tx

Added relevant trace-events

NOTE: This function implements the steps detailed in the datasheet for
transmitting messages from the GMAC.

Change-Id: Icf14f9fcc6cc7808a41acd872bca67c9832087e6
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 hw/net/trace-events | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/net/trace-events b/hw/net/trace-events
index f91b1a4a3d..78efa2ec2c 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -478,7 +478,9 @@ npcm_gmac_packet_desc_read(const char* name, uint32_t desc_addr) "%s: attempting
 npcm_gmac_packet_receive(const char* name, uint32_t len) "%s: RX packet length: 0x%04" PRIX32
 npcm_gmac_packet_receiving_buffer(const char* name, uint32_t buf_len, uint32_t rx_buf_addr) "%s: Receiving into Buffer size: 0x%04" PRIX32 " at address 0x%04" PRIX32
 npcm_gmac_packet_received(const char* name, uint32_t len) "%s: Reception finished, packet left: 0x%04" PRIX32
+npcm_gmac_packet_sent(const char* name, uint16_t len) "%s: TX packet sent!, length: 0x%04" PRIX16
 npcm_gmac_debug_desc_data(const char* name, void* addr, uint32_t des0, uint32_t des1, uint32_t des2, uint32_t des3)"%s: Address: %p Descriptor 0: 0x%04" PRIX32 " Descriptor 1: 0x%04" PRIX32 "Descriptor 2: 0x%04" PRIX32 " Descriptor 3: 0x%04" PRIX32
+npcm_gmac_packet_tx_desc_data(const char* name, uint32_t tdes0, uint32_t tdes1) "%s: Tdes0: 0x%04" PRIX32 " Tdes1: 0x%04" PRIX32
 
 # npcm_pcs.c
 npcm_pcs_reg_read(const char *name, uint16_t indirect_access_baes, uint64_t offset, uint16_t value) "%s: IND: 0x%02" PRIx16 " offset: 0x%04" PRIx64 " value: 0x%04" PRIx16
-- 
2.43.0.472.g3155946c3a-goog

[PATCH v11 02/10] hw/arm: Add PCI mailbox module to Nuvoton SoC

2024-01-08 Thread Nabih Estefan

From: Hao Wu 

This patch wires the PCI mailbox module to the Nuvoton SoC.

Change-Id: I14c42c628258804030f0583889882842bde0d972
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 docs/system/arm/nuvoton.rst | 2 ++
 hw/arm/npcm7xx.c| 2 ++
 include/hw/arm/npcm7xx.h| 1 +
 3 files changed, 5 insertions(+)

diff --git a/docs/system/arm/nuvoton.rst b/docs/system/arm/nuvoton.rst
index 0424cae4b0..e611099545 100644
--- a/docs/system/arm/nuvoton.rst
+++ b/docs/system/arm/nuvoton.rst
@@ -50,6 +50,8 @@ Supported devices
  * Ethernet controller (EMC)
  * Tachometer
  * Peripheral SPI controller (PSPI)
+ * BIOS POST code FIFO
+ * PCI Mailbox
 
 Missing devices
 ---
diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index 1c3634ff45..c9e87162cb 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -462,6 +462,8 @@ static void npcm7xx_init(Object *obj)
 object_initialize_child(obj, "pspi[*]", &s->pspi[i], TYPE_NPCM_PSPI);
 }
 
+object_initialize_child(obj, "pci-mbox", &s->pci_mbox,
+TYPE_NPCM7XX_PCI_MBOX);
 object_initialize_child(obj, "mmc", &s->mmc, TYPE_NPCM7XX_SDHCI);
 }
 
diff --git a/include/hw/arm/npcm7xx.h b/include/hw/arm/npcm7xx.h
index 273090ac60..cec3792a2e 100644
--- a/include/hw/arm/npcm7xx.h
+++ b/include/hw/arm/npcm7xx.h
@@ -105,6 +105,7 @@ struct NPCM7xxState {
 OHCISysBusState ohci;
 NPCM7xxFIUState fiu[2];
 NPCM7xxEMCState emc[2];
+NPCM7xxPCIMBoxState pci_mbox;
 NPCM7xxSDHCIState   mmc;
 NPCMPSPIState   pspi[2];
 };
-- 
2.43.0.472.g3155946c3a-goog

[PATCH v11 07/10] include/hw/net: GMAC IRQ Implementation

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

Implement Update IRQ Method for GMAC functionality.

Added relevant trace-events

Change-Id: I7a2d3cd3f493278bcd0cf483233c1e05c37488b7
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 hw/net/npcm_gmac.c  | 40 
 hw/net/trace-events |  1 +
 2 files changed, 41 insertions(+)

diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c
index 98b3c33c94..44c4ffaff4 100644
--- a/hw/net/npcm_gmac.c
+++ b/hw/net/npcm_gmac.c
@@ -149,6 +149,46 @@ static bool gmac_can_receive(NetClientState *nc)
 return true;
 }
 
+/*
+ * Function that updates the GMAC IRQ
+ * It finds the logical OR of the enabled bits for NIS (if enabled)
+ * It finds the logical OR of the enabled bits for AIS (if enabled)
+ */
+static void gmac_update_irq(NPCMGMACState *gmac)
+{
+/*
+ * Check if the normal interrupts summary is enabled
+ * if so, add the bits for the summary that are enabled
+ */
+if (gmac->regs[R_NPCM_DMA_INTR_ENA] & gmac->regs[R_NPCM_DMA_STATUS] &
+(NPCM_DMA_INTR_ENAB_NIE_BITS)) {
+gmac->regs[R_NPCM_DMA_STATUS] |=  NPCM_DMA_STATUS_NIS;
+}
+/*
+ * Check if the abnormal interrupts summary is enabled
+ * if so, add the bits for the summary that are enabled
+ */
+if (gmac->regs[R_NPCM_DMA_INTR_ENA] & gmac->regs[R_NPCM_DMA_STATUS] &
+(NPCM_DMA_INTR_ENAB_AIE_BITS)) {
+gmac->regs[R_NPCM_DMA_STATUS] |=  NPCM_DMA_STATUS_AIS;
+}
+
+/* Get the logical OR of both normal and abnormal interrupts */
+int level = !!((gmac->regs[R_NPCM_DMA_STATUS] &
+gmac->regs[R_NPCM_DMA_INTR_ENA] &
+NPCM_DMA_STATUS_NIS) |
+   (gmac->regs[R_NPCM_DMA_STATUS] &
+   gmac->regs[R_NPCM_DMA_INTR_ENA] &
+   NPCM_DMA_STATUS_AIS));
+
+/* Set the IRQ */
+trace_npcm_gmac_update_irq(DEVICE(gmac)->canonical_path,
+   gmac->regs[R_NPCM_DMA_STATUS],
+   gmac->regs[R_NPCM_DMA_INTR_ENA],
+   level);
+qemu_set_irq(gmac->irq, level);
+}
+
 static ssize_t gmac_receive(NetClientState *nc, const uint8_t *buf, size_t len)
 {
 /* Placeholder. Function will be filled in following patches */
diff --git a/hw/net/trace-events b/hw/net/trace-events
index 33514548b8..56057de47f 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -473,6 +473,7 @@ npcm_gmac_reg_write(const char *name, uint64_t offset, uint32_t value) "%s: offs
 npcm_gmac_mdio_access(const char *name, uint8_t is_write, uint8_t pa, uint8_t gr, uint16_t val) "%s: is_write: %" PRIu8 " pa: %" PRIu8 " gr: %" PRIu8 " val: 0x%04" PRIx16
 npcm_gmac_reset(const char *name, uint16_t value) "%s: phy_regs[0][1]: 0x%04" PRIx16
 npcm_gmac_set_link(bool active) "Set link: active=%u"
+npcm_gmac_update_irq(const char *name, uint32_t status, uint32_t intr_en, int level) "%s: Status Reg: 0x%04" PRIX32 " Interrupt Enable Reg: 0x%04" PRIX32 " IRQ Set: %d"
 
 # npcm_pcs.c
 npcm_pcs_reg_read(const char *name, uint16_t indirect_access_baes, uint64_t offset, uint16_t value) "%s: IND: 0x%02" PRIx16 " offset: 0x%04" PRIx64 " value: 0x%04" PRIx16
-- 
2.43.0.472.g3155946c3a-goog
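
For readers following the IRQ logic in gmac_update_irq(), here is a small standalone sketch of the same two-stage summary scheme: individual status bits are first folded into the NIS/AIS summary bits when individually enabled, and the line level is then the OR of the summary bits that are themselves enabled. The DEMO_* masks below are assumptions chosen for illustration (loosely modelled on a DesignWare-style DMA status layout); the real values live in include/hw/net/npcm_gmac.h and may differ. demo_update_irq is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define DEMO_STATUS_NIS    (1u << 16)   /* normal interrupt summary */
#define DEMO_STATUS_AIS    (1u << 15)   /* abnormal interrupt summary */
/* Assumed groups of individual sources feeding each summary bit. */
#define DEMO_NORMAL_BITS   ((1u << 0) | (1u << 2) | (1u << 6) | (1u << 14))
#define DEMO_ABNORMAL_BITS ((1u << 1) | (1u << 3) | (1u << 4) | (1u << 5) | \
                            (1u << 7) | (1u << 8) | (1u << 9) | (1u << 10) | \
                            (1u << 13))

/*
 * Fold pending and enabled source bits into the summary bits of *status,
 * then return the IRQ level: true if an enabled summary bit is set.
 */
static bool demo_update_irq(uint32_t *status, uint32_t intr_ena)
{
    assert(status != NULL);

    if (*status & intr_ena & DEMO_NORMAL_BITS) {
        *status |= DEMO_STATUS_NIS;
    }
    if (*status & intr_ena & DEMO_ABNORMAL_BITS) {
        *status |= DEMO_STATUS_AIS;
    }
    return (*status & intr_ena & (DEMO_STATUS_NIS | DEMO_STATUS_AIS)) != 0;
}
```

Note that, as in the patch, a source bit that is enabled while its summary enable is clear still sets the summary bit in the status word but does not raise the line.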

[PATCH v11 10/10] tests/qtest: Adding PCS Module test to GMAC Qtest

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

 - Add PCS Register check to npcm_gmac-test

Change-Id: I34821beb5e0b1e89e2be576ab58eabe41545af12
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 tests/qtest/npcm_gmac-test.c | 132 +++
 1 file changed, 132 insertions(+)

diff --git a/tests/qtest/npcm_gmac-test.c b/tests/qtest/npcm_gmac-test.c
index 130a1599a8..b64515794b 100644
--- a/tests/qtest/npcm_gmac-test.c
+++ b/tests/qtest/npcm_gmac-test.c
@@ -20,6 +20,10 @@
 /* Name of the GMAC Device */
 #define TYPE_NPCM_GMAC "npcm-gmac"
 
+/* Address of the PCS Module */
+#define PCS_BASE_ADDRESS 0xf078
+#define NPCM_PCS_IND_AC_BA 0x1fe
+
 typedef struct GMACModule {
 int irq;
 uint64_t base_addr;
@@ -111,6 +115,62 @@ typedef enum NPCMRegister {
 NPCM_GMAC_PTP_STNSUR = 0x714,
 NPCM_GMAC_PTP_TAR = 0x718,
 NPCM_GMAC_PTP_TTSR = 0x71c,
+
+/* PCS Registers */
+NPCM_PCS_SR_CTL_ID1 = 0x3c0008,
+NPCM_PCS_SR_CTL_ID2 = 0x3c000a,
+NPCM_PCS_SR_CTL_STS = 0x3c0010,
+
+NPCM_PCS_SR_MII_CTRL = 0x3e,
+NPCM_PCS_SR_MII_STS = 0x3e0002,
+NPCM_PCS_SR_MII_DEV_ID1 = 0x3e0004,
+NPCM_PCS_SR_MII_DEV_ID2 = 0x3e0006,
+NPCM_PCS_SR_MII_AN_ADV = 0x3e0008,
+NPCM_PCS_SR_MII_LP_BABL = 0x3e000a,
+NPCM_PCS_SR_MII_AN_EXPN = 0x3e000c,
+NPCM_PCS_SR_MII_EXT_STS = 0x3e001e,
+
+NPCM_PCS_SR_TIM_SYNC_ABL = 0x3e0e10,
+NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_LWR = 0x3e0e12,
+NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_UPR = 0x3e0e14,
+NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_LWR = 0x3e0e16,
+NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_UPR = 0x3e0e18,
+NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_LWR = 0x3e0e1a,
+NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_UPR = 0x3e0e1c,
+NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_LWR = 0x3e0e1e,
+NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_UPR = 0x3e0e20,
+
+NPCM_PCS_VR_MII_MMD_DIG_CTRL1 = 0x3f,
+NPCM_PCS_VR_MII_AN_CTRL = 0x3f0002,
+NPCM_PCS_VR_MII_AN_INTR_STS = 0x3f0004,
+NPCM_PCS_VR_MII_TC = 0x3f0006,
+NPCM_PCS_VR_MII_DBG_CTRL = 0x3f000a,
+NPCM_PCS_VR_MII_EEE_MCTRL0 = 0x3f000c,
+NPCM_PCS_VR_MII_EEE_TXTIMER = 0x3f0010,
+NPCM_PCS_VR_MII_EEE_RXTIMER = 0x3f0012,
+NPCM_PCS_VR_MII_LINK_TIMER_CTRL = 0x3f0014,
+NPCM_PCS_VR_MII_EEE_MCTRL1 = 0x3f0016,
+NPCM_PCS_VR_MII_DIG_STS = 0x3f0020,
+NPCM_PCS_VR_MII_ICG_ERRCNT1 = 0x3f0022,
+NPCM_PCS_VR_MII_MISC_STS = 0x3f0030,
+NPCM_PCS_VR_MII_RX_LSTS = 0x3f0040,
+NPCM_PCS_VR_MII_MP_TX_BSTCTRL0 = 0x3f0070,
+NPCM_PCS_VR_MII_MP_TX_LVLCTRL0 = 0x3f0074,
+NPCM_PCS_VR_MII_MP_TX_GENCTRL0 = 0x3f007a,
+NPCM_PCS_VR_MII_MP_TX_GENCTRL1 = 0x3f007c,
+NPCM_PCS_VR_MII_MP_TX_STS = 0x3f0090,
+NPCM_PCS_VR_MII_MP_RX_GENCTRL0 = 0x3f00b0,
+NPCM_PCS_VR_MII_MP_RX_GENCTRL1 = 0x3f00b2,
+NPCM_PCS_VR_MII_MP_RX_LOS_CTRL0 = 0x3f00ba,
+NPCM_PCS_VR_MII_MP_MPLL_CTRL0 = 0x3f00f0,
+NPCM_PCS_VR_MII_MP_MPLL_CTRL1 = 0x3f00f2,
+NPCM_PCS_VR_MII_MP_MPLL_STS = 0x3f0110,
+NPCM_PCS_VR_MII_MP_MISC_CTRL2 = 0x3f0126,
+NPCM_PCS_VR_MII_MP_LVL_CTRL = 0x3f0130,
+NPCM_PCS_VR_MII_MP_MISC_CTRL0 = 0x3f0132,
+NPCM_PCS_VR_MII_MP_MISC_CTRL1 = 0x3f0134,
+NPCM_PCS_VR_MII_DIG_CTRL2 = 0x3f01c2,
+NPCM_PCS_VR_MII_DIG_ERRCNT_SEL = 0x3f01c4,
 } NPCMRegister;
 
 static uint32_t gmac_read(QTestState *qts, const GMACModule *mod,
@@ -119,6 +179,15 @@ static uint32_t gmac_read(QTestState *qts, const GMACModule *mod,
 return qtest_readl(qts, mod->base_addr + regno);
 }
 
+static uint16_t pcs_read(QTestState *qts, const GMACModule *mod,
+  NPCMRegister regno)
+{
+uint32_t write_value = (regno & 0x3ffe00) >> 9;
+qtest_writel(qts, PCS_BASE_ADDRESS + NPCM_PCS_IND_AC_BA, write_value);
+uint32_t read_offset = regno & 0x1ff;
+return qtest_readl(qts, PCS_BASE_ADDRESS + read_offset);
+}
+
 /* Check that GMAC registers are reset to default value */
 static void test_init(gconstpointer test_data)
 {
@@ -131,6 +200,11 @@ static void test_init(gconstpointer test_data)
 g_assert_cmphex(gmac_read(qts, mod, (regno)), ==, (value)); \
 } while (0)
 
+#define CHECK_REG_PCS(regno, value) \
+do { \
+g_assert_cmphex(pcs_read(qts, mod, (regno)), ==, (value)); \
+} while (0)
+
 CHECK_REG32(NPCM_DMA_BUS_MODE, 0x00020100);
 CHECK_REG32(NPCM_DMA_XMT_POLL_DEMAND, 0);
 CHECK_REG32(NPCM_DMA_RCV_POLL_DEMAND, 0);
@@ -180,6 +254,64 @@ static void test_init(gconstpointer test_data)
 CHECK_REG32(NPCM_GMAC_PTP_TAR, 0);
 CHECK_REG32(NPCM_GMAC_PTP_TTSR, 0);
 
+/* TODO Add registers PCS */
+if (mod->base_addr == 0xf0802000) {
+CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID1, 0x699e);
+CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID2, 0);
+CHECK_REG_PCS(NPCM_PCS_SR_CTL_STS, 0x8000);
+
+CHECK_REG_PCS(NPCM_PCS_SR_MII_CTRL, 0x1140);
+CHECK_REG_PCS(NPCM_PCS_SR_MII_STS, 0x0109);
+CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID1, 0x699e);
+CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID2, 0x0ced0);
+

[PATCH v11 03/10] hw/misc: Add qtest for NPCM7xx PCI Mailbox

2024-01-08 Thread Nabih Estefan

From: Hao Wu 

This patch adds a qtest for the NPCM7XX PCI Mailbox module.
It sends read and write requests to the module, and verifies that
the module contains the correct data after the requests.

Change-Id: I2e1dbaecf8be9ec7eab55cb54f7fdeb0715b8275
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 tests/qtest/meson.build |   1 +
 tests/qtest/npcm7xx_pci_mbox-test.c | 238 
 2 files changed, 239 insertions(+)
 create mode 100644 tests/qtest/npcm7xx_pci_mbox-test.c

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 47dabf91d0..2ac79925f9 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -183,6 +183,7 @@ qtests_sparc64 = \
 qtests_npcm7xx = \
   ['npcm7xx_adc-test',
'npcm7xx_gpio-test',
+   'npcm7xx_pci_mbox-test',
'npcm7xx_pwm-test',
'npcm7xx_rng-test',
'npcm7xx_sdhci-test',
diff --git a/tests/qtest/npcm7xx_pci_mbox-test.c b/tests/qtest/npcm7xx_pci_mbox-test.c
new file mode 100644
index 00..24eec18e3c
--- /dev/null
+++ b/tests/qtest/npcm7xx_pci_mbox-test.c
@@ -0,0 +1,238 @@
+/*
+ * QTests for Nuvoton NPCM7xx PCI Mailbox Modules.
+ *
+ * Copyright 2021 Google LLC
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * for more details.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/bitops.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qnum.h"
+#include "libqtest-single.h"
+
+#define PCI_MBOX_BA 0xf0848000
+#define PCI_MBOX_IRQ 8
+
+/* register offset */
+#define PCI_MBOX_STAT   0x00
+#define PCI_MBOX_CTL    0x04
+#define PCI_MBOX_CMD    0x08
+
+#define CODE_OK 0x00
+#define CODE_INVALID_OP 0xa0
+#define CODE_INVALID_SIZE   0xa1
+#define CODE_ERROR  0xff
+
+#define OP_READ 0x01
+#define OP_WRITE    0x02
+#define OP_INVALID  0x41
+
+
+static int sock;
+static int fd;
+
+/*
+ * Create a local TCP socket with any port, then save off the port we got.
+ */
+static in_port_t open_socket(void)
+{
+struct sockaddr_in myaddr;
+socklen_t addrlen;
+
+myaddr.sin_family = AF_INET;
+myaddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+myaddr.sin_port = 0;
+sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
+g_assert(sock != -1);
+g_assert(bind(sock, (struct sockaddr *) &myaddr, sizeof(myaddr)) != -1);
+addrlen = sizeof(myaddr);
+g_assert(getsockname(sock, (struct sockaddr *) &myaddr, &addrlen) != -1);
+g_assert(listen(sock, 1) != -1);
+return ntohs(myaddr.sin_port);
+}
+
+static void setup_fd(void)
+{
+fd_set readfds;
+
+FD_ZERO(&readfds);
+FD_SET(sock, &readfds);
+g_assert(select(sock + 1, &readfds, NULL, NULL, NULL) == 1);
+
+fd = accept(sock, NULL, 0);
+g_assert(fd >= 0);
+}
+
+static uint8_t read_response(uint8_t *buf, size_t len)
+{
+uint8_t code;
+ssize_t ret = read(fd, &code, 1);
+
+if (ret == -1) {
+return CODE_ERROR;
+}
+if (code != CODE_OK) {
+return code;
+}
+g_test_message("response code: %x", code);
+if (len > 0) {
+ret = read(fd, buf, len);
+if (ret < len) {
+return CODE_ERROR;
+}
+}
+return CODE_OK;
+}
+
+static void receive_data(uint64_t offset, uint8_t *buf, size_t len)
+{
+uint8_t op = OP_READ;
+uint8_t code;
+ssize_t rv;
+
+while (len > 0) {
+uint8_t size;
+
+if (len >= 8) {
+size = 8;
+} else if (len >= 4) {
+size = 4;
+} else if (len >= 2) {
+size = 2;
+} else {
+size = 1;
+}
+
+g_test_message("receiving %u bytes", size);
+/* Write op */
+rv = write(fd, &op, 1);
+g_assert_cmpint(rv, ==, 1);
+/* Write offset */
+rv = write(fd, (uint8_t *)&offset, sizeof(uint64_t));
+g_assert_cmpint(rv, ==, sizeof(uint64_t));
+/* Write size */
+g_assert_cmpint(write(fd, &size, 1), ==, 1);
+
+/* Read data and Expect response */
+code = read_response(buf, size);
+g_assert_cmphex(code, ==, CODE_OK);
+
+buf += size;
+offset += size;
+len -= size;
+}
+}
+
+static void send_data(uint64_t offset, const uint8_t *buf, size_t len)
+{
+uint8_t op = OP_WRITE;
+uint8_t code;
+ssize_t rv;
+
+while (len > 0) {
+uint8_t size;
+
+if (len >= 8) {
+size = 8;
+} else if (len >= 4) {
+size = 4;
+} else if (len >= 2) {
+size = 2;
+} else {
+size = 1;
+}
+
+

[PATCH v11 00/10] Implementation of NPI Mailbox and GMAC Networking Module

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

[Changes since v10]
Fixed macOS build issue. Changed imports to not be linux-specific.

[Changes since v9]
More cleanup and fixes based on suggestions from Peter Maydell
(peter.mayd...@linaro.org).

[Changes since v8]
Suggestions and Fixes from Peter Maydell (peter.mayd...@linaro.org),
also cleaned up changes so nothing is deleted in a later patch that was
added in an earlier patch. Patch count decreased by 1 because this cleanup
led to one of the patches being irrelevant.

[Changes since v7]
Fixed patch 4 declaration of new NIC based on comments by Peter Maydell
(peter.mayd...@linaro.org)

[Changes since v6]
Remove the Change-Ids from the commit messages.

[Changes since v5]
Undid the removal of some qtests that seems to have been caused by a merge
conflict.

[Changes since v4]
Added Signed-off-by tag and fixed patch 4 commit message as suggested by
Peter Maydell (peter.mayd...@linaro.org)

[Changes since v3]
Fixed comments from Hao Wu (wuhao...@google.com)

[Changes since v2]
Fixed bugs related to the RC functionality of the GMAC. Added and
squashed patches related to that.

[Changes since v1]
Fixed some errors in formatting.
Fixed a merge error that I didn't see in v1.
Removed Nuvoton 8xx references since that is a separate patch set.

[Original Cover]
Creates NPI Mailbox Module with data verification for read and write
(internal and external), wiring to the Nuvoton SoC, and QTests.

Also creates the GMAC Networking Module. Implements read and write
functionalities with corresponding descriptors and registers. Also includes
QTests for the different functionalities.

Hao Wu (5):
  hw/misc: Add Nuvoton's PCI Mailbox Module
  hw/arm: Add PCI mailbox module to Nuvoton SoC
  hw/misc: Add qtest for NPCM7xx PCI Mailbox
  hw/net: Add NPCMXXX GMAC device
  hw/arm: Add GMAC devices to NPCM7XX SoC

Nabih Estefan Diaz (5):
  tests/qtest: Creating qtest for GMAC Module
  include/hw/net: GMAC IRQ Implementation
  hw/net: GMAC Rx Implementation
  hw/net: GMAC Tx Implementation
  tests/qtest: Adding PCS Module test to GMAC Qtest

 docs/system/arm/nuvoton.rst |   2 +
 hw/arm/npcm7xx.c|  53 +-
 hw/misc/meson.build |   1 +
 hw/misc/npcm7xx_pci_mbox.c  | 324 ++
 hw/misc/trace-events|   5 +
 hw/net/meson.build  |   2 +-
 hw/net/npcm_gmac.c  | 939 
 hw/net/trace-events |  19 +
 include/hw/arm/npcm7xx.h|   4 +
 include/hw/misc/npcm7xx_pci_mbox.h  |  81 +++
 include/hw/net/npcm_gmac.h  | 340 ++
 tests/qtest/meson.build |   2 +
 tests/qtest/npcm7xx_pci_mbox-test.c | 238 +++
 tests/qtest/npcm_gmac-test.c| 341 ++
 14 files changed, 2347 insertions(+), 4 deletions(-)
 create mode 100644 hw/misc/npcm7xx_pci_mbox.c
 create mode 100644 hw/net/npcm_gmac.c
 create mode 100644 include/hw/misc/npcm7xx_pci_mbox.h
 create mode 100644 include/hw/net/npcm_gmac.h
 create mode 100644 tests/qtest/npcm7xx_pci_mbox-test.c
 create mode 100644 tests/qtest/npcm_gmac-test.c

-- 
2.43.0.472.g3155946c3a-goog

Re: [PATCH] tests/avocado/reverse_debugging: Disable the ppc64 tests by default

2024-01-08 Thread John Snow

On Thu, Nov 23, 2023 at 5:53 AM Peter Maydell  wrote:
>
> On Mon, 20 Nov 2023 at 19:19, John Snow  wrote:
> >
> > On Wed, Nov 15, 2023 at 12:23 PM Daniel P. Berrangé  
> > wrote:
> > > The Python  Machine() class has passed one of a pre-created socketpair
> > > FDs for the serial port chardev. The guest is trying to write to this
> > > and blocking.  Nothing in the Machine() class is reading from the
> > > other end of the serial port console.
>
> > > The Machine class doesn't know if anything will ever use the console,
> > > so as is the change is unsafe.
> > >
> > > The original goal of John's change was to guarantee we capture early
> > > boot messages as some tests need that.
> > >
> > > I think we need to be able to have a flag to say whether the caller needs
> > > an "early console" facility, and only use the pre-opened FD passing for
> > > that case. Tests that need early console will have to ask for that guarantee
> > > explicitly.
> >
> > Tch. I see. Thank you for diagnosing this.
> >
> > From the machine.py perspective, you have to *opt in* to having a
> > console, so I hadn't considered that a caller would enable the console
> > and then ... not read from it. Surely that's a bug in the caller?
>
> From an Avocado test perspective, I would expect that the test case
> should have to explicitly opt *out* of "the console messages appear
> in the avocado test log, even if the test case doesn't care about them
> for the purposes of identifying when to end the test or whatever".
> The console logs are important for after-the-fact human diagnosis
> of why a test might have failed, so we should always collect them.
>
> thanks
> -- PMM
>

Understood. In that case, fixing the test would involve engaging the
avocado suite's draining utility to ensure that the log is being
consumed and logged.

I think there's a potential here to simplify all of the
draining-and-logging code we have split across the avocado test suite,
console_socket.py and machine.py, but I can't promise that the rewrite
I've been working on will be ready quickly, so if this is still busted
(I'm still catching back up with my mail post-holidays) then we want a
quicker fix if we haven't committed one yet.

--js
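
A minimal sketch of the blocking behaviour discussed above, assuming nothing
about the real machine.py or console_socket.py code: a writer on one end of a
socketpair can only complete a large write if something keeps reading the
other end. The names drain() and write_with_drainer() are made up for this
illustration and are not part of the QEMU Python library.

```python
import socket
import threading


def drain(sock, sink):
    """Read from sock until EOF, appending every chunk to sink."""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        sink.append(chunk)


def write_with_drainer(payload):
    """Write payload into one end of a socketpair while a thread drains the other.

    Returns everything the drainer captured.
    """
    guest_end, host_end = socket.socketpair()
    captured = []
    reader = threading.Thread(target=drain, args=(host_end, captured))
    reader.start()
    # For payloads larger than the socket buffer, sendall() only returns
    # because the reader thread keeps emptying the other end.
    guest_end.sendall(payload)
    guest_end.close()  # the reader sees EOF and stops
    reader.join()
    host_end.close()
    return b"".join(captured)
```

Without the reader thread, sendall() on a payload larger than the socket
buffer would block indefinitely, which is the same symptom as the guest
stalling on its serial port writes.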

Re: [PATCH v6 4/4] scripts: add script to compare compatible properties

2024-01-08 Thread John Snow

On Mon, Dec 18, 2023 at 8:20 AM Markus Armbruster  wrote:
>
> Maksim Davydov  writes:
>
> > On 12/1/23 12:51, Markus Armbruster wrote:
> >> Review, anyone?
> >
> > Only Vladimir
>
> To be clear: I'm soliciting a second review.
>
> [...]
>

I volunteer to review it from the Python maintenance POV, but please
rebase and resend to fix the patchew desync. We still want review from
a more holistic perspective, though ... but if it's not part of the
build or test infrastructure, it doesn't have to be perfect.

--js

[PATCH 2/3] tests/tcg: Factor out gdbstub test functions

2024-01-08 Thread Ilya Leoshkevich

Both the report() function as well as the initial gdbstub test sequence
are copy-pasted into ~10 files with slight modifications. This
indicates that they are indeed generic, so factor them out. While
at it, add a few newlines to make the formatting closer to PEP-8.

Signed-off-by: Ilya Leoshkevich 
---
 tests/guest-debug/run-test.py |  7 ++-
 tests/guest-debug/test_gdbstub.py | 56 +++
 tests/tcg/aarch64/gdbstub/test-sve-ioctl.py   | 34 +--
 tests/tcg/aarch64/gdbstub/test-sve.py | 33 +--
 tests/tcg/multiarch/gdbstub/interrupt.py  | 47 ++--
 tests/tcg/multiarch/gdbstub/memory.py | 41 +-
 tests/tcg/multiarch/gdbstub/registers.py  | 41 ++
 tests/tcg/multiarch/gdbstub/sha1.py   | 40 ++---
 .../multiarch/gdbstub/test-proc-mappings.py   | 39 +
 .../multiarch/gdbstub/test-qxfer-auxv-read.py | 37 +---
 .../gdbstub/test-thread-breakpoint.py | 37 +---
 tests/tcg/s390x/gdbstub/test-signals-s390x.py | 42 +-
 tests/tcg/s390x/gdbstub/test-svc.py   | 39 +
 13 files changed, 96 insertions(+), 397 deletions(-)
 create mode 100644 tests/guest-debug/test_gdbstub.py

diff --git a/tests/guest-debug/run-test.py b/tests/guest-debug/run-test.py
index b13b27d4b19..368ff8a8903 100755
--- a/tests/guest-debug/run-test.py
+++ b/tests/guest-debug/run-test.py
@@ -97,7 +97,12 @@ def log(output, msg):
 sleep(1)
 log(output, "GDB CMD: %s" % (gdb_cmd))
 
-result = subprocess.call(gdb_cmd, shell=True, stdout=output, stderr=stderr)
+gdb_env = dict(os.environ)
+gdb_pythonpath = gdb_env.get("PYTHONPATH", "").split(os.pathsep)
+gdb_pythonpath.append(os.path.dirname(os.path.realpath(__file__)))
+gdb_env["PYTHONPATH"] = os.pathsep.join(gdb_pythonpath)
+result = subprocess.call(gdb_cmd, shell=True, stdout=output, stderr=stderr,
+ env=gdb_env)
 
 # A result of greater than 128 indicates a fatal signal (likely a
 # crash due to gdb internal failure). That's a problem for GDB and
diff --git a/tests/guest-debug/test_gdbstub.py 
b/tests/guest-debug/test_gdbstub.py
new file mode 100644
index 000..1bc4ed131f4
--- /dev/null
+++ b/tests/guest-debug/test_gdbstub.py
@@ -0,0 +1,56 @@
+"""Helper functions for gdbstub testing
+
+"""
+from __future__ import print_function
+import gdb
+import sys
+
+fail_count = 0
+
+
+def report(cond, msg):
+"""Report success/fail of a test"""
+if cond:
+print("PASS: {}".format(msg))
+else:
+print("FAIL: {}".format(msg))
+global fail_count
+fail_count += 1
+
+
+def main(test, expected_arch=None):
+"""Run a test function
+
+This runs as a sourced script (via -x, via run-test.py)."""
+try:
+inferior = gdb.selected_inferior()
+arch = inferior.architecture()
+print("ATTACHED: {}".format(arch))
+if expected_arch is not None:
+report(arch.name() == expected_arch,
+   "connected to {}".format(expected_arch))
+except (gdb.error, AttributeError):
+print("SKIP: not connected")
+exit(0)
+
+if gdb.parse_and_eval("$pc") == 0:
+print("SKIP: PC not set")
+exit(0)
+
+try:
+test()
+except:
+print("GDB Exception: {}".format(sys.exc_info()[0]))
+global fail_count
+fail_count += 1
+import code
+code.InteractiveConsole(locals=globals()).interact()
+raise
+
+try:
+gdb.execute("kill")
+except gdb.error:
+pass
+
+print("All tests complete: {} failures".format(fail_count))
+exit(fail_count)
diff --git a/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py 
b/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py
index ee8d467e59d..a78a3a2514d 100644
--- a/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py
+++ b/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py
@@ -8,19 +8,10 @@
 #
 
 import gdb
-import sys
+from test_gdbstub import main, report
 
 initial_vlen = 0
-failcount = 0
 
-def report(cond, msg):
-"Report success/fail of test"
-if cond:
-print ("PASS: %s" % (msg))
-else:
-print ("FAIL: %s" % (msg))
-global failcount
-failcount += 1
 
 class TestBreakpoint(gdb.Breakpoint):
 def __init__(self, sym_name="__sve_ld_done"):
@@ -64,26 +55,5 @@ def run_test():
 
 gdb.execute("c")
 
-#
-# This runs as the script it sourced (via -x, via run-test.py)
-#
-try:
-inferior = gdb.selected_inferior()
-arch = inferior.architecture()
-report(arch.name() == "aarch64", "connected to aarch64")
-except (gdb.error, AttributeError):
-print("SKIPPING (not connected)", file=sys.stderr)
-exit(0)
-
-try:
-# Run the actual tests
-run_test()
-except:
-print ("GDB Exception: %s" % (sys.exc_info()[0]))
-failcount += 1
-import code
-code.InteractiveConsole(locals=globals()).interact()
-

[PATCH 1/3] linux-user: Allow gdbstub to ignore page protection

2024-01-08 Thread Ilya Leoshkevich

gdbserver ignores page protection by virtue of using /proc/$pid/mem.
Teach qemu gdbstub to do this too. This will not work if /proc is not
mounted; accept this limitation.

One alternative is to temporarily grant the missing PROT_* bit, but
this is inherently racy. Another alternative is self-debugging with
ptrace(POKE), which will break if QEMU itself is being debugged - a
much more severe limitation.

Signed-off-by: Ilya Leoshkevich 
---
 cpu-target.c | 55 ++--
 1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/cpu-target.c b/cpu-target.c
index 5eecd7ea2d7..69e97f78980 100644
--- a/cpu-target.c
+++ b/cpu-target.c
@@ -406,6 +406,15 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr,
 vaddr l, page;
 void * p;
 uint8_t *buf = ptr;
+int ret = -1;
+int mem_fd;
+
+/*
+ * Try /proc/self/mem first. If /proc is not mounted or if there is a
+ * different problem, fall back to the manual page access. Note that, unlike
+ * /proc/self/mem, it will not be able to ignore the protection bits.
+ */
+mem_fd = open("/proc/self/mem", is_write ? O_WRONLY : O_RDONLY);
 
 while (len > 0) {
 page = addr & TARGET_PAGE_MASK;
@@ -413,22 +422,33 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr,
 if (l > len)
 l = len;
 flags = page_get_flags(page);
-if (!(flags & PAGE_VALID))
-return -1;
+if (!(flags & PAGE_VALID)) {
+goto out_close;
+}
 if (is_write) {
-if (!(flags & PAGE_WRITE))
-return -1;
+if (mem_fd == -1 ||
+pwrite(mem_fd, buf, l, (off_t)g2h_untagged(addr)) != l) {
+if (!(flags & PAGE_WRITE)) {
+goto out_close;
+}
+/* XXX: this code should not depend on lock_user */
+p = lock_user(VERIFY_WRITE, addr, l, 0);
+if (!p) {
+goto out_close;
+}
+memcpy(p, buf, l);
+unlock_user(p, addr, l);
+}
+} else if (mem_fd == -1 ||
+   pread(mem_fd, buf, l, (off_t)g2h_untagged(addr)) != l) {
+if (!(flags & PAGE_READ)) {
+goto out_close;
+}
 /* XXX: this code should not depend on lock_user */
-if (!(p = lock_user(VERIFY_WRITE, addr, l, 0)))
-return -1;
-memcpy(p, buf, l);
-unlock_user(p, addr, l);
-} else {
-if (!(flags & PAGE_READ))
-return -1;
-/* XXX: this code should not depend on lock_user */
-if (!(p = lock_user(VERIFY_READ, addr, l, 1)))
-return -1;
+p = lock_user(VERIFY_READ, addr, l, 1);
+if (!p) {
+goto out_close;
+}
 memcpy(buf, p, l);
 unlock_user(p, addr, 0);
 }
@@ -436,7 +456,12 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr,
 buf += l;
 addr += l;
 }
-return 0;
+ret = 0;
+out_close:
+if (mem_fd != -1) {
+close(mem_fd);
+}
+return ret;
 }
 #endif
 
-- 
2.43.0
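
For readers following the loop in cpu_memory_rw_debug(), here is a rough
sketch of how an access is split into per-page pieces. It is an illustration
only: split_by_page() is a hypothetical helper, and the 4096-byte default
page size is an assumption standing in for TARGET_PAGE_SIZE.

```python
def split_by_page(addr, length, page_size=4096):
    """Yield (chunk_addr, buf_offset, chunk_len) for each page the access touches.

    Mirrors the loop's bookkeeping: 'page' is addr rounded down to a page
    boundary, and each chunk ends at the next boundary or at the end of the
    request, whichever comes first.
    """
    buf_offset = 0
    while length > 0:
        page = addr & ~(page_size - 1)
        chunk = (page + page_size) - addr
        if chunk > length:
            chunk = length
        yield addr, buf_offset, chunk
        addr += chunk
        buf_offset += chunk
        length -= chunk
```

Each iteration must operate on the current chunk (address, buffer offset and
chunk length), not on the start of the caller's buffer and the total
remaining length, because an access can straddle pages with different
protections.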

[PATCH 0/3] linux-user: Allow gdbstub to ignore page protection

2024-01-08 Thread Ilya Leoshkevich

RFC: https://lists.gnu.org/archive/html/qemu-devel/2023-12/msg02044.html
RFC -> v1: Use /proc/self/mem and accept that this will not work
   without /proc.
   Factor out a couple functions for gdbstub testing.
   Add a test.

Hi,

I've noticed that gdbstub behaves differently from gdbserver in that it
doesn't allow reading non-readable pages. This series improves the
situation by using the same mechanism as gdbserver: /proc/self/mem. If
/proc is not mounted, we fall back to today's implementation.

Best regards,
Ilya

Ilya Leoshkevich (3):
  linux-user: Allow gdbstub to ignore page protection
  tests/tcg: Factor out gdbstub test functions
  tests/tcg: Add the PROT_NONE gdbstub test

 cpu-target.c  | 55 +-
 tests/guest-debug/run-test.py |  7 ++-
 tests/guest-debug/test_gdbstub.py | 56 +++
 tests/tcg/aarch64/gdbstub/test-sve-ioctl.py   | 34 +--
 tests/tcg/aarch64/gdbstub/test-sve.py | 33 +--
 tests/tcg/multiarch/Makefile.target   |  9 ++-
 tests/tcg/multiarch/gdbstub/interrupt.py  | 47 ++--
 tests/tcg/multiarch/gdbstub/memory.py | 41 +-
 tests/tcg/multiarch/gdbstub/prot-none.py  | 22 
 tests/tcg/multiarch/gdbstub/registers.py  | 41 ++
 tests/tcg/multiarch/gdbstub/sha1.py   | 40 ++---
 .../multiarch/gdbstub/test-proc-mappings.py   | 39 +
 .../multiarch/gdbstub/test-qxfer-auxv-read.py | 37 +---
 .../gdbstub/test-thread-breakpoint.py | 37 +---
 tests/tcg/multiarch/prot-none.c   | 38 +
 tests/tcg/s390x/gdbstub/test-signals-s390x.py | 42 +-
 tests/tcg/s390x/gdbstub/test-svc.py   | 39 +
 17 files changed, 204 insertions(+), 413 deletions(-)
 create mode 100644 tests/guest-debug/test_gdbstub.py
 create mode 100644 tests/tcg/multiarch/gdbstub/prot-none.py
 create mode 100644 tests/tcg/multiarch/prot-none.c

-- 
2.43.0
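
To illustrate the mechanism the cover letter refers to, the following is a
hedged, Linux-only sketch of reading a PROT_NONE page through /proc/self/mem.
It is not the QEMU implementation; peek_prot_none() is a hypothetical helper
that calls libc via ctypes, and it returns None wherever /proc/self/mem is
missing or refuses the read.

```python
import ctypes
import mmap
import os

libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = (ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long)
libc.mprotect.argtypes = (ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int)
libc.munmap.argtypes = (ctypes.c_void_p, ctypes.c_size_t)

PROT_NONE = 0
MAP_FAILED = ctypes.c_void_p(-1).value


def peek_prot_none(value=42):
    """Store value in a fresh page, drop all permissions, read it back.

    The read goes through /proc/self/mem instead of a normal load, so the
    page protection is not consulted. Returns None if that path is unusable.
    """
    size = mmap.PAGESIZE
    word = ctypes.sizeof(ctypes.c_long)
    addr = libc.mmap(None, size, mmap.PROT_READ | mmap.PROT_WRITE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    if addr is None or addr == MAP_FAILED:
        return None
    try:
        ctypes.c_long.from_address(addr).value = value
        if libc.mprotect(addr, size, PROT_NONE) != 0:
            return None
        try:
            fd = os.open("/proc/self/mem", os.O_RDONLY)
        except OSError:
            return None  # e.g. /proc is not mounted
        try:
            data = os.pread(fd, word, addr)
        except OSError:
            return None
        finally:
            os.close(fd)
        if len(data) != word:
            return None
        return ctypes.c_long.from_buffer_copy(data).value
    finally:
        libc.munmap(addr, size)
```

A direct load from the page after mprotect(PROT_NONE) would fault, which is
why the fallback path that goes through the page flags cannot serve such
reads.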

[PATCH 3/3] tests/tcg: Add the PROT_NONE gdbstub test

2024-01-08 Thread Ilya Leoshkevich

Make sure that qemu gdbstub, like gdbserver, allows reading from and
writing to PROT_NONE pages.

Signed-off-by: Ilya Leoshkevich 
---
 tests/tcg/multiarch/Makefile.target  |  9 +-
 tests/tcg/multiarch/gdbstub/prot-none.py | 22 ++
 tests/tcg/multiarch/prot-none.c  | 38 
 3 files changed, 68 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/multiarch/gdbstub/prot-none.py
 create mode 100644 tests/tcg/multiarch/prot-none.c

diff --git a/tests/tcg/multiarch/Makefile.target 
b/tests/tcg/multiarch/Makefile.target
index d31ba8d6ae4..315a2e13588 100644
--- a/tests/tcg/multiarch/Makefile.target
+++ b/tests/tcg/multiarch/Makefile.target
@@ -101,13 +101,20 @@ run-gdbstub-registers: sha512
--bin $< --test $(MULTIARCH_SRC)/gdbstub/registers.py, \
checking register enumeration)
 
+run-gdbstub-prot-none: prot-none
+   $(call run-test, $@, env PROT_NONE_PY=1 $(GDB_SCRIPT) \
+   --gdb $(GDB) \
+   --qemu $(QEMU) --qargs "$(QEMU_OPTS)" \
+   --bin $< --test $(MULTIARCH_SRC)/gdbstub/prot-none.py, \
+   accessing PROT_NONE memory)
+
 else
 run-gdbstub-%:
$(call skip-test, "gdbstub test $*", "need working gdb with $(patsubst 
-%,,$(TARGET_NAME)) support")
 endif
 EXTRA_RUNS += run-gdbstub-sha1 run-gdbstub-qxfer-auxv-read \
  run-gdbstub-proc-mappings run-gdbstub-thread-breakpoint \
- run-gdbstub-registers
+ run-gdbstub-registers run-gdbstub-prot-none
 
 # ARM Compatible Semi Hosting Tests
 #
diff --git a/tests/tcg/multiarch/gdbstub/prot-none.py 
b/tests/tcg/multiarch/gdbstub/prot-none.py
new file mode 100644
index 000..751e44d5b93
--- /dev/null
+++ b/tests/tcg/multiarch/gdbstub/prot-none.py
@@ -0,0 +1,22 @@
+"""Test that GDB can access PROT_NONE pages.
+
+This runs as a sourced script (via -x, via run-test.py).
+
+SPDX-License-Identifier: GPL-2.0-or-later
+"""
+from test_gdbstub import main, report
+
+
+def run_test():
+"""Run through the tests one by one"""
+gdb.Breakpoint("break_here")
+gdb.execute("continue")
+val = int(gdb.parse_and_eval("*p"))
+report(val == 42, "{} != 42".format(val))
+gdb.execute("set *p = 24")
+gdb.execute("continue")
+exitcode = int(gdb.parse_and_eval("$_exitcode"))
+report(exitcode == 0, "{} != 0".format(exitcode))
+
+
+main(run_test)
diff --git a/tests/tcg/multiarch/prot-none.c b/tests/tcg/multiarch/prot-none.c
new file mode 100644
index 000..66e38065cf0
--- /dev/null
+++ b/tests/tcg/multiarch/prot-none.c
@@ -0,0 +1,38 @@
+/*
+ * Test that GDB can access PROT_NONE pages.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include 
+#include 
+#include 
+#include 
+
+void break_here(long *p)
+{
+}
+
+int main(void)
+{
+long pagesize = sysconf(_SC_PAGESIZE);
+int err;
+long *p;
+
+p = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+assert(p != MAP_FAILED);
+*p = 42;
+
+err = mprotect(p, pagesize, PROT_NONE);
+assert(err == 0);
+
+break_here(p);
+
+err = mprotect(p, pagesize, PROT_READ);
+assert(err == 0);
+if (getenv("PROT_NONE_PY")) {
+assert(*p == 24);
+}
+
+return EXIT_SUCCESS;
+}
-- 
2.43.0

Re: [PATCH] hw/block/fdc: do not set SEEK status bit in multi track commands

2024-01-08 Thread John Snow

On Mon, Jan 1, 2024 at 4:45 PM Hervé Poussineau  wrote:
>
> Ping.
>
> Le 12/08/2023 à 10:59, Hervé Poussineau a écrit :
> > I don't understand when SEEK must be set or not, but it seems to fix 
> > Minix...
> >
> > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1522
> > Signed-off-by: Hervé Poussineau 
> > ---
> >   hw/block/fdc.c | 1 -
> >   1 file changed, 1 deletion(-)
> >
> > diff --git a/hw/block/fdc.c b/hw/block/fdc.c
> > index d7cc4d3ec19..f627bbaf951 100644
> > --- a/hw/block/fdc.c
> > +++ b/hw/block/fdc.c
> > @@ -1404,7 +1404,6 @@ static int fdctrl_seek_to_next_sect(FDCtrl *fdctrl, 
> > FDrive *cur_drv)
> >   } else {
> >   new_head = 0;
> >   new_track++;
> > -fdctrl->status0 |= FD_SR0_SEEK;
> >   if ((cur_drv->flags & FDISK_DBL_SIDES) == 0) {
> >   ret = 0;
> >   }
>

I'll be honest, I don't have the time to audit this and I don't have
the test suite necessary to prove that it's safe enough. Do you have
any suggestions for how we can prove or test this beyond 'works for
me'?

I could read the spec sheet for this controller until I'm blue in the
face, but it doesn't seem to necessarily correlate to how the
controller behaves IRL or with what real operating systems actually do
with that controller. I also don't have access to a physical
controller anymore to even begin to try and write my own hardware
probe for it.

We need a robust test suite for FDC behavior, but it seems unlikely
that anyone will want to actually write one (I sure don't). Are there
any good shortcuts to victory here?

Re: [PATCH 3/6] linux-user: Add code for PR_GET/SET_UNALIGN

2024-01-08 Thread Philippe Mathieu-Daudé


On 8/1/24 22:13, Richard Henderson wrote:

On 1/9/24 04:15, Philippe Mathieu-Daudé wrote:

+/*
+ * This can't go in hw/core/cpu.c because that file is compiled only
+ * once for both user-mode and system builds.
+ */
  static Property cpu_common_props[] = {
-#ifndef CONFIG_USER_ONLY
+#ifdef CONFIG_USER_ONLY
  /*
- * Create a memory property for softmmu CPU object,
- * so users can wire up its memory. (This can't go in hw/core/cpu.c
- * because that file is compiled only once for both user-mode
- * and system builds.) The default if no link is set up is to use
+ * Create a property for the user-only object, so users can
+ * adjust prctl(PR_SET_UNALIGN) from the command-line.


How can I test this per-thread property?


-cpu foo,prctl-unalign-sigbus=true



Shouldn't this be an accel (TCG/user) property, for all threads?


There is always one cpu at user-only startup, and it is copied on clone.

Logically it would be a kernel property, since it's something the kernel 
does, not the cpu.  But cpu vs accel makes no difference to me; it was 
just easy here.


Can a process started with prctl(PR_SET_UNALIGN) unset it before
forking?

"kernel property" as "accel property" works for me.

IIRC, this is simply a proxy for not really being able to inherit this 
bit across fork+exec like you can with the real kernel.



r~

Re: [PATCH v10 08/10] hw/net: GMAC Rx Implementation

2024-01-08 Thread Philippe Mathieu-Daudé


On 8/1/24 23:27, Nabih Estefan wrote:

From: Nabih Estefan Diaz 

- Implementation of Receive function for packets
- Implementation for reading and writing from and to descriptors in
   memory for Rx

When RX starts, we need to flush the queued packets so that they
can be received by the GMAC device. Without this it won't work
with TAP NIC device.

When RX descriptor list is full, it returns a DMA_STATUS for
software to handle it. But there's no way to indicate the software has
handled all RX descriptors and the whole pipeline stalls.

We do something similar to NPCM7XX EMC to handle this case.

1. Return packet size when RX descriptor is full, effectively dropping
these packets in such a case.
2. When software clears RX descriptor full bit, continue receiving
further packets by flushing QEMU packet queue.

Added relevant trace-events

Change-Id: I132aa254a94cda1a586aba2ea33bbfc74ecdb831
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
  hw/net/npcm_gmac.c  | 324 +++-
  hw/net/trace-events |   5 +
  2 files changed, 327 insertions(+), 2 deletions(-)

diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c
index 44c4ffaff4..54c8af3b41 100644
--- a/hw/net/npcm_gmac.c
+++ b/hw/net/npcm_gmac.c
@@ -23,7 +23,11 @@
  #include "hw/registerfields.h"
  #include "hw/net/mii.h"
  #include "hw/net/npcm_gmac.h"
+#include "linux/if_ether.h"


Still doesn't build on macOS:

[1215/1649] Compiling C object libcommon.fa.p/hw_net_npcm_gmac.c.o
../../hw/net/npcm_gmac.c:26:10: fatal error: 'linux/if_ether.h' file not 
found

#include "linux/if_ether.h"
 ^~
1 error generated.
FAILED: libcommon.fa.p/hw_net_npcm_gmac.c.o

Re: [PATCH 1/2] target/sh4: Deprecate the shix machine

2024-01-08 Thread Philippe Mathieu-Daudé


Hi Samuel,

On 8/1/24 18:15, Samuel Tardieu wrote:

The shix machine has been designed and used at Télécom Paris from 2003
to 2010. It had been added to QEMU in 2005 and has not been maintained
since. Since nobody is using the physical board anymore nor interested
in maintaining the QEMU port, it is time to deprecate it.

Signed-off-by: Samuel Tardieu 
---
  docs/about/deprecated.rst | 5 +
  hw/sh4/shix.c | 1 +
  2 files changed, 6 insertions(+)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 2e15040246..e6a12c9077 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -269,6 +269,11 @@ Nios II ``10m50-ghrd`` and ``nios2-generic-nommu`` 
machines (since 8.2)
  
  The Nios II architecture is orphan.
  
+``shix`` (since 9.0)

+
+
+The machine is no longer in existence and has been long unmaintained
+in QEMU.
  
  Backend options

  ---
diff --git a/hw/sh4/shix.c b/hw/sh4/shix.c
index aa812512f0..58530b8ede 100644
--- a/hw/sh4/shix.c
+++ b/hw/sh4/shix.c
@@ -80,6 +80,7 @@ static void shix_machine_init(MachineClass *mc)
  mc->init = shix_init;
  mc->is_default = true;
  mc->default_cpu_type = TYPE_SH7750R_CPU;
+mc->deprecation_reason = "old and unmaintained - use a newer machine 
instead";


"use a newer machine instead" bugs me, what would that be?

Could we stick to "old and unmaintained"?


  }
  
  DEFINE_MACHINE("shix", shix_machine_init)

Re: [PATCH v2 1/2] nubus-device: round Declaration ROM memory region address to qemu_target_page_size()

2024-01-08 Thread Philippe Mathieu-Daudé


On 8/1/24 20:20, Mark Cave-Ayland wrote:

Declaration ROM binary images can be any arbitrary size, however if a host ROM
memory region is not aligned to qemu_target_page_size() then we fail the
"assert(!(iotlb & ~TARGET_PAGE_MASK))" check in tlb_set_page_full().

Ensure that the host ROM memory region is aligned to qemu_target_page_size()
and adjust the offset at which the Declaration ROM image is loaded, since Nubus
ROM images are unusual in that they are aligned to the end of the slot address
space.

Signed-off-by: Mark Cave-Ayland 
---
  hw/nubus/nubus-device.c | 16 
  1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/hw/nubus/nubus-device.c b/hw/nubus/nubus-device.c
index 49008e4938..e4f824d58b 100644
--- a/hw/nubus/nubus-device.c
+++ b/hw/nubus/nubus-device.c
@@ -10,6 +10,7 @@
  
  #include "qemu/osdep.h"

  #include "qemu/datadir.h"
+#include "exec/target_page.h"
  #include "hw/irq.h"
  #include "hw/loader.h"
  #include "hw/nubus/nubus.h"
@@ -30,7 +31,7 @@ static void nubus_device_realize(DeviceState *dev, Error 
**errp)
  NubusDevice *nd = NUBUS_DEVICE(dev);
  char *name, *path;
  hwaddr slot_offset;
-int64_t size;
+int64_t size, align_size;


Both are 'size_t'.


  int ret;
  
  /* Super */

@@ -76,16 +77,23 @@ static void nubus_device_realize(DeviceState *dev, Error 
**errp)
  }
  
  name = g_strdup_printf("nubus-slot-%x-declaration-rom", nd->slot);

-memory_region_init_rom(&nd->decl_rom, OBJECT(dev), name, size,
+
+/*
+ * Ensure ROM memory region is aligned to target page size regardless
+ * of the size of the Declaration ROM image
+ */
+align_size = ROUND_UP(size, qemu_target_page_size());
+memory_region_init_rom(&nd->decl_rom, OBJECT(dev), name, align_size,
 &error_abort);
-ret = load_image_mr(path, &nd->decl_rom);
+ret = load_image_size(path, memory_region_get_ram_ptr(&nd->decl_rom) +
+(uintptr_t)align_size - size, size);


memory_region_get_ram_ptr() returns a 'void *' so this looks dubious.
Maybe use a local variable to ease offset calculation?

  char *rombase = memory_region_get_ram_ptr(&nd->decl_rom);
  ret = load_image_size(path, rombase + align_size - size, size);

Otherwise KISS but ugly:

  ret = load_image_size(path,
(void *)((uintptr_t)memory_region_get_ram_ptr(&nd->decl_rom)
 + align_size - size), size);


  g_free(path);
  g_free(name);
  if (ret < 0) {
  error_setg(errp, "could not load romfile \"%s\"", nd->romfile);
  return;
  }
-memory_region_add_subregion(&nd->slot_mem, NUBUS_SLOT_SIZE - size,
+memory_region_add_subregion(&nd->slot_mem, NUBUS_SLOT_SIZE - align_size,
  &nd->decl_rom);
  }
  }
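
The offset arithmetic in this patch can be checked with a small sketch. This
is only an illustration under stated assumptions: rom_layout() and round_up()
are hypothetical helpers, and the page and slot sizes are passed in as
parameters rather than taken from QEMU.

```python
def round_up(n, d):
    """Round n up to the next multiple of d (same idea as QEMU's ROUND_UP)."""
    return ((n + d - 1) // d) * d


def rom_layout(rom_size, page_size, slot_size):
    """Compute where an end-aligned Declaration ROM lands.

    region_size: size of the page-aligned ROM memory region
    load_offset: offset of the image inside that region
    slot_offset: offset of the region inside the slot address space
    """
    align_size = round_up(rom_size, page_size)
    return {
        "region_size": align_size,
        "load_offset": align_size - rom_size,
        "slot_offset": slot_size - align_size,
    }
```

For any input, slot_offset + load_offset + rom_size equals slot_size, so the
last byte of the image still sits at the very end of the slot even though the
region itself now starts on a page boundary.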

Re: [PATCH v6 1/3] hw/misc: Implement STM32L4x5 EXTI

2024-01-08 Thread Philippe Mathieu-Daudé


On 8/1/24 19:03, Inès Varhol wrote:

Although very similar to the STM32F4xx EXTI, STM32L4x5 EXTI generates
more than 32 event/interrupt requests and thus uses more registers
than STM32F4xx EXTI which generates 23 event/interrupt requests.

Acked-by: Alistair Francis 
Signed-off-by: Arnaud Minier 
Signed-off-by: Inès Varhol 
---

Should the for loop variables be `unsigned` rather than `int` ?


It depends on the iterated range. Here you iterate over
ARRAY_SIZE(Stm32l4x5ExtiState::irq) which is a size_t type,
which is unsigned. Amusingly we use both similarly:

$ git grep 'for (size_t' | wc -l
  56
$ git grep 'for (unsigned' | wc -l
  59


  docs/system/arm/b-l475e-iot01a.rst |   5 +-
  hw/misc/Kconfig|   3 +
  hw/misc/meson.build|   1 +
  hw/misc/stm32l4x5_exti.c   | 292 +
  hw/misc/trace-events   |   5 +
  include/hw/misc/stm32l4x5_exti.h   |  51 +
  6 files changed, 354 insertions(+), 3 deletions(-)
  create mode 100644 hw/misc/stm32l4x5_exti.c
  create mode 100644 include/hw/misc/stm32l4x5_exti.h

diff --git a/docs/system/arm/b-l475e-iot01a.rst 
b/docs/system/arm/b-l475e-iot01a.rst
index 2b128e6b84..72f256ace7 100644
--- a/docs/system/arm/b-l475e-iot01a.rst
+++ b/docs/system/arm/b-l475e-iot01a.rst
@@ -12,17 +12,16 @@ USART, I2C, SPI, CAN and USB OTG, as well as a variety of 
sensors.
  Supported devices
  "
  
-Currently, B-L475E-IOT01A machine's implementation is minimal,

-it only supports the following device:
+Currently, the B-L475E-IOT01A machine only supports the following devices:
  
  - Cortex-M4F based STM32L4x5 SoC

+- STM32L4x5 EXTI (Extended interrupts and events controller)
  
  Missing devices

  """
  
  The B-L475E-IOT01A does *not* support the following devices:
  
-- Extended interrupts and events controller (EXTI)

  - Reset and clock control (RCC)
  - Serial ports (UART)
  - System configuration controller (SYSCFG)
diff --git a/hw/misc/Kconfig b/hw/misc/Kconfig
index cc8a8c1418..3efe3dc2cc 100644
--- a/hw/misc/Kconfig
+++ b/hw/misc/Kconfig
@@ -87,6 +87,9 @@ config STM32F4XX_SYSCFG
  config STM32F4XX_EXTI
  bool
  
+config STM32L4X5_EXTI

+bool
+
  config MIPS_ITU
  bool
  
diff --git a/hw/misc/meson.build b/hw/misc/meson.build

index 36c20d5637..16db6e228d 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -110,6 +110,7 @@ system_ss.add(when: 'CONFIG_XLNX_VERSAL_TRNG', if_true: 
files(
  system_ss.add(when: 'CONFIG_STM32F2XX_SYSCFG', if_true: 
files('stm32f2xx_syscfg.c'))
  system_ss.add(when: 'CONFIG_STM32F4XX_SYSCFG', if_true: 
files('stm32f4xx_syscfg.c'))
  system_ss.add(when: 'CONFIG_STM32F4XX_EXTI', if_true: 
files('stm32f4xx_exti.c'))
+system_ss.add(when: 'CONFIG_STM32L4X5_EXTI', if_true: 
files('stm32l4x5_exti.c'))
  system_ss.add(when: 'CONFIG_MPS2_FPGAIO', if_true: files('mps2-fpgaio.c'))
  system_ss.add(when: 'CONFIG_MPS2_SCC', if_true: files('mps2-scc.c'))
  
diff --git a/hw/misc/stm32l4x5_exti.c b/hw/misc/stm32l4x5_exti.c

new file mode 100644
index 0000000000..aedf1fb370
--- /dev/null
+++ b/hw/misc/stm32l4x5_exti.c
@@ -0,0 +1,292 @@
+/*
+ * STM32L4x5 EXTI (Extended interrupts and events controller)
+ *
+ * Copyright (c) 2023 Arnaud Minier 
+ * Copyright (c) 2023 Samuel Tardieu 
+ * Copyright (c) 2023 Inès Varhol 
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ * This work is based on the stm32f4xx_exti by Alistair Francis.
+ * Original code is licensed under the MIT License:
+ *
+ * Copyright (c) 2014 Alistair Francis 
+ */
+
+/*
+ * The reference used is the STMicroElectronics RM0351 Reference manual
+ * for STM32L4x5 and STM32L4x6 advanced Arm ® -based 32-bit MCUs.
+ * 
https://www.st.com/en/microcontrollers-microprocessors/stm32l4x5/documentation.html
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "trace.h"
+#include "hw/irq.h"
+#include "migration/vmstate.h"
+#include "hw/misc/stm32l4x5_exti.h"
+
+#define EXTI_IMR1   0x00
+#define EXTI_EMR1   0x04
+#define EXTI_RTSR1  0x08
+#define EXTI_FTSR1  0x0C
+#define EXTI_SWIER1 0x10
+#define EXTI_PR1    0x14



+#define EXTI_IMR2   0x20
+#define EXTI_EMR2   0x24
+#define EXTI_RTSR2  0x28
+#define EXTI_FTSR2  0x2C
+#define EXTI_SWIER2 0x30
+#define EXTI_PR2    0x34
+
+#define EXTI_NUM_GPIO_EVENT_IN_LINES 16


  #define EXTI_MAX_IRQ_PER_BANK 32


+
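+/* 0b11111111_10000010_00000000_00000000 */
+#define DIRECT_LINE_MASK1 0xFF820000
+/* 0b00000000_00000000_00000000_10000111 */
+#define DIRECT_LINE_MASK2 0x00000087
+/* 0b11111111_11111111_11111111_00000000 */
+#define RESERVED_BITS_MASK2 0xFFFFFF00
+
+/* 0b00000000_00000000_00000000_01111000 */
+#define ACTIVABLE_MASK2 (~DIRECT_LINE_MASK2 & ~RESERVED_BITS_MASK2)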
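
A small worked example of how ACTIVABLE_MASK2 falls out of the two other bank-2 masks. It assumes the full-width values 0x00000087 and 0xFFFFFF00 (the hex digits in the archived copy look truncated), and demo_filter_bank2() is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed full-width values; the archived hex digits look truncated. */
#define DIRECT_LINE_MASK2   0x00000087u
#define RESERVED_BITS_MASK2 0xFFFFFF00u
#define ACTIVABLE_MASK2 (~DIRECT_LINE_MASK2 & ~RESERVED_BITS_MASK2)

/*
 * ~0x00000087 = 0xFFFFFF78 and ~0xFFFFFF00 = 0x000000FF,
 * so the AND leaves 0x00000078: bits 3..6 only.
 */
static_assert(ACTIVABLE_MASK2 == 0x00000078u, "bits 3..6 only");

/* Hypothetical helper: keep only the bits software is allowed to set. */
static uint32_t demo_filter_bank2(uint32_t value)
{
    return value & ACTIVABLE_MASK2;
}
```

Writing the mask as the complement of the two exclusion masks keeps the three definitions consistent if either exclusion set changes.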



You might want to declare:

  #define EXTI_IRQS_BANK0  32
  #define EXTI_IRQS_BANK1  8

  static const unsigned

Re: [PATCH v2 00/35] tcg: Introduce TCG_COND_TST{EQ,NE}

2024-01-08 Thread Paolo Bonzini

Il lun 8 gen 2024, 22:45 Richard Henderson 
ha scritto:

> > I was thinking: a lot of RISC targets simply do AND/ANDI
> > followed by the sequence used for TCG_COND_NE.  Would it make sense to
> > have a TCG_TARGET_SUPPORTS_TST bit and, if absent, lower TSTEQ/TSTNE
> > to AND+EQ/NE directly in the optimizer?
>
> Probably best, yes.
>

Ok, I will give it a shot.
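
A rough sketch of the lowering discussed above, under the assumption that TSTEQ means "(a & b) == 0" and TSTNE means "(a & b) != 0". The enum and function names are hypothetical and only model the idea; this is not the TCG optimizer code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical condition codes, named after the TCG ones for readability. */
typedef enum {
    DEMO_COND_EQ,
    DEMO_COND_NE,
    DEMO_COND_TSTEQ,
    DEMO_COND_TSTNE,
} DemoCond;

/* Reference semantics of each condition. */
static bool demo_eval(DemoCond c, uint32_t a, uint32_t b)
{
    switch (c) {
    case DEMO_COND_EQ:
        return a == b;
    case DEMO_COND_NE:
        return a != b;
    case DEMO_COND_TSTEQ:
        return (a & b) == 0;
    case DEMO_COND_TSTNE:
        return (a & b) != 0;
    }
    assert(!"unknown condition");
    return false;
}

/*
 * Lower a test condition for a backend without native support:
 * emit the AND, then compare the result against zero with EQ/NE.
 * Returns true if the condition was rewritten.
 */
static bool demo_lower_tst(DemoCond *c, uint32_t *a, uint32_t *b)
{
    if (*c != DEMO_COND_TSTEQ && *c != DEMO_COND_TSTNE) {
        return false;
    }
    *a &= *b;
    *b = 0;
    *c = (*c == DEMO_COND_TSTEQ) ? DEMO_COND_EQ : DEMO_COND_NE;
    return true;
}
```

The rewritten comparison evaluates to the same result as the original test condition, which is what makes the fallback safe for any backend.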

> And for brcond2/setcond2,
> > always using AND/AND/OR may work just as well as any backend-specific
> > trick, and will give more freedom to the register allocator.
>
>test   a,b
>testeq c,e
>
> for Arm32.  So I'll leave it to the backends.
>

Nice. :)

Paolo


>
> r~
>
>

Re: [PATCH v8 04/10] hw/fsi: IBM's On-chip Peripheral Bus

2024-01-08 Thread Ninad Palsule


Hello Cedric,


On 12/12/23 08:48, Cédric Le Goater wrote:

On 11/29/23 00:56, Ninad Palsule wrote:

This is a part of patchset where IBM's Flexible Service Interface is
introduced.

The On-Chip Peripheral Bus (OPB): A low-speed bus typically found in
POWER processors. This now makes an appearance in the ASPEED SoC due
to tight integration of the FSI master IP with the OPB, mainly the
existence of an MMIO-mapping of the CFAM address straight onto a
sub-region of the OPB address space.

Signed-off-by: Andrew Jeffery 
Signed-off-by: Ninad Palsule 
Reviewed-by: Joel Stanley 
[ clg: - removed FSIMasterState object and fsi_opb_realize()
    - simplified OPBus ]
Signed-off-by: Cédric Le Goater 
---
  include/hw/fsi/opb.h | 25 +
  hw/fsi/opb.c | 36 
  hw/fsi/Kconfig   |  4 
  hw/fsi/meson.build   |  1 +
  4 files changed, 66 insertions(+)
  create mode 100644 include/hw/fsi/opb.h
  create mode 100644 hw/fsi/opb.c

diff --git a/include/hw/fsi/opb.h b/include/hw/fsi/opb.h
new file mode 100644
index 0000000000..c112206f9e
--- /dev/null
+++ b/include/hw/fsi/opb.h
@@ -0,0 +1,25 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM On-Chip Peripheral Bus
+ */
+#ifndef FSI_OPB_H
+#define FSI_OPB_H
+
+#include "exec/memory.h"
+#include "hw/fsi/fsi-master.h"
+
+#define TYPE_OP_BUS "opb"
+OBJECT_DECLARE_SIMPLE_TYPE(OPBus, OP_BUS)
+
+typedef struct OPBus {
+    /*< private >*/
+    BusState bus;
+
+    /*< public >*/
+    MemoryRegion mr;
+    AddressSpace as;
+} OPBus;
+
+#endif /* FSI_OPB_H */
diff --git a/hw/fsi/opb.c b/hw/fsi/opb.c
new file mode 100644
index 0000000000..6474754890
--- /dev/null
+++ b/hw/fsi/opb.c
@@ -0,0 +1,36 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM On-chip Peripheral Bus
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+#include "qemu/log.h"
+
+#include "hw/fsi/opb.h"
+
+static void fsi_opb_init(Object *o)
+{
+    OPBus *opb = OP_BUS(o);
+
+    memory_region_init_io(&opb->mr, OBJECT(opb), NULL, opb,
+  NULL, UINT32_MAX);


Let's give the region some name.

Added "fsi.opb" name.


Thanks for the review.

Regards,

Ninad

[PATCH v2 0/3] Hexagon (target/hexagon) Use QEMU decodetree

2024-01-08 Thread Taylor Simpson

Replace the old Hexagon dectree.py with QEMU decodetree

Taylor Simpson (3):
  Hexagon (target/hexagon) Use QEMU decodetree (32-bit instructions)
  Hexagon (target/hexagon) Use QEMU decodetree (16-bit instructions)
  Hexagon (target/hexagon) Remove old dectree.py

 target/hexagon/decode.h |   5 +-
 target/hexagon/opcodes.h|   2 -
 target/hexagon/decode.c | 435 +++-
 target/hexagon/gen_dectree_import.c |  49 
 target/hexagon/opcodes.c|  29 --
 target/hexagon/translate.c  |   4 +-
 target/hexagon/README   |  14 +-
 target/hexagon/dectree.py   | 403 --
 target/hexagon/gen_decodetree.py| 203 +
 target/hexagon/gen_trans_funcs.py   | 124 
 target/hexagon/meson.build  | 147 +-
 11 files changed, 591 insertions(+), 824 deletions(-)
 delete mode 100755 target/hexagon/dectree.py
 create mode 100755 target/hexagon/gen_decodetree.py
 create mode 100755 target/hexagon/gen_trans_funcs.py

-- 
2.34.1

[PATCH v2 1/3] Hexagon (target/hexagon) Use QEMU decodetree (32-bit instructions)

2024-01-08 Thread Taylor Simpson

The Decodetree Specification can be found here
https://www.qemu.org/docs/master/devel/decodetree.html

Covers all 32-bit instructions, including HVX

We generate separate decoders for each instruction class.  The reason
will be more apparent in the next patch in this series.

We add 2 new scripts
    gen_decodetree.py        Generate the input to decodetree.py
    gen_trans_funcs.py       Generate the trans_* functions used by the
                             output of decodetree.py

Since the functions generated by decodetree.py take DisasContext * as an
argument, we add the argument to a couple of functions that didn't need
it previously.  We also set the insn field in DisasContext during decode
because it is used by the trans_* functions.

There is a g_assert_not_reached() in decode_insns() in decode.c to
verify we never try to use the old decoder on 32-bit instructions

Signed-off-by: Taylor Simpson 
---
 target/hexagon/decode.h   |   5 +-
 target/hexagon/decode.c   |  54 -
 target/hexagon/translate.c|   4 +-
 target/hexagon/README |  13 +-
 target/hexagon/gen_decodetree.py  | 193 ++
 target/hexagon/gen_trans_funcs.py | 132 
 target/hexagon/meson.build|  55 +
 7 files changed, 442 insertions(+), 14 deletions(-)
 create mode 100755 target/hexagon/gen_decodetree.py
 create mode 100755 target/hexagon/gen_trans_funcs.py

diff --git a/target/hexagon/decode.h b/target/hexagon/decode.h
index c66f5ea64d..3f3012b978 100644
--- a/target/hexagon/decode.h
+++ b/target/hexagon/decode.h
@@ -21,12 +21,13 @@
 #include "cpu.h"
 #include "opcodes.h"
 #include "insn.h"
+#include "translate.h"
 
 void decode_init(void);
 
 void decode_send_insn_to(Packet *packet, int start, int newloc);
 
-int decode_packet(int max_words, const uint32_t *words, Packet *pkt,
-  bool disas_only);
+int decode_packet(DisasContext *ctx, int max_words, const uint32_t *words,
+  Packet *pkt, bool disas_only);
 
 #endif
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index 946c55cc71..bddad1f75e 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -52,6 +52,34 @@ DEF_REGMAP(R_8,   8,  0, 1, 2, 3, 4, 5, 6, 7)
 #define DECODE_MAPPED_REG(OPNUM, NAME) \
 insn->regno[OPNUM] = DECODE_REGISTER_##NAME[insn->regno[OPNUM]];
 
+/* Helper functions for decode_*_generated.c.inc */
+#define DECODE_MAPPED(NAME) \
+static int decode_mapped_reg_##NAME(DisasContext *ctx, int x) \
+{ \
+return DECODE_REGISTER_##NAME[x]; \
+}
+DECODE_MAPPED(R_16)
+DECODE_MAPPED(R_8)
+
+/* Helper function for decodetree_trans_funcs_generated.c.inc */
+static int shift_left(DisasContext *ctx, int x, int n, int immno)
+{
+int ret = x;
+Insn *insn = ctx->insn;
+if (!insn->extension_valid ||
+insn->which_extended != immno) {
+ret <<= n;
+}
+return ret;
+}
+
+/* Include the generated decoder for 32 bit insn */
+#include "decode_normal_generated.c.inc"
+#include "decode_hvx_generated.c.inc"
+
+/* Include the generated helpers for the decoder */
+#include "decodetree_trans_funcs_generated.c.inc"
+
 typedef struct {
 const struct DectreeTable *table_link;
 const struct DectreeTable *table_link_b;
@@ -550,7 +578,8 @@ apply_extender(Packet *pkt, int i, uint32_t extender)
 int immed_num;
 uint32_t base_immed;
 
-immed_num = opcode_which_immediate_is_extended(pkt->insn[i].opcode);
+immed_num = pkt->insn[i].which_extended;
+g_assert(immed_num == 
opcode_which_immediate_is_extended(pkt->insn[i].opcode));
 base_immed = pkt->insn[i].immed[immed_num];
 
 pkt->insn[i].immed[immed_num] = extender | fZXTN(6, 32, base_immed);
@@ -762,12 +791,19 @@ decode_insns_tablewalk(Insn *insn, const DectreeTable 
*table,
 }
 
 static unsigned int
-decode_insns(Insn *insn, uint32_t encoding)
+decode_insns(DisasContext *ctx, Insn *insn, uint32_t encoding)
 {
 const DectreeTable *table;
 if (parse_bits(encoding) != 0) {
+if (decode_normal(ctx, encoding) ||
+decode_hvx(ctx, encoding)) {
+insn->generate = opcode_genptr[insn->opcode];
+insn->iclass = iclass_bits(encoding);
+return 1;
+}
 /* Start with PP table - 32 bit instructions */
 table = &dectree_table_DECODE_ROOT_32;
+g_assert_not_reached();
 } else {
 /* start with EE table - duplex instructions */
 table = &dectree_table_DECODE_ROOT_EE;
@@ -916,8 +952,8 @@ decode_set_slot_number(Packet *pkt)
  * or number of words used on success
  */
 
-int decode_packet(int max_words, const uint32_t *words, Packet *pkt,
-  bool disas_only)
+int decode_packet(DisasContext *ctx, int max_words, const uint32_t *words,
+  Packet *pkt, bool disas_only)
 {
 int num_insns = 0;
 int words_read = 0;
@@ -930,9 +966,11 @@ int decode_packet(int max_words, const uint32_t *words, 
Packet *pkt,

[PATCH v2 2/3] Hexagon (target/hexagon) Use QEMU decodetree (16-bit instructions)

2024-01-08 Thread Taylor Simpson

Section 10.3 of the Hexagon V73 Programmer's Reference Manual

A duplex is encoded as a 32-bit instruction with bits [15:14] set to 00.
The sub-instructions that comprise a duplex are encoded as 13-bit fields
in the duplex.

Create a decoder for each subinstruction class (a, l1, l2, s1, s2).
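
The field layout described above can be sketched as follows. The demo_extract32() and demo_deposit32() functions are simplified stand-ins for QEMU's bit helpers, and the bit positions (iclass in bits 31:29 and 13, sub-instructions in bits 12:0 and 28:16) follow the patch below; all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for QEMU's extract32(). */
static uint32_t demo_extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0u >> (32 - length));
}

/* Simplified stand-in for QEMU's deposit32(). */
static uint32_t demo_deposit32(uint32_t value, int start, int length,
                               uint32_t field)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    uint32_t mask = (~0u >> (32 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* A duplex has parse bits [15:14] equal to 00. */
static bool demo_is_duplex(uint32_t encoding)
{
    return demo_extract32(encoding, 14, 2) == 0;
}

/* 4-bit duplex iclass: bits 31:29 form the high part, bit 13 the low bit. */
static uint32_t demo_duplex_iclass(uint32_t encoding)
{
    uint32_t iclass = demo_extract32(encoding, 13, 1);
    return demo_deposit32(iclass, 1, 3, demo_extract32(encoding, 29, 3));
}

/* The two 13-bit sub-instruction fields. */
static uint16_t demo_slot0_subinsn(uint32_t encoding)
{
    return (uint16_t)demo_extract32(encoding, 0, 13);
}

static uint16_t demo_slot1_subinsn(uint32_t encoding)
{
    return (uint16_t)demo_extract32(encoding, 16, 13);
}
```

For example, the word 0xAABC2123 is a duplex with iclass 0xB, slot0 field 0x0123 and slot1 field 0x0ABC.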

Extend gen_trans_funcs.py to handle all instructions rather than
filter by instruction class.

There is a g_assert_not_reached() in decode_insns() in decode.c to
verify we never try to use the old decoder on 16-bit instructions.

Signed-off-by: Taylor Simpson 
---
 target/hexagon/decode.c   | 85 +
 target/hexagon/README |  1 +
 target/hexagon/gen_decodetree.py  | 14 -
 target/hexagon/gen_trans_funcs.py | 12 +
 target/hexagon/meson.build| 90 +++
 5 files changed, 190 insertions(+), 12 deletions(-)

diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index bddad1f75e..160b23a895 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -60,6 +60,7 @@ static int decode_mapped_reg_##NAME(DisasContext *ctx, int x) 
\
 }
 DECODE_MAPPED(R_16)
 DECODE_MAPPED(R_8)
+DECODE_MAPPED(R__8)
 
 /* Helper function for decodetree_trans_funcs_generated.c.inc */
 static int shift_left(DisasContext *ctx, int x, int n, int immno)
@@ -77,6 +78,13 @@ static int shift_left(DisasContext *ctx, int x, int n, int 
immno)
 #include "decode_normal_generated.c.inc"
 #include "decode_hvx_generated.c.inc"
 
+/* Include the generated decoder for 16 bit insn */
+#include "decode_subinsn_a_generated.c.inc"
+#include "decode_subinsn_l1_generated.c.inc"
+#include "decode_subinsn_l2_generated.c.inc"
+#include "decode_subinsn_s1_generated.c.inc"
+#include "decode_subinsn_s2_generated.c.inc"
+
 /* Include the generated helpers for the decoder */
 #include "decodetree_trans_funcs_generated.c.inc"
 
@@ -790,6 +798,63 @@ decode_insns_tablewalk(Insn *insn, const DectreeTable 
*table,
 }
 }
 
+/*
+ * Section 10.3 of the Hexagon V73 Programmer's Reference Manual
+ *
+ * A duplex is encoded as a 32-bit instruction with bits [15:14] set to 00.
+ * The sub-instructions that comprise a duplex are encoded as 13-bit fields
+ * in the duplex.
+ *
+ * Per table 10-4, the 4-bit duplex iclass is encoded in bits 31:29, 13
+ */
+static uint32_t get_duplex_iclass(uint32_t encoding)
+{
+uint32_t iclass = extract32(encoding, 13, 1);
+iclass = deposit32(iclass, 1, 3, extract32(encoding, 29, 3));
+return iclass;
+}
+
+/*
+ * Per table 10-5, the duplex ICLASS field values that specify the group of
+ * each sub-instruction in a duplex
+ *
+ * This table points to the decode instruction for each entry in the table
+ */
+typedef bool (*subinsn_decode_func)(DisasContext *ctx, uint16_t insn);
+typedef struct {
+subinsn_decode_func decode_slot0_subinsn;
+subinsn_decode_func decode_slot1_subinsn;
+} subinsn_decode_groups;
+
+static const subinsn_decode_groups decode_groups[16] = {
+[0x0] = { decode_subinsn_l1, decode_subinsn_l1 },
+[0x1] = { decode_subinsn_l2, decode_subinsn_l1 },
+[0x2] = { decode_subinsn_l2, decode_subinsn_l2 },
+[0x3] = { decode_subinsn_a,  decode_subinsn_a },
+[0x4] = { decode_subinsn_l1, decode_subinsn_a },
+[0x5] = { decode_subinsn_l2, decode_subinsn_a },
+[0x6] = { decode_subinsn_s1, decode_subinsn_a },
+[0x7] = { decode_subinsn_s2, decode_subinsn_a },
+[0x8] = { decode_subinsn_s1, decode_subinsn_l1 },
+[0x9] = { decode_subinsn_s1, decode_subinsn_l2 },
+[0xa] = { decode_subinsn_s1, decode_subinsn_s1 },
+[0xb] = { decode_subinsn_s2, decode_subinsn_s1 },
+[0xc] = { decode_subinsn_s2, decode_subinsn_l1 },
+[0xd] = { decode_subinsn_s2, decode_subinsn_l2 },
+[0xe] = { decode_subinsn_s2, decode_subinsn_s2 },
+[0xf] = { NULL,  NULL },  /* Reserved */
+};
+
+static uint16_t get_slot0_subinsn(uint32_t encoding)
+{
+return extract32(encoding, 0, 13);
+}
+
+static uint16_t get_slot1_subinsn(uint32_t encoding)
+{
+return extract32(encoding, 16, 13);
+}
+
 static unsigned int
 decode_insns(DisasContext *ctx, Insn *insn, uint32_t encoding)
 {
@@ -805,8 +870,28 @@ decode_insns(DisasContext *ctx, Insn *insn, uint32_t 
encoding)
 table = &dectree_table_DECODE_ROOT_32;
 g_assert_not_reached();
 } else {
+uint32_t iclass = get_duplex_iclass(encoding);
+unsigned int slot0_subinsn = get_slot0_subinsn(encoding);
+unsigned int slot1_subinsn = get_slot1_subinsn(encoding);
+subinsn_decode_func decode_slot0_subinsn =
+decode_groups[iclass].decode_slot0_subinsn;
+subinsn_decode_func decode_slot1_subinsn =
+decode_groups[iclass].decode_slot1_subinsn;
+
+/* The slot1 subinsn needs to be in the packet first */
+if (decode_slot1_subinsn(ctx, slot1_subinsn)) {
+insn->generate = opcode_genptr[insn->opcode];
+insn->iclass = iclass_bits(encoding);
+

[PATCH v2 3/3] Hexagon (target/hexagon) Remove old dectree.py

2024-01-08 Thread Taylor Simpson

Now that we are using QEMU decodetree.py, remove the old decoder

Signed-off-by: Taylor Simpson 
---
 target/hexagon/opcodes.h|   2 -
 target/hexagon/decode.c | 344 
 target/hexagon/gen_dectree_import.c |  49 
 target/hexagon/opcodes.c|  29 --
 target/hexagon/dectree.py   | 403 
 target/hexagon/meson.build  |  12 -
 6 files changed, 839 deletions(-)
 delete mode 100755 target/hexagon/dectree.py

diff --git a/target/hexagon/opcodes.h b/target/hexagon/opcodes.h
index 6e90e00fe2..fa7e321950 100644
--- a/target/hexagon/opcodes.h
+++ b/target/hexagon/opcodes.h
@@ -53,6 +53,4 @@ extern const OpcodeEncoding opcode_encodings[XX_LAST_OPCODE];
 
 void opcode_init(void);
 
-int opcode_which_immediate_is_extended(Opcode opcode);
-
 #endif
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index 160b23a895..a40210ca1e 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -88,175 +88,6 @@ static int shift_left(DisasContext *ctx, int x, int n, int 
immno)
 /* Include the generated helpers for the decoder */
 #include "decodetree_trans_funcs_generated.c.inc"
 
-typedef struct {
-const struct DectreeTable *table_link;
-const struct DectreeTable *table_link_b;
-Opcode opcode;
-enum {
-DECTREE_ENTRY_INVALID,
-DECTREE_TABLE_LINK,
-DECTREE_SUBINSNS,
-DECTREE_EXTSPACE,
-DECTREE_TERMINAL
-} type;
-} DectreeEntry;
-
-typedef struct DectreeTable {
-unsigned int (*lookup_function)(int startbit, int width, uint32_t opcode);
-unsigned int size;
-unsigned int startbit;
-unsigned int width;
-const DectreeEntry table[];
-} DectreeTable;
-
-#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT) \
-static const DectreeTable dectree_table_##TAG;
-#define TABLE_LINK(TABLE) /* NOTHING */
-#define TERMINAL(TAG, ENC)/* NOTHING */
-#define SUBINSNS(TAG, CLASSA, CLASSB, ENC)/* NOTHING */
-#define EXTSPACE(TAG, ENC)/* NOTHING */
-#define INVALID() /* NOTHING */
-#define DECODE_END_TABLE(...) /* NOTHING */
-#define DECODE_MATCH_INFO(...)/* NOTHING */
-#define DECODE_LEGACY_MATCH_INFO(...) /* NOTHING */
-#define DECODE_OPINFO(...)/* NOTHING */
-
-#include "dectree_generated.h.inc"
-
-#undef DECODE_OPINFO
-#undef DECODE_MATCH_INFO
-#undef DECODE_LEGACY_MATCH_INFO
-#undef DECODE_END_TABLE
-#undef INVALID
-#undef TERMINAL
-#undef SUBINSNS
-#undef EXTSPACE
-#undef TABLE_LINK
-#undef DECODE_NEW_TABLE
-#undef DECODE_SEPARATOR_BITS
-
-#define DECODE_SEPARATOR_BITS(START, WIDTH) NULL, START, WIDTH
-#define DECODE_NEW_TABLE_HELPER(TAG, SIZE, FN, START, WIDTH) \
-static const DectreeTable dectree_table_##TAG = { \
-.size = SIZE, \
-.lookup_function = FN, \
-.startbit = START, \
-.width = WIDTH, \
-.table = {
-#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT) \
-DECODE_NEW_TABLE_HELPER(TAG, SIZE, WHATNOT)
-
-#define TABLE_LINK(TABLE) \
-{ .type = DECTREE_TABLE_LINK, .table_link = &dectree_table_##TABLE },
-#define TERMINAL(TAG, ENC) \
-{ .type = DECTREE_TERMINAL, .opcode = TAG  },
-#define SUBINSNS(TAG, CLASSA, CLASSB, ENC) \
-{ \
-.type = DECTREE_SUBINSNS, \
-.table_link = &dectree_table_DECODE_SUBINSN_##CLASSA, \
-.table_link_b = &dectree_table_DECODE_SUBINSN_##CLASSB \
-},
-#define EXTSPACE(TAG, ENC) { .type = DECTREE_EXTSPACE },
-#define INVALID() { .type = DECTREE_ENTRY_INVALID, .opcode = XX_LAST_OPCODE },
-
-#define DECODE_END_TABLE(...) } };
-
-#define DECODE_MATCH_INFO(...)/* NOTHING */
-#define DECODE_LEGACY_MATCH_INFO(...) /* NOTHING */
-#define DECODE_OPINFO(...)/* NOTHING */
-
-#include "dectree_generated.h.inc"
-
-#undef DECODE_OPINFO
-#undef DECODE_MATCH_INFO
-#undef DECODE_LEGACY_MATCH_INFO
-#undef DECODE_END_TABLE
-#undef INVALID
-#undef TERMINAL
-#undef SUBINSNS
-#undef EXTSPACE
-#undef TABLE_LINK
-#undef DECODE_NEW_TABLE
-#undef DECODE_NEW_TABLE_HELPER
-#undef DECODE_SEPARATOR_BITS
-
-static const DectreeTable dectree_table_DECODE_EXT_EXT_noext = {
-.size = 1, .lookup_function = NULL, .startbit = 0, .width = 0,
-.table = {
-{ .type = DECTREE_ENTRY_INVALID, .opcode = XX_LAST_OPCODE },
-}
-};
-
-static const DectreeTable *ext_trees[XX_LAST_EXT_IDX];
-
-static void decode_ext_init(void)
-{
-int i;
-for (i = EXT_IDX_noext; i < EXT_IDX_noext_AFTER; i++) {
-ext_trees[i] = &dectree_table_DECODE_EXT_EXT_noext;
-}
-for (i = EXT_IDX_mmvec; i < EXT_IDX_mmvec_AFTER; i++) {
-ext_trees[i] = &dectree_table_DECODE_EXT_EXT_mmvec;
-}
-}
-
-typedef struct {
-uint32_t mask;
-uint32_t match;
-} DecodeITableEntry;
-
-#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT)  /* NOTHING */
-#define TABLE_LINK(TABLE) /* NOTHING */
-#define TERMINAL(TAG,

Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'

2024-01-08 Thread Hao Xiang

On Mon, Jan 8, 2024 at 9:15 AM Gregory Price  wrote:
>
> On Fri, Jan 05, 2024 at 09:59:19PM -0800, Hao Xiang wrote:
> > On Wed, Jan 3, 2024 at 1:56 PM Gregory Price  
> > wrote:
> > >
> > > For a variety of performance reasons, this will not work the way you
> > > want it to.  You are essentially telling QEMU to map the vmem0 into a
> > > virtual cxl device, and now any memory accesses to that memory region
> > > will end up going through the cxl-type3 device logic - which is an IO
> > > path from the perspective of QEMU.
> >
> > I didn't understand exactly how the virtual cxl-type3 device works. I
> > thought it would go with the same "guest virtual address ->  guest
> > physical address -> host physical address" translation totally done by
> > CPU. But if it is going through an emulation path handled by virtual
> > cxl-type3, I agree the performance would be bad. Do you know why
> > accessing memory on a virtual cxl-type3 device can't go with the
> > nested page table translation?
> >
>
> Because a byte-access on CXL memory can have checks on it that must be
> emulated by the virtual device, and because there are caching
> implications that have to be emulated as well.

Interesting. I now see cxl_type3_read/cxl_type3_write. If the
CXL memory data path goes through them, the performance would be
pretty problematic. We have actually run Intel's Memory Latency
Checker benchmark from inside a guest VM with both system-DRAM and
virtual CXL-type3 configured. The idle latency on the virtual CXL
memory is 2x that of system DRAM, which is on par with the benchmark
running on a physical host. I need to debug this more to understand
why the latency is actually much better than I would expect now.

>
> The cxl device you are using is an emulated CXL device - not a
> virtualization interface.  Nuanced difference:  the emulated device has
> to emulate *everything* that CXL device does.
>
> What you want is passthrough / managed access to a real device -
> virtualization.  This is not the way to accomplish that.  A better way
> to accomplish that is to simply pass the memory through as a static numa
> node as I described.

That would work, too. But I think a kernel change is required to
establish the correct memory tiering if we go this route.

>
> >
> > When we had a discussion with Intel, they told us to not use the KVM
> > option in QEMU while using virtual cxl type3 device. That's probably
> > related to the issue you described here? We enabled KVM though but
> > haven't seen the crash yet.
> >
>
> The crash really only happens, IIRC, if code ends up hosted in that
> memory.  I forget the exact scenario, but the working theory is it has
> to do with the way instruction caches are managed with KVM and this
> device.
>
> > >
> > > You're better off just using the `host-nodes` field of host-memory
> > > and passing bandwidth/latency attributes though via `-numa hmat-lb`
> >
> > We tried this but it doesn't work from end to end right now. I
> > described the issue in another fork of this thread.
> >
> > >
> > > In that scenario, the guest software doesn't even need to know CXL
> > > exists at all, it can just read the attributes of the numa node
> > > that QEMU created for it.
> >
> > We thought about this before. But the current kernel implementation
> > requires a devdax device to be probed and recognized as a slow tier
> > (by reading the memory attributes). I don't think this can be done via
> > the path you described. Have you tried this before?
> >
>
> Right, because the memory tiering component lumps the nodes together.
>
> Better idea:  Fix the memory tiering component
>
> I cc'd you on another patch line that is discussing something relevant
> to this.
>
> https://lore.kernel.org/linux-mm/87fs00njft@yhuang6-desk2.ccr.corp.intel.com/T/#m32d58f8cc607aec942995994a41b17ff711519c8
>
> The point is: There's no need for this to be a dax device at all, there
> is no need for the guest to even know what is providing the memory, or
> for the guest to have any management access to the memory.  It just
> wants the memory and the ability to tier it.
>
> So we should fix the memory tiering component to work with this
> workflow.

Agreed. We really don't need the devdax device at all. I thought that
choice was made due to the memory tiering concept being started with
pmem ... Let's continue this part of the discussion on the above
thread.

>
> ~Gregory

Re: [PATCH v7 0/4] compare machine type compat_props

2024-01-08 Thread John Snow

On Fri, Dec 22, 2023 at 7:51 AM Markus Armbruster  wrote:
>
> Something odd is going on here.
>
> Your cover letter and PATCH 4 arrived here with
>
> Content-Type: text/plain; charset=UTF-8
>
> Good.
>
> PATCH 2:
>
> Content-Type: text/plain; charset="US-ASCII"; x-default=true
>
> PATCH 1 and 3:
>
> Content-Type: text/plain; charset=N
>
> git-am chokes on that:
>
> error: cannot convert from N to UTF-8
>

Patchew also complains that it hasn't received the full series:

https://patchew.org/QEMU/20231214155333.35643-1-davydov-...@yandex-team.ru/

Please consider rebasing and resending?

--js

Re: [PATCH v8 06/10] hw/fsi: Aspeed APB2OPB interface

2024-01-08 Thread Ninad Palsule


Hello Cedric,

On 12/12/23 08:49, Cédric Le Goater wrote:

On 11/29/23 00:56, Ninad Palsule wrote:

This is a part of patchset where IBM's Flexible Service Interface is
introduced.

An APB-to-OPB bridge enabling access to the OPB from the ARM core in
the AST2600. Hardware limitations prevent the OPB from being directly
mapped into APB, so all accesses are indirect through the bridge.
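
As a small worked example of the register indexing used by the bridge model below: byte offsets are turned into 32-bit word indexes, so the last register at offset 0xe8 gives 59 entries. The DEMO_* names and demo_reg_read() are hypothetical; only the offset arithmetic comes from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset to 32-bit register index; the argument is parenthesized. */
#define DEMO_TO_REG(offset) ((offset) >> 2)

/* Last register sits at byte offset 0xe8: (0xe8 >> 2) + 1 = 59 entries. */
#define DEMO_NR_REGS (DEMO_TO_REG(0xe8) + 1)

/* Hypothetical accessor: read the register backing a byte offset. */
static uint32_t demo_reg_read(const uint32_t *regs, uint32_t offset)
{
    assert(regs != NULL);
    assert((offset & 3) == 0);                  /* word-aligned access */
    assert(DEMO_TO_REG(offset) < DEMO_NR_REGS); /* inside the register file */
    return regs[DEMO_TO_REG(offset)];
}
```

Offset 0x48, for instance, maps to index 18, which is how the designated initializers in the reset table address individual registers.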

Signed-off-by: Andrew Jeffery 
Signed-off-by: Ninad Palsule 
[ clg: - moved FSIMasterState under AspeedAPB2OPBState
    - modified fsi_opb_fsi_master_address() and
  fsi_opb_opb2fsi_address()
    - introduced fsi_aspeed_apb2opb_init()
    - reworked fsi_aspeed_apb2opb_realize() ]
Signed-off-by: Cédric Le Goater 
---
  include/hw/fsi/aspeed-apb2opb.h |  34 
  hw/fsi/aspeed-apb2opb.c | 316 
  hw/arm/Kconfig  |   1 +
  hw/fsi/Kconfig  |   4 +
  hw/fsi/meson.build  |   1 +
  hw/fsi/trace-events |   2 +
  6 files changed, 358 insertions(+)
  create mode 100644 include/hw/fsi/aspeed-apb2opb.h
  create mode 100644 hw/fsi/aspeed-apb2opb.c

diff --git a/include/hw/fsi/aspeed-apb2opb.h 
b/include/hw/fsi/aspeed-apb2opb.h

new file mode 100644
index 0000000000..c51fbeda9f
--- /dev/null
+++ b/include/hw/fsi/aspeed-apb2opb.h
@@ -0,0 +1,34 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * ASPEED APB2OPB Bridge
+ */
+#ifndef FSI_ASPEED_APB2OPB_H
+#define FSI_ASPEED_APB2OPB_H
+
+#include "hw/sysbus.h"
+#include "hw/fsi/opb.h"
+
+#define TYPE_ASPEED_APB2OPB "aspeed.apb2opb"
+OBJECT_DECLARE_SIMPLE_TYPE(AspeedAPB2OPBState, ASPEED_APB2OPB)
+
+#define ASPEED_APB2OPB_NR_REGS ((0xe8 >> 2) + 1)
+
+#define ASPEED_FSI_NUM 2
+
+typedef struct AspeedAPB2OPBState {
+    /*< private >*/
+    SysBusDevice parent_obj;
+
+    /*< public >*/
+    MemoryRegion iomem;
+
+    uint32_t regs[ASPEED_APB2OPB_NR_REGS];
+    qemu_irq irq;
+
+    OPBus opb[ASPEED_FSI_NUM];
+    FSIMasterState fsi[ASPEED_FSI_NUM];
+} AspeedAPB2OPBState;
+
+#endif /* FSI_ASPEED_APB2OPB_H */
diff --git a/hw/fsi/aspeed-apb2opb.c b/hw/fsi/aspeed-apb2opb.c
new file mode 100644
index 0000000000..70b3fe2587
--- /dev/null
+++ b/hw/fsi/aspeed-apb2opb.c
@@ -0,0 +1,316 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * ASPEED APB-OPB FSI interface
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qom/object.h"
+#include "qapi/error.h"
+#include "trace.h"
+
+#include "hw/fsi/aspeed-apb2opb.h"
+#include "hw/qdev-core.h"
+
+#define TO_REG(x) (x >> 2)
+
+#define APB2OPB_VERSION    TO_REG(0x00)
+#define APB2OPB_TRIGGER    TO_REG(0x04)
+
+#define APB2OPB_CONTROL    TO_REG(0x08)
+#define   APB2OPB_CONTROL_OFF  BE_GENMASK(31, 13)
+
+#define APB2OPB_OPB2FSI    TO_REG(0x0c)
+#define   APB2OPB_OPB2FSI_OFF  BE_GENMASK(31, 22)
+
+#define APB2OPB_OPB0_SEL   TO_REG(0x10)
+#define APB2OPB_OPB1_SEL   TO_REG(0x28)
+#define   APB2OPB_OPB_SEL_EN   BIT(0)
+
+#define APB2OPB_OPB0_MODE  TO_REG(0x14)
+#define APB2OPB_OPB1_MODE  TO_REG(0x2c)
+#define   APB2OPB_OPB_MODE_RD  BIT(0)
+
+#define APB2OPB_OPB0_XFER  TO_REG(0x18)
+#define APB2OPB_OPB1_XFER  TO_REG(0x30)
+#define   APB2OPB_OPB_XFER_FULL    BIT(1)
+#define   APB2OPB_OPB_XFER_HALF    BIT(0)
+
+#define APB2OPB_OPB0_ADDR  TO_REG(0x1c)
+#define APB2OPB_OPB0_WRITE_DATA    TO_REG(0x20)
+
+#define APB2OPB_OPB1_ADDR  TO_REG(0x34)
+#define APB2OPB_OPB1_WRITE_DATA  TO_REG(0x38)
+
+#define APB2OPB_IRQ_STS    TO_REG(0x48)
+#define   APB2OPB_IRQ_STS_OPB1_TX_ACK  BIT(17)
+#define   APB2OPB_IRQ_STS_OPB0_TX_ACK  BIT(16)
+
+#define APB2OPB_OPB0_WRITE_WORD_ENDIAN TO_REG(0x4c)
+#define   APB2OPB_OPB0_WRITE_WORD_ENDIAN_BE 0x0011101b
+#define APB2OPB_OPB0_WRITE_BYTE_ENDIAN TO_REG(0x50)
+#define   APB2OPB_OPB0_WRITE_BYTE_ENDIAN_BE 0x0c330f3f
+#define APB2OPB_OPB1_WRITE_WORD_ENDIAN TO_REG(0x54)
+#define APB2OPB_OPB1_WRITE_BYTE_ENDIAN TO_REG(0x58)
+#define APB2OPB_OPB0_READ_BYTE_ENDIAN  TO_REG(0x5c)
+#define APB2OPB_OPB1_READ_BYTE_ENDIAN  TO_REG(0x60)
+#define   APB2OPB_OPB0_READ_WORD_ENDIAN_BE  0x00030b1b
+
+#define APB2OPB_OPB0_READ_DATA TO_REG(0x84)
+#define APB2OPB_OPB1_READ_DATA TO_REG(0x90)
+
+/*
+ * The following magic values came from AST2600 data sheet
+ * The register values are defined under section "FSI controller"
+ * as initial values.
+ */
+static const uint32_t aspeed_apb2opb_reset[ASPEED_APB2OPB_NR_REGS] = {
+ [APB2OPB_VERSION]    = 0x000000a1,
+ [APB2OPB_OPB0_WRITE_WORD_ENDIAN] = 0x0044eee4,
+ [APB2OPB_OPB0_WRITE_BYTE_ENDIAN] = 0x0055aaff,
+ [APB2OPB_OPB1_WRITE_WORD_ENDIAN] =

testing without the translation cache

2024-01-08 Thread Brian Cain

Alex,

A very long time ago QEMU supported disabling the translation cache via 
"-translation no-cache".  That option was deliberately removed.  We are looking 
into a hexagon-specific failure when there's a TB lookup miss from a 
cpu_loop_exit_restore().I'd like to test our fix for this failure and was 
wondering if there's any mechanism to disable the cache.  There's a "-accel 
tcg,tb-size=0" - but this won't accomplish what I'm looking to do - will it?  
If not, is there another way to disable the cache?

-Brian

[PATCH v10 07/10] include/hw/net: GMAC IRQ Implementation

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

Implement Update IRQ Method for GMAC functionality.

Added relevant trace-events
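
The update logic can be modelled in isolation roughly as below. The bit positions and source masks are hypothetical placeholders, not the NPCM register definitions; the sketch only mirrors the structure of gmac_update_irq() in the diff that follows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout, chosen for illustration only. */
#define DEMO_STATUS_NIS       (1u << 16)  /* normal interrupt summary */
#define DEMO_STATUS_AIS       (1u << 15)  /* abnormal interrupt summary */
#define DEMO_NORMAL_SOURCES   0x00000041u /* sources feeding NIS */
#define DEMO_ABNORMAL_SOURCES 0x00000082u /* sources feeding AIS */

/*
 * Fold the enabled, pending sources into the summary bits, then
 * derive the IRQ level from the enabled summary bits.
 */
static int demo_update_irq(uint32_t *status, uint32_t enable)
{
    assert(status != NULL);
    if (*status & enable & DEMO_NORMAL_SOURCES) {
        *status |= DEMO_STATUS_NIS;
    }
    if (*status & enable & DEMO_ABNORMAL_SOURCES) {
        *status |= DEMO_STATUS_AIS;
    }
    return !!(*status & enable & (DEMO_STATUS_NIS | DEMO_STATUS_AIS));
}
```

Note that in this model a pending source sets its summary bit even when the summary interrupt itself is masked; the line is only raised once the summary bit is enabled too.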

Change-Id: I7a2d3cd3f493278bcd0cf483233c1e05c37488b7
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 hw/net/npcm_gmac.c  | 40 
 hw/net/trace-events |  1 +
 2 files changed, 41 insertions(+)

diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c
index 98b3c33c94..44c4ffaff4 100644
--- a/hw/net/npcm_gmac.c
+++ b/hw/net/npcm_gmac.c
@@ -149,6 +149,46 @@ static bool gmac_can_receive(NetClientState *nc)
 return true;
 }
 
+/*
+ * Function that updates the GMAC IRQ
+ * It find the logical OR of the enabled bits for NIS (if enabled)
+ * It find the logical OR of the enabled bits for AIS (if enabled)
+ */
+static void gmac_update_irq(NPCMGMACState *gmac)
+{
+/*
+ * Check if the normal interrupts summary is enabled
+ * if so, add the bits for the summary that are enabled
+ */
+if (gmac->regs[R_NPCM_DMA_INTR_ENA] & gmac->regs[R_NPCM_DMA_STATUS] &
+(NPCM_DMA_INTR_ENAB_NIE_BITS)) {
+gmac->regs[R_NPCM_DMA_STATUS] |=  NPCM_DMA_STATUS_NIS;
+}
+/*
+ * Check if the abnormal interrupts summary is enabled
+ * if so, add the bits for the summary that are enabled
+ */
+if (gmac->regs[R_NPCM_DMA_INTR_ENA] & gmac->regs[R_NPCM_DMA_STATUS] &
+(NPCM_DMA_INTR_ENAB_AIE_BITS)) {
+gmac->regs[R_NPCM_DMA_STATUS] |=  NPCM_DMA_STATUS_AIS;
+}
+
+/* Get the logical OR of both normal and abnormal interrupts */
+int level = !!((gmac->regs[R_NPCM_DMA_STATUS] &
+gmac->regs[R_NPCM_DMA_INTR_ENA] &
+NPCM_DMA_STATUS_NIS) |
+   (gmac->regs[R_NPCM_DMA_STATUS] &
+   gmac->regs[R_NPCM_DMA_INTR_ENA] &
+   NPCM_DMA_STATUS_AIS));
+
+/* Set the IRQ */
+trace_npcm_gmac_update_irq(DEVICE(gmac)->canonical_path,
+   gmac->regs[R_NPCM_DMA_STATUS],
+   gmac->regs[R_NPCM_DMA_INTR_ENA],
+   level);
+qemu_set_irq(gmac->irq, level);
+}
+
 static ssize_t gmac_receive(NetClientState *nc, const uint8_t *buf, size_t len)
 {
 /* Placeholder. Function will be filled in following patches */
diff --git a/hw/net/trace-events b/hw/net/trace-events
index 33514548b8..56057de47f 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -473,6 +473,7 @@ npcm_gmac_reg_write(const char *name, uint64_t offset, 
uint32_t value) "%s: offs
 npcm_gmac_mdio_access(const char *name, uint8_t is_write, uint8_t pa, uint8_t 
gr, uint16_t val) "%s: is_write: %" PRIu8 " pa: %" PRIu8 " gr: %" PRIu8 " val: 
0x%04" PRIx16
 npcm_gmac_reset(const char *name, uint16_t value) "%s: phy_regs[0][1]: 0x%04" 
PRIx16
 npcm_gmac_set_link(bool active) "Set link: active=%u"
+npcm_gmac_update_irq(const char *name, uint32_t status, uint32_t intr_en, int 
level) "%s: Status Reg: 0x%04" PRIX32 " Interrupt Enable Reg: 0x%04" PRIX32 " 
IRQ Set: %d"
 
 # npcm_pcs.c
 npcm_pcs_reg_read(const char *name, uint16_t indirect_access_baes, uint64_t 
offset, uint16_t value) "%s: IND: 0x%02" PRIx16 " offset: 0x%04" PRIx64 " 
value: 0x%04" PRIx16
-- 
2.43.0.472.g3155946c3a-goog

[PATCH v10 08/10] hw/net: GMAC Rx Implementation

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

- Implementation of Receive function for packets
- Implementation for reading and writing from and to descriptors in
  memory for Rx

When RX starts, we need to flush the queued packets so that they
can be received by the GMAC device. Without this, reception does not
work with a TAP NIC device.

When the RX descriptor list is full, the device sets a DMA_STATUS bit for
software to handle. However, there is no way for software to indicate that
it has handled all RX descriptors, so the whole pipeline stalls.

We do something similar to NPCM7XX EMC to handle this case.

1. Return packet size when RX descriptor is full, effectively dropping
these packets in such a case.
2. When software clears RX descriptor full bit, continue receiving
further packets by flushing QEMU packet queue.

Added relevant trace-events

Change-Id: I132aa254a94cda1a586aba2ea33bbfc74ecdb831
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
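Note for readers: the two-step workaround listed in the commit message can be
pictured with the toy model below. It is a sketch under assumptions, not the
device code: SketchRx, sketch_receive and sketch_clear_full are hypothetical
names, and the descriptor ring is reduced to a counter of free descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Toy receive state; every name here is hypothetical. */
typedef struct SketchRx {
    unsigned free_descs;   /* descriptors currently owned by the device */
    bool ring_full;        /* models the "RX descriptor full" status bit */
    unsigned delivered;    /* packets written to guest memory */
    unsigned dropped;      /* packets discarded because the ring was full */
    unsigned flushes;      /* times the backend queue was asked to flush */
} SketchRx;

/*
 * Step 1: with no descriptor available, report the whole packet as
 * consumed. The caller then considers it handled rather than queueing
 * it, which effectively drops the packet.
 */
static ssize_t sketch_receive(SketchRx *rx, size_t len)
{
    assert(rx != NULL);

    if (rx->free_descs == 0) {
        rx->ring_full = true;
        rx->dropped++;
        return (ssize_t)len;
    }
    rx->free_descs--;
    rx->delivered++;
    return (ssize_t)len;
}

/*
 * Step 2: when software clears the full bit after refilling the ring,
 * resume reception by flushing whatever the backend has queued.
 */
static void sketch_clear_full(SketchRx *rx, unsigned refilled)
{
    assert(rx != NULL);

    rx->free_descs += refilled;
    if (rx->ring_full) {
        rx->ring_full = false;
        rx->flushes++;  /* stands in for flushing the QEMU packet queue */
    }
}
```

The point of returning the packet length in the full case is that the sender
never blocks on a ring that software may not refill for a while.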
 hw/net/npcm_gmac.c  | 324 +++-
 hw/net/trace-events |   5 +
 2 files changed, 327 insertions(+), 2 deletions(-)

diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c
index 44c4ffaff4..54c8af3b41 100644
--- a/hw/net/npcm_gmac.c
+++ b/hw/net/npcm_gmac.c
@@ -23,7 +23,11 @@
 #include "hw/registerfields.h"
 #include "hw/net/mii.h"
 #include "hw/net/npcm_gmac.h"
+#include "linux/if_ether.h"
 #include "migration/vmstate.h"
+#include "net/checksum.h"
+#include "net/net.h"
+#include "qemu/cutils.h"
 #include "qemu/log.h"
 #include "qemu/units.h"
 #include "sysemu/dma.h"
@@ -146,6 +150,17 @@ static void gmac_phy_set_link(NPCMGMACState *gmac, bool 
active)
 
 static bool gmac_can_receive(NetClientState *nc)
 {
+NPCMGMACState *gmac = NPCM_GMAC(qemu_get_nic_opaque(nc));
+
+/* If GMAC receive is disabled. */
+if (!(gmac->regs[R_NPCM_GMAC_MAC_CONFIG] & NPCM_GMAC_MAC_CONFIG_RX_EN)) {
+return false;
+}
+
+/* If GMAC DMA RX is stopped. */
+if (!(gmac->regs[R_NPCM_DMA_CONTROL] & NPCM_DMA_CONTROL_START_STOP_RX)) {
+return false;
+}
 return true;
 }
 
@@ -189,12 +204,288 @@ static void gmac_update_irq(NPCMGMACState *gmac)
 qemu_set_irq(gmac->irq, level);
 }
 
-static ssize_t gmac_receive(NetClientState *nc, const uint8_t *buf, size_t len)
+static int gmac_read_rx_desc(dma_addr_t addr, struct NPCMGMACRxDesc *desc)
+{
+if (dma_memory_read(&address_space_memory, addr, desc,
+sizeof(*desc), MEMTXATTRS_UNSPECIFIED)) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to read descriptor @ 0x%"
+  HWADDR_PRIx "\n", __func__, addr);
+return -1;
+}
+desc->rdes0 = le32_to_cpu(desc->rdes0);
+desc->rdes1 = le32_to_cpu(desc->rdes1);
+desc->rdes2 = le32_to_cpu(desc->rdes2);
+desc->rdes3 = le32_to_cpu(desc->rdes3);
+return 0;
+}
+
+static int gmac_write_rx_desc(dma_addr_t addr, struct NPCMGMACRxDesc *desc)
 {
-/* Placeholder. Function will be filled in following patches */
+struct NPCMGMACRxDesc le_desc;
+le_desc.rdes0 = cpu_to_le32(desc->rdes0);
+le_desc.rdes1 = cpu_to_le32(desc->rdes1);
+le_desc.rdes2 = cpu_to_le32(desc->rdes2);
+le_desc.rdes3 = cpu_to_le32(desc->rdes3);
+if (dma_memory_write(&address_space_memory, addr, &le_desc,
+sizeof(le_desc), MEMTXATTRS_UNSPECIFIED)) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to write descriptor @ 0x%"
+  HWADDR_PRIx "\n", __func__, addr);
+return -1;
+}
 return 0;
 }
 
+static int gmac_read_tx_desc(dma_addr_t addr, struct NPCMGMACTxDesc *desc)
+{
+if (dma_memory_read(&address_space_memory, addr, desc,
+sizeof(*desc), MEMTXATTRS_UNSPECIFIED)) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to read descriptor @ 0x%"
+  HWADDR_PRIx "\n", __func__, addr);
+return -1;
+}
+desc->tdes0 = le32_to_cpu(desc->tdes0);
+desc->tdes1 = le32_to_cpu(desc->tdes1);
+desc->tdes2 = le32_to_cpu(desc->tdes2);
+desc->tdes3 = le32_to_cpu(desc->tdes3);
+return 0;
+}
+
+static int gmac_write_tx_desc(dma_addr_t addr, struct NPCMGMACTxDesc *desc)
+{
+struct NPCMGMACTxDesc le_desc;
+le_desc.tdes0 = cpu_to_le32(desc->tdes0);
+le_desc.tdes1 = cpu_to_le32(desc->tdes1);
+le_desc.tdes2 = cpu_to_le32(desc->tdes2);
+le_desc.tdes3 = cpu_to_le32(desc->tdes3);
+if (dma_memory_write(&address_space_memory, addr, &le_desc,
+sizeof(le_desc), MEMTXATTRS_UNSPECIFIED)) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to write descriptor @ 0x%"
+  HWADDR_PRIx "\n", __func__, addr);
+return -1;
+}
+return 0;
+}
+static int gmac_rx_transfer_frame_to_buffer(uint32_t rx_buf_len,
+uint32_t *left_frame,
+uint32_t rx_buf_addr,
+bool *eof_transferred,
+

[PATCH v10 03/10] hw/misc: Add qtest for NPCM7xx PCI Mailbox

2024-01-08 Thread Nabih Estefan

From: Hao Wu 

This patch adds a qtest for the NPCM7XX PCI Mailbox module.
It sends read and write requests to the module, and verifies that
the module contains the correct data after the requests.

Change-Id: I2e1dbaecf8be9ec7eab55cb54f7fdeb0715b8275
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
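Note for readers: the requests this test sends are split into accesses of 8,
4, 2 or 1 bytes, each preceded by a small header. The sketch below models that
splitting and header layout outside QEMU. It is an assumption-laden
illustration: the layout (one op byte, a 64-bit offset in host byte order, one
size byte) is inferred from the write() calls in the test, and all sketch_*
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_OP_READ   0x01
#define SKETCH_OP_WRITE  0x02
#define SKETCH_HDR_LEN   (1 + sizeof(uint64_t) + 1)

/* Pick the largest access width (8, 4, 2 or 1) that still fits in len. */
static uint8_t sketch_chunk_size(size_t len)
{
    assert(len > 0);

    if (len >= 8) {
        return 8;
    } else if (len >= 4) {
        return 4;
    } else if (len >= 2) {
        return 2;
    }
    return 1;
}

/* Fill out[] with op, offset (host byte order) and size; return its length. */
static size_t sketch_encode_header(uint8_t *out, uint8_t op,
                                   uint64_t offset, uint8_t size)
{
    assert(out != NULL);

    out[0] = op;
    memcpy(&out[1], &offset, sizeof(offset));
    out[1 + sizeof(offset)] = size;
    return SKETCH_HDR_LEN;
}

/* Count how many requests a transfer of len bytes is split into. */
static unsigned sketch_count_requests(size_t len)
{
    unsigned n = 0;

    while (len > 0) {
        len -= sketch_chunk_size(len);
        n++;
    }
    return n;
}
```

For example, a 15-byte transfer becomes four requests of 8, 4, 2 and 1 bytes,
with the offset advancing by the chunk size after each one.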
 tests/qtest/meson.build |   1 +
 tests/qtest/npcm7xx_pci_mbox-test.c | 238 
 2 files changed, 239 insertions(+)
 create mode 100644 tests/qtest/npcm7xx_pci_mbox-test.c

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 47dabf91d0..2ac79925f9 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -183,6 +183,7 @@ qtests_sparc64 = \
 qtests_npcm7xx = \
   ['npcm7xx_adc-test',
'npcm7xx_gpio-test',
+   'npcm7xx_pci_mbox-test',
'npcm7xx_pwm-test',
'npcm7xx_rng-test',
'npcm7xx_sdhci-test',
diff --git a/tests/qtest/npcm7xx_pci_mbox-test.c 
b/tests/qtest/npcm7xx_pci_mbox-test.c
new file mode 100644
index 00..24eec18e3c
--- /dev/null
+++ b/tests/qtest/npcm7xx_pci_mbox-test.c
@@ -0,0 +1,238 @@
+/*
+ * QTests for Nuvoton NPCM7xx PCI Mailbox Modules.
+ *
+ * Copyright 2021 Google LLC
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * for more details.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/bitops.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qnum.h"
+#include "libqtest-single.h"
+
+#define PCI_MBOX_BA 0xf0848000
+#define PCI_MBOX_IRQ 8
+
+/* register offset */
+#define PCI_MBOX_STAT   0x00
+#define PCI_MBOX_CTL 0x04
+#define PCI_MBOX_CMD 0x08
+
+#define CODE_OK 0x00
+#define CODE_INVALID_OP 0xa0
+#define CODE_INVALID_SIZE   0xa1
+#define CODE_ERROR  0xff
+
+#define OP_READ 0x01
+#define OP_WRITE 0x02
+#define OP_INVALID  0x41
+
+
+static int sock;
+static int fd;
+
+/*
+ * Create a local TCP socket with any port, then save off the port we got.
+ */
+static in_port_t open_socket(void)
+{
+struct sockaddr_in myaddr;
+socklen_t addrlen;
+
+myaddr.sin_family = AF_INET;
+myaddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+myaddr.sin_port = 0;
+sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
+g_assert(sock != -1);
+g_assert(bind(sock, (struct sockaddr *) &myaddr, sizeof(myaddr)) != -1);
+addrlen = sizeof(myaddr);
+g_assert(getsockname(sock, (struct sockaddr *) &myaddr, &addrlen) != -1);
+g_assert(listen(sock, 1) != -1);
+return ntohs(myaddr.sin_port);
+}
+
+static void setup_fd(void)
+{
+fd_set readfds;
+
+FD_ZERO(&readfds);
+FD_SET(sock, &readfds);
+g_assert(select(sock + 1, &readfds, NULL, NULL, NULL) == 1);
+
+fd = accept(sock, NULL, 0);
+g_assert(fd >= 0);
+}
+
+static uint8_t read_response(uint8_t *buf, size_t len)
+{
+uint8_t code;
+ssize_t ret = read(fd, &code, 1);
+
+if (ret == -1) {
+return CODE_ERROR;
+}
+if (code != CODE_OK) {
+return code;
+}
+g_test_message("response code: %x", code);
+if (len > 0) {
+ret = read(fd, buf, len);
+if (ret < len) {
+return CODE_ERROR;
+}
+}
+return CODE_OK;
+}
+
+static void receive_data(uint64_t offset, uint8_t *buf, size_t len)
+{
+uint8_t op = OP_READ;
+uint8_t code;
+ssize_t rv;
+
+while (len > 0) {
+uint8_t size;
+
+if (len >= 8) {
+size = 8;
+} else if (len >= 4) {
+size = 4;
+} else if (len >= 2) {
+size = 2;
+} else {
+size = 1;
+}
+
+g_test_message("receiving %u bytes", size);
+/* Write op */
+rv = write(fd, &op, 1);
+g_assert_cmpint(rv, ==, 1);
+/* Write offset */
+rv = write(fd, (uint8_t *)&offset, sizeof(uint64_t));
+g_assert_cmpint(rv, ==, sizeof(uint64_t));
+/* Write size */
+g_assert_cmpint(write(fd, &size, 1), ==, 1);
+
+/* Read data and Expect response */
+code = read_response(buf, size);
+g_assert_cmphex(code, ==, CODE_OK);
+
+buf += size;
+offset += size;
+len -= size;
+}
+}
+
+static void send_data(uint64_t offset, const uint8_t *buf, size_t len)
+{
+uint8_t op = OP_WRITE;
+uint8_t code;
+ssize_t rv;
+
+while (len > 0) {
+uint8_t size;
+
+if (len >= 8) {
+size = 8;
+} else if (len >= 4) {
+size = 4;
+} else if (len >= 2) {
+size = 2;
+} else {
+size = 1;
+}
+
+

[PATCH v10 04/10] hw/net: Add NPCMXXX GMAC device

2024-01-08 Thread Nabih Estefan

From: Hao Wu 

This patch implements the basic registers of GMAC device and sets
registers for networking functionalities.

Tested:
The following message shows up with the change:
Broadcom BCM54612E stmmac-0:00: attached PHY driver [Broadcom BCM54612E] 
(mii_bus:phy_addr=stmmac-0:00, irq=POLL)
stmmaceth f0802000.eth eth0: Link is Up - 1Gbps/Full - flow control rx/tx

Change-Id: If71c6d486b95edcccba109ba454870714d7e0940
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan Diaz 
Reviewed-by: Tyrone Ting 
---
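Note for readers: the MII address register fields defined in this patch (BUSY
at bit 0, WRITE at bit 1, a 5-bit register number at bit 6 and a 5-bit PHY
address at bit 11) can be decoded as in the sketch below. It is a standalone
illustration, not the device code: sketch_extract16 is a hypothetical stand-in
for the bit-extraction helper used by the macros, and SketchMiiAddr is an
invented type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return `length` bits of `value`, starting at bit `start`. */
static uint16_t sketch_extract16(uint16_t value, unsigned start,
                                 unsigned length)
{
    assert(length > 0 && start + length <= 16);
    return (uint16_t)((value >> start) & ((1u << length) - 1u));
}

/* Decoded view of the MII address register (hypothetical type). */
typedef struct SketchMiiAddr {
    bool busy;    /* bit 0 */
    bool write;   /* bit 1 */
    uint8_t reg;  /* bits 6..10: PHY register number (GR) */
    uint8_t phy;  /* bits 11..15: PHY address (PA) */
} SketchMiiAddr;

static SketchMiiAddr sketch_decode_mii_addr(uint32_t rv)
{
    SketchMiiAddr a = {
        .busy  = (rv & (1u << 0)) != 0,
        .write = (rv & (1u << 1)) != 0,
        .reg   = (uint8_t)sketch_extract16((uint16_t)rv, 6, 5),
        .phy   = (uint8_t)sketch_extract16((uint16_t)rv, 11, 5),
    };
    return a;
}
```

A guest write of (3 << 11) | (2 << 6) | 0x3, for instance, decodes as a busy
write to register 2 of the PHY at address 3.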
 hw/net/meson.build |   2 +-
 hw/net/npcm_gmac.c | 424 +
 hw/net/trace-events|  11 +
 include/hw/net/npcm_gmac.h | 340 +
 4 files changed, 776 insertions(+), 1 deletion(-)
 create mode 100644 hw/net/npcm_gmac.c
 create mode 100644 include/hw/net/npcm_gmac.h

diff --git a/hw/net/meson.build b/hw/net/meson.build
index f64651c467..db6509f504 100644
--- a/hw/net/meson.build
+++ b/hw/net/meson.build
@@ -38,7 +38,7 @@ system_ss.add(when: 'CONFIG_I82596_COMMON', if_true: 
files('i82596.c'))
 system_ss.add(when: 'CONFIG_SUNHME', if_true: files('sunhme.c'))
 system_ss.add(when: 'CONFIG_FTGMAC100', if_true: files('ftgmac100.c'))
 system_ss.add(when: 'CONFIG_SUNGEM', if_true: files('sungem.c'))
-system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_emc.c'))
+system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_emc.c', 
'npcm_gmac.c'))
 
 system_ss.add(when: 'CONFIG_ETRAXFS', if_true: files('etraxfs_eth.c'))
 system_ss.add(when: 'CONFIG_COLDFIRE', if_true: files('mcf_fec.c'))
diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c
new file mode 100644
index 00..98b3c33c94
--- /dev/null
+++ b/hw/net/npcm_gmac.c
@@ -0,0 +1,424 @@
+/*
+ * Nuvoton NPCM7xx/8xx GMAC Module
+ *
+ * Copyright 2022 Google LLC
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * for more details.
+ *
+ * Unsupported/unimplemented features:
+ * - MII is not implemented, MII_ADDR.BUSY and MII_DATA always return zero
+ * - Precision timestamp (PTP) is not implemented.
+ */
+
+#include "qemu/osdep.h"
+
+#include "hw/registerfields.h"
+#include "hw/net/mii.h"
+#include "hw/net/npcm_gmac.h"
+#include "migration/vmstate.h"
+#include "qemu/log.h"
+#include "qemu/units.h"
+#include "sysemu/dma.h"
+#include "trace.h"
+
+REG32(NPCM_DMA_BUS_MODE, 0x1000)
+REG32(NPCM_DMA_XMT_POLL_DEMAND, 0x1004)
+REG32(NPCM_DMA_RCV_POLL_DEMAND, 0x1008)
+REG32(NPCM_DMA_RX_BASE_ADDR, 0x100c)
+REG32(NPCM_DMA_TX_BASE_ADDR, 0x1010)
+REG32(NPCM_DMA_STATUS, 0x1014)
+REG32(NPCM_DMA_CONTROL, 0x1018)
+REG32(NPCM_DMA_INTR_ENA, 0x101c)
+REG32(NPCM_DMA_MISSED_FRAME_CTR, 0x1020)
+REG32(NPCM_DMA_HOST_TX_DESC, 0x1048)
+REG32(NPCM_DMA_HOST_RX_DESC, 0x104c)
+REG32(NPCM_DMA_CUR_TX_BUF_ADDR, 0x1050)
+REG32(NPCM_DMA_CUR_RX_BUF_ADDR, 0x1054)
+REG32(NPCM_DMA_HW_FEATURE, 0x1058)
+
+REG32(NPCM_GMAC_MAC_CONFIG, 0x0)
+REG32(NPCM_GMAC_FRAME_FILTER, 0x4)
+REG32(NPCM_GMAC_HASH_HIGH, 0x8)
+REG32(NPCM_GMAC_HASH_LOW, 0xc)
+REG32(NPCM_GMAC_MII_ADDR, 0x10)
+REG32(NPCM_GMAC_MII_DATA, 0x14)
+REG32(NPCM_GMAC_FLOW_CTRL, 0x18)
+REG32(NPCM_GMAC_VLAN_FLAG, 0x1c)
+REG32(NPCM_GMAC_VERSION, 0x20)
+REG32(NPCM_GMAC_WAKEUP_FILTER, 0x28)
+REG32(NPCM_GMAC_PMT, 0x2c)
+REG32(NPCM_GMAC_LPI_CTRL, 0x30)
+REG32(NPCM_GMAC_TIMER_CTRL, 0x34)
+REG32(NPCM_GMAC_INT_STATUS, 0x38)
+REG32(NPCM_GMAC_INT_MASK, 0x3c)
+REG32(NPCM_GMAC_MAC0_ADDR_HI, 0x40)
+REG32(NPCM_GMAC_MAC0_ADDR_LO, 0x44)
+REG32(NPCM_GMAC_MAC1_ADDR_HI, 0x48)
+REG32(NPCM_GMAC_MAC1_ADDR_LO, 0x4c)
+REG32(NPCM_GMAC_MAC2_ADDR_HI, 0x50)
+REG32(NPCM_GMAC_MAC2_ADDR_LO, 0x54)
+REG32(NPCM_GMAC_MAC3_ADDR_HI, 0x58)
+REG32(NPCM_GMAC_MAC3_ADDR_LO, 0x5c)
+REG32(NPCM_GMAC_RGMII_STATUS, 0xd8)
+REG32(NPCM_GMAC_WATCHDOG, 0xdc)
+REG32(NPCM_GMAC_PTP_TCR, 0x700)
+REG32(NPCM_GMAC_PTP_SSIR, 0x704)
+REG32(NPCM_GMAC_PTP_STSR, 0x708)
+REG32(NPCM_GMAC_PTP_STNSR, 0x70c)
+REG32(NPCM_GMAC_PTP_STSUR, 0x710)
+REG32(NPCM_GMAC_PTP_STNSUR, 0x714)
+REG32(NPCM_GMAC_PTP_TAR, 0x718)
+REG32(NPCM_GMAC_PTP_TTSR, 0x71c)
+
+/* Register Fields */
+#define NPCM_GMAC_MII_ADDR_BUSY BIT(0)
+#define NPCM_GMAC_MII_ADDR_WRITE BIT(1)
+#define NPCM_GMAC_MII_ADDR_GR(rv)   extract16((rv), 6, 5)
+#define NPCM_GMAC_MII_ADDR_PA(rv)   extract16((rv), 11, 5)
+
+#define NPCM_GMAC_INT_MASK_LPIIM BIT(10)
+#define NPCM_GMAC_INT_MASK_PMTM BIT(3)
+#define NPCM_GMAC_INT_MASK_RGIM BIT(0)
+
+#define NPCM_DMA_BUS_MODE_SWR   BIT(0)
+
+static const uint32_t npcm_gmac_cold_reset_values[NPCM_GMAC_NR_REGS] = {
+/*

[PATCH v10 02/10] hw/arm: Add PCI mailbox module to Nuvoton SoC

2024-01-08 Thread Nabih Estefan

From: Hao Wu 

This patch wires the PCI mailbox module to Nuvoton SoC.

Change-Id: I14c42c628258804030f0583889882842bde0d972
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 docs/system/arm/nuvoton.rst | 2 ++
 hw/arm/npcm7xx.c| 2 ++
 include/hw/arm/npcm7xx.h| 1 +
 3 files changed, 5 insertions(+)

diff --git a/docs/system/arm/nuvoton.rst b/docs/system/arm/nuvoton.rst
index 0424cae4b0..e611099545 100644
--- a/docs/system/arm/nuvoton.rst
+++ b/docs/system/arm/nuvoton.rst
@@ -50,6 +50,8 @@ Supported devices
  * Ethernet controller (EMC)
  * Tachometer
  * Peripheral SPI controller (PSPI)
+ * BIOS POST code FIFO
+ * PCI Mailbox
 
 Missing devices
 ---
diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index 1c3634ff45..c9e87162cb 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -462,6 +462,8 @@ static void npcm7xx_init(Object *obj)
 object_initialize_child(obj, "pspi[*]", &s->pspi[i], TYPE_NPCM_PSPI);
 }
 
+object_initialize_child(obj, "pci-mbox", &s->pci_mbox,
+TYPE_NPCM7XX_PCI_MBOX);
 object_initialize_child(obj, "mmc", &s->mmc, TYPE_NPCM7XX_SDHCI);
 }
 
diff --git a/include/hw/arm/npcm7xx.h b/include/hw/arm/npcm7xx.h
index 273090ac60..cec3792a2e 100644
--- a/include/hw/arm/npcm7xx.h
+++ b/include/hw/arm/npcm7xx.h
@@ -105,6 +105,7 @@ struct NPCM7xxState {
 OHCISysBusState ohci;
 NPCM7xxFIUState fiu[2];
 NPCM7xxEMCState emc[2];
+NPCM7xxPCIMBoxState pci_mbox;
 NPCM7xxSDHCIState   mmc;
 NPCMPSPIState   pspi[2];
 };
-- 
2.43.0.472.g3155946c3a-goog

[PATCH v10 06/10] tests/qtest: Creating qtest for GMAC Module

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

 - Created qtest to check initialization of registers in GMAC Module.
 - Implemented test into Build File.

Change-Id: I8b2fe152d3987a7eec4cf6a1d25ba92e75a5391d
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 tests/qtest/meson.build  |   1 +
 tests/qtest/npcm_gmac-test.c | 209 +++
 2 files changed, 210 insertions(+)
 create mode 100644 tests/qtest/npcm_gmac-test.c

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 2ac79925f9..aed8924be9 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -221,6 +221,7 @@ qtests_aarch64 = \
   (config_all_devices.has_key('CONFIG_RASPI') ? ['bcm2835-dma-test'] : []) +  \
   (config_all.has_key('CONFIG_TCG') and
\
config_all_devices.has_key('CONFIG_TPM_TIS_I2C') ? ['tpm-tis-i2c-test'] : 
[]) + \
+  (config_all_devices.has_key('CONFIG_NPCM7XX') ? qtests_npcm7xx : []) + \
   ['arm-cpu-features',
'numa-test',
'boot-serial-test',
diff --git a/tests/qtest/npcm_gmac-test.c b/tests/qtest/npcm_gmac-test.c
new file mode 100644
index 00..130a1599a8
--- /dev/null
+++ b/tests/qtest/npcm_gmac-test.c
@@ -0,0 +1,209 @@
+/*
+ * QTests for Nuvoton NPCM7xx/8xx GMAC Modules.
+ *
+ * Copyright 2023 Google LLC
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * for more details.
+ */
+
+#include "qemu/osdep.h"
+#include "libqos/libqos.h"
+
+/* Name of the GMAC Device */
+#define TYPE_NPCM_GMAC "npcm-gmac"
+
+typedef struct GMACModule {
+int irq;
+uint64_t base_addr;
+} GMACModule;
+
+typedef struct TestData {
+const GMACModule *module;
+} TestData;
+
+/* Values extracted from hw/arm/npcm8xx.c */
+static const GMACModule gmac_module_list[] = {
+{
+.irq= 14,
+.base_addr  = 0xf0802000
+},
+{
+.irq= 15,
+.base_addr  = 0xf0804000
+},
+{
+.irq= 16,
+.base_addr  = 0xf0806000
+},
+{
+.irq= 17,
+.base_addr  = 0xf0808000
+}
+};
+
+/* Returns the index of the GMAC module. */
+static int gmac_module_index(const GMACModule *mod)
+{
+ptrdiff_t diff = mod - gmac_module_list;
+
+g_assert_true(diff >= 0 && diff < ARRAY_SIZE(gmac_module_list));
+
+return diff;
+}
+
+/* 32-bit register indices. Taken from npcm_gmac.c */
+typedef enum NPCMRegister {
+/* DMA Registers */
+NPCM_DMA_BUS_MODE = 0x1000,
+NPCM_DMA_XMT_POLL_DEMAND = 0x1004,
+NPCM_DMA_RCV_POLL_DEMAND = 0x1008,
+NPCM_DMA_RCV_BASE_ADDR = 0x100c,
+NPCM_DMA_TX_BASE_ADDR = 0x1010,
+NPCM_DMA_STATUS = 0x1014,
+NPCM_DMA_CONTROL = 0x1018,
+NPCM_DMA_INTR_ENA = 0x101c,
+NPCM_DMA_MISSED_FRAME_CTR = 0x1020,
+NPCM_DMA_HOST_TX_DESC = 0x1048,
+NPCM_DMA_HOST_RX_DESC = 0x104c,
+NPCM_DMA_CUR_TX_BUF_ADDR = 0x1050,
+NPCM_DMA_CUR_RX_BUF_ADDR = 0x1054,
+NPCM_DMA_HW_FEATURE = 0x1058,
+
+/* GMAC Registers */
+NPCM_GMAC_MAC_CONFIG = 0x0,
+NPCM_GMAC_FRAME_FILTER = 0x4,
+NPCM_GMAC_HASH_HIGH = 0x8,
+NPCM_GMAC_HASH_LOW = 0xc,
+NPCM_GMAC_MII_ADDR = 0x10,
+NPCM_GMAC_MII_DATA = 0x14,
+NPCM_GMAC_FLOW_CTRL = 0x18,
+NPCM_GMAC_VLAN_FLAG = 0x1c,
+NPCM_GMAC_VERSION = 0x20,
+NPCM_GMAC_WAKEUP_FILTER = 0x28,
+NPCM_GMAC_PMT = 0x2c,
+NPCM_GMAC_LPI_CTRL = 0x30,
+NPCM_GMAC_TIMER_CTRL = 0x34,
+NPCM_GMAC_INT_STATUS = 0x38,
+NPCM_GMAC_INT_MASK = 0x3c,
+NPCM_GMAC_MAC0_ADDR_HI = 0x40,
+NPCM_GMAC_MAC0_ADDR_LO = 0x44,
+NPCM_GMAC_MAC1_ADDR_HI = 0x48,
+NPCM_GMAC_MAC1_ADDR_LO = 0x4c,
+NPCM_GMAC_MAC2_ADDR_HI = 0x50,
+NPCM_GMAC_MAC2_ADDR_LO = 0x54,
+NPCM_GMAC_MAC3_ADDR_HI = 0x58,
+NPCM_GMAC_MAC3_ADDR_LO = 0x5c,
+NPCM_GMAC_RGMII_STATUS = 0xd8,
+NPCM_GMAC_WATCHDOG = 0xdc,
+NPCM_GMAC_PTP_TCR = 0x700,
+NPCM_GMAC_PTP_SSIR = 0x704,
+NPCM_GMAC_PTP_STSR = 0x708,
+NPCM_GMAC_PTP_STNSR = 0x70c,
+NPCM_GMAC_PTP_STSUR = 0x710,
+NPCM_GMAC_PTP_STNSUR = 0x714,
+NPCM_GMAC_PTP_TAR = 0x718,
+NPCM_GMAC_PTP_TTSR = 0x71c,
+} NPCMRegister;
+
+static uint32_t gmac_read(QTestState *qts, const GMACModule *mod,
+  NPCMRegister regno)
+{
+return qtest_readl(qts, mod->base_addr + regno);
+}
+
+/* Check that GMAC registers are reset to default value */
+static void test_init(gconstpointer test_data)
+{
+const TestData *td = test_data;
+const GMACModule *mod = td->module;
+QTestState *qts = qtest_init("-machine npcm845-evb");
+
+#define CHECK_REG32(regno,

[PATCH v10 09/10] hw/net: GMAC Tx Implementation

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

- Implementation of Transmit function for packets
- Implementation for reading and writing from and to descriptors in
  memory for Tx

Added relevant trace-events

NOTE: This function implements the steps detailed in the datasheet for
transmitting messages from the GMAC.

Change-Id: Icf14f9fcc6cc7808a41acd872bca67c9832087e6
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
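Note for readers: a transmit frame may span several descriptor buffers, so the
send buffer has to grow as segments are gathered. The sketch below shows one
way to do that with an explicitly tracked capacity, since sizeof applied to a
pointer yields the pointer size rather than the size of the allocation. It is
an illustration only: SketchTxBuf and sketch_tx_append are hypothetical names,
and plain realloc() stands in for the allocator the device code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Growable gather buffer (hypothetical type). */
typedef struct SketchTxBuf {
    uint8_t *data;
    size_t len;   /* bytes gathered so far */
    size_t cap;   /* bytes currently allocated */
} SketchTxBuf;

/*
 * Append one descriptor buffer to the frame, growing the allocation when
 * the new segment would not fit. Returns 0 on success and -1 if the
 * allocation fails, in which case the buffer is left unchanged.
 */
static int sketch_tx_append(SketchTxBuf *tx, const uint8_t *seg,
                            size_t seg_len)
{
    assert(tx != NULL);

    if (tx->len + seg_len > tx->cap) {
        size_t new_cap = tx->len + seg_len;
        uint8_t *p = realloc(tx->data, new_cap);

        if (p == NULL) {
            return -1;
        }
        tx->data = p;
        tx->cap = new_cap;
    }
    if (seg_len > 0) {
        memcpy(tx->data + tx->len, seg, seg_len);
    }
    tx->len += seg_len;
    return 0;
}
```

Starting from a zero-initialised SketchTxBuf, each call appends at the current
length, so the gathered frame is contiguous once the last segment is added.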
 hw/net/npcm_gmac.c  | 155 
 hw/net/trace-events |   2 +
 2 files changed, 157 insertions(+)

diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c
index 54c8af3b41..8e91e61617 100644
--- a/hw/net/npcm_gmac.c
+++ b/hw/net/npcm_gmac.c
@@ -265,6 +265,7 @@ static int gmac_write_tx_desc(dma_addr_t addr, struct 
NPCMGMACTxDesc *desc)
 }
 return 0;
 }
+
 static int gmac_rx_transfer_frame_to_buffer(uint32_t rx_buf_len,
 uint32_t *left_frame,
 uint32_t rx_buf_addr,
@@ -486,6 +487,155 @@ static ssize_t gmac_receive(NetClientState *nc, const 
uint8_t *buf, size_t len)
 return len;
 }
 
+static int gmac_tx_get_csum(uint32_t tdes1)
+{
+uint32_t mask = TX_DESC_TDES1_CHKSM_INS_CTRL_MASK(tdes1);
+int csum = 0;
+
+if (likely(mask > 0)) {
+csum |= CSUM_IP;
+}
+if (likely(mask > 1)) {
+csum |= CSUM_TCP | CSUM_UDP;
+}
+
+return csum;
+}
+
+static void gmac_try_send_next_packet(NPCMGMACState *gmac)
+{
+/*
+ * Comments about steps refer to steps for
+ * transmitting in page 384 of datasheet
+ */
+uint16_t tx_buffer_size = 2048;
+g_autofree uint8_t *tx_send_buffer = g_malloc(tx_buffer_size);
+uint32_t desc_addr;
+struct NPCMGMACTxDesc tx_desc;
+uint32_t tx_buf_addr, tx_buf_len;
+uint16_t length = 0;
+uint8_t *buf = tx_send_buffer;
+uint32_t prev_buf_size = 0;
+int csum = 0;
+
+/* steps 1&2 */
+if (!gmac->regs[R_NPCM_DMA_HOST_TX_DESC]) {
+gmac->regs[R_NPCM_DMA_HOST_TX_DESC] =
+NPCM_DMA_HOST_TX_DESC_MASK(gmac->regs[R_NPCM_DMA_TX_BASE_ADDR]);
+}
+desc_addr = gmac->regs[R_NPCM_DMA_HOST_TX_DESC];
+
+while (true) {
+gmac_dma_set_state(gmac, NPCM_DMA_STATUS_TX_PROCESS_STATE_SHIFT,
+NPCM_DMA_STATUS_TX_RUNNING_FETCHING_STATE);
+if (gmac_read_tx_desc(desc_addr, &tx_desc)) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "TX Descriptor @ 0x%x can't be read\n",
+  desc_addr);
+return;
+}
+/* step 3 */
+
+trace_npcm_gmac_packet_desc_read(DEVICE(gmac)->canonical_path,
+desc_addr);
+trace_npcm_gmac_debug_desc_data(DEVICE(gmac)->canonical_path, &tx_desc,
+tx_desc.tdes0, tx_desc.tdes1, tx_desc.tdes2, tx_desc.tdes3);
+
+/* 1 = DMA Owned, 0 = Software Owned */
+if (!(tx_desc.tdes0 & TX_DESC_TDES0_OWN)) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "TX Descriptor @ 0x%x is owned by software\n",
+  desc_addr);
+gmac->regs[R_NPCM_DMA_STATUS] |= NPCM_DMA_STATUS_TU;
+gmac_dma_set_state(gmac, NPCM_DMA_STATUS_TX_PROCESS_STATE_SHIFT,
+NPCM_DMA_STATUS_TX_SUSPENDED_STATE);
+gmac_update_irq(gmac);
+return;
+}
+
+gmac_dma_set_state(gmac, NPCM_DMA_STATUS_TX_PROCESS_STATE_SHIFT,
+NPCM_DMA_STATUS_TX_RUNNING_READ_STATE);
+/* Give the descriptor back regardless of what happens. */
+tx_desc.tdes0 &= ~TX_DESC_TDES0_OWN;
+
+if (tx_desc.tdes1 & TX_DESC_TDES1_FIRST_SEG_MASK) {
+csum = gmac_tx_get_csum(tx_desc.tdes1);
+}
+
+/* step 4 */
+tx_buf_addr = tx_desc.tdes2;
+gmac->regs[R_NPCM_DMA_CUR_TX_BUF_ADDR] = tx_buf_addr;
+tx_buf_len = TX_DESC_TDES1_BFFR1_SZ_MASK(tx_desc.tdes1);
+buf = &tx_send_buffer[prev_buf_size];
+
+if ((prev_buf_size + tx_buf_len) > tx_buffer_size) {
+tx_buffer_size = prev_buf_size + tx_buf_len;
+tx_send_buffer = g_realloc(tx_send_buffer, tx_buffer_size);
+buf = &tx_send_buffer[prev_buf_size];
+}
+
+/* step 5 */
+if (dma_memory_read(&address_space_memory, tx_buf_addr, buf,
+tx_buf_len, MEMTXATTRS_UNSPECIFIED)) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to read packet @ 0x%x\n",
+__func__, tx_buf_addr);
+return;
+}
+length += tx_buf_len;
+prev_buf_size += tx_buf_len;
+
+/* If not chained we'll have a second buffer. */
+if (!(tx_desc.tdes1 & TX_DESC_TDES1_SEC_ADDR_CHND_MASK)) {
+tx_buf_addr = tx_desc.tdes3;
+gmac->regs[R_NPCM_DMA_CUR_TX_BUF_ADDR] = tx_buf_addr;
+tx_buf_len = TX_DESC_TDES1_BFFR2_SZ_MASK(tx_desc.tdes1);
+buf = &tx_send_buffer[prev_buf_size];
+
+if

[PATCH v10 05/10] hw/arm: Add GMAC devices to NPCM7XX SoC

2024-01-08 Thread Nabih Estefan

From: Hao Wu 

Change-Id: Id8a3461fb5042adc4c3fd6f4fbd1ca0d33e22565
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 hw/arm/npcm7xx.c | 36 ++--
 include/hw/arm/npcm7xx.h |  2 ++
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index c9e87162cb..12e11250e1 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -91,6 +91,7 @@ enum NPCM7xxInterrupt {
 NPCM7XX_GMAC1_IRQ   = 14,
 NPCM7XX_EMC1RX_IRQ  = 15,
 NPCM7XX_EMC1TX_IRQ,
+NPCM7XX_GMAC2_IRQ,
 NPCM7XX_MMC_IRQ = 26,
 NPCM7XX_PSPI2_IRQ   = 28,
 NPCM7XX_PSPI1_IRQ   = 31,
@@ -234,6 +235,12 @@ static const hwaddr npcm7xx_pspi_addr[] = {
 0xf0201000,
 };
 
+/* Register base address for each GMAC Module */
+static const hwaddr npcm7xx_gmac_addr[] = {
+0xf0802000,
+0xf0804000,
+};
+
 static const struct {
 hwaddr regs_addr;
 uint32_t unconnected_pins;
@@ -462,6 +469,10 @@ static void npcm7xx_init(Object *obj)
 object_initialize_child(obj, "pspi[*]", &s->pspi[i], TYPE_NPCM_PSPI);
 }
 
+for (i = 0; i < ARRAY_SIZE(s->gmac); i++) {
+object_initialize_child(obj, "gmac[*]", &s->gmac[i], TYPE_NPCM_GMAC);
+}
+
 object_initialize_child(obj, "pci-mbox", &s->pci_mbox,
 TYPE_NPCM7XX_PCI_MBOX);
 object_initialize_child(obj, "mmc", &s->mmc, TYPE_NPCM7XX_SDHCI);
@@ -695,6 +706,29 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
 sysbus_connect_irq(sbd, 1, npcm7xx_irq(s, rx_irq));
 }
 
+/*
+ * GMAC Modules. Cannot fail.
+ */
+QEMU_BUILD_BUG_ON(ARRAY_SIZE(npcm7xx_gmac_addr) != ARRAY_SIZE(s->gmac));
+QEMU_BUILD_BUG_ON(ARRAY_SIZE(s->gmac) != 2);
+for (i = 0; i < ARRAY_SIZE(s->gmac); i++) {
+SysBusDevice *sbd = SYS_BUS_DEVICE(&s->gmac[i]);
+
+/*
+ * The device exists regardless of whether it's connected to a QEMU
+ * netdev backend. So always instantiate it even if there is no
+ * backend.
+ */
+sysbus_realize(sbd, &error_abort);
+sysbus_mmio_map(sbd, 0, npcm7xx_gmac_addr[i]);
+int irq = i == 0 ? NPCM7XX_GMAC1_IRQ : NPCM7XX_GMAC2_IRQ;
+/*
+ * N.B. The values for the second argument sysbus_connect_irq are
+ * chosen to match the registration order in npcm7xx_emc_realize.
+ */
+sysbus_connect_irq(sbd, 0, npcm7xx_irq(s, irq));
+}
+
 /*
  * Flash Interface Unit (FIU). Can fail if incorrect number of chip selects
  * specified, but this is a programming error.
@@ -765,8 +799,6 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
 create_unimplemented_device("npcm7xx.siox[2]",  0xf0102000,   4 * KiB);
 create_unimplemented_device("npcm7xx.ahbpci",   0xf040,   1 * MiB);
 create_unimplemented_device("npcm7xx.mcphy",0xf05f,  64 * KiB);
-create_unimplemented_device("npcm7xx.gmac1",0xf0802000,   8 * KiB);
-create_unimplemented_device("npcm7xx.gmac2",0xf0804000,   8 * KiB);
 create_unimplemented_device("npcm7xx.vcd",  0xf081,  64 * KiB);
 create_unimplemented_device("npcm7xx.ece",  0xf082,   8 * KiB);
 create_unimplemented_device("npcm7xx.vdma", 0xf0822000,   8 * KiB);
diff --git a/include/hw/arm/npcm7xx.h b/include/hw/arm/npcm7xx.h
index cec3792a2e..9e5cf639a2 100644
--- a/include/hw/arm/npcm7xx.h
+++ b/include/hw/arm/npcm7xx.h
@@ -30,6 +30,7 @@
 #include "hw/misc/npcm7xx_pwm.h"
 #include "hw/misc/npcm7xx_rng.h"
 #include "hw/net/npcm7xx_emc.h"
+#include "hw/net/npcm_gmac.h"
 #include "hw/nvram/npcm7xx_otp.h"
 #include "hw/timer/npcm7xx_timer.h"
 #include "hw/ssi/npcm7xx_fiu.h"
@@ -105,6 +106,7 @@ struct NPCM7xxState {
 OHCISysBusState ohci;
 NPCM7xxFIUState fiu[2];
 NPCM7xxEMCState emc[2];
+NPCMGMACState   gmac[2];
 NPCM7xxPCIMBoxState pci_mbox;
 NPCM7xxSDHCIState   mmc;
 NPCMPSPIState   pspi[2];
-- 
2.43.0.472.g3155946c3a-goog

[PATCH v10 01/10] hw/misc: Add Nuvoton's PCI Mailbox Module

2024-01-08 Thread Nabih Estefan

From: Hao Wu 

The PCI Mailbox Module is a high-bandwidth communication module
between a Nuvoton BMC and CPU. It features 16KB of RAM that is
accessible by both the BMC and the core CPU, and supports interrupts
for both sides.

This patch implements the BMC side of the PCI mailbox module.
Communication with the core CPU is emulated via a chardev and
will be in a follow-up patch.

Change-Id: Iaca22f81c4526927d437aa367079ed038faf43f2
Signed-off-by: Hao Wu 
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 hw/arm/npcm7xx.c   |  15 +-
 hw/misc/meson.build|   1 +
 hw/misc/npcm7xx_pci_mbox.c | 324 +
 hw/misc/trace-events   |   5 +
 include/hw/arm/npcm7xx.h   |   1 +
 include/hw/misc/npcm7xx_pci_mbox.h |  81 
 6 files changed, 426 insertions(+), 1 deletion(-)
 create mode 100644 hw/misc/npcm7xx_pci_mbox.c
 create mode 100644 include/hw/misc/npcm7xx_pci_mbox.h

diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index 15ff21d047..1c3634ff45 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -53,6 +53,9 @@
 /* ADC Module */
 #define NPCM7XX_ADC_BA  (0xf000c000)
 
+/* PCI Mailbox Module */
+#define NPCM7XX_PCI_MBOX_BA (0xf0848000)
+
 /* Internal AHB SRAM */
 #define NPCM7XX_RAM3_BA (0xc0008000)
 #define NPCM7XX_RAM3_SZ (4 * KiB)
@@ -83,6 +86,9 @@ enum NPCM7xxInterrupt {
 NPCM7XX_UART1_IRQ,
 NPCM7XX_UART2_IRQ,
 NPCM7XX_UART3_IRQ,
+NPCM7XX_PCI_MBOX_IRQ= 8,
+NPCM7XX_KCS_HIB_IRQ = 9,
+NPCM7XX_GMAC1_IRQ   = 14,
 NPCM7XX_EMC1RX_IRQ  = 15,
 NPCM7XX_EMC1TX_IRQ,
 NPCM7XX_MMC_IRQ = 26,
@@ -706,6 +712,14 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
 }
 }
 
+/* PCI Mailbox. Cannot fail */
+sysbus_realize(SYS_BUS_DEVICE(&s->pci_mbox), &error_abort);
+sysbus_mmio_map(SYS_BUS_DEVICE(&s->pci_mbox), 0, NPCM7XX_PCI_MBOX_BA);
+sysbus_mmio_map(SYS_BUS_DEVICE(&s->pci_mbox), 1,
+NPCM7XX_PCI_MBOX_BA + NPCM7XX_PCI_MBOX_RAM_SIZE);
+sysbus_connect_irq(SYS_BUS_DEVICE(&s->pci_mbox), 0,
+   npcm7xx_irq(s, NPCM7XX_PCI_MBOX_IRQ));
+
 /* RAM2 (SRAM) */
 memory_region_init_ram(&s->sram, OBJECT(dev), "ram2",
 NPCM7XX_RAM2_SZ, &error_abort);
@@ -765,7 +779,6 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
 create_unimplemented_device("npcm7xx.usbd[8]",  0xf0838000,   4 * KiB);
 create_unimplemented_device("npcm7xx.usbd[9]",  0xf0839000,   4 * KiB);
 create_unimplemented_device("npcm7xx.sd",   0xf084,   8 * KiB);
-create_unimplemented_device("npcm7xx.pcimbx",   0xf0848000, 512 * KiB);
 create_unimplemented_device("npcm7xx.aes",  0xf0858000,   4 * KiB);
 create_unimplemented_device("npcm7xx.des",  0xf0859000,   4 * KiB);
 create_unimplemented_device("npcm7xx.sha",  0xf085a000,   4 * KiB);
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index 36c20d5637..0ead2e9ede 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -73,6 +73,7 @@ system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files(
   'npcm7xx_clk.c',
   'npcm7xx_gcr.c',
   'npcm7xx_mft.c',
+  'npcm7xx_pci_mbox.c',
   'npcm7xx_pwm.c',
   'npcm7xx_rng.c',
 ))
diff --git a/hw/misc/npcm7xx_pci_mbox.c b/hw/misc/npcm7xx_pci_mbox.c
new file mode 100644
index 00..c770ad6fcf
--- /dev/null
+++ b/hw/misc/npcm7xx_pci_mbox.c
@@ -0,0 +1,324 @@
+/*
+ * Nuvoton NPCM7xx PCI Mailbox Module
+ *
+ * Copyright 2021 Google LLC
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * for more details.
+ */
+
+#include "qemu/osdep.h"
+#include "chardev/char-fe.h"
+#include "hw/irq.h"
+#include "hw/qdev-clock.h"
+#include "hw/qdev-properties-system.h"
+#include "hw/misc/npcm7xx_pci_mbox.h"
+#include "hw/registerfields.h"
+#include "migration/vmstate.h"
+#include "qapi/error.h"
+#include "qapi/visitor.h"
+#include "qemu/bitops.h"
+#include "qemu/error-report.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "qemu/timer.h"
+#include "qemu/units.h"
+#include "trace.h"
+
+REG32(NPCM7XX_PCI_MBOX_BMBXSTAT, 0x00);
+REG32(NPCM7XX_PCI_MBOX_BMBXCTL, 0x04);
+REG32(NPCM7XX_PCI_MBOX_BMBXCMD, 0x08);
+
+enum NPCM7xxPCIMBoxOperation {
+NPCM7XX_PCI_MBOX_OP_READ = 1,
+NPCM7XX_PCI_MBOX_OP_WRITE,
+};
+
+#define NPCM7XX_PCI_MBOX_OFFSET_BYTES 8
+
+/* Response code */
+#define NPCM7XX_PCI_MBOX_OK 0
+#define NPCM7XX_PCI_MBOX_INVALID_OP 0xa0
+#define NPCM7XX_PCI_MBOX_INVALID_SIZE 0xa1
+#define

[PATCH v10 00/10] Implementation of NPI Mailbox and GMAC Networking Module

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

[Changes since v9]
More cleanup and fixes based on suggestions from Peter Maydell
(peter.mayd...@linaro.org).

[Changes since v8]
Suggestions and fixes from Peter Maydell (peter.mayd...@linaro.org).
Also cleaned up the changes so that nothing added in an earlier patch is
deleted in a later one. Patch count decreased by 1 because this cleanup
made one of the patches irrelevant.

[Changes since v7]
Fixed patch 4 declaration of new NIC based on comments by Peter Maydell
(peter.mayd...@linaro.org)

[Changes since v6]
Remove the Change-Ids from the commit messages.

[Changes since v5]
Restored some qtests whose removal seems to have been caused by a merge
conflict.

[Changes since v4]
Added Signed-off-by tag and fixed patch 4 commit message as suggested by
Peter Maydell (peter.mayd...@linaro.org)

[Changes since v3]
Addressed comments from Hao Wu (wuhao...@google.com)

[Changes since v2]
Fixed bugs related to the RC functionality of the GMAC. Added and
squashed patches related to that.

[Changes since v1]
Fixed some errors in formatting.
Fixed a merge error that I didn't see in v1.
Removed Nuvoton 8xx references since that is a separate patch set.

[Original Cover]
Creates the NPI Mailbox Module with data verification for reads and writes
(internal and external), wiring to the Nuvoton SoC, and QTests.

Also creates the GMAC Networking Module. Implements read and write
functionality with the corresponding descriptors and registers. Also
includes QTests for the different functionalities.

Hao Wu (5):
  hw/misc: Add Nuvoton's PCI Mailbox Module
  hw/arm: Add PCI mailbox module to Nuvoton SoC
  hw/misc: Add qtest for NPCM7xx PCI Mailbox
  hw/net: Add NPCMXXX GMAC device
  hw/arm: Add GMAC devices to NPCM7XX SoC

Nabih Estefan Diaz (5):
  tests/qtest: Creating qtest for GMAC Module
  include/hw/net: GMAC IRQ Implementation
  hw/net: GMAC Rx Implementation
  hw/net: GMAC Tx Implementation
  tests/qtest: Adding PCS Module test to GMAC Qtest

 docs/system/arm/nuvoton.rst |   2 +
 hw/arm/npcm7xx.c|  53 +-
 hw/misc/meson.build |   1 +
 hw/misc/npcm7xx_pci_mbox.c  | 324 ++
 hw/misc/trace-events|   5 +
 hw/net/meson.build  |   2 +-
 hw/net/npcm_gmac.c  | 939 
 hw/net/trace-events |  19 +
 include/hw/arm/npcm7xx.h|   4 +
 include/hw/misc/npcm7xx_pci_mbox.h  |  81 +++
 include/hw/net/npcm_gmac.h  | 340 ++
 tests/qtest/meson.build |   2 +
 tests/qtest/npcm7xx_pci_mbox-test.c | 238 +++
 tests/qtest/npcm_gmac-test.c| 341 ++
 14 files changed, 2347 insertions(+), 4 deletions(-)
 create mode 100644 hw/misc/npcm7xx_pci_mbox.c
 create mode 100644 hw/net/npcm_gmac.c
 create mode 100644 include/hw/misc/npcm7xx_pci_mbox.h
 create mode 100644 include/hw/net/npcm_gmac.h
 create mode 100644 tests/qtest/npcm7xx_pci_mbox-test.c
 create mode 100644 tests/qtest/npcm_gmac-test.c

-- 
2.43.0.472.g3155946c3a-goog

[PATCH v10 10/10] tests/qtest: Adding PCS Module test to GMAC Qtest

2024-01-08 Thread Nabih Estefan

From: Nabih Estefan Diaz 

 - Add PCS Register check to npcm_gmac-test

Change-Id: I34821beb5e0b1e89e2be576ab58eabe41545af12
Signed-off-by: Nabih Estefan 
Reviewed-by: Tyrone Ting 
---
 tests/qtest/npcm_gmac-test.c | 132 +++
 1 file changed, 132 insertions(+)

diff --git a/tests/qtest/npcm_gmac-test.c b/tests/qtest/npcm_gmac-test.c
index 130a1599a8..b64515794b 100644
--- a/tests/qtest/npcm_gmac-test.c
+++ b/tests/qtest/npcm_gmac-test.c
@@ -20,6 +20,10 @@
 /* Name of the GMAC Device */
 #define TYPE_NPCM_GMAC "npcm-gmac"
 
+/* Address of the PCS Module */
+#define PCS_BASE_ADDRESS 0xf078
+#define NPCM_PCS_IND_AC_BA 0x1fe
+
 typedef struct GMACModule {
 int irq;
 uint64_t base_addr;
@@ -111,6 +115,62 @@ typedef enum NPCMRegister {
 NPCM_GMAC_PTP_STNSUR = 0x714,
 NPCM_GMAC_PTP_TAR = 0x718,
 NPCM_GMAC_PTP_TTSR = 0x71c,
+
+/* PCS Registers */
+NPCM_PCS_SR_CTL_ID1 = 0x3c0008,
+NPCM_PCS_SR_CTL_ID2 = 0x3c000a,
+NPCM_PCS_SR_CTL_STS = 0x3c0010,
+
+NPCM_PCS_SR_MII_CTRL = 0x3e,
+NPCM_PCS_SR_MII_STS = 0x3e0002,
+NPCM_PCS_SR_MII_DEV_ID1 = 0x3e0004,
+NPCM_PCS_SR_MII_DEV_ID2 = 0x3e0006,
+NPCM_PCS_SR_MII_AN_ADV = 0x3e0008,
+NPCM_PCS_SR_MII_LP_BABL = 0x3e000a,
+NPCM_PCS_SR_MII_AN_EXPN = 0x3e000c,
+NPCM_PCS_SR_MII_EXT_STS = 0x3e001e,
+
+NPCM_PCS_SR_TIM_SYNC_ABL = 0x3e0e10,
+NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_LWR = 0x3e0e12,
+NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_UPR = 0x3e0e14,
+NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_LWR = 0x3e0e16,
+NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_UPR = 0x3e0e18,
+NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_LWR = 0x3e0e1a,
+NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_UPR = 0x3e0e1c,
+NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_LWR = 0x3e0e1e,
+NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_UPR = 0x3e0e20,
+
+NPCM_PCS_VR_MII_MMD_DIG_CTRL1 = 0x3f,
+NPCM_PCS_VR_MII_AN_CTRL = 0x3f0002,
+NPCM_PCS_VR_MII_AN_INTR_STS = 0x3f0004,
+NPCM_PCS_VR_MII_TC = 0x3f0006,
+NPCM_PCS_VR_MII_DBG_CTRL = 0x3f000a,
+NPCM_PCS_VR_MII_EEE_MCTRL0 = 0x3f000c,
+NPCM_PCS_VR_MII_EEE_TXTIMER = 0x3f0010,
+NPCM_PCS_VR_MII_EEE_RXTIMER = 0x3f0012,
+NPCM_PCS_VR_MII_LINK_TIMER_CTRL = 0x3f0014,
+NPCM_PCS_VR_MII_EEE_MCTRL1 = 0x3f0016,
+NPCM_PCS_VR_MII_DIG_STS = 0x3f0020,
+NPCM_PCS_VR_MII_ICG_ERRCNT1 = 0x3f0022,
+NPCM_PCS_VR_MII_MISC_STS = 0x3f0030,
+NPCM_PCS_VR_MII_RX_LSTS = 0x3f0040,
+NPCM_PCS_VR_MII_MP_TX_BSTCTRL0 = 0x3f0070,
+NPCM_PCS_VR_MII_MP_TX_LVLCTRL0 = 0x3f0074,
+NPCM_PCS_VR_MII_MP_TX_GENCTRL0 = 0x3f007a,
+NPCM_PCS_VR_MII_MP_TX_GENCTRL1 = 0x3f007c,
+NPCM_PCS_VR_MII_MP_TX_STS = 0x3f0090,
+NPCM_PCS_VR_MII_MP_RX_GENCTRL0 = 0x3f00b0,
+NPCM_PCS_VR_MII_MP_RX_GENCTRL1 = 0x3f00b2,
+NPCM_PCS_VR_MII_MP_RX_LOS_CTRL0 = 0x3f00ba,
+NPCM_PCS_VR_MII_MP_MPLL_CTRL0 = 0x3f00f0,
+NPCM_PCS_VR_MII_MP_MPLL_CTRL1 = 0x3f00f2,
+NPCM_PCS_VR_MII_MP_MPLL_STS = 0x3f0110,
+NPCM_PCS_VR_MII_MP_MISC_CTRL2 = 0x3f0126,
+NPCM_PCS_VR_MII_MP_LVL_CTRL = 0x3f0130,
+NPCM_PCS_VR_MII_MP_MISC_CTRL0 = 0x3f0132,
+NPCM_PCS_VR_MII_MP_MISC_CTRL1 = 0x3f0134,
+NPCM_PCS_VR_MII_DIG_CTRL2 = 0x3f01c2,
+NPCM_PCS_VR_MII_DIG_ERRCNT_SEL = 0x3f01c4,
 } NPCMRegister;
 
 static uint32_t gmac_read(QTestState *qts, const GMACModule *mod,
@@ -119,6 +179,15 @@ static uint32_t gmac_read(QTestState *qts, const GMACModule *mod,
 return qtest_readl(qts, mod->base_addr + regno);
 }
 
+static uint16_t pcs_read(QTestState *qts, const GMACModule *mod,
+  NPCMRegister regno)
+{
+uint32_t write_value = (regno & 0x3ffe00) >> 9;
+qtest_writel(qts, PCS_BASE_ADDRESS + NPCM_PCS_IND_AC_BA, write_value);
+uint32_t read_offset = regno & 0x1ff;
+return qtest_readl(qts, PCS_BASE_ADDRESS + read_offset);
+}
+
 /* Check that GMAC registers are reset to default value */
 static void test_init(gconstpointer test_data)
 {
@@ -131,6 +200,11 @@ static void test_init(gconstpointer test_data)
 g_assert_cmphex(gmac_read(qts, mod, (regno)), ==, (value)); \
 } while (0)
 
+#define CHECK_REG_PCS(regno, value) \
+do { \
+g_assert_cmphex(pcs_read(qts, mod, (regno)), ==, (value)); \
+} while (0)
+
 CHECK_REG32(NPCM_DMA_BUS_MODE, 0x00020100);
 CHECK_REG32(NPCM_DMA_XMT_POLL_DEMAND, 0);
 CHECK_REG32(NPCM_DMA_RCV_POLL_DEMAND, 0);
@@ -180,6 +254,64 @@ static void test_init(gconstpointer test_data)
 CHECK_REG32(NPCM_GMAC_PTP_TAR, 0);
 CHECK_REG32(NPCM_GMAC_PTP_TTSR, 0);
 
+/* TODO Add registers PCS */
+if (mod->base_addr == 0xf0802000) {
+CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID1, 0x699e);
+CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID2, 0);
+CHECK_REG_PCS(NPCM_PCS_SR_CTL_STS, 0x8000);
+
+CHECK_REG_PCS(NPCM_PCS_SR_MII_CTRL, 0x1140);
+CHECK_REG_PCS(NPCM_PCS_SR_MII_STS, 0x0109);
+CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID1, 0x699e);
+CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID2, 0x0ced0);
+

Re: [PATCH v4 00/11] hw/isa/vt82c686: Implement relocation and toggling of SuperI/O functions

2024-01-08 Thread Mark Cave-Ayland


On 08/01/2024 20:07, Bernhard Beschow wrote:


On 7 January 2024 14:13:44 UTC, Mark Cave-Ayland wrote:

On 06/01/2024 21:05, Bernhard Beschow wrote:


This series implements relocation of the SuperI/O functions of the VIA south
bridges which resolves some FIXME's. It is part of my via-apollo-pro-133t
branch [1] which is an extension of bringing the VIA south bridges to the PC
machine [2]. This branch is able to run some real-world X86 BIOSes in the hope
that it allows us to form a better understanding of the real vt82c686b devices.
Implementing relocation and toggling of the SuperI/O functions is one step to
make these BIOSes run without error messages, so here we go.

The series is structured as follows: Patches 1-3 prepare the TYPE_ISA_FDC,
TYPE_ISA_PARALLEL and TYPE_ISA_SERIAL to relocate and toggle (enable/disable)
themselves without breaking encapsulation of their respective device states.
This is achieved by moving the MemoryRegions and PortioLists from the device
states into the encapsulating ISA devices since they will be relocated and
toggled.

Inspired by the memory API patches 4-6 add two convenience functions to the
portio_list API to toggle and relocate portio lists. Patch 5 is a preparation
for that which removes some redundancies which otherwise had to be dealt with
during relocation.
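
To make the idea of relocating and toggling a port I/O list concrete, here is a minimal standalone sketch. It is not QEMU's portio_list implementation and does not use its types: the ToyPortRange and ToyPortList structures and the toy_port_list_*() functions are hypothetical names chosen for illustration. The only assumption carried over from the description above is that ranges are stored relative to a base address, so relocation only has to change the base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not QEMU's data structures. */
typedef struct ToyPortRange {
    uint16_t offset;  /* start of the range, relative to the list base */
    uint16_t len;     /* number of consecutive ports in the range */
} ToyPortRange;

typedef struct ToyPortList {
    const ToyPortRange *ranges;
    size_t nranges;
    uint16_t base;    /* absolute I/O address the offsets are relative to */
    bool enabled;     /* a disabled list claims no ports at all */
} ToyPortList;

/* Relocate: offsets are base-relative, so only the base changes. */
static void toy_port_list_set_address(ToyPortList *l, uint16_t base)
{
    l->base = base;
}

/* Toggle: enable or disable every range of the list at once. */
static void toy_port_list_set_enabled(ToyPortList *l, bool enabled)
{
    l->enabled = enabled;
}

/* Return true if the list currently decodes the given absolute port. */
static bool toy_port_list_claims(const ToyPortList *l, uint16_t port)
{
    if (!l->enabled) {
        return false;
    }
    for (size_t i = 0; i < l->nranges; i++) {
        uint32_t start = (uint32_t)l->base + l->ranges[i].offset;
        uint32_t end = start + l->ranges[i].len;

        if (port >= start && port < end) {
            return true;
        }
    }
    return false;
}
```

Keeping the offsets relative to the base is what makes relocation a one-field update in this sketch; absolute addresses stored per range would have to be rewritten one by one.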

Patches 7-9 implement toggling and relocation for types TYPE_ISA_FDC,
TYPE_ISA_PARALLEL and TYPE_ISA_SERIAL. Patch 10 prepares the pegasos2 machine
which would end up with all SuperI/O functions disabled if no -bios argument is
given. Patch 11 finally implements the main feature which now relies on
firmware to configure the SuperI/O functions accordingly (except for pegasos2).

v4:
* Drop incomplete SuperI/O vmstate handling (Zoltan)

v3:
* Rework various commit messages (Zoltan)
* Drop patch "hw/char/serial: Free struct SerialState from MemoryRegion"
(Zoltan)
* Generalize wording in migration.rst to include portio_list API (Zoltan)

v2:
* Improve commit messages (Zoltan)
* Split pegasos2 from vt82c686 patch (Zoltan)
* Avoid poking into device internals (Zoltan)

Testing done:
* `make check`
* `make check-avocado`
* Run MorphOS on pegasos2 with and without pegasos2.rom
* Run Linux on amigaone
* Run real-world BIOSes on via-apollo-pro-133t branch
* Start rescue-yl on fuloong2e

[1] https://github.com/shentok/qemu/tree/via-apollo-pro-133t
[2] https://github.com/shentok/qemu/tree/pc-via

Bernhard Beschow (11):
hw/block/fdc-isa: Move portio_list from FDCtrl to FDCtrlISABus
hw/block/fdc-sysbus: Move iomem from FDCtrl to FDCtrlSysBus
hw/char/parallel: Move portio_list from ParallelState to
  ISAParallelState
exec/ioport: Resolve redundant .base attribute in struct
  MemoryRegionPortio
exec/ioport: Add portio_list_set_address()
exec/ioport: Add portio_list_set_enabled()
hw/block/fdc-isa: Implement relocation and enabling/disabling for
  TYPE_ISA_FDC
hw/char/serial-isa: Implement relocation and enabling/disabling for
  TYPE_ISA_SERIAL
hw/char/parallel-isa: Implement relocation and enabling/disabling for
  TYPE_ISA_PARALLEL
hw/ppc/pegasos2: Let pegasos2 machine configure SuperI/O functions
hw/isa/vt82c686: Implement relocation and toggling of SuperI/O
  functions

   docs/devel/migration.rst   |  6 ++--
   hw/block/fdc-internal.h|  4 ---
   include/exec/ioport.h  |  4 ++-
   include/hw/block/fdc.h |  3 ++
   include/hw/char/parallel-isa.h |  5 +++
   include/hw/char/parallel.h |  2 --
   include/hw/char/serial.h   |  2 ++
   hw/block/fdc-isa.c | 18 +-
   hw/block/fdc-sysbus.c  |  6 ++--
   hw/char/parallel-isa.c | 14 
   hw/char/parallel.c |  2 +-
   hw/char/serial-isa.c   | 14 
   hw/isa/vt82c686.c  | 66 --
   hw/ppc/pegasos2.c  | 15 
   system/ioport.c| 41 +
   15 files changed, 172 insertions(+), 30 deletions(-)


I think this series generally looks good: the only thing I think it's worth 
checking is whether portio lists are considered exclusive to ISA devices or 
not? (Paolo?).


The modifications preserve the current design, so how is this question related 
to this series?


I was thinking about patches 1 and 3 where the portio_list variable is moved from the 
core object to the ISA-specific child objects.



I'd appreciate feedback from the maintainers indeed since this part hasn't 
received any comments so far. Thanks :)


Agreed. I *think* the portio_lists are ISA-specific as far as QEMU is concerned, but 
a quick nod from an x86 maintainer would be a great help :)



The portio_list_set_enabled() API looks interesting, and could be considered 
for use by my PCI IDE mode-switching changes too.

Apologies I don't have a huge amount of time for review right now, but I wanted 
to feed back that generally these patches look good, and
