Re: [PATCH 3/3] tests/qtest: Re-enable multifd cancel test
Hi, Thomas,

On Tue, Jan 09, 2024 at 08:21:53AM +0100, Thomas Huth wrote:
> Sorry for that :-(

Not at all!  I actually appreciate more people looking after it.

> Maybe it's better if we remove the migration-test from
> the qtest section in MAINTAINERS? Since the migration test is very well
> maintained already, there's IMHO no need for picking up the patches via the
> qtest tree, so something like this should prevent these problems:
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3269,6 +3269,7 @@ F: tests/qtest/
>  F: docs/devel/qgraph.rst
>  F: docs/devel/qtest.rst
>  X: tests/qtest/bios-tables-test*
> +X: tests/qtest/migration-*
>
>  Device Fuzzing
>  M: Alexander Bulekov
>
> (as you can see, we're doing it in a similar way for the bios tables test
> already)
>
> If you agree, I can send out a proper patch for this later today.

Currently the file is covered by both groups of people, which to me is the
best situation:

$ ./scripts/get_maintainer.pl -f tests/qtest/migration-test.c
Peter Xu (maintainer:Migration)
Fabiano Rosas (maintainer:Migration)
Thomas Huth (maintainer:qtest)
Laurent Vivier (maintainer:qtest)
Paolo Bonzini (reviewer:qtest)
qemu-devel@nongnu.org (open list:All patches CC here)

It makes sense to me, e.g. when qtest reworks the framework and we'd like
migration-test.c to be covered in that same rework series and reviewed/pulled
together; those patches can then go via qtest's tree directly.

If the patch submitter follows the MAINTAINERS file, all of us will be in the
loop, and that's the perfect condition, IMHO.  It's just that this patch
didn't have any migration people copied, which caused a very slight
confusion.

It'll be great in that case if the qtest maintainers can help submitters to
copy us if the submitters forgot to do so.  I think we should do the same
when there are major changes to the qtest framework for a new migration test.

Would that work best for us?

Thanks,

--
Peter Xu
Re: [PATCH v8 06/10] hw/fsi: Aspeed APB2OPB interface
Hello Ninad,

+static void fsi_aspeed_apb2opb_realize(DeviceState *dev, Error **errp)
+{
+    SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+    AspeedAPB2OPBState *s = ASPEED_APB2OPB(dev);
+    int i;
+
+    sysbus_init_irq(sbd, &s->irq);
+
+    memory_region_init_io(&s->iomem, OBJECT(s), _apb2opb_ops, s,
+                          TYPE_ASPEED_APB2OPB, 0x1000);
+    sysbus_init_mmio(sbd, &s->iomem);
+
+    for (i = 0; i < ASPEED_FSI_NUM; i++) {
+        if (!qdev_realize_and_unref(DEVICE(&s->fsi[i]), BUS(&s->opb[i]),

s->fsi[i] is not allocated. We should use qdev_realize instead.

I am not sure I understood this. FSIMasterState fsi[ASPEED_FSI_NUM]; is
inside structure AspeedAPB2OPBState so it must be allocated, right?

See the documentation:

  https://www.qemu.org/docs/master/devel/qdev-api.html#c.qdev_realize_and_unref

Thanks,

C.
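For anyone following the qdev_realize() versus qdev_realize_and_unref() point, here is a toy reference-counting model in Python. It is not QEMU code: ToyObject, toy_realize() and toy_realize_and_unref() are hypothetical names, and the model assumes that realizing takes one reference on behalf of the bus and that the initial reference of an embedded child belongs to its parent.

```python
class ToyObject:
    """Toy stand-in for a reference-counted object (not QEMU code)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1      # the one reference held by the creator
        self.realized = False

    def ref(self):
        self.refcount += 1

    def unref(self):
        if self.refcount == 0:
            raise RuntimeError(f"{self.name}: unref with no references left")
        self.refcount -= 1


def toy_realize(dev, bus):
    """Plug dev into bus; in this model the bus takes its own reference."""
    bus.append(dev)
    dev.ref()
    dev.realized = True


def toy_realize_and_unref(dev, bus):
    """Realize, then drop the reference the caller was holding."""
    toy_realize(dev, bus)
    dev.unref()


bus = []

# Case 1: stand-alone device; the creator hands its reference over.
standalone = ToyObject("standalone")
toy_realize_and_unref(standalone, bus)

# Case 2: child embedded in a parent; the parent keeps its reference.
embedded = ToyObject("embedded")
toy_realize(embedded, bus)

# Case 3: embedded child realized with the "and_unref" variant; the
# parent's reference is dropped although the parent still owns the child.
miscounted = ToyObject("miscounted")
toy_realize_and_unref(miscounted, bus)
```

In this model case 3 ends with one reference for two owners (parent and bus), which is the kind of miscount the linked documentation warns about for embedded devices.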
Re: [PATCH 3/3] tests/qtest: Re-enable multifd cancel test
On 09/01/2024 03.12, Peter Xu wrote:
On Mon, Jan 08, 2024 at 11:26:04AM -0300, Fabiano Rosas wrote:
Peter Xu writes:
On Wed, Jun 07, 2023 at 10:27:15AM +0200, Juan Quintela wrote:
Fabiano Rosas wrote:

We've found the source of flakiness in this test, so re-enable it.

Signed-off-by: Fabiano Rosas
---
 tests/qtest/migration-test.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index b0c355bbd9..800ad23b75 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2778,14 +2778,8 @@ int main(int argc, char **argv)
     }
     qtest_add_func("/migration/multifd/tcp/plain/none",
                    test_multifd_tcp_none);
-    /*
-     * This test is flaky and sometimes fails in CI and otherwise:
-     * don't run unless user opts in via environment variable.
-     */
-    if (getenv("QEMU_TEST_FLAKY_TESTS")) {
-        qtest_add_func("/migration/multifd/tcp/plain/cancel",
-                       test_multifd_tcp_cancel);
-    }
+    qtest_add_func("/migration/multifd/tcp/plain/cancel",
+                   test_multifd_tcp_cancel);
     qtest_add_func("/migration/multifd/tcp/plain/zlib",
                    test_multifd_tcp_zlib);
 #ifdef CONFIG_ZSTD

Reviewed-by: Juan Quintela

There was another failure with migration test that I will post during
the rest of the day.  It needs both to get it right.

This one didn't yet land upstream.

I'm not sure, but maybe Juan was saying about this change:

commit d2026ee117147893f8d80f060cede6d872ecbd7f
Author: Juan Quintela
Date:   Wed Apr 26 12:20:36 2023 +0200

    multifd: Fix the number of channels ready

That's not it. It was something in the test itself around the fact that
we use two sets of: from/to. There was supposed to be a situation where
we'd start 'to2' while 'to' was still running and that would cause
issues (possibly with sockets).

I think what might have happened is that someone merged a fix through
another tree and Juan didn't notice.

I think this is the one:

commit f2d063e61ee2026700ab44bef967f663e976bec8
Author: Xuzhou Cheng
Date:   Fri Oct 28 12:57:32 2022 +0800

    tests/qtest: migration-test: Make sure QEMU process "to" exited after migration is canceled

    Make sure QEMU process "to" exited before launching another target
    for migration in the test_multifd_tcp_cancel case.

    Signed-off-by: Xuzhou Cheng
    Signed-off-by: Bin Meng
    Reviewed-by: Marc-André Lureau
    Message-Id: <20221028045736.679903-8-bin.m...@windriver.com>
    Signed-off-by: Thomas Huth

Hmm, i see.

Sorry for that :-( Maybe it's better if we remove the migration-test from
the qtest section in MAINTAINERS? Since the migration test is very well
maintained already, there's IMHO no need for picking up the patches via the
qtest tree, so something like this should prevent these problems:

diff --git a/MAINTAINERS b/MAINTAINERS
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3269,6 +3269,7 @@ F: tests/qtest/
 F: docs/devel/qgraph.rst
 F: docs/devel/qtest.rst
 X: tests/qtest/bios-tables-test*
+X: tests/qtest/migration-*

 Device Fuzzing
 M: Alexander Bulekov

(as you can see, we're doing it in a similar way for the bios tables test
already)

If you agree, I can send out a proper patch for this later today.

 Thomas
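To make the effect of the proposed X: line concrete, here is a rough Python sketch of how F: (include) and X: (exclude) patterns could combine for one MAINTAINERS section. It is a simplified model, not a port of scripts/get_maintainer.pl: section_covers() is a hypothetical helper, and it assumes that a pattern ending in "/" covers everything below that directory and that any X: match wins over the F: entries.

```python
from fnmatch import fnmatchcase


def section_covers(path, include_patterns, exclude_patterns):
    """Return True if a section with these F:/X: patterns claims path."""

    def matches(pattern):
        if pattern.endswith("/"):
            # Assumed: a trailing slash means "everything under this directory".
            return path.startswith(pattern)
        return fnmatchcase(path, pattern)

    if any(matches(p) for p in exclude_patterns):
        return False
    return any(matches(p) for p in include_patterns)


# F: and X: entries of the qtest section as quoted in the diff above.
qtest_f = ["tests/qtest/", "docs/devel/qgraph.rst", "docs/devel/qtest.rst"]
qtest_x_today = ["tests/qtest/bios-tables-test*"]
qtest_x_proposed = qtest_x_today + ["tests/qtest/migration-*"]

path = "tests/qtest/migration-test.c"
covered_today = section_covers(path, qtest_f, qtest_x_today)
covered_after = section_covers(path, qtest_f, qtest_x_proposed)
```

With the extra X: line the qtest section would stop claiming migration-test.c, so only the Migration section's maintainers would be listed for that file.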
Re: [PATCH 10/10] docs/migration: Further move virtio to be feature of migration
On 1/9/24 07:46, pet...@redhat.com wrote:
From: Peter Xu

Move it one layer down, so taking Virtio-migration as a feature for
migration.

Cc: Michael S. Tsirkin
Cc: Jason Wang
Signed-off-by: Peter Xu

Reviewed-by: Cédric Le Goater

Thanks,

C.

---
 docs/devel/migration/features.rst | 1 +
 docs/devel/migration/index.rst    | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
index dea016f707..a9acaf618e 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -9,3 +9,4 @@ Migration has plenty of features to support different use cases.
    postcopy
    dirty-limit
    vfio
+   virtio
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 2479e8ecb7..7b7a706e35 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -10,5 +10,4 @@ QEMU live migration works.
    main
    features
    compatibility
-   virtio
    best-practises
Re: [PATCH 09/10] docs/migration: Further move vfio to be feature of migration
On 1/9/24 07:46, pet...@redhat.com wrote:
From: Peter Xu

Move it one layer down, so taking VFIO-migration as a feature for
migration.

Cc: Alex Williamson
Cc: Cédric Le Goater
Signed-off-by: Peter Xu

Reviewed-by: Cédric Le Goater

Thanks,

C.

---
 docs/devel/migration/features.rst | 1 +
 docs/devel/migration/index.rst    | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
index e257d0d100..dea016f707 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -8,3 +8,4 @@ Migration has plenty of features to support different use cases.
 
    postcopy
    dirty-limit
+   vfio
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 7cf62541b9..2479e8ecb7 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -10,6 +10,5 @@ QEMU live migration works.
    main
    features
    compatibility
-   vfio
    virtio
    best-practises
Re: [PATCH 08/10] docs/migration: Organize "Postcopy" page
On 1/9/24 07:46, pet...@redhat.com wrote: From: Peter Xu Reorganize the page, moving things around, and add a few headlines ("Postcopy internals", "Postcopy features") to cover sub-areas. Signed-off-by: Peter Xu --- docs/devel/migration/postcopy.rst | 159 -- 1 file changed, 84 insertions(+), 75 deletions(-) diff --git a/docs/devel/migration/postcopy.rst b/docs/devel/migration/postcopy.rst index d60eec06ab..6c51e96d79 100644 --- a/docs/devel/migration/postcopy.rst +++ b/docs/devel/migration/postcopy.rst @@ -1,6 +1,9 @@ + Postcopy +.. contents:: + 'Postcopy' migration is a way to deal with migrations that refuse to converge The quote character is used in a few places to emphasize words which should be reworked. The rest looks good, so Reviewed-by: Cédric Le Goater Thanks, C. (or take too long to converge) its plus side is that there is an upper bound on the amount of migration traffic and time it takes, the down side is that during @@ -14,7 +17,7 @@ Postcopy can be combined with precopy (i.e. normal migration) so that if precopy doesn't finish in a given time the switch is made to postcopy. Enabling postcopy -- += To enable postcopy, issue this command on the monitor (both source and destination) prior to the start of migration: @@ -49,8 +52,71 @@ time per vCPU. ``migrate_set_parameter`` is ignored (to avoid delaying requested pages that the destination is waiting for). -Postcopy device transfer - +Postcopy internals +== + +State machine +- + +Postcopy moves through a series of states (see postcopy_state) from +ADVISE->DISCARD->LISTEN->RUNNING->END + + - Advise + +Set at the start of migration if postcopy is enabled, even +if it hasn't had the start command; here the destination +checks that its OS has the support needed for postcopy, and performs +setup to ensure the RAM mappings are suitable for later postcopy. +The destination will fail early in migration at this point if the +required OS support is not present. 
+(Triggered by reception of POSTCOPY_ADVISE command) + + - Discard + +Entered on receipt of the first 'discard' command; prior to +the first Discard being performed, hugepages are switched off +(using madvise) to ensure that no new huge pages are created +during the postcopy phase, and to cause any huge pages that +have discards on them to be broken. + + - Listen + +The first command in the package, POSTCOPY_LISTEN, switches +the destination state to Listen, and starts a new thread +(the 'listen thread') which takes over the job of receiving +pages off the migration stream, while the main thread carries +on processing the blob. With this thread able to process page +reception, the destination now 'sensitises' the RAM to detect +any access to missing pages (on Linux using the 'userfault' +system). + + - Running + +POSTCOPY_RUN causes the destination to synchronise all +state and start the CPUs and IO devices running. The main +thread now finishes processing the migration package and +now carries on as it would for normal precopy migration +(although it can't do the cleanup it would do as it +finishes a normal migration). + + - Paused + +Postcopy can run into a paused state (normally on both sides when +happens), where all threads will be temporarily halted mostly due to +network errors. When reaching paused state, migration will make sure +the qemu binary on both sides maintain the data without corrupting +the VM. To continue the migration, the admin needs to fix the +migration channel using the QMP command 'migrate-recover' on the +destination node, then resume the migration using QMP command 'migrate' +again on source node, with resume=true flag set. + + - End + +The listen thread can now quit, and perform the cleanup of migration +state, the migration is now complete. 
+ +Device transfer +--- Loading of device data may cause the device emulation to access guest RAM that may trigger faults that have to be resolved by the source, as such @@ -130,7 +196,20 @@ processing. is no longer used by migration, while the listen thread carries on servicing page data until the end of migration. -Postcopy Recovery +Source side page bitmap +--- + +The 'migration bitmap' in postcopy is basically the same as in the precopy, +where each of the bit to indicate that page is 'dirty' - i.e. needs +sending. During the precopy phase this is updated as the CPU dirties +pages, however during postcopy the CPUs are stopped and nothing should +dirty anything any more. Instead, dirty bits are cleared when the relevant +pages are sent during postcopy. + +Postcopy features += + +Postcopy recovery -
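As a reading aid for the "State machine" section in the patch above, the ADVISE->DISCARD->LISTEN->RUNNING->END order can be sketched as a small transition table in Python. This is only an illustration of the text: PostcopyState, ALLOWED and walk() are hypothetical names, and modelling Paused as a detour out of and back into Running is an assumption made here; the document does not spell out the exact transitions.

```python
from enum import Enum, auto


class PostcopyState(Enum):
    ADVISE = auto()
    DISCARD = auto()
    LISTEN = auto()
    RUNNING = auto()
    PAUSED = auto()
    END = auto()


# Allowed next states, following the linear order quoted above.
# PAUSED <-> RUNNING is an assumption of this sketch.
ALLOWED = {
    PostcopyState.ADVISE: {PostcopyState.DISCARD},
    PostcopyState.DISCARD: {PostcopyState.LISTEN},
    PostcopyState.LISTEN: {PostcopyState.RUNNING},
    PostcopyState.RUNNING: {PostcopyState.PAUSED, PostcopyState.END},
    PostcopyState.PAUSED: {PostcopyState.RUNNING},
    PostcopyState.END: set(),
}


def walk(states):
    """Follow a list of states, rejecting any step the table does not allow."""
    current = states[0]
    for target in states[1:]:
        if target not in ALLOWED[current]:
            raise ValueError(f"{current.name} -> {target.name} not allowed")
        current = target
    return current


S = PostcopyState
happy_path = [S.ADVISE, S.DISCARD, S.LISTEN, S.RUNNING, S.END]
with_recovery = [S.ADVISE, S.DISCARD, S.LISTEN, S.RUNNING,
                 S.PAUSED, S.RUNNING, S.END]
final_state = walk(happy_path)
final_after_recovery = walk(with_recovery)
```

A path such as ADVISE directly to RUNNING is rejected by walk(), which mirrors the idea that the destination has to start listening before the CPUs are started.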
Re: [PATCH v3 3/4] ci: Add a migration compatibility test job
On Fri, Jan 05, 2024 at 03:04:48PM -0300, Fabiano Rosas wrote:
> The migration tests have support for being passed two QEMU binaries to
> test migration compatibility.
>
> Add a CI job that builds the latest release of QEMU and another job
> that uses that version plus an already present build of the current
> version and run the migration tests with the two, both as source and
> destination. I.e.:
>
>   old QEMU (n-1) -> current QEMU (development tree)
>   current QEMU (development tree) -> old QEMU (n-1)
>
> The purpose of this CI job is to ensure the code we're about to merge
> will not cause a migration compatibility problem when migrating the
> next release (which will contain that code) to/from the previous
> release.
>
> I'm leaving the jobs as manual for now because using an older QEMU in
> tests could hit bugs that were already fixed in the current
> development tree and we need to handle those case-by-case.

Can we opt out of those broken tests using either your "since:" thing or
anything similar?

I hope we can start to run something by default in the CI in 9.0 to cover
n-1 -> n, even if starting with a subset of tests.  Is it possible?

Thanks,

--
Peter Xu
Re: [PATCH 07/10] docs/migration: Split "dirty limit"
On 1/9/24 07:46, pet...@redhat.com wrote: From: Peter Xu Split that into a separate file, put under "features". Cc: Yong Huang Signed-off-by: Peter Xu Reviewed-by: Cédric Le Goater Thanks, C. --- docs/devel/migration/dirty-limit.rst | 71 docs/devel/migration/features.rst| 1 + docs/devel/migration/main.rst| 71 3 files changed, 72 insertions(+), 71 deletions(-) create mode 100644 docs/devel/migration/dirty-limit.rst diff --git a/docs/devel/migration/dirty-limit.rst b/docs/devel/migration/dirty-limit.rst new file mode 100644 index 00..8f32329d5f --- /dev/null +++ b/docs/devel/migration/dirty-limit.rst @@ -0,0 +1,71 @@ +Dirty limit +=== + +The dirty limit, short for dirty page rate upper limit, is a new capability +introduced in the 8.1 QEMU release that uses a new algorithm based on the KVM +dirty ring to throttle down the guest during live migration. + +The algorithm framework is as follows: + +:: + + -- + main --> throttle thread > PREPARE(1) < + thread \| | + \ | | +\ V | + -\CALCULATE(2) | + \ | | +\ | | + \ V | + \SET PENALTY(3) - + -\ | + \ | + \V + -> virtual CPU thread ---> ACCEPT PENALTY(4) + -- + +When the qmp command qmp_set_vcpu_dirty_limit is called for the first time, +the QEMU main thread starts the throttle thread. The throttle thread, once +launched, executes the loop, which consists of three steps: + + - PREPARE (1) + + The entire work of PREPARE (1) is preparation for the second stage, + CALCULATE(2), as the name implies. It involves preparing the dirty + page rate value and the corresponding upper limit of the VM: + The dirty page rate is calculated via the KVM dirty ring mechanism, + which tells QEMU how many dirty pages a virtual CPU has had since the + last KVM_EXIT_DIRTY_RING_FULL exception; The dirty page rate upper + limit is specified by caller, therefore fetch it directly. + + - CALCULATE (2) + + Calculate a suitable sleep period for each virtual CPU, which will be + used to determine the penalty for the target virtual CPU. 
The + computation must be done carefully in order to reduce the dirty page + rate progressively down to the upper limit without oscillation. To + achieve this, two strategies are provided: the first is to add or + subtract sleep time based on the ratio of the current dirty page rate + to the limit, which is used when the current dirty page rate is far + from the limit; the second is to add or subtract a fixed time when + the current dirty page rate is close to the limit. + + - SET PENALTY (3) + + Set the sleep time for each virtual CPU that should be penalized based + on the results of the calculation supplied by step CALCULATE (2). + +After completing the three above stages, the throttle thread loops back +to step PREPARE (1) until the dirty limit is reached. + +On the other hand, each virtual CPU thread reads the sleep duration and +sleeps in the path of the KVM_EXIT_DIRTY_RING_FULL exception handler, that +is ACCEPT PENALTY (4). Virtual CPUs tied with writing processes will +obviously exit to the path and get penalized, whereas virtual CPUs involved +with read processes will not. + +In summary, thanks to the KVM dirty ring technology, the dirty limit +algorithm will restrict virtual CPUs as needed to keep their dirty page +rate inside the limit. This leads to more steady reading performance during +live migration and can aid in improving large guest responsiveness. diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst index 0054e0c900..e257d0d100 100644 --- a/docs/devel/migration/features.rst +++ b/docs/devel/migration/features.rst @@ -7,3 +7,4 @@ Migration has plenty of features to support different use cases. :maxdepth: 2 postcopy + dirty-limit diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst index 051ea43f0e..00b9c3d32f 100644 --- a/docs/devel/migration/main.rst +++ b/docs/devel/migration/main.rst @@
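The two adjustment strategies described under CALCULATE (2) can be sketched roughly as follows. This is not the QEMU implementation: next_sleep_us() and its parameters (period_us, near_percent, fixed_step_us) are hypothetical, and both the "close to the limit" threshold and the proportional formula are assumptions chosen only to illustrate the idea of a ratio-based step when far from the limit and a fixed step when near it.

```python
def next_sleep_us(sleep_us, rate, limit,
                  period_us=100_000, near_percent=10, fixed_step_us=1_000):
    """Return the next per-vCPU sleep time in microseconds.

    rate and limit are dirty page rates in the same unit; all tuning
    values are made up for this sketch.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")

    diff = rate - limit
    if diff == 0:
        return sleep_us

    if abs(diff) * 100 <= near_percent * limit:
        # Close to the limit: nudge by a fixed amount to avoid oscillation.
        step = fixed_step_us
    else:
        # Far from the limit: scale the step with the relative distance.
        step = period_us * abs(diff) // max(rate, limit)

    if diff > 0:
        return sleep_us + step          # too dirty: sleep longer
    return max(0, sleep_us - step)      # below the limit: relax the penalty


far_above = next_sleep_us(0, rate=1000, limit=100)
just_above = next_sleep_us(5_000, rate=105, limit=100)
just_below = next_sleep_us(5_000, rate=95, limit=100)
far_below = next_sleep_us(50_000, rate=10, limit=100)
```

The large step in the first case and the small fixed steps in the next two correspond to the "far from the limit" and "close to the limit" strategies in the quoted text; the last case shows the penalty being clamped at zero.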
Re: [PATCH 06/10] docs/migration: Split "Postcopy"
On 1/9/24 07:46, pet...@redhat.com wrote: From: Peter Xu Split postcopy into a separate file. Introduce a head page "features.rst" to keep all the features on top of migration framework. Signed-off-by: Peter Xu Reviewed-by: Cédric Le Goater Thanks, C. --- docs/devel/migration/features.rst | 9 + docs/devel/migration/index.rst| 1 + docs/devel/migration/main.rst | 305 -- docs/devel/migration/postcopy.rst | 304 + 4 files changed, 314 insertions(+), 305 deletions(-) create mode 100644 docs/devel/migration/features.rst create mode 100644 docs/devel/migration/postcopy.rst diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst new file mode 100644 index 00..0054e0c900 --- /dev/null +++ b/docs/devel/migration/features.rst @@ -0,0 +1,9 @@ +Migration features +== + +Migration has plenty of features to support different use cases. + +.. toctree:: + :maxdepth: 2 + + postcopy diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst index c09623b38f..7cf62541b9 100644 --- a/docs/devel/migration/index.rst +++ b/docs/devel/migration/index.rst @@ -8,6 +8,7 @@ QEMU live migration works. :maxdepth: 2 main + features compatibility vfio virtio diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst index 97811ce371..051ea43f0e 100644 --- a/docs/devel/migration/main.rst +++ b/docs/devel/migration/main.rst @@ -644,308 +644,3 @@ algorithm will restrict virtual CPUs as needed to keep their dirty page rate inside the limit. This leads to more steady reading performance during live migration and can aid in improving large guest responsiveness. -Postcopy - - -'Postcopy' migration is a way to deal with migrations that refuse to converge -(or take too long to converge) its plus side is that there is an upper bound on -the amount of migration traffic and time it takes, the down side is that during -the postcopy phase, a failure of *either* side causes the guest to be lost. 
- -In postcopy the destination CPUs are started before all the memory has been -transferred, and accesses to pages that are yet to be transferred cause -a fault that's translated by QEMU into a request to the source QEMU. - -Postcopy can be combined with precopy (i.e. normal migration) so that if precopy -doesn't finish in a given time the switch is made to postcopy. - -Enabling postcopy -- - -To enable postcopy, issue this command on the monitor (both source and -destination) prior to the start of migration: - -``migrate_set_capability postcopy-ram on`` - -The normal commands are then used to start a migration, which is still -started in precopy mode. Issuing: - -``migrate_start_postcopy`` - -will now cause the transition from precopy to postcopy. -It can be issued immediately after migration is started or any -time later on. Issuing it after the end of a migration is harmless. - -Blocktime is a postcopy live migration metric, intended to show how -long the vCPU was in state of interruptible sleep due to pagefault. -That metric is calculated both for all vCPUs as overlapped value, and -separately for each vCPU. These values are calculated on destination -side. To enable postcopy blocktime calculation, enter following -command on destination monitor: - -``migrate_set_capability postcopy-blocktime on`` - -Postcopy blocktime can be retrieved by query-migrate qmp command. -postcopy-blocktime value of qmp command will show overlapped blocking -time for all vCPU, postcopy-vcpu-blocktime will show list of blocking -time per vCPU. - -.. note:: - During the postcopy phase, the bandwidth limits set using - ``migrate_set_parameter`` is ignored (to avoid delaying requested pages that - the destination is waiting for). 
- -Postcopy device transfer - - -Loading of device data may cause the device emulation to access guest RAM -that may trigger faults that have to be resolved by the source, as such -the migration stream has to be able to respond with page data *during* the -device load, and hence the device data has to be read from the stream completely -before the device load begins to free the stream up. This is achieved by -'packaging' the device data into a blob that's read in one go. - -Source behaviour - - -Until postcopy is entered the migration stream is identical to normal -precopy, except for the addition of a 'postcopy advise' command at -the beginning, to tell the destination that postcopy might happen. -When postcopy starts the source sends the page discard data and then -forms the 'package' containing: - - - Command: 'postcopy listen' - - The device state - - A series of sections, identical to the precopy streams device state stream - containing everything except postcopiable devices (i.e. RAM) - - Command: 'postcopy run' - -The 'package' is sent as the
Re: [PATCH 05/10] docs/migration: Split "Debugging" and "Firmware"
On 1/9/24 07:46, pet...@redhat.com wrote: From: Peter Xu Move the two sections into a separate file called "best-practises.rst". Add the entry into index. Signed-off-by: Peter Xu Reviewed-by: Cédric Le Goater Thanks, C. --- docs/devel/migration/best-practises.rst | 48 + docs/devel/migration/index.rst | 1 + docs/devel/migration/main.rst | 44 --- 3 files changed, 49 insertions(+), 44 deletions(-) create mode 100644 docs/devel/migration/best-practises.rst diff --git a/docs/devel/migration/best-practises.rst b/docs/devel/migration/best-practises.rst new file mode 100644 index 00..ba122ae417 --- /dev/null +++ b/docs/devel/migration/best-practises.rst @@ -0,0 +1,48 @@ +== +Best practises +== + +Debugging += + +The migration stream can be analyzed thanks to ``scripts/analyze-migration.py``. + +Example usage: + +.. code-block:: shell + + $ qemu-system-x86_64 -display none -monitor stdio + (qemu) migrate "exec:cat > mig" + (qemu) q + $ ./scripts/analyze-migration.py -f mig + { +"ram (3)": { +"section sizes": { +"pc.ram": "0x0800", + ... + +See also ``analyze-migration.py -h`` help for more options. + +Firmware + + +Migration migrates the copies of RAM and ROM, and thus when running +on the destination it includes the firmware from the source. Even after +resetting a VM, the old firmware is used. Only once QEMU has been restarted +is the new firmware in use. + +- Changes in firmware size can cause changes in the required RAMBlock size + to hold the firmware and thus migration can fail. In practice it's best + to pad firmware images to convenient powers of 2 with plenty of space + for growth. + +- Care should be taken with device emulation code so that newer + emulation code can work with older firmware to allow forward migration. + +- Care should be taken with newer firmware so that backward migration + to older systems with older device emulation code will work. 
+ +In some cases it may be best to tie specific firmware versions to specific +versioned machine types to cut down on the combinations that will need +support. This is also useful when newer versions of firmware outgrow +the padding. diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst index 7fc02b9520..c09623b38f 100644 --- a/docs/devel/migration/index.rst +++ b/docs/devel/migration/index.rst @@ -11,3 +11,4 @@ QEMU live migration works. compatibility vfio virtio + best-practises diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst index b3e31bb52f..97811ce371 100644 --- a/docs/devel/migration/main.rst +++ b/docs/devel/migration/main.rst @@ -52,27 +52,6 @@ All these migration protocols use the same infrastructure to save/restore state devices. This infrastructure is shared with the savevm/loadvm functionality. -Debugging -= - -The migration stream can be analyzed thanks to ``scripts/analyze-migration.py``. - -Example usage: - -.. code-block:: shell - - $ qemu-system-x86_64 -display none -monitor stdio - (qemu) migrate "exec:cat > mig" - (qemu) q - $ ./scripts/analyze-migration.py -f mig - { -"ram (3)": { -"section sizes": { -"pc.ram": "0x0800", - ... - -See also ``analyze-migration.py -h`` help for more options. - Common infrastructure = @@ -970,26 +949,3 @@ the background migration channel. Anyone who cares about latencies of page faults during a postcopy migration should enable this feature. By default, it's not enabled. -Firmware - - -Migration migrates the copies of RAM and ROM, and thus when running -on the destination it includes the firmware from the source. Even after -resetting a VM, the old firmware is used. Only once QEMU has been restarted -is the new firmware in use. - -- Changes in firmware size can cause changes in the required RAMBlock size - to hold the firmware and thus migration can fail. In practice it's best - to pad firmware images to convenient powers of 2 with plenty of space - for growth. 
- -- Care should be taken with device emulation code so that newer - emulation code can work with older firmware to allow forward migration. - -- Care should be taken with newer firmware so that backward migration - to older systems with older device emulation code will work. - -In some cases it may be best to tie specific firmware versions to specific -versioned machine types to cut down on the combinations that will need -support. This is also useful when newer versions of firmware outgrow -the padding.
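The advice to pad firmware images to a power of 2 with room for growth can be illustrated with a short Python helper. padded_size() and its headroom parameter are hypothetical names used for this sketch; how much headroom counts as "plenty" is left to whoever builds the image.

```python
def padded_size(image_size, headroom=0):
    """Smallest power of two that holds image_size plus headroom bytes."""
    needed = image_size + headroom
    if needed <= 0:
        raise ValueError("image_size + headroom must be positive")
    return 1 << (needed - 1).bit_length()


# A 192 KiB image fits a 256 KiB region with 64 KiB left for growth.
size_192k = padded_size(0x30000)
# An image that already fills 256 KiB exactly stays at 256 KiB ...
size_256k = padded_size(0x40000)
# ... unless some headroom is requested, which pushes it to 512 KiB.
size_with_headroom = padded_size(0x40000, headroom=0x1000)
```

Keeping the padded size stable across firmware updates is what keeps the RAMBlock size, and therefore migration between the two versions, unchanged.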
Re: [PATCH 04/10] docs/migration: Split "Backwards compatibility" separately
On 1/9/24 07:46, pet...@redhat.com wrote:
From: Peter Xu

Split the section from main.rst into a separate file.  Reference it in the
index.rst.

Signed-off-by: Peter Xu

Reviewed-by: Cédric Le Goater

Thanks,

C.
Re: [PATCH 03/10] docs/migration: Convert virtio.txt into rST
On 1/9/24 07:46, pet...@redhat.com wrote: From: Peter Xu Convert the plain old .txt into .rst, add it into migration/index.rst. Signed-off-by: Peter Xu Reviewed-by: Cédric Le Goater Thanks, C. --- docs/devel/migration/index.rst | 1 + docs/devel/migration/virtio.rst | 115 docs/devel/migration/virtio.txt | 108 -- 3 files changed, 116 insertions(+), 108 deletions(-) create mode 100644 docs/devel/migration/virtio.rst delete mode 100644 docs/devel/migration/virtio.txt diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst index 02cfdcc969..2cb701c77c 100644 --- a/docs/devel/migration/index.rst +++ b/docs/devel/migration/index.rst @@ -9,3 +9,4 @@ QEMU live migration works. main vfio + virtio diff --git a/docs/devel/migration/virtio.rst b/docs/devel/migration/virtio.rst new file mode 100644 index 00..611a18b821 --- /dev/null +++ b/docs/devel/migration/virtio.rst @@ -0,0 +1,115 @@ +=== +Virtio device migration +=== + +Copyright 2015 IBM Corp. + +This work is licensed under the terms of the GNU GPL, version 2 or later. See +the COPYING file in the top-level directory. + +Saving and restoring the state of virtio devices is a bit of a twisty maze, +for several reasons: + +- state is distributed between several parts: + + - virtio core, for common fields like features, number of queues, ... + + - virtio transport (pci, ccw, ...), for the different proxy devices and +transport specific state (msix vectors, indicators, ...) + + - virtio device (net, blk, ...), for the different device types and their +state (mac address, request queue, ...) + +- most fields are saved via the stream interface; subsequently, subsections + have been added to make cross-version migration possible + +This file attempts to document the current procedure and point out some +caveats. 
+ +Save state procedure + + +:: + + virtio core virtio transport virtio device + --- - + + save() function registered + via VMState wrapper on + device class + virtio_save() <-- + --> save_config() +- save proxy device +- save transport-specific + device fields + - save common device +fields + - save common virtqueue +fields + --> save_queue() +- save transport-specific + virtqueue fields + --> save_device() + - save device-specific + fields + - save subsections +- device endianness, + if changed from + default endianness +- 64 bit features, if + any high feature bit + is set +- virtio-1 virtqueue + fields, if VERSION_1 + is set + +Load state procedure + + +:: + + virtio core virtio transport virtio device + --- - + + load() function registered + via VMState wrapper on + device class + virtio_load() <-- + --> load_config() +- load proxy device +- load transport-specific + device fields + - load common device +fields + - load common virtqueue +fields + --> load_queue() +- load transport-specific + virtqueue fields + - notify guest + --> load_device() + - load device-specific + fields + - load subsections +- device endianness +- 64 bit features +- virtio-1 virtqueue + fields + - sanitize endianness + - sanitize features + - virtqueue index sanity +check + - feature-dependent setup + +Implications of this setup +== + +Devices need to be careful in their state processing during load: The +load_device() procedure is invoked by the core before subsections have +been loaded. Any code that depends on information transmitted in subsections
Re: [PATCH v6 1/2] qom: new object to associate device to numa node
Ankit Agrawal writes:

>>> +##
>>> +# @AcpiGenericInitiatorProperties:
>>> +#
>>> +# Properties for acpi-generic-initiator objects.
>>> +#
>>> +# @pci-dev: PCI device ID to be associated with the node
>>> +#
>>> +# @host-nodes: numa node list associated with the PCI device.
>>
>> NUMA
>>
>> Suggest "list of NUMA nodes associated with ..."
>
> Ack, will make the change.
>
>>> @@ -981,6 +997,7 @@
>>>        'id': 'str' },
>>>    'discriminator': 'qom-type',
>>>    'data': {
>>> +      'acpi-generic-initiator': 'AcpiGenericInitiatorProperties',
>>>        'authz-list': 'AuthZListProperties',
>>>        'authz-listfile': 'AuthZListFileProperties',
>>>        'authz-pam': 'AuthZPAMProperties',
>>
>> I'm holding my Acked-by until the interface design issues raised by
>> Jason have been resolved.
>
> I suppose you meant Jonathan here?

Yes.  Going too fast.  My apologies!
Re: [PATCH 02/10] docs/migration: Create index page
On 1/9/24 07:46, pet...@redhat.com wrote:
From: Peter Xu

Create an index page for migration module.  Move VFIO migration there too.
A trivial touch-up on the title to use lower case there.

Since then we'll have "migration" as the top title, make the main doc file
renamed to "migration framework".

Cc: Alex Williamson
Cc: Cédric Le Goater
Signed-off-by: Peter Xu

Reviewed-by: Cédric Le Goater

Thanks,

C.

---
 docs/devel/index-internals.rst |  3 +--
 docs/devel/migration/index.rst | 11 +++++++++++
 docs/devel/migration/main.rst  |  6 +++---
 docs/devel/migration/vfio.rst  |  2 +-
 4 files changed, 16 insertions(+), 6 deletions(-)
 create mode 100644 docs/devel/migration/index.rst

diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index a41d62c1eb..5636e9cf1d 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -11,13 +11,12 @@ Details about QEMU's various subsystems including how to add features to them.
    block-coroutine-wrapper
    clocks
    ebpf_rss
-   migration/main
+   migration/index
    multi-process
    reset
    s390-cpu-topology
    s390-dasd-ipl
    tracing
-   vfio-migration
    vfio-iommufd
    writing-monitor-commands
    virtio-backends
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
new file mode 100644
index 0000000000..02cfdcc969
--- /dev/null
+++ b/docs/devel/migration/index.rst
@@ -0,0 +1,11 @@
+Migration
+=========
+
+This is the main entry for QEMU migration documentations.  It explains how
+QEMU live migration works.
+
+.. toctree::
+   :maxdepth: 2
+
+   main
+   vfio
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 95351ba51f..62bf027fb4 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -1,6 +1,6 @@
-=========
-Migration
-=========
+===================
+Migration framework
+===================
 
 QEMU has code to load/save the state of the guest that it is running.
 These are two complementary operations.  Saving the state just does
diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst
index 605fe60e96..c49482eab6 100644
--- a/docs/devel/migration/vfio.rst
+++ b/docs/devel/migration/vfio.rst
@@ -1,5 +1,5 @@
 =====================
-VFIO device Migration
+VFIO device migration
 =====================
 
 Migration of virtual machine involves saving the state for each device that
Re: [PATCH 01/10] docs/migration: Create migration/ directory
On 1/9/24 07:46, pet...@redhat.com wrote: From: Peter Xu Migration documentation is growing into a single file too large. Create a sub-directory for it for a split. We also already have separate vfio/virtio documentations, move it all over into the directory. Note that the virtio one is still not yet converted to rST. That is a job for later. Cc: Michael S. Tsirkin Cc: Jason Wang Cc: Alex Williamson Cc: Cédric Le Goater Signed-off-by: Peter Xu Reviewed-by: Cédric Le Goater Thanks, C. --- docs/devel/index-internals.rst| 2 +- docs/devel/{migration.rst => migration/main.rst} | 0 docs/devel/{vfio-migration.rst => migration/vfio.rst} | 0 docs/devel/{virtio-migration.txt => migration/virtio.txt} | 0 4 files changed, 1 insertion(+), 1 deletion(-) rename docs/devel/{migration.rst => migration/main.rst} (100%) rename docs/devel/{vfio-migration.rst => migration/vfio.rst} (100%) rename docs/devel/{virtio-migration.txt => migration/virtio.txt} (100%) diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst index 3def4a138b..a41d62c1eb 100644 --- a/docs/devel/index-internals.rst +++ b/docs/devel/index-internals.rst @@ -11,7 +11,7 @@ Details about QEMU's various subsystems including how to add features to them. block-coroutine-wrapper clocks ebpf_rss - migration + migration/main multi-process reset s390-cpu-topology diff --git a/docs/devel/migration.rst b/docs/devel/migration/main.rst similarity index 100% rename from docs/devel/migration.rst rename to docs/devel/migration/main.rst diff --git a/docs/devel/vfio-migration.rst b/docs/devel/migration/vfio.rst similarity index 100% rename from docs/devel/vfio-migration.rst rename to docs/devel/migration/vfio.rst diff --git a/docs/devel/virtio-migration.txt b/docs/devel/migration/virtio.txt similarity index 100% rename from docs/devel/virtio-migration.txt rename to docs/devel/migration/virtio.txt
[PATCH 10/10] docs/migration: Further move virtio to be feature of migration
From: Peter Xu Move it one layer down, treating virtio migration as a feature of migration. Cc: Michael S. Tsirkin Cc: Jason Wang Signed-off-by: Peter Xu --- docs/devel/migration/features.rst | 1 + docs/devel/migration/index.rst| 1 - 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst index dea016f707..a9acaf618e 100644 --- a/docs/devel/migration/features.rst +++ b/docs/devel/migration/features.rst @@ -9,3 +9,4 @@ Migration has plenty of features to support different use cases. postcopy dirty-limit vfio + virtio diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst index 2479e8ecb7..7b7a706e35 100644 --- a/docs/devel/migration/index.rst +++ b/docs/devel/migration/index.rst @@ -10,5 +10,4 @@ QEMU live migration works. main features compatibility - virtio best-practises -- 2.41.0
[PATCH 06/10] docs/migration: Split "Postcopy"
From: Peter Xu Split postcopy into a separate file. Introduce a head page "features.rst" to keep all the features on top of migration framework. Signed-off-by: Peter Xu --- docs/devel/migration/features.rst | 9 + docs/devel/migration/index.rst| 1 + docs/devel/migration/main.rst | 305 -- docs/devel/migration/postcopy.rst | 304 + 4 files changed, 314 insertions(+), 305 deletions(-) create mode 100644 docs/devel/migration/features.rst create mode 100644 docs/devel/migration/postcopy.rst diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst new file mode 100644 index 00..0054e0c900 --- /dev/null +++ b/docs/devel/migration/features.rst @@ -0,0 +1,9 @@ +Migration features +== + +Migration has plenty of features to support different use cases. + +.. toctree:: + :maxdepth: 2 + + postcopy diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst index c09623b38f..7cf62541b9 100644 --- a/docs/devel/migration/index.rst +++ b/docs/devel/migration/index.rst @@ -8,6 +8,7 @@ QEMU live migration works. :maxdepth: 2 main + features compatibility vfio virtio diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst index 97811ce371..051ea43f0e 100644 --- a/docs/devel/migration/main.rst +++ b/docs/devel/migration/main.rst @@ -644,308 +644,3 @@ algorithm will restrict virtual CPUs as needed to keep their dirty page rate inside the limit. This leads to more steady reading performance during live migration and can aid in improving large guest responsiveness. -Postcopy - - -'Postcopy' migration is a way to deal with migrations that refuse to converge -(or take too long to converge) its plus side is that there is an upper bound on -the amount of migration traffic and time it takes, the down side is that during -the postcopy phase, a failure of *either* side causes the guest to be lost. 
- -In postcopy the destination CPUs are started before all the memory has been -transferred, and accesses to pages that are yet to be transferred cause -a fault that's translated by QEMU into a request to the source QEMU. - -Postcopy can be combined with precopy (i.e. normal migration) so that if precopy -doesn't finish in a given time the switch is made to postcopy. - -Enabling postcopy -- - -To enable postcopy, issue this command on the monitor (both source and -destination) prior to the start of migration: - -``migrate_set_capability postcopy-ram on`` - -The normal commands are then used to start a migration, which is still -started in precopy mode. Issuing: - -``migrate_start_postcopy`` - -will now cause the transition from precopy to postcopy. -It can be issued immediately after migration is started or any -time later on. Issuing it after the end of a migration is harmless. - -Blocktime is a postcopy live migration metric, intended to show how -long the vCPU was in state of interruptible sleep due to pagefault. -That metric is calculated both for all vCPUs as overlapped value, and -separately for each vCPU. These values are calculated on destination -side. To enable postcopy blocktime calculation, enter following -command on destination monitor: - -``migrate_set_capability postcopy-blocktime on`` - -Postcopy blocktime can be retrieved by query-migrate qmp command. -postcopy-blocktime value of qmp command will show overlapped blocking -time for all vCPU, postcopy-vcpu-blocktime will show list of blocking -time per vCPU. - -.. note:: - During the postcopy phase, the bandwidth limits set using - ``migrate_set_parameter`` is ignored (to avoid delaying requested pages that - the destination is waiting for). 
- -Postcopy device transfer - - -Loading of device data may cause the device emulation to access guest RAM -that may trigger faults that have to be resolved by the source, as such -the migration stream has to be able to respond with page data *during* the -device load, and hence the device data has to be read from the stream completely -before the device load begins to free the stream up. This is achieved by -'packaging' the device data into a blob that's read in one go. - -Source behaviour - - -Until postcopy is entered the migration stream is identical to normal -precopy, except for the addition of a 'postcopy advise' command at -the beginning, to tell the destination that postcopy might happen. -When postcopy starts the source sends the page discard data and then -forms the 'package' containing: - - - Command: 'postcopy listen' - - The device state - - A series of sections, identical to the precopy streams device state stream - containing everything except postcopiable devices (i.e. RAM) - - Command: 'postcopy run' - -The 'package' is sent as the data part of a Command: ``CMD_PACKAGED``, and the -contents are formatted in the same way as the main migration
[PATCH 02/10] docs/migration: Create index page
From: Peter Xu Create an index page for migration module. Move VFIO migration there too. A trivial touch-up on the title to use lower case there. Since then we'll have "migration" as the top title, make the main doc file renamed to "migration framework". Cc: Alex Williamson Cc: Cédric Le Goater Signed-off-by: Peter Xu --- docs/devel/index-internals.rst | 3 +-- docs/devel/migration/index.rst | 11 +++ docs/devel/migration/main.rst | 6 +++--- docs/devel/migration/vfio.rst | 2 +- 4 files changed, 16 insertions(+), 6 deletions(-) create mode 100644 docs/devel/migration/index.rst diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst index a41d62c1eb..5636e9cf1d 100644 --- a/docs/devel/index-internals.rst +++ b/docs/devel/index-internals.rst @@ -11,13 +11,12 @@ Details about QEMU's various subsystems including how to add features to them. block-coroutine-wrapper clocks ebpf_rss - migration/main + migration/index multi-process reset s390-cpu-topology s390-dasd-ipl tracing - vfio-migration vfio-iommufd writing-monitor-commands virtio-backends diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst new file mode 100644 index 00..02cfdcc969 --- /dev/null +++ b/docs/devel/migration/index.rst @@ -0,0 +1,11 @@ +Migration += + +This is the main entry for QEMU migration documentations. It explains how +QEMU live migration works. + +.. toctree:: + :maxdepth: 2 + + main + vfio diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst index 95351ba51f..62bf027fb4 100644 --- a/docs/devel/migration/main.rst +++ b/docs/devel/migration/main.rst @@ -1,6 +1,6 @@ -= -Migration -= +=== +Migration framework +=== QEMU has code to load/save the state of the guest that it is running. These are two complementary operations. 
Saving the state just does diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst index 605fe60e96..c49482eab6 100644 --- a/docs/devel/migration/vfio.rst +++ b/docs/devel/migration/vfio.rst @@ -1,5 +1,5 @@ = -VFIO device Migration +VFIO device migration = Migration of virtual machine involves saving the state for each device that -- 2.41.0
[PATCH 09/10] docs/migration: Further move vfio to be feature of migration
From: Peter Xu Move it one layer down, treating VFIO migration as a feature of migration. Cc: Alex Williamson Cc: Cédric Le Goater Signed-off-by: Peter Xu --- docs/devel/migration/features.rst | 1 + docs/devel/migration/index.rst| 1 - 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst index e257d0d100..dea016f707 100644 --- a/docs/devel/migration/features.rst +++ b/docs/devel/migration/features.rst @@ -8,3 +8,4 @@ Migration has plenty of features to support different use cases. postcopy dirty-limit + vfio diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst index 7cf62541b9..2479e8ecb7 100644 --- a/docs/devel/migration/index.rst +++ b/docs/devel/migration/index.rst @@ -10,6 +10,5 @@ QEMU live migration works. main features compatibility - vfio virtio best-practises -- 2.41.0
[PATCH 08/10] docs/migration: Organize "Postcopy" page
From: Peter Xu Reorganize the page, moving things around, and add a few headlines ("Postcopy internals", "Postcopy features") to cover sub-areas. Signed-off-by: Peter Xu --- docs/devel/migration/postcopy.rst | 159 -- 1 file changed, 84 insertions(+), 75 deletions(-) diff --git a/docs/devel/migration/postcopy.rst b/docs/devel/migration/postcopy.rst index d60eec06ab..6c51e96d79 100644 --- a/docs/devel/migration/postcopy.rst +++ b/docs/devel/migration/postcopy.rst @@ -1,6 +1,9 @@ + Postcopy +.. contents:: + 'Postcopy' migration is a way to deal with migrations that refuse to converge (or take too long to converge) its plus side is that there is an upper bound on the amount of migration traffic and time it takes, the down side is that during @@ -14,7 +17,7 @@ Postcopy can be combined with precopy (i.e. normal migration) so that if precopy doesn't finish in a given time the switch is made to postcopy. Enabling postcopy -- += To enable postcopy, issue this command on the monitor (both source and destination) prior to the start of migration: @@ -49,8 +52,71 @@ time per vCPU. ``migrate_set_parameter`` is ignored (to avoid delaying requested pages that the destination is waiting for). -Postcopy device transfer - +Postcopy internals +== + +State machine +- + +Postcopy moves through a series of states (see postcopy_state) from +ADVISE->DISCARD->LISTEN->RUNNING->END + + - Advise + +Set at the start of migration if postcopy is enabled, even +if it hasn't had the start command; here the destination +checks that its OS has the support needed for postcopy, and performs +setup to ensure the RAM mappings are suitable for later postcopy. +The destination will fail early in migration at this point if the +required OS support is not present. 
+(Triggered by reception of POSTCOPY_ADVISE command) + + - Discard + +Entered on receipt of the first 'discard' command; prior to +the first Discard being performed, hugepages are switched off +(using madvise) to ensure that no new huge pages are created +during the postcopy phase, and to cause any huge pages that +have discards on them to be broken. + + - Listen + +The first command in the package, POSTCOPY_LISTEN, switches +the destination state to Listen, and starts a new thread +(the 'listen thread') which takes over the job of receiving +pages off the migration stream, while the main thread carries +on processing the blob. With this thread able to process page +reception, the destination now 'sensitises' the RAM to detect +any access to missing pages (on Linux using the 'userfault' +system). + + - Running + +POSTCOPY_RUN causes the destination to synchronise all +state and start the CPUs and IO devices running. The main +thread now finishes processing the migration package and +now carries on as it would for normal precopy migration +(although it can't do the cleanup it would do as it +finishes a normal migration). + + - Paused + +Postcopy can run into a paused state (normally on both sides when +happens), where all threads will be temporarily halted mostly due to +network errors. When reaching paused state, migration will make sure +the qemu binary on both sides maintain the data without corrupting +the VM. To continue the migration, the admin needs to fix the +migration channel using the QMP command 'migrate-recover' on the +destination node, then resume the migration using QMP command 'migrate' +again on source node, with resume=true flag set. + + - End + +The listen thread can now quit, and perform the cleanup of migration +state, the migration is now complete. 
+ +Device transfer +--- Loading of device data may cause the device emulation to access guest RAM that may trigger faults that have to be resolved by the source, as such @@ -130,7 +196,20 @@ processing. is no longer used by migration, while the listen thread carries on servicing page data until the end of migration. -Postcopy Recovery +Source side page bitmap +--- + +The 'migration bitmap' in postcopy is basically the same as in the precopy, +where each of the bit to indicate that page is 'dirty' - i.e. needs +sending. During the precopy phase this is updated as the CPU dirties +pages, however during postcopy the CPUs are stopped and nothing should +dirty anything any more. Instead, dirty bits are cleared when the relevant +pages are sent during postcopy. + +Postcopy features += + +Postcopy recovery - Comparing to precopy, postcopy is special on error handlings. When any @@ -166,76 +245,6 @@ configurations of the guest. For example, when with async page fault enabled, logically the guest can proactively schedule out the threads
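A note for readers of the reorganized "State machine" section: the ADVISE->DISCARD->LISTEN->RUNNING->END sequence, plus the Paused state, can be written down as a small transition table. The following is only a sketch in Python, not QEMU code. The class and table names are made up for illustration, and the RUNNING <-> PAUSED edges are an assumption drawn from the description of pause and resume in the text.

```python
from enum import Enum, auto


class PostcopyState(Enum):
    """States named in the postcopy page (names here are illustrative)."""
    ADVISE = auto()
    DISCARD = auto()
    LISTEN = auto()
    RUNNING = auto()
    PAUSED = auto()
    END = auto()


# Allowed transitions. The linear chain follows the documented sequence;
# the RUNNING <-> PAUSED edges are an assumption based on the "Paused" text.
_NEXT = {
    None: {PostcopyState.ADVISE},
    PostcopyState.ADVISE: {PostcopyState.DISCARD},
    PostcopyState.DISCARD: {PostcopyState.LISTEN},
    PostcopyState.LISTEN: {PostcopyState.RUNNING},
    PostcopyState.RUNNING: {PostcopyState.PAUSED, PostcopyState.END},
    PostcopyState.PAUSED: {PostcopyState.RUNNING},
    PostcopyState.END: set(),
}


class PostcopyTracker:
    """Hypothetical helper that rejects out-of-order state changes."""

    def __init__(self):
        self.state = None  # postcopy not yet advised

    def advance(self, new_state):
        if new_state not in _NEXT[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
```

A tracker like this would accept a pause and recovery in the middle of the RUNNING phase, but would refuse, for example, a jump from ADVISE straight to RUNNING.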
[PATCH 07/10] docs/migration: Split "dirty limit"
From: Peter Xu Split that into a separate file, put under "features". Cc: Yong Huang Signed-off-by: Peter Xu --- docs/devel/migration/dirty-limit.rst | 71 docs/devel/migration/features.rst| 1 + docs/devel/migration/main.rst| 71 3 files changed, 72 insertions(+), 71 deletions(-) create mode 100644 docs/devel/migration/dirty-limit.rst diff --git a/docs/devel/migration/dirty-limit.rst b/docs/devel/migration/dirty-limit.rst new file mode 100644 index 00..8f32329d5f --- /dev/null +++ b/docs/devel/migration/dirty-limit.rst @@ -0,0 +1,71 @@ +Dirty limit +=== + +The dirty limit, short for dirty page rate upper limit, is a new capability +introduced in the 8.1 QEMU release that uses a new algorithm based on the KVM +dirty ring to throttle down the guest during live migration. + +The algorithm framework is as follows: + +:: + + -- + main --> throttle thread > PREPARE(1) < + thread \| | + \ | | +\ V | + -\CALCULATE(2) | + \ | | +\ | | + \ V | + \SET PENALTY(3) - + -\ | + \ | + \V + -> virtual CPU thread ---> ACCEPT PENALTY(4) + -- + +When the qmp command qmp_set_vcpu_dirty_limit is called for the first time, +the QEMU main thread starts the throttle thread. The throttle thread, once +launched, executes the loop, which consists of three steps: + + - PREPARE (1) + + The entire work of PREPARE (1) is preparation for the second stage, + CALCULATE(2), as the name implies. It involves preparing the dirty + page rate value and the corresponding upper limit of the VM: + The dirty page rate is calculated via the KVM dirty ring mechanism, + which tells QEMU how many dirty pages a virtual CPU has had since the + last KVM_EXIT_DIRTY_RING_FULL exception; The dirty page rate upper + limit is specified by caller, therefore fetch it directly. + + - CALCULATE (2) + + Calculate a suitable sleep period for each virtual CPU, which will be + used to determine the penalty for the target virtual CPU. 
The + computation must be done carefully in order to reduce the dirty page + rate progressively down to the upper limit without oscillation. To + achieve this, two strategies are provided: the first is to add or + subtract sleep time based on the ratio of the current dirty page rate + to the limit, which is used when the current dirty page rate is far + from the limit; the second is to add or subtract a fixed time when + the current dirty page rate is close to the limit. + + - SET PENALTY (3) + + Set the sleep time for each virtual CPU that should be penalized based + on the results of the calculation supplied by step CALCULATE (2). + +After completing the three above stages, the throttle thread loops back +to step PREPARE (1) until the dirty limit is reached. + +On the other hand, each virtual CPU thread reads the sleep duration and +sleeps in the path of the KVM_EXIT_DIRTY_RING_FULL exception handler, that +is ACCEPT PENALTY (4). Virtual CPUs tied with writing processes will +obviously exit to the path and get penalized, whereas virtual CPUs involved +with read processes will not. + +In summary, thanks to the KVM dirty ring technology, the dirty limit +algorithm will restrict virtual CPUs as needed to keep their dirty page +rate inside the limit. This leads to more steady reading performance during +live migration and can aid in improving large guest responsiveness. diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst index 0054e0c900..e257d0d100 100644 --- a/docs/devel/migration/features.rst +++ b/docs/devel/migration/features.rst @@ -7,3 +7,4 @@ Migration has plenty of features to support different use cases. :maxdepth: 2 postcopy + dirty-limit diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst index 051ea43f0e..00b9c3d32f 100644 --- a/docs/devel/migration/main.rst +++ b/docs/devel/migration/main.rst @@ -573,74 +573,3 @@ path. Return path - opened by main thread, written by main thread AND postcopy
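For anyone reading the CALCULATE (2) step in the dirty-limit page, the two strategies (a ratio-based adjustment when far from the limit, a fixed step when close) could look roughly like the sketch below. This is plain Python and not the QEMU implementation: the function name, the threshold for "far", the fixed step and the cap are all hypothetical values picked for illustration.

```python
# All constants below are assumptions for illustration, not QEMU's values.
FAR_RATIO = 0.5          # "far from the limit": more than 50% away from it
FIXED_STEP_US = 1_000    # fixed adjustment used when close to the limit
MAX_SLEEP_US = 500_000   # upper bound on the per-vCPU sleep time


def next_sleep_us(sleep_us, dirty_rate, limit):
    """Return the adjusted sleep time for one vCPU.

    dirty_rate and limit share one unit (e.g. MB/s); sleep_us is the
    sleep time currently applied to the vCPU, in microseconds.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if dirty_rate == limit:
        return sleep_us

    distance = abs(dirty_rate - limit) / limit
    if distance > FAR_RATIO:
        # Strategy 1: scale the step with how far the rate is from the limit.
        step = max(FIXED_STEP_US, int(sleep_us * distance))
    else:
        # Strategy 2: near the limit, move in small fixed steps so the
        # rate settles without oscillating.
        step = FIXED_STEP_US

    if dirty_rate > limit:
        sleep_us += step   # dirtying too fast: penalize more
    else:
        sleep_us -= step   # below the limit: relax the penalty

    return min(max(sleep_us, 0), MAX_SLEEP_US)
```

The throttle thread would call something like this once per loop for each vCPU, and the vCPU thread would then sleep for the returned time when it takes the KVM_EXIT_DIRTY_RING_FULL exit.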
[PATCH 03/10] docs/migration: Convert virtio.txt into rST
From: Peter Xu Convert the plain old .txt into .rst, add it into migration/index.rst. Signed-off-by: Peter Xu --- docs/devel/migration/index.rst | 1 + docs/devel/migration/virtio.rst | 115 docs/devel/migration/virtio.txt | 108 -- 3 files changed, 116 insertions(+), 108 deletions(-) create mode 100644 docs/devel/migration/virtio.rst delete mode 100644 docs/devel/migration/virtio.txt diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst index 02cfdcc969..2cb701c77c 100644 --- a/docs/devel/migration/index.rst +++ b/docs/devel/migration/index.rst @@ -9,3 +9,4 @@ QEMU live migration works. main vfio + virtio diff --git a/docs/devel/migration/virtio.rst b/docs/devel/migration/virtio.rst new file mode 100644 index 00..611a18b821 --- /dev/null +++ b/docs/devel/migration/virtio.rst @@ -0,0 +1,115 @@ +=== +Virtio device migration +=== + +Copyright 2015 IBM Corp. + +This work is licensed under the terms of the GNU GPL, version 2 or later. See +the COPYING file in the top-level directory. + +Saving and restoring the state of virtio devices is a bit of a twisty maze, +for several reasons: + +- state is distributed between several parts: + + - virtio core, for common fields like features, number of queues, ... + + - virtio transport (pci, ccw, ...), for the different proxy devices and +transport specific state (msix vectors, indicators, ...) + + - virtio device (net, blk, ...), for the different device types and their +state (mac address, request queue, ...) + +- most fields are saved via the stream interface; subsequently, subsections + have been added to make cross-version migration possible + +This file attempts to document the current procedure and point out some +caveats. 
+ +Save state procedure + + +:: + + virtio core virtio transport virtio device + --- - + + save() function registered + via VMState wrapper on + device class + virtio_save() <-- + --> save_config() +- save proxy device +- save transport-specific + device fields + - save common device +fields + - save common virtqueue +fields + --> save_queue() +- save transport-specific + virtqueue fields + --> save_device() + - save device-specific + fields + - save subsections +- device endianness, + if changed from + default endianness +- 64 bit features, if + any high feature bit + is set +- virtio-1 virtqueue + fields, if VERSION_1 + is set + +Load state procedure + + +:: + + virtio core virtio transport virtio device + --- - + + load() function registered + via VMState wrapper on + device class + virtio_load() <-- + --> load_config() +- load proxy device +- load transport-specific + device fields + - load common device +fields + - load common virtqueue +fields + --> load_queue() +- load transport-specific + virtqueue fields + - notify guest + --> load_device() + - load device-specific + fields + - load subsections +- device endianness +- 64 bit features +- virtio-1 virtqueue + fields + - sanitize endianness + - sanitize features + - virtqueue index sanity +check + - feature-dependent setup + +Implications of this setup +== + +Devices need to be careful in their state processing during load: The +load_device() procedure is invoked by the core before subsections have +been loaded. Any code that depends on information transmitted in subsections +therefore has to be invoked in the device's load() function _after_ +virtio_load() returned (like e.g.
[PATCH 05/10] docs/migration: Split "Debugging" and "Firmware"
From: Peter Xu Move the two sections into a separate file called "best-practises.rst". Add the entry into index. Signed-off-by: Peter Xu --- docs/devel/migration/best-practises.rst | 48 + docs/devel/migration/index.rst | 1 + docs/devel/migration/main.rst | 44 --- 3 files changed, 49 insertions(+), 44 deletions(-) create mode 100644 docs/devel/migration/best-practises.rst diff --git a/docs/devel/migration/best-practises.rst b/docs/devel/migration/best-practises.rst new file mode 100644 index 00..ba122ae417 --- /dev/null +++ b/docs/devel/migration/best-practises.rst @@ -0,0 +1,48 @@ +== +Best practises +== + +Debugging += + +The migration stream can be analyzed thanks to ``scripts/analyze-migration.py``. + +Example usage: + +.. code-block:: shell + + $ qemu-system-x86_64 -display none -monitor stdio + (qemu) migrate "exec:cat > mig" + (qemu) q + $ ./scripts/analyze-migration.py -f mig + { +"ram (3)": { +"section sizes": { +"pc.ram": "0x0800", + ... + +See also ``analyze-migration.py -h`` help for more options. + +Firmware + + +Migration migrates the copies of RAM and ROM, and thus when running +on the destination it includes the firmware from the source. Even after +resetting a VM, the old firmware is used. Only once QEMU has been restarted +is the new firmware in use. + +- Changes in firmware size can cause changes in the required RAMBlock size + to hold the firmware and thus migration can fail. In practice it's best + to pad firmware images to convenient powers of 2 with plenty of space + for growth. + +- Care should be taken with device emulation code so that newer + emulation code can work with older firmware to allow forward migration. + +- Care should be taken with newer firmware so that backward migration + to older systems with older device emulation code will work. + +In some cases it may be best to tie specific firmware versions to specific +versioned machine types to cut down on the combinations that will need +support. 
This is also useful when newer versions of firmware outgrow +the padding. diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst index 7fc02b9520..c09623b38f 100644 --- a/docs/devel/migration/index.rst +++ b/docs/devel/migration/index.rst @@ -11,3 +11,4 @@ QEMU live migration works. compatibility vfio virtio + best-practises diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst index b3e31bb52f..97811ce371 100644 --- a/docs/devel/migration/main.rst +++ b/docs/devel/migration/main.rst @@ -52,27 +52,6 @@ All these migration protocols use the same infrastructure to save/restore state devices. This infrastructure is shared with the savevm/loadvm functionality. -Debugging -= - -The migration stream can be analyzed thanks to ``scripts/analyze-migration.py``. - -Example usage: - -.. code-block:: shell - - $ qemu-system-x86_64 -display none -monitor stdio - (qemu) migrate "exec:cat > mig" - (qemu) q - $ ./scripts/analyze-migration.py -f mig - { -"ram (3)": { -"section sizes": { -"pc.ram": "0x0800", - ... - -See also ``analyze-migration.py -h`` help for more options. - Common infrastructure = @@ -970,26 +949,3 @@ the background migration channel. Anyone who cares about latencies of page faults during a postcopy migration should enable this feature. By default, it's not enabled. -Firmware - - -Migration migrates the copies of RAM and ROM, and thus when running -on the destination it includes the firmware from the source. Even after -resetting a VM, the old firmware is used. Only once QEMU has been restarted -is the new firmware in use. - -- Changes in firmware size can cause changes in the required RAMBlock size - to hold the firmware and thus migration can fail. In practice it's best - to pad firmware images to convenient powers of 2 with plenty of space - for growth. - -- Care should be taken with device emulation code so that newer - emulation code can work with older firmware to allow forward migration. 
- -- Care should be taken with newer firmware so that backward migration - to older systems with older device emulation code will work. - -In some cases it may be best to tie specific firmware versions to specific -versioned machine types to cut down on the combinations that will need -support. This is also useful when newer versions of firmware outgrow -the padding. -- 2.41.0
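As a side note on the "Firmware" section moved by this patch: the advice to pad firmware images to a convenient power of 2 with room for growth is easy to turn into a size calculation. The snippet below is only a sketch; the function name and the `headroom` multiplier are assumptions for illustration and do not correspond to any QEMU or firmware build tool.

```python
def padded_firmware_size(image_size, headroom=2):
    """Return the smallest power of two >= image_size * headroom.

    image_size is the current firmware image size in bytes; headroom is
    a hypothetical integer growth factor (2 reserves at least as much
    free space as the image currently occupies).
    """
    if image_size <= 0 or headroom < 1:
        raise ValueError("image_size must be > 0 and headroom >= 1")
    target = image_size * headroom
    # (target - 1).bit_length() is the exponent of the next power of two
    # that is >= target; exact powers of two are returned unchanged.
    return 1 << (target - 1).bit_length()
```

With a fixed padded size, a later and slightly larger firmware build still fits in the same RAMBlock, which is what avoids the migration failure described above.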
[PATCH 01/10] docs/migration: Create migration/ directory
From: Peter Xu Migration documentation is growing into a single file too large. Create a sub-directory for it for a split. We also already have separate vfio/virtio documentations, move it all over into the directory. Note that the virtio one is still not yet converted to rST. That is a job for later. Cc: Michael S. Tsirkin Cc: Jason Wang Cc: Alex Williamson Cc: Cédric Le Goater Signed-off-by: Peter Xu --- docs/devel/index-internals.rst| 2 +- docs/devel/{migration.rst => migration/main.rst} | 0 docs/devel/{vfio-migration.rst => migration/vfio.rst} | 0 docs/devel/{virtio-migration.txt => migration/virtio.txt} | 0 4 files changed, 1 insertion(+), 1 deletion(-) rename docs/devel/{migration.rst => migration/main.rst} (100%) rename docs/devel/{vfio-migration.rst => migration/vfio.rst} (100%) rename docs/devel/{virtio-migration.txt => migration/virtio.txt} (100%) diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst index 3def4a138b..a41d62c1eb 100644 --- a/docs/devel/index-internals.rst +++ b/docs/devel/index-internals.rst @@ -11,7 +11,7 @@ Details about QEMU's various subsystems including how to add features to them. block-coroutine-wrapper clocks ebpf_rss - migration + migration/main multi-process reset s390-cpu-topology diff --git a/docs/devel/migration.rst b/docs/devel/migration/main.rst similarity index 100% rename from docs/devel/migration.rst rename to docs/devel/migration/main.rst diff --git a/docs/devel/vfio-migration.rst b/docs/devel/migration/vfio.rst similarity index 100% rename from docs/devel/vfio-migration.rst rename to docs/devel/migration/vfio.rst diff --git a/docs/devel/virtio-migration.txt b/docs/devel/migration/virtio.txt similarity index 100% rename from docs/devel/virtio-migration.txt rename to docs/devel/migration/virtio.txt -- 2.41.0
[PATCH 00/10] docs/migration: Reorganize migration documentations
From: Peter Xu Migration docs grow larger and larger. There are plenty of things we can do here in the future, but to start that we'd better reorganize the current bloated doc files first and properly organize them into separate files. This series kicks that off. This series mostly does the movement only, so please don't be scared of the slightly large diff. I did touch up things here and there, but I haven't yet started writing much. One thing I did is I converted virtio.txt to rST, but that's trivial and I touched no real content. I am copying both virtio and vfio people because I'm merging the two separate files into the new docs/devel/migration/ folder. Comments welcome. Thanks, Peter Xu (10): docs/migration: Create migration/ directory docs/migration: Create index page docs/migration: Convert virtio.txt into rST docs/migration: Split "Backwards compatibility" separately docs/migration: Split "Debugging" and "Firmware" docs/migration: Split "Postcopy" docs/migration: Split "dirty limit" docs/migration: Organize "Postcopy" page docs/migration: Further move vfio to be feature of migration docs/migration: Further move virtio to be feature of migration docs/devel/index-internals.rst|3 +- docs/devel/migration.rst | 1514 - docs/devel/migration/best-practises.rst | 48 + docs/devel/migration/compatibility.rst| 517 ++ docs/devel/migration/dirty-limit.rst | 71 + docs/devel/migration/features.rst | 12 + docs/devel/migration/index.rst| 13 + docs/devel/migration/main.rst | 575 +++ docs/devel/migration/postcopy.rst | 313 .../vfio.rst} |2 +- docs/devel/migration/virtio.rst | 115 ++ docs/devel/virtio-migration.txt | 108 -- 12 files changed, 1666 insertions(+), 1625 deletions(-) delete mode 100644 docs/devel/migration.rst create mode 100644 docs/devel/migration/best-practises.rst create mode 100644 docs/devel/migration/compatibility.rst create mode 100644 docs/devel/migration/dirty-limit.rst create mode 100644 docs/devel/migration/features.rst create mode 100644 
docs/devel/migration/index.rst create mode 100644 docs/devel/migration/main.rst create mode 100644 docs/devel/migration/postcopy.rst rename docs/devel/{vfio-migration.rst => migration/vfio.rst} (99%) create mode 100644 docs/devel/migration/virtio.rst delete mode 100644 docs/devel/virtio-migration.txt -- 2.41.0
[PATCH 04/10] docs/migration: Split "Backwards compatibility" separately
From: Peter Xu Split the section from main.rst into a separate file. Reference it in the index.rst. Signed-off-by: Peter Xu --- docs/devel/migration/compatibility.rst | 517 docs/devel/migration/index.rst | 1 + docs/devel/migration/main.rst | 519 - 3 files changed, 518 insertions(+), 519 deletions(-) create mode 100644 docs/devel/migration/compatibility.rst diff --git a/docs/devel/migration/compatibility.rst b/docs/devel/migration/compatibility.rst new file mode 100644 index 00..5a5417ef06 --- /dev/null +++ b/docs/devel/migration/compatibility.rst @@ -0,0 +1,517 @@ +Backwards compatibility +=== + +How backwards compatibility works +- + +When we do migration, we have two QEMU processes: the source and the +target. There are two cases, they are the same version or they are +different versions. The easy case is when they are the same version. +The difficult one is when they are different versions. + +There are two things that are different, but they have very similar +names and sometimes get confused: + +- QEMU version +- machine type version + +Let's start with a practical example, we start with: + +- qemu-system-x86_64 (v5.2), from now on qemu-5.2. +- qemu-system-x86_64 (v5.1), from now on qemu-5.1. + +Related to this are the "latest" machine types defined on each of +them: + +- pc-q35-5.2 (newer one in qemu-5.2) from now on pc-5.2 +- pc-q35-5.1 (newer one in qemu-5.1) from now on pc-5.1 + +First of all, migration is only supposed to work if you use the same +machine type in both source and destination. The QEMU hardware +configuration needs to be the same also on source and destination. +Most aspects of the backend configuration can be changed at will, +except for a few cases where the backend features influence frontend +device feature exposure. But that is not relevant for this section. + +I am going to list the number of combinations that we can have. 
Let's +start with the trivial ones, QEMU is the same on source and +destination: + +1 - qemu-5.2 -M pc-5.2 -> migrates to -> qemu-5.2 -M pc-5.2 + + This is the latest QEMU with the latest machine type. + This have to work, and if it doesn't work it is a bug. + +2 - qemu-5.1 -M pc-5.1 -> migrates to -> qemu-5.1 -M pc-5.1 + + Exactly the same case than the previous one, but for 5.1. + Nothing to see here either. + +This are the easiest ones, we will not talk more about them in this +section. + +Now we start with the more interesting cases. Consider the case where +we have the same QEMU version in both sides (qemu-5.2) but we are using +the latest machine type for that version (pc-5.2) but one of an older +QEMU version, in this case pc-5.1. + +3 - qemu-5.2 -M pc-5.1 -> migrates to -> qemu-5.2 -M pc-5.1 + + It needs to use the definition of pc-5.1 and the devices as they + were configured on 5.1, but this should be easy in the sense that + both sides are the same QEMU and both sides have exactly the same + idea of what the pc-5.1 machine is. + +4 - qemu-5.1 -M pc-5.2 -> migrates to -> qemu-5.1 -M pc-5.2 + + This combination is not possible as the qemu-5.1 doesn't understand + pc-5.2 machine type. So nothing to worry here. + +Now it comes the interesting ones, when both QEMU processes are +different. Notice also that the machine type needs to be pc-5.1, +because we have the limitation than qemu-5.1 doesn't know pc-5.2. So +the possible cases are: + +5 - qemu-5.2 -M pc-5.1 -> migrates to -> qemu-5.1 -M pc-5.1 + + This migration is known as newer to older. We need to make sure + when we are developing 5.2 we need to take care about not to break + migration to qemu-5.1. Notice that we can't make updates to + qemu-5.1 to understand whatever qemu-5.2 decides to change, so it is + in qemu-5.2 side to make the relevant changes. + +6 - qemu-5.1 -M pc-5.1 -> migrates to -> qemu-5.2 -M pc-5.1 + + This migration is known as older to newer. 
We need to make sure + than we are able to receive migrations from qemu-5.1. The problem is + similar to the previous one. + +If qemu-5.1 and qemu-5.2 were the same, there will not be any +compatibility problems. But the reason that we create qemu-5.2 is to +get new features, devices, defaults, etc. + +If we get a device that has a new feature, or change a default value, +we have a problem when we try to migrate between different QEMU +versions. + +So we need a way to tell qemu-5.2 that when we are using machine type +pc-5.1, it needs to **not** use the feature, to be able to migrate to +real qemu-5.1. + +And the equivalent part when migrating from qemu-5.1 to qemu-5.2. +qemu-5.2 has to expect that it is not going to get data for the new +feature, because qemu-5.1 doesn't know about it. + +How do we tell QEMU about these device feature changes? In +hw/core/machine.c:hw_compat_X_Y arrays. + +If we change a default value, we need to put back the old value on +that
Re: [PATCH v3 06/70] kvm: Introduce support for memory_attributes
On 12/21/2023 9:47 PM, Wang, Wei W wrote:
On Thursday, December 21, 2023 7:54 PM, Li, Xiaoyao wrote:
On 12/21/2023 6:36 PM, Wang, Wei W wrote:

No need to specifically check for KVM_MEMORY_ATTRIBUTE_PRIVATE there. I'm suggesting below:

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 2d9a2455de..63ba74b221 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1375,6 +1375,11 @@ static int kvm_set_memory_attributes(hwaddr start, hwaddr size, uint64_t attr)
     struct kvm_memory_attributes attrs;
     int r;

+    if ((attr & kvm_supported_memory_attributes) != attr) {
+        error_report("KVM doesn't support memory attr %lx\n", attr);
+        return -EINVAL;
+    }

In the case of setting a range of memory to shared while KVM doesn't support private memory, the above check doesn't work, and the following ioctl fails.

SHARED attribute uses the value 0, which indicates it's always supported, no? For the implementation, can you find on the KVM side where the ioctl would fail in that case?

I'm worried about the future case where KVM supports memory attributes other than shared/private. For example, KVM supports RWX bits (bit 0 - 2) but not the shared/private bit.

This patch designs kvm_set_memory_attributes() to be common for all the bits (and for future bits), thus it leaves the support check to each caller function separately.

If you think it's unnecessary, I can change the name of kvm_set_memory_attributes() to kvm_set_memory_shared_private() to make it only for the shared/private bit, then the check can be moved to it.

static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
                                           struct kvm_memory_attributes *attrs)
{
        gfn_t start, end;

        /* flags is currently not used. */
        if (attrs->flags)
                return -EINVAL;
        if (attrs->attributes & ~kvm_supported_mem_attributes(kvm)) ==> 0 here
                return -EINVAL;
        if (attrs->size == 0 || attrs->address + attrs->size < attrs->address)
                return -EINVAL;
        if (!PAGE_ALIGNED(attrs->address) || !PAGE_ALIGNED(attrs->size))
                return -EINVAL;
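To make the mask check discussed above concrete, here is a minimal standalone sketch. It is not QEMU or kernel code: the function name, the constants and the bit position are illustrative assumptions. It only shows why a request with value 0 ("shared") passes the check whatever the supported mask is, while a request with an unsupported bit set is rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative values only; the real definitions live in the KVM headers. */
#define ATTR_SHARED   0ULL           /* "shared" means: no attribute bit set */
#define ATTR_PRIVATE  (1ULL << 3)    /* assumed position of the private bit */

/*
 * Hypothetical helper: return 0 when every bit requested in attr is present
 * in supported, and -EINVAL otherwise.  The test is equivalent to
 * (attr & ~supported) != 0, which is the form the kernel snippet uses.
 */
static int check_memory_attributes(uint64_t supported, uint64_t attr)
{
    if ((attr & supported) != attr) {
        return -EINVAL;
    }
    return 0;
}
```

With this shape, check_memory_attributes(0, ATTR_SHARED) succeeds even when nothing is supported, and a mask that only advertises bits 0-2 still rejects ATTR_PRIVATE, which is the future case raised in the thread.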
Re: [PATCH trivial] colo: examples: remove mentions of script= and (wrong) downscript=
09.01.2024 05:08, Zhang, Chen wrote:
>> -----Original Message-----
>> From: Michael Tokarev
>> Sent: Sunday, January 7, 2024 7:25 PM
>> To: qemu-devel@nongnu.org
>> Cc: Michael Tokarev ; qemu-triv...@nongnu.org; Zhang, Chen ; Li Zhijian
>> Subject: [PATCH trivial] colo: examples: remove mentions of script= and (wrong) downscript=
>>
>> There's no need to repeat script=/etc/qemu-ifup in examples, as it is
>> already in there. Moreover, all examples use the incorrect "down script="
>> (which should be "downscript=").
>
> Yes, good catch.
> Reviewed-by: Zhang Chen
>
>> ---
>> I'm not sure we need so many identical examples, and why it uses vnet=off,
>> - it looks like vnet= should also be dropped.
>
> Do you mean the "vnet_hdr_support" in docs?

Nope, it was a thinko on my part, I mean the vhost=off parameter - which is right next to script=. Why is vhost explicitly disabled here, while it isn't even enabled by default?

And do we really need that many examples like this? Maybe it's a good idea to remove half of them and refer to the other place instead?

/mjt
Re: [PATCH 1/2] target/sh4: Deprecate the shix machine
On Tue, 09 Jan 2024 02:15:21 +0900, Samuel Tardieu wrote:
>
> The shix machine has been designed and used at Télécom Paris from 2003
> to 2010. It had been added to QEMU in 2005 and has not been maintained
> since. Since nobody is using the physical board anymore nor interested
> in maintaining the QEMU port, it is time to deprecate it.
>
> Signed-off-by: Samuel Tardieu
> ---
>  docs/about/deprecated.rst | 5 +
>  hw/sh4/shix.c             | 1 +
>  2 files changed, 6 insertions(+)
>
> diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
> index 2e15040246..e6a12c9077 100644
> --- a/docs/about/deprecated.rst
> +++ b/docs/about/deprecated.rst
> @@ -269,6 +269,11 @@ Nios II ``10m50-ghrd`` and ``nios2-generic-nommu`` machines (since 8.2)
>
>  The Nios II architecture is orphan.
>
> +``shix`` (since 9.0)
> +
> +
> +The machine is no longer in existence and has been long unmaintained
> +in QEMU.
>
>  Backend options
>  ---
> diff --git a/hw/sh4/shix.c b/hw/sh4/shix.c
> index aa812512f0..58530b8ede 100644
> --- a/hw/sh4/shix.c
> +++ b/hw/sh4/shix.c
> @@ -80,6 +80,7 @@ static void shix_machine_init(MachineClass *mc)
>      mc->init = shix_init;
>      mc->is_default = true;
>      mc->default_cpu_type = TYPE_SH7750R_CPU;
> +    mc->deprecation_reason = "old and unmaintained - use a newer machine instead";
>  }
>
>  DEFINE_MACHINE("shix", shix_machine_init)
> --
> 2.42.0
>

I can't maintain this either.

Reviewed-by: Yoshinori Sato

--
Yosinori Sato
Re: [PATCH v3 52/70] i386/tdx: handle TDG.VP.VMCALL
On 1/8/2024 10:44 PM, Daniel P. Berrangé wrote:
On Fri, Dec 29, 2023 at 10:30:15AM +0800, Xiaoyao Li wrote:
On 11/16/2023 1:58 AM, Daniel P. Berrangé wrote:
On Wed, Nov 15, 2023 at 02:15:01AM -0500, Xiaoyao Li wrote:

From: Isaku Yamahata

For GetQuote, delegate a request to Quote Generation Service. Add property "quote-generation-socket" to tdx-guest, which is a property of type SocketAddress to specify Quote Generation Service (QGS).

On request, connect to the QGS, read request buffer from shared guest memory, send the request buffer to the server and store the response into shared guest memory and notify TD guest by interrupt.

command line example:
  qemu-system-x86_64 \
    -object '{"qom-type":"tdx-guest","id":"tdx0","quote-generation-socket":{"type": "vsock", "cid":"2","port":"1234"}}' \
    -machine confidential-guest-support=tdx0

Signed-off-by: Isaku Yamahata
Codeveloped-by: Chenyi Qiang
Signed-off-by: Chenyi Qiang
Signed-off-by: Xiaoyao Li
---
Changes in v3:
- rename property "quote-generation-service" to "quote-generation-socket";
- change the type of "quote-generation-socket" from str to SocketAddress;
- squash next patch into this one;
---
 qapi/qom.json         | 5 +-
 target/i386/kvm/tdx.c | 430 ++
 target/i386/kvm/tdx.h | 6 +
 3 files changed, 440 insertions(+), 1 deletion(-)

+static void tdx_handle_get_quote_connected(QIOTask *task, gpointer opaque)
+{
+    struct tdx_get_quote_task *t = opaque;
+    Error *err = NULL;
+    char *in_data = NULL;
+    MachineState *ms;
+    TdxGuest *tdx;
+
+    t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_ERROR);
+    if (qio_task_propagate_error(task, NULL)) {
+        t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
+        goto error;
+    }
+
+    in_data = g_malloc(le32_to_cpu(t->hdr.in_len));
+    if (!in_data) {
+        goto error;
+    }
+
+    if (address_space_read(&address_space_memory, t->gpa + sizeof(t->hdr),
+                           MEMTXATTRS_UNSPECIFIED, in_data,
+                           le32_to_cpu(t->hdr.in_len)) != MEMTX_OK) {
+        goto error;
+    }
+
+    qio_channel_set_blocking(QIO_CHANNEL(t->ioc), false, NULL);

You've set the channel to non-blocking, but

+
+    if (qio_channel_write_all(QIO_CHANNEL(t->ioc), in_data,
+                              le32_to_cpu(t->hdr.in_len), &err) ||
+        err) {

...this method will block execution of this thread, by either sleeping in poll() or doing a coroutine yield. I don't think this is in coroutine context, so presumably this is just blocking. So what was the point in marking the channel non-blocking?

Hi Daniel,

First of all, I'm not good at sockets or the qio channel code. Please correct me and teach me when I'm wrong. I'm not the author of this patch. My understanding is that setting it to non-blocking is for the qio_channel_write_all() to proceed immediately?

The '_all' suffixed methods are implemented such that they will sleep in poll(), or a coroutine yield when seeing EAGAIN.

If setting non-blocking is not needed, I can remove it.

You are setting up a background watch to wait for the reply so we don't block this thread, so you seem to want non-blocking behaviour.

Both sending and receiving are in a new thread created by qio_channel_socket_connect_async(). So I think both of them can be blocking and don't need to be in another background thread. What's your suggestion on it? Make both sending and receiving blocking or non-blocking?

I think the code /should/ be non-blocking, which would mean using qio_channel_write, instead of qio_channel_write_all, and using a .

I see. Will implement in the next version.

With regards,
Daniel
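As an analogy for the blocking versus non-blocking distinction above, here is a small sketch using plain POSIX file descriptors. It does not use the QIOChannel API, and the names send_state, send_step and set_nonblocking are made up for illustration. The idea is that a non-blocking sender writes what the descriptor accepts, remembers its offset, and returns to the caller on EAGAIN instead of sleeping in poll() the way a "write all" helper would.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

enum send_status { SEND_DONE, SEND_AGAIN, SEND_ERROR };

/* Hypothetical bookkeeping for a partially sent buffer. */
struct send_state {
    const char *buf;   /* data to send */
    size_t len;        /* total number of bytes in buf */
    size_t off;        /* number of bytes already written */
};

/* Put fd into O_NONBLOCK mode; returns 0 on success, -1 on failure. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Write as much pending data as fd accepts right now.
 * SEND_AGAIN means "would block": the caller should wait for the descriptor
 * to become writable (for example from its event loop) and call again.
 */
static enum send_status send_step(int fd, struct send_state *st)
{
    while (st->off < st->len) {
        ssize_t n = write(fd, st->buf + st->off, st->len - st->off);
        if (n > 0) {
            st->off += (size_t)n;
        } else if (n < 0 && errno == EINTR) {
            continue;                      /* interrupted, just retry */
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            return SEND_AGAIN;             /* never sleep here */
        } else {
            return SEND_ERROR;
        }
    }
    return SEND_DONE;
}
```

The caller keeps the send_state around between invocations, so the thread that drives it is free to do other work while the peer is not ready.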
hw: nvme: Separate 'serial' property for VFs
Currently, when a VF is created, it uses the 'params' object of the PF as it is. In other words, the 'params.serial' string memory area is also shared. In this situation, if the VF is removed from the system, the PF's 'params.serial' object is released with object_finalize() followed by object_property_del_all() which release the memory for 'serial' property. If that happens, the next VF created will inherit a serial from a corrupted memory area.

If this happens, an error will occur when comparing subsys->serial and n->params.serial in the nvme_subsys_register_ctrl() function.

Cc: qemu-sta...@nongnu.org
Fixes: 44c2c09488db ("hw/nvme: Add support for SR-IOV")
Signed-off-by: Minwoo Im
---
 hw/nvme/ctrl.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f026245d1e..a0ba3529cd 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -8309,9 +8309,15 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
     if (pci_is_vf(pci_dev)) {
         /*
          * VFs derive settings from the parent. PF's lifespan exceeds
-         * that of VF's, so it's safe to share params.serial.
+         * that of VF's.
          */
         memcpy(&n->params, &pn->params, sizeof(NvmeParams));
+
+        /*
+         * Set PF's serial value to a new string memory to prevent 'serial'
+         * property object release of PF when a VF is removed from the system.
+         */
+        n->params.serial = g_strdup(pn->params.serial);
         n->subsys = pn->subsys;
     }

--
2.34.1
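The ownership problem described in this commit message can be reproduced outside QEMU. The following is a minimal sketch in plain C, with a made-up struct params standing in for NvmeParams and malloc/free standing in for the GLib allocators. A memcpy() of the struct copies only the pointer, so the child must take its own duplicate of the string before either side is released.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parameter block that owns a heap string. */
struct params {
    char *serial;
    unsigned num_queues;
};

/* Duplicate a NUL-terminated string; returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy) {
        memcpy(copy, s, n);
    }
    return copy;
}

/*
 * Copy the parent's settings into the child and give the child its own
 * serial string.  Returns 0 on success, -1 if the duplicate cannot be
 * allocated.
 */
static int params_inherit(struct params *child, const struct params *parent)
{
    memcpy(child, parent, sizeof(*child)); /* copies the pointer, not the text */
    child->serial = NULL;
    if (parent->serial) {
        child->serial = dup_string(parent->serial);
        if (!child->serial) {
            return -1;
        }
    }
    return 0;
}

/* Release the string owned by p; safe to call on either side independently. */
static void params_release(struct params *p)
{
    free(p->serial);
    p->serial = NULL;
}
```

After params_inherit(), releasing the child leaves the parent's serial intact, which is the property the patch restores for the PF when a VF goes away.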
Re: [PATCH v6 1/2] qom: new object to associate device to numa node
>> > However, I'll leave it up to those more familiar with the QEMU numa
>> > control interface design to comment on whether this approach is preferable
>> > to making the gi part of the numa node entry or doing it like hmat.
>>
>> > -numa srat-gi,node-id=10,gi-pci-dev=dev1
>>
>> The current way of acpi-generic-initiator object usage came out of the discussion
>> on v1 to essentially link all the device NUMA nodes to the device.
>> (https://lore.kernel.org/all/20230926131427.1e441670.alex.william...@redhat.com/)
>>
>> Can Alex or David comment on which is preferable (the current mechanism vs 1:1
>> mapping per object as suggested by Jonathan)?
>
> I imagine there are ways that either could work, but specifying a
> gi-pci-dev in the numa node declaration appears to get a bit messy if we
> have multiple gi-pci-dev devices to associate to the node whereas
> creating an acpi-generic-initiator object per individual device:node
> relationship feels a bit easier to iterate.
>
> Also if we do extend the ACPI spec to more explicitly allow a device to
> associate to multiple nodes, we could re-instate the list behavior of
> the acpi-generic-initiator whereas I don't see a representation of the
> association at the numa object that makes sense. Thanks,

Ack, making the change to create an individual acpi-generic-initiator object per device:node.

Alex
Re: [PATCH v6 1/2] qom: new object to associate device to numa node
>> +##
>> +# @AcpiGenericInitiatorProperties:
>> +#
>> +# Properties for acpi-generic-initiator objects.
>> +#
>> +# @pci-dev: PCI device ID to be associated with the node
>> +#
>> +# @host-nodes: numa node list associated with the PCI device.
>
> NUMA
>
> Suggest "list of NUMA nodes associated with ..."

Ack, will make the change.

>> @@ -981,6 +997,7 @@
>>              'id': 'str' },
>>    'discriminator': 'qom-type',
>>    'data': {
>> +      'acpi-generic-initiator': 'AcpiGenericInitiatorProperties',
>>        'authz-list': 'AuthZListProperties',
>>        'authz-listfile': 'AuthZListFileProperties',
>>        'authz-pam': 'AuthZPAMProperties',
>
> I'm holding my Acked-by until the interface design issues raised by
> Jason have been resolved.

I suppose you meant Jonathan here?
Re: [PATCH v3 4/4] [NOT FOR MERGE] tests/qtest/migration: Adapt tests to use older QEMUs
On Mon, Jan 08, 2024 at 12:37:46PM -0300, Fabiano Rosas wrote:
> Peter Xu writes:
>
> > On Fri, Jan 05, 2024 at 03:04:49PM -0300, Fabiano Rosas wrote:
> >> [This patch is not necessary anymore after 8.2 has been released]
> >>
> >> Add the 'since' annotations to recently added tests and adapt the
> >> postcopy test to use the older "uri" API when needed.
> >>
> >> Signed-off-by: Fabiano Rosas
> >
> > You marked this as not-for-merge. Would something like this still be
> > useful in the future? IIUC it's a matter of whether we'd still want to
> > test those old binaries.
> >
>
> Technically yes, but I fail to see what benefit testing old binaries
> would bring us. I'm thinking maybe it could be useful for bisecting
> compatibility issues, but I can't think of a scenario where we'd like to
> change the older QEMU instead of the newer.
>
> I'm of course open to suggestions if you or anyone else has a use case
> that you'd like to keep viable.
>
> So far, my idea is that once a new QEMU is released, all the "since:"
> annotations become obsolete. We could even remove them. This series is
> just infrastructure to make our life easier if a change is ever
> introduced that is incompatible with the n-1 QEMU. IMO we cannot have
> compatibility testing if a random change might break a test and make it
> more difficult to run the remaining tests. So we'd use 'since' or the
> vercmp function to skip/adapt the offending tests until the next QEMU is
> released.
>
> I'm basing myself on this loosely worded support statement from our
> docs:
>
> "In general QEMU tries to maintain forward migration compatibility
> (i.e. migrating from QEMU n->n+1) and there are users who benefit from
> backward compatibility as well."

I think we could still have users migrating from e.g. 8.0 -> 9.0 as long as they use the same machine type, especially when upgrading the upper level stack (e.g. an openstack cluster upgrade), which IIUC can jump a few qemu major versions.
That does sound like a common use case, and I suspect the doc was only taking one example on why compatibility needs to be maintained, rather than emphasizing "+1 only".

However, the question then is whether those old binaries need to be covered. I noticed that carrying all these "since: XXX" and cmdline changes along with migration-test may be yet another burden, even if we want to cover old binaries for whatever reason. I am now more convinced that we should try to get rid of as much burden as we can for migration, because we already have enough, and it's not ideal to keep growing that unnecessarily.

One good thing with CI in this case (I still don't have enough knowledge on CI, so I am hoping some CI people can review that patch, though) is that if we can always guarantee n-1 -> n works for the test cases we enabled, then when n is bumped to n+1 and we keep making sure n -> n+1 works perfectly, n-1 -> n+1 most probably should not fail either, considering that we're testing that the stream protocols match each other. There might be outliers (especially if not described with VMSDs), but those should be corner cases.

So I tend to agree with you that we drop this patch and keep it simple until we're much clearer about what we can get from it.

But then if so - do we need "since" at all to be expressed in versions? Basically we keep qtest always valid only on the latest qemu binary as before (which actually works the same as Linux vs. kselftests, which makes sense); there's one exception now with "n-1" due to the CI we plan to add. Dropping this patch means we don't yet plan to support n-2.

Then maybe instead of a "since" we only need a boolean showing "whether one test needs to be covered by a cross-binary test"? Then we set it in incompatible binaries (skip all cross-binary tests directly, rather than relying on any qemu versions, no compare needed), and can also drop that when a new release starts.

Thanks,

--
Peter Xu
Re: [PATCH v2 4/4] hw/intc/loongarch_extioi: Add vmstate post_load support
在 2023/12/15 下午6:03, Bibo Mao 写道: There are elements sw_ipmap and sw_coremap, which is usd to speed up irq injection flow. They are saved and restored in vmstate during migration, indeed they can calculated from hw registers. Here post_load is added for get sw_ipmap and sw_coremap from extioi hw state. Signed-off-by: Bibo Mao --- hw/intc/loongarch_extioi.c | 120 +++-- 1 file changed, 76 insertions(+), 44 deletions(-) Reviewed-by: Song Gao Thanks. Song Gao diff --git a/hw/intc/loongarch_extioi.c b/hw/intc/loongarch_extioi.c index d9d5066c3f..e0fd57f962 100644 --- a/hw/intc/loongarch_extioi.c +++ b/hw/intc/loongarch_extioi.c @@ -130,12 +130,66 @@ static inline void extioi_enable_irq(LoongArchExtIOI *s, int index,\ } } +static inline void extioi_update_sw_coremap(LoongArchExtIOI *s, int irq, +uint64_t val, bool notify) +{ +int i, cpu; + +/* + * loongarch only support little endian, + * so we paresd the value with little endian. + */ +val = cpu_to_le64(val); + +for (i = 0; i < 4; i++) { +cpu = val & 0xff; +cpu = ctz32(cpu); +cpu = (cpu >= 4) ? 0 : cpu; +val = val >> 8; + +if (s->sw_coremap[irq + i] == cpu) { +continue; +} + +if (notify && test_bit(irq, (unsigned long *)s->isr)) { +/* + * lower irq at old cpu and raise irq at new cpu + */ +extioi_update_irq(s, irq + i, 0); +s->sw_coremap[irq + i] = cpu; +extioi_update_irq(s, irq + i, 1); +} else { +s->sw_coremap[irq + i] = cpu; +} +} +} + +static inline void extioi_update_sw_ipmap(LoongArchExtIOI *s, int index, + uint64_t val) +{ +int i; +uint8_t ipnum; + +/* + * loongarch only support little endian, + * so we paresd the value with little endian. + */ +val = cpu_to_le64(val); +for (i = 0; i < 4; i++) { +ipnum = val & 0xff; +ipnum = ctz32(ipnum); +ipnum = (ipnum >= 4) ? 
0 : ipnum; +s->sw_ipmap[index * 4 + i] = ipnum; +val = val >> 8; +} +} + static MemTxResult extioi_writew(void *opaque, hwaddr addr, uint64_t val, unsigned size, MemTxAttrs attrs) { LoongArchExtIOI *s = LOONGARCH_EXTIOI(opaque); -int i, cpu, index, old_data, irq; +int cpu, index, old_data, irq; uint32_t offset; trace_loongarch_extioi_writew(addr, val); @@ -153,20 +207,7 @@ static MemTxResult extioi_writew(void *opaque, hwaddr addr, */ index = (offset - EXTIOI_IPMAP_START) >> 2; s->ipmap[index] = val; -/* - * loongarch only support little endian, - * so we paresd the value with little endian. - */ -val = cpu_to_le64(val); -for (i = 0; i < 4; i++) { -uint8_t ipnum; -ipnum = val & 0xff; -ipnum = ctz32(ipnum); -ipnum = (ipnum >= 4) ? 0 : ipnum; -s->sw_ipmap[index * 4 + i] = ipnum; -val = val >> 8; -} - +extioi_update_sw_ipmap(s, index, val); break; case EXTIOI_ENABLE_START ... EXTIOI_ENABLE_END - 1: index = (offset - EXTIOI_ENABLE_START) >> 2; @@ -205,33 +246,8 @@ static MemTxResult extioi_writew(void *opaque, hwaddr addr, irq = offset - EXTIOI_COREMAP_START; index = irq / 4; s->coremap[index] = val; -/* - * loongarch only support little endian, - * so we paresd the value with little endian. - */ -val = cpu_to_le64(val); - -for (i = 0; i < 4; i++) { -cpu = val & 0xff; -cpu = ctz32(cpu); -cpu = (cpu >= 4) ? 
0 : cpu; -val = val >> 8; - -if (s->sw_coremap[irq + i] == cpu) { -continue; -} - -if (test_bit(irq, (unsigned long *)s->isr)) { -/* - * lower irq at old cpu and raise irq at new cpu - */ -extioi_update_irq(s, irq + i, 0); -s->sw_coremap[irq + i] = cpu; -extioi_update_irq(s, irq + i, 1); -} else { -s->sw_coremap[irq + i] = cpu; -} -} + +extioi_update_sw_coremap(s, irq, val, true); break; default: break; @@ -288,6 +304,23 @@ static void loongarch_extioi_finalize(Object *obj) g_free(s->cpu); } +static int vmstate_extioi_post_load(void *opaque, int version_id) +{ +LoongArchExtIOI *s = LOONGARCH_EXTIOI(opaque); +int i, start_irq; + +for (i = 0; i < (EXTIOI_IRQS / 4); i++) { +start_irq = i * 4; +extioi_update_sw_coremap(s, start_irq, s->coremap[i], false); +} + +for (i = 0; i < (EXTIOI_IRQS_IPMAP_SIZE / 4); i++) { +
Re: [PATCH v2 3/4] hw/intc/loongarch_extioi: Add dynamic cpu number support
在 2023/12/15 下午6:03, Bibo Mao 写道: On LoongArch physical machine, one extioi interrupt controller only supports 4 cpus. With processor more than 4 cpus, there are multiple extioi interrupt controllers; if interrupts need to be routed to other cpus, they are forwarded from extioi node0 to other extioi nodes. On virt machine model, there is simple extioi interrupt device model. All cpus can access register of extioi interrupt controller, however interrupt can only be route to 4 vcpu for compatible with old kernel. This patch adds dynamic cpu number support about extioi interrupt. With old kernel legacy extioi model is used, however kernel can detect and choose new route method in future, so that interrupt can be routed to all vcpus. Signed-off-by: Bibo Mao --- hw/intc/loongarch_extioi.c | 107 +++-- hw/loongarch/virt.c| 3 +- include/hw/intc/loongarch_extioi.h | 11 ++- 3 files changed, 81 insertions(+), 40 deletions(-) Reviewed-by: Song Gao Thanks. Song Gao diff --git a/hw/intc/loongarch_extioi.c b/hw/intc/loongarch_extioi.c index 77b4776958..d9d5066c3f 100644 --- a/hw/intc/loongarch_extioi.c +++ b/hw/intc/loongarch_extioi.c @@ -8,6 +8,7 @@ #include "qemu/osdep.h" #include "qemu/module.h" #include "qemu/log.h" +#include "qapi/error.h" #include "hw/irq.h" #include "hw/sysbus.h" #include "hw/loongarch/virt.h" @@ -32,23 +33,23 @@ static void extioi_update_irq(LoongArchExtIOI *s, int irq, int level) if (((s->enable[irq_index]) & irq_mask) == 0) { return; } -s->coreisr[cpu][irq_index] |= irq_mask; -found = find_first_bit(s->sw_isr[cpu][ipnum], EXTIOI_IRQS); -set_bit(irq, s->sw_isr[cpu][ipnum]); +s->cpu[cpu].coreisr[irq_index] |= irq_mask; +found = find_first_bit(s->cpu[cpu].sw_isr[ipnum], EXTIOI_IRQS); +set_bit(irq, s->cpu[cpu].sw_isr[ipnum]); if (found < EXTIOI_IRQS) { /* other irq is handling, need not update parent irq level */ return; } } else { -s->coreisr[cpu][irq_index] &= ~irq_mask; -clear_bit(irq, s->sw_isr[cpu][ipnum]); -found = 
find_first_bit(s->sw_isr[cpu][ipnum], EXTIOI_IRQS); +s->cpu[cpu].coreisr[irq_index] &= ~irq_mask; +clear_bit(irq, s->cpu[cpu].sw_isr[ipnum]); +found = find_first_bit(s->cpu[cpu].sw_isr[ipnum], EXTIOI_IRQS); if (found < EXTIOI_IRQS) { /* other irq is handling, need not update parent irq level */ return; } } -qemu_set_irq(s->parent_irq[cpu][ipnum], level); +qemu_set_irq(s->cpu[cpu].parent_irq[ipnum], level); } static void extioi_setirq(void *opaque, int irq, int level) @@ -96,7 +97,7 @@ static MemTxResult extioi_readw(void *opaque, hwaddr addr, uint64_t *data, index = (offset - EXTIOI_COREISR_START) >> 2; /* using attrs to get current cpu index */ cpu = attrs.requester_id; -*data = s->coreisr[cpu][index]; +*data = s->cpu[cpu].coreisr[index]; break; case EXTIOI_COREMAP_START ... EXTIOI_COREMAP_END - 1: index = (offset - EXTIOI_COREMAP_START) >> 2; @@ -189,8 +190,8 @@ static MemTxResult extioi_writew(void *opaque, hwaddr addr, index = (offset - EXTIOI_COREISR_START) >> 2; /* using attrs to get current cpu index */ cpu = attrs.requester_id; -old_data = s->coreisr[cpu][index]; -s->coreisr[cpu][index] = old_data & ~val; +old_data = s->cpu[cpu].coreisr[index]; +s->cpu[cpu].coreisr[index] = old_data & ~val; /* write 1 to clear interrupt */ old_data &= val; irq = ctz32(old_data); @@ -248,14 +249,61 @@ static const MemoryRegionOps extioi_ops = { .endianness = DEVICE_LITTLE_ENDIAN, }; -static const VMStateDescription vmstate_loongarch_extioi = { -.name = TYPE_LOONGARCH_EXTIOI, +static void loongarch_extioi_realize(DeviceState *dev, Error **errp) +{ +LoongArchExtIOI *s = LOONGARCH_EXTIOI(dev); +SysBusDevice *sbd = SYS_BUS_DEVICE(dev); +int i, pin; + +if (s->num_cpu == 0) { +error_setg(errp, "num-cpu must be at least 1"); +return; +} + +for (i = 0; i < EXTIOI_IRQS; i++) { +sysbus_init_irq(sbd, >irq[i]); +} + +qdev_init_gpio_in(dev, extioi_setirq, EXTIOI_IRQS); +memory_region_init_io(>extioi_system_mem, OBJECT(s), _ops, + s, "extioi_system_mem", 0x900); +sysbus_init_mmio(sbd, 
>extioi_system_mem); +s->cpu = g_new0(ExtIOICore, s->num_cpu); +if (s->cpu == NULL) { +error_setg(errp, "Memory allocation for ExtIOICore faile"); +return; +} + +for (i = 0; i < s->num_cpu; i++) { +for (pin = 0; pin < LS3A_INTC_IP; pin++) { +qdev_init_gpio_out(dev, >cpu[i].parent_irq[pin], 1); +} +} +} + +static void loongarch_extioi_finalize(Object *obj) +{ +
Re: [PATCH v3 2/4] tests/qtest/migration: Add infrastructure to skip tests on older QEMUs
On Mon, Jan 08, 2024 at 11:49:45AM -0300, Fabiano Rosas wrote:
> >> +
> >> +    if (major > tgt_major) {
> >> +        return -1;
> >
> > This means the QEMU version is newer, the function will return negative.
> > Is this what we want? It seems it's inverted.
>
> The return "points" to which one is the more recent:
>
>  QEMU version | since: version
>       -1      0      1

Here if it returns -1, then below [1] will skip the test?

>
> > In all cases, documenting this function with its retval would be helpful too.
> >
>
> Ok.
>
> >> +    }
> >> +    if (major < tgt_major) {
> >> +        return 1;
> >> +    }
> >
> > Instead of all these, I'm wondering whether we can allow "since" to be an
> > array of integers, like [8, 2, 0]. Would that be much easier?
>
> I don't see why push the complexity towards the person writing the
> tests. The string is much more natural to specify.

To me QEMU_VER(8,2,0) is as easy to write and read, too. What Dan proposed in the other thread also looks good. I don't really have a strong opinion here, especially for the test case. But imho it'll still be nice to avoid string <-> int conversion if the string is not required.

[...]

> >> @@ -850,6 +856,17 @@ static int test_migrate_start(QTestState **from, QTestState **to,
> >>      qtest_qmp_set_event_callback(*from,
> >>                                   migrate_watch_for_stop,
> >>                                   &got_src_stop);
> >> +
> >> +    if (args->since && migration_vercmp(*from, args->since) < 0) {

[1]

> >> +        g_autofree char *msg = NULL;
> >> +
> >> +        msg = g_strdup_printf("Test requires at least QEMU version %s",
> >> +                              args->since);
> >> +        g_test_skip(msg);
> >> +        qtest_quit(*from);
> >> +
> >> +        return -1;
> >> +    }

--
Peter Xu
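Since the sign convention of the comparison is the point of confusion above, a small sketch may help. This is not the function from the patch: struct version, version_cmp and should_skip are hypothetical names, and the convention chosen here is the strcmp() one (negative when the first argument is older), using integers rather than a version string as suggested in the reply.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical integer triple, in the spirit of a QEMU_VER(8, 2, 0) macro. */
struct version {
    int major;
    int minor;
    int micro;
};

/*
 * Three-way comparison in the strcmp() convention:
 *   < 0   a is older than b
 *   == 0  a and b are the same version
 *   > 0   a is newer than b
 */
static int version_cmp(struct version a, struct version b)
{
    if (a.major != b.major) {
        return a.major < b.major ? -1 : 1;
    }
    if (a.minor != b.minor) {
        return a.minor < b.minor ? -1 : 1;
    }
    if (a.micro != b.micro) {
        return a.micro < b.micro ? -1 : 1;
    }
    return 0;
}

/* A test that needs at least `since` is skipped when the binary is older. */
static bool should_skip(struct version running, struct version since)
{
    return version_cmp(running, since) < 0;
}
```

With the return values documented next to the function, a caller written as "cmp(running, since) < 0 means skip" reads the same way as the skip message "Test requires at least QEMU version ...".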
Re: [PATCH v2 2/4] hw/loongarch/virt: Set iocsr address space per-board rather than percpu
在 2023/12/15 下午6:03, Bibo Mao 写道: LoongArch system has iocsr address space, most iocsr registers are per-board, however some iocsr register spaces banked for percpu such as ipi mailbox and extioi interrupt status. For banked iocsr space, each cpu has the same iocsr space, but separate data. This patch changes iocsr address space per-board rather percpu, for iocsr registers specified for cpu, MemTxAttrs.requester_id can be parsed for the cpu. With this patches, the total address space on board will be simple, only iocsr address space and system memory, rather than the number of cpu and system memory. Signed-off-by: Bibo Mao --- hw/intc/loongarch_extioi.c | 3 - hw/intc/loongarch_ipi.c| 61 +++- hw/loongarch/virt.c| 91 ++ include/hw/intc/loongarch_extioi.h | 1 - include/hw/intc/loongarch_ipi.h| 3 +- include/hw/loongarch/virt.h| 3 + target/loongarch/cpu.c | 48 target/loongarch/cpu.h | 4 +- target/loongarch/iocsr_helper.c| 16 +++--- 9 files changed, 127 insertions(+), 103 deletions(-) Reviewed-by: Song Gao Thanks. 
Song Gao diff --git a/hw/intc/loongarch_extioi.c b/hw/intc/loongarch_extioi.c index 24fb3af8cc..77b4776958 100644 --- a/hw/intc/loongarch_extioi.c +++ b/hw/intc/loongarch_extioi.c @@ -282,9 +282,6 @@ static void loongarch_extioi_instance_init(Object *obj) qdev_init_gpio_in(DEVICE(obj), extioi_setirq, EXTIOI_IRQS); for (cpu = 0; cpu < EXTIOI_CPUS; cpu++) { -memory_region_init_io(>extioi_iocsr_mem[cpu], OBJECT(s), _ops, - s, "extioi_iocsr", 0x900); -sysbus_init_mmio(dev, >extioi_iocsr_mem[cpu]); for (pin = 0; pin < LS3A_INTC_IP; pin++) { qdev_init_gpio_out(DEVICE(obj), >parent_irq[cpu][pin], 1); } diff --git a/hw/intc/loongarch_ipi.c b/hw/intc/loongarch_ipi.c index 1d3449e77d..bca01c88f6 100644 --- a/hw/intc/loongarch_ipi.c +++ b/hw/intc/loongarch_ipi.c @@ -9,6 +9,7 @@ #include "hw/sysbus.h" #include "hw/intc/loongarch_ipi.h" #include "hw/irq.h" +#include "hw/qdev-properties.h" #include "qapi/error.h" #include "qemu/log.h" #include "exec/address-spaces.h" @@ -26,7 +27,7 @@ static MemTxResult loongarch_ipi_readl(void *opaque, hwaddr addr, uint64_t ret = 0; int index = 0; -s = >ipi_core; +s = >cpu[attrs.requester_id]; addr &= 0xff; switch (addr) { case CORE_STATUS_OFF: @@ -65,7 +66,7 @@ static void send_ipi_data(CPULoongArchState *env, uint64_t val, hwaddr addr, * if the mask is 0, we need not to do anything. 
*/ if ((val >> 27) & 0xf) { -data = address_space_ldl(>address_space_iocsr, addr, +data = address_space_ldl(env->address_space_iocsr, addr, attrs, NULL); for (i = 0; i < 4; i++) { /* get mask for byte writing */ @@ -77,7 +78,7 @@ static void send_ipi_data(CPULoongArchState *env, uint64_t val, hwaddr addr, data &= mask; data |= (val >> 32) & ~mask; -address_space_stl(>address_space_iocsr, addr, +address_space_stl(env->address_space_iocsr, addr, data, attrs, NULL); } @@ -172,7 +173,7 @@ static MemTxResult loongarch_ipi_writel(void *opaque, hwaddr addr, uint64_t val, uint8_t vector; CPUState *cs; -s = >ipi_core; +s = >cpu[attrs.requester_id]; addr &= 0xff;loongarch_ipi_finalize trace_loongarch_ipi_write(size, (uint64_t)addr, val); switch (addr) { @@ -214,7 +215,6 @@ static MemTxResult loongarch_ipi_writel(void *opaque, hwaddr addr, uint64_t val, /* override requester_id */ attrs.requester_id = cs->cpu_index; -ipi = LOONGARCH_IPI(LOONGARCH_CPU(cs)->env.ipistate); loongarch_ipi_writel(ipi, CORE_SET_OFF, BIT(vector), 4, attrs); break; default: @@ -265,12 +265,18 @@ static const MemoryRegionOps loongarch_ipi64_ops = { .endianness = DEVICE_LITTLE_ENDIAN, }; -static void loongarch_ipi_init(Object *obj) +static void loongarch_ipi_realize(DeviceState *dev, Error **errp) { -LoongArchIPI *s = LOONGARCH_IPI(obj); -SysBusDevice *sbd = SYS_BUS_DEVICE(obj); +LoongArchIPI *s = LOONGARCH_IPI(dev); +SysBusDevice *sbd = SYS_BUS_DEVICE(dev); +int i; + +if (s->num_cpu == 0) { +error_setg(errp, "num-cpu must be at least 1"); +return; +} -memory_region_init_io(>ipi_iocsr_mem, obj, _ipi_ops, +memory_region_init_io(>ipi_iocsr_mem, OBJECT(dev), _ipi_ops, s, "loongarch_ipi_iocsr", 0x48); /* loongarch_ipi_iocsr performs re-entrant IO through ipi_send */ @@ -278,10 +284,20 @@ static void loongarch_ipi_init(Object *obj) sysbus_init_mmio(sbd, >ipi_iocsr_mem); -memory_region_init_io(>ipi64_iocsr_mem, obj, _ipi64_ops, +
RE: [External] Re: [PATCH 3/5] migration: Introduce unimplemented 'qatzip' compression method
> -Original Message- > From: Fabiano Rosas > Sent: Tuesday, January 9, 2024 4:28 AM > To: Liu, Yuan1 ; Hao Xiang > Cc: Bryan Zhang ; qemu-devel@nongnu.org; > marcandre.lur...@redhat.com; pet...@redhat.com; quint...@redhat.com; > peter.mayd...@linaro.org; berra...@redhat.com > Subject: RE: [External] Re: [PATCH 3/5] migration: Introduce unimplemented > 'qatzip' compression method > > "Liu, Yuan1" writes: > > >> -Original Message- > >> From: Hao Xiang > >> Sent: Saturday, January 6, 2024 7:53 AM > >> To: Fabiano Rosas > >> Cc: Bryan Zhang ; qemu-devel@nongnu.org; > >> marcandre.lur...@redhat.com; pet...@redhat.com; quint...@redhat.com; > >> peter.mayd...@linaro.org; Liu, Yuan1 ; > >> berra...@redhat.com > >> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce > >> unimplemented 'qatzip' compression method > >> > >> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas wrote: > >> > > >> > Bryan Zhang writes: > >> > > >> > +cc Yuan Liu, Daniel Berrangé > >> > > >> > > Adds support for 'qatzip' as an option for the multifd > >> > > compression method parameter, but copy-pastes the no-op logic to > >> > > leave the actual methods effectively unimplemented. This is in > >> > > preparation of a subsequent commit that will implement actually > >> > > using QAT for compression and decompression. 
> >> > > > >> > > Signed-off-by: Bryan Zhang > >> > > Signed-off-by: Hao Xiang > >> > > --- > >> > > hw/core/qdev-properties-system.c | 6 ++- > >> > > migration/meson.build| 1 + > >> > > migration/multifd-qatzip.c | 81 > >> > >> > > migration/multifd.h | 1 + > >> > > qapi/migration.json | 5 +- > >> > > 5 files changed, 92 insertions(+), 2 deletions(-) create mode > >> > > 100644 migration/multifd-qatzip.c > >> > > > >> > > diff --git a/hw/core/qdev-properties-system.c > >> > > b/hw/core/qdev-properties-system.c > >> > > index 1a396521d5..d8e48dcb0e 100644 > >> > > --- a/hw/core/qdev-properties-system.c > >> > > +++ b/hw/core/qdev-properties-system.c > >> > > @@ -658,7 +658,11 @@ const PropertyInfo qdev_prop_fdc_drive_type > >> > > = { const PropertyInfo qdev_prop_multifd_compression = { > >> > > .name = "MultiFDCompression", > >> > > .description = "multifd_compression values, " > >> > > - "none/zlib/zstd", > >> > > + "none/zlib/zstd" > >> > > +#ifdef CONFIG_QATZIP > >> > > + "/qatzip" > >> > > +#endif > >> > > + , > >> > > .enum_table = _lookup, > >> > > .get = qdev_propinfo_get_enum, > >> > > .set = qdev_propinfo_set_enum, diff --git > >> > > a/migration/meson.build b/migration/meson.build index > >> > > 92b1cc4297..e20f318379 100644 > >> > > --- a/migration/meson.build > >> > > +++ b/migration/meson.build > >> > > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed() > >> > >system_ss.add(files('block.c')) endif > >> > > system_ss.add(when: zstd, if_true: files('multifd-zstd.c')) > >> > > +system_ss.add(when: qatzip, if_true: files('multifd-qatzip.c')) > >> > > > >> > > specific_ss.add(when: 'CONFIG_SYSTEM_ONLY', > >> > > if_true: files('ram.c', diff --git > >> > > a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c new file > >> > > mode 100644 index 00..1733bbddb7 > >> > > --- /dev/null > >> > > +++ b/migration/multifd-qatzip.c > >> > > @@ -0,0 +1,81 @@ > >> > > +/* > >> > > + * Multifd QATzip compression implementation > >> > > + * > >> > > 
+ * Copyright (c) Bytedance > >> > > + * > >> > > + * Authors: > >> > > + * Bryan Zhang > >> > > + * Hao Xiang > >> > > + * > >> > > + * This work is licensed under the terms of the GNU GPL, version 2 > or > >> later. > >> > > + * See the COPYING file in the top-level directory. > >> > > + */ > >> > > + > >> > > +#include "qemu/osdep.h" > >> > > +#include "exec/ramblock.h" > >> > > +#include "exec/target_page.h" > >> > > +#include "qapi/error.h" > >> > > +#include "migration.h" > >> > > +#include "options.h" > >> > > +#include "multifd.h" > >> > > + > >> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp) { > >> > > +return 0; > >> > > +} > >> > > + > >> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error > **errp) > >> > > +{}; > >> > > + > >> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error **errp) > >> > > +{ > >> > > +MultiFDPages_t *pages = p->pages; > >> > > + > >> > > +for (int i = 0; i < p->normal_num; i++) { > >> > > +p->iov[p->iovs_num].iov_base = pages->block->host + p- > >> >normal[i]; > >> > > +p->iov[p->iovs_num].iov_len = p->page_size; > >> > > +p->iovs_num++; > >> > > +} > >> > > + > >> > > +p->next_packet_size = p->normal_num * p->page_size; > >> > > +p->flags |= MULTIFD_FLAG_NOCOMP; > >> > > +return 0; > >> > > +} > >> > > + > >> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp) { >
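Editorial aside on the no-op send_prepare path quoted above: it only points one iovec entry at each page and reports the total size. A minimal standalone sketch of that pattern follows. The function name fill_page_iov and its parameters are hypothetical and are not part of QEMU's multifd code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Fill iov[0..num-1] with one entry per page: each entry points at
 * host_base + offsets[i] and is page_size bytes long. No data is copied
 * or compressed. Returns the total payload size in bytes.
 * (Hypothetical helper for illustration only.)
 */
static size_t fill_page_iov(struct iovec *iov, uint8_t *host_base,
                            const size_t *offsets, size_t num,
                            size_t page_size)
{
    assert(iov != NULL && host_base != NULL && offsets != NULL);

    for (size_t i = 0; i < num; i++) {
        iov[i].iov_base = host_base + offsets[i];
        iov[i].iov_len = page_size;
    }
    return num * page_size;
}
```

The returned value plays the role that next_packet_size has in the quoted patch: the number of pages times the page size.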
Re: [PATCH 3/3] tests/qtest: Re-enable multifd cancel test
On Mon, Jan 08, 2024 at 11:26:04AM -0300, Fabiano Rosas wrote: > Peter Xu writes: > > > On Wed, Jun 07, 2023 at 10:27:15AM +0200, Juan Quintela wrote: > >> Fabiano Rosas wrote: > >> > We've found the source of flakiness in this test, so re-enable it. > >> > > >> > Signed-off-by: Fabiano Rosas > >> > --- > >> > tests/qtest/migration-test.c | 10 ++ > >> > 1 file changed, 2 insertions(+), 8 deletions(-) > >> > > >> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c > >> > index b0c355bbd9..800ad23b75 100644 > >> > --- a/tests/qtest/migration-test.c > >> > +++ b/tests/qtest/migration-test.c > >> > @@ -2778,14 +2778,8 @@ int main(int argc, char **argv) > >> > } > >> > qtest_add_func("/migration/multifd/tcp/plain/none", > >> > test_multifd_tcp_none); > >> > -/* > >> > - * This test is flaky and sometimes fails in CI and otherwise: > >> > - * don't run unless user opts in via environment variable. > >> > - */ > >> > -if (getenv("QEMU_TEST_FLAKY_TESTS")) { > >> > -qtest_add_func("/migration/multifd/tcp/plain/cancel", > >> > - test_multifd_tcp_cancel); > >> > -} > >> > +qtest_add_func("/migration/multifd/tcp/plain/cancel", > >> > + test_multifd_tcp_cancel); > >> > qtest_add_func("/migration/multifd/tcp/plain/zlib", > >> > test_multifd_tcp_zlib); > >> > #ifdef CONFIG_ZSTD > >> > >> Reviewed-by: Juan Quintela > >> > >> > >> There was another failure with migration test that I will post during > >> the rest of the day. It needs both to get it right. > > > > This one didn't yet land upstream. I'm not sure, but maybe Juan was saying > > about this change: > > > > commit d2026ee117147893f8d80f060cede6d872ecbd7f > > Author: Juan Quintela > > Date: Wed Apr 26 12:20:36 2023 +0200 > > > > multifd: Fix the number of channels ready > > That's not it. It was something in the test itself around the fact that > we use two sets of: from/to. 
> There was supposed to be a situation where
> we'd start 'to2' while 'to' was still running and that would cause
> issues (possibly with sockets).
>
> I think what might have happened is that someone merged a fix through
> another tree and Juan didn't notice. I think this is the one:
>
> commit f2d063e61ee2026700ab44bef967f663e976bec8
> Author: Xuzhou Cheng
> Date: Fri Oct 28 12:57:32 2022 +0800
>
>     tests/qtest: migration-test: Make sure QEMU process "to" exited after
>     migration is canceled
>
>     Make sure QEMU process "to" exited before launching another target
>     for migration in the test_multifd_tcp_cancel case.
>
>     Signed-off-by: Xuzhou Cheng
>     Signed-off-by: Bin Meng
>     Reviewed-by: Marc-André Lureau
>     Message-Id: <20221028045736.679903-8-bin.m...@windriver.com>
>     Signed-off-by: Thomas Huth

Hmm, I see.

> > Fabiano, did you try to reproduce multifd-cancel with current master? I'm
> > wondering whether this test has already been completely fixed, then maybe
> > we can pick up this patch now.
>
> Yes, let's merge it. I have kept it enabled during testing of all of the
> recent race conditions we've debugged and haven't seen it fail. Current
> master also looks fine.

It needs a trivial touchup, but then I queued it.

Thanks,

--
Peter Xu
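Editorial aside: the fix referenced above comes down to reaping the first target process before starting the second. A rough POSIX sketch of that ordering is below. It does not use the qtest API; spawn_stub, wait_for_exit and run_two_targets_in_sequence are hypothetical names, and the child is a stub that exits immediately rather than a QEMU process.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start a stub child that exits at once with `code`; returns its pid or -1. */
static pid_t spawn_stub(int code)
{
    pid_t pid = fork();
    if (pid == 0) {
        _exit(code);
    }
    return pid;
}

/*
 * Block until `pid` has exited. Returns its exit status, or -1 on
 * error or abnormal termination.
 */
static int wait_for_exit(pid_t pid)
{
    int status;
    pid_t ret;

    assert(pid > 0);
    do {
        ret = waitpid(pid, &status, 0);
    } while (ret < 0 && errno == EINTR);   /* retry if interrupted */

    if (ret != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Launch "to", wait until it is gone, and only then launch "to2". */
static int run_two_targets_in_sequence(void)
{
    pid_t to = spawn_stub(0);
    if (to < 0 || wait_for_exit(to) != 0) {
        return -1;
    }

    pid_t to2 = spawn_stub(0);
    if (to2 < 0) {
        return -1;
    }
    return wait_for_exit(to2);
}
```

The point of the ordering is that the second process never overlaps with the first, so the two cannot contend for the same sockets.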
RE: [PATCH trivial] colo: examples: remove mentions of script= and (wrong) downscript=
> -Original Message- > From: Michael Tokarev > Sent: Sunday, January 7, 2024 7:25 PM > To: qemu-devel@nongnu.org > Cc: Michael Tokarev ; qemu-triv...@nongnu.org; Zhang, > Chen ; Li Zhijian > Subject: [PATCH trivial] colo: examples: remove mentions of script= and > (wrong) downscript= > > There's no need to repeat script=/etc/qemu-ifup in examples, as it is already > in there. More, all examples uses incorrect "down script=" (which should be > "downscript="). Yes, good catch. Reviewed-by: Zhang Chen > --- > I'm not sure we need so many identical examples, and why it uses vnet=off, - > it looks like vnet= should also be dropped. Do you means the "vnet_hdr_support" in docs? If yes, we can't drop it. Because the filters use this tag to communicate with an independent vnet_header. And when a filter with vnet_hdr_support tag(like filter-mirror) connect to another filter without tag(like filter-redirector), They cannot correctly parse the data sent to each other. Thanks Chen > > docs/colo-proxy.txt | 6 +++--- > qemu-options.hx | 8 > 2 files changed, 7 insertions(+), 7 deletions(-) > > diff --git a/docs/colo-proxy.txt b/docs/colo-proxy.txt index > 1fc38aed1b..e712c883db 100644 > --- a/docs/colo-proxy.txt > +++ b/docs/colo-proxy.txt > @@ -162,7 +162,7 @@ Here is an example using demonstration IP and port > addresses to more clearly describe the usage. 
> > Primary(ip:3.3.3.3): > --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu- > ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off > -chardev socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off > @@ -177,7 +177,7 @@ Primary(ip:3.3.3.3): > -object colo-compare,id=comp0,primary_in=compare0- > 0,secondary_in=compare1,outdev=compare_out0,iothread=iothread1 > > Secondary(ip:3.3.3.8): > --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down script=/etc/qemu- > ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev socket,id=red0,host=3.3.3.3,port=9003 > -chardev socket,id=red1,host=3.3.3.3,port=9004 > @@ -202,7 +202,7 @@ Primary(ip:3.3.3.3): > -object colo-compare,id=comp0,primary_in=compare0- > 0,secondary_in=compare1,outdev=compare_out0,vnet_hdr_support > > Secondary(ip:3.3.3.8): > --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down script=/etc/qemu- > ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev socket,id=red0,host=3.3.3.3,port=9003 > -chardev socket,id=red1,host=3.3.3.3,port=9004 > diff --git a/qemu-options.hx b/qemu-options.hx index > b66570ae00..d667bfa0c2 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -5500,7 +5500,7 @@ SRST > KVM COLO > > primary: > --netdev tap,id=hn0,vhost=off,script=/etc/qemu- > ifup,downscript=/etc/qemu-ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev > socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off > -chardev > socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off > @@ -5515,7 +5515,7 @@ SRST > -object colo-compare,id=comp0,primary_in=compare0- > 0,secondary_in=compare1,outdev=compare_out0,iothread=iothread1 > > secondary: > --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down > script=/etc/qemu-ifdown > 
+-netdev tap,id=hn0,vhost=off > -device e1000,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev socket,id=red0,host=3.3.3.3,port=9003 > -chardev socket,id=red1,host=3.3.3.3,port=9004 > @@ -5526,7 +5526,7 @@ SRST > Xen COLO > > primary: > --netdev tap,id=hn0,vhost=off,script=/etc/qemu- > ifup,downscript=/etc/qemu-ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev > socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off > -chardev > socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off > @@ -5542,7 +5542,7 @@ SRST > -object colo-compare,id=comp0,primary_in=compare0- > 0,secondary_in=compare1,outdev=compare_out0,notify_dev=nofity_way,ioth > read=iothread1 > > secondary: > --netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down > script=/etc/qemu-ifdown > +-netdev tap,id=hn0,vhost=off > -device e1000,netdev=hn0,mac=52:a4:00:12:78:66 > -chardev socket,id=red0,host=3.3.3.3,port=9003 > -chardev socket,id=red1,host=3.3.3.3,port=9004 > -- > 2.39.2
Re: [PATCH v2 1/4] hw/intc/loongarch_ipi: Use MemTxAttrs interface for ipi ops
On 2023/12/15 at 6:03 PM, Bibo Mao wrote:
There are two interface pairs for MemoryRegionOps, read/write and
read_with_attrs/write_with_attrs. The latter is better for ipi device
emulation since the initial cpu can be parsed from attrs.requester_id.

And requester_id can be overridden for the IOCSR_IPI_SEND and mail_send
functions when the message is to be forwarded to another vcpu.

Signed-off-by: Bibo Mao
---
 hw/intc/loongarch_ipi.c | 136 +++-
 1 file changed, 77 insertions(+), 59 deletions(-)

Reviewed-by: Song Gao

Thanks.
Song Gao

diff --git a/hw/intc/loongarch_ipi.c b/hw/intc/loongarch_ipi.c
index 67858b521c..1d3449e77d 100644
--- a/hw/intc/loongarch_ipi.c
+++ b/hw/intc/loongarch_ipi.c
@@ -17,14 +17,16 @@
 #include "target/loongarch/internals.h"
 #include "trace.h"
 
-static void loongarch_ipi_writel(void *, hwaddr, uint64_t, unsigned);
-
-static uint64_t loongarch_ipi_readl(void *opaque, hwaddr addr, unsigned size)
+static MemTxResult loongarch_ipi_readl(void *opaque, hwaddr addr,
+                                       uint64_t *data,
+                                       unsigned size, MemTxAttrs attrs)
 {
-    IPICore *s = opaque;
+    IPICore *s;
+    LoongArchIPI *ipi = opaque;
     uint64_t ret = 0;
     int index = 0;
 
+    s = &ipi->ipi_core;
     addr &= 0xff;
     switch (addr) {
     case CORE_STATUS_OFF:
@@ -49,10 +51,12 @@ static uint64_t loongarch_ipi_readl(void *opaque, hwaddr addr, unsigned size)
     }
 
     trace_loongarch_ipi_read(size, (uint64_t)addr, ret);
-    return ret;
+    *data = ret;
+    return MEMTX_OK;
 }
 
-static void send_ipi_data(CPULoongArchState *env, uint64_t val, hwaddr addr)
+static void send_ipi_data(CPULoongArchState *env, uint64_t val, hwaddr addr,
+                          MemTxAttrs attrs)
 {
     int i, mask = 0, data = 0;
 
@@ -62,7 +66,7 @@ static void send_ipi_data(CPULoongArchState *env, uint64_t val, hwaddr addr)
      */
     if ((val >> 27) & 0xf) {
         data = address_space_ldl(&env->address_space_iocsr, addr,
-                                 MEMTXATTRS_UNSPECIFIED, NULL);
+                                 attrs, NULL);
         for (i = 0; i < 4; i++) {
             /* get mask for byte writing */
             if (val & (0x1 << (27 + i))) {
@@ -74,7 +78,7 @@ static void send_ipi_data(CPULoongArchState *env, uint64_t val, hwaddr addr)
     data &= mask;
     data |= (val >> 32) & ~mask;
     address_space_stl(&env->address_space_iocsr, addr,
-                      data, MEMTXATTRS_UNSPECIFIED, NULL);
+                      data, attrs, NULL);
 }
 
 static int archid_cmp(const void *a, const void *b)
@@ -103,80 +107,72 @@ static CPUState *ipi_getcpu(int arch_id)
     CPUArchId *archid;
 
     archid = find_cpu_by_archid(machine, arch_id);
-    return CPU(archid->cpu);
-}
-
-static void ipi_send(uint64_t val)
-{
-    uint32_t cpuid;
-    uint8_t vector;
-    CPUState *cs;
-    LoongArchCPU *cpu;
-    LoongArchIPI *s;
-
-    cpuid = extract32(val, 16, 10);
-    if (cpuid >= LOONGARCH_MAX_CPUS) {
-        trace_loongarch_ipi_unsupported_cpuid("IOCSR_IPI_SEND", cpuid);
-        return;
+    if (archid) {
+        return CPU(archid->cpu);
     }
 
-    /* IPI status vector */
-    vector = extract8(val, 0, 5);
-
-    cs = ipi_getcpu(cpuid);
-    cpu = LOONGARCH_CPU(cs);
-    s = LOONGARCH_IPI(cpu->env.ipistate);
-    loongarch_ipi_writel(&s->ipi_core, CORE_SET_OFF, BIT(vector), 4);
+    return NULL;
 }
 
-static void mail_send(uint64_t val)
+static MemTxResult mail_send(uint64_t val, MemTxAttrs attrs)
 {
     uint32_t cpuid;
     hwaddr addr;
-    CPULoongArchState *env;
     CPUState *cs;
-    LoongArchCPU *cpu;
 
     cpuid = extract32(val, 16, 10);
     if (cpuid >= LOONGARCH_MAX_CPUS) {
         trace_loongarch_ipi_unsupported_cpuid("IOCSR_MAIL_SEND", cpuid);
-        return;
+        return MEMTX_DECODE_ERROR;
     }
 
-    addr = 0x1020 + (val & 0x1c);
     cs = ipi_getcpu(cpuid);
-    cpu = LOONGARCH_CPU(cs);
-    env = &cpu->env;
-    send_ipi_data(env, val, addr);
+    if (cs == NULL) {
+        return MEMTX_DECODE_ERROR;
+    }
+
+    /* override requester_id */
+    addr = SMP_IPI_MAILBOX + CORE_BUF_20 + (val & 0x1c);
+    attrs.requester_id = cs->cpu_index;
+    send_ipi_data(&LOONGARCH_CPU(cs)->env, val, addr, attrs);
+    return MEMTX_OK;
 }
 
-static void any_send(uint64_t val)
+static MemTxResult any_send(uint64_t val, MemTxAttrs attrs)
 {
     uint32_t cpuid;
     hwaddr addr;
-    CPULoongArchState *env;
     CPUState *cs;
-    LoongArchCPU *cpu;
 
     cpuid = extract32(val, 16, 10);
     if (cpuid >= LOONGARCH_MAX_CPUS) {
         trace_loongarch_ipi_unsupported_cpuid("IOCSR_ANY_SEND", cpuid);
-        return;
+        return MEMTX_DECODE_ERROR;
     }
 
-    addr = val & 0x;
     cs = ipi_getcpu(cpuid);
-    cpu = LOONGARCH_CPU(cs);
-    env = &cpu->env;
-    send_ipi_data(env, val, addr);
+    if (cs == NULL) {
+        return MEMTX_DECODE_ERROR;
+    }
+
+
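Editorial aside: the field decoding used by mail_send and send_ipi_data in the diff above can be tried in isolation. The sketch below is standalone and does not include QEMU headers; field32 is a local stand-in for extract32, the helper names are hypothetical, and the line that widens each selected bit into a byte mask is an assumption because the quoted hunk elides it.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for extract32(): `length` bits of `value` from bit `start`. */
static uint32_t field32(uint64_t value, unsigned start, unsigned length)
{
    assert(length > 0 && length < 32 && start + length <= 64);
    return (uint32_t)((value >> start) & ((UINT32_C(1) << length) - 1));
}

/* Target CPU id: bits 16..25 of the written value, as in the patch. */
static uint32_t ipi_target_cpuid(uint64_t val)
{
    return field32(val, 16, 10);
}

/* Mailbox word offset: val & 0x1c, as in mail_send(). */
static uint32_t mail_word_offset(uint64_t val)
{
    return (uint32_t)(val & 0x1c);
}

/* Bits 27..30 each select one byte of the old word to keep. */
static uint32_t keep_mask(uint64_t val)
{
    uint32_t mask = 0;

    for (int i = 0; i < 4; i++) {
        if (val & (UINT64_C(1) << (27 + i))) {
            /* Assumed: this line is not visible in the quoted hunk. */
            mask |= UINT32_C(0xff) << (i * 8);
        }
    }
    return mask;
}

/* Mirrors "data &= mask; data |= (val >> 32) & ~mask;" from the patch. */
static uint32_t merge_mail_word(uint32_t old, uint64_t val)
{
    uint32_t mask = keep_mask(val);

    return (old & mask) | ((uint32_t)(val >> 32) & ~mask);
}
```

With no keep bits set, the upper 32 bits of the written value replace the whole mailbox word. With bit 27 set, the lowest byte of the old word survives.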
Re: [PATCH v3 11/46] hw/loongarch: use pci_init_nic_devices()
On 2024/1/9 at 4:26 AM, David Woodhouse wrote:
From: David Woodhouse

Signed-off-by: David Woodhouse
---
 hw/loongarch/virt.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Reviewed-by: Song Gao

Thanks.
Song Gao

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 4b7dc67a2d..c48804ac38 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -504,9 +504,7 @@ static void loongarch_devices_init(DeviceState *pch_pic, LoongArchMachineState *
     fdt_add_uart_node(lams);
 
     /* Network init */
-    for (i = 0; i < nb_nics; i++) {
-        pci_nic_init_nofail(&nd_table[i], pci_bus, mc->default_nic, NULL);
-    }
+    pci_init_nic_devices(pci_bus, mc->default_nic);
 
     /*
      * There are some invalid guest memory access.
Re: [PATCH v7 00/16] Support smp.clusters for x86 in QEMU
Hi Babu,

On Mon, Jan 08, 2024 at 11:46:50AM -0600, Moger, Babu wrote:
> Date: Mon, 8 Jan 2024 11:46:50 -0600
> From: "Moger, Babu"
> Subject: Re: [PATCH v7 00/16] Support smp.clusters for x86 in QEMU
>
> Hi Zhao,
>
> Ran few basic tests on AMD systems. Changes look good.
>
> Thanks
> Babu
>
>
> Tested-by: Babu Moger
>

Thanks much for your test!

Regards,
Zhao
Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'
On Mon, Jan 08, 2024 at 05:05:38PM -0800, Hao Xiang wrote: > On Mon, Jan 8, 2024 at 2:47 PM Hao Xiang wrote: > > > > On Mon, Jan 8, 2024 at 9:15 AM Gregory Price > > wrote: > > > > > > On Fri, Jan 05, 2024 at 09:59:19PM -0800, Hao Xiang wrote: > > > > On Wed, Jan 3, 2024 at 1:56 PM Gregory Price > > > > wrote: > > > > > > > > > > For a variety of performance reasons, this will not work the way you > > > > > want it to. You are essentially telling QEMU to map the vmem0 into a > > > > > virtual cxl device, and now any memory accesses to that memory region > > > > > will end up going through the cxl-type3 device logic - which is an IO > > > > > path from the perspective of QEMU. > > > > > > > > I didn't understand exactly how the virtual cxl-type3 device works. I > > > > thought it would go with the same "guest virtual address -> guest > > > > physical address -> host physical address" translation totally done by > > > > CPU. But if it is going through an emulation path handled by virtual > > > > cxl-type3, I agree the performance would be bad. Do you know why > > > > accessing memory on a virtual cxl-type3 device can't go with the > > > > nested page table translation? > > > > > > > > > > Because a byte-access on CXL memory can have checks on it that must be > > > emulated by the virtual device, and because there are caching > > > implications that have to be emulated as well. > > > > Interesting. Now that I see the cxl_type3_read/cxl_type3_write. If the > > CXL memory data path goes through them, the performance would be > > pretty problematic. We have actually run Intel's Memory Latency > > Checker benchmark from inside a guest VM with both system-DRAM and > > virtual CXL-type3 configured. The idle latency on the virtual CXL > > memory is 2X of system DRAM, which is on-par with the benchmark > > running from a physical host. I need to debug this more to understand > > why the latency is actually much better than I would expect now. 
> > So we double checked on benchmark testing. What we see is that running > Intel Memory Latency Checker from a guest VM with virtual CXL memory > VS from a physical host with CXL1.1 memory expander has the same > latency. > > From guest VM: local socket system-DRAM latency is 117.0ns, local > socket CXL-DRAM latency is 269.4ns > From physical host: local socket system-DRAM latency is 113.6ns , > local socket CXL-DRAM latency is 267.5ns > > I also set debugger breakpoints on cxl_type3_read/cxl_type3_write > while running the benchmark testing but those two functions are not > ever hit. We used the virtual CXL configuration while launching QEMU > but the CXL memory is present as a separate NUMA node and we are not > creating devdax devices. Does that make any difference? > Could you possibly share your full QEMU configuration and what OS/kernel you are running inside the guest? The only thing I'm surprised by is that the numa node appears without requiring the driver to generate the NUMA node. It's possible I missed a QEMU update that allows this. ~Gregory
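Editorial aside: the four latency figures quoted above can be compared directly. The small sketch below only does the arithmetic on the numbers reported in the thread; the function names are hypothetical.

```c
#include <assert.h>

/* Ratio of CXL latency to system-DRAM latency on the same machine. */
static double latency_ratio(double cxl_ns, double dram_ns)
{
    assert(dram_ns > 0.0);
    return cxl_ns / dram_ns;
}

/* Relative difference between a guest and a host measurement. */
static double relative_gap(double guest_ns, double host_ns)
{
    assert(host_ns > 0.0);
    return (guest_ns - host_ns) / host_ns;
}
```

With the reported values, latency_ratio(269.4, 117.0) is about 2.30 in the guest and latency_ratio(267.5, 113.6) is about 2.35 on the host, while relative_gap(269.4, 267.5) is under 1%. That is consistent with the observation that the guest's CXL latency matches the physical host's, which is not what a fully emulated access path would be expected to show.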
Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'
On Mon, Jan 8, 2024 at 2:47 PM Hao Xiang wrote: > > On Mon, Jan 8, 2024 at 9:15 AM Gregory Price > wrote: > > > > On Fri, Jan 05, 2024 at 09:59:19PM -0800, Hao Xiang wrote: > > > On Wed, Jan 3, 2024 at 1:56 PM Gregory Price > > > wrote: > > > > > > > > For a variety of performance reasons, this will not work the way you > > > > want it to. You are essentially telling QEMU to map the vmem0 into a > > > > virtual cxl device, and now any memory accesses to that memory region > > > > will end up going through the cxl-type3 device logic - which is an IO > > > > path from the perspective of QEMU. > > > > > > I didn't understand exactly how the virtual cxl-type3 device works. I > > > thought it would go with the same "guest virtual address -> guest > > > physical address -> host physical address" translation totally done by > > > CPU. But if it is going through an emulation path handled by virtual > > > cxl-type3, I agree the performance would be bad. Do you know why > > > accessing memory on a virtual cxl-type3 device can't go with the > > > nested page table translation? > > > > > > > Because a byte-access on CXL memory can have checks on it that must be > > emulated by the virtual device, and because there are caching > > implications that have to be emulated as well. > > Interesting. Now that I see the cxl_type3_read/cxl_type3_write. If the > CXL memory data path goes through them, the performance would be > pretty problematic. We have actually run Intel's Memory Latency > Checker benchmark from inside a guest VM with both system-DRAM and > virtual CXL-type3 configured. The idle latency on the virtual CXL > memory is 2X of system DRAM, which is on-par with the benchmark > running from a physical host. I need to debug this more to understand > why the latency is actually much better than I would expect now. So we double checked on benchmark testing. 
What we see is that running
Intel Memory Latency Checker from a guest VM with virtual CXL memory
VS from a physical host with CXL1.1 memory expander has the same
latency.

From guest VM: local socket system-DRAM latency is 117.0ns, local
socket CXL-DRAM latency is 269.4ns
From physical host: local socket system-DRAM latency is 113.6ns,
local socket CXL-DRAM latency is 267.5ns

I also set debugger breakpoints on cxl_type3_read/cxl_type3_write
while running the benchmark testing but those two functions are not
ever hit. We used the virtual CXL configuration while launching QEMU
but the CXL memory is present as a separate NUMA node and we are not
creating devdax devices. Does that make any difference?

>
> >
> > The cxl device you are using is an emulated CXL device - not a
> > virtualization interface. Nuanced difference: the emulated device has
> > to emulate *everything* that CXL device does.
> >
> > What you want is passthrough / managed access to a real device -
> > virtualization. This is not the way to accomplish that. A better way
> > to accomplish that is to simply pass the memory through as a static numa
> > node as I described.
>
> That would work, too. But I think a kernel change is required to
> establish the correct memory tiering if we go this routine.
>
> >
> > >
> > > When we had a discussion with Intel, they told us to not use the KVM
> > > option in QEMU while using virtual cxl type3 device. That's probably
> > > related to the issue you described here? We enabled KVM though but
> > > haven't seen the crash yet.
> > >
> >
> > The crash really only happens, IIRC, if code ends up hosted in that
> > memory. I forget the exact scenario, but the working theory is it has
> > to do with the way instruction caches are managed with KVM and this
> > device.
> > > > > > > > > > You're better off just using the `host-nodes` field of host-memory > > > > and passing bandwidth/latency attributes though via `-numa hmat-lb` > > > > > > We tried this but it doesn't work from end to end right now. I > > > described the issue in another fork of this thread. > > > > > > > > > > > In that scenario, the guest software doesn't even need to know CXL > > > > exists at all, it can just read the attributes of the numa node > > > > that QEMU created for it. > > > > > > We thought about this before. But the current kernel implementation > > > requires a devdax device to be probed and recognized as a slow tier > > > (by reading the memory attributes). I don't think this can be done via > > > the path you described. Have you tried this before? > > > > > > > Right, because the memory tiering component lumps the nodes together. > > > > Better idea: Fix the memory tiering component > > > > I cc'd you on another patch line that is discussing something relevant > > to this. > > > > https://lore.kernel.org/linux-mm/87fs00njft@yhuang6-desk2.ccr.corp.intel.com/T/#m32d58f8cc607aec942995994a41b17ff711519c8 > > > > The point is: There's no need for this to be a dax device at all, there > > is no need for the guest to even know what is providing the memory, or > > for
RE: [PATCH v2] target/riscv: Implement optional CSR mcontext of debug Sdtrig extension
Ping for review, thanks!! > -Original Message- > From: Alvin Che-Chia Chang(張哲嘉) > Sent: Tuesday, December 19, 2023 8:33 PM > To: qemu-ri...@nongnu.org; qemu-devel@nongnu.org > Cc: alistair.fran...@wdc.com; bin.m...@windriver.com; > liwei1...@gmail.com; dbarb...@ventanamicro.com; > zhiwei_...@linux.alibaba.com; Alvin Che-Chia Chang(張哲嘉) > > Subject: [PATCH v2] target/riscv: Implement optional CSR mcontext of debug > Sdtrig extension > > The debug Sdtrig extension defines an CSR "mcontext". This commit > implements its predicate and read/write operations into CSR table. > Its value is reset as 0 when the trigger module is reset. > > Signed-off-by: Alvin Chang > --- > Changes from v1: Remove dedicated cfg, always implement mcontext. > > target/riscv/cpu.h | 1 + > target/riscv/cpu_bits.h | 7 +++ > target/riscv/csr.c | 36 +++- > target/riscv/debug.c| 2 ++ > 4 files changed, 41 insertions(+), 5 deletions(-) > > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index d74b361..e117641 > 100644 > --- a/target/riscv/cpu.h > +++ b/target/riscv/cpu.h > @@ -345,6 +345,7 @@ struct CPUArchState { > target_ulong tdata1[RV_MAX_TRIGGERS]; > target_ulong tdata2[RV_MAX_TRIGGERS]; > target_ulong tdata3[RV_MAX_TRIGGERS]; > +target_ulong mcontext; > struct CPUBreakpoint *cpu_breakpoint[RV_MAX_TRIGGERS]; > struct CPUWatchpoint *cpu_watchpoint[RV_MAX_TRIGGERS]; > QEMUTimer *itrigger_timer[RV_MAX_TRIGGERS]; diff --git > a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h index ebd7917..3296648 > 100644 > --- a/target/riscv/cpu_bits.h > +++ b/target/riscv/cpu_bits.h > @@ -361,6 +361,7 @@ > #define CSR_TDATA2 0x7a2 > #define CSR_TDATA3 0x7a3 > #define CSR_TINFO 0x7a4 > +#define CSR_MCONTEXT0x7a8 > > /* Debug Mode Registers */ > #define CSR_DCSR0x7b0 > @@ -905,4 +906,10 @@ typedef enum RISCVException { > /* JVT CSR bits */ > #define JVT_MODE 0x3F > #define JVT_BASE (~0x3F) > + > +/* Debug Sdtrig CSR masks */ > +#define MCONTEXT32 0x003F > +#define MCONTEXT64 > 0x1FFFULL > +#define 
MCONTEXT32_HCONTEXT0x007F > +#define MCONTEXT64_HCONTEXT > 0x3FFFULL > #endif > diff --git a/target/riscv/csr.c b/target/riscv/csr.c index fde7ce1..ff1e128 > 100644 > --- a/target/riscv/csr.c > +++ b/target/riscv/csr.c > @@ -3900,6 +3900,31 @@ static RISCVException read_tinfo(CPURISCVState > *env, int csrno, > return RISCV_EXCP_NONE; > } > > +static RISCVException read_mcontext(CPURISCVState *env, int csrno, > +target_ulong *val) { > +*val = env->mcontext; > +return RISCV_EXCP_NONE; > +} > + > +static RISCVException write_mcontext(CPURISCVState *env, int csrno, > + target_ulong val) { > +bool rv32 = riscv_cpu_mxl(env) == MXL_RV32 ? true : false; > +int32_t mask; > + > +if (riscv_has_ext(env, RVH)) { > +/* Spec suggest 7-bit for RV32 and 14-bit for RV64 w/ H extension > */ > +mask = rv32 ? MCONTEXT32_HCONTEXT : > MCONTEXT64_HCONTEXT; > +} else { > +/* Spec suggest 6-bit for RV32 and 13-bit for RV64 w/o H extension > */ > +mask = rv32 ? MCONTEXT32 : MCONTEXT64; > +} > + > +env->mcontext = val & mask; > +return RISCV_EXCP_NONE; > +} > + > /* > * Functions to access Pointer Masking feature registers > * We have to check if current priv lvl could modify @@ -4794,11 +4819,12 > @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = { > [CSR_PMPADDR15] = { "pmpaddr15", pmp, read_pmpaddr, > write_pmpaddr }, > > /* Debug CSRs */ > -[CSR_TSELECT] = { "tselect", debug, read_tselect, write_tselect }, > -[CSR_TDATA1]= { "tdata1", debug, read_tdata, > write_tdata }, > -[CSR_TDATA2]= { "tdata2", debug, read_tdata, > write_tdata }, > -[CSR_TDATA3]= { "tdata3", debug, read_tdata, > write_tdata }, > -[CSR_TINFO] = { "tinfo", debug, read_tinfo, > write_ignore }, > +[CSR_TSELECT] = { "tselect", debug, read_tselect, > write_tselect }, > +[CSR_TDATA1]= { "tdata1", debug, read_tdata, > write_tdata}, > +[CSR_TDATA2]= { "tdata2", debug, read_tdata, > write_tdata}, > +[CSR_TDATA3]= { "tdata3", debug, read_tdata, > write_tdata}, > +[CSR_TINFO] = { "tinfo",debug, read_tinfo, > write_ignore }, > 
+[CSR_MCONTEXT] = { "mcontext", debug, read_mcontext, > + write_mcontext }, > > /* User Pointer Masking */ > [CSR_UMTE]={ "umte",pointer_masking, read_umte, > write_umte }, > diff --git a/target/riscv/debug.c b/target/riscv/debug.c index > 4945d1a..e30d99c > 100644 > --- a/target/riscv/debug.c > +++ b/target/riscv/debug.c > @@ -940,4 +940,6 @@ void riscv_trigger_reset_hold(CPURISCVState *env) >
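Editorial aside: the write masking in write_mcontext above follows the bit widths named in the patch's own comments (6 or 13 bits without the H extension, 7 or 14 bits with it). A standalone sketch of that selection is below. It does not use QEMU types; the helper names are hypothetical, and the masks are derived from the bit counts rather than copied from the MCONTEXT* defines, whose values are truncated in the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the lowest n bits set. */
static uint64_t low_bits(unsigned n)
{
    assert(n > 0 && n < 64);
    return (UINT64_C(1) << n) - 1;
}

/* Writable width of mcontext, per the comments in the quoted patch. */
static uint64_t mcontext_mask(bool rv32, bool has_h)
{
    if (has_h) {
        return low_bits(rv32 ? 7 : 14);   /* with H extension */
    }
    return low_bits(rv32 ? 6 : 13);       /* without H extension */
}

/* Value that ends up stored after a CSR write of `val`. */
static uint64_t mcontext_write(uint64_t val, bool rv32, bool has_h)
{
    return val & mcontext_mask(rv32, has_h);
}
```

For example, writing all ones on RV64 with the H extension would leave 0x3FFF in the register under this model.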
Re: [PATCH 0/3] target/riscv: A few bug fixes and Coverity fix
On Mon, Jan 8, 2024 at 10:13 AM Alistair Francis wrote:
>
> A few bug fixes for some Gitlab issues and a Coverity fix
>
> Alistair Francis (3):
>   target/riscv: Assert that the CSR numbers will be correct
>   target/riscv: Don't adjust vscause for exceptions
>   target/riscv: Ensure mideleg is set correctly on reset

Thanks!

Applied to riscv-to-apply.next

Alistair

>
>  target/riscv/cpu.c        | 8
>  target/riscv/cpu_helper.c | 4 ++--
>  target/riscv/csr.c        | 5 -
>  3 files changed, 14 insertions(+), 3 deletions(-)
>
> --
> 2.43.0
>
Re: [PATCH v3 2/5] target/riscv: Add cycle & instret privilege mode filtering properties
On Mon, Jan 8, 2024 at 10:10 AM Daniel Henrique Barboza wrote: > > > > On 1/5/24 19:13, Atish Patra wrote: > > From: Kaiwen Xue > > > > This adds the properties for ISA extension smcntrpmf. Patches > > implementing it will follow. > > > > Signed-off-by: Atish Patra > > Signed-off-by: Kaiwen Xue > > --- > > target/riscv/cpu.c | 2 ++ > > target/riscv/cpu_cfg.h | 1 + > > 2 files changed, 3 insertions(+) > > > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > > index 83c7c0cf07be..ea34ff2ae983 100644 > > --- a/target/riscv/cpu.c > > +++ b/target/riscv/cpu.c > > @@ -148,6 +148,7 @@ const RISCVIsaExtData isa_edata_arr[] = { > > ISA_EXT_DATA_ENTRY(smstateen, PRIV_VERSION_1_12_0, ext_smstateen), > > ISA_EXT_DATA_ENTRY(ssaia, PRIV_VERSION_1_12_0, ext_ssaia), > > ISA_EXT_DATA_ENTRY(sscofpmf, PRIV_VERSION_1_12_0, ext_sscofpmf), > > +ISA_EXT_DATA_ENTRY(smcntrpmf, PRIV_VERSION_1_12_0, ext_smcntrpmf), > > ISA_EXT_DATA_ENTRY(sstc, PRIV_VERSION_1_12_0, ext_sstc), > > ISA_EXT_DATA_ENTRY(svadu, PRIV_VERSION_1_12_0, ext_svadu), > > ISA_EXT_DATA_ENTRY(svinval, PRIV_VERSION_1_12_0, ext_svinval), > > Sorry for not noticing this in the previous version. I believe we want the > "smcntrpmf" > entry to be right after "smaia" because the isa_edata_arr[] ordering matters > when > building the riscv,isa string in riscv_isa_string_ext(). > Oops. Thanks for catching that. Fixed in v4. 
> > Thanks, > > Daniel > > > @@ -1296,6 +1297,7 @@ const char *riscv_get_misa_ext_description(uint32_t > > bit) > > const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = { > > /* Defaults for standard extensions */ > > MULTI_EXT_CFG_BOOL("sscofpmf", ext_sscofpmf, false), > > +MULTI_EXT_CFG_BOOL("smcntrpmf", ext_smcntrpmf, false), > > MULTI_EXT_CFG_BOOL("zifencei", ext_zifencei, true), > > MULTI_EXT_CFG_BOOL("zicsr", ext_zicsr, true), > > MULTI_EXT_CFG_BOOL("zihintntl", ext_zihintntl, true), > > diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h > > index f4605fb190b9..00c34fdd3209 100644 > > --- a/target/riscv/cpu_cfg.h > > +++ b/target/riscv/cpu_cfg.h > > @@ -72,6 +72,7 @@ struct RISCVCPUConfig { > > bool ext_zihpm; > > bool ext_smstateen; > > bool ext_sstc; > > +bool ext_smcntrpmf; > > bool ext_svadu; > > bool ext_svinval; > > bool ext_svnapot;
[PATCH v4 4/5] target/riscv: Add cycle & instret privilege mode filtering support
From: Kaiwen Xue QEMU only calculates dummy cycles and instructions, so there is no actual means to stop the icount in QEMU. Hence this patch merely adds the functionality of accessing the cfg registers, and has no actual effect on the counting of the cycle and instret counters. Signed-off-by: Atish Patra Reviewed-by: Daniel Henrique Barboza Signed-off-by: Kaiwen Xue --- target/riscv/csr.c | 80 ++ 1 file changed, 80 insertions(+) diff --git a/target/riscv/csr.c b/target/riscv/csr.c index 283468bbc652..3bd4aa22374f 100644 --- a/target/riscv/csr.c +++ b/target/riscv/csr.c @@ -233,6 +233,24 @@ static RISCVException sscofpmf_32(CPURISCVState *env, int csrno) return sscofpmf(env, csrno); } +static RISCVException smcntrpmf(CPURISCVState *env, int csrno) +{ +if (!riscv_cpu_cfg(env)->ext_smcntrpmf) { +return RISCV_EXCP_ILLEGAL_INST; +} + +return RISCV_EXCP_NONE; +} + +static RISCVException smcntrpmf_32(CPURISCVState *env, int csrno) +{ +if (riscv_cpu_mxl(env) != MXL_RV32) { +return RISCV_EXCP_ILLEGAL_INST; +} + +return smcntrpmf(env, csrno); +} + static RISCVException any(CPURISCVState *env, int csrno) { return RISCV_EXCP_NONE; @@ -818,6 +836,54 @@ static int read_hpmcounterh(CPURISCVState *env, int csrno, target_ulong *val) #else /* CONFIG_USER_ONLY */ +static int read_mcyclecfg(CPURISCVState *env, int csrno, target_ulong *val) +{ +*val = env->mcyclecfg; +return RISCV_EXCP_NONE; +} + +static int write_mcyclecfg(CPURISCVState *env, int csrno, target_ulong val) +{ +env->mcyclecfg = val; +return RISCV_EXCP_NONE; +} + +static int read_mcyclecfgh(CPURISCVState *env, int csrno, target_ulong *val) +{ +*val = env->mcyclecfgh; +return RISCV_EXCP_NONE; +} + +static int write_mcyclecfgh(CPURISCVState *env, int csrno, target_ulong val) +{ +env->mcyclecfgh = val; +return RISCV_EXCP_NONE; +} + +static int read_minstretcfg(CPURISCVState *env, int csrno, target_ulong *val) +{ +*val = env->minstretcfg; +return RISCV_EXCP_NONE; +} + +static int write_minstretcfg(CPURISCVState *env, int 
csrno, target_ulong val) +{ +env->minstretcfg = val; +return RISCV_EXCP_NONE; +} + +static int read_minstretcfgh(CPURISCVState *env, int csrno, target_ulong *val) +{ +*val = env->minstretcfgh; +return RISCV_EXCP_NONE; +} + +static int write_minstretcfgh(CPURISCVState *env, int csrno, target_ulong val) +{ +env->minstretcfgh = val; +return RISCV_EXCP_NONE; +} + static int read_mhpmevent(CPURISCVState *env, int csrno, target_ulong *val) { int evt_index = csrno - CSR_MCOUNTINHIBIT; @@ -4922,6 +4988,13 @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = { write_mcountinhibit, .min_priv_ver = PRIV_VERSION_1_11_0 }, +[CSR_MCYCLECFG] = { "mcyclecfg", smcntrpmf, read_mcyclecfg, + write_mcyclecfg, + .min_priv_ver = PRIV_VERSION_1_12_0 }, +[CSR_MINSTRETCFG]= { "minstretcfg", smcntrpmf, read_minstretcfg, + write_minstretcfg, + .min_priv_ver = PRIV_VERSION_1_12_0 }, + [CSR_MHPMEVENT3] = { "mhpmevent3", any,read_mhpmevent, write_mhpmevent }, [CSR_MHPMEVENT4] = { "mhpmevent4", any,read_mhpmevent, @@ -4981,6 +5054,13 @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = { [CSR_MHPMEVENT31]= { "mhpmevent31",any,read_mhpmevent, write_mhpmevent }, +[CSR_MCYCLECFGH] = { "mcyclecfgh", smcntrpmf_32, read_mcyclecfgh, + write_mcyclecfgh, + .min_priv_ver = PRIV_VERSION_1_12_0}, +[CSR_MINSTRETCFGH] = { "minstretcfgh", smcntrpmf_32, read_minstretcfgh, + write_minstretcfgh, + .min_priv_ver = PRIV_VERSION_1_12_0}, + [CSR_MHPMEVENT3H]= { "mhpmevent3h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -- 2.34.1
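For readers following the predicate pattern used above, here is a minimal standalone sketch. It is not QEMU code: ToyEnv, ToyException and the toy_* functions are hypothetical stand-ins for CPURISCVState, RISCVException and the smcntrpmf/smcntrpmf_32 predicates. It also assumes, for brevity, that the write helper calls the predicate itself, whereas in the patch the predicate is registered in csr_ops[] next to the read/write callbacks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's RISCVException and CPURISCVState. */
typedef enum { TOY_EXCP_NONE, TOY_EXCP_ILLEGAL_INST } ToyException;

typedef struct {
    bool ext_smcntrpmf;  /* models riscv_cpu_cfg(env)->ext_smcntrpmf */
    bool is_rv32;        /* models riscv_cpu_mxl(env) == MXL_RV32 */
    uint32_t mcyclecfgh; /* upper half of the cycle filter config */
} ToyEnv;

/* Base predicate: the CSR is accessible only with the extension enabled. */
static ToyException toy_smcntrpmf(const ToyEnv *env)
{
    return env->ext_smcntrpmf ? TOY_EXCP_NONE : TOY_EXCP_ILLEGAL_INST;
}

/* Predicate for the "h" CSRs: check RV32 first, then the extension. */
static ToyException toy_smcntrpmf_32(const ToyEnv *env)
{
    if (!env->is_rv32) {
        return TOY_EXCP_ILLEGAL_INST;
    }
    return toy_smcntrpmf(env);
}

/* Assumption of this sketch: the write path runs the predicate itself. */
static ToyException toy_write_mcyclecfgh(ToyEnv *env, uint32_t val)
{
    ToyException ret = toy_smcntrpmf_32(env);

    if (ret != TOY_EXCP_NONE) {
        return ret; /* state is left untouched on a rejected access */
    }
    env->mcyclecfgh = val;
    return TOY_EXCP_NONE;
}
```

With this model, a write to the "h" register on a non-RV32 hart is rejected even when the extension is enabled, which is the same ordering the sscofpmf_32 fix in patch 1/5 establishes.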
[PATCH v4 5/5] target/riscv: Implement privilege mode filtering for cycle/instret
Privilege mode filtering can also be emulated for cycle/instret by tracking host_ticks/icount during each privilege mode switch. This patch implements that for both cycle/instret and mhpmcounters. The first one requires Smcntrpmf while the other one requires Sscofpmf to be enabled. The cycle/instret are still computed using host ticks when icount is not enabled. Otherwise, they are computed using raw icount which is more accurate in icount mode. Reviewed-by: Daniel Henrique Barboza Signed-off-by: Atish Patra --- target/riscv/cpu.h| 11 + target/riscv/cpu_helper.c | 9 +++- target/riscv/csr.c| 95 ++- target/riscv/pmu.c| 43 ++ target/riscv/pmu.h| 2 + 5 files changed, 136 insertions(+), 24 deletions(-) diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index 34617c4c4bab..40d10726155b 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -136,6 +136,15 @@ typedef struct PMUCTRState { target_ulong irq_overflow_left; } PMUCTRState; +typedef struct PMUFixedCtrState { +/* Track cycle and icount for each privilege mode */ +uint64_t counter[4]; +uint64_t counter_prev[4]; +/* Track cycle and icount for each privilege mode when V = 1*/ +uint64_t counter_virt[2]; +uint64_t counter_virt_prev[2]; +} PMUFixedCtrState; + struct CPUArchState { target_ulong gpr[32]; target_ulong gprh[32]; /* 64 top bits of the 128-bit registers */ @@ -334,6 +343,8 @@ struct CPUArchState { /* PMU event selector configured values for RV32 */ target_ulong mhpmeventh_val[RV_MAX_MHPMEVENTS]; +PMUFixedCtrState pmu_fixed_ctrs[2]; + target_ulong sscratch; target_ulong mscratch; diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c index e7e23b34f455..3dddb1b433e8 100644 --- a/target/riscv/cpu_helper.c +++ b/target/riscv/cpu_helper.c @@ -715,8 +715,13 @@ void riscv_cpu_set_mode(CPURISCVState *env, target_ulong newpriv) { g_assert(newpriv <= PRV_M && newpriv != PRV_RESERVED); -if (icount_enabled() && newpriv != env->priv) { -riscv_itrigger_update_priv(env); +if (newpriv != env->priv) { +if 
(icount_enabled()) { +riscv_itrigger_update_priv(env); +riscv_pmu_icount_update_priv(env, newpriv); +} else { +riscv_pmu_cycle_update_priv(env, newpriv); +} } /* tlb_flush is unnecessary as mode is contained in mmu_idx */ env->priv = newpriv; diff --git a/target/riscv/csr.c b/target/riscv/csr.c index 3bd4aa22374f..307d052021c5 100644 --- a/target/riscv/csr.c +++ b/target/riscv/csr.c @@ -782,32 +782,16 @@ static int write_vcsr(CPURISCVState *env, int csrno, target_ulong val) return RISCV_EXCP_NONE; } +#if defined(CONFIG_USER_ONLY) /* User Timers and Counters */ static target_ulong get_ticks(bool shift) { -int64_t val; -target_ulong result; - -#if !defined(CONFIG_USER_ONLY) -if (icount_enabled()) { -val = icount_get(); -} else { -val = cpu_get_host_ticks(); -} -#else -val = cpu_get_host_ticks(); -#endif - -if (shift) { -result = val >> 32; -} else { -result = val; -} +int64_t val = cpu_get_host_ticks(); +target_ulong result = shift ? val >> 32 : val; return result; } -#if defined(CONFIG_USER_ONLY) static RISCVException read_time(CPURISCVState *env, int csrno, target_ulong *val) { @@ -932,6 +916,70 @@ static int write_mhpmeventh(CPURISCVState *env, int csrno, target_ulong val) return RISCV_EXCP_NONE; } +static target_ulong riscv_pmu_ctr_get_fixed_counters_val(CPURISCVState *env, + int counter_idx, + bool upper_half) +{ +uint64_t curr_val = 0; +target_ulong result = 0; +uint64_t *counter_arr = icount_enabled() ? env->pmu_fixed_ctrs[1].counter : +env->pmu_fixed_ctrs[0].counter; +uint64_t *counter_arr_virt = icount_enabled() ? + env->pmu_fixed_ctrs[1].counter_virt : + env->pmu_fixed_ctrs[0].counter_virt; +uint64_t cfg_val = 0; + +if (counter_idx == 0) { +cfg_val = upper_half ? ((uint64_t)env->mcyclecfgh << 32) : + env->mcyclecfg; +} else if (counter_idx == 2) { +cfg_val = upper_half ? ((uint64_t)env->minstretcfgh << 32) : + env->minstretcfg; +} else { +cfg_val = upper_half ? 
+ ((uint64_t)env->mhpmeventh_val[counter_idx] << 32) : + env->mhpmevent_val[counter_idx]; +} + +if (!cfg_val) { +if (icount_enabled()) { +curr_val = icount_get_raw(); +} else { +curr_val = cpu_get_host_ticks(); +} +goto done; +} + +if (!(cfg_val &
[PATCH v4 2/5] target/riscv: Add cycle & instret privilege mode filtering properties
From: Kaiwen Xue This adds the properties for ISA extension smcntrpmf. Patches implementing it will follow. Signed-off-by: Atish Patra Signed-off-by: Kaiwen Xue --- target/riscv/cpu.c | 2 ++ target/riscv/cpu_cfg.h | 1 + 2 files changed, 3 insertions(+) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 83c7c0cf07be..501ae560ec29 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -144,6 +144,7 @@ const RISCVIsaExtData isa_edata_arr[] = { ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx), ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin), ISA_EXT_DATA_ENTRY(smaia, PRIV_VERSION_1_12_0, ext_smaia), +ISA_EXT_DATA_ENTRY(smcntrpmf, PRIV_VERSION_1_12_0, ext_smcntrpmf), ISA_EXT_DATA_ENTRY(smepmp, PRIV_VERSION_1_12_0, ext_smepmp), ISA_EXT_DATA_ENTRY(smstateen, PRIV_VERSION_1_12_0, ext_smstateen), ISA_EXT_DATA_ENTRY(ssaia, PRIV_VERSION_1_12_0, ext_ssaia), @@ -1296,6 +1297,7 @@ const char *riscv_get_misa_ext_description(uint32_t bit) const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = { /* Defaults for standard extensions */ MULTI_EXT_CFG_BOOL("sscofpmf", ext_sscofpmf, false), +MULTI_EXT_CFG_BOOL("smcntrpmf", ext_smcntrpmf, false), MULTI_EXT_CFG_BOOL("zifencei", ext_zifencei, true), MULTI_EXT_CFG_BOOL("zicsr", ext_zicsr, true), MULTI_EXT_CFG_BOOL("zihintntl", ext_zihintntl, true), diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h index f4605fb190b9..00c34fdd3209 100644 --- a/target/riscv/cpu_cfg.h +++ b/target/riscv/cpu_cfg.h @@ -72,6 +72,7 @@ struct RISCVCPUConfig { bool ext_zihpm; bool ext_smstateen; bool ext_sstc; +bool ext_smcntrpmf; bool ext_svadu; bool ext_svinval; bool ext_svnapot; -- 2.34.1
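The v3 review pointed out that the isa_edata_arr[] ordering matters when the riscv,isa string is built in riscv_isa_string_ext(). The sketch below is a simplified model of that idea, not the QEMU function: ToyIsaExt and toy_isa_string_ext() are hypothetical names, and the "_name" suffix format is an assumption of the sketch. It only shows that the output order follows the table order, which is why the new entry has to sit right after "smaia".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an isa_edata_arr[] entry. */
typedef struct {
    const char *name;
    bool enabled;
} ToyIsaExt;

/*
 * Append "_<name>" to buf for every enabled entry, in table order.
 * buf must already hold a NUL-terminated base string such as "rv64imac".
 * Returns the resulting string length, or -1 if buf is too small.
 */
static int toy_isa_string_ext(const ToyIsaExt *tbl, size_t n,
                              char *buf, size_t buflen)
{
    size_t used = strlen(buf);

    if (used >= buflen) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        int w;

        if (!tbl[i].enabled) {
            continue;
        }
        w = snprintf(buf + used, buflen - used, "_%s", tbl[i].name);
        if (w < 0 || (size_t)w >= buflen - used) {
            return -1; /* output would have been truncated */
        }
        used += (size_t)w;
    }
    return (int)used;
}
```

Since the function never sorts, a misplaced table entry shows up directly as a misplaced name in the generated string.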
[PATCH v4 3/5] target/riscv: Add cycle & instret privilege mode filtering definitions
From: Kaiwen Xue This adds the definitions for ISA extension smcntrpmf. Signed-off-by: Kaiwen Xue Reviewed-by: Daniel Henrique Barboza Signed-off-by: Atish Patra --- target/riscv/cpu.h | 6 ++ target/riscv/cpu_bits.h | 29 + 2 files changed, 35 insertions(+) diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index d74b361be641..34617c4c4bab 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -319,6 +319,12 @@ struct CPUArchState { target_ulong mcountinhibit; +/* PMU cycle & instret privilege mode filtering */ +target_ulong mcyclecfg; +target_ulong mcyclecfgh; +target_ulong minstretcfg; +target_ulong minstretcfgh; + /* PMU counter state */ PMUCTRState pmu_ctrs[RV_MAX_MHPMCOUNTERS]; diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h index ebd7917d490a..0ee91e502e8f 100644 --- a/target/riscv/cpu_bits.h +++ b/target/riscv/cpu_bits.h @@ -401,6 +401,10 @@ /* Machine counter-inhibit register */ #define CSR_MCOUNTINHIBIT 0x320 +/* Machine counter configuration registers */ +#define CSR_MCYCLECFG 0x321 +#define CSR_MINSTRETCFG 0x322 + #define CSR_MHPMEVENT3 0x323 #define CSR_MHPMEVENT4 0x324 #define CSR_MHPMEVENT5 0x325 @@ -431,6 +435,9 @@ #define CSR_MHPMEVENT30 0x33e #define CSR_MHPMEVENT31 0x33f +#define CSR_MCYCLECFGH 0x721 +#define CSR_MINSTRETCFGH0x722 + #define CSR_MHPMEVENT3H 0x723 #define CSR_MHPMEVENT4H 0x724 #define CSR_MHPMEVENT5H 0x725 @@ -885,6 +892,28 @@ typedef enum RISCVException { /* PMU related bits */ #define MIE_LCOFIE (1 << IRQ_PMU_OVF) +#define MCYCLECFG_BIT_MINH BIT_ULL(62) +#define MCYCLECFGH_BIT_MINHBIT(30) +#define MCYCLECFG_BIT_SINH BIT_ULL(61) +#define MCYCLECFGH_BIT_SINHBIT(29) +#define MCYCLECFG_BIT_UINH BIT_ULL(60) +#define MCYCLECFGH_BIT_UINHBIT(28) +#define MCYCLECFG_BIT_VSINHBIT_ULL(59) +#define MCYCLECFGH_BIT_VSINH BIT(27) +#define MCYCLECFG_BIT_VUINHBIT_ULL(58) +#define MCYCLECFGH_BIT_VUINH BIT(26) + +#define MINSTRETCFG_BIT_MINH BIT_ULL(62) +#define MINSTRETCFGH_BIT_MINH BIT(30) +#define MINSTRETCFG_BIT_SINH 
BIT_ULL(61) +#define MINSTRETCFGH_BIT_SINH BIT(29) +#define MINSTRETCFG_BIT_UINH BIT_ULL(60) +#define MINSTRETCFGH_BIT_UINH BIT(28) +#define MINSTRETCFG_BIT_VSINH BIT_ULL(59) +#define MINSTRETCFGH_BIT_VSINH BIT(27) +#define MINSTRETCFG_BIT_VUINH BIT_ULL(58) +#define MINSTRETCFGH_BIT_VUINH BIT(26) + #define MHPMEVENT_BIT_OF BIT_ULL(63) #define MHPMEVENTH_BIT_OF BIT(31) #define MHPMEVENT_BIT_MINH BIT_ULL(62) -- 2.34.1
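The definitions above come in pairs: a 64-bit position (BIT_ULL(62) down to BIT_ULL(58)) and the matching position in the RV32 "h" register (BIT(30) down to BIT(26)), always 32 apart. A small sketch of that relationship follows. The TOY_* macros and toy_* helpers are hypothetical names for illustration; only the bit positions are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_BIT(n)     (UINT32_C(1) << (n))
#define TOY_BIT_ULL(n) (UINT64_C(1) << (n))

/* Same bit positions as the MCYCLECFG and MCYCLECFGH definitions. */
#define TOY_CFG_BIT_MINH   TOY_BIT_ULL(62)
#define TOY_CFGH_BIT_MINH  TOY_BIT(30)
#define TOY_CFG_BIT_SINH   TOY_BIT_ULL(61)
#define TOY_CFGH_BIT_SINH  TOY_BIT(29)
#define TOY_CFG_BIT_UINH   TOY_BIT_ULL(60)
#define TOY_CFGH_BIT_UINH  TOY_BIT(28)
#define TOY_CFG_BIT_VSINH  TOY_BIT_ULL(59)
#define TOY_CFGH_BIT_VSINH TOY_BIT(27)
#define TOY_CFG_BIT_VUINH  TOY_BIT_ULL(58)
#define TOY_CFGH_BIT_VUINH TOY_BIT(26)

/* Build the 64-bit view from the two 32-bit halves seen on RV32. */
static uint64_t toy_cfg_combine(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* True if the given inhibit bit is set in the 64-bit view. */
static bool toy_cfg_mode_inhibited(uint64_t cfg, uint64_t inh_bit)
{
    return (cfg & inh_bit) != 0;
}
```

Setting TOY_CFGH_BIT_MINH in the upper half therefore yields the same 64-bit value as setting TOY_CFG_BIT_MINH directly on RV64.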
[PATCH v4 1/5] target/riscv: Fix the predicate functions for mhpmeventhX CSRs
mhpmeventhX CSRs are available for RV32. The predicate function should check that first before checking sscofpmf extension. Fixes: 14664483457b ("target/riscv: Add sscofpmf extension support") Reviewed-by: Daniel Henrique Barboza Reviewed-by: Alistair Francis Signed-off-by: Atish Patra --- target/riscv/csr.c | 67 ++ 1 file changed, 38 insertions(+), 29 deletions(-) diff --git a/target/riscv/csr.c b/target/riscv/csr.c index fde7ce1a5336..283468bbc652 100644 --- a/target/riscv/csr.c +++ b/target/riscv/csr.c @@ -224,6 +224,15 @@ static RISCVException sscofpmf(CPURISCVState *env, int csrno) return RISCV_EXCP_NONE; } +static RISCVException sscofpmf_32(CPURISCVState *env, int csrno) +{ +if (riscv_cpu_mxl(env) != MXL_RV32) { +return RISCV_EXCP_ILLEGAL_INST; +} + +return sscofpmf(env, csrno); +} + static RISCVException any(CPURISCVState *env, int csrno) { return RISCV_EXCP_NONE; @@ -4972,91 +4981,91 @@ riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = { [CSR_MHPMEVENT31]= { "mhpmevent31",any,read_mhpmevent, write_mhpmevent }, -[CSR_MHPMEVENT3H]= { "mhpmevent3h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT3H]= { "mhpmevent3h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT4H]= { "mhpmevent4h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT4H]= { "mhpmevent4h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT5H]= { "mhpmevent5h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT5H]= { "mhpmevent5h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT6H]= { "mhpmevent6h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT6H]= { "mhpmevent6h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT7H]= { "mhpmevent7h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT7H]= { "mhpmevent7h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT8H]= { 
"mhpmevent8h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT8H]= { "mhpmevent8h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT9H]= { "mhpmevent9h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT9H]= { "mhpmevent9h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT10H] = { "mhpmevent10h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT10H] = { "mhpmevent10h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT11H] = { "mhpmevent11h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT11H] = { "mhpmevent11h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT12H] = { "mhpmevent12h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT12H] = { "mhpmevent12h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT13H] = { "mhpmevent13h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT13H] = { "mhpmevent13h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT14H] = { "mhpmevent14h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT14H] = { "mhpmevent14h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT15H] = { "mhpmevent15h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT15H] = { "mhpmevent15h",sscofpmf_32, read_mhpmeventh, write_mhpmeventh, .min_priv_ver = PRIV_VERSION_1_12_0}, -[CSR_MHPMEVENT16H] = { "mhpmevent16h",sscofpmf, read_mhpmeventh, +[CSR_MHPMEVENT16H] = { "mhpmevent16h",sscofpmf_32, read_mhpmeventh,
[PATCH v4 0/5] Add ISA extension smcntrpmf support
This patch series adds support for the RISC-V ISA extension smcntrpmf (cycle
and instret privilege mode filtering) [1]. It is based on Kevin's earlier work
but improves on it by actually implementing privilege mode filtering, which is
done by tracking the privilege mode switches. This enables privilege mode
filtering for mhpmcounters as well. However, Smcntrpmf/Sscofpmf must be
enabled to leverage this.

This series also changes the reported value to the raw instruction count,
instead of the virtual cpu time based on the instruction count, when icount is
enabled. The former seems to be the preferred approach for instruction count
for other architectures as well. Please let me know if anybody thinks that's
incorrect.

The series is also available at [2].

Changes from v3->v4:
1. Fixed the ordering of the ISA extension names in isa_edata_arr.
2. Added RB tags.

Changes from v2->v3:
1. Fixed the rebasing error in PATCH2.
2. Added RB tags.
3. Addressed other review comments.

Changes from v1->v2:
1. Implemented actual mode filtering for both icount and host ticks mode.
2. Addressed comments in v1.
3. Added Kevin's personal email address.

[1] https://github.com/riscv/riscv-smcntrpmf
[2] https://github.com/atishp04/qemu/tree/smcntrpmf_v3

Atish Patra (2):
  target/riscv: Fix the predicate functions for mhpmeventhX CSRs
  target/riscv: Implement privilege mode filtering for cycle/instret

Kaiwen Xue (3):
  target/riscv: Add cycle & instret privilege mode filtering properties
  target/riscv: Add cycle & instret privilege mode filtering definitions
  target/riscv: Add cycle & instret privilege mode filtering support

 target/riscv/cpu.c        |   2 +
 target/riscv/cpu.h        |  17 +++
 target/riscv/cpu_bits.h   |  29 +
 target/riscv/cpu_cfg.h    |   1 +
 target/riscv/cpu_helper.c |   9 +-
 target/riscv/csr.c        | 242 ++
 target/riscv/pmu.c        |  43 +++
 target/riscv/pmu.h        |   2 +
 8 files changed, 292 insertions(+), 53 deletions(-)

--
2.34.1
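To make the "tracking the privilege mode switches" idea concrete, here is a rough standalone sketch. It is a simplified model and not the bookkeeping used in patch 5/5: ToyFixedCtr, toy_ctr_switch_mode() and toy_ctr_read() are hypothetical names, the inhibit mask is indexed by privilege level rather than by the mcyclecfg bit layout, and virtualization (V = 1) is ignored.

```c
#include <stdint.h>

/* Privilege levels, numbered as in the RISC-V privileged spec. */
enum { TOY_PRV_U = 0, TOY_PRV_S = 1, TOY_PRV_M = 3, TOY_PRV_COUNT = 4 };

typedef struct {
    uint64_t per_mode[TOY_PRV_COUNT]; /* ticks accumulated in each mode */
    uint64_t last_switch;             /* tick value at the last switch */
    int cur_mode;                     /* mode the hart is currently in */
} ToyFixedCtr;

/* On every mode switch, credit the elapsed ticks to the mode being left. */
static void toy_ctr_switch_mode(ToyFixedCtr *c, int new_mode, uint64_t now)
{
    c->per_mode[c->cur_mode] += now - c->last_switch;
    c->last_switch = now;
    c->cur_mode = new_mode;
}

/*
 * Filtered read: sum the ticks of every mode whose bit is clear in
 * inhibit_mask (bit m set means "do not count mode m"). Time spent in
 * the current mode since the last switch is included as well.
 */
static uint64_t toy_ctr_read(const ToyFixedCtr *c, unsigned inhibit_mask,
                             uint64_t now)
{
    uint64_t total = 0;

    for (int m = 0; m < TOY_PRV_COUNT; m++) {
        if (inhibit_mask & (1u << m)) {
            continue;
        }
        total += c->per_mode[m];
        if (m == c->cur_mode) {
            total += now - c->last_switch;
        }
    }
    return total;
}
```

In this model the "now" value would come from either host ticks or the raw icount, depending on which mode QEMU runs in; the sketch leaves that choice to the caller.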
[PATCH v11 08/10] hw/net: GMAC Rx Implementation
From: Nabih Estefan Diaz - Implementation of Receive function for packets - Implementation for reading and writing from and to descriptors in memory for Rx When RX starts, we need to flush the queued packets so that they can be received by the GMAC device. Without this it won't work with TAP NIC device. When RX descriptor list is full, it returns a DMA_STATUS for software to handle it. But there's no way to indicate the software has handled all RX descriptors and the whole pipeline stalls. We do something similar to NPCM7XX EMC to handle this case. 1. Return packet size when RX descriptor is full, effectively dropping these packets in such a case. 2. When software clears RX descriptor full bit, continue receiving further packets by flushing QEMU packet queue. Added relevant trace-events Change-Id: I132aa254a94cda1a586aba2ea33bbfc74ecdb831 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- hw/net/npcm_gmac.c | 479 +++- hw/net/trace-events | 5 + 2 files changed, 482 insertions(+), 2 deletions(-) diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c index 44c4ffaff4..c107e835b1 100644 --- a/hw/net/npcm_gmac.c +++ b/hw/net/npcm_gmac.c @@ -24,6 +24,10 @@ #include "hw/net/mii.h" #include "hw/net/npcm_gmac.h" #include "migration/vmstate.h" +#include "net/checksum.h" +#include "net/eth.h" +#include "net/net.h" +#include "qemu/cutils.h" #include "qemu/log.h" #include "qemu/units.h" #include "sysemu/dma.h" @@ -146,6 +150,17 @@ static void gmac_phy_set_link(NPCMGMACState *gmac, bool active) static bool gmac_can_receive(NetClientState *nc) { +NPCMGMACState *gmac = NPCM_GMAC(qemu_get_nic_opaque(nc)); + +/* If GMAC receive is disabled. */ +if (!(gmac->regs[R_NPCM_GMAC_MAC_CONFIG] & NPCM_GMAC_MAC_CONFIG_RX_EN)) { +return false; +} + +/* If GMAC DMA RX is stopped. 
*/ +if (!(gmac->regs[R_NPCM_DMA_CONTROL] & NPCM_DMA_CONTROL_START_STOP_RX)) { +return false; +} return true; } @@ -189,12 +204,438 @@ static void gmac_update_irq(NPCMGMACState *gmac) qemu_set_irq(gmac->irq, level); } -static ssize_t gmac_receive(NetClientState *nc, const uint8_t *buf, size_t len) +static int gmac_read_rx_desc(dma_addr_t addr, struct NPCMGMACRxDesc *desc) { -/* Placeholder. Function will be filled in following patches */ +if (dma_memory_read(_space_memory, addr, desc, +sizeof(*desc), MEMTXATTRS_UNSPECIFIED)) { +qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to read descriptor @ 0x%" + HWADDR_PRIx "\n", __func__, addr); +return -1; +} +desc->rdes0 = le32_to_cpu(desc->rdes0); +desc->rdes1 = le32_to_cpu(desc->rdes1); +desc->rdes2 = le32_to_cpu(desc->rdes2); +desc->rdes3 = le32_to_cpu(desc->rdes3); return 0; } +static int gmac_write_rx_desc(dma_addr_t addr, struct NPCMGMACRxDesc *desc) +{ +struct NPCMGMACRxDesc le_desc; +le_desc.rdes0 = cpu_to_le32(desc->rdes0); +le_desc.rdes1 = cpu_to_le32(desc->rdes1); +le_desc.rdes2 = cpu_to_le32(desc->rdes2); +le_desc.rdes3 = cpu_to_le32(desc->rdes3); +if (dma_memory_write(_space_memory, addr, _desc, +sizeof(le_desc), MEMTXATTRS_UNSPECIFIED)) { +qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to write descriptor @ 0x%" + HWADDR_PRIx "\n", __func__, addr); +return -1; +} +return 0; +} + +static int gmac_read_tx_desc(dma_addr_t addr, struct NPCMGMACTxDesc *desc) +{ +if (dma_memory_read(_space_memory, addr, desc, +sizeof(*desc), MEMTXATTRS_UNSPECIFIED)) { +qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to read descriptor @ 0x%" + HWADDR_PRIx "\n", __func__, addr); +return -1; +} +desc->tdes0 = le32_to_cpu(desc->tdes0); +desc->tdes1 = le32_to_cpu(desc->tdes1); +desc->tdes2 = le32_to_cpu(desc->tdes2); +desc->tdes3 = le32_to_cpu(desc->tdes3); +return 0; +} + +static int gmac_write_tx_desc(dma_addr_t addr, struct NPCMGMACTxDesc *desc) +{ +struct NPCMGMACTxDesc le_desc; +le_desc.tdes0 = cpu_to_le32(desc->tdes0); +le_desc.tdes1 = 
cpu_to_le32(desc->tdes1); +le_desc.tdes2 = cpu_to_le32(desc->tdes2); +le_desc.tdes3 = cpu_to_le32(desc->tdes3); +if (dma_memory_write(_space_memory, addr, _desc, +sizeof(le_desc), MEMTXATTRS_UNSPECIFIED)) { +qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to write descriptor @ 0x%" + HWADDR_PRIx "\n", __func__, addr); +return -1; +} +return 0; +} + +static int gmac_rx_transfer_frame_to_buffer(uint32_t rx_buf_len, +uint32_t *left_frame, +uint32_t rx_buf_addr, +bool *eof_transferred, +const uint8_t **frame_ptr, +
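The descriptor helpers in this patch convert each 32-bit word between guest little-endian layout and host byte order after reading and before writing. The sketch below shows the same conversion in a host-independent way. It is only an illustration: ToyRxDesc and the toy_* helpers are hypothetical, and byte-wise loads and stores stand in for le32_to_cpu()/cpu_to_le32() and the DMA accessors used by the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the four-word RX descriptor. */
typedef struct {
    uint32_t rdes0, rdes1, rdes2, rdes3;
} ToyRxDesc;

/* Read a little-endian 32-bit word regardless of host byte order. */
static uint32_t toy_le32_load(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Store a 32-bit word in little-endian byte order. */
static void toy_le32_store(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Decode a raw 16-byte guest descriptor into host-order fields. */
static void toy_rx_desc_from_guest(const uint8_t raw[16], ToyRxDesc *d)
{
    d->rdes0 = toy_le32_load(raw + 0);
    d->rdes1 = toy_le32_load(raw + 4);
    d->rdes2 = toy_le32_load(raw + 8);
    d->rdes3 = toy_le32_load(raw + 12);
}

/* Encode host-order fields back into the raw guest layout. */
static void toy_rx_desc_to_guest(const ToyRxDesc *d, uint8_t raw[16])
{
    toy_le32_store(raw + 0, d->rdes0);
    toy_le32_store(raw + 4, d->rdes1);
    toy_le32_store(raw + 8, d->rdes2);
    toy_le32_store(raw + 12, d->rdes3);
}
```

Decoding and then re-encoding a descriptor gives back the original bytes, which is the property the read/write pairs in the patch rely on.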
[PATCH v11 01/10] hw/misc: Add Nuvoton's PCI Mailbox Module
From: Hao Wu The PCI Mailbox Module is a high-bandwidth communication module between a Nuvoton BMC and CPU. It features 16KB of RAM that is accessible by both the BMC and the core CPU, and supports interrupts for both sides. This patch implements the BMC side of the PCI mailbox module. Communication with the core CPU is emulated via a chardev and will be added in a follow-up patch. Change-Id: Iaca22f81c4526927d437aa367079ed038faf43f2 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- hw/arm/npcm7xx.c | 15 +- hw/misc/meson.build| 1 + hw/misc/npcm7xx_pci_mbox.c | 324 + hw/misc/trace-events | 5 + include/hw/arm/npcm7xx.h | 1 + include/hw/misc/npcm7xx_pci_mbox.h | 81 6 files changed, 426 insertions(+), 1 deletion(-) create mode 100644 hw/misc/npcm7xx_pci_mbox.c create mode 100644 include/hw/misc/npcm7xx_pci_mbox.h diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c index 15ff21d047..1c3634ff45 100644 --- a/hw/arm/npcm7xx.c +++ b/hw/arm/npcm7xx.c @@ -53,6 +53,9 @@ /* ADC Module */ #define NPCM7XX_ADC_BA (0xf000c000) +/* PCI Mailbox Module */ +#define NPCM7XX_PCI_MBOX_BA (0xf0848000) + /* Internal AHB SRAM */ #define NPCM7XX_RAM3_BA (0xc0008000) #define NPCM7XX_RAM3_SZ (4 * KiB) @@ -83,6 +86,9 @@ enum NPCM7xxInterrupt { NPCM7XX_UART1_IRQ, NPCM7XX_UART2_IRQ, NPCM7XX_UART3_IRQ, +NPCM7XX_PCI_MBOX_IRQ= 8, +NPCM7XX_KCS_HIB_IRQ = 9, +NPCM7XX_GMAC1_IRQ = 14, NPCM7XX_EMC1RX_IRQ = 15, NPCM7XX_EMC1TX_IRQ, NPCM7XX_MMC_IRQ = 26, @@ -706,6 +712,14 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp) } } +/* PCI Mailbox. 
Cannot fail */ +sysbus_realize(SYS_BUS_DEVICE(>pci_mbox), _abort); +sysbus_mmio_map(SYS_BUS_DEVICE(>pci_mbox), 0, NPCM7XX_PCI_MBOX_BA); +sysbus_mmio_map(SYS_BUS_DEVICE(>pci_mbox), 1, +NPCM7XX_PCI_MBOX_BA + NPCM7XX_PCI_MBOX_RAM_SIZE); +sysbus_connect_irq(SYS_BUS_DEVICE(>pci_mbox), 0, + npcm7xx_irq(s, NPCM7XX_PCI_MBOX_IRQ)); + /* RAM2 (SRAM) */ memory_region_init_ram(>sram, OBJECT(dev), "ram2", NPCM7XX_RAM2_SZ, _abort); @@ -765,7 +779,6 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp) create_unimplemented_device("npcm7xx.usbd[8]", 0xf0838000, 4 * KiB); create_unimplemented_device("npcm7xx.usbd[9]", 0xf0839000, 4 * KiB); create_unimplemented_device("npcm7xx.sd", 0xf084, 8 * KiB); -create_unimplemented_device("npcm7xx.pcimbx", 0xf0848000, 512 * KiB); create_unimplemented_device("npcm7xx.aes", 0xf0858000, 4 * KiB); create_unimplemented_device("npcm7xx.des", 0xf0859000, 4 * KiB); create_unimplemented_device("npcm7xx.sha", 0xf085a000, 4 * KiB); diff --git a/hw/misc/meson.build b/hw/misc/meson.build index 36c20d5637..0ead2e9ede 100644 --- a/hw/misc/meson.build +++ b/hw/misc/meson.build @@ -73,6 +73,7 @@ system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files( 'npcm7xx_clk.c', 'npcm7xx_gcr.c', 'npcm7xx_mft.c', + 'npcm7xx_pci_mbox.c', 'npcm7xx_pwm.c', 'npcm7xx_rng.c', )) diff --git a/hw/misc/npcm7xx_pci_mbox.c b/hw/misc/npcm7xx_pci_mbox.c new file mode 100644 index 00..c770ad6fcf --- /dev/null +++ b/hw/misc/npcm7xx_pci_mbox.c @@ -0,0 +1,324 @@ +/* + * Nuvoton NPCM7xx PCI Mailbox Module + * + * Copyright 2021 Google LLC + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. + */ + +#include "qemu/osdep.h" +#include "chardev/char-fe.h" +#include "hw/irq.h" +#include "hw/qdev-clock.h" +#include "hw/qdev-properties-system.h" +#include "hw/misc/npcm7xx_pci_mbox.h" +#include "hw/registerfields.h" +#include "migration/vmstate.h" +#include "qapi/error.h" +#include "qapi/visitor.h" +#include "qemu/bitops.h" +#include "qemu/error-report.h" +#include "qemu/log.h" +#include "qemu/module.h" +#include "qemu/timer.h" +#include "qemu/units.h" +#include "trace.h" + +REG32(NPCM7XX_PCI_MBOX_BMBXSTAT, 0x00); +REG32(NPCM7XX_PCI_MBOX_BMBXCTL, 0x04); +REG32(NPCM7XX_PCI_MBOX_BMBXCMD, 0x08); + +enum NPCM7xxPCIMBoxOperation { +NPCM7XX_PCI_MBOX_OP_READ = 1, +NPCM7XX_PCI_MBOX_OP_WRITE, +}; + +#define NPCM7XX_PCI_MBOX_OFFSET_BYTES 8 + +/* Response code */ +#define NPCM7XX_PCI_MBOX_OK 0 +#define NPCM7XX_PCI_MBOX_INVALID_OP 0xa0 +#define NPCM7XX_PCI_MBOX_INVALID_SIZE 0xa1 +#define
[PATCH v11 05/10] hw/arm: Add GMAC devices to NPCM7XX SoC
From: Hao Wu Change-Id: Id8a3461fb5042adc4c3fd6f4fbd1ca0d33e22565 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- hw/arm/npcm7xx.c | 36 ++-- include/hw/arm/npcm7xx.h | 2 ++ 2 files changed, 36 insertions(+), 2 deletions(-) diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c index c9e87162cb..12e11250e1 100644 --- a/hw/arm/npcm7xx.c +++ b/hw/arm/npcm7xx.c @@ -91,6 +91,7 @@ enum NPCM7xxInterrupt { NPCM7XX_GMAC1_IRQ = 14, NPCM7XX_EMC1RX_IRQ = 15, NPCM7XX_EMC1TX_IRQ, +NPCM7XX_GMAC2_IRQ, NPCM7XX_MMC_IRQ = 26, NPCM7XX_PSPI2_IRQ = 28, NPCM7XX_PSPI1_IRQ = 31, @@ -234,6 +235,12 @@ static const hwaddr npcm7xx_pspi_addr[] = { 0xf0201000, }; +/* Register base address for each GMAC Module */ +static const hwaddr npcm7xx_gmac_addr[] = { +0xf0802000, +0xf0804000, +}; + static const struct { hwaddr regs_addr; uint32_t unconnected_pins; @@ -462,6 +469,10 @@ static void npcm7xx_init(Object *obj) object_initialize_child(obj, "pspi[*]", >pspi[i], TYPE_NPCM_PSPI); } +for (i = 0; i < ARRAY_SIZE(s->gmac); i++) { +object_initialize_child(obj, "gmac[*]", >gmac[i], TYPE_NPCM_GMAC); +} + object_initialize_child(obj, "pci-mbox", >pci_mbox, TYPE_NPCM7XX_PCI_MBOX); object_initialize_child(obj, "mmc", >mmc, TYPE_NPCM7XX_SDHCI); @@ -695,6 +706,29 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp) sysbus_connect_irq(sbd, 1, npcm7xx_irq(s, rx_irq)); } +/* + * GMAC Modules. Cannot fail. + */ +QEMU_BUILD_BUG_ON(ARRAY_SIZE(npcm7xx_gmac_addr) != ARRAY_SIZE(s->gmac)); +QEMU_BUILD_BUG_ON(ARRAY_SIZE(s->gmac) != 2); +for (i = 0; i < ARRAY_SIZE(s->gmac); i++) { +SysBusDevice *sbd = SYS_BUS_DEVICE(>gmac[i]); + +/* + * The device exists regardless of whether it's connected to a QEMU + * netdev backend. So always instantiate it even if there is no + * backend. + */ +sysbus_realize(sbd, _abort); +sysbus_mmio_map(sbd, 0, npcm7xx_gmac_addr[i]); +int irq = i == 0 ? NPCM7XX_GMAC1_IRQ : NPCM7XX_GMAC2_IRQ; +/* + * N.B. 
The values for the second argument sysbus_connect_irq are + * chosen to match the registration order in npcm7xx_emc_realize. + */ +sysbus_connect_irq(sbd, 0, npcm7xx_irq(s, irq)); +} + /* * Flash Interface Unit (FIU). Can fail if incorrect number of chip selects * specified, but this is a programming error. @@ -765,8 +799,6 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp) create_unimplemented_device("npcm7xx.siox[2]", 0xf0102000, 4 * KiB); create_unimplemented_device("npcm7xx.ahbpci", 0xf040, 1 * MiB); create_unimplemented_device("npcm7xx.mcphy",0xf05f, 64 * KiB); -create_unimplemented_device("npcm7xx.gmac1",0xf0802000, 8 * KiB); -create_unimplemented_device("npcm7xx.gmac2",0xf0804000, 8 * KiB); create_unimplemented_device("npcm7xx.vcd", 0xf081, 64 * KiB); create_unimplemented_device("npcm7xx.ece", 0xf082, 8 * KiB); create_unimplemented_device("npcm7xx.vdma", 0xf0822000, 8 * KiB); diff --git a/include/hw/arm/npcm7xx.h b/include/hw/arm/npcm7xx.h index cec3792a2e..9e5cf639a2 100644 --- a/include/hw/arm/npcm7xx.h +++ b/include/hw/arm/npcm7xx.h @@ -30,6 +30,7 @@ #include "hw/misc/npcm7xx_pwm.h" #include "hw/misc/npcm7xx_rng.h" #include "hw/net/npcm7xx_emc.h" +#include "hw/net/npcm_gmac.h" #include "hw/nvram/npcm7xx_otp.h" #include "hw/timer/npcm7xx_timer.h" #include "hw/ssi/npcm7xx_fiu.h" @@ -105,6 +106,7 @@ struct NPCM7xxState { OHCISysBusState ohci; NPCM7xxFIUState fiu[2]; NPCM7xxEMCState emc[2]; +NPCMGMACState gmac[2]; NPCM7xxPCIMBoxState pci_mbox; NPCM7xxSDHCIState mmc; NPCMPSPIState pspi[2]; -- 2.43.0.472.g3155946c3a-goog
[PATCH v11 04/10] hw/net: Add NPCMXXX GMAC device
From: Hao Wu This patch implements the basic registers of GMAC device and sets registers for networking functionalities. Tested: The following message shows up with the change: Broadcom BCM54612E stmmac-0:00: attached PHY driver [Broadcom BCM54612E] (mii_bus:phy_addr=stmmac-0:00, irq=POLL) stmmaceth f0802000.eth eth0: Link is Up - 1Gbps/Full - flow control rx/tx Change-Id: If71c6d486b95edcccba109ba454870714d7e0940 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Diaz Reviewed-by: Tyrone Ting --- hw/net/meson.build | 2 +- hw/net/npcm_gmac.c | 424 + hw/net/trace-events| 11 + include/hw/net/npcm_gmac.h | 340 + 4 files changed, 776 insertions(+), 1 deletion(-) create mode 100644 hw/net/npcm_gmac.c create mode 100644 include/hw/net/npcm_gmac.h diff --git a/hw/net/meson.build b/hw/net/meson.build index f64651c467..db6509f504 100644 --- a/hw/net/meson.build +++ b/hw/net/meson.build @@ -38,7 +38,7 @@ system_ss.add(when: 'CONFIG_I82596_COMMON', if_true: files('i82596.c')) system_ss.add(when: 'CONFIG_SUNHME', if_true: files('sunhme.c')) system_ss.add(when: 'CONFIG_FTGMAC100', if_true: files('ftgmac100.c')) system_ss.add(when: 'CONFIG_SUNGEM', if_true: files('sungem.c')) -system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_emc.c')) +system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_emc.c', 'npcm_gmac.c')) system_ss.add(when: 'CONFIG_ETRAXFS', if_true: files('etraxfs_eth.c')) system_ss.add(when: 'CONFIG_COLDFIRE', if_true: files('mcf_fec.c')) diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c new file mode 100644 index 00..98b3c33c94 --- /dev/null +++ b/hw/net/npcm_gmac.c @@ -0,0 +1,424 @@ +/* + * Nuvoton NPCM7xx/8xx GMAC Module + * + * Copyright 2022 Google LLC + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. + * + * Unsupported/unimplemented features: + * - MII is not implemented, MII_ADDR.BUSY and MII_DATA always return zero + * - Precision timestamp (PTP) is not implemented. + */ + +#include "qemu/osdep.h" + +#include "hw/registerfields.h" +#include "hw/net/mii.h" +#include "hw/net/npcm_gmac.h" +#include "migration/vmstate.h" +#include "qemu/log.h" +#include "qemu/units.h" +#include "sysemu/dma.h" +#include "trace.h" + +REG32(NPCM_DMA_BUS_MODE, 0x1000) +REG32(NPCM_DMA_XMT_POLL_DEMAND, 0x1004) +REG32(NPCM_DMA_RCV_POLL_DEMAND, 0x1008) +REG32(NPCM_DMA_RX_BASE_ADDR, 0x100c) +REG32(NPCM_DMA_TX_BASE_ADDR, 0x1010) +REG32(NPCM_DMA_STATUS, 0x1014) +REG32(NPCM_DMA_CONTROL, 0x1018) +REG32(NPCM_DMA_INTR_ENA, 0x101c) +REG32(NPCM_DMA_MISSED_FRAME_CTR, 0x1020) +REG32(NPCM_DMA_HOST_TX_DESC, 0x1048) +REG32(NPCM_DMA_HOST_RX_DESC, 0x104c) +REG32(NPCM_DMA_CUR_TX_BUF_ADDR, 0x1050) +REG32(NPCM_DMA_CUR_RX_BUF_ADDR, 0x1054) +REG32(NPCM_DMA_HW_FEATURE, 0x1058) + +REG32(NPCM_GMAC_MAC_CONFIG, 0x0) +REG32(NPCM_GMAC_FRAME_FILTER, 0x4) +REG32(NPCM_GMAC_HASH_HIGH, 0x8) +REG32(NPCM_GMAC_HASH_LOW, 0xc) +REG32(NPCM_GMAC_MII_ADDR, 0x10) +REG32(NPCM_GMAC_MII_DATA, 0x14) +REG32(NPCM_GMAC_FLOW_CTRL, 0x18) +REG32(NPCM_GMAC_VLAN_FLAG, 0x1c) +REG32(NPCM_GMAC_VERSION, 0x20) +REG32(NPCM_GMAC_WAKEUP_FILTER, 0x28) +REG32(NPCM_GMAC_PMT, 0x2c) +REG32(NPCM_GMAC_LPI_CTRL, 0x30) +REG32(NPCM_GMAC_TIMER_CTRL, 0x34) +REG32(NPCM_GMAC_INT_STATUS, 0x38) +REG32(NPCM_GMAC_INT_MASK, 0x3c) +REG32(NPCM_GMAC_MAC0_ADDR_HI, 0x40) +REG32(NPCM_GMAC_MAC0_ADDR_LO, 0x44) +REG32(NPCM_GMAC_MAC1_ADDR_HI, 0x48) +REG32(NPCM_GMAC_MAC1_ADDR_LO, 0x4c) +REG32(NPCM_GMAC_MAC2_ADDR_HI, 0x50) +REG32(NPCM_GMAC_MAC2_ADDR_LO, 0x54) +REG32(NPCM_GMAC_MAC3_ADDR_HI, 0x58) +REG32(NPCM_GMAC_MAC3_ADDR_LO, 
0x5c) +REG32(NPCM_GMAC_RGMII_STATUS, 0xd8) +REG32(NPCM_GMAC_WATCHDOG, 0xdc) +REG32(NPCM_GMAC_PTP_TCR, 0x700) +REG32(NPCM_GMAC_PTP_SSIR, 0x704) +REG32(NPCM_GMAC_PTP_STSR, 0x708) +REG32(NPCM_GMAC_PTP_STNSR, 0x70c) +REG32(NPCM_GMAC_PTP_STSUR, 0x710) +REG32(NPCM_GMAC_PTP_STNSUR, 0x714) +REG32(NPCM_GMAC_PTP_TAR, 0x718) +REG32(NPCM_GMAC_PTP_TTSR, 0x71c) + +/* Register Fields */ +#define NPCM_GMAC_MII_ADDR_BUSY BIT(0) +#define NPCM_GMAC_MII_ADDR_WRITEBIT(1) +#define NPCM_GMAC_MII_ADDR_GR(rv) extract16((rv), 6, 5) +#define NPCM_GMAC_MII_ADDR_PA(rv) extract16((rv), 11, 5) + +#define NPCM_GMAC_INT_MASK_LPIIMBIT(10) +#define NPCM_GMAC_INT_MASK_PMTM BIT(3) +#define NPCM_GMAC_INT_MASK_RGIM BIT(0) + +#define NPCM_DMA_BUS_MODE_SWR BIT(0) + +static const uint32_t npcm_gmac_cold_reset_values[NPCM_GMAC_NR_REGS] = { +/*
[PATCH v11 06/10] tests/qtest: Creating qtest for GMAC Module
From: Nabih Estefan Diaz - Created qtest to check initialization of registers in GMAC Module. - Implemented test into Build File. Change-Id: I8b2fe152d3987a7eec4cf6a1d25ba92e75a5391d Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- tests/qtest/meson.build | 1 + tests/qtest/npcm_gmac-test.c | 209 +++ 2 files changed, 210 insertions(+) create mode 100644 tests/qtest/npcm_gmac-test.c diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build index 2ac79925f9..aed8924be9 100644 --- a/tests/qtest/meson.build +++ b/tests/qtest/meson.build @@ -221,6 +221,7 @@ qtests_aarch64 = \ (config_all_devices.has_key('CONFIG_RASPI') ? ['bcm2835-dma-test'] : []) + \ (config_all.has_key('CONFIG_TCG') and \ config_all_devices.has_key('CONFIG_TPM_TIS_I2C') ? ['tpm-tis-i2c-test'] : []) + \ + (config_all_devices.has_key('CONFIG_NPCM7XX') ? qtests_npcm7xx : []) + \ ['arm-cpu-features', 'numa-test', 'boot-serial-test', diff --git a/tests/qtest/npcm_gmac-test.c b/tests/qtest/npcm_gmac-test.c new file mode 100644 index 00..130a1599a8 --- /dev/null +++ b/tests/qtest/npcm_gmac-test.c @@ -0,0 +1,209 @@ +/* + * QTests for Nuvoton NPCM7xx/8xx GMAC Modules. + * + * Copyright 2023 Google LLC + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. 
+ */ + +#include "qemu/osdep.h" +#include "libqos/libqos.h" + +/* Name of the GMAC Device */ +#define TYPE_NPCM_GMAC "npcm-gmac" + +typedef struct GMACModule { +int irq; +uint64_t base_addr; +} GMACModule; + +typedef struct TestData { +const GMACModule *module; +} TestData; + +/* Values extracted from hw/arm/npcm8xx.c */ +static const GMACModule gmac_module_list[] = { +{ +.irq= 14, +.base_addr = 0xf0802000 +}, +{ +.irq= 15, +.base_addr = 0xf0804000 +}, +{ +.irq= 16, +.base_addr = 0xf0806000 +}, +{ +.irq= 17, +.base_addr = 0xf0808000 +} +}; + +/* Returns the index of the GMAC module. */ +static int gmac_module_index(const GMACModule *mod) +{ +ptrdiff_t diff = mod - gmac_module_list; + +g_assert_true(diff >= 0 && diff < ARRAY_SIZE(gmac_module_list)); + +return diff; +} + +/* 32-bit register indices. Taken from npcm_gmac.c */ +typedef enum NPCMRegister { +/* DMA Registers */ +NPCM_DMA_BUS_MODE = 0x1000, +NPCM_DMA_XMT_POLL_DEMAND = 0x1004, +NPCM_DMA_RCV_POLL_DEMAND = 0x1008, +NPCM_DMA_RCV_BASE_ADDR = 0x100c, +NPCM_DMA_TX_BASE_ADDR = 0x1010, +NPCM_DMA_STATUS = 0x1014, +NPCM_DMA_CONTROL = 0x1018, +NPCM_DMA_INTR_ENA = 0x101c, +NPCM_DMA_MISSED_FRAME_CTR = 0x1020, +NPCM_DMA_HOST_TX_DESC = 0x1048, +NPCM_DMA_HOST_RX_DESC = 0x104c, +NPCM_DMA_CUR_TX_BUF_ADDR = 0x1050, +NPCM_DMA_CUR_RX_BUF_ADDR = 0x1054, +NPCM_DMA_HW_FEATURE = 0x1058, + +/* GMAC Registers */ +NPCM_GMAC_MAC_CONFIG = 0x0, +NPCM_GMAC_FRAME_FILTER = 0x4, +NPCM_GMAC_HASH_HIGH = 0x8, +NPCM_GMAC_HASH_LOW = 0xc, +NPCM_GMAC_MII_ADDR = 0x10, +NPCM_GMAC_MII_DATA = 0x14, +NPCM_GMAC_FLOW_CTRL = 0x18, +NPCM_GMAC_VLAN_FLAG = 0x1c, +NPCM_GMAC_VERSION = 0x20, +NPCM_GMAC_WAKEUP_FILTER = 0x28, +NPCM_GMAC_PMT = 0x2c, +NPCM_GMAC_LPI_CTRL = 0x30, +NPCM_GMAC_TIMER_CTRL = 0x34, +NPCM_GMAC_INT_STATUS = 0x38, +NPCM_GMAC_INT_MASK = 0x3c, +NPCM_GMAC_MAC0_ADDR_HI = 0x40, +NPCM_GMAC_MAC0_ADDR_LO = 0x44, +NPCM_GMAC_MAC1_ADDR_HI = 0x48, +NPCM_GMAC_MAC1_ADDR_LO = 0x4c, +NPCM_GMAC_MAC2_ADDR_HI = 0x50, +NPCM_GMAC_MAC2_ADDR_LO = 0x54, 
+NPCM_GMAC_MAC3_ADDR_HI = 0x58, +NPCM_GMAC_MAC3_ADDR_LO = 0x5c, +NPCM_GMAC_RGMII_STATUS = 0xd8, +NPCM_GMAC_WATCHDOG = 0xdc, +NPCM_GMAC_PTP_TCR = 0x700, +NPCM_GMAC_PTP_SSIR = 0x704, +NPCM_GMAC_PTP_STSR = 0x708, +NPCM_GMAC_PTP_STNSR = 0x70c, +NPCM_GMAC_PTP_STSUR = 0x710, +NPCM_GMAC_PTP_STNSUR = 0x714, +NPCM_GMAC_PTP_TAR = 0x718, +NPCM_GMAC_PTP_TTSR = 0x71c, +} NPCMRegister; + +static uint32_t gmac_read(QTestState *qts, const GMACModule *mod, + NPCMRegister regno) +{ +return qtest_readl(qts, mod->base_addr + regno); +} + +/* Check that GMAC registers are reset to default value */ +static void test_init(gconstpointer test_data) +{ +const TestData *td = test_data; +const GMACModule *mod = td->module; +QTestState *qts = qtest_init("-machine npcm845-evb"); + +#define CHECK_REG32(regno,
[PATCH v11 09/10] hw/net: GMAC Tx Implementation
From: Nabih Estefan Diaz - Implementation of Transmit function for packets - Implementation for reading and writing from and to descriptors in memory for Tx Added relevant trace-events NOTE: This function implements the steps detailed in the datasheet for transmitting messages from the GMAC. Change-Id: Icf14f9fcc6cc7808a41acd872bca67c9832087e6 Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- hw/net/trace-events | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/net/trace-events b/hw/net/trace-events index f91b1a4a3d..78efa2ec2c 100644 --- a/hw/net/trace-events +++ b/hw/net/trace-events @@ -478,7 +478,9 @@ npcm_gmac_packet_desc_read(const char* name, uint32_t desc_addr) "%s: attempting npcm_gmac_packet_receive(const char* name, uint32_t len) "%s: RX packet length: 0x%04" PRIX32 npcm_gmac_packet_receiving_buffer(const char* name, uint32_t buf_len, uint32_t rx_buf_addr) "%s: Receiving into Buffer size: 0x%04" PRIX32 " at address 0x%04" PRIX32 npcm_gmac_packet_received(const char* name, uint32_t len) "%s: Reception finished, packet left: 0x%04" PRIX32 +npcm_gmac_packet_sent(const char* name, uint16_t len) "%s: TX packet sent!, length: 0x%04" PRIX16 npcm_gmac_debug_desc_data(const char* name, void* addr, uint32_t des0, uint32_t des1, uint32_t des2, uint32_t des3)"%s: Address: %p Descriptor 0: 0x%04" PRIX32 " Descriptor 1: 0x%04" PRIX32 "Descriptor 2: 0x%04" PRIX32 " Descriptor 3: 0x%04" PRIX32 +npcm_gmac_packet_tx_desc_data(const char* name, uint32_t tdes0, uint32_t tdes1) "%s: Tdes0: 0x%04" PRIX32 " Tdes1: 0x%04" PRIX32 # npcm_pcs.c npcm_pcs_reg_read(const char *name, uint16_t indirect_access_baes, uint64_t offset, uint16_t value) "%s: IND: 0x%02" PRIx16 " offset: 0x%04" PRIx64 " value: 0x%04" PRIx16 -- 2.43.0.472.g3155946c3a-goog
[PATCH v11 02/10] hw/arm: Add PCI mailbox module to Nuvoton SoC
From: Hao Wu This patch wires the PCI mailbox module to Nuvoton SoC. Change-Id: I14c42c628258804030f0583889882842bde0d972 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- docs/system/arm/nuvoton.rst | 2 ++ hw/arm/npcm7xx.c | 2 ++ include/hw/arm/npcm7xx.h | 1 + 3 files changed, 5 insertions(+) diff --git a/docs/system/arm/nuvoton.rst b/docs/system/arm/nuvoton.rst index 0424cae4b0..e611099545 100644 --- a/docs/system/arm/nuvoton.rst +++ b/docs/system/arm/nuvoton.rst @@ -50,6 +50,8 @@ Supported devices * Ethernet controller (EMC) * Tachometer * Peripheral SPI controller (PSPI) + * BIOS POST code FIFO + * PCI Mailbox Missing devices --- diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c index 1c3634ff45..c9e87162cb 100644 --- a/hw/arm/npcm7xx.c +++ b/hw/arm/npcm7xx.c @@ -462,6 +462,8 @@ static void npcm7xx_init(Object *obj) object_initialize_child(obj, "pspi[*]", &s->pspi[i], TYPE_NPCM_PSPI); } +object_initialize_child(obj, "pci-mbox", &s->pci_mbox, +TYPE_NPCM7XX_PCI_MBOX); object_initialize_child(obj, "mmc", &s->mmc, TYPE_NPCM7XX_SDHCI); } diff --git a/include/hw/arm/npcm7xx.h b/include/hw/arm/npcm7xx.h index 273090ac60..cec3792a2e 100644 --- a/include/hw/arm/npcm7xx.h +++ b/include/hw/arm/npcm7xx.h @@ -105,6 +105,7 @@ struct NPCM7xxState { OHCISysBusState ohci; NPCM7xxFIUState fiu[2]; NPCM7xxEMCState emc[2]; +NPCM7xxPCIMBoxState pci_mbox; NPCM7xxSDHCIState mmc; NPCMPSPIState pspi[2]; }; -- 2.43.0.472.g3155946c3a-goog
[PATCH v11 07/10] include/hw/net: GMAC IRQ Implementation
From: Nabih Estefan Diaz Implement Update IRQ Method for GMAC functionality. Added relevant trace-events Change-Id: I7a2d3cd3f493278bcd0cf483233c1e05c37488b7 Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- hw/net/npcm_gmac.c | 40 hw/net/trace-events | 1 + 2 files changed, 41 insertions(+) diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c index 98b3c33c94..44c4ffaff4 100644 --- a/hw/net/npcm_gmac.c +++ b/hw/net/npcm_gmac.c @@ -149,6 +149,46 @@ static bool gmac_can_receive(NetClientState *nc) return true; } +/* + * Function that updates the GMAC IRQ + * It find the logical OR of the enabled bits for NIS (if enabled) + * It find the logical OR of the enabled bits for AIS (if enabled) + */ +static void gmac_update_irq(NPCMGMACState *gmac) +{ +/* + * Check if the normal interrupts summary is enabled + * if so, add the bits for the summary that are enabled + */ +if (gmac->regs[R_NPCM_DMA_INTR_ENA] & gmac->regs[R_NPCM_DMA_STATUS] & +(NPCM_DMA_INTR_ENAB_NIE_BITS)) { +gmac->regs[R_NPCM_DMA_STATUS] |= NPCM_DMA_STATUS_NIS; +} +/* + * Check if the abnormal interrupts summary is enabled + * if so, add the bits for the summary that are enabled + */ +if (gmac->regs[R_NPCM_DMA_INTR_ENA] & gmac->regs[R_NPCM_DMA_STATUS] & +(NPCM_DMA_INTR_ENAB_AIE_BITS)) { +gmac->regs[R_NPCM_DMA_STATUS] |= NPCM_DMA_STATUS_AIS; +} + +/* Get the logical OR of both normal and abnormal interrupts */ +int level = !!((gmac->regs[R_NPCM_DMA_STATUS] & +gmac->regs[R_NPCM_DMA_INTR_ENA] & +NPCM_DMA_STATUS_NIS) | + (gmac->regs[R_NPCM_DMA_STATUS] & + gmac->regs[R_NPCM_DMA_INTR_ENA] & + NPCM_DMA_STATUS_AIS)); + +/* Set the IRQ */ +trace_npcm_gmac_update_irq(DEVICE(gmac)->canonical_path, + gmac->regs[R_NPCM_DMA_STATUS], + gmac->regs[R_NPCM_DMA_INTR_ENA], + level); +qemu_set_irq(gmac->irq, level); +} + static ssize_t gmac_receive(NetClientState *nc, const uint8_t *buf, size_t len) { /* Placeholder. 
Function will be filled in following patches */ diff --git a/hw/net/trace-events b/hw/net/trace-events index 33514548b8..56057de47f 100644 --- a/hw/net/trace-events +++ b/hw/net/trace-events @@ -473,6 +473,7 @@ npcm_gmac_reg_write(const char *name, uint64_t offset, uint32_t value) "%s: offs npcm_gmac_mdio_access(const char *name, uint8_t is_write, uint8_t pa, uint8_t gr, uint16_t val) "%s: is_write: %" PRIu8 " pa: %" PRIu8 " gr: %" PRIu8 " val: 0x%04" PRIx16 npcm_gmac_reset(const char *name, uint16_t value) "%s: phy_regs[0][1]: 0x%04" PRIx16 npcm_gmac_set_link(bool active) "Set link: active=%u" +npcm_gmac_update_irq(const char *name, uint32_t status, uint32_t intr_en, int level) "%s: Status Reg: 0x%04" PRIX32 " Interrupt Enable Reg: 0x%04" PRIX32 " IRQ Set: %d" # npcm_pcs.c npcm_pcs_reg_read(const char *name, uint16_t indirect_access_baes, uint64_t offset, uint16_t value) "%s: IND: 0x%02" PRIx16 " offset: 0x%04" PRIx64 " value: 0x%04" PRIx16 -- 2.43.0.472.g3155946c3a-goog
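The summary-bit logic in gmac_update_irq() can be modelled outside QEMU. The following is a minimal Python sketch of that logic, not the device code. The bit positions for NIS, AIS, the example status bits, and the two enable masks are hypothetical placeholders; the real values are presumably defined in include/hw/net/npcm_gmac.h, which is not shown in this message.

```python
# Hypothetical bit layout, chosen only for illustration.
NIS = 1 << 16            # normal interrupt summary (assumed position)
AIS = 1 << 15            # abnormal interrupt summary (assumed position)
RX_DONE = 1 << 6         # example "normal" status bit (assumed)
RX_STOPPED = 1 << 8      # example "abnormal" status bit (assumed)
NIE_BITS = RX_DONE       # mask of normal bits that feed NIS (assumed)
AIE_BITS = RX_STOPPED    # mask of abnormal bits that feed AIS (assumed)


def update_irq(status, intr_ena):
    """Return (new_status, level) following the patch's two-step logic."""
    # Step 1: set a summary bit if any enabled source bit of that class is pending.
    if intr_ena & status & NIE_BITS:
        status |= NIS
    if intr_ena & status & AIE_BITS:
        status |= AIS
    # Step 2: the line is raised only if a summary bit is both set and enabled.
    level = int(bool(status & intr_ena & (NIS | AIS)))
    return status, level


status, level = update_irq(RX_DONE, RX_DONE | NIS)
print(hex(status), level)
```

One consequence worth noting in this model: a pending, enabled source bit sets its summary bit, but the IRQ line stays low unless the summary bit itself is also enabled in the interrupt enable register.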
[PATCH v11 10/10] tests/qtest: Adding PCS Module test to GMAC Qtest
From: Nabih Estefan Diaz - Add PCS Register check to npcm_gmac-test Change-Id: I34821beb5e0b1e89e2be576ab58eabe41545af12 Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- tests/qtest/npcm_gmac-test.c | 132 +++ 1 file changed, 132 insertions(+) diff --git a/tests/qtest/npcm_gmac-test.c b/tests/qtest/npcm_gmac-test.c index 130a1599a8..b64515794b 100644 --- a/tests/qtest/npcm_gmac-test.c +++ b/tests/qtest/npcm_gmac-test.c @@ -20,6 +20,10 @@ /* Name of the GMAC Device */ #define TYPE_NPCM_GMAC "npcm-gmac" +/* Address of the PCS Module */ +#define PCS_BASE_ADDRESS 0xf078 +#define NPCM_PCS_IND_AC_BA 0x1fe + typedef struct GMACModule { int irq; uint64_t base_addr; @@ -111,6 +115,62 @@ typedef enum NPCMRegister { NPCM_GMAC_PTP_STNSUR = 0x714, NPCM_GMAC_PTP_TAR = 0x718, NPCM_GMAC_PTP_TTSR = 0x71c, + +/* PCS Registers */ +NPCM_PCS_SR_CTL_ID1 = 0x3c0008, +NPCM_PCS_SR_CTL_ID2 = 0x3c000a, +NPCM_PCS_SR_CTL_STS = 0x3c0010, + +NPCM_PCS_SR_MII_CTRL = 0x3e, +NPCM_PCS_SR_MII_STS = 0x3e0002, +NPCM_PCS_SR_MII_DEV_ID1 = 0x3e0004, +NPCM_PCS_SR_MII_DEV_ID2 = 0x3e0006, +NPCM_PCS_SR_MII_AN_ADV = 0x3e0008, +NPCM_PCS_SR_MII_LP_BABL = 0x3e000a, +NPCM_PCS_SR_MII_AN_EXPN = 0x3e000c, +NPCM_PCS_SR_MII_EXT_STS = 0x3e001e, + +NPCM_PCS_SR_TIM_SYNC_ABL = 0x3e0e10, +NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_LWR = 0x3e0e12, +NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_UPR = 0x3e0e14, +NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_LWR = 0x3e0e16, +NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_UPR = 0x3e0e18, +NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_LWR = 0x3e0e1a, +NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_UPR = 0x3e0e1c, +NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_LWR = 0x3e0e1e, +NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_UPR = 0x3e0e20, + +NPCM_PCS_VR_MII_MMD_DIG_CTRL1 = 0x3f, +NPCM_PCS_VR_MII_AN_CTRL = 0x3f0002, +NPCM_PCS_VR_MII_AN_INTR_STS = 0x3f0004, +NPCM_PCS_VR_MII_TC = 0x3f0006, +NPCM_PCS_VR_MII_DBG_CTRL = 0x3f000a, +NPCM_PCS_VR_MII_EEE_MCTRL0 = 0x3f000c, +NPCM_PCS_VR_MII_EEE_TXTIMER = 0x3f0010, +NPCM_PCS_VR_MII_EEE_RXTIMER = 0x3f0012, +NPCM_PCS_VR_MII_LINK_TIMER_CTRL = 
0x3f0014, +NPCM_PCS_VR_MII_EEE_MCTRL1 = 0x3f0016, +NPCM_PCS_VR_MII_DIG_STS = 0x3f0020, +NPCM_PCS_VR_MII_ICG_ERRCNT1 = 0x3f0022, +NPCM_PCS_VR_MII_MISC_STS = 0x3f0030, +NPCM_PCS_VR_MII_RX_LSTS = 0x3f0040, +NPCM_PCS_VR_MII_MP_TX_BSTCTRL0 = 0x3f0070, +NPCM_PCS_VR_MII_MP_TX_LVLCTRL0 = 0x3f0074, +NPCM_PCS_VR_MII_MP_TX_GENCTRL0 = 0x3f007a, +NPCM_PCS_VR_MII_MP_TX_GENCTRL1 = 0x3f007c, +NPCM_PCS_VR_MII_MP_TX_STS = 0x3f0090, +NPCM_PCS_VR_MII_MP_RX_GENCTRL0 = 0x3f00b0, +NPCM_PCS_VR_MII_MP_RX_GENCTRL1 = 0x3f00b2, +NPCM_PCS_VR_MII_MP_RX_LOS_CTRL0 = 0x3f00ba, +NPCM_PCS_VR_MII_MP_MPLL_CTRL0 = 0x3f00f0, +NPCM_PCS_VR_MII_MP_MPLL_CTRL1 = 0x3f00f2, +NPCM_PCS_VR_MII_MP_MPLL_STS = 0x3f0110, +NPCM_PCS_VR_MII_MP_MISC_CTRL2 = 0x3f0126, +NPCM_PCS_VR_MII_MP_LVL_CTRL = 0x3f0130, +NPCM_PCS_VR_MII_MP_MISC_CTRL0 = 0x3f0132, +NPCM_PCS_VR_MII_MP_MISC_CTRL1 = 0x3f0134, +NPCM_PCS_VR_MII_DIG_CTRL2 = 0x3f01c2, +NPCM_PCS_VR_MII_DIG_ERRCNT_SEL = 0x3f01c4, } NPCMRegister; static uint32_t gmac_read(QTestState *qts, const GMACModule *mod, @@ -119,6 +179,15 @@ static uint32_t gmac_read(QTestState *qts, const GMACModule *mod, return qtest_readl(qts, mod->base_addr + regno); } +static uint16_t pcs_read(QTestState *qts, const GMACModule *mod, + NPCMRegister regno) +{ +uint32_t write_value = (regno & 0x3ffe00) >> 9; +qtest_writel(qts, PCS_BASE_ADDRESS + NPCM_PCS_IND_AC_BA, write_value); +uint32_t read_offset = regno & 0x1ff; +return qtest_readl(qts, PCS_BASE_ADDRESS + read_offset); +} + /* Check that GMAC registers are reset to default value */ static void test_init(gconstpointer test_data) { @@ -131,6 +200,11 @@ static void test_init(gconstpointer test_data) g_assert_cmphex(gmac_read(qts, mod, (regno)), ==, (value)); \ } while (0) +#define CHECK_REG_PCS(regno, value) \ +do { \ +g_assert_cmphex(pcs_read(qts, mod, (regno)), ==, (value)); \ +} while (0) + CHECK_REG32(NPCM_DMA_BUS_MODE, 0x00020100); CHECK_REG32(NPCM_DMA_XMT_POLL_DEMAND, 0); CHECK_REG32(NPCM_DMA_RCV_POLL_DEMAND, 0); @@ -180,6 +254,64 @@ static void 
test_init(gconstpointer test_data) CHECK_REG32(NPCM_GMAC_PTP_TAR, 0); CHECK_REG32(NPCM_GMAC_PTP_TTSR, 0); +/* TODO Add registers PCS */ +if (mod->base_addr == 0xf0802000) { +CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID1, 0x699e); +CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID2, 0); +CHECK_REG_PCS(NPCM_PCS_SR_CTL_STS, 0x8000); + +CHECK_REG_PCS(NPCM_PCS_SR_MII_CTRL, 0x1140); +CHECK_REG_PCS(NPCM_PCS_SR_MII_STS, 0x0109); +CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID1, 0x699e); +CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID2, 0x0ced0); +
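The pcs_read() helper in this patch splits each PCS register number into a window value, written to the indirect-access register, and a 9-bit offset read relative to the PCS base. A small Python sketch of that arithmetic follows; the function names are made up for illustration, and only the two masks and the shift are taken from the patch.

```python
WINDOW_MASK = 0x3FFE00   # bits written (shifted) to the indirect-access register
OFFSET_MASK = 0x1FF      # low 9 bits, used as the offset inside the window


def split_pcs_address(regno):
    """Split a PCS register number into (window, offset), as pcs_read() does."""
    window = (regno & WINDOW_MASK) >> 9
    offset = regno & OFFSET_MASK
    return window, offset


def join_pcs_address(window, offset):
    """Inverse of split_pcs_address(), useful as a sanity check."""
    return (window << 9) | offset


for regno in (0x3E0002, 0x3E0E10, 0x3F01C2):
    window, offset = split_pcs_address(regno)
    print(hex(regno), "-> window", hex(window), "offset", hex(offset))
```

Round-tripping through join_pcs_address() is a cheap way to confirm that no address bits are lost for the register numbers listed in the enum.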
[PATCH v11 03/10] hw/misc: Add qtest for NPCM7xx PCI Mailbox
From: Hao Wu This patch adds a qtest for NPCM7XX PCI Mailbox module. It sends read and write requests to the module, and verifies that the module contains the correct data after the requests. Change-Id: I2e1dbaecf8be9ec7eab55cb54f7fdeb0715b8275 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- tests/qtest/meson.build | 1 + tests/qtest/npcm7xx_pci_mbox-test.c | 238 2 files changed, 239 insertions(+) create mode 100644 tests/qtest/npcm7xx_pci_mbox-test.c diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build index 47dabf91d0..2ac79925f9 100644 --- a/tests/qtest/meson.build +++ b/tests/qtest/meson.build @@ -183,6 +183,7 @@ qtests_sparc64 = \ qtests_npcm7xx = \ ['npcm7xx_adc-test', 'npcm7xx_gpio-test', + 'npcm7xx_pci_mbox-test', 'npcm7xx_pwm-test', 'npcm7xx_rng-test', 'npcm7xx_sdhci-test', diff --git a/tests/qtest/npcm7xx_pci_mbox-test.c b/tests/qtest/npcm7xx_pci_mbox-test.c new file mode 100644 index 00..24eec18e3c --- /dev/null +++ b/tests/qtest/npcm7xx_pci_mbox-test.c @@ -0,0 +1,238 @@ +/* + * QTests for Nuvoton NPCM7xx PCI Mailbox Modules. + * + * Copyright 2021 Google LLC + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. 
+ */ +#include "qemu/osdep.h" +#include "qemu/bitops.h" +#include "qapi/qmp/qdict.h" +#include "qapi/qmp/qnum.h" +#include "libqtest-single.h" + +#define PCI_MBOX_BA 0xf0848000 +#define PCI_MBOX_IRQ 8 + +/* register offset */ +#define PCI_MBOX_STAT 0x00 +#define PCI_MBOX_CTL 0x04 +#define PCI_MBOX_CMD 0x08 + +#define CODE_OK 0x00 +#define CODE_INVALID_OP 0xa0 +#define CODE_INVALID_SIZE 0xa1 +#define CODE_ERROR 0xff + +#define OP_READ 0x01 +#define OP_WRITE 0x02 +#define OP_INVALID 0x41 + + +static int sock; +static int fd; + +/* + * Create a local TCP socket with any port, then save off the port we got. + */ +static in_port_t open_socket(void) +{ +struct sockaddr_in myaddr; +socklen_t addrlen; + +myaddr.sin_family = AF_INET; +myaddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); +myaddr.sin_port = 0; +sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); +g_assert(sock != -1); +g_assert(bind(sock, (struct sockaddr *) &myaddr, sizeof(myaddr)) != -1); +addrlen = sizeof(myaddr); +g_assert(getsockname(sock, (struct sockaddr *) &myaddr, &addrlen) != -1); +g_assert(listen(sock, 1) != -1); +return ntohs(myaddr.sin_port); +} + +static void setup_fd(void) +{ +fd_set readfds; + +FD_ZERO(&readfds); +FD_SET(sock, &readfds); +g_assert(select(sock + 1, &readfds, NULL, NULL, NULL) == 1); + +fd = accept(sock, NULL, 0); +g_assert(fd >= 0); +} + +static uint8_t read_response(uint8_t *buf, size_t len) +{ +uint8_t code; +ssize_t ret = read(fd, &code, 1); + +if (ret == -1) { +return CODE_ERROR; +} +if (code != CODE_OK) { +return code; +} +g_test_message("response code: %x", code); +if (len > 0) { +ret = read(fd, buf, len); +if (ret < len) { +return CODE_ERROR; +} +} +return CODE_OK; +} + +static void receive_data(uint64_t offset, uint8_t *buf, size_t len) +{ +uint8_t op = OP_READ; +uint8_t code; +ssize_t rv; + +while (len > 0) { +uint8_t size; + +if (len >= 8) { +size = 8; +} else if (len >= 4) { +size = 4; +} else if (len >= 2) { +size = 2; +} else { +size = 1; +} + +g_test_message("receiving %u bytes", size); +/* Write op */ +rv = write(fd, &op,
1); +g_assert_cmpint(rv, ==, 1); +/* Write offset */ +rv = write(fd, (uint8_t *)&offset, sizeof(uint64_t)); +g_assert_cmpint(rv, ==, sizeof(uint64_t)); +/* Write size */ +g_assert_cmpint(write(fd, &size, 1), ==, 1); + +/* Read data and Expect response */ +code = read_response(buf, size); +g_assert_cmphex(code, ==, CODE_OK); + +buf += size; +offset += size; +len -= size; +} +} + +static void send_data(uint64_t offset, const uint8_t *buf, size_t len) +{ +uint8_t op = OP_WRITE; +uint8_t code; +ssize_t rv; + +while (len > 0) { +uint8_t size; + +if (len >= 8) { +size = 8; +} else if (len >= 4) { +size = 4; +} else if (len >= 2) { +size = 2; +} else { +size = 1; +} + +
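The request framing used by receive_data() and send_data() can be sketched in a few lines. This is a hedged Python model of what the test appears to send: one op byte, eight offset bytes, and one size byte, with transfers split into 8-, 4-, 2-, and 1-byte pieces. It assumes the offset goes out as a raw uint64_t in host byte order, as the sizeof(uint64_t) write suggests; the helper names are invented for this sketch.

```python
import struct

OP_READ = 0x01    # values taken from the test's #defines
OP_WRITE = 0x02


def chunk_sizes(length):
    """Split a transfer into the largest allowed pieces: 8, 4, 2, then 1 bytes."""
    sizes = []
    while length > 0:
        if length >= 8:
            size = 8
        elif length >= 4:
            size = 4
        elif length >= 2:
            size = 2
        else:
            size = 1
        sizes.append(size)
        length -= size
    return sizes


def build_request(op, offset, size):
    """Frame one request header: op (1 byte), offset (8 bytes, host order), size (1 byte)."""
    # "=" selects native byte order without padding (assumption: matches the raw write).
    return struct.pack("=BQB", op, offset, size)


offset = 0x10
for size in chunk_sizes(15):
    header = build_request(OP_READ, offset, size)
    print(header.hex(), "requests", size, "bytes at", hex(offset))
    offset += size
```

A 15-byte transfer therefore turns into four requests, each with a 10-byte header, and the offset advances by the size of the previous piece.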
[PATCH v11 00/10] Implementation of NPI Mailbox and GMAC Networking Module
From: Nabih Estefan Diaz

[Changes since v10]
Fixed macOS build issue. Changed imports to not be Linux-specific.

[Changes since v9]
More cleanup and fixes based on suggestions from Peter Maydell (peter.mayd...@linaro.org).

[Changes since v8]
Suggestions and fixes from Peter Maydell (peter.mayd...@linaro.org). Also cleaned up changes so nothing is deleted in a later patch that was added in an earlier patch. Patch count decreased by 1 because this cleanup made one of the patches irrelevant.

[Changes since v7]
Fixed patch 4 declaration of new NIC based on comments by Peter Maydell (peter.mayd...@linaro.org).

[Changes since v6]
Remove the Change-Ids from the commit messages.

[Changes since v5]
Undid removal of some qtests that seems to have been caused by a merge conflict.

[Changes since v4]
Added Signed-off-by tag and fixed patch 4 commit message as suggested by Peter Maydell (peter.mayd...@linaro.org).

[Changes since v3]
Fixed comments from Hao Wu (wuhao...@google.com).

[Changes since v2]
Fixed bugs related to the RC functionality of the GMAC. Added and squashed patches related to that.

[Changes since v1]
Fixed some errors in formatting. Fixed a merge error that I didn't see in v1. Removed Nuvoton 8xx references since that is a separate patch set.

[Original Cover]
Creates NPI Mailbox Module with data verification for read and write (internal and external), wiring to the Nuvoton SoC, and QTests. Also creates the GMAC Networking Module. Implements read and write functionalities with corresponding descriptors and registers. Also includes QTests for the different functionalities.
Hao Wu (5): hw/misc: Add Nuvoton's PCI Mailbox Module hw/arm: Add PCI mailbox module to Nuvoton SoC hw/misc: Add qtest for NPCM7xx PCI Mailbox hw/net: Add NPCMXXX GMAC device hw/arm: Add GMAC devices to NPCM7XX SoC Nabih Estefan Diaz (5): tests/qtest: Creating qtest for GMAC Module include/hw/net: GMAC IRQ Implementation hw/net: GMAC Rx Implementation hw/net: GMAC Tx Implementation tests/qtest: Adding PCS Module test to GMAC Qtest docs/system/arm/nuvoton.rst | 2 + hw/arm/npcm7xx.c| 53 +- hw/misc/meson.build | 1 + hw/misc/npcm7xx_pci_mbox.c | 324 ++ hw/misc/trace-events| 5 + hw/net/meson.build | 2 +- hw/net/npcm_gmac.c | 939 hw/net/trace-events | 19 + include/hw/arm/npcm7xx.h| 4 + include/hw/misc/npcm7xx_pci_mbox.h | 81 +++ include/hw/net/npcm_gmac.h | 340 ++ tests/qtest/meson.build | 2 + tests/qtest/npcm7xx_pci_mbox-test.c | 238 +++ tests/qtest/npcm_gmac-test.c| 341 ++ 14 files changed, 2347 insertions(+), 4 deletions(-) create mode 100644 hw/misc/npcm7xx_pci_mbox.c create mode 100644 hw/net/npcm_gmac.c create mode 100644 include/hw/misc/npcm7xx_pci_mbox.h create mode 100644 include/hw/net/npcm_gmac.h create mode 100644 tests/qtest/npcm7xx_pci_mbox-test.c create mode 100644 tests/qtest/npcm_gmac-test.c -- 2.43.0.472.g3155946c3a-goog
Re: [PATCH] tests/avocado/reverse_debugging: Disable the ppc64 tests by default
On Thu, Nov 23, 2023 at 5:53 AM Peter Maydell wrote: > > On Mon, 20 Nov 2023 at 19:19, John Snow wrote: > > > > On Wed, Nov 15, 2023 at 12:23 PM Daniel P. Berrangé > > wrote: > > > The Python Machine() class has passed one of a pre-created socketpair > > > FDs for the serial port chardev. The guest is trying to write to this > > > and blocking. Nothing in the Machine() class is reading from the > > > other end of the serial port console. > > > > The Machine class doesn't know if anything will ever use the console, > > > so as is the change is unsafe. > > > > > > The original goal of John's change was to guarantee we capture early > > > boot messages as some test need that. > > > > > > I think we need to be able to have a flag to say whether the caller needs > > > an "early console" facility, and only use the pre-opened FD passing for > > > that case. Tests we need early console will have to ask for that guarantee > > > explicitly. > > > > Tch. I see. Thank you for diagnosing this. > > > > From the machine.py perspective, you have to *opt in* to having a > > console, so I hadn't considered that a caller would enable the console > > and then ... not read from it. Surely that's a bug in the caller? > > From an Avocado test perspective, I would expect that the test case > should have to explicitly opt *out* of "the console messages appear > in the avocado test log, even if the test case doesn't care about them > for the purposes of identifying when to end the test or whatever". > The console logs are important for after-the-fact human diagnosis > of why a test might have failed, so we should always collect them. > > thanks > -- PMM > Understood. In that case, fixing the test would involve engaging the avocado suite's draining utility to ensure that the log is being consumed and logged. 
I think there's a potential here to simplify all of the draining-and-logging code we have split across the avocado test suite, console_socket.py and machine.py, but I can't promise that the rewrite I've been working on will be ready quickly, so if this is still busted (I'm still catching back up with my mail post-holidays) then we want a quicker fix if we haven't committed one yet. --js
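The failure mode discussed in this thread, a writer blocking because nobody reads the other end of a socketpair, and the draining remedy can be reproduced in isolation. The sketch below is plain Python using only the standard library; it is not the code from console_socket.py or machine.py, and the names drain, guest_end, and host_end are invented for the example.

```python
import socket
import threading


def drain(sock, sink):
    """Read from sock until EOF, appending every chunk to the sink list."""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        sink.append(chunk)


# One end stands in for the guest's serial port, the other for the test harness.
guest_end, host_end = socket.socketpair()

log = []
drainer = threading.Thread(target=drain, args=(host_end, log), daemon=True)
drainer.start()

# Without the drainer thread, this sendall() would stall once the socket
# buffers fill up, which is the hang described above.
payload = b"x" * (1 << 20)
guest_end.sendall(payload)
guest_end.close()              # EOF lets the drainer finish

drainer.join(timeout=10)
host_end.close()
captured = b"".join(log)
print("captured", len(captured), "bytes")
```

Because the drainer keeps everything it reads, the console output is still available for the test log afterwards, which matches the point made above about after-the-fact diagnosis.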
Re: [PATCH v6 4/4] scripts: add script to compare compatible properties
On Mon, Dec 18, 2023 at 8:20 AM Markus Armbruster wrote: > > Maksim Davydov writes: > > > On 12/1/23 12:51, Markus Armbruster wrote: > >> Review, anyone? > > > > Only Vladimir > > To be clear: I'm soliciting a second review. > > [...] > I volunteer to review it from the Python maintenance POV, but please rebase and resend to fix the patchew desync. We still want review from a more holistic perspective, though ... but if it's not part of the build or test infrastructure, it doesn't have to be perfect. --js
[PATCH 2/3] tests/tcg: Factor out gdbstub test functions
Both the report() function as well as the initial gdbstub test sequence are copy-pasted into ~10 files with slight modifications. This indicates that they are indeed generic, so factor them out. While at it, add a few newlines to make the formatting closer to PEP-8. Signed-off-by: Ilya Leoshkevich --- tests/guest-debug/run-test.py | 7 ++- tests/guest-debug/test_gdbstub.py | 56 +++ tests/tcg/aarch64/gdbstub/test-sve-ioctl.py | 34 +-- tests/tcg/aarch64/gdbstub/test-sve.py | 33 +-- tests/tcg/multiarch/gdbstub/interrupt.py | 47 ++-- tests/tcg/multiarch/gdbstub/memory.py | 41 +- tests/tcg/multiarch/gdbstub/registers.py | 41 ++ tests/tcg/multiarch/gdbstub/sha1.py | 40 ++--- .../multiarch/gdbstub/test-proc-mappings.py | 39 + .../multiarch/gdbstub/test-qxfer-auxv-read.py | 37 +--- .../gdbstub/test-thread-breakpoint.py | 37 +--- tests/tcg/s390x/gdbstub/test-signals-s390x.py | 42 +- tests/tcg/s390x/gdbstub/test-svc.py | 39 + 13 files changed, 96 insertions(+), 397 deletions(-) create mode 100644 tests/guest-debug/test_gdbstub.py diff --git a/tests/guest-debug/run-test.py b/tests/guest-debug/run-test.py index b13b27d4b19..368ff8a8903 100755 --- a/tests/guest-debug/run-test.py +++ b/tests/guest-debug/run-test.py @@ -97,7 +97,12 @@ def log(output, msg): sleep(1) log(output, "GDB CMD: %s" % (gdb_cmd)) -result = subprocess.call(gdb_cmd, shell=True, stdout=output, stderr=stderr) +gdb_env = dict(os.environ) +gdb_pythonpath = gdb_env.get("PYTHONPATH", "").split(os.pathsep) +gdb_pythonpath.append(os.path.dirname(os.path.realpath(__file__))) +gdb_env["PYTHONPATH"] = os.pathsep.join(gdb_pythonpath) +result = subprocess.call(gdb_cmd, shell=True, stdout=output, stderr=stderr, + env=gdb_env) # A result of greater than 128 indicates a fatal signal (likely a # crash due to gdb internal failure). 
That's a problem for GDB and diff --git a/tests/guest-debug/test_gdbstub.py b/tests/guest-debug/test_gdbstub.py new file mode 100644 index 000..1bc4ed131f4 --- /dev/null +++ b/tests/guest-debug/test_gdbstub.py @@ -0,0 +1,56 @@ +"""Helper functions for gdbstub testing + +""" +from __future__ import print_function +import gdb +import sys + +fail_count = 0 + + +def report(cond, msg): +"""Report success/fail of a test""" +if cond: +print("PASS: {}".format(msg)) +else: +print("FAIL: {}".format(msg)) +global fail_count +fail_count += 1 + + +def main(test, expected_arch=None): +"""Run a test function + +This runs as the script it sourced (via -x, via run-test.py).""" +try: +inferior = gdb.selected_inferior() +arch = inferior.architecture() +print("ATTACHED: {}".format(arch)) +if expected_arch is not None: +report(arch.name() == expected_arch, + "connected to {}".format(expected_arch)) +except (gdb.error, AttributeError): +print("SKIP: not connected") +exit(0) + +if gdb.parse_and_eval("$pc") == 0: +print("SKIP: PC not set") +exit(0) + +try: +test() +except: +print("GDB Exception: {}".format(sys.exc_info()[0])) +global fail_count +fail_count += 1 +import code +code.InteractiveConsole(locals=globals()).interact() +raise + +try: +gdb.execute("kill") +except gdb.error: +pass + +print("All tests complete: {} failures".format(fail_count)) +exit(fail_count) diff --git a/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py b/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py index ee8d467e59d..a78a3a2514d 100644 --- a/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py +++ b/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py @@ -8,19 +8,10 @@ # import gdb -import sys +from test_gdbstub import main, report initial_vlen = 0 -failcount = 0 -def report(cond, msg): -"Report success/fail of test" -if cond: -print ("PASS: %s" % (msg)) -else: -print ("FAIL: %s" % (msg)) -global failcount -failcount += 1 class TestBreakpoint(gdb.Breakpoint): def __init__(self, sym_name="__sve_ld_done"): @@ -64,26 +55,5 @@ def 
run_test(): gdb.execute("c") -# -# This runs as the script it sourced (via -x, via run-test.py) -# -try: -inferior = gdb.selected_inferior() -arch = inferior.architecture() -report(arch.name() == "aarch64", "connected to aarch64") -except (gdb.error, AttributeError): -print("SKIPPING (not connected)", file=sys.stderr) -exit(0) - -try: -# Run the actual tests -run_test() -except: -print ("GDB Exception: %s" % (sys.exc_info()[0])) -failcount += 1 -import code -code.InteractiveConsole(locals=globals()).interact() -
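The run-test.py hunk in this patch extends PYTHONPATH so that the Python interpreter embedded in GDB can import the shared test_gdbstub module. A minimal standalone sketch of that idea follows. The helper name with_extra_pythonpath and the example paths are hypothetical, and filtering out empty entries is an extra precaution the patch itself does not take: splitting an unset PYTHONPATH yields one empty string, which would otherwise leave a leading separator in the joined value.

```python
import os


def with_extra_pythonpath(env, extra_dir):
    """Return a copy of env whose PYTHONPATH also contains extra_dir."""
    new_env = dict(env)
    # "".split(os.pathsep) returns [""], so drop empty entries to avoid
    # a leading separator in the joined value (assumption: not in the patch).
    entries = [p for p in new_env.get("PYTHONPATH", "").split(os.pathsep) if p]
    entries.append(extra_dir)
    new_env["PYTHONPATH"] = os.pathsep.join(entries)
    return new_env


# Hypothetical paths, for illustration only.
gdb_env = with_extra_pythonpath({"PYTHONPATH": "/opt/lib"},
                                "/src/tests/guest-debug")
print(gdb_env["PYTHONPATH"])
```

The resulting dictionary would then be passed as env= to subprocess.call(), as the patch does.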
[PATCH 1/3] linux-user: Allow gdbstub to ignore page protection
gdbserver ignores page protection by virtue of using /proc/$pid/mem. Teach qemu gdbstub to do this too. This will not work if /proc is not mounted; accept this limitation. One alternative is to temporarily grant the missing PROT_* bit, but this is inherently racy. Another alternative is self-debugging with ptrace(POKE), which will break if QEMU itself is being debugged - a much more severe limitation. Signed-off-by: Ilya Leoshkevich --- cpu-target.c | 55 ++-- 1 file changed, 40 insertions(+), 15 deletions(-) diff --git a/cpu-target.c b/cpu-target.c index 5eecd7ea2d7..69e97f78980 100644 --- a/cpu-target.c +++ b/cpu-target.c @@ -406,6 +406,15 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr, vaddr l, page; void * p; uint8_t *buf = ptr; +int ret = -1; +int mem_fd; + +/* + * Try ptrace first. If /proc is not mounted or if there is a different + * problem, fall back to the manual page access. Note that, unlike ptrace, + * it will not be able to ignore the protection bits. + */ +mem_fd = open("/proc/self/mem", is_write ? 
O_WRONLY : O_RDONLY); while (len > 0) { page = addr & TARGET_PAGE_MASK; @@ -413,22 +422,33 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr, if (l > len) l = len; flags = page_get_flags(page); -if (!(flags & PAGE_VALID)) -return -1; +if (!(flags & PAGE_VALID)) { +goto out_close; +} if (is_write) { -if (!(flags & PAGE_WRITE)) -return -1; +if (mem_fd == -1 || +pwrite(mem_fd, ptr, len, (off_t)g2h_untagged(addr)) != len) { +if (!(flags & PAGE_WRITE)) { +goto out_close; +} +/* XXX: this code should not depend on lock_user */ +p = lock_user(VERIFY_WRITE, addr, l, 0); +if (!p) { +goto out_close; +} +memcpy(p, buf, l); +unlock_user(p, addr, l); +} +} else if (mem_fd == -1 || + pread(mem_fd, ptr, len, (off_t)g2h_untagged(addr)) != len) { +if (!(flags & PAGE_READ)) { +goto out_close; +} /* XXX: this code should not depend on lock_user */ -if (!(p = lock_user(VERIFY_WRITE, addr, l, 0))) -return -1; -memcpy(p, buf, l); -unlock_user(p, addr, l); -} else { -if (!(flags & PAGE_READ)) -return -1; -/* XXX: this code should not depend on lock_user */ -if (!(p = lock_user(VERIFY_READ, addr, l, 1))) -return -1; +p = lock_user(VERIFY_READ, addr, l, 1); +if (!p) { +goto out_close; +} memcpy(buf, p, l); unlock_user(p, addr, 0); } @@ -436,7 +456,12 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr, buf += l; addr += l; } -return 0; +ret = 0; +out_close: +if (mem_fd != -1) { +close(mem_fd); +} +return ret; } #endif -- 2.43.0
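The approach in this patch (try /proc/self/mem first, fall back to direct access when that fails) can be sketched outside of QEMU. The snippet below is only an illustration under stated assumptions: it uses Python and ctypes in place of the C code, the helper name read_own_memory is hypothetical, and it reads an ordinary readable buffer, so it does not involve guest addresses, page flags or PROT_NONE pages.

```python
import ctypes
import os


def read_own_memory(addr, length):
    """Read length bytes at addr in this process.

    Prefer /proc/self/mem; fall back to a direct read if procfs is
    missing or the pread fails, mirroring the fallback in the patch.
    """
    try:
        fd = os.open("/proc/self/mem", os.O_RDONLY)
    except OSError:
        fd = -1
    try:
        if fd != -1:
            try:
                data = os.pread(fd, length, addr)
                if len(data) == length:
                    return data, "procfs"
            except OSError:
                pass  # fall through to the direct read
        return ctypes.string_at(addr, length), "direct"
    finally:
        if fd != -1:
            os.close(fd)


buf = ctypes.create_string_buffer(b"gdbstub", 8)
data, how = read_own_memory(ctypes.addressof(buf), 7)
print(data, how)
```

Opening the file once and closing it on every exit path corresponds to the out_close label in the C code.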
[PATCH 0/3] linux-user: Allow gdbstub to ignore page protection
RFC: https://lists.gnu.org/archive/html/qemu-devel/2023-12/msg02044.html

RFC -> v1: Use /proc/self/mem and accept that this will not work
           without /proc.
           Factor out a couple of functions for gdbstub testing.
           Add a test.

Hi,

I've noticed that gdbstub behaves differently from gdbserver in that it
doesn't allow reading non-readable pages. This series improves the
situation by using the same mechanism as gdbserver: /proc/self/mem. If
/proc is not mounted, we fall back to today's implementation.

Best regards,
Ilya

Ilya Leoshkevich (3):
  linux-user: Allow gdbstub to ignore page protection
  tests/tcg: Factor out gdbstub test functions
  tests/tcg: Add the PROT_NONE gdbstub test

 cpu-target.c                                  | 55 +-
 tests/guest-debug/run-test.py                 |  7 ++-
 tests/guest-debug/test_gdbstub.py             | 56 +++
 tests/tcg/aarch64/gdbstub/test-sve-ioctl.py   | 34 +--
 tests/tcg/aarch64/gdbstub/test-sve.py         | 33 +--
 tests/tcg/multiarch/Makefile.target           |  9 ++-
 tests/tcg/multiarch/gdbstub/interrupt.py      | 47 ++--
 tests/tcg/multiarch/gdbstub/memory.py         | 41 +-
 tests/tcg/multiarch/gdbstub/prot-none.py      | 22
 tests/tcg/multiarch/gdbstub/registers.py      | 41 ++
 tests/tcg/multiarch/gdbstub/sha1.py           | 40 ++---
 .../multiarch/gdbstub/test-proc-mappings.py   | 39 +
 .../multiarch/gdbstub/test-qxfer-auxv-read.py | 37 +---
 .../gdbstub/test-thread-breakpoint.py         | 37 +---
 tests/tcg/multiarch/prot-none.c               | 38 +
 tests/tcg/s390x/gdbstub/test-signals-s390x.py | 42 +-
 tests/tcg/s390x/gdbstub/test-svc.py           | 39 +
 17 files changed, 204 insertions(+), 413 deletions(-)
 create mode 100644 tests/guest-debug/test_gdbstub.py
 create mode 100644 tests/tcg/multiarch/gdbstub/prot-none.py
 create mode 100644 tests/tcg/multiarch/prot-none.c

-- 
2.43.0
[PATCH 3/3] tests/tcg: Add the PROT_NONE gdbstub test
Make sure that qemu gdbstub, like gdbserver, allows reading from and writing to PROT_NONE pages. Signed-off-by: Ilya Leoshkevich --- tests/tcg/multiarch/Makefile.target | 9 +- tests/tcg/multiarch/gdbstub/prot-none.py | 22 ++ tests/tcg/multiarch/prot-none.c | 38 3 files changed, 68 insertions(+), 1 deletion(-) create mode 100644 tests/tcg/multiarch/gdbstub/prot-none.py create mode 100644 tests/tcg/multiarch/prot-none.c diff --git a/tests/tcg/multiarch/Makefile.target b/tests/tcg/multiarch/Makefile.target index d31ba8d6ae4..315a2e13588 100644 --- a/tests/tcg/multiarch/Makefile.target +++ b/tests/tcg/multiarch/Makefile.target @@ -101,13 +101,20 @@ run-gdbstub-registers: sha512 --bin $< --test $(MULTIARCH_SRC)/gdbstub/registers.py, \ checking register enumeration) +run-gdbstub-prot-none: prot-none + $(call run-test, $@, env PROT_NONE_PY=1 $(GDB_SCRIPT) \ + --gdb $(GDB) \ + --qemu $(QEMU) --qargs "$(QEMU_OPTS)" \ + --bin $< --test $(MULTIARCH_SRC)/gdbstub/prot-none.py, \ + accessing PROT_NONE memory) + else run-gdbstub-%: $(call skip-test, "gdbstub test $*", "need working gdb with $(patsubst -%,,$(TARGET_NAME)) support") endif EXTRA_RUNS += run-gdbstub-sha1 run-gdbstub-qxfer-auxv-read \ run-gdbstub-proc-mappings run-gdbstub-thread-breakpoint \ - run-gdbstub-registers + run-gdbstub-registers run-gdbstub-prot-none # ARM Compatible Semi Hosting Tests # diff --git a/tests/tcg/multiarch/gdbstub/prot-none.py b/tests/tcg/multiarch/gdbstub/prot-none.py new file mode 100644 index 000..751e44d5b93 --- /dev/null +++ b/tests/tcg/multiarch/gdbstub/prot-none.py @@ -0,0 +1,22 @@ +"""Test that GDB can access PROT_NONE pages. + +This runs as a sourced script (via -x, via run-test.py). 
+ +SPDX-License-Identifier: GPL-2.0-or-later +""" +from test_gdbstub import main, report + + +def run_test(): +"""Run through the tests one by one""" +gdb.Breakpoint("break_here") +gdb.execute("continue") +val = int(gdb.parse_and_eval("*p")) +report(val == 42, "{} != 42".format(val)) +gdb.execute("set *p = 24") +gdb.execute("continue") +exitcode = int(gdb.parse_and_eval("$_exitcode")) +report(exitcode == 0, "{} != 0".format(exitcode)) + + +main(run_test) diff --git a/tests/tcg/multiarch/prot-none.c b/tests/tcg/multiarch/prot-none.c new file mode 100644 index 000..66e38065cf0 --- /dev/null +++ b/tests/tcg/multiarch/prot-none.c @@ -0,0 +1,38 @@ +/* + * Test that GDB can access PROT_NONE pages. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ +#include +#include +#include +#include + +void break_here(long *p) +{ +} + +int main(void) +{ +long pagesize = sysconf(_SC_PAGESIZE); +int err; +long *p; + +p = mmap(NULL, pagesize, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); +assert(p != MAP_FAILED); +*p = 42; + +err = mprotect(p, pagesize, PROT_NONE); +assert(err == 0); + +break_here(p); + +err = mprotect(p, pagesize, PROT_READ); +assert(err == 0); +if (getenv("PROT_NONE_PY")) { +assert(*p == 24); +} + +return EXIT_SUCCESS; +} -- 2.43.0
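The test script in this patch reports its results through the shared report() helper. A gdb-free sketch of that pattern, runnable with plain Python, is shown below. The Reporter class is hypothetical (the series keeps a module-level fail_count instead), and the hard-coded values stand in for what gdb.parse_and_eval() would return. The summary line uses a {} placeholder, because str.format() leaves a %d in the string untouched.

```python
class Reporter:
    """Collect PASS/FAIL results, similar in spirit to test_gdbstub.report()."""

    def __init__(self):
        self.fail_count = 0

    def report(self, cond, msg):
        """Print PASS or FAIL for one check and count the failures."""
        if cond:
            print("PASS: {}".format(msg))
        else:
            print("FAIL: {}".format(msg))
            self.fail_count += 1

    def summary(self):
        # "{}" is the str.format() placeholder; "%d" would be printed as-is.
        return "All tests complete: {} failures".format(self.fail_count)


reporter = Reporter()
val = 42          # stand-in for int(gdb.parse_and_eval("*p"))
reporter.report(val == 42, "{} == 42".format(val))
exitcode = 1      # stand-in for a failing $_exitcode
reporter.report(exitcode == 0, "{} == 0".format(exitcode))
summary = reporter.summary()
print(summary)
```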
Re: [PATCH] hw/block/fdc: do not set SEEK status bit in multi track commands
On Mon, Jan 1, 2024 at 4:45 PM Hervé Poussineau wrote:
>
> Ping.
>
> On 12/08/2023 at 10:59, Hervé Poussineau wrote:
> > I don't understand when SEEK must be set or not, but it seems to fix
> > Minix...
> >
> > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1522
> > Signed-off-by: Hervé Poussineau
> > ---
> >   hw/block/fdc.c | 1 -
> >   1 file changed, 1 deletion(-)
> >
> > diff --git a/hw/block/fdc.c b/hw/block/fdc.c
> > index d7cc4d3ec19..f627bbaf951 100644
> > --- a/hw/block/fdc.c
> > +++ b/hw/block/fdc.c
> > @@ -1404,7 +1404,6 @@ static int fdctrl_seek_to_next_sect(FDCtrl *fdctrl, FDrive *cur_drv)
> >           } else {
> >               new_head = 0;
> >               new_track++;
> > -            fdctrl->status0 |= FD_SR0_SEEK;
> >               if ((cur_drv->flags & FDISK_DBL_SIDES) == 0) {
> >                   ret = 0;
> >               }
>

I'll be honest, I don't have the time to audit this and I don't have
the test suite necessary to prove that it's safe enough. Do you have
any suggestions for how we can prove or test this beyond 'works for
me'?

I could read the spec sheet for this controller until I'm blue in the
face, but it doesn't seem to necessarily correlate to how the
controller behaves IRL or with what real operating systems actually do
with that controller. I also don't have access to a physical controller
anymore to even begin to try and write my own hardware probe for it.

We need a robust test suite for FDC behavior, but it seems unlikely
that anyone will want to actually write one (I sure don't).

Are there any good shortcuts to victory here?
Re: [PATCH 3/6] linux-user: Add code for PR_GET/SET_UNALIGN
On 8/1/24 22:13, Richard Henderson wrote: On 1/9/24 04:15, Philippe Mathieu-Daudé wrote: +/* + * This can't go in hw/core/cpu.c because that file is compiled only + * once for both user-mode and system builds. + */ static Property cpu_common_props[] = { -#ifndef CONFIG_USER_ONLY +#ifdef CONFIG_USER_ONLY /* - * Create a memory property for softmmu CPU object, - * so users can wire up its memory. (This can't go in hw/core/cpu.c - * because that file is compiled only once for both user-mode - * and system builds.) The default if no link is set up is to use + * Create a property for the user-only object, so users can + * adjust prctl(PR_SET_UNALIGN) from the command-line. How can I test this per-thread property? -cpu foo,prctl-unalign-sigbus=true Shouldn't this be an accel (TCG/user) property, for all threads? There is always one cpu at user-only startup, and it is copied on clone. Logically it would be a kernel property, since it's something the kernel does, not the cpu. But cpu vs accel makes no difference to me; it was just easy here. Can a process started with prctl(PR_SET_UNALIGN) unset it before forking? "kernel property" as "accel property" works for me. IIRC, this is simply a proxy for not really being able to inherit this bit across fork+exec like you can with the real kernel. r~
Re: [PATCH v10 08/10] hw/net: GMAC Rx Implementation
On 8/1/24 23:27, Nabih Estefan wrote: From: Nabih Estefan Diaz - Implementation of Receive function for packets - Implementation for reading and writing from and to descriptors in memory for Rx When RX starts, we need to flush the queued packets so that they can be received by the GMAC device. Without this it won't work with TAP NIC device. When RX descriptor list is full, it returns a DMA_STATUS for software to handle it. But there's no way to indicate the software has handled all RX descriptors and the whole pipeline stalls. We do something similar to NPCM7XX EMC to handle this case. 1. Return packet size when RX descriptor is full, effectively dropping these packets in such a case. 2. When software clears RX descriptor full bit, continue receiving further packets by flushing QEMU packet queue. Added relevant trace-events Change-Id: I132aa254a94cda1a586aba2ea33bbfc74ecdb831 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- hw/net/npcm_gmac.c | 324 +++- hw/net/trace-events | 5 + 2 files changed, 327 insertions(+), 2 deletions(-) diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c index 44c4ffaff4..54c8af3b41 100644 --- a/hw/net/npcm_gmac.c +++ b/hw/net/npcm_gmac.c @@ -23,7 +23,11 @@ #include "hw/registerfields.h" #include "hw/net/mii.h" #include "hw/net/npcm_gmac.h" +#include "linux/if_ether.h" Still doesn't build on macOS: [1215/1649] Compiling C object libcommon.fa.p/hw_net_npcm_gmac.c.o ../../hw/net/npcm_gmac.c:26:10: fatal error: 'linux/if_ether.h' file not found #include "linux/if_ether.h" ^~ 1 error generated. FAILED: libcommon.fa.p/hw_net_npcm_gmac.c.o
Re: [PATCH 1/2] target/sh4: Deprecate the shix machine
Hi Samuel, On 8/1/24 18:15, Samuel Tardieu wrote: The shix machine has been designed and used at Télécom Paris from 2003 to 2010. It had been added to QEMU in 2005 and has not been maintained since. Since nobody is using the physical board anymore nor interested in maintaining the QEMU port, it is time to deprecate it. Signed-off-by: Samuel Tardieu --- docs/about/deprecated.rst | 5 + hw/sh4/shix.c | 1 + 2 files changed, 6 insertions(+) diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst index 2e15040246..e6a12c9077 100644 --- a/docs/about/deprecated.rst +++ b/docs/about/deprecated.rst @@ -269,6 +269,11 @@ Nios II ``10m50-ghrd`` and ``nios2-generic-nommu`` machines (since 8.2) The Nios II architecture is orphan. +``shix`` (since 9.0) + + +The machine is no longer in existence and has been long unmaintained +in QEMU. Backend options --- diff --git a/hw/sh4/shix.c b/hw/sh4/shix.c index aa812512f0..58530b8ede 100644 --- a/hw/sh4/shix.c +++ b/hw/sh4/shix.c @@ -80,6 +80,7 @@ static void shix_machine_init(MachineClass *mc) mc->init = shix_init; mc->is_default = true; mc->default_cpu_type = TYPE_SH7750R_CPU; +mc->deprecation_reason = "old and unmaintained - use a newer machine instead"; "use a newer machine instead" bugs me, what would that be? Could we stick to "old and unmaintained"? } DEFINE_MACHINE("shix", shix_machine_init)
Re: [PATCH v2 1/2] nubus-device: round Declaration ROM memory region address to qemu_target_page_size()
On 8/1/24 20:20, Mark Cave-Ayland wrote: Declaration ROM binary images can be any arbitrary size, however if a host ROM memory region is not aligned to qemu_target_page_size() then we fail the "assert(!(iotlb & ~TARGET_PAGE_MASK))" check in tlb_set_page_full(). Ensure that the host ROM memory region is aligned to qemu_target_page_size() and adjust the offset at which the Declaration ROM image is loaded, since Nubus ROM images are unusual in that they are aligned to the end of the slot address space. Signed-off-by: Mark Cave-Ayland --- hw/nubus/nubus-device.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/hw/nubus/nubus-device.c b/hw/nubus/nubus-device.c index 49008e4938..e4f824d58b 100644 --- a/hw/nubus/nubus-device.c +++ b/hw/nubus/nubus-device.c @@ -10,6 +10,7 @@ #include "qemu/osdep.h" #include "qemu/datadir.h" +#include "exec/target_page.h" #include "hw/irq.h" #include "hw/loader.h" #include "hw/nubus/nubus.h" @@ -30,7 +31,7 @@ static void nubus_device_realize(DeviceState *dev, Error **errp) NubusDevice *nd = NUBUS_DEVICE(dev); char *name, *path; hwaddr slot_offset; -int64_t size; +int64_t size, align_size; Both are 'size_t'. int ret; /* Super */ @@ -76,16 +77,23 @@ static void nubus_device_realize(DeviceState *dev, Error **errp) } name = g_strdup_printf("nubus-slot-%x-declaration-rom", nd->slot); -memory_region_init_rom(>decl_rom, OBJECT(dev), name, size, + +/* + * Ensure ROM memory region is aligned to target page size regardless + * of the size of the Declaration ROM image + */ +align_size = ROUND_UP(size, qemu_target_page_size()); +memory_region_init_rom(>decl_rom, OBJECT(dev), name, align_size, _abort); -ret = load_image_mr(path, >decl_rom); +ret = load_image_size(path, memory_region_get_ram_ptr(>decl_rom) + +(uintptr_t)align_size - size, size); memory_region_get_ram_ptr() returns a 'void *' so this looks dubious. Maybe use a local variable to ease offset calculation? 
  char *rombase = memory_region_get_ram_ptr(&nd->decl_rom);
  ret = load_image_size(path, rombase + align_size - size, size);

Otherwise KISS but ugly:

  ret = load_image_size(path,
      (void *)((uintptr_t)memory_region_get_ram_ptr(&nd->decl_rom)
               + align_size - size), size);

         g_free(path);
         g_free(name);
         if (ret < 0) {
             error_setg(errp, "could not load romfile \"%s\"", nd->romfile);
             return;
         }
-        memory_region_add_subregion(&nd->slot_mem, NUBUS_SLOT_SIZE - size,
+        memory_region_add_subregion(&nd->slot_mem, NUBUS_SLOT_SIZE - align_size,
                                     &nd->decl_rom);
     }
 }
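The offset arithmetic discussed in this patch can be checked with a small worked example. The sketch below is plain Python rather than QEMU code. The function names are hypothetical, and the slot size, ROM size and page size are assumed values chosen for illustration. It shows that rounding the region up to the page size and loading the image at align_size - size keeps the image flush with the end of the slot.

```python
NUBUS_SLOT_SIZE = 0x01000000  # assumed value, for illustration only


def round_up(n, d):
    """Round n up to the next multiple of d."""
    return (n + d - 1) // d * d


def place_declaration_rom(size, page_size, slot_size=NUBUS_SLOT_SIZE):
    """Return (align_size, load_offset, region_offset) for a ROM image."""
    align_size = round_up(size, page_size)   # size of the ROM memory region
    load_offset = align_size - size          # where the image starts in it
    region_offset = slot_size - align_size   # where the region sits in the slot
    return align_size, load_offset, region_offset


# Hypothetical 0x1234-byte image with a 4 KiB target page size.
size = 0x1234
align_size, load_offset, region_offset = place_declaration_rom(size, 0x1000)
print(hex(align_size), hex(load_offset), hex(region_offset))

# The last byte of the image still ends exactly at the end of the slot.
image_end = region_offset + load_offset + size
print(hex(image_end))
```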
Re: [PATCH v6 1/3] hw/misc: Implement STM32L4x5 EXTI
On 8/1/24 19:03, Inès Varhol wrote: Although very similar to the STM32F4xx EXTI, STM32L4x5 EXTI generates more than 32 event/interrupt requests and thus uses more registers than STM32F4xx EXTI which generates 23 event/interrupt requests. Acked-by: Alistair Francis Signed-off-by: Arnaud Minier Signed-off-by: Inès Varhol --- Should the for loop variables be `unsigned` rather than `int` ? It depends on the iterated range. Here you iterate over ARRAY_SIZE(Stm32l4x5ExtiState::irq) which is a size_t type, which is unsigned. Amusingly we use both similarly: $ git grep 'for (size_t' | wc -l 56 $ git grep 'for (unsigned' | wc -l 59 docs/system/arm/b-l475e-iot01a.rst | 5 +- hw/misc/Kconfig| 3 + hw/misc/meson.build| 1 + hw/misc/stm32l4x5_exti.c | 292 + hw/misc/trace-events | 5 + include/hw/misc/stm32l4x5_exti.h | 51 + 6 files changed, 354 insertions(+), 3 deletions(-) create mode 100644 hw/misc/stm32l4x5_exti.c create mode 100644 include/hw/misc/stm32l4x5_exti.h diff --git a/docs/system/arm/b-l475e-iot01a.rst b/docs/system/arm/b-l475e-iot01a.rst index 2b128e6b84..72f256ace7 100644 --- a/docs/system/arm/b-l475e-iot01a.rst +++ b/docs/system/arm/b-l475e-iot01a.rst @@ -12,17 +12,16 @@ USART, I2C, SPI, CAN and USB OTG, as well as a variety of sensors. 
Supported devices " -Currently, B-L475E-IOT01A machine's implementation is minimal, -it only supports the following device: +Currently B-L475E-IOT01A machine's only supports the following devices: - Cortex-M4F based STM32L4x5 SoC +- STM32L4x5 EXTI (Extended interrupts and events controller) Missing devices """ The B-L475E-IOT01A does *not* support the following devices: -- Extended interrupts and events controller (EXTI) - Reset and clock control (RCC) - Serial ports (UART) - System configuration controller (SYSCFG) diff --git a/hw/misc/Kconfig b/hw/misc/Kconfig index cc8a8c1418..3efe3dc2cc 100644 --- a/hw/misc/Kconfig +++ b/hw/misc/Kconfig @@ -87,6 +87,9 @@ config STM32F4XX_SYSCFG config STM32F4XX_EXTI bool +config STM32L4X5_EXTI +bool + config MIPS_ITU bool diff --git a/hw/misc/meson.build b/hw/misc/meson.build index 36c20d5637..16db6e228d 100644 --- a/hw/misc/meson.build +++ b/hw/misc/meson.build @@ -110,6 +110,7 @@ system_ss.add(when: 'CONFIG_XLNX_VERSAL_TRNG', if_true: files( system_ss.add(when: 'CONFIG_STM32F2XX_SYSCFG', if_true: files('stm32f2xx_syscfg.c')) system_ss.add(when: 'CONFIG_STM32F4XX_SYSCFG', if_true: files('stm32f4xx_syscfg.c')) system_ss.add(when: 'CONFIG_STM32F4XX_EXTI', if_true: files('stm32f4xx_exti.c')) +system_ss.add(when: 'CONFIG_STM32L4X5_EXTI', if_true: files('stm32l4x5_exti.c')) system_ss.add(when: 'CONFIG_MPS2_FPGAIO', if_true: files('mps2-fpgaio.c')) system_ss.add(when: 'CONFIG_MPS2_SCC', if_true: files('mps2-scc.c')) diff --git a/hw/misc/stm32l4x5_exti.c b/hw/misc/stm32l4x5_exti.c new file mode 100644 index 00..aedf1fb370 --- /dev/null +++ b/hw/misc/stm32l4x5_exti.c @@ -0,0 +1,292 @@ +/* + * STM32L4x5 EXTI (Extended interrupts and events controller) + * + * Copyright (c) 2023 Arnaud Minier + * Copyright (c) 2023 Samuel Tardieu + * Copyright (c) 2023 Inès Varhol + * + * SPDX-License-Identifier: GPL-2.0-or-later + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. 
+ * See the COPYING file in the top-level directory. + * + * This work is based on the stm32f4xx_exti by Alistair Francis. + * Original code is licensed under the MIT License: + * + * Copyright (c) 2014 Alistair Francis + */ + +/* + * The reference used is the STMicroElectronics RM0351 Reference manual + * for STM32L4x5 and STM32L4x6 advanced Arm ® -based 32-bit MCUs. + * https://www.st.com/en/microcontrollers-microprocessors/stm32l4x5/documentation.html + */ + +#include "qemu/osdep.h" +#include "qemu/log.h" +#include "trace.h" +#include "hw/irq.h" +#include "migration/vmstate.h" +#include "hw/misc/stm32l4x5_exti.h" + +#define EXTI_IMR1 0x00 +#define EXTI_EMR1 0x04 +#define EXTI_RTSR1 0x08 +#define EXTI_FTSR1 0x0C +#define EXTI_SWIER1 0x10 +#define EXTI_PR10x14 +#define EXTI_IMR2 0x20 +#define EXTI_EMR2 0x24 +#define EXTI_RTSR2 0x28 +#define EXTI_FTSR2 0x2C +#define EXTI_SWIER2 0x30 +#define EXTI_PR20x34 + +#define EXTI_NUM_GPIO_EVENT_IN_LINES 16 #define EXTI_MAX_IRQ_PER_BANK 32 + +/* 0b_1010__ */ +#define DIRECT_LINE_MASK1 0xFF82 +/* 0b___1111 */ +#define DIRECT_LINE_MASK2 0x0087 +/* 0b___ */ +#define RESERVED_BITS_MASK2 0xFF00 + +/* 0b___0000 */ +#define ACTIVABLE_MASK2 (~DIRECT_LINE_MASK2 & ~RESERVED_BITS_MASK2) You might want to declare: #define EXTI_IRQS_BANK0 32 #define EXTI_IRQS_BANK1 8 static const unsigned
Re: [PATCH v2 00/35] tcg: Introduce TCG_COND_TST{EQ,NE}
On Mon, 8 Jan 2024 at 22:45, Richard Henderson wrote:

> > I was thinking: a lot of RISC targets simply do AND/ANDI
> > followed by the sequence used for TCG_COND_NE. Would it make sense to
> > have a TCG_TARGET_SUPPORTS_TST bit and, if absent, lower TSTEQ/TSTNE
> > to AND+EQ/NE directly in the optimizer?
>
> Probably best, yes.
>

Ok, I will give it a shot.

> And for brcond2/setcond2,
> > always using AND/AND/OR may work just as well as any backend-specific
> > trick, and will give more freedom to the register allocator.
>
>    test a,b
>    testeq c,e
>
> for Arm32. So I'll leave it to the backends.
>

Nice. :)

Paolo

>
> r~
>
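The lowering proposed in this exchange can be illustrated outside of TCG. The sketch below assumes the semantics TSTEQ = ((a & b) == 0) and TSTNE = ((a & b) != 0). The tuple-based IR, the temporary name tmp0 and the function lower_tst are hypothetical and are not taken from the QEMU optimizer.

```python
def cond_holds(cond, a, b):
    """Evaluate a comparison condition on two integers."""
    return {
        "eq": a == b,
        "ne": a != b,
        "tsteq": (a & b) == 0,
        "tstne": (a & b) != 0,
    }[cond]


def lower_tst(op):
    """Rewrite (name, cond, a, b, label) with a TST condition into AND + EQ/NE.

    Operations with any other condition are returned unchanged.
    """
    name, cond, a, b, label = op
    if cond not in ("tsteq", "tstne"):
        return [op]
    new_cond = "eq" if cond == "tsteq" else "ne"
    return [("and", "tmp0", a, b), (name, new_cond, "tmp0", 0, label)]


lowered = lower_tst(("brcond", "tstne", "x", "mask", "L1"))
print(lowered)

# Spot-check that the lowered form agrees with the TST condition.
for x, mask in [(0b1010, 0b0101), (0b1010, 0b0010), (0, 0xff)]:
    tmp0 = x & mask
    assert cond_holds("tstne", x, mask) == cond_holds("ne", tmp0, 0)
    assert cond_holds("tsteq", x, mask) == cond_holds("eq", tmp0, 0)
```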
Re: [PATCH v8 04/10] hw/fsi: IBM's On-chip Peripheral Bus
Hello Cedric,

On 12/12/23 08:48, Cédric Le Goater wrote:
> On 11/29/23 00:56, Ninad Palsule wrote:
> > This is a part of patchset where IBM's Flexible Service Interface is
> > introduced.
> >
> > The On-Chip Peripheral Bus (OPB): A low-speed bus typically found in
> > POWER processors. This now makes an appearance in the ASPEED SoC due
> > to tight integration of the FSI master IP with the OPB, mainly the
> > existence of an MMIO-mapping of the CFAM address straight onto a
> > sub-region of the OPB address space.
> >
> > Signed-off-by: Andrew Jeffery
> > Signed-off-by: Ninad Palsule
> > Reviewed-by: Joel Stanley
> > [ clg: - removed FSIMasterState object and fsi_opb_realize()
> >        - simplified OPBus ]
> > Signed-off-by: Cédric Le Goater
> > ---
> >  include/hw/fsi/opb.h | 25 +
> >  hw/fsi/opb.c         | 36
> >  hw/fsi/Kconfig       |  4
> >  hw/fsi/meson.build   |  1 +
> >  4 files changed, 66 insertions(+)
> >  create mode 100644 include/hw/fsi/opb.h
> >  create mode 100644 hw/fsi/opb.c
> >
> > diff --git a/include/hw/fsi/opb.h b/include/hw/fsi/opb.h
> > new file mode 100644
> > index 00..c112206f9e
> > --- /dev/null
> > +++ b/include/hw/fsi/opb.h
> > @@ -0,0 +1,25 @@
> > +/*
> > + * SPDX-License-Identifier: GPL-2.0-or-later
> > + * Copyright (C) 2023 IBM Corp.
> > + *
> > + * IBM On-Chip Peripheral Bus
> > + */
> > +#ifndef FSI_OPB_H
> > +#define FSI_OPB_H
> > +
> > +#include "exec/memory.h"
> > +#include "hw/fsi/fsi-master.h"
> > +
> > +#define TYPE_OP_BUS "opb"
> > +OBJECT_DECLARE_SIMPLE_TYPE(OPBus, OP_BUS)
> > +
> > +typedef struct OPBus {
> > +    /*< private >*/
> > +    BusState bus;
> > +
> > +    /*< public >*/
> > +    MemoryRegion mr;
> > +    AddressSpace as;
> > +} OPBus;
> > +
> > +#endif /* FSI_OPB_H */
> > diff --git a/hw/fsi/opb.c b/hw/fsi/opb.c
> > new file mode 100644
> > index 00..6474754890
> > --- /dev/null
> > +++ b/hw/fsi/opb.c
> > @@ -0,0 +1,36 @@
> > +/*
> > + * SPDX-License-Identifier: GPL-2.0-or-later
> > + * Copyright (C) 2023 IBM Corp.
> > + *
> > + * IBM On-chip Peripheral Bus
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +
> > +#include "qapi/error.h"
> > +#include "qemu/log.h"
> > +
> > +#include "hw/fsi/opb.h"
> > +
> > +static void fsi_opb_init(Object *o)
> > +{
> > +    OPBus *opb = OP_BUS(o);
> > +
> > +    memory_region_init_io(&opb->mr, OBJECT(opb), NULL, opb,
> > +                          NULL, UINT32_MAX);
>
> Let's give the region some name.

Added "fsi.opb" name.

Thanks for the review.

Regards,
Ninad
[PATCH v2 0/3] Hexagon (target/hexagon) Use QEMU decodetree
Replace the old Hexagon dectree.py with QEMU decodetree

Taylor Simpson (3):
  Hexagon (target/hexagon) Use QEMU decodetree (32-bit instructions)
  Hexagon (target/hexagon) Use QEMU decodetree (16-bit instructions)
  Hexagon (target/hexagon) Remove old dectree.py

 target/hexagon/decode.h             |   5 +-
 target/hexagon/opcodes.h            |   2 -
 target/hexagon/decode.c             | 435 +++-
 target/hexagon/gen_dectree_import.c |  49
 target/hexagon/opcodes.c            |  29 --
 target/hexagon/translate.c          |   4 +-
 target/hexagon/README               |  14 +-
 target/hexagon/dectree.py           | 403 --
 target/hexagon/gen_decodetree.py    | 203 +
 target/hexagon/gen_trans_funcs.py   | 124
 target/hexagon/meson.build          | 147 +-
 11 files changed, 591 insertions(+), 824 deletions(-)
 delete mode 100755 target/hexagon/dectree.py
 create mode 100755 target/hexagon/gen_decodetree.py
 create mode 100755 target/hexagon/gen_trans_funcs.py

-- 
2.34.1
[PATCH v2 1/3] Hexagon (target/hexagon) Use QEMU decodetree (32-bit instructions)
The Decodetree Specification can be found here
https://www.qemu.org/docs/master/devel/decodetree.html

Covers all 32-bit instructions, including HVX

We generate separate decoders for each instruction class. The reason
will be more apparent in the next patch in this series.

We add 2 new scripts
    gen_decodetree.py     Generate the input to decodetree.py
    gen_trans_funcs.py    Generate the trans_* functions used by the
                          output of decodetree.py

Since the functions generated by decodetree.py take DisasContext * as an
argument, we add the argument to a couple of functions that didn't need
it previously. We also set the insn field in DisasContext during decode
because it is used by the trans_* functions.

There is a g_assert_not_reached() in decode_insns() in decode.c to
verify we never try to use the old decoder on 32-bit instructions

Signed-off-by: Taylor Simpson
---
 target/hexagon/decode.h           |   5 +-
 target/hexagon/decode.c           |  54 -
 target/hexagon/translate.c        |   4 +-
 target/hexagon/README             |  13 +-
 target/hexagon/gen_decodetree.py  | 193 ++
 target/hexagon/gen_trans_funcs.py | 132
 target/hexagon/meson.build        |  55 +
 7 files changed, 442 insertions(+), 14 deletions(-)
 create mode 100755 target/hexagon/gen_decodetree.py
 create mode 100755 target/hexagon/gen_trans_funcs.py

diff --git a/target/hexagon/decode.h b/target/hexagon/decode.h
index c66f5ea64d..3f3012b978 100644
--- a/target/hexagon/decode.h
+++ b/target/hexagon/decode.h
@@ -21,12 +21,13 @@
 #include "cpu.h"
 #include "opcodes.h"
 #include "insn.h"
+#include "translate.h"
 
 void decode_init(void);
 
 void decode_send_insn_to(Packet *packet, int start, int newloc);
 
-int decode_packet(int max_words, const uint32_t *words, Packet *pkt,
-                  bool disas_only);
+int decode_packet(DisasContext *ctx, int max_words, const uint32_t *words,
+                  Packet *pkt, bool disas_only);
 
 #endif
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index 946c55cc71..bddad1f75e 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -52,6 +52,34 @@
DEF_REGMAP(R_8, 8, 0, 1, 2, 3, 4, 5, 6, 7) #define DECODE_MAPPED_REG(OPNUM, NAME) \ insn->regno[OPNUM] = DECODE_REGISTER_##NAME[insn->regno[OPNUM]]; +/* Helper functions for decode_*_generated.c.inc */ +#define DECODE_MAPPED(NAME) \ +static int decode_mapped_reg_##NAME(DisasContext *ctx, int x) \ +{ \ +return DECODE_REGISTER_##NAME[x]; \ +} +DECODE_MAPPED(R_16) +DECODE_MAPPED(R_8) + +/* Helper function for decodetree_trans_funcs_generated.c.inc */ +static int shift_left(DisasContext *ctx, int x, int n, int immno) +{ +int ret = x; +Insn *insn = ctx->insn; +if (!insn->extension_valid || +insn->which_extended != immno) { +ret <<= n; +} +return ret; +} + +/* Include the generated decoder for 32 bit insn */ +#include "decode_normal_generated.c.inc" +#include "decode_hvx_generated.c.inc" + +/* Include the generated helpers for the decoder */ +#include "decodetree_trans_funcs_generated.c.inc" + typedef struct { const struct DectreeTable *table_link; const struct DectreeTable *table_link_b; @@ -550,7 +578,8 @@ apply_extender(Packet *pkt, int i, uint32_t extender) int immed_num; uint32_t base_immed; -immed_num = opcode_which_immediate_is_extended(pkt->insn[i].opcode); +immed_num = pkt->insn[i].which_extended; +g_assert(immed_num == opcode_which_immediate_is_extended(pkt->insn[i].opcode)); base_immed = pkt->insn[i].immed[immed_num]; pkt->insn[i].immed[immed_num] = extender | fZXTN(6, 32, base_immed); @@ -762,12 +791,19 @@ decode_insns_tablewalk(Insn *insn, const DectreeTable *table, } static unsigned int -decode_insns(Insn *insn, uint32_t encoding) +decode_insns(DisasContext *ctx, Insn *insn, uint32_t encoding) { const DectreeTable *table; if (parse_bits(encoding) != 0) { +if (decode_normal(ctx, encoding) || +decode_hvx(ctx, encoding)) { +insn->generate = opcode_genptr[insn->opcode]; +insn->iclass = iclass_bits(encoding); +return 1; +} /* Start with PP table - 32 bit instructions */ table = _table_DECODE_ROOT_32; +g_assert_not_reached(); } else { /* start with EE table - 
duplex instructions */ table = _table_DECODE_ROOT_EE; @@ -916,8 +952,8 @@ decode_set_slot_number(Packet *pkt) * or number of words used on success */ -int decode_packet(int max_words, const uint32_t *words, Packet *pkt, - bool disas_only) +int decode_packet(DisasContext *ctx, int max_words, const uint32_t *words, + Packet *pkt, bool disas_only) { int num_insns = 0; int words_read = 0; @@ -930,9 +966,11 @@ int decode_packet(int max_words, const uint32_t *words, Packet *pkt,
[PATCH v2 2/3] Hexagon (target/hexagon) Use QEMU decodetree (16-bit instructions)
Section 10.3 of the Hexagon V73 Programmer's Reference Manual A duplex is encoded as a 32-bit instruction with bits [15:14] set to 00. The sub-instructions that comprise a duplex are encoded as 13-bit fields in the duplex. Create a decoder for each subinstruction class (a, l1, l2, s1, s2). Extend gen_trans_funcs.py to handle all instructions rather than filter by instruction class. There is a g_assert_not_reached() in decode_insns() in decode.c to verify we never try to use the old decoder on 16-bit instructions. Signed-off-by: Taylor Simpson --- target/hexagon/decode.c | 85 + target/hexagon/README | 1 + target/hexagon/gen_decodetree.py | 14 - target/hexagon/gen_trans_funcs.py | 12 + target/hexagon/meson.build| 90 +++ 5 files changed, 190 insertions(+), 12 deletions(-) diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c index bddad1f75e..160b23a895 100644 --- a/target/hexagon/decode.c +++ b/target/hexagon/decode.c @@ -60,6 +60,7 @@ static int decode_mapped_reg_##NAME(DisasContext *ctx, int x) \ } DECODE_MAPPED(R_16) DECODE_MAPPED(R_8) +DECODE_MAPPED(R__8) /* Helper function for decodetree_trans_funcs_generated.c.inc */ static int shift_left(DisasContext *ctx, int x, int n, int immno) @@ -77,6 +78,13 @@ static int shift_left(DisasContext *ctx, int x, int n, int immno) #include "decode_normal_generated.c.inc" #include "decode_hvx_generated.c.inc" +/* Include the generated decoder for 16 bit insn */ +#include "decode_subinsn_a_generated.c.inc" +#include "decode_subinsn_l1_generated.c.inc" +#include "decode_subinsn_l2_generated.c.inc" +#include "decode_subinsn_s1_generated.c.inc" +#include "decode_subinsn_s2_generated.c.inc" + /* Include the generated helpers for the decoder */ #include "decodetree_trans_funcs_generated.c.inc" @@ -790,6 +798,63 @@ decode_insns_tablewalk(Insn *insn, const DectreeTable *table, } } +/* + * Section 10.3 of the Hexagon V73 Programmer's Reference Manual + * + * A duplex is encoded as a 32-bit instruction with bits [15:14] set to 
00. + * The sub-instructions that comprise a duplex are encoded as 13-bit fields + * in the duplex. + * + * Per table 10-4, the 4-bit duplex iclass is encoded in bits 31:29, 13 + */ +static uint32_t get_duplex_iclass(uint32_t encoding) +{ +uint32_t iclass = extract32(encoding, 13, 1); +iclass = deposit32(iclass, 1, 3, extract32(encoding, 29, 3)); +return iclass; +} + +/* + * Per table 10-5, the duplex ICLASS field values that specify the group of + * each sub-instruction in a duplex + * + * This table points to the decode instruction for each entry in the table + */ +typedef bool (*subinsn_decode_func)(DisasContext *ctx, uint16_t insn); +typedef struct { +subinsn_decode_func decode_slot0_subinsn; +subinsn_decode_func decode_slot1_subinsn; +} subinsn_decode_groups; + +static const subinsn_decode_groups decode_groups[16] = { +[0x0] = { decode_subinsn_l1, decode_subinsn_l1 }, +[0x1] = { decode_subinsn_l2, decode_subinsn_l1 }, +[0x2] = { decode_subinsn_l2, decode_subinsn_l2 }, +[0x3] = { decode_subinsn_a, decode_subinsn_a }, +[0x4] = { decode_subinsn_l1, decode_subinsn_a }, +[0x5] = { decode_subinsn_l2, decode_subinsn_a }, +[0x6] = { decode_subinsn_s1, decode_subinsn_a }, +[0x7] = { decode_subinsn_s2, decode_subinsn_a }, +[0x8] = { decode_subinsn_s1, decode_subinsn_l1 }, +[0x9] = { decode_subinsn_s1, decode_subinsn_l2 }, +[0xa] = { decode_subinsn_s1, decode_subinsn_s1 }, +[0xb] = { decode_subinsn_s2, decode_subinsn_s1 }, +[0xc] = { decode_subinsn_s2, decode_subinsn_l1 }, +[0xd] = { decode_subinsn_s2, decode_subinsn_l2 }, +[0xe] = { decode_subinsn_s2, decode_subinsn_s2 }, +[0xf] = { NULL, NULL }, /* Reserved */ +}; + +static uint16_t get_slot0_subinsn(uint32_t encoding) +{ +return extract32(encoding, 0, 13); +} + +static uint16_t get_slot1_subinsn(uint32_t encoding) +{ +return extract32(encoding, 16, 13); +} + static unsigned int decode_insns(DisasContext *ctx, Insn *insn, uint32_t encoding) { @@ -805,8 +870,28 @@ decode_insns(DisasContext *ctx, Insn *insn, uint32_t 
encoding) table = _table_DECODE_ROOT_32; g_assert_not_reached(); } else { +uint32_t iclass = get_duplex_iclass(encoding); +unsigned int slot0_subinsn = get_slot0_subinsn(encoding); +unsigned int slot1_subinsn = get_slot1_subinsn(encoding); +subinsn_decode_func decode_slot0_subinsn = +decode_groups[iclass].decode_slot0_subinsn; +subinsn_decode_func decode_slot1_subinsn = +decode_groups[iclass].decode_slot1_subinsn; + +/* The slot1 subinsn needs to be in the packet first */ +if (decode_slot1_subinsn(ctx, slot1_subinsn)) { +insn->generate = opcode_genptr[insn->opcode]; +insn->iclass = iclass_bits(encoding); +
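For anyone following the duplex field layout described in the comments above, here is a minimal standalone sketch of the extraction. The `sketch_*` helpers are simplified local stand-ins for QEMU's `extract32()`/`deposit32()` (assumed here to be called only with 0 < length < 32), and the sample encoding used to exercise it is made up, not taken from the manual.

```c
#include <stdint.h>

/* Simplified stand-in for extract32(): valid for 0 < length < 32 only. */
static uint32_t sketch_extract32(uint32_t value, int start, int length)
{
    return (value >> start) & ((1u << length) - 1);
}

/* Simplified stand-in for deposit32(): valid for 0 < length < 32 only. */
static uint32_t sketch_deposit32(uint32_t value, int start, int length,
                                 uint32_t fieldval)
{
    uint32_t mask = ((1u << length) - 1) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* A duplex has bits [15:14] of the 32-bit word equal to 00. */
static int sketch_is_duplex(uint32_t encoding)
{
    return sketch_extract32(encoding, 14, 2) == 0;
}

/* 4-bit duplex iclass: bits [31:29] are the high 3 bits, bit 13 the low bit. */
static uint32_t sketch_duplex_iclass(uint32_t encoding)
{
    uint32_t iclass = sketch_extract32(encoding, 13, 1);
    return sketch_deposit32(iclass, 1, 3, sketch_extract32(encoding, 29, 3));
}

/* Slot 0 sub-instruction: the 13-bit field in bits [12:0]. */
static uint16_t sketch_slot0_subinsn(uint32_t encoding)
{
    return (uint16_t)sketch_extract32(encoding, 0, 13);
}

/* Slot 1 sub-instruction: the 13-bit field in bits [28:16]. */
static uint16_t sketch_slot1_subinsn(uint32_t encoding)
{
    return (uint16_t)sketch_extract32(encoding, 16, 13);
}
```

The resulting iclass value is what indexes the 16-entry decode_groups[] table in the patch; entry 0xf is reserved there and holds NULL pointers, so a caller would presumably want to reject it before calling through the table.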
[PATCH v2 3/3] Hexagon (target/hexagon) Remove old dectree.py
Now that we are using QEMU decodetree.py, remove the old decoder Signed-off-by: Taylor Simpson --- target/hexagon/opcodes.h| 2 - target/hexagon/decode.c | 344 target/hexagon/gen_dectree_import.c | 49 target/hexagon/opcodes.c| 29 -- target/hexagon/dectree.py | 403 target/hexagon/meson.build | 12 - 6 files changed, 839 deletions(-) delete mode 100755 target/hexagon/dectree.py diff --git a/target/hexagon/opcodes.h b/target/hexagon/opcodes.h index 6e90e00fe2..fa7e321950 100644 --- a/target/hexagon/opcodes.h +++ b/target/hexagon/opcodes.h @@ -53,6 +53,4 @@ extern const OpcodeEncoding opcode_encodings[XX_LAST_OPCODE]; void opcode_init(void); -int opcode_which_immediate_is_extended(Opcode opcode); - #endif diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c index 160b23a895..a40210ca1e 100644 --- a/target/hexagon/decode.c +++ b/target/hexagon/decode.c @@ -88,175 +88,6 @@ static int shift_left(DisasContext *ctx, int x, int n, int immno) /* Include the generated helpers for the decoder */ #include "decodetree_trans_funcs_generated.c.inc" -typedef struct { -const struct DectreeTable *table_link; -const struct DectreeTable *table_link_b; -Opcode opcode; -enum { -DECTREE_ENTRY_INVALID, -DECTREE_TABLE_LINK, -DECTREE_SUBINSNS, -DECTREE_EXTSPACE, -DECTREE_TERMINAL -} type; -} DectreeEntry; - -typedef struct DectreeTable { -unsigned int (*lookup_function)(int startbit, int width, uint32_t opcode); -unsigned int size; -unsigned int startbit; -unsigned int width; -const DectreeEntry table[]; -} DectreeTable; - -#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT) \ -static const DectreeTable dectree_table_##TAG; -#define TABLE_LINK(TABLE) /* NOTHING */ -#define TERMINAL(TAG, ENC)/* NOTHING */ -#define SUBINSNS(TAG, CLASSA, CLASSB, ENC)/* NOTHING */ -#define EXTSPACE(TAG, ENC)/* NOTHING */ -#define INVALID() /* NOTHING */ -#define DECODE_END_TABLE(...) /* NOTHING */ -#define DECODE_MATCH_INFO(...)/* NOTHING */ -#define DECODE_LEGACY_MATCH_INFO(...) 
/* NOTHING */ -#define DECODE_OPINFO(...)/* NOTHING */ - -#include "dectree_generated.h.inc" - -#undef DECODE_OPINFO -#undef DECODE_MATCH_INFO -#undef DECODE_LEGACY_MATCH_INFO -#undef DECODE_END_TABLE -#undef INVALID -#undef TERMINAL -#undef SUBINSNS -#undef EXTSPACE -#undef TABLE_LINK -#undef DECODE_NEW_TABLE -#undef DECODE_SEPARATOR_BITS - -#define DECODE_SEPARATOR_BITS(START, WIDTH) NULL, START, WIDTH -#define DECODE_NEW_TABLE_HELPER(TAG, SIZE, FN, START, WIDTH) \ -static const DectreeTable dectree_table_##TAG = { \ -.size = SIZE, \ -.lookup_function = FN, \ -.startbit = START, \ -.width = WIDTH, \ -.table = { -#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT) \ -DECODE_NEW_TABLE_HELPER(TAG, SIZE, WHATNOT) - -#define TABLE_LINK(TABLE) \ -{ .type = DECTREE_TABLE_LINK, .table_link = _table_##TABLE }, -#define TERMINAL(TAG, ENC) \ -{ .type = DECTREE_TERMINAL, .opcode = TAG }, -#define SUBINSNS(TAG, CLASSA, CLASSB, ENC) \ -{ \ -.type = DECTREE_SUBINSNS, \ -.table_link = _table_DECODE_SUBINSN_##CLASSA, \ -.table_link_b = _table_DECODE_SUBINSN_##CLASSB \ -}, -#define EXTSPACE(TAG, ENC) { .type = DECTREE_EXTSPACE }, -#define INVALID() { .type = DECTREE_ENTRY_INVALID, .opcode = XX_LAST_OPCODE }, - -#define DECODE_END_TABLE(...) } }; - -#define DECODE_MATCH_INFO(...)/* NOTHING */ -#define DECODE_LEGACY_MATCH_INFO(...) 
/* NOTHING */ -#define DECODE_OPINFO(...)/* NOTHING */ - -#include "dectree_generated.h.inc" - -#undef DECODE_OPINFO -#undef DECODE_MATCH_INFO -#undef DECODE_LEGACY_MATCH_INFO -#undef DECODE_END_TABLE -#undef INVALID -#undef TERMINAL -#undef SUBINSNS -#undef EXTSPACE -#undef TABLE_LINK -#undef DECODE_NEW_TABLE -#undef DECODE_NEW_TABLE_HELPER -#undef DECODE_SEPARATOR_BITS - -static const DectreeTable dectree_table_DECODE_EXT_EXT_noext = { -.size = 1, .lookup_function = NULL, .startbit = 0, .width = 0, -.table = { -{ .type = DECTREE_ENTRY_INVALID, .opcode = XX_LAST_OPCODE }, -} -}; - -static const DectreeTable *ext_trees[XX_LAST_EXT_IDX]; - -static void decode_ext_init(void) -{ -int i; -for (i = EXT_IDX_noext; i < EXT_IDX_noext_AFTER; i++) { -ext_trees[i] = _table_DECODE_EXT_EXT_noext; -} -for (i = EXT_IDX_mmvec; i < EXT_IDX_mmvec_AFTER; i++) { -ext_trees[i] = _table_DECODE_EXT_EXT_mmvec; -} -} - -typedef struct { -uint32_t mask; -uint32_t match; -} DecodeITableEntry; - -#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT) /* NOTHING */ -#define TABLE_LINK(TABLE) /* NOTHING */ -#define TERMINAL(TAG,
Re: [External] Re: [QEMU-devel][RFC PATCH 1/1] backends/hostmem: qapi/qom: Add an ObjectOption for memory-backend-* called HostMemType and its arg 'cxlram'
On Mon, Jan 8, 2024 at 9:15 AM Gregory Price wrote:
>
> On Fri, Jan 05, 2024 at 09:59:19PM -0800, Hao Xiang wrote:
> > On Wed, Jan 3, 2024 at 1:56 PM Gregory Price
> > wrote:
> > >
> > > For a variety of performance reasons, this will not work the way you
> > > want it to. You are essentially telling QEMU to map the vmem0 into a
> > > virtual cxl device, and now any memory accesses to that memory region
> > > will end up going through the cxl-type3 device logic - which is an IO
> > > path from the perspective of QEMU.
> >
> > I didn't understand exactly how the virtual cxl-type3 device works. I
> > thought it would go with the same "guest virtual address -> guest
> > physical address -> host physical address" translation totally done by
> > CPU. But if it is going through an emulation path handled by virtual
> > cxl-type3, I agree the performance would be bad. Do you know why
> > accessing memory on a virtual cxl-type3 device can't go with the
> > nested page table translation?
> >
>
> Because a byte-access on CXL memory can have checks on it that must be
> emulated by the virtual device, and because there are caching
> implications that have to be emulated as well.

Interesting. Now I see cxl_type3_read/cxl_type3_write: if the CXL
memory data path goes through them, the performance would be pretty
problematic. We have actually run Intel's Memory Latency Checker
benchmark from inside a guest VM with both system-DRAM and virtual
CXL-type3 configured. The idle latency on the virtual CXL memory is 2x
that of system DRAM, which is on par with the benchmark running on a
physical host. I need to debug this more to understand why the latency
is actually much better than I would expect now.

>
> The cxl device you are using is an emulated CXL device - not a
> virtualization interface. Nuanced difference: the emulated device has
> to emulate *everything* that CXL device does.
>
> What you want is passthrough / managed access to a real device -
> virtualization. This is not the way to accomplish that. A better way
> to accomplish that is to simply pass the memory through as a static numa
> node as I described.

That would work, too. But I think a kernel change is required to
establish the correct memory tiering if we go this route.

>
> >
> > When we had a discussion with Intel, they told us to not use the KVM
> > option in QEMU while using virtual cxl type3 device. That's probably
> > related to the issue you described here? We enabled KVM though but
> > haven't seen the crash yet.
> >
>
> The crash really only happens, IIRC, if code ends up hosted in that
> memory. I forget the exact scenario, but the working theory is it has
> to do with the way instruction caches are managed with KVM and this
> device.
>
> > >
> > > You're better off just using the `host-nodes` field of host-memory
> > > and passing bandwidth/latency attributes though via `-numa hmat-lb`
> >
> > We tried this but it doesn't work from end to end right now. I
> > described the issue in another fork of this thread.
> >
> > >
> > > In that scenario, the guest software doesn't even need to know CXL
> > > exists at all, it can just read the attributes of the numa node
> > > that QEMU created for it.
> >
> > We thought about this before. But the current kernel implementation
> > requires a devdax device to be probed and recognized as a slow tier
> > (by reading the memory attributes). I don't think this can be done via
> > the path you described. Have you tried this before?
> >
>
> Right, because the memory tiering component lumps the nodes together.
>
> Better idea: Fix the memory tiering component
>
> I cc'd you on another patch line that is discussing something relevant
> to this.
>
> https://lore.kernel.org/linux-mm/87fs00njft@yhuang6-desk2.ccr.corp.intel.com/T/#m32d58f8cc607aec942995994a41b17ff711519c8
>
> The point is: There's no need for this to be a dax device at all, there
> is no need for the guest to even know what is providing the memory, or
> for the guest to have any management access to the memory. It just
> wants the memory and the ability to tier it.
>
> So we should fix the memory tiering component to work with this
> workflow.

Agreed. We really don't need the devdax device at all. I thought that
choice was made because the memory tiering concept started with pmem ...
Let's continue this part of the discussion on the above thread.

>
> ~Gregory
Re: [PATCH v7 0/4] compare machine type compat_props
On Fri, Dec 22, 2023 at 7:51 AM Markus Armbruster wrote:
>
> Something odd is going on here.
>
> Your cover letter and PATCH 4 arrived here with
>
> Content-Type: text/plain; charset=UTF-8
>
> Good.
>
> PATCH 2:
>
> Content-Type: text/plain; charset="US-ASCII"; x-default=true
>
> PATCH 1 and 3:
>
> Content-Type: text/plain; charset=N
>
> git-am chokes on that:
>
> error: cannot convert from N to UTF-8
>

Patchew also complains that it hasn't received the full series:
https://patchew.org/QEMU/20231214155333.35643-1-davydov-...@yandex-team.ru/

Please consider rebasing and resending?

--js
Re: [PATCH v8 06/10] hw/fsi: Aspeed APB2OPB interface
Hello Cedric, On 12/12/23 08:49, Cédric Le Goater wrote: On 11/29/23 00:56, Ninad Palsule wrote: This is part of a patchset that introduces IBM's Flexible Service Interface. An APB-to-OPB bridge enabling access to the OPB from the ARM core in the AST2600. Hardware limitations prevent the OPB from being directly mapped into APB, so all accesses are indirect through the bridge. Signed-off-by: Andrew Jeffery Signed-off-by: Ninad Palsule [ clg: - moved FSIMasterState under AspeedAPB2OPBState - modified fsi_opb_fsi_master_address() and fsi_opb_opb2fsi_address() - introduced fsi_aspeed_apb2opb_init() - reworked fsi_aspeed_apb2opb_realize() ] Signed-off-by: Cédric Le Goater --- include/hw/fsi/aspeed-apb2opb.h | 34 hw/fsi/aspeed-apb2opb.c | 316 hw/arm/Kconfig | 1 + hw/fsi/Kconfig | 4 + hw/fsi/meson.build | 1 + hw/fsi/trace-events | 2 + 6 files changed, 358 insertions(+) create mode 100644 include/hw/fsi/aspeed-apb2opb.h create mode 100644 hw/fsi/aspeed-apb2opb.c diff --git a/include/hw/fsi/aspeed-apb2opb.h b/include/hw/fsi/aspeed-apb2opb.h new file mode 100644 index 00..c51fbeda9f --- /dev/null +++ b/include/hw/fsi/aspeed-apb2opb.h @@ -0,0 +1,34 @@ +/* + * SPDX-License-Identifier: GPL-2.0-or-later + * Copyright (C) 2023 IBM Corp. 
+ * + * ASPEED APB2OPB Bridge + */ +#ifndef FSI_ASPEED_APB2OPB_H +#define FSI_ASPEED_APB2OPB_H + +#include "hw/sysbus.h" +#include "hw/fsi/opb.h" + +#define TYPE_ASPEED_APB2OPB "aspeed.apb2opb" +OBJECT_DECLARE_SIMPLE_TYPE(AspeedAPB2OPBState, ASPEED_APB2OPB) + +#define ASPEED_APB2OPB_NR_REGS ((0xe8 >> 2) + 1) + +#define ASPEED_FSI_NUM 2 + +typedef struct AspeedAPB2OPBState { + /*< private >*/ + SysBusDevice parent_obj; + + /*< public >*/ + MemoryRegion iomem; + + uint32_t regs[ASPEED_APB2OPB_NR_REGS]; + qemu_irq irq; + + OPBus opb[ASPEED_FSI_NUM]; + FSIMasterState fsi[ASPEED_FSI_NUM]; +} AspeedAPB2OPBState; + +#endif /* FSI_ASPEED_APB2OPB_H */ diff --git a/hw/fsi/aspeed-apb2opb.c b/hw/fsi/aspeed-apb2opb.c new file mode 100644 index 00..70b3fe2587 --- /dev/null +++ b/hw/fsi/aspeed-apb2opb.c @@ -0,0 +1,316 @@ +/* + * SPDX-License-Identifier: GPL-2.0-or-later + * Copyright (C) 2023 IBM Corp. + * + * ASPEED APB-OPB FSI interface + */ + +#include "qemu/osdep.h" +#include "qemu/log.h" +#include "qom/object.h" +#include "qapi/error.h" +#include "trace.h" + +#include "hw/fsi/aspeed-apb2opb.h" +#include "hw/qdev-core.h" + +#define TO_REG(x) (x >> 2) + +#define APB2OPB_VERSION TO_REG(0x00) +#define APB2OPB_TRIGGER TO_REG(0x04) + +#define APB2OPB_CONTROL TO_REG(0x08) +#define APB2OPB_CONTROL_OFF BE_GENMASK(31, 13) + +#define APB2OPB_OPB2FSI TO_REG(0x0c) +#define APB2OPB_OPB2FSI_OFF BE_GENMASK(31, 22) + +#define APB2OPB_OPB0_SEL TO_REG(0x10) +#define APB2OPB_OPB1_SEL TO_REG(0x28) +#define APB2OPB_OPB_SEL_EN BIT(0) + +#define APB2OPB_OPB0_MODE TO_REG(0x14) +#define APB2OPB_OPB1_MODE TO_REG(0x2c) +#define APB2OPB_OPB_MODE_RD BIT(0) + +#define APB2OPB_OPB0_XFER TO_REG(0x18) +#define APB2OPB_OPB1_XFER TO_REG(0x30) +#define APB2OPB_OPB_XFER_FULL BIT(1) +#define APB2OPB_OPB_XFER_HALF BIT(0) + +#define APB2OPB_OPB0_ADDR TO_REG(0x1c) +#define APB2OPB_OPB0_WRITE_DATA TO_REG(0x20) + +#define APB2OPB_OPB1_ADDR TO_REG(0x34) +#define APB2OPB_OPB1_WRITE_DATA TO_REG(0x38) + +#define 
APB2OPB_IRQ_STS TO_REG(0x48) +#define APB2OPB_IRQ_STS_OPB1_TX_ACK BIT(17) +#define APB2OPB_IRQ_STS_OPB0_TX_ACK BIT(16) + +#define APB2OPB_OPB0_WRITE_WORD_ENDIAN TO_REG(0x4c) +#define APB2OPB_OPB0_WRITE_WORD_ENDIAN_BE 0x0011101b +#define APB2OPB_OPB0_WRITE_BYTE_ENDIAN TO_REG(0x50) +#define APB2OPB_OPB0_WRITE_BYTE_ENDIAN_BE 0x0c330f3f +#define APB2OPB_OPB1_WRITE_WORD_ENDIAN TO_REG(0x54) +#define APB2OPB_OPB1_WRITE_BYTE_ENDIAN TO_REG(0x58) +#define APB2OPB_OPB0_READ_BYTE_ENDIAN TO_REG(0x5c) +#define APB2OPB_OPB1_READ_BYTE_ENDIAN TO_REG(0x60) +#define APB2OPB_OPB0_READ_WORD_ENDIAN_BE 0x00030b1b + +#define APB2OPB_OPB0_READ_DATA TO_REG(0x84) +#define APB2OPB_OPB1_READ_DATA TO_REG(0x90) + +/* + * The following magic values came from AST2600 data sheet + * The register values are defined under section "FSI controller" + * as initial values. + */ +static const uint32_t aspeed_apb2opb_reset[ASPEED_APB2OPB_NR_REGS] = { + [APB2OPB_VERSION] = 0x00a1, + [APB2OPB_OPB0_WRITE_WORD_ENDIAN] = 0x0044eee4, + [APB2OPB_OPB0_WRITE_BYTE_ENDIAN] = 0x0055aaff, + [APB2OPB_OPB1_WRITE_WORD_ENDIAN] =
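As a side note on the register indexing used in the quoted patch, a small sketch of how a byte offset maps to an index in a uint32_t register array. Everything prefixed `sketch_`/`SKETCH_` is hypothetical and not part of the patch; the macro argument is parenthesized here so that an expression such as `0x40 + 8` is shifted as a whole, and the bounds policy (reject unaligned or out-of-range offsets) is only an assumption for illustration.

```c
#include <stdint.h>

/* Byte offset of a 32-bit register to its index in a uint32_t array. */
#define SKETCH_TO_REG(x) ((x) >> 2)

/* The highest register offset quoted in the header is 0xe8. */
#define SKETCH_NR_REGS (SKETCH_TO_REG(0xe8) + 1)

static uint32_t sketch_regs[SKETCH_NR_REGS];

/* Store a value; returns 0 on success, -1 for a bad offset. */
static int sketch_reg_write(uint32_t offset, uint32_t value)
{
    if ((offset & 3) || SKETCH_TO_REG(offset) >= SKETCH_NR_REGS) {
        return -1;
    }
    sketch_regs[SKETCH_TO_REG(offset)] = value;
    return 0;
}

/* Load a value; bad offsets read as zero in this sketch. */
static uint32_t sketch_reg_read(uint32_t offset)
{
    if ((offset & 3) || SKETCH_TO_REG(offset) >= SKETCH_NR_REGS) {
        return 0;
    }
    return sketch_regs[SKETCH_TO_REG(offset)];
}
```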
testing without the translation cache
Alex, A very long time ago QEMU supported disabling the translation cache via "-translation no-cache". That option was deliberately removed. We are looking into a Hexagon-specific failure when there's a TB lookup miss from a cpu_loop_exit_restore(). I'd like to test our fix for this failure and was wondering if there's any mechanism to disable the cache. There's a "-accel tcg,tb-size=0" - but this won't accomplish what I'm looking to do - will it? If not, is there another way to disable the cache? -Brian
[PATCH v10 07/10] include/hw/net: GMAC IRQ Implementation
From: Nabih Estefan Diaz Implement Update IRQ Method for GMAC functionality. Added relevant trace-events Change-Id: I7a2d3cd3f493278bcd0cf483233c1e05c37488b7 Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- hw/net/npcm_gmac.c | 40 hw/net/trace-events | 1 + 2 files changed, 41 insertions(+) diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c index 98b3c33c94..44c4ffaff4 100644 --- a/hw/net/npcm_gmac.c +++ b/hw/net/npcm_gmac.c @@ -149,6 +149,46 @@ static bool gmac_can_receive(NetClientState *nc) return true; } +/* + * Function that updates the GMAC IRQ + * It find the logical OR of the enabled bits for NIS (if enabled) + * It find the logical OR of the enabled bits for AIS (if enabled) + */ +static void gmac_update_irq(NPCMGMACState *gmac) +{ +/* + * Check if the normal interrupts summary is enabled + * if so, add the bits for the summary that are enabled + */ +if (gmac->regs[R_NPCM_DMA_INTR_ENA] & gmac->regs[R_NPCM_DMA_STATUS] & +(NPCM_DMA_INTR_ENAB_NIE_BITS)) { +gmac->regs[R_NPCM_DMA_STATUS] |= NPCM_DMA_STATUS_NIS; +} +/* + * Check if the abnormal interrupts summary is enabled + * if so, add the bits for the summary that are enabled + */ +if (gmac->regs[R_NPCM_DMA_INTR_ENA] & gmac->regs[R_NPCM_DMA_STATUS] & +(NPCM_DMA_INTR_ENAB_AIE_BITS)) { +gmac->regs[R_NPCM_DMA_STATUS] |= NPCM_DMA_STATUS_AIS; +} + +/* Get the logical OR of both normal and abnormal interrupts */ +int level = !!((gmac->regs[R_NPCM_DMA_STATUS] & +gmac->regs[R_NPCM_DMA_INTR_ENA] & +NPCM_DMA_STATUS_NIS) | + (gmac->regs[R_NPCM_DMA_STATUS] & + gmac->regs[R_NPCM_DMA_INTR_ENA] & + NPCM_DMA_STATUS_AIS)); + +/* Set the IRQ */ +trace_npcm_gmac_update_irq(DEVICE(gmac)->canonical_path, + gmac->regs[R_NPCM_DMA_STATUS], + gmac->regs[R_NPCM_DMA_INTR_ENA], + level); +qemu_set_irq(gmac->irq, level); +} + static ssize_t gmac_receive(NetClientState *nc, const uint8_t *buf, size_t len) { /* Placeholder. 
Function will be filled in following patches */ diff --git a/hw/net/trace-events b/hw/net/trace-events index 33514548b8..56057de47f 100644 --- a/hw/net/trace-events +++ b/hw/net/trace-events @@ -473,6 +473,7 @@ npcm_gmac_reg_write(const char *name, uint64_t offset, uint32_t value) "%s: offs npcm_gmac_mdio_access(const char *name, uint8_t is_write, uint8_t pa, uint8_t gr, uint16_t val) "%s: is_write: %" PRIu8 " pa: %" PRIu8 " gr: %" PRIu8 " val: 0x%04" PRIx16 npcm_gmac_reset(const char *name, uint16_t value) "%s: phy_regs[0][1]: 0x%04" PRIx16 npcm_gmac_set_link(bool active) "Set link: active=%u" +npcm_gmac_update_irq(const char *name, uint32_t status, uint32_t intr_en, int level) "%s: Status Reg: 0x%04" PRIX32 " Interrupt Enable Reg: 0x%04" PRIX32 " IRQ Set: %d" # npcm_pcs.c npcm_pcs_reg_read(const char *name, uint16_t indirect_access_baes, uint64_t offset, uint16_t value) "%s: IND: 0x%02" PRIx16 " offset: 0x%04" PRIx64 " value: 0x%04" PRIx16 -- 2.43.0.472.g3155946c3a-goog
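To make the summary-bit logic in gmac_update_irq() easier to reason about in isolation, here is a standalone sketch of the same computation on plain integers. The bit positions and source masks below are illustrative assumptions, not values copied from npcm_gmac.h, and `sketch_gmac_irq_level()` is a hypothetical name.

```c
#include <stdint.h>

/* Assumed summary bits (the same positions are used in status and enable). */
#define SKETCH_STATUS_NIS  (1u << 16)   /* normal interrupt summary */
#define SKETCH_STATUS_AIS  (1u << 15)   /* abnormal interrupt summary */

/* Assumed sets of individual sources feeding each summary bit. */
#define SKETCH_NIE_BITS    0x00004045u
#define SKETCH_AIE_BITS    0x000027bau

/*
 * Latch the summary bits into *status when an enabled source of the
 * matching kind is pending, then return the IRQ line level: high only
 * if a summary bit is both set in status and enabled in intr_ena.
 */
static int sketch_gmac_irq_level(uint32_t *status, uint32_t intr_ena)
{
    if (intr_ena & *status & SKETCH_NIE_BITS) {
        *status |= SKETCH_STATUS_NIS;
    }
    if (intr_ena & *status & SKETCH_AIE_BITS) {
        *status |= SKETCH_STATUS_AIS;
    }
    return !!(*status & intr_ena &
              (SKETCH_STATUS_NIS | SKETCH_STATUS_AIS));
}
```

One property this makes visible, if the sketch matches the patch's intent: a summary bit can be latched in the status word while its own enable bit is clear, in which case the line stays low. The function also never clears a summary bit, so that would have to happen elsewhere (presumably on a guest write to the status register).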
[PATCH v10 08/10] hw/net: GMAC Rx Implementation
From: Nabih Estefan Diaz - Implementation of Receive function for packets - Implementation for reading and writing from and to descriptors in memory for Rx When RX starts, we need to flush the queued packets so that they can be received by the GMAC device. Without this it won't work with TAP NIC device. When RX descriptor list is full, it returns a DMA_STATUS for software to handle it. But there's no way to indicate the software has handled all RX descriptors and the whole pipeline stalls. We do something similar to NPCM7XX EMC to handle this case. 1. Return packet size when RX descriptor is full, effectively dropping these packets in such a case. 2. When software clears RX descriptor full bit, continue receiving further packets by flushing QEMU packet queue. Added relevant trace-events Change-Id: I132aa254a94cda1a586aba2ea33bbfc74ecdb831 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- hw/net/npcm_gmac.c | 324 +++- hw/net/trace-events | 5 + 2 files changed, 327 insertions(+), 2 deletions(-) diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c index 44c4ffaff4..54c8af3b41 100644 --- a/hw/net/npcm_gmac.c +++ b/hw/net/npcm_gmac.c @@ -23,7 +23,11 @@ #include "hw/registerfields.h" #include "hw/net/mii.h" #include "hw/net/npcm_gmac.h" +#include "linux/if_ether.h" #include "migration/vmstate.h" +#include "net/checksum.h" +#include "net/net.h" +#include "qemu/cutils.h" #include "qemu/log.h" #include "qemu/units.h" #include "sysemu/dma.h" @@ -146,6 +150,17 @@ static void gmac_phy_set_link(NPCMGMACState *gmac, bool active) static bool gmac_can_receive(NetClientState *nc) { +NPCMGMACState *gmac = NPCM_GMAC(qemu_get_nic_opaque(nc)); + +/* If GMAC receive is disabled. */ +if (!(gmac->regs[R_NPCM_GMAC_MAC_CONFIG] & NPCM_GMAC_MAC_CONFIG_RX_EN)) { +return false; +} + +/* If GMAC DMA RX is stopped. 
*/ +if (!(gmac->regs[R_NPCM_DMA_CONTROL] & NPCM_DMA_CONTROL_START_STOP_RX)) { +return false; +} return true; } @@ -189,12 +204,288 @@ static void gmac_update_irq(NPCMGMACState *gmac) qemu_set_irq(gmac->irq, level); } -static ssize_t gmac_receive(NetClientState *nc, const uint8_t *buf, size_t len) +static int gmac_read_rx_desc(dma_addr_t addr, struct NPCMGMACRxDesc *desc) +{ +if (dma_memory_read(_space_memory, addr, desc, +sizeof(*desc), MEMTXATTRS_UNSPECIFIED)) { +qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to read descriptor @ 0x%" + HWADDR_PRIx "\n", __func__, addr); +return -1; +} +desc->rdes0 = le32_to_cpu(desc->rdes0); +desc->rdes1 = le32_to_cpu(desc->rdes1); +desc->rdes2 = le32_to_cpu(desc->rdes2); +desc->rdes3 = le32_to_cpu(desc->rdes3); +return 0; +} + +static int gmac_write_rx_desc(dma_addr_t addr, struct NPCMGMACRxDesc *desc) { -/* Placeholder. Function will be filled in following patches */ +struct NPCMGMACRxDesc le_desc; +le_desc.rdes0 = cpu_to_le32(desc->rdes0); +le_desc.rdes1 = cpu_to_le32(desc->rdes1); +le_desc.rdes2 = cpu_to_le32(desc->rdes2); +le_desc.rdes3 = cpu_to_le32(desc->rdes3); +if (dma_memory_write(_space_memory, addr, _desc, +sizeof(le_desc), MEMTXATTRS_UNSPECIFIED)) { +qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to write descriptor @ 0x%" + HWADDR_PRIx "\n", __func__, addr); +return -1; +} return 0; } +static int gmac_read_tx_desc(dma_addr_t addr, struct NPCMGMACTxDesc *desc) +{ +if (dma_memory_read(_space_memory, addr, desc, +sizeof(*desc), MEMTXATTRS_UNSPECIFIED)) { +qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to read descriptor @ 0x%" + HWADDR_PRIx "\n", __func__, addr); +return -1; +} +desc->tdes0 = le32_to_cpu(desc->tdes0); +desc->tdes1 = le32_to_cpu(desc->tdes1); +desc->tdes2 = le32_to_cpu(desc->tdes2); +desc->tdes3 = le32_to_cpu(desc->tdes3); +return 0; +} + +static int gmac_write_tx_desc(dma_addr_t addr, struct NPCMGMACTxDesc *desc) +{ +struct NPCMGMACTxDesc le_desc; +le_desc.tdes0 = cpu_to_le32(desc->tdes0); +le_desc.tdes1 = 
cpu_to_le32(desc->tdes1); +le_desc.tdes2 = cpu_to_le32(desc->tdes2); +le_desc.tdes3 = cpu_to_le32(desc->tdes3); +if (dma_memory_write(_space_memory, addr, _desc, +sizeof(le_desc), MEMTXATTRS_UNSPECIFIED)) { +qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to write descriptor @ 0x%" + HWADDR_PRIx "\n", __func__, addr); +return -1; +} +return 0; +} +static int gmac_rx_transfer_frame_to_buffer(uint32_t rx_buf_len, +uint32_t *left_frame, +uint32_t rx_buf_addr, +bool *eof_transferred, +
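The descriptor helpers above all follow one pattern: move four 32-bit words between guest memory (little-endian) and host order. A host-independent sketch of that pattern follows. It assembles the words byte by byte instead of using QEMU's le32_to_cpu()/cpu_to_le32() and dma_memory_read()/dma_memory_write(), so it illustrates the byte order only; `SketchRxDesc` and the `sketch_*` functions are hypothetical names.

```c
#include <stdint.h>

/* Four-word Rx descriptor, held in host byte order once loaded. */
typedef struct {
    uint32_t rdes0;
    uint32_t rdes1;
    uint32_t rdes2;
    uint32_t rdes3;
} SketchRxDesc;

/* Assemble a 32-bit value from 4 little-endian bytes. */
static uint32_t sketch_load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Store a 32-bit value as 4 little-endian bytes. */
static void sketch_store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* raw points at 16 bytes standing in for the descriptor in guest memory. */
static void sketch_read_rx_desc(const uint8_t *raw, SketchRxDesc *desc)
{
    desc->rdes0 = sketch_load_le32(raw);
    desc->rdes1 = sketch_load_le32(raw + 4);
    desc->rdes2 = sketch_load_le32(raw + 8);
    desc->rdes3 = sketch_load_le32(raw + 12);
}

static void sketch_write_rx_desc(uint8_t *raw, const SketchRxDesc *desc)
{
    sketch_store_le32(raw, desc->rdes0);
    sketch_store_le32(raw + 4, desc->rdes1);
    sketch_store_le32(raw + 8, desc->rdes2);
    sketch_store_le32(raw + 12, desc->rdes3);
}
```

Like the patch's write helpers, the store path works on a separate output buffer, so the caller's host-order descriptor is left untouched.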
[PATCH v10 03/10] hw/misc: Add qtest for NPCM7xx PCI Mailbox
From: Hao Wu This patch adds a qtest for the NPCM7XX PCI Mailbox module. It sends read and write requests to the module, and verifies that the module contains the correct data after the requests. Change-Id: I2e1dbaecf8be9ec7eab55cb54f7fdeb0715b8275 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- tests/qtest/meson.build | 1 + tests/qtest/npcm7xx_pci_mbox-test.c | 238 2 files changed, 239 insertions(+) create mode 100644 tests/qtest/npcm7xx_pci_mbox-test.c diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build index 47dabf91d0..2ac79925f9 100644 --- a/tests/qtest/meson.build +++ b/tests/qtest/meson.build @@ -183,6 +183,7 @@ qtests_sparc64 = \ qtests_npcm7xx = \ ['npcm7xx_adc-test', 'npcm7xx_gpio-test', + 'npcm7xx_pci_mbox-test', 'npcm7xx_pwm-test', 'npcm7xx_rng-test', 'npcm7xx_sdhci-test', diff --git a/tests/qtest/npcm7xx_pci_mbox-test.c b/tests/qtest/npcm7xx_pci_mbox-test.c new file mode 100644 index 00..24eec18e3c --- /dev/null +++ b/tests/qtest/npcm7xx_pci_mbox-test.c @@ -0,0 +1,238 @@ +/* + * QTests for Nuvoton NPCM7xx PCI Mailbox Modules. + * + * Copyright 2021 Google LLC + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. 
+ */ + +#include "qemu/osdep.h" +#include "qemu/bitops.h" +#include "qapi/qmp/qdict.h" +#include "qapi/qmp/qnum.h" +#include "libqtest-single.h" + +#define PCI_MBOX_BA 0xf0848000 +#define PCI_MBOX_IRQ8 + +/* register offset */ +#define PCI_MBOX_STAT 0x00 +#define PCI_MBOX_CTL0x04 +#define PCI_MBOX_CMD0x08 + +#define CODE_OK 0x00 +#define CODE_INVALID_OP 0xa0 +#define CODE_INVALID_SIZE 0xa1 +#define CODE_ERROR 0xff + +#define OP_READ 0x01 +#define OP_WRITE0x02 +#define OP_INVALID 0x41 + + +static int sock; +static int fd; + +/* + * Create a local TCP socket with any port, then save off the port we got. + */ +static in_port_t open_socket(void) +{ +struct sockaddr_in myaddr; +socklen_t addrlen; + +myaddr.sin_family = AF_INET; +myaddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); +myaddr.sin_port = 0; +sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); +g_assert(sock != -1); +g_assert(bind(sock, (struct sockaddr *) , sizeof(myaddr)) != -1); +addrlen = sizeof(myaddr); +g_assert(getsockname(sock, (struct sockaddr *) , ) != -1); +g_assert(listen(sock, 1) != -1); +return ntohs(myaddr.sin_port); +} + +static void setup_fd(void) +{ +fd_set readfds; + +FD_ZERO(); +FD_SET(sock, ); +g_assert(select(sock + 1, , NULL, NULL, NULL) == 1); + +fd = accept(sock, NULL, 0); +g_assert(fd >= 0); +} + +static uint8_t read_response(uint8_t *buf, size_t len) +{ +uint8_t code; +ssize_t ret = read(fd, , 1); + +if (ret == -1) { +return CODE_ERROR; +} +if (code != CODE_OK) { +return code; +} +g_test_message("response code: %x", code); +if (len > 0) { +ret = read(fd, buf, len); +if (ret < len) { +return CODE_ERROR; +} +} +return CODE_OK; +} + +static void receive_data(uint64_t offset, uint8_t *buf, size_t len) +{ +uint8_t op = OP_READ; +uint8_t code; +ssize_t rv; + +while (len > 0) { +uint8_t size; + +if (len >= 8) { +size = 8; +} else if (len >= 4) { +size = 4; +} else if (len >= 2) { +size = 2; +} else { +size = 1; +} + +g_test_message("receiving %u bytes", size); +/* Write op */ +rv = write(fd, , 
1); +g_assert_cmpint(rv, ==, 1); +/* Write offset */ +rv = write(fd, (uint8_t *), sizeof(uint64_t)); +g_assert_cmpint(rv, ==, sizeof(uint64_t)); +/* Write size */ +g_assert_cmpint(write(fd, , 1), ==, 1); + +/* Read data and Expect response */ +code = read_response(buf, size); +g_assert_cmphex(code, ==, CODE_OK); + +buf += size; +offset += size; +len -= size; +} +} + +static void send_data(uint64_t offset, const uint8_t *buf, size_t len) +{ +uint8_t op = OP_WRITE; +uint8_t code; +ssize_t rv; + +while (len > 0) { +uint8_t size; + +if (len >= 8) { +size = 8; +} else if (len >= 4) { +size = 4; +} else if (len >= 2) { +size = 2; +} else { +size = 1; +} + +
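Both receive_data() and send_data() in the test split a transfer into requests of 8, 4, 2 or 1 bytes, always taking the largest size that still fits. A minimal sketch of just that selection, with hypothetical helper names, which may be handy when working out how many round trips a given length costs:

```c
#include <stddef.h>
#include <stdint.h>

/* Largest request size (8, 4, 2 or 1) that fits in len; expects len > 0. */
static uint8_t sketch_chunk_size(size_t len)
{
    if (len >= 8) {
        return 8;
    }
    if (len >= 4) {
        return 4;
    }
    if (len >= 2) {
        return 2;
    }
    return 1;
}

/* Number of requests needed to move len bytes with the greedy split. */
static size_t sketch_count_requests(size_t len)
{
    size_t n = 0;

    while (len > 0) {
        len -= sketch_chunk_size(len);
        n++;
    }
    return n;
}
```

For example, 13 bytes would go out as 8 + 4 + 1, that is, three requests.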
[PATCH v10 04/10] hw/net: Add NPCMXXX GMAC device
From: Hao Wu This patch implements the basic registers of GMAC device and sets registers for networking functionalities. Tested: The following message shows up with the change: Broadcom BCM54612E stmmac-0:00: attached PHY driver [Broadcom BCM54612E] (mii_bus:phy_addr=stmmac-0:00, irq=POLL) stmmaceth f0802000.eth eth0: Link is Up - 1Gbps/Full - flow control rx/tx Change-Id: If71c6d486b95edcccba109ba454870714d7e0940 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Diaz Reviewed-by: Tyrone Ting --- hw/net/meson.build | 2 +- hw/net/npcm_gmac.c | 424 + hw/net/trace-events| 11 + include/hw/net/npcm_gmac.h | 340 + 4 files changed, 776 insertions(+), 1 deletion(-) create mode 100644 hw/net/npcm_gmac.c create mode 100644 include/hw/net/npcm_gmac.h diff --git a/hw/net/meson.build b/hw/net/meson.build index f64651c467..db6509f504 100644 --- a/hw/net/meson.build +++ b/hw/net/meson.build @@ -38,7 +38,7 @@ system_ss.add(when: 'CONFIG_I82596_COMMON', if_true: files('i82596.c')) system_ss.add(when: 'CONFIG_SUNHME', if_true: files('sunhme.c')) system_ss.add(when: 'CONFIG_FTGMAC100', if_true: files('ftgmac100.c')) system_ss.add(when: 'CONFIG_SUNGEM', if_true: files('sungem.c')) -system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_emc.c')) +system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_emc.c', 'npcm_gmac.c')) system_ss.add(when: 'CONFIG_ETRAXFS', if_true: files('etraxfs_eth.c')) system_ss.add(when: 'CONFIG_COLDFIRE', if_true: files('mcf_fec.c')) diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c new file mode 100644 index 00..98b3c33c94 --- /dev/null +++ b/hw/net/npcm_gmac.c @@ -0,0 +1,424 @@ +/* + * Nuvoton NPCM7xx/8xx GMAC Module + * + * Copyright 2022 Google LLC + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. + * + * Unsupported/unimplemented features: + * - MII is not implemented, MII_ADDR.BUSY and MII_DATA always return zero + * - Precision timestamp (PTP) is not implemented. + */ + +#include "qemu/osdep.h" + +#include "hw/registerfields.h" +#include "hw/net/mii.h" +#include "hw/net/npcm_gmac.h" +#include "migration/vmstate.h" +#include "qemu/log.h" +#include "qemu/units.h" +#include "sysemu/dma.h" +#include "trace.h" + +REG32(NPCM_DMA_BUS_MODE, 0x1000) +REG32(NPCM_DMA_XMT_POLL_DEMAND, 0x1004) +REG32(NPCM_DMA_RCV_POLL_DEMAND, 0x1008) +REG32(NPCM_DMA_RX_BASE_ADDR, 0x100c) +REG32(NPCM_DMA_TX_BASE_ADDR, 0x1010) +REG32(NPCM_DMA_STATUS, 0x1014) +REG32(NPCM_DMA_CONTROL, 0x1018) +REG32(NPCM_DMA_INTR_ENA, 0x101c) +REG32(NPCM_DMA_MISSED_FRAME_CTR, 0x1020) +REG32(NPCM_DMA_HOST_TX_DESC, 0x1048) +REG32(NPCM_DMA_HOST_RX_DESC, 0x104c) +REG32(NPCM_DMA_CUR_TX_BUF_ADDR, 0x1050) +REG32(NPCM_DMA_CUR_RX_BUF_ADDR, 0x1054) +REG32(NPCM_DMA_HW_FEATURE, 0x1058) + +REG32(NPCM_GMAC_MAC_CONFIG, 0x0) +REG32(NPCM_GMAC_FRAME_FILTER, 0x4) +REG32(NPCM_GMAC_HASH_HIGH, 0x8) +REG32(NPCM_GMAC_HASH_LOW, 0xc) +REG32(NPCM_GMAC_MII_ADDR, 0x10) +REG32(NPCM_GMAC_MII_DATA, 0x14) +REG32(NPCM_GMAC_FLOW_CTRL, 0x18) +REG32(NPCM_GMAC_VLAN_FLAG, 0x1c) +REG32(NPCM_GMAC_VERSION, 0x20) +REG32(NPCM_GMAC_WAKEUP_FILTER, 0x28) +REG32(NPCM_GMAC_PMT, 0x2c) +REG32(NPCM_GMAC_LPI_CTRL, 0x30) +REG32(NPCM_GMAC_TIMER_CTRL, 0x34) +REG32(NPCM_GMAC_INT_STATUS, 0x38) +REG32(NPCM_GMAC_INT_MASK, 0x3c) +REG32(NPCM_GMAC_MAC0_ADDR_HI, 0x40) +REG32(NPCM_GMAC_MAC0_ADDR_LO, 0x44) +REG32(NPCM_GMAC_MAC1_ADDR_HI, 0x48) +REG32(NPCM_GMAC_MAC1_ADDR_LO, 0x4c) +REG32(NPCM_GMAC_MAC2_ADDR_HI, 0x50) +REG32(NPCM_GMAC_MAC2_ADDR_LO, 0x54) +REG32(NPCM_GMAC_MAC3_ADDR_HI, 0x58) +REG32(NPCM_GMAC_MAC3_ADDR_LO, 
0x5c) +REG32(NPCM_GMAC_RGMII_STATUS, 0xd8) +REG32(NPCM_GMAC_WATCHDOG, 0xdc) +REG32(NPCM_GMAC_PTP_TCR, 0x700) +REG32(NPCM_GMAC_PTP_SSIR, 0x704) +REG32(NPCM_GMAC_PTP_STSR, 0x708) +REG32(NPCM_GMAC_PTP_STNSR, 0x70c) +REG32(NPCM_GMAC_PTP_STSUR, 0x710) +REG32(NPCM_GMAC_PTP_STNSUR, 0x714) +REG32(NPCM_GMAC_PTP_TAR, 0x718) +REG32(NPCM_GMAC_PTP_TTSR, 0x71c) + +/* Register Fields */ +#define NPCM_GMAC_MII_ADDR_BUSY BIT(0) +#define NPCM_GMAC_MII_ADDR_WRITEBIT(1) +#define NPCM_GMAC_MII_ADDR_GR(rv) extract16((rv), 6, 5) +#define NPCM_GMAC_MII_ADDR_PA(rv) extract16((rv), 11, 5) + +#define NPCM_GMAC_INT_MASK_LPIIMBIT(10) +#define NPCM_GMAC_INT_MASK_PMTM BIT(3) +#define NPCM_GMAC_INT_MASK_RGIM BIT(0) + +#define NPCM_DMA_BUS_MODE_SWR BIT(0) + +static const uint32_t npcm_gmac_cold_reset_values[NPCM_GMAC_NR_REGS] = { +/*
[PATCH v10 02/10] hw/arm: Add PCI mailbox module to Nuvoton SoC
From: Hao Wu

This patch wires the PCI mailbox module to Nuvoton SoC.

Change-Id: I14c42c628258804030f0583889882842bde0d972
Signed-off-by: Hao Wu
Signed-off-by: Nabih Estefan
Reviewed-by: Tyrone Ting
---
 docs/system/arm/nuvoton.rst | 2 ++
 hw/arm/npcm7xx.c            | 2 ++
 include/hw/arm/npcm7xx.h    | 1 +
 3 files changed, 5 insertions(+)

diff --git a/docs/system/arm/nuvoton.rst b/docs/system/arm/nuvoton.rst
index 0424cae4b0..e611099545 100644
--- a/docs/system/arm/nuvoton.rst
+++ b/docs/system/arm/nuvoton.rst
@@ -50,6 +50,8 @@ Supported devices
  * Ethernet controller (EMC)
  * Tachometer
  * Peripheral SPI controller (PSPI)
+ * BIOS POST code FIFO
+ * PCI Mailbox
 
 Missing devices
 ---
diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index 1c3634ff45..c9e87162cb 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -462,6 +462,8 @@ static void npcm7xx_init(Object *obj)
         object_initialize_child(obj, "pspi[*]", &s->pspi[i], TYPE_NPCM_PSPI);
     }
 
+    object_initialize_child(obj, "pci-mbox", &s->pci_mbox,
+                            TYPE_NPCM7XX_PCI_MBOX);
     object_initialize_child(obj, "mmc", &s->mmc, TYPE_NPCM7XX_SDHCI);
 }
 
diff --git a/include/hw/arm/npcm7xx.h b/include/hw/arm/npcm7xx.h
index 273090ac60..cec3792a2e 100644
--- a/include/hw/arm/npcm7xx.h
+++ b/include/hw/arm/npcm7xx.h
@@ -105,6 +105,7 @@ struct NPCM7xxState {
     OHCISysBusState ohci;
     NPCM7xxFIUState fiu[2];
     NPCM7xxEMCState emc[2];
+    NPCM7xxPCIMBoxState pci_mbox;
     NPCM7xxSDHCIState mmc;
     NPCMPSPIState pspi[2];
 };
-- 
2.43.0.472.g3155946c3a-goog
[PATCH v10 06/10] tests/qtest: Creating qtest for GMAC Module
From: Nabih Estefan Diaz - Created qtest to check initialization of registers in GMAC Module. - Implemented test into Build File. Change-Id: I8b2fe152d3987a7eec4cf6a1d25ba92e75a5391d Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- tests/qtest/meson.build | 1 + tests/qtest/npcm_gmac-test.c | 209 +++ 2 files changed, 210 insertions(+) create mode 100644 tests/qtest/npcm_gmac-test.c diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build index 2ac79925f9..aed8924be9 100644 --- a/tests/qtest/meson.build +++ b/tests/qtest/meson.build @@ -221,6 +221,7 @@ qtests_aarch64 = \ (config_all_devices.has_key('CONFIG_RASPI') ? ['bcm2835-dma-test'] : []) + \ (config_all.has_key('CONFIG_TCG') and \ config_all_devices.has_key('CONFIG_TPM_TIS_I2C') ? ['tpm-tis-i2c-test'] : []) + \ + (config_all_devices.has_key('CONFIG_NPCM7XX') ? qtests_npcm7xx : []) + \ ['arm-cpu-features', 'numa-test', 'boot-serial-test', diff --git a/tests/qtest/npcm_gmac-test.c b/tests/qtest/npcm_gmac-test.c new file mode 100644 index 00..130a1599a8 --- /dev/null +++ b/tests/qtest/npcm_gmac-test.c @@ -0,0 +1,209 @@ +/* + * QTests for Nuvoton NPCM7xx/8xx GMAC Modules. + * + * Copyright 2023 Google LLC + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. 
+ */ + +#include "qemu/osdep.h" +#include "libqos/libqos.h" + +/* Name of the GMAC Device */ +#define TYPE_NPCM_GMAC "npcm-gmac" + +typedef struct GMACModule { +int irq; +uint64_t base_addr; +} GMACModule; + +typedef struct TestData { +const GMACModule *module; +} TestData; + +/* Values extracted from hw/arm/npcm8xx.c */ +static const GMACModule gmac_module_list[] = { +{ +.irq= 14, +.base_addr = 0xf0802000 +}, +{ +.irq= 15, +.base_addr = 0xf0804000 +}, +{ +.irq= 16, +.base_addr = 0xf0806000 +}, +{ +.irq= 17, +.base_addr = 0xf0808000 +} +}; + +/* Returns the index of the GMAC module. */ +static int gmac_module_index(const GMACModule *mod) +{ +ptrdiff_t diff = mod - gmac_module_list; + +g_assert_true(diff >= 0 && diff < ARRAY_SIZE(gmac_module_list)); + +return diff; +} + +/* 32-bit register indices. Taken from npcm_gmac.c */ +typedef enum NPCMRegister { +/* DMA Registers */ +NPCM_DMA_BUS_MODE = 0x1000, +NPCM_DMA_XMT_POLL_DEMAND = 0x1004, +NPCM_DMA_RCV_POLL_DEMAND = 0x1008, +NPCM_DMA_RCV_BASE_ADDR = 0x100c, +NPCM_DMA_TX_BASE_ADDR = 0x1010, +NPCM_DMA_STATUS = 0x1014, +NPCM_DMA_CONTROL = 0x1018, +NPCM_DMA_INTR_ENA = 0x101c, +NPCM_DMA_MISSED_FRAME_CTR = 0x1020, +NPCM_DMA_HOST_TX_DESC = 0x1048, +NPCM_DMA_HOST_RX_DESC = 0x104c, +NPCM_DMA_CUR_TX_BUF_ADDR = 0x1050, +NPCM_DMA_CUR_RX_BUF_ADDR = 0x1054, +NPCM_DMA_HW_FEATURE = 0x1058, + +/* GMAC Registers */ +NPCM_GMAC_MAC_CONFIG = 0x0, +NPCM_GMAC_FRAME_FILTER = 0x4, +NPCM_GMAC_HASH_HIGH = 0x8, +NPCM_GMAC_HASH_LOW = 0xc, +NPCM_GMAC_MII_ADDR = 0x10, +NPCM_GMAC_MII_DATA = 0x14, +NPCM_GMAC_FLOW_CTRL = 0x18, +NPCM_GMAC_VLAN_FLAG = 0x1c, +NPCM_GMAC_VERSION = 0x20, +NPCM_GMAC_WAKEUP_FILTER = 0x28, +NPCM_GMAC_PMT = 0x2c, +NPCM_GMAC_LPI_CTRL = 0x30, +NPCM_GMAC_TIMER_CTRL = 0x34, +NPCM_GMAC_INT_STATUS = 0x38, +NPCM_GMAC_INT_MASK = 0x3c, +NPCM_GMAC_MAC0_ADDR_HI = 0x40, +NPCM_GMAC_MAC0_ADDR_LO = 0x44, +NPCM_GMAC_MAC1_ADDR_HI = 0x48, +NPCM_GMAC_MAC1_ADDR_LO = 0x4c, +NPCM_GMAC_MAC2_ADDR_HI = 0x50, +NPCM_GMAC_MAC2_ADDR_LO = 0x54, 
+NPCM_GMAC_MAC3_ADDR_HI = 0x58, +NPCM_GMAC_MAC3_ADDR_LO = 0x5c, +NPCM_GMAC_RGMII_STATUS = 0xd8, +NPCM_GMAC_WATCHDOG = 0xdc, +NPCM_GMAC_PTP_TCR = 0x700, +NPCM_GMAC_PTP_SSIR = 0x704, +NPCM_GMAC_PTP_STSR = 0x708, +NPCM_GMAC_PTP_STNSR = 0x70c, +NPCM_GMAC_PTP_STSUR = 0x710, +NPCM_GMAC_PTP_STNSUR = 0x714, +NPCM_GMAC_PTP_TAR = 0x718, +NPCM_GMAC_PTP_TTSR = 0x71c, +} NPCMRegister; + +static uint32_t gmac_read(QTestState *qts, const GMACModule *mod, + NPCMRegister regno) +{ +return qtest_readl(qts, mod->base_addr + regno); +} + +/* Check that GMAC registers are reset to default value */ +static void test_init(gconstpointer test_data) +{ +const TestData *td = test_data; +const GMACModule *mod = td->module; +QTestState *qts = qtest_init("-machine npcm845-evb"); + +#define CHECK_REG32(regno,
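The `gmac_module_index()` helper in the test above recovers an array index by subtracting the array base from an element pointer. A minimal standalone sketch of that idiom follows; the `SketchModule` type, the `sketch_module_list` table and `sketch_module_index()` are hypothetical stand-ins rather than the test's own definitions, and plain `assert()` replaces `g_assert_true()` so the snippet does not depend on GLib.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the test's GMACModule table. */
typedef struct SketchModule {
    int irq;
    uint64_t base_addr;
} SketchModule;

static const SketchModule sketch_module_list[] = {
    { .irq = 14, .base_addr = 0xf0802000 },
    { .irq = 15, .base_addr = 0xf0804000 },
};

#define SKETCH_MODULE_COUNT \
    (sizeof(sketch_module_list) / sizeof(sketch_module_list[0]))

/* Return the index of mod within sketch_module_list. */
static int sketch_module_index(const SketchModule *mod)
{
    /* Pointer subtraction yields a distance in elements, not bytes. */
    ptrdiff_t diff = mod - sketch_module_list;

    assert(diff >= 0 && (size_t)diff < SKETCH_MODULE_COUNT);
    return (int)diff;
}
```

The subtraction is only well defined when the pointer really points into the table, which is why the bounds check comes before the value is used.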
[PATCH v10 09/10] hw/net: GMAC Tx Implementation
From: Nabih Estefan Diaz - Implementation of Transmit function for packets - Implementation for reading and writing from and to descriptors in memory for Tx Added relevant trace-events NOTE: This function implements the steps detailed in the datasheet for transmitting messages from the GMAC. Change-Id: Icf14f9fcc6cc7808a41acd872bca67c9832087e6 Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- hw/net/npcm_gmac.c | 155 hw/net/trace-events | 2 + 2 files changed, 157 insertions(+) diff --git a/hw/net/npcm_gmac.c b/hw/net/npcm_gmac.c index 54c8af3b41..8e91e61617 100644 --- a/hw/net/npcm_gmac.c +++ b/hw/net/npcm_gmac.c @@ -265,6 +265,7 @@ static int gmac_write_tx_desc(dma_addr_t addr, struct NPCMGMACTxDesc *desc) } return 0; } + static int gmac_rx_transfer_frame_to_buffer(uint32_t rx_buf_len, uint32_t *left_frame, uint32_t rx_buf_addr, @@ -486,6 +487,155 @@ static ssize_t gmac_receive(NetClientState *nc, const uint8_t *buf, size_t len) return len; } +static int gmac_tx_get_csum(uint32_t tdes1) +{ +uint32_t mask = TX_DESC_TDES1_CHKSM_INS_CTRL_MASK(tdes1); +int csum = 0; + +if (likely(mask > 0)) { +csum |= CSUM_IP; +} +if (likely(mask > 1)) { +csum |= CSUM_TCP | CSUM_UDP; +} + +return csum; +} + +static void gmac_try_send_next_packet(NPCMGMACState *gmac) +{ +/* + * Comments about steps refer to steps for + * transmitting in page 384 of datasheet + */ +uint16_t tx_buffer_size = 2048; +g_autofree uint8_t *tx_send_buffer = g_malloc(tx_buffer_size); +uint32_t desc_addr; +struct NPCMGMACTxDesc tx_desc; +uint32_t tx_buf_addr, tx_buf_len; +uint16_t length = 0; +uint8_t *buf = tx_send_buffer; +uint32_t prev_buf_size = 0; +int csum = 0; + +/* steps 1&2 */ +if (!gmac->regs[R_NPCM_DMA_HOST_TX_DESC]) { +gmac->regs[R_NPCM_DMA_HOST_TX_DESC] = +NPCM_DMA_HOST_TX_DESC_MASK(gmac->regs[R_NPCM_DMA_TX_BASE_ADDR]); +} +desc_addr = gmac->regs[R_NPCM_DMA_HOST_TX_DESC]; + +while (true) { +gmac_dma_set_state(gmac, NPCM_DMA_STATUS_TX_PROCESS_STATE_SHIFT, 
+NPCM_DMA_STATUS_TX_RUNNING_FETCHING_STATE); +if (gmac_read_tx_desc(desc_addr, &tx_desc)) { +qemu_log_mask(LOG_GUEST_ERROR, + "TX Descriptor @ 0x%x can't be read\n", + desc_addr); +return; +} +/* step 3 */ + +trace_npcm_gmac_packet_desc_read(DEVICE(gmac)->canonical_path, +desc_addr); +trace_npcm_gmac_debug_desc_data(DEVICE(gmac)->canonical_path, &tx_desc, +tx_desc.tdes0, tx_desc.tdes1, tx_desc.tdes2, tx_desc.tdes3); + +/* 1 = DMA Owned, 0 = Software Owned */ +if (!(tx_desc.tdes0 & TX_DESC_TDES0_OWN)) { +qemu_log_mask(LOG_GUEST_ERROR, + "TX Descriptor @ 0x%x is owned by software\n", + desc_addr); +gmac->regs[R_NPCM_DMA_STATUS] |= NPCM_DMA_STATUS_TU; +gmac_dma_set_state(gmac, NPCM_DMA_STATUS_TX_PROCESS_STATE_SHIFT, +NPCM_DMA_STATUS_TX_SUSPENDED_STATE); +gmac_update_irq(gmac); +return; +} + +gmac_dma_set_state(gmac, NPCM_DMA_STATUS_TX_PROCESS_STATE_SHIFT, +NPCM_DMA_STATUS_TX_RUNNING_READ_STATE); +/* Give the descriptor back regardless of what happens. */ +tx_desc.tdes0 &= ~TX_DESC_TDES0_OWN; + +if (tx_desc.tdes1 & TX_DESC_TDES1_FIRST_SEG_MASK) { +csum = gmac_tx_get_csum(tx_desc.tdes1); +} + +/* step 4 */ +tx_buf_addr = tx_desc.tdes2; +gmac->regs[R_NPCM_DMA_CUR_TX_BUF_ADDR] = tx_buf_addr; +tx_buf_len = TX_DESC_TDES1_BFFR1_SZ_MASK(tx_desc.tdes1); +buf = &tx_send_buffer[prev_buf_size]; + +if ((prev_buf_size + tx_buf_len) > sizeof(buf)) { +tx_buffer_size = prev_buf_size + tx_buf_len; +tx_send_buffer = g_realloc(tx_send_buffer, tx_buffer_size); +buf = &tx_send_buffer[prev_buf_size]; +} + +/* step 5 */ +if (dma_memory_read(&address_space_memory, tx_buf_addr, buf, +tx_buf_len, MEMTXATTRS_UNSPECIFIED)) { +qemu_log_mask(LOG_GUEST_ERROR, "%s: Failed to read packet @ 0x%x\n", +__func__, tx_buf_addr); +return; +} +length += tx_buf_len; +prev_buf_size += tx_buf_len; + +/* If not chained we'll have a second buffer. 
*/ +if (!(tx_desc.tdes1 & TX_DESC_TDES1_SEC_ADDR_CHND_MASK)) { +tx_buf_addr = tx_desc.tdes3; +gmac->regs[R_NPCM_DMA_CUR_TX_BUF_ADDR] = tx_buf_addr; +tx_buf_len = TX_DESC_TDES1_BFFR2_SZ_MASK(tx_desc.tdes1); +buf = &tx_send_buffer[prev_buf_size]; + +if
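The `gmac_tx_get_csum()` helper in this patch turns the descriptor's checksum-insertion control field into a set of checksum flags: any non-zero value enables the IP checksum, and values above 1 also enable TCP and UDP. The mapping can be sketched on its own as below. This is only an illustration: the `SKETCH_CSUM_*` flags and `sketch_tx_get_csum()` are hypothetical names, the flag values are chosen for the example, and the function takes the already-extracted field value because the bit position of the field within TDES1 is not shown in this excerpt.

```c
#include <stdint.h>

/* Hypothetical flag values, used only for this sketch. */
enum {
    SKETCH_CSUM_IP  = 1 << 0,
    SKETCH_CSUM_TCP = 1 << 1,
    SKETCH_CSUM_UDP = 1 << 2,
};

/*
 * Map an already-extracted checksum-insertion control value to a
 * set of checksum flags, following the same thresholds as the patch.
 */
static int sketch_tx_get_csum(uint32_t ins_ctrl)
{
    int csum = 0;

    if (ins_ctrl > 0) {
        csum |= SKETCH_CSUM_IP;
    }
    if (ins_ctrl > 1) {
        csum |= SKETCH_CSUM_TCP | SKETCH_CSUM_UDP;
    }

    return csum;
}
```

With these example flag values, a field value of 0 yields no flags, 1 yields only the IP flag, and 2 or 3 yield all three.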
[PATCH v10 05/10] hw/arm: Add GMAC devices to NPCM7XX SoC
From: Hao Wu Change-Id: Id8a3461fb5042adc4c3fd6f4fbd1ca0d33e22565 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- hw/arm/npcm7xx.c | 36 ++-- include/hw/arm/npcm7xx.h | 2 ++ 2 files changed, 36 insertions(+), 2 deletions(-) diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c index c9e87162cb..12e11250e1 100644 --- a/hw/arm/npcm7xx.c +++ b/hw/arm/npcm7xx.c @@ -91,6 +91,7 @@ enum NPCM7xxInterrupt { NPCM7XX_GMAC1_IRQ = 14, NPCM7XX_EMC1RX_IRQ = 15, NPCM7XX_EMC1TX_IRQ, +NPCM7XX_GMAC2_IRQ, NPCM7XX_MMC_IRQ = 26, NPCM7XX_PSPI2_IRQ = 28, NPCM7XX_PSPI1_IRQ = 31, @@ -234,6 +235,12 @@ static const hwaddr npcm7xx_pspi_addr[] = { 0xf0201000, }; +/* Register base address for each GMAC Module */ +static const hwaddr npcm7xx_gmac_addr[] = { +0xf0802000, +0xf0804000, +}; + static const struct { hwaddr regs_addr; uint32_t unconnected_pins; @@ -462,6 +469,10 @@ static void npcm7xx_init(Object *obj) object_initialize_child(obj, "pspi[*]", &s->pspi[i], TYPE_NPCM_PSPI); } +for (i = 0; i < ARRAY_SIZE(s->gmac); i++) { +object_initialize_child(obj, "gmac[*]", &s->gmac[i], TYPE_NPCM_GMAC); +} + object_initialize_child(obj, "pci-mbox", &s->pci_mbox, TYPE_NPCM7XX_PCI_MBOX); object_initialize_child(obj, "mmc", &s->mmc, TYPE_NPCM7XX_SDHCI); @@ -695,6 +706,29 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp) sysbus_connect_irq(sbd, 1, npcm7xx_irq(s, rx_irq)); } +/* + * GMAC Modules. Cannot fail. + */ +QEMU_BUILD_BUG_ON(ARRAY_SIZE(npcm7xx_gmac_addr) != ARRAY_SIZE(s->gmac)); +QEMU_BUILD_BUG_ON(ARRAY_SIZE(s->gmac) != 2); +for (i = 0; i < ARRAY_SIZE(s->gmac); i++) { +SysBusDevice *sbd = SYS_BUS_DEVICE(&s->gmac[i]); + +/* + * The device exists regardless of whether it's connected to a QEMU + * netdev backend. So always instantiate it even if there is no + * backend. + */ +sysbus_realize(sbd, &error_abort); +sysbus_mmio_map(sbd, 0, npcm7xx_gmac_addr[i]); +int irq = i == 0 ? NPCM7XX_GMAC1_IRQ : NPCM7XX_GMAC2_IRQ; +/* + * N.B. 
The values for the second argument sysbus_connect_irq are + * chosen to match the registration order in npcm7xx_emc_realize. + */ +sysbus_connect_irq(sbd, 0, npcm7xx_irq(s, irq)); +} + /* * Flash Interface Unit (FIU). Can fail if incorrect number of chip selects * specified, but this is a programming error. @@ -765,8 +799,6 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp) create_unimplemented_device("npcm7xx.siox[2]", 0xf0102000, 4 * KiB); create_unimplemented_device("npcm7xx.ahbpci", 0xf040, 1 * MiB); create_unimplemented_device("npcm7xx.mcphy",0xf05f, 64 * KiB); -create_unimplemented_device("npcm7xx.gmac1",0xf0802000, 8 * KiB); -create_unimplemented_device("npcm7xx.gmac2",0xf0804000, 8 * KiB); create_unimplemented_device("npcm7xx.vcd", 0xf081, 64 * KiB); create_unimplemented_device("npcm7xx.ece", 0xf082, 8 * KiB); create_unimplemented_device("npcm7xx.vdma", 0xf0822000, 8 * KiB); diff --git a/include/hw/arm/npcm7xx.h b/include/hw/arm/npcm7xx.h index cec3792a2e..9e5cf639a2 100644 --- a/include/hw/arm/npcm7xx.h +++ b/include/hw/arm/npcm7xx.h @@ -30,6 +30,7 @@ #include "hw/misc/npcm7xx_pwm.h" #include "hw/misc/npcm7xx_rng.h" #include "hw/net/npcm7xx_emc.h" +#include "hw/net/npcm_gmac.h" #include "hw/nvram/npcm7xx_otp.h" #include "hw/timer/npcm7xx_timer.h" #include "hw/ssi/npcm7xx_fiu.h" @@ -105,6 +106,7 @@ struct NPCM7xxState { OHCISysBusState ohci; NPCM7xxFIUState fiu[2]; NPCM7xxEMCState emc[2]; +NPCMGMACState gmac[2]; NPCM7xxPCIMBoxState pci_mbox; NPCM7xxSDHCIState mmc; NPCMPSPIState pspi[2]; -- 2.43.0.472.g3155946c3a-goog
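The patch above adds `NPCM7XX_GMAC2_IRQ` to the interrupt enum without an explicit value, so its number comes from C's rule that an enumerator without an initializer is one more than its predecessor. A small sketch of how that plays out for the enumerators visible in the hunk follows; the `SKETCH_*` names and `sketch_gmac_irq()` are hypothetical copies made for illustration, not the definitions in `hw/arm/npcm7xx.c`.

```c
#include <assert.h>

/* Hypothetical copy of the enumerators shown in the hunk. */
enum SketchInterrupt {
    SKETCH_GMAC1_IRQ  = 14,
    SKETCH_EMC1RX_IRQ = 15,
    SKETCH_EMC1TX_IRQ,          /* implicit: 16 */
    SKETCH_GMAC2_IRQ,           /* implicit: 17 */
    SKETCH_MMC_IRQ    = 26,
};

/* Pick the interrupt line for GMAC instance 0 or 1, as the realize loop does. */
static int sketch_gmac_irq(int index)
{
    assert(index == 0 || index == 1);
    return index == 0 ? SKETCH_GMAC1_IRQ : SKETCH_GMAC2_IRQ;
}
```

Under this reading the two GMAC instances end up on lines 14 and 17, with the two EMC1 lines in between.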
[PATCH v10 01/10] hw/misc: Add Nuvoton's PCI Mailbox Module
From: Hao Wu The PCI Mailbox Module is a high-bandwidth communication module between a Nuvoton BMC and CPU. It features 16KB of RAM that is accessible by both the BMC and the core CPU, and supports interrupts for both sides. This patch implements the BMC side of the PCI mailbox module. Communication with the core CPU is emulated via a chardev and will be in a follow-up patch. Change-Id: Iaca22f81c4526927d437aa367079ed038faf43f2 Signed-off-by: Hao Wu Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- hw/arm/npcm7xx.c | 15 +- hw/misc/meson.build| 1 + hw/misc/npcm7xx_pci_mbox.c | 324 + hw/misc/trace-events | 5 + include/hw/arm/npcm7xx.h | 1 + include/hw/misc/npcm7xx_pci_mbox.h | 81 6 files changed, 426 insertions(+), 1 deletion(-) create mode 100644 hw/misc/npcm7xx_pci_mbox.c create mode 100644 include/hw/misc/npcm7xx_pci_mbox.h diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c index 15ff21d047..1c3634ff45 100644 --- a/hw/arm/npcm7xx.c +++ b/hw/arm/npcm7xx.c @@ -53,6 +53,9 @@ /* ADC Module */ #define NPCM7XX_ADC_BA (0xf000c000) +/* PCI Mailbox Module */ +#define NPCM7XX_PCI_MBOX_BA (0xf0848000) + /* Internal AHB SRAM */ #define NPCM7XX_RAM3_BA (0xc0008000) #define NPCM7XX_RAM3_SZ (4 * KiB) @@ -83,6 +86,9 @@ enum NPCM7xxInterrupt { NPCM7XX_UART1_IRQ, NPCM7XX_UART2_IRQ, NPCM7XX_UART3_IRQ, +NPCM7XX_PCI_MBOX_IRQ= 8, +NPCM7XX_KCS_HIB_IRQ = 9, +NPCM7XX_GMAC1_IRQ = 14, NPCM7XX_EMC1RX_IRQ = 15, NPCM7XX_EMC1TX_IRQ, NPCM7XX_MMC_IRQ = 26, @@ -706,6 +712,14 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp) } } +/* PCI Mailbox. 
Cannot fail */ +sysbus_realize(SYS_BUS_DEVICE(&s->pci_mbox), &error_abort); +sysbus_mmio_map(SYS_BUS_DEVICE(&s->pci_mbox), 0, NPCM7XX_PCI_MBOX_BA); +sysbus_mmio_map(SYS_BUS_DEVICE(&s->pci_mbox), 1, +NPCM7XX_PCI_MBOX_BA + NPCM7XX_PCI_MBOX_RAM_SIZE); +sysbus_connect_irq(SYS_BUS_DEVICE(&s->pci_mbox), 0, + npcm7xx_irq(s, NPCM7XX_PCI_MBOX_IRQ)); + /* RAM2 (SRAM) */ memory_region_init_ram(&s->sram, OBJECT(dev), "ram2", NPCM7XX_RAM2_SZ, &error_abort); @@ -765,7 +779,6 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp) create_unimplemented_device("npcm7xx.usbd[8]", 0xf0838000, 4 * KiB); create_unimplemented_device("npcm7xx.usbd[9]", 0xf0839000, 4 * KiB); create_unimplemented_device("npcm7xx.sd", 0xf084, 8 * KiB); -create_unimplemented_device("npcm7xx.pcimbx", 0xf0848000, 512 * KiB); create_unimplemented_device("npcm7xx.aes", 0xf0858000, 4 * KiB); create_unimplemented_device("npcm7xx.des", 0xf0859000, 4 * KiB); create_unimplemented_device("npcm7xx.sha", 0xf085a000, 4 * KiB); diff --git a/hw/misc/meson.build b/hw/misc/meson.build index 36c20d5637..0ead2e9ede 100644 --- a/hw/misc/meson.build +++ b/hw/misc/meson.build @@ -73,6 +73,7 @@ system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files( 'npcm7xx_clk.c', 'npcm7xx_gcr.c', 'npcm7xx_mft.c', + 'npcm7xx_pci_mbox.c', 'npcm7xx_pwm.c', 'npcm7xx_rng.c', )) diff --git a/hw/misc/npcm7xx_pci_mbox.c b/hw/misc/npcm7xx_pci_mbox.c new file mode 100644 index 00..c770ad6fcf --- /dev/null +++ b/hw/misc/npcm7xx_pci_mbox.c @@ -0,0 +1,324 @@ +/* + * Nuvoton NPCM7xx PCI Mailbox Module + * + * Copyright 2021 Google LLC + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. 
+ * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. + */ + +#include "qemu/osdep.h" +#include "chardev/char-fe.h" +#include "hw/irq.h" +#include "hw/qdev-clock.h" +#include "hw/qdev-properties-system.h" +#include "hw/misc/npcm7xx_pci_mbox.h" +#include "hw/registerfields.h" +#include "migration/vmstate.h" +#include "qapi/error.h" +#include "qapi/visitor.h" +#include "qemu/bitops.h" +#include "qemu/error-report.h" +#include "qemu/log.h" +#include "qemu/module.h" +#include "qemu/timer.h" +#include "qemu/units.h" +#include "trace.h" + +REG32(NPCM7XX_PCI_MBOX_BMBXSTAT, 0x00); +REG32(NPCM7XX_PCI_MBOX_BMBXCTL, 0x04); +REG32(NPCM7XX_PCI_MBOX_BMBXCMD, 0x08); + +enum NPCM7xxPCIMBoxOperation { +NPCM7XX_PCI_MBOX_OP_READ = 1, +NPCM7XX_PCI_MBOX_OP_WRITE, +}; + +#define NPCM7XX_PCI_MBOX_OFFSET_BYTES 8 + +/* Response code */ +#define NPCM7XX_PCI_MBOX_OK 0 +#define NPCM7XX_PCI_MBOX_INVALID_OP 0xa0 +#define NPCM7XX_PCI_MBOX_INVALID_SIZE 0xa1 +#define
[PATCH v10 00/10] Implementation of NPI Mailbox and GMAC Networking Module
From: Nabih Estefan Diaz

[Changes since v9]
More cleanup and fixes based on suggestions from Peter Maydell (peter.mayd...@linaro.org).

[Changes since v8]
Suggestions and fixes from Peter Maydell (peter.mayd...@linaro.org). Also cleaned up the changes so that nothing added in an earlier patch is deleted in a later patch. The patch count decreased by 1 because this cleanup made one of the patches irrelevant.

[Changes since v7]
Fixed the declaration of the new NIC in patch 4 based on comments by Peter Maydell (peter.mayd...@linaro.org).

[Changes since v6]
Removed the Change-Ids from the commit messages.

[Changes since v5]
Restored some qtests whose removal seems to have been caused by a merge conflict.

[Changes since v4]
Added Signed-off-by tag and fixed the patch 4 commit message as suggested by Peter Maydell (peter.mayd...@linaro.org).

[Changes since v3]
Addressed comments from Hao Wu (wuhao...@google.com).

[Changes since v2]
Fixed bugs related to the RC functionality of the GMAC. Added and squashed patches related to that.

[Changes since v1]
Fixed some errors in formatting. Fixed a merge error that I didn't see in v1. Removed Nuvoton 8xx references since that is a separate patch set.

[Original Cover]
Creates NPI Mailbox Module with data verification for read and write (internal and external), wiring to the Nuvoton SoC, and QTests.

Also creates the GMAC Networking Module. Implements read and write functionalities with corresponding descriptors and registers. Also includes QTests for the different functionalities.
Hao Wu (5):
  hw/misc: Add Nuvoton's PCI Mailbox Module
  hw/arm: Add PCI mailbox module to Nuvoton SoC
  hw/misc: Add qtest for NPCM7xx PCI Mailbox
  hw/net: Add NPCMXXX GMAC device
  hw/arm: Add GMAC devices to NPCM7XX SoC

Nabih Estefan Diaz (5):
  tests/qtest: Creating qtest for GMAC Module
  include/hw/net: GMAC IRQ Implementation
  hw/net: GMAC Rx Implementation
  hw/net: GMAC Tx Implementation
  tests/qtest: Adding PCS Module test to GMAC Qtest

 docs/system/arm/nuvoton.rst         |   2 +
 hw/arm/npcm7xx.c                    |  53 +-
 hw/misc/meson.build                 |   1 +
 hw/misc/npcm7xx_pci_mbox.c          | 324 ++
 hw/misc/trace-events                |   5 +
 hw/net/meson.build                  |   2 +-
 hw/net/npcm_gmac.c                  | 939
 hw/net/trace-events                 |  19 +
 include/hw/arm/npcm7xx.h            |   4 +
 include/hw/misc/npcm7xx_pci_mbox.h  |  81 +++
 include/hw/net/npcm_gmac.h          | 340 ++
 tests/qtest/meson.build             |   2 +
 tests/qtest/npcm7xx_pci_mbox-test.c | 238 +++
 tests/qtest/npcm_gmac-test.c        | 341 ++
 14 files changed, 2347 insertions(+), 4 deletions(-)
 create mode 100644 hw/misc/npcm7xx_pci_mbox.c
 create mode 100644 hw/net/npcm_gmac.c
 create mode 100644 include/hw/misc/npcm7xx_pci_mbox.h
 create mode 100644 include/hw/net/npcm_gmac.h
 create mode 100644 tests/qtest/npcm7xx_pci_mbox-test.c
 create mode 100644 tests/qtest/npcm_gmac-test.c

--
2.43.0.472.g3155946c3a-goog
[PATCH v10 10/10] tests/qtest: Adding PCS Module test to GMAC Qtest
From: Nabih Estefan Diaz - Add PCS Register check to npcm_gmac-test Change-Id: I34821beb5e0b1e89e2be576ab58eabe41545af12 Signed-off-by: Nabih Estefan Reviewed-by: Tyrone Ting --- tests/qtest/npcm_gmac-test.c | 132 +++ 1 file changed, 132 insertions(+) diff --git a/tests/qtest/npcm_gmac-test.c b/tests/qtest/npcm_gmac-test.c index 130a1599a8..b64515794b 100644 --- a/tests/qtest/npcm_gmac-test.c +++ b/tests/qtest/npcm_gmac-test.c @@ -20,6 +20,10 @@ /* Name of the GMAC Device */ #define TYPE_NPCM_GMAC "npcm-gmac" +/* Address of the PCS Module */ +#define PCS_BASE_ADDRESS 0xf078 +#define NPCM_PCS_IND_AC_BA 0x1fe + typedef struct GMACModule { int irq; uint64_t base_addr; @@ -111,6 +115,62 @@ typedef enum NPCMRegister { NPCM_GMAC_PTP_STNSUR = 0x714, NPCM_GMAC_PTP_TAR = 0x718, NPCM_GMAC_PTP_TTSR = 0x71c, + +/* PCS Registers */ +NPCM_PCS_SR_CTL_ID1 = 0x3c0008, +NPCM_PCS_SR_CTL_ID2 = 0x3c000a, +NPCM_PCS_SR_CTL_STS = 0x3c0010, + +NPCM_PCS_SR_MII_CTRL = 0x3e, +NPCM_PCS_SR_MII_STS = 0x3e0002, +NPCM_PCS_SR_MII_DEV_ID1 = 0x3e0004, +NPCM_PCS_SR_MII_DEV_ID2 = 0x3e0006, +NPCM_PCS_SR_MII_AN_ADV = 0x3e0008, +NPCM_PCS_SR_MII_LP_BABL = 0x3e000a, +NPCM_PCS_SR_MII_AN_EXPN = 0x3e000c, +NPCM_PCS_SR_MII_EXT_STS = 0x3e001e, + +NPCM_PCS_SR_TIM_SYNC_ABL = 0x3e0e10, +NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_LWR = 0x3e0e12, +NPCM_PCS_SR_TIM_SYNC_TX_MAX_DLY_UPR = 0x3e0e14, +NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_LWR = 0x3e0e16, +NPCM_PCS_SR_TIM_SYNC_TX_MIN_DLY_UPR = 0x3e0e18, +NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_LWR = 0x3e0e1a, +NPCM_PCS_SR_TIM_SYNC_RX_MAX_DLY_UPR = 0x3e0e1c, +NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_LWR = 0x3e0e1e, +NPCM_PCS_SR_TIM_SYNC_RX_MIN_DLY_UPR = 0x3e0e20, + +NPCM_PCS_VR_MII_MMD_DIG_CTRL1 = 0x3f, +NPCM_PCS_VR_MII_AN_CTRL = 0x3f0002, +NPCM_PCS_VR_MII_AN_INTR_STS = 0x3f0004, +NPCM_PCS_VR_MII_TC = 0x3f0006, +NPCM_PCS_VR_MII_DBG_CTRL = 0x3f000a, +NPCM_PCS_VR_MII_EEE_MCTRL0 = 0x3f000c, +NPCM_PCS_VR_MII_EEE_TXTIMER = 0x3f0010, +NPCM_PCS_VR_MII_EEE_RXTIMER = 0x3f0012, +NPCM_PCS_VR_MII_LINK_TIMER_CTRL = 
0x3f0014, +NPCM_PCS_VR_MII_EEE_MCTRL1 = 0x3f0016, +NPCM_PCS_VR_MII_DIG_STS = 0x3f0020, +NPCM_PCS_VR_MII_ICG_ERRCNT1 = 0x3f0022, +NPCM_PCS_VR_MII_MISC_STS = 0x3f0030, +NPCM_PCS_VR_MII_RX_LSTS = 0x3f0040, +NPCM_PCS_VR_MII_MP_TX_BSTCTRL0 = 0x3f0070, +NPCM_PCS_VR_MII_MP_TX_LVLCTRL0 = 0x3f0074, +NPCM_PCS_VR_MII_MP_TX_GENCTRL0 = 0x3f007a, +NPCM_PCS_VR_MII_MP_TX_GENCTRL1 = 0x3f007c, +NPCM_PCS_VR_MII_MP_TX_STS = 0x3f0090, +NPCM_PCS_VR_MII_MP_RX_GENCTRL0 = 0x3f00b0, +NPCM_PCS_VR_MII_MP_RX_GENCTRL1 = 0x3f00b2, +NPCM_PCS_VR_MII_MP_RX_LOS_CTRL0 = 0x3f00ba, +NPCM_PCS_VR_MII_MP_MPLL_CTRL0 = 0x3f00f0, +NPCM_PCS_VR_MII_MP_MPLL_CTRL1 = 0x3f00f2, +NPCM_PCS_VR_MII_MP_MPLL_STS = 0x3f0110, +NPCM_PCS_VR_MII_MP_MISC_CTRL2 = 0x3f0126, +NPCM_PCS_VR_MII_MP_LVL_CTRL = 0x3f0130, +NPCM_PCS_VR_MII_MP_MISC_CTRL0 = 0x3f0132, +NPCM_PCS_VR_MII_MP_MISC_CTRL1 = 0x3f0134, +NPCM_PCS_VR_MII_DIG_CTRL2 = 0x3f01c2, +NPCM_PCS_VR_MII_DIG_ERRCNT_SEL = 0x3f01c4, } NPCMRegister; static uint32_t gmac_read(QTestState *qts, const GMACModule *mod, @@ -119,6 +179,15 @@ static uint32_t gmac_read(QTestState *qts, const GMACModule *mod, return qtest_readl(qts, mod->base_addr + regno); } +static uint16_t pcs_read(QTestState *qts, const GMACModule *mod, + NPCMRegister regno) +{ +uint32_t write_value = (regno & 0x3ffe00) >> 9; +qtest_writel(qts, PCS_BASE_ADDRESS + NPCM_PCS_IND_AC_BA, write_value); +uint32_t read_offset = regno & 0x1ff; +return qtest_readl(qts, PCS_BASE_ADDRESS + read_offset); +} + /* Check that GMAC registers are reset to default value */ static void test_init(gconstpointer test_data) { @@ -131,6 +200,11 @@ static void test_init(gconstpointer test_data) g_assert_cmphex(gmac_read(qts, mod, (regno)), ==, (value)); \ } while (0) +#define CHECK_REG_PCS(regno, value) \ +do { \ +g_assert_cmphex(pcs_read(qts, mod, (regno)), ==, (value)); \ +} while (0) + CHECK_REG32(NPCM_DMA_BUS_MODE, 0x00020100); CHECK_REG32(NPCM_DMA_XMT_POLL_DEMAND, 0); CHECK_REG32(NPCM_DMA_RCV_POLL_DEMAND, 0); @@ -180,6 +254,64 @@ static void 
test_init(gconstpointer test_data) CHECK_REG32(NPCM_GMAC_PTP_TAR, 0); CHECK_REG32(NPCM_GMAC_PTP_TTSR, 0); +/* TODO Add registers PCS */ +if (mod->base_addr == 0xf0802000) { +CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID1, 0x699e); +CHECK_REG_PCS(NPCM_PCS_SR_CTL_ID2, 0); +CHECK_REG_PCS(NPCM_PCS_SR_CTL_STS, 0x8000); + +CHECK_REG_PCS(NPCM_PCS_SR_MII_CTRL, 0x1140); +CHECK_REG_PCS(NPCM_PCS_SR_MII_STS, 0x0109); +CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID1, 0x699e); +CHECK_REG_PCS(NPCM_PCS_SR_MII_DEV_ID2, 0x0ced0); +
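The `pcs_read()` helper added in this patch reaches PCS registers indirectly: it splits the register number into an upper part, which is written to the indirect access register at `NPCM_PCS_IND_AC_BA`, and a 9-bit offset that is then read relative to the PCS base. The address split alone can be sketched as follows. The `SketchPcsAccess` struct and `sketch_pcs_split()` are hypothetical names introduced here; the masks and shift are the ones used in the patch, and no qtest calls are made.

```c
#include <stdint.h>

/* Hypothetical holder for the two halves of an indirect PCS access. */
typedef struct SketchPcsAccess {
    uint32_t page;   /* value written to the indirect access register */
    uint32_t offset; /* offset read relative to the PCS base address */
} SketchPcsAccess;

/* Split a PCS register number the same way pcs_read() does. */
static SketchPcsAccess sketch_pcs_split(uint32_t regno)
{
    SketchPcsAccess acc;

    acc.page = (regno & 0x3ffe00) >> 9; /* bits 9..21 select the window */
    acc.offset = regno & 0x1ff;         /* low 9 bits address within it */

    return acc;
}
```

For example, `NPCM_PCS_SR_CTL_ID1` (0x3c0008) splits into page 0x1e00 and offset 0x008, and `NPCM_PCS_SR_TIM_SYNC_ABL` (0x3e0e10) into page 0x1f07 and offset 0x010.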
Re: [PATCH v4 00/11] hw/isa/vt82c686: Implement relocation and toggling of SuperI/O functions
On 08/01/2024 20:07, Bernhard Beschow wrote: On 7 January 2024 14:13:44 UTC, Mark Cave-Ayland wrote: On 06/01/2024 21:05, Bernhard Beschow wrote: This series implements relocation of the SuperI/O functions of the VIA south bridges which resolves some FIXME's. It is part of my via-apollo-pro-133t branch [1] which is an extension of bringing the VIA south bridges to the PC machine [2]. This branch is able to run some real-world X86 BIOSes in the hope that it allows us to form a better understanding of the real vt82c686b devices. Implementing relocation and toggling of the SuperI/O functions is one step to make these BIOSes run without error messages, so here we go. The series is structured as follows: Patches 1-3 prepare the TYPE_ISA_FDC, TYPE_ISA_PARALLEL and TYPE_ISA_SERIAL to relocate and toggle (enable/disable) themselves without breaking encapsulation of their respective device states. This is achieved by moving the MemoryRegions and PortioLists from the device states into the encapsulating ISA devices since they will be relocated and toggled. Inspired by the memory API patches 4-6 add two convenience functions to the portio_list API to toggle and relocate portio lists. Patch 5 is a preparation for that which removes some redundancies which otherwise had to be dealt with during relocation. Patches 7-9 implement toggling and relocation for types TYPE_ISA_FDC, TYPE_ISA_PARALLEL and TYPE_ISA_SERIAL. Patch 10 prepares the pegasos2 machine which would end up with all SuperI/O functions disabled if no -bios argument is given. Patch 11 finally implements the main feature which now relies on firmware to configure the SuperI/O functions accordingly (except for pegasos2). 
v4: * Drop incomplete SuperI/O vmstate handling (Zoltan) v3: * Rework various commit messages (Zoltan) * Drop patch "hw/char/serial: Free struct SerialState from MemoryRegion" (Zoltan) * Generalize wording in migration.rst to include portio_list API (Zoltan) v2: * Improve commit messages (Zoltan) * Split pegasos2 from vt82c686 patch (Zoltan) * Avoid poking into device internals (Zoltan) Testing done: * `make check` * `make check-avocado` * Run MorphOS on pegasos2 with and without pegasos2.rom * Run Linux on amigaone * Run real-world BIOSes on via-apollo-pro-133t branch * Start rescue-yl on fuloong2e [1] https://github.com/shentok/qemu/tree/via-apollo-pro-133t [2] https://github.com/shentok/qemu/tree/pc-via Bernhard Beschow (11): hw/block/fdc-isa: Move portio_list from FDCtrl to FDCtrlISABus hw/block/fdc-sysbus: Move iomem from FDCtrl to FDCtrlSysBus hw/char/parallel: Move portio_list from ParallelState to ISAParallelState exec/ioport: Resolve redundant .base attribute in struct MemoryRegionPortio exec/ioport: Add portio_list_set_address() exec/ioport: Add portio_list_set_enabled() hw/block/fdc-isa: Implement relocation and enabling/disabling for TYPE_ISA_FDC hw/char/serial-isa: Implement relocation and enabling/disabling for TYPE_ISA_SERIAL hw/char/parallel-isa: Implement relocation and enabling/disabling for TYPE_ISA_PARALLEL hw/ppc/pegasos2: Let pegasos2 machine configure SuperI/O functions hw/isa/vt82c686: Implement relocation and toggling of SuperI/O functions docs/devel/migration.rst | 6 ++-- hw/block/fdc-internal.h| 4 --- include/exec/ioport.h | 4 ++- include/hw/block/fdc.h | 3 ++ include/hw/char/parallel-isa.h | 5 +++ include/hw/char/parallel.h | 2 -- include/hw/char/serial.h | 2 ++ hw/block/fdc-isa.c | 18 +- hw/block/fdc-sysbus.c | 6 ++-- hw/char/parallel-isa.c | 14 hw/char/parallel.c | 2 +- hw/char/serial-isa.c | 14 hw/isa/vt82c686.c | 66 -- hw/ppc/pegasos2.c | 15 system/ioport.c| 41 + 15 files changed, 172 insertions(+), 30 deletions(-) I think this 
series generally looks good: the only thing I think it's worth checking is whether portio lists are considered exclusive to ISA devices or not? (Paolo?). The modifications preserve the current design, so how is this question related to this series? I was thinking about patches 1 and 3 where the portio_list variable is moved from the core object to the ISA-specific child objects. I'd appreciate feedback from the maintainers indeed since this part hasn't received any comments so far. Thanks :) Agreed. I *think* the portio_lists are ISA-specific as far as QEMU is concerned, but a quick nod from an x86 maintainer would be a great help :) The portio_list_set_enabled() API looks interesting, and could be considered for use by my PCI IDE mode-switching changes too. Apologies I don't have a huge amount of time for review right now, but I wanted to feed back that generally these patches look good, and