Re: [PATCH 0/3] ui/gtk: introducing vc->visible

2024-03-07 Thread Marc-André Lureau
Hi

On Fri, Mar 8, 2024 at 4:59 AM Kim, Dongwon  wrote:
>
> Hi Marc-André,
>
> > -Original Message-
> > From: Marc-André Lureau 
> > Sent: Tuesday, March 5, 2024 4:18 AM
> > To: Kim, Dongwon ; P. Berrange, Daniel
> > 
> > Cc: qemu-devel@nongnu.org
> > Subject: Re: [PATCH 0/3] ui/gtk: introducing vc->visible
> >
> > Hi Kim
> >
> > I am uncomfortable with the series in general.
> >
> > Not only we don't have the means to draw dmabuf/scanout "when required", so
> > resuming drawing won't work until the guest draws (this is already a 
> > problem but
> [Kim, Dongwon] Yes, this is right. The display won't be updated until there 
> is a new frame submitted.
> > you are only making it worse). And I also think reconfiguring the guest by 
> > merely
> > minimizing or switching window/tabs isn't what most users would expect.
> [Kim, Dongwon] I understand your concern. Then what do you suggest I need to 
> do or look into to avoid the situation where the host rendering of the guest 
> frame is scheduled but pending due to tab switching or minimization of window 
> as the guest (virtio-gpu drv) is waiting for the response to move on to the 
> next frame? Do you think the frame update should just continue on the host 
> side regardless of visibility of the window? (If this is the standard 
> expectation, then one of our Linux configuration - Yocto + ICE WM has some 
> bug in it.)
>


Given that we can't pause/resume the drawing, I think it's best to
always draw regardless of the visibility. If GTK doesn't draw when
minimized or hidden, we should find a way to "force" that.


> >
> > (fwiw, my personal opinion is that QEMU shouldn't provide UIs and different
> > clients should be able to implement different behaviours, out of process.. 
> > that
> > makes me relatively less motivated to break things and be responsible)
> >
> > Daniel, could you have a look too?
> >
> > thanks
> >
> > On Fri, Mar 1, 2024 at 4:05 AM Kim, Dongwon 
> > wrote:
> > >
> > > Hi Marc-André Lureau,
> > >
> > > Just a reminder.. I need your help reviewing this patch series. Please
> > > take a look at my messages at
> > > https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg06636.html
> > > and
> > > https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg06637.html
> > >
> > > Thanks!!
> > > DW
> > >
> > > > -Original Message-
> > > > From: qemu-devel-bounces+dongwon.kim=intel@nongnu.org  > > > devel-bounces+dongwon.kim=intel@nongnu.org> On Behalf Of
> > > > dongwon@intel.com
> > > > Sent: Tuesday, January 30, 2024 3:49 PM
> > > > To: qemu-devel@nongnu.org
> > > > Subject: [PATCH 0/3] ui/gtk: introducing vc->visible
> > > >
> > > > From: Dongwon Kim 
> > > >
> > > > Drawing guest display frames can't be completed while the VC is not
> > > > in visible state, which could result in timeout in both the host and
> > > > the guest especially when using blob scanout. Therefore it is needed
> > > > to update and track the visiblity status of the VC and unblock the 
> > > > pipeline in
> > case when VC becomes invisible (e.g.
> > > > windows minimization, switching among tabs) while processing a guest
> > frame.
> > > >
> > > > First patch (0001-ui-gtk-skip...) is introducing a flag 'visible' to
> > > > VirtualConsole struct then set it only if the VC and its window is 
> > > > visible.
> > > >
> > > > Second patch (0002-ui-gtk-set-...) sets the ui size to 0 when VC is
> > > > invisible when the tab is closed or deactivated. This notifies the
> > > > guest that the associated guest display is not active anymore.
> > > >
> > > > Third patch (0003-ui-gtk-reset-visible...) adds a callback for GTK
> > > > window-state- event. The flag, 'visible' is updated based on the 
> > > > minization
> > status of the window.
> > > >
> > > > Dongwon Kim (3):
> > > >   ui/gtk: skip drawing guest scanout when associated VC is invisible
> > > >   ui/gtk: set the ui size to 0 when invisible
> > > >   ui/gtk: reset visible flag when window is minimized
> > > >
> > > >  include/ui/gtk.h |  1 +
> > > >  ui/gtk-egl.c |  8 +++
> > > >  ui/gtk-gl-area.c |  8 +++
> > > >  ui/gtk.c | 62 ++--
> > > >  4 files changed, 77 insertions(+), 2 deletions(-)
> > > >
> > > > --
> > > > 2.34.1
> > > >
> > >
>


-- 
Marc-André Lureau



Re: [PATCH v4 24/25] vfio: Also trace event failures in vfio_save_complete_precopy()

2024-03-07 Thread Eric Auger



On 3/7/24 14:36, Cédric Le Goater wrote:
> On 3/7/24 10:28, Eric Auger wrote:
>>
>>
>> On 3/6/24 14:34, Cédric Le Goater wrote:
>>> vfio_save_complete_precopy() currently returns before doing the trace
>>> event. Change that.
>>>
>>> Signed-off-by: Cédric Le Goater 
>>> ---
>>>   hw/vfio/migration.c | 3 ---
>>>   1 file changed, 3 deletions(-)
>>>
>>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>>> index
>>> bd48f2ee472a5230c2c84bff829dae1e217db33f..c8aeb43b4249ec76ded2542d62792e8c469d5f97
>>>  100644
>>> --- a/hw/vfio/migration.c
>>> +++ b/hw/vfio/migration.c
>>> @@ -580,9 +580,6 @@ static int vfio_save_complete_precopy(QEMUFile
>>> *f, void *opaque)
>>>     qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
>>>   ret = qemu_file_get_error(f);
>>> -    if (ret) {
>>> -    return ret;
>>> -    }
>>>     trace_vfio_save_complete_precopy(vbasedev->name, ret);
>> it is arguable if you want to trace if an error occured. If you want to
>> unconditionally trace the function entry, want don't we put the trace at
>> the beginning of the function?
> 
> But, then, the 'ret' value is not set and the trace event is less useful.
> I'd rather keep it that way.
ah I did not notice the returned value was traced too. OK then

Sorry for the noise

Eric
> 
> Thanks,
> 
> C.
> 




Re: [PATCH v2 2/4] mirror: allow specifying working bitmap

2024-03-07 Thread Markus Armbruster
Fiona Ebner  writes:

> From: John Snow 
>
> for the mirror job. The bitmap's granularity is used as the job's
> granularity.
>
> The new @bitmap parameter is marked unstable in the QAPI and can
> currently only be used for @sync=full mode.
>
> Clusters initially dirty in the bitmap as well as new writes are
> copied to the target.
>
> Using block-dirty-bitmap-clear and block-dirty-bitmap-merge API,
> callers can simulate the three kinds of @BitmapSyncMode (which is used
> by backup):
> 1. always: default, just pass bitmap as working bitmap.
> 2. never: copy bitmap and pass copy to the mirror job.
> 3. on-success: copy bitmap and pass copy to the mirror job and if
>successful, merge bitmap into original afterwards.
>
> When the target image is a fresh "diff image", i.e. one that was not
> used as the target of a previous mirror and the target image's cluster
> size is larger than the bitmap's granularity, or when
> @copy-mode=write-blocking is used, there is a pitfall, because the
> cluster in the target image will be allocated, but not contain all the
> data corresponding to the same region in the source image.
>
> An idea to avoid the limitation would be to mark clusters which are
> affected by unaligned writes and are not allocated in the target image
> dirty, so they would be copied fully later. However, for migration,
> the invariant that an actively synced mirror stays actively synced
> (unless an error happens) is useful, because without that invariant,
> migration might inactivate block devices when mirror still got work
> to do and run into an assertion failure [0].
>
> Another approach would be to read the missing data from the source
> upon unaligned writes to be able to write the full target cluster
> instead.
>
> But certain targets like NBD do not allow querying the cluster size.
> To avoid limiting/breaking the use case of syncing to an existing
> target, which is arguably more common than the diff image use case,
> document the limiation in QAPI.
>
> This patch was originally based on one by Ma Haocong, but it has since
> been modified pretty heavily, first by John and then again by Fiona.
>
> [0]: 
> https://lore.kernel.org/qemu-devel/1db7f571-cb7f-c293-04cc-cd856e060...@proxmox.com/
>
> Suggested-by: Ma Haocong 
> Signed-off-by: Ma Haocong 
> Signed-off-by: John Snow 
> [FG: switch to bdrv_dirty_bitmap_merge_internal]
> Signed-off-by: Fabian Grünbichler 
> Signed-off-by: Thomas Lamprecht 
> [FE: rebase for 9.0
>  get rid of bitmap mode parameter
>  use caller-provided bitmap as working bitmap
>  turn bitmap parameter experimental]
> Signed-off-by: Fiona Ebner 

[...]

> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 59d75b0793..4859fffd48 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -2191,6 +2191,18 @@
>  # destination (all the disk, only the sectors allocated in the
>  # topmost image, or only new I/O).
>  #
> +# @bitmap: The name of a bitmap to use as a working bitmap for
> +# sync=full mode.  This argument must be not be present for other
> +# sync modes and not at the same time as @granularity.  The
> +# bitmap's granularity is used as the job's granularity.  When
> +# the target is a diff image, i.e. one that should only contain
> +# the delta and was not synced to previously, the target's
> +# cluster size must not be larger than the bitmap's granularity.
> +# For a diff image target, using copy-mode=write-blocking should
> +# not be used, because unaligned writes will lead to allocated
> +# clusters with partial data in the target image!  The bitmap
> +# will be enabled after the job finishes.  (Since 9.0)

That's a lot of restrictions and caveats.  Okay as long as the thing
remains experimental, I guess.

> +#
>  # @granularity: granularity of the dirty bitmap, default is 64K if the
>  # image format doesn't have clusters, 4K if the clusters are
>  # smaller than that, else the cluster size.  Must be a power of 2
> @@ -2228,12 +2240,18 @@
>  # disappear from the query list without user intervention.
>  # Defaults to true.  (Since 3.1)
>  #
> +# Features:
> +#
> +# @unstable: Member @bitmap is experimental.
> +#
>  # Since: 1.3
>  ##
>  { 'struct': 'DriveMirror',
>'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
>  '*format': 'str', '*node-name': 'str', '*replaces': 'str',
> -'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
> +'sync': 'MirrorSyncMode',
> +'*bitmap': { 'type': 'str', 'features': [ 'unstable' ] },
> +'*mode': 'NewImageMode',
>  '*speed': 'int', '*granularity': 'uint32',
>  '*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
>  '*on-target-error': 'BlockdevOnError',
> @@ -2513,6 +2531,18 @@
>  # destination (all the disk, only the sectors allocated in the
>  # topmost image, or only new I/O).
>  #
> +# @bitmap: The 

Re: [PATCH v4 20/25] vfio: Add Error** argument to .vfio_save_config() handler

2024-03-07 Thread Eric Auger



On 3/7/24 14:55, Cédric Le Goater wrote:
> On 3/7/24 10:13, Eric Auger wrote:
>>
>>
>> On 3/6/24 14:34, Cédric Le Goater wrote:
>>> Use vmstate_save_state_with_err() to improve error reporting in the
>>> callers and store a reported error under the migration stream. Add
>>> documentation while at it.
>>>
>>> Reviewed-by: Philippe Mathieu-Daudé 
>>> Signed-off-by: Cédric Le Goater 
>>> ---
>>>   include/hw/vfio/vfio-common.h | 25 -
>>>   hw/vfio/migration.c   | 18 --
>>>   hw/vfio/pci.c |  5 +++--
>>>   3 files changed, 39 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/include/hw/vfio/vfio-common.h
>>> b/include/hw/vfio/vfio-common.h
>>> index
>>> b9da6c08ef41174610eb92726c590309a53696a3..46f88493634b5634a9c14a5caa33a463fbf2c50d
>>>  100644
>>> --- a/include/hw/vfio/vfio-common.h
>>> +++ b/include/hw/vfio/vfio-common.h
>>> @@ -133,7 +133,30 @@ struct VFIODeviceOps {
>>>   int (*vfio_hot_reset_multi)(VFIODevice *vdev);
>>>   void (*vfio_eoi)(VFIODevice *vdev);
>>>   Object *(*vfio_get_object)(VFIODevice *vdev);
>>> -    void (*vfio_save_config)(VFIODevice *vdev, QEMUFile *f);
>>> +
>>> +    /**
>>> + * @vfio_save_config
>>> + *
>>> + * Save device config state
>>> + *
>>> + * @vdev: #VFIODevice for which to save the config
>>> + * @f: #QEMUFile where to send the data
>>> + * @errp: pointer to Error*, to store an error if it happens.
>>> + *
>>> + * Returns zero to indicate success and negative for error
>>> + */
>>> +    int (*vfio_save_config)(VFIODevice *vdev, QEMUFile *f, Error
>>> **errp);
>>> +
>>> +    /**
>>> + * @vfio_load_config
>>> + *
>>> + * Load device config state
>>> + *
>>> + * @vdev: #VFIODevice for which to load the config
>>> + * @f: #QEMUFile where to get the data
>>> + *
>>> + * Returns zero to indicate success and negative for error
>>> + */
>>>   int (*vfio_load_config)(VFIODevice *vdev, QEMUFile *f);
>>>   };
>>>   diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>>> index
>>> 71ade14a7942358094371a86c00718f5979113ea..bd48f2ee472a5230c2c84bff829dae1e217db33f
>>>  100644
>>> --- a/hw/vfio/migration.c
>>> +++ b/hw/vfio/migration.c
>>> @@ -190,14 +190,19 @@ static int vfio_load_buffer(QEMUFile *f,
>>> VFIODevice *vbasedev,
>>>   return ret;
>>>   }
>>>   -static int vfio_save_device_config_state(QEMUFile *f, void *opaque)
>>> +static int vfio_save_device_config_state(QEMUFile *f, void *opaque,
>>> + Error **errp)
>>>   {
>>>   VFIODevice *vbasedev = opaque;
>>> +    int ret;
>>>     qemu_put_be64(f, VFIO_MIG_FLAG_DEV_CONFIG_STATE);
>>>     if (vbasedev->ops && vbasedev->ops->vfio_save_config) {
>>> -    vbasedev->ops->vfio_save_config(vbasedev, f);
>>> +    ret = vbasedev->ops->vfio_save_config(vbasedev, f, errp);
>>> +    if (ret) {
>> I am not familiar enough with that case but don't you still want to set
>> the VFIO_MIG_FLAG_END_OF_STATE to "close" the state?
> 
> This is a delimiter used on the target side when loading the state.
> 
> When QEMU fails to capture the device state, the whole migration is marked
> as in error. There is no need to end cleanly the device state, it is bogus
> anyhow.

OK thanks

Eric
> 
> Thanks,
> 
> C.
> 
> 
>>
>> Eric
>>> +    return ret;
>>> +    }
>>>   }
>>>     qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
>>> @@ -587,13 +592,14 @@ static int vfio_save_complete_precopy(QEMUFile
>>> *f, void *opaque)
>>>   static void vfio_save_state(QEMUFile *f, void *opaque)
>>>   {
>>>   VFIODevice *vbasedev = opaque;
>>> +    Error *local_err = NULL;
>>>   int ret;
>>>   -    ret = vfio_save_device_config_state(f, opaque);
>>> +    ret = vfio_save_device_config_state(f, opaque, _err);
>>>   if (ret) {
>>> -    error_report("%s: Failed to save device config space",
>>> - vbasedev->name);
>>> -    qemu_file_set_error(f, ret);
>>> +    error_prepend(_err, "%s: Failed to save device config
>>> space",
>>> +  vbasedev->name);
>>> +    qemu_file_set_error_obj(f, ret, local_err);
>>>   }
>>>   }
>>>   diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>> index
>>> 4fa387f0430d62ca2ba1b5ae5b7037f8f06b33f9..99d86e1d40ef25133fc76ad6e58294b07bd20843
>>>  100644
>>> --- a/hw/vfio/pci.c
>>> +++ b/hw/vfio/pci.c
>>> @@ -2585,11 +2585,12 @@ const VMStateDescription
>>> vmstate_vfio_pci_config = {
>>>   }
>>>   };
>>>   -static void vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f)
>>> +static int vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f,
>>> Error **errp)
>>>   {
>>>   VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice,
>>> vbasedev);
>>>   -    vmstate_save_state(f, _vfio_pci_config, vdev, NULL);
>>> +    return vmstate_save_state_with_err(f, _vfio_pci_config,
>>> vdev, NULL,
>>> +   errp);

Re: [PATCH 1/5] scripts/kernel-doc: teach kdoc about QLIST_ macros

2024-03-07 Thread Peter Xu
On Thu, Mar 07, 2024 at 06:11:01PM +, Alex Bennée wrote:
> The kernel-doc script does some pre-processing on structure
> definitions before parsing for names. Teach it about QLIST and replace
> with simplified structures representing the base type.
> 
> Signed-off-by: Alex Bennée 
> ---
>  scripts/kernel-doc | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> index 240923d509a..26c47562e79 100755
> --- a/scripts/kernel-doc
> +++ b/scripts/kernel-doc
> @@ -1226,7 +1226,14 @@ sub dump_struct($$) {
>   # replace DECLARE_KFIFO_PTR
>   $members =~ s/DECLARE_KFIFO_PTR\s*\(([^,)]+),\s*([^,)]+)\)/$2 \*$1/gos;
>  
> - my $declaration = $members;
> +# QEMU Specific Macros
> +
> +# replace QLIST_ENTRY with base type and variable name
> +$members =~ s/QLIST_ENTRY\(([^)]+)\)\s+([^;]+)/$1 \*$2/gos;
> +# replace QLIST_HEAD, optionally capturing an anonymous struct 
> marker, and capture type and variable name
> +$members =~ s/QLIST_HEAD\(\s*,\s*([^)]+)\)\s+([^;]+)/struct { $1 
> *lh_first; } $2/gos;
> +
> +my $declaration = $members;

May need a "tabify" here..

-- 
Peter Xu




Re: [PATCH v4 16/25] vfio: Add Error** argument to .set_dirty_page_tracking() handler

2024-03-07 Thread Eric Auger



On 3/7/24 13:06, Cédric Le Goater wrote:
> On 3/7/24 09:09, Eric Auger wrote:
>> Hi Cédric,
>>
>> On 3/6/24 14:34, Cédric Le Goater wrote:
>>> We will use the Error object to improve error reporting in the
>>> .log_global*() handlers of VFIO. Add documentation while at it.
>>>
>>> Reviewed-by: Philippe Mathieu-Daudé 
>>> Signed-off-by: Cédric Le Goater 
>>> ---
>>>
>>>   Changes in v3:
>>>
>>>   - Use error_setg_errno() in vfio_legacy_set_dirty_page_tracking()
>>>     include/hw/vfio/vfio-container-base.h | 18 --
>>>   hw/vfio/common.c  |  4 ++--
>>>   hw/vfio/container-base.c  |  4 ++--
>>>   hw/vfio/container.c   |  6 +++---
>>>   4 files changed, 23 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/include/hw/vfio/vfio-container-base.h
>>> b/include/hw/vfio/vfio-container-base.h
>>> index
>>> 3582d5f97a37877b2adfc0d0b06996c82403f8b7..c76984654a596e3016a8cf833e10143eb872e102
>>>  100644
>>> --- a/include/hw/vfio/vfio-container-base.h
>>> +++ b/include/hw/vfio/vfio-container-base.h
>>> @@ -82,7 +82,7 @@ int
>>> vfio_container_add_section_window(VFIOContainerBase *bcontainer,
>>>   void vfio_container_del_section_window(VFIOContainerBase *bcontainer,
>>>  MemoryRegionSection *section);
>>>   int vfio_container_set_dirty_page_tracking(VFIOContainerBase
>>> *bcontainer,
>>> -   bool start);
>>> +   bool start, Error **errp);
>>>   int vfio_container_query_dirty_bitmap(const VFIOContainerBase
>>> *bcontainer,
>>>     VFIOBitmap *vbmap,
>>>     hwaddr iova, hwaddr size);
>>> @@ -121,9 +121,23 @@ struct VFIOIOMMUClass {
>>>   int (*attach_device)(const char *name, VFIODevice *vbasedev,
>>>    AddressSpace *as, Error **errp);
>>>   void (*detach_device)(VFIODevice *vbasedev);
>>> +
>>>   /* migration feature */
>>> +
>>> +    /**
>>> + * @set_dirty_page_tracking
>>> + *
>>> + * Start or stop dirty pages tracking on VFIO container
>>> + *
>>> + * @bcontainer: #VFIOContainerBase on which to de/activate dirty
>>> + *  pages tracking
>> s/pages/page?
> 
> yep
> 
>> for my education is the "#"VFIOContainerBase formalized somewhere?
> 
> It's QEMU specific. See 4cf41794411f ("docs: tweak kernel-doc for QEMU
> coding standards").
OK thank you for the education!
> 
>> + * @start: indicates whether to start or stop dirty pages tracking
>>> + * @errp: pointer to Error*, to store an error if it happens.
>>> + *
>>> + * Returns zero to indicate success and negative for error
>>> + */
>>>   int (*set_dirty_page_tracking)(const VFIOContainerBase
>>> *bcontainer,
>>> -   bool start);
>>> +   bool start, Error **errp);
>>>   int (*query_dirty_bitmap)(const VFIOContainerBase *bcontainer,
>>>     VFIOBitmap *vbmap,
>>>     hwaddr iova, hwaddr size);
>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>> index
>>> 800ba0aeac84b8dcc83b042bb70c37b4bf78d3f4..5598a508399a6c0b3a20ba17311cbe83d84250c5
>>>  100644
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -1085,7 +1085,7 @@ static bool
>>> vfio_listener_log_global_start(MemoryListener *listener,
>>>   if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
>>>   ret = vfio_devices_dma_logging_start(bcontainer);
>>>   } else {
>>> -    ret = vfio_container_set_dirty_page_tracking(bcontainer, true);
>>> +    ret = vfio_container_set_dirty_page_tracking(bcontainer,
>>> true, NULL);
>> It is not obvious why we don't pass errp here. Also there is ana
>> error_report below. Why isn't the error propagated? (not related to your
>> patch though)
> 
> When I started this series, I was trying to find a way to introduce
> progressively the changes and this patch is preparing ground for
> what is coming next. It could be merged with the following if you prefer.
up to you or tweek the commit msg

Eric
> 
> 
> Thanks,
> 
> C.
> 
> 
> 
> 
>>>   }
>>>     if (ret) {
>>> @@ -1105,7 +1105,7 @@ static void
>>> vfio_listener_log_global_stop(MemoryListener *listener)
>>>   if (vfio_devices_all_device_dirty_tracking(bcontainer)) {
>>>   vfio_devices_dma_logging_stop(bcontainer);
>>>   } else {
>>> -    ret = vfio_container_set_dirty_page_tracking(bcontainer,
>>> false);
>>> +    ret = vfio_container_set_dirty_page_tracking(bcontainer,
>>> false, NULL);
>>>   }
>>>     if (ret) {
>>> diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
>>> index
>>> 913ae49077c4f09b7b27517c1231cfbe4befb7fb..7c0764121d24b02b6c4e66e368d7dff78a6d65aa
>>>  100644
>>> --- a/hw/vfio/container-base.c
>>> +++ b/hw/vfio/container-base.c
>>> @@ -53,14 +53,14 @@ void

Re: [PATCH v2 0/4] physmem: Fix MemoryRegion for second access to cached MMIO Address Space

2024-03-07 Thread Peter Xu
On Thu, Mar 07, 2024 at 03:37:06PM +, Jonathan Cameron wrote:
> v2: (Thanks to Peter Xu for reviewing!)
> - New patch 1 to rename addr1 to mr_addr in the interests of meaningful 
> naming.
> - Take advantage of a cached address space only allow for a single MR to 
> simplify
>   the new code.
> - Various cleanups of indentation etc.
> - Cover letter and some patch descriptions updated to reflect changes.
> - Changes all called out in specific patches.

All look good to me, thanks.  Having the new functions' first argument as
MemTxAttrs is slightly weird to me, but that's not a big deal and we can
clean it up later if wanted.  I guess it's good to fix this in 9.0 first as
it's a real bug even if not trivial to hit.

I queued it in my migration tree (with my "memory API" hat..).

I won't send a pull until next Monday.  Please shoot if there's any objections!

-- 
Peter Xu




Re: [PATCH v2 1/4] qapi/block-core: avoid the re-use of MirrorSyncMode for backup

2024-03-07 Thread Markus Armbruster
Fiona Ebner  writes:

> Backup supports all modes listed in MirrorSyncMode, while mirror does
> not. Introduce BackupSyncMode by copying the current MirrorSyncMode
> and drop the variants mirror does not support from MirrorSyncMode as
> well as the corresponding manual check in mirror_start().

Results in tighter introspection: query-qmp-schema no longer reports
drive-mirror and blockdev-mirror accepting @sync values they actually
reject.  Suggest to mention this in the commit message.

> Suggested-by: Vladimir Sementsov-Ogievskiy 
> Signed-off-by: Fiona Ebner 
> ---
>
> I felt like keeping the "Since: X.Y" as before makes the most sense as
> to not lose history. Or is it necessary to change this for
> BackupSyncMode (and its members) since it got a new name?

Doc comments are for users of the QMP interface.  Type names do not
matter there.  I agree with your decision not to update the "since"
tags.

[...]

> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 1874f880a8..59d75b0793 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1304,10 +1304,10 @@
>'data': ['report', 'ignore', 'enospc', 'stop', 'auto'] }
>  
>  ##
> -# @MirrorSyncMode:
> +# @BackupSyncMode:
>  #
> -# An enumeration of possible behaviors for the initial synchronization
> -# phase of storage mirroring.
> +# An enumeration of possible behaviors for image synchronization used
> +# by backup jobs.
>  #
>  # @top: copies data in the topmost image to the destination
>  #
> @@ -1323,7 +1323,7 @@
>  #
>  # Since: 1.3
>  ##
> -{ 'enum': 'MirrorSyncMode',
> +{ 'enum': 'BackupSyncMode',
>'data': ['top', 'full', 'none', 'incremental', 'bitmap'] }
>  
>  ##
> @@ -1347,6 +1347,23 @@
>  { 'enum': 'BitmapSyncMode',
>'data': ['on-success', 'never', 'always'] }
>  
> +##
> +# @MirrorSyncMode:
> +#
> +# An enumeration of possible behaviors for the initial synchronization
> +# phase of storage mirroring.
> +#
> +# @top: copies data in the topmost image to the destination
> +#
> +# @full: copies data from all images to the destination
> +#
> +# @none: only copy data written from now on
> +#
> +# Since: 1.3
> +##
> +{ 'enum': 'MirrorSyncMode',
> +  'data': ['top', 'full', 'none'] }
> +
>  ##
>  # @MirrorCopyMode:
>  #
> @@ -1624,7 +1641,7 @@
>  ##
>  { 'struct': 'BackupCommon',
>'data': { '*job-id': 'str', 'device': 'str',
> -'sync': 'MirrorSyncMode', '*speed': 'int',
> +'sync': 'BackupSyncMode', '*speed': 'int',
>  '*bitmap': 'str', '*bitmap-mode': 'BitmapSyncMode',
>  '*compress': 'bool',
>  '*on-source-error': 'BlockdevOnError',

QAPI schema
Acked-by: Markus Armbruster 




Re: [PATCH v2 14/20] smbios: extend smbios-entry-point-type with 'auto' value

2024-03-07 Thread Markus Armbruster
Igor Mammedov  writes:

> later patches will use it to pick SMBIOS version at runtime
> depending on configuration.
>
> Signed-off-by: Igor Mammedov 
> Acked-by: Markus Armbruster 
> Reviewed-by: Ani Sinha 
> Tested-by: Fiona Ebner 
> ---
>  qapi/machine.json | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/qapi/machine.json b/qapi/machine.json
> index bb5a178909..6bdc71dc8c 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -1797,10 +1797,13 @@
>  #
>  # @64: SMBIOS version 3.0 (64-bit) Entry Point
>  #
> +# @auto: Either 2.x or 3.x SMBIOS version, 2.x if configuration can be
> +#described by it and 3.x otherwise (since: 9.0)

Please format like

   # @auto: Either 2.x or 3.x SMBIOS version, 2.x if configuration can be
   # described by it and 3.x otherwise (since: 9.0)

> +#
>  # Since: 7.0
>  ##
>  { 'enum': 'SmbiosEntryPointType',
> -  'data': [ '32', '64' ] }
> +  'data': [ '32', '64', 'auto' ] }
>  
>  ##
>  # @MemorySizeConfiguration:




Re: Problem with migration/rdma

2024-03-07 Thread Peter Xu
On Fri, Mar 08, 2024 at 07:03:56AM +, Zhijian Li (Fujitsu) wrote:
> 
> 
> On 08/03/2024 14:55, Peter Xu wrote:
> > On Fri, Mar 08, 2024 at 07:27:59AM +0100, Yu Zhang wrote:
> >> Hello Zhijian and Peter,
> >>
> >> Thank you so much for testing and confirming it.
> >> I created a patch in the email format, unfortunately got an issue for
> >> setting up the
> >> "Application-specific Password" in Gmail. It seems that in my gmail
> >> account there
> >> is no option at all for selecting "mail" before creating the
> >> application password.
> >>
> >> As it's tiny, I attached it in this email at this time (not elegant.),
> > 
> > I ququed it, thanks!
> > 
> 
> > -isock->host = rdma->host;
> > +isock->host = g_strdup_printf("%s", rdma->host);
> 
> 
> Peter,
> 
> g_strdup(rdma->host) is enough i guess.

Thanks Zhijian, I'll touch it up.

-- 
Peter Xu




Re: [PATCH v4 11/25] migration: Add Error** argument to .save_setup() handler

2024-03-07 Thread Peter Xu
On Thu, Mar 07, 2024 at 02:39:31PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > I would be glad to have most of this series merged in QEMU 9.0. So,
> > unless there is something major, I will keep that for followups.

Unfortunately I found this series won't apply to master.. starting from
"migration: Always report an error in ram_save_setup()".  Perhaps forgot to
pull before the repost?

It'll also be nice if we can get an ACK for the s390 patch from a
maintainer.

Cedric, would you prefer a repost before this weekend, or we just wait for
9.1?  IMHO we don't need to rush this in 9.0 if it's still partially done,
so the latter option isn't that bad (I've already queued the initial four
irrelevant of that).

Thanks,

-- 
Peter Xu




Re: Problem with migration/rdma

2024-03-07 Thread Zhijian Li (Fujitsu)


On 08/03/2024 14:55, Peter Xu wrote:
> On Fri, Mar 08, 2024 at 07:27:59AM +0100, Yu Zhang wrote:
>> Hello Zhijian and Peter,
>>
>> Thank you so much for testing and confirming it.
>> I created a patch in the email format, unfortunately got an issue for
>> setting up the
>> "Application-specific Password" in Gmail. It seems that in my gmail
>> account there
>> is no option at all for selecting "mail" before creating the
>> application password.
>>
>> As it's tiny, I attached it in this email at this time (not elegant.),
> 
> I ququed it, thanks!
> 

> -isock->host = rdma->host;
> +isock->host = g_strdup_printf("%s", rdma->host);


Peter,

g_strdup(rdma->host) is enough i guess.


Thanks
Zhijian






>  isock->port = g_strdup_printf("%d", rdma->port);


Re: [PATCH v4 07/25] migration: Always report an error in block_save_setup()

2024-03-07 Thread Peter Xu
On Wed, Mar 06, 2024 at 02:34:22PM +0100, Cédric Le Goater wrote:
> @@ -404,6 +403,10 @@ static int init_blk_migration(QEMUFile *f)
>  sectors = bdrv_nb_sectors(bs);
>  if (sectors <= 0) {

Not directly relevant to this patch, but just to mention that this looks
suspicious (even if I know nothing about block migration..) - I am not sure
whether any block drive would return 0 here, if so it looks still like a
problem if we do the cleanup, ignoring the rest and return a success.

>  ret = sectors;
> +if (ret < 0) {
> +error_setg(errp, "Error getting length of block device %s",
> +   bdrv_get_device_name(bs));
> +}
>  bdrv_next_cleanup();
>  goto out;
>  }

-- 
Peter Xu




Re: Problem with migration/rdma

2024-03-07 Thread Peter Xu
On Fri, Mar 08, 2024 at 07:27:59AM +0100, Yu Zhang wrote:
> Hello Zhijian and Peter,
> 
> Thank you so much for testing and confirming it.
> I created a patch in the email format, unfortunately got an issue for
> setting up the
> "Application-specific Password" in Gmail. It seems that in my gmail
> account there
> is no option at all for selecting "mail" before creating the
> application password.
> 
> As it's tiny, I attached it in this email at this time (not elegant.),

I ququed it, thanks!

-- 
Peter Xu




Re: [PATCH v9 0/5] eBPF RSS through QMP support.

2024-03-07 Thread Jason Wang
On Mon, Feb 26, 2024 at 6:23 PM Andrew Melnichenko  wrote:
>
> Hi all,
> Jason, can you please review the patch set, thank you.

Queued.

Thanks




Re: Problem with migration/rdma

2024-03-07 Thread Yu Zhang
Hello Zhijian and Peter,

Thank you so much for testing and confirming it.
I created a patch in the email format, unfortunately got an issue for
setting up the
"Application-specific Password" in Gmail. It seems that in my gmail
account there
is no option at all for selecting "mail" before creating the
application password.

As it's tiny, I attached it in this email at this time (not elegant.),
so that it can get
included before the soft freezing.

Really sorry for this inconvenience.
--
>From c9fb6a6debfbd5e103aa90f30e9a028316449104 Mon Sep 17 00:00:00 2001
From: Yu Zhang 
Date: Wed, 6 Mar 2024 09:06:54 +0100
Subject: [PATCH] migration/rdma: Fix a memory issue for migration

In commit 3fa9642ff7 change was made to convert the RDMA backend to
accept MigrateAddress struct. However, the assignment of "host" leads
to data corruption on the target host and the failure of migration.

isock->host = rdma->host;

By allocating the memory explicitly for it with g_strdup_printf(), the
issue is fixed and the migration doesn't fail any more.

Fixes: 3fa9642ff7 ("migration: convert rdma backend to accept MigrateAddress")
Cc: qemu-stable 
Cc: Li Zhijian 
Cc: Peter Xu 
Signed-off-by: Yu Zhang 
---
 migration/rdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index a355dcea89..d6abe718b5 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3357,7 +3357,7 @@ static int qemu_rdma_accept(RDMAContext *rdma)
 goto err_rdma_dest_wait;
 }

-isock->host = rdma->host;
+isock->host = g_strdup_printf("%s", rdma->host);
 isock->port = g_strdup_printf("%d", rdma->port);

 /*
-- 
2.25.1
--

Best regards,
Yu Zhang @ IONOS Compute Platform
08.03.2024

On Thu, Mar 7, 2024 at 4:37 AM Peter Xu  wrote:
>
> On Thu, Mar 07, 2024 at 02:41:37AM +, Zhijian Li (Fujitsu) via wrote:
> > Yu,
> >
> >
> > On 07/03/2024 00:30, Philippe Mathieu-Daudé wrote:
> > > Cc'ing RDMA migration reviewers/maintainers:
> > >
> > > $ ./scripts/get_maintainer.pl -f migration/rdma.c
> > > Li Zhijian  (reviewer:RDMA Migration)
> > > Peter Xu  (maintainer:Migration)
> > > Fabiano Rosas  (maintainer:Migration)
> > >
> > > On 5/3/24 22:32, Yu Zhang wrote:
> > >> Hello Het and all,
> > >>
> > >> while I was testing qemu-8.2, I saw a lot of our migration test cases 
> > >> failed.
> > >> After debugging the commits of the 8.2 branch, I saw the issue and mad a 
> > >> diff:
> > >>
> > >> diff --git a/migration/rdma.c b/migration/rdma.c
> > >> index 6a29e53daf..f10d56f556 100644
> > >> --- a/migration/rdma.c
> > >> +++ b/migration/rdma.c
> > >> @@ -3353,9 +3353,9 @@ static int qemu_rdma_accept(RDMAContext *rdma)
> > >>   goto err_rdma_dest_wait;
> > >>   }
> > >>
> > >> -isock->host = rdma->host;
> > >> +isock->host = g_strdup_printf("%s", rdma->host);
> > >>   isock->port = g_strdup_printf("%d", rdma->port);
> >
> >
> > Thanks for your analysis.
> >
> > It will be great if you send this as a patch.
> >
> >
> > isock is defined as a _autoptr VVV
> >  _autoptr(InetSocketAddress) isock = g_new0(InetSocketAddress, 1);
> >
> > I'm surprised that it seems the auto free scheme will free the member of 
> > isock as well
> > see below valrind log. That will cause a double free.
>
> Right, all the QAPI-free is a deep one.  Thanks for checking this up,
> Zhijian.
>
> Yu, would you please send a formal patch (better before this week ends) so
> that I can include it for the last pull for 9.0 soft-freeze (March 12th)?
> As 8.2 affected, please also attach proper tags:
>
> Cc: qemu-stable 
> Fixes: 3fa9642ff7 ("migration: convert rdma backend to accept MigrateAddress")
>
> >
> > ==809138== Invalid free() / delete / delete[] / realloc()
> > ==809138==at 0x483A9F5: free (vg_replace_malloc.c:538)
> > ==809138==by 0x598F70C: g_free (in /usr/lib64/libglib-2.0.so.0.6600.8)
> > ==809138==by 0x79B6AD: qemu_rdma_cleanup (rdma.c:2432)
> > ==809138==by 0x79CEE6: qio_channel_rdma_close_rcu (rdma.c:3108)
> > ==809138==by 0xC2E339: call_rcu_thread (rcu.c:301)
> > ==809138==by 0xC2116A: qemu_thread_start (qemu-thread-posix.c:541)
> > ==809138==by 0x72683F8: ??? (in /usr/lib64/libpthread-2.32.so)
> > ==809138==by 0x73824C2: clone (in /usr/lib64/libc-2.32.so)
> > ==809138==  Address 0x13daa070 is 0 bytes inside a block of size 14 free'd
> > ==809138==at 0x483A9F5: free (vg_replace_malloc.c:538)
> > ==809138==by 0x598F70C: g_free (in /usr/lib64/libglib-2.0.so.0.6600.8)
> > ==809138==by 0xC058CF: qapi_dealloc_type_str (qapi-dealloc-visitor.c:68)
> > ==809138==by 0xC09EF3: visit_type_str (qapi-visit-core.c:349)
> > ==809138==by 0xBDDECC: visit_type_InetSocketAddressBase_members 
> > (qapi-visit-sockets.c:29)
> > ==809138==by 0xBDE055: visit_type_InetSocketAddress_members 
> > (qapi-visit-sockets.c:67)
> > ==809138==by 0xBDE30D: visit_type_InetSocketAddress 
> > (qapi-visit-sockets.c:119)
> > ==809138== 

Re: [PATCH v2] docs/acpi/bits: add some clarity and details while also improving formating

2024-03-07 Thread Michael Tokarev

08.03.2024 07:22, Ani Sinha :

Update bios-bits docs to add more details on why a pre-OS environment for
testing bioses is useful. Add author's FOSDEM talk link. Also improve the
formating of the document while at it.

CC: qemu-triv...@nongnu.org
Signed-off-by: Ani Sinha 


Reviewed-by: Michael Tokarev 

and applied to qemu-trivial.  Thank you!

/mjt



Re: [PATCH 0/3] cxl: Fix issues with g_steal_pointer()

2024-03-07 Thread Michael Tokarev

04.03.2024 13:44, Thomas Huth wrote:

When setting GLIB_VERSION_MAX_ALLOWED to GLIB_VERSION_2_58 or higher
(which we'll certainly do in the not too distant future), glib adds
type safety checks to the g_steal_pointer() macro. This triggers
errors in the cxl code since the pointer types do not always match
here. Let's fix those errors now so we can switch to a newer version
of the glib in a future version of QEMU.


Picked up for qemu-trivial, thank you!

/mjt



Re: [PATCH 0/4] hw/i386/pc: Trivial cleanups

2024-03-07 Thread Michael Tokarev

07.03.2024 21:57, Bernhard Beschow :

Am 1. März 2024 18:59:32 UTC schrieb "Philippe Mathieu-Daudé" 
:

Trivial cleanups, mostly around the 'isapc' machine.

Philippe Mathieu-Daudé (4):
  hw/i386/pc: Remove pc_compat_1_4..1.7[] left over declarations
  hw/i386/pc: Use generated NotifyVmexitOption_str()
  hw/i386/pc: Remove 'host_type' argument from pc_init1()
  hw/i386/pc: Have pc_init_isa() pass a NULL pci_type argument


Ping. Will this series make it into 9.0? AFAICS all patches are reviewed.


Philippe, are you submitting this through you misc tree, or should I
pick it for trivial-patches?  I'm often a bit in doubt about patches
you Cc to qemu-trivial@, - a few times we both tried to submit them.

/mjt



[PATCH] configure: Fix error message when C compiler is not working

2024-03-07 Thread Thomas Huth
If you try to run the configure script on a system without a working
C compiler, you get a very misleading error message:

 ERROR: Unrecognized host OS (uname -s reports 'Linux')

We should rather tell the user that we were not able to use the C
compiler instead, otherwise they will have a hard time to figure
out what was going wrong.

Fixes: 264b803721 ("configure: remove compiler sanity check")
Signed-off-by: Thomas Huth 
---
 configure | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index 3cd736b139..a036923dee 100755
--- a/configure
+++ b/configure
@@ -411,7 +411,7 @@ else
   # Using uname is really broken, but it is just a fallback for architectures
   # that are going to use TCI anyway
   cpu=$(uname -m)
-  echo "WARNING: unrecognized host CPU, proceeding with 'uname -m' output 
'$cpu'"
+  echo "WARNING: could not determine host CPU, proceeding with 'uname -m' 
output '$cpu'"
 fi
 
 # Normalise host CPU name to the values used by Meson cross files and in source
@@ -1000,10 +1000,12 @@ if test -z "$ninja"; then
 fi
 
 if test "$host_os" = "bogus"; then
-# Now that we know that we're not printing the help and that
-# the compiler works (so the results of the check_defines we used
-# to identify the OS are reliable), if we didn't recognize the
-# host OS we should stop now.
+# Now that we know that we're not printing the help, we should stop now
+# if we didn't recognize the host OS (or the C compiler is not working).
+write_c_skeleton;
+if ! compile_object ; then
+error_exit "C compiler \"$cc\" is not usable"
+fi
 error_exit "Unrecognized host OS (uname -s reports '$(uname -s)')"
 fi
 
-- 
2.44.0




Re: [PATCH v2 2/2] kvm: add support for guest physical bits

2024-03-07 Thread Xiaoyao Li

On 3/5/2024 6:52 PM, Gerd Hoffmann wrote:

Query kvm for supported guest physical address bits, in cpuid
function 8008, eax[23:16].  Usually this is identical to host
physical address bits.  With NPT or EPT being used this might be
restricted to 48 (max 4-level paging address space size) even if
the host cpu supports more physical address bits.

When set pass this to the guest, using cpuid too.  Guest firmware
can use this to figure how big the usable guest physical address
space is, so PCI bar mapping are actually reachable.

Signed-off-by: Gerd Hoffmann 
---
  target/i386/cpu.h |  1 +
  target/i386/cpu.c |  1 +
  target/i386/kvm/kvm.c | 17 +
  3 files changed, 19 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 952174bb6f52..d427218827f6 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2026,6 +2026,7 @@ struct ArchCPU {
  
  /* Number of physical address bits supported */

  uint32_t phys_bits;
+uint32_t guest_phys_bits;
  
  /* in order to simplify APIC support, we leave this pointer to the

 user */
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 2666ef380891..1a6cfc75951e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6570,6 +6570,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
  if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
  /* 64 bit processor */
   *eax |= (cpu_x86_virtual_addr_width(env) << 8);
+ *eax |= (cpu->guest_phys_bits << 16);
  }
  *ebx = env->features[FEAT_8000_0008_EBX];
  if (cs->nr_cores * cs->nr_threads > 1) {
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 7298822cb511..ce22dfcaa661 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -238,6 +238,15 @@ static int kvm_get_tsc(CPUState *cs)
  return 0;
  }
  
+/* return cpuid fn 8000_0008 eax[23:16] aka GuestPhysBits */

+static int kvm_get_guest_phys_bits(KVMState *s)
+{
+uint32_t eax;
+
+eax = kvm_arch_get_supported_cpuid(s, 0x8008, 0, R_EAX);
+return (eax >> 16) & 0xff;
+}
+
  static inline void do_kvm_synchronize_tsc(CPUState *cpu, run_on_cpu_data arg)
  {
  kvm_get_tsc(cpu);
@@ -1730,6 +1739,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
  X86CPU *cpu = X86_CPU(cs);
  CPUX86State *env = >env;
  uint32_t limit, i, j, cpuid_i;
+uint32_t guest_phys_bits;
  uint32_t unused;
  struct kvm_cpuid_entry2 *c;
  uint32_t signature[3];
@@ -1765,6 +1775,13 @@ int kvm_arch_init_vcpu(CPUState *cs)
  
  env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
  
+guest_phys_bits = kvm_get_guest_phys_bits(cs->kvm_state);

+if (guest_phys_bits &&
+(cpu->guest_phys_bits == 0 ||
+ cpu->guest_phys_bits > guest_phys_bits)) {
+cpu->guest_phys_bits = guest_phys_bits;
+}


suggest to move this added code block to kvm_cpu_realizefn().

Main code in kvm_arch_init_vcpu() is to consume the data in cpu or 
env->features to configure KVM specific configuration via KVM ioctls.


The preparation work of initializing the required data usually not 
happen in kvm_arch_init_vcpu();



  /*
   * kvm_hyperv_expand_features() is called here for the second time in case
   * KVM_CAP_SYS_HYPERV_CPUID is not supported. While we can't possibly 
handle





Re: [PATCH 1/5] target/sparc/cpu: Rename the CPU models with a "+" in their names

2024-03-07 Thread Thomas Huth

On 07/03/2024 22.22, Richard Henderson wrote:

On 3/7/24 07:43, Thomas Huth wrote:

+    /* Fix up legacy names with '+' in it */
+    if (g_str_equal(typename, SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IV+"))) {
+    g_free(typename);
+    typename = g_strdup(SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IVp"));
+    } else if (g_str_equal(typename, 
SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IIIi+"))) {

+    g_free(typename);
+    typename = g_strdup(SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IIIip"));
+    }
+


Legacy names don't include dashes.


This check is done after sparc_cpu_type_name() has been called, which 
transforms the spaces into dashes, so we need the dashes here.
(otherwise I'd need to check for all "valid" combinations, like 
"Sun-UltraSparc-IV+", "Sun UltraSparc-IV+", "Sun-UltraSparc IV+" and "Sun 
UltraSparc IV+").


 Thomas





[PATCH v2] docs/acpi/bits: add some clarity and details while also improving formating

2024-03-07 Thread Ani Sinha
Update bios-bits docs to add more details on why a pre-OS environment for
testing bioses is useful. Add author's FOSDEM talk link. Also improve the
formating of the document while at it.

CC: qemu-triv...@nongnu.org
Signed-off-by: Ani Sinha 
---
 docs/devel/acpi-bits.rst | 55 
 1 file changed, 39 insertions(+), 16 deletions(-)

changelog:
v2: commit message improvement.

diff --git a/docs/devel/acpi-bits.rst b/docs/devel/acpi-bits.rst
index 9677b0098f..1ec394f5fb 100644
--- a/docs/devel/acpi-bits.rst
+++ b/docs/devel/acpi-bits.rst
@@ -1,26 +1,48 @@
 =
 ACPI/SMBIOS avocado tests using biosbits
 =
-
+
+Introduction
+
 Biosbits is a software written by Josh Triplett that can be downloaded
 from https://biosbits.org/. The github codebase can be found
-`here `__. It is a software that 
executes
-the bios components such as acpi and smbios tables directly through acpica
-bios interpreter (a freely available C based library written by Intel,
+`here `__. It is a software that
+executes the bios components such as acpi and smbios tables directly through
+acpica bios interpreter (a freely available C based library written by Intel,
 downloadable from https://acpica.org/ and is included with biosbits) without an
-operating system getting involved in between.
+operating system getting involved in between. Bios-bits has python integration
+with grub so actual routines that executes bios components can be written in
+python instead of bash-ish (grub's native scripting language).
 There are several advantages to directly testing the bios in a real physical
-machine or VM as opposed to indirectly discovering bios issues through the
-operating system. For one thing, the OSes tend to hide bios problems from the
-end user. The other is that we have more control of what we wanted to test
-and how by directly using acpica interpreter on top of the bios on a running
-system. More details on the inspiration for developing biosbits and its real
-life uses can be found in [#a]_ and [#b]_.
+machine or in a VM as opposed to indirectly discovering bios issues through the
+operating system (the OS). Operating systems tend to bypass bios problems and
+hide them from the end user. We have more control of what we wanted to test and
+how by being as close to the bios on a running system as possible without a
+complicated software component such as an operating system coming in between.
+Another issue is that we cannot exercise bios components such as ACPI and
+SMBIOS without being in the highest hardware privilege level, ring 0 for
+example in case of x86. Since the OS executes from ring 0 whereas normal user
+land software resides in unprivileged ring 3, operating system must be modified
+in order to write our test routines that exercise and test the bios. This is
+not possible in all cases. Lastly, test frameworks and routines are preferably
+written using a high level scripting language such as python. OSes and
+OS modules are generally written using low level languages such as C and
+low level assembly machine language. Writing test routines in a low level
+language makes things more cumbersome. These and other reasons makes using
+bios-bits very attractive for testing bioses. More details on the inspiration
+for developing biosbits and its real life uses can be found in [#a]_ and [#b]_.
+
 For QEMU, we maintain a fork of bios bits in gitlab along with all the
-dependent submodules here: https://gitlab.com/qemu-project/biosbits-bits
+dependent submodules `here `__.
 This fork contains numerous fixes, a newer acpica and changes specific to
 running this avocado QEMU tests using bits. The author of this document
-is the sole maintainer of the QEMU fork of bios bits repo.
+is the sole maintainer of the QEMU fork of bios bits repository. For more
+information, please see author's `FOSDEM talk on this bios-bits based test
+framework 
`__.
+
+*
+Description of the test framework
+*
 
 Under the directory ``tests/avocado/``, ``acpi-bits.py`` is a QEMU avocado
 test that drives all this.
@@ -120,8 +142,9 @@ Under ``tests/avocado/`` as the root we have:
(b) Add a SPDX license header.
(c) Perform modifications to the test.
 
-   Commits (a), (b) and (c) should go under separate commits so that the 
original
-   test script and the changes we have made are separated and clear.
+   Commits (a), (b) and (c) preferably should go under separate commits so that
+   the original test 

Re: [PATCH 2/2] linux-user/riscv: Sync hwprobe keys with Linux

2024-03-07 Thread Alistair Francis
On Wed, Feb 7, 2024 at 10:00 PM Christoph Müllner
 wrote:
>
> Upstream Linux recently added many additional keys to the hwprobe API.
> This patch adds support for all of them with the exception of Ztso,
> which is currently not supported in QEMU.
>
> Signed-off-by: Christoph Müllner 
> ---
>  linux-user/syscall.c | 98 
>  1 file changed, 91 insertions(+), 7 deletions(-)
>
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 43467c9707..3ba20f99ad 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -8793,13 +8793,41 @@ static int do_getdents64(abi_long dirfd, abi_long 
> arg2, abi_long count)
>  #define RISCV_HWPROBE_BASE_BEHAVIOR_IMA (1 << 0)
>
>  #define RISCV_HWPROBE_KEY_IMA_EXT_0 4
> -#define RISCV_HWPROBE_IMA_FD   (1 << 0)
> -#define RISCV_HWPROBE_IMA_C(1 << 1)
> -#define RISCV_HWPROBE_IMA_V(1 << 2)
> -#define RISCV_HWPROBE_EXT_ZBA  (1 << 3)
> -#define RISCV_HWPROBE_EXT_ZBB  (1 << 4)
> -#define RISCV_HWPROBE_EXT_ZBS  (1 << 5)
> -#define RISCV_HWPROBE_EXT_ZICBOZ   (1 << 6)
> +#defineRISCV_HWPROBE_IMA_FD(1 << 0)
> +#defineRISCV_HWPROBE_IMA_C (1 << 1)
> +#defineRISCV_HWPROBE_IMA_V (1 << 2)
> +#defineRISCV_HWPROBE_EXT_ZBA   (1 << 3)
> +#defineRISCV_HWPROBE_EXT_ZBB   (1 << 4)
> +#defineRISCV_HWPROBE_EXT_ZBS   (1 << 5)
> +#defineRISCV_HWPROBE_EXT_ZICBOZ(1 << 6)
> +#defineRISCV_HWPROBE_EXT_ZBC   (1 << 7)
> +#defineRISCV_HWPROBE_EXT_ZBKB  (1 << 8)
> +#defineRISCV_HWPROBE_EXT_ZBKC  (1 << 9)
> +#defineRISCV_HWPROBE_EXT_ZBKX  (1 << 10)
> +#defineRISCV_HWPROBE_EXT_ZKND  (1 << 11)
> +#defineRISCV_HWPROBE_EXT_ZKNE  (1 << 12)
> +#defineRISCV_HWPROBE_EXT_ZKNH  (1 << 13)
> +#defineRISCV_HWPROBE_EXT_ZKSED (1 << 14)
> +#defineRISCV_HWPROBE_EXT_ZKSH  (1 << 15)
> +#defineRISCV_HWPROBE_EXT_ZKT   (1 << 16)
> +#defineRISCV_HWPROBE_EXT_ZVBB  (1 << 17)
> +#defineRISCV_HWPROBE_EXT_ZVBC  (1 << 18)
> +#defineRISCV_HWPROBE_EXT_ZVKB  (1 << 19)
> +#defineRISCV_HWPROBE_EXT_ZVKG  (1 << 20)
> +#defineRISCV_HWPROBE_EXT_ZVKNED(1 << 21)
> +#defineRISCV_HWPROBE_EXT_ZVKNHA(1 << 22)
> +#defineRISCV_HWPROBE_EXT_ZVKNHB(1 << 23)
> +#defineRISCV_HWPROBE_EXT_ZVKSED(1 << 24)
> +#defineRISCV_HWPROBE_EXT_ZVKSH (1 << 25)
> +#defineRISCV_HWPROBE_EXT_ZVKT  (1 << 26)
> +#defineRISCV_HWPROBE_EXT_ZFH   (1 << 27)
> +#defineRISCV_HWPROBE_EXT_ZFHMIN(1 << 28)
> +#defineRISCV_HWPROBE_EXT_ZIHINTNTL (1 << 29)
> +#defineRISCV_HWPROBE_EXT_ZVFH  (1 << 30)
> +#defineRISCV_HWPROBE_EXT_ZVFHMIN   (1 << 31)
> +#defineRISCV_HWPROBE_EXT_ZFA   (1ULL << 32)
> +#defineRISCV_HWPROBE_EXT_ZACAS (1ULL << 34)
> +#defineRISCV_HWPROBE_EXT_ZICOND(1ULL << 35)

This fails to pass checkpatch

Alistair

>
>  #define RISCV_HWPROBE_KEY_CPUPERF_0 5
>  #define RISCV_HWPROBE_MISALIGNED_UNKNOWN (0 << 0)
> @@ -8860,6 +,62 @@ static void risc_hwprobe_fill_pairs(CPURISCVState *env,
>   RISCV_HWPROBE_EXT_ZBS : 0;
>  value |= cfg->ext_zicboz ?
>   RISCV_HWPROBE_EXT_ZICBOZ : 0;
> +value |= cfg->ext_zbc ?
> + RISCV_HWPROBE_EXT_ZBC : 0;
> +value |= cfg->ext_zbkb ?
> + RISCV_HWPROBE_EXT_ZBKB : 0;
> +value |= cfg->ext_zbkc ?
> + RISCV_HWPROBE_EXT_ZBKC : 0;
> +value |= cfg->ext_zbkx ?
> + RISCV_HWPROBE_EXT_ZBKX : 0;
> +value |= cfg->ext_zknd ?
> + RISCV_HWPROBE_EXT_ZKND : 0;
> +value |= cfg->ext_zkne ?
> + RISCV_HWPROBE_EXT_ZKNE : 0;
> +value |= cfg->ext_zknh ?
> + RISCV_HWPROBE_EXT_ZKNH : 0;
> +value |= cfg->ext_zksed ?
> + RISCV_HWPROBE_EXT_ZKSED : 0;
> +value |= cfg->ext_zksh ?
> + RISCV_HWPROBE_EXT_ZKSH : 0;
> +value |= cfg->ext_zkt ?
> + RISCV_HWPROBE_EXT_ZKT : 0;
> +value |= cfg->ext_zvbb ?
> + RISCV_HWPROBE_EXT_ZVBB : 0;
> +value |= cfg->ext_zvbc ?
> + RISCV_HWPROBE_EXT_ZVBC : 0;
> +  

Re: [PULL 0/4] machine development tool

2024-03-07 Thread Peter Xu
On Thu, Mar 07, 2024 at 12:06:59PM +0300, Maksim Davydov wrote:
> 
> On 3/6/24 04:57, Peter Xu wrote:
> > On Tue, Mar 05, 2024 at 03:43:41PM +0100, Markus Armbruster wrote:
> > > Peter Maydell  writes:
> > > 
> > > > On Mon, 4 Mar 2024 at 13:52, Maksim Davydov 
> > > >  wrote:
> > > > > The following changes since commit 
> > > > > e1007b6bab5cf97705bf4f2aaec1f607787355b8:
> > > > > 
> > > > >Merge tag 'pull-request-2024-03-01' 
> > > > > ofhttps://gitlab.com/thuth/qemu  into staging (2024-03-01 10:14:32 
> > > > > +)
> > > > > 
> > > > > are available in the Git repository at:
> > > > > 
> > > > >https://gitlab.com/davydov-max/qemu.git  
> > > > > tags/pull-compare-mt-2024-03-04
> > > > > 
> > > > > for you to fetch changes up to 
> > > > > 7693a2e8518811a907d73a85807ee71dac8fabcb:
> > > > > 
> > > > >scripts: add script to compare compatibility properties 
> > > > > (2024-03-04 14:10:53 +0300)
> > > > > 
> > > > > 
> > > > > Please note. This is the first pull request from me.
> > > > > My public GPG key is available here
> > > > > https://keys.openpgp.org/vks/v1/by-fingerprint/CDB5BEEF8837142579F5CDFE8E927E10F72F78D4
> > > > > 
> > > > > 
> > > > > scripts: add a new script for machine development
> > > > > 
> > > > > 
> > > > Hi; I would prefer this to go through some existing submaintainer
> > > > tree if possible, please.
> > > Migration?  QOM?  Not sure.  Cc'ing the maintainers anyway.
> > Yeah this seems like migration relevant.. however now I'm slightly confused
> > on when this script should be useful.
> > 
> > According to:
> > 
> > https://lore.kernel.org/qemu-devel/20240222153912.646053-5-davydov-...@yandex-team.ru/
> > 
> >  This script runs QEMU to obtain compat_props of machines and
> >  default values of different types of drivers to produce comparison
> >  table. This table can be used to compare machine types to choose
> >  the most suitable machine or compare binaries to be sure that
> >  migration to the newer version will save all device
> >  properties. Also the json or csv format of this table can be used
> >  to check does a new machine affect the previous ones by comparing
> >  tables with and without the new machine.
> > 
> > In regards to "choose the most suitable machine": why do you need to choose
> > a machine?
> > 
> > If it's pretty standalone setup, shouldn't we always try to use the latest
> > machine type if possible (as normally compat props are only used to keep
> > compatible with old machine types, and the default should always be
> > preferred). If it's a cluster setup, IMHO it should depend on the oldest
> > QEMU version that plans to be supported.  I don't see how such comparison
> > helps yet in either of the cases..
> > 
> > In regards to "compare binaries to be sure that migration to the newer
> > version will save all device properties": do we even support migrating
> > _between_ machine types??
> > 
> > Sololy relying on compat properties to detect device compatibility is also
> > not reliable.  For example, see VMStateField.field_exists() or similarly,
> > VMStateDescription.needed(), which are hooks that device can provide to
> > dynamically decide what device state to be saved/loaded.  Such things are
> > not reflected in compat properties, so even if compat properties of all
> > devices are the same between two machine types, it may not mean that the
> > migration stream will always be compatible.
> > 
> > Thanks,
> 
> In fact, the last commit describes the meaning of this series best. Perhaps
> it should have been moved to the cover letter:
> Often, many teams have their own "machines" inherited from "upstream" to
> manage default values of devices. This is done because of the limitations
> imposed by the compatibility requirements or the desire to help their
> customers better configure their devices. And since machine type has
> a hard-to-read structure, it is very easy to make a mistake when
> transferring
> default values from one machine to another. For example, when some property
> is set for the entire abstract class x86_64-cpu (which will be applied to
> all
> models), and then rolled back for a specific model. The situation is about
> the same with changing the default values of devices: if the value changes
> in the description of the device itself, then you need to make sure that
> nothing changes when using the current machine.
> Therefore, there was a desire to make a dev tool that will help quickly
> expand
> the definition of a machine or compare several machines from different
> binary
> files. It can be used when rebasing to a new version of qemu and/or for
> automatic tests.

OK, thanks.

So is it a migration compatibility issue that you care (migrating VMs from
your 

Re: [PATCH] input-linux: Add option to not grab a device upon guest startup

2024-03-07 Thread Justinien Bouron
> This last two lines doesn't make sense to me. Isn't the grab
> toggling entirely in control of the QEMU process, regardless
> of what state the guest is at ?

Actually, you're right, they do not make sense. This issue of having the guest
taking a while to start and the toggle keys not working, only seem to appear
when running the VM under libvirt. I was not able to reproduce this issue when
running qemu directly from the command line. So either this is a libvirt issue
or something related to my setup (VFIO with GPU passthrough, so a lot can go
wrong).

Should I send a new version of the patch with an updated commit message that
does not mention this issue?

Regards,
Justinien



Re: [PATCH v7 3/9] target/riscv: remove 'over' brconds from vector trans

2024-03-07 Thread LIU Zhiwei

Hi Daniel and Alistair,

Hope it is not too late. I think there are two bugs in this patch.

1) The first is for instruction vfmv.s.f.  vfmv.s.f doesn't use helper 
function. If we remove the over check, it will set the first element of 
destination vector register, which is against the specification. 
According to the riscv-v-specification, 16.2. Floating-Point Scalar Move 
Instructions,


"If vstart ≥ vl, no operation is performed and the destination register is not 
updated".

2) The second is for vector instruction with helper functions. we should 
not change any elements including the tail elements when vstart >=vl. 
But this patch break this behavior. According to the 
riscv-v-specification,  5.4. Prestart, Active, Inactive, Body, and Tail 
Element Denitions,


"When vstart ≥ vl, there are no body elements, and no elements are updated in 
any destination vector register group,
including that no tail elements are updated with agnostic values."

I will review this patch set in more details later.

Thanks,
Zhiwei

On 2024/3/7 1:19, Daniel Henrique Barboza wrote:

Most of the vector translations has this following pattern at the start:

 TCGLabel *over = gen_new_label();
 tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);

And then right at the end:

  gen_set_label(over);
  return true;

This means that if vstart >= vl we'll not set vstart = 0 at the end of
the insns - this is done inside the helper that is being skipped.  The
reason why this pattern hasn't been a bigger problem is because the
conditional vstart >= vl is very rare.

Checking all the helpers in vector_helper.c we see all of them with a
pattern like this:

 for (i = env->vstart; i < vl; i++) {
 (...)
 }
 env->vstart = 0;

Thus they can handle vstart >= vl case gracefully, with the benefit of
setting env->vstart = 0 during the process.

Remove all 'over' conditionals and let the helper set env->vstart = 0
every time.

While we're at it, remove the (vl == 0) brconds from trans_rvbf16.c.inc
too since they're unneeded.

Suggested-by: Richard Henderson
Signed-off-by: Daniel Henrique Barboza
Reviewed-by: Richard Henderson
Reviewed-by: Alistair Francis
---
  target/riscv/insn_trans/trans_rvbf16.c.inc |  12 ---
  target/riscv/insn_trans/trans_rvv.c.inc| 117 -
  target/riscv/insn_trans/trans_rvvk.c.inc   |  18 
  3 files changed, 147 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvbf16.c.inc 
b/target/riscv/insn_trans/trans_rvbf16.c.inc
index 8ee99df3f3..a842e76a6b 100644
--- a/target/riscv/insn_trans/trans_rvbf16.c.inc
+++ b/target/riscv/insn_trans/trans_rvbf16.c.inc
@@ -71,11 +71,8 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, 
arg_vfncvtbf16_f_f_w *a)
  
  if (opfv_narrow_check(ctx, a) && (ctx->sew == MO_16)) {

  uint32_t data = 0;
-TCGLabel *over = gen_new_label();
  
  gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);

-tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
-tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
  
  data = FIELD_DP32(data, VDATA, VM, a->vm);

  data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
@@ -87,7 +84,6 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, 
arg_vfncvtbf16_f_f_w *a)
 ctx->cfg_ptr->vlenb, data,
 gen_helper_vfncvtbf16_f_f_w);
  mark_vs_dirty(ctx);
-gen_set_label(over);
  return true;
  }
  return false;
@@ -100,11 +96,8 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, 
arg_vfwcvtbf16_f_f_v *a)
  
  if (opfv_widen_check(ctx, a) && (ctx->sew == MO_16)) {

  uint32_t data = 0;
-TCGLabel *over = gen_new_label();
  
  gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);

-tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
-tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
  
  data = FIELD_DP32(data, VDATA, VM, a->vm);

  data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
@@ -116,7 +109,6 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, 
arg_vfwcvtbf16_f_f_v *a)
 ctx->cfg_ptr->vlenb, data,
 gen_helper_vfwcvtbf16_f_f_v);
  mark_vs_dirty(ctx);
-gen_set_label(over);
  return true;
  }
  return false;
@@ -130,11 +122,8 @@ static bool trans_vfwmaccbf16_vv(DisasContext *ctx, 
arg_vfwmaccbf16_vv *a)
  if (require_rvv(ctx) && vext_check_isa_ill(ctx) && (ctx->sew == MO_16) &&
  vext_check_dss(ctx, a->rd, a->rs1, a->rs2, a->vm)) {
  uint32_t data = 0;
-TCGLabel *over = gen_new_label();
  
  gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN);

-tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
-tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
  
  data = FIELD_DP32(data, VDATA, VM, a->vm);

  data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul);
@@ -147,7 +136,6 

RE: [PATCH v8 7/9] hw/i386/q35: Set virtio-iommu aw-bits default value to 39

2024-03-07 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Subject: [PATCH v8 7/9] hw/i386/q35: Set virtio-iommu aw-bits default
>value to 39
>
>Currently the default input range can extend to 64 bits. On x86,
>when the virtio-iommu protects vfio devices, the physical iommu
>may support only 39 bits. Let's set the default to 39, as done
>for the intel-iommu.
>
>We use hw_compat_8_2 to handle the compatibility for machines
>before 9.0 which used to have a virtio-iommu default input range
>of 64 bits.
>
>Of course if aw-bits is set from the command line, the default
>is overriden.
>
>Signed-off-by: Eric Auger 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>
>---
>
>v6 -> v7:
>- use static pc_q35_compat_defaults
>- remove spurious header addition
>- s/32/UINT32_MAX in the qtest
>
>v5 -> v6:
>- split pc/arm settings
>
>v3 -> v4:
>- update the qos test to relax the check on the max input IOVA
>
>v2 -> v3:
>- collected Zhenzhong's R-b
>- use _abort instead of NULL error handle
>  on object_property_get_uint() call (Cédric)
>- use VTD_HOST_AW_39BIT (Cédric)
>
>v1 -> v2:
>- set aw-bits to 48b on ARM
>- use hw_compat_8_2 to handle the compat for older machines
>  which used 64b as a default
>---
> hw/core/machine.c   | 1 +
> hw/i386/pc_q35.c| 9 +
> tests/qtest/virtio-iommu-test.c | 2 +-
> 3 files changed, 11 insertions(+), 1 deletion(-)
>
>diff --git a/hw/core/machine.c b/hw/core/machine.c
>index 6bd09d4592..4b89172d1c 100644
>--- a/hw/core/machine.c
>+++ b/hw/core/machine.c
>@@ -35,6 +35,7 @@
>
> GlobalProperty hw_compat_8_2[] = {
> { TYPE_VIRTIO_IOMMU_PCI, "granule", "4k" },
>+{ TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "64" },
> };
> const size_t hw_compat_8_2_len = G_N_ELEMENTS(hw_compat_8_2);
>
>diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
>index 45a4102e75..1e7464d39a 100644
>--- a/hw/i386/pc_q35.c
>+++ b/hw/i386/pc_q35.c
>@@ -45,6 +45,7 @@
> #include "hw/i386/pc.h"
> #include "hw/i386/amd_iommu.h"
> #include "hw/i386/intel_iommu.h"
>+#include "hw/virtio/virtio-iommu.h"
> #include "hw/display/ramfb.h"
> #include "hw/ide/pci.h"
> #include "hw/ide/ahci-pci.h"
>@@ -63,6 +64,12 @@
> /* ICH9 AHCI has 6 ports */
> #define MAX_SATA_PORTS 6
>
>+static GlobalProperty pc_q35_compat_defaults[] = {
>+{ TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "39" },
>+};
>+static const size_t pc_q35_compat_defaults_len =
>+G_N_ELEMENTS(pc_q35_compat_defaults);
>+
> struct ehci_companions {
> const char *name;
> int func;
>@@ -356,6 +363,8 @@ static void pc_q35_machine_options(MachineClass
>*m)
> machine_class_allow_dynamic_sysbus_dev(m,
>TYPE_INTEL_IOMMU_DEVICE);
> machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
> machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
>+compat_props_add(m->compat_props,
>+ pc_q35_compat_defaults, pc_q35_compat_defaults_len);
> }
>
> static void pc_q35_9_0_machine_options(MachineClass *m)
>diff --git a/tests/qtest/virtio-iommu-test.c b/tests/qtest/virtio-iommu-test.c
>index 068e7a9e6c..afb225971d 100644
>--- a/tests/qtest/virtio-iommu-test.c
>+++ b/tests/qtest/virtio-iommu-test.c
>@@ -34,7 +34,7 @@ static void pci_config(void *obj, void *data,
>QGuestAllocator *t_alloc)
> uint8_t bypass = qvirtio_config_readb(dev, 36);
>
> g_assert_cmpint(input_range_start, ==, 0);
>-g_assert_cmphex(input_range_end, ==, UINT64_MAX);
>+g_assert_cmphex(input_range_end, >=, UINT32_MAX);
> g_assert_cmpint(domain_range_start, ==, 0);
> g_assert_cmpint(domain_range_end, ==, UINT32_MAX);
> g_assert_cmpint(bypass, ==, 1);
>--
>2.41.0



Enabling internal errors for VH CXL devices: [was: Re: Questions about CXL RAS injection test in qemu]

2024-03-07 Thread Yuquan Wang
On 2024-03-07 20:10,  jonathan.cameron wrote:

> Hack is fine the relevant device with lspci -tv and then use
> setpci -s 0d:00.0 0x208.l=0
> to clear all the mask bits for uncorrectable errors.

Thanks! The suggestions from you and Terry did work!

BTW, is my understanding below about CXL RAS correct?

>> 2) The error injected by "pcie_aer_inject_error" is "protocol & link errors" 
>> of cxl.io?
>>The error injected by "cxl-inject-uncorrectable-errors" or 
>> "cxl-inject-correctable-error" is "protocol & link errors" of cxl.cachemem

Many thanks
Yuuqan




Re: [PATCH v3 00/20] Use Intel DSA accelerator to offload zero page checking in multifd live migration.

2024-03-07 Thread liulongfang via
On 2024/1/4 8:44, Hao Xiang wrote:
> v3
> * Rebase on top of 7425b6277f12e82952cede1f531bfc689bf77fb1.
> * Fix error/warning from checkpatch.pl
> * Fix use-after-free bug when multifd-dsa-accel option is not set.
> * Handle error from dsa_init and correctly propogate the error.
> * Remove unnecessary call to dsa_stop.
> * Detect availability of DSA feature at compile time.
> * Implement a generic batch_task structure and a DSA specific one 
> dsa_batch_task.
> * Remove all exit() calls and propagate errors correctly.
> * Use bytes instead of page count to configure multifd-packet-size option.
> 
> v2
> * Rebase on top of 3e01f1147a16ca566694b97eafc941d62fa1e8d8.
> * Leave Juan's changes in their original form instead of squashing them.
> * Add a new commit to refactor the multifd_send_thread function to prepare 
> for introducing the DSA offload functionality.
> * Use page count to configure multifd-packet-size option.
> * Don't use the FLAKY flag in DSA tests.
> * Test if DSA integration test is setup correctly and skip the test if
> * not.
> * Fixed broken link in the previous patch cover.
> 
> * Background:
> 
> I posted an RFC about DSA offloading in QEMU:
> https://patchew.org/QEMU/20230529182001.2232069-1-hao.xi...@bytedance.com/
> 
> This patchset implements the DSA offloading on zero page checking in
> multifd live migration code path.
> 
> * Overview:
> 
> Intel Data Streaming Accelerator(DSA) is introduced in Intel's 4th generation
> Xeon server, aka Sapphire Rapids.
> https://cdrdv2-public.intel.com/671116/341204-intel-data-streaming-accelerator-spec.pdf
> https://www.intel.com/content/www/us/en/content-details/759709/intel-data-streaming-accelerator-user-guide.html
> One of the things DSA can do is to offload memory comparison workload from
> CPU to DSA accelerator hardware. This patchset implements a solution to 
> offload
> QEMU's zero page checking from CPU to DSA accelerator hardware. We gain
> two benefits from this change:
> 1. Reduces CPU usage in multifd live migration workflow across all use
> cases.
> 2. Reduces migration total time in some use cases. 
> 
> * Design:
> 
> These are the logical steps to perform DSA offloading:
> 1. Configure DSA accelerators and create user space openable DSA work
> queues via the idxd driver.
> 2. Map DSA's work queue into a user space address space.
> 3. Fill an in-memory task descriptor to describe the memory operation.
> 4. Use dedicated CPU instruction _enqcmd to queue a task descriptor to
> the work queue.
> 5. Pull the task descriptor's completion status field until the task
> completes.
> 6. Check return status.
> 
> The memory operation is now totally done by the accelerator hardware but
> the new workflow introduces overheads. The overhead is the extra cost CPU
> prepares and submits the task descriptors and the extra cost CPU pulls for
> completion. The design is around minimizing these two overheads.
> 
> 1. In order to reduce the overhead on task preparation and submission,
> we use batch descriptors. A batch descriptor will contain N individual
> zero page checking tasks where the default N is 128 (default packet size
> / page size) and we can increase N by setting the packet size via a new
> migration option.
> 2. The multifd sender threads prepares and submits batch tasks to DSA
> hardware and it waits on a synchronization object for task completion.
> Whenever a DSA task is submitted, the task structure is added to a
> thread safe queue. It's safe to have multiple multifd sender threads to
> submit tasks concurrently.
> 3. Multiple DSA hardware devices can be used. During multifd initialization,
> every sender thread will be assigned a DSA device to work with. We
> use a round-robin scheme to evenly distribute the work across all used
> DSA devices.
> 4. Use a dedicated thread dsa_completion to perform busy pulling for all
> DSA task completions. The thread keeps dequeuing DSA tasks from the
> thread safe queue. The thread blocks when there is no outstanding DSA
> task. When pulling for completion of a DSA task, the thread uses CPU
> instruction _mm_pause between the iterations of a busy loop to save some
> CPU power as well as optimizing core resources for the other hypercore.
> 5. DSA accelerator can encounter errors. The most popular error is a
> page fault. We have tested using devices to handle page faults but
> performance is bad. Right now, if DSA hits a page fault, we fallback to
> use CPU to complete the rest of the work. The CPU fallback is done in
> the multifd sender thread.
> 6. Added a new migration option multifd-dsa-accel to set the DSA device
> path. If set, the multifd workflow will leverage the DSA devices for
> offloading.
> 7. Added a new migration option multifd-normal-page-ratio to make
> multifd live migration easier to test. Setting a normal page ratio will
> make live migration recognize a zero page as a normal page and send
> the entire payload over the network. If we want to send a large network
> payload 

RE: [PATCH 0/3] ui/gtk: introducing vc->visible

2024-03-07 Thread Kim, Dongwon
Hi Marc-André,

> -Original Message-
> From: Marc-André Lureau 
> Sent: Tuesday, March 5, 2024 4:18 AM
> To: Kim, Dongwon ; P. Berrange, Daniel
> 
> Cc: qemu-devel@nongnu.org
> Subject: Re: [PATCH 0/3] ui/gtk: introducing vc->visible
> 
> Hi Kim
> 
> I am uncomfortable with the series in general.
> 
> Not only we don't have the means to draw dmabuf/scanout "when required", so
> resuming drawing won't work until the guest draws (this is already a problem 
> but
[Kim, Dongwon] Yes, this is right. The display won't be updated until there is 
a new frame submitted.
> you are only making it worse). And I also think reconfiguring the guest by 
> merely
> minimizing or switching window/tabs isn't what most users would expect.
[Kim, Dongwon] I understand your concern. Then what do you suggest I need to do 
or look into to avoid the situation where the host rendering of the guest frame 
is scheduled but pending due to tab switching or minimization of window as the 
guest (virtio-gpu drv) is waiting for the response to move on to the next 
frame? Do you think the frame update should just continue on the host side 
regardless of visibility of the window? (If this is the standard expectation, 
then one of our Linux configuration - Yocto + ICE WM has some bug in it.)

> 
> (fwiw, my personal opinion is that QEMU shouldn't provide UIs and different
> clients should be able to implement different behaviours, out of process.. 
> that
> makes me relatively less motivated to break things and be responsible)
> 
> Daniel, could you have a look too?
> 
> thanks
> 
> On Fri, Mar 1, 2024 at 4:05 AM Kim, Dongwon 
> wrote:
> >
> > Hi Marc-André Lureau,
> >
> > Just a reminder.. I need your help reviewing this patch series. Please
> > take a look at my messages at
> > https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg06636.html
> > and
> > https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg06637.html
> >
> > Thanks!!
> > DW
> >
> > > -Original Message-
> > > From: qemu-devel-bounces+dongwon.kim=intel@nongnu.org  > > devel-bounces+dongwon.kim=intel@nongnu.org> On Behalf Of
> > > dongwon@intel.com
> > > Sent: Tuesday, January 30, 2024 3:49 PM
> > > To: qemu-devel@nongnu.org
> > > Subject: [PATCH 0/3] ui/gtk: introducing vc->visible
> > >
> > > From: Dongwon Kim 
> > >
> > > Drawing guest display frames can't be completed while the VC is not
> > > in visible state, which could result in timeout in both the host and
> > > the guest especially when using blob scanout. Therefore it is needed
> > > to update and track the visiblity status of the VC and unblock the 
> > > pipeline in
> case when VC becomes invisible (e.g.
> > > windows minimization, switching among tabs) while processing a guest
> frame.
> > >
> > > First patch (0001-ui-gtk-skip...) is introducing a flag 'visible' to
> > > VirtualConsole struct then set it only if the VC and its window is 
> > > visible.
> > >
> > > Second patch (0002-ui-gtk-set-...) sets the ui size to 0 when VC is
> > > invisible when the tab is closed or deactivated. This notifies the
> > > guest that the associated guest display is not active anymore.
> > >
> > > Third patch (0003-ui-gtk-reset-visible...) adds a callback for GTK
> > > window-state- event. The flag, 'visible' is updated based on the 
> > > minization
> status of the window.
> > >
> > > Dongwon Kim (3):
> > >   ui/gtk: skip drawing guest scanout when associated VC is invisible
> > >   ui/gtk: set the ui size to 0 when invisible
> > >   ui/gtk: reset visible flag when window is minimized
> > >
> > >  include/ui/gtk.h |  1 +
> > >  ui/gtk-egl.c |  8 +++
> > >  ui/gtk-gl-area.c |  8 +++
> > >  ui/gtk.c | 62 ++--
> > >  4 files changed, 77 insertions(+), 2 deletions(-)
> > >
> > > --
> > > 2.34.1
> > >
> >



Re: [PATCH] disas/riscv: Further correction to LUI disassembly

2024-03-07 Thread Richard Bagley
NACK

We have established that the change is a workaround for a bug in
the assembler.
I withdraw the merge request.

Thank you for this careful review.

On Fri, Aug 11, 2023 at 4:55 AM Andrew Jones 
wrote:

> On Fri, Aug 11, 2023 at 10:25:52AM +0200, Andrew Jones wrote:
> > On Thu, Aug 10, 2023 at 06:27:50PM +0200, Andrew Jones wrote:
> > > On Thu, Aug 10, 2023 at 09:12:42AM -0700, Palmer Dabbelt wrote:
> > > > On Thu, 10 Aug 2023 08:31:46 PDT (-0700), ajo...@ventanamicro.com
> wrote:
> > > > > On Mon, Jul 31, 2023 at 11:33:20AM -0700, Richard Bagley wrote:
> > > > > > The recent commit 36df75a0a9 corrected one aspect of LUI
> disassembly
> > > > > > by recovering the immediate argument from the result of LUI with
> a
> > > > > > shift right by 12. However, the shift right will left-fill with
> the
> > > > > > sign. By applying a mask we recover an unsigned representation
> of the
> > > > > > 20-bit field (which includes a sign bit).
> > > > > >
> > > > > > Example:
> > > > > > 0xf000 >> 12 = 0x
> > > > > > 0xf000 >> 12 & 0xf = 0x000f
> > > > > >
> > > > > > Fixes: 36df75a0a9 ("riscv/disas: Fix disas output of upper
> immediates")
> > > > > > Signed-off-by: Richard Bagley 
> > > > > > ---
> > > > > >  disas/riscv.c | 9 ++---
> > > > > >  1 file changed, 6 insertions(+), 3 deletions(-)
> > > > > >
> > > > > > diff --git a/disas/riscv.c b/disas/riscv.c
> > > > > > index 4023e3fc65..690eb4a1ac 100644
> > > > > > --- a/disas/riscv.c
> > > > > > +++ b/disas/riscv.c
> > > > > > @@ -4723,9 +4723,12 @@ static void format_inst(char *buf, size_t
> buflen, size_t tab, rv_decode *dec)
> > > > > >  break;
> > > > > >  case 'U':
> > > > > >  fmt++;
> > > > > > -snprintf(tmp, sizeof(tmp), "%d", dec->imm >> 12);
> > > > > > -append(buf, tmp, buflen);
> > > > > > -if (*fmt == 'o') {
> > > > > > +if (*fmt == 'i') {
> > > > > > +snprintf(tmp, sizeof(tmp), "%d", dec->imm >> 12
> & 0xf);
> > > > >
> > > > > Why are we correcting LUI's output, but still outputting
> sign-extended
> > > > > values for AUIPC?
> > > > >
> > > > > We can't assemble 'auipc a1, 0x' or 'auipc a1, -1' without
> getting
> > > > >
> > > > >  Error: lui expression not in range 0..1048575
> > > > >
> > > > > (and additionally for 0x)
> > > > >
> > > > >  Error: value of 0000 too large for field of 4 bytes
> at 
> > > > >
> > > > > either.
> > > > >
> > > > > (I see that the assembler's error messages state 'lui', but I was
> trying
> > > > > 'auipc'.)
> > > > >
> > > > > I'm using as from gnu binutils 2.40.0.20230214.
> > > > >
> > > > > (And, FWIW, I agree with Richard Henderson that these instructions
> should
> > > > > accept negative values.)
> > > >
> > > > I'm kind of lost here, and you saying binutils rejects this syntax?
> If
> > > > that's the case it's probably just an oversight, can you file a bug
> in
> > > > binutils land so folks can see?
> > >
> > > Will do.
> > >
> >
> > https://sourceware.org/bugzilla/show_bug.cgi?id=30746
> >
>
> But, to try to bring this thread back to the patch under review. While the
> binutils BZ may address our preferred way of providing immediates to the
> assembler, this patch is trying to make QEMU's output consistent with
> objdump. Since objdump always outputs long immediate values as hex, then
> it doesn't need to care about negative signs. QEMU seems to prefer
> decimal, though, and so does llvm-objdump, which outputs values for these
> instructions in the range 0..1048575. So, I guess this patch is making
> QEMU consistent with llvm-objdump.
>
> Back to making suggestions for this patch...
>
> 1. The commit message should probably say something along the lines of
>what I just wrote in the preceding paragraph to better explain the
>motivation.
>
> 2. Unless I'm missing something, then this patch should also address
>AUIPC.
>
> Thanks,
> drew
>


Re: [PATCH v5 06/13] hw/mem/cxl_type3: Add host backend and address space handling for DC regions

2024-03-07 Thread fan
On Thu, Mar 07, 2024 at 12:16:05PM +, Jonathan Cameron wrote:
> > > > @@ -868,16 +974,24 @@ static int 
> > > > cxl_type3_hpa_to_as_and_dpa(CXLType3Dev *ct3d,
> > > > AddressSpace **as,
> > > > uint64_t *dpa_offset)
> > > >  {
> > > > -MemoryRegion *vmr = NULL, *pmr = NULL;
> > > > +MemoryRegion *vmr = NULL, *pmr = NULL, *dc_mr = NULL;
> > > > +uint64_t vmr_size = 0, pmr_size = 0, dc_size = 0;
> > > >  
> > > >  if (ct3d->hostvmem) {
> > > >  vmr = host_memory_backend_get_memory(ct3d->hostvmem);
> > > > +vmr_size = memory_region_size(vmr);
> > > >  }
> > > >  if (ct3d->hostpmem) {
> > > >  pmr = host_memory_backend_get_memory(ct3d->hostpmem);
> > > > +pmr_size = memory_region_size(pmr);
> > > > +}
> > > > +if (ct3d->dc.host_dc) {
> > > > +dc_mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> > > > +/* Do we want dc_size to be dc_mr->size or not?? */  
> > > 
> > > Maybe - definitely don't want to leave this comment here
> > > unanswered and I think you enforce it above anyway.
> > > 
> > > So if we get here ct3d->dc.total_capacity == 
> > > memory_region_size(ct3d->dc.host_dc);
> > > As such we could just not stash total_capacity at all?  
> > 
> > I cannot identify a case where these two will be different. But
> > total_capacity is referenced at quite some places, it may be nice to have
> > it so we do not need to call the function to get the value every time?
> 
> I kind of like having it via one path so that there is no confusion
> for the reader, but up to you on this one.  The function called is trivial
> (other than some magic to handle very large memory regions) so
> this is just a readability question, not a perf one.
> 
> Whatever, don't leave the question behind.  Find to have something
> that says they are always the same size if you don't get rid
> of the total_capacity representation.
> 
I will fix it.

For static capability, we have a variable static_mem_size, although we
can calculate it from volatile and non-volatile memory region size.
There are quite some places need to get the dynamic capacity, it is much
more convenient to have a variable ready to use, I will keep it for
now.

Fan
> 
> Jonathan



[PATCH] hw/m68k/mcf5208: add support for reset

2024-03-07 Thread Angelo Dureghello
Add reset support.

Signed-off-by: Angelo Dureghello 
---
 hw/m68k/mcf5208.c | 51 ---
 1 file changed, 48 insertions(+), 3 deletions(-)

diff --git a/hw/m68k/mcf5208.c b/hw/m68k/mcf5208.c
index 0cfb806c20..b1ab3896d0 100644
--- a/hw/m68k/mcf5208.c
+++ b/hw/m68k/mcf5208.c
@@ -20,6 +20,7 @@
 #include "hw/ptimer.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/qtest.h"
+#include "sysemu/runstate.h"
 #include "net/net.h"
 #include "hw/boards.h"
 #include "hw/loader.h"
@@ -40,6 +41,8 @@
 #define PCSR_PRE_SHIFT  8
 #define PCSR_PRE_MASK   0x0f00
 
+#define RCR_SOFTRST 0x80
+
 typedef struct {
 MemoryRegion iomem;
 qemu_irq irq;
@@ -185,13 +188,55 @@ static const MemoryRegionOps m5208_sys_ops = {
 .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static void mcf5208_sys_init(MemoryRegion *address_space, qemu_irq *pic)
+static uint64_t m5208_rcm_read(void *opaque, hwaddr addr,
+   unsigned size)
+{
+qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad offset 0x%" HWADDR_PRIX "\n",
+  __func__, addr);
+return 0;
+}
+
+static void m5208_rcm_write(void *opaque, hwaddr addr,
+uint64_t value, unsigned size)
+{
+switch (addr) {
+case 0x0: /* RCR */
+if (value & RCR_SOFTRST) {
+M68kCPU *cpu = opaque;
+CPUState *cs = CPU(cpu);
+
+cpu_reset(cs);
+cpu->env.aregs[7] = ldl_phys(cs->as, 0);
+cpu->env.pc = ldl_phys(cs->as, 4);
+//qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
+}
+break;
+default:
+qemu_log_mask(LOG_GUEST_ERROR, "%s: Bad offset 0x%" HWADDR_PRIX "\n",
+  __func__, addr);
+break;
+}
+}
+
+static const MemoryRegionOps m5208_rcm_ops = {
+.read = m5208_rcm_read,
+.write = m5208_rcm_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static void mcf5208_sys_init(MemoryRegion *address_space, qemu_irq *pic,
+ M68kCPU *cpu)
 {
-MemoryRegion *iomem = g_new(MemoryRegion, 1);
+MemoryRegion *iomem;
 m5208_timer_state *s;
 int i;
 
+/* RCM */
+iomem = g_new(MemoryRegion, 1);
+memory_region_init_io(iomem, NULL, _rcm_ops, cpu, "m5208-rcm", 
0x0080);
+memory_region_add_subregion(address_space, 0xfc0a, iomem);
 /* SDRAMC.  */
+iomem = g_new(MemoryRegion, 1);
 memory_region_init_io(iomem, NULL, _sys_ops, NULL, "m5208-sys", 
0x4000);
 memory_region_add_subregion(address_space, 0xfc0a8000, iomem);
 /* Timers.  */
@@ -265,7 +310,7 @@ static void mcf5208evb_init(MachineState *machine)
 mcf_uart_create_mmap(0xfc064000, pic[27], serial_hd(1));
 mcf_uart_create_mmap(0xfc068000, pic[28], serial_hd(2));
 
-mcf5208_sys_init(address_space_mem, pic);
+mcf5208_sys_init(address_space_mem, pic, cpu);
 
 mcf_fec_init(address_space_mem, 0xfc03, pic + 36);
 
-- 
2.44.0




Re: [PATCH 4/5] include/exec: annotate all the MemoryRegion fields

2024-03-07 Thread Alex Bennée
Richard Henderson  writes:

> On 3/7/24 08:11, Alex Bennée wrote:
>> Signed-off-by: Alex Bennée 
>> ---
>>   include/exec/memory.h | 47 +++
>>   1 file changed, 43 insertions(+), 4 deletions(-)
>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>> index 17b741bc4f5..312ed564dbe 100644
>> --- a/include/exec/memory.h
>> +++ b/include/exec/memory.h
>> @@ -778,9 +778,48 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void 
>> **vaddr,
>>   typedef struct CoalescedMemoryRange CoalescedMemoryRange;
>>   typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
>>   -/** MemoryRegion:
>> - *
>> - * A struct representing a memory region.
>> +/**
>> + * struct MemoryRegion - represents a memory region
>> + * @parent_obj: parent QOM object for the region
>> + * @romd_mode: if true ROM devices accessed directly rather than with @ops
>> + * @ram: true if a RAM-type device with a @ram_block
>> + * @subpage: true if region covers a subpage
>> + * @readonly: true for RAM-type devices that are readonly
>> + * @nonvolatile: true for nonvolatile RAM-type devices (e.g. NVDIMM)
>> + * @rom_device: true for a ROM device, see also @romd_mode
>> + * @flush_coalesced_mmio: true to flush any queued coalesced MMIO
>> + * operations before access
>> + * @unmergeable: this section should not get merged with adjacent
>> + * sections
>> + * @dirty_log_mask: dirty events region cares about (see DIRTY_ flags)
>> + * @is_iommu: true if part of an IOMMU device
>> + * @ram_block: backing @RamBlock if @ram is true
>> + * @owner: base QOM object that owns this region
>> + * @dev: base Device that owns this region
>> + * @ops: access operations for MMIO or @romd_mode devices
>> + * @opaque: @dev specific data, passed with @ops
>> + * @container: parent `MemoryRegion`
>> + * @mapped_via_alias: number of times mapped via @alias, container
>> + * might be NULL
>> + * @size: size of @MemoryRegion
>> + * @addr: physical hwaddr of @MemoryRegion
>> + * @destructor: cleanup function when @MemoryRegion finalized
>> + * @align: alignment requirements for any physical backing store
>> + * @terminates: true if this @MemoryRegion is a leaf node
>> + * @ram_device: true if @ram device should use @ops to access
>> + * @enabled: true once initialised, false once finalized
>> + * @vga_logging_count: count of memory logging clients
>> + * @alias: link to aliased @MemoryRegion
>> + * @alias_offset: offset into aliased region
>> + * @priority: priority when resolving overlapping regions
>> + * @subregions: list of subregions in this region
>> + * @subregions_link: next subregion in the chain
>> + * @coalesced: list of coalesced memory ranges
>> + * @name: name of memory region
>> + * @ioeventfd_nb: count of @ioeventfds for region
>> + * @ioeventfds: ioevent notifiers for region
>> + * @rdm: if exists see #RamDiscardManager
>> + * @disable_reentrancy_guard: if true don't error if device accesses itself
>>*/
>
> Why as one big block rather than line by line?
> The block is less likely to be properly kept up-to-date and is much
> harder to read.

The inline syntax is a little more prone to parse failures and
annoyingly can't group multiple fields in one inline comment block. But
I can certainly move it inline if that preferable.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PATCH v2 3/5] gdbstub: Save target's siginfo

2024-03-07 Thread Alex Bennée
Richard Henderson  writes:

> On 3/7/24 08:26, Gustavo Romero wrote:
>> Save target's siginfo into gdbserver_state so it can be used later, for
>> example, in any stub that requires the target's si_signo and si_code.
>> This change affects only linux-user mode.
>> Signed-off-by: Gustavo Romero 
>> Suggested-by: Richard Henderson 
>> ---
>>   gdbstub/internals.h|  3 +++
>>   gdbstub/user-target.c  |  3 ++-
>>   gdbstub/user.c | 14 ++
>>   include/gdbstub/user.h |  6 +-
>>   linux-user/main.c  |  2 +-
>>   linux-user/signal.c|  5 -
>>   6 files changed, 25 insertions(+), 8 deletions(-)
>> diff --git a/gdbstub/internals.h b/gdbstub/internals.h
>> index 56b7c13b75..a7cc69dab3 100644
>> --- a/gdbstub/internals.h
>> +++ b/gdbstub/internals.h
>> @@ -58,6 +58,9 @@ typedef struct GDBState {
>>   int line_csum; /* checksum at the end of the packet */
>>   GByteArray *last_packet;
>>   int signal;
>> +#ifdef CONFIG_USER_ONLY
>> +uint8_t siginfo[MAX_SIGINFO_LENGTH];
>> +#endif
>
> If we this in GDBUserState in user.c -- no need for ifdefs then.

Although it does break on FreeBSD's user target:

  FAILED: libqemu-arm-bsd-user.fa.p/gdbstub_user-target.c.o 
  cc -m64 -mcx16 -Ilibqemu-arm-bsd-user.fa.p -I. -I.. -Itarget/arm 
-I../target/arm -I../common-user/host/x86_64 -I../bsd-user/include 
-Ibsd-user/freebsd -I../bsd-user/freebsd -I../bsd-user/host/x86_64 -Ibsd-user 
-I../bsd-user -I../bsd-user/arm -Iqapi -Itrace -Iui -Iui/shader 
-I/usr/local/include/capstone -I/usr/local/include/glib-2.0 
-I/usr/local/lib/glib-2.0/include -I/usr/local/include -fdiagnostics-color=auto 
-Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -fstack-protector-strong 
-Wempty-body -Wendif-labels -Wexpansion-to-defined -Wformat-security 
-Wformat-y2k -Wignored-qualifiers -Winit-self -Wmissing-format-attribute 
-Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wredundant-decls 
-Wstrict-prototypes -Wtype-limits -Wundef -Wvla -Wwrite-strings 
-Wno-gnu-variable-sized-type-not-at-end -Wno-initializer-overrides 
-Wno-missing-include-dirs -Wno-psabi -Wno-shift-negative-value 
-Wno-string-plus-int -Wno-tautological-type-limit-compare 
-Wno-typedef-redefinition -Wthread-safety -iquote . -iquote 
/tmp/cirrus-ci-build -iquote /tmp/cirrus-ci-build/include -iquote 
/tmp/cirrus-ci-build/host/include/x86_64 -iquote 
/tmp/cirrus-ci-build/host/include/generic -iquote /tmp/cirrus-ci-build/tcg/i386 
-pthread -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE 
-fno-strict-aliasing -fno-common -fwrapv -ftrivial-auto-var-init=zero -fPIE 
-DNEED_CPU_H '-DCONFIG_TARGET="arm-bsd-user-config-target.h"' 
'-DCONFIG_DEVICES="arm-bsd-user-config-devices.h"' -MD -MQ 
libqemu-arm-bsd-user.fa.p/gdbstub_user-target.c.o -MF 
libqemu-arm-bsd-user.fa.p/gdbstub_user-target.c.o.d -o 
libqemu-arm-bsd-user.fa.p/gdbstub_user-target.c.o -c ../gdbstub/user-target.c
  In file included from ../gdbstub/user-target.c:18:
  ../gdbstub/internals.h:62:21: error: use of undeclared identifier 
'MAX_SIGINFO_LENGTH'
 62 | uint8_t siginfo[MAX_SIGINFO_LENGTH];
| ^
  1 error generated.
  [2084/6731] Compiling C object libqemu-arm

See: https://gitlab.com/stsquad/qemu/-/jobs/6345829419



-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PATCH] virt: set the CPU type in BOOTINFO

2024-03-07 Thread Mark Cave-Ayland

On 23/02/2024 15:57, Laurent Vivier wrote:


BI_CPUTYPE/BI_MMUTYPE/BI_FPUTYPE were statically assigned to the
68040 information.
This patch changes the code to set in bootinfo the information
provided by the command line '-cpu' parameter.

Bug: https://gitlab.com/qemu-project/qemu/-/issues/2091
Reported-by: Daniel Palmer 
Signed-off-by: Laurent Vivier 
---
  hw/m68k/virt.c | 17 ++---
  1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/m68k/virt.c b/hw/m68k/virt.c
index e2792ef46d93..b8e5e102e6b9 100644
--- a/hw/m68k/virt.c
+++ b/hw/m68k/virt.c
@@ -239,9 +239,20 @@ static void virt_init(MachineState *machine)
  param_ptr = param_blob;
  
  BOOTINFO1(param_ptr, BI_MACHTYPE, MACH_VIRT);

-BOOTINFO1(param_ptr, BI_FPUTYPE, FPU_68040);
-BOOTINFO1(param_ptr, BI_MMUTYPE, MMU_68040);
-BOOTINFO1(param_ptr, BI_CPUTYPE, CPU_68040);
+if (m68k_feature(>env, M68K_FEATURE_M68020)) {
+BOOTINFO1(param_ptr, BI_CPUTYPE, CPU_68020);
+} else if (m68k_feature(>env, M68K_FEATURE_M68030)) {
+BOOTINFO1(param_ptr, BI_MMUTYPE, MMU_68030);
+BOOTINFO1(param_ptr, BI_CPUTYPE, CPU_68030);
+} else if (m68k_feature(>env, M68K_FEATURE_M68040)) {
+BOOTINFO1(param_ptr, BI_FPUTYPE, FPU_68040);
+BOOTINFO1(param_ptr, BI_MMUTYPE, MMU_68040);
+BOOTINFO1(param_ptr, BI_CPUTYPE, CPU_68040);
+} else if (m68k_feature(>env, M68K_FEATURE_M68060)) {
+BOOTINFO1(param_ptr, BI_FPUTYPE, FPU_68060);
+BOOTINFO1(param_ptr, BI_MMUTYPE, MMU_68060);
+BOOTINFO1(param_ptr, BI_CPUTYPE, CPU_68060);
+}
  BOOTINFO2(param_ptr, BI_MEMCHUNK, 0, ram_size);
  
  BOOTINFO1(param_ptr, BI_VIRT_QEMU_VERSION,


Reviewed-by: Mark Cave-Ayland 


ATB,

Mark.




Re: [RFC 0/2] Add RISC-V Server Platform Reference Board

2024-03-07 Thread Marcin Juszkiewicz

W dniu 4.03.2024 o 11:25, Fei Wu pisze:


The RISC-V Server Platform specification[1] defines a standardized
set of hardware and software capabilities, that portable system
software, such as OS and hypervisors can rely on being present in a
RISC-V server platform. This patchset provides a RISC-V Server
Platform (RVSP) reference implementation on qemu which is in
compliance with the spec as faithful as possible.


I am working on sbsa-ref which is AArch64 Standard Server Platform 
implementation. Will not go through details of rvsp-ref but give some 
potential hints from my work with our platform.



1. Consider versioning the platform.

We have 'platform_version'.'major/minor' exported in 
DeviceTree-formatted data. This allows for firmware to know which of 
non-discoverable hardware features exists and which not. We use it to 
disable XHCI controller on older platform version.



2. If specification allows to have non-discoverable devices then add some.

This will require you to handle them in firmware in some way. Sooner or 
later some physical hardware will be in same situation so they can use 
your firmware code as reference. We have AHCI and XHCI on system bus 
(hardcoded in firmware).



3. You are going to use EDK2 with ACPI. Hide DT from code there with 
some hardware information library.


For sbsa-ref we created SbsaHardwareInfoLib in 
https://openfw.io/edk2-devel/20240306-no-dt-for-cpu-v6-0-acd8727a1...@linaro.org/ 
patchset.






Re: [PATCH] target/riscv: Implement dynamic establishment of custom decoder

2024-03-07 Thread Daniel Henrique Barboza




On 3/7/24 17:11, Richard Henderson wrote:

On 3/7/24 09:55, Daniel Henrique Barboza wrote:

(--- adding Richard ---)


On 3/6/24 06:33, Huang Tao wrote:

In this patch, we modify the decoder to be a freely composable data
structure instead of a hardcoded one. It can be dynamically builded up
according to the extensions.
This approach has several benefits:
1. Provides support for heterogeneous cpu architectures. As we add decoder in
    CPUArchState, each cpu can have their own decoder, and the decoders can be
    different due to cpu's features.
2. Improve the decoding efficiency. We run the guard_func to see if the decoder
    can be added to the dynamic_decoder when building up the decoder. Therefore,
    there is no need to run the guard_func when decoding each instruction. It 
can
    improve the decoding efficiency
3. For vendor or dynamic cpus, it allows them to customize their own decoder
    functions to improve decoding efficiency, especially when vendor-defined
    instruction sets increase. Because of dynamic building up, it can skip the 
other
    decoder guard functions when decoding.

Signed-off-by: Huang Tao 
Suggested-by: Christoph Muellner 
Co-authored-by: LIU Zhiwei 
---
  target/riscv/cpu.c | 20 
  target/riscv/cpu.h |  2 ++
  target/riscv/cpu_decoder.h | 34 ++
  target/riscv/translate.c   | 28 
  4 files changed, 68 insertions(+), 16 deletions(-)
  create mode 100644 target/riscv/cpu_decoder.h

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 5ff0192c52..cadcd51b4f 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -38,6 +38,7 @@
  #include "kvm/kvm_riscv.h"
  #include "tcg/tcg-cpu.h"
  #include "tcg/tcg.h"
+#include "cpu_decoder.h"
  /* RISC-V CPU definitions */
  static const char riscv_single_letter_exts[] = "IEMAFDQCBPVH";
@@ -1102,6 +1103,23 @@ static void riscv_cpu_satp_mode_finalize(RISCVCPU *cpu, 
Error **errp)
  }
  #endif
+static void riscv_cpu_finalize_dynamic_decoder(RISCVCPU *cpu)
+{
+    CPURISCVState *env = >env;
+    decode_fn *dynamic_decoder;
+    dynamic_decoder = g_new0(decode_fn, decoder_table_size);
+    int j = 0;
+    for (size_t i = 0; i < decoder_table_size; ++i) {
+    if (decoder_table[i].guard_func &&
+    decoder_table[i].guard_func(>cfg)) {
+    dynamic_decoder[j] = decoder_table[i].decode_fn;
+    j++;
+    }
+    }
+
+    env->decoder = dynamic_decoder;


In time: I think we should rename this to 'decoders' to make it easier to 
figure out
that it's an array instead of a single element. Same thing with the 
ctx->decoder pointer.



You should save J into ENV as well, or use GPtrArray which would carry the 
length along with it naturally.

Is the cpu configuration on each cpu, or on the cpu class?

Even if the configuration is per cpu, this new element belongs in ArchCPU not 
CPUArchState.  Usually all of CPUArchState is zeroed during reset.


@@ -1153,9 +1149,8 @@ static void decode_opc(CPURISCVState *env, DisasContext 
*ctx, uint16_t opcode)
   ctx->base.pc_next + 2));
  ctx->opcode = opcode32;
-    for (size_t i = 0; i < ARRAY_SIZE(decoders); ++i) {
-    if (decoders[i].guard_func(ctx->cfg_ptr) &&
-    decoders[i].decode_func(ctx, opcode32)) {
+    for (size_t i = 0; i < decoder_table_size; ++i) {
+    if (ctx->decoder[i](ctx, opcode32)) {
  return;


You definitely should not be using decoder_table_size here, because you 
eliminated elements, which are in fact NULL here.


Am I missing something? It seems we can just remove the ctx->decode pointer 
altogether
and use env->decoder.


We try to avoid placing env into DisasContext, so that it is much harder to 
make the mistake of referencing env fields at translation-time, when you really 
needed to generate tcg code to reference the fields at runtime.


Right. And you mentioned that ArchState isn't the ideal place and we should put 
the
decoders in ArchCPU, so there's that.

In this case, if we were not to use ctx->decoders, we would need to retrieve a 
RISCVCPU
in riscv_tr_translate_insn() to get access to them 


I'd rather keep ctx->decoders and assign it in tr_init_disas() since that 
function already
uses a RISCVCPU pointer. The extra pointer in DisasContext seems worthy in this 
case.


Thanks,

Daniel









r~





Re: [PATCH 3/5] target/sparc/cpu: Improve the CPU help text

2024-03-07 Thread Richard Henderson

On 3/7/24 07:43, Thomas Huth wrote:

Remove the unnecessary "Sparc" at the beginning of the line and
put the chip information into parentheses so that it is clearer
which part of the line have to be passed to "-cpu" to specify a
different CPU.

Resolves:https://gitlab.com/qemu-project/qemu/-/issues/2141
Signed-off-by: Thomas Huth
---
  target/sparc/cpu.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 2/5] target/sparc/cpu: Avoid spaces by default in the CPU names

2024-03-07 Thread Richard Henderson

On 3/7/24 07:43, Thomas Huth wrote:

The output of "-cpu help" is currently rather confusing to the users:
It is not clear which part of the output defines the CPU names since
the CPU names contain white spaces (which we later have to convert
into dashes internally) For example:

Sparc TI UltraSparc II IU 001700112000 FPU  MMU  NWINS 8

At a first glance, should the name for -cpu be "Sparc TI Ultrasparc II"
or "TI UltraSparc II IU" here? Both would be wrong, the right guess is
"TI UltraSparc II" only. Let's start cleaning up this mess by using
dashes instead of white spaces for the CPU names, like we're doing it
internally later (and like we're doing it in most other targets of QEMU).
Note that it is still possible to pass the CPU names with spaces to the
"-cpu" option, since sparc_cpu_type_name() still translates those to "-".

Buglink:https://gitlab.com/qemu-project/qemu/-/issues/2141
Signed-off-by: Thomas Huth
---
  target/sparc/cpu.c | 56 +++---
  1 file changed, 28 insertions(+), 28 deletions(-)


I think the names are still a bit too long, and the case sensitivity is a titch annoying. 
But it's still an improvement, and I don't want to bike-shed this too much.


Reviewed-by: Richard Henderson 


r~



Re: [PATCH 1/5] target/sparc/cpu: Rename the CPU models with a "+" in their names

2024-03-07 Thread Richard Henderson

On 3/7/24 07:43, Thomas Huth wrote:

+/* Fix up legacy names with '+' in it */
+if (g_str_equal(typename, SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IV+"))) {
+g_free(typename);
+typename = g_strdup(SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IVp"));
+} else if (g_str_equal(typename, 
SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IIIi+"))) {
+g_free(typename);
+typename = g_strdup(SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IIIip"));
+}
+


Legacy names don't include dashes.


r~



Re: [PATCH v2 5/5] tests/tcg: Add multiarch test for Xfer:siginfo:read stub

2024-03-07 Thread Richard Henderson

On 3/7/24 08:26, Gustavo Romero wrote:

Add multiarch test for testing if Xfer:siginfo:read query is properly
handled by gdbstub.

Signed-off-by: Gustavo Romero
---
  tests/tcg/multiarch/Makefile.target   | 10 ++-
  .../gdbstub/test-qxfer-siginfo-read.py| 26 +++
  tests/tcg/multiarch/segfault.c| 14 ++
  3 files changed, 49 insertions(+), 1 deletion(-)
  create mode 100644 tests/tcg/multiarch/gdbstub/test-qxfer-siginfo-read.py
  create mode 100644 tests/tcg/multiarch/segfault.c


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 4/5] gdbstub: Add Xfer:siginfo:read stub

2024-03-07 Thread Richard Henderson

On 3/7/24 08:26, Gustavo Romero wrote:

+void gdb_handle_query_xfer_siginfo(GArray *params, void *user_ctx)
+{
+unsigned long offset, len;
+uint8_t *siginfo_offset;
+
+offset = get_param(params, 0)->val_ul;
+len = get_param(params, 1)->val_ul;
+
+if (offset + len > sizeof(target_siginfo_t)) {


If you save the siginfo_len from gdb_handlesig, you can place this in user.c.

Is it really correct to reject (offset == 0) + (len == large), rather than 
truncate len?


+/* Reply */
+g_string_assign(gdbserver_state.str_buf, "l");
+gdb_memtox(gdbserver_state.str_buf, (const char *)siginfo_offset, len);


It seems easy enough to reply with the exact length remaining...


r~



Re: [PATCH v2 3/5] gdbstub: Save target's siginfo

2024-03-07 Thread Richard Henderson

On 3/7/24 08:26, Gustavo Romero wrote:

Save target's siginfo into gdbserver_state so it can be used later, for
example, in any stub that requires the target's si_signo and si_code.

This change affects only linux-user mode.

Signed-off-by: Gustavo Romero 
Suggested-by: Richard Henderson 
---
  gdbstub/internals.h|  3 +++
  gdbstub/user-target.c  |  3 ++-
  gdbstub/user.c | 14 ++
  include/gdbstub/user.h |  6 +-
  linux-user/main.c  |  2 +-
  linux-user/signal.c|  5 -
  6 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/gdbstub/internals.h b/gdbstub/internals.h
index 56b7c13b75..a7cc69dab3 100644
--- a/gdbstub/internals.h
+++ b/gdbstub/internals.h
@@ -58,6 +58,9 @@ typedef struct GDBState {
  int line_csum; /* checksum at the end of the packet */
  GByteArray *last_packet;
  int signal;
+#ifdef CONFIG_USER_ONLY
+uint8_t siginfo[MAX_SIGINFO_LENGTH];
+#endif


If we this in GDBUserState in user.c -- no need for ifdefs then.


--- a/gdbstub/user-target.c
+++ b/gdbstub/user-target.c
@@ -10,11 +10,12 @@
  #include "qemu/osdep.h"
  #include "exec/gdbstub.h"
  #include "qemu.h"
-#include "internals.h"
  #ifdef CONFIG_LINUX
  #include "linux-user/loader.h"
  #include "linux-user/qemu.h"
+#include "gdbstub/user.h"
  #endif
+#include "internals.h"
  
  /*

   * Map target signal numbers to GDB protocol signal numbers and vice


Why are any changes required here?
Perhaps this is improper patch split from one of the others?


@@ -140,6 +141,11 @@ int gdb_handlesig(CPUState *cpu, int sig, const char 
*reason)
  return sig;
  }
  
+if (siginfo) {

+/* Save target-specific siginfo. */
+memcpy(gdbserver_state.siginfo, siginfo, siginfo_len);
+}


A comment here about asserting the size at compile-time elsewhere would be welcome for 
future code browsers.


Need to record siginfo_len for later use -- you don't want to expose all 128 bytes if the 
actual structure is smaller.



@@ -510,7 +516,7 @@ void gdb_breakpoint_remove_all(CPUState *cs)
  void gdb_syscall_handling(const char *syscall_packet)
  {
  gdb_put_packet(syscall_packet);
-gdb_handlesig(gdbserver_state.c_cpu, 0, NULL);
+gdb_handlesig(gdbserver_state.c_cpu, 0, NULL, NULL, 0);
  }
  
  static bool should_catch_syscall(int num)

@@ -528,7 +534,7 @@ void gdb_syscall_entry(CPUState *cs, int num)
  {
  if (should_catch_syscall(num)) {
  g_autofree char *reason = g_strdup_printf("syscall_entry:%x;", num);
-gdb_handlesig(cs, gdb_target_sigtrap(), reason);
+gdb_handlesig(cs, gdb_target_sigtrap(), reason, NULL, 0);
  }
  }
  
@@ -536,7 +542,7 @@ void gdb_syscall_return(CPUState *cs, int num)

  {
  if (should_catch_syscall(num)) {
  g_autofree char *reason = g_strdup_printf("syscall_return:%x;", num);
-gdb_handlesig(cs, gdb_target_sigtrap(), reason);
+gdb_handlesig(cs, gdb_target_sigtrap(), reason, NULL, 0);
  }


All of this makes me wonder if we should provide a different interface for syscalls, even 
if it uses the same code paths internally.


Do we want to zero the gdbserver siginfo to indicate that the contents are no longer 
valid?  I know it's not a real signal delivered to the process, but might we need to 
construct a simple siginfo struct to match the sigtrap?



r~



Re: [PATCH 0/4] hw/i386/pc: Trivial cleanups

2024-03-07 Thread Bernhard Beschow



Am 1. März 2024 18:59:32 UTC schrieb "Philippe Mathieu-Daudé" 
:
>Trivial cleanups, mostly around the 'isapc' machine.
>
>Philippe Mathieu-Daudé (4):
>  hw/i386/pc: Remove pc_compat_1_4..1.7[] left over declarations
>  hw/i386/pc: Use generated NotifyVmexitOption_str()
>  hw/i386/pc: Remove 'host_type' argument from pc_init1()
>  hw/i386/pc: Have pc_init_isa() pass a NULL pci_type argument

Ping. Will this series make it into 9.0? AFAICS all patches are reviewed.

>
> include/hw/i386/pc.h | 12 
> hw/i386/pc_piix.c| 18 ++
> 2 files changed, 6 insertions(+), 24 deletions(-)
>



Re: [PATCH v2 2/5] linux-user: Move tswap_siginfo out of target code

2024-03-07 Thread Richard Henderson

On 3/7/24 09:21, Alex Bennée wrote:

+/*
+ * Writes out siginfo values byteswapped, accordingly to the target. It 
also
+ * cleans the si_type from si_code making it correct for the target.
+ */
+tswap_siginfo(>info, >info);
+


I'm not sure I like this, you have the same pointer to both a const and
non-const arg. Do we assert we come through this once per signal and
don't risk double swapping the contents?



I suggested this as an intermediate step -- the function will work perfectly fine with 
dest == source.  Follow-up patches should clean this up further, but I told Gustavo to not 
get distracted from his goal with *too* much cleanup.



r~



Re: [PATCH v2 2/5] linux-user: Move tswap_siginfo out of target code

2024-03-07 Thread Richard Henderson

On 3/7/24 08:26, Gustavo Romero wrote:

Move tswap_siginfo from target code to handle_pending_signal. This will
allow some cleanups and having the siginfo ready to be used in gdbstub.

Signed-off-by: Gustavo Romero
Suggested-by: Richard Henderson
---
  linux-user/aarch64/signal.c |  2 +-
  linux-user/alpha/signal.c   |  2 +-
  linux-user/arm/signal.c |  2 +-
  linux-user/hexagon/signal.c |  2 +-
  linux-user/hppa/signal.c|  2 +-
  linux-user/i386/signal.c|  6 +++---
  linux-user/loongarch64/signal.c |  2 +-
  linux-user/m68k/signal.c|  4 ++--
  linux-user/microblaze/signal.c  |  2 +-
  linux-user/mips/signal.c|  4 ++--
  linux-user/nios2/signal.c   |  2 +-
  linux-user/openrisc/signal.c|  2 +-
  linux-user/ppc/signal.c |  4 ++--
  linux-user/riscv/signal.c   |  2 +-
  linux-user/s390x/signal.c   |  2 +-
  linux-user/sh4/signal.c |  2 +-
  linux-user/signal-common.h  |  2 --
  linux-user/signal.c | 10 --
  linux-user/sparc/signal.c   |  2 +-
  linux-user/xtensa/signal.c  |  2 +-
  20 files changed, 31 insertions(+), 27 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 1/5] gdbstub: Rename back gdb_handlesig

2024-03-07 Thread Richard Henderson

On 3/7/24 08:26, Gustavo Romero wrote:

Rename gdb_handlesig_reason back to gdb_handlesig. There is no need to
add a wrapper for gdb_handlesig and rename it when a new parameter is
added.

Signed-off-by: Gustavo Romero 
---
  gdbstub/user.c |  8 
  include/gdbstub/user.h | 15 ++-
  linux-user/main.c  |  2 +-
  linux-user/signal.c|  2 +-
  4 files changed, 8 insertions(+), 19 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 4/5] include/exec: annotate all the MemoryRegion fields

2024-03-07 Thread Richard Henderson

On 3/7/24 08:11, Alex Bennée wrote:

Signed-off-by: Alex Bennée 
---
  include/exec/memory.h | 47 +++
  1 file changed, 43 insertions(+), 4 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 17b741bc4f5..312ed564dbe 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -778,9 +778,48 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void 
**vaddr,
  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
  typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
  
-/** MemoryRegion:

- *
- * A struct representing a memory region.
+/**
+ * struct MemoryRegion - represents a memory region
+ * @parent_obj: parent QOM object for the region
+ * @romd_mode: if true ROM devices accessed directly rather than with @ops
+ * @ram: true if a RAM-type device with a @ram_block
+ * @subpage: true if region covers a subpage
+ * @readonly: true for RAM-type devices that are readonly
+ * @nonvolatile: true for nonvolatile RAM-type devices (e.g. NVDIMM)
+ * @rom_device: true for a ROM device, see also @romd_mode
+ * @flush_coalesced_mmio: true to flush any queued coalesced MMIO
+ * operations before access
+ * @unmergeable: this section should not get merged with adjacent
+ * sections
+ * @dirty_log_mask: dirty events region cares about (see DIRTY_ flags)
+ * @is_iommu: true if part of an IOMMU device
+ * @ram_block: backing @RamBlock if @ram is true
+ * @owner: base QOM object that owns this region
+ * @dev: base Device that owns this region
+ * @ops: access operations for MMIO or @romd_mode devices
+ * @opaque: @dev specific data, passed with @ops
+ * @container: parent `MemoryRegion`
+ * @mapped_via_alias: number of times mapped via @alias, container
+ * might be NULL
+ * @size: size of @MemoryRegion
+ * @addr: physical hwaddr of @MemoryRegion
+ * @destructor: cleanup function when @MemoryRegion finalized
+ * @align: alignment requirements for any physical backing store
+ * @terminates: true if this @MemoryRegion is a leaf node
+ * @ram_device: true if @ram device should use @ops to access
+ * @enabled: true once initialised, false once finalized
+ * @vga_logging_count: count of memory logging clients
+ * @alias: link to aliased @MemoryRegion
+ * @alias_offset: offset into aliased region
+ * @priority: priority when resolving overlapping regions
+ * @subregions: list of subregions in this region
+ * @subregions_link: next subregion in the chain
+ * @coalesced: list of coalesced memory ranges
+ * @name: name of memory region
+ * @ioeventfd_nb: count of @ioeventfds for region
+ * @ioeventfds: ioevent notifiers for region
+ * @rdm: if exists see #RamDiscardManager
+ * @disable_reentrancy_guard: if true don't error if device accesses itself
   */


Why as one big block rather than line by line?
The block is less likely to be properly kept up-to-date and is much harder to 
read.


r~




Re: [PATCH 3/5] include/exec: remove warning_printed from MemoryRegion

2024-03-07 Thread Richard Henderson

On 3/7/24 08:11, Alex Bennée wrote:

Since d197063fcf9 (memory: move unassigned_mem_ops to memory.c) this
field is unused.

Signed-off-by: Alex Bennée
---
  include/exec/memory.h | 1 -
  1 file changed, 1 deletion(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 3/5] include/exec: remove warning_printed from MemoryRegion

2024-03-07 Thread Philippe Mathieu-Daudé

On 7/3/24 19:11, Alex Bennée wrote:

Since d197063fcf9 (memory: move unassigned_mem_ops to memory.c) this
field is unused.

Signed-off-by: Alex Bennée 
---
  include/exec/memory.h | 1 -
  1 file changed, 1 deletion(-)


10+ years ;)

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH] target/riscv: Implement dynamic establishment of custom decoder

2024-03-07 Thread Richard Henderson

-    for (size_t i = 0; i < ARRAY_SIZE(decoders); ++i) {
-    if (decoders[i].guard_func(ctx->cfg_ptr) &&
-    decoders[i].decode_func(ctx, opcode32)) {
+    for (size_t i = 0; i < decoder_table_size; ++i) {
+    if (ctx->decoder[i](ctx, opcode32)) {
  return;


By the way, you're adding layers of pointer chasing, so I suspect you'll find all of this 
is a wash or worse, performance-wise.


Indeed, since some of the predicates are trivial, going the other way might help: allow 
everything to be inlined:


if (decode_insn32(...)) {
return;
}
if (has_xthead_p(...) && decode_xthead(...)) {
return;
}
...

Even if there are 10 entries here, so what?  All of the code has to be compiled into QEMU. 
 You're not going to somehow add truly dynamic code that you've loaded from a module.


You could perhaps streamline predicates such as has_xthead_p to not test 11 variables by 
adding an artificial "ext_xthead_any" configuration entry that is the sum of all of those.



r~



Re: [PATCH] target/riscv: Implement dynamic establishment of custom decoder

2024-03-07 Thread Richard Henderson

On 3/7/24 09:55, Daniel Henrique Barboza wrote:

(--- adding Richard ---)


On 3/6/24 06:33, Huang Tao wrote:

In this patch, we modify the decoder to be a freely composable data
structure instead of a hardcoded one. It can be dynamically builded up
according to the extensions.
This approach has several benefits:
1. Provides support for heterogeneous cpu architectures. As we add decoder in
    CPUArchState, each cpu can have their own decoder, and the decoders can be
    different due to cpu's features.
2. Improve the decoding efficiency. We run the guard_func to see if the decoder
    can be added to the dynamic_decoder when building up the decoder. Therefore,
    there is no need to run the guard_func when decoding each instruction. It 
can
    improve the decoding efficiency
3. For vendor or dynamic cpus, it allows them to customize their own decoder
    functions to improve decoding efficiency, especially when vendor-defined
    instruction sets increase. Because of dynamic building up, it can skip the 
other
    decoder guard functions when decoding.

Signed-off-by: Huang Tao 
Suggested-by: Christoph Muellner 
Co-authored-by: LIU Zhiwei 
---
  target/riscv/cpu.c | 20 
  target/riscv/cpu.h |  2 ++
  target/riscv/cpu_decoder.h | 34 ++
  target/riscv/translate.c   | 28 
  4 files changed, 68 insertions(+), 16 deletions(-)
  create mode 100644 target/riscv/cpu_decoder.h

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 5ff0192c52..cadcd51b4f 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -38,6 +38,7 @@
  #include "kvm/kvm_riscv.h"
  #include "tcg/tcg-cpu.h"
  #include "tcg/tcg.h"
+#include "cpu_decoder.h"
  /* RISC-V CPU definitions */
  static const char riscv_single_letter_exts[] = "IEMAFDQCBPVH";
@@ -1102,6 +1103,23 @@ static void riscv_cpu_satp_mode_finalize(RISCVCPU *cpu, Error 
**errp)

  }
  #endif
+static void riscv_cpu_finalize_dynamic_decoder(RISCVCPU *cpu)
+{
+    CPURISCVState *env = >env;
+    decode_fn *dynamic_decoder;
+    dynamic_decoder = g_new0(decode_fn, decoder_table_size);
+    int j = 0;
+    for (size_t i = 0; i < decoder_table_size; ++i) {
+    if (decoder_table[i].guard_func &&
+    decoder_table[i].guard_func(>cfg)) {
+    dynamic_decoder[j] = decoder_table[i].decode_fn;
+    j++;
+    }
+    }
+
+    env->decoder = dynamic_decoder;


You should save J into ENV as well, or use GPtrArray which would carry the length along 
with it naturally.


Is the cpu configuration on each cpu, or on the cpu class?

Even if the configuration is per cpu, this new element belongs in ArchCPU not 
CPUArchState.  Usually all of CPUArchState is zeroed during reset.


@@ -1153,9 +1149,8 @@ static void decode_opc(CPURISCVState *env, DisasContext *ctx, 
uint16_t opcode)

   ctx->base.pc_next + 2));
  ctx->opcode = opcode32;
-    for (size_t i = 0; i < ARRAY_SIZE(decoders); ++i) {
-    if (decoders[i].guard_func(ctx->cfg_ptr) &&
-    decoders[i].decode_func(ctx, opcode32)) {
+    for (size_t i = 0; i < decoder_table_size; ++i) {
+    if (ctx->decoder[i](ctx, opcode32)) {
  return;


You definitely should not be using decoder_table_size here, because you eliminated 
elements, which are in fact NULL here.



Am I missing something? It seems we can just remove the ctx->decode pointer 
altogether
and use env->decoder.


We try to avoid placing env into DisasContext, so that it is much harder to make the 
mistake of referencing env fields at translation-time, when you really needed to generate 
tcg code to reference the fields at runtime.




r~




Re: [PATCH] target/riscv: Implement dynamic establishment of custom decoder

2024-03-07 Thread Daniel Henrique Barboza

(--- adding Richard ---)


On 3/6/24 06:33, Huang Tao wrote:

In this patch, we modify the decoder to be a freely composable data
structure instead of a hardcoded one. It can be dynamically builded up
according to the extensions.
This approach has several benefits:
1. Provides support for heterogeneous cpu architectures. As we add decoder in
CPUArchState, each cpu can have their own decoder, and the decoders can be
different due to cpu's features.
2. Improve the decoding efficiency. We run the guard_func to see if the decoder
can be added to the dynamic_decoder when building up the decoder. Therefore,
there is no need to run the guard_func when decoding each instruction. It 
can
improve the decoding efficiency
3. For vendor or dynamic cpus, it allows them to customize their own decoder
functions to improve decoding efficiency, especially when vendor-defined
instruction sets increase. Because of dynamic building up, it can skip the 
other
decoder guard functions when decoding.

Signed-off-by: Huang Tao 
Suggested-by: Christoph Muellner 
Co-authored-by: LIU Zhiwei 
---
  target/riscv/cpu.c | 20 
  target/riscv/cpu.h |  2 ++
  target/riscv/cpu_decoder.h | 34 ++
  target/riscv/translate.c   | 28 
  4 files changed, 68 insertions(+), 16 deletions(-)
  create mode 100644 target/riscv/cpu_decoder.h

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 5ff0192c52..cadcd51b4f 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -38,6 +38,7 @@
  #include "kvm/kvm_riscv.h"
  #include "tcg/tcg-cpu.h"
  #include "tcg/tcg.h"
+#include "cpu_decoder.h"
  
  /* RISC-V CPU definitions */

  static const char riscv_single_letter_exts[] = "IEMAFDQCBPVH";
@@ -1102,6 +1103,23 @@ static void riscv_cpu_satp_mode_finalize(RISCVCPU *cpu, 
Error **errp)
  }
  #endif
  
+static void riscv_cpu_finalize_dynamic_decoder(RISCVCPU *cpu)

+{
+CPURISCVState *env = >env;
+decode_fn *dynamic_decoder;
+dynamic_decoder = g_new0(decode_fn, decoder_table_size);
+int j = 0;
+for (size_t i = 0; i < decoder_table_size; ++i) {
+if (decoder_table[i].guard_func &&
+decoder_table[i].guard_func(>cfg)) {
+dynamic_decoder[j] = decoder_table[i].decode_fn;
+j++;
+}
+}
+
+env->decoder = dynamic_decoder;
+}
+
  void riscv_cpu_finalize_features(RISCVCPU *cpu, Error **errp)
  {
  Error *local_err = NULL;
@@ -1127,6 +1145,8 @@ void riscv_cpu_finalize_features(RISCVCPU *cpu, Error 
**errp)
  return;
  }
  }
+
+riscv_cpu_finalize_dynamic_decoder(cpu);
  }
  
  static void riscv_cpu_realize(DeviceState *dev, Error **errp)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 5d291a7092..42bfed065c 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -30,6 +30,7 @@
  #include "qemu/int128.h"
  #include "cpu_bits.h"
  #include "cpu_cfg.h"
+#include "cpu_decoder.h"
  #include "qapi/qapi-types-common.h"
  #include "cpu-qom.h"
  
@@ -433,6 +434,7 @@ struct CPUArchState {

  uint64_t kvm_timer_state;
  uint64_t kvm_timer_frequency;
  #endif /* CONFIG_KVM */
+const decode_fn *decoder;
  };
  
  /*

diff --git a/target/riscv/cpu_decoder.h b/target/riscv/cpu_decoder.h
new file mode 100644
index 00..549414ce4c
--- /dev/null
+++ b/target/riscv/cpu_decoder.h
@@ -0,0 +1,34 @@
+/*
+ * QEMU RISC-V CPU Decoder
+ *
+ * Copyright (c) 2023-2024 Alibaba Group
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+
+#ifndef RISCV_CPU_DECODER_H
+#define RISCV_CPU_DECODER_H
+
+struct DisasContext;
+struct RISCVCPUConfig;
+typedef struct RISCVDecoder {
+bool (*guard_func)(const struct RISCVCPUConfig *);
+bool (*decode_fn)(struct DisasContext *, uint32_t);
+} RISCVDecoder;
+
+typedef bool (*decode_fn)(struct DisasContext *, uint32_t);
+
+extern const size_t decoder_table_size;
+
+extern const RISCVDecoder decoder_table[];
+#endif
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 177418b2b9..23c5321bce 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -115,6 +115,7 @@ typedef struct DisasContext {
  bool frm_valid;
  /* TCG of the current insn_start */
  TCGOp *insn_start;
+const decode_fn *decoder;
  } DisasContext;
  
  static inline bool has_ext(DisasContext *ctx, uint32_t ext)

@@ 

RE: [PATCH 1/3] ui/gtk: skip drawing guest scanout when associated VC is invisible

2024-03-07 Thread Kim, Dongwon
Hi Daniel,

> -Original Message-
> From: Daniel P. Berrangé 
> Sent: Thursday, March 7, 2024 10:01 AM
> To: Kim, Dongwon 
> Cc: Marc-André Lureau ; qemu-
> de...@nongnu.org
> Subject: Re: [PATCH 1/3] ui/gtk: skip drawing guest scanout when associated
> VC is invisible
> 
> On Thu, Mar 07, 2024 at 05:53:24PM +, Kim, Dongwon wrote:
> > Hi Daniel,
> >
> > > -Original Message-
> > > From: Daniel P. Berrangé 
> > > Sent: Thursday, March 7, 2024 1:46 AM
> > > To: Kim, Dongwon 
> > > Cc: Marc-André Lureau ; qemu-
> > > de...@nongnu.org
> > > Subject: Re: [PATCH 1/3] ui/gtk: skip drawing guest scanout when
> > > associated VC is invisible
> > >
> > > On Thu, Feb 01, 2024 at 06:48:58PM +, Kim, Dongwon wrote:
> > > > Hi Marc-André,
> > > >
> > > > Thanks for your feedback. Yes, you are right, rendering doesn't
> > > > stop on Ubuntu system as it has preview even after the window is
> > > > minimized. But
> > > this is not always the case.
> > > > Some simple windows managers don't have this preview (thumbnail)
> > > > feature and this visible flag is not toggled. And the rendering
> > > > stops right away there when the window is minimized.
> > >
> > > This makes me pretty uncomfortable. This is proposing changing QEMU
> > > behaviour to workaround a problem whose behaviour varies based on
> > > what 3rd party software QEMU is running on
> > >
> > > What you say "window managers" are you referring to a traditional
> > > X11 based host display only, or does the problem also exist on
> > > modern Wayland host display ?
> >
> > [Kim, Dongwon]  I didn't mean anything about the compositor/display
> > server itself. And I am not sure about the general behavior of Wayland
> > compositors but the point here is the rendering while the app is being
> > iconized (minimized) is not always the case. For example, ICE-WM on
> > Yocto Linux doesn't have any preview for the iconized or minimized
> > applications, which means all drawing activities on the minimized app
> > are paused. This is the problem in case blob scanout is used with
> > virtio-gpu on the guest because the guest won't receive the response
> > for the frame submission until the frame is fully rendered on the
> > host. This is causing timeout and further issue on the display
> > pipeline in virtio-gpu driver in the guest.
> 
> I guess I'm wondering why we should consider this a bug in QEMU rather than
> a bug in either the toolkit or host rendering stack ?
> 
> Lets say there was no guest OS here, just a regular host app issuing drawing
> requests that were equivalent to the workload the guest is issuing.  If that
> applications' execution got blocked because its drawing requests are not
> getting processed when iconified, I'd be inclined to call it a bug.
> 
> Or are we perhaps handling drawing in the wrong way in QEMU ?
[Kim, Dongwon] Hmm.. Yeah that is a good point.. If non-rendering workload is 
blocked in the same way, that would be a problem. 
> 
> If the problem is with drawing to a iconified windows, is there an 
> intermediate
> target buffer we should be drawing to, instead of directly to the window. 
> There
> must be some supported way to have drawing requests fully processed in the
> background indepent of having a visible window, surely ?
[Kim, Dongwon]  I will check what other options that don't look like WAs are 
available.

> 
> With regards,
> Daniel
> --
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [PATCH v2 00/10] mirror: allow switching from background to active mode

2024-03-07 Thread Vladimir Sementsov-Ogievskiy

On 04.03.24 14:09, Peter Krempa wrote:

On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:

Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:

On 03.11.23 18:56, Markus Armbruster wrote:

Kevin Wolf  writes:


[...]


Is the job abstraction a failure?

We have

  block-job- command  since   job- commandsince
  -
  block-job-set-speed 1.1
  block-job-cancel1.1 job-cancel  3.0
  block-job-pause 1.3 job-pause   3.0
  block-job-resume1.3 job-resume  3.0
  block-job-complete  1.3 job-complete3.0
  block-job-dismiss   2.12job-dismiss 3.0
  block-job-finalize  2.12job-finalize3.0
  block-job-change8.2
  query-block-jobs1.1 query-jobs


[...]


I consider these strictly optional. We don't really have strong reasons
to deprecate these commands (they are just thin wrappers), and I think
libvirt still uses block-job-* in some places.


Libvirt uses 'block-job-cancel' because it has different semantics from
'job-cancel' which libvirt documented as the behaviour of the API that
uses it. (Semantics regarding the expectation of what is written to the
destination node at the point when the job is cancelled).



That's the following semantics:

  # Note that if you issue 'block-job-cancel' after 'drive-mirror' has
  # indicated (via the event BLOCK_JOB_READY) that the source and
  # destination are synchronized, then the event triggered by this
  # command changes to BLOCK_JOB_COMPLETED, to indicate that the
  # mirroring has ended and the destination now has a point-in-time copy
  # tied to the time of the cancellation.

Hmm. Looking at this, it looks for me, that should probably a 
'block-job-complete" command (as leading to BLOCK_JOB_COMPLETED).

Actually, what is the difference between block-job-complete and 
block-job-cancel(force=false) for mirror in ready state?

I only see the following differencies:

1. block-job-complete documents that it completes the job synchronously.. But 
looking at mirror code I see it just set s->should_complete = true, which will 
be then handled asynchronously.. So I doubt that documentation is correct.

2. block-job-complete will trigger final graph changes. block-job-cancel will 
not.

Is [2] really useful? Seems yes: in case of some failure before starting 
migration target, we'd like to continue executing source. So, no reason to 
break block-graph in source, better keep it unchanged.

But I think, such behavior better be setup by mirror-job start parameter, 
rather then by special option for cancel (or even compelete) command, useful 
only for mirror.

So, what about the following substitution for block-job-cancel:

block-job-cancel(force=true)  -->  use job-cancel

block-job-cancel(force=false) for backup, stream, commit  -->  use job-cancel

block-job-cancel(force=false) for mirror in ready mode  -->

  instead, use block-job-complete. If you don't need final graph modification 
which mirror job normally does, use graph-change=false parameter for 
blockdev-mirror command.


(I can hardly remember, that we've already discussed something like this long 
time ago, but I don't remember the results)

I also a bit unsure about active commit soft-cancelling semantics. Is it 
actually useful? If yes, block-commit command will need similar option.

--
Best regards,
Vladimir




Re: [RFC PATCH 4/5] hw/i386/q35: Wire virtual SMI# lines to ICH9 chipset

2024-03-07 Thread Thomas Huth

On 28/02/2024 17.43, Zhao Liu wrote:

Hi Philippe,


+/*
+ * Real ICH9 contains a single SMI output line and doesn't broadcast CPUs.
+ * Virtualized ICH9 allows broadcasting upon negatiation with guest, see
+ * commit 5ce45c7a2b.
+ */
+enum {
+ICH9_VIRT_SMI_BROADCAST,
+ICH9_VIRT_SMI_CURRENT,
+#define ICH9_VIRT_SMI_COUNT 2
+};
+


Just quick look here. Shouldn't ICH9_VIRT_SMI_COUNT be defined outside of
enum {}?


Or even better, do it without a #define:

enum {
ICH9_VIRT_SMI_BROADCAST,
ICH9_VIRT_SMI_CURRENT,
ICH9_VIRT_SMI_COUNT
};

 Thomas




Re: [PATCH 2/2] tests/tcg: Add multiarch test for Xfer:siginfo:read stub

2024-03-07 Thread Richard Henderson

On 3/7/24 07:50, Gustavo Romero wrote:

Hi Richard,

On 3/4/24 7:51 PM, Richard Henderson wrote:

On 3/4/24 10:59, Gustavo Romero wrote:

Perhaps just abort for SIGABRT instead?


Although this can make a simpler test, the test can't control
the si_addr value easily, which I think is interesting to be tested.

Why do you prefer SIGABRT?


I missed that you were testing si_addr -- in which case SIGSEGV is a good match.



A test using setitimer to raise SIGALRM would test the async path.


SIGLARM doesn't generate any interesting siginfo?


It should at minimum have si_sig = SIGALRM.



gromero@arm64:~$ gdb -q ./sigalrm
Reading symbols from ./sigalrm...
(gdb) run
Starting program: /home/gromero/sigalrm

Program terminated with signal SIGALRM, Alarm clock.
The program no longer exists.
(gdb) p $_siginfo
$1 = void


Well that's because the program died.
Do you need to have gdb handle the signal?


ouch, right :)

However, on a remote target, even if I catch that signal using
'catch signal SIGALRM' the GDBstub only closes the connection
when SIGALRM is delivered. That's odd, I don't understand why.

I'm using the same binary that pretty much works on GDB locally.


[Remote target]

gromero@arm64:~$ gdb -q
gromero@arm64:~/qemu_tests$ gdb -q ./sigalrm
Reading symbols from ./sigalrm...
(gdb) catch signal SIGALRM
Catchpoint 1 (signal SIGALRM)
(gdb) c
The program is not being run.
(gdb) run
Starting program: /home/gromero/qemu_tests/sigalrm
[Inferior 1 (process 12732) exited normally]
(gdb) quit

on the QEMU gdbstub side it reports "Alarm clock":

gromero@amd:~/git/qemu/build$ ./qemu-aarch64 -g 1234 ./sigalrm -s
Alarm clock
gromero@amd:~/git/qemu/build$


[Locally]

gromero@arm64:~/qemu_tests$ gdb -q ./sigalrm
Reading symbols from ./sigalrm...
(gdb) catch signal SIGALRM
Catchpoint 1 (signal SIGALRM)
(gdb) run -s
Starting program: /home/gromero/qemu_tests/sigalrm -s

Catchpoint 1 (signal SIGALRM), 0x0041a410 in ualarm ()
(gdb) quit


I'd like to add for the async path using SIGALRM but I need more
time to understand what's going on regarding SIGLARM. I understand
that's nothing wrong with the Xfer:siginfo:read stub itself, and
because the main goal of the test is to test the stub, if you don't
mind, I'd like to keep only the test with SIGSEGV for v2 and leave
the async test as a follow-up.


Well that's certainly surprising.
Would you please file a bug report about this?
I think I know what the problem is, but let's track it anyway.


r~



Re: [PATCH V3 1/1] target/loongarch: Fixed tlb huge page loading issue

2024-03-07 Thread Richard Henderson

On 3/6/24 21:37, Xianglai Li wrote:

When we use qemu tcg simulation, the page size of bios is 4KB.
When using the level 2 super large page (page size is 1G) to create the page 
table,
it is found that the content of the corresponding address space is abnormal,
resulting in the bios can not start the operating system and graphical 
interface normally.

The lddir and ldpte instruction emulation has
a problem with the use of super large page processing above level 2.
The page size is not correctly calculated,
resulting in the wrong page size of the table entry found by tlb.

Signed-off-by: Xianglai Li 
Cc: maob...@loongson.cn
Cc: Song Gao 
Cc: Xiaojuan Yang 
Cc: zhaotian...@loongson.cn
---
  target/loongarch/internals.h  |  8 +++
  target/loongarch/tcg/tlb_helper.c | 92 +++
  2 files changed, 76 insertions(+), 24 deletions(-)

diff --git a/target/loongarch/internals.h b/target/loongarch/internals.h
index a2fc54c8a7..55ceb4c079 100644
--- a/target/loongarch/internals.h
+++ b/target/loongarch/internals.h
@@ -16,6 +16,14 @@
  #define TARGET_PHYS_MASK MAKE_64BIT_MASK(0, TARGET_PHYS_ADDR_SPACE_BITS)
  #define TARGET_VIRT_MASK MAKE_64BIT_MASK(0, TARGET_VIRT_ADDR_SPACE_BITS)
  
+/*

+ * The [13..14]bits of the entry base address of the lddir/ldpte
+ * directive are used to represent the level of the large page
+ * when processing the huge page entry
+ */
+#define HUGE_PAGE_LEVEL_SHIFT   13
+#define HUGE_PAGE_LEVEL_MASK MAKE_64BIT_MASK(HUGE_PAGE_LEVEL_SHIFT, 2)


This would be cleaner using , e.g.

FIELD(LDDIR, HUGE, 6, 1)
FIELD(LDDIR, LEVEL, 13, 2)


+static int get_dir_base_width(CPULoongArchState *env, uint64_t *dir_base,
+  uint64_t *dir_width, target_ulong level);


Very often you can place the new function just before its first use so that no prior 
declaration is required.


Returning a bool with true for success and false for failure is preferred over 
0/-1.


@@ -487,6 +490,16 @@ target_ulong helper_lddir(CPULoongArchState *env, 
target_ulong base,
  int shift;
  uint64_t dir_base, dir_width;
  bool huge = (base >> LOONGARCH_PAGE_HUGE_SHIFT) & 0x1;
+uint64_t huge_page_level = base & HUGE_PAGE_LEVEL_MASK;
+
+if (huge) {


if (FIELD_EX64(base, LDDIR, HUGE))


+if (huge_page_level) {


if (FIELD_EX64(base, LDDIR, LEVEL))


+} else {
+huge_page_level = (level & 0x3) << HUGE_PAGE_LEVEL_SHIFT;
+return base | huge_page_level;


return FIELD_DP64(base, LDDIR, LEVEL, level);

I suppose setting bit [6] with level == 4 is a "don't do that" sort of programming error. 
You could log the error here, perhaps:


if (unlikely(level == 4)) {
qemu_log_mask(LOG_GUEST_ERROR, "Attempted use of level 4 huge page\n");
}



@@ -495,30 +508,10 @@ target_ulong helper_lddir(CPULoongArchState *env, 
target_ulong base,
  shift = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, PTEWIDTH);
  shift = (shift + 1) * 3;
  
-if (huge) {

-return base;
-}
-switch (level) {
-case 1:
-dir_base = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, DIR1_BASE);
-dir_width = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, DIR1_WIDTH);
-break;
-case 2:
-dir_base = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, DIR2_BASE);
-dir_width = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, DIR2_WIDTH);
-break;
-case 3:
-dir_base = FIELD_EX64(env->CSR_PWCH, CSR_PWCH, DIR3_BASE);
-dir_width = FIELD_EX64(env->CSR_PWCH, CSR_PWCH, DIR3_WIDTH);
-break;
-case 4:
-dir_base = FIELD_EX64(env->CSR_PWCH, CSR_PWCH, DIR4_BASE);
-dir_width = FIELD_EX64(env->CSR_PWCH, CSR_PWCH, DIR4_WIDTH);
-break;
-default:
-do_raise_exception(env, EXCCODE_INE, GETPC());
+if (get_dir_base_width(env, _base, _width, level) != 0) {
  return 0;
  }


I believe that we should not raise an exception here at all.  This illegal instruction 
exception is based on the LDDIR immediate operand, so we should have diagnosed this error 
and raised an exception in trans_lddir().


Therefore the default label should use only g_assert_not_reached(), and there need not be 
a error return from get_dir_base_width at all.




@@ -534,17 +527,38 @@ void helper_ldpte(CPULoongArchState *env, target_ulong 
base, target_ulong odd,
  bool huge = (base >> LOONGARCH_PAGE_HUGE_SHIFT) & 0x1;
  uint64_t ptbase = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, PTBASE);
  uint64_t ptwidth = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, PTWIDTH);
+uint64_t dir_base, dir_width;
+uint64_t huge_page_level;
  
  base = base & TARGET_PHYS_MASK;
  
  if (huge) {

-/* Huge Page. base is paddr */
+/*
+ * Gets the huge page level
+ * Clears the huge page level information in the address
+ * Clears huge page bit
+ * Gets huge page size
+ */
+huge_page_level = (base & HUGE_PAGE_LEVEL_MASK) >>
+  

Re: [RISC-V][tech-server-platform] [RISC-V][tech-server-soc] [RFC 2/2] target/riscv: Add server platform reference cpu

2024-03-07 Thread Daniel Henrique Barboza




On 3/7/24 09:17, Heinrich Schuchardt wrote:

On 07.03.24 08:36, Wu, Fei2 wrote:

On 3/6/2024 9:26 PM, Wu, Fei wrote:

On 3/5/2024 1:58 PM, Wu, Fei wrote:

On 3/5/2024 3:43 AM, Daniel Henrique Barboza wrote:



On 3/4/24 07:25, Fei Wu wrote:

The harts requirements of RISC-V server platform [1] require RVA23 ISA
profile support, plus Sv48, Svadu, H, Sscofmpf etc. This patch provides
a virt CPU type (rvsp-ref) as compliant as possible.

[1]
https://github.com/riscv-non-isa/riscv-server-platform/blob/main/server_platform_requirements.adoc

Signed-off-by: Fei Wu 
--->   hw/riscv/server_platform_ref.c |  6 +++-
   target/riscv/cpu-qom.h |  1 +
   target/riscv/cpu.c | 62 ++
   3 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/hw/riscv/server_platform_ref.c
b/hw/riscv/server_platform_ref.c
index ae90c4b27a..52ec607cee 100644
--- a/hw/riscv/server_platform_ref.c
+++ b/hw/riscv/server_platform_ref.c
@@ -1205,11 +1205,15 @@ static void
rvsp_ref_machine_class_init(ObjectClass *oc, void *data)
   {
   char str[128];
   MachineClass *mc = MACHINE_CLASS(oc);
+    static const char * const valid_cpu_types[] = {
+    TYPE_RISCV_CPU_RVSP_REF,
+    };
     mc->desc = "RISC-V Server SoC Reference board";
   mc->init = rvsp_ref_machine_init;
   mc->max_cpus = RVSP_CPUS_MAX;
-    mc->default_cpu_type = TYPE_RISCV_CPU_BASE;
+    mc->default_cpu_type = TYPE_RISCV_CPU_RVSP_REF;
+    mc->valid_cpu_types = valid_cpu_types;


I suggest introducing this patch first, then the new machine type that
will use it as a default
CPU. The reason is to facilitate future bisects. If we introduce the
board first, a future bisect
might hit the previous patch, the board will be run using RV64 instead
of the correct CPU, and
we'll have different results because of it.


Good suggestion.


   mc->pci_allow_0_address = true;
   mc->default_nic = "e1000e";
   mc->possible_cpu_arch_ids = riscv_numa_possible_cpu_arch_ids;
diff --git a/target/riscv/cpu-qom.h b/target/riscv/cpu-qom.h
index 3670cfe6d9..adb934d19e 100644
--- a/target/riscv/cpu-qom.h
+++ b/target/riscv/cpu-qom.h
@@ -49,6 +49,7 @@
   #define TYPE_RISCV_CPU_SIFIVE_U54
RISCV_CPU_TYPE_NAME("sifive-u54")
   #define TYPE_RISCV_CPU_THEAD_C906
RISCV_CPU_TYPE_NAME("thead-c906")
   #define TYPE_RISCV_CPU_VEYRON_V1
RISCV_CPU_TYPE_NAME("veyron-v1")
+#define TYPE_RISCV_CPU_RVSP_REF RISCV_CPU_TYPE_NAME("rvsp-ref")
   #define TYPE_RISCV_CPU_HOST RISCV_CPU_TYPE_NAME("host")
     OBJECT_DECLARE_CPU_TYPE(RISCVCPU, RISCVCPUClass, RISCV_CPU)
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 5ff0192c52..bc91be702b 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -2282,6 +2282,67 @@ static void rva22s64_profile_cpu_init(Object *obj)
     RVA22S64.enabled = true;
   }
+
+static void rv64_rvsp_ref_cpu_init(Object *obj)
+{
+    CPURISCVState *env = _CPU(obj)->env;
+    RISCVCPU *cpu = RISCV_CPU(obj);
+
+    riscv_cpu_set_misa_ext(env, RVG | RVC | RVS | RVU | RVH | RVV);
+
+    /* FIXME: change to 1.13 */
+    env->priv_ver = PRIV_VERSION_1_12_0;
+
+    /* RVA22U64 */
+    cpu->cfg.mmu = true;
+    cpu->cfg.ext_zifencei = true;
+    cpu->cfg.ext_zicsr = true;
+    cpu->cfg.ext_zicntr = true;
+    cpu->cfg.ext_zihpm = true;
+    cpu->cfg.ext_zihintpause = true;
+    cpu->cfg.ext_zba = true;
+    cpu->cfg.ext_zbb = true;
+    cpu->cfg.ext_zbs = true;
+    cpu->cfg.zic64b = true;
+    cpu->cfg.ext_zicbom = true;
+    cpu->cfg.ext_zicbop = true;
+    cpu->cfg.ext_zicboz = true;
+    cpu->cfg.cbom_blocksize = 64;
+    cpu->cfg.cbop_blocksize = 64;
+    cpu->cfg.cboz_blocksize = 64;
+    cpu->cfg.ext_zfhmin = true;
+    cpu->cfg.ext_zkt = true;


You can change this whole block with:

RVA22U64.enabled = true;


riscv_cpu_add_profiles() will check if we have a profile enabled and, if
that's the
case, we'll enable all its extensions in the CPU.

In the near future, when we implement a proper RVA23 support, we'll be
able to just do
a single RVA23S64.enabled = true in this cpu_init(). But for now we can
at least declare
RVA22U64 (perhaps RVA22S64) support for this CPU.



Hi Daniel,

I'm not sure if it's a regression or the usage has been changed. I'm not
able to use '-cpu rva22s64' on latest qemu (db596ae190).


I did a quick git bisect and found that commit d06f28db6 "target/riscv:
move 'mmu' to riscv_cpu_properties[]" disabled mmu by default, so that
an explicit mmu option should be added to qemu command line like '-cpu
rva22s64,mmu=true', I think rva22s64 should enable it by default.

Thanks,
Fei.


It is nice that the MMU can be disabled. But is there any reason why the MMU 
should be disabled by default on the virt machine (which typically is used to 
run an operating system)?


'mmu' is currently being handled as a CPU property, not a board property. So 
using
the 'virt' board without a MMU is valid (albeit not that useful/common).



Can we add mmu=true as default to the 

Re: [PATCH 05/17] hw/vfio/ap: Fix missing ERRP_GUARD() for error_prepend()

2024-03-07 Thread Thomas Huth

On 04/03/2024 16.12, Anthony Krowiak wrote:


On 2/29/24 12:30 PM, Thomas Huth wrote:

On 29/02/2024 15.39, Zhao Liu wrote:

From: Zhao Liu 

As the comment in qapi/error, passing @errp to error_prepend() requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
...
* - It should not be passed to error_prepend(), error_vprepend() or
*   error_append_hint(), because that doesn't work with _fatal.
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or _fatal.

ERRP_GUARD() could avoid the case when @errp is the pointer of
error_fatal, the user can't see this additional information, because
exit() happens in error_setg earlier than information is added [1].

The vfio_ap_realize() passes @errp to error_prepend(), and as a
DeviceClass.realize method, its @errp is so widely sourced that it is
necessary to protect it with ERRP_GUARD().

To avoid the issue like [1] said, add missing ERRP_GUARD() at the
beginning of this function.

[1]: Issue description in the commit message of commit ae7c80a7bd73
  ("error: New macro ERRP_GUARD()").

Cc: Alex Williamson 
Cc: "Cédric Le Goater" 
Cc: Thomas Huth 
Cc: Tony Krowiak 
Cc: Halil Pasic 
Cc: Jason Herne 
Signed-off-by: Zhao Liu 
---
  hw/vfio/ap.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index e157aa1ff79c..7c4caa593863 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -155,6 +155,7 @@ static void 
vfio_ap_unregister_irq_notifier(VFIOAPDevice *vapdev,

    static void vfio_ap_realize(DeviceState *dev, Error **errp)
  {
+    ERRP_GUARD();
  int ret;
  Error *err = NULL;


Now this function looks like we need both, ERRP_GUARD and the local "err" 
variable? ... patch looks ok to me, but maybe Markus has an idea how this 
could be done in a nicer way?



Correct me if I'm wrong, but my understanding from reading the prologue in 
error.h is that errp is used to pass errors back to the caller. The 'err' 
variable is used to report errors set by a call to the 
vfio_ap_register_irq_notification function after which this function returns 
cleanly.


Right, no objections, that's what I meant with "this function looks like we 
need both" ...
But having both, "err" and "errp" in one function also looks somewhat 
confusing at a first glance. No clue how this could be done much better 
though, maybe rename "err" to "local_err" to make it clear that the two 
variables are used independently?


 Thomas




Re: [PATCH v2 2/5] linux-user: Move tswap_siginfo out of target code

2024-03-07 Thread Alex Bennée
Gustavo Romero  writes:

> Move tswap_siginfo from target code to handle_pending_signal. This will
> allow some cleanups and having the siginfo ready to be used in gdbstub.
>
> Signed-off-by: Gustavo Romero 
> Suggested-by: Richard Henderson 
> ---
>  linux-user/aarch64/signal.c |  2 +-
>  linux-user/alpha/signal.c   |  2 +-
>  linux-user/arm/signal.c |  2 +-
>  linux-user/hexagon/signal.c |  2 +-
>  linux-user/hppa/signal.c|  2 +-
>  linux-user/i386/signal.c|  6 +++---
>  linux-user/loongarch64/signal.c |  2 +-
>  linux-user/m68k/signal.c|  4 ++--
>  linux-user/microblaze/signal.c  |  2 +-
>  linux-user/mips/signal.c|  4 ++--
>  linux-user/nios2/signal.c   |  2 +-
>  linux-user/openrisc/signal.c|  2 +-
>  linux-user/ppc/signal.c |  4 ++--
>  linux-user/riscv/signal.c   |  2 +-
>  linux-user/s390x/signal.c   |  2 +-
>  linux-user/sh4/signal.c |  2 +-
>  linux-user/signal-common.h  |  2 --
>  linux-user/signal.c | 10 --
>  linux-user/sparc/signal.c   |  2 +-
>  linux-user/xtensa/signal.c  |  2 +-
>  20 files changed, 31 insertions(+), 27 deletions(-)
>
> diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c

> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -409,8 +409,8 @@ static inline void 
> host_to_target_siginfo_noswap(target_siginfo_t *tinfo,
>  tinfo->si_code = deposit32(si_code, 16, 16, si_type);
>  }
>  
> -void tswap_siginfo(target_siginfo_t *tinfo,
> -   const target_siginfo_t *info)
> +static void tswap_siginfo(target_siginfo_t *tinfo,
> +  const target_siginfo_t *info)
>  {
>  int si_type = extract32(info->si_code, 16, 16);
>  int si_code = sextract32(info->si_code, 0, 16);
> @@ -1180,6 +1180,12 @@ static void handle_pending_signal(CPUArchState 
> *cpu_env, int sig,
>  /* dequeue signal */
>  k->pending = 0;
>  
> +/*
> + * Writes out siginfo values byteswapped, accordingly to the target. It 
> also
> + * cleans the si_type from si_code making it correct for the target.
> + */
> +tswap_siginfo(>info, >info);
> +

I'm not sure I like this, you have the same pointer to both a const and
non-const arg. Do we assert we come through this once per signal and
don't risk double swapping the contents?

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [RISC-V][tech-server-soc] [RFC 2/2] target/riscv: Add server platform reference cpu

2024-03-07 Thread Daniel Henrique Barboza




On 3/7/24 04:36, Wu, Fei wrote:

On 3/6/2024 9:26 PM, Wu, Fei wrote:

On 3/5/2024 1:58 PM, Wu, Fei wrote:

On 3/5/2024 3:43 AM, Daniel Henrique Barboza wrote:



On 3/4/24 07:25, Fei Wu wrote:

The harts requirements of RISC-V server platform [1] require RVA23 ISA
profile support, plus Sv48, Svadu, H, Sscofmpf etc. This patch provides
a virt CPU type (rvsp-ref) as compliant as possible.

[1]
https://github.com/riscv-non-isa/riscv-server-platform/blob/main/server_platform_requirements.adoc

Signed-off-by: Fei Wu 
--->   hw/riscv/server_platform_ref.c |  6 +++-
   target/riscv/cpu-qom.h |  1 +
   target/riscv/cpu.c | 62 ++
   3 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/hw/riscv/server_platform_ref.c
b/hw/riscv/server_platform_ref.c
index ae90c4b27a..52ec607cee 100644
--- a/hw/riscv/server_platform_ref.c
+++ b/hw/riscv/server_platform_ref.c
@@ -1205,11 +1205,15 @@ static void
rvsp_ref_machine_class_init(ObjectClass *oc, void *data)
   {
   char str[128];
   MachineClass *mc = MACHINE_CLASS(oc);
+    static const char * const valid_cpu_types[] = {
+    TYPE_RISCV_CPU_RVSP_REF,
+    };
     mc->desc = "RISC-V Server SoC Reference board";
   mc->init = rvsp_ref_machine_init;
   mc->max_cpus = RVSP_CPUS_MAX;
-    mc->default_cpu_type = TYPE_RISCV_CPU_BASE;
+    mc->default_cpu_type = TYPE_RISCV_CPU_RVSP_REF;
+    mc->valid_cpu_types = valid_cpu_types;


I suggest introducing this patch first, then the new machine type that
will use it as a default
CPU. The reason is to facilitate future bisects. If we introduce the
board first, a future bisect
might hit the previous patch, the board will be run using RV64 instead
of the correct CPU, and
we'll have different results because of it.


Good suggestion.


   mc->pci_allow_0_address = true;
   mc->default_nic = "e1000e";
   mc->possible_cpu_arch_ids = riscv_numa_possible_cpu_arch_ids;
diff --git a/target/riscv/cpu-qom.h b/target/riscv/cpu-qom.h
index 3670cfe6d9..adb934d19e 100644
--- a/target/riscv/cpu-qom.h
+++ b/target/riscv/cpu-qom.h
@@ -49,6 +49,7 @@
   #define TYPE_RISCV_CPU_SIFIVE_U54
RISCV_CPU_TYPE_NAME("sifive-u54")
   #define TYPE_RISCV_CPU_THEAD_C906
RISCV_CPU_TYPE_NAME("thead-c906")
   #define TYPE_RISCV_CPU_VEYRON_V1
RISCV_CPU_TYPE_NAME("veyron-v1")
+#define TYPE_RISCV_CPU_RVSP_REF RISCV_CPU_TYPE_NAME("rvsp-ref")
   #define TYPE_RISCV_CPU_HOST RISCV_CPU_TYPE_NAME("host")
     OBJECT_DECLARE_CPU_TYPE(RISCVCPU, RISCVCPUClass, RISCV_CPU)
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 5ff0192c52..bc91be702b 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -2282,6 +2282,67 @@ static void rva22s64_profile_cpu_init(Object *obj)
     RVA22S64.enabled = true;
   }
+
+static void rv64_rvsp_ref_cpu_init(Object *obj)
+{
+    CPURISCVState *env = _CPU(obj)->env;
+    RISCVCPU *cpu = RISCV_CPU(obj);
+
+    riscv_cpu_set_misa_ext(env, RVG | RVC | RVS | RVU | RVH | RVV);
+
+    /* FIXME: change to 1.13 */
+    env->priv_ver = PRIV_VERSION_1_12_0;
+
+    /* RVA22U64 */
+    cpu->cfg.mmu = true;
+    cpu->cfg.ext_zifencei = true;
+    cpu->cfg.ext_zicsr = true;
+    cpu->cfg.ext_zicntr = true;
+    cpu->cfg.ext_zihpm = true;
+    cpu->cfg.ext_zihintpause = true;
+    cpu->cfg.ext_zba = true;
+    cpu->cfg.ext_zbb = true;
+    cpu->cfg.ext_zbs = true;
+    cpu->cfg.zic64b = true;
+    cpu->cfg.ext_zicbom = true;
+    cpu->cfg.ext_zicbop = true;
+    cpu->cfg.ext_zicboz = true;
+    cpu->cfg.cbom_blocksize = 64;
+    cpu->cfg.cbop_blocksize = 64;
+    cpu->cfg.cboz_blocksize = 64;
+    cpu->cfg.ext_zfhmin = true;
+    cpu->cfg.ext_zkt = true;


You can change this whole block with:

RVA22U64.enabled = true;


riscv_cpu_add_profiles() will check if we have a profile enabled and, if
that's the
case, we'll enable all its extensions in the CPU.

In the near future, when we implement a proper RVA23 support, we'll be
able to just do
a single RVA23S64.enabled = true in this cpu_init(). But for now we can
at least declare
RVA22U64 (perhaps RVA22S64) support for this CPU.



Hi Daniel,

I'm not sure if it's a regression or the usage has been changed. I'm not
able to use '-cpu rva22s64' on latest qemu (db596ae190).


I did a quick git bisect and found that commit d06f28db6 "target/riscv:
move 'mmu' to riscv_cpu_properties[]" disabled mmu by default, so that
an explicit mmu option should be added to qemu command line like '-cpu
rva22s64,mmu=true', I think rva22s64 should enable it by default.



This is fixed in alistair/riscv-to-apply.next by this commit:

commit 5b8d66e6bf7904535ee277e9c70b332c4462f10a
Author: Daniel Henrique Barboza 
Date:   Thu Feb 15 19:39:50 2024 -0300

target/riscv/tcg: set 'mmu' with 'satp' in cpu_set_profile()

Recent changes in options handling removed the 'mmu' default the bare

CPUs had, meaning that we must enable 'mmu' by hand when using the
rva22s64 profile CPU.

Given 

Re: [PATCH v2 1/5] gdbstub: Rename back gdb_handlesig

2024-03-07 Thread Alex Bennée
Gustavo Romero  writes:

> Rename gdb_handlesig_reason back to gdb_handlesig. There is no need to
> add a wrapper for gdb_handlesig and rename it when a new parameter is
> added.
>
> Signed-off-by: Gustavo Romero 

Reviewed-by: Alex Bennée 

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PATCH v1 2/5] hw/ppc: SPI controller model - registers implementation

2024-03-07 Thread Stefan Berger




On 3/7/24 13:54, Stefan Berger wrote:



On 2/7/24 11:08, Chalapathi V wrote:



+#define COUNTER_CONFIG_REG_SHIFT_COUNT_N1   PPC_BITMASK(0 , 7)


No space before the ',' ==> PPC_BITMASK(0, 7)



Re: [PATCH v1 2/5] hw/ppc: SPI controller model - registers implementation

2024-03-07 Thread Stefan Berger




On 2/7/24 11:08, Chalapathi V wrote:

SPI controller device model supports a connection to a single SPI responder.
This provide access to SPI seeproms, TPM, flash device and an ADC controller.

All SPI function control is mapped into the SPI register space to enable full
control by firmware. In this commit SPI configuration component is modelled
which contains all SPI configuration and status registers as well as the hold
registers for data to be sent or having been received.

Signed-off-by: Chalapathi V 
---
  include/hw/ppc/pnv_spi_controller.h |  43 
  include/hw/ppc/pnv_xscom.h  |   3 +
  hw/ppc/pnv_spi_controller.c | 337 
  hw/ppc/meson.build  |   1 +
  4 files changed, 384 insertions(+)
  create mode 100644 include/hw/ppc/pnv_spi_controller.h
  create mode 100644 hw/ppc/pnv_spi_controller.c

diff --git a/include/hw/ppc/pnv_spi_controller.h 
b/include/hw/ppc/pnv_spi_controller.h
new file mode 100644
index 00..8afaabdd1b
--- /dev/null
+++ b/include/hw/ppc/pnv_spi_controller.h
@@ -0,0 +1,43 @@
+/*
+ * QEMU PowerPC SPI Controller model
+ *
+ * Copyright (c) 2024, IBM Corporation.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * This model Supports a connection to a single SPI responder.
+ * Introduced for P10 to provide access to SPI seeproms, TPM, flash device
+ * and an ADC controller.
+ */
+
+#ifndef PPC_PNV_SPI_CONTROLLER_H
+#define PPC_PNV_SPI_CONTROLLER_H
+
+#define TYPE_PNV_SPI_CONTROLLER "pnv-spi-controller"
+#define PNV_SPICONTROLLER(obj) \
+OBJECT_CHECK(PnvSpiController, (obj), TYPE_PNV_SPI_CONTROLLER)
+
+#define SPI_CONTROLLER_REG_SIZE 8
+
+typedef struct SpiBus SpiBus;
+
+typedef struct PnvSpiController {
+DeviceState parent;
+
+SpiBus*spi_bus;
+MemoryRegionxscom_spic_regs;
+/* SPI controller object number */
+uint32_tspic_num;
+
+/* SPI Controller registers */
+uint64_terror_reg;
+uint64_tcounter_config_reg;
+uint64_tconfig_reg1;
+uint64_tclock_config_reset_control;
+uint64_tmemory_mapping_reg;
+uint64_ttransmit_data_reg;
+uint64_treceive_data_reg;
+uint8_t sequencer_operation_reg[SPI_CONTROLLER_REG_SIZE];
+uint64_tstatus_reg;
+} PnvSpiController;
+#endif /* PPC_PNV_SPI_CONTROLLER_H */
diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
index f5becbab41..0575bf0985 100644
--- a/include/hw/ppc/pnv_xscom.h
+++ b/include/hw/ppc/pnv_xscom.h
@@ -176,6 +176,9 @@ struct PnvXScomInterfaceClass {
  #define PNV10_XSCOM_PEC_PCI_BASE   0x8010800 /* index goes upwards ... */
  #define PNV10_XSCOM_PEC_PCI_SIZE   0x200
  
+#define PNV10_XSCOM_PIB_SPIC_BASE 0xc

+#define PNV10_XSCOM_PIB_SPIC_SIZE 0x20
+
  void pnv_xscom_init(PnvChip *chip, uint64_t size, hwaddr addr);
  int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
   uint64_t xscom_base, uint64_t xscom_size,
diff --git a/hw/ppc/pnv_spi_controller.c b/hw/ppc/pnv_spi_controller.c
new file mode 100644
index 00..0f2bc25e82
--- /dev/null
+++ b/hw/ppc/pnv_spi_controller.c
@@ -0,0 +1,337 @@
+/*
+ * QEMU PowerPC SPI Controller model
+ *
+ * Copyright (c) 2024, IBM Corporation.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/qdev-properties.h"
+#include "hw/ppc/pnv.h"
+#include "hw/ppc/pnv_xscom.h"
+#include "hw/ppc/pnv_spi_controller.h"
+#include "hw/ppc/pnv_spi_responder.h"
+#include "hw/ppc/fdt.h"
+#include 
+#include 
+
+#define SPI_DEBUG(x)
+
+/* Error Register */
+#define ERROR_REG   0x00
+
+/* counter_config_reg */
+#define COUNTER_CONFIG_REG  0x01
+#define COUNTER_CONFIG_REG_SHIFT_COUNT_N1   PPC_BITMASK(0 , 7)
+#define COUNTER_CONFIG_REG_SHIFT_COUNT_N2   PPC_BITMASK(8 , 15)
+#define COUNTER_CONFIG_REG_COUNT_COMPARE1   PPC_BITMASK(24 , 31)
+#define COUNTER_CONFIG_REG_COUNT_COMPARE2   PPC_BITMASK(32 , 39)
+#define COUNTER_CONFIG_REG_N1_COUNT_CONTROL PPC_BITMASK(48 , 51)
+#define COUNTER_CONFIG_REG_N2_COUNT_CONTROL PPC_BITMASK(52 , 55)
+
+/* config_reg */
+#define CONFIG_REG1 0x02
+
+/* clock_config_reset_control_ecc_enable_reg */
+#define CLOCK_CONFIG_REG0x03
+#define CLOCK_CONFIG_RESET_CONTROL_HARD_RESET   0x0084;
+#define CLOCK_CONFIG_REG_RESET_CONTROL  PPC_BITMASK(24 , 27)
+#define CLOCK_CONFIG_REG_ECC_CONTROLPPC_BITMASK(28 , 30)
+
+/* memory_mapping_reg */
+#define MEMORY_MAPPING_REG  0x04
+#define MEMORY_MAPPING_REG_MMSPISM_BASE_ADDRPPC_BITMASK(0 , 15)
+#define MEMORY_MAPPING_REG_MMSPISM_ADDR_MASKPPC_BITMASK(16 , 31)
+#define MEMORY_MAPPING_REG_RDR_MATCH_VALPPC_BITMASK(32 , 47)
+#define MEMORY_MAPPING_REG_RDR_MATCH_MASK   PPC_BITMASK(48 , 63)
+
+/* transmit_data_reg */
+#define 

Re: [PATCH 3/5] target/sparc/cpu: Improve the CPU help text

2024-03-07 Thread Philippe Mathieu-Daudé

On 7/3/24 18:43, Thomas Huth wrote:

Remove the unnecessary "Sparc" at the beginning of the line and
put the chip information into parentheses so that it is clearer
which part of the line have to be passed to "-cpu" to specify a
different CPU.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2141
Signed-off-by: Thomas Huth 
---
  target/sparc/cpu.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 





[PATCH] pci: Add option to disable device level INTx masking

2024-03-07 Thread Alex Williamson
The PCI 2.3 spec added definitions of the INTx disable and status bits,
in the command and status registers respectively.  The command register
bit, commonly known as DisINTx in lspci, controls whether the device
can assert the INTx signal.

Operating systems will often write to this bit to test whether a device
supports this style of legacy interrupt masking.  When using device
assignment, such as with vfio-pci, the result of this test dictates
whether the device can use a shared or exclusive interrupt (ie. generic
INTx masking at the device via DisINTx or IRQ controller level INTx
masking).

Add an experimental option to the base set of properties for PCI
devices which allows the DisINTx bit to be excluded from wmask, making
it read-only to the guest for testing purposes related to INTx masking.

Signed-off-by: Alex Williamson 
---
 hw/pci/pci.c | 14 ++
 include/hw/pci/pci.h |  2 ++
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 6496d027ca61..8c78326ad67f 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -85,6 +85,8 @@ static Property pci_props[] = {
 QEMU_PCIE_ERR_UNC_MASK_BITNR, true),
 DEFINE_PROP_BIT("x-pcie-ari-nextfn-1", PCIDevice, cap_present,
 QEMU_PCIE_ARI_NEXTFN_1_BITNR, false),
+DEFINE_PROP_BIT("x-pci-disintx", PCIDevice, cap_present,
+QEMU_PCI_DISINTX_BITNR, true),
 DEFINE_PROP_END_OF_LIST()
 };
 
@@ -861,13 +863,17 @@ static void pci_init_cmask(PCIDevice *dev)
 static void pci_init_wmask(PCIDevice *dev)
 {
 int config_size = pci_config_size(dev);
+uint16_t cmd_wmask = PCI_COMMAND_IO | PCI_COMMAND_MEMORY |
+ PCI_COMMAND_MASTER | PCI_COMMAND_SERR;
 
 dev->wmask[PCI_CACHE_LINE_SIZE] = 0xff;
 dev->wmask[PCI_INTERRUPT_LINE] = 0xff;
-pci_set_word(dev->wmask + PCI_COMMAND,
- PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER |
- PCI_COMMAND_INTX_DISABLE);
-pci_word_test_and_set_mask(dev->wmask + PCI_COMMAND, PCI_COMMAND_SERR);
+
+if (dev->cap_present & QEMU_PCI_DISINTX) {
+cmd_wmask |= PCI_COMMAND_INTX_DISABLE;
+}
+
+pci_set_word(dev->wmask + PCI_COMMAND, cmd_wmask);
 
 memset(dev->wmask + PCI_CONFIG_HEADER_SIZE, 0xff,
config_size - PCI_CONFIG_HEADER_SIZE);
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index eaa3fc99d884..45f0fac435cc 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -212,6 +212,8 @@ enum {
 QEMU_PCIE_ERR_UNC_MASK = (1 << QEMU_PCIE_ERR_UNC_MASK_BITNR),
 #define QEMU_PCIE_ARI_NEXTFN_1_BITNR 12
 QEMU_PCIE_ARI_NEXTFN_1 = (1 << QEMU_PCIE_ARI_NEXTFN_1_BITNR),
+#define QEMU_PCI_DISINTX_BITNR 13
+QEMU_PCI_DISINTX = (1 << QEMU_PCI_DISINTX_BITNR),
 };
 
 typedef struct PCIINTxRoute {
-- 
2.44.0




[PATCH v2] hmp: Add option to info qtree to omit details

2024-03-07 Thread BALATON Zoltan
The output of info qtree monitor command is very long. Add an option
to print a brief overview omitting all the details.

Signed-off-by: BALATON Zoltan 
Reviewed-by: Dr. David Alan Gilbert 
---
v2:
- Change the variable name to deails too
- Add braces to if (checkpatch did not warn for this so just noticed)

 hmp-commands-info.hx  |  6 +++---
 system/qdev-monitor.c | 27 +++
 2 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index da120f82a3..ad1b1306e3 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -540,9 +540,9 @@ ERST
 
 {
 .name   = "qtree",
-.args_type  = "",
-.params = "",
-.help   = "show device tree",
+.args_type  = "brief:-b",
+.params = "[-b]",
+.help   = "show device tree (-b: brief, omit properties)",
 .cmd= hmp_info_qtree,
 },
 
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index a13db763e5..ad91e74181 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -744,7 +744,6 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error **errp)
 }
 
 #define qdev_printf(fmt, ...) monitor_printf(mon, "%*s" fmt, indent, "", ## 
__VA_ARGS__)
-static void qbus_print(Monitor *mon, BusState *bus, int indent);
 
 static void qdev_print_props(Monitor *mon, DeviceState *dev, Property *props,
  int indent)
@@ -784,13 +783,9 @@ static void bus_print_dev(BusState *bus, Monitor *mon, 
DeviceState *dev, int ind
 static void qdev_print(Monitor *mon, DeviceState *dev, int indent)
 {
 ObjectClass *class;
-BusState *child;
 NamedGPIOList *ngl;
 NamedClockList *ncl;
 
-qdev_printf("dev: %s, id \"%s\"\n", object_get_typename(OBJECT(dev)),
-dev->id ? dev->id : "");
-indent += 2;
 QLIST_FOREACH(ngl, >gpios, node) {
 if (ngl->num_in) {
 qdev_printf("gpio-in \"%s\" %d\n", ngl->name ? ngl->name : "",
@@ -814,12 +809,9 @@ static void qdev_print(Monitor *mon, DeviceState *dev, int 
indent)
 class = object_class_get_parent(class);
 } while (class != object_class_by_name(TYPE_DEVICE));
 bus_print_dev(dev->parent_bus, mon, dev, indent);
-QLIST_FOREACH(child, >child_bus, sibling) {
-qbus_print(mon, child, indent);
-}
 }
 
-static void qbus_print(Monitor *mon, BusState *bus, int indent)
+static void qbus_print(Monitor *mon, BusState *bus, int indent, bool details)
 {
 BusChild *kid;
 
@@ -827,16 +819,27 @@ static void qbus_print(Monitor *mon, BusState *bus, int 
indent)
 indent += 2;
 qdev_printf("type %s\n", object_get_typename(OBJECT(bus)));
 QTAILQ_FOREACH(kid, >children, sibling) {
+BusState *child_bus;
 DeviceState *dev = kid->child;
-qdev_print(mon, dev, indent);
+qdev_printf("dev: %s, id \"%s\"\n", object_get_typename(OBJECT(dev)),
+dev->id ? dev->id : "");
+if (details) {
+qdev_print(mon, dev, indent + 2);
+}
+QLIST_FOREACH(child_bus, >child_bus, sibling) {
+qbus_print(mon, child_bus, indent + 2, details);
+}
 }
 }
 #undef qdev_printf
 
 void hmp_info_qtree(Monitor *mon, const QDict *qdict)
 {
-if (sysbus_get_default())
-qbus_print(mon, sysbus_get_default(), 0);
+bool details = !qdict_get_try_bool(qdict, "brief", false);
+
+if (sysbus_get_default()) {
+qbus_print(mon, sysbus_get_default(), 0, details);
+}
 }
 
 void hmp_info_qdm(Monitor *mon, const QDict *qdict)
-- 
2.30.9




[PATCH v2 2/5] linux-user: Move tswap_siginfo out of target code

2024-03-07 Thread Gustavo Romero
Move tswap_siginfo from target code to handle_pending_signal. This will
allow some cleanups and having the siginfo ready to be used in gdbstub.

Signed-off-by: Gustavo Romero 
Suggested-by: Richard Henderson 
---
 linux-user/aarch64/signal.c |  2 +-
 linux-user/alpha/signal.c   |  2 +-
 linux-user/arm/signal.c |  2 +-
 linux-user/hexagon/signal.c |  2 +-
 linux-user/hppa/signal.c|  2 +-
 linux-user/i386/signal.c|  6 +++---
 linux-user/loongarch64/signal.c |  2 +-
 linux-user/m68k/signal.c|  4 ++--
 linux-user/microblaze/signal.c  |  2 +-
 linux-user/mips/signal.c|  4 ++--
 linux-user/nios2/signal.c   |  2 +-
 linux-user/openrisc/signal.c|  2 +-
 linux-user/ppc/signal.c |  4 ++--
 linux-user/riscv/signal.c   |  2 +-
 linux-user/s390x/signal.c   |  2 +-
 linux-user/sh4/signal.c |  2 +-
 linux-user/signal-common.h  |  2 --
 linux-user/signal.c | 10 --
 linux-user/sparc/signal.c   |  2 +-
 linux-user/xtensa/signal.c  |  2 +-
 20 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/linux-user/aarch64/signal.c b/linux-user/aarch64/signal.c
index a1e22d526d..bc7a13800d 100644
--- a/linux-user/aarch64/signal.c
+++ b/linux-user/aarch64/signal.c
@@ -670,7 +670,7 @@ static void target_setup_frame(int usig, struct 
target_sigaction *ka,
 aarch64_set_svcr(env, 0, R_SVCR_SM_MASK | R_SVCR_ZA_MASK);
 
 if (info) {
-tswap_siginfo(>info, info);
+frame->info = *info;
 env->xregs[1] = frame_addr + offsetof(struct target_rt_sigframe, info);
 env->xregs[2] = frame_addr + offsetof(struct target_rt_sigframe, uc);
 }
diff --git a/linux-user/alpha/signal.c b/linux-user/alpha/signal.c
index 4ec42994d4..896c2c148a 100644
--- a/linux-user/alpha/signal.c
+++ b/linux-user/alpha/signal.c
@@ -173,7 +173,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 goto give_sigsegv;
 }
 
-tswap_siginfo(>info, info);
+frame->info = *info;
 
 __put_user(0, >uc.tuc_flags);
 __put_user(0, >uc.tuc_link);
diff --git a/linux-user/arm/signal.c b/linux-user/arm/signal.c
index f77f692c63..420fc04cfa 100644
--- a/linux-user/arm/signal.c
+++ b/linux-user/arm/signal.c
@@ -357,7 +357,7 @@ void setup_rt_frame(int usig, struct target_sigaction *ka,
 
 info_addr = frame_addr + offsetof(struct rt_sigframe, info);
 uc_addr = frame_addr + offsetof(struct rt_sigframe, sig.uc);
-tswap_siginfo(>info, info);
+frame->info = *info;
 
 setup_sigframe(>sig.uc, set, env);
 
diff --git a/linux-user/hexagon/signal.c b/linux-user/hexagon/signal.c
index 60fa7e1bce..492b51f155 100644
--- a/linux-user/hexagon/signal.c
+++ b/linux-user/hexagon/signal.c
@@ -162,7 +162,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 }
 
 setup_ucontext(>uc, env, set);
-tswap_siginfo(>info, info);
+frame->info = *info;
 /*
  * The on-stack signal trampoline is no longer executed;
  * however, the libgcc signal frame unwinding code checks
diff --git a/linux-user/hppa/signal.c b/linux-user/hppa/signal.c
index d08a97dae6..8960175da3 100644
--- a/linux-user/hppa/signal.c
+++ b/linux-user/hppa/signal.c
@@ -127,7 +127,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 goto give_sigsegv;
 }
 
-tswap_siginfo(>info, info);
+frame->info = *info;
 frame->uc.tuc_flags = 0;
 frame->uc.tuc_link = 0;
 
diff --git a/linux-user/i386/signal.c b/linux-user/i386/signal.c
index bc5d45302e..cfe70fc5cf 100644
--- a/linux-user/i386/signal.c
+++ b/linux-user/i386/signal.c
@@ -430,7 +430,7 @@ void setup_frame(int sig, struct target_sigaction *ka,
 setup_sigcontext(>sc, >fpstate, env, set->sig[0],
 frame_addr + offsetof(struct sigframe, fpstate));
 
-for(i = 1; i < TARGET_NSIG_WORDS; i++) {
+for (i = 1; i < TARGET_NSIG_WORDS; i++) {
 __put_user(set->sig[i], >extramask[i - 1]);
 }
 
@@ -490,7 +490,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 __put_user(addr, >puc);
 #endif
 if (ka->sa_flags & TARGET_SA_SIGINFO) {
-tswap_siginfo(>info, info);
+frame->info = *info;
 }
 
 /* Create the ucontext.  */
@@ -504,7 +504,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 setup_sigcontext(>uc.tuc_mcontext, >fpstate, env,
 set->sig[0], frame_addr + offsetof(struct rt_sigframe, fpstate));
 
-for(i = 0; i < TARGET_NSIG_WORDS; i++) {
+for (i = 0; i < TARGET_NSIG_WORDS; i++) {
 __put_user(set->sig[i], >uc.tuc_sigmask.sig[i]);
 }
 
diff --git a/linux-user/loongarch64/signal.c b/linux-user/loongarch64/signal.c
index 39ea82c814..1a322f9697 100644
--- a/linux-user/loongarch64/signal.c
+++ b/linux-user/loongarch64/signal.c
@@ -376,7 +376,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 extctx.end.haddr = (void *)frame + (extctx.end.gaddr - frame_addr);
 }
 
-

[PATCH v2 5/5] tests/tcg: Add multiarch test for Xfer:siginfo:read stub

2024-03-07 Thread Gustavo Romero
Add multiarch test for testing if Xfer:siginfo:read query is properly
handled by gdbstub.

Signed-off-by: Gustavo Romero 
---
 tests/tcg/multiarch/Makefile.target   | 10 ++-
 .../gdbstub/test-qxfer-siginfo-read.py| 26 +++
 tests/tcg/multiarch/segfault.c| 14 ++
 3 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/multiarch/gdbstub/test-qxfer-siginfo-read.py
 create mode 100644 tests/tcg/multiarch/segfault.c

diff --git a/tests/tcg/multiarch/Makefile.target 
b/tests/tcg/multiarch/Makefile.target
index f11f3b084d..5ab4ba89b2 100644
--- a/tests/tcg/multiarch/Makefile.target
+++ b/tests/tcg/multiarch/Makefile.target
@@ -71,6 +71,13 @@ run-gdbstub-qxfer-auxv-read: sha1
--bin $< --test 
$(MULTIARCH_SRC)/gdbstub/test-qxfer-auxv-read.py, \
basic gdbstub qXfer:auxv:read support)
 
+run-gdbstub-qxfer-siginfo-read: segfault
+   $(call run-test, $@, $(GDB_SCRIPT) \
+   --gdb $(GDB) \
+   --qemu $(QEMU) --qargs "$(QEMU_OPTS)" \
+   --bin "$< -s" --test 
$(MULTIARCH_SRC)/gdbstub/test-qxfer-siginfo-read.py, \
+   basic gdbstub qXfer:siginfo:read support)
+
 run-gdbstub-proc-mappings: sha1
$(call run-test, $@, $(GDB_SCRIPT) \
--gdb $(GDB) \
@@ -113,7 +120,8 @@ endif
 EXTRA_RUNS += run-gdbstub-sha1 run-gdbstub-qxfer-auxv-read \
  run-gdbstub-proc-mappings run-gdbstub-thread-breakpoint \
  run-gdbstub-registers run-gdbstub-prot-none \
- run-gdbstub-catch-syscalls
+ run-gdbstub-catch-syscalls \
+ run-gdbstub-qxfer-siginfo-read
 
 # ARM Compatible Semi Hosting Tests
 #
diff --git a/tests/tcg/multiarch/gdbstub/test-qxfer-siginfo-read.py 
b/tests/tcg/multiarch/gdbstub/test-qxfer-siginfo-read.py
new file mode 100644
index 00..862596b07a
--- /dev/null
+++ b/tests/tcg/multiarch/gdbstub/test-qxfer-siginfo-read.py
@@ -0,0 +1,26 @@
+from __future__ import print_function
+#
+# Test gdbstub Xfer:siginfo:read stub.
+#
+# The test runs a binary that causes a SIGSEGV and then looks for additional
+# info about the signal through printing GDB's '$_siginfo' special variable,
+# which sends a Xfer:siginfo:read query to the gdbstub.
+#
+# The binary causes a SIGSEGV at dereferencing a pointer with value 0xdeadbeef,
+# so the test looks for and checks if this address is correctly reported by the
+# gdbstub.
+#
+# This is launched via tests/guest-debug/run-test.py
+#
+
+import gdb
+from test_gdbstub import main, report
+
+def run_test():
+"Run through the test"
+
+gdb.execute("continue", False, True)
+resp = gdb.execute("print/x $_siginfo", False, True)
+report(resp.find("si_addr = 0xdeadbeef"), "Found fault address.")
+
+main(run_test)
diff --git a/tests/tcg/multiarch/segfault.c b/tests/tcg/multiarch/segfault.c
new file mode 100644
index 00..e6c8ff31ca
--- /dev/null
+++ b/tests/tcg/multiarch/segfault.c
@@ -0,0 +1,14 @@
+#include 
+#include 
+
+/* Cause a segfault for testing purposes. */
+
+int main(int argc, char *argv[])
+{
+int *ptr = (void *)0xdeadbeef;
+
+if (argc == 2 && strcmp(argv[1], "-s") == 0) {
+/* Cause segfault. */
+printf("%d\n", *ptr);
+}
+}
-- 
2.34.1




[PATCH v2 4/5] gdbstub: Add Xfer:siginfo:read stub

2024-03-07 Thread Gustavo Romero
Add stub to handle Xfer:siginfo:read packet query that requests the
machine's siginfo data.

This is used when GDB user executes 'print $_siginfo' and when the
machine stops due to a signal, for instance, on SIGSEGV. The information
in siginfo allows GDB to determiner further details on the signal, like
the fault address/insn when the SIGSEGV is caught.

Signed-off-by: Gustavo Romero 
---
 gdbstub/gdbstub.c |  8 
 gdbstub/internals.h   |  1 +
 gdbstub/user-target.c | 23 +++
 3 files changed, 32 insertions(+)

diff --git a/gdbstub/gdbstub.c b/gdbstub/gdbstub.c
index 2909bc8c69..ab38cea46b 100644
--- a/gdbstub/gdbstub.c
+++ b/gdbstub/gdbstub.c
@@ -1651,6 +1651,8 @@ static void handle_query_supported(GArray *params, void 
*user_ctx)
 g_string_append(gdbserver_state.str_buf, ";qXfer:auxv:read+");
 }
 g_string_append(gdbserver_state.str_buf, ";QCatchSyscalls+");
+
+g_string_append(gdbserver_state.str_buf, ";qXfer:siginfo:read+");
 #endif
 g_string_append(gdbserver_state.str_buf, ";qXfer:exec-file:read+");
 #endif
@@ -1799,6 +1801,12 @@ static const GdbCmdParseEntry gdb_gen_query_table[] = {
 .cmd_startswith = 1,
 .schema = "l,l0"
 },
+{
+.handler = gdb_handle_query_xfer_siginfo,
+.cmd = "Xfer:siginfo:read::",
+.cmd_startswith = 1,
+.schema = "l,l0"
+ },
 #endif
 {
 .handler = gdb_handle_query_xfer_exec_file,
diff --git a/gdbstub/internals.h b/gdbstub/internals.h
index a7cc69dab3..15c01c525a 100644
--- a/gdbstub/internals.h
+++ b/gdbstub/internals.h
@@ -193,6 +193,7 @@ typedef union GdbCmdVariant {
 void gdb_handle_query_rcmd(GArray *params, void *user_ctx); /* softmmu */
 void gdb_handle_query_offsets(GArray *params, void *user_ctx); /* user */
 void gdb_handle_query_xfer_auxv(GArray *params, void *user_ctx); /*user */
+void gdb_handle_query_xfer_siginfo(GArray *params, void *user_ctx); /*user */
 void gdb_handle_v_file_open(GArray *params, void *user_ctx); /* user */
 void gdb_handle_v_file_close(GArray *params, void *user_ctx); /* user */
 void gdb_handle_v_file_pread(GArray *params, void *user_ctx); /* user */
diff --git a/gdbstub/user-target.c b/gdbstub/user-target.c
index 215bf33ab3..93739852b0 100644
--- a/gdbstub/user-target.c
+++ b/gdbstub/user-target.c
@@ -285,6 +285,29 @@ void gdb_handle_query_xfer_auxv(GArray *params, void 
*user_ctx)
 gdb_put_packet_binary(gdbserver_state.str_buf->str,
   gdbserver_state.str_buf->len, true);
 }
+
+void gdb_handle_query_xfer_siginfo(GArray *params, void *user_ctx)
+{
+unsigned long offset, len;
+uint8_t *siginfo_offset;
+
+offset = get_param(params, 0)->val_ul;
+len = get_param(params, 1)->val_ul;
+
+if (offset + len > sizeof(target_siginfo_t)) {
+/* Invalid offset and/or requested length. */
+gdb_put_packet("E01");
+return;
+}
+
+siginfo_offset = (uint8_t *)gdbserver_state.siginfo + offset;
+
+/* Reply */
+g_string_assign(gdbserver_state.str_buf, "l");
+gdb_memtox(gdbserver_state.str_buf, (const char *)siginfo_offset, len);
+gdb_put_packet_binary(gdbserver_state.str_buf->str,
+  gdbserver_state.str_buf->len, true);
+}
 #endif
 
 static const char *get_filename_param(GArray *params, int i)
-- 
2.34.1




[PATCH v2 3/5] gdbstub: Save target's siginfo

2024-03-07 Thread Gustavo Romero
Save target's siginfo into gdbserver_state so it can be used later, for
example, in any stub that requires the target's si_signo and si_code.

This change affects only linux-user mode.

Signed-off-by: Gustavo Romero 
Suggested-by: Richard Henderson 
---
 gdbstub/internals.h|  3 +++
 gdbstub/user-target.c  |  3 ++-
 gdbstub/user.c | 14 ++
 include/gdbstub/user.h |  6 +-
 linux-user/main.c  |  2 +-
 linux-user/signal.c|  5 -
 6 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/gdbstub/internals.h b/gdbstub/internals.h
index 56b7c13b75..a7cc69dab3 100644
--- a/gdbstub/internals.h
+++ b/gdbstub/internals.h
@@ -58,6 +58,9 @@ typedef struct GDBState {
 int line_csum; /* checksum at the end of the packet */
 GByteArray *last_packet;
 int signal;
+#ifdef CONFIG_USER_ONLY
+uint8_t siginfo[MAX_SIGINFO_LENGTH];
+#endif
 bool multiprocess;
 GDBProcess *processes;
 int process_num;
diff --git a/gdbstub/user-target.c b/gdbstub/user-target.c
index b7d4c37cd8..215bf33ab3 100644
--- a/gdbstub/user-target.c
+++ b/gdbstub/user-target.c
@@ -10,11 +10,12 @@
 #include "qemu/osdep.h"
 #include "exec/gdbstub.h"
 #include "qemu.h"
-#include "internals.h"
 #ifdef CONFIG_LINUX
 #include "linux-user/loader.h"
 #include "linux-user/qemu.h"
+#include "gdbstub/user.h"
 #endif
+#include "internals.h"
 
 /*
  * Map target signal numbers to GDB protocol signal numbers and vice
diff --git a/gdbstub/user.c b/gdbstub/user.c
index a157e67f95..777fa78ef4 100644
--- a/gdbstub/user.c
+++ b/gdbstub/user.c
@@ -131,7 +131,8 @@ void gdb_qemu_exit(int code)
 exit(code);
 }
 
-int gdb_handlesig(CPUState *cpu, int sig, const char *reason)
+int gdb_handlesig(CPUState *cpu, int sig, const char *reason, void *siginfo,
+  int siginfo_len)
 {
 char buf[256];
 int n;
@@ -140,6 +141,11 @@ int gdb_handlesig(CPUState *cpu, int sig, const char 
*reason)
 return sig;
 }
 
+if (siginfo) {
+/* Save target-specific siginfo. */
+memcpy(gdbserver_state.siginfo, siginfo, siginfo_len);
+}
+
 /* disable single step if it was enabled */
 cpu_single_step(cpu, 0);
 tb_flush(cpu);
@@ -510,7 +516,7 @@ void gdb_breakpoint_remove_all(CPUState *cs)
 void gdb_syscall_handling(const char *syscall_packet)
 {
 gdb_put_packet(syscall_packet);
-gdb_handlesig(gdbserver_state.c_cpu, 0, NULL);
+gdb_handlesig(gdbserver_state.c_cpu, 0, NULL, NULL, 0);
 }
 
 static bool should_catch_syscall(int num)
@@ -528,7 +534,7 @@ void gdb_syscall_entry(CPUState *cs, int num)
 {
 if (should_catch_syscall(num)) {
 g_autofree char *reason = g_strdup_printf("syscall_entry:%x;", num);
-gdb_handlesig(cs, gdb_target_sigtrap(), reason);
+gdb_handlesig(cs, gdb_target_sigtrap(), reason, NULL, 0);
 }
 }
 
@@ -536,7 +542,7 @@ void gdb_syscall_return(CPUState *cs, int num)
 {
 if (should_catch_syscall(num)) {
 g_autofree char *reason = g_strdup_printf("syscall_return:%x;", num);
-gdb_handlesig(cs, gdb_target_sigtrap(), reason);
+gdb_handlesig(cs, gdb_target_sigtrap(), reason, NULL, 0);
 }
 }
 
diff --git a/include/gdbstub/user.h b/include/gdbstub/user.h
index 6647af2123..0ec9a7e596 100644
--- a/include/gdbstub/user.h
+++ b/include/gdbstub/user.h
@@ -9,11 +9,15 @@
 #ifndef GDBSTUB_USER_H
 #define GDBSTUB_USER_H
 
+#define MAX_SIGINFO_LENGTH 128
+
 /**
  * gdb_handlesig() - yield control to gdb
  * @cpu: CPU
  * @sig: if non-zero, the signal number which caused us to stop
  * @reason: stop reason for stop reply packet or NULL
+ * @siginfo: target-specific siginfo struct
+ * @siginfo_len: target-specific siginfo struct length
  *
  * This function yields control to gdb, when a user-mode-only target
  * needs to stop execution. If @sig is non-zero, then we will send a
@@ -25,7 +29,7 @@
  * or 0 if no signal should be delivered, ie the signal that caused
  * us to stop should be ignored.
  */
-int gdb_handlesig(CPUState *, int, const char *);
+int gdb_handlesig(CPUState *, int, const char *, void *, int);
 
 /**
  * gdb_signalled() - inform remote gdb of sig exit
diff --git a/linux-user/main.c b/linux-user/main.c
index 049fd85a2a..3187be48d6 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -1014,7 +1014,7 @@ int main(int argc, char **argv, char **envp)
 gdbstub);
 exit(EXIT_FAILURE);
 }
-gdb_handlesig(cpu, 0, NULL);
+gdb_handlesig(cpu, 0, NULL, NULL, 0);
 }
 
 #ifdef CONFIG_SEMIHOSTING
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 7a4c8e416e..98d1eacffe 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -34,6 +34,9 @@
 #include "user/safe-syscall.h"
 #include "tcg/tcg.h"
 
+/* target_siginfo_t must fit in gdbstub's siginfo save area. */
+QEMU_BUILD_BUG_ON(sizeof(target_siginfo_t) > MAX_SIGINFO_LENGTH);
+
 static struct target_sigaction sigact_table[TARGET_NSIG];
 
 static void 

[PATCH v2 1/5] gdbstub: Rename back gdb_handlesig

2024-03-07 Thread Gustavo Romero
Rename gdb_handlesig_reason back to gdb_handlesig. There is no need to
add a wrapper for gdb_handlesig and rename it when a new parameter is
added.

Signed-off-by: Gustavo Romero 
---
 gdbstub/user.c |  8 
 include/gdbstub/user.h | 15 ++-
 linux-user/main.c  |  2 +-
 linux-user/signal.c|  2 +-
 4 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/gdbstub/user.c b/gdbstub/user.c
index 14918d1a21..a157e67f95 100644
--- a/gdbstub/user.c
+++ b/gdbstub/user.c
@@ -131,7 +131,7 @@ void gdb_qemu_exit(int code)
 exit(code);
 }
 
-int gdb_handlesig_reason(CPUState *cpu, int sig, const char *reason)
+int gdb_handlesig(CPUState *cpu, int sig, const char *reason)
 {
 char buf[256];
 int n;
@@ -510,7 +510,7 @@ void gdb_breakpoint_remove_all(CPUState *cs)
 void gdb_syscall_handling(const char *syscall_packet)
 {
 gdb_put_packet(syscall_packet);
-gdb_handlesig(gdbserver_state.c_cpu, 0);
+gdb_handlesig(gdbserver_state.c_cpu, 0, NULL);
 }
 
 static bool should_catch_syscall(int num)
@@ -528,7 +528,7 @@ void gdb_syscall_entry(CPUState *cs, int num)
 {
 if (should_catch_syscall(num)) {
 g_autofree char *reason = g_strdup_printf("syscall_entry:%x;", num);
-gdb_handlesig_reason(cs, gdb_target_sigtrap(), reason);
+gdb_handlesig(cs, gdb_target_sigtrap(), reason);
 }
 }
 
@@ -536,7 +536,7 @@ void gdb_syscall_return(CPUState *cs, int num)
 {
 if (should_catch_syscall(num)) {
 g_autofree char *reason = g_strdup_printf("syscall_return:%x;", num);
-gdb_handlesig_reason(cs, gdb_target_sigtrap(), reason);
+gdb_handlesig(cs, gdb_target_sigtrap(), reason);
 }
 }
 
diff --git a/include/gdbstub/user.h b/include/gdbstub/user.h
index 68b6534130..6647af2123 100644
--- a/include/gdbstub/user.h
+++ b/include/gdbstub/user.h
@@ -10,7 +10,7 @@
 #define GDBSTUB_USER_H
 
 /**
- * gdb_handlesig_reason() - yield control to gdb
+ * gdb_handlesig() - yield control to gdb
  * @cpu: CPU
  * @sig: if non-zero, the signal number which caused us to stop
  * @reason: stop reason for stop reply packet or NULL
@@ -25,18 +25,7 @@
  * or 0 if no signal should be delivered, ie the signal that caused
  * us to stop should be ignored.
  */
-int gdb_handlesig_reason(CPUState *, int, const char *);
-
-/**
- * gdb_handlesig() - yield control to gdb
- * @cpu CPU
- * @sig: if non-zero, the signal number which caused us to stop
- * @see gdb_handlesig_reason()
- */
-static inline int gdb_handlesig(CPUState *cpu, int sig)
-{
-return gdb_handlesig_reason(cpu, sig, NULL);
-}
+int gdb_handlesig(CPUState *, int, const char *);
 
 /**
  * gdb_signalled() - inform remote gdb of sig exit
diff --git a/linux-user/main.c b/linux-user/main.c
index 551acf1661..049fd85a2a 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -1014,7 +1014,7 @@ int main(int argc, char **argv, char **envp)
 gdbstub);
 exit(EXIT_FAILURE);
 }
-gdb_handlesig(cpu, 0);
+gdb_handlesig(cpu, 0, NULL);
 }
 
 #ifdef CONFIG_SEMIHOSTING
diff --git a/linux-user/signal.c b/linux-user/signal.c
index d3e62ab030..a57c45de35 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -1180,7 +1180,7 @@ static void handle_pending_signal(CPUArchState *cpu_env, 
int sig,
 /* dequeue signal */
 k->pending = 0;
 
-sig = gdb_handlesig(cpu, sig);
+sig = gdb_handlesig(cpu, sig, NULL);
 if (!sig) {
 sa = NULL;
 handler = TARGET_SIG_IGN;
-- 
2.34.1




[PATCH 4/5] include/exec: annotate all the MemoryRegion fields

2024-03-07 Thread Alex Bennée
Signed-off-by: Alex Bennée 
---
 include/exec/memory.h | 47 +++
 1 file changed, 43 insertions(+), 4 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 17b741bc4f5..312ed564dbe 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -778,9 +778,48 @@ bool memory_get_xlat_addr(IOMMUTLBEntry *iotlb, void 
**vaddr,
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
 typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
 
-/** MemoryRegion:
- *
- * A struct representing a memory region.
+/**
+ * struct MemoryRegion - represents a memory region
+ * @parent_obj: parent QOM object for the region
+ * @romd_mode: if true ROM devices accessed directly rather than with @ops
+ * @ram: true if a RAM-type device with a @ram_block
+ * @subpage: true if region covers a subpage
+ * @readonly: true for RAM-type devices that are readonly
+ * @nonvolatile: true for nonvolatile RAM-type devices (e.g. NVDIMM)
+ * @rom_device: true for a ROM device, see also @romd_mode
+ * @flush_coalesced_mmio: true to flush any queued coalesced MMIO
+ * operations before access
+ * @unmergeable: this section should not get merged with adjacent
+ * sections
+ * @dirty_log_mask: dirty events region cares about (see DIRTY_ flags)
+ * @is_iommu: true if part of an IOMMU device
+ * @ram_block: backing @RamBlock if @ram is true
+ * @owner: base QOM object that owns this region
+ * @dev: base Device that owns this region
+ * @ops: access operations for MMIO or @romd_mode devices
+ * @opaque: @dev specific data, passed with @ops
+ * @container: parent `MemoryRegion`
+ * @mapped_via_alias: number of times mapped via @alias, container
+ * might be NULL
+ * @size: size of @MemoryRegion
+ * @addr: physical hwaddr of @MemoryRegion
+ * @destructor: cleanup function when @MemoryRegion finalized
+ * @align: alignment requirements for any physical backing store
+ * @terminates: true if this @MemoryRegion is a leaf node
+ * @ram_device: true if @ram device should use @ops to access
+ * @enabled: true once initialised, false once finalized
+ * @vga_logging_count: count of memory logging clients
+ * @alias: link to aliased @MemoryRegion
+ * @alias_offset: offset into aliased region
+ * @priority: priority when resolving overlapping regions
+ * @subregions: list of subregions in this region
+ * @subregions_link: next subregion in the chain
+ * @coalesced: list of coalesced memory ranges
+ * @name: name of memory region
+ * @ioeventfd_nb: count of @ioeventfds for region
+ * @ioeventfds: ioevent notifiers for region
+ * @rdm: if exists see #RamDiscardManager
+ * @disable_reentrancy_guard: if true don't error if device accesses itself
  */
 struct MemoryRegion {
 Object parent_obj;
@@ -806,7 +845,7 @@ struct MemoryRegion {
 const MemoryRegionOps *ops;
 void *opaque;
 MemoryRegion *container;
-int mapped_via_alias; /* Mapped via an alias, container might be NULL */
+int mapped_via_alias;
 Int128 size;
 hwaddr addr;
 void (*destructor)(MemoryRegion *mr);
-- 
2.39.2




[PATCH 1/5] scripts/kernel-doc: teach kdoc about QLIST_ macros

2024-03-07 Thread Alex Bennée
The kernel-doc script does some pre-processing on structure
definitions before parsing for names. Teach it about QLIST and replace
with simplified structures representing the base type.

Signed-off-by: Alex Bennée 
---
 scripts/kernel-doc | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 240923d509a..26c47562e79 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1226,7 +1226,14 @@ sub dump_struct($$) {
# replace DECLARE_KFIFO_PTR
$members =~ s/DECLARE_KFIFO_PTR\s*\(([^,)]+),\s*([^,)]+)\)/$2 \*$1/gos;
 
-   my $declaration = $members;
+# QEMU Specific Macros
+
+# replace QLIST_ENTRY with base type and variable name
+$members =~ s/QLIST_ENTRY\(([^)]+)\)\s+([^;]+)/$1 \*$2/gos;
+# replace QLIST_HEAD, optionally capturing an anonymous struct marker, 
and capture type and variable name
+$members =~ s/QLIST_HEAD\(\s*,\s*([^)]+)\)\s+([^;]+)/struct { $1 
*lh_first; } $2/gos;
+
+my $declaration = $members;
 
# Split nested struct/union elements as newer ones
while ($members =~ 
m/(struct|union)([^\{\};]+)\{([^\{\}]*)\}([^\{\}\;]*)\;/) {
-- 
2.39.2




[PATCH 5/5] docs/devel: mark out defined functions and structures

2024-03-07 Thread Alex Bennée
This allows sphinx to hyperlink the references to their kdoc
definitions for easy navigation.

Signed-off-by: Alex Bennée 
---
 docs/devel/memory.rst | 48 +--
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/docs/devel/memory.rst b/docs/devel/memory.rst
index ed24708fce3..193f31198b0 100644
--- a/docs/devel/memory.rst
+++ b/docs/devel/memory.rst
@@ -16,11 +16,11 @@ The memory model provides support for
 - setting up coalesced memory for kvm
 - setting up ioeventfd regions for kvm
 
-Memory is modelled as an acyclic graph of MemoryRegion objects.  Sinks
+Memory is modelled as an acyclic graph of `MemoryRegion` objects.  Sinks
 (leaves) are RAM and MMIO regions, while other nodes represent
 buses, memory controllers, and memory regions that have been rerouted.
 
-In addition to MemoryRegion objects, the memory API provides AddressSpace
+In addition to `MemoryRegion` objects, the memory API provides `AddressSpace`
 objects for every root and possibly for intermediate MemoryRegions too.
 These represent memory as seen from the CPU or a device's viewpoint.
 
@@ -28,17 +28,17 @@ Types of regions
 
 
 There are multiple types of memory regions (all represented by a single C type
-MemoryRegion):
+`MemoryRegion`):
 
 - RAM: a RAM region is simply a range of host memory that can be made available
   to the guest.
-  You typically initialize these with memory_region_init_ram().  Some special
-  purposes require the variants memory_region_init_resizeable_ram(),
-  memory_region_init_ram_from_file(), or memory_region_init_ram_ptr().
+  You typically initialize these with `memory_region_init_ram()`.  Some special
+  purposes require the variants `memory_region_init_resizeable_ram()`,
+  `memory_region_init_ram_from_file()`, or `memory_region_init_ram_ptr()`.
 
 - MMIO: a range of guest memory that is implemented by host callbacks;
   each read or write causes a callback to be called on the host.
-  You initialize these with memory_region_init_io(), passing it a
+  You initialize these with `memory_region_init_io()`, passing it a
   MemoryRegionOps structure describing the callbacks.
 
 - ROM: a ROM memory region works like RAM for reads (directly accessing
@@ -53,7 +53,7 @@ MemoryRegion):
 - IOMMU region: an IOMMU region translates addresses of accesses made to it
   and forwards them to some other target memory region.  As the name suggests,
   these are only needed for modelling an IOMMU, not for simple devices.
-  You initialize these with memory_region_init_iommu().
+  You initialize these with `memory_region_init_iommu()`.
 
 - container: a container simply includes other memory regions, each at
   a different offset.  Containers are useful for grouping several regions
@@ -65,7 +65,7 @@ MemoryRegion):
   can overlay a subregion of RAM with MMIO or ROM, or a PCI controller
   that does not prevent card from claiming overlapping BARs.
 
-  You initialize a pure container with memory_region_init().
+  You initialize a pure container with `memory_region_init()`.
 
 - alias: a subsection of another region. Aliases allow a region to be
   split apart into discontiguous regions. Examples of uses are memory
@@ -83,7 +83,7 @@ MemoryRegion):
   It claims I/O space that is not supposed to be handled by QEMU itself.
   The typical use is to track parts of the address space which will be
   handled by the host kernel when KVM is enabled.  You initialize these
-  by passing a NULL callback parameter to memory_region_init_io().
+  by passing a NULL callback parameter to `memory_region_init_io()`.
 
 It is valid to add subregions to a region which is not a pure container
 (that is, to an MMIO, RAM or ROM region). This means that the region
@@ -104,28 +104,28 @@ copied to the destination on migration. These APIs which 
allocate
 the host memory for you will also register the memory so it is
 migrated:
 
-- memory_region_init_ram()
-- memory_region_init_rom()
-- memory_region_init_rom_device()
+- `memory_region_init_ram()`
+- `memory_region_init_rom()`
+- `memory_region_init_rom_device()`
 
 For most devices and boards this is the correct thing. If you
 have a special case where you need to manage the migration of
 the backing memory yourself, you can call the functions:
 
-- memory_region_init_ram_nomigrate()
-- memory_region_init_rom_nomigrate()
-- memory_region_init_rom_device_nomigrate()
+- `memory_region_init_ram_nomigrate()`
+- `memory_region_init_rom_nomigrate()`
+- `memory_region_init_rom_device_nomigrate()`
 
 which only initialize the MemoryRegion and leave handling
 migration to the caller.
 
 The functions:
 
-- memory_region_init_resizeable_ram()
-- memory_region_init_ram_from_file()
-- memory_region_init_ram_from_fd()
-- memory_region_init_ram_ptr()
-- memory_region_init_ram_device_ptr()
+- `memory_region_init_resizeable_ram()`
+- `memory_region_init_ram_from_file()`
+- `memory_region_init_ram_from_fd()`
+- `memory_region_init_ram_ptr()`
+- 

[PATCH 0/5] docs: improve the memory API documentation

2024-03-07 Thread Alex Bennée
As I've been looking through the Memory API for our Xen work I thought
I should take the time to clean it up. I needed to teach kdoc about
our QLIST_ macros and I found at least one unused field in the
structure.

Looking through the definitions I do wander if the meaning of
romd_mode and ram_device could be cleaned up to a single bool
(directly_mapped or use_ops?).

Anyway for now just cosmetic improvements to the docs as we are close
to softfreeze.

Alex.

Alex Bennée (5):
  scripts/kernel-doc: teach kdoc about QLIST_ macros
  docs: include ramblock.h in the memory API docs
  include/exec: remove warning_printed from MemoryRegion
  include/exec: annotate all the MemoryRegion fields
  docs/devel: mark out defined functions and structures

 docs/devel/memory.rst   | 49 +-
 include/exec/memory.h   | 48 +++---
 include/exec/ramblock.h | 76 +++--
 scripts/kernel-doc  |  9 -
 4 files changed, 127 insertions(+), 55 deletions(-)

-- 
2.39.2




[PATCH 2/5] docs: include ramblock.h in the memory API docs

2024-03-07 Thread Alex Bennée
The RAMBlock concept is fairly central to RAM-like MemoryRegions so
lets update the structure documentation and include in the docs.

Signed-off-by: Alex Bennée 
---
 docs/devel/memory.rst   |  1 +
 include/exec/ramblock.h | 76 +++--
 2 files changed, 52 insertions(+), 25 deletions(-)

diff --git a/docs/devel/memory.rst b/docs/devel/memory.rst
index 69c5e3f914a..ed24708fce3 100644
--- a/docs/devel/memory.rst
+++ b/docs/devel/memory.rst
@@ -369,4 +369,5 @@ callbacks are called:
 API Reference
 -
 
+.. kernel-doc:: include/exec/ramblock.h
 .. kernel-doc:: include/exec/memory.h
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 848915ea5bf..eb2416b6f66 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -24,68 +24,94 @@
 #include "qemu/rcu.h"
 #include "exec/ramlist.h"
 
+/**
+ * struct RAMBlock - represents a chunk of RAM
+ *
+ * RAMBlocks can be backed by allocated RAM or a file-descriptor. See
+ * @flags for the details. For the purposes of migration various book
+ * keeping and dirty state tracking elements are also tracked in this
+ * structure.
+ */
 struct RAMBlock {
+/** @rcu: used for lazy free under RCU */
 struct rcu_head rcu;
+/** @mr: parent MemoryRegion the block belongs to */
 struct MemoryRegion *mr;
+/** @host: pointer to host address of RAM */
 uint8_t *host;
-uint8_t *colo_cache; /* For colo, VM's ram cache */
+/** @colo_cache: For colo, VM's ram cache */
+uint8_t *colo_cache;
+/** @offset: offset into host backing store??? or guest address space? */
 ram_addr_t offset;
+/** @used_length: amount of store used */
 ram_addr_t used_length;
+/** @max_length: for blocks that can be resized the max possible */
 ram_addr_t max_length;
+/** @resized: callback notifier when block resized  */
 void (*resized)(const char*, uint64_t length, void *host);
+/** @flags: see RAM_* flags in memory.h  */
 uint32_t flags;
-/* Protected by the BQL.  */
+/** @idstr: Protected by the BQL.  */
 char idstr[256];
-/* RCU-enabled, writes protected by the ramlist lock */
+/**
+ * @next: next RAMBlock, RCU-enabled, writes protected by the
+ * ramlist lock
+ */
 QLIST_ENTRY(RAMBlock) next;
+/** @ramblock_notifiers: list of RAMBlockNotifier notifiers */
 QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
+/** @fd: fd of backing store if used */
 int fd;
+/** @fd_offset: offset into the fd based backing store */
 uint64_t fd_offset;
+/** @page_size: ideal page size of backing store*/
 size_t page_size;
-/* dirty bitmap used during migration */
+/** @bmap:  dirty bitmap used during migration */
 unsigned long *bmap;
 
 /*
  * Below fields are only used by mapped-ram migration
  */
-/* bitmap of pages present in the migration file */
+
+/** @file_bmap: bitmap of pages present in the migration file  */
 unsigned long *file_bmap;
-/*
- * offset in the file pages belonging to this ramblock are saved,
- * used only during migration to a file.
- */
+/** @bitmap_offset: offset in the migration file of the bitmaps */
 off_t bitmap_offset;
+/** @pages_offset: offset in the migration file of the pages */
 uint64_t pages_offset;
 
-/* bitmap of already received pages in postcopy */
+/** @receivedmap: bitmap of already received pages in postcopy */
 unsigned long *receivedmap;
 
-/*
- * bitmap to track already cleared dirty bitmap.  When the bit is
- * set, it means the corresponding memory chunk needs a log-clear.
- * Set this up to non-NULL to enable the capability to postpone
- * and split clearing of dirty bitmap on the remote node (e.g.,
- * KVM).  The bitmap will be set only when doing global sync.
+/**
+ * @clear_bmap: bitmap to track already cleared dirty bitmap. When
+ * the bit is set, it means the corresponding memory chunk needs a
+ * log-clear. Set this up to non-NULL to enable the capability to
+ * postpone and split clearing of dirty bitmap on the remote node
+ * (e.g., KVM). The bitmap will be set only when doing global
+ * sync.
  *
  * It is only used during src side of ram migration, and it is
  * protected by the global ram_state.bitmap_mutex.
  *
  * NOTE: this bitmap is different comparing to the other bitmaps
  * in that one bit can represent multiple guest pages (which is
- * decided by the `clear_bmap_shift' variable below).  On
+ * decided by the @clear_bmap_shift variable below).  On
  * destination side, this should always be NULL, and the variable
- * `clear_bmap_shift' is meaningless.
+ * @clear_bmap_shift is meaningless.
  */
 unsigned long *clear_bmap;
+/** @clear_bmap_shift: number pages each @clear_bmap bit represents */
 uint8_t clear_bmap_shift;
 
-/*
- * RAM block length that 

[PATCH 3/5] include/exec: remove warning_printed from MemoryRegion

2024-03-07 Thread Alex Bennée
Since d197063fcf9 (memory: move unassigned_mem_ops to memory.c) this
field is unused.

Signed-off-by: Alex Bennée 
---
 include/exec/memory.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8626a355b31..17b741bc4f5 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -814,7 +814,6 @@ struct MemoryRegion {
 bool terminates;
 bool ram_device;
 bool enabled;
-bool warning_printed; /* For reservations */
 uint8_t vga_logging_count;
 MemoryRegion *alias;
 hwaddr alias_offset;
-- 
2.39.2




Re: [PATCH 1/3] ui/gtk: skip drawing guest scanout when associated VC is invisible

2024-03-07 Thread Daniel P . Berrangé
On Thu, Mar 07, 2024 at 05:53:24PM +, Kim, Dongwon wrote:
> Hi Daniel,
> 
> > -Original Message-
> > From: Daniel P. Berrangé 
> > Sent: Thursday, March 7, 2024 1:46 AM
> > To: Kim, Dongwon 
> > Cc: Marc-André Lureau ; qemu-
> > de...@nongnu.org
> > Subject: Re: [PATCH 1/3] ui/gtk: skip drawing guest scanout when associated
> > VC is invisible
> > 
> > On Thu, Feb 01, 2024 at 06:48:58PM +, Kim, Dongwon wrote:
> > > Hi Marc-André,
> > >
> > > Thanks for your feedback. Yes, you are right, rendering doesn't stop
> > > on Ubuntu system as it has preview even after the window is minimized. But
> > this is not always the case.
> > > Some simple windows managers don't have this preview (thumbnail)
> > > feature and this visible flag is not toggled. And the rendering stops
> > > right away there when the window is minimized.
> > 
> > This makes me pretty uncomfortable. This is proposing changing QEMU
> > behaviour to workaround a problem whose behaviour varies based on what 3rd
> > party software QEMU is running on
> > 
> > What you say "window managers" are you referring to a traditional
> > X11 based host display only, or does the problem also exist on modern
> > Wayland host display ?
> 
> [Kim, Dongwon]  I didn't mean anything about the compositor/display
> server itself. And I am not sure about the general behavior of Wayland
> compositors but the point here is the rendering while the app is being
> iconized (minimized) is not always the case. For example, ICE-WM on
> Yocto Linux doesn't have any preview for the iconized or minimized
> applications, which means all drawing activities on the minimized
> app are paused. This is the problem in case blob scanout is used
> with virtio-gpu on the guest because the guest won't receive the
> response for the frame submission until the frame is fully rendered
> on the host. This is causing timeout and further issue on the display
> pipeline in virtio-gpu driver in the guest.

I guess I'm wondering why we should consider this a bug in QEMU
rather than a bug in either the toolkit or host rendering stack ?

Lets say there was no guest OS here, just a regular host app
issuing drawing requests that were equivalent to the workload
the guest is issuing.  If that applications' execution got
blocked because its drawing requests are not getting processed
when iconified, I'd be inclined to call it a bug.

Or are we perhaps handling drawing in the wrong way in QEMU ?

If the problem is with drawing to a iconified windows, is
there an intermediate target buffer we should be drawing
to, instead of directly to the window. There must be some
supported way to have drawing requests fully processed in
the background indepent of having a visible window, surely ?

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




RE: [PATCH 1/3] ui/gtk: skip drawing guest scanout when associated VC is invisible

2024-03-07 Thread Kim, Dongwon
Hi Daniel,

> -Original Message-
> From: Daniel P. Berrangé 
> Sent: Thursday, March 7, 2024 1:46 AM
> To: Kim, Dongwon 
> Cc: Marc-André Lureau ; qemu-
> de...@nongnu.org
> Subject: Re: [PATCH 1/3] ui/gtk: skip drawing guest scanout when associated
> VC is invisible
> 
> On Thu, Feb 01, 2024 at 06:48:58PM +, Kim, Dongwon wrote:
> > Hi Marc-André,
> >
> > Thanks for your feedback. Yes, you are right, rendering doesn't stop
> > on Ubuntu system as it has preview even after the window is minimized. But
> this is not always the case.
> > Some simple windows managers don't have this preview (thumbnail)
> > feature and this visible flag is not toggled. And the rendering stops
> > right away there when the window is minimized.
> 
> This makes me pretty uncomfortable. This is proposing changing QEMU
> behaviour to workaround a problem whose behaviour varies based on what 3rd
> party software QEMU is running on
> 
> What you say "window managers" are you referring to a traditional
> X11 based host display only, or does the problem also exist on modern
> Wayland host display ?

[Kim, Dongwon]  I didn't mean anything about the compositor/display server 
itself. And I am not sure about the general behavior of Wayland compositors but 
the point here is the rendering while the app is being iconized (minimized) is 
not always the case. For example, ICE-WM on Yocto Linux doesn't have any 
preview for the iconized or minimized applications, which means all drawing 
activities on the minimized app are paused. This is the problem in case blob 
scanout is used with virtio-gpu on the guest because the guest won't receive 
the response for the frame submission until the frame is fully rendered on the 
host. This is causing timeout and further issue on the display pipeline in 
virtio-gpu driver in the guest.

> 
> If the problem is confined to X11, that would steer towards saying we 
> shouldn't
> try to workaround it given that X11 is very much obsolete at this point.

[Kim, Dongwon]  I think I should try Wayland too but I guess it depends on the 
compositor and the shell design. Of course I need to test but what if it works 
ok on the Ubuntu running with wayland backend but not with Weston?

> 
> With regards,
> Daniel
> --
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [PATCH 1/2] gdbstub: Add Xfer:siginfo:read stub

2024-03-07 Thread Gustavo Romero

On 3/4/24 2:18 PM, Richard Henderson wrote:

On 3/3/24 09:26, Gustavo Romero wrote:

+    /* Filter out si_type from si_code. See comment in siginfo_noswap(). */ > +   
 tmp_siginfo = ts->sync_signal.info;
+    tmp_siginfo.si_code = sextract32(tmp_siginfo.si_code, 0, 16);



This is incorrect, as it only handles synchronous signals.

In handle_pending_signal(), struct emulated_sigtable is passed, which has the 
correct siginfo (all of it, so no need for the adjustment).  I think you need 
to pass that in to gdb_handlesig so that a copy can be made for later xfer.


Thanks, I'm sending v2 that fixes it.


Cheers,
Gustavo



Re: [PATCH 2/2] tests/tcg: Add multiarch test for Xfer:siginfo:read stub

2024-03-07 Thread Gustavo Romero

Hi Richard,

On 3/4/24 7:51 PM, Richard Henderson wrote:

On 3/4/24 10:59, Gustavo Romero wrote:

Perhaps just abort for SIGABRT instead?


Although this can make a simpler test, the test can't control
the si_addr value easily, which I think is interesting to be tested.

Why do you prefer SIGABRT?


I missed that you were testing si_addr -- in which case SIGSEGV is a good match.



A test using setitimer to raise SIGALRM would test the async path.


SIGLARM doesn't generate any interesting siginfo?


It should at minimum have si_sig = SIGALRM.



gromero@arm64:~$ gdb -q ./sigalrm
Reading symbols from ./sigalrm...
(gdb) run
Starting program: /home/gromero/sigalrm

Program terminated with signal SIGALRM, Alarm clock.
The program no longer exists.
(gdb) p $_siginfo
$1 = void


Well that's because the program died.
Do you need to have gdb handle the signal?


ouch, right :)

However, on a remote target, even if I catch that signal using
'catch signal SIGALRM' the GDBstub only closes the connection
when SIGALRM is delivered. That's odd, I don't understand why.

I'm using the same binary that pretty much works on GDB locally.


[Remote target]

gromero@arm64:~$ gdb -q
gromero@arm64:~/qemu_tests$ gdb -q ./sigalrm
Reading symbols from ./sigalrm...
(gdb) catch signal SIGALRM
Catchpoint 1 (signal SIGALRM)
(gdb) c
The program is not being run.
(gdb) run
Starting program: /home/gromero/qemu_tests/sigalrm
[Inferior 1 (process 12732) exited normally]
(gdb) quit

on the QEMU gdbstub side it reports "Alarm clock":

gromero@amd:~/git/qemu/build$ ./qemu-aarch64 -g 1234 ./sigalrm -s
Alarm clock
gromero@amd:~/git/qemu/build$


[Locally]

gromero@arm64:~/qemu_tests$ gdb -q ./sigalrm
Reading symbols from ./sigalrm...
(gdb) catch signal SIGALRM
Catchpoint 1 (signal SIGALRM)
(gdb) run -s
Starting program: /home/gromero/qemu_tests/sigalrm -s

Catchpoint 1 (signal SIGALRM), 0x0041a410 in ualarm ()
(gdb) quit


I'd like to add for the async path using SIGALRM but I need more
time to understand what's going on regarding SIGLARM. I understand
that's nothing wrong with the Xfer:siginfo:read stub itself, and
because the main goal of the test is to test the stub, if you don't
mind, I'd like to keep only the test with SIGSEGV for v2 and leave
the async test as a follow-up.


Cheers,
Gustavo



[PATCH 5/5] docs/about: Deprecate the old "UltraSparc" CPU names that contain a "+"

2024-03-07 Thread Thomas Huth
For consistency we should drop the names with a "+" in it in the
long run.

Signed-off-by: Thomas Huth 
---
 docs/about/deprecated.rst | 9 +
 1 file changed, 9 insertions(+)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 8565644da6..7058341f8f 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -202,6 +202,15 @@ in the QEMU object model anymore. ``power5+``, 
``power5+_v2.1``,
 an alias, but for consistency these will get removed in a future
 release, too. Use ``power5p_v2.1`` and ``power7p_v2.1`` instead.
 
+``Sun-UltraSparc-IIIi+`` and ``Sun-UltraSparc-IV+`` CPU names (since 9.0)
+'
+
+The character "+" in device (and thus also CPU) names is not allowed
+in the QEMU object model anymore. ``Sun-UltraSparc-IIIi+`` and
+``Sun-UltraSparc-IV+`` are currently still supported via a workaround,
+but for consistency these will get removed in a future release, too.
+Use ``Sun-UltraSparc-IIIip`` and ``Sun-UltraSparc-IVp`` instead.
+
 CRIS CPU architecture (since 9.0)
 '
 
-- 
2.44.0




[PATCH 3/5] target/sparc/cpu: Improve the CPU help text

2024-03-07 Thread Thomas Huth
Remove the unnecessary "Sparc" at the beginning of the line and
put the chip information into parentheses so that it is clearer
which part of the line have to be passed to "-cpu" to specify a
different CPU.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2141
Signed-off-by: Thomas Huth 
---
 target/sparc/cpu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index ae30cded22..6650248ffe 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -576,9 +576,10 @@ void sparc_cpu_list(void)
 {
 unsigned int i;
 
+qemu_printf("Available CPU types:\n");
 for (i = 0; i < ARRAY_SIZE(sparc_defs); i++) {
-qemu_printf("Sparc %16s IU " TARGET_FMT_lx
-" FPU %08x MMU %08x NWINS %d ",
+qemu_printf(" %-20s (IU " TARGET_FMT_lx
+" FPU %08x MMU %08x NWINS %d) ",
 sparc_defs[i].name,
 sparc_defs[i].iu_version,
 sparc_defs[i].fpu_version,
-- 
2.44.0




[PATCH 4/5] docs/system/target-sparc: Improve the Sparc documentation

2024-03-07 Thread Thomas Huth
Add some words about how to enable or disable boolean features,
and remove the note about a Linux kernel being available on the
QEMU website (they have been removed long ago already).

Signed-off-by: Thomas Huth 
---
 docs/system/target-sparc.rst | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/docs/system/target-sparc.rst b/docs/system/target-sparc.rst
index 9ec8c90c14..9f418b9d3e 100644
--- a/docs/system/target-sparc.rst
+++ b/docs/system/target-sparc.rst
@@ -27,6 +27,11 @@ architecture machines:
 The emulation is somewhat complete. SMP up to 16 CPUs is supported, but
 Linux limits the number of usable CPUs to 4.
 
+The list of available CPUs can be viewed by starting QEMU with ``-cpu help``.
+Optional boolean features can be added with a "+" in front of the feature name,
+or disabled with a "-" in front of the name, for example
+``-cpu TI-SuperSparc-II,+float128``.
+
 QEMU emulates the following sun4m peripherals:
 
 -  IOMMU
@@ -55,8 +60,7 @@ OpenBIOS is a free (GPL v2) portable firmware implementation. 
The goal
 is to implement a 100% IEEE 1275-1994 (referred to as Open Firmware)
 compliant firmware.
 
-A sample Linux 2.6 series kernel and ram disk image are available on the
-QEMU web site. There are still issues with NetBSD and OpenBSD, but most
+There are still issues with NetBSD and OpenBSD, but most
 kernel versions work. Please note that currently older Solaris kernels
 don't work probably due to interface issues between OpenBIOS and
 Solaris.
-- 
2.44.0




[PATCH 1/5] target/sparc/cpu: Rename the CPU models with a "+" in their names

2024-03-07 Thread Thomas Huth
Commit b447378e12 ("qom/object: Limit type names to alphanumerical ...")
cut down the amount of allowed characters for QOM types to a saner set.
The "+" character was not meant to be included in this set, so we had
to add a hack there to still allow the legacy names of POWER and Sparc64
CPUs. However, instead of putting such a hack in the common QOM code,
there is a much better place to do this: The sparc_cpu_class_by_name()
function which is used to look up the names of all Sparc CPUs.
Thus let's finally get rid of the "+" in the Sparc CPU names, and provide
backward compatibility for the old names via some simple checks in the
sparc_cpu_class_by_name() function.

Signed-off-by: Thomas Huth 
---
 qom/object.c   |  8 
 target/sparc/cpu.c | 14 --
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/qom/object.c b/qom/object.c
index d4a001cf41..759e194972 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -158,14 +158,6 @@ static bool type_name_is_valid(const char *name)
 "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
 "0123456789-_.");
 
-/* Allow some legacy names with '+' in it for compatibility reasons */
-if (name[plen] == '+') {
-if (plen >= 17 && g_str_has_prefix(name, "Sun-UltraSparc-I")) {
-/* Allow "Sun-UltraSparc-IV+" and "Sun-UltraSparc-IIIi+" */
-return true;
-}
-}
-
 return plen == slen;
 }
 
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index 313ebc4c11..651e49bfeb 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -316,7 +316,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES,
 },
 {
-.name = "Sun UltraSparc IV+",
+.name = "Sun UltraSparc IVp",
 .iu_version = ((0x3eULL << 48) | (0x19ULL << 32) | (0x22ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_12,
@@ -325,7 +325,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES | CPU_FEATURE_CMT,
 },
 {
-.name = "Sun UltraSparc IIIi+",
+.name = "Sun UltraSparc IIIip",
 .iu_version = ((0x3eULL << 48) | (0x22ULL << 32) | (0ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_3,
@@ -767,6 +767,16 @@ static ObjectClass *sparc_cpu_class_by_name(const char 
*cpu_model)
 char *typename;
 
 typename = sparc_cpu_type_name(cpu_model);
+
+/* Fix up legacy names with '+' in it */
+if (g_str_equal(typename, SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IV+"))) {
+g_free(typename);
+typename = g_strdup(SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IVp"));
+} else if (g_str_equal(typename, 
SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IIIi+"))) {
+g_free(typename);
+typename = g_strdup(SPARC_CPU_TYPE_NAME("Sun-UltraSparc-IIIip"));
+}
+
 oc = object_class_by_name(typename);
 g_free(typename);
 return oc;
-- 
2.44.0




[PATCH 0/5] Sparc CPU naming and help text improvements

2024-03-07 Thread Thomas Huth
The Sparc CPU naming and the corresponding help text is somewhat
confusing for the users. We should avoid spaces in the Names and
provide clear information to the users what can be passed to the
"-cpu" option.
While we're at it, also remove the "+" from two of the CPU names
since this character is now not allowed in device names anymore
(and was worked around with an ugly hack in qom/object.c so far).

Thomas Huth (5):
  target/sparc/cpu: Rename the CPU models with a "+" in their names
  target/sparc/cpu: Avoid spaces by default in the CPU names
  target/sparc/cpu: Improve the CPU help text
  docs/system/target-sparc: Improve the Sparc documentation
  docs/about: Deprecate the old "UltraSparc" CPU names that contain a
"+"

 docs/about/deprecated.rst|  9 +
 docs/system/target-sparc.rst |  8 +++-
 qom/object.c |  8 
 target/sparc/cpu.c   | 71 +---
 4 files changed, 56 insertions(+), 40 deletions(-)

-- 
2.44.0




[PATCH 2/5] target/sparc/cpu: Avoid spaces by default in the CPU names

2024-03-07 Thread Thomas Huth
The output of "-cpu help" is currently rather confusing to the users:
It is not clear which part of the output defines the CPU names since
the CPU names contain white spaces (which we later have to convert
into dashes internally) For example:

Sparc TI UltraSparc II IU 001700112000 FPU  MMU  NWINS 8

At a first glance, should the name for -cpu be "Sparc TI Ultrasparc II"
or "TI UltraSparc II IU" here? Both would be wrong, the right guess is
"TI UltraSparc II" only. Let's start cleaning up this mess by using
dashes instead of white spaces for the CPU names, like we're doing it
internally later (and like we're doing it in most other targets of QEMU).
Note that it is still possible to pass the CPU names with spaces to the
"-cpu" option, since sparc_cpu_type_name() still translates those to "-".

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/2141
Signed-off-by: Thomas Huth 
---
 target/sparc/cpu.c | 56 +++---
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index 651e49bfeb..ae30cded22 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -208,7 +208,7 @@ void cpu_sparc_set_id(CPUSPARCState *env, unsigned int cpu)
 static const sparc_def_t sparc_defs[] = {
 #ifdef TARGET_SPARC64
 {
-.name = "Fujitsu Sparc64",
+.name = "Fujitsu-Sparc64",
 .iu_version = ((0x04ULL << 48) | (0x02ULL << 32) | (0ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_12,
@@ -217,7 +217,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES,
 },
 {
-.name = "Fujitsu Sparc64 III",
+.name = "Fujitsu-Sparc64-III",
 .iu_version = ((0x04ULL << 48) | (0x03ULL << 32) | (0ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_12,
@@ -226,7 +226,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES,
 },
 {
-.name = "Fujitsu Sparc64 IV",
+.name = "Fujitsu-Sparc64-IV",
 .iu_version = ((0x04ULL << 48) | (0x04ULL << 32) | (0ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_12,
@@ -235,7 +235,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES,
 },
 {
-.name = "Fujitsu Sparc64 V",
+.name = "Fujitsu-Sparc64-V",
 .iu_version = ((0x04ULL << 48) | (0x05ULL << 32) | (0x51ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_12,
@@ -244,7 +244,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES,
 },
 {
-.name = "TI UltraSparc I",
+.name = "TI-UltraSparc-I",
 .iu_version = ((0x17ULL << 48) | (0x10ULL << 32) | (0x40ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_12,
@@ -253,7 +253,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES,
 },
 {
-.name = "TI UltraSparc II",
+.name = "TI-UltraSparc-II",
 .iu_version = ((0x17ULL << 48) | (0x11ULL << 32) | (0x20ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_12,
@@ -262,7 +262,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES,
 },
 {
-.name = "TI UltraSparc IIi",
+.name = "TI-UltraSparc-IIi",
 .iu_version = ((0x17ULL << 48) | (0x12ULL << 32) | (0x91ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_12,
@@ -271,7 +271,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES,
 },
 {
-.name = "TI UltraSparc IIe",
+.name = "TI-UltraSparc-IIe",
 .iu_version = ((0x17ULL << 48) | (0x13ULL << 32) | (0x14ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_12,
@@ -280,7 +280,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES,
 },
 {
-.name = "Sun UltraSparc III",
+.name = "Sun-UltraSparc-III",
 .iu_version = ((0x3eULL << 48) | (0x14ULL << 32) | (0x34ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_12,
@@ -289,7 +289,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES,
 },
 {
-.name = "Sun UltraSparc III Cu",
+.name = "Sun-UltraSparc-III-Cu",
 .iu_version = ((0x3eULL << 48) | (0x15ULL << 32) | (0x41ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = mmu_us_3,
@@ -298,7 +298,7 @@ static const sparc_def_t sparc_defs[] = {
 .features = CPU_DEFAULT_FEATURES,
 },
 {
-.name = "Sun UltraSparc IIIi",
+.name = "Sun-UltraSparc-IIIi",
 .iu_version = ((0x3eULL << 48) | (0x16ULL << 32) | (0x34ULL << 24)),
 .fpu_version = 0x,
 .mmu_version = 

RE: [PATCH 1/3] ui/gtk: skip drawing guest scanout when associated VC is invisible

2024-03-07 Thread Kim, Dongwon
Hi Daniel,

> -Original Message-
> From: Daniel P. Berrangé 
> Sent: Thursday, March 7, 2024 1:41 AM
> To: Kim, Dongwon 
> Cc: qemu-devel@nongnu.org
> Subject: Re: [PATCH 1/3] ui/gtk: skip drawing guest scanout when associated
> VC is invisible
> 
> On Tue, Jan 30, 2024 at 03:48:38PM -0800, dongwon@intel.com wrote:
> > From: Dongwon Kim 
> >
> > A new flag "visible" is added to show visibility status of the gfx console.
> > The flag is set to 'true' when the VC is visible but set to 'false'
> > when it is hidden or closed. When the VC is invisible, drawing guest
> > frames should be skipped as it will never be completed and it would
> > potentially lock up the guest display especially when blob scanout is used.
> >
> > Cc: Marc-André Lureau 
> > Cc: Gerd Hoffmann 
> > Cc: Vivek Kasireddy 
> >
> > Signed-off-by: Dongwon Kim 
> > ---
> >  include/ui/gtk.h |  1 +
> >  ui/gtk-egl.c |  8 
> >  ui/gtk-gl-area.c |  8 
> >  ui/gtk.c | 10 +-
> >  4 files changed, 26 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/ui/gtk.h b/include/ui/gtk.h index
> > aa3d637029..2de38e5724 100644
> > --- a/include/ui/gtk.h
> > +++ b/include/ui/gtk.h
> > @@ -57,6 +57,7 @@ typedef struct VirtualGfxConsole {
> >  bool y0_top;
> >  bool scanout_mode;
> >  bool has_dmabuf;
> > +bool visible;
> >  #endif
> >  } VirtualGfxConsole;
> >
> > diff --git a/ui/gtk-egl.c b/ui/gtk-egl.c index 3af5ac5bcf..993c283191
> > 100644
> > --- a/ui/gtk-egl.c
> > +++ b/ui/gtk-egl.c
> > @@ -265,6 +265,10 @@ void
> gd_egl_scanout_dmabuf(DisplayChangeListener
> > *dcl,  #ifdef CONFIG_GBM
> >  VirtualConsole *vc = container_of(dcl, VirtualConsole, gfx.dcl);
> >
> > +if (!vc->gfx.visible) {
> > +return;
> > +}
> > +
> >  eglMakeCurrent(qemu_egl_display, vc->gfx.esurface,
> > vc->gfx.esurface, vc->gfx.ectx);
> >
> > @@ -363,6 +367,10 @@ void gd_egl_flush(DisplayChangeListener *dcl,
> >  VirtualConsole *vc = container_of(dcl, VirtualConsole, gfx.dcl);
> >  GtkWidget *area = vc->gfx.drawing_area;
> >
> > +if (!vc->gfx.visible) {
> > +return;
> > +}
> > +
> >  if (vc->gfx.guest_fb.dmabuf && !vc->gfx.guest_fb.dmabuf-
> >draw_submitted) {
> >  graphic_hw_gl_block(vc->gfx.dcl.con, true);
> >  vc->gfx.guest_fb.dmabuf->draw_submitted = true; diff --git
> > a/ui/gtk-gl-area.c b/ui/gtk-gl-area.c index 52dcac161e..04e07bd7ee
> > 100644
> > --- a/ui/gtk-gl-area.c
> > +++ b/ui/gtk-gl-area.c
> > @@ -285,6 +285,10 @@ void
> > gd_gl_area_scanout_flush(DisplayChangeListener *dcl,  {
> >  VirtualConsole *vc = container_of(dcl, VirtualConsole, gfx.dcl);
> >
> > +if (!vc->gfx.visible) {
> > +return;
> > +}
> > +
> >  if (vc->gfx.guest_fb.dmabuf && !vc->gfx.guest_fb.dmabuf-
> >draw_submitted) {
> >  graphic_hw_gl_block(vc->gfx.dcl.con, true);
> >  vc->gfx.guest_fb.dmabuf->draw_submitted = true; @@ -299,6
> > +303,10 @@ void gd_gl_area_scanout_dmabuf(DisplayChangeListener *dcl,
> > #ifdef CONFIG_GBM
> >  VirtualConsole *vc = container_of(dcl, VirtualConsole, gfx.dcl);
> >
> > +if (!vc->gfx.visible) {
> > +return;
> > +}
> > +
> >  gtk_gl_area_make_current(GTK_GL_AREA(vc->gfx.drawing_area));
> >  egl_dmabuf_import_texture(dmabuf);
> >  if (!dmabuf->texture) {
> 
> If we skip processing these requests, what happens when the QEMU windows is
> then re-displayed. Won't it now be missing updates to various regions of the
> guest display ?
 
[Kim, Dongwon]  Scanout blob is continuously submitted by the guest even when 
the vc->visible is false. So it will just display the current guest frame right 
away when the window is redisplayed again.

> 
> >
> With regards,
> Daniel
> --
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



KVM internal error due to non-atomic memslot updates by pci_update_vga()

2024-03-07 Thread Marcello Sylvester Bauer

Greetings,

I'm facing a problem with KVM memslot updates in pci_update_vga() and 
I'm looking for a possible solution to prevent this error.


Background:
Over the past few weeks, we have been investigating a bug where QEMU 
Windows 10 VMs using VT-d Intel GPU passthrough suddenly crash due to an 
internal KVM error. In order for this bug to occur, Windows is set to 
automatically turn off the display when idle. The reason for this bug is 
that the Windows Intel GPU driver disables VGA and therefore disables 
the QEMU memory region "vfio-vga-mmio@0xa". This change results in a 
non-atomic KVM memslot update (0x0-0xa000 -> 0x0-0xc000). Accessing this 
memory during this operation will cause a page fault and result in a 
KVM_EXIT_MMIO. While QEMU can provide the data, KVM is required to 
emulate the instruction, which in our case failed due to lack of support 
for the MOVSD instruction. I'm currently working on a kvm patch set to 
implement the missing instructions on the kernel side. But it would be 
great to prevent this race condition in QEMU as well.


Now to my general question:
Besides disabling VGA, what can we do in QEMU to avoid this?
Will the patch set "KVM: allow listener to stop all vcpus before" [1] be 
enough to prevent this bug or are additional changes needed?
There are even efforts to implement atomic memslot updates on the kernel 
side, but it does not look like this change will be adopted. [2]


Any thoughts and suggestions are welcome.

Thanks.
Marcello
---
[1](https://patchwork.kernel.org/project/kvm/cover/2022154758.1372674-1-eespo...@redhat.com/)
[2](https://lore.kernel.org/lkml/20220909104506.738478-1-eespo...@redhat.com/)


OpenPGP_0xE54B6622A5EDBF61.asc
Description: OpenPGP public key


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [PATCH] linux-aio: add IO_CMD_FDSYNC command support

2024-03-07 Thread Prasad Pandit
Hi,

On Thu, 7 Mar 2024 at 19:21, Kevin Wolf  wrote:
> Kernel support for this is "relatively" new, added in 2018. Should we
> fall back to the thread pool if we receive -EINVAL?

laio_co_submit
  laio_do_submit
ioq_submit
  io_submit

* When an aio operation is not supported by the kernel, io_submit()
call would return '-EINVAL', indicating that the specified (_FDSYNC)
operation is invalid for the file descriptor. ie. an error (-EINVAL)
needs to reach the 'laio_co_submit' routine above, to make its caller
fall back on the thread-pool functionality as default.

* Strangely the 'ioq_submit' routine above ignores the return value
from io_submit and returns void. I think we need to change that to be
able to fall back on the thread-pool functionality. I'll try to see if
such a change works as expected. Please let me know if there's another
way to fix this.

Thank you.
---
  - Prasad




[PATCH v6 10/17] hw/loongarch: fdt adds cpu interrupt controller node

2024-03-07 Thread Song Gao
fdt adds cpu interrupt controller node,
we use 'loongson,cpu-interrupt-controller'.

See:
https://github.com/torvalds/linux/blob/v6.7/drivers/irqchip/irq-loongarch-cpu.c
https://lore.kernel.org/r/20221114113824.1880-2-liupei...@loongson.cn

Signed-off-by: Song Gao 
Message-Id: <20240301093839.663947-11-gaos...@loongson.cn>
---
 hw/loongarch/virt.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 10fdfec5dd..d260f933a5 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -106,6 +106,23 @@ static void virt_flash_map(LoongArchMachineState *lams,
 virt_flash_map1(flash1, VIRT_FLASH1_BASE, VIRT_FLASH1_SIZE, sysmem);
 }
 
+static void fdt_add_cpuic_node(LoongArchMachineState *lams,
+   uint32_t *cpuintc_phandle)
+{
+MachineState *ms = MACHINE(lams);
+char *nodename;
+
+*cpuintc_phandle = qemu_fdt_alloc_phandle(ms->fdt);
+nodename = g_strdup_printf("/cpuic");
+qemu_fdt_add_subnode(ms->fdt, nodename);
+qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle", *cpuintc_phandle);
+qemu_fdt_setprop_string(ms->fdt, nodename, "compatible",
+"loongson,cpu-interrupt-controller");
+qemu_fdt_setprop(ms->fdt, nodename, "interrupt-controller", NULL, 0);
+qemu_fdt_setprop_cell(ms->fdt, nodename, "#interrupt-cells", 1);
+g_free(nodename);
+}
+
 static void fdt_add_flash_node(LoongArchMachineState *lams)
 {
 MachineState *ms = MACHINE(lams);
@@ -527,6 +544,7 @@ static void loongarch_irq_init(LoongArchMachineState *lams)
 CPULoongArchState *env;
 CPUState *cpu_state;
 int cpu, pin, i, start, num;
+uint32_t cpuintc_phandle;
 
 /*
  * The connection of interrupts:
@@ -561,6 +579,9 @@ static void loongarch_irq_init(LoongArchMachineState *lams)
 memory_region_add_subregion(>system_iocsr, MAIL_SEND_ADDR,
sysbus_mmio_get_region(SYS_BUS_DEVICE(ipi), 1));
 
+/* Add cpu interrupt-controller */
+fdt_add_cpuic_node(lams, _phandle);
+
 for (cpu = 0; cpu < ms->smp.cpus; cpu++) {
 cpu_state = qemu_get_cpu(cpu);
 cpudev = DEVICE(cpu_state);
-- 
2.34.1




[PATCH v6 12/17] hw/loongarch: fdt adds pch_pic Controller

2024-03-07 Thread Song Gao
fdt adds pch pic controller, we use 'loongson,pch-pic-1.0'

See:
https://github.com/torvalds/linux/blob/v6.7/drivers/irqchip/irq-loongson-pch-pic.c
https://lore.kernel.org/r/20200528152757.1028711-4-jiaxun.y...@flygoat.com

Signed-off-by: Song Gao 
Message-Id: <20240301093839.663947-13-gaos...@loongson.cn>
---
 hw/loongarch/virt.c| 30 +-
 include/hw/pci-host/ls7a.h |  1 +
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 822f070c45..2b7b653fc1 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -148,6 +148,31 @@ static void fdt_add_eiointc_node(LoongArchMachineState 
*lams,
 g_free(nodename);
 }
 
+static void fdt_add_pch_pic_node(LoongArchMachineState *lams,
+ uint32_t *eiointc_phandle,
+ uint32_t *pch_pic_phandle)
+{
+MachineState *ms = MACHINE(lams);
+char *nodename;
+hwaddr pch_pic_base = VIRT_PCH_REG_BASE;
+hwaddr pch_pic_size = VIRT_PCH_REG_SIZE;
+
+*pch_pic_phandle = qemu_fdt_alloc_phandle(ms->fdt);
+nodename = g_strdup_printf("/platic@%" PRIx64, pch_pic_base);
+qemu_fdt_add_subnode(ms->fdt, nodename);
+qemu_fdt_setprop_cell(ms->fdt,  nodename, "phandle", *pch_pic_phandle);
+qemu_fdt_setprop_string(ms->fdt, nodename, "compatible",
+"loongson,pch-pic-1.0");
+qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", 0,
+   pch_pic_base, 0, pch_pic_size);
+qemu_fdt_setprop(ms->fdt, nodename, "interrupt-controller", NULL, 0);
+qemu_fdt_setprop_cell(ms->fdt, nodename, "#interrupt-cells", 2);
+qemu_fdt_setprop_cell(ms->fdt, nodename, "interrupt-parent",
+  *eiointc_phandle);
+qemu_fdt_setprop_cell(ms->fdt, nodename, "loongson,pic-base-vec", 0);
+g_free(nodename);
+}
+
 static void fdt_add_flash_node(LoongArchMachineState *lams)
 {
 MachineState *ms = MACHINE(lams);
@@ -569,7 +594,7 @@ static void loongarch_irq_init(LoongArchMachineState *lams)
 CPULoongArchState *env;
 CPUState *cpu_state;
 int cpu, pin, i, start, num;
-uint32_t cpuintc_phandle, eiointc_phandle;
+uint32_t cpuintc_phandle, eiointc_phandle, pch_pic_phandle;
 
 /*
  * The connection of interrupts:
@@ -660,6 +685,9 @@ static void loongarch_irq_init(LoongArchMachineState *lams)
 qdev_connect_gpio_out(DEVICE(d), i, qdev_get_gpio_in(extioi, i));
 }
 
+/* Add PCH PIC node */
+fdt_add_pch_pic_node(lams, _phandle, _pic_phandle);
+
 pch_msi = qdev_new(TYPE_LOONGARCH_PCH_MSI);
 start   =  num;
 num = EXTIOI_IRQS - start;
diff --git a/include/hw/pci-host/ls7a.h b/include/hw/pci-host/ls7a.h
index e753449593..fe260f0183 100644
--- a/include/hw/pci-host/ls7a.h
+++ b/include/hw/pci-host/ls7a.h
@@ -24,6 +24,7 @@
 #define VIRT_PCH_REG_BASE0x1000UL
 #define VIRT_IOAPIC_REG_BASE (VIRT_PCH_REG_BASE)
 #define VIRT_PCH_MSI_ADDR_LOW0x2FF0UL
+#define VIRT_PCH_REG_SIZE0x400
 
 /*
  * GSI_BASE is hard-coded with 64 in linux kernel, else kernel fails to boot
-- 
2.34.1




[PATCH v6 01/17] hw/loongarch: Move boot fucntions to boot.c

2024-03-07 Thread Song Gao
Move some boot functions to boot.c and struct
loongarch_boot_info into struct LoongArchMachineState.

Signed-off-by: Song Gao 
Message-Id: <20240301093839.663947-2-gaos...@loongson.cn>
---
 hw/loongarch/boot.c | 128 
 hw/loongarch/meson.build|   1 +
 hw/loongarch/virt.c | 121 +++---
 include/hw/loongarch/boot.h |  21 ++
 include/hw/loongarch/virt.h |   2 +
 5 files changed, 160 insertions(+), 113 deletions(-)
 create mode 100644 hw/loongarch/boot.c
 create mode 100644 include/hw/loongarch/boot.h

diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c
new file mode 100644
index 00..a8a725a0a8
--- /dev/null
+++ b/hw/loongarch/boot.c
@@ -0,0 +1,128 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * LoongArch boot helper functions.
+ *
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "target/loongarch/cpu.h"
+#include "hw/loongarch/virt.h"
+#include "hw/loader.h"
+#include "elf.h"
+#include "qemu/error-report.h"
+#include "sysemu/reset.h"
+#include "sysemu/qtest.h"
+
+static uint64_t cpu_loongarch_virt_to_phys(void *opaque, uint64_t addr)
+{
+return addr & MAKE_64BIT_MASK(0, TARGET_PHYS_ADDR_SPACE_BITS);
+}
+
+static int64_t load_kernel_info(struct loongarch_boot_info *info)
+{
+uint64_t kernel_entry, kernel_low, kernel_high;
+ssize_t kernel_size;
+
+kernel_size = load_elf(info->kernel_filename, NULL,
+   cpu_loongarch_virt_to_phys, NULL,
+   _entry, _low,
+   _high, NULL, 0,
+   EM_LOONGARCH, 1, 0);
+
+if (kernel_size < 0) {
+error_report("could not load kernel '%s': %s",
+ info->kernel_filename,
+ load_elf_strerror(kernel_size));
+exit(1);
+}
+return kernel_entry;
+}
+
+static void reset_load_elf(void *opaque)
+{
+LoongArchCPU *cpu = opaque;
+CPULoongArchState *env = >env;
+
+cpu_reset(CPU(cpu));
+if (env->load_elf) {
+cpu_set_pc(CPU(cpu), env->elf_address);
+}
+}
+
+static void fw_cfg_add_kernel_info(struct loongarch_boot_info *info,
+   FWCfgState *fw_cfg)
+{
+/*
+ * Expose the kernel, the command line, and the initrd in fw_cfg.
+ * We don't process them here at all, it's all left to the
+ * firmware.
+ */
+load_image_to_fw_cfg(fw_cfg,
+ FW_CFG_KERNEL_SIZE, FW_CFG_KERNEL_DATA,
+ info->kernel_filename,
+ false);
+
+if (info->initrd_filename) {
+load_image_to_fw_cfg(fw_cfg,
+ FW_CFG_INITRD_SIZE, FW_CFG_INITRD_DATA,
+ info->initrd_filename, false);
+}
+
+if (info->kernel_cmdline) {
+fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
+   strlen(info->kernel_cmdline) + 1);
+fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA,
+  info->kernel_cmdline);
+}
+}
+
+static void loongarch_firmware_boot(LoongArchMachineState *lams,
+struct loongarch_boot_info *info)
+{
+fw_cfg_add_kernel_info(info, lams->fw_cfg);
+}
+
+static void loongarch_direct_kernel_boot(struct loongarch_boot_info *info)
+{
+int64_t kernel_addr = 0;
+LoongArchCPU *lacpu;
+CPUState *cs;
+
+if (info->kernel_filename) {
+kernel_addr = load_kernel_info(info);
+} else {
+if (!qtest_enabled()) {
+error_report("Need kernel filename\n");
+exit(1);
+}
+}
+
+CPU_FOREACH(cs) {
+lacpu = LOONGARCH_CPU(cs);
+lacpu->env.load_elf = true;
+lacpu->env.elf_address = kernel_addr;
+}
+}
+
+void loongarch_load_kernel(MachineState *ms, struct loongarch_boot_info *info)
+{
+LoongArchMachineState *lams = LOONGARCH_MACHINE(ms);
+int i;
+
+/* register reset function */
+for (i = 0; i < ms->smp.cpus; i++) {
+qemu_register_reset(reset_load_elf, LOONGARCH_CPU(qemu_get_cpu(i)));
+}
+
+info->kernel_filename = ms->kernel_filename;
+info->kernel_cmdline = ms->kernel_cmdline;
+info->initrd_filename = ms->initrd_filename;
+
+if (lams->bios_loaded) {
+loongarch_firmware_boot(lams, info);
+} else {
+loongarch_direct_kernel_boot(info);
+}
+}
diff --git a/hw/loongarch/meson.build b/hw/loongarch/meson.build
index c0421502ab..d306d82c2e 100644
--- a/hw/loongarch/meson.build
+++ b/hw/loongarch/meson.build
@@ -1,6 +1,7 @@
 loongarch_ss = ss.source_set()
 loongarch_ss.add(files(
 'fw_cfg.c',
+'boot.c',
 ))
 loongarch_ss.add(when: 'CONFIG_LOONGARCH_VIRT', if_true: [files('virt.c'), 
fdt])
 loongarch_ss.add(when: 'CONFIG_ACPI', if_true: files('acpi-build.c'))
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 

[PATCH v6 14/17] hw/loongarch: fdt adds pcie irq_map node

2024-03-07 Thread Song Gao
Signed-off-by: Song Gao 
Message-Id: <20240301093839.663947-15-gaos...@loongson.cn>
---
 hw/loongarch/virt.c | 73 ++---
 1 file changed, 69 insertions(+), 4 deletions(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 1e767c49f8..d00343f0c2 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -352,7 +352,62 @@ static void fdt_add_fw_cfg_node(const 
LoongArchMachineState *lams)
 g_free(nodename);
 }
 
-static void fdt_add_pcie_node(const LoongArchMachineState *lams)
+static void fdt_add_pcie_irq_map_node(const LoongArchMachineState *lams,
+  char *nodename,
+  uint32_t *pch_pic_phandle)
+{
+int pin, dev;
+uint32_t irq_map_stride = 0;
+uint32_t full_irq_map[GPEX_NUM_IRQS *GPEX_NUM_IRQS * 10] = {};
+uint32_t *irq_map = full_irq_map;
+const MachineState *ms = MACHINE(lams);
+
+/* This code creates a standard swizzle of interrupts such that
+ * each device's first interrupt is based on it's PCI_SLOT number.
+ * (See pci_swizzle_map_irq_fn())
+ *
+ * We only need one entry per interrupt in the table (not one per
+ * possible slot) seeing the interrupt-map-mask will allow the table
+ * to wrap to any number of devices.
+ */
+
+for (dev = 0; dev < GPEX_NUM_IRQS; dev++) {
+int devfn = dev * 0x8;
+
+for (pin = 0; pin  < GPEX_NUM_IRQS; pin++) {
+int irq_nr = 16 + ((pin + PCI_SLOT(devfn)) % GPEX_NUM_IRQS);
+int i = 0;
+
+/* Fill PCI address cells */
+irq_map[i] = cpu_to_be32(devfn << 8);
+i += 3;
+
+/* Fill PCI Interrupt cells */
+irq_map[i] = cpu_to_be32(pin + 1);
+i += 1;
+
+/* Fill interrupt controller phandle and cells */
+irq_map[i++] = cpu_to_be32(*pch_pic_phandle);
+irq_map[i++] = cpu_to_be32(irq_nr);
+
+if (!irq_map_stride) {
+irq_map_stride = i;
+}
+irq_map += irq_map_stride;
+}
+}
+
+
+qemu_fdt_setprop(ms->fdt, nodename, "interrupt-map", full_irq_map,
+ GPEX_NUM_IRQS * GPEX_NUM_IRQS *
+ irq_map_stride * sizeof(uint32_t));
+qemu_fdt_setprop_cells(ms->fdt, nodename, "interrupt-map-mask",
+ 0x1800, 0, 0, 0x7);
+}
+
+static void fdt_add_pcie_node(const LoongArchMachineState *lams,
+  uint32_t *pch_pic_phandle,
+  uint32_t *pch_msi_phandle)
 {
 char *nodename;
 hwaddr base_mmio = VIRT_PCI_MEM_BASE;
@@ -383,6 +438,11 @@ static void fdt_add_pcie_node(const LoongArchMachineState 
*lams)
  2, base_pio, 2, size_pio,
  1, FDT_PCI_RANGE_MMIO, 2, base_mmio,
  2, base_mmio, 2, size_mmio);
+qemu_fdt_setprop_cells(ms->fdt, nodename, "msi-map",
+   0, *pch_msi_phandle, 0, 0x1);
+
+fdt_add_pcie_irq_map_node(lams, nodename, pch_pic_phandle);
+
 g_free(nodename);
 }
 
@@ -541,7 +601,10 @@ static DeviceState *create_platform_bus(DeviceState 
*pch_pic)
 return dev;
 }
 
-static void loongarch_devices_init(DeviceState *pch_pic, LoongArchMachineState 
*lams)
+static void loongarch_devices_init(DeviceState *pch_pic,
+   LoongArchMachineState *lams,
+   uint32_t *pch_pic_phandle,
+   uint32_t *pch_msi_phandle)
 {
 MachineClass *mc = MACHINE_GET_CLASS(lams);
 DeviceState *gpex_dev;
@@ -587,6 +650,9 @@ static void loongarch_devices_init(DeviceState *pch_pic, 
LoongArchMachineState *
 gpex_set_irq_num(GPEX_HOST(gpex_dev), i, 16 + i);
 }
 
+/* Add pcie node */
+fdt_add_pcie_node(lams, pch_pic_phandle, pch_msi_phandle);
+
 serial_mm_init(get_system_memory(), VIRT_UART_BASE, 0,
qdev_get_gpio_in(pch_pic,
 VIRT_UART_IRQ - VIRT_GSI_BASE),
@@ -733,7 +799,7 @@ static void loongarch_irq_init(LoongArchMachineState *lams)
 /* Add PCH MSI node */
 fdt_add_pch_msi_node(lams, _phandle, _msi_phandle);
 
-loongarch_devices_init(pch_pic, lams);
+loongarch_devices_init(pch_pic, lams, _pic_phandle, _msi_phandle);
 }
 
 static void loongarch_firmware_init(LoongArchMachineState *lams)
@@ -956,7 +1022,6 @@ static void loongarch_init(MachineState *machine)
 lams->powerdown_notifier.notify = virt_powerdown_req;
 qemu_register_powerdown_notifier(>powerdown_notifier);
 
-fdt_add_pcie_node(lams);
 /*
  * Since lowmem region starts from 0 and Linux kernel legacy start address
  * at 2 MiB, FDT base address is located at 1 MiB to avoid NULL pointer
-- 
2.34.1




[PATCH v6 15/17] hw/loongarch: fdt remove unused irqchip node

2024-03-07 Thread Song Gao
Signed-off-by: Song Gao 
Message-Id: <20240301093839.663947-16-gaos...@loongson.cn>
---
 hw/loongarch/virt.c | 31 +--
 1 file changed, 1 insertion(+), 30 deletions(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index d00343f0c2..c80732a223 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -446,34 +446,6 @@ static void fdt_add_pcie_node(const LoongArchMachineState 
*lams,
 g_free(nodename);
 }
 
-static void fdt_add_irqchip_node(LoongArchMachineState *lams)
-{
-MachineState *ms = MACHINE(lams);
-char *nodename;
-uint32_t irqchip_phandle;
-
-irqchip_phandle = qemu_fdt_alloc_phandle(ms->fdt);
-qemu_fdt_setprop_cell(ms->fdt, "/", "interrupt-parent", irqchip_phandle);
-
-nodename = g_strdup_printf("/intc@%lx", VIRT_IOAPIC_REG_BASE);
-qemu_fdt_add_subnode(ms->fdt, nodename);
-qemu_fdt_setprop_cell(ms->fdt, nodename, "#interrupt-cells", 3);
-qemu_fdt_setprop(ms->fdt, nodename, "interrupt-controller", NULL, 0);
-qemu_fdt_setprop_cell(ms->fdt, nodename, "#address-cells", 0x2);
-qemu_fdt_setprop_cell(ms->fdt, nodename, "#size-cells", 0x2);
-qemu_fdt_setprop(ms->fdt, nodename, "ranges", NULL, 0);
-
-qemu_fdt_setprop_string(ms->fdt, nodename, "compatible",
-"loongarch,ls7a");
-
-qemu_fdt_setprop_sized_cells(ms->fdt, nodename, "reg",
- 2, VIRT_IOAPIC_REG_BASE,
- 2, PCH_PIC_ROUTE_ENTRY_OFFSET);
-
-qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle", irqchip_phandle);
-g_free(nodename);
-}
-
 static void fdt_add_memory_node(MachineState *ms,
 uint64_t base, uint64_t size, int node_id)
 {
@@ -1011,8 +983,7 @@ static void loongarch_init(MachineState *machine)
 
 /* Initialize the IO interrupt subsystem */
 loongarch_irq_init(lams);
-fdt_add_irqchip_node(lams);
-platform_bus_add_all_fdt_nodes(machine->fdt, "/intc",
+platform_bus_add_all_fdt_nodes(machine->fdt, "/platic",
VIRT_PLATFORM_BUS_BASEADDRESS,
VIRT_PLATFORM_BUS_SIZE,
VIRT_PLATFORM_BUS_IRQ);
-- 
2.34.1




[PATCH v6 11/17] hw/loongarch: fdt adds Extend I/O Interrupt Controller

2024-03-07 Thread Song Gao
fdt adds Extend I/O Interrupt Controller,
we use 'loongson,ls2k2000-eiointc'.

See:
https://github.com/torvalds/linux/blob/v6.7/drivers/irqchip/irq-loongson-eiointc.c
https://lore.kernel.org/r/764e02d924094580ac0f1d15535f4b98308705c6.1683279769.git.zhoubin...@loongson.cn

Signed-off-by: Song Gao 
Message-Id: <20240301093839.663947-12-gaos...@loongson.cn>
---
 hw/loongarch/virt.c| 30 +-
 include/hw/intc/loongarch_extioi.h |  1 +
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index d260f933a5..822f070c45 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -123,6 +123,31 @@ static void fdt_add_cpuic_node(LoongArchMachineState *lams,
 g_free(nodename);
 }
 
+static void fdt_add_eiointc_node(LoongArchMachineState *lams,
+  uint32_t *cpuintc_phandle,
+  uint32_t *eiointc_phandle)
+{
+MachineState *ms = MACHINE(lams);
+char *nodename;
+hwaddr extioi_base = APIC_BASE;
+hwaddr extioi_size = EXTIOI_SIZE;
+
+*eiointc_phandle = qemu_fdt_alloc_phandle(ms->fdt);
+nodename = g_strdup_printf("/eiointc@%" PRIx64, extioi_base);
+qemu_fdt_add_subnode(ms->fdt, nodename);
+qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle", *eiointc_phandle);
+qemu_fdt_setprop_string(ms->fdt, nodename, "compatible",
+"loongson,ls2k2000-eiointc");
+qemu_fdt_setprop(ms->fdt, nodename, "interrupt-controller", NULL, 0);
+qemu_fdt_setprop_cell(ms->fdt, nodename, "#interrupt-cells", 1);
+qemu_fdt_setprop_cell(ms->fdt, nodename, "interrupt-parent",
+  *cpuintc_phandle);
+qemu_fdt_setprop_cell(ms->fdt, nodename, "interrupts", 3);
+qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", 0x0,
+   extioi_base, 0x0, extioi_size);
+g_free(nodename);
+}
+
 static void fdt_add_flash_node(LoongArchMachineState *lams)
 {
 MachineState *ms = MACHINE(lams);
@@ -544,7 +569,7 @@ static void loongarch_irq_init(LoongArchMachineState *lams)
 CPULoongArchState *env;
 CPUState *cpu_state;
 int cpu, pin, i, start, num;
-uint32_t cpuintc_phandle;
+uint32_t cpuintc_phandle, eiointc_phandle;
 
 /*
  * The connection of interrupts:
@@ -613,6 +638,9 @@ static void loongarch_irq_init(LoongArchMachineState *lams)
 }
 }
 
+/* Add Extend I/O Interrupt Controller node */
+fdt_add_eiointc_node(lams, _phandle, _phandle);
+
 pch_pic = qdev_new(TYPE_LOONGARCH_PCH_PIC);
 num = VIRT_PCH_PIC_IRQ_NUM;
 qdev_prop_set_uint32(pch_pic, "pch_pic_irq_num", num);
diff --git a/include/hw/intc/loongarch_extioi.h 
b/include/hw/intc/loongarch_extioi.h
index a0a46b888c..410c6e1121 100644
--- a/include/hw/intc/loongarch_extioi.h
+++ b/include/hw/intc/loongarch_extioi.h
@@ -39,6 +39,7 @@
 #define EXTIOI_COREISR_END   (0xB20 - APIC_OFFSET)
 #define EXTIOI_COREMAP_START (0xC00 - APIC_OFFSET)
 #define EXTIOI_COREMAP_END   (0xD00 - APIC_OFFSET)
+#define EXTIOI_SIZE  0x800
 
 typedef struct ExtIOICore {
 uint32_t coreisr[EXTIOI_IRQS_GROUP_COUNT];
-- 
2.34.1




  1   2   3   4   >