Re: [PATCH 9/9] accel/tcg: Pass last not end to tb_invalidate_phys_range

2023-03-05 Thread Philippe Mathieu-Daudé

On 6/3/23 03:13, Richard Henderson wrote:

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Signed-off-by: Richard Henderson 
---
  include/exec/exec-all.h   |  2 +-
  accel/tcg/tb-maint.c  | 31 ---
  accel/tcg/translate-all.c |  2 +-
  accel/tcg/user-exec.c |  2 +-
  softmmu/physmem.c |  2 +-
  5 files changed, 20 insertions(+), 19 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 
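
The overflow described in the commit message can be shown with a minimal sketch. It assumes a 32-bit address type; addr_t, in_range_end() and in_range_last() are hypothetical names for illustration and are not part of the QEMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit address type, chosen only for illustration. */
typedef uint32_t addr_t;

/*
 * Exclusive-end form: "end" is the first address past the range.
 * For a range that reaches the top of the address space, end wraps
 * to 0 and the comparison below rejects every address.
 */
static bool in_range_end(addr_t addr, addr_t start, addr_t end)
{
    return addr >= start && addr < end;
}

/*
 * Inclusive-last form: "last" is the final byte of the range and is
 * always representable, so no wrap-around can occur.
 */
static bool in_range_last(addr_t addr, addr_t start, addr_t last)
{
    assert(start <= last);
    return addr >= start && addr <= last;
}
```

For the last 4 KiB page of a 32-bit space, start + len wraps to 0, whereas start + (len - 1) is 0xffffffff and compares as expected.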




Re: [PATCH v3 12/16] slirp: unregister the win32 SOCKET

2023-03-05 Thread Marc-André Lureau
Hi

On Thu, Mar 2, 2023 at 10:45 PM Stefan Berger  wrote:

>
>
> On 2/21/23 07:47, marcandre.lur...@redhat.com wrote:
> > From: Marc-André Lureau 
> >
> > Presumably, this is what should happen when the SOCKET is to be removed.
> > (it probably worked until now because closesocket() does it implicitly,
> > but we never know how the slirp library could use the SOCKET later)
> >
> > Signed-off-by: Marc-André Lureau 
> > ---
> >   net/slirp.c | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/slirp.c b/net/slirp.c
> > index 0730a935ba..a7c35778a6 100644
> > --- a/net/slirp.c
> > +++ b/net/slirp.c
> > @@ -259,7 +259,9 @@ static void net_slirp_register_poll_fd(int fd, void
> *opaque)
> >
> >   static void net_slirp_unregister_poll_fd(int fd, void *opaque)
> >   {
> > -/* no qemu_fd_unregister */
> > +#ifdef WIN32
> The majority of code seems to use _WIN32. Not sure what is 'right'.
>
>
Both should be correct. I think I like the "WIN32" version better though
(see also
https://stackoverflow.com/questions/662084/whats-the-difference-between-the-win32-and-win32-defines-in-c
)


> Reviewed-by: Stefan Berger 
>
>
thanks


> > +qemu_socket_unselect(fd, NULL);
> > +#endif
> >   }
> >
> >   static void net_slirp_notify(void *opaque)
>
>


Re: [PATCH v3 14/16] win32: avoid mixing SOCKET and file descriptor space

2023-03-05 Thread Marc-André Lureau
Hi

On Fri, Mar 3, 2023 at 12:54 AM Stefan Berger  wrote:
>
>
>
> On 2/21/23 07:47, marcandre.lur...@redhat.com wrote:
> > From: Marc-André Lureau 
> >
> > Until now, a win32 SOCKET handle is often cast to an int file
> > descriptor, as this is what other OSes use for sockets. When necessary,
> > QEMU eventually queries whether it's a socket with the help of
> > fd_is_socket(). However, there is no guarantee that the fd and SOCKET
> > spaces do not conflict. Such a conflict would have surprising
> > consequences, so we shouldn't mix them.
> >
> > Also, it is often forgotten that SOCKET must be closed with
> > closesocket(), and not close().
> >
> > Instead, let's make the win32 socket wrapper functions return and take a
> > file descriptor, and let util/ wrappers do the fd/SOCKET conversion as
> > necessary. A bit of adaptation is necessary in io/ as well.
> >
> > Unfortunately, we can't drop closesocket() usage: despite the
> > _open_osfhandle() documentation claiming transfer of ownership, testing
> > shows bad behaviour if you forget to call closesocket().
> >
> > Signed-off-by: Marc-André Lureau 
> > ---
> >   include/sysemu/os-win32.h |   4 +-
> >   io/channel-watch.c|   6 +-
> >   util/aio-win32.c  |   9 +-
> >   util/oslib-win32.c| 219 --
> >   4 files changed, 197 insertions(+), 41 deletions(-)
>
> >   #undef connect
> > @@ -308,7 +315,13 @@ int qemu_connect_wrap(int sockfd, const struct 
> > sockaddr *addr,
> > socklen_t addrlen)
> >   {
> >   int ret;
> > -ret = connect(sockfd, addr, addrlen);
> > +SOCKET s = _get_osfhandle(sockfd);
> > +
> > +if (s == INVALID_SOCKET) {
> > +return -1;
> > +}
> > +
> > +ret = connect(s, addr, addrlen);
>
>
> Previously you passed int sockfd and now you convert this sockfd to a SOCKET 
> s and you can pass this as an alternative? Or was sockfd before this patch a 
> SOCKET??

yes, sockfd was in fact a SOCKET.

Before this patch, a SOCKET was cast to int, as a fake fd, and
back to SOCKET as necessary. The whole point of this patch is to avoid
mixing the SOCKET and fd spaces; instead, a SOCKET is associated with
a real CRT fd.

thanks

-- 
Marc-André Lureau



Re: [PATCH 7/9] accel/tcg: Pass last not end to page_collection_lock

2023-03-05 Thread Philippe Mathieu-Daudé

On 6/3/23 03:13, Richard Henderson wrote:

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Fixes a bug in the loop comparison where "<= end" would lock
one more page than required.

Signed-off-by: Richard Henderson 
---
  accel/tcg/tb-maint.c | 22 +++---
  1 file changed, 11 insertions(+), 11 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 
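
The off-by-one mentioned in the commit message can be illustrated with a small sketch. It assumes 4 KiB pages; PAGE_BITS and pages_visited() are hypothetical names for illustration, not the code in tb-maint.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS 12 /* assumed 4 KiB pages, for illustration only */

/*
 * Count the pages visited by a loop of the form
 *     for (index = start >> PAGE_BITS; index <= bound >> PAGE_BITS; index++)
 * where "bound" is either an exclusive end or an inclusive last address.
 */
static size_t pages_visited(uint64_t start, uint64_t bound)
{
    uint64_t first = start >> PAGE_BITS;
    uint64_t final = bound >> PAGE_BITS;
    size_t n = 0;

    assert(first <= final);
    for (uint64_t index = first; index <= final; index++) {
        n++;
    }
    return n;
}
```

With start = 0x1000 and a length of two pages, passing the exclusive end (0x3000) visits three pages, while passing the inclusive last byte (0x2fff) visits exactly two.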




Re: [PATCH 6/9] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB

2023-03-05 Thread Philippe Mathieu-Daudé

On 6/3/23 03:13, Richard Henderson wrote:

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Signed-off-by: Richard Henderson 
---
  accel/tcg/tb-maint.c | 28 
  1 file changed, 16 insertions(+), 12 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 5/9] accel/tcg: Pass last not end to page_reset_target_data

2023-03-05 Thread Philippe Mathieu-Daudé

On 6/3/23 03:13, Richard Henderson wrote:

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Signed-off-by: Richard Henderson 
---
  include/exec/cpu-all.h |  2 +-
  accel/tcg/user-exec.c  | 11 +--
  linux-user/mmap.c  |  2 +-
  3 files changed, 7 insertions(+), 8 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 4/9] accel/tcg: Pass last not end to page_set_flags

2023-03-05 Thread Philippe Mathieu-Daudé

On 6/3/23 03:13, Richard Henderson wrote:

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1528
Signed-off-by: Richard Henderson 
---
  include/exec/cpu-all.h |  2 +-
  accel/tcg/user-exec.c  | 16 +++-
  bsd-user/mmap.c|  6 +++---
  linux-user/elfload.c   | 11 ++-
  linux-user/mmap.c  | 16 
  linux-user/syscall.c   |  4 ++--
  6 files changed, 27 insertions(+), 28 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH] scsi: megasas: Internal cdbs have 16-byte length

2023-03-05 Thread Fiona Ebner
On 03.03.23 at 16:10, Guenter Roeck wrote:
> On 3/3/23 01:02, Fiona Ebner wrote:
>> On 28.02.23 at 18:11, Guenter Roeck wrote:
>>> Host drivers do not necessarily set cdb_len in megasas io commands.
>>> With commits 6d1511cea0 ("scsi: Reject commands if the CDB length
>>> exceeds buf_len") and fe9d8927e2 ("scsi: Add buf_len parameter to
>>> scsi_req_new()"), this results in failures to boot Linux from affected
>>> SCSI drives because cdb_len is set to 0 by the host driver.
>>> Set the cdb length to its actual size to solve the problem.
>>>
>>
>> Tested-by: Fiona Ebner 
>>
>> But I do have a question:
>>
>>> Signed-off-by: Guenter Roeck 
>>> ---
>>>   hw/scsi/megasas.c | 14 ++
>>>   1 file changed, 2 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
>>> index 9cbbb16121..d624866bb6 100644
>>> --- a/hw/scsi/megasas.c
>>> +++ b/hw/scsi/megasas.c
>>> @@ -1780,7 +1780,7 @@ static int megasas_handle_io(MegasasState *s,
>>> MegasasCmd *cmd, int frame_cmd)
>>>   uint8_t cdb[16];
>>>   int len;
>>>   struct SCSIDevice *sdev = NULL;
>>> -    int target_id, lun_id, cdb_len;
>>> +    int target_id, lun_id;
>>>     lba_count = le32_to_cpu(cmd->frame->io.header.data_len);
>>>   lba_start_lo = le32_to_cpu(cmd->frame->io.lba_lo);
>>> @@ -1789,7 +1789,6 @@ static int megasas_handle_io(MegasasState *s,
>>> MegasasCmd *cmd, int frame_cmd)
>>>     target_id = cmd->frame->header.target_id;
>>>   lun_id = cmd->frame->header.lun_id;
>>> -    cdb_len = cmd->frame->header.cdb_len;
>>>     if (target_id < MFI_MAX_LD && lun_id == 0) {
>>>   sdev = scsi_device_find(&s->bus, 0, target_id, lun_id);
>>> @@ -1804,15 +1803,6 @@ static int megasas_handle_io(MegasasState *s,
>>> MegasasCmd *cmd, int frame_cmd)
>>>   return MFI_STAT_DEVICE_NOT_FOUND;
>>>   }
>>>   -    if (cdb_len > 16) {
>>> -    trace_megasas_scsi_invalid_cdb_len(
>>> -    mfi_frame_desc(frame_cmd), 1, target_id, lun_id, cdb_len);
>>> -    megasas_write_sense(cmd, SENSE_CODE(INVALID_OPCODE));
>>> -    cmd->frame->header.scsi_status = CHECK_CONDITION;
>>> -    s->event_count++;
>>> -    return MFI_STAT_SCSI_DONE_WITH_ERROR;
>>> -    }
>>
>> Shouldn't we still fail when cmd->frame->header.cdb_len > 16? Or is the
>> consequence of
>>
>>> Host drivers do not necessarily set cdb_len in megasas io commands.
>>
>> that this can be uninitialized memory and we need to assume it was not
>> explicitly set?
>>
> 
> I doubt that real hardware uses or checks the field for the affected
> commands
> because that would be pointless, but it is really up to you to decide how
> you want to handle it.
> 
> Guenter

Okay, thank you for the explanation!

> 
>> Best Regards,
>> Fiona
>>
>>> -
>>>   cmd->iov_size = lba_count * sdev->blocksize;
>>>   if (megasas_map_sgl(s, cmd, &cmd->frame->io.sgl)) {
>>>   megasas_write_sense(cmd, SENSE_CODE(TARGET_FAILURE));
>>> @@ -1823,7 +1813,7 @@ static int megasas_handle_io(MegasasState *s,
>>> MegasasCmd *cmd, int frame_cmd)
>>>     megasas_encode_lba(cdb, lba_start, lba_count, is_write);
>>>   cmd->req = scsi_req_new(sdev, cmd->index,
>>> -    lun_id, cdb, cdb_len, cmd);
>>> +    lun_id, cdb, sizeof(cdb), cmd);
>>>   if (!cmd->req) {
>>>   trace_megasas_scsi_req_alloc_failed(
>>>   mfi_frame_desc(frame_cmd), target_id, lun_id);
>>
> 
> 
> 




Re: [PATCH 2/9] linux-user: Rename max_reserved_va in main

2023-03-05 Thread Philippe Mathieu-Daudé

On 6/3/23 03:13, Richard Henderson wrote:

Rename to local_max_va, to avoid a conflict with the next patch.

Signed-off-by: Richard Henderson 
---
  linux-user/main.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v2 1/5] tcg: Do not elide memory barriers for !CF_PARALLEL

2023-03-05 Thread Philippe Mathieu-Daudé

On 6/3/23 02:57, Richard Henderson wrote:

The virtio devices require proper memory ordering between
the vcpus and the iothreads.

Signed-off-by: Richard Henderson 
---
  tcg/tcg-op.c | 10 +++---
  1 file changed, 7 insertions(+), 3 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v2 5/5] accel/tcg: Remove check_tcg_memory_orders_compatible

2023-03-05 Thread Philippe Mathieu-Daudé

On 6/3/23 02:57, Richard Henderson wrote:

We now issue host memory barriers to match the guest memory order.
Continue to disable MTTCG only if the guest has not been ported.

Signed-off-by: Richard Henderson 
---
  accel/tcg/tcg-all.c | 34 --
  1 file changed, 8 insertions(+), 26 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v2 3/5] tcg: Create tcg_req_mo

2023-03-05 Thread Philippe Mathieu-Daudé

On 6/3/23 02:57, Richard Henderson wrote:

Split out the logic to emit a host memory barrier in response to
a guest memory operation.  Do not provide a true default for
TCG_GUEST_DEFAULT_MO because the defined() check will still be
useful for determining if a guest has been updated for MTTCG.

Signed-off-by: Richard Henderson 
---
  include/tcg/tcg.h   | 20 
  accel/tcg/tcg-all.c |  6 +-
  tcg/tcg-op.c|  8 +---
  3 files changed, 22 insertions(+), 12 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v2 2/5] tcg: Elide memory barriers implied by the host memory model

2023-03-05 Thread Philippe Mathieu-Daudé

On 6/3/23 02:57, Richard Henderson wrote:

Reduce the set of required barriers to those needed by
the host right from the beginning.

Signed-off-by: Richard Henderson 
---
  tcg/tcg-op.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v4 1/9] hw/pci-host/i440fx: Inline sysbus_add_io()

2023-03-05 Thread Bernhard Beschow



On 22 February 2023 18:05:51 UTC, Bernhard Beschow wrote:
>
>
>On 22 February 2023 10:58:08 UTC, "Philippe Mathieu-Daudé" wrote:
>>On 13/2/23 17:19, Bernhard Beschow wrote:
>>> sysbus_add_io() just wraps memory_region_add_subregion() while also
>>> obscuring where the memory is attached. So use
>>> memory_region_add_subregion() directly and attach it to the existing
>>> memory region s->bus->address_space_io which is set as an alias to
>>> get_system_io() by the pc machine.
>>> 
>>> Signed-off-by: Bernhard Beschow 
>>> Reviewed-by: Thomas Huth 
>>> ---
>>>   hw/pci-host/i440fx.c | 5 +++--
>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/hw/pci-host/i440fx.c b/hw/pci-host/i440fx.c
>>> index 262f82c303..9c6882d3fc 100644
>>> --- a/hw/pci-host/i440fx.c
>>> +++ b/hw/pci-host/i440fx.c
>>> @@ -27,6 +27,7 @@
>>>   #include "qemu/range.h"
>>>   #include "hw/i386/pc.h"
>>>   #include "hw/pci/pci.h"
>>> +#include "hw/pci/pci_bus.h"
>>>   #include "hw/pci/pci_host.h"
>>>   #include "hw/pci-host/i440fx.h"
>>>   #include "hw/qdev-properties.h"
>>> @@ -217,10 +218,10 @@ static void i440fx_pcihost_realize(DeviceState *dev, 
>>> Error **errp)
>>>   PCIHostState *s = PCI_HOST_BRIDGE(dev);
>>>   SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
>>>   -sysbus_add_io(sbd, 0xcf8, &s->conf_mem);
>>> +memory_region_add_subregion(s->bus->address_space_io, 0xcf8, 
>>> &s->conf_mem);
>>
>>To avoid accessing internal fields we should stick to the PCI API:
>>
>>memory_region_add_subregion(pci_address_space_io(PCI_DEVICE(dev)),
>>0xcf8, &s->conf_mem);
>
>dev is of type PCIHostState which derives from SysBusDevice, not PCIDevice. 
>AFAICS there is no getter implemented on PCIBus.

Ping

>
>>
>>>   sysbus_init_ioports(sbd, 0xcf8, 4);
>>>   -sysbus_add_io(sbd, 0xcfc, &s->data_mem);
>>> +memory_region_add_subregion(s->bus->address_space_io, 0xcfc, 
>>> &s->data_mem);
>>>   sysbus_init_ioports(sbd, 0xcfc, 4);
>>
>>Now all classes implementing PCI_HOST_BRIDGE register conf/data in I/O
>>space, so this could be a pattern justifying reworking the
>>PCIHostBridgeClass a bit or adding a helper in "hw/pci/pci_host.h" to do
>>that generically.
>
>What do you mean exactly? There are PCI hosts that spawn two PCI buses and 
>therefore have two such spaces.
>
>Best regards,
>Bernhard



[PULL 09/27] audio/audio_template: use g_new0() to replace audio_calloc()

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Replace audio_calloc() with the equivalent g_new0().

With a n_structs argument >= 1, g_new0() never returns NULL.
Also remove the unnecessary NULL checks.

Signed-off-by: Volker Rümelin 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Marc-André Lureau 
Message-Id: <20230121094735.11644-8-vr_q...@t-online.de>
---
 audio/audio_template.h | 29 -
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/audio/audio_template.h b/audio/audio_template.h
index 592866f14a..980e1f4bd0 100644
--- a/audio/audio_template.h
+++ b/audio/audio_template.h
@@ -115,6 +115,12 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW 
*sw)
 #else
 samples = (int64_t)sw->HWBUF->size * sw->ratio >> 32;
 #endif
+if (audio_bug(__func__, samples < 0)) {
+dolog("Can not allocate buffer for `%s' (%d samples)\n",
+  SW_NAME(sw), samples);
+return -1;
+}
+
 if (samples == 0) {
 HW *hw = sw->hw;
 size_t f_fe_min;
@@ -129,12 +135,7 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW 
*sw)
 return -1;
 }
 
-sw->buf = audio_calloc(__func__, samples, sizeof(struct st_sample));
-if (!sw->buf) {
-dolog ("Could not allocate buffer for `%s' (%d samples)\n",
-   SW_NAME (sw), samples);
-return -1;
-}
+sw->buf = g_new0(st_sample, samples);
 
 #ifdef DAC
 sw->rate = st_rate_start (sw->info.freq, sw->hw->info.freq);
@@ -425,34 +426,28 @@ static SW *glue(audio_pcm_create_voice_pair_, TYPE)(
 hw_as = *as;
 }
 
-sw = audio_calloc(__func__, 1, sizeof(*sw));
-if (!sw) {
-dolog ("Could not allocate soft voice `%s' (%zu bytes)\n",
-   sw_name ? sw_name : "unknown", sizeof (*sw));
-goto err1;
-}
+sw = g_new0(SW, 1);
 sw->s = s;
 
 hw = glue(audio_pcm_hw_add_, TYPE)(s, &hw_as);
 if (!hw) {
 dolog("Could not create a backend for voice `%s'\n", sw_name);
-goto err2;
+goto err1;
 }
 
 glue (audio_pcm_hw_add_sw_, TYPE) (hw, sw);
 
 if (glue (audio_pcm_sw_init_, TYPE) (sw, hw, sw_name, as)) {
-goto err3;
+goto err2;
 }
 
 return sw;
 
-err3:
+err2:
 glue (audio_pcm_hw_del_sw_, TYPE) (sw);
 glue (audio_pcm_hw_gc_, TYPE) ();
-err2:
-g_free (sw);
 err1:
+g_free(sw);
 return NULL;
 }
 
-- 
2.39.2
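
The reason the NULL checks become unnecessary can be sketched in plain C. The helper below is a hypothetical stand-in, not a GLib function; it only mimics the behaviour the commit message relies on, assuming that the allocator aborts on failure or size overflow and returns NULL only for a zero-sized request.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for an aborting, zero-initialising array
 * allocator. It returns NULL only when n is 0; overflow of n * size
 * or allocation failure terminates the program instead of returning.
 */
static void *xnew0(size_t n, size_t size)
{
    void *p;

    assert(size != 0);
    if (n == 0) {
        return NULL;
    }
    if (n > SIZE_MAX / size) {
        fprintf(stderr, "xnew0: %zu * %zu overflows\n", n, size);
        abort();
    }
    p = calloc(n, size);
    if (!p) {
        fprintf(stderr, "xnew0: out of memory\n");
        abort();
    }
    return p;
}
```

A caller that guarantees n >= 1 can therefore use the result directly, which is the situation the patch establishes by rejecting negative sample counts before the allocation.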




[PULL 25/27] audio: handle leftover audio frame from upsampling

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Upsampling may leave one remaining audio frame in the input
buffer. The emulated audio playback devices are currently
responsible for writing this audio frame again in the next write
cycle. Push that task down to audio_pcm_sw_write.

This is another step towards an audio callback interface that
guarantees that when audio frontends are told they can write
n audio frames, they can actually do so.

Acked-by: Mark Cave-Ayland 
Acked-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-13-vr_q...@t-online.de>
---
 audio/audio_template.h |  6 ++
 audio/audio.c  | 34 --
 2 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/audio/audio_template.h b/audio/audio_template.h
index a0b653f52c..0d8aab6fad 100644
--- a/audio/audio_template.h
+++ b/audio/audio_template.h
@@ -138,6 +138,12 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW 
*sw)
 return -1;
 }
 
+/*
+ * Allocate one additional audio frame that is needed for upsampling
+ * if the resample buffer size is small. For large buffer sizes take
+ * care of overflows.
+ */
+samples = samples < INT_MAX ? samples + 1 : INT_MAX;
 sw->resample_buf.buffer = g_new0(st_sample, samples);
 sw->resample_buf.size = samples;
 sw->resample_buf.pos = 0;
diff --git a/audio/audio.c b/audio/audio.c
index dad17e59b8..4836ab8ca8 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -731,16 +731,21 @@ static size_t audio_pcm_sw_write(SWVoiceOut *sw, void 
*buf, size_t buf_len)
 hw_free = hw_free > live ? hw_free - live : 0;
 frames_out_max = MIN(dead, hw_free);
 sw_max = st_rate_frames_in(sw->rate, frames_out_max);
-fe_max = MIN(buf_len / sw->info.bytes_per_frame, sw->resample_buf.size);
+fe_max = MIN(buf_len / sw->info.bytes_per_frame + sw->resample_buf.pos,
+ sw->resample_buf.size);
 frames_in_max = MIN(sw_max, fe_max);
 
 if (!frames_in_max) {
 return 0;
 }
 
-sw->conv(sw->resample_buf.buffer, buf, frames_in_max);
-if (!sw->hw->pcm_ops->volume_out) {
-mixeng_volume(sw->resample_buf.buffer, frames_in_max, &sw->vol);
+if (frames_in_max > sw->resample_buf.pos) {
+sw->conv(sw->resample_buf.buffer + sw->resample_buf.pos,
+ buf, frames_in_max - sw->resample_buf.pos);
+if (!sw->hw->pcm_ops->volume_out) {
+mixeng_volume(sw->resample_buf.buffer + sw->resample_buf.pos,
+  frames_in_max - sw->resample_buf.pos, &sw->vol);
+}
 }
 
 audio_pcm_sw_resample_out(sw, frames_in_max, frames_out_max,
@@ -749,6 +754,22 @@ static size_t audio_pcm_sw_write(SWVoiceOut *sw, void 
*buf, size_t buf_len)
 sw->total_hw_samples_mixed += total_out;
 sw->empty = sw->total_hw_samples_mixed == 0;
 
+/*
+ * Upsampling may leave one audio frame in the resample buffer. Decrement
+ * total_in by one if there was a leftover frame from the previous resample
+ * pass in the resample buffer. Increment total_in by one if the current
+ * resample pass left one frame in the resample buffer.
+ */
+if (frames_in_max - total_in == 1) {
+/* copy one leftover audio frame to the beginning of the buffer */
+*sw->resample_buf.buffer = *(sw->resample_buf.buffer + total_in);
+total_in += 1 - sw->resample_buf.pos;
+sw->resample_buf.pos = 1;
+} else if (total_in >= sw->resample_buf.pos) {
+total_in -= sw->resample_buf.pos;
+sw->resample_buf.pos = 0;
+}
+
 #ifdef DEBUG_OUT
 dolog (
 "%s: write size %zu written %zu total mixed %zu\n",
@@ -1155,8 +1176,9 @@ static void audio_run_out (AudioState *s)
 } else {
 free = 0;
 }
-if (free > 0) {
-free = MIN(free, sw->resample_buf.size);
+if (free > sw->resample_buf.pos) {
+free = MIN(free, sw->resample_buf.size)
+   - sw->resample_buf.pos;
 sw->callback.fn(sw->callback.opaque,
 free * sw->info.bytes_per_frame);
 }
-- 
2.39.2




[PULL 23/27] audio: rename variables in audio_pcm_sw_read()

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

The audio_pcm_sw_read() function uses a few very unspecific
variable names. Rename them for better readability.

ret => total_out
total => total_in
size => buf_len
samples => frames_out_max

Acked-by: Mark Cave-Ayland 
Reviewed-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-11-vr_q...@t-online.de>
---
 audio/audio.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/audio/audio.c b/audio/audio.c
index 9e9c03a42e..22c36d6660 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -576,10 +576,10 @@ static void audio_pcm_sw_resample_in(SWVoiceIn *sw,
 }
 }
 
-static size_t audio_pcm_sw_read(SWVoiceIn *sw, void *buf, size_t size)
+static size_t audio_pcm_sw_read(SWVoiceIn *sw, void *buf, size_t buf_len)
 {
 HWVoiceIn *hw = sw->hw;
-size_t samples, live, ret, swlim, total;
+size_t live, frames_out_max, swlim, total_in, total_out;
 
 live = hw->total_samples_captured - sw->total_hw_samples_acquired;
 if (!live) {
@@ -590,20 +590,20 @@ static size_t audio_pcm_sw_read(SWVoiceIn *sw, void *buf, 
size_t size)
 return 0;
 }
 
-samples = size / sw->info.bytes_per_frame;
+frames_out_max = buf_len / sw->info.bytes_per_frame;
 
 swlim = (live * sw->ratio) >> 32;
-swlim = MIN (swlim, samples);
+swlim = MIN(swlim, frames_out_max);
 
-audio_pcm_sw_resample_in(sw, live, swlim, &total, &ret);
+audio_pcm_sw_resample_in(sw, live, swlim, &total_in, &total_out);
 
 if (!hw->pcm_ops->volume_in) {
-mixeng_volume(sw->resample_buf.buffer, ret, &sw->vol);
+mixeng_volume(sw->resample_buf.buffer, total_out, &sw->vol);
 }
+sw->clip(buf, sw->resample_buf.buffer, total_out);
 
-sw->clip(buf, sw->resample_buf.buffer, ret);
-sw->total_hw_samples_acquired += total;
-return ret * sw->info.bytes_per_frame;
+sw->total_hw_samples_acquired += total_in;
+return total_out * sw->info.bytes_per_frame;
 }
 
 /*
-- 
2.39.2




[PULL 24/27] audio: make recording packet length calculation exact

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Introduce the new function st_rate_frames_out() to calculate the
exact number of audio output frames the resampling code can
generate from a given number of audio input frames. When upsampling,
this function returns the maximum number of output frames.

This new function replaces the audio_frontend_frames_in()
function, which calculated the average number of output frames
rounded down to the nearest integer. The audio_frontend_frames_in()
function was additionally used to limit the number of output frames
to the resample buffer size. In audio_pcm_sw_read() the variable
resample_buf.size replaces the open coded audio_frontend_frames_in()
function. In audio_run_in() an additional MIN() function is
necessary.

After this patch the audio packet length calculation for audio
recording is exact.

Acked-by: Mark Cave-Ayland 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-12-vr_q...@t-online.de>
---
 audio/mixeng.h |  1 +
 audio/audio.c  | 29 -
 audio/mixeng.c | 41 +
 3 files changed, 50 insertions(+), 21 deletions(-)

diff --git a/audio/mixeng.h b/audio/mixeng.h
index 64c1e231cc..f9de7cffeb 100644
--- a/audio/mixeng.h
+++ b/audio/mixeng.h
@@ -52,6 +52,7 @@ void st_rate_flow(void *opaque, st_sample *ibuf, st_sample 
*obuf,
 void st_rate_flow_mix(void *opaque, st_sample *ibuf, st_sample *obuf,
   size_t *isamp, size_t *osamp);
 void st_rate_stop (void *opaque);
+uint32_t st_rate_frames_out(void *opaque, uint32_t frames_in);
 uint32_t st_rate_frames_in(void *opaque, uint32_t frames_out);
 void mixeng_clear (struct st_sample *buf, int len);
 void mixeng_volume (struct st_sample *buf, int len, struct mixeng_volume *vol);
diff --git a/audio/audio.c b/audio/audio.c
index 22c36d6660..dad17e59b8 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -579,7 +579,7 @@ static void audio_pcm_sw_resample_in(SWVoiceIn *sw,
 static size_t audio_pcm_sw_read(SWVoiceIn *sw, void *buf, size_t buf_len)
 {
 HWVoiceIn *hw = sw->hw;
-size_t live, frames_out_max, swlim, total_in, total_out;
+size_t live, frames_out_max, total_in, total_out;
 
 live = hw->total_samples_captured - sw->total_hw_samples_acquired;
 if (!live) {
@@ -590,12 +590,10 @@ static size_t audio_pcm_sw_read(SWVoiceIn *sw, void *buf, 
size_t buf_len)
 return 0;
 }
 
-frames_out_max = buf_len / sw->info.bytes_per_frame;
+frames_out_max = MIN(buf_len / sw->info.bytes_per_frame,
+ sw->resample_buf.size);
 
-swlim = (live * sw->ratio) >> 32;
-swlim = MIN(swlim, frames_out_max);
-
-audio_pcm_sw_resample_in(sw, live, swlim, &total_in, &total_out);
+audio_pcm_sw_resample_in(sw, live, frames_out_max, &total_in, &total_out);
 
 if (!hw->pcm_ops->volume_in) {
 mixeng_volume(sw->resample_buf.buffer, total_out, &sw->vol);
@@ -979,18 +977,6 @@ void AUD_set_active_in (SWVoiceIn *sw, int on)
 }
 }
 
-/**
- * audio_frontend_frames_in() - returns the number of frames the resampling
- * code generates from frames_in frames
- *
- * @sw: audio recording frontend
- * @frames_in: number of frames
- */
-static size_t audio_frontend_frames_in(SWVoiceIn *sw, size_t frames_in)
-{
-return (int64_t)frames_in * sw->ratio >> 32;
-}
-
 static size_t audio_get_avail (SWVoiceIn *sw)
 {
 size_t live;
@@ -1007,9 +993,9 @@ static size_t audio_get_avail (SWVoiceIn *sw)
 }
 
 ldebug (
-"%s: get_avail live %zu frontend frames %zu\n",
+"%s: get_avail live %zu frontend frames %u\n",
 SW_NAME (sw),
-live, audio_frontend_frames_in(sw, live)
+live, st_rate_frames_out(sw->rate, live)
 );
 
 return live;
@@ -1314,8 +1300,9 @@ static void audio_run_in (AudioState *s)
 size_t sw_avail = audio_get_avail(sw);
 size_t avail;
 
-avail = audio_frontend_frames_in(sw, sw_avail);
+avail = st_rate_frames_out(sw->rate, sw_avail);
 if (avail > 0) {
+avail = MIN(avail, sw->resample_buf.size);
 sw->callback.fn(sw->callback.opaque,
 avail * sw->info.bytes_per_frame);
 }
diff --git a/audio/mixeng.c b/audio/mixeng.c
index a24c8c45a7..69f6549224 100644
--- a/audio/mixeng.c
+++ b/audio/mixeng.c
@@ -440,6 +440,47 @@ void st_rate_stop (void *opaque)
 g_free (opaque);
 }
 
+/**
+ * st_rate_frames_out() - returns the number of frames the resampling code
+ * generates from frames_in frames
+ *
+ * @opaque: pointer to struct rate
+ * @frames_in: number of frames
+ *
+ * When upsampling, there may be more than one correct result. In this case,
+ * the function returns the maximum number of output frames the resampling
+ * code can generate.
+ */
+uint32_t st_rate_frames_out(void *opaque, uint32_t frames_in)
+{
+struct rate *rate = opaque;
+uint64_t opos_end, opos_delta;
+uint32_t ipos_end;
+uint32_t 
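
The "average number of output frames rounded down" computed by the removed audio_frontend_frames_in() helper can be reproduced with a short sketch. The shift-and-multiply is taken from the diff above; how the 32.32 fixed-point ratio is derived from the two sample rates is an assumption here, and rate_ratio() and avg_frames_out() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed construction of the 32.32 fixed-point ratio dst_freq / src_freq.
 * The division truncates, so the ratio is rounded down.
 */
static int64_t rate_ratio(uint32_t src_freq, uint32_t dst_freq)
{
    assert(src_freq != 0);
    return (int64_t)(((uint64_t)dst_freq << 32) / src_freq);
}

/* Same arithmetic as the removed helper: frames_in * ratio >> 32. */
static uint64_t avg_frames_out(uint64_t frames_in, int64_t ratio)
{
    return (uint64_t)(((int64_t)frames_in * ratio) >> 32);
}
```

For 441 input frames at 44100 Hz resampled to 48000 Hz, the exact quotient is 480, but the truncated ratio makes this estimate come out as 479. That kind of rounding is what the exact calculation in this patch is meant to avoid.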

[PULL 16/27] audio: replace the resampling loop in audio_pcm_sw_write()

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Replace the resampling loop in audio_pcm_sw_write() with the new
function audio_pcm_sw_resample_out(). Unlike the old resample
loop the new function will try to consume input frames even if
the output buffer is full. This is necessary when downsampling
to avoid reading fewer audio frames than calculated in advance.
The loop was unrolled to avoid complicated loop control conditions
in this case.

Acked-by: Mark Cave-Ayland 
Acked-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-4-vr_q...@t-online.de>
---
 audio/audio.c | 63 +--
 1 file changed, 36 insertions(+), 27 deletions(-)

diff --git a/audio/audio.c b/audio/audio.c
index a399147486..4412b5fad8 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -673,11 +673,44 @@ static void audio_pcm_hw_clip_out(HWVoiceOut *hw, void 
*pcm_buf, size_t len)
 /*
  * Soft voice (playback)
  */
+static void audio_pcm_sw_resample_out(SWVoiceOut *sw,
+size_t frames_in_max, size_t frames_out_max,
+size_t *total_in, size_t *total_out)
+{
+HWVoiceOut *hw = sw->hw;
+struct st_sample *src, *dst;
+size_t live, wpos, frames_in, frames_out;
+
+live = sw->total_hw_samples_mixed;
+wpos = (hw->mix_buf.pos + live) % hw->mix_buf.size;
+
+/* write to mix_buf from wpos to end of buffer */
+src = sw->resample_buf.buffer;
+frames_in = frames_in_max;
+dst = hw->mix_buf.buffer + wpos;
+frames_out = MIN(frames_out_max, hw->mix_buf.size - wpos);
+st_rate_flow_mix(sw->rate, src, dst, &frames_in, &frames_out);
+wpos += frames_out;
+*total_in = frames_in;
+*total_out = frames_out;
+
+/* write to mix_buf from start of buffer if there are input frames left */
+if (frames_in_max - frames_in > 0 && wpos == hw->mix_buf.size) {
+src += frames_in;
+frames_in = frames_in_max - frames_in;
+dst = hw->mix_buf.buffer;
+frames_out = frames_out_max - frames_out;
+st_rate_flow_mix(sw->rate, src, dst, &frames_in, &frames_out);
+*total_in += frames_in;
+*total_out += frames_out;
+}
+}
+
 static size_t audio_pcm_sw_write(SWVoiceOut *sw, void *buf, size_t size)
 {
-size_t hwsamples, samples, isamp, osamp, wpos, live, dead, left, blck;
+size_t hwsamples, samples, live, dead;
 size_t hw_free;
-size_t ret = 0, pos = 0, total = 0;
+size_t ret, total;
 
 if (!sw) {
 return size;
@@ -698,8 +731,6 @@ static size_t audio_pcm_sw_write(SWVoiceOut *sw, void *buf, 
size_t size)
 return 0;
 }
 
-wpos = (sw->hw->mix_buf.pos + live) % hwsamples;
-
 dead = hwsamples - live;
 hw_free = audio_pcm_hw_get_free(sw->hw);
 hw_free = hw_free > live ? hw_free - live : 0;
@@ -713,29 +744,7 @@ static size_t audio_pcm_sw_write(SWVoiceOut *sw, void 
*buf, size_t size)
 }
 }
 
-while (samples) {
-dead = hwsamples - live;
-left = hwsamples - wpos;
-blck = MIN (dead, left);
-if (!blck) {
-break;
-}
-isamp = samples;
-osamp = blck;
-st_rate_flow_mix (
-sw->rate,
-sw->resample_buf.buffer + pos,
-sw->hw->mix_buf.buffer + wpos,
-&isamp,
-&osamp
-);
-ret += isamp;
-samples -= isamp;
-pos += isamp;
-live += osamp;
-wpos = (wpos + osamp) % hwsamples;
-total += osamp;
-}
+audio_pcm_sw_resample_out(sw, samples, MIN(dead, hw_free), &ret, &total);
 
 sw->total_hw_samples_mixed += total;
 sw->empty = sw->total_hw_samples_mixed == 0;
-- 
2.39.2
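
The unrolled two-segment write into a ring buffer can be sketched on its own. This is a simplified model that copies frames 1:1 instead of resampling them; struct ring and ring_write() are hypothetical names and do not correspond to QEMU structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ring buffer: "pos" is the read position, "live" the fill level. */
struct ring {
    int *buf;
    size_t size;
    size_t pos;
    size_t live;
};

/* Write up to n items in at most two segments; return the number written. */
static size_t ring_write(struct ring *r, const int *src, size_t n)
{
    size_t free_items, wpos, first, second;

    assert(r->size > 0 && r->pos < r->size && r->live <= r->size);
    free_items = r->size - r->live;
    if (n > free_items) {
        n = free_items;
    }
    wpos = (r->pos + r->live) % r->size;

    /* segment 1: from wpos to the end of the buffer */
    first = r->size - wpos;
    if (first > n) {
        first = n;
    }
    memcpy(r->buf + wpos, src, first * sizeof(*src));

    /* segment 2: from the start of the buffer if items are left */
    second = n - first;
    if (second > 0) {
        memcpy(r->buf, src + first, second * sizeof(*src));
    }

    r->live += n;
    return n;
}
```

Because a write can wrap at most once, two straight-line segments cover every case, which avoids a loop with more involved termination conditions.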




[PULL 10/27] audio: remove audio_calloc() function

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Now that the last call site of audio_calloc() was removed, remove
the unused audio_calloc() function.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Volker Rümelin 
Reviewed-by: Marc-André Lureau 
Message-Id: <20230121094735.11644-9-vr_q...@t-online.de>
---
 audio/audio_int.h |  1 -
 audio/audio.c | 20 
 2 files changed, 21 deletions(-)

diff --git a/audio/audio_int.h b/audio/audio_int.h
index ce2d6bf92c..5028f2354a 100644
--- a/audio/audio_int.h
+++ b/audio/audio_int.h
@@ -251,7 +251,6 @@ void audio_pcm_init_info (struct audio_pcm_info *info, 
struct audsettings *as);
 void audio_pcm_info_clear_buf (struct audio_pcm_info *info, void *buf, int 
len);
 
 int audio_bug (const char *funcname, int cond);
-void *audio_calloc (const char *funcname, int nmemb, size_t size);
 
 void audio_run(AudioState *s, const char *msg);
 
diff --git a/audio/audio.c b/audio/audio.c
index 012d10996b..772c3cc320 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -149,26 +149,6 @@ static inline int audio_bits_to_index (int bits)
 }
 }
 
-void *audio_calloc (const char *funcname, int nmemb, size_t size)
-{
-int cond;
-size_t len;
-
-len = nmemb * size;
-cond = !nmemb || !size;
-cond |= nmemb < 0;
-cond |= len < size;
-
-if (audio_bug ("audio_calloc", cond)) {
-AUD_log (NULL, "%s passed invalid arguments to audio_calloc\n",
- funcname);
-AUD_log (NULL, "nmemb=%d size=%zu (len=%zu)\n", nmemb, size, len);
-return NULL;
-}
-
-return g_malloc0 (len);
-}
-
 void AUD_vlog (const char *cap, const char *fmt, va_list ap)
 {
 if (cap) {
-- 
2.39.2




[PULL 22/27] audio: replace the resampling loop in audio_pcm_sw_read()

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Replace the resampling loop in audio_pcm_sw_read() with the new
function audio_pcm_sw_resample_in(). Unlike the old resample
loop the new function will try to consume input frames even if
the output buffer is full. This is necessary when downsampling
to avoid reading fewer audio frames than calculated in advance.
The loop was unrolled to avoid complicated loop control conditions
in this case.
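
The unrolled shape described above, one pass from the read position to
the end of the ring buffer and an optional second pass from the start,
can be sketched outside QEMU as follows. This is a simplified model and
not the patch itself: ring_read_two_segments() and its parameters are
hypothetical names, and a plain memcpy() stands in for st_rate_flow().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy n elements out of a ring buffer of `size` elements, starting at
 * read position rpos. Instead of looping, handle the range in at most
 * two straight segments: rpos..end of buffer, then start..remainder.
 * Returns the number of elements copied.
 */
static size_t ring_read_two_segments(const int *ring, size_t size,
                                     size_t rpos, int *dst, size_t n)
{
    size_t first, second;

    assert(rpos < size && n <= size);

    /* first segment: from rpos to the end of the ring */
    first = n < size - rpos ? n : size - rpos;
    memcpy(dst, ring + rpos, first * sizeof(*ring));

    /* second segment: wrap to the start if elements are left */
    second = n - first;
    if (second) {
        memcpy(dst + first, ring, second * sizeof(*ring));
    }
    return first + second;
}
```

Because a contiguous range in a ring buffer can wrap at most once, two
segments are always enough, which is why no loop condition is needed.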

Acked-by: Mark Cave-Ayland 
Acked-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-10-vr_q...@t-online.de>
---
 audio/audio.c | 59 ++-
 1 file changed, 35 insertions(+), 24 deletions(-)

diff --git a/audio/audio.c b/audio/audio.c
index e18b5e98c5..9e9c03a42e 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -543,11 +543,43 @@ static size_t audio_pcm_hw_conv_in(HWVoiceIn *hw, void 
*pcm_buf, size_t samples)
 /*
  * Soft voice (capture)
  */
+static void audio_pcm_sw_resample_in(SWVoiceIn *sw,
+size_t frames_in_max, size_t frames_out_max,
+size_t *total_in, size_t *total_out)
+{
+HWVoiceIn *hw = sw->hw;
+struct st_sample *src, *dst;
+size_t live, rpos, frames_in, frames_out;
+
+live = hw->total_samples_captured - sw->total_hw_samples_acquired;
+rpos = audio_ring_posb(hw->conv_buf.pos, live, hw->conv_buf.size);
+
+/* resample conv_buf from rpos to end of buffer */
+src = hw->conv_buf.buffer + rpos;
+frames_in = MIN(frames_in_max, hw->conv_buf.size - rpos);
+dst = sw->resample_buf.buffer;
+frames_out = frames_out_max;
+st_rate_flow(sw->rate, src, dst, &frames_in, &frames_out);
+rpos += frames_in;
+*total_in = frames_in;
+*total_out = frames_out;
+
+/* resample conv_buf from start of buffer if there are input frames left */
+if (frames_in_max - frames_in && rpos == hw->conv_buf.size) {
+src = hw->conv_buf.buffer;
+frames_in = frames_in_max - frames_in;
+dst += frames_out;
+frames_out = frames_out_max - frames_out;
+st_rate_flow(sw->rate, src, dst, &frames_in, &frames_out);
+*total_in += frames_in;
+*total_out += frames_out;
+}
+}
+
 static size_t audio_pcm_sw_read(SWVoiceIn *sw, void *buf, size_t size)
 {
 HWVoiceIn *hw = sw->hw;
-size_t samples, live, ret = 0, swlim, isamp, osamp, rpos, total = 0;
-struct st_sample *src, *dst = sw->resample_buf.buffer;
+size_t samples, live, ret, swlim, total;
 
 live = hw->total_samples_captured - sw->total_hw_samples_acquired;
 if (!live) {
@@ -558,33 +590,12 @@ static size_t audio_pcm_sw_read(SWVoiceIn *sw, void *buf, 
size_t size)
 return 0;
 }
 
-rpos = audio_ring_posb(hw->conv_buf.pos, live, hw->conv_buf.size);
-
 samples = size / sw->info.bytes_per_frame;
 
 swlim = (live * sw->ratio) >> 32;
 swlim = MIN (swlim, samples);
 
-while (swlim) {
-src = hw->conv_buf.buffer + rpos;
-if (hw->conv_buf.pos > rpos) {
-isamp = hw->conv_buf.pos - rpos;
-} else {
-isamp = hw->conv_buf.size - rpos;
-}
-
-if (!isamp) {
-break;
-}
-osamp = swlim;
-
-st_rate_flow (sw->rate, src, dst, &isamp, &osamp);
-swlim -= osamp;
-rpos = (rpos + isamp) % hw->conv_buf.size;
-dst += osamp;
-ret += osamp;
-total += isamp;
-}
+audio_pcm_sw_resample_in(sw, live, swlim, &total, &ret);
 
 if (!hw->pcm_ops->volume_in) {
 mixeng_volume(sw->resample_buf.buffer, ret, &sw->vol);
-- 
2.39.2




[PULL 26/27] audio/audio_template: substitute sw->hw with hw

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Substitute sw->hw with hw in the audio_pcm_sw_alloc_resources_*
functions.

Acked-by: Mark Cave-Ayland 
Reviewed-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-14-vr_q...@t-online.de>
---
 audio/audio_template.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/audio/audio_template.h b/audio/audio_template.h
index 0d8aab6fad..7e116426c7 100644
--- a/audio/audio_template.h
+++ b/audio/audio_template.h
@@ -107,6 +107,7 @@ static void glue (audio_pcm_sw_free_resources_, TYPE) (SW 
*sw)
 
 static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW *sw)
 {
+HW *hw = sw->hw;
 int samples;
 
 if (!glue(audio_get_pdo_, TYPE)(sw->s->dev)->mixing_engine) {
@@ -125,7 +126,6 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW 
*sw)
 }
 
 if (samples == 0) {
-HW *hw = sw->hw;
 size_t f_fe_min;
 
 /* f_fe_min = ceil(1 [frames] * f_be [Hz] / size_be [frames]) */
@@ -149,9 +149,9 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW 
*sw)
 sw->resample_buf.pos = 0;
 
 #ifdef DAC
-sw->rate = st_rate_start (sw->info.freq, sw->hw->info.freq);
+sw->rate = st_rate_start(sw->info.freq, hw->info.freq);
 #else
-sw->rate = st_rate_start (sw->hw->info.freq, sw->info.freq);
+sw->rate = st_rate_start(hw->info.freq, sw->info.freq);
 #endif
 
 return 0;
-- 
2.39.2




[PULL 27/27] audio: remove sw->ratio

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Simplify the resample buffer size calculation.

For audio playback we have
sw->ratio = ((int64_t)sw->hw->info.freq << 32) / sw->info.freq;
samples = ((int64_t)sw->HWBUF.size << 32) / sw->ratio;

This can be simplified to
samples = muldiv64(sw->HWBUF.size, sw->info.freq, sw->hw->info.freq);

For audio recording we have
sw->ratio = ((int64_t)sw->info.freq << 32) / sw->hw->info.freq;
samples = (int64_t)sw->HWBUF.size * sw->ratio >> 32;

This can be simplified to
samples = muldiv64(sw->HWBUF.size, sw->info.freq, sw->hw->info.freq);

With hw = sw->hw this becomes in both cases
samples = muldiv64(HWBUF.size, sw->info.freq, hw->info.freq);

Now that sw->ratio is no longer needed, remove sw->ratio.
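
As a rough illustration of the simplification above, the following
standalone sketch puts the old ratio-based expressions next to the
direct form. It is not QEMU code: muldiv64_sketch() is a hypothetical
stand-in for muldiv64() that assumes the product fits in 64 bits, and
the function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for muldiv64(); assumes a * b fits in 64 bits. */
static uint64_t muldiv64_sketch(uint64_t a, uint32_t b, uint32_t c)
{
    assert(c != 0);
    return a * b / c;
}

/* Playback sizing before the patch, via the 32.32 fixed-point ratio. */
static uint64_t samples_out_via_ratio(uint64_t hw_size, uint32_t sw_freq,
                                      uint32_t hw_freq)
{
    int64_t ratio = ((int64_t)hw_freq << 32) / sw_freq;

    return ((int64_t)hw_size << 32) / ratio;
}

/* Recording sizing before the patch, via the 32.32 fixed-point ratio. */
static uint64_t samples_in_via_ratio(uint64_t hw_size, uint32_t sw_freq,
                                     uint32_t hw_freq)
{
    int64_t ratio = ((int64_t)sw_freq << 32) / hw_freq;

    return (int64_t)hw_size * ratio >> 32;
}

/* Sizing after the patch: the same expression for both directions. */
static uint64_t samples_direct(uint64_t hw_size, uint32_t sw_freq,
                               uint32_t hw_freq)
{
    return muldiv64_sketch(hw_size, sw_freq, hw_freq);
}
```

For a 1024 frame buffer with a 44100 Hz frontend and a 48000 Hz backend
both playback forms yield 940 frames. Since the stored ratio is itself
rounded down, the ratio-based forms may differ from the direct form by
one frame in edge cases.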

Acked-by: Mark Cave-Ayland 
Reviewed-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-15-vr_q...@t-online.de>
---
 audio/audio_int.h  |  2 --
 audio/audio_template.h | 30 +-
 audio/audio.c  |  1 -
 3 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/audio/audio_int.h b/audio/audio_int.h
index 8b163e1759..d51d63f08d 100644
--- a/audio/audio_int.h
+++ b/audio/audio_int.h
@@ -108,7 +108,6 @@ struct SWVoiceOut {
 AudioState *s;
 struct audio_pcm_info info;
 t_sample *conv;
-int64_t ratio;
 STSampleBuffer resample_buf;
 void *rate;
 size_t total_hw_samples_mixed;
@@ -126,7 +125,6 @@ struct SWVoiceIn {
 AudioState *s;
 int active;
 struct audio_pcm_info info;
-int64_t ratio;
 void *rate;
 size_t total_hw_samples_acquired;
 STSampleBuffer resample_buf;
diff --git a/audio/audio_template.h b/audio/audio_template.h
index 7e116426c7..e42326c20d 100644
--- a/audio/audio_template.h
+++ b/audio/audio_template.h
@@ -108,32 +108,23 @@ static void glue (audio_pcm_sw_free_resources_, TYPE) (SW 
*sw)
 static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW *sw)
 {
 HW *hw = sw->hw;
-int samples;
+uint64_t samples;
 
 if (!glue(audio_get_pdo_, TYPE)(sw->s->dev)->mixing_engine) {
 return 0;
 }
 
-#ifdef DAC
-samples = ((int64_t)sw->HWBUF.size << 32) / sw->ratio;
-#else
-samples = (int64_t)sw->HWBUF.size * sw->ratio >> 32;
-#endif
-if (audio_bug(__func__, samples < 0)) {
-dolog("Can not allocate buffer for `%s' (%d samples)\n",
-  SW_NAME(sw), samples);
-return -1;
-}
-
+samples = muldiv64(HWBUF.size, sw->info.freq, hw->info.freq);
 if (samples == 0) {
-size_t f_fe_min;
+uint64_t f_fe_min;
+uint64_t f_be = (uint32_t)hw->info.freq;
 
 /* f_fe_min = ceil(1 [frames] * f_be [Hz] / size_be [frames]) */
-f_fe_min = (hw->info.freq + HWBUF.size - 1) / HWBUF.size;
+f_fe_min = (f_be + HWBUF.size - 1) / HWBUF.size;
 qemu_log_mask(LOG_UNIMP,
   AUDIO_CAP ": The guest selected a " NAME " sample rate"
-  " of %d Hz for %s. Only sample rates >= %zu Hz are"
-  " supported.\n",
+  " of %d Hz for %s. Only sample rates >= %" PRIu64 " Hz"
+  " are supported.\n",
   sw->info.freq, sw->name, f_fe_min);
 return -1;
 }
@@ -141,9 +132,9 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW 
*sw)
 /*
  * Allocate one additional audio frame that is needed for upsampling
  * if the resample buffer size is small. For large buffer sizes take
- * care of overflows.
+ * care of overflows and truncation.
  */
-samples = samples < INT_MAX ? samples + 1 : INT_MAX;
+samples = samples < SIZE_MAX ? samples + 1 : SIZE_MAX;
 sw->resample_buf.buffer = g_new0(st_sample, samples);
 sw->resample_buf.size = samples;
 sw->resample_buf.pos = 0;
@@ -170,11 +161,8 @@ static int glue (audio_pcm_sw_init_, TYPE) (
 sw->hw = hw;
 sw->active = 0;
 #ifdef DAC
-sw->ratio = ((int64_t) sw->hw->info.freq << 32) / sw->info.freq;
 sw->total_hw_samples_mixed = 0;
 sw->empty = 1;
-#else
-sw->ratio = ((int64_t) sw->info.freq << 32) / sw->hw->info.freq;
 #endif
 
 if (sw->info.is_float) {
diff --git a/audio/audio.c b/audio/audio.c
index 4836ab8ca8..70b096713c 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -478,7 +478,6 @@ static int audio_attach_capture (HWVoiceOut *hw)
 sw->info = hw->info;
 sw->empty = 1;
 sw->active = hw->enabled;
-sw->ratio = ((int64_t) hw_cap->info.freq << 32) / sw->info.freq;
 sw->vol = nominal_volume;
 sw->rate = st_rate_start (sw->info.freq, hw_cap->info.freq);
 QLIST_INSERT_HEAD (&hw_cap->sw_head, sw, entries);
-- 
2.39.2




[PULL 13/27] audio: change type of mix_buf and conv_buf

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Change the type of mix_buf in struct HWVoiceOut and conv_buf
in struct HWVoiceIn from STSampleBuffer * to STSampleBuffer.
However, a buffer pointer is still needed. For this reason in
struct STSampleBuffer samples[] is changed to *buffer.

This is a preparation for the next patch. The next patch will
add this line, which is not possible with the current struct
STSampleBuffer definition.

+sw->resample_buf.buffer = hw->mix_buf.buffer + rpos2;

There are no functional changes.
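
Why the pointer member is needed can be shown with a reduced sketch.
The types below are simplified stand-ins and not the QEMU definitions;
SampleBufSketch and sample_buf_view() are hypothetical names. With a
flexible array member (samples[]) the assignment in sample_buf_view()
would not compile, because arrays are not assignable in C.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for st_sample; only the shape matters here. */
typedef struct st_sample_sketch {
    long l, r;
} st_sample_sketch;

typedef struct SampleBufSketch {
    size_t pos, size;
    st_sample_sketch *buffer;   /* a pointer, so it can be re-pointed */
} SampleBufSketch;

/* Let `view` alias the frames of `owner` starting at frame rpos. */
static void sample_buf_view(SampleBufSketch *view,
                            const SampleBufSketch *owner, size_t rpos)
{
    assert(rpos <= owner->size);
    view->buffer = owner->buffer + rpos;
    view->size = owner->size - rpos;
    view->pos = 0;
}
```

The view shares storage with its owner, so no frames are copied and
only the owner's buffer must be freed.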

Acked-by: Mark Cave-Ayland 
Reviewed-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-1-vr_q...@t-online.de>
---
 audio/audio_int.h  |   6 +--
 audio/audio_template.h |  19 
 audio/audio.c  | 106 -
 3 files changed, 67 insertions(+), 64 deletions(-)

diff --git a/audio/audio_int.h b/audio/audio_int.h
index 5028f2354a..061845dcc2 100644
--- a/audio/audio_int.h
+++ b/audio/audio_int.h
@@ -58,7 +58,7 @@ typedef struct SWVoiceCap SWVoiceCap;
 
 typedef struct STSampleBuffer {
 size_t pos, size;
-st_sample samples[];
+st_sample *buffer;
 } STSampleBuffer;
 
 typedef struct HWVoiceOut {
@@ -71,7 +71,7 @@ typedef struct HWVoiceOut {
 f_sample *clip;
 uint64_t ts_helper;
 
-STSampleBuffer *mix_buf;
+STSampleBuffer mix_buf;
 void *buf_emul;
 size_t pos_emul, pending_emul, size_emul;
 
@@ -93,7 +93,7 @@ typedef struct HWVoiceIn {
 size_t total_samples_captured;
 uint64_t ts_helper;
 
-STSampleBuffer *conv_buf;
+STSampleBuffer conv_buf;
 void *buf_emul;
 size_t pos_emul, pending_emul, size_emul;
 
diff --git a/audio/audio_template.h b/audio/audio_template.h
index 980e1f4bd0..dd87170cbd 100644
--- a/audio/audio_template.h
+++ b/audio/audio_template.h
@@ -71,8 +71,9 @@ static void glue(audio_init_nb_voices_, TYPE)(AudioState *s,
 static void glue (audio_pcm_hw_free_resources_, TYPE) (HW *hw)
 {
 g_free(hw->buf_emul);
-g_free (HWBUF);
-HWBUF = NULL;
+g_free(HWBUF.buffer);
+HWBUF.buffer = NULL;
+HWBUF.size = 0;
 }
 
 static void glue(audio_pcm_hw_alloc_resources_, TYPE)(HW *hw)
@@ -83,10 +84,12 @@ static void glue(audio_pcm_hw_alloc_resources_, TYPE)(HW 
*hw)
 dolog("Attempted to allocate empty buffer\n");
 }
 
-HWBUF = g_malloc0(sizeof(STSampleBuffer) + sizeof(st_sample) * 
samples);
-HWBUF->size = samples;
+HWBUF.buffer = g_new0(st_sample, samples);
+HWBUF.size = samples;
+HWBUF.pos = 0;
 } else {
-HWBUF = NULL;
+HWBUF.buffer = NULL;
+HWBUF.size = 0;
 }
 }
 
@@ -111,9 +114,9 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW 
*sw)
 }
 
 #ifdef DAC
-samples = ((int64_t) sw->HWBUF->size << 32) / sw->ratio;
+samples = ((int64_t)sw->HWBUF.size << 32) / sw->ratio;
 #else
-samples = (int64_t)sw->HWBUF->size * sw->ratio >> 32;
+samples = (int64_t)sw->HWBUF.size * sw->ratio >> 32;
 #endif
 if (audio_bug(__func__, samples < 0)) {
 dolog("Can not allocate buffer for `%s' (%d samples)\n",
@@ -126,7 +129,7 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW 
*sw)
 size_t f_fe_min;
 
 /* f_fe_min = ceil(1 [frames] * f_be [Hz] / size_be [frames]) */
-f_fe_min = (hw->info.freq + HWBUF->size - 1) / HWBUF->size;
+f_fe_min = (hw->info.freq + HWBUF.size - 1) / HWBUF.size;
 qemu_log_mask(LOG_UNIMP,
   AUDIO_CAP ": The guest selected a " NAME " sample rate"
   " of %d Hz for %s. Only sample rates >= %zu Hz are"
diff --git a/audio/audio.c b/audio/audio.c
index 772c3cc320..a0b54e4a2e 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -523,8 +523,8 @@ static size_t audio_pcm_hw_find_min_in (HWVoiceIn *hw)
 static size_t audio_pcm_hw_get_live_in(HWVoiceIn *hw)
 {
 size_t live = hw->total_samples_captured - audio_pcm_hw_find_min_in (hw);
-if (audio_bug(__func__, live > hw->conv_buf->size)) {
-dolog("live=%zu hw->conv_buf->size=%zu\n", live, hw->conv_buf->size);
+if (audio_bug(__func__, live > hw->conv_buf.size)) {
+dolog("live=%zu hw->conv_buf.size=%zu\n", live, hw->conv_buf.size);
 return 0;
 }
 return live;
@@ -533,13 +533,13 @@ static size_t audio_pcm_hw_get_live_in(HWVoiceIn *hw)
 static size_t audio_pcm_hw_conv_in(HWVoiceIn *hw, void *pcm_buf, size_t 
samples)
 {
 size_t conv = 0;
-STSampleBuffer *conv_buf = hw->conv_buf;
+STSampleBuffer *conv_buf = &hw->conv_buf;
 
 while (samples) {
 uint8_t *src = advance(pcm_buf, conv * hw->info.bytes_per_frame);
 size_t proc = MIN(samples, conv_buf->size - conv_buf->pos);
 
-hw->conv(conv_buf->samples + conv_buf->pos, src, proc);
+hw->conv(conv_buf->buffer + conv_buf->pos, src, proc);
 conv_buf->pos = (conv_buf->pos + proc) % conv_buf->size;
 samples -= proc;
 conv += proc;
@@ -561,12 +561,12 

[PULL 11/27] alsaaudio: change default playback settings

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

The currently used default playback settings in the ALSA audio
backend are a bit unfortunate. With a few emulated audio devices,
audio playback does not work properly. Here is a short part of
the debug log while audio is playing (elapsed time in seconds).

audio: Elapsed since last alsa run (running): 0.046244
audio: Elapsed since last alsa run (running): 0.023137
audio: Elapsed since last alsa run (running): 0.023170
audio: Elapsed since last alsa run (running): 0.023650
audio: Elapsed since last alsa run (running): 0.060802
audio: Elapsed since last alsa run (running): 0.031931

For some audio devices the time of more than 23ms between updates
is too long.

Set the period time to 5.8ms so that the maximum time between
two updates typically does not exceed 11ms. This roughly matches
the 10ms period time when doing playback with the audio timer.
After this patch the debug log looks like this.

audio: Elapsed since last alsa run (running): 0.011919
audio: Elapsed since last alsa run (running): 0.005788
audio: Elapsed since last alsa run (running): 0.005995
audio: Elapsed since last alsa run (running): 0.011069
audio: Elapsed since last alsa run (running): 0.005901
audio: Elapsed since last alsa run (running): 0.006084
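
The new defaults in this patch, 5805 and 92880, are consistent with
256 and 4096 frames at 44100 Hz expressed in microseconds. A small
sketch of that arithmetic follows; frames_to_usecs() is a hypothetical
helper written for this example and not part of the ALSA backend.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Duration of `frames` audio frames at `freq_hz`, in microseconds,
 * rounded to the nearest integer.
 */
static uint32_t frames_to_usecs(uint32_t frames, uint32_t freq_hz)
{
    assert(freq_hz != 0);
    return (uint32_t)(((uint64_t)frames * 1000000 + freq_hz / 2) / freq_hz);
}
```

With this helper, 256 frames give 5805 us and 4096 frames give 92880 us,
while the 1024 frames mentioned in the removed comment give 23220 us,
which lines up with the roughly 23ms updates in the first log.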

Acked-by: Christian Schoenebeck 
Signed-off-by: Volker Rümelin 
Reviewed-by: Marc-André Lureau 
Message-Id: <20230121094735.11644-10-vr_q...@t-online.de>
---
 audio/alsaaudio.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/audio/alsaaudio.c b/audio/alsaaudio.c
index 5f50dfa0bf..0cc982e61f 100644
--- a/audio/alsaaudio.c
+++ b/audio/alsaaudio.c
@@ -913,17 +913,14 @@ static void *alsa_audio_init(Audiodev *dev)
 alsa_init_per_direction(aopts->in);
 alsa_init_per_direction(aopts->out);
 
-/*
- * need to define them, as otherwise alsa produces no sound
- * doesn't set has_* so alsa_open can identify it wasn't set by the user
- */
+/* don't set has_* so alsa_open can identify it wasn't set by the user */
 if (!dev->u.alsa.out->has_period_length) {
-/* 1024 frames assuming 44100Hz */
-dev->u.alsa.out->period_length = 1024 * 100 / 44100;
+/* 256 frames assuming 44100Hz */
+dev->u.alsa.out->period_length = 5805;
 }
 if (!dev->u.alsa.out->has_buffer_length) {
 /* 4096 frames assuming 44100Hz */
-dev->u.alsa.out->buffer_length = 4096ll * 100 / 44100;
+dev->u.alsa.out->buffer_length = 92880;
 }
 
 /*
-- 
2.39.2




[PULL 07/27] audio/alsaaudio: use g_new0() instead of audio_calloc()

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Replace audio_calloc() with the equivalent g_new0().

The value of the g_new0() argument count is >= 1, which means
g_new0() will never return NULL. Also remove the unnecessary
NULL check.

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Volker Rümelin 
Reviewed-by: Marc-André Lureau 
Message-Id: <20230121094735.11644-6-vr_q...@t-online.de>
---
 audio/alsaaudio.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/audio/alsaaudio.c b/audio/alsaaudio.c
index 714bfb6453..5f50dfa0bf 100644
--- a/audio/alsaaudio.c
+++ b/audio/alsaaudio.c
@@ -222,11 +222,7 @@ static int alsa_poll_helper (snd_pcm_t *handle, struct 
pollhlp *hlp, int mask)
 return -1;
 }
 
-pfds = audio_calloc ("alsa_poll_helper", count, sizeof (*pfds));
-if (!pfds) {
-dolog ("Could not initialize poll mode\n");
-return -1;
-}
+pfds = g_new0(struct pollfd, count);
 
 err = snd_pcm_poll_descriptors (handle, pfds, count);
 if (err < 0) {
-- 
2.39.2




[PULL 21/27] audio: make playback packet length calculation exact

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Introduce the new function st_rate_frames_in() to calculate the
exact number of audio input frames needed to get a given number
of audio output frames. The exact number of frames depends only
on the difference of opos - ipos and the number of output frames.
When downsampling, this function returns the maximum number of
input frames needed.

This new function replaces the audio_frontend_frames_out() function,
which calculated the average number of input frames rounded down
to the nearest integer. Because audio_frontend_frames_out() also
limited the number of input frames to the size of the resample
buffer, st_rate_frames_in() is not a direct replacement and two
additional MIN() functions are needed. One to prevent resample
buffer overflows and one to limit the available bytes for the audio
frontends.

After this patch the audio packet length calculation for playback is
exact. When upsampling, it's still possible that the audio frontends
can't write the last audio frame. This will be fixed later.

Acked-by: Mark Cave-Ayland 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-9-vr_q...@t-online.de>
---
 audio/mixeng.h |  1 +
 audio/audio.c  | 43 ++-
 audio/mixeng.c | 39 +++
 3 files changed, 58 insertions(+), 25 deletions(-)

diff --git a/audio/mixeng.h b/audio/mixeng.h
index 2dcd6df245..64c1e231cc 100644
--- a/audio/mixeng.h
+++ b/audio/mixeng.h
@@ -52,6 +52,7 @@ void st_rate_flow(void *opaque, st_sample *ibuf, st_sample 
*obuf,
 void st_rate_flow_mix(void *opaque, st_sample *ibuf, st_sample *obuf,
   size_t *isamp, size_t *osamp);
 void st_rate_stop (void *opaque);
+uint32_t st_rate_frames_in(void *opaque, uint32_t frames_out);
 void mixeng_clear (struct st_sample *buf, int len);
 void mixeng_volume (struct st_sample *buf, int len, struct mixeng_volume *vol);
 
diff --git a/audio/audio.c b/audio/audio.c
index 556696b095..e18b5e98c5 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -701,8 +701,8 @@ static void audio_pcm_sw_resample_out(SWVoiceOut *sw,
 static size_t audio_pcm_sw_write(SWVoiceOut *sw, void *buf, size_t buf_len)
 {
 HWVoiceOut *hw = sw->hw;
-size_t live, dead, hw_free;
-size_t frames_in_max, total_in, total_out;
+size_t live, dead, hw_free, sw_max, fe_max;
+size_t frames_in_max, frames_out_max, total_in, total_out;
 
 live = sw->total_hw_samples_mixed;
 if (audio_bug(__func__, live > hw->mix_buf.size)) {
@@ -720,17 +720,21 @@ static size_t audio_pcm_sw_write(SWVoiceOut *sw, void 
*buf, size_t buf_len)
 dead = hw->mix_buf.size - live;
 hw_free = audio_pcm_hw_get_free(hw);
 hw_free = hw_free > live ? hw_free - live : 0;
-frames_in_max = ((int64_t)MIN(dead, hw_free) << 32) / sw->ratio;
-frames_in_max = MIN(frames_in_max, buf_len / sw->info.bytes_per_frame);
-if (frames_in_max) {
-sw->conv(sw->resample_buf.buffer, buf, frames_in_max);
+frames_out_max = MIN(dead, hw_free);
+sw_max = st_rate_frames_in(sw->rate, frames_out_max);
+fe_max = MIN(buf_len / sw->info.bytes_per_frame, sw->resample_buf.size);
+frames_in_max = MIN(sw_max, fe_max);
 
-if (!sw->hw->pcm_ops->volume_out) {
-mixeng_volume(sw->resample_buf.buffer, frames_in_max, &sw->vol);
-}
+if (!frames_in_max) {
+return 0;
 }
 
-audio_pcm_sw_resample_out(sw, frames_in_max, MIN(dead, hw_free),
+sw->conv(sw->resample_buf.buffer, buf, frames_in_max);
+if (!sw->hw->pcm_ops->volume_out) {
+mixeng_volume(sw->resample_buf.buffer, frames_in_max, &sw->vol);
+}
+
+audio_pcm_sw_resample_out(sw, frames_in_max, frames_out_max,
   &total_in, &total_out);
 
 sw->total_hw_samples_mixed += total_out;
@@ -1000,18 +1004,6 @@ static size_t audio_get_avail (SWVoiceIn *sw)
 return live;
 }
 
-/**
- * audio_frontend_frames_out() - returns the number of frames needed to
- * get frames_out frames after resampling
- *
- * @sw: audio playback frontend
- * @frames_out: number of frames
- */
-static size_t audio_frontend_frames_out(SWVoiceOut *sw, size_t frames_out)
-{
-return ((int64_t)frames_out << 32) / sw->ratio;
-}
-
 static size_t audio_get_free(SWVoiceOut *sw)
 {
 size_t live, dead;
@@ -1031,8 +1023,8 @@ static size_t audio_get_free(SWVoiceOut *sw)
 dead = sw->hw->mix_buf.size - live;
 
 #ifdef DEBUG_OUT
-dolog("%s: get_free live %zu dead %zu frontend frames %zu\n",
-  SW_NAME(sw), live, dead, audio_frontend_frames_out(sw, dead));
+dolog("%s: get_free live %zu dead %zu frontend frames %u\n",
+  SW_NAME(sw), live, dead, st_rate_frames_in(sw->rate, dead));
 #endif
 
 return dead;
@@ -1161,12 +1153,13 @@ static void audio_run_out (AudioState *s)
 size_t free;
 
 if (hw_free > sw->total_hw_samples_mixed) {
-free = audio_frontend_frames_out(sw,
+free = 

[PULL 08/27] audio/audio_template: use g_malloc0() to replace audio_calloc()

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Use g_malloc0() as a direct replacement for audio_calloc().

Since the type of the parameter n_bytes of the function g_malloc0()
is unsigned, the type of the variables voice_size_out and
voice_size_in has been changed to size_t. This means that the
function argument no longer has to be checked for negative values.

Signed-off-by: Volker Rümelin 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Marc-André Lureau 
Message-Id: <20230121094735.11644-7-vr_q...@t-online.de>
---
 audio/audio_int.h  |  4 ++--
 audio/audio_template.h | 18 --
 2 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/audio/audio_int.h b/audio/audio_int.h
index 4632cdf9cc..ce2d6bf92c 100644
--- a/audio/audio_int.h
+++ b/audio/audio_int.h
@@ -151,8 +151,8 @@ struct audio_driver {
 int can_be_default;
 int max_voices_out;
 int max_voices_in;
-int voice_size_out;
-int voice_size_in;
+size_t voice_size_out;
+size_t voice_size_in;
 QLIST_ENTRY(audio_driver) next;
 };
 
diff --git a/audio/audio_template.h b/audio/audio_template.h
index dfa440f778..592866f14a 100644
--- a/audio/audio_template.h
+++ b/audio/audio_template.h
@@ -40,7 +40,7 @@ static void glue(audio_init_nb_voices_, TYPE)(AudioState *s,
   struct audio_driver *drv)
 {
 int max_voices = glue (drv->max_voices_, TYPE);
-int voice_size = glue (drv->voice_size_, TYPE);
+size_t voice_size = glue(drv->voice_size_, TYPE);
 
 if (glue (s->nb_hw_voices_, TYPE) > max_voices) {
 if (!max_voices) {
@@ -63,8 +63,8 @@ static void glue(audio_init_nb_voices_, TYPE)(AudioState *s,
 }
 
 if (audio_bug(__func__, voice_size && !max_voices)) {
-dolog ("drv=`%s' voice_size=%d max_voices=0\n",
-   drv->name, voice_size);
+dolog("drv=`%s' voice_size=%zu max_voices=0\n",
+  drv->name, voice_size);
 }
 }
 
@@ -273,13 +273,11 @@ static HW *glue(audio_pcm_hw_add_new_, TYPE)(AudioState 
*s,
 return NULL;
 }
 
-hw = audio_calloc(__func__, 1, glue(drv->voice_size_, TYPE));
-if (!hw) {
-dolog ("Can not allocate voice `%s' size %d\n",
-   drv->name, glue (drv->voice_size_, TYPE));
-return NULL;
-}
-
+/*
+ * Since glue(s->nb_hw_voices_, TYPE) is != 0, glue(drv->voice_size_, TYPE)
+ * is guaranteed to be != 0. See the audio_init_nb_voices_* functions.
+ */
+hw = g_malloc0(glue(drv->voice_size_, TYPE));
 hw->s = s;
 hw->pcm_ops = drv->pcm_ops;
 
-- 
2.39.2




[PULL 20/27] audio: remove unused noop_conv() function

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

The function audio_capture_mix_and_clear() no longer uses
audio_pcm_sw_write() to resample audio frames from one internal
buffer to another. For this reason, the noop_conv() function is
now unused. Remove it.

Acked-by: Mark Cave-Ayland 
Reviewed-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-8-vr_q...@t-online.de>
---
 audio/audio.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/audio/audio.c b/audio/audio.c
index 44eb7b63b4..556696b095 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -381,13 +381,6 @@ void audio_pcm_info_clear_buf (struct audio_pcm_info 
*info, void *buf, int len)
 /*
  * Capture
  */
-static void noop_conv (struct st_sample *dst, const void *src, int samples)
-{
-(void) src;
-(void) dst;
-(void) samples;
-}
-
 static CaptureVoiceOut *audio_pcm_capture_find_specific(AudioState *s,
 struct audsettings *as)
 {
@@ -485,7 +478,6 @@ static int audio_attach_capture (HWVoiceOut *hw)
 sw->info = hw->info;
 sw->empty = 1;
 sw->active = hw->enabled;
-sw->conv = noop_conv;
 sw->ratio = ((int64_t) hw_cap->info.freq << 32) / sw->info.freq;
 sw->vol = nominal_volume;
 sw->rate = st_rate_start (sw->info.freq, hw_cap->info.freq);
-- 
2.39.2




[PULL 18/27] audio: rename variables in audio_pcm_sw_write()

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

The audio_pcm_sw_write() function uses a lot of very unspecific
variable names. Rename them for better readability.

ret => total_in
total => total_out
size => buf_len
hwsamples => hw->mix_buf.size
samples => frames_in_max

Acked-by: Mark Cave-Ayland 
Reviewed-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-6-vr_q...@t-online.de>
---
 audio/audio.c | 45 ++---
 1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/audio/audio.c b/audio/audio.c
index 8f1c0e77b0..cd10f1ec10 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -706,56 +706,55 @@ static void audio_pcm_sw_resample_out(SWVoiceOut *sw,
 }
 }
 
-static size_t audio_pcm_sw_write(SWVoiceOut *sw, void *buf, size_t size)
+static size_t audio_pcm_sw_write(SWVoiceOut *sw, void *buf, size_t buf_len)
 {
-size_t hwsamples, samples, live, dead;
-size_t hw_free;
-size_t ret, total;
-
-hwsamples = sw->hw->mix_buf.size;
+HWVoiceOut *hw = sw->hw;
+size_t live, dead, hw_free;
+size_t frames_in_max, total_in, total_out;
 
 live = sw->total_hw_samples_mixed;
-if (audio_bug(__func__, live > hwsamples)) {
-dolog("live=%zu hw->mix_buf.size=%zu\n", live, hwsamples);
+if (audio_bug(__func__, live > hw->mix_buf.size)) {
+dolog("live=%zu hw->mix_buf.size=%zu\n", live, hw->mix_buf.size);
 return 0;
 }
 
-if (live == hwsamples) {
+if (live == hw->mix_buf.size) {
 #ifdef DEBUG_OUT
 dolog ("%s is full %zu\n", sw->name, live);
 #endif
 return 0;
 }
 
-dead = hwsamples - live;
-hw_free = audio_pcm_hw_get_free(sw->hw);
+dead = hw->mix_buf.size - live;
+hw_free = audio_pcm_hw_get_free(hw);
 hw_free = hw_free > live ? hw_free - live : 0;
-samples = ((int64_t)MIN(dead, hw_free) << 32) / sw->ratio;
-samples = MIN(samples, size / sw->info.bytes_per_frame);
-if (samples) {
-sw->conv(sw->resample_buf.buffer, buf, samples);
+frames_in_max = ((int64_t)MIN(dead, hw_free) << 32) / sw->ratio;
+frames_in_max = MIN(frames_in_max, buf_len / sw->info.bytes_per_frame);
+if (frames_in_max) {
+sw->conv(sw->resample_buf.buffer, buf, frames_in_max);
 
 if (!sw->hw->pcm_ops->volume_out) {
-mixeng_volume(sw->resample_buf.buffer, samples, &sw->vol);
+mixeng_volume(sw->resample_buf.buffer, frames_in_max, &sw->vol);
 }
 }
 
-audio_pcm_sw_resample_out(sw, samples, MIN(dead, hw_free), &ret, &total);
+audio_pcm_sw_resample_out(sw, frames_in_max, MIN(dead, hw_free),
+  &total_in, &total_out);
 
-sw->total_hw_samples_mixed += total;
+sw->total_hw_samples_mixed += total_out;
 sw->empty = sw->total_hw_samples_mixed == 0;
 
 #ifdef DEBUG_OUT
 dolog (
-"%s: write size %zu ret %zu total sw %zu\n",
-SW_NAME (sw),
-size / sw->info.bytes_per_frame,
-ret,
+"%s: write size %zu written %zu total mixed %zu\n",
+SW_NAME(sw),
+buf_len / sw->info.bytes_per_frame,
+total_in,
 sw->total_hw_samples_mixed
 );
 #endif
 
-return ret * sw->info.bytes_per_frame;
+return total_in * sw->info.bytes_per_frame;
 }
 
 #ifdef DEBUG_AUDIO
-- 
2.39.2




[PULL 12/27] alsaaudio: reintroduce default recording settings

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Audio recording with ALSA default settings currently doesn't
work. The debug log shows updates every 0.75s and 1.5s.

audio: Elapsed since last alsa run (running): 0.743030
audio: Elapsed since last alsa run (running): 1.486048
audio: Elapsed since last alsa run (running): 0.743008
audio: Elapsed since last alsa run (running): 1.485878
audio: Elapsed since last alsa run (running): 1.486040
audio: Elapsed since last alsa run (running): 1.485886

The time between updates should be in the 10ms range. Audio
recording with ALSA has the same timing constraints as playback.
Reintroduce the default recording settings and use the same
default settings for recording as for playback.

The term "reintroduce" is correct because commit a93f328177
("alsaaudio: port to -audiodev config") removed the default
settings for recording.

Signed-off-by: Volker Rümelin 
Reviewed-by: Marc-André Lureau 
Message-Id: <20230121094735.11644-11-vr_q...@t-online.de>
---
 audio/alsaaudio.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/audio/alsaaudio.c b/audio/alsaaudio.c
index 0cc982e61f..057571dd1e 100644
--- a/audio/alsaaudio.c
+++ b/audio/alsaaudio.c
@@ -923,15 +923,13 @@ static void *alsa_audio_init(Audiodev *dev)
 dev->u.alsa.out->buffer_length = 92880;
 }
 
-/*
- * OptsVisitor sets unspecified optional fields to zero, but do not depend
- * on it...
- */
 if (!dev->u.alsa.in->has_period_length) {
-dev->u.alsa.in->period_length = 0;
+/* 256 frames assuming 44100Hz */
+dev->u.alsa.in->period_length = 5805;
 }
 if (!dev->u.alsa.in->has_buffer_length) {
-dev->u.alsa.in->buffer_length = 0;
+/* 4096 frames assuming 44100Hz */
+dev->u.alsa.in->buffer_length = 92880;
 }
 
 return dev;
-- 
2.39.2




[PULL 15/27] audio: make the resampling code greedy

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Read the maximum possible number of audio frames instead of the
minimum necessary number of frames when the audio stream is
downsampled and the output buffer is limited. This makes the
function symmetrical to upsampling when the input buffer is
limited. The maximum possible number of frames is written here.

With this change it's easier to calculate the exact number of
audio frames the resample function will read or write. These two
functions will be introduced later.
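
The difference between the old and the greedy loop order can be made
visible with a counting model. This is only a sketch under simplifying
assumptions: rate_sketch and rate_count() are hypothetical, no sample
data is interpolated, and the wrap-around of ipos and opos done by the
real template is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced resampler state: positions only, no sample data. */
struct rate_sketch {
    uint64_t opos;      /* output position, 32.32 fixed point */
    uint64_t opos_inc;  /* input frames per output frame, 32.32 */
    uint32_t ipos;      /* input position in whole frames */
};

/*
 * Count the input frames consumed and output frames produced for at
 * most *isamp input and *osamp output frames. With greedy set, input
 * is read before the output-space check, as in the patched loop;
 * otherwise the output-space check comes first, as in the old loop.
 */
static void rate_count(struct rate_sketch *rate, size_t *isamp,
                       size_t *osamp, bool greedy)
{
    size_t in = 0, out = 0;

    assert(rate->opos_inc != 0);

    /* without input frames, there's nothing to do */
    if (*isamp == 0) {
        *osamp = 0;
        return;
    }

    for (;;) {
        if (!greedy && out >= *osamp) {
            break;
        }
        /* read input frames until ipos > integer part of opos */
        while (rate->ipos <= (rate->opos >> 32)) {
            rate->ipos++;
            in++;
            if (in >= *isamp) {
                goto done;
            }
        }
        if (greedy && out >= *osamp) {
            break;
        }
        out++;
        rate->opos += rate->opos_inc;
    }
done:
    *isamp = in;
    *osamp = out;
}
```

When downsampling 2:1 with 8 input frames available and room for 2
output frames, the old order consumes 3 input frames and the greedy
order consumes 5, while both produce 2 output frames.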

Acked-by: Mark Cave-Ayland 
Acked-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-3-vr_q...@t-online.de>
---
 audio/rate_template.h | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/audio/rate_template.h b/audio/rate_template.h
index b432719ebb..6648f0d2e5 100644
--- a/audio/rate_template.h
+++ b/audio/rate_template.h
@@ -40,8 +40,6 @@ void NAME (void *opaque, struct st_sample *ibuf, struct 
st_sample *obuf,
 int64_t t;
 #endif
 
-ilast = rate->ilast;
-
 istart = ibuf;
 iend = ibuf + *isamp;
 
@@ -59,15 +57,17 @@ void NAME (void *opaque, struct st_sample *ibuf, struct 
st_sample *obuf,
 return;
 }
 
-while (obuf < oend) {
+/* without input samples, there's nothing to do */
+if (ibuf >= iend) {
+*osamp = 0;
+return;
+}
 
-/* Safety catch to make sure we have input samples.  */
-if (ibuf >= iend) {
-break;
-}
+ilast = rate->ilast;
 
-/* read as many input samples so that ipos > opos */
+while (true) {
 
+/* read as many input samples so that ipos > opos */
 while (rate->ipos <= (rate->opos >> 32)) {
 ilast = *ibuf++;
 rate->ipos++;
@@ -78,6 +78,11 @@ void NAME (void *opaque, struct st_sample *ibuf, struct st_sample *obuf,
 }
 }
 
+/* make sure that the next output sample can be written */
+if (obuf >= oend) {
+break;
+}
+
 icur = *ibuf;
 
 /* wrap ipos and opos around long before they overflow */
-- 
2.39.2
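
The effect of the reordered loop can be illustrated with a frame-counting
model. This is only a sketch under simplifying assumptions: it tracks
positions and counts but no sample data, it omits the 1:1 copy path and the
position wrap-around, and struct rate_pos and count_frames() are
hypothetical names, not QEMU code. The `greedy` flag selects between the
old behaviour (stop as soon as the output buffer is full) and the new one
(keep reading input until the next output frame could be produced):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced resampler state: positions only, no samples. */
struct rate_pos {
    uint64_t opos;      /* output position in input frames, 32.32 fixed point */
    uint64_t opos_inc;  /* input frames per output frame, 32.32 fixed point */
    uint64_t ipos;      /* number of input frames read so far */
};

/*
 * On entry *isamp and *osamp hold the available input frames and the
 * free output frames; on return they hold the frames read and written.
 */
static void count_frames(struct rate_pos *r, size_t *isamp, size_t *osamp,
                         bool greedy)
{
    size_t in = 0, out = 0;

    /* without input frames, there's nothing to do */
    if (*isamp == 0) {
        *osamp = 0;
        return;
    }

    for (;;) {
        /* old behaviour: stop before reading when the output is full */
        if (!greedy && out >= *osamp) {
            break;
        }

        /* read as many input frames so that ipos > opos */
        while (r->ipos <= (r->opos >> 32)) {
            in++;
            r->ipos++;
            if (in >= *isamp) {
                goto done;
            }
        }

        /* new behaviour: check the output space only after reading */
        if (out >= *osamp) {
            break;
        }

        out++;
        r->opos += r->opos_inc;
    }

done:
    *isamp = in;
    *osamp = out;
}
```

For a 2:1 downsampling step (opos_inc = 2 << 32) with 8 input frames
available and room for 2 output frames, this model reads 3 frames in the
non-greedy mode and 5 frames in the greedy mode; both write 2 frames.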




[PULL 05/27] audio: remove unused #define AUDIO_STRINGIFY

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Remove the unused #define AUDIO_STRINGIFY. It was last used before
commit 470bcabd8f ("audio: Replace AUDIO_FUNC with __func__").

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Thomas Huth 
Signed-off-by: Volker Rümelin 
Reviewed-by: Marc-André Lureau 
Message-Id: <20230121094735.11644-4-vr_q...@t-online.de>
---
 audio/audio_int.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/audio/audio_int.h b/audio/audio_int.h
index e87ce014a0..4632cdf9cc 100644
--- a/audio/audio_int.h
+++ b/audio/audio_int.h
@@ -294,9 +294,6 @@ static inline size_t audio_ring_posb(size_t pos, size_t dist, size_t len)
 #define ldebug(fmt, ...) (void)0
 #endif
 
-#define AUDIO_STRINGIFY_(n) #n
-#define AUDIO_STRINGIFY(n) AUDIO_STRINGIFY_(n)
-
 typedef struct AudiodevListEntry {
 Audiodev *dev;
 QSIMPLEQ_ENTRY(AudiodevListEntry) next;
-- 
2.39.2




[PULL 19/27] audio: don't misuse audio_pcm_sw_write()

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

The audio_pcm_sw_write() function is intended to convert a
PCM audio stream to the internal representation, adjust the
volume, and then mix it with the other audio streams with a
possibly changed sample rate in mix_buf. In order for the
audio_capture_mix_and_clear() function to use audio_pcm_sw_write(),
it must bypass the first two tasks of audio_pcm_sw_write().

Since patch "audio: split out the resampling loop in
audio_pcm_sw_write()" this is no longer necessary, because now
the audio_pcm_sw_resample_out() function can be used instead of
audio_pcm_sw_write().

Acked-by: Mark Cave-Ayland 
Acked-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-7-vr_q...@t-online.de>
---
 audio/audio.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/audio/audio.c b/audio/audio.c
index cd10f1ec10..44eb7b63b4 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -1056,26 +1056,33 @@ static void audio_capture_mix_and_clear(HWVoiceOut *hw, size_t rpos,
 
 for (sc = hw->cap_head.lh_first; sc; sc = sc->entries.le_next) {
 SWVoiceOut *sw = &sc->sw;
-int rpos2 = rpos;
+size_t rpos2 = rpos;
 
 n = samples;
 while (n) {
 size_t till_end_of_hw = hw->mix_buf.size - rpos2;
-size_t to_write = MIN(till_end_of_hw, n);
-size_t bytes = to_write * hw->info.bytes_per_frame;
-size_t written;
+size_t to_read = MIN(till_end_of_hw, n);
+size_t live, frames_in, frames_out;
 
 sw->resample_buf.buffer = hw->mix_buf.buffer + rpos2;
-sw->resample_buf.size = to_write;
-written = audio_pcm_sw_write (sw, NULL, bytes);
-if (written - bytes) {
-dolog("Could not mix %zu bytes into a capture "
+sw->resample_buf.size = to_read;
+live = sw->total_hw_samples_mixed;
+
+audio_pcm_sw_resample_out(sw,
+  to_read, sw->hw->mix_buf.size - live,
+  &frames_in, &frames_out);
+
+sw->total_hw_samples_mixed += frames_out;
+sw->empty = sw->total_hw_samples_mixed == 0;
+
+if (to_read - frames_in) {
+dolog("Could not mix %zu frames into a capture "
   "buffer, mixed %zu\n",
-  bytes, written);
+  to_read, frames_in);
 break;
 }
-n -= to_write;
-rpos2 = (rpos2 + to_write) % hw->mix_buf.size;
+n -= to_read;
+rpos2 = (rpos2 + to_read) % hw->mix_buf.size;
 }
 }
 }
-- 
2.39.2




[PULL 17/27] audio: remove sw == NULL check

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

All call sites of audio_pcm_sw_write() guarantee that sw is not
NULL. Remove the unnecessary NULL check.

Acked-by: Mark Cave-Ayland 
Reviewed-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-5-vr_q...@t-online.de>
---
 audio/audio.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/audio/audio.c b/audio/audio.c
index 4412b5fad8..8f1c0e77b0 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -712,10 +712,6 @@ static size_t audio_pcm_sw_write(SWVoiceOut *sw, void *buf, size_t size)
 size_t hw_free;
 size_t ret, total;
 
-if (!sw) {
-return size;
-}
-
 hwsamples = sw->hw->mix_buf.size;
 
 live = sw->total_hw_samples_mixed;
-- 
2.39.2




[PULL 14/27] audio: change type and name of the resample buffer

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Change the type of the resample buffer from struct st_sample *
to STSampleBuffer. Also change the name from buf to resample_buf
for better readability.

The new variables resample_buf.size and resample_buf.pos will be
used after the next patches. There is no functional change.

Acked-by: Mark Cave-Ayland 
Reviewed-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 
Message-Id: <20230224190555.7409-2-vr_q...@t-online.de>
---
 audio/audio_int.h  |  4 ++--
 audio/audio_template.h | 10 ++
 audio/audio.c  | 15 ---
 3 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/audio/audio_int.h b/audio/audio_int.h
index 061845dcc2..8b163e1759 100644
--- a/audio/audio_int.h
+++ b/audio/audio_int.h
@@ -109,7 +109,7 @@ struct SWVoiceOut {
 struct audio_pcm_info info;
 t_sample *conv;
 int64_t ratio;
-struct st_sample *buf;
+STSampleBuffer resample_buf;
 void *rate;
 size_t total_hw_samples_mixed;
 int active;
@@ -129,7 +129,7 @@ struct SWVoiceIn {
 int64_t ratio;
 void *rate;
 size_t total_hw_samples_acquired;
-struct st_sample *buf;
+STSampleBuffer resample_buf;
 f_sample *clip;
 HWVoiceIn *hw;
 char *name;
diff --git a/audio/audio_template.h b/audio/audio_template.h
index dd87170cbd..a0b653f52c 100644
--- a/audio/audio_template.h
+++ b/audio/audio_template.h
@@ -95,13 +95,13 @@ static void glue(audio_pcm_hw_alloc_resources_, TYPE)(HW *hw)
 
 static void glue (audio_pcm_sw_free_resources_, TYPE) (SW *sw)
 {
-g_free (sw->buf);
+g_free(sw->resample_buf.buffer);
+sw->resample_buf.buffer = NULL;
+sw->resample_buf.size = 0;
 
 if (sw->rate) {
 st_rate_stop (sw->rate);
 }
-
-sw->buf = NULL;
 sw->rate = NULL;
 }
 
@@ -138,7 +138,9 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW *sw)
 return -1;
 }
 
-sw->buf = g_new0(st_sample, samples);
+sw->resample_buf.buffer = g_new0(st_sample, samples);
+sw->resample_buf.size = samples;
+sw->resample_buf.pos = 0;
 
 #ifdef DAC
 sw->rate = st_rate_start (sw->info.freq, sw->hw->info.freq);
diff --git a/audio/audio.c b/audio/audio.c
index a0b54e4a2e..a399147486 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -555,7 +555,7 @@ static size_t audio_pcm_sw_read(SWVoiceIn *sw, void *buf, size_t size)
 {
 HWVoiceIn *hw = sw->hw;
 size_t samples, live, ret = 0, swlim, isamp, osamp, rpos, total = 0;
-struct st_sample *src, *dst = sw->buf;
+struct st_sample *src, *dst = sw->resample_buf.buffer;
 
 live = hw->total_samples_captured - sw->total_hw_samples_acquired;
 if (!live) {
@@ -595,10 +595,10 @@ static size_t audio_pcm_sw_read(SWVoiceIn *sw, void *buf, size_t size)
 }
 
 if (!hw->pcm_ops->volume_in) {
-mixeng_volume (sw->buf, ret, &sw->vol);
+mixeng_volume(sw->resample_buf.buffer, ret, &sw->vol);
 }
 
-sw->clip (buf, sw->buf, ret);
+sw->clip(buf, sw->resample_buf.buffer, ret);
 sw->total_hw_samples_acquired += total;
 return ret * sw->info.bytes_per_frame;
 }
@@ -706,10 +706,10 @@ static size_t audio_pcm_sw_write(SWVoiceOut *sw, void *buf, size_t size)
 samples = ((int64_t)MIN(dead, hw_free) << 32) / sw->ratio;
 samples = MIN(samples, size / sw->info.bytes_per_frame);
 if (samples) {
-sw->conv(sw->buf, buf, samples);
+sw->conv(sw->resample_buf.buffer, buf, samples);
 
 if (!sw->hw->pcm_ops->volume_out) {
-mixeng_volume(sw->buf, samples, &sw->vol);
+mixeng_volume(sw->resample_buf.buffer, samples, &sw->vol);
 }
 }
 
@@ -724,7 +724,7 @@ static size_t audio_pcm_sw_write(SWVoiceOut *sw, void *buf, size_t size)
 osamp = blck;
 st_rate_flow_mix (
 sw->rate,
-sw->buf + pos,
+sw->resample_buf.buffer + pos,
 sw->hw->mix_buf.buffer + wpos,
 &isamp,
 &osamp
@@ -1061,7 +1061,8 @@ static void audio_capture_mix_and_clear(HWVoiceOut *hw, size_t rpos,
 size_t bytes = to_write * hw->info.bytes_per_frame;
 size_t written;
 
-sw->buf = hw->mix_buf.buffer + rpos2;
+sw->resample_buf.buffer = hw->mix_buf.buffer + rpos2;
+sw->resample_buf.size = to_write;
 written = audio_pcm_sw_write (sw, NULL, bytes);
 if (written - bytes) {
 dolog("Could not mix %zu bytes into a capture "
-- 
2.39.2




[PULL 04/27] audio: rename hardware store to backend

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Use a consistent friendly name for the HWVoiceOut and HWVoiceIn
structures.

Reviewed-by: Thomas Huth 
Signed-off-by: Volker Rümelin 
Reviewed-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20230121094735.11644-3-vr_q...@t-online.de>
---
 audio/audio_template.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/audio/audio_template.h b/audio/audio_template.h
index f0ef262ab3..33af42ed8b 100644
--- a/audio/audio_template.h
+++ b/audio/audio_template.h
@@ -529,8 +529,8 @@ SW *glue (AUD_open_, TYPE) (
 HW *hw = sw->hw;
 
 if (!hw) {
-dolog ("Internal logic error voice `%s' has no hardware store\n",
-   SW_NAME (sw));
+dolog("Internal logic error: voice `%s' has no backend\n",
+  SW_NAME(sw));
 goto fail;
 }
 
-- 
2.39.2




[PULL 02/27] audio: log unimplemented audio device sample rates

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Some emulated audio devices allow guests to select very low
sample rates that the audio subsystem doesn't support. The lowest
supported sample rate depends on the audio backend used and in
most cases can be changed with various -audiodev arguments. Until
now, the audio_bug function emitted an error message similar to
the following:

A bug was just triggered in audio_calloc
Save all your work and restart without audio
I am sorry
Context:
audio_pcm_sw_alloc_resources_out passed invalid arguments to
 audio_calloc
nmemb=0 size=16 (len=0)
audio: Could not allocate buffer for `ac97.po' (0 samples)

and the audio subsystem continues without sound for the affected
device.

The fact that the selected sample rate is not supported is not a
guest error. Instead of displaying an error message, the missing
audio support is now logged. Simply continuing without sound is
correct, since the audio stream won't transport anything
reasonable at such high resample ratios anyway.

The AUD_open_* functions return NULL like before. The opened
audio device will not be registered in the audio subsystem and
consequently the audio frontend callback functions will not be
called. The AUD_read and AUD_write functions return early in this
case. This is necessary because, for example, the Sound Blaster 16
emulation calls AUD_write from the DMA callback function.

Acked-by: Christian Schoenebeck 
Signed-off-by: Volker Rümelin 
Reviewed-by: Marc-André Lureau 
Message-Id: <20230121094735.11644-1-vr_q...@t-online.de>
---
 audio/audio_template.h | 13 +
 audio/audio.c  |  1 +
 2 files changed, 14 insertions(+)

diff --git a/audio/audio_template.h b/audio/audio_template.h
index 42b4712acb..dbfb4fee4c 100644
--- a/audio/audio_template.h
+++ b/audio/audio_template.h
@@ -115,6 +115,19 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW *sw)
 #else
 samples = (int64_t)sw->HWBUF->size * sw->ratio >> 32;
 #endif
+if (samples == 0) {
+HW *hw = sw->hw;
+size_t f_fe_min;
+
+/* f_fe_min = ceil(1 [frames] * f_be [Hz] / size_be [frames]) */
+f_fe_min = (hw->info.freq + HWBUF->size - 1) / HWBUF->size;
+qemu_log_mask(LOG_UNIMP,
+  AUDIO_CAP ": The guest selected a " NAME " sample rate"
+  " of %d Hz for %s. Only sample rates >= %zu Hz are"
+  " supported.\n",
+  sw->info.freq, sw->name, f_fe_min);
+return -1;
+}
 
 sw->buf = audio_calloc(__func__, samples, sizeof(struct st_sample));
 if (!sw->buf) {
diff --git a/audio/audio.c b/audio/audio.c
index 4290309d18..81f5c0bb1e 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -33,6 +33,7 @@
 #include "qapi/qapi-visit-audio.h"
 #include "qapi/qapi-commands-audio.h"
 #include "qemu/cutils.h"
+#include "qemu/log.h"
 #include "qemu/module.h"
 #include "qemu/help_option.h"
 #include "sysemu/sysemu.h"
-- 
2.39.2
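
The f_fe_min formula in the patch can be checked with a small worked
example. This is a sketch under stated assumptions: min_frontend_rate()
and resample_frames() are hypothetical helpers, the backend values
(44100 Hz, 1024 frames) are made up for illustration, and the buffer size
is computed directly as size_be * f_fe / f_be instead of through the
32.32 fixed-point ratio the audio code uses, so results may differ
slightly at the boundary in the real code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Smallest frontend sample rate (Hz) that still yields at least one
 * resample frame: ceil(f_be / size_be). size_be must be non-zero.
 */
static size_t min_frontend_rate(uint32_t f_be, size_t size_be)
{
    return (f_be + size_be - 1) / size_be;
}

/*
 * Simplified stand-in for the resample buffer size: the number of
 * frontend frames that correspond to a full backend buffer, truncated.
 */
static size_t resample_frames(size_t size_be, uint32_t f_fe, uint32_t f_be)
{
    return (size_t)((uint64_t)size_be * f_fe / f_be);
}
```

For the assumed backend, min_frontend_rate(44100, 1024) is 44. At 44 Hz
the simplified buffer size is 1 frame; at 43 Hz it truncates to 0, which
is the case the new qemu_log_mask() call reports.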




[PULL 01/27] MAINTAINERS: add myself to ui/ and audio/

2023-03-05 Thread marcandre . lureau
From: Marc-André Lureau 

Helping out with patch review & queue handling.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Gerd Hoffmann 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20230207085610.1033536-1-marcandre.lur...@redhat.com>
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 011fd85a09..da29661b37 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2490,6 +2490,7 @@ Subsystems
 --
 Overall Audio backends
 M: Gerd Hoffmann 
+M: Marc-André Lureau 
 S: Odd Fixes
 F: audio/
 X: audio/alsaaudio.c
@@ -2785,6 +2786,7 @@ F: docs/spice-port-fqdn.txt
 
 Graphics
 M: Gerd Hoffmann 
+M: Marc-André Lureau 
 S: Odd Fixes
 F: ui/
 F: include/ui/
-- 
2.39.2




[PULL 06/27] audio/mixeng: use g_new0() instead of audio_calloc()

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Replace audio_calloc() with the equivalent g_new0().

With a n_structs argument of 1, g_new0() never returns NULL.
Also remove the unnecessary NULL checks.

Reviewed-by: Richard Henderson 
Signed-off-by: Volker Rümelin 
Reviewed-by: Marc-André Lureau 
Message-Id: <20230121094735.11644-5-vr_q...@t-online.de>
---
 audio/audio_template.h | 6 +-
 audio/audio.c  | 5 -
 audio/mixeng.c | 7 +--
 3 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/audio/audio_template.h b/audio/audio_template.h
index 33af42ed8b..dfa440f778 100644
--- a/audio/audio_template.h
+++ b/audio/audio_template.h
@@ -141,11 +141,7 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW *sw)
 #else
 sw->rate = st_rate_start (sw->hw->info.freq, sw->info.freq);
 #endif
-if (!sw->rate) {
-g_free (sw->buf);
-sw->buf = NULL;
-return -1;
-}
+
 return 0;
 }
 
diff --git a/audio/audio.c b/audio/audio.c
index 81f5c0bb1e..012d10996b 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -509,11 +509,6 @@ static int audio_attach_capture (HWVoiceOut *hw)
 sw->ratio = ((int64_t) hw_cap->info.freq << 32) / sw->info.freq;
 sw->vol = nominal_volume;
 sw->rate = st_rate_start (sw->info.freq, hw_cap->info.freq);
-if (!sw->rate) {
-dolog ("Could not start rate conversion for `%s'\n", SW_NAME (sw));
-g_free (sw);
-return -1;
-}
 QLIST_INSERT_HEAD (&hw_cap->sw_head, sw, entries);
 QLIST_INSERT_HEAD (&hw->cap_head, sc, entries);
 #ifdef DEBUG_CAPTURE
diff --git a/audio/mixeng.c b/audio/mixeng.c
index 100a306d6f..fe454e0725 100644
--- a/audio/mixeng.c
+++ b/audio/mixeng.c
@@ -414,12 +414,7 @@ struct rate {
  */
 void *st_rate_start (int inrate, int outrate)
 {
-struct rate *rate = audio_calloc(__func__, 1, sizeof(*rate));
-
-if (!rate) {
-dolog ("Could not allocate resampler (%zu bytes)\n", sizeof (*rate));
-return NULL;
-}
+struct rate *rate = g_new0(struct rate, 1);
 
 rate->opos = 0;
 
-- 
2.39.2




[PULL 03/27] audio: don't show unnecessary error messages

2023-03-05 Thread marcandre . lureau
From: Volker Rümelin 

Let the audio_pcm_create_voice_pair_* functions handle error
reporting. This avoids an additional error message in case
the guest selected an unimplemented sample rate.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Volker Rümelin 
Reviewed-by: Marc-André Lureau 
Message-Id: <20230121094735.11644-2-vr_q...@t-online.de>
---
 audio/audio_template.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/audio/audio_template.h b/audio/audio_template.h
index dbfb4fee4c..f0ef262ab3 100644
--- a/audio/audio_template.h
+++ b/audio/audio_template.h
@@ -441,6 +441,7 @@ static SW *glue(audio_pcm_create_voice_pair_, TYPE)(
 
 hw = glue(audio_pcm_hw_add_, TYPE)(s, &hw_as);
 if (!hw) {
+dolog("Could not create a backend for voice `%s'\n", sw_name);
 goto err2;
 }
 
@@ -540,7 +541,6 @@ SW *glue (AUD_open_, TYPE) (
 } else {
 sw = glue(audio_pcm_create_voice_pair_, TYPE)(s, name, as);
 if (!sw) {
-dolog ("Failed to create voice `%s'\n", name);
 return NULL;
 }
 }
-- 
2.39.2




[PULL 00/27] Audio patches

2023-03-05 Thread marcandre . lureau
From: Marc-André Lureau 

The following changes since commit 2946e1af2704bf6584f57d4e3aec49d1d5f3ecc0:

  configure: Disable thread-safety warnings on macOS (2023-03-04 14:03:46 +)

are available in the Git repository at:

  https://gitlab.com/marcandre.lureau/qemu.git tags/audio-pull-request

for you to fetch changes up to 2f886a34bb7e6f6fcf39d64829f4499476f26dba:

  audio: remove sw->ratio (2023-03-06 10:30:24 +0400)


Audio patches for QEMU 8.0

Cleanups and improvements from Volker Rümelin.



Marc-André Lureau (1):
  MAINTAINERS: add myself to ui/ and audio/

Volker Rümelin (26):
  audio: log unimplemented audio device sample rates
  audio: don't show unnecessary error messages
  audio: rename hardware store to backend
  audio: remove unused #define AUDIO_STRINGIFY
  audio/mixeng: use g_new0() instead of audio_calloc()
  audio/alsaaudio: use g_new0() instead of audio_calloc()
  audio/audio_template: use g_malloc0() to replace audio_calloc()
  audio/audio_template: use g_new0() to replace audio_calloc()
  audio: remove audio_calloc() function
  alsaaudio: change default playback settings
  alsaaudio: reintroduce default recording settings
  audio: change type of mix_buf and conv_buf
  audio: change type and name of the resample buffer
  audio: make the resampling code greedy
  audio: replace the resampling loop in audio_pcm_sw_write()
  audio: remove sw == NULL check
  audio: rename variables in audio_pcm_sw_write()
  audio: don't misuse audio_pcm_sw_write()
  audio: remove unused noop_conv() function
  audio: make playback packet length calculation exact
  audio: replace the resampling loop in audio_pcm_sw_read()
  audio: rename variables in audio_pcm_sw_read()
  audio: make recording packet length calculation exact
  audio: handle leftover audio frame from upsampling
  audio/audio_template: substitute sw->hw with hw
  audio: remove sw->ratio

 MAINTAINERS|   2 +
 audio/audio_int.h  |  20 +--
 audio/audio_template.h | 105 +--
 audio/mixeng.h |   2 +
 audio/rate_template.h  |  21 ++-
 audio/alsaaudio.c  |  27 +--
 audio/audio.c  | 392 -
 audio/mixeng.c |  87 -
 8 files changed, 359 insertions(+), 297 deletions(-)

-- 
2.39.2




Re: [PATCH] hw/intc/ioapic: Update KVM routes before redelivering IRQ, on RTE update

2023-03-05 Thread David Woodhouse



On 5 March 2023 22:36:18 GMT, Peter Xu  wrote:
>On Sun, Mar 05, 2023 at 06:43:42PM +, 
>> ---
>> Alternative fixes might have been just to remove the part in
>> ioapic_service() which delivers the IRQ via kvm_set_irq() because
>> surely delivering as MSI ought to work just fine anyway in all cases?
>> That code lacks a comment justifying its existence.
>
>Didn't check all details, but AFAIU there're still some different paths
>triggered so at least it looks still clean to use the path it's for.
>
>E.g., I think if someone traces kvm_set_irq() in the kernel, this specific irq
>triggered right after unmasking might seem to be mysteriously missed (but
>actually it was not).

Hm, not sure that's a reason we care about. The I/OAPIC is purely a device to 
turn line interrupts into MSIs. Which these days need to be translated by IOMMU 
interrupt remapping device models in userspace. I don't think a user has any 
valid reason to expect that the kernel will even know about any GSIs with any 
specific numbers. Tracing on that in the kernel would be making some dodgy
assumptions.



Re: [PATCH v5 00/16] hw/9pfs: Add 9pfs support for Windows

2023-03-05 Thread Bin Meng
On Mon, Feb 20, 2023 at 6:10 PM Bin Meng  wrote:
>
> At present there is no Windows support for 9p file system.
> This series adds initial Windows support for 9p file system.
>
> 'local' file system backend driver is supported on Windows,
> including open, read, write, close, rename, remove, etc.
> All security models are supported. The mapped (mapped-xattr)
> security model is implemented using NTFS Alternate Data Stream
> (ADS) so the 9p export path shall be on an NTFS partition.
>
> 'synth' driver is adapted for Windows too so that we can now
> run qtests on Windows for 9p related regression testing.
>
> Example command line to test:
>   "-fsdev local,path=c:\msys64,security_model=mapped,id=p9 -device 
> virtio-9p-pci,fsdev=p9,mount_tag=p9fs"
>
> Changes in v5:
> - rework Windows specific xxxdir() APIs implementation
>
> Bin Meng (2):
>   hw/9pfs: Update helper qemu_stat_rdev()
>   hw/9pfs: Add a helper qemu_stat_blksize()
>
> Guohuai Shi (14):
>   hw/9pfs: Add missing definitions for Windows
>   hw/9pfs: Implement Windows specific utilities functions for 9pfs
>   hw/9pfs: Replace the direct call to xxxdir() APIs with a wrapper
>   hw/9pfs: Implement Windows specific xxxdir() APIs
>   hw/9pfs: Update the local fs driver to support Windows
>   hw/9pfs: Support getting current directory offset for Windows
>   hw/9pfs: Disable unsupported flags and features for Windows
>   hw/9pfs: Update v9fs_set_fd_limit() for Windows
>   hw/9pfs: Add Linux error number definition
>   hw/9pfs: Translate Windows errno to Linux value
>   fsdev: Disable proxy fs driver on Windows
>   hw/9pfs: Update synth fs driver for Windows
>   tests/qtest: virtio-9p-test: Adapt the case for win32
>   meson.build: Turn on virtfs for Windows

Ping?



Re: [PATCH v3 1/2] target/riscv: add Zicond as an experimental extension

2023-03-05 Thread Alistair Francis
On Wed, Feb 8, 2023 at 12:40 AM Philipp Tomsich
 wrote:
>
> This implements the Zicond (conditional integer operations) extension,
> as of version 1.0-draft-20230120 as an experimental extension in QEMU
> ("x-zicond").
>
> The Zicond extension acts as a building block for branchless sequences
> including conditional-{arithmetic,logic,select,move}.  Refer to the
> specification for usage scenarios and application guidance.
>
> The following instructions constitute Zicond:
>   - czero.eqz rd, rs1, rs2  =>  rd = (rs2 == 0) ? 0 : rs1
>   - czero.nez rd, rs1, rs2  =>  rd = (rs2 != 0) ? 0 : rs1
>
> See
>   
> https://github.com/riscv/riscv-zicond/releases/download/v1.0-draft-20230120/riscv-zicond_1.0-draft-20230120.pdf
> for the (current version of the) Zicond specification and usage details.
>
> Signed-off-by: Philipp Tomsich 

Sorry about this.

It looks like while I was out a different patch implementing this
extension was applied. I think this patch just fell through the cracks
as it was sent before I left and before Palmer took over.

The second patch is still useful though, if you rebase it on the
current master and resend it I can apply it.

Alistair
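
For readers following the quoted patch, the two instruction definitions in
the commit message can be modelled in C. This is only a sketch of the
stated semantics, assuming 64-bit registers; czero_eqz(), czero_nez() and
cond_select() are illustrative names, not QEMU or toolchain functions, and
the C conditional operator stands in for what the hardware does without a
branch:

```c
#include <assert.h>
#include <stdint.h>

/* czero.eqz rd, rs1, rs2  =>  rd = (rs2 == 0) ? 0 : rs1 */
static uint64_t czero_eqz(uint64_t rs1, uint64_t rs2)
{
    return rs2 == 0 ? 0 : rs1;
}

/* czero.nez rd, rs1, rs2  =>  rd = (rs2 != 0) ? 0 : rs1 */
static uint64_t czero_nez(uint64_t rs1, uint64_t rs2)
{
    return rs2 != 0 ? 0 : rs1;
}

/*
 * Conditional select built from the two instructions plus an OR:
 * returns a when cond is non-zero, b otherwise. Exactly one of the
 * two operands of the OR is zeroed for any value of cond.
 */
static uint64_t cond_select(uint64_t cond, uint64_t a, uint64_t b)
{
    return czero_eqz(a, cond) | czero_nez(b, cond);
}
```

This shows why the commit message calls the pair a building block: a
select needs two czero instructions and one OR, with no branch.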

> ---
>
> Changes in v3:
> - don't add this to MAINTAINERS, as it is an official extension
>
> Changes in v2:
> - gates availability of the instructions through a REQUIRE_ZICOND
>   macro (these were previously always enabled)
>
>  target/riscv/cpu.c   |  4 ++
>  target/riscv/cpu.h   |  1 +
>  target/riscv/insn32.decode   |  4 ++
>  target/riscv/insn_trans/trans_rvzicond.c.inc | 54 
>  target/riscv/translate.c |  1 +
>  5 files changed, 64 insertions(+)
>  create mode 100644 target/riscv/insn_trans/trans_rvzicond.c.inc
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 14a7027095..98177d8328 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -73,6 +73,7 @@ struct isa_ext_data {
>  static const struct isa_ext_data isa_edata_arr[] = {
>  ISA_EXT_DATA_ENTRY(h, false, PRIV_VERSION_1_12_0, ext_h),
>  ISA_EXT_DATA_ENTRY(v, false, PRIV_VERSION_1_12_0, ext_v),
> +ISA_EXT_DATA_ENTRY(zicond, true, PRIV_VERSION_1_12_0, ext_zicond),
>  ISA_EXT_DATA_ENTRY(zicsr, true, PRIV_VERSION_1_10_0, ext_icsr),
>  ISA_EXT_DATA_ENTRY(zifencei, true, PRIV_VERSION_1_10_0, ext_ifencei),
>  ISA_EXT_DATA_ENTRY(zihintpause, true, PRIV_VERSION_1_10_0, 
> ext_zihintpause),
> @@ -1097,6 +1098,9 @@ static Property riscv_cpu_extensions[] = {
>  DEFINE_PROP_BOOL("x-smaia", RISCVCPU, cfg.ext_smaia, false),
>  DEFINE_PROP_BOOL("x-ssaia", RISCVCPU, cfg.ext_ssaia, false),
>
> +/* Zicond 1.0-draft-20230120 */
> +DEFINE_PROP_BOOL("x-zicond", RISCVCPU, cfg.ext_zicond, false),
> +
>  DEFINE_PROP_END_OF_LIST(),
>  };
>
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index bcf0826753..aaf3acb753 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -446,6 +446,7 @@ struct RISCVCPUConfig {
>  bool ext_zkt;
>  bool ext_ifencei;
>  bool ext_icsr;
> +bool ext_zicond;
>  bool ext_zihintpause;
>  bool ext_smstateen;
>  bool ext_sstc;
> diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
> index b7e7613ea2..ca812c2f7a 100644
> --- a/target/riscv/insn32.decode
> +++ b/target/riscv/insn32.decode
> @@ -890,3 +890,7 @@ sm3p1   00 01000 01001 . 001 . 0010011 @r2
>  # *** RV32 Zksed Standard Extension ***
>  sm4ed   .. 11000 . . 000 . 0110011 @k_aes
>  sm4ks   .. 11010 . . 000 . 0110011 @k_aes
> +
> +# *** Zicond Standard Extension ***
> +czero_eqz   111 . . 101 . 0110011 @r
> +czero_nez   111 . . 111 . 0110011 @r
> diff --git a/target/riscv/insn_trans/trans_rvzicond.c.inc 
> b/target/riscv/insn_trans/trans_rvzicond.c.inc
> new file mode 100644
> index 00..20e9694a2c
> --- /dev/null
> +++ b/target/riscv/insn_trans/trans_rvzicond.c.inc
> @@ -0,0 +1,54 @@
> +/*
> + * RISC-V translation routines for the XVentanaCondOps extension.
> + *
> + * Copyright (c) 2022 VRULL GmbH.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along 
> with
> + * this program.  If not, see .
> + */
> +
> +#define REQUIRE_ZICOND(ctx) do { \
> +if (!ctx->cfg_ptr->ext_zicond) { \
> +return false;\
> +}   

Re: [PATCH] memory: avoid unnecessary iteration when updating ioeventfds

2023-03-05 Thread Jason Wang
On Mon, Mar 6, 2023 at 5:27 AM Peter Xu  wrote:
>
> On Wed, Mar 01, 2023 at 04:36:20PM +0800, Jason Wang wrote:
> > On Tue, Feb 28, 2023 at 10:25 PM Longpeng(Mike)  
> > wrote:
> > >
> > > From: Longpeng 
> > >
> > > When updating ioeventfds, we need to iterate all address spaces and
> > > iterate all flat ranges of each address space. There is so much
> > > redundant process that a FlatView would be iterated for so many times
> > > during one commit (memory_region_transaction_commit).
> > >
> > > We can mark a FlatView as UPDATED and then skip it in the next iteration
> > > and clear the UPDATED flag at the end of the commit. The overhead can
> > > be significantly reduced.
> > >
> > > For example, a VM with 16 vdpa net devices and each one has 65 vectors,
> > > can reduce the time spent on memory_region_transaction_commit by 95%.
> > >
> > > Signed-off-by: Longpeng 
> > > ---
> > >  include/exec/memory.h |  2 ++
> > >  softmmu/memory.c  | 28 +++-
> > >  2 files changed, 29 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > > index 2e602a2fad..974eabf765 100644
> > > --- a/include/exec/memory.h
> > > +++ b/include/exec/memory.h
> > > @@ -1093,6 +1093,8 @@ struct FlatView {
> > >  unsigned nr_allocated;
> > >  struct AddressSpaceDispatch *dispatch;
> > >  MemoryRegion *root;
> > > +#define FLATVIEW_FLAG_IOEVENTFD_UPDATED (1 << 0)
> > > +unsigned flags;
> > >  };
> > >
> > >  static inline FlatView *address_space_to_flatview(AddressSpace *as)
> > > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > > index 9d64efca26..71ff996712 100644
> > > --- a/softmmu/memory.c
> > > +++ b/softmmu/memory.c
> > > @@ -815,6 +815,15 @@ FlatView *address_space_get_flatview(AddressSpace 
> > > *as)
> > >  return view;
> > >  }
> > >
> > > +static void address_space_reset_view_flags(AddressSpace *as, unsigned 
> > > mask)
> > > +{
> > > +FlatView *view = address_space_get_flatview(as);
> > > +
> > > +if (view->flags & mask) {
> > > +view->flags &= ~mask;
> > > +}
> > > +}
> > > +
> > >  static void address_space_update_ioeventfds(AddressSpace *as)
> > >  {
> > >  FlatView *view;
> > > @@ -825,6 +834,12 @@ static void 
> > > address_space_update_ioeventfds(AddressSpace *as)
> > >  AddrRange tmp;
> > >  unsigned i;
> > >
> > > +view = address_space_get_flatview(as);
> > > +if (view->flags & FLATVIEW_FLAG_IOEVENTFD_UPDATED) {
> > > +return;
> > > +}
> > > +view->flags |= FLATVIEW_FLAG_IOEVENTFD_UPDATED;
> > > +
> >
> > Won't we lose the listener calls if multiple address spaces have the
> > same flatview?
>
> I have the same concern as Jason.  I don't think it matters in reality,
> since only address_space_io uses it so far, but it doesn't really look
> reasonable or clean.

Yes, I think in the memory core it should not assume how the eventfds are used.

>
> One other idea of optimizing ioeventfd update is we can add a per-AS
> counter (ioeventfd_notifiers), increase if any eventfd_add|del is
> registered in memory_listener_register(), and decrease when unregister.
> Then address_space_update_ioeventfds() can be skipped completely if
> ioeventfd_notifiers==0.
>
> Side note: Jason, do you think we should drop vhost_eventfd_add|del?
> They're all no-ops right now.

I think so.

Thanks

>
> Thanks,
>
> --
> Peter Xu
>




Re: [PATCH v4 01/15] vdpa net: move iova tree creation from init to start

2023-03-05 Thread Jason Wang
On Fri, Mar 3, 2023 at 4:01 PM Eugenio Perez Martin  wrote:
>
> On Fri, Mar 3, 2023 at 4:32 AM Jason Wang  wrote:
> >
> >
> > 在 2023/3/1 15:01, Eugenio Perez Martin 写道:
> > > On Mon, Feb 27, 2023 at 8:04 AM Jason Wang  wrote:
> > >>
> > >> 在 2023/2/24 23:54, Eugenio Pérez 写道:
> > >>> Only create iova_tree if and when it is needed.
> > >>>
> > >>> The cleanup keeps being responsible of last VQ but this change allows it
> > >>> to merge both cleanup functions.
> > >>>
> > >>> Signed-off-by: Eugenio Pérez 
> > >>> Acked-by: Jason Wang 
> > >>> ---
> > >>> v4:
> > >>> * Remove leak of iova_tree because double allocation
> > >>> * Document better the sharing of IOVA tree between data and CVQ
> > >>> ---
> > >>>net/vhost-vdpa.c | 113 
> > >>> ++-
> > >>>1 file changed, 83 insertions(+), 30 deletions(-)
> > >>>
> > >>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > >>> index de5ed8ff22..b89c99066a 100644
> > >>> --- a/net/vhost-vdpa.c
> > >>> +++ b/net/vhost-vdpa.c
> > >>> @@ -178,13 +178,9 @@ err_init:
> > >>>static void vhost_vdpa_cleanup(NetClientState *nc)
> > >>>{
> > >>>VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > >>> -struct vhost_dev *dev = &s->vhost_net->dev;
> > >>>
> > >>>qemu_vfree(s->cvq_cmd_out_buffer);
> > >>>qemu_vfree(s->status);
> > >>> -if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> > >>> -g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> > >>> -}
> > >>>if (s->vhost_net) {
> > >>>vhost_net_cleanup(s->vhost_net);
> > >>>g_free(s->vhost_net);
> > >>> @@ -234,10 +230,64 @@ static ssize_t vhost_vdpa_receive(NetClientState 
> > >>> *nc, const uint8_t *buf,
> > >>>return size;
> > >>>}
> > >>>
> > >>> +/** From any vdpa net client, get the netclient of first queue pair */
> > >>> +static VhostVDPAState *vhost_vdpa_net_first_nc_vdpa(VhostVDPAState *s)
> > >>> +{
> > >>> +NICState *nic = qemu_get_nic(s->nc.peer);
> > >>> +NetClientState *nc0 = qemu_get_peer(nic->ncs, 0);
> > >>> +
> > >>> +return DO_UPCAST(VhostVDPAState, nc, nc0);
> > >>> +}
> > >>> +
> > >>> +static void vhost_vdpa_net_data_start_first(VhostVDPAState *s)
> > >>> +{
> > >>> +struct vhost_vdpa *v = &s->vhost_vdpa;
> > >>> +
> > >>> +if (v->shadow_vqs_enabled) {
> > >>> +v->iova_tree = vhost_iova_tree_new(v->iova_range.first,
> > >>> +   v->iova_range.last);
> > >>> +}
> > >>> +}
> > >>> +
> > >>> +static int vhost_vdpa_net_data_start(NetClientState *nc)
> > >>> +{
> > >>> +VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > >>> +struct vhost_vdpa *v = &s->vhost_vdpa;
> > >>> +
> > >>> +assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > >>> +
> > >>> +if (v->index == 0) {
> > >>> +vhost_vdpa_net_data_start_first(s);
> > >>> +return 0;
> > >>> +}
> > >>> +
> > >>> +if (v->shadow_vqs_enabled) {
> > >>> +VhostVDPAState *s0 = vhost_vdpa_net_first_nc_vdpa(s);
> > >>> +v->iova_tree = s0->vhost_vdpa.iova_tree;
> > >>> +}
> > >>> +
> > >>> +return 0;
> > >>> +}
> > >>> +
> > >>> +static void vhost_vdpa_net_client_stop(NetClientState *nc)
> > >>> +{
> > >>> +VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > >>> +struct vhost_dev *dev;
> > >>> +
> > >>> +assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > >>> +
> > >>> +dev = s->vhost_vdpa.dev;
> > >>> +if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> > >>> +g_clear_pointer(&s->vhost_vdpa.iova_tree, 
> > >>> vhost_iova_tree_delete);
> > >>> +}
> > >>> +}
> > >>> +
> > >>>static NetClientInfo net_vhost_vdpa_info = {
> > >>>.type = NET_CLIENT_DRIVER_VHOST_VDPA,
> > >>>.size = sizeof(VhostVDPAState),
> > >>>.receive = vhost_vdpa_receive,
> > >>> +.start = vhost_vdpa_net_data_start,
> > >>> +.stop = vhost_vdpa_net_client_stop,
> > >>
> > >> Looking at the implementation, it seems nothing net specific, any reason
> > >> we can't simply use vhost_vdpa_dev_start()?
> > >>
> > > IOVA tree must be shared between (at least) all dataplane vhost_vdpa.
> > > How could we move the call to vhost_vdpa_net_first_nc_vdpa to
> > > vhost_vdpa_dev_start?
> >
> >
> > Ok, I think I get it. We should really consider to implement a parent
> > structure in the future for vhost_vdpa then we can avoid tricks like:
> >
> > vq_index_end and vhost_vdpa_net_first_nc_vdpa()
> >
>
> Sounds right. Maybe it is enough to link all of them with a QLIST?

That's also fine, but you would need to place the parent data in the head
structure, which seems less clean than having pointers that all point to
the same parent structure.

Thanks

>
> Thanks!
>
> > Thanks
> >
> >
> > >
> > > A possibility is to always allocate it just in case. But it seems to
> > > me it is better to not start allocating resources 
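
The parent-structure idea discussed above can be pictured with a small sketch. Everything in it is hypothetical (the type names `VdpaParent`, `VdpaChild`, `IovaTree` and both functions are illustrative and are not QEMU code); it only shows how per-queue-pair state holding a pointer to one shared parent could replace the vq_index_end and vhost_vdpa_net_first_nc_vdpa() tricks, with the tree allocated on first start and freed on last stop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared IOVA tree; not QEMU's VhostIOVATree. */
typedef struct IovaTree {
    unsigned long first;
    unsigned long last;
} IovaTree;

/* Hypothetical parent object holding state common to all queue pairs. */
typedef struct VdpaParent {
    IovaTree *iova_tree;   /* allocated lazily by the first child to start */
    unsigned started;      /* number of children currently started */
} VdpaParent;

/* Hypothetical per-queue-pair state: it only points at the parent. */
typedef struct VdpaChild {
    VdpaParent *parent;
    int shadow_vqs_enabled;
} VdpaChild;

/* Start one child; the first started child with SVQ allocates the tree. */
static int vdpa_child_start(VdpaChild *c, unsigned long first,
                            unsigned long last)
{
    VdpaParent *p = c->parent;

    if (c->shadow_vqs_enabled && p->iova_tree == NULL) {
        p->iova_tree = malloc(sizeof(*p->iova_tree));
        if (p->iova_tree == NULL) {
            return -1;
        }
        p->iova_tree->first = first;
        p->iova_tree->last = last;
    }
    p->started++;
    return 0;
}

/* Stop one child; the last one to stop releases the shared tree. */
static void vdpa_child_stop(VdpaChild *c)
{
    VdpaParent *p = c->parent;

    assert(p->started > 0);
    if (--p->started == 0) {
        free(p->iova_tree);
        p->iova_tree = NULL;
    }
}
```

With this layout no child needs to know whether it is the first or the last queue pair; the counter in the parent decides.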

Re: [PATCH v4 12/15] vdpa: block migration if device has unsupported features

2023-03-05 Thread Jason Wang
On Fri, Mar 3, 2023 at 4:58 PM Eugenio Perez Martin  wrote:
>
> On Fri, Mar 3, 2023 at 4:48 AM Jason Wang  wrote:
> >
> >
> > On 2023/3/2 03:32, Eugenio Perez Martin wrote:
> > > On Mon, Feb 27, 2023 at 9:20 AM Jason Wang  wrote:
> > >> On Mon, Feb 27, 2023 at 4:15 PM Jason Wang  wrote:
> > >>>
> > >>> On 2023/2/24 23:54, Eugenio Pérez wrote:
> >  A vdpa net device must initialize with SVQ in order to be migratable at
> >  this moment, and initialization code verifies some conditions.  If the
> >  device is not initialized with the x-svq parameter, it will not expose
> >  _F_LOG so the vhost subsystem will block VM migration from its
> >  initialization.
> > 
> >  Next patches change this, so we need to verify migration conditions
> >  differently.
> > 
> >  QEMU only supports a subset of net features in SVQ, and it cannot
> >  migrate state that cannot track or restore in the destination.  Add a
> >  migration blocker if the device offer an unsupported feature.
> > 
> >  Signed-off-by: Eugenio Pérez 
> >  ---
> >  v3: add migration blocker properly so vhost_dev can handle it.
> >  ---
> > net/vhost-vdpa.c | 12 
> > 1 file changed, 8 insertions(+), 4 deletions(-)
> > 
> >  diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> >  index 4f983df000..094dc1c2d0 100644
> >  --- a/net/vhost-vdpa.c
> >  +++ b/net/vhost-vdpa.c
> >  @@ -795,7 +795,8 @@ static NetClientState 
> >  *net_vhost_vdpa_init(NetClientState *peer,
> >    int nvqs,
> >    bool is_datapath,
> >    bool svq,
> >  -   struct vhost_vdpa_iova_range 
> >  iova_range)
> >  +   struct vhost_vdpa_iova_range 
> >  iova_range,
> >  +   uint64_t features)
> > {
> > NetClientState *nc = NULL;
> > VhostVDPAState *s;
> >  @@ -818,7 +819,10 @@ static NetClientState 
> >  *net_vhost_vdpa_init(NetClientState *peer,
> > s->vhost_vdpa.shadow_vqs_enabled = svq;
> > s->vhost_vdpa.iova_range = iova_range;
> > s->vhost_vdpa.shadow_data = svq;
> >  -if (!is_datapath) {
> >  +if (queue_pair_index == 0) {
> >  +vhost_vdpa_net_valid_svq_features(features,
> >  +  
> >  &s->vhost_vdpa.migration_blocker);
> > >>>
> > >>> Since we do validation at initialization, is this necessary to valid
> > >>> once again in other places?
> > >> Ok, after reading patch 13, I think the question is:
> > >>
> > >> The validation seems to be independent to net, can we valid it once
> > >> during vhost_vdpa_init()?
> > >>
> > > vhost_vdpa_net_valid_svq_features also checks for net features. In
> > > particular, all the non transport features must be in
> > > vdpa_svq_device_features.
> > >
> > > This is how we protect that the device / guest will never negotiate
> > > things like VLAN filtering support, as SVQ still does not know how to
> > > restore at the destination.
> > >
> > > In the VLAN filtering case CVQ is needed to restore VLAN, so it is
> > > covered by patch 11/15. But other future features may need support for
> > > restoring it in the destination.
> >
> >
> > I wonder how hard to have a general validation code let net specific
> > code to advertise a blacklist to avoid code duplication.
> >
>
> A blacklist does not work here, because I don't know if SVQ needs
> changes for future feature bits that are still not in / proposed to
> the standard.

Could you give me an example for this?

>
> Regarding the code duplication, do you mean to validate transport
> features and net specific features in one shot, instead of having a
> dedicated function for SVQ transport?

Nope.

Thanks

>
> Thanks!
>
> > Thanks
> >
> >
> > >
> > > Thanks!
> > >
> > >> Thanks
> > >>
> > >>> Thanks
> > >>>
> > >>>
> >  +} else if (!is_datapath) {
> > s->cvq_cmd_out_buffer = 
> >  qemu_memalign(qemu_real_host_page_size(),
> > 
> >  vhost_vdpa_net_cvq_cmd_page_len());
> > memset(s->cvq_cmd_out_buffer, 0, 
> >  vhost_vdpa_net_cvq_cmd_page_len());
> >  @@ -956,7 +960,7 @@ int net_init_vhost_vdpa(const Netdev *netdev, 
> >  const char *name,
> > for (i = 0; i < queue_pairs; i++) {
> > ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
> >  vdpa_device_fd, i, 2, true, 
> >  opts->x_svq,
> >  - iova_range);
> >  + iova_range, features);
> > if (!ncs[i])
> > goto err;
> > }
> >  
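
The disagreement above about a blacklist can be illustrated with a short sketch. The bit values and function names below are hypothetical and are not taken from the virtio headers or from QEMU; the point is only that a list of known-bad bits lets through any feature bit that was not known when the list was written, while a list of supported bits rejects it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_CSUM      (UINT64_C(1) << 0)
#define FEAT_CTRL_VLAN (UINT64_C(1) << 19)
#define FEAT_FUTURE    (UINT64_C(1) << 40)  /* not yet known to the code */

/*
 * Allow-list check: every offered bit must be explicitly supported.
 * On failure, *unsupported receives the offending bits.
 */
static bool check_allowlist(uint64_t offered, uint64_t supported,
                            uint64_t *unsupported)
{
    *unsupported = offered & ~supported;
    return *unsupported == 0;
}

/* Deny-list check: only bits known to be problematic are rejected. */
static bool check_denylist(uint64_t offered, uint64_t denied)
{
    return (offered & denied) == 0;
}
```

A device offering `FEAT_CSUM | FEAT_FUTURE` passes the deny-list check when only `FEAT_CTRL_VLAN` is denied, but fails the allow-list check when only `FEAT_CSUM` is supported, which is the behaviour a migration blocker would want for features whose state cannot yet be restored.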

Re: [PATCH v13 2/2] vhost-vdpa: add support for vIOMMU

2023-03-05 Thread Jason Wang



On 2023/2/8 10:57, Cindy Lu wrote:

1.Add support for vIOMMU.
Add the new function to deal with iommu MR.
- during iommu_region_add register a specific IOMMU notifier,
   and store all notifiers in a list.
- during iommu_region_del, compare and delete the IOMMU notifier from the list
- since the SVQ not support iommu yet, add the check for IOMMU
   in vhost_vdpa_dev_start, if the SVQ and IOMMU enable at the same time
   function will return fail.

2.Skip the check in vhost_vdpa_listener_skipped_section() while
MR is IOMMU, Move this check to  vhost_vdpa_iommu_map_notify()



This needs some tweaking as well: it's better not to repeat what is done in
the code, but to explain why you need this change. More can be found at:


https://docs.kernel.org/process/submitting-patches.html#describe-your-changes




Verified in vp_vdpa and vdpa_sim_net driver

Signed-off-by: Cindy Lu 
---
  hw/virtio/vhost-vdpa.c | 173 ++---
  include/hw/virtio/vhost-vdpa.h |  11 +++
  2 files changed, 173 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 542e003101..46f676ab71 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -26,6 +26,7 @@
  #include "cpu.h"
  #include "trace.h"
  #include "qapi/error.h"
+#include "hw/virtio/virtio-access.h"
  
  /*

   * Return one past the end of the end of section. Be careful with uint64_t
@@ -60,15 +61,22 @@ static bool 
vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
   iova_min, section->offset_within_address_space);
  return true;
  }
+/*
+ * While using vIOMMU, Sometimes the section will be larger than iova_max
+ * but the memory that  actually mapping is smaller, So skip the check
+ * here. Will add the check in vhost_vdpa_iommu_map_notify,
+ *There is the real size that maps to the kernel



Please tweak the comment; it has whitespace, capitalization, and
punctuation issues.




+ */
  
-llend = vhost_vdpa_section_end(section);

-if (int128_gt(llend, int128_make64(iova_max))) {
-error_report("RAM section out of device range (max=0x%" PRIx64
- ", end addr=0x%" PRIx64 ")",
- iova_max, int128_get64(llend));
-return true;
+if (!memory_region_is_iommu(section->mr)) {



Not related to this patch, but should we exclude non-RAM regions here as 
well?




+llend = vhost_vdpa_section_end(section);
+if (int128_gt(llend, int128_make64(iova_max))) {
+error_report("RAM section out of device range (max=0x%" PRIx64
+ ", end addr=0x%" PRIx64 ")",
+ iova_max, int128_get64(llend));
+return true;
+}
  }
-
  return false;
  }
  
@@ -185,6 +193,118 @@ static void vhost_vdpa_listener_commit(MemoryListener *listener)

  v->iotlb_batch_begin_sent = false;
  }
  
+static void vhost_vdpa_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)

+{
+struct vdpa_iommu *iommu = container_of(n, struct vdpa_iommu, n);
+
+hwaddr iova = iotlb->iova + iommu->iommu_offset;
+struct vhost_vdpa *v = iommu->dev;
+void *vaddr;
+int ret;
+Int128 llend;
+
+if (iotlb->target_as != &address_space_memory) {
+error_report("Wrong target AS \"%s\", only system memory is allowed",
+ iotlb->target_as->name ? iotlb->target_as->name : "none");
+return;
+}
+RCU_READ_LOCK_GUARD();
+/* check if RAM section out of device range */
+llend = int128_add(int128_makes64(iotlb->addr_mask), int128_makes64(iova));
+if (int128_gt(llend, int128_make64(v->iova_range.last))) {
+error_report("RAM section out of device range (max=0x%" PRIx64
+ ", end addr=0x%" PRIx64 ")",
+ v->iova_range.last, int128_get64(llend));
+return;



Can you meet this condition? If yes, should we crop instead of fail here?



+}
+
+vhost_vdpa_iotlb_batch_begin_once(v);



Where do we send the VHOST_IOTLB_BATCH_END message, or do we even need 
any batching here?




+
+if ((iotlb->perm & IOMMU_RW) != IOMMU_NONE) {
+bool read_only;
+
+if (!memory_get_xlat_addr(iotlb, &vaddr, NULL, &read_only, NULL)) {
+return;
+}
+
+ret = vhost_vdpa_dma_map(v, VHOST_VDPA_GUEST_PA_ASID, iova,
+ iotlb->addr_mask + 1, vaddr, read_only);
+if (ret) {
+error_report("vhost_vdpa_dma_map(%p, 0x%" HWADDR_PRIx ", "
+ "0x%" HWADDR_PRIx ", %p) = %d (%m)",
+ v, iova, iotlb->addr_mask + 1, vaddr, ret);
+}
+} else {
+ret = vhost_vdpa_dma_unmap(v, VHOST_VDPA_GUEST_PA_ASID, iova,
+   iotlb->addr_mask + 1);
+if (ret) {
+error_report("vhost_vdpa_dma_unmap(%p, 0x%" HWADDR_PRIx ", "
+ "0x%" HWADDR_PRIx ") = %d (%m)",
+ 

Re: [PATCH v13 1/2] vhost: expose function vhost_dev_has_iommu()

2023-03-05 Thread Jason Wang
On Wed, Feb 8, 2023 at 10:57 AM Cindy Lu  wrote:
>
> To support vIOMMU in vdpa, need to exposed the function
> vhost_dev_has_iommu, vdpa will use this function to check
> if vIOMMU enable.
>
> Signed-off-by: Cindy Lu 

Acked-by: Jason Wang 

Thanks

> ---
>  hw/virtio/vhost.c | 2 +-
>  include/hw/virtio/vhost.h | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index eb8c4c378c..9ff5516655 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -107,7 +107,7 @@ static void vhost_dev_sync_region(struct vhost_dev *dev,
>  }
>  }
>
> -static bool vhost_dev_has_iommu(struct vhost_dev *dev)
> +bool vhost_dev_has_iommu(struct vhost_dev *dev)
>  {
>  VirtIODevice *vdev = dev->vdev;
>
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index a52f273347..f7f10c8fb7 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -336,4 +336,5 @@ int vhost_dev_set_inflight(struct vhost_dev *dev,
> struct vhost_inflight *inflight);
>  int vhost_dev_get_inflight(struct vhost_dev *dev, uint16_t queue_size,
> struct vhost_inflight *inflight);
> +bool vhost_dev_has_iommu(struct vhost_dev *dev);
>  #endif
> --
> 2.34.3
>




Re: [PATCH v3 09/14] hw/net/tulip: Finish QOM conversion

2023-03-05 Thread Jason Wang
On Tue, Feb 28, 2023 at 9:39 PM Philippe Mathieu-Daudé
 wrote:
>
> Hi Jason, do you Ack this patch?

Yes.

Acked-by: Jason Wang 

Thanks

>
> On 13/2/23 19:43, Philippe Mathieu-Daudé wrote:
> > Use the TULIP() and DEVICE() QOM type-checking macros.
> > Remove uses of DO_UPCAST().
> >
> > Signed-off-by: Philippe Mathieu-Daudé 
> > ---
> >   hw/net/tulip.c | 20 +++-
> >   1 file changed, 11 insertions(+), 9 deletions(-)
> >
> > diff --git a/hw/net/tulip.c b/hw/net/tulip.c
> > index 915e5fb595..990507859d 100644
> > --- a/hw/net/tulip.c
> > +++ b/hw/net/tulip.c
> > @@ -19,7 +19,10 @@
> >   #include "net/eth.h"
> >
> >   struct TULIPState {
> > +/*< private >*/
> >   PCIDevice dev;
> > +/*< public >*/
> > +
> >   MemoryRegion io;
> >   MemoryRegion memory;
> >   NICConf c;
> > @@ -959,7 +962,7 @@ static void tulip_fill_eeprom(TULIPState *s)
> >
> >   static void pci_tulip_realize(PCIDevice *pci_dev, Error **errp)
> >   {
> > -TULIPState *s = DO_UPCAST(TULIPState, dev, pci_dev);
> > +TULIPState *s = TULIP(pci_dev);
> >   uint8_t *pci_conf;
> >
> >   pci_conf = s->dev.config;
> > @@ -967,7 +970,7 @@ static void pci_tulip_realize(PCIDevice *pci_dev, Error 
> > **errp)
> >
> > >   qemu_macaddr_default_if_unset(&s->c.macaddr);
> >
> > > -s->eeprom = eeprom93xx_new(&pci_dev->qdev, 64);
> > +s->eeprom = eeprom93xx_new(DEVICE(pci_dev), 64);
> >   tulip_fill_eeprom(s);
> >
> > >   memory_region_init_io(&s->io, OBJECT(&s->dev), &tulip_ops, s,
> > @@ -983,27 +986,26 @@ static void pci_tulip_realize(PCIDevice *pci_dev, 
> > Error **errp)
> >
> > >   s->nic = qemu_new_nic(&net_tulip_info, &s->c,
> > object_get_typename(OBJECT(pci_dev)),
> > -  pci_dev->qdev.id, s);
> > +  DEVICE(pci_dev)->id, s);
> >   qemu_format_nic_info_str(qemu_get_queue(s->nic), s->c.macaddr.a);
> >   }
> >
> >   static void pci_tulip_exit(PCIDevice *pci_dev)
> >   {
> > -TULIPState *s = DO_UPCAST(TULIPState, dev, pci_dev);
> > +TULIPState *s = TULIP(pci_dev);
> >
> >   qemu_del_nic(s->nic);
> >   qemu_free_irq(s->irq);
> > > -eeprom93xx_free(&pci_dev->qdev, s->eeprom);
> > +eeprom93xx_free(DEVICE(s), s->eeprom);
> >   }
> >
> >   static void tulip_instance_init(Object *obj)
> >   {
> > -PCIDevice *pci_dev = PCI_DEVICE(obj);
> > -TULIPState *d = DO_UPCAST(TULIPState, dev, pci_dev);
> > +TULIPState *s = TULIP(obj);
> >
> > > -device_add_bootindex_property(obj, &d->c.bootindex,
> > > +device_add_bootindex_property(obj, &s->c.bootindex,
> > "bootindex", "/ethernet-phy@0",
> > -  _dev->qdev);
> > +  DEVICE(obj));
> >   }
> >
> >   static Property tulip_properties[] = {
>




Re: [PATCH v3 00/13] vfio/migration: Device dirty page tracking

2023-03-05 Thread Alex Williamson
On Sun, 5 Mar 2023 23:33:35 +
Joao Martins  wrote:

> On 05/03/2023 20:57, Alex Williamson wrote:
> > On Sat,  4 Mar 2023 01:43:30 +
> > Joao Martins  wrote:
> >   
> >> Hey,
> >>
> >> Presented herewith a series based on the basic VFIO migration protocol v2
> >> implementation [1].
> >>
> >> It is split from its parent series[5] to solely focus on device dirty
> >> page tracking. Device dirty page tracking allows the VFIO device to
> >> record its DMAs and report them back when needed. This is part of VFIO
> >> migration and is used during pre-copy phase of migration to track the
> >> RAM pages that the device has written to and mark those pages dirty, so
> >> they can later be re-sent to target.
> >>
> >> Device dirty page tracking uses the DMA logging uAPI to discover device
> >> capabilities, to start and stop tracking, and to get dirty page bitmap
> >> report. Extra details and uAPI definition can be found here [3].
> >>
> >> Device dirty page tracking operates in VFIOContainer scope. I.e., When
> >> dirty tracking is started, stopped or dirty page report is queried, all
> >> devices within a VFIOContainer are iterated and for each of them device
> >> dirty page tracking is started, stopped or dirty page report is queried,
> >> respectively.
> >>
> >> Device dirty page tracking is used only if all devices within a
> >> VFIOContainer support it. Otherwise, VFIO IOMMU dirty page tracking is
> >> used, and if that is not supported as well, memory is perpetually marked
> >> dirty by QEMU. Note that since VFIO IOMMU dirty page tracking has no HW
> >> support, the last two usually have the same effect of perpetually
> >> marking all pages dirty.
> >>
> >> Normally, when asked to start dirty tracking, all the currently DMA
> >> mapped ranges are tracked by device dirty page tracking. If using a
> >> vIOMMU we block live migration. It's temporary and a separate series is
> >> going to add support for it. Thus this series focus on getting the
> >> ground work first.
> >>
> >> The series is organized as follows:
> >>
> >> - Patches 1-7: Fix bugs and do some preparatory work required prior to
> >>   adding device dirty page tracking.
> >> - Patches 8-10: Implement device dirty page tracking.
> >> - Patch 11: Blocks live migration with vIOMMU.
> >> - Patches 12-13 enable device dirty page tracking and document it.
> >>
> >> Comments, improvements as usual appreciated.  
> > 
> > Still some CI failures:
> > 
> > https://gitlab.com/alex.williamson/qemu/-/pipelines/796657474
> > 
> > The docker failures are normal, afaict the rest are not.  Thanks,
> >   
> 
> Ugh, sorry
> 
> The patch below scissors mark (and also attached as a file) fixes those build
> issues. I managed to reproduce on i386 target builds, and these changes fix my
> 32-bit build.
> 
> I don't have a working Gitlab setup[*] though to trigger the CI to enable to
> wealth of targets it build-tests. If you could kindly test the patch attached
> in a new pipeline (applied on top of the branch you just build) below to
> understand if the CI gets happy. I will include these changes in the right
> patches (patch 8 and 10) for the v4 spin.

Looks like this passes:

https://gitlab.com/alex.williamson/qemu/-/pipelines/796750136

Thanks,
Alex
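
The fallback order described in the cover letter (device dirty tracking only if every device in the container supports it, otherwise VFIO IOMMU dirty tracking, otherwise mark everything dirty) can be sketched as a small selection function. The enum and function names are hypothetical and do not come from the series; this is only an illustration of the decision, not of how QEMU implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for the three behaviours the cover letter describes. */
typedef enum {
    DIRTY_TRACK_DEVICE,     /* every device in the container tracks its DMA */
    DIRTY_TRACK_IOMMU,      /* fall back to VFIO IOMMU dirty tracking */
    DIRTY_TRACK_ALL_DIRTY,  /* nothing available: treat all memory as dirty */
} DirtyTrackMode;

/*
 * Pick the tracking mode for one container.
 * dev_supports[i] says whether device i supports device dirty tracking.
 */
static DirtyTrackMode pick_dirty_track_mode(const bool *dev_supports,
                                            size_t ndev,
                                            bool iommu_supports)
{
    bool all_devices = true;

    assert(ndev > 0);  /* assumption: a container holds at least one device */
    for (size_t i = 0; i < ndev; i++) {
        if (!dev_supports[i]) {
            all_devices = false;
            break;
        }
    }

    if (all_devices) {
        return DIRTY_TRACK_DEVICE;
    }
    return iommu_supports ? DIRTY_TRACK_IOMMU : DIRTY_TRACK_ALL_DIRTY;
}
```

A single device without support is enough to drop the whole container to the next mode, which matches the "only if all devices support it" wording.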




[PATCH 2/9] linux-user: Rename max_reserved_va in main

2023-03-05 Thread Richard Henderson
Rename to local_max_va, to avoid a conflict with the next patch.

Signed-off-by: Richard Henderson 
---
 linux-user/main.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index f4dea25242..5fcaddffc2 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -680,7 +680,7 @@ int main(int argc, char **argv, char **envp)
 int i;
 int ret;
 int execfd;
-unsigned long max_reserved_va;
+unsigned long local_max_va;
 bool preserve_argv0;
 
 error_init(argv[0]);
@@ -786,9 +786,9 @@ int main(int argc, char **argv, char **envp)
  * still try it, if directed by the command-line option, but
  * not by default.
  */
-max_reserved_va = MAX_RESERVED_VA(cpu);
+local_max_va = MAX_RESERVED_VA(cpu);
 if (reserved_va != 0) {
-if (max_reserved_va && reserved_va > max_reserved_va) {
+if (local_max_va && reserved_va > local_max_va) {
 fprintf(stderr, "Reserved virtual address too big\n");
 exit(EXIT_FAILURE);
 }
@@ -797,7 +797,7 @@ int main(int argc, char **argv, char **envp)
  * reserved_va must be aligned with the host page size
  * as it is used with mmap()
  */
-reserved_va = max_reserved_va & qemu_host_page_mask;
+reserved_va = local_max_va & qemu_host_page_mask;
 }
 
 {
-- 
2.34.1




[PATCH 9/9] accel/tcg: Pass last not end to tb_invalidate_phys_range

2023-03-05 Thread Richard Henderson
Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Signed-off-by: Richard Henderson 
---
 include/exec/exec-all.h   |  2 +-
 accel/tcg/tb-maint.c  | 31 ---
 accel/tcg/translate-all.c |  2 +-
 accel/tcg/user-exec.c |  2 +-
 softmmu/physmem.c |  2 +-
 5 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index e09254333d..58d37276d9 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -679,7 +679,7 @@ void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, 
MemTxAttrs attrs);
 #endif
 void tb_flush(CPUState *cpu);
 void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr);
-void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end);
+void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t last);
 void tb_set_jmp_target(TranslationBlock *tb, int n, uintptr_t addr);
 
 /* GETPC is the true target of the return instruction that we'll execute.  */
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index a93c4c3ef7..19f88fd048 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -989,11 +989,10 @@ TranslationBlock *tb_link_page(TranslationBlock *tb, 
tb_page_addr_t phys_pc,
  * Called with mmap_lock held for user-mode emulation.
  * NOTE: this function must not be called while a TB is running.
  */
-void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
+void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t last)
 {
 TranslationBlock *tb;
 PageForEachNext n;
-tb_page_addr_t last = end - 1;
 
 assert_memory_lock();
 
@@ -1009,11 +1008,11 @@ void tb_invalidate_phys_range(tb_page_addr_t start, 
tb_page_addr_t end)
  */
 void tb_invalidate_phys_page(tb_page_addr_t addr)
 {
-tb_page_addr_t start, end;
+tb_page_addr_t start, last;
 
 start = addr & TARGET_PAGE_MASK;
-end = start + TARGET_PAGE_SIZE;
-tb_invalidate_phys_range(start, end);
+last = addr | ~TARGET_PAGE_MASK;
+tb_invalidate_phys_range(start, last);
 }
 
 /*
@@ -1167,28 +1166,30 @@ void tb_invalidate_phys_page(tb_page_addr_t addr)
 
 /*
  * Invalidate all TBs which intersect with the target physical address range
- * [start;end[. NOTE: start and end may refer to *different* physical pages.
+ * [start;last]. NOTE: start and end may refer to *different* physical pages.
  * 'is_cpu_write_access' should be true if called from a real cpu write
  * access: the virtual CPU will exit the current TB if code is modified inside
  * this TB.
  */
-void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
+void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t last)
 {
 struct page_collection *pages;
-tb_page_addr_t next;
+tb_page_addr_t index, index_last;
 
-pages = page_collection_lock(start, end - 1);
-for (next = (start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
- start < end;
- start = next, next += TARGET_PAGE_SIZE) {
-PageDesc *pd = page_find(start >> TARGET_PAGE_BITS);
-tb_page_addr_t bound = MIN(next, end);
+pages = page_collection_lock(start, last);
+
+index_last = last >> TARGET_PAGE_BITS;
+for (index = start >> TARGET_PAGE_BITS; index <= index_last; index++) {
+PageDesc *pd = page_find(index);
+tb_page_addr_t bound;
 
 if (pd == NULL) {
 continue;
 }
 assert_page_locked(pd);
-tb_invalidate_phys_page_range__locked(pages, pd, start, bound - 1, 0);
+bound = (index << TARGET_PAGE_BITS) | ~TARGET_PAGE_MASK;
+bound = MIN(bound, last);
+tb_invalidate_phys_page_range__locked(pages, pd, start, bound, 0);
 }
 page_collection_unlock(pages);
 }
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 4b5abc0f44..4500d78a16 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -570,7 +570,7 @@ void tb_check_watchpoint(CPUState *cpu, uintptr_t retaddr)
 cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
 addr = get_page_addr_code(env, pc);
 if (addr != -1) {
-tb_invalidate_phys_range(addr, addr + 1);
+tb_invalidate_phys_range(addr, addr);
 }
 }
 }
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 20b6fc2f6e..a7e0c3e2f4 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -516,7 +516,7 @@ void page_set_flags(target_ulong start, target_ulong last, 
int flags)
 ~(reset ? 0 : PAGE_STICKY));
 }
 if (inval_tb) {
-tb_invalidate_phys_range(start, last + 1);
+tb_invalidate_phys_range(start, last);
 }
 }
 
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 47143edb4f..abebf5b963 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -2521,7 +2521,7 @@ 

[PATCH 1/9] linux-user: Diagnose incorrect -R size

2023-03-05 Thread Richard Henderson
Zero is the value for 'off', and should not be used with -R.
We have been enforcing host page alignment for the non-R
fallback of MAX_RESERVED_VA, but failing to enforce for -R.

Signed-off-by: Richard Henderson 
---
 linux-user/main.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/linux-user/main.c b/linux-user/main.c
index 4ff30ff980..f4dea25242 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -395,6 +395,16 @@ static void handle_arg_reserved_va(const char *arg)
 fprintf(stderr, "Unrecognised -R size suffix '%s'\n", p);
 exit(EXIT_FAILURE);
 }
+if (reserved_va == 0) {
+fprintf(stderr, "Invalid -R size value 0\n");
+exit(EXIT_FAILURE);
+}
+/* Must be aligned with the host page size as it is used with mmap. */
+if (reserved_va & qemu_host_page_mask) {
+fprintf(stderr, "Invalid -R size value %lu: must be aligned mod %lu\n",
+   reserved_va, qemu_host_page_size);
+exit(EXIT_FAILURE);
+}
 }
 
 static void handle_arg_singlestep(const char *arg)
-- 
2.34.1
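
For readers checking the alignment test, here is a minimal standalone sketch of a power-of-two alignment check. It assumes a page mask defined as ~(page_size - 1), that is, with the high bits set; whether qemu_host_page_mask follows that convention is an assumption of this sketch and is not stated in the patch. Under that assumption the misaligned low bits are selected by the complement of the mask. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true when value is a multiple of page_size.
 * page_size must be a non-zero power of two.
 */
static bool is_page_aligned(unsigned long value, unsigned long page_size)
{
    unsigned long page_mask;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    /* Assumed convention: the mask keeps the page-number (high) bits. */
    page_mask = ~(page_size - 1);

    /* The offset within the page lives in the bits the mask clears. */
    return (value & ~page_mask) == 0;
}
```

For a 4096-byte page, 0x10000 is reported as aligned and 0x10001 is not.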




[PATCH 6/9] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB

2023-03-05 Thread Richard Henderson
Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Signed-off-by: Richard Henderson 
---
 accel/tcg/tb-maint.c | 28 
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index efefa08ee1..745912e60a 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -125,29 +125,29 @@ static void tb_remove(TranslationBlock *tb)
 }
 
 /* TODO: For now, still shared with translate-all.c for system mode. */
-#define PAGE_FOR_EACH_TB(start, end, pagedesc, T, N)\
-for (T = foreach_tb_first(start, end),  \
- N = foreach_tb_next(T, start, end);\
+#define PAGE_FOR_EACH_TB(start, last, pagedesc, T, N)   \
+for (T = foreach_tb_first(start, last), \
+ N = foreach_tb_next(T, start, last);   \
  T != NULL; \
- T = N, N = foreach_tb_next(N, start, end))
+ T = N, N = foreach_tb_next(N, start, last))
 
 typedef TranslationBlock *PageForEachNext;
 
 static PageForEachNext foreach_tb_first(tb_page_addr_t start,
-tb_page_addr_t end)
+tb_page_addr_t last)
 {
-IntervalTreeNode *n = interval_tree_iter_first(&tb_root, start, end - 1);
+IntervalTreeNode *n = interval_tree_iter_first(&tb_root, start, last);
 return n ? container_of(n, TranslationBlock, itree) : NULL;
 }
 
 static PageForEachNext foreach_tb_next(PageForEachNext tb,
tb_page_addr_t start,
-   tb_page_addr_t end)
+   tb_page_addr_t last)
 {
 IntervalTreeNode *n;
 
 if (tb) {
-n = interval_tree_iter_next(&tb->itree, start, end - 1);
+n = interval_tree_iter_next(&tb->itree, start, last);
 if (n) {
 return container_of(n, TranslationBlock, itree);
 }
@@ -318,7 +318,7 @@ struct page_collection {
 };
 
 typedef int PageForEachNext;
-#define PAGE_FOR_EACH_TB(start, end, pagedesc, tb, n) \
+#define PAGE_FOR_EACH_TB(start, last, pagedesc, tb, n) \
 TB_FOR_EACH_TAGGED((pagedesc)->first_tb, tb, n, page_next)
 
 #ifdef CONFIG_DEBUG_TCG
@@ -993,10 +993,11 @@ void tb_invalidate_phys_range(tb_page_addr_t start, 
tb_page_addr_t end)
 {
 TranslationBlock *tb;
 PageForEachNext n;
+tb_page_addr_t last = end - 1;
 
 assert_memory_lock();
 
-PAGE_FOR_EACH_TB(start, end, unused, tb, n) {
+PAGE_FOR_EACH_TB(start, last, unused, tb, n) {
 tb_phys_invalidate__locked(tb);
 }
 }
@@ -1028,6 +1029,7 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, 
uintptr_t pc)
 bool current_tb_modified;
 TranslationBlock *tb;
 PageForEachNext n;
+tb_page_addr_t last;
 
 /*
  * Without precise smc semantics, or when outside of a TB,
@@ -1044,10 +1046,11 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t 
addr, uintptr_t pc)
 assert_memory_lock();
 current_tb = tcg_tb_lookup(pc);
 
+last = addr | ~TARGET_PAGE_MASK;
 addr &= TARGET_PAGE_MASK;
 current_tb_modified = false;
 
-PAGE_FOR_EACH_TB(addr, addr + TARGET_PAGE_SIZE, unused, tb, n) {
+PAGE_FOR_EACH_TB(addr, last, unused, tb, n) {
 if (current_tb == tb &&
 (tb_cflags(current_tb) & CF_COUNT_MASK) != 1) {
 /*
@@ -1089,12 +1092,13 @@ tb_invalidate_phys_page_range__locked(struct 
page_collection *pages,
 bool current_tb_modified = false;
 TranslationBlock *current_tb = retaddr ? tcg_tb_lookup(retaddr) : NULL;
 #endif /* TARGET_HAS_PRECISE_SMC */
+tb_page_addr_t last G_GNUC_UNUSED = end - 1;
 
 /*
  * We remove all the TBs in the range [start, end[.
  * XXX: see if in some cases it could be faster to invalidate all the code
  */
-PAGE_FOR_EACH_TB(start, end, p, tb, n) {
+PAGE_FOR_EACH_TB(start, last, p, tb, n) {
 /* NOTE: this is subtle as a TB may span two physical pages */
 if (n == 0) {
 /* NOTE: tb_end may be after the end of the page, but
-- 
2.34.1
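
The overflow that the commit message refers to can be reproduced in isolation. The sketch below uses a 32-bit address type and a 4 KiB page purely for illustration; the macro and function names are hypothetical and are not the QEMU definitions. It shows that an exclusive end for the last page wraps to zero, so a half-open range test rejects every address, while an inclusive last stays representable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative page geometry; not the QEMU TARGET_PAGE_* macros. */
#define PAGE_BITS 12
#define PAGE_SIZE ((uint32_t)1 << PAGE_BITS)
#define PAGE_MASK (~(PAGE_SIZE - 1))

/* Half-open range [start, end): breaks when end wraps around to 0. */
static bool in_range_end(uint32_t addr, uint32_t start, uint32_t end)
{
    return addr >= start && addr < end;
}

/* Closed range [start, last]: last is always representable. */
static bool in_range_last(uint32_t addr, uint32_t start, uint32_t last)
{
    return addr >= start && addr <= last;
}

/* Count the pages touched by [start, last], iterating by page index. */
static uint32_t count_pages_last(uint32_t start, uint32_t last)
{
    uint32_t index_last = last >> PAGE_BITS;
    uint32_t n = 0;

    for (uint32_t index = start >> PAGE_BITS; index <= index_last; index++) {
        n++;
    }
    return n;
}
```

For the last page of a 32-bit space, start is 0xFFFFF000: `start + PAGE_SIZE` wraps to 0, whereas `start | ~PAGE_MASK` gives 0xFFFFFFFF. Iterating by page index also terminates on that page, because the index never needs to exceed the largest page number.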




[PATCH 4/9] accel/tcg: Pass last not end to page_set_flags

2023-03-05 Thread Richard Henderson
Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1528
Signed-off-by: Richard Henderson 
---
 include/exec/cpu-all.h |  2 +-
 accel/tcg/user-exec.c  | 16 +++-
 bsd-user/mmap.c|  6 +++---
 linux-user/elfload.c   | 11 ++-
 linux-user/mmap.c  | 16 
 linux-user/syscall.c   |  4 ++--
 6 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 7ef6b9a94d..748764459c 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -285,7 +285,7 @@ typedef int (*walk_memory_regions_fn)(void *, target_ulong,
 int walk_memory_regions(void *, walk_memory_regions_fn);
 
 int page_get_flags(target_ulong address);
-void page_set_flags(target_ulong start, target_ulong end, int flags);
+void page_set_flags(target_ulong start, target_ulong last, int flags);
 void page_reset_target_data(target_ulong start, target_ulong end);
 int page_check_range(target_ulong start, target_ulong len, int flags);
 
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 7b37fd229e..035f8096b2 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -480,24 +480,22 @@ static bool pageflags_set_clear(target_ulong start, 
target_ulong last,
  * The flag PAGE_WRITE_ORG is positioned automatically depending
  * on PAGE_WRITE.  The mmap_lock should already be held.
  */
-void page_set_flags(target_ulong start, target_ulong end, int flags)
+void page_set_flags(target_ulong start, target_ulong last, int flags)
 {
-target_ulong last;
 bool reset = false;
 bool inval_tb = false;
 
 /* This function should never be called with addresses outside the
guest address space.  If this assert fires, it probably indicates
a missing call to h2g_valid.  */
-assert(start < end);
-assert(end - 1 <= GUEST_ADDR_MAX);
+assert(start <= last);
+assert(last <= GUEST_ADDR_MAX);
 /* Only set PAGE_ANON with new mappings. */
 assert(!(flags & PAGE_ANON) || (flags & PAGE_RESET));
 assert_memory_lock();
 
-start = start & TARGET_PAGE_MASK;
-end = TARGET_PAGE_ALIGN(end);
-last = end - 1;
+start &= TARGET_PAGE_MASK;
+last |= ~TARGET_PAGE_MASK;
 
 if (!(flags & PAGE_VALID)) {
 flags = 0;
@@ -510,7 +508,7 @@ void page_set_flags(target_ulong start, target_ulong end, 
int flags)
 }
 
 if (!flags || reset) {
-page_reset_target_data(start, end);
+page_reset_target_data(start, last + 1);
 inval_tb |= pageflags_unset(start, last);
 }
 if (flags) {
@@ -518,7 +516,7 @@ void page_set_flags(target_ulong start, target_ulong end, 
int flags)
 ~(reset ? 0 : PAGE_STICKY));
 }
 if (inval_tb) {
-tb_invalidate_phys_range(start, end);
+tb_invalidate_phys_range(start, last + 1);
 }
 }
 
diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index e9a330d599..301fc63817 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -118,7 +118,7 @@ int target_mprotect(abi_ulong start, abi_ulong len, int 
prot)
 if (ret != 0)
 goto error;
 }
-page_set_flags(start, start + len, prot | PAGE_VALID);
+page_set_flags(start, start + len - 1, prot | PAGE_VALID);
 mmap_unlock();
 return 0;
 error:
@@ -656,7 +656,7 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int 
prot,
 }
 }
  the_end1:
-page_set_flags(start, start + len, prot | PAGE_VALID);
+page_set_flags(start, start + len - 1, prot | PAGE_VALID);
  the_end:
 #ifdef DEBUG_MMAP
 printf("ret=0x" TARGET_ABI_FMT_lx "\n", start);
@@ -767,7 +767,7 @@ int target_munmap(abi_ulong start, abi_ulong len)
 }
 
 if (ret == 0) {
-page_set_flags(start, start + len, 0);
+page_set_flags(start, start + len - 1, 0);
 }
 mmap_unlock();
 return ret;
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 104c13ec77..a3431d8d62 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -212,7 +212,7 @@ static bool init_guest_commpage(void)
 exit(EXIT_FAILURE);
 }
 page_set_flags(TARGET_VSYSCALL_PAGE,
-   TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE,
+   TARGET_VSYSCALL_PAGE | ~TARGET_PAGE_MASK,
PAGE_EXEC | PAGE_VALID);
 return true;
 }
@@ -443,7 +443,7 @@ static bool init_guest_commpage(void)
 exit(EXIT_FAILURE);
 }
 
-page_set_flags(commpage, commpage + qemu_host_page_size,
+page_set_flags(commpage, commpage | ~qemu_host_page_mask,
PAGE_READ | PAGE_EXEC | PAGE_VALID);
 return true;
 }
@@ -1315,7 +1315,7 @@ static bool init_guest_commpage(void)
 exit(EXIT_FAILURE);
 }
 
-page_set_flags(LO_COMMPAGE, LO_COMMPAGE + TARGET_PAGE_SIZE,
+

[PATCH 7/9] accel/tcg: Pass last not end to page_collection_lock

2023-03-05 Thread Richard Henderson
Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Fixes a bug in the loop comparison where "<= end" would lock
one more page than required.
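
As an illustration of that off-by-one, here is a minimal standalone sketch. It is not code from the patch: the ex_ names and the 4 KiB page size are assumptions made only for this example. It counts how many page indexes a "<=" loop visits under each convention.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration only: 4 KiB pages. */
#define EX_PAGE_BITS 12

/*
 * Old convention: 'end' is the first byte past the range, but the loop
 * still compares with "<=", so a page-aligned 'end' pulls in one page
 * too many.
 */
static unsigned ex_pages_visited_end(uint32_t start, uint32_t end)
{
    uint32_t index, stop = end >> EX_PAGE_BITS;
    unsigned count = 0;

    assert(start <= end);
    for (index = start >> EX_PAGE_BITS; index <= stop; index++) {
        count++;
    }
    return count;
}

/* New convention: 'last' is the final byte of the range, so "<=" is exact. */
static unsigned ex_pages_visited_last(uint32_t start, uint32_t last)
{
    uint32_t index, stop = last >> EX_PAGE_BITS;
    unsigned count = 0;

    assert(start <= last);
    for (index = start >> EX_PAGE_BITS; index <= stop; index++) {
        count++;
    }
    return count;
}
```

For the single page [0x1000, 0x1fff], the "end" form called with end = 0x2000 visits two page indexes, while the "last" form visits one. The "last" form also handles the top page of a 32-bit space, where "end" could not be represented at all.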

Signed-off-by: Richard Henderson 
---
 accel/tcg/tb-maint.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 745912e60a..c4e15c5591 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -509,20 +509,20 @@ static gint tb_page_addr_cmp(gconstpointer ap, 
gconstpointer bp, gpointer udata)
 }
 
 /*
- * Lock a range of pages ([@start,@end[) as well as the pages of all
+ * Lock a range of pages ([@start,@last]) as well as the pages of all
  * intersecting TBs.
  * Locking order: acquire locks in ascending order of page index.
  */
 static struct page_collection *page_collection_lock(tb_page_addr_t start,
-tb_page_addr_t end)
+tb_page_addr_t last)
 {
 struct page_collection *set = g_malloc(sizeof(*set));
 tb_page_addr_t index;
 PageDesc *pd;
 
 start >>= TARGET_PAGE_BITS;
-end   >>= TARGET_PAGE_BITS;
-g_assert(start <= end);
+last >>= TARGET_PAGE_BITS;
+g_assert(start <= last);
 
 set->tree = g_tree_new_full(tb_page_addr_cmp, NULL, NULL,
 page_entry_destroy);
@@ -532,7 +532,7 @@ static struct page_collection 
*page_collection_lock(tb_page_addr_t start,
  retry:
 g_tree_foreach(set->tree, page_entry_lock, NULL);
 
-for (index = start; index <= end; index++) {
+for (index = start; index <= last; index++) {
 TranslationBlock *tb;
 PageForEachNext n;
 
@@ -1152,7 +1152,7 @@ tb_invalidate_phys_page_range__locked(struct 
page_collection *pages,
 void tb_invalidate_phys_page(tb_page_addr_t addr)
 {
 struct page_collection *pages;
-tb_page_addr_t start, end;
+tb_page_addr_t start, last;
 PageDesc *p;
 
 p = page_find(addr >> TARGET_PAGE_BITS);
@@ -1161,9 +1161,9 @@ void tb_invalidate_phys_page(tb_page_addr_t addr)
 }
 
 start = addr & TARGET_PAGE_MASK;
-end = start + TARGET_PAGE_SIZE;
-pages = page_collection_lock(start, end);
-tb_invalidate_phys_page_range__locked(pages, p, start, end, 0);
+last = addr | ~TARGET_PAGE_MASK;
+pages = page_collection_lock(start, last);
+tb_invalidate_phys_page_range__locked(pages, p, start, last + 1, 0);
 page_collection_unlock(pages);
 }
 
@@ -1179,7 +1179,7 @@ void tb_invalidate_phys_range(tb_page_addr_t start, 
tb_page_addr_t end)
 struct page_collection *pages;
 tb_page_addr_t next;
 
-pages = page_collection_lock(start, end);
+pages = page_collection_lock(start, end - 1);
 for (next = (start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
  start < end;
  start = next, next += TARGET_PAGE_SIZE) {
@@ -1224,7 +1224,7 @@ void tb_invalidate_phys_range_fast(ram_addr_t ram_addr,
 {
 struct page_collection *pages;
 
-pages = page_collection_lock(ram_addr, ram_addr + size);
+pages = page_collection_lock(ram_addr, ram_addr + size - 1);
 tb_invalidate_phys_page_fast__locked(pages, ram_addr, size, retaddr);
 page_collection_unlock(pages);
 }
-- 
2.34.1




[PATCH 3/9] include/exec: Replace reserved_va with max_reserved_va

2023-03-05 Thread Richard Henderson
In addition to the rename, change the semantics to be the
last byte of the guest va, rather than the following byte.
This avoids some overflow conditions.

Signed-off-by: Richard Henderson 
---
 include/exec/cpu-all.h  | 15 ---
 linux-user/arm/target_cpu.h |  2 +-
 bsd-user/main.c | 18 +++---
 bsd-user/mmap.c | 12 ++--
 bsd-user/signal.c   |  4 ++--
 linux-user/elfload.c| 36 ++--
 linux-user/main.c   | 36 
 linux-user/mmap.c   | 20 ++--
 linux-user/signal.c |  4 ++--
 target/arm/cpu.c|  2 +-
 10 files changed, 75 insertions(+), 74 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 2eb1176538..7ef6b9a94d 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -152,12 +152,21 @@ static inline void tswap64s(uint64_t *s)
  */
 extern uintptr_t guest_base;
 extern bool have_guest_base;
-extern unsigned long reserved_va;
+
+/*
+ * If non-zero, the guest virtual address space is a contiguous subset
+ * of the host virtual address space, i.e. '-R reserved-va' is in effect
+ * either from the command-line or by default.  The value is the last
+ * byte of the guest address space e.g. UINT32_MAX.
+ *
+ * If zero, the host and guest virtual address spaces are intermingled.
+ */
+extern unsigned long max_reserved_va;
 
 /*
  * Limit the guest addresses as best we can.
  *
- * When not using -R reserved_va, we cannot really limit the guest
+ * When not using -R , we cannot really limit the guest
  * to less address space than the host.  For 32-bit guests, this
  * acts as a sanity check that we're not giving the guest an address
  * that it cannot even represent.  For 64-bit guests... the address
@@ -171,7 +180,7 @@ extern unsigned long reserved_va;
 #define GUEST_ADDR_MAX_ \
 ((MIN_CONST(TARGET_VIRT_ADDR_SPACE_BITS, TARGET_ABI_BITS) <= 32) ?  \
  UINT32_MAX : ~0ul)
-#define GUEST_ADDR_MAX(reserved_va ? reserved_va - 1 : GUEST_ADDR_MAX_)
+#define GUEST_ADDR_MAX(max_reserved_va ? : GUEST_ADDR_MAX_)
 
 #else
 
diff --git a/linux-user/arm/target_cpu.h b/linux-user/arm/target_cpu.h
index 89ba274cfc..f6383a7cd1 100644
--- a/linux-user/arm/target_cpu.h
+++ b/linux-user/arm/target_cpu.h
@@ -30,7 +30,7 @@ static inline unsigned long arm_max_reserved_va(CPUState *cs)
  * the high addresses.  Restrict linux-user to the
  * cached write-back RAM in the system map.
  */
-return 0x8000ul;
+return 0x7ffful;
 } else {
 /*
  * We need to be able to map the commpage.
diff --git a/bsd-user/main.c b/bsd-user/main.c
index 41290e16f9..de413bd1d2 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -67,16 +67,12 @@ bool have_guest_base;
 # if HOST_LONG_BITS > TARGET_VIRT_ADDR_SPACE_BITS
 #  if TARGET_VIRT_ADDR_SPACE_BITS == 32 && \
   (TARGET_LONG_BITS == 32 || defined(TARGET_ABI32))
-/*
- * There are a number of places where we assign reserved_va to a variable
- * of type abi_ulong and expect it to fit.  Avoid the last page.
- */
-#   define MAX_RESERVED_VA  (0xul & TARGET_PAGE_MASK)
+#   define MAX_RESERVED_VA  0xul
 #  else
-#   define MAX_RESERVED_VA  (1ul << TARGET_VIRT_ADDR_SPACE_BITS)
+#   define MAX_RESERVED_VA  ((1ul << TARGET_VIRT_ADDR_SPACE_BITS) - 1)
 #  endif
 # else
-#  define MAX_RESERVED_VA  0
+#  define MAX_RESERVED_VA   (-1ul)
 # endif
 #endif
 
@@ -86,9 +82,9 @@ bool have_guest_base;
  * if directed by the command-line option, but not by default.
  */
 #if HOST_LONG_BITS == 64 && TARGET_VIRT_ADDR_SPACE_BITS <= 32
-unsigned long reserved_va = MAX_RESERVED_VA;
+unsigned long max_reserved_va = MAX_RESERVED_VA;
 #else
-unsigned long reserved_va;
+unsigned long max_reserved_va;
 #endif
 
 static const char *interp_prefix = CONFIG_QEMU_INTERP_PREFIX;
@@ -464,8 +460,8 @@ int main(int argc, char **argv)
 target_environ = envlist_to_environ(envlist, NULL);
 envlist_free(envlist);
 
-if (reserved_va) {
-mmap_next_start = reserved_va;
+if (max_reserved_va) {
+mmap_next_start = max_reserved_va;
 }
 
 {
diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index d6c5a344c9..e9a330d599 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -227,14 +227,14 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, 
abi_ulong size,
 int prot;
 int looped = 0;
 
-if (size > reserved_va) {
+if (size > max_reserved_va) {
 return (abi_ulong)-1;
 }
 
 size = HOST_PAGE_ALIGN(size) + alignment;
 end_addr = start + size;
-if (end_addr > reserved_va) {
-end_addr = reserved_va;
+if (end_addr > max_reserved_va) {
+end_addr = max_reserved_va + 1;
 }
 addr = end_addr - qemu_host_page_size;
 
@@ -243,7 +243,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong 

[PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528]

2023-03-05 Thread Richard Henderson
The primary issue is that of overflow, where "end" for the last
page of the 32-bit address space overflows to 0.  The fix is to
use "last" instead, which can always be represented.

This requires that we adjust reserved_va as well, because of

-/*
- * There are a number of places where we assign reserved_va to a variable
- * of type abi_ulong and expect it to fit.  Avoid the last page.
- */
-#   define MAX_RESERVED_VA  (0xul & TARGET_PAGE_MASK)

and the related

-/*
- * reserved_va must be aligned with the host page size
- * as it is used with mmap()
- */
-reserved_va = local_max_va & qemu_host_page_mask;

whereby we avoided the final (host | guest) page of the address space
because of said overflow.  With the change in representation, we can
always use UINT32_MAX as the end of the 32-bit address space.

This was observable on ppc64le (or any other 64k page host) not being
able to load any arm32 binary, because the COMMPAGE goes at 0x,
which violated that last host page problem above.

The issue is resolved in patch 4, but the rest clean up other interfaces
with the same issue.  I'm not touching any interfaces that use start+len
instead of start+end.
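
To make the overflow concrete, a small standalone sketch follows. The helper names (ex_range_end, ex_range_last, ex_page_last) are hypothetical and do not appear in the series; the address and mask values are chosen only for the example, with uint32_t standing in for a 32-bit guest address type.

```c
#include <assert.h>
#include <stdint.h>

/* "end" convention: first byte past the range.  Wraps to 0 when the
 * range reaches the top of the 32-bit space. */
static uint32_t ex_range_end(uint32_t start, uint32_t len)
{
    return start + len;
}

/* "last" convention: final byte of the range.  Always representable
 * for a non-empty range that fits in the address space. */
static uint32_t ex_range_last(uint32_t start, uint32_t len)
{
    assert(len != 0);
    return start + (len - 1);
}

/* Last byte of the page containing addr, for a given page mask. */
static uint32_t ex_page_last(uint32_t addr, uint32_t page_mask)
{
    return addr | ~page_mask;
}
```

With a 64 KiB page starting at 0xffff0000, ex_range_end() yields 0, so any "start < end" check fails, whereas ex_range_last() and ex_page_last() both yield 0xffffffff.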


r~


Richard Henderson (9):
  linux-user: Diagnose incorrect -R size
  linux-user: Rename max_reserved_va in main
  include/exec: Replace reserved_va with max_reserved_va
  accel/tcg: Pass last not end to page_set_flags
  accel/tcg: Pass last not end to page_reset_target_data
  accel/tcg: Pass last not end to PAGE_FOR_EACH_TB
  accel/tcg: Pass last not end to page_collection_lock
  accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked
  accel/tcg: Pass last not end to tb_invalidate_phys_range

 include/exec/cpu-all.h  | 19 ++--
 include/exec/exec-all.h |  2 +-
 linux-user/arm/target_cpu.h |  2 +-
 accel/tcg/tb-maint.c| 95 +++--
 accel/tcg/translate-all.c   |  2 +-
 accel/tcg/user-exec.c   | 25 +-
 bsd-user/main.c | 18 +++
 bsd-user/mmap.c | 18 +++
 bsd-user/signal.c   |  4 +-
 linux-user/elfload.c| 47 +-
 linux-user/main.c   | 44 +
 linux-user/mmap.c   | 38 +++
 linux-user/signal.c |  4 +-
 linux-user/syscall.c|  4 +-
 softmmu/physmem.c   |  2 +-
 target/arm/cpu.c|  2 +-
 16 files changed, 169 insertions(+), 157 deletions(-)

-- 
2.34.1




[PATCH 8/9] accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked

2023-03-05 Thread Richard Henderson
Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Properly truncate tb_last to the end of the page; the comment about
tb_end being past the end of the page being ok is not correct,
considering overflow.
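
A rough standalone sketch of the two ideas, clamping to the end of the page and testing overlap on inclusive bounds, is given below. The ex_ names and the 4 KiB page mask are assumptions for illustration, and the sketch assumes a block never wraps past the top of the address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration only: 4 KiB pages. */
#define EX_PAGE_MASK 0xfffff000u

/* Last byte of a block, truncated to the page that holds its first byte. */
static uint32_t ex_tb_last_in_page(uint32_t tb_start, uint32_t size)
{
    uint32_t tb_last, page_last;

    assert(size != 0);
    tb_last = tb_start + size - 1;
    page_last = tb_start | ~EX_PAGE_MASK;
    return tb_last < page_last ? tb_last : page_last;
}

/* Overlap test for two inclusive ranges [a_start, a_last], [b_start, b_last]. */
static bool ex_overlaps(uint32_t a_start, uint32_t a_last,
                        uint32_t b_start, uint32_t b_last)
{
    return !(a_last < b_start || a_start > b_last);
}
```

Because both bounds are inclusive, the comparison uses "<" and ">" rather than "<=" and ">=", and it stays correct when either range ends at 0xffffffff.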

Signed-off-by: Richard Henderson 
---
 accel/tcg/tb-maint.c | 26 --
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index c4e15c5591..a93c4c3ef7 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -1082,35 +1082,33 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t 
addr, uintptr_t pc)
 static void
 tb_invalidate_phys_page_range__locked(struct page_collection *pages,
   PageDesc *p, tb_page_addr_t start,
-  tb_page_addr_t end,
+  tb_page_addr_t last,
   uintptr_t retaddr)
 {
 TranslationBlock *tb;
-tb_page_addr_t tb_start, tb_end;
 PageForEachNext n;
 #ifdef TARGET_HAS_PRECISE_SMC
 bool current_tb_modified = false;
 TranslationBlock *current_tb = retaddr ? tcg_tb_lookup(retaddr) : NULL;
 #endif /* TARGET_HAS_PRECISE_SMC */
-tb_page_addr_t last G_GNUC_UNUSED = end - 1;
 
 /*
- * We remove all the TBs in the range [start, end[.
+ * We remove all the TBs in the range [start, last].
  * XXX: see if in some cases it could be faster to invalidate all the code
  */
 PAGE_FOR_EACH_TB(start, last, p, tb, n) {
+tb_page_addr_t tb_start, tb_last;
+
 /* NOTE: this is subtle as a TB may span two physical pages */
+tb_start = tb_page_addr0(tb);
+tb_last = tb_start + tb->size - 1;
 if (n == 0) {
-/* NOTE: tb_end may be after the end of the page, but
-   it is not a problem */
-tb_start = tb_page_addr0(tb);
-tb_end = tb_start + tb->size;
+tb_last = MIN(tb_last, tb_start | ~TARGET_PAGE_MASK);
 } else {
 tb_start = tb_page_addr1(tb);
-tb_end = tb_start + ((tb_page_addr0(tb) + tb->size)
- & ~TARGET_PAGE_MASK);
+tb_last = tb_start + (tb_last & ~TARGET_PAGE_MASK);
 }
-if (!(tb_end <= start || tb_start >= end)) {
+if (!(tb_last < start || tb_start > last)) {
 #ifdef TARGET_HAS_PRECISE_SMC
 if (current_tb == tb &&
 (tb_cflags(current_tb) & CF_COUNT_MASK) != 1) {
@@ -1163,7 +1161,7 @@ void tb_invalidate_phys_page(tb_page_addr_t addr)
 start = addr & TARGET_PAGE_MASK;
 last = addr | ~TARGET_PAGE_MASK;
 pages = page_collection_lock(start, last);
-tb_invalidate_phys_page_range__locked(pages, p, start, last + 1, 0);
+tb_invalidate_phys_page_range__locked(pages, p, start, last, 0);
 page_collection_unlock(pages);
 }
 
@@ -1190,7 +1188,7 @@ void tb_invalidate_phys_range(tb_page_addr_t start, 
tb_page_addr_t end)
 continue;
 }
 assert_page_locked(pd);
-tb_invalidate_phys_page_range__locked(pages, pd, start, bound, 0);
+tb_invalidate_phys_page_range__locked(pages, pd, start, bound - 1, 0);
 }
 page_collection_unlock(pages);
 }
@@ -1210,7 +1208,7 @@ static void tb_invalidate_phys_page_fast__locked(struct 
page_collection *pages,
 }
 
 assert_page_locked(p);
-tb_invalidate_phys_page_range__locked(pages, p, start, start + len, ra);
+tb_invalidate_phys_page_range__locked(pages, p, start, start + len - 1, 
ra);
 }
 
 /*
-- 
2.34.1




[PATCH 5/9] accel/tcg: Pass last not end to page_reset_target_data

2023-03-05 Thread Richard Henderson
Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Signed-off-by: Richard Henderson 
---
 include/exec/cpu-all.h |  2 +-
 accel/tcg/user-exec.c  | 11 +--
 linux-user/mmap.c  |  2 +-
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 748764459c..a8cb4c905d 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -286,7 +286,7 @@ int walk_memory_regions(void *, walk_memory_regions_fn);
 
 int page_get_flags(target_ulong address);
 void page_set_flags(target_ulong start, target_ulong last, int flags);
-void page_reset_target_data(target_ulong start, target_ulong end);
+void page_reset_target_data(target_ulong start, target_ulong last);
 int page_check_range(target_ulong start, target_ulong len, int flags);
 
 /**
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 035f8096b2..20b6fc2f6e 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -508,7 +508,7 @@ void page_set_flags(target_ulong start, target_ulong last, 
int flags)
 }
 
 if (!flags || reset) {
-page_reset_target_data(start, last + 1);
+page_reset_target_data(start, last);
 inval_tb |= pageflags_unset(start, last);
 }
 if (flags) {
@@ -814,15 +814,14 @@ typedef struct TargetPageDataNode {
 
 static IntervalTreeRoot targetdata_root;
 
-void page_reset_target_data(target_ulong start, target_ulong end)
+void page_reset_target_data(target_ulong start, target_ulong last)
 {
 IntervalTreeNode *n, *next;
-target_ulong last;
 
 assert_memory_lock();
 
-start = start & TARGET_PAGE_MASK;
-last = TARGET_PAGE_ALIGN(end) - 1;
+start &= TARGET_PAGE_MASK;
+last |= ~TARGET_PAGE_MASK;
 
 for (n = interval_tree_iter_first(&targetdata_root, start, last),
  next = n ? interval_tree_iter_next(n, start, last) : NULL;
@@ -885,7 +884,7 @@ void *page_get_target_data(target_ulong address)
 return t->data[(page - region) >> TARGET_PAGE_BITS];
 }
 #else
-void page_reset_target_data(target_ulong start, target_ulong end) { }
+void page_reset_target_data(target_ulong start, target_ulong last) { }
 #endif /* TARGET_PAGE_DATA_SIZE */
 
 /* The softmmu versions of these helpers are in cputlb.c.  */
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 9cf85f4090..c153277afb 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -946,7 +946,7 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, 
int advice)
 if (can_passthrough_madvise(start, end)) {
 ret = get_errno(madvise(g2h_untagged(start), len, advice));
 if ((advice == MADV_DONTNEED) && (ret == 0)) {
-page_reset_target_data(start, start + len);
+page_reset_target_data(start, start + len - 1);
 }
 }
 }
-- 
2.34.1




[PATCH v2 5/5] accel/tcg: Remove check_tcg_memory_orders_compatible

2023-03-05 Thread Richard Henderson
We now issue host memory barriers to match the guest memory order.
Continue to disable MTTCG only if the guest has not been ported.

Signed-off-by: Richard Henderson 
---
 accel/tcg/tcg-all.c | 34 --
 1 file changed, 8 insertions(+), 26 deletions(-)

diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index 604efd1b18..f6b44548cc 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -61,33 +61,20 @@ DECLARE_INSTANCE_CHECKER(TCGState, TCG_STATE,
  * they can set the appropriate CONFIG flags in ${target}-softmmu.mak
  *
  * Once a guest architecture has been converted to the new primitives
- * there are two remaining limitations to check.
- *
- * - The guest can't be oversized (e.g. 64 bit guest on 32 bit host)
- * - The host must have a stronger memory order than the guest
- *
- * It may be possible in future to support strong guests on weak hosts
- * but that will require tagging all load/stores in a guest with their
- * implicit memory order requirements which would likely slow things
- * down a lot.
+ * there is one remaining limitation to check:
+ *   - The guest can't be oversized (e.g. 64 bit guest on 32 bit host)
  */
 
-static bool check_tcg_memory_orders_compatible(void)
-{
-return tcg_req_mo(TCG_MO_ALL) == 0;
-}
-
 static bool default_mttcg_enabled(void)
 {
 if (icount_enabled() || TCG_OVERSIZED_GUEST) {
 return false;
-} else {
-#ifdef TARGET_SUPPORTS_MTTCG
-return check_tcg_memory_orders_compatible();
-#else
-return false;
-#endif
 }
+#if defined(TARGET_SUPPORTS_MTTCG) && defined(TCG_GUEST_DEFAULT_MO)
+return true;
+#else
+return false;
+#endif
 }
 
 static void tcg_accel_instance_init(Object *obj)
@@ -150,15 +137,10 @@ static void tcg_set_thread(Object *obj, const char 
*value, Error **errp)
 } else if (icount_enabled()) {
 error_setg(errp, "No MTTCG when icount is enabled");
 } else {
-#ifndef TARGET_SUPPORTS_MTTCG
+#if !(defined(TARGET_SUPPORTS_MTTCG) && defined(TCG_GUEST_DEFAULT_MO))
 warn_report("Guest not yet converted to MTTCG - "
 "you may get unexpected results");
 #endif
-if (!check_tcg_memory_orders_compatible()) {
-warn_report("Guest expects a stronger memory ordering "
-"than the host provides");
-error_printf("This may cause strange/hard to debug errors\n");
-}
 s->mttcg_enabled = true;
 }
 } else if (strcmp(value, "single") == 0) {
-- 
2.34.1




[PATCH v2 3/5] tcg: Create tcg_req_mo

2023-03-05 Thread Richard Henderson
Split out the logic to emit a host memory barrier in response to
a guest memory operation.  Do not provide a true default for
TCG_GUEST_DEFAULT_MO because the defined() check will still be
useful for determining if a guest has been updated for MTTCG.
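
The filtering described here can be sketched as a plain function. This is an illustration only: the EX_MO_* bit values and the ex_req_mo name are assumptions for this example, not the definitions from tcg.h, and the real interface is a macro rather than a function.

```c
#include <assert.h>

/* Hypothetical ordering bits: "earlier access _ later access". */
enum {
    EX_MO_LD_LD = 0x01,
    EX_MO_ST_LD = 0x02,
    EX_MO_LD_ST = 0x04,
    EX_MO_ST_ST = 0x08,
    EX_MO_ALL   = 0x0f,
};

/*
 * Keep only the orderings that the guest requires and that the host
 * does not already guarantee.  A non-zero result means a barrier is needed.
 */
static unsigned ex_req_mo(unsigned type, unsigned guest_mo, unsigned host_mo)
{
    return type & guest_mo & ~host_mo;
}

/* Usage examples, checked with assert(). */
static void ex_req_mo_examples(void)
{
    /* Strongly ordered guest on a host that guarantees nothing. */
    assert(ex_req_mo(EX_MO_LD_LD | EX_MO_ST_LD, EX_MO_ALL, 0)
           == (EX_MO_LD_LD | EX_MO_ST_LD));
    /* Host guarantees everything except store-then-load ordering. */
    assert(ex_req_mo(EX_MO_ALL, EX_MO_ALL, EX_MO_ALL & ~EX_MO_ST_LD)
           == EX_MO_ST_LD);
    /* Guest requires no ordering: no barrier. */
    assert(ex_req_mo(EX_MO_ALL, 0, 0) == 0);
}
```

When the guest ordering is not defined, the patch treats it as if every ordering were required, which corresponds to passing EX_MO_ALL as guest_mo in this sketch.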

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h   | 20 
 accel/tcg/tcg-all.c |  6 +-
 tcg/tcg-op.c|  8 +---
 3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index a5cf21be83..b76b597878 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -1171,6 +1171,26 @@ static inline size_t tcg_current_code_size(TCGContext *s)
 return tcg_ptr_byte_diff(s->code_ptr, s->code_buf);
 }
 
+/**
+ * tcg_req_mo:
+ * @type: TCGBar
+ *
+ * Filter @type to the barrier that is required for the guest
+ * memory ordering vs the host memory ordering.  A non-zero
+ * result indicates that some barrier is required.
+ *
+ * If TCG_GUEST_DEFAULT_MO is not defined, assume that the
+ * guest requires strict ordering.
+ *
+ * This is a macro so that it's constant even without optimization.
+ */
+#ifdef TCG_GUEST_DEFAULT_MO
+# define tcg_req_mo(type) \
+((type) & TCG_GUEST_DEFAULT_MO & ~TCG_TARGET_DEFAULT_MO)
+#else
+# define tcg_req_mo(type) ((type) & ~TCG_TARGET_DEFAULT_MO)
+#endif
+
 /**
  * tcg_qemu_tb_exec:
  * @env: pointer to CPUArchState for the CPU
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index 5dab1ae9dd..604efd1b18 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -74,11 +74,7 @@ DECLARE_INSTANCE_CHECKER(TCGState, TCG_STATE,
 
 static bool check_tcg_memory_orders_compatible(void)
 {
-#if defined(TCG_GUEST_DEFAULT_MO) && defined(TCG_TARGET_DEFAULT_MO)
-return (TCG_GUEST_DEFAULT_MO & ~TCG_TARGET_DEFAULT_MO) == 0;
-#else
-return false;
-#endif
+return tcg_req_mo(TCG_MO_ALL) == 0;
 }
 
 static bool default_mttcg_enabled(void)
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 2721c1cab9..d6faf30c52 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2930,13 +2930,7 @@ static void gen_ldst_i64(TCGOpcode opc, TCGv_i64 val, 
TCGv addr,
 
 static void tcg_gen_req_mo(TCGBar type)
 {
-#ifdef TCG_GUEST_DEFAULT_MO
-type &= TCG_GUEST_DEFAULT_MO;
-#endif
-type &= ~TCG_TARGET_DEFAULT_MO;
-if (type) {
-tcg_gen_mb(type | TCG_BAR_SC);
-}
+tcg_gen_mb(tcg_req_mo(type) | TCG_BAR_SC);
 }
 
 static inline TCGv plugin_prep_mem_callbacks(TCGv vaddr)
-- 
2.34.1




[PATCH v2 4/5] tcg: Add host memory barriers to cpu_ldst.h interfaces

2023-03-05 Thread Richard Henderson
Bring the majority of helpers into line with the rest of
tcg in respecting guest memory ordering.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h | 14 ++
 accel/tcg/cputlb.c|  2 ++
 accel/tcg/user-exec.c | 14 ++
 3 files changed, 30 insertions(+)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index b76b597878..0edda5f89f 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -1191,6 +1191,20 @@ static inline size_t tcg_current_code_size(TCGContext *s)
 # define tcg_req_mo(type) ((type) & ~TCG_TARGET_DEFAULT_MO)
 #endif
 
+/**
+ * cpu_req_mo:
+ * @type: TCGBar
+ *
+ * If tcg_req_mo indicates a barrier for @type is required for the
+ * guest memory model, issue a host memory barrier.
+ */
+#define cpu_req_mo(type)  \
+do {  \
+if (tcg_req_mo(type)) {   \
+smp_mb(); \
+} \
+} while (0)
+
 /**
  * tcg_qemu_tb_exec:
  * @env: pointer to CPUArchState for the CPU
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index e984a98dc4..6a04514427 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2174,6 +2174,7 @@ static inline uint64_t cpu_load_helper(CPUArchState *env, 
abi_ptr addr,
 {
 uint64_t ret;
 
+cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
 ret = full_load(env, addr, oi, retaddr);
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
 return ret;
@@ -2586,6 +2587,7 @@ static inline void cpu_store_helper(CPUArchState *env, 
target_ulong addr,
 uint64_t val, MemOpIdx oi, uintptr_t ra,
 FullStoreHelper *full_store)
 {
+cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
 full_store(env, addr, val, oi, ra);
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W);
 }
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 7b37fd229e..489459ae17 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -942,6 +942,7 @@ uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr,
 
 validate_memop(oi, MO_UB);
 haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD);
+cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
 ret = ldub_p(haddr);
 clear_helper_retaddr();
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
@@ -956,6 +957,7 @@ uint16_t cpu_ldw_be_mmu(CPUArchState *env, abi_ptr addr,
 
 validate_memop(oi, MO_BEUW);
 haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD);
+cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
 ret = lduw_be_p(haddr);
 clear_helper_retaddr();
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
@@ -970,6 +972,7 @@ uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr,
 
 validate_memop(oi, MO_BEUL);
 haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD);
+cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
 ret = ldl_be_p(haddr);
 clear_helper_retaddr();
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
@@ -984,6 +987,7 @@ uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr,
 
 validate_memop(oi, MO_BEUQ);
 haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD);
+cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
 ret = ldq_be_p(haddr);
 clear_helper_retaddr();
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
@@ -998,6 +1002,7 @@ uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr,
 
 validate_memop(oi, MO_LEUW);
 haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD);
+cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
 ret = lduw_le_p(haddr);
 clear_helper_retaddr();
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
@@ -1012,6 +1017,7 @@ uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr,
 
 validate_memop(oi, MO_LEUL);
 haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD);
+cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
 ret = ldl_le_p(haddr);
 clear_helper_retaddr();
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
@@ -1026,6 +1032,7 @@ uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr,
 
 validate_memop(oi, MO_LEUQ);
 haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD);
+cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
 ret = ldq_le_p(haddr);
 clear_helper_retaddr();
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
@@ -1075,6 +1082,7 @@ void cpu_stb_mmu(CPUArchState *env, abi_ptr addr, uint8_t 
val,
 
 validate_memop(oi, MO_UB);
 haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE);
+cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
 stb_p(haddr, val);
 clear_helper_retaddr();
 qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W);
@@ -1087,6 +1095,7 @@ void cpu_stw_be_mmu(CPUArchState *env, abi_ptr addr, 
uint16_t val,
 
 validate_memop(oi, MO_BEUW);
 haddr = cpu_mmu_lookup(env, 

[PATCH v2 1/5] tcg: Do not elide memory barriers for !CF_PARALLEL

2023-03-05 Thread Richard Henderson
The virtio devices require proper memory ordering between
the vcpus and the iothreads.

Signed-off-by: Richard Henderson 
---
 tcg/tcg-op.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 77658a88f0..75fdcdaac7 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -102,9 +102,13 @@ void tcg_gen_br(TCGLabel *l)
 
 void tcg_gen_mb(TCGBar mb_type)
 {
-if (tcg_ctx->gen_tb->cflags & CF_PARALLEL) {
-tcg_gen_op1(INDEX_op_mb, mb_type);
-}
+/*
+ * It is tempting to elide the barrier in a single-threaded context
+ * (i.e. !(cflags & CF_PARALLEL)), however, even with a single cpu
+ * we have i/o threads running in parallel, and lack of memory order
+ * can result in e.g. virtio queue entries being read incorrectly.
+ */
+tcg_gen_op1(INDEX_op_mb, mb_type);
 }
 
 /* 32 bit ops */
-- 
2.34.1




[PATCH v2 2/5] tcg: Elide memory barriers implied by the host memory model

2023-03-05 Thread Richard Henderson
Reduce the set of required barriers to those needed by
the host right from the beginning.

Signed-off-by: Richard Henderson 
---
 tcg/tcg-op.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 75fdcdaac7..2721c1cab9 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -107,8 +107,13 @@ void tcg_gen_mb(TCGBar mb_type)
  * (i.e. !(cflags & CF_PARALLEL)), however, even with a single cpu
  * we have i/o threads running in parallel, and lack of memory order
  * can result in e.g. virtio queue entries being read incorrectly.
+ *
+ * That said, we can elide anything which the host provides for free.
  */
-tcg_gen_op1(INDEX_op_mb, mb_type);
+mb_type &= ~TCG_TARGET_DEFAULT_MO;
+if (mb_type & TCG_MO_ALL) {
+tcg_gen_op1(INDEX_op_mb, mb_type);
+}
 }
 
 /* 32 bit ops */
-- 
2.34.1




[PATCH v2 0/5] tcg: Issue memory barriers for guest memory model

2023-03-05 Thread Richard Henderson
Version 1 was very nearly 2 years ago:
https://lore.kernel.org/qemu-devel/20210316220735.2048137-1-richard.hender...@linaro.org/

I didn't pursue it at the time because it didn't actually
fix the s390x-on-aarch64 problems.  I'm re-posting this now because
of Paolo's "missing barriers on ARM" patch set.

It was never very easy to trigger the s390x problem, but with the two
patch sets I've been unable to do so all day.



r~


Richard Henderson (5):
  tcg: Do not elide memory barriers for !CF_PARALLEL
  tcg: Elide memory barriers implied by the host memory model
  tcg: Create tcg_req_mo
  tcg: Add host memory barriers to cpu_ldst.h interfaces
  accel/tcg: Remove check_tcg_memory_orders_compatible

 include/tcg/tcg.h | 34 ++
 accel/tcg/cputlb.c|  2 ++
 accel/tcg/tcg-all.c   | 38 --
 accel/tcg/user-exec.c | 14 ++
 tcg/tcg-op.c  | 19 +++
 5 files changed, 69 insertions(+), 38 deletions(-)

-- 
2.34.1




[PULL 72/84] target/hexagon/idef-parser: Use gen_tmp for gen_rvalue_pred

2023-03-05 Thread Richard Henderson
The allocation is immediately followed by either tcg_gen_mov_i32
or gen_read_preg (which contains tcg_gen_mov_i32), so the zero
initialization is immediately discarded.

Reviewed-by: Taylor Simpson 
Signed-off-by: Richard Henderson 
---
 target/hexagon/idef-parser/parser-helpers.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hexagon/idef-parser/parser-helpers.c 
b/target/hexagon/idef-parser/parser-helpers.c
index 760e499149..c0e6f2190c 100644
--- a/target/hexagon/idef-parser/parser-helpers.c
+++ b/target/hexagon/idef-parser/parser-helpers.c
@@ -1889,7 +1889,7 @@ HexValue gen_rvalue_pred(Context *c, YYLTYPE *locp, 
HexValue *pred)
 bool is_dotnew = pred->is_dotnew;
 char predicate_id[2] = { pred->pred.id, '\0' };
 char *pred_str = (char *) &predicate_id;
-*pred = gen_tmp_value(c, locp, "0", 32, UNSIGNED);
+*pred = gen_tmp(c, locp, 32, UNSIGNED);
 if (is_dotnew) {
 OUT(c, locp, "tcg_gen_mov_i32(", pred,
 ", hex_new_pred_value[");
-- 
2.34.1




[PULL 65/84] target/tricore: Drop tcg_temp_free

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/tricore/translate.c | 540 +
 1 file changed, 4 insertions(+), 536 deletions(-)

diff --git a/target/tricore/translate.c b/target/tricore/translate.c
index 176ea96b2b..127f9a989a 100644
--- a/target/tricore/translate.c
+++ b/target/tricore/translate.c
@@ -126,7 +126,6 @@ void tricore_cpu_dump_state(CPUState *cs, FILE *f, int 
flags)
 #define gen_helper_1arg(name, arg) do {   \
 TCGv_i32 helper_tmp = tcg_const_i32(arg); \
 gen_helper_##name(cpu_env, helper_tmp);   \
-tcg_temp_free_i32(helper_tmp);\
 } while (0)
 
 #define GEN_HELPER_LL(name, ret, arg0, arg1, n) do { \
@@ -137,9 +136,6 @@ void tricore_cpu_dump_state(CPUState *cs, FILE *f, int 
flags)
 tcg_gen_ext16s_tl(arg01, arg0);  \
 tcg_gen_ext16s_tl(arg11, arg1);  \
 gen_helper_##name(ret, arg00, arg01, arg11, arg11, n);   \
-tcg_temp_free(arg00);\
-tcg_temp_free(arg01);\
-tcg_temp_free(arg11);\
 } while (0)
 
 #define GEN_HELPER_LU(name, ret, arg0, arg1, n) do { \
@@ -152,10 +148,6 @@ void tricore_cpu_dump_state(CPUState *cs, FILE *f, int 
flags)
 tcg_gen_sari_tl(arg11, arg1, 16);\
 tcg_gen_ext16s_tl(arg10, arg1);  \
 gen_helper_##name(ret, arg00, arg01, arg10, arg11, n);   \
-tcg_temp_free(arg00);\
-tcg_temp_free(arg01);\
-tcg_temp_free(arg10);\
-tcg_temp_free(arg11);\
 } while (0)
 
 #define GEN_HELPER_UL(name, ret, arg0, arg1, n) do { \
@@ -168,10 +160,6 @@ void tricore_cpu_dump_state(CPUState *cs, FILE *f, int 
flags)
 tcg_gen_sari_tl(arg10, arg1, 16);\
 tcg_gen_ext16s_tl(arg11, arg1);  \
 gen_helper_##name(ret, arg00, arg01, arg10, arg11, n);   \
-tcg_temp_free(arg00);\
-tcg_temp_free(arg01);\
-tcg_temp_free(arg10);\
-tcg_temp_free(arg11);\
 } while (0)
 
 #define GEN_HELPER_UU(name, ret, arg0, arg1, n) do { \
@@ -182,9 +170,6 @@ void tricore_cpu_dump_state(CPUState *cs, FILE *f, int 
flags)
 tcg_gen_ext16s_tl(arg00, arg0);  \
 tcg_gen_sari_tl(arg11, arg1, 16);\
 gen_helper_##name(ret, arg00, arg01, arg11, arg11, n);   \
-tcg_temp_free(arg00);\
-tcg_temp_free(arg01);\
-tcg_temp_free(arg11);\
 } while (0)
 
 #define GEN_HELPER_RRR(name, rl, rh, al1, ah1, arg2) do {\
@@ -194,9 +179,6 @@ void tricore_cpu_dump_state(CPUState *cs, FILE *f, int 
flags)
 tcg_gen_concat_i32_i64(arg1, al1, ah1);  \
 gen_helper_##name(ret, arg1, arg2);  \
 tcg_gen_extr_i64_i32(rl, rh, ret);   \
- \
-tcg_temp_free_i64(ret);  \
-tcg_temp_free_i64(arg1); \
 } while (0)
 
 #define GEN_HELPER_RR(name, rl, rh, arg1, arg2) do {\
@@ -204,8 +186,6 @@ void tricore_cpu_dump_state(CPUState *cs, FILE *f, int 
flags)
 \
 gen_helper_##name(ret, cpu_env, arg1, arg2);\
 tcg_gen_extr_i64_i32(rl, rh, ret);  \
-\
-tcg_temp_free_i64(ret); \
 } while (0)
 
 #define EA_ABS_FORMAT(con) (((con & 0x3C000) << 14) + (con & 0x3FFF))
@@ -229,7 +209,6 @@ static inline void gen_offset_ld(DisasContext *ctx, TCGv 
r1, TCGv r2,
 TCGv temp = tcg_temp_new();
 tcg_gen_addi_tl(temp, r2, con);
 tcg_gen_qemu_ld_tl(r1, temp, ctx->mem_idx, mop);
-tcg_temp_free(temp);
 }
 
 static inline void gen_offset_st(DisasContext *ctx, TCGv r1, TCGv r2,
@@ -238,7 +217,6 @@ static inline void gen_offset_st(DisasContext *ctx, TCGv 
r1, TCGv r2,
 TCGv temp = tcg_temp_new();
 tcg_gen_addi_tl(temp, r2, con);
 tcg_gen_qemu_st_tl(r1, temp, ctx->mem_idx, mop);
-tcg_temp_free(temp);
 }
 
 static void gen_st_2regs_64(TCGv rh, TCGv rl, TCGv address, DisasContext *ctx)
@@ -247,8 +225,6 @@ static void gen_st_2regs_64(TCGv rh, TCGv rl, TCGv address, 
DisasContext *ctx)
 
 tcg_gen_concat_i32_i64(temp, rl, rh);

[PULL 51/84] target/riscv: Drop temp_new

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries,
therefore there's no need to record temps for later freeing.
Replace the few uses with tcg_temp_new.

Reviewed-by: Weiwei Li 
Reviewed-by: Daniel Henrique Barboza 
Signed-off-by: Richard Henderson 
---
 target/riscv/translate.c  | 30 +--
 target/riscv/insn_trans/trans_rvzfh.c.inc |  2 +-
 2 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 7ed625a36f..747989ecad 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -101,11 +101,8 @@ typedef struct DisasContext {
 bool cfg_vta_all_1s;
 target_ulong vstart;
 bool vl_eq_vlmax;
-uint8_t ntemp;
 CPUState *cs;
 TCGv zero;
-/* Space for 3 operands plus 1 extra for address computation. */
-TCGv temp[4];
 /* PointerMasking extension */
 bool pm_mask_enabled;
 bool pm_base_enabled;
@@ -312,12 +309,6 @@ static void gen_goto_tb(DisasContext *ctx, int n, 
target_ulong dest)
  *
  * Further, we may provide an extension for word operations.
  */
-static TCGv temp_new(DisasContext *ctx)
-{
-assert(ctx->ntemp < ARRAY_SIZE(ctx->temp));
-return ctx->temp[ctx->ntemp++] = tcg_temp_new();
-}
-
 static TCGv get_gpr(DisasContext *ctx, int reg_num, DisasExtend ext)
 {
 TCGv t;
@@ -332,11 +323,11 @@ static TCGv get_gpr(DisasContext *ctx, int reg_num, 
DisasExtend ext)
 case EXT_NONE:
 break;
 case EXT_SIGN:
-t = temp_new(ctx);
+t = tcg_temp_new();
 tcg_gen_ext32s_tl(t, cpu_gpr[reg_num]);
 return t;
 case EXT_ZERO:
-t = temp_new(ctx);
+t = tcg_temp_new();
 tcg_gen_ext32u_tl(t, cpu_gpr[reg_num]);
 return t;
 default:
@@ -364,7 +355,7 @@ static TCGv get_gprh(DisasContext *ctx, int reg_num)
 static TCGv dest_gpr(DisasContext *ctx, int reg_num)
 {
 if (reg_num == 0 || get_olen(ctx) < TARGET_LONG_BITS) {
-return temp_new(ctx);
+return tcg_temp_new();
 }
 return cpu_gpr[reg_num];
 }
@@ -372,7 +363,7 @@ static TCGv dest_gpr(DisasContext *ctx, int reg_num)
 static TCGv dest_gprh(DisasContext *ctx, int reg_num)
 {
 if (reg_num == 0) {
-return temp_new(ctx);
+return tcg_temp_new();
 }
 return cpu_gprh[reg_num];
 }
@@ -575,7 +566,7 @@ static void gen_jal(DisasContext *ctx, int rd, target_ulong 
imm)
 /* Compute a canonical address from a register plus offset. */
 static TCGv get_address(DisasContext *ctx, int rs1, int imm)
 {
-TCGv addr = temp_new(ctx);
+TCGv addr = tcg_temp_new();
 TCGv src1 = get_gpr(ctx, rs1, EXT_NONE);
 
 tcg_gen_addi_tl(addr, src1, imm);
@@ -593,7 +584,7 @@ static TCGv get_address(DisasContext *ctx, int rs1, int imm)
 /* Compute a canonical address from a register plus reg offset. */
 static TCGv get_address_indexed(DisasContext *ctx, int rs1, TCGv offs)
 {
-TCGv addr = temp_new(ctx);
+TCGv addr = tcg_temp_new();
 TCGv src1 = get_gpr(ctx, rs1, EXT_NONE);
 
 tcg_gen_add_tl(addr, src1, offs);
@@ -1197,8 +1188,6 @@ static void riscv_tr_init_disas_context(DisasContextBase 
*dcbase, CPUState *cs)
 ctx->misa_mxl_max = env->misa_mxl_max;
 ctx->xl = FIELD_EX32(tb_flags, TB_FLAGS, XL);
 ctx->cs = cs;
-ctx->ntemp = 0;
-memset(ctx->temp, 0, sizeof(ctx->temp));
 ctx->pm_mask_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_MASK_ENABLED);
 ctx->pm_base_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_BASE_ENABLED);
 ctx->itrigger = FIELD_EX32(tb_flags, TB_FLAGS, ITRIGGER);
@@ -1223,18 +1212,11 @@ static void riscv_tr_translate_insn(DisasContextBase 
*dcbase, CPUState *cpu)
 DisasContext *ctx = container_of(dcbase, DisasContext, base);
 CPURISCVState *env = cpu->env_ptr;
 uint16_t opcode16 = translator_lduw(env, &ctx->base, ctx->base.pc_next);
-int i;
 
 ctx->ol = ctx->xl;
 decode_opc(env, ctx, opcode16);
 ctx->base.pc_next = ctx->pc_succ_insn;
 
-for (i = ctx->ntemp - 1; i >= 0; --i) {
-tcg_temp_free(ctx->temp[i]);
-ctx->temp[i] = NULL;
-}
-ctx->ntemp = 0;
-
 /* Only the first insn within a TB is allowed to cross a page boundary. */
 if (ctx->base.is_jmp == DISAS_NEXT) {
 if (ctx->itrigger || !is_same_page(&ctx->base, ctx->base.pc_next)) {
diff --git a/target/riscv/insn_trans/trans_rvzfh.c.inc 
b/target/riscv/insn_trans/trans_rvzfh.c.inc
index 85fc1aa822..f0d4df05f0 100644
--- a/target/riscv/insn_trans/trans_rvzfh.c.inc
+++ b/target/riscv/insn_trans/trans_rvzfh.c.inc
@@ -51,7 +51,7 @@ static bool trans_flh(DisasContext *ctx, arg_flh *a)
 decode_save_opc(ctx);
 t0 = get_gpr(ctx, a->rs1, EXT_NONE);
 if (a->imm) {
-TCGv temp = temp_new(ctx);
+TCGv temp = tcg_temp_new();
 tcg_gen_addi_tl(temp, t0, a->imm);
 t0 = temp;
 }
-- 
2.34.1
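
For readers who have not seen the bookkeeping this patch removes, here is a minimal sketch of the "record temps for later freeing" pattern. It is an illustration only: Temp, alloc_temp(), free_temp() and Ctx are hypothetical stand-ins, not the TCG API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TEMPS 4

typedef struct { int id; } Temp;      /* hypothetical stand-in for a TCG temp */

static Temp storage[16];
static int next_slot;                 /* next unused entry in storage[] */
static int live;                      /* temps handed out and not yet freed */

static Temp *alloc_temp(void)
{
    assert(next_slot < 16);
    live++;
    return &storage[next_slot++];
}

static void free_temp(Temp *t)
{
    (void)t;
    live--;
}

typedef struct {
    Temp *temp[MAX_TEMPS];            /* temps recorded for later freeing */
    unsigned ntemp;
} Ctx;

/* Old pattern: allocate and remember the temp in the context. */
static Temp *ctx_temp_new(Ctx *c)
{
    assert(c->ntemp < MAX_TEMPS);
    return c->temp[c->ntemp++] = alloc_temp();
}

/* Old pattern: free everything recorded, once per translated insn. */
static void ctx_release_all(Ctx *c)
{
    for (int i = (int)c->ntemp - 1; i >= 0; --i) {
        free_temp(c->temp[i]);
        c->temp[i] = NULL;
    }
    c->ntemp = 0;
}
```

Once freeing is no longer required, the array, the counter and the release loop all become dead weight, which is why the patch can replace each call with a plain allocation.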




[PULL 60/84] target/xtensa: Drop reset_sar_tracker

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries.
Remove sar_m32_allocated, as sar_m32 non-null is equivalent.

Reviewed-by: Max Filippov 
Signed-off-by: Richard Henderson 
---
 target/xtensa/translate.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index 4af0650deb..910350dec6 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -57,7 +57,6 @@ struct DisasContext {
 
 bool sar_5bit;
 bool sar_m32_5bit;
-bool sar_m32_allocated;
 TCGv_i32 sar_m32;
 
 unsigned window;
@@ -284,14 +283,7 @@ static void init_sar_tracker(DisasContext *dc)
 {
 dc->sar_5bit = false;
 dc->sar_m32_5bit = false;
-dc->sar_m32_allocated = false;
-}
-
-static void reset_sar_tracker(DisasContext *dc)
-{
-if (dc->sar_m32_allocated) {
-tcg_temp_free(dc->sar_m32);
-}
+dc->sar_m32 = NULL;
 }
 
 static void gen_right_shift_sar(DisasContext *dc, TCGv_i32 sa)
@@ -306,9 +298,8 @@ static void gen_right_shift_sar(DisasContext *dc, TCGv_i32 
sa)
 
 static void gen_left_shift_sar(DisasContext *dc, TCGv_i32 sa)
 {
-if (!dc->sar_m32_allocated) {
+if (!dc->sar_m32) {
 dc->sar_m32 = tcg_temp_new_i32();
-dc->sar_m32_allocated = true;
 }
 tcg_gen_andi_i32(dc->sar_m32, sa, 0x1f);
 tcg_gen_sub_i32(cpu_SR[SAR], tcg_constant_i32(32), dc->sar_m32);
@@ -1247,7 +1238,6 @@ static void xtensa_tr_tb_stop(DisasContextBase *dcbase, 
CPUState *cpu)
 {
 DisasContext *dc = container_of(dcbase, DisasContext, base);
 
-reset_sar_tracker(dc);
 if (dc->icount) {
 tcg_temp_free(dc->next_icount);
 }
-- 
2.34.1
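
The equivalence the commit message relies on (a non-null pointer doubles as the "allocated" flag) can be sketched in isolation. This is an illustration under assumed names: Temp, temp_new() and Tracker are hypothetical and only mimic the shape of the code in the diff.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int value; } Temp;   /* hypothetical stand-in for TCGv_i32 */

static Temp pool[4];
static int pool_used;

static Temp *temp_new(void)
{
    assert(pool_used < 4);
    return &pool[pool_used++];
}

typedef struct {
    Temp *sar_m32;                    /* NULL means "not yet allocated" */
} Tracker;

static void tracker_init(Tracker *t)
{
    t->sar_m32 = NULL;                /* replaces a separate bool flag */
}

/* Allocate on first use; later calls reuse the same temp. */
static Temp *tracker_get(Tracker *t)
{
    if (!t->sar_m32) {
        t->sar_m32 = temp_new();
    }
    return t->sar_m32;
}
```

The pointer must be reset to NULL wherever the flag used to be cleared, which is what the added line in init_sar_tracker does.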




[PULL 50/84] target/riscv: Drop ftemp_new

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries,
therefore there's no need to record temps for later freeing.
Replace the few uses with tcg_temp_new_i64.

Reviewed-by: Weiwei Li 
Reviewed-by: Daniel Henrique Barboza 
Signed-off-by: Richard Henderson 
---
 target/riscv/translate.c | 24 
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index a8d516ca3e..7ed625a36f 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -106,9 +106,6 @@ typedef struct DisasContext {
 TCGv zero;
 /* Space for 3 operands plus 1 extra for address computation. */
 TCGv temp[4];
-/* Space for 4 operands(1 dest and <=3 src) for float point computation */
-TCGv_i64 ftemp[4];
-uint8_t nftemp;
 /* PointerMasking extension */
 bool pm_mask_enabled;
 bool pm_base_enabled;
@@ -431,12 +428,6 @@ static void gen_set_gpr128(DisasContext *ctx, int reg_num, 
TCGv rl, TCGv rh)
 }
 }
 
-static TCGv_i64 ftemp_new(DisasContext *ctx)
-{
-assert(ctx->nftemp < ARRAY_SIZE(ctx->ftemp));
-return ctx->ftemp[ctx->nftemp++] = tcg_temp_new_i64();
-}
-
 static TCGv_i64 get_fpr_hs(DisasContext *ctx, int reg_num)
 {
 if (!ctx->cfg_ptr->ext_zfinx) {
@@ -450,7 +441,7 @@ static TCGv_i64 get_fpr_hs(DisasContext *ctx, int reg_num)
 case MXL_RV32:
 #ifdef TARGET_RISCV32
 {
-TCGv_i64 t = ftemp_new(ctx);
+TCGv_i64 t = tcg_temp_new_i64();
 tcg_gen_ext_i32_i64(t, cpu_gpr[reg_num]);
 return t;
 }
@@ -476,7 +467,7 @@ static TCGv_i64 get_fpr_d(DisasContext *ctx, int reg_num)
 switch (get_xl(ctx)) {
 case MXL_RV32:
 {
-TCGv_i64 t = ftemp_new(ctx);
+TCGv_i64 t = tcg_temp_new_i64();
 tcg_gen_concat_tl_i64(t, cpu_gpr[reg_num], cpu_gpr[reg_num + 1]);
 return t;
 }
@@ -496,12 +487,12 @@ static TCGv_i64 dest_fpr(DisasContext *ctx, int reg_num)
 }
 
 if (reg_num == 0) {
-return ftemp_new(ctx);
+return tcg_temp_new_i64();
 }
 
 switch (get_xl(ctx)) {
 case MXL_RV32:
-return ftemp_new(ctx);
+return tcg_temp_new_i64();
 #ifdef TARGET_RISCV64
 case MXL_RV64:
 return cpu_gpr[reg_num];
@@ -1208,8 +1199,6 @@ static void riscv_tr_init_disas_context(DisasContextBase 
*dcbase, CPUState *cs)
 ctx->cs = cs;
 ctx->ntemp = 0;
 memset(ctx->temp, 0, sizeof(ctx->temp));
-ctx->nftemp = 0;
-memset(ctx->ftemp, 0, sizeof(ctx->ftemp));
 ctx->pm_mask_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_MASK_ENABLED);
 ctx->pm_base_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_BASE_ENABLED);
 ctx->itrigger = FIELD_EX32(tb_flags, TB_FLAGS, ITRIGGER);
@@ -1245,11 +1234,6 @@ static void riscv_tr_translate_insn(DisasContextBase 
*dcbase, CPUState *cpu)
 ctx->temp[i] = NULL;
 }
 ctx->ntemp = 0;
-for (i = ctx->nftemp - 1; i >= 0; --i) {
-tcg_temp_free_i64(ctx->ftemp[i]);
-ctx->ftemp[i] = NULL;
-}
-ctx->nftemp = 0;
 
 /* Only the first insn within a TB is allowed to cross a page boundary. */
 if (ctx->base.is_jmp == DISAS_NEXT) {
-- 
2.34.1




[PULL 75/84] target/microblaze: Avoid tcg_const_* throughout

2023-03-05 Thread Richard Henderson
All uses are strictly read-only.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/microblaze/translate.c | 35 +++
 1 file changed, 15 insertions(+), 20 deletions(-)

diff --git a/target/microblaze/translate.c b/target/microblaze/translate.c
index eb6bdb49e1..ee0d7b81ad 100644
--- a/target/microblaze/translate.c
+++ b/target/microblaze/translate.c
@@ -101,9 +101,7 @@ static void t_sync_flags(DisasContext *dc)
 
 static void gen_raise_exception(DisasContext *dc, uint32_t index)
 {
-TCGv_i32 tmp = tcg_const_i32(index);
-
-gen_helper_raise_exception(cpu_env, tmp);
+gen_helper_raise_exception(cpu_env, tcg_constant_i32(index));
 dc->base.is_jmp = DISAS_NORETURN;
 }
 
@@ -116,7 +114,7 @@ static void gen_raise_exception_sync(DisasContext *dc, 
uint32_t index)
 
 static void gen_raise_hw_excp(DisasContext *dc, uint32_t esr_ec)
 {
-TCGv_i32 tmp = tcg_const_i32(esr_ec);
+TCGv_i32 tmp = tcg_constant_i32(esr_ec);
 tcg_gen_st_i32(tmp, cpu_env, offsetof(CPUMBState, esr));
 
 gen_raise_exception_sync(dc, EXCP_HW_EXCP);
@@ -260,7 +258,7 @@ static bool do_typeb_val(DisasContext *dc, arg_typeb *arg, 
bool side_effects,
 
 rd = reg_for_write(dc, arg->rd);
 ra = reg_for_read(dc, arg->ra);
-imm = tcg_const_i32(arg->imm);
+imm = tcg_constant_i32(arg->imm);
 
 fn(rd, ra, imm);
 return true;
@@ -305,7 +303,7 @@ static bool do_typeb_val(DisasContext *dc, arg_typeb *arg, 
bool side_effects,
 /* No input carry, but output carry. */
 static void gen_add(TCGv_i32 out, TCGv_i32 ina, TCGv_i32 inb)
 {
-TCGv_i32 zero = tcg_const_i32(0);
+TCGv_i32 zero = tcg_constant_i32(0);
 
 tcg_gen_add2_i32(out, cpu_msr_c, ina, zero, inb, zero);
 }
@@ -313,7 +311,7 @@ static void gen_add(TCGv_i32 out, TCGv_i32 ina, TCGv_i32 
inb)
 /* Input and output carry. */
 static void gen_addc(TCGv_i32 out, TCGv_i32 ina, TCGv_i32 inb)
 {
-TCGv_i32 zero = tcg_const_i32(0);
+TCGv_i32 zero = tcg_constant_i32(0);
 TCGv_i32 tmp = tcg_temp_new_i32();
 
 tcg_gen_add2_i32(tmp, cpu_msr_c, ina, zero, cpu_msr_c, zero);
@@ -546,7 +544,7 @@ static void gen_rsub(TCGv_i32 out, TCGv_i32 ina, TCGv_i32 
inb)
 /* Input and output carry. */
 static void gen_rsubc(TCGv_i32 out, TCGv_i32 ina, TCGv_i32 inb)
 {
-TCGv_i32 zero = tcg_const_i32(0);
+TCGv_i32 zero = tcg_constant_i32(0);
 TCGv_i32 tmp = tcg_temp_new_i32();
 
 tcg_gen_not_i32(tmp, ina);
@@ -1117,8 +1115,8 @@ static bool do_bcc(DisasContext *dc, int dest_rb, int 
dest_imm,
 }
 
 /* Compute the final destination into btarget.  */
-zero = tcg_const_i32(0);
-next = tcg_const_i32(dc->base.pc_next + (delay + 1) * 4);
+zero = tcg_constant_i32(0);
+next = tcg_constant_i32(dc->base.pc_next + (delay + 1) * 4);
 tcg_gen_movcond_i32(dc->jmp_cond, cpu_btarget,
 reg_for_read(dc, ra), zero,
 cpu_btarget, next);
@@ -1226,8 +1224,6 @@ static bool trans_mbar(DisasContext *dc, arg_mbar *arg)
 
 /* Sleep. */
 if (mbar_imm & 16) {
-TCGv_i32 tmp_1;
-
 if (trap_userspace(dc, true)) {
 /* Sleep is a privileged instruction.  */
 return true;
@@ -1235,8 +1231,7 @@ static bool trans_mbar(DisasContext *dc, arg_mbar *arg)
 
 t_sync_flags(dc);
 
-tmp_1 = tcg_const_i32(1);
-tcg_gen_st_i32(tmp_1, cpu_env,
+tcg_gen_st_i32(tcg_constant_i32(1), cpu_env,
                -offsetof(MicroBlazeCPU, env)
                +offsetof(CPUState, halted));
 
@@ -1401,8 +1396,8 @@ static bool trans_mts(DisasContext *dc, arg_mts *arg)
 case 0x1004: /* TLBHI */
 case 0x1005: /* TLBSX */
 {
-TCGv_i32 tmp_ext = tcg_const_i32(arg->e);
-TCGv_i32 tmp_reg = tcg_const_i32(arg->rs & 7);
+TCGv_i32 tmp_ext = tcg_constant_i32(arg->e);
+TCGv_i32 tmp_reg = tcg_constant_i32(arg->rs & 7);
 
 gen_helper_mmu_write(cpu_env, tmp_ext, tmp_reg, src);
 }
@@ -1487,8 +1482,8 @@ static bool trans_mfs(DisasContext *dc, arg_mfs *arg)
 case 0x1004: /* TLBHI */
 case 0x1005: /* TLBSX */
 {
-TCGv_i32 tmp_ext = tcg_const_i32(arg->e);
-TCGv_i32 tmp_reg = tcg_const_i32(arg->rs & 7);
+TCGv_i32 tmp_ext = tcg_constant_i32(arg->e);
+TCGv_i32 tmp_reg = tcg_constant_i32(arg->rs & 7);
 
 gen_helper_mmu_read(dest, cpu_env, tmp_ext, tmp_reg);
 }
@@ -1555,7 +1550,7 @@ static bool do_get(DisasContext *dc, int rd, int rb, int 
imm, int ctrl)
 tcg_gen_movi_i32(t_id, imm);
 }
 
-t_ctrl = tcg_const_i32(ctrl);
+t_ctrl = tcg_constant_i32(ctrl);
 gen_helper_get(reg_for_write(dc, rd), t_id, t_ctrl);
 return true;
 }
@@ -1585,7 +1580,7 @@ static bool do_put(DisasContext *dc, int ra, int rb, int 
imm, int ctrl)
 tcg_gen_movi_i32(t_id, imm);
 }
 
-t_ctrl = tcg_const_i32(ctrl);
+t_ctrl = 

[PULL 04/84] target/sparc: Use tlb_set_page_full

2023-03-05 Thread Richard Henderson
Pass CPUTLBEntryFull to get_physical_address instead
of a collection of pointers.

Acked-by: Mark Cave-Ayland 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/sparc/mmu_helper.c | 121 +-
 1 file changed, 54 insertions(+), 67 deletions(-)

diff --git a/target/sparc/mmu_helper.c b/target/sparc/mmu_helper.c
index 6e7f46f847..453498c670 100644
--- a/target/sparc/mmu_helper.c
+++ b/target/sparc/mmu_helper.c
@@ -64,10 +64,9 @@ static const int perm_table[2][8] = {
 }
 };
 
-static int get_physical_address(CPUSPARCState *env, hwaddr *physical,
-int *prot, int *access_index, MemTxAttrs 
*attrs,
-target_ulong address, int rw, int mmu_idx,
-target_ulong *page_size)
+static int get_physical_address(CPUSPARCState *env, CPUTLBEntryFull *full,
+int *access_index, target_ulong address,
+int rw, int mmu_idx)
 {
 int access_perms = 0;
 hwaddr pde_ptr;
@@ -80,20 +79,20 @@ static int get_physical_address(CPUSPARCState *env, hwaddr 
*physical,
 is_user = mmu_idx == MMU_USER_IDX;
 
 if (mmu_idx == MMU_PHYS_IDX) {
-*page_size = TARGET_PAGE_SIZE;
+full->lg_page_size = TARGET_PAGE_BITS;
 /* Boot mode: instruction fetches are taken from PROM */
 if (rw == 2 && (env->mmuregs[0] & env->def.mmu_bm)) {
-*physical = env->prom_addr | (address & 0x7ULL);
-*prot = PAGE_READ | PAGE_EXEC;
+full->phys_addr = env->prom_addr | (address & 0x7ULL);
+full->prot = PAGE_READ | PAGE_EXEC;
 return 0;
 }
-*physical = address;
-*prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
+full->phys_addr = address;
+full->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
 return 0;
 }
 
 *access_index = ((rw & 1) << 2) | (rw & 2) | (is_user ? 0 : 1);
-*physical = 0xULL;
+full->phys_addr = 0xULL;
 
 /* SPARC reference MMU table walk: Context table->L1->L2->PTE */
 /* Context base + context number */
@@ -157,16 +156,17 @@ static int get_physical_address(CPUSPARCState *env, 
hwaddr *physical,
 case 2: /* L3 PTE */
 page_offset = 0;
 }
-*page_size = TARGET_PAGE_SIZE;
+full->lg_page_size = TARGET_PAGE_BITS;
 break;
 case 2: /* L2 PTE */
 page_offset = address & 0x3f000;
-*page_size = 0x40000;
+full->lg_page_size = 18;
 }
 break;
 case 2: /* L1 PTE */
 page_offset = address & 0xfff000;
-*page_size = 0x1000000;
+full->lg_page_size = 24;
+break;
 }
 }
 
@@ -188,16 +188,16 @@ static int get_physical_address(CPUSPARCState *env, 
hwaddr *physical,
 }
 
 /* the page can be put in the TLB */
-*prot = perm_table[is_user][access_perms];
+full->prot = perm_table[is_user][access_perms];
 if (!(pde & PG_MODIFIED_MASK)) {
 /* only set write access if already dirty... otherwise wait
for dirty access */
-*prot &= ~PAGE_WRITE;
+full->prot &= ~PAGE_WRITE;
 }
 
 /* Even if large ptes, we map only one 4KB page in the cache to
avoid filling it too fast */
-*physical = ((hwaddr)(pde & PTE_ADDR_MASK) << 4) + page_offset;
+full->phys_addr = ((hwaddr)(pde & PTE_ADDR_MASK) << 4) + page_offset;
 return error_code;
 }
 
@@ -208,11 +208,9 @@ bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int 
size,
 {
 SPARCCPU *cpu = SPARC_CPU(cs);
 CPUSPARCState *env = &cpu->env;
-hwaddr paddr;
+CPUTLBEntryFull full = {};
 target_ulong vaddr;
-target_ulong page_size;
-int error_code = 0, prot, access_index;
-MemTxAttrs attrs = {};
+int error_code = 0, access_index;
 
 /*
  * TODO: If we ever need tlb_vaddr_to_host for this target,
@@ -223,16 +221,15 @@ bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int 
size,
 assert(!probe);
 
 address &= TARGET_PAGE_MASK;
-error_code = get_physical_address(env, &paddr, &prot, &access_index, 
&attrs,
-  address, access_type,
-  mmu_idx, &page_size);
+error_code = get_physical_address(env, &full, &access_index,
+  address, access_type, mmu_idx);
 vaddr = address;
 if (likely(error_code == 0)) {
 qemu_log_mask(CPU_LOG_MMU,
   "Translate at %" VADDR_PRIx " -> "
   HWADDR_FMT_plx ", vaddr " TARGET_FMT_lx "\n",
-  address, paddr, vaddr);
-tlb_set_page(cs, vaddr, paddr, prot, mmu_idx, page_size);
+  address, full.phys_addr, vaddr);
+tlb_set_page_full(cs, mmu_idx, vaddr, &full);
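
The diff swaps byte-count page sizes for log2 values (lg_page_size of 18 and 24) stored in one result struct rather than several out-pointers. A minimal sketch of that representation follows; FillResult, set_level_size() and page_bytes() are hypothetical names, and the 12-bit base page is an assumption taken from the "4KB page" comment in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of CPUTLBEntryFull used in the diff. */
typedef struct {
    uint64_t phys_addr;
    int prot;
    int lg_page_size;                 /* log2 of the page size in bytes */
} FillResult;

#define SKETCH_PAGE_BITS 12           /* assumed base page size of 4 KiB */

/* Record the mapping size for a PTE found at the given table level (1..3). */
static void set_level_size(FillResult *full, int level)
{
    switch (level) {
    case 1:
        full->lg_page_size = 24;      /* L1 PTE */
        break;
    case 2:
        full->lg_page_size = 18;      /* L2 PTE */
        break;
    default:
        full->lg_page_size = SKETCH_PAGE_BITS;
        break;
    }
}

/* Convert the stored log2 back to a byte count. */
static uint64_t page_bytes(const FillResult *full)
{
    return UINT64_C(1) << full->lg_page_size;
}
```

Keeping the size as a shift count means the caller never has to check that a size is a power of two, and a single struct pointer replaces the five separate output arguments of the old signature.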
 

[PULL 07/84] softmmu: Check watchpoints for read+write at once

2023-03-05 Thread Richard Henderson
Atomic operations are read-modify-write, and we'd like to
be able to test both read and write with one call.  This is
easy enough, with BP_MEM_READ | BP_MEM_WRITE.

Add BP_HIT_SHIFT to make it easy to set BP_WATCHPOINT_HIT_*.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/hw/core/cpu.h |  7 ---
 softmmu/watchpoint.c  | 19 ++-
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index fb5d9667ca..75689bff02 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -923,9 +923,10 @@ void cpu_single_step(CPUState *cpu, int enabled);
 #define BP_GDB0x10
 #define BP_CPU0x20
 #define BP_ANY(BP_GDB | BP_CPU)
-#define BP_WATCHPOINT_HIT_READ 0x40
-#define BP_WATCHPOINT_HIT_WRITE 0x80
-#define BP_WATCHPOINT_HIT (BP_WATCHPOINT_HIT_READ | BP_WATCHPOINT_HIT_WRITE)
+#define BP_HIT_SHIFT  6
+#define BP_WATCHPOINT_HIT_READ  (BP_MEM_READ << BP_HIT_SHIFT)
+#define BP_WATCHPOINT_HIT_WRITE (BP_MEM_WRITE << BP_HIT_SHIFT)
+#define BP_WATCHPOINT_HIT   (BP_MEM_ACCESS << BP_HIT_SHIFT)
 
 int cpu_breakpoint_insert(CPUState *cpu, vaddr pc, int flags,
   CPUBreakpoint **breakpoint);
diff --git a/softmmu/watchpoint.c b/softmmu/watchpoint.c
index 279129dd1c..ad58736787 100644
--- a/softmmu/watchpoint.c
+++ b/softmmu/watchpoint.c
@@ -162,9 +162,12 @@ void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr 
len,
 /* this is currently used only by ARM BE32 */
 addr = cc->tcg_ops->adjust_watchpoint_address(cpu, addr, len);
 }
+
+assert((flags & ~BP_MEM_ACCESS) == 0);
 QTAILQ_FOREACH(wp, &cpu->watchpoints, entry) {
-if (watchpoint_address_matches(wp, addr, len)
-&& (wp->flags & flags)) {
+int hit_flags = wp->flags & flags;
+
+if (hit_flags && watchpoint_address_matches(wp, addr, len)) {
 if (replay_running_debug()) {
 /*
  * replay_breakpoint reads icount.
@@ -184,16 +187,14 @@ void cpu_check_watchpoint(CPUState *cpu, vaddr addr, 
vaddr len,
 replay_breakpoint();
 return;
 }
-if (flags == BP_MEM_READ) {
-wp->flags |= BP_WATCHPOINT_HIT_READ;
-} else {
-wp->flags |= BP_WATCHPOINT_HIT_WRITE;
-}
+
+wp->flags |= hit_flags << BP_HIT_SHIFT;
 wp->hitaddr = MAX(addr, wp->vaddr);
 wp->hitattrs = attrs;
 
-if (wp->flags & BP_CPU && cc->tcg_ops->debug_check_watchpoint &&
-!cc->tcg_ops->debug_check_watchpoint(cpu, wp)) {
+if (wp->flags & BP_CPU
+&& cc->tcg_ops->debug_check_watchpoint
+&& !cc->tcg_ops->debug_check_watchpoint(cpu, wp)) {
 wp->flags &= ~BP_WATCHPOINT_HIT;
 continue;
 }
-- 
2.34.1
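
The flag arithmetic in this patch can be checked in isolation. The sketch below copies the BP_HIT_SHIFT definitions from the diff; the values of BP_MEM_READ and BP_MEM_WRITE are assumed (they are not shown in the hunk), and record_hit() is a hypothetical helper that mirrors the new `hit_flags << BP_HIT_SHIFT` statement.

```c
#include <assert.h>

/* Assumed values: the request bits are not shown in the hunk. */
#define BP_MEM_READ   0x01
#define BP_MEM_WRITE  0x02
#define BP_MEM_ACCESS (BP_MEM_READ | BP_MEM_WRITE)

/* As in the patch: hit bits are the request bits shifted up. */
#define BP_HIT_SHIFT            6
#define BP_WATCHPOINT_HIT_READ  (BP_MEM_READ << BP_HIT_SHIFT)
#define BP_WATCHPOINT_HIT_WRITE (BP_MEM_WRITE << BP_HIT_SHIFT)
#define BP_WATCHPOINT_HIT       (BP_MEM_ACCESS << BP_HIT_SHIFT)

/*
 * Hypothetical helper: return the watchpoint flags after an access of
 * kind `access_flags`. Only access kinds the watchpoint asked for are
 * recorded as hits.
 */
static int record_hit(int wp_flags, int access_flags)
{
    int hit_flags = wp_flags & access_flags;

    return wp_flags | (hit_flags << BP_HIT_SHIFT);
}
```

With these assumed values the shifted definitions reproduce the old constants 0x40 and 0x80, and a read+write access against a read-only watchpoint records only the read hit.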




[PULL 83/84] target/xtensa: Split constant in bit shift

2023-03-05 Thread Richard Henderson
Reviewed-by: Max Filippov 
Signed-off-by: Richard Henderson 
---
 target/xtensa/translate.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index 2903c73f8e..f906ba7ed5 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -2047,8 +2047,8 @@ static uint32_t test_exceptions_retw(DisasContext *dc, 
const OpcodeArg arg[],
 static void translate_retw(DisasContext *dc, const OpcodeArg arg[],
const uint32_t par[])
 {
-TCGv_i32 tmp = tcg_const_i32(1);
-tcg_gen_shl_i32(tmp, tmp, cpu_SR[WINDOW_BASE]);
+TCGv_i32 tmp = tcg_temp_new();
+tcg_gen_shl_i32(tmp, tcg_constant_i32(1), cpu_SR[WINDOW_BASE]);
 tcg_gen_andc_i32(cpu_SR[WINDOW_START],
  cpu_SR[WINDOW_START], tmp);
 tcg_gen_movi_i32(tmp, dc->pc);
@@ -2080,10 +2080,10 @@ static void translate_rfi(DisasContext *dc, const 
OpcodeArg arg[],
 static void translate_rfw(DisasContext *dc, const OpcodeArg arg[],
   const uint32_t par[])
 {
-TCGv_i32 tmp = tcg_const_i32(1);
+TCGv_i32 tmp = tcg_temp_new();
 
 tcg_gen_andi_i32(cpu_SR[PS], cpu_SR[PS], ~PS_EXCM);
-tcg_gen_shl_i32(tmp, tmp, cpu_SR[WINDOW_BASE]);
+tcg_gen_shl_i32(tmp, tcg_constant_i32(1), cpu_SR[WINDOW_BASE]);
 
 if (par[0]) {
 tcg_gen_andc_i32(cpu_SR[WINDOW_START],
-- 
2.34.1




[PULL 43/84] target/m68k: Drop mark_to_release

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries,
therefore there's no need to record temps for later freeing.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/m68k/translate.c | 55 ++---
 1 file changed, 13 insertions(+), 42 deletions(-)

diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 157c2cbb8f..b3cd3e87e1 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -121,35 +121,9 @@ typedef struct DisasContext {
 int done_mac;
 int writeback_mask;
 TCGv writeback[8];
-#define MAX_TO_RELEASE 8
-int release_count;
-TCGv release[MAX_TO_RELEASE];
 bool ss_active;
 } DisasContext;
 
-static void init_release_array(DisasContext *s)
-{
-#ifdef CONFIG_DEBUG_TCG
-memset(s->release, 0, sizeof(s->release));
-#endif
-s->release_count = 0;
-}
-
-static void do_release(DisasContext *s)
-{
-int i;
-for (i = 0; i < s->release_count; i++) {
-tcg_temp_free(s->release[i]);
-}
-init_release_array(s);
-}
-
-static TCGv mark_to_release(DisasContext *s, TCGv tmp)
-{
-g_assert(s->release_count < MAX_TO_RELEASE);
-return s->release[s->release_count++] = tmp;
-}
-
 static TCGv get_areg(DisasContext *s, unsigned regno)
 {
 if (s->writeback_mask & (1 << regno)) {
@@ -396,8 +370,7 @@ static TCGv gen_ldst(DisasContext *s, int opsize, TCGv 
addr, TCGv val,
 gen_store(s, opsize, addr, val, index);
 return store_dummy;
 } else {
-return mark_to_release(s, gen_load(s, opsize, addr,
-   what == EA_LOADS, index));
+return gen_load(s, opsize, addr, what == EA_LOADS, index);
 }
 }
 
@@ -491,7 +464,7 @@ static TCGv gen_lea_indexed(CPUM68KState *env, DisasContext 
*s, TCGv base)
 } else {
 bd = 0;
 }
-tmp = mark_to_release(s, tcg_temp_new());
+tmp = tcg_temp_new();
 if ((ext & 0x44) == 0) {
 /* pre-index */
 add = gen_addr_index(s, ext, tmp);
@@ -501,7 +474,7 @@ static TCGv gen_lea_indexed(CPUM68KState *env, DisasContext 
*s, TCGv base)
 if ((ext & 0x80) == 0) {
 /* base not suppressed */
 if (IS_NULL_QREG(base)) {
-base = mark_to_release(s, tcg_const_i32(offset + bd));
+base = tcg_const_i32(offset + bd);
 bd = 0;
 }
 if (!IS_NULL_QREG(add)) {
@@ -517,11 +490,11 @@ static TCGv gen_lea_indexed(CPUM68KState *env, 
DisasContext *s, TCGv base)
 add = tmp;
 }
 } else {
-add = mark_to_release(s, tcg_const_i32(bd));
+add = tcg_const_i32(bd);
 }
 if ((ext & 3) != 0) {
 /* memory indirect */
-base = mark_to_release(s, gen_load(s, OS_LONG, add, 0, 
IS_USER(s)));
+base = gen_load(s, OS_LONG, add, 0, IS_USER(s));
 if ((ext & 0x44) == 4) {
 add = gen_addr_index(s, ext, tmp);
 tcg_gen_add_i32(tmp, add, base);
@@ -546,7 +519,7 @@ static TCGv gen_lea_indexed(CPUM68KState *env, DisasContext 
*s, TCGv base)
 }
 } else {
 /* brief extension word format */
-tmp = mark_to_release(s, tcg_temp_new());
+tmp = tcg_temp_new();
 add = gen_addr_index(s, ext, tmp);
 if (!IS_NULL_QREG(base)) {
 tcg_gen_add_i32(tmp, add, base);
@@ -676,7 +649,7 @@ static inline TCGv gen_extend(DisasContext *s, TCGv val, 
int opsize, int sign)
 if (opsize == OS_LONG) {
 tmp = val;
 } else {
-tmp = mark_to_release(s, tcg_temp_new());
+tmp = tcg_temp_new();
 gen_ext(tmp, val, opsize, sign);
 }
 
@@ -802,7 +775,7 @@ static TCGv gen_lea_mode(CPUM68KState *env, DisasContext *s,
 return NULL_QREG;
 }
 reg = get_areg(s, reg0);
-tmp = mark_to_release(s, tcg_temp_new());
+tmp = tcg_temp_new();
 if (reg0 == 7 && opsize == OS_BYTE &&
 m68k_feature(s->env, M68K_FEATURE_M68K)) {
 tcg_gen_subi_i32(tmp, reg, 2);
@@ -812,7 +785,7 @@ static TCGv gen_lea_mode(CPUM68KState *env, DisasContext *s,
 return tmp;
 case 5: /* Indirect displacement.  */
 reg = get_areg(s, reg0);
-tmp = mark_to_release(s, tcg_temp_new());
+tmp = tcg_temp_new();
 ext = read_im16(env, s);
 tcg_gen_addi_i32(tmp, reg, (int16_t)ext);
 return tmp;
@@ -823,14 +796,14 @@ static TCGv gen_lea_mode(CPUM68KState *env, DisasContext 
*s,
 switch (reg0) {
 case 0: /* Absolute short.  */
 offset = (int16_t)read_im16(env, s);
-return mark_to_release(s, tcg_const_i32(offset));
+return tcg_const_i32(offset);
 case 1: /* Absolute long.  */
 offset = read_im32(env, s);
-return mark_to_release(s, tcg_const_i32(offset));
+return 

[PULL 61/84] target/xtensa: Drop tcg_temp_free

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries.

Reviewed-by: Max Filippov 
Signed-off-by: Richard Henderson 
---
 target/xtensa/translate.c | 107 --
 1 file changed, 107 deletions(-)

diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index 910350dec6..3ea50d8bc3 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -1102,16 +1102,6 @@ static void disas_xtensa_insn(CPUXtensaState *env, 
DisasContext *dc)
 ops->translate(dc, pslot->arg, ops->par);
 }
 
-for (i = 0; i < n_arg_copy; ++i) {
-if (arg_copy[i].arg->num_bits <= 32) {
-tcg_temp_free_i32(arg_copy[i].temp);
-} else if (arg_copy[i].arg->num_bits <= 64) {
-tcg_temp_free_i64(arg_copy[i].temp);
-} else {
-g_assert_not_reached();
-}
-}
-
 if (dc->base.is_jmp == DISAS_NEXT) {
 gen_postprocess(dc, 0);
 dc->op_flags = 0;
@@ -1238,10 +1228,6 @@ static void xtensa_tr_tb_stop(DisasContextBase *dcbase, 
CPUState *cpu)
 {
 DisasContext *dc = container_of(dcbase, DisasContext, base);
 
-if (dc->icount) {
-tcg_temp_free(dc->next_icount);
-}
-
 switch (dc->base.is_jmp) {
 case DISAS_NORETURN:
 break;
@@ -1369,7 +1355,6 @@ static void translate_addx(DisasContext *dc, const 
OpcodeArg arg[],
 TCGv_i32 tmp = tcg_temp_new_i32();
 tcg_gen_shli_i32(tmp, arg[1].in, par[0]);
 tcg_gen_add_i32(arg[0].out, tmp, arg[2].in);
-tcg_temp_free(tmp);
 }
 
 static void translate_all(DisasContext *dc, const OpcodeArg arg[],
@@ -1388,8 +1373,6 @@ static void translate_all(DisasContext *dc, const 
OpcodeArg arg[],
 tcg_gen_shri_i32(tmp, tmp, arg[1].imm + shift);
 tcg_gen_deposit_i32(arg[0].out, arg[0].out,
 tmp, arg[0].imm, 1);
-tcg_temp_free(mask);
-tcg_temp_free(tmp);
 }
 
 static void translate_and(DisasContext *dc, const OpcodeArg arg[],
@@ -1404,7 +1387,6 @@ static void translate_ball(DisasContext *dc, const 
OpcodeArg arg[],
 TCGv_i32 tmp = tcg_temp_new_i32();
 tcg_gen_and_i32(tmp, arg[0].in, arg[1].in);
 gen_brcond(dc, par[0], tmp, arg[1].in, arg[2].imm);
-tcg_temp_free(tmp);
 }
 
 static void translate_bany(DisasContext *dc, const OpcodeArg arg[],
@@ -1413,7 +1395,6 @@ static void translate_bany(DisasContext *dc, const 
OpcodeArg arg[],
 TCGv_i32 tmp = tcg_temp_new_i32();
 tcg_gen_and_i32(tmp, arg[0].in, arg[1].in);
 gen_brcondi(dc, par[0], tmp, 0, arg[2].imm);
-tcg_temp_free(tmp);
 }
 
 static void translate_b(DisasContext *dc, const OpcodeArg arg[],
@@ -1439,8 +1420,6 @@ static void translate_bb(DisasContext *dc, const 
OpcodeArg arg[],
 #endif
 tcg_gen_and_i32(tmp, arg[0].in, bit);
 gen_brcondi(dc, par[0], tmp, 0, arg[2].imm);
-tcg_temp_free(tmp);
-tcg_temp_free(bit);
 }
 
 static void translate_bbi(DisasContext *dc, const OpcodeArg arg[],
@@ -1453,7 +1432,6 @@ static void translate_bbi(DisasContext *dc, const OpcodeArg arg[],
 tcg_gen_andi_i32(tmp, arg[0].in, 0x0001u << arg[1].imm);
 #endif
 gen_brcondi(dc, par[0], tmp, 0, arg[2].imm);
-tcg_temp_free(tmp);
 }
 
 static void translate_bi(DisasContext *dc, const OpcodeArg arg[],
@@ -1494,8 +1472,6 @@ static void translate_boolean(DisasContext *dc, const OpcodeArg arg[],
 tcg_gen_shri_i32(tmp2, arg[2].in, arg[2].imm);
 op[par[0]](tmp1, tmp1, tmp2);
 tcg_gen_deposit_i32(arg[0].out, arg[0].out, tmp1, arg[0].imm, 1);
-tcg_temp_free(tmp1);
-tcg_temp_free(tmp2);
 }
 
 static void translate_bp(DisasContext *dc, const OpcodeArg arg[],
@@ -1505,7 +1481,6 @@ static void translate_bp(DisasContext *dc, const OpcodeArg arg[],
 
 tcg_gen_andi_i32(tmp, arg[0].in, 1 << arg[0].imm);
 gen_brcondi(dc, par[0], tmp, 0, arg[1].imm);
-tcg_temp_free(tmp);
 }
 
 static void translate_call0(DisasContext *dc, const OpcodeArg arg[],
@@ -1520,7 +1495,6 @@ static void translate_callw(DisasContext *dc, const OpcodeArg arg[],
 {
 TCGv_i32 tmp = tcg_const_i32(arg[0].imm);
 gen_callw_slot(dc, par[0], tmp, adjust_jump_slot(dc, arg[0].imm, 0));
-tcg_temp_free(tmp);
 }
 
 static void translate_callx0(DisasContext *dc, const OpcodeArg arg[],
@@ -1530,7 +1504,6 @@ static void translate_callx0(DisasContext *dc, const OpcodeArg arg[],
 tcg_gen_mov_i32(tmp, arg[0].in);
 tcg_gen_movi_i32(cpu_R[0], dc->base.pc_next);
 gen_jump(dc, tmp);
-tcg_temp_free(tmp);
 }
 
 static void translate_callxw(DisasContext *dc, const OpcodeArg arg[],
@@ -1540,7 +1513,6 @@ static void translate_callxw(DisasContext *dc, const OpcodeArg arg[],
 
 tcg_gen_mov_i32(tmp, arg[0].in);
 gen_callw_slot(dc, par[0], tmp, -1);
-tcg_temp_free(tmp);
 }
 
 static void translate_clamps(DisasContext *dc, const OpcodeArg arg[],
@@ -1551,8 +1523,6 @@ static void translate_clamps(DisasContext *dc, const OpcodeArg arg[],
 
 tcg_gen_smax_i32(tmp1, tmp1, arg[1].in);
 

[PULL 53/84] target/rx: Drop tcg_temp_free

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/rx/translate.c | 84 ---
 1 file changed, 84 deletions(-)

diff --git a/target/rx/translate.c b/target/rx/translate.c
index af23876cb3..6624414739 100644
--- a/target/rx/translate.c
+++ b/target/rx/translate.c
@@ -429,7 +429,6 @@ static bool trans_MOV_rm(DisasContext *ctx, arg_MOV_rm *a)
 mem = tcg_temp_new();
 tcg_gen_addi_i32(mem, cpu_regs[a->rd], a->dsp << a->sz);
 rx_gen_st(a->sz, cpu_regs[a->rs], mem);
-tcg_temp_free(mem);
 return true;
 }
 
@@ -440,7 +439,6 @@ static bool trans_MOV_mr(DisasContext *ctx, arg_MOV_mr *a)
 mem = tcg_temp_new();
 tcg_gen_addi_i32(mem, cpu_regs[a->rs], a->dsp << a->sz);
 rx_gen_ld(a->sz, cpu_regs[a->rd], mem);
-tcg_temp_free(mem);
 return true;
 }
 
@@ -462,8 +460,6 @@ static bool trans_MOV_im(DisasContext *ctx, arg_MOV_im *a)
 mem = tcg_temp_new();
 tcg_gen_addi_i32(mem, cpu_regs[a->rd], a->dsp << a->sz);
 rx_gen_st(a->sz, imm, mem);
-tcg_temp_free(imm);
-tcg_temp_free(mem);
 return true;
 }
 
@@ -474,7 +470,6 @@ static bool trans_MOV_ar(DisasContext *ctx, arg_MOV_ar *a)
 mem = tcg_temp_new();
 rx_gen_regindex(ctx, mem, a->sz, a->ri, a->rb);
 rx_gen_ld(a->sz, cpu_regs[a->rd], mem);
-tcg_temp_free(mem);
 return true;
 }
 
@@ -485,7 +480,6 @@ static bool trans_MOV_ra(DisasContext *ctx, arg_MOV_ra *a)
 mem = tcg_temp_new();
 rx_gen_regindex(ctx, mem, a->sz, a->ri, a->rb);
 rx_gen_st(a->sz, cpu_regs[a->rs], mem);
-tcg_temp_free(mem);
 return true;
 }
 
@@ -521,9 +515,7 @@ static bool trans_MOV_mm(DisasContext *ctx, arg_MOV_mm *a)
 rx_gen_ld(a->sz, tmp, addr);
 addr = rx_index_addr(ctx, mem, a->ldd, a->sz, a->rd);
 rx_gen_st(a->sz, tmp, addr);
-tcg_temp_free(tmp);
 }
-tcg_temp_free(mem);
 return true;
 }
 
@@ -541,7 +533,6 @@ static bool trans_MOV_rp(DisasContext *ctx, arg_MOV_rp *a)
 if (a->ad == 0) {
 tcg_gen_addi_i32(cpu_regs[a->rd], cpu_regs[a->rd], 1 << a->sz);
 }
-tcg_temp_free(val);
 return true;
 }
 
@@ -559,7 +550,6 @@ static bool trans_MOV_pr(DisasContext *ctx, arg_MOV_pr *a)
 tcg_gen_addi_i32(cpu_regs[a->rd], cpu_regs[a->rd], 1 << a->sz);
 }
 tcg_gen_mov_i32(cpu_regs[a->rs], val);
-tcg_temp_free(val);
 return true;
 }
 
@@ -571,7 +561,6 @@ static bool trans_MOVU_mr(DisasContext *ctx, arg_MOVU_mr *a)
 mem = tcg_temp_new();
 tcg_gen_addi_i32(mem, cpu_regs[a->rs], a->dsp << a->sz);
 rx_gen_ldu(a->sz, cpu_regs[a->rd], mem);
-tcg_temp_free(mem);
 return true;
 }
 
@@ -592,7 +581,6 @@ static bool trans_MOVU_ar(DisasContext *ctx, arg_MOVU_ar *a)
 mem = tcg_temp_new();
 rx_gen_regindex(ctx, mem, a->sz, a->ri, a->rb);
 rx_gen_ldu(a->sz, cpu_regs[a->rd], mem);
-tcg_temp_free(mem);
 return true;
 }
 
@@ -610,7 +598,6 @@ static bool trans_MOVU_pr(DisasContext *ctx, arg_MOVU_pr *a)
 tcg_gen_addi_i32(cpu_regs[a->rd], cpu_regs[a->rd], 1 << a->sz);
 }
 tcg_gen_mov_i32(cpu_regs[a->rs], val);
-tcg_temp_free(val);
 return true;
 }
 
@@ -635,7 +622,6 @@ static bool trans_POPC(DisasContext *ctx, arg_POPC *a)
 val = tcg_temp_new();
 pop(val);
 move_to_cr(ctx, val, a->cr);
-tcg_temp_free(val);
 return true;
 }
 
@@ -663,7 +649,6 @@ static bool trans_PUSH_r(DisasContext *ctx, arg_PUSH_r *a)
 tcg_gen_mov_i32(val, cpu_regs[a->rs]);
 tcg_gen_subi_i32(cpu_sp, cpu_sp, 4);
 rx_gen_st(a->sz, val, cpu_sp);
-tcg_temp_free(val);
 return true;
 }
 
@@ -677,8 +662,6 @@ static bool trans_PUSH_m(DisasContext *ctx, arg_PUSH_m *a)
 rx_gen_ld(a->sz, val, addr);
 tcg_gen_subi_i32(cpu_sp, cpu_sp, 4);
 rx_gen_st(a->sz, val, cpu_sp);
-tcg_temp_free(mem);
-tcg_temp_free(val);
 return true;
 }
 
@@ -689,7 +672,6 @@ static bool trans_PUSHC(DisasContext *ctx, arg_PUSHC *a)
 val = tcg_temp_new();
 move_from_cr(ctx, val, a->cr, ctx->pc);
 push(val);
-tcg_temp_free(val);
 return true;
 }
 
@@ -717,7 +699,6 @@ static bool trans_XCHG_rr(DisasContext *ctx, arg_XCHG_rr *a)
 tcg_gen_mov_i32(tmp, cpu_regs[a->rs]);
 tcg_gen_mov_i32(cpu_regs[a->rs], cpu_regs[a->rd]);
 tcg_gen_mov_i32(cpu_regs[a->rd], tmp);
-tcg_temp_free(tmp);
 return true;
 }
 
@@ -741,7 +722,6 @@ static bool trans_XCHG_mr(DisasContext *ctx, arg_XCHG_mr *a)
 }
 tcg_gen_atomic_xchg_i32(cpu_regs[a->rd], addr, cpu_regs[a->rd],
 0, mi_to_mop(a->mi));
-tcg_temp_free(mem);
 return true;
 }
 
@@ -753,8 +733,6 @@ static inline void stcond(TCGCond cond, int rd, int imm)
 _imm = tcg_const_i32(imm);
 tcg_gen_movcond_i32(cond, cpu_regs[rd], cpu_psw_z, z,
 _imm, cpu_regs[rd]);
-tcg_temp_free(z);
-tcg_temp_free(_imm);
 }
 
 /* stz #imm,rd */
@@ -785,12 +763,9 

[PULL 84/84] target/xtensa: Avoid tcg_const_i32

2023-03-05 Thread Richard Henderson
All remaining uses are strictly read-only.

Reviewed-by: Max Filippov 
Signed-off-by: Richard Henderson 
---
 target/xtensa/translate.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index f906ba7ed5..0cf3075649 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -1361,7 +1361,7 @@ static void translate_all(DisasContext *dc, const OpcodeArg arg[],
   const uint32_t par[])
 {
 uint32_t shift = par[1];
-TCGv_i32 mask = tcg_const_i32(((1 << shift) - 1) << arg[1].imm);
+TCGv_i32 mask = tcg_constant_i32(((1 << shift) - 1) << arg[1].imm);
 TCGv_i32 tmp = tcg_temp_new_i32();
 
 tcg_gen_and_i32(tmp, arg[1].in, mask);
@@ -1489,7 +1489,7 @@ static void translate_call0(DisasContext *dc, const OpcodeArg arg[],
 static void translate_callw(DisasContext *dc, const OpcodeArg arg[],
 const uint32_t par[])
 {
-TCGv_i32 tmp = tcg_const_i32(arg[0].imm);
+TCGv_i32 tmp = tcg_constant_i32(arg[0].imm);
 gen_callw_slot(dc, par[0], tmp, adjust_jump_slot(dc, arg[0].imm, 0));
 }
 
@@ -1537,7 +1537,7 @@ static void translate_clrex(DisasContext *dc, const OpcodeArg arg[],
 static void translate_const16(DisasContext *dc, const OpcodeArg arg[],
  const uint32_t par[])
 {
-TCGv_i32 c = tcg_const_i32(arg[1].imm);
+TCGv_i32 c = tcg_constant_i32(arg[1].imm);
 
 tcg_gen_deposit_i32(arg[0].out, c, arg[0].in, 16, 16);
 }
-- 
2.34.1
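
The distinction this patch relies on can be illustrated with a toy model.
This is only a sketch, not QEMU code: ToyTemp, toy_const, toy_constant and
toy_write are hypothetical names. It assumes that tcg_const_i32 hands back a
fresh, writable temporary initialised to the value, while tcg_constant_i32
hands back a shared, read-only value, which is why the conversion is only
safe where the result is never written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a TCG temporary. */
typedef struct {
    int32_t val;
    bool read_only;
} ToyTemp;

#define TOY_MAX 16
static ToyTemp toy_pool[TOY_MAX];
static size_t toy_used;

/* Models tcg_const_i32: always a new, writable temp holding val. */
static inline ToyTemp *toy_const(int32_t val)
{
    if (toy_used == TOY_MAX) {
        return NULL;
    }
    ToyTemp *t = &toy_pool[toy_used++];
    t->val = val;
    t->read_only = false;
    return t;
}

/* Models tcg_constant_i32: one shared, read-only temp per value. */
static inline ToyTemp *toy_constant(int32_t val)
{
    for (size_t i = 0; i < toy_used; i++) {
        if (toy_pool[i].read_only && toy_pool[i].val == val) {
            return &toy_pool[i];
        }
    }
    if (toy_used == TOY_MAX) {
        return NULL;
    }
    ToyTemp *t = &toy_pool[toy_used++];
    t->val = val;
    t->read_only = true;
    return t;
}

/* A write succeeds only on writable temps. */
static inline bool toy_write(ToyTemp *t, int32_t val)
{
    if (t->read_only) {
        return false;
    }
    t->val = val;
    return true;
}

static inline void toy_selfcheck(void)
{
    ToyTemp *a = toy_constant(32);
    ToyTemp *b = toy_constant(32);
    ToyTemp *c = toy_const(32);

    assert(a && b && c);
    assert(a == b);            /* constants with equal value are shared */
    assert(c != a);            /* a "const" temp is always distinct */
    assert(!toy_write(a, 1));  /* shared constants must not be written */
    assert(toy_write(c, 5) && c->val == 5);
    assert(a->val == 32);
}
```

In this model, the three call sites changed above correspond to toy_constant:
each value is only passed as an input operand and never written.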




[PULL 77/84] target/s390x: Split out gen_ri2

2023-03-05 Thread Richard Henderson
Use tcg_constant_i64.  Adjust in2_mri2_* to allocate a new
temporary for the output, using gen_ri2 for the address.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/s390x/tcg/translate.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 811049ea28..21a57d5eb2 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -5886,9 +5886,14 @@ static void in2_a2(DisasContext *s, DisasOps *o)
 }
 #define SPEC_in2_a2 0
 
+static TCGv gen_ri2(DisasContext *s)
+{
+return tcg_constant_i64(s->base.pc_next + (int64_t)get_field(s, i2) * 2);
+}
+
 static void in2_ri2(DisasContext *s, DisasOps *o)
 {
-o->in2 = tcg_const_i64(s->base.pc_next + (int64_t)get_field(s, i2) * 2);
+o->in2 = gen_ri2(s);
 }
 #define SPEC_in2_ri2 0
 
@@ -5976,29 +5981,29 @@ static void in2_m2_64a(DisasContext *s, DisasOps *o)
 
 static void in2_mri2_16u(DisasContext *s, DisasOps *o)
 {
-in2_ri2(s, o);
-tcg_gen_qemu_ld16u(o->in2, o->in2, get_mem_index(s));
+o->in2 = tcg_temp_new_i64();
+tcg_gen_qemu_ld16u(o->in2, gen_ri2(s), get_mem_index(s));
 }
 #define SPEC_in2_mri2_16u 0
 
 static void in2_mri2_32s(DisasContext *s, DisasOps *o)
 {
-in2_ri2(s, o);
-tcg_gen_qemu_ld32s(o->in2, o->in2, get_mem_index(s));
+o->in2 = tcg_temp_new_i64();
+tcg_gen_qemu_ld32s(o->in2, gen_ri2(s), get_mem_index(s));
 }
 #define SPEC_in2_mri2_32s 0
 
 static void in2_mri2_32u(DisasContext *s, DisasOps *o)
 {
-in2_ri2(s, o);
-tcg_gen_qemu_ld32u(o->in2, o->in2, get_mem_index(s));
+o->in2 = tcg_temp_new_i64();
+tcg_gen_qemu_ld32u(o->in2, gen_ri2(s), get_mem_index(s));
 }
 #define SPEC_in2_mri2_32u 0
 
 static void in2_mri2_64(DisasContext *s, DisasOps *o)
 {
-in2_ri2(s, o);
-tcg_gen_qemu_ld64(o->in2, o->in2, get_mem_index(s));
+o->in2 = tcg_temp_new_i64();
+tcg_gen_qemu_ld64(o->in2, gen_ri2(s), get_mem_index(s));
 }
 #define SPEC_in2_mri2_64 0
 
-- 
2.34.1
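
As a worked example of the address arithmetic in gen_ri2(), here is a
minimal sketch outside of TCG. The helper name ri2_address and the plain
integer parameters are assumptions for illustration; only the expression
pc_next + (int64_t)i2 * 2 comes from the patch. The factor of 2 suggests
that the immediate counts halfwords.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Models the value passed to tcg_constant_i64() in gen_ri2():
 *     pc_next + (int64_t)i2 * 2
 * Widening the signed immediate before the multiply keeps negative
 * displacements correct.
 */
static inline uint64_t ri2_address(uint64_t pc_next, int32_t i2)
{
    int64_t byte_offset = (int64_t)i2 * 2;   /* fits easily in int64_t */
    return pc_next + (uint64_t)byte_offset;  /* wraps modulo 2^64 */
}

static inline void ri2_selfcheck(void)
{
    assert(ri2_address(0x1000, 0) == 0x1000);
    assert(ri2_address(0x1000, 4) == 0x1008);   /* forward by 8 bytes */
    assert(ri2_address(0x1000, -4) == 0x0ff8);  /* backward by 8 bytes */
}
```

Because the address is now a read-only constant, the in2_mri2_* helpers
cannot load into it in place, which is why the patch allocates a separate
temporary for the load result.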




[PULL 22/84] target/arm: Drop tcg_temp_free from translator-m-nocp.c

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/tcg/translate-m-nocp.c | 20 
 1 file changed, 20 deletions(-)

diff --git a/target/arm/tcg/translate-m-nocp.c b/target/arm/tcg/translate-m-nocp.c
index 5df7d46120..9a89aab785 100644
--- a/target/arm/tcg/translate-m-nocp.c
+++ b/target/arm/tcg/translate-m-nocp.c
@@ -91,7 +91,6 @@ static bool trans_VLLDM_VLSTM(DisasContext *s, arg_VLLDM_VLSTM *a)
 } else {
 gen_helper_v7m_vlstm(cpu_env, fptr);
 }
-tcg_temp_free_i32(fptr);
 
 clear_eci_state(s);
 
@@ -303,8 +302,6 @@ static void gen_branch_fpInactive(DisasContext *s, TCGCond cond,
 tcg_gen_andi_i32(fpca, fpca, R_V7M_CONTROL_FPCA_MASK);
 tcg_gen_or_i32(fpca, fpca, aspen);
 tcg_gen_brcondi_i32(tcg_invert_cond(cond), fpca, 0, label);
-tcg_temp_free_i32(aspen);
-tcg_temp_free_i32(fpca);
 }
 
 static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
@@ -328,7 +325,6 @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
 case ARM_VFP_FPSCR:
 tmp = loadfn(s, opaque, true);
 gen_helper_vfp_set_fpscr(cpu_env, tmp);
-tcg_temp_free_i32(tmp);
 gen_lookup_tb(s);
 break;
 case ARM_VFP_FPSCR_NZCVQC:
@@ -351,7 +347,6 @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
 tcg_gen_andi_i32(fpscr, fpscr, ~FPCR_NZCV_MASK);
 tcg_gen_or_i32(fpscr, fpscr, tmp);
 store_cpu_field(fpscr, vfp.xregs[ARM_VFP_FPSCR]);
-tcg_temp_free_i32(tmp);
 break;
 }
 case ARM_VFP_FPCXT_NS:
@@ -400,8 +395,6 @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
 tcg_gen_andi_i32(tmp, tmp, ~FPCR_NZCV_MASK);
 gen_helper_vfp_set_fpscr(cpu_env, tmp);
 s->base.is_jmp = DISAS_UPDATE_NOCHAIN;
-tcg_temp_free_i32(tmp);
-tcg_temp_free_i32(sfpa);
 break;
 }
 case ARM_VFP_VPR:
@@ -423,7 +416,6 @@ static bool gen_M_fp_sysreg_write(DisasContext *s, int regno,
 R_V7M_VPR_P0_SHIFT, R_V7M_VPR_P0_LENGTH);
 store_cpu_field(vpr, v7m.vpr);
 s->base.is_jmp = DISAS_UPDATE_NOCHAIN;
-tcg_temp_free_i32(tmp);
 break;
 }
 default:
@@ -491,7 +483,6 @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
 tcg_gen_andi_i32(sfpa, control, R_V7M_CONTROL_SFPA_MASK);
 tcg_gen_shli_i32(sfpa, sfpa, 31 - R_V7M_CONTROL_SFPA_SHIFT);
 tcg_gen_or_i32(tmp, tmp, sfpa);
-tcg_temp_free_i32(sfpa);
 /*
  * Store result before updating FPSCR etc, in case
  * it is a memory write which causes an exception.
@@ -505,7 +496,6 @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
 store_cpu_field(control, v7m.control[M_REG_S]);
 fpscr = load_cpu_field(v7m.fpdscr[M_REG_NS]);
 gen_helper_vfp_set_fpscr(cpu_env, fpscr);
-tcg_temp_free_i32(fpscr);
 lookup_tb = true;
 break;
 }
@@ -546,7 +536,6 @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
 tcg_gen_andi_i32(sfpa, control, R_V7M_CONTROL_SFPA_MASK);
 tcg_gen_shli_i32(sfpa, sfpa, 31 - R_V7M_CONTROL_SFPA_SHIFT);
 tcg_gen_or_i32(tmp, tmp, sfpa);
-tcg_temp_free_i32(control);
 /* Store result before updating FPSCR, in case it faults */
 storefn(s, opaque, tmp, true);
 /* If SFPA is zero then set FPSCR from FPDSCR_NS */
@@ -554,9 +543,6 @@ static bool gen_M_fp_sysreg_read(DisasContext *s, int regno,
 tcg_gen_movcond_i32(TCG_COND_EQ, fpscr, sfpa, tcg_constant_i32(0),
 fpdscr, fpscr);
 gen_helper_vfp_set_fpscr(cpu_env, fpscr);
-tcg_temp_free_i32(sfpa);
-tcg_temp_free_i32(fpdscr);
-tcg_temp_free_i32(fpscr);
 break;
 }
 case ARM_VFP_VPR:
@@ -598,7 +584,6 @@ static void fp_sysreg_to_gpr(DisasContext *s, void *opaque, TCGv_i32 value,
 if (a->rt == 15) {
 /* Set the 4 flag bits in the CPSR */
 gen_set_nzcv(value);
-tcg_temp_free_i32(value);
 } else {
 store_reg(s, a->rt, value);
 }
@@ -666,7 +651,6 @@ static void fp_sysreg_to_memory(DisasContext *s, void *opaque, TCGv_i32 value,
 if (do_access) {
 gen_aa32_st_i32(s, value, addr, get_mem_index(s),
 MO_UL | MO_ALIGN | s->be_data);
-tcg_temp_free_i32(value);
 }
 
 if (a->w) {
@@ -675,8 +659,6 @@ static void fp_sysreg_to_memory(DisasContext *s, void *opaque, TCGv_i32 value,
 tcg_gen_addi_i32(addr, addr, offset);
 }
 store_reg(s, a->rn, addr);
-} else {
-tcg_temp_free_i32(addr);
 }
 }
 
@@ -717,8 +699,6 @@ static TCGv_i32 memory_to_fp_sysreg(DisasContext *s, void *opaque,
 tcg_gen_addi_i32(addr, addr, offset);
 }
 store_reg(s, a->rn, addr);
-} else {
-

[PULL 48/84] target/openrisc: Drop tcg_temp_free

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/openrisc/translate.c | 39 -
 1 file changed, 39 deletions(-)

diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index b8cd8e0964..76e53c78d4 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -206,10 +206,8 @@ static void gen_add(DisasContext *dc, TCGv dest, TCGv srca, TCGv srcb)
 tcg_gen_xor_tl(cpu_sr_ov, srca, srcb);
 tcg_gen_xor_tl(t0, res, srcb);
 tcg_gen_andc_tl(cpu_sr_ov, t0, cpu_sr_ov);
-tcg_temp_free(t0);
 
 tcg_gen_mov_tl(dest, res);
-tcg_temp_free(res);
 
 gen_ove_cyov(dc);
 }
@@ -224,10 +222,8 @@ static void gen_addc(DisasContext *dc, TCGv dest, TCGv srca, TCGv srcb)
 tcg_gen_xor_tl(cpu_sr_ov, srca, srcb);
 tcg_gen_xor_tl(t0, res, srcb);
 tcg_gen_andc_tl(cpu_sr_ov, t0, cpu_sr_ov);
-tcg_temp_free(t0);
 
 tcg_gen_mov_tl(dest, res);
-tcg_temp_free(res);
 
 gen_ove_cyov(dc);
 }
@@ -243,7 +239,6 @@ static void gen_sub(DisasContext *dc, TCGv dest, TCGv srca, TCGv srcb)
 tcg_gen_setcond_tl(TCG_COND_LTU, cpu_sr_cy, srca, srcb);
 
 tcg_gen_mov_tl(dest, res);
-tcg_temp_free(res);
 
 gen_ove_cyov(dc);
 }
@@ -255,7 +250,6 @@ static void gen_mul(DisasContext *dc, TCGv dest, TCGv srca, TCGv srcb)
 tcg_gen_muls2_tl(dest, cpu_sr_ov, srca, srcb);
 tcg_gen_sari_tl(t0, dest, TARGET_LONG_BITS - 1);
 tcg_gen_setcond_tl(TCG_COND_NE, cpu_sr_ov, cpu_sr_ov, t0);
-tcg_temp_free(t0);
 
 tcg_gen_neg_tl(cpu_sr_ov, cpu_sr_ov);
 gen_ove_ov(dc);
@@ -278,7 +272,6 @@ static void gen_div(DisasContext *dc, TCGv dest, TCGv srca, TCGv srcb)
Supress the host-side exception by dividing by 1.  */
 tcg_gen_or_tl(t0, srcb, cpu_sr_ov);
 tcg_gen_div_tl(dest, srca, t0);
-tcg_temp_free(t0);
 
 tcg_gen_neg_tl(cpu_sr_ov, cpu_sr_ov);
 gen_ove_ov(dc);
@@ -293,7 +286,6 @@ static void gen_divu(DisasContext *dc, TCGv dest, TCGv srca, TCGv srcb)
Supress the host-side exception by dividing by 1.  */
 tcg_gen_or_tl(t0, srcb, cpu_sr_cy);
 tcg_gen_divu_tl(dest, srca, t0);
-tcg_temp_free(t0);
 
 gen_ove_cy(dc);
 }
@@ -314,14 +306,11 @@ static void gen_muld(DisasContext *dc, TCGv srca, TCGv srcb)
 tcg_gen_muls2_i64(cpu_mac, high, t1, t2);
 tcg_gen_sari_i64(t1, cpu_mac, 63);
 tcg_gen_setcond_i64(TCG_COND_NE, t1, t1, high);
-tcg_temp_free_i64(high);
 tcg_gen_trunc_i64_tl(cpu_sr_ov, t1);
 tcg_gen_neg_tl(cpu_sr_ov, cpu_sr_ov);
 
 gen_ove_ov(dc);
 }
-tcg_temp_free_i64(t1);
-tcg_temp_free_i64(t2);
 }
 
 static void gen_muldu(DisasContext *dc, TCGv srca, TCGv srcb)
@@ -340,12 +329,9 @@ static void gen_muldu(DisasContext *dc, TCGv srca, TCGv srcb)
 tcg_gen_mulu2_i64(cpu_mac, high, t1, t2);
 tcg_gen_setcondi_i64(TCG_COND_NE, high, high, 0);
 tcg_gen_trunc_i64_tl(cpu_sr_cy, high);
-tcg_temp_free_i64(high);
 
 gen_ove_cy(dc);
 }
-tcg_temp_free_i64(t1);
-tcg_temp_free_i64(t2);
 }
 
 static void gen_mac(DisasContext *dc, TCGv srca, TCGv srcb)
@@ -362,14 +348,12 @@ static void gen_mac(DisasContext *dc, TCGv srca, TCGv srcb)
 tcg_gen_add_i64(cpu_mac, cpu_mac, t1);
 tcg_gen_xor_i64(t1, t1, cpu_mac);
 tcg_gen_andc_i64(t1, t1, t2);
-tcg_temp_free_i64(t2);
 
 #if TARGET_LONG_BITS == 32
 tcg_gen_extrh_i64_i32(cpu_sr_ov, t1);
 #else
 tcg_gen_mov_i64(cpu_sr_ov, t1);
 #endif
-tcg_temp_free_i64(t1);
 
 gen_ove_ov(dc);
 }
@@ -382,13 +366,11 @@ static void gen_macu(DisasContext *dc, TCGv srca, TCGv srcb)
 tcg_gen_extu_tl_i64(t1, srca);
 tcg_gen_extu_tl_i64(t2, srcb);
 tcg_gen_mul_i64(t1, t1, t2);
-tcg_temp_free_i64(t2);
 
 /* Note that overflow is only computed during addition stage.  */
 tcg_gen_add_i64(cpu_mac, cpu_mac, t1);
 tcg_gen_setcond_i64(TCG_COND_LTU, t1, cpu_mac, t1);
 tcg_gen_trunc_i64_tl(cpu_sr_cy, t1);
-tcg_temp_free_i64(t1);
 
 gen_ove_cy(dc);
 }
@@ -407,14 +389,12 @@ static void gen_msb(DisasContext *dc, TCGv srca, TCGv srcb)
 tcg_gen_sub_i64(cpu_mac, cpu_mac, t1);
 tcg_gen_xor_i64(t1, t1, cpu_mac);
 tcg_gen_and_i64(t1, t1, t2);
-tcg_temp_free_i64(t2);
 
 #if TARGET_LONG_BITS == 32
 tcg_gen_extrh_i64_i32(cpu_sr_ov, t1);
 #else
 tcg_gen_mov_i64(cpu_sr_ov, t1);
 #endif
-tcg_temp_free_i64(t1);
 
 gen_ove_ov(dc);
 }
@@ -432,8 +412,6 @@ static void gen_msbu(DisasContext *dc, TCGv srca, TCGv srcb)
 tcg_gen_setcond_i64(TCG_COND_LTU, t2, cpu_mac, t1);
 tcg_gen_sub_i64(cpu_mac, cpu_mac, t1);
 tcg_gen_trunc_i64_tl(cpu_sr_cy, t2);
-tcg_temp_free_i64(t2);
-tcg_temp_free_i64(t1);
 
 gen_ove_cy(dc);
 }
@@ -672,7 +650,6 @@ static bool trans_l_lwa(DisasContext *dc, arg_load *a)
 tcg_gen_qemu_ld_tl(cpu_R(dc, a->d), ea, dc->mem_idx, MO_TEUL);
 

[PULL 82/84] target/xtensa: Use tcg_gen_subfi_i32 in translate_sll

2023-03-05 Thread Richard Henderson
Reviewed-by: Max Filippov 
Signed-off-by: Richard Henderson 
---
 target/xtensa/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index 41b84082de..2903c73f8e 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -2324,8 +2324,8 @@ static void translate_sll(DisasContext *dc, const OpcodeArg arg[],
 tcg_gen_shl_i32(arg[0].out, arg[1].in, dc->sar_m32);
 } else {
 TCGv_i64 v = tcg_temp_new_i64();
-TCGv_i32 s = tcg_const_i32(32);
-tcg_gen_sub_i32(s, s, cpu_SR[SAR]);
+TCGv_i32 s = tcg_temp_new();
+tcg_gen_subfi_i32(s, 32, cpu_SR[SAR]);
 tcg_gen_andi_i32(s, s, 0x3f);
 tcg_gen_extu_i32_i64(v, arg[1].in);
 gen_shift_reg(shl, s);
-- 
2.34.1
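
For readers unfamiliar with the "subfi" form: tcg_gen_subfi_i32(ret, c, x)
computes ret = c - x, i.e. it subtracts a value from an immediate. The sketch
below models the shift amount from the hunk above in plain C. The helper
names subfi_u32 and sll_shift_amount are made up for illustration and the
SAR value is passed in as an ordinary integer.

```c
#include <assert.h>
#include <stdint.h>

/* Models tcg_gen_subfi_i32(ret, c, x): ret = c - x. */
static inline uint32_t subfi_u32(uint32_t c, uint32_t x)
{
    return c - x;   /* unsigned arithmetic wraps modulo 2^32 */
}

/* Shift amount as computed in the else branch of translate_sll. */
static inline uint32_t sll_shift_amount(uint32_t sar)
{
    return subfi_u32(32, sar) & 0x3f;
}

static inline void sll_selfcheck(void)
{
    assert(sll_shift_amount(0) == 32);
    assert(sll_shift_amount(5) == 27);
    assert(sll_shift_amount(32) == 0);
    assert(sll_shift_amount(33) == 63);  /* 32 - 33 wraps; low six bits set */
}
```

The result is the same as the old tcg_const_i32(32) followed by
tcg_gen_sub_i32, but no writable constant temporary is needed.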




[PULL 71/84] target/hexagon/idef-parser: Use gen_tmp for gen_pred_assign

2023-03-05 Thread Richard Henderson
The allocation is immediately followed by tcg_gen_mov_i32,
so the initial assignment of zero is discarded.

Reviewed-by: Taylor Simpson 
Signed-off-by: Richard Henderson 
---
 target/hexagon/idef-parser/parser-helpers.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hexagon/idef-parser/parser-helpers.c b/target/hexagon/idef-parser/parser-helpers.c
index be979dac86..760e499149 100644
--- a/target/hexagon/idef-parser/parser-helpers.c
+++ b/target/hexagon/idef-parser/parser-helpers.c
@@ -1743,7 +1743,7 @@ void gen_pred_assign(Context *c, YYLTYPE *locp, HexValue *left_pred,
  "Predicate assign not allowed in ternary!");
 /* Extract predicate TCGv */
 if (is_direct) {
-*left_pred = gen_tmp_value(c, locp, "0", 32, UNSIGNED);
+*left_pred = gen_tmp(c, locp, 32, UNSIGNED);
 }
 /* Extract first 8 bits, and store new predicate value */
 OUT(c, locp, "tcg_gen_mov_i32(", left_pred, ", ", , ");\n");
-- 
2.34.1




[PULL 39/84] target/hexagon/idef-parser: Drop HexValue.is_manual

2023-03-05 Thread Richard Henderson
This field is no longer used.

Reviewed-by: Taylor Simpson 
Signed-off-by: Richard Henderson 
---
 target/hexagon/idef-parser/idef-parser.h|  1 -
 target/hexagon/idef-parser/parser-helpers.c | 15 ---
 target/hexagon/idef-parser/idef-parser.y|  2 --
 3 files changed, 18 deletions(-)

diff --git a/target/hexagon/idef-parser/idef-parser.h b/target/hexagon/idef-parser/idef-parser.h
index 5c49d4da3e..17d2ebfaf6 100644
--- a/target/hexagon/idef-parser/idef-parser.h
+++ b/target/hexagon/idef-parser/idef-parser.h
@@ -185,7 +185,6 @@ typedef struct HexValue {
 unsigned bit_width; /**< Bit width of the rvalue */
 HexSignedness signedness;   /**< Unsigned flag for the rvalue */
 bool is_dotnew; /**< rvalue of predicate type is dotnew? */
-bool is_manual; /**< Opt out of automatic freeing of params */
 } HexValue;
 
 /**
diff --git a/target/hexagon/idef-parser/parser-helpers.c b/target/hexagon/idef-parser/parser-helpers.c
index bdbb8b6a5f..0b401f7dbe 100644
--- a/target/hexagon/idef-parser/parser-helpers.c
+++ b/target/hexagon/idef-parser/parser-helpers.c
@@ -278,7 +278,6 @@ static HexValue gen_constant(Context *c,
 rvalue.bit_width = bit_width;
 rvalue.signedness = signedness;
 rvalue.is_dotnew = false;
-rvalue.is_manual = true;
 rvalue.tmp.index = c->inst.tmp_count;
 OUT(c, locp, "TCGv_i", _width, " tmp_", >inst.tmp_count,
 " = tcg_constant_i", _width, "(", value, ");\n");
@@ -299,7 +298,6 @@ HexValue gen_tmp(Context *c,
 rvalue.bit_width = bit_width;
 rvalue.signedness = signedness;
 rvalue.is_dotnew = false;
-rvalue.is_manual = false;
 rvalue.tmp.index = c->inst.tmp_count;
 OUT(c, locp, "TCGv_i", _width, " tmp_", >inst.tmp_count,
 " = tcg_temp_new_i", _width, "();\n");
@@ -320,7 +318,6 @@ HexValue gen_tmp_value(Context *c,
 rvalue.bit_width = bit_width;
 rvalue.signedness = signedness;
 rvalue.is_dotnew = false;
-rvalue.is_manual = false;
 rvalue.tmp.index = c->inst.tmp_count;
 OUT(c, locp, "TCGv_i", _width, " tmp_", >inst.tmp_count,
 " = tcg_const_i", _width, "(", value, ");\n");
@@ -339,7 +336,6 @@ static HexValue gen_tmp_value_from_imm(Context *c,
 rvalue.bit_width = value->bit_width;
 rvalue.signedness = value->signedness;
 rvalue.is_dotnew = false;
-rvalue.is_manual = false;
 rvalue.tmp.index = c->inst.tmp_count;
 /*
  * Here we output the call to `tcg_const_i` in
@@ -375,7 +371,6 @@ HexValue gen_imm_value(Context *c __attribute__((unused)),
 rvalue.bit_width = bit_width;
 rvalue.signedness = signedness;
 rvalue.is_dotnew = false;
-rvalue.is_manual = false;
 rvalue.imm.type = VALUE;
 rvalue.imm.value = value;
 return rvalue;
@@ -390,7 +385,6 @@ HexValue gen_imm_qemu_tmp(Context *c, YYLTYPE *locp, unsigned bit_width,
 memset(, 0, sizeof(HexValue));
 rvalue.type = IMMEDIATE;
 rvalue.is_dotnew = false;
-rvalue.is_manual = false;
 rvalue.bit_width = bit_width;
 rvalue.signedness = signedness;
 rvalue.imm.type = QEMU_TMP;
@@ -1242,15 +1236,12 @@ void gen_rdeposit_op(Context *c,
  */
 k64 = gen_bin_op(c, locp, SUB_OP, , _m);
 mask = gen_bin_op(c, locp, LSR_OP, , );
-begin_m.is_manual = true;
 mask = gen_bin_op(c, locp, ASL_OP, , _m);
-mask.is_manual = true;
 value_m = gen_bin_op(c, locp, ASL_OP, _m, _m);
 value_m = gen_bin_op(c, locp, ANDB_OP, _m, );
 
 OUT(c, locp, "tcg_gen_not_i", >bit_width, "(", , ", ",
 , ");\n");
-mask.is_manual = false;
 res = gen_bin_op(c, locp, ANDB_OP, dst, );
 res = gen_bin_op(c, locp, ORB_OP, , _m);
 
@@ -1410,8 +1401,6 @@ HexValue gen_convround(Context *c,
 HexValue and;
 HexValue src_p1;
 
-src_m.is_manual = true;
-
 and = gen_bin_op(c, locp, ANDB_OP, _m, );
 src_p1 = gen_bin_op(c, locp, ADD_OP, _m, );
 
@@ -1569,10 +1558,6 @@ HexValue gen_round(Context *c,
 b = gen_extend_op(c, locp, _width, 64, pos, UNSIGNED);
 b = rvalue_materialize(c, locp, );
 
-/* Disable auto-free of values used more than once */
-a.is_manual = true;
-b.is_manual = true;
-
 n_m1 = gen_bin_op(c, locp, SUB_OP, , );
 shifted = gen_bin_op(c, locp, ASL_OP, , _m1);
 sum = gen_bin_op(c, locp, ADD_OP, , );
diff --git a/target/hexagon/idef-parser/idef-parser.y b/target/hexagon/idef-parser/idef-parser.y
index 59c93f85b4..fae291e5f8 100644
--- a/target/hexagon/idef-parser/idef-parser.y
+++ b/target/hexagon/idef-parser/idef-parser.y
@@ -534,7 +534,6 @@ rvalue : FAIL
  rvalue.imm.type = IMM_CONSTEXT;
  rvalue.signedness = UNSIGNED;
  rvalue.is_dotnew = false;
- rvalue.is_manual = false;
  $$ = rvalue;
  }
| var
@@ -693,7 +692,6 @@ rvalue : FAIL
  }
| rvalue '?'
  {
- $1.is_manual = true;
  Ternary t = { 0 };
 

[PULL 23/84] target/arm: Drop tcg_temp_free from translator-mve.c

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/tcg/translate-mve.c | 52 --
 1 file changed, 52 deletions(-)

diff --git a/target/arm/tcg/translate-mve.c b/target/arm/tcg/translate-mve.c
index db7ea3f603..798b4fddfe 100644
--- a/target/arm/tcg/translate-mve.c
+++ b/target/arm/tcg/translate-mve.c
@@ -178,7 +178,6 @@ static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn,
 
 qreg = mve_qreg_ptr(a->qd);
 fn(cpu_env, qreg, addr);
-tcg_temp_free_ptr(qreg);
 
 /*
  * Writeback always happens after the last beat of the insn,
@@ -189,8 +188,6 @@ static bool do_ldst(DisasContext *s, arg_VLDR_VSTR *a, MVEGenLdStFn *fn,
 tcg_gen_addi_i32(addr, addr, offset);
 }
 store_reg(s, a->rn, addr);
-} else {
-tcg_temp_free_i32(addr);
 }
 mve_update_eci(s);
 return true;
@@ -242,9 +239,6 @@ static bool do_ldst_sg(DisasContext *s, arg_vldst_sg *a, MVEGenLdStSGFn fn)
 qd = mve_qreg_ptr(a->qd);
 qm = mve_qreg_ptr(a->qm);
 fn(cpu_env, qd, qm, addr);
-tcg_temp_free_ptr(qd);
-tcg_temp_free_ptr(qm);
-tcg_temp_free_i32(addr);
 mve_update_eci(s);
 return true;
 }
@@ -341,8 +335,6 @@ static bool do_ldst_sg_imm(DisasContext *s, arg_vldst_sg_imm *a,
 qd = mve_qreg_ptr(a->qd);
 qm = mve_qreg_ptr(a->qm);
 fn(cpu_env, qd, qm, tcg_constant_i32(offset));
-tcg_temp_free_ptr(qd);
-tcg_temp_free_ptr(qm);
 mve_update_eci(s);
 return true;
 }
@@ -414,8 +406,6 @@ static bool do_vldst_il(DisasContext *s, arg_vldst_il *a, MVEGenLdStIlFn *fn,
 if (a->w) {
 tcg_gen_addi_i32(rn, rn, addrinc);
 store_reg(s, a->rn, rn);
-} else {
-tcg_temp_free_i32(rn);
 }
 mve_update_and_store_eci(s);
 return true;
@@ -506,9 +496,7 @@ static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
 qd = mve_qreg_ptr(a->qd);
 tcg_gen_dup_i32(a->size, rt, rt);
 gen_helper_mve_vdup(cpu_env, qd, rt);
-tcg_temp_free_ptr(qd);
 }
-tcg_temp_free_i32(rt);
 mve_update_eci(s);
 return true;
 }
@@ -534,8 +522,6 @@ static bool do_1op_vec(DisasContext *s, arg_1op *a, MVEGenOneOpFn fn,
 qd = mve_qreg_ptr(a->qd);
 qm = mve_qreg_ptr(a->qm);
 fn(cpu_env, qd, qm);
-tcg_temp_free_ptr(qd);
-tcg_temp_free_ptr(qm);
 }
 mve_update_eci(s);
 return true;
@@ -631,8 +617,6 @@ static bool do_vcvt_rmode(DisasContext *s, arg_1op *a,
 qd = mve_qreg_ptr(a->qd);
 qm = mve_qreg_ptr(a->qm);
 fn(cpu_env, qd, qm, tcg_constant_i32(arm_rmode_to_sf(rmode)));
-tcg_temp_free_ptr(qd);
-tcg_temp_free_ptr(qm);
 mve_update_eci(s);
 return true;
 }
@@ -821,9 +805,6 @@ static bool do_2op_vec(DisasContext *s, arg_2op *a, MVEGenTwoOpFn fn,
 qn = mve_qreg_ptr(a->qn);
 qm = mve_qreg_ptr(a->qm);
 fn(cpu_env, qd, qn, qm);
-tcg_temp_free_ptr(qd);
-tcg_temp_free_ptr(qn);
-tcg_temp_free_ptr(qm);
 }
 mve_update_eci(s);
 return true;
@@ -1076,9 +1057,6 @@ static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
 qn = mve_qreg_ptr(a->qn);
 rm = load_reg(s, a->rm);
 fn(cpu_env, qd, qn, rm);
-tcg_temp_free_i32(rm);
-tcg_temp_free_ptr(qd);
-tcg_temp_free_ptr(qn);
 mve_update_eci(s);
 return true;
 }
@@ -1204,15 +1182,11 @@ static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
 rdalo = load_reg(s, a->rdalo);
 rdahi = load_reg(s, a->rdahi);
 tcg_gen_concat_i32_i64(rda, rdalo, rdahi);
-tcg_temp_free_i32(rdalo);
-tcg_temp_free_i32(rdahi);
 } else {
 rda = tcg_const_i64(0);
 }
 
 fn(rda, cpu_env, qn, qm, rda);
-tcg_temp_free_ptr(qn);
-tcg_temp_free_ptr(qm);
 
 rdalo = tcg_temp_new_i32();
 rdahi = tcg_temp_new_i32();
@@ -1220,7 +1194,6 @@ static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a,
 tcg_gen_extrh_i64_i32(rdahi, rda);
 store_reg(s, a->rdalo, rdalo);
 store_reg(s, a->rdahi, rdahi);
-tcg_temp_free_i64(rda);
 mve_update_eci(s);
 return true;
 }
@@ -1312,8 +1285,6 @@ static bool do_dual_acc(DisasContext *s, arg_vmladav *a, MVEGenDualAccOpFn *fn)
 
 fn(rda, cpu_env, qn, qm, rda);
 store_reg(s, a->rda, rda);
-tcg_temp_free_ptr(qn);
-tcg_temp_free_ptr(qm);
 
 mve_update_eci(s);
 return true;
@@ -1451,7 +1422,6 @@ static bool trans_VADDV(DisasContext *s, arg_VADDV *a)
 qm = mve_qreg_ptr(a->qm);
 fns[a->size][a->u](rda, cpu_env, qm, rda);
 store_reg(s, a->rda, rda);
-tcg_temp_free_ptr(qm);
 
 mve_update_eci(s);
 return true;
@@ -1494,8 +1464,6 @@ static bool trans_VADDLV(DisasContext *s, arg_VADDLV *a)
 rdalo = load_reg(s, a->rdalo);
 rdahi = load_reg(s, a->rdahi);
 tcg_gen_concat_i32_i64(rda, rdalo, rdahi);
-

[PULL 81/84] target/xtensa: Avoid tcg_const_i32 in translate_l32r

2023-03-05 Thread Richard Henderson
Use addi on the addition side and tcg_constant_i32 on the other.

Reviewed-by: Max Filippov 
Signed-off-by: Richard Henderson 
---
 target/xtensa/translate.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index d727f9ffd8..41b84082de 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -1721,10 +1721,10 @@ static void translate_l32r(DisasContext *dc, const OpcodeArg arg[],
 TCGv_i32 tmp;
 
 if (dc->base.tb->flags & XTENSA_TBFLAG_LITBASE) {
-tmp = tcg_const_i32(arg[1].raw_imm - 1);
-tcg_gen_add_i32(tmp, cpu_SR[LITBASE], tmp);
+tmp = tcg_temp_new();
+tcg_gen_addi_i32(tmp, cpu_SR[LITBASE], arg[1].raw_imm - 1);
 } else {
-tmp = tcg_const_i32(arg[1].imm);
+tmp = tcg_constant_i32(arg[1].imm);
 }
 tcg_gen_qemu_ld32u(arg[0].out, tmp, dc->cring);
 }
-- 
2.34.1
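
The two branches of translate_l32r can be modelled in plain C to show that
folding the constant into an add-immediate does not change the address. This
is a sketch only: l32r_address and its parameters are hypothetical names,
with use_litbase standing in for the XTENSA_TBFLAG_LITBASE test and litbase
for the value of cpu_SR[LITBASE].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Address loaded from by L32R, following the two branches in the patch. */
static inline uint32_t l32r_address(bool use_litbase, uint32_t litbase,
                                    uint32_t raw_imm, uint32_t imm)
{
    if (use_litbase) {
        /*
         * Old code: tmp = const(raw_imm - 1); tmp = litbase + tmp.
         * New code: tmp = litbase + (raw_imm - 1) via tcg_gen_addi_i32.
         * Both give the same 32-bit sum.
         */
        return litbase + (raw_imm - 1);
    }
    /* The address is a known immediate, so a read-only constant suffices. */
    return imm;
}

static inline void l32r_selfcheck(void)
{
    assert(l32r_address(true, 0x1000, 0x11, 0) == 0x1010);
    assert(l32r_address(false, 0x1000, 0x11, 0x2000) == 0x2000);
    assert(l32r_address(true, 0, 0, 0) == 0xffffffffu);  /* wraps */
}
```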




[PULL 78/84] target/sparc: Avoid tcg_const_{tl,i32}

2023-03-05 Thread Richard Henderson
All remaining uses are strictly read-only.

Reviewed-by: Mark Cave-Ayland 
Signed-off-by: Richard Henderson 
---
 target/sparc/translate.c | 80 +++-
 1 file changed, 38 insertions(+), 42 deletions(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 814f3f8b1e..5ee293326c 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -550,7 +550,7 @@ static inline void gen_op_mulscc(TCGv dst, TCGv src1, TCGv src2)
 if (!(env->y & 1))
 T1 = 0;
 */
-zero = tcg_const_tl(0);
+zero = tcg_constant_tl(0);
 tcg_gen_andi_tl(cpu_cc_src, src1, 0x);
 tcg_gen_andi_tl(r_temp, cpu_y, 0x1);
 tcg_gen_andi_tl(cpu_cc_src2, src2, 0x);
@@ -928,8 +928,8 @@ static void gen_branch_n(DisasContext *dc, target_ulong pc1)
 tcg_gen_mov_tl(cpu_pc, cpu_npc);
 
 tcg_gen_addi_tl(cpu_npc, cpu_npc, 4);
-t = tcg_const_tl(pc1);
-z = tcg_const_tl(0);
+t = tcg_constant_tl(pc1);
+z = tcg_constant_tl(0);
 tcg_gen_movcond_tl(TCG_COND_NE, cpu_npc, cpu_cond, z, t, cpu_npc);
 
 dc->pc = DYNAMIC_PC;
@@ -938,9 +938,9 @@ static void gen_branch_n(DisasContext *dc, target_ulong pc1)
 
 static inline void gen_generic_branch(DisasContext *dc)
 {
-TCGv npc0 = tcg_const_tl(dc->jump_pc[0]);
-TCGv npc1 = tcg_const_tl(dc->jump_pc[1]);
-TCGv zero = tcg_const_tl(0);
+TCGv npc0 = tcg_constant_tl(dc->jump_pc[0]);
+TCGv npc1 = tcg_constant_tl(dc->jump_pc[1]);
+TCGv zero = tcg_constant_tl(0);
 
 tcg_gen_movcond_tl(TCG_COND_NE, cpu_npc, cpu_cond, zero, npc0, npc1);
 }
@@ -981,18 +981,14 @@ static inline void save_state(DisasContext *dc)
 
 static void gen_exception(DisasContext *dc, int which)
 {
-TCGv_i32 t;
-
 save_state(dc);
-t = tcg_const_i32(which);
-gen_helper_raise_exception(cpu_env, t);
+gen_helper_raise_exception(cpu_env, tcg_constant_i32(which));
 dc->base.is_jmp = DISAS_NORETURN;
 }
 
 static void gen_check_align(TCGv addr, int mask)
 {
-TCGv_i32 r_mask = tcg_const_i32(mask);
-gen_helper_check_align(cpu_env, addr, r_mask);
+gen_helper_check_align(cpu_env, addr, tcg_constant_i32(mask));
 }
 
 static inline void gen_mov_pc_npc(DisasContext *dc)
@@ -1074,7 +1070,7 @@ static void gen_compare(DisasCompare *cmp, bool xcc, unsigned int cond,
 cmp->cond = logic_cond[cond];
 do_compare_dst_0:
 cmp->is_bool = false;
-cmp->c2 = tcg_const_tl(0);
+cmp->c2 = tcg_constant_tl(0);
 #ifdef TARGET_SPARC64
 if (!xcc) {
 cmp->c1 = tcg_temp_new();
@@ -1127,7 +1123,7 @@ static void gen_compare(DisasCompare *cmp, bool xcc, unsigned int cond,
 cmp->cond = TCG_COND_NE;
 cmp->is_bool = true;
 cmp->c1 = r_dst = tcg_temp_new();
-cmp->c2 = tcg_const_tl(0);
+cmp->c2 = tcg_constant_tl(0);
 
 switch (cond) {
 case 0x0:
@@ -1192,7 +1188,7 @@ static void gen_fcompare(DisasCompare *cmp, unsigned int cc, unsigned int cond)
 cmp->cond = TCG_COND_NE;
 cmp->is_bool = true;
 cmp->c1 = r_dst = tcg_temp_new();
-cmp->c2 = tcg_const_tl(0);
+cmp->c2 = tcg_constant_tl(0);
 
 switch (cc) {
 default:
@@ -1307,7 +1303,7 @@ static void gen_compare_reg(DisasCompare *cmp, int cond, TCGv r_src)
 cmp->cond = tcg_invert_cond(gen_tcg_cond_reg[cond]);
 cmp->is_bool = false;
 cmp->c1 = r_src;
-cmp->c2 = tcg_const_tl(0);
+cmp->c2 = tcg_constant_tl(0);
 }
 
 static inline void gen_cond_reg(TCGv r_dst, int cond, TCGv r_src)
@@ -1908,7 +1904,7 @@ static void gen_swap(DisasContext *dc, TCGv dst, TCGv src,
 
 static void gen_ldstub(DisasContext *dc, TCGv dst, TCGv addr, int mmu_idx)
 {
-TCGv m1 = tcg_const_tl(0xff);
+TCGv m1 = tcg_constant_tl(0xff);
 gen_address_mask(dc, addr);
 tcg_gen_atomic_xchg_tl(dst, addr, m1, mmu_idx, MO_UB);
 }
@@ -2163,8 +2159,8 @@ static void gen_ld_asi(DisasContext *dc, TCGv dst, TCGv addr,
 break;
 default:
 {
-TCGv_i32 r_asi = tcg_const_i32(da.asi);
-TCGv_i32 r_mop = tcg_const_i32(memop);
+TCGv_i32 r_asi = tcg_constant_i32(da.asi);
+TCGv_i32 r_mop = tcg_constant_i32(memop);
 
 save_state(dc);
 #ifdef TARGET_SPARC64
@@ -2217,7 +2213,7 @@ static void gen_st_asi(DisasContext *dc, TCGv src, TCGv addr,
 {
 TCGv saddr = tcg_temp_new();
 TCGv daddr = tcg_temp_new();
-TCGv four = tcg_const_tl(4);
+TCGv four = tcg_constant_tl(4);
 TCGv_i32 tmp = tcg_temp_new_i32();
 int i;
 
@@ -2236,8 +2232,8 @@ static void gen_st_asi(DisasContext *dc, TCGv src, TCGv addr,
 #endif
 default:
 {
-TCGv_i32 r_asi = tcg_const_i32(da.asi);
-TCGv_i32 r_mop = tcg_const_i32(memop & MO_SIZE);
+TCGv_i32 r_asi = tcg_constant_i32(da.asi);
+TCGv_i32 r_mop = tcg_constant_i32(memop & MO_SIZE);

[PULL 76/84] target/riscv: Avoid tcg_const_*

2023-03-05 Thread Richard Henderson
All uses are strictly read-only.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/riscv/translate.c  | 4 ++--
 target/riscv/insn_trans/trans_rvv.c.inc   | 4 ++--
 target/riscv/insn_trans/trans_rvzfh.c.inc | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 0485abbf7a..93909207d2 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -201,8 +201,8 @@ static void gen_nanbox_h(TCGv_i64 out, TCGv_i64 in)
  */
 static void gen_check_nanbox_h(TCGv_i64 out, TCGv_i64 in)
 {
-TCGv_i64 t_max = tcg_const_i64(0xull);
-TCGv_i64 t_nan = tcg_const_i64(0x7e00ull);
+TCGv_i64 t_max = tcg_constant_i64(0xull);
+TCGv_i64 t_nan = tcg_constant_i64(0x7e00ull);
 
 tcg_gen_movcond_i64(TCG_COND_GEU, out, in, t_max, in, t_nan);
 }
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc
index fa3f16eddd..f2e3d38515 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -209,8 +209,8 @@ static bool trans_vsetvli(DisasContext *s, arg_vsetvli *a)
 
 static bool trans_vsetivli(DisasContext *s, arg_vsetivli *a)
 {
-TCGv s1 = tcg_const_tl(a->rs1);
-TCGv s2 = tcg_const_tl(a->zimm);
+TCGv s1 = tcg_constant_tl(a->rs1);
+TCGv s2 = tcg_constant_tl(a->zimm);
 return do_vsetivli(s, a->rd, s1, s2);
 }
 
diff --git a/target/riscv/insn_trans/trans_rvzfh.c.inc b/target/riscv/insn_trans/trans_rvzfh.c.inc
index d2012c2841..74dde37ff7 100644
--- a/target/riscv/insn_trans/trans_rvzfh.c.inc
+++ b/target/riscv/insn_trans/trans_rvzfh.c.inc
@@ -299,7 +299,7 @@ static bool trans_fsgnjn_h(DisasContext *ctx, arg_fsgnjn_h *a)
  * Replace bit 15 in rs1 with inverse in rs2.
  * This formulation retains the nanboxing of rs1.
  */
-mask = tcg_const_i64(~MAKE_64BIT_MASK(15, 1));
+mask = tcg_constant_i64(~MAKE_64BIT_MASK(15, 1));
 tcg_gen_not_i64(rs2, rs2);
 tcg_gen_andc_i64(rs2, rs2, mask);
 tcg_gen_and_i64(dest, mask, rs1);
-- 
2.34.1




[PULL 70/84] target/hexagon/idef-parser: Use gen_tmp for LPCFG

2023-03-05 Thread Richard Henderson
The GET_USR_FIELD macro initializes the output, so the initial assignment
of zero is discarded.  This is the only use of gen_tmp_value outside of
parser-helpers.c, so make it static.

Reviewed-by: Taylor Simpson 
Signed-off-by: Richard Henderson 
---
 target/hexagon/idef-parser/parser-helpers.h | 6 --
 target/hexagon/idef-parser/parser-helpers.c | 2 +-
 target/hexagon/idef-parser/idef-parser.y| 2 +-
 3 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/target/hexagon/idef-parser/parser-helpers.h b/target/hexagon/idef-parser/parser-helpers.h
index 4c89498f5b..1239d23a6a 100644
--- a/target/hexagon/idef-parser/parser-helpers.h
+++ b/target/hexagon/idef-parser/parser-helpers.h
@@ -154,12 +154,6 @@ HexValue gen_tmp(Context *c,
  unsigned bit_width,
  HexSignedness signedness);
 
-HexValue gen_tmp_value(Context *c,
-   YYLTYPE *locp,
-   const char *value,
-   unsigned bit_width,
-   HexSignedness signedness);
-
 HexValue gen_imm_value(Context *c __attribute__((unused)),
YYLTYPE *locp,
int value,
diff --git a/target/hexagon/idef-parser/parser-helpers.c b/target/hexagon/idef-parser/parser-helpers.c
index 6fb5f31cf7..be979dac86 100644
--- a/target/hexagon/idef-parser/parser-helpers.c
+++ b/target/hexagon/idef-parser/parser-helpers.c
@@ -305,7 +305,7 @@ HexValue gen_tmp(Context *c,
 return rvalue;
 }
 
-HexValue gen_tmp_value(Context *c,
+static HexValue gen_tmp_value(Context *c,
YYLTYPE *locp,
const char *value,
unsigned bit_width,
diff --git a/target/hexagon/idef-parser/idef-parser.y b/target/hexagon/idef-parser/idef-parser.y
index fae291e5f8..c784726d41 100644
--- a/target/hexagon/idef-parser/idef-parser.y
+++ b/target/hexagon/idef-parser/idef-parser.y
@@ -783,7 +783,7 @@ rvalue : FAIL
  }
| LPCFG
  {
- $$ = gen_tmp_value(c, &@1, "0", 32, UNSIGNED);
+ $$ = gen_tmp(c, &@1, 32, UNSIGNED);
  OUT(c, &@1, "GET_USR_FIELD(USR_LPCFG, ", &$$, ");\n");
  }
| EXTRACT '(' rvalue ',' rvalue ')'
-- 
2.34.1




[PULL 56/84] target/sparc: Drop get_temp_i32

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries,
therefore there's no need to record temps for later freeing.
Replace the few uses with tcg_temp_new_i32.

Reviewed-by: Peter Maydell 
Acked-by: Mark Cave-Ayland 
Signed-off-by: Richard Henderson 
---
 target/sparc/translate.c | 23 +++
 1 file changed, 3 insertions(+), 20 deletions(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 2b4af692f6..a20426202e 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -84,8 +84,6 @@ typedef struct DisasContext {
 
 uint32_t cc_op;  /* current CC operation */
 sparc_def_t *def;
-TCGv_i32 t32[3];
-int n_t32;
 #ifdef TARGET_SPARC64
 int fprs_dirty;
 int asi;
@@ -129,14 +127,6 @@ static int sign_extend(int x, int len)
 
 #define IS_IMM (insn & (1<<13))
 
-static inline TCGv_i32 get_temp_i32(DisasContext *dc)
-{
-TCGv_i32 t;
-assert(dc->n_t32 < ARRAY_SIZE(dc->t32));
-dc->t32[dc->n_t32++] = t = tcg_temp_new_i32();
-return t;
-}
-
 static inline void gen_update_fprs_dirty(DisasContext *dc, int rd)
 {
 #if defined(TARGET_SPARC64)
@@ -153,7 +143,7 @@ static inline void gen_update_fprs_dirty(DisasContext *dc, int rd)
 /* floating point registers moves */
 static TCGv_i32 gen_load_fpr_F(DisasContext *dc, unsigned int src)
 {
-TCGv_i32 ret = get_temp_i32(dc);
+TCGv_i32 ret = tcg_temp_new_i32();
 if (src & 1) {
 tcg_gen_extrl_i64_i32(ret, cpu_fpr[src / 2]);
 } else {
@@ -175,7 +165,7 @@ static void gen_store_fpr_F(DisasContext *dc, unsigned int dst, TCGv_i32 v)
 
 static TCGv_i32 gen_dest_fpr_F(DisasContext *dc)
 {
-return get_temp_i32(dc);
+return tcg_temp_new_i32();
 }
 
 static TCGv_i64 gen_load_fpr_D(DisasContext *dc, unsigned int src)
@@ -5516,7 +5506,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
 break;
 }
 #endif
-cpu_dst_32 = get_temp_i32(dc);
+cpu_dst_32 = tcg_temp_new_i32();
 tcg_gen_qemu_ld_i32(cpu_dst_32, cpu_addr,
 dc->mem_idx, MO_TEUL);
 gen_helper_ldfsr(cpu_fsr, cpu_env, cpu_fsr, cpu_dst_32);
@@ -5763,13 +5753,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
 goto egress;
 #endif
  egress:
-if (dc->n_t32 != 0) {
-int i;
-for (i = dc->n_t32 - 1; i >= 0; --i) {
-tcg_temp_free_i32(dc->t32[i]);
-}
-dc->n_t32 = 0;
-}
 }
 
 static void sparc_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
-- 
2.34.1




[PULL 09/84] include/qemu/cpuid: Introduce xgetbv_low

2023-03-05 Thread Richard Henderson
Replace the two uses of asm to expand xgetbv with an inline function.
Since one of the two has been using the mnemonic, assume that the
comment about "older versions of the assembler" is obsolete, as even
that is 4 years old.
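
For illustration only, the test that both callers apply to the xgetbv
result can be sketched in portable C.  This is not QEMU code: avx_usable()
and its parameter names are hypothetical, and the XCR0 value is passed in
as a plain integer (as xgetbv_low(0) would return it) so that the logic
can be exercised without executing the instruction.

```c
#include <stdbool.h>

/* CPUID leaf 1, ECX feature bits (architectural bit positions). */
#define BIT_OSXSAVE (1u << 27)
#define BIT_AVX     (1u << 28)

/* XCR0 state-component bits: SSE state (bit 1) and AVX state (bit 2). */
#define XCR0_SSE_AVX 0x6u

/*
 * Hypothetical helper: AVX is usable only if the CPU advertises it,
 * the OS has enabled XSAVE, and the OS saves both SSE and AVX state.
 */
static bool avx_usable(unsigned cpuid1_ecx, unsigned xcr0_low)
{
    if (!(cpuid1_ecx & BIT_OSXSAVE) || !(cpuid1_ecx & BIT_AVX)) {
        return false;
    }
    return (xcr0_low & XCR0_SSE_AVX) == XCR0_SSE_AVX;
}
```

The OSXSAVE test comes first because xgetbv itself faults when the OS has
not enabled XSAVE, which is why both call sites guard it the same way.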

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/qemu/cpuid.h  |  7 +++
 util/bufferiszero.c   |  3 +--
 tcg/i386/tcg-target.c.inc | 11 ---
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h
index 7adb12d320..1451e8ef2f 100644
--- a/include/qemu/cpuid.h
+++ b/include/qemu/cpuid.h
@@ -71,4 +71,11 @@
 #define bit_LZCNT   (1 << 5)
 #endif
 
+static inline unsigned xgetbv_low(unsigned c)
+{
+unsigned a, d;
+asm("xgetbv" : "=a"(a), "=d"(d) : "c"(c));
+return a;
+}
+
 #endif /* QEMU_CPUID_H */
diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index 1790ded7d4..1886bc5ba4 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -258,8 +258,7 @@ static void __attribute__((constructor)) init_cpuid_cache(void)
 
 /* We must check that AVX is not just available, but usable.  */
 if ((c & bit_OSXSAVE) && (c & bit_AVX) && max >= 7) {
-int bv;
-__asm("xgetbv" : "=a"(bv), "=d"(d) : "c"(0));
+unsigned bv = xgetbv_low(0);
 __cpuid_count(7, 0, a, b, c, d);
 if ((bv & 0x6) == 0x6 && (b & bit_AVX2)) {
 cache |= CACHE_AVX2;
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 883ced8168..028ece62a0 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -4156,12 +4156,9 @@ static void tcg_target_init(TCGContext *s)
 /* There are a number of things we must check before we can be
sure of not hitting invalid opcode.  */
 if (c & bit_OSXSAVE) {
-unsigned xcrl, xcrh;
-/* The xgetbv instruction is not available to older versions of
- * the assembler, so we encode the instruction manually.
- */
-asm(".byte 0x0f, 0x01, 0xd0" : "=a" (xcrl), "=d" (xcrh) : "c" (0));
-if ((xcrl & 6) == 6) {
+unsigned bv = xgetbv_low(0);
+
+if ((bv & 6) == 6) {
 have_avx1 = (c & bit_AVX) != 0;
 have_avx2 = (b7 & bit_AVX2) != 0;
 
@@ -4172,7 +4169,7 @@ static void tcg_target_init(TCGContext *s)
  * check that OPMASK and all extended ZMM state are enabled
  * even if we're not using them -- the insns will fault.
  */
-if ((xcrl & 0xe0) == 0xe0
+if ((bv & 0xe0) == 0xe0
 && (b7 & bit_AVX512F)
 && (b7 & bit_AVX512VL)) {
 have_avx512vl = true;
-- 
2.34.1




[PULL 73/84] target/hexagon/idef-parser: Use gen_constant for gen_extend_tcg_width_op

2023-03-05 Thread Richard Henderson
We already have a temporary, res, which we can use for the intermediate
shift result.  Simplify the constant to -1 instead of 0xf*f.
This was the last use of gen_tmp_value, so remove it.
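
As a rough sketch of what the emitted unsigned-extension sequence
computes (subfi, then shr of an all-ones constant, then and), here is the
same arithmetic in plain C for a 64-bit destination.  The function name
zero_extend64() is hypothetical, and the sketch assumes
1 <= src_width <= 64 so that the shift count stays in range.

```c
#include <stdint.h>

/*
 * Hypothetical model of the generated TCG ops for the UNSIGNED case:
 *   shift = dst_width - src_width      (tcg_gen_subfi)
 *   mask  = -1 >> shift                (tcg_gen_shr on the constant)
 *   res   = mask & value               (tcg_gen_and)
 * Assumes 1 <= src_width <= 64.
 */
static uint64_t zero_extend64(uint64_t value, unsigned src_width)
{
    unsigned shift = 64u - src_width;
    uint64_t mask = UINT64_MAX >> shift;   /* UINT64_MAX is -1 as unsigned */
    return mask & value;
}
```

Writing the constant as -1 works for both widths because an all-ones
value needs no width-specific spelling.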

Reviewed-by: Taylor Simpson 
Signed-off-by: Richard Henderson 
---
 target/hexagon/idef-parser/parser-helpers.c | 30 +++--
 1 file changed, 3 insertions(+), 27 deletions(-)

diff --git a/target/hexagon/idef-parser/parser-helpers.c b/target/hexagon/idef-parser/parser-helpers.c
index c0e6f2190c..e1a55412c8 100644
--- a/target/hexagon/idef-parser/parser-helpers.c
+++ b/target/hexagon/idef-parser/parser-helpers.c
@@ -305,26 +305,6 @@ HexValue gen_tmp(Context *c,
 return rvalue;
 }
 
-static HexValue gen_tmp_value(Context *c,
-   YYLTYPE *locp,
-   const char *value,
-   unsigned bit_width,
-   HexSignedness signedness)
-{
-HexValue rvalue;
-assert(bit_width == 32 || bit_width == 64);
-memset(, 0, sizeof(HexValue));
-rvalue.type = TEMP;
-rvalue.bit_width = bit_width;
-rvalue.signedness = signedness;
-rvalue.is_dotnew = false;
-rvalue.tmp.index = c->inst.tmp_count;
-OUT(c, locp, "TCGv_i", _width, " tmp_", >inst.tmp_count,
-" = tcg_const_i", _width, "(", value, ");\n");
-c->inst.tmp_count++;
-return rvalue;
-}
-
 static HexValue gen_constant_from_imm(Context *c,
   YYLTYPE *locp,
   HexValue *value)
@@ -1120,15 +1100,11 @@ static HexValue gen_extend_tcg_width_op(Context *c,
 OUT(c, locp, "tcg_gen_subfi_i", _width);
 OUT(c, locp, "(", , ", ", _width, ", ", _width_m, ");\n");
 if (signedness == UNSIGNED) {
-const char *mask_str = (dst_width == 32)
-? "0x"
-: "0x";
-HexValue mask = gen_tmp_value(c, locp, mask_str,
- dst_width, UNSIGNED);
+HexValue mask = gen_constant(c, locp, "-1", dst_width, UNSIGNED);
 OUT(c, locp, "tcg_gen_shr_i", _width, "(",
-, ", ", , ", ", , ");\n");
+, ", ", , ", ", , ");\n");
 OUT(c, locp, "tcg_gen_and_i", _width, "(",
-, ", ", value, ", ", , ");\n");
+, ", ", , ", ", value, ");\n");
 } else {
 OUT(c, locp, "tcg_gen_shl_i", _width, "(",
 , ", ", value, ", ", , ");\n");
-- 
2.34.1




[PULL 57/84] target/sparc: Remove egress label in disas_sparc_context

2023-03-05 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Acked-by: Mark Cave-Ayland 
Signed-off-by: Richard Henderson 
---
 target/sparc/translate.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index a20426202e..560fb32e28 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -5727,32 +5727,31 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
 dc->npc = dc->npc + 4;
 }
  jmp_insn:
-goto egress;
+return;
  illegal_insn:
 gen_exception(dc, TT_ILL_INSN);
-goto egress;
+return;
  unimp_flush:
 gen_exception(dc, TT_UNIMP_FLUSH);
-goto egress;
+return;
 #if !defined(CONFIG_USER_ONLY)
  priv_insn:
 gen_exception(dc, TT_PRIV_INSN);
-goto egress;
+return;
 #endif
  nfpu_insn:
 gen_op_fpexception_im(dc, FSR_FTT_UNIMPFPOP);
-goto egress;
+return;
 #if !defined(CONFIG_USER_ONLY) && !defined(TARGET_SPARC64)
  nfq_insn:
 gen_op_fpexception_im(dc, FSR_FTT_SEQ_ERROR);
-goto egress;
+return;
 #endif
 #ifndef TARGET_SPARC64
  ncp_insn:
 gen_exception(dc, TT_NCP_INSN);
-goto egress;
+return;
 #endif
- egress:
 }
 
 static void sparc_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
-- 
2.34.1




[PULL 80/84] target/xtensa: Tidy translate_clamps

2023-03-05 Thread Richard Henderson
Direct all writes to arg[0].out; use tcg_constant_i32 for the bounds.
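
For reference, the smax/smin pair in this function amounts to a signed
clamp.  A minimal sketch in plain C, assuming 0 <= imm <= 30 so the
shifts cannot overflow; clamps() here is a hypothetical stand-alone
function, not the translator code.

```c
#include <stdint.h>

/*
 * Hypothetical model of the generated ops:
 *   out = smax(lo, in);  out = smin(out, hi);
 * with lo = -(1 << imm) and hi = (1 << imm) - 1.
 * Assumes 0 <= imm <= 30.
 */
static int32_t clamps(int32_t in, unsigned imm)
{
    int32_t lo = -((int32_t)1 << imm);
    int32_t hi = ((int32_t)1 << imm) - 1;
    int32_t out = in > lo ? in : lo;   /* signed max against lower bound */
    return out < hi ? out : hi;        /* signed min against upper bound */
}
```

Because both bounds are only read, they can be constants; the
intermediate result is the only value that needs a writable destination.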

Reviewed-by: Max Filippov 
Signed-off-by: Richard Henderson 
---
 target/xtensa/translate.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index e3fcd50691..d727f9ffd8 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -1514,11 +1514,11 @@ static void translate_callxw(DisasContext *dc, const OpcodeArg arg[],
 static void translate_clamps(DisasContext *dc, const OpcodeArg arg[],
  const uint32_t par[])
 {
-TCGv_i32 tmp1 = tcg_const_i32(-1u << arg[2].imm);
-TCGv_i32 tmp2 = tcg_const_i32((1 << arg[2].imm) - 1);
+TCGv_i32 tmp1 = tcg_constant_i32(-1u << arg[2].imm);
+TCGv_i32 tmp2 = tcg_constant_i32((1 << arg[2].imm) - 1);
 
-tcg_gen_smax_i32(tmp1, tmp1, arg[1].in);
-tcg_gen_smin_i32(arg[0].out, tmp1, tmp2);
+tcg_gen_smax_i32(arg[0].out, tmp1, arg[1].in);
+tcg_gen_smin_i32(arg[0].out, arg[0].out, tmp2);
 }
 
 static void translate_clrb_expstate(DisasContext *dc, const OpcodeArg arg[],
-- 
2.34.1




[PULL 74/84] target/i386: Simplify POPF

2023-03-05 Thread Richard Henderson
Compute the eflags write mask separately, leaving one call
to the helper.  Use tcg_constant_i32.
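
The mask computation can be sketched as a stand-alone C function.  This
is an illustration under stated assumptions, not the translator code:
popf_write_mask() and its parameters are hypothetical, the flag values
are the architectural EFLAGS bit positions, and the 16-bit truncation
constant (cut off in the archived diff) is assumed to be 0xffff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFLAGS bit positions. */
#define TF_MASK   0x00000100u
#define IF_MASK   0x00000200u
#define IOPL_MASK 0x00003000u
#define NT_MASK   0x00004000u
#define AC_MASK   0x00040000u
#define ID_MASK   0x00200000u

/* Hypothetical helper: which EFLAGS bits POPF may write. */
static uint32_t popf_write_mask(int cpl, int iopl, bool op16)
{
    uint32_t mask = TF_MASK | AC_MASK | ID_MASK | NT_MASK;

    if (cpl == 0) {
        mask |= IF_MASK | IOPL_MASK;   /* ring 0 may change IF and IOPL */
    } else if (cpl <= iopl) {
        mask |= IF_MASK;               /* privileged enough for IF only */
    }
    if (op16) {
        mask &= 0xffffu;               /* assumed: keep the low 16 bits */
    }
    return mask;
}
```

Computing the mask first collapses the six original call sites into a
single helper call with one constant argument.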

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/i386/tcg/translate.c | 55 -
 1 file changed, 11 insertions(+), 44 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 2f3842663d..fa422ebd0b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -5226,52 +5226,19 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 case 0x9d: /* popf */
 gen_svm_check_intercept(s, SVM_EXIT_POPF);
 if (check_vm86_iopl(s)) {
-ot = gen_pop_T0(s);
+int mask = TF_MASK | AC_MASK | ID_MASK | NT_MASK;
+
 if (CPL(s) == 0) {
-if (dflag != MO_16) {
-gen_helper_write_eflags(cpu_env, s->T0,
-tcg_const_i32((TF_MASK | AC_MASK |
-   ID_MASK | NT_MASK |
-   IF_MASK |
-   IOPL_MASK)));
-} else {
-gen_helper_write_eflags(cpu_env, s->T0,
-tcg_const_i32((TF_MASK | AC_MASK |
-   ID_MASK | NT_MASK |
-   IF_MASK | IOPL_MASK)
-  & 0x));
-}
-} else {
-if (CPL(s) <= IOPL(s)) {
-if (dflag != MO_16) {
-gen_helper_write_eflags(cpu_env, s->T0,
-tcg_const_i32((TF_MASK |
-   AC_MASK |
-   ID_MASK |
-   NT_MASK |
-   IF_MASK)));
-} else {
-gen_helper_write_eflags(cpu_env, s->T0,
-tcg_const_i32((TF_MASK |
-   AC_MASK |
-   ID_MASK |
-   NT_MASK |
-   IF_MASK)
-  & 0x));
-}
-} else {
-if (dflag != MO_16) {
-gen_helper_write_eflags(cpu_env, s->T0,
-   tcg_const_i32((TF_MASK | AC_MASK |
-  ID_MASK | NT_MASK)));
-} else {
-gen_helper_write_eflags(cpu_env, s->T0,
-   tcg_const_i32((TF_MASK | AC_MASK |
-  ID_MASK | NT_MASK)
- & 0x));
-}
-}
+mask |= IF_MASK | IOPL_MASK;
+} else if (CPL(s) <= IOPL(s)) {
+mask |= IF_MASK;
 }
+if (dflag == MO_16) {
+mask &= 0x;
+}
+
+ot = gen_pop_T0(s);
+gen_helper_write_eflags(cpu_env, s->T0, tcg_constant_i32(mask));
 gen_pop_update(s, ot);
 set_cc_op(s, CC_OP_EFLAGS);
 /* abort translation because TF/AC flag may change */
-- 
2.34.1




[PULL 21/84] target/arm: Drop tcg_temp_free from translator-a64.c

2023-03-05 Thread Richard Henderson
Translators are no longer required to free tcg temporaries.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/tcg/translate-a64.c | 468 +
 1 file changed, 11 insertions(+), 457 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index e7fa6497cd..2c2ea45b47 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -542,7 +542,6 @@ static void write_fp_sreg(DisasContext *s, int reg, TCGv_i32 v)
 
 tcg_gen_extu_i32_i64(tmp, v);
 write_fp_dreg(s, reg, tmp);
-tcg_temp_free_i64(tmp);
 }
 
 /* Expand a 2-operand AdvSIMD vector operation using an expander function.  */
@@ -611,7 +610,6 @@ static void gen_gvec_op3_fpst(DisasContext *s, bool is_q, int rd, int rn,
vec_full_reg_offset(s, rn),
vec_full_reg_offset(s, rm), fpst,
is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
-tcg_temp_free_ptr(fpst);
 }
 
 /* Expand a 3-operand + qc + operation using an out-of-line helper.  */
@@ -625,7 +623,6 @@ static void gen_gvec_op3_qc(DisasContext *s, bool is_q, int rd, int rn,
vec_full_reg_offset(s, rn),
vec_full_reg_offset(s, rm), qc_ptr,
is_q ? 16 : 8, vec_full_reg_size(s), 0, fn);
-tcg_temp_free_ptr(qc_ptr);
 }
 
 /* Expand a 4-operand operation using an out-of-line helper.  */
@@ -653,7 +650,6 @@ static void gen_gvec_op4_fpst(DisasContext *s, bool is_q, int rd, int rn,
vec_full_reg_offset(s, rm),
vec_full_reg_offset(s, ra), fpst,
is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
-tcg_temp_free_ptr(fpst);
 }
 
 /* Set ZF and NF based on a 64 bit result. This is alas fiddlier
@@ -697,12 +693,9 @@ static void gen_add_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 tcg_gen_xor_i64(flag, result, t0);
 tcg_gen_xor_i64(tmp, t0, t1);
 tcg_gen_andc_i64(flag, flag, tmp);
-tcg_temp_free_i64(tmp);
 tcg_gen_extrh_i64_i32(cpu_VF, flag);
 
 tcg_gen_mov_i64(dest, result);
-tcg_temp_free_i64(result);
-tcg_temp_free_i64(flag);
 } else {
 /* 32 bit arithmetic */
 TCGv_i32 t0_32 = tcg_temp_new_i32();
@@ -718,10 +711,6 @@ static void gen_add_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 tcg_gen_xor_i32(tmp, t0_32, t1_32);
 tcg_gen_andc_i32(cpu_VF, cpu_VF, tmp);
 tcg_gen_extu_i32_i64(dest, cpu_NF);
-
-tcg_temp_free_i32(tmp);
-tcg_temp_free_i32(t0_32);
-tcg_temp_free_i32(t1_32);
 }
 }
 
@@ -745,11 +734,8 @@ static void gen_sub_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 tmp = tcg_temp_new_i64();
 tcg_gen_xor_i64(tmp, t0, t1);
 tcg_gen_and_i64(flag, flag, tmp);
-tcg_temp_free_i64(tmp);
 tcg_gen_extrh_i64_i32(cpu_VF, flag);
 tcg_gen_mov_i64(dest, result);
-tcg_temp_free_i64(flag);
-tcg_temp_free_i64(result);
 } else {
 /* 32 bit arithmetic */
 TCGv_i32 t0_32 = tcg_temp_new_i32();
@@ -764,10 +750,7 @@ static void gen_sub_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 tcg_gen_xor_i32(cpu_VF, cpu_NF, t0_32);
 tmp = tcg_temp_new_i32();
 tcg_gen_xor_i32(tmp, t0_32, t1_32);
-tcg_temp_free_i32(t0_32);
-tcg_temp_free_i32(t1_32);
 tcg_gen_and_i32(cpu_VF, cpu_VF, tmp);
-tcg_temp_free_i32(tmp);
 tcg_gen_extu_i32_i64(dest, cpu_NF);
 }
 }
@@ -779,7 +762,6 @@ static void gen_adc(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 tcg_gen_extu_i32_i64(flag, cpu_CF);
 tcg_gen_add_i64(dest, t0, t1);
 tcg_gen_add_i64(dest, dest, flag);
-tcg_temp_free_i64(flag);
 
 if (!sf) {
 tcg_gen_ext32u_i64(dest, dest);
@@ -808,11 +790,6 @@ static void gen_adc_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 tcg_gen_extrh_i64_i32(cpu_VF, vf_64);
 
 tcg_gen_mov_i64(dest, result);
-
-tcg_temp_free_i64(tmp);
-tcg_temp_free_i64(vf_64);
-tcg_temp_free_i64(cf_64);
-tcg_temp_free_i64(result);
 } else {
 TCGv_i32 t0_32 = tcg_temp_new_i32();
 TCGv_i32 t1_32 = tcg_temp_new_i32();
@@ -829,10 +806,6 @@ static void gen_adc_CC(int sf, TCGv_i64 dest, TCGv_i64 t0, TCGv_i64 t1)
 tcg_gen_xor_i32(tmp, t0_32, t1_32);
 tcg_gen_andc_i32(cpu_VF, cpu_VF, tmp);
 tcg_gen_extu_i32_i64(dest, cpu_NF);
-
-tcg_temp_free_i32(tmp);
-tcg_temp_free_i32(t1_32);
-tcg_temp_free_i32(t0_32);
 }
 }
 
@@ -942,12 +915,7 @@ static void do_fp_st(DisasContext *s, int srcidx, TCGv_i64 tcg_addr, int size)
 tcg_gen_addi_i64(tcg_hiaddr, tcg_addr, 8);
 tcg_gen_qemu_st_i64(be ? tmplo : tmphi, tcg_hiaddr,
 get_mem_index(s), mop);
-
-

[PULL 64/84] target/mips: Fix trans_mult_acc return

2023-03-05 Thread Richard Henderson
Success from trans_* subroutines should be true.

Fixes: 5fa38eedbd ("target/mips: Convert Vr54xx MACC* opcodes to decodetree")
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/mips/tcg/vr54xx_translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/mips/tcg/vr54xx_translate.c b/target/mips/tcg/vr54xx_translate.c
index 3e2c98f2c6..a7d241e4e7 100644
--- a/target/mips/tcg/vr54xx_translate.c
+++ b/target/mips/tcg/vr54xx_translate.c
@@ -53,7 +53,7 @@ static bool trans_mult_acc(DisasContext *ctx, arg_r *a,
 tcg_temp_free(t0);
 tcg_temp_free(t1);
 
-return false;
+return true;
 }
 
 TRANS(MACC, trans_mult_acc, gen_helper_macc);
-- 
2.34.1



