date:20210408

Re: [PATCH qemu v18] spapr: Implement Open Firmware client interface

2021-04-08 Thread Alexey Kardashevskiy





On 31/03/2021 13:53, Alexey Kardashevskiy wrote:

The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.

Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20 byte shim which simply forwards it to
a hypercall implemented in qemu. The boot time firmware component is
SLOF - but a build that's specific to qemu, and has always needed to be
updated in sync with it. Even though we've managed to limit the amount
of runtime communication we need between qemu and SLOF, there's some,
and it has become increasingly awkward to handle as we've implemented
new features.

This implements a boot time OF client interface (CI) which is
enabled by a new "x-vof" pseries machine option (stands for "Virtual Open
Firmware"). When enabled, QEMU implements the custom H_OF_CLIENT hcall
which implements the Open Firmware Client Interface (OF CI). This allows
using a smaller stateless firmware which does not have to manage
the device tree.

The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes RTAS blob.

This implements a handful of CI methods just to get -kernel/-initrd
working. In particular, this implements device tree fetching and a
simple memory allocator, "claim" (an OF CI memory allocator), and updates
"/memory@0/available" to report available memory to the client.

This implements changing some device tree properties which we know how
to deal with, the rest is ignored. To allow changes, this skips
fdt_pack() when x-vof=on as not packing the blob leaves some room for
appending.

In absence of SLOF, this assigns phandles to device tree nodes to make
device tree traversing work.

When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.

This adds basic instances support which are managed by a hash map
ihandle -> [phandle].

Before the guest starts, the memory in use is:
0..e60 - the initial firmware
8000..1 - stack
40.. - kernel
3ea.. - initramdisk

This OF CI does not implement "interpret".

Unlike SLOF, this does not format uninitialized nvram. Instead, this
includes a disk image with pre-formatted nvram.

With this basic support, this can only boot into a kernel directly.
However, this is just enough for the petitboot kernel and initramdisk to
boot from any possible source. Note this requires a reasonably recent guest
kernel with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735

The immediate benefit is a much faster boot time, which is especially
crucial with fully emulated early CPU bring up environments. Also this
may come in handy when/if GRUB-in-the-userspace sees the light of day.

This separates VOF and sPAPR in the hope that VOF bits may be reused by
other POWERPC boards which do not support pSeries.

This is coded on the assumption that later on we might be adding support for
booting from QEMU backends (blockdev is the first candidate) without
devices/drivers in between as OF1275 does not require that and
it is quite easy to do so.

Signed-off-by: Alexey Kardashevskiy 


[...]


diff --git a/hw/ppc/spapr_vof.c b/hw/ppc/spapr_vof.c
new file mode 100644
index ..9d22e230e3c0
--- /dev/null
+++ b/hw/ppc/spapr_vof.c


[...]


+
+void spapr_vof_client_dt_finalize(SpaprMachineState *spapr, void *fdt)
+{
+char *stdout_path = spapr_vio_stdout_path(spapr->vio_bus);
+
+vof_build_dt(fdt, spapr->vof);
+
+/*
+ * SLOF-less setup requires an open instance of stdout for early
+ * kernel printk. By now all phandles are settled so we can open
+ * the default serial console.
+ */
+if (stdout_path) {
+_FDT(vof_client_open_store(fdt, spapr->vof, "/chosen", "stdout",
+   stdout_path));
+}
+}
+
+void spapr_vof_reset(SpaprMachineState *spapr, void *fdt,
+ target_ulong *stack_ptr, Error **errp)
+{
+Vof *vof = spapr->vof;
+
+vof_init(vof, spapr->rma_size);
+
+if (vof_claim(vof, 0, spapr->fw_size, 0) == -1) {
+error_setg(errp, "Memory for firmware is in use");
+return;
+}
+
+*stack_ptr = vof_claim(vof, 0, OF_STACK_SIZE, OF_STACK_SIZE);
+if (*stack_ptr == -1) {
+error_setg(errp, "Memory allocation for stack failed");
+return;
+}
+/* Stack grows downwards plus reserve space for the minimum stack frame */
+*stack_ptr += OF_STACK_SIZE - 0x20;
+
+if (spapr->kernel_size &&
+vof_claim(vof, spapr->kernel_addr, spapr->kernel_size, 0) == -1) {
+error_setg(errp, "Memory for kernel is in use");
+return;
+}
+
+if (spapr->initrd_size &&
+vof_claim(vof, spapr->initrd_base, spapr->initrd_size, 0) == -1) {
+error_setg(errp, "Memory for initramdisk is in use");
+return;
+}
+
+spapr_vof_client_dt_finalize(spapr, fdt);
+
+/*
+ * We skip
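
For illustration, the initial stack pointer arithmetic in spapr_vof_reset()
above can be looked at in isolation. This is only a minimal sketch, not QEMU
code: initial_stack_ptr() is a hypothetical helper, and the sizes in the note
below are arbitrary values rather than the real OF_STACK_SIZE. Only the 0x20
reserve is taken from the patch.

```c
#include <stdint.h>

/* Reserve for the minimum stack frame, as in the patch (0x20 bytes). */
#define MIN_STACK_FRAME 0x20

/*
 * Hypothetical helper: given the base address and size of a claimed
 * stack region, return the initial stack pointer. The stack grows
 * downwards, so the pointer starts just below the top of the region.
 */
static uint64_t initial_stack_ptr(uint64_t base, uint64_t size)
{
    return base + size - MIN_STACK_FRAME;
}
```

With an assumed region of 0x8000 bytes claimed at 0x8000, the helper yields
0xffe0, i.e. the top of the region minus the reserved frame.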

Re: [PATCH] checkpatch: Fix use of uninitialized value

2021-04-08 Thread Greg Kurz

On Thu, 8 Apr 2021 10:49:13 -0700
Isaku Yamahata  wrote:

> 
> How about initializing them explicitly as follows?
> ($realfile ne '') prevents the case realfile eq '' && acpi_testexpected eq ''.
> Anyway your patch also should fix it. So
> Reviewed-by: Isaku Yamahata 
> 
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 8f7053ec9b..2eb894a628 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -1325,8 +1325,8 @@ sub process {
>   my %suppress_whiletrailers;
>   my %suppress_export;
>  
> -my $acpi_testexpected;
> -my $acpi_nontestexpected;
> +my $acpi_testexpected = '';
> +my $acpi_nontestexpected = '';
>  

Hmm... I haven't tried but I believe this will break when these are
passed to checkfilename() :

sub checkfilename {
my ($name, $acpi_testexpected, $acpi_nontestexpected) = @_;
[...]
if (defined $$acpi_testexpected and defined $$acpi_nontestexpected) {
ERROR("Do not add expected files together with tests, " .


>   # Pre-scan the patch sanitizing the lines.
>  
> 
> On Thu, Apr 08, 2021 at 08:51:19AM +0200,
> Greg Kurz  wrote:
> 
> > checkfilename() doesn't always set $acpi_testexpected. Fix the following
> > warning:
> > 
> > Use of uninitialized value $acpi_testexpected in string eq at
> >  ./scripts/checkpatch.pl line 1529.
> > 
> > Fixes: d2f1af0e4120 ("checkpatch: don't emit warning on newly created acpi 
> > data files")
> > Cc: isaku.yamah...@intel.com
> > Signed-off-by: Greg Kurz 
> > ---
> >  scripts/checkpatch.pl |1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > index 8f7053ec9b26..3d185cceac94 100755
> > --- a/scripts/checkpatch.pl
> > +++ b/scripts/checkpatch.pl
> > @@ -1532,6 +1532,7 @@ sub process {
> >  ($line =~ /\{\s*([\w\/\.\-]*)\s*\=\>\s*([\w\/\.\-]*)\s*\}/ 
> > &&
> >   (defined($1) || defined($2 &&
> >!(($realfile ne '') &&
> > +defined($acpi_testexpected) &&
> >  ($realfile eq $acpi_testexpected))) {
> > $reported_maintainer_file = 1;
> > WARN("added, moved or deleted file(s), does MAINTAINERS 
> > need updating?\n" . $herecurr);
> > 
> > 
> > 
>

Re: [PATCH v1 5/8] target/riscv: Implementation of enhanced PMP (ePMP)

2021-04-08 Thread Bin Meng

On Fri, Apr 2, 2021 at 8:50 PM Alistair Francis
 wrote:
>
> From: Hou Weiying 
>
> This commit adds support for ePMP v0.9.1.
>
> The ePMP spec can be found in:
> https://docs.google.com/document/d/1Mh_aiHYxemL0umN3GTTw8vsbmzHZ_nxZXgjgOUzbvc8
>
> Signed-off-by: Hongzheng-Li 
> Signed-off-by: Hou Weiying 
> Signed-off-by: Myriad-Dreamin 
> Message-Id: 
> 
> [ Changes by AF:
>  - Rebase on master
>  - Update to latest spec
>  - Use a switch case to handle ePMP MML permissions
>  - Fix a few bugs
> ]
> Signed-off-by: Alistair Francis 
> ---
>  target/riscv/pmp.c | 165 +
>  1 file changed, 153 insertions(+), 12 deletions(-)
>
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index 1d071b044b..3794c808e8 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -90,11 +90,42 @@ static inline uint8_t pmp_read_cfg(CPURISCVState *env, 
> uint32_t pmp_index)
>  static void pmp_write_cfg(CPURISCVState *env, uint32_t pmp_index, uint8_t 
> val)
>  {
>  if (pmp_index < MAX_RISCV_PMPS) {
> -if (!pmp_is_locked(env, pmp_index)) {
> -env->pmp_state.pmp[pmp_index].cfg_reg = val;
> -pmp_update_rule(env, pmp_index);
> +bool locked = true;
> +
> +if (riscv_feature(env, RISCV_FEATURE_EPMP)) {
> +/* mseccfg.RLB is set */
> +if (MSECCFG_RLB_ISSET(env)) {
> +locked = false;
> +}
> +
> +/* mseccfg.MML is not set */
> +if (!MSECCFG_MML_ISSET(env) && !pmp_is_locked(env, pmp_index)) {
> +locked = false;
> +}
> +
> +/* mseccfg.MML is set */
> +if (MSECCFG_MML_ISSET(env)) {
> +/* not adding execute bit */
> +if ((val & PMP_LOCK) != 0 && (val & PMP_EXEC) != PMP_EXEC) {
> +locked = false;
> +}
> + /* shared region and not adding X bit*/

nits: /* is not aligned, and a space is needed before */

> +if ((val & PMP_LOCK) != PMP_LOCK &&
> +(val & 0x7) != (PMP_WRITE | PMP_EXEC)) {
> +locked = false;
> +}
> +}
>  } else {
> +if (!pmp_is_locked(env, pmp_index)) {
> +locked = false;
> +}
> +}
> +
> +if (locked) {
>  qemu_log_mask(LOG_GUEST_ERROR, "ignoring pmpcfg write - 
> locked\n");
> +} else {
> +env->pmp_state.pmp[pmp_index].cfg_reg = val;
> +pmp_update_rule(env, pmp_index);
>  }
>  } else {
>  qemu_log_mask(LOG_GUEST_ERROR,
> @@ -217,6 +248,33 @@ static bool pmp_hart_has_privs_default(CPURISCVState 
> *env, target_ulong addr,
>  {
>  bool ret;
>
> +if (riscv_feature(env, RISCV_FEATURE_EPMP)) {
> +if (MSECCFG_MMWP_ISSET(env)) {
> +/*
> + * The Machine Mode Whitelist Policy (mseccfg.MMWP) is set
> + * so we default to deny all, even for M mode.

nits: M-mode

> + */
> +*allowed_privs = 0;
> +return false;
> +} else if (MSECCFG_MML_ISSET(env)) {
> +/*
> + * The Machine Mode Lockdown (mseccfg.MML) bit is set
> + * so we can only execute code in M mode with an applicable

nits: M-mode

> + * rule.
> + * Other modes are disabled.

nits: this line can be put in the same line of "rule."

> + */
> +if (mode == PRV_M && !(privs & PMP_EXEC)) {
> +ret = true;
> +*allowed_privs = PMP_READ | PMP_WRITE;
> +} else {
> +ret = false;
> +*allowed_privs = 0;
> +}
> +
> +return ret;
> +}

If I understand the spec correctly, I think we are missing a branch to
handle MML unset case, in which RWX is allowed in M-mode.

> +}
> +
>  if ((!riscv_feature(env, RISCV_FEATURE_PMP)) || (mode == PRV_M)) {
>  /*
>   * Privileged spec v1.10 states if HW doesn't implement any PMP entry
> @@ -294,13 +352,94 @@ bool pmp_hart_has_privs(CPURISCVState *env, 
> target_ulong addr,
>  pmp_get_a_field(env->pmp_state.pmp[i].cfg_reg);
>
>  /*
> - * If the PMP entry is not off and the address is in range, do the 
> priv
> - * check
> + * Convert the PMP permissions to match the truth table in the
> + * ePMP spec.
>   */
> +const uint8_t epmp_operation =
> +((env->pmp_state.pmp[i].cfg_reg & PMP_LOCK) >> 4) |
> +((env->pmp_state.pmp[i].cfg_reg & PMP_READ) << 2) |
> +(env->pmp_state.pmp[i].cfg_reg & PMP_WRITE) |
> +((env->pmp_state.pmp[i].cfg_reg & PMP_EXEC) >> 2);
> +
>  if (((s + e) == 2) && (PMP_AMATCH_OFF != a_field)) {
> -*allowed_privs = PMP_READ | PMP_WRITE | PMP_EXEC;
> -if ((mode != PRV_M) ||

RE: [PATCH v4 09/10] Add the function of colo_bitmap_clear_diry.

2021-04-08 Thread Rao, Lei

The performance data has been added to the commit message in V6.

Thanks,
Lei.

-Original Message-
From: Dr. David Alan Gilbert  
Sent: Monday, March 29, 2021 7:32 PM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; quint...@redhat.com; pbonz...@redhat.com; 
lukasstra...@web.de; qemu-devel@nongnu.org
Subject: Re: [PATCH v4 09/10] Add the function of colo_bitmap_clear_diry.

* Rao, Lei (lei@intel.com) wrote:
> 
> -Original Message-
> From: Dr. David Alan Gilbert 
> Sent: Friday, March 26, 2021 2:08 AM
> To: Rao, Lei 
> Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
> jasow...@redhat.com; quint...@redhat.com; pbonz...@redhat.com; 
> lukasstra...@web.de; qemu-devel@nongnu.org
> Subject: Re: [PATCH v4 09/10] Add the function of colo_bitmap_clear_diry.
> 
> * leirao (lei@intel.com) wrote:
> > From: "Rao, Lei" 
> > 
> > When we use continuous dirty memory copy for flushing ram cache on 
> > secondary VM, we can also clean up the bitmap of contiguous dirty 
> > page memory. This also can reduce the VM stop time during checkpoint.
> > 
> > Signed-off-by: Lei Rao 
> > ---
> >  migration/ram.c | 29 +
> >  1 file changed, 25 insertions(+), 4 deletions(-)
> > 
> > diff --git a/migration/ram.c b/migration/ram.c index 
> > a258466..ae1e659
> > 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -855,6 +855,30 @@ unsigned long colo_bitmap_find_dirty(RAMState *rs, 
> > RAMBlock *rb,
> >  return first;
> >  }
> >  
> > +/**
> > + * colo_bitmap_clear_dirty:when we flush ram cache to ram, we will 
> > +use
> > + * continuous memory copy, so we can also clean up the bitmap of 
> > +contiguous
> > + * dirty memory.
> > + */
> > +static inline bool colo_bitmap_clear_dirty(RAMState *rs,
> > +   RAMBlock *rb,
> > +   unsigned long start,
> > +   unsigned long num) {
> > +bool ret;
> > +unsigned long i = 0;
> > +
> > > +qemu_mutex_lock(&rs->bitmap_mutex);
> 
> Please use QEMU_LOCK_GUARD(&rs->bitmap_mutex);
> 
> Will be changed in V5. Thanks.
> 
> > +for (i = 0; i < num; i++) {
> > +ret = test_and_clear_bit(start + i, rb->bmap);
> > +if (ret) {
> > +rs->migration_dirty_pages--;
> > +}
> > +}
> > > +qemu_mutex_unlock(&rs->bitmap_mutex);
> > +return ret;
> 
> This implementation is missing the clear_bmap code that 
> migration_bitmap_clear_dirty has.
> I think that's necessary now.
> 
> Are we sure there's any benefit in this?
> 
> Dave
> 
> There is such a note about clear_bmap in struct RAMBlock:
> "On destination side, this should always be NULL, and the variable 
> `clear_bmap_shift' is meaningless."
> This means that clear_bmap is always NULL on the secondary VM. And flushing
> the ram cache to ram only ever happens on the secondary VM.
> So, I think the clear_bmap code is unnecessary for COLO.

Ah yes; can you add a comment there to note this is on the secondary to make 
that clear?

> As for the benefit: when the number of dirty pages flushed from the ram cache
> to ram is large, it reduces the number of lock acquisitions.

It might be good to measure the benefit.

Dave

> Lei
> 
> > +}
> > +
> >  static inline bool migration_bitmap_clear_dirty(RAMState *rs,
> >  RAMBlock *rb,
> >  unsigned long page) 
> > @@ -3700,7 +3724,6 @@ void colo_flush_ram_cache(void)
> >  void *src_host;
> >  unsigned long offset = 0;
> >  unsigned long num = 0;
> > -unsigned long i = 0;
> >  
> >  memory_global_dirty_log_sync();
> >  WITH_RCU_READ_LOCK_GUARD() {
> > @@ -3722,9 +3745,7 @@ void colo_flush_ram_cache(void)
> >  num = 0;
> >  block = QLIST_NEXT_RCU(block, next);
> >  } else {
> > -for (i = 0; i < num; i++) {
> > -migration_bitmap_clear_dirty(ram_state, block, offset 
> > + i);
> > -}
> > +colo_bitmap_clear_dirty(ram_state, block, offset, 
> > + num);
> >  dst_host = block->host
> >   + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
> >  src_host = block->colo_cache
> > --
> > 1.8.3.1
> > 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[PATCH v6 10/10] Fixed calculation error of pkt->header_size in fill_pkt_tcp_info()

2021-04-08 Thread leirao

From: "Rao, Lei" 

The data pointer has already skipped vnet_hdr_len in
parse_packet_early(), so we must not subtract vnet_hdr_len again
when calculating pkt->header_size in fill_pkt_tcp_info(). Otherwise,
it will cause network packet comparison errors and greatly increase
the frequency of checkpoints.

Signed-off-by: Lei Rao 
Signed-off-by: Zhang Chen 
Reviewed-by: Li Zhijian 
Reviewed-by: Zhang Chen 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 net/colo-compare.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 5b538f4..b100e7b 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -211,7 +211,7 @@ static void fill_pkt_tcp_info(void *data, uint32_t *max_ack)
 pkt->tcp_ack = ntohl(tcphd->th_ack);
 *max_ack = *max_ack > pkt->tcp_ack ? *max_ack : pkt->tcp_ack;
 pkt->header_size = pkt->transport_header - (uint8_t *)pkt->data
-   + (tcphd->th_off << 2) - pkt->vnet_hdr_len;
+   + (tcphd->th_off << 2);
 pkt->payload_size = pkt->size - pkt->header_size;
 pkt->seq_end = pkt->tcp_seq + pkt->payload_size;
 pkt->flags = tcphd->th_flags;
-- 
1.8.3.1
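
For illustration, the arithmetic behind the fix above can be checked with
plain offsets. This is a simplified sketch under assumptions, not the
colo-compare code: struct pkt_layout and its helpers are hypothetical, and the
transport header offset is taken to be measured from the start of the buffer,
so that it already contains the vnet header length.

```c
#include <stddef.h>

/* Hypothetical, simplified packet layout: byte counts only, no parsing. */
struct pkt_layout {
    size_t size;         /* total bytes in the buffer */
    size_t vnet_hdr_len; /* virtio-net header at the start of the buffer */
    size_t l2_l3_len;    /* Ethernet + IP header bytes */
    size_t tcp_hdr_len;  /* TCP header bytes (th_off << 2 in the patch) */
};

/* Offset of the TCP header, measured from the start of the buffer. */
static size_t transport_offset(const struct pkt_layout *p)
{
    return p->vnet_hdr_len + p->l2_l3_len;
}

/* Everything before the payload; vnet_hdr_len is counted exactly once. */
static size_t header_size(const struct pkt_layout *p)
{
    return transport_offset(p) + p->tcp_hdr_len;
}

static size_t payload_size(const struct pkt_layout *p)
{
    return p->size - header_size(p);
}
```

With assumed values of a 12-byte vnet header, 34 bytes of Ethernet + IP
headers, a 20-byte TCP header and 100 bytes of payload (166 bytes in total),
header_size() is 66 and payload_size() is 100. Subtracting the vnet header
length a second time would give 54 and 112, so 12 bytes of header would be
compared as if they were payload.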

[PATCH v6 05/10] Add a function named packet_new_nocopy for COLO.

2021-04-08 Thread leirao

From: "Rao, Lei" 

Use the packet_new_nocopy instead of packet_new in the
filter-rewriter module. There will be one less memory
copy in the processing of each network packet.

Signed-off-by: Lei Rao 
---
 net/colo.c| 25 +
 net/colo.h|  1 +
 net/filter-rewriter.c |  3 +--
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/net/colo.c b/net/colo.c
index ef00609..3a3e6e8 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -157,19 +157,28 @@ void connection_destroy(void *opaque)
 
 Packet *packet_new(const void *data, int size, int vnet_hdr_len)
 {
-Packet *pkt = g_slice_new(Packet);
+Packet *pkt = g_slice_new0(Packet);
 
 pkt->data = g_memdup(data, size);
 pkt->size = size;
 pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST);
 pkt->vnet_hdr_len = vnet_hdr_len;
-pkt->tcp_seq = 0;
-pkt->tcp_ack = 0;
-pkt->seq_end = 0;
-pkt->header_size = 0;
-pkt->payload_size = 0;
-pkt->offset = 0;
-pkt->flags = 0;
+
+return pkt;
+}
+
+/*
+ * packet_new_nocopy will not copy data, so the caller can't release
+ * the data. And it will be released in packet_destroy.
+ */
+Packet *packet_new_nocopy(void *data, int size, int vnet_hdr_len)
+{
+Packet *pkt = g_slice_new0(Packet);
+
+pkt->data = data;
+pkt->size = size;
+pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+pkt->vnet_hdr_len = vnet_hdr_len;
 
 return pkt;
 }
diff --git a/net/colo.h b/net/colo.h
index 573ab91..d91cd24 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -101,6 +101,7 @@ bool connection_has_tracked(GHashTable 
*connection_track_table,
 ConnectionKey *key);
 void connection_hashtable_reset(GHashTable *connection_track_table);
 Packet *packet_new(const void *data, int size, int vnet_hdr_len);
+Packet *packet_new_nocopy(void *data, int size, int vnet_hdr_len);
 void packet_destroy(void *opaque, void *user_data);
 void packet_destroy_partial(void *opaque, void *user_data);
 
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index 10fe393..cb3a96c 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -270,8 +270,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
 vnet_hdr_len = nf->netdev->vnet_hdr_len;
 }
 
-pkt = packet_new(buf, size, vnet_hdr_len);
-g_free(buf);
+pkt = packet_new_nocopy(buf, size, vnet_hdr_len);
 
 /*
  * if we get tcp packet
-- 
1.8.3.1
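
For illustration, the ownership difference between the copying and the
non-copying constructor in the patch above can be sketched with the C standard
library alone. This is an assumption-laden sketch, not the QEMU code: the
toy_packet type and the toy_* functions are hypothetical, malloc/free stand in
for the GLib allocators, and size is assumed to be greater than zero.

```c
#include <stdlib.h>
#include <string.h>

struct toy_packet {
    void *data;
    int size;
};

/* Copying constructor: the caller keeps ownership of 'data'. */
static struct toy_packet *toy_packet_new(const void *data, int size)
{
    struct toy_packet *pkt = calloc(1, sizeof(*pkt));

    if (!pkt) {
        return NULL;
    }
    pkt->data = malloc((size_t)size);
    if (!pkt->data) {
        free(pkt);
        return NULL;
    }
    memcpy(pkt->data, data, (size_t)size);
    pkt->size = size;
    return pkt;
}

/*
 * Non-copying constructor: takes ownership of a heap buffer. The caller
 * must not free 'data' afterwards; toy_packet_destroy() releases it.
 */
static struct toy_packet *toy_packet_new_nocopy(void *data, int size)
{
    struct toy_packet *pkt = calloc(1, sizeof(*pkt));

    if (!pkt) {
        return NULL;
    }
    pkt->data = data;
    pkt->size = size;
    return pkt;
}

/* Frees the packet together with the buffer it owns. */
static void toy_packet_destroy(struct toy_packet *pkt)
{
    if (pkt) {
        free(pkt->data);
        free(pkt);
    }
}
```

The design trade-off is one memory copy saved per packet in exchange for a
stricter contract: after the non-copying call the buffer belongs to the packet,
which is why the patch drops the g_free(buf) in the caller.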

[PATCH v6 07/10] Reset the auto-converge counter at every checkpoint.

2021-04-08 Thread leirao

From: "Rao, Lei" 

If we don't reset the auto-converge counter,
it will continue to run while COLO is running,
and eventually the system will hang due to the
CPU throttle reaching DEFAULT_MIGRATE_MAX_CPU_THROTTLE.

Signed-off-by: Lei Rao 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 migration/colo.c | 4 
 migration/ram.c  | 9 +
 migration/ram.h  | 1 +
 3 files changed, 14 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 1aaf316..723ffb8 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -459,6 +459,10 @@ static int colo_do_checkpoint_transaction(MigrationState 
*s,
 if (ret < 0) {
 goto out;
 }
+
+if (migrate_auto_converge()) {
+mig_throttle_counter_reset();
+}
 /*
  * Only save VM's live state, which not including device state.
  * TODO: We may need a timeout mechanism to prevent COLO process
diff --git a/migration/ram.c b/migration/ram.c
index 4682f36..f9d60f0 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -652,6 +652,15 @@ static void mig_throttle_guest_down(uint64_t 
bytes_dirty_period,
 }
 }
 
+void mig_throttle_counter_reset(void)
+{
+RAMState *rs = ram_state;
+
+rs->time_last_bitmap_sync = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+rs->num_dirty_pages_period = 0;
+rs->bytes_xfer_prev = ram_counters.transferred;
+}
+
 /**
  * xbzrle_cache_zero_page: insert a zero page in the XBZRLE cache
  *
diff --git a/migration/ram.h b/migration/ram.h
index 4833e9f..cb6f58a 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -50,6 +50,7 @@ bool ramblock_is_ignored(RAMBlock *block);
 int xbzrle_cache_resize(uint64_t new_size, Error **errp);
 uint64_t ram_bytes_remaining(void);
 uint64_t ram_bytes_total(void);
+void mig_throttle_counter_reset(void);
 
 uint64_t ram_pagesize_summary(void);
 int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
-- 
1.8.3.1

[PATCH v6 02/10] Fix the qemu crash when guest shutdown during checkpoint

2021-04-08 Thread leirao

From: "Rao, Lei" 

This patch fixes the following:
qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown'
Aborted (core dumped)

Signed-off-by: Lei Rao 
Reviewed-by: Li Zhijian 
Reviewed-by: Zhang Chen 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 softmmu/runstate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index ce8977c..1564057 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -126,6 +126,7 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
 
 { RUN_STATE_COLO, RUN_STATE_RUNNING },
+{ RUN_STATE_COLO, RUN_STATE_SHUTDOWN},
 
 { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
 { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
-- 
1.8.3.1
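
For illustration, a table of allowed state transitions like the one patched
above can be modelled as an array of (from, to) pairs plus a lookup. This is a
minimal sketch with hypothetical toy_* names and only three states; QEMU's
real table is much larger, and, as the error message in the commit shows, QEMU
aborts when a transition is missing from it.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_state { TOY_RUNNING, TOY_COLO, TOY_SHUTDOWN };

struct toy_transition {
    enum toy_state from;
    enum toy_state to;
};

/* Every transition not listed here is considered invalid. */
static const struct toy_transition toy_transitions[] = {
    { TOY_RUNNING, TOY_COLO },
    { TOY_RUNNING, TOY_SHUTDOWN },
    { TOY_COLO, TOY_RUNNING },
    { TOY_COLO, TOY_SHUTDOWN }, /* the kind of entry the patch adds */
};

static bool toy_transition_allowed(enum toy_state from, enum toy_state to)
{
    size_t n = sizeof(toy_transitions) / sizeof(toy_transitions[0]);

    for (size_t i = 0; i < n; i++) {
        if (toy_transitions[i].from == from && toy_transitions[i].to == to) {
            return true;
        }
    }
    return false;
}
```

In this model, removing the { TOY_COLO, TOY_SHUTDOWN } row makes a shutdown
requested during a checkpoint an invalid transition, which corresponds to the
crash described in the commit message.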

[PATCH v6 08/10] Reduce the PVM stop time during Checkpoint

2021-04-08 Thread leirao

From: "Rao, Lei" 

When flushing memory from the ram cache to ram during every checkpoint
on the secondary VM, we can copy contiguous chunks of memory instead of
4096 bytes at a time to reduce the VM stop time during checkpoint.

Signed-off-by: Lei Rao 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 migration/ram.c | 48 +---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index f9d60f0..8661d82 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -822,6 +822,41 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, 
RAMBlock *rb,
 return next;
 }
 
+/*
+ * colo_bitmap_find_dirty: find contiguous dirty pages from start
+ *
+ * Returns the page offset within memory region of the start of the contiguous
+ * dirty page
+ *
+ * @rs: current RAM state
+ * @rb: RAMBlock where to search for dirty pages
+ * @start: page where we start the search
+ * @num: the number of contiguous dirty pages
+ */
+static inline
+unsigned long colo_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
+ unsigned long start, unsigned long *num)
+{
+unsigned long size = rb->used_length >> TARGET_PAGE_BITS;
+unsigned long *bitmap = rb->bmap;
+unsigned long first, next;
+
+*num = 0;
+
+if (ramblock_is_ignored(rb)) {
+return size;
+}
+
+first = find_next_bit(bitmap, size, start);
+if (first >= size) {
+return first;
+}
+next = find_next_zero_bit(bitmap, size, first + 1);
+assert(next >= first);
+*num = next - first;
+return first;
+}
+
 static inline bool migration_bitmap_clear_dirty(RAMState *rs,
 RAMBlock *rb,
 unsigned long page)
@@ -3730,19 +3765,26 @@ void colo_flush_ram_cache(void)
 block = QLIST_FIRST_RCU(&ram_list.blocks);
 
 while (block) {
-offset = migration_bitmap_find_dirty(ram_state, block, offset);
+unsigned long num = 0;
 
+offset = colo_bitmap_find_dirty(ram_state, block, offset, &num);
 if (((ram_addr_t)offset) << TARGET_PAGE_BITS
 >= block->used_length) {
 offset = 0;
+num = 0;
 block = QLIST_NEXT_RCU(block, next);
 } else {
-migration_bitmap_clear_dirty(ram_state, block, offset);
+unsigned long i = 0;
+
+for (i = 0; i < num; i++) {
+migration_bitmap_clear_dirty(ram_state, block, offset + i);
+}
 dst_host = block->host
  + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
 src_host = block->colo_cache
  + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
-memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+memcpy(dst_host, src_host, TARGET_PAGE_SIZE * num);
+offset += num;
 }
 }
 }
-- 
1.8.3.1
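
For illustration, the "find a contiguous dirty run" step in the patch above
can be sketched without QEMU's bitmap helpers. This is a simplified sketch
under assumptions: toy_test_bit() and toy_find_dirty_run() are hypothetical
names, and simple loops stand in for find_next_bit() and
find_next_zero_bit().

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if bit 'nr' is set in the bitmap, 0 otherwise. */
static int toy_test_bit(const unsigned long *map, size_t nr)
{
    return (int)((map[nr / TOY_BITS_PER_LONG] >> (nr % TOY_BITS_PER_LONG)) & 1UL);
}

/*
 * Find the first run of set bits at or after 'start' in a bitmap of
 * 'size' bits. Returns the index of the first set bit and stores the
 * run length in *num. If no set bit is found, returns 'size' and
 * stores 0 in *num.
 */
static size_t toy_find_dirty_run(const unsigned long *map, size_t size,
                                 size_t start, size_t *num)
{
    size_t first = start;
    size_t next;

    *num = 0;
    while (first < size && !toy_test_bit(map, first)) {
        first++;
    }
    if (first >= size) {
        return size;
    }
    next = first + 1;
    while (next < size && toy_test_bit(map, next)) {
        next++;
    }
    *num = next - first;
    return first;
}
```

A caller in the style of the patch would copy num pages in one memcpy()
starting at the returned page and then advance its offset by num, instead of
copying one page per loop iteration.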

[PATCH v6 06/10] Add the function of colo_compare_cleanup

2021-04-08 Thread leirao

From: "Rao, Lei" 

This patch fixes the following:
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x7f6ae4559859 in __GI_abort () at abort.c:79
#2  0x559aaa386720 in error_exit (err=16, msg=0x559aaa5973d0 
<__func__.16227> "qemu_mutex_destroy") at util/qemu-thread-posix.c:36
#3  0x559aaa3868c5 in qemu_mutex_destroy (mutex=0x559aabffe828) at 
util/qemu-thread-posix.c:69
#4  0x559aaa2f93a8 in char_finalize (obj=0x559aabffe800) at 
chardev/char.c:285
#5  0x559aaa23318a in object_deinit (obj=0x559aabffe800, 
type=0x559aabfd7d20) at qom/object.c:606
#6  0x559aaa2331b8 in object_deinit (obj=0x559aabffe800, 
type=0x559aabfd9060) at qom/object.c:610
#7  0x559aaa233200 in object_finalize (data=0x559aabffe800) at 
qom/object.c:620
#8  0x559aaa234202 in object_unref (obj=0x559aabffe800) at 
qom/object.c:1074
#9  0x559aaa2356b6 in object_finalize_child_property 
(obj=0x559aac0dac10, name=0x559aac778760 "compare0-0", opaque=0x559aabffe800) 
at qom/object.c:1584
#10 0x559aaa232f70 in object_property_del_all (obj=0x559aac0dac10) at 
qom/object.c:557
#11 0x559aaa2331ed in object_finalize (data=0x559aac0dac10) at 
qom/object.c:619
#12 0x559aaa234202 in object_unref (obj=0x559aac0dac10) at 
qom/object.c:1074
#13 0x559aaa2356b6 in object_finalize_child_property 
(obj=0x559aac0c75c0, name=0x559aac0dadc0 "chardevs", opaque=0x559aac0dac10) at 
qom/object.c:1584
#14 0x559aaa233071 in object_property_del_child (obj=0x559aac0c75c0, 
child=0x559aac0dac10, errp=0x0) at qom/object.c:580
#15 0x559aaa233155 in object_unparent (obj=0x559aac0dac10) at 
qom/object.c:599
#16 0x559aaa2fb721 in qemu_chr_cleanup () at chardev/char.c:1159
#17 0x559aa9f9b110 in main (argc=54, argv=0x7ffeb62fa998, 
envp=0x7ffeb62fab50) at vl.c:4539

When chardev is cleaned up, chr_write_lock needs to be destroyed. But
the colo-compare module is not cleaned up normally before that when the
guest powers off. It is still holding chr_write_lock at this time. This will
cause qemu to crash. So we add the function colo_compare_cleanup() before
qemu_chr_cleanup() to fix the bug.

Signed-off-by: Lei Rao 
Reviewed-by: Zhang Chen 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 net/colo-compare.c | 10 ++
 net/colo-compare.h |  1 +
 net/net.c  |  4 
 3 files changed, 15 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index c142c08..5b538f4 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -1402,6 +1402,16 @@ static void colo_compare_init(Object *obj)
  compare_set_vnet_hdr);
 }
 
+void colo_compare_cleanup(void)
+{
+CompareState *tmp = NULL;
+CompareState *n = NULL;
+
+QTAILQ_FOREACH_SAFE(tmp, &net_compares, next, n) {
+object_unparent(OBJECT(tmp));
+}
+}
+
 static void colo_compare_finalize(Object *obj)
 {
 CompareState *s = COLO_COMPARE(obj);
diff --git a/net/colo-compare.h b/net/colo-compare.h
index 22ddd51..b055270 100644
--- a/net/colo-compare.h
+++ b/net/colo-compare.h
@@ -20,5 +20,6 @@
 void colo_notify_compares_event(void *opaque, int event, Error **errp);
 void colo_compare_register_notifier(Notifier *notify);
 void colo_compare_unregister_notifier(Notifier *notify);
+void colo_compare_cleanup(void);
 
 #endif /* QEMU_COLO_COMPARE_H */
diff --git a/net/net.c b/net/net.c
index 725a4e1..8fcb2e7 100644
--- a/net/net.c
+++ b/net/net.c
@@ -53,6 +53,7 @@
 #include "sysemu/sysemu.h"
 #include "sysemu/runstate.h"
 #include "sysemu/sysemu.h"
+#include "net/colo-compare.h"
 #include "net/filter.h"
 #include "qapi/string-output-visitor.h"
 #include "qapi/hmp-output-visitor.h"
@@ -1463,6 +1464,9 @@ void net_cleanup(void)
 {
 NetClientState *nc;
 
+/*cleanup colo compare module for COLO*/
+colo_compare_cleanup();
+
 /* We may del multiple entries during qemu_del_net_client(),
  * so QTAILQ_FOREACH_SAFE() is also not safe here.
  */
-- 
1.8.3.1

[PATCH v6 01/10] Remove some duplicate trace code.

2021-04-08 Thread leirao

From: "Rao, Lei" 

There is the same trace code in the colo_compare_packet_payload.

Signed-off-by: Lei Rao 
Reviewed-by: Li Zhijian 
Reviewed-by: Zhang Chen 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 net/colo-compare.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 9d1ad99..c142c08 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -590,19 +590,6 @@ static int colo_packet_compare_other(Packet *spkt, Packet 
*ppkt)
 uint16_t offset = ppkt->vnet_hdr_len;
 
 trace_colo_compare_main("compare other");
-if (trace_event_get_state_backends(TRACE_COLO_COMPARE_IP_INFO)) {
-char pri_ip_src[20], pri_ip_dst[20], sec_ip_src[20], sec_ip_dst[20];
-
-strcpy(pri_ip_src, inet_ntoa(ppkt->ip->ip_src));
-strcpy(pri_ip_dst, inet_ntoa(ppkt->ip->ip_dst));
-strcpy(sec_ip_src, inet_ntoa(spkt->ip->ip_src));
-strcpy(sec_ip_dst, inet_ntoa(spkt->ip->ip_dst));
-
-trace_colo_compare_ip_info(ppkt->size, pri_ip_src,
-   pri_ip_dst, spkt->size,
-   sec_ip_src, sec_ip_dst);
-}
-
 if (ppkt->size != spkt->size) {
 trace_colo_compare_main("Other: payload size of packets are 
different");
 return -1;
-- 
1.8.3.1

[PATCH v6 09/10] Add the function of colo_bitmap_clear_dirty

2021-04-08 Thread leirao

From: "Rao, Lei" 

When we use continuous dirty memory copy for flushing ram cache on
secondary VM, we can also clean up the bitmap of contiguous dirty
page memory. This also can reduce the VM stop time during checkpoint.

The performance test results for COLO are as follows:

Server configuration:
CPU :Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
MEM :251G(type:DDR4 Speed:2666 MT/s)
SSD :Intel 730 and DC S35x0/3610/3700 Series SSDs

dirty pages:3189376  migration_bitmap_clear_dirty time consuming(ns):105194000
dirty pages:3189784  migration_bitmap_clear_dirty time consuming(ns):105297000
dirty pages:3190501  migration_bitmap_clear_dirty time consuming(ns):10541
dirty pages:3188734  migration_bitmap_clear_dirty time consuming(ns):105138000
dirty pages:3189464  migration_bitmap_clear_dirty time consuming(ns):111736000
dirty pages:3188558  migration_bitmap_clear_dirty time consuming(ns):105079000
dirty pages:3239489  migration_bitmap_clear_dirty time consuming(ns):106761000

dirty pages:3190240  colo_bitmap_clear_dirty time consuming(ns):8369000
dirty pages:3189293  colo_bitmap_clear_dirty time consuming(ns):8388000
dirty pages:3189171  colo_bitmap_clear_dirty time consuming(ns):8641000
dirty pages:3189099  colo_bitmap_clear_dirty time consuming(ns):828
dirty pages:3189974  colo_bitmap_clear_dirty time consuming(ns):8352000
dirty pages:3189471  colo_bitmap_clear_dirty time consuming(ns):8348000
dirty pages:3189681  colo_bitmap_clear_dirty time consuming(ns):8426000

It can be seen from the data that colo_bitmap_clear_dirty is more
efficient.

Signed-off-by: Lei Rao 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 migration/ram.c | 36 +++-
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 8661d82..11275cd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -857,6 +857,36 @@ unsigned long colo_bitmap_find_dirty(RAMState *rs, 
RAMBlock *rb,
 return first;
 }
 
+/**
+ * colo_bitmap_clear_dirty:when we flush ram cache to ram, we will use
+ * continuous memory copy, so we can also clean up the bitmap of contiguous
+ * dirty memory.
+ */
+static inline bool colo_bitmap_clear_dirty(RAMState *rs,
+   RAMBlock *rb,
+   unsigned long start,
+   unsigned long num)
+{
+bool ret;
+unsigned long i = 0;
+
+/*
+ * Flushing the ram cache to ram can only happen on the Secondary VM,
+ * and the clear bitmap is always NULL on the destination side.
+ * Therefore, there is no need to check whether the
+ * clear_bitmap needs clearing.
+ */
+QEMU_LOCK_GUARD(&rs->bitmap_mutex);
+for (i = 0; i < num; i++) {
+ret = test_and_clear_bit(start + i, rb->bmap);
+if (ret) {
+rs->migration_dirty_pages--;
+}
+}
+
+return ret;
+}
+
 static inline bool migration_bitmap_clear_dirty(RAMState *rs,
 RAMBlock *rb,
 unsigned long page)
@@ -3774,11 +3804,7 @@ void colo_flush_ram_cache(void)
 num = 0;
 block = QLIST_NEXT_RCU(block, next);
 } else {
-unsigned long i = 0;
-
-for (i = 0; i < num; i++) {
-migration_bitmap_clear_dirty(ram_state, block, offset + i);
-}
+colo_bitmap_clear_dirty(ram_state, block, offset, num);
 dst_host = block->host
  + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
 src_host = block->colo_cache
-- 
1.8.3.1
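
The idea behind the patch above can be sketched outside of QEMU. The following is only a minimal illustration, assuming a plain array of unsigned long as the bitmap; the names sketch_test_and_clear_bit and sketch_clear_dirty_range are hypothetical and are not QEMU functions. In the real code a single lock acquisition covers the whole loop instead of one acquisition per page.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: clear one bit and report whether it was set. */
static bool sketch_test_and_clear_bit(unsigned long nr, unsigned long *bmap)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    unsigned long *word = &bmap[nr / BITS_PER_WORD];
    bool was_set = (*word & mask) != 0;

    *word &= ~mask;
    return was_set;
}

/*
 * Hypothetical helper: clear bits [start, start + num) in one pass and
 * return how many of them were set. A caller would take its lock once
 * around this call rather than once per bit.
 */
static unsigned long sketch_clear_dirty_range(unsigned long *bmap,
                                              unsigned long start,
                                              unsigned long num)
{
    unsigned long cleared = 0;

    for (unsigned long i = 0; i < num; i++) {
        if (sketch_test_and_clear_bit(start + i, bmap)) {
            cleared++;
        }
    }
    return cleared;
}
```

This sketch returns a count instead of a bool, which is a deliberate difference from the patch: the count stays well defined when num is zero, and a caller can subtract it from its dirty-page counter in one step.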

[PATCH v6 04/10] Remove migrate_set_block_enabled in checkpoint

2021-04-08 Thread leirao

From: "Rao, Lei" 

We can detect disk migration in migrate_prepare. If disk migration
is enabled in COLO mode, we can report an error directly, so there
is no need to disable block migration at every checkpoint.

Signed-off-by: Lei Rao 
Signed-off-by: Zhang Chen 
Reviewed-by: Li Zhijian 
Reviewed-by: Zhang Chen 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 migration/colo.c  | 6 --
 migration/migration.c | 4 
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index de27662..1aaf316 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -435,12 +435,6 @@ static int colo_do_checkpoint_transaction(MigrationState 
*s,
 if (failover_get_state() != FAILOVER_STATUS_NONE) {
 goto out;
 }
-
-/* Disable block migration */
-migrate_set_block_enabled(false, &local_err);
-if (local_err) {
-goto out;
-}
 qemu_mutex_lock_iothread();
 
 #ifdef CONFIG_REPLICATION
diff --git a/migration/migration.c b/migration/migration.c
index 8ca0341..c85b926 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2227,6 +2227,10 @@ static bool migrate_prepare(MigrationState *s, bool blk, 
bool blk_inc,
 }
 
 if (blk || blk_inc) {
+if (migrate_colo_enabled()) {
+error_setg(errp, "No disk migration is required in COLO mode");
+return false;
+}
 if (migrate_use_block() || migrate_use_block_incremental()) {
 error_setg(errp, "Command options are incompatible with "
"current migration capabilities");
-- 
1.8.3.1

[PATCH v6 03/10] Optimize the function of filter_send

2021-04-08 Thread leirao

From: "Rao, Lei" 

The iov_size has already been calculated in filter_send(), so we can
return the size directly. This way, there is no need to repeat the
calculation in filter_redirector_receive_iov().

Signed-off-by: Lei Rao 
Reviewed-by: Li Zhijian 
Reviewed-by: Zhang Chen 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 net/filter-mirror.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index f8e6500..f20240c 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -88,7 +88,7 @@ static int filter_send(MirrorState *s,
 goto err;
 }
 
-return 0;
+return size;
 
 err:
 return ret < 0 ? ret : -EIO;
@@ -159,7 +159,7 @@ static ssize_t filter_mirror_receive_iov(NetFilterState *nf,
 int ret;
 
 ret = filter_send(s, iov, iovcnt);
-if (ret) {
+if (ret < 0) {
 error_report("filter mirror send failed(%s)", strerror(-ret));
 }
 
@@ -182,10 +182,10 @@ static ssize_t 
filter_redirector_receive_iov(NetFilterState *nf,
 
 if (qemu_chr_fe_backend_connected(&s->chr_out)) {
 ret = filter_send(s, iov, iovcnt);
-if (ret) {
+if (ret < 0) {
 error_report("filter redirector send failed(%s)", strerror(-ret));
 }
-return iov_size(iov, iovcnt);
+return ret;
 } else {
 return 0;
 }
-- 
1.8.3.1
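
The return convention this patch moves to (byte count on success, negative errno on failure, zero treated as a normal case) can be sketched in isolation. This is only an illustration under assumptions: sketch_iov_size, sketch_send and sketch_receive are hypothetical names, and the backend_ok flag stands in for the real write to the character backend.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Sum of the lengths of all iovec entries. */
static size_t sketch_iov_size(const struct iovec *iov, int iovcnt)
{
    size_t total = 0;

    for (int i = 0; i < iovcnt; i++) {
        total += iov[i].iov_len;
    }
    return total;
}

/*
 * Hypothetical sender: returns the number of bytes handled on success
 * and a negative errno value on failure. backend_ok is an assumption
 * that replaces the real write to the backend.
 */
static ssize_t sketch_send(const struct iovec *iov, int iovcnt, int backend_ok)
{
    size_t size = sketch_iov_size(iov, iovcnt);

    if (!backend_ok) {
        return -EIO;
    }
    return (ssize_t)size;
}

/* Caller: only a negative value is an error; zero bytes is normal. */
static ssize_t sketch_receive(const struct iovec *iov, int iovcnt,
                              int backend_ok)
{
    ssize_t ret = sketch_send(iov, iovcnt, backend_ok);

    if (ret < 0) {
        fprintf(stderr, "send failed (%s)\n", strerror((int)-ret));
    }
    return ret; /* the size is reused, not recomputed */
}
```

With this shape the caller never has to walk the iovec a second time, and a test such as "if (ret)" would wrongly flag every successful non-empty send, which is why the check becomes "ret < 0".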

[PATCH v6 00/10] Fixed some bugs and optimized some codes for COLO

2021-04-08 Thread leirao

From: Rao, Lei 

Changes since v5:
--Replaced g_slice_new calls with g_slice_new0.

Changes since v4:
--Replaced qemu_mutex_lock calls with QEMU_LOCK_GUARD in 
colo_bitmap_clear_dirty.
--Modify some minor issues about variable definition.
--Add some performance test data in the commit message.

Changes since v3:
--Remove cpu_throttle_stop from mig_throttle_counter_reset.

Changes since v2:
--Add a function named packet_new_nocopy.
--Continue to optimize the function of colo_flush_ram_cache.

Changes since v1:
--Reset the state of the auto-converge counters at every checkpoint 
instead of directly disabling.
--Treat the filter_send function returning zero as a normal case.

This series of patches includes:
Fixes for some QEMU crashes.
Optimizations to some code to reduce checkpoint time.
Removal of some unnecessary code to improve COLO.

Rao, Lei (10):
  Remove some duplicate trace code.
  Fix the qemu crash when guest shutdown during checkpoint
  Optimize the function of filter_send
  Remove migrate_set_block_enabled in checkpoint
  Add a function named packet_new_nocopy for COLO.
  Add the function of colo_compare_cleanup
  Reset the auto-converge counter at every checkpoint.
  Reduce the PVM stop time during Checkpoint
  Add the function of colo_bitmap_clear_dirty
  Fixed calculation error of pkt->header_size in fill_pkt_tcp_info()

 migration/colo.c  | 10 +++
 migration/migration.c |  4 +++
 migration/ram.c   | 83 +--
 migration/ram.h   |  1 +
 net/colo-compare.c| 25 +++-
 net/colo-compare.h|  1 +
 net/colo.c| 25 +++-
 net/colo.h|  1 +
 net/filter-mirror.c   |  8 ++---
 net/filter-rewriter.c |  3 +-
 net/net.c |  4 +++
 softmmu/runstate.c|  1 +
 12 files changed, 129 insertions(+), 37 deletions(-)

-- 
1.8.3.1

[Bug 1895219] Re: qemu git -vnc fails due to missing en-us keymap

2021-04-08 Thread hippieshaker

Confirmed that this is also a problem on the Windows build. The workaround is to
copy the en-us file from C:\Program Files\qemu\keymaps to the qemu folder.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1895219

Title:
  qemu git -vnc fails due to missing en-us keymap

Status in QEMU:
  New

Bug description:
  If trying to run qemu with -vnc :0, it will fail with:
  ./qemu-system-x86_64 -vnc :2
  qemu-system-x86_64: -vnc :2: could not read keymap file: 'en-us'

  share/keymaps is missing en-us keymap and only has sl and sv,
  confirmed previous stable versions had en-us.

  Tried with multiple targets, on arm64 and amd64

  Git commit hash: 9435a8b3dd35f1f926f1b9127e8a906217a5518a (head)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1895219/+subscriptions

Re: [PATCH v5 1/3] hw: Model ASPEED's Hash and Crypto Engine

2021-04-08 Thread Andrew Jeffery




On Fri, 9 Apr 2021, at 09:32, Joel Stanley wrote:
> The HACE (Hash and Crypto Engine) is a device that offloads MD5, SHA1,
> SHA2, RSA and other cryptographic algorithms.
> 
> This initial model implements a subset of the device's functionality;
> currently only MD5/SHA hashing, and on the ast2600's scatter gather
> engine.
> 
> Co-developed-by: Klaus Heinrich Kiwi 
> Reviewed-by: Cédric Le Goater 
> Reviewed-by: Philippe Mathieu-Daudé 
> Signed-off-by: Joel Stanley 

Reviewed-by: Andrew Jeffery

Re: Re: [PATCH 2/3] vhost-blk: Add vhost-blk-common abstraction

2021-04-08 Thread Yongji Xie

On Fri, Apr 9, 2021 at 7:21 AM Raphael Norwitz
 wrote:
>
> I'm mostly happy with this. Just some asks on variable renaming and
> comments which need to be fixed because of how you've moved things
> around.
>

OK. Thank you for reviewing!

> Also let's add a MAINTAINERS entry for vhost-blk-common.h/c either under
> vhost-user-blk or create a new vhost-blk entry. I'm not sure what the
> best practices are for this.
>

Not sure. Maybe adding vhost-blk-common.h/c under the vhost-user-blk entry is OK.

> On Thu, Apr 08, 2021 at 06:12:51PM +0800, Xie Yongji wrote:
> > This commit abstracts part of vhost-user-blk into a common
> > parent class which is useful for the introduction of vhost-vdpa-blk.
> >
> > Signed-off-by: Xie Yongji 
> > ---
> >  hw/block/meson.build |   2 +-
> >  hw/block/vhost-blk-common.c  | 291 +
> >  hw/block/vhost-user-blk.c| 306 +--
> >  hw/virtio/vhost-user-blk-pci.c   |   7 +-
> >  include/hw/virtio/vhost-blk-common.h |  50 +
> >  include/hw/virtio/vhost-user-blk.h   |  20 +-
> >  6 files changed, 396 insertions(+), 280 deletions(-)
> >  create mode 100644 hw/block/vhost-blk-common.c
> >  create mode 100644 include/hw/virtio/vhost-blk-common.h
> >
> > diff --git a/hw/block/meson.build b/hw/block/meson.build
> > index 5b4a7699f9..5862bda4cb 100644
> > --- a/hw/block/meson.build
> > +++ b/hw/block/meson.build
> > @@ -16,6 +16,6 @@ softmmu_ss.add(when: 'CONFIG_TC58128', if_true: 
> > files('tc58128.c'))
> >  softmmu_ss.add(when: 'CONFIG_NVME_PCI', if_true: files('nvme.c', 
> > 'nvme-ns.c', 'nvme-subsys.c', 'nvme-dif.c'))
> >
> >  specific_ss.add(when: 'CONFIG_VIRTIO_BLK', if_true: files('virtio-blk.c'))
> > -specific_ss.add(when: 'CONFIG_VHOST_USER_BLK', if_true: 
> > files('vhost-user-blk.c'))
> > +specific_ss.add(when: 'CONFIG_VHOST_USER_BLK', if_true: 
> > files('vhost-blk-common.c', 'vhost-user-blk.c'))
> >
> >  subdir('dataplane')
> > diff --git a/hw/block/vhost-blk-common.c b/hw/block/vhost-blk-common.c
> > new file mode 100644
> > index 0000000000..96500f6c89
> > --- /dev/null
> > +++ b/hw/block/vhost-blk-common.c
> > @@ -0,0 +1,291 @@
> > +/*
> > + * Parent class for vhost based block devices
> > + *
> > + * Copyright (C) 2021 Bytedance Inc. and/or its affiliates. All rights 
> > reserved.
> > + *
> > + * Author:
> > + *   Xie Yongji 
> > + *
> > + * Heavily based on the vhost-user-blk.c by:
> > + *   Changpeng Liu 
>
> You should probably also give credit to Felipe, Stefan and Nicholas, as
> a lot of vhost-user-blk originally came from their work.
>

Sure.

> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2.  See
> > + * the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qapi/error.h"
> > +#include "qemu/error-report.h"
> > +#include "qemu/cutils.h"
> > +#include "hw/qdev-core.h"
> > +#include "hw/qdev-properties.h"
> > +#include "hw/qdev-properties-system.h"
> > +#include "hw/virtio/vhost.h"
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-bus.h"
> > +#include "hw/virtio/virtio-access.h"
> > +#include "hw/virtio/vhost-blk-common.h"
> > +#include "sysemu/sysemu.h"
> > +#include "sysemu/runstate.h"
> > +
> > +static void vhost_blk_common_update_config(VirtIODevice *vdev, uint8_t 
> > *config)
> > +{
> > +VHostBlkCommon *vbc = VHOST_BLK_COMMON(vdev);
> > +
> > +/* Our num_queues overrides the device backend */
> > +virtio_stw_p(vdev, &vbc->blkcfg.num_queues, vbc->num_queues);
> > +
> > +memcpy(config, &vbc->blkcfg, sizeof(struct virtio_blk_config));
> > +}
> > +
> > +static void vhost_blk_common_set_config(VirtIODevice *vdev,
> > +const uint8_t *config)
> > +{
> > +VHostBlkCommon *vbc = VHOST_BLK_COMMON(vdev);
> > +struct virtio_blk_config *blkcfg = (struct virtio_blk_config *)config;
> > +int ret;
> > +
> > +if (blkcfg->wce == vbc->blkcfg.wce) {
> > +return;
> > +}
> > +
> > +ret = vhost_dev_set_config(&vbc->dev, &blkcfg->wce,
> > +   offsetof(struct virtio_blk_config, wce),
> > +   sizeof(blkcfg->wce),
> > +   VHOST_SET_CONFIG_TYPE_MASTER);
> > +if (ret) {
> > +error_report("set device config space failed");
> > +return;
> > +}
> > +
> > +vbc->blkcfg.wce = blkcfg->wce;
> > +}
> > +
> > +static int vhost_blk_common_handle_config_change(struct vhost_dev *dev)
> > +{
> > +VHostBlkCommon *vbc = VHOST_BLK_COMMON(dev->vdev);
> > +struct virtio_blk_config blkcfg;
> > +int ret;
> > +
> > +ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg,
> > +   sizeof(struct virtio_blk_config));
> > +if (ret < 0) {
> > +error_report("get config space failed");
> > +return ret;
> > +}
> > +
> > +/* valid for resize only */
> > +if (blkcfg.capacity != vbc->blkcfg.capacity) {

Re: Commit "x86/kvm: Move context tracking where it belongs" broke guest time accounting

2021-04-08 Thread Wanpeng Li

On Thu, 8 Apr 2021 at 21:19, Thomas Gleixner  wrote:
>
> On Tue, Apr 06 2021 at 21:47, Sean Christopherson wrote:
> > On Tue, Apr 06, 2021, Michael Tokarev wrote:
> >> broke kvm guest cpu time accounting - after this commit, when running
> >> qemu-system-x86_64 -enable-kvm, the guest time (in /proc/stat and
> >> elsewhere) is always 0.
> >>
> >> I dunno why it happened, but it happened, and all kernels after 5.9
> >> are affected by this.
> >>
> >> This commit is found in a (painful) git bisect between kernel 5.8 and 5.10.
> >
> > Yes :-(
> >
> > There's a bugzilla[1] and two proposed fixes[2][3].  I don't particularly 
> > like
> > either of the fixes, but an elegant solution hasn't presented itself.
> >
> > Thomas/Paolo, can you please weigh in?
> >
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=209831
> > [2] 
> > https://lkml.kernel.org/r/1617011036-11734-1-git-send-email-wanpen...@tencent.com
> > [3] https://lkml.kernel.org/r/20210206004218.312023-1-sea...@google.com
>
> All of the solutions I looked at so far are ugly as hell. The problem is
> that the accounting is plumbed into the context tracking and moving
> context tracking around to a different place is just wrong.
>
> I think the right solution is to separate the time accounting logic out
> from guest_enter/exit_irqoff() and have virt time specific helpers which
> can be placed at the proper spots in kvm.

Good suggestion, I will have a try. :)

Wanpeng

RE: [PATCH v2] Revert "target/mips: Deprecate nanoMIPS ISA"

2021-04-08 Thread Vince Del Vecchio

On Thursday, April 8, 2021 2:17 PM, Richard Henderson wrote:

> NACK, for the reasons stated against v1:
> https://lists.gnu.org/archive/html/qemu-devel/2021-04/msg00663.html

On Tuesday, April 6, 2021 11:21 AM, Richard Henderson wrote:

> I think we should retain the deprecation until you actually follow through
> with any of the upstreaming.
> 
> You didn't even bother to commit your changes to a code repository -- merely
> uploaded tarballs.  There have been no posts to the gcc mailing lists about
> nanomips.
> 
> A mere code dump is not active development.

Maybe not, but we are in fact actively developing.  :-)

You’re right we haven’t published the source repos on github yet.  It's been on 
our list.  Maybe not today/tomorrow, but it'll definitely be done by next week.

The nanoMIPS toolchain we inherited is based on gcc 6.3.  We’ve been working on 
upgrading to gcc trunk since late February, but it's not a trivial task.  As 
soon as we're done (hopefully before the summer), we'll propose the changes to 
the gcc mailing list.

For now, we don't have many topics for the gcc lists, although 
https://gcc.gnu.org/pipermail/gcc/2021-March/235082.html is from our work.

We've also started an LLVM port (https://github.com/MediaTek-Labs/llvm-project) 
as I mentioned in my previous message 
(https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg09764.html).

In sum, we're investing in open source nanoMIPS tools because it's an important 
technology for us, and QEMU is one of the key projects we want to have nanoMIPS 
supported in.

-Vince Del Vecchio
Compiler Team Lead & Deputy Director, DSP Core Technology, MediaTek

[PATCH v4 05/26] Hexagon (target/hexagon) properly generate TB end for DISAS_NORETURN

2021-04-08 Thread Taylor Simpson

When exiting a TB, generate all the code before returning from
hexagon_tr_translate_packet so that nothing needs to be done in
hexagon_tr_tb_stop.

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/translate.c | 62 --
 target/hexagon/translate.h |  3 ---
 2 files changed, 33 insertions(+), 32 deletions(-)

diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index e235fdb..9f2a531 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -54,16 +54,40 @@ static const char * const hexagon_prednames[] = {
   "p0", "p1", "p2", "p3"
 };
 
-void gen_exception(int excp)
+static void gen_exception_raw(int excp)
 {
 TCGv_i32 helper_tmp = tcg_const_i32(excp);
 gen_helper_raise_exception(cpu_env, helper_tmp);
 tcg_temp_free_i32(helper_tmp);
 }
 
-void gen_exception_debug(void)
+static void gen_exec_counters(DisasContext *ctx)
+{
+tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_PKT_CNT],
+hex_gpr[HEX_REG_QEMU_PKT_CNT], ctx->num_packets);
+tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_INSN_CNT],
+hex_gpr[HEX_REG_QEMU_INSN_CNT], ctx->num_insns);
+}
+
+static void gen_end_tb(DisasContext *ctx)
 {
-gen_exception(EXCP_DEBUG);
+gen_exec_counters(ctx);
+tcg_gen_mov_tl(hex_gpr[HEX_REG_PC], hex_next_PC);
+if (ctx->base.singlestep_enabled) {
+gen_exception_raw(EXCP_DEBUG);
+} else {
+tcg_gen_exit_tb(NULL, 0);
+}
+ctx->base.is_jmp = DISAS_NORETURN;
+}
+
+static void gen_exception_end_tb(DisasContext *ctx, int excp)
+{
+gen_exec_counters(ctx);
+tcg_gen_mov_tl(hex_gpr[HEX_REG_PC], hex_next_PC);
+gen_exception_raw(excp);
+ctx->base.is_jmp = DISAS_NORETURN;
+
 }
 
 #if HEX_DEBUG
@@ -225,8 +249,7 @@ static void gen_insn(CPUHexagonState *env, DisasContext 
*ctx,
 mark_implicit_writes(ctx, insn);
 insn->generate(env, ctx, insn, pkt);
 } else {
-gen_exception(HEX_EXCP_INVALID_OPCODE);
-ctx->base.is_jmp = DISAS_NORETURN;
+gen_exception_end_tb(ctx, HEX_EXCP_INVALID_OPCODE);
 }
 }
 
@@ -447,14 +470,6 @@ static void update_exec_counters(DisasContext *ctx, Packet 
*pkt)
 ctx->num_insns += num_real_insns;
 }
 
-static void gen_exec_counters(DisasContext *ctx)
-{
-tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_PKT_CNT],
-hex_gpr[HEX_REG_QEMU_PKT_CNT], ctx->num_packets);
-tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_INSN_CNT],
-hex_gpr[HEX_REG_QEMU_INSN_CNT], ctx->num_insns);
-}
-
 static void gen_commit_packet(DisasContext *ctx, Packet *pkt)
 {
 gen_reg_writes(ctx);
@@ -478,7 +493,7 @@ static void gen_commit_packet(DisasContext *ctx, Packet 
*pkt)
 #endif
 
 if (pkt->pkt_has_cof) {
-ctx->base.is_jmp = DISAS_NORETURN;
+gen_end_tb(ctx);
 }
 }
 
@@ -491,8 +506,7 @@ static void decode_and_translate_packet(CPUHexagonState 
*env, DisasContext *ctx)
 
 nwords = read_packet_words(env, ctx, words);
 if (!nwords) {
-gen_exception(HEX_EXCP_INVALID_PACKET);
-ctx->base.is_jmp = DISAS_NORETURN;
+gen_exception_end_tb(ctx, HEX_EXCP_INVALID_PACKET);
 return;
 }
 
@@ -505,8 +519,7 @@ static void decode_and_translate_packet(CPUHexagonState 
*env, DisasContext *ctx)
 gen_commit_packet(ctx, &pkt);
 ctx->base.pc_next += pkt.encod_pkt_size_in_bytes;
 } else {
-gen_exception(HEX_EXCP_INVALID_PACKET);
-ctx->base.is_jmp = DISAS_NORETURN;
+gen_exception_end_tb(ctx, HEX_EXCP_INVALID_PACKET);
 }
 }
 
@@ -536,9 +549,7 @@ static bool hexagon_tr_breakpoint_check(DisasContextBase 
*dcbase, CPUState *cpu,
 {
 DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
-tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], ctx->base.pc_next);
-ctx->base.is_jmp = DISAS_NORETURN;
-gen_exception_debug();
+gen_exception_end_tb(ctx, EXCP_DEBUG);
 /*
  * The address covered by the breakpoint must be included in
  * [tb->pc, tb->pc + tb->size) in order to for it to be
@@ -601,19 +612,12 @@ static void hexagon_tr_tb_stop(DisasContextBase *dcbase, 
CPUState *cpu)
 gen_exec_counters(ctx);
 tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], ctx->base.pc_next);
 if (ctx->base.singlestep_enabled) {
-gen_exception_debug();
+gen_exception_raw(EXCP_DEBUG);
 } else {
 tcg_gen_exit_tb(NULL, 0);
 }
 break;
 case DISAS_NORETURN:
-gen_exec_counters(ctx);
-tcg_gen_mov_tl(hex_gpr[HEX_REG_PC], hex_next_PC);
-if (ctx->base.singlestep_enabled) {
-gen_exception_debug();
-} else {
-tcg_gen_exit_tb(NULL, 0);
-}
 break;
 default:
 g_assert_not_reached();
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index 938f7fb..12506c8 100644
--- a/target/hexagon/translate.h
+++

[PATCH v4 23/26] Hexagon (target/hexagon) bit reverse (brev) addressing

2021-04-08 Thread Taylor Simpson

The following instructions are added
L2_loadrub_pbr  Rd32 = memub(Rx32++Mu2:brev)
L2_loadrb_pbr   Rd32 = memb(Rx32++Mu2:brev)
L2_loadruh_pbr  Rd32 = memuh(Rx32++Mu2:brev)
L2_loadrh_pbr   Rd32 = memh(Rx32++Mu2:brev)
L2_loadri_pbr   Rd32 = memw(Rx32++Mu2:brev)
L2_loadrd_pbr   Rdd32 = memd(Rx32++Mu2:brev)
S2_storerb_pbr  memb(Rx32++Mu2:brev).=.Rt32
S2_storerh_pbr  memh(Rx32++Mu2:brev).=.Rt32
S2_storerf_pbr  memh(Rx32++Mu2:brev).=.Rt.H32
S2_storeri_pbr  memw(Rx32++Mu2:brev).=.Rt32
S2_storerd_pbr  memd(Rx32++Mu2:brev).=.Rt32
S2_storerinew_pbr   memw(Rx32++Mu2:brev).=.Nt8.new
S2_storerbnew_pbr   memb(Rx32++Mu2:brev).=.Nt8.new
S2_storerhnew_pbr   memh(Rx32++Mu2:brev).=.Nt8.new

Test cases in tests/tcg/hexagon/brev.c

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h  |  28 +
 target/hexagon/helper.h   |   1 +
 target/hexagon/imported/encode_pp.def |   4 +
 target/hexagon/imported/ldst.idef |   2 +
 target/hexagon/imported/macros.def|   6 ++
 target/hexagon/macros.h   |   1 +
 target/hexagon/op_helper.c|   8 ++
 tests/tcg/hexagon/Makefile.target |   1 +
 tests/tcg/hexagon/brev.c  | 190 ++
 9 files changed, 241 insertions(+)
 create mode 100644 tests/tcg/hexagon/brev.c

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 25c228c..8f0ec01 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -37,6 +37,7 @@
  * _sp   stack pointer relative  r0 = memw(r29+#12)
  * _ap   absolute set  r0 = memw(r1=##variable)
  * _pr   post increment register   r0 = memw(r1++m1)
+ * _pbr  post increment bit reverse  r0 = memw(r1++m1:brev)
  * _pi   post increment immediate  r0 = memb(r1++#1)
  * _pci  post increment circular immediate r0 = memw(r1++#4:circ(m0))
  * _pcr  post increment circular register  r0 = memw(r1++I:circ(m0))
@@ -53,6 +54,11 @@
 fEA_REG(RxV); \
 fPM_M(RxV, MuV); \
 } while (0)
+#define GET_EA_pbr \
+do { \
+gen_helper_fbrev(EA, RxV); \
+tcg_gen_add_tl(RxV, RxV, MuV); \
+} while (0)
 #define GET_EA_pi \
 do { \
 fEA_REG(RxV); \
@@ -128,16 +134,22 @@
   fGEN_TCG_LOAD_pcr(3, fLOAD(1, 8, u, EA, RddV))
 
 #define fGEN_TCG_L2_loadrub_pr(SHORTCODE)  SHORTCODE
+#define fGEN_TCG_L2_loadrub_pbr(SHORTCODE) SHORTCODE
 #define fGEN_TCG_L2_loadrub_pi(SHORTCODE)  SHORTCODE
 #define fGEN_TCG_L2_loadrb_pr(SHORTCODE)   SHORTCODE
+#define fGEN_TCG_L2_loadrb_pbr(SHORTCODE)  SHORTCODE
 #define fGEN_TCG_L2_loadrb_pi(SHORTCODE)   SHORTCODE
 #define fGEN_TCG_L2_loadruh_pr(SHORTCODE)  SHORTCODE
+#define fGEN_TCG_L2_loadruh_pbr(SHORTCODE) SHORTCODE
 #define fGEN_TCG_L2_loadruh_pi(SHORTCODE)  SHORTCODE
 #define fGEN_TCG_L2_loadrh_pr(SHORTCODE)   SHORTCODE
+#define fGEN_TCG_L2_loadrh_pbr(SHORTCODE)  SHORTCODE
 #define fGEN_TCG_L2_loadrh_pi(SHORTCODE)   SHORTCODE
 #define fGEN_TCG_L2_loadri_pr(SHORTCODE)   SHORTCODE
+#define fGEN_TCG_L2_loadri_pbr(SHORTCODE)  SHORTCODE
 #define fGEN_TCG_L2_loadri_pi(SHORTCODE)   SHORTCODE
 #define fGEN_TCG_L2_loadrd_pr(SHORTCODE)   SHORTCODE
+#define fGEN_TCG_L2_loadrd_pbr(SHORTCODE)  SHORTCODE
 #define fGEN_TCG_L2_loadrd_pi(SHORTCODE)   SHORTCODE
 
 /*
@@ -265,41 +277,57 @@
 tcg_temp_free(BYTE); \
 } while (0)
 
+#define fGEN_TCG_S2_storerb_pbr(SHORTCODE) \
+fGEN_TCG_STORE(SHORTCODE)
 #define fGEN_TCG_S2_storerb_pci(SHORTCODE) \
 fGEN_TCG_STORE(SHORTCODE)
 #define fGEN_TCG_S2_storerb_pcr(SHORTCODE) \
 fGEN_TCG_STORE_pcr(0, fSTORE(1, 1, EA, fGETBYTE(0, RtV)))
 
+#define fGEN_TCG_S2_storerh_pbr(SHORTCODE) \
+fGEN_TCG_STORE(SHORTCODE)
 #define fGEN_TCG_S2_storerh_pci(SHORTCODE) \
 fGEN_TCG_STORE(SHORTCODE)
 #define fGEN_TCG_S2_storerh_pcr(SHORTCODE) \
 fGEN_TCG_STORE_pcr(1, fSTORE(1, 2, EA, fGETHALF(0, RtV)))
 
+#define fGEN_TCG_S2_storerf_pbr(SHORTCODE) \
+fGEN_TCG_STORE(SHORTCODE)
 #define fGEN_TCG_S2_storerf_pci(SHORTCODE) \
 fGEN_TCG_STORE(SHORTCODE)
 #define fGEN_TCG_S2_storerf_pcr(SHORTCODE) \
 fGEN_TCG_STORE_pcr(1, fSTORE(1, 2, EA, fGETHALF(1, RtV)))
 
+#define fGEN_TCG_S2_storeri_pbr(SHORTCODE) \
+fGEN_TCG_STORE(SHORTCODE)
 #define fGEN_TCG_S2_storeri_pci(SHORTCODE) \
 fGEN_TCG_STORE(SHORTCODE)
 #define fGEN_TCG_S2_storeri_pcr(SHORTCODE) \
 fGEN_TCG_STORE_pcr(2, fSTORE(1, 4, EA, RtV))
 
+#define fGEN_TCG_S2_storerd_pbr(SHORTCODE) \
+fGEN_TCG_STORE(SHORTCODE)
 #define fGEN_TCG_S2_storerd_pci(SHORTCODE) \
 fGEN_TCG_STORE(SHORTCODE)
 #define fGEN_TCG_S2_storerd_pcr(SHORTCODE) \
 fGEN_TCG_STORE_pcr(3, fSTORE(1, 8, EA, RttV))
 
+#define
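
The GET_EA_pbr macro above computes the effective address with a helper and then post-increments Rx by the modifier register. A minimal sketch of that addressing step in plain C follows. It assumes that the helper reverses only the low 16 bits of the address and keeps the upper 16 bits; the helper body is not shown in this message, so treat that as an assumption. All sketch_ names are hypothetical.

```c
#include <stdint.h>

/* Reverse the bit order of a 16-bit value (bit 0 becomes bit 15). */
static uint16_t sketch_revbit16(uint16_t x)
{
    uint16_t r = 0;

    for (int i = 0; i < 16; i++) {
        r = (uint16_t)((r << 1) | ((x >> i) & 1));
    }
    return r;
}

/*
 * Assumed behaviour of the bit-reverse address helper: keep the upper
 * 16 bits of Rx and bit-reverse the lower 16 bits.
 */
static uint32_t sketch_brev_ea(uint32_t rx)
{
    return (rx & 0xffff0000u) | sketch_revbit16((uint16_t)(rx & 0xffffu));
}

/*
 * One access step in the order used by GET_EA_pbr: derive EA from the
 * current Rx, then add the modifier to Rx.
 */
static uint32_t sketch_brev_step(uint32_t *rx, uint32_t mu)
{
    uint32_t ea = sketch_brev_ea(*rx);

    *rx += mu;
    return ea;
}
```

Stepping Rx linearly while accessing memory at the reversed address is what makes this mode useful for FFT-style buffers, where elements are visited in bit-reversed index order.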

[PATCH v4 17/26] Hexagon (target/hexagon) add F2_sfrecipa instruction

2021-04-08 Thread Taylor Simpson

Rd32,Pe4 = sfrecipa(Rs32, Rt32)
Reciprocal approximation

Test cases in tests/tcg/hexagon/multi_result.c
FP exception tests added to tests/tcg/hexagon/fpstuff.c

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/arch.c | 26 +--
 target/hexagon/arch.h |  2 +
 target/hexagon/gen_tcg.h  | 21 +
 target/hexagon/helper.h   |  1 +
 target/hexagon/imported/encode_pp.def |  1 +
 target/hexagon/imported/float.idef| 16 +++
 target/hexagon/op_helper.c| 37 
 tests/tcg/hexagon/Makefile.target |  1 +
 tests/tcg/hexagon/fpstuff.c   | 82 +++
 tests/tcg/hexagon/multi_result.c  | 68 +
 10 files changed, 252 insertions(+), 3 deletions(-)
 create mode 100644 tests/tcg/hexagon/multi_result.c

diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
index 40b6e3d..46edf45 100644
--- a/target/hexagon/arch.c
+++ b/target/hexagon/arch.c
@@ -181,12 +181,13 @@ int arch_sf_recip_common(float32 *Rs, float32 *Rt, 
float32 *Rd, int *adjust,
 /* or put Inf in num fixup? */
 uint8_t RsV_sign = float32_is_neg(RsV);
 uint8_t RtV_sign = float32_is_neg(RtV);
+/* Check that RsV is NOT infinite before we overwrite it */
+if (!float32_is_infinity(RsV)) {
+float_raise(float_flag_divbyzero, fp_status);
+}
 RsV = infinite_float32(RsV_sign ^ RtV_sign);
 RtV = float32_one;
 RdV = float32_one;
-if (float32_is_infinity(RsV)) {
-float_raise(float_flag_divbyzero, fp_status);
-}
 } else if (float32_is_infinity(RtV)) {
 RsV = make_float32(0x80000000 & (RsV ^ RtV));
 RtV = float32_one;
@@ -279,3 +280,22 @@ int arch_sf_invsqrt_common(float32 *Rs, float32 *Rd, int 
*adjust,
 *adjust = PeV;
 return ret;
 }
+
+const uint8_t recip_lookup_table[128] = {
+0x0fe, 0x0fa, 0x0f6, 0x0f2, 0x0ef, 0x0eb, 0x0e7, 0x0e4,
+0x0e0, 0x0dd, 0x0d9, 0x0d6, 0x0d2, 0x0cf, 0x0cc, 0x0c9,
+0x0c6, 0x0c2, 0x0bf, 0x0bc, 0x0b9, 0x0b6, 0x0b3, 0x0b1,
+0x0ae, 0x0ab, 0x0a8, 0x0a5, 0x0a3, 0x0a0, 0x09d, 0x09b,
+0x098, 0x096, 0x093, 0x091, 0x08e, 0x08c, 0x08a, 0x087,
+0x085, 0x083, 0x080, 0x07e, 0x07c, 0x07a, 0x078, 0x075,
+0x073, 0x071, 0x06f, 0x06d, 0x06b, 0x069, 0x067, 0x065,
+0x063, 0x061, 0x05f, 0x05e, 0x05c, 0x05a, 0x058, 0x056,
+0x054, 0x053, 0x051, 0x04f, 0x04e, 0x04c, 0x04a, 0x049,
+0x047, 0x045, 0x044, 0x042, 0x040, 0x03f, 0x03d, 0x03c,
+0x03a, 0x039, 0x037, 0x036, 0x034, 0x033, 0x032, 0x030,
+0x02f, 0x02d, 0x02c, 0x02b, 0x029, 0x028, 0x027, 0x025,
+0x024, 0x023, 0x021, 0x020, 0x01f, 0x01e, 0x01c, 0x01b,
+0x01a, 0x019, 0x017, 0x016, 0x015, 0x014, 0x013, 0x012,
+0x011, 0x00f, 0x00e, 0x00d, 0x00c, 0x00b, 0x00a, 0x009,
+0x008, 0x007, 0x006, 0x005, 0x004, 0x003, 0x002, 0x000,
+};
diff --git a/target/hexagon/arch.h b/target/hexagon/arch.h
index 6e0b0d9..b6634e9 100644
--- a/target/hexagon/arch.h
+++ b/target/hexagon/arch.h
@@ -30,4 +30,6 @@ int arch_sf_recip_common(float32 *Rs, float32 *Rt, float32 
*Rd,
 int arch_sf_invsqrt_common(float32 *Rs, float32 *Rd, int *adjust,
   float_status *fp_status);
 
+extern const uint8_t recip_lookup_table[128];
+
 #endif
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index a30048e..428a670 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -195,6 +195,27 @@
 #define fGEN_TCG_S4_stored_locked(SHORTCODE) \
 do { SHORTCODE; READ_PREG(PdV, PdN); } while (0)
 
+/*
+ * Mathematical operations with more than one definition require
+ * special handling
+ */
+
+/*
+ * Approximate reciprocal
+ * r3,p1 = sfrecipa(r0, r1)
+ *
+ * The helper packs the 2 32-bit results into a 64-bit value,
+ * so unpack them into the proper results.
+ */
+#define fGEN_TCG_F2_sfrecipa(SHORTCODE) \
+do { \
+TCGv_i64 tmp = tcg_temp_new_i64(); \
+gen_helper_sfrecipa(tmp, cpu_env, RsV, RtV);  \
+tcg_gen_extrh_i64_i32(RdV, tmp); \
+tcg_gen_extrl_i64_i32(PeV, tmp); \
+tcg_temp_free_i64(tmp); \
+} while (0)
+
 /* Floating point */
 #define fGEN_TCG_F2_conv_sf2df(SHORTCODE) \
 gen_helper_conv_sf2df(RddV, cpu_env, RsV)
diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index efe6069..b377293 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -24,6 +24,7 @@ DEF_HELPER_FLAGS_3(debug_check_store_width, TCG_CALL_NO_WG, 
void, env, int, int)
 DEF_HELPER_FLAGS_3(debug_commit_end, TCG_CALL_NO_WG, void, env, int, int)
 DEF_HELPER_2(commit_store, void, env, int)
 DEF_HELPER_FLAGS_4(fcircadd, TCG_CALL_NO_RWG_SE, s32, s32, s32, s32, s32)
+DEF_HELPER_3(sfrecipa, i64, env, f32, f32)
 
 /* Floating point */
 DEF_HELPER_2(conv_sf2df, f64, env, f32)
diff --git a/target/hexagon/imported/encode_pp.def 
b/target/hexagon/imported/encode_pp.def
index
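
The fGEN_TCG_F2_sfrecipa macro above unpacks a 64-bit helper result with tcg_gen_extrh_i64_i32 (high half into RdV) and tcg_gen_extrl_i64_i32 (low half into PeV). The packing convention this implies can be sketched in plain C as below. The function names are hypothetical, and the helper body itself is not shown in this message, so the pack side is an assumption derived from the unpack side.

```c
#include <stdint.h>

/* Pack two 32-bit results: rd in bits 63:32, pe in bits 31:0. */
static uint64_t sketch_pack_results(uint32_t rd, uint32_t pe)
{
    return ((uint64_t)rd << 32) | pe;
}

/* Counterpart of extracting the high 32 bits (the Rd result). */
static uint32_t sketch_unpack_rd(uint64_t packed)
{
    return (uint32_t)(packed >> 32);
}

/* Counterpart of extracting the low 32 bits (the Pe result). */
static uint32_t sketch_unpack_pe(uint64_t packed)
{
    return (uint32_t)packed;
}
```

Returning one 64-bit value lets a helper with a single return slot deliver both the register result and the predicate result of an instruction with two destinations.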

[PATCH v4 25/26] Hexagon (target/hexagon) load into shifted register instructions

2021-04-08 Thread Taylor Simpson

The following instructions are added
L2_loadalignb_io  Ryy32 = memb_fifo(Rs32+#s11:1)
L2_loadalignh_io  Ryy32 = memh_fifo(Rs32+#s11:1)
L4_loadalignb_ur  Ryy32 = memb_fifo(Rt32<<#u2+#U6)
L4_loadalignh_ur  Ryy32 = memh_fifo(Rt32<<#u2+#U6)
L4_loadalignb_ap  Ryy32 = memb_fifo(Re32=#U6)
L4_loadalignh_ap  Ryy32 = memh_fifo(Re32=#U6)
L2_loadalignb_pr  Ryy32 = memb_fifo(Rx32++Mu2)
L2_loadalignh_pr  Ryy32 = memh_fifo(Rx32++Mu2)
L2_loadalignb_pbr Ryy32 = memb_fifo(Rx32++Mu2:brev)
L2_loadalignh_pbr Ryy32 = memh_fifo(Rx32++Mu2:brev)
L2_loadalignb_pi  Ryy32 = memb_fifo(Rx32++#s4:1)
L2_loadalignh_pi  Ryy32 = memh_fifo(Rx32++#s4:1)
L2_loadalignb_pci Ryy32 = memb_fifo(Rx32++#s4:1:circ(Mu2))
L2_loadalignh_pci Ryy32 = memh_fifo(Rx32++#s4:1:circ(Mu2))
L2_loadalignb_pcr Ryy32 = memb_fifo(Rx32++I:circ(Mu2))
L2_loadalignh_pcr Ryy32 = memh_fifo(Rx32++I:circ(Mu2))

Test cases in tests/tcg/hexagon/load_align.c

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h  |  66 ++
 target/hexagon/imported/encode_pp.def |   3 +
 target/hexagon/imported/ldst.idef |  19 ++
 tests/tcg/hexagon/Makefile.target |   1 +
 tests/tcg/hexagon/load_align.c| 415 ++
 5 files changed, 504 insertions(+)
 create mode 100644 tests/tcg/hexagon/load_align.c

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 1120aae..18fcdbc 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -261,6 +261,72 @@
 fGEN_TCG_loadbXw4(GET_EA_pi, true)
 
 /*
+ * These instructions load a half word, shift the destination right by 16 bits
+ * and place the loaded value in the high half word of the destination pair.
+ * The GET_EA macro determines the addressing mode.
+ */
+#define fGEN_TCG_loadalignh(GET_EA) \
+do { \
+TCGv tmp = tcg_temp_new(); \
+TCGv_i64 tmp_i64 = tcg_temp_new_i64(); \
+GET_EA;  \
+fLOAD(1, 2, u, EA, tmp);  \
+tcg_gen_extu_i32_i64(tmp_i64, tmp); \
+tcg_gen_shri_i64(RyyV, RyyV, 16); \
+tcg_gen_deposit_i64(RyyV, RyyV, tmp_i64, 48, 16); \
+tcg_temp_free(tmp); \
+tcg_temp_free_i64(tmp_i64); \
+} while (0)
+
+#define fGEN_TCG_L4_loadalignh_ur(SHORTCODE) \
+fGEN_TCG_loadalignh(fEA_IRs(UiV, RtV, uiV))
+#define fGEN_TCG_L2_loadalignh_io(SHORTCODE) \
+fGEN_TCG_loadalignh(fEA_RI(RsV, siV))
+#define fGEN_TCG_L2_loadalignh_pci(SHORTCODE) \
+fGEN_TCG_loadalignh(GET_EA_pci)
+#define fGEN_TCG_L2_loadalignh_pcr(SHORTCODE) \
+fGEN_TCG_loadalignh(GET_EA_pcr(1))
+#define fGEN_TCG_L4_loadalignh_ap(SHORTCODE) \
+fGEN_TCG_loadalignh(GET_EA_ap)
+#define fGEN_TCG_L2_loadalignh_pr(SHORTCODE) \
+fGEN_TCG_loadalignh(GET_EA_pr)
+#define fGEN_TCG_L2_loadalignh_pbr(SHORTCODE) \
+fGEN_TCG_loadalignh(GET_EA_pbr)
+#define fGEN_TCG_L2_loadalignh_pi(SHORTCODE) \
+fGEN_TCG_loadalignh(GET_EA_pi)
+
+/* Same as above, but loads a byte instead of half word */
+#define fGEN_TCG_loadalignb(GET_EA) \
+do { \
+TCGv tmp = tcg_temp_new(); \
+TCGv_i64 tmp_i64 = tcg_temp_new_i64(); \
+GET_EA;  \
+fLOAD(1, 1, u, EA, tmp);  \
+tcg_gen_extu_i32_i64(tmp_i64, tmp); \
+tcg_gen_shri_i64(RyyV, RyyV, 8); \
+tcg_gen_deposit_i64(RyyV, RyyV, tmp_i64, 56, 8); \
+tcg_temp_free(tmp); \
+tcg_temp_free_i64(tmp_i64); \
+} while (0)
+
+#define fGEN_TCG_L2_loadalignb_io(SHORTCODE) \
+fGEN_TCG_loadalignb(fEA_RI(RsV, siV))
+#define fGEN_TCG_L4_loadalignb_ur(SHORTCODE) \
+fGEN_TCG_loadalignb(fEA_IRs(UiV, RtV, uiV))
+#define fGEN_TCG_L2_loadalignb_pci(SHORTCODE) \
+fGEN_TCG_loadalignb(GET_EA_pci)
+#define fGEN_TCG_L2_loadalignb_pcr(SHORTCODE) \
+fGEN_TCG_loadalignb(GET_EA_pcr(0))
+#define fGEN_TCG_L4_loadalignb_ap(SHORTCODE) \
+fGEN_TCG_loadalignb(GET_EA_ap)
+#define fGEN_TCG_L2_loadalignb_pr(SHORTCODE) \
+fGEN_TCG_loadalignb(GET_EA_pr)
+#define fGEN_TCG_L2_loadalignb_pbr(SHORTCODE) \
+fGEN_TCG_loadalignb(GET_EA_pbr)
+#define fGEN_TCG_L2_loadalignb_pi(SHORTCODE) \
+fGEN_TCG_loadalignb(GET_EA_pi)
+
+/*
  * Predicated loads
  * Here is a primer to understand the tag names
  *
diff --git a/target/hexagon/imported/encode_pp.def b/target/hexagon/imported/encode_pp.def
index e3582eb..dc4eba4 100644
--- a/target/hexagon/imported/encode_pp.def
+++ b/target/hexagon/imported/encode_pp.def
@@ -348,6 +348,9 @@ STD_LD_ENC(bzw2,"0 011")
 STD_LD_ENC(bsw4,"0 111")
 STD_LD_ENC(bsw2,"0 001")
 
+STD_LDX_ENC(alignh,"0 010")
+STD_LDX_ENC(alignb,"0 100")
+
 STD_LD_ENC(rb,  "1 000")
 STD_LD_ENC(rub, "1 001")
 STD_LD_ENC(rh,  "1 010")
diff --git a/target/hexagon/imported/ldst.idef b/target/hexagon/imported/ldst.idef
index 95c0470..359d3b7 100644
--- a/target/hexagon/imported/ldst.idef
+++
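
The fGEN_TCG_loadalignh and fGEN_TCG_loadalignb overrides in gen_tcg.h
above shift the destination pair right and deposit the loaded value in
the top bits. A minimal plain-C sketch of that data movement, assuming
the loaded value is already in hand (the function names are illustrative
and not part of the patch):

```c
#include <stdint.h>

/* Model of loadalignh: shift Ryy right by 16, put the halfword in bits 63:48. */
static uint64_t loadalignh_model(uint64_t ryy, uint16_t loaded)
{
    return (ryy >> 16) | ((uint64_t)loaded << 48);
}

/* Model of loadalignb: shift Ryy right by 8, put the byte in bits 63:56. */
static uint64_t loadalignb_model(uint64_t ryy, uint8_t loaded)
{
    return (ryy >> 8) | ((uint64_t)loaded << 56);
}
```

The shift already clears the top bits, so the OR has the same effect as
the tcg_gen_deposit_i64 call used in the macros.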

[PATCH v4 15/26] Hexagon (target/hexagon) move QEMU_GENERATE to only be on during macros.h

2021-04-08 Thread Taylor Simpson

Suggested-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/genptr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 6b74344..b87e264 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -15,7 +15,6 @@
  *  along with this program; if not, see <http://www.gnu.org/licenses/>.
  */
 
-#define QEMU_GENERATE
 #include "qemu/osdep.h"
 #include "qemu/log.h"
 #include "cpu.h"
@@ -24,7 +23,9 @@
 #include "insn.h"
 #include "opcodes.h"
 #include "translate.h"
+#define QEMU_GENERATE   /* Used internally by macros.h */
 #include "macros.h"
+#undef QEMU_GENERATE
 #include "gen_tcg.h"
 
 static inline TCGv gen_read_preg(TCGv pred, uint8_t num)
-- 
2.7.4

[PATCH v4 14/26] Hexagon (target/hexagon) cleanup reg_field_info definition

2021-04-08 Thread Taylor Simpson

Include size in declaration
Remove {0, 0} entry
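
As an illustration of the pattern (not the actual QEMU tables), the
sketch below builds an enum and a sized array from one list of entries,
so the enum's trailing count gives the array its size and no {0, 0}
terminator is needed. All DEMO_* names and field values are made up for
the example:

```c
#include <stddef.h>

/* Hypothetical field list; stands in for reg_fields_def.h.inc */
#define DEMO_REG_FIELDS(ENTRY) \
    ENTRY(DEMO_FIELD_A, 0, 1)  \
    ENTRY(DEMO_FIELD_B, 1, 3)  \
    ENTRY(DEMO_FIELD_C, 8, 8)

typedef struct {
    int offset;
    int width;
} DemoRegField;

/* One enumerator per field, followed by the count */
enum {
#define DEMO_ENUM(TAG, START, WIDTH) TAG,
    DEMO_REG_FIELDS(DEMO_ENUM)
#undef DEMO_ENUM
    NUM_DEMO_REG_FIELDS
};

/* The array size comes from the enum, so no sentinel entry is required */
static const DemoRegField demo_reg_field_info[NUM_DEMO_REG_FIELDS] = {
#define DEMO_INIT(TAG, START, WIDTH) { START, WIDTH },
    DEMO_REG_FIELDS(DEMO_INIT)
#undef DEMO_INIT
};
```

With the size in the declaration, the compiler rejects an initializer
list that has more entries than the enum.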

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/reg_fields.c | 3 +--
 target/hexagon/reg_fields.h | 4 ++--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/target/hexagon/reg_fields.c b/target/hexagon/reg_fields.c
index bdcab79..6713203 100644
--- a/target/hexagon/reg_fields.c
+++ b/target/hexagon/reg_fields.c
@@ -18,10 +18,9 @@
 #include "qemu/osdep.h"
 #include "reg_fields.h"
 
-const RegField reg_field_info[] = {
+const RegField reg_field_info[NUM_REG_FIELDS] = {
 #define DEF_REG_FIELD(TAG, START, WIDTH)\
   { START, WIDTH },
 #include "reg_fields_def.h.inc"
-  { 0, 0 }
 #undef DEF_REG_FIELD
 };
diff --git a/target/hexagon/reg_fields.h b/target/hexagon/reg_fields.h
index d3c86c9..9e2ad5d 100644
--- a/target/hexagon/reg_fields.h
+++ b/target/hexagon/reg_fields.h
@@ -23,8 +23,6 @@ typedef struct {
 int width;
 } RegField;
 
-extern const RegField reg_field_info[];
-
 enum {
 #define DEF_REG_FIELD(TAG, START, WIDTH) \
 TAG,
@@ -33,4 +31,6 @@ enum {
 #undef DEF_REG_FIELD
 };
 
+extern const RegField reg_field_info[NUM_REG_FIELDS];
+
 #endif
-- 
2.7.4

[PATCH v4 20/26] Hexagon (target/hexagon) add A6_vminub_RdP

2021-04-08 Thread Taylor Simpson

Rdd32,Pe4 = vminub(Rtt32, Rss32)
Vector min of bytes

Test cases in tests/tcg/hexagon/multi_result.c
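
For reference, a plain-C sketch of the semantics in the alu.idef entry
for this instruction: byte i of the result is the unsigned minimum, and
predicate bit i is set when the Rtt byte is greater than the Rss byte.
The function name is illustrative and not part of the patch:

```c
#include <stdint.h>

/* Model of Rdd,Pe = vminub(Rtt, Rss) on eight unsigned bytes */
static uint64_t vminub_model(uint64_t rtt, uint64_t rss, uint8_t *pred)
{
    uint64_t result = 0;
    uint8_t p = 0;

    for (int i = 0; i < 8; i++) {
        uint8_t left = (uint8_t)(rtt >> (i * 8));
        uint8_t right = (uint8_t)(rss >> (i * 8));

        /* Predicate bit i records that the Rtt byte was the larger one */
        if (left > right) {
            p |= (uint8_t)(1u << i);
        }
        /* Byte i of the result is the unsigned minimum of the two bytes */
        result |= (uint64_t)(left < right ? left : right) << (i * 8);
    }
    *pred = p;
    return result;
}
```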

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h  | 27 +++
 target/hexagon/genptr.c   | 22 ++
 target/hexagon/imported/alu.idef  | 10 ++
 target/hexagon/imported/encode_pp.def |  1 +
 tests/tcg/hexagon/multi_result.c  | 34 ++
 5 files changed, 94 insertions(+)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 93310c5..aea0c55 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -237,6 +237,33 @@
 tcg_temp_free_i64(tmp); \
 } while (0)
 
+/*
+ * Compare each of the 8 unsigned bytes
+ * The minimum is placed in each byte of the destination.
+ * Each bit of the predicate is set true if the byte from the first operand
+ * is greater than the byte from the second operand.
+ * r5:4,p1 = vminub(r1:0, r3:2)
+ */
+#define fGEN_TCG_A6_vminub_RdP(SHORTCODE) \
+do { \
+TCGv left = tcg_temp_new(); \
+TCGv right = tcg_temp_new(); \
+TCGv tmp = tcg_temp_new(); \
+tcg_gen_movi_tl(PeV, 0); \
+tcg_gen_movi_i64(RddV, 0); \
+for (int i = 0; i < 8; i++) { \
+gen_get_byte_i64(left, i, RttV, false); \
+gen_get_byte_i64(right, i, RssV, false); \
+tcg_gen_setcond_tl(TCG_COND_GT, tmp, left, right); \
+tcg_gen_deposit_tl(PeV, PeV, tmp, i, 1); \
+tcg_gen_umin_tl(tmp, left, right); \
+gen_set_byte_i64(i, RddV, tmp); \
+} \
+tcg_temp_free(left); \
+tcg_temp_free(right); \
+tcg_temp_free(tmp); \
+} while (0)
+
 /* Floating point */
 #define fGEN_TCG_F2_conv_sf2df(SHORTCODE) \
 gen_helper_conv_sf2df(RddV, cpu_env, RsV)
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 24d5758..9dbebc6 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -266,6 +266,28 @@ static inline void gen_write_ctrl_reg_pair(DisasContext *ctx, int reg_num,
 }
 }
 
+static TCGv gen_get_byte_i64(TCGv result, int N, TCGv_i64 src, bool sign)
+{
+TCGv_i64 res64 = tcg_temp_new_i64();
+if (sign) {
+tcg_gen_sextract_i64(res64, src, N * 8, 8);
+} else {
+tcg_gen_extract_i64(res64, src, N * 8, 8);
+}
+tcg_gen_extrl_i64_i32(result, res64);
+tcg_temp_free_i64(res64);
+
+return result;
+}
+
+static void gen_set_byte_i64(int N, TCGv_i64 result, TCGv src)
+{
+TCGv_i64 src64 = tcg_temp_new_i64();
+tcg_gen_extu_i32_i64(src64, src);
+tcg_gen_deposit_i64(result, result, src64, N * 8, 8);
+tcg_temp_free_i64(src64);
+}
+
 static inline void gen_load_locked4u(TCGv dest, TCGv vaddr, int mem_index)
 {
 tcg_gen_qemu_ld32u(dest, vaddr, mem_index);
diff --git a/target/hexagon/imported/alu.idef b/target/hexagon/imported/alu.idef
index e8cc52c..f0c9bb4 100644
--- a/target/hexagon/imported/alu.idef
+++ b/target/hexagon/imported/alu.idef
@@ -1259,6 +1259,16 @@ Q6INSN(A5_ACS,"Rxx32,Pe4=vacsh(Rss32,Rtt32)",ATTRIBS(),
 }
 })
 
+Q6INSN(A6_vminub_RdP,"Rdd32,Pe4=vminub(Rtt32,Rss32)",ATTRIBS(),
+"Vector minimum of bytes, records minimum and decision vector",
+{
+fHIDE(int i;)
+for (i = 0; i < 8; i++) {
+fSETBIT(i, PeV, (fGETUBYTE(i,RttV) > fGETUBYTE(i,RssV)));
+fSETBYTE(i,RddV,fMIN(fGETUBYTE(i,RttV),fGETUBYTE(i,RssV)));
+}
+})
+
 /**/
 /* Vector Min/Max */
 /**/
diff --git a/target/hexagon/imported/encode_pp.def b/target/hexagon/imported/encode_pp.def
index 87e0426..4619398 100644
--- a/target/hexagon/imported/encode_pp.def
+++ b/target/hexagon/imported/encode_pp.def
@@ -1018,6 +1018,7 @@ MPY_ENC(M7_dcmpyiwc_acc, "1010","x","1","0","1","0","10")
 
 
 MPY_ENC(A5_ACS,  "1010","x","0","1","0","1","ee")
+MPY_ENC(A6_vminub_RdP,   "1010","d","0","1","1","1","ee")
 /*
 */
 
diff --git a/tests/tcg/hexagon/multi_result.c b/tests/tcg/hexagon/multi_result.c
index c21148f..95d99a0 100644
--- a/tests/tcg/hexagon/multi_result.c
+++ b/tests/tcg/hexagon/multi_result.c
@@ -70,6 +70,21 @@ static long long vacsh(long long Rxx, long long Rss, long long Rtt,
   return result;
 }
 
+static long long vminub(long long Rtt, long long Rss,
+int *pred_result)
+{
+  long long result;
+  int predval;
+
+  asm volatile("%0,p0 = vminub(%2, %3)\n\t"
+   "%1 = p0\n\t"
+   : "=r"(result), "=r"(predval)
+   : "r"(Rtt), "r"(Rss)
+   : "p0");
+  *pred_result = predval;
+  return result;
+}
+
 int err;
 
 static void check_ll(long long val, long long expect)
@@ -155,11 +170,30 @@ static void test_vacsh()
 check(ovf_result, 0);
 }
 
+static void test_vminub()
+{
+long long res64;

[PATCH v4 18/26] Hexagon (target/hexagon) add F2_sfinvsqrta

2021-04-08 Thread Taylor Simpson

Rd32,Pe4 = sfinvsqrta(Rs32)
Reciprocal square root approx

The helper packs the 2 32-bit results into a 64-bit value,
and the fGEN_TCG override unpacks them into the proper results.

Test cases in tests/tcg/hexagon/multi_result.c
FP exception tests added to tests/tcg/hexagon/fpstuff.c
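
A minimal sketch of the packing convention, inferred from the
fGEN_TCG_F2_sfinvsqrta override in this patch (it takes Rd from the high
32 bits and Pe from the low 32 bits). The function names are
illustrative, not the helper's actual code:

```c
#include <stdint.h>

/* Pack: Rd in the upper 32 bits, Pe in the lower 32 bits */
static uint64_t pack_rd_pe(uint32_t rd, uint32_t pe)
{
    return ((uint64_t)rd << 32) | pe;
}

/* Unpack the high word, as tcg_gen_extrh_i64_i32 does for RdV */
static uint32_t unpack_rd(uint64_t packed)
{
    return (uint32_t)(packed >> 32);
}

/* Unpack the low word, as tcg_gen_extrl_i64_i32 does for PeV */
static uint32_t unpack_pe(uint64_t packed)
{
    return (uint32_t)packed;
}
```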

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/arch.c | 21 -
 target/hexagon/arch.h |  2 ++
 target/hexagon/gen_tcg.h  | 16 
 target/hexagon/helper.h   |  1 +
 target/hexagon/imported/encode_pp.def |  1 +
 target/hexagon/imported/float.idef| 16 
 target/hexagon/op_helper.c| 21 +
 tests/tcg/hexagon/fpstuff.c   | 15 +++
 tests/tcg/hexagon/multi_result.c  | 29 +
 9 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
index 46edf45..dee852e 100644
--- a/target/hexagon/arch.c
+++ b/target/hexagon/arch.c
@@ -247,7 +247,7 @@ int arch_sf_invsqrt_common(float32 *Rs, float32 *Rd, int *adjust,
 int r_exp;
 int ret = 0;
 RsV = *Rs;
-if (float32_is_infinity(RsV)) {
+if (float32_is_any_nan(RsV)) {
 if (extract32(RsV, 22, 1) == 0) {
 float_raise(float_flag_invalid, fp_status);
 }
@@ -299,3 +299,22 @@ const uint8_t recip_lookup_table[128] = {
 0x011, 0x00f, 0x00e, 0x00d, 0x00c, 0x00b, 0x00a, 0x009,
 0x008, 0x007, 0x006, 0x005, 0x004, 0x003, 0x002, 0x000,
 };
+
+const uint8_t invsqrt_lookup_table[128] = {
+0x069, 0x066, 0x063, 0x061, 0x05e, 0x05b, 0x059, 0x057,
+0x054, 0x052, 0x050, 0x04d, 0x04b, 0x049, 0x047, 0x045,
+0x043, 0x041, 0x03f, 0x03d, 0x03b, 0x039, 0x037, 0x036,
+0x034, 0x032, 0x030, 0x02f, 0x02d, 0x02c, 0x02a, 0x028,
+0x027, 0x025, 0x024, 0x022, 0x021, 0x01f, 0x01e, 0x01d,
+0x01b, 0x01a, 0x019, 0x017, 0x016, 0x015, 0x014, 0x012,
+0x011, 0x010, 0x00f, 0x00d, 0x00c, 0x00b, 0x00a, 0x009,
+0x008, 0x007, 0x006, 0x005, 0x004, 0x003, 0x002, 0x001,
+0x0fe, 0x0fa, 0x0f6, 0x0f3, 0x0ef, 0x0eb, 0x0e8, 0x0e4,
+0x0e1, 0x0de, 0x0db, 0x0d7, 0x0d4, 0x0d1, 0x0ce, 0x0cb,
+0x0c9, 0x0c6, 0x0c3, 0x0c0, 0x0be, 0x0bb, 0x0b8, 0x0b6,
+0x0b3, 0x0b1, 0x0af, 0x0ac, 0x0aa, 0x0a8, 0x0a5, 0x0a3,
+0x0a1, 0x09f, 0x09d, 0x09b, 0x099, 0x097, 0x095, 0x093,
+0x091, 0x08f, 0x08d, 0x08b, 0x089, 0x087, 0x086, 0x084,
+0x082, 0x080, 0x07f, 0x07d, 0x07b, 0x07a, 0x078, 0x077,
+0x075, 0x074, 0x072, 0x071, 0x06f, 0x06e, 0x06c, 0x06b,
+};
diff --git a/target/hexagon/arch.h b/target/hexagon/arch.h
index b6634e9..3e0c334 100644
--- a/target/hexagon/arch.h
+++ b/target/hexagon/arch.h
@@ -32,4 +32,6 @@ int arch_sf_invsqrt_common(float32 *Rs, float32 *Rd, int *adjust,
 
 extern const uint8_t recip_lookup_table[128];
 
+extern const uint8_t invsqrt_lookup_table[128];
+
 #endif
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 428a670..d78e7b8 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -216,6 +216,22 @@
 tcg_temp_free_i64(tmp); \
 } while (0)
 
+/*
+ * Approximation of the reciprocal square root
+ * r1,p0 = sfinvsqrta(r0)
+ *
+ * The helper packs the 2 32-bit results into a 64-bit value,
+ * so unpack them into the proper results.
+ */
+#define fGEN_TCG_F2_sfinvsqrta(SHORTCODE) \
+do { \
+TCGv_i64 tmp = tcg_temp_new_i64(); \
+gen_helper_sfinvsqrta(tmp, cpu_env, RsV); \
+tcg_gen_extrh_i64_i32(RdV, tmp); \
+tcg_gen_extrl_i64_i32(PeV, tmp); \
+tcg_temp_free_i64(tmp); \
+} while (0)
+
 /* Floating point */
 #define fGEN_TCG_F2_conv_sf2df(SHORTCODE) \
 gen_helper_conv_sf2df(RddV, cpu_env, RsV)
diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index b377293..cb7508f 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -25,6 +25,7 @@ DEF_HELPER_FLAGS_3(debug_commit_end, TCG_CALL_NO_WG, void, env, int, int)
 DEF_HELPER_2(commit_store, void, env, int)
 DEF_HELPER_FLAGS_4(fcircadd, TCG_CALL_NO_RWG_SE, s32, s32, s32, s32, s32)
 DEF_HELPER_3(sfrecipa, i64, env, f32, f32)
+DEF_HELPER_2(sfinvsqrta, i64, env, f32)
 
 /* Floating point */
 DEF_HELPER_2(conv_sf2df, f64, env, f32)
diff --git a/target/hexagon/imported/encode_pp.def b/target/hexagon/imported/encode_pp.def
index b01b4d7..18fe45d 100644
--- a/target/hexagon/imported/encode_pp.def
+++ b/target/hexagon/imported/encode_pp.def
@@ -1642,6 +1642,7 @@ SH2_RR_ENC(F2_conv_sf2w,  "1011","100","-","000","d")
 SH2_RR_ENC(F2_conv_sf2uw_chop,"1011","011","-","001","d")
 SH2_RR_ENC(F2_conv_sf2w_chop, "1011","100","-","001","d")
 SH2_RR_ENC(F2_sffixupr,   "1011","101","-","000","d")
+SH2_RR_ENC(F2_sfinvsqrta, "1011","111","-","0ee","d")
 
 
 DEF_FIELDROW_DESC32(ICLASS_S2op"  1100  PP-- ","[#12] Rd=(Rs,#u6)")
diff --git

[PATCH v4 16/26] Hexagon (target/hexagon) compile all debug code

2021-04-08 Thread Taylor Simpson

Change #if HEX_DEBUG to if (HEX_DEBUG) so the debug code doesn't bit rot
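
A small sketch of why this helps, using a made-up DEMO_DEBUG flag rather
than the real HEX_DEBUG definition: code under "#if 0" is never seen by
the compiler, while code under "if (0)" is still parsed and type-checked
and then dropped as dead code.

```c
#ifndef DEMO_DEBUG
#define DEMO_DEBUG 0
#endif

/* Hypothetical debug counter, only touched on the debug path */
static int trace_count;

static int store_value(int *slot, int val)
{
    *slot = val;
    if (DEMO_DEBUG) {
        /* Compiled in every build, so a rename of trace_count breaks here */
        trace_count++;
    }
    return *slot;
}
```

If trace_count were renamed or removed, this version would fail to build
even with DEMO_DEBUG set to 0, which is the bit rot protection the
commit message refers to.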

Suggested-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/genptr.c| 72 ++--
 target/hexagon/helper.h|  2 --
 target/hexagon/internal.h  | 11 +++
 target/hexagon/op_helper.c | 14 +++--
 target/hexagon/translate.c | 74 ++
 target/hexagon/translate.h |  2 --
 6 files changed, 81 insertions(+), 94 deletions(-)

diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index b87e264..24d5758 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -42,17 +42,17 @@ static inline void gen_log_predicated_reg_write(int rnum, TCGv val, int slot)
 tcg_gen_andi_tl(slot_mask, hex_slot_cancelled, 1 << slot);
 tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum], slot_mask, zero,
val, hex_new_value[rnum]);
-#if HEX_DEBUG
-/*
- * Do this so HELPER(debug_commit_end) will know
- *
- * Note that slot_mask indicates the value is not written
- * (i.e., slot was cancelled), so we create a true/false value before
- * or'ing with hex_reg_written[rnum].
- */
-tcg_gen_setcond_tl(TCG_COND_EQ, slot_mask, slot_mask, zero);
-tcg_gen_or_tl(hex_reg_written[rnum], hex_reg_written[rnum], slot_mask);
-#endif
+if (HEX_DEBUG) {
+/*
+ * Do this so HELPER(debug_commit_end) will know
+ *
+ * Note that slot_mask indicates the value is not written
+ * (i.e., slot was cancelled), so we create a true/false value before
+ * or'ing with hex_reg_written[rnum].
+ */
+tcg_gen_setcond_tl(TCG_COND_EQ, slot_mask, slot_mask, zero);
+tcg_gen_or_tl(hex_reg_written[rnum], hex_reg_written[rnum], slot_mask);
+}
 
 tcg_temp_free(zero);
 tcg_temp_free(slot_mask);
@@ -61,10 +61,10 @@ static inline void gen_log_predicated_reg_write(int rnum, TCGv val, int slot)
 static inline void gen_log_reg_write(int rnum, TCGv val)
 {
 tcg_gen_mov_tl(hex_new_value[rnum], val);
-#if HEX_DEBUG
-/* Do this so HELPER(debug_commit_end) will know */
-tcg_gen_movi_tl(hex_reg_written[rnum], 1);
-#endif
+if (HEX_DEBUG) {
+/* Do this so HELPER(debug_commit_end) will know */
+tcg_gen_movi_tl(hex_reg_written[rnum], 1);
+}
 }
 
 static void gen_log_predicated_reg_write_pair(int rnum, TCGv_i64 val, int slot)
@@ -84,19 +84,19 @@ static void gen_log_predicated_reg_write_pair(int rnum, TCGv_i64 val, int slot)
 tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum + 1],
slot_mask, zero,
val32, hex_new_value[rnum + 1]);
-#if HEX_DEBUG
-/*
- * Do this so HELPER(debug_commit_end) will know
- *
- * Note that slot_mask indicates the value is not written
- * (i.e., slot was cancelled), so we create a true/false value before
- * or'ing with hex_reg_written[rnum].
- */
-tcg_gen_setcond_tl(TCG_COND_EQ, slot_mask, slot_mask, zero);
-tcg_gen_or_tl(hex_reg_written[rnum], hex_reg_written[rnum], slot_mask);
-tcg_gen_or_tl(hex_reg_written[rnum + 1], hex_reg_written[rnum + 1],
-  slot_mask);
-#endif
+if (HEX_DEBUG) {
+/*
+ * Do this so HELPER(debug_commit_end) will know
+ *
+ * Note that slot_mask indicates the value is not written
+ * (i.e., slot was cancelled), so we create a true/false value before
+ * or'ing with hex_reg_written[rnum].
+ */
+tcg_gen_setcond_tl(TCG_COND_EQ, slot_mask, slot_mask, zero);
+tcg_gen_or_tl(hex_reg_written[rnum], hex_reg_written[rnum], slot_mask);
+tcg_gen_or_tl(hex_reg_written[rnum + 1], hex_reg_written[rnum + 1],
+  slot_mask);
+}
 
 tcg_temp_free(val32);
 tcg_temp_free(zero);
@@ -107,17 +107,17 @@ static void gen_log_reg_write_pair(int rnum, TCGv_i64 val)
 {
 /* Low word */
 tcg_gen_extrl_i64_i32(hex_new_value[rnum], val);
-#if HEX_DEBUG
-/* Do this so HELPER(debug_commit_end) will know */
-tcg_gen_movi_tl(hex_reg_written[rnum], 1);
-#endif
+if (HEX_DEBUG) {
+/* Do this so HELPER(debug_commit_end) will know */
+tcg_gen_movi_tl(hex_reg_written[rnum], 1);
+}
 
 /* High word */
 tcg_gen_extrh_i64_i32(hex_new_value[rnum + 1], val);
-#if HEX_DEBUG
-/* Do this so HELPER(debug_commit_end) will know */
-tcg_gen_movi_tl(hex_reg_written[rnum + 1], 1);
-#endif
+if (HEX_DEBUG) {
+/* Do this so HELPER(debug_commit_end) will know */
+tcg_gen_movi_tl(hex_reg_written[rnum + 1], 1);
+}
 }
 
 static inline void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val)
diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index 715c246..efe6069 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -19,11 +19,9 @@
 #include

[PATCH v4 26/26] Hexagon (target/hexagon) CABAC decode bin

2021-04-08 Thread Taylor Simpson

The following instruction is added
S2_cabacdecbin    Rdd32=decbin(Rss32,Rtt32)

Test cases added to tests/tcg/hexagon/misc.c

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/arch.c | 91 +++
 target/hexagon/arch.h |  4 ++
 target/hexagon/imported/encode_pp.def |  1 +
 target/hexagon/imported/macros.def| 15 ++
 target/hexagon/imported/shift.idef| 47 ++
 target/hexagon/macros.h   |  7 +++
 tests/tcg/hexagon/misc.c  | 28 +++
 7 files changed, 193 insertions(+)

diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
index dee852e..68a55b3 100644
--- a/target/hexagon/arch.c
+++ b/target/hexagon/arch.c
@@ -27,6 +27,97 @@
 #define SF_MANTBITS 23
 #define float32_nan make_float32(0x)
 
+/*
+ * These three tables are used by the cabacdecbin instruction
+ */
+const uint8_t rLPS_table_64x4[64][4] = {
+{128, 176, 208, 240},
+{128, 167, 197, 227},
+{128, 158, 187, 216},
+{123, 150, 178, 205},
+{116, 142, 169, 195},
+{111, 135, 160, 185},
+{105, 128, 152, 175},
+{100, 122, 144, 166},
+{95, 116, 137, 158},
+{90, 110, 130, 150},
+{85, 104, 123, 142},
+{81, 99, 117, 135},
+{77, 94, 111, 128},
+{73, 89, 105, 122},
+{69, 85, 100, 116},
+{66, 80, 95, 110},
+{62, 76, 90, 104},
+{59, 72, 86, 99},
+{56, 69, 81, 94},
+{53, 65, 77, 89},
+{51, 62, 73, 85},
+{48, 59, 69, 80},
+{46, 56, 66, 76},
+{43, 53, 63, 72},
+{41, 50, 59, 69},
+{39, 48, 56, 65},
+{37, 45, 54, 62},
+{35, 43, 51, 59},
+{33, 41, 48, 56},
+{32, 39, 46, 53},
+{30, 37, 43, 50},
+{29, 35, 41, 48},
+{27, 33, 39, 45},
+{26, 31, 37, 43},
+{24, 30, 35, 41},
+{23, 28, 33, 39},
+{22, 27, 32, 37},
+{21, 26, 30, 35},
+{20, 24, 29, 33},
+{19, 23, 27, 31},
+{18, 22, 26, 30},
+{17, 21, 25, 28},
+{16, 20, 23, 27},
+{15, 19, 22, 25},
+{14, 18, 21, 24},
+{14, 17, 20, 23},
+{13, 16, 19, 22},
+{12, 15, 18, 21},
+{12, 14, 17, 20},
+{11, 14, 16, 19},
+{11, 13, 15, 18},
+{10, 12, 15, 17},
+{10, 12, 14, 16},
+{9, 11, 13, 15},
+{9, 11, 12, 14},
+{8, 10, 12, 14},
+{8, 9, 11, 13},
+{7, 9, 11, 12},
+{7, 9, 10, 12},
+{7, 8, 10, 11},
+{6, 8, 9, 11},
+{6, 7, 9, 10},
+{6, 7, 8, 9},
+{2, 2, 2, 2}
+};
+
+const uint8_t AC_next_state_MPS_64[64] = {
+1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
+11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
+21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
+31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
+41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
+51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
+61, 62, 62, 63
+};
+
+
+const uint8_t AC_next_state_LPS_64[64] = {
+0, 0, 1, 2, 2, 4, 4, 5, 6, 7,
+8, 9, 9, 11, 11, 12, 13, 13, 15, 15,
+16, 16, 18, 18, 19, 19, 21, 21, 22, 22,
+23, 24, 24, 25, 26, 26, 27, 27, 28, 29,
+29, 30, 30, 30, 31, 32, 32, 33, 33, 33,
+34, 34, 35, 35, 35, 36, 36, 36, 37, 37,
+37, 38, 38, 63
+};
+
 #define BITS_MASK_8 0xULL
 #define PAIR_MASK_8 0xULL
 #define NYBL_MASK_8 0x0f0f0f0f0f0f0f0fULL
diff --git a/target/hexagon/arch.h b/target/hexagon/arch.h
index 3e0c334..7091806 100644
--- a/target/hexagon/arch.h
+++ b/target/hexagon/arch.h
@@ -20,6 +20,10 @@
 
 #include "qemu/int128.h"
 
+extern const uint8_t rLPS_table_64x4[64][4];
+extern const uint8_t AC_next_state_MPS_64[64];
+extern const uint8_t AC_next_state_LPS_64[64];
+
 uint64_t interleave(uint32_t odd, uint32_t even);
 uint64_t deinterleave(uint64_t src);
 int32_t conv_round(int32_t a, int n);
diff --git a/target/hexagon/imported/encode_pp.def b/target/hexagon/imported/encode_pp.def
index dc4eba4..35ae3d2 100644
--- a/target/hexagon/imported/encode_pp.def
+++ b/target/hexagon/imported/encode_pp.def
@@ -1767,6 +1767,7 @@ SH_RRR_ENC(S4_vxsubaddh,    "0001","01-","-","110","d")
 SH_RRR_ENC(S4_vxaddsubhr,   "0001","11-","-","00-","d")
 SH_RRR_ENC(S4_vxsubaddhr,   "0001","11-","-","01-","d")
 SH_RRR_ENC(S4_extractp_rp,  "0001","11-","-","10-","d")
+SH_RRR_ENC(S2_cabacdecbin,  "0001","11-","-","11-","d") /* implicit P0 write */
 
 
 DEF_FIELDROW_DESC32(ICLASS_S3op" 0010  PP-- ","[#2] Rdd=(Rss,Rtt,Pu)")
diff --git a/target/hexagon/imported/macros.def b/target/hexagon/imported/macros.def
index 56c99b1..32ed3bf 100755
--- a/target/hexagon/imported/macros.def
+++ b/target/hexagon/imported/macros.def
@@ -92,6 +92,21 @@ DEF_MACRO(
 /* attribs */
 )
 
+
+DEF_MACRO(
+fINSERT_RANGE,
+{
+int offset=LOWBIT;
+int width=HIBIT-LOWBIT+1;
+/* clear bits where new bits go */
+INREG &= ~(((fCONSTLL(1)<>29)&3];
+rLPS  = rLPS << 23;   /* left aligned */
+
+/* calculate rMPS */
+rMPS= (range&0xff80) -

[PATCH v4 13/26] Hexagon (target/hexagon) cleanup ternary operators in semantics

2021-04-08 Thread Taylor Simpson

Change (cond ? (res = x) : (res = y)) to res = (cond ? x : y)

This makes the semantics easier for idef-parser to deal with

The following instructions are impacted
C2_any8
C2_all8
C2_mux
C2_muxii
C2_muxir
C2_muxri
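
The two forms compute the same result. A plain-C sketch of three of the
rewritten semantics, with illustrative function names and with fLSBOLD
modelled as "low bit of the predicate" (an assumption based on how the
macro is used here):

```c
#include <stdint.h>

/* Old style: the assignment sits inside each arm of the ternary */
static uint8_t any8_old(uint8_t ps)
{
    uint8_t pd;
    (void)(ps ? (pd = 0xff) : (pd = 0x00));
    return pd;
}

/* New style: one assignment, the ternary only selects the value */
static uint8_t any8_new(uint8_t ps)
{
    uint8_t pd = (ps ? 0xff : 0x00);
    return pd;
}

static uint8_t all8_old(uint8_t ps)
{
    uint8_t pd;
    (void)((ps == 0xff) ? (pd = 0xff) : (pd = 0x00));
    return pd;
}

static uint8_t all8_new(uint8_t ps)
{
    uint8_t pd = (ps == 0xff ? 0xff : 0x00);
    return pd;
}

static int32_t mux_old(uint8_t pu, int32_t rs, int32_t rt)
{
    int32_t rd;
    (void)((pu & 1) ? (rd = rs) : (rd = rt));
    return rd;
}

static int32_t mux_new(uint8_t pu, int32_t rs, int32_t rt)
{
    int32_t rd = ((pu & 1) ? rs : rt);
    return rd;
}
```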

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/imported/compare.idef | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/hexagon/imported/compare.idef b/target/hexagon/imported/compare.idef
index 3551467..abd016f 100644
--- a/target/hexagon/imported/compare.idef
+++ b/target/hexagon/imported/compare.idef
@@ -198,11 +198,11 @@ Q6INSN(C4_or_orn,"Pd4=or(Ps4,or(Pt4,!Pu4))",ATTRIBS(A_CRSLOT23),
 
 Q6INSN(C2_any8,"Pd4=any8(Ps4)",ATTRIBS(A_CRSLOT23),
 "Logical ANY of low 8 predicate bits",
-{ PsV ? (PdV=0xff) : (PdV=0x00); })
+{ PdV = (PsV ? 0xff : 0x00); })
 
 Q6INSN(C2_all8,"Pd4=all8(Ps4)",ATTRIBS(A_CRSLOT23),
 "Logical ALL of low 8 predicate bits",
-{ (PsV==0xff) ? (PdV=0xff) : (PdV=0x00); })
+{ PdV = (PsV == 0xff ? 0xff : 0x00); })
 
 Q6INSN(C2_vitpack,"Rd32=vitpack(Ps4,Pt4)",ATTRIBS(),
 "Pack the odd and even bits of two predicate registers",
@@ -212,7 +212,7 @@ Q6INSN(C2_vitpack,"Rd32=vitpack(Ps4,Pt4)",ATTRIBS(),
 
 Q6INSN(C2_mux,"Rd32=mux(Pu4,Rs32,Rt32)",ATTRIBS(),
 "Scalar MUX",
-{ (fLSBOLD(PuV)) ? (RdV=RsV):(RdV=RtV); })
+{ RdV = (fLSBOLD(PuV) ? RsV : RtV); })
 
 
 Q6INSN(C2_cmovenewit,"if (Pu4.new) Rd32=#s12",ATTRIBS(A_ARCHV2),
@@ -269,18 +269,18 @@ Q6INSN(C2_ccombinewf,"if (!Pu4) Rdd32=combine(Rs32,Rt32)",ATTRIBS(A_ARCHV2),
 
 Q6INSN(C2_muxii,"Rd32=mux(Pu4,#s8,#S8)",ATTRIBS(A_ARCHV2),
 "Scalar MUX immediates",
-{ fIMMEXT(siV); (fLSBOLD(PuV)) ? (RdV=siV):(RdV=SiV); })
+{ fIMMEXT(siV); RdV = (fLSBOLD(PuV) ? siV : SiV); })
 
 
 
 Q6INSN(C2_muxir,"Rd32=mux(Pu4,Rs32,#s8)",ATTRIBS(A_ARCHV2),
 "Scalar MUX register immediate",
-{ fIMMEXT(siV); (fLSBOLD(PuV)) ? (RdV=RsV):(RdV=siV); })
+{ fIMMEXT(siV); RdV = (fLSBOLD(PuV) ? RsV : siV); })
 
 
 Q6INSN(C2_muxri,"Rd32=mux(Pu4,#s8,Rs32)",ATTRIBS(A_ARCHV2),
 "Scalar MUX register immediate",
-{ fIMMEXT(siV); (fLSBOLD(PuV)) ? (RdV=siV):(RdV=RsV); })
+{ fIMMEXT(siV); RdV = (fLSBOLD(PuV) ? siV : RsV); })
 
 
 
-- 
2.7.4

[PATCH v4 02/26] Hexagon (target/hexagon) cleanup gen_log_predicated_reg_write_pair

2021-04-08 Thread Taylor Simpson

Similar to previous cleanup of gen_log_predicated_reg_write

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/genptr.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 87f5d92..07d970f 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -69,36 +69,35 @@ static inline void gen_log_reg_write(int rnum, TCGv val)
 static void gen_log_predicated_reg_write_pair(int rnum, TCGv_i64 val, int slot)
 {
 TCGv val32 = tcg_temp_new();
-TCGv one = tcg_const_tl(1);
 TCGv zero = tcg_const_tl(0);
 TCGv slot_mask = tcg_temp_new();
 
 tcg_gen_andi_tl(slot_mask, hex_slot_cancelled, 1 << slot);
 /* Low word */
 tcg_gen_extrl_i64_i32(val32, val);
-tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum], slot_mask, zero,
-   val32, hex_new_value[rnum]);
-#if HEX_DEBUG
-/* Do this so HELPER(debug_commit_end) will know */
-tcg_gen_movcond_tl(TCG_COND_EQ, hex_reg_written[rnum],
+tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum],
slot_mask, zero,
-   one, hex_reg_written[rnum]);
-#endif
-
+   val32, hex_new_value[rnum]);
 /* High word */
 tcg_gen_extrh_i64_i32(val32, val);
 tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum + 1],
slot_mask, zero,
val32, hex_new_value[rnum + 1]);
 #if HEX_DEBUG
-/* Do this so HELPER(debug_commit_end) will know */
-tcg_gen_movcond_tl(TCG_COND_EQ, hex_reg_written[rnum + 1],
-   slot_mask, zero,
-   one, hex_reg_written[rnum + 1]);
+/*
+ * Do this so HELPER(debug_commit_end) will know
+ *
+ * Note that slot_mask indicates the value is not written
+ * (i.e., slot was cancelled), so we create a true/false value before
+ * or'ing with hex_reg_written[rnum].
+ */
+tcg_gen_setcond_tl(TCG_COND_EQ, slot_mask, slot_mask, zero);
+tcg_gen_or_tl(hex_reg_written[rnum], hex_reg_written[rnum], slot_mask);
+tcg_gen_or_tl(hex_reg_written[rnum + 1], hex_reg_written[rnum + 1],
+  slot_mask);
 #endif
 
 tcg_temp_free(val32);
-tcg_temp_free(one);
 tcg_temp_free(zero);
 tcg_temp_free(slot_mask);
 }
-- 
2.7.4

[PATCH v4 21/26] Hexagon (target/hexagon) add A4_addp_c/A4_subp_c

2021-04-08 Thread Taylor Simpson

Rdd32 = add(Rss32, Rtt32, Px4):carry
Add with carry
Rdd32 = sub(Rss32, Rtt32, Px4):carry
Sub with carry

Test cases in tests/tcg/hexagon/multi_result.c
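
A plain-C sketch of the semantics in the alu.idef entries, assuming
fLSBOLD takes the low bit of the predicate and f8BITSOF expands a
true/false value to 0xff/0x00 (as gen_8bitsof does in this patch). The
function names are illustrative:

```c
#include <stdint.h>

/* Model of Rdd = add(Rss, Rtt, Px):carry; px is both input and output */
static uint64_t addp_c_model(uint64_t rss, uint64_t rtt, uint8_t *px)
{
    uint64_t carry_in = *px & 1;
    uint64_t partial = rss + carry_in;
    /* Unsigned wrap-around means a carry came out of the first add */
    int carry_out = partial < rss;
    uint64_t sum = partial + rtt;

    /* At most one of the two additions can carry */
    carry_out |= sum < partial;
    *px = carry_out ? 0xff : 0x00;
    return sum;
}

/* Model of Rdd = sub(Rss, Rtt, Px):carry, i.e. Rss + ~Rtt + carry */
static uint64_t subp_c_model(uint64_t rss, uint64_t rtt, uint8_t *px)
{
    return addp_c_model(rss, ~rtt, px);
}
```

For subtraction, a carry-in of 1 means "no borrow", so sub(5, 3) with
the predicate's low bit set yields 2 and sets the carry-out.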

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h  | 37 
 target/hexagon/genptr.c   | 11 +
 target/hexagon/imported/alu.idef  | 15 +++
 target/hexagon/imported/encode_pp.def |  2 +
 tests/tcg/hexagon/multi_result.c  | 82 +++
 5 files changed, 147 insertions(+)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index aea0c55..6bc578d 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -238,6 +238,43 @@
 } while (0)
 
 /*
+ * Add or subtract with carry.
+ * Predicate register is used as an extra input and output.
+ * r5:4 = add(r1:0, r3:2, p1):carry
+ */
+#define fGEN_TCG_A4_addp_c(SHORTCODE) \
+do { \
+TCGv_i64 carry = tcg_temp_new_i64(); \
+TCGv_i64 zero = tcg_const_i64(0); \
+tcg_gen_extu_i32_i64(carry, PxV); \
+tcg_gen_andi_i64(carry, carry, 1); \
+tcg_gen_add2_i64(RddV, carry, RssV, zero, carry, zero); \
+tcg_gen_add2_i64(RddV, carry, RddV, carry, RttV, zero); \
+tcg_gen_extrl_i64_i32(PxV, carry); \
+gen_8bitsof(PxV, PxV); \
+tcg_temp_free_i64(carry); \
+tcg_temp_free_i64(zero); \
+} while (0)
+
+/* r5:4 = sub(r1:0, r3:2, p1):carry */
+#define fGEN_TCG_A4_subp_c(SHORTCODE) \
+do { \
+TCGv_i64 carry = tcg_temp_new_i64(); \
+TCGv_i64 zero = tcg_const_i64(0); \
+TCGv_i64 not_RttV = tcg_temp_new_i64(); \
+tcg_gen_extu_i32_i64(carry, PxV); \
+tcg_gen_andi_i64(carry, carry, 1); \
+tcg_gen_not_i64(not_RttV, RttV); \
+tcg_gen_add2_i64(RddV, carry, RssV, zero, carry, zero); \
+tcg_gen_add2_i64(RddV, carry, RddV, carry, not_RttV, zero); \
+tcg_gen_extrl_i64_i32(PxV, carry); \
+gen_8bitsof(PxV, PxV); \
+tcg_temp_free_i64(carry); \
+tcg_temp_free_i64(zero); \
+tcg_temp_free_i64(not_RttV); \
+} while (0)
+
+/*
  * Compare each of the 8 unsigned bytes
  * The minimum is placed in each byte of the destination.
  * Each bit of the predicate is set true if the byte from the first operand
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 9dbebc6..333f7d7 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -361,5 +361,16 @@ static inline void gen_store_conditional8(CPUHexagonState *env,
 tcg_gen_movi_tl(hex_llsc_addr, ~0);
 }
 
+static TCGv gen_8bitsof(TCGv result, TCGv value)
+{
+TCGv zero = tcg_const_tl(0);
+TCGv ones = tcg_const_tl(0xff);
+tcg_gen_movcond_tl(TCG_COND_NE, result, value, zero, ones, zero);
+tcg_temp_free(zero);
+tcg_temp_free(ones);
+
+return result;
+}
+
 #include "tcg_funcs_generated.c.inc"
 #include "tcg_func_table_generated.c.inc"
diff --git a/target/hexagon/imported/alu.idef b/target/hexagon/imported/alu.idef
index f0c9bb4..58477ae 100644
--- a/target/hexagon/imported/alu.idef
+++ b/target/hexagon/imported/alu.idef
@@ -153,6 +153,21 @@ Q6INSN(A2_subp,"Rdd32=sub(Rtt32,Rss32)",ATTRIBS(),
 "Sub",
 { RddV=RttV-RssV;})
 
+/* 64-bit with carry */
+
+Q6INSN(A4_addp_c,"Rdd32=add(Rss32,Rtt32,Px4):carry",ATTRIBS(),"Add with Carry",
+{
+  RddV = RssV + RttV + fLSBOLD(PxV);
+  PxV = f8BITSOF(fCARRY_FROM_ADD(RssV,RttV,fLSBOLD(PxV)));
+})
+
+Q6INSN(A4_subp_c,"Rdd32=sub(Rss32,Rtt32,Px4):carry",ATTRIBS(),"Sub with Carry",
+{
+  RddV = RssV + ~RttV + fLSBOLD(PxV);
+  PxV = f8BITSOF(fCARRY_FROM_ADD(RssV,~RttV,fLSBOLD(PxV)));
+})
+
+
 /* NEG and ABS */
 
 Q6INSN(A2_negsat,"Rd32=neg(Rs32):sat",ATTRIBS(),
diff --git a/target/hexagon/imported/encode_pp.def b/target/hexagon/imported/encode_pp.def
index 4619398..514c240 100644
--- a/target/hexagon/imported/encode_pp.def
+++ b/target/hexagon/imported/encode_pp.def
@@ -1749,6 +1749,8 @@ SH_RRR_ENC(S4_extractp_rp,  "0001","11-","-","10-","d")
 DEF_FIELDROW_DESC32(ICLASS_S3op" 0010  PP-- ","[#2] Rdd=(Rss,Rtt,Pu)")
 SH_RRR_ENC(S2_valignrb, "0010","0--","-","-uu","d")
 SH_RRR_ENC(S2_vsplicerb,"0010","100","-","-uu","d")
+SH_RRR_ENC(A4_addp_c,   "0010","110","-","-xx","d")
+SH_RRR_ENC(A4_subp_c,   "0010","111","-","-xx","d")
 
 
 DEF_FIELDROW_DESC32(ICLASS_S3op" 0011  PP-- ","[#3] Rdd=(Rss,Rt)")
diff --git a/tests/tcg/hexagon/multi_result.c b/tests/tcg/hexagon/multi_result.c
index 95d99a0..52997b3 100644
--- a/tests/tcg/hexagon/multi_result.c
+++ b/tests/tcg/hexagon/multi_result.c
@@ -85,6 +85,38 @@ static long long vminub(long long Rtt, long long Rss,
   return result;
 }
 
+static long long add_carry(long long Rss, long long Rtt,
+   int pred_in, int *pred_result)
+{
+  long long result;
+  int predval = pred_in;
+
+  asm volatile("p0 = %1\n\t"
+   "%0 =

[PATCH v4 22/26] Hexagon (target/hexagon) circular addressing

2021-04-08 Thread Taylor Simpson

The following instructions are added
L2_loadrub_pci  Rd32 = memub(Rx32++#s4:0:circ(Mu2))
L2_loadrb_pci   Rd32 = memb(Rx32++#s4:0:circ(Mu2))
L2_loadruh_pci  Rd32 = memuh(Rx32++#s4:1:circ(Mu2))
L2_loadrh_pci   Rd32 = memh(Rx32++#s4:1:circ(Mu2))
L2_loadri_pci   Rd32 = memw(Rx32++#s4:2:circ(Mu2))
L2_loadrd_pci   Rdd32 = memd(Rx32++#s4:3:circ(Mu2))
S2_storerb_pci  memb(Rx32++#s4:0:circ(Mu2)) = Rt32
S2_storerh_pci  memh(Rx32++#s4:1:circ(Mu2)) = Rt32
S2_storerf_pci  memh(Rx32++#s4:1:circ(Mu2)) = Rt.H32
S2_storeri_pci  memw(Rx32++#s4:2:circ(Mu2)) = Rt32
S2_storerd_pci  memd(Rx32++#s4:3:circ(Mu2)) = Rtt32
S2_storerbnew_pci   memb(Rx32++#s4:0:circ(Mu2)) = Nt8.new
S2_storerhnew_pci   memh(Rx32++#s4:1:circ(Mu2)) = Nt8.new
S2_storerinew_pci   memw(Rx32++#s4:2:circ(Mu2)) = Nt8.new
L2_loadrub_pcr  Rd32 = memub(Rx32++I:circ(Mu2))
L2_loadrb_pcr   Rd32 = memb(Rx32++I:circ(Mu2))
L2_loadruh_pcr  Rd32 = memuh(Rx32++I:circ(Mu2))
L2_loadrh_pcr   Rd32 = memh(Rx32++I:circ(Mu2))
L2_loadri_pcr   Rd32 = memw(Rx32++I:circ(Mu2))
L2_loadrd_pcr   Rdd32 = memd(Rx32++I:circ(Mu2))
S2_storerb_pcr  memb(Rx32++I:circ(Mu2)) = Rt32
S2_storerh_pcr  memh(Rx32++I:circ(Mu2)) = Rt32
S2_storerf_pcr  memh(Rx32++I:circ(Mu2)) = Rt.H32
S2_storeri_pcr  memw(Rx32++I:circ(Mu2)) = Rt32
S2_storerd_pcr  memd(Rx32++I:circ(Mu2)) = Rtt32
S2_storerbnew_pcr   memb(Rx32++I:circ(Mu2)) = Nt8.new
S2_storerhnew_pcr   memh(Rx32++I:circ(Mu2)) = Nt8.new
S2_storerinew_pcr   memw(Rx32++I:circ(Mu2)) = Nt8.new

Test cases in tests/tcg/hexagon/circ.c
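
For readers unfamiliar with circular addressing, the wrap-around step can be
sketched as below. This is a simplified model, not the fcircadd helper itself:
the function name circ_add and the explicit start/length parameters are
hypothetical, and the sketch assumes the buffer start and length have already
been extracted from the CSn and Mn registers. As the GET_EA_pci and GET_EA_pcr
macros in this patch show, the access uses the pointer value from before the
increment; only the updated pointer is wrapped.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model of a circular post-increment (assumption: start and
 * length were already decoded from the CSn and Mn registers).
 *
 * ptr:    current pointer, start <= ptr < start + length
 * offset: signed increment, with |offset| < length
 * Returns the updated pointer, wrapped back into the buffer.
 */
static uint32_t circ_add(uint32_t ptr, int32_t offset,
                         uint32_t start, uint32_t length)
{
    uint32_t end = start + length;
    uint32_t new_ptr = ptr + (uint32_t)offset;

    assert(length > 0);
    if (new_ptr >= end) {
        new_ptr -= length;      /* ran off the top: wrap to the bottom */
    } else if (new_ptr < start) {
        new_ptr += length;      /* ran off the bottom: wrap to the top */
    }
    return new_ptr;
}
```

With a 12-byte buffer at 0x1000, stepping forward by 4 from 0x1008 wraps to
0x1000, and stepping backward by 4 from 0x1000 wraps to 0x1008.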

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h  | 112 +++-
 target/hexagon/genptr.c   | 100 +++
 target/hexagon/imported/encode_pp.def |  10 +
 target/hexagon/imported/ldst.idef |   4 +
 target/hexagon/imported/macros.def|  26 ++
 target/hexagon/macros.h   |  92 +++
 target/hexagon/op_helper.c|  36 +--
 tests/tcg/hexagon/Makefile.target |   2 +
 tests/tcg/hexagon/circ.c  | 486 ++
 9 files changed, 845 insertions(+), 23 deletions(-)
 create mode 100644 tests/tcg/hexagon/circ.c

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 6bc578d..25c228c 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -38,6 +38,8 @@
  * _ap   absolute set  r0 = memw(r1=##variable)
  * _pr   post increment register   r0 = memw(r1++m1)
  * _pi   post increment immediate  r0 = memb(r1++#1)
+ * _pci  post increment circular immediate r0 = memw(r1++#4:circ(m0))
+ * _pcr  post increment circular register  r0 = memw(r1++I:circ(m0))
  */
 
 /* Macros for complex addressing modes */
@@ -56,7 +58,22 @@
 fEA_REG(RxV); \
 fPM_I(RxV, siV); \
 } while (0)
-
+#define GET_EA_pci \
+do { \
+TCGv tcgv_siV = tcg_const_tl(siV); \
+tcg_gen_mov_tl(EA, RxV); \
+gen_helper_fcircadd(RxV, RxV, tcgv_siV, MuV, \
+hex_gpr[HEX_REG_CS0 + MuN]); \
+tcg_temp_free(tcgv_siV); \
+} while (0)
+#define GET_EA_pcr(SHIFT) \
+do { \
+TCGv ireg = tcg_temp_new(); \
+tcg_gen_mov_tl(EA, RxV); \
+gen_read_ireg(ireg, MuV, (SHIFT)); \
+gen_helper_fcircadd(RxV, RxV, ireg, MuV, hex_gpr[HEX_REG_CS0 + MuN]); \
+tcg_temp_free(ireg); \
+} while (0)
 
 /* Instructions with multiple definitions */
 #define fGEN_TCG_LOAD_AP(RES, SIZE, SIGN) \
@@ -80,6 +97,36 @@
 #define fGEN_TCG_L4_loadrd_ap(SHORTCODE) \
 fGEN_TCG_LOAD_AP(RddV, 8, u)
 
+#define fGEN_TCG_L2_loadrub_pci(SHORTCODE)SHORTCODE
+#define fGEN_TCG_L2_loadrb_pci(SHORTCODE) SHORTCODE
+#define fGEN_TCG_L2_loadruh_pci(SHORTCODE)SHORTCODE
+#define fGEN_TCG_L2_loadrh_pci(SHORTCODE) SHORTCODE
+#define fGEN_TCG_L2_loadri_pci(SHORTCODE) SHORTCODE
+#define fGEN_TCG_L2_loadrd_pci(SHORTCODE) SHORTCODE
+
+#define fGEN_TCG_LOAD_pcr(SHIFT, LOAD) \
+do { \
+TCGv ireg = tcg_temp_new(); \
+tcg_gen_mov_tl(EA, RxV); \
+gen_read_ireg(ireg, MuV, SHIFT); \
+gen_helper_fcircadd(RxV, RxV, ireg, MuV, hex_gpr[HEX_REG_CS0 + MuN]); \
+LOAD; \
+tcg_temp_free(ireg); \
+} while (0)
+
+#define fGEN_TCG_L2_loadrub_pcr(SHORTCODE) \
+  fGEN_TCG_LOAD_pcr(0, fLOAD(1, 1, u, EA, RdV))
+#define fGEN_TCG_L2_loadrb_pcr(SHORTCODE) \
+  fGEN_TCG_LOAD_pcr(0, fLOAD(1, 1, s, EA, RdV))
+#define fGEN_TCG_L2_loadruh_pcr(SHORTCODE) \
+  fGEN_TCG_LOAD_pcr(1, fLOAD(1, 2, u, EA, RdV))
+#define fGEN_TCG_L2_loadrh_pcr(SHORTCODE) \
+  fGEN_TCG_LOAD_pcr(1, fLOAD(1, 2, s, EA, RdV))
+#define

[PATCH v4 03/26] Hexagon (target/hexagon) remove unnecessary inline directives

2021-04-08 Thread Taylor Simpson

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/cpu.c   |  9 -
 target/hexagon/decode.c|  6 +++---
 target/hexagon/fma_emu.c   | 39 ---
 target/hexagon/op_helper.c | 37 ++---
 target/hexagon/translate.c |  2 +-
 5 files changed, 46 insertions(+), 47 deletions(-)

diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index b0b3040..c2fe357 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -69,10 +69,9 @@ const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS] = {
  * stacks at different locations.  This is used to compensate so the diff is
  * cleaner.
  */
-static inline target_ulong adjust_stack_ptrs(CPUHexagonState *env,
- target_ulong addr)
+static target_ulong adjust_stack_ptrs(CPUHexagonState *env, target_ulong addr)
 {
-HexagonCPU *cpu = container_of(env, HexagonCPU, env);
+HexagonCPU *cpu = hexagon_env_get_cpu(env);
 target_ulong stack_adjust = cpu->lldb_stack_adjust;
 target_ulong stack_start = env->stack_start;
 target_ulong stack_size = 0x1;
@@ -88,7 +87,7 @@ static inline target_ulong adjust_stack_ptrs(CPUHexagonState *env,
 }
 
 /* HEX_REG_P3_0 (aka C4) is an alias for the predicate registers */
-static inline target_ulong read_p3_0(CPUHexagonState *env)
+static target_ulong read_p3_0(CPUHexagonState *env)
 {
 int32_t control_reg = 0;
 int i;
@@ -116,7 +115,7 @@ static void print_reg(FILE *f, CPUHexagonState *env, int regnum)
 
 static void hexagon_dump(CPUHexagonState *env, FILE *f)
 {
-HexagonCPU *cpu = container_of(env, HexagonCPU, env);
+HexagonCPU *cpu = hexagon_env_get_cpu(env);
 
 if (cpu->lldb_compat) {
 /*
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index 1c9c074..65d97ce 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -354,7 +354,7 @@ static void decode_split_cmpjump(Packet *pkt)
 }
 }
 
-static inline int decode_opcode_can_jump(int opcode)
+static int decode_opcode_can_jump(int opcode)
 {
 if ((GET_ATTRIB(opcode, A_JUMP)) ||
 (GET_ATTRIB(opcode, A_CALL)) ||
@@ -370,7 +370,7 @@ static inline int decode_opcode_can_jump(int opcode)
 return 0;
 }
 
-static inline int decode_opcode_ends_loop(int opcode)
+static int decode_opcode_ends_loop(int opcode)
 {
 return GET_ATTRIB(opcode, A_HWLOOP0_END) ||
GET_ATTRIB(opcode, A_HWLOOP1_END);
@@ -764,7 +764,7 @@ static void decode_add_endloop_insn(Insn *insn, int loopnum)
 }
 }
 
-static inline int decode_parsebits_is_loopend(uint32_t encoding32)
+static int decode_parsebits_is_loopend(uint32_t encoding32)
 {
 uint32_t bits = parse_bits(encoding32);
 return bits == 0x2;
diff --git a/target/hexagon/fma_emu.c b/target/hexagon/fma_emu.c
index 842d903..f324b83 100644
--- a/target/hexagon/fma_emu.c
+++ b/target/hexagon/fma_emu.c
@@ -64,7 +64,7 @@ typedef union {
 };
 } Float;
 
-static inline uint64_t float64_getmant(float64 f64)
+static uint64_t float64_getmant(float64 f64)
 {
 Double a = { .i = f64 };
 if (float64_is_normal(f64)) {
@@ -91,7 +91,7 @@ int32_t float64_getexp(float64 f64)
 return -1;
 }
 
-static inline uint64_t float32_getmant(float32 f32)
+static uint64_t float32_getmant(float32 f32)
 {
 Float a = { .i = f32 };
 if (float32_is_normal(f32)) {
@@ -118,17 +118,17 @@ int32_t float32_getexp(float32 f32)
 return -1;
 }
 
-static inline uint32_t int128_getw0(Int128 x)
+static uint32_t int128_getw0(Int128 x)
 {
 return int128_getlo(x);
 }
 
-static inline uint32_t int128_getw1(Int128 x)
+static uint32_t int128_getw1(Int128 x)
 {
 return int128_getlo(x) >> 32;
 }
 
-static inline Int128 int128_mul_6464(uint64_t ai, uint64_t bi)
+static Int128 int128_mul_6464(uint64_t ai, uint64_t bi)
 {
 Int128 a, b;
 uint64_t pp0, pp1a, pp1b, pp1s, pp2;
@@ -152,7 +152,7 @@ static inline Int128 int128_mul_6464(uint64_t ai, uint64_t bi)
 return int128_make128(ret_low, pp2 + (pp1s >> 32));
 }
 
-static inline Int128 int128_sub_borrow(Int128 a, Int128 b, int borrow)
+static Int128 int128_sub_borrow(Int128 a, Int128 b, int borrow)
 {
 Int128 ret = int128_sub(a, b);
 if (borrow != 0) {
@@ -170,7 +170,7 @@ typedef struct {
 uint8_t sticky;
 } Accum;
 
-static inline void accum_init(Accum *p)
+static void accum_init(Accum *p)
 {
 p->mant = int128_zero();
 p->exp = 0;
@@ -180,7 +180,7 @@ static inline void accum_init(Accum *p)
 p->sticky = 0;
 }
 
-static inline Accum accum_norm_left(Accum a)
+static Accum accum_norm_left(Accum a)
 {
 a.exp--;
 a.mant = int128_lshift(a.mant, 1);
@@ -190,6 +190,7 @@ static inline Accum accum_norm_left(Accum a)
 return a;
 }
 
+/* This function is marked inline for performance reasons */
 static inline Accum accum_norm_right(Accum a, int amt)
 {
 if (amt > 130) {
@@ -226,7 +227,7 @@ static inline

[PATCH v4 08/26] Hexagon (target/hexagon) remove unused carry_from_add64 function

2021-04-08 Thread Taylor Simpson

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/arch.c   | 13 -
 target/hexagon/arch.h   |  1 -
 target/hexagon/macros.h |  2 --
 3 files changed, 16 deletions(-)

diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
index 09de124..699e2cf 100644
--- a/target/hexagon/arch.c
+++ b/target/hexagon/arch.c
@@ -76,19 +76,6 @@ uint64_t deinterleave(uint64_t src)
 return myeven | (myodd << 32);
 }
 
-uint32_t carry_from_add64(uint64_t a, uint64_t b, uint32_t c)
-{
-uint64_t tmpa, tmpb, tmpc;
-tmpa = fGETUWORD(0, a);
-tmpb = fGETUWORD(0, b);
-tmpc = tmpa + tmpb + c;
-tmpa = fGETUWORD(1, a);
-tmpb = fGETUWORD(1, b);
-tmpc = tmpa + tmpb + fGETUWORD(1, tmpc);
-tmpc = fGETUWORD(1, tmpc);
-return tmpc;
-}
-
 int32_t conv_round(int32_t a, int n)
 {
 int64_t val;
diff --git a/target/hexagon/arch.h b/target/hexagon/arch.h
index 1f7f036..6e0b0d9 100644
--- a/target/hexagon/arch.h
+++ b/target/hexagon/arch.h
@@ -22,7 +22,6 @@
 
 uint64_t interleave(uint32_t odd, uint32_t even);
 uint64_t deinterleave(uint64_t src);
-uint32_t carry_from_add64(uint64_t a, uint64_t b, uint32_t c);
 int32_t conv_round(int32_t a, int n);
 void arch_fpop_start(CPUHexagonState *env);
 void arch_fpop_end(CPUHexagonState *env);
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index cfcb817..8cb211d 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -341,8 +341,6 @@ static inline void gen_logical_not(TCGv dest, TCGv src)
 #define fWRITE_LC0(VAL) WRITE_RREG(HEX_REG_LC0, VAL)
 #define fWRITE_LC1(VAL) WRITE_RREG(HEX_REG_LC1, VAL)
 
-#define fCARRY_FROM_ADD(A, B, C) carry_from_add64(A, B, C)
-
 #define fSET_OVERFLOW() SET_USR_FIELD(USR_OVF, 1)
 #define fSET_LPCFG(VAL) SET_USR_FIELD(USR_LPCFG, (VAL))
 #define fGET_LPCFG (GET_USR_FIELD(USR_LPCFG))
-- 
2.7.4

[PATCH v4 19/26] Hexagon (target/hexagon) add A5_ACS (vacsh)

2021-04-08 Thread Taylor Simpson

Rxx32,Pe4 = vacsh(Rss32, Rtt32)
Add compare and select elements of two vectors

Test cases in tests/tcg/hexagon/multi_result.c
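
As a plain C illustration of what the instruction computes, the sketch below
follows the A5_ACS definition added to alu.idef in this patch. The names
vacsh_model, get_half and sat16 are hypothetical, and the overflow flag is
modelled as a simple int rather than the OVF bit in USR. Unlike the TCG
implementation, which splits the work into two helpers, this model produces
both results in one pass.

```c
#include <assert.h>
#include <stdint.h>

/* Extract halfword i (0..3) of a 64-bit value as a signed 16-bit number. */
static int16_t get_half(uint64_t v, int i)
{
    return (int16_t)(v >> (i * 16));
}

/* Saturate to the signed 16-bit range, recording overflow in *ovf. */
static int32_t sat16(int32_t v, int *ovf)
{
    if (v > INT16_MAX) { *ovf = 1; return INT16_MAX; }
    if (v < INT16_MIN) { *ovf = 1; return INT16_MIN; }
    return v;
}

/*
 * Model of Rxx,Pe = vacsh(Rss,Rtt): for each halfword, compare
 * (Rxx + Rtt) against (Rss - Rtt), keep the larger value (saturated),
 * and record the decision in two adjacent predicate bits.
 */
static uint64_t vacsh_model(uint64_t rxx, uint64_t rss, uint64_t rtt,
                            uint8_t *pe, int *ovf)
{
    uint64_t result = 0;

    *pe = 0;
    *ovf = 0;
    for (int i = 0; i < 4; i++) {
        int32_t xv = get_half(rxx, i) + get_half(rtt, i);
        int32_t sv = get_half(rss, i) - get_half(rtt, i);
        int32_t max = xv > sv ? xv : sv;

        if (xv > sv) {
            *pe |= (uint8_t)(3u << (i * 2));
        }
        result |= (uint64_t)(uint16_t)sat16(max, ovf) << (i * 16);
    }
    return result;
}
```

Both predicate bits of a lane carry the same decision, matching the two
fSETBIT calls in the instruction definition.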

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h  |  5 +++
 target/hexagon/helper.h   |  2 +
 target/hexagon/imported/alu.idef  | 19 ++
 target/hexagon/imported/encode_pp.def |  1 +
 target/hexagon/op_helper.c| 33 +
 tests/tcg/hexagon/multi_result.c  | 69 +++
 6 files changed, 129 insertions(+)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index d78e7b8..93310c5 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -199,6 +199,11 @@
  * Mathematical operations with more than one definition require
  * special handling
  */
+#define fGEN_TCG_A5_ACS(SHORTCODE) \
+do { \
+gen_helper_vacsh_pred(PeV, cpu_env, RxxV, RssV, RttV); \
+gen_helper_vacsh_val(RxxV, cpu_env, RxxV, RssV, RttV); \
+} while (0)
 
 /*
  * Approximate reciprocal
diff --git a/target/hexagon/helper.h b/target/hexagon/helper.h
index cb7508f..3824ae0 100644
--- a/target/hexagon/helper.h
+++ b/target/hexagon/helper.h
@@ -26,6 +26,8 @@ DEF_HELPER_2(commit_store, void, env, int)
 DEF_HELPER_FLAGS_4(fcircadd, TCG_CALL_NO_RWG_SE, s32, s32, s32, s32, s32)
 DEF_HELPER_3(sfrecipa, i64, env, f32, f32)
 DEF_HELPER_2(sfinvsqrta, i64, env, f32)
+DEF_HELPER_4(vacsh_val, s64, env, s64, s64, s64)
+DEF_HELPER_FLAGS_4(vacsh_pred, TCG_CALL_NO_RWG_SE, s32, env, s64, s64, s64)
 
 /* Floating point */
 DEF_HELPER_2(conv_sf2df, f64, env, f32)
diff --git a/target/hexagon/imported/alu.idef b/target/hexagon/imported/alu.idef
index 45cc529..e8cc52c 100644
--- a/target/hexagon/imported/alu.idef
+++ b/target/hexagon/imported/alu.idef
@@ -1240,6 +1240,25 @@ MINMAX(uw,WORD,UWORD,2)
 #undef VMINORMAX3
 
 
+Q6INSN(A5_ACS,"Rxx32,Pe4=vacsh(Rss32,Rtt32)",ATTRIBS(),
+"Add Compare and Select elements of two vectors, record the maximums and the decisions ",
+{
+fHIDE(int i;)
+fHIDE(int xv;)
+fHIDE(int sv;)
+fHIDE(int tv;)
+for (i = 0; i < 4; i++) {
+xv = (int) fGETHALF(i,RxxV);
+sv = (int) fGETHALF(i,RssV);
+tv = (int) fGETHALF(i,RttV);
+xv = xv + tv;   //assumes 17bit datapath
+sv = sv - tv;   //assumes 17bit datapath
+fSETBIT(i*2,  PeV,  (xv > sv));
+fSETBIT(i*2+1,PeV,  (xv > sv));
+fSETHALF(i,   RxxV, fSATH(fMAX(xv,sv)));
+}
+})
+
 /**/
 /* Vector Min/Max */
 /**/
diff --git a/target/hexagon/imported/encode_pp.def b/target/hexagon/imported/encode_pp.def
index 18fe45d..87e0426 100644
--- a/target/hexagon/imported/encode_pp.def
+++ b/target/hexagon/imported/encode_pp.def
@@ -1017,6 +1017,7 @@ MPY_ENC(M7_dcmpyiwc_acc, "1010","x","1","0","1","0","10")
 
 
 
+MPY_ENC(A5_ACS,  "1010","x","0","1","0","1","ee")
 /*
 */
 
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index a25fb98..f9fb655 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -347,6 +347,39 @@ uint64_t HELPER(sfinvsqrta)(CPUHexagonState *env, float32 RsV)
 return ((uint64_t)RdV << 32) | PeV;
 }
 
+int64_t HELPER(vacsh_val)(CPUHexagonState *env,
+   int64_t RxxV, int64_t RssV, int64_t RttV)
+{
+for (int i = 0; i < 4; i++) {
+int xv = sextract64(RxxV, i * 16, 16);
+int sv = sextract64(RssV, i * 16, 16);
+int tv = sextract64(RttV, i * 16, 16);
+int max;
+xv = xv + tv;
+sv = sv - tv;
+max = xv > sv ? xv : sv;
+/* Note that fSATH can set the OVF bit in usr */
+RxxV = deposit64(RxxV, i * 16, 16, fSATH(max));
+}
+return RxxV;
+}
+
+int32_t HELPER(vacsh_pred)(CPUHexagonState *env,
+   int64_t RxxV, int64_t RssV, int64_t RttV)
+{
+int32_t PeV = 0;
+for (int i = 0; i < 4; i++) {
+int xv = sextract64(RxxV, i * 16, 16);
+int sv = sextract64(RssV, i * 16, 16);
+int tv = sextract64(RttV, i * 16, 16);
+xv = xv + tv;
+sv = sv - tv;
+PeV = deposit32(PeV, i * 2, 1, (xv > sv));
+PeV = deposit32(PeV, i * 2 + 1, 1, (xv > sv));
+}
+return PeV;
+}
+
 /*
  * mem_noshuf
  * Section 5.5 of the Hexagon V67 Programmer's Reference Manual
diff --git a/tests/tcg/hexagon/multi_result.c b/tests/tcg/hexagon/multi_result.c
index 67aa462..c21148f 100644
--- a/tests/tcg/hexagon/multi_result.c
+++ b/tests/tcg/hexagon/multi_result.c
@@ -45,8 +45,41 @@ static int sfinvsqrta(int Rs, int *pred_result)
   return result;
 }
 
+static long long vacsh(long long Rxx, long long Rss, long long Rtt,
+   int *pred_result, int *ovf_result)
+{

[PATCH v4 04/26] Hexagon (target/hexagon) use env_archcpu and env_cpu

2021-04-08 Thread Taylor Simpson

Remove hexagon_env_get_cpu and replace with env_archcpu
Replace CPU(hexagon_env_get_cpu(env)) with env_cpu(env)
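
For context, the removed hexagon_env_get_cpu helper recovered the enclosing
CPU object from a pointer to its embedded env member using container_of. The
standalone sketch below shows that pattern only; EnvModel, CpuModel,
container_of_model and env_to_cpu are hypothetical stand-ins, not QEMU types
or the actual definition of env_archcpu.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for CPUHexagonState and HexagonCPU. */
typedef struct {
    int dummy_reg;
} EnvModel;

typedef struct {
    int parent;
    EnvModel env;       /* embedded member, as in HexagonCPU */
    int lldb_compat;
} CpuModel;

/* Step back from a member pointer to the start of the enclosing struct. */
#define container_of_model(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static CpuModel *env_to_cpu(EnvModel *env)
{
    return container_of_model(env, CpuModel, env);
}
```

Given CpuModel c, env_to_cpu(&c.env) yields &c, which is the relationship the
generic accessors rely on.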

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 linux-user/hexagon/cpu_loop.c | 2 +-
 target/hexagon/cpu.c  | 4 ++--
 target/hexagon/cpu.h  | 5 -
 target/hexagon/op_helper.c| 2 +-
 target/hexagon/translate.c| 2 +-
 5 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/linux-user/hexagon/cpu_loop.c b/linux-user/hexagon/cpu_loop.c
index 9a68ca0..bc34f5d 100644
--- a/linux-user/hexagon/cpu_loop.c
+++ b/linux-user/hexagon/cpu_loop.c
@@ -25,7 +25,7 @@
 
 void cpu_loop(CPUHexagonState *env)
 {
-CPUState *cs = CPU(hexagon_env_get_cpu(env));
+CPUState *cs = env_cpu(env);
 int trapnr, signum, sigcode;
 target_ulong sigaddr;
 target_ulong syscallnum;
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index c2fe357..f044506 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -71,7 +71,7 @@ const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS] = {
  */
 static target_ulong adjust_stack_ptrs(CPUHexagonState *env, target_ulong addr)
 {
-HexagonCPU *cpu = hexagon_env_get_cpu(env);
+HexagonCPU *cpu = env_archcpu(env);
 target_ulong stack_adjust = cpu->lldb_stack_adjust;
 target_ulong stack_start = env->stack_start;
 target_ulong stack_size = 0x1;
@@ -115,7 +115,7 @@ static void print_reg(FILE *f, CPUHexagonState *env, int regnum)
 
 static void hexagon_dump(CPUHexagonState *env, FILE *f)
 {
-HexagonCPU *cpu = hexagon_env_get_cpu(env);
+HexagonCPU *cpu = env_archcpu(env);
 
 if (cpu->lldb_compat) {
 /*
diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index e04eac5..2855dd3 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -127,11 +127,6 @@ typedef struct HexagonCPU {
 target_ulong lldb_stack_adjust;
 } HexagonCPU;
 
-static inline HexagonCPU *hexagon_env_get_cpu(CPUHexagonState *env)
-{
-return container_of(env, HexagonCPU, env);
-}
-
 #include "cpu_bits.h"
 
 #define cpu_signal_handler cpu_hexagon_signal_handler
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 5d35dfc..7ac8554 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -35,7 +35,7 @@ static void QEMU_NORETURN do_raise_exception_err(CPUHexagonState *env,
  uint32_t exception,
  uintptr_t pc)
 {
-CPUState *cs = CPU(hexagon_env_get_cpu(env));
+CPUState *cs = env_cpu(env);
 qemu_log_mask(CPU_LOG_INT, "%s: %d\n", __func__, exception);
 cs->exception_index = exception;
 cpu_loop_exit_restore(cs, pc);
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index f975d7a..e235fdb 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -585,7 +585,7 @@ static void hexagon_tr_translate_packet(DisasContextBase *dcbase, CPUState *cpu)
  * The CPU log is used to compare against LLDB single stepping,
  * so end the TLB after every packet.
  */
-HexagonCPU *hex_cpu = hexagon_env_get_cpu(env);
+HexagonCPU *hex_cpu = env_archcpu(env);
 if (hex_cpu->lldb_compat && qemu_loglevel_mask(CPU_LOG_TB_CPU)) {
 ctx->base.is_jmp = DISAS_TOO_MANY;
 }
-- 
2.7.4

[PATCH v4 24/26] Hexagon (target/hexagon) load and unpack bytes instructions

2021-04-08 Thread Taylor Simpson

The following instructions are added
L2_loadbzw2_io  Rd32 = memubh(Rs32+#s11:1)
L2_loadbzw4_io  Rdd32 = memubh(Rs32+#s11:1)
L2_loadbsw2_io  Rd32 = membh(Rs32+#s11:1)
L2_loadbsw4_io  Rdd32 = membh(Rs32+#s11:1)

L4_loadbzw2_ur  Rd32 = memubh(Rt32<<#u2+#U6)
L4_loadbzw4_ur  Rdd32 = memubh(Rt32<<#u2+#U6)
L4_loadbsw2_ur  Rd32 = membh(Rt32<<#u2+#U6)
L4_loadbsw4_ur  Rdd32 = membh(Rt32<<#u2+#U6)

L4_loadbzw2_ap  Rd32 = memubh(Re32=#U6)
L4_loadbzw4_ap  Rdd32 = memubh(Re32=#U6)
L4_loadbsw2_ap  Rd32 = membh(Re32=#U6)
L4_loadbsw4_ap  Rdd32 = membh(Re32=#U6)

L2_loadbzw2_pr  Rd32 = memubh(Rx32++Mu2)
L2_loadbzw4_pr  Rdd32 = memubh(Rx32++Mu2)
L2_loadbsw2_pr  Rd32 = membh(Rx32++Mu2)
L2_loadbsw4_pr  Rdd32 = membh(Rx32++Mu2)

L2_loadbzw2_pbr Rd32 = memubh(Rx32++Mu2:brev)
L2_loadbzw4_pbr Rdd32 = memubh(Rx32++Mu2:brev)
L2_loadbsw2_pbr Rd32 = membh(Rx32++Mu2:brev)
L2_loadbsw4_pbr Rdd32 = membh(Rx32++Mu2:brev)

L2_loadbzw2_pi  Rd32 = memubh(Rx32++#s4:1)
L2_loadbzw4_pi  Rdd32 = memubh(Rx32++#s4:1)
L2_loadbsw2_pi  Rd32 = membh(Rx32++#s4:1)
L2_loadbsw4_pi  Rdd32 = membh(Rx32++#s4:1)

L2_loadbzw2_pci Rd32 = memubh(Rx32++#s4:1:circ(Mu2))
L2_loadbzw4_pci Rdd32 = memubh(Rx32++#s4:1:circ(Mu2))
L2_loadbsw2_pci Rd32 = membh(Rx32++#s4:1:circ(Mu2))
L2_loadbsw4_pci Rdd32 = membh(Rx32++#s4:1:circ(Mu2))

L2_loadbzw2_pcr Rd32 = memubh(Rx32++I:circ(Mu2))
L2_loadbzw4_pcr Rdd32 = memubh(Rx32++I:circ(Mu2))
L2_loadbsw2_pcr Rd32 = membh(Rx32++I:circ(Mu2))
L2_loadbsw4_pcr Rdd32 = membh(Rx32++I:circ(Mu2))

Test cases in tests/tcg/hexagon/load_unpack.c
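
The unpack step can be illustrated in plain C as follows. This is a sketch of
the data transformation only, assuming little-endian byte order and ignoring
the addressing modes; unpack_bytes_w2 and unpack_bytes_w4 are hypothetical
names, with the sign flag playing the role of the SIGN argument of the
fGEN_TCG_loadbXw2 macro in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Widen one byte to a halfword, zero- or sign-extending as requested. */
static uint16_t widen_byte(uint8_t b, bool sign)
{
    return sign ? (uint16_t)(int16_t)(int8_t)b : (uint16_t)b;
}

/* memubh/membh into a 32-bit register: 2 bytes -> 2 halfwords. */
static uint32_t unpack_bytes_w2(const uint8_t *mem, bool sign)
{
    uint32_t result = 0;

    for (int i = 0; i < 2; i++) {
        result |= (uint32_t)widen_byte(mem[i], sign) << (i * 16);
    }
    return result;
}

/* memubh/membh into a 64-bit register pair: 4 bytes -> 4 halfwords. */
static uint64_t unpack_bytes_w4(const uint8_t *mem, bool sign)
{
    uint64_t result = 0;

    for (int i = 0; i < 4; i++) {
        result |= (uint64_t)widen_byte(mem[i], sign) << (i * 16);
    }
    return result;
}
```

For the bytes 0x7f, 0x80 the zero-extending form gives 0x0080007f and the
sign-extending form gives 0xff80007f.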

Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h  | 108 
 target/hexagon/genptr.c   |  13 +
 target/hexagon/imported/encode_pp.def |   6 +
 target/hexagon/imported/ldst.idef |  43 +++
 target/hexagon/macros.h   |  16 ++
 tests/tcg/hexagon/Makefile.target |   1 +
 tests/tcg/hexagon/load_unpack.c   | 474 ++
 7 files changed, 661 insertions(+)
 create mode 100644 tests/tcg/hexagon/load_unpack.c

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 8f0ec01..1120aae 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -153,6 +153,114 @@
 #define fGEN_TCG_L2_loadrd_pi(SHORTCODE)   SHORTCODE
 
 /*
+ * These instructions load 2 bytes and place them in
+ * two halves of the destination register.
+ * The GET_EA macro determines the addressing mode.
+ * The SIGN argument determines whether to zero-extend or
+ * sign-extend.
+ */
+#define fGEN_TCG_loadbXw2(GET_EA, SIGN) \
+do { \
+TCGv tmp = tcg_temp_new(); \
+TCGv byte = tcg_temp_new(); \
+GET_EA; \
+fLOAD(1, 2, u, EA, tmp); \
+tcg_gen_movi_tl(RdV, 0); \
+for (int i = 0; i < 2; i++) { \
+gen_set_half(i, RdV, gen_get_byte(byte, i, tmp, (SIGN))); \
+} \
+tcg_temp_free(tmp); \
+tcg_temp_free(byte); \
+} while (0)
+
+#define fGEN_TCG_L2_loadbzw2_io(SHORTCODE) \
+fGEN_TCG_loadbXw2(fEA_RI(RsV, siV), false)
+#define fGEN_TCG_L4_loadbzw2_ur(SHORTCODE) \
+fGEN_TCG_loadbXw2(fEA_IRs(UiV, RtV, uiV), false)
+#define fGEN_TCG_L2_loadbsw2_io(SHORTCODE) \
+fGEN_TCG_loadbXw2(fEA_RI(RsV, siV), true)
+#define fGEN_TCG_L4_loadbsw2_ur(SHORTCODE) \
+fGEN_TCG_loadbXw2(fEA_IRs(UiV, RtV, uiV), true)
+#define fGEN_TCG_L4_loadbzw2_ap(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_ap, false)
+#define fGEN_TCG_L2_loadbzw2_pr(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_pr, false)
+#define fGEN_TCG_L2_loadbzw2_pbr(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_pbr, false)
+#define fGEN_TCG_L2_loadbzw2_pi(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_pi, false)
+#define fGEN_TCG_L4_loadbsw2_ap(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_ap, true)
+#define fGEN_TCG_L2_loadbsw2_pr(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_pr, true)
+#define fGEN_TCG_L2_loadbsw2_pbr(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_pbr, true)
+#define fGEN_TCG_L2_loadbsw2_pi(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_pi, true)
+#define fGEN_TCG_L2_loadbzw2_pci(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_pci, false)
+#define fGEN_TCG_L2_loadbsw2_pci(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_pci, true)
+#define fGEN_TCG_L2_loadbzw2_pcr(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_pcr(1), false)
+#define fGEN_TCG_L2_loadbsw2_pcr(SHORTCODE) \
+fGEN_TCG_loadbXw2(GET_EA_pcr(1), true)
+
+/*
+ * These instructions load 4 bytes and place them in
+ * four halves of the destination register pair.
+ * The GET_EA macro determines the addressing mode.
+ * The SIGN argument determines whether

[PATCH v4 10/26] Hexagon (target/hexagon) use softfloat default NaN and tininess

2021-04-08 Thread Taylor Simpson

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 fpu/softfloat-specialize.c.inc |  3 +++
 target/hexagon/cpu.c   |  5 +
 target/hexagon/op_helper.c | 47 --
 3 files changed, 8 insertions(+), 47 deletions(-)

diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index c2f87ad..9ea318f 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -145,6 +145,9 @@ static FloatParts parts_default_nan(float_status *status)
 #elif defined(TARGET_HPPA)
 /* snan_bit_is_one, set msb-1.  */
 frac = 1ULL << (DECOMPOSED_BINARY_POINT - 2);
+#elif defined(TARGET_HEXAGON)
+sign = 1;
+frac = ~0ULL;
 #else
 /* This case is true for Alpha, ARM, MIPS, OpenRISC, PPC, RISC-V,
  * S390, SH4, TriCore, and Xtensa.  I cannot find documentation
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index f044506..ff44fd6 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -23,6 +23,7 @@
 #include "exec/exec-all.h"
 #include "qapi/error.h"
 #include "hw/qdev-properties.h"
+#include "fpu/softfloat-helpers.h"
 
 static void hexagon_v67_cpu_init(Object *obj)
 {
@@ -205,8 +206,12 @@ static void hexagon_cpu_reset(DeviceState *dev)
 CPUState *cs = CPU(dev);
 HexagonCPU *cpu = HEXAGON_CPU(cs);
 HexagonCPUClass *mcc = HEXAGON_CPU_GET_CLASS(cpu);
+CPUHexagonState *env = &cpu->env;
 
 mcc->parent_reset(dev);
+
+set_default_nan_mode(1, &env->fp_status);
+set_float_detect_tininess(float_tininess_before_rounding, &env->fp_status);
 }
 
 static void hexagon_cpu_disas_set_info(CPUState *s, disassemble_info *info)
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 1d91fa2..478421d 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -297,26 +297,6 @@ int32_t HELPER(fcircadd)(int32_t RxV, int32_t offset, int32_t M, int32_t CS)
 }
 
 /*
- * Hexagon FP operations return ~0 instead of NaN
- * The hex_check_sfnan/hex_check_dfnan functions perform this check
- */
-static float32 hex_check_sfnan(float32 x)
-{
-if (float32_is_any_nan(x)) {
-return make_float32(0xffffffffU);
-}
-return x;
-}
-
-static float64 hex_check_dfnan(float64 x)
-{
-if (float64_is_any_nan(x)) {
-return make_float64(0xffffffffffffffffULL);
-}
-return x;
-}
-
-/*
  * mem_noshuf
  * Section 5.5 of the Hexagon V67 Programmer's Reference Manual
  *
@@ -373,7 +353,6 @@ float64 HELPER(conv_sf2df)(CPUHexagonState *env, float32 RsV)
 float64 out_f64;
 arch_fpop_start(env);
 out_f64 = float32_to_float64(RsV, &env->fp_status);
-out_f64 = hex_check_dfnan(out_f64);
 arch_fpop_end(env);
 return out_f64;
 }
@@ -383,7 +362,6 @@ float32 HELPER(conv_df2sf)(CPUHexagonState *env, float64 RssV)
 float32 out_f32;
 arch_fpop_start(env);
 out_f32 = float64_to_float32(RssV, &env->fp_status);
-out_f32 = hex_check_sfnan(out_f32);
 arch_fpop_end(env);
 return out_f32;
 }
@@ -393,7 +371,6 @@ float32 HELPER(conv_uw2sf)(CPUHexagonState *env, int32_t RsV)
 float32 RdV;
 arch_fpop_start(env);
 RdV = uint32_to_float32(RsV, &env->fp_status);
-RdV = hex_check_sfnan(RdV);
 arch_fpop_end(env);
 return RdV;
 }
@@ -403,7 +380,6 @@ float64 HELPER(conv_uw2df)(CPUHexagonState *env, int32_t RsV)
 float64 RddV;
 arch_fpop_start(env);
 RddV = uint32_to_float64(RsV, &env->fp_status);
-RddV = hex_check_dfnan(RddV);
 arch_fpop_end(env);
 return RddV;
 }
@@ -413,7 +389,6 @@ float32 HELPER(conv_w2sf)(CPUHexagonState *env, int32_t RsV)
 float32 RdV;
 arch_fpop_start(env);
 RdV = int32_to_float32(RsV, &env->fp_status);
-RdV = hex_check_sfnan(RdV);
 arch_fpop_end(env);
 return RdV;
 }
@@ -423,7 +398,6 @@ float64 HELPER(conv_w2df)(CPUHexagonState *env, int32_t RsV)
 float64 RddV;
 arch_fpop_start(env);
 RddV = int32_to_float64(RsV, &env->fp_status);
-RddV = hex_check_dfnan(RddV);
 arch_fpop_end(env);
 return RddV;
 }
@@ -433,7 +407,6 @@ float32 HELPER(conv_ud2sf)(CPUHexagonState *env, int64_t RssV)
 float32 RdV;
 arch_fpop_start(env);
 RdV = uint64_to_float32(RssV, &env->fp_status);
-RdV = hex_check_sfnan(RdV);
 arch_fpop_end(env);
 return RdV;
 }
@@ -443,7 +416,6 @@ float64 HELPER(conv_ud2df)(CPUHexagonState *env, int64_t RssV)
 float64 RddV;
 arch_fpop_start(env);
 RddV = uint64_to_float64(RssV, &env->fp_status);
-RddV = hex_check_dfnan(RddV);
 arch_fpop_end(env);
 return RddV;
 }
@@ -453,7 +425,6 @@ float32 HELPER(conv_d2sf)(CPUHexagonState *env, int64_t RssV)
 float32 RdV;
 arch_fpop_start(env);
 RdV = int64_to_float32(RssV, &env->fp_status);
-RdV = hex_check_sfnan(RdV);
 arch_fpop_end(env);
 return RdV;
 }
@@ -463,7 +434,6 @@ float64 HELPER(conv_d2df)(CPUHexagonState *env, int64_t RssV)
 float64 RddV;
 arch_fpop_start(env);
 RddV = int64_to_float64(RssV,

[PATCH v4 12/26] Hexagon (target/hexagon) use softfloat for float-to-int conversions

2021-04-08 Thread Taylor Simpson

Use the proper return type for helpers that convert to unsigned
Remove target/hexagon/conv_emu.[ch]

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/conv_emu.c   | 177 
 target/hexagon/conv_emu.h   |  31 
 target/hexagon/fma_emu.c|   1 -
 target/hexagon/helper.h |  16 ++--
 target/hexagon/meson.build  |   1 -
 target/hexagon/op_helper.c  | 169 --
 tests/tcg/hexagon/fpstuff.c | 145 
 7 files changed, 281 insertions(+), 259 deletions(-)
 delete mode 100644 target/hexagon/conv_emu.c
 delete mode 100644 target/hexagon/conv_emu.h

diff --git a/target/hexagon/conv_emu.c b/target/hexagon/conv_emu.c
deleted file mode 100644
index 3985b10..0000000
--- a/target/hexagon/conv_emu.c
+++ /dev/null
@@ -1,177 +0,0 @@
-/*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
- *
- *  This program is free software; you can redistribute it and/or modify
- *  it under the terms of the GNU General Public License as published by
- *  the Free Software Foundation; either version 2 of the License, or
- *  (at your option) any later version.
- *
- *  This program is distributed in the hope that it will be useful,
- *  but WITHOUT ANY WARRANTY; without even the implied warranty of
- *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- *  GNU General Public License for more details.
- *
- *  You should have received a copy of the GNU General Public License
- *  along with this program; if not, see <http://www.gnu.org/licenses/>.
- */
-
-#include "qemu/osdep.h"
-#include "qemu/host-utils.h"
-#include "fpu/softfloat.h"
-#include "macros.h"
-#include "conv_emu.h"
-
-#define LL_MAX_POS 0x7fffffffffffffffULL
-#define MAX_POS 0x7fffffffU
-
-static uint64_t conv_f64_to_8u_n(float64 in, int will_negate,
- float_status *fp_status)
-{
-uint8_t sign = float64_is_neg(in);
-if (float64_is_infinity(in)) {
-float_raise(float_flag_invalid, fp_status);
-if (float64_is_neg(in)) {
-return 0ULL;
-} else {
-return ~0ULL;
-}
-}
-if (float64_is_any_nan(in)) {
-float_raise(float_flag_invalid, fp_status);
-return ~0ULL;
-}
-if (float64_is_zero(in)) {
-return 0;
-}
-if (sign) {
-float_raise(float_flag_invalid, fp_status);
-return 0;
-}
-if (float64_lt(in, float64_half, fp_status)) {
-/* Near zero, captures large fracshifts, denorms, etc */
-float_raise(float_flag_inexact, fp_status);
-switch (get_float_rounding_mode(fp_status)) {
-case float_round_down:
-if (will_negate) {
-return 1;
-} else {
-return 0;
-}
-case float_round_up:
-if (!will_negate) {
-return 1;
-} else {
-return 0;
-}
-default:
-return 0;/* nearest or towards zero */
-}
-}
-return float64_to_uint64(in, fp_status);
-}
-
-static void clr_float_exception_flags(uint8_t flag, float_status *fp_status)
-{
-uint8_t flags = fp_status->float_exception_flags;
-flags &= ~flag;
-set_float_exception_flags(flags, fp_status);
-}
-
-static uint32_t conv_df_to_4u_n(float64 fp64, int will_negate,
-float_status *fp_status)
-{
-uint64_t tmp;
-tmp = conv_f64_to_8u_n(fp64, will_negate, fp_status);
-if (tmp > 0xffffffffULL) {
-clr_float_exception_flags(float_flag_inexact, fp_status);
-float_raise(float_flag_invalid, fp_status);
-return ~0U;
-}
-return (uint32_t)tmp;
-}
-
-uint64_t conv_df_to_8u(float64 in, float_status *fp_status)
-{
-return conv_f64_to_8u_n(in, 0, fp_status);
-}
-
-uint32_t conv_df_to_4u(float64 in, float_status *fp_status)
-{
-return conv_df_to_4u_n(in, 0, fp_status);
-}
-
-int64_t conv_df_to_8s(float64 in, float_status *fp_status)
-{
-uint8_t sign = float64_is_neg(in);
-uint64_t tmp;
-if (float64_is_any_nan(in)) {
-float_raise(float_flag_invalid, fp_status);
-return -1;
-}
-if (sign) {
-float64 minus_fp64 = float64_abs(in);
-tmp = conv_f64_to_8u_n(minus_fp64, 1, fp_status);
-} else {
-tmp = conv_f64_to_8u_n(in, 0, fp_status);
-}
-if (tmp > (LL_MAX_POS + sign)) {
-clr_float_exception_flags(float_flag_inexact, fp_status);
-float_raise(float_flag_invalid, fp_status);
-tmp = (LL_MAX_POS + sign);
-}
-if (sign) {
-return -tmp;
-} else {
-return tmp;
-}
-}
-
-int32_t conv_df_to_4s(float64 in, float_status *fp_status)
-{
-uint8_t sign = float64_is_neg(in);
-uint64_t tmp;
-if (float64_is_any_nan(in)) {
-float_raise(float_flag_invalid,

[PATCH v4 11/26] Hexagon (target/hexagon) replace float32_mul_pow2 with float32_scalbn

2021-04-08 Thread Taylor Simpson

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/arch.c | 28 +++-
 1 file changed, 11 insertions(+), 17 deletions(-)

diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
index bb51f19..40b6e3d 100644
--- a/target/hexagon/arch.c
+++ b/target/hexagon/arch.c
@@ -143,12 +143,6 @@ void arch_fpop_end(CPUHexagonState *env)
 }
 }
 
-static float32 float32_mul_pow2(float32 a, uint32_t p, float_status *fp_status)
-{
-float32 b = make_float32((SF_BIAS + p) << SF_MANTBITS);
-return float32_mul(a, b, fp_status);
-}
-
 int arch_sf_recip_common(float32 *Rs, float32 *Rt, float32 *Rd, int *adjust,
  float_status *fp_status)
 {
@@ -217,22 +211,22 @@ int arch_sf_recip_common(float32 *Rs, float32 *Rt, float32 *Rd, int *adjust,
 if ((n_exp - d_exp + SF_BIAS) <= SF_MANTBITS) {
 /* Near quotient underflow / inexact Q */
 PeV = 0x80;
-RtV = float32_mul_pow2(RtV, -64, fp_status);
-RsV = float32_mul_pow2(RsV, 64, fp_status);
+RtV = float32_scalbn(RtV, -64, fp_status);
+RsV = float32_scalbn(RsV, 64, fp_status);
 } else if ((n_exp - d_exp + SF_BIAS) > (SF_MAXEXP - 24)) {
 /* Near quotient overflow */
 PeV = 0x40;
-RtV = float32_mul_pow2(RtV, 32, fp_status);
-RsV = float32_mul_pow2(RsV, -32, fp_status);
+RtV = float32_scalbn(RtV, 32, fp_status);
+RsV = float32_scalbn(RsV, -32, fp_status);
 } else if (n_exp <= SF_MANTBITS + 2) {
-RtV = float32_mul_pow2(RtV, 64, fp_status);
-RsV = float32_mul_pow2(RsV, 64, fp_status);
+RtV = float32_scalbn(RtV, 64, fp_status);
+RsV = float32_scalbn(RsV, 64, fp_status);
 } else if (d_exp <= 1) {
-RtV = float32_mul_pow2(RtV, 32, fp_status);
-RsV = float32_mul_pow2(RsV, 32, fp_status);
+RtV = float32_scalbn(RtV, 32, fp_status);
+RsV = float32_scalbn(RsV, 32, fp_status);
 } else if (d_exp > 252) {
-RtV = float32_mul_pow2(RtV, -32, fp_status);
-RsV = float32_mul_pow2(RsV, -32, fp_status);
+RtV = float32_scalbn(RtV, -32, fp_status);
+RsV = float32_scalbn(RsV, -32, fp_status);
 }
 RdV = 0;
 ret = 1;
@@ -274,7 +268,7 @@ int arch_sf_invsqrt_common(float32 *Rs, float32 *Rd, int *adjust,
 /* Basic checks passed */
 r_exp = float32_getexp(RsV);
 if (r_exp <= 24) {
-RsV = float32_mul_pow2(RsV, 64, fp_status);
+RsV = float32_scalbn(RsV, 64, fp_status);
 PeV = 0xe0;
 }
 RdV = 0;
-- 
2.7.4

[PATCH v4 09/26] Hexagon (target/hexagon) change type of softfloat_roundingmodes

2021-04-08 Thread Taylor Simpson

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/arch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
index 699e2cf..bb51f19 100644
--- a/target/hexagon/arch.c
+++ b/target/hexagon/arch.c
@@ -95,7 +95,7 @@ int32_t conv_round(int32_t a, int n)
 
 /* Floating Point Stuff */
 
-static const int softfloat_roundingmodes[] = {
+static const FloatRoundMode softfloat_roundingmodes[] = {
 float_round_nearest_even,
 float_round_to_zero,
 float_round_down,
-- 
2.7.4

[PATCH v4 06/26] Hexagon (target/hexagon) decide if pred has been written at TCG gen time

2021-04-08 Thread Taylor Simpson

Multiple writes to the same preg are and'ed together.  Rather than
generating a runtime check, we can determine at TCG generation time
if the predicate has previously been written in the packet.

Test added to tests/tcg/hexagon/misc.c
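
The rule described above can be illustrated with a small stand-alone sketch.
This is not the QEMU code: PacketCtx, start_packet and log_pred_write are
hypothetical names, NUM_PREGS is assumed to be 4, and the "generated" operation
is simulated by applying it directly to an array. The point it shows is that
the assign-versus-AND decision depends only on what the translator has already
seen in the packet, so it needs no runtime check.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PREGS 4 /* assumption for this sketch */

/* Hypothetical per-packet translation state. */
typedef struct {
    uint8_t pregs_written;             /* bit n set: Pn already written */
    uint8_t new_pred_value[NUM_PREGS]; /* stands in for the TCG globals */
} PacketCtx;

/* Called at the start of every packet: nothing has been written yet. */
static void start_packet(PacketCtx *ctx)
{
    ctx->pregs_written = 0;
}

/*
 * First write to a predicate in the packet is a plain assignment;
 * every later write in the same packet is AND'ed with the value so far.
 */
static void log_pred_write(PacketCtx *ctx, int pnum, uint32_t val)
{
    uint8_t base_val = val & 0xff;

    assert(pnum >= 0 && pnum < NUM_PREGS);
    if (!(ctx->pregs_written & (1u << pnum))) {
        ctx->new_pred_value[pnum] = base_val;
    } else {
        ctx->new_pred_value[pnum] &= base_val;
    }
    ctx->pregs_written |= 1u << pnum;
}
```

Calling start_packet() and then log_pred_write() twice for the same predicate
leaves the AND of the two values; a write in the next packet starts over with
a plain assignment.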

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg_funcs.py |  2 +-
 target/hexagon/genptr.c | 22 +++---
 target/hexagon/translate.c  |  9 +++--
 target/hexagon/translate.h  |  2 ++
 tests/tcg/hexagon/misc.c| 19 +++
 5 files changed, 44 insertions(+), 10 deletions(-)

diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
index db9f663..7ceb25b 100755
--- a/target/hexagon/gen_tcg_funcs.py
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -316,7 +316,7 @@ def genptr_dst_write(f, tag, regtype, regid):
 print("Bad register parse: ", regtype, regid)
 elif (regtype == "P"):
 if (regid in {"d", "e", "x"}):
-f.write("gen_log_pred_write(%s%sN, %s%sV);\n" % \
+f.write("gen_log_pred_write(ctx, %s%sN, %s%sV);\n" % \
 (regtype, regid, regtype, regid))
 f.write("ctx_log_pred_write(ctx, %s%sN);\n" % \
 (regtype, regid))
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 07d970f..6b74344 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -119,20 +119,28 @@ static void gen_log_reg_write_pair(int rnum, TCGv_i64 val)
 #endif
 }
 
-static inline void gen_log_pred_write(int pnum, TCGv val)
+static inline void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val)
 {
 TCGv zero = tcg_const_tl(0);
 TCGv base_val = tcg_temp_new();
 TCGv and_val = tcg_temp_new();
 TCGv pred_written = tcg_temp_new();
 
-/* Multiple writes to the same preg are and'ed together */
 tcg_gen_andi_tl(base_val, val, 0xff);
-tcg_gen_and_tl(and_val, base_val, hex_new_pred_value[pnum]);
-tcg_gen_andi_tl(pred_written, hex_pred_written, 1 << pnum);
-tcg_gen_movcond_tl(TCG_COND_NE, hex_new_pred_value[pnum],
-   pred_written, zero,
-   and_val, base_val);
+
+/*
+ * Section 6.1.3 of the Hexagon V67 Programmer's Reference Manual
+ *
+ * Multiple writes to the same preg are and'ed together
+ * If this is the first predicate write in the packet, do a
+ * straight assignment.  Otherwise, do an and.
+ */
+if (!test_bit(pnum, ctx->pregs_written)) {
+tcg_gen_mov_tl(hex_new_pred_value[pnum], base_val);
+} else {
+tcg_gen_and_tl(hex_new_pred_value[pnum],
+   hex_new_pred_value[pnum], base_val);
+}
 tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << pnum);
 
 tcg_temp_free(zero);
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 9f2a531..49ec8b7 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -172,6 +172,7 @@ static void gen_start_packet(DisasContext *ctx, Packet *pkt)
 ctx->reg_log_idx = 0;
 bitmap_zero(ctx->regs_written, TOTAL_PER_THREAD_REGS);
 ctx->preg_log_idx = 0;
+bitmap_zero(ctx->pregs_written, NUM_PREGS);
 for (i = 0; i < STORES_MAX; i++) {
 ctx->store_width[i] = 0;
 }
@@ -226,7 +227,7 @@ static void mark_implicit_pred_write(DisasContext *ctx, Insn *insn,
 }
 }
 
-static void mark_implicit_writes(DisasContext *ctx, Insn *insn)
+static void mark_implicit_reg_writes(DisasContext *ctx, Insn *insn)
 {
 mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_FP,  HEX_REG_FP);
 mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_SP,  HEX_REG_SP);
@@ -235,7 +236,10 @@ static void mark_implicit_writes(DisasContext *ctx, Insn *insn)
 mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_SA0, HEX_REG_SA0);
 mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_LC1, HEX_REG_LC1);
 mark_implicit_reg_write(ctx, insn, A_IMPLICIT_WRITES_SA1, HEX_REG_SA1);
+}
 
+static void mark_implicit_pred_writes(DisasContext *ctx, Insn *insn)
+{
 mark_implicit_pred_write(ctx, insn, A_IMPLICIT_WRITES_P0, 0);
 mark_implicit_pred_write(ctx, insn, A_IMPLICIT_WRITES_P1, 1);
 mark_implicit_pred_write(ctx, insn, A_IMPLICIT_WRITES_P2, 2);
@@ -246,8 +250,9 @@ static void gen_insn(CPUHexagonState *env, DisasContext *ctx,
  Insn *insn, Packet *pkt)
 {
 if (insn->generate) {
-mark_implicit_writes(ctx, insn);
+mark_implicit_reg_writes(ctx, insn);
 insn->generate(env, ctx, insn, pkt);
+mark_implicit_pred_writes(ctx, insn);
 } else {
 gen_exception_end_tb(ctx, HEX_EXCP_INVALID_OPCODE);
 }
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index 12506c8..0ecfbd7 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -34,6 +34,7 @@ typedef struct DisasContext {
 DECLARE_BITMAP(regs_written,

[PATCH v4 01/26] Hexagon (target/hexagon) TCG generation cleanup

2021-04-08 Thread Taylor Simpson

Simplify TCG generation of hex_reg_written
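
As a rough illustration of why the simplification is safe, the sketch below
models the old movcond form and the new setcond+or form as plain C functions
and checks that they agree. The function names are hypothetical, and the check
assumes hex_reg_written[rnum] only ever holds 0 or 1; for other values the two
forms would differ.

```c
#include <assert.h>
#include <stdint.h>

/* Old form: written = (slot_mask == 0) ? 1 : written */
static uint32_t written_movcond(uint32_t written, uint32_t slot_mask)
{
    return (slot_mask == 0) ? 1 : written;
}

/* New form: t = (slot_mask == 0); written |= t */
static uint32_t written_setcond_or(uint32_t written, uint32_t slot_mask)
{
    uint32_t not_cancelled = (slot_mask == 0);

    return written | not_cancelled;
}

/* Compare both forms for every 0/1 flag value and a few slot masks. */
static void check_equivalence(void)
{
    static const uint32_t masks[] = { 0, 1, 2, 4, 8 };

    for (uint32_t written = 0; written <= 1; written++) {
        for (unsigned i = 0; i < sizeof(masks) / sizeof(masks[0]); i++) {
            assert(written_movcond(written, masks[i]) ==
                   written_setcond_or(written, masks[i]));
        }
    }
}
```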

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/genptr.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 7481f4c..87f5d92 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -35,7 +35,6 @@ static inline TCGv gen_read_preg(TCGv pred, uint8_t num)
 
 static inline void gen_log_predicated_reg_write(int rnum, TCGv val, int slot)
 {
-TCGv one = tcg_const_tl(1);
 TCGv zero = tcg_const_tl(0);
 TCGv slot_mask = tcg_temp_new();
 
@@ -43,12 +42,17 @@ static inline void gen_log_predicated_reg_write(int rnum, TCGv val, int slot)
 tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum], slot_mask, zero,
val, hex_new_value[rnum]);
 #if HEX_DEBUG
-/* Do this so HELPER(debug_commit_end) will know */
-tcg_gen_movcond_tl(TCG_COND_EQ, hex_reg_written[rnum], slot_mask, zero,
-   one, hex_reg_written[rnum]);
+/*
+ * Do this so HELPER(debug_commit_end) will know
+ *
+ * Note that slot_mask indicates the value is not written
+ * (i.e., slot was cancelled), so we create a true/false value before
+ * or'ing with hex_reg_written[rnum].
+ */
+tcg_gen_setcond_tl(TCG_COND_EQ, slot_mask, slot_mask, zero);
+tcg_gen_or_tl(hex_reg_written[rnum], hex_reg_written[rnum], slot_mask);
 #endif
 
-tcg_temp_free(one);
 tcg_temp_free(zero);
 tcg_temp_free(slot_mask);
 }
-- 
2.7.4

[PATCH v4 00/26] Hexagon (target/hexagon) update

2021-04-08 Thread Taylor Simpson

This patch series is a significant update for the Hexagon target
The first 16 patches address feedback from Richard Henderson
 and Philippe Mathieu-Daudé
The next 10 patches add the remaining instructions for the Hexagon
scalar core

The patches are logically independent but are organized as a series to
avoid potential conflicts if they are merged out of order.

Note that the new test cases require an updated toolchain/container.


*** Changes in v4 ***
Shorten TCG sequence in gen_read_ireg

*** Changes in v3 ***
Cleanup ternary operators in semantics to make them easier for idef-parser
Cleanup gen_log_predicated_reg_write_pair similar to gen_log_predicated_write
Cleanup reg_field_info definition (remove {0, 0} entry and include array size)
Move QEMU_GENERATE to only be on during macros.h
Compile all debug code so it doesn't bit rot
Fix circular addressing to handle negative increment

*** Changes in v2 ***
Address feedback from Richard Henderson 
Break utility function (arch.c) changes into 2 separate patches
Change bit-reverse addressing from TCG generation to helper
Change loadalign[bh] to use shift+deposit
Remove fGET_TCG_tmp
Remove unneeded ireg and tmp variables
Remove unused one variable from gen_log_predicated_reg_write
Rename gen_exception to gen_exception_raw
Remove unreachable tcg_gen_exit_tb
Remove redundant PC assignment
Remove TARGET_HEXAGON code from parts_silence_nan
Change roundrom to uint8_t in arch_recip_lookup and arch_invsqrt_lookup
Rewrite fGEN_TCG_addp_c/fGEN_TCG_subp_c using tcg_gen_add2_i64
Remove gen_carry_from_add64()
Break "instructions with multiple definitions" into multiple patches
Fix fINSERT_RANGE macro

Expand macros inside GET_EA_pci, GET_EA_pcr
Change fGEN_TCG_PCR to fGEN_TCG_LOAD_pcr to be consistent with other macros
Cleanup load and unpack implementation
Cleanup load into shifted register implementation
Cleanup brev.c test case
Change sfinvsqrta/sfrecipa to use a single helper
Cleanup vacsh helpers


Taylor Simpson (26):
  Hexagon (target/hexagon) TCG generation cleanup
  Hexagon (target/hexagon) cleanup gen_log_predicated_reg_write_pair
  Hexagon (target/hexagon) remove unnecessary inline directives
  Hexagon (target/hexagon) use env_archcpu and env_cpu
  Hexagon (target/hexagon) properly generate TB end for DISAS_NORETURN
  Hexagon (target/hexagon) decide if pred has been written at TCG gen
time
  Hexagon (target/hexagon) change variables from int to bool when
appropriate
  Hexagon (target/hexagon) remove unused carry_from_add64 function
  Hexagon (target/hexagon) change type of softfloat_roundingmodes
  Hexagon (target/hexagon) use softfloat default NaN and tininess
  Hexagon (target/hexagon) replace float32_mul_pow2 with float32_scalbn
  Hexagon (target/hexagon) use softfloat for float-to-int conversions
  Hexagon (target/hexagon) cleanup ternary operators in semantics
  Hexagon (target/hexagon) cleanup reg_field_info definition
  Hexagon (target/hexagon) move QEMU_GENERATE to only be on during
macros.h
  Hexagon (target/hexagon) compile all debug code
  Hexagon (target/hexagon) add F2_sfrecipa instruction
  Hexagon (target/hexagon) add F2_sfinvsqrta
  Hexagon (target/hexagon) add A5_ACS (vacsh)
  Hexagon (target/hexagon) add A6_vminub_RdP
  Hexagon (target/hexagon) add A4_addp_c/A4_subp_c
  Hexagon (target/hexagon) circular addressing
  Hexagon (target/hexagon) bit reverse (brev) addressing
  Hexagon (target/hexagon) load and unpack bytes instructions
  Hexagon (target/hexagon) load into shifted register instructions
  Hexagon (target/hexagon) CABAC decode bin

 fpu/softfloat-specialize.c.inc|   3 +
 linux-user/hexagon/cpu_loop.c |   2 +-
 target/hexagon/arch.c | 181 ++---
 target/hexagon/arch.h |   9 +-
 target/hexagon/conv_emu.c | 177 -
 target/hexagon/conv_emu.h |  31 ---
 target/hexagon/cpu.c  |  14 +-
 target/hexagon/cpu.h  |   5 -
 target/hexagon/cpu_bits.h |   2 +-
 target/hexagon/decode.c   |  80 +++---
 target/hexagon/fma_emu.c  |  40 +--
 target/hexagon/gen_tcg.h  | 420 -
 target/hexagon/gen_tcg_funcs.py   |   2 +-
 target/hexagon/genptr.c   | 244 ++---
 target/hexagon/helper.h   |  23 +-
 target/hexagon/imported/alu.idef  |  44 +++
 target/hexagon/imported/compare.idef  |  12 +-
 target/hexagon/imported/encode_pp.def |  30 +++
 target/hexagon/imported/float.idef|  32 +++
 target/hexagon/imported/ldst.idef |  68 +
 target/hexagon/imported/macros.def|  47 
 target/hexagon/imported/shift.idef|  47 
 target/hexagon/insn.h |  21 +-
 target/hexagon/internal.h |  11 +-
 target/hexagon/macros.h   | 118 -
 target/hexagon/meson.build|   1 -

[PATCH v4 07/26] Hexagon (target/hexagon) change variables from int to bool when appropriate

2021-04-08 Thread Taylor Simpson

Suggested-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Taylor Simpson 
---
 target/hexagon/cpu_bits.h  |  2 +-
 target/hexagon/decode.c| 80 +++---
 target/hexagon/insn.h  | 21 ++--
 target/hexagon/op_helper.c |  8 ++---
 target/hexagon/translate.c |  6 ++--
 target/hexagon/translate.h |  2 +-
 6 files changed, 60 insertions(+), 59 deletions(-)

diff --git a/target/hexagon/cpu_bits.h b/target/hexagon/cpu_bits.h
index 96af834..96fef71 100644
--- a/target/hexagon/cpu_bits.h
+++ b/target/hexagon/cpu_bits.h
@@ -47,7 +47,7 @@ static inline uint32_t iclass_bits(uint32_t encoding)
 return iclass;
 }
 
-static inline int is_packet_end(uint32_t endocing)
+static inline bool is_packet_end(uint32_t endocing)
 {
 uint32_t bits = parse_bits(endocing);
 return ((bits == 0x3) || (bits == 0x0));
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index 65d97ce..dffe1d1 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -340,8 +340,8 @@ static void decode_split_cmpjump(Packet *pkt)
 if (GET_ATTRIB(pkt->insn[i].opcode, A_NEWCMPJUMP)) {
 last = pkt->num_insns;
 pkt->insn[last] = pkt->insn[i];/* copy the instruction */
-pkt->insn[last].part1 = 1;/* last instruction does the CMP */
-pkt->insn[i].part1 = 0;/* existing instruction does the JUMP */
+pkt->insn[last].part1 = true;  /* last insn does the CMP */
+pkt->insn[i].part1 = false;/* existing insn does the JUMP */
 pkt->num_insns++;
 }
 }
@@ -354,7 +354,7 @@ static void decode_split_cmpjump(Packet *pkt)
 }
 }
 
-static int decode_opcode_can_jump(int opcode)
+static bool decode_opcode_can_jump(int opcode)
 {
 if ((GET_ATTRIB(opcode, A_JUMP)) ||
 (GET_ATTRIB(opcode, A_CALL)) ||
@@ -362,15 +362,15 @@ static int decode_opcode_can_jump(int opcode)
 (opcode == J2_pause)) {
 /* Exception to A_JUMP attribute */
 if (opcode == J4_hintjumpr) {
-return 0;
+return false;
 }
-return 1;
+return true;
 }
 
-return 0;
+return false;
 }
 
-static int decode_opcode_ends_loop(int opcode)
+static bool decode_opcode_ends_loop(int opcode)
 {
 return GET_ATTRIB(opcode, A_HWLOOP0_END) ||
GET_ATTRIB(opcode, A_HWLOOP1_END);
@@ -383,9 +383,9 @@ static void decode_set_insn_attr_fields(Packet *pkt)
 int numinsns = pkt->num_insns;
 uint16_t opcode;
 
-pkt->pkt_has_cof = 0;
-pkt->pkt_has_endloop = 0;
-pkt->pkt_has_dczeroa = 0;
+pkt->pkt_has_cof = false;
+pkt->pkt_has_endloop = false;
+pkt->pkt_has_dczeroa = false;
 
 for (i = 0; i < numinsns; i++) {
 opcode = pkt->insn[i].opcode;
@@ -394,14 +394,14 @@ static void decode_set_insn_attr_fields(Packet *pkt)
 }
 
 if (GET_ATTRIB(opcode, A_DCZEROA)) {
-pkt->pkt_has_dczeroa = 1;
+pkt->pkt_has_dczeroa = true;
 }
 
 if (GET_ATTRIB(opcode, A_STORE)) {
 if (pkt->insn[i].slot == 0) {
-pkt->pkt_has_store_s0 = 1;
+pkt->pkt_has_store_s0 = true;
 } else {
-pkt->pkt_has_store_s1 = 1;
+pkt->pkt_has_store_s1 = true;
 }
 }
 
@@ -422,9 +422,9 @@ static void decode_set_insn_attr_fields(Packet *pkt)
  */
 static void decode_shuffle_for_execution(Packet *packet)
 {
-int changed = 0;
+bool changed = false;
 int i;
-int flag;/* flag means we've seen a non-memory instruction */
+bool flag;/* flag means we've seen a non-memory instruction */
 int n_mems;
 int last_insn = packet->num_insns - 1;
 
@@ -437,7 +437,7 @@ static void decode_shuffle_for_execution(Packet *packet)
 }
 
 do {
-changed = 0;
+changed = false;
 /*
  * Stores go last, must not reorder.
  * Cannot shuffle stores past loads, either.
@@ -445,13 +445,13 @@ static void decode_shuffle_for_execution(Packet *packet)
  * then a store, shuffle the store to the front.  Don't shuffle
  * stores wrt each other or a load.
  */
-for (flag = n_mems = 0, i = last_insn; i >= 0; i--) {
+for (flag = false, n_mems = 0, i = last_insn; i >= 0; i--) {
 int opcode = packet->insn[i].opcode;
 
 if (flag && GET_ATTRIB(opcode, A_STORE)) {
 decode_send_insn_to(packet, i, last_insn - n_mems);
 n_mems++;
-changed = 1;
+changed = true;
 } else if (GET_ATTRIB(opcode, A_STORE)) {
 n_mems++;
 } else if (GET_ATTRIB(opcode, A_LOAD)) {
@@ -466,7 +466,7 @@ static void decode_shuffle_for_execution(Packet *packet)
  * a .new value
  */
 } else {
-

[PATCH v5 2/3] aspeed: Integrate HACE

2021-04-08 Thread Joel Stanley

Add the hash and crypto engine model to the Aspeed socs.

Reviewed-by: Andrew Jeffery 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Klaus Heinrich Kiwi 
Signed-off-by: Joel Stanley 
---
 docs/system/arm/aspeed.rst  |  1 -
 include/hw/arm/aspeed_soc.h |  3 +++
 hw/arm/aspeed_ast2600.c | 15 +++
 hw/arm/aspeed_soc.c | 16 
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/docs/system/arm/aspeed.rst b/docs/system/arm/aspeed.rst
index 23a1468cd175..a1911f940316 100644
--- a/docs/system/arm/aspeed.rst
+++ b/docs/system/arm/aspeed.rst
@@ -60,7 +60,6 @@ Missing devices
  * PWM and Fan Controller
  * Slave GPIO Controller
  * Super I/O Controller
- * Hash/Crypto Engine
  * PCI-Express 1 Controller
  * Graphic Display Controller
  * PECI Controller
diff --git a/include/hw/arm/aspeed_soc.h b/include/hw/arm/aspeed_soc.h
index 9359d6da336d..d9161d26d645 100644
--- a/include/hw/arm/aspeed_soc.h
+++ b/include/hw/arm/aspeed_soc.h
@@ -21,6 +21,7 @@
 #include "hw/rtc/aspeed_rtc.h"
 #include "hw/i2c/aspeed_i2c.h"
 #include "hw/ssi/aspeed_smc.h"
+#include "hw/misc/aspeed_hace.h"
 #include "hw/watchdog/wdt_aspeed.h"
 #include "hw/net/ftgmac100.h"
 #include "target/arm/cpu.h"
@@ -50,6 +51,7 @@ struct AspeedSoCState {
 AspeedTimerCtrlState timerctrl;
 AspeedI2CState i2c;
 AspeedSCUState scu;
+AspeedHACEState hace;
 AspeedXDMAState xdma;
 AspeedSMCState fmc;
 AspeedSMCState spi[ASPEED_SPIS_NUM];
@@ -133,6 +135,7 @@ enum {
 ASPEED_DEV_XDMA,
 ASPEED_DEV_EMMC,
 ASPEED_DEV_KCS,
+ASPEED_DEV_HACE,
 };
 
 #endif /* ASPEED_SOC_H */
diff --git a/hw/arm/aspeed_ast2600.c b/hw/arm/aspeed_ast2600.c
index 2a1255b6a042..e0fbb020c770 100644
--- a/hw/arm/aspeed_ast2600.c
+++ b/hw/arm/aspeed_ast2600.c
@@ -42,6 +42,7 @@ static const hwaddr aspeed_soc_ast2600_memmap[] = {
 [ASPEED_DEV_ETH2]  = 0x1E680000,
 [ASPEED_DEV_ETH4]  = 0x1E690000,
 [ASPEED_DEV_VIC]   = 0x1E6C0000,
+[ASPEED_DEV_HACE]  = 0x1E6D0000,
 [ASPEED_DEV_SDMC]  = 0x1E6E0000,
 [ASPEED_DEV_SCU]   = 0x1E6E2000,
 [ASPEED_DEV_XDMA]  = 0x1E6E7000,
@@ -102,6 +103,7 @@ static const int aspeed_soc_ast2600_irqmap[] = {
 [ASPEED_DEV_I2C]   = 110,   /* 110 -> 125 */
 [ASPEED_DEV_ETH1]  = 2,
 [ASPEED_DEV_ETH2]  = 3,
+[ASPEED_DEV_HACE]  = 4,
 [ASPEED_DEV_ETH3]  = 32,
 [ASPEED_DEV_ETH4]  = 33,
 [ASPEED_DEV_KCS]   = 138,   /* 138 -> 142 */
@@ -213,6 +215,9 @@ static void aspeed_soc_ast2600_init(Object *obj)
 TYPE_SYSBUS_SDHCI);
 
 object_initialize_child(obj, "lpc", &s->lpc, TYPE_ASPEED_LPC);
+
+snprintf(typename, sizeof(typename), "aspeed.hace-%s", socname);
+object_initialize_child(obj, "hace", &s->hace, typename);
 }
 
 /*
@@ -494,6 +499,16 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, Error **errp)
 sysbus_connect_irq(SYS_BUS_DEVICE(&s->lpc), 1 + aspeed_lpc_kcs_4,
    qdev_get_gpio_in(DEVICE(&s->a7mpcore),
 sc->irqmap[ASPEED_DEV_KCS] + aspeed_lpc_kcs_4));
+
+/* HACE */
+object_property_set_link(OBJECT(&s->hace), "dram", OBJECT(s->dram_mr),
+ &error_abort);
+if (!sysbus_realize(SYS_BUS_DEVICE(&s->hace), errp)) {
+return;
+}
+sysbus_mmio_map(SYS_BUS_DEVICE(&s->hace), 0, sc->memmap[ASPEED_DEV_HACE]);
+sysbus_connect_irq(SYS_BUS_DEVICE(&s->hace), 0,
+   aspeed_soc_get_irq(s, ASPEED_DEV_HACE));
 }
 
 static void aspeed_soc_ast2600_class_init(ObjectClass *oc, void *data)
diff --git a/hw/arm/aspeed_soc.c b/hw/arm/aspeed_soc.c
index 817f3ba63dfd..8ed29113f79f 100644
--- a/hw/arm/aspeed_soc.c
+++ b/hw/arm/aspeed_soc.c
@@ -34,6 +34,7 @@ static const hwaddr aspeed_soc_ast2400_memmap[] = {
 [ASPEED_DEV_VIC]    = 0x1E6C0000,
 [ASPEED_DEV_SDMC]   = 0x1E6E0000,
 [ASPEED_DEV_SCU]    = 0x1E6E2000,
+[ASPEED_DEV_HACE]   = 0x1E6E3000,
 [ASPEED_DEV_XDMA]   = 0x1E6E7000,
 [ASPEED_DEV_VIDEO]  = 0x1E700000,
 [ASPEED_DEV_ADC]= 0x1E6E9000,
@@ -65,6 +66,7 @@ static const hwaddr aspeed_soc_ast2500_memmap[] = {
 [ASPEED_DEV_VIC]    = 0x1E6C0000,
 [ASPEED_DEV_SDMC]   = 0x1E6E0000,
 [ASPEED_DEV_SCU]    = 0x1E6E2000,
+[ASPEED_DEV_HACE]   = 0x1E6E3000,
 [ASPEED_DEV_XDMA]   = 0x1E6E7000,
 [ASPEED_DEV_ADC]    = 0x1E6E9000,
 [ASPEED_DEV_VIDEO]  = 0x1E700000,
@@ -117,6 +119,7 @@ static const int aspeed_soc_ast2400_irqmap[] = {
 [ASPEED_DEV_ETH2]   = 3,
 [ASPEED_DEV_XDMA]   = 6,
 [ASPEED_DEV_SDHCI]  = 26,
+[ASPEED_DEV_HACE]   = 4,
 };
 
 #define aspeed_soc_ast2500_irqmap aspeed_soc_ast2400_irqmap
@@ -212,6 +215,9 @@ static void aspeed_soc_init(Object *obj)
 }
 
 object_initialize_child(obj, "lpc", &s->lpc, TYPE_ASPEED_LPC);
+
+snprintf(typename, sizeof(typename), "aspeed.hace-%s", socname);
+object_initialize_child(obj, "hace", &s->hace, typename);

[PATCH v5 1/3] hw: Model ASPEED's Hash and Crypto Engine

2021-04-08 Thread Joel Stanley

The HACE (Hash and Crypto Engine) is a device that offloads MD5, SHA1,
SHA2, RSA and other cryptographic algorithms.

This initial model implements a subset of the device's functionality:
currently only MD5/SHA hashing and, on the ast2600, the scatter-gather
engine.
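
To make the scatter-gather mode more concrete, here is a minimal sketch of
walking such a list. It is not the model's code. The entry layout (a 32-bit
length followed by a 32-bit address) and the "last entry" flag in bit 31 follow
the qtest in this series; SgEntry, EXAMPLE_LEN_MASK and sg_total_len are
hypothetical names, and the mask used here simply clears the flag bit, which is
an assumption (the real device may use a narrower length field). Byte-order
conversion of guest memory is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SG_LIST_LEN_LAST (1u << 31)           /* flag: final list entry */
#define EXAMPLE_LEN_MASK (~SG_LIST_LEN_LAST)  /* assumption, see above */

/* One scatter-gather descriptor: length (plus flag) and buffer address. */
struct SgEntry {
    uint32_t len;
    uint32_t addr;
};

/*
 * Sum the buffer lengths of a list, stopping after the entry that carries
 * the LAST flag or after max_entries, whichever comes first.
 */
static uint64_t sg_total_len(const struct SgEntry *list, size_t max_entries)
{
    uint64_t total = 0;

    assert(list != NULL);
    for (size_t i = 0; i < max_entries; i++) {
        total += list[i].len & EXAMPLE_LEN_MASK;
        if (list[i].len & SG_LIST_LEN_LAST) {
            break;
        }
    }
    return total;
}
```

A hashing loop would feed each buffer to the hash in list order instead of
summing lengths; the termination logic stays the same.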

Co-developed-by: Klaus Heinrich Kiwi 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Joel Stanley 
---
v3:
 - rebase on upstream to fix meson.build conflict
v2:
 - reorder register defines
 - mask src/dest/len registers according to hardware
v4:
 - Fix typos in comments
 - Remove sdram base address; new memory region fixes mean this is not
   required
 - Use PRIx64
 - Add Object Classes for soc family specific features
 - Convert big switch statement to a lookup in a struct
v5:
 - Support scatter gather mode
---
 docs/system/arm/aspeed.rst|   1 +
 include/hw/misc/aspeed_hace.h |  43 
 hw/misc/aspeed_hace.c | 389 ++
 hw/misc/meson.build   |   1 +
 4 files changed, 434 insertions(+)
 create mode 100644 include/hw/misc/aspeed_hace.h
 create mode 100644 hw/misc/aspeed_hace.c

diff --git a/docs/system/arm/aspeed.rst b/docs/system/arm/aspeed.rst
index d1fb8f25b39c..23a1468cd175 100644
--- a/docs/system/arm/aspeed.rst
+++ b/docs/system/arm/aspeed.rst
@@ -49,6 +49,7 @@ Supported devices
  * Ethernet controllers
  * Front LEDs (PCA9552 on I2C bus)
  * LPC Peripheral Controller (a subset of subdevices are supported)
+ * Hash/Crypto Engine (HACE) - Hash support only. TODO: HMAC and RSA
 
 
 Missing devices
diff --git a/include/hw/misc/aspeed_hace.h b/include/hw/misc/aspeed_hace.h
new file mode 100644
index 000000000000..94d5ada95fa2
--- /dev/null
+++ b/include/hw/misc/aspeed_hace.h
@@ -0,0 +1,43 @@
+/*
+ * ASPEED Hash and Crypto Engine
+ *
+ * Copyright (C) 2021 IBM Corp.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef ASPEED_HACE_H
+#define ASPEED_HACE_H
+
+#include "hw/sysbus.h"
+
+#define TYPE_ASPEED_HACE "aspeed.hace"
+#define TYPE_ASPEED_AST2400_HACE TYPE_ASPEED_HACE "-ast2400"
+#define TYPE_ASPEED_AST2500_HACE TYPE_ASPEED_HACE "-ast2500"
+#define TYPE_ASPEED_AST2600_HACE TYPE_ASPEED_HACE "-ast2600"
+OBJECT_DECLARE_TYPE(AspeedHACEState, AspeedHACEClass, ASPEED_HACE)
+
+#define ASPEED_HACE_NR_REGS (0x64 >> 2)
+
+struct AspeedHACEState {
+SysBusDevice parent;
+
+MemoryRegion iomem;
+qemu_irq irq;
+
+uint32_t regs[ASPEED_HACE_NR_REGS];
+
+MemoryRegion *dram_mr;
+AddressSpace dram_as;
+};
+
+
+struct AspeedHACEClass {
+SysBusDeviceClass parent_class;
+
+uint32_t src_mask;
+uint32_t dest_mask;
+uint32_t hash_mask;
+};
+
+#endif /* _ASPEED_HACE_H_ */
diff --git a/hw/misc/aspeed_hace.c b/hw/misc/aspeed_hace.c
new file mode 100644
index 000000000000..be7f99ea7947
--- /dev/null
+++ b/hw/misc/aspeed_hace.c
@@ -0,0 +1,389 @@
+/*
+ * ASPEED Hash and Crypto Engine
+ *
+ * Copyright (C) 2021 IBM Corp.
+ *
+ * Joel Stanley 
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/error-report.h"
+#include "hw/misc/aspeed_hace.h"
+#include "qapi/error.h"
+#include "migration/vmstate.h"
+#include "crypto/hash.h"
+#include "hw/qdev-properties.h"
+#include "hw/irq.h"
+
+#define R_CRYPT_CMD (0x10 / 4)
+
+#define R_STATUS    (0x1c / 4)
+#define HASH_IRQ    BIT(9)
+#define CRYPT_IRQ   BIT(12)
+#define TAG_IRQ     BIT(15)
+
+#define R_HASH_SRC      (0x20 / 4)
+#define R_HASH_DEST     (0x24 / 4)
+#define R_HASH_SRC_LEN  (0x2c / 4)
+
+#define R_HASH_CMD  (0x30 / 4)
+/* Hash algorithm selection */
+#define  HASH_ALGO_MASK             (BIT(4) | BIT(5) | BIT(6))
+#define  HASH_ALGO_MD5              0
+#define  HASH_ALGO_SHA1             BIT(5)
+#define  HASH_ALGO_SHA224           BIT(6)
+#define  HASH_ALGO_SHA256           (BIT(4) | BIT(6))
+#define  HASH_ALGO_SHA512_SERIES    (BIT(5) | BIT(6))
+/* SHA512 algorithm selection */
+#define  SHA512_HASH_ALGO_MASK      (BIT(10) | BIT(11) | BIT(12))
+#define  HASH_ALGO_SHA512_SHA512    0
+#define  HASH_ALGO_SHA512_SHA384    BIT(10)
+#define  HASH_ALGO_SHA512_SHA256    BIT(11)
+#define  HASH_ALGO_SHA512_SHA224    (BIT(10) | BIT(11))
+/* HMAC modes */
+#define  HASH_HMAC_MASK             (BIT(7) | BIT(8))
+#define  HASH_DIGEST                0
+#define  HASH_DIGEST_HMAC           BIT(7)
+#define  HASH_DIGEST_ACCUM          BIT(8)
+#define  HASH_HMAC_KEY              (BIT(7) | BIT(8))
+/* Cascaded operation modes */
+#define  HASH_ONLY                  0
+#define  HASH_ONLY2                 BIT(0)
+#define  HASH_CRYPT_THEN_HASH       BIT(1)
+#define  HASH_HASH_THEN_CRYPT       (BIT(0) | BIT(1))
+/* Other cmd bits */
+#define  HASH_IRQ_EN                BIT(9)
+#define  HASH_SG_EN                 BIT(18)
+/* Scatter-gather data list */
+#define SG_LIST_LEN_SIZE            4
+#define SG_LIST_LEN_MASK

[PATCH v5 3/3] tests/qtest: Add test for Aspeed HACE

2021-04-08 Thread Joel Stanley

This adds a test for the Aspeed Hash and Crypto (HACE) engine. It tests
the currently implemented behavior of the hash functionality.

The tests are similar, but are cut/pasted instead of broken out into a
common function so the assert machinery produces useful output when a
test fails.

Co-developed-by: Cédric Le Goater 
Co-developed-by: Klaus Heinrich Kiwi 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Klaus Heinrich Kiwi 
Acked-by: Thomas Huth 
Signed-off-by: Joel Stanley 
---
v3: Write test without libqtest-single.h
v4: Run tests on all aspeed machines
v5: Add scatter gather test
---
 tests/qtest/aspeed_hace-test.c | 469 +
 MAINTAINERS|   1 +
 tests/qtest/meson.build|   3 +
 3 files changed, 473 insertions(+)
 create mode 100644 tests/qtest/aspeed_hace-test.c

diff --git a/tests/qtest/aspeed_hace-test.c b/tests/qtest/aspeed_hace-test.c
new file mode 100644
index 000000000000..09ee31545e41
--- /dev/null
+++ b/tests/qtest/aspeed_hace-test.c
@@ -0,0 +1,469 @@
+/*
+ * QTest testcase for the ASPEED Hash and Crypto Engine
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright 2021 IBM Corp.
+ */
+
+#include "qemu/osdep.h"
+
+#include "libqos/libqtest.h"
+#include "qemu-common.h"
+#include "qemu/bitops.h"
+
+#define HACE_CMD             0x10
+#define  HACE_SHA_BE_EN      BIT(3)
+#define  HACE_MD5_LE_EN      BIT(2)
+#define  HACE_ALGO_MD5       0
+#define  HACE_ALGO_SHA1      BIT(5)
+#define  HACE_ALGO_SHA224    BIT(6)
+#define  HACE_ALGO_SHA256    (BIT(4) | BIT(6))
+#define  HACE_ALGO_SHA512    (BIT(5) | BIT(6))
+#define  HACE_ALGO_SHA384    (BIT(5) | BIT(6) | BIT(10))
+#define  HACE_SG_EN          BIT(18)
+
+#define HACE_STS             0x1c
+#define  HACE_RSA_ISR        BIT(13)
+#define  HACE_CRYPTO_ISR     BIT(12)
+#define  HACE_HASH_ISR       BIT(9)
+#define  HACE_RSA_BUSY       BIT(2)
+#define  HACE_CRYPTO_BUSY    BIT(1)
+#define  HACE_HASH_BUSY      BIT(0)
+#define HACE_HASH_SRC        0x20
+#define HACE_HASH_DIGEST     0x24
+#define HACE_HASH_KEY_BUFF   0x28
+#define HACE_HASH_DATA_LEN   0x2c
+#define HACE_HASH_CMD        0x30
+/* Scatter-Gather Hash */
+#define SG_LIST_LEN_LAST     BIT(31)
+struct AspeedSgList {
+uint32_t len;
+uint32_t addr;
+} __attribute__ ((__packed__));
+
+/*
+ * Test vector is the ascii "abc"
+ *
+ * Expected results were generated using command line utitiles:
+ *
+ *  echo -n -e 'abc' | dd of=/tmp/test
+ *  for hash in sha512sum sha256sum md5sum; do $hash /tmp/test; done
+ *
+ */
+static const uint8_t test_vector[] = {0x61, 0x62, 0x63};
+
+static const uint8_t test_result_sha512[] = {
+0xdd, 0xaf, 0x35, 0xa1, 0x93, 0x61, 0x7a, 0xba, 0xcc, 0x41, 0x73, 0x49,
+0xae, 0x20, 0x41, 0x31, 0x12, 0xe6, 0xfa, 0x4e, 0x89, 0xa9, 0x7e, 0xa2,
+0x0a, 0x9e, 0xee, 0xe6, 0x4b, 0x55, 0xd3, 0x9a, 0x21, 0x92, 0x99, 0x2a,
+0x27, 0x4f, 0xc1, 0xa8, 0x36, 0xba, 0x3c, 0x23, 0xa3, 0xfe, 0xeb, 0xbd,
+0x45, 0x4d, 0x44, 0x23, 0x64, 0x3c, 0xe8, 0x0e, 0x2a, 0x9a, 0xc9, 0x4f,
+0xa5, 0x4c, 0xa4, 0x9f};
+
+static const uint8_t test_result_sha256[] = {
+0xba, 0x78, 0x16, 0xbf, 0x8f, 0x01, 0xcf, 0xea, 0x41, 0x41, 0x40, 0xde,
+0x5d, 0xae, 0x22, 0x23, 0xb0, 0x03, 0x61, 0xa3, 0x96, 0x17, 0x7a, 0x9c,
+0xb4, 0x10, 0xff, 0x61, 0xf2, 0x00, 0x15, 0xad};
+
+static const uint8_t test_result_md5[] = {
+0x90, 0x01, 0x50, 0x98, 0x3c, 0xd2, 0x4f, 0xb0, 0xd6, 0x96, 0x3f, 0x7d,
+0x28, 0xe1, 0x7f, 0x72};
+
+/*
+ * The Scatter-Gather Test vector is the ascii "abc" "def" "ghi", broken
+ * into blocks of 3 characters as shown
+ *
+ * Expected results were generated using command line utitiles:
+ *
+ *  echo -n -e 'abcdefghijkl' | dd of=/tmp/test
+ *  for hash in sha512sum sha256sum; do $hash /tmp/test; done
+ *
+ */
+static const uint8_t test_vector_sg1[] = {0x61, 0x62, 0x63, 0x64, 0x65, 0x66};
+static const uint8_t test_vector_sg2[] = {0x67, 0x68, 0x69};
+static const uint8_t test_vector_sg3[] = {0x6a, 0x6b, 0x6c};
+
+static const uint8_t test_result_sg_sha512[] = {
+0x17, 0x80, 0x7c, 0x72, 0x8e, 0xe3, 0xba, 0x35, 0xe7, 0xcf, 0x7a, 0xf8,
+0x23, 0x11, 0x6d, 0x26, 0xe4, 0x1e, 0x5d, 0x4d, 0x6c, 0x2f, 0xf1, 0xf3,
+0x72, 0x0d, 0x3d, 0x96, 0xaa, 0xcb, 0x6f, 0x69, 0xde, 0x64, 0x2e, 0x63,
+0xd5, 0xb7, 0x3f, 0xc3, 0x96, 0xc1, 0x2b, 0xe3, 0x8b, 0x2b, 0xd5, 0xd8,
+0x84, 0x25, 0x7c, 0x32, 0xc8, 0xf6, 0xd0, 0x85, 0x4a, 0xe6, 0xb5, 0x40,
+0xf8, 0x6d, 0xda, 0x2e};
+
+static const uint8_t test_result_sg_sha256[] = {
+0xd6, 0x82, 0xed, 0x4c, 0xa4, 0xd9, 0x89, 0xc1, 0x34, 0xec, 0x94, 0xf1,
+0x55, 0x1e, 0x1e, 0xc5, 0x80, 0xdd, 0x6d, 0x5a, 0x6e, 0xcd, 0xe9, 0xf3,
+0xd3, 0x5e, 0x6e, 0x4a, 0x71, 0x7f, 0xbd, 0xe4};
+
+
+static void write_regs(QTestState *s, uint32_t base, uint32_t src,
+   uint32_t length, uint32_t out, uint32_t method)
+{
+qtest_writel(s, base + HACE_HASH_SRC,

[PATCH v5 0/3] hw/misc: Model ASPEED hash and crypto engine

2021-04-08 Thread Joel Stanley

This version of the series adds the cleanups Cédric made and the scatter
gather feature that Klaus implemented. I took inspiration from Klaus's
patches and reworked the direct hashing mode to make it easier to
implement both sg and direct modes.

The r-b tags are preserved as the changes were minor. I welcome further
review though if you have time.

v5: Merge scatter gather feature
v4: Rebase on Philippe's memory region cleanup series [1]
Address feedback from Cédric
Rework qtest to run on ast2400, ast2500 and ast2600
v3: Rework qtest to not use libqtest-single.h, rebase to avoid LPC
conflicts.
v2: Address review from Andrew and Philippe. Adds a qtest.

[1] https://lore.kernel.org/qemu-devel/20210312182851.1922972-1-f4...@amsat.org/

This adds a model for the ASPEED hash and crypto engine (HACE) found on
all supported ASPEED SoCs.

The model uses QEMU's qcrypto API to perform the SHA and MD5 hashing
directly in the machine's emulated memory space, which I found a neat
use of QEMU's features.

It has been tested using u-boot and from Linux userspace, and adds a
qtest for the model running as part of the ast2600-evb, ast2500-evb and
palmetto-bmc (to test ast2400) machines.

Note that the tests will fail without Philippe/Cédric's memory region series.

Joel Stanley (3):
  hw: Model ASPEED's Hash and Crypto Engine
  aspeed: Integrate HACE
  tests/qtest: Add test for Aspeed HACE

 docs/system/arm/aspeed.rst |   2 +-
 include/hw/arm/aspeed_soc.h|   3 +
 include/hw/misc/aspeed_hace.h  |  43 +++
 hw/arm/aspeed_ast2600.c|  15 ++
 hw/arm/aspeed_soc.c|  16 ++
 hw/misc/aspeed_hace.c  | 389 +++
 tests/qtest/aspeed_hace-test.c | 469 +
 MAINTAINERS|   1 +
 hw/misc/meson.build|   1 +
 tests/qtest/meson.build|   3 +
 10 files changed, 941 insertions(+), 1 deletion(-)
 create mode 100644 include/hw/misc/aspeed_hace.h
 create mode 100644 hw/misc/aspeed_hace.c
 create mode 100644 tests/qtest/aspeed_hace-test.c

-- 
2.30.2

Re: [PATCH 2/3] vhost-blk: Add vhost-blk-common abstraction

2021-04-08 Thread Raphael Norwitz

I'm mostly happy with this. Just some asks on variable renaming and
comments which need to be fixed because of how you've moved things
around.

Also let's add a MAINTAINERS entry for vhost-blk-common.h/c, either under
vhost-user-blk or create a new vhost-blk entry. I'm not sure what the
best practices are for this. 

On Thu, Apr 08, 2021 at 06:12:51PM +0800, Xie Yongji wrote:
> This commit abstracts part of vhost-user-blk into a common
> parent class which is useful for the introducation of vhost-vdpa-blk.
> 
> Signed-off-by: Xie Yongji 
> ---
>  hw/block/meson.build |   2 +-
>  hw/block/vhost-blk-common.c  | 291 +
>  hw/block/vhost-user-blk.c| 306 +--
>  hw/virtio/vhost-user-blk-pci.c   |   7 +-
>  include/hw/virtio/vhost-blk-common.h |  50 +
>  include/hw/virtio/vhost-user-blk.h   |  20 +-
>  6 files changed, 396 insertions(+), 280 deletions(-)
>  create mode 100644 hw/block/vhost-blk-common.c
>  create mode 100644 include/hw/virtio/vhost-blk-common.h
> 
> diff --git a/hw/block/meson.build b/hw/block/meson.build
> index 5b4a7699f9..5862bda4cb 100644
> --- a/hw/block/meson.build
> +++ b/hw/block/meson.build
> @@ -16,6 +16,6 @@ softmmu_ss.add(when: 'CONFIG_TC58128', if_true: 
> files('tc58128.c'))
>  softmmu_ss.add(when: 'CONFIG_NVME_PCI', if_true: files('nvme.c', 
> 'nvme-ns.c', 'nvme-subsys.c', 'nvme-dif.c'))
>  
>  specific_ss.add(when: 'CONFIG_VIRTIO_BLK', if_true: files('virtio-blk.c'))
> -specific_ss.add(when: 'CONFIG_VHOST_USER_BLK', if_true: 
> files('vhost-user-blk.c'))
> +specific_ss.add(when: 'CONFIG_VHOST_USER_BLK', if_true: 
> files('vhost-blk-common.c', 'vhost-user-blk.c'))
>  
>  subdir('dataplane')
> diff --git a/hw/block/vhost-blk-common.c b/hw/block/vhost-blk-common.c
> new file mode 100644
> index 0000000000..96500f6c89
> --- /dev/null
> +++ b/hw/block/vhost-blk-common.c
> @@ -0,0 +1,291 @@
> +/*
> + * Parent class for vhost based block devices
> + *
> + * Copyright (C) 2021 Bytedance Inc. and/or its affiliates. All rights 
> reserved.
> + *
> + * Author:
> + *   Xie Yongji 
> + *
> + * Heavily based on the vhost-user-blk.c by:
> + *   Changpeng Liu 

You should probably also give credit to Felipe, Stefan and Nicholas, as
a lot of vhost-user-blk originally came from their work.

> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/error-report.h"
> +#include "qemu/cutils.h"
> +#include "hw/qdev-core.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/qdev-properties-system.h"
> +#include "hw/virtio/vhost.h"
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-bus.h"
> +#include "hw/virtio/virtio-access.h"
> +#include "hw/virtio/vhost-blk-common.h"
> +#include "sysemu/sysemu.h"
> +#include "sysemu/runstate.h"
> +
> +static void vhost_blk_common_update_config(VirtIODevice *vdev, uint8_t 
> *config)
> +{
> +VHostBlkCommon *vbc = VHOST_BLK_COMMON(vdev);
> +
> +/* Our num_queues overrides the device backend */
> +virtio_stw_p(vdev, &vbc->blkcfg.num_queues, vbc->num_queues);
> +
> +memcpy(config, &vbc->blkcfg, sizeof(struct virtio_blk_config));
> +}
> +
> +static void vhost_blk_common_set_config(VirtIODevice *vdev,
> +const uint8_t *config)
> +{
> +VHostBlkCommon *vbc = VHOST_BLK_COMMON(vdev);
> +struct virtio_blk_config *blkcfg = (struct virtio_blk_config *)config;
> +int ret;
> +
> +if (blkcfg->wce == vbc->blkcfg.wce) {
> +return;
> +}
> +
> +ret = vhost_dev_set_config(&vbc->dev, &blkcfg->wce,
> +   offsetof(struct virtio_blk_config, wce),
> +   sizeof(blkcfg->wce),
> +   VHOST_SET_CONFIG_TYPE_MASTER);
> +if (ret) {
> +error_report("set device config space failed");
> +return;
> +}
> +
> +vbc->blkcfg.wce = blkcfg->wce;
> +}
> +
> +static int vhost_blk_common_handle_config_change(struct vhost_dev *dev)
> +{
> +VHostBlkCommon *vbc = VHOST_BLK_COMMON(dev->vdev);
> +struct virtio_blk_config blkcfg;
> +int ret;
> +
> +ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg,
> +   sizeof(struct virtio_blk_config));
> +if (ret < 0) {
> +error_report("get config space failed");
> +return ret;
> +}
> +
> +/* valid for resize only */
> +if (blkcfg.capacity != vbc->blkcfg.capacity) {
> +vbc->blkcfg.capacity = blkcfg.capacity;
> +memcpy(dev->vdev->config, &vbc->blkcfg,
> +   sizeof(struct virtio_blk_config));
> +virtio_notify_config(dev->vdev);
> +}
> +
> +return 0;
> +}
> +
> +const VhostDevConfigOps blk_ops = {
> +.vhost_dev_config_notifier = vhost_blk_common_handle_config_change,
> +};
> +
> +static uint64_t vhost_blk_common_get_features(VirtIODevice *vdev,
>

Re: [PATCH] hw/block/nvme: map prp fix if prp2 contains non-zero offset

2021-04-08 Thread Keith Busch

On Thu, Apr 08, 2021 at 09:53:13PM +0530, Padmakar Kalghatgi wrote:
> +/*
> + *   The first PRP list entry, pointed by PRP2 can contain
> + *   offsets. Hence, we need calculate the no of entries in
> + *   prp2 based on the offset it has.
> + */

This comment has some unnecessary spacing at the beginning.

> +nents = (n->page_size - (prp2 % n->page_size)) >> 3;

page_size is always a power of two, so let's replace the costly modulo
with:

nents = (n->page_size - (prp2 & (n->page_size - 1))) >> 3;
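
To make the equivalence concrete, here is a small Python model of that
calculation (a sketch only; prp_list_entries is a made-up name and the
8-byte entry size is taken from the ">> 3" above):

```python
PRP_ENTRY_SIZE_SHIFT = 3  # each PRP entry is 8 bytes


def prp_list_entries(prp2, page_size):
    """Entries available from prp2 up to the end of its memory page."""
    # The mask form is only valid when page_size is a power of two.
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page_size must be a power of two")
    offset = prp2 & (page_size - 1)  # equals prp2 % page_size
    return (page_size - offset) >> PRP_ENTRY_SIZE_SHIFT
```

With a 4096-byte page, an aligned prp2 yields a full page of entries and each
8 bytes of offset removes one entry.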

Re: [PATCH v2] i386: Add missing cpu feature bits in EPYC-Rome model

2021-04-08 Thread Eduardo Habkost

On Thu, Apr 08, 2021 at 10:28:21AM -0500, Babu Moger wrote:
> 
> 
> > -Original Message-
> > From: Christian Ehrhardt 
> > Sent: Thursday, April 1, 2021 3:06 AM
> > To: david.edmond...@oracle.com
> > Cc: Moger, Babu ; Paolo Bonzini
> > ; Richard Henderson
> > ; Eduardo Habkost
> > ; pankaj.gu...@cloud.ionos.com
> > Subject: Re: [PATCH v2] i386: Add missing cpu feature bits in EPYC-Rome
> > model
> > 
> > On Wed, Mar 3, 2021 at 5:24 PM  wrote:
> > >
> > > On Wednesday, 2021-03-03 at 09:45:30 -06, Babu Moger wrote:
> > >
> > > > Found the following cpu feature bits missing from EPYC-Rome model.
> > > > ibrs: Indirect Branch Restricted Speculation
> > > > ssbd: Speculative Store Bypass Disable
> > > >
> > > > These new features will be added in EPYC-Rome-v2. The -cpu help
> > > > output after the change.
> > > >
> > > > x86 EPYC-Rome (alias configured by machine type)
> > > > x86 EPYC-Rome-v1  AMD EPYC-Rome Processor
> > > > x86 EPYC-Rome-v2  AMD EPYC-Rome Processor
> > > >
> > > > Reported-by: Pankaj Gupta 
> > > > Signed-off-by: Babu Moger 
> > > > Signed-off-by: Pankaj Gupta 
> > >
> > > Reviewed-by: David Edmondson 
> > 
> > Hi,
> > this change/discussion seems as it was good back then but I realized it 
> > wasn't
> > applied in git yet.
> > Was there a different thread discussing what holds it back that I could not 
> > yet
> > find?
> > Since we are already in v6.0.0-rc1 the window to get it in shrinks, so I 
> > wanted
> > to give this a gentle ping.
> 
> Eduardo,
>  Do you have any concerns with these patches?  It is also fixing another
> problem reported here.
> https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1915063
> Can you please pull these changes?

I'm so sorry for missing this when it was submitted in March.
I'm queueing it right now and I'm going to submit a pull request
very soon, for -rc4.

-- 
Eduardo

[PATCH 2/2] spapr.h: increase FDT_MAX_SIZE

2021-04-08 Thread Daniel Henrique Barboza

Certain SMP topologies, e.g. 1 thread/core, 2048 cores and
1 socket, stress the current maximum size of the pSeries FDT:

Calling ibm,client-architecture-support...qemu-system-ppc64: error
creating device tree: (fdt_setprop(fdt, offset,
"ibm,processor-segment-sizes", segs, sizeof(segs))): FDT_ERR_NOSPACE

2048 is the default NR_CPUS value for the pSeries kernel. It's expected
that users will want QEMU to be able to handle this kind of
configuration.

Bumping FDT_MAX_SIZE to 2MB is enough for these setups to be created.

Signed-off-by: Daniel Henrique Barboza 
---
 include/hw/ppc/spapr.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index bf7cab7a2c..3deb382678 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -95,7 +95,7 @@ typedef enum {
 #define SPAPR_CAP_FIXED_CCD 0x03
 #define SPAPR_CAP_FIXED_NA  0x10 /* Lets leave a bit of a gap... */
 
-#define FDT_MAX_SIZE                    0x100000
+#define FDT_MAX_SIZE                    0x200000
 
 /*
  * NUMA related macros. MAX_DISTANCE_REF_POINTS was taken
-- 
2.30.2

[PATCH 1/2] spapr.c: do not use MachineClass::max_cpus to limit CPUs

2021-04-08 Thread Daniel Henrique Barboza

Up to this patch, the 'max_cpus' value is hardcoded to 1024 (commit
6244bb7e5811). In theory this patch would simply bump it to 2048, since
it's the default NR_CPUS kernel setting for ppc64 servers nowadays, but
the whole mechanism of MachineClass::max_cpus is flawed for the pSeries
machine. The two supported accelerators, KVM and TCG, can live without
it.

TCG guests don't have a theoretical limit. The user must be free to
emulate as many CPUs as the hardware is capable of. And even if there
were a limit, max_cpus is not the proper way to report it since it's a
common value checked by SMP code in machine_smp_parse() for KVM as well.

For KVM guests, the proper way to limit KVM CPUs is by host
configuration via NR_CPUS, not a QEMU hardcoded value. There is no
technical reason for a pSeries QEMU guest to forcefully stay below
NR_CPUS.

This hardcoded value also disregards hosts that might have a lower
NR_CPUS limit, say 512. In this case, machine.c:machine_smp_parse() will
allow a 1024 value to pass, but then kvm_init() will complain about it
because it will exceed NR_CPUS:

Number of SMP cpus requested (1024) exceeds the maximum cpus supported
by KVM (512)

A better 'max_cpus' value would consider host settings, but
MachineClass::max_cpus is defined well before machine_init() and
kvm_init(). We can't check for KVM limits because it's too soon, so we
end up making a guess.

This patch makes the MachineClass::max_cpus setting innocuous by setting it
to INT32_MAX. machine.c:machine_smp_parse() will not fail the
verification based on max_cpus, letting kvm_init() do the checking with
actual host settings. And TCG guests get to do whatever the hardware is
capable of emulating.
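
The two-stage validation described above can be modelled roughly as follows.
This is a Python sketch under stated assumptions, not QEMU code: the function
names and error strings are made up, and the two checks merely stand in for
machine_smp_parse() and kvm_init().

```python
INT32_MAX = 2**31 - 1


def check_machine_limit(requested, max_cpus):
    """Early, machine-level check (stands in for machine_smp_parse())."""
    if requested > max_cpus:
        raise ValueError(f"{requested} CPUs exceeds machine max_cpus {max_cpus}")


def check_kvm_limit(requested, host_limit):
    """Later, host-aware check (stands in for kvm_init())."""
    if requested > host_limit:
        raise ValueError(f"{requested} CPUs exceeds KVM host limit {host_limit}")


def validate_cpus(requested, max_cpus, kvm_host_limit=None):
    """Run both checks; kvm_host_limit=None models a TCG guest."""
    check_machine_limit(requested, max_cpus)
    if kvm_host_limit is not None:
        check_kvm_limit(requested, kvm_host_limit)
    return requested
```

With max_cpus set to INT32_MAX the first check never rejects a realistic
request, so the host-aware check is the only one that can fail.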

Signed-off-by: Daniel Henrique Barboza 
---
 hw/ppc/spapr.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 73a06df3b1..d6a67da21f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4482,7 +4482,16 @@ static void spapr_machine_class_init(ObjectClass *oc, 
void *data)
 mc->init = spapr_machine_init;
 mc->reset = spapr_machine_reset;
 mc->block_default_type = IF_SCSI;
-mc->max_cpus = 1024;
+
+/*
+ * Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values
+ * should be limited by the host capability instead of hardcoded.
+ * max_cpus for KVM guests will be checked in kvm_init(), and TCG
+ * guests are welcome to have as many CPUs as the host is capable
+ * of emulating.
+ */
+mc->max_cpus = INT32_MAX;
+
 mc->no_parallel = 1;
 mc->default_boot_order = "";
 mc->default_ram_size = 512 * MiB;
-- 
2.30.2

[PATCH 0/2] ppc64: do not use MachineClass::max_cpus to limit CPUs

2021-04-08 Thread Daniel Henrique Barboza

Hello,

After having to change hardcoded values to launch a 2048-CPU KVM
pSeries guest I decided to post these upstream because, at
least for me, the current max_cpus usage is lackluster for
pSeries. More info in patch 01.

Patch 02 is a trivial follow-up to increase the FDT size.

Daniel Henrique Barboza (2):
  spapr.c: do not use MachineClass::max_cpus to limit CPUs
  spapr.h: increase FDT_MAX_SIZE

 hw/ppc/spapr.c | 11 ++-
 include/hw/ppc/spapr.h |  2 +-
 2 files changed, 11 insertions(+), 2 deletions(-)

-- 
2.30.2

[RFC PATCH 2/5] io/net-listener: Call the notifier during finalize

2021-04-08 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Call the notifier during finalize; it's currently only called
if we change it, which is not the intent.

Signed-off-by: Dr. David Alan Gilbert 
---
 io/net-listener.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/io/net-listener.c b/io/net-listener.c
index 46c2643d00..1c984d69c6 100644
--- a/io/net-listener.c
+++ b/io/net-listener.c
@@ -292,6 +292,9 @@ static void qio_net_listener_finalize(Object *obj)
 QIONetListener *listener = QIO_NET_LISTENER(obj);
 size_t i;
 
+if (listener->io_notify) {
+listener->io_notify(listener->io_data);
+}
 qio_net_listener_disconnect(listener);
 
 for (i = 0; i < listener->nsioc; i++) {
-- 
2.31.1

[RFC PATCH 0/5] mptcp support

2021-04-08 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Hi,
  This RFC set adds support for multipath TCP (mptcp),
in particular on the migration path - but should be extensible
to other users.

  Multipath-tcp is a bit like bonding, but at L3; you can use
it to handle failure, but can also use it to split traffic across
multiple interfaces.

  Using a pair of 10Gb interfaces, I've managed to get 19Gbps
(with the only tuning being using huge pages and turning the MTU up).

  It needs a bleeding-edge Linux kernel (in some older ones you get
false accept messages for the subflows), and a C lib that has the
constants defined (as current glibc does).

  To use it you just need to append ,mptcp to an address;

  -incoming tcp:0:,mptcp
  migrate -d tcp:192.168.11.20:,mptcp
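
  As a rough model of how a trailing flag such as ,mptcp can be picked
out of the option part of an address, here is a Python sketch. It is not
the parser in util/qemu-sockets.c; parse_flag is a made-up name, and
accepting the =on/=off forms is an assumption about how such flags behave.

```python
def parse_flag(optstr, name):
    """Return True/False if ',name[=on|off]' appears in optstr, else None."""
    for opt in optstr.split(","):
        if opt in (name, name + "=on"):
            return True
        if opt == name + "=off":
            return False
        if opt.startswith(name + "="):
            # Anything other than on/off is rejected.
            raise ValueError(f"invalid value in option {opt!r}")
    return None
```

  A return value of None means the flag was not given at all, which
corresponds to leaving the has_mptcp field unset.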

  I had a quick go at trying NBD as well, but I think it needs
some work with the parsing of NBD addresses.

  All comments welcome.

Dave

Dr. David Alan Gilbert (5):
  channel-socket: Only set CLOEXEC if we have space for fds
  io/net-listener: Call the notifier during finalize
  migration: Add cleanup hook for inwards migration
  migration/socket: Close the listener at the end
  sockets: Support multipath TCP

 io/channel-socket.c   |  8 
 io/dns-resolver.c |  2 ++
 io/net-listener.c |  3 +++
 migration/migration.c |  3 +++
 migration/migration.h |  4 
 migration/multifd.c   |  5 +
 migration/socket.c| 24 ++--
 qapi/sockets.json |  5 -
 util/qemu-sockets.c   | 34 ++
 9 files changed, 77 insertions(+), 11 deletions(-)

-- 
2.31.1

[RFC PATCH 5/5] sockets: Support multipath TCP

2021-04-08 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Multipath TCP allows combining multiple interfaces/routes into a single
socket, with very little work for the user/admin.

It's enabled by 'mptcp' on most socket addresses:

   ./qemu-system-x86_64 -nographic -incoming tcp:0:,mptcp
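
For readers unfamiliar with MPTCP sockets, the protocol selection can be
sketched in Python as below. This is an illustration, not the patch's C code:
select_protocol and MPTCP_PROTO are made-up names, and whether the socket
module exposes IPPROTO_MPTCP depends on the Python version and platform.

```python
import socket

# IPPROTO_MPTCP is only exposed by newer Python versions on Linux.
MPTCP_PROTO = getattr(socket, "IPPROTO_MPTCP", None)


def select_protocol(want_mptcp, mptcp_proto=MPTCP_PROTO):
    """Pick the protocol number to pass to socket()."""
    if not want_mptcp:
        return socket.IPPROTO_TCP
    if mptcp_proto is None:
        # Mirrors the patch's "MPTCP unavailable in this build" error.
        raise RuntimeError("MPTCP unavailable in this build")
    return mptcp_proto
```

A caller would then create the socket with
socket.socket(family, socket.SOCK_STREAM, select_protocol(True)); the kernel
may still refuse it if MPTCP is disabled on the host.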

Signed-off-by: Dr. David Alan Gilbert 
---
 io/dns-resolver.c   |  2 ++
 qapi/sockets.json   |  5 -
 util/qemu-sockets.c | 34 ++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/io/dns-resolver.c b/io/dns-resolver.c
index 743a0efc87..b081e098bb 100644
--- a/io/dns-resolver.c
+++ b/io/dns-resolver.c
@@ -122,6 +122,8 @@ static int qio_dns_resolver_lookup_sync_inet(QIODNSResolver 
*resolver,
 .ipv4 = iaddr->ipv4,
 .has_ipv6 = iaddr->has_ipv6,
 .ipv6 = iaddr->ipv6,
+.has_mptcp = iaddr->has_mptcp,
+.mptcp = iaddr->mptcp,
 };
 
 (*addrs)[i] = newaddr;
diff --git a/qapi/sockets.json b/qapi/sockets.json
index 2e83452797..43122a38bf 100644
--- a/qapi/sockets.json
+++ b/qapi/sockets.json
@@ -57,6 +57,8 @@
 # @keep-alive: enable keep-alive when connecting to this socket. Not supported
 #  for passive sockets. (Since 4.2)
 #
+# @mptcp: enable multi-path TCP. (Since 6.0)
+#
 # Since: 1.3
 ##
 { 'struct': 'InetSocketAddress',
@@ -66,7 +68,8 @@
 '*to': 'uint16',
 '*ipv4': 'bool',
 '*ipv6': 'bool',
-'*keep-alive': 'bool' } }
+'*keep-alive': 'bool',
+'*mptcp': 'bool' } }
 
 ##
 # @UnixSocketAddress:
diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 8af0278f15..72527972d5 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -206,6 +206,21 @@ static int try_bind(int socket, InetSocketAddress *saddr, 
struct addrinfo *e)
 #endif
 }
 
+static int check_mptcp(const InetSocketAddress *saddr, struct addrinfo *ai,
+   Error **errp)
+{
+if (saddr->has_mptcp && saddr->mptcp) {
+#ifdef IPPROTO_MPTCP
+ai->ai_protocol = IPPROTO_MPTCP;
+#else
+error_setg(errp, "MPTCP unavailable in this build");
+return -1;
+#endif
+}
+
+return 0;
+}
+
 static int inet_listen_saddr(InetSocketAddress *saddr,
  int port_offset,
  int num,
@@ -278,6 +293,11 @@ static int inet_listen_saddr(InetSocketAddress *saddr,
 
 /* create socket + bind/listen */
 for (e = res; e != NULL; e = e->ai_next) {
+if (check_mptcp(saddr, e, &err)) {
+error_propagate(errp, err);
+return -1;
+}
+
 getnameinfo((struct sockaddr*)e->ai_addr,e->ai_addrlen,
 uaddr,INET6_ADDRSTRLEN,uport,32,
 NI_NUMERICHOST | NI_NUMERICSERV);
@@ -456,6 +476,11 @@ int inet_connect_saddr(InetSocketAddress *saddr, Error 
**errp)
 for (e = res; e != NULL; e = e->ai_next) {
 error_free(local_err);
 local_err = NULL;
+
+if (check_mptcp(saddr, e, &local_err)) {
+break;
+}
+
 sock = inet_connect_addr(saddr, e, &local_err);
 if (sock >= 0) {
 break;
@@ -687,6 +712,15 @@ int inet_parse(InetSocketAddress *addr, const char *str, 
Error **errp)
 }
 addr->has_keep_alive = true;
 }
+begin = strstr(optstr, ",mptcp");
+if (begin) {
+if (inet_parse_flag("mptcp", begin + strlen(",mptcp"),
+&addr->mptcp, errp) < 0)
+{
+return -1;
+}
+addr->has_mptcp = true;
+}
 return 0;
 }
 
-- 
2.31.1

[RFC PATCH 4/5] migration/socket: Close the listener at the end

2021-04-08 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Delay closing the listener until the cleanup hook at the end; mptcp
needs the listener to stay open while the other paths come in.

Signed-off-by: Dr. David Alan Gilbert 
---
 migration/multifd.c |  5 +
 migration/socket.c  | 24 ++--
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index a6677c45c8..cebd9029b9 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1165,6 +1165,11 @@ bool multifd_recv_all_channels_created(void)
 return true;
 }
 
+if (!multifd_recv_state) {
+/* Called before any connections created */
+return false;
+}
+
 return thread_count == qatomic_read(&multifd_recv_state->count);
 }
 
diff --git a/migration/socket.c b/migration/socket.c
index 6016642e04..05705a32d8 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -126,22 +126,31 @@ static void 
socket_accept_incoming_migration(QIONetListener *listener,
 {
 trace_migration_socket_incoming_accepted();
 
-qio_channel_set_name(QIO_CHANNEL(cioc), "migration-socket-incoming");
-migration_channel_process_incoming(QIO_CHANNEL(cioc));
-
 if (migration_has_all_channels()) {
-/* Close listening socket as its no longer needed */
-qio_net_listener_disconnect(listener);
-object_unref(OBJECT(listener));
+error_report("%s: Extra incoming migration connection; ignoring",
+ __func__);
+return;
 }
+
+qio_channel_set_name(QIO_CHANNEL(cioc), "migration-socket-incoming");
+migration_channel_process_incoming(QIO_CHANNEL(cioc));
 }
 
+static void
+socket_incoming_migration_end(void *opaque)
+{
+QIONetListener *listener = opaque;
+
+qio_net_listener_disconnect(listener);
+object_unref(OBJECT(listener));
+}
 
 static void
 socket_start_incoming_migration_internal(SocketAddress *saddr,
  Error **errp)
 {
 QIONetListener *listener = qio_net_listener_new();
+MigrationIncomingState *mis = migration_incoming_get_current();
 size_t i;
 int num = 1;
 
@@ -156,6 +165,9 @@ socket_start_incoming_migration_internal(SocketAddress 
*saddr,
 return;
 }
 
+mis->transport_data = listener;
+mis->transport_cleanup = socket_incoming_migration_end;
+
 qio_net_listener_set_client_func_full(listener,
   socket_accept_incoming_migration,
   NULL, NULL,
-- 
2.31.1

[RFC PATCH 3/5] migration: Add cleanup hook for inwards migration

2021-04-08 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Add a cleanup hook for incoming migration that gets called
at the end as a way for a transport to allow cleanup.
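
The hook pattern can be modelled in a few lines of Python. This is a toy
sketch, not the migration code: IncomingState, Listener and start_incoming
are made-up stand-ins; only the transport_data/transport_cleanup field names
come from the patch.

```python
class IncomingState:
    """Toy stand-in for MigrationIncomingState with a transport hook."""

    def __init__(self):
        self.transport_data = None
        self.transport_cleanup = None

    def destroy(self):
        # Called once at the end of incoming migration.
        if self.transport_cleanup is not None:
            self.transport_cleanup(self.transport_data)


class Listener:
    """Toy listener that records whether it has been closed."""

    def __init__(self):
        self.closed = False

    def disconnect(self):
        self.closed = True


def start_incoming(state):
    listener = Listener()
    # The transport registers its own teardown instead of closing early.
    state.transport_data = listener
    state.transport_cleanup = lambda data: data.disconnect()
    return listener
```

The listener stays open until destroy() runs, and a state with no hook
registered is left untouched.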

Signed-off-by: Dr. David Alan Gilbert 
---
 migration/migration.c | 3 +++
 migration/migration.h | 4 
 2 files changed, 7 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index ca8b97baa5..feaedc382e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -279,6 +279,9 @@ void migration_incoming_state_destroy(void)
 g_array_free(mis->postcopy_remote_fds, TRUE);
 mis->postcopy_remote_fds = NULL;
 }
+if (mis->transport_cleanup) {
+mis->transport_cleanup(mis->transport_data);
+}
 
 qemu_event_reset(&mis->main_thread_load_event);
 
diff --git a/migration/migration.h b/migration/migration.h
index db6708326b..1b4c5da917 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -49,6 +49,10 @@ struct PostcopyBlocktimeContext;
 struct MigrationIncomingState {
 QEMUFile *from_src_file;
 
+/* A hook to allow cleanup at the end of incoming migration */
+void *transport_data;
+void (*transport_cleanup)(void *data);
+
 /*
  * Free at the start of the main state load, set as the main thread 
finishes
  * loading state.
-- 
2.31.1

[PATCH v2 7/7] tests/acceptance: Handle cpu tag on x86_cpu_model_versions tests

2021-04-08 Thread Wainer dos Santos Moschetta

Some test cases in x86_cpu_model_versions.py are corner cases because they
need to pass extra options to the -cpu argument. Since the avocado_qemu
framework now sets -cpu automatically, that value has to be overridden. This
changes those tests to call set_vm_arg() to overwrite the -cpu value.
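
The overwrite behaviour can be pictured with the following standalone model.
It is a hypothetical sketch: the real set_vm_arg() is a method added
elsewhere in this series and is not shown here, so the signature and the
list-based handling below are assumptions.

```python
def set_vm_arg(args, option, value):
    """Overwrite the value following option in args, or append the pair."""
    if option in args:
        idx = args.index(option) + 1
        if idx < len(args):
            # Replace the value the framework set from the cpu tag.
            args[idx] = value
        else:
            args.append(value)
    else:
        args.extend([option, value])
    return args
```

Applied to a list that already holds '-cpu', 'Cascadelake-Server', only the
value is replaced, so -cpu is never passed twice.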

Signed-off-by: Wainer dos Santos Moschetta 
---
 tests/acceptance/x86_cpu_model_versions.py | 40 +-
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/tests/acceptance/x86_cpu_model_versions.py 
b/tests/acceptance/x86_cpu_model_versions.py
index 77ed8597a4..0e9feda62d 100644
--- a/tests/acceptance/x86_cpu_model_versions.py
+++ b/tests/acceptance/x86_cpu_model_versions.py
@@ -252,10 +252,13 @@ def get_cpu_prop(self, prop):
 def test_4_1(self):
 """
 :avocado: tags=machine:pc-i440fx-4.1
+:avocado: tags=cpu:Cascadelake-Server
 """
 # machine-type only:
 self.vm.add_args('-S')
-self.vm.add_args('-cpu', 
'Cascadelake-Server,x-force-features=on,check=off,enforce=off')
+self.set_vm_arg('-cpu',
+'Cascadelake-Server,x-force-features=on,check=off,'
+'enforce=off')
 self.vm.launch()
 self.assertFalse(self.get_cpu_prop('arch-capabilities'),
  'pc-i440fx-4.1 + Cascadelake-Server should not have 
arch-capabilities')
@@ -263,9 +266,12 @@ def test_4_1(self):
 def test_4_0(self):
 """
 :avocado: tags=machine:pc-i440fx-4.0
+:avocado: tags=cpu:Cascadelake-Server
 """
 self.vm.add_args('-S')
-self.vm.add_args('-cpu', 
'Cascadelake-Server,x-force-features=on,check=off,enforce=off')
+self.set_vm_arg('-cpu',
+'Cascadelake-Server,x-force-features=on,check=off,'
+'enforce=off')
 self.vm.launch()
 self.assertFalse(self.get_cpu_prop('arch-capabilities'),
  'pc-i440fx-4.0 + Cascadelake-Server should not have 
arch-capabilities')
@@ -273,10 +279,13 @@ def test_4_0(self):
 def test_set_4_0(self):
 """
 :avocado: tags=machine:pc-i440fx-4.0
+:avocado: tags=cpu:Cascadelake-Server
 """
 # command line must override machine-type if CPU model is not 
versioned:
 self.vm.add_args('-S')
-self.vm.add_args('-cpu', 
'Cascadelake-Server,x-force-features=on,check=off,enforce=off,+arch-capabilities')
+self.set_vm_arg('-cpu',
+'Cascadelake-Server,x-force-features=on,check=off,'
+'enforce=off,+arch-capabilities')
 self.vm.launch()
 self.assertTrue(self.get_cpu_prop('arch-capabilities'),
 'pc-i440fx-4.0 + Cascadelake-Server,+arch-capabilities 
should have arch-capabilities')
@@ -284,9 +293,12 @@ def test_set_4_0(self):
 def test_unset_4_1(self):
 """
 :avocado: tags=machine:pc-i440fx-4.1
+:avocado: tags=cpu:Cascadelake-Server
 """
 self.vm.add_args('-S')
-self.vm.add_args('-cpu', 
'Cascadelake-Server,x-force-features=on,check=off,enforce=off,-arch-capabilities')
+self.set_vm_arg('-cpu',
+'Cascadelake-Server,x-force-features=on,check=off,'
+'enforce=off,-arch-capabilities')
 self.vm.launch()
 self.assertFalse(self.get_cpu_prop('arch-capabilities'),
  'pc-i440fx-4.1 + 
Cascadelake-Server,-arch-capabilities should not have arch-capabilities')
@@ -294,10 +306,13 @@ def test_unset_4_1(self):
 def test_v1_4_0(self):
 """
 :avocado: tags=machine:pc-i440fx-4.0
+:avocado: tags=cpu:Cascadelake-Server
 """
 # versioned CPU model overrides machine-type:
 self.vm.add_args('-S')
-self.vm.add_args('-cpu', 
'Cascadelake-Server-v1,x-force-features=on,check=off,enforce=off')
+self.set_vm_arg('-cpu',
+'Cascadelake-Server-v1,x-force-features=on,check=off,'
+'enforce=off')
 self.vm.launch()
 self.assertFalse(self.get_cpu_prop('arch-capabilities'),
  'pc-i440fx-4.0 + Cascadelake-Server-v1 should not 
have arch-capabilities')
@@ -305,9 +320,12 @@ def test_v1_4_0(self):
 def test_v2_4_0(self):
 """
 :avocado: tags=machine:pc-i440fx-4.0
+:avocado: tags=cpu:Cascadelake-Server
 """
 self.vm.add_args('-S')
-self.vm.add_args('-cpu', 
'Cascadelake-Server-v2,x-force-features=on,check=off,enforce=off')
+self.set_vm_arg('-cpu',
+'Cascadelake-Server-v2,x-force-features=on,check=off,'
+'enforce=off')
 self.vm.launch()
 self.assertTrue(self.get_cpu_prop('arch-capabilities'),
 'pc-i440fx-4.0 + Cascadelake-Server-v2 should have 
arch-capabilities')
@@ -315,10 +333,13 @@ def

[PATCH v2 1/7] tests/acceptance: Automatic set -cpu to the test vm

2021-04-08 Thread Wainer dos Santos Moschetta

This introduces a new feature to the functional tests: automatic setting of
the '-cpu VALUE' option to the created vm if the test is tagged with
'cpu:VALUE'. The 'cpu' property is made available to the test object as well.

For example, for a simple test such as:

    def test(self):
        """
        :avocado: tags=cpu:host
        """
        self.assertEqual(self.cpu, "host")
        self.vm.launch()

The resulting QEMU invocation will look like:

qemu-system-x86_64 -display none -vga none -chardev 
socket,id=mon,path=/var/tmp/avo_qemu_sock_pdgzbgd_/qemu-1135557-monitor.sock 
-mon chardev=mon,mode=control -cpu host
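
The resolution order (test parameter first, then a unique tag, else None) can
be sketched outside the framework as follows. This is a simplified model with
made-up helper names and plain dicts in place of avocado's params and tags
objects.

```python
def unique_tag_val(tags, name):
    """Return the tag's value only if exactly one value was given."""
    vals = tags.get(name, set())
    if len(vals) == 1:
        return next(iter(vals))
    return None


def build_cpu_args(params, tags):
    """Resolve the cpu value and build the extra QEMU arguments."""
    cpu = params.get("cpu", unique_tag_val(tags, "cpu"))
    return cpu, (["-cpu", cpu] if cpu is not None else [])
```

A test tagged with more than one cpu value gets no automatic -cpu argument
unless the parameter is given explicitly.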

Signed-off-by: Wainer dos Santos Moschetta 
---
 docs/devel/testing.rst| 17 +
 tests/acceptance/avocado_qemu/__init__.py |  5 +
 2 files changed, 22 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 1da4c4e4c4..e139a618f5 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -878,6 +878,17 @@ name.  If one is not given explicitly, it will either be 
set to
 ``None``, or, if the test is tagged with one (and only one)
 ``:avocado: tags=arch:VALUE`` tag, it will be set to ``VALUE``.
 
+cpu
+~~~
+
+The cpu model that will be set to all QEMUMachine instances created
+by the test.
+
+The ``cpu`` attribute will be set to the test parameter of the same
+name. If one is not given explicitly, it will either be set to
+``None``, or, if the test is tagged with one (and only one)
+``:avocado: tags=cpu:VALUE`` tag, it will be set to ``VALUE``.
+
 machine
 ~~~
 
@@ -924,6 +935,12 @@ architecture of a kernel or disk image to boot a VM with.
 This parameter has a direct relation with the ``arch`` attribute.  If
 not given, it will default to None.
 
+cpu
+~~~
+
+The cpu model that will be set to all QEMUMachine instances created
+by the test.
+
 machine
 ~~~
 
diff --git a/tests/acceptance/avocado_qemu/__init__.py 
b/tests/acceptance/avocado_qemu/__init__.py
index 83b1741ec8..7f8e703757 100644
--- a/tests/acceptance/avocado_qemu/__init__.py
+++ b/tests/acceptance/avocado_qemu/__init__.py
@@ -206,6 +206,9 @@ def setUp(self):
 self.arch = self.params.get('arch',
 default=self._get_unique_tag_val('arch'))
 
+self.cpu = self.params.get('cpu',
+   default=self._get_unique_tag_val('cpu'))
+
 self.machine = self.params.get('machine',

default=self._get_unique_tag_val('machine'))
 
@@ -231,6 +234,8 @@ def get_vm(self, *args, name=None):
 name = str(uuid.uuid4())
 if self._vms.get(name) is None:
 self._vms[name] = self._new_vm(*args)
+if self.cpu is not None:
+self._vms[name].add_args('-cpu', self.cpu)
 if self.machine is not None:
 self._vms[name].set_machine(self.machine)
 return self._vms[name]
-- 
2.29.2

[PATCH v2 3/7] tests/acceptance: Let the framework handle "cpu:VALUE" tagged tests

2021-04-08 Thread Wainer dos Santos Moschetta

The tests that are already tagged with "cpu:VALUE" don't need to add
"-cpu VALUE" to the list of arguments of the vm object because the avocado_qemu
framework is able to handle it automatically.

Signed-off-by: Wainer dos Santos Moschetta 
---
 tests/acceptance/boot_linux.py | 3 ---
 tests/acceptance/machine_mips_malta.py | 1 -
 tests/acceptance/replay_kernel.py  | 8 +++-
 tests/acceptance/reverse_debugging.py  | 2 +-
 tests/acceptance/tcg_plugins.py| 9 -
 5 files changed, 8 insertions(+), 15 deletions(-)

diff --git a/tests/acceptance/boot_linux.py b/tests/acceptance/boot_linux.py
index 0d178038a0..55637d126e 100644
--- a/tests/acceptance/boot_linux.py
+++ b/tests/acceptance/boot_linux.py
@@ -82,7 +82,6 @@ def test_virt_tcg(self):
 """
 self.require_accelerator("tcg")
 self.vm.add_args("-accel", "tcg")
-self.vm.add_args("-cpu", "max")
 self.vm.add_args("-machine", "virt,gic-version=2")
 self.add_common_args()
 self.launch_and_wait()
@@ -95,7 +94,6 @@ def test_virt_kvm_gicv2(self):
 """
 self.require_accelerator("kvm")
 self.vm.add_args("-accel", "kvm")
-self.vm.add_args("-cpu", "host")
 self.vm.add_args("-machine", "virt,gic-version=2")
 self.add_common_args()
 self.launch_and_wait()
@@ -108,7 +106,6 @@ def test_virt_kvm_gicv3(self):
 """
 self.require_accelerator("kvm")
 self.vm.add_args("-accel", "kvm")
-self.vm.add_args("-cpu", "host")
 self.vm.add_args("-machine", "virt,gic-version=3")
 self.add_common_args()
 self.launch_and_wait()
diff --git a/tests/acceptance/machine_mips_malta.py 
b/tests/acceptance/machine_mips_malta.py
index b1fd075f51..b67d8cb141 100644
--- a/tests/acceptance/machine_mips_malta.py
+++ b/tests/acceptance/machine_mips_malta.py
@@ -62,7 +62,6 @@ def do_test_i6400_framebuffer_logo(self, cpu_cores_count):
 kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
'clocksource=GIC console=tty0 console=ttyS0')
 self.vm.add_args('-kernel', kernel_path,
- '-cpu', 'I6400',
  '-smp', '%u' % cpu_cores_count,
  '-vga', 'std',
  '-append', kernel_command_line)
diff --git a/tests/acceptance/replay_kernel.py 
b/tests/acceptance/replay_kernel.py
index 71facdaa75..75f80506c1 100644
--- a/tests/acceptance/replay_kernel.py
+++ b/tests/acceptance/replay_kernel.py
@@ -156,8 +156,7 @@ def test_aarch64_virt(self):
'console=ttyAMA0')
 console_pattern = 'VFS: Cannot open root device'
 
-self.run_rr(kernel_path, kernel_command_line, console_pattern,
-args=('-cpu', 'cortex-a53'))
+self.run_rr(kernel_path, kernel_command_line, console_pattern)
 
 def test_arm_virt(self):
 """
@@ -301,7 +300,7 @@ def test_ppc64_e500(self):
 tar_url = ('https://www.qemu-advent-calendar.org'
'/2018/download/day19.tar.xz')
 file_path = self.fetch_asset(tar_url, asset_hash=tar_hash)
-self.do_test_advcal_2018(file_path, 'uImage', ('-cpu', 'e5500'))
+self.do_test_advcal_2018(file_path, 'uImage')
 
 def test_ppc_g3beige(self):
 """
@@ -348,8 +347,7 @@ def test_xtensa_lx60(self):
 tar_url = ('https://www.qemu-advent-calendar.org'
'/2018/download/day02.tar.xz')
 file_path = self.fetch_asset(tar_url, asset_hash=tar_hash)
-self.do_test_advcal_2018(file_path, 'santas-sleigh-ride.elf',
- args=('-cpu', 'dc233c'))
+self.do_test_advcal_2018(file_path, 'santas-sleigh-ride.elf')
 
 @skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'), 'Test might timeout')
 class ReplayKernelSlow(ReplayKernelBase):
diff --git a/tests/acceptance/reverse_debugging.py 
b/tests/acceptance/reverse_debugging.py
index be01aca217..d2921e70c3 100644
--- a/tests/acceptance/reverse_debugging.py
+++ b/tests/acceptance/reverse_debugging.py
@@ -207,4 +207,4 @@ def test_aarch64_virt(self):
 kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
 
 self.reverse_debugging(
-args=('-kernel', kernel_path, '-cpu', 'cortex-a53'))
+args=('-kernel', kernel_path))
diff --git a/tests/acceptance/tcg_plugins.py b/tests/acceptance/tcg_plugins.py
index aa6e18b62d..9ca1515c3b 100644
--- a/tests/acceptance/tcg_plugins.py
+++ b/tests/acceptance/tcg_plugins.py
@@ -25,7 +25,7 @@ class PluginKernelBase(LinuxKernelTest):
 KERNEL_COMMON_COMMAND_LINE = 'printk.time=1 panic=-1 '
 
 def run_vm(self, kernel_path, kernel_command_line,
-   plugin, plugin_log, console_pattern, args):
+   plugin, plugin_log, console_pattern, args=None):
 
 vm = self.get_vm()
 vm.set_console()
@@ -80,8 +80,7 @@ def test_aarch64_virt_insn(self):

[PATCH v2 0/7] tests/acceptance: Handle tests with "cpu" tag

2021-04-08 Thread Wainer dos Santos Moschetta

Currently the acceptance tests tagged with "machine" have "-M TYPE"
automatically added to the list of arguments of the QEMUMachine object.
In other words, that option is passed to the launched QEMU. This
series implements the same feature for tests tagged with "cpu".

There is a caveat, however: if the test needs to pass additional arguments
along with the CPU type, they cannot be passed via the tag, because the tags
parser splits values on commas. For example, in
tests/acceptance/x86_cpu_model_versions.py, there are cases where:

  * -cpu is set to 
"Cascadelake-Server,x-force-features=on,check=off,enforce=off"
  * if it was tagged like 
"cpu:Cascadelake-Server,x-force-features=on,check=off,enforce=off"
then the parser would break it into 4 tags ("cpu:Cascadelake-Server",
"x-force-features=on", "check=off", "enforce=off")
  * resulting in "-cpu Cascadelake-Server", with the remaining arguments 
ignored.
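
The splitting problem described in the list above can be sketched in a few
lines. This is only an illustration: parse_tags() and cpu_from_tags() are
hypothetical stand-ins, not the Avocado tags parser, and they assume nothing
more than "split on commas, then look for a cpu: prefix".

```python
def parse_tags(value):
    """Hypothetical stand-in for the tags parser: split the value on commas."""
    return [tag.strip() for tag in value.split(",")]


def cpu_from_tags(tags):
    """Return the value of the first "cpu:" tag, or None if there is none."""
    for tag in tags:
        if tag.startswith("cpu:"):
            return tag[len("cpu:"):]
    return None


tags = parse_tags(
    "cpu:Cascadelake-Server,x-force-features=on,check=off,enforce=off")
cpu = cpu_from_tags(tags)
# tags now holds four separate entries, and cpu holds only the model name;
# the three feature settings are no longer attached to it.
```

With such a parser the model name survives but the extra properties end up as
unrelated tags, which is why a separate way of setting the full -cpu value is
needed.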

The avocado_qemu.Test.set_vm_arg() method was introduced to deal with
cases like the example above, so that one can tag the test as
"cpu:Cascadelake-Server" AND call self.set_vm_arg('-cpu', 
"Cascadelake-Server,x-force-features=on,check=off,enforce=off"),
which replaces the initial value of -cpu.

This series was tested on CI 
(https://gitlab.com/wainersm/qemu/-/pipelines/277376246)
and with the following code:

from avocado_qemu import Test

class CPUTest(Test):
    def test_cpu(self):
        """
        :avocado: tags=cpu:host
        """
        # The cpu property is set to the tag value, or None on its absence
        self.assertEqual(self.cpu, "host")
        # The created VM has the '-cpu host' option
        self.assertIn("-cpu host", " ".join(self.vm._args))
        self.vm.launch()

    def test_cpu_none(self):
        self.assertEqual(self.cpu, None)
        self.assertNotIn('-cpu', self.vm._args)

    def test_cpu_reset(self):
        """
        :avocado: tags=cpu:host
        """
        self.assertIn("-cpu host", " ".join(self.vm._args))
        self.set_vm_arg("-cpu", "Cascadelake-Server,x-force-features=on")
        self.assertNotIn("-cpu host", " ".join(self.vm._args))
        self.assertIn("-cpu Cascadelake-Server,x-force-features=on",
                      " ".join(self.vm._args))

Changes:
 - v1 -> v2:
   - Recognize the cpu value passed via test parameter [crosa]
   - Fixed tags (patch 02) on preparation to patch 03 [crosa]
   - Added QEMUMachine.args property (patch 04) so that _args could be handled
 without pylint complaining (protected property) 
   - Added Test.set_vm_arg() (patch 05) to handle the corner case [crosa]

Wainer dos Santos Moschetta (7):
  tests/acceptance: Automatic set -cpu to the test vm
  tests/acceptance: Fix mismatch on cpu tagged tests
  tests/acceptance: Let the framework handle "cpu:VALUE" tagged tests
  tests/acceptance: Tagging tests with "cpu:VALUE"
  python/qemu: Add args property to the QEMUMachine class
  tests/acceptance: Add set_vm_arg() to the Test class
  tests/acceptance: Handle cpu tag on x86_cpu_model_versions tests

 docs/devel/testing.rst | 17 +
 python/qemu/machine.py |  5 +++
 tests/acceptance/avocado_qemu/__init__.py  | 21 
 tests/acceptance/boot_linux.py |  3 --
 tests/acceptance/boot_linux_console.py | 16 +
 tests/acceptance/machine_mips_malta.py |  7 ++--
 tests/acceptance/pc_cpu_hotplug_props.py   |  2 +-
 tests/acceptance/replay_kernel.py  | 17 -
 tests/acceptance/reverse_debugging.py  |  2 +-
 tests/acceptance/tcg_plugins.py| 15 
 tests/acceptance/virtio-gpu.py |  4 +--
 tests/acceptance/x86_cpu_model_versions.py | 40 +-
 12 files changed, 107 insertions(+), 42 deletions(-)

-- 
2.29.2

[PATCH v2 2/7] tests/acceptance: Fix mismatch on cpu tagged tests

2021-04-08 Thread Wainer dos Santos Moschetta

There are test cases in the machine_mips_malta.py and tcg_plugins.py files
where the cpu tag does not correspond to the value actually given to the QEMU
binary. This fixes the tags of those tests.

Signed-off-by: Wainer dos Santos Moschetta 
---
 tests/acceptance/machine_mips_malta.py | 6 +++---
 tests/acceptance/tcg_plugins.py| 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/acceptance/machine_mips_malta.py 
b/tests/acceptance/machine_mips_malta.py
index 7c9a4ee4d2..b1fd075f51 100644
--- a/tests/acceptance/machine_mips_malta.py
+++ b/tests/acceptance/machine_mips_malta.py
@@ -96,7 +96,7 @@ def test_mips_malta_i6400_framebuffer_logo_1core(self):
 """
 :avocado: tags=arch:mips64el
 :avocado: tags=machine:malta
-:avocado: tags=cpu:i6400
+:avocado: tags=cpu:I6400
 """
 self.do_test_i6400_framebuffer_logo(1)
 
@@ -105,7 +105,7 @@ def test_mips_malta_i6400_framebuffer_logo_7cores(self):
 """
 :avocado: tags=arch:mips64el
 :avocado: tags=machine:malta
-:avocado: tags=cpu:i6400
+:avocado: tags=cpu:I6400
 :avocado: tags=mips:smp
 """
 self.do_test_i6400_framebuffer_logo(7)
@@ -115,7 +115,7 @@ def test_mips_malta_i6400_framebuffer_logo_8cores(self):
 """
 :avocado: tags=arch:mips64el
 :avocado: tags=machine:malta
-:avocado: tags=cpu:i6400
+:avocado: tags=cpu:I6400
 :avocado: tags=mips:smp
 """
 self.do_test_i6400_framebuffer_logo(8)
diff --git a/tests/acceptance/tcg_plugins.py b/tests/acceptance/tcg_plugins.py
index c21bf9e52a..aa6e18b62d 100644
--- a/tests/acceptance/tcg_plugins.py
+++ b/tests/acceptance/tcg_plugins.py
@@ -68,7 +68,7 @@ def test_aarch64_virt_insn(self):
 :avocado: tags=accel:tcg
 :avocado: tags=arch:aarch64
 :avocado: tags=machine:virt
-:avocado: tags=cpu:cortex-a57
+:avocado: tags=cpu:cortex-a53
 """
 kernel_path = self._grab_aarch64_kernel()
 kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
@@ -95,7 +95,7 @@ def test_aarch64_virt_insn_icount(self):
 :avocado: tags=accel:tcg
 :avocado: tags=arch:aarch64
 :avocado: tags=machine:virt
-:avocado: tags=cpu:cortex-a57
+:avocado: tags=cpu:cortex-a53
 """
 kernel_path = self._grab_aarch64_kernel()
 kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
@@ -121,7 +121,7 @@ def test_aarch64_virt_mem_icount(self):
 :avocado: tags=accel:tcg
 :avocado: tags=arch:aarch64
 :avocado: tags=machine:virt
-:avocado: tags=cpu:cortex-a57
+:avocado: tags=cpu:cortex-a53
 """
 kernel_path = self._grab_aarch64_kernel()
 kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
-- 
2.29.2

[PATCH v2 5/7] python/qemu: Add args property to the QEMUMachine class

2021-04-08 Thread Wainer dos Santos Moschetta

This adds the args property to QEMUMachine so that users of the class
can access and handle the list of arguments to be given to the QEMU
binary.

Signed-off-by: Wainer dos Santos Moschetta 
---
 python/qemu/machine.py | 5 +
 1 file changed, 5 insertions(+)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index 6e44bda337..1c30bde99d 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -302,6 +302,11 @@ def _base_args(self) -> List[str]:
 args.extend(['-device', device])
 return args
 
+@property
+def args(self) -> List[str]:
+"""Returns the list of arguments given to the QEMU binary."""
+return self._args
+
 def _pre_launch(self) -> None:
 self._temp_dir = tempfile.mkdtemp(prefix="qemu-machine-",
   dir=self._test_dir)
-- 
2.29.2
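
A small sketch of what a read-only property over a private list gives the
caller. The Machine class here is a hypothetical miniature, not QEMUMachine;
it only mirrors the shape of the property added by the patch above.

```python
from typing import List


class Machine:
    """Hypothetical miniature of a class that exposes its argument list."""

    def __init__(self, args=()):
        self._args = list(args)

    @property
    def args(self) -> List[str]:
        """Returns the list of arguments (the same list object, not a copy)."""
        return self._args


m = Machine(["-nodefaults"])
# The property has no setter, so the list cannot be rebound from outside...
try:
    m.args = []
    rebound = True
except AttributeError:
    rebound = False
# ...but because the list itself is returned, callers can still edit it.
m.args.extend(["-cpu", "host"])
```

Returning the list itself (rather than a copy) is what lets a helper such as
set_vm_arg() modify the arguments in place without touching a protected
attribute.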

Re: [RFC PATCH v2 01/11] python: qemu: add timer parameter for qmp.accept socket

2021-04-08 Thread John Snow


On 4/7/21 9:50 AM, Emanuele Giuseppe Esposito wrote:

Extend the _post_launch function to include the timer as
parameter instead of defaulting to 15 sec.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/machine.py | 4 ++--
  python/qemu/qtest.py   | 4 ++--
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index 6e44bda337..c721e07d63 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -321,9 +321,9 @@ def _pre_launch(self) -> None:
  nickname=self._name
  )
  
-def _post_launch(self) -> None:

+def _post_launch(self, timer) -> None:
  if self._qmp_connection:
-self._qmp.accept()
+self._qmp.accept(timer)
  
  def _post_shutdown(self) -> None:

  """
diff --git a/python/qemu/qtest.py b/python/qemu/qtest.py
index 39a0cf62fe..0d01715086 100644
--- a/python/qemu/qtest.py
+++ b/python/qemu/qtest.py
@@ -138,9 +138,9 @@ def _pre_launch(self) -> None:
  super()._pre_launch()
  self._qtest = QEMUQtestProtocol(self._qtest_path, server=True)
  
-def _post_launch(self) -> None:

+def _post_launch(self, timer) -> None:
  assert self._qtest is not None
-super()._post_launch()
+super()._post_launch(timer)
  self._qtest.accept()
  
  def _post_shutdown(self) -> None:




Are you forgetting to change _launch() to provide some default value for 
what timer needs to be?


I think for the "event" callbacks here, I'd prefer configuring the 
behavior as a property instead of passing it around as a parameter.


(Also, we have an awful lot of timeouts now... is it time to think about 
rewriting this using asyncio so that we can allow the callers to specify 
their own timeouts in with context blocks? Just a thought for later; we 
have an awful lot of timeouts scattered throughout machine.py, qmp.py, etc.)


--js
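
One way to read the "configure it as a property" suggestion above, as a
sketch only. The Machine class, the qmp_timer name and the _accept() helper
are all made up for illustration; the only detail taken from the thread is the
15 second default.

```python
class Machine:
    """Hypothetical miniature of a launcher; not the real QEMUMachine."""

    def __init__(self, qmp_timer=15.0):
        self._qmp_timer = qmp_timer
        self.accepted_with = None

    @property
    def qmp_timer(self):
        """Timeout (seconds) used when accepting the monitor connection."""
        return self._qmp_timer

    @qmp_timer.setter
    def qmp_timer(self, value):
        self._qmp_timer = value

    def _accept(self, timeout):
        # Stand-in for the real accept call; it just records the timeout.
        self.accepted_with = timeout

    def _post_launch(self):
        # The callback keeps its original, parameterless signature.
        self._accept(self._qmp_timer)

    def launch(self):
        self._post_launch()


default_vm = Machine()
default_vm.launch()

slow_vm = Machine()
slow_vm.qmp_timer = 60.0
slow_vm.launch()
```

The point of the design is that subclasses overriding _post_launch() do not
need a signature change, and callers that never touch the property keep the
old default.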

[RFC PATCH 1/5] channel-socket: Only set CLOEXEC if we have space for fds

2021-04-08 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

MSG_CMSG_CLOEXEC cleans up received fd's; it's really only for Unix
sockets, but currently we enable it for everything; some socket types
(IP_MPTCP) don't like this.

Only enable it when we're giving the recvmsg room to receive fd's
anyway.

Signed-off-by: Dr. David Alan Gilbert 
---
 io/channel-socket.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/io/channel-socket.c b/io/channel-socket.c
index de259f7eed..606ec97cf7 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -487,15 +487,15 @@ static ssize_t qio_channel_socket_readv(QIOChannel *ioc,
 
 memset(control, 0, CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS));
 
-#ifdef MSG_CMSG_CLOEXEC
-sflags |= MSG_CMSG_CLOEXEC;
-#endif
-
 msg.msg_iov = (struct iovec *)iov;
 msg.msg_iovlen = niov;
 if (fds && nfds) {
 msg.msg_control = control;
 msg.msg_controllen = sizeof(control);
+#ifdef MSG_CMSG_CLOEXEC
+sflags |= MSG_CMSG_CLOEXEC;
+#endif
+
 }
 
  retry:
-- 
2.31.1
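
The change above boils down to "only ask for close-on-exec handling when the
caller can actually receive fds". A minimal model of that flag selection,
assuming a made-up recv_flags() helper; the constant is looked up with
getattr() because not every platform's socket module defines it.

```python
import socket

# Falls back to 0 (no-op) where the platform does not define the flag.
MSG_CMSG_CLOEXEC = getattr(socket, "MSG_CMSG_CLOEXEC", 0)


def recv_flags(want_fds, base_flags=0):
    """Return recvmsg() flags, adding MSG_CMSG_CLOEXEC only when fds are wanted."""
    flags = base_flags
    if want_fds:
        flags |= MSG_CMSG_CLOEXEC
    return flags


plain = recv_flags(want_fds=False)
with_fds = recv_flags(want_fds=True)
```

A socket type that rejects the flag is then never exposed to it unless the
caller set up room for ancillary data in the first place.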

Re: [RFC PATCH v2 02/11] python: qemu: pass the wrapper field from QEMUQtestmachine to QEMUMachine

2021-04-08 Thread John Snow


On 4/7/21 9:50 AM, Emanuele Giuseppe Esposito wrote:

Signed-off-by: Emanuele Giuseppe Esposito 
---
  python/qemu/machine.py | 2 +-
  python/qemu/qtest.py   | 4 +++-
  2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index c721e07d63..18d32ebe45 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -109,7 +109,7 @@ def __init__(self,
  
  self._binary = binary

  self._args = list(args)
-self._wrapper = wrapper
+self._wrapper = list(wrapper)
 


Unrelated change?

(I'm assuming you want to copy the user's input to explicitly avoid 
sharing state. Commit message blurb for this would be good.)



  self._name = name or "qemu-%d" % os.getpid()
  self._test_dir = test_dir
diff --git a/python/qemu/qtest.py b/python/qemu/qtest.py
index 0d01715086..4c90daf430 100644
--- a/python/qemu/qtest.py
+++ b/python/qemu/qtest.py
@@ -111,6 +111,7 @@ class QEMUQtestMachine(QEMUMachine):
  def __init__(self,
   binary: str,
   args: Sequence[str] = (),
+ wrapper: Sequence[str] = (),
   name: Optional[str] = None,
   test_dir: str = "/var/tmp",
   socket_scm_helper: Optional[str] = None,
@@ -119,7 +120,8 @@ def __init__(self,
  name = "qemu-%d" % os.getpid()
  if sock_dir is None:
  sock_dir = test_dir
-super().__init__(binary, args, name=name, test_dir=test_dir,
+super().__init__(binary, args, wrapper=wrapper, name=name,
+ test_dir=test_dir,
   socket_scm_helper=socket_scm_helper,
   sock_dir=sock_dir)
  self._qtest: Optional[QEMUQtestProtocol] = None



ACK
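
For the list(wrapper) question raised above, a short illustration of the
difference between keeping the caller's sequence and copying it. The variable
names are hypothetical; the behaviour shown is plain Python list semantics.

```python
caller_wrapper = ["valgrind"]

shared = caller_wrapper        # what "self._wrapper = wrapper" would keep
copied = list(caller_wrapper)  # what "self._wrapper = list(wrapper)" keeps

# The caller later edits its own list.
caller_wrapper.append("--leak-check=full")

# list() also turns the default empty tuple into a fresh, mutable list.
from_default = list(())
```

After the append, "shared" has changed along with the caller's list while
"copied" has not, which is the state sharing the copy avoids.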

Re: [PATCH-for-6.0?] hw/arm/imx25_pdk: Fix error message for invalid RAM size

2021-04-08 Thread Igor Mammedov

On Thu,  8 Apr 2021 00:56:08 +0200
Philippe Mathieu-Daudé  wrote:

> The i.MX25 PDK board has 2 banks for SDRAM, each can
> address up to 256 MiB. So the total RAM usable for this
> board is 512M. When we ask for more we get a misleading
> error message:
> 
>   $ qemu-system-arm -M imx25-pdk -m 513M
>   qemu-system-arm: Invalid RAM size, should be 128 MiB
> 
> Update the error message to better match the reality:
> 
>   $ qemu-system-arm -M imx25-pdk -m 513M
>   qemu-system-arm: RAM size more than 512 MiB is not supported
> 
> Fixes: bf350daae02 ("arm/imx25_pdk: drop RAM size fixup")
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Igor Mammedov 

> ---
>  hw/arm/imx25_pdk.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/arm/imx25_pdk.c b/hw/arm/imx25_pdk.c
> index 1c201d0d8ed..51fde71b1bd 100644
> --- a/hw/arm/imx25_pdk.c
> +++ b/hw/arm/imx25_pdk.c
> @@ -67,7 +67,6 @@ static struct arm_boot_info imx25_pdk_binfo;
>  
>  static void imx25_pdk_init(MachineState *machine)
>  {
> -MachineClass *mc = MACHINE_GET_CLASS(machine);
>  IMX25PDK *s = g_new0(IMX25PDK, 1);
>  unsigned int ram_size;
>  unsigned int alias_offset;
> @@ -79,8 +78,8 @@ static void imx25_pdk_init(MachineState *machine)
>  
>  /* We need to initialize our memory */
>  if (machine->ram_size > (FSL_IMX25_SDRAM0_SIZE + FSL_IMX25_SDRAM1_SIZE)) 
> {
> -char *sz = size_to_str(mc->default_ram_size);
> -error_report("Invalid RAM size, should be %s", sz);
> +char *sz = size_to_str(FSL_IMX25_SDRAM0_SIZE + 
> FSL_IMX25_SDRAM1_SIZE);
> +error_report("RAM size more than %s is not supported", sz);
>  g_free(sz);
>  exit(EXIT_FAILURE);
>  }

[PATCH v2 4/7] tests/acceptance: Tagging tests with "cpu:VALUE"

2021-04-08 Thread Wainer dos Santos Moschetta

The existing tests which pass the "-cpu VALUE" argument to the vm object
are now properly tagged with "cpu:VALUE", letting the avocado_qemu framework
handle that automatically.

Signed-off-by: Wainer dos Santos Moschetta 
---
 tests/acceptance/boot_linux_console.py   | 16 +---
 tests/acceptance/pc_cpu_hotplug_props.py |  2 +-
 tests/acceptance/replay_kernel.py|  9 ++---
 tests/acceptance/virtio-gpu.py   |  4 ++--
 4 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/tests/acceptance/boot_linux_console.py 
b/tests/acceptance/boot_linux_console.py
index 1ca32ecf25..b7a856d871 100644
--- a/tests/acceptance/boot_linux_console.py
+++ b/tests/acceptance/boot_linux_console.py
@@ -238,6 +238,7 @@ def test_mips64el_malta_5KEc_cpio(self):
 :avocado: tags=arch:mips64el
 :avocado: tags=machine:malta
 :avocado: tags=endian:little
+:avocado: tags=cpu:5KEc
 """
 kernel_url = ('https://github.com/philmd/qemu-testing-blob/'
   'raw/9ad2df38/mips/malta/mips64el/'
@@ -257,8 +258,7 @@ def test_mips64el_malta_5KEc_cpio(self):
 kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE
+ 'console=ttyS0 console=tty '
+ 'rdinit=/sbin/init noreboot')
-self.vm.add_args('-cpu', '5KEc',
- '-kernel', kernel_path,
+self.vm.add_args('-kernel', kernel_path,
  '-initrd', initrd_path,
  '-append', kernel_command_line,
  '-no-reboot')
@@ -286,7 +286,6 @@ def do_test_mips_malta32el_nanomips(self, kernel_url, 
kernel_hash):
+ 'mem=256m@@0x0 '
+ 'console=ttyS0')
 self.vm.add_args('-no-reboot',
- '-cpu', 'I7200',
  '-kernel', kernel_path,
  '-append', kernel_command_line)
 self.vm.launch()
@@ -298,6 +297,7 @@ def test_mips_malta32el_nanomips_4k(self):
 :avocado: tags=arch:mipsel
 :avocado: tags=machine:malta
 :avocado: tags=endian:little
+:avocado: tags=cpu:I7200
 """
 kernel_url = ('https://mipsdistros.mips.com/LinuxDistro/nanomips/'
   'kernels/v4.15.18-432-gb2eb9a8b07a1-20180627102142/'
@@ -310,6 +310,7 @@ def test_mips_malta32el_nanomips_16k_up(self):
 :avocado: tags=arch:mipsel
 :avocado: tags=machine:malta
 :avocado: tags=endian:little
+:avocado: tags=cpu:I7200
 """
 kernel_url = ('https://mipsdistros.mips.com/LinuxDistro/nanomips/'
   'kernels/v4.15.18-432-gb2eb9a8b07a1-20180627102142/'
@@ -322,6 +323,7 @@ def test_mips_malta32el_nanomips_64k_dbg(self):
 :avocado: tags=arch:mipsel
 :avocado: tags=machine:malta
 :avocado: tags=endian:little
+:avocado: tags=cpu:I7200
 """
 kernel_url = ('https://mipsdistros.mips.com/LinuxDistro/nanomips/'
   'kernels/v4.15.18-432-gb2eb9a8b07a1-20180627102142/'
@@ -333,6 +335,7 @@ def test_aarch64_virt(self):
 """
 :avocado: tags=arch:aarch64
 :avocado: tags=machine:virt
+:avocado: tags=cpu:cortex-a53
 """
 kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
   '/linux/releases/29/Everything/aarch64/os/images/pxeboot'
@@ -343,8 +346,7 @@ def test_aarch64_virt(self):
 self.vm.set_console()
 kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
'console=ttyAMA0')
-self.vm.add_args('-cpu', 'cortex-a53',
- '-kernel', kernel_path,
+self.vm.add_args('-kernel', kernel_path,
  '-append', kernel_command_line)
 self.vm.launch()
 console_pattern = 'Kernel command line: %s' % kernel_command_line
@@ -1038,9 +1040,9 @@ def test_ppc64_e500(self):
 """
 :avocado: tags=arch:ppc64
 :avocado: tags=machine:ppce500
+:avocado: tags=cpu:e5500
 """
 tar_hash = '6951d86d644b302898da2fd701739c9406527fe1'
-self.vm.add_args('-cpu', 'e5500')
 self.do_test_advcal_2018('19', tar_hash, 'uImage')
 
 def test_ppc_g3beige(self):
@@ -1082,7 +1084,7 @@ def test_xtensa_lx60(self):
 """
 :avocado: tags=arch:xtensa
 :avocado: tags=machine:lx60
+:avocado: tags=cpu:dc233c
 """
 tar_hash = '49e88d9933742f0164b60839886c9739cb7a0d34'
-self.vm.add_args('-cpu', 'dc233c')
 self.do_test_advcal_2018('02', tar_hash, 'santas-sleigh-ride.elf')
diff --git a/tests/acceptance/pc_cpu_hotplug_props.py 
b/tests/acceptance/pc_cpu_hotplug_props.py
index f48f68fc6b..2e86d5017a 100644
--- a/tests/acceptance/pc_cpu_hotplug_props.py
+++ b/tests/acceptance/pc_cpu_hotplug_props.py
@@ -25,11

[PATCH v2 6/7] tests/acceptance: Add set_vm_arg() to the Test class

2021-04-08 Thread Wainer dos Santos Moschetta

This change adds the set_vm_arg() method to the avocado_qemu.Test class.
Use that method to set (or replace) an argument in the list of
arguments given to the QEMU binary.

Suggested-by: Cleber Rosa 
Signed-off-by: Wainer dos Santos Moschetta 
---
 tests/acceptance/avocado_qemu/__init__.py | 16 
 1 file changed, 16 insertions(+)

diff --git a/tests/acceptance/avocado_qemu/__init__.py 
b/tests/acceptance/avocado_qemu/__init__.py
index 7f8e703757..5314ce70eb 100644
--- a/tests/acceptance/avocado_qemu/__init__.py
+++ b/tests/acceptance/avocado_qemu/__init__.py
@@ -240,6 +240,22 @@ def get_vm(self, *args, name=None):
 self._vms[name].set_machine(self.machine)
 return self._vms[name]
 
+def set_vm_arg(self, arg, value):
+"""
+Set an argument to list of extra arguments to be given to the QEMU
+binary. If the argument already exists then its value is replaced.
+
+:param arg: the QEMU argument, such as "-cpu" in "-cpu host"
+:type arg: str
+:param value: the argument value, such as "host" in "-cpu host"
+:type value: str
+"""
+if arg not in self.vm.args:
+self.vm.args.extend([arg, value])
+else:
+idx = self.vm.args.index(arg)
+self.vm.args[idx + 1] = value
+
 def tearDown(self):
 for vm in self._vms.values():
 vm.shutdown()
-- 
2.29.2
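
The set-or-replace logic from the patch above, lifted onto a plain list so it
can be tried without a VM. set_arg() is a hypothetical free-function copy of
the method body; the argument values are only examples.

```python
def set_arg(args, arg, value):
    """Append "arg value" to args, or replace the value if arg is present."""
    if arg not in args:
        args.extend([arg, value])
    else:
        idx = args.index(arg)
        args[idx + 1] = value


args = ["-machine", "pc", "-cpu", "host"]
set_arg(args, "-cpu", "Cascadelake-Server,x-force-features=on")  # replaces
set_arg(args, "-smp", "2")                                       # appends
```

Two properties follow directly from the code: only the first occurrence of the
argument is updated, and the argument is assumed to be followed by a value in
the list.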

[PATCH] vhost-user-fs: fix features handling

2021-04-08 Thread Anton Kuchin

Make virtio-fs take into account server capabilities.

Just returning the requested features assumes that all of them are implemented
by the server, and results in an unsupported configuration being set if some
of them are absent.

Signed-off-by: Anton Kuchin 
---
 hw/virtio/vhost-user-fs.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index ac4fc34b36..6cf983ba0e 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -24,6 +24,14 @@
 #include "monitor/monitor.h"
 #include "sysemu/sysemu.h"
 
+static const int user_feature_bits[] = {
+VIRTIO_F_VERSION_1,
+VIRTIO_RING_F_INDIRECT_DESC,
+VIRTIO_RING_F_EVENT_IDX,
+VIRTIO_F_NOTIFY_ON_EMPTY,
+VHOST_INVALID_FEATURE_BIT
+};
+
 static void vuf_get_config(VirtIODevice *vdev, uint8_t *config)
 {
 VHostUserFS *fs = VHOST_USER_FS(vdev);
@@ -129,11 +137,12 @@ static void vuf_set_status(VirtIODevice *vdev, uint8_t 
status)
 }
 
 static uint64_t vuf_get_features(VirtIODevice *vdev,
-  uint64_t requested_features,
-  Error **errp)
+ uint64_t features,
+ Error **errp)
 {
-/* No feature bits used yet */
-return requested_features;
+VHostUserFS *fs = VHOST_USER_FS(vdev);
+
+return vhost_get_features(&fs->vhost_dev, user_feature_bits, features);
 }
 
 static void vuf_handle_output(VirtIODevice *vdev, VirtQueue *vq)
-- 
2.25.1
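
A rough model of the masking this patch relies on, under the assumption that
the helper clears every listed bit the backend does not offer and leaves
unlisted bits untouched. get_features() is a hypothetical Python stand-in, and
the bit numbers are the usual virtio values, included here only for
illustration.

```python
VIRTIO_F_NOTIFY_ON_EMPTY = 24
VIRTIO_RING_F_INDIRECT_DESC = 28
VIRTIO_RING_F_EVENT_IDX = 29
VIRTIO_F_VERSION_1 = 32

USER_FEATURE_BITS = [
    VIRTIO_F_VERSION_1,
    VIRTIO_RING_F_INDIRECT_DESC,
    VIRTIO_RING_F_EVENT_IDX,
    VIRTIO_F_NOTIFY_ON_EMPTY,
]


def get_features(backend_features, feature_bits, requested):
    """Clear each listed bit from requested that the backend lacks."""
    for bit in feature_bits:
        mask = 1 << bit
        if not backend_features & mask:
            requested &= ~mask
    return requested


# Backend offers VERSION_1 and INDIRECT_DESC only.
backend = (1 << VIRTIO_F_VERSION_1) | (1 << VIRTIO_RING_F_INDIRECT_DESC)
# Requested set also contains EVENT_IDX and one bit outside the list (bit 0).
requested = backend | (1 << VIRTIO_RING_F_EVENT_IDX) | 1
negotiated = get_features(backend, USER_FEATURE_BITS, requested)
```

In this example EVENT_IDX is dropped because the backend lacks it, while bit 0
passes through unchanged because it is not in the list being checked.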

Re: Mac OS real USB device support issue

2021-04-08 Thread Programmingkid




> On Apr 8, 2021, at 12:40 PM, Howard Spoelstra  wrote:
> 
> On Thu, Apr 8, 2021 at 1:05 PM Gerd Hoffmann  wrote:
>> 
>>  Hi,
>> 
>>>> Those might be a good place to start. IOKit provides the drivers and
>>>> also the io registry which is probably where you can get if a driver
>>>> is bound to a device and which one is it. How to dissociate the
>>>> driver from the device though I don't know.
>> 
>>> https://developer.apple.com/library/archive/documentation/DeviceDrivers/Conceptual/IOKitFundamentals/DeviceRemoval/DeviceRemoval.html
>> 
>>> According to this article a driver has a stop() and detach() method
>>> that is called by the IOKit to remove a device. I'm thinking QEMU can
>>> be the one that calls these methods for a certain device.
>> 
>> libusb should do that.  Interfaces exist already (see
>> libusb_detach_kernel_driver & friends) because we have the very same
>> problem on linux.
>> 
>> take care,
>>  Gerd
>> 
> 
> As far as I understand the patches here
> https://github.com/libusb/libusb/issues/906 they are internal to
> libusb, so we would need to build a libusb for use with e.g., brew to
> build a macOS executable. Or wait for them to be finalised to get
> included in libusb and then included in brew and then 
> 
> Best,
> Howard

We could also consider our own git submodule in case the libusb people fail to 
fix their issue.

Re: [PATCH v3 22/26] Hexagon (target/hexagon) circular addressing

2021-04-08 Thread Richard Henderson


On 4/7/21 6:57 PM, Taylor Simpson wrote:

+static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift)
+{
+/*
+ * Section 2.2.4 of the Hexagon V67 Programmer's Reference Manual
+ *
+ *  The "I" value from a modifier register is divided into two pieces
+ *  LSB bits 23:17
+ *  MSB bits 31:28
+ * The value is signed, so we do a sign extension
+ *
+ * At the end we shift the result according to the shift argument
+ */
+TCGv msb = tcg_temp_new();
+TCGv lsb = tcg_temp_new();
+
+tcg_gen_extract_tl(lsb, val, 17, 7);
+tcg_gen_extract_tl(msb, val, 28, 4);
+tcg_gen_movi_tl(result, 0);
+tcg_gen_deposit_tl(result, result, lsb, 0, 7);
+tcg_gen_deposit_tl(result, result, msb, 7, 4);
+tcg_gen_shli_tl(result, result, 21);
+tcg_gen_sari_tl(result, result, 21);


I gave you the 3 line version last time:

(1) shift msb, signed, into position at bit 7,
(2) extract lsb,
(3) deposit into msb, overwriting the low 7 bits.

And anyway, those last two lines are tcg_gen_sextract.

With that changed,
Reviewed-by: Richard Henderson 

r~
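
For readers following the bit manipulation, here is a plain-integer sketch of
the two approaches discussed above, leaving out the final shift by the "shift"
argument. Both functions are hypothetical Python models, not TCG code; they
only use the field layout given in the quoted comment (LSB in bits 23:17, MSB
in bits 31:28, 11-bit signed result).

```python
def read_ireg_long(val):
    """Model of the posted sequence: assemble 11 bits, then sign-extend."""
    lsb = (val >> 17) & 0x7F
    msb = (val >> 28) & 0xF
    result = lsb | (msb << 7)
    if result & 0x400:          # bit 10 is the sign bit of the 11-bit value
        result -= 0x800
    return result


def read_ireg_short(val):
    """Model of the three-step version: signed msb first, then drop in lsb."""
    msb = (val >> 28) & 0xF
    if msb & 0x8:               # signed extract of the 4-bit field
        msb -= 0x10
    result = msb << 7           # (1) msb, signed, positioned at bit 7
    lsb = (val >> 17) & 0x7F    # (2) extract lsb
    return (result & ~0x7F) | lsb  # (3) deposit over the low 7 bits


def pack(msb, lsb):
    """Build a register value with the two fields set and other bits noisy."""
    return (msb << 28) | (lsb << 17) | 0x1FFFF
```

Under this model the two functions agree for every field combination, which is
the sense in which the shorter sequence is a drop-in replacement.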

Re: Mac OS real USB device support issue

2021-04-08 Thread Programmingkid




> On Apr 8, 2021, at 7:05 AM, Gerd Hoffmann  wrote:
> 
>  Hi,
> 
>>> Those might be a good place to start. IOKit provides the drivers and
>>> also the io registry which is probably where you can get if a driver
>>> is bound to a device and which one is it. How to dissociate the
>>> driver from the device though I don't know.
> 
>> https://developer.apple.com/library/archive/documentation/DeviceDrivers/Conceptual/IOKitFundamentals/DeviceRemoval/DeviceRemoval.html
> 
>> According to this article a driver has a stop() and detach() method
>> that is called by the IOKit to remove a device. I'm thinking QEMU can
>> be the one that calls these methods for a certain device.
> 
> libusb should do that.  Interfaces exist already (see
> libusb_detach_kernel_driver & friends) because we have the very same
> problem on linux.
> 
> take care,
>  Gerd
> 

The questions that come to mind are:
- Does libusb_detach_kernel_driver() work on Mac OS?
- Is libusb_detach_kernel_driver() called on Mac OS in QEMU?

The only mention of this function in QEMU comes from host-libusb.c. 

After some tests I found out the function 
host-libusb.c:usb_host_detach_kernel() is being called on Mac OS 11.1. It never 
reaches the libusb_detach_kernel_driver() function. It stops at the continue 
statement. Here is the full function:

static void usb_host_detach_kernel(USBHostDevice *s)
{
    printf("usb_host_detach_kernel() called\n");
    struct libusb_config_descriptor *conf;
    int rc, i;

    rc = libusb_get_active_config_descriptor(s->dev, &conf);
    if (rc != 0) {
        printf("rc != 0 => %d\n", rc);
        return;
    }
    for (i = 0; i < USB_MAX_INTERFACES; i++) {
        rc = libusb_kernel_driver_active(s->dh, i);
        usb_host_libusb_error("libusb_kernel_driver_active", rc);
        if (rc != 1) {
            if (rc == 0) {
                s->ifs[i].detached = true;
            }
            printf("rc != 1 => %d\n", rc);
            continue;
        }
        trace_usb_host_detach_kernel(s->bus_num, s->addr, i);
        rc = libusb_detach_kernel_driver(s->dh, i);
        printf("libusb_detach_kernel_driver() called\n");
        usb_host_libusb_error("libusb_detach_kernel_driver", rc);
        s->ifs[i].detached = true;
    }
    libusb_free_config_descriptor(conf);
}


Next to the continue statement in the loop is where my printf function says 
that rc is equal to 0. So it looks like libusb_kernel_driver_active() may have 
an issue since it sets the rc variable. Later on I will try to figure out what 
is happening here.
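
The control flow of that loop can be modelled without libusb. In this sketch
detach_pass() is a hypothetical function that takes the per-interface return
values of the "driver active" query as a list and reports which interfaces get
marked detached and for which ones the detach call would actually be made.

```python
def detach_pass(active_results):
    """Model the loop: return (interfaces marked detached, detach calls made)."""
    marked = []
    detach_calls = []
    for i, rc in enumerate(active_results):
        if rc != 1:
            if rc == 0:
                # No kernel driver reported: mark detached, skip the call.
                marked.append(i)
            continue
        # A kernel driver is reported active: this is the only path that
        # reaches the detach call.
        detach_calls.append(i)
        marked.append(i)
    return marked, detach_calls


all_zero = detach_pass([0, 0, 0])
mixed = detach_pass([1, 0, -5])
```

If the query returns 0 for every interface, as the printf output above
suggests, every interface is marked detached and the detach call is never
reached, which matches what was observed.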

[PATCH 2/2] hw/block/nvme: drain namespaces on sq deletion

2021-04-08 Thread Klaus Jensen

From: Klaus Jensen 

For most commands, when issuing an AIO, the BlockAIOCB is stored in the
NvmeRequest aiocb pointer when the AIO is issued. The main use of this
is cancelling AIOs when deleting submission queues (it is currently not
used for Abort).

However, some commands like Dataset Management and Zone Management Send
(zone reset) may involve more than one AIO, and here the AIOs are issued
without saving a reference to the BlockAIOCB. This is a problem since
nvme_del_sq() will attempt to cancel outstanding AIOs, potentially with
an invalid BlockAIOCB since the aiocb pointer is not NULL'ed when the
request structure is recycled.

Fix this by

  1. making sure the aiocb pointer is NULL'ed when requests are recycled
  2. only attempt to cancel the AIO if the aiocb is non-NULL
  3. if any AIOs could not be cancelled, drain all aio as a last resort.

Fixes: dc04d25e2f3f ("hw/block/nvme: add support for the format nvm command")
Fixes: c94973288cd9 ("hw/block/nvme: add broadcast nsid support flush command")
Fixes: e4e430b3d6ba ("hw/block/nvme: add simple copy command")
Fixes: 5f5dc4c6a942 ("hw/block/nvme: zero out zones on reset")
Fixes: 2605257a26b8 ("hw/block/nvme: add the dataset management command")
Cc: Gollu Appalanaidu 
Cc: Minwoo Im 
Signed-off-by: Klaus Jensen 
---
 hw/block/nvme.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 94bc373260be..3c4297e38a52 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -470,6 +470,7 @@ static void nvme_req_clear(NvmeRequest *req)
 {
 req->ns = NULL;
 req->opaque = NULL;
+req->aiocb = NULL;
 memset(&req->cqe, 0x0, sizeof(req->cqe));
 req->status = NVME_SUCCESS;
 }
@@ -3681,6 +3682,7 @@ static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeRequest *req)
 NvmeSQueue *sq;
 NvmeCQueue *cq;
 uint16_t qid = le16_to_cpu(c->qid);
+int nsid;
 
 if (unlikely(!qid || nvme_check_sqid(n, qid))) {
 trace_pci_nvme_err_invalid_del_sq(qid);
@@ -3692,9 +3694,26 @@ static uint16_t nvme_del_sq(NvmeCtrl *n, NvmeRequest 
*req)
 sq = n->sq[qid];
 while (!QTAILQ_EMPTY(&sq->out_req_list)) {
 r = QTAILQ_FIRST(&sq->out_req_list);
-assert(r->aiocb);
-blk_aio_cancel(r->aiocb);
+if (r->aiocb) {
+blk_aio_cancel(r->aiocb);
+}
 }
+
+/*
+ * Drain all namespaces if there are still outstanding requests that we
+ * could not cancel explicitly.
+ */
+if (!QTAILQ_EMPTY(&sq->out_req_list)) {
+for (nsid = 1; nsid <= NVME_MAX_NAMESPACES; nsid++) {
+NvmeNamespace *ns = nvme_ns(n, nsid);
+if (ns) {
+nvme_ns_drain(ns);
+}
+}
+}
+
+assert(QTAILQ_EMPTY(&sq->out_req_list));
+
 if (!nvme_check_cqid(n, sq->cqid)) {
 cq = n->cq[sq->cqid];
 QTAILQ_REMOVE(&cq->sq_list, sq, entry);
-- 
2.31.1
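
The three-step fix listed in the commit message can be modelled as follows.
Everything here is a hypothetical simplification: Request, Queue, complete()
and the drain callback are made up, cancelling is assumed to complete the
request synchronously, and the sketch walks a snapshot of the list rather than
repeatedly taking its head.

```python
class Request:
    def __init__(self, name, aiocb=None):
        self.name = name
        self.aiocb = aiocb      # None models "no reference was saved"


class Queue:
    def __init__(self, requests):
        self.out_req_list = list(requests)


def complete(queue, req):
    """Finish a request: remove it from the queue and clear it for reuse."""
    queue.out_req_list.remove(req)
    req.aiocb = None            # step 1: cleared when the request is recycled


def del_sq(queue, drain):
    for req in list(queue.out_req_list):
        if req.aiocb is not None:   # step 2: cancel only what we can name
            complete(queue, req)    # stand-in for a synchronous cancel
    if queue.out_req_list:          # step 3: last resort for the remainder
        drain(queue)
    assert not queue.out_req_list


drain_calls = []


def drain_all(queue):
    drain_calls.append(len(queue.out_req_list))
    for req in list(queue.out_req_list):
        complete(queue, req)


q = Queue([Request("read", aiocb=object()), Request("zone-reset")])
del_sq(q, drain_all)
```

In the example the tracked request is cancelled directly and the untracked one
is only flushed out by the drain, after which the queue is empty.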

[PATCH 1/2] hw/block/nvme: store aiocb in compare

2021-04-08 Thread Klaus Jensen

From: Klaus Jensen 

nvme_compare() fails to store the aiocb from the blk_aio_preadv() call.
Fix this.

Fixes: 0a384f923f51 ("hw/block/nvme: add compare command")
Cc: Gollu Appalanaidu 
Signed-off-by: Klaus Jensen 
---
 hw/block/nvme.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 6b1f056a0ebc..94bc373260be 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -2837,7 +2837,8 @@ static uint16_t nvme_compare(NvmeCtrl *n, NvmeRequest 
*req)
 
 block_acct_start(blk_get_stats(blk), &req->acct, data_len,
  BLOCK_ACCT_READ);
-blk_aio_preadv(blk, offset, &ctx->data.iov, 0, nvme_compare_data_cb, req);
+req->aiocb = blk_aio_preadv(blk, offset, &ctx->data.iov, 0,
+nvme_compare_data_cb, req);
 
 return NVME_NO_COMPLETE;
 }
-- 
2.31.1

Re: [PATCH v3 19/26] Hexagon (target/hexagon) add A5_ACS (vacsh)

2021-04-08 Thread Richard Henderson


On 4/7/21 6:57 PM, Taylor Simpson wrote:

Rxx32,Pe4 = vacsh(Rss32, Rtt32)
 Add compare and select elements of two vectors

Test cases in tests/tcg/hexagon/multi_result.c

Signed-off-by: Taylor Simpson
---
  target/hexagon/gen_tcg.h  |  5 +++
  target/hexagon/helper.h   |  2 +
  target/hexagon/imported/alu.idef  | 19 ++
  target/hexagon/imported/encode_pp.def |  1 +
  target/hexagon/op_helper.c| 33 +
  tests/tcg/hexagon/multi_result.c  | 69 +++
  6 files changed, 129 insertions(+)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v3 16/26] Hexagon (target/hexagon) compile all debug code

2021-04-08 Thread Richard Henderson


On 4/7/21 6:57 PM, Taylor Simpson wrote:

Change #if HEX_DEBUG to if (HEX_DEBUG) so that the debug code doesn't
bit rot.

Suggested-by: Philippe Mathieu-Daudé
Signed-off-by: Taylor Simpson
---
  target/hexagon/genptr.c| 72 ++--
  target/hexagon/helper.h|  2 --
  target/hexagon/internal.h  | 11 +++
  target/hexagon/op_helper.c | 14 +++--
  target/hexagon/translate.c | 74 ++
  target/hexagon/translate.h |  2 --
  6 files changed, 81 insertions(+), 94 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH] hw/block/nvme: map prp fix if prp2 contains non-zero offset

2021-04-08 Thread Klaus Jensen


On Apr  8 21:53, Padmakar Kalghatgi wrote:

From: padmakar 

nvme_map_prp needs to calculate the number of list entries based on the
offset value. For the subsequent PRP2 list, need to ensure the number of
entries is within the MAX number of PRP entries for a page.

Signed-off-by: Padmakar Kalghatgi 
---
hw/block/nvme.c | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d439e44..efb7368 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -577,7 +577,12 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, NvmeSg *sg, 
uint64_t prp1,
uint32_t nents, prp_trans;
int i = 0;

-nents = (len + n->page_size - 1) >> n->page_bits;
+/*
+ *   The first PRP list entry, pointed by PRP2 can contain
+ *   offsets. Hence, we need calculate the no of entries in
+ *   prp2 based on the offset it has.
+ */
+nents = (n->page_size - (prp2 % n->page_size)) >> 3;
prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
ret = nvme_addr_read(n, prp2, (void *)prp_list, prp_trans);
if (ret) {
@@ -588,7 +593,7 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, NvmeSg *sg, 
uint64_t prp1,
while (len != 0) {
uint64_t prp_ent = le64_to_cpu(prp_list[i]);

-if (i == n->max_prp_ents - 1 && len > n->page_size) {
+if (i == nents - 1 && len > n->page_size) {
if (unlikely(prp_ent & (n->page_size - 1))) {
trace_pci_nvme_err_invalid_prplist_ent(prp_ent);
status = NVME_INVALID_PRP_OFFSET | NVME_DNR;
@@ -597,7 +602,8 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, NvmeSg *sg, 
uint64_t prp1,

i = 0;
nents = (len + n->page_size - 1) >> n->page_bits;
-prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
+nents = MIN(nents, n->max_prp_ents);
+prp_trans = nents * sizeof(uint64_t);
ret = nvme_addr_read(n, prp_ent, (void *)prp_list,
 prp_trans);
if (ret) {
--
2.7.0.windows.1




LGTM.

Reviewed-by: Klaus Jensen 
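
The entry-count arithmetic from the patch can be tried in isolation. This is only a sketch: `prp_list_entries()` is a hypothetical helper, not a function in hw/block/nvme.c, and it assumes 8-byte PRP entries and a power-of-two page size, as the `>> 3` in the patch implies.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of 8-byte PRP entries that fit between
 * prp2 and the end of the page it points into.
 */
static uint32_t prp_list_entries(uint64_t prp2, uint32_t page_size)
{
    /* The calculation assumes a non-zero, power-of-two page size. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (uint32_t)((page_size - (prp2 % page_size)) >> 3);
}
```

For a 4096-byte page, a page-aligned prp2 gives 512 entries, while a prp2 that sits 8 bytes before the end of its page gives a single entry.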



Re: [PATCH v3 15/26] Hexagon (target/hexagon) move QEMU_GENERATE to only be on during macros.h

2021-04-08 Thread Richard Henderson


On 4/7/21 6:57 PM, Taylor Simpson wrote:

Reviewed-by: Philippe Mathieu-Daudé
Signed-off-by: Taylor Simpson
---
  target/hexagon/genptr.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v3 14/26] Hexagon (target/hexagon) cleanup reg_field_info definition

2021-04-08 Thread Richard Henderson


On 4/7/21 6:57 PM, Taylor Simpson wrote:

Include size in declaration
Remove {0, 0} entry

Suggested-by: Richard Henderson
Signed-off-by: Taylor Simpson
---
  target/hexagon/reg_fields.c | 3 +--
  target/hexagon/reg_fields.h | 4 ++--
  2 files changed, 3 insertions(+), 4 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v3 13/26] Hexagon (target/hexagon) cleanup ternary operators in semantics

2021-04-08 Thread Richard Henderson


On 4/7/21 6:57 PM, Taylor Simpson wrote:

Change  (cond ? (res = x) : (res = y)) to res = (cond ? x : y)

This makes the semantics easier for idef-parser to deal with

The following instructions are impacted
 C2_any8
 C2_all8
 C2_mux
 C2_muxii
 C2_muxir
 C2_muxri

Signed-off-by: Taylor Simpson
---
  target/hexagon/imported/compare.idef | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)


I presume this change has also been made in the upstream Qualcomm source, so 
that the next import does not revert it?  Anyway,


Reviewed-by: Richard Henderson 

r~
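
The rewrite described in the commit message can be illustrated in plain C. This is a sketch only: `mux_before()`, `mux_after()` and `check_mux_equivalent()` are hypothetical names, and the real change is made in the instruction semantics in compare.idef, not in C functions.

```c
#include <assert.h>

/* Before: each arm of the ternary performs its own assignment. */
static int mux_before(int cond, int x, int y)
{
    int res;
    (cond ? (res = x) : (res = y));
    return res;
}

/* After: the ternary only selects a value and one assignment remains. */
static int mux_after(int cond, int x, int y)
{
    int res = (cond ? x : y);
    return res;
}

/* Both forms are expected to agree for any inputs. */
static void check_mux_equivalent(int cond, int x, int y)
{
    assert(mux_before(cond, x, y) == mux_after(cond, x, y));
}
```

The second form has a single assignment target, which is presumably what makes it simpler for a parser to handle.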

Re: [PATCH v3 02/26] Hexagon (target/hexagon) cleanup gen_log_predicated_reg_write_pair

2021-04-08 Thread Richard Henderson


On 4/7/21 6:57 PM, Taylor Simpson wrote:

Similar to previous cleanup of gen_log_predicated_reg_write

Signed-off-by: Taylor Simpson
---
  target/hexagon/genptr.c | 27 +--
  1 file changed, 13 insertions(+), 14 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [RFC PATCH v2 04/11] qemu-iotests: delay QMP socket timers

2021-04-08 Thread Paolo Bonzini

On Thu, 8 Apr 2021 at 18:06, Emanuele Giuseppe Esposito wrote:

>
>
> On 08/04/2021 17:40, Paolo Bonzini wrote:
> > On 07/04/21 15:50, Emanuele Giuseppe Esposito wrote:
> >>   def get_qmp_events_filtered(self, wait=60.0):
> >>   result = []
> >> -for ev in self.get_qmp_events(wait=wait):
> >> +qmp_wait = wait
> >> +if qemu_gdb:
> >> +qmp_wait = 0.0
> >> +for ev in self.get_qmp_events(wait=qmp_wait):
> >>   result.append(filter_qmp_event(ev))
> >>   return result
> >
> > Should this be handled in get_qmp_events instead, since you're basically
> > changing all the callers?
>
> get_qmp_events is in python/machine.py, which as I understand might be
> used also by some other scripts, so I want to keep the changes there to
> the minimum. Also, machine.py has no access to qemu_gdb or
> qemu_valgrind, so passing a boolean or something to delay the timer
> would still require to add a similar check in all sections.
>
> Or do you have a cleaner way to do this?
>

Maybe a subclass IotestsMachine?

Paolo


> Emanuele
>
>

Re: [PATCH] target/arm: Make Thumb store insns UNDEF for Rn==1111

2021-04-08 Thread Richard Henderson


On 4/8/21 9:24 AM, Peter Maydell wrote:

The Arm ARM specifies that for Thumb encodings of the various plain
store insns, if the Rn field is 1111 then we must UNDEF.  This is
different from the Arm encodings, where this case is either
UNPREDICTABLE or has well-defined behaviour.  The exclusive stores,
store-release and STRD do not have this UNDEF case for any encoding.

Enforce the UNDEF for this case in the Thumb plain store insns.

Fixes:https://bugs.launchpad.net/qemu/+bug/1922887
Signed-off-by: Peter Maydell
---
  target/arm/translate.c | 16 
  1 file changed, 16 insertions(+)


Reviewed-by: Richard Henderson 

r~
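
A minimal sketch of the rule the commit message describes, assuming a 4-bit Rn field. `thumb_plain_store_undef()` is a hypothetical predicate for illustration; it is not the check added to target/arm/translate.c, and it deliberately ignores the exclusive, store-release and STRD encodings that the message says are unaffected.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical predicate: true when a Thumb plain store with the given
 * Rn field has to be treated as UNDEF.
 */
static bool thumb_plain_store_undef(unsigned rn)
{
    assert(rn <= 0xf);   /* Rn is a 4-bit register field */
    return rn == 0xf;    /* 0b1111 is the case the patch rejects */
}
```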

Re: [PATCH v1 4/8] target/riscv: Remove the hardcoded MSTATUS_SD macro

2021-04-08 Thread Richard Henderson


On 4/8/21 8:20 AM, Alistair Francis wrote:

  target_ulong sd = is_32bit(ctx) ? MSTATUS32_SD : MSTATUS64_SD;


It turns out clang doesn't like this, so I might still be stuck with ifdefs.


I think we need

#ifdef TARGET_RISCV32
#define is_32bit(ctx)  true
#else
...
#endif

based on

$ cat z.c
int foo(int x) { return x ? 1 : 1ul << 32; }
int bar(void) { return 1 ? 1 : 1ul << 32; }
rth@cloudburst:~$ clang-11 -S -Wall z.c
z.c:1:37: warning: implicit conversion from 'unsigned long' to 'int' changes value from 4294967296 to 0 [-Wconstant-conversion]

int foo(int x) { return x ? 1 : 1ul << 32; }
 ~~ ^

r~
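
One way the suggested macro split could look, as a sketch under stated assumptions. `ToyCtx` and its `xlen` field are hypothetical, and the `#else` branch is an illustrative guess rather than the elided code from the message above. The idea is that a 32-bit build sees a constant-true condition, so the 64-bit arm of the ternary is never the selected value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SD bit positions for the 32-bit and 64-bit mstatus layouts. */
#define MSTATUS32_SD 0x80000000u
#define MSTATUS64_SD 0x8000000000000000ull

/* Hypothetical translation context; xlen is an assumed field. */
typedef struct {
    int xlen;
} ToyCtx;

#ifdef TARGET_RISCV32
typedef uint32_t target_ulong;
/* Constant condition: the choice is known at compile time. */
#define is_32bit(ctx) true
#else
typedef uint64_t target_ulong;
/* Illustrative runtime check, assumed for this sketch only. */
#define is_32bit(ctx) ((ctx)->xlen == 32)
#endif

static target_ulong mstatus_sd(const ToyCtx *ctx)
{
    (void)ctx; /* unused when is_32bit() is the constant form */
    return is_32bit(ctx) ? MSTATUS32_SD : MSTATUS64_SD;
}
```

Built without `TARGET_RISCV32`, the function picks the SD bit from the context's width at run time; whether the constant form silences clang in the real code would still need to be checked against the actual build.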

Re: [PATCH] vmstate: Constify some VMStateDescriptions

2021-04-08 Thread Richard Henderson


On 4/8/21 7:07 AM, Keqian Zhu wrote:

Constify vmstate_ecc_state and vmstate_x86_cpu.

Signed-off-by: Keqian Zhu
---
  hw/block/ecc.c   | 2 +-
  include/hw/block/flash.h | 2 +-
  target/i386/cpu.h| 2 +-
  target/i386/machine.c| 2 +-
  4 files changed, 4 insertions(+), 4 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v10 2/6] arm64: kvm: Introduce MTE VM feature

2021-04-08 Thread Catalin Marinas

On Thu, Apr 08, 2021 at 08:16:17PM +0200, David Hildenbrand wrote:
> On 08.04.21 16:18, Catalin Marinas wrote:
> > On Wed, Apr 07, 2021 at 04:52:54PM +0100, Steven Price wrote:
> > > On 07/04/2021 16:14, Catalin Marinas wrote:
> > > > On Wed, Apr 07, 2021 at 11:20:18AM +0100, Steven Price wrote:
> > > > > On 31/03/2021 19:43, Catalin Marinas wrote:
> > > > > > When a slot is added by the VMM, if it asked for MTE in guest (I guess
> > > > > > that's an opt-in by the VMM, haven't checked the other patches), can we
> > > > > > reject it if it is going to be mapped as Normal Cacheable but it is a
> > > > > > ZONE_DEVICE (i.e. !kvm_is_device_pfn() + one of David's suggestions to
> > > > > > check for ZONE_DEVICE)? This way we don't need to do more expensive
> > > > > > checks in set_pte_at().
> > > > > 
> > > > > The problem is that KVM allows the VMM to change the memory backing a slot
> > > > > while the guest is running. This is obviously useful for the likes of
> > > > > migration, but ultimately means that even if you were to do checks at the
> > > > > time of slot creation, you would need to repeat the checks at set_pte_at()
> > > > > time to ensure a mischievous VMM didn't swap the page for a problematic one.
> > > > 
> > > > Does changing the slot require some KVM API call? Can we intercept it
> > > > and do the checks there?
> > > 
> > > As David has already replied - KVM uses MMU notifiers, so there's not really
> > > a good place to intercept this before the fault.
> > > 
> > > > Maybe a better alternative for the time being is to add a new
> > > > kvm_is_zone_device_pfn() and force KVM_PGTABLE_PROT_DEVICE if it returns
> > > > true _and_ the VMM asked for MTE in guest. We can then only set
> > > > PG_mte_tagged if !device.
> > > 
> > > KVM already has a kvm_is_device_pfn(), and yes I agree restricting the MTE
> > > checks to only !kvm_is_device_pfn() makes sense (I have the fix in my branch
> > > locally).
> > 
> > Indeed, you can skip it if kvm_is_device_pfn(). In addition, with MTE,
> > I'd also mark a pfn as 'device' in user_mem_abort() if
> > pfn_to_online_page() is NULL as we don't want to map it as Cacheable in
> > Stage 2. It's unlikely that we'll trip over this path but just in case.
> > 
> > (can we have a ZONE_DEVICE _online_ pfn or by definition they are
> > considered offline?)
> 
> By definition (and implementation) offline. When you get a page =
> pfn_to_online_page() with page != NULL, that one should never be ZONE_DEVICE
> (otherwise it would be a BUG).
> 
> As I said, things are different when exposing dax memory via dax/kmem to the
> buddy. But then, we are no longer talking about ZONE_DEVICE.

Thanks David, it's clear now.

-- 
Catalin

Re: [PATCH v2] Revert "target/mips: Deprecate nanoMIPS ISA"

2021-04-08 Thread Richard Henderson


On 4/8/21 10:01 AM, Aleksandar Rikalo wrote:

NanoMIPS ISA is supported again, since MediaTek is taking over
nanoMIPS toolchain development (confirmed at
https://www.spinics.net/linux/fedora/libvir/msg217107.html).

New release of the toolchain can be found at
(https://github.com/MediaTek-Labs/nanomips-gnu-toolchain/releases/tag/nanoMIPS-2021.02-01).


Reverting deprecation of nanoMIPS ISA requires following changes:
     MAINTAINERS: remove nanoMIPS ISA from orphaned ISAs
     deprecated.rst: remove nanoMIPS ISA from deprecated ISAs

Signed-off-by: Filip Vidojevic 
Signed-off-by: Aleksandar Rikalo 
---
  MAINTAINERS    |  4 
  docs/system/deprecated.rst | 20 
  2 files changed, 24 deletions(-)


NACK, for the reasons stated against v1:
https://lists.gnu.org/archive/html/qemu-devel/2021-04/msg00663.html

We're not going to remove nanomips this cycle, but we're not going to reset the 
clock on deprecation either.



r~

Re: [PATCH v10 2/6] arm64: kvm: Introduce MTE VM feature

2021-04-08 Thread David Hildenbrand


On 08.04.21 16:18, Catalin Marinas wrote:

On Wed, Apr 07, 2021 at 04:52:54PM +0100, Steven Price wrote:

On 07/04/2021 16:14, Catalin Marinas wrote:

On Wed, Apr 07, 2021 at 11:20:18AM +0100, Steven Price wrote:

On 31/03/2021 19:43, Catalin Marinas wrote:

When a slot is added by the VMM, if it asked for MTE in guest (I guess
that's an opt-in by the VMM, haven't checked the other patches), can we
reject it if it is going to be mapped as Normal Cacheable but it is a
ZONE_DEVICE (i.e. !kvm_is_device_pfn() + one of David's suggestions to
check for ZONE_DEVICE)? This way we don't need to do more expensive
checks in set_pte_at().


The problem is that KVM allows the VMM to change the memory backing a slot
while the guest is running. This is obviously useful for the likes of
migration, but ultimately means that even if you were to do checks at the
time of slot creation, you would need to repeat the checks at set_pte_at()
time to ensure a mischievous VMM didn't swap the page for a problematic one.


Does changing the slot require some KVM API call? Can we intercept it
and do the checks there?


As David has already replied - KVM uses MMU notifiers, so there's not really
a good place to intercept this before the fault.


Maybe a better alternative for the time being is to add a new
kvm_is_zone_device_pfn() and force KVM_PGTABLE_PROT_DEVICE if it returns
true _and_ the VMM asked for MTE in guest. We can then only set
PG_mte_tagged if !device.


KVM already has a kvm_is_device_pfn(), and yes I agree restricting the MTE
checks to only !kvm_is_device_pfn() makes sense (I have the fix in my branch
locally).


Indeed, you can skip it if kvm_is_device_pfn(). In addition, with MTE,
I'd also mark a pfn as 'device' in user_mem_abort() if
pfn_to_online_page() is NULL as we don't want to map it as Cacheable in
Stage 2. It's unlikely that we'll trip over this path but just in case.

(can we have a ZONE_DEVICE _online_ pfn or by definition they are
considered offline?)


By definition (and implementation) offline. When you get a page = 
pfn_to_online_page() with page != NULL, that one should never be 
ZONE_DEVICE (otherwise it would be a BUG).


As I said, things are different when exposing dax memory via dax/kmem to 
the buddy. But then, we are no longer talking about ZONE_DEVICE.


--
Thanks,

David / dhildenb

Re: [PATCH] checkpatch: Fix use of uninitialized value

2021-04-08 Thread Isaku Yamahata



How about initializing them explicitly as follows?
($realfile ne '') prevents the case realfile eq '' && acpi_testexpected eq ''.
Anyway your patch also should fix it. So
Reviewed-by: Isaku Yamahata 


diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 8f7053ec9b..2eb894a628 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1325,8 +1325,8 @@ sub process {
my %suppress_whiletrailers;
my %suppress_export;
 
-my $acpi_testexpected;
-my $acpi_nontestexpected;
+my $acpi_testexpected = '';
+my $acpi_nontestexpected = '';
 
# Pre-scan the patch sanitizing the lines.
 

On Thu, Apr 08, 2021 at 08:51:19AM +0200,
Greg Kurz  wrote:

> checkfilename() doesn't always set $acpi_testexpected. Fix the following
> warning:
> 
> Use of uninitialized value $acpi_testexpected in string eq at
>  ./scripts/checkpatch.pl line 1529.
> 
> Fixes: d2f1af0e4120 ("checkpatch: don't emit warning on newly created acpi 
> data files")
> Cc: isaku.yamah...@intel.com
> Signed-off-by: Greg Kurz 
> ---
>  scripts/checkpatch.pl |1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 8f7053ec9b26..3d185cceac94 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -1532,6 +1532,7 @@ sub process {
>($line =~ /\{\s*([\w\/\.\-]*)\s*\=\>\s*([\w\/\.\-]*)\s*\}/ 
> &&
> (defined($1) || defined($2 &&
>!(($realfile ne '') &&
> +defined($acpi_testexpected) &&
>  ($realfile eq $acpi_testexpected))) {
>   $reported_maintainer_file = 1;
>   WARN("added, moved or deleted file(s), does MAINTAINERS 
> need updating?\n" . $herecurr);
> 
> 
> 

-- 
Isaku Yamahata

RE: [PATCH v4 00/22] ppc: qemu: Add eTSEC support

2021-04-08 Thread Priyanka Jain



>-----Original Message-----
>From: Bin Meng 
>Sent: Tuesday, April 6, 2021 2:18 PM
>To: Priyanka Jain ; Ramon Fried
>; Simon Glass ; U-Boot Mailing List
>
>Cc: Tom Rini ; Vladimir Oltean ;
>qemu-devel@nongnu.org Developers ; qemu-ppc
>
>Subject: Re: [PATCH v4 00/22] ppc: qemu: Add eTSEC support
>
>Hi Priyanka,
>
>On Sun, Mar 14, 2021 at 8:15 PM Bin Meng  wrote:
>>
>> QEMU ppce500 machine can dynamically instantiate an eTSEC device if
>> "-device eTSEC" is given to QEMU.
>>
>> This series updates the fixed-link ethernet PHY driver as well as the
>> Freescale eTSEC driver to support the QEMU ppce500 board.
>>
>> 3 patches related to fixed phy in v1 are dropped in v2 as the changes
>> were done by Vladimir's fixed phy & Sandbox DSA series [1]. Vladimir's
>> series is now included in v2 to avoid dependencies.
>>
>> This cover letter is cc'ed to QEMU mailing list for a heads-up.
>> A future patch will be sent to QEMU mailing list to bring its in-tree
>> U-Boot source codes up-to-date.
>>
>> Azure results: PASS
>> https://dev.azure.com/bmeng/GitHub/_build/results?buildId=343&view=results
>>
>> This series is available at u-boot-x86/eTSEC for testing.
>>
>> [1]
>> https://patchwork.ozlabs.org/project/uboot/patch/20210216224804.3355044-2-olteanv@gmail.com/
>>
>> Changes in v4:
>> - describe "ranges" is required for the alternate description
>> - make platform_bus_map_region() return void
>
>Now the v2021.07 merge window is open, would you please apply this series?
>Thanks!
>
>Regards,
>Bin

Yes, will include this in pull-request around next week.

Regards
Priyanka

[PATCH] hw/block/nvme: map prp fix if prp2 contains non-zero offset

2021-04-08 Thread Padmakar Kalghatgi

From: padmakar 

nvme_map_prp needs to calculate the number of list entries based on the
offset value. For the subsequent PRP2 list, need to ensure the number of
entries is within the MAX number of PRP entries for a page.

Signed-off-by: Padmakar Kalghatgi 
---
 hw/block/nvme.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d439e44..efb7368 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -577,7 +577,12 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, NvmeSg *sg, 
uint64_t prp1,
 uint32_t nents, prp_trans;
 int i = 0;
 
-nents = (len + n->page_size - 1) >> n->page_bits;
+/*
+ *   The first PRP list entry, pointed by PRP2 can contain
+ *   offsets. Hence, we need calculate the no of entries in
+ *   prp2 based on the offset it has.
+ */
+nents = (n->page_size - (prp2 % n->page_size)) >> 3;
 prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
 ret = nvme_addr_read(n, prp2, (void *)prp_list, prp_trans);
 if (ret) {
@@ -588,7 +593,7 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, NvmeSg *sg, 
uint64_t prp1,
 while (len != 0) {
 uint64_t prp_ent = le64_to_cpu(prp_list[i]);
 
-if (i == n->max_prp_ents - 1 && len > n->page_size) {
+if (i == nents - 1 && len > n->page_size) {
 if (unlikely(prp_ent & (n->page_size - 1))) {
 trace_pci_nvme_err_invalid_prplist_ent(prp_ent);
 status = NVME_INVALID_PRP_OFFSET | NVME_DNR;
@@ -597,7 +602,8 @@ static uint16_t nvme_map_prp(NvmeCtrl *n, NvmeSg *sg, 
uint64_t prp1,
 
 i = 0;
 nents = (len + n->page_size - 1) >> n->page_bits;
-prp_trans = MIN(n->max_prp_ents, nents) * sizeof(uint64_t);
+nents = MIN(nents, n->max_prp_ents);
+prp_trans = nents * sizeof(uint64_t);
 ret = nvme_addr_read(n, prp_ent, (void *)prp_list,
  prp_trans);
 if (ret) {
-- 
2.7.0.windows.1

Re: [RFC v12 08/65] target/arm: tcg: split m_helper user-only and sysemu-only parts

2021-04-08 Thread Alex Bennée



Claudio Fontana  writes:

> in the process remove a few CONFIG_TCG that are superfluous now.
>
> Signed-off-by: Claudio Fontana 
> Reviewed-by: Richard Henderson 
> ---
>  target/arm/tcg/m_helper.h |   21 +
>  target/arm/tcg/m_helper.c | 2766 +
>  target/arm/tcg/sysemu/m_helper.c  | 2655 +++
>  target/arm/tcg/user/m_helper.c|   97 +
>  target/arm/tcg/sysemu/meson.build |1 +
>  target/arm/tcg/user/meson.build   |1 +
>  6 files changed, 2780 insertions(+), 2761 deletions(-)
>  create mode 100644 target/arm/tcg/m_helper.h
>  create mode 100644 target/arm/tcg/sysemu/m_helper.c
>  create mode 100644 target/arm/tcg/user/m_helper.c
>
> diff --git a/target/arm/tcg/m_helper.h b/target/arm/tcg/m_helper.h
> new file mode 100644
> index 00..9da106aa65
> --- /dev/null
> +++ b/target/arm/tcg/m_helper.h
> @@ -0,0 +1,21 @@
> +/*
> + * ARM v7m generic helpers.
> + *
> + * This code is licensed under the GNU GPL v2 or later.
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#ifndef M_HELPER_H
> +#define M_HELPER_H
> +
> +#include "cpu.h"
> +
> +void v7m_msr_xpsr(CPUARMState *env, uint32_t mask,
> +  uint32_t reg, uint32_t val);
> +
> +uint32_t v7m_mrs_xpsr(CPUARMState *env, uint32_t reg, unsigned el);
> +
> +uint32_t v7m_mrs_control(CPUARMState *env, uint32_t secure);
> +
> +#endif /* M_HELPER_H */
> diff --git a/target/arm/tcg/m_helper.c b/target/arm/tcg/m_helper.c
> index d63ae465e1..8f3763155f 100644
> --- a/target/arm/tcg/m_helper.c
> +++ b/target/arm/tcg/m_helper.c
> @@ -1,5 +1,5 @@
>  /*
> - * ARM generic helpers.
> + * ARM v7m generic helpers.
>   *
>   * This code is licensed under the GNU GPL v2 or later.
>   *
> @@ -7,35 +7,11 @@
>   */

>  
> -static void v7m_msr_xpsr(CPUARMState *env, uint32_t mask,
> - uint32_t reg, uint32_t val)
> +void v7m_msr_xpsr(CPUARMState *env, uint32_t mask, uint32_t reg, uint32_t 
> val)
>  {
>  /* Only APSR is actually writable */
>  if (!(reg & 4)) {
> @@ -51,7 +27,7 @@ static void v7m_msr_xpsr(CPUARMState *env, uint32_t mask,
>  }
>  }


I guess there is a question about why the helpers can't just live in
the header, stay static, and become inlines in the one place they are
included. But this is M-profile and I doubt the difference is
measurable, so:

Reviewed-by: Alex Bennée 

-- 
Alex Bennée

Re: [PATCH for-6.0? 1/3] job: Add job_wait_unpaused() for block-job-complete

2021-04-08 Thread Vladimir Sementsov-Ogievskiy


08.04.2021 20:04, John Snow wrote:

On 4/8/21 12:58 PM, Vladimir Sementsov-Ogievskiy wrote:

The job-complete command is async. Can we instead just add a boolean like
job->completion_requested, set it if job-complete is called in STANDBY state,
and have job_resume call job_complete automatically if this boolean is
true?


job_complete has a synchronous setup, though -- we lose out on a lot of 
synchronous error checking in that circumstance.


yes, that's a problem..



I was not able to audit it to determine that it'd be safe to attempt that setup 
during a drained section -- I imagine it won't work and will fail, though.

So I thought we'd have to signal completion and run the setup *later*, but what do we do 
if we get an error then? Does the entire job fail? Do we emit some new event? 
("BLOCK_JOB_COMPLETION_FAILED" ?) Is it recoverable?



Isn't it possible even now that, after a successful job-complete, the job
still fails and we report BLOCK_JOB_COMPLETED with an error?

And actually, how much benefit does the user get from the fact that
job-complete may fail?

We can make job-complete a simple always-success boolean flag setter, like
job-pause.

The actual completion will then be done in the background, when possible. If
it fails, the job just fails, as it does for any background I/O error. The
user has to check the error/success status of the final BLOCK_JOB_COMPLETED
anyway.

--
Best regards,
Vladimir
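
The flag-based idea discussed above can be sketched as a toy state machine. Everything here is hypothetical: `ToyJob`, the `TOY_*` states and both functions are illustrations only and do not reflect QEMU's job API or its locking, draining and error reporting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy job states, loosely modelled on the states named in the thread. */
typedef enum { TOY_RUNNING, TOY_STANDBY, TOY_COMPLETING } ToyStatus;

typedef struct {
    ToyStatus status;
    bool completion_requested; /* set when completion must be deferred */
} ToyJob;

/* Request completion: defer it if the job is paused, else start it. */
static void toy_job_complete(ToyJob *job)
{
    assert(job != NULL);
    if (job->status == TOY_STANDBY) {
        job->completion_requested = true;
        return;
    }
    job->status = TOY_COMPLETING;
}

/* Resume a paused job and replay a completion request recorded earlier. */
static void toy_job_resume(ToyJob *job)
{
    assert(job != NULL);
    if (job->status != TOY_STANDBY) {
        return;
    }
    job->status = TOY_RUNNING;
    if (job->completion_requested) {
        job->completion_requested = false;
        toy_job_complete(job);
    }
}
```

The sketch shows only the deferral; the open question from the thread, namely how to report a failure of the deferred setup step, is not addressed here.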

[PATCH v2] Revert "target/mips: Deprecate nanoMIPS ISA"

2021-04-08 Thread Aleksandar Rikalo

NanoMIPS ISA is supported again, since MediaTek is taking over
nanoMIPS toolchain development (confirmed at
https://www.spinics.net/linux/fedora/libvir/msg217107.html).

New release of the toolchain can be found at
(https://github.com/MediaTek-Labs/nanomips-gnu-toolchain/releases/tag/nanoMIPS-2021.02-01).

Reverting deprecation of nanoMIPS ISA requires following changes:
MAINTAINERS: remove nanoMIPS ISA from orphaned ISAs
deprecated.rst: remove nanoMIPS ISA from deprecated ISAs

Signed-off-by: Filip Vidojevic 
Signed-off-by: Aleksandar Rikalo 
---
 MAINTAINERS|  4 
 docs/system/deprecated.rst | 20 
 2 files changed, 24 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 69003cdc3c..498dbf0ae4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -254,10 +254,6 @@ F: include/hw/timer/mips_gictimer.h
 F: tests/tcg/mips/
 K: ^Subject:.*(?i)mips

-MIPS TCG CPUs (nanoMIPS ISA)
-S: Orphan
-F: disas/nanomips.*
-
 Moxie TCG CPUs
 M: Anthony Green 
 S: Maintained
diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 80cae86252..a25293cb01 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -228,13 +228,6 @@ to build binaries for it.
 ``Icelake-Client`` CPU Models are deprecated. Use ``Icelake-Server`` CPU
 Models instead.

-MIPS ``I7200`` CPU Model (since 5.2)
-
-
-The ``I7200`` guest CPU relies on the nanoMIPS ISA, which is deprecated
-(the ISA has never been upstreamed to a compiler toolchain). Therefore
-this CPU is also deprecated.
-
 System emulator machines
 

@@ -305,13 +298,6 @@ The ``ppc64abi32`` architecture has a number of issues 
which regularly
 trip up our CI testing and is suspected to be quite broken. For that
 reason the maintainers strongly suspect no one actually uses it.

-MIPS ``I7200`` CPU (since 5.2)
-''
-
-The ``I7200`` guest CPU relies on the nanoMIPS ISA, which is deprecated
-(the ISA has never been upstreamed to a compiler toolchain). Therefore
-this CPU is also deprecated.
-
 Related binaries
 

@@ -378,9 +364,3 @@ resolve CPU model aliases before starting a virtual machine.

 Guest Emulator ISAs
 ---
-
-nanoMIPS ISA
-
-
-The ``nanoMIPS`` ISA has never been upstreamed to any compiler toolchain.
-As it is hard to generate binaries for it, declare it deprecated.
--
2.25.1
