Re: [Qemu-devel] [PATCH RFC 0/4] intel_iommu: Do sanity check of vfio-pci earlier

2019-08-19 Thread Peter Xu
On Mon, Aug 12, 2019 at 09:45:27AM +0200, Peter Xu wrote:
> This is an RFC series.
> 
> The VT-d code has some defects; one of them is that we cannot detect
> the misuse of vIOMMU and vfio-pci early enough.
> 
> For example, logically this is not allowed:
> 
>   -device intel-iommu,caching-mode=off \
>   -device vfio-pci,host=05:00.0
> 
> Because the caching mode is required to make vfio-pci devices
> functional.
> 
> Previously we did this sanity check in vtd_iommu_notify_flag_changed(),
> when the memory regions change their attributes.  However that's
> too late in most cases!  Because the memory region layouts will only
> change after the IOMMU is enabled, and in most cases that happens while
> the guest OS boots.  So when the configuration is wrong, we will only bail
> out during guest boot rather than simply telling the user before
> QEMU starts.
> 
> The same problem happens on device hotplug, say, when we have this:
> 
>   -device intel-iommu,caching-mode=off
> 
> Then we do something like:
> 
>   (HMP) device_add vfio-pci,host=05:00.0,bus=pcie.1
> 
> If at that time the vIOMMU is enabled in the guest then the QEMU
> process will simply quit directly due to this hotplug event.  This is
> a bit insane...
> 
> This series tries to solve the above two problems by introducing two
> sanity checks, one at each of these places:
> 
>   - machine done
>   - hotplug device
> 
> This is a bit awkward but I hope this could be better than before.
> There are of course other solutions, like hard-coding the check into
> vfio-pci, but I feel that is even less pretty.  I couldn't think of any
> better way to do this; if there is one, please kindly shout out.
> 
> Please have a look to see whether this would be acceptable, thanks.

Any more comments on this?
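
For reference, the rule being checked can be sketched as below.  This is
only an illustration under assumed names (VmConfig and
check_vfio_with_viommu() are hypothetical and are not taken from the
series); the same predicate would apply both at machine-done time and when
a device is hotplugged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the configuration; not QEMU's real types. */
typedef struct {
    bool viommu_present;   /* -device intel-iommu was given */
    bool caching_mode;     /* caching-mode=on */
    size_t vfio_devices;   /* number of vfio-pci devices (cold- or hot-plugged) */
} VmConfig;

/* Returns NULL when the configuration is acceptable, else an error message. */
static const char *check_vfio_with_viommu(const VmConfig *cfg)
{
    assert(cfg != NULL);
    if (cfg->viommu_present && !cfg->caching_mode && cfg->vfio_devices > 0) {
        return "vfio-pci behind intel-iommu requires caching-mode=on";
    }
    return NULL;
}
```

The point of running such a check early is that the user gets an error
message instead of a QEMU exit in the middle of guest boot.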

Thanks,

-- 
Peter Xu



[Qemu-devel] [PATCH v2 1/2] memory: Inherit has_coalesced_range from the same old FlatRange

2019-08-19 Thread Peter Xu
The previous has_coalesced_range counter has a problem in that it only
works for additions of coalesced mmio ranges but not deletions.  The
reason is that has_coalesced_range information can be lost when the
FlatView updates the topology again and the updated region does not
cover the coalesced regions.  When that happens, because
flatrange_equal() does not check has_coalesced_range, the new
FlatRange will be seen as the same one as the old and the new
instance (whose has_coalesced_range will be zero) will replace the old
instance (whose has_coalesced_range _could_ be non-zero).

To fix it, we inherit the has_coalesced_range value from the old
FlatRange to the new one if it's describing the identical range when
updating the topology.

Without this patch, MemoryListener.coalesced_io_del is hardly ever
called, because has_coalesced_range will mostly be zero in
flat_range_coalesced_io_del() when topologies frequently change for
the "memory" address space.

Fixes: 3ac7d43a6fbb5d4a3
Suggested-by: Paolo Bonzini 
Signed-off-by: Peter Xu 
---
 memory.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/memory.c b/memory.c
index 8141486832..c53dcfc092 100644
--- a/memory.c
+++ b/memory.c
@@ -939,6 +939,15 @@ static void address_space_update_topology_pass(AddressSpace *as,
 /* In both and unchanged (except logging may have changed) */
 
 if (adding) {
+/*
+ * We must inherit the has_coalesced_range information
+ * if the new FlatRange is exactly the same as the old
+ * one, because it'll be used to conditionally call
+ * the coalesced mmio deletion listeners correctly in
+ * flat_range_coalesced_io_del() when the FlatRange
+ * needs to really go away.
+ */
+frnew->has_coalesced_range = frold->has_coalesced_range;
 MEMORY_LISTENER_UPDATE_REGION(frnew, as, Forward, region_nop);
 if (frnew->dirty_log_mask & ~frold->dirty_log_mask) {
 MEMORY_LISTENER_UPDATE_REGION(frnew, as, Forward, log_start,
-- 
2.21.0
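
The effect of the inheritance can be shown with a small standalone sketch.
The struct and helper names here (FlatRangeSketch, sketch_equal(),
sketch_inherit()) are made up for illustration and keep only the fields
needed to show the idea; the real FlatRange and flatrange_equal() compare
more than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down FlatRange: only the fields needed for the sketch. */
typedef struct {
    uint64_t start;
    uint64_t size;
    int has_coalesced_range;   /* state that the equality check ignores */
} FlatRangeSketch;

/* Mirrors the idea of flatrange_equal(): coalesced state is not compared. */
static bool sketch_equal(const FlatRangeSketch *a, const FlatRangeSketch *b)
{
    return a->start == b->start && a->size == b->size;
}

/*
 * When the new range describes the same range as the old one, carry the
 * coalesced state over so that a later deletion still sees it.
 */
static void sketch_inherit(const FlatRangeSketch *frold, FlatRangeSketch *frnew)
{
    if (sketch_equal(frold, frnew)) {
        frnew->has_coalesced_range = frold->has_coalesced_range;
    }
}
```

Without the copy, the replacement instance would start from zero and the
deletion path would return early.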




[Qemu-devel] [PATCH v2 0/2] memory: Fix up coalesced_io_del not working for KVM

2019-08-19 Thread Peter Xu
v2:
- simply migrate has_coalesced_range in patch 1, and add comments
  in the code because that can be a bit non-obvious [Paolo]

v1 is here:

https://lists.gnu.org/archive/html/qemu-devel/2019-08/msg03293.html

Peter Xu (2):
  memory: Inherit has_coalesced_range from the same old FlatRange
  memory: Split zones when do coalesced_io_del()

 memory.c | 39 ---
 1 file changed, 36 insertions(+), 3 deletions(-)

-- 
2.21.0




[Qemu-devel] [PATCH v2 2/2] memory: Split zones when do coalesced_io_del()

2019-08-19 Thread Peter Xu
This is a workaround for the current KVM_UNREGISTER_COALESCED_MMIO
interface in KVM.  The kernel interface only allows unregistering an mmio
device with exactly the zone size it was registered with, or with any
smaller zone that is included in the device mmio zone.  It does not let
userspace specify a very large zone to remove all the small mmio
devices within the zone covered.

Logically speaking it would be nicer to fix this on the KVM side, though
in any case we still need to cope with old kernels so let's do this.

This patch has nothing to do with 3ac7d43a6fbb5d4a3 because this has
probably been broken from the very beginning, when the
KVM_UNREGISTER_COALESCED_MMIO interface was introduced in the kernel.
However to make the backport to stables easier, I'm still using the
commit 3ac7d43a6fbb5d4a3 to track this problem because this will
depend on that, otherwise even additions of mmio devices won't work.

Fixes: 3ac7d43a6fbb5d4a3
Signed-off-by: Peter Xu 
---
 memory.c | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/memory.c b/memory.c
index c53dcfc092..7684b423f8 100644
--- a/memory.c
+++ b/memory.c
@@ -857,6 +857,9 @@ static void address_space_update_ioeventfds(AddressSpace *as)
 
 static void flat_range_coalesced_io_del(FlatRange *fr, AddressSpace *as)
 {
+CoalescedMemoryRange *cmr;
+AddrRange tmp;
+
 if (!fr->has_coalesced_range) {
 return;
 }
@@ -865,9 +868,30 @@ static void flat_range_coalesced_io_del(FlatRange *fr, AddressSpace *as)
 return;
 }
 
-MEMORY_LISTENER_UPDATE_REGION(fr, as, Reverse, coalesced_io_del,
-  int128_get64(fr->addr.start),
-  int128_get64(fr->addr.size));
+/*
+ * We split the big region into smaller ones to satisfy KVM's
+ * KVM_UNREGISTER_COALESCED_MMIO interface, where it does not
+ * allow to specify a large region to unregister all the devices
+ * under that zone instead it only accepts exact zones or even a
+ * smaller zone of previously registered mmio device.  Logically
+ * speaking we should better fix KVM to allow the userspace to
+ * unregister multiple mmio devices within a large requested zone,
+ * but in all cases we'll still need to live with old kernels.  So
+ * let's simply break the zones into exactly the small pieces when
+ * we do coalesced_io_add().
+ */
+QTAILQ_FOREACH(cmr, &fr->mr->coalesced, link) {
+tmp = addrrange_shift(cmr->addr,
+  int128_sub(fr->addr.start,
+ int128_make64(fr->offset_in_region)));
+if (!addrrange_intersects(tmp, fr->addr)) {
+continue;
+}
+tmp = addrrange_intersection(tmp, fr->addr);
+MEMORY_LISTENER_UPDATE_REGION(fr, as, Reverse, coalesced_io_del,
+  int128_get64(tmp.start),
+  int128_get64(tmp.size));
+}
 }
 
 static void flat_range_coalesced_io_add(FlatRange *fr, AddressSpace *as)
-- 
2.21.0
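
The splitting logic can be tried outside QEMU with the sketch below.  It
uses plain 64-bit values instead of Int128, and the names Range,
range_intersects(), range_intersection() and split_zones() are stand-ins
invented for this illustration, not QEMU functions.  Each output piece
corresponds to one coalesced_io_del call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's AddrRange, using plain 64-bit values. */
typedef struct {
    uint64_t start;
    uint64_t size;
} Range;

static bool range_intersects(Range a, Range b)
{
    return a.start < b.start + b.size && b.start < a.start + a.size;
}

static Range range_intersection(Range a, Range b)
{
    uint64_t start = a.start > b.start ? a.start : b.start;
    uint64_t end_a = a.start + a.size;
    uint64_t end_b = b.start + b.size;
    uint64_t end = end_a < end_b ? end_a : end_b;

    assert(range_intersects(a, b));
    return (Range){ start, end - start };
}

/*
 * For each coalesced zone (given relative to the region start), shift it to
 * an absolute address, clip it to the flat range, and store the piece that
 * would be unregistered on its own.  Returns the number of pieces written;
 * "out" must have room for nzones entries.
 */
static size_t split_zones(Range flat, uint64_t offset_in_region,
                          const Range *zones, size_t nzones, Range *out)
{
    size_t n = 0;

    for (size_t i = 0; i < nzones; i++) {
        Range zone = zones[i];

        zone.start += flat.start - offset_in_region;
        if (!range_intersects(zone, flat)) {
            continue;
        }
        out[n++] = range_intersection(zone, flat);
    }
    return n;
}
```

A zone that only partly overlaps the flat range is clipped, and a zone
outside it is skipped, so every request stays within a zone that was
registered earlier.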




Re: [Qemu-devel] [PATCH 1/2] memory: Replace has_coalesced_range with add/del flags

2019-08-19 Thread Peter Xu
On Mon, Aug 19, 2019 at 04:30:45PM +0200, Paolo Bonzini wrote:
> On 17/08/19 11:32, Peter Xu wrote:
> > The previous has_coalesced_range counter has a problem in that it only
> > works for additions of coalesced mmio ranges but not deletions.  The
> > reason is that has_coalesced_range information can be lost when the
> > FlatView updates the topology again when the updated region is not
> > covering the coalesced regions. When that happens, due to
> > flatrange_equal() is not checking against has_coalesced_range, the new
> > FlatRange will be seen as the same one as the old and the new
> > instance (whose has_coalesced_range will be zero) will replace the old
> > instance (whose has_coalesced_range _could_ be non-zero).
> > 
> > To fix it, we don't cache has_coalesced_range at all in the FlatRange.
> > Instead we introduce two flags to make sure the coalesced_io_{add|del}
> > will only be called once for every FlatRange instance.  This will even
> > work if another FlatRange replaces current one.
> 
> It's still a bit ugly that coalesced_mmio_add_done ends up not being set
> on the new (but equal) FlatRange.
> 
> Would something like this work too?
> 
> diff --git a/memory.c b/memory.c
> index edd0c13..fc91f06 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -939,6 +939,7 @@ static void address_space_update_topology_pass(AddressSpace *as,
>  /* In both and unchanged (except logging may have changed) */
>  
>  if (adding) {
> +frnew->has_coalesced_range = frold->has_coalesced_range;
>  MEMORY_LISTENER_UPDATE_REGION(frnew, as, Forward, region_nop);
>  if (frnew->dirty_log_mask & ~frold->dirty_log_mask) {
>  MEMORY_LISTENER_UPDATE_REGION(frnew, as, Forward, log_start,

This seems to be a much better (and shorter) idea. :-)

I'll verify it and repost if it goes well.

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [kata-dev] [ANNOUNCE] virtio-fs v0.3 release

2019-08-19 Thread Peng Tao




On 2019/8/20 00:04, Stefan Hajnoczi wrote:

I am delighted to announce the release of virtio-fs v0.3, a shared file
system that lets virtual machines access a directory tree on the host.
This release is based on QEMU 4.1.0 and Linux 5.3-rc3.

Good news! As virtio-fs is maturing and stabilizing, what's the plan for
upstreaming both the QEMU and kernel parts of it?


Cheers,
Tao


For more information about virtio-fs: https://virtio-fs.gitlab.io/

This is a development release aimed at early adopters of virtio-fs.  Work is
being done to upstream the code into Linux and QEMU.  We expect to stop
publishing virtio-fs releases once the code has been merged by these upstream
projects.

Where to get it:

   https://gitlab.com/virtio-fs/linux/-/tags/virtio-fs-v0.3
   https://gitlab.com/virtio-fs/qemu/-/tags/virtio-fs-v0.3

Changes:

  * Please note that the mount syntax has changed to:

  # mount -t virtio_fs myfs /mnt -o ...

The old syntax was "mount -t virtio_fs none /mnt -o tag=myfs,...".

  * virtiofsd --fd=FDNUM takes a listen socket file descriptor number.  File
descriptor passing is an alternative way to manage the vhost-user UNIX
domain socket.  The parent process no longer needs to wait for virtiofsd to
create the listen socket before spawning the VM.

  * virtiofsd --syslog logs to syslog(2) instead of stderr.  Useful for unifying
logging and when the virtiofsd process is not being supervised.

  * virtiofsd --thread-pool-size=NUM sets the maximum number of worker threads
for FUSE request processing.  This can be used to control the host queue
depth.  The default is 64.

  * Performance improvements and bug fixes.

Note for Kata Containers: the new kernel is not compatible with existing
Kata Containers releases due to the mount syntax change.  To try it out,
please apply the following kata-runtime patch:

   https://gitlab.com/virtio-fs/runtime/commit/a2e44de817e438c02a495cf258039774527e3178

Kata Containers patches for virtio-fs v0.3 are under development and will be
submitted to Kata soon.

Thanks to the following people for contributing code and to many more
for helping the virtio-fs effort:

Dr. David Alan Gilbert 
Eric Ren 
Eryu Guan 
Ganesh Maharaj Mahalingam 
Jiufei Xue 
Liu Bo 
Masayoshi Mizuma 
Miklos Szeredi 
Peng Tao 
piaojun 
Sebastien Boeuf 
Stefan Hajnoczi 
Vivek Goyal 
Xiaoguang Wang 


___
kata-dev mailing list
kata-...@lists.katacontainers.io
http://lists.katacontainers.io/cgi-bin/mailman/listinfo/kata-dev



--
Into something rich and strange.



Re: [Qemu-devel] [kata-dev] [ANNOUNCE] virtio-fs v0.3 release

2019-08-19 Thread Xu Wang
Thanks to all the contributors; looking forward to having a production
virtio-fs deployment shortly.

> On Aug 20, 2019, at 12:04 AM, Stefan Hajnoczi  wrote:
> 
> I am delighted to announce the release of virtio-fs v0.3, a shared file
> system that lets virtual machines access a directory tree on the host.
> This release is based on QEMU 4.1.0 and Linux 5.3-rc3.
> 
> For more information about virtio-fs: https://virtio-fs.gitlab.io/
> 
> This is a development release aimed at early adopters of virtio-fs.  Work is
> being done to upstream the code into Linux and QEMU.  We expect to stop
> publishing virtio-fs releases once the code has been merged by these upstream
> projects.
> 
> Where to get it:
> 
>  https://gitlab.com/virtio-fs/linux/-/tags/virtio-fs-v0.3
>  https://gitlab.com/virtio-fs/qemu/-/tags/virtio-fs-v0.3
> 
> Changes:
> 
> * Please note that the mount syntax has changed to:
> 
> # mount -t virtio_fs myfs /mnt -o ...
> 
>   The old syntax was "mount -t virtio_fs none /mnt -o tag=myfs,...".
> 
> * virtiofsd --fd=FDNUM takes a listen socket file descriptor number.  File
>   descriptor passing is an alternative way to manage the vhost-user UNIX
>   domain socket.  The parent process no longer needs to wait for virtiofsd to
>   create the listen socket before spawning the VM.
> 
> * virtiofsd --syslog logs to syslog(2) instead of stderr.  Useful for unifying
>   logging and when the virtiofsd process is not being supervised.
> 
> * virtiofsd --thread-pool-size=NUM sets the maximum number of worker threads
>   for FUSE request processing.  This can be used to control the host queue
>   depth.  The default is 64.
> 
> * Performance improvements and bug fixes.
> 
> Note for Kata Containers: the new kernel is not compatible with existing
> Kata Containers releases due to the mount syntax change.  To try it out,
> please apply the following kata-runtime patch:
> 
>  https://gitlab.com/virtio-fs/runtime/commit/a2e44de817e438c02a495cf258039774527e3178
> 
> Kata Containers patches for virtio-fs v0.3 are under development and will be
> submitted to Kata soon.
> 
> Thanks to the following people for contributing code and to many more
> for helping the virtio-fs effort:
> 
> Dr. David Alan Gilbert 
> Eric Ren 
> Eryu Guan 
> Ganesh Maharaj Mahalingam 
> Jiufei Xue 
> Liu Bo 
> Masayoshi Mizuma 
> Miklos Szeredi 
> Peng Tao 
> piaojun 
> Sebastien Boeuf 
> Stefan Hajnoczi 
> Vivek Goyal 
> Xiaoguang Wang 
> ___
> kata-dev mailing list
> kata-...@lists.katacontainers.io
> http://lists.katacontainers.io/cgi-bin/mailman/listinfo/kata-dev




Re: [Qemu-devel] [qemu-s390x] [PATCH v7 33/42] exec: Replace device_endian with MemOp

2019-08-19 Thread Edgar E. Iglesias
On Mon, 19 Aug. 2019, 23:01 Richard Henderson, 
wrote:

> On 8/19/19 11:29 AM, Paolo Bonzini wrote:
> > On 19/08/19 20:28, Paolo Bonzini wrote:
> >> On 16/08/19 12:12, Thomas Huth wrote:
> >>> This patch is *huge*, more than 800kB. It keeps being stuck in the
> >>> filter of the qemu-s390x list each time you send it. Please:
> >>>
> >>> 1) Try to break it up in more digestible pieces, e.g. change only one
> >>> subsystem at a time (this is also better reviewable by people who are
> >>> interested in one area)
> >>
> >> This is not really possible, since the patch is basically a
> >> search-and-replace.  You could perhaps use some magic
> >> ("DEVICE_MEMOP_ENDIAN" or something like that) to allow a split, but it
> >> would introduce more complication than anything else.
> >
> > I'm stupid, at this point of the series it _would_ be possible to split
> > the patch by subsystem.  Still not sure it would actually be an
> > advantage.
>
> It might be easier to review if we split by symbol, one rename per patch
> over the entire code base.
>
>
> r~
>

Or if we review your script (I assume this wasn't a manual change). I'm not
sure it's realistic to have review on the entire patch or patches.

Best regards,
Edgar

>


Re: [Qemu-devel] [PATCH v2] ppc: conform to processor User's Manual for xscvdpspn

2019-08-19 Thread David Gibson
On Mon, Aug 19, 2019 at 12:43:21PM -0500, Paul A. Clarke wrote:
> From: "Paul A. Clarke" 
> 
> The POWER8 and POWER9 User's Manuals specify the implementation
> behavior for what the ISA leaves as "undefined" behavior for the
> xscvdpspn and xscvdpsp instructions.  This patch corrects the QEMU
> implementation to match the hardware implementation for that case.
> 
> ISA 3.0B has xscvdpspn leaving its result in word 0 of the target register,
> with the other words of the target register left "undefined".
> 
> The User's Manuals specify:
>   VSX scalar convert from double-precision to single-precision (xscvdpsp,
>   xscvdpspn).
>   VSR[32:63] is set to VSR[0:31].
> So, words 0 and 1 both contain the result.
> 
> Note: this is important because GCC as of version 8 or so, assumes and takes
> advantage of this behavior to optimize the following sequence:
>   xscvdpspn vs0,vs1
>   mffprwz   r8,f0
> ISA 3.0B has xscvdpspn leaving its result in word 0 of the target register,
> and mffprwz expecting its input to come from word 1 of the source register.
> This sequence fails with QEMU, as a shift is required between those two
> instructions.  However, since the hardware splats the result to both words 0
> and 1 of its output register, the shift is not necessary.
> 
> Expect a future revision of the ISA to specify this behavior.
> 
> Signed-off-by: Paul A. Clarke 

Applied to ppc-for-4.2, thanks.

> 
> v2
> - Splitting patch "ppc: Three floating point fixes"; this is just one part.
> - Updated commit message to clarify behavior is documented in User's Manuals.
> - Updated commit message to correct which words are in output and source of
>   xscvdpspn and mffprwz.
> - No source changes to this part of the original patch.
> 
> ---
>  target/ppc/fpu_helper.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 5611cf0..23b9c97 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -2871,10 +2871,14 @@ void helper_xscvqpdp(CPUPPCState *env, uint32_t opcode,
>  
>  uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb)
>  {
> +uint64_t result;
> +
>  float_status tstat = env->fp_status;
>  set_float_exception_flags(0, &tstat);
>  
> -return (uint64_t)float64_to_float32(xb, &tstat) << 32;
> +result = (uint64_t)float64_to_float32(xb, &tstat);
> +/* hardware replicates result to both words of the doubleword result.  */
> +return (result << 32) | result;
>  }
>  
>  uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb)
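
The replication step in the patch can be looked at in isolation.  A minimal
sketch, assuming hypothetical helper names (splat_word(), word0() and
word1() are not QEMU functions) and treating the 64-bit value as the
doubleword that holds words 0 and 1:

```c
#include <stdint.h>

/* Hypothetical helper: place a 32-bit result in both words of a doubleword. */
static uint64_t splat_word(uint32_t w)
{
    return ((uint64_t)w << 32) | w;
}

/* Word 0 is the most significant word in the ISA's big-endian bit numbering. */
static uint32_t word0(uint64_t dw)
{
    return (uint32_t)(dw >> 32);
}

static uint32_t word1(uint64_t dw)
{
    return (uint32_t)dw;
}
```

With the result present in both words, a consumer that reads word 1 sees
the same value as one that reads word 0, which is why the GCC sequence
quoted above needs no shift.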

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH] ppc: Fix emulated INFINITY and NAN conversions

2019-08-19 Thread David Gibson
On Mon, Aug 19, 2019 at 01:57:42PM -0700, Richard Henderson wrote:
> On 8/19/19 12:19 PM, Paul A. Clarke wrote:
> > From: "Paul A. Clarke" 
> > 
> > helper_todouble() was not properly converting INFINITY from 32 bit
> > float to 64 bit double.
> > 
> > (Normalized operand conversion is unchanged, other than indentation.)
> > 
> > Signed-off-by: Paul A. Clarke 
> > ---
> >  target/ppc/fpu_helper.c | 15 +++
> >  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> Reviewed-by: Richard Henderson 

Applied to ppc-for-4.2, thanks.

> 
> 
> r~
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v2] ppc: Fix emulated single to double denormalized conversions

2019-08-19 Thread David Gibson
On Mon, Aug 19, 2019 at 04:42:16PM -0500, Paul A. Clarke wrote:
> From: "Paul A. Clarke" 
> 
> helper_todouble() was not properly converting any denormalized 32 bit
> float to 64 bit double.
> 
> Fix-suggested-by: Richard Henderson 
> Signed-off-by: Paul A. Clarke 
> 
> v2:
> - Splitting patch "ppc: Three floating point fixes"; this is just one part.
> - Original suggested "fix" was likely flawed.  v2 is rewritten by
>   Richard Henderson (Thanks, Richard!); I reformatted the comments in a
>   couple of places, compiled, and tested.

Applied to ppc-for-4.2, thanks.

> ---
>  target/ppc/fpu_helper.c | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 52bcda2..07bc905 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -73,11 +73,20 @@ uint64_t helper_todouble(uint32_t arg)
>  /* Zero or Denormalized operand.  */
>  ret = (uint64_t)extract32(arg, 31, 1) << 63;
>  if (unlikely(abs_arg != 0)) {
> -/* Denormalized operand.  */
> -int shift = clz32(abs_arg) - 9;
> -int exp = -126 - shift + 1023;
> +/*
> + * Denormalized operand.
> + * Shift fraction so that the msb is in the implicit bit position.
> + * Thus, shift is in the range [1:23].
> + */
> +int shift = clz32(abs_arg) - 8;
> +/*
> + * The first 3 terms compute the float64 exponent.  We then bias
> + * this result by -1 so that we can swallow the implicit bit below.
> + */
> +int exp = -126 - shift + 1023 - 1;
> +
>  ret |= (uint64_t)exp << 52;
> -ret |= abs_arg << (shift + 29);
> +ret += (uint64_t)abs_arg << (52 - 23 + shift);
>  }
>  }
>  return ret;
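
The arithmetic in the new hunk can be checked standalone.  The sketch below
repeats the same steps for the denormal case only; the function name
denormal_f32_to_f64_bits() and the local clz32() are written for this
illustration (QEMU has its own clz32()), and the input is assumed to have a
zero exponent field and a non-zero fraction.

```c
#include <assert.h>
#include <stdint.h>

/* Portable count-leading-zeros for the sketch; returns 32 for zero. */
static int clz32(uint32_t v)
{
    int n = 0;

    if (v == 0) {
        return 32;
    }
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Convert the bit pattern of a denormal float32 to the float64 bit pattern. */
static uint64_t denormal_f32_to_f64_bits(uint32_t arg)
{
    uint32_t abs_arg = arg & 0x7fffffffu;
    uint64_t ret = (uint64_t)(arg >> 31) << 63;   /* sign bit */
    int shift, exp;

    /* Precondition: exponent field is zero and the fraction is non-zero. */
    assert(abs_arg != 0 && (abs_arg >> 23) == 0);

    /* Move the fraction's msb into the implicit bit position: 1..23. */
    shift = clz32(abs_arg) - 8;
    /* float64 exponent, biased by -1 to absorb the implicit bit below. */
    exp = -126 - shift + 1023 - 1;

    ret |= (uint64_t)exp << 52;
    /* Adding (not OR-ing) lets the implicit bit carry into the exponent. */
    ret += (uint64_t)abs_arg << (52 - 23 + shift);
    return ret;
}
```

For the smallest denormal (bit pattern 0x00000001) this gives shift = 23
and exp = 873; the carry from the implicit bit raises the exponent field to
874, i.e. 2^(874 - 1023) = 2^-149, which is the expected value.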

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] RISCV: when will the CLIC be ready?

2019-08-19 Thread Bin Meng
On Tue, Aug 20, 2019 at 3:09 AM Alistair Francis  wrote:
>
> On Mon, Aug 19, 2019 at 6:44 AM liuzhiwei  wrote:
> >
> >
> > > On 2019/8/17 1:29 AM, Alistair Francis wrote:
> > > On Thu, Aug 15, 2019 at 8:39 PM liuzhiwei  wrote:
> > >> Hi, Palmer
> > >>
> > >> When Michael Clark still was the maintainer of RISCV QEMU, he wrote in
> > >> the mail list, "the CLIC interrupt controller is under testing,
> > >> and will be included in QEMU 3.1 or 3.2". It is a pity that the CLIC is
> > >> not included even in QEMU 4.1.0.
> > > I see that there is a CLIC branch available here:
> > > https://github.com/riscv/riscv-qemu/pull/157
> > >
> > > It looks like all of the work is in a single commit
> > > (https://github.com/riscv/riscv-qemu/pull/157/commits/206d9ac339feb9ef2c325402a00f0f45f453d019)
> > > and that most of the other commits in the PR have already made it into
> > > master.
> > >
> > > Although the CLIC commit is very large it doesn't seem impossible to
> > > manually pull out the CLIC bits and apply it onto master.
> > >
> > > Do you know the state of the CLIC model? If it's working it shouldn't
> > > be too hard to rebase the work and get the code into mainline.
> > >
> > > Alistair
> > >
> > Hi,  Alistair
> >
> > In my opinion, the CLIC code almost works.
> >
> > Last year when my workmate ported an RTOS, I read the CLIC
> > specification and used the CLIC model code. It passed all the
> > tests after I fixed two bugs. I also sent the patch to Michael, but
> > got no response (maybe a wrong email address).
> >
> > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > index 7bf6cbc..95d80ab 100644
> > --- a/target/riscv/cpu_helper.c
> > +++ b/target/riscv/cpu_helper.c
> > @@ -505,6 +505,9 @@ static target_ulong riscv_intr_pc(CPURISCVState *env,
> >   if (!(async || clic)) {
> >   return tvec & ~0b11;
> >   }
> > +if (clic) {
> > +cause &= 0x3ff;
> > +}
> >
> >   /* bits [1:0] encode mode; 0 = direct, 1 = vectored, 2 >= reserved */
> >   switch (mode1) {
> > @@ -645,6 +648,9 @@ void riscv_cpu_do_interrupt(CPUState *cs)
> >   riscv_cpu_set_mode(env, PRV_M);
> >   }
> >
> > +if (clic) {
> > +env->exccode = 0;
> > +}
> >   /* NOTE: it is not necessary to yield load reservations here. It is only
> >  necessary for an SC from "another hart" to cause a load reservation
> >  to be yielded. Refer to the memory consistency model section of the
> >
> > After that, the specification has been updated and the code may have
> > changed. I didn't pull the new code again.
> >
> > If the CLIC model may be merged into the mainline, and nobody maintains the
> > code, I'd like to work on it, fixing the bugs and updating the code
> > according to the latest specification.
>
> Yes please! We will be happy to merge it!
>
> If you would like to, it would be great if you could update the code,
> fix the bugs and then send patches to this list.
>

Is the spec here?
https://github.com/sifive/clic-spec/blob/master/clic.adoc

Which silicon is going to have this CLIC?

Regards,
Bin



[Qemu-devel] [Bug 1819289] Re: Windows 95 and Windows 98 will not install or run

2019-08-19 Thread Brad Parker
Just FYI that was the second bisect I had to do, the first time it
produced an even more unrelated commit, so I assumed I must have done
something wrong... apparently that is still the case. After trying the
"working" commit outside of the Docker container, it now does not
work... so I'm at a loss as to how to reliably bisect I guess. Never had
any issues with other projects doing it though.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1819289

Title:
  Windows 95 and Windows 98 will not install or run

Status in QEMU:
  New

Bug description:
  The last version of QEMU I have been able to run Windows 95 or Windows
  98 on was 2.7 or 2.8. Recent versions since then even up to 3.1 will
  either not install or will not run 95 or 98 at all. I have tried every
  combination of options like isapc or no isapc, cpu pentium  or cpu as
  486. Tried different memory configurations, but they just don't work
  anymore.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1819289/+subscriptions



Re: [Qemu-devel] [Virtio-fs] [ANNOUNCE] virtio-fs v0.3 release

2019-08-19 Thread piaojun
A big step for virtio-fs!

Jun

On 2019/8/20 0:04, Stefan Hajnoczi wrote:
> I am delighted to announce the release of virtio-fs v0.3, a shared file
> system that lets virtual machines access a directory tree on the host.
> This release is based on QEMU 4.1.0 and Linux 5.3-rc3.
> 
> For more information about virtio-fs: https://virtio-fs.gitlab.io/
> 
> This is a development release aimed at early adopters of virtio-fs.  Work is
> being done to upstream the code into Linux and QEMU.  We expect to stop
> publishing virtio-fs releases once the code has been merged by these upstream
> projects.
> 
> Where to get it:
> 
>   https://gitlab.com/virtio-fs/linux/-/tags/virtio-fs-v0.3
>   https://gitlab.com/virtio-fs/qemu/-/tags/virtio-fs-v0.3
> 
> Changes:
> 
>  * Please note that the mount syntax has changed to:
> 
>  # mount -t virtio_fs myfs /mnt -o ...
> 
>The old syntax was "mount -t virtio_fs none /mnt -o tag=myfs,...".
> 
>  * virtiofsd --fd=FDNUM takes a listen socket file descriptor number.  File
>descriptor passing is an alternative way to manage the vhost-user UNIX
>domain socket.  The parent process no longer needs to wait for virtiofsd to
>create the listen socket before spawning the VM.
> 
>  * virtiofsd --syslog logs to syslog(2) instead of stderr.  Useful for unifying
>logging and when the virtiofsd process is not being supervised.
> 
>  * virtiofsd --thread-pool-size=NUM sets the maximum number of worker threads
>for FUSE request processing.  This can be used to control the host queue
>depth.  The default is 64.
> 
>  * Performance improvements and bug fixes.
> 
> Note for Kata Containers: the new kernel is not compatible with existing
> Kata Containers releases due to the mount syntax change.  To try it out,
> please apply the following kata-runtime patch:
> 
>   https://gitlab.com/virtio-fs/runtime/commit/a2e44de817e438c02a495cf258039774527e3178
> 
> Kata Containers patches for virtio-fs v0.3 are under development and will be
> submitted to Kata soon.
> 
> Thanks to the following people for contributing code and to many more
> for helping the virtio-fs effort:
> 
> Dr. David Alan Gilbert 
> Eric Ren 
> Eryu Guan 
> Ganesh Maharaj Mahalingam 
> Jiufei Xue 
> Liu Bo 
> Masayoshi Mizuma 
> Miklos Szeredi 
> Peng Tao 
> piaojun 
> Sebastien Boeuf 
> Stefan Hajnoczi 
> Vivek Goyal 
> Xiaoguang Wang 
> 
> 
> 
> ___
> Virtio-fs mailing list
> virtio...@redhat.com
> https://www.redhat.com/mailman/listinfo/virtio-fs
> 



Re: [Qemu-devel] [PATCH v2] ppc: Fix emulated single to double denormalized conversions

2019-08-19 Thread Aleksandar Markovic
20.08.2019. 00.32, "Paul A. Clarke" wrote:
>
> From: "Paul A. Clarke" 
>
> helper_todouble() was not properly converting any denormalized 32 bit
> float to 64 bit double.
>
> Fix-suggested-by: Richard Henderson 
> Signed-off-by: Paul A. Clarke 
>
> v2:
> - Splitting patch "ppc: Three floating point fixes"; this is just one part.
> - Original suggested "fix" was likely flawed.  v2 is rewritten by
>   Richard Henderson (Thanks, Richard!); I reformatted the comments in a
>   couple of places, compiled, and tested.
> ---

Paul, the fix looks great, it is also good that it is a stand-alone patch
now, and there is a history too, and I just want to bring to your attention
a couple of technicalities to make this patch perfect:

- our standard phrase for fix suggestion is "Suggested-by:" (without
the preceding "Fix-");

- the patch history should be preceded by a line with three dashes ("---");
that way it will not become a part of the permanent commit message once
the patch is applied to the main tree, and we want that, since patch
history plays its role only during the review process.

Looking forward to your sending even more patches!!

Aleksandar

>  target/ppc/fpu_helper.c | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 52bcda2..07bc905 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -73,11 +73,20 @@ uint64_t helper_todouble(uint32_t arg)
>  /* Zero or Denormalized operand.  */
>  ret = (uint64_t)extract32(arg, 31, 1) << 63;
>  if (unlikely(abs_arg != 0)) {
> -/* Denormalized operand.  */
> -int shift = clz32(abs_arg) - 9;
> -int exp = -126 - shift + 1023;
> +/*
> + * Denormalized operand.
> + * Shift fraction so that the msb is in the implicit bit position.
> + * Thus, shift is in the range [1:23].
> + */
> +int shift = clz32(abs_arg) - 8;
> +/*
> + * The first 3 terms compute the float64 exponent.  We then bias
> + * this result by -1 so that we can swallow the implicit bit below.
> + */
> +int exp = -126 - shift + 1023 - 1;
> +
>  ret |= (uint64_t)exp << 52;
> -ret |= abs_arg << (shift + 29);
> +ret += (uint64_t)abs_arg << (52 - 23 + shift);
>  }
>  }
>  return ret;
> --
> 1.8.3.1
>
>


Re: [Qemu-devel] Machine specific option ROMs

2019-08-19 Thread BALATON Zoltan
On Mon, 19 Aug 2019, Gerd Hoffmann wrote:
> On Mon, Aug 19, 2019 at 02:38:09AM +0200, BALATON Zoltan wrote:
>> I know about the possibility to set the option ROM of a PCIDevice with the
>> romfile property (that we can set on command line or in a device's init
>> method) but is there a way to set it depending on the machine that uses the
>> device? If this is not currently possible what would be needed to allow
>> this?
>
> Should work with compat properties.  That is a list of device, property
> and value which a specific machine type should use.  Typically they are
> used to make versioned machine types behave similarly to older qemu
> versions (this is where the name comes from).  Using them to use
> non-default properties on ppc platform should work too.
>
> For example in qemu 1.5 the nic roms got EFI support and there is a
> compat property which switches the pc-i440fx-1.4 (and older) machine
> types to the non-efi versions.  Grep for pxe-e1000.rom to find the code.

OK thanks, looks like something like this works:

diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index c5bbcc7433..8ee937e3ce 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -569,6 +572,10 @@ static int core99_kvm_type(MachineState *machine, const char *arg)
 return 2;
 }

+static GlobalProperty compat[] = {
+{ "VGA", "romfile", NDRV_VGA_FILENAME },
+};
+
 static void core99_machine_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
@@ -587,6 +594,7 @@ static void core99_machine_class_init(ObjectClass *oc, void *data)
 mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("7400_v2.9");
 #endif
 mc->ignore_boot_device_suffixes = true;
+compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
 fwc->get_dev_path = core99_fw_dev_path;
 }



Mark, do you think this could replace the current way of passing this 
driver via fw_cfg and would you accept patches to OpenBIOS to revert the 
ndrv patching to replace that with this solution? (The vga_config_cb 
already adds the driver from the ROM when set as above so no further hacks 
are necessary. If we want we can keep the vga-ndrv? option to control this 
adding NDRV from ROM after the current use of this setting is no longer 
needed.) I think this would allow some simplification and also avoids 
patching ati-vga with this driver without needing to add vga-ndrv?=false 
manually. (In the future this same way can also be used to pass proper 
FCode ROMs to OpenBIOS.)

Regards,
BALATON Zoltan
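
For readers following the thread, the compat-property idea described above
(a per-machine list of device, property and value) can be modelled in a few
lines of standalone C. This is only a toy sketch and not QEMU code: struct
compat_prop, lookup_prop() and the ROM file names used with it are
hypothetical, and QEMU's real GlobalProperty handling is more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one (device, property, value) entry. */
struct compat_prop {
    const char *driver;
    const char *property;
    const char *value;
};

/*
 * Return the machine-specific value for driver.property if the machine's
 * table has one, otherwise fall back to the device's own default.
 */
const char *lookup_prop(const struct compat_prop *props, size_t n,
                        const char *driver, const char *property,
                        const char *device_default)
{
    assert(driver != NULL && property != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i].driver, driver) == 0 &&
            strcmp(props[i].property, property) == 0) {
            return props[i].value;
        }
    }
    return device_default;
}
```

With a one-entry table for "VGA"/"romfile", a lookup for that pair returns
the machine's ROM name, while any other device keeps its default. The point
of the model is that the device code itself does not need to know which
machine it is plugged into.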



[Qemu-devel] [Bug 1840719] Re: win98se floppy fails to boot with isapc machine

2019-08-19 Thread Philippe Mathieu-Daudé
Bisected following note from http://gunkies.org/wiki/I386-softmmu:

the isapc configuration no longer works... So legacy systems must resort
to Qemu 0.9.0 or Qemu 0.10.0

I get:

fd646122418ecefcde228d43821d07da79dd99bb is the first bad commit
commit fd646122418ecefcde228d43821d07da79dd99bb
Author: Anthony Liguori 
Date:   Fri Oct 30 09:06:09 2009 -0500

Switch pc bios from pc-bios to seabios

SeaBIOS is a port of pc-bios to GCC.  Besides using a more modern tool chain,
SeaBIOS introduces a number of new features including PMM support, better
BEV and BCV support, and better PnP support.

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1840719

Title:
  win98se floppy fails to boot with isapc machine

Status in QEMU:
  New

Bug description:
  QEMU emulator version 4.1.50 (commit 50d69ee0d)

  floppy image from:
  https://winworldpc.com/download/417d71c2-ae18-c39a-11c3-a4e284a2c3a5

  $ qemu-system-i386 -M isapc -fda Windows\ 98\ Second\ Edition\ Boot.img
  SeaBIOS (version rel-1.12.1-0...)
  Booting from Floppy...
  Boot failed: could not read the boot disk

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1840719/+subscriptions



[Qemu-devel] [Bug 1840719] [NEW] win98se floppy fails to boot with isapc machine

2019-08-19 Thread Philippe Mathieu-Daudé
Public bug reported:

QEMU emulator version 4.1.50 (commit 50d69ee0d)

floppy image from:
https://winworldpc.com/download/417d71c2-ae18-c39a-11c3-a4e284a2c3a5

$ qemu-system-i386 -M isapc -fda Windows\ 98\ Second\ Edition\ Boot.img
SeaBIOS (version rel-1.12.1-0...)
Booting from Floppy...
Boot failed: could not read the boot disk

** Affects: qemu
 Importance: Undecided
 Status: New


** Tags: bios floppy x86




Re: [Qemu-devel] [PATCH v2 00/68] target/arm: Convert aa32 base isa to decodetree

2019-08-19 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20190819213755.26175-1-richard.hender...@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Subject: [Qemu-devel] [PATCH v2 00/68] target/arm: Convert aa32 base isa to decodetree
Message-id: 20190819213755.26175-1-richard.hender...@linaro.org

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/1566250936-14538-1-git-send-email...@us.ibm.com -> patchew/1566250936-14538-1-git-send-email...@us.ibm.com
 * [new tag] patchew/20190819213755.26175-1-richard.hender...@linaro.org -> patchew/20190819213755.26175-1-richard.hender...@linaro.org
Submodule 'capstone' (https://git.qemu.org/git/capstone.git) registered for path 'capstone'
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (https://git.qemu.org/git/QemuMacDrivers.git) registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (https://git.qemu.org/git/SLOF.git) registered for path 'roms/SLOF'
Submodule 'roms/edk2' (https://git.qemu.org/git/edk2.git) registered for path 'roms/edk2'
Submodule 'roms/ipxe' (https://git.qemu.org/git/ipxe.git) registered for path 'roms/ipxe'
Submodule 'roms/openbios' (https://git.qemu.org/git/openbios.git) registered for path 'roms/openbios'
Submodule 'roms/openhackware' (https://git.qemu.org/git/openhackware.git) registered for path 'roms/openhackware'
Submodule 'roms/opensbi' (https://git.qemu.org/git/opensbi.git) registered for path 'roms/opensbi'
Submodule 'roms/qemu-palcode' (https://git.qemu.org/git/qemu-palcode.git) registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (https://git.qemu.org/git/seabios.git/) registered for path 'roms/seabios'
Submodule 'roms/seabios-hppa' (https://git.qemu.org/git/seabios-hppa.git) registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (https://git.qemu.org/git/sgabios.git) registered for path 'roms/sgabios'
Submodule 'roms/skiboot' (https://git.qemu.org/git/skiboot.git) registered for path 'roms/skiboot'
Submodule 'roms/u-boot' (https://git.qemu.org/git/u-boot.git) registered for path 'roms/u-boot'
Submodule 'roms/u-boot-sam460ex' (https://git.qemu.org/git/u-boot-sam460ex.git) registered for path 'roms/u-boot-sam460ex'
Submodule 'slirp' (https://git.qemu.org/git/libslirp.git) registered for path 'slirp'
Submodule 'tests/fp/berkeley-softfloat-3' (https://git.qemu.org/git/berkeley-softfloat-3.git) registered for path 'tests/fp/berkeley-softfloat-3'
Submodule 'tests/fp/berkeley-testfloat-3' (https://git.qemu.org/git/berkeley-testfloat-3.git) registered for path 'tests/fp/berkeley-testfloat-3'
Submodule 'ui/keycodemapdb' (https://git.qemu.org/git/keycodemapdb.git) registered for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out '22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '88f18909db731a627456f26d779445f84e449536'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out '90c488d5f4a407342247b9ea869df1c2d9c8e266'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 'ba1ab360eebe6338bb8d7d83a9220ccf7e213af3'
Cloning into 'roms/edk2'...
Submodule path 'roms/edk2': checked out '20d2e5a125e34fc8501026613a71549b2a1a3e54'
Submodule 'SoftFloat' (https://github.com/ucb-bar/berkeley-softfloat-3.git) registered for path 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Submodule 'CryptoPkg/Library/OpensslLib/openssl' (https://github.com/openssl/openssl) registered for path 'CryptoPkg/Library/OpensslLib/openssl'
Cloning into 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'...
Submodule path 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3': checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'CryptoPkg/Library/OpensslLib/openssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl': checked out '50eaac9f3337667259de725451f201e784599687'
Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl) registered for path 'boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path 'krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git) registered for path 'pyca-cryptography'
Cloning into 'boringssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl': checked out '2070f8ad9151dc8f3a73bffaa146b5e6937a583f'
Cloning into 'krb5'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5': checked out 'b9ad6c49505c96a088326b62a52568e3484f2168'
Cloning into 

[Qemu-devel] [PATCH v2] ppc: Fix emulated single to double denormalized conversions

2019-08-19 Thread Paul A. Clarke
From: "Paul A. Clarke" 

helper_todouble() was not properly converting any denormalized 32 bit
float to 64 bit double.

Fix-suggested-by: Richard Henderson 
Signed-off-by: Paul A. Clarke 

v2:
- Splitting patch "ppc: Three floating point fixes"; this is just one part.
- Original suggested "fix" was likely flawed.  v2 is rewritten by
  Richard Henderson (Thanks, Richard!); I reformatted the comments in a
  couple of places, compiled, and tested.
---
 target/ppc/fpu_helper.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 52bcda2..07bc905 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -73,11 +73,20 @@ uint64_t helper_todouble(uint32_t arg)
 /* Zero or Denormalized operand.  */
 ret = (uint64_t)extract32(arg, 31, 1) << 63;
 if (unlikely(abs_arg != 0)) {
-/* Denormalized operand.  */
-int shift = clz32(abs_arg) - 9;
-int exp = -126 - shift + 1023;
+/*
+ * Denormalized operand.
+ * Shift fraction so that the msb is in the implicit bit position.
+ * Thus, shift is in the range [1:23].
+ */
+int shift = clz32(abs_arg) - 8;
+/*
+ * The first 3 terms compute the float64 exponent.  We then bias
+ * this result by -1 so that we can swallow the implicit bit below.
+ */
+int exp = -126 - shift + 1023 - 1;
+
 ret |= (uint64_t)exp << 52;
-ret |= abs_arg << (shift + 29);
+ret += (uint64_t)abs_arg << (52 - 23 + shift);
 }
 }
 return ret;
-- 
1.8.3.1
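
The arithmetic in the patched denormal path can be tried outside QEMU. The
following is a minimal standalone sketch that mirrors the new lines of the
hunk above; the function name denorm_f32_to_f64_bits() and the portable
clz32() loop are assumptions made for illustration (QEMU has its own clz32),
and only the zero/denormal branch is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of a 32-bit value (portable stand-in). */
static int clz32(uint32_t x)
{
    int n = 0;

    if (x == 0) {
        return 32;
    }
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Convert the bit pattern of a zero or denormal float32 to float64 bits. */
uint64_t denorm_f32_to_f64_bits(uint32_t arg)
{
    uint32_t abs_arg = arg & 0x7fffffffu;
    uint64_t ret = (uint64_t)(arg >> 31) << 63;     /* copy the sign bit */

    assert(abs_arg < 0x00800000u);  /* exponent field must be zero */
    if (abs_arg != 0) {
        /* Move the fraction's msb to bit 23, the implicit-bit position. */
        int shift = clz32(abs_arg) - 8;
        /* float64 exponent, minus 1 so the implicit bit can carry in. */
        int exp = -126 - shift + 1023 - 1;

        ret |= (uint64_t)exp << 52;
        /* Widen to 64 bits before shifting; the msb adds 1 to the exponent. */
        ret += (uint64_t)abs_arg << (52 - 23 + shift);
    }
    return ret;
}
```

For the smallest denormal, 0x00000001 (2^-149), shift is 23 and exp is 873;
adding the shifted fraction bumps the exponent field to 874, which is
-149 + 1023 as expected. Note the 64-bit cast before the shift: the shift
count can reach 52, which does not fit in a 32-bit operand.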




[Qemu-devel] [PATCH v2 68/68] target/arm: Inline gen_bx_im into callers

2019-08-19 Thread Richard Henderson
There are only two remaining uses of gen_bx_im.  In each case, we
know the destination mode -- not changing in the case of gen_jmp
or changing in the case of trans_BLX_i.  Use this to simplify the
surrounding code.

For trans_BLX_i, use gen_jmp for the actual branch.  For gen_jmp,
use gen_set_pc_im to set up the single-step.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 26 +++---
 1 file changed, 7 insertions(+), 19 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index bac38e6261..9162ad113a 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -765,21 +765,6 @@ static inline void gen_set_pc_im(DisasContext *s, target_ulong val)
 tcg_gen_movi_i32(cpu_R[15], val);
 }
 
-/* Set PC and Thumb state from an immediate address.  */
-static inline void gen_bx_im(DisasContext *s, uint32_t addr)
-{
-TCGv_i32 tmp;
-
-s->base.is_jmp = DISAS_JUMP;
-if (s->thumb != (addr & 1)) {
-tmp = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp, addr & 1);
-tcg_gen_st_i32(tmp, cpu_env, offsetof(CPUARMState, thumb));
-tcg_temp_free_i32(tmp);
-}
-tcg_gen_movi_i32(cpu_R[15], addr & ~1);
-}
-
 /* Set PC and Thumb state from var.  var is marked as dead.  */
 static inline void gen_bx(DisasContext *s, TCGv_i32 var)
 {
@@ -2706,9 +2691,8 @@ static inline void gen_jmp (DisasContext *s, uint32_t dest)
 {
 if (unlikely(is_singlestepping(s))) {
 /* An indirect jump so that we still trigger the debug exception.  */
-if (s->thumb)
-dest |= 1;
-gen_bx_im(s, dest);
+gen_set_pc_im(s, dest);
+s->base.is_jmp = DISAS_JUMP;
 } else {
 gen_goto_tb(s, 0, dest);
 }
@@ -10016,12 +1,16 @@ static bool trans_BL(DisasContext *s, arg_i *a)
 
 static bool trans_BLX_i(DisasContext *s, arg_BLX_i *a)
 {
+TCGv_i32 tmp;
+
 /* For A32, ARCH(5) is checked near the start of the uncond block. */
 if (s->thumb && (a->imm & 2)) {
 return false;
 }
 tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | s->thumb);
-gen_bx_im(s, (read_pc(s) & ~3) + a->imm + !s->thumb);
+tmp = tcg_const_i32(!s->thumb);
+store_cpu_field(tmp, thumb);
+gen_jmp(s, (read_pc(s) & ~3) + a->imm);
 return true;
 }
 
-- 
2.17.1




[Qemu-devel] [PATCH v2 67/68] target/arm: Clean up disas_thumb_insn

2019-08-19 Thread Richard Henderson
Now that everything is converted, remove the rest of
the legacy decode.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 27 ++-
 1 file changed, 2 insertions(+), 25 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index f8997a8424..bac38e6261 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10650,32 +10650,9 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 
 static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 {
-if (disas_t16(s, insn)) {
-return;
+if (!disas_t16(s, insn)) {
+unallocated_encoding(s);
 }
-/* fall back to legacy decoder */
-
-switch (insn >> 12) {
-case 0: case 1: /* add/sub (3reg, 2reg imm), shift imm; in decodetree */
-case 2: case 3: /* add, sub, cmp, mov (reg, imm), in decodetree */
-case 4: /* ldr lit, data proc (2reg), data proc ext, bx; in decodetree */
-case 5: /* load/store register offset, in decodetree */
-case 6: /* load/store word immediate offset, in decodetree */
-case 7: /* load/store byte immediate offset, in decodetree */
-case 8: /* load/store halfword immediate offset, in decodetree */
-case 9: /* load/store from stack, in decodetree */
-case 10: /* add PC/SP (immediate), in decodetree */
-case 11: /* misc, in decodetree */
-case 12: /* load/store multiple, in decodetree */
-case 13: /* conditional branch or swi, in decodetree */
-case 14:
-case 15:
-/* branches, in decodetree */
-goto illegal_op;
-}
-return;
-illegal_op:
-unallocated_encoding(s);
 }
 
 static bool insn_crosses_page(CPUARMState *env, DisasContext *s)
-- 
2.17.1




[Qemu-devel] [PATCH v2 66/68] target/arm: Convert T16, long branches

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 89 +++---
 target/arm/t16.decode  |  3 ++
 2 files changed, 43 insertions(+), 49 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 51b14d409f..f8997a8424 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10025,6 +10025,44 @@ static bool trans_BLX_i(DisasContext *s, arg_BLX_i *a)
 return true;
 }
 
+static bool trans_BL_BLX_prefix(DisasContext *s, arg_BL_BLX_prefix *a)
+{
+/*
+ * thumb_insn_is_16bit() ensures we can't get here for
+ * a Thumb2 CPU, so this must be a thumb1 split BL/BLX.
+ */
+assert(!arm_dc_feature(s, ARM_FEATURE_THUMB2));
+tcg_gen_movi_i32(cpu_R[14], read_pc(s) + (a->imm << 12));
+return true;
+}
+
+static bool trans_BL_suffix(DisasContext *s, arg_BL_suffix *a)
+{
+TCGv_i32 tmp = tcg_temp_new_i32();
+
+assert(!arm_dc_feature(s, ARM_FEATURE_THUMB2));
+tcg_gen_addi_i32(tmp, cpu_R[14], (a->imm << 1) | 1);
+tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | 1);
+gen_bx(s, tmp);
+return true;
+}
+
+static bool trans_BLX_suffix(DisasContext *s, arg_BLX_suffix *a)
+{
+TCGv_i32 tmp;
+
+assert(!arm_dc_feature(s, ARM_FEATURE_THUMB2));
+if (!ENABLE_ARCH_5) {
+return false;
+}
+tmp = tcg_temp_new_i32();
+tcg_gen_addi_i32(tmp, cpu_R[14], a->imm << 1);
+tcg_gen_andi_i32(tmp, tmp, -4);
+tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | 1);
+gen_bx(s, tmp);
+return true;
+}
+
 static bool op_tbranch(DisasContext *s, arg_tbranch *a, bool half)
 {
 TCGv_i32 addr, tmp;
@@ -10612,10 +10650,6 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 
 static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 {
-int32_t offset;
-TCGv_i32 tmp;
-TCGv_i32 tmp2;
-
 if (disas_t16(s, insn)) {
 return;
 }
@@ -10634,53 +10668,10 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 case 11: /* misc, in decodetree */
 case 12: /* load/store multiple, in decodetree */
 case 13: /* conditional branch or swi, in decodetree */
-goto illegal_op;
-
 case 14:
-if (insn & (1 << 11)) {
-/* thumb_insn_is_16bit() ensures we can't get here for
- * a Thumb2 CPU, so this must be a thumb1 split BL/BLX:
- * 0b1110_1xxx_xxxx_xxxx : BLX suffix (or UNDEF)
- */
-assert(!arm_dc_feature(s, ARM_FEATURE_THUMB2));
-ARCH(5);
-offset = ((insn & 0x7ff) << 1);
-tmp = load_reg(s, 14);
-tcg_gen_addi_i32(tmp, tmp, offset);
-tcg_gen_andi_i32(tmp, tmp, 0xfffffffc);
-
-tmp2 = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp2, s->base.pc_next | 1);
-store_reg(s, 14, tmp2);
-gen_bx(s, tmp);
-break;
-}
-/* unconditional branch, in decodetree */
-goto illegal_op;
-
 case 15:
-/* thumb_insn_is_16bit() ensures we can't get here for
- * a Thumb2 CPU, so this must be a thumb1 split BL/BLX.
- */
-assert(!arm_dc_feature(s, ARM_FEATURE_THUMB2));
-
-if (insn & (1 << 11)) {
-/* 0b1111_1xxx_xxxx_xxxx : BL suffix */
-offset = ((insn & 0x7ff) << 1) | 1;
-tmp = load_reg(s, 14);
-tcg_gen_addi_i32(tmp, tmp, offset);
-
-tmp2 = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp2, s->base.pc_next | 1);
-store_reg(s, 14, tmp2);
-gen_bx(s, tmp);
-} else {
-/* 0b1111_0xxx_xxxx_xxxx : BL/BLX prefix */
-uint32_t uoffset = ((int32_t)insn << 21) >> 9;
-
-tcg_gen_movi_i32(cpu_R[14], read_pc(s) + uoffset);
-}
-break;
+/* branches, in decodetree */
+goto illegal_op;
 }
 return;
 illegal_op:
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 35a5b03118..5ee8457efb 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -274,3 +274,6 @@ LDM_t16 1011 110 . \
 %imm11_0x2  0:s11 !function=times_2
 
 B   11100 ...imm=%imm11_0x2
+BLX_suffix  11101 imm:11
+BL_BLX_prefix   0 imm:s11   
+BL_suffix   1 imm:11
-- 
2.17.1
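
The Thumb-1 split BL/BLX handled by the three trans_* functions above can be
checked with plain integer arithmetic. The sketch below is a standalone
model, not QEMU code: the function names are made up for illustration, "pc"
means the value the instruction reads as PC (assumed here to be the
instruction address plus 4), and the returned targets omit the Thumb bit
that trans_BL_suffix ORs in before calling gen_bx.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 11 bits of v. */
static int32_t sext11(uint32_t v)
{
    return (int32_t)((v & 0x7ffu) ^ 0x400u) - 0x400;
}

/* Prefix half (11110 imm:s11): LR = PC + (imm << 12). */
uint32_t bl_prefix_lr(uint32_t pc, uint16_t insn)
{
    assert((insn >> 11) == 0x1e);
    return pc + ((uint32_t)sext11(insn) << 12);
}

/* BL suffix half (11111 imm:11): target = LR + (imm << 1). */
uint32_t bl_suffix_target(uint32_t lr, uint16_t insn)
{
    assert((insn >> 11) == 0x1f);
    return lr + ((insn & 0x7ffu) << 1);
}

/* BLX suffix half (11101 imm:11): as BL, but word-aligned (ARM target). */
uint32_t blx_suffix_target(uint32_t lr, uint16_t insn)
{
    assert((insn >> 11) == 0x1d);
    return (lr + ((insn & 0x7ffu) << 1)) & ~3u;
}
```

As a worked case, a BL at address 0x2000 that branches to itself is encoded
as 0xF7FF, 0xFFFE: the prefix leaves 0x2004 - 0x1000 = 0x1004 in LR and the
suffix adds 0xFFC, giving 0x2000 again.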




[Qemu-devel] [PATCH v2 65/68] target/arm: Convert T16, Unconditional branch

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 9 ++---
 target/arm/t16.decode  | 6 ++
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 1882057402..51b14d409f 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10612,7 +10612,6 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 
 static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 {
-uint32_t val;
 int32_t offset;
 TCGv_i32 tmp;
 TCGv_i32 tmp2;
@@ -10656,12 +10655,8 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 gen_bx(s, tmp);
 break;
 }
-/* unconditional branch */
-val = read_pc(s);
-offset = ((int32_t)insn << 21) >> 21;
-val += offset << 1;
-gen_jmp(s, val);
-break;
+/* unconditional branch, in decodetree */
+goto illegal_op;
 
 case 15:
 /* thumb_insn_is_16bit() ensures we can't get here for
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index f87e6fde50..35a5b03118 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -268,3 +268,9 @@ LDM_t16 1011 110 . \
   SVC   1101  imm:8 
   B_cond_thumb  1101 cond:4  imm=%imm8_0x2
 }
+
+# Unconditional Branch
+
+%imm11_0x2  0:s11 !function=times_2
+
+B   11100 ...imm=%imm11_0x2
-- 
2.17.1




[Qemu-devel] [PATCH v2 58/68] target/arm: Convert T16, nop hints

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c |  3 +--
 target/arm/t16.decode  | 17 +
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 176cba2992..67f0202d29 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10769,8 +10769,7 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 
 case 15: /* IT, nop-hint.  */
 if ((insn & 0xf) == 0) {
-gen_nop_hint(s, (insn >> 4) & 0xf);
-break;
+goto illegal_op; /* nop hint, in decodetree */
 }
 /*
  * IT (If-Then)
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index ec21be7ef0..d5b046d105 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -19,6 +19,7 @@
 # This file is processed by scripts/decodetree.py
 #
 
+   !extern
 _rrr_shi   !extern s rd rn rm shim shty
 _rrr_shr   !extern s rn rd rm rs shty
 _rri_rot   !extern s rn rd imm rot
@@ -204,3 +205,19 @@ SETEND  1011 0110 010 1 E:1 000 
 REV 1011 1010 00 ... ...@rdm
 REV16   1011 1010 01 ... ...@rdm
 REVSH   1011 1010 11 ... ...@rdm
+
+# Hints
+
+{
+  YIELD 1011  0001 
+  WFE   1011  0010 
+  WFI   1011  0011 
+
+  # TODO: Implement SEV, SEVL; may help SMP performance.
+  # SEV 1011  0100 
+  # SEVL1011  0101 
+
+  # The canonical nop has the second nibble as , but the whole of the
+  # rest of the space is a reserved hint, behaves as nop.
+  NOP   1011   
+}
-- 
2.17.1




[Qemu-devel] [PATCH v2 63/68] target/arm: Convert T16, shift immediate

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 26 ++
 target/arm/t16.decode  |  8 
 2 files changed, 10 insertions(+), 24 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index dc670c9724..dc3c9049cd 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10630,7 +10630,7 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 
 static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 {
-uint32_t val, op, rm, rd, shift;
+uint32_t val, rd;
 int32_t offset;
 TCGv_i32 tmp;
 TCGv_i32 tmp2;
@@ -10642,29 +10642,7 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 /* fall back to legacy decoder */
 
 switch (insn >> 12) {
-case 0: case 1:
-
-rd = insn & 7;
-op = (insn >> 11) & 3;
-if (op == 3) {
-/*
- * 0b0001_1xxx_xxxx_xxxx
- *  - Add, subtract (three low registers)
- *  - Add, subtract (two low registers and immediate)
- * In decodetree.
- */
-goto illegal_op;
-} else {
-/* shift immediate */
-rm = (insn >> 3) & 7;
-shift = (insn >> 6) & 0x1f;
-tmp = load_reg(s, rm);
-gen_arm_shift_im(tmp, op, shift, s->condexec_mask == 0);
-if (!s->condexec_mask)
-gen_logic_CC(tmp);
-store_reg(s, rd, tmp);
-}
-break;
+case 0: case 1: /* add/sub (3reg, 2reg imm), shift imm; in decodetree */
 case 2: case 3: /* add, sub, cmp, mov (reg, imm), in decodetree */
 goto illegal_op;
 case 4:
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 4ecbabd364..1adad20804 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -126,6 +126,14 @@ ADD_rri 10101 rd:3  \
 STM 11000 ...   @ldstm
 LDM_t16 11001 ...   @ldstm
 
+# Shift (immediate)
+
+@shift_i. shim:5 rm:3 rd:3  _rrr_shi %s rn=%reg_0
+
+MOV_rxri000 00 . ... ...@shift_i shty=0  # LSL
+MOV_rxri000 01 . ... ...@shift_i shty=1  # LSR
+MOV_rxri000 10 . ... ...@shift_i shty=2  # ASR
+
 # Add/subtract (three low registers)
 
 @addsub_3   ... rm:3 rn:3 rd:3 \
-- 
2.17.1
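
The field layout used by the removed legacy code (and by the shim:5 rm:3
rd:3 format in the decode hunk) can be exercised on its own. This is a
standalone sketch under stated assumptions: struct t16_shift_imm and
t16_decode_shift_imm() are hypothetical names, and the shift-type numbering
0/1/2 for LSL/LSR/ASR follows the shty values in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct t16_shift_imm {
    unsigned shty;  /* 0 = LSL, 1 = LSR, 2 = ASR */
    unsigned shim;  /* 5-bit shift amount field */
    unsigned rm;    /* source register */
    unsigned rd;    /* destination register */
};

/* Decode a 16-bit Thumb "shift (immediate)" encoding: 000 op:2 imm5 rm rd. */
bool t16_decode_shift_imm(uint16_t insn, struct t16_shift_imm *out)
{
    unsigned op = (insn >> 11) & 3;

    assert(out != NULL);
    if ((insn >> 13) != 0 || op == 3) {
        return false;   /* op == 3 is add/subtract, not a shift */
    }
    out->shty = op;
    out->shim = (insn >> 6) & 0x1f;
    out->rm = (insn >> 3) & 7;
    out->rd = insn & 7;
    return true;
}
```

As a usage example, 0x0088 decodes to shty 0, shim 2, rm 1, rd 0 (a left
shift of r1 by 2 into r0), while 0x1800 is rejected because op is 3.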




[Qemu-devel] [PATCH v2 53/68] target/arm: Convert T16 add, compare, move (two high registers)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 49 ++
 target/arm/t16.decode  | 10 +
 2 files changed, 12 insertions(+), 47 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 60bfc943a3..e639059a5a 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10620,55 +10620,10 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 store_reg(s, rd, tmp);
 break;
 }
-if (insn & (1 << 10)) {
-/* 0b0100_01xx_xxxx_xxxx
- * - data processing extended, branch and exchange
- */
-rd = (insn & 7) | ((insn >> 4) & 8);
-rm = (insn >> 3) & 0xf;
-op = (insn >> 8) & 3;
-switch (op) {
-case 0: /* add */
-tmp = load_reg(s, rd);
-tmp2 = load_reg(s, rm);
-tcg_gen_add_i32(tmp, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
-if (rd == 13) {
-/* ADD SP, SP, reg */
-store_sp_checked(s, tmp);
-} else {
-store_reg(s, rd, tmp);
-}
-break;
-case 1: /* cmp */
-tmp = load_reg(s, rd);
-tmp2 = load_reg(s, rm);
-gen_sub_CC(tmp, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
-tcg_temp_free_i32(tmp);
-break;
-case 2: /* mov/cpy */
-tmp = load_reg(s, rm);
-if (rd == 13) {
-/* MOV SP, reg */
-store_sp_checked(s, tmp);
-} else {
-store_reg(s, rd, tmp);
-}
-break;
-case 3:
-/* 0b0100_0111_xxxx_xxxx
- * - branch [and link] exchange thumb register
- * In decodetree
- */
-goto illegal_op;
-}
-break;
-}
 
 /*
- * 0b0100_00xx_xxxx_xxxx
- *  - Data-processing (two low registers), in decodetree
+ * - Data-processing (two low registers), in decodetree
+ * - data processing extended, branch and exchange, in decodetree
  */
 goto illegal_op;
 
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index edddbfb9b8..5a570484e3 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -146,6 +146,16 @@ CMP_xri 00101 ...   @arith_1i s=1
 ADD_rri 00110 ...   @arith_1i %s
 SUB_rri 00111 ...   @arith_1i %s
 
+# Add, compare, move (two high registers)
+
+%reg_0_77:1 0:3
+@addsub_2h    . rm:4 ... \
+_rrr_shi rd=%reg_0_7 rn=%reg_0_7 shim=0 shty=0
+
+ADD_rrri0100 0100 .  ...@addsub_2h s=0
+CMP_xrri0100 0101 .  ...@addsub_2h s=1
+MOV_rxri0100 0110 .  ...@addsub_2h s=0
+
 # Branch and exchange
 
 @branchr  . rm:4 ...
-- 
2.17.1
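
The split destination-register field in the removed legacy code (bit 7
supplies the high bit of rd, bits [2:0] the low three) is easy to get wrong
by eye, so here is a standalone sketch of the same extraction. The names
struct t16_hireg and t16_decode_hireg() are hypothetical, and the sketch
rejects op 3, which the patch series decodes separately as branch and
exchange.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct t16_hireg {
    unsigned op;    /* 0 = add, 1 = cmp, 2 = mov */
    unsigned rd;    /* 4-bit destination: bit 7 joined with bits [2:0] */
    unsigned rm;    /* 4-bit source: bits [6:3] */
};

/* Decode 0100 01 op:2 D rm:4 rd:3 (add, compare, move on high registers). */
bool t16_decode_hireg(uint16_t insn, struct t16_hireg *out)
{
    unsigned op = (insn >> 8) & 3;

    assert(out != NULL);
    if ((insn >> 10) != 0x11 || op == 3) {
        return false;   /* op == 3 is BX/BLX, handled elsewhere */
    }
    out->op = op;
    out->rd = (insn & 7) | ((insn >> 4) & 8);
    out->rm = (insn >> 3) & 0xf;
    return true;
}
```

For instance, 0x4688 decodes as op 2 with rd 8 and rm 1 (a move from r1 to
r8), and 0x4449 as op 0 with rd 1 and rm 9.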




[Qemu-devel] [PATCH v2 62/68] target/arm: Convert T16, Miscellaneous 16-bit instructions

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 109 -
 target/arm/t16.decode  |  31 
 2 files changed, 54 insertions(+), 86 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 941266df14..dc670c9724 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10074,6 +10074,18 @@ static bool trans_TBH(DisasContext *s, arg_tbranch *a)
 return op_tbranch(s, a, true);
 }
 
+static bool trans_CBZ(DisasContext *s, arg_CBZ *a)
+{
+TCGv_i32 tmp = load_reg(s, a->rn);
+
+arm_gen_condlabel(s);
+tcg_gen_brcondi_i32(a->nz ? TCG_COND_EQ : TCG_COND_NE,
+tmp, 0, s->condlabel);
+tcg_temp_free_i32(tmp);
+gen_jmp(s, read_pc(s) + a->imm);
+return true;
+}
+
 /*
  * Supervisor call
  */
@@ -10295,6 +10307,25 @@ static bool trans_PLI(DisasContext *s, arg_PLD *a)
 return ENABLE_ARCH_7;
 }
 
+/*
+ * If-then
+ */
+
+static bool trans_IT(DisasContext *s, arg_IT *a)
+{
+/*
+ * No actual code generated for this insn, just setup state.
+ *
+ * Combinations of firstcond and mask which set up an 0b1111
+ * condition are UNPREDICTABLE; we take the CONSTRAINED
+ * UNPREDICTABLE choice to treat 0b1111 the same as 0b1110,
+ * i.e. both meaning "execute always".
+ */
+s->condexec_cond = a->cond;
+s->condexec_mask = a->imm;
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -10661,83 +10692,8 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 case 8: /* load/store halfword immediate offset, in decodetree */
 case 9: /* load/store from stack, in decodetree */
 case 10: /* add PC/SP (immediate), in decodetree */
+case 11: /* misc, in decodetree */
 case 12: /* load/store multiple, in decodetree */
-goto illegal_op;
-
-case 11:
-/* misc */
-op = (insn >> 8) & 0xf;
-switch (op) {
-case 0: /* add/sub (sp, immediate), in decodetree */
-case 2: /* sign/zero extend, in decodetree */
-goto illegal_op;
-
-case 4: case 5: case 0xc: case 0xd:
-/* push/pop, in decodetree */
-goto illegal_op;
-
-case 1: case 3: case 9: case 11: /* czb */
-rm = insn & 7;
-tmp = load_reg(s, rm);
-arm_gen_condlabel(s);
-if (insn & (1 << 11))
-tcg_gen_brcondi_i32(TCG_COND_EQ, tmp, 0, s->condlabel);
-else
-tcg_gen_brcondi_i32(TCG_COND_NE, tmp, 0, s->condlabel);
-tcg_temp_free_i32(tmp);
-offset = ((insn & 0xf8) >> 2) | (insn & 0x200) >> 3;
-gen_jmp(s, read_pc(s) + offset);
-break;
-
-case 15: /* IT, nop-hint.  */
-if ((insn & 0xf) == 0) {
-goto illegal_op; /* nop hint, in decodetree */
-}
-/*
- * IT (If-Then)
- *
- * Combinations of firstcond and mask which set up an 0b1111
- * condition are UNPREDICTABLE; we take the CONSTRAINED
- * UNPREDICTABLE choice to treat 0b1111 the same as 0b1110,
- * i.e. both meaning "execute always".
- */
-s->condexec_cond = (insn >> 4) & 0xe;
-s->condexec_mask = insn & 0x1f;
-/* No actual code generated for this insn, just setup state.  */
-break;
-
-case 0xe: /* bkpt */
-{
-int imm8 = extract32(insn, 0, 8);
-ARCH(5);
-gen_exception_bkpt_insn(s, syn_aa32_bkpt(imm8, true));
-break;
-}
-
-case 0xa: /* rev, and hlt */
-{
-int op1 = extract32(insn, 6, 2);
-
-if (op1 == 2) {
-/* HLT */
-int imm6 = extract32(insn, 0, 6);
-
-gen_hlt(s, imm6);
-break;
-}
-
-/* Otherwise this is rev, in decodetree */
-goto illegal_op;
-}
-
-case 6: /* setend, cps; in decodetree */
-goto illegal_op;
-
-default:
-goto undef;
-}
-break;
-
 case 13: /* conditional branch or swi, in decodetree */
 goto illegal_op;
 
@@ -10793,7 +10749,6 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 }
 return;
 illegal_op:
-undef:
 unallocated_encoding(s);
 }
 
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 98d60952a1..4ecbabd364 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -210,20 +210,33 @@ REVSH   1011 1010 11 ... ...@rdm
 
 # Hints
 
+%it_cond5:3 !function=times_2
+
 {
-  YIELD 1011  0001 
-  WFE   1011  0010 
-  WFI   1011  0011 
+  {
+YIELD   1011  0001 
+WFE 1011  0010 
+WFI 1011  0011 
 
-  # TODO: Implement SEV, SEVL; may help SMP 

[Qemu-devel] [PATCH v2 52/68] target/arm: Convert T16 branch and exchange

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 64 +++---
 target/arm/t16.decode  | 10 +++
 2 files changed, 33 insertions(+), 41 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 3a3b113822..60bfc943a3 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8335,7 +8335,7 @@ static bool trans_BX(DisasContext *s, arg_BX *a)
 if (!ENABLE_ARCH_4T) {
 return false;
 }
-gen_bx(s, load_reg(s, a->rm));
+gen_bx_excret(s, load_reg(s, a->rm));
 return true;
 }
 
@@ -8362,6 +8362,26 @@ static bool trans_BLX_r(DisasContext *s, arg_BLX_r *a)
 return true;
 }
 
+static bool trans_BXNS(DisasContext *s, arg_BXNS *a)
+{
+if (!s->v8m_secure || IS_USER_ONLY) {
+unallocated_encoding(s);
+} else {
+gen_bxns(s, a->rm);
+}
+return true;
+}
+
+static bool trans_BLXNS(DisasContext *s, arg_BLXNS *a)
+{
+if (!s->v8m_secure || IS_USER_ONLY) {
+unallocated_encoding(s);
+} else {
+gen_blxns(s, a->rm);
+}
+return true;
+}
+
 static bool trans_CLZ(DisasContext *s, arg_CLZ *a)
 {
 TCGv_i32 tmp;
@@ -10637,49 +10657,11 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 }
 break;
 case 3:
-{
 /* 0b0100_0111_xxxx_xxxx
  * - branch [and link] exchange thumb register
+ * In decodetree
  */
-bool link = insn & (1 << 7);
-
-if (insn & 3) {
-goto undef;
-}
-if (link) {
-ARCH(5);
-}
-if ((insn & 4)) {
-/* BXNS/BLXNS: only exists for v8M with the
- * security extensions, and always UNDEF if NonSecure.
- * We don't implement these in the user-only mode
- * either (in theory you can use them from Secure User
- * mode but they are too tied in to system emulation.)
- */
-if (!s->v8m_secure || IS_USER_ONLY) {
-goto undef;
-}
-if (link) {
-gen_blxns(s, rm);
-} else {
-gen_bxns(s, rm);
-}
-break;
-}
-/* BLX/BX */
-tmp = load_reg(s, rm);
-if (link) {
-val = (uint32_t)s->base.pc_next | 1;
-tmp2 = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp2, val);
-store_reg(s, 14, tmp2);
-gen_bx(s, tmp);
-} else {
-/* Only BX works as exception-return, not BLX */
-gen_bx_excret(s, tmp);
-}
-break;
-}
+goto illegal_op;
 }
 break;
 }
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 0654275e68..edddbfb9b8 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -24,6 +24,7 @@
 _rri_rot   !extern s rn rd imm rot
 _  !extern s rd rn rm ra
   !extern rd imm
+   !extern rm
 _rr !extern p w u rn rt rm shimm shtype
 _ri !extern p w u rn rt imm
 _block  !extern rn i b u w list
@@ -144,3 +145,12 @@ MOV_rxi 00100 ...   @arith_1i %s
 CMP_xri 00101 ...   @arith_1i s=1
 ADD_rri 00110 ...   @arith_1i %s
 SUB_rri 00111 ...   @arith_1i %s
+
+# Branch and exchange
+
+@branchr  . rm:4 ...
+
+BX  0100 0111 0  000@branchr
+BLX_r   0100 0111 1  000@branchr
+BXNS0100 0111 0  100@branchr
+BLXNS   0100 0111 1  100@branchr
-- 
2.17.1




[Qemu-devel] [PATCH v2 61/68] target/arm: Convert T16, Conditional branches, Supervisor call

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 26 +++---
 target/arm/t16.decode  | 12 
 2 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 5f876290ba..941266df14 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10599,7 +10599,7 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 
 static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 {
-uint32_t val, op, rm, rd, shift, cond;
+uint32_t val, op, rm, rd, shift;
 int32_t offset;
 TCGv_i32 tmp;
 TCGv_i32 tmp2;
@@ -10738,28 +10738,8 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 }
 break;
 
-case 13:
-/* conditional branch or swi */
-cond = (insn >> 8) & 0xf;
-if (cond == 0xe)
-goto undef;
-
-if (cond == 0xf) {
-/* swi */
-gen_set_pc_im(s, s->base.pc_next);
-s->svc_imm = extract32(insn, 0, 8);
-s->base.is_jmp = DISAS_SWI;
-break;
-}
-/* generate a conditional jump to next instruction */
-arm_skip_unless(s, cond);
-
-/* jump to the offset */
-val = read_pc(s);
-offset = ((int32_t)insn << 24) >> 24;
-val += offset << 1;
-gen_jmp(s, val);
-break;
+case 13: /* conditional branch or swi, in decodetree */
+goto illegal_op;
 
 case 14:
 if (insn & (1 << 11)) {
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index d731402036..98d60952a1 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -28,11 +28,13 @@
   !extern rd rm
   !extern rd imm
!extern rm
+   !extern imm
 _rr !extern p w u rn rt rm shimm shtype
 _ri !extern p w u rn rt imm
 _block  !extern rn i b u w list
   !extern E
  !extern mode imod M A I F
+  !extern cond imm
 
 # Set S if the instruction is outside of an IT block.
 %s   !function=t16_setflags
@@ -231,3 +233,13 @@ STM 1011 010 . \
 _block i=0 b=1 u=0 w=1 rn=13 list=%push_list
 LDM_t16 1011 110 . \
 _block i=1 b=0 u=0 w=1 rn=13 list=%pop_list
+
+# Conditional branches, Supervisor call
+
+%imm8_0x2   0:s8 !function=times_2
+
+{
+  UDF   1101 1110  
+  SVC   1101  imm:8 
+  B_cond_thumb  1101 cond:4  imm=%imm8_0x2
+}
-- 
2.17.1
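
For readers following the conversion: the `%imm8_0x2 0:s8 !function=times_2` field replaces the open-coded sign extension and `offset << 1` in the removed case 13. A minimal standalone sketch of that arithmetic follows. It is not part of the patch; `t16_bcond_offset` and `t16_bcond_target` are hypothetical names, and the "PC reads as instruction address plus 4" rule is the usual Thumb convention rather than something taken from this diff.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sign-extend the low 8 bits of a T16 instruction and scale by 2,
 * mirroring a signed 8-bit field passed through times_2.
 */
int32_t t16_bcond_offset(uint16_t insn)
{
    /* Portable sign extension of an 8-bit field. */
    int32_t imm8 = (int32_t)((insn & 0xff) ^ 0x80) - 0x80;
    return imm8 * 2;
}

/*
 * Hypothetical helper: branch target of a T16 B<cond> located at
 * insn_addr.  Assumes Thumb state, where PC reads as insn_addr + 4.
 */
uint32_t t16_bcond_target(uint32_t insn_addr, uint16_t insn)
{
    assert((insn >> 12) == 0xd);
    /* cond 0xe is UDF and cond 0xf is SVC; neither is a branch. */
    assert(((insn >> 8) & 0xf) < 0xe);
    return insn_addr + 4 + (uint32_t)t16_bcond_offset(insn);
}
```

With this sketch, the encoding 0xd0fe (BEQ with imm8 = 0xfe) at address 0x1000 resolves back to 0x1000, i.e. a branch to itself.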




[Qemu-devel] [PATCH v2 51/68] target/arm: Convert T16 one low register and immediate

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 44 ++
 target/arm/t16.decode  | 11 +++
 2 files changed, 13 insertions(+), 42 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 6f30415371..3a3b113822 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10586,48 +10586,8 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 store_reg(s, rd, tmp);
 }
 break;
-case 2: case 3:
-/*
- * 0b001x___
- *  - Add, subtract, compare, move (one low register and immediate)
- */
-op = (insn >> 11) & 3;
-rd = (insn >> 8) & 0x7;
-if (op == 0) { /* mov */
-tmp = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp, insn & 0xff);
-if (!s->condexec_mask)
-gen_logic_CC(tmp);
-store_reg(s, rd, tmp);
-} else {
-tmp = load_reg(s, rd);
-tmp2 = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp2, insn & 0xff);
-switch (op) {
-case 1: /* cmp */
-gen_sub_CC(tmp, tmp, tmp2);
-tcg_temp_free_i32(tmp);
-tcg_temp_free_i32(tmp2);
-break;
-case 2: /* add */
-if (s->condexec_mask)
-tcg_gen_add_i32(tmp, tmp, tmp2);
-else
-gen_add_CC(tmp, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
-store_reg(s, rd, tmp);
-break;
-case 3: /* sub */
-if (s->condexec_mask)
-tcg_gen_sub_i32(tmp, tmp, tmp2);
-else
-gen_sub_CC(tmp, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
-store_reg(s, rd, tmp);
-break;
-}
-}
-break;
+case 2: case 3: /* add, sub, cmp, mov (reg, imm), in decodetree */
+goto illegal_op;
 case 4:
 if (insn & (1 << 11)) {
 rd = (insn >> 8) & 7;
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 2b5f368d31..0654275e68 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -133,3 +133,14 @@ SUB_rrri0001101 ... ... ... @addsub_3
 
 ADD_rri 0001 110 ... ... ...@addsub_2i
 SUB_rri 0001 111 ... ... ...@addsub_2i
+
+# Add, subtract, compare, move (one low register and immediate)
+
+%reg_8  8:3
+@arith_1i   . rd:3 imm:8 \
+_rri_rot rot=0 rn=%reg_8
+
+MOV_rxi 00100 ...   @arith_1i %s
+CMP_xri 00101 ...   @arith_1i s=1
+ADD_rri 00110 ...   @arith_1i %s
+SUB_rri 00111 ...   @arith_1i %s
-- 
2.17.1
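
As a reading aid, the field layout that the removed case 2/3 code extracted by hand (`op` in bits 12:11, `rd` in bits 10:8, `imm` in bits 7:0) can be sketched standalone as below. This is only an illustration under those assumptions; `struct t16_arith1i` and `t16_decode_arith1i` are hypothetical names and do not exist in QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Operation selector, in the order used by the removed switch. */
enum t16_arith1i_op { T16_MOV = 0, T16_CMP = 1, T16_ADD = 2, T16_SUB = 3 };

/* Hypothetical container for the decoded fields. */
struct t16_arith1i {
    enum t16_arith1i_op op;
    unsigned rd;   /* low register, 0..7 */
    unsigned imm;  /* unsigned 8-bit immediate */
};

struct t16_arith1i t16_decode_arith1i(uint16_t insn)
{
    struct t16_arith1i a;

    /* Only encodings of the form 0b001x_xxxx_xxxx_xxxx belong here. */
    assert((insn >> 13) == 1);

    a.op = (enum t16_arith1i_op)((insn >> 11) & 3);
    a.rd = (insn >> 8) & 7;
    a.imm = insn & 0xff;
    return a;
}
```

For example, 0x2105 decodes as a move of 5 into r1, and 0x3aff as a subtract of 255 from r2.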




[Qemu-devel] [PATCH v2 57/68] target/arm: Convert T16, Reverse bytes

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 18 +++---
 target/arm/t16.decode  |  9 +
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 368f0ab147..176cba2992 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10608,7 +10608,7 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 
 static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 {
-uint32_t val, op, rm, rn, rd, shift, cond;
+uint32_t val, op, rm, rd, shift, cond;
 int32_t offset;
 int i;
 TCGv_i32 tmp;
@@ -10805,20 +10805,8 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 break;
 }
 
-/* Otherwise this is rev */
-ARCH(6);
-rn = (insn >> 3) & 0x7;
-rd = insn & 0x7;
-tmp = load_reg(s, rn);
-switch (op1) {
-case 0: tcg_gen_bswap32_i32(tmp, tmp); break;
-case 1: gen_rev16(tmp, tmp); break;
-case 3: gen_revsh(tmp, tmp); break;
-default:
-g_assert_not_reached();
-}
-store_reg(s, rd, tmp);
-break;
+/* Otherwise this is rev, in decodetree */
+goto illegal_op;
 }
 
 case 6: /* setend, cps; in decodetree */
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 3bf1a31731..ec21be7ef0 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -24,6 +24,7 @@
 _rri_rot   !extern s rn rd imm rot
 _  !extern s rd rn rm ra
 _rot !extern rd rn rm rot
+  !extern rd rm
   !extern rd imm
!extern rm
 _rr !extern p w u rn rt rm shimm shtype
@@ -195,3 +196,11 @@ SETEND  1011 0110 010 1 E:1 000 
   CPS_v6m   1011 0110 011 im:1 00 I:1 F:1
   CPS   1011 0110 011 . 0 A:1 I:1 F:1mode=0 M=0 %imod
 }
+
+# Reverse bytes
+
+@rdm  .. rm:3 rd:3  
+
+REV 1011 1010 00 ... ...@rdm
+REV16   1011 1010 01 ... ...@rdm
+REVSH   1011 1010 11 ... ...@rdm
-- 
2.17.1
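
For reference, the architectural effect of the three instructions converted here can be sketched in plain C as below. This models only the value transformation, not the TCG code that QEMU emits; the function names `arm_rev`, `arm_rev16` and `arm_revsh` are made up for this illustration.

```c
#include <stdint.h>

/* REV: reverse all four bytes of a 32-bit word. */
uint32_t arm_rev(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00)
         | ((x << 8) & 0x00ff0000) | (x << 24);
}

/* REV16: swap the two bytes within each halfword independently. */
uint32_t arm_rev16(uint32_t x)
{
    return ((x >> 8) & 0x00ff00ff) | ((x << 8) & 0xff00ff00);
}

/* REVSH: swap the bytes of the low halfword, then sign-extend to 32 bits. */
uint32_t arm_revsh(uint32_t x)
{
    uint32_t h = ((x >> 8) & 0xff) | ((x & 0xff) << 8);
    return (h ^ 0x8000) - 0x8000;
}
```

Note that the removed switch had no case 2; only op1 values 0, 1 and 3 reach the rev code, matching the three patterns added to t16.decode.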




[Qemu-devel] [PATCH v2 50/68] target/arm: Convert T16 add/sub (3 low, 2 low and imm)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 26 ++
 target/arm/t16.decode  | 16 
 2 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index d417958b23..6f30415371 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10572,31 +10572,9 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
  * 0b0001_1xxx__
  *  - Add, subtract (three low registers)
  *  - Add, subtract (two low registers and immediate)
+ * In decodetree.
  */
-rn = (insn >> 3) & 7;
-tmp = load_reg(s, rn);
-if (insn & (1 << 10)) {
-/* immediate */
-tmp2 = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp2, (insn >> 6) & 7);
-} else {
-/* reg */
-rm = (insn >> 6) & 7;
-tmp2 = load_reg(s, rm);
-}
-if (insn & (1 << 9)) {
-if (s->condexec_mask)
-tcg_gen_sub_i32(tmp, tmp, tmp2);
-else
-gen_sub_CC(tmp, tmp, tmp2);
-} else {
-if (s->condexec_mask)
-tcg_gen_add_i32(tmp, tmp, tmp2);
-else
-gen_add_CC(tmp, tmp, tmp2);
-}
-tcg_temp_free_i32(tmp2);
-store_reg(s, rd, tmp);
+goto illegal_op;
 } else {
 /* shift immediate */
 rm = (insn >> 3) & 7;
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index a7a437f930..2b5f368d31 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -117,3 +117,19 @@ ADD_rri 10101 rd:3  \
 
 STM 11000 ...   @ldstm
 LDM_t16 11001 ...   @ldstm
+
+# Add/subtract (three low registers)
+
+@addsub_3   ... rm:3 rn:3 rd:3 \
+_rrr_shi %s shim=0 shty=0
+
+ADD_rrri0001100 ... ... ... @addsub_3
+SUB_rrri0001101 ... ... ... @addsub_3
+
+# Add/subtract (two low registers and immediate)
+
+@addsub_2i  ... imm:3 rn:3 rd:3 \
+_rri_rot %s rot=0
+
+ADD_rri 0001 110 ... ... ...@addsub_2i
+SUB_rri 0001 111 ... ... ...@addsub_2i
-- 
2.17.1




[Qemu-devel] [PATCH v2 47/68] target/arm: Convert T16 load/store (immediate offset)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 94 +++---
 target/arm/t16.decode  | 33 +++
 2 files changed, 38 insertions(+), 89 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index e19961fb6c..24537fc107 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10744,97 +10744,13 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
  */
 goto illegal_op;
 
-case 5:
-/* load/store register offset, in decodetree */
+case 5: /* load/store register offset, in decodetree */
+case 6: /* load/store word immediate offset, in decodetree */
+case 7: /* load/store byte immediate offset, in decodetree */
+case 8: /* load/store halfword immediate offset, in decodetree */
+case 9: /* load/store from stack, in decodetree */
 goto illegal_op;
 
-case 6:
-/* load/store word immediate offset */
-rd = insn & 7;
-rn = (insn >> 3) & 7;
-addr = load_reg(s, rn);
-val = (insn >> 4) & 0x7c;
-tcg_gen_addi_i32(addr, addr, val);
-
-if (insn & (1 << 11)) {
-/* load */
-tmp = tcg_temp_new_i32();
-gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
-store_reg(s, rd, tmp);
-} else {
-/* store */
-tmp = load_reg(s, rd);
-gen_aa32_st32(s, tmp, addr, get_mem_index(s));
-tcg_temp_free_i32(tmp);
-}
-tcg_temp_free_i32(addr);
-break;
-
-case 7:
-/* load/store byte immediate offset */
-rd = insn & 7;
-rn = (insn >> 3) & 7;
-addr = load_reg(s, rn);
-val = (insn >> 6) & 0x1f;
-tcg_gen_addi_i32(addr, addr, val);
-
-if (insn & (1 << 11)) {
-/* load */
-tmp = tcg_temp_new_i32();
-gen_aa32_ld8u_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-store_reg(s, rd, tmp);
-} else {
-/* store */
-tmp = load_reg(s, rd);
-gen_aa32_st8_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-tcg_temp_free_i32(tmp);
-}
-tcg_temp_free_i32(addr);
-break;
-
-case 8:
-/* load/store halfword immediate offset */
-rd = insn & 7;
-rn = (insn >> 3) & 7;
-addr = load_reg(s, rn);
-val = (insn >> 5) & 0x3e;
-tcg_gen_addi_i32(addr, addr, val);
-
-if (insn & (1 << 11)) {
-/* load */
-tmp = tcg_temp_new_i32();
-gen_aa32_ld16u_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-store_reg(s, rd, tmp);
-} else {
-/* store */
-tmp = load_reg(s, rd);
-gen_aa32_st16_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-tcg_temp_free_i32(tmp);
-}
-tcg_temp_free_i32(addr);
-break;
-
-case 9:
-/* load/store from stack */
-rd = (insn >> 8) & 7;
-addr = load_reg(s, 13);
-val = (insn & 0xff) * 4;
-tcg_gen_addi_i32(addr, addr, val);
-
-if (insn & (1 << 11)) {
-/* load */
-tmp = tcg_temp_new_i32();
-gen_aa32_ld32u_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-store_reg(s, rd, tmp);
-} else {
-/* store */
-tmp = load_reg(s, rd);
-gen_aa32_st32_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-tcg_temp_free_i32(tmp);
-}
-tcg_temp_free_i32(addr);
-break;
-
 case 10:
 /*
  * 0b1010___
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 83fe4363c7..1cf79789ac 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -24,6 +24,7 @@
 _rri_rot   !extern s rn rd imm rot
 _  !extern s rd rn rm ra
 _rr !extern p w u rn rt rm shimm shtype
+_ri !extern p w u rn rt imm
 
 # Set S if the instruction is outside of an IT block.
 %s   !function=t16_setflags
@@ -69,3 +70,35 @@ LDR_rr   0101 100 ... ... ...   @ldst_rr
 LDRH_rr  0101 101 ... ... ...   @ldst_rr
 LDRB_rr  0101 110 ... ... ...   @ldst_rr
 LDRSH_rr 0101 111 ... ... ...   @ldst_rr
+
+# Load/store word/byte (immediate offset)
+
+%imm5_6x4   6:5 !function=times_4
+
+@ldst_ri_1  . imm:5 rn:3 rt:3 \
+_ri p=1 w=0 u=1
+@ldst_ri_4  . . rn:3 rt:3 \
+_ri p=1 w=0 u=1 imm=%imm5_6x4
+
+STR_ri  01100 . ... ... @ldst_ri_4
+LDR_ri  01101 . ... ... @ldst_ri_4
+STRB_ri 01110 . ... ... @ldst_ri_1
+LDRB_ri 0 . ... ... @ldst_ri_1
+
+# Load/store halfword (immediate offset)
+
+%imm5_6x2   6:5 !function=times_2
+@ldst_ri_2  . 

Re: [Qemu-devel] patch to swap SIGRTMIN + 1 and SIGRTMAX - 1

2019-08-19 Thread Josh Kunz via Qemu-devel
Hi all,

I have also experienced issues with SIGRTMIN + 1, and am interested in
moving this patch forward. Is there anything I can do here to help? Would
the maintainers prefer that Marli or I re-submit the patch?

The Go issue here seems particularly sticky. Even if we update the Go
runtime, users may try to run older binaries built with older versions of
Go for quite some time (months? years?). Would it be better to hide this
behind some kind of build-time flag (`--enable-sigrtmin-plus-one-proxy` or
something), so that some users can opt in, but older binaries still work as
expected?

Also, here is a link to the original thread this message is in reply to,
in case my mail client doesn't set up the reply properly:
https://lists.nongnu.org/archive/html/qemu-devel/2019-07/msg01303.html

Thanks,
Josh Kunz


[Qemu-devel] [PATCH v2 48/68] target/arm: Convert T16 add pc/sp (immediate)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 12 +---
 target/arm/t16.decode  |  7 +++
 2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 24537fc107..2640f50fcf 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10749,19 +10749,9 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 case 7: /* load/store byte immediate offset, in decodetree */
 case 8: /* load/store halfword immediate offset, in decodetree */
 case 9: /* load/store from stack, in decodetree */
+case 10: /* add PC/SP (immediate), in decodetree */
 goto illegal_op;
 
-case 10:
-/*
- * 0b1010___
- *  - Add PC/SP (immediate)
- */
-rd = (insn >> 8) & 7;
-val = (insn & 0xff) * 4;
-tmp = add_reg_for_lit(s, insn & (1 << 11) ? 13 : 15, val);
-store_reg(s, rd, tmp);
-break;
-
 case 11:
 /* misc */
 op = (insn >> 8) & 0xf;
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 1cf79789ac..71b3e8f02e 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -23,6 +23,7 @@
 _rrr_shr   !extern s rn rd rm rs shty
 _rri_rot   !extern s rn rd imm rot
 _  !extern s rd rn rm ra
+  !extern rd imm
 _rr !extern p w u rn rt rm shimm shtype
 _ri !extern p w u rn rt imm
 
@@ -102,3 +103,9 @@ LDRH_ri 10001 . ... ... @ldst_ri_2
 
 STR_ri  10010 ...   @ldst_spec_i rn=13
 LDR_ri  10011 ...   @ldst_spec_i rn=13
+
+# Add PC/SP (immediate)
+
+ADR 10100 rd:3  imm=%imm8_0x4
+ADD_rri 10101 rd:3  \
+_rri_rot rn=13 s=0 rot=0 imm=%imm8_0x4  # SP
-- 
2.17.1
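
The removed case 10 computed `(insn & 0xff) * 4` and added it to either SP (bit 11 set) or the word-aligned PC (bit 11 clear). A minimal standalone sketch of the resulting value is given below. It assumes the usual Thumb rule that PC reads as the instruction address plus 4 and is aligned down to 4 for literal addressing; `t16_add_pcsp_imm` is a hypothetical name, not a QEMU function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Value written to rd by a T16 "Add PC/SP (immediate)" instruction.
 * insn_addr is the address of the instruction, sp the current SP.
 */
uint32_t t16_add_pcsp_imm(uint16_t insn, uint32_t insn_addr, uint32_t sp)
{
    uint32_t imm = (uint32_t)(insn & 0xff) * 4;

    /* Only encodings of the form 0b1010_xxxx_xxxx_xxxx belong here. */
    assert((insn >> 12) == 0xa);

    if (insn & (1 << 11)) {
        return sp + imm;                    /* ADD rd, SP, #imm */
    }
    return ((insn_addr + 4) & ~3u) + imm;   /* ADR: Align(PC, 4) + imm */
}
```

The alignment step matters for an ADR placed at a halfword-aligned but not word-aligned address, for example 0x1002, where PC reads as 0x1006 and the base used is 0x1004.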




[Qemu-devel] [PATCH v2 42/68] target/arm: Simplify disas_thumb2_insn

2019-08-19 Thread Richard Henderson
Fold away all of the cases that now just goto illegal_op,
because all of their internal bits are now in decodetree.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 79 ++
 1 file changed, 3 insertions(+), 76 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index d1078ca1ec..25c74206c2 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10433,9 +10433,6 @@ static bool thumb_insn_is_16bit(DisasContext *s, uint32_t pc, uint32_t insn)
 /* Translate a 32-bit thumb instruction. */
 static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 {
-uint32_t rn;
-int op;
-
 /*
  * ARMv6-M supports a limited subset of Thumb2 instructions.
  * Other Thumb1 architectures allow only 32-bit
@@ -10476,34 +10473,10 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 }
 /* fall back to legacy decoder */
 
-rn = (insn >> 16) & 0xf;
 switch ((insn >> 25) & 0xf) {
 case 0: case 1: case 2: case 3:
 /* 16-bit instructions.  Should never happen.  */
 abort();
-case 4:
-/* All in decodetree */
-goto illegal_op;
-case 5:
-/* All in decodetree */
-goto illegal_op;
-case 13: /* Misc data processing.  */
-op = ((insn >> 22) & 6) | ((insn >> 7) & 1);
-if (op < 4 && (insn & 0xf000) != 0xf000)
-goto illegal_op;
-switch (op) {
-case 0: /* Register controlled shift, in decodetree */
-case 1: /* Sign/zero extend, in decodetree */
-case 2: /* SIMD add/subtract, in decodetree */
-case 3: /* Other data processing, in decodetree */
-goto illegal_op;
-case 4: case 5:
-/* 32-bit multiply.  Sum of absolute differences, in decodetree */
-goto illegal_op;
-case 6: case 7: /* 64-bit multiply, Divide, in decodetree */
-goto illegal_op;
-}
-break;
 case 6: case 7: case 14: case 15:
 /* Coprocessor.  */
 if (arm_dc_feature(s, ARM_FEATURE_M)) {
@@ -10532,6 +10505,7 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 }
 
 if (arm_dc_feature(s, ARM_FEATURE_VFP)) {
+uint32_t rn = (insn >> 16) & 0xf;
 TCGv_i32 fptr = load_reg(s, rn);
 
 if (extract32(insn, 20, 1)) {
@@ -10590,50 +10564,6 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 }
 }
 break;
-case 8: case 9: case 10: case 11:
-if (insn & (1 << 15)) {
-/* Branches, misc control.  */
-if (insn & 0x5000) {
-/* Unconditional branch, in decodetree */
-goto illegal_op;
-} else if (((insn >> 23) & 7) == 7) {
-/* Misc control */
-if (insn & (1 << 13))
-goto illegal_op;
-
-if (insn & (1 << 26)) {
-/* hvc, smc, in decodetree */
-goto illegal_op;
-} else {
-op = (insn >> 20) & 7;
-switch (op) {
-case 0: /* msr cpsr, in decodetree  */
-case 1: /* msr spsr, in decodetree  */
-goto illegal_op;
-case 2: /* cps, nop-hint, in decodetree */
-goto illegal_op;
-case 3: /* Special control operations, in decodetree */
-case 4: /* bxj, in decodetree */
-goto illegal_op;
-case 5: /* Exception return.  */
-case 6: /* MRS, in decodetree */
-case 7: /* MSR, in decodetree */
-goto illegal_op;
-}
-}
-} else {
-/* Conditional branch, in decodetree */
-goto illegal_op;
-}
-} else {
-/*
- * 0b_0xxx__0xxx__
- *  - Data-processing (modified immediate, plain binary immediate)
- * All in decodetree.
- */
-goto illegal_op;
-}
-break;
 case 12:
 if ((insn & 0x0110) == 0x0100) {
 if (disas_neon_ls_insn(s, insn)) {
@@ -10641,14 +10571,11 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 }
 break;
 }
-/* Load/store single data item, in decodetree */
 goto illegal_op;
 default:
-goto illegal_op;
+illegal_op:
+unallocated_encoding(s);
 }
-return;
-illegal_op:
-unallocated_encoding(s);
 }
 
 static void disas_thumb_insn(DisasContext *s, uint32_t insn)
-- 
2.17.1




[Qemu-devel] [PATCH v2 60/68] target/arm: Convert T16, push and pop

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 83 ++
 target/arm/t16.decode  | 10 +
 2 files changed, 22 insertions(+), 71 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 9e0345adf7..5f876290ba 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7494,6 +7494,16 @@ static int t16_setflags(DisasContext *s)
 return s->condexec_mask == 0;
 }
 
+static int t16_push_list(DisasContext *s, int x)
+{
+return (x & 0xff) | (x & 0x100) << (14 - 8);
+}
+
+static int t16_pop_list(DisasContext *s, int x)
+{
+return (x & 0xff) | (x & 0x100) << (15 - 8);
+}
+
 /*
  * Include the generated decoders.
  */
@@ -10591,7 +10601,6 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 {
 uint32_t val, op, rm, rd, shift, cond;
 int32_t offset;
-int i;
 TCGv_i32 tmp;
 TCGv_i32 tmp2;
 TCGv_i32 addr;
@@ -10664,76 +10673,8 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 goto illegal_op;
 
 case 4: case 5: case 0xc: case 0xd:
-/*
- * 0b1011_x10x__
- *  - push/pop
- */
-addr = load_reg(s, 13);
-if (insn & (1 << 8))
-offset = 4;
-else
-offset = 0;
-for (i = 0; i < 8; i++) {
-if (insn & (1 << i))
-offset += 4;
-}
-if ((insn & (1 << 11)) == 0) {
-tcg_gen_addi_i32(addr, addr, -offset);
-}
-
-if (s->v8m_stackcheck) {
-/*
- * Here 'addr' is the lower of "old SP" and "new SP";
- * if this is a pop that starts below the limit and ends
- * above it, it is UNKNOWN whether the limit check triggers;
- * we choose to trigger.
- */
-gen_helper_v8m_stackcheck(cpu_env, addr);
-}
-
-for (i = 0; i < 8; i++) {
-if (insn & (1 << i)) {
-if (insn & (1 << 11)) {
-/* pop */
-tmp = tcg_temp_new_i32();
-gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
-store_reg(s, i, tmp);
-} else {
-/* push */
-tmp = load_reg(s, i);
-gen_aa32_st32(s, tmp, addr, get_mem_index(s));
-tcg_temp_free_i32(tmp);
-}
-/* advance to the next address.  */
-tcg_gen_addi_i32(addr, addr, 4);
-}
-}
-tmp = NULL;
-if (insn & (1 << 8)) {
-if (insn & (1 << 11)) {
-/* pop pc */
-tmp = tcg_temp_new_i32();
-gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
-/* don't set the pc until the rest of the instruction
-   has completed */
-} else {
-/* push lr */
-tmp = load_reg(s, 14);
-gen_aa32_st32(s, tmp, addr, get_mem_index(s));
-tcg_temp_free_i32(tmp);
-}
-tcg_gen_addi_i32(addr, addr, 4);
-}
-if ((insn & (1 << 11)) == 0) {
-tcg_gen_addi_i32(addr, addr, -offset);
-}
-/* write back the new stack pointer */
-store_reg(s, 13, addr);
-/* set the new PC value */
-if ((insn & 0x0900) == 0x0900) {
-store_reg_from_load(s, 15, tmp);
-}
-break;
+/* push/pop, in decodetree */
+goto illegal_op;
 
 case 1: case 3: case 9: case 11: /* czb */
 rm = insn & 7;
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index d5b046d105..d731402036 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -221,3 +221,13 @@ REVSH   1011 1010 11 ... ...@rdm
   # rest of the space is a reserved hint, behaves as nop.
   NOP   1011   
 }
+
+# Push and Pop
+
+%push_list  0:9 !function=t16_push_list
+%pop_list   0:9 !function=t16_pop_list
+
+STM 1011 010 . \
+_block i=0 b=1 u=0 w=1 rn=13 list=%push_list
+LDM_t16 1011 110 . \
+_block i=1 b=0 u=0 w=1 rn=13 list=%pop_list
-- 
2.17.1
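
To make the register-list remapping in this patch easier to follow: the 9-bit T16 field holds r0-r7 in bits 7:0 plus one extra bit (bit 8) that means LR for push and PC for pop. The two helpers move that bit to position 14 or 15 so the shared STM/LDM code sees an ordinary 16-bit list. The sketch below reproduces the two expressions from the patch without the `DisasContext *` argument (dropped here so the example stands alone) and adds a hypothetical `reglist_bytes` helper showing the stack adjustment the removed loop computed.

```c
#include <assert.h>

/* Bit 8 of the T16 PUSH field selects LR (r14). */
int t16_push_list(int x)
{
    assert(x >= 0 && x < 0x200);    /* 9-bit field */
    return (x & 0xff) | ((x & 0x100) << (14 - 8));
}

/* Bit 8 of the T16 POP field selects PC (r15). */
int t16_pop_list(int x)
{
    assert(x >= 0 && x < 0x200);    /* 9-bit field */
    return (x & 0xff) | ((x & 0x100) << (15 - 8));
}

/* Hypothetical helper: bytes transferred for a 16-bit register list. */
int reglist_bytes(int list)
{
    int n = 0;
    for (int i = 0; i < 16; i++) {
        if (list & (1 << i)) {
            n += 4;     /* one 32-bit word per listed register */
        }
    }
    return n;
}
```

For a field value of 0x1ff the push list becomes 0x40ff (r0-r7 plus LR) and the pop list 0x80ff (r0-r7 plus PC); both transfer 36 bytes, matching the `offset` the old code accumulated.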




[Qemu-devel] [PATCH v2 64/68] target/arm: Convert T16, load (literal)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 42 ++
 target/arm/t16.decode  |  4 
 2 files changed, 6 insertions(+), 40 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index dc3c9049cd..1882057402 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -963,14 +963,6 @@ static inline void gen_aa32_ld##SUFF(DisasContext *s, TCGv_i32 val,  \
  TCGv_i32 a32, int index)\
 {\
 gen_aa32_ld_i32(s, val, a32, index, OPC | s->be_data);   \
-}\
-static inline void gen_aa32_ld##SUFF##_iss(DisasContext *s,  \
-   TCGv_i32 val, \
-   TCGv_i32 a32, int index,  \
-   ISSInfo issinfo)  \
-{\
-gen_aa32_ld##SUFF(s, val, a32, index);   \
-disas_set_da_iss(s, OPC, issinfo);   \
 }
 
 #define DO_GEN_ST(SUFF, OPC) \
@@ -978,14 +970,6 @@ static inline void gen_aa32_st##SUFF(DisasContext *s, TCGv_i32 val,  \
  TCGv_i32 a32, int index)\
 {\
 gen_aa32_st_i32(s, val, a32, index, OPC | s->be_data);   \
-}\
-static inline void gen_aa32_st##SUFF##_iss(DisasContext *s,  \
-   TCGv_i32 val, \
-   TCGv_i32 a32, int index,  \
-   ISSInfo issinfo)  \
-{\
-gen_aa32_st##SUFF(s, val, a32, index);   \
-disas_set_da_iss(s, OPC, issinfo | ISSIsWrite);  \
 }
 
 static inline void gen_aa32_frob64(DisasContext *s, TCGv_i64 val)
@@ -1034,9 +1018,7 @@ static inline void gen_aa32_st64(DisasContext *s, TCGv_i64 val,
 gen_aa32_st_i64(s, val, a32, index, MO_Q | s->be_data);
 }
 
-DO_GEN_LD(8s, MO_SB)
 DO_GEN_LD(8u, MO_UB)
-DO_GEN_LD(16s, MO_SW)
 DO_GEN_LD(16u, MO_UW)
 DO_GEN_LD(32u, MO_UL)
 DO_GEN_ST(8, MO_UB)
@@ -10630,11 +10612,10 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 
 static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 {
-uint32_t val, rd;
+uint32_t val;
 int32_t offset;
 TCGv_i32 tmp;
 TCGv_i32 tmp2;
-TCGv_i32 addr;
 
 if (disas_t16(s, insn)) {
 return;
@@ -10644,26 +10625,7 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 switch (insn >> 12) {
 case 0: case 1: /* add/sub (3reg, 2reg imm), shift imm; in decodetree */
 case 2: case 3: /* add, sub, cmp, mov (reg, imm), in decodetree */
-goto illegal_op;
-case 4:
-if (insn & (1 << 11)) {
-rd = (insn >> 8) & 7;
-/* load pc-relative.  Bit 1 of PC is ignored.  */
-addr = add_reg_for_lit(s, 15, (insn & 0xff) * 4);
-tmp = tcg_temp_new_i32();
-gen_aa32_ld32u_iss(s, tmp, addr, get_mem_index(s),
-   rd | ISSIs16Bit);
-tcg_temp_free_i32(addr);
-store_reg(s, rd, tmp);
-break;
-}
-
-/*
- * - Data-processing (two low registers), in decodetree
- * - data processing extended, branch and exchange, in decodetree
- */
-goto illegal_op;
-
+case 4: /* ldr lit, data proc (2reg), data proc ext, bx; in decodetree */
 case 5: /* load/store register offset, in decodetree */
 case 6: /* load/store word immediate offset, in decodetree */
 case 7: /* load/store byte immediate offset, in decodetree */
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 1adad20804..f87e6fde50 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -113,6 +113,10 @@ LDRH_ri 10001 . ... ... @ldst_ri_2
 STR_ri  10010 ...   @ldst_spec_i rn=13
 LDR_ri  10011 ...   @ldst_spec_i rn=13
 
+# Load (PC-relative)
+
+LDR_ri  01001 ...   @ldst_spec_i rn=15
+
 # Add PC/SP (immediate)
 
 ADR 10100 rd:3  imm=%imm8_0x4
-- 
2.17.1




[Qemu-devel] [PATCH v2 43/68] target/arm: Simplify disas_arm_insn

2019-08-19 Thread Richard Henderson
Fold away all of the cases that now just goto illegal_op,
because all of their internal bits are now in decodetree.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 69 ++
 1 file changed, 16 insertions(+), 53 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 25c74206c2..49bab7d863 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10245,7 +10245,7 @@ static bool trans_PLI(DisasContext *s, arg_PLD *a)
 
 static void disas_arm_insn(DisasContext *s, unsigned int insn)
 {
-unsigned int cond, op1;
+unsigned int cond = insn >> 28;
 
 /* M variants do not implement ARM mode; this must raise the INVSTATE
  * UsageFault exception.
@@ -10255,7 +10255,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
default_exception_el(s));
 return;
 }
-cond = insn >> 28;
 
 if (cond == 0xf) {
 /* In ARMv3 and v4 the NV condition is UNPREDICTABLE; we
@@ -10320,11 +10319,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 goto illegal_op;
 }
 return;
-} else if ((insn & 0x0fe0) == 0x0c40) {
-/* Coprocessor double register transfer.  */
-ARCH(5TE);
-} else if ((insn & 0x0f10) == 0x0e10) {
-/* Additional coprocessor register transfer.  */
 }
 goto illegal_op;
 }
@@ -10339,55 +10333,24 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 }
 /* fall back to legacy decoder */
 
-if ((insn & 0x0f90) == 0x0300) {
-/* All done in decodetree.  Illegal ops reach here.  */
-goto illegal_op;
-} else if ((insn & 0x0f90) == 0x0100
-   && (insn & 0x0090) != 0x0090) {
-/* miscellaneous instructions */
-/* All done in decodetree.  Illegal ops reach here.  */
-goto illegal_op;
-} else if (((insn & 0x0e00) == 0 &&
-(insn & 0x0090) != 0x90) ||
-   ((insn & 0x0e00) == (1 << 25))) {
-/* Data-processing (reg, reg-shift-reg, imm).  */
-/* All done in decodetree.  Reach here for illegal ops.  */
-goto illegal_op;
-} else {
-/* other instructions */
-op1 = (insn >> 24) & 0xf;
-switch(op1) {
-case 0x0:
-case 0x1:
-case 0x4:
-case 0x5:
-case 0x6:
-case 0x7:
-case 0x08:
-case 0x09:
-case 0xa:
-case 0xb:
-case 0xf:
-/* All done in decodetree.  Reach here for illegal ops.  */
-goto illegal_op;
-case 0xc:
-case 0xd:
-case 0xe:
-if (((insn >> 8) & 0xe) == 10) {
-/* VFP.  */
-if (disas_vfp_insn(s, insn)) {
-goto illegal_op;
-}
-} else if (disas_coproc_insn(s, insn)) {
-/* Coprocessor.  */
+switch ((insn >> 24) & 0xf) {
+case 0xc:
+case 0xd:
+case 0xe:
+if (((insn >> 8) & 0xe) == 10) {
+/* VFP.  */
+if (disas_vfp_insn(s, insn)) {
 goto illegal_op;
 }
-break;
-default:
-illegal_op:
-unallocated_encoding(s);
-break;
+} else if (disas_coproc_insn(s, insn)) {
+/* Coprocessor.  */
+goto illegal_op;
 }
+break;
+default:
+illegal_op:
+unallocated_encoding(s);
+break;
 }
 }
 
-- 
2.17.1




[Qemu-devel] [PATCH v2 38/68] target/arm: Convert Unallocated memory hint

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c   | 8 
 target/arm/a32-uncond.decode | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index a30a9bb4e0..9ec6b25c03 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10216,14 +10216,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 }
 return;
 }
-if (((insn & 0x0f70) == 0x0410) ||
-((insn & 0x0f700010) == 0x0610)) {
-if (!arm_dc_feature(s, ARM_FEATURE_V7MP)) {
-goto illegal_op;
-}
-return; /* v7MP: Unallocated memory hint: must NOP */
-}
-
 if ((insn & 0x0e000f00) == 0x0c000100) {
 if (arm_dc_feature(s, ARM_FEATURE_IWMMXT)) {
 /* iWMMXt register transfer.  */
diff --git a/target/arm/a32-uncond.decode b/target/arm/a32-uncond.decode
index aed381cb8e..afa95bf7aa 100644
--- a/target/arm/a32-uncond.decode
+++ b/target/arm/a32-uncond.decode
@@ -64,3 +64,11 @@ PLI   0100 -101     # (imm, lit) 7
 PLD   0111 -101   - -- 0    # (register) 5te
 PLDW  0111 -001   - -- 0    # (register) 7mp
 PLI   0110 -101   - -- 0    # (register) 7
+
+# Unallocated memory hints
+#
+# Since these are v7MP nops, and PLDW is v7MP and implemented as nop,
+# (ab)use the PLDW helper.
+
+PLDW  0100 -001     
+PLDW  0110 -001    ---0 
-- 
2.17.1




[Qemu-devel] [PATCH v2 59/68] target/arm: Split gen_nop_hint

2019-08-19 Thread Richard Henderson
Now that all callers pass a constant value, split the switch
statement into the individual trans_* functions.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 67 +++---
 1 file changed, 24 insertions(+), 43 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 67f0202d29..9e0345adf7 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3045,46 +3045,6 @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc)
 gen_rfe(s, pc, load_cpu_field(spsr));
 }
 
-/*
- * For WFI we will halt the vCPU until an IRQ. For WFE and YIELD we
- * only call the helper when running single threaded TCG code to ensure
- * the next round-robin scheduled vCPU gets a crack. In MTTCG mode we
- * just skip this instruction. Currently the SEV/SEVL instructions
- * which are *one* of many ways to wake the CPU from WFE are not
- * implemented so we can't sleep like WFI does.
- */
-static void gen_nop_hint(DisasContext *s, int val)
-{
-switch (val) {
-/* When running in MTTCG we don't generate jumps to the yield and
- * WFE helpers as it won't affect the scheduling of other vCPUs.
- * If we wanted to more completely model WFE/SEV so we don't busy
- * spin unnecessarily we would need to do something more involved.
- */
-case 1: /* yield */
-if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) {
-gen_set_pc_im(s, s->base.pc_next);
-s->base.is_jmp = DISAS_YIELD;
-}
-break;
-case 3: /* wfi */
-gen_set_pc_im(s, s->base.pc_next);
-s->base.is_jmp = DISAS_WFI;
-break;
-case 2: /* wfe */
-if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) {
-gen_set_pc_im(s, s->base.pc_next);
-s->base.is_jmp = DISAS_WFE;
-}
-break;
-case 4: /* sev */
-case 5: /* sevl */
-/* TODO: Implement SEV, SEVL and WFE.  May help SMP performance.  */
-default: /* nop */
-break;
-}
-}
-
 #define CPU_V001 cpu_V0, cpu_V0, cpu_V1
 
 static inline void gen_neon_add(int size, TCGv_i32 t0, TCGv_i32 t1)
@@ -8165,19 +8125,40 @@ DO_SMLAWX(SMLAWT, 1, 1)
 
 static bool trans_YIELD(DisasContext *s, arg_YIELD *a)
 {
-gen_nop_hint(s, 1);
+/*
+ * When running single-threaded TCG code, use the helper to ensure that
+ * the next round-robin scheduled vCPU gets a crack.  When running in
+ * MTTCG we don't generate jumps to the helper as it won't affect the
+ * scheduling of other vCPUs.
+ */
+if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) {
+gen_set_pc_im(s, s->base.pc_next);
+s->base.is_jmp = DISAS_YIELD;
+}
 return true;
 }
 
 static bool trans_WFE(DisasContext *s, arg_WFE *a)
 {
-gen_nop_hint(s, 2);
+/*
+ * When running single-threaded TCG code, use the helper to ensure that
+ * the next round-robin scheduled vCPU gets a crack.  In MTTCG mode we
+ * just skip this instruction.  Currently the SEV/SEVL instructions,
+ * which are *one* of many ways to wake the CPU from WFE, are not
+ * implemented so we can't sleep like WFI does.
+ */
+if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) {
+gen_set_pc_im(s, s->base.pc_next);
+s->base.is_jmp = DISAS_WFE;
+}
 return true;
 }
 
 static bool trans_WFI(DisasContext *s, arg_WFI *a)
 {
-gen_nop_hint(s, 3);
+/* For WFI, halt the vCPU until an IRQ. */
+gen_set_pc_im(s, s->base.pc_next);
+s->base.is_jmp = DISAS_WFI;
 return true;
 }
 
-- 
2.17.1
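
The behaviour described in the comments of this patch can be summarised in a small standalone sketch. It is not QEMU code: the enum names below are hypothetical stand-ins for the DISAS_* values, and the `parallel` argument models the CF_PARALLEL test from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for QEMU's DISAS_* values; not the real definitions. */
enum next_action { ACT_NEXT, ACT_YIELD, ACT_WFE, ACT_WFI };
enum hint { HINT_YIELD, HINT_WFE, HINT_WFI };

/*
 * Decide how translation continues after a hint instruction.
 * 'parallel' models the CF_PARALLEL flag tested in the patch (MTTCG).
 */
enum next_action hint_action(enum hint h, bool parallel)
{
    switch (h) {
    case HINT_YIELD:
        /* Only useful with round-robin scheduling; otherwise a nop. */
        return parallel ? ACT_NEXT : ACT_YIELD;
    case HINT_WFE:
        /* Same rule as YIELD: skipped when running in parallel. */
        return parallel ? ACT_NEXT : ACT_WFE;
    case HINT_WFI:
        /* Always halts the vCPU until an interrupt. */
        return ACT_WFI;
    }
    assert(!"unknown hint");
    return ACT_NEXT;
}
```

For example, hint_action(HINT_WFE, true) yields ACT_NEXT, while hint_action(HINT_WFI, true) still yields ACT_WFI.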




[Qemu-devel] [PATCH v2 56/68] target/arm: Convert T16, Change processor state

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 85 --
 target/arm/t16.decode  | 12 ++
 2 files changed, 52 insertions(+), 45 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 414c562fb3..368f0ab147 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7474,6 +7474,11 @@ static int negate(DisasContext *s, int x)
 return -x;
 }
 
+static int plus_2(DisasContext *s, int x)
+{
+return x + 2;
+}
+
 static int times_2(DisasContext *s, int x)
 {
 return x * 2;
@@ -10152,6 +10157,9 @@ static bool trans_CPS(DisasContext *s, arg_CPS *a)
 {
 uint32_t mask, val;
 
+if (ENABLE_ARCH_6 && arm_dc_feature(s, ARM_FEATURE_M)) {
+return false;
+}
 if (IS_USER(s)) {
 /* Implemented as NOP in user mode.  */
 return true;
@@ -10182,6 +10190,36 @@ static bool trans_CPS(DisasContext *s, arg_CPS *a)
 return true;
 }
 
+static bool trans_CPS_v6m(DisasContext *s, arg_CPS_v6m *a)
+{
+TCGv_i32 tmp, addr;
+
+if (!(ENABLE_ARCH_6 && arm_dc_feature(s, ARM_FEATURE_M))) {
+return false;
+}
+if (IS_USER(s)) {
+/* Implemented as NOP in user mode.  */
+return true;
+}
+
+tmp = tcg_const_i32(a->im);
+/* FAULTMASK */
+if (a->F) {
+addr = tcg_const_i32(19);
+gen_helper_v7m_msr(cpu_env, addr, tmp);
+tcg_temp_free_i32(addr);
+}
+/* PRIMASK */
+if (a->I) {
+addr = tcg_const_i32(16);
+gen_helper_v7m_msr(cpu_env, addr, tmp);
+tcg_temp_free_i32(addr);
+}
+tcg_temp_free_i32(tmp);
+gen_lookup_tb(s);
+return true;
+}
+
 /*
  * Clear-Exclusive, Barriers
  */
@@ -10783,51 +10821,8 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 break;
 }
 
-case 6:
-switch ((insn >> 5) & 7) {
-case 2:
-/* setend */
-ARCH(6);
-if (((insn >> 3) & 1) != !!(s->be_data == MO_BE)) {
-gen_helper_setend(cpu_env);
-s->base.is_jmp = DISAS_UPDATE;
-}
-break;
-case 3:
-/* cps */
-ARCH(6);
-if (IS_USER(s)) {
-break;
-}
-if (arm_dc_feature(s, ARM_FEATURE_M)) {
-tmp = tcg_const_i32((insn & (1 << 4)) != 0);
-/* FAULTMASK */
-if (insn & 1) {
-addr = tcg_const_i32(19);
-gen_helper_v7m_msr(cpu_env, addr, tmp);
-tcg_temp_free_i32(addr);
-}
-/* PRIMASK */
-if (insn & 2) {
-addr = tcg_const_i32(16);
-gen_helper_v7m_msr(cpu_env, addr, tmp);
-tcg_temp_free_i32(addr);
-}
-tcg_temp_free_i32(tmp);
-gen_lookup_tb(s);
-} else {
-if (insn & (1 << 4)) {
-shift = CPSR_A | CPSR_I | CPSR_F;
-} else {
-shift = 0;
-}
-gen_set_psr_im(s, ((insn & 7) << 6), 0, shift);
-}
-break;
-default:
-goto undef;
-}
-break;
+case 6: /* setend, cps; in decodetree */
+goto illegal_op;
 
 default:
 goto undef;
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index b5b5086e8a..3bf1a31731 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -29,6 +29,8 @@
 _rr !extern p w u rn rt rm shimm shtype
 _ri !extern p w u rn rt imm
 _block  !extern rn i b u w list
+  !extern E
+ !extern mode imod M A I F
 
 # Set S if the instruction is outside of an IT block.
 %s   !function=t16_setflags
@@ -183,3 +185,13 @@ SXTAH   1011 0010 00 ... ...@extend
 SXTAB   1011 0010 01 ... ...@extend
 UXTAH   1011 0010 10 ... ...@extend
 UXTAB   1011 0010 11 ... ...@extend
+
+# Change processor state
+
+%imod   4:1 !function=plus_2
+
+SETEND  1011 0110 010 1 E:1 000 
+{
+  CPS_v6m   1011 0110 011 im:1 00 I:1 F:1
+  CPS   1011 0110 011 . 0 A:1 I:1 F:1mode=0 M=0 %imod
+}
-- 
2.17.1
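
For the non-M-profile path, the `%imod 4:1 !function=plus_2` field can be checked against the removed legacy code with a standalone sketch. The function names are hypothetical, the CPSR_* values are the architectural bit positions, and the comparison assumes that only bits set in the mask are actually written (which is how the mask argument of gen_set_psr_im is understood here).

```c
#include <assert.h>
#include <stdint.h>

/* CPSR interrupt-mask bits (architectural positions). */
#define CPSR_F (1u << 6)
#define CPSR_I (1u << 7)
#define CPSR_A (1u << 8)

/* New path: fields as the t16.decode pattern extracts them. */
void cps_decodetree(uint16_t insn, uint32_t *mask, uint32_t *val)
{
    unsigned imod = ((insn >> 4) & 1) + 2;  /* %imod 4:1 via plus_2 */
    unsigned a = (insn >> 2) & 1;
    unsigned i = (insn >> 1) & 1;
    unsigned f = insn & 1;

    assert(mask && val);
    *mask = *val = 0;
    if (imod & 2) {                 /* always true for T16: imod is 2 or 3 */
        if (a) *mask |= CPSR_A;
        if (i) *mask |= CPSR_I;
        if (f) *mask |= CPSR_F;
        if (imod & 1) *val |= *mask;
    }
}

/* Old path: the arithmetic from the removed disas_thumb_insn() case. */
void cps_legacy(uint16_t insn, uint32_t *mask, uint32_t *val)
{
    assert(mask && val);
    *mask = (insn & 7u) << 6;
    *val = (insn & (1u << 4)) ? (CPSR_A | CPSR_I | CPSR_F) : 0;
}

/* Returns 1 if both paths agree for every im/A/I/F combination. */
int cps_paths_agree(void)
{
    for (unsigned low = 0; low < 32; low++) {
        uint32_t m1, v1, m2, v2;
        if (low & 8) {
            continue;               /* bit 3 must be 0 in the CPS pattern */
        }
        cps_decodetree((uint16_t)(0xb660 | low), &m1, &v1);
        cps_legacy((uint16_t)(0xb660 | low), &m2, &v2);
        if (m1 != m2 || v1 != (v2 & m2)) {
            return 0;
        }
    }
    return 1;
}
```

With these definitions, 0xb672 (CPSID i) gives mask = val = CPSR_I, and 0xb662 (CPSIE i) gives mask = CPSR_I with val = 0.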




[Qemu-devel] [PATCH v2 44/68] target/arm: Add skeleton for T16 decodetree

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c   |  6 ++
 target/arm/Makefile.objs |  6 ++
 target/arm/t16.decode| 20 
 3 files changed, 32 insertions(+)
 create mode 100644 target/arm/t16.decode

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 49bab7d863..90d608a2d2 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7538,6 +7538,7 @@ static int t32_branch24(DisasContext *s, int x)
 #include "decode-a32.inc.c"
 #include "decode-a32-uncond.inc.c"
 #include "decode-t32.inc.c"
+#include "decode-t16.inc.c"
 
 /* Helpers to swap operands for reverse-subtract.  */
 static void gen_rsb(TCGv_i32 dst, TCGv_i32 a, TCGv_i32 b)
@@ -10550,6 +10551,11 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 TCGv_i32 tmp2;
 TCGv_i32 addr;
 
+if (disas_t16(s, insn)) {
+return;
+}
+/* fall back to legacy decoder */
+
 switch (insn >> 12) {
 case 0: case 1:
 
diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
index 7806b4dac0..cf26c16f5f 100644
--- a/target/arm/Makefile.objs
+++ b/target/arm/Makefile.objs
@@ -43,12 +43,18 @@ target/arm/decode-t32.inc.c: $(SRC_PATH)/target/arm/t32.decode $(DECODETREE)
  $(PYTHON) $(DECODETREE) --static-decode disas_t32 -o $@ $<,\
  "GEN", $(TARGET_DIR)$@)
 
+target/arm/decode-t16.inc.c: $(SRC_PATH)/target/arm/t16.decode $(DECODETREE)
+   $(call quiet-command,\
+ $(PYTHON) $(DECODETREE) -w 16 --static-decode disas_t16 -o $@ $<,\
+ "GEN", $(TARGET_DIR)$@)
+
 target/arm/translate-sve.o: target/arm/decode-sve.inc.c
 target/arm/translate.o: target/arm/decode-vfp.inc.c
 target/arm/translate.o: target/arm/decode-vfp-uncond.inc.c
 target/arm/translate.o: target/arm/decode-a32.inc.c
 target/arm/translate.o: target/arm/decode-a32-uncond.inc.c
 target/arm/translate.o: target/arm/decode-t32.inc.c
+target/arm/translate.o: target/arm/decode-t16.inc.c
 
 obj-y += tlb_helper.o debug_helper.o
 obj-y += translate.o op_helper.o
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
new file mode 100644
index 00..e954f61fe4
--- /dev/null
+++ b/target/arm/t16.decode
@@ -0,0 +1,20 @@
+# Thumb1 instructions
+#
+#  Copyright (c) 2019 Linaro, Ltd
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see .
+
+#
+# This file is processed by scripts/decodetree.py
+#
-- 
2.17.1




[Qemu-devel] [PATCH v2 37/68] target/arm: Convert PLI, PLD, PLDW

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c   | 37 +++-
 target/arm/a32-uncond.decode | 10 ++
 2 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 46e88d1d17..a30a9bb4e0 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10136,6 +10136,26 @@ static bool trans_SETEND(DisasContext *s, arg_SETEND *a)
 return true;
 }
 
+/*
+ * Preload instructions
+ * All are nops, contingent on the appropriate arch level.
+ */
+
+static bool trans_PLD(DisasContext *s, arg_PLD *a)
+{
+return ENABLE_ARCH_5TE;
+}
+
+static bool trans_PLDW(DisasContext *s, arg_PLD *a)
+{
+return arm_dc_feature(s, ARM_FEATURE_V7MP);
+}
+
+static bool trans_PLI(DisasContext *s, arg_PLD *a)
+{
+return ENABLE_ARCH_7;
+}
+
 /*
  * Legacy decoder.
  */
@@ -10196,23 +10216,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 }
 return;
 }
-if (((insn & 0x0f30f000) == 0x0510f000) ||
-((insn & 0x0f30f010) == 0x0710f000)) {
-if ((insn & (1 << 22)) == 0) {
-/* PLDW; v7MP */
-if (!arm_dc_feature(s, ARM_FEATURE_V7MP)) {
-goto illegal_op;
-}
-}
-/* Otherwise PLD; v5TE+ */
-ARCH(5TE);
-return;
-}
-if (((insn & 0x0f70f000) == 0x0450f000) ||
-((insn & 0x0f70f010) == 0x0650f000)) {
-ARCH(7);
-return; /* PLI; V7 */
-}
 if (((insn & 0x0f70) == 0x0410) ||
 ((insn & 0x0f700010) == 0x0610)) {
 if (!arm_dc_feature(s, ARM_FEATURE_V7MP)) {
diff --git a/target/arm/a32-uncond.decode b/target/arm/a32-uncond.decode
index d5ed48f0fd..aed381cb8e 100644
--- a/target/arm/a32-uncond.decode
+++ b/target/arm/a32-uncond.decode
@@ -54,3 +54,13 @@ SB    0101 0111    0111 
 
 # Set Endianness
 SETEND    0001  0001  00 E:1 0    
+
+# Preload instructions
+
+PLD   0101 -101     # (imm, lit) 5te
+PLDW  0101 -001     # (imm, lit) 7mp
+PLI   0100 -101     # (imm, lit) 7
+
+PLD   0111 -101   - -- 0    # (register) 5te
+PLDW  0111 -001   - -- 0    # (register) 7mp
+PLI   0110 -101   - -- 0    # (register) 7
-- 
2.17.1
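
The masks in the removed legacy block can be exercised in isolation. The sketch below is standalone and uses hypothetical names. It assumes the instruction word comes from the unconditional space (cond == 0xf), as in disas_arm_insn(), and it leaves out the architecture-level checks that the trans_* functions perform.

```c
#include <assert.h>
#include <stdint.h>

enum preload { PRELOAD_NONE, PRELOAD_PLD, PRELOAD_PLDW, PRELOAD_PLI };

/*
 * Classify an A32 preload hint using the masks from the removed
 * legacy decoder.  Feature checks (5TE, v7MP, v7) are not modelled.
 */
enum preload classify_preload(uint32_t insn)
{
    assert((insn >> 28) == 0xf);    /* unconditional space only */

    if ((insn & 0x0f30f000) == 0x0510f000 ||
        (insn & 0x0f30f010) == 0x0710f000) {
        /* Bit 22 clear selects PLDW, set selects PLD. */
        return (insn & (1u << 22)) ? PRELOAD_PLD : PRELOAD_PLDW;
    }
    if ((insn & 0x0f70f000) == 0x0450f000 ||
        (insn & 0x0f70f010) == 0x0650f000) {
        return PRELOAD_PLI;
    }
    return PRELOAD_NONE;
}
```

For instance, 0xf5d0f000 (pld [r0]) classifies as PRELOAD_PLD and 0xf590f000 (pldw [r0]) as PRELOAD_PLDW.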




[Qemu-devel] [PATCH v2 54/68] target/arm: Convert T16 adjust sp (immediate)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 15 ++-
 target/arm/t16.decode  |  9 +
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index e639059a5a..cac3893386 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10640,19 +10640,8 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 /* misc */
 op = (insn >> 8) & 0xf;
 switch (op) {
-case 0:
-/*
- * 0b1011___
- *  - ADD (SP plus immediate)
- *  - SUB (SP minus immediate)
- */
-tmp = load_reg(s, 13);
-val = (insn & 0x7f) * 4;
-if (insn & (1 << 7))
-val = -(int32_t)val;
-tcg_gen_addi_i32(tmp, tmp, val);
-store_sp_checked(s, tmp);
-break;
+case 0: /* add/sub (sp, immediate), in decodetree */
+goto illegal_op;
 
 case 2: /* sign/zero extend.  */
 ARCH(6);
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 5a570484e3..b425b86795 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -156,6 +156,15 @@ ADD_rrri0100 0100 .  ...@addsub_2h s=0
 CMP_xrri0100 0101 .  ...@addsub_2h s=1
 MOV_rxri0100 0110 .  ...@addsub_2h s=0
 
+# Adjust SP (immediate)
+
+%imm7_0x4   0:7 !function=times_4
+@addsub_sp_i  . ... \
+_rri_rot s=0 rd=13 rn=13 rot=0 imm=%imm7_0x4
+
+ADD_rri 1011  0 ... @addsub_sp_i
+SUB_rri 1011  1 ... @addsub_sp_i
+
 # Branch and exchange
 
 @branchr  . rm:4 ...
-- 
2.17.1
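
The immediate handling moved into `%imm7_0x4` can be illustrated with a standalone sketch of the removed legacy arithmetic. The function name is hypothetical; it returns the signed amount added to SP.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Signed SP adjustment for the T16 "ADD/SUB (SP, immediate)" encoding,
 * 1011 0000 S imm7: the 7-bit immediate is scaled by 4 and bit 7
 * selects subtraction.
 */
int32_t t16_sp_adjust(uint16_t insn)
{
    assert((insn & 0xff00) == 0xb000);

    int32_t imm = (int32_t)(insn & 0x7f) * 4;   /* bits 6:0, times 4 */
    return (insn & (1u << 7)) ? -imm : imm;
}
```

So 0xb002 (add sp, #8) yields 8 and 0xb082 (sub sp, #8) yields -8.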




[Qemu-devel] [PATCH v2 34/68] target/arm: Convert Clear-Exclusive, Barriers

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c   | 122 +++
 target/arm/a32-uncond.decode |  10 +++
 target/arm/t32.decode|  10 +++
 3 files changed, 73 insertions(+), 69 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index e268c5168d..6489bbc09c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10038,6 +10038,58 @@ static bool trans_SRS(DisasContext *s, arg_SRS *a)
 return true;
 }
 
+/*
+ * Clear-Exclusive, Barriers
+ */
+
+static bool trans_CLREX(DisasContext *s, arg_CLREX *a)
+{
+if (!ENABLE_ARCH_6K) {
+return false;
+}
+gen_clrex(s);
+return true;
+}
+
+static bool trans_DSB(DisasContext *s, arg_DSB *a)
+{
+if (!s->thumb && !ENABLE_ARCH_7) {
+return false;
+}
+tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
+return true;
+}
+
+static bool trans_DMB(DisasContext *s, arg_DMB *a)
+{
+return trans_DSB(s, NULL);
+}
+
+static bool trans_ISB(DisasContext *s, arg_ISB *a)
+{
+/*
+ * We need to break the TB after this insn to execute
+ * self-modifying code correctly and also to take
+ * any pending interrupts immediately.
+ */
+gen_goto_tb(s, 0, s->base.pc_next);
+return true;
+}
+
+static bool trans_SB(DisasContext *s, arg_SB *a)
+{
+if (!dc_isar_feature(aa32_sb, s)) {
+return false;
+}
+/*
+ * TODO: There is no speculation barrier opcode
+ * for TCG; MB and end the TB instead.
+ */
+tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
+gen_goto_tb(s, 0, s->base.pc_next);
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -10131,38 +10183,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 s->base.is_jmp = DISAS_UPDATE;
 }
 return;
-} else if ((insn & 0x0f00) == 0x057ff000) {
-switch ((insn >> 4) & 0xf) {
-case 1: /* clrex */
-ARCH(6K);
-gen_clrex(s);
-return;
-case 4: /* dsb */
-case 5: /* dmb */
-ARCH(7);
-tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
-return;
-case 6: /* isb */
-/* We need to break the TB after this insn to execute
- * self-modifying code correctly and also to take
- * any pending interrupts immediately.
- */
-gen_goto_tb(s, 0, s->base.pc_next);
-return;
-case 7: /* sb */
-if ((insn & 0xf) || !dc_isar_feature(aa32_sb, s)) {
-goto illegal_op;
-}
-/*
- * TODO: There is no speculation barrier opcode
- * for TCG; MB and end the TB instead.
- */
-tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
-gen_goto_tb(s, 0, s->base.pc_next);
-return;
-default:
-goto illegal_op;
-}
 } else if ((insn & 0x0e000f00) == 0x0c000100) {
 if (arm_dc_feature(s, ARM_FEATURE_IWMMXT)) {
 /* iWMMXt register transfer.  */
@@ -10623,43 +10643,7 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 gen_set_psr_im(s, offset, 0, imm);
 }
 break;
-case 3: /* Special control operations.  */
-if (!arm_dc_feature(s, ARM_FEATURE_V7) &&
-!arm_dc_feature(s, ARM_FEATURE_M)) {
-goto illegal_op;
-}
-op = (insn >> 4) & 0xf;
-switch (op) {
-case 2: /* clrex */
-gen_clrex(s);
-break;
-case 4: /* dsb */
-case 5: /* dmb */
-tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
-break;
-case 6: /* isb */
-/* We need to break the TB after this insn
- * to execute self-modifying code correctly
- * and also to take any pending interrupts
- * immediately.
- */
-gen_goto_tb(s, 0, s->base.pc_next);
-break;
-case 7: /* sb */
-if ((insn & 0xf) || !dc_isar_feature(aa32_sb, s)) {
-goto illegal_op;
-}
-/*
- * TODO: There is no speculation barrier opcode
- * for TCG; MB and end the TB instead.
- */
-tcg_gen_mb(TCG_MO_ALL | 

[Qemu-devel] [PATCH v2 55/68] target/arm: Convert T16, extract

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 14 +-
 target/arm/t16.decode  | 10 ++
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index cac3893386..414c562fb3 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10641,21 +10641,9 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 op = (insn >> 8) & 0xf;
 switch (op) {
 case 0: /* add/sub (sp, immediate), in decodetree */
+case 2: /* sign/zero extend, in decodetree */
 goto illegal_op;
 
-case 2: /* sign/zero extend.  */
-ARCH(6);
-rd = insn & 7;
-rm = (insn >> 3) & 7;
-tmp = load_reg(s, rm);
-switch ((insn >> 6) & 3) {
-case 0: gen_sxth(tmp); break;
-case 1: gen_sxtb(tmp); break;
-case 2: gen_uxth(tmp); break;
-case 3: gen_uxtb(tmp); break;
-}
-store_reg(s, rd, tmp);
-break;
 case 4: case 5: case 0xc: case 0xd:
 /*
  * 0b1011_x10x__
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index b425b86795..b5b5086e8a 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -23,6 +23,7 @@
 _rrr_shr   !extern s rn rd rm rs shty
 _rri_rot   !extern s rn rd imm rot
 _  !extern s rd rn rm ra
+_rot !extern rd rn rm rot
   !extern rd imm
!extern rm
 _rr !extern p w u rn rt rm shimm shtype
@@ -173,3 +174,12 @@ BX  0100 0111 0  000@branchr
 BLX_r   0100 0111 1  000@branchr
 BXNS0100 0111 0  100@branchr
 BLXNS   0100 0111 1  100@branchr
+
+# Extend
+
+@extend   .. rm:3 rd:3  _rot rn=15 rot=0
+
+SXTAH   1011 0010 00 ... ...@extend
+SXTAB   1011 0010 01 ... ...@extend
+UXTAH   1011 0010 10 ... ...@extend
+UXTAB   1011 0010 11 ... ...@extend
-- 
2.17.1
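
The four operations selected by bits 7:6 in the removed legacy switch can be sketched on plain integers. This is a standalone illustration with a hypothetical function name. Treating rn=15 in the new patterns as "no accumulate" is an assumption about the shared SXTAH/SXTAB/UXTAH/UXTAB code, which is not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Result of the T16 sign/zero extend encoding 1011 0010 op:2 rm:3 rd:3
 * applied to the value held in rm.
 */
uint32_t t16_extend(uint16_t insn, uint32_t rm_value)
{
    assert((insn & 0xff00) == 0xb200);

    switch ((insn >> 6) & 3) {
    case 0: /* SXTH: sign-extend the low halfword */
        return ((rm_value & 0xffffu) ^ 0x8000u) - 0x8000u;
    case 1: /* SXTB: sign-extend the low byte */
        return ((rm_value & 0xffu) ^ 0x80u) - 0x80u;
    case 2: /* UXTH: zero-extend the low halfword */
        return rm_value & 0xffffu;
    default: /* UXTB: zero-extend the low byte */
        return rm_value & 0xffu;
    }
}
```

For example, 0xb2c8 (uxtb r0, r1) applied to 0x12345678 gives 0x78.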




[Qemu-devel] [PATCH v2 36/68] target/arm: Convert SETEND

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c   | 22 +-
 target/arm/a32-uncond.decode |  4 
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 928205d993..46e88d1d17 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10124,6 +10124,18 @@ static bool trans_SB(DisasContext *s, arg_SB *a)
 return true;
 }
 
+static bool trans_SETEND(DisasContext *s, arg_SETEND *a)
+{
+if (!ENABLE_ARCH_6) {
+return false;
+}
+if (a->E != (s->be_data == MO_BE)) {
+gen_helper_setend(cpu_env);
+s->base.is_jmp = DISAS_UPDATE;
+}
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -10209,15 +10221,7 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 return; /* v7MP: Unallocated memory hint: must NOP */
 }
 
-if ((insn & 0x0dff) == 0x0101) {
-ARCH(6);
-/* setend */
-if (((insn >> 9) & 1) != !!(s->be_data == MO_BE)) {
-gen_helper_setend(cpu_env);
-s->base.is_jmp = DISAS_UPDATE;
-}
-return;
-} else if ((insn & 0x0e000f00) == 0x0c000100) {
+if ((insn & 0x0e000f00) == 0x0c000100) {
 if (arm_dc_feature(s, ARM_FEATURE_IWMMXT)) {
 /* iWMMXt register transfer.  */
 if (extract32(s->c15_cpar, 1, 1)) {
diff --git a/target/arm/a32-uncond.decode b/target/arm/a32-uncond.decode
index eb1c55b330..d5ed48f0fd 100644
--- a/target/arm/a32-uncond.decode
+++ b/target/arm/a32-uncond.decode
@@ -24,6 +24,7 @@
 
!extern
!extern imm
+  E
 
 # Branch with Link and Exchange
 
@@ -50,3 +51,6 @@ DSB   0101 0111    0100 
 DMB   0101 0111    0101 
 ISB   0101 0111    0110 
 SB    0101 0111    0111 
+
+# Set Endianness
+SETEND    0001  0001  00 E:1 0    
-- 
2.17.1




[Qemu-devel] [PATCH v2 46/68] target/arm: Convert T16 load/store (register offset)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 51 ++
 target/arm/t16.decode  | 15 +
 2 files changed, 17 insertions(+), 49 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 7c5769bd42..e19961fb6c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10745,55 +10745,8 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 goto illegal_op;
 
 case 5:
-/* load/store register offset.  */
-rd = insn & 7;
-rn = (insn >> 3) & 7;
-rm = (insn >> 6) & 7;
-op = (insn >> 9) & 7;
-addr = load_reg(s, rn);
-tmp = load_reg(s, rm);
-tcg_gen_add_i32(addr, addr, tmp);
-tcg_temp_free_i32(tmp);
-
-if (op < 3) { /* store */
-tmp = load_reg(s, rd);
-} else {
-tmp = tcg_temp_new_i32();
-}
-
-switch (op) {
-case 0: /* str */
-gen_aa32_st32_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-break;
-case 1: /* strh */
-gen_aa32_st16_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-break;
-case 2: /* strb */
-gen_aa32_st8_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-break;
-case 3: /* ldrsb */
-gen_aa32_ld8s_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-break;
-case 4: /* ldr */
-gen_aa32_ld32u_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-break;
-case 5: /* ldrh */
-gen_aa32_ld16u_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-break;
-case 6: /* ldrb */
-gen_aa32_ld8u_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-break;
-case 7: /* ldrsh */
-gen_aa32_ld16s_iss(s, tmp, addr, get_mem_index(s), rd | ISSIs16Bit);
-break;
-}
-if (op >= 3) { /* load */
-store_reg(s, rd, tmp);
-} else {
-tcg_temp_free_i32(tmp);
-}
-tcg_temp_free_i32(addr);
-break;
+/* load/store register offset, in decodetree */
+goto illegal_op;
 
 case 6:
 /* load/store word immediate offset */
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 44e7250c55..83fe4363c7 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -23,6 +23,7 @@
 _rrr_shr   !extern s rn rd rm rs shty
 _rri_rot   !extern s rn rd imm rot
 _  !extern s rd rn rm ra
+_rr !extern p w u rn rt rm shimm shtype
 
 # Set S if the instruction is outside of an IT block.
 %s   !function=t16_setflags
@@ -54,3 +55,17 @@ ORR_rrri 01 1100 ... ...@lll_noshr
 MUL  01 1101 rn:3 rd:3  _ %s rm=%reg_0 ra=0
 BIC_rrri 01 1110 ... ...@lll_noshr
 MVN_rxri 01  ... ...@lll_noshr
+
+# Load/store (register offset)
+
+@ldst_rr ... rm:3 rn:3 rt:3 \
+ _rr p=1 w=0 u=1 shimm=0 shtype=0
+
+STR_rr   0101 000 ... ... ...   @ldst_rr
+STRH_rr  0101 001 ... ... ...   @ldst_rr
+STRB_rr  0101 010 ... ... ...   @ldst_rr
+LDRSB_rr 0101 011 ... ... ...   @ldst_rr
+LDR_rr   0101 100 ... ... ...   @ldst_rr
+LDRH_rr  0101 101 ... ... ...   @ldst_rr
+LDRB_rr  0101 110 ... ... ...   @ldst_rr
+LDRSH_rr 0101 111 ... ... ...   @ldst_rr
-- 
2.17.1
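
The field layout named by `@ldst_rr` (rm:3 rn:3 rt:3 below a 3-bit opcode) matches the shifts in the removed legacy code, which a standalone sketch can make explicit. The struct and function names are hypothetical, and the load/store split follows the legacy `op < 3` test.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the T16 load/store (register offset) encoding 0101 op:3 rm:3 rn:3 rt:3. */
struct ldst_rr {
    unsigned op;      /* 0 STR, 1 STRH, 2 STRB, 3 LDRSB, 4 LDR, 5 LDRH, 6 LDRB, 7 LDRSH */
    unsigned rm;      /* offset register */
    unsigned rn;      /* base register */
    unsigned rt;      /* transfer register */
    int is_load;      /* op >= 3 in the legacy decoder */
};

struct ldst_rr t16_ldst_rr(uint16_t insn)
{
    struct ldst_rr f;

    assert((insn >> 12) == 5);
    f.op = (insn >> 9) & 7;
    f.rm = (insn >> 6) & 7;
    f.rn = (insn >> 3) & 7;
    f.rt = insn & 7;
    f.is_load = f.op >= 3;
    return f;
}
```

For example, 0x5888 (ldr r0, [r1, r2]) decodes to op 4, rm 2, rn 1, rt 0 and is a load.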




[Qemu-devel] [PATCH v2 26/68] target/arm: Convert MOVW, MOVT

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 89 --
 target/arm/a32.decode  |  6 +++
 target/arm/t32.decode  |  9 +
 3 files changed, 48 insertions(+), 56 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 7962ac49e6..81eae286e8 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7841,6 +7841,34 @@ static bool trans_ADR(DisasContext *s, arg_ri *a)
 return true;
 }
 
+static bool trans_MOVW(DisasContext *s, arg_MOVW *a)
+{
+TCGv_i32 tmp;
+
+if (!ENABLE_ARCH_6T2) {
+return false;
+}
+
+tmp = tcg_const_i32(a->imm);
+store_reg(s, a->rd, tmp);
+return true;
+}
+
+static bool trans_MOVT(DisasContext *s, arg_MOVW *a)
+{
+TCGv_i32 tmp;
+
+if (!ENABLE_ARCH_6T2) {
+return false;
+}
+
+tmp = load_reg(s, a->rd);
+tcg_gen_ext16u_i32(tmp, tmp);
+tcg_gen_ori_i32(tmp, tmp, a->imm << 16);
+store_reg(s, a->rd, tmp);
+return true;
+}
+
 /*
  * Multiply and multiply accumulate
  */
@@ -9649,7 +9677,7 @@ static bool trans_UDIV(DisasContext *s, arg_rrr *a)
 
 static void disas_arm_insn(DisasContext *s, unsigned int insn)
 {
-unsigned int cond, val, op1, i, rn, rd;
+unsigned int cond, val, op1, i, rn;
 TCGv_i32 tmp;
 TCGv_i32 tmp2;
 TCGv_i32 addr;
@@ -9898,26 +9926,8 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 /* fall back to legacy decoder */
 
 if ((insn & 0x0f90) == 0x0300) {
-if ((insn & (1 << 21)) == 0) {
-ARCH(6T2);
-rd = (insn >> 12) & 0xf;
-val = ((insn >> 4) & 0xf000) | (insn & 0xfff);
-if ((insn & (1 << 22)) == 0) {
-/* MOVW */
-tmp = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp, val);
-} else {
-/* MOVT */
-tmp = load_reg(s, rd);
-tcg_gen_ext16u_i32(tmp, tmp);
-tcg_gen_ori_i32(tmp, tmp, val << 16);
-}
-store_reg(s, rd, tmp);
-} else {
-/* MSR (immediate) and hints */
-/* All done in decodetree.  Illegal ops already signalled.  */
-g_assert_not_reached();
-}
+/* All done in decodetree.  Illegal ops reach here.  */
+goto illegal_op;
 } else if ((insn & 0x0f90) == 0x0100
&& (insn & 0x0090) != 0x0090) {
 /* miscellaneous instructions */
@@ -10655,42 +10665,9 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 /*
  * 0b_0xxx__0xxx__
  *  - Data-processing (modified immediate, plain binary immediate)
+ * All in decodetree.
  */
-if (insn & (1 << 25)) {
-/*
- * 0b_0x1x__0xxx__
- *  - Data-processing (plain binary immediate)
- */
-if (insn & (1 << 24)) {
-/* Bitfield/Saturate, in decodetree */
-goto illegal_op;
-} else {
-imm = ((insn & 0x0400) >> 15)
-  | ((insn & 0x7000) >> 4) | (insn & 0xff);
-if (insn & (1 << 22)) {
-/* 16-bit immediate.  */
-imm |= (insn >> 4) & 0xf000;
-if (insn & (1 << 23)) {
-/* movt */
-tmp = load_reg(s, rd);
-tcg_gen_ext16u_i32(tmp, tmp);
-tcg_gen_ori_i32(tmp, tmp, imm << 16);
-} else {
-/* movw */
-tmp = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp, imm);
-}
-store_reg(s, rd, tmp);
-} else {
-/* Add/sub 12-bit immediate, in decodetree */
-goto illegal_op;
-}
-}
-} else {
-/* Data-processing (modified immediate) */
-/* All done in decodetree.  Reach here for illegal ops.  */
-goto illegal_op;
-}
+goto illegal_op;
 }
 break;
 case 12:
diff --git a/target/arm/a32.decode b/target/arm/a32.decode
index d7a333b90b..341882e637 100644
--- a/target/arm/a32.decode
+++ b/target/arm/a32.decode
@@ -73,6 +73,12 @@ MOV_rxri  000 1101 .   . .. 0    
 @s_rxr_shi
 BIC_rrri  000 1110 .   . .. 0 @s_rrr_shi
 MVN_rxri  000  .   . .. 0 @s_rxr_shi
 
+%imm16   16:4 0:12
+@mov16       rd:4  imm=%imm16
+
+MOVW  0011    @mov16
+MOVT

[Qemu-devel] [PATCH v2 35/68] target/arm: Convert CPS (privileged)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c   | 87 +++-
 target/arm/a32-uncond.decode |  3 ++
 target/arm/t32.decode|  3 ++
 3 files changed, 42 insertions(+), 51 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 6489bbc09c..928205d993 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -10038,6 +10038,40 @@ static bool trans_SRS(DisasContext *s, arg_SRS *a)
 return true;
 }
 
+static bool trans_CPS(DisasContext *s, arg_CPS *a)
+{
+uint32_t mask, val;
+
+if (IS_USER(s)) {
+/* Implemented as NOP in user mode.  */
+return true;
+}
+
+mask = val = 0;
+if (a->imod & 2) {
+if (a->A) {
+mask |= CPSR_A;
+}
+if (a->I) {
+mask |= CPSR_I;
+}
+if (a->F) {
+mask |= CPSR_F;
+}
+if (a->imod & 1) {
+val |= mask;
+}
+}
+if (a->M) {
+mask |= CPSR_M;
+val |= a->mode;
+}
+if (mask) {
+gen_set_psr_im(s, mask, 0, val);
+}
+return true;
+}
+
 /*
  * Clear-Exclusive, Barriers
  */
@@ -10209,31 +10243,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 ARCH(5TE);
 } else if ((insn & 0x0f10) == 0x0e10) {
 /* Additional coprocessor register transfer.  */
-} else if ((insn & 0x0ff10020) == 0x0100) {
-uint32_t mask;
-uint32_t val;
-/* cps (privileged) */
-if (IS_USER(s))
-return;
-mask = val = 0;
-if (insn & (1 << 19)) {
-if (insn & (1 << 8))
-mask |= CPSR_A;
-if (insn & (1 << 7))
-mask |= CPSR_I;
-if (insn & (1 << 6))
-mask |= CPSR_F;
-if (insn & (1 << 18))
-val |= mask;
-}
-if (insn & (1 << 17)) {
-mask |= CPSR_M;
-val |= (insn & 0x1f);
-}
-if (mask) {
-gen_set_psr_im(s, mask, 0, val);
-}
-return;
 }
 goto illegal_op;
 }
@@ -10342,7 +10351,6 @@ static bool thumb_insn_is_16bit(DisasContext *s, uint32_t pc, uint32_t insn)
 /* Translate a 32-bit thumb instruction. */
 static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 {
-uint32_t imm, offset;
 uint32_t rd, rn, rm, rs;
 TCGv_i32 tmp;
 TCGv_i32 addr;
@@ -10618,31 +10626,8 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 case 0: /* msr cpsr, in decodetree  */
 case 1: /* msr spsr, in decodetree  */
 goto illegal_op;
-case 2: /* cps, nop-hint.  */
-/* nop hints in decodetree */
-/* Implemented as NOP in user mode.  */
-if (IS_USER(s))
-break;
-offset = 0;
-imm = 0;
-if (insn & (1 << 10)) {
-if (insn & (1 << 7))
-offset |= CPSR_A;
-if (insn & (1 << 6))
-offset |= CPSR_I;
-if (insn & (1 << 5))
-offset |= CPSR_F;
-if (insn & (1 << 9))
-imm = CPSR_A | CPSR_I | CPSR_F;
-}
-if (insn & (1 << 8)) {
-offset |= 0x1f;
-imm |= (insn & 0x1f);
-}
-if (offset) {
-gen_set_psr_im(s, offset, 0, imm);
-}
-break;
+case 2: /* cps, nop-hint, in decodetree */
+goto illegal_op;
 case 3: /* Special control operations, in decodetree */
 case 4: /* bxj, in decodetree */
 goto illegal_op;
diff --git a/target/arm/a32-uncond.decode b/target/arm/a32-uncond.decode
index b077958cec..eb1c55b330 100644
--- a/target/arm/a32-uncond.decode
+++ b/target/arm/a32-uncond.decode
@@ -35,9 +35,12 @@ BLX_i 101 .  
  imm=%imm24h
 
  rn w pu
  mode w pu
+ mode imod M A I F
 
 RFE   100 pu:2 0 w:1 1 rn:4  1010     
 SRS   110 pu:2 1 w:1 0 1101  0101 000 mode:5  
+CPS   0001  imod:2 M:1 0  000 A:1 I:1 F:1 0 mode:5 \
+ 
 
 # Clear-Exclusive, Barriers
 
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index 18c268e712..354ad77fe6 100644
--- 

[Qemu-devel] [PATCH v2 49/68] target/arm: Convert T16 load/store multiple

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 48 --
 target/arm/t16.decode  |  8 +++
 2 files changed, 17 insertions(+), 39 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 2640f50fcf..d417958b23 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9976,6 +9976,14 @@ static bool trans_LDM_t32(DisasContext *s, arg_ldst_block *a)
 return do_ldm(s, a, 2);
 }
 
+static bool trans_LDM_t16(DisasContext *s, arg_ldst_block *a)
+{
+/* Writeback is conditional on the base register not being loaded.  */
+a->w = !(a->list & (1 << a->rn));
+/* BitCount(list) < 1 is UNPREDICTABLE */
+return do_ldm(s, a, 1);
+}
+
 /*
  * Branch, branch with link
  */
@@ -10750,6 +10758,7 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 case 8: /* load/store halfword immediate offset, in decodetree */
 case 9: /* load/store from stack, in decodetree */
 case 10: /* add PC/SP (immediate), in decodetree */
+case 12: /* load/store multiple, in decodetree */
 goto illegal_op;
 
 case 11:
@@ -10973,45 +10982,6 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 }
 break;
 
-case 12:
-{
-/* load/store multiple */
-TCGv_i32 loaded_var = NULL;
-rn = (insn >> 8) & 0x7;
-addr = load_reg(s, rn);
-for (i = 0; i < 8; i++) {
-if (insn & (1 << i)) {
-if (insn & (1 << 11)) {
-/* load */
-tmp = tcg_temp_new_i32();
-gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
-if (i == rn) {
-loaded_var = tmp;
-} else {
-store_reg(s, i, tmp);
-}
-} else {
-/* store */
-tmp = load_reg(s, i);
-gen_aa32_st32(s, tmp, addr, get_mem_index(s));
-tcg_temp_free_i32(tmp);
-}
-/* advance to the next address */
-tcg_gen_addi_i32(addr, addr, 4);
-}
-}
-if ((insn & (1 << rn)) == 0) {
-/* base reg not in list: base register writeback */
-store_reg(s, rn, addr);
-} else {
-/* base reg in list: if load, complete it now */
-if (insn & (1 << 11)) {
-store_reg(s, rn, loaded_var);
-}
-tcg_temp_free_i32(addr);
-}
-break;
-}
 case 13:
 /* conditional branch or swi */
 cond = (insn >> 8) & 0xf;
diff --git a/target/arm/t16.decode b/target/arm/t16.decode
index 71b3e8f02e..a7a437f930 100644
--- a/target/arm/t16.decode
+++ b/target/arm/t16.decode
@@ -26,6 +26,7 @@
   !extern rd imm
 _rr !extern p w u rn rt rm shimm shtype
 _ri !extern p w u rn rt imm
+_block  !extern rn i b u w list
 
 # Set S if the instruction is outside of an IT block.
 %s   !function=t16_setflags
@@ -109,3 +110,10 @@ LDR_ri  10011 ...   
@ldst_spec_i rn=13
 ADR 10100 rd:3  imm=%imm8_0x4
 ADD_rri 10101 rd:3  \
 _rri_rot rn=13 s=0 rot=0 imm=%imm8_0x4  # SP
+
+# Load/store multiple
+
+@ldstm  . rn:3 list:8   _block i=1 b=0 u=0 w=1
+
+STM 11000 ...   @ldstm
+LDM_t16 11001 ...   @ldstm
-- 
2.17.1
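
The writeback rule in trans_LDM_t16 (write back only when the base register is not in the register list) can be tried on raw encodings with a standalone sketch. The function name is hypothetical; the field positions follow the `@ldstm` format (rn:3 list:8).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if a T16 LDM (11001 rn:3 list:8) writes the updated
 * address back to rn, i.e. when rn is not among the loaded registers.
 */
int t16_ldm_writeback(uint16_t insn)
{
    assert((insn & 0xf800) == 0xc800);

    unsigned rn = (insn >> 8) & 7;
    unsigned list = insn & 0xff;
    return !(list & (1u << rn));
}
```

For example, 0xc806 (ldmia r0!, {r1, r2}) returns 1, while 0xc803 (ldmia r0, {r0, r1}) returns 0.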




[Qemu-devel] [PATCH v2 41/68] target/arm: Convert TT

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 87 +-
 target/arm/t32.decode  |  5 ++-
 2 files changed, 31 insertions(+), 61 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 9a8864e8ff..d1078ca1ec 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8454,6 +8454,30 @@ static bool trans_SG(DisasContext *s, arg_SG *a)
 return true;
 }
 
+static bool trans_TT(DisasContext *s, arg_TT *a)
+{
+TCGv_i32 addr, tmp;
+
+if (!arm_dc_feature(s, ARM_FEATURE_M) ||
+!arm_dc_feature(s, ARM_FEATURE_V8)) {
+return false;
+}
+if (a->rd == 13 || a->rd == 15 || a->rn == 15) {
+/* We UNDEF for these UNPREDICTABLE cases */
+return false;
+}
+if (a->A && !s->v8m_secure) {
+return false;
+}
+
+addr = load_reg(s, a->rn);
+tmp = tcg_const_i32((a->A << 1) | a->T);
+gen_helper_v7m_tt(tmp, cpu_env, addr, tmp);
+tcg_temp_free_i32(addr);
+store_reg(s, a->rd, tmp);
+return true;
+}
+
 /*
  * Load/store register index
  */
@@ -10409,7 +10433,7 @@ static bool thumb_insn_is_16bit(DisasContext *s, uint32_t pc, uint32_t insn)
 /* Translate a 32-bit thumb instruction. */
 static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 {
-uint32_t rd, rn, rs;
+uint32_t rn;
 int op;
 
 /*
@@ -10453,70 +10477,13 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 /* fall back to legacy decoder */
 
 rn = (insn >> 16) & 0xf;
-rs = (insn >> 12) & 0xf;
-rd = (insn >> 8) & 0xf;
 switch ((insn >> 25) & 0xf) {
 case 0: case 1: case 2: case 3:
 /* 16-bit instructions.  Should never happen.  */
 abort();
 case 4:
-if (insn & (1 << 22)) {
-/* 0b1110_100x_x1xx_____
- * - load/store doubleword, load/store exclusive, ldacq/strel,
- *   table branch, TT.
- */
-if (insn & 0x0120) {
-/* load/store dual, in decodetree */
-goto illegal_op;
-} else if ((insn & (1 << 23)) == 0) {
-/* 0b1110_1000_010x_____
- * - load/store exclusive word
- * - TT (v8M only)
- */
-if (rs == 15) {
-if (!(insn & (1 << 20)) &&
-arm_dc_feature(s, ARM_FEATURE_M) &&
-arm_dc_feature(s, ARM_FEATURE_V8)) {
-/* 0b1110_1000_0100_____
- *  - TT (v8M only)
- */
-bool alt = insn & (1 << 7);
-TCGv_i32 addr, op, ttresp;
-
-if ((insn & 0x3f) || rd == 13 || rd == 15 || rn == 15) {
-/* we UNDEF for these UNPREDICTABLE cases */
-goto illegal_op;
-}
-
-if (alt && !s->v8m_secure) {
-goto illegal_op;
-}
-
-addr = load_reg(s, rn);
-op = tcg_const_i32(extract32(insn, 6, 2));
-ttresp = tcg_temp_new_i32();
-gen_helper_v7m_tt(ttresp, cpu_env, addr, op);
-tcg_temp_free_i32(addr);
-tcg_temp_free_i32(op);
-store_reg(s, rd, ttresp);
-break;
-}
-goto illegal_op;
-}
-/* Load/store exclusive, in decodetree */
-goto illegal_op;
-} else if ((insn & (7 << 5)) == 0) {
-/* Table Branch, in decodetree */
-goto illegal_op;
-} else {
-/* Load/store exclusive, load-acq/store-rel, in decodetree */
-goto illegal_op;
-}
-} else {
-/* Load/store multiple, RFE, SRS, in decodetree */
-goto illegal_op;
-}
-break;
+/* All in decodetree */
+goto illegal_op;
 case 5:
 /* All in decodetree */
 goto illegal_op;
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index ce46650446..bb875f77b0 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -506,7 +506,10 @@ STRD_ri_t32  1110 1001 .110    @ldstd_ri8 w=1 p=1
 @ldrex_d    rn:4 rt:4 rt2:4   \
   imm=0
 
-STREX1110 1000 0100       @strex_i
+{
+  TT 1110 1000 0100 rn:4  rd:4 A:1 T:1 00
+  STREX  1110 1000 0100       @strex_i
+}
 STREXB   1110 1000 1100    0100   @strex_0
 STREXH   1110 1000 1100    0101   @strex_0
 

[Qemu-devel] [PATCH v2 45/68] target/arm: Convert T16 data-processing (two low regs)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 152 ++---
 target/arm/t16.decode  |  36 ++
 2 files changed, 43 insertions(+), 145 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 90d608a2d2..7c5769bd42 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -445,13 +445,6 @@ static inline void gen_logic_CC(TCGv_i32 var)
 tcg_gen_mov_i32(cpu_ZF, var);
 }
 
-/* T0 += T1 + CF.  */
-static void gen_adc(TCGv_i32 t0, TCGv_i32 t1)
-{
-tcg_gen_add_i32(t0, t0, t1);
-tcg_gen_add_i32(t0, t0, cpu_CF);
-}
-
 /* dest = T0 + T1 + CF. */
 static void gen_add_carry(TCGv_i32 dest, TCGv_i32 t0, TCGv_i32 t1)
 {
@@ -7531,6 +7524,11 @@ static int t32_branch24(DisasContext *s, int x)
 return x << 1;
 }
 
+static int t16_setflags(DisasContext *s)
+{
+return s->condexec_mask == 0;
+}
+
 /*
  * Include the generated decoders.
  */
@@ -10742,145 +10740,9 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn)
 
 /*
  * 0b0100_00xx__
- *  - Data-processing (two low registers)
+ *  - Data-processing (two low registers), in decodetree
  */
-rd = insn & 7;
-rm = (insn >> 3) & 7;
-op = (insn >> 6) & 0xf;
-if (op == 2 || op == 3 || op == 4 || op == 7) {
-/* the shift/rotate ops want the operands backwards */
-val = rm;
-rm = rd;
-rd = val;
-val = 1;
-} else {
-val = 0;
-}
-
-if (op == 9) { /* neg */
-tmp = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp, 0);
-} else if (op != 0xf) { /* mvn doesn't read its first operand */
-tmp = load_reg(s, rd);
-} else {
-tmp = NULL;
-}
-
-tmp2 = load_reg(s, rm);
-switch (op) {
-case 0x0: /* and */
-tcg_gen_and_i32(tmp, tmp, tmp2);
-if (!s->condexec_mask)
-gen_logic_CC(tmp);
-break;
-case 0x1: /* eor */
-tcg_gen_xor_i32(tmp, tmp, tmp2);
-if (!s->condexec_mask)
-gen_logic_CC(tmp);
-break;
-case 0x2: /* lsl */
-if (s->condexec_mask) {
-gen_shl(tmp2, tmp2, tmp);
-} else {
-gen_helper_shl_cc(tmp2, cpu_env, tmp2, tmp);
-gen_logic_CC(tmp2);
-}
-break;
-case 0x3: /* lsr */
-if (s->condexec_mask) {
-gen_shr(tmp2, tmp2, tmp);
-} else {
-gen_helper_shr_cc(tmp2, cpu_env, tmp2, tmp);
-gen_logic_CC(tmp2);
-}
-break;
-case 0x4: /* asr */
-if (s->condexec_mask) {
-gen_sar(tmp2, tmp2, tmp);
-} else {
-gen_helper_sar_cc(tmp2, cpu_env, tmp2, tmp);
-gen_logic_CC(tmp2);
-}
-break;
-case 0x5: /* adc */
-if (s->condexec_mask) {
-gen_adc(tmp, tmp2);
-} else {
-gen_adc_CC(tmp, tmp, tmp2);
-}
-break;
-case 0x6: /* sbc */
-if (s->condexec_mask) {
-gen_sub_carry(tmp, tmp, tmp2);
-} else {
-gen_sbc_CC(tmp, tmp, tmp2);
-}
-break;
-case 0x7: /* ror */
-if (s->condexec_mask) {
-tcg_gen_andi_i32(tmp, tmp, 0x1f);
-tcg_gen_rotr_i32(tmp2, tmp2, tmp);
-} else {
-gen_helper_ror_cc(tmp2, cpu_env, tmp2, tmp);
-gen_logic_CC(tmp2);
-}
-break;
-case 0x8: /* tst */
-tcg_gen_and_i32(tmp, tmp, tmp2);
-gen_logic_CC(tmp);
-rd = 16;
-break;
-case 0x9: /* neg */
-if (s->condexec_mask)
-tcg_gen_neg_i32(tmp, tmp2);
-else
-gen_sub_CC(tmp, tmp, tmp2);
-break;
-case 0xa: /* cmp */
-gen_sub_CC(tmp, tmp, tmp2);
-rd = 16;
-break;
-case 0xb: /* cmn */
-gen_add_CC(tmp, tmp, tmp2);
-rd = 16;
-break;
-case 0xc: /* orr */
-tcg_gen_or_i32(tmp, tmp, tmp2);
-if (!s->condexec_mask)
-gen_logic_CC(tmp);
-break;
-case 0xd: /* mul */
-tcg_gen_mul_i32(tmp, tmp, tmp2);
-if (!s->condexec_mask)
-gen_logic_CC(tmp);
-break;
-case 0xe: /* bic */
-tcg_gen_andc_i32(tmp, tmp, tmp2);
-if (!s->condexec_mask)
-gen_logic_CC(tmp);
-break;
-case 0xf: /* mvn */
-tcg_gen_not_i32(tmp2, tmp2);
-if (!s->condexec_mask)
-gen_logic_CC(tmp2);
-

[Qemu-devel] [PATCH v2 25/68] target/arm: Convert Signed multiply, signed and unsigned divide

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 463 ++---
 target/arm/a32.decode  |  22 ++
 target/arm/t32.decode  |  18 ++
 3 files changed, 247 insertions(+), 256 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index d31e89f308..7962ac49e6 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9442,18 +9442,217 @@ static bool trans_RBIT(DisasContext *s, arg_rr *a)
 return op_rr(s, a, gen_helper_rbit);
 }
 
+/*
+ * Signed multiply, signed and unsigned divide
+ */
+
+static bool op_smlad(DisasContext *s, arg_ *a, bool m_swap, bool sub)
+{
+TCGv_i32 t1, t2;
+
+if (!ENABLE_ARCH_6) {
+return false;
+}
+
+t1 = load_reg(s, a->rn);
+t2 = load_reg(s, a->rm);
+if (m_swap) {
+gen_swap_half(t2);
+}
+gen_smul_dual(t1, t2);
+
+if (sub) {
+/* This subtraction cannot overflow. */
+tcg_gen_sub_i32(t1, t1, t2);
+} else {
+/*
+ * This addition cannot overflow 32 bits; however it may
+ * overflow considered as a signed operation, in which case
+ * we must set the Q flag.
+ */
+gen_helper_add_setq(t1, cpu_env, t1, t2);
+}
+tcg_temp_free_i32(t2);
+
+if (a->ra != 15) {
+t2 = load_reg(s, a->ra);
+gen_helper_add_setq(t1, cpu_env, t1, t2);
+tcg_temp_free_i32(t2);
+}
+store_reg(s, a->rd, t1);
+return true;
+}
+
+static bool trans_SMLAD(DisasContext *s, arg_ *a)
+{
+return op_smlad(s, a, false, false);
+}
+
+static bool trans_SMLADX(DisasContext *s, arg_ *a)
+{
+return op_smlad(s, a, true, false);
+}
+
+static bool trans_SMLSD(DisasContext *s, arg_ *a)
+{
+return op_smlad(s, a, false, true);
+}
+
+static bool trans_SMLSDX(DisasContext *s, arg_ *a)
+{
+return op_smlad(s, a, true, true);
+}
+
+static bool op_smlald(DisasContext *s, arg_ *a, bool m_swap, bool sub)
+{
+TCGv_i32 t1, t2;
+TCGv_i64 l1, l2;
+
+if (!ENABLE_ARCH_6) {
+return false;
+}
+
+t1 = load_reg(s, a->rn);
+t2 = load_reg(s, a->rm);
+if (m_swap) {
+gen_swap_half(t2);
+}
+gen_smul_dual(t1, t2);
+
+l1 = tcg_temp_new_i64();
+l2 = tcg_temp_new_i64();
+tcg_gen_ext_i32_i64(l1, t1);
+tcg_gen_ext_i32_i64(l2, t2);
+tcg_temp_free_i32(t1);
+tcg_temp_free_i32(t2);
+
+if (sub) {
+tcg_gen_sub_i64(l1, l1, l2);
+} else {
+tcg_gen_add_i64(l1, l1, l2);
+}
+tcg_temp_free_i64(l2);
+
+gen_addq(s, l1, a->ra, a->rd);
+gen_storeq_reg(s, a->ra, a->rd, l1);
+tcg_temp_free_i64(l1);
+return true;
+}
+
+static bool trans_SMLALD(DisasContext *s, arg_ *a)
+{
+return op_smlald(s, a, false, false);
+}
+
+static bool trans_SMLALDX(DisasContext *s, arg_ *a)
+{
+return op_smlald(s, a, true, false);
+}
+
+static bool trans_SMLSLD(DisasContext *s, arg_ *a)
+{
+return op_smlald(s, a, false, true);
+}
+
+static bool trans_SMLSLDX(DisasContext *s, arg_ *a)
+{
+return op_smlald(s, a, true, true);
+}
+
+static bool op_smmla(DisasContext *s, arg_ *a, bool round, bool sub)
+{
+TCGv_i32 t1, t2;
+
+if (s->thumb
+? !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)
+: !ENABLE_ARCH_6) {
+return false;
+}
+
+t1 = load_reg(s, a->rn);
+t2 = load_reg(s, a->rm);
+tcg_gen_muls2_i32(t2, t1, t1, t2);
+
+if (a->ra != 15) {
+TCGv_i32 t3 = load_reg(s, a->ra);
+if (sub) {
+tcg_gen_sub_i32(t1, t1, t3);
+} else {
+tcg_gen_add_i32(t1, t1, t3);
+}
+tcg_temp_free_i32(t3);
+}
+if (round) {
+tcg_gen_shri_i32(t2, t2, 31);
+tcg_gen_add_i32(t1, t1, t2);
+}
+tcg_temp_free_i32(t2);
+store_reg(s, a->rd, t1);
+return true;
+}
+
+static bool trans_SMMLA(DisasContext *s, arg_ *a)
+{
+return op_smmla(s, a, false, false);
+}
+
+static bool trans_SMMLAR(DisasContext *s, arg_ *a)
+{
+return op_smmla(s, a, true, false);
+}
+
+static bool trans_SMMLS(DisasContext *s, arg_ *a)
+{
+return op_smmla(s, a, false, true);
+}
+
+static bool trans_SMMLSR(DisasContext *s, arg_ *a)
+{
+return op_smmla(s, a, true, true);
+}
+
+static bool op_div(DisasContext *s, arg_rrr *a, bool u)
+{
+TCGv_i32 t1, t2;
+
+if (s->thumb
+? !dc_isar_feature(thumb_div, s)
+: !dc_isar_feature(arm_div, s)) {
+return false;
+}
+
+t1 = load_reg(s, a->rn);
+t2 = load_reg(s, a->rm);
+if (u) {
+gen_helper_udiv(t1, t1, t2);
+} else {
+gen_helper_sdiv(t1, t1, t2);
+}
+tcg_temp_free_i32(t2);
+store_reg(s, a->rd, t1);
+return true;
+}
+
+static bool trans_SDIV(DisasContext *s, arg_rrr *a)
+{
+return op_div(s, a, false);
+}
+
+static bool trans_UDIV(DisasContext *s, arg_rrr *a)
+{
+return op_div(s, a, true);
+}
+
 /*
  * Legacy decoder.
  */
 
 static void 

[Qemu-devel] [PATCH v2 33/68] target/arm: Convert RFE and SRS

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c   | 150 ++-
 target/arm/a32-uncond.decode |   8 ++
 target/arm/t32.decode|  12 +++
 3 files changed, 81 insertions(+), 89 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index b6d8b7be8c..e268c5168d 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9980,16 +9980,71 @@ static bool trans_SVC(DisasContext *s, arg_SVC *a)
 return true;
 }
 
+/*
+ * Unconditional system instructions
+ */
+
+static bool trans_RFE(DisasContext *s, arg_RFE *a)
+{
+int32_t offset;
+TCGv_i32 addr, t1, t2;
+
+if (IS_USER(s) || !ENABLE_ARCH_6) {
+return false;
+}
+
+addr = load_reg(s, a->rn);
+
+switch (a->pu) {
+case 0: offset = -4; break; /* DA */
+case 1: offset =  0; break; /* IA */
+case 2: offset = -8; break; /* DB */
+case 3: offset =  4; break; /* IB */
+default:
+g_assert_not_reached();
+}
+tcg_gen_addi_i32(addr, addr, offset);
+
+/* Load PC into t1 and CPSR into t2.  */
+t1 = tcg_temp_new_i32();
+gen_aa32_ld32u(s, t1, addr, get_mem_index(s));
+tcg_gen_addi_i32(addr, addr, 4);
+t2 = tcg_temp_new_i32();
+gen_aa32_ld32u(s, t2, addr, get_mem_index(s));
+
+if (a->w) {
+/* Base writeback.  */
+switch (a->pu) {
+case 0: offset = -8; break;
+case 1: offset =  4; break;
+case 2: offset = -4; break;
+case 3: offset =  0; break;
+}
+tcg_gen_addi_i32(addr, addr, offset);
+store_reg(s, a->rn, addr);
+} else {
+tcg_temp_free_i32(addr);
+}
+gen_rfe(s, t1, t2);
+return true;
+}
+
+static bool trans_SRS(DisasContext *s, arg_SRS *a)
+{
+if (!ENABLE_ARCH_6) {
+return false;
+}
+gen_srs(s, a->mode, a->pu, a->w);
+return true;
+}
+
 /*
  * Legacy decoder.
  */
 
 static void disas_arm_insn(DisasContext *s, unsigned int insn)
 {
-unsigned int cond, op1, i, rn;
-TCGv_i32 tmp;
-TCGv_i32 tmp2;
-TCGv_i32 addr;
+unsigned int cond, op1;
 
 /* M variants do not implement ARM mode; this must raise the INVSTATE
  * UsageFault exception.
@@ -10108,52 +10163,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 default:
 goto illegal_op;
 }
-} else if ((insn & 0x0e5fffe0) == 0x084d0500) {
-/* srs */
-ARCH(6);
-gen_srs(s, (insn & 0x1f), (insn >> 23) & 3, insn & (1 << 21));
-return;
-} else if ((insn & 0x0e50ffe0) == 0x08100a00) {
-/* rfe */
-int32_t offset;
-if (IS_USER(s))
-goto illegal_op;
-ARCH(6);
-rn = (insn >> 16) & 0xf;
-addr = load_reg(s, rn);
-i = (insn >> 23) & 3;
-switch (i) {
-case 0: offset = -4; break; /* DA */
-case 1: offset = 0; break; /* IA */
-case 2: offset = -8; break; /* DB */
-case 3: offset = 4; break; /* IB */
-default: abort();
-}
-if (offset)
-tcg_gen_addi_i32(addr, addr, offset);
-/* Load PC into tmp and CPSR into tmp2.  */
-tmp = tcg_temp_new_i32();
-gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
-tcg_gen_addi_i32(addr, addr, 4);
-tmp2 = tcg_temp_new_i32();
-gen_aa32_ld32u(s, tmp2, addr, get_mem_index(s));
-if (insn & (1 << 21)) {
-/* Base writeback.  */
-switch (i) {
-case 0: offset = -8; break;
-case 1: offset = 4; break;
-case 2: offset = -4; break;
-case 3: offset = 0; break;
-default: abort();
-}
-if (offset)
-tcg_gen_addi_i32(addr, addr, offset);
-store_reg(s, rn, addr);
-} else {
-tcg_temp_free_i32(addr);
-}
-gen_rfe(s, tmp, tmp2);
-return;
 } else if ((insn & 0x0e000f00) == 0x0c000100) {
 if (arm_dc_feature(s, ARM_FEATURE_IWMMXT)) {
 /* iWMMXt register transfer.  */
@@ -10316,7 +10325,6 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 uint32_t imm, offset;
 uint32_t rd, rn, rm, rs;
 TCGv_i32 tmp;
-TCGv_i32 tmp2;
 TCGv_i32 addr;
 int op;
 
@@ -10460,44 +10468,8 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 goto illegal_op;
 }
 } else {
-/* Load/store multiple, RFE, SRS.  */
-if (((insn >> 23) & 1) == ((insn >> 24) & 1)) {
-/* RFE, SRS: not available in user mode or on M profile */
-if (IS_USER(s) || arm_dc_feature(s, ARM_FEATURE_M)) {
-goto illegal_op;
-}
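
The two offset tables in trans_RFE above are easier to check when the three steps are composed: adjust the base, load PC then CPSR from consecutive words, then apply the writeback adjustment. The following is only a host-side sketch of that arithmetic, not code from the patch; the struct and function names (rfe_addrs, rfe_model) are made up for illustration, and only the offset values are taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the addresses RFE touches for a given base. */
struct rfe_addrs {
    uint32_t pc_addr;    /* word loaded as the new PC */
    uint32_t cpsr_addr;  /* word loaded as the new CPSR */
    uint32_t writeback;  /* value written back to Rn when W is set */
};

/* Model the address arithmetic of trans_RFE for pu = 0..3 (DA, IA, DB, IB). */
static struct rfe_addrs rfe_model(uint32_t base, unsigned pu)
{
    /* Offsets copied from the two switch statements in the patch. */
    static const int32_t pre[4]  = { -4, 0, -8, 4 };
    static const int32_t post[4] = { -8, 4, -4, 0 };
    struct rfe_addrs r;

    assert(pu < 4);
    r.pc_addr = base + (uint32_t)pre[pu];
    r.cpsr_addr = r.pc_addr + 4;                     /* second load is the next word */
    r.writeback = r.cpsr_addr + (uint32_t)post[pu];  /* applied after both loads */
    return r;
}
```

Under this model the net writeback is base - 8 for the decrementing forms (DA, DB) and base + 8 for the incrementing forms (IA, IB), which is what one would expect for a two-word transfer.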

[Qemu-devel] [PATCH v2 40/68] target/arm: Convert SG

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 51 --
 target/arm/t32.decode  |  5 -
 2 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 7c05e7006e..9a8864e8ff 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8426,6 +8426,34 @@ static bool trans_SMC(DisasContext *s, arg_SMC *a)
 return true;
 }
 
+static bool trans_SG(DisasContext *s, arg_SG *a)
+{
+if (!arm_dc_feature(s, ARM_FEATURE_M) ||
+!arm_dc_feature(s, ARM_FEATURE_V8)) {
+return false;
+}
+/*
+ * SG (v8M only)
+ * The bulk of the behaviour for this instruction is implemented
+ * in v7m_handle_execute_nsc(), which deals with the insn when
+ * it is executed by a CPU in non-secure state from memory
+ * which is Secure & NonSecure-Callable.
+ * Here we only need to handle the remaining cases:
+ *  * in NS memory (including the "security extension not
+ *implemented" case) : NOP
+ *  * in S memory but CPU already secure (clear IT bits)
+ * We know that the attribute for the memory this insn is
+ * in must match the current CPU state, because otherwise
+ * get_phys_addr_pmsav8 would have generated an exception.
+ */
+if (s->v8m_secure) {
+/* Like the IT insn, we don't need to generate any code */
+s->condexec_cond = 0;
+s->condexec_mask = 0;
+}
+return true;
+}
+
 /*
  * Load/store register index
  */
@@ -10437,28 +10465,7 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
  * - load/store doubleword, load/store exclusive, ldacq/strel,
  *   table branch, TT.
  */
-if (insn == 0xe97fe97f && arm_dc_feature(s, ARM_FEATURE_M) &&
-arm_dc_feature(s, ARM_FEATURE_V8)) {
-/* 0b1110_1001_0111__1110_1001_0111_111
- *  - SG (v8M only)
- * The bulk of the behaviour for this instruction is implemented
- * in v7m_handle_execute_nsc(), which deals with the insn when
- * it is executed by a CPU in non-secure state from memory
- * which is Secure & NonSecure-Callable.
- * Here we only need to handle the remaining cases:
- *  * in NS memory (including the "security extension not
- *implemented" case) : NOP
- *  * in S memory but CPU already secure (clear IT bits)
- * We know that the attribute for the memory this insn is
- * in must match the current CPU state, because otherwise
- * get_phys_addr_pmsav8 would have generated an exception.
- */
-if (s->v8m_secure) {
-/* Like the IT insn, we don't need to generate any code */
-s->condexec_cond = 0;
-s->condexec_mask = 0;
-}
-} else if (insn & 0x0120) {
+if (insn & 0x0120) {
 /* load/store dual, in decodetree */
 goto illegal_op;
 } else if ((insn & (1 << 23)) == 0) {
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index 0cc0808c05..ce46650446 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -485,7 +485,10 @@ STRD_ri_t32  1110 1001 .100    @ldstd_ri8 w=0 p=1
 LDRD_ri_t32  1110 1001 .101    @ldstd_ri8 w=0 p=1
 
 STRD_ri_t32  1110 1001 .110    @ldstd_ri8 w=1 p=1
-LDRD_ri_t32  1110 1001 .111    @ldstd_ri8 w=1 p=1
+{
+  SG 1110 1001 0111  1110 1001 0111
+  LDRD_ri_t321110 1001 .111    @ldstd_ri8 w=1 p=1
+}
 
 # Load/Store Exclusive, Load-Acquire/Store-Release, and Table Branch
 
-- 
2.17.1




[Qemu-devel] [PATCH v2 32/68] target/arm: Convert SVC

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 19 +--
 target/arm/a32.decode  |  4 
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 6b7b3df685..b6d8b7be8c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9968,6 +9968,18 @@ static bool trans_BLX_i(DisasContext *s, arg_BLX_i *a)
 return true;
 }
 
+/*
+ * Supervisor call
+ */
+
+static bool trans_SVC(DisasContext *s, arg_SVC *a)
+{
+gen_set_pc_im(s, s->base.pc_next);
+s->svc_imm = a->imm;
+s->base.is_jmp = DISAS_SWI;
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -10235,6 +10247,7 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 case 0x09:
 case 0xa:
 case 0xb:
+case 0xf:
 /* All done in decodetree.  Reach here for illegal ops.  */
 goto illegal_op;
 case 0xc:
@@ -10250,12 +10263,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 goto illegal_op;
 }
 break;
-case 0xf:
-/* swi */
-gen_set_pc_im(s, s->base.pc_next);
-s->svc_imm = extract32(insn, 0, 24);
-s->base.is_jmp = DISAS_SWI;
-break;
 default:
 illegal_op:
 unallocated_encoding(s);
diff --git a/target/arm/a32.decode b/target/arm/a32.decode
index 62c6f8562e..0bd952c069 100644
--- a/target/arm/a32.decode
+++ b/target/arm/a32.decode
@@ -528,3 +528,7 @@ LDM_a32   100 b:1 i:1 u:1 w:1 1 rn:4 list:16   _block
 
 B 1010    @branch
 BL    1011    @branch
+
+# Supervisor call
+
+SVC    imm:24 
-- 
2.17.1




[Qemu-devel] [PATCH v2 19/68] target/arm: Convert T32 ADDW/SUBW

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 24 +---
 target/arm/a32.decode  |  1 +
 target/arm/t32.decode  | 19 +++
 3 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index cb6296dc12..0e51289928 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7626,6 +7626,11 @@ static void arm_skip_unless(DisasContext *s, uint32_t cond)
  * Constant expanders for the decoders.
  */
 
+static int negate(DisasContext *s, int x)
+{
+return -x;
+}
+
 static int times_2(DisasContext *s, int x)
 {
 return x * 2;
@@ -7975,6 +7980,12 @@ static bool trans_ORN_rri(DisasContext *s, arg_s_rri_rot *a)
 #undef DO_ANY2
 #undef DO_CMP2
 
+static bool trans_ADR(DisasContext *s, arg_ri *a)
+{
+store_reg_bx(s, a->rd, add_reg_for_lit(s, 15, a->imm));
+return true;
+}
+
 /*
  * Multiply and multiply accumulate
  */
@@ -10670,17 +10681,8 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 }
 store_reg(s, rd, tmp);
 } else {
-/* Add/sub 12-bit immediate.  */
-if (insn & (1 << 23)) {
-imm = -imm;
-}
-tmp = add_reg_for_lit(s, rn, imm);
-if (rn == 13 && rd == 13) {
-/* ADD SP, SP, imm or SUB SP, SP, imm */
-store_sp_checked(s, tmp);
-} else {
-store_reg(s, rd, tmp);
-}
+/* Add/sub 12-bit immediate, in decodetree */
+goto illegal_op;
 }
 }
 } else {
diff --git a/target/arm/a32.decode b/target/arm/a32.decode
index c7f156be6d..aac991664d 100644
--- a/target/arm/a32.decode
+++ b/target/arm/a32.decode
@@ -30,6 +30,7 @@
 rd rn rm ra
  rd rn rm
   rd rm
+  rd imm
rm
imm
 _reg rn r mask
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index 5116c6165a..be4e5f087c 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -27,6 +27,7 @@
 !extern rd rn rm ra
  !extern rd rn rm
   !extern rd rm
+  !extern rd imm
!extern rm
!extern imm
 _reg !extern rn r mask
@@ -121,6 +122,24 @@ SBC_rri   0.0 1011 .  0 ...      @s_rri_rot
 }
 RSB_rri   0.0 1110 .  0 ...   @s_rri_rot
 
+# Data processing (plain binary immediate)
+
+%imm12_26_12_0   26:1 12:3 0:8
+%neg12_26_12_0   26:1 12:3 0:8 !function=negate
+@s0_rri_12    ...  . rn:4 . ... rd:4  \
+ _rri_rot imm=%imm12_26_12_0 rot=0 s=0
+
+{
+  ADR 0.1  0  0 ... rd:4  \
+  imm=%imm12_26_12_0
+  ADD_rri 0.1  0  0 ...   @s0_rri_12
+}
+{
+  ADR 0.1 0101 0  0 ... rd:4  \
+  imm=%neg12_26_12_0
+  SUB_rri 0.1 0101 0  0 ...   @s0_rri_12
+}
+
 # Multiply and multiply accumulate
 
 @s0_rnadm   rn:4 ra:4 rd:4  rm:4  _ s=0
-- 
2.17.1




[Qemu-devel] [PATCH v2 39/68] target/arm: Convert Table Branch

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 57 +-
 target/arm/t32.decode  |  8 +-
 2 files changed, 41 insertions(+), 24 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 9ec6b25c03..7c05e7006e 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9968,6 +9968,37 @@ static bool trans_BLX_i(DisasContext *s, arg_BLX_i *a)
 return true;
 }
 
+static bool op_tbranch(DisasContext *s, arg_tbranch *a, bool half)
+{
+TCGv_i32 addr, tmp;
+
+tmp = load_reg(s, a->rm);
+if (half) {
+tcg_gen_add_i32(tmp, tmp, tmp);
+}
+addr = load_reg(s, a->rn);
+tcg_gen_add_i32(addr, addr, tmp);
+
+gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
+half ? MO_UW | s->be_data : MO_UB);
+tcg_temp_free_i32(addr);
+
+tcg_gen_add_i32(tmp, tmp, tmp);
+tcg_gen_addi_i32(tmp, tmp, read_pc(s));
+store_reg(s, 15, tmp);
+return true;
+}
+
+static bool trans_TBB(DisasContext *s, arg_tbranch *a)
+{
+return op_tbranch(s, a, false);
+}
+
+static bool trans_TBH(DisasContext *s, arg_tbranch *a)
+{
+return op_tbranch(s, a, true);
+}
+
 /*
  * Supervisor call
  */
@@ -10350,9 +10381,7 @@ static bool thumb_insn_is_16bit(DisasContext *s, uint32_t pc, uint32_t insn)
 /* Translate a 32-bit thumb instruction. */
 static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 {
-uint32_t rd, rn, rm, rs;
-TCGv_i32 tmp;
-TCGv_i32 addr;
+uint32_t rd, rn, rs;
 int op;
 
 /*
@@ -10398,7 +10427,6 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 rn = (insn >> 16) & 0xf;
 rs = (insn >> 12) & 0xf;
 rd = (insn >> 8) & 0xf;
-rm = insn & 0xf;
 switch ((insn >> 25) & 0xf) {
 case 0: case 1: case 2: case 3:
 /* 16-bit instructions.  Should never happen.  */
@@ -10471,25 +10499,8 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 /* Load/store exclusive, in decodetree */
 goto illegal_op;
 } else if ((insn & (7 << 5)) == 0) {
-/* Table Branch.  */
-addr = load_reg(s, rn);
-tmp = load_reg(s, rm);
-tcg_gen_add_i32(addr, addr, tmp);
-if (insn & (1 << 4)) {
-/* tbh */
-tcg_gen_add_i32(addr, addr, tmp);
-tcg_temp_free_i32(tmp);
-tmp = tcg_temp_new_i32();
-gen_aa32_ld16u(s, tmp, addr, get_mem_index(s));
-} else { /* tbb */
-tcg_temp_free_i32(tmp);
-tmp = tcg_temp_new_i32();
-gen_aa32_ld8u(s, tmp, addr, get_mem_index(s));
-}
-tcg_temp_free_i32(addr);
-tcg_gen_shli_i32(tmp, tmp, 1);
-tcg_gen_addi_i32(tmp, tmp, read_pc(s));
-store_reg(s, 15, tmp);
+/* Table Branch, in decodetree */
+goto illegal_op;
 } else {
 /* Load/store exclusive, load-acq/store-rel, in decodetree */
 goto illegal_op;
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index 354ad77fe6..0cc0808c05 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -487,7 +487,7 @@ LDRD_ri_t32  1110 1001 .101    @ldstd_ri8 w=0 p=1
 STRD_ri_t32  1110 1001 .110    @ldstd_ri8 w=1 p=1
 LDRD_ri_t32  1110 1001 .111    @ldstd_ri8 w=1 p=1
 
-# Load/Store Exclusive and Load-Acquire/Store-Release
+# Load/Store Exclusive, Load-Acquire/Store-Release, and Table Branch
 
 @strex_i    rn:4 rt:4 rd:4   \
   rt2=15 imm=%imm8x4
@@ -531,6 +531,12 @@ LDA  1110 1000 1101    1010      @ldrex_0
 LDAB 1110 1000 1101    1000   @ldrex_0
 LDAH 1110 1000 1101    1001   @ldrex_0
 
+ rn rm
+@tbranch    rn:4    rm:4  
+
+TBB  1110 1000 1101       @tbranch
+TBH  1110 1000 1101    0001   @tbranch
+
 # Parallel addition and subtraction
 
 SADD8 1010 1000       @rndm
-- 
2.17.1




[Qemu-devel] [PATCH v2 24/68] target/arm: Convert Packing, unpacking, saturation, and reversal

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 541 ++---
 target/arm/a32.decode  |  32 +++
 target/arm/t32.decode  |  37 ++-
 3 files changed, 300 insertions(+), 310 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index cf03527afc..d31e89f308 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -355,7 +355,7 @@ static void gen_smul_dual(TCGv_i32 a, TCGv_i32 b)
 }
 
 /* Byteswap each halfword.  */
-static void gen_rev16(TCGv_i32 var)
+static void gen_rev16(TCGv_i32 dest, TCGv_i32 var)
 {
 TCGv_i32 tmp = tcg_temp_new_i32();
 TCGv_i32 mask = tcg_const_i32(0x00ff00ff);
@@ -363,17 +363,17 @@ static void gen_rev16(TCGv_i32 var)
 tcg_gen_and_i32(tmp, tmp, mask);
 tcg_gen_and_i32(var, var, mask);
 tcg_gen_shli_i32(var, var, 8);
-tcg_gen_or_i32(var, var, tmp);
+tcg_gen_or_i32(dest, var, tmp);
 tcg_temp_free_i32(mask);
 tcg_temp_free_i32(tmp);
 }
 
 /* Byteswap low halfword and sign extend.  */
-static void gen_revsh(TCGv_i32 var)
+static void gen_revsh(TCGv_i32 dest, TCGv_i32 var)
 {
 tcg_gen_ext16u_i32(var, var);
 tcg_gen_bswap16_i32(var, var);
-tcg_gen_ext16s_i32(var, var);
+tcg_gen_ext16s_i32(dest, var);
 }
 
 /* 32x32->64 multiply.  Marks inputs as dead.  */
@@ -426,7 +426,7 @@ static void gen_swap_half(TCGv_i32 var)
 t0 = (t0 + t1) ^ tmp;
  */
 
-static void gen_add16(TCGv_i32 t0, TCGv_i32 t1)
+static void gen_add16(TCGv_i32 dest, TCGv_i32 t0, TCGv_i32 t1)
 {
 TCGv_i32 tmp = tcg_temp_new_i32();
 tcg_gen_xor_i32(tmp, t0, t1);
@@ -434,9 +434,8 @@ static void gen_add16(TCGv_i32 t0, TCGv_i32 t1)
 tcg_gen_andi_i32(t0, t0, ~0x8000);
 tcg_gen_andi_i32(t1, t1, ~0x8000);
 tcg_gen_add_i32(t0, t0, t1);
-tcg_gen_xor_i32(t0, t0, tmp);
+tcg_gen_xor_i32(dest, t0, tmp);
 tcg_temp_free_i32(tmp);
-tcg_temp_free_i32(t1);
 }
 
 /* Set N and Z flags from var.  */
@@ -6340,7 +6339,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
 }
 break;
 case NEON_2RM_VREV16:
-gen_rev16(tmp);
+gen_rev16(tmp, tmp);
 break;
 case NEON_2RM_VCLS:
 switch (size) {
@@ -9231,13 +9230,225 @@ DO_PAR_ADDSUB(UHSUB8, gen_helper_uhsub8)
 #undef DO_PAR_ADDSUB
 #undef DO_PAR_ADDSUB_GE
 
+/*
+ * Packing, unpacking, saturation, and reversal
+ */
+
+static bool trans_PKH(DisasContext *s, arg_PKH *a)
+{
+TCGv_i32 tn, tm;
+int shift = a->imm;
+
+if (s->thumb
+? !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)
+: !ENABLE_ARCH_6) {
+return false;
+}
+
+tn = load_reg(s, a->rn);
+tm = load_reg(s, a->rm);
+if (a->tb) {
+/* PKHTB */
+if (shift == 0) {
+shift = 31;
+}
+tcg_gen_sari_i32(tm, tm, shift);
+tcg_gen_deposit_i32(tn, tn, tm, 0, 16);
+} else {
+/* PKHBT */
+tcg_gen_shli_i32(tm, tm, shift);
+tcg_gen_deposit_i32(tn, tm, tn, 0, 16);
+}
+tcg_temp_free_i32(tm);
+store_reg(s, a->rd, tn);
+return true;
+}
+
+static bool op_sat(DisasContext *s, arg_sat *a,
+   void (*gen)(TCGv_i32, TCGv_env, TCGv_i32, TCGv_i32))
+{
+TCGv_i32 tmp, satimm;
+int shift = a->imm;
+
+if (!ENABLE_ARCH_6) {
+return false;
+}
+
+tmp = load_reg(s, a->rn);
+if (a->sh) {
+tcg_gen_sari_i32(tmp, tmp, shift ? shift : 31);
+} else {
+tcg_gen_shli_i32(tmp, tmp, shift);
+}
+
+satimm = tcg_const_i32(a->satimm);
+gen(tmp, cpu_env, tmp, satimm);
+tcg_temp_free_i32(satimm);
+
+store_reg(s, a->rd, tmp);
+return true;
+}
+
+static bool trans_SSAT(DisasContext *s, arg_sat *a)
+{
+return op_sat(s, a, gen_helper_ssat);
+}
+
+static bool trans_USAT(DisasContext *s, arg_sat *a)
+{
+return op_sat(s, a, gen_helper_usat);
+}
+
+static bool trans_SSAT16(DisasContext *s, arg_sat *a)
+{
+if (s->thumb && !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)) {
+return false;
+}
+return op_sat(s, a, gen_helper_ssat16);
+}
+
+static bool trans_USAT16(DisasContext *s, arg_sat *a)
+{
+if (s->thumb && !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)) {
+return false;
+}
+return op_sat(s, a, gen_helper_usat16);
+}
+
+static bool op_xta(DisasContext *s, arg_rrr_rot *a,
+   void (*gen_extract)(TCGv_i32, TCGv_i32),
+   void (*gen_add)(TCGv_i32, TCGv_i32, TCGv_i32))
+{
+TCGv_i32 tmp;
+
+if (!ENABLE_ARCH_6) {
+return false;
+}
+
+tmp = load_reg(s, a->rm);
+/*
+ * TODO: In many cases we could do a shift instead of a rotate.
+ * Combined with a simple extend, that becomes an extract.
+ */
+tcg_gen_rotri_i32(tmp, tmp, a->rot * 8);
+gen_extract(tmp, tmp);
+
+if (a->rn != 15) {
+ 

[Qemu-devel] [PATCH v2 31/68] target/arm: Convert B, BL, BLX (immediate)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c   | 133 +++
 target/arm/a32-uncond.decode |   8 +++
 target/arm/a32.decode|   8 +++
 target/arm/t32.decode|  81 -
 4 files changed, 123 insertions(+), 107 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 09636aab4e..6b7b3df685 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7523,6 +7523,14 @@ static int t32_expandimm_imm(DisasContext *s, int x)
 return imm;
 }
 
+static int t32_branch24(DisasContext *s, int x)
+{
+/* Convert J1:J2 at x[22:21] to I2:I1, which involves I=J^~S.  */
+x ^= !(x < 0) * (3 << 21);
+/* Append the final zero.  */
+return x << 1;
+}
+
 /*
  * Include the generated decoders.
  */
@@ -9917,13 +9925,56 @@ static bool trans_LDM_t32(DisasContext *s, arg_ldst_block *a)
 return do_ldm(s, a, 2);
 }
 
+/*
+ * Branch, branch with link
+ */
+
+static bool trans_B(DisasContext *s, arg_i *a)
+{
+gen_jmp(s, read_pc(s) + a->imm);
+return true;
+}
+
+static bool trans_B_cond_thumb(DisasContext *s, arg_ci *a)
+{
+/* This has cond from encoding, required to be outside IT block.  */
+if (a->cond >= 0xe) {
+return false;
+}
+if (s->condexec_mask) {
+unallocated_encoding(s);
+return true;
+}
+arm_skip_unless(s, a->cond);
+gen_jmp(s, read_pc(s) + a->imm);
+return true;
+}
+
+static bool trans_BL(DisasContext *s, arg_i *a)
+{
+tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | s->thumb);
+gen_jmp(s, read_pc(s) + a->imm);
+return true;
+}
+
+static bool trans_BLX_i(DisasContext *s, arg_BLX_i *a)
+{
+/* For A32, ARCH(5) is checked near the start of the uncond block. */
+if (s->thumb && (a->imm & 2)) {
+return false;
+}
+tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | s->thumb);
+gen_bx_im(s, (read_pc(s) & ~3) + a->imm + !s->thumb);
+return true;
+}
+
 /*
  * Legacy decoder.
  */
 
 static void disas_arm_insn(DisasContext *s, unsigned int insn)
 {
-unsigned int cond, val, op1, i, rn;
+unsigned int cond, op1, i, rn;
 TCGv_i32 tmp;
 TCGv_i32 tmp2;
 TCGv_i32 addr;
@@ -10091,21 +10142,6 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 }
 gen_rfe(s, tmp, tmp2);
 return;
-} else if ((insn & 0x0e00) == 0x0a00) {
-/* branch link and change to thumb (blx ) */
-int32_t offset;
-
-tmp = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp, s->base.pc_next);
-store_reg(s, 14, tmp);
-/* Sign-extend the 24-bit offset */
-offset = (((int32_t)insn) << 8) >> 8;
-val = read_pc(s);
-/* offset * 4 + bit24 * 2 + (thumb bit) */
-val += (offset << 2) | ((insn >> 23) & 2) | 1;
-/* protected by ARCH(5); above, near the start of uncond block */
-gen_bx_im(s, val);
-return;
 } else if ((insn & 0x0e000f00) == 0x0c000100) {
 if (arm_dc_feature(s, ARM_FEATURE_IWMMXT)) {
 /* iWMMXt register transfer.  */
@@ -10197,23 +10233,10 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 case 0x7:
 case 0x08:
 case 0x09:
-/* All done in decodetree.  Reach here for illegal ops.  */
-goto illegal_op;
 case 0xa:
 case 0xb:
-{
-int32_t offset;
-
-/* branch (and link) */
-if (insn & (1 << 24)) {
-tmp = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp, s->base.pc_next);
-store_reg(s, 14, tmp);
-}
-offset = sextract32(insn << 2, 0, 26);
-gen_jmp(s, read_pc(s) + offset);
-}
-break;
+/* All done in decodetree.  Reach here for illegal ops.  */
+goto illegal_op;
 case 0xc:
 case 0xd:
 case 0xe:
@@ -10580,32 +10603,8 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 if (insn & (1 << 15)) {
 /* Branches, misc control.  */
 if (insn & 0x5000) {
-/* Unconditional branch.  */
-/* signextend(hw1[10:0]) -> offset[:12].  */
-offset = ((int32_t)insn << 5) >> 9 & ~(int32_t)0xfff;
-/* hw1[10:0] -> offset[11:1].  */
-offset |= (insn & 0x7ff) << 1;
-/* (~hw2[13, 11] ^ offset[24]) -> offset[23,22]
-   offset[24:22] already have the same value because of the
-   sign extension above.  */
-offset ^= ((~insn) & (1 << 13)) << 10;
-offset ^= ((~insn) & (1 << 11)) << 11;
-
-if (insn & (1 << 14)) {
-/* Branch and link.  */
-

[Qemu-devel] [PATCH v2 17/68] target/arm: Convert ERET

2019-08-19 Thread Richard Henderson
Pass the T5 encoding of SUBS PC, LR, #IMM through the normal SUBS path
to make it clear exactly what's happening -- we hit ALUExceptionReturn
along that path.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 58 ++
 target/arm/a32.decode  |  2 ++
 target/arm/t32.decode  |  8 ++
 3 files changed, 29 insertions(+), 39 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index f0fa5253b6..cb7b35489f 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8474,6 +8474,23 @@ static bool trans_CLZ(DisasContext *s, arg_CLZ *a)
 return true;
 }
 
+static bool trans_ERET(DisasContext *s, arg_ERET *a)
+{
+TCGv_i32 tmp;
+
+if (IS_USER(s) || !arm_dc_feature(s, ARM_FEATURE_V7VE)) {
+return false;
+}
+if (s->current_el == 2) {
+/* ERET from Hyp uses ELR_Hyp, not LR */
+tmp = load_cpu_field(elr_el[2]);
+} else {
+tmp = load_reg(s, 14);
+}
+gen_exception_return(s, tmp);
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -8768,29 +8785,10 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 case 0x4: /* crc32 */
 /* All done in decodetree.  Illegal ops reach here.  */
 goto illegal_op;
-case 0x5:
-/* Saturating addition and subtraction.  */
+case 0x5: /* Saturating addition and subtraction.  */
+case 0x6: /* ERET */
 /* All done in decodetree.  Reach here for illegal ops.  */
 goto illegal_op;
-case 0x6: /* ERET */
-if (op1 != 3) {
-goto illegal_op;
-}
-if (!arm_dc_feature(s, ARM_FEATURE_V7VE)) {
-goto illegal_op;
-}
-if ((insn & 0x000fff0f) != 0x000e) {
-/* UNPREDICTABLE; we choose to UNDEF */
-goto illegal_op;
-}
-
-if (s->current_el == 2) {
-tmp = load_cpu_field(elr_el[2]);
-} else {
-tmp = load_reg(s, 14);
-}
-gen_exception_return(s, tmp);
-break;
 case 7:
 {
 int imm16 = extract32(insn, 0, 4) | (extract32(insn, 8, 12) << 4);
@@ -10586,24 +10584,6 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 case 4: /* bxj, in decodetree */
 goto illegal_op;
 case 5: /* Exception return.  */
-if (IS_USER(s)) {
-goto illegal_op;
-}
-if (rn != 14 || rd != 15) {
-goto illegal_op;
-}
-if (s->current_el == 2) {
-/* ERET from Hyp uses ELR_Hyp, not LR */
-if (insn & 0xff) {
-goto illegal_op;
-}
-tmp = load_cpu_field(elr_el[2]);
-} else {
-tmp = load_reg(s, rn);
-tcg_gen_subi_i32(tmp, tmp, insn & 0xff);
-}
-gen_exception_return(s, tmp);
-break;
 case 6: /* MRS, in decodetree */
 case 7: /* MSR, in decodetree */
 goto illegal_op;
diff --git a/target/arm/a32.decode b/target/arm/a32.decode
index 182f2b6725..52a66dd1d5 100644
--- a/target/arm/a32.decode
+++ b/target/arm/a32.decode
@@ -211,3 +211,5 @@ BXJ   0001 0010    0010 
  @rm
 BLX_r 0001 0010    0011   @rm
 
 CLZ   0001 0110    0001   @rdm
+
+ERET  0001 0110    0110 1110
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index 67724efe4b..6236d28b99 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -218,4 +218,12 @@ CLZ   1010 1011    1000    
   @rdm
 MSR_v7m   0011 100 0   rn:4 1000 mask:2 00 sysm:8
   }
   BXJ 0011 1100 rm:4 1000     
+  {
+# At v6T2, this is the T5 encoding of SUBS PC, LR, #IMM, and works as for
+# every other encoding of SUBS.  With v7VE, IMM=0 is redefined as ERET.
+# The distinction between the two only matters for Hyp mode.
+ERET  0011 1101 1110 1000   
+SUB_rri   0011 1101 1110 1000  imm:8 \
+ _rri_rot rot=0 s=1 rd=15 rn=14
+  }
 }
-- 
2.17.1




[Qemu-devel] [PATCH v2 30/68] target/arm: Diagnose base == pc for LDM/STM

2019-08-19 Thread Richard Henderson
We have been using store_reg and not store_reg_for_load when writing
back a loaded value into the base register.  At first glance this is
incorrect when base == pc; however, that case is UNPREDICTABLE.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 1792bb7abd..09636aab4e 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9752,6 +9752,10 @@ static bool op_stm(DisasContext *s, arg_ldst_block *a, int min_n)
 if (n < min_n) {
 return false;
 }
+/* Using PC as the base register is UNPREDICTABLE.  */
+if (a->rn == 15) {
+return false;
+}
 
 addr = op_addr_block_pre(s, a, n);
 mem_idx = get_mem_index(s);
@@ -9828,6 +9832,10 @@ static bool do_ldm(DisasContext *s, arg_ldst_block *a, int min_n)
 if (n < min_n) {
 return false;
 }
+/* Using PC as the base register is UNPREDICTABLE.  */
+if (a->rn == 15) {
+return false;
+}
 
 addr = op_addr_block_pre(s, a, n);
 mem_idx = get_mem_index(s);
@@ -9864,6 +9872,7 @@ static bool do_ldm(DisasContext *s, arg_ldst_block *a, int min_n)
 op_addr_block_post(s, a, addr, n);
 
 if (loaded_base) {
+/* Note that we reject base == pc above.  */
 store_reg(s, a->rn, loaded_var);
 }
 
-- 
2.17.1




[Qemu-devel] [PATCH v2 27/68] target/arm: Convert LDM, STM

2019-08-19 Thread Richard Henderson
This includes a minor bug fix to LDM (user), which requires
bit 21 to be 0, which means no writeback.
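
As a minimal sketch of the rule (not the code in this patch), the
check can be modelled on the raw A32 encoding, where bit 21 is W and
bit 15 of the register list selects the exception-return form.  The
function name ldm_user_ok is hypothetical and only for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the LDM-with-S-bit forms in A32.
 * Returns true if the encoding is acceptable with respect to writeback.
 */
static bool ldm_user_ok(uint32_t insn)
{
    bool w = (insn >> 21) & 1;          /* W: writeback requested */
    bool pc_in_list = (insn >> 15) & 1; /* PC in list: exception return */

    if (pc_in_list) {
        /* LDM (exception return): writeback is permitted. */
        return true;
    }
    /* LDM (user): bit 21 must be 0, i.e. no writeback. */
    return !w;
}
```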

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 426 ++---
 target/arm/a32.decode  |   6 +
 target/arm/t32.decode  |  10 +
 3 files changed, 241 insertions(+), 201 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 81eae286e8..4451adbb97 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9671,6 +9671,227 @@ static bool trans_UDIV(DisasContext *s, arg_rrr *a)
 return op_div(s, a, true);
 }
 
+/*
+ * Block data transfer
+ */
+
+static TCGv_i32 op_addr_block_pre(DisasContext *s, arg_ldst_block *a, int n)
+{
+TCGv_i32 addr = load_reg(s, a->rn);
+
+if (a->b) {
+if (a->i) {
+/* pre increment */
+tcg_gen_addi_i32(addr, addr, 4);
+} else {
+/* pre decrement */
+tcg_gen_addi_i32(addr, addr, -(n * 4));
+}
+} else if (!a->i && n != 1) {
+/* post decrement */
+tcg_gen_addi_i32(addr, addr, -((n - 1) * 4));
+}
+
+if (s->v8m_stackcheck && a->rn == 13 && a->w) {
+/*
+ * If the writeback is incrementing SP rather than
+ * decrementing it, and the initial SP is below the
+ * stack limit but the final written-back SP would
+ * be above, then we must not perform any memory
+ * accesses, but it is IMPDEF whether we generate
+ * an exception. We choose to do so in this case.
+ * At this point 'addr' is the lowest address, so
+ * either the original SP (if incrementing) or our
+ * final SP (if decrementing), so that's what we check.
+ */
+gen_helper_v8m_stackcheck(cpu_env, addr);
+}
+
+return addr;
+}
+
+static void op_addr_block_post(DisasContext *s, arg_ldst_block *a,
+   TCGv_i32 addr, int n)
+{
+if (a->w) {
+/* write back */
+if (!a->b) {
+if (a->i) {
+/* post increment */
+tcg_gen_addi_i32(addr, addr, 4);
+} else {
+/* post decrement */
+tcg_gen_addi_i32(addr, addr, -(n * 4));
+}
+} else if (!a->i && n != 1) {
+/* pre decrement */
+tcg_gen_addi_i32(addr, addr, -((n - 1) * 4));
+}
+store_reg(s, a->rn, addr);
+} else {
+tcg_temp_free_i32(addr);
+}
+}
+
+static bool op_stm(DisasContext *s, arg_ldst_block *a)
+{
+int i, j, n, list, mem_idx;
+bool user = a->u;
+TCGv_i32 addr, tmp, tmp2;
+
+if (user) {
+/* STM (user) */
+if (IS_USER(s)) {
+/* Only usable in supervisor mode.  */
+return false;
+}
+}
+
+list = a->list;
+n = ctpop16(list);
+/* TODO: test invalid n == 0 case */
+
+addr = op_addr_block_pre(s, a, n);
+mem_idx = get_mem_index(s);
+
+for (i = j = 0; i < 16; i++) {
+if (!(list & (1 << i))) {
+continue;
+}
+
+if (user && i != 15) {
+tmp = tcg_temp_new_i32();
+tmp2 = tcg_const_i32(i);
+gen_helper_get_user_reg(tmp, cpu_env, tmp2);
+tcg_temp_free_i32(tmp2);
+} else {
+tmp = load_reg(s, i);
+}
+gen_aa32_st32(s, tmp, addr, mem_idx);
+tcg_temp_free_i32(tmp);
+
+/* No need to add after the last transfer.  */
+if (++j != n) {
+tcg_gen_addi_i32(addr, addr, 4);
+}
+}
+
+op_addr_block_post(s, a, addr, n);
+return true;
+}
+
+static bool trans_STM(DisasContext *s, arg_ldst_block *a)
+{
+return op_stm(s, a);
+}
+
+static bool trans_STM_t32(DisasContext *s, arg_ldst_block *a)
+{
+/* Writeback register in register list is UNPREDICTABLE for T32.  */
+if (a->w && (a->list & (1 << a->rn))) {
+return false;
+}
+return op_stm(s, a);
+}
+
+static bool do_ldm(DisasContext *s, arg_ldst_block *a)
+{
+int i, j, n, list, mem_idx;
+bool loaded_base;
+bool user = a->u;
+bool exc_return = false;
+TCGv_i32 addr, tmp, tmp2, loaded_var;
+
+if (user) {
+/* LDM (user), LDM (exception return) */
+if (IS_USER(s)) {
+/* Only usable in supervisor mode.  */
+return false;
+}
+if (extract32(a->list, 15, 1)) {
+exc_return = true;
+user = false;
+} else {
+/* LDM (user) does not allow writeback.  */
+if (a->w) {
+return false;
+}
+}
+}
+
+list = a->list;
+n = ctpop16(list);
+/* TODO: test invalid n == 0 case */
+
+addr = op_addr_block_pre(s, a, n);
+mem_idx = get_mem_index(s);
+loaded_base = false;
+loaded_var = NULL;
+
+for (i = j = 0; i < 16; i++) {
+if (!(list & (1 << i))) {
+continue;
+}
+
+tmp = 

[Qemu-devel] [PATCH v2 16/68] target/arm: Convert CLZ

2019-08-19 Thread Richard Henderson
Document our choice about the T32 CONSTRAINED UNPREDICTABLE behaviour.
This matches the undocumented choice made by the legacy decoder.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 31 +++
 target/arm/a32.decode  |  4 
 target/arm/t32.decode  |  5 +
 3 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index ef26ed7b57..f0fa5253b6 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8461,6 +8461,19 @@ static bool trans_BLX_r(DisasContext *s, arg_BLX_r *a)
 return true;
 }
 
+static bool trans_CLZ(DisasContext *s, arg_CLZ *a)
+{
+TCGv_i32 tmp;
+
+if (!ENABLE_ARCH_5) {
+return false;
+}
+tmp = load_reg(s, a->rm);
+tcg_gen_clzi_i32(tmp, tmp, 32);
+store_reg(s, a->rd, tmp);
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -8749,18 +8762,7 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 /* MSR/MRS (banked/register) */
 /* All done in decodetree.  Illegal ops already signalled.  */
 g_assert_not_reached();
-case 0x1:
-if (op1 == 3) {
-/* clz */
-ARCH(5);
-rd = (insn >> 12) & 0xf;
-tmp = load_reg(s, rm);
-tcg_gen_clzi_i32(tmp, tmp, 32);
-store_reg(s, rd, tmp);
-} else {
-goto illegal_op;
-}
-break;
+case 0x1: /* bx, clz */
 case 0x2: /* bxj */
 case 0x3: /* blx */
 case 0x4: /* crc32 */
@@ -10201,13 +10203,13 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 case 0x08: /* rev */
 case 0x09: /* rev16 */
 case 0x0b: /* revsh */
-case 0x18: /* clz */
 break;
 case 0x10: /* sel */
 if (!arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)) {
 goto illegal_op;
 }
 break;
+case 0x18: /* clz, in decodetree */
 case 0x20: /* crc32/crc32c, in decodetree */
 case 0x21:
 case 0x22:
@@ -10240,9 +10242,6 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 tcg_temp_free_i32(tmp3);
 tcg_temp_free_i32(tmp2);
 break;
-case 0x18: /* clz */
-tcg_gen_clzi_i32(tmp, tmp, 32);
-break;
 default:
 g_assert_not_reached();
 }
diff --git a/target/arm/a32.decode b/target/arm/a32.decode
index 6cb9c16e2f..182f2b6725 100644
--- a/target/arm/a32.decode
+++ b/target/arm/a32.decode
@@ -29,6 +29,7 @@
 _  s rd rn rm ra
 rd rn rm ra
  rd rn rm
+  rd rm
rm
 _reg rn r mask
 _reg rd r
@@ -197,6 +198,7 @@ CRC32CW   0001 0100   0010 0100 
  @rndm
 %sysm8:1 16:4
 
 @rm         rm:4  
+@rdm     rd:4   rm:4  
 
 MRS_bank  0001 0 r:1 00  rd:4 001.    _bank %sysm
 MSR_bank  0001 0 r:1 10   001.  rn:4  _bank %sysm
@@ -207,3 +209,5 @@ MSR_reg   0001 0 r:1 10 mask:4    
rn:4  _reg
 BX    0001 0010    0001   @rm
 BXJ   0001 0010    0010   @rm
 BLX_r 0001 0010    0011   @rm
+
+CLZ   0001 0110    0001   @rdm
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index 337706ebbe..67724efe4b 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -26,6 +26,7 @@
 _  !extern s rd rn rm ra
 !extern rd rn rm ra
  !extern rd rn rm
+  !extern rd rm
!extern rm
 _reg !extern rn r mask
 _reg !extern rd r
@@ -126,6 +127,7 @@ RSB_rri   0.0 1110 .  0 ...     
  @s_rri_rot
 @rnadm      rn:4 ra:4 rd:4  rm:4  
 @rn0dm      rn:4  rd:4  rm:4   ra=0
 @rndm   rn:4  rd:4  rm:4  
+@rdm      rd:4  rm:4  
 
 {
   MUL 1011        @s0_rn0dm
@@ -180,6 +182,9 @@ CRC32CB   1010 1101    1000 
  @rndm
 CRC32CH   1010 1101    1001   @rndm
 CRC32CW   1010 1101    1010   @rndm
 
+# Note rn != rm is CONSTRAINED UNPREDICTABLE; we choose to ignore rn.
+CLZ   1010 1011    1000   @rdm
+
 # Branches and miscellaneous 

[Qemu-devel] [PATCH v2 23/68] target/arm: Convert Parallel addition and subtraction

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 229 -
 target/arm/a32.decode  |  44 
 target/arm/t32.decode  |  44 
 3 files changed, 200 insertions(+), 117 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 2764a1a637..cf03527afc 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -648,99 +648,6 @@ static inline void gen_arm_shift_reg(TCGv_i32 var, int shiftop,
 tcg_temp_free_i32(shift);
 }
 
-#define PAS_OP(pfx) \
-switch (op2) {  \
-case 0: gen_pas_helper(glue(pfx,add16)); break; \
-case 1: gen_pas_helper(glue(pfx,addsubx)); break; \
-case 2: gen_pas_helper(glue(pfx,subaddx)); break; \
-case 3: gen_pas_helper(glue(pfx,sub16)); break; \
-case 4: gen_pas_helper(glue(pfx,add8)); break; \
-case 7: gen_pas_helper(glue(pfx,sub8)); break; \
-}
-static void gen_arm_parallel_addsub(int op1, int op2, TCGv_i32 a, TCGv_i32 b)
-{
-TCGv_ptr tmp;
-
-switch (op1) {
-#define gen_pas_helper(name) glue(gen_helper_,name)(a, a, b, tmp)
-case 1:
-tmp = tcg_temp_new_ptr();
-tcg_gen_addi_ptr(tmp, cpu_env, offsetof(CPUARMState, GE));
-PAS_OP(s)
-tcg_temp_free_ptr(tmp);
-break;
-case 5:
-tmp = tcg_temp_new_ptr();
-tcg_gen_addi_ptr(tmp, cpu_env, offsetof(CPUARMState, GE));
-PAS_OP(u)
-tcg_temp_free_ptr(tmp);
-break;
-#undef gen_pas_helper
-#define gen_pas_helper(name) glue(gen_helper_,name)(a, a, b)
-case 2:
-PAS_OP(q);
-break;
-case 3:
-PAS_OP(sh);
-break;
-case 6:
-PAS_OP(uq);
-break;
-case 7:
-PAS_OP(uh);
-break;
-#undef gen_pas_helper
-}
-}
-#undef PAS_OP
-
-/* For unknown reasons Arm and Thumb-2 use arbitrarily different encodings.  */
-#define PAS_OP(pfx) \
-switch (op1) {  \
-case 0: gen_pas_helper(glue(pfx,add8)); break; \
-case 1: gen_pas_helper(glue(pfx,add16)); break; \
-case 2: gen_pas_helper(glue(pfx,addsubx)); break; \
-case 4: gen_pas_helper(glue(pfx,sub8)); break; \
-case 5: gen_pas_helper(glue(pfx,sub16)); break; \
-case 6: gen_pas_helper(glue(pfx,subaddx)); break; \
-}
-static void gen_thumb2_parallel_addsub(int op1, int op2, TCGv_i32 a, TCGv_i32 b)
-{
-TCGv_ptr tmp;
-
-switch (op2) {
-#define gen_pas_helper(name) glue(gen_helper_,name)(a, a, b, tmp)
-case 0:
-tmp = tcg_temp_new_ptr();
-tcg_gen_addi_ptr(tmp, cpu_env, offsetof(CPUARMState, GE));
-PAS_OP(s)
-tcg_temp_free_ptr(tmp);
-break;
-case 4:
-tmp = tcg_temp_new_ptr();
-tcg_gen_addi_ptr(tmp, cpu_env, offsetof(CPUARMState, GE));
-PAS_OP(u)
-tcg_temp_free_ptr(tmp);
-break;
-#undef gen_pas_helper
-#define gen_pas_helper(name) glue(gen_helper_,name)(a, a, b)
-case 1:
-PAS_OP(q);
-break;
-case 2:
-PAS_OP(sh);
-break;
-case 5:
-PAS_OP(uq);
-break;
-case 6:
-PAS_OP(uh);
-break;
-#undef gen_pas_helper
-}
-}
-#undef PAS_OP
-
 /*
  * Generate a conditional based on ARM condition code cc.
  * This is common between ARM and Aarch64 targets.
@@ -9216,6 +9123,114 @@ static bool trans_UDF(DisasContext *s, arg_UDF *a)
 return true;
 }
 
+/*
+ * Parallel addition and subtraction
+ */
+
+static bool op_par_addsub(DisasContext *s, arg_rrr *a,
+  void (*gen)(TCGv_i32, TCGv_i32, TCGv_i32))
+{
+TCGv_i32 t0, t1;
+
+if (s->thumb
+? !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)
+: !ENABLE_ARCH_6) {
+return false;
+}
+
+t0 = load_reg(s, a->rn);
+t1 = load_reg(s, a->rm);
+
+gen(t0, t0, t1);
+
+tcg_temp_free_i32(t1);
+store_reg(s, a->rd, t0);
+return true;
+}
+
+static bool op_par_addsub_ge(DisasContext *s, arg_rrr *a,
+ void (*gen)(TCGv_i32, TCGv_i32,
+ TCGv_i32, TCGv_ptr))
+{
+TCGv_i32 t0, t1;
+TCGv_ptr ge;
+
+if (s->thumb
+? !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)
+: !ENABLE_ARCH_6) {
+return false;
+}
+
+t0 = load_reg(s, a->rn);
+t1 = load_reg(s, a->rm);
+
+ge = tcg_temp_new_ptr();
+tcg_gen_addi_ptr(ge, cpu_env, offsetof(CPUARMState, GE));
+gen(t0, t0, t1, ge);
+
+tcg_temp_free_ptr(ge);
+tcg_temp_free_i32(t1);
+store_reg(s, a->rd, t0);
+return true;
+}
+
+#define DO_PAR_ADDSUB(NAME, helper) \
+static bool trans_##NAME(DisasContext *s, arg_rrr *a)   \
+{   \
+return op_par_addsub(s, a, helper); \
+}
+
+#define DO_PAR_ADDSUB_GE(NAME, helper) \
+static bool trans_##NAME(DisasContext *s, arg_rrr *a)   \
+{   \
+return op_par_addsub_ge(s, a, helper);  \
+}
+
+DO_PAR_ADDSUB_GE(SADD16, 

[Qemu-devel] [PATCH v2 28/68] target/arm: Diagnose writeback register in list for LDM for v7

2019-08-19 Thread Richard Henderson
Prior to v7, for the A32 encoding, this operation wrote an UNKNOWN
value back to the base register.  Starting in v7 this is UNPREDICTABLE.
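
A minimal sketch of the condition being diagnosed, assuming the
architecture version is available as a plain integer.  The name
ldm_a32_wback_unpredictable and the arch_version parameter are
hypothetical stand-ins for the ENABLE_ARCH_7 test used in the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: true when an A32 LDM names its base register
 * in the register list while also requesting writeback, on v7 or later.
 */
static bool ldm_a32_wback_unpredictable(int arch_version, bool w,
                                        unsigned rn, uint16_t list)
{
    bool base_in_list = (list & (1u << rn)) != 0;

    /* Prior to v7 the base register merely receives an UNKNOWN value. */
    return arch_version >= 7 && w && base_in_list;
}
```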

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 4451adbb97..29e2eae441 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9880,6 +9880,14 @@ static bool do_ldm(DisasContext *s, arg_ldst_block *a)
 
 static bool trans_LDM_a32(DisasContext *s, arg_ldst_block *a)
 {
+/*
+ * Writeback register in register list is UNPREDICTABLE
+ * for ArchVersion() >= 7.  Prior to v7, A32 would write
+ * an UNKNOWN value to the base register.
+ */
+if (ENABLE_ARCH_7 && a->w && (a->list & (1 << a->rn))) {
+return false;
+}
 return do_ldm(s, a);
 }
 
-- 
2.17.1




[Qemu-devel] [PATCH v2 29/68] target/arm: Diagnose too few registers in list for LDM/STM

2019-08-19 Thread Richard Henderson
This has been a TODO item for quite a while.  The minimum bit
count for A32 and T16 is 1, and for T32 is 2.
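
A standalone sketch of the check, assuming a 16-bit register list.
The helper names reg_list_count and reg_list_ok are hypothetical; the
patch itself uses ctpop16 and compares against min_n inline:

```c
#include <stdbool.h>
#include <stdint.h>

/* Count the registers named in a 16-bit LDM/STM register list. */
static int reg_list_count(uint16_t list)
{
    int n = 0;

    while (list) {
        list &= list - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Hypothetical helper: true if the list holds at least min_n
 * registers (1 for A32 and T16, 2 for T32).
 */
static bool reg_list_ok(uint16_t list, int min_n)
{
    return reg_list_count(list) >= min_n;
}
```

With min_n == 2, a T32 list naming a single register is rejected,
while the same list is accepted for A32 with min_n == 1.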

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 29e2eae441..1792bb7abd 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9733,7 +9733,7 @@ static void op_addr_block_post(DisasContext *s, arg_ldst_block *a,
 }
 }
 
-static bool op_stm(DisasContext *s, arg_ldst_block *a)
+static bool op_stm(DisasContext *s, arg_ldst_block *a, int min_n)
 {
 int i, j, n, list, mem_idx;
 bool user = a->u;
@@ -9749,7 +9749,9 @@ static bool op_stm(DisasContext *s, arg_ldst_block *a)
 
 list = a->list;
 n = ctpop16(list);
-/* TODO: test invalid n == 0 case */
+if (n < min_n) {
+return false;
+}
 
 addr = op_addr_block_pre(s, a, n);
 mem_idx = get_mem_index(s);
@@ -9782,7 +9784,8 @@ static bool op_stm(DisasContext *s, arg_ldst_block *a)
 
 static bool trans_STM(DisasContext *s, arg_ldst_block *a)
 {
-return op_stm(s, a);
+/* BitCount(list) < 1 is UNPREDICTABLE */
+return op_stm(s, a, 1);
 }
 
 static bool trans_STM_t32(DisasContext *s, arg_ldst_block *a)
@@ -9791,10 +9794,11 @@ static bool trans_STM_t32(DisasContext *s, arg_ldst_block *a)
 if (a->w && (a->list & (1 << a->rn))) {
 return false;
 }
-return op_stm(s, a);
+/* BitCount(list) < 2 is UNPREDICTABLE */
+return op_stm(s, a, 2);
 }
 
-static bool do_ldm(DisasContext *s, arg_ldst_block *a)
+static bool do_ldm(DisasContext *s, arg_ldst_block *a, int min_n)
 {
 int i, j, n, list, mem_idx;
 bool loaded_base;
@@ -9821,7 +9825,9 @@ static bool do_ldm(DisasContext *s, arg_ldst_block *a)
 
 list = a->list;
 n = ctpop16(list);
-/* TODO: test invalid n == 0 case */
+if (n < min_n) {
+return false;
+}
 
 addr = op_addr_block_pre(s, a, n);
 mem_idx = get_mem_index(s);
@@ -9888,7 +9894,8 @@ static bool trans_LDM_a32(DisasContext *s, arg_ldst_block *a)
 if (ENABLE_ARCH_7 && a->w && (a->list & (1 << a->rn))) {
 return false;
 }
-return do_ldm(s, a);
+/* BitCount(list) < 1 is UNPREDICTABLE */
+return do_ldm(s, a, 1);
 }
 
 static bool trans_LDM_t32(DisasContext *s, arg_ldst_block *a)
@@ -9897,7 +9904,8 @@ static bool trans_LDM_t32(DisasContext *s, arg_ldst_block *a)
 if (a->w && (a->list & (1 << a->rn))) {
 return false;
 }
-return do_ldm(s, a);
+/* BitCount(list) < 2 is UNPREDICTABLE */
+return do_ldm(s, a, 2);
 }
 
 /*
-- 
2.17.1




[Qemu-devel] [PATCH v2 13/68] target/arm: Convert MRS/MSR (banked, register)

2019-08-19 Thread Richard Henderson
The m-profile and a-profile decodings overlap.  Only return false
for the case of wrong profile; handle UNDEFINED for permission failure
directly.  This ensures that we don't accidentally pass an insn that
applies to the wrong profile.
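
The convention can be sketched outside of QEMU as follows.  This is a
toy model under stated assumptions, not the decodetree output: the Ctx
struct, the Result enum and the try_* functions are all hypothetical.
A handler returns false only to say "not my profile, try the next
overlapping pattern"; a permission failure is reported as UNDEFINED
by the handler itself, which then returns true.

```c
#include <stdbool.h>

typedef struct {
    bool m_profile;   /* hypothetical: CPU is M-profile */
    bool user_mode;   /* hypothetical: executing unprivileged */
} Ctx;

typedef enum { RES_NONE, RES_OK, RES_UNDEF } Result;

static Result last;   /* what the chosen handler decided */

static bool try_a_profile(const Ctx *c)
{
    if (c->m_profile) {
        return false;             /* wrong profile: let the next pattern try */
    }
    /* Permission failure is handled here, not by falling through. */
    last = c->user_mode ? RES_UNDEF : RES_OK;
    return true;
}

static bool try_m_profile(const Ctx *c)
{
    if (!c->m_profile) {
        return false;             /* wrong profile */
    }
    last = RES_OK;
    return true;
}

/* Try the overlapping patterns in order; no taker means UNDEFINED. */
static Result decode(const Ctx *c)
{
    last = RES_NONE;
    if (try_a_profile(c) || try_m_profile(c)) {
        return last;
    }
    return RES_UNDEF;
}
```

Had try_a_profile returned false on the permission failure, the
M-profile pattern would have been offered an A-profile instruction,
which is the accident the commit message describes.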

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 226 ++---
 target/arm/a32.decode  |  14 +++
 target/arm/t32.decode  |  40 ++--
 3 files changed, 142 insertions(+), 138 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index ee485b1cbd..026abcaa9c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8291,6 +8291,93 @@ static bool trans_MSR_imm(DisasContext *s, arg_MSR_imm *a)
 return true;
 }
 
+/*
+ * Miscellaneous instructions
+ */
+
+static bool trans_MRS_bank(DisasContext *s, arg_MRS_bank *a)
+{
+if (arm_dc_feature(s, ARM_FEATURE_M)) {
+return false;
+}
+gen_mrs_banked(s, a->r, a->sysm, a->rd);
+return true;
+}
+
+static bool trans_MSR_bank(DisasContext *s, arg_MSR_bank *a)
+{
+if (arm_dc_feature(s, ARM_FEATURE_M)) {
+return false;
+}
+gen_msr_banked(s, a->r, a->sysm, a->rn);
+return true;
+}
+
+static bool trans_MRS_reg(DisasContext *s, arg_MRS_reg *a)
+{
+TCGv_i32 tmp;
+
+if (arm_dc_feature(s, ARM_FEATURE_M)) {
+return false;
+}
+if (a->r) {
+if (IS_USER(s)) {
+unallocated_encoding(s);
+return true;
+}
+tmp = load_cpu_field(spsr);
+} else {
+tmp = tcg_temp_new_i32();
+gen_helper_cpsr_read(tmp, cpu_env);
+}
+store_reg(s, a->rd, tmp);
+return true;
+}
+
+static bool trans_MSR_reg(DisasContext *s, arg_MSR_reg *a)
+{
+TCGv_i32 tmp;
+uint32_t mask = msr_mask(s, a->mask, a->r);
+
+if (arm_dc_feature(s, ARM_FEATURE_M)) {
+return false;
+}
+tmp = load_reg(s, a->rn);
+if (gen_set_psr(s, mask, a->r, tmp)) {
+unallocated_encoding(s);
+}
+return true;
+}
+
+static bool trans_MRS_v7m(DisasContext *s, arg_MRS_v7m *a)
+{
+TCGv_i32 tmp;
+
+if (!arm_dc_feature(s, ARM_FEATURE_M)) {
+return false;
+}
+tmp = tcg_const_i32(a->sysm);
+gen_helper_v7m_mrs(tmp, cpu_env, tmp);
+store_reg(s, a->rd, tmp);
+return true;
+}
+
+static bool trans_MSR_v7m(DisasContext *s, arg_MSR_v7m *a)
+{
+TCGv_i32 addr, reg;
+
+if (!arm_dc_feature(s, ARM_FEATURE_M)) {
+return false;
+}
+addr = tcg_const_i32((a->mask << 10) | a->sysm);
+reg = load_reg(s, a->rn);
+gen_helper_v7m_msr(cpu_env, addr, reg);
+tcg_temp_free_i32(addr);
+tcg_temp_free_i32(reg);
+gen_lookup_tb(s);
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -8575,46 +8662,10 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 sh = (insn >> 4) & 0xf;
 rm = insn & 0xf;
 switch (sh) {
-case 0x0: /* MSR, MRS */
-if (insn & (1 << 9)) {
-/* MSR (banked) and MRS (banked) */
-int sysm = extract32(insn, 16, 4) |
-(extract32(insn, 8, 1) << 4);
-int r = extract32(insn, 22, 1);
-
-if (op1 & 1) {
-/* MSR (banked) */
-gen_msr_banked(s, r, sysm, rm);
-} else {
-/* MRS (banked) */
-int rd = extract32(insn, 12, 4);
-
-gen_mrs_banked(s, r, sysm, rd);
-}
-break;
-}
-
-/* MSR, MRS (for PSRs) */
-if (op1 & 1) {
-/* PSR = reg */
-tmp = load_reg(s, rm);
-i = ((op1 & 2) != 0);
-if (gen_set_psr(s, msr_mask(s, (insn >> 16) & 0xf, i), i, tmp))
-goto illegal_op;
-} else {
-/* reg = PSR */
-rd = (insn >> 12) & 0xf;
-if (op1 & 2) {
-if (IS_USER(s))
-goto illegal_op;
-tmp = load_cpu_field(spsr);
-} else {
-tmp = tcg_temp_new_i32();
-gen_helper_cpsr_read(tmp, cpu_env);
-}
-store_reg(s, rd, tmp);
-}
-break;
+case 0x0:
+/* MSR/MRS (banked/register) */
+/* All done in decodetree.  Illegal ops already signalled.  */
+g_assert_not_reached();
 case 0x1:
 if (op1 == 1) {
 /* branch/exchange thumb (bx).  */
@@ -10471,40 +10522,9 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 } else {
 op = (insn >> 20) & 7;
 switch (op) {
-case 0: /* msr cpsr.  */
-if (arm_dc_feature(s, ARM_FEATURE_M)) {
-tmp = load_reg(s, rn);
-/* the 

[Qemu-devel] [PATCH v2 22/68] target/arm: Convert USAD8, USADA8, SBFX, UBFX, BFC, BFI, UDF

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 200 +
 target/arm/a32.decode  |  20 +
 target/arm/t32.decode  |  19 
 3 files changed, 143 insertions(+), 96 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 3b0998444d..2764a1a637 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9119,6 +9119,103 @@ static bool trans_LDAH(DisasContext *s, arg_LDA *a)
 return op_lda(s, a, MO_UW);
 }
 
+/*
+ * Media instructions
+ */
+
+static bool trans_USADA8(DisasContext *s, arg_USADA8 *a)
+{
+TCGv_i32 t1, t2;
+
+if (!ENABLE_ARCH_6) {
+return false;
+}
+
+t1 = load_reg(s, a->rn);
+t2 = load_reg(s, a->rm);
+gen_helper_usad8(t1, t1, t2);
+tcg_temp_free_i32(t2);
+if (a->ra != 15) {
+t2 = load_reg(s, a->ra);
+tcg_gen_add_i32(t1, t1, t2);
+tcg_temp_free_i32(t2);
+}
+store_reg(s, a->rd, t1);
+return true;
+}
+
+static bool op_bfx(DisasContext *s, arg_UBFX *a, bool u)
+{
+TCGv_i32 tmp;
+int width = a->widthm1 + 1;
+int shift = a->lsb;
+
+if (!ENABLE_ARCH_6T2) {
+return false;
+}
+
+tmp = load_reg(s, a->rn);
+if (shift + width > 32) {
+return false;
+} else if (width < 32) {
+if (u) {
+tcg_gen_extract_i32(tmp, tmp, shift, width);
+} else {
+tcg_gen_sextract_i32(tmp, tmp, shift, width);
+}
+}
+store_reg(s, a->rd, tmp);
+return true;
+}
+
+static bool trans_SBFX(DisasContext *s, arg_SBFX *a)
+{
+return op_bfx(s, a, false);
+}
+
+static bool trans_UBFX(DisasContext *s, arg_UBFX *a)
+{
+return op_bfx(s, a, true);
+}
+
+static bool trans_BFCI(DisasContext *s, arg_BFCI *a)
+{
+TCGv_i32 tmp;
+int msb = a->msb, lsb = a->lsb;
+int width;
+
+if (!ENABLE_ARCH_6T2) {
+return false;
+}
+
+if (msb < lsb) {
+/* UNPREDICTABLE; we choose to UNDEF */
+return false;
+}
+
+width = msb + 1 - lsb;
+if (a->rn == 15) {
+/* BFC */
+tmp = tcg_const_i32(0);
+} else {
+/* BFI */
+tmp = load_reg(s, a->rn);
+}
+if (width != 32) {
+TCGv_i32 tmp2 = load_reg(s, a->rd);
+tcg_gen_deposit_i32(tmp, tmp2, tmp, lsb, width);
+tcg_temp_free_i32(tmp2);
+}
+store_reg(s, a->rd, tmp);
+return true;
+}
+
+static bool trans_UDF(DisasContext *s, arg_UDF *a)
+{
+unallocated_encoding(s);
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -9659,65 +9756,9 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 }
 break;
 case 3:
-op1 = ((insn >> 17) & 0x38) | ((insn >> 5) & 7);
-switch (op1) {
-case 0: /* Unsigned sum of absolute differences.  */
-ARCH(6);
-tmp = load_reg(s, rm);
-tmp2 = load_reg(s, rs);
-gen_helper_usad8(tmp, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
-if (rd != 15) {
-tmp2 = load_reg(s, rd);
-tcg_gen_add_i32(tmp, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
-}
-store_reg(s, rn, tmp);
-break;
-case 0x20: case 0x24: case 0x28: case 0x2c:
-/* Bitfield insert/clear.  */
-ARCH(6T2);
-shift = (insn >> 7) & 0x1f;
-i = (insn >> 16) & 0x1f;
-if (i < shift) {
-/* UNPREDICTABLE; we choose to UNDEF */
-goto illegal_op;
-}
-i = i + 1 - shift;
-if (rm == 15) {
-tmp = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp, 0);
-} else {
-tmp = load_reg(s, rm);
-}
-if (i != 32) {
-tmp2 = load_reg(s, rd);
-tcg_gen_deposit_i32(tmp, tmp2, tmp, shift, i);
-tcg_temp_free_i32(tmp2);
-}
-store_reg(s, rd, tmp);
-break;
-case 0x12: case 0x16: case 0x1a: case 0x1e: /* sbfx */
-case 0x32: case 0x36: case 0x3a: case 0x3e: /* ubfx */
-ARCH(6T2);
-tmp = load_reg(s, rm);
-shift = (insn >> 7) & 0x1f;
-i = ((insn >> 16) & 0x1f) + 1;
-if (shift + i > 32)
-goto illegal_op;
-

[Qemu-devel] [PATCH v2 18/68] target/arm: Convert the rest of A32 Miscellaneous instructions

2019-08-19 Thread Richard Henderson
This fixes an existing bug with the T5 encoding of SUBS PC, LR, #IMM,
in that it may be executed from user mode as with any other encoding
of SUBS, not as ERET.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 119 +
 target/arm/a32.decode  |   8 +++
 target/arm/t32.decode  |   5 ++
 3 files changed, 50 insertions(+), 82 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index cb7b35489f..cb6296dc12 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8491,6 +8491,39 @@ static bool trans_ERET(DisasContext *s, arg_ERET *a)
 return true;
 }
 
+static bool trans_HLT(DisasContext *s, arg_HLT *a)
+{
+gen_hlt(s, a->imm);
+return true;
+}
+
+static bool trans_BKPT(DisasContext *s, arg_BKPT *a)
+{
+if (!ENABLE_ARCH_5) {
+return false;
+}
+gen_exception_bkpt_insn(s, syn_aa32_bkpt(a->imm, false));
+return true;
+}
+
+static bool trans_HVC(DisasContext *s, arg_HVC *a)
+{
+if (!ENABLE_ARCH_7 || IS_USER(s) || arm_dc_feature(s, ARM_FEATURE_M)) {
+return false;
+}
+gen_hvc(s, a->imm);
+return true;
+}
+
+static bool trans_SMC(DisasContext *s, arg_SMC *a)
+{
+if (!ENABLE_ARCH_6K || IS_USER(s) || arm_dc_feature(s, ARM_FEATURE_M)) {
+return false;
+}
+gen_smc(s);
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -8771,68 +8804,8 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 } else if ((insn & 0x0f90) == 0x0100
&& (insn & 0x0090) != 0x0090) {
 /* miscellaneous instructions */
-op1 = (insn >> 21) & 3;
-sh = (insn >> 4) & 0xf;
-rm = insn & 0xf;
-switch (sh) {
-case 0x0:
-/* MSR/MRS (banked/register) */
-/* All done in decodetree.  Illegal ops already signalled.  */
-g_assert_not_reached();
-case 0x1: /* bx, clz */
-case 0x2: /* bxj */
-case 0x3: /* blx */
-case 0x4: /* crc32 */
-/* All done in decodetree.  Illegal ops reach here.  */
-goto illegal_op;
-case 0x5: /* Saturating addition and subtraction.  */
-case 0x6: /* ERET */
-/* All done in decodetree.  Reach here for illegal ops.  */
-goto illegal_op;
-case 7:
-{
-int imm16 = extract32(insn, 0, 4) | (extract32(insn, 8, 12) << 4);
-switch (op1) {
-case 0:
-/* HLT */
-gen_hlt(s, imm16);
-break;
-case 1:
-/* bkpt */
-ARCH(5);
-gen_exception_bkpt_insn(s, syn_aa32_bkpt(imm16, false));
-break;
-case 2:
-/* Hypervisor call (v7) */
-ARCH(7);
-if (IS_USER(s)) {
-goto illegal_op;
-}
-gen_hvc(s, imm16);
-break;
-case 3:
-/* Secure monitor call (v6+) */
-ARCH(6K);
-if (IS_USER(s)) {
-goto illegal_op;
-}
-gen_smc(s);
-break;
-default:
-g_assert_not_reached();
-}
-break;
-}
-case 0x8:
-case 0xa:
-case 0xc:
-case 0xe:
-/* Halfword multiply and multiply accumulate.  */
-/* All done in decodetree.  Reach here for illegal ops.  */
-goto illegal_op;
-default:
-goto illegal_op;
-}
+/* All done in decodetree.  Illegal ops reach here.  */
+goto illegal_op;
 } else if (((insn & 0x0e000000) == 0 &&
 (insn & 0x00000090) != 0x90) ||
((insn & 0x0e000000) == (1 << 25))) {
@@ -10493,26 +10466,8 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 goto illegal_op;
 
 if (insn & (1 << 26)) {
-if (arm_dc_feature(s, ARM_FEATURE_M)) {
-goto illegal_op;
-}
-if (!(insn & (1 << 20))) {
-/* Hypervisor call (v7) */
-int imm16 = extract32(insn, 16, 4) << 12
-| extract32(insn, 0, 12);
-ARCH(7);
-if (IS_USER(s)) {
-goto illegal_op;
-}
-gen_hvc(s, imm16);
-} else {
-/* Secure monitor call (v6+) */
-ARCH(6K);
-if (IS_USER(s)) {
-goto illegal_op;
-}
-gen_smc(s);
-}
+/* hvc, smc, in decodetree */
+goto illegal_op;
 

[Qemu-devel] [PATCH v2 20/68] target/arm: Convert load/store (register, immediate, literal)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 797 ++---
 target/arm/a32.decode  | 120 +++
 target/arm/t32.decode  | 141 
 3 files changed, 615 insertions(+), 443 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 0e51289928..f7c4db872c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1246,62 +1246,6 @@ static inline void gen_hlt(DisasContext *s, int imm)
 unallocated_encoding(s);
 }
 
-static inline void gen_add_data_offset(DisasContext *s, unsigned int insn,
-   TCGv_i32 var)
-{
-int val, rm, shift, shiftop;
-TCGv_i32 offset;
-
-if (!(insn & (1 << 25))) {
-/* immediate */
-val = insn & 0xfff;
-if (!(insn & (1 << 23)))
-val = -val;
-if (val != 0)
-tcg_gen_addi_i32(var, var, val);
-} else {
-/* shift/register */
-rm = (insn) & 0xf;
-shift = (insn >> 7) & 0x1f;
-shiftop = (insn >> 5) & 3;
-offset = load_reg(s, rm);
-gen_arm_shift_im(offset, shiftop, shift, 0);
-if (!(insn & (1 << 23)))
-tcg_gen_sub_i32(var, var, offset);
-else
-tcg_gen_add_i32(var, var, offset);
-tcg_temp_free_i32(offset);
-}
-}
-
-static inline void gen_add_datah_offset(DisasContext *s, unsigned int insn,
-int extra, TCGv_i32 var)
-{
-int val, rm;
-TCGv_i32 offset;
-
-if (insn & (1 << 22)) {
-/* immediate */
-val = (insn & 0xf) | ((insn >> 4) & 0xf0);
-if (!(insn & (1 << 23)))
-val = -val;
-val += extra;
-if (val != 0)
-tcg_gen_addi_i32(var, var, val);
-} else {
-/* register */
-if (extra)
-tcg_gen_addi_i32(var, var, extra);
-rm = (insn) & 0xf;
-offset = load_reg(s, rm);
-if (!(insn & (1 << 23)))
-tcg_gen_sub_i32(var, var, offset);
-else
-tcg_gen_add_i32(var, var, offset);
-tcg_temp_free_i32(offset);
-}
-}
-
 static TCGv_ptr get_fpstatus_ptr(int neon)
 {
 TCGv_ptr statusptr = tcg_temp_new_ptr();
@@ -7636,6 +7580,11 @@ static int times_2(DisasContext *s, int x)
 return x * 2;
 }
 
+static int times_4(DisasContext *s, int x)
+{
+return x * 4;
+}
+
 /* Return only the rotation part of T32ExpandImm.  */
 static int t32_expandimm_rot(DisasContext *s, int x)
 {
@@ -8535,6 +8484,345 @@ static bool trans_SMC(DisasContext *s, arg_SMC *a)
 return true;
 }
 
+/*
+ * Load/store register index
+ */
+
+static ISSInfo make_issinfo(DisasContext *s, int rd, bool p, bool w)
+{
+ISSInfo ret;
+
+/* ISS not valid if writeback */
+if (p && !w) {
+ret = rd;
+} else {
+ret = ISSInvalid;
+}
+return ret;
+}
+
+static TCGv_i32 op_addr_rr_pre(DisasContext *s, arg_ldst_rr *a)
+{
+TCGv_i32 addr = load_reg(s, a->rn);
+
+if (s->v8m_stackcheck && a->rn == 13 && a->w) {
+gen_helper_v8m_stackcheck(cpu_env, addr);
+}
+
+if (a->p) {
+TCGv_i32 ofs = load_reg(s, a->rm);
+gen_arm_shift_im(ofs, a->shtype, a->shimm, 0);
+if (a->u) {
+tcg_gen_add_i32(addr, addr, ofs);
+} else {
+tcg_gen_sub_i32(addr, addr, ofs);
+}
+tcg_temp_free_i32(ofs);
+}
+return addr;
+}
+
+static void op_addr_rr_post(DisasContext *s, arg_ldst_rr *a,
+TCGv_i32 addr, int address_offset)
+{
+if (!a->p) {
+TCGv_i32 ofs = load_reg(s, a->rm);
+gen_arm_shift_im(ofs, a->shtype, a->shimm, 0);
+if (a->u) {
+tcg_gen_add_i32(addr, addr, ofs);
+} else {
+tcg_gen_sub_i32(addr, addr, ofs);
+}
+tcg_temp_free_i32(ofs);
+} else if (!a->w) {
+tcg_temp_free_i32(addr);
+return;
+}
+tcg_gen_addi_i32(addr, addr, address_offset);
+store_reg(s, a->rn, addr);
+}
+
+static bool op_load_rr(DisasContext *s, arg_ldst_rr *a,
+   TCGMemOp mop, int mem_idx)
+{
+ISSInfo issinfo = make_issinfo(s, a->rt, a->p, a->w);
+TCGv_i32 addr, tmp;
+
+addr = op_addr_rr_pre(s, a);
+
+tmp = tcg_temp_new_i32();
+gen_aa32_ld_i32(s, tmp, addr, mem_idx, mop | s->be_data);
+disas_set_da_iss(s, mop, issinfo);
+
+/*
+ * Perform base writeback before the loaded value to
+ * ensure correct behavior with overlapping index registers.
+ */
+op_addr_rr_post(s, a, addr, 0);
+store_reg_from_load(s, a->rt, tmp);
+return true;
+}
+
+static bool op_store_rr(DisasContext *s, arg_ldst_rr *a,
+TCGMemOp mop, int mem_idx)
+{
+ISSInfo issinfo = make_issinfo(s, a->rt, a->p, a->w) | ISSIsWrite;
+TCGv_i32 addr, tmp;
+
+addr = op_addr_rr_pre(s, a);
+
+tmp = load_reg(s, a->rt);
+gen_aa32_st_i32(s, tmp, addr, mem_idx, mop | s->be_data);
+

[Qemu-devel] [PATCH v2 12/68] target/arm: Convert MSR (immediate) and hints

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 60 +-
 target/arm/a32.decode  | 25 ++
 target/arm/t32.decode  | 17 
 3 files changed, 84 insertions(+), 18 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 9a2fb7d3aa..ee485b1cbd 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8253,6 +8253,44 @@ DO_SMLAWX(SMLAWT, 1, 1)
 
 #undef DO_SMLAWX
 
+/*
+ * MSR (immediate) and hints
+ */
+
+static bool trans_YIELD(DisasContext *s, arg_YIELD *a)
+{
+gen_nop_hint(s, 1);
+return true;
+}
+
+static bool trans_WFE(DisasContext *s, arg_WFE *a)
+{
+gen_nop_hint(s, 2);
+return true;
+}
+
+static bool trans_WFI(DisasContext *s, arg_WFI *a)
+{
+gen_nop_hint(s, 3);
+return true;
+}
+
+static bool trans_NOP(DisasContext *s, arg_NOP *a)
+{
+return true;
+}
+
+static bool trans_MSR_imm(DisasContext *s, arg_MSR_imm *a)
+{
+uint32_t val = ror32(a->imm, a->rot * 2);
+uint32_t mask = msr_mask(s, a->mask, a->r);
+
+if (gen_set_psr_im(s, mask, a->r, val)) {
+unallocated_encoding(s);
+}
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -8526,21 +8564,9 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 }
 store_reg(s, rd, tmp);
 } else {
-if (((insn >> 12) & 0xf) != 0xf)
-goto illegal_op;
-if (((insn >> 16) & 0xf) == 0) {
-gen_nop_hint(s, insn & 0xff);
-} else {
-/* CPSR = immediate */
-val = insn & 0xff;
-shift = ((insn >> 8) & 0xf) * 2;
-val = ror32(val, shift);
-i = ((insn & (1 << 22)) != 0);
-if (gen_set_psr_im(s, msr_mask(s, (insn >> 16) & 0xf, i),
-   i, val)) {
-goto illegal_op;
-}
-}
+/* MSR (immediate) and hints */
+/* All done in decodetree.  Illegal ops already signalled.  */
+g_assert_not_reached();
 }
 } else if ((insn & 0x0f900000) == 0x01000000
&& (insn & 0x00000090) != 0x00000090) {
@@ -10480,9 +10506,7 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 goto illegal_op;
 break;
 case 2: /* cps, nop-hint.  */
-if (((insn >> 8) & 7) == 0) {
-gen_nop_hint(s, insn & 0xff);
-}
+/* nop hints in decodetree */
 /* Implemented as NOP in user mode.  */
 if (IS_USER(s))
 break;
diff --git a/target/arm/a32.decode b/target/arm/a32.decode
index 19d12e726b..3d5c5408f9 100644
--- a/target/arm/a32.decode
+++ b/target/arm/a32.decode
@@ -22,6 +22,7 @@
 # All insns that have 0xf in insn[31:28] are in a32-uncond.decode.
 #
 
+
 _rrr_shi   s rd rn rm shim shty
 _rrr_shr   s rn rd rm rs shty
 _rri_rot   s rn rd imm rot
@@ -152,3 +153,27 @@ SMULBB    0001 0110    1000    
   @rd0mn
 SMULBT    0001 0110    1100   @rd0mn
 SMULTB    0001 0110    1010   @rd0mn
 SMULTT    0001 0110    1110   @rd0mn
+
+# MSR (immediate) and hints
+
+_i   r mask rot imm
+@msr_i      mask:4  rot:4 imm:8   _i
+
+{
+  {
+YIELD 0011 0010     0001
+WFE   0011 0010     0010
+WFI   0011 0010     0011
+
+# TODO: Implement SEV, SEVL; may help SMP performance.
+# SEV 0011 0010     0100
+# SEVL    0011 0010     0101
+
+# The canonical nop ends in 00000000, but the whole of the
+# rest of the space executes as nop if otherwise unsupported.
+NOP   0011 0010     
+  }
+  # Note mask = 0 is covered by NOP
+  MSR_imm 0011 0010       @msr_i r=0
+}
+MSR_imm   0011 0110       @msr_i r=1
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index 122a0537ed..ccb7cdd4ef 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -19,6 +19,7 @@
 # This file is processed by scripts/decodetree.py
 #
 
+   !extern
 _rrr_shi   !extern s rd rn rm shim shty
 _rrr_shr   !extern s rn rd rm rs shty
 _rri_rot   !extern s rn rd imm rot
@@ -166,3 +167,19 @@ QADD  1010 1000    1000    
   @rndm
 QSUB  1010 1000    1010   @rndm
 QDADD 1010 1000    1001   @rndm
 QDSUB 1010 1000    1011   @rndm
+
+# 

[Qemu-devel] [PATCH v2 15/68] target/arm: Convert BX, BXJ, BLX (register)

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 78 --
 target/arm/a32.decode  |  7 
 target/arm/t32.decode  |  2 ++
 3 files changed, 47 insertions(+), 40 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index f390656ce9..ef26ed7b57 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8429,6 +8429,38 @@ static bool trans_MSR_v7m(DisasContext *s, arg_MSR_v7m *a)
 return true;
 }
 
+static bool trans_BX(DisasContext *s, arg_BX *a)
+{
+if (!ENABLE_ARCH_4T) {
+return false;
+}
+gen_bx(s, load_reg(s, a->rm));
+return true;
+}
+
+static bool trans_BXJ(DisasContext *s, arg_BXJ *a)
+{
+if (!ENABLE_ARCH_5J || arm_dc_feature(s, ARM_FEATURE_M)) {
+return false;
+}
+/* Trivial implementation equivalent to bx.  */
+gen_bx(s, load_reg(s, a->rm));
+return true;
+}
+
+static bool trans_BLX_r(DisasContext *s, arg_BLX_r *a)
+{
+TCGv_i32 tmp;
+
+if (!ENABLE_ARCH_5) {
+return false;
+}
+tmp = load_reg(s, a->rm);
+tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | s->thumb);
+gen_bx(s, tmp);
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -8718,12 +8750,7 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 /* All done in decodetree.  Illegal ops already signalled.  */
 g_assert_not_reached();
 case 0x1:
-if (op1 == 1) {
-/* branch/exchange thumb (bx).  */
-ARCH(4T);
-tmp = load_reg(s, rm);
-gen_bx(s, tmp);
-} else if (op1 == 3) {
+if (op1 == 3) {
 /* clz */
 ARCH(5);
 rd = (insn >> 12) & 0xf;
@@ -8734,30 +8761,9 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 goto illegal_op;
 }
 break;
-case 0x2:
-if (op1 == 1) {
-ARCH(5J); /* bxj */
-/* Trivial implementation equivalent to bx.  */
-tmp = load_reg(s, rm);
-gen_bx(s, tmp);
-} else {
-goto illegal_op;
-}
-break;
-case 0x3:
-if (op1 != 1)
-  goto illegal_op;
-
-ARCH(5);
-/* branch link/exchange thumb (blx) */
-tmp = load_reg(s, rm);
-tmp2 = tcg_temp_new_i32();
-tcg_gen_movi_i32(tmp2, s->base.pc_next);
-store_reg(s, 14, tmp2);
-gen_bx(s, tmp);
-break;
-case 0x4:
-/* crc32 */
+case 0x2: /* bxj */
+case 0x3: /* blx */
+case 0x4: /* crc32 */
 /* All done in decodetree.  Illegal ops reach here.  */
 goto illegal_op;
 case 0x5:
@@ -10578,16 +10584,8 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 goto illegal_op;
 }
 break;
-case 4: /* bxj */
-/* Trivial implementation equivalent to bx.
- * This instruction doesn't exist at all for M-profile.
- */
-if (arm_dc_feature(s, ARM_FEATURE_M)) {
-goto illegal_op;
-}
-tmp = load_reg(s, rn);
-gen_bx(s, tmp);
-break;
+case 4: /* bxj, in decodetree */
+goto illegal_op;
 case 5: /* Exception return.  */
 if (IS_USER(s)) {
 goto illegal_op;
diff --git a/target/arm/a32.decode b/target/arm/a32.decode
index a8ef435b15..6cb9c16e2f 100644
--- a/target/arm/a32.decode
+++ b/target/arm/a32.decode
@@ -29,6 +29,7 @@
 _  s rd rn rm ra
 rd rn rm ra
  rd rn rm
+   rm
 _reg rn r mask
 _reg rd r
 _bankrn r sysm
@@ -195,8 +196,14 @@ CRC32CW   0001 0100   0010 0100    
   @rndm
 
 %sysm8:1 16:4
 
+@rm         rm:4  
+
 MRS_bank  0001 0 r:1 00  rd:4 001.    _bank %sysm
 MSR_bank  0001 0 r:1 10   001.  rn:4  _bank %sysm
 
 MRS_reg   0001 0 r:1 00    rd:4     _reg
 MSR_reg   0001 0 r:1 10 mask:4    rn:4  _reg
+
+BX    0001 0010    0001   @rm
+BXJ   0001 0010    0010   @rm
+BLX_r 0001 0010    0011   @rm
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index 261db100ff..337706ebbe 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -26,6 +26,7 @@
 _  !extern s rd 

[Qemu-devel] [PATCH v2 21/68] target/arm: Convert Synchronization primitives

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 560 ++---
 target/arm/a32.decode  |  48 
 target/arm/t32.decode  |  46 
 3 files changed, 396 insertions(+), 258 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index f7c4db872c..3b0998444d 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8823,6 +8823,302 @@ DO_LDST(STRH, store, MO_UW)
 
 #undef DO_LDST
 
+/*
+ * Synchronization primitives
+ */
+
+static bool op_swp(DisasContext *s, arg_SWP *a, TCGMemOp opc)
+{
+TCGv_i32 addr, tmp;
+TCGv taddr;
+
+opc |= s->be_data;
+addr = load_reg(s, a->rn);
+taddr = gen_aa32_addr(s, addr, opc);
+tcg_temp_free_i32(addr);
+
+tmp = load_reg(s, a->rt2);
+tcg_gen_atomic_xchg_i32(tmp, taddr, tmp, get_mem_index(s), opc);
+tcg_temp_free(taddr);
+
+store_reg(s, a->rt, tmp);
+return true;
+}
+
+static bool trans_SWP(DisasContext *s, arg_SWP *a)
+{
+return op_swp(s, a, MO_UL | MO_ALIGN);
+}
+
+static bool trans_SWPB(DisasContext *s, arg_SWP *a)
+{
+return op_swp(s, a, MO_UB);
+}
+
+/*
+ * Load/Store Exclusive and Load-Acquire/Store-Release
+ */
+
+static bool op_strex(DisasContext *s, arg_STREX *a, TCGMemOp mop, bool rel)
+{
+TCGv_i32 addr;
+
+if (rel) {
+tcg_gen_mb(TCG_MO_ALL | TCG_BAR_STRL);
+}
+
+addr = tcg_temp_local_new_i32();
+load_reg_var(s, addr, a->rn);
+tcg_gen_addi_i32(addr, addr, a->imm);
+
+gen_store_exclusive(s, a->rd, a->rt, a->rt2, addr, mop);
+tcg_temp_free_i32(addr);
+return true;
+}
+
+static bool trans_STREX(DisasContext *s, arg_STREX *a)
+{
+if (!ENABLE_ARCH_6) {
+return false;
+}
+return op_strex(s, a, MO_32, false);
+}
+
+static bool trans_STREXD_a32(DisasContext *s, arg_STREX *a)
+{
+if (!ENABLE_ARCH_6K || (a->rt & 1)) {
+return false;
+}
+a->rt2 = a->rt + 1;
+return op_strex(s, a, MO_64, false);
+}
+
+static bool trans_STREXD_t32(DisasContext *s, arg_STREX *a)
+{
+return op_strex(s, a, MO_64, false);
+}
+
+static bool trans_STREXB(DisasContext *s, arg_STREX *a)
+{
+if (!ENABLE_ARCH_6K) {
+return false;
+}
+return op_strex(s, a, MO_8, false);
+}
+
+static bool trans_STREXH(DisasContext *s, arg_STREX *a)
+{
+if (!ENABLE_ARCH_6K) {
+return false;
+}
+return op_strex(s, a, MO_16, false);
+}
+
+static bool trans_STLEX(DisasContext *s, arg_STREX *a)
+{
+if (!ENABLE_ARCH_8) {
+return false;
+}
+return op_strex(s, a, MO_32, true);
+}
+
+static bool trans_STLEXD_a32(DisasContext *s, arg_STREX *a)
+{
+if (!ENABLE_ARCH_8 || (a->rt & 1)) {
+return false;
+}
+a->rt2 = a->rt + 1;
+return op_strex(s, a, MO_64, true);
+}
+
+static bool trans_STLEXD_t32(DisasContext *s, arg_STREX *a)
+{
+if (!ENABLE_ARCH_8) {
+return false;
+}
+return op_strex(s, a, MO_64, true);
+}
+
+static bool trans_STLEXB(DisasContext *s, arg_STREX *a)
+{
+if (!ENABLE_ARCH_8) {
+return false;
+}
+return op_strex(s, a, MO_8, true);
+}
+
+static bool trans_STLEXH(DisasContext *s, arg_STREX *a)
+{
+if (!ENABLE_ARCH_8) {
+return false;
+}
+return op_strex(s, a, MO_16, true);
+}
+
+static bool op_stl(DisasContext *s, arg_STL *a, TCGMemOp mop)
+{
+TCGv_i32 addr, tmp;
+
+if (!ENABLE_ARCH_8) {
+return false;
+}
+addr = load_reg(s, a->rn);
+
+tmp = load_reg(s, a->rt);
+tcg_gen_mb(TCG_MO_ALL | TCG_BAR_STRL);
+gen_aa32_st_i32(s, tmp, addr, get_mem_index(s), mop | s->be_data);
+disas_set_da_iss(s, mop, a->rt | ISSIsAcqRel | ISSIsWrite);
+
+tcg_temp_free_i32(tmp);
+tcg_temp_free_i32(addr);
+return true;
+}
+
+static bool trans_STL(DisasContext *s, arg_STL *a)
+{
+return op_stl(s, a, MO_UL);
+}
+
+static bool trans_STLB(DisasContext *s, arg_STL *a)
+{
+return op_stl(s, a, MO_UB);
+}
+
+static bool trans_STLH(DisasContext *s, arg_STL *a)
+{
+return op_stl(s, a, MO_UW);
+}
+
+static bool op_ldrex(DisasContext *s, arg_LDREX *a, TCGMemOp mop, bool acq)
+{
+TCGv_i32 addr;
+
+addr = tcg_temp_local_new_i32();
+load_reg_var(s, addr, a->rn);
+tcg_gen_addi_i32(addr, addr, a->imm);
+
+gen_load_exclusive(s, a->rt, a->rt2, addr, mop);
+tcg_temp_free_i32(addr);
+
+if (acq) {
+tcg_gen_mb(TCG_MO_ALL | TCG_BAR_LDAQ);
+}
+return true;
+}
+
+static bool trans_LDREX(DisasContext *s, arg_LDREX *a)
+{
+if (!ENABLE_ARCH_6) {
+return false;
+}
+return op_ldrex(s, a, MO_32, false);
+}
+
+static bool trans_LDREXD_a32(DisasContext *s, arg_LDREX *a)
+{
+if (!ENABLE_ARCH_6K || (a->rt & 1)) {
+return false;
+}
+a->rt2 = a->rt + 1;
+return op_ldrex(s, a, MO_64, false);
+}
+
+static bool trans_LDREXD_t32(DisasContext *s, arg_LDREX *a)
+{
+return op_ldrex(s, a, MO_64, false);
+}
+
+static bool trans_LDREXB(DisasContext *s, arg_LDREX *a)
+{
+if 

[Qemu-devel] [PATCH v2 14/68] target/arm: Convert Cyclic Redundancy Check

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 121 +++--
 target/arm/a32.decode  |   9 +++
 target/arm/t32.decode  |   7 +++
 3 files changed, 72 insertions(+), 65 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 026abcaa9c..f390656ce9 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8291,6 +8291,57 @@ static bool trans_MSR_imm(DisasContext *s, arg_MSR_imm *a)
 return true;
 }
 
+/*
+ * Cyclic Redundancy Check
+ */
+
+static bool op_crc32(DisasContext *s, arg_rrr *a, bool c, TCGMemOp sz)
+{
+TCGv_i32 t1, t2, t3;
+
+if (!dc_isar_feature(aa32_crc32, s)) {
+return false;
+}
+
+t1 = load_reg(s, a->rn);
+t2 = load_reg(s, a->rm);
+switch (sz) {
+case MO_8:
+gen_uxtb(t2);
+break;
+case MO_16:
+gen_uxth(t2);
+break;
+case MO_32:
+break;
+default:
+g_assert_not_reached();
+}
+t3 = tcg_const_i32(1 << sz);
+if (c) {
+gen_helper_crc32c(t1, t1, t2, t3);
+} else {
+gen_helper_crc32(t1, t1, t2, t3);
+}
+tcg_temp_free_i32(t2);
+tcg_temp_free_i32(t3);
+store_reg(s, a->rd, t1);
+return true;
+}
+
+#define DO_CRC32(NAME, c, sz) \
+static bool trans_##NAME(DisasContext *s, arg_rrr *a)  \
+{ return op_crc32(s, a, c, sz); }
+
+DO_CRC32(CRC32B, false, MO_8)
+DO_CRC32(CRC32H, false, MO_16)
+DO_CRC32(CRC32W, false, MO_32)
+DO_CRC32(CRC32CB, true, MO_8)
+DO_CRC32(CRC32CH, true, MO_16)
+DO_CRC32(CRC32CW, true, MO_32)
+
+#undef DO_CRC32
+
 /*
  * Miscellaneous instructions
  */
@@ -8706,39 +8757,9 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 gen_bx(s, tmp);
 break;
 case 0x4:
-{
-/* crc32/crc32c */
-uint32_t c = extract32(insn, 8, 4);
-
-/* Check this CPU supports ARMv8 CRC instructions.
- * op1 == 3 is UNPREDICTABLE but handle as UNDEFINED.
- * Bits 8, 10 and 11 should be zero.
- */
-if (!dc_isar_feature(aa32_crc32, s) || op1 == 0x3 || (c & 0xd) != 0) {
-goto illegal_op;
-}
-
-rn = extract32(insn, 16, 4);
-rd = extract32(insn, 12, 4);
-
-tmp = load_reg(s, rn);
-tmp2 = load_reg(s, rm);
-if (op1 == 0) {
-tcg_gen_andi_i32(tmp2, tmp2, 0xff);
-} else if (op1 == 1) {
-tcg_gen_andi_i32(tmp2, tmp2, 0xffff);
-}
-tmp3 = tcg_const_i32(1 << op1);
-if (c & 0x2) {
-gen_helper_crc32c(tmp, tmp, tmp2, tmp3);
-} else {
-gen_helper_crc32(tmp, tmp, tmp2, tmp3);
-}
-tcg_temp_free_i32(tmp2);
-tcg_temp_free_i32(tmp3);
-store_reg(s, rd, tmp);
-break;
-}
+/* crc32 */
+/* All done in decodetree.  Illegal ops reach here.  */
+goto illegal_op;
 case 0x5:
 /* Saturating addition and subtraction.  */
 /* All done in decodetree.  Reach here for illegal ops.  */
@@ -10181,16 +10202,13 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 goto illegal_op;
 }
 break;
-case 0x20: /* crc32/crc32c */
+case 0x20: /* crc32/crc32c, in decodetree */
 case 0x21:
 case 0x22:
 case 0x28:
 case 0x29:
 case 0x2a:
-if (!dc_isar_feature(aa32_crc32, s)) {
-goto illegal_op;
-}
-break;
+goto illegal_op;
 default:
 goto illegal_op;
 }
@@ -10219,33 +10237,6 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 case 0x18: /* clz */
 tcg_gen_clzi_i32(tmp, tmp, 32);
 break;
-case 0x20:
-case 0x21:
-case 0x22:
-case 0x28:
-case 0x29:
-case 0x2a:
-{
-/* crc32/crc32c */
-uint32_t sz = op & 0x3;
-uint32_t c = op & 0x8;
-
-tmp2 = load_reg(s, rm);
-if (sz == 0) {
-tcg_gen_andi_i32(tmp2, tmp2, 0xff);
-} else if (sz == 1) {
-tcg_gen_andi_i32(tmp2, tmp2, 0xffff);
-}
-tmp3 = tcg_const_i32(1 << sz);
-if (c) {
-gen_helper_crc32c(tmp, tmp, tmp2, tmp3);
-} else {
-gen_helper_crc32(tmp, tmp, tmp2, tmp3);
-}
-

[Qemu-devel] [PATCH v2 11/68] target/arm: Simplify op_smlawx for SMLAW*

2019-08-19 Thread Richard Henderson
By shifting the 16-bit input left by 16, we can align the desired
portion of the 48-bit product and use tcg_gen_muls2_i32.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 8557ef831f..9a2fb7d3aa 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8213,7 +8213,6 @@ DO_SMLAX(SMLALTT, 2, 1, 1)
 static bool op_smlawx(DisasContext *s, arg_rrrr *a, bool add, bool mt)
 {
 TCGv_i32 t0, t1;
-TCGv_i64 t64;
 
 if (!ENABLE_ARCH_5TE) {
 return false;
@@ -8221,16 +8220,17 @@ static bool op_smlawx(DisasContext *s, arg_rrrr *a, bool add, bool mt)
 
 t0 = load_reg(s, a->rn);
 t1 = load_reg(s, a->rm);
+/*
+ * Since the nominal result is product<47:16>, shift the 16-bit
+ * input up by 16 bits, so that the result is at product<63:32>.
+ */
 if (mt) {
-tcg_gen_sari_i32(t1, t1, 16);
+tcg_gen_andi_i32(t1, t1, 0xffff0000);
 } else {
-gen_sxth(t1);
+tcg_gen_shli_i32(t1, t1, 16);
 }
-t64 = gen_muls_i64_i32(t0, t1);
-tcg_gen_shri_i64(t64, t64, 16);
-t1 = tcg_temp_new_i32();
-tcg_gen_extrl_i64_i32(t1, t64);
-tcg_temp_free_i64(t64);
+tcg_gen_muls2_i32(t0, t1, t0, t1);
+tcg_temp_free_i32(t0);
 if (add) {
 t0 = load_reg(s, a->ra);
 gen_helper_add_setq(t1, cpu_env, t1, t0);
-- 
2.17.1
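
The identity that the commit message for this patch relies on can be checked
in plain C. The following is only a sketch outside of TCG: the names
smulw_ref and smulw_shifted are hypothetical and do not appear in the patch.
It compares bits <47:16> of the 32x16 signed product with the high word of
the 32x32 product after the halfword has been moved into bits <31:16>, which
is the half that tcg_gen_muls2_i32 returns as its high output.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: bits <47:16> of the signed 32x16 product. */
static uint32_t smulw_ref(int32_t rn, int16_t half)
{
    int64_t product = (int64_t)rn * half;
    /* Shift as unsigned so the bit extraction is fully defined in C. */
    return (uint32_t)((uint64_t)product >> 16);
}

/*
 * Shifted form: place the halfword in bits <31:16> of a 32-bit operand,
 * multiply, and keep only bits <63:32> of the 64-bit product.
 */
static uint32_t smulw_shifted(int32_t rn, int16_t half)
{
    /* half * 65536 always fits in int32_t, so no overflow occurs here. */
    int32_t shifted = (int32_t)half * 65536;
    int64_t product = (int64_t)rn * shifted;
    return (uint32_t)((uint64_t)product >> 32);
}
```

Both functions return the same value for every input, because the second
product is exactly the first one multiplied by 2^16 and neither overflows
64 bits. In the patch, the bottom halfword is moved up with a left shift,
while the top halfword is already in place and only needs the low bits
masked off.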




[Qemu-devel] [PATCH v2 08/68] target/arm: Convert Saturating addition and subtraction

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 75 +++---
 target/arm/a32.decode  | 10 ++
 target/arm/t32.decode  |  9 +
 3 files changed, 67 insertions(+), 27 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 82bd207799..b731e08fe4 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8099,6 +8099,48 @@ static bool trans_UMAAL(DisasContext *s, arg_UMAAL *a)
 return true;
 }
 
+/*
+ * Saturating addition and subtraction
+ */
+
+static bool op_qaddsub(DisasContext *s, arg_rrr *a, bool add, bool doub)
+{
+TCGv_i32 t0, t1;
+
+if (s->thumb
+? !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)
+: !ENABLE_ARCH_5TE) {
+return false;
+}
+
+t0 = load_reg(s, a->rm);
+t1 = load_reg(s, a->rn);
+if (doub) {
+gen_helper_add_saturate(t1, cpu_env, t1, t1);
+}
+if (add) {
+gen_helper_add_saturate(t0, cpu_env, t0, t1);
+} else {
+gen_helper_sub_saturate(t0, cpu_env, t0, t1);
+}
+tcg_temp_free_i32(t1);
+store_reg(s, a->rd, t0);
+return true;
+}
+
+#define DO_QADDSUB(NAME, ADD, DOUB) \
+static bool trans_##NAME(DisasContext *s, arg_rrr *a)\
+{\
+return op_qaddsub(s, a, ADD, DOUB);  \
+}
+
+DO_QADDSUB(QADD, true, false)
+DO_QADDSUB(QSUB, false, false)
+DO_QADDSUB(QDADD, true, true)
+DO_QADDSUB(QDSUB, false, true)
+
+#undef DO_QADDSUB
+
 /*
  * Legacy decoder.
  */
@@ -8508,21 +8550,10 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 store_reg(s, rd, tmp);
 break;
 }
-case 0x5: /* saturating add/subtract */
-ARCH(5TE);
-rd = (insn >> 12) & 0xf;
-rn = (insn >> 16) & 0xf;
-tmp = load_reg(s, rm);
-tmp2 = load_reg(s, rn);
-if (op1 & 2)
-gen_helper_add_saturate(tmp2, cpu_env, tmp2, tmp2);
-if (op1 & 1)
-gen_helper_sub_saturate(tmp, cpu_env, tmp, tmp2);
-else
-gen_helper_add_saturate(tmp, cpu_env, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
-store_reg(s, rd, tmp);
-break;
+case 0x5:
+/* Saturating addition and subtraction.  */
+/* All done in decodetree.  Reach here for illegal ops.  */
+goto illegal_op;
 case 0x6: /* ERET */
 if (op1 != 3) {
 goto illegal_op;
@@ -9989,18 +10020,8 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn)
 op = ((insn >> 17) & 0x38) | ((insn >> 4) & 7);
 if (op < 4) {
 /* Saturating add/subtract.  */
-if (!arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)) {
-goto illegal_op;
-}
-tmp = load_reg(s, rn);
-tmp2 = load_reg(s, rm);
-if (op & 1)
-gen_helper_add_saturate(tmp, cpu_env, tmp, tmp);
-if (op & 2)
-gen_helper_sub_saturate(tmp, cpu_env, tmp2, tmp);
-else
-gen_helper_add_saturate(tmp, cpu_env, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
+/* All done in decodetree.  Reach here for illegal ops.  */
+goto illegal_op;
 } else {
 switch (op) {
 case 0x0a: /* rbit */
diff --git a/target/arm/a32.decode b/target/arm/a32.decode
index 87bbb2eec2..7791be5590 100644
--- a/target/arm/a32.decode
+++ b/target/arm/a32.decode
@@ -27,6 +27,7 @@
 _rri_rot   s rn rd imm rot
 _  s rd rn rm ra
 rd rn rm ra
+ rd rn rm
 
 # Data-processing (register)
 
@@ -122,3 +123,12 @@ UMULL  100 .    1001   
   @s_rdamn
 UMLAL  101 .    1001  @s_rdamn
 SMULL  110 .    1001  @s_rdamn
 SMLAL  111 .    1001  @s_rdamn
+
+# Saturating addition and subtraction
+
+@rndm   rn:4 rd:4   rm:4  
+
+QADD  0001     0101   @rndm
+QSUB  0001 0010    0101   @rndm
+QDADD 0001 0100    0101   @rndm
+QDSUB 0001 0110    0101   @rndm
diff --git a/target/arm/t32.decode b/target/arm/t32.decode
index 40cc69aee3..7c6226e0af 100644
--- a/target/arm/t32.decode
+++ b/target/arm/t32.decode
@@ -24,6 +24,7 @@
 _rri_rot   !extern s rn rd imm rot
 _  !extern s rd rn rm ra
 !extern rd rn rm ra
+ !extern rd rn rm
 
 # Data-processing (register)
 
@@ -117,6 +118,7 @@ RSB_rri   0.0 1110 .  0 ...     
  @s_rri_rot
 @s0_rnadm 

[Qemu-devel] [PATCH v2 09/68] target/arm: Convert Halfword multiply and multiply accumulate

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 218 +++--
 target/arm/a32.decode  |  20 
 target/arm/t32.decode  |  29 ++
 3 files changed, 170 insertions(+), 97 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index b731e08fe4..56ae83a7d0 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8141,6 +8141,117 @@ DO_QADDSUB(QDSUB, false, true)
 
 #undef DO_QADDSUB
 
+/*
+ * Halfword multiply and multiply accumulate
+ */
+
+static bool op_smlaxxx(DisasContext *s, arg_rrrr *a,
+   int add_long, bool nt, bool mt)
+{
+TCGv_i32 t0, t1;
+TCGv_i64 t64;
+
+if (s->thumb
+? !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)
+: !ENABLE_ARCH_5TE) {
+return false;
+}
+
+t0 = load_reg(s, a->rn);
+t1 = load_reg(s, a->rm);
+gen_mulxy(t0, t1, nt, mt);
+tcg_temp_free_i32(t1);
+
+switch (add_long) {
+case 0:
+store_reg(s, a->rd, t0);
+break;
+case 1:
+t1 = load_reg(s, a->ra);
+gen_helper_add_setq(t0, cpu_env, t0, t1);
+tcg_temp_free_i32(t1);
+store_reg(s, a->rd, t0);
+break;
+case 2:
+t64 = tcg_temp_new_i64();
+tcg_gen_ext_i32_i64(t64, t0);
+tcg_temp_free_i32(t0);
+gen_addq(s, t64, a->ra, a->rd);
+gen_storeq_reg(s, a->ra, a->rd, t64);
+tcg_temp_free_i64(t64);
+break;
+default:
+g_assert_not_reached();
+}
+return true;
+}
+
+#define DO_SMLAX(NAME, add, nt, mt) \
+static bool trans_##NAME(DisasContext *s, arg_rrrr *a) \
+{  \
+return op_smlaxxx(s, a, add, nt, mt);  \
+}
+
+DO_SMLAX(SMULBB, 0, 0, 0)
+DO_SMLAX(SMULBT, 0, 0, 1)
+DO_SMLAX(SMULTB, 0, 1, 0)
+DO_SMLAX(SMULTT, 0, 1, 1)
+
+DO_SMLAX(SMLABB, 1, 0, 0)
+DO_SMLAX(SMLABT, 1, 0, 1)
+DO_SMLAX(SMLATB, 1, 1, 0)
+DO_SMLAX(SMLATT, 1, 1, 1)
+
+DO_SMLAX(SMLALBB, 2, 0, 0)
+DO_SMLAX(SMLALBT, 2, 0, 1)
+DO_SMLAX(SMLALTB, 2, 1, 0)
+DO_SMLAX(SMLALTT, 2, 1, 1)
+
+#undef DO_SMLAX
+
+static bool op_smlawx(DisasContext *s, arg_rrrr *a, bool add, bool mt)
+{
+TCGv_i32 t0, t1;
+TCGv_i64 t64;
+
+if (!ENABLE_ARCH_5TE) {
+return false;
+}
+
+t0 = load_reg(s, a->rn);
+t1 = load_reg(s, a->rm);
+if (mt) {
+tcg_gen_sari_i32(t1, t1, 16);
+} else {
+gen_sxth(t1);
+}
+t64 = gen_muls_i64_i32(t0, t1);
+tcg_gen_shri_i64(t64, t64, 16);
+t1 = tcg_temp_new_i32();
+tcg_gen_extrl_i64_i32(t1, t64);
+tcg_temp_free_i64(t64);
+if (add) {
+t0 = load_reg(s, a->ra);
+gen_helper_add_setq(t1, cpu_env, t1, t0);
+tcg_temp_free_i32(t0);
+}
+store_reg(s, a->rd, t1);
+return true;
+}
+
+#define DO_SMLAWX(NAME, add, mt) \
+static bool trans_##NAME(DisasContext *s, arg_rrrr *a) \
+{  \
+return op_smlawx(s, a, add, mt);   \
+}
+
+DO_SMLAWX(SMULWB, 0, 0)
+DO_SMLAWX(SMULWT, 0, 1)
+DO_SMLAWX(SMLAWB, 1, 0)
+DO_SMLAWX(SMLAWT, 1, 1)
+
+#undef DO_SMLAWX
+
 /*
  * Legacy decoder.
  */
@@ -8607,56 +8718,13 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 }
 break;
 }
-case 0x8: /* signed multiply */
+case 0x8:
 case 0xa:
 case 0xc:
 case 0xe:
-ARCH(5TE);
-rs = (insn >> 8) & 0xf;
-rn = (insn >> 12) & 0xf;
-rd = (insn >> 16) & 0xf;
-if (op1 == 1) {
-/* (32 * 16) >> 16 */
-tmp = load_reg(s, rm);
-tmp2 = load_reg(s, rs);
-if (sh & 4)
-tcg_gen_sari_i32(tmp2, tmp2, 16);
-else
-gen_sxth(tmp2);
-tmp64 = gen_muls_i64_i32(tmp, tmp2);
-tcg_gen_shri_i64(tmp64, tmp64, 16);
-tmp = tcg_temp_new_i32();
-tcg_gen_extrl_i64_i32(tmp, tmp64);
-tcg_temp_free_i64(tmp64);
-if ((sh & 2) == 0) {
-tmp2 = load_reg(s, rn);
-gen_helper_add_setq(tmp, cpu_env, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
-}
-store_reg(s, rd, tmp);
-} else {
-/* 16 * 16 */
-tmp = load_reg(s, rm);
-tmp2 = load_reg(s, rs);
-gen_mulxy(tmp, tmp2, sh & 2, sh & 4);
-tcg_temp_free_i32(tmp2);
-if (op1 == 2) {
-tmp64 = tcg_temp_new_i64();
-tcg_gen_ext_i32_i64(tmp64, tmp);
-tcg_temp_free_i32(tmp);
-gen_addq(s, tmp64, rn, rd);
-gen_storeq_reg(s, rn, rd, tmp64);
-tcg_temp_free_i64(tmp64);
-} else {
-if (op1 

[Qemu-devel] [PATCH v2 07/68] target/arm: Simplify UMAAL

2019-08-19 Thread Richard Henderson
Since all of the inputs and outputs are i32, dispense with
the intermediate promotion to i64 and use tcg_gen_mulu2_i32
and tcg_gen_add2_i32.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 34 --
 1 file changed, 12 insertions(+), 22 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 94659086c0..82bd207799 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7324,21 +7324,6 @@ static void gen_storeq_reg(DisasContext *s, int rlow, int rhigh, TCGv_i64 val)
 store_reg(s, rhigh, tmp);
 }
 
-/* load a 32-bit value from a register and perform a 64-bit accumulate.  */
-static void gen_addq_lo(DisasContext *s, TCGv_i64 val, int rlow)
-{
-TCGv_i64 tmp;
-TCGv_i32 tmp2;
-
-/* Load value and extend to 64 bits.  */
-tmp = tcg_temp_new_i64();
-tmp2 = load_reg(s, rlow);
-tcg_gen_extu_i32_i64(tmp, tmp2);
-tcg_temp_free_i32(tmp2);
-tcg_gen_add_i64(val, val, tmp);
-tcg_temp_free_i64(tmp);
-}
-
 /* load and add a 64-bit value from a register pair.  */
 static void gen_addq(DisasContext *s, TCGv_i64 val, int rlow, int rhigh)
 {
@@ -8090,8 +8075,7 @@ static bool trans_SMLAL(DisasContext *s, arg_SMLAL *a)
 
 static bool trans_UMAAL(DisasContext *s, arg_UMAAL *a)
 {
-TCGv_i32 t0, t1;
-TCGv_i64 t64;
+TCGv_i32 t0, t1, t2, zero;
 
 if (s->thumb
 ? !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)
@@ -8101,11 +8085,17 @@ static bool trans_UMAAL(DisasContext *s, arg_UMAAL *a)
 
 t0 = load_reg(s, a->rm);
 t1 = load_reg(s, a->rn);
-t64 = gen_mulu_i64_i32(t0, t1);
-gen_addq_lo(s, t64, a->ra);
-gen_addq_lo(s, t64, a->rd);
-gen_storeq_reg(s, a->ra, a->rd, t64);
-tcg_temp_free_i64(t64);
+tcg_gen_mulu2_i32(t0, t1, t0, t1);
+zero = tcg_const_i32(0);
+t2 = load_reg(s, a->ra);
+tcg_gen_add2_i32(t0, t1, t0, t1, t2, zero);
+tcg_temp_free_i32(t2);
+t2 = load_reg(s, a->rd);
+tcg_gen_add2_i32(t0, t1, t0, t1, t2, zero);
+tcg_temp_free_i32(t2);
+tcg_temp_free_i32(zero);
+store_reg(s, a->ra, t0);
+store_reg(s, a->rd, t1);
 return true;
 }
 
-- 
2.17.1
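
For readers following the arithmetic, the UMAAL result is
rn * rm + ra + rd, which always fits in 64 bits.  The plain C model
below is only a sketch of that identity, not the TCG code itself:
mulu2_u32 and add2_u32 are made-up helper names standing in for a
32x32->64 multiply and a double-word add, and the result is checked
against a single 64-bit computation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 32x32->64 unsigned multiply, split into halves. */
static void mulu2_u32(uint32_t *lo, uint32_t *hi, uint32_t a, uint32_t b)
{
    uint64_t p = (uint64_t)a * b;

    *lo = (uint32_t)p;
    *hi = (uint32_t)(p >> 32);
}

/* Hypothetical helper: double-word add, (rl, rh) = (al, ah) + (bl, bh). */
static void add2_u32(uint32_t *rl, uint32_t *rh,
                     uint32_t al, uint32_t ah, uint32_t bl, uint32_t bh)
{
    uint32_t lo = al + bl;
    uint32_t carry = lo < al;   /* unsigned wrap-around means carry out */

    *rh = ah + bh + carry;
    *rl = lo;
}

/* UMAAL computed purely with 32-bit halves. */
static uint64_t umaal_halves(uint32_t rn, uint32_t rm,
                             uint32_t ra, uint32_t rd)
{
    uint32_t lo, hi;

    mulu2_u32(&lo, &hi, rm, rn);
    add2_u32(&lo, &hi, lo, hi, ra, 0);   /* add ra, zero-extended */
    add2_u32(&lo, &hi, lo, hi, rd, 0);   /* add rd, zero-extended */
    return ((uint64_t)hi << 32) | lo;
}

/* Reference: the same value computed with one 64-bit expression. */
static uint64_t umaal_wide(uint32_t rn, uint32_t rm,
                           uint32_t ra, uint32_t rd)
{
    return (uint64_t)rn * rm + ra + rd;
}
```

Even with all four inputs at 0x, the sum is exactly 2^64 - 1,
so the carry out of the high word can never be lost.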




[Qemu-devel] [PATCH v2 05/68] target/arm: Convert Data Processing (immediate)

2019-08-19 Thread Richard Henderson
Convert the modified immediate form of the data processing insns.
For A32, we can finally remove any code that was intertwined with
the register and register-shifted-register forms.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 449 +++--
 target/arm/a32.decode  |  29 +++
 target/arm/t32.decode  |  42 
 3 files changed, 186 insertions(+), 334 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index a32fe4b222..b5af38bf84 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -439,12 +439,6 @@ static void gen_add16(TCGv_i32 t0, TCGv_i32 t1)
 tcg_temp_free_i32(t1);
 }
 
-/* Set CF to the top bit of var.  */
-static void gen_set_CF_bit31(TCGv_i32 var)
-{
-tcg_gen_shri_i32(cpu_CF, var, 31);
-}
-
 /* Set N and Z flags from var.  */
 static inline void gen_logic_CC(TCGv_i32 var)
 {
@@ -857,25 +851,6 @@ void arm_gen_test_cc(int cc, TCGLabel *label)
 arm_free_cc();
 }
 
-static const uint8_t table_logic_cc[16] = {
-1, /* and */
-1, /* xor */
-0, /* sub */
-0, /* rsb */
-0, /* add */
-0, /* adc */
-0, /* sbc */
-0, /* rsc */
-1, /* andl */
-1, /* xorl */
-0, /* cmp */
-0, /* cmn */
-1, /* orr */
-1, /* mov */
-1, /* bic */
-1, /* mvn */
-};
-
 static inline void gen_set_condexec(DisasContext *s)
 {
 if (s->condexec_mask) {
@@ -7661,6 +7636,48 @@ static void arm_skip_unless(DisasContext *s, uint32_t 
cond)
 arm_gen_test_cc(cond ^ 1, s->condlabel);
 }
 
+
+/*
+ * Constant expanders for the decoders.
+ */
+
+static int times_2(DisasContext *s, int x)
+{
+return x * 2;
+}
+
+/* Return only the rotation part of T32ExpandImm.  */
+static int t32_expandimm_rot(DisasContext *s, int x)
+{
+return x & 0xc00 ? extract32(x, 7, 5) : 0;
+}
+
+/* Return the unrotated immediate from T32ExpandImm.  */
+static int t32_expandimm_imm(DisasContext *s, int x)
+{
+int imm = extract32(x, 0, 8);
+
+switch (extract32(x, 8, 4)) {
+case 0: /* XY */
+/* Nothing to do.  */
+break;
+case 1: /* 00XY00XY */
+imm *= 0x00010001;
+break;
+case 2: /* XY00XY00 */
+imm *= 0x01000100;
+break;
+case 3: /* XYXYXYXY */
+imm *= 0x01010101;
+break;
+default:
+/* Rotated constant.  */
+imm |= 0x80;
+break;
+}
+return imm;
+}
+
 /*
  * Include the generated decoders.
  */
@@ -7816,23 +7833,82 @@ static bool op_s_rxr_shr(DisasContext *s, arg_s_rrr_shr 
*a,
 return store_reg_kind(s, a->rd, tmp2, kind);
 }
 
+/*
+ * Data-processing (immediate)
+ *
+ * Operate, with set flags, one register source,
+ * one rotated immediate, and a destination.
+ *
+ * Note that logic_cc && a->rot setting CF based on the msb of the
+ * immediate is the reason why we must pass in the unrotated form
+ * of the immediate.
+ */
+static bool op_s_rri_rot(DisasContext *s, arg_s_rri_rot *a,
+ void (*gen)(TCGv_i32, TCGv_i32, TCGv_i32),
+ int logic_cc, StoreRegKind kind)
+{
+TCGv_i32 tmp1, tmp2;
+uint32_t imm;
+
+imm = ror32(a->imm, a->rot);
+if (logic_cc && a->rot) {
+tcg_gen_movi_i32(cpu_CF, imm >> 31);
+}
+tmp2 = tcg_const_i32(imm);
+tmp1 = load_reg(s, a->rn);
+
+gen(tmp1, tmp1, tmp2);
+tcg_temp_free_i32(tmp2);
+
+if (logic_cc) {
+gen_logic_CC(tmp1);
+}
+return store_reg_kind(s, a->rd, tmp1, kind);
+}
+
+static bool op_s_rxi_rot(DisasContext *s, arg_s_rri_rot *a,
+ void (*gen)(TCGv_i32, TCGv_i32),
+ int logic_cc, StoreRegKind kind)
+{
+TCGv_i32 tmp;
+uint32_t imm;
+
+imm = ror32(a->imm, a->rot);
+if (logic_cc && a->rot) {
+tcg_gen_movi_i32(cpu_CF, imm >> 31);
+}
+tmp = tcg_const_i32(imm);
+
+gen(tmp, tmp);
+if (logic_cc) {
+gen_logic_CC(tmp);
+}
+return store_reg_kind(s, a->rd, tmp, kind);
+}
+
 #define DO_ANY3(NAME, OP, L, K) \
 static bool trans_##NAME##_rrri(DisasContext *s, arg_s_rrr_shi *a)  \
 { StoreRegKind k = (K); return op_s_rrr_shi(s, a, OP, L, k); }  \
 static bool trans_##NAME##_(DisasContext *s, arg_s_rrr_shr *a)  \
-{ StoreRegKind k = (K); return op_s_rrr_shr(s, a, OP, L, k); }
+{ StoreRegKind k = (K); return op_s_rrr_shr(s, a, OP, L, k); }  \
+static bool trans_##NAME##_rri(DisasContext *s, arg_s_rri_rot *a)   \
+{ StoreRegKind k = (K); return op_s_rri_rot(s, a, OP, L, k); }
 
 #define DO_ANY2(NAME, OP, L, K) \
 static bool trans_##NAME##_rxri(DisasContext *s, arg_s_rrr_shi *a)  \
 { StoreRegKind k = (K); return op_s_rxr_shi(s, a, OP, L, k); }  \
 static bool trans_##NAME##_rxrr(DisasContext *s, arg_s_rrr_shr *a)  \
-{ StoreRegKind k = (K); return op_s_rxr_shr(s, a, OP, L, k); }
+{ StoreRegKind k = (K); return 

[Qemu-devel] [PATCH v2 02/68] target/arm: Add stubs for aa32 decodetree

2019-08-19 Thread Richard Henderson
Add the infrastructure that will become the new decoder.
No instructions adjusted so far.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c   | 31 ++-
 target/arm/Makefile.objs | 18 ++
 target/arm/a32-uncond.decode | 23 +++
 target/arm/a32.decode| 23 +++
 target/arm/t32.decode| 20 
 5 files changed, 114 insertions(+), 1 deletion(-)
 create mode 100644 target/arm/a32-uncond.decode
 create mode 100644 target/arm/a32.decode
 create mode 100644 target/arm/t32.decode

diff --git a/target/arm/translate.c b/target/arm/translate.c
index db69d998eb..c759fe0797 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7661,6 +7661,18 @@ static void arm_skip_unless(DisasContext *s, uint32_t 
cond)
 arm_gen_test_cc(cond ^ 1, s->condlabel);
 }
 
+/*
+ * Include the generated decoders.
+ */
+
+#include "decode-a32.inc.c"
+#include "decode-a32-uncond.inc.c"
+#include "decode-t32.inc.c"
+
+/*
+ * Legacy decoder.
+ */
+
 static void disas_arm_insn(DisasContext *s, unsigned int insn)
 {
 unsigned int cond, val, op1, i, shift, rm, rs, rn, rd, sh;
@@ -7679,7 +7691,8 @@ static void disas_arm_insn(DisasContext *s, unsigned int 
insn)
 return;
 }
 cond = insn >> 28;
-if (cond == 0xf){
+
+if (cond == 0xf) {
 /* In ARMv3 and v4 the NV condition is UNPREDICTABLE; we
  * choose to UNDEF. In ARMv5 and above the space is used
  * for miscellaneous unconditional instructions.
@@ -7687,6 +7700,11 @@ static void disas_arm_insn(DisasContext *s, unsigned int 
insn)
 ARCH(5);
 
 /* Unconditional instructions.  */
+if (disas_a32_uncond(s, insn)) {
+return;
+}
+/* fall back to legacy decoder */
+
 if (((insn >> 25) & 7) == 1) {
 /* NEON Data processing.  */
 if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
@@ -7901,6 +7919,12 @@ static void disas_arm_insn(DisasContext *s, unsigned int 
insn)
next instruction */
 arm_skip_unless(s, cond);
 }
+
+if (disas_a32(s, insn)) {
+return;
+}
+/* fall back to legacy decoder */
+
 if ((insn & 0x0f90) == 0x0300) {
 if ((insn & (1 << 21)) == 0) {
 ARCH(6T2);
@@ -9386,6 +9410,11 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t 
insn)
 ARCH(6T2);
 }
 
+if (disas_t32(s, insn)) {
+return;
+}
+/* fall back to legacy decoder */
+
 rn = (insn >> 16) & 0xf;
 rs = (insn >> 12) & 0xf;
 rd = (insn >> 8) & 0xf;
diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
index 5cafc1eb6c..7806b4dac0 100644
--- a/target/arm/Makefile.objs
+++ b/target/arm/Makefile.objs
@@ -28,9 +28,27 @@ target/arm/decode-vfp-uncond.inc.c: 
$(SRC_PATH)/target/arm/vfp-uncond.decode $(D
  $(PYTHON) $(DECODETREE) --static-decode disas_vfp_uncond -o $@ $<,\
  "GEN", $(TARGET_DIR)$@)
 
+target/arm/decode-a32.inc.c: $(SRC_PATH)/target/arm/a32.decode $(DECODETREE)
+   $(call quiet-command,\
+ $(PYTHON) $(DECODETREE) --static-decode disas_a32 -o $@ $<,\
+ "GEN", $(TARGET_DIR)$@)
+
+target/arm/decode-a32-uncond.inc.c: $(SRC_PATH)/target/arm/a32-uncond.decode 
$(DECODETREE)
+   $(call quiet-command,\
+ $(PYTHON) $(DECODETREE) --static-decode disas_a32_uncond -o $@ $<,\
+ "GEN", $(TARGET_DIR)$@)
+
+target/arm/decode-t32.inc.c: $(SRC_PATH)/target/arm/t32.decode $(DECODETREE)
+   $(call quiet-command,\
+ $(PYTHON) $(DECODETREE) --static-decode disas_t32 -o $@ $<,\
+ "GEN", $(TARGET_DIR)$@)
+
 target/arm/translate-sve.o: target/arm/decode-sve.inc.c
 target/arm/translate.o: target/arm/decode-vfp.inc.c
 target/arm/translate.o: target/arm/decode-vfp-uncond.inc.c
+target/arm/translate.o: target/arm/decode-a32.inc.c
+target/arm/translate.o: target/arm/decode-a32-uncond.inc.c
+target/arm/translate.o: target/arm/decode-t32.inc.c
 
 obj-y += tlb_helper.o debug_helper.o
 obj-y += translate.o op_helper.o
diff --git a/target/arm/a32-uncond.decode b/target/arm/a32-uncond.decode
new file mode 100644
index 00..8dee26d3b6
--- /dev/null
+++ b/target/arm/a32-uncond.decode
@@ -0,0 +1,23 @@
+# A32 unconditional instructions
+#
+#  Copyright (c) 2019 Linaro, Ltd
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; 

[Qemu-devel] [PATCH v2 10/68] target/arm: Simplify op_smlaxxx for SMLAL*

2019-08-19 Thread Richard Henderson
Since all of the inputs and outputs are i32, dispense with
the intermediate promotion to i64 and use tcg_gen_add2_i32.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 56ae83a7d0..8557ef831f 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8148,8 +8148,7 @@ DO_QADDSUB(QDSUB, false, true)
 static bool op_smlaxxx(DisasContext *s, arg_ *a,
int add_long, bool nt, bool mt)
 {
-TCGv_i32 t0, t1;
-TCGv_i64 t64;
+TCGv_i32 t0, t1, tl, th;
 
 if (s->thumb
 ? !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)
@@ -8173,12 +8172,14 @@ static bool op_smlaxxx(DisasContext *s, arg_ *a,
 store_reg(s, a->rd, t0);
 break;
 case 2:
-t64 = tcg_temp_new_i64();
-tcg_gen_ext_i32_i64(t64, t0);
+tl = load_reg(s, a->ra);
+th = load_reg(s, a->rd);
+t1 = tcg_const_i32(0);
+tcg_gen_add2_i32(tl, th, tl, th, t0, t1);
 tcg_temp_free_i32(t0);
-gen_addq(s, t64, a->ra, a->rd);
-gen_storeq_reg(s, a->ra, a->rd, t64);
-tcg_temp_free_i64(t64);
+tcg_temp_free_i32(t1);
+store_reg(s, a->ra, tl);
+store_reg(s, a->rd, th);
 break;
 default:
 g_assert_not_reached();
-- 
2.17.1
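
As an aid to reading the hunk above: the removed lines sign-extend
the 32-bit product (tcg_gen_ext_i32_i64) before the 64-bit
accumulate.  The plain C model below is a sketch, under the assumption
that the same sign-extended accumulate is wanted, of how that looks
with 32-bit halves.  add2_u32, acc_signed and acc_wide are made-up
names for illustration and are not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: double-word add, (rl, rh) = (al, ah) + (bl, bh). */
static void add2_u32(uint32_t *rl, uint32_t *rh,
                     uint32_t al, uint32_t ah, uint32_t bl, uint32_t bh)
{
    uint32_t lo = al + bl;
    uint32_t carry = lo < al;   /* unsigned wrap-around means carry out */

    *rh = ah + bh + carry;
    *rl = lo;
}

/* Accumulate a signed 32-bit product into the pair (lo, hi). */
static uint64_t acc_signed(uint32_t lo, uint32_t hi, int32_t prod)
{
    /* High word of the sign-extended addend: all ones when negative. */
    uint32_t ext = prod < 0 ? 0xffffffffu : 0;

    add2_u32(&lo, &hi, lo, hi, (uint32_t)prod, ext);
    return ((uint64_t)hi << 32) | lo;
}

/* Reference: the same accumulate done in one 64-bit expression. */
static uint64_t acc_wide(uint32_t lo, uint32_t hi, int32_t prod)
{
    uint64_t acc = ((uint64_t)hi << 32) | lo;

    return acc + (uint64_t)(int64_t)prod;
}
```

With a zero high word in place of ext, the same helper would perform
a zero-extended accumulate, which agrees with the reference only when
the product is non-negative.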




[Qemu-devel] [PATCH v2 04/68] target/arm: Convert Data Processing (reg-shifted-reg)

2019-08-19 Thread Richard Henderson
Convert the register shifted by register form of the data
processing insns.  For A32, we cannot yet remove any code
because the legacy decoder intertwines the immediate form.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 74 ++
 target/arm/a32.decode  | 27 +++
 target/arm/t32.decode  |  6 
 3 files changed, 87 insertions(+), 20 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index be8e7685e3..a32fe4b222 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7773,17 +7773,66 @@ static bool op_s_rxr_shi(DisasContext *s, arg_s_rrr_shi 
*a,
 return store_reg_kind(s, a->rd, tmp, kind);
 }
 
+/*
+ * Data-processing (register-shifted register)
+ *
+ * Operate, with set flags, one register source,
+ * one register shifted register source, and a destination.
+ */
+static bool op_s_rrr_shr(DisasContext *s, arg_s_rrr_shr *a,
+ void (*gen)(TCGv_i32, TCGv_i32, TCGv_i32),
+ int logic_cc, StoreRegKind kind)
+{
+TCGv_i32 tmp1, tmp2;
+
+tmp1 = load_reg(s, a->rs);
+tmp2 = load_reg(s, a->rm);
+gen_arm_shift_reg(tmp2, a->shty, tmp1, logic_cc);
+tmp1 = load_reg(s, a->rn);
+
+gen(tmp1, tmp1, tmp2);
+tcg_temp_free_i32(tmp2);
+
+if (logic_cc) {
+gen_logic_CC(tmp1);
+}
+return store_reg_kind(s, a->rd, tmp1, kind);
+}
+
+static bool op_s_rxr_shr(DisasContext *s, arg_s_rrr_shr *a,
+ void (*gen)(TCGv_i32, TCGv_i32),
+ int logic_cc, StoreRegKind kind)
+{
+TCGv_i32 tmp1, tmp2;
+
+tmp1 = load_reg(s, a->rs);
+tmp2 = load_reg(s, a->rm);
+gen_arm_shift_reg(tmp2, a->shty, tmp1, logic_cc);
+
+gen(tmp2, tmp2);
+if (logic_cc) {
+gen_logic_CC(tmp2);
+}
+return store_reg_kind(s, a->rd, tmp2, kind);
+}
+
 #define DO_ANY3(NAME, OP, L, K) \
 static bool trans_##NAME##_rrri(DisasContext *s, arg_s_rrr_shi *a)  \
-{ StoreRegKind k = (K); return op_s_rrr_shi(s, a, OP, L, k); }
+{ StoreRegKind k = (K); return op_s_rrr_shi(s, a, OP, L, k); }  \
+static bool trans_##NAME##_(DisasContext *s, arg_s_rrr_shr *a)  \
+{ StoreRegKind k = (K); return op_s_rrr_shr(s, a, OP, L, k); }
 
 #define DO_ANY2(NAME, OP, L, K) \
 static bool trans_##NAME##_rxri(DisasContext *s, arg_s_rrr_shi *a)  \
-{ StoreRegKind k = (K); return op_s_rxr_shi(s, a, OP, L, k); }
+{ StoreRegKind k = (K); return op_s_rxr_shi(s, a, OP, L, k); }  \
+static bool trans_##NAME##_rxrr(DisasContext *s, arg_s_rrr_shr *a)  \
+{ StoreRegKind k = (K); return op_s_rxr_shr(s, a, OP, L, k); }
 
 #define DO_CMP2(NAME, OP, L)\
 static bool trans_##NAME##_xrri(DisasContext *s, arg_s_rrr_shi *a)  \
-{ return op_s_rrr_shi(s, a, OP, L, STREG_NONE); }
+{ return op_s_rrr_shi(s, a, OP, L, STREG_NONE); }   \
+static bool trans_##NAME##_xrrr(DisasContext *s, arg_s_rrr_shr *a)  \
+{ return op_s_rrr_shr(s, a, OP, L, STREG_NONE); }
 
 DO_ANY3(AND, tcg_gen_and_i32, a->s, STREG_NORMAL)
 DO_ANY3(EOR, tcg_gen_xor_i32, a->s, STREG_NORMAL)
@@ -9555,7 +9604,6 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t 
insn)
 TCGv_i32 addr;
 TCGv_i64 tmp64;
 int op;
-int logic_cc;
 
 /*
  * ARMv6-M supports a limited subset of Thumb2 instructions.
@@ -9993,22 +10041,8 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t 
insn)
 if (op < 4 && (insn & 0xf000) != 0xf000)
 goto illegal_op;
 switch (op) {
-case 0: /* Register controlled shift.  */
-tmp = load_reg(s, rn);
-tmp2 = load_reg(s, rm);
-if ((insn & 0x70) != 0)
-goto illegal_op;
-/*
- * 0b_1010_0xxx_____:
- *  - MOV, MOVS (register-shifted register), flagsetting
- */
-op = (insn >> 21) & 3;
-logic_cc = (insn & (1 << 20)) != 0;
-gen_arm_shift_reg(tmp, op, tmp2, logic_cc);
-if (logic_cc)
-gen_logic_CC(tmp);
-store_reg(s, rd, tmp);
-break;
+case 0: /* Register controlled shift, in decodetree */
+goto illegal_op;
 case 1: /* Sign/zero extend.  */
 op = (insn >> 20) & 7;
 switch (op) {
diff --git a/target/arm/a32.decode b/target/arm/a32.decode
index b23e83f17c..8e0fb06d05 100644
--- a/target/arm/a32.decode
+++ b/target/arm/a32.decode
@@ -23,6 +23,7 @@
 #
 
 _rrr_shi   s rd rn rm shim shty
+_rrr_shr   s rn rd rm rs shty
 
 # Data-processing (register)
 
@@ -49,3 +50,29 @@ ORR_rrri  000 1100 .   . .. 0    
 @s_rrr_shi
 MOV_rxri  000 1101 .   . .. 0 @s_rxr_shi
 BIC_rrri  000 

[Qemu-devel] [PATCH v2 03/68] target/arm: Convert Data Processing (register)

2019-08-19 Thread Richard Henderson
Convert the register shifted by immediate form of the data
processing insns.  For A32, we cannot yet remove any code
because the legacy decoder intertwines the reg-shifted-reg
and immediate forms.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 229 ++---
 target/arm/a32.decode  |  28 +
 target/arm/t32.decode  |  43 
 3 files changed, 264 insertions(+), 36 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index c759fe0797..be8e7685e3 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7669,6 +7669,197 @@ static void arm_skip_unless(DisasContext *s, uint32_t 
cond)
 #include "decode-a32-uncond.inc.c"
 #include "decode-t32.inc.c"
 
+/* Helpers to swap operands for reverse-subtract.  */
+static void gen_rsb(TCGv_i32 dst, TCGv_i32 a, TCGv_i32 b)
+{
+tcg_gen_sub_i32(dst, b, a);
+}
+
+static void gen_rsb_CC(TCGv_i32 dst, TCGv_i32 a, TCGv_i32 b)
+{
+gen_sub_CC(dst, b, a);
+}
+
+static void gen_rsc(TCGv_i32 dest, TCGv_i32 a, TCGv_i32 b)
+{
+gen_sub_carry(dest, b, a);
+}
+
+static void gen_rsc_CC(TCGv_i32 dest, TCGv_i32 a, TCGv_i32 b)
+{
+gen_sbc_CC(dest, b, a);
+}
+
+/*
+ * Helpers for the data processing routines.
+ *
+ * After the computation store the results back.
+ * This may be suppressed altogether (STREG_NONE), require a runtime
+ * check against the stack limits (STREG_SP_CHECK), or generate an
+ * exception return.  Oh, or store into a register.
+ *
+ * Always return true, indicating success for a trans_* function.
+ */
+typedef enum {
+   STREG_NONE,
+   STREG_NORMAL,
+   STREG_SP_CHECK,
+   STREG_EXC_RET,
+} StoreRegKind;
+
+static bool store_reg_kind(DisasContext *s, int rd,
+TCGv_i32 val, StoreRegKind kind)
+{
+switch (kind) {
+case STREG_NONE:
+tcg_temp_free_i32(val);
+return true;
+case STREG_NORMAL:
+/* See ALUWritePC: Interworking only from a32 mode. */
+if (s->thumb) {
+store_reg(s, rd, val);
+} else {
+store_reg_bx(s, rd, val);
+}
+return true;
+case STREG_SP_CHECK:
+store_sp_checked(s, val);
+return true;
+case STREG_EXC_RET:
+gen_exception_return(s, val);
+return true;
+}
+g_assert_not_reached();
+}
+
+/*
+ * Data Processing (register)
+ *
+ * Operate, with set flags, one register source,
+ * one immediate shifted register source, and a destination.
+ */
+static bool op_s_rrr_shi(DisasContext *s, arg_s_rrr_shi *a,
+ void (*gen)(TCGv_i32, TCGv_i32, TCGv_i32),
+ int logic_cc, StoreRegKind kind)
+{
+TCGv_i32 tmp1, tmp2;
+
+tmp2 = load_reg(s, a->rm);
+gen_arm_shift_im(tmp2, a->shty, a->shim, logic_cc);
+tmp1 = load_reg(s, a->rn);
+
+gen(tmp1, tmp1, tmp2);
+tcg_temp_free_i32(tmp2);
+
+if (logic_cc) {
+gen_logic_CC(tmp1);
+}
+return store_reg_kind(s, a->rd, tmp1, kind);
+}
+
+static bool op_s_rxr_shi(DisasContext *s, arg_s_rrr_shi *a,
+ void (*gen)(TCGv_i32, TCGv_i32),
+ int logic_cc, StoreRegKind kind)
+{
+TCGv_i32 tmp;
+
+tmp = load_reg(s, a->rm);
+gen_arm_shift_im(tmp, a->shty, a->shim, logic_cc);
+
+gen(tmp, tmp);
+if (logic_cc) {
+gen_logic_CC(tmp);
+}
+return store_reg_kind(s, a->rd, tmp, kind);
+}
+
+#define DO_ANY3(NAME, OP, L, K) \
+static bool trans_##NAME##_rrri(DisasContext *s, arg_s_rrr_shi *a)  \
+{ StoreRegKind k = (K); return op_s_rrr_shi(s, a, OP, L, k); }
+
+#define DO_ANY2(NAME, OP, L, K) \
+static bool trans_##NAME##_rxri(DisasContext *s, arg_s_rrr_shi *a)  \
+{ StoreRegKind k = (K); return op_s_rxr_shi(s, a, OP, L, k); }
+
+#define DO_CMP2(NAME, OP, L)\
+static bool trans_##NAME##_xrri(DisasContext *s, arg_s_rrr_shi *a)  \
+{ return op_s_rrr_shi(s, a, OP, L, STREG_NONE); }
+
+DO_ANY3(AND, tcg_gen_and_i32, a->s, STREG_NORMAL)
+DO_ANY3(EOR, tcg_gen_xor_i32, a->s, STREG_NORMAL)
+DO_ANY3(ORR, tcg_gen_or_i32, a->s, STREG_NORMAL)
+DO_ANY3(BIC, tcg_gen_andc_i32, a->s, STREG_NORMAL)
+
+DO_ANY3(RSB, a->s ? gen_rsb_CC : gen_rsb, false, STREG_NORMAL)
+DO_ANY3(ADC, a->s ? gen_adc_CC : gen_add_carry, false, STREG_NORMAL)
+DO_ANY3(SBC, a->s ? gen_sbc_CC : gen_sub_carry, false, STREG_NORMAL)
+DO_ANY3(RSC, a->s ? gen_rsc_CC : gen_rsc, false, STREG_NORMAL)
+
+DO_CMP2(TST, tcg_gen_and_i32, true)
+DO_CMP2(TEQ, tcg_gen_xor_i32, true)
+DO_CMP2(CMN, gen_add_CC, false)
+DO_CMP2(CMP, gen_sub_CC, false)
+
+DO_ANY3(ADD, a->s ? gen_add_CC : tcg_gen_add_i32, false,
+a->rd == 13 && a->rn == 13 ? STREG_SP_CHECK : STREG_NORMAL)
+
+DO_ANY3(SUB, a->s ? gen_sub_CC : tcg_gen_sub_i32, false,
+({
+StoreRegKind ret = STREG_NORMAL;
+if (a->rd == 15 && a->s) {
+  

[Qemu-devel] [PATCH v2 00/68] target/arm: Convert aa32 base isa to decodetree

2019-08-19 Thread Richard Henderson
This unifies the implementation of the actual instructions for
a32, t32, and t16.

This has been tested by running the debian 9 armhf installer,
which does a fair amount of switching between arm and thumb modes.
I've also run Peter's ARM TFM image, and all of the existing
RISU tests that we have.  (Our RISU test cases are nowhere near
complete for 32-bit mode, but it did find 3 bugs, so not useless.)

Based-on: 20190819151743.17267-1-richard.hender...@linaro.org
"[PULL 0/3] decodetree improvements"

Changes from v1:
  * Lots of prep patches merged.
  * Lots of patches split into smaller bits.
Which is why this patch set is larger than v1 despite the merge.
  * Do not use STREG_EXC_RET in Hyp mode (patch 3).
  * Map more UNPREDICTABLE to UNDEF in LDM/STM (patches 28-30).
  * Split gen_nop_hint (patch 59).
  * Do not move single-step check to gen_goto_tb, but do simplify
gen_jmp by inlining gen_bx_im (patch 68).


r~


Richard Henderson (68):
  target/arm: Use store_reg_from_load in thumb2 code
  target/arm: Add stubs for aa32 decodetree
  target/arm: Convert Data Processing (register)
  target/arm: Convert Data Processing (reg-shifted-reg)
  target/arm: Convert Data Processing (immediate)
  target/arm: Convert multiply and multiply accumulate
  target/arm: Simplify UMAAL
  target/arm: Convert Saturating addition and subtraction
  target/arm: Convert Halfword multiply and multiply accumulate
  target/arm: Simplify op_smlaxxx for SMLAL*
  target/arm: Simplify op_smlawx for SMLAW*
  target/arm: Convert MSR (immediate) and hints
  target/arm: Convert MRS/MSR (banked, register)
  target/arm: Convert Cyclic Redundancy Check
  target/arm: Convert BX, BXJ, BLX (register)
  target/arm: Convert CLZ
  target/arm: Convert ERET
  target/arm: Convert the rest of A32 Miscelaneous instructions
  target/arm: Convert T32 ADDW/SUBW
  target/arm: Convert load/store (register, immediate, literal)
  target/arm: Convert Synchronization primitives
  target/arm: Convert USAD8, USADA8, SBFX, UBFX, BFC, BFI, UDF
  target/arm: Convert Parallel addition and subtraction
  target/arm: Convert Packing, unpacking, saturation, and reversal
  target/arm: Convert Signed multiply, signed and unsigned divide
  target/arm: Convert MOVW, MOVT
  target/arm: Convert LDM, STM
  target/arm: Diagnose writeback register in list for LDM for v7
  target/arm: Diagnose too few registers in list for LDM/STM
  target/arm: Diagnose base == pc for LDM/STM
  target/arm: Convert B, BL, BLX (immediate)
  target/arm: Convert SVC
  target/arm: Convert RFE and SRS
  target/arm: Convert Clear-Exclusive, Barriers
  target/arm: Convert CPS (privileged)
  target/arm: Convert SETEND
  target/arm: Convert PLI, PLD, PLDW
  target/arm: Convert Unallocated memory hint
  target/arm: Convert Table Branch
  target/arm: Convert SG
  target/arm: Convert TT
  target/arm: Simplify disas_thumb2_insn
  target/arm: Simplify disas_arm_insn
  target/arm: Add skeleton for T16 decodetree
  target/arm: Convert T16 data-processing (two low regs)
  target/arm: Convert T16 load/store (register offset)
  target/arm: Convert T16 load/store (immediate offset)
  target/arm: Convert T16 add pc/sp (immediate)
  target/arm: Convert T16 load/store multiple
  target/arm: Convert T16 add/sub (3 low, 2 low and imm)
  target/arm: Convert T16 one low register and immediate
  target/arm: Convert T16 branch and exchange
  target/arm: Convert T16 add, compare, move (two high registers)
  target/arm: Convert T16 adjust sp (immediate)
  target/arm: Convert T16, extract
  target/arm: Convert T16, Change processor state
  target/arm: Convert T16, Reverse bytes
  target/arm: Convert T16, nop hints
  target/arm: Split gen_nop_hint
  target/arm: Convert T16, push and pop
  target/arm: Convert T16, Conditional branches, Supervisor call
  target/arm: Convert T16, Miscellaneous 16-bit instructions
  target/arm: Convert T16, shift immediate
  target/arm: Convert T16, load (literal)
  target/arm: Convert T16, Unconditional branch
  target/arm: Convert T16, long branches
  target/arm: Clean up disas_thumb_insn
  target/arm: Inline gen_bx_im into callers

 target/arm/translate.c   | 7068 ++
 target/arm/Makefile.objs |   24 +
 target/arm/a32-uncond.decode |   74 +
 target/arm/a32.decode|  534 +++
 target/arm/t16.decode|  279 ++
 target/arm/t32.decode|  629 +++
 6 files changed, 4536 insertions(+), 4072 deletions(-)
 create mode 100644 target/arm/a32-uncond.decode
 create mode 100644 target/arm/a32.decode
 create mode 100644 target/arm/t16.decode
 create mode 100644 target/arm/t32.decode

-- 
2.17.1




[Qemu-devel] [PATCH v2 01/68] target/arm: Use store_reg_from_load in thumb2 code

2019-08-19 Thread Richard Henderson
This function already includes the test for an interworking write
to PC from a load.  Change the T32 LDM implementation to match the
A32 LDM implementation.

For LDM, the reordering of the tests does not change valid
behaviour because the only case that differs has rn == 15,
which is UNPREDICTABLE.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index d948757131..db69d998eb 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9714,13 +9714,11 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t 
insn)
 /* Load.  */
 tmp = tcg_temp_new_i32();
 gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
-if (i == 15) {
-gen_bx_excret(s, tmp);
-} else if (i == rn) {
+if (i == rn) {
 loaded_var = tmp;
 loaded_base = 1;
 } else {
-store_reg(s, i, tmp);
+store_reg_from_load(s, i, tmp);
 }
 } else {
 /* Store.  */
@@ -10854,11 +10852,7 @@ static void disas_thumb2_insn(DisasContext *s, 
uint32_t insn)
 tcg_temp_free_i32(addr);
 goto illegal_op;
 }
-if (rs == 15) {
-gen_bx_excret(s, tmp);
-} else {
-store_reg(s, rs, tmp);
-}
+store_reg_from_load(s, rs, tmp);
 } else {
 /* Store.  */
 tmp = load_reg(s, rs);
-- 
2.17.1




[Qemu-devel] [PATCH v2 06/68] target/arm: Convert multiply and multiply accumulate

2019-08-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 248 +++--
 target/arm/a32.decode  |  17 +++
 target/arm/t32.decode  |  19 
 3 files changed, 177 insertions(+), 107 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index b5af38bf84..94659086c0 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7990,6 +7990,125 @@ static bool trans_ORN_rri(DisasContext *s, 
arg_s_rri_rot *a)
 #undef DO_ANY2
 #undef DO_CMP2
 
+/*
+ * Multiply and multiply accumulate
+ */
+
+static bool op_mla(DisasContext *s, arg_s_ *a, bool add)
+{
+TCGv_i32 t1, t2;
+
+t1 = load_reg(s, a->rn);
+t2 = load_reg(s, a->rm);
+tcg_gen_mul_i32(t1, t1, t2);
+tcg_temp_free_i32(t2);
+if (add) {
+t2 = load_reg(s, a->ra);
+tcg_gen_add_i32(t1, t1, t2);
+tcg_temp_free_i32(t2);
+}
+if (a->s) {
+gen_logic_CC(t1);
+}
+store_reg(s, a->rd, t1);
+return true;
+}
+
+static bool trans_MUL(DisasContext *s, arg_MUL *a)
+{
+return op_mla(s, a, false);
+}
+
+static bool trans_MLA(DisasContext *s, arg_MLA *a)
+{
+return op_mla(s, a, true);
+}
+
+static bool trans_MLS(DisasContext *s, arg_MLS *a)
+{
+TCGv_i32 t1, t2;
+
+if (!ENABLE_ARCH_6T2) {
+return false;
+}
+t1 = load_reg(s, a->rn);
+t2 = load_reg(s, a->rm);
+tcg_gen_mul_i32(t1, t1, t2);
+tcg_temp_free_i32(t2);
+t2 = load_reg(s, a->ra);
+tcg_gen_sub_i32(t1, t2, t1);
+tcg_temp_free_i32(t2);
+store_reg(s, a->rd, t1);
+return true;
+}
+
+static bool op_mlal(DisasContext *s, arg_s_ *a, bool uns, bool add)
+{
+TCGv_i32 t0, t1, t2, t3;
+
+t0 = load_reg(s, a->rm);
+t1 = load_reg(s, a->rn);
+if (uns) {
+tcg_gen_mulu2_i32(t0, t1, t0, t1);
+} else {
+tcg_gen_muls2_i32(t0, t1, t0, t1);
+}
+if (add) {
+t2 = load_reg(s, a->ra);
+t3 = load_reg(s, a->rd);
+tcg_gen_add2_i32(t0, t1, t0, t1, t2, t3);
+tcg_temp_free_i32(t2);
+tcg_temp_free_i32(t3);
+}
+if (a->s) {
+gen_logicq_cc(t0, t1);
+}
+store_reg(s, a->ra, t0);
+store_reg(s, a->rd, t1);
+return true;
+}
+
+static bool trans_UMULL(DisasContext *s, arg_UMULL *a)
+{
+return op_mlal(s, a, true, false);
+}
+
+static bool trans_SMULL(DisasContext *s, arg_SMULL *a)
+{
+return op_mlal(s, a, false, false);
+}
+
+static bool trans_UMLAL(DisasContext *s, arg_UMLAL *a)
+{
+return op_mlal(s, a, true, true);
+}
+
+static bool trans_SMLAL(DisasContext *s, arg_SMLAL *a)
+{
+return op_mlal(s, a, false, true);
+}
+
+static bool trans_UMAAL(DisasContext *s, arg_UMAAL *a)
+{
+TCGv_i32 t0, t1;
+TCGv_i64 t64;
+
+if (s->thumb
+? !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)
+: !ENABLE_ARCH_6) {
+return false;
+}
+
+t0 = load_reg(s, a->rm);
+t1 = load_reg(s, a->rn);
+t64 = gen_mulu_i64_i32(t0, t1);
+gen_addq_lo(s, t64, a->ra);
+gen_addq_lo(s, t64, a->rd);
+gen_storeq_reg(s, a->ra, a->rd, t64);
+tcg_temp_free_i64(t64);
+return true;
+}
+
 /*
  * Legacy decoder.
  */
@@ -8536,71 +8655,9 @@ static void disas_arm_insn(DisasContext *s, unsigned int 
insn)
 sh = (insn >> 5) & 3;
 if (sh == 0) {
 if (op1 == 0x0) {
-rd = (insn >> 16) & 0xf;
-rn = (insn >> 12) & 0xf;
-rs = (insn >> 8) & 0xf;
-rm = (insn) & 0xf;
-op1 = (insn >> 20) & 0xf;
-switch (op1) {
-case 0: case 1: case 2: case 3: case 6:
-/* 32 bit mul */
-tmp = load_reg(s, rs);
-tmp2 = load_reg(s, rm);
-tcg_gen_mul_i32(tmp, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
-if (insn & (1 << 22)) {
-/* Subtract (mls) */
-ARCH(6T2);
-tmp2 = load_reg(s, rn);
-tcg_gen_sub_i32(tmp, tmp2, tmp);
-tcg_temp_free_i32(tmp2);
-} else if (insn & (1 << 21)) {
-/* Add */
-tmp2 = load_reg(s, rn);
-tcg_gen_add_i32(tmp, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
-}
-if (insn & (1 << 20))
-gen_logic_CC(tmp);
-store_reg(s, rd, tmp);
-break;
-case 4:
-/* 64 bit mul double accumulate (UMAAL) */
-ARCH(6);
-tmp = load_reg(s, rs);
-tmp2 = load_reg(s, rm);
-tmp64 = gen_mulu_i64_i32(tmp, tmp2);
- 

Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure

2019-08-19 Thread Eric Blake
On 8/19/19 3:53 PM, Denis V. Lunev wrote:

 Or even better, fix the call site of fallocate() to skip attempting an
 unaligned fallocate(), and just directly return ENOTSUP, rather than
 trying to diagnose EINVAL after the fact.

>>> No way. Single ENOTSUP will turn off fallocate() support on caller side
>>> while
>>> aligned (99.99% of calls) works normally.
>> I didn't mean skip fallocate() unconditionally, only when unaligned:
>>
>> if (request not aligned enough)
>>return -ENOTSUP;
>> fallocate() ...
>>
>> so that the 99.99% requests that ARE aligned get to use fallocate()
>> normally.
>>
> static int handle_aiocb_write_zeroes(void *opaque)
> {
> ...
> #ifdef CONFIG_FALLOCATE_ZERO_RANGE
>     if (s->has_write_zeroes) {
>     int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
>    aiocb->aio_offset, aiocb->aio_nbytes);
>     if (ret == 0 || ret != -ENOTSUP) {
>     return ret;
>     }
>     s->has_write_zeroes = false;
>     }
> #endif
> 
> thus, right now, single ENOTSUP disables fallocate
> functionality completely setting s->has_write_zeroes
> to false and that is pretty much correct.
> 
> ENOTSUP is "static" error code which returns persistent
> ENOTSUP under any consequences.

Not always true. And the block layer doesn't expect it to be true. It is
perfectly fine for one invocation to return ENOTSUP ('I can't handle
this request, so fall back to pwrite for me') and the next to just work
('this one was aligned, so I handled it just fine').  It just means that
you have to be more careful with the logic: never set
s->has_write_zeroes=false if you skipped the fallocate, or if the
fallocate failed due to EINVAL rather than ENOTSUP (but still report
ENOTSUP to the block layer, to document that you want the EINVAL for
unaligned request to be turned into a fallback to pwrite).
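
Roughly, the logic I have in mind looks like the sketch below.  It is
standalone C rather than the real file-posix.c code: ZeroState,
zero_range_fn, try_zero_range and the backend_* stubs are made-up
names for illustration, and the alignment field is an assumption
about how the required alignment would be known.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-file state. */
typedef struct {
    bool has_write_zeroes;
    uint64_t align;             /* assumed alignment, a power of two */
} ZeroState;

/* Hypothetical backend hook: returns 0 or a negative errno. */
typedef int (*zero_range_fn)(uint64_t offset, uint64_t bytes);

static int try_zero_range(ZeroState *s, zero_range_fn do_zero,
                          uint64_t offset, uint64_t bytes)
{
    int ret;

    if (!s->has_write_zeroes) {
        return -ENOTSUP;
    }
    if ((offset | bytes) & (s->align - 1)) {
        /* Unaligned: skip the call, request fallback, keep the flag. */
        return -ENOTSUP;
    }
    ret = do_zero(offset, bytes);
    if (ret == -EINVAL) {
        /* Per-request refusal: request fallback, keep the flag. */
        return -ENOTSUP;
    }
    if (ret == -ENOTSUP) {
        /* Genuinely unsupported: stop trying on later requests. */
        s->has_write_zeroes = false;
    }
    return ret;
}

/* Stub backends used only to exercise the three outcomes. */
static int backend_ok(uint64_t offset, uint64_t bytes)
{
    (void)offset; (void)bytes;
    return 0;
}

static int backend_einval(uint64_t offset, uint64_t bytes)
{
    (void)offset; (void)bytes;
    return -EINVAL;
}

static int backend_enotsup(uint64_t offset, uint64_t bytes)
{
    (void)offset; (void)bytes;
    return -ENOTSUP;
}
```

The point is that only a real ENOTSUP from the call clears the flag;
the other two paths report ENOTSUP for that one request and leave
later aligned requests free to use fallocate().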

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [for-4.2 PATCH 0/2] PCI DMA alias support

2019-08-19 Thread Alex Williamson
On Mon, 29 Jul 2019 15:15:29 -0400
"Michael S. Tsirkin"  wrote:

> On Fri, Jul 26, 2019 at 06:55:27PM -0600, Alex Williamson wrote:
> > Please see patch 1/ for the motivation and utility of this series.
> > This v1 submission improves on the previous RFC with revised commit
> > logs, comments, and more testing, and the missing IVRS support for DMA
> > alias ranges is now included.  Testing has been done with Linux guests
> > with both SeaBIOS and OVMF with configurations of intel-iommu and
> > amd-iommu.  Intel-iommu testing includes device assignment, amd-iommu
> > is necessarily limited to emulated devices with interrupt remapping
> > disabled and iommu=pt in the guest (enabling interrupt remapping or
> > disabling guest passthrough mode fails to work regardless of this
> > series).  This series is NOT intended for QEMU v4.1.  Thanks,
> > 
> > Alex  
> 
> 
> series looks good to me.
> pls ping when 4.1 is out and I'll queue it.

Here's the requested ping :)  If you'd like a re-posting or comment
update, just say so.  I think Peter was ultimately satisfied enough to
not request a re-spin for comments alone.  Thanks,

Alex

> > ---
> > 
> > Alex Williamson (2):
> >   pci: Use PCI aliases when determining device IOMMU address space
> >   hw/i386: AMD-Vi IVRS DMA alias support
> > 
> > 
> >  hw/i386/acpi-build.c |  127 
> > +++---
> >  hw/pci/pci.c |   43 -
> >  2 files changed, 160 insertions(+), 10 deletions(-)  




[Qemu-devel] qemu icount mode timer accuracy

2019-08-19 Thread Wu, Wentong


Could anyone please give some comments? Thanks in advance!



Hi,

Recently I've been working on enabling QEMU icount mode with TCG. From
reviewing the source code, I found that QEMU can give deterministic execution
for guest code timeouts. But I have a question about the exact time point seen
by the guest OS:

Taking armv7m_systick.c as an example, the guest OS will use systick_read,
which calls "t = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);" to calculate its
exact time point, and qemu_clock_get_ns will use qemu_icount. But from
qemu_tcg_rr_cpu_thread_fn { prepare_icount_for_run(cpu); r = tcg_cpu_exec(cpu);
process_icount_data(cpu);}, we know QEMU only updates the qemu_icount value
after tcg_cpu_exec, so within each tcg_cpu_exec execution the qemu_icount
value stays the same, and guest code will then get the same time point for
that one TCG execution. Can someone confirm that?


Re: [Qemu-devel] [qemu-s390x] [PATCH v7 33/42] exec: Replace device_endian with MemOp

2019-08-19 Thread Richard Henderson
On 8/19/19 11:29 AM, Paolo Bonzini wrote:
> On 19/08/19 20:28, Paolo Bonzini wrote:
>> On 16/08/19 12:12, Thomas Huth wrote:
>>> This patch is *huge*, more than 800kB. It keeps being stuck in the
>>> filter of the qemu-s390x list each time you send it. Please:
>>>
>>> 1) Try to break it up in more digestible pieces, e.g. change only one
>>> subsystem at a time (this is also better reviewable by people who are
>>> interested in one area)
>>
>> This is not really possible, since the patch is basically a
>> search-and-replace.  You could perhaps use some magic
>> ("DEVICE_MEMOP_ENDIAN" or something like that) to allow a split, but it
>> would introduce more complication than anything else.
> 
> I'm stupid, at this point of the series it _would_ be possible to split
> the patch by subsystem.  Still not sure it would be actually an advantage.

It might be easier to review if we split by symbol, one rename per patch over
the entire code base.


r~



Re: [Qemu-devel] [PATCH] ppc: Fix emulated INFINITY and NAN conversions

2019-08-19 Thread Richard Henderson
On 8/19/19 12:19 PM, Paul A. Clarke wrote:
> From: "Paul A. Clarke" 
> 
> helper_todouble() was not properly converting INFINITY from 32 bit
> float to 64 bit double.
> 
> (Normalized operand conversion is unchanged, other than indentation.)
> 
> Signed-off-by: Paul A. Clarke 
> ---
>  target/ppc/fpu_helper.c | 15 +++
>  1 file changed, 11 insertions(+), 4 deletions(-)

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH] block/io.c: fix for the allocation failure

2019-08-19 Thread Denis V. Lunev
On 8/19/19 11:30 PM, Eric Blake wrote:
> On 8/19/19 2:46 PM, Denis V. Lunev wrote:
>> On 8/17/19 5:56 PM, Eric Blake wrote:
>>> On 8/17/19 9:49 AM, Eric Blake wrote:
>>>
> This change is a regression of sorts.  Now, you are unconditionally
> attempting the fallback for ALL failures (such as EIO) and for all
> drivers, even when that was not previously attempted and increases the
> traffic.  I think we should revert this patch and instead fix the
> fallocate() path to convert whatever ACTUAL errno you got from unaligned
> fallocate failure into ENOTSUP (that is, just the file-posix.c location
> that failed), while leaving all other errors as immediately fatal.
>>> Or even better, fix the call site of fallocate() to skip attempting an
>>> unaligned fallocate(), and just directly return ENOTSUP, rather than
>>> trying to diagnose EINVAL after the fact.
>>>
>> No way. Single ENOTSUP will turn off fallocate() support on caller side
>> while aligned (99.99% of calls) works normally.
> I didn't mean skip fallocate() unconditionally, only when unaligned:
>
> if (request not aligned enough)
>return -ENOTSUP;
> fallocate() ...
>
> so that the 99.99% requests that ARE aligned get to use fallocate()
> normally.
>
static int handle_aiocb_write_zeroes(void *opaque)
{
...
#ifdef CONFIG_FALLOCATE_ZERO_RANGE
    if (s->has_write_zeroes) {
        int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
                               aiocb->aio_offset, aiocb->aio_nbytes);
        if (ret == 0 || ret != -ENOTSUP) {
            return ret;
        }
        s->has_write_zeroes = false;
    }
#endif

Thus, right now, a single ENOTSUP disables fallocate
functionality completely by setting s->has_write_zeroes
to false, and that is pretty much correct.

ENOTSUP is a "static" error code: once returned, it is
returned persistently under any circumstances. Handling it
usually disables some functionality.

This is why the original idea was proposed.

Den


[Qemu-devel] qemu icount mode timer accuracy

2019-08-19 Thread Wu, Wentong


Could you please give some comments about this? Thanks a lot!



Re: [Qemu-devel] [PATCH v4 18/28] riscv: sifive_u: Generate hfclk and rtcclk nodes

2019-08-19 Thread Alistair Francis
On Sun, Aug 18, 2019 at 10:29 PM Bin Meng  wrote:
>
> To keep in sync with Linux kernel device tree, generate hfclk and
> rtcclk nodes in the device tree, to be referenced by PRCI node.
>
> Signed-off-by: Bin Meng 

Reviewed-by: Alistair Francis 

Alistair

> ---
>
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None
>
>  hw/riscv/sifive_u.c | 23 +++
>  include/hw/riscv/sifive_u.h |  2 ++
>  2 files changed, 25 insertions(+)
>
> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> index 284f7a5..08db741 100644
> --- a/hw/riscv/sifive_u.c
> +++ b/hw/riscv/sifive_u.c
> @@ -80,6 +80,7 @@ static void create_fdt(SiFiveUState *s, const struct 
> MemmapEntry *memmap,
>  char ethclk_names[] = "pclk\0hclk\0tx_clk";
>  uint32_t plic_phandle, ethclk_phandle, phandle = 1;
>  uint32_t uartclk_phandle;
> +uint32_t hfclk_phandle, rtcclk_phandle;
>
>  fdt = s->fdt = create_device_tree(&s->fdt_size);
>  if (!fdt) {
> @@ -98,6 +99,28 @@ static void create_fdt(SiFiveUState *s, const struct 
> MemmapEntry *memmap,
>  qemu_fdt_setprop_cell(fdt, "/soc", "#size-cells", 0x2);
>  qemu_fdt_setprop_cell(fdt, "/soc", "#address-cells", 0x2);
>
> +hfclk_phandle = phandle++;
> +nodename = g_strdup_printf("/hfclk");
> +qemu_fdt_add_subnode(fdt, nodename);
> +qemu_fdt_setprop_cell(fdt, nodename, "phandle", hfclk_phandle);
> +qemu_fdt_setprop_string(fdt, nodename, "clock-output-names", "hfclk");
> +qemu_fdt_setprop_cell(fdt, nodename, "clock-frequency",
> +SIFIVE_U_HFCLK_FREQ);
> +qemu_fdt_setprop_string(fdt, nodename, "compatible", "fixed-clock");
> +qemu_fdt_setprop_cell(fdt, nodename, "#clock-cells", 0x0);
> +g_free(nodename);
> +
> +rtcclk_phandle = phandle++;
> +nodename = g_strdup_printf("/rtcclk");
> +qemu_fdt_add_subnode(fdt, nodename);
> +qemu_fdt_setprop_cell(fdt, nodename, "phandle", rtcclk_phandle);
> +qemu_fdt_setprop_string(fdt, nodename, "clock-output-names", "rtcclk");
> +qemu_fdt_setprop_cell(fdt, nodename, "clock-frequency",
> +SIFIVE_U_RTCCLK_FREQ);
> +qemu_fdt_setprop_string(fdt, nodename, "compatible", "fixed-clock");
> +qemu_fdt_setprop_cell(fdt, nodename, "#clock-cells", 0x0);
> +g_free(nodename);
> +
>  nodename = g_strdup_printf("/memory@%lx",
>  (long)memmap[SIFIVE_U_DRAM].base);
>  qemu_fdt_add_subnode(fdt, nodename);
> diff --git a/include/hw/riscv/sifive_u.h b/include/hw/riscv/sifive_u.h
> index 7a1a4f3..debbf28 100644
> --- a/include/hw/riscv/sifive_u.h
> +++ b/include/hw/riscv/sifive_u.h
> @@ -68,6 +68,8 @@ enum {
>
>  enum {
>  SIFIVE_U_CLOCK_FREQ = 10,
> +SIFIVE_U_HFCLK_FREQ = ,
> +SIFIVE_U_RTCCLK_FREQ = 100,
>  SIFIVE_U_GEM_CLOCK_FREQ = 12500
>  };
>
> --
> 2.7.4
>
>



Re: [Qemu-devel] [PATCH v3 2/8] iotests: Prefer null-co over null-aio

2019-08-19 Thread Max Reitz
On 19.08.19 22:18, Max Reitz wrote:
> We use null-co basically everywhere in the iotests.  Unless we want to
> test null-aio specifically, we should use it instead (for consistency).
> 
> Signed-off-by: Max Reitz 
> Reviewed-by: John Snow 

Hm, sorry, I just noticed that I probably should have dropped this R-b. :-/

(I mean, apart from the rebase conflict, nothing has changed, but still.)

Max

> ---
>  tests/qemu-iotests/093 | 7 +++
>  tests/qemu-iotests/245 | 2 +-
>  2 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/tests/qemu-iotests/093 b/tests/qemu-iotests/093
> index 3c4f5173ce..50c1e7f2ec 100755
> --- a/tests/qemu-iotests/093
> +++ b/tests/qemu-iotests/093
> @@ -267,13 +267,12 @@ class ThrottleTestCoroutine(ThrottleTestCase):
>  test_img = "null-co://"
>  
>  class ThrottleTestGroupNames(iotests.QMPTestCase):
> -test_img = "null-aio://"
>  max_drives = 3
>  
>  def setUp(self):
>  self.vm = iotests.VM()
>  for i in range(0, self.max_drives):
> -self.vm.add_drive(self.test_img,
> +self.vm.add_drive("null-co://",
>
> "throttling.iops-total=100,file.read-zeroes=on")
>  self.vm.launch()
>  
> @@ -376,10 +375,10 @@ class ThrottleTestRemovableMedia(iotests.QMPTestCase):
>  
>  def test_removable_media(self):
>  # Add a couple of dummy nodes named cd0 and cd1
> -result = self.vm.qmp("blockdev-add", driver="null-aio",
> +result = self.vm.qmp("blockdev-add", driver="null-co",
>   read_zeroes=True, node_name="cd0")
>  self.assert_qmp(result, 'return', {})
> -result = self.vm.qmp("blockdev-add", driver="null-aio",
> +result = self.vm.qmp("blockdev-add", driver="null-co",
>   read_zeroes=True, node_name="cd1")
>  self.assert_qmp(result, 'return', {})
>  
> diff --git a/tests/qemu-iotests/245 b/tests/qemu-iotests/245
> index bc1ceb9792..ae169778b0 100644
> --- a/tests/qemu-iotests/245
> +++ b/tests/qemu-iotests/245
> @@ -598,7 +598,7 @@ class TestBlockdevReopen(iotests.QMPTestCase):
>  ##
>  ## null ##
>  ##
> -opts = {'driver': 'null-aio', 'node-name': 'root', 'size': 1024}
> +opts = {'driver': 'null-co', 'node-name': 'root', 'size': 1024}
>  
>  result = self.vm.qmp('blockdev-add', conv_keys = False, **opts)
>  self.assert_qmp(result, 'return', {})
> 






[Qemu-devel] [PATCH v3 7/8] iotests: Test driver whitelisting in 136

2019-08-19 Thread Max Reitz
null-aio may not be whitelisted.  Skip all test cases that require it.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/136 | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/136 b/tests/qemu-iotests/136
index a46a7b7630..012ea111ac 100755
--- a/tests/qemu-iotests/136
+++ b/tests/qemu-iotests/136
@@ -30,7 +30,7 @@ bad_offset = bad_sector * 512
 blkdebug_file = os.path.join(iotests.test_dir, 'blkdebug.conf')
 
 class BlockDeviceStatsTestCase(iotests.QMPTestCase):
-    test_img = "null-aio://"
+    test_driver = "null-aio"
     total_rd_bytes = 0
     total_rd_ops = 0
     total_wr_bytes = 0
@@ -67,6 +67,10 @@ sector = "%d"
 ''' % (bad_sector, bad_sector))
         file.close()
 
+    def required_drivers(self):
+        return [self.test_driver]
+
+    @iotests.skip_if_unsupported(required_drivers)
     def setUp(self):
         drive_args = []
         drive_args.append("stats-intervals.0=%d" % interval_length)
@@ -76,8 +80,8 @@ sector = "%d"
                           (self.account_failed and "on" or "off"))
         drive_args.append("file.image.read-zeroes=on")
         self.create_blkdebug_file()
-        self.vm = iotests.VM().add_drive('blkdebug:%s:%s' %
-                                         (blkdebug_file, self.test_img),
+        self.vm = iotests.VM().add_drive('blkdebug:%s:%s://' %
+                                         (blkdebug_file, self.test_driver),
                                          ','.join(drive_args))
         self.vm.launch()
         # Set an initial value for the clock
@@ -337,7 +341,9 @@ class BlockDeviceStatsTestAccountBoth(BlockDeviceStatsTestCase):
     account_failed = True
 
 class BlockDeviceStatsTestCoroutine(BlockDeviceStatsTestCase):
-    test_img = "null-co://"
+    test_driver = "null-co"
 
 if __name__ == '__main__':
+    if 'null-co' not in iotests.supported_formats():
+        iotests.notrun('null-co driver support missing')
     iotests.main(supported_fmts=["raw"])
-- 
2.21.0



