Re: [PATCH for-5.1 00/31] target/arm: SVE2, part 1

2020-04-21 Thread LIU Zhiwei

Hi Richard,

I find BF16 is included in the ISA.  Will you extend the softfpu in
this patch set?


Zhiwei

On 2020/3/27 7:08, Richard Henderson wrote:

Posting this for early review.  It's based on some other patch
sets that I have posted recently that also touch SVE, listed
below.  But it might just be easier to clone the devel tree [2].
While the branch itself will rebase frequently for development,
I've also created a tag, post-sve2-20200326, for this posting.

This is mostly untested, as the most recently released Foundation
Model does not support SVE2.  Some of the new instructions overlap
with old fashioned NEON, and I can verify that those have not
broken, and show that SVE2 will use the same code path.  But the
predicated insns and bottom/top interleaved insns are not yet
RISU testable, as I have nothing to compare against.

The patches are in general arranged so that one complete group
of insns is added at once.  The groups within the manual [1]
have so far been small-ish.


r~

---

[1] ISA manual: 
https://static.docs.arm.com/ddi0602/d/ISA_A64_xml_futureA-2019-12_OPT.pdf

[2] Devel tree: https://github.com/rth7680/qemu/tree/tgt-arm-sve-2

Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=163610
("target/arm: sve load/store improvements")

Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=164500
("target/arm: Use tcg_gen_gvec_5_ptr for sve FMLA/FCMLA")

Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=164048
("target/arm: Implement ARMv8.5-MemTag, system mode")

Richard Henderson (31):
   target/arm: Add ID_AA64ZFR0 fields and isar_feature_aa64_sve2
   target/arm: Implement SVE2 Integer Multiply - Unpredicated
   target/arm: Implement SVE2 integer pairwise add and accumulate long
   target/arm: Remove fp_status from helper_{recpe,rsqrte}_u32
   target/arm: Implement SVE2 integer unary operations (predicated)
   target/arm: Split out saturating/rounding shifts from neon
   target/arm: Implement SVE2 saturating/rounding bitwise shift left
 (predicated)
   target/arm: Implement SVE2 integer halving add/subtract (predicated)
   target/arm: Implement SVE2 integer pairwise arithmetic
   target/arm: Implement SVE2 saturating add/subtract (predicated)
   target/arm: Implement SVE2 integer add/subtract long
   target/arm: Implement SVE2 integer add/subtract interleaved long
   target/arm: Implement SVE2 integer add/subtract wide
   target/arm: Implement SVE2 integer multiply long
   target/arm: Implement PMULLB and PMULLT
   target/arm: Tidy SVE tszimm shift formats
   target/arm: Implement SVE2 bitwise shift left long
   target/arm: Implement SVE2 bitwise exclusive-or interleaved
   target/arm: Implement SVE2 bitwise permute
   target/arm: Implement SVE2 complex integer add
   target/arm: Implement SVE2 integer absolute difference and accumulate
 long
   target/arm: Implement SVE2 integer add/subtract long with carry
   target/arm: Create arm_gen_gvec_[us]sra
   target/arm: Create arm_gen_gvec_{u,s}{rshr,rsra}
   target/arm: Implement SVE2 bitwise shift right and accumulate
   target/arm: Create arm_gen_gvec_{sri,sli}
   target/arm: Tidy handle_vec_simd_shri
   target/arm: Implement SVE2 bitwise shift and insert
   target/arm: Vectorize SABD/UABD
   target/arm: Vectorize SABA/UABA
   target/arm: Implement SVE2 integer absolute difference and accumulate

  target/arm/cpu.h           |  31 ++
  target/arm/helper-sve.h    | 345 +
  target/arm/helper.h        |  81 +++-
  target/arm/translate-a64.h |   9 +
  target/arm/translate.h     |  24 +-
  target/arm/vec_internal.h  | 161
  target/arm/sve.decode      | 217 ++-
  target/arm/helper.c        |   3 +-
  target/arm/kvm64.c         |   2 +
  target/arm/neon_helper.c   | 515 -
  target/arm/sve_helper.c    | 757 ++---
  target/arm/translate-a64.c | 557 +++
  target/arm/translate-sve.c | 557 +++
  target/arm/translate.c     | 626 ++
  target/arm/vec_helper.c    | 411
  target/arm/vfp_helper.c    |   4 +-
  16 files changed, 3532 insertions(+), 768 deletions(-)
  create mode 100644 target/arm/vec_internal.h







Re: [PATCH for-5.1 00/31] target/arm: SVE2, part 1

2020-04-21 Thread Richard Henderson
On 4/21/20 7:51 PM, LIU Zhiwei wrote:
> I find BF16 is included in the ISA.  Will you extend the softfpu in this
> patch set?

I will do that eventually, but probably not as part of the first full SVE2
patch set.

There are several optional extensions to SVE2, of which BF16 is one.  But BF16
also requires changes to the normal FPU, and Arm requires that SVE and FPU
be in sync.


r~



RE: [PATCH for-5.1 00/31] target/arm: SVE2, part 1

2020-04-01 Thread Ana Pazos
Hello Richard,

I would like to introduce you to Stephen Long, our new hire who started this
week.

Are you available for a sync-up meeting to discuss how we can cooperate on
QEMU SVE2 support?

Thank you,
Ana.

-----Original Message-----
From: Richard Henderson 
Sent: Thursday, March 26, 2020 4:08 PM
To: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org; Ana Pazos ; Raja Venkateswaran 

Subject: [PATCH for-5.1 00/31] target/arm: SVE2, part 1

Posting this for early review.  It's based on some other patch sets that I have 
posted recently that also touch SVE, listed below.  But it might just be easier 
to clone the devel tree [2].
While the branch itself will rebase frequently for development, I've also 
created a tag, post-sve2-20200326, for this posting.

This is mostly untested, as the most recently released Foundation Model does 
not support SVE2.  Some of the new instructions overlap with old fashioned 
NEON, and I can verify that those have not broken, and show that SVE2 will use 
the same code path.  But the predicated insns and bottom/top interleaved insns 
are not yet RISU testable, as I have nothing to compare against.

The patches are in general arranged so that one complete group of insns is
added at once.  The groups within the manual [1] have so far been small-ish.


r~

---

[1] ISA manual: 
https://static.docs.arm.com/ddi0602/d/ISA_A64_xml_futureA-2019-12_OPT.pdf

[2] Devel tree: https://github.com/rth7680/qemu/tree/tgt-arm-sve-2

Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=163610
("target/arm: sve load/store improvements")

Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=164500
("target/arm: Use tcg_gen_gvec_5_ptr for sve FMLA/FCMLA")

Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=164048
("target/arm: Implement ARMv8.5-MemTag, system mode")

Richard Henderson (31):
  target/arm: Add ID_AA64ZFR0 fields and isar_feature_aa64_sve2
  target/arm: Implement SVE2 Integer Multiply - Unpredicated
  target/arm: Implement SVE2 integer pairwise add and accumulate long
  target/arm: Remove fp_status from helper_{recpe,rsqrte}_u32
  target/arm: Implement SVE2 integer unary operations (predicated)
  target/arm: Split out saturating/rounding shifts from neon
  target/arm: Implement SVE2 saturating/rounding bitwise shift left
(predicated)
  target/arm: Implement SVE2 integer halving add/subtract (predicated)
  target/arm: Implement SVE2 integer pairwise arithmetic
  target/arm: Implement SVE2 saturating add/subtract (predicated)
  target/arm: Implement SVE2 integer add/subtract long
  target/arm: Implement SVE2 integer add/subtract interleaved long
  target/arm: Implement SVE2 integer add/subtract wide
  target/arm: Implement SVE2 integer multiply long
  target/arm: Implement PMULLB and PMULLT
  target/arm: Tidy SVE tszimm shift formats
  target/arm: Implement SVE2 bitwise shift left long
  target/arm: Implement SVE2 bitwise exclusive-or interleaved
  target/arm: Implement SVE2 bitwise permute
  target/arm: Implement SVE2 complex integer add
  target/arm: Implement SVE2 integer absolute difference and accumulate
long
  target/arm: Implement SVE2 integer add/subtract long with carry
  target/arm: Create arm_gen_gvec_[us]sra
  target/arm: Create arm_gen_gvec_{u,s}{rshr,rsra}
  target/arm: Implement SVE2 bitwise shift right and accumulate
  target/arm: Create arm_gen_gvec_{sri,sli}
  target/arm: Tidy handle_vec_simd_shri
  target/arm: Implement SVE2 bitwise shift and insert
  target/arm: Vectorize SABD/UABD
  target/arm: Vectorize SABA/UABA
  target/arm: Implement SVE2 integer absolute difference and accumulate

 target/arm/cpu.h           |  31 ++
 target/arm/helper-sve.h    | 345 +
 target/arm/helper.h        |  81 +++-
 target/arm/translate-a64.h |   9 +
 target/arm/translate.h     |  24 +-
 target/arm/vec_internal.h  | 161
 target/arm/sve.decode      | 217 ++-
 target/arm/helper.c        |   3 +-
 target/arm/kvm64.c         |   2 +
 target/arm/neon_helper.c   | 515 -
 target/arm/sve_helper.c    | 757 ++---
 target/arm/translate-a64.c | 557 +++
 target/arm/translate-sve.c | 557 +++
 target/arm/translate.c     | 626 ++
 target/arm/vec_helper.c    | 411
 target/arm/vfp_helper.c    |   4 +-
 16 files changed, 3532 insertions(+), 768 deletions(-)
 create mode 100644 target/arm/vec_internal.h

--
2.20.1





[PATCH for-5.1 00/31] target/arm: SVE2, part 1

2020-03-26 Thread Richard Henderson
Posting this for early review.  It's based on some other patch
sets that I have posted recently that also touch SVE, listed
below.  But it might just be easier to clone the devel tree [2].
While the branch itself will rebase frequently for development,
I've also created a tag, post-sve2-20200326, for this posting.

This is mostly untested, as the most recently released Foundation
Model does not support SVE2.  Some of the new instructions overlap
with old fashioned NEON, and I can verify that those have not
broken, and show that SVE2 will use the same code path.  But the
predicated insns and bottom/top interleaved insns are not yet
RISU testable, as I have nothing to compare against.

The patches are in general arranged so that one complete group
of insns is added at once.  The groups within the manual [1]
have so far been small-ish.


r~

---

[1] ISA manual: 
https://static.docs.arm.com/ddi0602/d/ISA_A64_xml_futureA-2019-12_OPT.pdf

[2] Devel tree: https://github.com/rth7680/qemu/tree/tgt-arm-sve-2

Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=163610
("target/arm: sve load/store improvements")

Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=164500
("target/arm: Use tcg_gen_gvec_5_ptr for sve FMLA/FCMLA")

Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=164048
("target/arm: Implement ARMv8.5-MemTag, system mode")

Richard Henderson (31):
  target/arm: Add ID_AA64ZFR0 fields and isar_feature_aa64_sve2
  target/arm: Implement SVE2 Integer Multiply - Unpredicated
  target/arm: Implement SVE2 integer pairwise add and accumulate long
  target/arm: Remove fp_status from helper_{recpe,rsqrte}_u32
  target/arm: Implement SVE2 integer unary operations (predicated)
  target/arm: Split out saturating/rounding shifts from neon
  target/arm: Implement SVE2 saturating/rounding bitwise shift left
(predicated)
  target/arm: Implement SVE2 integer halving add/subtract (predicated)
  target/arm: Implement SVE2 integer pairwise arithmetic
  target/arm: Implement SVE2 saturating add/subtract (predicated)
  target/arm: Implement SVE2 integer add/subtract long
  target/arm: Implement SVE2 integer add/subtract interleaved long
  target/arm: Implement SVE2 integer add/subtract wide
  target/arm: Implement SVE2 integer multiply long
  target/arm: Implement PMULLB and PMULLT
  target/arm: Tidy SVE tszimm shift formats
  target/arm: Implement SVE2 bitwise shift left long
  target/arm: Implement SVE2 bitwise exclusive-or interleaved
  target/arm: Implement SVE2 bitwise permute
  target/arm: Implement SVE2 complex integer add
  target/arm: Implement SVE2 integer absolute difference and accumulate
long
  target/arm: Implement SVE2 integer add/subtract long with carry
  target/arm: Create arm_gen_gvec_[us]sra
  target/arm: Create arm_gen_gvec_{u,s}{rshr,rsra}
  target/arm: Implement SVE2 bitwise shift right and accumulate
  target/arm: Create arm_gen_gvec_{sri,sli}
  target/arm: Tidy handle_vec_simd_shri
  target/arm: Implement SVE2 bitwise shift and insert
  target/arm: Vectorize SABD/UABD
  target/arm: Vectorize SABA/UABA
  target/arm: Implement SVE2 integer absolute difference and accumulate

 target/arm/cpu.h           |  31 ++
 target/arm/helper-sve.h    | 345 +
 target/arm/helper.h        |  81 +++-
 target/arm/translate-a64.h |   9 +
 target/arm/translate.h     |  24 +-
 target/arm/vec_internal.h  | 161
 target/arm/sve.decode      | 217 ++-
 target/arm/helper.c        |   3 +-
 target/arm/kvm64.c         |   2 +
 target/arm/neon_helper.c   | 515 -
 target/arm/sve_helper.c    | 757 ++---
 target/arm/translate-a64.c | 557 +++
 target/arm/translate-sve.c | 557 +++
 target/arm/translate.c     | 626 ++
 target/arm/vec_helper.c    | 411
 target/arm/vfp_helper.c    |   4 +-
 16 files changed, 3532 insertions(+), 768 deletions(-)
 create mode 100644 target/arm/vec_internal.h

-- 
2.20.1