fpetrogalli updated this revision to Diff 285678.
fpetrogalli added a comment.

Added context to the diff.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85977/new/

https://reviews.llvm.org/D85977

Files:
  clang/test/CodeGen/aarch64-sve-acle-example.cpp
  llvm/docs/ReleaseNotes.rst

Index: llvm/docs/ReleaseNotes.rst
===================================================================
--- llvm/docs/ReleaseNotes.rst
+++ llvm/docs/ReleaseNotes.rst
@@ -66,7 +66,9 @@
   added to describe the mapping between scalar functions and vector
   functions, to enable vectorization of call sites. The information
   provided by the attribute is interfaced via the API provided by the
-  ``VFDatabase`` class.
+  ``VFDatabase`` class. When scanning through the set of vector
+  functions associated with a scalar call, the loop vectorizer now
+  relies on ``VFDatabase`` instead of ``TargetLibraryInfo``.
 
 * `dereferenceable` attributes and metadata on pointers no longer imply
   anything about the alignment of the pointer in question. Previously, some
@@ -78,6 +80,17 @@
   information. This information is used to represent Fortran modules debug
   info at IR level.
 
+* LLVM IR now supports two distinct vector types,
+  ``llvm::FixedVectorType`` and ``llvm::ScalableVectorType``, both
+  derived from the base class ``llvm::VectorType``. A number of
+  algorithms dealing with IR vector types have been updated to work
+  for both scalable and fixed vector types. Where possible, the code
+  has been made generic to cover both cases using the base class.
+  Specifically, places that used the type ``unsigned`` to count the
+  number of lanes of a vector now use ``llvm::ElementCount``. In places
+  where ``uint64_t`` denoted the size in bits of an IR type, we have
+  partially migrated the codebase to ``llvm::TypeSize``.
+
 Changes to building LLVM
 ------------------------
 
@@ -101,6 +114,55 @@
   default may wish to specify ``-fno-omit-frame-pointer`` to get the old
   behavior. This improves compatibility with GCC.
 
+* Clang supports the following macros that enable the C intrinsics
+  from the `Arm C language extensions for SVE
+  <https://developer.arm.com/documentation/100987/>`_ (version
+  ``00bet5``, see section 2.1 for the list of intrinsics associated
+  with each macro):
+
+
+      =================================  =================
+      Preprocessor macro                 Target feature
+      =================================  =================
+      ``__ARM_FEATURE_SVE``              ``+sve``
+      ``__ARM_FEATURE_SVE_BF16``         ``+sve+bf16``
+      ``__ARM_FEATURE_SVE_MATMUL_FP32``  ``+sve+f32mm``
+      ``__ARM_FEATURE_SVE_MATMUL_FP64``  ``+sve+f64mm``
+      ``__ARM_FEATURE_SVE_MATMUL_INT8``  ``+sve+i8mm``
+      ``__ARM_FEATURE_SVE2``             ``+sve2``
+      ``__ARM_FEATURE_SVE2_AES``         ``+sve2-aes``
+      ``__ARM_FEATURE_SVE2_BITPERM``     ``+sve2-bitperm``
+      ``__ARM_FEATURE_SVE2_SHA3``        ``+sve2-sha3``
+      ``__ARM_FEATURE_SVE2_SM4``         ``+sve2-sm4``
+      =================================  =================
+
+  The macros enable users to write C/C++ `Vector Length Agnostic
+  (VLA)` loops that can be executed on any CPU that implements the
+  underlying instructions supported by the C intrinsics, independently
+  of the hardware vector register size.
+
+  For example, the ``__ARM_FEATURE_SVE`` macro is defined when
+  targeting AArch64 code generation with ``-march=armv8-a+sve``
+  set on the command line.
+
+  .. code-block:: c
+     :caption: Example of VLA addition of two arrays with SVE ACLE.
+
+     // Compile with:
+     // `clang++ -march=armv8-a+sve ...` (for C++)
+     // `clang -std=c11 -march=armv8-a+sve ...` (for C)
+     #include <arm_sve.h>
+
+     void VLA_add_arrays(double *x, double *y, double *out, unsigned N) {
+       for (unsigned i = 0; i < N; i += svcntd()) {
+         svbool_t Pg = svwhilelt_b64(i, N);
+         svfloat64_t vx = svld1(Pg, &x[i]);
+         svfloat64_t vy = svld1(Pg, &y[i]);
+         svfloat64_t vout = svadd_x(Pg, vx, vy);
+         svst1(Pg, &out[i], vout);
+       }
+     }
+
 Changes to the MIPS Target
 --------------------------
 
Index: clang/test/CodeGen/aarch64-sve-acle-example.cpp
===================================================================
--- /dev/null
+++ clang/test/CodeGen/aarch64-sve-acle-example.cpp
@@ -0,0 +1,17 @@
+// RUN: %clang -x c++ -c -target aarch64-linux-gnu -march=armv8-a+sve -o - -S %s -O3 | FileCheck %s --check-prefix=CPP
+// RUN: %clang -x c -std=c11 -c -target aarch64-linux-gnu -march=armv8-a+sve -o - -S %s -O3 | FileCheck %s --check-prefix=C
+// REQUIRES: aarch64-registered-target
+
+#include <arm_sve.h>
+
+// CPP-LABEL: _Z14VLA_add_arraysPdS_S_j:
+// C-LABEL: VLA_add_arrays:
+void VLA_add_arrays(double *x, double *y, double *out, unsigned N) {
+  for (unsigned i = 0; i < N; i += svcntd()) {
+    svbool_t Pg = svwhilelt_b64(i, N);
+    svfloat64_t vx = svld1(Pg, &x[i]);
+    svfloat64_t vy = svld1(Pg, &y[i]);
+    svfloat64_t vout = svadd_x(Pg, vx, vy);
+    svst1(Pg, &out[i], vout);
+  }
+}
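
For readers without SVE hardware, the predicated loop above can be approximated in portable C. This is only an illustrative sketch and is not part of the patch: the name `emulated_vla_add_arrays` and the fixed `EMULATED_LANES` value are hypothetical, standing in for the run-time lane count that `svcntd()` returns, and the per-lane check imitates the predicate that `svwhilelt_b64(i, N)` produces.

```c
#include <stddef.h>

/* Hypothetical lane count standing in for svcntd(). On real SVE hardware
   the value is only known at run time. */
enum { EMULATED_LANES = 4 };

/* Portable emulation of the predicated VLA loop; an assumption-laden
   sketch, not the code Clang generates. */
void emulated_vla_add_arrays(const double *x, const double *y, double *out,
                             unsigned n) {
  for (size_t i = 0; i < n; i += EMULATED_LANES) {
    for (size_t lane = 0; lane < EMULATED_LANES; ++lane) {
      /* Imitates svwhilelt_b64(i, n): a lane is active only while
         i + lane < n, so the final partial vector never touches memory
         past the end of the arrays. */
      int active = lane < (size_t)n - i;
      if (active)
        out[i + lane] = x[i + lane] + y[i + lane];
    }
  }
}
```

Because the predicate masks the inactive lanes of the last iteration, no separate scalar tail loop is needed, which is the property that makes the SVE version vector length agnostic.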
