[PATCH] D112420: [clang][ARM] PACBTI-M assembly support

2021-11-26 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea accepted this revision.
labrinea added a comment.
This revision is now accepted and ready to land.

Looks like you've addressed Oliver's comments. I don't have any new suggestions 
from my end. Just make sure you've removed the test xfail before merging.




Comment at: clang/test/Driver/darwin-ld-lto.c:2
 // REQUIRES: system-darwin
-
+// XFAIL: *
 // Check that ld gets "-lto_library".

Seems accidental.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112420/new/

https://reviews.llvm.org/D112420

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110065: [AArch64] Add support for the 'R' architecture profile.

2021-10-27 Thread Alexandros Lamprineas via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG8689f5e6e773: [AArch64] Add support for the R 
architecture profile. (authored by labrinea).

Changed prior to commit:
  https://reviews.llvm.org/D110065?vs=382584=382605#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110065/new/

https://reviews.llvm.org/D110065

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Preprocessor/aarch64-target-features.c
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
  llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
  llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
  llvm/test/CodeGen/AArch64/arm64-crc32.ll
  llvm/test/MC/AArch64/arm64-branch-encoding.s
  llvm/test/MC/AArch64/arm64-system-encoding.s
  llvm/test/MC/AArch64/armv8.1a-lse.s
  llvm/test/MC/AArch64/armv8.1a-pan.s
  llvm/test/MC/AArch64/armv8.1a-rdma.s
  llvm/test/MC/AArch64/armv8.2a-at.s
  llvm/test/MC/AArch64/armv8.2a-crypto.s
  llvm/test/MC/AArch64/armv8.2a-dotprod-errors.s
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.2a-persistent-memory.s
  llvm/test/MC/AArch64/armv8.2a-uao.s
  llvm/test/MC/AArch64/armv8r-inst.s
  llvm/test/MC/AArch64/armv8r-sysreg.s
  llvm/test/MC/AArch64/armv8r-unsupported-inst.s
  llvm/test/MC/AArch64/armv8r-unsupported-sysreg.s
  llvm/test/MC/AArch64/basic-a64-instructions.s
  llvm/test/MC/AArch64/ras-extension.s
  llvm/test/MC/Disassembler/AArch64/arm64-branch.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-complex.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-js.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-dit.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-flag.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-ras.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-tlb.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-trace.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-virt.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-predres.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-specrestrict.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-ssbs.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-el3.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
  llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt

Index: llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
===
--- llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
+++ llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
@@ -1257,27 +1257,21 @@
 0xe1 0xff 0x1f 0xd4
 
 # CHECK: hvc  #{{1|0x1}}
-# CHECK: smc  #{{12000|0x2ee0}}
 # CHECK: brk  #{{12|0xc}}
 # CHECK: hlt  #{{123|0x7b}}
 0x22 0x0 0x0 0xd4
-0x3 0xdc 0x5 0xd4
 0x80 0x1 0x20 0xd4
 0x60 0xf 0x40 0xd4
 
 # CHECK: dcps1#{{42|0x2a}}
 # CHECK: dcps2#{{9|0x9}}
-# CHECK: dcps3#{{1000|0x3e8}}
 0x41 0x5 0xa0 0xd4
 0x22 0x1 0xa0 0xd4
-0x3 0x7d 0xa0 0xd4
 
 # CHECK: dcps1
 # CHECK: dcps2
-# CHECK: dcps3
 0x1 0x0 0xa0 0xd4
 0x2 0x0 0xa0 0xd4
-0x3 0x0 0xa0 0xd4
 
 #--
 # Extract (immediate)
@@ -3258,13 +3252,11 @@
 # CHECK: msr  {{hacr_el2|HACR_EL2}}, x12
 # CHECK: msr  {{mdcr_el3|MDCR_EL3}}, x12
 # CHECK: msr  {{ttbr0_el1|TTBR0_EL1}}, x12
-# CHECK: msr  {{ttbr0_el2|TTBR0_EL2}}, x12
 # CHECK: msr  {{ttbr0_el3|TTBR0_EL3}}, x12
 # CHECK: msr  {{ttbr1_el1|TTBR1_EL1}}, x12
 # CHECK: msr  {{tcr_el1|TCR_EL1}}, x12
 # CHECK: msr  {{tcr_el2|TCR_EL2}}, x12
 # CHECK: msr  {{tcr_el3|TCR_EL3}}, x12
-# CHECK: msr  {{vttbr_el2|VTTBR_EL2}}, x12
 # CHECK: msr  {{vtcr_el2|VTCR_EL2}}, x12
 # CHECK: msr  {{dacr32_el2|DACR32_EL2}}, x12
 # CHECK: msr  {{spsr_el1|SPSR_EL1}}, x12
@@ -3554,13 +3546,11 @@
 # CHECK: mrs  x9, {{hacr_el2|HACR_EL2}}
 # CHECK: mrs  x9, {{mdcr_el3|MDCR_EL3}}
 # CHECK: mrs  x9, {{ttbr0_el1|TTBR0_EL1}}
-# CHECK: mrs  x9, {{ttbr0_el2|TTBR0_EL2}}
 # CHECK: mrs  x9, {{ttbr0_el3|TTBR0_EL3}}
 # CHECK: mrs  x9, {{ttbr1_el1|TTBR1_EL1}}
 # CHECK: mrs  x9, {{tcr_el1|TCR_EL1}}
 # CHECK: mrs  x9, {{tcr_el2|TCR_EL2}}
 # CHECK: mrs  x9, {{tcr_el3|TCR_EL3}}
-# CHECK: mrs  x9, {{vttbr_el2|VTTBR_EL2}}
 # CHECK: mrs  x9, {{vtcr_el2|VTCR_EL2}}
 # CHECK: mrs  x9, {{dacr32_el2|DACR32_EL2}}
 # CHECK: mrs  x9, {{spsr_el1|SPSR_EL1}}
Index: llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
+++ 

[PATCH] D110065: [AArch64] Add support for the 'R' architecture profile.

2021-10-27 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 382584.
labrinea added a comment.

Changed AppleA10 to HasV8_0aOps


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110065/new/

https://reviews.llvm.org/D110065

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Preprocessor/aarch64-target-features.c
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
  llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
  llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
  llvm/test/CodeGen/AArch64/arm64-crc32.ll
  llvm/test/MC/AArch64/arm64-branch-encoding.s
  llvm/test/MC/AArch64/arm64-system-encoding.s
  llvm/test/MC/AArch64/armv8.1a-lse.s
  llvm/test/MC/AArch64/armv8.1a-pan.s
  llvm/test/MC/AArch64/armv8.1a-rdma.s
  llvm/test/MC/AArch64/armv8.2a-at.s
  llvm/test/MC/AArch64/armv8.2a-crypto.s
  llvm/test/MC/AArch64/armv8.2a-dotprod-errors.s
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.2a-persistent-memory.s
  llvm/test/MC/AArch64/armv8.2a-uao.s
  llvm/test/MC/AArch64/armv8r-inst.s
  llvm/test/MC/AArch64/armv8r-sysreg.s
  llvm/test/MC/AArch64/armv8r-unsupported-inst.s
  llvm/test/MC/AArch64/armv8r-unsupported-sysreg.s
  llvm/test/MC/AArch64/basic-a64-instructions.s
  llvm/test/MC/AArch64/ras-extension.s
  llvm/test/MC/Disassembler/AArch64/arm64-branch.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-complex.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-js.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-dit.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-flag.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-ras.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-tlb.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-trace.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-virt.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-predres.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-specrestrict.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-ssbs.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-el3.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
  llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt

Index: llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
===
--- llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
+++ llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
@@ -1257,27 +1257,21 @@
 0xe1 0xff 0x1f 0xd4
 
 # CHECK: hvc  #{{1|0x1}}
-# CHECK: smc  #{{12000|0x2ee0}}
 # CHECK: brk  #{{12|0xc}}
 # CHECK: hlt  #{{123|0x7b}}
 0x22 0x0 0x0 0xd4
-0x3 0xdc 0x5 0xd4
 0x80 0x1 0x20 0xd4
 0x60 0xf 0x40 0xd4
 
 # CHECK: dcps1#{{42|0x2a}}
 # CHECK: dcps2#{{9|0x9}}
-# CHECK: dcps3#{{1000|0x3e8}}
 0x41 0x5 0xa0 0xd4
 0x22 0x1 0xa0 0xd4
-0x3 0x7d 0xa0 0xd4
 
 # CHECK: dcps1
 # CHECK: dcps2
-# CHECK: dcps3
 0x1 0x0 0xa0 0xd4
 0x2 0x0 0xa0 0xd4
-0x3 0x0 0xa0 0xd4
 
 #--
 # Extract (immediate)
@@ -3258,13 +3252,11 @@
 # CHECK: msr  {{hacr_el2|HACR_EL2}}, x12
 # CHECK: msr  {{mdcr_el3|MDCR_EL3}}, x12
 # CHECK: msr  {{ttbr0_el1|TTBR0_EL1}}, x12
-# CHECK: msr  {{ttbr0_el2|TTBR0_EL2}}, x12
 # CHECK: msr  {{ttbr0_el3|TTBR0_EL3}}, x12
 # CHECK: msr  {{ttbr1_el1|TTBR1_EL1}}, x12
 # CHECK: msr  {{tcr_el1|TCR_EL1}}, x12
 # CHECK: msr  {{tcr_el2|TCR_EL2}}, x12
 # CHECK: msr  {{tcr_el3|TCR_EL3}}, x12
-# CHECK: msr  {{vttbr_el2|VTTBR_EL2}}, x12
 # CHECK: msr  {{vtcr_el2|VTCR_EL2}}, x12
 # CHECK: msr  {{dacr32_el2|DACR32_EL2}}, x12
 # CHECK: msr  {{spsr_el1|SPSR_EL1}}, x12
@@ -3554,13 +3546,11 @@
 # CHECK: mrs  x9, {{hacr_el2|HACR_EL2}}
 # CHECK: mrs  x9, {{mdcr_el3|MDCR_EL3}}
 # CHECK: mrs  x9, {{ttbr0_el1|TTBR0_EL1}}
-# CHECK: mrs  x9, {{ttbr0_el2|TTBR0_EL2}}
 # CHECK: mrs  x9, {{ttbr0_el3|TTBR0_EL3}}
 # CHECK: mrs  x9, {{ttbr1_el1|TTBR1_EL1}}
 # CHECK: mrs  x9, {{tcr_el1|TCR_EL1}}
 # CHECK: mrs  x9, {{tcr_el2|TCR_EL2}}
 # CHECK: mrs  x9, {{tcr_el3|TCR_EL3}}
-# CHECK: mrs  x9, {{vttbr_el2|VTTBR_EL2}}
 # CHECK: mrs  x9, {{vtcr_el2|VTCR_EL2}}
 # CHECK: mrs  x9, {{dacr32_el2|DACR32_EL2}}
 # CHECK: mrs  x9, {{spsr_el1|SPSR_EL1}}
Index: llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
+++ llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
@@ -1,5 +1,6 @@
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mattr=+fp16fml   --disassemble < %s 2>&1 | FileCheck %s 

[PATCH] D110065: [AArch64] Add support for the 'R' architecture profile.

2021-10-26 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 382294.
labrinea added a comment.

Added v8.1a_ops on AppleA10. However, the target parser lists it as v8.0a, 
which seems odd. Out of the scope of this patch anyway.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110065/new/

https://reviews.llvm.org/D110065

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Preprocessor/aarch64-target-features.c
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
  llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
  llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
  llvm/test/CodeGen/AArch64/arm64-crc32.ll
  llvm/test/MC/AArch64/arm64-branch-encoding.s
  llvm/test/MC/AArch64/arm64-system-encoding.s
  llvm/test/MC/AArch64/armv8.1a-lse.s
  llvm/test/MC/AArch64/armv8.1a-pan.s
  llvm/test/MC/AArch64/armv8.1a-rdma.s
  llvm/test/MC/AArch64/armv8.2a-at.s
  llvm/test/MC/AArch64/armv8.2a-crypto.s
  llvm/test/MC/AArch64/armv8.2a-dotprod-errors.s
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.2a-persistent-memory.s
  llvm/test/MC/AArch64/armv8.2a-uao.s
  llvm/test/MC/AArch64/armv8r-inst.s
  llvm/test/MC/AArch64/armv8r-sysreg.s
  llvm/test/MC/AArch64/armv8r-unsupported-inst.s
  llvm/test/MC/AArch64/armv8r-unsupported-sysreg.s
  llvm/test/MC/AArch64/basic-a64-instructions.s
  llvm/test/MC/AArch64/ras-extension.s
  llvm/test/MC/Disassembler/AArch64/arm64-branch.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-complex.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-js.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-dit.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-flag.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-ras.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-tlb.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-trace.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-virt.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-predres.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-specrestrict.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-ssbs.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-el3.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
  llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt

Index: llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
===
--- llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
+++ llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
@@ -1257,27 +1257,21 @@
 0xe1 0xff 0x1f 0xd4
 
 # CHECK: hvc  #{{1|0x1}}
-# CHECK: smc  #{{12000|0x2ee0}}
 # CHECK: brk  #{{12|0xc}}
 # CHECK: hlt  #{{123|0x7b}}
 0x22 0x0 0x0 0xd4
-0x3 0xdc 0x5 0xd4
 0x80 0x1 0x20 0xd4
 0x60 0xf 0x40 0xd4
 
 # CHECK: dcps1#{{42|0x2a}}
 # CHECK: dcps2#{{9|0x9}}
-# CHECK: dcps3#{{1000|0x3e8}}
 0x41 0x5 0xa0 0xd4
 0x22 0x1 0xa0 0xd4
-0x3 0x7d 0xa0 0xd4
 
 # CHECK: dcps1
 # CHECK: dcps2
-# CHECK: dcps3
 0x1 0x0 0xa0 0xd4
 0x2 0x0 0xa0 0xd4
-0x3 0x0 0xa0 0xd4
 
 #--
 # Extract (immediate)
@@ -3258,13 +3252,11 @@
 # CHECK: msr  {{hacr_el2|HACR_EL2}}, x12
 # CHECK: msr  {{mdcr_el3|MDCR_EL3}}, x12
 # CHECK: msr  {{ttbr0_el1|TTBR0_EL1}}, x12
-# CHECK: msr  {{ttbr0_el2|TTBR0_EL2}}, x12
 # CHECK: msr  {{ttbr0_el3|TTBR0_EL3}}, x12
 # CHECK: msr  {{ttbr1_el1|TTBR1_EL1}}, x12
 # CHECK: msr  {{tcr_el1|TCR_EL1}}, x12
 # CHECK: msr  {{tcr_el2|TCR_EL2}}, x12
 # CHECK: msr  {{tcr_el3|TCR_EL3}}, x12
-# CHECK: msr  {{vttbr_el2|VTTBR_EL2}}, x12
 # CHECK: msr  {{vtcr_el2|VTCR_EL2}}, x12
 # CHECK: msr  {{dacr32_el2|DACR32_EL2}}, x12
 # CHECK: msr  {{spsr_el1|SPSR_EL1}}, x12
@@ -3554,13 +3546,11 @@
 # CHECK: mrs  x9, {{hacr_el2|HACR_EL2}}
 # CHECK: mrs  x9, {{mdcr_el3|MDCR_EL3}}
 # CHECK: mrs  x9, {{ttbr0_el1|TTBR0_EL1}}
-# CHECK: mrs  x9, {{ttbr0_el2|TTBR0_EL2}}
 # CHECK: mrs  x9, {{ttbr0_el3|TTBR0_EL3}}
 # CHECK: mrs  x9, {{ttbr1_el1|TTBR1_EL1}}
 # CHECK: mrs  x9, {{tcr_el1|TCR_EL1}}
 # CHECK: mrs  x9, {{tcr_el2|TCR_EL2}}
 # CHECK: mrs  x9, {{tcr_el3|TCR_EL3}}
-# CHECK: mrs  x9, {{vttbr_el2|VTTBR_EL2}}
 # CHECK: mrs  x9, {{vtcr_el2|VTCR_EL2}}
 # CHECK: mrs  x9, {{dacr32_el2|DACR32_EL2}}
 # CHECK: mrs  x9, {{spsr_el1|SPSR_EL1}}
Index: llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
+++ llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
@@ -1,5 +1,6 @@
 # RUN: llvm-mc -triple 

[PATCH] D110065: [AArch64] Add support for the 'R' architecture profile.

2021-10-26 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 382288.
labrinea marked an inline comment as done.
labrinea added a comment.

Added a comment explaining system register lookups by alternative name as 
suggested and rebased on top of https://reviews.llvm.org/D111551. @john.brawn 
you may want to re-review the patch as the rebase had conflicts in 
`llvm/lib/Target/AArch64/AArch64.td`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110065/new/

https://reviews.llvm.org/D110065

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Preprocessor/aarch64-target-features.c
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
  llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
  llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
  llvm/test/CodeGen/AArch64/arm64-crc32.ll
  llvm/test/MC/AArch64/arm64-branch-encoding.s
  llvm/test/MC/AArch64/arm64-system-encoding.s
  llvm/test/MC/AArch64/armv8.1a-lse.s
  llvm/test/MC/AArch64/armv8.1a-pan.s
  llvm/test/MC/AArch64/armv8.1a-rdma.s
  llvm/test/MC/AArch64/armv8.2a-at.s
  llvm/test/MC/AArch64/armv8.2a-crypto.s
  llvm/test/MC/AArch64/armv8.2a-dotprod-errors.s
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.2a-persistent-memory.s
  llvm/test/MC/AArch64/armv8.2a-uao.s
  llvm/test/MC/AArch64/armv8r-inst.s
  llvm/test/MC/AArch64/armv8r-sysreg.s
  llvm/test/MC/AArch64/armv8r-unsupported-inst.s
  llvm/test/MC/AArch64/armv8r-unsupported-sysreg.s
  llvm/test/MC/AArch64/basic-a64-instructions.s
  llvm/test/MC/AArch64/ras-extension.s
  llvm/test/MC/Disassembler/AArch64/arm64-branch.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-complex.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-js.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-dit.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-flag.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-ras.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-tlb.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-trace.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-virt.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-predres.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-specrestrict.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-ssbs.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-el3.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
  llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt

Index: llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
===
--- llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
+++ llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
@@ -1257,27 +1257,21 @@
 0xe1 0xff 0x1f 0xd4
 
 # CHECK: hvc  #{{1|0x1}}
-# CHECK: smc  #{{12000|0x2ee0}}
 # CHECK: brk  #{{12|0xc}}
 # CHECK: hlt  #{{123|0x7b}}
 0x22 0x0 0x0 0xd4
-0x3 0xdc 0x5 0xd4
 0x80 0x1 0x20 0xd4
 0x60 0xf 0x40 0xd4
 
 # CHECK: dcps1#{{42|0x2a}}
 # CHECK: dcps2#{{9|0x9}}
-# CHECK: dcps3#{{1000|0x3e8}}
 0x41 0x5 0xa0 0xd4
 0x22 0x1 0xa0 0xd4
-0x3 0x7d 0xa0 0xd4
 
 # CHECK: dcps1
 # CHECK: dcps2
-# CHECK: dcps3
 0x1 0x0 0xa0 0xd4
 0x2 0x0 0xa0 0xd4
-0x3 0x0 0xa0 0xd4
 
 #--
 # Extract (immediate)
@@ -3258,13 +3252,11 @@
 # CHECK: msr  {{hacr_el2|HACR_EL2}}, x12
 # CHECK: msr  {{mdcr_el3|MDCR_EL3}}, x12
 # CHECK: msr  {{ttbr0_el1|TTBR0_EL1}}, x12
-# CHECK: msr  {{ttbr0_el2|TTBR0_EL2}}, x12
 # CHECK: msr  {{ttbr0_el3|TTBR0_EL3}}, x12
 # CHECK: msr  {{ttbr1_el1|TTBR1_EL1}}, x12
 # CHECK: msr  {{tcr_el1|TCR_EL1}}, x12
 # CHECK: msr  {{tcr_el2|TCR_EL2}}, x12
 # CHECK: msr  {{tcr_el3|TCR_EL3}}, x12
-# CHECK: msr  {{vttbr_el2|VTTBR_EL2}}, x12
 # CHECK: msr  {{vtcr_el2|VTCR_EL2}}, x12
 # CHECK: msr  {{dacr32_el2|DACR32_EL2}}, x12
 # CHECK: msr  {{spsr_el1|SPSR_EL1}}, x12
@@ -3554,13 +3546,11 @@
 # CHECK: mrs  x9, {{hacr_el2|HACR_EL2}}
 # CHECK: mrs  x9, {{mdcr_el3|MDCR_EL3}}
 # CHECK: mrs  x9, {{ttbr0_el1|TTBR0_EL1}}
-# CHECK: mrs  x9, {{ttbr0_el2|TTBR0_EL2}}
 # CHECK: mrs  x9, {{ttbr0_el3|TTBR0_EL3}}
 # CHECK: mrs  x9, {{ttbr1_el1|TTBR1_EL1}}
 # CHECK: mrs  x9, {{tcr_el1|TCR_EL1}}
 # CHECK: mrs  x9, {{tcr_el2|TCR_EL2}}
 # CHECK: mrs  x9, {{tcr_el3|TCR_EL3}}
-# CHECK: mrs  x9, {{vttbr_el2|VTTBR_EL2}}
 # CHECK: mrs  x9, {{vtcr_el2|VTCR_EL2}}
 # CHECK: mrs  x9, {{dacr32_el2|DACR32_EL2}}
 # CHECK: mrs  x9, {{spsr_el1|SPSR_EL1}}
Index: llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt

[PATCH] D110065: [AArch64] Add support for the 'R' architecture profile.

2021-10-16 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea marked an inline comment as done.
labrinea added inline comments.



Comment at: llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp:1548
+
+static const AArch64SysReg::SysReg *lookupSysReg(unsigned Val, bool Read,
+ const MCSubtargetInfo ) {

john.brawn wrote:
> It would be better if we had a generic way to handle registers with 
> overlapping encodings, instead of handling the two registers explicitly here. 
> I'm not sure of the best way to do that, but looking at 
> AArch64SystemOperands.td it looks like maybe a way to do it would to add an 
> extra "AltName" field to give an alternate name for the same encoding, so 
> e.g. TTBR0_EL2 would have AltName 
> VSCTLR_EL2 and vice-versa. So you'd first lookup by encoding, then if that 
> didn't work you'd lookup by name with AltName and check if that one is valid.
> 
Done, but it may turn problematic if we start having more than one alternative 
names, i.e. if multiple architecture extensions reference the same encoding 
using a different name.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110065/new/

https://reviews.llvm.org/D110065

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110065: [AArch64] Add support for the 'R' architecture profile.

2021-10-16 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 380171.
labrinea added a comment.

Added an alternative name to indicate sytem register aliasing.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110065/new/

https://reviews.llvm.org/D110065

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Preprocessor/aarch64-target-features.c
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
  llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
  llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
  llvm/test/CodeGen/AArch64/arm64-crc32.ll
  llvm/test/MC/AArch64/arm64-branch-encoding.s
  llvm/test/MC/AArch64/arm64-system-encoding.s
  llvm/test/MC/AArch64/armv8.1a-lse.s
  llvm/test/MC/AArch64/armv8.1a-pan.s
  llvm/test/MC/AArch64/armv8.1a-rdma.s
  llvm/test/MC/AArch64/armv8.2a-at.s
  llvm/test/MC/AArch64/armv8.2a-crypto.s
  llvm/test/MC/AArch64/armv8.2a-dotprod-errors.s
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.2a-persistent-memory.s
  llvm/test/MC/AArch64/armv8.2a-uao.s
  llvm/test/MC/AArch64/armv8r-inst.s
  llvm/test/MC/AArch64/armv8r-sysreg.s
  llvm/test/MC/AArch64/armv8r-unsupported-inst.s
  llvm/test/MC/AArch64/armv8r-unsupported-sysreg.s
  llvm/test/MC/AArch64/basic-a64-instructions.s
  llvm/test/MC/AArch64/ras-extension.s
  llvm/test/MC/Disassembler/AArch64/arm64-branch.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-complex.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-js.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-dit.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-flag.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-ras.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-tlb.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-trace.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-virt.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-predres.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-specrestrict.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-ssbs.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-el3.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
  llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt

Index: llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
===
--- llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
+++ llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
@@ -1257,27 +1257,21 @@
 0xe1 0xff 0x1f 0xd4
 
 # CHECK: hvc  #{{1|0x1}}
-# CHECK: smc  #{{12000|0x2ee0}}
 # CHECK: brk  #{{12|0xc}}
 # CHECK: hlt  #{{123|0x7b}}
 0x22 0x0 0x0 0xd4
-0x3 0xdc 0x5 0xd4
 0x80 0x1 0x20 0xd4
 0x60 0xf 0x40 0xd4
 
 # CHECK: dcps1#{{42|0x2a}}
 # CHECK: dcps2#{{9|0x9}}
-# CHECK: dcps3#{{1000|0x3e8}}
 0x41 0x5 0xa0 0xd4
 0x22 0x1 0xa0 0xd4
-0x3 0x7d 0xa0 0xd4
 
 # CHECK: dcps1
 # CHECK: dcps2
-# CHECK: dcps3
 0x1 0x0 0xa0 0xd4
 0x2 0x0 0xa0 0xd4
-0x3 0x0 0xa0 0xd4
 
 #--
 # Extract (immediate)
@@ -3258,13 +3252,11 @@
 # CHECK: msr  {{hacr_el2|HACR_EL2}}, x12
 # CHECK: msr  {{mdcr_el3|MDCR_EL3}}, x12
 # CHECK: msr  {{ttbr0_el1|TTBR0_EL1}}, x12
-# CHECK: msr  {{ttbr0_el2|TTBR0_EL2}}, x12
 # CHECK: msr  {{ttbr0_el3|TTBR0_EL3}}, x12
 # CHECK: msr  {{ttbr1_el1|TTBR1_EL1}}, x12
 # CHECK: msr  {{tcr_el1|TCR_EL1}}, x12
 # CHECK: msr  {{tcr_el2|TCR_EL2}}, x12
 # CHECK: msr  {{tcr_el3|TCR_EL3}}, x12
-# CHECK: msr  {{vttbr_el2|VTTBR_EL2}}, x12
 # CHECK: msr  {{vtcr_el2|VTCR_EL2}}, x12
 # CHECK: msr  {{dacr32_el2|DACR32_EL2}}, x12
 # CHECK: msr  {{spsr_el1|SPSR_EL1}}, x12
@@ -3554,13 +3546,11 @@
 # CHECK: mrs  x9, {{hacr_el2|HACR_EL2}}
 # CHECK: mrs  x9, {{mdcr_el3|MDCR_EL3}}
 # CHECK: mrs  x9, {{ttbr0_el1|TTBR0_EL1}}
-# CHECK: mrs  x9, {{ttbr0_el2|TTBR0_EL2}}
 # CHECK: mrs  x9, {{ttbr0_el3|TTBR0_EL3}}
 # CHECK: mrs  x9, {{ttbr1_el1|TTBR1_EL1}}
 # CHECK: mrs  x9, {{tcr_el1|TCR_EL1}}
 # CHECK: mrs  x9, {{tcr_el2|TCR_EL2}}
 # CHECK: mrs  x9, {{tcr_el3|TCR_EL3}}
-# CHECK: mrs  x9, {{vttbr_el2|VTTBR_EL2}}
 # CHECK: mrs  x9, {{vtcr_el2|VTCR_EL2}}
 # CHECK: mrs  x9, {{dacr32_el2|DACR32_EL2}}
 # CHECK: mrs  x9, {{spsr_el1|SPSR_EL1}}
Index: llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
+++ llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
@@ -1,5 +1,6 @@
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mattr=+fp16fml   --disassemble < %s 2>&1 | 

[PATCH] D110065: [AArch64] Add support for the 'R' architecture profile.

2021-10-04 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 376942.
labrinea edited the summary of this revision.
labrinea added a comment.

Change from last revision: The driver implicitly enables the 'A' profile 
features (as if -march=armv8-a was specified on the command line) when only the 
target triple is specified in order to maintain backwards compatibility.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110065/new/

https://reviews.llvm.org/D110065

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/Driver/ToolChains/Arch/AArch64.cpp
  clang/test/Driver/aarch64-cpus.c
  clang/test/Preprocessor/aarch64-target-features.c
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
  llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
  llvm/test/CodeGen/AArch64/arm64-crc32.ll
  llvm/test/MC/AArch64/arm64-branch-encoding.s
  llvm/test/MC/AArch64/arm64-system-encoding.s
  llvm/test/MC/AArch64/armv8.1a-lse.s
  llvm/test/MC/AArch64/armv8.1a-pan.s
  llvm/test/MC/AArch64/armv8.1a-rdma.s
  llvm/test/MC/AArch64/armv8.2a-at.s
  llvm/test/MC/AArch64/armv8.2a-crypto.s
  llvm/test/MC/AArch64/armv8.2a-dotprod-errors.s
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.2a-persistent-memory.s
  llvm/test/MC/AArch64/armv8.2a-uao.s
  llvm/test/MC/AArch64/armv8r-inst.s
  llvm/test/MC/AArch64/armv8r-sysreg.s
  llvm/test/MC/AArch64/armv8r-unsupported-inst.s
  llvm/test/MC/AArch64/armv8r-unsupported-sysreg.s
  llvm/test/MC/AArch64/basic-a64-instructions.s
  llvm/test/MC/AArch64/ras-extension.s
  llvm/test/MC/Disassembler/AArch64/arm64-branch.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-complex.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-js.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-dit.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-flag.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-ras.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-tlb.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-trace.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-virt.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-predres.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-specrestrict.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-ssbs.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-el3.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
  llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt

Index: llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
===
--- llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
+++ llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
@@ -1257,27 +1257,21 @@
 0xe1 0xff 0x1f 0xd4
 
 # CHECK: hvc  #{{1|0x1}}
-# CHECK: smc  #{{12000|0x2ee0}}
 # CHECK: brk  #{{12|0xc}}
 # CHECK: hlt  #{{123|0x7b}}
 0x22 0x0 0x0 0xd4
-0x3 0xdc 0x5 0xd4
 0x80 0x1 0x20 0xd4
 0x60 0xf 0x40 0xd4
 
 # CHECK: dcps1#{{42|0x2a}}
 # CHECK: dcps2#{{9|0x9}}
-# CHECK: dcps3#{{1000|0x3e8}}
 0x41 0x5 0xa0 0xd4
 0x22 0x1 0xa0 0xd4
-0x3 0x7d 0xa0 0xd4
 
 # CHECK: dcps1
 # CHECK: dcps2
-# CHECK: dcps3
 0x1 0x0 0xa0 0xd4
 0x2 0x0 0xa0 0xd4
-0x3 0x0 0xa0 0xd4
 
 #--
 # Extract (immediate)
@@ -3258,13 +3252,11 @@
 # CHECK: msr  {{hacr_el2|HACR_EL2}}, x12
 # CHECK: msr  {{mdcr_el3|MDCR_EL3}}, x12
 # CHECK: msr  {{ttbr0_el1|TTBR0_EL1}}, x12
-# CHECK: msr  {{ttbr0_el2|TTBR0_EL2}}, x12
 # CHECK: msr  {{ttbr0_el3|TTBR0_EL3}}, x12
 # CHECK: msr  {{ttbr1_el1|TTBR1_EL1}}, x12
 # CHECK: msr  {{tcr_el1|TCR_EL1}}, x12
 # CHECK: msr  {{tcr_el2|TCR_EL2}}, x12
 # CHECK: msr  {{tcr_el3|TCR_EL3}}, x12
-# CHECK: msr  {{vttbr_el2|VTTBR_EL2}}, x12
 # CHECK: msr  {{vtcr_el2|VTCR_EL2}}, x12
 # CHECK: msr  {{dacr32_el2|DACR32_EL2}}, x12
 # CHECK: msr  {{spsr_el1|SPSR_EL1}}, x12
@@ -3554,13 +3546,11 @@
 # CHECK: mrs  x9, {{hacr_el2|HACR_EL2}}
 # CHECK: mrs  x9, {{mdcr_el3|MDCR_EL3}}
 # CHECK: mrs  x9, {{ttbr0_el1|TTBR0_EL1}}
-# CHECK: mrs  x9, {{ttbr0_el2|TTBR0_EL2}}
 # CHECK: mrs  x9, {{ttbr0_el3|TTBR0_EL3}}
 # CHECK: mrs  x9, {{ttbr1_el1|TTBR1_EL1}}
 # CHECK: mrs  x9, {{tcr_el1|TCR_EL1}}
 # CHECK: mrs  x9, {{tcr_el2|TCR_EL2}}
 # CHECK: mrs  x9, {{tcr_el3|TCR_EL3}}
-# CHECK: mrs  x9, {{vttbr_el2|VTTBR_EL2}}
 # CHECK: mrs  x9, {{vtcr_el2|VTCR_EL2}}
 # CHECK: mrs  x9, {{dacr32_el2|DACR32_EL2}}
 # CHECK: mrs  x9, {{spsr_el1|SPSR_EL1}}
Index: llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
+++ 

[PATCH] D110065: [AArch64] Add support for the 'R' architecture profile.

2021-09-30 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added subscribers: nickdesaulniers, t.p.northover, srhines.
labrinea added a comment.

I wanted to clarify the chosen strategy as the desciption was perhaps not very 
informative. The are some instructions and system registers that are present in 
v8-a but not in v8-r, and so I am inclined to turn the `generic` cpu into the 
intersection of the two architecture profiles. The reasoning is to stay inline 
with the existing driver strategy: even when -march is specified on the command 
line clang passes -taget-cpu=generic to the backend, plus whatever subtarget 
features are implied by -march. The only exception to this rule seems to be 
AArch32 v8-r, where cortex-r52 is prefered over generic. This is an 
inconsistency and I wouldn't want to repeat the same for AArch64 (unless it's 
really necessary). That said my current approach will be breaking the current 
tools behavior: when the user only specifies the triple and not -march then 
they will be targeting the intersection, not v8-a. The impact is not expected 
to be significant as the missing instructions (smc, dcps3) were not being 
generated anyway. However if the two profiles significantly diverge in the 
future it might become a problem. There are some alternative solutions to this:

- make the clang driver implicitly pass -march=armv8-a when only the triple is 
specified; then all the v8-a subtarget features will be enabled allowing smc, 
dcps3 and the system registers to be available (my preference)
- leave the generic cpu as is (so that it means v8-a, not the intersection), 
and make the clang driver set -target-cpu with the default cpu for a given 
architecture (according to the Target Parser) when -march is specified; this 
change will also be breaking the current tools behavior though (middle ground 
option)
- introduce a new `generic-r` cpu (or use cortex-r82) when setting -target-cpu, 
but only when -march=armv8-r, otherwise choose `generic` (least favorite option)

I am adding a few people for visibility, requesting comments @t.p.northover 
@srhines @nickdesaulniers


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110065/new/

https://reviews.llvm.org/D110065

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110065: [AArch64] Add support for the 'R' architecture profile.

2021-09-23 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 374572.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110065/new/

https://reviews.llvm.org/D110065

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/test/Driver/aarch64-cpus.c
  clang/test/Preprocessor/aarch64-target-features.c
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
  llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
  llvm/test/CodeGen/AArch64/arm64-crc32.ll
  llvm/test/MC/AArch64/arm64-branch-encoding.s
  llvm/test/MC/AArch64/arm64-system-encoding.s
  llvm/test/MC/AArch64/armv8.1a-lse.s
  llvm/test/MC/AArch64/armv8.1a-pan.s
  llvm/test/MC/AArch64/armv8.1a-rdma.s
  llvm/test/MC/AArch64/armv8.2a-at.s
  llvm/test/MC/AArch64/armv8.2a-crypto.s
  llvm/test/MC/AArch64/armv8.2a-dotprod-errors.s
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.2a-persistent-memory.s
  llvm/test/MC/AArch64/armv8.2a-uao.s
  llvm/test/MC/AArch64/armv8r-inst.s
  llvm/test/MC/AArch64/armv8r-sysreg.s
  llvm/test/MC/AArch64/armv8r-unsupported-inst.s
  llvm/test/MC/AArch64/armv8r-unsupported-sysreg.s
  llvm/test/MC/AArch64/basic-a64-instructions.s
  llvm/test/MC/AArch64/ras-extension.s
  llvm/test/MC/Disassembler/AArch64/arm64-branch.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-complex.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-js.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-dit.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-flag.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-ras.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-tlb.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-trace.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-virt.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-predres.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-specrestrict.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-ssbs.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-el3.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
  llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt

Index: llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
===
--- llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
+++ llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
@@ -1257,27 +1257,21 @@
 0xe1 0xff 0x1f 0xd4
 
 # CHECK: hvc  #{{1|0x1}}
-# CHECK: smc  #{{12000|0x2ee0}}
 # CHECK: brk  #{{12|0xc}}
 # CHECK: hlt  #{{123|0x7b}}
 0x22 0x0 0x0 0xd4
-0x3 0xdc 0x5 0xd4
 0x80 0x1 0x20 0xd4
 0x60 0xf 0x40 0xd4
 
 # CHECK: dcps1#{{42|0x2a}}
 # CHECK: dcps2#{{9|0x9}}
-# CHECK: dcps3#{{1000|0x3e8}}
 0x41 0x5 0xa0 0xd4
 0x22 0x1 0xa0 0xd4
-0x3 0x7d 0xa0 0xd4
 
 # CHECK: dcps1
 # CHECK: dcps2
-# CHECK: dcps3
 0x1 0x0 0xa0 0xd4
 0x2 0x0 0xa0 0xd4
-0x3 0x0 0xa0 0xd4
 
 #--
 # Extract (immediate)
@@ -3258,13 +3252,11 @@
 # CHECK: msr  {{hacr_el2|HACR_EL2}}, x12
 # CHECK: msr  {{mdcr_el3|MDCR_EL3}}, x12
 # CHECK: msr  {{ttbr0_el1|TTBR0_EL1}}, x12
-# CHECK: msr  {{ttbr0_el2|TTBR0_EL2}}, x12
 # CHECK: msr  {{ttbr0_el3|TTBR0_EL3}}, x12
 # CHECK: msr  {{ttbr1_el1|TTBR1_EL1}}, x12
 # CHECK: msr  {{tcr_el1|TCR_EL1}}, x12
 # CHECK: msr  {{tcr_el2|TCR_EL2}}, x12
 # CHECK: msr  {{tcr_el3|TCR_EL3}}, x12
-# CHECK: msr  {{vttbr_el2|VTTBR_EL2}}, x12
 # CHECK: msr  {{vtcr_el2|VTCR_EL2}}, x12
 # CHECK: msr  {{dacr32_el2|DACR32_EL2}}, x12
 # CHECK: msr  {{spsr_el1|SPSR_EL1}}, x12
@@ -3554,13 +3546,11 @@
 # CHECK: mrs  x9, {{hacr_el2|HACR_EL2}}
 # CHECK: mrs  x9, {{mdcr_el3|MDCR_EL3}}
 # CHECK: mrs  x9, {{ttbr0_el1|TTBR0_EL1}}
-# CHECK: mrs  x9, {{ttbr0_el2|TTBR0_EL2}}
 # CHECK: mrs  x9, {{ttbr0_el3|TTBR0_EL3}}
 # CHECK: mrs  x9, {{ttbr1_el1|TTBR1_EL1}}
 # CHECK: mrs  x9, {{tcr_el1|TCR_EL1}}
 # CHECK: mrs  x9, {{tcr_el2|TCR_EL2}}
 # CHECK: mrs  x9, {{tcr_el3|TCR_EL3}}
-# CHECK: mrs  x9, {{vttbr_el2|VTTBR_EL2}}
 # CHECK: mrs  x9, {{vtcr_el2|VTCR_EL2}}
 # CHECK: mrs  x9, {{dacr32_el2|DACR32_EL2}}
 # CHECK: mrs  x9, {{spsr_el1|SPSR_EL1}}
Index: llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
===
--- llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
+++ llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
@@ -1,5 +1,6 @@
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mattr=+fp16fml   --disassemble < %s 2>&1 | FileCheck %s --check-prefixes=CHECK,FP16
 # RUN: llvm-mc -triple aarch64-none-linux-gnu -mattr=-fullfp16,+fp16fml --disassemble < %s 2>&1 | FileCheck %s --check-prefixes=CHECK,FP16
+# 

[PATCH] D110065: [AArch64] Add support for the 'R' architecture profile.

2021-09-20 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea created this revision.
labrinea added reviewers: ostannard, miyuki, cfe-commits, llvm-commits.
Herald added subscribers: hiraditya, kristof.beyls.
labrinea requested review of this revision.
Herald added projects: clang, LLVM.

The patch introduces subtarget features to predicate certain instructions and 
system registers that are available only on 'A' profile targets. Those features 
are not present when targeting a generic CPU, which is the default processor. 
That said `-march` has to be explicitly specified on the command line to enable 
them as the target triple will not be enough.

References: https://developer.arm.com/documentation/ddi0600/latest


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D110065

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/test/Driver/aarch64-cpus.c
  clang/test/Preprocessor/aarch64-target-features.c
  llvm/lib/Support/AArch64TargetParser.cpp
  llvm/lib/Target/AArch64/AArch64.td
  llvm/lib/Target/AArch64/AArch64InstrInfo.td
  llvm/lib/Target/AArch64/AArch64Subtarget.h
  llvm/lib/Target/AArch64/AArch64SystemOperands.td
  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
  llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
  llvm/test/CodeGen/AArch64/arm64-crc32.ll
  llvm/test/MC/AArch64/arm64-branch-encoding.s
  llvm/test/MC/AArch64/arm64-system-encoding.s
  llvm/test/MC/AArch64/armv8.1a-lse.s
  llvm/test/MC/AArch64/armv8.1a-pan.s
  llvm/test/MC/AArch64/armv8.1a-rdma.s
  llvm/test/MC/AArch64/armv8.2a-at.s
  llvm/test/MC/AArch64/armv8.2a-crypto.s
  llvm/test/MC/AArch64/armv8.2a-dotprod-errors.s
  llvm/test/MC/AArch64/armv8.2a-dotprod.s
  llvm/test/MC/AArch64/armv8.2a-persistent-memory.s
  llvm/test/MC/AArch64/armv8.2a-uao.s
  llvm/test/MC/AArch64/armv8r-inst.s
  llvm/test/MC/AArch64/armv8r-sysreg.s
  llvm/test/MC/AArch64/armv8r-unsupported-inst.s
  llvm/test/MC/AArch64/armv8r-unsupported-sysreg.s
  llvm/test/MC/AArch64/basic-a64-instructions.s
  llvm/test/MC/AArch64/ras-extension.s
  llvm/test/MC/Disassembler/AArch64/arm64-branch.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-complex.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-js.txt
  llvm/test/MC/Disassembler/AArch64/armv8.3a-rcpc.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-dit.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-flag.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-ras.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-tlb.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-trace.txt
  llvm/test/MC/Disassembler/AArch64/armv8.4a-virt.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-predres.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-specrestrict.txt
  llvm/test/MC/Disassembler/AArch64/armv8.5a-ssbs.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-el3.txt
  llvm/test/MC/Disassembler/AArch64/armv8a-fpmul.txt
  llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt

Index: llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
===
--- llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
+++ llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
@@ -1257,27 +1257,21 @@
 0xe1 0xff 0x1f 0xd4
 
 # CHECK: hvc  #{{1|0x1}}
-# CHECK: smc  #{{12000|0x2ee0}}
 # CHECK: brk  #{{12|0xc}}
 # CHECK: hlt  #{{123|0x7b}}
 0x22 0x0 0x0 0xd4
-0x3 0xdc 0x5 0xd4
 0x80 0x1 0x20 0xd4
 0x60 0xf 0x40 0xd4
 
 # CHECK: dcps1#{{42|0x2a}}
 # CHECK: dcps2#{{9|0x9}}
-# CHECK: dcps3#{{1000|0x3e8}}
 0x41 0x5 0xa0 0xd4
 0x22 0x1 0xa0 0xd4
-0x3 0x7d 0xa0 0xd4
 
 # CHECK: dcps1
 # CHECK: dcps2
-# CHECK: dcps3
 0x1 0x0 0xa0 0xd4
 0x2 0x0 0xa0 0xd4
-0x3 0x0 0xa0 0xd4
 
 #--
 # Extract (immediate)
@@ -3258,13 +3252,11 @@
 # CHECK: msr  {{hacr_el2|HACR_EL2}}, x12
 # CHECK: msr  {{mdcr_el3|MDCR_EL3}}, x12
 # CHECK: msr  {{ttbr0_el1|TTBR0_EL1}}, x12
-# CHECK: msr  {{ttbr0_el2|TTBR0_EL2}}, x12
 # CHECK: msr  {{ttbr0_el3|TTBR0_EL3}}, x12
 # CHECK: msr  {{ttbr1_el1|TTBR1_EL1}}, x12
 # CHECK: msr  {{tcr_el1|TCR_EL1}}, x12
 # CHECK: msr  {{tcr_el2|TCR_EL2}}, x12
 # CHECK: msr  {{tcr_el3|TCR_EL3}}, x12
-# CHECK: msr  {{vttbr_el2|VTTBR_EL2}}, x12
 # CHECK: msr  {{vtcr_el2|VTCR_EL2}}, x12
 # CHECK: msr  {{dacr32_el2|DACR32_EL2}}, x12
 # CHECK: msr  {{spsr_el1|SPSR_EL1}}, x12
@@ -3554,13 +3546,11 @@
 # CHECK: mrs  x9, {{hacr_el2|HACR_EL2}}
 # CHECK: mrs  x9, {{mdcr_el3|MDCR_EL3}}
 # CHECK: mrs  x9, {{ttbr0_el1|TTBR0_EL1}}
-# CHECK: mrs  x9, {{ttbr0_el2|TTBR0_EL2}}
 # CHECK: mrs  x9, {{ttbr0_el3|TTBR0_EL3}}
 # CHECK: mrs  x9, {{ttbr1_el1|TTBR1_EL1}}
 # CHECK: mrs  x9, {{tcr_el1|TCR_EL1}}
 # CHECK: mrs  x9, {{tcr_el2|TCR_EL2}}
 # CHECK: mrs  x9, {{tcr_el3|TCR_EL3}}
-# CHECK: mrs  x9, {{vttbr_el2|VTTBR_EL2}}
 # CHECK: mrs  x9, {{vtcr_el2|VTCR_EL2}}
 # CHECK: mrs  x9, 

[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-16 Thread Alexandros Lamprineas via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG1bd5ea968e92: [ARM] Mitigate the cve-2021-35465 security 
vulnurability. (authored by labrinea).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/Driver/arm-cmse-cve-2021-35465.c
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465-return.ll
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
  llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir

Index: llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir
===
--- llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir
+++ llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir
@@ -1,4 +1,4 @@
-# RUN: llc -mtriple=thumbv8m.main -mcpu=cortex-m33 --float-abi=hard --run-pass=arm-pseudo %s -o - | \
+# RUN: llc -mtriple=thumbv8m.main -mcpu=cortex-m33 -mattr=-fix-cmse-cve-2021-35465 --float-abi=hard --run-pass=arm-pseudo %s -o - | \
 # RUN: FileCheck %s
 --- |
   ; ModuleID = 'cmse-vlldm-no-reorder.ll'
@@ -109,4 +109,4 @@
 # CHECK-NEXT:  $s0 = VMOVSR $r12, 14 /* CC::al */, $noreg
 # CHECK-NEXT:  $sp = tADDspi $sp, 34, 14 /* CC::al */, $noreg
 # CHECK-NEXT:  $sp = t2LDMIA_UPD $sp, 14 /* CC::al */, $noreg, def $r4, def $r5, def $r6, def $r7, def $r8, def $r9, def $r10, def $r11
- 
\ No newline at end of file
+ 
Index: llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
@@ -0,0 +1,119 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -verify-machineinstrs \
+; RUN:   -mattr=+fp-armv8d16sp,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m33 -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m35p -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -verify-machineinstrs \
+; RUN:   -mattr=-fpregs,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m33 -mattr=-fpregs -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m35p -mattr=-fpregs -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -verify-machineinstrs \
+; RUN:   -mattr=+fp-armv8d16sp,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mcpu=cortex-m55 -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -verify-machineinstrs \
+; RUN:   -mattr=-fpregs,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mcpu=cortex-m55 -mattr=-fpregs -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+
+define void @non_secure_call(void ()* %fptr) {
+; CHECK-8M-FP-CVE-2021-35465-LABEL: non_secure_call:
+; CHECK-8M-FP-CVE-2021-35465:   @ %bb.0:
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push {r7, lr}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:bic r0, r0, #1
+; CHECK-8M-FP-CVE-2021-35465-NEXT:sub sp, #136
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vlstm sp
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r1, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r2, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r3, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r4, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r5, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r6, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r7, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r8, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r9, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r10, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r11, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r12, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:msr apsr_nzcvq{{g?}}, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:blxns r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mrs r12, control
+; CHECK-8M-FP-CVE-2021-35465-NEXT:tst.w r12, #8
+; 

[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-15 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 372677.
labrinea added a comment.

Changes in this revision:

- added `-verify-machineinstrs` to the tests
- that yield two bugs that I had to address:

  *** Bad machine code: Explicit operand marked as def ***
  - function:func
  - basic block: %bb.0 entry (0x890b6d8)
  - instruction: $d3 = VSTRD $sp, 6, 14, $noreg
  - operand 0:   $d3



  *** Bad machine code: Explicit definition marked as use ***
  - function:non_secure_call
  - basic block: %bb.0  (0x8e0bed8)
  - instruction: t2MRS_M $r12, 20, 14, $noreg
  - operand 0:   $r12


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/Driver/arm-cmse-cve-2021-35465.c
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465-return.ll
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
  llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir

Index: llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir
===
--- llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir
+++ llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir
@@ -1,4 +1,4 @@
-# RUN: llc -mtriple=thumbv8m.main -mcpu=cortex-m33 --float-abi=hard --run-pass=arm-pseudo %s -o - | \
+# RUN: llc -mtriple=thumbv8m.main -mcpu=cortex-m33 -mattr=-fix-cmse-cve-2021-35465 --float-abi=hard --run-pass=arm-pseudo %s -o - | \
 # RUN: FileCheck %s
 --- |
   ; ModuleID = 'cmse-vlldm-no-reorder.ll'
@@ -109,4 +109,4 @@
 # CHECK-NEXT:  $s0 = VMOVSR $r12, 14 /* CC::al */, $noreg
 # CHECK-NEXT:  $sp = tADDspi $sp, 34, 14 /* CC::al */, $noreg
 # CHECK-NEXT:  $sp = t2LDMIA_UPD $sp, 14 /* CC::al */, $noreg, def $r4, def $r5, def $r6, def $r7, def $r8, def $r9, def $r10, def $r11
- 
\ No newline at end of file
+ 
Index: llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
@@ -0,0 +1,119 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -verify-machineinstrs \
+; RUN:   -mattr=+fp-armv8d16sp,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m33 -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m35p -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -verify-machineinstrs \
+; RUN:   -mattr=-fpregs,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m33 -mattr=-fpregs -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m35p -mattr=-fpregs -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -verify-machineinstrs \
+; RUN:   -mattr=+fp-armv8d16sp,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mcpu=cortex-m55 -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -verify-machineinstrs \
+; RUN:   -mattr=-fpregs,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mcpu=cortex-m55 -mattr=-fpregs -verify-machineinstrs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+
+define void @non_secure_call(void ()* %fptr) {
+; CHECK-8M-FP-CVE-2021-35465-LABEL: non_secure_call:
+; CHECK-8M-FP-CVE-2021-35465:   @ %bb.0:
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push {r7, lr}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:bic r0, r0, #1
+; CHECK-8M-FP-CVE-2021-35465-NEXT:sub sp, #136
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vlstm sp
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r1, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r2, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r3, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r4, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r5, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r6, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r7, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r8, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r9, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r10, r0
+; 

[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-15 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 372674.
labrinea added a comment.

Changes in this revision:

- Replaced the backend option that enables the mitigation with a subtarget 
feature so that it works with LTO (@lenary thanks for the offline hint)
- Enabled the subtarget feature on the affected CPUs


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/Driver/arm-cmse-cve-2021-35465.c
  llvm/lib/Target/ARM/ARM.td
  llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
  llvm/lib/Target/ARM/ARMSubtarget.h
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465-return.ll
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
  llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir

Index: llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir
===
--- llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir
+++ llvm/test/CodeGen/ARM/cmse-vlldm-no-reorder.mir
@@ -1,4 +1,4 @@
-# RUN: llc -mtriple=thumbv8m.main -mcpu=cortex-m33 --float-abi=hard --run-pass=arm-pseudo %s -o - | \
+# RUN: llc -mtriple=thumbv8m.main -mcpu=cortex-m33 -mattr=-fix-cmse-cve-2021-35465 --float-abi=hard --run-pass=arm-pseudo %s -o - | \
 # RUN: FileCheck %s
 --- |
   ; ModuleID = 'cmse-vlldm-no-reorder.ll'
@@ -109,4 +109,4 @@
 # CHECK-NEXT:  $s0 = VMOVSR $r12, 14 /* CC::al */, $noreg
 # CHECK-NEXT:  $sp = tADDspi $sp, 34, 14 /* CC::al */, $noreg
 # CHECK-NEXT:  $sp = t2LDMIA_UPD $sp, 14 /* CC::al */, $noreg, def $r4, def $r5, def $r6, def $r7, def $r8, def $r9, def $r10, def $r11
- 
\ No newline at end of file
+ 
Index: llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
@@ -0,0 +1,119 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main \
+; RUN:   -mattr=+fp-armv8d16sp,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m33 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m35p | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main \
+; RUN:   -mattr=-fpregs,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m33 -mattr=-fpregs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mcpu=cortex-m35p -mattr=-fpregs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main \
+; RUN:   -mattr=+fp-armv8d16sp,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mcpu=cortex-m55 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main \
+; RUN:   -mattr=-fpregs,+fix-cmse-cve-2021-35465 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mcpu=cortex-m55 -mattr=-fpregs | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+
+define void @non_secure_call(void ()* %fptr) {
+; CHECK-8M-FP-CVE-2021-35465-LABEL: non_secure_call:
+; CHECK-8M-FP-CVE-2021-35465:   @ %bb.0:
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push {r7, lr}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:bic r0, r0, #1
+; CHECK-8M-FP-CVE-2021-35465-NEXT:sub sp, #136
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vlstm sp
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r1, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r2, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r3, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r4, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r5, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r6, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r7, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r8, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r9, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r10, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r11, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r12, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:msr apsr_nzcvq{{g?}}, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:blxns r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mrs r12, control
+; CHECK-8M-FP-CVE-2021-35465-NEXT:tst.w r12, #8
+; CHECK-8M-FP-CVE-2021-35465-NEXT:it ne
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vmovne.f32 s0, s0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vlldm sp
+; CHECK-8M-FP-CVE-2021-35465-NEXT:add sp, #136
+; 

[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-14 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added a comment.

@ostannard, can you explain what you meant with supporting LTO? I didn't quite 
undestand. Are you happy with the rest of the changes?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-07 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 371083.
labrinea added a comment.

Changes in this revision:

- renamed fix_cve_2021_35465 to Fix_CVE_2021_35465
- added more Driver tests to cover the use of -mfix-cmse-cve-2021-35465 with 
-mno-fix-cmse-cve-2021-35465
- fixed code indentation


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/arm-cmse-cve-2021-35465.c
  llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465-return.ll
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll

Index: llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
@@ -0,0 +1,101 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mattr=+fp-armv8d16sp \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mattr=-fpregs \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mattr=+fp-armv8d16sp \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mattr=-fpregs \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+
+
+define void @non_secure_call(void ()* %fptr) {
+; CHECK-8M-FP-CVE-2021-35465-LABEL: non_secure_call:
+; CHECK-8M-FP-CVE-2021-35465:   @ %bb.0:
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push {r7, lr}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:bic r0, r0, #1
+; CHECK-8M-FP-CVE-2021-35465-NEXT:sub sp, #136
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vlstm sp
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r1, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r2, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r3, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r4, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r5, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r6, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r7, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r8, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r9, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r10, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r11, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r12, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:msr apsr_nzcvq, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:blxns r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mrs r12, control
+; CHECK-8M-FP-CVE-2021-35465-NEXT:tst.w r12, #8
+; CHECK-8M-FP-CVE-2021-35465-NEXT:it ne
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vmovne.f32 s0, s0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vlldm sp
+; CHECK-8M-FP-CVE-2021-35465-NEXT:add sp, #136
+; CHECK-8M-FP-CVE-2021-35465-NEXT:pop.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:pop {r7, pc}
+;
+; CHECK-8M-NOFP-CVE-2021-35465-LABEL: non_secure_call:
+; CHECK-8M-NOFP-CVE-2021-35465:   @ %bb.0:
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:push {r7, lr}
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:push.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:bic r0, r0, #1
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:sub sp, #136
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:vlstm sp
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r1, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r2, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r3, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r4, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r5, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r6, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r7, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r8, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r9, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r10, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r11, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r12, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:msr apsr_nzcvq, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:blxns r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mrs r12, control
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:tst.w r12, #8
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:it ne
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:@APP
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:.inst.w 0xeeb00a40
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:@NO_APP
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:vlldm sp
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:add sp, #136
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:pop.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:pop {r7, pc}
+;
+; 

[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-07 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea marked 4 inline comments as done and 6 inline comments as done.
labrinea added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1666
+CmdArgs.push_back("-mllvm");
+if (A->getOption().matches(options::OPT_mfix_cmse_cve_2021_35465))
+  CmdArgs.push_back("-arm-fix-cmse-cve-2021-35465=1");

SjoerdMeijer wrote:
> I am wondering if this should use `getLastArg` and what happens with test 
> cases (which I guess need adding) that have both:
> 
>   -mno-fix-cmse-cve-2021-35465  -mfix-cmse-cve-2021-35465 
> 
> or
> 
>   -mfix-cmse-cve-2021-35465  -mno-fix-cmse-cve-2021-35465
That's the whole point of `getLastArg` as far as I understand: for options that 
can either enable or disable a feature, so that the last one wins. I'll add 
more tests.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-07 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 371028.
labrinea added a comment.

Changes in this revision:

- pass the -arm-fix-cmse-cve-2021-35465 option once
- document -m(no)fix-cmse-cve-2021-35465 in ClangCommandLineReference.rst
- add clang tests with the mitigation expicitely disabled on affected cpus
- removed `.addReg(ARM::CPSR, RegState::ImplicitDefine)`
- moved the mitigation sequence inside a bundle to prevent reordering


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/arm-cmse-cve-2021-35465.c
  llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465-return.ll
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll

Index: llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
@@ -0,0 +1,101 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mattr=+fp-armv8d16sp \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mattr=-fpregs \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mattr=+fp-armv8d16sp \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mattr=-fpregs \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+
+
+define void @non_secure_call(void ()* %fptr) {
+; CHECK-8M-FP-CVE-2021-35465-LABEL: non_secure_call:
+; CHECK-8M-FP-CVE-2021-35465:   @ %bb.0:
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push {r7, lr}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:bic r0, r0, #1
+; CHECK-8M-FP-CVE-2021-35465-NEXT:sub sp, #136
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vlstm sp
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r1, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r2, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r3, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r4, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r5, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r6, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r7, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r8, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r9, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r10, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r11, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r12, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:msr apsr_nzcvq, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:blxns r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mrs r12, control
+; CHECK-8M-FP-CVE-2021-35465-NEXT:tst.w r12, #8
+; CHECK-8M-FP-CVE-2021-35465-NEXT:it ne
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vmovne.f32 s0, s0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vlldm sp
+; CHECK-8M-FP-CVE-2021-35465-NEXT:add sp, #136
+; CHECK-8M-FP-CVE-2021-35465-NEXT:pop.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:pop {r7, pc}
+;
+; CHECK-8M-NOFP-CVE-2021-35465-LABEL: non_secure_call:
+; CHECK-8M-NOFP-CVE-2021-35465:   @ %bb.0:
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:push {r7, lr}
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:push.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:bic r0, r0, #1
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:sub sp, #136
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:vlstm sp
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r1, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r2, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r3, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r4, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r5, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r6, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r7, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r8, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r9, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r10, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r11, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r12, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:msr apsr_nzcvq, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:blxns r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mrs r12, control
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:tst.w r12, #8
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:it ne
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:@APP
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:.inst.w 0xeeb00a40
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:@NO_APP
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:vlldm sp
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:add sp, #136

[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-06 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1665
+
+CmdArgs.push_back("-mllvm");
+if (A->getOption().matches(options::OPT_mfix_cmse_cve_2021_35465))

ostannard wrote:
> ostannard wrote:
> > labrinea wrote:
> > > ostannard wrote:
> > > > Are these optional also being passed through to the linker when doing 
> > > > LTO?
> > > No, the mitigation is only performed in the compiler. Also, I believe 
> > > that `-flto` and `-mcmse` are incompatible options.
> > The mitigation is performed in the backend, which is run from the linker 
> > when doing LTO.
> > 
> > > Also, I believe that -flto and -mcmse are incompatible options.
> > 
> > Fair enough
> Actually, I just checked and these options are accepted together, and I can't 
> find any docs saying they are incompatible. Do you have a link to something 
> I've missed? Since there isn't already an error, I think we should either fix 
> this to work with LTO (my preference), or add an error when using the options 
> together, and document that.
I have addressed all the other comments, but I am not sure how to go about this 
one. What does it take to make the cve-2021-35465 option work with LTO? Could 
you elaborate on this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-06 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1665
+
+CmdArgs.push_back("-mllvm");
+if (A->getOption().matches(options::OPT_mfix_cmse_cve_2021_35465))

ostannard wrote:
> Are these optional also being passed through to the linker when doing LTO?
No, the mitigation is only performed in the compiler. Also, I believe that 
`-flto` and `-mcmse` are incompatible options.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1666
+CmdArgs.push_back("-mllvm");
+if (A->getOption().matches(options::OPT_mfix_cmse_cve_2021_35465))
+  CmdArgs.push_back("-arm-fix-cmse-cve-2021-35465=1");

SjoerdMeijer wrote:
> If `-mcpu=cortex-[m33|m35|m55]` was provided, then 
> `-arm-fix-cmse-cve-2021-35465=1` is already set and we are adding another 
> option here? For example, for
> 
>   -mcpu=cortex-m33 -mcmse -mfix-cmse-cve-2021-35465
> 
> I am expecting:
> 
>   "-mllvm" "-arm-fix-cmse-cve-2021-35465=1"  "-mllvm" 
> "-arm-fix-cmse-cve-2021-35465=1" 
> 
> and with `-mno-fix-cmse-cve-2021-35465`:
> 
>"-mllvm" "-arm-fix-cmse-cve-2021-35465=1"  "-mllvm" 
> "-arm-fix-cmse-cve-2021-35465=0" 
> 
> Probably it's nicer to just pass this once.
> 
> Also, in the tests, I think cases are missing for `-mcpu=...` and 
> `-m[no-]fix-cmse-cve-2021-35465`.
Your interpretetion is correct. The established policy is "last option wins", 
but I could make the Driver pass only one option if that's preferable.



Comment at: clang/test/Driver/arm-cmse-cve-2021-35465.c:3
+//
+// RUN: %clang --target=arm-arm-none-eabi -march=armv8-m.main %s -### \
+// RUN:   -mcmse -mno-fix-cmse-cve-2021-35465 2>&1 |\

ostannard wrote:
> The last few paragraphs of 
> https://developer.arm.com/support/arm-security-updates/vlldm-instruction-security-vulnerability
>  claim that this is enabled by default for -march=armv8-m.main in AC6 and 
> GCC, is there a reason we're not matching that?
Yes, the inconsistency lies on the fact that GCC implements the mitigation in 
library code, therefore it is always present for `-march=armv8-m.main`, whereas 
in llvm there's no such limitation. We've contacted the authors of this page to 
rectify the documentation.



Comment at: llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp:1564
   .addReg(Reg)
+  .addReg(ARM::CPSR, RegState::ImplicitDefine)
   .add(predOps(ARMCC::AL));

ostannard wrote:
> Why are these needed?
These prevent the reordering with the mitigation sequence. It answers your next 
question.



Comment at: llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp:1626
+  // Thumb2ITBlockPass will not recognise the instruction as conditional.
+  BuildMI(MBB, MBBI, DL, TII->get(ARM::t2IT))
+  .addImm(ARMCC::NE)

ostannard wrote:
> This pass runs before the final scheduler pass, so is there a risk that the 
> IT and VMOV instructions could be moved apart? I think it would be safer to 
> either put the IT instruction inside the inline asm block, or make this a new 
> pseudo-instruction expanded in the asm printer.
The use of `.addReg(ARM::CPSR, RegState::ImplicitDefine)` prevents the 
reordering. There are regression tests in place that check the mitigation 
sequence ordering.

Is this what you meant? Where you refering specifically to the case where 
`!STI->hasFPRegs()`, when we generate inline asm, or to both scenarios?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109157/new/

https://reviews.llvm.org/D109157

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109157: [ARM] Mitigate the cve-2021-35465 security vulnurability.

2021-09-02 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea created this revision.
labrinea added reviewers: llvm-commits, momchil.velikov.
Herald added subscribers: dang, hiraditya, kristof.beyls.
labrinea requested review of this revision.
Herald added projects: clang, LLVM.
Herald added a subscriber: cfe-commits.

Recently a vulnerability issue is found in the implementation of VLLDM 
instruction in the Arm Cortex-M33, Cortex-M35P and Cortex-M55. If the VLLDM 
instruction is abandoned due to an exception when it is partially completed, it 
is possible for subsequent non-secure handler to access and modify the partial 
restored register values. This vulnerability is identified as CVE-2021-35465. 
The mitigation sequence varies between v8-m and v8.1-m as follows:

v8-m.main

  mrsr5, control
  tstr5, #8   /* CONTROL_S.SFPA */
  it ne
  .inst.w0xeeb00a40   /* vmovne s0, s0 */
  1:
  vlldm  sp   /* Lazy restore of d0-d16 and FPSCR. */

v8.1-m.main

  vscclrm{vpr}/* Clear VPR. */
  vlldm  sp   /* Lazy restore of d0-d16 and FPSCR. */

More details on 
https://developer.arm.com/support/arm-security-updates/vlldm-instruction-security-vulnerability


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D109157

Files:
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/arm-cmse-cve-2021-35465.c
  llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465-return.ll
  llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll

Index: llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/cmse-cve-2021-35465.ll
@@ -0,0 +1,101 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mattr=+fp-armv8d16sp \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-FP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8m.main -mattr=-fpregs \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-8M-NOFP-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mattr=+fp-armv8d16sp \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+;
+; RUN: llc %s -o - -mtriple=thumbv8.1m.main -mattr=-fpregs \
+; RUN:   -arm-fix-cmse-cve-2021-35465=1 | \
+; RUN:   FileCheck %s --check-prefix=CHECK-81M-CVE-2021-35465
+
+
+define void @non_secure_call(void ()* %fptr) {
+; CHECK-8M-FP-CVE-2021-35465-LABEL: non_secure_call:
+; CHECK-8M-FP-CVE-2021-35465:   @ %bb.0:
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push {r7, lr}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:push.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:bic r0, r0, #1
+; CHECK-8M-FP-CVE-2021-35465-NEXT:sub sp, #136
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vlstm sp
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r1, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r2, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r3, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r4, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r5, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r6, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r7, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r8, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r9, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r10, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r11, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mov r12, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:msr apsr_nzcvq, r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:blxns r0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:mrs r12, control
+; CHECK-8M-FP-CVE-2021-35465-NEXT:tst.w r12, #8
+; CHECK-8M-FP-CVE-2021-35465-NEXT:it ne
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vmovne.f32 s0, s0
+; CHECK-8M-FP-CVE-2021-35465-NEXT:vlldm sp
+; CHECK-8M-FP-CVE-2021-35465-NEXT:add sp, #136
+; CHECK-8M-FP-CVE-2021-35465-NEXT:pop.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-FP-CVE-2021-35465-NEXT:pop {r7, pc}
+;
+; CHECK-8M-NOFP-CVE-2021-35465-LABEL: non_secure_call:
+; CHECK-8M-NOFP-CVE-2021-35465:   @ %bb.0:
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:push {r7, lr}
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:push.w {r4, r5, r6, r7, r8, r9, r10, r11}
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:bic r0, r0, #1
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:sub sp, #136
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:vlstm sp
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r1, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r2, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r3, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r4, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r5, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r6, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r7, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r8, r0
+; CHECK-8M-NOFP-CVE-2021-35465-NEXT:mov r9, r0
+; 

[PATCH] D94098: [Clang][AArch64] Inline assembly support for the ACLE type 'data512_t'.

2021-07-31 Thread Alexandros Lamprineas via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG29b263a34f1a: [Clang][AArch64] Inline assembly support for 
the ACLE type data512_t (authored by labrinea).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94098/new/

https://reviews.llvm.org/D94098

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/CodeGen/CGStmt.cpp
  clang/lib/CodeGen/TargetInfo.cpp
  clang/lib/CodeGen/TargetInfo.h
  clang/test/CodeGen/aarch64-ls64-inline-asm.c

Index: clang/test/CodeGen/aarch64-ls64-inline-asm.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-ls64-inline-asm.c
@@ -0,0 +1,84 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple aarch64-eabi -target-feature +ls64 -O1 -S -emit-llvm -x c %s -o - | FileCheck %s
+
+struct foo { unsigned long long x[8]; };
+
+// CHECK-LABEL: @load(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = call i512 asm sideeffect "ld64b $0,[$1]", "=r,r,~{memory}"(i8* [[ADDR:%.*]]) #[[ATTR1:[0-9]+]], !srcloc !6
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast %struct.foo* [[OUTPUT:%.*]] to i512*
+// CHECK-NEXT:store i512 [[TMP0]], i512* [[TMP1]], align 8
+// CHECK-NEXT:ret void
+//
+void load(struct foo *output, void *addr)
+{
+__asm__ volatile ("ld64b %0,[%1]" : "=r" (*output) : "r" (addr) : "memory");
+}
+
+// CHECK-LABEL: @store(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast %struct.foo* [[INPUT:%.*]] to i512*
+// CHECK-NEXT:[[TMP1:%.*]] = load i512, i512* [[TMP0]], align 8
+// CHECK-NEXT:call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[TMP1]], i8* [[ADDR:%.*]]) #[[ATTR1]], !srcloc !7
+// CHECK-NEXT:ret void
+//
+void store(const struct foo *input, void *addr)
+{
+__asm__ volatile ("st64b %0,[%1]" : : "r" (*input), "r" (addr) : "memory" );
+}
+
+// CHECK-LABEL: @store2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = load i32, i32* [[IN:%.*]], align 4, !tbaa [[TBAA8:![0-9]+]]
+// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[TMP0]] to i64
+// CHECK-NEXT:[[ARRAYIDX1:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 1
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, i32* [[ARRAYIDX1]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV2:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT:[[ARRAYIDX4:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 4
+// CHECK-NEXT:[[TMP2:%.*]] = load i32, i32* [[ARRAYIDX4]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV5:%.*]] = sext i32 [[TMP2]] to i64
+// CHECK-NEXT:[[ARRAYIDX7:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 16
+// CHECK-NEXT:[[TMP3:%.*]] = load i32, i32* [[ARRAYIDX7]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV8:%.*]] = sext i32 [[TMP3]] to i64
+// CHECK-NEXT:[[ARRAYIDX10:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 25
+// CHECK-NEXT:[[TMP4:%.*]] = load i32, i32* [[ARRAYIDX10]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV11:%.*]] = sext i32 [[TMP4]] to i64
+// CHECK-NEXT:[[ARRAYIDX13:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 36
+// CHECK-NEXT:[[TMP5:%.*]] = load i32, i32* [[ARRAYIDX13]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV14:%.*]] = sext i32 [[TMP5]] to i64
+// CHECK-NEXT:[[ARRAYIDX16:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 49
+// CHECK-NEXT:[[TMP6:%.*]] = load i32, i32* [[ARRAYIDX16]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV17:%.*]] = sext i32 [[TMP6]] to i64
+// CHECK-NEXT:[[ARRAYIDX19:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 64
+// CHECK-NEXT:[[TMP7:%.*]] = load i32, i32* [[ARRAYIDX19]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV20:%.*]] = sext i32 [[TMP7]] to i64
+// CHECK-NEXT:[[S_SROA_10_0_INSERT_EXT:%.*]] = zext i64 [[CONV20]] to i512
+// CHECK-NEXT:[[S_SROA_10_0_INSERT_SHIFT:%.*]] = shl nuw i512 [[S_SROA_10_0_INSERT_EXT]], 448
+// CHECK-NEXT:[[S_SROA_9_0_INSERT_EXT:%.*]] = zext i64 [[CONV17]] to i512
+// CHECK-NEXT:[[S_SROA_9_0_INSERT_SHIFT:%.*]] = shl nuw nsw i512 [[S_SROA_9_0_INSERT_EXT]], 384
+// CHECK-NEXT:[[S_SROA_9_0_INSERT_INSERT:%.*]] = or i512 [[S_SROA_10_0_INSERT_SHIFT]], [[S_SROA_9_0_INSERT_SHIFT]]
+// CHECK-NEXT:[[S_SROA_8_0_INSERT_EXT:%.*]] = zext i64 [[CONV14]] to i512
+// CHECK-NEXT:[[S_SROA_8_0_INSERT_SHIFT:%.*]] = shl nuw nsw i512 [[S_SROA_8_0_INSERT_EXT]], 320
+// CHECK-NEXT:[[S_SROA_8_0_INSERT_INSERT:%.*]] = or i512 [[S_SROA_9_0_INSERT_INSERT]], [[S_SROA_8_0_INSERT_SHIFT]]
+// CHECK-NEXT:[[S_SROA_7_0_INSERT_EXT:%.*]] = zext i64 [[CONV11]] to i512
+// CHECK-NEXT:[[S_SROA_7_0_INSERT_SHIFT:%.*]] = shl nuw nsw i512 [[S_SROA_7_0_INSERT_EXT]], 256
+// CHECK-NEXT:[[S_SROA_7_0_INSERT_INSERT:%.*]] = or i512 [[S_SROA_8_0_INSERT_INSERT]], [[S_SROA_7_0_INSERT_SHIFT]]
+// CHECK-NEXT:

[PATCH] D94098: [Clang][AArch64] Inline assembly support for the ACLE type 'data512_t'.

2021-07-26 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added a comment.

ping


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94098/new/

https://reviews.llvm.org/D94098

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94098: [Clang][AArch64] Inline assembly support for the ACLE type 'data512_t'.

2021-07-20 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 360208.
labrinea added a comment.

This revision uses `i512` to pass the asm operands by-value. I've explained in 
my last comment what would be the challenges had we chosen `[i64 x 8]`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94098/new/

https://reviews.llvm.org/D94098

Files:
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/CodeGen/CGStmt.cpp
  clang/lib/CodeGen/TargetInfo.cpp
  clang/lib/CodeGen/TargetInfo.h
  clang/test/CodeGen/aarch64-ls64-inline-asm.c

Index: clang/test/CodeGen/aarch64-ls64-inline-asm.c
===
--- clang/test/CodeGen/aarch64-ls64-inline-asm.c
+++ clang/test/CodeGen/aarch64-ls64-inline-asm.c
@@ -0,0 +1,84 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple aarch64-eabi -target-feature +ls64 -O1 -S -emit-llvm -x c %s -o - | FileCheck %s
+
+struct foo { unsigned long long x[8]; };
+
+// CHECK-LABEL: @load(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = call i512 asm sideeffect "ld64b $0,[$1]", "=r,r,~{memory}"(i8* [[ADDR:%.*]]) #[[ATTR1:[0-9]+]], !srcloc !6
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast %struct.foo* [[OUTPUT:%.*]] to i512*
+// CHECK-NEXT:store i512 [[TMP0]], i512* [[TMP1]], align 8
+// CHECK-NEXT:ret void
+//
+void load(struct foo *output, void *addr)
+{
+__asm__ volatile ("ld64b %0,[%1]" : "=r" (*output) : "r" (addr) : "memory");
+}
+
+// CHECK-LABEL: @store(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = bitcast %struct.foo* [[INPUT:%.*]] to i512*
+// CHECK-NEXT:[[TMP1:%.*]] = load i512, i512* [[TMP0]], align 8
+// CHECK-NEXT:call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[TMP1]], i8* [[ADDR:%.*]]) #[[ATTR1]], !srcloc !7
+// CHECK-NEXT:ret void
+//
+void store(const struct foo *input, void *addr)
+{
+__asm__ volatile ("st64b %0,[%1]" : : "r" (*input), "r" (addr) : "memory" );
+}
+
+// CHECK-LABEL: @store2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = load i32, i32* [[IN:%.*]], align 4, !tbaa [[TBAA8:![0-9]+]]
+// CHECK-NEXT:[[CONV:%.*]] = sext i32 [[TMP0]] to i64
+// CHECK-NEXT:[[ARRAYIDX1:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 1
+// CHECK-NEXT:[[TMP1:%.*]] = load i32, i32* [[ARRAYIDX1]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV2:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT:[[ARRAYIDX4:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 4
+// CHECK-NEXT:[[TMP2:%.*]] = load i32, i32* [[ARRAYIDX4]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV5:%.*]] = sext i32 [[TMP2]] to i64
+// CHECK-NEXT:[[ARRAYIDX7:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 16
+// CHECK-NEXT:[[TMP3:%.*]] = load i32, i32* [[ARRAYIDX7]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV8:%.*]] = sext i32 [[TMP3]] to i64
+// CHECK-NEXT:[[ARRAYIDX10:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 25
+// CHECK-NEXT:[[TMP4:%.*]] = load i32, i32* [[ARRAYIDX10]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV11:%.*]] = sext i32 [[TMP4]] to i64
+// CHECK-NEXT:[[ARRAYIDX13:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 36
+// CHECK-NEXT:[[TMP5:%.*]] = load i32, i32* [[ARRAYIDX13]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV14:%.*]] = sext i32 [[TMP5]] to i64
+// CHECK-NEXT:[[ARRAYIDX16:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 49
+// CHECK-NEXT:[[TMP6:%.*]] = load i32, i32* [[ARRAYIDX16]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV17:%.*]] = sext i32 [[TMP6]] to i64
+// CHECK-NEXT:[[ARRAYIDX19:%.*]] = getelementptr inbounds i32, i32* [[IN]], i64 64
+// CHECK-NEXT:[[TMP7:%.*]] = load i32, i32* [[ARRAYIDX19]], align 4, !tbaa [[TBAA8]]
+// CHECK-NEXT:[[CONV20:%.*]] = sext i32 [[TMP7]] to i64
+// CHECK-NEXT:[[S_SROA_10_0_INSERT_EXT:%.*]] = zext i64 [[CONV20]] to i512
+// CHECK-NEXT:[[S_SROA_10_0_INSERT_SHIFT:%.*]] = shl nuw i512 [[S_SROA_10_0_INSERT_EXT]], 448
+// CHECK-NEXT:[[S_SROA_9_0_INSERT_EXT:%.*]] = zext i64 [[CONV17]] to i512
+// CHECK-NEXT:[[S_SROA_9_0_INSERT_SHIFT:%.*]] = shl nuw nsw i512 [[S_SROA_9_0_INSERT_EXT]], 384
+// CHECK-NEXT:[[S_SROA_9_0_INSERT_INSERT:%.*]] = or i512 [[S_SROA_10_0_INSERT_SHIFT]], [[S_SROA_9_0_INSERT_SHIFT]]
+// CHECK-NEXT:[[S_SROA_8_0_INSERT_EXT:%.*]] = zext i64 [[CONV14]] to i512
+// CHECK-NEXT:[[S_SROA_8_0_INSERT_SHIFT:%.*]] = shl nuw nsw i512 [[S_SROA_8_0_INSERT_EXT]], 320
+// CHECK-NEXT:[[S_SROA_8_0_INSERT_INSERT:%.*]] = or i512 [[S_SROA_9_0_INSERT_INSERT]], [[S_SROA_8_0_INSERT_SHIFT]]
+// CHECK-NEXT:[[S_SROA_7_0_INSERT_EXT:%.*]] = zext i64 [[CONV11]] to i512
+// CHECK-NEXT:[[S_SROA_7_0_INSERT_SHIFT:%.*]] = shl nuw nsw i512 [[S_SROA_7_0_INSERT_EXT]], 256
+// CHECK-NEXT:[[S_SROA_7_0_INSERT_INSERT:%.*]] = or i512 [[S_SROA_8_0_INSERT_INSERT]], [[S_SROA_7_0_INSERT_SHIFT]]
+// CHECK-NEXT:[[S_SROA_6_0_INSERT_EXT:%.*]] = zext i64 [[CONV8]] 

[PATCH] D94098: [Clang][AArch64] Inline assembly support for the ACLE type 'data512_t'.

2021-07-19 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added a comment.

> struct foo { unsigned long long x[8]; };
> void store(int *in, void *addr)
> {
>
>   struct foo x = { in[0], in[1], in[4], in[16], in[25], in[36], in[49], 
> in[64] };
>   __asm__ volatile ("st64b %0,[%1]" : : "r" (x), "r" (addr) : "memory" );
>
> }

For this particular example if we pass the asm operands as i512 the compiler 
generates the following, which doesn't look bad.

  ldpsw x2, x3, [x0]
  ldrsw x4, [x0, #16]
  ldrsw x5, [x0, #64]
  ldrsw x6, [x0, #100]
  ldrsw x7, [x0, #144]
  ldrsw x8, [x0, #196]
  ldrsw x9, [x0, #256]
  //APP
  st64b x2, [x1]
  //NO_APP

Looking at the IR, it seems that SROA gets in the way. It loads all eight i32 
values and constructs the i512 operand by performing bitwise operations on 
them. So I was wrong saying that the load of an i512 value won't get optimized.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94098/new/

https://reviews.llvm.org/D94098

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94098: [Clang][AArch64] Inline assembly support for the ACLE type 'data512_t'.

2021-07-18 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added a comment.

Ok, I've tried a few things. If we add a couple of new target hooks we can make 
clang pass both input and output asm operands by value as `type { [8 x i64] }` 
avoiding the integer conversion. One issue with that is that the inline asm 
verifier asserts if an inline asm statement returns a struct with one result 
(struct return types are meant to carry multiple results). By making 
adjustments to the existing target hook `adjustInlineAsmType()` we can even 
alter the asm operand type and make it `[8 x i64]` for example if that's 
preferable. Adding new calls to this hook without removing the existing ones 
will look ugly though, but at the same time I found it challenging given the 
complexity of the 400-line function `CodeGenFunction::EmitAsmStmt`, which needs 
tidying up. Unfortunately this is half of the story as by choosing an aggregate 
type for the asm operands we are allowing InstCombine (at -O1 and above) to 
turn the load/store instructions before/after the inline asm statement into 
insert/extract element + smaller loads/stores. I see two problems with that. 
Firstly, the information that the load/store comes from an inline asm operand 
gets lost by the time the SelectionDAG processes those nodes, and so we cannot 
use a target hook to select a special value type for them (as discussed in 
D94097  we want to narrow down the MVT 
specialization for an llvm type to only apply to asm operands and not 
universally). Moreover, having insert/extract element is pointless when the 
backend expects a load/store of `MVT::i64x8` for custom lowering. All that said 
I think that the best choice is to use `i512` for the asm operands since llvm 
cannot optimize that. The only change in clang's user visible behavior is that 
large aggregate output operands will not be diagnosed, like in the example at 
the description, but instead we'll be passing them by reference, which is what 
is already happening with input operands anyway.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94098/new/

https://reviews.llvm.org/D94098

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94098: [Clang][AArch64] Inline assembly support for the ACLE type 'data512_t'.

2021-07-13 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added a comment.

In D94098#2868751 , @efriedma wrote:

> The part I'm confused about is that you're forcing it to use "*r".  At the IR 
> level, LLVM handles something like `call void asm sideeffect "#$0", "r"([8 x 
> i64] %c)` fine.  You'll have to do a bit of work to teach clang to emit that, 
> but it shouldn't be that hard.  I think you can deal with it on the isel end 
> with some relatively small changes to D94097 
> .

If you discard my patch and look at the codegen for `__asm__ volatile ("st64b 
%0,[%1]" : : "r" (*input), "r" (addr) : "memory" );`, which uses the struct foo 
as an input operand, you'll see that clang is already passing it by reference. 
All I am doing is making this behavior consistent for output operands too. 
Whether llvm can deal with indirect asm register operands or not is a separate 
story (see llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8740). I think 
that making clang emit what you sugggested (to pass [8 x i64] by value) is 
inevitably going to be inelegant in a similar way that the previous revision of 
this patch was. Moreover, taking this route entails introducing more inelegant 
changes in D94097  (workarounds for MVT::i64x8 
in getCopyToParts() of the same file I previously mentioned). I have been 
unsuccessfully trying all the above and I can continue my efforts for a little 
more, but in my honest opinion I don't see the benefit.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94098/new/

https://reviews.llvm.org/D94098

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94098: [Clang][AArch64] Inline assembly support for the ACLE type 'data512_t'.

2021-07-09 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added a comment.

In D94098#2865372 , @efriedma wrote:

> I'm confused what your goal here is, exactly.  The point of allowing 512-bit 
> inline asm operands is presumably to allow writing efficient code involving 
> inline asm... but you're intentionally destroying any potential efficiency by 
> forcing it to be passed/returned in memory.  If the user wanted to do that, 
> they could just use an "m" constraint.
>
> It looks like SelectionDAG currently crashes if you try to pass an array as 
> an inline asm operand, but that should be possible to fix, I think.

I have explained in the description why I am doing this: i512 is not a 
qualified type and so it is not possible to emit the store instruction required 
for output operands (line 2650 in the original code of 
clang/lib/CodeGen/CGStmt.cpp). As I said clang has already tests in place for 
this case (clang/test/CodeGen/X86/x86_64-PR42672.c - function big_struct), so I 
don't see how I am destroying the efficient codegen, which only applies to 
small sized integers (because they have a qualified type). Can you suggest a 
better solution?

Regarding the Selection DAG, my patches https://reviews.llvm.org/D94096 and 
https://reviews.llvm.org/D94097 are adding support for this use case in the 
backend. @t.p.northover has raised a concern there too, so maybe my original 
set of patches (including a dedicated IR type) in the RFC 
https://lists.llvm.org/pipermail/llvm-dev/2020-November/146860.html were a 
better fit?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94098/new/

https://reviews.llvm.org/D94098

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94098: [Clang][AArch64] Inline assembly support for the ACLE type 'data512_t'.

2021-07-01 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 355908.
labrinea retitled this revision from "[Clang] Inline assembly support for the 
ACLE type 'data512_t'." to "[Clang][AArch64] Inline assembly support for the 
ACLE type 'data512_t'.".
labrinea edited the summary of this revision.
labrinea added a reviewer: momchil.velikov.
Herald added subscribers: danielkiss, pengfei.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94098/new/

https://reviews.llvm.org/D94098

Files:
  clang/include/clang/Basic/TargetInfo.h
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/CodeGen/CGStmt.cpp
  clang/test/CodeGen/aarch64-ls64-inline-asm.c

Index: clang/test/CodeGen/aarch64-ls64-inline-asm.c
===
--- clang/test/CodeGen/aarch64-ls64-inline-asm.c
+++ clang/test/CodeGen/aarch64-ls64-inline-asm.c
@@ -0,0 +1,37 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple aarch64-eabi -target-feature +ls64 -S -emit-llvm -x c %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64_be-eabi -target-feature +ls64 -S -emit-llvm -x c %s -o - | FileCheck %s
+
+struct foo { unsigned long long x[8]; };
+
+// CHECK-LABEL: @load(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[OUTPUT_ADDR:%.*]] = alloca %struct.foo*, align 8
+// CHECK-NEXT:[[ADDR_ADDR:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:store %struct.foo* [[OUTPUT:%.*]], %struct.foo** [[OUTPUT_ADDR]], align 8
+// CHECK-NEXT:store i8* [[ADDR:%.*]], i8** [[ADDR_ADDR]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load %struct.foo*, %struct.foo** [[OUTPUT_ADDR]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load i8*, i8** [[ADDR_ADDR]], align 8
+// CHECK-NEXT:call void asm sideeffect "ld64b $0,[$1]", "=*r,r,~{memory}"(%struct.foo* [[TMP0]], i8* [[TMP1]]) #[[ATTR1:[0-9]+]], !srcloc !6
+// CHECK-NEXT:ret void
+//
+void load(struct foo *output, void *addr)
+{
+__asm__ volatile ("ld64b %0,[%1]" : "=r" (*output) : "r" (addr) : "memory");
+}
+
+// CHECK-LABEL: @store(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[INPUT_ADDR:%.*]] = alloca %struct.foo*, align 8
+// CHECK-NEXT:[[ADDR_ADDR:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:store %struct.foo* [[INPUT:%.*]], %struct.foo** [[INPUT_ADDR]], align 8
+// CHECK-NEXT:store i8* [[ADDR:%.*]], i8** [[ADDR_ADDR]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load %struct.foo*, %struct.foo** [[INPUT_ADDR]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load i8*, i8** [[ADDR_ADDR]], align 8
+// CHECK-NEXT:call void asm sideeffect "st64b $0,[$1]", "*r,r,~{memory}"(%struct.foo* [[TMP0]], i8* [[TMP1]]) #[[ATTR1]], !srcloc !7
+// CHECK-NEXT:ret void
+//
+void store(const struct foo *input, void *addr)
+{
+__asm__ volatile ("st64b %0,[%1]" : : "r" (*input), "r" (addr) : "memory" );
+}
Index: clang/lib/CodeGen/CGStmt.cpp
===
--- clang/lib/CodeGen/CGStmt.cpp
+++ clang/lib/CodeGen/CGStmt.cpp
@@ -2287,15 +2287,25 @@
 // by-value.  If this is a memory result, return the value by-reference.
 bool isScalarizableAggregate =
 hasAggregateEvaluationKind(OutExpr->getType());
-if (!Info.allowsMemory() && (hasScalarEvaluationKind(OutExpr->getType()) ||
- isScalarizableAggregate)) {
+
+unsigned Size = getContext().getTypeSize(OutExpr->getType());
+
+// If this is a register output but the asm operand is of aggregate
+// type, then make the inline asm return it by-reference and let
+// the target deal with it when possible.
+bool byRef = Info.allowsRegister() && isScalarizableAggregate &&
+getTarget().canStoreAggregateOperandInRegister(Size);
+
+bool byVal = !Info.allowsMemory() &&
+   (hasScalarEvaluationKind(OutExpr->getType()) || isScalarizableAggregate);
+
+if (byVal && !byRef) {
   Constraints += "=" + OutputConstraint;
   ResultRegQualTys.push_back(OutExpr->getType());
   ResultRegDests.push_back(Dest);
   ResultTruncRegTypes.push_back(ConvertTypeForMem(OutExpr->getType()));
   if (Info.allowsRegister() && isScalarizableAggregate) {
 ResultTypeRequiresCast.push_back(true);
-unsigned Size = getContext().getTypeSize(OutExpr->getType());
 llvm::Type *ConvTy = llvm::IntegerType::get(getLLVMContext(), Size);
 ResultRegTypes.push_back(ConvTy);
   } else {
Index: clang/lib/Basic/Targets/AArch64.h
===
--- clang/lib/Basic/Targets/AArch64.h
+++ clang/lib/Basic/Targets/AArch64.h
@@ -141,6 +141,10 @@
   bool hasInt128Type() const override;
 
   bool hasExtIntType() const override { return true; }
+
+  bool canStoreAggregateOperandInRegister(unsigned size) const override {
+return size == 512 && HasLS64;
+  }
 };
 
 class LLVM_LIBRARY_VISIBILITY AArch64leTargetInfo : public AArch64TargetInfo {
Index: 

[PATCH] D95655: [AArch64] Adding Neon Sm3 & Sm4 Intrinsics

2021-01-30 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea accepted this revision.
labrinea added a comment.
This revision is now accepted and ready to land.

LGTM, thanks @rsanthir.quic !


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95655/new/

https://reviews.llvm.org/D95655

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D95655: Adding Neon Sm3 & Sm4 Intrinsics

2021-01-29 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: llvm/test/CodeGen/AArch64/neon-sm4-sm3.ll:24
+
+define <4 x i32> @test_vsm3ss1(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x 
i32> %d) {
+; CHECK-LABEL: test_vsm3ss1:

The forth argument (<4 x i32> %d) is redundant.



Comment at: llvm/test/CodeGen/AArch64/neon-sm4-sm3.ll:77
+; CHECK:   // %bb.0: // %entry
+; CHECK-NEXT:sm4e v1.4s, v0.4s
+; CHECK-NEXT:mov v0.16b, v1.16b

Shouldn't the registers be the other way around: sm4e v0.4s, v1.4s ? I believe 
the reason this happens is because of how CryptoRRTied is defined in 
`llvm/lib/Target/AArch64/AArch64InstrFormats.td`: 


```
class CryptoRRTiedop0, bits<2>op1, string asm, string asmops>
  : BaseCryptoV82<(outs V128:$Vd), (ins V128:$Vn, V128:$Vm), asm, asmops,
  "$Vm = $Vd", []> {
```

Vd be should be the first source register (as well as destination register) and 
Vn should be the second source register.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95655/new/

https://reviews.llvm.org/D95655

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94098: [Clang] Inline assembly support for the ACLE type 'data512_t'.

2021-01-05 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea created this revision.
labrinea added reviewers: cfe-commits, t.p.northover, ab, kristof.beyls, 
simon_tatham.
labrinea requested review of this revision.
Herald added a project: clang.

This patch emits the new LLVM IR type introduced in 
https://reviews.llvm.org/D94091 when generating IR for inline assembly source 
code that operates on `data512_t`, as long as the target hooks indicate the 
presence of the LS64 extension.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D94098

Files:
  clang/include/clang/Basic/TargetInfo.h
  clang/lib/Basic/Targets/AArch64.cpp
  clang/lib/Basic/Targets/AArch64.h
  clang/lib/CodeGen/CGStmt.cpp
  clang/lib/CodeGen/TargetInfo.cpp
  clang/test/CodeGen/aarch64-ls64-inline-asm.c

Index: clang/test/CodeGen/aarch64-ls64-inline-asm.c
===
--- /dev/null
+++ clang/test/CodeGen/aarch64-ls64-inline-asm.c
@@ -0,0 +1,41 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple aarch64-eabi -target-feature +ls64 -S -emit-llvm -x c %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64_be-eabi -target-feature +ls64 -S -emit-llvm -x c %s -o - | FileCheck %s
+
+struct foo { unsigned long long x[8]; };
+
+// CHECK-LABEL: @load(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[OUTPUT_ADDR:%.*]] = alloca %struct.foo*, align 8
+// CHECK-NEXT:[[ADDR_ADDR:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:store %struct.foo* [[OUTPUT:%.*]], %struct.foo** [[OUTPUT_ADDR]], align 8
+// CHECK-NEXT:store i8* [[ADDR:%.*]], i8** [[ADDR_ADDR]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load %struct.foo*, %struct.foo** [[OUTPUT_ADDR]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = load i8*, i8** [[ADDR_ADDR]], align 8
+// CHECK-NEXT:[[TMP2:%.*]] = call aarch64_ls64 asm sideeffect "ld64b $0,[$1]", "=r,r"(i8* [[TMP1]]) [[ATTR1:#.*]], !srcloc !6
+// CHECK-NEXT:[[TMP3:%.*]] = bitcast %struct.foo* [[TMP0]] to aarch64_ls64*
+// CHECK-NEXT:store aarch64_ls64 [[TMP2]], aarch64_ls64* [[TMP3]], align 8
+// CHECK-NEXT:ret void
+//
+void load(struct foo *output, void *addr)
+{
+__asm__ volatile ("ld64b %0,[%1]" : "=r" (*output) : "r" (addr));
+}
+
+// CHECK-LABEL: @store(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[INPUT_ADDR:%.*]] = alloca %struct.foo*, align 8
+// CHECK-NEXT:[[ADDR_ADDR:%.*]] = alloca i8*, align 8
+// CHECK-NEXT:store %struct.foo* [[INPUT:%.*]], %struct.foo** [[INPUT_ADDR]], align 8
+// CHECK-NEXT:store i8* [[ADDR:%.*]], i8** [[ADDR_ADDR]], align 8
+// CHECK-NEXT:[[TMP0:%.*]] = load %struct.foo*, %struct.foo** [[INPUT_ADDR]], align 8
+// CHECK-NEXT:[[TMP1:%.*]] = bitcast %struct.foo* [[TMP0]] to aarch64_ls64*
+// CHECK-NEXT:[[TMP2:%.*]] = load aarch64_ls64, aarch64_ls64* [[TMP1]], align 8
+// CHECK-NEXT:[[TMP3:%.*]] = load i8*, i8** [[ADDR_ADDR]], align 8
+// CHECK-NEXT:call void asm sideeffect "st64b $0,[$1]", "r,r"(aarch64_ls64 [[TMP2]], i8* [[TMP3]]) [[ATTR1]], !srcloc !7
+// CHECK-NEXT:ret void
+//
+void store(const struct foo *input, void *addr)
+{
+__asm__ volatile ("st64b %0,[%1]" : : "r" (*input), "r" (addr));
+}
Index: clang/lib/CodeGen/TargetInfo.cpp
===
--- clang/lib/CodeGen/TargetInfo.cpp
+++ clang/lib/CodeGen/TargetInfo.cpp
@@ -5533,6 +5533,23 @@
 Fn->addFnAttr("branch-target-enforcement",
   BPI.BranchTargetEnforcement ? "true" : "false");
   }
+
+  llvm::Type* adjustInlineAsmType(CodeGen::CodeGenFunction ,
+  StringRef Constraint,
+  llvm::Type* Ty) const override {
+if (getABIInfo().getContext().getTargetInfo().hasAArch64_LS64Type()) {
+  if (CGF.CGM.getDataLayout().getTypeSizeInBits(Ty) == 512) {
+auto *ST = dyn_cast(Ty);
+if (ST && ST->getNumElements() == 1) {
+  auto *AT = dyn_cast(ST->getElementType(0));
+  if (AT && AT->getNumElements() == 8 &&
+  AT->getElementType()->isIntegerTy(64))
+return llvm::Type::getAArch64_LS64Ty(CGF.getLLVMContext());
+}
+  }
+}
+return Ty;
+  }
 };
 
 class WindowsAArch64TargetCodeGenInfo : public AArch64TargetCodeGenInfo {
Index: clang/lib/CodeGen/CGStmt.cpp
===
--- clang/lib/CodeGen/CGStmt.cpp
+++ clang/lib/CodeGen/CGStmt.cpp
@@ -2030,6 +2030,7 @@
   Arg = EmitLoadOfLValue(InputValue, Loc).getScalarVal();
 } else {
   llvm::Type *Ty = ConvertType(InputType);
+  Ty = getTargetHooks().adjustInlineAsmType(*this, ConstraintStr, Ty);
   uint64_t Size = CGM.getDataLayout().getTypeSizeInBits(Ty);
   if (Size <= 64 && llvm::isPowerOf2_64(Size)) {
 Ty = llvm::IntegerType::get(getLLVMContext(), Size);
@@ -2037,6 +2038,11 @@
 
 Arg = Builder.CreateLoad(
 Builder.CreateBitCast(InputValue.getAddress(*this), Ty));
+

[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-06-04 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:6989
+  false,
+  getTarget().hasBFloat16Type());
   llvm::Type *Ty = VTy;

shouldn't this be `getTargetHooks().getABIInfo().allowBFloatArgsAndRet()` ?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-06-01 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: clang/include/clang/Basic/arm_neon.td:1854
+  def VDUP_LANE_BF : WOpInst<"vdup_lane", ".qI", "bQb", OP_DUP_LN>;
+  def VDUP_LANEQ_BF: WOpInst<"vdup_laneq", ".QI", "bQb", OP_DUP_LN> {
+let isLaneQ = 1;

My local build points here with:
`arm_neon.td:1926:3: error: No compatible intrinsic found - looking up 
intrinsic 'splat_laneq(bfloat16x8_t, int32_t)'`




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80928: [BFloat] Add convert/copy instrinsic support

2020-06-01 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea created this revision.
labrinea added reviewers: fpetrogalli, LukeGeeson, stuij, momchil.velikov, 
SjoerdMeijer, miyuki.
Herald added subscribers: hiraditya, kristof.beyls.
Herald added projects: clang, LLVM.

This patch is part of a series implementing the Bfloat16 extension of the 
Armv8.6-a architecture, as detailed here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

Specifically it adds intrinsic support in clang and llvm for Arm and AArch64.

The bfloat type, and its properties are specified in the Arm Architecture 
Reference Manual:

https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

The following people contributed to this patch:

- Alexandros Lamprineas
- Luke Cheeseman
- Mikhail Maltsev
- Momchil Velikov


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D80928

Files:
  clang/include/clang/Basic/arm_neon.td
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGen/aarch64-bf16-lane-intrinsics.c
  clang/test/CodeGen/arm-bf16-conv-copy-intrinsics.c
  clang/test/Sema/aarch64-neon-bf16-ranges.c
  clang/utils/TableGen/NeonEmitter.cpp
  llvm/include/llvm/IR/IntrinsicsAArch64.td
  llvm/include/llvm/IR/IntrinsicsARM.td
  llvm/lib/Target/AArch64/AArch64InstrFormats.td
  llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp
  llvm/test/CodeGen/AArch64/bf16-intrinsics.ll
  llvm/test/CodeGen/ARM/bf16-intrinsics-nofp16.ll
  llvm/test/CodeGen/ARM/bf16-intrinsics.ll

Index: llvm/test/CodeGen/ARM/bf16-intrinsics.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/bf16-intrinsics.ll
@@ -0,0 +1,24 @@
+; RUN: llc < %s -verify-machineinstrs -mtriple=armv8.6a-arm-none-eabi -mattr=+fullfp16 -mattr=+neon -mattr=+bf16 | FileCheck %s
+
+declare bfloat @llvm.arm.neon.vcvtbfp2bf.bf16.f32(float)
+declare <4 x bfloat> @llvm.arm.neon.vcvtfp2bf.v4bf16.v4f32(<4 x float>)
+
+; CHECK-LABEL: test_vcvth_bf16_f32
+; CHECK: vcvtb.bf16.f32  s0, s0
+define arm_aapcs_vfpcc float @test_vcvth_bf16_f32(float %a) {
+entry:
+  %vcvtbfp2bf.i = tail call bfloat @llvm.arm.neon.vcvtbfp2bf.bf16.f32(float %a)
+  %0 = bitcast bfloat %vcvtbfp2bf.i to i16
+  %tmp.0.insert.ext.i = zext i16 %0 to i32
+  %1 = bitcast i32 %tmp.0.insert.ext.i to float
+  ret float %1
+}
+
+; CHECK-LABEL: test_vcvt_bf16_f32
+; CHECK: vcvt.bf16.f32   d0, q0
+define arm_aapcs_vfpcc <4 x bfloat> @test_vcvt_bf16_f32(<4 x float> %a) {
+entry:
+  %vcvtfp2bf1.i.i = call <4 x bfloat> @llvm.arm.neon.vcvtfp2bf.v4bf16.v4f32(<4 x float> %a)
+  ret <4 x bfloat> %vcvtfp2bf1.i.i
+}
+
Index: llvm/test/CodeGen/ARM/bf16-intrinsics-nofp16.ll
===
--- /dev/null
+++ llvm/test/CodeGen/ARM/bf16-intrinsics-nofp16.ll
@@ -0,0 +1,23 @@
+; RUN: llc < %s -verify-machineinstrs -mtriple=armv8.6a-arm-none-eabi -mattr=+neon -mattr=+bf16 | FileCheck %s
+
+declare i32 @llvm.arm.neon.vcvtbfp2bf.i32.f32(float)
+declare <4 x i16> @llvm.arm.neon.vcvtfp2bf.v4i16.v4f32(<4 x float>)
+
+; CHECK-LABEL: test_vcvth_bf16_f32
+; CHECK: vcvtb.bf16.f32  s0, s0
+define arm_aapcs_vfpcc float @test_vcvth_bf16_f32(float %a) {
+entry:
+  %vcvtbfp2bf = tail call i32 @llvm.arm.neon.vcvtbfp2bf.i32.f32(float %a)
+  %tmp.0.insert.ext = and i32 %vcvtbfp2bf, 65535
+  %0 = bitcast i32 %tmp.0.insert.ext to float
+  ret float %0
+}
+
+; CHECK-LABEL: test_vcvt_bf16_f32
+; CHECK: vcvt.bf16.f32   d0, q0
+define arm_aapcs_vfpcc <2 x i32> @test_vcvt_bf16_f32(<4 x float> %a) {
+entry:
+  %vcvtfp2bf1.i.i = tail call <4 x i16> @llvm.arm.neon.vcvtfp2bf.v4i16.v4f32(<4 x float> %a)
+  %0 = bitcast <4 x i16> %vcvtfp2bf1.i.i to <2 x i32>
+  ret <2 x i32> %0
+}
Index: llvm/test/CodeGen/AArch64/bf16-intrinsics.ll
===
--- /dev/null
+++ llvm/test/CodeGen/AArch64/bf16-intrinsics.ll
@@ -0,0 +1,34 @@
+; RUN: llc < %s -verify-machineinstrs -mtriple=aarch64-arm-none-eabi -mattr=+neon -mattr=+bf16 | FileCheck %s
+
+declare bfloat @llvm.aarch64.neon.bfcvt.f16.f32(float)
+declare <8 x bfloat> @llvm.aarch64.neon.bfcvtn.v8f16.v4f32(<4 x float>)
+declare <8 x bfloat> @llvm.aarch64.neon.bfcvtn2.v8f16.v8f16.v4f32(<8 x bfloat>, <4 x float>)
+
+; CHECK-LABEL: test_vcvth_bf16_f32
+; CHECK:  bfcvt h0, s0
+; CHECK-NEXT: ret
+define bfloat @test_vcvth_bf16_f32(float %a) {
+entry:
+  %vcvth_bf16_f32 = call bfloat @llvm.aarch64.neon.bfcvt.f16.f32(float %a)
+  ret bfloat %vcvth_bf16_f32
+}
+
+; CHECK-LABEL: test_vcvtq_low_bf16_f32
+; CHECK:  bfcvtn v0.4h, v0.4s
+; CHECK-NEXT: ret
+define <8 x bfloat> @test_vcvtq_low_bf16_f32(<4 x float> %a) {
+entry:
+  %cvt = call <8 x bfloat> @llvm.aarch64.neon.bfcvtn.v8f16.v4f32(<4 x float> %a)
+  ret <8 x bfloat> %cvt
+}
+
+; CHECK-LABEL: test_vcvtq_high_bf16_f32
+; CHECK:  bfcvtn2 v1.8h, v0.4s
+; CHECK-NEXT: mov v0.16b, v1.16b
+; CHECK-NEXT: ret
+define <8 x bfloat> 

[PATCH] D76077: [ARM] Add __bf16 as new Bfloat16 C Type

2020-06-01 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: clang/lib/CodeGen/CGBuiltin.cpp:4493
+  case NeonTypeFlags::BFloat16:
+if (HasBFloat16Type)
+  return llvm::VectorType::get(CGF->BFloatTy, V1Ty ? 1 : (4 << IsQuad));

This is not what we should be checking for here. Imagine a command line with 
+bf16 and +mfloat-abi=softfp, that should generate i16, not bfloat. We 
therefore need a target hook to pass this information. I suggest 
`allowBFloatArgsAndRet()` in ABIInfo, returning false by default, overloaded 
for Arm and AArch64 to return `!IsFloatABISoftFP && hasBFloat16Type()` and 
`hasBFloat16Type()` respectively.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76077/new/

https://reviews.llvm.org/D76077



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80716: [AArch64]: BFloat Load/Store Intrinsics

2020-05-28 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: clang/test/CodeGen/aarch64-bf16-ldst-intrinsics.c:36
+// CHECK32: ret <4 x bfloat> %vld1_lane
+
+bfloat16x8_t test_vld1q_lane_bf16(bfloat16_t const *ptr, bfloat16x8_t src) {

CHECK-NEXT or CHECK-DAG are preferable for sequences.



Comment at: clang/test/CodeGen/aarch64-bf16-ldst-intrinsics.c:180
+// %.fca.0.2.insert = insertvalue %struct.bfloat16x4x3_t %.fca.0.1.insert, <4 
x bfloat> %vld3_lane.fca.2.extract, 0, 2
+
+bfloat16x8x3_t test_vld3q_lane_bf16(bfloat16_t const *ptr, bfloat16x8x3_t src) 
{

where are the check lines?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80716/new/

https://reviews.llvm.org/D80716



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79711: [ARM] Add poly64_t on AArch32.

2020-05-26 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added a comment.

> Should poly128_t be available on AArch32 too? I don't see anything in the 
> ACLE version you linked restricting it to AArch64 only, and the intrinsics 
> reference has a number of intrinsics available for both ISAs using it.

It should but it is not that simple. The reason it is not available is that 
__int128_t is not supported in AArch32. I think that is future work, since this 
patch unblocks the bfloat reinterpret_cast patch, which btw is annotated with 
TODO comments regarding the poly128_t type for AArch32.




Comment at: clang/lib/Sema/SemaType.cpp:7645
 
   // Signed poly is mathematically wrong, but has been baked into some ABIs by
   // now.

@ostannard according to this comment it seems there has been some divergence 
between AArch64 and AArch32 and now is too late to change. If the ACLE doesn't 
say so, maybe it should.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79711/new/

https://reviews.llvm.org/D79711



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79710: [clang][BFloat] add create/set/get/dup intrinsics

2020-05-20 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: clang/include/clang/Basic/arm_neon.td:1845
+
+// V8.2-A BFloat intrinsics
+let ArchGuard = "defined(__ARM_FEATURE_BF16_VECTOR_ARITHMETIC)" in {

v8.6-A ?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79710/new/

https://reviews.llvm.org/D79710



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64048: [TargetParser][ARM] Account dependencies when processing target features

2019-07-04 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 208010.
labrinea added a comment.

Added the dependency of mve on dsp and some missing tests to cover those cases.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64048/new/

https://reviews.llvm.org/D64048

Files:
  clang/test/Preprocessor/arm-target-features.c
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/ARMTargetParser.cpp

Index: llvm/lib/Support/ARMTargetParser.cpp
===
--- llvm/lib/Support/ARMTargetParser.cpp
+++ llvm/lib/Support/ARMTargetParser.cpp
@@ -508,16 +508,30 @@
   return ARM::FK_INVALID;
 }
 
+static unsigned getAEKID(StringRef ArchExtName) {
+  for (const auto AE : ARM::ARCHExtNames)
+if (AE.getName() == ArchExtName)
+  return AE.ID;
+  return ARM::AEK_INVALID;
+}
+
 bool ARM::appendArchExtFeatures(
   StringRef CPU, ARM::ArchKind AK, StringRef ArchExt,
   std::vector ) {
-  StringRef StandardFeature = getArchExtFeature(ArchExt);
-  if (!StandardFeature.empty()) {
-Features.push_back(StandardFeature);
-return true;
-  }
 
+  size_t StartingNumFeatures = Features.size();
   const bool Negated = stripNegationPrefix(ArchExt);
+  unsigned ID = getAEKID(ArchExt);
+
+  if (ID == AEK_INVALID)
+return false;
+
+  for (const auto AE : ARCHExtNames) {
+if (Negated && (AE.ID & ID) == ID && AE.NegFeature)
+  Features.push_back(AE.NegFeature);
+else if (AE.ID == ID && AE.Feature)
+  Features.push_back(AE.Feature);
+  }
 
   if (CPU == "")
 CPU = "generic";
@@ -537,7 +551,7 @@
 }
 return ARM::getFPUFeatures(FPUKind, Features);
   }
-  return false;
+  return StartingNumFeatures != Features.size();
 }
 
 StringRef ARM::getHWDivName(unsigned HWDivKind) {
Index: llvm/include/llvm/Support/ARMTargetParser.def
===
--- llvm/include/llvm/Support/ARMTargetParser.def
+++ llvm/include/llvm/Support/ARMTargetParser.def
@@ -148,8 +148,8 @@
 ARM_ARCH_EXT_NAME("dotprod",  ARM::AEK_DOTPROD,  "+dotprod","-dotprod")
 ARM_ARCH_EXT_NAME("dsp",  ARM::AEK_DSP,  "+dsp",   "-dsp")
 ARM_ARCH_EXT_NAME("fp",   ARM::AEK_FP,   nullptr,  nullptr)
-ARM_ARCH_EXT_NAME("mve",  ARM::AEK_SIMD, "+mve",   "-mve")
-ARM_ARCH_EXT_NAME("mve.fp",   (ARM::AEK_SIMD | ARM::AEK_FP), "+mve.fp", "-mve.fp")
+ARM_ARCH_EXT_NAME("mve", (ARM::AEK_DSP | ARM::AEK_SIMD), "+mve", "-mve")
+ARM_ARCH_EXT_NAME("mve.fp",  (ARM::AEK_DSP | ARM::AEK_SIMD | ARM::AEK_FP), "+mve.fp", "-mve.fp")
 ARM_ARCH_EXT_NAME("idiv", (ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB), nullptr, nullptr)
 ARM_ARCH_EXT_NAME("mp",   ARM::AEK_MP,   nullptr,  nullptr)
 ARM_ARCH_EXT_NAME("simd", ARM::AEK_SIMD, nullptr,  nullptr)
Index: clang/test/Preprocessor/arm-target-features.c
===
--- clang/test/Preprocessor/arm-target-features.c
+++ clang/test/Preprocessor/arm-target-features.c
@@ -762,12 +762,29 @@
 // CHECK-V81M-MVE: #define __ARM_FEATURE_MVE 1
 // CHECK-V81M-MVE: #define __ARM_FEATURE_SIMD32 1
 
-// RUN: %clang -target arm-arm-none-eabi -march=armv8.1-m.main+mve.fp -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-V81M-MVE-FP %s
-// CHECK-V81M-MVE-FP: #define __ARM_FEATURE_DSP 1
-// CHECK-V81M-MVE-FP: #define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1
-// CHECK-V81M-MVE-FP: #define __ARM_FEATURE_MVE 3
-// CHECK-V81M-MVE-FP: #define __ARM_FEATURE_SIMD32 1
-// CHECK-V81M-MVE-FP: #define __ARM_FPV5__ 1
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.1-m.main+mve.fp -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-V81M-MVEFP %s
+// CHECK-V81M-MVEFP: #define __ARM_FEATURE_DSP 1
+// CHECK-V81M-MVEFP: #define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1
+// CHECK-V81M-MVEFP: #define __ARM_FEATURE_MVE 3
+// CHECK-V81M-MVEFP: #define __ARM_FEATURE_SIMD32 1
+// CHECK-V81M-MVEFP: #define __ARM_FPV5__ 1
+
+// nofp discards mve.fp
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.1-m.main+mve.fp+nofp -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-V81M-MVEFP-NOFP %s
+// CHECK-V81M-MVEFP-NOFP-NOT: #define __ARM_FEATURE_MVE
+
+// nomve discards mve.fp
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.1-m.main+mve.fp+nomve -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-V81M-MVEFP-NOMVE %s
+// CHECK-V81M-MVEFP-NOMVE-NOT: #define __ARM_FEATURE_MVE
+
+// mve+fp doesn't imply mve.fp
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.1-m.main+mve+fp -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-V81M-MVE-FP %s
+// CHECK-V81M-MVE-FP: #define __ARM_FEATURE_MVE 1
+
+// nodsp discards both dsp and mve
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.1-m.main+mve+nodsp -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-V81M-MVE-NODSP %s
+// CHECK-V81M-MVE-NODSP-NOT: #define 

[PATCH] D63936: [clang][Driver][ARM] Favor -mfpu over default CPU features

2019-07-04 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: llvm/lib/Support/ARMTargetParser.cpp:412
 
-  if (Extensions & AEK_CRC)
-Features.push_back("+crc");
-  else
-Features.push_back("-crc");
-
-  if (Extensions & AEK_DSP)
-Features.push_back("+dsp");
-  else
-Features.push_back("-dsp");
-
-  if (Extensions & AEK_FP16FML)
-Features.push_back("+fp16fml");
-  else
-Features.push_back("-fp16fml");
-
-  if (Extensions & AEK_RAS)
-Features.push_back("+ras");
-  else
-Features.push_back("-ras");
-
-  if (Extensions & AEK_DOTPROD)
-Features.push_back("+dotprod");
-  else
-Features.push_back("-dotprod");
+  for (const auto AE : ARCHExtNames) {
+if ((Extensions & AE.ID) == AE.ID && AE.Feature)

ostannard wrote:
> labrinea wrote:
> > ostannard wrote:
> > > labrinea wrote:
> > > > SjoerdMeijer wrote:
> > > > > This could be a little local helper function, share the code, as 
> > > > > exactly the same is done in `ARM::appendArchExtFeatures`
> > > > We are not doing exactly the same thing in these functions. Here we 
> > > > extract features out of a bitmap, which is a map containing a bitwise 
> > > > OR of separate feature bitmasks. If a bitmask that corresponds to a 
> > > > known feature is present - and here I mean all the bits of that mask 
> > > > are present - then we push the feature, otherwise not. 
> > > > 
> > > > In `ARM::appendArchExtFeatures` we compare a given bitmask, which 
> > > > corresponds to a specific feature, against all the known bitmasks 
> > > > (individual features) one by one. In contrast to 
> > > > `ARM::getExtensionFeatures` here we also handle negative features, and 
> > > > that means the prohibition of a feature can disable other features that 
> > > > depend on it as well.
> > > Does this correctly handle the combination of integer-only MVE with a 
> > > scalar FPU? I think -march=...+mve+fp should enable AEK_FP and AEK_SIMD 
> > > (+mve), but shouldn't enable +mve.fp.
> > The target features explicitly specified on the command line are handled by 
> > `ARM::appendArchExtFeatures` (see [[ https://reviews.llvm.org/D64048 | 
> > D64048 ]]). That said, yes, -march=...+mve+fp does enable scalar floating 
> > point and integer-only mve, but doesn't enable mve with floating point. I 
> > could possibly add such a test on that revision.
> > 
> > On the other hand, `ARM::getExtensionFeatures` cannot tell the difference 
> > between the two configurations you describe, and it's not possible to do 
> > so, because they are represented on a bitmap returned from 
> > `ARM::getDefaultExtensions`, which reads the table. That said, if there was 
> > an entry in the table containing `AEK_FP | AEK_SIMD` that would enable 
> > mve.fp. However, I don't think this is a problem in practice. My 
> > understanding is that the table indicates FPU support with `FK_*`, and 
> > Extension support with `AEK_*`.  That said, I believe AEK_FP is only used 
> > for command line option handling.
> > 
> > Maybe a fix for this problem would be to replace `AEK_FP | AEK_SIMD` with 
> > something like `AEK_MVE_FP` but then we wouldn't be able to do what is 
> > proposed in [[ https://reviews.llvm.org/D64048 | D64048 ]].
> Is this system (in particular the behaviour added in D64048) going to be able 
> to handle all of the other dependencies between architectural features? For 
> example, MVE also depends on the DSP extension, but 
> `-march=armv8.1-m.main+mve+nodsp` currently defines both __ARM_FEATURE_MVE 
> and __ARM_FEATURE_DSP.
No, `-march=armv8.1-m.main+mve+nodsp` doesn't turn off neither mve nor dsp and 
it looks like a bug if they depend on each other. It seems you are right, the 
code in `ARMTargetInfo::handleTargetFeatures` enables both when mve is set:

```
} else if (Feature == "+mve") {
  DSP = 1;
  MVE |= MVE_INT;
} else if (Feature == "+mve.fp") {
  DSP = 1;
  HasLegalHalfType = true;
  FPU |= FPARMV8;
  MVE |= MVE_INT | MVE_FP;
  HW_FP |= HW_FP_SP | HW_FP_HP;
}
```
If there's a dependency then it should be present in the table of target 
parser. Then the above command would turn both off. I'll update the table and 
add some tests in [[ https://reviews.llvm.org/D64048 | D64048 ]].


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63936/new/

https://reviews.llvm.org/D63936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D63936: [clang][Driver][ARM] Favor -mfpu over default CPU features

2019-07-03 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added inline comments.



Comment at: llvm/lib/Support/ARMTargetParser.cpp:412
 
-  if (Extensions & AEK_CRC)
-Features.push_back("+crc");
-  else
-Features.push_back("-crc");
-
-  if (Extensions & AEK_DSP)
-Features.push_back("+dsp");
-  else
-Features.push_back("-dsp");
-
-  if (Extensions & AEK_FP16FML)
-Features.push_back("+fp16fml");
-  else
-Features.push_back("-fp16fml");
-
-  if (Extensions & AEK_RAS)
-Features.push_back("+ras");
-  else
-Features.push_back("-ras");
-
-  if (Extensions & AEK_DOTPROD)
-Features.push_back("+dotprod");
-  else
-Features.push_back("-dotprod");
+  for (const auto AE : ARCHExtNames) {
+if ((Extensions & AE.ID) == AE.ID && AE.Feature)

ostannard wrote:
> labrinea wrote:
> > SjoerdMeijer wrote:
> > > This could be a little local helper function, share the code, as exactly 
> > > the same is done in `ARM::appendArchExtFeatures`
> > We are not doing exactly the same thing in these functions. Here we extract 
> > features out of a bitmap, which is a map containing a bitwise OR of 
> > separate feature bitmasks. If a bitmask that corresponds to a known feature 
> > is present - and here I mean all the bits of that mask are present - then 
> > we push the feature, otherwise not. 
> > 
> > In `ARM::appendArchExtFeatures` we compare a given bitmask, which 
> > corresponds to a specific feature, against all the known bitmasks 
> > (individual features) one by one. In contrast to 
> > `ARM::getExtensionFeatures` here we also handle negative features, and that 
> > means the prohibition of a feature can disable other features that depend 
> > on it as well.
> Does this correctly handle the combination of integer-only MVE with a scalar 
> FPU? I think -march=...+mve+fp should enable AEK_FP and AEK_SIMD (+mve), but 
> shouldn't enable +mve.fp.
The target features explicitly specified on the command line are handled by 
`ARM::appendArchExtFeatures` (see [[ https://reviews.llvm.org/D64048 | D64048 
]]). That said, yes, -march=...+mve+fp does enable scalar floating point and 
integer-only mve, but doesn't enable mve with floating point. I could possibly 
add such a test on that revision.

On the other hand, `ARM::getExtensionFeatures` cannot tell the difference 
between the two configurations you describe, and it's not possible to do so, 
because they are represented on a bitmap returned from 
`ARM::getDefaultExtensions`, which reads the table. That said, if there was an 
entry in the table containing `AEK_FP | AEK_SIMD` that would enable mve.fp. 
However, I don't think this is a problem in practice. My understanding is that 
the table indicates FPU support with `FK_*`, and Extension support with 
`AEK_*`.  That said, I believe AEK_FP is only used for command line option 
handling.

Maybe a fix for this problem would be to replace `AEK_FP | AEK_SIMD` with 
something like `AEK_MVE_FP` but then we wouldn't be able to do what is proposed 
in [[ https://reviews.llvm.org/D64048 | D64048 ]].



Comment at: llvm/unittests/Support/TargetParserTest.cpp:574
 
-  Extensions[ARM::AEK_CRC]= { "+crc",   "-crc" };
-  Extensions[ARM::AEK_DSP]= { "+dsp",   "-dsp" };
+  for (auto  : ARM::ARCHExtNames) {
+if (Ext.Feature && Ext.NegFeature)

ostannard wrote:
> SjoerdMeijer wrote:
> > I like this  approach.
> I'm not sure this is a good idea - we are now testing the implementation by 
> checking that it matches the table used by the implementation, so if there's 
> a bug in the table this will still pass.
Surely, but is the purpose of this test to check that the table is correct, or 
that `ARM::getExtensionFeatures` reads the table correctly? I'd say the latter. 
Also, with this change we won't need to update the test every time there's a 
new entry in the table.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63936/new/

https://reviews.llvm.org/D63936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D64048: [TargetParser][ARM] Account dependencies when processing target features

2019-07-01 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea created this revision.
labrinea added reviewers: llvm-commits, ostannard.
Herald added subscribers: cfe-commits, dmgreen, hiraditya, kristof.beyls, 
javed.absar.
Herald added projects: clang, LLVM.

Teaches `ARM::appendArchExtFeatures` to account dependencies when processing 
target features: i.e. when you say `-march=armv8.1-m.main+mve.fp+nofp` it means 
`mve.fp` should get discarded too. (Split from D63936 
)


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D64048

Files:
  clang/test/Preprocessor/arm-target-features.c
  llvm/lib/Support/ARMTargetParser.cpp


Index: llvm/lib/Support/ARMTargetParser.cpp
===
--- llvm/lib/Support/ARMTargetParser.cpp
+++ llvm/lib/Support/ARMTargetParser.cpp
@@ -508,16 +508,30 @@
   return ARM::FK_INVALID;
 }
 
+static unsigned getAEKID(StringRef ArchExtName) {
+  for (const auto AE : ARM::ARCHExtNames)
+if (AE.getName() == ArchExtName)
+  return AE.ID;
+  return ARM::AEK_INVALID;
+}
+
 bool ARM::appendArchExtFeatures(
   StringRef CPU, ARM::ArchKind AK, StringRef ArchExt,
   std::vector ) {
-  StringRef StandardFeature = getArchExtFeature(ArchExt);
-  if (!StandardFeature.empty()) {
-Features.push_back(StandardFeature);
-return true;
-  }
 
+  size_t StartingNumFeatures = Features.size();
   const bool Negated = stripNegationPrefix(ArchExt);
+  unsigned ID = getAEKID(ArchExt);
+
+  if (ID == AEK_INVALID)
+return false;
+
+  for (const auto AE : ARCHExtNames) {
+if (Negated && (AE.ID & ID) == ID && AE.NegFeature)
+  Features.push_back(AE.NegFeature);
+else if (AE.ID == ID && AE.Feature)
+  Features.push_back(AE.Feature);
+  }
 
   if (CPU == "")
 CPU = "generic";
@@ -537,7 +551,7 @@
 }
 return ARM::getFPUFeatures(FPUKind, Features);
   }
-  return false;
+  return StartingNumFeatures != Features.size();
 }
 
 StringRef ARM::getHWDivName(unsigned HWDivKind) {
Index: clang/test/Preprocessor/arm-target-features.c
===
--- clang/test/Preprocessor/arm-target-features.c
+++ clang/test/Preprocessor/arm-target-features.c
@@ -769,6 +769,14 @@
 // CHECK-V81M-MVE-FP: #define __ARM_FEATURE_SIMD32 1
 // CHECK-V81M-MVE-FP: #define __ARM_FPV5__ 1
 
+// nofp discards mve.fp
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.1-m.main+mve.fp+nofp -x 
c -E -dM %s -o - | FileCheck -match-full-lines 
--check-prefix=CHECK-V81M-MVEFP-NOFP %s
+// CHECK-V81M-MVEFP-NOFP-NOT: #define __ARM_FEATURE_MVE
+
+// nomve discards mve.fp
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.1-m.main+mve.fp+nomve -x 
c -E -dM %s -o - | FileCheck -match-full-lines 
--check-prefix=CHECK-V81M-MVEFP-NOMVE %s
+// CHECK-V81M-MVEFP-NOMVE-NOT: #define __ARM_FEATURE_MVE
+
 // RUN: %clang -target armv8.1a-none-none-eabi -x c -E -dM %s -o - | FileCheck 
-match-full-lines --check-prefix=CHECK-V81A %s
 // CHECK-V81A: #define __ARM_ARCH 8
 // CHECK-V81A: #define __ARM_ARCH_8_1A__ 1


Index: llvm/lib/Support/ARMTargetParser.cpp
===
--- llvm/lib/Support/ARMTargetParser.cpp
+++ llvm/lib/Support/ARMTargetParser.cpp
@@ -508,16 +508,30 @@
   return ARM::FK_INVALID;
 }
 
+static unsigned getAEKID(StringRef ArchExtName) {
+  for (const auto AE : ARM::ARCHExtNames)
+if (AE.getName() == ArchExtName)
+  return AE.ID;
+  return ARM::AEK_INVALID;
+}
+
 bool ARM::appendArchExtFeatures(
   StringRef CPU, ARM::ArchKind AK, StringRef ArchExt,
   std::vector ) {
-  StringRef StandardFeature = getArchExtFeature(ArchExt);
-  if (!StandardFeature.empty()) {
-Features.push_back(StandardFeature);
-return true;
-  }
 
+  size_t StartingNumFeatures = Features.size();
   const bool Negated = stripNegationPrefix(ArchExt);
+  unsigned ID = getAEKID(ArchExt);
+
+  if (ID == AEK_INVALID)
+return false;
+
+  for (const auto AE : ARCHExtNames) {
+if (Negated && (AE.ID & ID) == ID && AE.NegFeature)
+  Features.push_back(AE.NegFeature);
+else if (AE.ID == ID && AE.Feature)
+  Features.push_back(AE.Feature);
+  }
 
   if (CPU == "")
 CPU = "generic";
@@ -537,7 +551,7 @@
 }
 return ARM::getFPUFeatures(FPUKind, Features);
   }
-  return false;
+  return StartingNumFeatures != Features.size();
 }
 
 StringRef ARM::getHWDivName(unsigned HWDivKind) {
Index: clang/test/Preprocessor/arm-target-features.c
===
--- clang/test/Preprocessor/arm-target-features.c
+++ clang/test/Preprocessor/arm-target-features.c
@@ -769,6 +769,14 @@
 // CHECK-V81M-MVE-FP: #define __ARM_FEATURE_SIMD32 1
 // CHECK-V81M-MVE-FP: #define __ARM_FPV5__ 1
 
+// nofp discards mve.fp
+// RUN: %clang -target arm-arm-none-eabi -march=armv8.1-m.main+mve.fp+nofp -x c -E -dM %s -o - | FileCheck -match-full-lines --check-prefix=CHECK-V81M-MVEFP-NOFP %s
+// CHECK-V81M-MVEFP-NOFP-NOT: #define 

[PATCH] D63936: [clang][Driver][ARM] Favor -mfpu over default CPU features

2019-07-01 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea updated this revision to Diff 207436.
labrinea added a comment.

I've split the patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63936/new/

https://reviews.llvm.org/D63936

Files:
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/CodeGen/arm-target-features.c
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/ARMTargetParser.cpp
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -571,17 +571,18 @@
 TEST(TargetParserTest, ARMExtensionFeatures) {
   std::map> Extensions;
 
-  Extensions[ARM::AEK_CRC]= { "+crc",   "-crc" };
-  Extensions[ARM::AEK_DSP]= { "+dsp",   "-dsp" };
+  for (auto  : ARM::ARCHExtNames) {
+if (Ext.Feature && Ext.NegFeature)
+  Extensions[Ext.ID] = { StringRef(Ext.Feature),
+ StringRef(Ext.NegFeature) };
+  }
+
   Extensions[ARM::AEK_HWDIVARM]   = { "+hwdiv-arm", "-hwdiv-arm" };
   Extensions[ARM::AEK_HWDIVTHUMB] = { "+hwdiv", "-hwdiv" };
-  Extensions[ARM::AEK_RAS]= { "+ras",   "-ras" };
-  Extensions[ARM::AEK_FP16FML]= { "+fp16fml",   "-fp16fml" };
-  Extensions[ARM::AEK_DOTPROD]= { "+dotprod",   "-dotprod" };
 
   std::vector Features;
 
-  EXPECT_FALSE(AArch64::getExtensionFeatures(ARM::AEK_INVALID, Features));
+  EXPECT_FALSE(ARM::getExtensionFeatures(ARM::AEK_INVALID, Features));
 
   for (auto  : Extensions) {
 // test +extension
@@ -598,7 +599,7 @@
 Found = std::find(std::begin(Features), std::end(Features), E.second.at(1));
 EXPECT_TRUE(Found != std::end(Features));
 EXPECT_TRUE(Extensions.size() == Features.size());
-   }
+  }
 }
 
 TEST(TargetParserTest, ARMFPUFeatures) {
Index: llvm/lib/Support/ARMTargetParser.cpp
===
--- llvm/lib/Support/ARMTargetParser.cpp
+++ llvm/lib/Support/ARMTargetParser.cpp
@@ -409,30 +409,12 @@
   if (Extensions == AEK_INVALID)
 return false;
 
-  if (Extensions & AEK_CRC)
-Features.push_back("+crc");
-  else
-Features.push_back("-crc");
-
-  if (Extensions & AEK_DSP)
-Features.push_back("+dsp");
-  else
-Features.push_back("-dsp");
-
-  if (Extensions & AEK_FP16FML)
-Features.push_back("+fp16fml");
-  else
-Features.push_back("-fp16fml");
-
-  if (Extensions & AEK_RAS)
-Features.push_back("+ras");
-  else
-Features.push_back("-ras");
-
-  if (Extensions & AEK_DOTPROD)
-Features.push_back("+dotprod");
-  else
-Features.push_back("-dotprod");
+  for (const auto AE : ARCHExtNames) {
+if ((Extensions & AE.ID) == AE.ID && AE.Feature)
+  Features.push_back(AE.Feature);
+else if (AE.NegFeature)
+  Features.push_back(AE.NegFeature);
+  }
 
   return getHWDivFeatures(Extensions, Features);
 }
Index: llvm/include/llvm/Support/ARMTargetParser.def
===
--- llvm/include/llvm/Support/ARMTargetParser.def
+++ llvm/include/llvm/Support/ARMTargetParser.def
@@ -148,6 +148,7 @@
 ARM_ARCH_EXT_NAME("dotprod",  ARM::AEK_DOTPROD,  "+dotprod","-dotprod")
 ARM_ARCH_EXT_NAME("dsp",  ARM::AEK_DSP,  "+dsp",   "-dsp")
 ARM_ARCH_EXT_NAME("fp",   ARM::AEK_FP,   nullptr,  nullptr)
+ARM_ARCH_EXT_NAME("fp.dp",ARM::AEK_FP_DP,nullptr,  nullptr)
 ARM_ARCH_EXT_NAME("mve",  ARM::AEK_SIMD, "+mve",   "-mve")
 ARM_ARCH_EXT_NAME("mve.fp",   (ARM::AEK_SIMD | ARM::AEK_FP), "+mve.fp", "-mve.fp")
 ARM_ARCH_EXT_NAME("idiv", (ARM::AEK_HWDIVARM | ARM::AEK_HWDIVTHUMB), nullptr, nullptr)
Index: clang/test/CodeGen/arm-target-features.c
===
--- clang/test/CodeGen/arm-target-features.c
+++ clang/test/CodeGen/arm-target-features.c
@@ -32,7 +32,7 @@
 
 // RUN: %clang_cc1 -triple thumbv8-linux-gnueabihf -target-cpu exynos-m4 -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-BASIC-V82
 // RUN: %clang_cc1 -triple thumbv8-linux-gnueabihf -target-cpu exynos-m5 -emit-llvm -o - %s | FileCheck %s --check-prefix=CHECK-BASIC-V82
-// CHECK-BASIC-V82: "target-features"="+armv8.2-a,+crc,+crypto,+d32,+dotprod,+dsp,+fp-armv8,+fp-armv8d16,+fp-armv8d16sp,+fp-armv8sp,+fp16,+fp64,+fpregs,+hwdiv,+hwdiv-arm,+neon,+ras,+thumb-mode,+vfp2,+vfp2d16,+vfp2d16sp,+vfp2sp,+vfp3,+vfp3d16,+vfp3d16sp,+vfp3sp,+vfp4,+vfp4d16,+vfp4d16sp,+vfp4sp"
+// CHECK-BASIC-V82: "target-features"="+armv8.2-a,+crc,+crypto,+d32,+dotprod,+dsp,+fp-armv8,+fp-armv8d16,+fp-armv8d16sp,+fp-armv8sp,+fp16,+fp64,+fpregs,+fullfp16,+hwdiv,+hwdiv-arm,+neon,+ras,+thumb-mode,+vfp2,+vfp2d16,+vfp2d16sp,+vfp2sp,+vfp3,+vfp3d16,+vfp3d16sp,+vfp3sp,+vfp4,+vfp4d16,+vfp4d16sp,+vfp4sp"
 
 // RUN: %clang_cc1 -triple armv8-linux-gnueabi -target-cpu cortex-a53 -emit-llvm -o - %s | FileCheck %s 

[PATCH] D64044: [clang][Driver][ARM] NFC: Remove unused function parameter

2019-07-01 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea created this revision.
labrinea added reviewers: ostannard, simon_tatham, cfe-commits.
Herald added subscribers: kristof.beyls, javed.absar.
Herald added a project: clang.

Removes a vector reference that was added by D62998 
, since the preexisting function parameter is 
sufficient.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D64044

Files:
  clang/lib/Driver/ToolChains/Arch/ARM.cpp


Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -100,7 +100,6 @@
 static void checkARMArchName(const Driver , const Arg *A, const ArgList 
,
  llvm::StringRef ArchName, llvm::StringRef CPUName,
  std::vector ,
- std::vector ,
  const llvm::Triple ) {
   std::pair Split = ArchName.split("+");
 
@@ -108,7 +107,7 @@
   llvm::ARM::ArchKind ArchKind = llvm::ARM::parseArch(MArch);
   if (ArchKind == llvm::ARM::ArchKind::INVALID ||
   (Split.second.size() && !DecodeARMFeatures(
-D, Split.second, CPUName, ArchKind, ExtensionFeatures)))
+D, Split.second, CPUName, ArchKind, Features)))
 D.Diag(clang::diag::err_drv_clang_unsupported) << A->getAsString(Args);
 }
 
@@ -116,7 +115,6 @@
 static void checkARMCPUName(const Driver , const Arg *A, const ArgList ,
 llvm::StringRef CPUName, llvm::StringRef ArchName,
 std::vector ,
-std::vector ,
 const llvm::Triple ) {
   std::pair Split = CPUName.split("+");
 
@@ -125,7 +123,7 @@
 arm::getLLVMArchKindForARM(CPU, ArchName, Triple);
   if (ArchKind == llvm::ARM::ArchKind::INVALID ||
   (Split.second.size() && !DecodeARMFeatures(
-D, Split.second, CPU, ArchKind, ExtensionFeatures)))
+D, Split.second, CPU, ArchKind, Features)))
 D.Diag(clang::diag::err_drv_clang_unsupported) << A->getAsString(Args);
 }
 
@@ -361,13 +359,13 @@
   << ArchArg->getAsString(Args);
 ArchName = StringRef(WaArch->getValue()).substr(7);
 checkARMArchName(D, WaArch, Args, ArchName, CPUName,
- Features, ExtensionFeatures, Triple);
+ ExtensionFeatures, Triple);
 // FIXME: Set Arch.
 D.Diag(clang::diag::warn_drv_unused_argument) << WaArch->getAsString(Args);
   } else if (ArchArg) {
 ArchName = ArchArg->getValue();
 checkARMArchName(D, ArchArg, Args, ArchName, CPUName,
- Features, ExtensionFeatures, Triple);
+ ExtensionFeatures, Triple);
   }
 
   // Add CPU features for generic CPUs
@@ -383,7 +381,7 @@
 
   if (CPUArg)
 checkARMCPUName(D, CPUArg, Args, CPUName, ArchName,
-Features, ExtensionFeatures, Triple);
+ExtensionFeatures, Triple);
   // Honor -mfpu=. ClangAs gives preference to -Wa,-mfpu=.
   const Arg *FPUArg = Args.getLastArg(options::OPT_mfpu_EQ);
   if (WaFPU) {


Index: clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
--- clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -100,7 +100,6 @@
 static void checkARMArchName(const Driver , const Arg *A, const ArgList ,
  llvm::StringRef ArchName, llvm::StringRef CPUName,
  std::vector ,
- std::vector ,
  const llvm::Triple ) {
   std::pair Split = ArchName.split("+");
 
@@ -108,7 +107,7 @@
   llvm::ARM::ArchKind ArchKind = llvm::ARM::parseArch(MArch);
   if (ArchKind == llvm::ARM::ArchKind::INVALID ||
   (Split.second.size() && !DecodeARMFeatures(
-D, Split.second, CPUName, ArchKind, ExtensionFeatures)))
+D, Split.second, CPUName, ArchKind, Features)))
 D.Diag(clang::diag::err_drv_clang_unsupported) << A->getAsString(Args);
 }
 
@@ -116,7 +115,6 @@
 static void checkARMCPUName(const Driver , const Arg *A, const ArgList ,
 llvm::StringRef CPUName, llvm::StringRef ArchName,
 std::vector ,
-std::vector ,
 const llvm::Triple ) {
   std::pair Split = CPUName.split("+");
 
@@ -125,7 +123,7 @@
 arm::getLLVMArchKindForARM(CPU, ArchName, Triple);
   if (ArchKind == llvm::ARM::ArchKind::INVALID ||
   (Split.second.size() && !DecodeARMFeatures(
-D, Split.second, CPU, ArchKind, ExtensionFeatures)))
+D, Split.second, CPU, ArchKind, Features)))
 D.Diag(clang::diag::err_drv_clang_unsupported) << A->getAsString(Args);
 }
 
@@ -361,13 +359,13 @@
   << ArchArg->getAsString(Args);
 ArchName = StringRef(WaArch->getValue()).substr(7);
 checkARMArchName(D, WaArch, 

[PATCH] D63936: [ARM] Minor fixes in command line option parsing

2019-07-01 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added a comment.

@simon_tatham, thanks for clarifying. I think my change is doing the right 
thing then: favors the `-mfpu` option over the default CPU features. I will 
split the patch as @ostannard suggested.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63936/new/

https://reviews.llvm.org/D63936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D63936: [ARM] Minor fixes in command line option parsing

2019-07-01 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea added a comment.

In D63936#1563872 , @ostannard wrote:

> > The second change this patch makes
>
> Could this be spilt into two patches?


Looking at D62998  more carefully I realized 
that we deliberately favor cpu extensions over `-mfpu`:

> That in turn caused an ordering problem when handling -mcpu=foo+bar
>  together with -mfpu=something_that_turns_off_bar. To fix that, I've
>  arranged that the +bar suffixes on the end of -mcpu and -march
>  cause feature names to be put into a separate vector which is
>  concatenated after the output of getFPUFeatures.

I am now in doubt about my changes in 
`clang/lib/Driver/ToolChains/Arch/ARM.cpp`. Imagine this case:
`-mcpu=cortex-a73 -mfpu=crypto-neon-fp-armv8`
According to the table in ARMTargetParser, cortex-a73 doesn't have crypto, 
therefore the `-crypto` feature gets in the vector, but then we explicitly ask 
for it through the mfpu option. What is supposed to win here? FYI this a test 
case from `clang/test/Driver/arm-cortex-cpus.c`. An obvious workaround is to 
add the crypto extension for cortex-a73 (and any other entry which is missing 
it) in the table.

Maybe @simon_tatham could shed some light here?




Comment at: llvm/lib/Support/ARMTargetParser.cpp:412
 
-  if (Extensions & AEK_CRC)
-Features.push_back("+crc");
-  else
-Features.push_back("-crc");
-
-  if (Extensions & AEK_DSP)
-Features.push_back("+dsp");
-  else
-Features.push_back("-dsp");
-
-  if (Extensions & AEK_FP16FML)
-Features.push_back("+fp16fml");
-  else
-Features.push_back("-fp16fml");
-
-  if (Extensions & AEK_RAS)
-Features.push_back("+ras");
-  else
-Features.push_back("-ras");
-
-  if (Extensions & AEK_DOTPROD)
-Features.push_back("+dotprod");
-  else
-Features.push_back("-dotprod");
+  for (const auto AE : ARCHExtNames) {
+if ((Extensions & AE.ID) == AE.ID && AE.Feature)

SjoerdMeijer wrote:
> This could be a little local helper function, share the code, as exactly the 
> same is done in `ARM::appendArchExtFeatures`
We are not doing exactly the same thing in these functions. Here we extract 
features out of a bitmap, which is a map containing a bitwise OR of separate 
feature bitmasks. If a bitmask that corresponds to a known feature is present - 
and here I mean all the bits of that mask are present - then we push the 
feature, otherwise not. 

In `ARM::appendArchExtFeatures` we compare a given bitmask, which corresponds 
to a specific feature, against all the known bitmasks (individual features) one 
by one. In contrast to `ARM::getExtensionFeatures` here we also handle negative 
features, and that means the prohibition of a feature can disable other 
features that depend on it as well.



Comment at: llvm/unittests/Support/TargetParserTest.cpp:580
+
   Extensions[ARM::AEK_HWDIVARM]   = { "+hwdiv-arm", "-hwdiv-arm" };
   Extensions[ARM::AEK_HWDIVTHUMB] = { "+hwdiv", "-hwdiv" };

SjoerdMeijer wrote:
> but the fact that we have these still here, I guess that means they are not 
> present in the table. Can we add them too? I guess that's why you've added 
> `fp.dp`.
Unfortunately we can't, meaning that the table is supposed to contain feature 
names that are valid command line options for `mcpu`, `march` and those are 
clearly not. Or at least, that's my understanding of it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63936/new/

https://reviews.llvm.org/D63936



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D63936: [ARM] Minor fixes in command line option parsing

2019-06-28 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea created this revision.
labrinea added reviewers: simon_tatham, SjoerdMeijer, ostannard.
Herald added subscribers: dmgreen, hiraditya, kristof.beyls, javed.absar.
Herald added projects: clang, LLVM.

When processing the command line options `march`, `mcpu` and `mfpu`,  we store 
the implied target features on a vector.  The change D62998 
 introduced a temporary vector, where the 
processed features get accumulated. When calling `DecodeARMFeaturesFromCPU`, 
which sets the default features for the specified CPU, we certainly don't want 
to override the features that have been explicitly specified on the command 
line. Therefore, the default features should appear first in the final vector. 
This problem became evident once I added the missing (unhandled) target 
features in `ARM::getExtensionFeatures` and I am fixing it with this patch.

The second change this patch makes is that it teaches 
`ARM::appendArchExtFeatures` to account dependencies when processing target 
features: i.e. when you say `-march=armv8.1-m.main+mve.fp+nofp` it means 
`mve.fp` should get discarded too.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D63936

Files:
  clang/lib/Driver/ToolChains/Arch/ARM.cpp
  clang/test/CodeGen/arm-target-features.c
  clang/test/Preprocessor/arm-target-features.c
  llvm/include/llvm/Support/ARMTargetParser.def
  llvm/lib/Support/ARMTargetParser.cpp
  llvm/unittests/Support/TargetParserTest.cpp

Index: llvm/unittests/Support/TargetParserTest.cpp
===
--- llvm/unittests/Support/TargetParserTest.cpp
+++ llvm/unittests/Support/TargetParserTest.cpp
@@ -571,17 +571,18 @@
 TEST(TargetParserTest, ARMExtensionFeatures) {
   std::map> Extensions;
 
-  Extensions[ARM::AEK_CRC]= { "+crc",   "-crc" };
-  Extensions[ARM::AEK_DSP]= { "+dsp",   "-dsp" };
+  for (auto  : ARM::ARCHExtNames) {
+if (Ext.Feature && Ext.NegFeature)
+  Extensions[Ext.ID] = { StringRef(Ext.Feature),
+ StringRef(Ext.NegFeature) };
+  }
+
   Extensions[ARM::AEK_HWDIVARM]   = { "+hwdiv-arm", "-hwdiv-arm" };
   Extensions[ARM::AEK_HWDIVTHUMB] = { "+hwdiv", "-hwdiv" };
-  Extensions[ARM::AEK_RAS]= { "+ras",   "-ras" };
-  Extensions[ARM::AEK_FP16FML]= { "+fp16fml",   "-fp16fml" };
-  Extensions[ARM::AEK_DOTPROD]= { "+dotprod",   "-dotprod" };
 
   std::vector Features;
 
-  EXPECT_FALSE(AArch64::getExtensionFeatures(ARM::AEK_INVALID, Features));
+  EXPECT_FALSE(ARM::getExtensionFeatures(ARM::AEK_INVALID, Features));
 
   for (auto  : Extensions) {
 // test +extension
@@ -598,7 +599,7 @@
 Found = std::find(std::begin(Features), std::end(Features), E.second.at(1));
 EXPECT_TRUE(Found != std::end(Features));
 EXPECT_TRUE(Extensions.size() == Features.size());
-   }
+  }
 }
 
 TEST(TargetParserTest, ARMFPUFeatures) {
Index: llvm/lib/Support/ARMTargetParser.cpp
===
--- llvm/lib/Support/ARMTargetParser.cpp
+++ llvm/lib/Support/ARMTargetParser.cpp
@@ -409,30 +409,12 @@
   if (Extensions == AEK_INVALID)
 return false;
 
-  if (Extensions & AEK_CRC)
-Features.push_back("+crc");
-  else
-Features.push_back("-crc");
-
-  if (Extensions & AEK_DSP)
-Features.push_back("+dsp");
-  else
-Features.push_back("-dsp");
-
-  if (Extensions & AEK_FP16FML)
-Features.push_back("+fp16fml");
-  else
-Features.push_back("-fp16fml");
-
-  if (Extensions & AEK_RAS)
-Features.push_back("+ras");
-  else
-Features.push_back("-ras");
-
-  if (Extensions & AEK_DOTPROD)
-Features.push_back("+dotprod");
-  else
-Features.push_back("-dotprod");
+  for (const auto AE : ARCHExtNames) {
+if ((Extensions & AE.ID) == AE.ID && AE.Feature)
+  Features.push_back(AE.Feature);
+else if (AE.NegFeature)
+  Features.push_back(AE.NegFeature);
+  }
 
   return getHWDivFeatures(Extensions, Features);
 }
@@ -508,16 +490,30 @@
   return ARM::FK_INVALID;
 }
 
+static unsigned getAEKID(StringRef ArchExtName) {
+  for (const auto AE : ARM::ARCHExtNames)
+if (AE.getName() == ArchExtName)
+  return AE.ID;
+  return ARM::AEK_INVALID;
+}
+
 bool ARM::appendArchExtFeatures(
   StringRef CPU, ARM::ArchKind AK, StringRef ArchExt,
   std::vector ) {
-  StringRef StandardFeature = getArchExtFeature(ArchExt);
-  if (!StandardFeature.empty()) {
-Features.push_back(StandardFeature);
-return true;
-  }
 
+  size_t StartingNumFeatures = Features.size();
   const bool Negated = stripNegationPrefix(ArchExt);
+  unsigned ID = getAEKID(ArchExt);
+
+  if (ID == AEK_INVALID)
+return false;
+
+  for (const auto AE : ARCHExtNames) {
+if (Negated && (AE.ID & ID) == ID && AE.NegFeature)
+  Features.push_back(AE.NegFeature);
+else if (AE.ID == ID && AE.Feature)
+  Features.push_back(AE.Feature);
+  }
 
   if (CPU == "")
 CPU = "generic";
@@ -537,7 

[PATCH] D26968: [ARM] Add Driver support for emitting the missing Tag_ABI_enum_size build attribute values

2016-12-06 Thread Alexandros Lamprineas via Phabricator via cfe-commits
labrinea abandoned this revision.
labrinea added a comment.

Hi Renato, apologies for the long silence. Unfortunately this work is more 
complicated than I initially thought. We'll have to rethink about it 
thoroughly. I am going to abandon the patch for now. Thank you for reviewing 
this.


https://reviews.llvm.org/D26968



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits